I have to plot some chronologically ordered values (one value per month, in my case) on a Plotly (Python) graph. I also have to add an "end of period label" (i.e. a marker with text indicating the last value of the series) that has to be positioned at 'middle right'.
A working example would be something like this:
import pandas as pd
import numpy as np
import plotly.graph_objects as go
date_range = pd.to_datetime(pd.date_range(start='1/1/2013', end='9/1/2022', freq='M').tolist()).date
values = np.random.randint(100, size=len(date_range)).tolist()
fig = go.Figure(
)
fig.add_trace(go.Scatter(
showlegend=False,
x=date_range,
y=values,
mode='lines',
line=dict(
width=2,
color="red",
)
)
)
fig.add_trace(go.Scatter(
showlegend=False,
x=[date_range[-1]],
y=[values[-1]],
text=[values[-1]],
textposition='middle right',
texttemplate="%{text:.3f}",
mode='markers+text',
line=dict(
width=2,
color="red",
)
)
)
fig.update_layout(
xaxis=dict(
tickformat="%m\n<b>%Y", dtick="M3",
)
)
which produces the following plot:
I am facing the following problem: the end-of-period label extends beyond the last value of the date range and pushes the x axis into the green area, which covers undesired months (those beyond the last value of the date range, running into 2023).
I tried several things to "erase" or delete that undesired part of the x axis, but nothing worked properly: either the end of period label was cut in half or the whole x axis disappeared.
Thank you in advance for any help or suggestion.
As per #r0beginners' comments:
Since the text lies outside the graph area, use an annotation for the text.
Make the marker scatter trace just mode='markers'.
Explicitly state the x-axis range: range=date_range[[0,-1]].
import pandas as pd
import numpy as np
import plotly.graph_objects as go
date_range = pd.to_datetime(
pd.date_range(start="1/1/2013", end="9/1/2022", freq="M").tolist()
).date
values = np.random.randint(100, size=len(date_range)).tolist()
fig = go.Figure()
fig.add_trace(
go.Scatter(
showlegend=False,
x=date_range,
y=values,
mode="lines",
line=dict(
width=2,
color="red",
),
)
)
fig.add_trace(go.Scatter(
showlegend=False,
x=[date_range[-1]],
y=[values[-1]],
mode='markers',
marker_size=15
)
)
fig.add_annotation(
x = date_range[-1],
y = values[-1],
text = values[-1],
xshift=10,
yshift=0,
showarrow=False
)
fig.update_layout(
xaxis=dict(
tickformat="%m\n<b>%Y",
dtick="M3",
range=date_range[[0,-1]]
)
)
Here is the code that I have tried:
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
df = pd.read_csv("resultant_data.txt", index_col = 0, sep = ",")
display=df[["Velocity", "WinLoss"]]
pos = lambda col : col[col > 0].sum()
neg = lambda col : col[col < 0].sum()
Related_Display_Info = df.groupby("RacerCount").agg(Counts=("Velocity","count"),
WinLoss=("WinLoss","sum"),
Positives=("WinLoss", pos),
Negatives=("WinLoss", neg),
)
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])
# Add traces
fig.add_trace(
go.Scatter(x=display.index, y=display["Velocity"], name="Velocity", mode="markers"),
secondary_y=False
)
fig.add_trace(
go.Scatter(x=Related_Display_Info.index,
y=Related_Display_Info["WinLoss"],
name="Win/Loss",
mode="markers",
marker=dict(
color=(
(Related_Display_Info["WinLoss"] < 0)
).astype('int'),
colorscale=[[0, 'green'], [1, 'red']]
)
),
secondary_y=True,
)
# Add figure title
fig.update_layout(
title_text="Race Analysis"
)
# Set x-axis title
fig.update_xaxes(title_text="<b>Racer Counts</b>")
# Set y-axes titles
fig.update_yaxes(title_text="<b>Velocity</b>", secondary_y=False)
fig.update_yaxes(title_text="<b>Win/Loss</b>", secondary_y=True)
fig.update_layout(hovermode="x unified")
fig.show()
The output is:
But I want to display the following information when I hover over a point:
RaceCount = From Display dataframe value Number of the race corresponding to the dot I hover on.
Velocity = From Display Dataframe value Velocity at that point
Counts = From Related_Display_Info Column
WinLoss = From Related_Display_Info Column
Positives = From Related_Display_Info Column
Negatives = From Related_Display_Info Column
Please can anyone tell me what to do to get this information on my chart?
I have checked this, but it was not helpful since I got many errors: Python/Plotly: How to customize hover-template on with what information to show?
Data:
RacerCount,Velocity,WinLoss
111,0.36,1
141,0.31,1
156,0.3,1
141,0.23,1
147,0.23,1
156,0.22,1
165,0.2,1
174,0.18,1
177,0.18,1
183,0.18,1
114,0.32,1
117,0.3,1
120,0.29,1
123,0.29,1
126,0.28,1
129,0.27,1
120,0.32,1
144,0.3,1
147,0.3,1
159,0.27,1
165,0.26,1
168,0.25,1
156,0.29,1
165,0.26,1
168,0.26,1
165,0.28,1
213,0.17,1
243,0.15,1
249,0.14,1
228,0.54,1
177,0.67,1
180,0.66,1
183,0.65,1
192,0.66,1
195,0.62,1
198,0.6,1
180,0.66,1
222,0.56,1
114,0.41,1
81,0.82,1
102,0.56,1
111,0.55,1
90,1.02,1
93,1.0,1
90,1.18,1
90,1.18,1
93,1.1,1
96,1.07,1
99,1.04,1
102,0.99,1
105,0.94,1
108,0.92,1
111,0.9,1
162,0.66,1
159,0.63,1
162,0.65,-1
162,0.66,-1
168,0.64,-1
159,0.68,-1
162,0.67,-1
174,0.62,-1
168,0.65,-1
171,0.64,-1
198,0.55,-1
300,0.47,-1
201,0.56,-1
174,0.63,-1
180,0.61,-1
171,0.64,-1
174,0.62,-1
303,0.47,-1
312,0.48,-1
258,0.51,-1
261,0.51,-1
264,0.5,-1
279,0.47,-1
288,0.48,-1
294,0.47,-1
258,0.52,-1
261,0.51,-1
267,0.5,-1
222,0.53,-1
171,0.64,-1
177,0.63,-1
177,0.63,-1
Essentially, this code ungroups the data frame before plotting to create the hovertemplate you're looking for.
As stated in the comments, the data has to have the same number of rows to be shown in the hovertemplate. At the end of my answer, I added the code all in one chunk.
Since you have hovermode as x unified, you probably only want one of these traces to have hover content.
I slightly modified the creation of Related_Display_Info. Instead of WinLoss, which is already in the parent data frame, I modified it to WinLoss_sum, so there wouldn't be a naming conflict when I ungrouped.
Related_Display_Info = df.groupby("RacerCount").agg(
Counts=("Velocity","count"), WinLoss_sum=("WinLoss","sum"),
Positives=("WinLoss", pos), Negatives=("WinLoss", neg))
Now it's time to ungroup the data you grouped. I created dui (stands for display info ungrouped).
dui = pd.merge(df, Related_Display_Info, how = "outer", on="RacerCount",
suffixes=(False, False))
I created the hovertemplate for both traces. I passed the entire ungrouped data frame to customdata. It looks like the only column that isn't in the template is the original WinLoss.
# create hover template for all traces
ht="<br>".join(["<br>RacerCount: %{customdata[0]}",
"Velocity: %{customdata[1]:.2f}",
"Counts: %{customdata[3]}",
"Winloss: %{customdata[4]}",
"Positives: %{customdata[5]}",
"Negatives: %{customdata[6]}<br>"])
The creation of fig is unchanged. However, the traces are both based on dui. Additionally, the index isn't RacerCount, so I used the literal field instead.
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])
# Add traces
fig.add_trace(go.Scatter(x=dui["RacerCount"], y=dui["Velocity"],
name="Velocity", mode="markers",
customdata=dui, hovertemplate=ht),
secondary_y=False)
fig.add_trace(
go.Scatter(x = dui["RacerCount"], y=dui["WinLoss_sum"], customdata=dui,
name="Win/Loss", mode="markers",
marker=dict(color=((dui["WinLoss_sum"] < 0)).astype('int'),
colorscale=[[0, 'green'], [1, 'red']]),
hovertemplate=ht),
secondary_y=True)
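The template above hard-codes `customdata` positions, which depend on the column order after the merge. As a sketch (using five rows taken from the question's data, named functions instead of lambdas, and a left merge on a reset index, all of which are my choices), the positions can be looked up by column name instead:

```python
import pandas as pd

# Five rows copied from the question's data.
df = pd.DataFrame({
    "RacerCount": [111, 141, 141, 162, 162],
    "Velocity": [0.36, 0.31, 0.23, 0.66, 0.65],
    "WinLoss": [1, 1, 1, 1, -1],
})

def pos(col):
    """Sum of the positive entries."""
    return col[col > 0].sum()

def neg(col):
    """Sum of the negative entries."""
    return col[col < 0].sum()

info = df.groupby("RacerCount").agg(
    Counts=("Velocity", "count"),
    WinLoss_sum=("WinLoss", "sum"),
    Positives=("WinLoss", pos),
    Negatives=("WinLoss", neg),
).reset_index()

# "Ungroup": every original row gets its group's summary columns.
dui = df.merge(info, how="left", on="RacerCount")

# Build the template from column names rather than fixed indices.
fields = ["RacerCount", "Velocity", "Counts", "WinLoss_sum", "Positives", "Negatives"]
ht = "<br>".join(
    f"{name}: %{{customdata[{dui.columns.get_loc(name)}]}}" for name in fields
)
print(list(dui.columns))
print(ht)
```

If a column is later added or reordered, the indices in `ht` follow automatically, as long as `customdata=dui` is passed to the trace.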
All the code altogether (for easier copy + paste)
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
df = pd.read_clipboard(sep = ',')
display=df[["Velocity", "WinLoss"]]
pos = lambda col : col[col > 0].sum()
neg = lambda col : col[col < 0].sum()
Related_Display_Info = df.groupby("RacerCount").agg(
Counts=("Velocity","count"), WinLoss_sum=("WinLoss","sum"),
Positives=("WinLoss", pos), Negatives=("WinLoss", neg))
# ungroup the data for the hovertemplate
dui = pd.merge(df, Related_Display_Info, how = "outer", on="RacerCount",
suffixes=(False, False))
# create hover template for all traces
ht="<br>".join(["<br>RacerCount: %{customdata[0]}",
"Velocity: %{customdata[1]:.2f}",
"Counts: %{customdata[3]}",
"Winloss: %{customdata[4]}",
"Positives: %{customdata[5]}",
"Negatives: %{customdata[6]}<br>"])
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])
# Add traces
fig.add_trace(go.Scatter(x=dui["RacerCount"], y=dui["Velocity"],
name="Velocity", mode="markers",
customdata=dui, hovertemplate=ht),
secondary_y=False)
fig.add_trace(
go.Scatter(x = dui["RacerCount"], y=dui["WinLoss_sum"], customdata=dui,
name="Win/Loss", mode="markers",
marker=dict(color=((dui["WinLoss_sum"] < 0)).astype('int'),
colorscale=[[0, 'green'], [1, 'red']]),
hovertemplate=ht),
secondary_y=True)
# Add figure title
fig.update_layout(
title_text="Race Analysis"
)
# Set x-axis title
fig.update_xaxes(title_text="<b>Racer Counts</b>")
# Set y-axes titles
fig.update_yaxes(title_text="<b>Velocity</b>", secondary_y=False)
fig.update_yaxes(title_text="<b>Win/Loss</b>", secondary_y=True)
fig.update_layout(hovermode="x unified")
fig.show()
I am working on some boxplots. I found this code very helpful and I managed to replicate it for my needs:
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import pandas as pd
np.random.seed(1)
y0 = np.random.randn(50) - 1
y1 = np.random.randn(50) + 1
df = pd.DataFrame({'graph_name':['trace 0']*len(y0)+['trace 1']*len(y1),
'value': np.concatenate([y0,y1],0),
'color':np.random.choice([0,1,2,3,4,5,6,7,8,9], size=100, replace=True)}
)
fig = px.strip(df,
x='graph_name',
y='value',
color='color',
stripmode='overlay')
fig.add_trace(go.Box(y=df.query('graph_name == "trace 0"')['value'], name='trace 0'))
fig.add_trace(go.Box(y=df.query('graph_name == "trace 1"')['value'], name='trace 1'))
fig.update_layout(autosize=False,
width=600,
height=600,
legend={'traceorder':'normal'})
fig.show()
I am now trying to put some lines connecting the datapoints with the same colors, but I am lost. Any idea?
Something similar to this:
My first idea was to add lines to your figure by using plotly shapes and specifying the start and end points in x- and y-axis coordinates. However, when you use px.strip, plotly implements jittering (adding randomly generated small values, say between -0.1 and 0.1, to the x-coordinates under the hood to avoid points overlapping), but as far as I know, there is no way to retrieve the exact x-coordinates of each point.
However we can get around this by using go.Scatter to plot all the paired points individually, adding jittering as needed to the x-values and connecting each pair of points with a line. We are basically implementing px.strip ourselves but with full control of the exact coordinates of each point.
To toggle colors the same way px.strip allows, we need to assign all points of the same color to the same legendgroup and show the legend entry only the first time a color is plotted (as we don't want a legend entry for each point).
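The manual jittering described above can be sketched on its own with NumPy. This is a minimal illustration, assuming a jitter width of 0.1 and numeric positions 0 and 1 for the two categories; the names `categories` and `x_positions` are mine.

```python
import numpy as np

rng = np.random.default_rng(1)

categories = ["trace 0", "trace 1"]
n_points = 50
jitter = 0.1  # half-width of the uniform noise added to each x position

# Each category sits at an integer position; points are spread around it.
x_positions = {
    name: idx + rng.uniform(-jitter, jitter, n_points)
    for idx, name in enumerate(categories)
}

print(x_positions["trace 0"][:3])
print(x_positions["trace 1"][:3])
```

Because these x values are generated explicitly, the start and end of every connecting line are known, which is what px.strip does not expose.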
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import pandas as pd
np.random.seed(1)
y0 = np.random.randn(50) - 1
y1 = np.random.randn(50) + 1
## sort both sets of data so we can easily connect them with line annotations
y0.sort()
y1.sort()
df = pd.DataFrame({'graph_name':['trace 0']*len(y0)+['trace 1']*len(y1),
'value': np.concatenate([y0,y1],0)}
# 'color':np.random.choice([0,1,2,3,4,5,6,7,8,9], size=100, replace=True)}
)
fig = go.Figure()
## i will set jittering to 0.1
x0 = np.array([0]*len(y0)) + np.random.uniform(-0.1,0.1,len(y0))
x1 = np.array([1]*len(y1)) + np.random.uniform(-0.1,0.1,len(y1))
## px.colors.qualitative.D3 contains 10 distinct colors
## colors_list = np.random.choice(px.colors.qualitative.D3, size=50)
## for simplicity, we repeat it 5 times instead of selecting randomly
## this guarantees the colors appear in order in the legend
colors_list = px.colors.qualitative.D3*5
color_number = {color:number for number,color in enumerate(px.colors.qualitative.D3)}
## keep track of whether the color is showing up for the first time as we build out the legend
colors_legend = {color:False for color in colors_list}
for x_start,x_end,y_start,y_end,color in zip(x0,x1,y0,y1,colors_list):
    ## if the color hasn't been added to the legend yet, add a legend entry
    if not colors_legend[color]:
        fig.add_trace(
            go.Scatter(
                x=[x_start,x_end],
                y=[y_start,y_end],
                mode='lines+markers',
                marker=dict(color=color),
                line=dict(color="rgba(100,100,100,0.5)"),
                legendgroup=str(color_number[color]),
                name=str(color_number[color]),
                showlegend=True,
                hoverinfo='skip'
            )
        )
        colors_legend[color] = True
    ## otherwise omit the legend entry, but add it to the same legend group
    else:
        fig.add_trace(
            go.Scatter(
                x=[x_start,x_end],
                y=[y_start,y_end],
                mode='lines+markers',
                marker=dict(color=color),
                line=dict(color="rgba(100,100,100,0.5)"),
                legendgroup=str(color_number[color]),
                showlegend=False,
                hoverinfo='skip'
            )
        )
fig.add_trace(go.Box(y=df.query('graph_name == "trace 0"')['value'], name='trace 0'))
fig.add_trace(go.Box(y=df.query('graph_name == "trace 1"')['value'], name='trace 1'))
fig.update_layout(autosize=False,
width=600,
height=600,
legend={'traceorder':'normal'})
fig.show()
This is my dataset:
import seaborn as sns
import plotly.graph_objs as go
x = [0,0,0,1,1,1,2,2,2]
y = [1,2,3,4,5,6,7,8,9]
Using:
sns.lineplot(x=x, y=y)
I get following figure:
I would like to get the same (at least similar result) in Plotly. Currently I have:
fig = go.Figure()
fig.add_trace(go.Scatter(x=x,
y=y,
mode='lines',
name='predictions',
fill="toself"))
However this is the result I obtain which I am not happy with:
Is it a matter of some specific keyword argument passed to fill? Thanks!
Plotly is not meant to be a "statistical data visualization library" like seaborn, so you should prepare the traces before plotting. For your example you could do something like:
import pandas as pd
import plotly.graph_objs as go
x = [0,0,0,1,1,1,2,2,2]
y = [1,2,3,4,5,6,7,8,9]
df = pd.DataFrame({"x": x, "y": y})
grp = df.groupby("x").agg({"y": ["mean", "min", "max"]})
grp.columns = ["_".join(col) for col in grp.columns]
grp = grp.reset_index()
fig = go.Figure()
fig.add_trace(go.Scatter(x=grp["x"],
y=grp["y_min"],
mode='lines',
name='y_min',
opacity=0.75,
# marker = {"color":"lightblue", "width":0.5},
line=dict(color='lightblue', width=0.5),
showlegend=False
))
fig.add_trace(go.Scatter(x=grp["x"],
y=grp["y_mean"],
mode='lines',
name='prediction',
fill="tonexty",
line=dict(color='lightblue', width=2)
))
fig.add_trace(go.Scatter(x=grp["x"],
y=grp["y_max"],
mode='lines',
name='y_max',
opacity=0.75,
fill="tonexty",
line=dict(color='lightblue', width=0.5),
showlegend=False
))
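The aggregation step in the code above can be checked on its own before any plotting. A minimal sketch, using the question's data and a flat list of aggregations (my simplification of the multi-level columns above). Note that this band spans the minimum to the maximum per x, whereas seaborn's lineplot draws a confidence interval around the mean by default, so the two bands will not match exactly.

```python
import pandas as pd

x = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9]

df = pd.DataFrame({"x": x, "y": y})

# One row per x with the lower bound, the central line and the upper bound.
band = df.groupby("x")["y"].agg(["min", "mean", "max"]).reset_index()
print(band)
```

The `min`, `mean` and `max` columns correspond to the three traces: the lower edge, the line itself, and the upper edge filled with `fill="tonexty"`.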
I have data points that belong to three different classes. In addition, each data point has a weight within its class. I want to color my points based on their weight, but with three different continuous color ranges, one per class. I want something like the following image (which was made by hand). I am currently using Plotly for coloring, but any other Python-compatible method is welcome.
Specifically, I want to combine the two outputs of this code:
import random
import pandas as pd
import plotly.express as px
if __name__ == '__main__':
    n_data = 100
    n_class = 3
    t1 = [random.random() for i in range(n_data)]
    t2 = [random.random() for i in range(n_data)]
    class_color = [str(random.randint(1,n_class)) for i in range(n_data)]
    weight_color = [random.random() for i in range(n_data)]
    df = pd.DataFrame()
    print(len(t1))
    print(len(t2))
    df['x'] = t1
    df['y'] = t2
    df['class_color'] = class_color
    df['weight_color'] = weight_color
    # first figure: points colored by class (discrete colors)
    fig1 = px.scatter(df, x="x", y="y", color="class_color")
    fig1.show()
    # second figure: points colored by weight (continuous color scale)
    fig2 = px.scatter(df, x="x", y="y", color="weight_color")
    fig2.show()
Please don't take this as an answer (yet). As far as I can see, you can use different color scales with Plotly, but you still need to work out how to properly show all the legends.
import plotly.graph_objects as go
import plotly.express as px
df = px.data.iris()
dfs = [d[1] for d in list(df.groupby('species'))]
fig = go.Figure()
fig.add_trace(
go.Scatter(x=dfs[0]["sepal_width"],
y=dfs[0]["sepal_length"],mode="markers",
marker=dict(color=dfs[0]["sepal_length"],
colorscale='Viridis',
showscale=True),
name=dfs[0]["species"].unique()[0],
showlegend=False
))
fig.add_trace(
go.Scatter(x=dfs[1]["sepal_width"],
y=dfs[1]["sepal_length"],mode="markers",
marker=dict(color=dfs[1]["sepal_length"],
colorscale='Magenta',
showscale=False),
name=dfs[1]["species"].unique()[0],
showlegend=False
))
fig.add_trace(
go.Scatter(x=dfs[2]["sepal_width"],
y=dfs[2]["sepal_length"],mode="markers",
marker=dict(color=dfs[2]["sepal_length"],
colorscale='Cividis',
showscale=False),
name=dfs[2]["species"].unique()[0],
showlegend=False
))