Plotly: Cannot manually specify labels (legend entries) in multiple line chart - python

As per the Plotly website, in a simple line chart one can change the legend entry from the column name to a manually specified string of text. For example, this code results in the following chart:
import pandas as pd
import plotly.express as px
df = pd.DataFrame(dict(
x = [1, 2, 3, 4],
y = [2, 3, 4, 3]
))
fig = px.line(
df,
x="x",
y="y",
width=800, height=600,
labels={
"y": "Series"
},
)
fig.show()
[Figure: resulting chart, with the label changed to "Series"]
However, when one plots multiple columns to the line chart, this label specification no longer works. There is no error message, but the legend entries are simply not changed. See this example and output:
import pandas as pd
import plotly.express as px
df = pd.DataFrame(dict(
x = [1, 2, 3, 4],
y1 = [2, 3, 4, 3],
y2 = [2, 4, 6, 8]
))
fig = px.line(
df,
x="x",
y=["y1", "y2"],
width=800, height=600,
labels={
"y1": "Series 1",
"y2": "Series 2"
},
)
fig.show()
[Figure: resulting chart, with the legend entries still showing y1 and y2]
Is this a bug, or am I missing something? Any idea how this can be fixed?

In case anybody read my previous post, I did some more digging and found the solution to this issue. At heart, the entries one sees on the right in the legend are trace attributes known as "names", not "labels". Searching for how to revise those names, I came across another post about this issue, "Legend Label Update", which has a solution. Using that information, here is a revised version of your program.
import pandas as pd
import plotly.express as px
df = pd.DataFrame(dict(
x = [1, 2, 3, 4],
y1 = [2, 3, 4, 3],
y2 = [2, 4, 6, 8]
))
fig = px.line(df, x="x", y=["y1", "y2"], width=800, height=600)
fig.update_layout(legend_title_text='Variable', xaxis_title="X", yaxis_title="Series")
newnames = {'y1':'Series 1', 'y2': 'Series 2'} # From the other post
fig.for_each_trace(lambda t: t.update(name = newnames[t.name]))
fig.show()
Following is a sample graph.
Try that out to see if that addresses your situation.
Regards.
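Another option, sketched here under the assumption that renaming the DataFrame columns themselves is acceptable: with wide-form input, Plotly Express names each trace after its column, so renaming the columns before plotting changes the legend entries directly. The helper names rename_for_legend and plot_renamed are made up for this illustration; "variable" and "value" are the labels keys Plotly Express uses for wide-form data.

```python
import pandas as pd

NEW_NAMES = {"y1": "Series 1", "y2": "Series 2"}


def rename_for_legend(df, new_names):
    """Return a copy of df whose value columns carry the legend text (hypothetical helper)."""
    return df.rename(columns=new_names)


def plot_renamed(df, x, new_names):
    """Plot the renamed columns (hypothetical helper).

    plotly is imported lazily so the data preparation can be used without it.
    """
    import plotly.express as px

    renamed = rename_for_legend(df, new_names)
    return px.line(
        renamed,
        x=x,
        y=list(new_names.values()),
        # wide-form keys: "variable" is the legend title, "value" the y-axis title
        labels={"variable": "Variable", "value": "Series"},
    )


df = pd.DataFrame({"x": [1, 2, 3, 4], "y1": [2, 3, 4, 3], "y2": [2, 4, 6, 8]})
renamed = rename_for_legend(df, NEW_NAMES)
print(list(renamed.columns))  # ['x', 'Series 1', 'Series 2']
```

Calling plot_renamed(df, "x", NEW_NAMES).show() should then draw the chart with the new legend entries, while the original df keeps its column names.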

Related

Plotly line plot from two dictionaries

I have two dictionaries:
days = {'a':[1,2,3], 'b':[3,4,5]}
vals = {'a':[10,20,30], 'b':[9,16,25]}
Using plotly (ideally plotly express) I would like one line plot with two lines: the first line being days['a'] vs vals['a'] and the second line being days['b'] vs vals['b']. Of course in practice I may have many more potential lines. I am not sure how to pull this off. I'm happy to make a dataframe out of this data but not sure what the best structure is.
Thanks! Apologies for a noob question.
You can try the following:
import plotly.graph_objects as go
# your data
days = {'a':[1,2,3], 'b':[3,4,5]}
vals = {'a':[10,20,30], 'b':[9,16,25]}
# generate a plot for each dictionary key
data = []
for k in days.keys():
    plot = go.Scatter(x=days[k],
                      y=vals[k],
                      mode="lines",
                      name=k  # label for the plot legend
                      )
    data.append(plot)
# create a figure with all plots and display it
fig = go.Figure(data=data)
fig.show()
This gives:
With Plotly Express:
import plotly.express as px
import pandas as pd
days = {'a': [1, 2, 3], 'b': [3, 4, 5]}
vals = {'a': [10, 20, 30], 'b': [9, 16, 25]}
# build DataFrame
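# one block of rows per dictionary key, stacked into long format
# (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead)
df = pd.concat(
    [pd.DataFrame({"days": days[k], "vals": vals[k], "label": k}) for k in days],
    ignore_index=True,
)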
fig = px.line(df, x="days", y="vals", color="label")
fig.show()
The result is the same as above.
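Since the question mentions that there may be many more lines in practice, here is a small sketch of the same long-format idea with some validation added. The function name dicts_to_long is hypothetical, and the column names "days", "vals" and "label" are simply the ones used above. It relies on pd.concat, which also works on pandas versions where DataFrame.append is no longer available.

```python
import pandas as pd


def dicts_to_long(days, vals):
    """Stack matching keys of two dicts into long-format rows (hypothetical helper)."""
    if days.keys() != vals.keys():
        raise ValueError("days and vals must have the same keys")
    frames = []
    for key in days:
        if len(days[key]) != len(vals[key]):
            raise ValueError(f"length mismatch for key {key!r}")
        # one block of rows per key, tagged with the key as the line label
        frames.append(
            pd.DataFrame({"days": days[key], "vals": vals[key], "label": key})
        )
    return pd.concat(frames, ignore_index=True)


days = {'a': [1, 2, 3], 'b': [3, 4, 5]}
vals = {'a': [10, 20, 30], 'b': [9, 16, 25]}
long_df = dicts_to_long(days, vals)
print(long_df)
```

The resulting long_df can be passed to px.line(long_df, x="days", y="vals", color="label") exactly as in the Plotly Express example.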

Stacked bar plot in python / plotly (express): grouping / ordering of bars

I have data in a dataframe that I want to plot with a stacked bar plot:
test_df = pd.DataFrame([[1, 5, 1, 'A'], [2, 10, 1, 'B'], [3, 3, 1, 'A']], columns = ('ID', 'Value', 'Bucket', 'Type'))
If I plot this with Plotly Express, I get bars stacked on each other and correctly ordered (based on the index):
fig = px.bar(test_df, x='Bucket', y='Value', barmode='stack')
However, I want to color the data based on Type, hence I go for
fig = px.bar(test_df, x='Bucket', y='Value', barmode='stack', color='Type')
This works, except now the ordering is messed up, because all bars are now grouped by Type. I looked through the docs of Plotly Express and couldn't find a way to specify the ordering of the bars independently. Any tips on how to do this?
I found this one here, but the scenario is a bit different and the options mentioned there don't seem to help me:
How to disable plotly express from grouping bars based on color?
Edit: This goes in the right direction, though it uses Plotly graph_objects rather than Plotly Express:
import pandas as pd
import plotly.graph_objects as go
test_df = pd.DataFrame([[1, 5, 1, 'A', 'red'], [2, 10, 1, 'B', 'blue'], [3, 3, 1, 'A', 'red']], columns = ('ID', 'Value', 'Bucket', 'Type', 'Color'))
fig = go.Figure()
fig.add_trace(go.Bar(x=test_df["Bucket"], y=test_df["Value"], marker_color=test_df["Color"]))
Output:
Still, I'd prefer the Express version, because so many things are easier to handle there (Legend, Hover properties etc.).
The only way I can understand your question is that you don't want B to be stacked on top of A, but rather the opposite. If that's the case, then you can get what you want through:
fig.data = fig.data[::-1]
fig.layout.legend.traceorder = 'reversed'
Some details:
fig.data = fig.data[::-1] simply reverses the order that the traces appear in fig.data and ultimately in the plotted figure itself. This will however reverse the order of the legend as well. So without fig.layout.legend.traceorder = 'reversed' the result would be:
And so it follows that the complete work-around looks like this:
fig.data = fig.data[::-1]
fig.layout.legend.traceorder = 'reversed'
Complete code:
import pandas as pd
import plotly.express as px
test_df = pd.DataFrame([[1, 5, 1, 'A'], [2, 10, 1, 'B'], [3, 3, 1, 'A']], columns = ('ID', 'Value', 'Bucket', 'Type'))
fig = px.bar(test_df, x='Bucket', y='Value', barmode='stack', color='Type')
fig.data = fig.data[::-1]
fig.layout.legend.traceorder = 'reversed'
fig.show()
Ok, sorry for the long delay on this, but I finally got around to solving this.
My solution is possibly not the most straightforward one, but it does work.
The basic idea is to use graph_objects instead of express, iterate over the dataframe, and add each bar as a separate trace. This way, each trace gets its own name and legend group, so bars of the same type can still be grouped in the legend (which is not possible when adding all bars in a single trace, or at least I could not find a way).
Unfortunately, the ordering of the legend is messed up (if you have more than 2 buckets) and there is currently no way in plotly to sort it. But that's a minor thing.
The main thing that bothers me is that this could've been so much easier if plotly.express allowed for manual ordering of the bars by a certain column.
Maybe I'll submit that as a suggestion.
import pandas as pd
import plotly.graph_objects as go
import plotly.io as pio
pio.renderers.default = "browser"
test_df = pd.DataFrame(
[[1, 5, 1, 'B'], [3, 3, 1, 'A'], [5, 10, 1, 'B'],
[2, 8, 2, 'B'], [4, 5, 2, 'A'], [6, 3, 2, 'A']],
columns = ('ID', 'Value', 'Bucket', 'Type'))
# add named colors to the dataframe based on type
test_df.loc[test_df['Type'] == 'A', 'Color'] = 'Crimson'
test_df.loc[test_df['Type'] == 'B', 'Color'] = 'ForestGreen'
# ensure that the dataframe is sorted by the values
test_df.sort_values('ID', inplace=True)
fig = go.Figure()
# it's tedious to iterate over each item, but only this way we can ensure that everything is correctly ordered and labelled
# Set up legend_show_dict to check if an item should be shown or not. This should be only done for the first occurrence to avoid duplication.
legend_show_dict = {}
for i, row in test_df.iterrows():
    if row['Type'] in legend_show_dict:
        legend_show = legend_show_dict[row['Type']]
    else:
        legend_show = True
        legend_show_dict[row['Type']] = False
    fig.add_trace(
        go.Bar(
            x=[row['Bucket']],
            y=[row['Value']],
            marker_color=row['Color'],
            name=row['Type'],
            legendgroup=row['Type'],
            showlegend=legend_show,
            hovertemplate="<br>".join([
                'ID: ' + str(row['ID']),
                'Value: ' + str(row['Value']),
                'Bucket: ' + str(row['Bucket']),
                'Type: ' + row['Type'],
            ])
        ))
fig.update_layout(
xaxis={'categoryorder': 'category ascending', 'title': 'Bucket'},
yaxis={'title': 'Value'},
legend={'traceorder': 'normal'}
)
fig.update_layout(barmode='stack', font_size=20)
fig.show()
This is what it should look like then:
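The legend_show_dict bookkeeping in the code above boils down to "show the legend entry only the first time a type appears". A minimal stdlib-only sketch of that logic, with the hypothetical helper name first_occurrence_flags, could look like this:

```python
def first_occurrence_flags(labels):
    """Return True the first time each label appears and False afterwards."""
    seen = set()
    flags = []
    for label in labels:
        flags.append(label not in seen)  # True only on first sight of this label
        seen.add(label)
    return flags


# Types in the order of the sample dataframe after sorting by ID
types = ['B', 'A', 'B', 'B', 'A', 'A']
flags = first_occurrence_flags(types)
print(flags)  # [True, True, False, False, False, False]
```

Each flag could then be passed as showlegend for the matching row, for example by zipping flags with the rows of the dataframe inside the loop.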

Plotly: How to make all plots grayscale?

I am using Plotly to generate a few line plots in Python. With sample code like this:
from plotly import offline as plot, subplots as subplot, graph_objects as go
fig = subplot.make_subplots(rows=2, cols=1, shared_xaxes=True, vertical_spacing=0.01)
trace1 = go.Scatter(x = [1, 2, 3], y = [1, 2, 3])
trace2 = go.Scatter(x = [1, 2, 3], y = [4, 5, 6])
fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 2, 1)
config_test_plot = {'displaylogo': False, 'displayModeBar': False, 'scrollZoom': True}
test_plot_html = plot.plot(fig, output_type='div', include_plotlyjs=False, config= config_test_plot)
I am able to get the required plots. However, I want all my plots to be in grayscale. I see that none of the default Plotly themes are of this type. Is there any way I can do this?
You haven't specified whether to assign a grey color scheme for your entire plot, or only for your lines. But just to not make things easy for myself, I'm going to assume the former. In that case, I would:
use template = 'plotly_white' for the figure elements not directly connected to your dataset, and
assign a grey scale to all lines using n_colors(lowcolor, highcolor, n_colors, colortype='tuple').
Example plot:
But as @S3DEV mentions, using the Greys color palette could be a way to go too, and this is accessible through:
# In:
px.colors.sequential.Greys
# Out:
# ['rgb(255,255,255)',
# 'rgb(240,240,240)',
# 'rgb(217,217,217)',
# 'rgb(189,189,189)',
# 'rgb(150,150,150)',
# 'rgb(115,115,115)',
# 'rgb(82,82,82)',
# 'rgb(37,37,37)',
# 'rgb(0,0,0)']
And this would work perfectly for your use case with a limited number of lines. In that case you could just use this setup:
from plotly import offline as plot, subplots as subplot, graph_objects as go
import plotly.express as px  # needed for px.colors below
from itertools import cycle
fig = subplot.make_subplots(rows=2, cols=1, shared_xaxes=True, vertical_spacing=0.01)
trace1 = go.Scatter(x = [1, 2, 3], y = [1, 2, 3])
trace2 = go.Scatter(x = [1, 2, 3], y = [4, 5, 6])
fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 2, 1)
colors = cycle(list(set(px.colors.sequential.Greys)))
f = fig.full_figure_for_development(warn=False)
for d in fig.data:
    d.line.color = next(colors)
fig.show()
And get:
I assume this is what you were looking for. One considerable drawback, though, is that px.colors.sequential.Greys contains a limited number of colors, so I had to use a cycle to assign the line colors of your data. By contrast, n_colors(lowcolor, highcolor, n_colors, colortype='tuple') lets you define a starting color, an end color, and the number of colors to scale between them, forming a complete scale for all your lines. It also lets you adjust the brightness of the colors to your liking.
[Figures: three example plots using different grey ranges]
Here's a complete setup for those figures if you would like to experiment with that as well:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
import datetime
from plotly.colors import n_colors
pd.set_option('display.max_rows', None)
pd.options.plotting.backend = "plotly"
# data sample
nperiods = 200
np.random.seed(123)
cols = 'abcdefghijkl'
df = pd.DataFrame(np.random.randint(-10, 12, size=(nperiods, len(cols))),
columns=list(cols))
datelist = pd.date_range(datetime.datetime(2020, 1, 1).strftime('%Y-%m-%d'),periods=nperiods).tolist()
df['dates'] = datelist
df = df.set_index(['dates'])
df.index = pd.to_datetime(df.index)
df.iloc[0] =1000
df = df.cumsum()#.reset_index()
greys_all = n_colors('rgb(0, 0, 0)', 'rgb(255, 255, 255)', len(cols)+1, colortype='rgb')
greys_dark = n_colors('rgb(0, 0, 0)', 'rgb(200, 200, 200)', len(cols)+1, colortype='rgb')
greys_light = n_colors('rgb(200, 200, 200)', 'rgb(255, 255, 255)', len(cols)+1, colortype='rgb')
greys = n_colors('rgb(100, 100, 100)', 'rgb(255, 255, 255)', len(cols)+1, colortype='rgb')
fig = df.plot(title = 'Greys_light', template='plotly_white', color_discrete_sequence=greys_light)
fig.update_layout(template='plotly_white')
fig.show()
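To make the idea behind n_colors a bit more concrete, here is a stdlib-only sketch of a grey scale built by linear interpolation. This is not plotly's implementation; grey_ramp is a hypothetical helper that rounds each level to an integer and returns 'rgb(r, g, b)' strings.

```python
def grey_ramp(low, high, n):
    """Return n evenly spaced grey levels from low to high as 'rgb(r, g, b)' strings."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if not (0 <= low <= 255 and 0 <= high <= 255):
        raise ValueError("grey levels must be within 0..255")
    step = (high - low) / (n - 1)  # distance between neighbouring levels
    levels = [round(low + i * step) for i in range(n)]
    return [f"rgb({v}, {v}, {v})" for v in levels]


ramp = grey_ramp(0, 200, 5)
print(ramp)
# ['rgb(0, 0, 0)', 'rgb(50, 50, 50)', 'rgb(100, 100, 100)',
#  'rgb(150, 150, 150)', 'rgb(200, 200, 200)']
```

A list like this could be used the same way as greys_light above, for instance as color_discrete_sequence, with low and high controlling how dark or light the lines get.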

How to label a grouped bar chart using plotly express?

I want to add data labels to the tops of bar charts in plotly express. I'm using two different columns from the data frame so I can't use the "colors" method. I want to define "text" for each bar so it shows the data on top of the bar. Here is an MRE.
import pandas as pd
import plotly.express as px
x = ['Aaron', 'Bob', 'Chris']
y1 = [5, 10, 6]
y2 = [8, 16, 12]
fig = px.bar(x=x, y=[y1,y2],barmode='group')
fig.show()
I tried:
fig = px.bar(x=x, y=[y1,y2],text=[y1,y2], barmode='group')
But this doesn't work.
Using your setup, just add the following to the mix:
texts = [y1, y2]
for i, t in enumerate(texts):
    fig.data[i].text = t
    fig.data[i].textposition = 'outside'
Result:
Complete code:
import pandas as pd
import plotly.express as px
x = ['Aaron', 'Bob', 'Chris']
y1 = [5, 10, 6]
y2 = [8, 16, 12]
fig = px.bar(x=x, y=[y1,y2],barmode='group')
texts = [y1, y2]
for i, t in enumerate(texts):
    fig.data[i].text = t
    fig.data[i].textposition = 'outside'
fig.show()
I found an answer that I think is better.
Let's take as example this dictionary:
data_dictionary = {
"data_frame":{
"x":["Aaron", "Bob", "Chris"],
"y1":[5, 10, 6],
"y2":[8, 16, 12]
},
"x":"x",
"y":["y1", "y2"],
"barmode":"group",
"text":None,
"text_auto":True
}
After that let's create a figure:
fig = px.bar(
**data_dictionary
)
If you type fig.show(), you'll see a graph similar to vestland's graph.
The only thing you need to do is to set text as None and text_auto as True.
I hope that helps you.

Color seaborn boxplot based in DataFrame column name

I'd like to create a list of boxplots with the color of the box dependent on the name of the pandas.DataFrame column I use as input.
The column names contain strings that indicate an experimental condition based on which I want the box of the boxplot colored.
I do this to make the boxplots:
sns.boxplot(data = data.dropna(), orient="h")
plt.show()
This creates a beautiful list of boxplots with correct names. Now I want to give every boxplot that has 'prog +, DMSO+' in its name a red color, leaving the rest as blue.
I tried creating a dictionary with column names as keys and colors as values:
color = {}
for column in data.columns:
    if 'prog +, DMSO+' in column:
        color[column] = 'red'
    else:
        color[column] = 'blue'
And then using the dictionary as color:
sns.boxplot(data = data.dropna(), orient="h", color=color[column])
plt.show()
This does not work, understandably (there is no loop to go through the dictionary). So I make a loop:
for column in data.columns:
    sns.boxplot(data = data[column], orient='h', color=color[column])
plt.show()
This does make boxplots of different colors but all on top of each other and without the correct labels. If I could somehow put these boxplot nicely in one plot below each other I'd be almost at what I want. Or is there a better way?
You should use the palette parameter, which handles multiple colors, rather than color, which handles a specific one. You can give palette a name, an ordered list, or a dictionary. The latter seems best suited to your question:
import seaborn as sns
sns.set_color_codes()
tips = sns.load_dataset("tips")
pal = {day: "r" if day == "Sat" else "b" for day in tips.day.unique()}
sns.boxplot(x="day", y="total_bill", data=tips, palette=pal)
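Applied to the question's situation, the palette dictionary can be built from the column names. The following is a minimal sketch; build_palette is a hypothetical helper, and the column names are sample values chosen for illustration only.

```python
def build_palette(columns, marker, hit_color="red", miss_color="blue"):
    """Map each column name to hit_color if it contains marker, else miss_color."""
    return {col: hit_color if marker in col else miss_color for col in columns}


# Sample column names (assumed for illustration)
columns = ['bar', 'prog +, DMSO+ 1', 'foo', 'something', 'prog +, DMSO+ 2']
pal = build_palette(columns, 'prog +, DMSO+')
print(pal)
```

With the real data, something like sns.boxplot(data=data.dropna(), orient="h", palette=build_palette(data.columns, 'prog +, DMSO+')) should then color the boxes by column name, though I have not checked this against every seaborn version.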
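You can also set the facecolor of individual boxes after plotting them all in one go, using ax.artists[i].set_facecolor('r'). Note that on recent matplotlib versions (3.5 and later, as far as I know) the boxes are stored in ax.patches instead, so ax.artists may come back empty there.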
For example:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame(
[[2, 4, 5, 6, 1],
[4, 5, 6, 7, 2],
[5, 4, 5, 5, 1],
[10, 4, 7, 8, 2],
[9, 3, 4, 6, 2],
[3, 3, 4, 4, 1]
],columns=['bar', 'prog +, DMSO+ 1', 'foo', 'something', 'prog +, DMSO+ 2'])
ax = sns.boxplot(data=df,orient='h')
boxes = ax.artists
for i, box in enumerate(boxes):
    if 'prog +, DMSO+' in df.columns[i]:
        box.set_facecolor('r')
    else:
        box.set_facecolor('b')
plt.tight_layout()
plt.show()
