I am currently trying to add a dropdown menu to my treemap plot.
The code I am using:
import pandas as pd
import plotly.express as px
fig = px.treemap(df,
                 path=['RuleName', 'RuleNumber', 'ParaInvolved', 'CreationP', 'MAjP'],
                 color='Somme',
                 hover_data=['RuleDecision', 'RuleMAJ'],
                 color_continuous_scale='RdBu')
fig.show()
The problem I am facing is that my column "RuleName" has 151 different values (out of 1300 rows in total), so I'm trying to add a button that lets me choose which RuleName value the treemap is plotted for. For now I am using a crude method: I filter my dataframe by each RuleName value, which leaves me with 151 different treemaps. I haven't found a solution on this website or any other.
Thanks for your help
Here I'm basically using the same logic as in this answer, but I use px.treemap(...).data[0] to produce the traces instead of building them with plotly.graph_objects (go).
import plotly.express as px
import plotly.graph_objects as go
df = px.data.tips()
# Build one (name, dataframe) pair for every day.
# In your case this will be groupby('RuleName').
# For every element d,
# d[0] is the name (key) and d[1] is the dataframe.
dfs = list(df.groupby("day"))
first_title = dfs[0][0]
traces = []
buttons = []
for i, d in enumerate(dfs):
    # Only the i-th trace is visible when this button is selected
    visible = [False] * len(dfs)
    visible[i] = True
    name = d[0]
    traces.append(
        px.treemap(d[1],
                   path=['day', 'time', 'sex'],
                   values='total_bill').update_traces(visible=(i == 0)).data[0]
    )
    buttons.append(dict(label=name,
                        method="update",
                        args=[{"visible": visible},
                              {"title": f"{name}"}]))
updatemenus = [{'active': 0, "buttons": buttons}]
fig = go.Figure(data=traces,
                layout=dict(updatemenus=updatemenus))
fig.update_layout(title=first_title, title_x=0.5)
fig.show()
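The visibility masks are the core of this approach: button i carries a list that is True only at position i, and the update method applies that list to the traces in order. Here is a minimal sketch of just that step in plain Python. build_visibility_buttons is a hypothetical helper name, and the labels are made-up stand-ins for RuleName values.

```python
def build_visibility_buttons(labels):
    """Return one updatemenus button per label; button i shows only trace i.

    Hypothetical helper for illustration, not part of plotly.
    """
    buttons = []
    for i, label in enumerate(labels):
        # One flag per trace, True only for the trace belonging to this label
        visible = [j == i for j in range(len(labels))]
        buttons.append(
            dict(
                label=str(label),
                method="update",
                # First dict updates trace attributes, second updates the layout
                args=[{"visible": visible}, {"title": str(label)}],
            )
        )
    return buttons


# Made-up labels standing in for the 151 RuleName values
buttons = build_visibility_buttons(["Rule A", "Rule B", "Rule C"])
updatemenus = [{"active": 0, "buttons": buttons}]
```

The resulting list can be passed as layout=dict(updatemenus=updatemenus), provided the traces were appended in the same order as the labels.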
I'm trying to create faceted maps by the column rank in my df. Each map will display the product for each state. I want the color of the product to be consistent across maps.
With the solution below I can achieve that, but the legend will show multiple entries for the same product, one for each state. How can I have the legend show only one entry per distinct product?
import pandas as pd
import plotly.express as px
from random import randint
df = pd.DataFrame({'rank': [1,1,1,1,2,2,2,2],'product':['A','B','C','D','C','D','Z','X'],'state':['WA','OR','CA','ID','WA','OR','CA','ID']})
unique_hi = df['product'].unique()
color_discrete_map = {unique_hi[k]: '#%06X' % randint(0, 0xFFFFFF) for k in range(len(unique_hi))}
fig = px.choropleth(df, color='product', facet_col="rank", facet_col_wrap=2,
                    locations="state",  # featureidkey="properties.district",
                    locationmode="USA-states",
                    projection="mercator", height=600,
                    color_discrete_map=color_discrete_map,
                    title='Regional products'
                    )
fig.update_geos(fitbounds="locations", visible=False)
fig.update_layout(margin={"r":0,"t":30,"l":0,"b":0})
fig.show()
If you inspect fig.data for the created map, you will find the legend name of each trace. The update below collects those names and keeps the legend entry only for the first trace with each name, hiding it for the duplicates.
import pandas as pd
import plotly.express as px
from random import randint
df = pd.DataFrame({'rank': [1,1,1,1,2,2,2,2],'product':['A','B','C','D','C','D','Z','X'],'state':['WA','OR','CA','ID','WA','OR','CA','ID']})
unique_hi = df['product'].unique()
color_discrete_map = {unique_hi[k]: '#%06X' % randint(0, 0xFFFFFF) for k in range(len(unique_hi))}
fig = px.choropleth(df, color='product', facet_col="rank", facet_col_wrap=2,
                    locations="state",  # featureidkey="properties.district",
                    locationmode="USA-states",
                    projection="mercator", height=600,
                    color_discrete_map=color_discrete_map,
                    title='Regional products'
                    )
fig.update_geos(fitbounds="locations", visible=False)
fig.update_layout(margin={"r":0,"t":30,"l":0,"b":0})
# update: hide the legend entry of every trace whose name has already been seen
names = set()
fig.for_each_trace(
    lambda trace:
        trace.update(showlegend=False)
        if (trace.name in names) else names.add(trace.name))
fig.show()
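The lambda in the update step works because set.add() returns None, which makes it easy to misread. Here is the same first-occurrence rule written as an explicit loop, as a sketch: first_occurrence_flags is a hypothetical helper, and the list of names is a made-up stand-in for [trace.name for trace in fig.data].

```python
def first_occurrence_flags(names):
    """Return True for the first occurrence of each name, False for repeats."""
    seen = set()
    flags = []
    for name in names:
        flags.append(name not in seen)
        seen.add(name)
    return flags


# Made-up trace names: the order in a real figure may differ
trace_names = ["A", "B", "C", "D", "C", "D", "Z", "X"]
flags = first_occurrence_flags(trace_names)
```

With a real figure, the flags could be applied with for trace, flag in zip(fig.data, flags): trace.showlegend = flag.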
A product name cannot be added as an annotation positioned by map coordinates (I referred to this for the rationale). The following code does create an annotation, but it is placed in paper coordinates, so every product has to be positioned manually. Upon further investigation, it seems that a combination of go.Choroplethmapbox and go.Scattergeo would do it. In that case, you will need to rewrite the code from scratch.
fig.add_annotation(
    x=0.2,
    xref='paper',
    y=0.85,
    yref='paper',
    text='A',
    showarrow=False,
    font=dict(
        color='yellow',
        size=14
    )
)
I have a relatively simple issue, but cannot find any answer online that addresses it. Starting from a simple boxplot:
import plotly.express as px
df = px.data.iris()
fig = px.box(
    df, x='species', y='sepal_length'
)
val_counts = df['species'].value_counts()
I would now like to add val_counts (in this dataset, 50 for each species) to the plots, preferably on either of the following places:
On top of the median line
On top of the max/min line
Inside the hoverbox
How can I achieve this?
The snippet below places the count (50 for every unique value of df['species']) on top of the max line using fig.add_annotation:
for s in df.species.unique():
    fig.add_annotation(x=s,
                       y=df[df['species'] == s]['sepal_length'].max(),
                       text=str(len(df[df['species'] == s]['species'])),
                       yshift=10,
                       showarrow=False
                       )
Complete code:
import plotly.express as px
df = px.data.iris()
fig = px.box(
    df, x='species', y='sepal_length'
)
# Annotate each box with the number of rows for that species, just above its maximum
for s in df.species.unique():
    fig.add_annotation(x=s,
                       y=df[df['species'] == s]['sepal_length'].max(),
                       text=str(len(df[df['species'] == s]['species'])),
                       yshift=10,
                       showarrow=False
                       )
fig.show()
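The loop filters the dataframe twice per species. As an alternative sketch, a single groupby can compute both the annotation height and the count. The small dataframe here is made-up stand-in data for px.data.iris(), and collecting the keyword arguments in a list called annotations is just one assumed way to organize them.

```python
import pandas as pd

# Made-up stand-in for px.data.iris(), kept small so the result is easy to check
df = pd.DataFrame(
    {
        "species": ["setosa", "setosa", "setosa", "virginica", "virginica"],
        "sepal_length": [5.1, 4.9, 5.8, 6.3, 7.9],
    }
)

# One row per species with the top of the box plot and the number of rows
summary = df.groupby("species")["sepal_length"].agg(["max", "count"])

# Keyword arguments for fig.add_annotation, one dict per species
annotations = [
    dict(x=species, y=row["max"], text=str(int(row["count"])), yshift=10, showarrow=False)
    for species, row in summary.iterrows()
]
```

Each dict could then be applied to the box plot with fig.add_annotation(**a).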
Using the same approach that I presented in this answer (Change Plotly Boxplot Hover Data):
Calculate all the measures a box plot calculates, plus the additional measure you want (count).
Overlay bar traces over the box plot traces so the hover has all the required measures.
import plotly.express as px
df = px.data.iris()
# summarize data as per same dimensions as boxplot
df2 = df.groupby("species").agg(
    **{
        m
        if isinstance(m, str)
        else m[0]: ("sepal_length", m if isinstance(m, str) else m[1])
        for m in [
            "max",
            ("q75", lambda s: s.quantile(0.75)),
            "median",
            ("q25", lambda s: s.quantile(0.25)),
            "min",
            "count",
        ]
    }
).reset_index().assign(y=lambda d: d["max"] - d["min"])
# overlay bar over boxplot
px.bar(
    df2,
    x="species",
    y="y",
    base="min",
    hover_data={c: c not in ["y", "species"] for c in df2.columns},
    hover_name="species",
).update_traces(opacity=0.1).add_traces(px.box(df, x="species", y="sepal_length").data)
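The dict comprehension in the aggregation is compact but hard to read. As a sketch, the same summary can be spelled out with pandas named aggregation. q25 and q75 are helper functions defined here for illustration, and the dataframe is made-up stand-in data rather than the iris dataset.

```python
import pandas as pd


def q25(s):
    """First quartile of a Series."""
    return s.quantile(0.25)


def q75(s):
    """Third quartile of a Series."""
    return s.quantile(0.75)


# Made-up stand-in data
df = pd.DataFrame(
    {
        "species": ["a"] * 5 + ["b"] * 3,
        "sepal_length": [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0],
    }
)

df2 = (
    df.groupby("species")
    .agg(
        max=("sepal_length", "max"),
        q75=("sepal_length", q75),
        median=("sepal_length", "median"),
        q25=("sepal_length", q25),
        min=("sepal_length", "min"),
        count=("sepal_length", "count"),
    )
    .reset_index()
    # Bar height: the bar starts at "min" (its base) and spans up to "max"
    .assign(y=lambda d: d["max"] - d["min"])
)
```

The resulting df2 has the same columns as in the answer, so it could be passed to px.bar in the same way.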
I have noticed that my go.Pie graph only shows 2 of the 3 values held in the dataframe column. I noticed this when creating a px.treemap referencing the exact same column in the dataframe and it shows all 3 values.
Below is my code for the pie chart and then the treemap
#docCategory count pie graph
valuesDocCat = df['docCategory'].value_counts()
figDocCat = go.Figure(data=[go.Pie(labels = df['docCategory'], values = valuesDocCat)])
figDocCat.update_traces(textposition = 'inside')
figDocCat.update_layout(uniformtext_minsize=14, uniformtext_mode='hide', title='Document Category breakdown')
#treeMap test graph
valuesTreemap = df['Kind'].value_counts()
figTreemap = px.treemap(df, path = ['docCategory', 'Kind'], color='docCategory')
figTreemap.update_traces(root_color='lightgrey')
figTreemap.update_layout(margin = dict(t=50, l=25, r=25, b=25))
You can see that my code above references df['docCategory'] in both instances, but as the images below show, the pie chart doesn't have the 'Unknown' field whereas the treemap does.
Any ideas why? I have other pie charts that reference more than 2 fields without any issues; it only happens on this one.
Regarding your question, "Plotly pie graph not showing all data": it is showing everything it was given.
figDocCat = go.Figure(data=[go.Pie(labels = df['docCategory'], values = valuesDocCat)])
You are passing arrays of different lengths for labels and values. Plotly takes the first 3 items from labels, some of which are the same.
To be consistent, this line should be figDocCat = go.Figure(data=[go.Pie(labels=valuesDocCat.index, values=valuesDocCat)]), i.e. both labels and values come from the same pandas Series.
I have simulated a data frame to demonstrate.
Full solution:
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
import numpy as np
cats = {
    "Structured": ["Spreadsheet"],
    "Unknown": ["System File", "Unrecognised"],
    "Unstructured": ["Document", "Email", "Image", "Calendar Entry"],
}
# simulate 25 rows, each with a random category and a random Kind from that category
df = pd.DataFrame(
    [
        {"docCategory": c, "Kind": np.random.choice(cats[c], 2)[0]}
        for c in np.random.choice(list(cats.keys()), 25)
    ]
)
# docCategory count pie graph
valuesDocCat = df["docCategory"].value_counts()
figDocCat = go.Figure(data=[go.Pie(labels=valuesDocCat.index, values=valuesDocCat)])
figDocCat.update_traces(textposition="inside")
figDocCat.update_layout(
    uniformtext_minsize=14, uniformtext_mode="hide", title="Document Category breakdown"
)
figDocCat.show()
# treeMap test graph
valuesTreemap = df["Kind"].value_counts()
figTreemap = px.treemap(df, path=["docCategory", "Kind"], color="docCategory")
figTreemap.update_traces(root_color="lightgrey")
figTreemap.update_layout(margin=dict(t=50, l=25, r=25, b=25))
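The length mismatch behind the missing slice can be checked without drawing anything. Here is a small sketch with a made-up docCategory column: the raw column has one entry per row, while value_counts() has one entry per category, so pairing them position by position (which is how the answer describes plotly's behaviour) attaches the counts to the wrong labels.

```python
import pandas as pd

# Made-up data: 6 rows, 3 distinct categories
df = pd.DataFrame(
    {
        "docCategory": [
            "Structured", "Unknown", "Structured",
            "Unstructured", "Structured", "Unknown",
        ]
    }
)

counts = df["docCategory"].value_counts()

# Mismatched: per-row labels paired with per-category counts.
# zip stops at the shorter input, so only the first 3 row labels are used.
mismatched = list(zip(df["docCategory"], counts))

# Consistent: labels and values both come from the same Series
labels = counts.index.tolist()
values = counts.tolist()
```

In this made-up example "Unstructured" never appears among the mismatched labels, which mirrors the missing slice in the question.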
I'm trying to create a plot using Plotly that allows you to select from dropdown menus what features are being plotted on the x and y axis. My approach works, but there's a set of actions that remove the coloring of the points being plotted.
Here's a Colab with the steps to reproduce this written out, and done with minimal code (Plotly plays nice with Colab):
https://colab.research.google.com/drive/19PCS8QH9n6VVN9UBOKMay99VuSXq1QGG?usp=sharing
If you want to use your own environment, the following code will reproduce the issue after you've done the following 2 steps:
Pick one of the two dropdown menus and change the selected value at least one time
Change the selected value on the dropdown menu you have not changed yet
You should then see that the original coloring of the points is lost.
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.io as pio
pio.templates.default = "plotly_dark"
def get_correlation_figure_please(merged_df):
    cols = [col for col, t in zip(merged_df.columns, merged_df.dtypes) if t != object]
    start_dropdown_indices = [0, 0]
    # Create the scatter plot of the initially selected variables
    fig = px.scatter(
        merged_df,
        x=cols[start_dropdown_indices[0]],
        y=cols[start_dropdown_indices[1]],
        color='serial_number_id',
    )
    # Create the drop-down menus which will be used to choose the desired file characteristics for comparison
    drop_downs = []
    for axis in ['x', 'y']:
        drop_downs.append([
            dict(
                method='update',
                args=[
                    {axis: [merged_df[cols[k]]]},
                    {'%saxis.title.text' % axis: cols[k]},
                    # {'color':[merged_df['serial_number_id']],'color_discrete_map':SERIALS_TO_INDEX},
                ],
                label=cols[k]) for k in range(len(cols))
        ])
    # Sets up various aspects of the Plotly figure that is currently being produced. This ranges from
    # aesthetic things to setting the dropdown menus as part of the figure
    fig.update_layout(
        title_x=0.4,
        showlegend=False,
        updatemenus=[{
            'active': start_j,
            'buttons': drop_down,
            'x': 1.125,
            'y': y_height,
            'xanchor': 'left',
            'yanchor': 'top',
        } for drop_down, start_j, y_height in zip(drop_downs, start_dropdown_indices, [1, .85])])
    return fig
# Set up a dummy dataframe with 20 points, each with 5 features
df = pd.DataFrame({str(j):np.random.rand(20) for j in range(5)})
# Set up a column of dummied serial numbers (to be used to decide the coloring of each point)
df['serial_number_id'] = df['1'].map(lambda x : '0' if x < 1/3 else ('1' if x < 2/3 else '2'))
fig = get_correlation_figure_please(df)
fig.show()
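One possible explanation, offered as an assumption rather than a confirmed diagnosis: with color='serial_number_id', px.scatter creates one trace per serial number, but each button sends a single full column, which plotly then applies to every trace. Once both axes have been changed, every trace holds all the points and they are drawn on top of each other. Here is a sketch of building one array per trace instead, where per_trace_values is a hypothetical helper and trace_names stands in for [t.name for t in fig.data].

```python
import pandas as pd


def per_trace_values(df, group_col, value_col, trace_names):
    """Return one list of values per trace, in the same order as the traces."""
    return [
        df.loc[df[group_col] == name, value_col].tolist()
        for name in trace_names
    ]


# Made-up data: two features and the grouping column used for the colour
df = pd.DataFrame(
    {
        "0": [0.1, 0.5, 0.9, 0.2],
        "1": [0.3, 0.4, 0.8, 0.7],
        "serial_number_id": ["0", "1", "2", "0"],
    }
)

# Stand-in for [t.name for t in fig.data]
trace_names = ["0", "1", "2"]

# Restyle dict for a button that puts column "1" on the x axis
restyle_args = {"x": per_trace_values(df, "serial_number_id", "1", trace_names)}
```

In the function from the question, a dict built this way would take the place of {axis: [merged_df[cols[k]]]} in each button's args.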
Hi, is it possible to have two different bubble types representing two different values from the same dataframe?
Currently my code is as follows:
covid = pd.read_csv('covid_19_data.csv')
fig = px.scatter_geo(covid, locations="Country/Region", locationmode="country names",animation_frame = "ObservationDate", hover_name = "Country/Region", size = "Confirmed", size_max = 100, projection= "natural earth")
Which produces the following output:
[Image: map output]
Is it possible to get it to show two different bubbles, one for confirmed cases and another for tweets? The data frame I'm working with is shown here:
[Image: dataframe]
Sure! You can add the traces of one px.scatter_geo() figure to another px.scatter_geo() figure using:
fig=px.scatter_geo()
fig.add_traces(fig1._data)
fig.add_traces(fig2._data)
Where fig1._data comes from a setup similar to yours in:
fig = px.scatter_geo(covid, locations="Country/Region", locationmode="country names",animation_frame = "ObservationDate", hover_name = "Country/Region", size = "Confirmed", size_max = 100, projection= "natural earth")
Since you haven't provided a dataset, I'll use px.data.gapminder() with the columns pop and gdpPercap, where the color of the latter is set to 'rgba(255,0,0,0.1)', which is a transparent red:
Complete code:
import plotly.express as px
df = px.data.gapminder().query("year == 2007")
fig1 = px.scatter_geo(df, locations="iso_alpha",
                      size="pop",  # size of markers, "pop" is one of the columns of gapminder
                      )
fig2 = px.scatter_geo(df, locations="iso_alpha",
                      size="gdpPercap",  # size of markers, "gdpPercap" is another column of gapminder
                      )
# fig1.add_traces(fig2._data)
# fig1.show()
# start from an empty geo figure and add the traces of both figures to it
fig = px.scatter_geo()
fig.add_traces(fig1._data)
fig.add_traces(fig2._data)
# second trace (gdpPercap) in transparent red so both bubble types stay visible
fig.data[1].marker.color = 'rgba(255,0,0,0.1)'
fig.show()
Please let me know how this works out for you.