Plotly pie graph not showing all data - python

I have noticed that my go.Pie graph only shows 2 of the 3 values held in the dataframe column. I noticed this when creating a px.treemap referencing the exact same column in the dataframe and it shows all 3 values.
Below is my code for the pie chart and then the treemap:
#docCategory count pie graph
valuesDocCat = df['docCategory'].value_counts()
figDocCat = go.Figure(data=[go.Pie(labels = df['docCategory'], values = valuesDocCat)])
figDocCat.update_traces(textposition = 'inside')
figDocCat.update_layout(uniformtext_minsize=14, uniformtext_mode='hide', title='Document Category breakdown')
#treeMap test graph
valuesTreemap = df['Kind'].value_counts()
figTreemap = px.treemap(df, path = ['docCategory', 'Kind'], color='docCategory')
figTreemap.update_traces(root_color='lightgrey')
figTreemap.update_layout(margin = dict(t=50, l=25, r=25, b=25))
My code above references df['docCategory'] in both instances, but as the images below show, the pie chart doesn't have the 'Unknown' field whereas the treemap does.
Any ideas on why? I have other pie charts that have more than 2 fields being referenced and no issues, it is just happening on this one.

Regarding your question, "Plotly pie graph not showing all data": it is showing everything you passed to it.
figDocCat = go.Figure(data=[go.Pie(labels = df['docCategory'], values = valuesDocCat)])
You are passing arrays of different lengths for labels and values. Plotly takes only the first 3 items from labels, some of which are the same.
To be consistent, this line should be figDocCat = go.Figure(data=[go.Pie(labels=valuesDocCat.index, values=valuesDocCat)]), i.e. both labels and values come from the same pandas Series.
I have simulated a data frame to demonstrate.
Full solution:
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
import numpy as np
cats = {
"Structured": ["Spreadsheet"],
"Unknown": ["System File", "Unrecognised"],
"Unstructured": ["Document", "Email", "Image", "Calendar Entry"],
}
df = pd.DataFrame(
[
{"docCategory": c, "Kind": np.random.choice(cats[c], 2)[0]}
for c in np.random.choice(list(cats.keys()), 25)
]
)
# docCategory count pie graph
valuesDocCat = df["docCategory"].value_counts()
figDocCat = go.Figure(data=[go.Pie(labels=valuesDocCat.index, values=valuesDocCat)])
figDocCat.update_traces(textposition="inside")
figDocCat.update_layout(
uniformtext_minsize=14, uniformtext_mode="hide", title="Document Category breakdown"
)
figDocCat.show()
# treeMap test graph
valuesTreemap = df["Kind"].value_counts()
figTreemap = px.treemap(df, path=["docCategory", "Kind"], color="docCategory")
figTreemap.update_traces(root_color="lightgrey")
figTreemap.update_layout(margin=dict(t=50, l=25, r=25, b=25))
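The length mismatch described in the answer can be seen with pandas alone. This is a minimal sketch using a small made-up docCategory column (the six rows are hypothetical, not the asker's data), and it assumes Plotly pairs labels and values by position, as the answer describes:

```python
import pandas as pd

# Hypothetical stand-in for the asker's column (6 rows, 3 distinct categories)
df = pd.DataFrame(
    {
        "docCategory": [
            "Structured", "Structured", "Unstructured",
            "Unknown", "Unstructured", "Structured",
        ]
    }
)

counts = df["docCategory"].value_counts()

# The raw column has one entry per row; value_counts has one entry per category.
print(len(df["docCategory"]), len(counts))

# Pairing by position means only the first len(counts) raw labels get used,
# and here they happen to repeat, so "Unknown" never appears.
mismatched_labels = list(df["docCategory"][: len(counts)])

# Taking labels from the same Series as the values keeps them aligned.
matched_labels = list(counts.index)

print(mismatched_labels)
print(matched_labels)
```

With labels taken from counts.index, every category appears exactly once next to its own count, which is what go.Pie needs.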

Related

How to avoid None when plotting sunburst chart using Plotly?

I am trying to create sunburst chart using Plotly. My data consists of several different types of journeys of varying steps. Some journeys are 10 steps others are 100. But for the purposes of simplicity, let us consider only 3 steps.
Here is the data -
import pandas as pd
import plotly.express as px
import numpy as np
data = {
'step0' :['home', 'home','product2','product1','home'],
'step1' : ['product1','product1', None, 'product2',None] ,
'step2' : ['product2','checkout', None, None,None] ,
'total_sales' : [50,20,10,0,7]
}
data_df = pd.DataFrame(data)
data_df.head()
I now try to plot these steps in sunburst chart. Because some journeys can be short, the subsequent steps are marked as None in those cases.
data_df = data_df.fillna('end')
plotting code -
fig = px.sunburst(data_df, path=['step0','step1','step2'], values='total_sales', height = 400)
fig.show()
As you can see above, the None values have been filled with 'end' because Plotly does not like NAs. But I do not want to show 'end' in the sunburst chart.
I want to re-create something like this -
https://bl.ocks.org/kerryrodden/7090426
How can I make this work in Plotly?
One workaround that uses what you already have is to fillna with a blank string such as " " (a single space) instead, so the word "end" doesn't show on the chart. Then you can loop through the marker colors and labels in the fig.data[0] object, changing the marker color to transparent "rgba(0,0,0,0)" for every label that matches the blank string.
The only caveat is that hovering will still show information for the sectors hidden by this workaround, but the static image will look correct.
For example:
import pandas as pd
import plotly.express as px
import numpy as np
data = {
'step0' :['home', 'home','product2','product1','home'],
'step1' : ['product1','product1', None, 'product2',None] ,
'step2' : ['product2','checkout', None, None,None] ,
'total_sales' : [50,20,10,0,7]
}
data_df = pd.DataFrame(data)
# data_df.head()
data_df = data_df.fillna(" ")
fig = px.sunburst(
data_df,
path=['step0','step1','step2'],
values='total_sales',
color=["red","orange","yellow","green","blue"],
height = 400
)
## set marker colors whose labels are " " to transparent
marker_colors = list(fig.data[0].marker['colors'])
marker_labels = list(fig.data[0]['labels'])
new_marker_colors = ["rgba(0,0,0,0)" if label==" " else color for (color, label) in zip(marker_colors, marker_labels)]
marker_colors = new_marker_colors
fig.data[0].marker['colors'] = marker_colors
fig.show()

How to get standard notation (rather than scientific) when hovering over pie chart in Plotly

I have a pie chart that displays worldwide movie sales by rating. When I hover over the chart, the worldwide sales are displayed in scientific notation. How do I fix this so that worldwide sales are shown in standard notation instead? I would appreciate it if anyone has a solution to this in express or graph objects (or both).
Thank you.
# formatting and importing data
import pandas as pd
movie_dataframe = pd.read_csv("https://raw.githubusercontent.com/NicholasTuttle/public_datasets/main/movie_data.csv") # importing dataset to dataframe
movie_dataframe['worldwide_gross'] = movie_dataframe['worldwide_gross'].str.replace(',', '', regex=False) # removing commas from column
movie_dataframe['worldwide_gross'] = movie_dataframe['worldwide_gross'].str.replace('$', '', regex=False) # removing dollar signs (literal match; as a regex, '$' is an end-of-string anchor)
movie_dataframe['worldwide_gross'] = movie_dataframe['worldwide_gross'].astype(float)
# narrowing dataframe to specific columns
movies_df = movie_dataframe.loc[:, ['title', 'worldwide_gross', 'rating', 'rt_score', 'rt_freshness']]
# plotly express
import plotly.express as px
fig = px.pie(movies_df,
values= movies_df['worldwide_gross'],
names= movies_df['rating'],
)
fig.show()
# plotly graph objects
import plotly.graph_objects as go
fig = go.Figure(go.Pie(
values = movies_df['worldwide_gross'],
labels = movies_df['rating']
))
fig.show()
Have a look here: https://plotly.com/python/hover-text-and-formatting/#disabling-or-customizing-hover-of-columns-in-plotly-express
Basically you pass hover_data a dictionary mapping each column name to a format string. The format string follows d3-format syntax.
import plotly.express as px
fig = px.pie(
movies_df, values= movies_df['worldwide_gross'], names= movies_df['rating'],
hover_data={
"worldwide_gross": ':d', # integer, standard notation
# "worldwide_gross": ':.2f', # float
}
)
fig.show()
For the graph objects API you need to write a hovertemplate:
https://plotly.com/python/reference/pie/#pie-hovertemplate
import plotly.graph_objects as go
fig = go.Figure(go.Pie(
values = movies_df['worldwide_gross'],
labels = movies_df['rating'],
hovertemplate='Rating: %{label}<br />World wide gross: %{value:d}<extra></extra>'
))
fig.show()
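Since d3-format is modeled on Python's format specification mini-language, Python's own format() can give a rough preview of what a specifier does before it goes into a hovertemplate. This is only a sketch: the gross figure is a made-up number, preview is a hypothetical helper, and the two mini-languages are similar rather than identical (for example, Python rejects the d type for floats).

```python
def preview(value: float, spec: str) -> str:
    """Format value with a specifier such as ',.0f' or '.2f'."""
    return format(value, spec)


gross = 2847246203.0  # hypothetical worldwide gross, in dollars

scientific = preview(gross, ".3g")  # exponent notation, like the unwanted hover text
plain = preview(gross, ".0f")       # standard notation, no decimals
grouped = preview(gross, ",.0f")    # standard notation with thousands separators

# The chosen specifier goes after the colon inside %{...} in a hovertemplate.
hovertemplate = (
    "Rating: %{label}<br>Worldwide gross: $%{value:,.0f}<extra></extra>"
)

print(scientific, plain, grouped)
```

The hovertemplate string built here could be passed to go.Pie(hovertemplate=...) in place of the one in the answer if thousands separators are wanted.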

Python Plotly express two bubble markers on the same scatter_geo?

Hi is it possible to have two different bubble types representing two different values from the same dataframe?
Currently my code is as follows:
covid = pd.read_csv('covid_19_data.csv')
fig = px.scatter_geo(covid, locations="Country/Region", locationmode="country names",animation_frame = "ObservationDate", hover_name = "Country/Region", size = "Confirmed", size_max = 100, projection= "natural earth")
Which produces the following output:
Map output
Is it possible to get it to show two different bubbles, one for confirmed cases and another for tweets? The data frame I'm working with is shown here:
Dataframe
Sure! You can freely add another dataset from px.scatter_geo() on an existing px.scatter_geo() using:
fig=px.scatter_geo()
fig.add_traces(fig1._data)
fig.add_traces(fig2._data)
Where fig1._data comes from a setup similar to yours in:
fig = px.scatter_geo(covid, locations="Country/Region", locationmode="country names",animation_frame = "ObservationDate", hover_name = "Country/Region", size = "Confirmed", size_max = 100, projection= "natural earth")
Since you haven't provided a dataset I'll use px.data.gapminder() and use the columns pop and gdpPercap, where the color of the latter is set to 'rgba(255,0,0,0.1)' which is a transparent red:
Complete code:
import plotly.express as px
df = px.data.gapminder().query("year == 2007")
fig1 = px.scatter_geo(df, locations="iso_alpha",
size="pop", # size of markers, "pop" is one of the columns of gapminder
)
fig2 = px.scatter_geo(df, locations="iso_alpha",
size="gdpPercap", # size of markers, "gdpPercap" is one of the columns of gapminder
)
# fig1.add_traces(fig2._data)
# fig1.show()
fig=px.scatter_geo()
fig.add_traces(fig1._data)
fig.add_traces(fig2._data)
fig.data[1].marker.color = 'rgba(255,0,0,0.1)'
f = fig.full_figure_for_development(warn=False)
fig.show()
Please let me know how this works out for you.

Add dropdown menu to plotly express treemap

I am currently trying to add a dropdown menu to my treemap plot
The code I am using :
import pandas as pd
import plotly.express as px
fig = px.treemap(df,
path=['RuleName','RuleNumber','ParaInvolved',"CreationP","MAjP"],
color='Somme',
hover_data=["RuleDecision","RuleMAJ"],
color_continuous_scale='RdBu')
fig.show()
The problem I am facing is that my column "RuleName" has 151 different values (out of 1300 rows in total), so I am trying to add a dropdown that lets me choose which RuleName value to plot the treemap for. For now I am using a crude method: filtering my dataframe by each RuleName value, which gives me 151 separate treemaps. I haven't found a solution on this website or any other.
Thanks for your help
Here I'm basically using the same logic from this answer but I use px.treemap(...).data[0] to produce the traces instead of go.
import plotly.express as px
import plotly.graph_objects as go
df = px.data.tips()
# Build one (key, dataframe) pair per day
# In your case this will be groupby('RuleName')
# For every element d,
# d[0] is the name (key) and d[1] is the dataframe
dfs = list(df.groupby("day"))
first_title = dfs[0][0]
traces = []
buttons = []
for i, d in enumerate(dfs):
    visible = [False] * len(dfs)
    visible[i] = True
    name = d[0]
    traces.append(
        px.treemap(d[1],
                   path=['day', 'time', 'sex'],
                   values='total_bill').update_traces(visible=True if i==0 else False).data[0]
    )
    buttons.append(dict(label=name,
                        method="update",
                        args=[{"visible": visible},
                              {"title": f"{name}"}]))
updatemenus = [{'active':0, "buttons":buttons}]
fig = go.Figure(data=traces,
layout=dict(updatemenus=updatemenus))
fig.update_layout(title=first_title, title_x=0.5)
fig.show()
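The button logic above boils down to giving each button a visibility mask with a single True at its own position. A minimal sketch of that step in isolation, with make_buttons as a hypothetical helper name and the day names used only as sample input:

```python
def make_buttons(names):
    """Return one 'update' button per name, each showing only its own trace."""
    buttons = []
    for i, name in enumerate(names):
        # One-hot mask: True only for the trace at position i
        visible = [j == i for j in range(len(names))]
        buttons.append(
            dict(
                label=str(name),
                method="update",
                args=[{"visible": visible}, {"title": str(name)}],
            )
        )
    return buttons


buttons = make_buttons(["Fri", "Sat", "Sun", "Thur"])
print(buttons[1]["args"][0]["visible"])  # [False, True, False, False]
```

The order of names has to match the order in which the traces are added to the figure, since the mask is applied by trace position.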

Plotly subplot represent same y-axis name with same color and single legend

I am trying to create a plot for two categories in a subplot. The 1st column represents category FF and the 2nd column represents category RF.
The x-axis is always time and the y-axis is each of the remaining columns. In other words, it is a plot of one column vs the rest.
The 1st and 2nd categories always have the same column names; only the values differ.
I tried to generate the plot in a for loop, but the problem is that plotly treats each column name as distinct and therefore draws the lines for the same y-axis name in different colors. As a consequence, a separate legend entry is also created for each.
For example, in the first row (Time vs Price2010) I want both subplots, FF and RF, to be drawn in the same color (say blue) with a single entry in the legend.
I tried adding legendgroup in go.Scatter but it doesn't help.
import pandas as pd
from pandas import DataFrame
from plotly import tools
from plotly.offline import init_notebook_mode, plot, iplot
import plotly.graph_objs as go
from plotly.subplots import make_subplots
CarA = {'Time': [10,20,30,40 ],
'Price2010': [22000,26000,27000,35000],
'Price2011': [23000,27000,28000,36000],
'Price2012': [24000,28000,29000,37000],
'Price2013': [25000,29000,30000,38000],
'Price2014': [26000,30000,31000,39000],
'Price2015': [27000,31000,32000,40000],
'Price2016': [28000,32000,33000,41000]
}
ff = DataFrame(CarA)
CarB = {'Time': [8,18,28,38 ],
'Price2010': [19000,20000,21000,22000],
'Price2011': [20000,21000,22000,23000],
'Price2012': [21000,22000,23000,24000],
'Price2013': [22000,23000,24000,25000],
'Price2014': [23000,24000,25000,26000],
'Price2015': [24000,25000,26000,27000],
'Price2016': [25000,26000,27000,28000]
}
rf = DataFrame(CarB)
Type = {
'FF' : ff,
'RF' : rf
}
fig = make_subplots(rows=len(ff.columns), cols=len(Type), subplot_titles=('FF','RF'),vertical_spacing=0.3/len(ff.columns))
labels = ff.columns[1:]
for indexC, (cat, values) in enumerate(Type.items()):
    for indexP, params in enumerate(values.columns[1:]):
        trace = go.Scatter(x=values.iloc[:,0], y=values[params], mode='lines', name=params,legendgroup=params)
        fig.append_trace(trace,indexP+1, indexC+1)
        fig.update_xaxes(title_text=values.columns[0],row=indexP+1, col=indexC+1)
        fig.update_yaxes(title_text=params,row=indexP+1, col=indexC+1)
fig.update_layout(height=2024, width=1024,title_text="Car Analysis")
iplot(fig)
It might not be a good solution, but so far this hack is all I have been able to come up with.
fig = make_subplots(rows=len(ff.columns), cols=len(Type), subplot_titles=('FF','RF'),vertical_spacing=0.2/len(ff.columns))
labels = ff.columns[1:]
colors = [ '#a60000', '#f29979', '#d98d36', '#735c00', '#778c23', '#185900', '#00a66f']
legend = True
for indexC, (cat, values) in enumerate(Type.items()):
    for indexP, params in enumerate(values.columns[1:]):
        trace = go.Scatter(x=values.iloc[:,0], y=values[params], mode='lines', name=params,legendgroup=params, showlegend=legend, marker=dict(color=colors[indexP]))
        fig.append_trace(trace,indexP+1, indexC+1)
        fig.update_xaxes(title_text=values.columns[0],row=indexP+1, col=indexC+1)
        fig.update_yaxes(title_text=params,row=indexP+1, col=indexC+1)
    fig.update_layout(height=1068, width=1024,title_text="Car Analysis")
    # only the first category contributes legend entries
    legend = False
If you combine your data into a single tidy data frame, you can use a simple Plotly Express call to make the chart: px.line() with color, facet_row and facet_col
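A minimal sketch of that reshaping, assuming a cut-down version of the question's ff and rf frames (two price columns instead of seven). The helper names to_tidy and plot_prices, and the new column names Type, Year and Price, are illustrative choices rather than anything from the original code:

```python
import pandas as pd

ff = pd.DataFrame({"Time": [10, 20, 30, 40],
                   "Price2010": [22000, 26000, 27000, 35000],
                   "Price2011": [23000, 27000, 28000, 36000]})
rf = pd.DataFrame({"Time": [8, 18, 28, 38],
                   "Price2010": [19000, 20000, 21000, 22000],
                   "Price2011": [20000, 21000, 22000, 23000]})


def to_tidy(frames):
    """Stack the wide frames and melt them to one row per (Time, Type, Year)."""
    stacked = pd.concat(
        [frame.assign(Type=name) for name, frame in frames.items()],
        ignore_index=True,
    )
    return stacked.melt(id_vars=["Time", "Type"], var_name="Year", value_name="Price")


tidy = to_tidy({"FF": ff, "RF": rf})
print(tidy.head())


def plot_prices(tidy_frame):
    """Build the faceted line chart; Plotly is imported only when this is called."""
    import plotly.express as px

    return px.line(tidy_frame, x="Time", y="Price",
                   color="Year", facet_row="Year", facet_col="Type")
```

Calling plot_prices(tidy).show() should then draw one row per price column and one column per category, with color tied to the Year value so FF and RF share a color and a single legend entry.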
