I've been attempting to create a line graph with one subplot for each column of a pandas dataframe. My dataframe has variable names as the column names, datetime objects as the index, and percentages (floats) as the values.
I'm referencing Plotly: How to create subplots from each column in a pandas dataframe?, but when adapted for my scenario it results in a blank figure: no exception is raised, and an empty box whose size I can adjust is displayed, but nothing is drawn in it.
My code is
from plotly.subplots import make_subplots
import plotly.graph_objects as go
# get the number of columns
num_alerts = len(helpfulness_graph_data.columns)
# Get the alert names
alert_types = helpfulness_graph_data.columns.values.tolist()
# Create a subplot figure object with 1 column, num_rows rows, and titles for each alert
fig = make_subplots(
    rows=num_alerts, cols=1,
    subplot_titles=alert_types)
j = 1
for i in helpfulness_graph_data.columns:
    #print(helpfulness_graph_data[i].values)
    fig.append_trace(
        go.Scatter(
            {'x': helpfulness_graph_data.index,
             'y': helpfulness_graph_data[i].values}),
        row=j, col=1)
    j += 1
fig.update_layout(height=1200, width=600, title_text="Helpfulness Over Time")
fig.show()
For anyone else who comes across this: apparently this sometimes happens when running Plotly in JupyterLab. I found some additional questions with suggested solutions, but none of them worked for me. What did work, however, was running it in plain old Jupyter, so that's what I'd recommend.
I have recently started to produce scatterplot matrices via plotly.express. The plots reference custom data through a specialized hovertemplate. Here is an example:-
import plotly.express as px
import pandas as pd
df = pd.read_csv('./table.csv',dtype={'node':str,'ref_name':str,'population':int,'percent_use':float},keep_default_na=False,sep=r'\s\;\s',engine='python')
fig = px.scatter_matrix(df,
dimensions=['area', 'leakage', 'switch_power', 'internal_power', 'max_fall_drive', 'max_rise_drive'],
color='dont_use_status',
color_continuous_scale=px.colors.sequential.Bluered,
symbol='dont_use_status',
custom_data=['ref_name', 'population', 'percent_use', 'dont_use_status']
)
fig.update_traces(diagonal_visible=True,marker=dict(size=5))
fig.update_traces(hovertemplate=('<b>x: %{x}</b><br>'+'<b>y: %{y}</b><br>'+'<b>ref_name: %{customdata[0]}</b><br>'+'<b>population: %{customdata[1]}</b><br>'+'<b>percent_use: %{customdata[2]:.3f}</b><br>'+'<b>dont_use_status: %{customdata[3]}</b><br>'+'<extra></extra>'))
...and it works as intended; however, there is one little detail I cannot figure out: how do I replace the "x:" and "y:" in the hovertemplate with the actual x and y label names? Because it is a matrix, the x and y labels obviously change as you hover from one scatterplot to the next. I cannot seem to find the right keywords when searching. What %{} declaration is needed to dynamically retrieve the x axis and y axis labels?
Thanks!
Oy, found the answer right after posting. I need...
fig.update_traces(hovertemplate=('<b>%{xaxis.title.text}: %{x}</b><br>'+'<b>%{yaxis.title.text}: %{y}</b><br>'+'<b>ref_name: %{customdata[0]}</b><br>'+'<b>population: %{customdata[1]}</b><br>'+'<b>percent_use: %{customdata[2]:.3f}</b><br>'+'<b>dont_use_status: %{customdata[3]}</b><br>'+'<extra></extra>'))
...which is working.
I have a dataframe
a b c
0 2610.101010 13151.030303 33.000000
1 1119.459459 5624.216216 65.777778
2 3584.000000 18005.333333 3.000000
3 1227.272727 5303.272727 29.333333
4 1661.156504 8558.836558 499.666667
and I am plotting histograms using plotly.express and I am also printing a describe table with the following simple code:
import plotly.express as px
for col in df.columns:
    px.histogram(df, x=col, title=col).show()
    print(df[col].describe().T)
Is it possible to add next to each histogram the describe and save all the plots (together with their respective histograms) in a single pdf ?
One way to achieve this is by creating a subplot grid with one row per column and two cells per row (one for the histogram and one for the table). For example:
from plotly.subplots import make_subplots
titles = [[f"Histogram of {col}", f"Stats of {col}"] for col in df.columns]
titles = [item for sublist in titles for item in sublist]
fig = make_subplots(rows=3,
cols=2,
specs=[[{"type": "histogram"}, {"type": "table"}]] *3,
subplot_titles=titles)
for i, col in enumerate(df.columns):
    fig.add_histogram(x=df[col],
                      row=i+1,
                      col=1)
    fig.add_table(cells=dict(
                      values=df[col].describe().reset_index().T.values.tolist()
                  ),
                  header=dict(values=['Statistic', 'Value']),
                  row=i+1,
                  col=2
                  )
fig.update_layout(showlegend=False)
fig.show()
fig.write_image("example_output.pdf")
In the end, you can save the full figure (all six subplots together) as a PDF using .write_image(). You will need to install the kaleido or orca utility to do so. You can of course customize the output.
If you need to save each graph + table on a separate page of the PDF, you can take advantage of the PyPDF2 library. First, save each graph + table as its own PDF (as described above, but producing one PDF file per column rather than a single file), and then merge those files into one document with PyPDF2.
I am trying to make dynamic plots with plotly. I want to plot a count of data that has been aggregated (using groupby).
I want to facet the plot by color (and maybe even by column). The problem is that I want the value count to be displayed on each bar. With a histogram, I get smooth bars but I can't find how to display the count.
With a bar plot I can display the count, but I don't get smooth bars, and the count does not appear once for the whole bar but once for each case composing that bar.
Here is my code for the barplot
val = pd.DataFrame(data2.groupby(["program", "gender"])["experience"].value_counts())
px.bar(x=val.index.get_level_values(0), y=val, color=val.index.get_level_values(1), barmode="group", text=val)
It's basically the same for the histogram.
Thank you for your help!
px.histogram does not seem to have a text attribute. So if you're willing to do the binning yourself before producing your plot, I would use px.bar. Normally, you apply text to your bar plot using px.bar(..., text=<something>). But this gives the result you've described, with text for every subcategory of your data. Since px.bar adds data and annotations in the order in which the source is organized, we can simply set the text on the last subcategory applied using fig.data[-1].text = sums. The only challenge that remains is some data munging to retrieve the correct sums.
Complete code with data example:
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
# data
df = pd.DataFrame({'x':['a', 'b', 'c', 'd'],
'y1':[1, 4, 9, 16],
'y2':[1, 4, 9, 16],
'y3':[6, 8, 4.5, 8]})
df = df.set_index('x')
# calculations
# row sums: one total per x category (same as column sums of the transposed dataframe)
sums = df.sum(axis=1).tolist()
# change dataframe format from wide to long for input to plotly express
df = df.reset_index()
df = pd.melt(df, id_vars = ['x'], value_vars = df.columns[1:])
fig = px.bar(df, x='x', y='value', color='variable')
fig.data[-1].text = sums
fig.update_traces(textposition='inside')
fig.show()
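If your first graph is built with the graph_objects library, you can try:
# Use textposition='auto' for direct text.
# go.Bar has no `color` or `barmode` arguments: colors go in `marker_color`,
# and barmode is a layout setting.
counts = val.iloc[:, 0]
fig = go.Figure(data=[go.Bar(x=counts.index.get_level_values(0),
                             y=counts, text=counts, textposition='auto')])
fig.update_layout(barmode="group")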
I am trying to create a plot for two categories in a subplot grid. The 1st column represents category FF and the 2nd column represents category RF.
The x-axis is always time and the y-axis is one of the remaining columns. In other words, it is a plot of one column vs. the rest.
The 1st and 2nd categories always have the same column names; only the values differ.
I tried to generate the plot in a for loop, but the problem is that plotly treats each trace as distinct, so it draws lines with the same y-axis name in different colors. As a consequence, each of them also gets its own legend entry.
For example, in the first row (Time vs. Price2010) I want both subplots, FF and RF, to be drawn in the same color (say blue) with a single entry in the legend.
I tried adding legendgroup in go.Scatter but it doesn't help.
import pandas as pd
from pandas import DataFrame
from plotly import tools
from plotly.offline import init_notebook_mode, plot, iplot
import plotly.graph_objs as go
from plotly.subplots import make_subplots
CarA = {'Time': [10,20,30,40 ],
'Price2010': [22000,26000,27000,35000],
'Price2011': [23000,27000,28000,36000],
'Price2012': [24000,28000,29000,37000],
'Price2013': [25000,29000,30000,38000],
'Price2014': [26000,30000,31000,39000],
'Price2015': [27000,31000,32000,40000],
'Price2016': [28000,32000,33000,41000]
}
ff = DataFrame(CarA)
CarB = {'Time': [8,18,28,38 ],
'Price2010': [19000,20000,21000,22000],
'Price2011': [20000,21000,22000,23000],
'Price2012': [21000,22000,23000,24000],
'Price2013': [22000,23000,24000,25000],
'Price2014': [23000,24000,25000,26000],
'Price2015': [24000,25000,26000,27000],
'Price2016': [25000,26000,27000,28000]
}
rf = DataFrame(CarB)
Type = {
'FF' : ff,
'RF' : rf
}
fig = make_subplots(rows=len(ff.columns), cols=len(Type), subplot_titles=('FF','RF'),vertical_spacing=0.3/len(ff.columns))
labels = ff.columns[1:]
for indexC, (cat, values) in enumerate(Type.items()):
    for indexP, params in enumerate(values.columns[1:]):
        trace = go.Scatter(x=values.iloc[:,0], y=values[params], mode='lines', name=params,legendgroup=params)
        fig.append_trace(trace,indexP+1, indexC+1)
        fig.update_xaxes(title_text=values.columns[0],row=indexP+1, col=indexC+1)
        fig.update_yaxes(title_text=params,row=indexP+1, col=indexC+1)
fig.update_layout(height=2024, width=1024,title_text="Car Analysis")
iplot(fig)
It might not be a good solution, but so far this hack is all I have been able to come up with.
fig = make_subplots(rows=len(ff.columns), cols=len(Type), subplot_titles=('FF','RF'),vertical_spacing=0.2/len(ff.columns))
labels = ff.columns[1:]
colors = [ '#a60000', '#f29979', '#d98d36', '#735c00', '#778c23', '#185900', '#00a66f']
legend = True
for indexC, (cat, values) in enumerate(Type.items()):
    for indexP, params in enumerate(values.columns[1:]):
        trace = go.Scatter(x=values.iloc[:,0], y=values[params], mode='lines', name=params,legendgroup=params, showlegend=legend, marker=dict(
            color=colors[indexP]))
        fig.append_trace(trace,indexP+1, indexC+1)
        fig.update_xaxes(title_text=values.columns[0],row=indexP+1, col=indexC+1)
        fig.update_yaxes(title_text=params,row=indexP+1, col=indexC+1)
    # Only the first category (FF) contributes legend entries
    legend = False
fig.update_layout(height=1068, width=1024,title_text="Car Analysis")
If you combine your data into a single tidy data frame, you can use a simple Plotly Express call to make the chart: px.line() with color, facet_row and facet_col
I've been loving the Plotly Express graphs but want to create a dashboard with them now. I did not find any documentation for this. Is this possible?
I was struggling to find a response on this as well so I ended up having to create my own solution (see my full breakdown here: How To Create Subplots Using Plotly Express)
Essentially, make_subplots() takes in plot traces to make the subplots, not figure objects like the ones Express returns. So after creating your figures in Express, you can break the Express figure objects apart into their traces and then re-assemble those traces into subplots.
Code:
from dash import dcc
import plotly.express as px
import plotly.subplots as sp
# Create figures in Express
figure1 = px.line(my_df)
figure2 = px.bar(my_df)
# For as many traces that exist per Express figure, get the traces from each plot and store them in an array.
# This is essentially breaking down the Express fig into its traces
figure1_traces = []
figure2_traces = []
for trace in range(len(figure1["data"])):
    figure1_traces.append(figure1["data"][trace])
for trace in range(len(figure2["data"])):
    figure2_traces.append(figure2["data"][trace])
#Create a 1x2 subplot
this_figure = sp.make_subplots(rows=1, cols=2)
# Get the Express fig broken down as traces and add the traces to the proper plot within the subplot
for traces in figure1_traces:
    this_figure.append_trace(traces, row=1, col=1)
for traces in figure2_traces:
    this_figure.append_trace(traces, row=1, col=2)
# Wrap the combined figure in a Dash Graph component
final_graph = dcc.Graph(figure=this_figure)
Working off @mmarion's solution:
import plotly.express as px
from plotly.offline import plot
from plotly.subplots import make_subplots
figures = [
px.line(df1),
px.line(df2)
]
fig = make_subplots(rows=len(figures), cols=1)
for i, figure in enumerate(figures):
    for trace in range(len(figure["data"])):
        fig.append_trace(figure["data"][trace], row=i+1, col=1)
plot(fig)
This is easily extended into the column dimension.
From the docs:
**facet_row**
(string: name of column in data_frame) Values from this column are used to assign marks to facetted subplots in the vertical direction.
**facet_col**
(string: name of column in data_frame) Values from this column are used to assign marks to facetted subplots in the horizontal direction.
Some examples can be found here too:
https://medium.com/@plotlygraphs/introducing-plotly-express-808df010143d
Unfortunately, it is not at the moment. See the following issue to get updated: https://github.com/plotly/plotly_express/issues/83
I solved it by combining all the data in a single dataframe,
with a column called "type" that distinguishes the two plots.
Then I used facet_col to create (some kind of) subplot:
px.scatter(df3, x = 'dim1', y = 'dim2', color = 'labels', facet_col='type')
Try this function out. You pass the Plotly Express figures into the function and it returns a subplot figure.
#quick_subplot function
def quick_subplot(n, nrows, ncols, *args): #n:number of subplots, nrows:no.of. rows, ncols:no of cols, args
    import plotly.subplots as sp
    fig = list(args) #list to store figures
    combined_fig_title = str(input("Enter the figure title: "))
    tok1 = int(input("Do you want to disable printing legends after the first legend is printed ? {0:Disable, 1:Enable} : "))
    fig_traces = {} #Dictionary to store figure traces
    subplt_titles = []
    #Appending the traces of the figures to a list in fig_traces dictionary
    for i in range(n):
        fig_traces[f'fig_trace{i}'] = []
        for trace in range(len(fig[i]["data"])):
            fig_traces[f'fig_trace{i}'].append(fig[i]["data"][trace])
            #Use `and`, not `&`: `&` binds tighter than `!=` and `==`
            if i != 0 and tok1 == 0:
                fig[i]["data"][trace]['showlegend'] = False #Disabling other legends
        subplt_titles.append(str(input(f"Enter subplot title for subplot-{i+1}: ")))
    #Creating a subplot
    #Change height and width of figure here if necessary
    combined_fig = sp.make_subplots(rows=nrows, cols=ncols, subplot_titles=subplt_titles)
    combined_fig.update_layout(height=500, width=1200, title_text='<b>'+combined_fig_title+'</b>', title_font_size=25)
    #Appending the traces to the newly created subplot
    i = 0
    for a in range(1, nrows+1):
        for b in range(1, ncols+1):
            #Cells beyond the n supplied figures are left empty
            for traces in fig_traces.get(f"fig_trace{i}", []):
                combined_fig.append_trace(traces, row=a, col=b)
            i += 1
    #Setting axis titles
    #X-axis
    combined_fig['layout']['xaxis']['title']['font']['color'] = 'blue'
    tok2 = int(input("Separate x-axis titles?{0:'No',1:'Yes'}: "))
    for i in range(max(nrows, ncols)):
        if i == 0:
            combined_fig['layout']['xaxis']['title'] = str(input(
                "Enter x-axis's title: "))
        if tok2 and i != 0:
            combined_fig['layout'][f'xaxis{i+1}']['title'] = str(input(
                f"Enter x-axis {i+1}'s title: "))
            combined_fig['layout'][f'xaxis{i+1}']['title']['font']['color'] = 'blue'
    #Y-axis
    combined_fig['layout']['yaxis']['title']['font']['color'] = 'blue'
    tok3 = int(input("Separate y-axis titles?{0:'No',1:'Yes'}: "))
    for i in range(max(nrows, ncols)):
        if i == 0:
            combined_fig['layout']['yaxis']['title'] = str(input(
                "Enter y-axis's title: "))
        if tok3 and i != 0:
            combined_fig['layout'][f'yaxis{i+1}']['title'] = str(input(
                f"Enter y-axis {i+1}'s title: "))
            combined_fig['layout'][f'yaxis{i+1}']['title']['font']['color'] = 'blue'
    #Assigning a title string resets its font, so re-apply the color on the first axes
    combined_fig['layout']['xaxis']['title']['font']['color'] = 'blue'
    combined_fig['layout']['yaxis']['title']['font']['color'] = 'blue'
    return combined_fig
f = quick_subplot(2, 1, 2, fig1, fig2)
f.show()