I am trying to plot the training and test accuracy of my neural network using plotly.
I also want to add a marker at the maximum of each curve, with text that says where the maximum occurred and what its value was. I tried doing something like in this example.
Here is my MCVE:
import plotly.graph_objects as go
data = {
'test acc': [1, 2, 3, 4, 5, 6, 7, 9, 10],
'train acc': [3, 5, 5, 6, 7, 8, 9, 10, 8]
}
fig = go.Figure()
color_train = 'rgb(255, 0, 0)'
color_test = 'rgb(0, 255, 0)'
assert len(data["train acc"]) == len(data["test acc"])
x = list(range(len(data["train acc"])))
fig.add_trace(go.Scatter(x=x,
y=data["train acc"],
mode='lines',
name='train acc',
line_color=color_train))
fig.add_trace(go.Scatter(x=x,
y=data["test acc"],
mode='lines',
name='test acc',
line_color=color_test))
# Max points
train_max = max(data["train acc"])
test_max = max(data["test acc"])
# ATTENTION! this will only give you first occurrence
train_max_index = data["train acc"].index(train_max)
test_max_index = data["test acc"].index(test_max)
fig.add_trace(go.Scatter(x=[train_max_index],
y=[train_max],
mode='markers',
name='max value train',
text=['{}%'.format(int(train_max * 100))],
textposition="top center",
marker_color=color_train))
fig.add_trace(go.Scatter(x=[test_max_index],
y=[test_max],
mode='markers',
name='max value test',
text=['{}%'.format(int(test_max*100))],
textposition="top center",
marker_color=color_test))
fig.update_layout(title='Train vs Test accuracy',
xaxis_title='epochs',
yaxis_title='accuracy (%)'
)
fig.show()
However, my output figure is the following:
As you can see, the value is not being displayed as in the example I found.
How can I make it appear?
If you'd only like to highlight a few certain values, use add_annotation(). In your case just find the max and min Y for the X that you'd like to put into focus. Lacking a data sample from your side, here's how I'd do it with a generic data sample:
Plot:
Code:
import plotly.graph_objects as go
import plotly.io as pio
pio.renderers.default='browser'
fig = go.Figure()
xVars1=[0, 1, 2, 3, 4, 5, 6, 7, 8]
yVars1=[0, 1, 3, 2, 4, 3, 4, 6, 5]
xVars2=[0, 1, 2, 3, 4, 5, 6, 7, 8]
yVars2=[0, 4, 5, 1, 2, 2, 3, 4, 2]
fig.add_trace(go.Scatter(
x=xVars1,
y=yVars1
))
fig.add_trace(go.Scatter(
x=xVars2,
y=yVars2
))
fig.add_annotation(
x=yVars1.index(max(yVars1)),
y=max(yVars1),
text="yVars1 max")
fig.add_annotation(
x=yVars2.index(max(yVars2)),
y=max(yVars2),
text="yVars2 max")
fig.update_annotations(dict(
xref="x",
yref="y",
showarrow=True,
arrowhead=7,
ax=0,
ay=-40
))
fig.update_layout(showlegend=False)
fig.show()
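As a complement to the annotation approach, here is a minimal sketch of another option. In the question's code the `text` is set but the trace `mode` is `'markers'`, and Plotly only draws the text when the mode includes `text`. The helper names `max_marker_kwargs` and `build_figure`, the label format, and the marker color are assumptions made for illustration.

```python
def max_marker_kwargs(values, name, color):
    """Build go.Scatter keyword arguments that mark and label the maximum.

    Hypothetical helper; like the question's code, it only uses the first
    occurrence of the maximum.
    """
    peak = max(values)
    peak_index = values.index(peak)
    return dict(
        x=[peak_index],
        y=[peak],
        mode="markers+text",  # 'text' in the mode makes the label visible
        text=[f"max: {peak}"],
        textposition="top center",
        name=name,
        marker=dict(color=color),
    )


def build_figure(data):
    """Plot every series in `data` as a line and label each maximum."""
    # Imported here so max_marker_kwargs can be used without plotly installed.
    import plotly.graph_objects as go

    fig = go.Figure()
    for name, values in data.items():
        x = list(range(len(values)))
        fig.add_trace(go.Scatter(x=x, y=values, mode="lines", name=name))
        fig.add_trace(go.Scatter(**max_marker_kwargs(values, f"max {name}", "black")))
    return fig
```

With the `data` dict from the question, `build_figure(data).show()` should draw both curves with a labelled marker at each maximum.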
Is there a way I can update each figure's layout in a loop like this? I added each layout to a list and am looping through each but can't seem to update the figures in the subplot:
# Data Visualization
from plotly.subplots import make_subplots
import plotly.graph_objects as go
epoch_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
loss_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
val_loss_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
error_rate = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
val_error_rate = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
layout_list = []
loss_plots = [go.Scatter(x=epoch_list,
y=loss_list,
mode='lines',
name='Loss',
line=dict(width=4)),
go.Scatter(x=epoch_list,
y=val_loss_list,
mode='lines',
name='Validation Loss',
line=dict(width=4))]
loss_layout = dict(font_color='black',
title_font_color='black',
title=dict(text='Loss Graph',
font_size=30),
xaxis_title=dict(text='Epochs',
font_size=25),
yaxis_title=dict(text='Loss',
font_size=25),
legend=dict(font_size=15))
loss_figure = go.Figure(data=loss_plots)
layout_list.append(loss_layout)
error_plots = [go.Scatter(x=epoch_list,
y=error_rate,
mode='lines',
name='Error Rate',
line=dict(width=4)),
go.Scatter(x=epoch_list,
y=val_error_rate,
mode='lines',
name='Validation Error Rate',
line=dict(width=4))]
error_rate_layout = dict(font_color='black',
title_font_color='black',
title=dict(text='Error Rate Graph',
font_size=30),
xaxis_title=dict(text='Epochs',
font_size=25),
yaxis_title=dict(text='Error Rate',
font_size=25),
legend=dict(font_size=15))
error_figure = go.Figure(data=error_plots)
layout_list.append(error_rate_layout)
metric_figure = make_subplots(
rows=3, cols=2,
specs=[[{}, {}],
[{}, {}],
[{}, {}]])
for t in loss_figure.data:
    metric_figure.append_trace(t, row=1, col=1)
for t in error_figure.data:
    metric_figure.append_trace(t, row=1, col=2)
for (figure, layout) in zip(metric_figure, layout_list):
    figure.update_layout(layout)
metric_figure.show()
Passing the layout to each figure doesn't seem to work either: the layout does not transfer over, because I only loop through the traces:
loss_figure = go.Figure(data=loss_plots, layout=loss_layout)
You can use Python dict-merging techniques:
metric_figure.update_layout({**loss_layout, **error_rate_layout})
Alternatively, if the layouts are stored in the figures:
metric_figure.update_layout({**loss_figure.to_dict()["layout"], **error_figure.to_dict()["layout"]})
Both of these are of limited use, because sub-plot layouts differ significantly from those of individual figures. A sub-plot figure has different x-axis and y-axis definitions than the individual figures, and where dictionary keys overlap only one value can be kept, for example title.
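To illustrate the point about sub-plot layouts, here is a sketch that assumes the goal is only to carry over each plot's title and axis titles. Instead of merging layout dicts, it uses `subplot_titles` in `make_subplots` and the `row`/`col` arguments of `update_xaxes` and `update_yaxes`. The `panels` structure and the function names are hypothetical.

```python
def grid_position(index, n_cols):
    """Map a 0-based panel index to the 1-based (row, col) used by make_subplots."""
    row, col = divmod(index, n_cols)
    return row + 1, col + 1


def build_metric_figure(panels, n_cols=2):
    """Build a subplot grid from a list of panel descriptions.

    `panels` is a hypothetical structure for this sketch, e.g.
    dict(title="Loss Graph", x_title="Epochs", y_title="Loss",
         traces=[dict(x=epoch_list, y=loss_list, mode="lines", name="Loss")]).
    """
    # Imported here so grid_position can be used without plotly installed.
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    n_rows = -(-len(panels) // n_cols)  # ceiling division
    fig = make_subplots(rows=n_rows, cols=n_cols,
                        subplot_titles=[p["title"] for p in panels])
    for index, panel in enumerate(panels):
        row, col = grid_position(index, n_cols)
        for trace_kwargs in panel["traces"]:
            fig.add_trace(go.Scatter(**trace_kwargs), row=row, col=col)
        # Axis titles are set per subplot rather than through one merged layout.
        fig.update_xaxes(title_text=panel["x_title"], row=row, col=col)
        fig.update_yaxes(title_text=panel["y_title"], row=row, col=col)
    return fig
```

Figure-wide settings such as `font_color` can still go through a single `update_layout` call on the returned figure, since they do not clash between panels.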
I'm trying to make a horizontal bar graph with a large number of elements/bars with matplotlib's barh function. However, I'm having a couple of problems with bars being too close together and their labels being illegible (see image below):
I first tried changing the figure size, setting figsize=(10,40) and increasing the height up from 40, to no avail.
I also tried bumping up the spacing between bars from 0.2 to 0.3 (in the positions list), but it seems that going any higher than a spacing of 0.2 makes some of the bars disappear. In other words, there seem to be clusters of ~5 bars that are too close together that get spaced properly at 0.3, but all the bars between these clusters disappear.
The code is shown below (adapted from the mpl docs/examples). I'm sure there's a rather easy fix here that I'm just too much of a novice to see. Alternatively, I could try graphing this in MATLAB, but I prefer Python for quality and simplicity. Are there improvements I could make that would make my bar graph legible?
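Code:
import os
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
# Note: pos_genus and paths are used below but are not defined in this snippet.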
genus = {'Parasutterella': 1, 'Anaerobaculum': 1, 'Clostridiales': 1, 'Butyrivibrio': 1, 'Anaerococcus': 1, 'Neisseria': 1, 'Campylobacter': 1, 'Intestinibacter': 1, 'Erysipelatoclostridium': 1, 'Tannerella': 1, 'Barnesiella': 1, 'Enterobacter': 1, 'Odoribacter': 1, 'Arcobacter': 1, 'Dialister': 1, 'Alistipes': 1, 'Collinsella': 2, 'Synergistes': 2, 'Burkholderiales': 2, 'Gordonibacter': 2, 'Tyzzerella': 2, 'Providencia': 2, 'Weissella': 2, 'Enterobacteriaceae': 2, 'Flavonifractor': 2, 'Prevotella': 2, 'Klebsiella': 2, 'Citrobacter': 2, 'Actinomyces': 2, 'Proteus': 2, 'Catenibacterium': 2, 'Propionibacterium': 2, 'Mitsuokella': 2, 'butyrate-producing': 2, 'Parvimonas': 2, 'Phascolarctobacterium': 2, 'Desulfovibrio': 2, 'Cedecea': 2, 'Finegoldia': 2, 'Slackia': 3, '[Bacteroides]': 3, 'Hafnia': 3, 'Acidaminococcus': 3, 'Bifidobacterium': 3, 'Sutterella': 3, 'Anaerofustis': 3, 'Paraprevotella': 3, 'Oxalobacter': 3, 'Yokenella': 3, 'Leuconostoc': 3, 'Dermabacter': 3, 'Megamonas': 4, 'Staphylococcus': 4, 'Fusobacterium': 4, 'Anaerostipes': 4, 'Bilophila': 4, 'Butyricicoccus': 4, 'Parabacteroides': 4, 'Erysipelotrichaceae': 4, 'Anaerotruncus': 4, 'Listeria': 4, 'Corynebacterium': 5, 'Pseudoflavonifractor': 5, 'Dorea': 5, 'Streptococcus': 6, 'Roseburia': 6, 'Helicobacter': 6, 'Eggerthella': 6, 'Acinetobacter': 6, '[Clostridium': 6, 'Ruminococcaceae': 6, 'Dysgonomonas': 6, '[Eubacterium]': 6, 'Enterococcus': 6, 'Subdoligranulum': 7, 'Faecalibacterium': 7, 'Blautia': 8, 'Holdemania': 8, 'Bacteroides': 8, 'Marvinbryantia': 8, 'Coprococcus': 9, 'Eubacterium': 9, 'Lactobacillus': 9, 'Paenisporosarcina': 9, 'Turicibacter': 9, 'Ruminococcus': 10, 'Coprobacillus': 11, 'Ralstonia': 11, 'Peptoclostridium': 11, 'Pseudomonas': 13, 'Desulfitobacterium': 14, 'Bacillus': 15, 'Streptomyces': 26, '[Clostridium]': 29, 'Paenibacillus': 32, 'Lachnospiraceae': 32, 'Clostridium': 35}
barWidth = 0.125
labels = list(genus.keys())
cols = len(labels)
bars = []
positions = [(i+1)*0.2 for i in range(cols)]
for key in labels:
    bars.append(genus[key])
fig,ax = plt.subplots()
rects = []
for i in range(len(bars)):
    if labels[i] in pos_genus:
        rects.append(ax.barh(y=positions[i], width=bars[i], height=barWidth, color='#000000',label='Gram Positive'))
    else:
        rects.append(ax.barh(y=positions[i], width=bars[i], height=barWidth, color='#E8384F',label='Gram Negative'))
ax.set_title('Genus')
ax.set_yticks(positions)
ax.set_yticklabels(labels)
ax.set_ylabel('Genus')
ax.set_xlabel('Number of Organisms')
#ax.set_ylim(positions[0]-barWidth,positions[-1]+barWidth)
ax.set_xlim(0,40)
blk_patch = mpatches.Patch(color='#000000', label='Gram Positive')
red_patch = mpatches.Patch(color='#E8384F', label='Gram Negative')
plt.legend(handles=[blk_patch, red_patch])
#plt.figure(figsize=(10,50))
bar_path = os.path.join(paths['Figures'], "{0}_horiz_bar.png".format(str('genus')))
plt.savefig(bar_path,dpi=300,bbox_inches='tight')
plt.show()
Illegible barh plot:
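One possible direction, offered as an untested sketch rather than a confirmed fix: pass `figsize` to `plt.subplots()` when the figure is created (calling `plt.figure(figsize=...)` afterwards opens a new, empty figure instead of resizing the existing one), let the height grow with the number of bars, and place the bars at integer positions. The names `figure_height` and `plot_counts` and the `inches_per_bar` default are assumptions.

```python
def figure_height(n_bars, inches_per_bar=0.25, min_height=4.0):
    """Return a figure height in inches that grows with the number of bars."""
    return max(min_height, n_bars * inches_per_bar)


def plot_counts(counts, path=None):
    """Draw a horizontal bar chart with one unit of vertical space per bar."""
    # Imported here so figure_height can be reused without matplotlib installed.
    import matplotlib.pyplot as plt

    labels = list(counts.keys())
    positions = list(range(len(labels)))  # integer spacing, one slot per bar
    fig, ax = plt.subplots(figsize=(10, figure_height(len(labels))))
    ax.barh(positions, list(counts.values()), height=0.8)
    ax.set_yticks(positions)
    ax.set_yticklabels(labels)
    ax.set_ylim(-1, len(labels))  # trim the empty margins above and below
    ax.set_xlabel("Number of Organisms")
    if path is not None:
        fig.savefig(path, dpi=300, bbox_inches="tight")
    return fig
```

Calling `plot_counts(genus)` with the dict from the question would give each label its own row; the per-bar colors from the original loop could be passed to `barh` through its `color` argument.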
I have a 3x11 2D ndarray that I would like to plot with matplotlib's hist. I want a histogram of each array row in one subplot. I tried supplying the ndarray directly but discovered that matplotlib draws a histogram of each column of the ndarray, which is not what I want. How can I achieve my objective? Presently, I have to write a separate hist() call for each row, and I would prefer to avoid this approach.
import numpy as np
import matplotlib.pyplot as plt
d = np.array([[1, 2, 2, 2, 3, 1, 3, 1, 2, 4, 5],
[4, 4, 5, 5, 3, 6, 6, 7, 6, 5, 7],
[5, 6, 7, 7, 8, 8, 9, 10, 11, 12, 10]] )
print( '\nd', d )
fig, ax = plt.subplots(4, 1)
dcount, dbins, dignored = ax[0].hist( d, bins=[2, 4, 6, 8, 10, 12], histtype='bar', label='d' )
d0count, d0bins, d0ignored = ax[1].hist( d[0,:], bins=[2, 4, 6, 8, 10, 12], histtype='bar', label='d0', alpha=0.2 )
d1count, d1bins, d1ignored = ax[2].hist( d[1,:], bins=[2, 4, 6, 8, 10, 12], histtype='bar', label='d1', alpha=0.2 )
d2count, d2bins, d2ignored = ax[3].hist( d[2,:], bins=[2, 4, 6, 8, 10, 12], histtype='bar', label='d2', alpha=0.2 )
ax[0].legend()
ax[1].legend()
ax[2].legend()
ax[3].legend()
print( '\ndcount', dcount )
print( '\ndbins', dbins )
print( '\ndignored', dignored )
print( '\nd0count', d0count )
print( '\nd0bins', d0bins )
print( '\nd0ignored', d0ignored )
print( '\nd1count', d1count )
print( '\nd1bins', d1bins )
print( '\nd1ignored', d1ignored )
print( '\nd2count', d2count )
print( '\nd2bins', d2bins )
print( '\nd2ignored', d2ignored )
plt.show()
# import needed packages
import numpy as np
import matplotlib.pyplot as plt
Create data to plot
Using list comprehension and numpy.random.normal:
gaussian0=[np.random.normal(loc=0, scale=1.5) for _ in range(100)]
gaussian1=[np.random.normal(loc=2, scale=0.5) for _ in range(100)]
gaussians = [gaussian0, gaussian1]
Plot with one hist call only
for gaussian in gaussians:
    plt.hist(gaussian,alpha=0.5)
plt.show()
Resulting in:
I found a simpler way. Transpose d. That is, replace
dcount, dbins, dignored = ax[0].hist( d, bins=[2, 4, 6, 8, 10, 12], histtype='bar', label='d' )
with
dcount, dbins, dignored = ax[0].hist( d.T, bins=[2, 4, 6, 8, 10, 12], histtype='bar', label=['d0', 'd1','d2'], alpha=0.5 )
I was hoping matplotlib's hist() would have an option to do this, but I did not find one. Transposing the numpy array worked. Is this the usual way for matplotlib users to do it?
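The transpose works because `hist()` treats each column of a 2-D input as a separate dataset. As a rough sketch of the same idea, `np.histogram` can be applied row by row to get the counts without plotting, and looping over rows and axes is another way to avoid writing each `hist()` call by hand. `row_histograms` and `plot_rows` are hypothetical names.

```python
import numpy as np

d = np.array([[1, 2, 2, 2, 3, 1, 3, 1, 2, 4, 5],
              [4, 4, 5, 5, 3, 6, 6, 7, 6, 5, 7],
              [5, 6, 7, 7, 8, 8, 9, 10, 11, 12, 10]])
bins = [2, 4, 6, 8, 10, 12]


def row_histograms(data, bins):
    """Return one array of bin counts per row of `data`."""
    return [np.histogram(row, bins=bins)[0] for row in data]


def plot_rows(data, bins):
    """Draw one histogram per row, each in its own subplot."""
    # Imported here so row_histograms works without matplotlib installed.
    import matplotlib.pyplot as plt

    # squeeze=False keeps `axes` two-dimensional even for a single row.
    fig, axes = plt.subplots(len(data), 1, squeeze=False)
    for i, (row, axis) in enumerate(zip(data, axes[:, 0])):
        axis.hist(row, bins=bins, histtype="bar", label=f"d{i}", alpha=0.5)
        axis.legend()
    return fig
```

Iterating over a 2-D array yields its rows, so neither function needs an explicit index per row.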
Based on https://plot.ly/python/line-charts/#filled-lines, one can run the code below
import plotly.graph_objects as go
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x_rev = x[::-1]
y = [5, 2.5, 5, 7.5, 5, 2.5, 7.5, 4.5, 5.5, 5]
y_upper = [5.5, 3, 5.5, 8, 6, 3, 8, 5, 6, 5.5]
y_lower = [4.5, 2, 4.4, 7, 4, 2, 7, 4, 5, 4.75]
y_lower_rev = y_lower[::-1]
fig = go.Figure()
fig.add_trace(go.Scatter(
x=x, y=y,
line_color='rgb(0,176,246)',
name='Mid line',
))
fig.add_trace(go.Scatter(
x=x+x_rev,
y=y_upper+y_lower_rev,
fill='toself',
fillcolor='rgba(0,176,246,0.2)',
line_color='rgba(255,255,255,0)',
name='Filled lines working properly',
))
fig.update_traces(mode='lines')
fig.show()
And successfully get the plot below
However in case there are data gaps, the filled portions do not seem to work properly (e.g. first and second connected component), at least with the code tried below.
What is the right way to have both data gaps and filled lines?
x_for_gaps_example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
x_for_gaps_example_rev = x_for_gaps_example[::-1]
y_with_gaps =[5, 15, None, 10, 5, 0, 10, None, 15, 5, 5, 10, 20, 15, 5]
y_upper_with_gaps = [i+1 if i is not None else None for i in y_with_gaps]
y_lower_with_gaps = [i-2 if i is not None else None for i in y_with_gaps][::-1]
fig = go.Figure()
fig.add_trace(go.Scatter(
x=x_for_gaps_example,
y=y_with_gaps,
name='Mid Line with <b>Gaps</b>'
))
fig.add_trace(go.Scatter(
x=x_for_gaps_example+x_for_gaps_example_rev,
y=y_upper_with_gaps+y_lower_with_gaps,
fill='toself',
fillcolor='rgba(0,176,246,0.2)',
line_color='rgba(255,255,255,0)',
name='Filled Lines not working properly with <b>gaps</b>'
))
fig.show()
It seems to be quite an old plotly bug:
Refer to:
https://github.com/plotly/plotly.js/issues/1132
and:
https://community.plot.ly/t/scatter-line-plot-fill-option-fills-gaps/21264
One solution might be to break the whole filling trace into multiple pieces and add each of them to the figure. However, this might be a bit complicated, because it requires extra computation to determine the location of each filled area.
You can actually improve your chart a bit by setting the connectgaps property to true, which results in this:
But, that looks somewhat weird ;)
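The idea of breaking the fill into pieces could look roughly like the sketch below: split the series at each `None`, then build one closed `toself` outline per contiguous run. The function names are hypothetical, and the band offsets (`up=1`, `down=2`) simply mirror the `i+1` and `i-2` used in the question.

```python
def split_on_gaps(x, y):
    """Split parallel x/y lists into runs of consecutive non-None y values."""
    segments, current_x, current_y = [], [], []
    for xi, yi in zip(x, y):
        if yi is None:
            if current_x:
                segments.append((current_x, current_y))
                current_x, current_y = [], []
        else:
            current_x.append(xi)
            current_y.append(yi)
    if current_x:
        segments.append((current_x, current_y))
    return segments


def band_polygon(seg_x, seg_y, up=1, down=2):
    """Return the closed outline (x, y) of the band around one segment."""
    upper = [v + up for v in seg_y]
    lower = [v - down for v in seg_y]
    return seg_x + seg_x[::-1], upper + lower[::-1]


def add_band_traces(fig, x, y):
    """Add one filled trace per gap-free segment to an existing figure."""
    # Imported here so the two helpers above work without plotly installed.
    import plotly.graph_objects as go

    for seg_x, seg_y in split_on_gaps(x, y):
        poly_x, poly_y = band_polygon(seg_x, seg_y)
        fig.add_trace(go.Scatter(x=poly_x, y=poly_y, fill="toself",
                                 fillcolor="rgba(0,176,246,0.2)",
                                 line_color="rgba(255,255,255,0)",
                                 showlegend=False))
    return fig
```

With the question's data, `add_band_traces(fig, x_for_gaps_example, y_with_gaps)` would replace the single filled trace, so each band is closed on its own and no fill spans a gap.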
I want to create a bar graph for a dataframe that contains multiple categories, with a different color for each category. Below is my simplified code and the resulting graph. The top subplot is a regular bar graph in one color; the bottom subplot is color coded, but the bar width is messed up. Any suggestions? Thanks!
import random
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Cat': [1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4],
'A': [2, 3, 6, 7, 9, 10, 15, 18, 22, 23, 24, 25],
'B': random.sample(range(1, 20), 12)})
fig = plt.figure(figsize=(15, 15/2.3))
ax = plt.subplot(2, 1, 1)
plt.bar(df.A, df.B)
plt.xlim(0, 30)
ax = plt.subplot(2, 1, 2)
for cat in df.Cat.unique():
    df_ = df.loc[(df.Cat==cat), :]
    plt.bar(df_.A, df_.B, width=0.5)
plt.xlim(0, 30)
plt.show()
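One possible approach, assuming the aim is bars of the same width as in the top subplot: compute a color per row and draw everything with a single `bar` call, which accepts a list of colors. This is only a sketch; `colors_by_category`, `plot_colored_bars`, and the palette are assumptions.

```python
def colors_by_category(categories,
                       palette=("tab:blue", "tab:orange", "tab:green", "tab:red")):
    """Map each row's category to a color, cycling through `palette`."""
    unique = list(dict.fromkeys(categories))  # unique values in first-seen order
    lookup = {cat: palette[i % len(palette)] for i, cat in enumerate(unique)}
    return [lookup[cat] for cat in categories]


def plot_colored_bars(x, heights, categories):
    """Draw all bars in one call so they share the same width."""
    # Imported here so colors_by_category works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(15, 15 / 2.3))
    ax.bar(x, heights, color=colors_by_category(categories))
    ax.set_xlim(0, 30)
    return fig
```

With the dataframe from the question, this would be called as `plot_colored_bars(df.A, df.B, df.Cat)`.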