Creating matplotlib legend with dynamic number of columns - python

I would like to create a legend in matplotlib with at most 5 entries per column. Right now, I can manually set the number of columns like so:
leg = plt.legend(loc='best', fancybox=None, ncol=2)
How do I modify this so that at most 5 entries are allowed per column?

There's no built-in way to specify a number of rows instead of a number of columns. However, you can get the number of items that would be added to the legend using the public ax.get_legend_handles_labels() method. (Older versions of this answer used the private ax._get_legend_handles() method, which may not exist in recent matplotlib versions.)
For example:
import numpy as np
import matplotlib.pyplot as plt
numlines = np.random.randint(1, 15)
x = np.linspace(0, 1, 10)
fig, ax = plt.subplots()
for i in range(1, numlines + 1):
    ax.plot(x, i * x, label='$y={}x$'.format(i))
# handles and labels of every artist that would appear in the legend
handles, labels = ax.get_legend_handles_labels()
numitems = len(handles)
nrows = 5
ncols = int(np.ceil(numitems / float(nrows)))
ax.legend(ncol=ncols, loc='best')
plt.show()
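The column calculation above can be wrapped in a small helper. This is only a minimal sketch: legend_ncols and max_rows are hypothetical names chosen for illustration, not part of matplotlib.

```python
import math

def legend_ncols(num_items, max_rows=5):
    """Return the number of legend columns needed so that no column
    holds more than max_rows entries (always at least one column)."""
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    return max(1, math.ceil(num_items / max_rows))

# 12 labelled lines with at most 5 entries per column need 3 columns
print(legend_ncols(12))
```

The result can then be passed as the ncol argument of ax.legend, for example ax.legend(ncol=legend_ncols(numitems), loc='best').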

Related

How to keep the pyplot axis scaled according to one plot only and not extending for other plots?

I want to plot several functions in one figure, but I want to prevent the axis from being extended if one of the functions has much higher or lower values than the others. In the code below, the parameter alpha is actually random (here I fixed it to alpha = 2) and can take very high values, which messes up my plot. What I would like to do is plot one function, freeze the axis according to its xlim and ylim, and then add the remaining plots without extending the axis any further if alpha happens to be large. How can I do this? The solution here unfortunately did not work for me: with plt.autoscale(False) I would need to fix the limits manually, which is not what I want.
Here is a minimum working example:
x = np.linspace(0,4*np.pi)
data1 = np.sin(0.5*x)
alpha = 2
data2 = alpha*np.sin(x)
data3 = np.sin(x)
data4 = np.sin(x)
data5 = np.cos(x)
fig = plt.figure(constrained_layout=True, figsize=(10, 4))
subfigs = fig.subfigures(1, 2, wspace=0.07)
axsLeft = subfigs[0].subplots(1, 1)
axsLeft.plot(x,data1)
# plt.autoscale(False)
axsLeft.plot(x,data2) #final prediction
axsLeft.plot(x,data3,'--k',linewidth=2.5)
# axsLeft.set_ylim([-1.05,+1.05])
axsLeft.set_xlabel("x")
axsRight = subfigs[1].subplots(2, 1, sharex=True)
axsRight[0].plot(data4)
axsRight[1].plot(data5)
axsRight[1].set_xlabel('x')
fig.show()
The orange plot extends the axis so much that the other plots are no longer interpretable. I'd like the orange plot to overshoot in the y-direction, like this:
but without setting ylim manually.
After plotting the reference, in your case data1, you can retrieve the current y-axis limits with get_ylim() into separate variables a and b, and then restore them with set_ylim after plotting the remaining curves.
This makes sure the axis is always scaled according to the reference, and it works even if the lower limit of the y-axis is very low or zero:
import numpy as np
from matplotlib import pyplot as plt
x = np.linspace(0,4*np.pi)
data1 = np.sin(0.5*x)
alpha = 2
data2 = alpha*np.sin(x)
data3 = np.sin(x)
data4 = np.sin(x)
data5 = np.cos(x)
fig = plt.figure(constrained_layout=True, figsize=(10, 4))
subfigs = fig.subfigures(1, 2, wspace=0.07)
axsLeft = subfigs[0].subplots(1, 1)
# reference axis
axsLeft.plot(x,data1)
a,b = axsLeft.get_ylim()
axsLeft.plot(x,data2) #final prediction
axsLeft.plot(x,data3,'--k',linewidth=2.5)
axsLeft.set_xlabel("x")
# set limit according to reference
axsLeft.set_ylim((a,b))
axsRight = subfigs[1].subplots(2, 1, sharex=True)
axsRight[0].plot(data4)
axsRight[1].plot(data5)
axsRight[1].set_xlabel('x')
fig.show()
If you want to adjust the y-axis to the maximum and minimum values of data1, use the code below. (0.05 is padding.)
axsLeft.set_ylim(np.min(data1) - 0.05, np.max(data1) + 0.05)
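The line above uses a fixed padding of 0.05 in data units. As a variation, the padding can be made relative to the data range, which is how matplotlib's default 5% axis margin behaves. A minimal sketch, where padded_limits and margin are hypothetical names introduced only for this illustration:

```python
def padded_limits(values, margin=0.05):
    """Return (low, high) limits that extend the data range by a
    fraction `margin` of that range on each side."""
    low, high = min(values), max(values)
    pad = (high - low) * margin
    return low - pad, high + pad

# A range of 2.0 with a 5% margin is padded by 0.1 on each side
lo, hi = padded_limits([-1.0, 0.0, 1.0])
print(lo, hi)
```

The returned pair could be passed to axsLeft.set_ylim(lo, hi). Note that for constant data the range is zero, so no padding is added.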
If you want the alpha value to also vary according to data1, you can compute it as the difference between np.max(data1) and np.min(data1). Below is a modified version of the code you uploaded.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0,4*np.pi)
data1 = np.sin(0.5*x)
alpha = np.max(data1) - np.min(data1) # change 1
data2 = alpha*np.sin(x)
data3 = np.sin(x)
data4 = np.sin(x)
data5 = np.cos(x)
fig = plt.figure(constrained_layout=True, figsize=(10, 4))
subfigs = fig.subfigures(1, 2, wspace=0.07)
axsLeft = subfigs[0].subplots(1, 1)
axsLeft.plot(x,data1)
axsLeft.plot(x,data2) #final prediction
axsLeft.plot(x,data3,'--k',linewidth=2.5)
axsLeft.set_xlabel("x")
axsRight = subfigs[1].subplots(2, 1, sharex=True)
axsRight[0].plot(data4)
axsRight[1].plot(data5)
axsLeft.set_ylim(-alpha / 2 - 0.05, alpha / 2 + 0.05) # change 2
axsRight[1].set_xlabel('x')
plt.show()

How to plot 4 plots per row in matplotlib?

I have 864 plots to plot.
On running a loop, I can do plt.show() in the for loop, and print 864 plots, but that's difficult to view.
Is there a way I can print 4 plots per row? (That would be 216 x 4).
And how can I save them at the same time?
Edit: Example:
import matplotlib.pyplot as plt
for i in range(100):
    plt.scatter(x[i],x[i])
    plt.scatter(y[i],y[i])
    plt.title('Vector: {}/100'.format(i+1))
    plt.show()
where x & y are lists of lists of cosine vectors.
The easiest option that comes to my mind is using subplot:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(4,216))
gs = fig.add_gridspec(ncols=4, nrows=216)
axs = gs.subplots()
Here we are creating a figure with a grid of 216 rows by 4 columns of plots. You will probably need to adjust the figsize to your desired dimensions.
To plot something you just need to access the axis using its index. For example:
x = [1, 2, 3]
y = [[1, 2], [3, 4], [5, 6]]
axs[0,0].plot(x, y)
To save it you can use fig.savefig("plot.png"). That creates a huge image. My suggestion is to create a pdf and store 4x6 plots per page. Here is an example of how to do it:
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
pp = PdfPages('plot.pdf')
for i in range(36):
    fig = plt.figure(figsize=(16,24))
    gs = fig.add_gridspec(ncols=4, nrows=6)
    axs = gs.subplots()
    pp.savefig(fig)
    plt.close(fig)  # free the figure's memory before the next page
pp.close()
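To fill the pages with the 864 plots, each plot index has to be mapped to a page and a position in the 6x4 grid. The following is a minimal sketch of that bookkeeping, assuming pages are filled row by row; grid_position is a hypothetical helper name, not a matplotlib function.

```python
def grid_position(index, nrows=6, ncols=4):
    """Map a zero-based plot index to (page, row, col) when pages are
    filled row by row with nrows x ncols plots each."""
    page, slot = divmod(index, nrows * ncols)
    row, col = divmod(slot, ncols)
    return page, row, col

total_plots = 864
pages_needed = -(-total_plots // (6 * 4))  # ceiling division
print(pages_needed)
print(grid_position(0), grid_position(25), grid_position(863))
```

Inside the page loop, axs[row, col] would then be the Axes on which plot number index is drawn.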
This process takes some time since it has to render a lot of images. Plotting a line made of 1000 random points (previously calculated) in each figure takes 37s. Here is the code of the test:
import time
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import numpy as np
start = time.time()
pp = PdfPages('plot.pdf')
xx = np.random.randint(0,100, 1000)
yy = np.random.randint(0,100, 1000)
for i in range(36):
    print(i)
    fig = plt.figure(figsize=(16,24))
    gs = fig.add_gridspec(ncols=4, nrows=6)
    axs = gs.subplots()
    for x in range(4):
        for y in range(6):
            axs[y,x].plot(xx, yy)
    pp.savefig(fig)
    plt.close(fig)  # free the figure's memory before the next page
pp.close()
print(time.time() - start)
Commented inline
import numpy as np
import matplotlib.pyplot as plt
# 10 plots
for j in range(10):
    # close existing plots if any
    plt.close('all')
    plt.figure(figsize=(10,5))
    for i in range(4):
        # Plot with 1 row and 4 columns and current plot being drawn is (i+1)
        plt.subplot(1,4,i+1)
        x = np.random.randint(0,100, 100)
        y = np.random.randint(0,100, 100)
        plt.scatter(x,y)
        plt.scatter(y,x)
    # Finally save the plot
    plt.savefig(f"plot_{j}.png")
Adjust figsize based on length of your x and y axis.

How to create grid plot with inner subplots?

I have configured subplots of (5 x 1) format shown in Fig. 1 as given by Figure block A in the MWE. I am trying to repeat them n times such that they appear in a grid format with number of rows and columns given by the function fitPlots as mentioned here; to give output as shown in Fig. 2.
Fig. 1 Initial plot
Fig. 2 Repeated plot (desired output)
What would be the best way to repeat the code block to create a grid plot with inner subplots? The MWE creates multiple pages, I want all of them on a single page.
MWE
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt
import numpy as np
import math
x = np.arange(1, 100, 0.2)
y_a = np.sqrt(x)
y_b = np.sin(x)
y_c = np.sin(x)
y_d = np.cos(x) * np.cos(x)
y_e = 1/x
########## Figure block A #####################
with PdfPages('./plot_grid.pdf') as plot_grid_loop:
    fig, (a, b, c, d, e) = plt.subplots(5, 1, sharex=True, gridspec_kw={'height_ratios': [5, 1, 1, 1, 1]})
    a.plot(x, y_a)
    b.plot(x, y_b)
    c.plot(x, y_c)
    d.plot(x, y_d)
    e.plot(x, y_e)
    plot_grid_loop.savefig()
    plt.close()
########## Figure block A #####################
# from https://stackoverflow.com/a/43366784/4576447
def fitPlots(N, aspect=(16,9)):
    width = aspect[0]
    height = aspect[1]
    area = width*height*1.0
    factor = (N/area)**(1/2.0)
    cols = math.floor(width*factor)
    rows = math.floor(height*factor)
    rowFirst = width < height
    while rows*cols < N:
        if rowFirst:
            rows += 1
        else:
            cols += 1
        rowFirst = not(rowFirst)
    return rows, cols
n_plots = 15
n_rows, n_cols = fitPlots(n_plots)
with PdfPages('./plot_grid.pdf') as plot_grid_loop:
    for m in range(1, n_plots+1):
        fig, (a, b, c, d, e) = plt.subplots(5, 1, sharex=True, gridspec_kw={'height_ratios': [5, 1, 1, 1, 1]})
        a.plot(x, y_a)
        b.plot(x, y_b)
        c.plot(x, y_c)
        d.plot(x, y_d)
        e.plot(x, y_e)
        plot_grid_loop.savefig()
        plt.close()
This can be done by generating a GridSpec object with gs_fig = fig.add_gridspec() that contains enough rows and columns to fit the five figure blocks (note that when you use plt.subplots a GridSpec is also generated and can be accessed with ax.get_gridspec()). Each empty slot in the GridSpec can then be filled with a sub-GridSpec with gs_sub = gs_fig[i].subgridspec() to hold the five subplots. The trickier part is sharing the x-axis. This can be done by generating an empty first Axes with which the x-axis of all the subplots can be shared.
The following example illustrates this with only three figure blocks, based on the code sample you have shared but with some differences regarding the figure design: the number of rows is computed based on the chosen number of columns, and the figure dimensions are set based on a chosen figure width and aspect ratio. The code for saving the figure to a pdf file is not included.
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.4
# Create variables to plot
x = np.arange(1, 100, 0.2)
y_a = np.sqrt(x)
y_b = np.sin(x)
y_c = np.sin(x)
y_d = np.cos(x)*np.cos(x)
y_e = 1/x
# Set parameters for figure dimensions
nplots = 3 # random number of plots for this example
ncols = 2
nrows = int(np.ceil(nplots/ncols))
subp_w = 10/ncols # 10 is the total figure width in inches
subp_h = 1*subp_w # set subplot aspect ratio
# Create figure containing GridSpec object with appropriate dimensions
fig = plt.figure(figsize=(ncols*subp_w, nrows*subp_h))
gs_fig = fig.add_gridspec(nrows, ncols)
# Loop through GridSpec to add sub-GridSpec for each figure block
heights = [5, 1, 1, 1, 1]
for i in range(nplots):
    gs_sub = gs_fig[i].subgridspec(len(heights), 1, height_ratios=heights, hspace=0.2)
    ax = fig.add_subplot(gs_sub[0, 0]) # generate first empty Axes to enable sharex
    ax.axis('off') # remove x and y axes because it is overwritten in the loop below
    # Loop through y variables to plot all the subplots with shared x-axis
    for j, y in enumerate([y_a, y_b, y_c, y_d, y_e]):
        ax = fig.add_subplot(gs_sub[j, 0], sharex=ax)
        ax.plot(x, y)
        # hide tick labels on all but the last row of the block
        # (index check used instead of ax.is_last_row(), which newer matplotlib versions no longer provide)
        if j < len(heights) - 1:
            ax.tick_params(labelbottom=False)
Reference: matplotlib tutorial GridSpec using SubplotSpec

How to annotate a seaborn barplot with the aggregated value

How can the following code be modified to show the mean as well as the different error bars on each bar of the bar plot?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("white")
a,b,c,d = [],[],[],[]
for i in range(1,5):
    np.random.seed(i)
    a.append(np.random.uniform(35,55))
    b.append(np.random.uniform(40,70))
    c.append(np.random.uniform(63,85))
    d.append(np.random.uniform(59,80))
data_df =pd.DataFrame({'stages':[1,2,3,4],'S1':a,'S2':b,'S3':c,'S4':d})
print("Delay:")
display(data_df)
S1 S2 S3 S4
0 43.340440 61.609735 63.002516 65.348984
1 43.719898 40.777787 75.092575 68.141770
2 46.015958 61.244435 69.399904 69.727380
3 54.340597 56.416967 84.399056 74.011136
meansd_df=data_df.describe().loc[['mean', 'std'],:].drop('stages', axis = 1)
display(meansd_df)
sns.set()
sns.set_style('darkgrid',{"axes.facecolor": ".92"}) # (1)
sns.set_context('notebook')
fig, ax = plt.subplots(figsize = (8,6))
x = meansd_df.columns
y = meansd_df.loc['mean',:]
yerr = meansd_df.loc['std',:]
plt.xlabel("Time", size=14)
plt.ylim(-0.3, 100)
width = 0.45
for i, j,k in zip(x,y,yerr): # (2)
    ax.bar(i,j, width, yerr = k, edgecolor = "black",
           error_kw=dict(lw=1, capsize=8, capthick=1)) # (3)
ax.set(ylabel = 'Delay')
from matplotlib import ticker
ax.yaxis.set_major_locator(ticker.MultipleLocator(10))
plt.savefig("Over.png", dpi=300, bbox_inches='tight')
Given the example data, for a seaborn.barplot with capped error bars, data_df must be converted from a wide format to a tidy (long) format, which can be accomplished with pandas.DataFrame.stack or pandas.DataFrame.melt.
It is also important to keep in mind that a bar plot shows only the mean (or other estimator) value.
Sample Data and DataFrame
.iloc[:, 1:] is used to skip the 'stages' column at column index 0.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# given data_df from the OP, select the columns except stage and reshape to long format
df = data_df.iloc[:, 1:].melt(var_name='set', value_name='val')
# display(df.head())
set val
0 S1 43.340440
1 S1 43.719898
2 S1 46.015958
3 S1 54.340597
4 S2 61.609735
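To make the wide-to-long reshape concrete, here is a pure-Python illustration of what the resulting rows look like. It is only a sketch of the idea, not how pandas implements melt, and melt_columns is a hypothetical name used for this example.

```python
def melt_columns(wide, var_name="set", value_name="val"):
    """Reshape {column: [values]} into a list of long-format rows,
    one dict per (column, value) pair, column by column."""
    return [
        {var_name: column, value_name: value}
        for column, values in wide.items()
        for value in values
    ]

# Two of the columns from the question's data, rounded for brevity
wide = {"S1": [43.34, 43.72], "S2": [61.61, 40.78]}
for row in melt_columns(wide):
    print(row)
```

Each output row carries the original column name in 'set' and one value in 'val', which is the layout seaborn.barplot expects for its x and y arguments.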
Updated as of matplotlib v3.4.2
Use matplotlib.pyplot.bar_label
See How to add value labels on a bar chart for additional details and examples with .bar_label.
Some formatting can be done with the fmt parameter, but more sophisticated formatting should be done with the labels parameter, as shown in How to add multiple annotations to a barplot.
Tested with seaborn v0.11.1, which is using matplotlib as the plot engine.
fig, ax = plt.subplots(figsize=(8, 6))
# add the plot
sns.barplot(x='set', y='val', data=df, capsize=0.2, ax=ax)
# add the annotation
ax.bar_label(ax.containers[-1], fmt='Mean:\n%.2f', label_type='center')
ax.set(ylabel='Mean Time')
plt.show()
plot with seaborn.barplot
Using matplotlib before version 3.4.2
The default for the estimator parameter is mean, so the height of the bar is the mean of the group.
The bar height is extracted from p with .get_height, which can be used to annotate the bar.
fig, ax = plt.subplots(figsize=(8, 6))
sns.barplot(x='set', y='val', data=df, capsize=0.2, ax=ax)
# show the mean
for p in ax.patches:
    h, w, x = p.get_height(), p.get_width(), p.get_x()
    xy = (x + w / 2., h / 2)
    text = f'Mean:\n{h:0.2f}'
    ax.annotate(text=text, xy=xy, ha='center', va='center')
ax.set(xlabel='Delay', ylabel='Time')
plt.show()
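Since the bar height is the group mean, the annotation text can be checked by hand. The sketch below recomputes the means for the S1 and S2 columns with the standard library, using the values printed in the question; mean_labels is a hypothetical helper name.

```python
from statistics import mean

# Values copied from the S1 and S2 columns of the question's data
groups = {
    "S1": [43.340440, 43.719898, 46.015958, 54.340597],
    "S2": [61.609735, 40.777787, 61.244435, 56.416967],
}

def mean_labels(groups):
    """Build one 'Mean:' label per group, formatted to two decimals
    like the annotation text used above."""
    return {name: f"Mean:\n{mean(vals):0.2f}" for name, vals in groups.items()}

labels = mean_labels(groups)
print(labels["S1"])
print(labels["S2"])
```

These strings should match what the annotation loop writes on the first two bars.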
Seaborn is most powerful with long-form data, so you might want to transform your data, something like this:
sns.barplot(data=data_df.melt('stages', value_name='Delay', var_name='Time'),
x='Time', y='Delay',
capsize=0.1, edgecolor='k')

Turn Weighted Numbers into Multiple Histograms

I am using the below code to create a weighted list of random numbers within a range.
import csv
import random
import numpy as np
import matplotlib.pyplot as plt
itemsList = []
csv_file = open("rnd_numbs.csv", "w", newline="")
rnd_numbs = csv.writer(csv_file)
rnd_numbs.writerow(['number'])
items = [1, 2, 3, 4, 5]
probabilities= [0.1, 0.1, 0.2, 0.2, 0.4]
prob = sum(probabilities)
print(prob)
# normalise so that the probabilities sum to 1
c = (1.0)/prob
probabilities = [c*x for x in probabilities]
print(probabilities)
# number of decimal places of the most precise probability
ml = max(probabilities, key=lambda x: len(str(x)) - str(x).find('.'))
ml = len(str(ml)) - str(ml).find('.') -1
amounts = [ int(x*(10**ml)) for x in probabilities]
itemsList = list()
for i in range(0, len(items)):
    itemsList += items[i:i+1]*amounts[i]
for item in itemsList:
    rnd_numbs.writerow([item])
csv_file.close()
What I would like to do is (a) list these numbers in random order down the CSV column (I am not sure why they are coming out pre-sorted), (b) list the numbers down the column instead of as one entry, and (c) create and save multiple histograms at defined intervals, such as the first 100 numbers, then the first 250 numbers, then the first 500 numbers, and so on to the end.
For (c) I would like to create multiple pictures such as this for various cutoffs of the data list.
Attempt at histogram
x = itemsList[0:20]
fig = plt.figure()
ax = fig.add_subplot(111)
# 10 is the number of bins
ax.hist(x, 10, density=True, facecolor='green', alpha=0.75)
ax.set_xlim(0, 5)
ax.set_ylim(0, 500)
ax.grid(True)
plt.show()
As for the third part of your question, take a look at matplotlib (and numpy.loadtxt() for reading your data). There are many examples to help you learn the basics, as well as advanced features. Here's a quick example of plotting a histogram of a random normal distribution:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.randn(10000)
fig = plt.figure()
ax = fig.add_subplot(111)
# 100 is the number of bins
n = ax.hist(x, 100, facecolor='green', alpha=0.75)
# n[0] is the array of bin heights,
# n[1] is the array of bin edges
xmin = min(n[1]) * 1.1
xmax = max(n[1]) * 1.1
ymax = max(n[0]) * 1.1
ax.set_xlim(xmin, xmax)
ax.set_ylim(0, ymax)
ax.grid(True)
plt.show()
which gives you a nice image:
You can make loops to generate multiple images using different ranges of your data, and save the generated figures in a number of formats, with or without previewing them first.
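As a rough sketch of such a loop, the example below draws the weighted numbers directly in random order and counts the values at each cutoff. Note that it swaps in the standard library's random.choices for the expanded-list approach in the question; the fixed seed, the sample size of 1000 and the name counts_at_cutoffs are assumptions made only for this illustration.

```python
import random
from collections import Counter

items = [1, 2, 3, 4, 5]
probabilities = [0.1, 0.1, 0.2, 0.2, 0.4]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
# weighted draws come out in random order, not pre-sorted
numbers = rng.choices(items, weights=probabilities, k=1000)

def counts_at_cutoffs(values, cutoffs):
    """Return {cutoff: Counter of the first `cutoff` values}."""
    return {n: Counter(values[:n]) for n in cutoffs}

histograms = counts_at_cutoffs(numbers, [100, 250, 500, len(numbers)])
for cutoff, counts in histograms.items():
    print(cutoff, sorted(counts.items()))
```

For the actual images, each slice numbers[:cutoff] could be passed to ax.hist inside the same loop and saved with fig.savefig under a file name that includes the cutoff.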
