Matplotlib, shift boxplots along x-axis? - python

I am plotting multiple boxplots along two different axes.
My code looks like:
fig, (ax1, ax2) = plt.subplots(2, sharex=True, sharey=False)
data_1 = [array1, array2, array3]
ax1.boxplot(data_1, whis=[5,95], showfliers=True)
data_2 = [array4, array5]
ax2.boxplot(data_2, whis=[5,95], showfliers=True)
ax2.set_xlim(0,4)
This produces a plot (substituting in my actual data) that looks like:
However, I would like the lower plot (on ax2) to shift to the right along the x-axis by one unit. That is, I would like to have the 2 lower boxplots plot at x=2 and x=3, such that they line up with the 2nd and 3rd upper boxplots. I would like to keep the xlabels the same and consistent for all x-axes.
Any ideas?

This should work for your example code. However, this solution bypasses the sharex alignment.
In my opinion, the axis labeling when using box plots and sharex is a little unintuitive.
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(42)
# create random data: five arrays of 10 samples each
array1, array2, array3, array4, array5 = (np.random.rand(10) for _ in range(5))
widths = 0.3
fig, (ax1, ax2) = plt.subplots(2, sharex=True, sharey=False)
data_1 = [array1, array2, array3]
ax1.boxplot(data_1, widths=widths, whis=[5,95], showfliers=True)
data_2 = [array4, array5]
# draw the two lower boxes at x=2 and x=3 instead of the default 1 and 2
positions = [2,3]
ax2.boxplot(data_2, positions=positions, widths=widths, whis=[5,95], showfliers=True)
# set the ticks and labels explicitly, since boxplot adjusts them on the shared axis
ax2.set_xticks([1,2,3])
ax1.set_xticks([1,2,3])
ax2.set_xticklabels([1,2,3])
plt.xlim(0,4)
plt.show()
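If you want to confirm that positions really moved the lower boxes, one option is to read back the x-coordinates of the median lines from the dictionary that boxplot returns. This is only a minimal sketch with random stand-in data; the names upper_data, lower_data and centres are made up for the illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
upper_data = [rng.random(10) for _ in range(3)]  # stand-ins for array1..array3
lower_data = [rng.random(10) for _ in range(2)]  # stand-ins for array4, array5

fig, (ax1, ax2) = plt.subplots(2, sharex=True)
ax1.boxplot(upper_data, widths=0.3)
# positions moves the two lower boxes to x=2 and x=3
lower = ax2.boxplot(lower_data, positions=[2, 3], widths=0.3)

# boxplot manages the ticks of the shared x-axis, so set them explicitly afterwards
ax2.set_xticks([1, 2, 3])
ax2.set_xticklabels([1, 2, 3])
ax2.set_xlim(0, 4)

# each median is a Line2D; the midpoint of its x-data is the centre of the box
centres = [float(np.mean(line.get_xdata())) for line in lower["medians"]]
print(centres)
```

Call plt.show() afterwards to display the figure.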

Related

Share y axes for subplots that are dynamically created

Working example
import matplotlib.pyplot as plt
names = ['one','two','three']
upper = [[79,85,88],[79,85,88],[79,85,88]]
lower = [[73,72,66],[73,72,66],[73,72,66]]
fig = plt.figure(1)
for idx,lane in enumerate(names):
    ax = fig.add_subplot(1,len(names)+1,idx+1)
    ax.plot(upper[idx], color='tab:blue', marker='x', linestyle="None")
    ax.plot(lower[idx], color='tab:blue', marker='x', linestyle="None")
    ax.set_title(lane)
plt.show()
This generates 3 plots dynamically. It works, but I could very well not be following best practices for generating plots dynamically. The goal is to have all the generated plots share the y-axis, to give the figure a cleaner look. All the examples I've looked up show that you can share an axis with a previously created axes, but in my case all the plots are created dynamically. Is there a way to make all the subplots in a figure share the same y-axis at once?
The common approach to creating a figure with multiple axes is plt.subplots, which accepts a sharey = True argument.
Example:
import numpy as np
import matplotlib.pyplot as plt
xdata = np.linspace(0, 10, 100)
ydata_1 = np.sin(xdata)
ydata_2 = np.cos(xdata)
fig, (ax1, ax2) = plt.subplots(1, 2, sharey = True, figsize = (8, 4))
ax1.plot(xdata, ydata_1)
ax2.plot(xdata, ydata_2)
This outputs:
For less space between the plots, you can also use a tight_layout = True argument.
Using your data, you could maybe rewrite it to something like
fig, axes = plt.subplots(1, len(names), sharey = True, tight_layout = True)
for idx, (ax, name) in enumerate(zip(axes, names)):
    ax.plot(upper[idx], color='tab:blue', marker='x', linestyle="None")
    ax.plot(lower[idx], color='tab:blue', marker='x', linestyle="None")
    ax.set_title(name)
plt.show()
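If the axes really do have to be created one at a time, for instance because the number of panels only becomes known inside the loop, another option is to pass the first axes as sharey to each later add_subplot call. A rough sketch reusing the data from the question (the axes list and the first variable are just names picked for this example):

```python
import matplotlib.pyplot as plt

names = ['one', 'two', 'three']
upper = [[79, 85, 88], [79, 85, 88], [79, 85, 88]]
lower = [[73, 72, 66], [73, 72, 66], [73, 72, 66]]

fig = plt.figure()
axes = []
for idx, name in enumerate(names):
    # the first axes has nothing to share with; later ones share its y-axis
    first = axes[0] if axes else None
    ax = fig.add_subplot(1, len(names), idx + 1, sharey=first)
    ax.plot(upper[idx], color='tab:blue', marker='x', linestyle='None')
    ax.plot(lower[idx], color='tab:blue', marker='x', linestyle='None')
    ax.set_title(name)
    axes.append(ax)
```

Unlike plt.subplots(sharey=True), this does not hide the y tick labels of the inner panels; ax.tick_params(labelleft=False) can be called on them if that is wanted.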

make single plot from multi columns in matplotlib subplots

I use matplotlib's subplots quite often, and I want something like this:
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots(3, 2, figsize=(8, 10), sharey='row',
                       gridspec_kw={'height_ratios': [1, 2, 2]})
# wished-for syntax: this does not work, because ax[0, :] is an array of two Axes
ax[0, :].plot(np.random.randn(128))
ax[1, 0].plot(np.arange(128))
ax[1, 1].plot(1 / (np.arange(128) + 1))
ax[2, 0].plot(np.arange(128) ** (2))
ax[2, 1].plot(np.abs(np.arange(-64, 64)))
I want to create a figure in which a single plot spans two grid positions, as is done for ax1 in this (modified) gridspec example:
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
fig = plt.figure()
gs = GridSpec(3, 3)
ax1 = plt.subplot(gs[0, :])
# identical to ax1 = plt.subplot(gs.new_subplotspec((0, 0), colspan=3))
ax2 = plt.subplot(gs[1, :-1])
ax3 = plt.subplot(gs[1:, -1])
ax4 = plt.subplot(gs[-1, 0])
ax5 = plt.subplot(gs[-1, -2])
fig.suptitle("GridSpec")
plt.show()
see for full example: https://matplotlib.org/gallery/userdemo/demo_gridspec02.html#sphx-glr-gallery-userdemo-demo-gridspec02-py
Since I use the subplots approach quite a lot, I would like to know whether this is possible there too, especially since subplots can handle GridSpec arguments. Unfortunately, it is not really explained what the limits are.
plt.subplots provides a convenient way to create a fully populated gridspec.
For example, instead of
fig = plt.figure()
n = 3; m=3
gs = GridSpec(n, m)
axes = []
for i in range(n):
    row = []
    for j in range(m):
        ax = fig.add_subplot(gs[i,j])
        row.append(ax)
    axes.append(row)
axes = np.array(axes)
you can just write a single line
n = 3; m=3
fig, axes = plt.subplots(ncols=m, nrows=n)
However, if you want the freedom to select which positions on the grid to fill or even to have subplots spanning several rows or columns, plt.subplots will not help much, because it does not have any options to specify which gridspec locations to occupy.
In that sense the documentation is pretty clear: since it does not document any arguments that could be used to achieve a non-rectilinear grid, there simply is no such option.
Whether to choose to use plt.subplots or gridspec is then a question of the desired plot. There might be cases where a combination of the two is still somehow useful, e.g.
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
n=3;m=3
gridspec_kw = dict(height_ratios=[3,2,1])
fig, axes = plt.subplots(ncols=m, nrows=n, gridspec_kw=gridspec_kw)
for ax in axes[1:,2]:
    ax.remove()
gs = GridSpec(3, 3, **gridspec_kw)
fig.add_subplot(gs[1:,2])
plt.show()
where a usual grid is defined first and only at the positions where we need a row spanning plot, we remove the axes and create a new one using the gridspec.
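Applied to the 3x2 layout from the question, the same remove-and-replace idea might look like the sketch below. It assumes a reasonably recent Matplotlib that provides Axes.get_gridspec, which avoids building a second GridSpec by hand; sharey='row' is left out to keep the example short, and ax_top is just a name chosen here.

```python
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(3, 2, figsize=(8, 10),
                       gridspec_kw={'height_ratios': [1, 2, 2]})

# reuse the gridspec that plt.subplots created, so the height ratios carry over
gs = ax[0, 0].get_gridspec()

# drop the two axes in the top row and put one spanning axes in their place
for a in ax[0, :]:
    a.remove()
ax_top = fig.add_subplot(gs[0, :])

ax_top.plot(np.random.randn(128))
ax[1, 0].plot(np.arange(128))
ax[1, 1].plot(1 / (np.arange(128) + 1))
ax[2, 0].plot(np.arange(128) ** 2)
ax[2, 1].plot(np.abs(np.arange(-64, 64)))
```

Note that ax[0, 0] and ax[0, 1] still refer to the removed axes afterwards, so only ax_top should be used for the top row.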

python - plotting N by 1 number of plots when N is unknown

I am currently plotting multiple plots across 4 axes using seaborn. To do this, I manually select nrows=4 and then run 4 boxplots at once.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
data = np.random.randn(1000)
label = ['A','B','C','D'] * 250
df = pd.DataFrame(
    {'label': label,
     'data': data
    })
fig, (ax1, ax2, ax3, ax4) = plt.subplots(nrows=4, sharey=True)
fig.set_size_inches(12, 16)
sns.boxplot(data=df[df['label']=='A'], y='data', ax=ax1)
sns.boxplot(data=df[df['label']=='B'], y='data', ax=ax2)
sns.boxplot(data=df[df['label']=='C'], y='data', ax=ax3)
sns.boxplot(data=df[df['label']=='D'], y='data', ax=ax4)
I would like to rewrite this function so that it automatically recognizes the unique number of labels, creates the number of axes automatically, then plots.
Does anyone know how I can accomplish this? Thank you.
The assignment
fig, ax = plt.subplots(nrows=4, sharey=True)
makes ax a NumPy array of axes. This array can be one- or two-dimensional (depending on the values of the nrows and ncols parameters),
so ax.ravel() is called to ensure it is one-dimensional.
Now you can loop over zip(label, ax.ravel()) to call sns.boxplot once for each label and axes.
fig, ax = plt.subplots(nrows=4, sharey=True)
fig.set_size_inches(12, 16)
for labeli, axi in zip(label, ax.ravel()):
    sns.boxplot(data=df[df['label']==labeli], y='data', ax=axi)
Note that zip stops as soon as the shortest of its iterables is exhausted. So even though
label has length 1000, only the first 4 items are used in the loop, since there
are only 4 axes.
Alternatively, just assign label = ['A','B','C','D'] since that variable is not used anywhere else (at least, not in the posted code).
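To have the number of axes follow the data, the unique labels can be pulled from the DataFrame first and their count passed to nrows. A minimal sketch under a few assumptions: it uses Matplotlib's own ax.boxplot as a stand-in so that it runs without seaborn (sns.boxplot(data=..., y='data', ax=axi) can be swapped in), and the figure height of 4 inches per row is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({'label': ['A', 'B', 'C', 'D'] * 250,
                   'data': np.random.randn(1000)})

# unique() keeps the labels in order of first appearance
labels = df['label'].unique()

# squeeze=False always returns a 2-D array, even when there is only one label
fig, axes = plt.subplots(nrows=len(labels), sharey=True, squeeze=False,
                         figsize=(12, 4 * len(labels)))

for labeli, axi in zip(labels, axes.ravel()):
    axi.boxplot(df.loc[df['label'] == labeli, 'data'].to_numpy())
    axi.set_title(labeli)
```

With squeeze=False the ravel() call is safe for any number of labels, including one.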

Histogram with Boxplot above in Python

Hi, I want to draw a histogram with a boxplot on top of it showing Q1, Q2 and Q3 as well as the outliers. An example image is below. (I am using Python and Pandas.)
I have checked several examples using matplotlib.pyplot but could not come up with a good result. I also want the histogram to have a density curve, as in the image below.
I also tried seaborn, and it gave me the density curve along with the histogram, but I didn't find a way to place a boxplot above it.
Can anyone help me do this with matplotlib.pyplot?
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="ticks")
x = np.random.randn(100)
f, (ax_box, ax_hist) = plt.subplots(2, sharex=True,
                                    gridspec_kw={"height_ratios": (.15, .85)})
sns.boxplot(x, ax=ax_box)
sns.distplot(x, ax=ax_hist)
ax_box.set(yticks=[])
sns.despine(ax=ax_hist)
sns.despine(ax=ax_box, left=True)
As of seaborn v0.11, sns.distplot is deprecated. Use sns.histplot for axes-level plots instead.
np.random.seed(2022)
x = np.random.randn(100)
f, (ax_box, ax_hist) = plt.subplots(2, sharex=True, gridspec_kw={"height_ratios": (.15, .85)})
sns.boxplot(x=x, ax=ax_box)
sns.histplot(x=x, bins=12, kde=True, stat='density', ax=ax_hist)
ax_box.set(yticks=[])
sns.despine(ax=ax_hist)
sns.despine(ax=ax_box, left=True)
Solution using only matplotlib, just because:
import numpy as np
import matplotlib.pyplot as plt
# sample data (an assumption; substitute your own 1-d array)
data = np.random.randn(100)
# start the plot: 2 rows, because we want the boxplot on the first row
# and the hist on the second
fig, ax = plt.subplots(
    2, figsize=(7, 5), sharex=True,
    gridspec_kw={"height_ratios": (.3, .7)}  # the boxplot gets 30% of the vertical space
)
# the boxplot
ax[0].boxplot(data, vert=False)
# removing borders
ax[0].spines['top'].set_visible(False)
ax[0].spines['right'].set_visible(False)
ax[0].spines['left'].set_visible(False)
# the histogram
ax[1].hist(data)
# and we are good to go
plt.show()
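The matplotlib-only version does not draw the density curve that the question asked for. One way to add it without seaborn is to compute a Gaussian kernel density estimate by hand with NumPy. The sketch below is only an illustration: the sample data, the normal-reference rule-of-thumb bandwidth (1.06 * std * n^(-1/5)) and the choice of 12 bins are all assumptions, not something taken from the answers above.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=100)  # stand-in sample

# Gaussian kernel density estimate with a rule-of-thumb bandwidth h
h = 1.06 * data.std(ddof=1) * len(data) ** (-1 / 5)
grid = np.linspace(data.min() - 3 * h, data.max() + 3 * h, 200)
z = (grid[:, None] - data[None, :]) / h
kde = np.exp(-0.5 * z ** 2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

fig, (ax_box, ax_hist) = plt.subplots(
    2, figsize=(7, 5), sharex=True,
    gridspec_kw={"height_ratios": (.3, .7)},
)
ax_box.boxplot(data, vert=False)
ax_box.set_yticks([])
# density=True puts the histogram on the same scale as the curve
ax_hist.hist(data, bins=12, density=True, alpha=0.5)
ax_hist.plot(grid, kde)
```

If SciPy is available, scipy.stats.gaussian_kde would be the more usual route; the hand-rolled version just keeps the example free of extra dependencies.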
Expanding on the answer from @mwaskom, I made a little adaptable function.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
def histogram_boxplot(data, xlabel=None, title=None, font_scale=2, figsize=(9,8), bins=None):
    """ Boxplot and histogram combined
    data: 1-d data array
    xlabel: xlabel
    title: title
    font_scale: the scale of the font (default 2)
    figsize: size of fig (default (9,8))
    bins: number of bins (default None / auto)
    example use: histogram_boxplot(np.random.rand(100), bins = 20, title="Fancy plot")
    """
    sns.set(font_scale=font_scale)
    f2, (ax_box2, ax_hist2) = plt.subplots(2, sharex=True, gridspec_kw={"height_ratios": (.15, .85)}, figsize=figsize)
    # pass the data as x= so the box is drawn horizontally along the shared x-axis
    sns.boxplot(x=data, ax=ax_box2)
    sns.distplot(data, ax=ax_hist2, bins=bins)
    if xlabel: ax_hist2.set(xlabel=xlabel)
    if title: ax_box2.set(title=title)
    plt.show()
histogram_boxplot(np.random.randn(100), bins = 20, title="Fancy plot", xlabel="Some values")
def histogram_boxplot(feature, figsize=(15,10), bins=None):
    f,(ax_box,ax_hist)=plt.subplots(nrows=2,sharex=True, gridspec_kw={'height_ratios':(.25,.75)},figsize=figsize)
    sns.distplot(feature,kde=False,ax=ax_hist,bins=bins)
    sns.boxplot(x=feature,ax=ax_box, color='Red')
    # green solid line: mean; yellow dashed line: median
    ax_hist.axvline(np.mean(feature),color='g',linestyle='-')
    ax_hist.axvline(np.median(feature),color='y',linestyle='--')

Matplotlib: adding a third subplot in the plot

I am completely new to Matplotlib and I have written this code to plot two series that so far is working fine:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
list1 = [1,2,3,4]
list2 = [4,3,2,1]
somecondition = True
plt.figure(1) #create one of the figures that must appear with the chart
gs = gridspec.GridSpec(3,1)
if not somecondition:
    ax = plt.subplot(gs[:,:]) #create the first subplot that will ALWAYS be there
    ax.plot(list1) #populate the "main" subplot
else:
    ax = plt.subplot(gs[:2, :])
    ax.plot(list1)
    ax = plt.subplot(gs[2, :]) #create the second subplot, that MIGHT be there
    ax.plot(list2) #populate the second subplot
plt.show()
What I would like to do is adding a third series to this plot, let's say:
list3 = [4,1,2,4]
What matters is that the first subplot (list1) has to be twice as big as the other two. To do this I have used gridspec, but as I am really new to it I can't work out how to set the parameters in this sample code to get the third subplot. Can anyone explain how I should edit the somecondition == True block to get 3 subplots (the first one twice as big as the 2 below it) rather than just two?
P.S. the code is executable.
This is an example with Matplotlib subplots
import matplotlib.pyplot as plt
import numpy as np
x,y = np.random.randn(2,100)
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax1.xcorr(x, y, usevlines=True, maxlags=50, normed=True, lw=2)
ax1.grid(True)
ax1.axhline(0, color='black', lw=2)
ax2 = fig.add_subplot(212, sharex=ax1)
ax2.acorr(x, usevlines=True, normed=True, maxlags=50, lw=2)
ax2.grid(True)
ax2.axhline(0, color='black', lw=2)
plt.show()
It uses pyplot and add_subplot, with a quite straightforward syntax.
To get 2:1 ratio, you can use 4 rows, and make plots take 2, 1, 1 row respectively:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
list1 = [1,2,3,4]
list2 = [4,3,2,1]
list3 = [4,1,2,4]
somecondition = True
plt.figure(1) #create one of the figures that must appear with the chart
gs = gridspec.GridSpec(4,1)
if not somecondition:
    ax = plt.subplot(gs[:,:]) #create the first subplot that will ALWAYS be there
    ax.plot(list1) #populate the "main" subplot
else:
    ax = plt.subplot(gs[:2, :]) #the main subplot takes the first 2 of 4 rows
    ax.plot(list1)
    ax = plt.subplot(gs[2, :]) #create the second subplot, that MIGHT be there
    ax.plot(list2) #populate the second subplot
    ax = plt.subplot(gs[3, :]) #create the third subplot, that MIGHT be there
    ax.plot(list3) #populate the third subplot
plt.show()
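An alternative to the 4-row grid, for the case where all three subplots are wanted, is to ask plt.subplots for 3 rows and give them relative heights through gridspec_kw. A short sketch using the lists from the question (the axes names are chosen just for this example):

```python
import matplotlib.pyplot as plt

list1 = [1, 2, 3, 4]
list2 = [4, 3, 2, 1]
list3 = [4, 1, 2, 4]

# three rows whose heights are in the ratio 2 : 1 : 1
fig, (ax_main, ax_second, ax_third) = plt.subplots(
    3, 1, gridspec_kw={'height_ratios': [2, 1, 1]}
)
ax_main.plot(list1)
ax_second.plot(list2)
ax_third.plot(list3)
```

This gives the same 2:1:1 proportions as slicing a 4-row GridSpec, but it does not cover the somecondition branch where only one subplot is drawn.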
