I currently have 2 subplots using seaborn:
import matplotlib.pyplot as plt
import seaborn as sns  # seaborn.apionly was deprecated in seaborn 0.8 and later removed
f, (ax1, ax2) = plt.subplots(2, sharex=True)
sns.distplot(df['Difference'].values, ax=ax1) #array, top subplot
sns.boxplot(df['Difference'].values, ax=ax2, width=.4) #bottom subplot
sns.stripplot([cimin, cimax], color='r', marker='d') #overlay confidence intervals over boxplot
ax1.set_ylabel('Relative Frequency') #label only the top subplot
plt.xlabel('Difference')
plt.show()
Here is the output:
I am stumped on how to make ax2 (the bottom subplot) shorter relative to ax1 (the top subplot). I looked over the GridSpec documentation (http://matplotlib.org/users/gridspec.html) but I can't figure out how to apply it to seaborn objects.
Question:
How do I make the bottom subplot shorter than the top subplot?
Incidentally, how do I move the plot's title, "Distribution of Difference", so that it sits above the top subplot?
Thank you for your time.
As @dnalow mentioned, seaborn has no impact on GridSpec, because you pass a reference to the Axes object into the seaborn function. Like so:
import matplotlib.pyplot as plt
import seaborn as sns  # seaborn.apionly was deprecated in seaborn 0.8 and later removed
import matplotlib.gridspec as gridspec
tips = sns.load_dataset("tips")
gridkw = dict(height_ratios=[5, 1])
fig, (ax1, ax2) = plt.subplots(2, 1, gridspec_kw=gridkw)
sns.distplot(tips.loc[:,'total_bill'], ax=ax1) #array, top subplot
sns.boxplot(x=tips.loc[:,'total_bill'], ax=ax2, width=.4) #bottom subplot; x= keeps the box horizontal across seaborn versions
plt.show()
If you're using a FacetGrid (either directly or through something like catplot, which uses it indirectly), then you can pass gridspec_kws.
Here is an example using a catplot, where "var3" has two values, i.e. there are two subplots, which I am displaying at a ratio of 3:8, with un-shared x-axes.
g = sns.catplot(data=data, x="bin", y="y", col="var3", hue="var4", kind="bar",
                sharex=False,
                facet_kws={
                    'gridspec_kws': {'width_ratios': [3, 8]}
                })
# Make the first subplot have a custom `xlim`:
g.axes[0][0].set_xlim(right=2.5)
Result (the axis labels are hidden because I just copied my actual data's output, so they wouldn't make sense here):
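Since the data for the catplot example is not shown, here is a self-contained sketch of the same idea. The DataFrame is made up (the column names "bin", "y", "var3" and "var4" are reused from the snippet above, the values are random), so treat it as an illustration of the gridspec_kws pass-through rather than a reproduction of the original figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: facet "a" has 3 bins, facet "b" has 8 bins
rng = np.random.default_rng(1)
rows = []
for var3, n_bins in (("a", 3), ("b", 8)):
    for b in range(n_bins):
        for var4 in ("u", "v"):
            for _ in range(4):  # a few observations per bar
                rows.append({"bin": b, "y": rng.uniform(1, 10),
                             "var3": var3, "var4": var4})
data = pd.DataFrame(rows)

# facet_kws is forwarded to FacetGrid, which forwards gridspec_kws to matplotlib
g = sns.catplot(
    data=data, x="bin", y="y", col="var3", hue="var4", kind="bar",
    sharex=False,
    facet_kws={"gridspec_kws": {"width_ratios": [3, 8]}},
)
left_ax, right_ax = g.axes[0]
```

As far as I know, FacetGrid ignores gridspec_kws when col_wrap is used, so this only applies to a plain row/column grid.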
I'm trying to plot a simple box plot next to a simple histogram in the same figure using seaborn (0.11.2) and pandas (1.3.4) in a jupyter notebook (6.4.5).
I've tried multiple approaches with nothing working.
fig, ax = plt.subplots(1, 2)
sns.boxplot(x='rent', data=df, ax=ax[0])
sns.displot(x='rent', data=df, bins=50, ax=ax[1])
There is an extra plot or grid that gets put next to the boxplot, and this extra empty plot shows up any time I try to create multiple axes.
Changing:
fig, ax = plt.subplots(2)
Gets:
Again, that extra empty plot next to the boxplot, but this time below it.
Trying the following code:
fig, (axbox, axhist) = plt.subplots(1,2)
sns.boxplot(x='rent', data=df, ax=axbox)
sns.displot(x='rent', data=df, bins=50, ax=axhist)
Gets the same results.
Following the answer in this post, I try:
fig, axs = plt.subplots(ncols=2)
sns.boxplot(x='rent', data=df, ax=axs[0])
sns.displot(x='rent', data=df, bins=50, ax=axs[1])
results in the same thing:
If I just create the figure and then the plots underneath:
plt.figure()
sns.boxplot(x='rent', data=df)
sns.displot(x='rent', data=df, bins=50)
It just gives me the two plots on top of each other, which I assume is just making two different figures.
I'm not sure why that extra empty plot shows up next to the boxplot when I try to do multiple axes in seaborn.
If I use pyplot instead of seaborn, I can get it to work:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
ax1.hist(df['rent'], bins=50)
ax2.boxplot(df['rent'])
Results in:
The closest I've come is to use seaborn only on the boxplot, and pyplot for the histogram:
plt.figure(figsize=(8, 5))
plt.subplot(1, 2, 1)
sns.boxplot(x='rent', data=df)
plt.subplot(1, 2, 2)
plt.hist(df['rent'], bins=50)
Results:
What am I missing? Why can't I get this to work with two seaborn plots on the same figure, side by side (1 row, 2 columns)?
Try this function:
def creating_box_hist(column, df):
    # create a figure composed of two matplotlib Axes objects (ax_box and ax_hist)
    f, (ax_box, ax_hist) = plt.subplots(2, sharex=True, gridspec_kw={"height_ratios": (.15, .85)})
    # draw one axes-level plot on each Axes
    sns.boxplot(x=df[column], ax=ax_box)
    sns.histplot(data=df, x=column, ax=ax_hist)
    # remove the x-axis label from the boxplot
    ax_box.set(xlabel='')
    plt.show()
I am attempting to place a Seaborn time-based heatmap on top of a bar chart, indicating the number of patients in each bin/timeframe. I can successfully make an individual heatmap and bar plot, but combining the two does not work as intended.
import pandas as pd
import numpy as np
import seaborn as sb
from matplotlib import pyplot as plt
# Mock data
patient_counts = [650, 28, 8]
missings_df = pd.DataFrame(np.array([[-15.8, 600/650, 580/650, 590/650],
[488.2, 20/23, 21/23, 21/23],
[992.2, 7/8, 8/8, 8/8]]),
columns=['time', 'Resp. (/min)', 'SpO2', 'Blood Pressure'])
missings_df.set_index('time', inplace=True)
# Plot heatmap
fig, (ax1, ax2) = plt.subplots(nrows=2, figsize=(26, 16), sharex=True, gridspec_kw={'height_ratios': [5, 1]})
sb.heatmap(missings_df.T, cmap="Blues", cbar_kws={"shrink": .8}, ax=ax1, xticklabels=False)
plt.xlabel('Time (hours)')
# Plot line graph under heatmap to show nr. of patients in each bin
x_ticks = [time for time in missings_df.index]
ax2.bar([i for i, _ in enumerate(x_ticks)], patient_counts, align='center')
plt.xticks([i for i, _ in enumerate(x_ticks)], x_ticks)
plt.show()
This code gives me the graph below. As you can see, there are two issues:
The bar plot extends too far
The first and second bar are not aligned with the top graph, where the tick of the first plot does not line up with the centre of the bar either.
I've looked online but could not find a good resource for fixing these issues. Any ideas?
A problem is that the colorbar takes away space from the heatmap, making its plot narrower than the bar plot. You can create a 2x2 grid to make room for the colorbar, and remove the empty subplot. Change sharex=True to sharex='col' to prevent the colorbar getting the same x-axis as the heatmap.
Another problem is that the heatmap has its cell borders at positions 0, 1, 2, ..., so their centers are at 0.5, 1.5, 2.5, .... You can put the bars at these centers instead of at their default positions:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
fig, ((ax1, cbar_ax), (ax2, dummy_ax)) = plt.subplots(nrows=2, ncols=2, figsize=(26, 16), sharex='col',
gridspec_kw={'height_ratios': [5, 1], 'width_ratios': [20, 1]})
missings_df = np.random.rand(3, 3)
sns.heatmap(missings_df.T, cmap="Blues", cbar_ax=cbar_ax, xticklabels=False, linewidths=2, ax=ax1)
ax2.set_xlabel('Time (hours)')
patient_counts = np.random.randint(10, 50, 3)
x_ticks = ['Time1', 'Time2', 'Time3']
x_tick_pos = [i + 0.5 for i in range(len(x_ticks))]
ax2.bar(x_tick_pos, patient_counts, align='center')
ax2.set_xticks(x_tick_pos)
ax2.set_xticklabels(x_ticks)
dummy_ax.axis('off')
plt.tight_layout()
plt.show()
PS: Be careful not to mix the "functional" interface with the "object-oriented" interface to matplotlib. So, try not to use plt.xlabel() as it is not obvious that it will be applied to the "current" ax (ax2 in the code of the question).
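A small sketch of the PS, using plain matplotlib and made-up label strings: right after plt.subplots, the "current" axes is normally the last one created, so a pyplot-level call lands there, while the object-oriented call names its target explicitly.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(nrows=2)

# Functional interface: applies to whatever the current axes happens to be
plt.xlabel("set via plt.xlabel")

# Object-oriented interface: the target axes is explicit
ax1.set_xlabel("set via ax1.set_xlabel")

print("ax1:", repr(ax1.get_xlabel()))
print("ax2:", repr(ax2.get_xlabel()))
```

Here the plt.xlabel text ends up on ax2, not ax1, which is easy to miss when reading the code.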
I want to draw a catplot with violin plots using seaborn with a broken y-axis (because I have a cause-consequence process acting at two different scales: one in [0, 0.2] and a second in [2, 12] of my quantitative y-variable).
I understood from this answer that seaborn has no easy built-in feature for this kind of plot (yet?).
So I tried different approaches, without success, to stack two plots of the same dataset with two different scales.
Unsuccessful attempts:
Using the standard dataset 'exercise', I tried:
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
exercise = sns.load_dataset("exercise")
f, (ax1, ax2) = plt.subplots(ncols=2, nrows=1, sharey=True)
f = sns.catplot(x="time", y="pulse", hue="kind",data=exercise, kind="violin",ax=ax1)
f = sns.catplot(x="time", y="pulse", hue="kind",data=exercise, kind="violin",ax=ax2)
ax1.set_ylim(0, 6.5) # those limits are fake
ax2.set_ylim(13.5, 20)
plt.subplots_adjust(wspace=0, hspace=0)
plt.show()
I also tried to use FacetGrid, but without success:
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
exercise = sns.load_dataset("exercise")
g = sns.FacetGrid(exercise, col="kind",row="time")
g.map(sns.catplot, x="time", y="pulse", hue="kind",data=exercise, kind="violin")
plt.show()
Here it gives me the right base grid of plots, but the plots are drawn in other figures.
If you want to draw on a subplot, you cannot use catplot, which is a figure-level function. Instead, you need to use violinplot directly. Also, if you want two different y-scales, you cannot use sharey=True when you create your subplots.
The rest is pretty much copied/pasted from matplotlib's broken axes tutorial
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
exercise = sns.load_dataset("exercise")
f, (ax_top, ax_bottom) = plt.subplots(ncols=1, nrows=2, sharex=True, gridspec_kw={'hspace':0.05})
sns.violinplot(x="time", y="pulse", hue="kind",data=exercise, ax=ax_top)
sns.violinplot(x="time", y="pulse", hue="kind",data=exercise, ax=ax_bottom)
ax_top.set_ylim(bottom=125) # those limits are fake
ax_bottom.set_ylim(0,100)
sns.despine(ax=ax_bottom)
sns.despine(ax=ax_top, bottom=True)
ax = ax_top
d = .015 # how big to make the diagonal lines in axes coordinates
# arguments to pass to plot, just so we don't keep repeating them
kwargs = dict(transform=ax.transAxes, color='k', clip_on=False)
ax.plot((-d, +d), (-d, +d), **kwargs) # top-left diagonal
ax2 = ax_bottom
kwargs.update(transform=ax2.transAxes) # switch to the bottom axes
ax2.plot((-d, +d), (1 - d, 1 + d), **kwargs) # bottom-left diagonal
# remove one of the two legends
ax_bottom.legend_.remove()
plt.show()
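To see the figure-level versus axes-level difference directly, here is a short sketch with made-up data (the column names mimic the 'exercise' dataset, the values are random). It assumes a seaborn version where FacetGrid exposes the figure attribute (0.11.2 or later, as far as I know).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data shaped like the 'exercise' dataset
rng = np.random.default_rng(3)
data = pd.DataFrame({
    "time": np.repeat(["1 min", "15 min", "30 min"], 30),
    "pulse": rng.normal(100, 10, size=90),
})

fig, ax = plt.subplots()

# Axes-level: draws on the Axes passed in and returns that same Axes
returned_ax = sns.violinplot(x="time", y="pulse", data=data, ax=ax)

# Figure-level: builds its own FacetGrid on a brand-new figure
grid = sns.catplot(x="time", y="pulse", data=data, kind="violin")

print(returned_ax is ax)        # same Axes object
print(grid.figure is not fig)   # catplot drew on a different figure
```

This is why the catplot calls in the question produced plots in separate figures instead of filling the prepared subplots.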
I'm playing with seaborn for the first time, trying to plot different columns of a pandas dataframe on different plots using matplotlib subplots. The simple code below produces the expected figure but the last plot does not have a proper y range (it seems linked to the full range of values in the dataframe).
Does anyone have an idea why this happens and how to prevent it? Thanks.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pds
import seaborn as sns
X = np.arange(0,10)
df = pds.DataFrame({'X': X, 'Y1': 4*X, 'Y2': X/2., 'Y3': X+3, 'Y4': X-7})
fig, axes = plt.subplots(ncols=2, nrows=2)
ax1, ax2, ax3, ax4 = axes.ravel()
sns.set(style="ticks")
sns.despine(fig=fig)
sns.regplot(x='X', y='Y1', data=df, fit_reg=False, ax=ax1)
sns.regplot(x='X', y='Y2', data=df, fit_reg=False, ax=ax2)
sns.regplot(x='X', y='Y3', data=df, fit_reg=False, ax=ax3)
sns.regplot(x='X', y='Y4', data=df, fit_reg=False, ax=ax4)
plt.show()
Update: I modified the above code with:
fig, axes = plt.subplots(ncols=2, nrows=3)
ax1, ax2, ax3, ax4, ax5, ax6 = axes.ravel()
If I plot data on any axis but the last one I obtain what I'm looking for:
Of course I don't want the empty frames. All plots present the data with a similar visual aspect.
When data is plotted on the last axis, it gets a y range that is too wide like in the first example. Only the last axis seems to have this problem. Any clue?
If you want the scales to be the same on all axes you could create subplots with this command:
fig, axes = plt.subplots(ncols=2, nrows=2, sharey=True, sharex=True)
This makes all subplots share the relevant axes:
If you want to change the limits of that particular axes manually, you can add this line at the end of the plotting commands:
ax4.set_ylim(top=5)
# or for both limits like this:
# ax4.set_ylim([-2, 5])
Which will give something like this:
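Both options can be tried out with a sketch like the following. It uses plain matplotlib scatter plots instead of regplot, and the manual limits (-8, 3) are only an assumption chosen to fit Y4 = X - 7 for X in 0..9.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

X = np.arange(0, 10)
columns = {"Y1": 4 * X, "Y2": X / 2.0, "Y3": X + 3, "Y4": X - 7}

# Option 1: shared axes, every subplot ends up with the same limits
fig_shared, axes_shared = plt.subplots(ncols=2, nrows=2, sharex=True, sharey=True)
for ax, (name, y) in zip(axes_shared.ravel(), columns.items()):
    ax.scatter(X, y)
    ax.set_title(name)

# Option 2: independent axes, with limits set by hand on the last one only
fig_free, axes_free = plt.subplots(ncols=2, nrows=2)
for ax, (name, y) in zip(axes_free.ravel(), columns.items()):
    ax.scatter(X, y)
    ax.set_title(name)
axes_free[1, 1].set_ylim(-8, 3)  # assumed limits, picked to fit Y4
```

Note that with sharey=True a set_ylim call on one subplot changes all of them, so manual per-subplot limits only make sense with independent axes.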
Is it possible to set the size/position of a matplotlib subplot after the axes are created? I know that I can do:
import matplotlib.pyplot as plt
ax = plt.subplot(111)
ax.change_geometry(3,1,1)
to put the axes on the top row of three. But I want the axes to span the first two rows. I have tried this:
import matplotlib.gridspec as gridspec
ax = plt.subplot(111)
gs = gridspec.GridSpec(3,1)
ax.set_subplotspec(gs[0:2])
but the axes still fill the whole window.
Update for clarity
I want to change the position of an existing axes instance rather than set it when it is created. This is because the extent of the axes will be modified each time I add data (plotting data on a map using cartopy). The map may turn out tall and narrow, or short and wide (or something in between). So the decision on the grid layout will happen after the plotting function.
Thanks to Molly pointing me in the right direction, I have a solution:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
fig = plt.figure()
ax = fig.add_subplot(111)
gs = gridspec.GridSpec(3,1)
ax.set_position(gs[0:2].get_position(fig))
ax.set_subplotspec(gs[0:2]) # only necessary if using tight_layout()
fig.add_subplot(gs[2])
fig.tight_layout() # not strictly part of the question
plt.show()
You can create a figure with one subplot that spans two rows and one subplot that spans one row using the rowspan argument to subplot2grid:
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = plt.subplot2grid((3,1), (0,0), rowspan=2)
ax2 = plt.subplot2grid((3,1), (2,0))
plt.show()
If you want to change the subplot size and position after it's been created you can use the set_position method.
ax1.set_position([0.1,0.1, 0.5, 0.5])
But you don't need this to create the figure you described.
You can avoid ax.set_position() by using fig.tight_layout() instead, which recalculates the positions from the new gridspec:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
# create the first axes without knowing of further subplot creation
fig, ax = plt.subplots()
ax.plot(range(5), 'o-')
# now update the existing gridspec ...
gs = gridspec.GridSpec(3, 1)
ax.set_subplotspec(gs[0:2])
# ... and recalculate the positions
fig.tight_layout()
# add a new subplot
fig.add_subplot(gs[2])
fig.tight_layout()
plt.show()
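To check that an existing axes really ends up in the intended grid cells, the positions can be compared programmatically. This is a sketch of the set_position variant from the earlier answer, without tight_layout, and the variable names target and ax_bottom are mine.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import numpy as np

fig = plt.figure()
ax = fig.add_subplot(111)      # created before the final layout is known
ax.plot(range(5), "o-")

gs = gridspec.GridSpec(3, 1, figure=fig)
target = gs[0:2]               # top two of three rows

ax.set_subplotspec(target)                 # keeps layout tools informed
ax.set_position(target.get_position(fig))  # moves the axes immediately
ax_bottom = fig.add_subplot(gs[2])         # new axes in the remaining row

print(ax.get_position().bounds)
print(target.get_position(fig).bounds)
```

The two printed bounds should match, and the original axes should sit entirely above the new bottom one.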