Combine (overlay) two factorplots in matplotlib - python

I need to add swarmplot to boxplot in matplotlib, but I don't know how to do it with factorplot. I think I can iterate with subplots, but I would like to learn how to do it with seaborn and factorplot.
A simple example (plotting by using the same axis ax):
import numpy as np
import seaborn as sns
tips = sns.load_dataset("tips")
ax = sns.boxplot(x="tip", y="day", data=tips, whis=np.inf)
ax = sns.swarmplot(x="tip", y="day", data=tips, color=".2")
In my case, I need to overlay the swarm factorplot:
g = sns.factorplot(x="sex", y="total_bill",
                   hue="smoker", col="time",
                   data=tips, kind="swarm",
                   size=4, aspect=.7);
and the box factorplot.
I can't figure out how to use the axes (extracted from g) to do this.
Something like:
g = sns.factorplot(x="sex", y="total_bill",
                   hue="smoker", col="time",
                   data=tips, kind="box",
                   size=4, aspect=.7);
I want something like this, but with factorplot and boxplot instead of violinplot

Instead of trying to overlay the two subplots of a factorplot with individual boxplots (which is possible, but I don't like it), one can just create the two subplots individually.
You would then loop over the groups and axes and plot a pair of box- and swarmplots to each.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset("tips")
fig, axes = plt.subplots(ncols=2, sharex=True, sharey=True)
for ax, (n, grp) in zip(axes, tips.groupby("time")):
    # Boxes first, then the individual points on top, both on the same axes.
    sns.boxplot(x="sex", y="total_bill", data=grp, whis=np.inf, ax=ax)
    sns.swarmplot(x="sex", y="total_bill", hue="smoker", data=grp,
                  palette=["crimson", "indigo"], ax=ax)
    ax.set_title(n)
axes[-1].get_legend().remove()
plt.show()
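The loop above can be wrapped in a small helper so that it works for any number of groups. This is only a sketch: the function name box_with_points and the synthetic DataFrame are assumptions made for illustration (the data is random, not the real tips dataset), and stripplot is used here instead of swarmplot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the "tips" dataset (random values, for illustration only).
rng = np.random.default_rng(0)
n = 80
df = pd.DataFrame({
    "time": rng.choice(["Lunch", "Dinner"], size=n),
    "sex": rng.choice(["Male", "Female"], size=n),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=n),
})


def box_with_points(data, facet, x, y):
    """Draw one boxplot plus stripplot per level of `facet`, side by side."""
    groups = list(data.groupby(facet))
    order = sorted(data[x].unique())  # same category order on every axes
    fig, axes = plt.subplots(ncols=len(groups), sharey=True, squeeze=False)
    for ax, (name, grp) in zip(axes[0], groups):
        sns.boxplot(x=x, y=y, data=grp, order=order, color="0.85", ax=ax)
        sns.stripplot(x=x, y=y, data=grp, order=order, color="0.2", ax=ax)
        ax.set_title(name)
    return fig, axes[0]


fig, axes = box_with_points(df, facet="time", x="sex", y="total_bill")
titles = [ax.get_title() for ax in axes]
```

Passing an explicit order to both calls keeps the categories aligned even if one group happens to lack a category.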

Related

How to overlay data points on a barplot with a categorical axis

Goal: I am trying to show individual data points in a figure with multiple grouped bar charts using Seaborn.
Problem: I tried to do it with a catplot for the bar chart and another catplot for the individual data points. However, this generates 2 figures: One figure with the bar chart and the other with the individual data points.
Question: Is there a way to show the individual data points in the same figure together with the bar chart using Seaborn?
This is my code generating 2 separate figures:
import seaborn as sns
tips = sns.load_dataset("tips")
g = sns.catplot(
    x="sex",
    y="total_bill",
    hue="smoker",
    row="time",
    data=tips,
    kind="bar",
    ci="sd",
    edgecolor="black",
    errcolor="black",
    errwidth=1.5,
    capsize=0.1,
    height=4,
    aspect=.7,
)
g = sns.catplot(
    x="sex",
    y="total_bill",
    hue="smoker",
    row="time",
    data=tips,
    kind="strip",
    height=4,
    aspect=.7,
)
seaborn.catplot is a figure-level function: each call creates its own figure, so two catplot calls cannot be combined into one figure.
As shown below, axes-level plots like seaborn.barplot and seaborn.stripplot can be plotted to the same axes.
import seaborn as sns
tips = sns.load_dataset("tips")
ax = sns.barplot(
    x="sex",
    y="total_bill",
    hue="smoker",
    data=tips,
    ci="sd",
    edgecolor="black",
    errcolor="black",
    errwidth=1.5,
    capsize=0.1,
    alpha=0.5,
)
sns.stripplot(
    x="sex",
    y="total_bill",
    hue="smoker",
    data=tips, dodge=True, alpha=0.6, ax=ax,
)
# remove extra legend handles
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[2:], labels[2:], title='Smoker', bbox_to_anchor=(1, 1.02), loc='upper left')
Figure-level plots (seaborn.catplot) cannot be combined with each other. However, it is possible to map an axes-level plot (seaborn.stripplot) onto a figure-level plot.
See Building structured multi-plot grids
This can be a temperamental process, and may only work when the same columns from the dataframe are being used in the mapped plot.
Tested in python 3.8.11, matplotlib 3.4.3, seaborn 0.11.2
import seaborn as sns
tips = sns.load_dataset("tips")
g = sns.catplot(
    x="sex",
    y="total_bill",
    hue="smoker",
    row="time",
    data=tips,
    kind="bar",
    ci="sd",
    edgecolor="black",
    errcolor="black",
    errwidth=1.5,
    capsize=0.1,
    height=4,
    aspect=.7,
    alpha=0.5)
# map data to stripplot
g.map(sns.stripplot, 'sex', 'total_bill', 'smoker', hue_order=['Yes', 'No'], order=['Male', 'Female'],
      palette=sns.color_palette(), dodge=True, alpha=0.6, ec='k', linewidth=1)

How to overlay a scatterplot on top of boxplot with sns.catplot?

It is possible to combine axes-level plot functions by simply calling them successively:
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
sns.set_theme(style="whitegrid")
ax = sns.boxplot(x="day", y="total_bill", data=tips)
ax = sns.stripplot(x="day", y="total_bill", data=tips,
                   color=".25", alpha=0.7, ax=ax)
plt.show()
How can this be achieved for the figure-level function sns.catplot()? Successive calls to sns.catplot() create a new figure each time, and passing a figure handle is not possible.
# This creates two separate figures:
sns.catplot(..., kind="box")
sns.catplot(..., kind="strip")
The following works for me with seaborn v0.11:
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
g = sns.catplot(x="sex", y="total_bill", hue="smoker", col="time",
                data=tips, kind="box",
                palette=["#FFA7A0", "#ABEAC9"],
                height=4, aspect=.7);
g.map_dataframe(sns.stripplot, x="sex", y="total_bill",
                hue="smoker", palette=["#404040"],
                alpha=0.6, dodge=True)
# g.map(sns.stripplot, "sex", "total_bill", "smoker",
#       palette=["#404040"], alpha=0.6, dodge=True)
plt.show()
plt.show()
Explanations: In a first pass, the box-plots are created using sns.catplot(). The function returns a sns.FacetGrid that accommodates the different axes for each value of the categorical parameter time. In a second pass, this FacetGrid is reused to overlay the scatter plot (sns.stripplot, or alternatively, sns.swarmplot). The above uses method map_dataframe() because data is a pandas DataFrame with named columns. (Alternatively, using map() is also possible.) Setting dodge=True makes sure that the scatter plots are shifted along the categorical axis for each hue category. Finally, note that by calling sns.catplot() with kind="box" and then overlaying the scatter in a second step, the problem of duplicated legend entries is implicitly circumvented.
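The two passes described above can be checked by inspecting the returned grid. The following is a minimal sketch under some assumptions: the DataFrame is synthetic (random values standing in for tips), order and hue_order are passed explicitly to keep the facets synchronized, and counting the scatter collections per axes is just one illustrative way to see that the second pass landed on the existing axes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the "tips" dataset (random values, for illustration only).
rng = np.random.default_rng(1)
n = 120
df = pd.DataFrame({
    "time": rng.choice(["Lunch", "Dinner"], size=n),
    "sex": rng.choice(["Male", "Female"], size=n),
    "smoker": rng.choice(["Yes", "No"], size=n),
    "total_bill": rng.gamma(4.0, 5.0, size=n),
})
order, hue_order = ["Male", "Female"], ["Yes", "No"]

# First pass: catplot builds the figure and returns the grid object.
g = sns.catplot(x="sex", y="total_bill", hue="smoker", col="time",
                data=df, kind="box", order=order, hue_order=hue_order,
                height=4, aspect=0.7)
grid_type = type(g).__name__
n_before = {name: len(ax.collections) for name, ax in g.axes_dict.items()}

# Second pass: reuse the same grid, so the points go onto the existing axes.
g.map_dataframe(sns.stripplot, x="sex", y="total_bill", hue="smoker",
                order=order, hue_order=hue_order,
                palette=["#404040", "#404040"], alpha=0.6, dodge=True)
n_after = {name: len(ax.collections) for name, ax in g.axes_dict.items()}
```

Each facet axes ends up with more scatter collections than before the second pass, and no second figure is created.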
Alternative (not recommended): It is also possible to create a FacetGrid object first and then call map_dataframe() twice. While this works for this example, in other situations one has to make sure that the mapping of properties is synchronized correctly across facets (see the warning in the docs). sns.catplot() takes care of this, as well as the legend.
g = sns.FacetGrid(tips, col="time", height=4, aspect=.7)
g.map_dataframe(sns.boxplot, x="sex", y="total_bill", hue="smoker",
                palette=["#FFA7A0", "#ABEAC9"])
g.map_dataframe(sns.stripplot, x="sex", y="total_bill", hue="smoker",
                palette=["#404040"], alpha=0.6, dodge=True)
# Note: the default legend is not resulting in the correct entries.
# Some fix-up step is required here...
# g.add_legend()
plt.show()

Seaborn stripplot with violin plot bars in front of points

I would like to draw a violin plot behind a jittered stripplot. In the resulting plot, the mean/std bar is behind the jitter points, which makes it hard to see. I'm wondering if there's a way to bring the bar in front of the points.
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
sns.violinplot(x="day", y="total_bill", data=tips, color="0.8")
sns.stripplot(x="day", y="total_bill", data=tips, jitter=True)
plt.show()
Seaborn does not expose the objects it creates to the user, so one needs to collect them from the axes in order to manipulate them. The property you want to change here is the zorder. So the idea can be to:
1. Plot the violins.
2. Collect the lines and dots from the axes, give the lines a high zorder, and give the dots an even higher zorder.
3. Plot the strip- or swarmplot last. This will have a lower zorder automatically.
Example:
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection
tips = sns.load_dataset("tips")
ax = sns.violinplot(x="day", y="total_bill", data=tips, color=".8")
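# Raise the inner lines of the violins above the points drawn later.
for artist in ax.lines:
    artist.set_zorder(10)
# Raise the median dots even higher.
for artist in ax.findobj(PathCollection):
    artist.set_zorder(11)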
sns.stripplot(x="day", y="total_bill", data=tips, jitter=True, ax=ax)
plt.show()
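The zorder mechanism itself can be seen in isolation with plain matplotlib. This is a minimal sketch with made-up coordinates, not seaborn's own drawing code: a scatter stands in for the points and a thick line stands in for the bar.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
points = ax.scatter([0, 0, 0], [1, 2, 3])        # a PathCollection
bar, = ax.plot([0, 0], [0.5, 3.5], linewidth=6)  # a Line2D

# Remember the zorders matplotlib assigned before any change.
defaults = (points.get_zorder(), bar.get_zorder())

# Artists with a higher zorder are drawn later, i.e. on top.
# Here the points are pushed above the line.
points.set_zorder(bar.get_zorder() + 1)
```

The same set_zorder call works on any artist collected from the axes, which is what the answer above relies on.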
I just came across the same problem and could fix it simply by adjusting the zorder parameter in sns.stripplot:
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
sns.violinplot(x="day", y="total_bill", data=tips, color="0.8")
sns.stripplot(x="day", y="total_bill", data=tips, jitter=True, zorder=1)
plt.show()
The result is then similar to the answer of @ImportanceOfBeingErnest.

Seaborn Boxplot with Same Color for All Boxes

I am using seaborn and want to generate a box plot where all boxes have the same color. For some reason seaborn uses different colors for each box and doesn't have an option to stop this behavior and set the same color for all boxes.
How can I force seaborn to use the same color for all boxes?
fig, ax = plt.subplots(figsize=(10, 20))
sns.boxplot(y='categorical_var', x='numeric_var', ax=ax)
Use the color parameter:
import seaborn as sns
tips = sns.load_dataset("tips")
sns.boxplot(x="day", y="tip", data=tips, color="seagreen")
Make your own palette and set the color of boxes like:
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_color_codes()
tips = sns.load_dataset("tips")
pal = {day: "b" for day in tips.day.unique()}
sns.boxplot(x="day", y="total_bill", data=tips, palette=pal)
plt.show()
Another way is to iterate over the box artists of the axes instance and set the color of each one with set_facecolor:
ax = sns.boxplot(x="day", y="total_bill", data=tips)
# Note: with newer matplotlib versions the boxes may be stored in ax.patches instead of ax.artists.
for box in ax.artists:
    box.set_facecolor("green")
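To check that a single color really ends up on every box, one can collect the box patches and compare their face colors. This is a rough sketch under some assumptions: the data is synthetic, and the boxes are looked up in both ax.patches and ax.artists because where they are stored seems to depend on the matplotlib version.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.colors import to_hex
from matplotlib.patches import PathPatch

# Synthetic data: four categories with 25 random values each (illustration only).
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "day": np.repeat(["Thur", "Fri", "Sat", "Sun"], 25),
    "total_bill": rng.gamma(4.0, 5.0, size=100),
})

fig, ax = plt.subplots()
sns.boxplot(x="day", y="total_bill", data=df, color="seagreen", ax=ax)

# Gather the box patches from both containers, then keep only PathPatch objects.
candidates = list(ax.patches) + list(ax.artists)
boxes = [a for a in candidates if isinstance(a, PathPatch)]

# A set with one element means every box shares the same face color.
box_colors = {to_hex(b.get_facecolor()) for b in boxes}
```

Note that the color found on the boxes may differ slightly from the requested one, since seaborn desaturates fill colors by default.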

Seaborn boxplot + stripplot: duplicate legend

One of the coolest things you can easily make in seaborn is boxplot + stripplot combination:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
tips = sns.load_dataset("tips")
sns.stripplot(x="day", y="total_bill", hue="smoker",
              data=tips, jitter=True,
              palette="Set2", dodge=True, linewidth=1, edgecolor='gray')
sns.boxplot(x="day", y="total_bill", hue="smoker",
            data=tips, palette="Set2", fliersize=0)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.);
Unfortunately, as you can see above, it produced double legend, one for boxplot, one for stripplot. Obviously, it looks ridiculous and redundant. But I cannot seem to find a way to get rid of stripplot legend and only leave boxplot legend. Probably, I can somehow delete items from plt.legend, but I cannot find it in the documentation.
You can get what handles/labels should exist in the legend before you actually draw the legend itself. You then draw the legend only with the specific ones you want.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
tips = sns.load_dataset("tips")
sns.stripplot(x="day", y="total_bill", hue="smoker",
              data=tips, jitter=True,
              palette="Set2", dodge=True, linewidth=1, edgecolor='gray')
# Get the ax object to use later.
ax = sns.boxplot(x="day", y="total_bill", hue="smoker",
                 data=tips, palette="Set2", fliersize=0)
# Get the handles and labels. For this example it'll be two lists
# of length 4 each.
handles, labels = ax.get_legend_handles_labels()
# When creating the legend, only use the first two elements
# to effectively remove the last two.
l = plt.legend(handles[0:2], labels[0:2], bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
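Slicing by position assumes you know how many entries each layer contributes. A variation, sketched here with plain matplotlib and made-up data, is to keep only the first handle seen for each label. The scatter and bar calls are stand-ins for the stripplot and boxplot layers, not what seaborn draws internally.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Two layers that share labels, standing in for stripplot + boxplot with the same hue.
ax.scatter([0, 1], [1, 2], color="C0", label="Yes")
ax.scatter([0, 1], [2, 3], color="C1", label="No")
ax.bar([0], [1], color="C0", alpha=0.4, label="Yes")
ax.bar([1], [2], color="C1", alpha=0.4, label="No")

handles, labels = ax.get_legend_handles_labels()

# Keep the first handle seen for each label; dicts preserve insertion order.
unique = {}
for handle, label in zip(handles, labels):
    unique.setdefault(label, handle)

legend = ax.legend(list(unique.values()), list(unique.keys()),
                   bbox_to_anchor=(1.05, 1), loc="upper left", borderaxespad=0.0)
legend_labels = [t.get_text() for t in legend.get_texts()]
```

This keeps whichever layer was drawn first for each label, so draw the layer whose legend symbol you want first.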
I want to add that if you use subplots, the legend handling can be a bit more problematic. The code above (by @Sergey Antopolskiy and @Ffisegydd), which gives a very nice figure by the way, does not relocate the legend in a subplot: the original legend stubbornly keeps appearing. Here is the code above adapted to subplots:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
tips = sns.load_dataset("tips")
fig, axes = plt.subplots(2, 2)
sns.stripplot(x="day", y="total_bill", hue="smoker",
              data=tips, jitter=True, palette="Set2",
              dodge=True, linewidth=1, edgecolor='gray', ax=axes[0, 0])
ax = sns.boxplot(x="day", y="total_bill", hue="smoker",
                 data=tips, palette="Set2", fliersize=0, ax=axes[0, 0])
handles, labels = ax.get_legend_handles_labels()
l = plt.legend(handles[0:2], labels[0:2], bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
The original legend remains. In order to erase it, you can add this line:
axes[0,0].legend(handles[:0], labels[:0])
Edit: in recent versions of seaborn (>0.9.0), this leaves a small white box in the corner, as pointed out in the comments. To solve it, remove the legend from the axes:
axes[0,0].get_legend().remove()
