Marker style of legend and the graph are not matching? - python

Below is the code I am using to generate the plot. The issue is that the marker style in the legend differs from the marker style in the graph.
sns.set_style(rc={'boxplot.flierprops.markeredgecolor':'black' ,'boxplot.flierprops.markeredgewidth':1.25,'boxplot.flierprops.markerfacecolor':'white'})
fig, scatter = plt.subplots(figsize = (6,4), dpi = 100)
scatter = sns.lineplot(data=df_whole,x='shortest_distance',y='similarity',style ='Metric',hue='Metric'
,markers=True,lw=1,markeredgewidth=1.25,markeredgecolor='black',markersize=7,dashes= False,errorbar=None,markerfacecolor='white')
scatter.set(title='TF-IDF')
scatter.legend(title = "Similarity Methods",prop={'size': 12})

As seaborn uses complex combinations of matplotlib elements to create its plots, and tries to make the legend as compact as possible, the legend is often custom-made. As such, seaborn unfortunately does not always take into account all matplotlib-level parameters.
In this case, the problem can be worked around via assigning these parameters again to the legend handles. Here is an example using one of seaborn's test datasets:
import matplotlib.pyplot as plt
import seaborn as sns
flights = sns.load_dataset('flights')
markerprops = dict(markeredgewidth=1.25, markeredgecolor='black', markersize=7, markerfacecolor='none')
ax = sns.lineplot(data=flights, x='year', y='passengers', style='month', hue='month',
                  markers=True, lw=1, dashes=False, errorbar=None, **markerprops)
ax.set(title='TF-IDF')
handles, labels = ax.get_legend_handles_labels()
# The legend handles did not pick up the marker settings, so apply them again
for h in handles:
    h.set(**markerprops)
ax.legend(handles=handles, title="Months", prop={'size': 12}, ncol=3)
plt.tight_layout()
plt.show()
PS: Matplotlib functions usually return the graphical elements they created (e.g. scatter dots or lines), while seaborn (and pandas) usually returns the subplot (ax) or grid of subplots. As such, giving the name scatter to the return value of sns.lineplot might be confusing when comparing code with other matplotlib and seaborn examples.

Related

How to use seaborn kdeplot legend from one subplots axis for whole figure legend?

I'm using seaborn to plot kdeplots on axes of subplots, and I want to have one global figure legend instead of one legend on each subplot.
However, the axes I pass to sns.kdeplot or get from sns.kdeplot seem to have empty lists of handles and labels when I use get_legend_handles_labels() to get them.
I cannot rely on the data to create a legend from scratch, because I have no guarantee that the colors will actually match.
All these issues can be seen on the following example:
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
# Two numerical data columns, plus one for the hue
data = pd.DataFrame(
{"A": np.random.random(20),
"B": np.random.random(20),
"C": ["c", "C"] * 10
})
# figure with 2 subplots
(fig, [ax1, ax2]) = plt.subplots(2, 1)
# passing ax1 to sns.kdeplot
sns.kdeplot(data=data, x="A", hue="C", ax=ax1)
# passing and getting ax2
ax2 = sns.kdeplot(data=data, x="B", hue="C", ax=ax2)
# Extracting handles and labels from the axes legends
# (Those are all empty lists. Where is the info I need?)
(handles_1, labels_1) = ax1.get_legend_handles_labels()
(handles_2, labels_2) = ax2.get_legend_handles_labels()
# Setting a global legend based on the data
fig.legend(labels=data["C"].unique(), loc="upper right")
# Setting another global legend
# based on extracted handles and labels
fig.legend(handles=handles_1, labels=labels_1, loc="center right")
# Displaying debugging info in the title
plt.suptitle(f"ax1 legend: {len(labels_1)}, ax2 legend: {len(labels_2)} labels")
plt.savefig("legend_issues.png")
This results in the following figure:
Is there a bug or do I look at the wrong place for legend handles and labels?

Seaborn pairplot legend doesn't show colors and labels

I'm using seaborn 0.11.2, but I have trouble getting the legend of the seaborn pairplot to show.
Here is the code. Everything works fine except for the legend:
for x in x1_categorical:
    plt.figure()
    sns.pairplot(data=x1[[x,'weight']],hue=x, palette='husl', height=4, aspect=4)
    plt.title(x)
I can see neither colors nor labels. I have already tried what is suggested here: Seaborn Pairplot Legend Not Showing Colors
I have no clue, thanks in advance!
If I understand it correctly, x1_categorical contains categorical column names. Taking seaborn's penguin dataset as an example, the current code would look like:
from matplotlib import pyplot as plt
import seaborn as sns
x1 = sns.load_dataset('penguins')
x1_categorical = ['species', 'island', 'sex']
for x in x1_categorical:
    g = sns.pairplot(data=x1[[x, 'body_mass_g']], hue=x, palette='husl', height=4, aspect=3)
    plt.title(x)
    plt.tight_layout()
When I try this (seaborn 0.11.2), I get plots such as:
These seem to be kdeplots for the numerical column, using the categorical column as hue. Unfortunately, the legends are empty, even after calling plt.legend().
An alternative is to explicitly create the kdeplots, for example:
from matplotlib import pyplot as plt
import seaborn as sns
x1 = sns.load_dataset('penguins')
x1_categorical = ['species', 'island', 'sex']
fig, axs = plt.subplots(ncols=1, nrows=len(x1_categorical), figsize=(12, 4*len(x1_categorical)))
for ax, x in zip(axs, x1_categorical):
    sns.kdeplot(data=x1, x='body_mass_g', hue=x, palette='husl', fill=True, common_norm=False, ax=ax)
sns.despine()
The example code creates one large figure, but if needed, separate plots could be created as well.
Another option is to pass common_norm=True, multiple='stack' to kdeplot, which stacks the densities of the hue levels on top of each other.

Separating violinplots in seaborn with a line

I'm trying to plot multi-hue distributions with Seaborn, but I find it difficult to trace the plots back to the tick they belong to. I have tried adding a grid, but the grid lines only appear along the distribution axis, so they divide each distribution itself rather than separating the distributions from each other. Is it possible to have Seaborn add a grid line between different violin plot groups/hues? To illustrate, take one of the plots from the docs. I've added what I'd like to see to this plot (I've made the separators quite heavy for illustration purposes; in the solution I'd like them to be just as thick as the grid lines):
You could use matplotlib's axvline to draw vertical lines at positions 0.5, 1.5, ...
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
sns.set(style="whitegrid")
tips = sns.load_dataset("tips")
ax = sns.violinplot(x="day", y="total_bill", hue="smoker",
                    data=tips, palette="muted")
# Draw a separator halfway between each pair of neighbouring categories
for i in range(len(np.unique(tips['day'])) - 1):
    ax.axvline(i + 0.5, color='grey', lw=1)
plt.show()
Alternatively, you could set minor ticks at these positions and turn the minor gridlines on for the x-axis.
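The minor-tick variant might look like the sketch below. To stay self-contained it uses matplotlib's own violinplot on random data; the group names and values are placeholders. The same set_xticks and grid calls should apply to the ax returned by sns.violinplot, since seaborn also places the categories at positions 0, 1, 2, ...

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: one random sample per category
rng = np.random.default_rng(0)
groups = ["Thur", "Fri", "Sat", "Sun"]
samples = [rng.normal(20, 5, 50) for _ in groups]

fig, ax = plt.subplots()
ax.violinplot(samples, positions=range(len(groups)))
ax.set_xticks(range(len(groups)))
ax.set_xticklabels(groups)

# Minor ticks halfway between neighbouring categories
minor_positions = np.arange(len(groups) - 1) + 0.5
ax.set_xticks(minor_positions, minor=True)

# Show grid lines only for the minor x ticks and hide the tick marks themselves
ax.grid(True, which="minor", axis="x", color="grey", lw=1)
ax.tick_params(axis="x", which="minor", length=0)
```

Compared with axvline, the separators are then part of the grid, so they follow the grid styling and draw order.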

How to subplot seaborn catplot (kind='count') on-top of catplot (kind='violin') with sharex=True

So far I have tried the following code:
# Import to handle plotting
import seaborn as sns
# Import pyplot, figures inline, set style, plot pairplot
import matplotlib.pyplot as plt
# Make the figure space
fig = plt.figure(figsize=(2,4))
gs = fig.add_gridspec(2, 4)
ax1 = fig.add_subplot(gs[0, :])
ax2 = fig.add_subplot(gs[1, :])
# Load the example tips dataset
tips = sns.load_dataset("tips")
# Plot the frequency counts grouped by time
sns.catplot(x='sex', hue='smoker',
kind='count',
col='time',
data=tips,
ax=ax1)
# View the data
sns.catplot(x='sex', y='total_bill', hue='smoker',
kind='violin',
col='time',
split='True',
cut=0,
bw=0.25,
scale='area',
scale_hue=False,
inner='quartile',
data=tips,
ax=ax2)
plt.close(2)
plt.close(3)
plt.show()
This seems to stack the categorical plots, of each kind respectively, on top of each other.
What I want are the resulting plots of the following code in a single figure with the countplot in row one and the violin plot in row two.
# Import to handle plotting
import seaborn as sns
# Import pyplot, figures inline, set style, plot pairplot
import matplotlib.pyplot as plt
# Load the example tips dataset
tips = sns.load_dataset("tips")
# Plot the frequency counts grouped by time
sns.catplot(x='sex', hue='smoker',
kind='count',
col='time',
data=tips)
# View the data
sns.catplot(x='sex', y='total_bill', hue='smoker',
kind='violin',
col='time',
split='True',
cut=0,
bw=0.25,
scale='area',
scale_hue=False,
inner='quartile',
data=tips)
The actual categorical countplot that I would like to span row one of a figure that also contains a categorical violin plot (Ref. Image 3):
The actual categorical violin plot that I would like to span row two of a figure that also contains a categorical countplot (Ref. Image 2):
I tried the following code which forced the plots to be in the same figure. The downside is that the children of the figure/axes did not transfer, i.e. axis-labels, legend, and grid lines. I feel pretty close with this hack but need another push or source for inspiration. Also, I'm no longer able to close the old/unwanted figures.
# Import to handle plotting
import seaborn as sns
# Import pyplot, figures inline, set style, plot pairplot
import matplotlib.pyplot as plt
# Set some style
sns.set_style("whitegrid")
# Load the example tips dataset
tips = sns.load_dataset("tips")
# Plot the frequency counts grouped by time
a = sns.catplot(x='sex', hue='smoker',
kind='count',
col='time',
data=tips)
numSubs_A = len(a.col_names)
for i in range(numSubs_A):
    for p in a.facet_axis(0,i).patches:
        a.facet_axis(0,i).annotate(str(p.get_height()), (p.get_x()+0.15, p.get_height()+0.1))
# View the data
b = sns.catplot(x='sex', y='total_bill', hue='smoker',
kind='violin',
col='time',
split='True',
cut=0,
bw=0.25,
scale='area',
scale_hue=False,
inner='quartile',
data=tips)
numSubs_B = len(b.col_names)
# Subplots migration
f = plt.figure()
for i in range(numSubs_A):
    f._axstack.add(f._make_key(a.facet_axis(0,i)), a.facet_axis(0,i))
for i in range(numSubs_B):
    f._axstack.add(f._make_key(b.facet_axis(0,i)), b.facet_axis(0,i))
# Subplots size adjustment
f.axes[0].set_position([0,1,1,1])
f.axes[1].set_position([1,1,1,1])
f.axes[2].set_position([0,0,1,1])
f.axes[3].set_position([1,0,1,1])
In general, it is not possible to combine the output of several seaborn figure-level functions into a single figure (see this question and this issue). I once wrote a hack to externally combine such figures, but it has several drawbacks. Feel free to use it if it works for you.
But in general, consider creating the plot you want manually. In this case it could look like this:
import seaborn as sns
import matplotlib.pyplot as plt
sns.set()
fig, axes = plt.subplots(2,2, figsize=(8,6), sharey="row", sharex="col")
tips = sns.load_dataset("tips")
order = tips["sex"].unique()
hue_order = tips["smoker"].unique()
for i, (n, grp) in enumerate(tips.groupby("time")):
    sns.countplot(x="sex", hue="smoker", data=grp,
                  order=order, hue_order=hue_order, ax=axes[0,i])
    sns.violinplot(x='sex', y='total_bill', hue='smoker', data=grp,
                   order=order, hue_order=hue_order,
                   split=True, cut=0, bw=0.25,
                   scale='area', scale_hue=False, inner='quartile',
                   ax=axes[1,i])
    axes[0,i].set_title(f"time = {n}")
axes[0,0].get_legend().remove()
axes[1,0].get_legend().remove()
axes[1,1].get_legend().remove()
plt.show()
seaborn.catplot does not accept an "ax" argument, hence the problem with your first code.
It appears that some hacking is needed to accomplish the x-sharing you aim for:
How to plot multiple Seaborn Jointplot in Subplot
As such, you could save the time and effort, and just manually stack the two figures from your second code.

Two seaborn distplots one same axis

I am trying to figure a nice way to plot two distplots (from seaborn) on the same axis. It is not coming out as pretty as I want since the histogram bars are covering each other. And I don't want to use countplot or barplot simply because they don't look as pretty. Naturally if there is no other way I shall do it in that fashion, but distplot looks very good. But, as said, the bars are now covering each other (see pic).
Thus is there any way to fit two distplot frequency bars onto one bin so that they do not overlap? Or placing the counts on top of each other? Basically I want to do this in seaborn:
Any ideas to clean it up are most welcome. Thanks.
MWE:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import rc
sns.set_context("paper",font_scale=2)
sns.set_style("white")
rc('text', usetex=False)
fig, ax = plt.subplots(figsize=(7,7),sharey=True)
sns.despine(left=True)
mats=dict()
mats[0]=[1,1,1,1,1,2,3,3,2,3,3,3,3,3]
mats[1]=[3,3,3,3,3,4,4,4,5,6,1,1,2,3,4,5,5,5]
N=max(max(set(mats[0])),max(set(mats[1])))
binsize = np.arange(0,N+1,1)
B=['Thing1','Thing2']
for i in range(len(B)):
    ax = sns.distplot(mats[i],
                      kde=False,
                      label=B[i],
                      bins=binsize)
ax.set_xlabel('My label')
ax.get_yaxis().set_visible(False)
ax.legend()
plt.show()
As @mwaskom has said, seaborn wraps matplotlib plotting functions (for the most part) to deliver more complex and nicer-looking charts.
What you are looking for is "simple enough" to do with matplotlib directly:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context("paper", font_scale=2)
sns.set_style("white")
plt.rc('text', usetex=False)
fig, ax = plt.subplots(figsize=(4,4))
sns.despine(left=True)
# mats=dict()
mats0=[1,1,1,1,1,2,3,3,2,3,3,3,3,3]
mats1=[3,3,3,3,3,4,4,4,5,6,1,1,2,3,4,5,5,5]
N=max(mats0 + mats1)
# binsize = np.arange(0,N+1,1)
binsize = N
B=['Thing1','Thing2']
ax.hist([mats0, mats1], binsize, histtype='bar',
        align='mid', label=B, alpha=0.4)#, rwidth=0.6)
ax.set_xlabel('My label')
ax.get_yaxis().set_visible(False)
# ax.set_xlim(0,N+1)
ax.legend()
plt.show()
Which yields:
You can uncomment ax.set_xlim(0,N+1) to give more space around this histogram.
