Edit legend title and labels of Seaborn scatterplot and countplot - python

I am using seaborn's scatterplot and countplot on the titanic dataset.
Here is my code to draw the count plot. I also tried to edit the legend labels.
ax = seaborn.countplot(x='class', hue='who', data=titanic)
legend_handles, _ = ax.get_legend_handles_labels()
plt.show()
To edit the legend labels, I did this. In this case, the legend title is gone. How can I rename the title from 'who' to 'who1'?
ax = seaborn.countplot(x='class', hue='who', data=titanic)
legend_handles, _= ax.get_legend_handles_labels()
ax.legend(legend_handles, ['man1','woman1','child1'], bbox_to_anchor=(1,1))
plt.show()
I used the same method to edit the legend labels on the scatter plot, and the result is different. It uses 'dead' as the legend title and 'survived' as the first legend label.
ax = seaborn.scatterplot(x='age', y='fare', data=titanic, hue = 'survived')
legend_handles, _= ax.get_legend_handles_labels()
ax.legend(legend_handles, ['dead', 'survived'],bbox_to_anchor=(1.26,1))
plt.show()
Is there a parameter to delete or add a legend title?
I used the same code on two different plots and the resulting legends differ. Why is that?

Try using
ax.legend(legend_handles, ['man1','woman1','child1'],
bbox_to_anchor=(1,1),
title='whatever title you want to use')

With seaborn v0.11.2 or later, use the move_legend() function.
From the FAQs page:
On older versions, a common pattern was to call ax.legend(loc=...) after plotting. While this appears to move the legend, it actually replaces it with a new one, using any labeled artists that happen to be attached to the axes. This does not consistently work across plot types. And it does not propagate the legend title or positioning tweaks that are used to format a multi-variable legend.
The move_legend() function is actually more powerful than its name suggests, and it can also be used to modify other legend parameters (font size, handle length, etc.) after plotting.
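As a minimal sketch of this approach (assuming seaborn 0.11.2 or later; the small DataFrame is made up for illustration and only stands in for the titanic columns), move_legend() can reposition the legend and set a new title in one call, and the entry texts can then be edited in place:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up stand-in for the titanic columns used in the question (illustration only).
df = pd.DataFrame({
    "class": ["First", "First", "Second", "Third", "Third", "Third"],
    "who": ["man", "woman", "child", "man", "woman", "child"],
})

# hue_order fixes the order of the legend entries.
ax = sns.countplot(data=df, x="class", hue="who",
                   hue_order=["man", "woman", "child"])

# Move the legend outside the axes and give it a new title.
sns.move_legend(ax, "upper left", bbox_to_anchor=(1, 1), title="who1")

# Rename the entries by editing the legend's Text objects in place.
new_labels = ["man1", "woman1", "child1"]
for text, label in zip(ax.get_legend().get_texts(), new_labels):
    text.set_text(label)
```

Because the existing legend is moved rather than rebuilt from whatever labeled artists are on the axes, the same lines should work for scatterplot as well. Call plt.show() to display the figure.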

Why does the legend order sometimes differ?
You can force the order of the legend via hue_order=['man', 'woman', 'child']. By default, the order is either the order in which they appear in the dataframe (when the values are just strings), or the order imposed by pd.Categorical.
How to rename the legend entries
The surest way is to rename the column values, e.g.
titanic["who"] = titanic["who"].map({'man': 'Man1', 'woman': 'Woman1', 'child': 'Child1'})
If the entries of the column consist of numbers in the range 0, 1, ..., you can use pd.Categorical.from_codes(...). This also forces an order.
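A small sketch of what from_codes does, using a made-up series of 0/1 codes in place of the real 'survived' column:

```python
import pandas as pd

# Made-up integer codes, like the 0/1 values of a 'survived' column.
codes = pd.Series([0, 1, 1, 0])

# Code i is mapped to categories[i]; the category order is kept as given.
labels = pd.Categorical.from_codes(codes, categories=["Dead", "Survived"])

print(list(labels))             # ['Dead', 'Survived', 'Survived', 'Dead']
print(list(labels.categories))  # ['Dead', 'Survived']
```

Seaborn uses the category order for the legend, so 'Dead' is listed before 'Survived' regardless of which value appears first in the data.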
Specific colors for specific hue values
There are many options to specify the colors to be used (via palette=). To assign a specific color to a specific hue value, the palette can be a dictionary, e.g.
palette = {'Man1': 'cornflowerblue', 'Woman1': 'fuchsia', 'Child1': 'limegreen'}
Renaming or removing the legend title
sns.move_legend(ax, title=..., loc='best') sets a new title. Setting the title to an empty string removes it (this is useful when the entries are self-explaining).
A code example
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
titanic = sns.load_dataset('titanic')
# titanic['survived'] = titanic['survived'].map({0:'Dead', 1:'Survived'})
titanic['survived'] = pd.Categorical.from_codes(titanic['survived'], ['Dead', 'Survived'])
palette = {'Dead': 'navy', 'Survived': 'turquoise'}
ax = sns.scatterplot(data=titanic, x='age', y='fare', hue='survived', palette=palette)
sns.move_legend(ax, title='', loc='best') # remove the title
plt.show()

Related

Customizing legend in Seaborn histplot subplots

I am trying to generate a figure with 4 subplots, each of which is a Seaborn histplot. The figure definition lines are:
fig,axes=plt.subplots(2,2,figsize=(6.3,7),sharex=True,sharey=True)
(ax1,ax2),(ax3,ax4)=axes
fig.subplots_adjust(wspace=0.1,hspace=0.2)
I would like to define strings for legend entries in each of the subplots. As an example, I am using the following code for the first subplot:
sp1=sns.histplot(df_dn,x="ktau",hue="statind",element="step", stat="density",common_norm=True,fill=False,palette=colvec,ax=ax1)
ax1.set_title(r'$d_n$')
ax1.set_xlabel(r'max($F_{a,max}$)')
ax1.set_ylabel(r'$\tau_{ken}$')
legend_labels,_=ax1.get_legend_handles_labels()
ax1.legend(legend_labels,['dep-','ind-','ind+','dep+'],title='Stat.ind.')
The legend is not showing correctly: the legend entries are not plotted and the legend title is the name of the hue variable ("statind"). Please note I have successfully used the same code for other figures in which I used Seaborn relplots instead of histplots.
The main problem is that ax1.get_legend_handles_labels() returns empty lists, at least with the current (0.11.1) version of seaborn's histplot(). (Note that the first return value holds the handles; the second holds the labels.)
To get the handles, you can do legend = ax1.get_legend(); handles = legend.legendHandles (the attribute is named legend_handles in matplotlib 3.7 and later).
To recreate the legend, first the existing legend needs to be removed. Then, the new legend can be created starting from some handles.
Also note that to be sure of the order of the labels, it helps to set hue_order. Here is some example code to show the ideas:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
df_dn = pd.DataFrame({'ktau': np.random.randn(4000).cumsum(),
'statind': np.repeat([*'abcd'], 1000)})
fig, ax1 = plt.subplots()
sp1 = sns.histplot(df_dn, x="ktau", hue="statind", hue_order=['a', 'b', 'c', 'd'],
element="step", stat="density", common_norm=True, fill=False, ax=ax1)
ax1.set_title(r'$d_n$')
ax1.set_xlabel(r'max($F_{a,max}$)')
ax1.set_ylabel(r'$\tau_{ken}$')
legend = ax1.get_legend()
handles = legend.legendHandles  # legend.legend_handles in matplotlib 3.7 and later
legend.remove()
ax1.legend(handles, ['dep-', 'ind-', 'ind+', 'dep+'], title='Stat.ind.')
plt.show()

Overriding Seaborn legend

I made a line plot using seaborn's relplot and I wanted to customize my legend labels. For some reason, when I do this it creates another legend without deleting the old one. How do I get rid of the initial legend (the legend with title "Sex")? Also, how do I add a legend title to my new legend?
Here is the code I used to generate my plot:
plt.figure(figsize=(12,10))
sns.relplot(x='Year',y = 'cancer/100k pop' , data = dataset_sex,hue="Sex", kind="line",ci=None)
title_string = "Trend of Cancer incidencies by Sex "
plt.xlabel('Years')
plt.title(title_string)
plt.legend(['Men','Women'])
relplot is a figure-level function, and returns a FacetGrid. You can remove its legend via g.legend.remove().
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset("tips")
g = sns.relplot(data=tips, x="total_bill", y="tip", hue="day")
g.legend.remove()
plt.legend(['Jeudi', 'Vendredi', 'Samedi', 'Dimanche'])
plt.show()
This code has been tested with seaborn 0.11. Possibly you'll need to upgrade. To add a title to the legend: plt.legend([...], title='New title').
Note that plt.legend(...) will create the legend inside the last (or only) subplot. If you prefer the figure-level legend next to the plot, to change the legend labels, you can call g.add_legend(labels=[...], title='new title') after having removed the old legend.
PS: Adding legend=False to sns.relplot() will not create the legend entries. You would then need to recreate both the legend markers and their labels, having lost the information about which colors were used.

remove legend handles and labels completely

Some plotting tools such as seaborn automatically label plots and add a legend. Removing the legend is fairly easy with ax.get_legend().remove(), but the handles and labels are still stored somewhere in the axes object. Thus when adding another line or other plot type, for which I want to have a legend, the "old" legend handles and labels are also displayed. In other words: ax.get_legend_handles_labels() still returns the labels and handles introduced by f.i. seaborn.
Is there any way to completely remove the handles and labels to be used for a legend from an axis?
Such that ax.get_legend_handles_labels() returns ([], [])?
I know that I can set sns.lineplot(..., legend=False) in seaborn. I am just using seaborn to produce a nice example. The question is how to remove existing legend labels and handles in general.
Here is a minimum working example:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame(data={
'x': [0, 1, 0, 1],
'y': [7.2, 10., 11.2, 3.],
'id': [4, 4, 7, 7]
})
# get mean per x location:
grpd = df.groupby('x').mean()
ax = sns.lineplot(data=df, x='x', y='y', hue='id', palette=['b', 'g'])
ax.get_legend().remove()
# this still returns the handles and labels:
print(ax.get_legend_handles_labels())
# Out:
# ([<matplotlib.lines.Line2D at 0x17c1c208f88>,
# <matplotlib.lines.Line2D at 0x17c1c1f8888>,
# <matplotlib.lines.Line2D at 0x17c1c20af88>],
# ['id', '4', '7'])
# plot some other lines, for which I want to have a legend:
ax.plot(grpd.index, grpd['y'], label='mean')
# add legend, ONLY FOR MEAN, if possible:
ax.legend()
I tried manipulating the axes ax.get_legend() and ax.legend_ objects, f.i. with ax.legend_.legendHandles = [], setting the labels, texts and everything else, but the "old" seaborn entries keep reappearing each time.
Since this is meant to be used within a module, it is not feasible to explicitly set a legend containing only the "mean" entry. I need to "clear" everything happening in the backend to provide a clean API to the user.
Change:
ax = sns.lineplot(data=df, x='x', y='y', hue='id', palette=['b', 'g'])
to:
ax = sns.lineplot(data=df, x='x', y='y', hue='id', palette=['b', 'g'], legend=False)
That way you also don't need to make the call to remove the legend as none is created in the first place.
EDIT UPDATE:
If you want to clear the legend handles and label, I think you'll have to do something like:
for line in ax.lines:  # put this before you call the 'mean' plot function
    line.set_label(s='')
This will loop over the lines drawn on the plot and set their labels to be empty. That way the legend doesn't consider them, and print(ax.get_legend_handles_labels()) will return empty lists.
Thanks to mullinscr for the answer covering line plots. To remove legend labels in general, the labels of lines, collections (f.i. scatter plots), patches (boxplots etc.) and images (imshow etc.) must be cleared:
for _artist in [*ax.lines, *ax.collections, *ax.patches, *ax.images]:
    _artist.set_label(s=None)
Please tell me, if I forgot to mention a specific kind of plot.
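To check the idea without seaborn, here is a minimal matplotlib-only sketch (the plotted values are arbitrary). After the labels are cleared, only the line added afterwards shows up in the legend:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [7.2, 10.0], label="4")
ax.plot([0, 1], [11.2, 3.0], label="7")
ax.scatter([0, 1], [7.2, 3.0], label="points")

# Clear the label of each kind of artist that was labeled above.
for artist in [*ax.lines, *ax.collections, *ax.patches, *ax.images]:
    artist.set_label("")

# Artists added afterwards keep their labels.
ax.plot([0, 1], [9.2, 6.5], label="mean")
ax.legend()

handles, labels = ax.get_legend_handles_labels()
print(labels)  # ['mean']
```

Bar plots store their label on a container (ax.containers) rather than on the individual patches, so those may need the same treatment.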
Another option is to do the following:
ax.legend(handles=[], labels= [])
In this case, everything will be removed.
If for example you would like to only remove the first handle, you can do the following:
fetch the handles and labels of your plot:
h, l = ax.get_legend_handles_labels()
select the ones you want (here all except the first):
ax.legend(handles=h[1:], labels=l[1:])
This way, you can sort out the things you want.

Changing pointplot legend in seaborn

I would like to change the label for the legend and items in the legend for this plot. Right now the label for the legend is "Heart" and the items are 0 and 1. I would like to be able to change all of these to something else, but am unsure how. Here is what I have so far.
sns.set_context("talk",font_scale=3)
ax =sns.pointplot(x="Heart", y="FirstPersonPronouns", hue="Speech", data=df)
ax.set(xlabel='Condition', ylabel='First Person Pronouns')
ax.set(xticklabels=["Control", "Heart"])
Any help would be appreciated! Also, I'm assuming this is a set parameter that I don't know about, is there a comprehensive list of these? I can't seem to find one in the documentation.
An alternative to changing the column names of the data frame is to create a new legend using the same legend handles (these determine the colored markers), but with new text labels:
import seaborn as sns
tips = sns.load_dataset('tips')
ax = sns.pointplot(x='sex', y='total_bill', hue='time', data=tips)
leg_handles = ax.get_legend_handles_labels()[0]
ax.legend(leg_handles, ['Blue', 'Orange'], title='New legend')

How to express classes on the axis of a heatmap in Seaborn

I created a very simple heatmap chart with Seaborn displaying a similarity square matrix. Here is the one line of code I used:
sns.heatmap(sim_mat, linewidths=0, square=True, robust=True)
plt.show()  # sns.plt is not available in recent seaborn; use matplotlib.pyplot as plt
and this is the output I get:
What I'd like to do is show on the x and y axes not the labels of my instances but a colored indicator (imagine something like a small palplot on each axis), where each color represents another variable associated with each instance (say this information is stored in a list named labels), plus another legend for this information next to the one specifying the colors of the heatmap (like the one for lmplot). It is important that the two kinds of information use different color palettes.
Is this possible in Seaborn?
UPDATE
What I am looking for is a clustermap as correctly suggested.
sns.clustermap(sim_mat, row_colors=label_cols, col_colors=label_cols,
               row_cluster=False, col_cluster=False)
Here is what I am getting btw, the dots and lines are too small and I do not see a way to enlarge them in the documentation. I'd like to
Plus, how can I add a legend and put the two one next to the other in the same position?
There are two options:
First, heatmap is an Axes-level function, so you could set up a large main heatmap axes for the correlation matrix and flank it with heatmaps to which you pass the class colors yourself. This will be a little bit of work, but it gives you lots of control over how everything works.
This is more or less an option in clustermap though, so I'm going to demonstrate how to do it that way here. It's a bit of a hack, but it will work.
First, we'll load the sample data and do a bit of roundabout transformations to get colors for the class labels.
networks = sns.load_dataset("brain_networks", index_col=0, header=[0, 1, 2])
network_labels = networks.columns.get_level_values("network")
network_pal = sns.cubehelix_palette(network_labels.unique().size,
light=.9, dark=.1, reverse=True,
start=1, rot=-2)
network_lut = dict(zip(map(str, network_labels.unique()), network_pal))
network_colors = pd.Series(network_labels, index=networks.columns).map(network_lut)
Next we call clustermap to make the main plot.
g = sns.clustermap(networks.corr(),
# Turn off the clustering
row_cluster=False, col_cluster=False,
# Add colored class labels
row_colors=network_colors, col_colors=network_colors,
# Make the plot look better when many rows/cols
linewidths=0, xticklabels=False, yticklabels=False)
The side colors are drawn with a heatmap, which matplotlib thinks of as quantitative data and thus there's not a straightforward way to get a legend directly from it. Instead of that, we'll add an invisible barplot with the right colors and labels, then add a legend for that.
for label in network_labels.unique():
    g.ax_col_dendrogram.bar(0, 0, color=network_lut[label],
                            label=label, linewidth=0)
g.ax_col_dendrogram.legend(loc="center", ncol=6)
Finally, let's move the colorbar to take up the empty space where the row dendrogram would normally be and save the figure.
g.cax.set_position([.15, .2, .03, .45])
g.savefig("clustermap.png")
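The invisible-bar trick can be tried on its own with plain matplotlib. In this sketch the class_lut dictionary is a hypothetical stand-in for network_lut, and a random image stands in for the heatmap:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical class-to-color lookup, standing in for network_lut.
class_lut = {"A": "tab:blue", "B": "tab:orange", "C": "tab:green"}

fig, ax = plt.subplots()
ax.imshow(np.random.default_rng(0).random((6, 6)), cmap="viridis")

# Zero-height bars draw nothing visible but still register legend entries.
for label, color in class_lut.items():
    ax.bar(0, 0, color=color, label=label, linewidth=0)

legend = ax.legend(title="Class", loc="upper left", bbox_to_anchor=(1, 1))
```

The legend shows one colored patch per class, even though the image itself carries no categorical information that matplotlib could turn into legend entries.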
Building on the above answer, I think it's worth noting the possibility of multiple colour levels for labels - as noted in the clustermap docs ({row,col}_colors). I couldn't find an example of multiple levels, so I thought I'd share an example here.
networks = sns.load_dataset("brain_networks", index_col=0, header=[0, 1, 2])
# network level
network_labels = networks.columns.get_level_values("network")
network_pal = sns.cubehelix_palette(network_labels.unique().size, light=.9, dark=.1, reverse=True, start=1, rot=-2)
network_lut = dict(zip(map(str, network_labels.unique()), network_pal))
# Create index using the columns for networks
network_colors = pd.Series(network_labels, index=networks.columns).map(network_lut)
# node level
node_labels = networks.columns.get_level_values("node")
node_pal = sns.cubehelix_palette(node_labels.unique().size)
node_lut = dict(zip(map(str, node_labels.unique()), node_pal))
# Create index using the columns for nodes
node_colors = pd.Series(node_labels, index=networks.columns).map(node_lut)
# Create dataframe for row and column color levels
network_node_colors = pd.DataFrame(network_colors).join(pd.DataFrame(node_colors))
# create clustermap
g = sns.clustermap(networks.corr(),
# Turn off the clustering
row_cluster=False, col_cluster=False,
# Add colored class labels using data frame created from node and network colors
row_colors = network_node_colors,
col_colors = network_node_colors,
# Make the plot look better when many rows/cols
linewidths=0,
xticklabels=False, yticklabels=False,
center=0, cmap="vlag")
# create two legends - one for each level by creating invisible column and row barplots (as per above)
# network legend
from matplotlib.pyplot import gcf
for label in network_labels.unique():
    g.ax_col_dendrogram.bar(0, 0, color=network_lut[label], label=label, linewidth=0)
l1 = g.ax_col_dendrogram.legend(title='Network', loc="center", ncol=5, bbox_to_anchor=(0.47, 0.8), bbox_transform=gcf().transFigure)
# node legend
for label in node_labels.unique():
    g.ax_row_dendrogram.bar(0, 0, color=node_lut[label], label=label, linewidth=0)
l2 = g.ax_row_dendrogram.legend(title='Node', loc="center", ncol=2, bbox_to_anchor=(0.8, 0.8), bbox_transform=gcf().transFigure)
plt.show()
When both dendrograms are used one can also add a new hidden axis and draw the legend.
# Assumption: f is the figure that holds the clustermap, e.g. f = g.fig
ax = f.add_axes((0, 0, 0, 0))
ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
for label in node_labels.unique():
    ax.bar(0, 0, color=node_lut[label], label=label, linewidth=0)
# Build the legend from the hidden axes, since that is where the labeled bars are
l2 = ax.legend(title='Node', loc="center", ncol=2, bbox_to_anchor=(0.8, 0.8), bbox_transform=f.transFigure)
