Remove one of the two legends produced in this Seaborn figure? - python

I have just started using seaborn to produce my figures. However I can't seem to remove one of the legends produced here.
I am trying to plot two accuracies against each other and draw a line along the diagonal to make it easier to see which has performed better (if anyone has a better way of plotting this data in seaborn - let me know!). The legend I'd like to keep is the one on the left, that shows the different colours for 'N_bands' and different shapes for 'Subject No'
ax1 = sns.relplot(y='y',x='x',data=df,hue='N bands',legend='full',style='Subject No.',markers=['.','^','<','>','8','s','p','*','P','X','D','H','d']).set(ylim=(80,100),xlim=(80,100))
ax2 = sns.lineplot(x=range(80,110),y=range(80,110),legend='full')
I have tried setting the kwarg legend to 'full','brief' and False for both ax1 and ax2 (together and separately) and it only seems to remove the one on the left, or both.
I have also tried to remove the legends using matplotlib:
ax1.ax.legend_.remove()
ax2.legend_.remove()
But this results in the same behaviour (the left legend disappearing).
UPDATE: Here is a minimal example you can run yourself:
test_data = np.array([[1.,2.,100.,9.],[2.,1.,100.,8.],[3.,4.,200.,7.]])
test_df = pd.DataFrame(columns=['x','y','p','q'], data=test_data)
sns.set_context("paper")
ax1=sns.relplot(y='y',x='x',data=test_df,hue='p',style='q',markers=['.','^','<','>','8'],legend='full').set(ylim=(0,4),xlim=(0,4))
ax2=sns.lineplot(x=range(0,5),y=range(0,5),legend='full')
This doesn't reproduce the problem perfectly, since the right legend is coloured here (I don't know how to reproduce it exactly; does the way my dataframe was created make a difference?). But the essence of the problem remains: how do I remove the legend on the right but keep the one on the left?

You're plotting a lineplot in the (only) axes of a FacetGrid produced via relplot. That's quite unconventional, so strange things might happen.
One option to remove the legend of the FacetGrid but keeping the one from the lineplot would be
g._legend.remove()
Full code (where I also corrected the confusing naming of grids and axes):
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
test_data = np.array([[1.,2.,100.,9.],[2.,1.,100.,8.],[3.,4.,200.,7.]])
test_df = pd.DataFrame(columns=['x','y','p','q'], data=test_data)
sns.set_context("paper")
g=sns.relplot(y='y',x='x',data=test_df,hue='p',style='q',markers=['.','^','<','>','8'], legend='full')
sns.lineplot(x=range(0,5),y=range(0,5),legend='full', ax=g.axes[0,0])
g._legend.remove()
plt.show()
Note that this is kind of a hack, and it might break in future seaborn versions.
The other option is to not use a FacetGrid here, but just plot a scatter and a line plot in one axes,
ax1 = sns.scatterplot(y='y',x='x',data=test_df,hue='p',style='q',
markers=['.','^','<','>','8'], legend='full')
sns.lineplot(x=range(0,5),y=range(0,5), legend='full', ax=ax1)
plt.show()
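If the goal is to keep only the hue/style legend from the scatter plot, one possible variation on the single-axes approach is to draw the diagonal with plain matplotlib and give it no label, so it never gets a legend entry. This is only a sketch using the test data from the question; the marker list, line colour, and line style are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

test_data = np.array([[1., 2., 100., 9.], [2., 1., 100., 8.], [3., 4., 200., 7.]])
test_df = pd.DataFrame(test_data, columns=['x', 'y', 'p', 'q'])

fig, ax = plt.subplots()

# The scatter plot creates the one legend we want to keep (hue 'p', style 'q').
sns.scatterplot(data=test_df, x='x', y='y', hue='p', style='q',
                markers=['o', '^', 's'], legend='full', ax=ax)

# A plain matplotlib line without a label adds nothing to the legend.
ax.plot([0, 4], [0, 4], color='gray', linestyle='--')
ax.set(xlim=(0, 4), ylim=(0, 4))

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
# Call plt.show() or fig.savefig(...) to view the result.
```

Because the diagonal is drawn on the same Axes as the scatter plot, there is only one legend to manage and nothing has to be removed afterwards.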

Related

Two seaborn plots with different scales displayed on same plot but bars overlap

I am trying to include 2 seaborn countplots with different scales on the same plot but the bars display as different widths and overlap as shown below. Any idea how to get around this?
Setting dodge=False, doesn't work as the bars appear on top of each other.
The main problem with the approach in the question is that the first countplot doesn't take hue into account. The second countplot won't magically move the bars of the first. An additional categorical column could be added, only taking on the 'weekend' value. Note that the column should be explicitly made categorical with two values, even if only one value is really used.
Things can be simplified a lot by starting from the original dataframe, which supposedly already has a column 'is_weekend'. Creating the twinx ax beforehand makes it possible to write a loop (so the call to sns.countplot() is written only once, with parameters).
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
sns.set_style('dark')
# create some demo data
data = pd.DataFrame({'ride_hod': np.random.normal(13, 3, 1000).astype(int) % 24,
'is_weekend': np.random.choice(['weekday', 'weekend'], 1000, p=[5 / 7, 2 / 7])})
# now, make 'is_weekend' a categorical column (not just strings)
data['is_weekend'] = pd.Categorical(data['is_weekend'], ['weekday', 'weekend'])
fig, ax1 = plt.subplots(figsize=(16, 6))
ax2 = ax1.twinx()
for ax, category in zip((ax1, ax2), data['is_weekend'].cat.categories):
    sns.countplot(data=data[data['is_weekend'] == category], x='ride_hod', hue='is_weekend', palette='Blues', ax=ax)
    ax.set_ylabel(f'Count ({category})')
ax1.legend_.remove() # both axes got a legend, remove one
ax1.set_xlabel('Hour of Day')
plt.tight_layout()
plt.show()
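For comparison, the same side-by-side layout can be sketched with plain matplotlib bars, where the offset of each group is explicit. This is not the seaborn approach above, just an illustration of why the bars need to be shifted by hand on twin axes; the bar width, colours, and the seeded random demo data are assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Demo data shaped like the data in the answer above (seeded for repeatability).
rng = np.random.default_rng(0)
data = pd.DataFrame({
    'ride_hod': rng.normal(13, 3, 1000).astype(int) % 24,
    'is_weekend': rng.choice(['weekday', 'weekend'], 1000, p=[5 / 7, 2 / 7]),
})

# One row per hour, one column per category.
counts = pd.crosstab(data['ride_hod'], data['is_weekend'])
hours = counts.index.to_numpy()
width = 0.4

fig, ax1 = plt.subplots(figsize=(16, 6))
ax2 = ax1.twinx()

# Shift each group by half a bar width so the two sets sit next to each other.
ax1.bar(hours - width / 2, counts['weekday'], width=width, color='tab:blue', label='weekday')
ax2.bar(hours + width / 2, counts['weekend'], width=width, color='tab:orange', label='weekend')
ax1.set_ylabel('Count (weekday)')
ax2.set_ylabel('Count (weekend)')
ax1.set_xlabel('Hour of Day')

# Merge the handles of both axes into a single legend.
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
ax1.legend(handles1 + handles2, labels1 + labels2)
fig.tight_layout()
```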
Alternatively, set the x tick labels by hand with plt.xticks(['put the label by hand in your x label']).

Seaborn clustermap legend overlap with figure

Hi there,
I created a clustermap using seaborn. Because the legend overlaps with the figure, I'd like to move it. However, plt.legend(bbox_to_anchor=(1,1)) gave the following error: 'No handles with labels found to put in legend.'
That makes me wonder: what is the color scale from -20 to 20 on the top left that I want to re-position? Isn't that a legend?
Thank you in advance for shedding light on that for me.
import matplotlib.pyplot as plt
import seaborn as sns
g = sns.clustermap(data=df_highestPivot,cmap='coolwarm')
plt.legend(bbox_to_anchor=(1,1)) #This line generate the error
plt.savefig('plot.png',dpi=300,bbox_inches='tight')
plt.show()
plt.close()
The colorbar is not a legend per se (at least, it is not an object of type Legend). It is actually its own Axes, which you can access using g.ax_cbar.
If you want to move it, you can pass an argument cbar_pos= to clustermap(). However, it's complicated to find an empty space in the figure to place it. I would recommend making some room using subplots_adjust() and then moving the ax_cbar Axes to the desired location:
iris = sns.load_dataset('iris')
species = iris.pop("species")
g = sns.clustermap(iris)
g.fig.subplots_adjust(right=0.7)
g.ax_cbar.set_position((0.8, .2, .03, .4))
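For completeness, here is a minimal sketch of the cbar_pos= route mentioned above. It assumes a seaborn version that supports cbar_pos (it takes a (left, bottom, width, height) tuple in figure coordinates); the random demo data and the chosen position are only placeholders, and the colorbar may still overlap other parts of the figure unless room is made for it.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Small random matrix standing in for the real pivot table.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(10, 4)), columns=list('abcd'))

# Place the colorbar Axes directly when the clustermap is created.
g = sns.clustermap(df, cmap='coolwarm', cbar_pos=(0.85, 0.2, 0.03, 0.4))

# The colorbar is an ordinary Axes, so its position can be inspected or changed later.
cbar_position = g.ax_cbar.get_position()
```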

How to ensure even spacing between labels on x axis of matplotlib graph?

I have been given some data for which I need to plot a histogram. So I used the pandas hist() function and plotted it using matplotlib. The code runs on a remote server, so I cannot view the plot directly and save the image instead. Here is what the image looks like
Here is my code below
import matplotlib.pyplot as plt
df_hist = pd.DataFrame(np.array(raw_data)).hist(bins=5)  # raw_data is the data supplied to me
plt.savefig('/path/to/file.png')
plt.close()
As you can see the x axis labels are overlapping. So I used this function plt.tight_layout() like so
import matplotlib.pyplot as plt
df_hist = pd.DataFrame(np.array(raw_data)).hist(bins=5)
plt.tight_layout()
plt.savefig('/path/to/file.png')
plt.close()
There is some improvement now
But still the labels are too close. Is there a way to ensure the labels do not touch each other and there is fair spacing between them? Also I want to resize the image to make it smaller.
I checked the documentation here https://matplotlib.org/api/_as_gen/matplotlib.pyplot.savefig.html but not sure which parameter to use for savefig.
Since raw_data is not already a pandas dataframe there's no need to turn it into one to do the plotting. Instead you can plot directly with matplotlib.
There are many different ways to achieve what you'd like. I'll start by setting up some data which looks similar to yours:
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gamma
raw_data = gamma.rvs(a=1, scale=1e6, size=100)
If we go ahead and use matplotlib to create the histogram we may find the xticks too close together:
fig, ax = plt.subplots(1, 1, figsize=[5, 3])
ax.hist(raw_data, bins=5)
fig.tight_layout()
The xticks are hard to read with all the zeros, regardless of spacing. So, one thing you may wish to do would be to use scientific formatting. This makes the x-axis much easier to interpret:
ax.ticklabel_format(style='sci', axis='x', scilimits=(0,0))
Another option, without using scientific formatting would be to rotate the ticks (as mentioned in the comments):
ax.tick_params(axis='x', rotation=45)
fig.tight_layout()
Finally, you also mentioned altering the size of the image. Note that this is best done when the figure is initialised. You can set the size of the figure with the figsize argument. The following would create a figure 5" wide and 3" in height:
fig, ax = plt.subplots(1, 1, figsize=[5, 3])
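Putting the pieces above together, a complete script might look like the sketch below. The gamma-distributed data is only a stand-in for raw_data, and the figure is written to an in-memory buffer so that the example has no side effects; on a real server you would pass a file path to savefig instead. The dpi value is an assumption.

```python
import io

import matplotlib.pyplot as plt
import numpy as np

# Stand-in for the real raw_data (seeded so the sketch is repeatable).
rng = np.random.default_rng(0)
raw_data = rng.gamma(shape=1.0, scale=1e6, size=100)

# Set the figure size (in inches) when the figure is created.
fig, ax = plt.subplots(figsize=(5, 3))
ax.hist(raw_data, bins=5)

# Scientific notation shortens the labels; rotation keeps them from touching.
ax.ticklabel_format(style='sci', axis='x', scilimits=(0, 0))
ax.tick_params(axis='x', rotation=45)
fig.tight_layout()

# Replace the buffer with a path such as '/path/to/file.png' in real use.
buffer = io.BytesIO()
fig.savefig(buffer, format='png', dpi=100)
plt.close(fig)
```

The pixel size of the saved image is figsize times dpi, so either value can be lowered to make the file smaller.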
I think the two best fixes were mentioned by Pam in the comments.
You can rotate the labels with
plt.xticks(rotation=45)
For more information, look here: Rotate axis text in python matplotlib
The real problem is too many zeros that don't provide any extra info. Numpy arrays are pretty easy to work with, so pd.DataFrame(np.array(raw_data)/1000).hist(bins=5) should get rid of three zeros on the x-axis tick labels (the counts on the y-axis are unchanged). Then just add a 'kilo' to the x-axis label.
To change the size of the graph use rcParams.
from matplotlib import rcParams
rcParams['figure.figsize'] = 7, 5.75 #the numbers are the dimensions
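A minimal sketch of the scaling idea, plotting directly with matplotlib: the data is divided by 1000 before plotting and the unit is stated in the axis label. The gamma-distributed data and the label text are placeholders, not the asker's real data.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for the real raw_data.
rng = np.random.default_rng(0)
raw_data = rng.gamma(shape=1.0, scale=1e6, size=100)

# Dividing by 1000 only rescales the x values; the bin counts stay the same.
scaled = np.asarray(raw_data) / 1000

fig, ax = plt.subplots(figsize=(7, 5.75))
bin_counts, bin_edges, _ = ax.hist(scaled, bins=5)
ax.set_xlabel('value (thousands)')
ax.set_ylabel('count')
fig.tight_layout()
```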

Plotting multiple scatter plots in the same graph instead of Facet Grids

Currently I have a few plots using Facet Grids in seaborn. I have the following code:
g = sns.FacetGrid(masterdata1,col = "courseName")
g=g.map(plt.scatter, "SubjectwisePercentage", "SemesterPercentage")
The above code plots subjectwisepercentage vs semesterpercentage, for different courses across a semester. How can I plot the different scatter plots in a single plot, instead of multiple plots across the facet grid? In the single plot, the plotted points for each course should be a different color.
There are links online that specify how to plot different datasets in a single plot. However I need to use the same dataset. Therefore I need to specify col="courseName", or something equivalent, to plot course wise data in a single plot. I am not sure of how to accomplish this. Thank you in advance for your help.
You can try using seaborn's scatter plot features. scatterplot lets you define x, y, hue, style, and even size, which gives up to a 5D view of your data. Sometimes people base hue and style on the same variable for better-looking graphs.
Sample code (mostly not mine, since the seaborn documentation explains pretty much everything):
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="ticks", color_codes=True)
tips = sns.load_dataset("tips")
# g = sns.FacetGrid(tips, col="sex", hue="time", palette="Set1",
# hue_order=["Dinner", "Lunch"])
# g= (g.map(plt.scatter, "total_bill", "tip")).add_legend()
# sns.scatterplot(data=tips, x="total_bill", y="tip", hue='time', style='sex')
sns.scatterplot(data=tips, x="total_bill", y="tip", hue='time', style='sex', size='size')
plt.show()
The matplotlib scatter plot can also be helpful, since you can plot several datasets on the same plot with different markers/colors/sizes.
See this example.
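With plain matplotlib, one way to get one colour per course from a single dataframe is to loop over a groupby and call scatter once per group. The sketch below uses the column names from the question, but the data values and course names are invented for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data using the column names from the question.
masterdata1 = pd.DataFrame({
    'courseName': ['Maths', 'Maths', 'Physics', 'Physics', 'Chemistry', 'Chemistry'],
    'SubjectwisePercentage': [62.0, 75.5, 58.0, 81.0, 69.5, 90.0],
    'SemesterPercentage': [65.0, 72.0, 60.5, 79.0, 71.0, 88.5],
})

fig, ax = plt.subplots()

# Each call to scatter picks the next colour in the cycle, giving one colour per course.
for course, group in masterdata1.groupby('courseName'):
    ax.scatter(group['SubjectwisePercentage'], group['SemesterPercentage'], label=course)

ax.set_xlabel('SubjectwisePercentage')
ax.set_ylabel('SemesterPercentage')
ax.legend(title='courseName')

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```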

Save figure with clip box from another figure

Normally if you plot two different figures using the default settings in pyplot, they will be exactly the same size, and if saved can be neatly aligned in PowerPoint or the like. I'd like to generate one figure, however, which has a legend outside of the figure. The script I'm using is shown below.
import numpy as np
import matplotlib.pyplot as plt
x=np.linspace(0,1,201)
y1=x**2
y2=np.sin(x)
fig1=plt.figure(1)
plt.plot(x,y1,label='y1')
handles1,labels1=plt.gca().get_legend_handles_labels()
lgd1=plt.gca().legend(handles1,labels1,bbox_to_anchor=(1.27,1),borderaxespad=0.)
fig2=plt.figure(2)
plt.plot(x,y2)
fig1.savefig('fig1',bbox_extra_artists=(lgd1,),bbox_inches='tight')
fig2.savefig('fig2')
plt.show()
The problem is that in PowerPoint, I can no longer align the two figures left and have their axes aligned. Due to the use of the 'extra artists' and 'bbox_inches=tight' arguments for the first figure, the width of its margins becomes different from the second figure.
Is there any way to 'transfer' the clip box from the first figure to the second figure, such that they can be aligned by 'align left' in PowerPoint?
I think an easier way to achieve what you want is to just construct one figure with two subplots, and let matplotlib align everything for you.
Do you think doing something like this is a good idea?
import matplotlib.pyplot as plt
import numpy as np
x=np.linspace(0,1,201)
y1=x**2
y2=np.sin(x)
fig = plt.figure()
a = fig.add_subplot(211)
a.plot(x,y1, label='y1')
lgd1 = a.legend(bbox_to_anchor = (1.27,1), borderaxespad=0.)
a = fig.add_subplot(212)
a.plot(x,y2)
fig.savefig('fig',bbox_extra_artists=(lgd1,),bbox_inches='tight')
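If two separate image files are still needed, another possible approach (not the one above) is to skip bbox_inches='tight' and instead give both figures the same size and the same explicit margins, reserving space on the right for the legend. The margin values and the make_figure helper below are assumptions for illustration and would need tuning for a real legend.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 201)
y1 = x**2
y2 = np.sin(x)


def make_figure():
    """Hypothetical helper: same size and same margins for every figure."""
    fig, ax = plt.subplots(figsize=(6.4, 4.8))
    # Reserve room on the right so an outside legend fits inside the canvas.
    fig.subplots_adjust(left=0.12, right=0.75)
    return fig, ax


fig1, ax1 = make_figure()
ax1.plot(x, y1, label='y1')
ax1.legend(loc='upper left', bbox_to_anchor=(1.02, 1), borderaxespad=0.)

fig2, ax2 = make_figure()
ax2.plot(x, y2)

# Saving without bbox_inches='tight' keeps both canvases identical, e.g.
# fig1.savefig('fig1') and fig2.savefig('fig2').
```

Since neither file is cropped, the axes occupy the same rectangle in both images and should line up when the images are aligned left.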
