seaborn hue parameter not working correctly - python

I am having trouble getting the seaborn hue to work to color by value. My data is in a pandas df and I am using a barplot.
sns.barplot(x = plot_data['gene'], y = plot_data['freq'],
hue=plot_data["type"],palette={"type1":"red", "type2":"blue"}, ax=ax2)
I am confused by the grey bars that appear in places. I expect only red and blue bars and I am sure these are the only two types in the data.

I suggest drawing the seaborn barplot horizontally. A vertical layout may not display the bars properly, which may be why the hue parameter appears not to work.
Also set the figure size, for example:
plt.figure(figsize=(9, 200))  # figure size in inches (width, height), ratio 9:200
You can change the size according to your requirements.
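A minimal sketch of the horizontal layout suggested above. The contents of plot_data are hypothetical stand-ins for the asker's data (only the column names come from the question), and the figure size is an arbitrary choice for six bars. It also checks that every value in the hue column has a palette entry, which is one quick way to rule out an unexpected third type.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data; only the column names come from the question.
plot_data = pd.DataFrame({
    "gene": ["A", "A", "B", "B", "C", "C"],
    "type": ["type1", "type2"] * 3,
    "freq": [0.2, 0.5, 0.7, 0.1, 0.4, 0.3],
})
palette = {"type1": "red", "type2": "blue"}

# Any hue value without a palette entry would show up here.
missing = set(plot_data["type"]) - set(palette)

fig, ax2 = plt.subplots(figsize=(6, 4))
# A numeric x and a categorical y give horizontal bars.
sns.barplot(data=plot_data, x="freq", y="gene", hue="type",
            palette=palette, ax=ax2)
```

Passing column names together with data= (rather than the Series themselves) keeps the axis and legend labels tied to the DataFrame.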

Related

How to rescale the y-axis of a boxplot in python

I have a boxplot below (using seaborn) where the "box" part is too squashed. How do I change the scale along the y-axis so that the boxplot is more presentable (ie. the "box" part is too squashed) but still keeping all the outliers in the plot?
Many thanks.
You can do two things here.
1) Make the plot bigger
2) Change the range of the y-axis
Since you want to keep the outliers, rescaling the y-axis may not be that effective. You haven't given any data or code examples. So I'll just add a way to make your figure bigger.
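# This script makes the figure bigger and rescales the y-axis
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset("tips")  # example dataset; substitute your own data
plt.figure(figsize=(20, 15))
ax = sns.boxplot(x="day", y="total_bill", data=tips)
ax.set_ylim(0, 100)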
You could set the axis after the plot:
import seaborn as sns
df = sns.load_dataset('iris')
a = sns.boxplot(y=df["sepal_length"])
a.set(ylim=(0,10))
Additionally, you could try dropping the outliers from the plot by passing showfliers=False to boxplot.
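The two suggestions above (limiting the y-axis and hiding the fliers) can be compared side by side. This is only a sketch on made-up data: the values, the (0, 10) limits, and the variable names are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
# Hypothetical data: a tight bulk between 2 and 8 plus three extreme outliers.
values = np.concatenate([rng.uniform(2, 8, 200), [60.0, 80.0, 95.0]])
df = pd.DataFrame({"value": values})

fig, (ax_zoom, ax_nofliers) = plt.subplots(1, 2, figsize=(10, 5))

# Option 1: draw everything, then zoom the y-axis onto the box.
# Outliers above the limit are cut off from view.
sns.boxplot(y=df["value"], ax=ax_zoom)
ax_zoom.set(ylim=(0, 10))

# Option 2: do not draw the outliers at all; the axis autoscales to the box.
sns.boxplot(y=df["value"], showfliers=False, ax=ax_nofliers)
```

Neither option keeps the outliers visible, so if they must stay in the plot, enlarging the figure remains the option that loses nothing.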

Remove one of the two legends produced in this Seaborn figure?

I have just started using seaborn to produce my figures. However I can't seem to remove one of the legends produced here.
I am trying to plot two accuracies against each other and draw a line along the diagonal to make it easier to see which has performed better (if anyone has a better way of plotting this data in seaborn - let me know!). The legend I'd like to keep is the one on the left, that shows the different colours for 'N_bands' and different shapes for 'Subject No'
ax1 = sns.relplot(y='y',x='x',data=df,hue='N bands',legend='full',style='Subject No.',markers=['.','^','<','>','8','s','p','*','P','X','D','H','d']).set(ylim=(80,100),xlim=(80,100))
ax2 = sns.lineplot(x=range(80,110),y=range(80,110),legend='full')
I have tried setting the kwarg legend to 'full','brief' and False for both ax1 and ax2 (together and separately) and it only seems to remove the one on the left, or both.
I have also tried to remove the legends using matplotlib:
ax1.ax.legend_.remove()
ax2.legend_.remove()
But this results in the same behaviour (the left legend disappearing).
UPDATE: Here is a minimal example you can run yourself:
test_data = np.array([[1.,2.,100.,9.],[2.,1.,100.,8.],[3.,4.,200.,7.]])
test_df = pd.DataFrame(columns=['x','y','p','q'], data=test_data)
sns.set_context("paper")
ax1=sns.relplot(y='y',x='x',data=test_df,hue='p',style='q',markers=['.','^','<','>','8'],legend='full').set(ylim=(0,4),xlim=(0,4))
ax2=sns.lineplot(x=range(0,5),y=range(0,5),legend='full')
Although this doesn't reproduce the error perfectly as the right legend is coloured (I have no idea how to reproduce this error then - does the way my dataframe was created make a difference?). But the essence of the problem remains - how do I remove the legend on the right but keep the one on the left?
You're plotting a lineplot in the (only) axes of a FacetGrid produced via relplot. That's quite unconventional, so strange things might happen.
One option to remove the legend of the FacetGrid while keeping the one from the lineplot would be
g._legend.remove()
Full code (where I also corrected the confusing naming of grids and axes):
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
test_data = np.array([[1.,2.,100.,9.],[2.,1.,100.,8.],[3.,4.,200.,7.]])
test_df = pd.DataFrame(columns=['x','y','p','q'], data=test_data)
sns.set_context("paper")
g=sns.relplot(y='y',x='x',data=test_df,hue='p',style='q',markers=['.','^','<','>','8'], legend='full')
sns.lineplot(x=range(0,5),y=range(0,5),legend='full', ax=g.axes[0,0])
g._legend.remove()
plt.show()
Note that this is kind of a hack, and it might break in future seaborn versions.
The other option is to not use a FacetGrid here, but just plot a scatter and a line plot in one axes,
ax1 = sns.scatterplot(y='y',x='x',data=test_df,hue='p',style='q',
markers=['.','^','<','>','8'], legend='full')
sns.lineplot(x=range(0,5),y=range(0,5), legend='full', ax=ax1)
plt.show()
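A variation on the single-axes option above, offered as a sketch rather than as part of the original answer: the diagonal is drawn with matplotlib's ax.plot and given no label, so it never gets a legend entry and only the scatterplot legend remains. The data mirrors the asker's minimal example; the grey dashed styling is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Same values as the minimal example in the question.
test_df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0],
    "y": [2.0, 1.0, 4.0],
    "p": [100.0, 100.0, 200.0],
    "q": [9.0, 8.0, 7.0],
})

fig, ax = plt.subplots()
sns.scatterplot(data=test_df, x="x", y="y", hue="p", style="q",
                markers=[".", "^", "<"], legend="full", ax=ax)

# An unlabelled matplotlib line adds no legend entry of its own.
ax.plot([0, 4], [0, 4], color="grey", linestyle="--")
ax.set(xlim=(0, 4), ylim=(0, 4))
```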

MatPlotLib - Showing legend

I'm making a scatter plot from a Pandas DataFrame with 3 columns. The first two would be the x and y axis, and the third would be classification data that I want to visualize by points having different colors. My question is, how can I add the legend to this plot:
df= df.groupby(['Month', 'Price'])['Quantity'].sum().reset_index()
df.plot(kind='scatter', x='Month', y='Quantity', c=df.Price , s = 100, legend = True);
As you can see, I'd like to automatically color the dots based on their price, so adding labels manually is a bit of an inconvenience. Is there a way I could add something to this code, that would also show a legend to the Price values?
Also, this colors the scatter plot dots on a range from black to white. Can I add custom colors without giving up the easy usage of c=df.Price?
Thank you!
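One possible approach, not taken from the thread: plot with matplotlib's ax.scatter directly and build the legend from the returned collection with legend_elements(), which is available in matplotlib 3.1 and later. The DataFrame below is made up (only the column names come from the question), and "viridis" is just one example of a colormap that replaces the black-to-white default.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "Month": [1, 1, 2, 2, 3, 3],
    "Price": [10, 20, 10, 20, 10, 20],
    "Quantity": [5, 3, 8, 2, 6, 4],
})

fig, ax = plt.subplots()
# c= still colors the points by Price; cmap picks the color range.
sc = ax.scatter(df["Month"], df["Quantity"], c=df["Price"],
                cmap="viridis", s=100)

# One legend handle per distinct Price value (when there are only a few).
handles, labels = sc.legend_elements()
ax.legend(handles, labels, title="Price")
ax.set(xlabel="Month", ylabel="Quantity")
```

With many distinct prices, legend_elements() picks a subset of representative values instead of listing every one, so a colorbar may be the better fit in that case.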

How to change boxplot size in seaborn FacetGrid object

I have a row of boxplots I produce using the following code:
import seaborn as sns
g = sns.FacetGrid(df, col="Column0", sharex=False)
g.map(sns.boxplot, 'column1', 'Column2')
It works well with the exception that the plots are super tiny. I have looked at How can I change the font size using seaborn FacetGrid? and How to change figuresize using seaborn factorplot as well as the seaborn manual, but I do not find the right way to include 'size' and 'aspect' into the code. What would be the proper way to change the plot size?
EDIT
If I try it like this:
g = sns.FacetGrid(df, col="Column0", sharex=False, size=20, aspect=3)
g.map(sns.boxplot, 'Column1', 'Column2')
I get the error: ValueError: width and height must each be below 32768. Is there a restriction in size for plots that are produced the way I do it?
Maybe you can try limiting the maximum x and y values so that your plot automatically adjusts to the values that matter.
g.set(xlim=(0, 60), ylim=(0, 14));
If the plot looks super tiny, that suggests some elements have very high values.
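As a complement to limiting the axes, here is a sketch of sizing the facets directly. It assumes seaborn 0.9 or later, where the FacetGrid argument is named height (the older name was size), and it uses made-up data with the question's column names. Each facet is height inches tall and height * aspect inches wide, so the total width grows with the number of columns; values such as size=20, aspect=3 make every facet 60 inches wide, which is a likely route to the pixel-limit error in the question.

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
# Hypothetical data using the question's column names.
df = pd.DataFrame({
    "Column0": np.repeat(["a", "b", "c"], 40),
    "Column1": np.tile(["u", "v"], 60),
    "Column2": rng.normal(size=120),
})

# Three facets, each 4 in tall and 4 * 1.2 = 4.8 in wide.
g = sns.FacetGrid(df, col="Column0", sharex=False, height=4, aspect=1.2)
g.map_dataframe(sns.boxplot, x="Column1", y="Column2")

# Overall figure size in inches: (3 * 4.8, 4).
width, height = g.axes[0, 0].figure.get_size_inches()
```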

Sorted bar charts with pandas/matplotlib or seaborn

I have a dataset of 5000 products with 50 features. One of the columns is 'colors' and there are more than 100 colors in the column. I'm trying to plot a bar chart to show only the top 10 colors and how many products there are in each color.
top_colors = df.colors.value_counts()
top_colors[:10].plot(kind='barh')
plt.xlabel('No. of Products');
Using Seaborn:
sns.factorplot("colors", data=df , palette="PuBu_d");
1) Is there a better way to do this?
2) How can I replicate this with Seaborn?
3) How do I plot such that the highest count is at the top (i.e. black at the very top of the bar chart)?
An easy trick might be to invert the y axis of your plot, rather than futzing with the data:
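import string
import numpy as np
import pandas as pd
# string.ascii_uppercase is the Python 3 name (string.uppercase was Python 2 only)
s = pd.Series(np.random.choice(list(string.ascii_uppercase), 1000))
counts = s.value_counts()
ax = counts.iloc[:10].plot(kind="barh")
ax.invert_yaxis()  # largest count ends up at the top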
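When this answer was written, seaborn's barplot did not support horizontally oriented bars, and the order of the bars was controlled by passing a list of values to the x_order param. Current seaborn versions do support horizontal bars and take that list through the order parameter instead. Even so, I think it's easier to use the pandas plotting methods here.
If you want to use pandas then you can first sort (ascending order puts the highest count at the top of a horizontal bar chart):
top_colors[:10].sort_values(ascending=True).plot(kind='barh')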
Seaborn already styles your pandas plots, but you can also use:
sns.barplot(x=top_colors.index, y=top_colors.values)
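For the seaborn part of the question, one way to get a sorted top-10 chart is countplot with an explicit order. This is a sketch on made-up data: the color names and counts are hypothetical, chosen so that "black" is the most frequent, and the steelblue fill is arbitrary.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: 12 colors with distinct counts, "black" most frequent.
names = ["black", "white", "red", "blue", "green", "grey",
         "pink", "brown", "navy", "beige", "teal", "gold"]
colors = [name for i, name in enumerate(names) for _ in range(12 - i)]
df = pd.DataFrame({"colors": colors})

# value_counts() sorts by frequency, so the first ten labels are the top ten.
top10 = df["colors"].value_counts().index[:10].tolist()

fig, ax = plt.subplots(figsize=(6, 5))
# y= gives horizontal bars; order= puts the first label at the top.
sns.countplot(data=df, y="colors", order=top10, color="steelblue", ax=ax)
ax.set_xlabel("No. of Products")
```

Because the category order is given explicitly, no axis inversion or re-sorting of the data is needed.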
