How to change boxplot size in seaborn FacetGrid object - python

I have a row of boxplots I produce using the following code:
import seaborn as sns
g = sns.FacetGrid(df, col="Column0", sharex=False)
g.map(sns.boxplot, 'Column1', 'Column2')
It works well with the exception that the plots are super tiny. I have looked at How can I change the font size using seaborn FacetGrid? and How to change figuresize using seaborn factorplot as well as the seaborn manual, but I do not find the right way to include 'size' and 'aspect' into the code. What would be the proper way to change the plot size?
EDIT
If I try it like this:
g = sns.FacetGrid(df, col="Column0", sharex=False, size=20, aspect=3)
g.map(sns.boxplot, 'Column1', 'Column2')
I get the error: ValueError: width and height must each be below 32768. Is there a restriction in size for plots that are produced the way I do it?

Try limiting the x and y ranges so that each plot adjusts to the values that matter:
g.set(xlim=(0, 60), ylim=(0, 14))
If the plots look super tiny, a few elements with very high values are probably stretching the axes.

Related

Two seaborn plots with different scales displayed on same plot but bars overlap

I am trying to include 2 seaborn countplots with different scales on the same plot but the bars display as different widths and overlap as shown below. Any idea how to get around this?
Setting dodge=False doesn't work, as the bars appear on top of each other.
The main problem with the approach in the question is that the first countplot doesn't take hue into account. The second countplot won't magically move the bars of the first. An additional categorical column could be added, only taking on the 'weekend' value. Note that the column should be explicitly made categorical with two values, even if only one value is really used.
Things can be simplified a lot by starting from the original dataframe, which supposedly already has a column 'is_weekend'. Creating the twinx ax beforehand makes it possible to write a loop (so the call to sns.countplot() is written only once, with parameters).
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
sns.set_style('dark')
# create some demo data
data = pd.DataFrame({'ride_hod': np.random.normal(13, 3, 1000).astype(int) % 24,
                     'is_weekend': np.random.choice(['weekday', 'weekend'], 1000, p=[5 / 7, 2 / 7])})
# now, make 'is_weekend' a categorical column (not just strings)
data['is_weekend'] = pd.Categorical(data['is_weekend'], ['weekday', 'weekend'])
fig, ax1 = plt.subplots(figsize=(16, 6))
ax2 = ax1.twinx()
for ax, category in zip((ax1, ax2), data['is_weekend'].cat.categories):
    sns.countplot(data=data[data['is_weekend'] == category], x='ride_hod', hue='is_weekend', palette='Blues', ax=ax)
    ax.set_ylabel(f'Count ({category})')
ax1.legend_.remove() # both axes got a legend, remove one
ax1.set_xlabel('Hour of Day')
plt.tight_layout()
plt.show()
Alternatively, set the x tick labels by hand with plt.xticks(ticks, labels).

How to change the transparency of the confidence interval in a relplot?

I found an answer for regplots, but I can't get the same code to work for relplots. I want to change the transparency of the confidence intervals while keeping the lines of my graph darker, but the alpha input for relplots makes the entire graph more translucent.
My code:
cookie = sns.relplot(
    data=data, x="Day", y="Touches",
    hue='Sex', ci=68, kind="line", col='Event'
)
The line that works for regplots is
plt.setp(cookie.collections[1], alpha=0.4)
While regplot returns a single ax (subplot), relplot returns a complete grid of subplots (a FacetGrid). Often, the return value is stored in a variable named g (calling it cookie can make things confusing when comparing with code from the documentation).
You can loop through the individual axes of the FacetGrid and make the change for each of them:
import matplotlib.pyplot as plt
import seaborn as sns
fmri = sns.load_dataset('fmri')
g = sns.relplot(data=fmri, x="timepoint", y="signal", col="region",
                hue="event", style="event", kind="line")
for ax in g.axes.flat:
    ax.collections[-1].set_alpha(0.4)
plt.show()
PS: As mentioned in the comments, if you want to change the alpha of all confidence intervals (instead of just one of them, as in the code of the question), you can use sns.relplot(..., err_kws={"alpha": .4}).

How to rescale the y-axis of a boxplot in python

I have a boxplot below (using seaborn) where the "box" part is too squashed. How do I change the scale along the y-axis so that the box is more presentable while still keeping all the outliers in the plot?
Many thanks.
You can do two things here:
1. Make the plot bigger
2. Change the range of the y-axis
Since you want to keep the outliers, rescaling the y-axis may not be that effective. You haven't given any data or code examples. So I'll just add a way to make your figure bigger.
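# make the figure bigger and set the y-axis range
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset("tips")  # example dataset; replace with your own data
plt.figure(figsize=(20, 15))
ax = sns.boxplot(x="day", y="total_bill", data=tips)
ax.set_ylim(0, 100)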
You could set the axis after the plot:
import seaborn as sns
df = sns.load_dataset('iris')
a = sns.boxplot(y=df["sepal_length"])
a.set(ylim=(0,10))
Additionally, you could drop the outliers from the plot by passing showfliers=False to boxplot.
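To compare the two situations side by side, here is a small sketch on made-up data (a tight bulk of values plus three extreme outliers, not the asker's data). The left axes keeps the outliers, so the box is squashed; the right axes passes showfliers=False, so the axis fits the whiskers.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Made-up data: most values near 10, plus a few extreme outliers
rng = np.random.default_rng(2)
values = np.concatenate([rng.normal(10, 1, 200), [80, 95, 120]])

fig, (ax_all, ax_no_fliers) = plt.subplots(1, 2, figsize=(10, 6))
sns.boxplot(y=values, ax=ax_all)                          # outliers stretch the y-axis
sns.boxplot(y=values, showfliers=False, ax=ax_no_fliers)  # y-axis fits the whiskers
ax_all.set_title("with outliers")
ax_no_fliers.set_title("showfliers=False")

top_all = ax_all.get_ylim()[1]
top_no_fliers = ax_no_fliers.get_ylim()[1]
print(top_all, top_no_fliers)
```

Hiding the fliers only changes what is drawn; the underlying data and the box statistics stay the same.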

Incorrect legend labels in python seaborn plots

The above plot is made using seaborn in python. However, not sure why some of the legend circles are filled in with color and others are not. This is the colormap I am using:
sns.color_palette("Set2", 10)
g = sns.factorplot(x='month', y='vae_factor', hue='ad_name', col='crop', data=df_sub_panel,
                   col_wrap=3, size=5, lw=0.5, ci=None, capsize=.2, palette=sns.color_palette("Set2", 10),
                   sharex=False, aspect=.9, legend_out=False)
g.axes[0].legend(fancybox=None)
--EDIT:
Is there a way the circles can be filled? The reason they are not filled is that they might not have data in this particular plot
The circles are not filled in when there is no data, as I think you've already deduced. But it can be forced by manipulating the legend object.
Full example:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df_sub_panel = pd.DataFrame([
    {'month':'jan', 'vae_factor':50, 'ad_name':'China', 'crop':False},
    {'month':'feb', 'vae_factor':60, 'ad_name':'China', 'crop':False},
    {'month':'feb', 'vae_factor':None, 'ad_name':'Mexico', 'crop':False},
])
sns.color_palette("Set2", 10)
g = sns.factorplot(x='month', y='vae_factor', hue='ad_name', col='crop', data=df_sub_panel,
                   col_wrap=3, size=5, lw=0.5, ci=None, capsize=.2, palette=sns.color_palette("Set2", 10),
                   sharex=False, aspect=.9, legend_out=False)
# fill in empty legend handles (handles are empty when vae_factor is NaN)
for handle in g.axes[0].get_legend_handles_labels()[0]:
    if not handle.get_facecolors().any():
        handle.set_facecolor(handle.get_edgecolors())
legend = g.axes[0].legend(fancybox=None)
plt.show()  # call matplotlib.pyplot directly; sns.plt was removed from seaborn
The important part is the for loop at the end, which manipulates the legend handle objects. With the loop, every legend marker is filled in; without it (as in the original), the markers for entries that have no data stay empty.
EDIT: Now less hacky thanks to suggestions from comments!

seaborn hue parameter not working correctly

I am having trouble getting the seaborn hue to work to color by value. My data is in a pandas df and I am using a barplot.
sns.barplot(x = plot_data['gene'], y = plot_data['freq'],
hue=plot_data["type"],palette={"type1":"red", "type2":"blue"}, ax=ax2)
I am confused by the grey bars that appear in places. I expect only red and blue bars and I am sure these are the only two types in the data.
I suggest drawing the seaborn barplot horizontally, because the vertical layout does not display it properly, which may be why the hue parameter appears not to work.
Also set the figure size, adjusting it to your requirements:
plt.figure(figsize=(9, 200))  # figure size in inches: 9 wide, 200 tall
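A minimal sketch of the horizontal layout suggested above, assuming the question's column names (gene, freq, type); the values are made up for illustration. Putting the numeric column on x and the categorical column on y makes seaborn draw horizontal bars, and the palette dictionary maps each hue level to its color.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data using the question's column names
plot_data = pd.DataFrame({
    "gene": ["g1", "g1", "g2", "g2", "g3", "g3"],
    "type": ["type1", "type2"] * 3,
    "freq": [0.2, 0.5, 0.1, 0.4, 0.3, 0.6],
})

fig, ax2 = plt.subplots(figsize=(6, 4))
# numeric column on x, categorical column on y: bars are drawn horizontally
sns.barplot(data=plot_data, x="freq", y="gene", hue="type",
            palette={"type1": "red", "type2": "blue"}, ax=ax2)

gene_labels = [t.get_text() for t in ax2.get_yticklabels()]
print(gene_labels)
```

For a real dataset with many genes, increase the figure height so that each bar stays readable.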
