Running the code below produces seaborn FacetGrid graphs.
merged1 = merged[merged['TEST'].isin(['VL'])]
merged2 = merged[merged['TEST'].isin(['CD4'])]
g = sns.relplot(data=merged1, x='Days Post-ART', y='Log of VL and CD4', col='PATIENT ID', col_wrap=4, kind="line", height=4, aspect=1.5,
                color='b', facet_kws={'sharey': True, 'sharex': True})
for patid, ax in g.axes_dict.items():  # axes_dict is new in seaborn 0.11.2
    ax1 = ax.twinx()
    sns.lineplot(data=merged2[merged2['PATIENT ID'] == patid], x='Days Post-ART', y='Log of VL and CD4', color='r')
I've used the facet_kws={'sharey':True, 'sharex':True} to share the x-axis and y-axis but it's not working properly. Can someone assist?
As stated in the comments, the FacetGrid axes are shared by default. However, the twinx axes are not. Also, the call to twinx seems to reset the default hiding of the y tick labels.
You can manually share the twinx axes, and remove the unwanted tick labels.
Here is some example code using the iris dataset:
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np
iris = sns.load_dataset('iris')
g = sns.relplot(data=iris, x='petal_length', y='petal_width', col='species', col_wrap=2, kind="line",
                height=4, aspect=1.5, color='b')
# the axes at the right edge of each row, plus the very last axes
last_axes = np.append(g.axes.flat[g._col_wrap - 1::g._col_wrap], g.axes.flat[-1])
shared_right_y = None
for species, ax in g.axes_dict.items():
    ax1 = ax.twinx()
    if shared_right_y is None:
        shared_right_y = ax1
    else:
        # share the secondary y-axis with the first twin axes; Axes.sharey is used here
        # because get_shared_y_axes().join() is deprecated since matplotlib 3.6
        ax1.sharey(shared_right_y)
    sns.lineplot(data=iris[iris['species'] == species], x='petal_length', y='sepal_length', color='r', ax=ax1)
    if ax not in last_axes:  # remove tick labels from secondary axis
        ax1.yaxis.set_tick_params(labelleft=False, labelright=False)
        ax1.set_ylabel('')
    if ax not in g._left_axes:  # remove tick labels from primary axis
        ax.yaxis.set_tick_params(labelleft=False, labelright=False)
plt.tight_layout()
plt.show()
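To look at the sharing step in isolation, here is a minimal sketch using plain matplotlib. The two-panel layout and the numbers are made up for illustration and are not part of the answer above. Each twin axes is linked to the first twin with Axes.sharey, so all right-hand axes should end up with identical limits.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

fig, axes = plt.subplots(ncols=2, sharex=True, sharey=True)
twins = []
for i, ax in enumerate(axes):
    ax.plot([0, 1, 2], [0, 1, 2], color="b")  # primary (left) data
    twin = ax.twinx()
    if twins:
        twin.sharey(twins[0])  # link this secondary axis to the first one
    # hypothetical secondary data with a different range per panel
    twin.plot([0, 1, 2], [10 * (i + 1), 20 * (i + 1), 30 * (i + 1)], color="r")
    twins.append(twin)

# keep the right-hand tick labels only on the last panel
for twin in twins[:-1]:
    twin.tick_params(labelright=False)

fig.canvas.draw()
right_limits = [twin.get_ylim() for twin in twins]
```

Without the sharey call, each twin axes would autoscale to its own data and the red lines would not be comparable across panels.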
I'd like to represent two datasets on the same plot, one as a line and one as a binned barplot. I can do each individually:
tobar = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))
tobar["bins"] = pd.qcut(tobar.index, 20)
bp = sns.barplot(data=tobar, x="bins", y="value")
toline = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))
lp = sns.lineplot(data=toline, x=toline.index, y="value")
But when I try to combine them, of course the x axis gets messed up:
fig, ax = plt.subplots()
ax2 = ax.twinx()
bp = sns.barplot(data=tobar, x="bins", y="value", ax=ax)
lp = sns.lineplot(data=toline, x=toline.index, y="value", ax=ax2)
bp.set(xlabel=None)
I also can't seem to get rid of the bin labels.
How can I show both datasets on one plot?
This answer explains why it's better to plot the bars with matplotlib.axes.Axes.bar instead of sns.barplot or pandas.DataFrame.bar.
In short, with matplotlib.axes.Axes.bar the xtick locations correspond to the actual numeric value of the label, whereas seaborn and pandas bar plots place the xticks at 0-indexed positions that don't correspond to the numeric value.
This answer shows how to add bar labels.
ax2 = ax.twinx() can be used for the line plot if needed
Works the same if the line plot is different data.
Tested in python 3.11, pandas 1.5.2, matplotlib 3.6.2, seaborn 0.12.1
Imports and DataFrame
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
# test data
np.random.seed(2022)
df = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))
# create the bins
df["bins"] = pd.qcut(df.index, 20)
# add a column for the mid point of the interval
df['mid'] = df.bins.apply(lambda row: row.mid.round().astype(int))
# pivot the dataframe to calculate the mean of each interval
pt = df.pivot_table(index='mid', values='value', aggfunc='mean').reset_index()
Plot 1
# create the figure
fig, ax = plt.subplots(figsize=(30, 7))
# add a horizontal line at y=0
ax.axhline(0, color='black')
# add the bar plot
ax.bar(data=pt, x='mid', height='value', width=4, alpha=0.5)
# set the labels on the xticks - if desired
ax.set_xticks(ticks=pt.mid, labels=pt.mid)
# add the intervals as labels on the bars - if desired
ax.bar_label(ax.containers[0], labels=df.bins.unique(), weight='bold')
# add the line plot
_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')
Plot 2
fig, ax = plt.subplots(figsize=(30, 7))
ax.axhline(0, color='black')
ax.bar(data=pt, x='mid', height='value', width=4, alpha=0.5)
ax.set_xticks(ticks=pt.mid, labels=df.bins.unique(), rotation=45)
ax.bar_label(ax.containers[0], weight='bold')
_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')
Plot 3
The bar width is the width of the interval
fig, ax = plt.subplots(figsize=(30, 7))
ax.axhline(0, color='black')
ax.bar(data=pt, x='mid', height='value', width=50, alpha=0.5, ec='k')
ax.set_xticks(ticks=pt.mid, labels=df.bins.unique(), rotation=45)
ax.bar_label(ax.containers[0], weight='bold')
_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')
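The tick-position difference mentioned at the start of this answer can be checked directly. The sketch below uses a tiny, made-up table (the values 25, 75, 125 are assumptions for illustration) and compares where Axes.bar and pandas.DataFrame.plot.bar place the bar centers.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical interval midpoints and mean values
pt = pd.DataFrame({"mid": [25, 75, 125], "value": [1.0, -2.0, 3.0]})

fig, (ax_mpl, ax_pd) = plt.subplots(ncols=2)
ax_mpl.bar(pt["mid"], pt["value"], width=40)             # bars at the numeric x values
pt.plot.bar(x="mid", y="value", ax=ax_pd, legend=False)  # bars at positions 0, 1, 2

# center of each bar along the x-axis
mpl_centers = [round(p.get_x() + p.get_width() / 2, 6) for p in ax_mpl.patches]
pd_centers = [round(p.get_x() + p.get_width() / 2, 6) for p in ax_pd.patches]
print(mpl_centers, pd_centers)
```

A line plotted against the numeric index lines up with the Axes.bar version, but it would be squeezed against positions 0 to 2 on the pandas (or seaborn) version.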
I'd like to color my histogram according to my palette. Here's the code I used to make this, and here's the error I received when I tried an answer I found on here.
g = sns.jointplot(data=emb_df, x='f0', y='y', kind="hist", hue='klabels', palette='tab10', marginal_kws={'hist_kws': {'palette': 'tab10'}})
plt.show()
UserWarning: The marginal plotting function has changed to `histplot`, which does not accept the following argument(s): hist_kws.
I have also tried this:
plt.setp(g.ax_marg_y.patches, color='grey')
But this does not color my histogram according to my 'klabels' parameter, just a flat grey.
The marginal plots are colored by default using the same palette and hue as the central plot, so you could just run it without marginal_kws=. The marginal_kws= go directly to histplot; instead of marginal_kws={'hist_kws': {'palette': 'tab10'}}, the correct use would be marginal_kws={'palette': 'tab10'}. If you would like stacked bars, you could try marginal_kws={'multiple': 'stack'}.
If you want the marginal plots to be larger, the ratio= parameter can be altered. The default is 5, meaning the central plot is 5 times as large as the marginal plots.
Here is an example:
from matplotlib import pyplot as plt
import seaborn as sns
iris = sns.load_dataset('iris')
g = sns.jointplot(data=iris, x='petal_length', y='sepal_length', kind="hist", hue='species', palette='tab10',
                  ratio=2, marginal_kws={'multiple': 'stack'})
sns.move_legend(g.ax_joint, loc='upper left') # optionally move the legend; seaborn >= 0.11.2 needed
plt.show()
To have these plots side-by-side as subplots, you can call the underlying sns.histplot either with both x= and y= filled in (2D histogram), only x= given (histogram along the x-axis) or only y= given (histogram along the y-axis).
from matplotlib import pyplot as plt
import seaborn as sns
iris = sns.load_dataset('iris')
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(15, 4))
sns.histplot(data=iris, x='petal_length', y='sepal_length', hue='species', palette='tab10', legend=False, ax=ax1)
sns.histplot(data=iris, x='petal_length', hue='species', palette='tab10', multiple='stack', legend=False, ax=ax2)
sns.histplot(data=iris, y='sepal_length', hue='species', palette='tab10', multiple='stack', ax=ax3)
sns.move_legend(ax3, bbox_to_anchor=[1.01, 1.01], loc='upper left')
plt.tight_layout()
plt.show()
In version 3.4, matplotlib added automatic Bar labels:
https://matplotlib.org/stable/users/whats_new.html#new-automatic-labeling-for-bar-charts
I'm trying to use this on a bar plot generated by Seaborn.
fig, axs = plt.subplots(
    nrows=2,
)
for i, col in enumerate(['col_1', 'col_2']):
    ax = axs[i]
    sns.barplot(
        x="class",
        y=col,
        hue="hue_col",
        data=data_df,
        edgecolor=".3",
        linewidth=0.5,
        ax=ax
    )
    ax.bar_label(ax.containers[i])  # Doesn't work
What do I need to do to make this work?
You can loop through the containers and call ax.bar_label(...) for each of them. Note that seaborn creates one set of bars for each hue value.
The following example uses the titanic dataset and sets ci=None to avoid the error bars overlapping with the text (if error bars are needed, one could set a lighter color, e.g. errcolor='gold'). In seaborn 0.12 and later, ci= is deprecated and errorbar=None is the replacement.
import seaborn as sns
import matplotlib.pyplot as plt
titanic = sns.load_dataset('titanic')
fig, axs = plt.subplots(ncols=2, figsize=(12, 4))
for ax, col in zip(axs, ['age', 'fare']):
    sns.barplot(
        x='sex',
        y=col,
        hue="class",
        data=titanic,
        edgecolor=".3",
        linewidth=0.5,
        ci=None,  # in seaborn >= 0.12, use errorbar=None instead
        ax=ax
    )
    ax.set_title('mean ' + col)
    ax.margins(y=0.1)  # make room for the labels
    for bars in ax.containers:  # one container per hue value
        ax.bar_label(bars, fmt='%.1f')
plt.tight_layout()
plt.show()
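The container loop can also be tried without seaborn. In this minimal sketch (made-up numbers; the "male"/"female" groups are only stand-ins for hue values), each ax.bar call creates one container, much like seaborn creates one per hue value, and ax.bar_label is called once per container. It assumes matplotlib 3.4 or later.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

classes = ["First", "Second"]
x = np.arange(len(classes))

fig, ax = plt.subplots()
# two groups of bars, one container each (hypothetical values)
ax.bar(x - 0.2, [1.0, 2.5], width=0.4, label="male")
ax.bar(x + 0.2, [3.5, 4.0], width=0.4, label="female")
ax.set_xticks(x)
ax.set_xticklabels(classes)
ax.margins(y=0.1)  # make room for the labels

label_texts = []
for bars in ax.containers:
    annotations = ax.bar_label(bars, fmt="%.1f")  # returns the created text annotations
    label_texts.append([a.get_text() for a in annotations])
print(label_texts)
```

Indexing ax.containers with the subplot counter, as in the question, labels at most one group per subplot, which is why the loop over all containers is needed.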
I made a plot that looks like this
I want to turn off the ticklabels along the y axis. And to do that I am using
plt.tick_params(labelleft=False, left=False)
And now the plot looks like this. Even though the labels are turned off the scale 1e67 still remains.
Turning off the scale 1e67 would make the plot look better. How do I do that?
seaborn is used to draw the plot, but it's just a high-level API for matplotlib.
The functions called to remove the y-axis labels and ticks are matplotlib methods.
After creating the plot, use .set().
.set(yticklabels=[]) should remove tick labels.
This doesn't work if you use .set_title(), but you can use .set(title='')
.set(ylabel=None) should remove the axis label.
.tick_params(left=False) will remove the ticks.
Similarly, for the x-axis: How to remove or hide x-axis labels from a seaborn / matplotlib plot?
Tested in python 3.11, pandas 1.5.2, matplotlib 3.6.2, seaborn 0.12.1
Example 1
import seaborn as sns
import matplotlib.pyplot as plt
# load data
exercise = sns.load_dataset('exercise')
pen = sns.load_dataset('penguins')
# create figures
fig, ax = plt.subplots(2, 1, figsize=(8, 8))
# plot data
g1 = sns.boxplot(x='time', y='pulse', hue='kind', data=exercise, ax=ax[0])
g2 = sns.boxplot(x='species', y='body_mass_g', hue='sex', data=pen, ax=ax[1])
plt.show()
Remove Labels
fig, ax = plt.subplots(2, 1, figsize=(8, 8))
g1 = sns.boxplot(x='time', y='pulse', hue='kind', data=exercise, ax=ax[0])
g1.set(yticklabels=[]) # remove the tick labels
g1.set(title='Exercise: Pulse by Time for Exercise Type') # add a title
g1.set(ylabel=None) # remove the axis label
g2 = sns.boxplot(x='species', y='body_mass_g', hue='sex', data=pen, ax=ax[1])
g2.set(yticklabels=[])
g2.set(title='Penguins: Body Mass by Species for Gender')
g2.set(ylabel=None) # remove the y-axis label
g2.tick_params(left=False) # remove the ticks
plt.tight_layout()
plt.show()
Example 2
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# sinusoidal sample data
sample_length = range(1, 1+1) # number of columns of frequencies
rads = np.arange(0, 2*np.pi, 0.01)
data = np.array([(np.cos(t*rads)*10**67) + 3*10**67 for t in sample_length])
df = pd.DataFrame(data.T, index=pd.Series(rads.tolist(), name='radians'), columns=[f'freq: {i}x' for i in sample_length])
df.reset_index(inplace=True)
# plot
fig, ax = plt.subplots(figsize=(8, 8))
ax.plot('radians', 'freq: 1x', data=df)
# or skip the previous two lines and plot df directly
# ax = df.plot(x='radians', y='freq: 1x', figsize=(8, 8), legend=False)
Remove Labels
# plot
fig, ax = plt.subplots(figsize=(8, 8))
ax.plot('radians', 'freq: 1x', data=df)
# or skip the previous two lines and plot df directly
# ax = df.plot(x='radians', y='freq: 1x', figsize=(8, 8), legend=False)
ax.set(yticklabels=[]) # remove the tick labels
ax.tick_params(left=False) # remove the ticks
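To confirm that the scale factor disappears together with the tick labels, the following sketch reads the y-axis offset text before and after the call. It reuses the idea of Example 2 with a plain NumPy array; the variable names are chosen here for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rads = np.arange(0, 2 * np.pi, 0.01)
values = np.cos(rads) * 1e67 + 3e67  # same order of magnitude as in the question

fig, ax = plt.subplots()
ax.plot(rads, values)
fig.canvas.draw()  # the offset text is only filled in when the figure is drawn
offset_before = ax.yaxis.get_offset_text().get_text()

ax.set(yticklabels=[])      # remove the tick labels
ax.tick_params(left=False)  # remove the ticks
fig.canvas.draw()
offset_after = ax.yaxis.get_offset_text().get_text()
print(repr(offset_before), repr(offset_after))
```

By contrast, plt.tick_params(labelleft=False, left=False) from the question only hides the labels and ticks, which is presumably why the scale factor stayed visible there.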
I have created a simple violin plot from a bands DataFrame (df10 below) using seaborn:
fig, ax = plt.subplots(figsize=(10,4))
ax = sns.violinplot(x='z', y='z_fit', hue='new_col', data=df10, cut=0, palette='Blues', linewidth=1)
ax.set_xlabel('z_sim')
ax.legend()
The legend is plotted automatically with the values of the hue parameter. Using ax.legend() I can only hide the name of the used column ('new_col').
However, I was wondering if there is some way to manually modify the legend (texts, colors and shapes) plotted below:
Example:
import seaborn as sns
tips = sns.load_dataset("tips")
g = sns.FacetGrid(tips, col="time", height=4, aspect=.75)  # 'size' was renamed to 'height' in newer seaborn
g = g.map(sns.violinplot, "sex", "total_bill", "smoker", palette={"No": "b", "Yes": "w"}, inner=None, linewidth=1, scale="area", split=True, width=0.75).despine(left=True)
g.fig.get_axes()[0].legend(title='smoker', loc='upper left', labels=["YES", "NO"], edgecolor='red', facecolor='blue', ncol=2)  # 'top left' is not a valid loc
g.set_axis_labels('lunch','total bill')
For more info run:
help(g.fig.get_axes()[0].legend)
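If the legend needs to be controlled completely (texts, colors and shapes), one option is to build the handles by hand and pass them to ax.legend. This is a minimal matplotlib-only sketch; the colors follow the palette used above, while the empty axes and the label strings are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

fig, ax = plt.subplots()

# hand-made legend entries: one colored patch per category
handles = [
    Patch(facecolor="b", edgecolor="k", label="NO"),
    Patch(facecolor="w", edgecolor="k", label="YES"),
]
legend = ax.legend(handles=handles, title="smoker", loc="upper left",
                   ncol=2, edgecolor="red")

legend_texts = [t.get_text() for t in legend.get_texts()]
print(legend_texts)
```

Passing explicit handles avoids relying on the order in which the plotting function added its artists, which is what labels=[...] alone depends on.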