I have a Pandas dataframe with 20+ features. I would like to see their correlation matrices. I create the heatmaps with code like the below, with subset1, subset2, etc.:
import seaborn as sns
cmap = sns.diverging_palette( 220 , 10 , as_cmap = True )
sb1 = sns.heatmap(
subset1.corr(),
cmap = cmap,
square=True,
cbar_kws={ 'shrink' : .9 },
annot = True,
annot_kws = { 'fontsize' : 12 })
I would like to be able to display multiple heatmaps generated by the above code, side-by-side like so:
display_side_by_side(sb1, sb2, sb3, . . .)
I'm not sure how to do this because the first code chunk above not only saves the results to sb1, but also plots the heatmap. Also, not sure how to write a function, display_side_by_side(). I am using the following for Pandas dataframes:
# create a helper function that takes pd.dataframes as input and outputs pretty, compact EDA results
from IPython.display import display_html
def display_side_by_side(*args):
    html_str = ''
    for df in args:
        html_str = html_str + df.to_html()
    display_html(html_str.replace('table','table style="display:inline"'),raw=True)
Based on the first answer below by Simas Joneliunas, I have come up with the following working solution:
import matplotlib.pyplot as plt
import seaborn as sns
# Here we create a figure instance and five subplots on a 3x3 grid
fig = plt.figure(figsize = (20,20)) # width x height
ax1 = fig.add_subplot(3, 3, 1) # nrows, ncols, index
ax2 = fig.add_subplot(3, 3, 2)
ax3 = fig.add_subplot(3, 3, 3)
ax4 = fig.add_subplot(3, 3, 4)
ax5 = fig.add_subplot(3, 3, 5)
# We use ax parameter to tell seaborn which subplot to use for this plot
sns.heatmap(data=subset1.corr(), ax=ax1, cmap = cmap, square=True, cbar_kws={'shrink': .3}, annot=True, annot_kws={'fontsize': 12})
sns.heatmap(data=subset2.corr(), ax=ax2, cmap = cmap, square=True, cbar_kws={'shrink': .3}, annot=True, annot_kws={'fontsize': 12})
sns.heatmap(data=subset3.corr(), ax=ax3, cmap = cmap, square=True, cbar_kws={'shrink': .3}, annot=True, annot_kws={'fontsize': 12})
sns.heatmap(data=subset4.corr(), ax=ax4, cmap = cmap, square=True, cbar_kws={'shrink': .3}, annot=True, annot_kws={'fontsize': 12})
sns.heatmap(data=subset5.corr(), ax=ax5, cmap = cmap, square=True, cbar_kws={'shrink': .3}, annot=True, annot_kws={'fontsize': 12})
You should look at matplotlib's Figure.add_subplot:
# Here we create a figure instance, and two subplots
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
# We use ax parameter to tell seaborn which subplot to use for this plot
sns.pointplot(x="x", y="y", data=data, ax=ax1)
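The same approach extends to any number of heatmaps with a loop. Here is a minimal sketch, assuming every input is a DataFrame with numeric columns. The helper name corr_heatmaps_side_by_side, its ncols and size parameters, and the random data are made up for illustration:

```python
import math

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def corr_heatmaps_side_by_side(*frames, ncols=3, size=5):
    """Draw one correlation heatmap per DataFrame on a grid of subplots."""
    nrows = math.ceil(len(frames) / ncols)
    # squeeze=False keeps `axes` two-dimensional even for a single row or column
    fig, axes = plt.subplots(nrows, ncols, squeeze=False,
                             figsize=(size * ncols, size * nrows))
    cmap = sns.diverging_palette(220, 10, as_cmap=True)
    flat_axes = axes.ravel()
    for ax, frame in zip(flat_axes, frames):
        sns.heatmap(frame.corr(), ax=ax, cmap=cmap, square=True,
                    cbar_kws={'shrink': .3}, annot=True,
                    annot_kws={'fontsize': 12})
    # Hide grid cells that did not receive a heatmap
    for ax in flat_axes[len(frames):]:
        ax.set_visible(False)
    return fig


# Made-up data standing in for subset1, subset2, ...
rng = np.random.default_rng(0)
subsets = [pd.DataFrame(rng.random((50, 4)), columns=list('abcd'))
           for _ in range(5)]
fig = corr_heatmaps_side_by_side(*subsets)
```

In a notebook the returned figure is displayed automatically; in a script, call plt.show() afterwards.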
It seems that stripplot (or swarmplot) is always drawn on top of boxplot, even if I call one function before the other.
Am I missing something here? How can I put the stripplot behind the boxplot? Also, I am actually only using boxplot to show the mean with a diamond marker.
Here is sample code showing roughly how I call the functions now:
fig, axs = plt.subplots(nrows=1, ncols=1, figsize=(5, 4), dpi=200)
sns.boxplot(ax=axs, data=df, x='region', y='measure', hue='group',
meanprops={'marker' : 'D', 'markeredgecolor' : 'black', 'markersize' : 6},
medianprops={'visible': False}, whiskerprops={'visible': False},
showmeans=True, showfliers=False, showbox=False, showcaps=False)
sns.stripplot(ax=axs, data=df, x='region', y='measure', hue='group',
dodge=True, jitter=0.05)
plt.show()
The plot currently
You can add sns.stripplot(..., zorder=0) to put the strip plot below the other elements.
To have the mean in the legend, you can add a label. As this will label each individual mean, you can collect all the legend handles, and filter out the first of those, together with the PathCollections that represent the dots of the stripplot, also leaving out the rectangles that represent the left-out boxes.
import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection
from matplotlib.lines import Line2D
import seaborn as sns
tips = sns.load_dataset('tips')
fig, axs = plt.subplots(figsize=(5, 4))
sns.boxplot(ax=axs, data=tips, x='day', y='tip', hue='smoker',
meanprops={'marker' : 'D', 'markeredgecolor' : 'black', 'markersize' : 6, 'label':'mean'},
medianprops={'visible': False}, whiskerprops={'visible': False},
showmeans=True, showfliers=False, showbox=False, showcaps=False)
sns.stripplot(ax=axs, data=tips, x='day', y='tip', hue='smoker', palette='autumn',
dodge=True, jitter=0.05, zorder=0)
handles, _ = axs.get_legend_handles_labels()
new_handles = [h for h in handles if isinstance(h, PathCollection)] + [h for h in handles if isinstance(h, Line2D)][:1]
axs.legend(handles=new_handles, title=axs.legend_.get_title().get_text())
plt.show()
If you want two or more plots, create a grid of axes:
fig, ax = plt.subplots(2,2, figsize=(20, 15))
Then pass ax=ax[row, col] to each plotting call:
sns.boxplot(x = 'bedrooms', y = 'price', data = dataset_df, ax=ax[0,1])
sns.boxplot(x = 'floor', y = 'price', data = dataset_df, ax=ax[1,0])
I have two heatmap subplots made with Seaborn (shown below).
I have looked everywhere for tutorials and help, but I cannot figure out:
Q) How do I change the color of the colorbar numbers on each of the heatmaps?
I want them both to be "yellow" instead of the default "black".
Thank you for your time.
line_df
total_df
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(12,6))
fig.set_facecolor("Blue")
sns.heatmap(line_df, ax = ax1, annot=True, annot_kws={'fontsize': 16, 'fontweight':'bold'}, xticklabels=line_df.columns, yticklabels=line_df.index, cbar_kws={'orientation':'vertical'} )
ax1.yaxis.label.set_color("Blue")
ax1.tick_params(colors="yellow")
sns.heatmap(total_df, ax = ax2, annot=True, annot_kws={'fontsize': 16, 'fontweight':'bold',}, xticklabels=total_df.columns, yticklabels=False, cbar_kws={'orientation':'vertical'})
ax2.get_yaxis().set_visible(False)
ax2.tick_params(colors="yellow")
fig.tight_layout()
plt.show()
plt.close()
To change the colorbar parameters, including the font color, get the colorbar of each axis and call tick_params on the colorbar's own axes. As no data was available, I have used random arrays to demonstrate. The matplotlib documentation has more information on tick_params and on collections.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df1 = np.random.rand(5, 5)
df2 = np.random.rand(5, 5)
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(12,6))
sns.heatmap(data = df2, ax = ax1, annot=True, annot_kws={'fontsize': 16, 'fontweight':'bold'},
cbar_kws={'orientation':'vertical'} )
sns.heatmap(data = df1, ax = ax2, annot=True, annot_kws={'fontsize': 16, 'fontweight':'bold',},
cbar_kws={'orientation':'vertical'})
cbar1 = ax1.collections[0].colorbar
cbar1.ax.tick_params(labelsize=20, colors='yellow')
cbar2 = ax2.collections[0].colorbar
cbar2.ax.tick_params(labelsize=20, colors='yellow')
plt.show()
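With more than two heatmaps, the two colorbar lines can be wrapped in a small helper and applied in a loop. This is only a sketch: the function name color_colorbar_ticks and the random data are assumptions, and it relies on the heatmap being the first collection on each axes, as in the code above.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def color_colorbar_ticks(ax, color, labelsize=None):
    """Recolour the ticks and tick labels of the colorbar attached to a heatmap."""
    # The heatmap is the first collection on the axes, and matplotlib
    # stores the colorbar created for it on that collection
    cbar = ax.collections[0].colorbar
    params = {'colors': color}
    if labelsize is not None:
        params['labelsize'] = labelsize
    cbar.ax.tick_params(**params)
    return cbar


rng = np.random.default_rng(0)
fig, axes = plt.subplots(ncols=2, figsize=(12, 6))
for ax in axes:
    sns.heatmap(rng.random((5, 5)), ax=ax, annot=True)
cbars = [color_colorbar_ticks(ax, 'yellow', labelsize=20) for ax in axes]
```

The helper does nothing useful for a heatmap drawn with cbar=False, since no colorbar exists in that case.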
I am using a secondary y-axis and a colormap, but when I plot them together the colorbar overlaps my plot.
Here is my code:
fig,ax1=plt.subplots()
ax1 = df_Combine.plot.scatter('Parameter2', 'NPV (MM €)', marker='s', s=500, ylim=(-10,60), c='Lifetime1 (a)', colormap='jet_r', vmin=0, vmax=25, ax=ax1)
ax1.axhline(0, color='k')
plt.xticks(rotation=90)
ax2 = ax1.twinx()
ax2.plot(df_Combine_min_select1["CumEnergy1 (kWH)"])
plt.show()
And here is my plot:
Can anyone help me solve this issue?
Thank you
When you let pandas create the colorbar automatically, you don't have positioning options. Instead, you can create the colorbar in a separate step and pass the pad= parameter to set a wider gap. By default, pad is 0.05 for a vertical colorbar, meaning 5% of the width of the subplot.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
fig, ax1 = plt.subplots()
df_Combine = pd.DataFrame({'Parameter2': np.random.rand(10) * 10,
'NPV (MM €)': np.random.rand(10),
'Lifetime1 (a)': np.random.rand(10) * 25,
})
ax1 = df_Combine.plot.scatter('Parameter2', 'NPV (MM €)', marker='s', s=500, ylim=(-10, 60), c='Lifetime1 (a)',
colormap='jet_r', vmin=0, vmax=25, ax=ax1, colorbar=False)
plt.colorbar(ax1.collections[0], ax=ax1, pad=0.1)
ax2 = ax1.twinx()
ax2.plot(np.random.rand(10))
plt.show()
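To see what pad does, you can measure the gap between the main axes and the colorbar for different values. A rough sketch with made-up data; the helper colorbar_gap is hypothetical and only exists to compare the two settings:

```python
import matplotlib.pyplot as plt
import numpy as np


def colorbar_gap(pad):
    """Return the horizontal gap (as a figure fraction) between the axes and its colorbar."""
    rng = np.random.default_rng(0)
    fig, ax1 = plt.subplots()
    points = ax1.scatter(rng.random(10), rng.random(10), c=rng.random(10) * 25,
                         cmap='jet_r', vmin=0, vmax=25, marker='s')
    cbar = fig.colorbar(points, ax=ax1, pad=pad)
    # The twin axes is created after the colorbar, so it reuses the
    # already-shrunken position of ax1
    ax2 = ax1.twinx()
    ax2.plot(rng.random(10))
    gap = cbar.ax.get_position().x0 - ax1.get_position().x1
    plt.close(fig)
    return gap


narrow = colorbar_gap(0.05)
wide = colorbar_gap(0.2)
print(narrow < wide)
```

A larger pad leaves more room for the tick labels of the secondary y-axis, at the cost of a narrower main plot.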
I'm trying to produce multiple seaborn kernel density plots for the numeric variables of my Pandas DataFrame. I have the names of all of my numeric columns in a list, numberCol. Presently, I can make a kdeplot for each variable that I explicitly name, like so:
import seaborn as sbn
sbn.set_style('whitegrid')
sbn.kdeplot(np.array(df.v2), bw=0.5) # for pandas.core.frame.DataFrame input
Is there a better way to iterate through the numberCol list, produce an sbn.kdeplot for each variable in numberCol, and display them side-by-side, rather than writing out every subplot by hand like this:
import matplotlib.pyplot as plt
import seaborn as sns
# Here we create a figure instance and three subplots on a 3x3 grid
fig = plt.figure(figsize = (20,20)) # width x height
ax1 = fig.add_subplot(3, 3, 1) # nrows, ncols, index
ax2 = fig.add_subplot(3, 3, 2)
ax3 = fig.add_subplot(3, 3, 3)
# We use ax parameter to tell seaborn which subplot to use for this plot
sns.heatmap(data=subset1.corr(), ax=ax1, cmap = cmap, square=True, cbar_kws={'shrink': .3}, annot=True, annot_kws={'fontsize': 12})
sns.heatmap(data=subset2.corr(), ax=ax2, cmap = cmap, square=True, cbar_kws={'shrink': .3}, annot=True, annot_kws={'fontsize': 12})
sns.heatmap(data=subset3.corr(), ax=ax3, cmap = cmap, square=True, cbar_kws={'shrink': .3}, annot=True, annot_kws={'fontsize': 12})
If I understand your question, this should do the trick:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Ncols = 9
cols = ['col_{:d}'.format(i) for i in range(Ncols)]
df = pd.DataFrame(np.random.random(size=(1000,Ncols)),columns=cols)
fig, axs = plt.subplots(3,3) # adjust the geometry based on your number of columns to plot
for ax,col in zip(axs.flatten(), cols):
    sns.kdeplot(df[col], ax=ax)
plt.show()
I have a dataframe with 15 rows, which I plot using seaborn heatmaps. I have three plots, each with a different scale for the heatmap. The first two plots show the first two rows, and they are not aligned with the third plot.
I created a grid with 15 rows and gave each of the first two rows 1/15th of the grid, so I don't know why they are not aligned.
Another problem with the first two rows of the heatmap is that the text formatting doesn't work for them either.
So I want to do two things:
1. Stretch the top two rows of the table to align them with the bottom one.
2. Make the formatting work for the top two rows as well.
Maybe also add titles to my white horizontal lines (l1 and l2) that separate the subgroups in the bottom plot (standard methods like ax.set_title do not work).
My code:
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.gridspec as gs
gs = gs.GridSpec(15, 1) # nrows, ncols
f = plt.figure(figsize=(10, 15))
cmap = sns.diverging_palette(220, 10, as_cmap=True)
ax1 = f.add_subplot(gs[0:1, :])
ax2 = f.add_subplot(gs[1:2, :])
ax3 = f.add_subplot(gs[2:15, :])
ticksx = plt.xticks(fontsize = 18, fontweight='bold')
ticksy = plt.yticks(fontsize = 18, fontweight='bold')
wageplot = sns.heatmap(df[0:1], vmin=3000, vmax=10000, annot=False, square=True, cmap=cmap, ax=ax1, yticklabels=True, cbar=False, xticklabels=False)
tenureplot = sns.heatmap(df[1:2], vmin=45, vmax=100, annot=True, square=True, cmap=cmap, ax=ax2, yticklabels=True, cbar=False, xticklabels=False)
heatmap = sns.heatmap(df[2:15], vmin=0, vmax=1, annot=False, square=True, cmap=cmap, ax=ax3, yticklabels=True, cbar=True, xticklabels=True)
heatmap.set_xticklabels(cols, rotation=45, ha='right')
l1 = plt.axhline(y=1, linewidth=14, color='w', label='Female')
l2 = plt.axhline(y=5, linewidth=14, color='w', label='Education')
f.tight_layout()
I would appreciate being pointed to some information about how to program the needed grid.
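One possible layout, as a sketch rather than a definitive answer: give the grid a narrow second column for the colorbar and set height_ratios, so all three heatmaps sit in the same column and share its width. This assumes you can drop square=True (a fixed aspect ratio shrinks each axes on its own) and uses made-up data in place of your dataframe. Note that plt.xticks and plt.yticks act only on the current axes, so the tick formatting here is applied to each axes explicitly.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.gridspec import GridSpec

rng = np.random.default_rng(0)
# Made-up stand-in for the 15-row dataframe from the question
df = pd.DataFrame(rng.random((15, 6)),
                  columns=['col_{:d}'.format(i) for i in range(6)])

fig = plt.figure(figsize=(10, 15))
# Three stacked rows (1 + 1 + 13 data rows) and a narrow column for the colorbar
grid = GridSpec(3, 2, figure=fig, height_ratios=[1, 1, 13], width_ratios=[30, 1])
ax1 = fig.add_subplot(grid[0, 0])
ax2 = fig.add_subplot(grid[1, 0])
ax3 = fig.add_subplot(grid[2, 0])
cbar_ax = fig.add_subplot(grid[2, 1])

cmap = sns.diverging_palette(220, 10, as_cmap=True)
# square=True is left out so every heatmap fills its grid cell
sns.heatmap(df[0:1], cmap=cmap, ax=ax1, cbar=False, xticklabels=False)
sns.heatmap(df[1:2], cmap=cmap, annot=True, ax=ax2, cbar=False, xticklabels=False)
# cbar_ax draws the colorbar in its own cell instead of taking space from ax3
sns.heatmap(df[2:15], vmin=0, vmax=1, cmap=cmap, ax=ax3, cbar_ax=cbar_ax)

# Apply the tick label formatting to every axes, not just the current one
for ax in (ax1, ax2, ax3):
    ax.tick_params(labelsize=18)
    for label in ax.get_yticklabels():
        label.set_fontweight('bold')
```

Because the colorbar lives in its own grid cell, the bottom heatmap no longer loses width to it, which is one likely cause of the misalignment.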