seaborn/matplotlib: showing different tick ranges in one plot - python

I created a Boxplot like this:
f, ax = plt.subplots(figsize=(15,7))
sns.despine(bottom=True, left=True)
sns.boxplot(x=x)
ax.set(xlim=(0, 120))
ax.grid(linestyle='-', axis="x")
ax.xaxis.set_major_locator(ticker.MultipleLocator(24))
ax.set_axisbelow(True)
plt.show()
Which looks like this:
As marked in the picture, I want a different x-tick spacing for one part of the axis: up to the value 24 the ticks should follow ticker.MultipleLocator(8), and from there on they should continue with ticker.MultipleLocator(24).

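An axis has only one major locator, so two MultipleLocators cannot be combined directly. Instead, build the tick positions for each range separately, concatenate them, and set them explicitly with ax.set_xticks.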
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
tips = sns.load_dataset("tips")
fig, ax = plt.subplots(figsize=(15,7))
sns.despine(bottom=True, left=True)
g = sns.boxplot(x=tips['total_bill'])
ax.set(xlim=(0, 120))
ax.grid(linestyle='-', axis="x")
tickA = np.arange(0, 24, 8)     # 0, 8, 16
tickB = np.arange(24, 121, 24)  # 24, 48, 72, 96, 120 (a stop of 121 keeps the tick at the upper limit)
new_ticks = np.concatenate([tickA, tickB])
ax.set_xticks(new_ticks)
ax.set_axisbelow(True)
plt.show()
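The same idea can be wrapped in a small helper and handed to a FixedLocator instead of ax.set_xticks. This is only a sketch: piecewise_ticks is a hypothetical name, not a matplotlib function, and the Agg backend is selected just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import ticker


def piecewise_ticks(breakpoints, steps):
    """Hypothetical helper: use steps[i] between breakpoints[i] and breakpoints[i + 1]."""
    ticks = []
    for start, stop, step in zip(breakpoints[:-1], breakpoints[1:], steps):
        ticks.extend(np.arange(start, stop, step))  # stop is excluded here
    ticks.append(breakpoints[-1])  # add the final upper limit once
    return np.array(ticks)


# Step 8 from 0 to 24, then step 24 from 24 to 120
ticks = piecewise_ticks([0, 24, 120], [8, 24])

fig, ax = plt.subplots(figsize=(15, 7))
ax.set_xlim(0, 120)
ax.xaxis.set_major_locator(ticker.FixedLocator(ticks))
ax.grid(linestyle="-", axis="x")
```

A FixedLocator keeps the positions attached to the axis as a locator object, which may be convenient if other code later inspects or swaps the locator.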

Related

Heatmap with multi-color y-axis and corresponding colorbar

I want to create a heatmap with seaborn, similar to this (with the following code):
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Default heatmap
ax = sns.heatmap(df)
plt.show()
I'd also like to add a new variable (let's say new_var = pd.DataFrame(np.random.random((5,1)), columns=["new variable"])), such that the values of the y-axis (and possibly the spine and ticks as well) are colored according to the new variable, with a second color bar in the same plot to represent the colors of the y-axis values. How can I do that?
This uses the new values to color the y-ticks and the y-tick labels and adds the associated colorbar.
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import pandas as pd
import numpy as np
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Default heatmap
ax = sns.heatmap(df)
new_var = pd.DataFrame(np.random.random((5,1)), columns=["new variable"])
# Create the colorbar for y-ticks and labels
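norm = plt.Normalize(new_var["new variable"].min(), new_var["new variable"].max())  # scalar limits, not Series
cmap = plt.get_cmap('turbo')  # matplotlib.cm.get_cmap is deprecated and gone from recent Matplotlib releases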
yticks_locations = ax.get_yticks()
yticks_labels = df.index.values
#hide original ticks
ax.tick_params(axis='y', left=False)
ax.set_yticklabels([])
for var, ytick_loc, ytick_label in zip(new_var["new variable"], yticks_locations, yticks_labels):
    color = cmap(norm(var))
    ax.annotate(ytick_label, xy=(1, ytick_loc), xycoords='data', xytext=(-0.4, ytick_loc),
                arrowprops=dict(arrowstyle="-", color=color, lw=1), zorder=0, rotation=90, color=color)
# Add colorbar for y-tick colors
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
cb = ax.figure.colorbar(sm, ax=ax)  # newer Matplotlib needs ax= to know where to place the colorbar
# Match the seaborn style
cb.outline.set_visible(False)
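The color lookup used in the loop is a two-step mapping: Normalize rescales a value to the range 0 to 1, and the colormap turns that number into an RGBA tuple. A minimal sketch of just that step, where values is a made-up stand-in for the new variable:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

values = np.array([0.2, 0.5, 0.9])  # hypothetical stand-in for new_var

# Step 1: rescale linearly so the minimum maps to 0 and the maximum to 1
norm = Normalize(vmin=values.min(), vmax=values.max())
scaled = norm(values)

# Step 2: look up an RGBA tuple for each rescaled value
cmap = plt.get_cmap("turbo")
colors = [cmap(norm(v)) for v in values]

for v, c in zip(values, colors):
    print(v, c)
```

Passing the same norm and cmap to a ScalarMappable is what keeps the second colorbar consistent with the label colors.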
I found your problem interesting, and inspired by the unanswered comment above:
How do you change the second colorbar position? For example, one on top the other on bottom sides. - Py-ser
I decided to spend a while running some tests. After a little digging I found that cbar_kws={"orientation": "horizontal"} is the sns.heatmap argument that makes the colorbar horizontal.
Borrowing the code from the solution and making some changes, you can format your plot the way you want as in:
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import pandas as pd
import numpy as np
# Create data
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
# Default heatmap
ax = sns.heatmap(df, cbar_kws={"orientation": "horizontal"}, square = False, annot = True)
new_var = pd.DataFrame(np.random.random((5,1)), columns=["new variable"])
# Create the colorbar for y-ticks and labels
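norm = plt.Normalize(new_var["new variable"].min(), new_var["new variable"].max())  # scalar limits, not Series
cmap = plt.get_cmap('turbo')  # matplotlib.cm.get_cmap is deprecated and gone from recent Matplotlib releases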
yticks_locations = ax.get_yticks()
yticks_labels = df.index.values
#hide original ticks
ax.tick_params(axis='y', left=False)
ax.set_yticklabels([])
for var, ytick_loc, ytick_label in zip(new_var["new variable"], yticks_locations, yticks_labels):
    color = cmap(norm(var))
    ax.annotate(ytick_label, xy=(1, ytick_loc), xycoords='data', xytext=(-0.4, ytick_loc),
                arrowprops=dict(arrowstyle="-", color=color, lw=1), zorder=0, rotation=90, color=color)
# Add colorbar for y-tick colors
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
cb = ax.figure.colorbar(sm, ax=ax)  # newer Matplotlib needs ax= to know where to place the colorbar
# Match the seaborn style
cb.outline.set_visible(False)
You will also notice that I annotated each cell of the heatmap with its value (annot=True). That is only there to make it easier to check that everything works as expected.
I'm still not very happy with the shape and size of the horizontal colorbar, but I'll keep testing and will edit this answer with any progress!
==========================================
EDIT
Just to keep track of the updates: first I tried changing only some parameters of seaborn's heatmap function, though I wouldn't consider this a major improvement. By adding
ax = sns.heatmap(df, cbar_kws = dict(use_gridspec=True, location="top", shrink =0.6), square = True, annot = True)
I end up with:
I did manage to separate the colorbar from the heatmap using matplotlib's subplots routine, and honestly I believe this is the right way to do it, given how much control over the parameters it offers:
# Define two rows for subplots
fig, (cax, ax) = plt.subplots(nrows=2, figsize=(5,5.025), gridspec_kw={"height_ratios":[0.025, 1]})
# Default heatmap
ax = sns.heatmap(df, cbar=False, annot=True, ax=ax)  # draw explicitly on the lower axes
# colorbar
fig.colorbar(ax.get_children()[0], cax=cax, orientation="horizontal")
plt.show()
I obtained:
Which is still not the prettiest graph I've ever made, but now the position and size of the heatmap can be edited normally within the plt.subplots subroutines that give absolute control over these parameters.

pandas barplot choose color for each variable

I usually use matplotlib, but was playing with pandas plotting and experienced unexpected behaviour. I was assuming the following would return red and green edges rather than alternating. What am I missing here?
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({"col1":[1,2,4,5,6], "col2":[4,5,1,2,3]})
def amounts(df):
    fig, ax = plt.subplots(1, 1, figsize=(3, 4))
    (df.filter(['col1', 'col2'])
       .plot.bar(ax=ax, stacked=True, edgecolor=["red", "green"],
                 fill=False, linewidth=2, rot=0))
    ax.set_xlabel("")
    plt.tight_layout()
    plt.show()
amounts(df)
I think plotting each column separately and setting the bottom argument to stack the bars provides the output you desire.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({"col1":[1,2,4,5,6], "col2":[4,5,1,2,3]})
def amounts(df):
    fig, ax = plt.subplots(1, 1, figsize=(3, 4))
    df['col1'].plot.bar(ax=ax, linewidth=2, edgecolor='green', rot=0, fill=False)
    df['col2'].plot.bar(ax=ax, bottom=df['col1'], linewidth=2, edgecolor='red', rot=0, fill=False)
    plt.legend()
    plt.tight_layout()
    plt.show()
amounts(df)
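The "plot each column separately and stack with bottom" idea extends to any number of columns by keeping a running total. The sketch below uses matplotlib's ax.bar directly rather than pandas plotting; the columns and edge_colors dictionaries are illustrative names, and the Agg backend is chosen only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Same numbers as the DataFrame in the question, held in a plain dict
columns = {"col1": [1, 2, 4, 5, 6], "col2": [4, 5, 1, 2, 3]}
edge_colors = {"col1": "red", "col2": "green"}  # one edge color per variable

fig, ax = plt.subplots(figsize=(3, 4))
x = np.arange(5)
bottom = np.zeros(5)  # running height of the stack at each x position

for name, heights in columns.items():
    ax.bar(x, heights, bottom=bottom, fill=False, linewidth=2,
           edgecolor=edge_colors[name], label=name)
    bottom = bottom + np.asarray(heights)  # the next column starts on top of this one

ax.set_xticks(x)
ax.legend()
fig.tight_layout()
```

Because every call draws one variable, each edge color applies to a whole series instead of alternating from bar to bar.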

How to put a colorbar in seaborn scatterplot legend

I have the following scatterplot:
But I want to replace the dots in the legend with a continuous color map, like this:
This is my code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
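import seaborn as sns
import gseapy as gp  # assumption: the original omits this import; gp.enrichr and gp.get_library_name match gseapy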
sns.set_style("whitegrid")
gene_list = pd.read_csv('interseccion.csv', header=None)
glist = gene_list.squeeze().str.strip().tolist()
names = gp.get_library_name()
enr = gp.enrichr(gene_list=glist,
                 gene_sets=['KEGG_2019_Human'],
                 organism='Human',  # don't forget to set organism to the one you desired! e.g. Yeast
                 description='KEGG',
                 # no_plot=True,
                 cutoff=0.5  # test dataset, use lower value from range(0,1)
                 )
resultados = enr.results.head(15)
resultados['-log10(FDR)'] = -np.log10(resultados['Adjusted P-value'])
resultados['Genes'] = resultados['Genes'].str.split(';')
resultados['Genes'] = resultados['Genes'].apply(lambda x: len(x))
g = sns.scatterplot(data=resultados, x="-log10(FDR)", y="Term", hue='-log10(FDR)', palette="seismic",
                    size="Genes", sizes=(30, 300), legend=True)
g.legend(loc=6, bbox_to_anchor=(1, 0.5), ncol=1)
g.fig.colorbar()
plt.ylabel('')
plt.xlabel('-log10(FDR)')
When I try to add a color bar with the function plt.colorbar(), it does not work.
Assuming you want both a legend and a color bar on a seaborn scatterplot, I adapted the example from the official documentation. I built a colormap to match the colors of the sample graph, but specifying a colormap by name works just as well. The color bar gets its own axes, placed manually next to the plot by reading the plot's position, and its height is halved to match the legend.
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
fig, ax = plt.subplots()
g = sns.scatterplot(
    data=tips, x="total_bill", y="tip", hue="size", size="size",
    sizes=(20, 200), legend="full", ax=ax)
g.legend(loc='upper right', bbox_to_anchor=(1.2, 1.0), ncol=1)
norm = plt.Normalize(tips['size'].min(), tips['size'].max())
cmap = sns.cubehelix_palette(light=1, as_cmap=True)
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
sm.set_array([])
cax = fig.add_axes([ax.get_position().x1+0.05, ax.get_position().y0, 0.06, ax.get_position().height / 2])
ax.figure.colorbar(sm, cax=cax)
plt.show()
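If seaborn is not required for the scatter itself, a possible alternative is matplotlib's own ax.scatter, whose return value can be passed to fig.colorbar directly. This is a sketch with random stand-in data (x, y and c are made up), not a drop-in replacement for the seaborn call above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.random(50), rng.random(50)  # made-up coordinates
c = rng.integers(1, 7, size=50)        # made-up values driving color and marker size

fig, ax = plt.subplots()
# scatter returns a mappable, so no separate ScalarMappable is needed
points = ax.scatter(x, y, c=c, s=20 * c, cmap="viridis")

# shrink=0.5 halves the colorbar height, similar to the manual axes above
cbar = fig.colorbar(points, ax=ax, shrink=0.5, label="size")
```

The trade-off is that the size legend seaborn builds automatically would have to be created by hand in this variant.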

Histogram with Boxplot above in Python

Hi, I want to draw a histogram with a boxplot on top of it, showing Q1, Q2 and Q3 as well as the outliers. An example picture is below. (I am using Python and Pandas.)
I have checked several examples using matplotlib.pyplot but could not find a good one. I would also like the histogram to show a density curve, as in the image below.
I also tried seaborn, which gave me the curve along with the histogram, but I didn't find a way to place a boxplot above it.
Can anyone help me do this with matplotlib.pyplot?
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="ticks")
x = np.random.randn(100)
f, (ax_box, ax_hist) = plt.subplots(2, sharex=True,
                                    gridspec_kw={"height_ratios": (.15, .85)})
sns.boxplot(x=x, ax=ax_box)  # pass x by keyword so the box is drawn horizontally
sns.distplot(x, ax=ax_hist)
ax_box.set(yticks=[])
sns.despine(ax=ax_hist)
sns.despine(ax=ax_box, left=True)
sns.distplot has been deprecated since seaborn v0.11. Use sns.histplot for axes-level plots instead.
np.random.seed(2022)
x = np.random.randn(100)
f, (ax_box, ax_hist) = plt.subplots(2, sharex=True, gridspec_kw={"height_ratios": (.15, .85)})
sns.boxplot(x=x, ax=ax_box)
sns.histplot(x=x, bins=12, kde=True, stat='density', ax=ax_hist)
ax_box.set(yticks=[])
sns.despine(ax=ax_hist)
sns.despine(ax=ax_box, left=True)
Solution using only matplotlib, just because:
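import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(100)  # example data (an assumption; the original answer leaves `data` undefined)
# start the plot: 2 rows, because we want the boxplot on the first row
# and the hist on the second
fig, ax = plt.subplots(
    2, figsize=(7, 5), sharex=True,
    gridspec_kw={"height_ratios": (.3, .7)}  # the boxplot gets 30% of the vertical space
)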
# the boxplot
ax[0].boxplot(data, vert=False)
# removing borders
ax[0].spines['top'].set_visible(False)
ax[0].spines['right'].set_visible(False)
ax[0].spines['left'].set_visible(False)
# the histogram
ax[1].hist(data)
# and we are good to go
plt.show()
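For reference, the quantities the question asks for (Q1, Q2, Q3 and the outliers) can also be computed by hand. The sketch below assumes the common 1.5 × IQR rule, which is matplotlib's default whisker setting; sample is a made-up array with one extreme value.

```python
import numpy as np

sample = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])  # made-up data with one extreme value

# Quartiles: Q1, Q2 (the median) and Q3
q1, q2, q3 = np.percentile(sample, [25, 50, 75])

# Points beyond 1.5 * IQR from the box are treated as outliers
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = sample[(sample < lower_fence) | (sample > upper_fence)]

print(q1, q2, q3)   # 3.25 5.5 7.75
print(outliers)     # [30]
```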
Expanding on the answer from @mwaskom, I made a small adaptable function.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
def histogram_boxplot(data, xlabel=None, title=None, font_scale=2, figsize=(9, 8), bins=None):
    """ Boxplot and histogram combined
    data: 1-d data array
    xlabel: xlabel
    title: title
    font_scale: the scale of the font (default 2)
    figsize: size of fig (default (9,8))
    bins: number of bins (default None / auto)
    example use: histogram_boxplot(np.random.rand(100), bins = 20, title="Fancy plot")
    """
    sns.set(font_scale=font_scale)
    f2, (ax_box2, ax_hist2) = plt.subplots(2, sharex=True, gridspec_kw={"height_ratios": (.15, .85)}, figsize=figsize)
    sns.boxplot(x=data, ax=ax_box2)
    # histplot replaces the deprecated distplot; kde=True and stat="density" approximate its default look
    sns.histplot(x=data, ax=ax_hist2, bins=bins if bins else "auto", kde=True, stat="density")
    if xlabel: ax_hist2.set(xlabel=xlabel)
    if title: ax_box2.set(title=title)
    plt.show()
histogram_boxplot(np.random.randn(100), bins=20, title="Fancy plot", xlabel="Some values")
def histogram_boxplot(feature, figsize=(15, 10), bins=None):
    f, (ax_box, ax_hist) = plt.subplots(nrows=2, sharex=True, gridspec_kw={'height_ratios': (.25, .75)}, figsize=figsize)
    # histplot replaces the deprecated distplot(kde=False)
    sns.histplot(x=feature, ax=ax_hist, bins=bins if bins is not None else "auto")
    sns.boxplot(x=feature, ax=ax_box, color='red')
    ax_hist.axvline(np.mean(feature), color='g', linestyle='-')     # mean
    ax_hist.axvline(np.median(feature), color='y', linestyle='--')  # median

Matplotlib: how to adjust zorder of second legend?

Here is an example that reproduces my problem:
import matplotlib.pyplot as plt
import numpy as np
data1,data2,data3,data4 = np.random.random(100),np.random.random(100),np.random.random(100),np.random.random(100)
fig,ax = plt.subplots()
ax.plot(data1)
ax.plot(data2)
ax.plot(data3)
ax2 = ax.twinx()
ax2.plot(data4)
plt.grid('on')
ax.legend(['1','2','3'], loc='center')
ax2.legend(['4'], loc=1)
How can I get the legend in the center to plot on top of the lines?
To get exactly what you have asked for, try the following. Note I have modified your code to define the labels when you generate the plot and also the colors so you don't get a repeated blue line.
import matplotlib.pyplot as plt
import numpy as np
data1,data2,data3,data4 = (np.random.random(100),
np.random.random(100),
np.random.random(100),
np.random.random(100))
fig,ax = plt.subplots()
ax.plot(data1, label="1", color="k")
ax.plot(data2, label="2", color="r")
ax.plot(data3, label="3", color="g")
ax2 = ax.twinx()
ax2.plot(data4, label="4", color="b")
# First get the handles and labels from the axes
handles1, labels1 = ax.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
# Add the first legend to the second axis so it displays 'on top'
first_legend = plt.legend(handles1, labels1, loc='center')
ax2.add_artist(first_legend)
# Add the second legend as usual
ax2.legend(handles2, labels2)
plt.show()
I will add that it would be clearer to use a single legend that contains all the lines. This is described in this SO post, and with the code above it can easily be achieved with
ax2.legend(handles1+handles2, labels1+labels2)
But obviously you may have your own reasons for wanting two legends.
