Relabel axis ticks in seaborn heatmap - python

I have a seaborn heatmap that I am building from a matrix of values. Each row/col of the matrix corresponds to an entity whose name I would like to use as the tick label.
I tried using the ax.set_xticklabels() function to accomplish this, but it seems to do nothing. Here is my code:
type(jr_matrix)
>>> numpy.ndarray
jr_matrix.shape
>>> (15, 15)
short_cols = ['label1','label2',...,'label15'] # list of strings with len 15
fig, ax = plt.subplots(figsize=(13,10))
ax.set_xticklabels(tuple(short_cols)) # i also tried passing a list
ax.set_yticklabels(tuple(short_cols))
sns.heatmap(jr_matrix,
            center=0,
            cmap="vlag",
            linewidths=.75,
            ax=ax,
            norm=LogNorm(vmin=jr_matrix.min(), vmax=jr_matrix.max()))
The plot still has the matrix indices as labels.
Any ideas on how to correctly change these labels would be much appreciated.
Edit: I am doing this using jupyter notebooks if that matters.

You are setting the x and y tick labels of the axis you have just created. You are then plotting the seaborn heatmap which will overwrite the tick labels you have just set.
The solution is to create the heatmap first, then set the tick labels:
fig, ax = plt.subplots(figsize=(13,10))
sns.heatmap(jr_matrix,
            center=0,
            cmap="vlag",
            linewidths=.75,
            ax=ax,
            norm=LogNorm(vmin=jr_matrix.min(), vmax=jr_matrix.max()))
# passing a list is fine, no need to convert to tuples
ax.set_xticklabels(short_cols)
ax.set_yticklabels(short_cols)
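An alternative that sidesteps the ordering issue is to hand the labels to seaborn directly through the xticklabels and yticklabels parameters of sns.heatmap. The sketch below uses random data and made-up label names as stand-ins for jr_matrix and short_cols, and leaves out the LogNorm and colormap settings from the question:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Stand-in data: a 15x15 matrix and 15 hypothetical label names
rng = np.random.default_rng(0)
matrix = rng.random((15, 15))
labels = [f"label{i}" for i in range(1, 16)]

fig, ax = plt.subplots(figsize=(13, 10))
# seaborn places one tick per cell and applies the given labels itself
sns.heatmap(matrix, xticklabels=labels, yticklabels=labels, ax=ax)

# read the labels back from the axes to confirm they were applied
x_text = [t.get_text() for t in ax.get_xticklabels()]
y_text = [t.get_text() for t in ax.get_yticklabels()]
```

Passing the labels this way also keeps the number of ticks and the number of labels in sync, which matters when seaborn would otherwise thin out the automatic ticks on a crowded axis.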

Related

Editing the labels and position of the axis ticks on a seaborn heatmap results in an empty plot

I am trying to plot a seaborn heatmap with custom locations and labels on both axes. The dataframe looks like this:
Dataframe
I can plot this normally with seaborn.heatmap:
fig, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(genome_freq.applymap(lambda x: np.log10(x+1)),
            ax=ax)
plt.show()
Normal heatmap
I have a list of positions I'd like to set as the xticks (binned_chrom_genome_pos):
[1000000, 248000000, 491000000, 690000000, 881000000, 1062000000, 1233000000, 1392000000, 1538000000, 1679000000, 1814000000, 1948000000, 2081000000, 2195000000, 2301000000, 2402000000, 2490000000, 2569000000, 2645000000, 2709000000, 2772000000, 2819000000, 2868000000, 3023000000]
However, when I try to modify the xticks, the plot becomes empty:
plt.xticks(binned_chrom_genome_pos)
Modified heatmap
I also noticed that the x-axis labels do not correspond to the ticks specified.
Could someone assist me in plotting this properly?
Why the code does what it does
ax.get_xticks() returns the positions of the ticks. You can see that they are between 0.5 and 3000. These values refer to the index of your data, not to the values of your column labels. Large values set by plt.xticks or ax.set_xticks are still interpreted as data indices, so the axis stretches to include them. For example, if you have 10 columns of data and set the xticks to [0, 1000], the data in your figure will only occupy 1% of the x-range and effectively disappear. Here is an example with synthetic data:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
#generating data
dic = {a:np.random.randint(0,1000,100) for a in range(0,1000000, 10000)}
genome_freq = pd.DataFrame(dic, index=range(0,1000000, 10000))
#plotting heatmaps
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
sns.heatmap(genome_freq.applymap(lambda x: np.log10(x+1)),
            ax=ax1)
sns.heatmap(genome_freq.applymap(lambda x: np.log10(x+1)),
            ax=ax2)
old_ticks = ax2.get_xticks()
print(np.min(old_ticks), np.max(old_ticks), len(old_ticks)) # prints 0.5 99.5 34
ax2.set_xticks([0,300]) # setting xticks with values way larger than your index squishes your data
plt.show()
What can be done to fix it
What you want to do is set the xticks based on the size of your data, and then overwrite the xticklabels.
Given the new labels from your question:
new_labels = [1000000, 248000000, 491000000, 690000000, 881000000, 1062000000, 1233000000, 1392000000, 1538000000, 1679000000, 1814000000, 1948000000, 2081000000, 2195000000, 2301000000, 2402000000, 2490000000, 2569000000, 2645000000, 2709000000, 2772000000, 2819000000, 2868000000, 3023000000]
len(new_labels) # returns 24
fig, ax = plt.subplots(figsize=(4, 4))
sns.heatmap(genome_freq.applymap(lambda x: np.log10(x+1)),
            ax=ax)
So, now we want 24 evenly spaced xticks between the former minimum and the former maximum. We can use np.linspace to achieve that:
old_ticks = ax.get_xticks()
new_ticks = np.linspace(np.min(old_ticks), np.max(old_ticks), len(new_labels))
ax.set_xticks(new_ticks)
ax.set_xticklabels(new_labels)
plt.show()
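Evenly spaced ticks are fine when the labels only serve as a rough guide. If each label should instead sit over the column it actually belongs to, the genomic positions can first be translated into column indices. The following is a minimal sketch of that idea, assuming the heatmap columns are sorted bin start positions; the helper name positions_to_ticks and the small example values are hypothetical:

```python
from bisect import bisect_right

def positions_to_ticks(column_starts, positions):
    """Map each position to the center of the heatmap column (bin) containing it.

    column_starts must be sorted in ascending order. Heatmap cell i spans
    [i, i + 1] on the axis, so its center is at i + 0.5.
    """
    ticks = []
    for pos in positions:
        # index of the last bin whose start is <= pos, clamped to the valid range
        idx = bisect_right(column_starts, pos) - 1
        idx = min(max(idx, 0), len(column_starts) - 1)
        ticks.append(idx + 0.5)
    return ticks

# Hypothetical example: 10 bins of width 100 and three positions of interest
column_starts = list(range(0, 1000, 100))
wanted = [0, 250, 999]
ticks = positions_to_ticks(column_starts, wanted)
```

The resulting list can then be passed to ax.set_xticks(ticks), followed by ax.set_xticklabels(wanted).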

Matplotlib: Plot on double y-axis plot misaligned

I'm trying to plot two datasets into one plot with matplotlib. One of the two plots is misaligned by 1 on the x-axis.
This MWE pretty much sums up the problem. What do I have to adjust to bring the box-plot further to the left?
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
titles = ["nlnd", "nlmd", "nlhd", "mlnd", "mlmd", "mlhd", "hlnd", "hlmd", "hlhd"]
plotData = pd.DataFrame(np.random.rand(25, 9), columns=titles)
failureRates = pd.DataFrame(np.random.rand(9, 1), index=titles)
color = {'boxes': 'DarkGreen', 'whiskers': 'DarkOrange', 'medians': 'DarkBlue',
         'caps': 'Gray'}
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
plotData.plot.box(ax=ax1, color=color, sym='+')
failureRates.plot(ax=ax2, color='b', legend=False)
ax1.set_ylabel('Seconds')
ax2.set_ylabel('Failure Rate in %')
plt.xlim(-0.7, 8.7)
ax1.set_xticks(range(len(titles)))
ax1.set_xticklabels(titles)
fig.tight_layout()
fig.show()
Actual result. Note that there are only 8 box plots instead of 9 and that they start at index 1.
The issue is a mismatch between how box() and plot() work: box() starts at x-position 1, whereas plot() depends on the index of the dataframe (which defaults to starting at 0). There are only 8 box plots because the 9th is cut off by plt.xlim(-0.7, 8.7). There are several easy ways to fix this. As @Sheldore's answer indicates, you can explicitly set the positions for the boxplot. Another way is to change the index of the failureRates dataframe so that it starts at 1 when the dataframe is constructed, i.e.
failureRates = pd.DataFrame(np.random.rand(9, 1), index=range(1, len(titles)+1))
Note that you need not specify the xticks or the xlim for the MCVE in the question, but you may need to for your complete code.
You can specify the positions on the x-axis where you want the box plots to be drawn. Since you have 9 boxes, use the following:
plotData.plot.box(ax=ax1, color=color, sym='+', positions=range(9))
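The same alignment can be reproduced with plain matplotlib calls, which makes the default offset easier to see: Axes.boxplot places the boxes at 1..N unless positions is given. The sketch below uses random stand-in data rather than the dataframes from the question:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
samples = rng.random((25, 9))   # one column per box, like plotData
rates = rng.random(9)           # one value per box, like failureRates

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

positions = list(range(samples.shape[1]))   # 0..8 instead of the default 1..9
boxes = ax1.boxplot(samples, positions=positions)
(line,) = ax2.plot(positions, rates, color="b")

# the x-center of each median line shows where each box was drawn
box_centers = [float(np.mean(m.get_xdata())) for m in boxes["medians"]]
```

Both artists now use the x-positions 0 to 8, so the line passes through the middle of each box.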

Matplotlib histogram label text crowded

I'm making a histogram in matplotlib and the text labels of the bins overlap each other like this:
I tried to rotate the labels on the x-axis by following another solution:
cuisine_hist = plt.hist(train.cuisine, bins=100)
cuisine_hist.set_xticklabels(rotation=45)
plt.show()
But I get the error message 'tuple' object has no attribute 'set_xticklabels'. Why? How do I solve this problem? Alternatively, how can I "transpose" the plot so the labels are on the vertical axis?
Here you go. I lumped both answers in one example:
import matplotlib.pyplot as plt
import numpy as np
# create figure and ax objects, it is good practice to always start with this
fig, ax = plt.subplots()
# then plot the histogram using the axis
# note that you can change the orientation using a keyword
ax.hist(np.random.rand(100), bins=10, orientation="horizontal")
# get_xticklabels() returns an iterable of labels, so rotate each one
for tick in ax.get_xticklabels():
    tick.set_rotation(45)
This produces a horizontal histogram with rotated x-tick labels.
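plt.hist returns a tuple (n, bins, patches), which is why calling set_xticklabels on its return value fails.
That method belongs to the Axes object (shown as matplotlib.axes._subplots.AxesSubplot in some Matplotlib versions), which you can get like this:
fig, ax = plt.subplots(1, 1)
cuisine_hist = ax.hist(train.cuisine, bins=100)
# set_xticklabels() needs the label texts as its first argument,
# so use tick_params to change only the rotation
ax.tick_params(axis='x', labelrotation=45)
plt.show()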
From the "help" of plt.hist:
Returns
-------
n : array or list of arrays
    The values of the histogram bins. See *normed* or *density*
bins : array
    The edges of the bins. ...
patches : list or list of lists
    ...
This might be helpful since it is about rotating labels.
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [1, 4, 9, 6]
labels = ['Frogs', 'Hogs', 'Bogs', 'Slogs']
plt.plot(x, y, 'ro')
# You can specify a rotation for the tick labels in degrees or with keywords.
plt.xticks(x, labels, rotation='vertical')
# Pad margins so that markers don't get clipped by the axes
plt.margins(0.2)
# Tweak spacing to prevent clipping of tick-labels
plt.subplots_adjust(bottom=0.15)
plt.show()
So I think
plt.xticks(x, labels, rotation='vertical')
is the important line here.
Just this simple line would do the trick:
plt.xticks(rotation=45)
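Rotated labels are centered on their ticks by default, which can make long labels look shifted. When working with an explicit Axes object, a common adjustment is to right-align the labels while rotating them. A small sketch with made-up data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["Frogs", "Hogs", "Bogs", "Slogs"], [1, 4, 9, 6])

# rotate every existing x tick label and anchor it at its right end
plt.setp(ax.get_xticklabels(), rotation=45, ha="right")

rotations = [label.get_rotation() for label in ax.get_xticklabels()]
alignments = [label.get_ha() for label in ax.get_xticklabels()]
```

plt.setp applies the same properties to each label in the list, so it is equivalent to looping over the labels by hand.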

Matplotlib: How does one plot a 1D array of x values with y-axis corresponding to a heatmap?

I want to make a subplot for a heatmap where the y-axis matches that of the heatmap (features), but the x axis is some transformation of the mean of the binned values represented for each feature in the heatmap. Below is an example figure:
I can make the heatmap already using imshow, and I have an array of transformed means for each feature with indices that match the heatmap array. How can I produce the subplot on the right of my example figure?
The two main things are setting up the axes to share the y-axis (sharey=True) and, as you already have, setting up your transformed data to use the same indices:
import matplotlib.pyplot as plt
from numpy.random import random
from numpy import var
H = random(size=(120,80))
Hvar = var(H, axis=1)
fig, axs = plt.subplots(figsize=(3,3), ncols=2, sharey=True, sharex=False)
plt.sca(axs[0])
plt.imshow(H) #heatmap into current axis
axs[0].set_ylim(0,120)
axs[1].scatter(Hvar, range(len(Hvar)))
plt.show()
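A variation on the same idea, sketched here with random data: drawing the heatmap with origin="lower" puts row 0 at the bottom, so the row indices increase upwards just like the y-values of the side plot and the y-limits do not have to be flipped by hand. aspect="auto" lets the image stretch to the panel shape instead of forcing square pixels. The variable names are placeholders:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
heat = rng.random((120, 80))      # rows = features, columns = bins
row_means = heat.mean(axis=1)     # one summary value per feature

fig, (ax_heat, ax_side) = plt.subplots(ncols=2, sharey=True, figsize=(6, 4))
ax_heat.imshow(heat, origin="lower", aspect="auto")
# plot the summary against the row index so it lines up with the heatmap rows
ax_side.plot(row_means, np.arange(len(row_means)), marker=".", linestyle="none")

# with sharey=True both panels report the same y-limits
ylims = (ax_heat.get_ylim(), ax_side.get_ylim())
```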

Setting xticks and yticks for scatter plot matrix with pandas [duplicate]

I'm trying to modify the scatter_matrix plot available on Pandas.
Simple usage, obtained by doing:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import datasets
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# pd.tools.plotting.scatter_matrix in pandas versions before 0.20
pd.plotting.scatter_matrix(df, diagonal='kde', grid=False)
plt.show()
I want to make several modifications, among which:
turning off the grid on all plots
rotating the x and y labels by 90 degrees
turning the ticks off
Is there a way for me to modify pandas' output without having to rewrite my own scatter plot function? Where should I start to add non-existing options, fine tunings, etc.?
Thanks!
pd.plotting.scatter_matrix (pd.tools.plotting.scatter_matrix in older pandas versions) returns an array of the axes it draws. The left and lower boundary axes correspond to the indices [:,0] and [-1,:]. One can loop over these elements and apply any sort of modification. For example:
axs = pd.plotting.scatter_matrix(df, diagonal='kde')
def wrap(txt, width=8):
    '''helper function to wrap text for long labels'''
    import textwrap
    return '\n'.join(textwrap.wrap(txt, width))
for ax in axs[:,0]: # the left boundary
    ax.grid(False, axis='both')  # the string 'off' is not a reliable way to disable the grid
    ax.set_ylabel(wrap(ax.get_ylabel()), rotation=0, va='center', labelpad=20)
    ax.set_yticks([])
for ax in axs[-1,:]: # the lower boundary
    ax.grid(False, axis='both')
    ax.set_xlabel(wrap(ax.get_xlabel()), rotation=90)
    ax.set_xticks([])
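To apply a change to every panel rather than only the boundary ones, the returned array can be flattened. A minimal sketch, assuming a pandas version that provides pd.plotting.scatter_matrix; it uses a small random frame in place of the iris data and diagonal="hist" so that SciPy is not required:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data with hypothetical column names
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((30, 3)), columns=["a", "b", "c"])

axs = pd.plotting.scatter_matrix(df, diagonal="hist")
for ax in axs.ravel():      # every panel, not just the left and bottom edges
    ax.grid(False)
    ax.set_xticks([])
    ax.set_yticks([])

# count the ticks that remain on each panel
n_ticks = [len(ax.get_xticks()) + len(ax.get_yticks()) for ax in axs.ravel()]
```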
