Rotating Matplotlib tick labels causes weird spacing issues - python

I have the following plot:
I would like to make the x-axis ticks more readable by rotating the ticks by ~40 degrees. So from:
plt.xticks(list(range(0, width)), list(df_100.columns), rotation=90, fontsize=16)
To:
plt.xticks(list(range(0, width)), list(df_100.columns), rotation=40, fontsize=16)
When I do this, though, I get some crazy spacing issues:
(ignore the change in color...)
What's causing this problem? How can I fix it? Here's a minimum working example:
import matplotlib.pyplot as plt
import numpy as np
# Z is your data set
N = 100
height = df_100.shape[0]
width = df_100.shape[1]
# Z = np.random.random((100, 29))
# G is a NxNx3 matrix
G = np.zeros((height,width,3))
# Where we set the RGB for each pixel
G[Z>0.5] = [1, 1, 1]
G[Z<0.5] = [0.25, 0.25, 0.25]
fig, ax = plt.subplots(figsize=(20, 10))
ax.imshow(G, interpolation='none')
ax.set_aspect('auto')
ax.grid(None)
ax.xaxis.tick_top()
plt.xticks(list(range(0, width)), list(df_100.columns), rotation=45, fontsize=16)
plt.yticks([0, df_100.shape[0] - 1], [1, df_100.shape[0]], fontsize=20)
plt.tight_layout()
plt.show()

If all the xticklabels had the same length, you wouldn't see this problem. With labels of different lengths the spacing looks uneven, because by default each label is rotated about the center of its text. You can fix this by setting the horizontal alignment (ha) of the labels to one of ['right', 'center', 'left'].
ha = 'left' # or 'right'. Experiment with it.
ax.set_xticks(x) # set tick location
ax.set_xticklabels(xlabels, rotation=40, ha=ha) # rotate the labels with proper anchoring.
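To see the effect of the anchor, here is a minimal, self-contained sketch. The label strings, the random image, and the figure size are made up for illustration (they are not the question's df_100). The rotation_mode='anchor' argument is an extra option that is assumed to help here: it rotates each label about its alignment point instead of aligning the rotated bounding box.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical labels of very different lengths (stand-ins for df_100.columns)
xlabels = ["a", "medium label", "a much longer tick label", "b", "another one"]
x = np.arange(len(xlabels))

fig, ax = plt.subplots(figsize=(8, 4))
ax.imshow(np.random.random((10, len(xlabels))), aspect="auto", interpolation="none")
ax.xaxis.tick_top()  # ticks on top, as in the question

ax.set_xticks(x)  # fix the tick locations first
# With ticks on top, ha='left' puts the start of each rotated label at its tick
texts = ax.set_xticklabels(xlabels, rotation=40, ha="left", rotation_mode="anchor")

fig.tight_layout()
```

Call plt.show() to view the figure. Switching ha between 'left', 'center', and 'right' shows how the anchor changes where each label sits relative to its tick.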

Related

The axes labels do not align with the matrix when using matshow

I used the following code to create the attached plot:
fig, ax = plt.subplots()
fig.set_figheight(50)
fig.set_figwidth(50)
ax.matshow(power_final_for_plotting, cmap='GnBu', origin='upper')
ax.set_xticks(time_periods)
ax.set_xticklabels(time_periods)
ax.set_yticks(sig_wave_height)
ax.set_yticklabels(sig_wave_height)
for i in range(len(time_periods)):
    for j in range(len(amplitude)):
        c = round(power_final_for_plotting[j, i],3)
        ax.text(i, j, str(c), va='center', ha='center', size=27)
plt.tight_layout()
Here time_periods and sig_wave_height are lists of integers. The axis labels do not align properly with the matrix (check the top right of the image to see the labels as they are). How can I fix this? The labels are also really small in this case:
In your code the tick positions are set to the data values themselves, but matshow places the columns and rows at the integer indices 0, 1, 2, ... So the x ticks only appear at positions up to 15 instead of at each of the 30 columns. You need 30 tick positions, one per column, each with its own label. Since no data was provided, I created sample data as appropriate. Also, the tick labels look small because the figure is 50 inches on a side.
import numpy as np
import matplotlib.pyplot as plt
power_final_for_plotting = np.random.rand(450).reshape(15,30)
time_periods = np.arange(0,15,0.5)
sig_wave_height = np.arange(0.5,8.0,0.5)
fig, ax = plt.subplots()
fig.set_figheight(6)
fig.set_figwidth(12)
ax.matshow(power_final_for_plotting, cmap='GnBu', origin='upper')
ax.set_xticks(range(len(time_periods)))
ax.set_xticklabels([str(x) for x in time_periods])
ax.set_yticks(range(len(sig_wave_height)))
ax.set_yticklabels([str(x) for x in sig_wave_height])
for i in range(len(time_periods)):
    for j in range(len(sig_wave_height)):
        c = round(power_final_for_plotting[j, i],3)
        ax.text(i, j, str(c), va='center', ha='center', size=9)
plt.tight_layout()
#print(ax.get_xticklabels())
plt.show()

How to draw the normal distribution of a barplot with log x axis?

I'd like to draw a lognormal distribution of a given bar plot.
Here's the code
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
import numpy as np; np.random.seed(1)
import scipy.stats as stats
import math
inter = 33
x = np.logspace(-2, 1, num=3*inter+1)
yaxis = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.01,0.03,0.3,0.75,1.24,1.72,2.2,3.1,3.9,
4.3,4.9,5.3,5.6,5.87,5.96,6.01,5.83,5.42,4.97,4.60,4.15,3.66,3.07,2.58,2.19,1.90,1.54,1.24,1.08,0.85,0.73,
0.84,0.59,0.55,0.53,0.48,0.35,0.29,0.15,0.15,0.14,0.12,0.14,0.15,0.05,0.05,0.05,0.04,0.03,0.03,0.03, 0.02,
0.02,0.03,0.01,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0,0]
fig, ax = plt.subplots()
ax.bar(x[:-1], yaxis, width=np.diff(x), align="center", ec='k', color='w')
ax.set_xscale('log')
plt.xlabel('Diameter (mm)', fontsize='12')
plt.ylabel('Percentage of Total Particles (%)', fontsize='12')
plt.ylim(0,8)
plt.xlim(0.01, 10)
fig.set_size_inches(12, 12)
plt.savefig("Test.png", dpi=300, bbox_inches='tight')
Resulting plot:
What I'm trying to do is to draw the Probability Density Function exactly like the one shown in red in the graph below:
One idea is to convert everything to log space, with u = log10(x), then draw the density histogram there and calculate a kde in the same space. Everything gets drawn as y versus u. With u on a twin axis at the top, x can stay at the bottom. The two axes are aligned by giving them the same x limits, converted to log space on the top axis. The ticks of the top axis can then be hidden to get the desired result.
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
inter = 33
u = np.linspace(-2, 1, num=3*inter+1)
x = 10**u
us = np.linspace(u[0], u[-1], 500)
yaxis = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.01,0.03,0.3,0.75,1.24,1.72,2.2,3.1,3.9,
4.3,4.9,5.3,5.6,5.87,5.96,6.01,5.83,5.42,4.97,4.60,4.15,3.66,3.07,2.58,2.19,1.90,1.54,1.24,1.08,0.85,0.73,
0.84,0.59,0.55,0.53,0.48,0.35,0.29,0.15,0.15,0.14,0.12,0.14,0.15,0.05,0.05,0.05,0.04,0.03,0.03,0.03, 0.02,
0.02,0.03,0.01,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0,0]
yaxis = np.array(yaxis)
# reconstruct data from the given frequencies
u_data = np.repeat((u[:-1] + u[1:]) / 2, (yaxis * 100).astype(int))
kde = stats.gaussian_kde((u[:-1]+u[1:])/2, weights=yaxis, bw_method=0.2)
total_area = (np.diff(u)*yaxis).sum() # total area of all bars; divide by this area to normalize
fig, ax = plt.subplots()
ax2 = ax.twiny()
ax2.bar(u[:-1], yaxis, width=np.diff(u), align="edge", ec='k', color='w', label='frequencies')
ax2.plot(us, total_area*kde(us), color='crimson', label='kde')
ax2.plot(us, total_area * stats.norm.pdf(us, u_data.mean(), u_data.std()), color='dodgerblue', label='lognormal')
ax2.legend()
ax.set_xscale('log')
ax.set_xlabel('Diameter (mm)', fontsize='12')
ax.set_ylabel('Percentage of Total Particles (%)', fontsize='12')
ax.set_ylim(0,8)
xlim = np.array([0.01,10])
ax.set_xlim(xlim)
ax2.set_xlim(np.log10(xlim))
ax2.set_xticks([]) # hide the ticks at the top
plt.tight_layout()
plt.show()
PS: Apparently this also can be achieved directly without explicitly using u (at the cost of being slightly more cryptic):
x = np.logspace(-2, 1, num=3*inter+1)
xs = np.logspace(-2, 1, 500)
total_area = (np.diff(np.log10(x))*yaxis).sum() # total area of all bars; divide by this area to normalize
kde = stats.gaussian_kde((np.log10(x[:-1])+np.log10(x[1:]))/2, weights=yaxis, bw_method=0.2)
ax.bar(x[:-1], yaxis, width=np.diff(x), align="edge", ec='k', color='w')
ax.plot(xs, total_area*kde(np.log10(xs)), color='crimson')
ax.set_xscale('log')
Note that the bandwidth set for gaussian_kde is a somewhat arbitrary value. Larger values give a smoother curve, smaller values stay closer to the data. Some experimentation can help.
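As a rough way to experiment with the bandwidth, the sketch below fits the weighted kde with a few bw_method values and prints the peak of each curve. The bin centers and bell-shaped heights are made-up stand-ins for the question's data, and the three bandwidth values are arbitrary choices.

```python
import numpy as np
from scipy import stats

# Hypothetical bin centers in log space and a bell-shaped set of bar heights
centers = np.linspace(-2, 1, 100)
heights = np.exp(-0.5 * ((centers + 0.3) / 0.25) ** 2)

us = np.linspace(-2, 1, 500)
peaks = {}
for bw in (0.1, 0.2, 0.8):
    # A scalar bw_method is used directly as the kde bandwidth factor
    kde = stats.gaussian_kde(centers, weights=heights, bw_method=bw)
    peaks[bw] = kde(us).max()
    print(f"bw_method={bw}: peak density {peaks[bw]:.3f}")
```

A larger bandwidth spreads the same probability mass over a wider range, so the peak gets lower as the curve gets smoother. Plotting kde(us) for each value on the same axes makes the trade-off easier to judge by eye.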

Polar plot - Put one grid line in bold

I am trying to use the polar plot projection to make a radar chart. I would like to know how to make only one grid line bold (while the others remain standard).
For my specific case, I would like to highlight the gridline associated with the ytick "0".
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
#Variables
sespi = pd.read_csv("country_progress.csv")
labels = sespi.country
progress = sespi.progress
angles=np.linspace(0, 2*np.pi, len(labels), endpoint=False)
#Concatenation to close the plots
progress=np.concatenate((progress,[progress[0]]))
angles=np.concatenate((angles,[angles[0]]))
#Polar plot
fig=plt.figure()
ax = fig.add_subplot(111, polar=True)
ax.plot(angles, progress, '.--', linewidth=1, c="g")
#ax.fill(angles, progress, alpha=0.25)
ax.set_thetagrids(angles * 180/np.pi, labels)
ax.set_yticklabels([-200,-150,-100,-50,0,50,100,150,200])
#ax.set_title()
ax.grid(True)
plt.show()
The gridlines of a plot are Line2D objects, so there is no "bold" setting for them. What you can do (as shown, in part, in the other answer) is increase the linewidth and change the colour, and rather than plotting a new line you can apply this directly to the specified gridline.
You first need to find the index of the y tick labels which you want to change:
y_tick_labels = [-100,-10,0,10]
ind = y_tick_labels.index(0) # find index of value 0
You can then get a list of the gridlines using gridlines = ax.yaxis.get_gridlines(). Then use the index you found previously on this list to change the properties of the correct gridline.
Using the example from the gallery as a basis, a full example is shown below:
import matplotlib.pyplot as plt
import numpy as np
r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r
ax = plt.subplot(111, projection='polar')
ax.set_rmax(2)
ax.set_rticks([0.5, 1, 1.5, 2]) # less radial ticks
ax.set_rlabel_position(-22.5) # get radial labels away from plotted line
ax.grid(True)
y_tick_labels = [-100, -10, 0, 10]
ax.set_yticklabels(y_tick_labels)
ind = y_tick_labels.index(0) # find index of value 0
gridlines = ax.yaxis.get_gridlines()
gridlines[ind].set_color("k")
gridlines[ind].set_linewidth(2.5)
plt.show()
Which gives:
It is just a trick, but I guess you could just plot a circle and change its linewidth and color to whatever could be bold for you.
For example:
import matplotlib.pyplot as plt
import numpy as np
Yline = 0
Npoints = 300
angles = np.linspace(0,360,Npoints)*np.pi/180
line = 0*angles + Yline
ax = plt.subplot(111, projection='polar')
plt.plot(angles, line, color = 'k', linewidth = 3)
plt.ylim([-1,1])
plt.grid(True)
plt.show()
In this piece of code, I plot a line with plt.plot through the points given by the two vectors angles and line. The former contains all the angles between 0 and 2*np.pi. The latter is constant and equal to Yline, the 'height' at which you want to draw the line.
I suggest you try decreasing and increasing Npoints while having a look at the documentation of np.linspace() in order to understand how it affects the roundness of the circle.

Discrete legend in seaborn heatmap plot

I am using the data present here to construct this heat map using seaborn and pandas.
Code:
import matplotlib.pyplot as plt
import pandas
import seaborn as sns
# Read in csv file
df_trans = pandas.read_csv('LUH2_trans_matrix.csv')
sns.set(font_scale=0.8)
cmap = sns.cubehelix_palette(start=2.8, rot=.1, light=0.9, as_cmap=True)
cmap.set_under('gray') # 0 values in activity matrix are shown in gray (inactive transitions)
df_trans = df_trans.set_index(['Unnamed: 0'])
ax = sns.heatmap(df_trans, cmap=cmap, linewidths=.5, linecolor='lightgray')
# X - Y axis labels
ax.set_ylabel('FROM')
ax.set_xlabel('TO')
# Rotate tick labels
locs, labels = plt.xticks()
plt.setp(labels, rotation=0)
locs, labels = plt.yticks()
plt.setp(labels, rotation=0)
# revert matplotlib params
sns.reset_orig()
As you can see from csv file, it contains 3 discrete values: 0, -1 and 1. I want a discrete legend instead of the colorbar. Labeling 0 as A, -1 as B and 1 as C. How can I do that?
Well, there's definitely more than one way to accomplish this. In this case, with only three colors needed, I would pick the colors myself by creating a LinearSegmentedColormap instead of generating them with cubehelix_palette. If there were enough colors to warrant using cubehelix_palette, I would define the segments on colormap using the boundaries option of the cbar_kws parameter. Either way, the ticks can be manually specified using set_ticks and set_ticklabels.
The following code sample demonstrates the manual creation of LinearSegmentedColormap, and includes comments on how to specify boundaries if using a cubehelix_palette instead.
import matplotlib.pyplot as plt
import pandas
import seaborn as sns
from matplotlib.colors import LinearSegmentedColormap
sns.set(font_scale=0.8)
dataFrame = pandas.read_csv('LUH2_trans_matrix.csv').set_index(['Unnamed: 0'])
# For only three colors, it's easier to choose them yourself.
# If you still really want to generate a colormap with cubehelix_palette instead,
# add a cbar_kws={"boundaries": numpy.linspace(-1, 1, 4)} to the heatmap invocation
# to have it generate a discrete colorbar instead of a continuous one.
myColors = ((0.8, 0.0, 0.0, 1.0), (0.0, 0.8, 0.0, 1.0), (0.0, 0.0, 0.8, 1.0))
cmap = LinearSegmentedColormap.from_list('Custom', myColors, len(myColors))
ax = sns.heatmap(dataFrame, cmap=cmap, linewidths=.5, linecolor='lightgray')
# Manually specify colorbar labelling after it's been generated
colorbar = ax.collections[0].colorbar
colorbar.set_ticks([-0.667, 0, 0.667])
colorbar.set_ticklabels(['B', 'A', 'C'])
# X - Y axis labels
ax.set_ylabel('FROM')
ax.set_xlabel('TO')
# Only y-axis labels need their rotation set, x-axis labels already have a rotation of 0
_, labels = plt.yticks()
plt.setp(labels, rotation=0)
plt.show()
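The same idea can be sketched with plain matplotlib, which may make the mapping from values to colors easier to follow. This is only an illustration under assumptions: the 4x4 matrix is made-up data standing in for LUH2_trans_matrix.csv, the color names are arbitrary, and ListedColormap with BoundaryNorm is swapped in for the LinearSegmentedColormap used above.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

# Made-up matrix containing only the three discrete values -1, 0 and 1
data = np.array([[-1, 0, 1, 0],
                 [0, 1, -1, 1],
                 [1, -1, 0, 0],
                 [0, 0, 1, -1]])

# One color per value; the boundaries sit halfway between the values
cmap = ListedColormap(["firebrick", "lightgray", "steelblue"])
norm = BoundaryNorm([-1.5, -0.5, 0.5, 1.5], cmap.N)

fig, ax = plt.subplots()
im = ax.imshow(data, cmap=cmap, norm=norm)

# Put one tick at each value and relabel: -1 -> B, 0 -> A, 1 -> C
cbar = fig.colorbar(im, ax=ax, ticks=[-1, 0, 1])
cbar.set_ticklabels(["B", "A", "C"])
ax.set_ylabel("FROM")
ax.set_xlabel("TO")
```

Because the boundaries bracket each value, every cell falls into exactly one color bin and the ticks land in the middle of their color segments. Call plt.show() to display the result.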
Here's a simple solution based on the other answers that generalizes beyond 3 categories and uses a dict (vmap) to define the labels.
import seaborn as sns
import numpy as np
# This just makes some sample 2D data and a corresponding vmap dict with labels for the values in the data
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
vmap = {i: chr(65 + i) for i in range(len(np.ravel(data)))}
n = len(vmap)
print(vmap)
cmap = sns.color_palette("deep", n)
ax = sns.heatmap(data, cmap=cmap)
# Get the colorbar object from the Seaborn heatmap
colorbar = ax.collections[0].colorbar
# The list comprehension calculates the positions to place the labels to be evenly distributed across the colorbar
r = colorbar.vmax - colorbar.vmin
colorbar.set_ticks([colorbar.vmin + 0.5 * r / (n) + r * i / (n) for i in range(n)])
colorbar.set_ticklabels(list(vmap.values()))
I find that a discretized colorbar in seaborn is much easier to create if you use a ListedColormap. There's no need to define your own functions, just add a few lines to basically customize your axes.
import pandas
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.colors import ListedColormap
# Read in csv file
df_trans = pandas.read_csv('LUH2_trans_matrix.csv')
sns.set(font_scale=0.8)
# cmap is now a list of colors
cmap = sns.cubehelix_palette(start=2.8, rot=.1, light=0.9, n_colors=3)
df_trans = df_trans.set_index(['Unnamed: 0'])
# Create two appropriately sized subplots
grid_kws = {'width_ratios': (0.9, 0.03), 'wspace': 0.18}
fig, (ax, cbar_ax) = plt.subplots(1, 2, gridspec_kw=grid_kws)
ax = sns.heatmap(df_trans, ax=ax, cbar_ax=cbar_ax, cmap=ListedColormap(cmap),
linewidths=.5, linecolor='lightgray',
cbar_kws={'orientation': 'vertical'})
# Customize tick marks and positions
cbar_ax.yaxis.set_ticks([ 0.16666667, 0.5, 0.83333333])
cbar_ax.set_yticklabels(['B', 'A', 'C'])
# X - Y axis labels
ax.set_ylabel('FROM')
ax.set_xlabel('TO')
# Rotate tick labels
locs, labels = plt.xticks()
plt.setp(labels, rotation=0)
locs, labels = plt.yticks()
plt.setp(labels, rotation=0)
The link provided by @Fabio Lamanna is a great start.
From there, you still want to set the colorbar labels in the correct locations and use tick labels that correspond to your data.
Assuming that you have equally spaced levels in your data, this produces a nice discrete colorbar:
Basically, this comes down to turning off the seaborn colorbar and replacing it with a discretized colorbar yourself.
import pandas
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import matplotlib
def cmap_discretize(cmap, N):
    """Return a discrete colormap from the continuous colormap cmap.
    cmap: colormap instance, eg. cm.jet.
    N: number of colors.
    Example
        x = resize(arange(100), (5,100))
        djet = cmap_discretize(cm.jet, 5)
        imshow(x, cmap=djet)
    """
    if type(cmap) == str:
        cmap = plt.get_cmap(cmap)
    colors_i = np.concatenate((np.linspace(0, 1., N), (0.,0.,0.,0.)))
    colors_rgba = cmap(colors_i)
    indices = np.linspace(0, 1., N+1)
    cdict = {}
    for ki,key in enumerate(('red','green','blue')):
        cdict[key] = [ (indices[i], colors_rgba[i-1,ki], colors_rgba[i,ki]) for i in range(N+1) ]
    # Return colormap object.
    return matplotlib.colors.LinearSegmentedColormap(cmap.name + "_%d"%N, cdict, 1024)
def colorbar_index(ncolors, cmap, data):
    """Put the colorbar labels in the correct positions
    using unique levels of data as tickLabels
    """
    cmap = cmap_discretize(cmap, ncolors)
    mappable = matplotlib.cm.ScalarMappable(cmap=cmap)
    mappable.set_array([])
    mappable.set_clim(-0.5, ncolors+0.5)
    # pass the current axes explicitly so the colorbar knows where to take its space from
    colorbar = plt.colorbar(mappable, ax=plt.gca())
    colorbar.set_ticks(np.linspace(0, ncolors, ncolors))
    colorbar.set_ticklabels(np.unique(data))
# Read in csv file
df_trans = pandas.read_csv('d:/LUH2_trans_matrix.csv')
sns.set(font_scale=0.8)
cmap = sns.cubehelix_palette(n_colors=3,start=2.8, rot=.1, light=0.9, as_cmap=True)
cmap.set_under('gray') # 0 values in activity matrix are shown in gray (inactive transitions)
df_trans = df_trans.set_index(['Unnamed: 0'])
N = df_trans.max().max() - df_trans.min().min() + 1
f, ax = plt.subplots()
ax = sns.heatmap(df_trans, cmap=cmap, linewidths=.5, linecolor='lightgray',cbar=False)
colorbar_index(ncolors=N, cmap=cmap,data=df_trans)
# X - Y axis labels
ax.set_ylabel('FROM')
ax.set_xlabel('TO')
# Rotate tick labels
locs, labels = plt.xticks()
plt.setp(labels, rotation=0)
locs, labels = plt.yticks()
plt.setp(labels, rotation=0)
# revert matplotlib params
sns.reset_orig()
bits and pieces recycled and adapted from here and here

Aligning two combined plots - Matplotlib

I'm currently working on a plot in which I show two datasets combined.
I plot them with the following code:
plt.figure()
# Data 1
data = plt.cm.binary(data1)
data[..., 3] = 1.0 * (data1 > 0.0)
fig = plt.imshow(data, interpolation='nearest', cmap='binary', vmin=0, vmax=1, extent=(-4, 4, -4, 4))
# Plotting just the nonzero values of data2
x = numpy.linspace(-4, 4, 11)
y = numpy.linspace(-4, 4, 11)
data2_x = numpy.nonzero(data2)[0]
data2_y = numpy.nonzero(data2)[1]
pts = plt.scatter(x[data2_x], y[data2_y], marker='s', c=data2[data2_x, data2_y])
And this gives me this plot:
As can be seen in the image, my background and foreground squares are not aligned.
Both of them have the same dimensions (20 x 20). I would like a way, if possible, to align them center to center or corner to corner, or at least to have some consistent alignment.
In some grid cells it seems that I have bottom-right corner alignment, in others bottom-left corner alignment, and in others no alignment at all, which degrades the visualization.
Any help would be appreciated.
Thank you.
As tcaswell says, your problem may be easiest to solve by defining the extent keyword for imshow.
If you give the extent keyword, the outermost pixel edges will be at the extents. For example:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(np.random.random((8, 10)), extent=(2, 6, -1, 1), interpolation='nearest', aspect='auto')
Now it is easy to calculate the center of each pixel. In X direction:
the inter-pixel distance is (6-2) / 10 = 0.4 data units
the center of the leftmost pixel is half a pixel away from the left edge: 2 + 0.4/2 = 2.2
Similarly, the Y centers are at -.875 + n * 0.25.
So, by tuning the extent you can get your pixel centers wherever you want them.
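The arithmetic above can be checked with a few lines of NumPy. pixel_centers is a hypothetical helper written for this illustration, not a matplotlib function; it assumes equally wide pixels whose outermost edges sit exactly at the extent.

```python
import numpy as np

def pixel_centers(lo, hi, n):
    """Centers of n equally wide pixels whose outer edges are at lo and hi."""
    step = (hi - lo) / n  # inter-pixel distance in data units
    return lo + step / 2 + step * np.arange(n)

# extent=(2, 6, -1, 1) with an image of 8 rows and 10 columns
x_centers = pixel_centers(2, 6, 10)
y_centers = pixel_centers(-1, 1, 8)
print(x_centers)
print(y_centers)
```

Points passed to scatter at these coordinates should land in the middle of the corresponding image pixels.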
An example with 20x20 data:
import matplotlib.pyplot as plt
import numpy as np
# create the data to be shown with "scatter"
yvec, xvec = np.meshgrid(np.linspace(-4.75, 4.75, 20), np.linspace(-4.75, 4.75, 20))
sc_data = np.random.random((20,20))
# create the data to be shown with "imshow" (20 pixels)
im_data = np.random.random((20,20))
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(im_data, extent=[-5,5,-5,5], interpolation='nearest', cmap=plt.cm.gray)
ax.scatter(xvec, yvec, 100*sc_data)
Notice that here the inter-pixel distance is the same for both scatter (if you have a look at xvec, all pixels are 0.5 units apart) and imshow (as the image is stretched from -5 to +5 and has 20 pixels, the pixels are .5 units apart).
Here is some code in which there is no alignment problem.
import matplotlib.pyplot as plt
import numpy
data1 = numpy.random.rand(10, 10)
data2 = numpy.random.rand(10, 10)
data2[data2 < 0.4] = 0.0
plt.figure()
# Plotting data1
fig = plt.imshow(data1, interpolation='nearest', cmap='binary', vmin=0.0, vmax=1.0)
# Plotting data2
data2_x = numpy.nonzero(data2)[0]
data2_y = numpy.nonzero(data2)[1]
pts = plt.scatter(data2_x, data2_y, marker='s', c=data2[data2_x, data2_y])
plt.show()
which gives a perfectly aligned combined plot:
Thus the use of additional options in your code might be the reason of the non-alignment of the combined plots.
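One detail that may be worth checking when overlaying scatter on imshow, sketched here on a made-up 3x4 array: np.nonzero returns row indices first and column indices second, while with the default imshow settings (no extent) the pixel in row i, column j is centered at x = j, y = i. Under that assumption, the column indices go to the x argument of scatter and the row indices to y.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up, non-square array so that a row/column mix-up would be visible
data2 = np.array([[0.0, 0.9, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 0.7],
                  [0.5, 0.0, 0.0, 0.0]])

rows, cols = np.nonzero(data2)  # row indices first, then column indices

fig, ax = plt.subplots()
ax.imshow(np.zeros_like(data2), cmap="binary", vmin=0.0, vmax=1.0, interpolation="nearest")
# Columns run along x and rows along y in imshow's default coordinates
ax.scatter(cols, rows, marker="s", c=data2[rows, cols], vmin=0.0, vmax=1.0)
```

With square data the swap is easy to miss, because every marker still sits on a pixel center; it only shows up as a transposed pattern.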
