Matplotlib histogram label text crowded - python

I'm making a histogram in matplotlib and the text labels for the bins overlap each other.
I tried to rotate the labels on the x-axis by following another solution:
cuisine_hist = plt.hist(train.cuisine, bins=100)
cuisine_hist.set_xticklabels(rotation=45)
plt.show()
But I get the error message 'tuple' object has no attribute 'set_xticklabels'. Why? How do I solve this problem? Alternatively, how can I "transpose" the plot so the labels are on the vertical axis?

Here you go. I lumped both answers in one example:
import numpy as np
import matplotlib.pyplot as plt
# create figure and axes objects; it is good practice to always start with this
fig, ax = plt.subplots()
# then plot the histogram using the axes
# note that you can change the orientation using a keyword
ax.hist(np.random.rand(100), bins=10, orientation="horizontal")
# get_xticklabels() returns a list of labels, so rotate each one
for tick in ax.get_xticklabels():
    tick.set_rotation(45)
It produces the graph with rotated x-ticks and horizontal histogram.
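If the column holds category names rather than numbers (which train.cuisine appears to, though that is an assumption here), another option is to count the values yourself and draw a horizontal bar chart, so every label gets its own row. A minimal sketch, with a made-up `cuisines` list standing in for the real data:

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical stand-in for train.cuisine
cuisines = ["italian", "mexican", "italian", "thai", "mexican", "italian"]

# Count each category and sort from most to least common
counts = Counter(cuisines)
labels, values = zip(*counts.most_common())

fig, ax = plt.subplots()
ax.barh(labels, values)   # one horizontal bar per category
ax.invert_yaxis()         # put the most common category at the top
ax.set_xlabel("count")
fig.tight_layout()
fig.savefig("cuisine_counts.png")
```

With the labels on the vertical axis there is usually no need to rotate anything.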

The return value of plt.hist is a tuple, so it is not the object that has the set_xticklabels method.
That method belongs to the axes, a matplotlib.axes._subplots.AxesSubplot, which you can get like this:
fig, ax = plt.subplots(1, 1)
cuisine_hist = ax.hist(train.cuisine, bins=100)
# set_xticklabels() requires the labels themselves, so rotate the existing ones with tick_params
ax.tick_params(axis='x', labelrotation=45)
plt.show()
From the "help" of plt.hist:
Returns
-------
n : array or list of arrays
The values of the histogram bins. See *normed* or *density*
bins : array
The edges of the bins. ...
patches : list or list of lists
...
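To make the three return values concrete, here is a small sketch that unpacks them and then rotates the labels through the axes instead. The random data is only a placeholder, and `tick_params(labelrotation=...)` is just one of several ways to rotate:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: 100 random values in [0, 1)
data = np.random.default_rng(0).random(100)

fig, ax = plt.subplots()

# hist returns a tuple: bin counts, bin edges, and the drawn bars
n, bins, patches = ax.hist(data, bins=10)

# Rotation is a property of the axes' tick labels, not of that tuple
ax.tick_params(axis="x", labelrotation=45)
fig.savefig("hist_rotated.png")
```

Ten bins give ten counts, eleven edges, and ten bar patches.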

This might be helpful since it is about rotating labels.
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [1, 4, 9, 6]
labels = ['Frogs', 'Hogs', 'Bogs', 'Slogs']
plt.plot(x, y, 'ro')
# You can specify a rotation for the tick labels in degrees or with keywords.
plt.xticks(x, labels, rotation='vertical')
# Pad margins so that markers don't get clipped by the axes
plt.margins(0.2)
# Tweak spacing to prevent clipping of tick-labels
plt.subplots_adjust(bottom=0.15)
plt.show()
so I think
plt.xticks(x, labels, rotation='vertical')
is the important line right here.

just this simple line would do the trick
plt.xticks(rotation=45)
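When labels are long, rotation alone can leave them looking shifted away from their ticks. A sketch of one way to handle that, assuming a bar chart with made-up category names: right-align the rotated labels so that each one ends at its tick.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Made-up categories and counts, for illustration only
labels = ["southern_us", "cajun_creole", "vietnamese", "mediterranean"]
counts = [12, 7, 9, 4]

fig, ax = plt.subplots()
ax.bar(labels, counts)

# Rotate every x tick label and anchor its right end at the tick
plt.setp(ax.get_xticklabels(), rotation=45, ha="right", rotation_mode="anchor")

fig.tight_layout()  # make room for the rotated labels
fig.savefig("rotated_labels.png")
```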

Related

Relabel axis ticks in seaborn heatmap

I have a seaborn heatmap that I am building from a matrix of values. Each element of the matrix corresponds to an entity that I would like to use as the tick label for each row/col in the matrix.
I tried using the ax.set_xticklabels() function to accomplish this but it seems to do nothing. Here is my code:
type(jr_matrix)
>>> numpy.ndarray
jr_matrix.shape
>>> (15, 15)
short_cols = ['label1','label2',...,'label15'] # list of strings with len 15
fig, ax = plt.subplots(figsize=(13,10))
ax.set_xticklabels(tuple(short_cols)) # i also tried passing a list
ax.set_yticklabels(tuple(short_cols))
sns.heatmap(jr_matrix,
            center=0,
            cmap="vlag",
            linewidths=.75,
            ax=ax,
            norm=LogNorm(vmin=jr_matrix.min(), vmax=jr_matrix.max()))
The plot still has the matrix indices as labels.
Any ideas on how to correctly change these labels would be much appreciated.
Edit: I am doing this using jupyter notebooks if that matters.
You are setting the x and y tick labels of the axis you have just created. You are then plotting the seaborn heatmap which will overwrite the tick labels you have just set.
The solution is to create the heatmap first, then set the tick labels:
fig, ax = plt.subplots(figsize=(13,10))
sns.heatmap(jr_matrix,
            center=0,
            cmap="vlag",
            linewidths=.75,
            ax=ax,
            norm=LogNorm(vmin=jr_matrix.min(), vmax=jr_matrix.max()))
# passing a list is fine, no need to convert to tuples
ax.set_xticklabels(short_cols)
ax.set_yticklabels(short_cols)
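Another route that avoids the ordering problem altogether is to hand the labels to seaborn directly through the `xticklabels` and `yticklabels` arguments of `sns.heatmap`. A minimal sketch, with a small made-up 4x4 matrix in place of jr_matrix and without the LogNorm from the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Made-up stand-ins for jr_matrix and short_cols
matrix = np.arange(16, dtype=float).reshape(4, 4)
short_cols = ["label1", "label2", "label3", "label4"]

fig, ax = plt.subplots()
sns.heatmap(matrix,
            cmap="vlag",
            linewidths=0.75,
            xticklabels=short_cols,   # seaborn applies these itself
            yticklabels=short_cols,
            ax=ax)

# Read the labels back from the axes to confirm what was applied
shown = [t.get_text() for t in ax.get_xticklabels()]
fig.savefig("heatmap_labels.png")
```

Note that for large matrices seaborn may thin out the labels so they do not overlap; with a matrix this small all of them are kept.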

Matplotlib: how to scale the y axis according to the y-value? [duplicate]

I'm trying to create a histogram of a data column and plot it logarithmically (y-axis) and I'm not sure why the following code does not work:
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('foo.bar')
fig = plt.figure()
ax = fig.add_subplot(111)
plt.hist(data, bins=(23.0, 23.5,24.0,24.5,25.0,25.5,26.0,26.5,27.0,27.5,28.0))
ax.set_xlim(23.5, 28)
ax.set_ylim(0, 30)
ax.grid(True)
plt.yscale('log')
plt.show()
Instead of plt.yscale('log') I've also tried adding Log=true in the plt.hist line, and I tried ax.set_yscale('log'), but nothing seems to work. I get either an empty plot, or the y-axis is indeed logarithmic (with the code as shown above) but no data is plotted (no bins).
Try
plt.yscale('log', nonposy='clip')
(from Matplotlib 3.3 on, this keyword is spelled nonpositive instead of nonposy)
http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.yscale
The issue is that the bottom of each bar is at y=0, and the default is to mask out invalid points (log(0) is undefined) when doing the log transformation (there was discussion of changing this, but I don't remember which way it went). So when it tries to draw the rectangles for your bar plot, the bottom edge is masked out and no rectangles are drawn.
The hist function accepts a log parameter.
You can do this:
plt.hist(data, bins=bins, log=True)
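For reference, a self-contained sketch of the log=True route, using randomly generated values in place of the 'foo.bar' file from the question (the distribution is an assumption made only so the example runs). Note that a log axis cannot start at 0, so the lower y limit is set to a small positive number:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Made-up data standing in for np.loadtxt('foo.bar')
data = np.random.default_rng(0).normal(25.5, 1.0, 500)

# Same edges as in the question: 23.0, 23.5, ..., 28.0
bins = np.arange(23.0, 28.5, 0.5)

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(data, bins=bins, log=True)

ax.set_xlim(23.5, 28)
ax.set_ylim(bottom=0.5)  # must be positive on a log axis
ax.grid(True)
fig.savefig("log_hist.png")
```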
np.logspace(0, 1, ...) returns values in [1, 10], logarithmically spaced. In my case xx is a NumPy vector of positive values, so the following code does the trick:
logbins=np.max(xx)*(np.logspace(0, 1, num=1000) - 1)/9
hh,ee=np.histogram(xx, density=True, bins=logbins)
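To see what that formula produces, here is a small sketch with a made-up positive sample for xx (lognormal, chosen only for illustration). Subtracting 1 and dividing by 9 maps the [1, 10] range onto [0, 1], so after scaling the edges run from 0 up to max(xx) with widths that grow along the way:

```python
import numpy as np

# Made-up positive sample standing in for xx
xx = np.random.default_rng(0).lognormal(0.0, 1.0, 1000)

# logspace(0, 1) spans [1, 10]; shift and scale it to span [0, 1]
unit_edges = (np.logspace(0, 1, num=1000) - 1) / 9

# Stretch the edges so the last one sits at the largest value
logbins = np.max(xx) * unit_edges

# density=True normalises the histogram so that it integrates to 1
hh, ee = np.histogram(xx, density=True, bins=logbins)

widths = np.diff(ee)
print(logbins[0], logbins[-1], widths[0] < widths[-1])
```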

How to remove a particular grid line corresponding to a custom xtick on a log scale axis?

I'd like to remove the vertical grid line corresponding to the custom xtick (displayed at x = 71 in the picture below). I could remove the horizontal grid line corresponding to the ytick 701 in the picture below by using a hack: since I have no minor ticks on the y axis, I defined the custom ytick (the one for the line that points toward the maximum and crosses the y axis) as a minor tick, and then I disabled grid lines for minor ticks on the y axis. Unfortunately I cannot use the same hack on the x axis without disabling the grid lines of the minor ticks, and that's something I'd like to avoid at all costs.
Below is a not-so-minimal but still working example.
There are many things I don't understand. The two main ones are why
locs, labels = plt.xticks()
does not return the locs and labels that are plotted, and why the xtick labels are not displayed as 10^x where x = 0, 1, 2 and 3. But that's outside the scope of the original question.
import matplotlib.pyplot as plt
plt.grid(True)
import numpy as np
# Generate data
x_data = np.arange(1, 1000 , 10)
y_data = np.random.lognormal(1e-5, 3, len(x_data))
y_max = max(y_data)
# plot
plt.xscale('log')
import math
ratio_log = math.log(x_data[np.argmax(y_data)]) / math.log(max(x_data)) # I need to do this in order to plot a horizontal red dashed line that points to the max and do not extend any further.
plt.axhline(y=y_max, xmin=0, xmax = ratio_log, color='r', linestyle='--') # horizontal line pointing to the max y value.
axes = plt.gca()
axes.set_xlim([1, max(x_data)]) # Limits for the x axis.
# custom ticks and labels
# First yticks because I'm able to achieve what I seek
axes.set_yticks([int(y_max)], minor=True) # Sets the custom ytick as a minor one.
from matplotlib.ticker import FormatStrFormatter
axes.yaxis.set_minor_formatter(FormatStrFormatter("%.0f"))
axes.yaxis.grid(False, which='minor') # Removes minor yticks grid. Since I only have my custom yticks as a minor one, this will disable only the grid line corresponding to that ytick. That's a hack.
import matplotlib.ticker as plticker
loc = plticker.MultipleLocator(base=y_max / 3.3) # this locator puts ticks at regular intervals. I ensure the y axis ticks look ok.
axes.yaxis.set_major_locator(loc)
# Now xticks. I'm having a lot of difficulty here, unable to remove the grid of a particular custom xticks.
locs, labels = plt.xticks() # Strangely, this doesn't return the locs and labels that are plotted. There are indeed 2 values that aren't displayed in the plot, here 1.00000000e-01 and 1.00000000e+04. I've got to remove them before I can append my custom loc and label.
# This means that if I do: plt.xticks(locs, labels) right here, it would enlarge both the lower and upper limits on the x axis... I fail to see how that's intuitive or useful at all. Might this be a bug?
locs = np.append(locs[1:-1], np.asarray(x_data[np.argmax(y_data)])) # One of the ugliest hack I have ever seen... to get correct ticks and labels.
labels = (str(int(loc)) for loc in locs) # Just visuals to get integers on the axis.
plt.xticks(locs, labels) # updates the xticks and labels.
plt.plot((x_data[np.argmax(y_data)], x_data[np.argmax(y_data)]), (0, y_max), 'r--') # vertical line that points to the max. Non OO way to do it, so a bad way.
plt.plot(x_data, y_data)
plt.savefig('grid_prob.png')
plt.close()
Example picture below (the code outputs a different picture each time it is executed, but the problem appears in all pictures).
Credit for the idea goes to @ImportanceOfBeingErnest, to whom I am extremely grateful.
I removed the x-axis grid with
axes.xaxis.grid(False, which='both')
and then added a grid line for each xtick except the custom one with the following loop:
for loc in locs[1:-1]:
    if loc != x_data[np.argmax(y_data)]:
        plt.axvline(x=loc, color='grey', linestyle='-', linewidth=0.4)
Insert this code just before the line
plt.xticks(locs, labels) # updates the xticks and labels.
Example of output picture below.
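The same idea as a self-contained sketch, written with the object-oriented interface. The data, the decade tick positions and the custom tick at 71 are all chosen for illustration and are not the values from the question's script: the automatic x grid is switched off and a thin grey line is drawn by hand at every tick except the custom one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Made-up data with a single peak at x = 71
x_data = np.arange(1, 1000, 10)
y_data = np.exp(-((x_data - 71) / 50.0) ** 2)
custom_tick = int(x_data[np.argmax(y_data)])

fig, ax = plt.subplots()
ax.set_xscale("log")
ax.plot(x_data, y_data)
ax.set_xlim(1, 1000)

# Keep the automatic grid on y, switch it off entirely on x
ax.yaxis.grid(True)
ax.xaxis.grid(False, which="both")

# Draw grid lines by hand at the regular ticks only
regular_ticks = [1, 10, 100, 1000]
grid_lines = [ax.axvline(x=t, color="grey", linestyle="-", linewidth=0.4)
              for t in regular_ticks]

# The custom tick gets a label but no grid line
all_ticks = regular_ticks + [custom_tick]
ax.set_xticks(all_ticks)
ax.set_xticklabels([str(t) for t in all_ticks])
fig.savefig("grid_without_custom_tick.png")
```

The drawback is the same as in the answer above: the hand-drawn lines do not follow later changes to the ticks, so they have to be redrawn if the tick positions change.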
