How to set y-axis for histogram in Python?

According to the documentation, one can set the range of the x-axis using the hist function, but there doesn't seem to be a way to control the y-axis.
I have a figure with 4 subplots, arranged in a 2x2 grid, all of which are histograms. I have made their x-axes identical by setting the range, but have been unable to figure out how to do likewise with the y-axis. When I try to control the y-axis using set_ylim, I get an error. When I tried using pylab.axis, the plots didn't turn out correctly (the bars of the histogram all had a y-value of 0).
pylab.hist(myData[x], bins = 20, range=(0,400))
pylab.axis([0,400,0,300])
How do I control the y-axis of the histogram? Essentially what I'm looking for is something like range in the hist function, but for the y-axis.
Update:
plotNumber = 1
for i in xrange(4):
    pylab.subplot(2, 2, plotNumber)
    pylab.hist(myData[i], bins = 20, range=(0,400))
    pylab.title('Some Title')
    pylab.xlabel('X')
    pylab.ylabel('Y')
    plotNumber += 1
pylab.show()
But when I include
pylab.axis([0,400,0,300])
All the y-values correspond to 0 (the histogram is flat).

Answer is given here: setting y-axis limit in matplotlib
axes = plt.gca()
axes.set_xlim([xmin,xmax])
axes.set_ylim([ymin,ymax])
For me this works for histogram subplots.
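As a minimal sketch of how this could look for the 2x2 case in the question: the data below is randomly generated as a stand-in for myData (an assumption, since the original data is not shown), and the limits 0 to 400 and 0 to 300 are taken from the question. Setting the limits on each Axes object, rather than once on the current axes, is what keeps all four subplots on the same scale.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for myData: four arrays of values in [0, 400)
my_data = [rng.uniform(0, 400, size=500) for _ in range(4)]

fig, axes = plt.subplots(2, 2)
for ax, values in zip(axes.flat, my_data):
    ax.hist(values, bins=20, range=(0, 400))
    ax.set_xlim(0, 400)
    ax.set_ylim(0, 300)  # same y-range on every subplot
    ax.set_title("Some Title")
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
fig.tight_layout()
```

Calling plt.show() afterwards (with an interactive backend) should display four histograms that share both axis ranges.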

If you're looking to set ticks on the y-axis every n values, you can use:
pylab.yticks(range(min, max, n))
I am using Python 2.7.
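A rough Python 3 version of the same idea, using the Axes interface instead of pylab. The names y_min, y_max and step are illustrative placeholders (not from the post), chosen so they don't shadow the built-in min and max, and the data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical tick settings: a tick every 50 counts from 0 to 300
y_min, y_max, step = 0, 300, 50

fig, ax = plt.subplots()
ax.hist(np.random.default_rng(1).uniform(0, 400, size=500), bins=20, range=(0, 400))
ax.set_ylim(y_min, y_max)
# np.arange excludes the stop value, so add one step to include y_max itself
ax.set_yticks(np.arange(y_min, y_max + step, step))
tick_positions = [float(t) for t in ax.get_yticks()]
```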

Related

How to use the same width for numbers in y axis?

I am drawing some graphs and I want to import them into LaTeX in a 2-by-2 layout. One of the problems is that the values on the y-axis of one graph range from 1 to 6, while for another graph they range from 1 to 200. Because of that, the graphs do not look good when I import them into my document. Is there any way to set the same width for the values on the y-axis?

You can set the y axis limits using ax.set_ylim or plt.ylim:
# Set axis from 1 to 200
ax.set_ylim((1,200))
# Or just set it directly - this will also act on the current axis
plt.ylim((1,200))
Edit: The question is about widths rather than limits.
I think making the subplots together on one figure should solve this problem.
plt.figure()
plt.subplot(2,2,1)
plt.plot(x1,y1)
.
.
plt.subplot(2,2,4)
plt.plot(x4,y4)

How to center the histogram bars around tick marks using seaborn displot? Stacking bars is essential

I have looked at many ways of centering histogram bars on tick marks, but I have not been able to find a solution that works with seaborn displot. The displot function lets me stack the histogram according to a column in the dataframe, so I would prefer a solution using displot, or something else that allows stacking based on a dataframe column with color-coding, as with palette.
Even after setting the tick values, I am not able to get the bars to center on the tick marks.
Example code
# Center the histogram on the tick marks
tips = sns.load_dataset('tips')
sns.displot(x="total_bill",
            hue="day", multiple = 'stack', data=tips)
plt.xticks(np.arange(0, 50, 5))
I would also like to plot a histogram of a variable that takes a single value and choose the bin width of the resulting histogram in such a way that it is centered around the value. (0.5 in this example.)
I can get the center point by choosing the number of bins equal to a number of tick marks but the resulting bar is very thin. How can I increase the bin size in this case, where there is only one bar but want to display all the other possible points. By displaying all the tick marks, the bar width is very tiny.
I want the same centering of the bar at the 0.5 tick mark but make it wider as it is the only value for which counts are displayed.
Any solutions?
tips['single'] = 0.5
sns.displot(x='single',
            hue="day", multiple = 'stack', data=tips, bins = 10)
plt.xticks(np.arange(0, 1, 0.1))
Edit:
Would it be possible to have more control over the tick marks in the second case? I would rather not display values rounded to 1 decimal place, but instead choose which tick marks to display. Is it possible to display just one tick value and have the bar centered on it?
Do min_val and max_val in this case refer to the value of the variable, which would be 0 here? If so, would the x-axis then extend into negative values even though there are none? I don't want to display them.

For your first problem, you may want to figure out a few properties of the data that you're plotting, for example the range of the data. Additionally, you may want to choose beforehand the number of bins that you want displayed.
tips = sns.load_dataset('tips')
min_val = tips.total_bill.min()
max_val = tips.total_bill.max()
val_width = max_val - min_val
n_bins = 10
bin_width = val_width/n_bins
sns.histplot(x="total_bill",
             hue="day", multiple = 'stack', data=tips,
             bins=n_bins, binrange=(min_val, max_val),
             palette='Paired')
plt.xlim(0, 55) # Define x-axis limits
Another thing to remember is that the width of a bar in a histogram identifies the bounds of its range. So a bar spanning [2,5] on the x-axis implies that the values represented by that bar belong to that range.
Considering this, it is easy to formulate a solution. If we want ticks at the bounds of each bar, as in the original plot, one solution may look like
plt.xticks(np.arange(min_val-bin_width, max_val+bin_width, bin_width))
Now, if we offset the ticks by half a bin-width, we will get to the centers of the bars.
plt.xticks(np.arange(min_val-bin_width/2, max_val+bin_width/2, bin_width))
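The edge-versus-center arithmetic can be checked without seaborn. The sketch below is an illustration, not the code from this answer: it uses randomly generated values as a stand-in for tips.total_bill, plain matplotlib hist instead of histplot, and np.linspace instead of np.arange for the edges (a swapped-in choice, because np.arange with a floating-point step can gain or lose an endpoint).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
values = rng.uniform(3, 51, size=244)  # hypothetical stand-in for tips.total_bill

n_bins = 10
# n_bins bars need n_bins + 1 edges between the data minimum and maximum
edges = np.linspace(values.min(), values.max(), n_bins + 1)
# The center of each bar is the midpoint of its two edges
centers = (edges[:-1] + edges[1:]) / 2

fig, ax = plt.subplots()
ax.hist(values, bins=edges)
ax.set_xticks(centers)  # one tick under the middle of each bar
ax.set_xticklabels([f"{c:.1f}" for c in centers])
```

Each center sits exactly half a bin width to the right of its bar's left edge, which is the same offset as in the plt.xticks call above.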
For your single value plot, the idea remains the same: control the bin width, the x-axis range, and the ticks. The bin width has to be set explicitly, since the automatically inferred bin width will probably be 1 unit wide, which on the plot will have no visible thickness. Histogram bars always indicate a range, even when we have just one single value. This is illustrated in the following example and figure.
from matplotlib.ticker import FormatStrFormatter # Needed for the tick-label format below
single_val = 23.5
tips['single'] = single_val
bin_width = 4
fig, axs = plt.subplots(1, 2, sharey=True, figsize=(12,4)) # Get 2 subplots
# Case 1 - With the single value as x-tick label on subplot 0
sns.histplot(x='single',
             hue="day", multiple = 'stack', data=tips,
             binwidth=bin_width, binrange=(single_val-bin_width, single_val+bin_width),
             palette='rocket',
             ax=axs[0])
ticks = [single_val, single_val+bin_width] # 2 ticks - given value and given_value + width
axs[0].set(
    title='Given value as tick-label starts the bin on x-axis',
    xticks=ticks,
    xlim=(0, int(single_val*2)+bin_width)) # x-range such that bar is at middle of x-axis
axs[0].xaxis.set_major_formatter(FormatStrFormatter('%.1f'))
# Case 2 - With centering on the bin starting at single-value on subplot 1
sns.histplot(x='single',
             hue="day", multiple = 'stack', data=tips,
             binwidth=bin_width, binrange=(single_val-bin_width, single_val+bin_width),
             palette='rocket',
             ax=axs[1])
ticks = [single_val+bin_width/2] # Just the bin center
axs[1].set(
    title='Bin centre is offset from single_value by bin_width/2',
    xticks=ticks,
    xlim=(0, int(single_val*2)+bin_width)) # x-range such that bar is at middle of x-axis
axs[1].xaxis.set_major_formatter(FormatStrFormatter('%.1f'))
Output:
From your description, I think what you really mean by a bar graph is a categorical bar graph. The centering is then automatic, because the bar is no longer a range but a discrete category. Given the numeric and continuous nature of the variable in the example data, I would not recommend such an approach. Pandas provides categorical bar plots (DataFrame.plot.bar). For our example, one way to do this is as follows:
n_colors = len(tips['day'].unique()) # Get number of unique categories
agg_df = tips[['single', 'day']].groupby(['day']).agg(
    val_count=('single', 'count'),
    val=('single','max')
).reset_index() # Get aggregated information along the categories
agg_df.pivot(columns='day', values='val_count', index='val').plot.bar(
    stacked=True,
    color=sns.color_palette("Paired", n_colors), # Choose "number of days" colors from palette
    width=0.05 # Set bar width
)
plt.show()
This yields:

how to use plt.yscale('log') for specific values between 0 and 1?

I need to plot a logarithmic y-axis between 0 and 1 like the graph in the picture.
I need the points on the y-axis to be [0.005,0.010,0.050,0.100,0.500,1], like the graph in the picture. How can I choose which values will show on the axis?

Use plt.yscale('log') to make the y-axis logarithmic and plt.axis([1,10000,0.004,1]) to set the plot borders.
Use plt.yticks([0.005,0.010,0.050,0.100,0.500,1],[0.005,0.010,0.050,0.100,0.500,1]) to choose the values that will be shown. The general form is:
plt.yticks([points],[names])
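Put together, this might look like the sketch below. The curve is invented purely for illustration (the original data is not shown); the tick values and axis borders are the ones from the question and answer. Note that the scale is set before the ticks, since changing the scale resets the tick locator.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: a curve that decays from about 1 towards 0
x = np.linspace(1, 10000, 200)
y = 1.0 / (1.0 + x / 20.0)

ticks = [0.005, 0.010, 0.050, 0.100, 0.500, 1]

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_yscale("log")    # logarithmic y-axis
ax.set_xlim(1, 10000)   # plot borders, as in plt.axis([1, 10000, 0.004, 1])
ax.set_ylim(0.004, 1)
ax.set_yticks(ticks)    # positions of the ticks
ax.set_yticklabels([f"{t:.3f}" for t in ticks])  # text shown at each position
```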

Date labels intersecting

I'm using Matplotlib to plot data on Ubuntu 15.10. My y-axis has numeric values and my x-axis timestamps.
I'm having the problem that the date labels intersect with each other making it look bad. How do I increase the distance between the x-axis ticks/labels to be evenly spaced still? Since the automatic selection of ticks was bad I'm okay with manually setting the amount of date ticks. Any other solution is appreciated, too.
Besides, I'm using the following DateFormatter:
formatter = DateFormatter('%m/%d/%y')
axis = plt.gca()
axis.xaxis.set_major_formatter(formatter)

You could add the following to your code:
plt.gcf().autofmt_xdate()
This automatically formats the x-axis for you (for example, it rotates the labels by about 30 degrees).
You can also manually set the maximum number of ticks shown on your x-axis to keep it from getting crowded, using the following (here ax is the Axes object, e.g. ax = plt.gca()):
max_xticks = 10
xloc = plt.MaxNLocator(max_xticks)
ax.xaxis.set_major_locator(xloc)
I personally use both together as it makes the graph look much nicer when using dates.
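Here is a small self-contained sketch combining the two with the DateFormatter from the question. The timestamps and values are made up for illustration, and MaxNLocator is imported from matplotlib.ticker rather than accessed through plt.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.dates import DateFormatter
from matplotlib.ticker import MaxNLocator

# Hypothetical data: one numeric value per day for 120 days
start = dt.datetime(2016, 1, 1)
timestamps = [start + dt.timedelta(days=i) for i in range(120)]
values = np.arange(120)

fig, ax = plt.subplots()
ax.plot(timestamps, values)
ax.xaxis.set_major_locator(MaxNLocator(nbins=10))         # at most 10 intervals between ticks
ax.xaxis.set_major_formatter(DateFormatter('%m/%d/%y'))   # same format as in the question
fig.autofmt_xdate()                                       # rotate and right-align the date labels
```

The locator and formatter are set after plotting, because plotting dates installs matplotlib's default date locator and formatter on the axis.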

You can simply set the locations you want to be labeled:
axis.set_xticks(x[[0, int(len(x)/2), -1]])
where x would be your array of timestamps

how to set unequal x axis intervals in Matplotlib

Right now I simply use plt.plot(x,y1,'b.-') to plot a figure, but so many data points fall between 0 and 10 on the x-axis that I want to set the x-axis ticks like this: 0,1,5,10,100,1000,100000
so that the dense data between 0 and 10 is more spread out.
How can I do this in Python? I am using Matplotlib.

0,1,5,10,100,1000,100000?
If you can live with (0.01, 0.1,), 1, 10, 100, 1000, 10000, 100000,… - then change the xscale to log:
plt.xscale('log')

See the accepted answer to the question How do I convert (or scale) axis values and redefine the tick frequency in matplotlib? Essentially, the matplotlib.pyplot.xticks command can be used to control the location and labels of the tick marks.
However, your data will still be plotted on a linear scale, so this won't stretch out the data between 0 and 10. You will need to use a different axis scaling to do this, using, for example, set_xscale.
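One possible combination of the two ideas, as a sketch only: a plain log scale cannot show x = 0, so this uses matplotlib's 'symlog' scale, which is linear near zero and logarithmic beyond a threshold. That choice is mine, not something from the answers above. The data is invented, and the linthresh keyword assumes matplotlib 3.3 or newer (older versions call it linthreshx).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data, dense between 0 and 10 and sparse beyond
x = np.array([0, 1, 2, 3, 5, 8, 10, 50, 100, 1000, 100000], dtype=float)
y1 = np.sqrt(x)

ticks = [0, 1, 5, 10, 100, 1000, 100000]  # tick values requested in the question

fig, ax = plt.subplots()
ax.plot(x, y1, 'b.-')
ax.set_xscale('symlog', linthresh=1)  # linear on [-1, 1], logarithmic outside
ax.set_xticks(ticks)
ax.set_xticklabels([str(t) for t in ticks])
```

The spacing between ticks will still follow the scale (roughly even per decade), not be perfectly uniform, but the points between 0 and 10 get far more room than on a linear axis.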
