How to use the same width for numbers in y axis? - python

I am drawing some graphs and I want to import them into LaTeX in a 2-by-2 layout. One problem is that the y-axis values for one graph range from 1 to 6, while for another they range from 1 to 200. Because of that, the graphs do not look good when I import them into my document. Is there any way to give the y-axis values the same width on every graph?

You can set the y axis limits using ax.set_ylim or plt.ylim:
# Set axis from 1 to 200
ax.set_ylim((1,200))
# Or just set it directly - this will also act on the current axis
plt.ylim((1,200))
Edit: The question is about widths rather than limits.
I think making the subplots together on one figure should solve this problem.
plt.figure()
plt.subplot(2,2,1)
plt.plot(x1,y1)
.
.
plt.subplot(2,2,4)
plt.plot(x4,y4)
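If the graphs have to stay in separate files, one possible approach (a sketch, not something from the answer above) is to give every figure the same size and the same explicit axes rectangle, so the plot area does not move when the tick labels get wider. The data, FIGSIZE and AXES_RECT below are made-up values for illustration; the left margin is assumed to be wide enough for the widest labels.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical data: one series spans 1..6, the other 1..200.
datasets = {
    "small_range": [1, 3, 2, 6, 4],
    "large_range": [1, 50, 120, 200, 80],
}

# Same figure size and same axes rectangle for every plot.
# The rectangle is (left, bottom, width, height) in figure fractions.
FIGSIZE = (3.0, 2.5)
AXES_RECT = [0.20, 0.15, 0.75, 0.80]

figures = {}
for name, values in datasets.items():
    fig = plt.figure(figsize=FIGSIZE)
    ax = fig.add_axes(AXES_RECT)  # fixed position, independent of label width
    ax.plot(range(len(values)), values)
    figures[name] = (fig, ax)
    # When saving, avoid bbox_inches="tight": it crops each file differently.
    # fig.savefig(name + ".pdf")
```

With this layout the axes box occupies the same place in every saved file, so the four images line up when placed in a 2-by-2 grid in LaTeX.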

Related

How to set y-axis for histogram in Python?

According to the documentation, one can set the range of the x-axis using the hist function, but there doesn't seem to be a way to control the y-axis.
I have a figure with 4 subplots, arranged in a 2x2 grid, all of which are histograms. I have made their x-axes identical by setting the range, but I have been unable to work out how to do the same for the y-axis. When I try to control the y-axis using set_ylim, I get an error. When I tried pylab.axis, the plots didn't turn out correctly (the bars of the histogram all had a y-value of 0).
pylab.hist(myData[x], bins = 20, range=(0,400))
pylab.axis([0,400,0,300])
How do I control the y-axis of the histogram? Essentially what I'm looking for is something like range in the hist function, but for the y-axis.
Update:
plotNumber = 1
for i in xrange(4):
    pylab.subplot(2, 2, plotNumber)
    pylab.hist(myData[i], bins = 20, range=(0,400))
    pylab.title('Some Title')
    pylab.xlabel('X')
    pylab.ylabel('Y')
    plotNumber += 1
pylab.show()
But when I include
pylab.axis([0,400,0,300])
All the y-values correspond to 0 (the histogram is flat).
Answer is given here: setting y-axis limit in matplotlib
axes = plt.gca()
axes.set_xlim([xmin,xmax])
axes.set_ylim([ymin,ymax])
For me this works for histogram subplots.
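A minimal sketch of how this might look for a 2x2 grid of histograms, using the object-oriented interface so that the limits are applied to every subplot rather than only the current one. The data are randomly generated placeholders, and the limits 0..400 and 0..300 are taken from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for myData: four arrays of values between 0 and 400.
rng = np.random.RandomState(0)
my_data = [rng.uniform(0, 400, size=200) for _ in range(4)]

fig, axes = plt.subplots(2, 2)
for ax, values in zip(axes.ravel(), my_data):
    ax.hist(values, bins=20, range=(0, 400))
    ax.set_xlim(0, 400)
    ax.set_ylim(0, 300)  # set the limits after the histogram has been drawn
    ax.set_title('Some Title')
```

Passing sharey=True to plt.subplots is another option when all panels should simply share whatever y-range the data need.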
If you're looking to set ticks on the y-axis every n values, you can use:
pylab.yticks(range(ymin, ymax, n))
I am using Python 2.7.

grids of graphs in matplotlib

Using the AXIS notation for matplotlib has allowed me to manually plot a grid of 2x2 or 3x3 or whatever size grid (if I know what size grid I want beforehand.)
However, how do you determine automatically what size grid is needed? For example, what if you don't know how many unique values are in the column that you want to graph?
I am thinking there must be a way of doing this in a loop and figuring out based on the number of unique values in the column this is how big the graph needs to be.
Example
When I plot this, for some reason it doesn't show month_name on the x axis (as in Jan, Feb, Mar, etc.):
avg_all_account.plot(legend=False,subplots=True,x='month_date',figsize=(10,20))
plt.xlabel('month')
plt.ylabel('number of proposals')
Yet when I plot subplots on a figure and specify the x-axis parameter x='month_name', the month name appears on the plot:
f = plt.figure()
f.set_figheight(8)
f.set_figwidth(8)
f.sharex=True
f.sharey=True
#graph1 = f.add_subplot(2,2,1)
avg_all_account.ix[0:,['month_date','number_open_proposals_all']].plot(ax=f.add_subplot(331),legend=False,subplots=True,x='month_date',y='number_open_proposals_all',title='open proposals')
plt.xlabel('month')
plt.ylabel('number of proposals')
Because the subplot method worked and showed the month_name on the x axis along with my x- and y-axis labels, I would like to know: how can I work out how many subplots I need without first calculating it by hand, writing out each line, and hard-coding each subplot position?
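One way to approach this (a sketch, with hypothetical helper names grid_shape and subplot_positions) is to derive a near-square grid from the number of plots and then generate the subplot positions in a loop.

```python
import math


def grid_shape(n_plots):
    """Return (rows, cols) of a near-square grid with at least n_plots cells."""
    if n_plots < 1:
        raise ValueError("need at least one plot")
    cols = int(math.ceil(math.sqrt(n_plots)))
    rows = int(math.ceil(n_plots / float(cols)))
    return rows, cols


def subplot_positions(n_plots):
    """Return (rows, cols, index) triples, one per plot, for add_subplot."""
    rows, cols = grid_shape(n_plots)
    return [(rows, cols, index) for index in range(1, n_plots + 1)]
```

Assuming the columns to plot are collected in a list, each triple can be passed on as f.add_subplot(rows, cols, index) and the resulting axes handed to the pandas plot call through its ax argument, so no position has to be hard-coded.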

how to set bounds for the x-axis in one figure containing multiple matplotlib histograms and create just one column of graphs?

I am struggling to set xlim for each histogram and to arrange the graphs in a single column so that the x-axis ticks are aligned. Being new to pandas, I am unsure how to apply the answer to this question: Overlaying multiple histograms using pandas.
>from pandas import DataFrame, read_csv
>import matplotlib.pyplot as plt
>import pandas as pd
>df=DataFrame({'score0':[0.047771,0.044174,0.044169,0.042892,0.036862,0.036684,0.036451,0.035530,0.034657,0.033666],
'score1':[0.061010,0.054999,0.048395,0.048327,0.047784,0.047387,0.045950,0.045707,0.043294,0.042243]})
>print df
score0 score1
0 0.047771 0.061010
1 0.044174 0.054999
2 0.044169 0.048395
3 0.042892 0.048327
4 0.036862 0.047784
5 0.036684 0.047387
6 0.036451 0.045950
7 0.035530 0.045707
8 0.034657 0.043294
9 0.033666 0.042243
>df.hist()
>plt.xlim(-1.0,1.0)
The result sets only one of the bounds on the x-axis to be [-1,1].
I'm very familiar with ggplot in R and am just trying out pandas/matplotlib in Python. I'm open to suggestions for better plotting ideas. Any help would be greatly appreciated.
update #1 (#ct-zhu):
I have tried the following, but the xlim edit on the subplot does not seem to translate the bin widths across the new x-axis values. As a result, the graph now has odd bin widths and still has more than one column of graphs:
for array in df.hist(bins=10):
    for subplot in array:
        subplot.set_xlim((-1,1))
update #2:
Getting closer with the use of layout, but the width of bins does not equal the interval length divided by bin count. In the example below, I set bins=10. Hence, the width of each bin over the interval from [-1,1] should be 2/10=0.20; however, the graph does not have any bins with a width of 0.20.
for array in df.hist(layout=(2,1),bins=10):
    for subplot in array:
        subplot.set_xlim((-1,1))
There are two subplots, and you can access each of them and modify them separately:
ax_list=df.hist()
ax_list[0][0].set_xlim((0,1))
ax_list[0][1].set_xlim((0.01, 0.07))
What you are doing, by plt.xlim, changes the limit of the current working axis only. In this case, it is the second plot which is the most recently generated.
Edit:
To arrange the plots in 2 rows and 1 column, use the layout argument. To make the bin edges align, use the bins argument. Setting the x limits to (-1, 1) is probably not a good idea, since your numbers are all small.
ax_list=df.hist(layout=(2,1),bins=np.histogram(df.values.ravel())[1])
ax_list[0][0].set_xlim((0.01, 0.07))
ax_list[1][0].set_xlim((0.01, 0.07))
Or specify exactly 10 bins between -1 and 1 (10 bins need 11 edges):
ax_list=df.hist(layout=(2,1),bins=np.linspace(-1,1,11))
ax_list[0][0].set_xlim((-1,1))
ax_list[1][0].set_xlim((-1,1))
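The difference between an integer bins value and explicit bin edges can be checked with NumPy alone. The sketch below uses the score0 column from the question: an integer count divides only the data range (minimum to maximum) into bins, which would explain why changing xlim afterwards never produces bins of width 0.20, while explicit edges fix the width directly.

```python
import numpy as np

# The score0 column from the question.
scores = np.array([0.047771, 0.044174, 0.044169, 0.042892, 0.036862,
                   0.036684, 0.036451, 0.035530, 0.034657, 0.033666])

# Integer bins: the edges span only the data range (min to max).
_, auto_edges = np.histogram(scores, bins=10)
auto_width = auto_edges[1] - auto_edges[0]

# Explicit edges: 11 edges from -1 to 1 give 10 bins of width 0.2.
fixed_edges = np.linspace(-1, 1, 11)
counts, _ = np.histogram(scores, bins=fixed_edges)
fixed_width = fixed_edges[1] - fixed_edges[0]
```

Here auto_width is the data range divided by 10, whereas fixed_width is 0.2. With such a wide grid all ten scores land in a single bin, which is why narrower limits suit this data better.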

're-sort' / adapt ticks of matshow matrix plot

I tried hard, but I'm stuck with matplotlib here. Please bear in mind that the mpl docs are a bit confusing to me. My question concerns the following:
I draw a symmetrical n*n matrix D with matshow function. That works.
I want to do the same thing, just with different order of (the n) items in D
D = D[:,neworder]
D = D[neworder,:]
Now, how do I make the ticks reflect this neworder, preferably while also using MaxNLocator?
As far as I understand...
set_xticklabels assigns labels to the ticks by order, independently of where the ticks are set?!
set_xticks (mpl docs: 'Set the x ticks with list of ticks') here I'm really not sure what it does. Can somebody explain it precisely? I don't know, whether these functions are helpful in my case at all. Maybe even things are different between using a common xy plot and matshow.
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.gca()
D = np.arange(100).reshape(10,10)
neworder = np.arange(10)
np.random.shuffle(neworder)
D = D[:,neworder]
D = D[neworder, :]
# modify ticks somehow...
ax.matshow(D)
plt.show()
Referring to Paul's answer, I think I tried something like this. Using neworder to define both the positions and the labels, I added plt.xticks(neworder, neworder) as the tick modifier. For example, with neworder = [9 8 4 7 2 6 3 0 1 5], the order of the labels in the result is correct, but the ticks are not. The labels should show the correct element regardless of where the ticks are set. So where is the mistake?
I think what you want to do is set the labels on the new plot to show the rearranged order of the values. Is that right? If so, you want to keep the tick locations the same, but change the labels:
plt.xticks(np.arange(0,10), neworder)
plt.yticks(np.arange(0,10), neworder)
Edit: Note that these commands must be issued after matshow. This seems to be a quirk of matshow (plot does not show this behaviour, for example). Perhaps it's related to this line from the plt.matshow documentation:
Because of how :func:matshow tries to set the figure aspect ratio to be the
one of the array, if you provide the number of an already
existing figure, strange things may happen.
Perhaps the safest way to go is to issue plt.matshow(D) without first creating a figure, then use plt.xticks and plt.yticks to make adjustments.
Your question also asks about the set_ticks and related axis methods. The same thing can be accomplished using those tools, again after issuing matshow:
ax = plt.gca()
ax.xaxis.set_ticks(np.arange(0,10)) # turn on all tick locations
ax.xaxis.set_ticklabels(neworder) # use neworder for labels
Edit2: The next part of your question is related to setting a max number of ticks. 20 would require a new example. For our example I'll set the max no. of ticks at 2:
ax = plt.gca()
ax.xaxis.set_major_locator(plt.MaxNLocator(nbins=3)) # one less tick than 'bin'
tl = ax.xaxis.get_ticklocs() # get current tick locations
tl[1:-1] = [neworder[int(idx)] for idx in tl[1:-1]] # tick locations are floats, so cast before indexing to find the labels at those locs
ax.xaxis.set_ticklabels(tl) # set the labels
plt.draw()
You are on the right track. The plt.xticks command is what you need.
You can specify the xtick locations and the label at each position with the following command.
labelPositions = np.arange(len(D))
newLabels = ['z','y','x','w','v','u','t','s','q','r']
plt.xticks(labelPositions,newLabels)
You could also specify an arbitrary order for labelPositions, as they will be assigned based on the values in the vector.
labelPositions = [0,9,1,8,2,7,3,6,4,5]
newLabels = ['z','y','x','w','v','u','t','s','q','r']
plt.xticks(labelPositions,newLabels)
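Putting the two answers together, a self-contained sketch could look like the following. It uses the Axes methods rather than plt.xticks, draws with matshow first, keeps the tick locations at 0..9, and changes only the labels. The fixed neworder is the one quoted in the question; converting the labels to strings is a choice made here for clarity.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

D = np.arange(100).reshape(10, 10)
neworder = np.array([9, 8, 4, 7, 2, 6, 3, 0, 1, 5])  # order quoted in the question
D = D[:, neworder]
D = D[neworder, :]

fig, ax = plt.subplots()
ax.matshow(D)  # draw first, then adjust the ticks

positions = np.arange(len(neworder))   # tick locations stay at 0..9
labels = [str(i) for i in neworder]    # each label names the original item
ax.set_xticks(positions)
ax.set_xticklabels(labels)
ax.set_yticks(positions)
ax.set_yticklabels(labels)

# Read the labels back to check what will be displayed.
shown = [t.get_text() for t in ax.get_xticklabels()]
```

Passing neworder as the tick positions, as in plt.xticks(neworder, neworder), pairs every position with its own number, so the labels end up in plain 0..9 order along the axis; keeping the positions in order and permuting only the labels avoids that.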

how to set unequal x axis intervals in Matplotlib

At the moment I simply use plt.plot(x,y1,'b.-') to plot a figure, but so many data points fall between 0 and 10 on the x axis that I want to set the x-axis ticks to 0,1,5,10,100,1000,100000,
so that the dense data between 0 and 10 is more spread out.
How can I do this in Python? I am using Matplotlib.
0,1,5,10,100,1000,100000?
If you can live with (0.01, 0.1,), 1, 10, 100, 1000, 10000, 100000,… - then change the xscale to log:
plt.xscale('log')
See the accepted answer to the question How do I convert (or scale) axis values and redefine the tick frequency in matplotlib? Essentially, the matplotlib.pyplot.xticks command can be used to control the location and labels of the tick marks.
However, your data will still be plotted on a linear scale, so this won't stretch out the data between 0 and 10. To do that you need a different axis scaling, for example via set_xscale.
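The two suggestions can be combined: switch the axis to a log scale and then place ticks at the requested values. The sketch below uses made-up data; note that 0 cannot appear on a logarithmic axis, so the tick list starts at 1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import NullLocator, ScalarFormatter

# Hypothetical data with many points between 1 and 10.
x = [1, 2, 3, 5, 8, 10, 100, 1000, 100000]
y1 = [3, 4, 2, 5, 6, 4, 7, 8, 9]

fig, ax = plt.subplots()
ax.plot(x, y1, 'b.-')
ax.set_xscale('log')  # spreads out the small x values

tick_values = [1, 5, 10, 100, 1000, 100000]
ax.set_xticks(tick_values)                       # ticks only where requested
ax.xaxis.set_major_formatter(ScalarFormatter())  # plain numbers instead of powers of ten
ax.xaxis.set_minor_locator(NullLocator())        # hide the default minor log ticks
```

If the value 0 itself must be shown, a 'symlog' scale is an alternative worth looking at, since it is linear near zero and logarithmic further out.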
