Why do matplotlib subplots start with 1 - python

When creating subplots with matplotlib you need to start with 1, while most other python things start with zero. So to create the very first subplot (top left)
ax = fig.add_subplot(3,4,1)
Where I would have expected 0 to be the first subplot
ax = fig.add_subplot(3,4,0)
I've seen the explanation "we got this from matlab" but that seems like a particularly unsatisfying answer.

The answer really is: "it's meant for matlab-compatibility". There is one minor advantage in terms of the shortcut integer notation (subplot(231) instead of subplot(2,3,1)). You can't express a 0-based system that way without using strings instead. However, that shortcut notation is generally a bad idea, and should only ever be used in an interactive scenario where readability isn't a concern.
As @Cong Ma mentioned, in most cases you'd use subplots and index a 2D array instead of the matlab-style numerical system. It's a better approach all around.
For example:
import matplotlib.pyplot as plt
fig, axes = plt.subplots(nrows=2, ncols=3)
axes[0, 0].plot(range(10))
plt.show()
It's not exactly identical, as it also adds all of the subplots, but you can always hide the ones you don't want to be visible (ax.axis('off') or ax.set(visible=False)).
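To make the relationship between the two numbering schemes concrete, here is a minimal sketch. The helper subplot_number is a made-up name for illustration and not part of matplotlib; it only converts a 0-based (row, col) pair into the 1-based, row-major index that add_subplot expects.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

nrows, ncols = 2, 3


def subplot_number(row, col, ncols):
    """Hypothetical helper: 0-based (row, col) -> 1-based add_subplot index."""
    return row * ncols + col + 1


# 0-based approach: build the whole grid, then index it like a NumPy array.
fig, axes = plt.subplots(nrows=nrows, ncols=ncols)
axes[0, 0].plot(range(10))  # top-left subplot
axes[1, 2].axis("off")      # hide the bottom-right subplot

# 1-based approach: add a single subplot at the same top-left position.
fig2 = plt.figure()
ax = fig2.add_subplot(nrows, ncols, subplot_number(0, 0, ncols))
ax.plot(range(10))
```

With interactive use you would drop the matplotlib.use("Agg") line and finish with plt.show().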

Related

Use scientific notation for Python plots by default

Simple question: how do I get Python to use scientific notation in its plots by default? From various posts on SO I can write something like
from numpy import linspace
import matplotlib.pyplot as plt
plt.figure()
plt.plot(linspace(1e6,2e6),linspace(1e6,1e7))
plt.figure()
plt.plot(linspace(8e6,9e6),linspace(2e6,2.5e7))
plt.ticklabel_format(style='sci', axis='both', scilimits=(-2,2))
plt.show()
but ticklabel_format only acts on the last plot generated by matplotlib. (If plt.ticklabel_format() is put at the beginning of the code, I also get a blank figure showing the x,y axes.)
You can modify the default behaviour of matplotlib by editing your "rc" file. See Customizing matplotlib.
In your case, it looks like you could adjust the item:
axes.formatter.limits : -2, 2 # use scientific notation if log10
# of the axis range is smaller than the
# first or larger than the second
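If you'd rather not edit the rc file, the same item can probably be set at runtime through plt.rcParams before any figure is created. This is a sketch that reuses the data from the question; the (-2, 2) limits are just the values from the question, not a recommendation.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from numpy import linspace

# Set the default once; axes created afterwards pick it up.
plt.rcParams["axes.formatter.limits"] = (-2, 2)

plt.figure()
plt.plot(linspace(1e6, 2e6), linspace(1e6, 1e7))

plt.figure()
plt.plot(linspace(8e6, 9e6), linspace(2e6, 2.5e7))
```

Because the setting is a default rather than a call on the current axes, it does not matter which figure was created last.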

pyplot.scatter(dataframe) vs. dataframe.plot(kind='scatter')

I have several pandas dataframes. I want to plot several columns against one another in separate scatter plots, and combine them as subplots in a figure. I want to label each subplot accordingly. I had a lot of trouble with getting subplot labels working, until I discovered that there are two ways of plotting directly from dataframes, as far as I know; see SO and pandasdoc:
ax0 = plt.scatter(df.column0, df.column5)
type(ax0): matplotlib.collections.PathCollection
and
ax1 = df.plot(0,5,kind='scatter')
type(ax1): matplotlib.axes._subplots.AxesSubplot
ax.set_title('title') works on ax1 but not on ax0, which returns
AttributeError: 'PathCollection' object has no attribute 'set_title'
I don't understand why the two separate ways exist. What is the purpose of the first method using PathCollections? The second one was added in 17.0; is the first one obsolete or has it a different purpose?
As you have found, the pandas function returns an axes object. plt.scatter returns the PathCollection it drew instead, but you can still get the axes it was drawn on using the "get current axes" function. For instance:
plot = plt.scatter(df.column0, df.column5)
ax0 = plt.gca()
type(ax0)
< matplotlib.axes._subplots.AxesSubplot at 0x10d2cde10>
A more standard way you might see this is the following:
fig = plt.figure()
ax0 = fig.add_subplot(1, 1, 1)
ax0.scatter(df.column0, df.column5)
At this point you are welcome to do "set" commands such as your set_title.
Hope this helps.
The difference between the two is that they are from different libraries. The first one is from matplotlib, the second one from pandas. They do the same, which is create a matplotlib scatter plot, but the matplotlib version returns a collection of points, whereas the pandas version returns a matplotlib subplot. This makes the matplotlib version a bit more versatile, as you can use the collection of points in another plot.
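For the original goal (several labelled scatter subplots in one figure), one way to sidestep the confusion is to create the axes first and call scatter on each of them. A minimal sketch with made-up data standing in for the dataframe columns:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up values standing in for df.column0, df.column5, etc.
x = [1, 2, 3, 4]
y_a = [10, 20, 15, 30]
y_b = [3, 1, 4, 1]

fig, (ax_left, ax_right) = plt.subplots(1, 2)

# Axes.scatter returns the PathCollection (the markers), not the axes.
points = ax_left.scatter(x, y_a)
ax_left.set_title("left panel")   # titles belong to the axes

ax_right.scatter(x, y_b)
ax_right.set_title("right panel")

# The collection is still useful: it controls how the markers look.
points.set_color("red")
```

The pandas plotting call also accepts an ax= keyword, so df.plot(..., kind='scatter', ax=ax_left) should draw into an axes you created this way.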

python 2D grid plot with origin at left upper corner [duplicate]

How can I flip the origin of a matplotlib plot to be in the upper-left corner - as opposed to the default lower-left? I'm using matplotlib.pylab.plot to produce the plot (though if there is another plotting routine that is more flexible, please let me know).
I'm looking for the equivalent of the matlab command: axis ij;
Also, I've spent a couple hours surfing matplotlib help and google but haven't come up with an answer. Some info on where I could have looked up the answer would be helpful as well.
The easiest way is to use:
plt.gca().invert_yaxis()
Call this after you have plotted the image. The origin keyword works only for imshow.
axis ij just makes the y-axis increase downward instead of upward, right? If so, then matplotlib.axes.Axes.invert_yaxis() might be all you need -- but I can't test that right now.
If that doesn't work, I found a mailing post suggesting that
setp(gca(), 'ylim', reversed(getp(gca(), 'ylim')))
might do what you want to resemble axis ij.
For an image or contour plot, you can use the keyword origin = None | 'lower' | 'upper' and for a line plot, you can set the ylimits high to low.
from pylab import *
A = arange(25)/25.
A = A.reshape((5,5))
figure()
imshow(A, interpolation='nearest', origin='lower')
figure()
imshow(A, interpolation='nearest')
d = arange(5)
figure()
plot(d)
ylim(5, 0)
show()
The following is a basic way to achieve this
ax=pylab.gca()
ax.set_ylim(ax.get_ylim()[::-1])
This
plt.ylim(max(plt.ylim()), min(plt.ylim()))
has an advantage over this
plt.gca().invert_yaxis()
namely that if you are in interactive mode and repeatedly draw the same plot (maybe with updated data and a breakpoint after the plot), the y-axis won't flip back again on every other call.

zorder value to force grid to background [duplicate]

In Matplotlib, I make dashed grid lines as follows:
fig = pylab.figure()
ax = fig.add_subplot(1,1,1)
ax.yaxis.grid(color='gray', linestyle='dashed')
however, I can't find out how (or even if it is possible) to make the grid lines be drawn behind other graph elements, such as bars. Changing the order of adding the grid versus adding other elements makes no difference.
Is it possible to make it so that the grid lines appear behind everything else?
According to this - http://matplotlib.1069221.n5.nabble.com/axis-elements-and-zorder-td5346.html - you can use Axes.set_axisbelow(True)
(I am currently installing matplotlib for the first time, so have no idea if that's correct - I just found it by googling "matplotlib z order grid" - "z order" is typically used to describe this kind of thing (z being the axis "out of the page"))
To me, it was unclear how to apply andrew cooke's answer, so this is a complete solution based on that:
ax.set_axisbelow(True)
ax.yaxis.grid(color='gray', linestyle='dashed')
If you want to apply the setting to all figures, you may set
plt.rc('axes', axisbelow=True)
or
plt.rcParams['axes.axisbelow'] = True
It works for Matplotlib>=2.0.
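Put together with some bars, the set_axisbelow approach looks roughly like this. The bar positions and heights are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Made-up data; any artist that should sit on top of the grid works the same way.
bars = ax.bar([1, 2, 3], [4, 7, 5])

# Draw ticks and grid lines below the other artists on this axes.
ax.set_axisbelow(True)
ax.yaxis.grid(color="gray", linestyle="dashed")
```

The order of the bar and grid calls should not matter here, since the draw order is decided by the axisbelow setting rather than by the order in which the artists were added.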
I had the same problem and the following worked:
[line.set_zorder(3) for line in ax.lines]
fig.show() # to update
Increase 3 to a higher value if it does not work.
You can also set the zorder kwarg in matplotlib.pyplot.grid
plt.grid(which='major', axis='y', zorder=-1.0)
You can try to use one of Seaborn's styles. For instance:
import seaborn as sns
sns.set_style("whitegrid")
Not only will the gridlines go behind the other elements, but the plots also look nicer.
For some (like me) it might be interesting to draw the grid behind only "some" of the other elements. For granular control of the draw order, you can use matplotlib.artist.Artist.set_zorder on the axes directly:
ax.yaxis.grid(color='gray', linestyle='dashed')
ax.set_zorder(3)
This is mentioned in the notes on matplotlib.axes.Axes.grid.

Show only the n'th ticklabel in a pandas boxplot

I am new to pandas and matplotlib, but not to Python. I have two questions; a primary and a secondary one.
Primary:
I have a pandas boxplot with FICO score on the x-axis and interest rate on the y-axis.
My x-axis is all messed up since the FICO scores are overwriting each other.
I'd like to show only every 4th or 5th ticklabel on the x-axis for a couple of reasons:
in general it's less chart-junky
in this case it will allow the labels to actually be read.
My code snippet is as follows:
plt.figure()
loansmin = pd.read_csv('../datasets/loanf.csv')
p = loansmin.boxplot('Interest.Rate','FICO.Score')
I saved the return value in p as I thought I might need to manipulate the plot further which I do now.
Secondary:
How do I access the plot, subplot, axes objects from pandas boxplot.
p above is a matplotlib.axes.AxesSubplot object.
help(matplotlib.axes.AxesSubplot) gives a message saying:
AttributeError: 'module' object has no attribute 'AxesSubplot'
dir(matplotlib.axes) lists Axes, Subplot and Subplotbase as in that namespace but no AxesSubplot. How do I understand this returned object better?
As I explored further I found that one could explore the returned object p via dir().
Doing this I found a long list of useful methods, amongst which was set_xticklabels.
Doing help(p.set_xticklabels) gave some cryptic, but still useful, help - essentially suggesting passing in a list of strings for ticklabels.
I then tried doing the following - adding set_xticklabels to the end of the last line in the above code effectively chaining the invocations.
plt.figure()
loansmin = pd.read_csv('../datasets/loanf.csv')
p=loansmin.boxplot('Interest.Rate','FICO.Score').set_xticklabels(['650','','','','','700'])
This gave the desired result. I suspect there's a better way as in the way matplotlib does it which allows you to show every n'th label. But for immediate use this works, and also allows setting labels where they are not periodic for whatever reason, if you need that.
As usual, writing out the question explicitly helped me find the answer. And if anyone can help me get to the underlying matplotlib object that is still an open question.
AxesSubplot (I think) is just another way to get at the Axes in matplotlib. set_xticklabels() is part of the matplotlib object oriented interface (on axes). So, if you were using something like pylab, you might use xticks(ticks, labels), but instead here you have to separate it into different calls ax.set_xticks(ticks), ax.set_xticklabels(labels). (where ax is an Axes object).
Let's say you only want to set ticks at 650 and 700. You could do the following:
ticks = labels = [650, 700]
plt.figure()
loansmin = pd.read_csv('../datasets/loanf.csv')
p=loansmin.boxplot('Interest.Rate','FICO.Score')
p.set_xticks(ticks)
p.set_xticklabels(labels)
Similarly, you can use set_xlim and set_ylim to do the equivalent of xlim() and ylim() in plt.
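For showing only every n'th label, keep in mind that a boxplot draws its boxes at positions 1, 2, ..., N rather than at the group values, so tick positions have to be given in those units. Below is a sketch using plain matplotlib with randomly generated data; the scores and rates are invented stand-ins for the FICO.Score groups and Interest.Rate values, and n = 4 is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Invented stand-ins for the groups (scores) and the values in each group.
scores = list(range(640, 720, 5))
data = [rng.normal(loc=10, scale=2, size=30) for _ in scores]

fig, ax = plt.subplots()
ax.boxplot(data)  # boxes are placed at x = 1, 2, ..., len(data)

n = 4  # keep every n'th label, blank out the rest
positions = list(range(1, len(scores) + 1))
labels = [str(score) if i % n == 0 else "" for i, score in enumerate(scores)]

ax.set_xticks(positions)
ax.set_xticklabels(labels)
```

The same two calls should work on the axes object returned by the pandas boxplot, since it is a regular matplotlib axes.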
