Basically, when generating plots with matplotlib, the scale on the y-axis goes into the millions. How do I turn on digit grouping (i.e. so that 1000000 displays as 1,000,000) or turn on the decimal separator?
I don't think there's a built-in function to do this. (That's what I thought after I read your question; I just checked and couldn't find one in the documentation.)
In any event, it's easy to roll your own.
(Below is a complete example, i.e. it will generate an mpl plot with one axis having commified tick labels. Only a handful of lines are actually needed for the custom tick labels: the function used to create the labels (plus its import and locale setup), and the lines that create the new labels and place them on the specified axis.)
# first code a function to generate the axis labels you want
# ie, turn numbers greater than 1000 into commified strings (12549 => 12,549)
import locale
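locale.setlocale(locale.LC_ALL, 'en_US')  # locale name is platform-dependent (e.g. 'en_US.UTF-8')
# locale.format() was removed in Python 3.12; locale.format_string() replaces it
fnx = lambda x : locale.format_string("%d", x, grouping=True)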
from matplotlib import pyplot as PLT
import numpy as NP
data = NP.random.randint(15000, 85000, 50).reshape(25, 2)
x, y = data[:,0], data[:,1]
fig = PLT.figure()
ax1 = fig.add_subplot(111)
ax1.plot(x, y, "ro")
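default_xtick = list(range(20000, 100000, 10000))
# these lines are the crux:
# create the custom tick labels (list() because map() is lazy in Python 3)
new_xtick = list(map(fnx, default_xtick))
# fix the tick positions first, then set those labels on the axis
ax1.set_xticks(default_xtick)
ax1.set_xticklabels(new_xtick)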
PLT.show()
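Matplotlib's ticker module also provides StrMethodFormatter, which takes a str.format template with the tick value exposed as x, so the grouping can be done without locale and without fixing the tick positions by hand. A minimal sketch, assuming a reasonably recent matplotlib; the sample data and the Agg backend line are only there so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1_500_000, 3_000_000], "ro")  # made-up sample data

# "{x:,.0f}" formats each tick value with a thousands separator and no decimals
formatter = StrMethodFormatter("{x:,.0f}")
ax.yaxis.set_major_formatter(formatter)

# a formatter can also be called directly to check what a tick would look like
label = formatter(1000000)
```

Because the formatter is applied to whatever ticks the locator picks, the labels stay correct when the plot is zoomed or panned.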
Related
I have a column of data with a very large distribution and thus I log2-transform it before plotting and visualizing it. This works fine but I cannot seem to figure out how to set the y-scale to the exponential values of 2 (instead I have just the exponents themselves).
df['num_ratings_log2'] = df['num_ratings'] + 1
df['num_ratings_log2'] = np.log2(df['num_ratings_log2'])
df.boxplot(column = 'num_ratings_log2', figsize=(10,10))
As the scale, I would like to have 1 (2^0), 32 (2^5), 1024 (2^10) ... instead of 0, 5, 10 ...
I want everything else about the plot to stay the same. How can I achieve this?
Instead of taking the log of the data, you can create a normal boxplot and then set a log scale on the y-axis (ax.set_yscale('log'), or symlog to also represent zero). To get the ticks at powers of 2 (instead of powers of 10), use a LogLocator with base 2. A ScalarFormatter shows the values as regular numbers (instead of as powers such as 2^10). A NullLocator for the minor ticks suppresses undesired extra ticks.
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter, LogLocator, NullLocator
import pandas as pd
import numpy as np
np.random.seed(123)
df = pd.DataFrame({'num_ratings': (np.random.pareto(10, 10000) * 800).astype(int)})
ax = df.boxplot(column='num_ratings', figsize=(10, 10))
ax.set_yscale('symlog') # symlog also allows zero
ax.yaxis.set_major_locator(LogLocator(base=2)) # major ticks at powers of 2
# ax.yaxis.set_major_formatter(ScalarFormatter()) # show tick labels as regular numbers
ax.yaxis.set_major_formatter(lambda x, p: f'{int(x):,}')
ax.yaxis.set_minor_locator(NullLocator()) # remove minor ticks
plt.show()
I hope this is what you are looking for:
ax = df.boxplot(column='num_ratings_log2', figsize=(20,10))
ymin = 0
ymax = 20
ax.set_ylim(2**ymin, 2**ymax)
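If the log2-transformed column is kept as in the question, another option is to leave the plot alone and only relabel the ticks, turning each exponent e into 2^e. A minimal sketch; pow2_label is a hypothetical helper name, not something from matplotlib:

```python
def pow2_label(tick_val, tick_pos=None):
    """Turn a log2-scale tick value (an exponent) into the plain number 2**exponent."""
    # the (value, position) signature is what matplotlib's FuncFormatter expects
    return f"{2 ** tick_val:,.0f}"

# the exponents from the question and the labels they would get
labels = [pow2_label(e) for e in (0, 5, 10)]
```

To use it on the boxplot, something like ax.yaxis.set_major_formatter(FuncFormatter(pow2_label)) should work, optionally with ax.yaxis.set_major_locator(MultipleLocator(5)) to keep ticks at 0, 5, 10, ... (both classes live in matplotlib.ticker). Since the column stores log2(num_ratings + 1), each label is num_ratings + 1 rather than num_ratings itself.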
I have a two-dimensional NumPy array and I plot the first column with the command plt.plot(wp[:, 0]). This shows exactly what I want, and there is nothing I want to change besides the x-axis labelling. For the x axis, I am looking for a command that marks the regions where the value of the second column is the same and also displays the y value of each region.
[x1,y1]
[x2,y2]
[x3,y2]
[x4,y3]
[x5,y3]
[x6,y3]
[x7,y4]
As you can see in my example matrix, the entries in the second column are not unique; instead there are "regions" with the same value.
Edit: So plt.xticks(tx, wp[:,2], rotation='vertical') does work for smaller matrices but looks really ugly for larger ones:
So in my opinion it would be enough if each number occurred just once. Do you know how to do that?
You'll have to:
Customize the number of ticks
Customize what to print for a given tick value
Modified from the examples:
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MaxNLocator
fig = plt.figure()
ax = fig.add_subplot(111)
xs = range(100)
ys = range(100)
def format_fn(tick_val, tick_pos):
    return '{0}'.format(int(tick_val))[:1]
ax.xaxis.set_major_formatter(FuncFormatter(format_fn))
ax.xaxis.set_major_locator(MaxNLocator(nbins=6,integer=True))
ax.plot(xs, ys)
plt.show()
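For the specific request of showing each second-column value only once, one possible approach is to compute a single tick per run of equal values and place it at the centre of that run. A minimal sketch using NumPy only; region_ticks is a hypothetical helper, and it assumes the label column is non-empty and that equal values are contiguous, as in the example matrix:

```python
import numpy as np

def region_ticks(labels):
    """Return (centers, values): one tick position and one label per run of equal values."""
    labels = np.asarray(labels)
    # indices where the value differs from the previous row, i.e. where a new run starts
    change = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [len(labels)]))  # exclusive end of each run
    centers = (starts + ends - 1) / 2               # midpoint of each run, in row indices
    return centers, labels[starts]

# same pattern as the example matrix: y1, y2, y2, y3, y3, y3, y4
centers, values = region_ticks([1, 2, 2, 3, 3, 3, 4])
```

With the array from the question this would be used as centers, values = region_ticks(wp[:, 1]) followed by plt.xticks(centers, values), so each region gets exactly one label.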
I am trying to plot multiple lines in a 3D figure. Each line represents a month: I want them displayed parallel in the y-direction.
My plan was to loop over a set of Y values, but I cannot make this work properly, as using the ax.plot command (see working code below) produces a dozen lines all at the position of the final Y value. Confusingly, swapping ax.plot for ax.scatter does produce a set of parallel lines of data (albeit in the form of a set of dots; ax.view_init set to best display the parallel aspect of the result).
How can I produce a plot with multiple parallel lines?
My current workaround is to replace the loop with a dozen different arrays of Y values, and that can't be the right answer.
from mpl_toolkits.mplot3d.axes3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
# preamble
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
cs = ['r','g','b','y','r','g','b','y','r','g','b','y']
# x axis
X = np.arange(24)
# y axis
y = np.array([15,45,75,105,135,165,195,225,255,285,315,345])
Y = np.zeros(24)
# data - plotted against z axis
Z = np.random.rand(24)
# populate figure
for step in range(0,12):
    Y[:] = y[step]
    # ax.plot(X,Y,Z, color=cs[step])
    ax.scatter(X,Y,Z, color=cs[step])
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
# set initial view of plot
ax.view_init(elev=80., azim=345.)
plt.show()
I'm still learning python, so simple solutions (or, preferably, those with copious explanatory comments) are greatly appreciated.
Use
ax.plot(X, np.array(Y), Z, color=cs[step])
or
Y = [y[step]] * 24
This looks like a bug in mpl: the data is not copied when you hand it in, so every line shares the same np.array object, and when you update that array, all of your lines change with it.
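The aliasing itself can be reproduced with NumPy alone, which may make the effect easier to see. A minimal sketch; the two lists are stand-ins for whatever keeps a reference to the array, and nothing here claims how any particular matplotlib version stores its data:

```python
import numpy as np

y_positions = np.array([15, 45, 75])
shared = np.zeros(4)

aliased = []  # keeps references to the one shared array
copied = []   # keeps an independent copy per step

for y in y_positions:
    shared[:] = y                  # overwrites the same buffer in place
    aliased.append(shared)         # every entry points at that same buffer
    copied.append(shared.copy())   # snapshot of the values at this step

# after the loop, every "aliased" entry shows the last value (75),
# while "copied" still holds 15, 45 and 75
```

In the plotting loop, building a fresh array each time, for example np.full(24, y[step]), avoids the shared buffer entirely.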
I am working on a plot in Python using the matplotlib library. The numbers I have to plot are very big, so the ticks on the axes are large numbers too and take up a lot of space. I tried to present them as powers (for example, instead of a tick labelled 100000000 I want 10^8). I used the command ax.ticklabel_format(style='sci', axis='x', scilimits=(0,4)), but this only produced something like this:
Is there any other solution to have ticks for the plot as: 1 x 10^4, 2 x 10^4, etc or write the value 1e4 as 10^4 at the end of the label's ticks?
You can use the matplotlib.ticker module, and set the ax.xaxis.set_major_formatter to a FuncFormatter.
For example:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
plt.rcParams['text.usetex'] = True
fig,ax = plt.subplots(1)
x = y = np.arange(0,1.1e4,1e3)
ax.plot(x,y)
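def myticks(x,pos):
    if x == 0: return "$0$"
    # floor (rather than int) keeps the exponent right for |x| < 1; abs handles negatives
    exponent = int(np.floor(np.log10(abs(x))))
    coeff = x/10**exponent
    return r"${:2.0f} \times 10^{{ {:2d} }}$".format(coeff,exponent)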
ax.xaxis.set_major_formatter(ticker.FuncFormatter(myticks))
plt.show()
Note, this uses LaTeX formatting (text.usetex = True) to render exponents in the tick labels. Also note the double braces required to differentiate the LaTeX braces from the python format string braces.
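If a LaTeX installation is not available, the same kind of label can usually be rendered by matplotlib's built-in mathtext, which is used for $...$ strings when text.usetex is left at its default. A minimal sketch of the formatter function only; sci_label is a hypothetical name, and using :g for the coefficient is a choice made here, not part of the answer above:

```python
import math

def sci_label(x, pos=None):
    """Format a tick value as 'coeff x 10^exponent' in mathtext syntax."""
    if x == 0:
        return "$0$"
    exponent = math.floor(math.log10(abs(x)))  # power of ten of the leading digit
    coeff = x / 10 ** exponent                 # leading coefficient
    # doubled braces are literal braces; the inner pair substitutes the exponent
    return rf"${coeff:g} \times 10^{{{exponent}}}$"

example = sci_label(20000)
```

It plugs into the axis the same way as myticks, i.e. ax.xaxis.set_major_formatter(ticker.FuncFormatter(sci_label)).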
There might be a better solution, but if you know the values of each xtick, you can also manually name them.
Here is an example:
http://matplotlib.org/examples/ticks_and_spines/ticklabels_demo_rotation.html
I am trying to plot different data with similar representations but slightly different behaviours and different origins on several figures. So not only do the min and max of the Y axis differ between figures, but the scale does too.
E.g. here are some extracts from my batch plotting:
Is there a simple way in matplotlib to constrain these figures to the same Y step, for easy visual interpretation, while keeping an automatically determined Y min and Y max?
In other words, I'd like to have the same metric spacing between Y ticks on every figure.
You could use a MultipleLocator from the ticker module on both axes to define the tick spacings:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
fig=plt.figure()
ax1=fig.add_subplot(211)
ax2=fig.add_subplot(212)
ax1.set_ylim(0,100)
ax2.set_ylim(40,70)
# set ticks every 10
tickspacing = 10
ax1.yaxis.set_major_locator(ticker.MultipleLocator(base=tickspacing))
ax2.yaxis.set_major_locator(ticker.MultipleLocator(base=tickspacing))
plt.show()
EDIT:
It seems like your desired behaviour was different to how I interpreted your question. Here is a function that will change the limits of the y axes to make sure ymax-ymin is the same for both subplots, using the larger of the two ylim ranges to change the smaller one.
import matplotlib.pyplot as plt
import numpy as np
fig=plt.figure()
ax1=fig.add_subplot(211)
ax2=fig.add_subplot(212)
ax1.set_ylim(40,50)
ax2.set_ylim(40,70)
def adjust_axes_limits(ax1,ax2):
    yrange1 = np.ptp(ax1.get_ylim())
    yrange2 = np.ptp(ax2.get_ylim())
    def change_limits(ax,yr):
        new_ymin = ax.get_ylim()[0] - yr/2.
        new_ymax = ax.get_ylim()[1] + yr/2.
        ax.set_ylim(new_ymin,new_ymax)
    if yrange1 > yrange2:
        change_limits(ax2,yrange1-yrange2)
    elif yrange2 > yrange1:
        change_limits(ax1,yrange2-yrange1)
    else:
        pass
adjust_axes_limits(ax1,ax2)
plt.show()
Note that the first subplot here has expanded from (40, 50) to (30, 60), to match the y range of the second subplot
Tom's answer works well, but I decided to use a simpler solution.
I define an arbitrary y range for all my plots, e.g.
yrang = 0.003
and for each plot, I do :
ymin, ymax = ax.get_ylim()
ymid = np.mean([ymin,ymax])
ax.set_ylim([ymid - yrang/2 , ymid + yrang/2])
and possibly:
ax.yaxis.set_major_locator(ticker.MultipleLocator(base=0.005))
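Put together, the fixed-span idea could be wrapped in a small helper and applied to every axes in a loop. A minimal sketch; apply_fixed_span, the sample data, and the span and step values are all made up for illustration, and the Agg backend line is only there so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

def apply_fixed_span(ax, span, step):
    """Centre the y-limits on the current midpoint, with a fixed span and tick step."""
    ymin, ymax = ax.get_ylim()          # limits chosen by autoscaling
    ymid = (ymin + ymax) / 2
    ax.set_ylim(ymid - span / 2, ymid + span / 2)
    ax.yaxis.set_major_locator(MultipleLocator(base=step))

fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot([0, 1, 2], [40, 45, 50])
ax2.plot([0, 1, 2], [40, 55, 70])

for ax in (ax1, ax2):
    apply_fixed_span(ax, span=40, step=10)

# both axes now cover the same y extent, each centred on its own data
spans = [ax.get_ylim()[1] - ax.get_ylim()[0] for ax in (ax1, ax2)]
```

The tick step should be noticeably smaller than the span, otherwise an axis can end up with only one tick or none.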