Python Plot: How to denote ticks on the axes as powers? - python

I am working on a plot in Python using matplotlib. The numbers I have to plot are very large, so the tick labels on the axes are long and take up a lot of space. I tried to present them as powers (for example, instead of a tick labelled 100000000 I want 10^8). I used the command ax.ticklabel_format(style='sci', axis='x', scilimits=(0,4)), but this only moved the factor into a single label such as 1e4 at the end of the axis.
Is there any other way to label the ticks as 1 x 10^4, 2 x 10^4, etc., or to write the value 1e4 at the end of the tick labels as 10^4?

You can use the matplotlib.ticker module and pass a FuncFormatter to ax.xaxis.set_major_formatter.
For example:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
plt.rcParams['text.usetex'] = True  # requires a working LaTeX installation
fig, ax = plt.subplots(1)
x = y = np.arange(0, 1.1e4, 1e3)
ax.plot(x, y)
def myticks(x, pos):
    # Format a tick value as "coeff x 10^exponent"
    if x == 0:
        return "$0$"
    # floor() and abs() keep the exponent correct for values below 1 and for negative ticks
    exponent = int(np.floor(np.log10(abs(x))))
    coeff = x / 10**exponent
    return r"${:2.0f} \times 10^{{ {:2d} }}$".format(coeff, exponent)
ax.xaxis.set_major_formatter(ticker.FuncFormatter(myticks))
plt.show()
Note, this uses LaTeX formatting (text.usetex = True) to render exponents in the tick labels. Also note the double braces required to differentiate the LaTeX braces from the python format string braces.
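If a LaTeX installation is not available, matplotlib's built-in mathtext can usually render the same kind of label without text.usetex. The following is a minimal pure-Python sketch of that idea; the name power_label is hypothetical and not part of the answer above, and it assumes the coefficient can be shown in the compact "g" format.

```python
import math

def power_label(value, pos=None):
    """Return a mathtext label such as '$2 \\times 10^{4}$' for a tick value."""
    if value == 0:
        return "$0$"
    # floor() keeps the coefficient in [1, 10) for values below 1 as well
    exponent = math.floor(math.log10(abs(value)))
    coeff = value / 10 ** exponent
    # Triple braces: {{ and }} are literal braces, {exponent} is the format field
    return rf"${coeff:g} \times 10^{{{exponent}}}$"

print(power_label(20000))
print(power_label(0))
```

The function has the (value, pos) signature that ticker.FuncFormatter expects, so it could be passed to ax.xaxis.set_major_formatter in place of myticks.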

There might be a better solution, but if you know the values of each xtick, you can also manually name them.
Here is an example:
http://matplotlib.org/examples/ticks_and_spines/ticklabels_demo_rotation.html

Related

How to set y-scale when making a boxplot with dataframe

I have a column of data with a very large distribution and thus I log2-transform it before plotting and visualizing it. This works fine but I cannot seem to figure out how to set the y-scale to the exponential values of 2 (instead I have just the exponents themselves).
df['num_ratings_log2'] = df['num_ratings'] + 1
df['num_ratings_log2'] = np.log2(df['num_ratings_log2'])
df.boxplot(column = 'num_ratings_log2', figsize=(10,10))
As the scale, I would like to have 1 (2^0), 32 (2^5), 1024 (2^10) ... instead of 0, 5, 10 ...
I want everything else about the plot to stay the same. How can I achieve this?
Instead of taking the log of the data, you can create a normal boxplot and then set a log scale on the y-axis (ax.set_yscale('log'), or symlog to also represent zero). To get the ticks at powers of 2 (instead of powers of 10), use a LogLocator with base 2. A ScalarFormatter shows the values as regular numbers (instead of as powers such as 2^10). A NullLocator for the minor ticks suppresses undesired extra ticks.
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter, LogLocator, NullLocator
import pandas as pd
import numpy as np
np.random.seed(123)
df = pd.DataFrame({'num_ratings': (np.random.pareto(10, 10000) * 800).astype(int)})
ax = df.boxplot(column='num_ratings', figsize=(10, 10))
ax.set_yscale('symlog')  # symlog also allows zero
ax.yaxis.set_major_locator(LogLocator(base=2))  # major ticks at powers of 2
# ax.yaxis.set_major_formatter(ScalarFormatter())  # show tick labels as regular numbers
ax.yaxis.set_major_formatter(lambda x, p: f'{int(x):,}')  # regular numbers with thousands separators
ax.yaxis.set_minor_locator(NullLocator())  # remove minor ticks
plt.show()
I hope this is what you are looking for:
ax = df.boxplot(column='num_ratings_log2', figsize=(20,10))
ymin = 0
ymax = 20
ax.set_ylim(2**ymin, 2**ymax)
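Another way to read the question is to keep the log2-transformed data and the plot exactly as they are, and only relabel each tick t with the value 2**t. A minimal sketch of such a label function follows; the name pow2_label is hypothetical. Because the question computes log2(num_ratings + 1), the exact original value at a tick would be 2**t - 1; that small offset is ignored here to match the 1, 32, 1024 labels the question asks for.

```python
def pow2_label(tick, pos=None):
    """Label a tick on a log2-transformed axis with the value 2**tick."""
    return f"{2 ** tick:,.0f}"

for t in (0, 5, 10):
    print(t, "->", pow2_label(t))
```

Wrapped in matplotlib.ticker.FuncFormatter and passed to ax.yaxis.set_major_formatter, this would change only the tick text and leave the boxplot itself untouched.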

How to get multiplier string of scientific notation on matplotlib

I'm trying to get the text in the offset of the scientific notation of matplotlib, but get_offset() or get_offset_text() returns an empty string. I have checked these questions, but they didn't work:
How to move the y axis scale factor to the position next to the y axis label?
Adjust exponent text after setting scientific limits on matplotlib axis
prevent scientific notation in matplotlib.pyplot
Here is a simple example:
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
import numpy as np
x = np.arange(1,20)
y = np.exp(x)
fig,ax = plt.subplots(1,1)
ax.plot(x,y)
ax.yaxis.set_major_formatter(ScalarFormatter(useMathText=True))
print(ax.yaxis.get_offset_text())
print(ax.yaxis.get_major_formatter().get_offset())
fmt = ax.yaxis.get_major_formatter()
offset = ax.yaxis.get_major_formatter().get_offset()
print(offset)
plt.show()
The resulting plot shows a multiplier on the y axis. I'd like to get that x10^8 text, but the code returns only:
Text(0, 0.5, '')
The same happens if I don't use the ScalarFormatter. Am I missing something? Is there a separate function to get the multiplier (instead of the offset)?
edit: I'm currently using Python 3.9.0 with matplotlib 3.4.2 on a MacBook Pro, and I just run python3 test.py.
edit2: I have installed Python 3.9.5, and the solution with fig.canvas.draw() still does not work. The same with Linux works.
edit3: The problem happens when using the MacOSX backend. When changing to TkAgg or Agg, the get_offset works with the provided solution.
You first need to draw the figure; until then, the formatter still holds its default values. From the docstring of FigureCanvasBase.draw in the source code:
"""
Render the `.Figure`.
It is important that this method actually walk the artist tree
even if not output is produced because this will trigger
deferred work (like computing limits auto-limits and tick
values) that users may want access to before saving to disk.
"""
Simply call fig.canvas.draw() and then ax.yaxis.get_offset_text() will have the updated values you want.
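import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
import numpy as np
x = np.arange(1, 20)
y = np.exp(x)
fig, ax = plt.subplots(1, 1)
ax.plot(x, y)
ax.yaxis.set_major_formatter(ScalarFormatter(useMathText=True))
fig.canvas.draw()  # triggers the deferred computation of ticks and offset
offset = ax.yaxis.get_major_formatter().get_offset()
print(offset)
# $\times\mathdefault{10^{8}}\mathdefault{}$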
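Once the offset string is available, the exponent can be pulled out of it with a regular expression. This is a small sketch that assumes the string has the form shown above; the helper name offset_exponent is hypothetical, the exact string may differ between matplotlib versions, and only a plain ASCII minus sign is handled.

```python
import re

def offset_exponent(offset_text):
    """Extract the power of ten from an offset string, or None if absent."""
    match = re.search(r"10\^\{(-?\d+)\}", offset_text)
    return int(match.group(1)) if match else None

sample = r"$\times\mathdefault{10^{8}}\mathdefault{}$"
print(offset_exponent(sample))  # 8
print(offset_exponent(""))      # None
```

ScalarFormatter also appears to keep this exponent in its orderOfMagnitude attribute after drawing, which may be simpler than parsing the text.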

Pyplot: using percentage on x axis

I have a line chart based on a simple list of numbers. By default the x-axis is just an increment of 1 for each value plotted. I would like it to be a percentage instead but can't figure out how. So instead of having an x-axis from 0 to 5, it would go from 0% to 100% (but keeping reasonably spaced tick marks). Code below. Thanks!
from matplotlib import pyplot as plt
from mpl_toolkits.axes_grid.axislines import Subplot
data=[8,12,15,17,18,18.5]
fig=plt.figure(1,(7,4))
ax=Subplot(fig,111)
fig.add_subplot(ax)
plt.plot(data)
The code below will give you a simplified, percentage-based x-axis. It assumes that your values are spaced equally between 0% and 100%.
It creates a perc array which holds evenly spaced percentages to plot against. It then adjusts the formatting of the x-axis so that it includes a percent sign, using matplotlib.ticker.FormatStrFormatter. Note that this formatter uses old-style (printf-style) string formatting rather than the newer str.format style.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.ticker as mtick
data = [8,12,15,17,18,18.5]
perc = np.linspace(0,100,len(data))
fig = plt.figure(1, (7,4))
ax = fig.add_subplot(1,1,1)
ax.plot(perc, data)
fmt = '%.0f%%' # Format you want the ticks, e.g. '40%'
xticks = mtick.FormatStrFormatter(fmt)
ax.xaxis.set_major_formatter(xticks)
plt.show()
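To make the format string above less opaque, this short sketch shows what '%.0f%%' produces for one tick value, next to the new-style equivalent, and how the evenly spaced percentages can be computed without NumPy. The variable names are illustrative only.

```python
data = [8, 12, 15, 17, 18, 18.5]

# Pure-Python equivalent of np.linspace(0, 100, len(data))
perc = [100 * i / (len(data) - 1) for i in range(len(data))]
print(perc)

value = 40.0
old_style = '%.0f%%' % value         # '%%' is an escaped literal percent sign
new_style = '{:.0f}%'.format(value)  # no escaping needed in the new style
print(old_style, new_style)
```

If the new style is preferred, matplotlib.ticker.StrMethodFormatter('{x:.0f}%') takes a str.format template in which the tick value is the field named x.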
This is a few months late, but I have created PR#6251 with matplotlib to add a new PercentFormatter class. With this class you can do as follows to set the axis:
import matplotlib.ticker as mtick
# Actual plotting code omitted
ax.xaxis.set_major_formatter(mtick.PercentFormatter(5.0))
This will display values from 0 to 5 on a scale of 0% to 100%. The formatter is similar in concept to what @Ffisegydd suggests doing except that it can take any arbitrary existing ticks into account.
PercentFormatter() accepts three arguments, xmax, decimals, and symbol. xmax allows you to set the value that corresponds to 100% on the axis (in your example, 5).
The other two parameters allow you to set the number of digits after the decimal point and the symbol. They default to None and '%', respectively. decimals=None will automatically set the number of decimal points based on how much of the axes you are showing.
Note that this formatter will use whatever ticks would normally be generated if you just plotted your data. It does not modify anything besides the strings that are output to the tick marks.
Update
PercentFormatter was accepted into Matplotlib in version 2.1.0.
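The scaling described above can be illustrated with a few lines of plain Python. This is a simplified stand-in, not matplotlib's implementation: the function name percent_label is hypothetical, and the automatic choice of decimals (decimals=None) is not reproduced.

```python
def percent_label(value, xmax=100.0, decimals=0, symbol="%"):
    """Scale a tick value so that xmax maps to 100, then append the symbol."""
    percent = 100.0 * value / xmax
    return f"{percent:.{decimals}f}{symbol}"

# With xmax=5, the ticks 0..5 from the question map to 0%..100%
for tick in range(6):
    print(tick, "->", percent_label(tick, xmax=5.0))
```

Only the label text changes; the tick positions stay where matplotlib placed them, which matches the note above about not modifying anything besides the output strings.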
Totally late in the day, but I wrote this and thought it could be of use:
import pandas as pd
def transformColToPercents(x, rnd, navalue):
    # Returns a pandas Series, scaled to 0-100%, that can be stored in a new DataFrame column
    # rnd = number of decimal places to round to
    # navalue = value that replaces any NaN in the result
    hv = x.max(axis=0)
    lv = x.min(axis=0)
    pp = pd.Series(((x - lv) * 100) / (hv - lv)).round(rnd)
    return pp.fillna(navalue)
df['new column'] = transformColToPercents(df['a'], 2, 0)

python + matplotlib: use locale to format y axis

I want to format my y axis using matplotlib in python 2.7. This is what I tried:
ax.yaxis.get_major_formatter().set_useLocale()
to format my y axis using . as the thousands separator. Instead of having 10000, I'd like to have 10.000, and so on, but I can't find any example of how this works.
I could not find the documentation, on this page here there is no example or further documentation: http://matplotlib.org/api/ticker_api.html#matplotlib.ticker.ScalarFormatter.set_useLocale
Or any other idea on how to format my axis?
thanks
I believe that you are looking for more control than set_useLocale() can offer. Instead, I've used FuncFormatter with a simple function. The comma_format function formats each y-axis label with a comma as the thousands separator and then replaces the commas with periods. In this way, the y-axis labels can be formatted rather easily.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
def comma_format(x, p):
    # Format with "," as the thousands separator, then swap it for "."
    return format(x, "6,.0f").replace(",", ".")
fig, ax = plt.subplots()
xx = np.arange(0, 20, 1)
yy = np.arange(1000, 10000, 450)
ax.yaxis.set_major_formatter(ticker.FuncFormatter(comma_format))
ax.scatter(xx, yy)
plt.show()
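If the labels also carry decimals, replacing only the comma is not enough, because the decimal point has to become a comma at the same time. A small sketch of that variant follows; the name european_format is hypothetical, and it swaps both separators in one pass with str.translate.

```python
def european_format(value, decimals=2):
    """Format with '.' as thousands separator and ',' as decimal mark."""
    text = f"{value:,.{decimals}f}"  # e.g. '1,234,567.89'
    # Swap ',' and '.' simultaneously so neither replacement clobbers the other
    return text.translate(str.maketrans(",.", ".,"))

print(european_format(1234567.891))
print(european_format(10000, decimals=0))
```

Wrapped as lambda x, p: european_format(x, 0), it could be used with ticker.FuncFormatter in the same way as comma_format.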

Matplotlib digit grouping (decimal separator)

Basically, when generating plots with matplotlib, the scale on the y-axis goes into the millions. How do I turn on digit grouping (i.e. so that 1000000 displays as 1,000,000) or turn on the decimal separator?
I don't think there's a built-in function to do this. (That's what I thought after I read your question; I just checked and couldn't find one in the documentation.)
In any event, it's easy to roll your own.
(Below is a complete example, i.e. it will generate a matplotlib plot with one axis having commified tick labels. Only a handful of lines are actually needed to create custom tick labels: the function that builds the labels, together with its import, and the lines that create the new labels and place them on the specified axis.)
# first code a function to generate the axis labels you want
# ie, turn numbers greater than 1000 into commified strings (12549 => 12,549)
import locale
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')  # locale names are platform-dependent
# locale.format() was removed in Python 3.12; format_string() is its replacement
fnx = lambda x: locale.format_string("%d", x, grouping=True)
from matplotlib import pyplot as PLT
import numpy as NP
data = NP.random.randint(15000, 85000, 50).reshape(25, 2)
x, y = data[:, 0], data[:, 1]
fig = PLT.figure()
ax1 = fig.add_subplot(111)
ax1.plot(x, y, "ro")
default_xtick = list(range(20000, 100000, 10000))
# these lines are the crux:
# create the custom tick labels (a list, because map() is lazy in Python 3)
new_xtick = list(map(fnx, default_xtick))
# fix the tick positions, then set those labels on the axis
ax1.set_xticks(default_xtick)
ax1.set_xticklabels(new_xtick)
PLT.show()
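For plain comma grouping, Python's format mini-language already has a "," option, so the locale module is not strictly required. The sketch below shows a tick function built on it; the name commify is hypothetical, and it assumes the ticks can be rounded to whole numbers.

```python
def commify(value, pos=None):
    """Return the tick value as an integer string with comma digit grouping."""
    return format(int(round(value)), ",")

for v in (999, 12549, 1000000):
    print(v, "->", commify(v))
```

The function can be wrapped in matplotlib.ticker.FuncFormatter and set with ax1.xaxis.set_major_formatter, which avoids having to fix the tick positions by hand. In matplotlib versions that provide it, matplotlib.ticker.StrMethodFormatter('{x:,.0f}') should give the same labels.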
