Pyplot: using percentage on x axis


I have a line chart based on a simple list of numbers. By default the x-axis is just an increment of 1 for each value plotted. I would like it to be a percentage instead but can't figure out how. So instead of having an x-axis from 0 to 5, it would go from 0% to 100% (but keeping reasonably spaced tick marks). Code below. Thanks!
from matplotlib import pyplot as plt
from mpl_toolkits.axes_grid.axislines import Subplot
data=[8,12,15,17,18,18.5]
fig=plt.figure(1,(7,4))
ax=Subplot(fig,111)
fig.add_subplot(ax)
plt.plot(data)

The code below will give you a simplified, percentage-based x-axis. It assumes that your values are spaced equally between 0% and 100%.
It creates a perc array which holds evenly spaced percentages to plot against. It then adjusts the formatting of the x-axis so it includes a percentage sign, using matplotlib.ticker.FormatStrFormatter. Unfortunately this uses old-style (printf-style) string formatting rather than the new style.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.ticker as mtick
data = [8,12,15,17,18,18.5]
perc = np.linspace(0,100,len(data))
fig = plt.figure(1, (7,4))
ax = fig.add_subplot(1,1,1)
ax.plot(perc, data)
fmt = '%.0f%%' # Format you want the ticks, e.g. '40%'
xticks = mtick.FormatStrFormatter(fmt)
ax.xaxis.set_major_formatter(xticks)
plt.show()
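If the old-style format string is a concern, a new-style template may work as well. The sketch below assumes matplotlib.ticker.StrMethodFormatter, which is not part of the answer above, and selects the Agg backend only so that it runs without a display. Treat it as one possible variant rather than the recommended way.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np

data = [8, 12, 15, 17, 18, 18.5]
perc = np.linspace(0, 100, len(data))  # evenly spaced x positions from 0 to 100

fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(perc, data)

# New-style (str.format) template; the tick value is passed in under the name "x".
formatter = mtick.StrMethodFormatter("{x:.0f}%")
ax.xaxis.set_major_formatter(formatter)
```

The tick positions are still chosen by the default locator; only the label text changes.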

This is a few months late, but I have created PR#6251 with matplotlib to add a new PercentFormatter class. With this class you can do as follows to set the axis:
import matplotlib.ticker as mtick
# Actual plotting code omitted
ax.xaxis.set_major_formatter(mtick.PercentFormatter(5.0))
This will display values from 0 to 5 on a scale of 0% to 100%. The formatter is similar in concept to what @Ffisegydd suggests doing, except that it can take any arbitrary existing ticks into account.
PercentFormatter() accepts three arguments: xmax, decimals, and symbol. xmax lets you set the value that corresponds to 100% on the axis (in your example, 5).
The other two parameters let you set the number of digits after the decimal point and the symbol. They default to None and '%', respectively. decimals=None will automatically set the number of decimal places based on how much of the axis you are showing.
Note that this formatter will use whatever ticks would normally be generated if you just plotted your data. It does not modify anything besides the strings that are output to the tick marks.
Update
PercentFormatter was accepted into Matplotlib in version 2.1.0.
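For completeness, here is a small self-contained sketch of the formatter applied to the data from the question. It assumes Matplotlib 2.1 or later and uses the Agg backend only so that it runs headless. The 100% value is passed with the keyword xmax, and decimals=0 is chosen here purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick

data = [8, 12, 15, 17, 18, 18.5]

fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(data)  # x positions are the list indices 0..5

# xmax is the data value that should be displayed as 100%: here the last index.
pct = mtick.PercentFormatter(xmax=len(data) - 1, decimals=0)
ax.xaxis.set_major_formatter(pct)

# format_pct(x, display_range) exposes the value-to-label conversion directly.
half = pct.format_pct(2.5, 5.0)
```

Because only the label strings change, the data and the tick locations stay in the original 0 to 5 units.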

Totally late in the day, but I wrote this and thought it could be of use:
import pandas as pd
def transformColToPercents(x, rnd, navalue):
    # Returns a pandas Series, suitable for a new DataFrame column, with all values scaled to 0-100%
    # rnd = number of decimal places to round to
    # navalue = value used to replace NaN
    hv = x.max(axis=0)
    lv = x.min(axis=0)
    pp = pd.Series(((x-lv)*100)/(hv-lv)).round(rnd)
    return pp.fillna(navalue)
df['new column'] = transformColToPercents(df['a'], 2, 0)

Related

How to set y-scale when making a boxplot with dataframe

I have a column of data with a very large distribution and thus I log2-transform it before plotting and visualizing it. This works fine but I cannot seem to figure out how to set the y-scale to the exponential values of 2 (instead I have just the exponents themselves).
df['num_ratings_log2'] = df['num_ratings'] + 1
df['num_ratings_log2'] = np.log2(df['num_ratings_log2'])
df.boxplot(column = 'num_ratings_log2', figsize=(10,10))
As the scale, I would like to have 1 (2^0), 32 (2^5), 1024 (2^10) ... instead of 0, 5, 10 ...
I want everything else about the plot to stay the same. How can I achieve this?

Instead of taking the log of the data, you can create a normal boxplot and then set a log scale on the y-axis (ax.set_yscale('log'), or symlog to also represent zero). To get the ticks at powers of 2 (instead of powers of 10), use a LogLocator with base 2. A ScalarFormatter shows the values as regular numbers (instead of as powers such as 2^10). A NullLocator for the minor ticks suppresses undesired extra ticks.
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter, LogLocator, NullLocator
import pandas as pd
import numpy as np
np.random.seed(123)
df = pd.DataFrame({'num_ratings': (np.random.pareto(10, 10000) * 800).astype(int)})
ax = df.boxplot(column='num_ratings', figsize=(10, 10))
ax.set_yscale('symlog', base=2) # symlog also allows zero; base=2 puts the major ticks at powers of 2 (Matplotlib 3.3+)
# ax.yaxis.set_major_formatter(ScalarFormatter()) # show tick labels as regular numbers
ax.yaxis.set_major_formatter(lambda x, p: f'{int(x):,}')
ax.yaxis.set_minor_locator(NullLocator()) # remove minor ticks
plt.show()
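The explanation above mentions a LogLocator with base 2 and a ScalarFormatter. A minimal sketch that sets them explicitly could look as follows. To stay self-contained it uses made-up data and plain ax.boxplot instead of df.boxplot (which also returns an Axes, so the same axis calls should apply). The data is shifted by 1 so that a plain log scale has no zeros, and the Agg backend is an assumption for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import LogLocator, NullLocator, ScalarFormatter

rng = np.random.default_rng(123)
# Hypothetical heavy-tailed counts; +1 avoids zeros on a plain log scale.
num_ratings = (rng.pareto(10, 10000) * 800).astype(int) + 1

fig, ax = plt.subplots(figsize=(10, 10))
ax.boxplot(num_ratings)
ax.set_yscale("log", base=2)                      # log axis in base 2
ax.yaxis.set_major_locator(LogLocator(base=2))    # major ticks at powers of 2
ax.yaxis.set_major_formatter(ScalarFormatter())   # labels as regular numbers
ax.yaxis.set_minor_locator(NullLocator())         # no minor ticks

# A standalone locator shows which positions are proposed for the range 1..1024.
ticks = LogLocator(base=2).tick_values(1, 1024)
```

The locator may skip some powers of 2 when there are too many to fit, but every tick it proposes is a power of 2.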

I hope this is what you are looking for:
ax = df.boxplot(column='num_ratings_log2', figsize=(20,10))
ymin = 0
ymax = 20
ax.set_ylim(2**ymin, 2**ymax)

Adding extra space along the x-axis in matplotlib bar graph

I'm using matplotlib to draw a bar chart with 3 bars. I want to add some extra space along the x-axis (so that the x-axis line is drawn longer).
Below is what I have:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
dt = [1,3,2]
plt.figure()
xvals = range(len(dt))
plt.bar(xvals, dt, width=0.5)
plt.tick_params(bottom=False)
plt.xticks(xvals, ['a','b','c'])
plt.yticks(range(0,4), [0,1,2,3])
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)
plt.show()
This code produces:
I simply want (note the elongated x-axis):

Change the limits of the x-axis using xlim().
For example:
plt.xlim(-0.5,3.5) # adjust as necessary

Just add the following limits. You can use None as the left-hand limit to let the plot keep its default value. Since the x-values are 0, 1, 2 and the right-hand limit is now set to 3, you will get an extended axis. Replace 3 with whatever value you want.
plt.xlim(None, 3)
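Put together with the bar chart from the question, this could look like the sketch below. It uses the object-oriented equivalent ax.set_xlim(right=...) and an arbitrary right limit of 3.5; the Agg backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.pyplot as plt

dt = [1, 3, 2]
xvals = list(range(len(dt)))

fig, ax = plt.subplots()
ax.bar(xvals, dt, width=0.5)
ax.set_xticks(xvals)
ax.set_xticklabels(["a", "b", "c"])
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

left_before, right_before = ax.get_xlim()
ax.set_xlim(right=3.5)  # only the right limit changes; the left one is kept
left_after, right_after = ax.get_xlim()
```

Leaving the left limit unspecified has the same effect as passing None for it.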

Pyplot how to reduce xticks and xticklabels density?

I have to plot several curves with very high xtick density, say 1000 date strings. To prevent these tick labels overlapping each other I manually set them to be 60 dates apart. Code below:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
ts_index = pd.period_range(start="20060429", periods=1000).strftime("%Y%m%d")
fig = plt.figure(1)
ax = plt.subplot(1, 1, 1)
tick_spacing = 60
for i in range(5):
    plt.plot(ts_index, 1 + i * 0.01 * np.arange(0, 1000), label="group %d"%i)
plt.legend(loc='best')
plt.title(r'net value curves')
xticks = ax.get_xticks()
xlabels = ax.get_xticklabels()
ax.set_xticks(xticks[::tick_spacing])
ax.set_xticklabels(xlabels[::tick_spacing])
plt.xticks(rotation="vertical")
plt.xlabel(r'date')
plt.ylabel('net value')
plt.grid(True)
plt.show()
fig.savefig(r".\net_value_curves.png", )
fig.clf()
I'm running this piece of code in PyCharm Community Edition 2017.2.2 with a Python 3.6 kernel. Now comes the funny thing: whenever I ran the code in the normal "run" mode (i.e. just hit the execution button and let the code run "freely" till interruption or termination), then the figure I got would always miss xticklabels:
However, if I ran the code in "debug" mode and ran it step by step then I would get an expected figure with complete xticklabels:
This is really weird. Anyway, I just hope to find a way that can ensure me getting the desired output (the second figure) in the normal "run" mode. How can I modify my current code to achieve this?
Thanks in advance!

Your x axis data are strings. Hence you will get one tick per data point. This is probably not what you want. Instead use the dates to plot. Because you are using pandas, this is easily converted,
dates = pd.to_datetime(ts_index, format="%Y%m%d")
You may then get rid of your manual xtick locating and formatting, because matplotlib will automatically choose some nice tick locations for you.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
ts_index = pd.period_range(start="20060429", periods=1000).strftime("%Y%m%d")
dates = pd.to_datetime(ts_index, format="%Y%m%d")
fig, ax = plt.subplots()
for i in range(5):
    plt.plot(dates, 1 + i * 0.01 * np.arange(0, 1000), label="group %d"%i)
plt.legend(loc='best')
plt.title(r'net value curves')
plt.xticks(rotation="vertical")
plt.xlabel(r'date')
plt.ylabel('net value')
plt.grid(True)
plt.show()
However, in case you do want some manual control over the locations and formats, you may use matplotlib.dates locators and formatters.
import matplotlib.dates as mdates
# tick every 3 months
plt.gca().xaxis.set_major_locator(mdates.MonthLocator((1,4,7,10)))
# format as "%Y%m%d"
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%Y%m%d"))
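As a complete example, the locator and formatter can be combined with the date-based plot roughly as follows. This sketch builds the dates with the standard datetime module instead of pandas to keep the dependencies small, and it uses the Agg backend so it runs headless; both are choices made for this illustration.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

start = dt.date(2006, 4, 29)
dates = [start + dt.timedelta(days=i) for i in range(1000)]

fig, ax = plt.subplots()
for i in range(5):
    ax.plot(dates, 1 + i * 0.01 * np.arange(1000), label="group %d" % i)

# One major tick at the start of January, April, July and October.
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=(1, 4, 7, 10)))
formatter = mdates.DateFormatter("%Y%m%d")
ax.xaxis.set_major_formatter(formatter)
ax.tick_params(axis="x", labelrotation=90)
ax.legend(loc="best")

# The formatter can also be called directly on a Matplotlib date number.
sample_label = formatter(mdates.date2num(dt.datetime(2007, 1, 1)))
```

With real dates on the axis, the tick density no longer depends on the number of data points.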

In general, the Axis object computes and places ticks using a Locator object. Locators and Formatters are meant to be easily replaceable, with appropriate methods of Axis. The default Locator does not seem to be doing the trick for you, so you can replace it with anything you want using axes.xaxis.set_major_locator. This problem is not complicated enough to write your own, so I would suggest that MaxNLocator fits your needs fairly well. Your example seems to work well with nbins=16 (which is what you have in the picture, since there are 17 ticks).
You need to add an import:
from matplotlib.ticker import MaxNLocator
You need to replace the block
xticks = ax.get_xticks()
xlabels = ax.get_xticklabels()
ax.set_xticks(xticks[::tick_spacing])
ax.set_xticklabels(xlabels[::tick_spacing])
with
ax.xaxis.set_major_locator(MaxNLocator(nbins=16))
or just
ax.xaxis.set_major_locator(MaxNLocator(16))
You may want to play around with the other arguments (all of which have to be keywords, except nbins). Pay particular attention to integer.
Note that for the Locator and Formatter APIs we work with an Axis object, not Axes. Axes is the whole plot, while Axis is the thing with the spines on it. Axes usually contains two Axis objects and all the other stuff in your plot.
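To illustrate the Axes/Axis distinction and the effect of nbins, here is a small sketch with numeric x values (rather than the date strings from the question) and integer=True. The data is made up and the Agg backend is only used so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MaxNLocator

x = np.arange(1000)
fig, ax = plt.subplots()      # ax is the Axes: the whole plot
ax.plot(x, 1 + 0.01 * x)

# ax.xaxis is the Axis object; locators and formatters are set on it.
locator = MaxNLocator(nbins=16, integer=True)
ax.xaxis.set_major_locator(locator)

# Ticks proposed for the range 0..999, restricted to that range.
ticks = locator.tick_values(0, 999)
inside = ticks[(ticks >= 0) & (ticks <= 999)]
```

nbins is an upper bound on the number of intervals, so the locator is free to pick fewer ticks when a rounder step fits better.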

You can set the visibility of the xticks-labels to False
for label in plt.gca().xaxis.get_ticklabels()[::N]:
    label.set_visible(False)
This will set every Nth label invisible.
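If the goal is the opposite, keeping only every Nth label and hiding the rest, the same idea can be inverted. The following is a sketch with made-up labels and an arbitrary N = 5; the label list is fetched once before any label is hidden. The Agg backend is an assumption for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.pyplot as plt

N = 5  # keep one label out of every N (hypothetical spacing)
names = ["d%02d" % i for i in range(20)]  # stand-in for the date strings
positions = list(range(len(names)))

fig, ax = plt.subplots()
ax.plot(positions, positions)
ax.set_xticks(positions)
ax.set_xticklabels(names)

# Fetch all labels first, then show only those at indices 0, N, 2N, ...
labels = ax.xaxis.get_ticklabels()
for i, label in enumerate(labels):
    label.set_visible(i % N == 0)

visible_idx = [i for i, label in enumerate(labels) if label.get_visible()]
```

Note that this hides only the label text; the tick marks themselves are still drawn at every position.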

Python Plot: How to denote ticks on the axes as powers?

I am working on a plot in Python using the matplotlib library. The numbers I have to plot are very big, so the ticks on the axes are also large numbers and take up a lot of space. I was trying to present them as powers (for example, instead of a tick labelled 100000000 I want 10^8). I used the command ax.ticklabel_format(style='sci', axis='x', scilimits=(0,4)), however this only created something like this
Is there any other solution to have ticks for the plot as: 1 x 10^4, 2 x 10^4, etc or write the value 1e4 as 10^4 at the end of the label's ticks?

You can use the matplotlib.ticker module, and set the ax.xaxis.set_major_formatter to a FuncFormatter.
For example:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
plt.rcParams['text.usetex'] = True
fig,ax = plt.subplots(1)
x = y = np.arange(0,1.1e4,1e3)
ax.plot(x,y)
def myticks(x,pos):
    if x == 0: return "$0$"
    exponent = int(np.log10(x))
    coeff = x/10**exponent
    return r"${:2.0f} \times 10^{{ {:2d} }}$".format(coeff,exponent)
ax.xaxis.set_major_formatter(ticker.FuncFormatter(myticks))
plt.show()
Note, this uses LaTeX formatting (text.usetex = True) to render exponents in the tick labels. Also note the double braces required to differentiate the LaTeX braces from the python format string braces.
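If a LaTeX installation is not available, a similar result may be possible with Matplotlib's built-in mathtext, which renders strings wrapped in dollar signs without text.usetex. The sketch below is a variant of the formatter above; the function name power_label is made up for this example, and the Agg backend is only there so it runs headless.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np


def power_label(x, pos):
    """Format a tick value as 'c x 10^n' in mathtext markup (hypothetical helper)."""
    if x == 0:
        return "$0$"
    exponent = math.floor(math.log10(abs(x)))  # abs() so negative ticks work too
    coeff = x / 10 ** exponent
    # Doubled braces are literal braces for mathtext; single braces are format fields.
    return r"${:.0f} \times 10^{{{:d}}}$".format(coeff, exponent)


fig, ax = plt.subplots()
x = y = np.arange(0, 1.1e4, 1e3)
ax.plot(x, y)
ax.xaxis.set_major_formatter(ticker.FuncFormatter(power_label))
```

The coefficient is rounded to a whole number here, so this variant suits ticks such as 2000 or 40000 rather than 2500.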

There might be a better solution, but if you know the values of each xtick, you can also manually name them.
Here is an example:
http://matplotlib.org/examples/ticks_and_spines/ticklabels_demo_rotation.html

Setting an automatic tick interval

When I draw a plot with pyplot, it automatically generates nicely spaced tick marks and labels. Is there a simple way to adjust the spacing between the automatically generated marks? For example, in a plot where the default tick positions are [2,4,6], have ticks at positions [2,3,4,5,6].
I know I can set the mark positions and labels with xticks() and yticks(), but I need to know the range of values first, and with different data, I'd need to adjust them manually.

There are a whole bunch of tick locators and formatters, depending on what you want to standardize. Here's an example of linearly spacing ticks, set up for comparison with the default:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = pd.DataFrame((np.random.random((2,30)).T),columns=['A','B'])
fig, axs = plt.subplots(1,2)
axs[0].scatter(df.A, df.B)
axs[0].set_title('Default y-locator')
from matplotlib.ticker import LinearLocator
axs[1].get_yaxis().set_major_locator(LinearLocator(numticks=12))
axs[1].set_title('12 evenly spaced y-ticks')
axs[1].scatter(df.A, df.B)
See, generally, http://matplotlib.org/api/ticker_api.html?highlight=fixedlocator#module-matplotlib.ticker and the example gallery.
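For the specific case in the question (a tick at every integer, regardless of the data range), a fixed-interval locator may be closer to what is wanted than LinearLocator. Here is a sketch assuming matplotlib.ticker.MultipleLocator with a base of 1; the data points are made up and the Agg backend is only used so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to view the plot interactively
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MultipleLocator

fig, ax = plt.subplots()
ax.plot([2, 3.5, 6], [1, 4, 2])

# Place a major tick at every multiple of 1, whatever the data range is.
locator = MultipleLocator(base=1)
ax.xaxis.set_major_locator(locator)

# Tick positions the locator proposes for the range 2..6.
ticks = locator.tick_values(2, 6)
```

Since the spacing is fixed rather than the tick count, the same locator can be reused for different data without knowing the range in advance.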

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.
