Specify axis-data margin in matplotlib - python

I have a series of plots with categorical data on the y-axis. The margin between the axis and the data seems to depend on the number of categories on the y-axis. If there are many categories, an extra margin appears, but if there are few, the margin is so small that the data points are cut off. The plots look like this:
The plot with few categories and too small a margin:
The plot with many categories and too large margins:
For now, I only found solutions to manipulate the white space around the plot, like bbox_inches='tight' or fig.tight_layout(), but this doesn't solve my problem.
I don't have such problems with the x-axis. Could this be because the x-axis contains only numerical values while the y-axis holds categorical data?
The code I'm using to generate all the plots looks like this:
sns.set(style='whitegrid')
plt.xlim(left=left_lim, right=right_lim)
plt.xticks(np.arange(left_lim, right_lim, step))
plot = sns.scatterplot(x=method.loc[:, 'Len'],
                       y=method.loc[:, 'Bond'],
                       hue=method.loc[:, 'temp'],
                       palette=palette,
                       legend=False,
                       s=50)
set_size(width, height)
plt.savefig("method.png", dpi = 100, bbox_inches='tight', pad_inches=0)
plt.show()
The set_size() comes from the first answer to Axes class - set explicitly size (width/height) of axes in given units.

We can slightly adapt the function from "Axes class - set explicitly size (width/height) of axes in given units" by adding a line that sets the axes margins.
import matplotlib.pyplot as plt
def set_size(w, h, ax=None, marg=(0.1, 0.1)):
    """ w, h: width, height in inches; marg: (x, y) margins in inches """
    if ax is None:
        ax = plt.gca()
    l = ax.figure.subplotpars.left
    r = ax.figure.subplotpars.right
    t = ax.figure.subplotpars.top
    b = ax.figure.subplotpars.bottom
    figw = float(w)/(r-l)
    figh = float(h)/(t-b)
    ax.figure.set_size_inches(figw, figh)
    # ax.margins() expects fractions of the data range, so divide by the axes size
    ax.margins(x=marg[0]/w, y=marg[1]/h)
And call it with
set_size(width, height, marg=(xmargin, ymargin))
where xmargin, ymargin are the margins in inches.
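As a rough illustration of the call above, here is a minimal sketch with made-up data (three hypothetical categories) and arbitrary sizes. It assumes the default subplot parameters and relies on `ax.margins()` returning the current margin fractions when called without arguments. Because margins are added on both sides of the data range, the resulting padding is only approximately the requested number of inches.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def set_size(w, h, ax=None, marg=(0.1, 0.1)):
    """Set the axes size to w x h inches and the data margins to marg inches."""
    if ax is None:
        ax = plt.gca()
    sp = ax.figure.subplotpars
    ax.figure.set_size_inches(float(w) / (sp.right - sp.left),
                              float(h) / (sp.top - sp.bottom))
    # ax.margins() takes fractions of the data range, so convert inches to fractions
    ax.margins(x=marg[0] / w, y=marg[1] / h)


# Hypothetical data: three categories on the y-axis
fig, ax = plt.subplots()
ax.scatter([1.0, 2.5, 4.0], ["single", "double", "triple"], s=50)

# Axes of 4 x 2 inches with margins of 0.2 inch (x) and 0.3 inch (y)
set_size(4, 2, ax=ax, marg=(0.2, 0.3))

x_margin, y_margin = ax.margins()  # fractions of the data range
axes_width_in = ax.get_position().width * fig.get_size_inches()[0]
print(x_margin, y_margin, axes_width_in)
```

The same y margin in inches is then used regardless of how many categories the plot contains, as long as the axes height passed to `set_size` is chosen accordingly.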

Related

How to fix overlapping matplotlib y-axis tick labels or autoscale the plot? [duplicate]

I am trying to make a series of matplotlib plots that plot timespans for different classes of objects. Each plot has an identical x-axis and plot elements like a title and a legend. However, which classes appear in each plot differs; each plot represents a different sampling unit, each of which contains only a subset of all the possible classes.
I am having a lot of trouble determining how to set the figure and axis dimensions. The horizontal size should always remain the same, but the vertical dimensions need to be scaled to the number of classes represented in that sampling unit. The distance between each entry on the y-axis should be equal for every plot.
It seems that my difficulties lie in the fact that I can set the absolute size (in inches) of the figure with plt.figure(figsize=(w,h)), but I can only set the size of the axis with relative dimensions (e.g., fig.add_axes([0.3,0.05,0.6,0.85])), which leads to my x-axis labels getting cut off when the number of classes is small.
Here is an MSPaint version of what I'd like to get vs. what I'm getting.
Here is a simplified version of the code I have used. Hopefully it is enough to identify the problem/solution.
import pandas as pd
import matplotlib.pyplot as plt
import pylab as pl
from matplotlib import collections as mc
from matplotlib.lines import Line2D
import seaborn as sns
# elements for x-axis
start = 1
end = 6
interval = 1 # x-axis tick interval
xticks = [x for x in range(start, end, interval)] # create x ticks
# items needed for legend construction
lw_bins = [0,10,25,50,75,90,100] # bins for line width
lw_labels = [3,6,9,12,15,18] # line widths
def make_proxy(zvalue, scalar_mappable, **kwargs):
    color = 'black'
    return Line2D([0, 1], [0, 1], color=color, solid_capstyle='butt', **kwargs)
for line_subset in data:
    # create line collection for this run through loop
    lc = mc.LineCollection(line_subset)
    # create plot and set properties
    sns.set(style="ticks")
    sns.set_context("notebook")
    ############################################################
    # I think the problem lies here
    fig = plt.figure(figsize=(11, len(line_subset.index)*0.25))
    ax = fig.add_axes([0.3,0.05,0.6,0.85])
    ############################################################
    ax.add_collection(lc)
    ax.set_xlim(left=start, right=end)
    ax.set_xticks(xticks)
    ax.xaxis.set_ticks_position('bottom')
    ax.margins(0.05)
    sns.despine(left=True)
    ax.set_yticks(line_subset['order_y'])
    ax.set(yticklabels=line_subset['ylabel'])
    ax.tick_params(axis='y', length=0)
    # legend
    proxies = [make_proxy(item, lc, linewidth=item) for item in lw_labels]
    leg = ax.legend(proxies, ['0-10%', '10-25%', '25-50%', '50-75%', '75-90%', '90-100%'], bbox_to_anchor=(1.0, 0.9),
                    loc='best', ncol=1, labelspacing=3.0, handlelength=4.0, handletextpad=0.5, markerfirst=True,
                    columnspacing=1.0)
    for txt in leg.get_texts():
        txt.set_ha("center") # horizontal alignment of text item
        txt.set_x(-23) # x-position
        txt.set_y(15) # y-position
You can start by defining the top and bottom margins in inches. Fixing the height of one data unit in inches lets you calculate how large the final figure should be.
Dividing each margin in inches by the figure height then gives the relative margin in units of figure size, which can be passed to the figure using subplots_adjust, provided the subplot has been added with add_subplot.
A minimal example:
import numpy as np
import matplotlib.pyplot as plt
data = [np.random.rand(i,2) for i in [2,5,8,4,3]]
height_unit = 0.25 #inch
t = 0.15; b = 0.4 #inch
for d in data:
    height = height_unit*(len(d)+1)+t+b
    fig = plt.figure(figsize=(5, height))
    ax = fig.add_subplot(111)
    ax.set_ylim(-1, len(d))
    fig.subplots_adjust(bottom=b/height, top=1-t/height, left=0.2, right=0.9)
    ax.barh(range(len(d)),d[:,1], left=d[:,0], ec="k")
    ax.set_yticks(range(len(d)))
plt.show()
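The arithmetic behind this can be checked with a small sketch. The helper name `figure_layout` is made up for illustration; it wraps the same formula as the example above and then verifies that one data unit occupies the same number of inches for different row counts.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def figure_layout(n_rows, height_unit=0.25, top=0.15, bottom=0.4):
    """Return (figure height in inches, relative bottom, relative top).

    Hypothetical helper: the y-axis spans n_rows + 1 data units (-1 .. n_rows),
    each height_unit inches tall; top and bottom are margins in inches.
    """
    height = height_unit * (n_rows + 1) + top + bottom
    return height, bottom / height, 1 - top / height


inch_per_unit = {}
for n in (2, 8):
    height, rel_bottom, rel_top = figure_layout(n)
    fig = plt.figure(figsize=(5, height))
    ax = fig.add_subplot(111)
    fig.subplots_adjust(bottom=rel_bottom, top=rel_top, left=0.2, right=0.9)
    ax.set_ylim(-1, n)
    # Axes height in inches = relative height * figure height
    axes_height_in = ax.get_position().height * fig.get_size_inches()[1]
    inch_per_unit[n] = axes_height_in / (n + 1)
    plt.close(fig)

print(inch_per_unit)  # both entries should be (numerically) 0.25
```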

x-axis labels crops figure size [duplicate]

I want to create a figure using matplotlib where I can explicitly specify the size of the axes, i.e. I want to set the width and height of the axes bbox.
I have looked around all over and I cannot find a solution for this. What I typically find is how to adjust the size of the complete Figure (including ticks and labels), for example using fig, ax = plt.subplots(figsize=(w, h))
This is very important for me as I want to have a 1:1 scale of the axes, i.e. 1 unit in paper is equal to 1 unit in reality. For example, if xrange is 0 to 10 with major tick = 1 and x axis is 10cm, then 1 major tick = 1cm. I will save this figure as pdf to import it to a latex document.
This question brought up a similar topic but the answer does not solve my problem (using plt.gca().set_aspect('equal', adjustable='box') code)
From this other question I see that it is possible to get the axes size, but not how to modify them explicitly.
Any ideas how I can set the axes box size and not just the figure size? The figure size should adapt to the axes size.
Thanks!
For those familiar with pgfplots in LaTeX, I would like to have something similar to the scale only axis option (see here for example).
The axes size is determined by the figure size and the figure spacings, which can be set using figure.subplots_adjust(). Conversely, this means that you can set the axes size by setting the figure size, taking into account the figure spacings:
import matplotlib.pyplot as plt
def set_size(w,h, ax=None):
    """ w, h: width, height in inches """
    if not ax: ax=plt.gca()
    l = ax.figure.subplotpars.left
    r = ax.figure.subplotpars.right
    t = ax.figure.subplotpars.top
    b = ax.figure.subplotpars.bottom
    figw = float(w)/(r-l)
    figh = float(h)/(t-b)
    ax.figure.set_size_inches(figw, figh)
fig, ax=plt.subplots()
ax.plot([1,3,2])
set_size(5,5)
plt.show()
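To confirm that the axes really end up at the requested size, the axes box can be converted back to inches. The following is a minimal sketch that repeats `set_size` so it is self-contained; the 10 cm by 6 cm target is an arbitrary choice for illustration, converted to inches because Matplotlib sizes are given in inches.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def set_size(w, h, ax=None):
    """ w, h: width, height of the axes in inches """
    if ax is None:
        ax = plt.gca()
    sp = ax.figure.subplotpars
    ax.figure.set_size_inches(float(w) / (sp.right - sp.left),
                              float(h) / (sp.top - sp.bottom))


CM = 1 / 2.54  # inches per centimetre

fig, ax = plt.subplots()
ax.plot([1, 3, 2])
set_size(10 * CM, 6 * CM, ax=ax)  # request a 10 cm x 6 cm axes box

# Axes box in figure fractions times figure size in inches gives inches
pos = ax.get_position()
fig_w, fig_h = fig.get_size_inches()
axes_w_cm = pos.width * fig_w / CM
axes_h_cm = pos.height * fig_h / CM
print(axes_w_cm, axes_h_cm)
```

Note that saving with bbox_inches='tight' crops the figure but should leave the axes box itself unchanged.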
It appears that Matplotlib has helper classes that allow you to define axes with a fixed size; see the "Demo fixed size axes" example.
I have found that ImportanceOfBeingErnest's answer, which modifies the figure size to adjust the axes size, gives inconsistent results with the particular matplotlib settings I use to produce publication-ready plots. Slight errors were present in the final figure size, and I was unable to find a way to solve the issue with that approach. For most use cases I think this is not a problem; however, the errors were noticeable when combining multiple PDFs for publication.
Rather than developing a minimal working example to find the real issue with the figure-resizing approach, I found a workaround that fixes the axes size using the Divider class.
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import Divider, Size
def fix_axes_size_incm(axew, axeh):
    # convert the requested axes size from cm to inches
    axew = axew/2.54
    axeh = axeh/2.54
    # let's use the tight layout function to get a good padding size for our axes labels.
    fig = plt.gcf()
    ax = plt.gca()
    fig.tight_layout()
    # obtain the current ratio values for padding and fix size
    oldw, oldh = fig.get_size_inches()
    l = ax.figure.subplotpars.left
    r = ax.figure.subplotpars.right
    t = ax.figure.subplotpars.top
    b = ax.figure.subplotpars.bottom
    # work out what the new ratio values for padding are, and the new fig size.
    neww = axew+oldw*(1-r+l)
    newh = axeh+oldh*(1-t+b)
    newr = r*oldw/neww
    newl = l*oldw/neww
    newt = t*oldh/newh
    newb = b*oldh/newh
    # right(top) padding, fixed axes size, left(bottom) padding
    hori = [Size.Scaled(newr), Size.Fixed(axew), Size.Scaled(newl)]
    vert = [Size.Scaled(newt), Size.Fixed(axeh), Size.Scaled(newb)]
    divider = Divider(fig, (0.0, 0.0, 1., 1.), hori, vert, aspect=False)
    # the width and height of the rectangle is ignored.
    ax.set_axes_locator(divider.new_locator(nx=1, ny=1))
    # we need to resize the figure now, as we may have made our axes bigger than the figure.
    fig.set_size_inches(neww,newh)
Things worth noting:
Once you call set_axes_locator() on an axis instance you break the tight_layout() function.
The original figure size you choose will be irrelevant, and the final figure size is determined by the axes size you choose and the size of the labels/tick labels/outward ticks.
This approach doesn't work with colour scale bars.
This is my first ever stack overflow post.
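For reference, the fixed-size helper classes mentioned above can also be used on their own. The following is a minimal sketch along the lines of the Matplotlib "Demo fixed size axes" gallery example, not the function from this answer; the padding and axes sizes are arbitrary, and the figure must be large enough to hold them.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import Divider, Size

fig = plt.figure(figsize=(6, 6))

# All sizes are in inches: the first entry is padding, the second the axes itself
horiz = [Size.Fixed(1.0), Size.Fixed(4.5)]
vert = [Size.Fixed(0.7), Size.Fixed(5.0)]

divider = Divider(fig, (0, 0, 1, 1), horiz, vert, aspect=False)
# The rectangle passed to add_axes is overridden by the locator at draw time
ax = fig.add_axes(divider.get_position(),
                  axes_locator=divider.new_locator(nx=1, ny=1))
ax.plot([1, 2, 3])

fig.canvas.draw()  # the locator is applied when the figure is drawn

pos = ax.get_position()
fig_w, fig_h = fig.get_size_inches()
axes_w_in = pos.width * fig_w
axes_h_in = pos.height * fig_h
print(axes_w_in, axes_h_in)
```

With `Size.Fixed` entries the axes keep their size in inches even if the figure is resized afterwards, which is the property the workaround above relies on.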
Another method, using fig.add_axes, was quite accurate. I have included a 1 cm grid as well.
import matplotlib.pyplot as plt
import matplotlib as mpl
# This example fits a4 paper with 5mm margin printers
# figure settings
figure_width = 28.7 # cm
figure_height = 20 # cm
left_right_margin = 1 # cm
top_bottom_margin = 1 # cm
# Don't change
left = left_right_margin / figure_width # fraction of figure width
bottom = top_bottom_margin / figure_height # fraction of figure height
width = 1 - left*2
height = 1 - bottom*2
cm2inch = 1/2.54 # inch per cm
# specifying the width and the height of the box in inches
fig = plt.figure(figsize=(figure_width*cm2inch,figure_height*cm2inch))
ax = fig.add_axes((left, bottom, width, height))
# limits settings (important)
plt.xlim(0, figure_width * width)
plt.ylim(0, figure_height * height)
# Ticks settings
ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(5))
ax.xaxis.set_minor_locator(mpl.ticker.MultipleLocator(1))
ax.yaxis.set_major_locator(mpl.ticker.MultipleLocator(5))
ax.yaxis.set_minor_locator(mpl.ticker.MultipleLocator(1))
# Grid settings
ax.grid(color="gray", which="both", linestyle=':', linewidth=0.5)
# your Plot (consider above limits)
ax.plot([1,2,3,5,6,7,8,9,10,12,13,14,15,17])
# save figure ( printing png file had better resolution, pdf was lighter and better on screen)
plt.show()
fig.savefig('A4_grid_cm.png', dpi=1000)
fig.savefig('tA4_grid_cm.pdf')

Matplotlib - dynamic plot height, horizontal bars always 1px height

I'm dynamically generating a horizontal bar plot using MatPlotLib. It works pretty well most of the time, until people try to plot a very large numbers of data points. MatPlotLib tries to squish all of the bars into the plot and they start to disappear.
The ideal solution would be to generate the plot so that every horizontal bar is one pixel in height, with 1px separating every bar. The total height of the resulting plot image would then be dependent on the number of bars. But as everything in MatPlotLib is relative, I'm getting really stuck in how to do this. Any help would be much appreciated!
One option is to generate an image with the bars as pixels.
import matplotlib.pyplot as plt
import numpy as np
dpi = 100
N = 100 # number of bars (approx. half the number of pixels)
w = 200 #width of plot in pixels
sp = 3 # spacing within axes in pixels
bp = 50; lp = 70 # bottom, left pixel spacing
bottom=float(bp)/(2*N+2*sp+2*bp)
top = 1.-bottom
left=float(lp)/(w+2*lp)
right=1.-left
figheight = (2*N+2*sp)/float(dpi)/(1-(1-top)-bottom) #inch
figwidth = w/float(dpi)/(1-(1-right)-left)
# this is the input array to plot
inp = np.random.rand(N)+0.16
ar = np.zeros((2*N+2*sp,w))
ninp = np.round(inp/float(inp.max())*w).astype(int) # np.int was removed in NumPy 1.24
for n in range(N):
    ar[2*n+sp, 0: ninp[n]] = np.ones(ninp[n])
fig, ax=plt.subplots(figsize=(figwidth, figheight), dpi=dpi)
plt.subplots_adjust(left=left, bottom=bottom, right=right, top=top)
plt.setp(ax.spines.values(), linewidth=0.5)
ext = [0,inp.max(), N-0.5+(sp+0.5)/2., -(sp+0.5)/2.]
ax.imshow(ar, extent=ext, interpolation="none", cmap="gray_r", origin="upper", aspect="auto")
ax.set_xlim((0,inp.max()*1.1))
ax.set_ylabel("item")
ax.set_xlabel("length")
plt.savefig(__file__+".png", dpi=dpi)
plt.show()
This will work for any setting of dpi.
Note that the tick labels might appear slightly off. This is an inaccuracy in matplotlib that I don't know how to overcome.
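The size bookkeeping in this answer can be isolated into a small helper. This is a sketch only: the function name `figsize_for_axes_pixels` is made up, and it assumes symmetric padding (same left and right, same top and bottom) as in the code above. It checks that the axes box then measures the intended number of pixels.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def figsize_for_axes_pixels(axes_w_px, axes_h_px, pad_x_px, pad_y_px, dpi):
    """Return (figsize in inches, subplots_adjust kwargs) for an axes box of
    exactly axes_w_px x axes_h_px pixels with symmetric padding (hypothetical helper)."""
    fig_w_px = axes_w_px + 2 * pad_x_px
    fig_h_px = axes_h_px + 2 * pad_y_px
    left = pad_x_px / fig_w_px
    bottom = pad_y_px / fig_h_px
    figsize = (fig_w_px / dpi, fig_h_px / dpi)
    return figsize, dict(left=left, right=1 - left, bottom=bottom, top=1 - bottom)


N, sp, w, dpi = 100, 3, 200, 100   # same numbers as in the answer above
axes_h_px = 2 * N + 2 * sp         # one pixel per bar plus one pixel gap, plus spacing

figsize, pars = figsize_for_axes_pixels(w, axes_h_px, pad_x_px=70, pad_y_px=50, dpi=dpi)
fig, ax = plt.subplots(figsize=figsize, dpi=dpi)
fig.subplots_adjust(**pars)

extent = ax.get_window_extent()    # axes box in display (pixel) units
axes_px = (round(extent.width), round(extent.height))
print(axes_px)
```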
This example shows how you can plot lines with 1 pixel width:
yinch = 2
fig, ax = plt.subplots(figsize=(3,yinch), facecolor='w')
fig.subplots_adjust(left=0, right=1, bottom=0, top=1)
ypixels = int(yinch*fig.get_dpi())
for i in range(ypixels):
    if i % 2 == 0:
        c = 'k'
    else:
        c = 'w'
    ax.plot([0,np.random.rand()], [i,i], color=c, linewidth=72./fig.get_dpi())
ax.set_ylim(0,ypixels)
ax.axis('off')
This is what the result looks like (magnified 200%):
edit:
Using a different dpi is not a problem, but then using plot() becomes less useful because you can't specify the linewidth units. You can calculate the needed linewidth, but I think using barh() is clearer in that scenario.
In the example above I simply disabled the axis to focus on the 1px bars; if you remove that line you can plot as normal. Spacing around the axes is not a problem because Matplotlib isn't bound to the 0-1 range for a Figure, but you will want to add bbox_inches='tight' to your savefig call to include artists outside of the normal 0-1 range. If you spend a lot of time on 'precise' plotting within your axes, I think it's easier to stretch the axes to span the entire figure size. You can of course take a different approach, but that would require you to also calculate the axes size in inches. Both angles would work; which is more convenient depends on your precise case.
Also be aware that old versions of Matplotlib (<2.0?) have a different default figure.dpi and savefig.dpi. You can avoid this by adding dpi=fig.get_dpi() to your savefig statement. One of many reasons to upgrade. ;)
yinch = 2
dpi = 128
fig, ax = plt.subplots(figsize=(3,yinch), facecolor='w', dpi=dpi)
fig.subplots_adjust(left=0, right=1, bottom=0, top=1)
ypixels = int(yinch*fig.get_dpi())
for i in range(ypixels):
    if i % 2 == 0:
        c = '#aa0000'
    else:
        c = 'w'
    ax.barh(i,np.random.rand(), height=1, color=c)
ax.set_title('DPI %i' % dpi)
ax.set_ylim(0,ypixels)
fig.savefig('mypic.png', bbox_inches='tight')
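The linewidth calculation mentioned in the edit comes down to a unit conversion: Matplotlib line widths are given in points (1 point = 1/72 inch), so one pixel corresponds to 72/dpi points. A minimal sketch, with a made-up helper name `pixels_to_points`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def pixels_to_points(pixels, dpi):
    """Convert a length in pixels to points (1 point = 1/72 inch)."""
    return pixels * 72.0 / dpi


dpi = 128
fig, ax = plt.subplots(figsize=(3, 2), dpi=dpi)

one_px = pixels_to_points(1, dpi)  # linewidth in points that covers one pixel
line, = ax.plot([0, 1], [0, 0], color="k", linewidth=one_px)
print(one_px, line.get_linewidth())
```

The conversion only holds if the figure is saved at the same dpi it was created with, which is why passing dpi=fig.get_dpi() to savefig is suggested above.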

How do I plot more than one set of bars per axis on a bar plot in python?

I currently use the align='edge' parameter and positive/negative widths in pyplot.bar() to plot the bar data of one metric to each axis. However, if I try to plot a second set of data to one axis, it covers the first set. Is there a way for pyplot to automatically space this data correctly?
lns3 = ax[1].bar(bucket_df.index,bucket_df.original_revenue,color='c',width=-0.4,align='edge')
lns4 = ax[1].bar(bucket_df.index,bucket_df.revenue_lift,color='m',bottom=bucket_df.original_revenue,width=-0.4,align='edge')
lns5 = ax3.bar(bucket_df.index,bucket_df.perc_first_priced,color='grey',width=0.4,align='edge')
lns6 = ax3.bar(bucket_df.index,bucket_df.perc_revenue_lift,color='y',width=0.4,align='edge')
This is what it looks like when I show the plot:
The data shown in yellow completely covers the data in grey. I'd like it to be shown next to the grey data.
Is there any easy way to do this? Thanks!
The first argument to the bar() plotting method is an array of the x-coordinates for your bars. Since you pass the same x-coordinates they will all overlap. You can get what you want by staggering the bars by doing something like this:
x = np.arange(10) # define your x-coordinates
width = 0.1 # set a width for your bars
offset = 0.15 # define an offset to separate each set of bars
fig, ax = plt.subplots() # define your figure and axes objects
ax.bar(x, y1, width=width) # plot the first set of bars
ax.bar(x + offset, y2, width=width) # plot the second set of bars, shifted to the right
Since you have a few sets of data to plot, it makes more sense to make the code a bit more concise (assume y_vals is a list containing the y-coordinates you'd like to plot, bucket_df.original_revenue, bucket_df.revenue_lift, etc.). Then your plotting code could look like this:
for i, y in enumerate(y_vals):
    ax.bar(x + i * offset, y, width=width)
If you want to plot more sets of bars you can decrease the width and offset accordingly.
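One way to pick the width and offset automatically is to derive them from the number of series, so that each group of bars stays centred on its tick. This is a sketch with made-up data; `group_width` (the total width taken up by one group) is an assumption chosen for illustration, not something from the question.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical data: three series with four buckets each
y_vals = [np.array([3, 5, 2, 4]),
          np.array([2, 4, 4, 1]),
          np.array([1, 2, 3, 5])]

x = np.arange(len(y_vals[0]))
group_width = 0.8                     # total width of one group of bars
width = group_width / len(y_vals)     # width of a single bar

fig, ax = plt.subplots()
containers = []
for i, y in enumerate(y_vals):
    # Shift each series so that the whole group is centred on the tick
    offset = (i - (len(y_vals) - 1) / 2) * width
    containers.append(ax.bar(x + offset, y, width=width, label=f"series {i}"))

ax.set_xticks(x)
ax.legend()
fig.savefig("grouped_bars.png")
```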

Subplots: tight_layout changes figure size

Changing the vertical distance between two subplots using tight_layout(h_pad=-1) changes the total figure size. How can I define the figure size when using tight_layout?
Here is the code:
#define figure
pl.figure(figsize=(10, 6.25))
ax1=subplot(211)
img=pl.imshow(np.random.random((10,50)), interpolation='none')
ax1.set_xticklabels(()) #hides the tickslabels of the first plot
subplot(212)
x=linspace(0,50)
pl.plot(x,x,'k-')
xlim( ax1.get_xlim() ) #same x-axis for both plots
And here is the result:
If I write
pl.tight_layout(h_pad=-2)
in the last line, then I get this:
As you can see, the figure is bigger...
You can use a GridSpec object to control precisely width and height ratios, as answered on this thread and documented here.
Experimenting with your code, I could produce something like what you want, by using a height_ratio that assigns twice the space to the upper subplot, and increasing the h_pad parameter to the tight_layout call. This does not sound completely right, but maybe you can adjust this further ...
import numpy as np
from matplotlib.pyplot import *
import matplotlib.pyplot as pl
import matplotlib.gridspec as gridspec
#define figure
fig = pl.figure(figsize=(10, 6.25))
gs = gridspec.GridSpec(2, 1, height_ratios=[2,1])
ax1=subplot(gs[0])
img=pl.imshow(np.random.random((10,50)), interpolation='none')
ax1.set_xticklabels(()) #hides the tickslabels of the first plot
ax2=subplot(gs[1])
x=np.linspace(0,50)
ax2.plot(x,x,'k-')
xlim( ax1.get_xlim() ) #same x-axis for both plots
fig.tight_layout(h_pad=-5)
show()
There were other issues, like correcting the imports, adding numpy, and plotting to ax2 instead of directly with pl. The output I see is this:
This case is peculiar because of the fact that the default aspect ratios of images and plots are not the same. So it is worth noting for people looking to remove the spaces in a grid of subplots consisting of images only or of plots only that you may find an appropriate solution among the answers to this question (and those linked to it): How to remove the space between subplots in matplotlib.pyplot?.
The aspect ratios of the subplots in this particular example are as follows:
# Default aspect ratio of images:
ax1.get_aspect()
# 1.0
# Which is as it is expected based on the default settings in rcParams file:
matplotlib.rcParams['image.aspect']
# 'equal'
# Default aspect ratio of plots:
ax2.get_aspect()
# 'auto'
The size of ax1 and the space beneath it are adjusted automatically based on the number of pixels along the x-axis (i.e. width) so as to preserve the 'equal' aspect ratio while fitting both subplots within the figure. As you mentioned, using fig.tight_layout(h_pad=xxx) or the similar fig.set_constrained_layout_pads(hspace=xxx) is not a good option as this makes the figure larger.
To remove the gap while preserving the original figure size, you can use fig.subplots_adjust(hspace=xxx) or the equivalent plt.subplots(gridspec_kw=dict(hspace=xxx)), as shown in the following example:
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
np.random.seed(1)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6.25),
                               gridspec_kw=dict(hspace=-0.206))
# For those not using plt.subplots, you can use this instead:
# fig.subplots_adjust(hspace=-0.206)
size = 50
ax1.imshow(np.random.random((10, size)))
ax1.xaxis.set_visible(False)
# Create plot of a line that is aligned with the image above
x = np.arange(0, size)
ax2.plot(x, x, 'k-')
ax2.set_xlim(ax1.get_xlim())
plt.show()
I am not aware of any way to define the appropriate hspace automatically so that the gap can be removed for any image width. As stated in the docstring for fig.subplots_adjust(), it corresponds to the height of the padding between subplots, as a fraction of the average axes height. So I attempted to compute hspace by dividing the gap between the subplots by the average height of both subplots like this:
# Extract axes positions in figure coordinates
ax1_x0, ax1_y0, ax1_x1, ax1_y1 = np.ravel(ax1.get_position())
ax2_x0, ax2_y0, ax2_x1, ax2_y1 = np.ravel(ax2.get_position())
# Compute negative hspace to close the vertical gap between subplots
ax1_h = ax1_y1-ax1_y0
ax2_h = ax2_y1-ax2_y0
avg_h = (ax1_h+ax2_h)/2
gap = ax1_y0-ax2_y1
hspace=-(gap/avg_h) # this divided by 2 also does not work
fig.subplots_adjust(hspace=hspace)
Unfortunately, this does not work. Maybe someone else has a solution for this.
It is also worth mentioning that I tried removing the gap between subplots by editing the y positions like in this example:
# Extract axes positions in figure coordinates
ax1_x0, ax1_y0, ax1_x1, ax1_y1 = np.ravel(ax1.get_position())
ax2_x0, ax2_y0, ax2_x1, ax2_y1 = np.ravel(ax2.get_position())
# Set new y positions: shift ax1 down over gap
gap = ax1_y0-ax2_y1
ax1.set_position([ax1_x0, ax1_y0-gap, ax1_x1, ax1_y1-gap])
ax2.set_position([ax2_x0, ax2_y0, ax2_x1, ax2_y1])
Unfortunately, this (and variations of this) produces seemingly unpredictable results, including a figure resizing similar to when using fig.tight_layout(). Maybe someone else has an explanation for what is happening here behind the scenes.
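One possible explanation, offered as a reading of the documented `Axes.set_position` signature rather than a confirmed diagnosis of the behaviour described above: `set_position` expects `[left, bottom, width, height]`, whereas the snippet passes the corner coordinates x1 and y1 in the width and height slots. The following sketch passes width and height instead and checks that the gap closes without changing the figure size; it only covers this one example and assumes no layout engine (tight or constrained layout) is active.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

np.random.seed(1)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6.25))
size = 50
ax1.imshow(np.random.random((10, size)))
ax1.xaxis.set_visible(False)
x = np.arange(0, size)
ax2.plot(x, x, 'k-')
ax2.set_xlim(ax1.get_xlim())

# Axes boxes in figure coordinates (Bbox objects with x0, y0, width, height)
pos1 = ax1.get_position()
pos2 = ax2.get_position()

# Shift ax1 down over the gap, keeping its width and height unchanged
gap = pos1.y0 - pos2.y1
ax1.set_position([pos1.x0, pos1.y0 - gap, pos1.width, pos1.height])

remaining_gap = ax1.get_position().y0 - ax2.get_position().y1
fig_size = tuple(fig.get_size_inches())
print(remaining_gap, fig_size)
```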
