Matplotlib axis limits and text positions independent of dataset units - python

I'm trying to make plots that are formatted the same way despite coming from different datasets and I'm running into issues with getting consistent text positions and appropriate axis limits because the datasets are not scaled exactly the same. For example - say I generate the following elevation profile:
import matplotlib.pyplot as plt
import numpy as np
Distance=np.array([1000,3000,7000,15000,20000])
Elevation=np.array([100,200,350,800,400])
def MyPlot(X,Y):
    fig = plt.figure()
    ax = fig.add_subplot(111, aspect='equal')
    ax.plot(X,Y)
    fig.set_size_inches(fig.get_size_inches()*2)
    ax.set_ylim(min(Y)-50, max(Y)+500)
    ax.set_xlim(min(X)-50, max(X)+50)
    MaxPoint=X[np.argmax(Y)], max(Y)
    ax.scatter(MaxPoint[0], MaxPoint[1], s=10)
    ax.text(MaxPoint[0], MaxPoint[1]+100, s='Maximum = '+str(MaxPoint[1]), fontsize=8)
MyPlot(Distance,Elevation)
And then I have another dataset that's scaled differently:
Distance2=Distance*4
Elevation2=Elevation*5
MyPlot(Distance2,Elevation2)
Because a unit change is relatively much larger in the first dataset than in the second, the text and axis labels are not formatted as I'd like in the second plot. Is there a way to set text positions and axis limits that adapts to the relative scale of the dataset?

First off, for placing text with an offset such as this, you almost never want to use text. Instead, use annotate. The advantage is that you can give an offset of the text in points instead of data units.
Next, to reduce the density of tick locations, use ax.locator_params and change the nbins parameter. nbins controls the tick density. Tick locations will still be automatically chosen, but reducing nbins will reduce the maximum number of tick locations. If you do lower nbins, you may want to also change the numbers that matplotlib considers "even" when picking tick intervals. That way, you have more options to get the expected number of ticks.
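As a rough illustration of how nbins caps the tick count, the sketch below counts the y ticks that fall inside the view limits before and after calling ax.locator_params. The helper name visible_yticks is made up for this example, and the Agg backend is selected only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

def visible_yticks(ax):
    """Return the y tick locations that lie inside the current y limits."""
    lo, hi = sorted(ax.get_ylim())
    ticks = np.asarray(ax.get_yticks())
    return ticks[(ticks >= lo) & (ticks <= hi)]

fig, ax = plt.subplots()
ax.plot([1000, 3000, 7000, 15000, 20000], [100, 200, 350, 800, 400])

# Tick count with matplotlib's default locator settings
default_ticks = visible_yticks(ax)

# nbins is the maximum number of intervals, so at most nbins + 1 ticks fit in view
ax.locator_params(axis='y', nbins=3)
reduced_ticks = visible_yticks(ax)

print(len(default_ticks), len(reduced_ticks))
plt.close(fig)
```

The exact tick values still depend on the data range; only the upper bound on their number is controlled here.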
Finally, to avoid manually setting limits with a set padding, consider using margins(some_percentage) to pad the extents by a percentage of the current limits.
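To see why margins behaves the same for differently scaled data, here is a minimal sketch that plots the question's data at two scales and measures the padding as a fraction of the data range. The helper padded_limits and its default padding values are assumptions chosen for illustration, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

def padded_limits(x, y, xpad=0.05, ypad=0.3):
    """Plot x, y and return the (xlim, ylim) that ax.margins produces."""
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.margins(x=xpad, y=ypad)  # padding as a fraction of the data range
    limits = ax.get_xlim(), ax.get_ylim()
    plt.close(fig)
    return limits

distance = np.array([1000, 3000, 7000, 15000, 20000])
elevation = np.array([100, 200, 350, 800, 400])

xlim, ylim = padded_limits(distance, elevation)
xlim4, ylim5 = padded_limits(distance * 4, elevation * 5)

# Space above the peak, relative to the data range, for both scalings
frac_small = (ylim[1] - elevation.max()) / np.ptp(elevation)
frac_large = (ylim5[1] - (elevation * 5).max()) / np.ptp(elevation * 5)
print(frac_small, frac_large)
```

Both fractions should come out at about 0.3, whereas a fixed offset such as max(Y)+500 shrinks in relative terms as the data grows.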
Here is a complete example that combines all three:
import matplotlib.pyplot as plt
import numpy as np
distance=np.array([1000,3000,7000,15000,20000])
elevation=np.array([100,200,350,800,400])
def plot(x, y):
    fig, ax = plt.subplots(figsize=(8, 2))
    # Plot your data and place a marker at the peak location
    maxpoint=x[np.argmax(y)], max(y)
    ax.scatter(maxpoint[0], maxpoint[1], s=10)
    ax.plot(x, y)
    # Reduce the maximum number of ticks and give matplotlib more flexibility
    # in the tick intervals it can choose.
    # Essentially, this will more or less always have two ticks on the y-axis
    # and 4 on the x-axis
    ax.locator_params(axis='y', nbins=3, steps=range(1, 11))
    ax.locator_params(axis='x', nbins=5, steps=range(1, 11))
    # Annotate the peak location. The text will always be 5 points from the
    # data location.
    ax.annotate('Maximum = {:0.0f}'.format(maxpoint[1]), size=8,
                xy=maxpoint, xytext=(5, 5), textcoords='offset points')
    # Give ourselves lots of padding on the y-axis, less on the x
    ax.margins(x=0.01, y=0.3)
    ax.set_ylim(bottom=y.min())
    # Set the aspect of the plot to be equal and add some x/y labels
    ax.set(xlabel='Distance', ylabel='Elevation', aspect=1)
    plt.show()
plot(distance,elevation)
And if we change the data:
plot(distance * 4, elevation * 5)
Finally, you might consider placing the annotation just above the top of the axis, instead of offset from the point:
ax.annotate('Maximum = {:0.0f}'.format(maxpoint[1]), ha='center',
            size=8, xy=(maxpoint[0], 1), xytext=(0, 5),
            textcoords='offset points',
            xycoords=('data', 'axes fraction'))

Maybe you should use seaborn, which draws no borders. I think it's a very good approach.
It will look like this:
You should add the line import seaborn at the beginning of the script.

Related

Adjust axes to make space for offset line plot

I would like to plot a series of curves in the same Axes, each having a constant y offset from each other. Because the data I have needs to be displayed in log scale, simply adding a y offset to each curve (as done here) does not give the desired output.
I have tried using matplotlib.transforms to achieve the same, i.e. artificially shifting the curve in Figure coordinates. This achieves the desired result, but requires adjusting the Axes y limits so that the shifted curves are visible. Here is an example to illustrate this, though such data would not require log scale to be visible:
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1,1)
for i in range(1,19):
    x, y = np.arange(200), np.random.rand(200)
    dy = 0.5*i
    shifted = mpl.transforms.offset_copy(ax.transData, y=dy, fig=fig, units='inches')
    ax.set_xlim(0, 200)
    ax.set_ylim(0.1, 1e20)
    ax.set_yscale('log')
    ax.plot(x, y, transform=shifted, c=mpl.cm.plasma(i/18), lw=2)
The problem is that to make all the shifted curves visible, I would need to adjust the ylim to a very high number, which compresses all the curves so that the features visible because of the log scale cannot be seen anymore.
Since the displayed y axis values are meaningless to me, is there any way to artificially extend the Axes limits to display all the curves, without having to make the Figure very large? Apparently this can be done with seaborn, but if possible I would like to stick to matplotlib.
EDIT:
This is the kind of data I need to plot (an X-ray diffraction pattern varying with temperature):

Using AxesGrid to plot data with different x and y range in a square [duplicate]

I want the x and y axes to be of equal length (i.e. the plot minus the legend should be square). I wish to plot the legend outside (I have already been able to put the legend outside the box). The span of the x axis in the data (x_max - x_min) is not the same as the span of the y axis in the data (y_max - y_min).
This is the relevant part of the code that I have at the moment:
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5), fontsize=15 )
plt.axis('equal')
plt.tight_layout()
The following link is an example of an output plot that I am getting : plot
How can I do this?
Would plt.axis('scaled') be what you're after? That would produce a square plot, if the data limits are of equal difference.
If they are not, you could get a square plot by setting the aspect of the axes to the ratio of xlimits and ylimits.
import numpy as np
import matplotlib.pyplot as plt
fig, (ax1, ax2) = plt.subplots(1,2)
ax1.plot([-2.5, 2.5], [-4,13], "s-")
ax1.axis("scaled")
ax2.plot([-2.5, 2.5], [-4,13], "s-")
ax2.set_aspect(np.diff(ax2.get_xlim())[0]/np.diff(ax2.get_ylim())[0])  # [0] turns the 1-element arrays into scalars
plt.show()
One option you have is to manually set the limits, assuming that you know the size of your dataset.
axes = plt.gca()
axes.set_xlim([xmin,xmax])
axes.set_ylim([ymin,ymax])
A better option would be to iterate through your data to find the maximum x- and y-coordinates, take the greater of those two numbers, add a little bit more to that value to act as a buffer, and set xmax and ymax to that new value. You can use a similar method to set xmin and ymin: instead of finding the maximums, find the minimums.
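The approach just described could be sketched as follows. The function name common_limits and the buffer_fraction parameter are hypothetical, and sizing the buffer as a fraction of the overall span is an assumption; the text above only says to add "a little bit more".

```python
import numpy as np

def common_limits(x, y, buffer_fraction=0.05):
    """Return one (lower, upper) pair that covers both x and y, plus a buffer."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    upper = max(x.max(), y.max())  # the greater of the two maxima
    lower = min(x.min(), y.min())  # the lesser of the two minima
    pad = buffer_fraction * (upper - lower)  # assumed buffer: fraction of the span
    return lower - pad, upper + pad

lo, hi = common_limits([-2.5, 2.5], [-4, 13])
print(lo, hi)
```

The same pair can then be passed to both axes.set_xlim([lo, hi]) and axes.set_ylim([lo, hi]). Note that the data will only fill part of the axis whose span is smaller.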
To put the legend outside of the plot, I would look at this question: How to put the legend out of the plot

How to remove a particular grid line corresponding to a custom xtick on a log scale axis?

I'd like to remove the vertical grid line corresponding to the custom xtick (displayed at x = 71 in the below picture). I could remove a horizontal grid line corresponding to the ytick 701 in the below picture by using a hack : since I have no minor tick on the y axis, I defined the custom ytick corresponding to the line that points toward the maximum and crosses the y axis as a minor tick, and then I disabled grid lines for minor ticks on the y axis. Unfortunately I cannot use the same hack on the x axis without disabling the grid lines of the minor ticks and that's something I'd like to avoid at all costs.
Below is a not-so-minimal, albeit still working, example.
There are many things I don't understand. The two main ones are why
locs, labels = plt.xticks()
not return the locs and labels that are plotted and why I don't get xticks labels displayed as 10^x where x = 0, 1, 2 and 3 but that's outside the scope of the original question.
import matplotlib.pyplot as plt
plt.grid(True)
import numpy as np
# Generate data
x_data = np.arange(1, 1000 , 10)
y_data = np.random.lognormal(1e-5, 3, len(x_data))
y_max = max(y_data)
# plot
plt.xscale('log')
import math
ratio_log = math.log(x_data[np.argmax(y_data)]) / math.log(max(x_data)) # I need to do this in order to plot a horizontal red dashed line that points to the max and do not extend any further.
plt.axhline(y=y_max, xmin=0, xmax = ratio_log, color='r', linestyle='--') # horizontal line pointing to the max y value.
axes = plt.gca()
axes.set_xlim([1, max(x_data)]) # Limits for the x axis.
# custom ticks and labels
# First yticks because I'm able to achieve what I seek
axes.set_yticks([int(y_max)], minor=True) # Sets the custom ytick as a minor one.
from matplotlib.ticker import FormatStrFormatter
axes.yaxis.set_minor_formatter(FormatStrFormatter("%.0f"))
axes.yaxis.grid(False, which='minor') # Removes minor yticks grid. Since I only have my custom yticks as a minor one, this will disable only the grid line corresponding to that ytick. That's a hack.
import matplotlib.ticker as plticker
loc = plticker.MultipleLocator(base=y_max / 3.3) # this locator puts ticks at regular intervals. I ensure the y axis ticks look ok.
axes.yaxis.set_major_locator(loc)
# Now xticks. I'm having a lot of difficulty here, unable to remove the grid of a particular custom xticks.
locs, labels = plt.xticks() # Strangely, this doesn't return the locs and labels that are plotted. There are indeed 2 values that aren't displayed in the plot, here 1.00000000e-01 and 1.00000000e+04. I've got to remove them before I can append my custom loc and label.
# This means that if I do: plt.xticks(locs, labels) right here, it would enlarge both the lower and upper limits on the x axis... I fail to see how that's intuitive or useful at all. Might this be a bug?
locs = np.append(locs[1:-1], np.asarray(x_data[np.argmax(y_data)])) # One of the ugliest hack I have ever seen... to get correct ticks and labels.
labels = (str(int(loc)) for loc in locs) # Just visuals to get integers on the axis.
plt.xticks(locs, labels) # updates the xticks and labels.
plt.plot((x_data[np.argmax(y_data)], x_data[np.argmax(y_data)]), (0, y_max), 'r--') # vertical line that points to the max. Non OO way to do it, so a bad way.
plt.plot(x_data, y_data)
plt.savefig('grid_prob.png')
plt.close()
Example picture below (the code outputs a different picture each time it is executed, but the problem appears in all pictures).
Credit for the idea goes to @ImportanceOfBeingErnest, to whom I am extremely grateful.
I removed the grid with
axes.xaxis.grid(False, which='both')
and then added a grid line for each xtick except the custom one with the following loop:
for loc in locs[1:-1]:
    if loc != x_data[np.argmax(y_data)]:
        plt.axvline(x=loc, color = 'grey', linestyle = '-', linewidth = 0.4)
Insert this code just before the line
plt.xticks(locs, labels) # updates the xticks and labels.
Example of output picture below.
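For reference, the same idea can be wrapped in a small helper. This is only a sketch: the name grid_except, the tick values, and the styling defaults are assumptions for illustration, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

def grid_except(ax, tick_locs, skip, **line_kwargs):
    """Turn off the built-in x grid and draw a line at every tick but `skip`."""
    style = dict(color='grey', linestyle='-', linewidth=0.4)
    style.update(line_kwargs)
    ax.xaxis.grid(False, which='both')  # disables major and minor x grid lines
    return [ax.axvline(x=loc, **style)
            for loc in tick_locs if not np.isclose(loc, skip)]

fig, ax = plt.subplots()
ax.set_xscale('log')
ax.set_xlim(1, 1000)

custom_tick = 71  # stands in for x_data[np.argmax(y_data)]
locs = [1, 10, 100, 1000, custom_tick]
ax.set_xticks(locs)
ax.set_xticklabels([str(loc) for loc in locs])

lines = grid_except(ax, locs, skip=custom_tick)
print(len(lines))
plt.close(fig)
```

As in the loop above, the built-in minor grid is switched off as well, so minor grid lines would have to be redrawn by hand if they are needed.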

Setting both axes logarithmic in bar plot matplotlib

I have already binned data to plot a histogram. For this reason I'm using the plt.bar() function. I'd like to set both axes in the plot to a logarithmic scale.
Calling plt.bar(x, y, width=10, color='b', log=True) lets me set the y-axis to log, but I can't set the x-axis to logarithmic.
I've tried plt.xscale('log'), but unfortunately this doesn't work right: the x-axis ticks vanish and the bars don't have equal width.
I would be grateful for any help.
By default, the bars of a barplot have a width of 0.8. Therefore they appear larger for smaller x values on a logarithmic scale. If instead of specifying a constant width, one uses the distance between the bin edges and supplies this to the width argument, the bars will have the correct width. One would also need to set the align to "edge" for this to work.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)
x = np.logspace(0, 5, num=21)
y = (np.sin(1.e-2*(x[:-1]-20))+3)**10
fig, ax = plt.subplots()
ax.bar(x[:-1], y, width=np.diff(x), log=True,ec="k", align="edge")
ax.set_xscale("log")
plt.show()
I cannot reproduce missing ticklabels for a logarithmic scaling. This may be due to some settings in the code that are not shown in the question or due to the fact that an older matplotlib version is used. The example here works fine with matplotlib 2.0.
If the goal is to have equal width bars, assuming datapoints are not equidistant, then the most proper solution is to set width as
plt.bar(x, y, width=c*np.array(x), color='b', log=True) for a constant c appropriate for the plot. Alignment can be anything.
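A quick numerical check of why a width proportional to x gives equal-looking bars on a log axis: the visual width is the difference of the logarithms of the bar edges, and the factor x cancels out. The positions and the constant c below are arbitrary values picked for illustration.

```python
import numpy as np

x = np.array([1.0, 10.0, 250.0, 4000.0])  # non-equidistant bar positions
c = 0.5                                     # arbitrary constant, must stay below 2 for centered bars

width = c * x

# align="edge": each bar spans [x, x + width], so the log-width is log10(1 + c)
edge_span = np.log10(x + width) - np.log10(x)

# align="center": each bar spans [x - width/2, x + width/2],
# so the log-width is log10((1 + c/2) / (1 - c/2))
center_span = np.log10(x + width / 2) - np.log10(x - width / 2)

print(edge_span)
print(center_span)
```

Every entry within each array should be identical, which matches the remark that the alignment can be anything.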
I know this is a very old question and you may have solved it already, but I came to this post because I had a similar problem on the y-axis, and I managed to solve it just by using ax.set_ylim(df['my data'].min()+100, df['my data'].max()+100). My y-axis holds some information that I thought would be best shown on a log scale, but when I set the log scale I couldn't read the numbers properly (like the x-axis in this post), so I dropped the log scale and used the min and max arguments instead. That scales my graph much like a log scale would. I'm still looking for a way that doesn't need the -+100 in set_ylim.
While this does not actually use pyplot.bar, I think this method could be helpful in achieving what the OP is trying to do. I found this to be easier than trying to calibrate the width as a function of the log-scale, though it's more steps. Create a line collection whose width is independent of the chart scale.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.collections as coll
#Generate data and sort into bins
a = np.random.logseries(0.5, 1000)
hist, bin_edges = np.histogram(a, bins=20, density=False)
x = bin_edges[:-1] # remove the top-end from bin_edges to match dimensions of hist
lines = []
for i in range(len(x)):
    pair=[(x[i],0), (x[i], hist[i])]
    lines.append(pair)
linecoll = coll.LineCollection(lines, linewidths=10, linestyles='solid')
fig, ax = plt.subplots()
ax.add_collection(linecoll)
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlim(min(x)/10,max(x)*10)
ax.set_ylim(0.1,1.1*max(hist)) #since this is an unweighted histogram, the logy doesn't make much sense.
Resulting plot - no frills
One drawback is that the "bars" will be centered, but this could be changed by offsetting the x-values by half of the linewidth value ... I think it would be
x_new = x + (linewidth/2)*10**round(np.log10(x),0).

Subplots: tight_layout changes figure size

Changing the vertical distance between two subplots using tight_layout(h_pad=-1) changes the total figure size. How can I define the figure size using tight_layout?
Here is the code:
#define figure
pl.figure(figsize=(10, 6.25))
ax1=subplot(211)
img=pl.imshow(np.random.random((10,50)), interpolation='none')
ax1.set_xticklabels(()) #hides the tickslabels of the first plot
subplot(212)
x=linspace(0,50)
pl.plot(x,x,'k-')
xlim( ax1.get_xlim() ) #same x-axis for both plots
And here is the result:
If I write
pl.tight_layout(h_pad=-2)
in the last line, then I get this:
As you can see, the figure is bigger...
You can use a GridSpec object to control precisely width and height ratios, as answered on this thread and documented here.
Experimenting with your code, I could produce something like what you want, by using a height_ratio that assigns twice the space to the upper subplot, and increasing the h_pad parameter to the tight_layout call. This does not sound completely right, but maybe you can adjust this further ...
import numpy as np
from matplotlib.pyplot import *
import matplotlib.pyplot as pl
import matplotlib.gridspec as gridspec
#define figure
fig = pl.figure(figsize=(10, 6.25))
gs = gridspec.GridSpec(2, 1, height_ratios=[2,1])
ax1=subplot(gs[0])
img=pl.imshow(np.random.random((10,50)), interpolation='none')
ax1.set_xticklabels(()) #hides the tickslabels of the first plot
ax2=subplot(gs[1])
x=np.linspace(0,50)
ax2.plot(x,x,'k-')
xlim( ax1.get_xlim() ) #same x-axis for both plots
fig.tight_layout(h_pad=-5)
show()
There were other issues, like correcting the imports, adding numpy, and plotting to ax2 instead of directly with pl. The output I see is this:
This case is peculiar because the default aspect ratios of images and plots are not the same. So it is worth noting, for people looking to remove the spaces in a grid of subplots consisting of images only or of plots only, that you may find an appropriate solution among the answers to this question (and those linked to it): How to remove the space between subplots in matplotlib.pyplot?.
The aspect ratios of the subplots in this particular example are as follows:
# Default aspect ratio of images:
ax1.get_aspect()
# 1.0
# Which is as it is expected based on the default settings in rcParams file:
matplotlib.rcParams['image.aspect']
# 'equal'
# Default aspect ratio of plots:
ax2.get_aspect()
# 'auto'
The size of ax1 and the space beneath it are adjusted automatically based on the number of pixels along the x-axis (i.e. width) so as to preserve the 'equal' aspect ratio while fitting both subplots within the figure. As you mentioned, using fig.tight_layout(h_pad=xxx) or the similar fig.set_constrained_layout_pads(hspace=xxx) is not a good option as this makes the figure larger.
To remove the gap while preserving the original figure size, you can use fig.subplots_adjust(hspace=xxx) or the equivalent plt.subplots(gridspec_kw=dict(hspace=xxx)), as shown in the following example:
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
np.random.seed(1)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6.25),
gridspec_kw=dict(hspace=-0.206))
# For those not using plt.subplots, you can use this instead:
# fig.subplots_adjust(hspace=-0.206)
size = 50
ax1.imshow(np.random.random((10, size)))
ax1.xaxis.set_visible(False)
# Create plot of a line that is aligned with the image above
x = np.arange(0, size)
ax2.plot(x, x, 'k-')
ax2.set_xlim(ax1.get_xlim())
plt.show()
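To make the meaning of hspace concrete, here is a minimal sketch with two plain line-plot axes (no image, so the aspect stays 'auto'). It checks that the gap between the subplots equals hspace times the axes height and that the figure size is left alone. The value 0.5 is arbitrary, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6.25))
size_before = tuple(fig.get_size_inches())

fig.subplots_adjust(hspace=0.5)

# Axes positions in figure coordinates
top = ax1.get_position()
bottom = ax2.get_position()

gap = top.y0 - bottom.y1    # vertical space between the two axes
ratio = gap / top.height    # both axes have the same height here

size_after = tuple(fig.get_size_inches())
print(ratio, size_before == size_after)
plt.close(fig)
```

For these plain axes the ratio should match the requested hspace. With an image in the upper axes, the 'equal' aspect shrinks the drawn box further, as described above, so the value that visually closes the gap differs from this simple relation.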
I am not aware of any way to define the appropriate hspace automatically so that the gap can be removed for any image width. As stated in the docstring for fig.subplots_adjust(), it corresponds to the height of the padding between subplots, as a fraction of the average axes height. So I attempted to compute hspace by dividing the gap between the subplots by the average height of both subplots like this:
# Extract axes positions in figure coordinates
ax1_x0, ax1_y0, ax1_x1, ax1_y1 = np.ravel(ax1.get_position())
ax2_x0, ax2_y0, ax2_x1, ax2_y1 = np.ravel(ax2.get_position())
# Compute negative hspace to close the vertical gap between subplots
ax1_h = ax1_y1-ax1_y0
ax2_h = ax2_y1-ax2_y0
avg_h = (ax1_h+ax2_h)/2
gap = ax1_y0-ax2_y1
hspace=-(gap/avg_h) # this divided by 2 also does not work
fig.subplots_adjust(hspace=hspace)
Unfortunately, this does not work. Maybe someone else has a solution for this.
It is also worth mentioning that I tried removing the gap between subplots by editing the y positions like in this example:
# Extract axes positions in figure coordinates
ax1_x0, ax1_y0, ax1_x1, ax1_y1 = np.ravel(ax1.get_position())
ax2_x0, ax2_y0, ax2_x1, ax2_y1 = np.ravel(ax2.get_position())
# Set new y positions: shift ax1 down over gap
gap = ax1_y0-ax2_y1
ax1.set_position([ax1_x0, ax1_y0-gap, ax1_x1, ax1_y1-gap])
ax2.set_position([ax2_x0, ax2_y0, ax2_x1, ax2_y1])
Unfortunately, this (and variations of this) produces seemingly unpredictable results, including a figure resizing similar to when using fig.tight_layout(). Maybe someone else has an explanation for what is happening here behind the scenes.
