I would like to plot a series of curves in the same Axes each having a constant y offset from eachother. Because the data I have needs to be displayed in log scale, simply adding a y offset to each curve (as done here) does not give the desired output.
I have tried using matplotlib.transforms to achieve the same, i.e. artificially shifting the curve in Figure coordinates. This achieves the desired result, but requires adjusting the Axes y limits so that the shifted curves are visible. Here is an example to illustrate this, though such data would not require log scale to be visible:
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1,1)
for i in range(1,19):
x, y = np.arange(200), np.random.rand(200)
dy = 0.5*i
shifted = mpl.transforms.offset_copy(ax.transData, y=dy, fig=fig, units='inches')
ax.set_xlim(0, 200)
ax.set_ylim(0.1, 1e20)
ax.set_yscale('log')
ax.plot(x, y, transform=shifted, c=mpl.cm.plasma(i/18), lw=2)
The problem is that to make all the shifted curves visible, I would need to adjust the ylim to a very high number, which compresses all the curves so that the features visible because of the log scale cannot be seen anymore.
Since the displayed y axis values are meaningless to me, is there any way to artificially extend the Axes limits to display all the curves, without having to make the Figure very large? Apparently this can be done with seaborn, but if possible I would like to stick to matplotlib.
EDIT:
This is the kind of data I need to plot (an X-ray diffraction pattern varying with temperature):
Related
I am starting to play around with creating polar plots in Matplotlib that do NOT encompass an entire circle - i.e. a "wedge" plot - by setting the thetamin and thetamax properties. This is something I was waiting for for a long time, and I am glad they have it done :)
However, I have noticed that the figure location inside the axes seem to change in a strange manner when using this feature; depending on the wedge angular aperture, it can be difficult to fine tune the figure so it looks nice.
Here's an example:
import numpy as np
import matplotlib.pyplot as plt
# get 4 polar axes in a row
fig, axes = plt.subplots(2, 2, subplot_kw={'projection': 'polar'},
figsize=(8, 8))
# set facecolor to better display the boundaries
# (as suggested by ImportanceOfBeingErnest)
fig.set_facecolor('paleturquoise')
for i, theta_max in enumerate([2*np.pi, np.pi, 2*np.pi/3, np.pi/3]):
# define theta vector with varying end point and some data to plot
theta = np.linspace(0, theta_max, 181)
data = (1/6)*np.abs(np.sin(3*theta)/np.sin(theta/2))
# set 'thetamin' and 'thetamax' according to data
axes[i//2, i%2].set_thetamin(0)
axes[i//2, i%2].set_thetamax(theta_max*180/np.pi)
# actually plot the data, fine tune radius limits and add labels
axes[i//2, i%2].plot(theta, data)
axes[i//2, i%2].set_ylim([0, 1])
axes[i//2, i%2].set_xlabel('Magnitude', fontsize=15)
axes[i//2, i%2].set_ylabel('Angles', fontsize=15)
fig.set_tight_layout(True)
#fig.savefig('fig.png', facecolor='skyblue')
The labels are in awkward locations and over the tick labels, but can be moved closer or further away from the axes by adding an extra labelpad parameter to set_xlabel, set_ylabel commands, so it's not a big issue.
Unfortunately, I have the impression that the plot is adjusted to fit inside the existing axes dimensions, which in turn lead to a very awkward white space above and below the half circle plot (which of course is the one I need to use).
It sounds like something that should be reasonably easy to get rid of - I mean, the wedge plots are doing it automatically - but I can't seem to figure it out how to do it for the half circle. Can anyone shed a light on this?
EDIT: Apologies, my question was not very clear; I want to create a half circle polar plot, but it seems that using set_thetamin() you end up with large amounts of white space around the image (especially above and below) which I would rather have removed, if possible.
It's the kind of stuff that normally tight_layout() takes care of, but it doesn't seem to be doing the trick here. I tried manually changing the figure window size after plotting, but the white space simply scales with the changes. Below is a minimum working example; I can get the xlabel closer to the image if I want to, but saved image file still contains tons of white space around it.
Does anyone knows how to remove this white space?
import numpy as np
import matplotlib.pyplot as plt
# get a half circle polar plot
fig1, ax1 = plt.subplots(1, 1, subplot_kw={'projection': 'polar'})
# set facecolor to better display the boundaries
# (as suggested by ImportanceOfBeingErnest)
fig1.set_facecolor('skyblue')
theta_min = 0
theta_max = np.pi
theta = np.linspace(theta_min, theta_max, 181)
data = (1/6)*np.abs(np.sin(3*theta)/np.sin(theta/2))
# set 'thetamin' and 'thetamax' according to data
ax1.set_thetamin(0)
ax1.set_thetamax(theta_max*180/np.pi)
# actually plot the data, fine tune radius limits and add labels
ax1.plot(theta, data)
ax1.set_ylim([0, 1])
ax1.set_xlabel('Magnitude', fontsize=15)
ax1.set_ylabel('Angles', fontsize=15)
fig1.set_tight_layout(True)
#fig1.savefig('fig1.png', facecolor='skyblue')
EDIT 2: Added background color to figures to better show the boundaries, as suggested in ImportanteOfBeingErnest's answer.
It seems the wedge of the "truncated" polar axes is placed such that it sits in the middle of the original axes. There seems so be some constructs called LockedBBox and _WedgeBbox in the game, which I have never seen before and do not fully understand. Those seem to be created at draw time, such that manipulating them from the outside seems somewhere between hard and impossible.
One hack can be to manipulate the original axes such that the resulting wedge turns up at the desired position. This is not really deterministic, but rather looking for some good values by trial and error.
The parameters to adjust in this case are the figure size (figsize), the padding of the labels (labelpad, as already pointed out in the question) and finally the axes' position (ax.set_position([left, bottom, width, height])).
The result could then look like
import numpy as np
import matplotlib.pyplot as plt
# get a half circle polar plot
fig1, ax1 = plt.subplots(1, 1, figsize=(6,3.4), subplot_kw={'projection': 'polar'})
theta_min = 1.e-9
theta_max = np.pi
theta = np.linspace(theta_min, theta_max, 181)
data = (1/6.)*np.abs(np.sin(3*theta)/np.sin(theta/2.))
# set 'thetamin' and 'thetamax' according to data
ax1.set_thetamin(0)
ax1.set_thetamax(theta_max*180./np.pi)
# actually plot the data, fine tune radius limits and add labels
ax1.plot(theta, data)
ax1.set_ylim([0, 1])
ax1.set_xlabel('Magnitude', fontsize=15, labelpad=-60)
ax1.set_ylabel('Angles', fontsize=15)
ax1.set_position( [0.1, -0.45, 0.8, 2])
plt.show()
Here I've set some color to the background of the figure to better see the boundary.
I want to plot a KDE for some data with data that covers a large range in x-values. Therefore I want to use a logarithmic scale for the x-axis. For plotting I was using seaborn and the solution from Plotting 2D Kernel Density Estimation with Python, both of which fail once I set the xscale to logarithmic. When I take the logarithm of my x-data beforehand, everything looks fine, except the tics and ticlabels are still linear with the logarithm of the actual values as the labels. I could manually change the tics using something like:
labels = np.array(ax.get_xticks().tolist(), dtype=np.float64)
new_labels = [r'$10^{%.1f}$' % (labels[i]) for i in range(len(labels))]
ax.set_xticklabels(new_labels)
but in my eyes that looks just wrong and is nothing close to the axis labels (including the minor tics) when I would just use
ax.set_xscale('log')
Is there an easier way to plot a KDE with logarithmic x-data? Or is it possible to just change the tic- or label-scale without changing the scaling of the data, so that I could plot the logarithmic values of x and change the scaling of the labels afterwards?
Edit:
The plot I want to create looks like this:
The two right columns are what it is supposed to look like. There I used the the x data with the logarithm already applied. I don't like the labels on the x-axis, though.
The left column displays the plots, when the original data is used for the kde and all the other plots, and afterwards the scale is changed using
ax.set_xscale('log')
For some reason the kde, does not look like it is supposed to look. This is also not a result of erroneous data, since it looks just fine if the logarithmic data is used.
Edit 2:
A working example of code is
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
data = np.random.multivariate_normal((0, 0), [[0.8, 0.05], [0.05, 0.7]], 100)
x = np.power(10, data[:, 0])
y = data[:, 1]
fig, ax = plt.subplots(2, 1)
sns.kdeplot(data=np.log10(x), data2=y, ax=ax[0])
sns.kdeplot(data=x, data2=y, ax=ax[1])
ax[1].set_xscale('log')
plt.show()
The ax[1] plot is not displayed correctly for me (the x-axis is inverted), but the general behavior is the same as for the case described above. I believe the problem lies with the bandwidth of the kde, which should probably account for the logarithmic x-data.
I found an answer that works for me and wanted to post it in case someone else has a similar problem.
Based on the accepted answer from this post, I defined a function that first applies the logarithm to the x-data and after the KDE was performed, transforms the x-values back to the original values. Afterwards I can simply plot the contours and use ax.set_xscale('log')
import numpy as np
import scipy.stats as st
def logx_kde(x, y, xmin, xmax, ymin, ymax):
x = np.log10(x)
# Peform the kernel density estimate
xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([xx.ravel(), yy.ravel()])
values = np.vstack([x, y])
kernel = st.gaussian_kde(values)
f = np.reshape(kernel(positions).T, xx.shape)
return np.power(10, xx), yy, f
I have already binned data to plot a histogram. For this reason I'm using the plt.bar() function. I'd like to set both axes in the plot to a logarithmic scale.
If I set plt.bar(x, y, width=10, color='b', log=True) which lets me set the y-axis to log but I can't set the x-axis logarithmic.
I've tried plt.xscale('log') unfortunately this doesn't work right. The x-axis ticks vanish and the sizes of the bars don't have equal width.
I would be grateful for any help.
By default, the bars of a barplot have a width of 0.8. Therefore they appear larger for smaller x values on a logarithmic scale. If instead of specifying a constant width, one uses the distance between the bin edges and supplies this to the width argument, the bars will have the correct width. One would also need to set the align to "edge" for this to work.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)
x = np.logspace(0, 5, num=21)
y = (np.sin(1.e-2*(x[:-1]-20))+3)**10
fig, ax = plt.subplots()
ax.bar(x[:-1], y, width=np.diff(x), log=True,ec="k", align="edge")
ax.set_xscale("log")
plt.show()
I cannot reproduce missing ticklabels for a logarithmic scaling. This may be due to some settings in the code that are not shown in the question or due to the fact that an older matplotlib version is used. The example here works fine with matplotlib 2.0.
If the goal is to have equal width bars, assuming datapoints are not equidistant, then the most proper solution is to set width as
plt.bar(x, y, width=c*np.array(x), color='b', log=True) for a constant c appropriate for the plot. Alignment can be anything.
I know it is a very old question and you might have solved it but I've come to this post because I was with something like this but at the y axis and I manage to solve it just using ax.set_ylim(df['my data'].min()+100, df['my data'].max()+100). In y axis I have some sensible information which I thouhg the best way was to show in log scale but when I set log scale I couldn't see the numbers proper (as this post in x axis) so I just leave the idea of use log and use the min and max argment. It sets the scale of my graph much like as log. Still looking for another way for doesnt need use that -+100 at set_ylim.
While this does not actually use pyplot.bar, I think this method could be helpful in achieving what the OP is trying to do. I found this to be easier than trying to calibrate the width as a function of the log-scale, though it's more steps. Create a line collection whose width is independent of the chart scale.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.collections as coll
#Generate data and sort into bins
a = np.random.logseries(0.5, 1000)
hist, bin_edges = np.histogram(a, bins=20, density=False)
x = bin_edges[:-1] # remove the top-end from bin_edges to match dimensions of hist
lines = []
for i in range(len(x)):
pair=[(x[i],0), (x[i], hist[i])]
lines.append(pair)
linecoll = coll.LineCollection(lines, linewidths=10, linestyles='solid')
fig, ax = plt.subplots()
ax.add_collection(linecoll)
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlim(min(x)/10,max(x)*10)
ax.set_ylim(0.1,1.1*max(hist)) #since this is an unweighted histogram, the logy doesn't make much sense.
Resulting plot - no frills
One drawback is that the "bars" will be centered, but this could be changed by offsetting the x-values by half of the linewidth value ... I think it would be
x_new = x + (linewidth/2)*10**round(np.log10(x),0).
I'm trying to make plots that are formatted the same way despite coming from different datasets and I'm running into issues with getting consistent text positions and appropriate axis limits because the datasets are not scaled exactly the same. For example - say I generate the following elevation profile:
import matplotlib.pyplot as plt
import numpy as np
Distance=np.array([1000,3000,7000,15000,20000])
Elevation=np.array([100,200,350,800,400])
def MyPlot(X,Y):
fig = plt.figure()
ax = fig.add_subplot(111, aspect='equal')
ax.plot(X,Y)
fig.set_size_inches(fig.get_size_inches()*2)
ax.set_ylim(min(Y)-50, max(Y)+500)
ax.set_xlim(min(X)-50, max(X)+50)
MaxPoint=X[np.argmax(Y)], max(Y)
ax.scatter(MaxPoint[0], MaxPoint[1], s=10)
ax.text(MaxPoint[0], MaxPoint[1]+100, s='Maximum = '+str(MaxPoint[1]), fontsize=8)
MyPlot(Distance,Elevation)
And then I have another dataset that's scaled differently:
Distance2=Distance*4
Elevation2=Elevation*5
MyPlot(Distance2,Elevation2)][2]][2]
Because of the fact that a unit change is relatively much larger in the first dataset than the second dataset, the text and axis labels do not get formatted as I'd like in the 2nd plot. Is there a way to adjust text position and axis limits that adjusts to the relative scale of the dataset?
First off, for placing text with an offset such as this, you almost never want to use text. Instead, use annotate. The advantage is that you can give an offset of the text in points instead of data units.
Next, to reduce the density of tick locations, use ax.locator_params and change the nbins parameter. nbins controls the tick density. Tick locations will still be automatically chosen, but reducing nbins will reduce the maximum number of tick locations. If you do lower nbins, you may want to also change the numbers that matplotlib considers "even" when picking tick intervals. That way, you have more options to get the expected number of ticks.
Finally, to avoid manually setting limits with a set padding, consider using margins(some_percentage) to pad the extents by a percentage of the current limits.
To show a complete example of all:
import matplotlib.pyplot as plt
import numpy as np
distance=np.array([1000,3000,7000,15000,20000])
elevation=np.array([100,200,350,800,400])
def plot(x, y):
fig, ax = plt.subplots(figsize=(8, 2))
# Plot your data and place a marker at the peak location
maxpoint=x[np.argmax(y)], max(y)
ax.scatter(maxpoint[0], maxpoint[1], s=10)
ax.plot(x, y)
# Reduce the maximum number of ticks and give matplotlib more flexibility
# in the tick intervals it can choose.
# Essentially, this will more or less always have two ticks on the y-axis
# and 4 on the x-axis
ax.locator_params(axis='y', nbins=3, steps=range(1, 11))
ax.locator_params(axis='x', nbins=5, steps=range(1, 11))
# Annotate the peak location. The text will always be 5 points from the
# data location.
ax.annotate('Maximum = {:0.0f}'.format(maxpoint[1]), size=8,
xy=maxpoint, xytext=(5, 5), textcoords='offset points')
# Give ourselves lots of padding on the y-axis, less on the x
ax.margins(x=0.01, y=0.3)
ax.set_ylim(bottom=y.min())
# Set the aspect of the plot to be equal and add some x/y labels
ax.set(xlabel='Distance', ylabel='Elevation', aspect=1)
plt.show()
plot(distance,elevation)
And if we change the data:
plot(distance * 4, elevation * 5)
Finally, you might consider placing the annotation just above the top of the axis, instead of offset from the point:
ax.annotate('Maximum = {:0.0f}'.format(maxpoint[1]), ha='center',
size=8, xy=(maxpoint[0], 1), xytext=(0, 5),
textcoords='offset points',
xycoords=('data', 'axes fraction'))
May be you should use seaborn where no any borders. I think it's very good way.
It will be look like this:
you should write string import seaborn at the beginning of the script.
Changing the vertical distance between two subplot using tight_layout(h_pad=-1) changes the total figuresize. How can I define the figuresize using tight_layout?
Here is the code:
#define figure
pl.figure(figsize=(10, 6.25))
ax1=subplot(211)
img=pl.imshow(np.random.random((10,50)), interpolation='none')
ax1.set_xticklabels(()) #hides the tickslabels of the first plot
subplot(212)
x=linspace(0,50)
pl.plot(x,x,'k-')
xlim( ax1.get_xlim() ) #same x-axis for both plots
And here is the results:
If I write
pl.tight_layout(h_pad=-2)
in the last line, then I get this:
As you can see, the figure is bigger...
You can use a GridSpec object to control precisely width and height ratios, as answered on this thread and documented here.
Experimenting with your code, I could produce something like what you want, by using a height_ratio that assigns twice the space to the upper subplot, and increasing the h_pad parameter to the tight_layout call. This does not sound completely right, but maybe you can adjust this further ...
import numpy as np
from matplotlib.pyplot import *
import matplotlib.pyplot as pl
import matplotlib.gridspec as gridspec
#define figure
fig = pl.figure(figsize=(10, 6.25))
gs = gridspec.GridSpec(2, 1, height_ratios=[2,1])
ax1=subplot(gs[0])
img=pl.imshow(np.random.random((10,50)), interpolation='none')
ax1.set_xticklabels(()) #hides the tickslabels of the first plot
ax2=subplot(gs[1])
x=np.linspace(0,50)
ax2.plot(x,x,'k-')
xlim( ax1.get_xlim() ) #same x-axis for both plots
fig.tight_layout(h_pad=-5)
show()
There were other issues, like correcting the imports, adding numpy, and plotting to ax2 instead of directly with pl. The output I see is this:
This case is peculiar because of the fact that the default aspect ratios of images and plots are not the same. So it is worth noting for people looking to remove the spaces in a grid of subplots consisting of images only or of plots only that you may find an appropriate solution among the answers to this question (and those linked to it): How to remove the space between subplots in matplotlib.pyplot?.
The aspect ratios of the subplots in this particular example are as follows:
# Default aspect ratio of images:
ax1.get_aspect()
# 1.0
# Which is as it is expected based on the default settings in rcParams file:
matplotlib.rcParams['image.aspect']
# 'equal'
# Default aspect ratio of plots:
ax2.get_aspect()
# 'auto'
The size of ax1 and the space beneath it are adjusted automatically based on the number of pixels along the x-axis (i.e. width) so as to preserve the 'equal' aspect ratio while fitting both subplots within the figure. As you mentioned, using fig.tight_layout(h_pad=xxx) or the similar fig.set_constrained_layout_pads(hspace=xxx) is not a good option as this makes the figure larger.
To remove the gap while preserving the original figure size, you can use fig.subplots_adjust(hspace=xxx) or the equivalent plt.subplots(gridspec_kw=dict(hspace=xxx)), as shown in the following example:
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
np.random.seed(1)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6.25),
gridspec_kw=dict(hspace=-0.206))
# For those not using plt.subplots, you can use this instead:
# fig.subplots_adjust(hspace=-0.206)
size = 50
ax1.imshow(np.random.random((10, size)))
ax1.xaxis.set_visible(False)
# Create plot of a line that is aligned with the image above
x = np.arange(0, size)
ax2.plot(x, x, 'k-')
ax2.set_xlim(ax1.get_xlim())
plt.show()
I am not aware of any way to define the appropriate hspace automatically so that the gap can be removed for any image width. As stated in the docstring for fig.subplots_adjust(), it corresponds to the height of the padding between subplots, as a fraction of the average axes height. So I attempted to compute hspace by dividing the gap between the subplots by the average height of both subplots like this:
# Extract axes positions in figure coordinates
ax1_x0, ax1_y0, ax1_x1, ax1_y1 = np.ravel(ax1.get_position())
ax2_x0, ax2_y0, ax2_x1, ax2_y1 = np.ravel(ax2.get_position())
# Compute negative hspace to close the vertical gap between subplots
ax1_h = ax1_y1-ax1_y0
ax2_h = ax2_y1-ax2_y0
avg_h = (ax1_h+ax2_h)/2
gap = ax1_y0-ax2_y1
hspace=-(gap/avg_h) # this divided by 2 also does not work
fig.subplots_adjust(hspace=hspace)
Unfortunately, this does not work. Maybe someone else has a solution for this.
It is also worth mentioning that I tried removing the gap between subplots by editing the y positions like in this example:
# Extract axes positions in figure coordinates
ax1_x0, ax1_y0, ax1_x1, ax1_y1 = np.ravel(ax1.get_position())
ax2_x0, ax2_y0, ax2_x1, ax2_y1 = np.ravel(ax2.get_position())
# Set new y positions: shift ax1 down over gap
gap = ax1_y0-ax2_y1
ax1.set_position([ax1_x0, ax1_y0-gap, ax1_x1, ax1_y1-gap])
ax2.set_position([ax2_x0, ax2_y0, ax2_x1, ax2_y1])
Unfortunately, this (and variations of this) produces seemingly unpredictable results, including a figure resizing similar to when using fig.tight_layout(). Maybe someone else has an explanation for what is happening here behind the scenes.