Scaling the second axis on a histogram with Matplotlib - python

I want to have a second axis on my histogram showing the percentage corresponding to each bin, as if I had used normed=True. I tried to use twinx, but the scale is not correct.
x = np.random.randn(10000)
plt.hist(x)
ax2 = plt.twinx()
plt.show()
Bonus point if you can make it work with log scaled x :)

plt.hist returns the counts in each bin, the bin edges, and the bar patches. You can use the first two to compute the area under the histogram, and from that the normalized height of each bar. The twinx axis can then be aligned accordingly:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
xs = np.random.randn(10000)
ax1 = plt.subplot(111)
cnt, bins, patches = ax1.hist(xs)
# area under the histogram
area = np.dot(cnt, np.diff(bins))
ax2 = ax1.twinx()
ax2.grid(False)
# align the twinx axis
ax2.set_yticks(ax1.get_yticks() / area)
lb, ub = ax1.get_ylim()
ax2.set_ylim(lb / area, ub / area)
# display the y-axis in percentage
frmt = FuncFormatter(lambda x, pos: '{:>4.1f}%'.format(x*100))
ax2.yaxis.set_major_formatter(frmt)
plt.show()
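As a quick sanity check of the normalization above, the following sketch (an illustration, not part of the original answer) compares cnt / area with the density that NumPy computes itself. It uses np.histogram instead of plt.hist so it runs without a figure, and the seed is an arbitrary assumption. Note that the match relies on equal-width bins: with unequal widths, such as log-spaced bins, bar heights in counts are no longer proportional to the density, so the second axis could only show the fraction of samples per bin (cnt / cnt.sum()) instead.

```python
import numpy as np

rng = np.random.default_rng(0)           # arbitrary seed, for reproducibility
xs = rng.standard_normal(10000)

# Ten equal-width bins, which is also the default number of bins in plt.hist
cnt, edges = np.histogram(xs, bins=10)

# Area under the count histogram: sum of count * bin width
area = np.dot(cnt, np.diff(edges))

# Dividing the counts by the area gives the density-normalized bar heights
manual_density = cnt / area
numpy_density, _ = np.histogram(xs, bins=edges, density=True)
same = np.allclose(manual_density, numpy_density)

# Fraction of samples per bin, which stays meaningful for unequal bin widths
fraction = cnt / cnt.sum()

print(same, fraction.sum())
```

If the check holds, the tick values placed on the twin axis are the same numbers that a density-normalized histogram would show.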

Related

Extending colorbar to include out of range data

I was trying to make a Polar heatmap using the following code.
# Plotting the polar plot
from matplotlib.colorbar import ColorbarBase
from matplotlib.colors import LogNorm
import matplotlib.pyplot as plt
cmap = obspy_sequential
# Have defined the variables to be used for pointing to the coordinates
# baz is angular, slow is radial, abs_power is the value at every co-ordinate
# Choose number of fractions in plot (desirably 360 degree/N is an integer!)
N = 72
N2 = 30
abins = np.arange(N + 1) * 360. / N
sbins = np.linspace(0, 3, N2 + 1)
# Sum rel power in bins given by abins and sbins
hist, baz_edges, sl_edges = \
    np.histogram2d(baz, slow, bins=[abins, sbins], weights=abs_power)
# Transform to radian
baz_edges = np.radians(baz_edges)
# Add polar and colorbar axes
fig = plt.figure(figsize=(8, 8))
cax = fig.add_axes([0.85, 0.2, 0.05, 0.5])
ax = fig.add_axes([0.10, 0.1, 0.70, 0.7], polar=True)
ax.set_theta_direction(-1)
ax.set_theta_zero_location("N")
dh = abs(sl_edges[1] - sl_edges[0])
dw = abs(baz_edges[1] - baz_edges[0])
# Circle through backazimuth
for i, row in enumerate(hist):
    bars = ax.bar((i * dw) * np.ones(N2),
                  height=dh * np.ones(N2),
                  width=dw, bottom=dh * np.arange(N2),
                  color=cmap(row / hist.max()))
ax.set_xticks(np.linspace(0, 2 * np.pi, 10, endpoint=False))
ax.set_yticklabels(velocity)
ax.set_ylim(0, 3)
[i.set_color('white') for i in ax.get_yticklabels()]
ColorbarBase(cax, cmap=cmap,
             norm=LogNorm(vmin=hist.min(), vmax=hist.max()))
plt.show()
I am creating multiple plots like this and thus I need to extend the range of the colorbar beyond the maximum of the abs_power data range.
I tried changing the vmax and vmin to the maximum-minimum target numbers I want, but it plots out the exact same plot every single time. The maximum value on the colorbar keeps changing but the plot does not change. Why is this happening?
Here is how it looks:
Here the actual maximum power is much lower than the maximum specified in the colorbar, yet a bright yellow spot is still visible.
PS: I get the same plot for any vmax and vmin values I provide.
Changing the colorbar has no effect on the main plot. To change the bar plot you need to change the formula used in color=cmap(row / hist.max()). Because every row is divided by hist.max(), the largest bin always maps to 1 and gets the brightest color, whatever vmin and vmax the colorbar uses. A norm is meant for exactly this task: it maps a range of numbers to the interval [0, 1], and every value that maps above 1 (that is, any value above the norm's vmax) is assigned the highest color.
To have the colorbar reflect the correct information, you'd need the same cmap and same norm for both the plot and the colorbar:
my_norm = LogNorm(vmin=hist.min(),vmax=hist.max())
for i, row in enumerate(hist):
    bars = ax.bar((i * dw) * np.ones(N2),
                  height=dh * np.ones(N2),
                  width=dw, bottom=dh * np.arange(N2),
                  color=cmap(my_norm(row)))
and
ColorbarBase(cax, cmap=cmap, norm=my_norm)
On the other hand, if you don't want the yellow color to show up, you could try something like my_norm = LogNorm(vmin=hist.min(), vmax=hist.max()*100) in the code above.
Instead of creating the colorbar via ColorbarBase, it can help to use a standard plt.colorbar(), but with a ScalarMappable that indicates the color map and the norm used. In case of a LogNorm this will show the ticks in log format.
from matplotlib.cm import ScalarMappable
plt.colorbar(ScalarMappable(cmap=cmap, norm=my_norm), ax=ax, cax=cax)
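The role of the norm described above can be checked numerically without drawing anything. This is a minimal sketch with made-up bin sums (the values array is an assumption, not the asker's data): it shows where each value lands in [0, 1] for a norm that ends at the data maximum and for one whose vmax is extended by a factor of 100.

```python
import numpy as np
from matplotlib.colors import LogNorm

values = np.array([1.0, 10.0, 100.0])    # hypothetical bin sums

# Norm spanning exactly the data range: the maximum maps to 1 (brightest color)
tight = LogNorm(vmin=values.min(), vmax=values.max())
# Norm with an extended upper limit: the same maximum now maps to 0.5
wide = LogNorm(vmin=values.min(), vmax=values.max() * 100)

tight_pos = np.asarray(tight(values))    # positions in [0, 1] on a log scale
wide_pos = np.asarray(wide(values))

print(tight_pos)
print(wide_pos)
```

Passing cmap(wide(row)) as the bar color and the same wide norm to the colorbar would keep the plot and the colorbar consistent, and the data maximum would no longer reach the top of the colormap.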

scatterplot and combined polar histogram in matplotlib

I am attempting to produce a plot like this which combines a cartesian scatter plot and a polar histogram. (Radial lines optional)
A similar solution (by Nicolas Legrand) exists for looking at differences in x and y (code here), but we need to look at ratios (i.e. x/y).
More specifically, this is useful when we want to look at the relative risk measure which is the ratio of two probabilities.
The scatter plot on its own is obviously not a problem, but the polar histogram is more advanced.
The most promising lead I have found is this central example from the matplotlib gallery here
I have attempted to do this, but have run up against the limits of my matplotlib skills. Any efforts moving towards this goal would be great.
I'm sure that others will have better suggestions, but one method that gets something like you want (without the need for extra axes artists) is to use a polar projection with a scatter and bar chart together. Something like
import matplotlib.pyplot as plt
import numpy as np
x = np.random.uniform(size=100)
y = np.random.uniform(size=100)
r = np.sqrt(x**2 + y**2)
phi = np.arctan2(y, x)
h, b = np.histogram(phi, bins=np.linspace(0, np.pi/2, 21), density=True)
colors = plt.cm.Spectral(h / h.max())
ax = plt.subplot(111, projection='polar')
ax.scatter(phi, r, marker='.')
ax.bar(b[:-1], h, width=b[1:] - b[:-1],
       align='edge', bottom=np.max(r) + 0.2, color=colors)
# Cut off at 90 degrees
ax.set_thetamax(90)
# Set the r grid to cover the scatter plot
ax.set_rgrids([0, 0.5, 1])
# Let's put a line at 1 assuming we want a ratio of some sort
ax.set_thetagrids([45], [1])
which will give
It is missing axes labels and some beautification, but it might be a place to start. I hope it is helpful.
You can use two axes on top of each other:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(6,6))
ax1 = fig.add_axes([0.1,0.1,.8,.8], label="cartesian")
ax2 = fig.add_axes([0.1,0.1,.8,.8], projection="polar", label="polar")
ax2.set_rorigin(-1)
ax2.set_thetamax(90)
plt.show()
Ok. Thanks to the answer from Nicolas, and the answer from tomjn I have a working solution :)
import numpy as np
import matplotlib.pyplot as plt
# Scatter data
n = 50
x = 0.3 + np.random.randn(n)*0.1
y = 0.4 + np.random.randn(n)*0.02
def radial_corner_plot(x, y, n_hist_bins=51):
    """Scatter plot with radial histogram of x/y ratios"""
    # Axis setup
    fig = plt.figure(figsize=(6,6))
    ax1 = fig.add_axes([0.1,0.1,.6,.6], label="cartesian")
    ax2 = fig.add_axes([0.1,0.1,.8,.8], projection="polar", label="polar")
    ax2.set_rorigin(-20)
    ax2.set_thetamax(90)
    # define useful constant
    offset_in_radians = np.pi/4
    def rotate_hist_axis(ax):
        """rotate so that 0 degrees is pointing up and right"""
        ax.set_theta_offset(offset_in_radians)
        ax.set_thetamin(-45)
        ax.set_thetamax(45)
        return ax
    # Convert scatter data to histogram data
    r = np.sqrt(x**2 + y**2)
    phi = np.arctan2(y, x)
    h, b = np.histogram(phi,
                        bins=np.linspace(0, np.pi/2, n_hist_bins),
                        density=True)
    # SCATTER PLOT -------------------------------------------------------
    ax1.scatter(x,y)
    ax1.set(xlim=[0, 1], ylim=[0, 1], xlabel="x", ylabel="y")
    ax1.spines['right'].set_visible(False)
    ax1.spines['top'].set_visible(False)
    # HISTOGRAM ----------------------------------------------------------
    ax2 = rotate_hist_axis(ax2)
    # rotation of axis requires rotation in bin positions
    b = b - offset_in_radians
    # plot the histogram
    bars = ax2.bar(b[:-1], h, width=b[1:] - b[:-1], align='edge')
    def update_hist_ticks(ax, desired_ratios):
        """Update tick positions and corresponding tick labels"""
        x = np.ones(len(desired_ratios))
        y = 1/desired_ratios
        phi = np.arctan2(y,x) - offset_in_radians
        # define ticklabels
        xticklabels = [str(round(float(label), 2)) for label in desired_ratios]
        # apply updates
        ax.set(xticks=phi, xticklabels=xticklabels)
        return ax
    ax2 = update_hist_ticks(ax2, np.array([1/8, 1/4, 1/2, 1, 2, 4, 8]))
    # just have radial grid lines
    ax2.grid(which="major", axis="y")
    # remove bin count labels
    ax2.set_yticks([])
    return (fig, [ax1, ax2])
fig, ax = radial_corner_plot(x, y)
Thanks for the pointers!
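The tick placement in the solution above relies on the relation between a ratio x/y and the polar angle of the corresponding ray through the origin. A small sketch of that relation, without any plotting (the helper name ratio_to_angle is made up for this illustration):

```python
import numpy as np

def ratio_to_angle(ratio):
    """Polar angle (radians) of the ray through the origin on which x / y equals `ratio`."""
    ratio = np.asarray(ratio, dtype=float)
    # On that ray, y / x = 1 / ratio, so the angle is arctan2(1 / ratio, 1)
    return np.arctan2(1.0 / ratio, 1.0)

ratios = np.array([1/8, 1/4, 1/2, 1, 2, 4, 8])
angles_deg = np.degrees(ratio_to_angle(ratios))

for ratio, angle in zip(ratios, angles_deg):
    print(f"x/y = {ratio:5.3f} -> {angle:6.2f} degrees")
```

A ratio of 1 sits on the 45 degree diagonal, and reciprocal ratios are mirrored around it, which is why the histogram axis is rotated by pi/4 in the solution.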

How to draw the normal distribution of a barplot with log x axis?

I'd like to draw a lognormal distribution of a given bar plot.
Here's the code
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
import numpy as np; np.random.seed(1)
import scipy.stats as stats
import math
inter = 33
x = np.logspace(-2, 1, num=3*inter+1)
yaxis = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.01,0.03,0.3,0.75,1.24,1.72,2.2,3.1,3.9,
4.3,4.9,5.3,5.6,5.87,5.96,6.01,5.83,5.42,4.97,4.60,4.15,3.66,3.07,2.58,2.19,1.90,1.54,1.24,1.08,0.85,0.73,
0.84,0.59,0.55,0.53,0.48,0.35,0.29,0.15,0.15,0.14,0.12,0.14,0.15,0.05,0.05,0.05,0.04,0.03,0.03,0.03, 0.02,
0.02,0.03,0.01,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0,0]
fig, ax = plt.subplots()
ax.bar(x[:-1], yaxis, width=np.diff(x), align="center", ec='k', color='w')
ax.set_xscale('log')
plt.xlabel('Diameter (mm)', fontsize='12')
plt.ylabel('Percentage of Total Particles (%)', fontsize='12')
plt.ylim(0,8)
plt.xlim(0.01, 10)
fig.set_size_inches(12, 12)
plt.savefig("Test.png", dpi=300, bbox_inches='tight')
Resulting plot:
What I'm trying to do is to draw the Probability Density Function exactly like the one shown in red in the graph below:
One idea is to convert everything to log space, with u = log10(x), then draw the density histogram and compute a KDE in that space, so that everything is drawn as y versus u. If u lives on a twin axis at the top, x can stay at the bottom. The two axes are aligned by giving them the same x limits, converted to log space for the top axis. The ticks of the top axis can then be hidden to get the desired result.
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
inter = 33
u = np.linspace(-2, 1, num=3*inter+1)
x = 10**u
us = np.linspace(u[0], u[-1], 500)
yaxis = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.01,0.03,0.3,0.75,1.24,1.72,2.2,3.1,3.9,
4.3,4.9,5.3,5.6,5.87,5.96,6.01,5.83,5.42,4.97,4.60,4.15,3.66,3.07,2.58,2.19,1.90,1.54,1.24,1.08,0.85,0.73,
0.84,0.59,0.55,0.53,0.48,0.35,0.29,0.15,0.15,0.14,0.12,0.14,0.15,0.05,0.05,0.05,0.04,0.03,0.03,0.03, 0.02,
0.02,0.03,0.01,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0,0]
yaxis = np.array(yaxis)
# reconstruct data from the given frequencies
u_data = np.repeat((u[:-1] + u[1:]) / 2, (yaxis * 100).astype(int))
kde = stats.gaussian_kde((u[:-1]+u[1:])/2, weights=yaxis, bw_method=0.2)
total_area = (np.diff(u)*yaxis).sum() # total area of all bars; divide by this area to normalize
fig, ax = plt.subplots()
ax2 = ax.twiny()
ax2.bar(u[:-1], yaxis, width=np.diff(u), align="edge", ec='k', color='w', label='frequencies')
ax2.plot(us, total_area*kde(us), color='crimson', label='kde')
ax2.plot(us, total_area * stats.norm.pdf(us, u_data.mean(), u_data.std()), color='dodgerblue', label='lognormal')
ax2.legend()
ax.set_xscale('log')
ax.set_xlabel('Diameter (mm)', fontsize='12')
ax.set_ylabel('Percentage of Total Particles (%)', fontsize='12')
ax.set_ylim(0,8)
xlim = np.array([0.01,10])
ax.set_xlim(xlim)
ax2.set_xlim(np.log10(xlim))
ax2.set_xticks([]) # hide the ticks at the top
plt.tight_layout()
plt.show()
PS: Apparently this also can be achieved directly without explicitly using u (at the cost of being slightly more cryptic):
x = np.logspace(-2, 1, num=3*inter+1)
xs = np.logspace(-2, 1, 500)
total_area = (np.diff(np.log10(x))*yaxis).sum() # total area of all bars; divide by this area to normalize
kde = stats.gaussian_kde((np.log10(x[:-1])+np.log10(x[1:]))/2, weights=yaxis, bw_method=0.2)
ax.bar(x[:-1], yaxis, width=np.diff(x), align="edge", ec='k', color='w')
ax.plot(xs, total_area*kde(np.log10(xs)), color='crimson')
ax.set_xscale('log')
Note that the bandwidth set for gaussian_kde is a somewhat arbitrary value. Larger values give a smoother curve, smaller values stay closer to the data. Some experimentation can help.
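The answer above rebuilds pseudo-data with np.repeat to get the mean and standard deviation for the normal curve in log space. The following sketch, using a few made-up bins rather than the asker's data, suggests that the same two numbers can be obtained directly from the bin centers weighted by the frequencies. The bin edges and frequencies are assumptions, and np.round is used here instead of the truncation in the original code.

```python
import numpy as np

edges = np.linspace(-2, 1, 7)                      # hypothetical bin edges in u = log10(x)
centers = (edges[:-1] + edges[1:]) / 2
freq = np.array([0.0, 0.5, 2.0, 4.0, 1.5, 0.25])   # hypothetical percentage per bin

# Approach 1: rebuild pseudo-data by repeating each bin center
counts = np.round(freq * 100).astype(int)
u_data = np.repeat(centers, counts)

# Approach 2: weighted mean and (population) standard deviation of the centers
mean_w = np.average(centers, weights=freq)
std_w = np.sqrt(np.average((centers - mean_w) ** 2, weights=freq))

print(u_data.mean(), mean_w)
print(u_data.std(), std_w)
```

Both approaches place all the mass of a bin at its center, so they share the same discretization error; the weighted form simply avoids building the repeated array.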

Expanding axes to fill figure, same scale on x and y

I know how to do two things, but only separately.
figure.tight_layout
will expand my current axes, and
axes.set_aspect('equal')
will keep the same scale on x and y.
But when I use them both I get a square axes view, and I want it to be expanded.
By keeping the same scale I mean that the distance from 0 to 1 is the same on the x and y axes.
Is there any way to make this happen, that is, to keep the same scale and expand to the full figure (not only a square)?
The answer should work with autoscale.
There might be a less clumsy way, but at least you can do it manually. A very simple example:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot([0,1],[1,0])
ax.set_aspect(1)
ax.set_xlim(0, 1.5)
creates
which honours the aspect ratio.
If you want to have the automatic scaling offered by the tight_layout, then you'll have to do some maths of your own:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot([0,1],[1,0])
fig.tight_layout()
# capture the axis positioning in pixels
bb = fig.transFigure.transform(ax.get_position())
x0, y0 = bb[0]
x1, y1 = bb[1]
width = x1 - x0
height = y1 - y0
# set the aspect ratio
ax.set_aspect(1)
# calculate the aspect ratio of the plot
plot_aspect = width / height
# get the axis limits in data coordinates
ax0, ax1 = ax.get_xlim()
ay0, ay1 = ax.get_ylim()
awidth = ax1 - ax0
aheight = ay1 - ay0
# calculate the plot aspect
data_aspect = awidth / aheight
# check which one needs to be corrected
if data_aspect < plot_aspect:
    ax.set_xlim(ax0, ax0 + plot_aspect * aheight)
else:
    ax.set_ylim(ay0, ay0 + awidth / plot_aspect)
Of course, you may set the xlim and ylim any way you want, you might, for example, want to add an equal amount of space to either end of the scale.
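The limit correction above is plain arithmetic, so it can be isolated from Matplotlib and checked on its own. A minimal sketch, where equal_scale_limits is a hypothetical helper name and the pixel sizes are made-up inputs:

```python
def equal_scale_limits(width, height, xlim, ylim):
    """Extend xlim or ylim so one data unit covers the same number of pixels on both axes.

    width, height: size of the axes area in pixels.
    xlim, ylim: current (low, high) data limits.
    """
    x0, x1 = xlim
    y0, y1 = ylim
    plot_aspect = width / height
    data_aspect = (x1 - x0) / (y1 - y0)
    if data_aspect < plot_aspect:
        # axes area is relatively wider than the data range: extend x
        x1 = x0 + plot_aspect * (y1 - y0)
    else:
        # axes area is relatively taller than the data range: extend y
        y1 = y0 + (x1 - x0) / plot_aspect
    return (x0, x1), (y0, y1)

# A 200 x 100 pixel axes area with unit limits on both axes
new_xlim, new_ylim = equal_scale_limits(200, 100, (0, 1), (0, 1))
print(new_xlim, new_ylim)
```

As in the code above, the extra range is added at the upper end only; splitting it between both ends would center the data instead.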
The solution that worked in my case was to call
axis.set_aspect("equal")
axis.set_adjustable("datalim")
stolen from this example in the documentation.

Matplotlib polar plot radial axis offset

I was wondering, is it possible to offset the start of the radial axis or move it outside of the graph?
This is what I'm hoping to achieve:
And this is what I have for now.
I have read the documentation and different topics on SO, but I couldn't find anything helpful. Does that mean that it is not even possible, if it is not mentioned anywhere?
Thank you in advance.
EDIT (added snippet of a code used to create the plot):
ax = fig.add_subplot(111, projection='polar')
ax.set_theta_zero_location('N')
ax.set_theta_direction(-1)
ax.plot(X,lines[li]*yScalingFactor,label=linelabels[li],color=color,linestyle=ls)
To offset the start of the radial axis:
EDIT: As of Matplotlib 2.2.3 there's a new Axes method called set_rorigin which does exactly that. You call it with the theoretical radial coordinate of the origin. So if you call ax.set_ylim(0, 2) and ax.set_rorigin(-1), the radius of the center circle will be a third of the radius of the plot.
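The "a third of the radius" statement follows from simple proportions, sketched here without Matplotlib (the function name inner_radius_fraction is made up for this illustration):

```python
def inner_radius_fraction(rorigin, rmin, rmax):
    """Fraction of the plot radius taken up by the empty center circle.

    rorigin: radial coordinate placed at the center of the plot.
    rmin, rmax: lower and upper radial limits.
    """
    return (rmin - rorigin) / (rmax - rorigin)

# Radial limits (0, 2) with the origin at -1, as in the text above
fraction = inner_radius_fraction(rorigin=-1, rmin=0, rmax=2)
print(fraction)
```

Moving the origin further into negative values makes the center circle larger relative to the plot.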
A quick and dirty workaround for Matplotlib < 2.2.3 is to set the lower radial axis limit to a negative value and hide the inner part of the plot behind a circle:
import numpy as np
import matplotlib.pyplot as plt
CIRCLE_RES = 36 # resolution of circle inside
def offset_radial_axis(ax):
    x_circle = np.linspace(0, 2*np.pi, CIRCLE_RES)
    y_circle = np.zeros_like(x_circle)
    ax.fill(x_circle, y_circle, fc='white', ec='black', zorder=2) # circle
    ax.set_rmin(-1) # needs to be after ax.fill. No idea why.
    ax.set_rticks([tick for tick in ax.get_yticks() if tick >= 0])
    # or set the ticks manually (simple)
    # or define a custom TickLocator (very flexible)
    # or leave out this line if the ticks are fully behind the circle
To add a scale outside the plot:
You can add an extra axes object in the upper half of the other axes and use its yaxis:
X_OFFSET = 0 # to control how far the scale is from the plot (axes coordinates)
def add_scale(ax):
    # add extra axes for the scale
    rect = ax.get_position()
    rect = (rect.xmin-X_OFFSET, rect.ymin+rect.height/2, # x, y
            rect.width, rect.height/2) # width, height
    scale_ax = ax.figure.add_axes(rect)
    # hide most elements of the new axes
    for loc in ['right', 'top', 'bottom']:
        scale_ax.spines[loc].set_visible(False)
    scale_ax.tick_params(bottom=False, labelbottom=False)
    scale_ax.patch.set_visible(False) # hide white background
    # adjust the scale
    scale_ax.spines['left'].set_bounds(*ax.get_ylim())
    # scale_ax.spines['left'].set_bounds(0, ax.get_rmax()) # mpl < 2.2.3
    scale_ax.set_yticks(ax.get_yticks())
    scale_ax.set_ylim(ax.get_rorigin(), ax.get_rmax())
    # scale_ax.set_ylim(ax.get_ylim()) # Matplotlib < 2.2.3
Putting it all together:
(The example is taken from the Matplotlib polar plot demo)
r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r
ax = plt.subplot(111, projection='polar')
ax.plot(theta, r)
ax.grid(True)
ax.set_rorigin(-1)
# offset_radial_axis(ax) # Matplotlib < 2.2.3
add_scale(ax)
ax.set_title("A line plot on an offset polar axis", va='bottom')
plt.show()
I am not sure if the polar plot can be adjusted like that. But here is a work-around, based on the last example given here: Floating Axes.
I have included explanatory comments in the code, if you copy/paste it, it should run as-is:
import mpl_toolkits.axisartist.floating_axes as floating_axes
from matplotlib.projections import PolarAxes
from mpl_toolkits.axisartist.grid_finder import FixedLocator, \
MaxNLocator, DictFormatter
import numpy as np
import matplotlib.pyplot as plt
# generate 100 random data points
# order the theta coordinates
# theta between 0 and 2*pi
theta = np.random.rand(100)*2.*np.pi
theta = np.sort(theta)
# "radius" between 0 and a max value of 40,000
# as roughly in your example
# normalize the r coordinates and offset by 1 (will be clear later)
MAX_R = 40000.
radius = np.random.rand(100)*MAX_R
radius = radius/np.max(radius) + 1.
# initialize figure:
fig = plt.figure()
# set up polar axis
tr = PolarAxes.PolarTransform()
# define angle ticks around the circumference:
angle_ticks = [(0, r"$0$"),
(.25*np.pi, r"$\frac{1}{4}\pi$"),
(.5*np.pi, r"$\frac{1}{2}\pi$"),
(.75*np.pi, r"$\frac{3}{4}\pi$"),
(1.*np.pi, r"$\pi$"),
(1.25*np.pi, r"$\frac{5}{4}\pi$"),
(1.5*np.pi, r"$\frac{3}{2}\pi$"),
(1.75*np.pi, r"$\frac{7}{4}\pi$")]
# set up ticks and spacing around the circle
grid_locator1 = FixedLocator([v for v, s in angle_ticks])
tick_formatter1 = DictFormatter(dict(angle_ticks))
# set up grid spacing along the 'radius'
radius_ticks = [(1., '0.0'),
(1.5, '%i' % (MAX_R/2.)),
(2.0, '%i' % (MAX_R))]
grid_locator2 = FixedLocator([v for v, s in radius_ticks])
tick_formatter2 = DictFormatter(dict(radius_ticks))
# set up axis:
# tr: the polar axis setup
# extremes: theta max, theta min, r max, r min
# the grid for the theta axis
# the grid for the r axis
# the tick formatting for the theta axis
# the tick formatting for the r axis
grid_helper = floating_axes.GridHelperCurveLinear(
    tr,
    extremes=(2.*np.pi, 0, 2, 1),
    grid_locator1=grid_locator1,
    grid_locator2=grid_locator2,
    tick_formatter1=tick_formatter1,
    tick_formatter2=tick_formatter2)
ax1 = floating_axes.FloatingSubplot(fig, 111, grid_helper=grid_helper)
fig.add_subplot(ax1)
# create a parasite axes whose transData in RA, cz
aux_ax = ax1.get_aux_axes(tr)
aux_ax.patch = ax1.patch # for aux_ax to have a clip path as in ax
ax1.patch.zorder=0.9 # but this has a side effect that the patch is
# drawn twice, and possibly over some other
# artists. So, we decrease the zorder a bit to
# prevent this.
# plot your data:
aux_ax.plot(theta, radius)
plt.show()
This will generate the following plot:
You'd have to tweak the axis labels to meet your demands.
I scaled the data because otherwise the same issue as with your plot would have occurred - the inner, empty circle would have been scaled to a dot. You might try the scaling with your polar plot and just put custom labels on the radial axis to achieve a similar effect.
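The scaling and relabeling idea can be sketched independently of the floating axes machinery. The helper names below are made up, and the sketch normalizes by MAX_R rather than by the largest sample (a simplifying assumption, so that the tick labels map back exactly):

```python
import numpy as np

MAX_R = 40000.0   # largest radius expected in the data, as in the answer above

def to_plot_radius(r, offset=1.0):
    """Map a data radius in [0, MAX_R] to a plot radius in [offset, offset + 1]."""
    return np.asarray(r, dtype=float) / MAX_R + offset

def to_data_radius(r_plot, offset=1.0):
    """Inverse mapping, used to label ticks with the original values."""
    return (np.asarray(r_plot, dtype=float) - offset) * MAX_R

tick_positions = np.array([1.0, 1.5, 2.0])
tick_labels = ['%i' % value for value in to_data_radius(tick_positions)]
print(tick_labels)
```

The plotted curve then uses to_plot_radius(radius), while the radial tick labels show the unscaled values.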
