Preventing plot joining when values "wrap" in matplotlib plots - python

I'm plotting right ascension ephemerides for planets, which have the property that they are cyclical: they hit a maximum value, 24, and then start again at 0. When I plot these using matplotlib, the "jump" from 24 to zero is joined so that I get horizontal lines running across my figure:
How can I eliminate these lines? Is there an approach in matplotlib, or perhaps a way to split the lists at the points where the jump occurs?
Code to generate above figure:
from __future__ import division
import ephem
import matplotlib
import matplotlib.pyplot
import math
fig, ax = matplotlib.pyplot.subplots()
ax.set(xlim=[0, 24])
ax.set(ylim=[min(date_range), max(date_range)])
ax.plot([12*ep.ra/math.pi for ep in [ephem.Jupiter(base_date + d) for d in date_range]], date_range,
ls='-', color='g', lw=2)
ax.plot([12*ep.ra/math.pi for ep in [ephem.Venus(base_date + d) for d in date_range]], date_range,
ls='-', color='r', lw=1)
ax.plot([12*ep.ra/math.pi for ep in [ephem.Sun(base_date + d) for d in date_range]], date_range,
ls='-', color='y', lw=3)

Here is a generator function that finds the contiguous regions of 'wrapped' data:
import numpy as np
def unlink_wrap(dat, lims=[-np.pi, np.pi], thresh = 0.95):
    """
    Iterate over contiguous regions of `dat` (i.e. where it does not
    jump from near one limit to the other).
    This function returns an iterator object that yields slice
    objects, which index the contiguous portions of `dat`.
    This function implicitly assumes that all points in `dat` fall
    within `lims`.
    """
    # Indices where consecutive values differ by almost the full range
    jump = np.nonzero(np.abs(np.diff(dat)) > ((lims[1] - lims[0]) * thresh))[0]
    lasti = 0
    for ind in jump:
        yield slice(lasti, ind + 1)
        lasti = ind + 1
    yield slice(lasti, len(dat))
An example usage would be,
import matplotlib.pyplot
x = np.arange(0, 100, .1)
y = x.copy()
lims = [0, 24]
x = (x % lims[1])
fig, ax = matplotlib.pyplot.subplots()
# Plot each contiguous region separately so the wraps are not joined
for slc in unlink_wrap(x, lims):
    ax.plot(x[slc], y[slc], 'b-', linewidth=2)
ax.plot(x, y, 'r-', zorder=-10)
ax.set_xlim(lims)
Which gives the figure below. Note that the blue lines (which utilize unlink_wrap) are broken and the standard-plotted red lines are shown for reference.
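An alternative to plotting each slice separately is to insert NaN at every wrap, since matplotlib leaves a gap in a line wherever a coordinate is NaN. The sketch below is not part of the answer above: `break_at_wraps` is a hypothetical helper name, and it reuses the same jump-threshold idea as `unlink_wrap`.

```python
import numpy as np

def break_at_wraps(x, y, lims=(0, 24), thresh=0.95):
    """Return copies of x and y with NaN inserted wherever x wraps."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    span = lims[1] - lims[0]
    # Index i marks a wrap between x[i] and x[i + 1].
    jump = np.nonzero(np.abs(np.diff(x)) > span * thresh)[0]
    # np.insert places the new value before the given index.
    return np.insert(x, jump + 1, np.nan), np.insert(y, jump + 1, np.nan)

x = np.arange(0, 100, 0.1) % 24
y = np.arange(0, 100, 0.1)
xb, yb = break_at_wraps(x, y)
# A single call such as ax.plot(xb, yb) then draws one line with gaps.
```

This keeps everything in one line object, which can be convenient for legends and styling, at the cost of modifying the data arrays.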

Related

How to set limits around (on both sides of) 0, in a polar Matplotlib plot (wedge diagram)

I am making a wedge diagram (plotting quasars in space, with RA as theta and Dec as r). I need to set the limits of a polar plot on both sides of 0. My limits should go from 45 degrees to 315 degrees with 0 degrees in between those two values (45-0-315). How do I do this?
This is my code:
import numpy as np
import matplotlib.pyplot as plt
theta = (np.pi/180)*np.array([340.555906,3.592373,32.473440,33.171584,35.463857,44.268397,339.362504,345.211906,346.485567,346.811945,348.672405,349.180736,349.370850,353.098343])
r = np.array([-32.906663,-33.842402,-32.425917,-32.677975, -30.701083,-31.460307,-32.909861,-30.802969,-33.683759,-32.207783,-33.068686,-33.820102,-31.438195,-31.920375])
colors = 'red'
fig = plt.figure()
ax = fig.add_subplot(111, polar=True)
c = ax.scatter(theta, r, c=colors, cmap='hsv', alpha=0.75)
plt.show()
If I put the limits:
ax.set_thetamin(45)
ax.set_thetamax(-45)
I get the correct slice of the diagram, but the wrong values on the theta axis (the axis now goes from -45-45 degrees).
If I put the limits:
ax.set_thetamin(45)
ax.set_thetamax(315)
I get the wrong slice of the diagram, but the correct values on the theta axis.
What to do?
It appears that matplotlib will only make the theta limits span across theta=0 if you have a positive and negative value for thetamin and thetamax. From the docstring for set_thetalim():
Values are wrapped in to the range [0, 2π] (in radians), so for example it is possible to do set_thetalim(-np.pi / 2, np.pi / 2) to have an axes symmetric around 0.
So setting:
ax.set_thetamin(45)
ax.set_thetamax(-45)
is the correct thing to do to get the plot you want. We can then modify the ticks later using a ticker.FuncFormatter to get the tick values you want.
For example:
import matplotlib.ticker as ticker
fmt = lambda x, pos: "{:g}".format(np.degrees(x if x >= 0 else x + 2 * np.pi))
ax.xaxis.set_major_formatter(ticker.FuncFormatter(fmt))
Which yields:
For completeness, here I put it all together in your script:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
theta = (np.pi/180)*np.array([340.555906,3.592373,32.473440,33.171584,35.463857,44.268397,339.362504,345.211906,346.485567,346.811945,348.672405,349.180736,349.370850,353.098343])
r = np.array([-32.906663,-33.842402,-32.425917,-32.677975, -30.701083,-31.460307,-32.909861,-30.802969,-33.683759,-32.207783,-33.068686,-33.820102,-31.438195,-31.920375])
colors = 'red'
fig = plt.figure()
ax = fig.add_subplot(111, polar=True)
c = ax.scatter(theta, r, c=colors, cmap='hsv', alpha=0.75)
ax.set_thetamin(45)
ax.set_thetamax(-45)
fmt = lambda x, pos: "{:g}".format(np.degrees(x if x >= 0 else x + 2 * np.pi))
ax.xaxis.set_major_formatter(ticker.FuncFormatter(fmt))
plt.show()
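The lambda above only shifts negative angles by one full turn. A slightly more general formatter can wrap any angle with a modulo instead. This is a minimal sketch, not part of the answer; `wrapped_degree_label` is a hypothetical name, and the rounding to six decimals is an assumption made to avoid labels such as 360 caused by floating-point noise.

```python
import math

def wrapped_degree_label(theta, pos=None):
    """Format an angle in radians as degrees wrapped into [0, 360)."""
    degrees = round(math.degrees(theta), 6) % 360
    return "{:g}".format(degrees)

print(wrapped_degree_label(-math.pi / 4))
print(wrapped_degree_label(math.pi / 4))
```

It has the `(x, pos)` signature that `ticker.FuncFormatter` expects, so it could be passed as `ticker.FuncFormatter(wrapped_degree_label)` in place of `fmt`.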

Creating a modulo/folded plot in Python

I am trying to "fold" an exponential plot (and a fit to it - see the first image below) around a discrete interval on the x-axis (a.k.a a "modulo plot"). The aim is that after 10 x-units the exponential is continued on the same plot from 0 for the 10 to 20 interval, as shown on a second "photoshopped" image below.
The MWE code is below:
import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt
# Generate points
x = np.arange(20)
y = np.exp(-x/10)
# Fit to data
def fit_func(x, t):
    return np.exp(-x/t)
par, pcov = optimize.curve_fit(f=fit_func, xdata=x, ydata=y)
# Plot data and fit function
fig, ax = plt.subplots()
ax.plot(x, y, c='g', label="Data")
ax.plot(x, fit_func(x, par), c='r', linestyle=":", label="Fit")
ax.set_xlabel("x (modulo 10)")
ax.legend()
plt.savefig("fig/mod.png", dpi=300)
What I have: Original exponential from 0 to 20
What I want: Modulo/folded exponential in intervals of 10
You could try to simply write:
f = fit_func(x, par)
ax.plot(x % 10, y, c='g', label="Data")
ax.plot(x % 10, f, c='r', linestyle=":", label="Fit")
but then you get confusing lines connecting the last point of one section to the first point of the next.
Another idea is to create a loop to plot every part separately. To avoid multiple legend entries, only the first section sets a legend label.
import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt
x = np.arange(40)
y = np.exp(-x/10)
def fit_func(x, t):
    return np.exp(-x/t)
par, pcov = optimize.curve_fit(f=fit_func, xdata=x, ydata=y)
f = fit_func(x, par)
fig, ax = plt.subplots()
left = x.min()
section = 1
while left < x.max():
    right = left + 10
    # Both endpoints are included, so adjacent sections share a boundary point
    mask = (x >= left) & (x <= right)
    ax.plot(x[mask] - left, y[mask], c='g', label="Data" if section == 1 else '')
    ax.plot(x[mask] - left, f[mask], c='r', linestyle=":", label="Fit" if section == 1 else '')
    left = right
    section += 1
ax.set_xlabel("x (modulo 10)")
ax.legend()
#plt.savefig("fig/mod.png", dpi=300)
plt.show()
Assuming that x is a sorted array, we'll have:
>>> y_ = fit_func(x, par)
>>> temp_x = []
>>> temp_y = []
>>> temp_y_ = []
>>> fig, ax = plt.subplots()
>>> for i in range(len(x)):
...     # A multiple of 10 (or the last point) closes the current segment;
...     # note that this point itself is not added to any segment.
...     if x[i]%10==0 or i == len(x)-1:
...         ax.plot(temp_x,temp_y, c='g', label="Data")
...         ax.plot(temp_x,temp_y_, c='r', linestyle=":", label="Fit")
...         temp_x,temp_y,temp_y_ = [],[],[]
...     else:
...         temp_x.append(x[i]%10)
...         temp_y.append(y[i])
...         temp_y_.append(y_[i])
>>> plt.show()
and this would be the resulting plot :
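The same folding can also be done without an explicit Python loop over points. The following is a rough sketch, assuming x is sorted; `fold_segments` is a hypothetical helper, not something from the answers above. It finds where the folded x value drops and splits the arrays there with `np.split`.

```python
import numpy as np

def fold_segments(x, y, period):
    """Split sorted (x, y) data into one piece per period, with x folded into [0, period)."""
    x = np.asarray(x)
    y = np.asarray(y)
    folded = x % period
    # A new segment starts wherever the folded x value drops.
    starts = np.nonzero(np.diff(folded) < 0)[0] + 1
    return list(zip(np.split(folded, starts), np.split(y, starts)))

x = np.arange(40)
y = np.exp(-x / 10)
segments = fold_segments(x, y, period=10)
# Each (xs, ys) pair can then be drawn with ax.plot(xs, ys) in a loop.
```

Unlike the `while` loop version, the segments here do not share their boundary points, so each piece ends at the last sample before the fold.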

How to avoid overlapping error bars in matplotlib?

I want to create a plot for two different datasets similar to the one presented in this answer:
In the above image, the author managed to fix the overlapping problem of the error bars by adding some small random scatter in x to the new dataset.
In my problem, I must plot a similar graphic, but having some categorical data in the x axis:
Any ideas on how to slightly shift the error bars of the second dataset when the x axis is categorical? I want to avoid overlap between the bars to make the plot easier to read.
You can translate each errorbar by adding the default data transform to a prior translation in data space. This is possible when knowing that categories are in general one data unit away from each other.
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
from matplotlib.transforms import Affine2D
x = list("ABCDEF")
y1, y2 = np.random.randn(2, len(x))
yerr1, yerr2 = np.random.rand(2, len(x))*4+0.3
fig, ax = plt.subplots()
trans1 = Affine2D().translate(-0.1, 0.0) + ax.transData
trans2 = Affine2D().translate(+0.1, 0.0) + ax.transData
er1 = ax.errorbar(x, y1, yerr=yerr1, marker="o", linestyle="none", transform=trans1)
er2 = ax.errorbar(x, y2, yerr=yerr2, marker="o", linestyle="none", transform=trans2)
plt.show()
Alternatively, you could translate the errorbars after applying the data transform and hence move them in units of points.
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
from matplotlib.transforms import ScaledTranslation
x = list("ABCDEF")
y1, y2 = np.random.randn(2, len(x))
yerr1, yerr2 = np.random.rand(2, len(x))*4+0.3
fig, ax = plt.subplots()
trans1 = ax.transData + ScaledTranslation(-5/72, 0, fig.dpi_scale_trans)
trans2 = ax.transData + ScaledTranslation(+5/72, 0, fig.dpi_scale_trans)
er1 = ax.errorbar(x, y1, yerr=yerr1, marker="o", linestyle="none", transform=trans1)
er2 = ax.errorbar(x, y2, yerr=yerr2, marker="o", linestyle="none", transform=trans2)
plt.show()
While results look similar in both cases, they are fundamentally different. You will observe this difference when interactively zooming the axes or changing the figure size.
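With more than two datasets, the offsets can be computed rather than written by hand. This is a small sketch under the assumption that the series should be spread evenly around each category; `dodge_offsets` and its `total_width` parameter are hypothetical names, not matplotlib API.

```python
def dodge_offsets(n_series, total_width=0.2):
    """Return n_series x-offsets centred on zero and spanning total_width."""
    if n_series < 1:
        raise ValueError("n_series must be at least 1")
    if n_series == 1:
        return [0.0]
    step = total_width / (n_series - 1)
    return [(i - (n_series - 1) / 2) * step for i in range(n_series)]

offsets = dodge_offsets(3, total_width=0.2)
# Each offset dx could then feed a transform such as
#   Affine2D().translate(dx, 0.0) + ax.transData
```

The offsets are in data units here, so they match the first (Affine2D) variant above; for the point-based variant they would need to be expressed in inches instead.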
Consider the following approach to highlight plots - combination of errorbar and fill_between with non-zero transparency:
import random
import matplotlib.pyplot as plt
# create sample data
N = 8
data_1 = {
'x': list(range(N)),
'y': [10. + random.random() for dummy in range(N)],
'yerr': [.25 + random.random() for dummy in range(N)]}
data_2 = {
'x': list(range(N)),
'y': [10.25 + .5 * random.random() for dummy in range(N)],
'yerr': [.5 * random.random() for dummy in range(N)]}
# plot
plt.figure()
# only errorbar
plt.subplot(211)
for data in [data_1, data_2]:
    plt.errorbar(**data, fmt='o')
# errorbar + fill_between
plt.subplot(212)
for data in [data_1, data_2]:
    plt.errorbar(**data, alpha=.75, fmt=':', capsize=3, capthick=1)
    # lower and upper bounds of the shaded band
    data = {
        'x': data['x'],
        'y1': [y - e for y, e in zip(data['y'], data['yerr'])],
        'y2': [y + e for y, e in zip(data['y'], data['yerr'])]}
    plt.fill_between(**data, alpha=.25)
Result:
There is an example on the matplotlib site: https://matplotlib.org/stable/gallery/lines_bars_and_markers/errorbar_subsample.html
You need the parameter errorevery=(m, n), where n is how often error bars are drawn (on every n-th point) and m is the starting offset, in the range 0 to n.
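As far as the matplotlib documentation describes it, errorevery=(start, N) draws error bars on the points x[start::N]. The sketch below uses only NumPy slicing to show which points that selects for two staggered series; the array size and the offsets 0 and 2 are illustrative assumptions.

```python
import numpy as np

x = np.arange(10)
n = 4  # draw an error bar on every 4th point

# Stagger two series: the first starts at index 0, the second at index 2.
bars_series_1 = x[0::n]
bars_series_2 = x[2::n]
print(bars_series_1.tolist(), bars_series_2.tolist())

# The corresponding calls would be, for example:
#   ax.errorbar(x, y1, yerr=e1, errorevery=(0, n))
#   ax.errorbar(x, y2, yerr=e2, errorevery=(2, n))
```

Because the two selections share no indices, the error bars of the two series never sit on the same x position, although some points are left without bars.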

Extending a line segment in matplotlib

Is there a function in matplotlib similar to MATLAB's line extensions?
I am basically looking for a way to extend a line segment to a plot. My current plot looks like this.
After looking at another question and applying the formula, I was able to get it to here, but it still looks messy.
Does anyone have the magic formula here?
Have a go at writing your own, as I don't think this exists in matplotlib. This is a start; you could improve it by adding support for semi-infinite lines, etc.
import matplotlib.pyplot as plt
import numpy as np
def extended(ax, x, y, **args):
    # Remember the current limits so the extended line does not change them
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    x_ext = np.linspace(xlim[0], xlim[1], 100)
    # Fit a straight line through the segment and evaluate it across the axes
    p = np.polyfit(x, y, deg=1)
    y_ext = np.poly1d(p)(x_ext)
    ax.plot(x_ext, y_ext, **args)
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    return ax
ax = plt.subplot(111)
ax.scatter(np.linspace(0, 1, 100), np.random.random(100))
x_short = np.linspace(0.2, 0.7)
y_short = 0.2* x_short
ax = extended(ax, x_short, y_short, color="r", lw=2, label="extended")
ax.plot(x_short, y_short, color="g", lw=4, label="short")
ax.legend()
plt.show()
I just realised you have some red dots on your plots; are those important? Anyway, the main point I think your solution so far is missing is to reset the plot limits to those that existed before; otherwise, as you have found, they get extended.
New in matplotlib 3.3
There is now an axline method to easily extend arbitrary lines:
Adds an infinitely long straight line. The line can be defined either by two points xy1 and xy2
plt.axline(xy1=(0, 1), xy2=(1, 0.5), color='r')
or defined by one point xy1 and a slope.
plt.axline(xy1=(0, 1), slope=-0.5, color='r')
Sample data for reference:
import numpy as np
import matplotlib.pyplot as plt
x, y = np.random.default_rng(123).random((2, 100)) * 2 - 1
m, b = -0.5, 1
plt.scatter(x, y, c=np.where(y > m*x + b, 'r', 'k'))
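For matplotlib versions older than 3.3, the endpoints of the extended line can be computed directly from two points on the segment, without a polynomial fit. This is a minimal sketch; `extend_segment` is a hypothetical helper and it deliberately rejects vertical segments.

```python
def extend_segment(p1, p2, xlim):
    """Return the two endpoints of the line through p1 and p2 across xlim."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical segment: cannot extend along x")
    slope = (y2 - y1) / (x2 - x1)
    x_lo, x_hi = xlim
    return (x_lo, y1 + slope * (x_lo - x1)), (x_hi, y1 + slope * (x_hi - x1))

start, end = extend_segment((0, 1), (1, 0.5), xlim=(-2, 2))
# The result could be drawn with
#   ax.plot([start[0], end[0]], [start[1], end[1]])
# followed by restoring the previous axis limits.
```

Unlike `axline`, a line drawn this way does not follow later changes to the axis limits, so it has to be recomputed if the view changes.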

How to space overlapping annotations

I want to annotate the bars in a graph with some text, but if the bars are close together and have comparable height, the annotations end up on top of each other and are hard to read (the coordinates for the annotations were taken from the bar position and height).
Is there a way to shift one of them if there is a collision?
Edit: The bars are very thin and very close sometimes so just aligning vertically doesn't solve the problem...
A picture might clarify things:
I've written a quick solution, which checks each annotation position against default bounding boxes for all the other annotations. If there is a collision it changes its position to the next available collision free place. It also puts in nice arrows.
For a fairly extreme example, it will produce this (none of the numbers overlap):
Instead of this:
Here is the code:
import numpy as np
import matplotlib.pyplot as plt
from numpy.random import *
def get_text_positions(x_data, y_data, txt_width, txt_height):
    # list() is needed on Python 3, where zip returns an iterator
    a = list(zip(y_data, x_data))
    text_positions = y_data.copy()
    for index, (y, x) in enumerate(a):
        # labels close enough in x and not far enough below to be ignored
        local_text_positions = [i for i in a if i[0] > (y - txt_height)
                                and (abs(i[1] - x) < txt_width * 2) and i != (y,x)]
        if local_text_positions:
            sorted_ltp = sorted(local_text_positions)
            if abs(sorted_ltp[0][0] - y) < txt_height: #True == collision
                differ = np.diff(sorted_ltp, axis=0)
                a[index] = (sorted_ltp[-1][0] + txt_height, a[index][1])
                text_positions[index] = sorted_ltp[-1][0] + txt_height
                for k, (j, m) in enumerate(differ):
                    #j is the vertical distance between words
                    if j > txt_height * 2: #if True then room to fit a word in
                        a[index] = (sorted_ltp[k][0] + txt_height, a[index][1])
                        text_positions[index] = sorted_ltp[k][0] + txt_height
                        break
    return text_positions
def text_plotter(x_data, y_data, text_positions, axis,txt_width,txt_height):
    for x,y,t in zip(x_data, y_data, text_positions):
        axis.text(x - txt_width, 1.01*t, '%d'%int(y),rotation=0, color='blue')
        if y != t:
            # the label was moved, so point back at its bar with an arrow
            axis.arrow(x, t,0,y-t, color='red',alpha=0.3, width=txt_width*0.1,
                       head_width=txt_width, head_length=txt_height*0.5,
                       zorder=0,length_includes_head=True)
Here is the code producing these plots, showing the usage:
#random test data:
x_data = random_sample(100)
#random_integers is deprecated; randint excludes the upper bound, hence 51
y_data = randint(10, 51, 100)
#GOOD PLOT:
fig2 = plt.figure()
ax2 = fig2.add_subplot(111)
ax2.bar(x_data, y_data,width=0.00001)
#set the bbox for the text. Increase txt_width for wider text.
txt_height = 0.04*(plt.ylim()[1] - plt.ylim()[0])
txt_width = 0.02*(plt.xlim()[1] - plt.xlim()[0])
#Get the corrected text positions, then write the text.
text_positions = get_text_positions(x_data, y_data, txt_width, txt_height)
text_plotter(x_data, y_data, text_positions, ax2, txt_width, txt_height)
plt.ylim(0,max(text_positions)+2*txt_height)
plt.xlim(-0.1,1.1)
#BAD PLOT:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.bar(x_data, y_data, width=0.0001)
#write the text:
for x,y in zip(x_data, y_data):
    ax.text(x - txt_width, 1.01*y, '%d'%int(y),rotation=0)
plt.ylim(0,max(text_positions)+2*txt_height)
plt.xlim(-0.1,1.1)
plt.show()
Another option is to use my library adjustText, written specially for this purpose (https://github.com/Phlya/adjustText). I think it's probably significantly slower than the accepted answer (it slows down considerably with a lot of bars), but much more general and configurable.
import numpy as np
import matplotlib.pyplot as plt
from adjustText import adjust_text
np.random.seed(2017)
x_data = np.random.random_sample(100)
# random_integers is deprecated; randint excludes the upper bound, hence 51
y_data = np.random.randint(10, 51, 100)
f, ax = plt.subplots(dpi=300)
bars = ax.bar(x_data, y_data, width=0.001, facecolor='k')
texts = []
for x, y in zip(x_data, y_data):
    texts.append(plt.text(x, y, y, horizontalalignment='center', color='b'))
adjust_text(texts, add_objects=bars, autoalign='y', expand_objects=(0.1, 1),
only_move={'points':'', 'text':'y', 'objects':'y'}, force_text=0.75, force_objects=0.1,
arrowprops=dict(arrowstyle="simple, head_width=0.25, tail_width=0.05", color='r', lw=0.5, alpha=0.5))
plt.show()
If we allow autoalignment along x axis, it gets even better (I just need to resolve a small issue that it doesn't like putting labels above the points and not a bit to the side...).
np.random.seed(2017)
x_data = np.random.random_sample(100)
# random_integers is deprecated; randint excludes the upper bound, hence 51
y_data = np.random.randint(10, 51, 100)
f, ax = plt.subplots(dpi=300)
bars = ax.bar(x_data, y_data, width=0.001, facecolor='k')
texts = []
for x, y in zip(x_data, y_data):
    texts.append(plt.text(x, y, y, horizontalalignment='center', size=7, color='b'))
adjust_text(texts, add_objects=bars, autoalign='xy', expand_objects=(0.1, 1),
only_move={'points':'', 'text':'y', 'objects':'y'}, force_text=0.75, force_objects=0.1,
arrowprops=dict(arrowstyle="simple, head_width=0.25, tail_width=0.05", color='r', lw=0.5, alpha=0.5))
plt.show()
(I had to adjust some parameters here, of course)
One option is to rotate the text/annotation, which is set by the rotation keyword/property. In the following example, I rotate the text 90 degrees to guarantee that it won't collide with the neighboring text. I also set the va (short for verticalalignment) keyword, so that the text is presented above the bar (above the point that I use to define the text):
import matplotlib.pyplot as plt
data = [10, 8, 8, 5]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.bar(range(4),data)
ax.set_ylim(0,12)
# extra .4 is because it's half the default width (.8):
ax.text(1.4,8,"2nd bar",rotation=90,va='bottom')
ax.text(2.4,8,"3rd bar",rotation=90,va='bottom')
plt.show()
The result is the following figure:
Determining programmatically if there are collisions between various annotations is a trickier process. This might be worth a separate question: Matplotlib text dimensions.
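The core of such a collision check is an overlap test between axis-aligned rectangles. Below is a minimal sketch in plain Python, assuming each annotation's box is available as an (x0, y0, x1, y1) tuple; `boxes_overlap` and `find_collisions` are hypothetical helpers. In matplotlib, such boxes can typically be obtained from a drawn text object via `get_window_extent()`, but that step is left out here.

```python
def boxes_overlap(a, b):
    """Return True if boxes a and b, each given as (x0, y0, x1, y1), intersect."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Two boxes overlap only if they overlap along both axes.
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def find_collisions(boxes):
    """Return index pairs (i, j) with i < j whose boxes overlap."""
    return [(i, j)
            for i in range(len(boxes))
            for j in range(i + 1, len(boxes))
            if boxes_overlap(boxes[i], boxes[j])]

boxes = [(0, 0, 2, 1), (1, 0.5, 3, 2), (5, 5, 6, 6)]
collisions = find_collisions(boxes)
```

The pairwise check is quadratic in the number of labels, which is fine for a few hundred annotations but is one reason dedicated libraries exist for larger plots.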
Just thought I would provide an alternative solution: I created textalloc, a library that makes sure text boxes avoid overlapping both each other and lines when possible, and is fast.
For this example you could use something like this:
import textalloc as ta
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(2017)
x_data = np.random.random_sample(100)
y_data = np.random.randint(10, 51, 100)  # random_integers is deprecated; randint excludes the upper bound
f, ax = plt.subplots(dpi=200)
bars = ax.bar(x_data, y_data, width=0.002, facecolor='k')
ta.allocate_text(f,ax,x_data,y_data,
[str(yy) for yy in list(y_data)],
x_lines=[np.array([xx,xx]) for xx in list(x_data)],
y_lines=[np.array([0,yy]) for yy in list(y_data)],
textsize=8,
margin=0.004,
min_distance=0.005,
linewidth=0.7,
textcolor="b")
plt.show()
This results in this
