Python boxplot showing means and confidence intervals

How can I create a boxplot like the one below, in Python? I want to depict means and confidence bounds only (rather than proportions of IQRs, as in matplotlib boxplot).
I don't have any version constraints, and if your answer has some package dependency that's OK too. Thanks!

Use errorbar instead. Here is a minimal example:
import matplotlib.pyplot as plt
x = [2, 4, 3]
y = [1, 3, 5]
errors = [0.5, 0.25, 0.75]
plt.figure()
plt.errorbar(x, y, xerr=errors, fmt = 'o', color = 'k')
plt.yticks((0, 1, 3, 5, 6), ('', 'x3', 'x2', 'x1',''))
Note that boxplot is not the right tool here: its conf_intervals parameter only controls the placement of the notches on the boxes (and we don't want boxes anyway, let alone notched ones). The whiskers can only be set through whis, as a multiple of the IQR or as a pair of percentiles of the data, not as arbitrary confidence bounds.

Thanks to America, I propose a way to automate this kind of graph a little.
Below is example code that generates 20 arrays from a normal distribution with mean=0.25 and std=0.1.
I used the formula W = t * s / sqrt(n) to calculate the margin of error of the confidence interval, where t is the critical value from the t distribution (see scipy.stats.t), s is the sample standard deviation and n is the number of values in an array.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
list_samples = list()  # making a list of arrays
for i in range(20):
    list_samples.append(np.random.normal(loc=0.25, scale=0.1, size=20))
def W_array(array, conf=0.95):  # function that returns W based on the array provided
    t = stats.t(df=len(array) - 1).ppf((1 + conf) / 2)
    W = t * np.std(array, ddof=1) / np.sqrt(len(array))
    return W  # the error
W_list = list()
mean_list = list()
for i in range(len(list_samples)):
    W_list.append(W_array(list_samples[i]))  # makes a list of W for each array
    mean_list.append(np.mean(list_samples[i]))  # same for the means to plot
plt.errorbar(x=mean_list, y=range(len(list_samples)), xerr=W_list, fmt='o', color='k')
plt.axvline(.25, ls='--')  # true mean: on average, about 95% of the
                           # 95% CIs should contain it
plt.yticks([])
plt.show()
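The margin-of-error formula can also be checked by hand on a tiny sample. The sketch below is only an illustration: the five values are made up, and the critical value 2.776 (the two-sided 95% t value for 4 degrees of freedom) is hard-coded from a standard t table so that only the standard library is needed. In practice scipy.stats.t(df=n - 1).ppf(0.975) would supply that number.

```python
import math
import statistics

# Made-up sample, purely for illustration.
sample = [0.21, 0.25, 0.30, 0.19, 0.27]

# Two-sided 95% critical value of Student's t for df = n - 1 = 4,
# hard-coded from a t table (an assumption of this sketch).
T_CRIT_95_DF4 = 2.776

def margin_of_error(values, t_crit):
    """Return W = t * s / sqrt(n), with s the sample standard deviation."""
    n = len(values)
    s = statistics.stdev(values)  # divides by n - 1, like np.std(..., ddof=1)
    return t_crit * s / math.sqrt(n)

mean = statistics.mean(sample)
W = margin_of_error(sample, T_CRIT_95_DF4)
print(f"mean = {mean:.3f}, 95% CI = [{mean - W:.3f}, {mean + W:.3f}]")
```

The pair (mean, W) is exactly what errorbar needs as the point position and xerr for one row of the plot.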

Related

How to Create a Boxplot / Group Boxplot from [Min ,Q1 ,Q2 ,Q3 ,Max] in Python? [duplicate]

From what I can see, the boxplot() method expects a sequence of raw values (numbers) as input, from which it then computes percentiles to draw the boxplot(s).
I would like to have a method by which I could pass in the percentiles and get the corresponding boxplot.
For example:
Assume that I have run several benchmarks and for each benchmark I've measured latencies (floating point values). In addition, I have precomputed the percentiles for these values.
Hence for each benchmark I have the 25th, 50th and 75th percentiles along with the min and max.
Given these data, I would like to draw the box plots for the benchmarks.

As of 2020, there is a better method than the one in the accepted answer.
The matplotlib.axes.Axes class provides a bxp method, which can be used to draw the boxes and whiskers based on the percentile values. Raw data is only needed for the outliers, and that is optional.
Example:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
boxes = [
    {
        'label' : "Male height",
        'whislo': 162.6,    # Bottom whisker position
        'q1'    : 170.2,    # First quartile (25th percentile)
        'med'   : 175.7,    # Median (50th percentile)
        'q3'    : 180.4,    # Third quartile (75th percentile)
        'whishi': 187.8,    # Top whisker position
        'fliers': []        # Outliers
    }
]
ax.bxp(boxes, showfliers=False)
ax.set_ylabel("cm")
plt.savefig("boxplot.png")
plt.close()
This produces the following image:
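When the summaries arrive as plain (min, q1, median, q3, max) tuples, as in the benchmark case from the question, they first have to be turned into the dictionaries that bxp expects. A minimal sketch, assuming that input layout; the helper names to_bxp_stats and draw_summaries and the benchmark numbers are made up for illustration:

```python
def to_bxp_stats(label, summary):
    """Convert a (min, q1, median, q3, max) tuple into a bxp stats dict."""
    minimum, q1, median, q3, maximum = summary
    if not minimum <= q1 <= median <= q3 <= maximum:
        raise ValueError("summary values must be in non-decreasing order")
    return {
        'label': label,
        'whislo': minimum,  # whiskers are drawn at the min and max
        'q1': q1,
        'med': median,
        'q3': q3,
        'whishi': maximum,
        'fliers': [],       # no raw data, so no outliers
    }

def draw_summaries(ax, summaries):
    """Draw one box per (label, summary) pair on an existing Axes."""
    stats = [to_bxp_stats(label, s) for label, s in summaries.items()]
    return ax.bxp(stats, showfliers=False)

# Hypothetical latencies in milliseconds.
benchmarks = {
    'bench A': (1.2, 2.0, 2.6, 3.1, 4.8),
    'bench B': (0.9, 1.5, 1.9, 2.4, 3.3),
}
stats = [to_bxp_stats(label, s) for label, s in benchmarks.items()]
```

With matplotlib available, creating fig, ax = plt.subplots() and calling draw_summaries(ax, benchmarks) should draw both boxes side by side.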

To draw the box plot using just the percentile values and the outliers (if any), I made a customized_box_plot function that modifies the attributes of a basic box plot (generated from a tiny sample of data) so that it fits your percentile values.
The customized_box_plot function
def customized_box_plot(percentiles, axes, redraw = True, *args, **kwargs):
    """
    Generates a customized boxplot based on the given percentile values
    """
    n_box = len(percentiles)
    box_plot = axes.boxplot([[-9, -4, 2, 4, 9],]*n_box, *args, **kwargs)
    # Creates len(percentiles) no of box plots
    min_y, max_y = float('inf'), -float('inf')
    for box_no, (q1_start,
                 q2_start,
                 q3_start,
                 q4_start,
                 q4_end,
                 fliers_xy) in enumerate(percentiles):
        # Lower cap
        box_plot['caps'][2*box_no].set_ydata([q1_start, q1_start])
        # xdata is determined by the width of the box plot
        # Lower whiskers
        box_plot['whiskers'][2*box_no].set_ydata([q1_start, q2_start])
        # Higher cap
        box_plot['caps'][2*box_no + 1].set_ydata([q4_end, q4_end])
        # Higher whiskers
        box_plot['whiskers'][2*box_no + 1].set_ydata([q4_start, q4_end])
        # Box
        box_plot['boxes'][box_no].set_ydata([q2_start,
                                             q2_start,
                                             q4_start,
                                             q4_start,
                                             q2_start])
        # Median
        box_plot['medians'][box_no].set_ydata([q3_start, q3_start])
        # Outliers
        if fliers_xy is not None and len(fliers_xy[0]) != 0:
            # If outliers exist
            box_plot['fliers'][box_no].set(xdata = fliers_xy[0],
                                           ydata = fliers_xy[1])
            min_y = min(q1_start, min_y, fliers_xy[1].min())
            max_y = max(q4_end, max_y, fliers_xy[1].max())
        else:
            min_y = min(q1_start, min_y)
            max_y = max(q4_end, max_y)
        # The y axis is rescaled to fit the new box plot completely, with a
        # margin of 10% of the data range at both ends
        pad = 0.1 * (max_y - min_y)
        axes.set_ylim([min_y - pad, max_y + pad])
    # If redraw is set to true, the canvas is updated.
    if redraw:
        axes.figure.canvas.draw()
    return box_plot
USAGE
Using inverse logic (code at the very end), I extracted the percentile values from this example:
>>> percentiles
(-1.0597368367634488, 0.3977683984966961, 1.0298955252405229, 1.6693981537742526, 3.4951447843464449)
(-0.90494930553559483, 0.36916539612108634, 1.0303658700697103, 1.6874542731392828, 3.4951447843464449)
(0.13744105279440233, 1.3300645202649739, 2.6131540656339483, 4.8763411136047647, 9.5751914834437937)
(0.22786243898199182, 1.4120860286080519, 2.637650402506837, 4.9067126578493259, 9.4660357513550899)
(0.0064696168078617741, 0.30586770128093388, 0.70774153557312702, 1.5241965711101928, 3.3092932063051976)
(0.007009744579241136, 0.28627373934008982, 0.66039691869500572, 1.4772725266672091, 3.221716765477217)
(-2.2621660374110544, 5.1901313713883352, 7.7178532139979357, 11.277744848353247, 20.155971739152388)
(-2.2621660374110544, 5.1884411864079532, 7.3357079047721054, 10.792299385806913, 18.842012119715388)
(2.5417888074435702, 5.885996170695587, 7.7271286220368598, 8.9207423361593179, 10.846938621419374)
(2.5971767318505856, 5.753551925927133, 7.6569980004033464, 8.8161056254143233, 10.846938621419374)
Note that to keep this short I haven't shown the outlier vectors, which are the 6th element of each percentile tuple.
Also note that all the usual additional args / kwargs can be used, since they are simply passed to the boxplot method inside it:
>>> fig, ax = plt.subplots()
>>> b = customized_box_plot(percentiles, ax, redraw=True, notch=0, sym='+', vert=1, whis=1.5)
>>> plt.show()
EXPLANATION
The boxplot method returns a dictionary mapping the components of the boxplot to the individual matplotlib.lines.Line2D instances that were created.
Quoting from the matplotlib.pyplot.boxplot documentation :
That dictionary has the following keys (assuming vertical boxplots):
boxes: the main body of the boxplot showing the quartiles and the median’s confidence intervals if enabled.
medians: horizontal lines at the median of each box.
whiskers: the vertical lines extending to the most extreme, non-outlier data points.
caps: the horizontal lines at the ends of the whiskers.
fliers: points representing data that extend beyond the whiskers (outliers).
means: points or lines representing the means.
For example observe the boxplot of a tiny sample data of [-9, -4, 2, 4, 9]
>>> b = ax.boxplot([[-9, -4, 2, 4, 9],])
>>> b
{'boxes': [<matplotlib.lines.Line2D at 0x7fe1f5b21350>],
'caps': [<matplotlib.lines.Line2D at 0x7fe1f54d4e50>,
<matplotlib.lines.Line2D at 0x7fe1f54d0e50>],
'fliers': [<matplotlib.lines.Line2D at 0x7fe1f5b317d0>],
'means': [],
'medians': [<matplotlib.lines.Line2D at 0x7fe1f63549d0>],
'whiskers': [<matplotlib.lines.Line2D at 0x7fe1f5b22e10>,
<matplotlib.lines.Line2D at 0x7fe20c54a510>]}
>>> plt.show()
The matplotlib.lines.Line2D objects have two kinds of methods that I use extensively in my function: set_xdata (or set_ydata) and get_xdata (or get_ydata).
Using these methods we can alter the position of the constituent lines of the base box plot to conform to your percentile values (which is what the customized_box_plot function does). After altering the positions of the constituent lines, you can redraw the canvas using figure.canvas.draw().
To summarize, the percentiles map to the coordinates of the various Line2D objects as follows.
The Y Coordinates :
The max (q4_end, the end of the 4th quartile) corresponds to the topmost cap Line2D object.
The min (q1_start, the start of the 1st quartile) corresponds to the lowermost cap Line2D object.
The median (q3_start) corresponds to the median Line2D object.
The 2 whiskers lie between the ends of the box and the extreme caps (q1_start and q2_start for the lower whisker; q4_start and q4_end for the upper whisker).
The box is actually an interesting n-shaped line bounded by a cap at the lower portion. The extremes of the n-shaped line correspond to q2_start and q4_start.
The X Coordinates :
The central x coordinates (for multiple box plots these are usually 1, 2, 3...).
The library automatically calculates the bounding x coordinates based on the width specified.
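The mapping above can be written down as a small pure function, which makes it easy to check without drawing anything. This is only a sketch: the name line_ydata is made up, and it covers the unfilled (Line2D) box only, using the same five-point closed outline that customized_box_plot assigns.

```python
def line_ydata(q1_start, q2_start, q3_start, q4_start, q4_end):
    """Return the y data each Line2D of one box should receive."""
    return {
        'lower_cap': [q1_start, q1_start],
        'lower_whisker': [q1_start, q2_start],
        'box': [q2_start, q2_start, q4_start, q4_start, q2_start],
        'median': [q3_start, q3_start],
        'upper_whisker': [q4_start, q4_end],
        'upper_cap': [q4_end, q4_end],
    }

# The tiny sample used for the base box plot, treated as five percentiles.
ydata = line_ydata(-9, -4, 2, 4, 9)
for name, ys in ydata.items():
    print(name, ys)
```

For box number i, the lower cap and whisker are box_plot['caps'][2*i] and box_plot['whiskers'][2*i], and the upper ones sit at index 2*i + 1, as in the function above.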
INVERSE FUNCTION TO RETRIEVE THE PERCENTILES FROM THE boxplot DICT:
def get_percentiles_from_box_plots(bp):
    percentiles = []
    for i in range(len(bp['boxes'])):
        percentiles.append((bp['caps'][2*i].get_ydata()[0],
                            bp['boxes'][i].get_ydata()[0],
                            bp['medians'][i].get_ydata()[0],
                            bp['boxes'][i].get_ydata()[2],
                            bp['caps'][2*i + 1].get_ydata()[0],
                            (bp['fliers'][i].get_xdata(),
                             bp['fliers'][i].get_ydata())))
    return percentiles
NOTE:
The reason I did not make a completely custom boxplot method is that the built-in box plot offers many features that cannot be fully reproduced.
Also excuse me if I may have unnecessarily explained something that may have been too obvious.

Here is an updated version of this useful routine. Setting the vertices directly appears to work for both filled boxes (patch_artist=True) and unfilled ones.
def customized_box_plot(percentiles, axes, redraw = True, *args, **kwargs):
    """
    Generates a customized boxplot based on the given percentile values
    """
    n_box = len(percentiles)
    box_plot = axes.boxplot([[-9, -4, 2, 4, 9],]*n_box, *args, **kwargs)
    # Creates len(percentiles) no of box plots
    min_y, max_y = float('inf'), -float('inf')
    for box_no, pdata in enumerate(percentiles):
        if len(pdata) == 6:
            (q1_start, q2_start, q3_start, q4_start, q4_end, fliers_xy) = pdata
        elif len(pdata) == 5:
            (q1_start, q2_start, q3_start, q4_start, q4_end) = pdata
            fliers_xy = None
        else:
            raise ValueError("Percentile arrays for customized_box_plot must have either 5 or 6 values")
        # Lower cap
        box_plot['caps'][2*box_no].set_ydata([q1_start, q1_start])
        # xdata is determined by the width of the box plot
        # Lower whiskers
        box_plot['whiskers'][2*box_no].set_ydata([q1_start, q2_start])
        # Higher cap
        box_plot['caps'][2*box_no + 1].set_ydata([q4_end, q4_end])
        # Higher whiskers
        box_plot['whiskers'][2*box_no + 1].set_ydata([q4_start, q4_end])
        # Box: edit the path vertices in place
        path = box_plot['boxes'][box_no].get_path()
        path.vertices[0][1] = q2_start
        path.vertices[1][1] = q2_start
        path.vertices[2][1] = q4_start
        path.vertices[3][1] = q4_start
        path.vertices[4][1] = q2_start
        # Median
        box_plot['medians'][box_no].set_ydata([q3_start, q3_start])
        # Outliers
        if fliers_xy is not None and len(fliers_xy[0]) != 0:
            # If outliers exist
            box_plot['fliers'][box_no].set(xdata = fliers_xy[0],
                                           ydata = fliers_xy[1])
            min_y = min(q1_start, min_y, fliers_xy[1].min())
            max_y = max(q4_end, max_y, fliers_xy[1].max())
        else:
            min_y = min(q1_start, min_y)
            max_y = max(q4_end, max_y)
        # The y axis is rescaled to fit the new box plot completely, with a
        # margin of 10% of the data range at both ends
        pad = 0.1 * (max_y - min_y)
        axes.set_ylim([min_y - pad, max_y + pad])
    # If redraw is set to true, the canvas is updated.
    if redraw:
        axes.figure.canvas.draw()
    return box_plot

Here is a bottom-up approach where the box plot is built up using matplotlib's vlines, Rectangle, and normal plot functions:
def boxplot(df, ax=None, box_width=0.2, whisker_size=20, mean_size=10, median_size=10, line_width=1.5, xoffset=0,
            color=0):
    """Plots a boxplot from existing percentiles.
    Parameters
    ----------
    df: pandas DataFrame
        one row per box, with columns 5, 25, 75, 95, 'mean' and 'median'
    ax: matplotlib Axes
        existing axes to plot on; a new figure is created if omitted
    box_width: float
    whisker_size: float
        size of the bar at the end of each whisker
    mean_size: float
        size of the mean symbol
    median_size: float
        size of the median symbol
    color: int or rgb(list)
        If int, that entry of the property cycler's colors is used. Example of rgb: [1,0,0] (red)
    Returns
    -------
    f, a, boxes, vlines, whisker_tips, mean, median
    """
    if isinstance(color, int):
        color = plt.rcParams['axes.prop_cycle'].by_key()['color'][color]
    if ax is not None:
        a = ax
        f = a.get_figure()
    else:
        f, a = plt.subplots()
    boxes = []
    vlines = []
    xn = []
    for idx, row in df.iterrows():
        x = idx + xoffset
        xn.append(x)
        # box: spans the 25th to the 75th percentile
        y = row[25]
        height = row[75] - row[25]
        box = plt.Rectangle((x - box_width / 2, y), box_width, height)
        a.add_patch(box)
        boxes.append(box)
        # whiskers: span the 5th to the 95th percentile
        vl = a.vlines(x, row[5], row[95])
        vlines.append(vl)
    for b in boxes:
        b.set_linewidth(line_width)
        b.set_facecolor([1, 1, 1, 1])
        b.set_edgecolor(color)
        b.set_zorder(2)
    for vl in vlines:
        vl.set_color(color)
        vl.set_linewidth(line_width)
        vl.set_zorder(1)
    whisker_tips = []
    if whisker_size:
        g, = a.plot(xn, df[5], ls='')
        whisker_tips.append(g)
        g, = a.plot(xn, df[95], ls='')
        whisker_tips.append(g)
    for wt in whisker_tips:
        wt.set_markeredgewidth(line_width)
        wt.set_color(color)
        wt.set_markersize(whisker_size)
        wt.set_marker('_')
    mean = None
    if mean_size:
        g, = a.plot(xn, df['mean'], ls='')
        g.set_marker('o')
        g.set_markersize(mean_size)
        g.set_zorder(20)
        g.set_markerfacecolor('None')
        g.set_markeredgewidth(line_width)
        g.set_markeredgecolor(color)
        mean = g
    median = None
    if median_size:
        g, = a.plot(xn, df['median'], ls='')
        g.set_marker('_')
        g.set_markersize(median_size)
        g.set_zorder(20)
        g.set_markeredgewidth(line_width)
        g.set_markeredgecolor(color)
        median = g
    a.set_ylim(np.nanmin(df), np.nanmax(df))
    return f, a, boxes, vlines, whisker_tips, mean, median
This is how it looks in action:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
nopts = 12
df = pd.DataFrame()
df['mean'] = np.random.random(nopts) + 7
df['median'] = np.random.random(nopts) + 7
df[5] = np.random.random(nopts) + 4
df[25] = np.random.random(nopts) + 6
df[75] = np.random.random(nopts) + 8
df[95] = np.random.random(nopts) + 10
out = boxplot(df)

Change the scale of the graph image

I am trying to generate a graph and save an image of it in Python. Although the plotting of the values seems OK and I get my picture, the scale of the graph is badly shifted.
If you compare the correct graph from the tutorial example with my bad graph, generated from a different dataset, the curves are cut off at the bottom too early: the y-axis should start just above the highest values, and I should also see the curves for the highest x-values (in my case around 10^3).
Honestly, I think the problem is the scale of the y-axis, but I do not know which parameters I should change to fix it. I tried playing with some numbers (see the script below), but without any good results.
This is the code for calculation and generation of the graph image:
import numpy as np
hic_data = load_hic_data_from_reads('/home/besy/Hi-C/MOREX/TCC35_parsedV2/TCC35_V2_interaction_filtered.tsv', resolution=100000)
min_diff = 1
max_diff = 500
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(12, 12))
for cnum, c in enumerate(hic_data.chromosomes):
    if c in ['ChrUn']:
        continue
    dist_intr = []
    for diff in xrange(min_diff, min((max_diff, 1 + hic_data.chromosomes[c]))):
        beg, end = hic_data.section_pos[c]
        dist_intr.append([])
        for i in xrange(beg, end - diff):
            dist_intr[-1].append(hic_data[i, i + diff])
    mean_intrp = []
    for d in dist_intr:
        if len(d):
            mean_intrp.append(float(np.nansum(d)) / len(d))
        else:
            mean_intrp.append(0.0)
    xp, yp = range(min_diff, max_diff), mean_intrp
    x = []
    y = []
    for k in xrange(len(xp)):
        if yp[k]:
            x.append(xp[k])
            y.append(yp[k])
    l = plt.plot(x, y, '-', label=c, alpha=0.8)
    plt.hlines(mean_intrp[2], 3, 5.25 + np.exp(cnum / 4.3), color=l[0].get_color(),
               linestyle='--', alpha=0.5)
    plt.text(5.25 + np.exp(cnum / 4.3), mean_intrp[2], c, color=l[0].get_color())
    plt.plot(3, mean_intrp[2], '+', color=l[0].get_color())
plt.xscale('log')
plt.yscale('log')
plt.ylabel('number of interactions')
plt.xlabel('Distance between bins (in 100 kb bins)')
plt.grid()
plt.ylim(2, 250)
_ = plt.xlim(1, 110)
fig.savefig('/home/besy/Hi-C/MOREX/TCC35_V2_results/filtered/TCC35_V2_decay.png', dpi=fig.dpi)
I think the problem is in the scale: I need the y-axis to start from 10^-1 (0.1). To change this, I tried the following:
min_diff = 0.1
.
.
.
dist_intr = []
for diff in xrange(min_diff, min((max_diff, 0.1 + hic_data.chromosomes[c]))):
.
.
.
plt.ylim((0.1, 20))
But these values raise: "integer argument expected, got float"
I also tried playing with the max_diff, plt.ylim and plt.xlim parameters a little bit, but nothing changed much.
I would like to ask which parameter(s) I need to change, and how, to generate an image of a correctly scaled graph. Thank you in advance.
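The error message most likely comes from xrange, which only accepts integers: min_diff and max_diff count bins, so they have to stay whole numbers, whereas the axis limits are ordinary floats and can be set independently with plt.ylim and plt.xlim. The sketch below illustrates that separation on a made-up 5x5 matrix. The helper name mean_by_distance is hypothetical, it uses Python 3's range in place of xrange, and it does not touch the hic_data object from the question.

```python
def mean_by_distance(matrix, min_diff, max_diff):
    """Mean of matrix[i][i + diff] for each integer bin distance diff."""
    n = len(matrix)
    means = {}
    # Bin distances index the matrix, so they must be integers.
    for diff in range(min_diff, min(max_diff, n)):
        values = [matrix[i][i + diff] for i in range(n - diff)]
        means[diff] = sum(values) / len(values)
    return means

# Made-up symmetric interaction counts that decay with distance.
matrix = [[100.0 / (1 + abs(i - j)) ** 2 for j in range(5)] for i in range(5)]
decay = mean_by_distance(matrix, min_diff=1, max_diff=500)

# A float start is rejected by range, just as xrange rejected it.
try:
    range(0.1, 5)
except TypeError as exc:
    print("range refused a float:", exc)

# The visible window is a separate concern, set in data units, e.g.:
#   plt.ylim(0.1, 20)
#   plt.xlim(1, 1000)
```

In the script from the question, plt.ylim(2, 250) hides everything below 2 and plt.xlim(1, 110) cuts the x-axis at 110, so lowering the first ylim argument and raising the second xlim argument (with max_diff increased if distances beyond 500 bins are wanted) would be the first things to try.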

Scipy interpolate.splprep error "Invalid Inputs"

I am trying to fit an interpolating curve through a set of (x,y) points using SciPy's interpolate.splprep method, following the procedure in this StackOverflow answer. My code (with the data) is given below. Please excuse the large dataset; I include it because the code works perfectly fine on a different dataset. Kindly scroll to the bottom to see the implementation.
#!/usr/bin/env python3
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
# -----------------------------------------------------------------------------
# Data
xp=np.array([ -1.19824526e-01, -1.19795807e-01, -1.22298912e-01,
-1.24784611e-01, -1.27233423e-01, -1.27048456e-01,
-1.29424259e-01, -1.31781573e-01, -1.34102825e-01,
-1.36386619e-01, -1.41324999e-01, -1.43569618e-01,
-1.48471481e-01, -1.53300646e-01, -1.55387133e-01,
-1.57436481e-01, -1.53938796e-01, -1.58562951e-01,
-1.53139517e-01, -1.50456275e-01, -1.49637920e-01,
-1.48774455e-01, -1.47843528e-01, -1.44278335e-01,
-1.43299274e-01, -1.39716798e-01, -1.36111285e-01,
-1.32534352e-01, -1.28982866e-01, -1.25433151e-01,
-1.21912263e-01, -1.16106245e-01, -1.12701128e-01,
-1.09303316e-01, -1.05947571e-01, -1.00467194e-01,
-9.72083398e-02, -9.39822094e-02, -9.08033710e-02,
-8.96420533e-02, -8.65053261e-02, -8.34162875e-02,
-8.03788778e-02, -7.73929193e-02, -7.62032638e-02,
-7.32655732e-02, -7.03760465e-02, -6.91826390e-02,
-6.63378816e-02, -6.35537275e-02, -6.08302060e-02,
-5.96426925e-02, -5.69864087e-02, -5.43931715e-02,
-5.18641746e-02, -4.93958173e-02, -4.82415854e-02,
-4.58486281e-02, -4.35196817e-02, -4.01162919e-02,
-3.79466513e-02, -3.48161871e-02, -3.18596693e-02,
-2.90650417e-02, -2.64251761e-02, -2.31429101e-02,
-1.94312163e-02, -1.73997964e-02, -1.55068323e-02,
-1.43163160e-02, -1.31800087e-02, -1.20987991e-02,
-1.10708190e-02, -1.05380016e-02, -9.58116017e-03,
-9.06399242e-03, -8.54450012e-03, -7.67847396e-03,
-7.17608354e-03, -6.67181154e-03, -5.89474349e-03,
-5.40878144e-03, -4.92121197e-03, -4.43202070e-03,
-3.94148294e-03, -3.44986011e-03, -2.82410814e-03,
-2.35269319e-03, -1.88058008e-03, -1.47393691e-03,
-9.78376399e-04, -4.82633521e-04, 1.33099164e-05,
5.09212801e-04, 1.05098855e-03, 1.56929991e-03,
2.08706303e-03, 2.72055571e-03, 3.26012954e-03,
3.79870854e-03, 4.33573131e-03, 4.87172652e-03,
5.40640816e-03, 5.93914581e-03, 6.47004490e-03,
6.99921852e-03, 7.52610639e-03, 7.70592714e-03,
8.20559501e-03, 8.70268809e-03, 9.19766855e-03,
9.68963219e-03, 1.01781695e-02, 1.01960805e-02,
1.06577199e-02, 1.11156340e-02, 1.15703286e-02,
1.20215921e-02, 1.24693015e-02, 1.29129042e-02,
1.33526781e-02, 1.37884367e-02, 1.42204360e-02,
1.46473802e-02, 1.50699789e-02, 1.54884533e-02,
1.59020551e-02, 1.63103362e-02, 8.12110387e-02,
7.80794051e-02, 1.67140103e-02, 8.31537241e-02,
7.99472912e-02, 7.99472912e-02, 7.67983984e-02,
1.71128723e-02, 8.50656342e-02, 8.17851028e-02,
7.85638577e-02, 7.53861405e-02, 1.75061328e-02,
8.19411806e-02, 7.38391281e-02, 1.78939640e-02,
8.70866930e-02, 8.36940292e-02, 8.03586974e-02,
7.70534244e-02, 7.70534244e-02, 7.38013540e-02,
7.38013540e-02, 7.06147796e-02, 1.82766038e-02,
8.54279559e-02, 8.20231372e-02, 7.53294330e-02,
7.20765174e-02, 1.86539411e-02, 8.36524496e-02,
7.85095832e-02, 7.51592888e-02, 7.18792721e-02,
1.90250409e-02, 7.82997201e-02, 7.49183992e-02,
7.49183992e-02, 7.16144248e-02, 7.16144248e-02,
6.83771846e-02, 1.93904576e-02, 7.46192919e-02,
7.12865685e-02, 7.12865685e-02, 6.80175748e-02,
1.97501330e-02, 7.42568965e-02, 7.08996495e-02,
7.08996495e-02, 6.75887344e-02, 2.01042729e-02,
7.38173451e-02, 6.70923613e-02, 2.13903228e-02,
7.50479910e-02, 6.82108239e-02, 5.69753762e-02,
5.24303656e-02, 5.24303656e-02, 4.52683211e-02,
4.52683211e-02, 4.25493203e-02, 2.17470907e-02,
7.45062992e-02, 6.76173090e-02, 6.76173090e-02,
6.42925100e-02, 6.42925100e-02, 5.94649095e-02,
5.94649095e-02, 3.92303424e-02, 2.20977481e-02,
7.21341379e-02, 3.72338037e-02, 2.24415025e-02,
7.14448972e-02, 3.40025442e-02, 2.27777176e-02,
7.07064856e-02, 3.57533680e-02, 2.41421550e-02,
6.81719132e-02, 3.62534788e-02, 2.44798556e-02,
6.56110398e-02, 3.80586628e-02, 3.29287629e-02,
2.93070471e-02, 2.48093588e-02, 6.13326924e-02,
3.85518913e-02, 3.46206958e-02, 2.85091877e-02,
2.51312268e-02, 5.38330011e-02, 3.76841669e-02,
3.50540735e-02, 2.77018960e-02, 2.65615352e-02,
5.28838088e-02, 3.81396763e-02, 3.54777506e-02,
2.80364970e-02, 2.68822682e-02, 5.03377702e-02,
3.85814254e-02, 3.58887890e-02, 4.93316503e-02,
4.04098395e-02, 3.62892096e-02, 4.67615526e-02,
4.22828625e-02, 3.80435955e-02, 3.84376145e-02,
4.02332775e-02, 4.06156847e-02, 4.24553741e-02,
4.43352031e-02, 4.47040511e-02, 4.66233682e-02,
4.69790035e-02, 4.89341212e-02, 5.09256192e-02,
5.12584867e-02, 5.32790231e-02, 5.35890744e-02,
5.38831411e-02, 5.41625645e-02, 5.44267004e-02,
5.46700348e-02, 5.48984863e-02, 5.51117932e-02,
5.53082440e-02, 5.54849716e-02, 5.56464539e-02,
5.57928396e-02, 5.59201893e-02, 5.60294455e-02,
5.61233441e-02, 5.62020138e-02, 5.62604489e-02,
5.63017253e-02, 5.63275468e-02, 5.63341408e-02,
5.63226424e-02, 5.62957310e-02, 5.62533699e-02,
5.61937444e-02, 5.61140110e-02, 5.60191106e-02,
5.59087917e-02, 5.57801898e-02, 5.56328560e-02,
5.54704141e-02, 5.70775198e-02, 5.68728844e-02,
5.66515897e-02, 5.64149230e-02, 5.61622287e-02,
5.76630266e-02, 5.73643873e-02, 5.70502787e-02,
5.67190716e-02, 5.63668473e-02, 5.59997391e-02,
5.73489998e-02, 5.69355151e-02, 5.65029189e-02,
5.77751241e-02, 5.72977910e-02, 5.67990710e-02,
5.79863269e-02, 5.74393835e-02, 5.68773454e-02,
5.62926261e-02, 5.56922722e-02, 5.50771272e-02,
5.44454686e-02, 5.37935810e-02, 5.31273003e-02,
5.24468411e-02, 5.17483760e-02, 5.10330229e-02,
5.03036776e-02, 4.95607328e-02, 4.87997085e-02,
4.80238054e-02, 4.72347342e-02, 4.64331616e-02,
4.56132865e-02, 4.47805574e-02, 4.39358955e-02,
4.30782240e-02, 4.22044750e-02, 4.01052073e-02,
3.92354976e-02, 3.83523540e-02, 3.74567873e-02,
3.65508593e-02, 3.45751478e-02, 3.36740998e-02,
3.27625023e-02, 3.18417381e-02, 3.09129121e-02,
2.90665673e-02, 2.81454989e-02, 2.72171846e-02,
2.62807950e-02, 2.53342284e-02, 2.43816409e-02,
2.34221736e-02, 2.24541496e-02, 2.08179757e-02,
1.98678098e-02, 1.89113740e-02, 1.79488243e-02,
1.69806146e-02, 1.65158032e-02, 1.55075714e-02,
1.44932106e-02, 1.34746855e-02, 1.24525920e-02,
1.14268067e-02, 1.03968750e-02, 9.36414487e-03,
8.58823755e-03, 7.51804527e-03, 6.44485601e-03,
5.37002690e-03, 4.29398700e-03, 3.31511044e-03,
2.20302298e-03, 1.09069996e-03, -2.27320426e-05,
-1.16892664e-03, -2.31490869e-03, -3.46060569e-03,
-4.74178052e-03, -5.91852523e-03, -7.09360822e-03,
-8.26683115e-03, -9.43736653e-03, -1.06042682e-02,
-1.17686419e-02, -1.33107457e-02, -1.45010352e-02,
-1.56869180e-02, -1.68693838e-02, -1.80464175e-02,
-1.97732638e-02, -2.09722818e-02, -2.21650612e-02,
-2.40185758e-02, -2.52303300e-02, -2.71803154e-02,
-2.84115598e-02, -3.04489552e-02, -3.16936647e-02,
-3.29299358e-02, -3.50861051e-02, -3.63332401e-02,
-3.85745058e-02, -3.98348648e-02, -4.21660006e-02,
-4.34302610e-02, -4.46836493e-02, -4.59254575e-02,
-4.71530952e-02, -4.96209305e-02, -4.95594200e-02,
-5.07435074e-02, -5.19101301e-02, -5.16977894e-02,
-5.14280802e-02, -5.11057669e-02, -5.07251169e-02,
-5.16985297e-02, -5.12126585e-02, -5.06852098e-02,
-5.15589749e-02, -5.09397027e-02, -5.17615499e-02,
-5.10672514e-02, -5.18313966e-02, -5.25816754e-02,
-5.33179227e-02, -5.40360028e-02, -5.47358953e-02,
-5.54213064e-02, -5.77400978e-02, -5.84092053e-02,
-5.90603644e-02, -6.14284845e-02, -6.38379284e-02,
-6.62872262e-02, -6.69166162e-02, -6.93865431e-02,
-7.18947674e-02, -7.44284962e-02, -7.69969804e-02,
-7.96063191e-02, -8.01834105e-02, -8.28053535e-02,
-8.54623715e-02, -8.59961071e-02, -8.86660185e-02,
-8.91520913e-02, -9.18335218e-02, -9.45402708e-02,
-9.49610563e-02, -9.76401856e-02, -1.00332460e-01,
-1.03032191e-01, -1.03358935e-01, -1.06040606e-01,
-1.06322470e-01, -1.08984284e-01, -1.09195131e-01,
-1.11833426e-01, -1.11994247e-01, -1.14596404e-01,
-1.17192554e-01, -1.17248317e-01])
yp = np.array([ -3.90948536e-05, -2.12984775e-03, -4.31095583e-03,
-6.58019633e-03, -8.93758156e-03, -1.11568100e-02,
-1.36444162e-02, -1.62222092e-02, -1.88895170e-02,
-2.16446498e-02, -2.49629308e-02, -2.79508857e-02,
-3.16029501e-02, -3.54376380e-02, -3.87881494e-02,
-4.22310942e-02, -4.41873802e-02, -4.85246067e-02,
-4.68663315e-02, -4.60459599e-02, -4.86676408e-02,
-5.12750434e-02, -5.38586293e-02, -5.54310799e-02,
-5.79452426e-02, -5.93547929e-02, -6.06497762e-02,
-6.18505946e-02, -6.29584706e-02, -6.39609234e-02,
-6.48713094e-02, -6.44090476e-02, -6.51181556e-02,
-6.57260659e-02, -6.62541381e-02, -6.52943568e-02,
-6.56184758e-02, -6.58578685e-02, -6.60229010e-02,
-6.76012689e-02, -6.76366183e-02, -6.76004442e-02,
-6.74972483e-02, -6.73282385e-02, -6.86657097e-02,
-6.83738036e-02, -6.80140059e-02, -6.92366190e-02,
-6.87491258e-02, -6.82071471e-02, -6.76134579e-02,
-6.86669494e-02, -6.79695621e-02, -6.72259327e-02,
-6.64391135e-02, -6.56069234e-02, -6.64563885e-02,
-6.55361171e-02, -6.45783892e-02, -6.18312378e-02,
-6.07850085e-02, -5.80009440e-02, -5.52383021e-02,
-5.24888121e-02, -4.97523554e-02, -4.54714570e-02,
-3.98863362e-02, -3.73592876e-02, -3.48720213e-02,
-3.37707235e-02, -3.26655171e-02, -3.15625118e-02,
-3.04616664e-02, -3.06508019e-02, -2.95344258e-02,
-2.96968330e-02, -2.98505905e-02, -2.87101259e-02,
-2.88391064e-02, -2.89597166e-02, -2.77967360e-02,
-2.78958771e-02, -2.79854740e-02, -2.80670276e-02,
-2.81405467e-02, -2.82051366e-02, -2.69913041e-02,
-2.70365186e-02, -2.70739448e-02, -2.83768113e-02,
-2.83979671e-02, -2.84108899e-02, -2.84155794e-02,
-2.84104617e-02, -2.96993141e-02, -2.96767995e-02,
-2.96453017e-02, -3.09305120e-02, -3.08782748e-02,
-3.08172540e-02, -3.07460634e-02, -3.06652277e-02,
-3.05756546e-02, -3.04773301e-02, -3.03684498e-02,
-3.02505329e-02, -3.01240628e-02, -2.87032761e-02,
-2.85638294e-02, -2.84161924e-02, -2.82602014e-02,
-2.80957411e-02, -2.79220043e-02, -2.65224371e-02,
-2.63408455e-02, -2.61506690e-02, -2.59523304e-02,
-2.57465736e-02, -2.55333569e-02, -2.53114227e-02,
-2.50819674e-02, -2.48453976e-02, -2.46014650e-02,
-2.43490672e-02, -2.40896946e-02, -2.38232320e-02,
-2.35495727e-02, -2.32681400e-02, -1.11708561e-01,
-1.07398522e-01, -2.29799277e-02, -1.10281290e-01,
-1.06025945e-01, -1.06025945e-01, -1.01847844e-01,
-2.26850806e-02, -1.08812919e-01, -1.04614895e-01,
-1.00492396e-01, -9.64256156e-02, -2.23830803e-02,
-1.01124594e-01, -9.11212826e-02, -2.20738630e-02,
-1.03723227e-01, -9.96804013e-02, -9.57062055e-02,
-9.17682599e-02, -9.17682599e-02, -8.78935733e-02,
-8.78935733e-02, -8.40962884e-02, -2.17583603e-02,
-9.82127298e-02, -9.42965108e-02, -8.65980524e-02,
-8.28570139e-02, -2.14365508e-02, -9.28460674e-02,
-8.71354106e-02, -8.34157663e-02, -7.97743543e-02,
-2.11075333e-02, -8.39100274e-02, -8.02849723e-02,
-8.02849723e-02, -7.67428202e-02, -7.67428202e-02,
-7.32724167e-02, -2.07721464e-02, -7.72159766e-02,
-7.37663681e-02, -7.37663681e-02, -7.03828404e-02,
-2.04308432e-02, -7.42042591e-02, -7.08482147e-02,
-7.08482147e-02, -6.75385453e-02, -2.00834820e-02,
-7.12338454e-02, -6.47417418e-02, -2.06352744e-02,
-6.99333169e-02, -6.35600774e-02, -5.30876202e-02,
-4.88515872e-02, -4.88515872e-02, -4.21763073e-02,
-4.21763073e-02, -3.96425097e-02, -2.02588101e-02,
-6.70368116e-02, -6.08364913e-02, -6.08364913e-02,
-5.78440553e-02, -5.78440553e-02, -5.34994049e-02,
-5.34994049e-02, -3.52908904e-02, -1.98763502e-02,
-6.26583213e-02, -3.23368135e-02, -1.94880238e-02,
-5.99037138e-02, -2.85040222e-02, -1.90931928e-02,
-5.72132575e-02, -2.89247783e-02, -1.95297821e-02,
-5.32198482e-02, -2.82971986e-02, -1.91058177e-02,
-4.94013681e-02, -2.86515116e-02, -2.47888430e-02,
-2.20618305e-02, -1.86758942e-02, -4.45232330e-02,
-2.79827472e-02, -2.51286391e-02, -2.06919011e-02,
-1.82397645e-02, -3.76607947e-02, -2.63609122e-02,
-2.45208701e-02, -1.93767971e-02, -1.85788804e-02,
-3.56379420e-02, -2.56998805e-02, -2.39058698e-02,
-1.88908564e-02, -1.81130913e-02, -3.26595065e-02,
-2.50304222e-02, -2.32829732e-02, -3.07966353e-02,
-2.52257065e-02, -2.26527986e-02, -2.80693713e-02,
-2.53799880e-02, -2.28350066e-02, -2.21686432e-02,
-2.22782703e-02, -2.15723084e-02, -2.16081542e-02,
-2.15998200e-02, -2.08220272e-02, -2.07341864e-02,
-1.99180705e-02, -1.97463091e-02, -1.95241512e-02,
-1.86330762e-02, -1.83210810e-02, -1.73881714e-02,
-1.64501676e-02, -1.55073488e-02, -1.45603397e-02,
-1.36076891e-02, -1.26514336e-02, -1.16918550e-02,
-1.07281971e-02, -9.76103257e-03, -8.79150351e-03,
-7.81935696e-03, -6.84417527e-03, -5.86703766e-03,
-4.88857954e-03, -3.90851347e-03, -2.92690669e-03,
-1.94445885e-03, -9.62077293e-04, 2.10973681e-05,
1.00443470e-03, 1.98670872e-03, 2.96920518e-03,
3.95065293e-03, 4.93054490e-03, 5.90896238e-03,
6.88594418e-03, 7.86095305e-03, 8.83291761e-03,
9.80227952e-03, 1.11168744e-02, 1.21109612e-02,
1.31014370e-02, 1.40884671e-02, 1.50714343e-02,
1.65579859e-02, 1.75619959e-02, 1.85609524e-02,
1.95539892e-02, 2.05406220e-02, 2.15208623e-02,
2.31958067e-02, 2.41936890e-02, 2.51825785e-02,
2.69676402e-02, 2.79735240e-02, 2.89676199e-02,
3.08600313e-02, 3.18685118e-02, 3.28673845e-02,
3.38531747e-02, 3.48305552e-02, 3.57981735e-02,
3.67545357e-02, 3.76978426e-02, 3.86308181e-02,
3.95533112e-02, 4.04626970e-02, 4.13583593e-02,
4.22429533e-02, 4.31163338e-02, 4.39732984e-02,
4.48174616e-02, 4.56497573e-02, 4.64690781e-02,
4.72699006e-02, 4.80584575e-02, 4.88339015e-02,
4.95941309e-02, 5.03364921e-02, 4.95646923e-02,
5.02584615e-02, 5.09357803e-02, 5.15956682e-02,
5.22416815e-02, 5.13017754e-02, 5.18954788e-02,
5.24741267e-02, 5.30389590e-02, 5.35886852e-02,
5.24828002e-02, 5.29815950e-02, 5.34658214e-02,
5.39333680e-02, 5.43819816e-02, 5.48154596e-02,
5.52339801e-02, 5.56342267e-02, 5.42908141e-02,
5.46458325e-02, 5.49864909e-02, 5.53070690e-02,
5.56106186e-02, 5.76746395e-02, 5.79561804e-02,
5.82151857e-02, 5.84584712e-02, 5.86858866e-02,
5.88950787e-02, 5.90831689e-02, 5.92552324e-02,
6.12619766e-02, 6.14026109e-02, 6.15224608e-02,
6.16256880e-02, 6.17123394e-02, 6.36720486e-02,
6.37186812e-02, 6.37481408e-02, 6.56861133e-02,
6.56722393e-02, 6.56407664e-02, 6.55917721e-02,
6.74743714e-02, 6.73786194e-02, 6.72646677e-02,
6.71325153e-02, 6.69773741e-02, 6.68003322e-02,
6.66053297e-02, 6.83473413e-02, 6.81027202e-02,
6.78376164e-02, 6.75542903e-02, 6.72524243e-02,
6.88634515e-02, 6.85066915e-02, 6.81318613e-02,
6.96716731e-02, 6.92387957e-02, 7.07292209e-02,
7.02467261e-02, 7.16584645e-02, 7.11134550e-02,
7.05499862e-02, 7.18681370e-02, 7.12419450e-02,
7.24822144e-02, 7.17989089e-02, 7.29694293e-02,
7.22180480e-02, 7.14476517e-02, 7.06588875e-02,
6.98486903e-02, 7.08078412e-02, 6.81567317e-02,
6.72843393e-02, 6.63881936e-02, 6.37899997e-02,
6.12404950e-02, 5.87433383e-02, 5.62902969e-02,
5.53950962e-02, 5.29895052e-02, 5.06437557e-02,
4.97490264e-02, 4.74631181e-02, 4.65678359e-02,
4.43551128e-02, 4.34554011e-02, 4.25440351e-02,
4.16213883e-02, 4.06842153e-02, 3.97338457e-02,
3.87727819e-02, 3.89122376e-02, 3.78978623e-02,
3.68719281e-02, 3.68766567e-02, 3.68230044e-02,
3.67095055e-02, 3.55465346e-02, 3.53200609e-02,
3.50311849e-02, 3.46717730e-02, 3.42461153e-02,
3.37555022e-02, 3.23610029e-02, 3.17505933e-02,
3.10701527e-02, 2.95754797e-02, 2.87735213e-02,
2.72210019e-02, 2.62970023e-02, 2.52956273e-02,
2.36404433e-02, 2.25053642e-02, 2.12889860e-02,
1.99902757e-02, 1.81872330e-02, 1.67574555e-02,
1.49054892e-02, 1.33429656e-02, 1.14391250e-02,
9.74643800e-03, 7.79267351e-03, 5.96714375e-03,
4.05355227e-03, 2.00672241e-03])
# -----------------------------------------------------------------------------
# Use scipy to interpolate.
xp = np.r_[xp, xp[0]]
yp = np.r_[yp, yp[0]]
tck, u = interpolate.splprep([xp, yp], s=0, k=1, per=True)
xi, yi = interpolate.splev(np.linspace(0, 1, 1000), tck)
# -----------------------------------------------------------------------------
# Plot result
fig = plt.figure()
ax = plt.subplot(111)
ax.plot(xp, yp, '.', markersize=2)
ax.plot(xi, yi, alpha=0.5)
plt.show()
I get the following error on one machine (MacOS),
---> tck, u = interpolate.splprep([xp, yp], s=0, k=1, per=True)
SystemError: <built-in function _parcur> returned NULL without setting an error
And this error on another machine (Ubuntu),
----> tck, u = interpolate.splprep([xp, yp], s=0, k=1, per=True)
ValueError: Invalid inputs.
interpolate.splprep uses the FORTRAN parcur routine from FITPACK (from the documentation).
My questions are -
Why does the code work for different datasets? e.g. xp = np.array([0.1, 0.2, 0.3, 0.4]) yp = np.array([-0.1, -0.3, -0.4, 0.2]) and not for this particular one? What does the error mean?
How can I get this to work? (Using this method or any other method) i.e. either interpolate a curve or filter the outliers ...
Out of curiosity, why is the error machine (and OS) dependent?
This is how the data looks when plotted, I think you can guess which curve I'd like to interpolate to (and which outliers I'd like to remove, if possible)

Fitpack has a fit if two consecutive inputs are identical. The error happens deep enough that it depends on how the libraries were compiled and linked, hence the assortment of errors.
For example, xp[147:149], yp[147:149] (and several others):
(array([ 0.07705342, 0.07705342]), array([-0.09176826, -0.09176826]))
Keep only the points that differ from the next one; these are okay:
okay = np.where(np.abs(np.diff(xp)) + np.abs(np.diff(yp)) > 0)
xp = np.r_[xp[okay], xp[-1], xp[0]]
yp = np.r_[yp[okay], yp[-1], yp[0]]
# the rest of your code
I add the last point back because the output of diff is always one element shorter, so the last one needs to be included manually. (And then of course, you put the 0th point again for periodicity)
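The same clean-up can also be written as a boolean mask, which avoids adding the last point back by hand. This is only a minimal sketch of that variation, not the exact code above: the helper name drop_consecutive_duplicates and the toy arrays are made up for illustration.

```python
import numpy as np

def drop_consecutive_duplicates(xp, yp):
    """Keep a point only if it differs from the point right before it.

    Hypothetical helper (not part of SciPy); the first point is always kept.
    """
    xp = np.asarray(xp, dtype=float)
    yp = np.asarray(yp, dtype=float)
    # True where either coordinate changed relative to the previous point
    changed = (np.diff(xp) != 0) | (np.diff(yp) != 0)
    keep = np.r_[True, changed]
    return xp[keep], yp[keep]

# Toy data: the corners of a unit square with some points repeated
xp = np.array([0.0, 1.0, 1.0, 1.0, 0.0, 0.0])
yp = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
xc, yc = drop_consecutive_duplicates(xp, yp)
print(xc, yc)
```

If the curve is not already closed, the cleaned arrays can then be closed with np.r_[xc, xc[0]] and np.r_[yc, yc[0]] and passed to interpolate.splprep as in the question.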
Cutting off the weird part
This is my attempt to cut off the weird extruding part of the dataset. It uses a Gaussian filter from ndimage. The original points xp, yp are kept this time; the filtered ones are xn, yn.
jump = np.sqrt(np.diff(xp)**2 + np.diff(yp)**2)
smooth_jump = ndimage.gaussian_filter1d(jump, 5, mode='wrap') # window of size 5 is arbitrary
limit = 2*np.median(smooth_jump) # factor 2 is arbitrary
xn, yn = xp[:-1], yp[:-1]
xn = xn[(jump > 0) & (smooth_jump < limit)]
yn = yn[(jump > 0) & (smooth_jump < limit)]
So, we remove not only duplicate points but also the points where the values jump around too much. The rest goes as before, except that the interpolation is now built from xn, yn. I plot the original points for comparison with the new (red) curve:
ax.plot(xp, yp, 'o', markersize=2)
ax.plot(xi, yi, 'r', alpha=0.5)

How to uniformly resample a non-uniform signal using SciPy?

I have an (x, y) signal with non-uniform sample rate in x. (The sample rate is roughly proportional to 1/x). I attempted to uniformly re-sample it using scipy.signal's resample function. From what I understand from the documentation, I could pass it the following arguments:
scipy.signal.resample(array_of_y_values, number_of_sample_points, array_of_x_values)
and it would return the array of
[[resampled_y_values],[new_sample_points]]
I'd expect it to return uniformly sampled data with roughly the same shape as the original, and with the same minimal and maximal x values. But it doesn't:
# nu_data = [[x1, x2, ..., xn], [y1, y2, ..., yn]]
# with x values in ascending order
length = len(nu_data[0])
resampled = sg.resample(nu_data[1], length, nu_data[0])
uniform_data = np.array([resampled[1], resampled[0]])
plt.plot(nu_data[0], nu_data[1], uniform_data[0], uniform_data[1])
plt.show()
blue: nu_data, orange: uniform_data
It doesn't look unaltered, and the x scale has been resized too. If I try to fix the range by constructing the desired uniform x values myself and using them instead, the distortion remains:
length = len(nu_data[0])
resampled = sg.resample(nu_data[1], length, nu_data[0])
delta = (nu_data[0,-1] - nu_data[0,0]) / length
new_samplepoints = np.arange(nu_data[0,0], nu_data[0,-1], delta)
uniform_data = np.array([new_samplepoints, resampled[0]])
plt.plot(nu_data[0], nu_data[1], uniform_data[0], uniform_data[1])
plt.show()
What is the proper way to re-sample my data uniformly, if not this?

Please look at this rough solution:
import matplotlib.pyplot as plt
from scipy import interpolate
import numpy as np
x = np.array([0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20])
y = np.exp(-x/3.0)
flinear = interpolate.interp1d(x, y)
fcubic = interpolate.interp1d(x, y, kind='cubic')
xnew = np.arange(0.001, 20, 1)
ylinear = flinear(xnew)
ycubic = fcubic(xnew)
plt.plot(x, y, 'X', xnew, ylinear, 'x', xnew, ycubic, 'o')
plt.show()
That is a slightly updated example from the scipy documentation. If you execute it, you should see something like this:
The blue crosses are the initial function, i.e. your signal with a non-uniform sampling distribution. There are two results: the orange x markers show linear interpolation and the green dots show cubic interpolation. The question is which option you prefer. Personally I don't like either of them, which is why I usually take 4 points and interpolate between them, then the next points, and so on, to get cubic interpolation without those strange bumps. That is much more work, and I can't see a way to do it with scipy, so it will be slow. That is why I asked about the size of the data.
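For the linear case, the uniform resampling can also be done with NumPy alone. The following is a rough sketch under the assumption that linear interpolation is acceptable for your signal; the names x_uniform and y_uniform are made up here, and the sample data is the same toy signal as above. As far as I understand the scipy documentation, signal.resample uses a Fourier method and treats its input as equally spaced, which would explain the distortion in the question.

```python
import numpy as np

# Non-uniformly sampled toy signal (same x values as the example above)
x = np.array([0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2,
              0.5, 1, 2, 5, 10, 20])
y = np.exp(-x / 3.0)

# Uniform grid covering exactly the original x range, same number of points
x_uniform = np.linspace(x[0], x[-1], len(x))

# Piecewise-linear interpolation of y onto the uniform grid;
# np.interp requires x to be in ascending order
y_uniform = np.interp(x_uniform, x, y)
```

Because the grid comes from np.linspace, the first and last sample points coincide with the original minimum and maximum x, which is what the question asked for. With so few points in the sparse region the result is still only as good as the interpolation between them.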

How do I plot a step function with Matplotlib in Python?

This should be easy, but I have just started toying with matplotlib and Python. I can do a line or a scatter plot, but I am not sure how to do a simple step function. Any help is much appreciated.
x = 1,2,3,4
y = 0.002871972681775004, 0.00514787917410944, 0.00863476098280219, 0.012003316194034325

It seems like you want step.
E.g.
import matplotlib.pyplot as plt
x = [1,2,3,4]
y = [0.002871972681775004, 0.00514787917410944,
0.00863476098280219, 0.012003316194034325]
plt.step(x, y)
plt.show()

If you have non-uniformly spaced data points, you can use the drawstyle keyword argument for plot:
x = [1,2.5,3.5,4]
y = [0.002871972681775004, 0.00514787917410944,
0.00863476098280219, 0.012003316194034325]
plt.plot(x, y, drawstyle='steps-pre')
Also available are steps-mid and steps-post.

New in matplotlib 3.4.0
There is a new plt.stairs method to complement plt.step:
plt.stairs and the underlying StepPatch provide a cleaner interface for plotting stepwise constant functions for the common case that you know the step edges.
This supersedes many use cases of plt.step, for instance when plotting the output of np.histogram.
Check out the official matplotlib gallery for how to use plt.stairs and StepPatch.
When to use plt.step vs plt.stairs
Use the original plt.step if you have reference points. Here the steps are anchored at [1,2,3,4] and extended to the left:
plt.step(x=[1,2,3,4], y=[20,40,60,30])
Use the new plt.stairs if you have edges. The previous [1,2,3,4] step points correspond to [1,1,2,3,4] stair edges:
plt.stairs(values=[20,40,60,30], edges=[1,1,2,3,4])
Using plt.stairs with np.histogram
Since np.histogram returns edges, it works directly with plt.stairs:
data = np.random.normal(5, 3, 3000)
bins = np.linspace(0, 10, 20)
hist, edges = np.histogram(data, bins)
plt.stairs(hist, edges)

I think you want pylab.bar(x,y,width=1) or, equivalently, pyplot's bar method. If not, check out the gallery for the many styles of plots you can make. Each image comes with example code showing you how to make it using matplotlib.

Draw two lines, one at y=0, and one at y=1, cutting off at whatever x your step function is for.
e.g. if you want to step from 0 to 1 at x=2.3 and plot from x=0 to x=5:
import matplotlib.pyplot as plt
# _
# if you want the vertical line _|
plt.plot([0,2.3,2.3,5],[0,0,1,1])
#
# OR:
# _
# if you don't want the vertical line _
#plt.plot([0,2.3],[0,0],[2.3,5],[1,1])
# now change the y axis so we can actually see the line
plt.ylim(-0.1,1.1)
plt.show()

In case someone just wants to stepify some data rather than actually plot it:
def get_x_y_steps(x, y, where="post"):
    # list(...) is needed on Python 3, where zip returns an iterator that cannot be sliced
    if where == "post":
        x_step = [x[0]] + [_x for tup in list(zip(x, x))[1:] for _x in tup]
        y_step = [_y for tup in list(zip(y, y))[:-1] for _y in tup] + [y[-1]]
    elif where == "pre":
        x_step = [_x for tup in list(zip(x, x))[:-1] for _x in tup] + [x[-1]]
        y_step = [y[0]] + [_y for tup in list(zip(y, y))[1:] for _y in tup]
    else:
        raise ValueError("where must be 'post' or 'pre'")
    return x_step, y_step
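If NumPy is available, the same idea can be sketched more compactly with np.repeat. This is just an illustrative variant, not part of matplotlib; the function name stepify is made up here.

```python
import numpy as np

def stepify(x, y, where="post"):
    """Return the corner points of a step curve through (x, y).

    Hypothetical helper: every x and y value is doubled, then one end is
    trimmed so that the two arrays are offset by one position.
    """
    x2 = np.repeat(np.asarray(x), 2)
    y2 = np.repeat(np.asarray(y), 2)
    if where == "post":
        # hold each y value until the next x
        return x2[1:], y2[:-1]
    if where == "pre":
        # jump to the new y value already at the previous x
        return x2[:-1], y2[1:]
    raise ValueError("where must be 'post' or 'pre'")

xs, ys = stepify([1, 2, 3], [10, 20, 30])
print(xs.tolist())  # [1, 2, 2, 3, 3]
print(ys.tolist())  # [10, 10, 20, 20, 30]
```

Plotting xs against ys with an ordinary line plot then gives the same picture as a step plot of the original data.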
