finding inflection points in spline fitted 1d data - python

I have some one dimensional data and fit it with a spline. Then I want to find the inflection points (ignoring saddle points) in it. Now I am searching the extrema of its first derivation by using scipy.signal.argrelmin (and argrelmax) on a lot of values generated by splev.
import scipy.interpolate
import scipy.optimize
import scipy.signal
import numpy as np
import matplotlib.pyplot as plt
import operator
y = [-1, 5, 6, 4, 2, 5, 8, 5, 1]
x = np.arange(0, len(y))
tck = scipy.interpolate.splrep(x, y, s=0)
print 'roots', scipy.interpolate.sproot(tck)
# output:
# [0.11381478]
xnew = np.arange(0, len(y), 0.01)
ynew = scipy.interpolate.splev(xnew, tck, der=0)
ynew_deriv = scipy.interpolate.splev(xnew, tck, der=1)
min_idxs = scipy.signal.argrelmin(ynew_deriv)
max_idxs = scipy.signal.argrelmax(ynew_deriv)
mins = zip(xnew[min_idxs].tolist(), ynew_deriv[min_idxs].tolist())
maxs = zip(xnew[max_idxs].tolist(), ynew_deriv[max_idxs].tolist())
inflection_points = sorted(mins + maxs, key=operator.itemgetter(0))
print 'inflection_points', inflection_points
# output:
# [(3.13, -2.9822449358974357),
# (5.03, 4.3817785256410255)
# (7.13, -4.867132628205128)]
plt.legend(['data','Cubic Spline', '1st deriv'])
plt.plot(x, y, 'o',
xnew, ynew, '-',
xnew, ynew_deriv, '-')
plt.show()
But this feels terribly wrong. I guess there is a possibility to find what I am looking for without generating so many values. Something like sproot but applicable to the second derivation perhaps?

The derivative of a B-spline is also a B-spline. You can therefore first fit a spline to your data, then use the derivative formula to construct the coefficients of the derivative spline, and finally use the spline root finding to get the roots of the derivative spline. These are then the maxima/minima of the original curve.
Here is code to do it: https://gist.github.com/pv/5504366
The relevant computation of the coefficients is:
t, c, k = scipys_spline_representation
# Compute the denominator in the differentiation formula.
dt = t[k+1:-1] - t[1:-k-1]
# Compute the new coefficients
d = (c[1:-1-k] - c[:-2-k]) * k / dt
# Adjust knots
t2 = t[1:-1]
# Pad coefficient array to same size as knots (FITPACK convention)
d = np.r_[d, [0]*k]
# Done, a new spline
new_spline_repr = t2, d, k-1

Related

Create BSpline from knots and coefficients

How can a spline be created if only the points and the coefficients are known? I'm using scipy.interpolate.BSpline here, but am open to other standard packages as well. So basically I want to be able to give someone just those short arrays of coefficients for them to be able to recreate the fit to the data. See the failed red-dashed curve below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import BSpline, LSQUnivariateSpline
x = np.linspace(0, 10, 50) # x-data
y = np.exp(-(x-5)**2/4) # y-data
# define the knot positions
t = [1, 2, 4, 5, 6, 8, 9]
# get spline fit
s1 = LSQUnivariateSpline(x, y, t)
x2 = np.linspace(0, 10, 200) # new x-grid
y2 = s1(x2) # evaluate spline on that new grid
# FAILED: try to construct BSpline using the knots and coefficients
k = s1.get_knots()
c = s1.get_coeffs()
s2 = BSpline(t,c,2)
# plotting
plt.plot(x, y, label='original')
plt.plot(t, s1(t),'o', label='knots')
plt.plot(x2, y2, '--', label='spline 1')
plt.plot(x2, s2(x2), 'r:', label='spline 2')
plt.legend()
The fine print under get_knots says:
Internally, the knot vector contains 2*k additional boundary knots.
That means, to get a usable knot array from get_knots, one should add k copies of the left boundary knot at the beginning of array, and k copies of the right boundary knot at the end. Here k is the degree of the spline, which is usually 3 (you asked for LSQUnivariateSpline of default degree, so that's 3). So:
kn = s1.get_knots()
kn = 3*[kn[0]] + list(kn) + 3*[kn[-1]]
c = s1.get_coeffs()
s2 = BSpline(kn, c, 3) # not "2" as in your sample; we are working with a cubic spline
Now, the spline s2 is the same as s1:
Equivalently, kn = 4*[x[0]] + t + 4*[x[-1]] would work: your t list contains only interior knots, so x[0] and x[-1] are added, and then each repeated k times more.
The mathematical reason for the repetition is that B-splines need some room to get built, due to their inductive definition which requires (k-1)-degree splines to exist around every interval in which we define the kth degree spline.
Here is a slightly more compact way of doing it if you don't care too much about the details of the knot positions. The tk array is what you are looking for. Once tk is in hand the spline can be reproduced using the y=splev(x,tk,der=0) line.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import splrep,splev
import matplotlib.pyplot as plt
### Input data
x_arr = np.linspace(0, 10, 50) # x-data
y_arr = np.exp(-(x_arr-5)**2/4) # y-data
### Order of spline
order = 3
### Make the spline
tk = splrep(x_arr, y_arr, k=order) # Returns the knots and coefficents
### Evaluate the spline using the knots and coefficents on the domian x
x = np.linspace(0, 10, 1000) # new x-grid
y = splev(x, tk, der=0)
### Plot
f,ax=plt.subplots()
ax.scatter(x_arr, y_arr, label='original')
ax.plot(x,y,label='Spline')
ax.legend(fontsize=15)
plt.show()

fit numpy polynomials to noisy data

I want to exactly represent my noisy data with a numpy.polynomial polynomial. How can I do that?.
In this example, I chose legendre polynomials. When I use the polynomial legfit function, the coefficients it returns are either very large or very small. So, I think I need some sort of regularization.
Why doesn't my fit get more accurate as I increase the degree of the polynomial? (It can be seen that the 20, 200, and 300 degree polynomials are essentially identical.) Are any regularization options available in the polynomial package?
I tried implementing my own regression function, but it feels like I am re-inventing the wheel. Is making my own fitting function the best path forward?
from scipy.optimize import least_squares as mini
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 5, 1000)
tofit = np.sin(3 * x) + .6 * np.sin(7*x) - .5 * np.cos(3 * np.cos(10 * x))
# This is here to illustrate what I expected the legfit function to do
# I expected it to do least squares regression with some sort of regularization.
def myfitfun(x, y, deg):
def fitness(a):
return ((np.polynomial.legendre.legval(x, a) - y)**2).sum() + np.sum(a ** 2)
return mini(fitness, np.zeros(deg)).x
degrees = [2, 4, 8, 16, 40, 200]
plt.plot(x, tofit, c='k', lw=4, label='Data')
for deg in degrees:
#coeffs = myfitfun(x, tofit, deg)
coeffs = np.polynomial.legendre.legfit(x, tofit, deg)
plt.plot(x, np.polynomial.legendre.legval(x, coeffs), label="Degree=%i"%deg)
plt.legend()
Legendre polynomials are meant to be used over the interval [-1,1]. Try to replace x with 2*x/x[-1] - 1 in your fit and you'll see that all is good:
nx = 2*x/x[-1] - 1
for deg in degrees:
#coeffs = myfitfun(x, tofit, deg)
coeffs = np.polynomial.legendre.legfit(nx, tofit, deg)
plt.plot(x, np.polynomial.legendre.legval(nx, coeffs), label="Degree=%i"%deg)
The easy way to use the proper interval in the fit is to use the Legendre class
from numpy.polynomial import Legendre as L
p = L.fit(x, y, order)
This will scale and shift the data to the interval [-1, 1] and track the scaling factors.

Spline with constraints at border

I have measured data on a three dimensional grid, e.g. f(x, y, t). I want to interpolate and smooth this data in the direction of t with splines.
Currently, I do this with scipy.interpolate.UnivariateSpline:
import numpy as np
from scipy.interpolate import UnivariateSpline
# data is my measured data
# data.shape is (len(y), len(x), len(t))
data = np.arange(1000).reshape((5, 5, 40)) # just for demonstration
times = np.arange(data.shape[-1])
y = 3
x = 3
sp = UnivariateSpline(times, data[y, x], k=3, s=6)
However, I need the spline to have vanishing derivatives at t=0. Is there a way to enforce this constraint?
The best thing I can think of is to do a minimization with a constraint with scipy.optimize.minimize. It is pretty easy to take the derivative of a spline, so the constraint is simply. I would use a regular spline fit (UnivariateSpline) to get the knots (t), and hold the knots fixed (and degree k, of course), and vary the coefficients c. Maybe there is a way to vary the knot locations as well but I will leave that to you.
import numpy as np
from scipy.interpolate import UnivariateSpline, splev, splrep
from scipy.optimize import minimize
def guess(x, y, k, s, w=None):
"""Do an ordinary spline fit to provide knots"""
return splrep(x, y, w, k=k, s=s)
def err(c, x, y, t, k, w=None):
"""The error function to minimize"""
diff = y - splev(x, (t, c, k))
if w is None:
diff = np.einsum('...i,...i', diff, diff)
else:
diff = np.dot(diff*diff, w)
return np.abs(diff)
def spline_neumann(x, y, k=3, s=0, w=None):
t, c0, k = guess(x, y, k, s, w=w)
x0 = x[0] # point at which zero slope is required
con = {'type': 'eq',
'fun': lambda c: splev(x0, (t, c, k), der=1),
#'jac': lambda c: splev(x0, (t, c, k), der=2) # doesn't help, dunno why
}
opt = minimize(err, c0, (x, y, t, k, w), constraints=con)
copt = opt.x
return UnivariateSpline._from_tck((t, copt, k))
And then we generate some fake data that should have zero initial slope and test it:
import matplotlib.pyplot as plt
n = 10
x = np.linspace(0, 2*np.pi, n)
y0 = np.cos(x) # zero initial slope
std = 0.5
noise = np.random.normal(0, std, len(x))
y = y0 + noise
k = 3
sp0 = UnivariateSpline(x, y, k=k, s=n*std)
sp = spline_neumann(x, y, k, s=n*std)
plt.figure()
X = np.linspace(x.min(), x.max(), len(x)*10)
plt.plot(X, sp0(X), '-r', lw=1, label='guess')
plt.plot(X, sp(X), '-r', lw=2, label='spline')
plt.plot(X, sp.derivative()(X), '-g', label='slope')
plt.plot(x, y, 'ok', label='data')
plt.legend(loc='best')
plt.show()
Here is one way to do this. The basic idea is to get a spline's coefficients with splrep and then modify them before calling splev. The first few knots in the spline correspond to the lowest value in the range of x values. If the coefficients that correspond to them are set equal to each other, that completely flattens out the spline at that end.
Using the same data, times, x, y as in your example:
# set up example data
data = np.arange(1000).reshape((5, 5, 40))
times = np.arange(data.shape[-1])
y = 3
x = 3
# make 1D spline
import scipy.interpolate
from pylab import * # for plotting
knots, coefficients, degree = scipy.interpolate.splrep(times, data[y, x])
t = linspace(0,3,100)
plot( t, scipy.interpolate.splev(t, (knots, coefficients, degree)) )
# flatten out the beginning
coefficients[:2] = coefficients[0]
plot( t, scipy.interpolate.splev(t, (knots, coefficients, degree)) )
scatter( times, data[y, x] )
xlim(0,3)
ylim(720,723)
Blue: original points and spline through them. Green: modified spline with derivative=0 at the beginning. Both are zoomed in to the very beginning.
plot( t, scipy.interpolate.splev(t, (knots, coefficients, degree), der=1), 'g' )
xlim(0,3)
Call splev(..., der=1) to plot the first derivative. The derivative starts at zero and overshoots a little so the modified spline can catch up (this is inevitable).
The modified spline does not go through the first two points it is based on (it still hits all the other points exactly). It is possible to modify this by adding an extra interior control point next to the origin to get both a zero derivative and go through the original points; experiment with the knots and coefficients until it does what you want.
Your example does not work ( on python 2.7.9), so I only sketch my idea:
calculate sp
take the derivative via sp.derivative and evaluate it at the relevant times (probably the same times at which you measured your data)
Set the relevant points to zero (e.g. the value at t=0)
Calculate another spline from the derivative values.
Integrate your spline function. I guess you will have to do this numerically, but that should not be a problem. Do not forget to add a constant, to get your original function.

Discretize path with numpy array and equal distance between points

Lets say I have a path in a 2d-plane given by a parametrization, for example the archimedian spiral:
x(t) = a*φ*cos(φ), y(t) = a*φ*sin(φ)
Im looking for a way to discretize this with a numpy array,
the problem is if I use
a = 1
phi = np.arange(0, 10*np.pi, 0.1)
x = a*phi*np.cos(phi)
y = a*phi*np.sin(phi)
plt.plot(x,y, "ro")
I get a nice curve but the points don't have the same distance, for
growing φ the distance between 2 points gets larger.
Im looking for a nice and if possible fast way to do this.
It might be possible to get the exact analytical formula for your simple spiral, but I am not in the mood to do that and this might not be possible in a more general case. Instead, here is a numerical solution:
import matplotlib.pyplot as plt
import numpy as np
a = 1
phi = np.arange(0, 10*np.pi, 0.1)
x = a*phi*np.cos(phi)
y = a*phi*np.sin(phi)
dr = (np.diff(x)**2 + np.diff(y)**2)**.5 # segment lengths
r = np.zeros_like(x)
r[1:] = np.cumsum(dr) # integrate path
r_int = np.linspace(0, r.max(), 200) # regular spaced path
x_int = np.interp(r_int, r, x) # interpolate
y_int = np.interp(r_int, r, y)
plt.subplot(1,2,1)
plt.plot(x, y, 'o-')
plt.title('Original')
plt.axis([-32,32,-32,32])
plt.subplot(1,2,2)
plt.plot(x_int, y_int, 'o-')
plt.title('Interpolated')
plt.axis([-32,32,-32,32])
plt.show()
It calculates the length of all the individual segments, integrates the total path with cumsum and finally interpolates to get a regular spaced path. You might have to play with your step-size in phi, if it is too large you will see that the spiral is not a smooth curve, but instead built from straight line segments. Result:

Getting spline equation from UnivariateSpline object

I'm using UnivariateSpline to construct piecewise polynomials for some data that I have. I would then like to use these splines in other programs (either in C or FORTRAN) and so I would like to understand the equation behind the generated spline.
Here is my code:
import numpy as np
import scipy as sp
from scipy.interpolate import UnivariateSpline
import matplotlib.pyplot as plt
import bisect
data = np.loadtxt('test_C12H26.dat')
Tmid = 800.0
print "Tmid", Tmid
nmid = bisect.bisect(data[:,0],Tmid)
fig = plt.figure()
plt.plot(data[:,0], data[:,7],ls='',marker='o',markevery=20)
npts = len(data[:,0])
#print "npts", npts
w = np.ones(npts)
w[0] = 100
w[nmid] = 100
w[npts-1] = 100
spline1 = UnivariateSpline(data[:nmid,0],data[:nmid,7],s=1,w=w[:nmid])
coeffs = spline1.get_coeffs()
print coeffs
print spline1.get_knots()
print spline1.get_residual()
print coeffs[0] + coeffs[1] * (data[0,0] - data[0,0]) \
+ coeffs[2] * (data[0,0] - data[0,0])**2 \
+ coeffs[3] * (data[0,0] - data[0,0])**3, \
data[0,7]
print coeffs[0] + coeffs[1] * (data[nmid,0] - data[0,0]) \
+ coeffs[2] * (data[nmid,0] - data[0,0])**2 \
+ coeffs[3] * (data[nmid,0] - data[0,0])**3, \
data[nmid,7]
print Tmid,data[-1,0]
spline2 = UnivariateSpline(data[nmid-1:,0],data[nmid-1:,7],s=1,w=w[nmid-1:])
print spline2.get_coeffs()
print spline2.get_knots()
print spline2.get_residual()
plt.plot(data[:,0],spline1(data[:,0]))
plt.plot(data[:,0],spline2(data[:,0]))
plt.savefig('test.png')
And here is the resulting plot. I believe I have valid splines for each interval but it looks like my spline equation is not correct... I can't find any reference to what it is supposed to be in the scipy documentation. Anybody knows? Thanks !
The scipy documentation does not have anything to say about how one can take the coefficients and manually generate the spline curve. However, it is possible to figure out how to do this from the existing literature on B-splines. The following function bspleval shows how to construct the B-spline basis functions (the matrix B in the code), from which one can easily generate the spline curve by multiplying the coefficients with the highest-order basis functions and summing:
def bspleval(x, knots, coeffs, order, debug=False):
'''
Evaluate a B-spline at a set of points.
Parameters
----------
x : list or ndarray
The set of points at which to evaluate the spline.
knots : list or ndarray
The set of knots used to define the spline.
coeffs : list of ndarray
The set of spline coefficients.
order : int
The order of the spline.
Returns
-------
y : ndarray
The value of the spline at each point in x.
'''
k = order
t = knots
m = alen(t)
npts = alen(x)
B = zeros((m-1,k+1,npts))
if debug:
print('k=%i, m=%i, npts=%i' % (k, m, npts))
print('t=', t)
print('coeffs=', coeffs)
## Create the zero-order B-spline basis functions.
for i in range(m-1):
B[i,0,:] = float64(logical_and(x >= t[i], x < t[i+1]))
if (k == 0):
B[m-2,0,-1] = 1.0
## Next iteratively define the higher-order basis functions, working from lower order to higher.
for j in range(1,k+1):
for i in range(m-j-1):
if (t[i+j] - t[i] == 0.0):
first_term = 0.0
else:
first_term = ((x - t[i]) / (t[i+j] - t[i])) * B[i,j-1,:]
if (t[i+j+1] - t[i+1] == 0.0):
second_term = 0.0
else:
second_term = ((t[i+j+1] - x) / (t[i+j+1] - t[i+1])) * B[i+1,j-1,:]
B[i,j,:] = first_term + second_term
B[m-j-2,j,-1] = 1.0
if debug:
plt.figure()
for i in range(m-1):
plt.plot(x, B[i,k,:])
plt.title('B-spline basis functions')
## Evaluate the spline by multiplying the coefficients with the highest-order basis functions.
y = zeros(npts)
for i in range(m-k-1):
y += coeffs[i] * B[i,k,:]
if debug:
plt.figure()
plt.plot(x, y)
plt.title('spline curve')
plt.show()
return(y)
To give an example of how this can be used with Scipy's existing univariate spline functions, the following is an example script. This takes the input data and uses Scipy's functional and also its object-oriented approach to spline fitting. Taking the coefficients and knot points from either of the two and using these as inputs to our manually-calculated routine bspleval, we reproduce the same curve that they do. Note that the difference between the manually evaluated curve and Scipy's evaluation method is so small that it is almost certainly floating-point noise.
x = array([-273.0, -176.4, -79.8, 16.9, 113.5, 210.1, 306.8, 403.4, 500.0])
y = array([2.25927498e-53, 2.56028619e-03, 8.64512988e-01, 6.27456769e+00, 1.73894734e+01,
3.29052124e+01, 5.14612316e+01, 7.20531200e+01, 9.40718450e+01])
x_nodes = array([-273.0, -263.5, -234.8, -187.1, -120.3, -34.4, 70.6, 194.6, 337.8, 500.0])
y_nodes = array([2.25927498e-53, 3.83520726e-46, 8.46685318e-11, 6.10568083e-04, 1.82380809e-01,
2.66344008e+00, 1.18164677e+01, 3.01811501e+01, 5.78812583e+01, 9.40718450e+01])
## Now get scipy's spline fit.
k = 3
tck = splrep(x_nodes, y_nodes, k=k, s=0)
knots = tck[0]
coeffs = tck[1]
print('knot points=', knots)
print('coefficients=', coeffs)
## Now try scipy's object-oriented version. The result is exactly the same as "tck": the knots are the
## same and the coeffs are the same, they are just queried in a different way.
uspline = UnivariateSpline(x_nodes, y_nodes, s=0)
uspline_knots = uspline.get_knots()
uspline_coeffs = uspline.get_coeffs()
## Here are scipy's native spline evaluation methods. Again, "ytck" and "y_uspline" are exactly equal.
ytck = splev(x, tck)
y_uspline = uspline(x)
y_knots = uspline(knots)
## Now let's try our manually-calculated evaluation function.
y_eval = bspleval(x, knots, coeffs, k, debug=False)
plt.plot(x, ytck, label='tck')
plt.plot(x, y_uspline, label='uspline')
plt.plot(x, y_eval, label='manual')
## Next plot the knots and nodes.
plt.plot(x_nodes, y_nodes, 'ko', markersize=7, label='input nodes') ## nodes
plt.plot(knots, y_knots, 'mo', markersize=5, label='tck knots') ## knots
plt.xlim((-300.0,530.0))
plt.legend(loc='best', prop={'size':14})
plt.figure()
plt.title('difference')
plt.plot(x, ytck-y_uspline, label='tck-uspl')
plt.plot(x, ytck-y_eval, label='tck-manual')
plt.legend(loc='best', prop={'size':14})
plt.show()
The coefficients given by get_coeffs are B-spline (Basis spline) coefficients, described here: B-spline (Wikipedia)
Probably whatever other program/language you will be using has an implementation. Supply the knot locations and coefficients, and you should be all set.

Categories