Pardon this long post.
I am new to BSpline and is struggling to understand few things. I have a set of data points for which I need to construct a BSpline curve. The Datapoints are as follows:
x = [150 130 148]
y = [149 114 79]
After running the following function:
from scipy.interpolate import splprep, splev
tck, u = splprep([x, y], k =2, s = 0)
I am getting
parameters
u = [0. 0.505987 1.]
knots t = [0, 0, 0, 1, 1, 1]
coefficients c = [array([150. , 111.01850233, 148. ]), array([149. , 114.83829958, 79. ])]
k = 2 (this is the degree of curve I have used as an input for splprep).
I now need to test whether the t,c,k values generated are correct or not.
I ran the following function -
newPts = splev(u, tck)
This is giving me back the x and y data points I used in slprep.
newPts[0] = x
newPts[1] = y
Plotting newPts[0]againt newPts[1] gives me the following spline
spline evlaluation 1
The second test I ran was changing the parameters value to
u = np.linspace(0,1,5)
then ran the following
newPts = splev(u, tck)
This time my spline curve looks like the following
spline evaluation 2
From the following links computing the parameters, knot vector generation, I deduced that my parameter(u) and knots(t) are derived correctly. However, the computation of coeffcients look complicated. But from the Global Curve interpolation formula, found here, coefficient matrix, I can see the coefficient matrix is an nXn, matrix that is in my case it has to be a 3X3 matrix. But the coefficient matrix that I am getting is 2X3 that too the first array are the coefficients of x and the last array are the coefficients of y.
I really need a concrete way to prove if the coefficients derived from the splprep library are correct or not.
Really appreciate the help.
Yes, the values are correct. Let me show you how I have checked them using the Wolfram language (AKA Mathematica):
First, I take the control points (what you saved as c)
cps={{150,149},{111.01850233,114.83829958},{148,79}};
Since there are no internal knots (i.e., t=[0, 0, 0, 1, 1, 1]), your B-spline actually reduces to a Bézier curve. Let's create it:
curve:=BezierFunction[cps]
Now we can evaluate it in the parameters u and check that it interpolates your data.
In[23]:= curve[0]
Out[23]= {150.,149.}
In[24]:= curve[0.505987]
Out[24]= {130.,114.}
In[25]:= curve[1]
Out[25]= {148.,79.}
We can even plot the entire curve:
data={{150,149}, {130,114},{148,79}};
Graphics[{PointSize[Large],BezierCurve[cps], Green, Line[cps],Red,Point[cps],Blue, Point[data]}]
The curve is black, its control polygon red and the data points blue; clearly, the curve passes through all the three data points.
I have 40 points for each curve and I would like to smooth the function and estimate when the curve crosses a threshold on the y axis.
Is there a fitting function that I can easily apply this to, I can use interpolate to plot the new function but I can't figure out how to request the x value for which y = threshold.
Unfortunately the curves don't all have the same shape so I can't use scipy.optimize.curve_fit.
Thanks!
Update: Adding two curves:
Curve 1
[942.153,353.081,53.088,125.110,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688]
Curve 2
[-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938.586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523,-5878.675,-5193.578,-7193.980]
x values are
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40]
For fitting a smooth curve, you can fit Legendre polynomials using numpy.polynomial.legendre.Legendre's fit method.
# import packages we need later
import matplotlib.pyplot as plt
import numpy as np
Fitting Legendre Polynomials
Preparing data as numpy arrays:
curve1 = \
np.asarray([942.153,353.081,53.088,125.110,140.851,188.170,70.536,-122.473,-369.061,-407.945,88.734,484.334,267.762,65.831,74.010,-55.781,-260.024,-466.830,-524.511,-76.833,-36.779,-117.366,218.578,175.662,185.653,299.285,215.276,546.048,1210.132,3087.326,7052.849,13867.824,27156.939,51379.664,91908.266,148874.563,215825.031,290073.219,369567.781,437031.688])
curve2 = \
np.asarray([-39034.039,-34637.941,-24945.094,-16697.996,-9247.398,-2002.051,3409.047,3658.145,7542.242,11781.340,11227.688,10089.035,9155.883,8413.980,5289.578,3150.676,4590.023,6342.871,3294.719,580.567,-938.586,-3919.738,-5580.390,-3141.793,-2785.945,-2683.597,-4287.750,-4947.902,-7347.554,-8919.457,-6403.359,-6722.011,-8181.414,-6807.566,-7603.218,-6298.371,-6909.523,-5878.675,-5193.578,-7193.980])
xvals = \
np.asarray([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40])
Lets fit the Legendre polynomials, degree being the highest degree polynomial used, the first few is here for example.
degree=10
legendrefit_curve1 = np.polynomial.legendre.Legendre.fit(xvals, curve1, deg=degree)
legendrefit_curve2 = np.polynomial.legendre.Legendre.fit(xvals, curve2, deg=degree)
Calculate these fitted curves at evenly spaced points using the linspace method. n is the number of point pairs we want to have.
n=100
fitted_vals_curve1 = legendrefit_curve1.linspace(n=n)
fitted_vals_curve2 = legendrefit_curve2.linspace(n=n)
Let's plot the result, along with a threshold (using axvline):
plt.scatter(xvals, curve1)
plt.scatter(xvals, curve2)
plt.plot(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve2[0],fitted_vals_curve2[1],c='k')
threshold=100000
plt.axhline(y=threshold)
The curves fit beautifully.
When is the threshold crossed?
To check where the threshold is crossed in each series, you can do:
for x, y in zip(fitted_vals_curve1[0], fitted_vals_curve1[1]):
if y > threshold:
xcross_curve1 = x
break
for x, y in zip(fitted_vals_curve2[0], fitted_vals_curve2[1]):
if y > threshold:
xcross_curve2 = x
break
xcross_curve1 and xcross_curve2 will hold the x value where curve1 and curve2 crossed the threshold if they did cross the threshold; if they did not, they will be undefined.
Let's plot them to check if it works (link to axhline docs):
plt.scatter(xvals, curve1)
plt.scatter(xvals, curve2)
plt.plot(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve2[0],fitted_vals_curve2[1],c='k')
plt.axhline(y=threshold)
try: plt.axvline(x=xcross_curve1)
except NameError: print('curve1 is not passing the threshold',c='b')
try: plt.axvline(x=xcross_curve2)
except NameError: print('curve2 is not passing the threshold')
As expected, we get this plot:
(and a text output: curve2 is not passing the threshold.)
If you would like to increase accuracy of xcross_curve1 or xcross_curve2, you can increase degree and n defined above.
From Legendre to Polynomial form
We have fitted a curve, which roughly has the form:
where P_n is the nth Legendre polynomial, s(x) is some function which transforms x to the range P_n expects (some math stuff which we don't need to know now).
We want our fitted line in the form:
We'll use legendre() of scipy.special:
from scipy.special import legendre
We'll also use use np.pad (docs, good SO post).
legendredict={}
for icoef, coef in enumerate(legendrefit_curve1.coef):
legendredict[icoef]=coef*np.pad(legendre(icoef).coef,(10-icoef,0),mode='constant')
legendredict will hold keys from 0 to 10, and each value in the dict will be a list of floats. The key is refering to the degree of the Polynomial, and the list of floats are expressing what are the coefficients of x**n values within that constituent polynomial of our fit, in a backwards order.
For example:
P_4 is:
legendredict[4] is:
isarray([ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 3.29634565e+05, 3.65967884e-11,
-2.82543913e+05, 1.82983942e-11, 2.82543913e+04])
Meaning that in the sum of P_ns (f(x), above), we have q_4 lot of P_4, which is equivalent to having 2.82543913e+04 of 1s, 1.82983942e-11 of x, -2.82543913e+05 of x^2, etc, only from the P_4 component.
So if we want to know how much 1s, xs, x^2s, etc we need to form the polynomial sum, we need to add the need for 1s, xs, x^2s, etcs from all the different P_ns. This is what we do:
polycoeffs = np.sum(np.stack(list(legendredict.values())),axis=0)
Then let's form a polynomial sum:
for icoef, coef in enumerate(reversed(polycoeffs)):
print(str(coef)+'*s(x)**'+str(icoef),end='\n +')
Giving the output:
-874.1456709637822*s(x)**0
+2893.7228005540596*s(x)**1
+50415.38472217957*s(x)**2
+-6979.322584205707*s(x)**3
+-453363.49985790614*s(x)**4
+-250464.7549807652*s(x)**5
+1250129.5521521813*s(x)**6
+1267709.5031024509*s(x)**7
+-493280.0177807359*s(x)**8
+-795684.224334346*s(x)**9
+-134370.1696946264*s(x)**10
+
(We are going to ignore the last + sign, formatting is not the main point here.)
We need to calculate s(x) as well. If we are working in a Jupyter Notebook / Google Colab, only executing a cell with legendrefit_curve1 returns:
From where we can clearly see that s(x) is -1.0512820512820513+0.05128205128205128x. If we want to do it in a more programmatic way:
2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0]) is 0.05128205128205128 &
-1-2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0]) is just -1.0512820512820513
Which is true for some mathamatical reasons not much relevant here (related Q).
So we can define:
def s(input):
a=-1-2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
b=2/(legendrefit_curve1.domain[1]-legendrefit_curve1.domain[0])
return a+b*input
Also, lets define, based on the above obtained sum of polynomials of s(x):
def polyval(x):
return -874.1456709637822*s(x)**0+2893.7228005540596*s(x)**1+50415.38472217957*s(x)**2+-6979.322584205707*s(x)**3+-453363.49985790614*s(x)**4+-250464.7549807652*s(x)**5+1250129.5521521813*s(x)**6+1267709.5031024509*s(x)**7+-493280.0177807359*s(x)**8+-795684.224334346*s(x)**9+-134370.1696946264*s(x)**10
In a more programmatic way:
def polyval(x):
return sum([coef*s(x)**icoef for icoef, coef in enumerate(reversed(polycoeffs))])
Check that our polynomial indeed fits:
plt.scatter(fitted_vals_curve1[0],fitted_vals_curve1[1],c='r')
plt.plot(fitted_vals_curve1[0],[polyval(val) for val in fitted_vals_curve1[0]])
It does:
So let's print out our pure polynomial sum, with s(x) being replaced by an explicit function:
for icoef, coef in enumerate(reversed(polycoeffs)):
print(str(coef)+'*(-1.0512820512820513+0512820512820513*x)**'+str(icoef),end='\n +')
Giving the output:
-874.1456709637822*(-1.0512820512820513+0512820512820513*x)**0
+2893.7228005540596*(-1.0512820512820513+0512820512820513*x)**1
+50415.38472217957*(-1.0512820512820513+0512820512820513*x)**2
+-6979.322584205707*(-1.0512820512820513+0512820512820513*x)**3
+-453363.49985790614*(-1.0512820512820513+0512820512820513*x)**4
+-250464.7549807652*(-1.0512820512820513+0512820512820513*x)**5
+1250129.5521521813*(-1.0512820512820513+0512820512820513*x)**6
+1267709.5031024509*(-1.0512820512820513+0512820512820513*x)**7
+-493280.0177807359*(-1.0512820512820513+0512820512820513*x)**8
+-795684.224334346*(-1.0512820512820513+0512820512820513*x)**9
+-134370.1696946264*(-1.0512820512820513+0512820512820513*x)**10
+
Which can be simplified, as desired. (Ignore the last + sign.)
If want a higher (lower) degree polynomial fit, just fit higher (lower) degrees of Legendre polynomials.
Suppose 'h' is a function of x,y,z and t and it gives us a graph line (t,h) (simulated). At the same time we also have observed graph (observed values of h against t). How can I reduce the difference between observed (t,h) and simulated (t,h) graph by optimizing values of x,y and z? I want to change the simulated graph so that it imitates closer and closer to the observed graph in MATLAB/Python. In literature I have read that people have done same thing by Lavenberg-marquardt algorithm but don't know how to do it?
You are actually trying to fit the parameters x,y,z of the parametrized function h(x,y,z;t).
MATLAB
You're right that in MATLAB you should either use lsqcurvefit of the Optimization toolbox, or fit of the Curve Fitting Toolbox (I prefer the latter).
Looking at the documentation of lsqcurvefit:
x = lsqcurvefit(fun,x0,xdata,ydata);
It says in the documentation that you have a model F(x,xdata) with coefficients x and sample points xdata, and a set of measured values ydata. The function returns the least-squares parameter set x, with which your function is closest to the measured values.
Fitting algorithms usually need starting points, some implementations can choose randomly, in case of lsqcurvefit this is what x0 is for. If you have
h = #(x,y,z,t) ... %// actual function here
t_meas = ... %// actual measured times here
h_meas = ... %// actual measured data here
then in the conventions of lsqcurvefit,
fun <--> #(params,t) h(params(1),params(2),params(3),t)
x0 <--> starting guess for [x,y,z]: [x0,y0,z0]
xdata <--> t_meas
ydata <--> h_meas
Your function h(x,y,z,t) should be vectorized in t, such that for vector input in t the return value is the same size as t. Then the call to lsqcurvefit will give you the optimal set of parameters:
x = lsqcurvefit(#(params,t) h(params(1),params(2),params(3),t),[x0,y0,z0],t_meas,h_meas);
h_fit = h(x(1),x(2),x(3),t_meas); %// best guess from curve fitting
Python
In python, you'd have to use the scipy.optimize module, and something like scipy.optimize.curve_fit in particular. With the above conventions you need something along the lines of this:
import scipy.optimize as opt
popt,pcov = opt.curve_fit(lambda t,x,y,z: h(x,y,z,t), t_meas, y_meas, p0=[x0,y0,z0])
Note that the p0 starting array is optional, but all parameters will be set to 1 if it's missing. The result you need is the popt array, containing the optimal values for [x,y,z]:
x,y,z = popt
h_fit = h(x,y,z,t_meas)