I'm looking for an algorithm that smoothly interpolates points as they come in live.
For example, say I start with an array of 10 (x,y) pairs. I'm currently using scipy and a gaussian window to generate a smooth curve. However, what I can't figure out is how to update the smoothed curve in response to an 11th point generated at some future point (without completely redoing the smoothing for all 11 points).
What I'm looking for is an algorithm that follows the previous smooth curve up to the 10th (x,y) pair and also smoothly interpolates between the 10th and 11th pair (in a way that's similar to redoing the entire algorithm - so no sharp edges). Is there something out there that does what I'm looking for?
I think you could make use of a Cubic Spline. Given a list of n points (x_1, y_1)..(x_n, y_n), the algorithm finds a cubic polynomial p_k between (x_k, y_k) and (x_{k+1}, y_{k+1}) with the following constraints:
polynomials p_k and p_{k+1} pass through the point (x_{k+1}, y_{k+1});
polynomials p_k and p_{k+1} have the same first derivative at (x_{k+1}, y_{k+1});
polynomials p_k and p_{k+1} have the same second derivative at (x_{k+1}, y_{k+1}).
Also, there are some boundary conditions, defined for the first and the last polynomial. I have used natural, which forces the second derivative to zero at the ends of the curve.
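For instance, a quick way to check the natural boundary condition with scipy's CubicSpline (on toy data of my own, not from the question):
import numpy as np
from scipy.interpolate import CubicSpline

x = np.arange(6)
y = np.array([0.0, 1.0, 0.5, 2.0, 1.5, 3.0])
cs = CubicSpline(x, y, bc_type='natural')
# the second derivative (nu=2) vanishes at both ends under the natural condition
print(cs(x[0], 2), cs(x[-1], 2))   # both ~0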
The steps that you could apply are:
Interpolate the first 10 points using the Cubic Spline.
Store the spline's first derivative at the 10th point in a variable d.
Run the Cubic Spline for the 10th and 11th points, enforcing that the first derivative at the 10th point is d and the second derivative at the 11th point is zero.
From there, you can repeat the same steps for the remaining points.
This code will generate an interpolation for all points:
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import CubicSpline

height = 4
n = 20

# random sample points
x = np.arange(n)
xs = np.arange(-0.1, n + 0.1, 0.1)
y = np.random.uniform(low=0, high=height, size=n)
plt.plot(x, y, 'o', label='data')

# cubic spline through all n points
cs = CubicSpline(x, y)
plt.plot(xs, cs(xs), color='orange')
plt.ylim([0, height + 1])
Now, this code will interpolate the first 10 points, followed by another interpolation between points 10 and 11:
k = 10
delta = 0.001
plt.plot(x, y, 'o', label='data')

# spline through the first k points
xs = np.arange(x[0], x[k-1] + delta, delta)
cs = CubicSpline(x[0:k], y[0:k])
plt.plot(xs, cs(xs), color='red')

# first derivative of that spline at the k-th point
d = cs(x[k-1], 1)

# new segment between points k and k+1: clamp the left slope to d,
# keep a natural (zero second derivative) condition on the right
xs2 = np.arange(x[k-1], x[k] + delta, delta)
cs2 = CubicSpline(x[k-1:k+1], y[k-1:k+1], bc_type=((1, d), 'natural'))
plt.plot(xs2, cs2(xs2), color='blue')
plt.ylim([0, height + 1])
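To continue past point 11, the same step can be repeated in a loop, carrying the end slope of each segment into the next one; a minimal sketch reusing the variables defined above:
d = cs2(x[k], 1)  # end slope of the last fitted segment
for j in range(k, n - 1):
    # each new segment starts with the previous segment's end slope
    seg = CubicSpline(x[j:j+2], y[j:j+2], bc_type=((1, d), 'natural'))
    xs_j = np.arange(x[j], x[j+1] + delta, delta)
    plt.plot(xs_j, seg(xs_j), color='green')
    d = seg(x[j+1], 1)  # carry the slope forward to the next segment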
Below I have attached a Python script that evaluates a 5-point Bezier curve and computes the normal acceleration along that curve. The result looks as follows:
We see on the right-hand side that the normal acceleration at the start and end of the curve is nonzero. Is there any way to make sure that the normal acceleration is zero at the boundaries and also remains smooth? I have tried setting high weights for the 2nd and 4th points; this works but does not force it to absolute zero.
Note: this is essentially asking how we force the curvature to be zero at the boundaries.
import numpy as np
from scipy.special import comb
import matplotlib.pyplot as plt
# Order of bezier curve
N = 5
# Generate points
np.random.seed(2)
points = np.random.rand(N,2)
''' Calculate bezier curve '''
t = np.linspace(0,1,1000)
polys = np.asarray([comb(N-1, i) * ( (1-t)**(N-1-i) ) * t**i for i in range(0,N)])
curve = np.zeros((t.shape[0], 2))
curve[:,0] = points[:,0] @ polys  # weighted sum of the Bernstein polynomials
curve[:,1] = points[:,1] @ polys
''' Calculate normal acceleration '''
velocity = np.gradient(curve, axis=0) * t.shape[0]
acceleration = np.gradient(velocity, axis=0) * t.shape[0]
# tangential component: project acceleration onto the velocity direction
v_sq = velocity[:,[0]]**2 + velocity[:,[1]]**2
a_dot_v = np.sum(acceleration * velocity, axis=1, keepdims=True)
a_tangent = a_dot_v / v_sq * velocity
a_normal = acceleration - a_tangent
a_normal_mag = np.linalg.norm(a_normal,axis=1)
''' Plotting'''
fig,axes = plt.subplots(1,2)
axes[0].scatter(points[:,0], points[:,1])
axes[0].plot(curve[:,0], curve[:,1])
axes[0].set_xlabel('x(t)')
axes[0].set_ylabel('y(t)')
axes[1].plot(t,a_normal_mag)
axes[1].set_xlabel('t')
axes[1].set_ylabel('a_n(t)')
plt.show()
If you want the Bezier curve to have zero 2nd derivative or zero curvature at the ends, the control points can no longer be arbitrary and need to follow a certain "pattern".
For a Bezier curve, its first and 2nd derivatives at t=0 are
C'(0)=d(P1-P0)
C"(0)=d(d-1)(P2-2P1+P0)
where d = degree and P0,P1 and P2 are the first 3 control points.
If you want the curvature at t=0 to be zero, you can choose one of the following 3 conditions:
a) force the first derivative to be zero ==> P0 = P1;
b) force the 2nd derivative to be zero ==> P2 - 2P1 + P0 = 0;
c) force the first and 2nd derivatives to be parallel to each other ==> P0, P1 and P2 are collinear.
If you choose to force parallel first and 2nd derivatives at both t=0 and t=1, for a Bezier curve of 5 control points, this means that P2 will be at the intersection of Line(P0,P1) and Line(P3,P4).
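For example, here is a minimal sketch (with made-up control points, not the ones from the question) that places P2 at that intersection and checks that the cross product of C' and C'' (and hence the curvature) vanishes at both ends:
import numpy as np

# hypothetical helper: put P2 on the intersection of Line(P0,P1) and Line(P3,P4)
def p2_for_zero_end_curvature(p0, p1, p3, p4):
    d1 = p1 - p0              # direction of Line(P0, P1)
    d2 = p3 - p4              # direction of Line(P3, P4)
    # solve p0 + s*d1 = p4 + u*d2 for (s, u)
    s, _ = np.linalg.solve(np.column_stack([d1, -d2]), p4 - p0)
    return p0 + s * d1

p0, p1 = np.array([0.0, 0.0]), np.array([1.0, 0.5])   # made-up control points
p3, p4 = np.array([3.0, 2.0]), np.array([4.0, 1.0])
p2 = p2_for_zero_end_curvature(p0, p1, p3, p4)
P = np.vstack([p0, p1, p2, p3, p4])

d = 4  # degree of a Bezier curve with 5 control points
# curvature ~ |C' x C''| / |C'|^3, so the cross product should vanish at the ends
c1_start = d * (P[1] - P[0])
c2_start = d * (d - 1) * (P[2] - 2*P[1] + P[0])
c1_end = d * (P[4] - P[3])
c2_end = d * (d - 1) * (P[4] - 2*P[3] + P[2])
print(np.cross(c1_start, c2_start), np.cross(c1_end, c2_end))   # both ~0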
I need to fit a sine curve created from two sine waves and extract the parameters for the fitted curve (such as frequency, amplitude, etc).
Data example:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
x = np.arange(0, 50, 0.01)
x2 = np.arange(0, 100, 0.02)
x3 = np.arange(0, 150, 0.03)
sin1 = np.sin(x)
sin2 = np.sin(x2)
sin3 = np.sin(x3/2)
sin4 = sin1 + sin2 + sin3
plt.plot(x, sin4)
plt.show()
I used the code provided in this answer.
yy = sin4
tt = x
res = fit_sin(tt, yy)  # fit_sin as defined in the linked answer; i comes from an enclosing loop not shown here
print(str(i), "Amplitude=%(amp)s, Angular freq.=%(omega)s, phase=%(phase)s, offset=%(offset)s, Max. Cov.=%(maxcov)s" % res)
fit_values = res["fitfunc"](tt)
Frequenc_fit = res['freq']
print(i, Frequenc_fit)
Amp_fit = res['amp']
Omega_fit = res['omega']
Phase_fit = res['phase']
Offset_fit = res['offset']
maxcov_fit = res['maxcov']
plt.plot(tt, yy, "-k", label="y", linewidth=2)
plt.plot(tt,fit_values, "r-", label="y fit curve", linewidth=2)
plt.legend(loc="best")
plt.show()
I got a fitted sine curve with a single frequency and amplitude as follows:
2 Amplitude=1.0149282025860233, Angular freq.=2.01112187048004, phase=-0.2730905030152767, offset=0.003304158823058212, Max. Cov.=0.0015266032307905222
2 0.3200799868471169
Is there a method to obtain a fitted curve that matches the original one?
Supposing that the function to be fitted is
y(x) = a*sin(w*x) + b*sin(W*x)
the principle of the method below is explained in https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales
The graphical representation of the result is:
Blue curve: from data obtained by scanning the graph given in the question.
Black curve: from the calculation above.
The available data was not accurate because it comes from scanning the original figure. The deviation is mainly due to the numerical integrations used to compute the values of SS and SSSS (four successive numerical integrations are not accurate, especially with biased data).
Probably the correct result should be: w=2, W=1, a=1, b=1.
NOTE: The above method is not iterative and thus doesn't require guessed values of the parameters to start an iterative process. The approximate results for the parameters can be good initial values for an iterative non-linear regression process.
NOTE: If the values of w and W were known a priori, solving by linear regression would be very simple and much more accurate (only the last 2x2 matrix calculation shown above).
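To illustrate the second note, a minimal sketch on synthetic data (with w=2 and W=1 assumed known) that recovers a and b by ordinary linear least squares:
import numpy as np

x = np.arange(0, 50, 0.01)
y = 1.0*np.sin(2.0*x) + 1.0*np.sin(1.0*x)            # synthetic data with a = b = 1

w, W = 2.0, 1.0                                      # frequencies assumed known
A = np.column_stack([np.sin(w*x), np.sin(W*x)])      # design matrix
a, b = np.linalg.lstsq(A, y, rcond=None)[0]
print(a, b)                                          # both close to 1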
I'm trying to interpolate a set of points using the UnivariateSpline function, but I'm getting the usual large oscillations at the edges of the data set. Do you know any way to solve this?
My code looks like this:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
x=pd.read_csv('thrustlaw.txt')
x1=x['Time(sec)']
y1=x['Thrust(N)']
def splines(x1, y1):
    from scipy.interpolate import UnivariateSpline
    si = UnivariateSpline(x1, y1, s=0, k=3)
    xs = np.linspace(0, x1[len(x1)-1], 10000)
    ys = si(xs)

    plt.plot(x1, y1, 'go')
    plt.plot(xs, ys)
    plt.ylabel("Thrust[N]")
    plt.xlabel("Time[sec]")
    plt.title("Thrust curve (splines)")
    plt.grid()
    plt.show()
splines(x1,y1)
Result:
Interpolating noisy data with high-degree polynomials or unconstrained splines tends to do this. An interpolation method that doesn't have this problem is the (unique) piecewise cubic polynomial that, for each pair of successive points i, i+1:
goes through x_i, y_i
goes through x_{i+1}, y_{i+1}
at x_i, has slope (y_{i+1} - y_{i-1}) / (x_{i+1} - x_{i-1})
at x_{i+1}, has slope (y_{i+2} - y_i) / (x_{i+2} - x_i)
So the tangent at each point is parallel to the straight line segment from the previous point to the next. This forces the derivative to be "somewhat similar" to the original data, so it doesn't oscillate wildly.
If I'm not mistaken, this is a Catmull-Rom spline, a particular case of a cubic Hermite spline. Maybe this question will help you implement it in scipy, or to find another interpolation method to your liking.
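A minimal sketch of that construction with scipy's CubicHermiteSpline, on synthetic data standing in for the thrust file (the one-sided end slopes are my own choice, since the scheme above only defines interior slopes):
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import CubicHermiteSpline

x = np.linspace(0, 10, 15)                            # synthetic stand-in data
y = np.exp(-0.3*x) * np.cos(2*x) + 0.05*np.random.randn(x.size)

slopes = np.empty_like(y)
slopes[1:-1] = (y[2:] - y[:-2]) / (x[2:] - x[:-2])    # chord from previous to next point
slopes[0] = (y[1] - y[0]) / (x[1] - x[0])             # one-sided slope at the left end
slopes[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])        # one-sided slope at the right end

spline = CubicHermiteSpline(x, y, slopes)
xs = np.linspace(x[0], x[-1], 1000)
plt.plot(x, y, 'go')
plt.plot(xs, spline(xs))
plt.show()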
I have two numpy arrays, one is an array of x values and the other an array of y values and together they give me the empirical cdf. E.g.:
plt.plot(xvalues, yvalues)
plt.show()
I assume the data needs to be smoothed somehow in order to give a smooth pdf.
I would like to plot the pdf. How can I do that?
The raw data is at: http://dpaste.com/1HVK5DR .
There are two main problems: your data seems to be quite noisy, and it is not equally spaced: the points at the low end are sampled quite densely, while the points at the high end are sampled quite sparsely. This can cause numerical issues.
So first I suggest resampling the data using linear interpolation to get equally spaced samples. (Note that all the snippets appended to each other form the content of one Python file.)
import matplotlib.pyplot as plt
import numpy as np
from data import xvalues, yvalues #load data from file
print("#datapoints: {}".format(len(xvalues)))
#don't use every point if your computer is not very fast
xv = np.array(xvalues)[::5]
yv = np.array(yvalues)[::5]
#interpolate to have evenly space data
xi = np.linspace(xv.min(), xv.max(), 400)
yi = np.interp(xi, xv, yv)
Then, to smooth the data, I suggest performing an RBF regression (= using an "RBF network"). The idea is fitting a curve of the form
c(t) = sum a(i) * phi(t - x(i)) #(not part of the program)
where phi is some radial basis function. (In theory we could use any functions.) To have a very smooth result I choose a very smooth function, namely a gaussian: phi(x) = exp( - x^2/sigma^2) where sigma is yet to be determined. The x(i) are just some nodes that we can define. If we have a smooth function, we just need a few nodes. The number of nodes also determines how much computation needs to be done. The a(i) are the coefficients we can optimize to get the best fit. In this case I just use a least squares approach.
Note that IF we can write a function in the form above, it is very easy to compute its derivative; it is just
c'(t) = sum a(i) * phi'(t - x(i)) #(not part of the program)
where phi' is the derivative of phi.
Regarding sigma: It is usually a good idea to choose it as a multiple of the step between the nodes we chose. The greater we choose sigma, the smoother the resulting function gets.
#set up rbf network
rbf_nodes = xv[::50][None, :]#use a subset of the x-values as rbf nodes
print("#rbfs: {}".format(rbf_nodes.shape[1]))
#estimate width of kernels:
sigma = 20 #greater = smoother, this is the primary parameter to play with
sigma *= np.max(np.abs(rbf_nodes[0,1:]-rbf_nodes[0,:-1]))
# kernel & derivative
rbf = lambda r:1/(1+(r/sigma)**2)
Drbf = lambda r: -2*r*sigma**2/(sigma**2 + r**2)**2
#compute coefficients of rbf network
r = np.abs(xi[:, None]-rbf_nodes)
A = rbf(r)
coeffs = np.linalg.lstsq(A, yi, rcond=None)[0]
print(coeffs)
#evaluate rbf network
N=1000
xe = np.linspace(xi.min(), xi.max(), N)
Ae = rbf(xe[:, None] - rbf_nodes)
ye = Ae @ coeffs
#evaluate derivative
N=1000
xd = np.linspace(xi.min(), xi.max(), N)
Bd = Drbf(xd[:, None] - rbf_nodes)
yd = Bd @ coeffs
fig,ax = plt.subplots()
ax2 = ax.twinx()
ax.plot(xv, yv, '-')
ax.plot(xi, yi, '-')
ax.plot(xe, ye, ':')
ax2.plot(xd, yd, '-')
fig.savefig('graph.png')
print('done')
You need the derivative to go from CDF to PDF
PDF(x) = d CDF(x)/ dx
With NumPy, you could use gradient
pdf = np.gradient(yvalues, xvalues)
plt.plot(xvalues, pdf)
plt.show()
or a manual finite difference
pdf = np.diff(yvalues)/np.diff(xvalues)
l = np.asarray(xvalues[:-1])
r = np.asarray(xvalues[1:])
plt.plot((l+r)/2.0, pdf) # points in the middle of interval
plt.show()
Both approaches produce similar-looking plots.
I have a list of (x,y) values that are not uniformly spaced. Here is the archive used in this question.
I am able to interpolate between the values but what I get are not equispaced interpolating points. Here's what I do:
x_data = [0.613,0.615,0.615,...]
y_data = [5.919,5.349,5.413,...]
# Interpolate values for x and y.
t = np.linspace(0, 1, len(x_data))
t2 = np.linspace(0, 1, 100)
# One-dimensional linear interpolation.
x2 = np.interp(t2, t, x_data)
y2 = np.interp(t2, t, y_data)
# Plot x,y data.
plt.scatter(x_data, y_data, marker='o', color='k', s=40, lw=0.)
# Plot interpolated points.
plt.scatter(x2, y2, marker='o', color='r', s=10, lw=0.5)
Which results in:
As can be seen, the red dots are closer together in sections of the graph where the original points distribution is denser.
I need a way to generate the interpolated points equispaced in x, y according to a given step value (say 0.1)
As askewchan correctly points out, when I mean "equispaced in x, y" I mean that two consecutive interpolated points in the curve should be distanced from each other (euclidean straight line distance) by the same value.
I tried unubtu's answer and it works well for smooth curves but seems to break for not so smooth ones:
This happens because the code calculates the point distance in a Euclidean way instead of directly along the curve, and I need the distance along the curve to be the same between points. Can this issue be worked around somehow?
Convert your xy-data to a parametrized curve, i.e. calculate all distances between the points and generate the coordinates on the curve by cumulative summing. Then interpolate the x- and y-coordinates independently with respect to the new coordinates.
import numpy as np
from matplotlib import pyplot as plt
data = '''0.615 5.349
0.615 5.413
0.617 6.674
0.617 6.616
0.63 7.418
0.642 7.809
0.648 8.04
0.673 8.789
0.695 9.45
0.712 9.825
0.734 10.265
0.748 10.516
0.764 10.782
0.775 10.979
0.783 11.1
0.808 11.479
0.849 11.951
0.899 12.295
0.951 12.537
0.972 12.675
1.038 12.937
1.098 13.173
1.162 13.464
1.228 13.789
1.294 14.126
1.363 14.518
1.441 14.969
1.545 15.538
1.64 16.071
1.765 16.7
1.904 17.484
2.027 18.36
2.123 19.235
2.149 19.655
2.172 20.096
2.198 20.528
2.221 20.945
2.265 21.352
2.312 21.76
2.365 22.228
2.401 22.836
2.477 23.804'''
data = np.array([line.split() for line in data.split('\n')],dtype=float)
x,y = data.T
xd = np.diff(x)
yd = np.diff(y)
dist = np.sqrt(xd**2+yd**2)
u = np.cumsum(dist)
u = np.hstack([[0],u])
t = np.linspace(0,u.max(),10)
xn = np.interp(t, u, x)
yn = np.interp(t, u, y)
f = plt.figure()
ax = f.add_subplot(111)
ax.set_aspect('equal')
ax.plot(x,y,'o', alpha=0.3)
ax.plot(xn,yn,'ro', markersize=8)
ax.set_xlim(0,5)
Let's first consider a simple case. Suppose your data looked like the blue line below.
If you wanted to select equidistant points that were r distance apart,
then there would be some critical value for r where the cusp at (1,2) is the first equidistant point.
If you wanted points that were greater than this critical distance apart, then
the first equidistant point would jump from (1,2) to some place very different --
depicted by the intersection of the green arc with the blue line. The change is not gradual.
This toy case suggests that a tiny change in the parameter r can have a radical, discontinuous effect on the solution.
It also suggests that you must know the location of the ith equidistant point
before you can determine the location of the (i+1)-th equidistant point.
So it appears an iterative solution is required:
import numpy as np
import matplotlib.pyplot as plt
import math
x, y = np.genfromtxt('data', unpack=True, skip_header=1)
# find lots of points on the piecewise linear curve defined by x and y
M = 1000
t = np.linspace(0, len(x), M)
x = np.interp(t, np.arange(len(x)), x)
y = np.interp(t, np.arange(len(y)), y)
tol = 1.5
i, idx = 0, [0]
while i < len(x):
    total_dist = 0
    for j in range(i+1, len(x)):
        total_dist += math.sqrt((x[j]-x[j-1])**2 + (y[j]-y[j-1])**2)
        if total_dist > tol:
            idx.append(j)
            break
    i = j+1
xn = x[idx]
yn = y[idx]
fig, ax = plt.subplots()
ax.plot(x, y, '-')
ax.scatter(xn, yn, s=50)
ax.set_aspect('equal')
plt.show()
Note: I set the aspect ratio to 'equal' to make it more apparent that the points are equidistant.
The following script will interpolate points with an equal step in x of (x_max - x_min) / len(x) = 0.04438
import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt
data = np.loadtxt('data.txt')
x = data[:,0]
y = data[:,1]
f = interp1d(x, y)
x_new = np.linspace(np.min(x), np.max(x), x.shape[0])
y_new = f(x_new)
plt.plot(x,y,'o', x_new, y_new, '*r')
plt.show()
Expanding on the answer by @Christian K., here's how to do this for higher dimensional data with scipy.interpolate.interpn. Let's say we want to resample to 10 equally-spaced points:
import numpy as np
import scipy.interpolate
# Assuming that 'data' is rows x dims (where dims is the dimensionality)
diffs = data[1:, :] - data[:-1, :]
dist = np.linalg.norm(diffs, axis=1)
u = np.cumsum(dist)
u = np.hstack([[0], u])
t = np.linspace(0, u[-1], 10)
resampled = scipy.interpolate.interpn((u,), data, t)
It IS possible to generate equidistant points along the curve, but you need to define more precisely what you want for a real answer. The code I've written for this task is in MATLAB, but I can describe the general ideas. There are three possibilities.
First, are the points to be truly equidistant from their neighbors in terms of simple Euclidean distance? To do so would involve finding the intersection of the curve with a circle of fixed radius centered at the current point, then stepping along the curve.
Next, if you intend distance to mean distance along the curve itself and the curve is piecewise linear, the problem is again easy: just step along the curve, since distance along a line segment is easy to measure.
Finally, if you intend for the curve to be a cubic spline, again this is not incredibly difficult, but is a bit more work. Here the trick is to:
Compute the piecewise linear arclength from point to point along the curve. Call it t.
Generate a pair of cubic splines, x(t), y(t).
Differentiate x and y as functions of t. Since these are cubic segments, this is easy. The derivative functions will be piecewise quadratic.
Use an ode solver to move along the curve, integrating the differential arclength function. In MATLAB, ODE45 worked nicely.
Thus, one integrates
sqrt((x')^2 + (y')^2)
Again, in MATLAB, ODE45 can be set to identify those locations where the function crosses certain specified points.
If your MATLAB skills are up to the task, you can look at the code in interparc for more explanation. It is reasonably well commented code.
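For reference, a rough Python adaptation of the same idea (my own sketch on made-up points, not the interparc code; interparc/ODE45 uses event detection, while here the integrated arclength is simply inverted on a fine grid):
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.integrate import solve_ivp

# made-up sample points standing in for the real data
x = np.array([0.0, 1.0, 2.0, 3.5, 5.0])
y = np.array([0.0, 2.0, 1.0, 2.5, 0.5])

# 1. piecewise linear arclength from point to point, used as the parameter t
t = np.concatenate([[0], np.cumsum(np.hypot(np.diff(x), np.diff(y)))])

# 2. cubic splines x(t), y(t) and 3. their derivatives (piecewise quadratic)
sx, sy = CubicSpline(t, x), CubicSpline(t, y)
dsx, dsy = sx.derivative(), sy.derivative()

# 4. integrate the differential arclength sqrt(x'(t)^2 + y'(t)^2) with an ODE solver
speed = lambda ti, s: [np.hypot(dsx(ti), dsy(ti))]
sol = solve_ivp(speed, (t[0], t[-1]), [0.0], dense_output=True, rtol=1e-8)
total_len = sol.sol(t[-1])[0]

# invert s(t) on a fine grid to find the parameters of 20 equally spaced arclengths
tf = np.linspace(t[0], t[-1], 2000)
sf = sol.sol(tf)[0]
teq = np.interp(np.linspace(0, total_len, 20), sf, tf)
xeq, yeq = sx(teq), sy(teq)
print(np.column_stack([xeq, yeq]))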