Fitting sin curve using python - python

I have two lists:
# on x-axis:
# list1:
[70.434654, 37.147266, 8.5787086, 161.40877, -27.31284, 80.429482, -81.918106, 52.320129, 64.064552, -156.40771, 12.37026, 15.599689, 166.40984, 134.93636, 142.55002, -38.073524, -38.073524, 123.88509, -82.447571, 97.934402, 106.28793]
# on y-axis:
# list2:
[86683.961, -40564.863, 50274.41, 80570.828, 63628.465, -87284.016, 30571.402, -79985.648, -69387.891, 175398.62, -132196.5, -64803.133, -269664.06, 36493.316, 22769.121, 25648.252, 25648.252, 53444.855, 684814.69, 82679.977, 103244.58]
I need to fit a sine curve of the form a + b*sin(2*3.14*list1 + c) to the data points obtained by plotting list1 (x-axis) against list2 (y-axis), using Python.
I am not able to get a good result. Can anyone help me with suitable code and an explanation?
Thanks!
This is my graph after plotting list1 (x-axis) against list2 (y-axis):

Well, if you used lmfit, setting up and running your fit would look like this:
xdeg = [70.434654, 37.147266, 8.5787086, 161.40877, -27.31284, 80.429482, -81.918106, 52.320129, 64.064552, -156.40771, 12.37026, 15.599689, 166.40984, 134.93636, 142.55002, -38.073524, -38.073524, 123.88509, -82.447571, 97.934402, 106.28793]
y = [86683.961, -40564.863, 50274.41, 80570.828, 63628.465, -87284.016, 30571.402, -79985.648, -69387.891, 175398.62, -132196.5, -64803.133, -269664.06, 36493.316, 22769.121, 25648.252, 25648.252, 53444.855, 684814.69, 82679.977, 103244.58]
import numpy as np
from lmfit import Model
import matplotlib.pyplot as plt
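def sinefunction(x, a, b, c):
    # x is in degrees; convert to radians for np.sin
    return a + b * np.sin(x*np.pi/180.0 + c)
smodel = Model(sinefunction)
# pass arrays rather than plain lists so that x*np.pi works inside the model function
result = smodel.fit(np.array(y), x=np.array(xdeg), a=0, b=30000, c=0)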
print(result.fit_report())
plt.plot(xdeg, y, 'o', label='data')
plt.plot(xdeg, result.best_fit, '*', label='fit')
plt.legend()
plt.show()
That is assuming your X data is in degrees, and that you really intended to convert that to radians (as numpy's sin() function requires).
But that just addresses the mechanics of how to do the fit (and I'll leave the display of results up to you - it seems like you may need the practice).
The fit result is terrible, because these data are not sinusoidal. They are also not well ordered, which isn't a problem for doing the fit, but does make it harder to see what is going on.
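Since the points are not ordered in x, one way to make it easier to see what is going on is to sort both lists by x before plotting or inspecting them. A minimal sketch using the data from the question (the names order, x_sorted and y_sorted are only illustrative):

```python
import numpy as np

xdeg = np.array([70.434654, 37.147266, 8.5787086, 161.40877, -27.31284,
                 80.429482, -81.918106, 52.320129, 64.064552, -156.40771,
                 12.37026, 15.599689, 166.40984, 134.93636, 142.55002,
                 -38.073524, -38.073524, 123.88509, -82.447571, 97.934402,
                 106.28793])
y = np.array([86683.961, -40564.863, 50274.41, 80570.828, 63628.465,
              -87284.016, 30571.402, -79985.648, -69387.891, 175398.62,
              -132196.5, -64803.133, -269664.06, 36493.316, 22769.121,
              25648.252, 25648.252, 53444.855, 684814.69, 82679.977,
              103244.58])

# argsort gives the indices that put xdeg in increasing order;
# applying the same indices to y keeps every (x, y) pair together
order = np.argsort(xdeg)
x_sorted = xdeg[order]
y_sorted = y[order]

# print the pairs in x order so that outliers are easy to spot
for xi, yi in zip(x_sorted, y_sorted):
    print(f"{xi:10.3f} {yi:12.3f}")
```

Plotting x_sorted against y_sorted with a connecting line should then show directly whether anything resembling one period of a sine is present.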

Related

Sine fitting using scipy is not returning good fit

I am trying to fit a sine wave to data I collected, but the amplitude and frequency are way off. Any suggestions?
x=[0,1,3,4,5,6,7,11,12,13,14,15,16,18,20,21,22,24,26,28,29,30,31,32,35,37,38,40,41,42,43,44,45,48,49,50,51,52,53,54,55,57,58,60,61,62,63,65,66,67,68,69,70,71,73,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,112,114,115,116,117,120,122,123,124,125,128,129,130,131,132,136,137,138,139,140,143,145,147,148,150,151,153,154,155,156,160,163,164,165,167,168,169,171,172,173,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,199,201,202,203,204,205,207,209,210,215,217,218,223,224,225,226,228,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,254,255,256,257,258,259,260,261,262,263,264,265,266,267,269,270,271,272,273,274,275,276,279,280,281,282,286,287,288,292,294,295,296,298,301,302,303,310,311,312,313,315,316,317,318,319,320,321,323,324,325,326,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,348,349,350,351,352,354,356,357,358,359,362,363,365,366,367,371,372,373,374,375,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,404,405,406,407,408,411,412,413,417,418,419,420,421,422,428,429,431,435,436,437,443,444,445,446,450,451,452,453,454,455,456,459,460,461,462,464,465,466,467,468,469,470,471,472,473,474,475,476,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,495,496,497,498,499,500,501,505,506,507,512,513,514,515,516,517,519,521,522,523,524,525,526,528,529,530,531,532,533,535,537,538,539,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,559,560,561,562,563,564,566,567,568,569,570,571,572,573,574,575,577,578,579,584,585,586,588,591,592,593,594,596,598,600,601,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,642,643,644,646,647,648,650,652,653,654,655,656,660,661,662,663,665,666,667,668,669,670,671,672,673,676,677,678,679,680,681,682
,684,685,687,688,690,691,692,693,694,695,696,697,698,701,702,703,704,707,708,709,710,712,713,714,715,717,718,719,721,722,723 ]
y=[53.66666667,53.5,51,53.66666667,54.33333333,55.5,57,59,56.5,57.33333333,56,56,57,58,58.66666667,59.5,57,59,58,61.5,60,61,62.5,67,60.66666667,62.5,64.33333333,64,64,65,65,65.66666667,68,70.5,67,67.5,71.5,65,70.5,73.33333333,72,67,76,73.5,72.83333333,75,73,74,73,71,70.5,73.16666667,70,75,69,71,68.33333333,68.5,66.75,62,63.5,63,62.5,61,53.5,61.25,55,57.5,62,54.75,56.5,52.33333333,52.33333333,49,47.66666667,47.5,45,44,42.5,41,37,37.2,34.5,33.4,33.2,34,26,28.6,25,25.5,27,22.66666667,21.66666667,21.5,22.5,22,19.8,19.66666667,20,20,17,26,22.6,19,28,26.33333333,24.25,27,28.5,30,24,33,31,41,38,22,31.66666667,30,39,26,33.5,40,40.5,38,44,47,48,43,42.5,44,43,51.5,48,49.66666667,51.5,47,56,50,50,58,51,58,58.5,57.33333333,57.5,64,57,59,56.5,65.5,60,63.66666667,62,62,65.33333333,66.5,65,66,65,68,65.5,65.83333333,60,65.5,70,68,64,65.42857143,62,68,63.25,62,63.33333333,60.4,59,52.5,52.6,55.16666667,50,51,45.33333333,48.33333333,39.4,38.25,34.33333333,43.25,31.33333333,29.5,29.5,29,27,26,27,25.5,24.5,23,22,22.5,19.5,20,20,18,18.5,17,16,16,15,14,14.5,13,12.5,11.5,11,11,11,10.5,10.5,9,9,10,10,10.5,9,10,10,11,11,11,10,10.66666667,12,12,12.5,13,13,14,14,14.5,16,16,18,16.5,20.5,21.5,21,25,28,22,29,29,28.66666667,36,42,36.75,43.5,48,44.75,50.66666667,53.75,51,57.33333333,58.5,58.66666667,60,60.25,61.75,60,58.5,63,61,60.33333333,62,63,63,60,61.5,62.33333333,62.66666667,61,63.5,61,61.66666667,62,59,60,57.5,56,57,58.5,52.5,50.5,47.5,49.66666667,49.66666667,54.66666667,45.66666667,41,44,33.16666667,49,45,29.5,39.5,29,20.5,23.5,23,19,18.66666667,17,16.75,15.5,15,16,17,13.5,12.2,12,14,13,11,11.5,11.5,11,11.5,11,11.5,11.5,12,13,13,13,13,13.5,14,14,14,15,17,15,16,16,17,18,17,18,18.5,19.5,20.5,20,21.5,20,22,22,23,23,25,26,28,29,36.25,31,37.75,41.33333333,43.6,37.5,46.5,38,47.33333333,46.75,47,50.5,48.5,58,50.5,48.75,54.33333333,56,49,55.5,60,56.5,56,60,56.5,52.75,54,56,57,56,52.66666667,52,52.66666667,53,47.66666667,44,48,50.5,45,46.66666667,48,44.66666667,42.33333333,46.5,43,36.75,41,28,35,36.5
,36,37.33333333,24,30.5,29,29.33333333,32.5,20,25.5,27.5,18,33,25.75,26,19.5,16,15.5,18,13,21,12,12.25,11,5,9,10,7.5,5,7.5,4,4.5,5.666666667,3.5,6.5,5,7,7.333333333,7,9,7.5,9,9.5,11,9,10,12,11.5,12.5,13,14,13.5,13,14,15,15,16,16.5,17.5,19.66666667,19.33333333,20.5,23.66666667,25.5,28.75,31,32.66666667,33.66666667,29,32.33333333,37.6,31,39.5,49,44.14285714,41,42.16666667,45,47.66666667,50.2,52.66666667,52,50,54,53.33333333,54.66666667,54.5,54,56,54,53.5,53,53,52,51.5,51.5,52,48,53,48,50,49.5,48.5,46,45,47,49,48,44,42,42,43,43,42.5,41.5,39.5,46,36,37.5,39,39,38,43,40,38,32.5,34,35.33333333,35,35,30.5,30,31.33333333,33,26,30,27,24,30,28,25,29,25.33333333]
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from numpy import sin
def fitting(x, a, b, c):
    return a * sin(b*x + c)
constants = curve_fit(fitting, x, y)
a_fit = constants[0][0]
b_fit = constants[0][1]
c_fit = constants[0][2]
fit_y = []
for i in x:
    fit_y.append(fitting(i, a_fit, b_fit, c_fit))
plt.plot(x, fit_y, '--', color='red')
plt.scatter(x, y)
You should add an offset to your fitting function, as your data clearly has an offset of around 40.
You also need a reasonable initial estimate, passed as p0, so that the fit converges to the right solution. This will do the job:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from numpy import sin
def fitting(x, a, b, c, d):
    return a * sin(b*x + c) + d
# initial guess: half the peak-to-peak range, angular frequency, phase, mean level
p0 = [(np.max(y)-np.min(y))/2, 6/150, 0, np.mean(y)]
constants = curve_fit(fitting, x, y, p0=p0)
guess_y = [fitting(i, *p0) for i in x]
fit_y = [fitting(i, *constants[0]) for i in x]
plt.plot(x, guess_y, '--', color='green', label='guess')
plt.plot(x, fit_y, '--', color='red', label='fit')
plt.scatter(x, y, label='data')
plt.legend()
If you feel like it, you could even add a linear offset (a*x+b)
I would add this as a comment, but I can't. Fundamentally, a * sin(b*x + c) isn't going to fit your data well: the data don't have an average value of zero, so you'd have to try a*sin(b*x + c) + d, and even then I don't think you'll get a great fit. You could try:
Give it some initial values to work with using the p0 input argument (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html). It never hurts to help the minimizer out.
Try a different function. What you have here looks like a sine wave with an offset 'a0' and maybe a decaying amplitude.
But you really need to just look at your data before trying to force a function to fit it.
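The "offset plus decaying amplitude" idea could be sketched roughly as follows. This is only an illustration on synthetic data, not a fit of the data in the question: the exponential envelope, the parameter name tau and all numeric values are assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

def damped_sine(x, a, tau, b, c, d):
    # sine with amplitude a decaying on a scale tau, plus a constant offset d
    return a * np.exp(-x / tau) * np.sin(b * x + c) + d

# synthetic stand-in data (hypothetical values, fixed seed for repeatability)
rng = np.random.default_rng(0)
x = np.arange(0.0, 720.0, 2.0)
true_params = (30.0, 1500.0, 0.04, 0.0, 40.0)
y = damped_sine(x, *true_params) + rng.normal(0.0, 1.0, x.size)

# initial guess read off the data: half the peak-to-peak range, a long decay
# scale, a rough angular frequency, zero phase, and the mean level
p0 = [(y.max() - y.min()) / 2, 1000.0, 0.04, 0.0, y.mean()]
popt, pcov = curve_fit(damped_sine, x, y, p0=p0)

print("a, tau, b, c, d =", popt)
```

As with the plain sine, the frequency guess matters most; a rough value read from the spacing between peaks is usually a sensible starting point.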

Fitting with a gaussian

I have some problems when trying to fit data from a text file with a gaussian. This is my code, where cal1_p1 is an array containing 54 values.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
cal1=np.loadtxt("C:/Users/Luca/Desktop/G3/X_rays/cal1_5min_Am.txt")
cal1_p1 = [0 for a in range(854,908)]
for i in range(0,54):
    cal1_p1[i] = cal1[i+854]
# cal1_p1 takes the following values:
[5.0,6.0,5.0,11.0,4.0,9.0,14.0,13.0,13.0,14.0,12.0,13.0,16.0,20.0,15.0,23.0,23.0,33.0,43.0,46.0,41.0,40.0,49.0,57.0,62.0,61.0,53.0,65.0,64.0,42.0,72.0,55.0,47.0,43.0,38.0,46.0,37.0,39.0,27.0,18.0,20.0,20.0,18.0,10.0,11.0,8.0,10.0,6.0,8.0,8.0,6.0,10.0,6.0,4.0]
x=np.arange(854,908)
def gauss(x,sigma,m):
    return np.exp(-(x-m)**2/(2*sigma**2))/(sigma*np.sqrt(2*np.pi))
popt,pcov=curve_fit(gauss,x,cal1_p1,p0=[10,880])
plt.xlabel("Channel")
plt.ylabel("Counts")
axes=plt.gca()
axes.set_xlim([854,907])
axes.set_ylim([0,75])
plt.plot(x,cal1_p1,"k")
plt.plot(x,gauss(x,*popt),'b', label='fit')
The problem is that the resulting gaussian is really squeezed, namely it has a very low variance. Even if I try to modify the initial value p_0 the result doesn't change. What could be the problem? Thanks for any help you can provide!
The problem is that the Gaussian is normalised, while your data are not. You need to fit an amplitude as well. That is easy to fix, by adding an extra parameter a to your function:
x = np.arange(854, 908)
def gauss(x, sigma, m, a):
    return a * np.exp(-(x-m)**2/(2*sigma**2))/(sigma*np.sqrt(2*np.pi))
popt, pcov = curve_fit(gauss, x, cal1_p1, p0=[10, 880, 1])
print(popt)
plt.xlabel("Channel")
plt.ylabel("Counts")
axes=plt.gca()
axes.set_xlim([854, 907])
axes.set_ylim([0, 75])
plt.plot(x, cal1_p1, "k")
plt.plot(x, gauss(x,*popt), 'b', label='fit')
While I've given 1 as starting parameter for a, you'll find that the fitted values are actually:
[ 9.55438603 880.88681556 1398.66618699]
but the amplitude value here can probably be ignored, since I assume you'd only be interested in the relative strength, which can be measured in counts.
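With this parameterisation, a multiplies a unit-area Gaussian, so it is the area under the fitted curve rather than the peak height. If the height in counts or the width is what you want, both follow from the fitted values quoted above. A small sketch (the variable names peak_height and fwhm are just illustrative):

```python
import numpy as np

# fitted values reported above, in the order sigma, m, a
sigma, m, a = 9.55438603, 880.88681556, 1398.66618699

# height of the curve at x = m, in counts
peak_height = a / (sigma * np.sqrt(2 * np.pi))

# full width at half maximum of a Gaussian, in channels
fwhm = 2 * np.sqrt(2 * np.log(2)) * sigma

print(f"peak height: {peak_height:.1f} counts")
print(f"FWHM: {fwhm:.1f} channels")
```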

Improve Polynomial Curve Fitting using numpy/Scipy in Python Help Needed

I have two NumPy arrays: time and number of GET requests. I need to fit this data with a function so that I can make future predictions.
The data were extracted from a Cassandra table that stores the details of a log file. The time is in epoch format and the training variable is get_counts.
from cassandra.cluster import Cluster
import numpy as np
import matplotlib.pyplot as plt
from cassandra.query import panda_factory
session = Cluster(contact_points=['127.0.0.1'], port=9042).connect(keyspace='ASIA_KS')
session.row_factory = panda_factory
df = session.execute("SELECT epoch_time, get_counts FROM ASIA_TRAFFIC") \
    .sort(columns=['epoch_time','get_counts'], ascending=[1,0])
time = np.array([x[1] for x in enumerate(df['epoch_time'])])
get = np.array([x[1] for x in enumerate(df['get_counts'])])
plt.title('Trend')
plt.plot(time, get,'o')
plt.show()
The data are as follows (there are around 1000 pairs):
time -> [1391193000 1391193060 1391193120 ..., 1391279280 1391279340 1391279400 1391279460]
get -> [577 380 430 ...,250 275 365 15]
Plot image (full size here):
Can someone please help me find a function that fits this data properly? I am new to Python.
EDIT *
fit = np.polyfit(time, get, 3)
yp = np.poly1d(fit)
plt.plot(time, yp(time), 'r--', time, get, 'b.')
plt.xlabel('Time')
plt.ylabel('Number of Get requests')
plt.title('Trend')
plt.xlim([time[0]-10000, time[-1]+10000])
plt.ylim(0, 2000)
plt.show()
print(yp(time[1400]))
The fitted curve looks like this:
https://drive.google.com/file/d/0B-r3Ym7u_hsKUTF1OFVqRWpEN2M/view?usp=sharing
However, in the later part of the curve the value of y becomes negative, which is wrong. The curve must change its slope back to positive somewhere in between.
Can anyone please suggest how to go about it?
Help will be much appreciated.
You could try:
time = np.array([x[1] for x in enumerate(df['epoch_time'])])
byte = np.array([x[1] for x in enumerate(df['byte_transfer'])])
n = 3 # degree of the polynomial (3 is what the question used); step this value up
fit = np.polyfit(time, byte, n)
yp = np.poly1d(fit)
print(yp) # displays function in cx^n +- cx^n-1...c format
plt.plot(time, yp(time), '-')
plt.xlabel('Time')
plt.ylabel('Bytes Transferred')
plt.title('Trend')
plt.plot(time, byte,'o')
plt.show()
I'm new to Numpy and curve fitting as well, but this is how I've been attempting to do it.
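One thing that may help when stepping up the degree: epoch times are around 1.4e9, so their powers become enormous and np.polyfit can turn poorly conditioned. A common workaround is to centre and scale the time axis before fitting. The sketch below uses synthetic data; fit_scaled_poly is a hypothetical helper, not part of NumPy, and the cubic used to generate y is made up for the example.

```python
import numpy as np

def fit_scaled_poly(t, y, degree):
    """Fit a polynomial in a centred, scaled copy of t (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    t0, scale = t.mean(), t.std()
    coeffs = np.polyfit((t - t0) / scale, y, degree)

    def predict(t_new):
        # apply the same centring and scaling before evaluating
        u = (np.asarray(t_new, dtype=float) - t0) / scale
        return np.polyval(coeffs, u)

    return predict

# synthetic stand-in: one sample per minute, starting at an epoch-like value
t = 1391193000 + 60 * np.arange(1000)
u = (t - t.mean()) / t.std()
y = 400 + 150 * u - 80 * u**2 + 20 * u**3

predict = fit_scaled_poly(t, y, 3)
max_err = np.max(np.abs(predict(t) - y))
print("largest deviation on the fitted range:", max_err)
```

Keep in mind that a polynomial is unconstrained outside the fitted range, so values predicted far beyond the last timestamp (including negative counts) should be treated with suspicion whatever the degree.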

Why does InterpolatedUnivariateSpline return nan values

I have some data, y vs x, which I would like to interpolate at a finer resolution xx using a cubic spline.
Here is my dataset:
import numpy as np
print(np.version.version)
import scipy
print(scipy.version.version)
1.9.2
0.15.1
x = np.array([0.5372973, 0.5382103, 0.5392305, 0.5402197, 0.5412042, 0.54221, 0.543209,
0.5442277, 0.5442277, 0.5452125, 0.546217, 0.5472153, 0.5482086,
0.5492241, 0.5502117, 0.5512249, 0.5522136, 0.5532056, 0.5532056,
0.5542281, 0.5552039, 0.5562125, 0.5567836])
y = np.array([0.01, 0.03108, 0.08981, 0.18362, 0.32167, 0.50941, 0.72415, 0.90698,
0.9071, 0.97955, 0.99802, 1., 0.97863, 0.9323, 0.85344, 0.72936,
0.56413, 0.36997, 0.36957, 0.17623, 0.05922, 0.0163, 0.01, ])
xx = np.array([0.5372981, 0.5374106, 0.5375231, 0.5376356, 0.5377481, 0.5378606,
0.5379731, 0.5380856, 0.5381981, 0.5383106, 0.5384231, 0.5385356,
0.5386481, 0.5387606, 0.5388731, 0.5389856, 0.5390981, 0.5392106,
0.5393231, 0.5394356, 0.5395481, 0.5396606, 0.5397731, 0.5398856,
0.5399981, 0.5401106, 0.5402231, 0.5403356, 0.5404481, 0.5405606,
0.5406731, 0.5407856, 0.5408981, 0.5410106, 0.5411231, 0.5412356,
0.5413481, 0.5414606, 0.5415731, 0.5416856, 0.5417981, 0.5419106,
0.5420231, 0.5421356, 0.5422481, 0.5423606, 0.5424731, 0.5425856,
0.5426981, 0.5428106, 0.5429231, 0.5430356, 0.5431481, 0.5432606,
0.5433731, 0.5434856, 0.5435981, 0.5437106, 0.5438231, 0.5439356,
0.5440481, 0.5441606, 0.5442731, 0.5443856, 0.5444981, 0.5446106,
0.5447231, 0.5448356, 0.5449481, 0.5450606, 0.5451731, 0.5452856,
0.5453981, 0.5455106, 0.5456231, 0.5457356, 0.5458481, 0.5459606,
0.5460731, 0.5461856, 0.5462981, 0.5464106, 0.5465231, 0.5466356,
0.5467481, 0.5468606, 0.5469731, 0.5470856, 0.5471981, 0.5473106,
0.5474231, 0.5475356, 0.5476481, 0.5477606, 0.5478731, 0.5479856,
0.5480981, 0.5482106, 0.5483231, 0.5484356, 0.5485481, 0.5486606,
0.5487731, 0.5488856, 0.5489981, 0.5491106, 0.5492231, 0.5493356,
0.5494481, 0.5495606, 0.5496731, 0.5497856, 0.5498981, 0.5500106,
0.5501231, 0.5502356, 0.5503481, 0.5504606, 0.5505731, 0.5506856,
0.5507981, 0.5509106, 0.5510231, 0.5511356, 0.5512481, 0.5513606,
0.5514731, 0.5515856, 0.5516981, 0.5518106, 0.5519231, 0.5520356,
0.5521481, 0.5522606, 0.5523731, 0.5524856, 0.5525981, 0.5527106,
0.5528231, 0.5529356, 0.5530481, 0.5531606, 0.5532731, 0.5533856,
0.5534981, 0.5536106, 0.5537231, 0.5538356, 0.5539481, 0.5540606,
0.5541731, 0.5542856, 0.5543981, 0.5545106, 0.5546231, 0.5547356,
0.5548481, 0.5549606, 0.5550731, 0.5551856, 0.5552981, 0.5554106,
0.5555231, 0.5556356, 0.5557481, 0.5558606, 0.5559731, 0.5560856,
0.5561981, 0.5563106, 0.5564231, 0.5565356, 0.5566481, 0.5567606])
I am trying to fit using the scipy InterpolatedUnivariateSpline method, interpolated with a 3rd order spline k=3, and extrapolated as zeros ext='zeros':
import scipy.interpolate as interp
yspline = interp.InterpolatedUnivariateSpline(x,y, k=3, ext='zeros')
yvals = yspline(xx)
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y, 'ko', label='Values')
ax.plot(xx, yvals, 'b-.', lw=2, label='Spline')
plt.xlim([min(x), max(x)])
However, as you can see in this image, my Spline returns NaN values :(
Is there a reason? I am pretty sure my x values are all increasing, so I am stumped as to why this is happening. I have many other datasets I am fitting using this method, and it only fails on this specific set of data.
Any help is greatly appreciated.
Thank you for reading.
EDIT!
The solution was that I have duplicate x values, with differing y values!
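For reference, one way to deal with the duplicates is to merge them before building the spline. The sketch below averages the y values that share an x value; averaging is an assumption here, and keeping only the first or last of each pair would work just as well for the interpolation itself.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

x = np.array([0.5372973, 0.5382103, 0.5392305, 0.5402197, 0.5412042, 0.54221,
              0.543209, 0.5442277, 0.5442277, 0.5452125, 0.546217, 0.5472153,
              0.5482086, 0.5492241, 0.5502117, 0.5512249, 0.5522136, 0.5532056,
              0.5532056, 0.5542281, 0.5552039, 0.5562125, 0.5567836])
y = np.array([0.01, 0.03108, 0.08981, 0.18362, 0.32167, 0.50941, 0.72415,
              0.90698, 0.9071, 0.97955, 0.99802, 1., 0.97863, 0.9323, 0.85344,
              0.72936, 0.56413, 0.36997, 0.36957, 0.17623, 0.05922, 0.0163,
              0.01])

# np.unique returns the sorted distinct x values; inverse maps every original
# sample to its slot in x_unique, so bincount can average y within each slot
x_unique, inverse = np.unique(x, return_inverse=True)
y_mean = np.bincount(inverse, weights=y) / np.bincount(inverse)

spline = InterpolatedUnivariateSpline(x_unique, y_mean, k=3, ext='zeros')
xx = np.linspace(x_unique[0], x_unique[-1], 200)
yvals = spline(xx)

print(len(x), "samples reduced to", len(x_unique), "unique x values")
print("any NaN in result:", bool(np.isnan(yvals).any()))
```

Because np.unique also sorts its output, the same step takes care of x arrays that are not in increasing order.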
For this interpolation, you should rather use scipy.interpolate.interp1d with the argument kind='cubic' (see a related SO question).
I have yet to find a use case where InterpolatedUnivariateSpline can be used in practice (or maybe I just don't understand its purpose). With your code I get the following result:
So the interpolation works but shows extremely strong oscillations, making it unusable, which is typically the result I was getting with this interpolation method in the past. With a lower order spline (e.g. k=1) that works better, but then you lose the advantage of cubic interpolation.
I've also encountered the problem of InterpolatedUnivariateSpline returning NaN values. In my case the reason was not duplicates in the x array but that the values in x were decreasing, while the docs state that the values "must be increasing".
So, in such a case, instead of the original x and y one must supply them reversed: x[::-1] and y[::-1].

Gradient in noisy data, python

I have an energy spectrum from a cosmic ray detector. The spectrum follows an exponential curve but it will have broad (and maybe very slight) lumps in it. The data, obviously, contains an element of noise.
I'm trying to smooth out the data and then plot its gradient.
So far I've been using the SciPy spline function to smooth it and then np.gradient().
As you can see from the picture, the gradient function's method is to find the differences between each point, and it doesn't show the lumps very clearly.
I basically need a smooth gradient graph. Any help would be amazing!
I've tried 2 spline methods:
def smooth_data(y,x,factor):
    print("smoothing data by interpolation...")
    xnew=np.linspace(min(x),max(x),factor*len(x))
    smoothy=spline(x,y,xnew)
    return smoothy,xnew
def smooth2_data(y,x,factor):
    xnew=np.linspace(min(x),max(x),factor*len(x))
    f=interpolate.UnivariateSpline(x,y)
    g=interpolate.interp1d(x,y)
    return g(xnew),xnew
edit: Tried numerical differentiation:
def smooth_data(y,x,factor):
    print("smoothing data by interpolation...")
    xnew=np.linspace(min(x),max(x),factor*len(x))
    smoothy=spline(x,y,xnew)
    return smoothy,xnew
def minim(u,f,k):
    """functional to be minimised to find optimum u. f is original, u is approx"""
    integral1=abs(np.gradient(u))
    part1=simps(integral1)
    part2=simps(u)
    integral2=abs(part2-f)**2.
    part3=simps(integral2)
    F=k*part1+part3
    return F
def fit(data_x,data_y,denoising,smooth_fac):
    smy,xnew=smooth_data(data_y,data_x,smooth_fac)
    y0,xnnew=smooth_data(smy,xnew,1./smooth_fac)
    y0=list(y0)
    data_y=list(data_y)
    data_fit=fmin(minim, y0, args=(data_y,denoising), maxiter=1000, maxfun=1000)
    return data_fit
However, it just returns the same graph again!
There is an interesting method published on this: Numerical Differentiation of Noisy Data. It should give you a nice solution to your problem. More details are given in another, accompanying paper. The author also gives Matlab code that implements it; an alternative implementation in Python is also available.
If you want to pursue the interpolation with splines, I would suggest adjusting the smoothing factor s of scipy.interpolate.UnivariateSpline().
Another solution would be to smooth your function through convolution (say with a Gaussian).
The paper I linked to claims to prevent some of the artifacts that come up with the convolution approach (the spline approach might suffer from similar difficulties).
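To illustrate the smoothing-factor suggestion, here is a rough sketch on synthetic data. The exponential test signal, the noise level and the rule of thumb s = (number of points) * (noise standard deviation)^2 are assumptions for the example; for real data s usually has to be tuned by eye.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# synthetic stand-in for a noisy exponential spectrum (fixed seed)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 200)
noise_sigma = 0.01
y = np.exp(-x) + rng.normal(0.0, noise_sigma, x.size)

# s bounds the sum of squared residuals: larger s gives a smoother curve,
# s=0 would interpolate every noisy point
s = x.size * noise_sigma**2
spline = UnivariateSpline(x, y, k=3, s=s)

smooth_y = spline(x)
# derivative() returns a new spline representing the first derivative
dydx = spline.derivative()(x)

print("rms residual:", np.sqrt(np.mean((smooth_y - y) ** 2)))
print("mean gradient:", dydx.mean())
```

Differentiating the smoothing spline analytically avoids the point-to-point differences that make np.gradient so noisy on raw data.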
I won't vouch for the mathematical validity of this; it looks like the paper from LANL that EOL cited would be worth looking into. Anyway, I’ve gotten decent results using SciPy’s splines’ built-in differentiation when using splev.
%matplotlib inline
from matplotlib import pyplot as plt
import numpy as np
from scipy.interpolate import splrep, splev
x = np.arange(0,2,0.008)
data = np.polynomial.polynomial.polyval(x,[0,2,1,-2,-3,2.6,-0.4])
noise = np.random.normal(0,0.1,250)
noisy_data = data + noise
f = splrep(x,noisy_data,k=5,s=3)
#plt.plot(x, data, label="raw data")
#plt.plot(x, noise, label="noise")
plt.plot(x, noisy_data, label="noisy data")
plt.plot(x, splev(x,f), label="fitted")
plt.plot(x, splev(x,f,der=1)/10, label="1st derivative")
#plt.plot(x, splev(x,f,der=2)/100, label="2nd derivative")
plt.hlines(0,0,2)
plt.legend(loc=0)
plt.show()
You can also use scipy.signal.savgol_filter.
Result
Example
import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import savgol_filter
from random import random
# generate data
x = np.array(range(100))/10
y = np.sin(x) + np.array([random()*0.25 for _ in x])
# delta is the sample spacing, so dydx comes out in units of y per unit x
dydx = savgol_filter(y, window_length=11, polyorder=2, deriv=1, delta=x[1]-x[0])
# Plot result
plt.plot(x, y, label='Original signal')
plt.plot(x, dydx, label='1st Derivative')
plt.plot(x, np.cos(x), label='Expected 1st Derivative')
plt.legend()
plt.show()
