Fitting with a gaussian - python
I have some problems when trying to fit data from a text file with a gaussian. This is my code, where cal1_p1 is an array containing 54 values.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
cal1=np.loadtxt("C:/Users/Luca/Desktop/G3/X_rays/cal1_5min_Am.txt")
cal1_p1=[0 for a in range(854,908)]
for i in range(0,54):
    cal1_p1[i]=cal1[i+854]
# cal1_p1 takes the following values:
[5.0,6.0,5.0,11.0,4.0,9.0,14.0,13.0,13.0,14.0,12.0,13.0,16.0,20.0,15.0,23.0,23.0,33.0,43.0,46.0,41.0,40.0,49.0,57.0,62.0,61.0,53.0,65.0,64.0,42.0,72.0,55.0,47.0,43.0,38.0,46.0,37.0,39.0,27.0,18.0,20.0,20.0,18.0,10.0,11.0,8.0,10.0,6.0,8.0,8.0,6.0,10.0,6.0,4.0]
x=np.arange(854,908)
def gauss(x,sigma,m):
    return np.exp(-(x-m)**2/(2*sigma**2))/(sigma*np.sqrt(2*np.pi))
popt,pcov=curve_fit(gauss,x,cal1_p1,p0=[10,880])
plt.xlabel("Channel")
plt.ylabel("Counts")
axes=plt.gca()
axes.set_xlim([854,907])
axes.set_ylim([0,75])
plt.plot(x,cal1_p1,"k")
plt.plot(x,gauss(x,*popt),'b', label='fit')
The problem is that the resulting Gaussian is really squeezed: it has a very low variance. Even if I change the initial guess p0, the result doesn't change. What could be the problem? Thanks for any help you can provide!
The problem is that the Gaussian is normalised, while your data are not. You need to fit an amplitude as well. That is easy to fix, by adding an extra parameter a to your function:
x = np.arange(854, 908)
def gauss(x, sigma, m, a):
    return a * np.exp(-(x-m)**2/(2*sigma**2))/(sigma*np.sqrt(2*np.pi))
popt, pcov = curve_fit(gauss, x, cal1_p1, p0=[10, 880, 1])
print(popt)
plt.xlabel("Channel")
plt.ylabel("Counts")
axes=plt.gca()
axes.set_xlim([854, 907])
axes.set_ylim([0, 75])
plt.plot(x, cal1_p1, "k")
plt.plot(x, gauss(x,*popt), 'b', label='fit')
While I've given 1 as starting parameter for a, you'll find that the fitted values are actually:
[ 9.55438603 880.88681556 1398.66618699]
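The amplitude value here can probably be ignored if you only care about the position and width of the peak. Because the Gaussian is normalised to unit area, a is the area under the fitted peak (roughly the total number of counts in it) rather than its height; the height is a/(sigma*sqrt(2*pi)), about 58 counts for these values.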
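As a rough check of what the fitted a means, the sketch below refits the 54 values listed in the question and converts a into a peak height. The starting guess for a (the sum of the counts) and the names counts, perr and peak_height are choices made for this illustration, not part of the original answer.

```python
import numpy as np
from scipy.optimize import curve_fit

# The 54 counts listed in the question (channels 854..907).
counts = np.array([5, 6, 5, 11, 4, 9, 14, 13, 13, 14, 12, 13, 16, 20, 15, 23, 23, 33,
                   43, 46, 41, 40, 49, 57, 62, 61, 53, 65, 64, 42, 72, 55, 47, 43, 38, 46,
                   37, 39, 27, 18, 20, 20, 18, 10, 11, 8, 10, 6, 8, 8, 6, 10, 6, 4],
                  dtype=float)
x = np.arange(854, 908)

def gauss(x, sigma, m, a):
    # Unit-area Gaussian scaled by a, so a is the area under the peak.
    return a * np.exp(-(x - m)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Assumption: start a at the total number of counts instead of 1.
p0 = [10, 880, counts.sum()]
popt, pcov = curve_fit(gauss, x, counts, p0=p0)
sigma, m, a = popt

perr = np.sqrt(np.diag(pcov))                        # one-standard-deviation errors
peak_height = a / (abs(sigma) * np.sqrt(2 * np.pi))  # height of the fitted curve at x = m
print(popt, perr, peak_height)
```

Since the channels are one unit apart, the fitted area a should come out close to the sum of the counts under the peak, which is why that sum is a reasonable starting value.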
Related
Sine fitting using scipy is not returning good fit
trying to fit some sine wave to data i collected. But Amplitude and Frequency are way off. Any suggestions? x=[0,1,3,4,5,6,7,11,12,13,14,15,16,18,20,21,22,24,26,28,29,30,31,32,35,37,38,40,41,42,43,44,45,48,49,50,51,52,53,54,55,57,58,60,61,62,63,65,66,67,68,69,70,71,73,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,112,114,115,116,117,120,122,123,124,125,128,129,130,131,132,136,137,138,139,140,143,145,147,148,150,151,153,154,155,156,160,163,164,165,167,168,169,171,172,173,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,199,201,202,203,204,205,207,209,210,215,217,218,223,224,225,226,228,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,254,255,256,257,258,259,260,261,262,263,264,265,266,267,269,270,271,272,273,274,275,276,279,280,281,282,286,287,288,292,294,295,296,298,301,302,303,310,311,312,313,315,316,317,318,319,320,321,323,324,325,326,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,348,349,350,351,352,354,356,357,358,359,362,363,365,366,367,371,372,373,374,375,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,404,405,406,407,408,411,412,413,417,418,419,420,421,422,428,429,431,435,436,437,443,444,445,446,450,451,452,453,454,455,456,459,460,461,462,464,465,466,467,468,469,470,471,472,473,474,475,476,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,495,496,497,498,499,500,501,505,506,507,512,513,514,515,516,517,519,521,522,523,524,525,526,528,529,530,531,532,533,535,537,538,539,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,559,560,561,562,563,564,566,567,568,569,570,571,572,573,574,575,577,578,579,584,585,586,588,591,592,593,594,596,598,600,601,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,642,643,644,646,647
,648,650,652,653,654,655,656,660,661,662,663,665,666,667,668,669,670,671,672,673,676,677,678,679,680,681,682,684,685,687,688,690,691,692,693,694,695,696,697,698,701,702,703,704,707,708,709,710,712,713,714,715,717,718,719,721,722,723 ] y=[53.66666667,53.5,51,53.66666667,54.33333333,55.5,57,59,56.5,57.33333333,56,56,57,58,58.66666667,59.5,57,59,58,61.5,60,61,62.5,67,60.66666667,62.5,64.33333333,64,64,65,65,65.66666667,68,70.5,67,67.5,71.5,65,70.5,73.33333333,72,67,76,73.5,72.83333333,75,73,74,73,71,70.5,73.16666667,70,75,69,71,68.33333333,68.5,66.75,62,63.5,63,62.5,61,53.5,61.25,55,57.5,62,54.75,56.5,52.33333333,52.33333333,49,47.66666667,47.5,45,44,42.5,41,37,37.2,34.5,33.4,33.2,34,26,28.6,25,25.5,27,22.66666667,21.66666667,21.5,22.5,22,19.8,19.66666667,20,20,17,26,22.6,19,28,26.33333333,24.25,27,28.5,30,24,33,31,41,38,22,31.66666667,30,39,26,33.5,40,40.5,38,44,47,48,43,42.5,44,43,51.5,48,49.66666667,51.5,47,56,50,50,58,51,58,58.5,57.33333333,57.5,64,57,59,56.5,65.5,60,63.66666667,62,62,65.33333333,66.5,65,66,65,68,65.5,65.83333333,60,65.5,70,68,64,65.42857143,62,68,63.25,62,63.33333333,60.4,59,52.5,52.6,55.16666667,50,51,45.33333333,48.33333333,39.4,38.25,34.33333333,43.25,31.33333333,29.5,29.5,29,27,26,27,25.5,24.5,23,22,22.5,19.5,20,20,18,18.5,17,16,16,15,14,14.5,13,12.5,11.5,11,11,11,10.5,10.5,9,9,10,10,10.5,9,10,10,11,11,11,10,10.66666667,12,12,12.5,13,13,14,14,14.5,16,16,18,16.5,20.5,21.5,21,25,28,22,29,29,28.66666667,36,42,36.75,43.5,48,44.75,50.66666667,53.75,51,57.33333333,58.5,58.66666667,60,60.25,61.75,60,58.5,63,61,60.33333333,62,63,63,60,61.5,62.33333333,62.66666667,61,63.5,61,61.66666667,62,59,60,57.5,56,57,58.5,52.5,50.5,47.5,49.66666667,49.66666667,54.66666667,45.66666667,41,44,33.16666667,49,45,29.5,39.5,29,20.5,23.5,23,19,18.66666667,17,16.75,15.5,15,16,17,13.5,12.2,12,14,13,11,11.5,11.5,11,11.5,11,11.5,11.5,12,13,13,13,13,13.5,14,14,14,15,17,15,16,16,17,18,17,18,18.5,19.5,20.5,20,21.5,20,22,22,23,23,25,26,28,29,36.25,31,37.75,41.33333333,43.6,37.5,
46.5,38,47.33333333,46.75,47,50.5,48.5,58,50.5,48.75,54.33333333,56,49,55.5,60,56.5,56,60,56.5,52.75,54,56,57,56,52.66666667,52,52.66666667,53,47.66666667,44,48,50.5,45,46.66666667,48,44.66666667,42.33333333,46.5,43,36.75,41,28,35,36.5,36,37.33333333,24,30.5,29,29.33333333,32.5,20,25.5,27.5,18,33,25.75,26,19.5,16,15.5,18,13,21,12,12.25,11,5,9,10,7.5,5,7.5,4,4.5,5.666666667,3.5,6.5,5,7,7.333333333,7,9,7.5,9,9.5,11,9,10,12,11.5,12.5,13,14,13.5,13,14,15,15,16,16.5,17.5,19.66666667,19.33333333,20.5,23.66666667,25.5,28.75,31,32.66666667,33.66666667,29,32.33333333,37.6,31,39.5,49,44.14285714,41,42.16666667,45,47.66666667,50.2,52.66666667,52,50,54,53.33333333,54.66666667,54.5,54,56,54,53.5,53,53,52,51.5,51.5,52,48,53,48,50,49.5,48.5,46,45,47,49,48,44,42,42,43,43,42.5,41.5,39.5,46,36,37.5,39,39,38,43,40,38,32.5,34,35.33333333,35,35,30.5,30,31.33333333,33,26,30,27,24,30,28,25,29,25.33333333] from scipy.optimize import curve_fit from numpy import sin def fitting(x, a, b, c): return a * sin(b*x + c) constants = curve_fit(fitting, x, y) a_fit= constants[0][0] b_fit= constants[0][1] c_fit = constants[0][2] fit_y=[] for i in x: fit_y.append(fitting(i, a_fit, b_fit, c_fit)) plt.plot(x,fit_y, '--', color='red') plt.scatter(x,y)
You should add an offset to your fitting function, as your data clearly has an offset around 40. You also need a proper initial estimate p0 so that the fit converges to the right solution. This will do the job:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from numpy import sin
def fitting(x, a, b, c, d):
    return a * sin(b*x + c) + d
# initial guess: half the data range, 6/150 (roughly 2*pi/150, i.e. a period of about 150), zero phase, mean of the data
p0 = [(np.max(y)-np.min(y))/2, 6/150, 0, np.mean(y)]
constants = curve_fit(fitting, x, y, p0=p0)
guess_y = [fitting(i, *p0) for i in x]
fit_y = [fitting(i, *constants[0]) for i in x]
plt.plot(x, guess_y, '--', color='green', label='guess')
plt.plot(x, fit_y, '--', color='red', label='fit')
plt.scatter(x, y, label='data')
plt.legend()
If you feel like it, you could even add a linear offset (a*x+b).
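If reading the period off a plot is inconvenient, one way to get a starting frequency is from the strongest peak of an FFT. The following is only a sketch on synthetic, evenly spaced data with made-up parameter values; the x values in the question have gaps, so they would need resampling onto a regular grid before this approach applies. The names n_fft and b0 are introduced here for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def fitting(x, a, b, c, d):
    return a * np.sin(b * x + c) + d

# Synthetic stand-in for the data (assumed values, evenly spaced x).
rng = np.random.default_rng(0)
x = np.arange(724.0)
y = fitting(x, 25.0, 0.04, 0.5, 40.0) + rng.normal(0.0, 3.0, x.size)

# Estimate the angular frequency from the strongest peak of a zero-padded FFT.
n_fft = 8192
spectrum = np.abs(np.fft.rfft(y - y.mean(), n=n_fft))
freqs = np.fft.rfftfreq(n_fft, d=x[1] - x[0])        # cycles per unit x
b0 = 2 * np.pi * freqs[np.argmax(spectrum[1:]) + 1]  # skip the zero-frequency bin

p0 = [(y.max() - y.min()) / 2, b0, 0.0, y.mean()]
popt, pcov = curve_fit(fitting, x, y, p0=p0)
print("guess:", p0)
print("fit:  ", popt)
```

Zero-padding (n_fft larger than the number of samples) only interpolates the spectrum; it is used here so the peak position is not limited to the coarse frequency grid of a record that holds just a few periods.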
Fundamentally, a * sin(b*x + c) isn't going to fit your data well: the data don't have an average value of zero, so you'd have to try a*sin(b*x + c) + d, and even then I don't think you'll get a great fit. You could try:
1. Give it some initial values to work with using the p0 input argument (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html). It never hurts to help the minimizer out.
2. Try a different function. What you have here looks like a sine wave with an offset 'a0' and maybe a decaying amplitude.
But you really need to look at your data before trying to force a function to fit it.
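A sine with an offset and a decaying amplitude could be written as in the sketch below. The exponential form of the decay, the function name damped_sine, and all numeric values are assumptions made for illustration on synthetic data; the answer does not say which decay law would suit the real data.

```python
import numpy as np
from scipy.optimize import curve_fit

def damped_sine(x, a, k, b, c, d):
    # Sine with offset d whose amplitude decays as exp(-k*x); k = 0 gives a plain sine.
    return a * np.exp(-k * x) * np.sin(b * x + c) + d

# Synthetic data with assumed parameter values, for illustration only.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 700.0, 700)
y = damped_sine(x, 30.0, 1 / 1500, 0.04, 0.3, 40.0) + rng.normal(0.0, 2.0, x.size)

# Initial guess: half the data range, no decay, period of about 150, zero phase, mean.
p0 = [(y.max() - y.min()) / 2, 0.0, 2 * np.pi / 150, 0.0, y.mean()]
popt, pcov = curve_fit(damped_sine, x, y, p0=p0)
a, k, b, c, d = popt
print(popt)
```

Starting the decay rate at zero means the first iteration begins from the plain offset sine, so the extra parameter only moves away from zero if the data call for it.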
SciPy curve_fit displays straight line and is not fit to the data
I'm trying to fit a set of data to a CDF exponential function. However, I'm not sure what is going wrong either in my code or in the initial parameter guess, but it only creates a straight line. Data was imported from a CSV file.
# Plot data
plt.figure(1, dpi=120)
plt.title("Cell A3")
plt.xlabel(rawdata[0][0])
plt.ylabel(rawdata[0][1])
plt.scatter(xdata, ydata, label="A3 Cell 1")
# Define function
def func(t, lam):
    return 1 - (np.exp(-lam * t))
funcdata = func(xdata, 1.17)
plt.plot(xdata, funcdata, label="Model")
plt.legend()
# Curve-fit data to model
popt, pcov = curve_fit(func, xdata, ydata, p0=(-0.64))
perr = np.sqrt(np.diag(pcov))
[Image: the graph I get with the initial data and the straight line that curve_fit gives]
You cannot correctly fit a simple exponential function of the kind y = (1 - np.exp(-lam * t)) * scale to the data, because the shape of this function is far from the shape of the data in the range 0 < t < 5. It is better to consider a function of the logistic kind.
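A minimal sketch of what a logistic fit could look like follows. The original data is not shown, so the data here is synthetic, and the parameterisation (scale, k, t0) and every numeric value are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, scale, k, t0):
    # S-shaped curve rising from 0 to scale, with midpoint t0 and steepness k.
    return scale / (1.0 + np.exp(-k * (t - t0)))

# Synthetic data with assumed parameter values, for illustration only.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 25.0, 60)
y = logistic(t, 2000.0, 0.6, 8.0) + rng.normal(0.0, 20.0, t.size)

# Initial guess read off the data: plateau height, unit steepness, half-height time.
p0 = [y.max(), 1.0, t[np.argmin(np.abs(y - y.max() / 2))]]
popt, pcov = curve_fit(logistic, t, y, p0=p0)
perr = np.sqrt(np.diag(pcov))   # one-standard-deviation parameter errors
print(popt, perr)
```

Unlike 1 - exp(-lam*t), which rises fastest at t = 0, the logistic starts out flat, which is the kind of shape difference at small t that the answer points to.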
Think about your data and your function. ydata takes quite large values. What is the maximum value of
def func(t,lam):
    return 1 - (np.exp(-lam * t))
For positive t, the function approaches 1 as lam approaches infinity and never exceeds it. How can a function with a maximum value of 1 fit data in the 1000s? If you want to be able to scale beyond 1, you need more parameters in your function. Try
def func(t,lam,scale):
    return ( 1 - (np.exp(-lam * t)) ) * scale
and see if scipy is able to better fit the data.
EDIT: I managed to get that to work. However, you aren't even plotting the optimum parameters. To do that, see my code with simulated xdata and ydata:
# Plot data
import numpy as np
from scipy.optimize import curve_fit
from matplotlib import pyplot as plt
def func(t,lam,scale):
    return ( 1 - (np.exp(-lam * t)) ) * scale
xdata = np.arange(25.)
ydata = func(xdata, 1.12, 2000.)
plt.figure(1, dpi=120)
plt.title("Cell A3")
# placeholder axis labels: the question takes them from rawdata, which is not defined for simulated data
plt.xlabel("t")
plt.ylabel("y")
plt.scatter(xdata, ydata, label="A3 Cell 1")
# Curve-fit data to model
popt, pcov = curve_fit(func, xdata, ydata, p0=[0.5, 1000.1])
plt.plot(np.arange(25), func(np.arange(25), *popt), label="Model")
plt.legend()
Output: [plot of the simulated data with the fitted model]
Plotting Fourier Transform of Gaussian function with python, but the result was wrong
I have been thinking about it for a long time, but I can't find out what the problem is. I hope you can help me. Thank you.
F(s) is the Gaussian function:
F(s) = 1/(√(2π) s) · e^(-(w-μ)^2/(2s^2))
Code:
import numpy as np
from matplotlib import pyplot as plt
from math import pi
from scipy.fft import fft
def F_S(w, mu, sig):
    return (np.exp(-np.power(w-mu, 2)/(2 * np.power(sig, 2))))/(np.power(2*pi, 0.5)*sig)
w=np.linspace(-5,5,100)
plt.plot(w, np.real(np.fft.fft(F_S(w, 0, 1))))
plt.show()
Result: [image of the resulting plot]
As was mentioned before, you want the absolute value, not the real part. A minimal example, showing the re/im and abs/phase spectra:
import numpy as np
import matplotlib.pyplot as p
%matplotlib inline
n = 1001  # add 1 to keep the interval a round number when using linspace
t = np.linspace(-5, 5, n)  # presumed to be time
dt = t[1]-t[0]  # time resolution
print(f'sampling every {dt:.3f} sec , so at {1/dt:.1f} Sa/sec, max. freq will be {1/2/dt:.1f} Hz')
y = np.exp(-(t**2)/0.01)  # signal in time
fr = np.fft.fftshift(np.fft.fftfreq(n, dt))  # shift helps with sorting the frequencies for better plotting
ft = np.fft.fftshift(np.fft.fft(y))  # fftshift only necessary for plotting in sequence
p.figure(figsize=(20,12))
p.subplot(231)
p.plot(t, y, '.-')
p.xlabel('time (secs)')
p.title('signal in time')
p.subplot(232)
p.plot(fr, np.abs(ft), '.-', lw=0.3)
p.xlabel('freq (Hz)')
p.title('spectrum, abs')
p.subplot(233)
p.plot(fr, np.real(ft), '.-', lw=0.3)
p.xlabel('freq (Hz)')
p.title('spectrum, real')
p.subplot(235)
p.plot(fr, np.angle(ft), '.-', lw=0.3)
p.xlabel('freq (Hz)')
p.title('spectrum, phase')
p.subplot(236)
p.plot(fr, np.imag(ft), '.-', lw=0.3)
p.xlabel('freq (Hz)')
p.title('spectrum, imag')
You have to change the x axis from the time scale to the frequency scale.
When you take the FFT of a real signal you get a symmetric result: the negative-frequency half mirrors the positive-frequency half. Usually you will only look at the positive side. You should also take care with the sample rate: the FFT transforms time-domain input to the frequency domain, so the sample spacing of the input matters. Pass the time step to np.fft.fftfreq(n, d=timestep) to get the frequency axis for your sample rate. If you simply want to take the FFT of a normal-distribution signal, here is another question about it, with some good explanations of why you are getting this behavior: Fourier transform of a Gaussian is not a Gaussian, but thats wrong! - Python
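For a real input, looking only at the positive side can be done directly with np.fft.rfft and np.fft.rfftfreq. A small sketch, with an assumed time step and a 5 Hz test tone standing in for real data:

```python
import numpy as np

n = 1000
timestep = 0.01                         # assumed sample spacing in seconds
t = np.arange(n) * timestep
signal = np.sin(2 * np.pi * 5.0 * t)    # 5 Hz test tone (assumed)

# rfft returns only the non-negative-frequency half of the spectrum of a real signal.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(n, d=timestep)  # matching frequency axis in Hz

peak_freq = freqs[np.argmax(np.abs(spectrum))]
print(peak_freq)
```

Because the frequency axis is built from the time step, the peak lands at the tone's frequency in Hz rather than at a bare bin index.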
There are two mistakes in your code:
1. Don't take the real part; take the absolute value when plotting.
2. From the docs: If A = fft(a, n), then A[0] contains the zero-frequency term (the mean of the signal), which is always purely real for real inputs. Then A[1:n/2] contains the positive-frequency terms, and A[n/2+1:] contains the negative-frequency terms, in order of decreasingly negative frequency. You can rearrange the elements with np.fft.fftshift.
The working code:
import numpy as np
from matplotlib import pyplot as plt
from math import pi
from scipy.fftpack import fftshift
def F_S(w, mu, sig):
    return (np.exp(-np.power(w-mu, 2)/(2 * np.power(sig, 2))))/(np.power(2*pi, 0.5)*sig)
w=np.linspace(-5,5,100)
plt.plot(w, fftshift(np.abs(np.fft.fft(F_S(w, 0, 1)))))
plt.show()
Also, you might want to consider scaling the x axis too.
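One possible way to scale both axes is sketched below: the frequency axis comes from np.fft.fftfreq with the sample spacing, and the magnitude is multiplied by the spacing so it approximates the continuous transform. The comparison uses the convention F(f) = ∫ g(t) exp(-2πi f t) dt, under which a unit-area Gaussian with sig = 1 has magnitude exp(-2π² f²). The finer grid (n = 1001) and the names dw, freqs and expected are choices made here, not part of the answer.

```python
import numpy as np

def F_S(w, mu, sig):
    return np.exp(-(w - mu)**2 / (2 * sig**2)) / (np.sqrt(2 * np.pi) * sig)

n = 1001
w = np.linspace(-5, 5, n)
dw = w[1] - w[0]                                   # sample spacing

# Frequency axis matching the samples, reordered so it increases monotonically.
freqs = np.fft.fftshift(np.fft.fftfreq(n, d=dw))
# Multiplying by dw turns the DFT sum into an approximation of the integral.
magnitude = np.fft.fftshift(np.abs(np.fft.fft(F_S(w, 0, 1)))) * dw

# Continuous Fourier transform magnitude of a unit-area Gaussian with sig = 1.
expected = np.exp(-2 * np.pi**2 * freqs**2)
max_error = np.max(np.abs(magnitude - expected))
print(max_error)
```

Plotting magnitude against freqs then shows a Gaussian centred on zero frequency, and taking the absolute value removes the alternating sign that comes from the samples starting at w = -5 rather than at 0.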
Fitting sin curve using python
I have two lists:
# on x-axis:
# list1:
[70.434654, 37.147266, 8.5787086, 161.40877, -27.31284, 80.429482, -81.918106, 52.320129, 64.064552, -156.40771, 12.37026, 15.599689, 166.40984, 134.93636, 142.55002, -38.073524, -38.073524, 123.88509, -82.447571, 97.934402, 106.28793]
# on y-axis:
# list2:
[86683.961, -40564.863, 50274.41, 80570.828, 63628.465, -87284.016, 30571.402, -79985.648, -69387.891, 175398.62, -132196.5, -64803.133, -269664.06, 36493.316, 22769.121, 25648.252, 25648.252, 53444.855, 684814.69, 82679.977, 103244.58]
I need to fit a sine curve a + b*sin(2*3.14*list1 + c) to the data points obtained by plotting list1 (on the x-axis) against list2 (on the y-axis) using Python. I am not able to get any good result. Can anyone help me with suitable code and an explanation? Thanks!
[Image: my graph after plotting list1 (x-axis) against list2 (y-axis)]
Well, if you used lmfit, setting up and running your fit would look like this:
xdeg = [70.434654, 37.147266, 8.5787086, 161.40877, -27.31284, 80.429482, -81.918106, 52.320129, 64.064552, -156.40771, 12.37026, 15.599689, 166.40984, 134.93636, 142.55002, -38.073524, -38.073524, 123.88509, -82.447571, 97.934402, 106.28793]
y = [86683.961, -40564.863, 50274.41, 80570.828, 63628.465, -87284.016, 30571.402, -79985.648, -69387.891, 175398.62, -132196.5, -64803.133, -269664.06, 36493.316, 22769.121, 25648.252, 25648.252, 53444.855, 684814.69, 82679.977, 103244.58]
import numpy as np
from lmfit import Model
import matplotlib.pyplot as plt
def sinefunction(x, a, b, c):
    return a + b * np.sin(x*np.pi/180.0 + c)
smodel = Model(sinefunction)
result = smodel.fit(y, x=xdeg, a=0, b=30000, c=0)
print(result.fit_report())
plt.plot(xdeg, y, 'o', label='data')
plt.plot(xdeg, result.best_fit, '*', label='fit')
plt.legend()
plt.show()
That is assuming your X data is in degrees, and that you really intended to convert that to radians (as numpy's sin() function requires).
But that just addresses the mechanics of how to do the fit (and I'll leave the display of results up to you - it seems like you may need the practice). The fit result is terrible, because these data are not sinusoidal. They are also not well ordered, which isn't a problem for doing the fit, but does make it harder to see what is going on.
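To make the data easier to inspect, the points can be sorted by x before plotting. A short sketch using the two lists from the question; np.argsort keeps each y paired with its x, and the conversion to radians assumes, as the answer does, that x is in degrees.

```python
import numpy as np

xdeg = np.array([70.434654, 37.147266, 8.5787086, 161.40877, -27.31284, 80.429482,
                 -81.918106, 52.320129, 64.064552, -156.40771, 12.37026, 15.599689,
                 166.40984, 134.93636, 142.55002, -38.073524, -38.073524, 123.88509,
                 -82.447571, 97.934402, 106.28793])
y = np.array([86683.961, -40564.863, 50274.41, 80570.828, 63628.465, -87284.016,
              30571.402, -79985.648, -69387.891, 175398.62, -132196.5, -64803.133,
              -269664.06, 36493.316, 22769.121, 25648.252, 25648.252, 53444.855,
              684814.69, 82679.977, 103244.58])

order = np.argsort(xdeg)        # indices that put x in ascending order
x_sorted = xdeg[order]
y_sorted = y[order]             # reorder y the same way so the pairs stay together
x_rad = np.deg2rad(x_sorted)    # same as x_sorted * np.pi / 180, assuming degrees

# Locate the point with the largest |y|, which is much larger than any other value.
i_max = np.argmax(np.abs(y_sorted))
print(x_sorted[i_max], y_sorted[i_max])
```

Plotting y_sorted against x_sorted with a connecting line shows the sequence of points in x order, which makes it easier to judge whether anything like a sine is present.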
Gradient in noisy data, python
I have an energy spectrum from a cosmic ray detector. The spectrum follows an exponential curve but it will have broad (and maybe very slight) lumps in it. The data, obviously, contains an element of noise. I'm trying to smooth out the data and then plot its gradient. So far I've been using the scipy spline function to smooth it and then np.gradient(). As you can see from the picture, the gradient function's method is to find the differences between each point, and it doesn't show the lumps very clearly. I basically need a smooth gradient graph. Any help would be amazing!
I've tried 2 spline methods:
def smooth_data(y,x,factor):
    print("smoothing data by interpolation...")
    xnew=np.linspace(min(x),max(x),factor*len(x))
    smoothy=spline(x,y,xnew)
    return smoothy,xnew
def smooth2_data(y,x,factor):
    xnew=np.linspace(min(x),max(x),factor*len(x))
    f=interpolate.UnivariateSpline(x,y)
    g=interpolate.interp1d(x,y)
    return g(xnew),xnew
edit: Tried numerical differentiation:
def smooth_data(y,x,factor):
    print("smoothing data by interpolation...")
    xnew=np.linspace(min(x),max(x),factor*len(x))
    smoothy=spline(x,y,xnew)
    return smoothy,xnew
def minim(u,f,k):
    """functional to be minimised to find optimum u. f is original, u is approx"""
    integral1=abs(np.gradient(u))
    part1=simps(integral1)
    part2=simps(u)
    integral2=abs(part2-f)**2.
    part3=simps(integral2)
    F=k*part1+part3
    return F
def fit(data_x,data_y,denoising,smooth_fac):
    smy,xnew=smooth_data(data_y,data_x,smooth_fac)
    y0,xnnew=smooth_data(smy,xnew,1./smooth_fac)
    y0=list(y0)
    data_y=list(data_y)
    data_fit=fmin(minim, y0, args=(data_y,denoising), maxiter=1000, maxfun=1000)
    return data_fit
However, it just returns the same graph again!
There is an interesting method published on this: Numerical Differentiation of Noisy Data. It should give you a nice solution to your problem. More details are given in another, accompanying paper. The author also gives Matlab code that implements it; an alternative implementation in Python is also available. If you want to pursue the spline interpolation method, I would suggest adjusting the smoothing factor s of scipy.interpolate.UnivariateSpline(). Another solution would be to smooth your function through convolution (say with a Gaussian). The paper I linked to claims to prevent some of the artifacts that come up with the convolution approach (the spline approach might suffer from similar difficulties).
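The two simpler options mentioned above (a smoothing spline with an explicit s, and Gaussian convolution) can be sketched as follows. The data is a synthetic exponential with an assumed noise level, and the choices s = N * noise variance and a Gaussian width of 8 samples are starting points picked for this illustration, not recommendations from the answer.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.ndimage import gaussian_filter1d

# Synthetic exponential spectrum with an assumed noise level, for illustration only.
rng = np.random.default_rng(3)
x = np.linspace(0.0, 5.0, 200)
noise_sigma = 1.0
y = 100.0 * np.exp(-x) + rng.normal(0.0, noise_sigma, x.size)
true_grad = -100.0 * np.exp(-x)

# Option 1: smoothing spline, then differentiate the spline analytically.
spline = UnivariateSpline(x, y, k=3, s=x.size * noise_sigma**2)
grad_spline = spline.derivative()(x)

# Option 2: smooth by Gaussian convolution (sigma in samples), then take the gradient.
y_smooth = gaussian_filter1d(y, sigma=8, mode="nearest")
grad_gauss = np.gradient(y_smooth, x)

grad_raw = np.gradient(y, x)    # gradient of the unsmoothed data, for comparison

def rms(err):
    return np.sqrt(np.mean(err**2))

print(rms(grad_raw - true_grad), rms(grad_spline - true_grad), rms(grad_gauss - true_grad))
```

Both smoothed gradients distort the ends of the range, where there is data on only one side, so the edges deserve a separate look before reading lumps into them.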
I won't vouch for the mathematical validity of this; it looks like the paper from LANL that EOL cited would be worth looking into. Anyway, I’ve gotten decent results using SciPy’s splines’ built-in differentiation when using splev.
%matplotlib inline
from matplotlib import pyplot as plt
import numpy as np
from scipy.interpolate import splrep, splev
x = np.arange(0,2,0.008)
data = np.polynomial.polynomial.polyval(x,[0,2,1,-2,-3,2.6,-0.4])
noise = np.random.normal(0,0.1,250)
noisy_data = data + noise
f = splrep(x,noisy_data,k=5,s=3)
#plt.plot(x, data, label="raw data")
#plt.plot(x, noise, label="noise")
plt.plot(x, noisy_data, label="noisy data")
plt.plot(x, splev(x,f), label="fitted")
plt.plot(x, splev(x,f,der=1)/10, label="1st derivative")
#plt.plot(x, splev(x,f,der=2)/100, label="2nd derivative")
plt.hlines(0,0,2)
plt.legend(loc=0)
plt.show()
You can also use scipy.signal.savgol_filter.
Example:
import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import savgol_filter
from random import random
# generate data
x = np.array(range(100))/10
y = np.sin(x) + np.array([random()*0.25 for _ in x])
dydx = savgol_filter(y, window_length=11, polyorder=2, deriv=1)
# Plot result
plt.plot(x, y, label='Original signal')
# dydx is per sample; the samples are 0.1 apart, so multiply by 10 to get dy/dx
plt.plot(x, dydx*10, label='1st Derivative')
plt.plot(x, np.cos(x), label='Expected 1st Derivative')
plt.legend()
plt.show()
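The manual rescaling by the sample spacing can be avoided with the delta argument of savgol_filter. A small sketch on synthetic data, with an assumed noise level and the same window and polynomial order as above:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(4)
dx = 0.1
x = np.arange(100) * dx
y = np.sin(x) + rng.normal(0.0, 0.02, x.size)   # assumed noise level

# delta=dx makes the filter return dy/dx directly instead of dy per sample.
dydx = savgol_filter(y, window_length=11, polyorder=2, deriv=1, delta=dx)

# Compare with the exact derivative away from the edges, where the default
# mode ('interp') evaluates a polynomial fitted to the outermost window.
interior = slice(5, -5)
max_err = np.max(np.abs(dydx[interior] - np.cos(x[interior])))
print(max_err)
```

A longer window smooths more but also flattens narrow features, so window_length is the main knob to adjust when looking for broad lumps.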