I have been trying to fit a function (given in the code below as concave_func) to data points in Python, with little to no success. The function has 7 parameters (C_1, C_2, alpha_one, alpha_two, I_x, nu_t, T_e) that I have to estimate, but I have only 6 data points. I have tried two methods to fit the curve and estimate the parameters:
1). scipy.optimize.minimize
2). scipy.optimize.curve_fit.
However, I'm not obtaining the desired results, i.e. the curve does not fit the data points.
I have attached my code below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
frequency = np.array([22,45,150,408,1420,23000]) #x_values
b_temp = [2.55080863e+04, 4.90777800e+03, 2.28984753e+02, 2.10842949e+01, 3.58631166e+00, 5.68716056e-04] #y_values
#Defining the function that I want to fit
def concave_func(x, C_1, C_2, alpha_one, alpha_two, I_x, nu_t, T_e):
    one = x**(-alpha_one)
    two = (C_2/C_1)*(x**(-alpha_two))
    three = I_x*(x**-2.1)
    expo = np.exp(-1*((nu_t/x)**2.1))
    eqn_one = C_1*(one + two + three)*expo
    eqn_two = T_e*(1 - expo)
    return eqn_one + eqn_two
#Defining chi_square function
def chisq(params, xobs, yobs):
    ynew = concave_func(xobs, *params)
    #yerr = np.sum((ynew- yobs)**2)
    yerr = np.sum(((yobs- ynew)/ynew)**2)
    print(yerr)
    return yerr
result = minimize(chisq, [1,2,2,2,1,1e6,8000], args = (frequency,b_temp), method = 'Nelder-Mead', options = {'disp' : True, 'maxiter': 10000})
x = np.linspace(-300,24000,1000)
plt.yscale("log")
plt.xscale("log")
plt.plot(x,concave_func(x, *result.x))
print(result.x)
print(result)
plt.plot(frequency, b_temp, 'r*' )
plt.xlabel("log Frequency[MHz]")
plt.ylabel("log Temp[K]")
plt.title('log Temperature vs log Frequency')
plt.grid()
plt.savefig('the_plot_2060.png')
I have attached the plot that I obtained below.
The plot clearly does not fit the data, so something is definitely wrong. I would also like the parameters alpha_one and alpha_two to be constrained to lie between 2 and 3, and T_e must not exceed 10,000. Any thoughts?
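One way to impose limits like these is scipy.optimize.least_squares, which accepts box bounds. The sketch below is only an illustration under assumptions that are not in the question: the residuals are taken in log space because the temperatures span about eight orders of magnitude, the starting values are rough guesses, and every bound other than 2 <= alpha <= 3 and T_e <= 10,000 (including the hypothetical upper limit on nu_t) is a placeholder. With 7 parameters and only 6 points the problem is underdetermined, so the result depends strongly on the starting guess and should not be read as a unique solution.

```python
import numpy as np
from scipy.optimize import least_squares

frequency = np.array([22, 45, 150, 408, 1420, 23000], dtype=float)
b_temp = np.array([2.55080863e+04, 4.90777800e+03, 2.28984753e+02,
                   2.10842949e+01, 3.58631166e+00, 5.68716056e-04])

def concave_func(x, C_1, C_2, alpha_one, alpha_two, I_x, nu_t, T_e):
    one = x**(-alpha_one)
    two = (C_2 / C_1) * x**(-alpha_two)
    three = I_x * x**(-2.1)
    expo = np.exp(-(nu_t / x)**2.1)
    return C_1 * (one + two + three) * expo + T_e * (1 - expo)

def log_residuals(params, x, y):
    # Compare model and data in log space so each point carries similar weight.
    return np.log(concave_func(x, *params)) - np.log(y)

# Parameter order: C_1, C_2, alpha_one, alpha_two, I_x, nu_t, T_e.
# Only the alpha and T_e limits come from the question; the rest are assumed.
lower = [1e-3, 0.0, 2.0, 2.0, 0.0, 1e-3, 1e-3]
upper = [np.inf, np.inf, 3.0, 3.0, np.inf, 1e4, 1e4]
start = [1e7, 1e7, 2.5, 2.5, 1.0, 1.0, 8000.0]  # assumed starting point

initial_cost = 0.5 * np.sum(log_residuals(start, frequency, b_temp)**2)
result = least_squares(log_residuals, start, bounds=(lower, upper),
                       x_scale='jac', args=(frequency, b_temp))
print(result.x)
print(result.cost)
```

When plotting the fitted curve on log axes, evaluate it on positive frequencies only (for example np.logspace(1, 4.5, 200)), since the model is not defined for x <= 0.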
I am trying to fit some data using a stretched exponential function of the form c*exp(-(x/tau)^beta). The value I am interested in is tau.
The data I am trying to fit passes through zero and is sometimes negative (for example, the values go from -1 to 1).
def st_exp(x,c,tau,beta):
    return c*(np.exp(-(x/tau)**beta))
When I try to fit I get a runtime warning :
RuntimeWarning: invalid value encountered in power
return c*(np.exp(-(x/tau)**beta))
I want to fit the data as is. However, the fit raises this runtime warning and either does not converge or fits only up to the point where zero is encountered.
For fitting I used:
def get_index(x0,x):
    return np.argmin(abs(x-x0))
init_vals = [max(y)-min(y),-1*x[get_index(np.mean(y),y)]/np.log(0.5),0.5]
best_vals, covar = curve_fit(st_exp, x,y, p0=init_vals)
The data I am trying to fit :
x = np.arange(0,400000,1000)
y = np.array([-45819., -37322., -34006., -28906., -26565., -13311., -10992.,
-11233., -3313., -2421., -1687., 9665., 11951., 12796.,
22440., 20331., 24732., 26594., 25464., 30668., 37412.,
33261., 34365., 39359., 39105., 40260., 48946., 48351.,
49872., 44422., 49969., 54536., 54248., 57340., 61403.,
61843., 63386., 61182., 64080., 64052., 68232., 68167.,
76288., 71786., 74485., 76070., 76540., 70167., 82014.,
79459., 80499., 80073., 80697., 88209., 80099., 83415.,
93613., 86038., 89498., 86073., 86999., 94242., 91823.,
91162., 93277., 94834., 89088., 92613., 97663., 95948.,
92840., 105920., 98487., 100951., 88721., 95078., 99831.,
94738., 102520., 98576., 99038., 103921., 102951., 103186.,
100755., 103631., 107259., 107376., 105404., 109739., 110135.,
107829., 103196., 110798., 104497., 107074., 111857., 110816.,
111853., 111890., 107932., 111878., 109776., 112154., 112769.,
113155., 114862., 109560., 111112., 111516., 110314., 115911.,
115820., 118418., 113124., 114579., 118102., 115259., 112640.,
121617., 118125., 114923., 115210., 121919., 115841., 111980.,
117730., 112565., 120893., 113758., 121129., 110559., 118674.,
122867., 118574., 118022., 118656., 117656., 116813., 118591.,
119722., 110845., 126545., 119452., 121438., 118271., 125652.,
121025., 119663., 119917., 121405., 124934., 117835., 121760.,
123870., 126825., 120996., 116165., 119473., 120996., 120530.,
122197., 119907., 123786., 116293., 118625., 123068., 123951.,
123443., 120781., 126291., 119316., 119401., 125871., 120863.,
117013., 125037., 124775., 117822., 123755., 121240., 122696.,
117997., 124865., 123457., 124229., 117705., 126550., 121866.,
123070., 123585., 126033., 126355., 124475., 121325., 125392.,
125882., 126755., 128013., 123610., 123611., 123853., 124819.,
125464., 123897., 128276., 120328., 125569., 128821., 128039.,
126223., 123052., 121924., 121932., 122968., 129473., 124053.,
122576., 124538., 127567., 129659., 126090., 130546., 131749.,
118672., 130372., 125783., 126413., 126283., 125898., 124901.,
130037., 123192., 122977., 125806., 125544., 131714., 130757.,
128980., 130233., 129140., 127372., 118302., 126342., 126046.,
127595., 129635., 121161., 123841., 124058., 124156., 131894.,
124745., 129556., 127832., 126236., 130072., 121877., 121383.,
136089., 123984., 127407., 128703., 127597., 126220., 124028.,
122716., 127398., 129724., 128971., 124488., 127229., 130337.,
132997., 126681., 127312., 123270., 123822., 127458., 127653.,
122740., 132875., 124466., 132315., 129569., 128041., 127525.,
124972., 123646., 122957., 130239., 126285., 127734., 131409.,
128138., 133744., 131438., 130377., 130763., 127868., 129223.,
130644., 131814., 132781., 127419., 124382., 127924., 129190.,
127443., 132475., 130202., 128066., 130360., 130282., 125531.,
130259., 123453., 126989., 129615., 132047., 129424., 126729.,
127324., 128756., 121690., 132176., 126250., 127830., 128985.,
133258., 125664., 123530., 130123., 126947., 123108., 125562.,
126388., 131747., 128793., 121865., 121705., 127039., 132701.,
128835., 133300., 125677., 134063., 136207., 128572., 127731.,
130304., 129674., 126436., 132357., 128154., 129400., 126893.,
132012., 129471., 124752., 127925., 123735., 125801., 126371.,
128554., 126691., 126970., 129754., 130953., 125113., 133345.,
127633., 128070., 127592., 125389., 127235., 125677., 131191.,
130972., 124687., 132342., 130269., 133340., 127084., 132171.,
131521., 133572., 124134., 132673., 131440., 122008., 129178.,
133775., 126584., 131278., 133229., 128349., 139349., 127294.,
133538.])
Your initial values are likely preventing you from finding a good fit. Try this:
best_vals, covar = curve_fit(st_exp, x, y, p0=[10000.0, 10000.0, 1.0])
print(best_vals)
# result: array([ 1.36046194e+05, 2.83889616e+04, -1.21296047e+00])
fig, ax = plt.subplots(1, 1)
ax.plot(x, y, label="data")
ax.plot(x, st_exp(x,*best_vals), label="fit")
ax.legend(loc="best")
The error I was making was that I was not providing an offset for the fitting function. Either:
correct the offset before fitting,
or
modify the fitting function as:
def st_exp(x,c,tau,beta,y_offset):
    return c*(np.exp(-(x/tau)**beta))+y_offset
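A minimal sketch of the offset version, run on synthetic data rather than the measured values above. The "true" parameters, the noise level, the starting guesses and the bounds are all my assumptions. Keeping tau and beta positive is one plausible way to avoid the "invalid value encountered in power" warning, since a negative x/tau raised to a fractional beta is not a real number.

```python
import numpy as np
from scipy.optimize import curve_fit

def st_exp_offset(x, c, tau, beta, y_offset):
    # Stretched exponential on top of a constant offset.
    return c * np.exp(-(x / tau)**beta) + y_offset

# Synthetic stand-in for the measured curve (hypothetical "true" values).
rng = np.random.default_rng(0)
x = np.arange(0, 400000, 1000, dtype=float)
true_params = (-170000.0, 60000.0, 1.0, 128000.0)
y = st_exp_offset(x, *true_params) + rng.normal(0.0, 3000.0, x.size)

# Starting guesses read off the data: total swing, a fraction of the
# x range, a plain exponential, and the final level as the offset.
p0 = [y[0] - y[-1], 0.2 * x.max(), 1.0, y[-1]]

# Bounds in parameter order (c, tau, beta, y_offset); tau and beta stay
# positive so that (x / tau)**beta remains real for x >= 0.
bounds = ([-np.inf, 1.0, 0.05, -np.inf], [np.inf, np.inf, 5.0, np.inf])

best_vals, covar = curve_fit(st_exp_offset, x, y, p0=p0, bounds=bounds)
print(best_vals)
```

Passing bounds makes curve_fit switch from the default Levenberg-Marquardt solver to the bounded "trf" method, so tau cannot wander to negative values during the iterations.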
I'd like to reconstruct a signal for which I have some data, and I used LombScargle to obtain its frequency components.
My data is as follows:
r = np.array([119.75024144, 119.77177673, 119.79671626, 119.81566188,
119.81291201, 119.71610143, 119.24156708, 117.66932347,
114.22145178, 109.27266933, 104.57675147, 101.63381325,
100.42623807, 100.09436745, 100.02798438, 100.02696846,
100.05422613, 100.12216521, 100.27569606, 100.60962812,
101.32023289, 102.71102637, 105.01826819, 108.17052642,
111.67848758, 114.78442424, 116.95337537, 118.19437002,
118.84307457, 119.19571404, 119.40326818, 119.53101551,
119.61170874, 119.66610072, 119.68315253, 119.53757829,
118.83748609, 116.90425868, 113.32095843, 108.72465638,
104.58292906, 101.93316248, 100.68856962, 100.22523098,
100.08558767, 100.07194691, 100.11193397, 100.19142891,
100.33208922, 100.5849306 , 101.04224415, 101.87565882,
103.33985519, 105.63631456, 108.64972952, 111.86837667,
114.67115037, 116.69548163, 117.96207449, 118.69589499,
119.11781077, 119.36770681, 119.51566311, 119.59301667])
z = np.array([-422.05230434, -408.98182253, -395.78387843, -382.43143962,
-368.92341485, -355.26851343, -341.47780372, -327.56493425,
-313.54536462, -299.43740189, -285.26768576, -271.07676026,
-256.92098157, -242.86416227, -228.95449427, -215.207069 ,
-201.61590575, -188.17719265, -174.89201262, -161.75452196,
-148.74812279, -135.85126854, -123.04093538, -110.29151714,
-97.57502515, -84.86119278, -72.1145478 , -59.2947726 ,
-46.36450604, -33.29821629, -20.08471733, -6.72030326,
6.80047849, 20.48309726, 34.32320864, 48.30267819,
62.393214 , 76.56022602, 90.76260159, 104.94787451,
119.04731699, 132.98616969, 146.71491239, 160.23436159,
173.58582543, 186.81849059, 199.96724955, 213.05229133,
226.08870416, 239.09310452, 252.08377421, 265.0769367 ,
278.08234368, 291.10215472, 304.13509998, 317.18351924,
330.25976991, 343.38777732, 356.59626164, 369.90725571,
383.33109354, 396.87227086, 410.5309987 , 424.2899438])
plt.plot(z,r, label='data');plt.legend()
Afterwards, I use LombScargle on this data:
f, a = LombScargle(z, r).autopower()
plt.plot(f, a, label='frequency components');plt.legend()
Similar to a Fourier series, I would like to reconstruct the signal as a sum of sines or cosines of the form sum_i a_i*sin(w_i*z).
I am mainly interested in finding the a_i and w_i values.
I attempt the reconstruction as follows, but the result does not look like the signal for which I have data.
s = 0
for i in range(f.shape[0]):
    s += a[i]*np.sin(f[i]*z)
plt.plot(z, s, label='reconstructed signal');plt.legend()
Either there is a mistake in the way I am using Lomb-Scargle or in the signal reconstruction part, but I haven't figured out which.
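Three things are worth checking in a reconstruction like this. First, periodogram power is not a sine amplitude. Second, astropy's autopower returns ordinary frequencies (cycles per unit of z), so a sine term needs 2*pi*f*z. Third, the loop has no phase or mean offset. The sketch below illustrates the idea on synthetic data, using scipy.signal.lombscargle instead of astropy as a swapped-in tool: pick the peak frequency, then solve a linear least-squares problem for the offset, sine and cosine amplitudes. All signal values are hypothetical. If I recall the astropy API correctly, LombScargle.model(t, frequency) does a comparable fit at a given frequency.

```python
import numpy as np
from scipy.signal import lombscargle

# Synthetic, unevenly sampled signal (hypothetical values, not the data above).
rng = np.random.default_rng(1)
z = np.sort(rng.uniform(0.0, 1000.0, 200))
w_true = 2 * np.pi / 100.0              # angular frequency, period 100
r = 110.0 + 8.0 * np.sin(w_true * z + 0.7) + rng.normal(0.0, 0.2, z.size)

# Periodogram of the mean-subtracted data. Note that scipy's lombscargle
# expects ANGULAR frequencies, unlike astropy's autopower output.
w_grid = np.linspace(0.005, 0.5, 5000)
power = lombscargle(z, r - r.mean(), w_grid)
w_peak = w_grid[np.argmax(power)]

# Linear least squares for offset, sine and cosine amplitudes at w_peak.
# The sine/cosine pair encodes both the amplitude and the phase.
design = np.column_stack([np.ones_like(z),
                          np.sin(w_peak * z),
                          np.cos(w_peak * z)])
coeffs, *_ = np.linalg.lstsq(design, r, rcond=None)
reconstruction = design @ coeffs

rms = np.sqrt(np.mean((r - reconstruction)**2))
print(w_peak, coeffs, rms)
```

A single sinusoid cannot reproduce a strongly non-sinusoidal shape such as periodic dips; adding sine and cosine columns at 2*w_peak, 3*w_peak and so on to the design matrix extends the same fit to a truncated Fourier series.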
In the framework of my bachelor's thesis, I need to evaluate my data with Python. Unfortunately, none of my fellow students has a suitable script yet, and I'm quite new to programming.
I have this data set and I'm trying to fit it with a Gaussian using scipy.optimize.curve_fit. Since there are a lot of unusable counts, especially at the end of the axis, I'd like to restrict the fit to part of the data.
Picture raw data
This is what I have so far:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
x=np.arange(5120)
y=np.array([ 0.81434599, 1.17054264, 0.85279188, ..., 1. ,
            1. , 13.56291391]) #most of the data isn't interesting
#to me, part of interest see below
def Gauss(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))
mean = sum(x * y) / sum(y)
sigma = np.sqrt(sum(y * (x - mean)**2) / sum(y))
popt,pcov = curve_fit(Gauss, x, y, p0=[max(y), mean, sigma],
maxfev=360000)
plt.plot(x,y,label='data')
plt.plot(x,Gauss(x, *popt), 'r-',label='fit')
On docs.scipy.org I've found a general description about curve_fit
If I try using
bounds=([2400,-np.inf, -np.inf],[2600, np.inf, np.inf]),
I'm getting the ValueError: x0 is infeasible. What is the problem here?
I also tried to confine it with
popt,pcov = curve_fit(Gauss, x[2400:2600], y[2400:2600], p0=[max(y), mean, sigma], maxfev=360000)
as suggested in a comment on this question: "Error when obtaining gaussian fit for graph" at stackoverflow
In this case I only get a straight line though.
Picture: Confinement with x[2400:2600],y[2400:2600] as arguments of curve_fit
I really hope you can help me out here. I only need a way to fit a small part of my data. Thanks in advance!
interesting data:
y=np.array([ 0.93396226, 1.00884956, 1.15457413, 1.07590759,
0.88915094, 1.07142857, 1.10714286, 1.14171123, 1.06666667,
0.84975369, 0.95480226, 0.99388379, 1.01675978, 0.83967391,
0.9771987 , 1.02402402, 1.04531722, 1.07492795, 0.97135417,
0.99714286, 1.0248139 , 1.26223776, 1.1533101 , 0.99099099,
1.18867925, 1.15772871, 0.95076923, 1.03313253, 1.02278481,
0.93265993, 1.06705539, 1.00265252, 1.02023121, 0.92076503,
0.99728997, 1.03353659, 1.15116279, 1.04336043, 0.95076923,
1.05515588, 0.92571429, 0.93448276, 1.02702703, 0.90056818,
0.96068796, 1.08493151, 1.13584906, 1.1212938 , 1.0739645 ,
0.98972603, 0.94594595, 1.07913669, 0.98425197, 0.87762238,
0.96811594, 1.02710843, 0.99392097, 0.91384615, 1.09809264,
1.00630915, 0.93175074, 0.87572254, 1.00651466, 0.78772379,
1.12244898, 1.2248062 , 0.97109827, 0.94607843, 0.97900262,
0.97527473, 1.01212121, 1.16422287, 1.20634921, 0.97275204,
1.01090909, 0.99404762, 1.00561798, 1.01146132, 1.08695652,
0.97214485, 1.03525641, 0.99096386, 1.05135952, 1.16451613,
0.90462428, 0.76876877, 0.47701149, 0.27607362, 0.21580547,
0.20598007, 0.16766467, 0.15533981, 0.19745223, 0.15407855,
0.18925831, 0.26997245, 0.47603834, 0.596875 , 0.85126582, 0.96
, 1.06578947, 1.08761329, 0.89548023, 0.99705882, 1.07142857,
0.95677233, 0.86119874, 1.02857143, 0.98250729, 0.94214876,
1.04166667, 0.96024465, 1.07022472, 1.10344828, 1.04859335,
0.96655518, 1.06424581, 1.01754386, 1.03492063, 1.18627451,
0.91036415, 1.03355705, 1.09116809, 0.96083551, 1.01298701,
1.03691275, 1.02923977, 1.11612903, 1.01457726, 1.06285714,
0.98186528, 1.16470588, 0.86645963, 1.07317073, 1.09615385,
1.21192053, 0.94385027, 0.94244604, 0.88390501, 0.95718654,
0.9691358 , 1.01729107, 1.01119403, 1.20350877, 1.12890625,
1.06940063, 0.90410959, 1.14662757, 0.97093023, 1.03021148,
1.10629921, 0.97118156, 1.10693642, 1.07917889, 0.9484127 ,
1.07581227, 0.98006645, 0.98986486, 0.90066225, 0.90066225,
0.86779661, 0.86779661, 0.96996997, 1.01438849, 0.91186441,
0.91290323, 1.03745318, 1.0615942 , 0.97202797, 1.16608997,
0.94182825, 1.08333333, 0.9076087 , 1.18181818, 1.20618557,
1.01273885, 0.93606138, 0.87457627, 0.90575916, 1.09756098,
0.99115044, 1.13380282, 1.04333333, 1.04026846, 1.0297619 ,
1.04334365, 1.03395062, 0.92553191, 0.98198198, 1. ,
0.9439528 , 1.02684564, 1.1372549 , 0.96676737, 0.99649123,
1.07051282, 1.10367893, 1.0866426 , 1.15384615, 0.99667774])
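On the "x0 is infeasible" error: in that message x0 is SciPy's name for the initial parameter vector p0, not the Gaussian's center. The bounds argument is (lower, upper), with each list in the same order as the function's parameters (a, x0, sigma), and every entry of p0 must lie inside its interval. In the attempt above the 2400 to 2600 limits therefore apply to the amplitude a, and max(y) falls outside them. A hedged sketch on synthetic data follows; the dip parameters, the added constant baseline c, and the bounds are my assumptions. The baseline term is there because a bare Gaussian decays to 0, while this data sits near 1.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss_offset(x, a, x0, sigma, c):
    # Gaussian of amplitude a (negative for a dip) on a constant baseline c.
    return c + a * np.exp(-(x - x0)**2 / (2 * sigma**2))

# Synthetic stand-in for the window of interest (hypothetical values).
rng = np.random.default_rng(2)
x = np.arange(200, dtype=float)
y = gauss_offset(x, -0.85, 91.0, 4.0, 1.0) + rng.normal(0.0, 0.08, x.size)

# Initial guess taken from the window itself, not from the full array:
# dip depth, position of the minimum, a rough width, the baseline level.
p0 = [y.min() - np.median(y), x[np.argmin(y)], 5.0, np.median(y)]

# Bounds in parameter order (a, x0, sigma, c).
lower = [-np.inf, 80.0, 0.1, -np.inf]
upper = [0.0, 100.0, np.inf, np.inf]

popt, pcov = curve_fit(gauss_offset, x, y, p0=p0, bounds=(lower, upper))
print(popt)
```

If the window is cut from the full array as x[2400:2600], the center guess and its bounds have to be expressed in those same x units (roughly 2400 to 2600) rather than 0 to 200.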
You might find the lmfit module (https://lmfit.github.io/lmfit-py/) useful for this. It is designed to make curve fitting very easy, has built-in models for common peaks like Gaussian, and has many useful features such as allowing you to set bounds on parameters. A fit to your data with lmfit might look like this:
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import GaussianModel, ConstantModel
y = np.array([.....]) # uses your shorter data range
x = np.arange(len(y))
# make a model that is a Gaussian + a constant:
model = GaussianModel(prefix='peak_') + ConstantModel()
# make parameters with starting values:
params = model.make_params(c=1.0, peak_center=90,
peak_sigma=5, peak_amplitude=-5)
# it's not really needed for this data, but you can put bounds on
# parameters like this (or set .vary=False to fix a parameter)
params['peak_sigma'].min = 0 # sigma > 0
params['peak_amplitude'].max = 0 # amplitude < 0
params['peak_center'].min = 80
params['peak_center'].max = 100
# run fit
result = model.fit(y, params, x=x)
# print, plot results
print(result.fit_report())
plt.plot(x, y)
plt.plot(x, result.best_fit)
plt.show()
This will print out
[[Model]]
(Model(gaussian, prefix='peak_') + Model(constant))
[[Fit Statistics]]
# function evals = 54
# data points = 200
# variables = 4
chi-square = 1.616
reduced chi-square = 0.008
Akaike info crit = -955.625
Bayesian info crit = -942.432
[[Variables]]
peak_sigma: 4.03660814 +/- 0.204240 (5.06%) (init= 5)
peak_center: 91.2246614 +/- 0.200267 (0.22%) (init= 90)
peak_amplitude: -9.79111362 +/- 0.445273 (4.55%) (init=-5)
c: 1.02138228 +/- 0.006796 (0.67%) (init= 1)
peak_fwhm: 9.50548558 +/- 0.480950 (5.06%) == '2.3548200*peak_sigma'
peak_height: -0.96766623 +/- 0.041854 (4.33%) == '0.3989423*peak_amplitude/max(1.e-15, peak_sigma)'
[[Correlations]] (unreported correlations are < 0.100)
C(peak_sigma, peak_amplitude) = -0.599
C(peak_amplitude, c) = -0.328
C(peak_sigma, c) = 0.196
and make a plot of the data together with the best fit.
I am trying to fit a Gaussian to a spectrum whose y values are on the order of 10^(-19). curve_fit gives me a poor fit, both before and after I multiply my whole data by 10^(-19). My code is attached; it is a fairly simple data set except that the values are very small. If I want to keep my original values, how can I get a reasonable Gaussian fit that gives me the correct parameters?
#get fits data
aaa=pyfits.getdata('p1.cal.fits')
aaa=np.matrix(aaa)
nrow=np.shape(aaa)[0]
ncol=np.shape(aaa)[1]
ylo=79
yhi=90
xlo=0
xhi=1023
glo=430
ghi=470
#sum all the rows to get spectrum
ysum=[]
for x in range(xlo,xhi):
    sum=np.sum(aaa[ylo:yhi,x])
    ysum.append(sum)
wavelen_pix=range(xhi-xlo)
max=np.max(ysum)
print("maximum is at x=", np.where(ysum==max))
##fit gaussian
#fit only part of my data in the chosen range [glo:ghi]
x=wavelen_pix[glo:ghi]
y=ysum[glo:ghi]
def func(x, a, x0, sigma):
    return a*np.exp(-(x-x0)**2/float((2*sigma**2)))
sig=np.std(ysum[500:1000]) #std of background noise
popt, pcov = curve_fit(func, x, sig)
print(popt)
#this gives me [1.,1.,1.], which is obviously wrong
gaus=func(x,popt[0],popt[1],popt[2])
aaa is a 153 by 1024 image matrix, partly looks like this:
matrix([[ -8.99793629e-20, 8.57133275e-21, 4.83523386e-20, ...,
-1.54811004e-20, 5.22941515e-20, 1.71179195e-20],
[ 2.75769318e-20, 1.03177243e-20, -3.19634928e-21, ...,
1.66583803e-20, -9.88712568e-22, -2.56897725e-20],
[ 2.88121935e-20, 8.57964252e-21, -2.60784327e-20, ...,
1.72335180e-20, -7.61189937e-21, -3.45333075e-20],
...,
[ 1.04006903e-20, 1.61200683e-20, 7.04195205e-20, ...,
1.72459645e-20, 4.29404029e-20, 1.99889374e-20],
[ 3.22315752e-21, -5.61394194e-21, 3.28763096e-20, ...,
1.99063583e-20, 2.12989880e-20, -1.23250648e-21],
[ 3.66591810e-20, -8.08647455e-22, -6.22773168e-20, ...,
-4.06145681e-21, 4.92453132e-21, 4.23689309e-20]], dtype=float32)
You are calling curve_fit incorrectly. Here is its signature:
curve_fit(f, xdata, ydata, p0=None, sigma=None, absolute_sigma=False, check_finite=True, **kw)
f is your function, whose first argument is an array of independent variables and whose subsequent arguments are the function parameters (such as amplitude, center, etc.)
xdata are the independent variables
ydata are the dependent variables
p0 is an initial guess at the function parameters (for a Gaussian defined like your func, this is amplitude, center, width)
By default p0 is set to a list of ones [1,1,...], which is probably why you get that as a result: the fit never really ran because you called it incorrectly.
Try estimating the amplitude, center, and width from the data, then make a p0 object (see below for details)
init_guess = ( a_i, x0_i, sig_i) # same order as they are supplied to your function
popt, pcov = curve_fit(func, xdata=x,ydata=y,p0=init_guess)
Here is a short example
xdata = np.linspace(0, 4, 50)
mygauss = ( 10,2,0.5) #( amp, center, width)
y = func(xdata, *mygauss ) # using your func defined above
ydata = y + 2*(np.random.random(50)- 0.5) # add some noise to create fake data
Now I can guess the fit params
ai = np.max( ydata) # guess the amplitude
xi = xdata[ np.argmax( ydata)] # guess the position of center
Guessing the width is tricky. I would first find where the half maximum is located (there are two such points, but you only need to find one, as the Gaussian is symmetric):
pos_half = np.argmin( np.abs( ydata-ai/2 ) ) # subtract half the amplitude and find the minimum
Now evaluate how far this is from the center of the Gaussian (xi):
sig_i = np.abs( xi - xdata[ pos_half] ) # estimate the width
Now you can make the initial guess
init_guess = (ai, xi, sig_i)
and fit
params, variance = curve_fit( func, xdata=xdata, ydata=ydata, p0=init_guess)
print(params)
#array([ 9.99457443, 2.01992858, 0.49599629])
which is very close to mygauss. Hope it helps.
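Because the values in the question are of order 1e-19, it may also help to bring the data to order unity before fitting and to scale the amplitude back afterwards. This is a sketch under my own assumptions (synthetic numbers, the maximum absolute value as the scale factor), not something taken from the question's data:

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))

# Synthetic spectrum slice with values of order 1e-19 (hypothetical numbers).
rng = np.random.default_rng(3)
x = np.arange(430, 470, dtype=float)
y = func(x, 5e-19, 450.0, 3.0) + rng.normal(0.0, 2e-20, x.size)

scale = np.max(np.abs(y))        # bring the data to order unity
y_scaled = y / scale

# After scaling, the peak height is about 1 by construction.
p0 = [1.0, x[np.argmax(y_scaled)], 2.0]
popt, pcov = curve_fit(func, x, y_scaled, p0=p0)

# Undo the scaling: only the amplitude (and its uncertainty) carries the
# factor, while the center and width are unchanged.
amplitude = popt[0] * scale
amplitude_err = np.sqrt(pcov[0, 0]) * scale
print(amplitude, popt[1], popt[2])
```

The fitted center and width apply directly to the original data, so the original values are preserved and only the amplitude needs converting.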
Forget about rescaling, making linear changes, or using the p0 parameter, which usually don't work! Try using the bounds parameter of curve_fit for n parameters like this:
a0=np.array([a01,...,a0n]) # lower bounds
af=np.array([af1,...,afn]) # upper bounds
popt, pcov = curve_fit(func, x, y, method="trf", bounds=(a0,af))
Hope it works!
;)