The covariance of the parameters cannot be estimated during curve fitting


I'm trying to solve for two unknown parameters of my model function using scipy.optimize.curve_fit. The equation I used is as follows:
[image: model equation, implemented as cal_omiga_tstar in the code below]
My code is as follows:
p_freqs =np.array(0.,8.19672131,16.39344262,24.59016393,32.78688525,
40.98360656,49.18032787,57.37704918,65.57377049,73.7704918,
81.96721311,90.16393443,98.36065574,106.55737705,114.75409836,
122.95081967,131.14754098,139.3442623, 147.54098361,155.73770492,
163.93442623,172.13114754,180.32786885,188.52459016,196.72131148,
204.91803279,213.1147541, 221.31147541,229.50819672,237.70491803,
245.90163934)
p_fft_amp1 = np.array(3.34278536e-08,5.73549829e-08,1.94897033e-08,1.59088184e-08,
9.23948302e-09,3.71198908e-09,1.85535722e-09,1.86064653e-09,
1.52149363e-09,1.33626573e-09,1.19468040e-09,1.08304535e-09,
9.96594475e-10,9.25671797e-10,8.66775330e-10,8.17287132e-10,
7.75342888e-10,7.39541296e-10,7.08843676e-10,6.82440637e-10,
6.59712650e-10,6.40169517e-10,6.23422124e-10,6.09159901e-10,
5.97134297e-10,5.87146816e-10,5.79040074e-10,5.72691200e-10,
5.68006964e-10,5.64920239e-10,5.63387557e-10)
def cal_omiga_tstar(omiga,tstar,f):
    return omiga*np.exp(-np.pi*f*tstar)/(1+(f/18.15)**2)
omiga,tstar = optimize.curve_fit(cal_omiga_tstar,p_freqs,p_fft_amp1)[0]
When I run the code I get the following prompt:
OptimizeWarning: Covariance of the parameters could not be estimated
  warnings.warn('Covariance of the parameters could not be estimated'

I couldn't pinpoint the exact cause of the warning, because your code fails before it gets that far. First, both np.array calls pass the values as separate positional arguments instead of a single list, which raises a TypeError. Second, cal_omiga_tstar takes its arguments in the wrong order: curve_fit expects the independent variable first, followed by the parameters to fit. While fixing these problems I did get your warning once, but oddly I haven't been able to reproduce it. However, I did manage to fit your function. You should supply initial guesses for the parameters (p0), especially since your y values are so small: without p0, curve_fit starts every parameter at 1, which is many orders of magnitude away from your data. There's no magic here; just plot your model against the data and adjust the guesses until the two are relatively close. Then let the algorithm take the wheel.
Here's my code:
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
# Changed here, was "np.array(0.,..."
p_freqs =np.array([0.,8.19672131,16.39344262,24.59016393,32.78688525,
40.98360656,49.18032787,57.37704918,65.57377049,73.7704918,
81.96721311,90.16393443,98.36065574,106.55737705,114.75409836,
122.95081967,131.14754098,139.3442623, 147.54098361,155.73770492,
163.93442623,172.13114754,180.32786885,188.52459016,196.72131148,
204.91803279,213.1147541, 221.31147541,229.50819672,237.70491803,
245.90163934])
p_fft_amp1 = np.array([3.34278536e-08,5.73549829e-08,1.94897033e-08,1.59088184e-08,
9.23948302e-09,3.71198908e-09,1.85535722e-09,1.86064653e-09,
1.52149363e-09,1.33626573e-09,1.19468040e-09,1.08304535e-09,
9.96594475e-10,9.25671797e-10,8.66775330e-10,8.17287132e-10,
7.75342888e-10,7.39541296e-10,7.08843676e-10,6.82440637e-10,
6.59712650e-10,6.40169517e-10,6.23422124e-10,6.09159901e-10,
5.97134297e-10,5.87146816e-10,5.79040074e-10,5.72691200e-10,
5.68006964e-10,5.64920239e-10,5.63387557e-10])
# Changed sequence from "omiga, tstar, f" to "f, omiga, tstar".
def cal_omiga_tstar(f, omiga,tstar):
    return omiga*np.exp(-np.pi*f*tstar)/(1+(f/18.15)**2)
# Changed this call to get popt, pcov, and supplied the initial guesses
popt, pcov = curve_fit(cal_omiga_tstar,p_freqs,p_fft_amp1, p0=(1E-5, 1E-2))
Here's popt: array([ 4.51365934e-08, -1.48124194e-06]) and pcov: array([[1.35757744e-17, 3.54656128e-12],[3.54656128e-12, 2.90508007e-06]]). As you can see, the covariance matrix could be estimated in this case.
Here's the model vs. data curve: [plot not reproduced]
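If it helps, the standard errors of the fitted parameters are usually read off the diagonal of pcov. The sketch below reuses the popt and pcov values quoted in this answer; the names perr and corr are my own and are not part of curve_fit's output.

```python
import numpy as np

# Values copied from the fit result quoted in the answer.
popt = np.array([4.51365934e-08, -1.48124194e-06])
pcov = np.array([[1.35757744e-17, 3.54656128e-12],
                 [3.54656128e-12, 2.90508007e-06]])

# One-sigma standard errors are the square roots of the diagonal of pcov.
perr = np.sqrt(np.diag(pcov))

# Correlation coefficient between omiga and tstar.
corr = pcov[0, 1] / (perr[0] * perr[1])

for name, value, err in zip(("omiga", "tstar"), popt, perr):
    print(f"{name} = {value:.3e} +/- {err:.3e}")
print(f"correlation = {corr:.3f}")
```

With these numbers the uncertainty on tstar is much larger than its fitted value, which suggests tstar is poorly constrained by this data. When curve_fit cannot estimate the covariance at all, it fills pcov with inf, so np.isinf(pcov).any() is a quick check.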

Related

ODE with non-analytical time-dependent parameters in PyMC3

I'm working on solving the following ODE with PyMC3:
def production( y, t, p ):
    return p[0]*getBeam( t ) - p[1]*y[0]
The getBeam( t ) is my time dependent coefficient. Those coefficients are given by an array of data which is accessed by the time index as follows:
def getBeam( t ):
    nBeam = I[int(t/10)]*pow( 10, -6 )/q_e
    return nBeam
I have successfully implemented it using scipy.integrate.odeint, but I am having a hard time doing the same with pymc3.ode. In fact, with the following:
ode_model = DifferentialEquation(func=production, times=x, n_states=1, n_theta=3, t0=0)
with pm.Model() as model:
    a = pm.Uniform( "S-Factor", lower=0.01, upper=100 )
    ode_solution = ode_model(y0=[0], theta=[a, Yield, lambd])
I get the error TypeError: __trunc__ returned non-Integral (type TensorVariable), because t is a TensorVariable and therefore cannot be used to index the array in which the coefficients are stored.
Is there a way to overcome this difficulty? I thought about using theano.function, but I cannot get it working since, unfortunately, the coefficients cannot be expressed by any analytical function: they are just stored in the array I, whose index represents the time variable t.
Thanks

Since you already have a working implementation with scipy.integrate.odeint, you could use theano.compile.ops.as_op, though it comes with some inconveniences (see "How to fit a method belonging to an instance with pymc3?" and "How to write a custom Deterministic or Stochastic in pymc3 with theano.op?").
Using your exact definitions for production and getBeam, the following code seems to work for me:
from scipy.integrate import odeint
from theano.compile.ops import as_op
import theano.tensor as tt
import pymc3 as pm
def ScipySolveODE(a):
    return odeint(production, y0=[0], t=x, args=([a, Yield, lambd],)).flatten()

@as_op(itypes=[tt.dscalar], otypes=[tt.dvector])
def TheanoSolveODE(a):
    return ScipySolveODE(a)

with pm.Model() as model:
    a = pm.Uniform( "S-Factor", lower=0.01, upper=100 )
    ode_solution = TheanoSolveODE(a)
Sorry I know this is more of a workaround than an actual solution...
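To check the SciPy half of this in isolation, here is a minimal sketch with made-up stand-ins for the globals the question leaves undefined (the values of I, q_e, x, Yield and lambd below are all hypothetical). It also clamps the array index in getBeam, which is my addition: odeint may evaluate the right-hand side at times slightly beyond the last requested point, and an unclamped I[int(t/10)] could then run off the end of the array.

```python
import numpy as np
from scipy.integrate import odeint

# Hypothetical stand-ins for the question's undefined globals.
q_e = 1.602176634e-19             # elementary charge in coulombs
I = np.full(20, 1.0e-13)          # made-up beam current samples, one per 10 time units
x = np.arange(0.0, 200.0, 10.0)   # output times
Yield, lambd = 1.0e-3, 0.01       # made-up values

def getBeam(t):
    # Clamp the index so probes past the last sample reuse the final value.
    idx = min(max(int(t / 10), 0), len(I) - 1)
    return I[idx] * 1e-6 / q_e

def production(y, t, p):
    # Same expression as in the question, returned as a one-element list
    # so its length matches y0.
    return [p[0] * getBeam(t) - p[1] * y[0]]

def ScipySolveODE(a):
    return odeint(production, y0=[0], t=x, args=([a, Yield, lambd],)).flatten()

sol = ScipySolveODE(1.0)
print(sol)
```

If this runs cleanly with your real I and x, the as_op wrapper above only has to pass the sampled value of a through to it.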

Python- scipy ODR going crazy

I would like to use scipy's ODR to fit a curve to a set of variables with variances. In this case, I am fitting a linear function with a fixed y-axis intercept (e.g. a*x+100). Since I could not find an estimator (I asked about that here), I am using scipy.optimize.curve_fit to estimate the initial value of a. The fit works perfectly without standard deviations, but when I add them, the output makes no sense at all (the curve lies far above all of the points). What could be the cause of this behaviour? Thanks!
The code is attached here:
import scipy.odr as SODR
from scipy.optimize import curve_fit
def fun(kier, arg):
    '''
    Function to fit
    :param kier: list of parameters
    :param arg: argument of the function fun
    :return y: value of function in x
    '''
    y = kier[0]*arg + 100  # +kier[1]
    return y
zx = [30.120348566300354, 36.218214083626386, 52.86998374096616]
zy = [83.47033171149137, 129.10207165602722, 85.59465198231146]
dx = [2.537935346025827, 4.918719773247683, 2.5477221183398977]
dy = [3.3729431749276837, 5.33696690247701, 2.0937213187876]
sx = [6.605581618516947, 8.221194790372632, 22.980577676739113]
sy = [1.0936584882351936, 0.7749999999999986, 20.915359045447914]
dx_total = [9.143516964542775, 13.139914563620316, 25.52829979507901]
dy_total = [4.466601663162877, 6.1119669024770085, 23.009080364235516]
# curve fitting
popt, pcov = curve_fit(fun, zx, zy)
danesd = SODR.RealData(x=zx, y=zy, sx=dx_total, sy=dy_total)
model = SODR.Model(fun)
onbig = SODR.ODR(danesd, model, beta0=[popt[0]])
outputbig = onbig.run()
biga=outputbig.beta[0]
print(biga)
daned = SODR.RealData(x=zx,y=zy,sx=dx,sy=dy)
on = SODR.ODR(daned, model, beta0=[popt[0]])
outputd = on.run()
normala = outputd.beta[0]
print(normala)
The outputs are:
30.926925885047254 (this is the output with standard deviation)
-0.25132703671513873 (this is without standard deviation)
This makes no sense, as shown here: [plot not reproduced]
Also, I'd be happy to get any feedback whether my code is clear and the formatting of this question. I am still very new here.
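One thing that may be worth checking, offered as a sketch rather than a diagnosis: curve_fit and scipy.odr call the model with different argument orders. curve_fit calls f(x, *params), while scipy.odr calls f(beta, x). Passing the same fun to both means curve_fit treats kier as the data and arg as the parameter, so the starting value handed to ODR may not be the slope that was intended. The wrapper name fun_for_curve_fit below is my own.

```python
import numpy as np
import scipy.odr as SODR
from scipy.optimize import curve_fit

zx = np.array([30.120348566300354, 36.218214083626386, 52.86998374096616])
zy = np.array([83.47033171149137, 129.10207165602722, 85.59465198231146])
dx = np.array([2.537935346025827, 4.918719773247683, 2.5477221183398977])
dy = np.array([3.3729431749276837, 5.33696690247701, 2.0937213187876])

def fun(kier, arg):
    # scipy.odr convention: parameter vector first, then x.
    return kier[0] * arg + 100

def fun_for_curve_fit(arg, a):
    # curve_fit convention: x first, then each parameter separately.
    return fun([a], arg)

# Initial slope estimate from ordinary least squares.
popt, pcov = curve_fit(fun_for_curve_fit, zx, zy)

# ODR fit with per-point uncertainties, started from that estimate.
data = SODR.RealData(x=zx, y=zy, sx=dx, sy=dy)
output = SODR.ODR(data, SODR.Model(fun), beta0=[popt[0]]).run()

print("curve_fit slope:", popt[0])
print("ODR slope:", output.beta[0])
```

With only three points and a fixed intercept the fit is weakly constrained, so it may still be sensitive to beta0; comparing the two printed slopes is a quick sanity check.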

Problems using scipy.odr with math.erf()

I have a problem using orthogonal distance regression with a function that uses the error function math.erf(). More precisely, it seems I have a problem with the variables that should be fitted.
import numpy as np
from scipy import odr
import math
def CDF(x,a,b,c):
    #definition of error function to fit data
    return c/2*(1+math.erf((x-a)/(b*math.sqrt(2))))
#to make things a little shorter I excluded the parts in which the data is read and prepared
guess=Ergebnis_popt[1] #using found values for a,b,c from curve_fit without errors
guess_a=guess[0]
guess_b=guess[1]
guess_c=guess[2]
xerr=Verschiebung_fit[2] #determination of x errors, the values for the y errors are included in the prepared data
xerr=xerr/xerr
xerr=xerr/200
#building the model for the odr fit, as the erf function seems to only handle single values I use np.vectorize
cfd = odr.Model(np.vectorize(CDF,excluded=['a','b','c']),extra_args=['a','b','c'])
#definition of the data with errors for the fit
mydata =odr.RealData(Verschiebung_fit[2],y=y_values_fit[2],sx=xerr,sy=y_values_fit[4])
#basis for the odr fit
odr=odr.ODR(mydata,cfd, beta0 = [guess_a,guess_b,guess_c] )
myoutput = odr.run()
myoutput.pprint()
All this results in an error:
File "C:\Users\Public\Anaconda\lib\site-packages\numpy\lib\function_base.py", line 2785, in _get_ufunc_and_otypes
outputs = func(*inputs)
File "C:\Users\Public\Anaconda\lib\site-packages\numpy\lib\function_base.py", line 2750, in func
return self.pyfunc(*the_args, **kwargs)
TypeError: CDF() takes 4 positional arguments but 5 were given
I am a novice with Python. I guess there is a problem with how the variables to be fitted are passed, but I cannot figure out where it occurs.
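A possible way around this, sketched under assumptions since the data-loading code is not shown: scipy.odr calls the model as f(beta, x), with all fit parameters packed into the single sequence beta, so a function with the signature CDF(x, a, b, c) does not match, and the extra_args entries are passed on as additional positional arguments (hence the "5 were given"). Using scipy.special.erf, which accepts arrays, also removes the need for np.vectorize. The synthetic data and the names cdf_model, true_beta and fit are hypothetical.

```python
import numpy as np
from scipy import odr
from scipy.special import erf

def cdf_model(beta, x):
    # scipy.odr passes the parameters as one sequence, then x.
    a, b, c = beta
    return c / 2 * (1 + erf((x - a) / (b * np.sqrt(2))))

# Hypothetical noise-free data generated from known parameters.
true_beta = [0.2, 1.1, 2.0]
x = np.linspace(-3.0, 3.0, 25)
y = cdf_model(true_beta, x)

data = odr.RealData(x, y, sx=np.full_like(x, 0.005), sy=np.full_like(y, 0.01))

# Use a distinct name for the ODR object so the odr module is not shadowed.
fit = odr.ODR(data, odr.Model(cdf_model), beta0=[0.0, 1.0, 1.8])
output = fit.run()
output.pprint()
```

In the original snippet, odr=odr.ODR(...) rebinds the name odr to the fit object, which works once but breaks any later use of the module in the same session.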

How to do Scipy curve fitting with error bars and obtain standard errors on fitting parameters?

I am trying to fit my data points. Fitting without errors does not look very promising, so I am now trying to fit the data while taking the error at each point into account. My fit function is below:
def fit_func(x,a,b,c):
    return np.log10(a*x**b + c)
My data points are:
r = [ 0.00528039,0.00721161,0.00873037,0.01108928,0.01413011,0.01790143,0.02263833, 0.02886089,0.03663713,0.04659512,0.05921978,0.07540126,0.09593949, 0.12190075,0.15501736,0.19713563,0.25041524,0.3185025,0.40514023,0.51507869, 0.65489938,0.83278859,1.05865016,1.34624082]
logf = [-1.1020581079659384, -1.3966927245616112, -1.4571368537041418, -1.5032694247562564, -1.8534775558300272, -2.2715812166948304, -2.2627690390113862, -2.5275290780299331, -3.3798813619309365, -6.0, -2.6270989211307034, -2.6549656159564918, -2.9366845162570079, -3.0955026428779604, -3.2649261507250289, -3.2837123017838366, -3.0493752067042856, -3.3133647996463229, -3.0865051494299243, -3.1347499415910169, -3.1433062918466632, -3.1747394718538979, -3.1797597345585245, -3.1913094832146616]
Because my data is in log scale (logf), the error bar for each data point is not symmetric. The upper and lower error bars are below:
upper = [0.070648916083227764, 0.44346256268274886, 0.11928131794776076, 0.094260899008089094, 0.14357124858039971, 0.27236750587684311, 0.18877122991380402, 0.28707938182603066, 0.72011863806906318, 0, 0.16813325716948757, 0.13624929595316049, 0.21847915642008875, 0.25456116079315372, 0.31078368240910148, 0.23178227464741452, 0.09158189214515966, 0.14020538489677881, 0.059482730164901909, 0.051786777740678414, 0.041126467609954531, 0.034394612910981337, 0.027206248503368613, 0.021847333685597548]
lower = [0.06074797748043137, 0.21479225959441428, 0.093479845697059583, 0.077406149968278104, 0.1077175009766278, 0.16610073183912188, 0.13114254113054535, 0.17133966123838595, 0.57498950902908286, 2.9786837094190934, 0.12090437578535695, 0.10355760401838676, 0.14467588244034646, 0.15942693835964539, 0.17929440903034921, 0.15031667827534712, 0.075592499975030591, 0.10581886912443572, 0.05230849287772843, 0.04626422871423852, 0.03756658820680725, 0.03186944137872727, 0.025601929615431285, 0.02080073540367966]
I have the fitting as:
popt, pcov = optimize.curve_fit(fit_func, r, logf,sigma=[lower,upper])
logf_fit = fit_func(r,*popt)
But this is wrong. How can I use scipy's curve fitting with both upper and lower errors? And how can I get the uncertainties of the fitted parameters a, b, c?

You can use scipy.optimize.leastsq with custom weights:
import scipy.optimize as optimize
import numpy as np
# redefine lists as array
x=np.array(r)
y=np.array(logf)
errup=np.array(upper)
errlow=np.array(lower)
# error function
def fit_func(x,a,b,c):
    return np.log10(a*x**b + c)
def my_error(V):
    a,b,c=V
    yfit=fit_func(x,a,b,c)
    weight=np.ones_like(yfit)
    weight[yfit>y]=errup[yfit>y] # if the fit point is above the measure, use upper weight
    weight[yfit<=y]=errlow[yfit<=y] # else use lower weight
    return (yfit-y)**2/weight**2
answer=optimize.leastsq(my_error,x0=[0.0001,-1,0.0006])
a,b,c=answer[0]
print(a,b,c)
It works, but it is very sensitive to the initial values, because the argument of the logarithm can become negative during the search, and then the fit fails. Here I find a=9.14464745425e-06, b=-1.75179880756, c=0.00066720486385, which is pretty close to the data.
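For the second part of the question (uncertainties on a, b, c), leastsq can also return an unscaled covariance estimate when called with full_output=True. The sketch below illustrates this on a hypothetical straight-line model with made-up asymmetric error bars, chosen only because it converges reliably; the names model, residuals, pcov and perr are my own. Note that leastsq squares and sums the returned vector itself, so this sketch returns the unsquared weighted residual.

```python
import numpy as np
from scipy import optimize

# Hypothetical synthetic data: a straight line with asymmetric error bars.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)
errup = np.full_like(x, 0.4)    # made-up upper error bars
errlow = np.full_like(x, 0.2)   # made-up lower error bars

def model(x, m, k):
    return m * x + k

def residuals(params):
    yfit = model(x, *params)
    # Use the upper bar where the fit lies above the point, else the lower bar.
    weight = np.where(yfit > y, errup, errlow)
    return (yfit - y) / weight

popt, cov_x, info, msg, ier = optimize.leastsq(
    residuals, x0=[1.0, 0.0], full_output=True
)

# Scale cov_x by the reduced chi-square to estimate the parameter covariance.
dof = x.size - len(popt)
pcov = cov_x * np.sum(residuals(popt) ** 2) / dof
perr = np.sqrt(np.diag(pcov))

print("parameters:", popt)
print("standard errors:", perr)
```

Because the weight switches between the upper and lower bar, the objective is not a standard chi-square, so the resulting errors are best treated as rough estimates.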

How to make user defined functions for binned_statistic

I am using the scipy.stats package to compute statistics along an axis, but I am having trouble computing a percentile with binned_statistic. I have generalized the code below, where I am trying to take the 10th percentile of a dataset with x, y values within a series of x bins, and it fails.
I can of course use the built-in options, like median, and even NumPy's standard deviation via np.std. However, I cannot figure out how to use np.percentile, because it requires two arguments (e.g. np.percentile(y, 10)); when I try, I get a ValueError: statistic not understood error.
import numpy as np
import scipy.stats as scist
y_median = scist.binned_statistic(x,y,statistic='median',bins=20,range=[(0,5)])[0]
y_std = scist.binned_statistic(x,y,statistic=np.std,bins=20,range=[(0,5)])[0]
y_10 = scist.binned_statistic(x,y,statistic=np.percentile(10),bins=20,range=[(0,5)])[0]
print y_median
print y_std
print y_10
I am at a loss and have even played around with user defined functions like this, but with no luck:
def percentile10():
    return(np.percentile(y,10))
Any help is greatly appreciated.
Thanks.

The problem with the function you defined is that it takes no arguments at all! It needs to take a y argument that corresponds to your sample, like this:
def percentile10(y):
    return(np.percentile(y,10))
You could also use a lambda function for brevity:
scist.binned_statistic(x, y, statistic=lambda y: np.percentile(y, 10), bins=20,
range=[(0, 5)])[0]
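Put together as a runnable sketch, with hypothetical random data standing in for your x and y (the question does not show them), this could look as follows. functools.partial is shown as one more way to bind the percentile; it is my suggestion and equivalent to the lambda.

```python
import numpy as np
import scipy.stats as scist
from functools import partial

# Hypothetical sample data; x is uniform on [0, 5) so every bin is populated.
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=2000)
y = rng.normal(size=2000)

def percentile10(values):
    # binned_statistic calls this once per bin with that bin's y values.
    return np.percentile(values, 10)

result = scist.binned_statistic(x, y, statistic=percentile10, bins=20, range=[(0, 5)])
y_10 = result.statistic

# Equivalent alternative: bind the percentile with functools.partial.
y_10_partial = scist.binned_statistic(
    x, y, statistic=partial(np.percentile, q=10), bins=20, range=[(0, 5)]
)[0]

print(y_10)
```

The original call failed because statistic=np.percentile(10) evaluates np.percentile immediately instead of passing a callable.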
