import numpy as np
from scipy import optimize

def sigmoid_function(x1, k, xo, a, c):
    return (a / (1 + np.exp(-k * (x1 - xo)))) + c
x_data = list(range(1, 36))
y_data = [0.08965066, 0.08990541, 0.09007396, 0.09013885, 0.09021248, 0.09038204,
          0.09044601, 0.09062396, 0.09074469, 0.09097924, 0.09101625, 0.09110833,
          0.09130073, 0.09153685, 0.09165991, 0.09189038, 0.09236043, 0.09329333,
          0.09470363, 0.09750811, 0.10305867, 0.11295684, 0.12767181, 0.14647349,
          0.16744916, 0.18869261, 0.20908784, 0.22828775, 0.2459888,  0.262817,
          0.27898482, 0.29499955, 0.31033699, 0.32526762, 0.33972489]
result, covariance = optimize.curve_fit(sigmoid_function, x_data, y_data, maxfev=10000)
Curve with exact data
Curve fit result
I am new to ML. Please let me know if I can change any parameters in curve_fit().
If you look at the optimization at the scale of your observations, then it appears that the optimization function is not working very well.
But if you zoom out and look at the scale of the optimization function, things look quite different.
When no bounds or method is provided to curve_fit, it uses the Levenberg-Marquardt algorithm, which can fail to find the global solution.
In cases with only one minimum, an uninformed standard guess like β = (1, 1, …, 1) will work fine; in cases with multiple minima, the algorithm converges to the global minimum only if the initial guess is already somewhat close to the final solution.
What you are seeing is the optimization falling into a local minimum. Like it says above, you can get around this by providing initial parameters that are closer to the solution so the optimization can avoid that trap. For instance, by doing this:
p0 = [0.1, 0.1, 0.1, 0.1]
result, covariance = optimize.curve_fit(sigmoid_function, x_data, y_data, p0)
the optimization behaves as you expect:
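For completeness, here is that suggestion as a self-contained script (same model and data as in the question; the plot at the end just compares the data with the fitted curve):

import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize

def sigmoid_function(x1, k, xo, a, c):
    return (a / (1 + np.exp(-k * (x1 - xo)))) + c

x_data = np.arange(1, 36)
y_data = np.array([0.08965066, 0.08990541, 0.09007396, 0.09013885, 0.09021248, 0.09038204,
                   0.09044601, 0.09062396, 0.09074469, 0.09097924, 0.09101625, 0.09110833,
                   0.09130073, 0.09153685, 0.09165991, 0.09189038, 0.09236043, 0.09329333,
                   0.09470363, 0.09750811, 0.10305867, 0.11295684, 0.12767181, 0.14647349,
                   0.16744916, 0.18869261, 0.20908784, 0.22828775, 0.2459888,  0.262817,
                   0.27898482, 0.29499955, 0.31033699, 0.32526762, 0.33972489])

# The initial guess steers the optimizer away from the local minimum
# it otherwise falls into with the default starting parameters.
p0 = [0.1, 0.1, 0.1, 0.1]
result, covariance = optimize.curve_fit(sigmoid_function, x_data, y_data, p0)

plt.plot(x_data, y_data, 'o', label='data')
plt.plot(x_data, sigmoid_function(x_data, *result), '-', label='fit with p0')
plt.legend()
plt.show()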
I am trying to fit a 4-parameter logistic regression to a set of data points in Python with scipy's curve_fit. However, the fit is quite bad; see below:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from scipy.optimize import curve_fit
def fit_4pl(x, a, b, c, d):
    return a + (d-a)/(1+np.exp(-b*(x-c)))
x=np.array([2000. , 1000. , 500. , 250. , 125. , 62.5, 2000. , 1000. ,
500. , 250. , 125. , 62.5])
y=np.array([1.2935, 0.9735, 0.7274, 0.3613, 0.1906, 0.104 , 1.3964, 0.9751,
0.6589, 0.353 , 0.1568, 0.0909])
#initial guess
p0 = [y.min(), 0.5, np.median(x), y.max()]
#fit 4pl
p_opt, cov_p = curve_fit(fit_4pl, x, y, p0=p0, method='dogbox')
#get optimized model
a_opt, b_opt, c_opt, d_opt = p_opt
x_model = np.linspace(min(x), max(x), len(x))
y_model = fit_4pl(x_model, a_opt, b_opt, c_opt, d_opt)
#plot
plt.figure()
sns.scatterplot(x=x, y=y, label="measured data")
plt.title(f"Calibration curve of {i}")
sns.lineplot(x=x_model, y=y_model, label='fit')
plt.legend(loc='best')
This gives me the following parameters and graph:
[2.46783333e-01 5.00000000e-01 3.75000000e+02 1.14953333e+00]
Graph of 4PL fit overlaid with measured data
This fit is clearly terrible, but I do not know how to improve this. Please help.
I've tried different initial guesses. This has resulted in either the fit shown above or no fit at all (either a 'Covariance of the parameters could not be estimated' warning or an 'Optimal parameters not found: Number of calls to function has reached maxfev = 1000' error).
I looked at this similar question and the graph produced in the accepted solution is what I'm aiming for. I have attempted to manually assign some bounds but I do not know what I'm doing and there was no discernible improvement.
In the usual nonlinear regression software, an iterative calculation is involved which requires initial values of the parameters to start. The initial values have to be not too far from the unknown correct values.
Possibly the trouble comes from the "guessing" process for the initial values, which might not be good enough.
In order to test this hypothesis, one can try an unusual method of regression which is not iterative and thus doesn't require initial values. This is shown below.
The calculation is straightforward (done with MathCad) and the results are shown below.
One has to take care with the notation, which is not the same in your equation as in the equation above. In order to avoid confusion, capital letters will be used for your equation while lowercase letters are used above. The relationship between them is:
Note that the sign in front of the exponential is changed to be compatible with negative c found above.
Try your nonlinear regression starting with the above values A, B, C, D.
You should find parameter values close to the above values, but not exactly the same. This is due to the fitting criterion (LMSE, LMSRE, LMSAE, or another) implemented in your software, which differs from the criterion used in the above method.
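As a sketch of that last step with scipy (using the data from the question; the p0 numbers below are rough placeholders standing in for the A, B, C, D values obtained by the non-iterative method, which are not reproduced here):

import numpy as np
from scipy.optimize import curve_fit

x = np.array([2000., 1000., 500., 250., 125., 62.5,
              2000., 1000., 500., 250., 125., 62.5])
y = np.array([1.2935, 0.9735, 0.7274, 0.3613, 0.1906, 0.104,
              1.3964, 0.9751, 0.6589, 0.353, 0.1568, 0.0909])

# Same 4PL form as in the question; flip the sign of the exponent if you
# adopt the convention discussed above.
def fit_4pl(x, a, b, c, d):
    return a + (d - a) / (1 + np.exp(-b * (x - c)))

# Placeholder starting values: replace them with the A, B, C, D estimated above.
p0 = [0.05, 0.005, 300.0, 1.5]
p_opt, cov_p = curve_fit(fit_4pl, x, y, p0=p0, maxfev=10000)
print(p_opt)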
For general explanation about the method used above : https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrale
I randomly generated 1000 data points using the weights I know are true for the normal distribution. Now I am trying to minimize the -log likelihood function to estimate the values of sig^2 and the weights. I sort of get the process conceptually, but when I try to code it I'm just lost.
This is my model:
p(y|x, w, sig^2) = N(y|w0+w1x+...+wnx^n, sig^2)
I've been googling for a while now and I've learned that the scipy.optimize.minimize function is good for this, but I can't get it to work right. Every solution I have tried has worked for the example I got the solution from, but I'm unable to extrapolate it to my problem.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

x = np.linspace(0, 1000, num=1000)
data = []
for y in x:
    data.append(np.polyval([.5, 1, 3], y))

# plot to confirm I do have a normal distribution...
data.sort()
pdf = stats.norm.pdf(data, np.mean(data), np.std(data))
plt.plot(data, pdf)
plt.show()

# This is where I am stuck.
logLik = -np.sum(stats.norm.logpdf(data, loc=??, scale=??))
I have found that the equation error(w) = 0.5 * sum_n (poly(x_n, w) - y_n)^2 is relevant for minimizing the error of the weights, which therefore maximizes my likelihood for the weights, but I don't understand how to code this... I have found a similar relationship for sig^2, but have the same problem. Can somebody clarify how to do this to help my curve fitting? Maybe go as far as to post pseudocode I can use?
Yes, implementing likelihood fitting with minimize is tricky; I spent a lot of time on it, which is why I wrapped it. If I may shamelessly plug my own package symfit, your problem can be solved by doing something like this:
from symfit import Parameter, Variable, Likelihood, exp
import numpy as np
# Define the model for an exponential distribution
beta = Parameter()
x = Variable()
model = (1 / beta) * exp(-x / beta)
# Draw 100 samples from an exponential distribution with beta=5.5
data = np.random.exponential(5.5, 100)
# Do the fitting!
fit = Likelihood(model, data)
fit_result = fit.execute()
I have to admit I don't exactly understand your distribution, since I don't understand the role of your w, but perhaps with this code as an example, you'll know how to adapt it.
If not, let me know the full mathematical equation of your model so I can help you further.
For more info check the docs. (For a more technical description of what happens under the hood, read here and here.)
I think there's an issue with your setup. With maximum likelihood, you obtain the parameters that maximize the probability of observing your data (given a certain model). Your model seems to be:

y = w0 + w1*x + ... + wn*x^n + epsilon

where epsilon is N(0, sigma).

So you maximize:

L(w, sigma) = prod_i f(y_i | x_i, w, sigma)

or equivalently take logs to get:

log L(w, sigma) = sum_i log f(y_i | x_i, w, sigma)
The f in this case is the normal probability density function, whose log you can get with stats.norm.logpdf. You should then use scipy.optimize.minimize to minimize the negative of the summation of stats.norm.logpdf evaluated at each of the i points, from 1 to your sample size.
If I've understood you correctly, your code is missing both a y vector and an x vector! Show us a sample of those vectors and I can update my answer to include sample code for estimating the MLE with that data.
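To make that concrete, here is a minimal sketch of the approach with scipy.optimize.minimize on synthetic polynomial data; the true weights, noise level and optimizer settings below are made-up example values, not taken from the question:

import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)

# Synthetic data: y = 0.5*x**2 + 1*x + 3 plus Gaussian noise
true_w = [0.5, 1.0, 3.0]
x = np.linspace(0, 10, 1000)
y = np.polyval(true_w, x) + rng.normal(0, 2.0, size=x.size)

def neg_log_lik(params):
    # params = [w2, w1, w0, log_sigma]; optimizing log(sigma) keeps sigma positive
    w = params[:-1]
    sigma = np.exp(params[-1])
    mu = np.polyval(w, x)
    return -np.sum(stats.norm.logpdf(y, loc=mu, scale=sigma))

start = np.array([1.0, 1.0, 1.0, 0.0])   # uninformed starting guess
result = optimize.minimize(neg_log_lik, start, method="Nelder-Mead",
                           options={"maxiter": 20000, "maxfev": 20000})
w_hat, sigma_hat = result.x[:-1], np.exp(result.x[-1])
print(w_hat, sigma_hat)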
Suppose 'h' is a function of x, y, z and t, and it gives us a simulated graph of (t, h). At the same time we also have an observed graph (observed values of h against t). How can I reduce the difference between the observed and simulated (t, h) graphs by optimizing the values of x, y and z? I want to change the simulated graph so that it matches the observed graph more and more closely, in MATLAB or Python. In the literature I have read that people have done the same thing with the Levenberg-Marquardt algorithm, but I don't know how to do it.
You are actually trying to fit the parameters x,y,z of the parametrized function h(x,y,z;t).
MATLAB
You're right that in MATLAB you should either use lsqcurvefit of the Optimization toolbox, or fit of the Curve Fitting Toolbox (I prefer the latter).
Looking at the documentation of lsqcurvefit:
x = lsqcurvefit(fun,x0,xdata,ydata);
It says in the documentation that you have a model F(x,xdata) with coefficients x and sample points xdata, and a set of measured values ydata. The function returns the least-squares parameter set x, with which your function is closest to the measured values.
Fitting algorithms usually need starting points; some implementations can choose them randomly. In the case of lsqcurvefit, this is what x0 is for. If you have
h = @(x,y,z,t) ... %// actual function here
t_meas = ... %// actual measured times here
h_meas = ... %// actual measured data here
then in the conventions of lsqcurvefit,
fun <--> @(params,t) h(params(1),params(2),params(3),t)
x0 <--> starting guess for [x,y,z]: [x0,y0,z0]
xdata <--> t_meas
ydata <--> h_meas
Your function h(x,y,z,t) should be vectorized in t, such that for vector input in t the return value is the same size as t. Then the call to lsqcurvefit will give you the optimal set of parameters:
x = lsqcurvefit(@(params,t) h(params(1),params(2),params(3),t),[x0,y0,z0],t_meas,h_meas);
h_fit = h(x(1),x(2),x(3),t_meas); %// best guess from curve fitting
Python
In python, you'd have to use the scipy.optimize module, and something like scipy.optimize.curve_fit in particular. With the above conventions you need something along the lines of this:
import scipy.optimize as opt
popt, pcov = opt.curve_fit(lambda t,x,y,z: h(x,y,z,t), t_meas, h_meas, p0=[x0,y0,z0])
Note that the p0 starting array is optional, but all parameters will be set to 1 if it's missing. The result you need is the popt array, containing the optimal values for [x,y,z]:
x,y,z = popt
h_fit = h(x,y,z,t_meas)
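As a self-contained illustration, with a made-up h and synthetic "measured" data (the real function and measurements are not given in the question):

import numpy as np
import scipy.optimize as opt

# Hypothetical model h(x, y, z, t), vectorized in t
def h(x, y, z, t):
    return x * np.exp(-y * t) + z

# Synthetic "measurements" generated from known parameters plus noise
rng = np.random.default_rng(1)
t_meas = np.linspace(0, 5, 50)
h_meas = h(2.0, 1.3, 0.5, t_meas) + rng.normal(0, 0.05, t_meas.size)

# Starting guess [x0, y0, z0]
popt, pcov = opt.curve_fit(lambda t, x, y, z: h(x, y, z, t),
                           t_meas, h_meas, p0=[1.0, 1.0, 0.0])
x_fit, y_fit, z_fit = popt
h_fit = h(x_fit, y_fit, z_fit, t_meas)   # best guess from curve fitting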
I have two lists.
import numpy
x = numpy.array([7250, ... list of 600 ints ... ,7849])
y = numpy.array([2.4*10**-16, ... list of 600 floats ... , 4.3*10**-16])
They make a U shaped curve.
Now I want to fit a gaussian to that curve.
from scipy.optimize import curve_fit
import pylab

n = len(x)
mean = sum(y)/n
sigma = sum(y - mean)**2/n

def gaus(x, a, x0, sigma, c):
    return a*numpy.exp(-(x-x0)**2/(2*sigma**2)) + c
popt, pcov = curve_fit(gaus,x,y,p0=[-1,mean,sigma,-5])
pylab.plot(x,y,'r-')
pylab.plot(x,gaus(x,*popt),'k-')
pylab.show()
I just end up with the noisy original U-shaped curve and a straight horizontal line running through the curve.
I am not sure what the -1 and the -5 represent in the above code but I am sure that I need to adjust them or something else to get the gaussian curve. I have been playing around with possible values but to no avail.
Any ideas?
First of all, your variable sigma is actually variance, i.e. sigma squared --- http://en.wikipedia.org/wiki/Variance#Definition.
This confuses the curve_fit by giving it a suboptimal starting estimate.
Then, your fitting ansatz, gaus, includes an amplitude a and an offset c; is this what you actually need? And the starting values are a = -1 (a negated bell shape) and offset c = -5. Where do they come from?
Here's what I'd do:
Fix your fitting model. Do you want just a Gaussian, and does it need to be normalized? If it does, then the amplitude a is fixed by sigma, etc.
Have a look at the actual data. What's the tail (offset)? What's the sign of the peak (amplitude sign)?
If you actually want just a Gaussian without any bells and whistles, you might not need curve_fit at all: a Gaussian is fully defined by its first two moments, mean and sigma. Calculate them as you do, plot the resulting Gaussian over the data and see if you're not already all set.
p0 in your call to curve_fit gives the initial guesses for the additional parameters of your function, beyond x. In the above code you are saying that you want curve_fit to use -1 as the initial guess for a, -5 as the initial guess for c, mean as the initial guess for x0, and sigma as the guess for sigma. The curve_fit function will then adjust these parameters to try and get a better fit. The problem is that your initial guesses for the function parameters are really bad given the magnitudes of your (x, y) values.
Think a little bit about the order of magnitude of your different parameters for the Gaussian. a should be around the size of your y values (10**-16), as at the peak of the Gaussian the exponential part will never be larger than 1. x0 will give the position within your x values at which the exponential part of your Gaussian will be 1, so x0 should be around 7500, probably somewhere in the centre of your data. sigma indicates the width, or spread, of your Gaussian, so perhaps something in the 100s, just as a guess. Finally, c is just an offset to shift the whole Gaussian up and down.
What I would recommend doing is, before fitting the curve, pick some values for a, x0, sigma, and c that seem reasonable and just plot the data with the Gaussian, and play with a, x0, sigma, and c until you get something that looks at least somewhat like the way you want the Gaussian to fit, then use those as the starting p0 values for curve_fit. The values I gave should get you started, but may not do exactly what you want. For instance, a probably needs to be negative if you want to flip the Gaussian to get a "U" shape.
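A minimal sketch of that eyeballing step (the data here is a synthetic stand-in with roughly the magnitudes discussed above, and the p0 numbers are illustrative guesses, not fitted values):

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def gaus(x, a, x0, sigma, c):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2)) + c

# Synthetic stand-in for the question's 600-point U-shaped data
rng = np.random.default_rng(0)
x = np.linspace(7250, 7849, 600)
y = -4e-16 * np.exp(-(x - 7550)**2 / (2 * 150**2)) + 5e-16 + rng.normal(0, 2e-17, x.size)

# Eyeballed guesses: negative amplitude (U shape), centre near the middle of x,
# width of order 100, offset near the tail values of y.
p0 = [-4e-16, 7550, 150, 5e-16]
plt.plot(x, y, 'r.', label='data')
plt.plot(x, gaus(x, *p0), 'b-', label='initial guess')

popt, pcov = curve_fit(gaus, x, y, p0=p0)
plt.plot(x, gaus(x, *popt), 'k-', label='fit')
plt.legend()
plt.show()
print(popt)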
Also printing out the values that curve_fit thinks are good for your a,x0,sigma, and c might help you see what it is doing and if that function is on the right track to minimizing the residual of the fit.
I have had similar problems doing curve fitting with gnuplot: if the initial values are too far from what you want to fit, it goes in completely the wrong direction with the parameters while minimizing the residuals, and you could probably do better by eye. Think of these functions as a way to fine-tune your by-eye estimates of the parameters.
hope that helps
I don't think you are estimating your initial guesses for mean and sigma correctly.
Take a look at the SciPy Cookbook here
I think it should look like this.
x = numpy.array([7250, ... list of 600 ints ... ,7849])
y = numpy.array([2.4*10**-16, ... list of 600 floats ... , 4.3*10**-16])
n = len(x)
mean = sum(x*y)/sum(y)
sigma = numpy.sqrt(abs(sum((x-mean)**2*y)/sum(y)))

def gaus(x, a, x0, sigma, c):
    return a*numpy.exp(-(x-x0)**2/(2*sigma**2)) + c

popt, pcov = curve_fit(gaus, x, y, p0=[-max(y), mean, sigma, min(x)+((max(x)-min(x)))/2])
pylab.plot(x, gaus(x, *popt))
If anyone has a link to a simple explanation why these are the correct moments I would appreciate it. I am going on faith that SciPy Cookbook got it right.
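For what it's worth, my own reading (not from the Cookbook) is that these expressions simply treat the y values as nonnegative weights on the x positions, i.e. they are the weighted mean and weighted standard deviation of x:

mean  = sum(x_i * y_i) / sum(y_i)
sigma = sqrt( sum((x_i - mean)^2 * y_i) / sum(y_i) )

For data that really is a positive bell curve, the weighted mean sits at the peak and the weighted spread gives its width; for an inverted, offset "U" like the one in this question the weights are not proportional to a Gaussian, which is why the abs() is needed and the result is only a rough starting value.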
Here is the solution, thanks to everyone.
x = numpy.array([7250, ... list of 600 ints ... ,7849])
y = numpy.array([2.4*10**-16, ... list of 600 floats ... , 4.3*10**-16])
import math

n = len(x)
mean = sum(x)/n
sigma = math.sqrt(sum((x-mean)**2)/n)

def gaus(x, a, x0, sigma, c):
    return a*numpy.exp(-(x-x0)**2/(2*sigma**2)) + c

popt, pcov = curve_fit(gaus, x, y, p0=[-max(y), mean, sigma, min(x)+((max(x)-min(x)))/2])
pylab.plot(x, gaus(x, *popt))
Maybe it is because I use MATLAB and fminsearch, or because my fits have to work on far fewer data points (~5-10), but I get much better results with the following starter values (as simple as they are):
a = max(y)-min(y);
imax= find(y==max(y),1);
mean = x(imax);
avg = sum(x.*y)./sum(y);
sigma = sqrt(abs(sum((x-avg).^2.*y) ./ sum(y)));
c = min(y);
The sigma works fine.
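For anyone working in Python rather than MATLAB, a rough translation of those starter values might look like this (assuming x and y are numpy arrays):

import numpy as np

def gaussian_start_values(x, y):
    # Starting guesses (a, x0, sigma, c) for a Gaussian-with-offset fit
    a = y.max() - y.min()              # amplitude from the data range
    x0 = x[np.argmax(y)]               # centre at the location of the maximum
    avg = np.sum(x * y) / np.sum(y)    # weighted mean, used only for sigma
    sigma = np.sqrt(abs(np.sum((x - avg)**2 * y) / np.sum(y)))
    c = y.min()                        # offset from the smallest value
    return a, x0, sigma, c

The returned tuple can then be passed straight to curve_fit as p0.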
I have a question about the fit algorithms used in scipy. In my program, I have a set of x and y data points with y errors only, and want to fit a function
f(x) = (a[0] - a[1])/(1+np.exp(x-a[2])/a[3]) + a[1]
to it.
The problem is that I get absurdly high errors on the parameters, and also different values and errors for the fit parameters, using the two scipy fit routines scipy.odr.ODR (with least-squares algorithm) and scipy.optimize. I'll give my example:
Fit with scipy.odr.ODR, fit_type=2
Beta: [ 11.96765963 68.98892582 100.20926023 0.60793377]
Beta Std Error: [ 4.67560801e-01 3.37133614e+00 8.06031988e+04 4.90014367e+04]
Beta Covariance: [[ 3.49790629e-02 1.14441187e-02 -1.92963671e+02 1.17312104e+02]
[ 1.14441187e-02 1.81859542e+00 -5.93424196e+03 3.60765567e+03]
[ -1.92963671e+02 -5.93424196e+03 1.03952883e+09 -6.31965068e+08]
[ 1.17312104e+02 3.60765567e+03 -6.31965068e+08 3.84193143e+08]]
Residual Variance: 6.24982731975
Inverse Condition #: 1.61472215874e-08
Reason(s) for Halting:
Sum of squares convergence
and then the fit with scipy.optimize.leastsq:
Fit with scipy.optimize.leastsq
beta: [ 11.9671859 68.98445306 99.43252045 1.32131099]
Beta Std Error: [0.195503 1.384838 34.891521 45.950556]
Beta Covariance: [[ 3.82214235e-02 -1.05423284e-02 -1.99742825e+00 2.63681933e+00]
[ -1.05423284e-02 1.91777505e+00 1.27300761e+01 -1.67054172e+01]
[ -1.99742825e+00 1.27300761e+01 1.21741826e+03 -1.60328181e+03]
[ 2.63681933e+00 -1.67054172e+01 -1.60328181e+03 2.11145361e+03]]
Residual Variance: 6.24982904455 (calculated by me)
My point is the third fit parameter. The results are:
scipy.odr.ODR, fit_type=2:
C = 100.209 +/- 80600
scipy.optimize.leastsq:
C = 99.432 +/- 12.730
I don't know why the first error is so much higher. Even better: if I put exactly the same data points with errors into Origin 9, I get
C = x0 = 99.41849 +/- 0.20283
and putting exactly the same data into C++ ROOT (CERN) I get
C = 99.85 +/- 1.373
even though I used exactly the same initial variables for ROOT and Python. Origin doesn't need any.
Do you have any clue why this happens and which is the best result?
I added the code for you at pastebin:
Data
C++ code
Python code: http://pastebin.com/jZVyzMkS
Thank you for helping!
EDIT: here's the plot related to SirJohnFranklin's post:
Did you actually try plotting the ODR and leastsq fits side by side? They look basically identical:
Consider what the parameters correspond to - the step function described by beta[0] and beta[1], the initial and final values, explains by far the majority of the variance in your data. By contrast, small changes in beta[2] and beta[3], the inflexion point and slope, will have comparatively little effect on the overall shape of the curve and therefore the residual variance for the fit. It's therefore no surprise that these parameters have high standard errors, and are fitted slightly differently by the two algorithms.
The overall greater standard errors reported by ODR are due to the fact that this model incorporates errors in the y-values whereas the ordinary least squares fit does not - errors in the measured y-values ought to reduce our confidence in the estimated fit parameters.
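For reference, here is a minimal sketch of running both fits on the same data so the parameters and standard errors can be compared directly; the model and data below are synthetic stand-ins shaped like the question's step curve, not the original measurements:

import numpy as np
from scipy import odr
from scipy.optimize import curve_fit

def step_model(beta, x):
    # Logistic step, similar in shape to the f(x) in the question
    return (beta[0] - beta[1]) / (1 + np.exp((x - beta[2]) / beta[3])) + beta[1]

# Synthetic data: a step between two plateaus with known y errors
rng = np.random.default_rng(0)
x = np.linspace(80, 120, 60)
y_err = np.full_like(x, 0.5)
y = step_model([12.0, 69.0, 100.0, 1.0], x) + rng.normal(0, 0.5, x.size)

beta0 = [10.0, 60.0, 95.0, 2.0]

# ODR with fit_type=2 (explicit least squares, x errors ignored)
odr_job = odr.ODR(odr.RealData(x, y, sy=y_err), odr.Model(step_model), beta0=beta0)
odr_job.set_job(fit_type=2)
odr_out = odr_job.run()
print("ODR:      ", odr_out.beta, odr_out.sd_beta)

# curve_fit with the same y weighting
popt, pcov = curve_fit(lambda x, *b: step_model(b, x), x, y,
                       p0=beta0, sigma=y_err, absolute_sigma=True)
print("curve_fit:", popt, np.sqrt(np.diag(pcov)))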
(Sadly, I can't upload the fit, because I need more reputation. I'll give the plot to Captain Sandwich so he can upload it for me.)
I'm in the same workgroup as the person who started the thread, but I did this plot.
So, I added x errors to the data, because I hadn't got that far last time. The error obtained through the ODR is still absurdly high (4.18550164e+04 on beta[2]). In the plot, I show you what the fit from ROOT (CERN) gives, now with x and y errors. Here, x0 is beta[2].
The red and the green curves have different beta values: the left one uses the fitted value minus the fit error of 3.430 obtained by ROOT, and the right one the fitted value plus that error. I think this makes total sense, much more than the error of 0.2 given by the Origin 9 fit (which can only handle y errors, I think) or the error of about 40k given by the ODR, which also includes x and y errors.
Maybe, because ROOT is mostly used by astrophysicists who need very robust fitting algorithms, it can handle much more difficult fits; but I don't know enough about the robustness of fitting algorithms.