Optimization/curve fitting - python
I'm fitting an epidemiological (SIR) model to COVID-19 case data. The code is shown below.
The model has 3 parameters: N, beta and gamma.
How can I optimize these to get the best model fit, as measured by 'error'?
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import odeint
# Data
t = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158])
c = np.array([111.0,138.3,135.3,143.7,132.0,114.1,95.9,76.4,47.9,56.9,53.6,51.4,56.6,66.4,67.4,71.1,63.3,74.0,83.7,95.1,94.4,95.9,111.4,143.7,140.7,146.9,150.0,161.3,192.6,205.3,189.6,198.3,213.3,218.7,245.0,242.0,247.3,282.1,319.6,346.7,368.0,351.1,369.3,420.6,440.6,461.7,490.3,539.7,609.4,643.9,661.3,705.0,785.7,825.9,835.7,847.0,914.0,943.0,998.3,1009.7,1026.0,1009.1,1133.4,1180.9,1302.1,1374.9,1442.9,1534.6,1649.7,1655.4,1912.7,2027.7,2143.7,2228.9,2360.3,2454.1,2556.6,2539.4,2641.9,2823.3,3107.6,3236.3,3334.7,3570.1,3617.4,3684.0,3849.3,3894.9,4008.1,4253.4,4483.4,4926.4,5267.9,5588.4,5833.1,6096.3,6443.0,6791.0,7098.0,7504.9,8025.3,8373.7,8779.6,9235.1,9333.1,10039.7,10509.0,10886.7,11356.0,11725.0,11776.7,12340.6,12268.9,12415.3,12385.0,12583.7,12261.7,11929.4,11985.6,11975.9,12057.4,11903.0,11586.4,11271.6,11137.6,10882.1,10588.1,10169.6,9870.0,9436.0,9190.4,8793.9,8393.4,8002.1,7470.4,7128.3,6910.6,6676.6,6398.7,5577.4,4954.4,4809.1,4352.1,3926.6,3755.4,3719.3,3877.3,3867.9,3456.9,3341.7,3204.0,3080.6,2981.9,2805.9,2620.9,2399.1,2215.1,2183.3])
plt.title(r'Daily cases - SIR fit: N=42 000, $\beta=0.148$, $\gamma=0.05$')
plt.xlabel('Days')
plt.ylabel('Cases (7-d moving average)')
plt.plot(t, c, ".")
# Model
# Total population, N.
N = 42000
# Initial number of infected and recovered individuals, I0 and R0.
I0, R0 = 1, 0
# Everyone else, S0, is susceptible to infection initially.
S0 = N - I0 - R0
# Contact rate, beta, and mean recovery rate, gamma (in 1/days).
beta, gamma = .148, 1/20
# The SIR model differential equations.
def deriv(y, t, N, beta, gamma):
    S, I, R = y
    dSdt = -beta * S * I / N
    dIdt = beta * S * I / N - gamma * I
    dRdt = gamma * I
    return dSdt, dIdt, dRdt
# Initial conditions vector
y0 = S0, I0, R0
# Integrate the SIR equations over time t.
ret = odeint(deriv, y0, t, args=(N, beta, gamma))
S, I, R = ret.T
plt.plot(t, I, color='red', linewidth=2)
# plt.savefig('fitted.pdf', dpi=300, bbox_inches='tight')
error = sum(I - c)
from scipy.optimize import minimize
def fn(x):
    # parameters unwrapped
    N, beta, gamma = x
    # Initial number of infected and recovered individuals, I0 and R0.
    I0, R0 = 1, 0
    # Everyone else, S0, is susceptible to infection initially.
    S0 = N - I0 - R0
    # Initial conditions vector
    y0 = S0, I0, R0
    # Integrate the SIR equations over time t.
    ret = odeint(deriv, y0, t, args=(N, beta, gamma))
    S, I, R = ret.T
    # sum of absolute errors, so positive and negative residuals cannot cancel
    error = np.sum(np.abs(I - c))
    return error
# initialise with current best guess
init_x = [42000, .148, 1/20]
# calculate result
res = minimize(fn, init_x, method='Nelder-Mead', tol=1e-6)
# calculate final error
print(fn(res.x))
Since you have a custom objective function, a simple way to minimize it is to call SciPy's minimize function. minimize expects an objective that takes a single argument, so the three parameters are packed into one list, as in the example above. You can then read the optimal parameters off res.x.
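For instance, a minimal follow-up sketch (assuming the fit converged; the _opt variable names are new here) that re-solves the ODE with the fitted parameters and overlays the optimized curve on the plot from the question:

N_opt, beta_opt, gamma_opt = res.x
# re-integrate the SIR model with the optimized parameters
S_opt, I_opt, R_opt = odeint(deriv, (N_opt - 1, 1, 0), t, args=(N_opt, beta_opt, gamma_opt)).T
plt.plot(t, I_opt, color='green', linewidth=2)
plt.show()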
I suggest you take a look at SciPy's optimize.curve_fit function. It lets you define a model function that is fitted directly to your data, and it returns both the optimal parameters and their covariance matrix.
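As a minimal sketch of that approach for the SIR model above (the wrapper function sir_infected is an assumption I am introducing; it reuses deriv, t and c from the question):

from scipy.optimize import curve_fit

def sir_infected(t, N, beta, gamma):
    # solve the SIR system and return only the infected curve I(t)
    y0 = N - 1, 1, 0  # S0, I0, R0 with a single initial infection
    S, I, R = odeint(deriv, y0, t, args=(N, beta, gamma)).T
    return I

popt, pcov = curve_fit(sir_infected, t, c, p0=[42000, .148, 1/20])
perr = np.sqrt(np.diag(pcov))  # one-standard-deviation parameter uncertainties

Note that curve_fit minimizes the sum of squared residuals, whereas fn above minimizes the sum of absolute errors, so the two approaches will generally return slightly different parameters.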
Related
Resonance frequency with python curve-fit, error maxfev reached
I have measured a voltage over an LCR tank circuit (with unknown components) to determine the resonance frequency. I have performed a broad frequency sweep and measured the voltages. Now I want to determine the exact location of the resonance peak by fitting the data. The curve looks like a damped, driven harmonic oscillator, so I used the following function to fit the data: A = F0 / sqrt((k - m*w^2)^2 + (b*w)^2). This is the code I have for now, but I get the following error: "Optimal parameters not found: Number of calls to function has reached maxfev = 5000."

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def fit(f, F0, k, m, b):
    w = 2 * np.pi * f
    return F0 / np.sqrt((k - m*w**2)**2 + (b*w)**2)

fuData = np.loadtxt("ohlVW.txt", delimiter=',')
fuData = fuData[fuData[:, 0].argsort()]
f = fuData[:, 0]
U = fuData[:, 1]

popt, _ = curve_fit(fit, f, U, maxfev=5000)
F0, k, m, b = popt
print(popt)

plt.scatter(f, U)
x_line = np.arange(min(f), max(f), 1)
y_line = fit(x_line, F0, k, m, b)
plt.figure()
plt.plot(f, U)
plt.plot(x_line, y_line, '--', color='red')
plt.show()

Increasing maxfev did not work. How can I adjust the code to get a nice fit over the data?
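A likely culprit, rather than maxfev itself, is that curve_fit starts from the default initial guess of 1.0 for every parameter, which can be very far from the solution for a resonance curve. A hedged sketch of seeding the fit with rough, data-driven starting values (the specific estimates below are illustrative assumptions, not values from the original post):

# rough starting values: resonance near the highest measured voltage,
# m normalised to 1, light damping; these are guesses to seed the search
w_peak = 2 * np.pi * f[np.argmax(U)]
p0 = [max(U) * w_peak**2, w_peak**2, 1.0, 0.05 * w_peak]  # F0, k, m, b
popt, pcov = curve_fit(fit, f, U, p0=p0, maxfev=5000)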
How to calculate "relative error in the sum of squares" and "relative error in the approximate solution" from least squares method?
I have implemented a 3D Gaussian fit using scipy.optimize.leastsq and now I would like to tweak the arguments ftol and xtol to optimize the performance. However, I don't understand the "units" of these two parameters, which makes it hard to choose them properly. Is it possible to calculate these two parameters from the results? That would give me an understanding of how to choose them. My data is numpy arrays of np.uint8. I tried to read the FORTRAN source code of MINPACK, but my FORTRAN knowledge is zero. I also read up on the Levenberg-Marquardt algorithm, but I could not really get a number that was below ftol, for example. Here is a minimal example of what I do:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import leastsq

class gaussian_model:
    def __init__(self):
        self.prev_iter_model = None
        self.f_vals = []

    def gaussian_1D(self, coeffs, xx):
        A, sigma, mu = coeffs
        # Center rotation around peak center
        x0 = xx - mu
        model = A*np.exp(-(x0**2)/(2*(sigma**2)))
        return model

    def residuals(self, coeffs, I_obs, xx, model_func):
        model = model_func(coeffs, xx)
        residuals = I_obs - model
        if self.prev_iter_model is not None:
            self.f = np.sum(((model - self.prev_iter_model)/model)**2)
            self.f_vals.append(self.f)
        self.prev_iter_model = model
        return residuals

# x data
x_start = 1
x_stop = 10
num = 100
xx, dx = np.linspace(x_start, x_stop, num, retstep=True)

# Simulated data with some noise
A, s_x, mu = 10, 0.5, 3
coeffs = [A, s_x, mu]
model = gaussian_model()
yy = model.gaussian_1D(coeffs, xx)
noise_ampl = 0.5
noise = np.random.normal(0, noise_ampl, size=num)
yy += noise

# LM least squares
initial_guess = [1, 1, 1]
pred_coeffs, cov_x, info, mesg, ier = leastsq(model.residuals, initial_guess,
                                              args=(yy, xx, model.gaussian_1D),
                                              ftol=1E-6, full_output=True)

yy_fit = model.gaussian_1D(pred_coeffs, xx)

rel_SSD = np.sum(((yy - yy_fit)/yy)**2)
RMS_SSD = np.sqrt(rel_SSD/num)

print(RMS_SSD)
print(model.f)
print(model.f_vals)

fig, ax = plt.subplots(1, 2)
# Plot results
ax[0].scatter(xx, yy)
ax[0].plot(xx, yy_fit, c='r')
ax[1].scatter(range(len(model.f_vals)), model.f_vals, c='r')
# ax[1].set_ylim(0, 1E-6)
plt.show()

rel_SSD is around 1 and definitely not something below ftol = 1E-6.

EDIT: Based on @user12750353's answer below, I updated my minimal example to try to recreate how lmdif determines termination with ftol. The problem is that my f_vals are too small, so they are not the right values. The reason I would like to recreate this is that I would like to see what kind of numbers I am getting in my main code, to decide on an ftol that would terminate the fitting process earlier.
Since you are giving a function without the gradient, the method called is lmdif. Instead of gradients it will use a forward-difference gradient estimate, f(x + delta) - f(x) ~ delta * df(x)/dx (I will write as if the parameter were scalar). In the MINPACK source you find the following description:

c       ftol is a nonnegative input variable. termination
c         occurs when both the actual and predicted relative
c         reductions in the sum of squares are at most ftol.
c         therefore, ftol measures the relative error desired
c         in the sum of squares.
c
c       xtol is a nonnegative input variable. termination
c         occurs when the relative error between two consecutive
c         iterates is at most xtol. therefore, xtol measures the
c         relative error desired in the approximate solution.

Looking in the code, the actual reduction actred = 1 - (fnorm1/fnorm)**2 is what you calculated for rel_SSD, but between the two last iterations, not between the fitted function and the target points.

Example

The problem here is that we need to discover what values the internal variables assume. An attempt to do so is to save the coefficients and the residual norm every time the function is called, as follows.

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import leastsq

class gaussian_model:
    def __init__(self):
        self.prev_iter_model = None
        self.fnorm = []
        self.x = []

    def gaussian_1D(self, coeffs, xx):
        A, sigma, mu = coeffs
        # Center rotation around peak center
        x0 = xx - mu
        model = A*np.exp(-(x0**2)/(2*(sigma**2)))
        grad = np.array([
            model / A,
            model * x0**2 / (sigma**3),
            model * 2 * x0 / (2*(sigma**2))
        ]).transpose()
        return model, grad

    def residuals(self, coeffs, I_obs, xx, model_func):
        model, grad = model_func(coeffs, xx)
        residuals = I_obs - model
        self.x.append(np.copy(coeffs))
        self.fnorm.append(np.sqrt(np.sum(residuals**2)))
        return residuals

    def grad(self, coeffs, I_obs, xx, model_func):
        model, grad = model_func(coeffs, xx)
        residuals = I_obs - model
        return -grad

    def plot_progress(self):
        x = np.array(self.x)
        dx = np.sqrt(np.sum(np.diff(x, axis=0)**2, axis=1))
        plt.plot(dx / np.sqrt(np.sum(x[1:, :]**2, axis=1)))
        fnorm = np.array(self.fnorm)
        plt.plot(1 - (fnorm[1:]/fnorm[:-1])**2)
        plt.legend([r'$||\Delta x||$', r'$||\Delta f||$'], loc='upper left')

# x data
x_start = 1
x_stop = 10
num = 100
xx, dx = np.linspace(x_start, x_stop, num, retstep=True)

# Simulated data with some noise
A, s_x, mu = 10, 0.5, 3
coeffs = [A, s_x, mu]
model = gaussian_model()
yy, _ = model.gaussian_1D(coeffs, xx)
noise_ampl = 0.5
noise = np.random.normal(0, noise_ampl, size=num)
yy += noise

Then we can see the relative variation of x and f:

initial_guess = [1, 1, 1]
pred_coeffs, cov_x, info, mesg, ier = leastsq(model.residuals, initial_guess,
                                              args=(yy, xx, model.gaussian_1D),
                                              xtol=1e-6, ftol=1e-6, full_output=True)

plt.figure(figsize=(14, 6))
plt.subplot(121)
model.plot_progress()
plt.yscale('log')
plt.grid()
plt.subplot(122)
yy_fit, _ = model.gaussian_1D(pred_coeffs, xx)
# Plot results
plt.scatter(xx, yy)
plt.plot(xx, yy_fit, c='r')
plt.show()

The problem with this is that the function is evaluated both to compute f and to compute the gradient of f. To produce a cleaner plot, you can pass Dfun so that func is evaluated only once per iteration.
# x data
x_start = 1
x_stop = 10
num = 100
xx, dx = np.linspace(x_start, x_stop, num, retstep=True)

# Simulated data with some noise
A, s_x, mu = 10, 0.5, 3
coeffs = [A, s_x, mu]
model = gaussian_model()
yy, _ = model.gaussian_1D(coeffs, xx)
noise_ampl = 0.5
noise = np.random.normal(0, noise_ampl, size=num)
yy += noise

# LM least squares, now with an analytic Jacobian supplied via Dfun
initial_guess = [1, 1, 1]
pred_coeffs, cov_x, info, mesg, ier = leastsq(model.residuals, initial_guess,
                                              args=(yy, xx, model.gaussian_1D),
                                              Dfun=model.grad,
                                              xtol=1e-6, ftol=1e-6, full_output=True)

plt.figure(figsize=(14, 6))
plt.subplot(121)
model.plot_progress()
plt.yscale('log')
plt.grid()
plt.subplot(122)
yy_fit, _ = model.gaussian_1D(pred_coeffs, xx)
# Plot results
plt.scatter(xx, yy)
plt.plot(xx, yy_fit, c='r')
plt.show()

That said, the value I am obtaining for xtol is not exactly what is in the lmdif implementation.
Finding alpha and beta of beta-binomial distribution with scipy.optimize and loglikelihood
A distribution is beta-binomial if p, the probability of success, in a binomial distribution has a beta distribution with shape parameters α > 0 and β > 0. The shape parameters define the probability of success. I want to find the values for α and β that best describe my data from the perspective of a beta-binomial distribution. My dataset players consists of data about the number of hits (H), the number of at-bats (AB) and the conversion (H / AB) of a lot of baseball players. I estimate the PDF with the help of JulienD's answer in Beta Binomial Function in Python:

from scipy.special import beta, comb

pdf = comb(n, k) * beta(k + a, n - k + b) / beta(a, b)

Next, I write a log-likelihood function that we will minimize:

def loglike_betabinom(params, *args):
    """
    Negative log-likelihood function for the beta-binomial distribution.

    :param params: list of parameters to be fitted.
    :param args: 2-element array containing the sample data.
    :return: negative log-likelihood to be minimized.
    """
    a, b = params[0], params[1]
    k = args[0]  # the conversion rate
    n = args[1]  # the number of at-bats (AB)
    pdf = comb(n, k) * beta(k + a, n - k + b) / beta(a, b)
    return -1 * np.log(pdf).sum()

Now, I want to write a function that minimizes loglike_betabinom:

from scipy.optimize import minimize

init_params = [1, 10]
res = minimize(loglike_betabinom, x0=init_params,
               args=(players['H'] / players['AB'], players['AB']),
               bounds=bounds, method='L-BFGS-B',
               options={'disp': True, 'maxiter': 250})
print(res.x)

The result is [-6.04544138 2.03984464], which implies that α is negative, which is not possible. I based my script on the following R snippet, which returns [101.359, 287.318]:

ll <- function(alpha, beta) {
  x <- career_filtered$H
  total <- career_filtered$AB
  -sum(VGAM::dbetabinom.ab(x, total, alpha, beta, log = TRUE))
}

m <- mle(ll, start = list(alpha = 1, beta = 10), method = "L-BFGS-B",
         lower = c(0.0001, 0.1))
ab <- coef(m)

Can someone tell me what I am doing wrong? Help is much appreciated!
One thing to pay attention to is that comb(n, k) in your log-likelihood might not be well-behaved numerically for the values of n and k in your dataset. You can verify this by applying comb to your data and seeing if infs appear. One way to amend things could be to rewrite the negative log-likelihood as suggested in https://stackoverflow.com/a/32355701/4240413, i.e. as a function of logarithms of Gamma functions, as in:

from scipy.special import gammaln
import numpy as np

def loglike_betabinom(params, *args):
    a, b = params[0], params[1]
    k = args[0]  # the OVERALL number of hits, not the conversion rate
    n = args[1]  # the number of at-bats (AB)
    logpdf = gammaln(n+1) + gammaln(k+a) + gammaln(n-k+b) + gammaln(a+b) - \
             (gammaln(k+1) + gammaln(n-k+1) + gammaln(a) + gammaln(b) + gammaln(n+a+b))
    return -np.sum(logpdf)

You can then minimize the log-likelihood with:

from scipy.optimize import minimize

init_params = [1, 10]
# note that I am putting 'H' in the args
res = minimize(loglike_betabinom, x0=init_params,
               args=(players['H'], players['AB']),
               method='L-BFGS-B', options={'disp': True, 'maxiter': 250})
print(res)

and that should give reasonable results. You could check How to properly fit a beta distribution in python? for inspiration if you want to rework your code further.
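If the optimizer still strays into non-positive parameter values, L-BFGS-B also accepts bounds, mirroring the lower = c(0.0001, 0.1) argument in the R snippet; a minimal sketch reusing the call above:

bounds = [(0.0001, None), (0.1, None)]  # lower bounds for alpha and beta, as in the R code
res = minimize(loglike_betabinom, x0=init_params,
               args=(players['H'], players['AB']),
               bounds=bounds, method='L-BFGS-B',
               options={'disp': True, 'maxiter': 250})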
"'float' is not subscriptable" in odeint
I'm trying to implement coupled differential equations in Python, and as a new user I seem to be stuck at something. I used this tutorial as a guide to how to solve my ODEs, and looked into the documentation to no avail. This is where I define the function:

def Burnout(t, y, m, nu, S0, V, delta, mu):
    S = y[0]
    E = [0 for i in range(0, m)]
    dEdt = [0 for i in range(0, m)]
    for i in range(0, m):
        E.append(y[i+1])
    P = y[m+1]
    dSdt = -nu*S*P*(S/S0)**V
    dEdt.append(nu*S*P*(S/S0)**V - m*delta*E[0])
    for i in range(1, m):
        dEdt.append(m*delta*E[i-1] - m*delta*E[i])
    dPdt = m*delta*E[m-1] - mu*P
    return [dSdt, *dEdt[0:m], dPdt]

Then, as in the tutorial, I define the initial conditions:

S0 = N
y0.append(S0)
for i in range(0, m):
    E.append(0)
    y0.append(E[i])
P0 = Z
y0.append(P0)

where N and Z are previously defined and E was an empty list. When I finally call odeint(Burnout, y0, t, args=p), I get a 'float' object is not subscriptable error pointing to my definition of S in my Burnout function. As I passed a list to odeint, I'm confused about why Python says I passed a float. Does anyone see what I did wrong? Thanks in advance!

EDIT: OK, here is a minimal, complete and verifiable example that gives me the same error:

import numpy as np
from scipy.integrate import odeint

def Burnout(t, y, m, nu, S0, V, delta, mu):
    S = y[0]
    E = [0 for i in range(0, m)]
    dEdt = [0 for i in range(0, m)]
    for i in range(0, m):
        E.append(y[i+1])
    P = y[m+1]
    dSdt = -nu*S*P*(S/S0)**V
    dEdt.append(nu*S*P*(S/S0)**V - m*delta*E[0])
    for i in range(1, m):
        dEdt.append(m*delta*E[i-1] - m*delta*E[i])
    dPdt = m*delta*E[m-1] - mu*P
    return [dSdt, *dEdt[0:m], dPdt]

V = 2.97
m = 26
delta = 1/6
mu = 1
nu = 10
S0 = 5

t = np.linspace(0, 56, 100)
y = [10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,100]
p = (m, nu, V, S0, delta, mu)

print(odeint(Burnout, y, t, args=p))
You ordered the arguments in your ODE definition wrong. It is possible to have t before y, but then you must set tfirst=True (see the docs). Swapping the arguments in your definition of Burnout fixes the problem for me:

def Burnout(y, t, m, nu, S0, V, delta, mu):
    # ...
    # rest of function
    # ...

Alternatively, you can pass the additional keyword tfirst in the odeint call:

odeint(Burnout, y, t, args=p, tfirst=True)
Fitting data to system of ODEs using Python via Scipy & Numpy
I am having some trouble translating my MATLAB code into Python via SciPy & NumPy. I am stuck on how to find optimal parameter values (k0 and k1) for my system of ODEs to fit to my ten observed data points. I currently have an initial guess for k0 and k1. In MATLAB, I can use something called fminsearch, which is a function that takes the system of ODEs, the observed data points, and the initial values of the system of ODEs. It will then calculate a new pair of parameters k0 and k1 that will fit the observed data. I have included my code to see if you can help me implement some kind of fminsearch to find the optimal parameter values k0 and k1 that will fit my data. I want to add whatever code does this to my lsqtest.py file.

I have three .py files - ode.py, lsq.py, and lsqtest.py.

ode.py:

def f(y, t, k):
    return (-k[0]*y[0], k[0]*y[0]-k[1]*y[1], k[1]*y[1])

lsq.py:

import numpy as np
from scipy import integrate
import ode

def lsq(teta, y0, data):
    # INPUT:  teta, the unknowns k0, k1
    #         data, observed
    #         y0, initial values needed by the ODE
    # OUTPUT: lsq value
    t = np.linspace(0, 9, 10)
    y_obs = data  # data points
    k = [0, 0]
    k[0] = teta[0]
    k[1] = teta[1]
    # call the ODE solver to get the states:
    r = integrate.odeint(ode.f, y0, t, args=(k,))  # the ODE system in ode.py
    # at each row (time point), y_cal has the values of the components [A, B, C]
    y_cal = r[:, 1]  # separate the measured B
    # compute the expression to be minimized:
    return sum((y_obs - y_cal)**2)

lsqtest.py:

import numpy as np
import lsq

if __name__ == '__main__':
    teta = [0.2, 0.3]  # guess for parameter values k0 and k1
    y0 = [1, 0, 0]     # initial conditions for system
    # observed data points
    y = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])
    data = y
    resid = lsq.lsq(teta, y0, data)
    print(resid)
For these kinds of fitting tasks you could use the package lmfit. The plot produced by the code below shows that the data are reproduced very well. For now, I fixed the initial concentrations; you could also set them as variables if you like (just remove the vary=False in the code below). The parameters you obtain are:

[[Variables]]
    x10:   5 (fixed)
    x20:   0 (fixed)
    x30:   0 (fixed)
    k0:    0.12183301 +/- 0.005909 (4.85%) (init= 0.2)
    k1:    0.77583946 +/- 0.026639 (3.43%) (init= 0.3)
[[Correlations]] (unreported correlations are < 0.100)
    C(k0, k1)                    =  0.809

The code that produces the fit and the plot looks like this (some explanation can be found in the inline comments):

import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from lmfit import minimize, Parameters, report_fit

def f(y, t, paras):
    """
    Your system of differential equations
    """
    x1 = y[0]
    x2 = y[1]
    x3 = y[2]

    try:
        k0 = paras['k0'].value
        k1 = paras['k1'].value
    except KeyError:
        k0, k1 = paras

    # the model equations
    f0 = -k0 * x1
    f1 = k0 * x1 - k1 * x2
    f2 = k1 * x2
    return [f0, f1, f2]

def g(t, x0, paras):
    """
    Solution to the ODE x'(t) = f(t, x, k) with initial condition x(0) = x0
    """
    x = odeint(f, x0, t, args=(paras,))
    return x

def residual(paras, t, data):
    """
    Compute the residual between actual data and fitted data
    """
    x0 = paras['x10'].value, paras['x20'].value, paras['x30'].value
    model = g(t, x0, paras)
    # you only have data for one of your variables
    x2_model = model[:, 1]
    return (x2_model - data).ravel()

# initial conditions
x10 = 5.
x20 = 0
x30 = 0
y0 = [x10, x20, x30]

# measured data
t_measured = np.linspace(0, 9, 10)
x2_measured = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])

plt.figure()
plt.scatter(t_measured, x2_measured, marker='o', color='b', label='measured data', s=75)

# set parameters including bounds; you can also fix parameters (use vary=False)
params = Parameters()
params.add('x10', value=x10, vary=False)
params.add('x20', value=x20, vary=False)
params.add('x30', value=x30, vary=False)
params.add('k0', value=0.2, min=0.0001, max=2.)
params.add('k1', value=0.3, min=0.0001, max=2.)

# fit model
result = minimize(residual, params, args=(t_measured, x2_measured), method='leastsq')  # leastsq nelder
# check results of the fit
data_fitted = g(np.linspace(0., 9., 100), y0, result.params)

# plot fitted data
plt.plot(np.linspace(0., 9., 100), data_fitted[:, 1], '-', linewidth=2, color='red', label='fitted data')
plt.legend()
plt.xlim([0, max(t_measured)])
plt.ylim([0, 1.1 * max(data_fitted[:, 1])])
# display fitted statistics
report_fit(result)
plt.show()

If you have data for additional variables, you can simply update the function residual.
The following worked for me:

import pylab as pp
import numpy as np
from scipy import integrate, interpolate
from scipy import optimize

# initialize the data
x_data = np.linspace(0, 9, 10)
y_data = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])

def f(y, t, k):
    """define the ODE system in terms of
        dependent variable y,
        independent variable t, and
        optional parameters, in this case a single variable k"""
    return (-k[0]*y[0], k[0]*y[0]-k[1]*y[1], k[1]*y[1])

def my_ls_func(x, teta):
    """definition of function for LS fit
        x gives evaluation points,
        teta is an array of parameters to be varied for fit"""
    # create an alias to f which passes the optional params
    f2 = lambda y, t: f(y, t, teta)
    # calculate ode solution, return values for each entry of "x"
    r = integrate.odeint(f2, y0, x)
    # in this case, we only need one of the dependent variable values
    return r[:, 1]

def f_resid(p):
    """function to pass to optimize.leastsq
        The routine will square and sum the values returned by this function"""
    return y_data - my_ls_func(x_data, p)

# solve the system - the solution is in variable c
guess = [0.2, 0.3]  # initial guess for params
y0 = [1, 0, 0]      # initial conditions for ODEs
(c, kvg) = optimize.leastsq(f_resid, guess)  # get params
print("parameter values are ", c)

# fit ODE results to an interpolating spline just for fun
xeval = np.linspace(min(x_data), max(x_data), 30)
gls = interpolate.UnivariateSpline(xeval, my_ls_func(xeval, c), k=3, s=0)

# pick a few more points for a very smooth curve, then plot
# data and curve fit
xeval = np.linspace(min(x_data), max(x_data), 200)
# plot of the data as red dots and fit as blue line
pp.plot(x_data, y_data, '.r', xeval, gls(xeval), '-b')
pp.xlabel('xlabel', {"fontsize": 16})
pp.ylabel('ylabel', {"fontsize": 16})
pp.legend(('data', 'fit'), loc=0)
pp.show()
Look at the scipy.optimize module. Its minimize function plays the same role as fminsearch, and with method='Nelder-Mead' it uses the same simplex algorithm that fminsearch is based on.
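A minimal sketch of that route, assuming the lsq function and data from the question's files are importable (the lambda wrapper is needed because minimize passes a single parameter vector to the objective):

import numpy as np
from scipy import optimize
import lsq

data = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])
teta0 = [0.2, 0.3]  # initial guess for k0 and k1, as in lsqtest.py
y0 = [1, 0, 0]      # initial conditions for the ODE system
res = optimize.minimize(lambda teta: lsq.lsq(teta, y0, data), teta0, method='Nelder-Mead')
print(res.x)        # fitted k0 and k1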
# cleaned up a bit to get my head around it - thanks for sharing
import pylab as pp
import numpy as np
from scipy import integrate, optimize

class Parameterize_ODE():
    def __init__(self):
        self.X = np.linspace(0, 9, 10)
        self.y = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])
        self.y0 = [1, 0, 0]  # initial conditions for the ODEs

    def ode(self, y, X, p):
        return (-p[0]*y[0], p[0]*y[0]-p[1]*y[1], p[1]*y[1])

    def model(self, X, p):
        return integrate.odeint(self.ode, self.y0, X, args=(p,))

    def f_resid(self, p):
        return self.y - self.model(self.X, p)[:, 1]

    def optim(self, p_guess):
        return optimize.leastsq(self.f_resid, p_guess)  # fit params

po = Parameterize_ODE()
p_guess = [0.2, 0.3]
c, kvg = po.optim(p_guess)

# --- show ---
print("parameter values are ", c, kvg)
x = np.linspace(min(po.X), max(po.X), 2000)
pp.plot(po.X, po.y, '.r', x, po.model(x, c)[:, 1], '-b')
pp.xlabel('X', {"fontsize": 16})
pp.ylabel('y', {"fontsize": 16})
pp.legend(('data', 'fit'), loc=0)
pp.show()