Optimization/curve fitting - python
I'm fitting an epidemiological (SIR) model to COVID-19 case data. The code is shown below.
The model has 3 parameters: N, beta and gamma.
How can I optimize these to get the best model fit, as measured by 'error'?
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import odeint
# Data
t = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158])
c = np.array([111.0,138.3,135.3,143.7,132.0,114.1,95.9,76.4,47.9,56.9,53.6,51.4,56.6,66.4,67.4,71.1,63.3,74.0,83.7,95.1,94.4,95.9,111.4,143.7,140.7,146.9,150.0,161.3,192.6,205.3,189.6,198.3,213.3,218.7,245.0,242.0,247.3,282.1,319.6,346.7,368.0,351.1,369.3,420.6,440.6,461.7,490.3,539.7,609.4,643.9,661.3,705.0,785.7,825.9,835.7,847.0,914.0,943.0,998.3,1009.7,1026.0,1009.1,1133.4,1180.9,1302.1,1374.9,1442.9,1534.6,1649.7,1655.4,1912.7,2027.7,2143.7,2228.9,2360.3,2454.1,2556.6,2539.4,2641.9,2823.3,3107.6,3236.3,3334.7,3570.1,3617.4,3684.0,3849.3,3894.9,4008.1,4253.4,4483.4,4926.4,5267.9,5588.4,5833.1,6096.3,6443.0,6791.0,7098.0,7504.9,8025.3,8373.7,8779.6,9235.1,9333.1,10039.7,10509.0,10886.7,11356.0,11725.0,11776.7,12340.6,12268.9,12415.3,12385.0,12583.7,12261.7,11929.4,11985.6,11975.9,12057.4,11903.0,11586.4,11271.6,11137.6,10882.1,10588.1,10169.6,9870.0,9436.0,9190.4,8793.9,8393.4,8002.1,7470.4,7128.3,6910.6,6676.6,6398.7,5577.4,4954.4,4809.1,4352.1,3926.6,3755.4,3719.3,3877.3,3867.9,3456.9,3341.7,3204.0,3080.6,2981.9,2805.9,2620.9,2399.1,2215.1,2183.3])
plt.title(r'Daily cases - SIR fit: N=42 000, $\beta=0.148$, $\gamma=0.05$')
plt.xlabel('Days')
plt.ylabel('Cases (7-d moving average)')
plt.plot(t, c, ".")
# Model
# Total population, N.
N = 42000
# Initial number of infected and recovered individuals, I0 and R0.
I0, R0 = 1, 0
# Everyone else, S0, is susceptible to infection initially.
S0 = N - I0 - R0
# Contact rate, beta, and mean recovery rate, gamma (in 1/days).
beta, gamma = .148, 1/20
# The SIR model differential equations.
def deriv(y, t, N, beta, gamma):
    S, I, R = y
    dSdt = -beta * S * I / N
    dIdt = beta * S * I / N - gamma * I
    dRdt = gamma * I
    return dSdt, dIdt, dRdt
# Initial conditions vector
y0 = S0, I0, R0
# Integrate the SIR equations over time t.
ret = odeint(deriv, y0, t, args=(N, beta, gamma))
S, I, R = ret.T
plt.plot(t, I, color='red', linewidth=2)
# plt.savefig('fitted.pdf', dpi=300, bbox_inches='tight')
error = sum(I - c)
from scipy.optimize import minimize
def fn(x):
    # parameters unwrapped
    N, beta, gamma = x
    # Initial number of infected and recovered individuals, I0 and R0.
    I0, R0 = 1, 0
    # Everyone else, S0, is susceptible to infection initially.
    S0 = N - I0 - R0
    # Initial conditions vector
    y0 = S0, I0, R0
    # Integrate the SIR equations over time t.
    ret = odeint(deriv, y0, t, args=(N, beta, gamma))
    S, I, R = ret.T
    # sum of absolute errors, so positive and negative residuals cannot cancel
    error = np.sum(np.abs(I - c))
    return error
# initialise with current best guess
init_x = [42000, .148, 1/20]
# calculate result
res = minimize(fn, init_x, method='Nelder-Mead', tol=1e-6)
# calculate final error
print(fn(res.x))
Since you have a custom objective function, a simple way to minimize it is to call SciPy's minimize function. minimize expects an objective that takes a single argument, so the three parameters are packed into one list, as in the example above. You can then read the optimal parameters off res.x.
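For instance, a minimal follow-up sketch (assuming the fit converged; the _opt variable names are new here) that re-solves the ODE with the fitted parameters and overlays the optimized curve on the plot from the question:

N_opt, beta_opt, gamma_opt = res.x
# re-integrate the SIR model with the optimized parameters
S_opt, I_opt, R_opt = odeint(deriv, (N_opt - 1, 1, 0), t, args=(N_opt, beta_opt, gamma_opt)).T
plt.plot(t, I_opt, color='green', linewidth=2)
plt.show()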
I suggest you take a look at SciPy's optimize.curve_fit function. It lets you define a model function that is fitted directly to your data, and it returns both the optimal parameters and their covariance matrix.
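As a minimal sketch of that approach for the SIR model above (the wrapper function sir_infected is an assumption I am introducing; it reuses deriv, t and c from the question):

from scipy.optimize import curve_fit

def sir_infected(t, N, beta, gamma):
    # solve the SIR system and return only the infected curve I(t)
    y0 = N - 1, 1, 0  # S0, I0, R0 with a single initial infection
    S, I, R = odeint(deriv, y0, t, args=(N, beta, gamma)).T
    return I

popt, pcov = curve_fit(sir_infected, t, c, p0=[42000, .148, 1/20])
perr = np.sqrt(np.diag(pcov))  # one-standard-deviation parameter uncertainties

Note that curve_fit minimizes the sum of squared residuals, whereas fn above minimizes the sum of absolute errors, so the two approaches will generally return slightly different parameters.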
Related
Resonance frequency with python curve-fit, error maxfev reached
I have measured a voltage over an LCR tank circuit (with unknown components) to determine the resonance frequency. I have performed a broad frequency sweep and measured the voltages. Now I want to determine the exact location of the resonance peak by fitting the data. The curve looks like a damped, driven harmonic oscillator, so I used the following function to fit the data: A = F0 / sqrt((k - m*w^2)^2 + (b*w)^2). This is the code I have for now, but I get the following error: "Optimal parameters not found: Number of calls to function has reached maxfev = 5000."

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def fit(f, F0, k, m, b):
    w = 2 * np.pi * f
    return F0 / np.sqrt((k - m*w**2)**2 + (b*w)**2)

fuData = np.loadtxt("ohlVW.txt", delimiter=',')
fuData = fuData[fuData[:, 0].argsort()]
f = fuData[:, 0]
U = fuData[:, 1]

popt, _ = curve_fit(fit, f, U, maxfev=5000)
F0, k, m, b = popt
print(popt)

plt.scatter(f, U)
x_line = np.arange(min(f), max(f), 1)
y_line = fit(x_line, F0, k, m, b)
plt.figure()
plt.plot(f, U)
plt.plot(x_line, y_line, '--', color='red')
plt.show()

Increasing maxfev did not work. How can I adjust the code to get a nice fit over the data?
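A likely culprit, rather than maxfev itself, is that curve_fit starts from the default initial guess of 1.0 for every parameter, which can be very far from the solution for a resonance curve. A hedged sketch of seeding the fit with rough, data-driven starting values (the specific estimates below are illustrative assumptions, not values from the original post):

# rough starting values: resonance near the highest measured voltage,
# m normalised to 1, light damping; these are guesses to seed the search
w_peak = 2 * np.pi * f[np.argmax(U)]
p0 = [max(U) * w_peak**2, w_peak**2, 1.0, 0.05 * w_peak]  # F0, k, m, b
popt, pcov = curve_fit(fit, f, U, p0=p0, maxfev=5000)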
How to calculate "relative error in the sum of squares" and "relative error in the approximate solution" from least squares method?
I have implemented a 3D Gaussian fit using scipy.optimize.leastsq and now I would like to tweak the arguments ftol and xtol to optimize the performance. However, I don't understand the "units" of these two parameters, which makes it hard to choose them properly. Is it possible to calculate these two parameters from the results? That would give me an understanding of how to choose them. My data is numpy arrays of np.uint8. I tried to read the FORTRAN source code of MINPACK, but my FORTRAN knowledge is zero. I also read up on the Levenberg-Marquardt algorithm, but I could not really get a number that was below ftol, for example. Here is a minimal example of what I do:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import leastsq

class gaussian_model:
    def __init__(self):
        self.prev_iter_model = None
        self.f_vals = []

    def gaussian_1D(self, coeffs, xx):
        A, sigma, mu = coeffs
        # Center rotation around peak center
        x0 = xx - mu
        model = A*np.exp(-(x0**2)/(2*(sigma**2)))
        return model

    def residuals(self, coeffs, I_obs, xx, model_func):
        model = model_func(coeffs, xx)
        residuals = I_obs - model
        if self.prev_iter_model is not None:
            self.f = np.sum(((model - self.prev_iter_model)/model)**2)
            self.f_vals.append(self.f)
        self.prev_iter_model = model
        return residuals

# x data
x_start = 1
x_stop = 10
num = 100
xx, dx = np.linspace(x_start, x_stop, num, retstep=True)

# Simulated data with some noise
A, s_x, mu = 10, 0.5, 3
coeffs = [A, s_x, mu]
model = gaussian_model()
yy = model.gaussian_1D(coeffs, xx)
noise_ampl = 0.5
noise = np.random.normal(0, noise_ampl, size=num)
yy += noise

# LM least squares
initial_guess = [1, 1, 1]
pred_coeffs, cov_x, info, mesg, ier = leastsq(model.residuals, initial_guess,
                                              args=(yy, xx, model.gaussian_1D),
                                              ftol=1E-6, full_output=True)

yy_fit = model.gaussian_1D(pred_coeffs, xx)

rel_SSD = np.sum(((yy - yy_fit)/yy)**2)
RMS_SSD = np.sqrt(rel_SSD/num)

print(RMS_SSD)
print(model.f)
print(model.f_vals)

fig, ax = plt.subplots(1, 2)
# Plot results
ax[0].scatter(xx, yy)
ax[0].plot(xx, yy_fit, c='r')
ax[1].scatter(range(len(model.f_vals)), model.f_vals, c='r')
# ax[1].set_ylim(0, 1E-6)
plt.show()

rel_SSD is around 1 and definitely not something below ftol = 1E-6.

EDIT: Based on @user12750353's answer below, I updated my minimal example to try to recreate how lmdif determines termination with ftol. The problem is that my f_vals are too small, so they are not the right values. The reason I would like to recreate this is that I would like to see what kind of numbers I am getting in my main code, to decide on an ftol that would terminate the fitting process earlier.
Since you are giving a function without the gradient, the method called is lmdif. Instead of gradients it will use a forward-difference gradient estimate, f(x + delta) - f(x) ~ delta * df(x)/dx (I will write as if the parameter were scalar). In the MINPACK source you find the following description:

c       ftol is a nonnegative input variable. termination
c         occurs when both the actual and predicted relative
c         reductions in the sum of squares are at most ftol.
c         therefore, ftol measures the relative error desired
c         in the sum of squares.
c
c       xtol is a nonnegative input variable. termination
c         occurs when the relative error between two consecutive
c         iterates is at most xtol. therefore, xtol measures the
c         relative error desired in the approximate solution.

Looking in the code, the actual reduction actred = 1 - (fnorm1/fnorm)**2 is what you calculated for rel_SSD, but between the two last iterations, not between the fitted function and the target points.

Example

The problem here is that we need to discover what values the internal variables assume. An attempt to do so is to save the coefficients and the residual norm every time the function is called, as follows.

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import leastsq

class gaussian_model:
    def __init__(self):
        self.prev_iter_model = None
        self.fnorm = []
        self.x = []

    def gaussian_1D(self, coeffs, xx):
        A, sigma, mu = coeffs
        # Center rotation around peak center
        x0 = xx - mu
        model = A*np.exp(-(x0**2)/(2*(sigma**2)))
        grad = np.array([
            model / A,
            model * x0**2 / (sigma**3),
            model * 2 * x0 / (2*(sigma**2))
        ]).transpose()
        return model, grad

    def residuals(self, coeffs, I_obs, xx, model_func):
        model, grad = model_func(coeffs, xx)
        residuals = I_obs - model
        self.x.append(np.copy(coeffs))
        self.fnorm.append(np.sqrt(np.sum(residuals**2)))
        return residuals

    def grad(self, coeffs, I_obs, xx, model_func):
        model, grad = model_func(coeffs, xx)
        residuals = I_obs - model
        return -grad

    def plot_progress(self):
        x = np.array(self.x)
        dx = np.sqrt(np.sum(np.diff(x, axis=0)**2, axis=1))
        plt.plot(dx / np.sqrt(np.sum(x[1:, :]**2, axis=1)))
        fnorm = np.array(self.fnorm)
        plt.plot(1 - (fnorm[1:]/fnorm[:-1])**2)
        plt.legend([r'$||\Delta x||$', r'$||\Delta f||$'], loc='upper left')

# x data
x_start = 1
x_stop = 10
num = 100
xx, dx = np.linspace(x_start, x_stop, num, retstep=True)

# Simulated data with some noise
A, s_x, mu = 10, 0.5, 3
coeffs = [A, s_x, mu]
model = gaussian_model()
yy, _ = model.gaussian_1D(coeffs, xx)
noise_ampl = 0.5
noise = np.random.normal(0, noise_ampl, size=num)
yy += noise

Then we can see the relative variation of x and f:

initial_guess = [1, 1, 1]
pred_coeffs, cov_x, info, mesg, ier = leastsq(model.residuals, initial_guess,
                                              args=(yy, xx, model.gaussian_1D),
                                              xtol=1e-6, ftol=1e-6, full_output=True)

plt.figure(figsize=(14, 6))
plt.subplot(121)
model.plot_progress()
plt.yscale('log')
plt.grid()
plt.subplot(122)
yy_fit, _ = model.gaussian_1D(pred_coeffs, xx)
# Plot results
plt.scatter(xx, yy)
plt.plot(xx, yy_fit, c='r')
plt.show()

The problem with this is that the function is evaluated both to compute f and to compute the gradient of f. To produce a cleaner plot, you can pass Dfun so that func is evaluated only once per iteration.
# x data
x_start = 1
x_stop = 10
num = 100
xx, dx = np.linspace(x_start, x_stop, num, retstep=True)

# Simulated data with some noise
A, s_x, mu = 10, 0.5, 3
coeffs = [A, s_x, mu]
model = gaussian_model()
yy, _ = model.gaussian_1D(coeffs, xx)
noise_ampl = 0.5
noise = np.random.normal(0, noise_ampl, size=num)
yy += noise

# LM least squares, now with an analytic Jacobian supplied via Dfun
initial_guess = [1, 1, 1]
pred_coeffs, cov_x, info, mesg, ier = leastsq(model.residuals, initial_guess,
                                              args=(yy, xx, model.gaussian_1D),
                                              Dfun=model.grad,
                                              xtol=1e-6, ftol=1e-6, full_output=True)

plt.figure(figsize=(14, 6))
plt.subplot(121)
model.plot_progress()
plt.yscale('log')
plt.grid()
plt.subplot(122)
yy_fit, _ = model.gaussian_1D(pred_coeffs, xx)
# Plot results
plt.scatter(xx, yy)
plt.plot(xx, yy_fit, c='r')
plt.show()

That said, the value I am obtaining for xtol is not exactly what is in the lmdif implementation.
Finding alpha and beta of beta-binomial distribution with scipy.optimize and loglikelihood
A distribution is beta-binomial if p, the probability of success, in a binomial distribution has a beta distribution with shape parameters α > 0 and β > 0. The shape parameters define the probability of success. I want to find the values for α and β that best describe my data from the perspective of a beta-binomial distribution. My dataset players consists of data about the number of hits (H), the number of at-bats (AB) and the conversion (H / AB) of a lot of baseball players. I estimate the PDF with the help of JulienD's answer in Beta Binomial Function in Python:

from scipy.special import beta, comb

pdf = comb(n, k) * beta(k + a, n - k + b) / beta(a, b)

Next, I write a log-likelihood function that we will minimize:

def loglike_betabinom(params, *args):
    """
    Negative log-likelihood function for the beta-binomial distribution.

    :param params: list of parameters to be fitted.
    :param args: 2-element array containing the sample data.
    :return: negative log-likelihood to be minimized.
    """
    a, b = params[0], params[1]
    k = args[0]  # the conversion rate
    n = args[1]  # the number of at-bats (AB)
    pdf = comb(n, k) * beta(k + a, n - k + b) / beta(a, b)
    return -1 * np.log(pdf).sum()

Now, I want to write a function that minimizes loglike_betabinom:

from scipy.optimize import minimize

init_params = [1, 10]
res = minimize(loglike_betabinom, x0=init_params,
               args=(players['H'] / players['AB'], players['AB']),
               bounds=bounds, method='L-BFGS-B',
               options={'disp': True, 'maxiter': 250})
print(res.x)

The result is [-6.04544138 2.03984464], which implies that α is negative, which is not possible. I based my script on the following R snippet, which returns [101.359, 287.318]:

ll <- function(alpha, beta) {
  x <- career_filtered$H
  total <- career_filtered$AB
  -sum(VGAM::dbetabinom.ab(x, total, alpha, beta, log = TRUE))
}

m <- mle(ll, start = list(alpha = 1, beta = 10), method = "L-BFGS-B",
         lower = c(0.0001, 0.1))
ab <- coef(m)

Can someone tell me what I am doing wrong? Help is much appreciated!
One thing to pay attention to is that comb(n, k) in your log-likelihood might not be well-behaved numerically for the values of n and k in your dataset. You can verify this by applying comb to your data and seeing if infs appear. One way to amend things could be to rewrite the negative log-likelihood as suggested in https://stackoverflow.com/a/32355701/4240413, i.e. as a function of logarithms of Gamma functions, as in:

from scipy.special import gammaln
import numpy as np

def loglike_betabinom(params, *args):
    a, b = params[0], params[1]
    k = args[0]  # the OVERALL number of hits, not the conversion rate
    n = args[1]  # the number of at-bats (AB)
    logpdf = gammaln(n+1) + gammaln(k+a) + gammaln(n-k+b) + gammaln(a+b) - \
             (gammaln(k+1) + gammaln(n-k+1) + gammaln(a) + gammaln(b) + gammaln(n+a+b))
    return -np.sum(logpdf)

You can then minimize the log-likelihood with:

from scipy.optimize import minimize

init_params = [1, 10]
# note that I am putting 'H' in the args
res = minimize(loglike_betabinom, x0=init_params,
               args=(players['H'], players['AB']),
               method='L-BFGS-B', options={'disp': True, 'maxiter': 250})
print(res)

and that should give reasonable results. You could check How to properly fit a beta distribution in python? for inspiration if you want to rework your code further.
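If the optimizer still strays into non-positive parameter values, L-BFGS-B also accepts bounds, mirroring the lower = c(0.0001, 0.1) argument in the R snippet; a minimal sketch reusing the call above:

bounds = [(0.0001, None), (0.1, None)]  # lower bounds for alpha and beta, as in the R code
res = minimize(loglike_betabinom, x0=init_params,
               args=(players['H'], players['AB']),
               bounds=bounds, method='L-BFGS-B',
               options={'disp': True, 'maxiter': 250})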
"'float' is not subscriptable" in odeint
I'm trying to implement coupled differential equations in Python, and as a new user I seem to be stuck at something. I used this tutorial as a guide to how to solve my ODEs, and looked into the documentation to no avail. This is where I define the function:

def Burnout(t, y, m, nu, S0, V, delta, mu):
    S = y[0]
    E = [0 for i in range(0, m)]
    dEdt = [0 for i in range(0, m)]
    for i in range(0, m):
        E.append(y[i+1])
    P = y[m+1]
    dSdt = -nu*S*P*(S/S0)**V
    dEdt.append(nu*S*P*(S/S0)**V - m*delta*E[0])
    for i in range(1, m):
        dEdt.append(m*delta*E[i-1] - m*delta*E[i])
    dPdt = m*delta*E[m-1] - mu*P
    return [dSdt, *dEdt[0:m], dPdt]

Then, as in the tutorial, I define the initial conditions:

S0 = N
y0.append(S0)
for i in range(0, m):
    E.append(0)
    y0.append(E[i])
P0 = Z
y0.append(P0)

where N and Z are previously defined and E was an empty list. When I finally call odeint(Burnout, y0, t, args=p), I get a 'float' object is not subscriptable error pointing to my definition of S in my Burnout function. As I passed a list to odeint, I'm confused about why Python says I passed a float. Does anyone see what I did wrong? Thanks in advance!

EDIT: OK, here is a minimal, complete and verifiable example that gives me the same error:

import numpy as np
from scipy.integrate import odeint

def Burnout(t, y, m, nu, S0, V, delta, mu):
    S = y[0]
    E = [0 for i in range(0, m)]
    dEdt = [0 for i in range(0, m)]
    for i in range(0, m):
        E.append(y[i+1])
    P = y[m+1]
    dSdt = -nu*S*P*(S/S0)**V
    dEdt.append(nu*S*P*(S/S0)**V - m*delta*E[0])
    for i in range(1, m):
        dEdt.append(m*delta*E[i-1] - m*delta*E[i])
    dPdt = m*delta*E[m-1] - mu*P
    return [dSdt, *dEdt[0:m], dPdt]

V = 2.97
m = 26
delta = 1/6
mu = 1
nu = 10
S0 = 5

t = np.linspace(0, 56, 100)
y = [10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,100]
p = (m, nu, V, S0, delta, mu)

print(odeint(Burnout, y, t, args=p))
You ordered the arguments in your ODE definition wrong. It is possible to have t before y, but then you must set tfirst=True (see the docs). Swapping the arguments in your definition of Burnout fixes the problem for me:

def Burnout(y, t, m, nu, S0, V, delta, mu):
    # ...
    # rest of function
    # ...

Alternatively, you can pass the additional keyword tfirst in the odeint call:

odeint(Burnout, y, t, args=p, tfirst=True)
Fitting data to system of ODEs using Python via Scipy & Numpy
I am having some trouble translating my MATLAB code into Python via SciPy & NumPy. I am stuck on how to find optimal parameter values (k0 and k1) for my system of ODEs to fit to my ten observed data points. I currently have an initial guess for k0 and k1. In MATLAB, I can use something called fminsearch, which is a function that takes the system of ODEs, the observed data points, and the initial values of the system of ODEs. It will then calculate a new pair of parameters k0 and k1 that will fit the observed data. I have included my code to see if you can help me implement some kind of fminsearch to find the optimal parameter values k0 and k1 that will fit my data. I want to add whatever code does this to my lsqtest.py file.

I have three .py files - ode.py, lsq.py, and lsqtest.py.

ode.py:

def f(y, t, k):
    return (-k[0]*y[0], k[0]*y[0]-k[1]*y[1], k[1]*y[1])

lsq.py:

import numpy as np
from scipy import integrate
import ode

def lsq(teta, y0, data):
    # INPUT:  teta, the unknowns k0, k1
    #         data, observed
    #         y0, initial values needed by the ODE
    # OUTPUT: lsq value
    t = np.linspace(0, 9, 10)
    y_obs = data  # data points
    k = [0, 0]
    k[0] = teta[0]
    k[1] = teta[1]
    # call the ODE solver to get the states:
    r = integrate.odeint(ode.f, y0, t, args=(k,))  # the ODE system in ode.py
    # at each row (time point), y_cal has the values of the components [A, B, C]
    y_cal = r[:, 1]  # separate the measured B
    # compute the expression to be minimized:
    return sum((y_obs - y_cal)**2)

lsqtest.py:

import numpy as np
import lsq

if __name__ == '__main__':
    teta = [0.2, 0.3]  # guess for parameter values k0 and k1
    y0 = [1, 0, 0]     # initial conditions for system
    # observed data points
    y = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])
    data = y
    resid = lsq.lsq(teta, y0, data)
    print(resid)
For these kinds of fitting tasks you could use the package lmfit. The plot produced by the code below shows that the data are reproduced very well. For now, I fixed the initial concentrations; you could also set them as variables if you like (just remove the vary=False in the code below). The parameters you obtain are:

[[Variables]]
    x10:   5 (fixed)
    x20:   0 (fixed)
    x30:   0 (fixed)
    k0:    0.12183301 +/- 0.005909 (4.85%) (init= 0.2)
    k1:    0.77583946 +/- 0.026639 (3.43%) (init= 0.3)
[[Correlations]] (unreported correlations are < 0.100)
    C(k0, k1)                    =  0.809

The code that produces the fit and the plot looks like this (some explanation can be found in the inline comments):

import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from lmfit import minimize, Parameters, report_fit

def f(y, t, paras):
    """
    Your system of differential equations
    """
    x1 = y[0]
    x2 = y[1]
    x3 = y[2]

    try:
        k0 = paras['k0'].value
        k1 = paras['k1'].value
    except KeyError:
        k0, k1 = paras

    # the model equations
    f0 = -k0 * x1
    f1 = k0 * x1 - k1 * x2
    f2 = k1 * x2
    return [f0, f1, f2]

def g(t, x0, paras):
    """
    Solution to the ODE x'(t) = f(t, x, k) with initial condition x(0) = x0
    """
    x = odeint(f, x0, t, args=(paras,))
    return x

def residual(paras, t, data):
    """
    Compute the residual between actual data and fitted data
    """
    x0 = paras['x10'].value, paras['x20'].value, paras['x30'].value
    model = g(t, x0, paras)
    # you only have data for one of your variables
    x2_model = model[:, 1]
    return (x2_model - data).ravel()

# initial conditions
x10 = 5.
x20 = 0
x30 = 0
y0 = [x10, x20, x30]

# measured data
t_measured = np.linspace(0, 9, 10)
x2_measured = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])

plt.figure()
plt.scatter(t_measured, x2_measured, marker='o', color='b', label='measured data', s=75)

# set parameters including bounds; you can also fix parameters (use vary=False)
params = Parameters()
params.add('x10', value=x10, vary=False)
params.add('x20', value=x20, vary=False)
params.add('x30', value=x30, vary=False)
params.add('k0', value=0.2, min=0.0001, max=2.)
params.add('k1', value=0.3, min=0.0001, max=2.)

# fit model
result = minimize(residual, params, args=(t_measured, x2_measured), method='leastsq')  # leastsq nelder
# check results of the fit
data_fitted = g(np.linspace(0., 9., 100), y0, result.params)

# plot fitted data
plt.plot(np.linspace(0., 9., 100), data_fitted[:, 1], '-', linewidth=2, color='red', label='fitted data')
plt.legend()
plt.xlim([0, max(t_measured)])
plt.ylim([0, 1.1 * max(data_fitted[:, 1])])
# display fitted statistics
report_fit(result)
plt.show()

If you have data for additional variables, you can simply update the function residual.
The following worked for me:

import pylab as pp
import numpy as np
from scipy import integrate, interpolate
from scipy import optimize

# initialize the data
x_data = np.linspace(0, 9, 10)
y_data = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])

def f(y, t, k):
    """define the ODE system in terms of
        dependent variable y,
        independent variable t, and
        optional parameters, in this case a single variable k"""
    return (-k[0]*y[0], k[0]*y[0]-k[1]*y[1], k[1]*y[1])

def my_ls_func(x, teta):
    """definition of function for LS fit
        x gives evaluation points,
        teta is an array of parameters to be varied for fit"""
    # create an alias to f which passes the optional params
    f2 = lambda y, t: f(y, t, teta)
    # calculate ode solution, return values for each entry of "x"
    r = integrate.odeint(f2, y0, x)
    # in this case, we only need one of the dependent variable values
    return r[:, 1]

def f_resid(p):
    """function to pass to optimize.leastsq
        The routine will square and sum the values returned by this function"""
    return y_data - my_ls_func(x_data, p)

# solve the system - the solution is in variable c
guess = [0.2, 0.3]  # initial guess for params
y0 = [1, 0, 0]      # initial conditions for ODEs
(c, kvg) = optimize.leastsq(f_resid, guess)  # get params
print("parameter values are ", c)

# fit ODE results to an interpolating spline just for fun
xeval = np.linspace(min(x_data), max(x_data), 30)
gls = interpolate.UnivariateSpline(xeval, my_ls_func(xeval, c), k=3, s=0)

# pick a few more points for a very smooth curve, then plot
# data and curve fit
xeval = np.linspace(min(x_data), max(x_data), 200)
# plot of the data as red dots and fit as blue line
pp.plot(x_data, y_data, '.r', xeval, gls(xeval), '-b')
pp.xlabel('xlabel', {"fontsize": 16})
pp.ylabel('ylabel', {"fontsize": 16})
pp.legend(('data', 'fit'), loc=0)
pp.show()
Look at the scipy.optimize module. Its minimize function plays the same role as fminsearch, and with method='Nelder-Mead' it uses the same simplex algorithm that fminsearch is based on.
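A minimal sketch of that route, assuming the lsq function and data from the question's files are importable (the lambda wrapper is needed because minimize passes a single parameter vector to the objective):

import numpy as np
from scipy import optimize
import lsq

data = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])
teta0 = [0.2, 0.3]  # initial guess for k0 and k1, as in lsqtest.py
y0 = [1, 0, 0]      # initial conditions for the ODE system
res = optimize.minimize(lambda teta: lsq.lsq(teta, y0, data), teta0, method='Nelder-Mead')
print(res.x)        # fitted k0 and k1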
# cleaned up a bit to get my head around it - thanks for sharing
import pylab as pp
import numpy as np
from scipy import integrate, optimize

class Parameterize_ODE():
    def __init__(self):
        self.X = np.linspace(0, 9, 10)
        self.y = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])
        self.y0 = [1, 0, 0]  # initial conditions for the ODEs

    def ode(self, y, X, p):
        return (-p[0]*y[0], p[0]*y[0]-p[1]*y[1], p[1]*y[1])

    def model(self, X, p):
        return integrate.odeint(self.ode, self.y0, X, args=(p,))

    def f_resid(self, p):
        return self.y - self.model(self.X, p)[:, 1]

    def optim(self, p_guess):
        return optimize.leastsq(self.f_resid, p_guess)  # fit params

po = Parameterize_ODE()
p_guess = [0.2, 0.3]
c, kvg = po.optim(p_guess)

# --- show ---
print("parameter values are ", c, kvg)
x = np.linspace(min(po.X), max(po.X), 2000)
pp.plot(po.X, po.y, '.r', x, po.model(x, c)[:, 1], '-b')
pp.xlabel('X', {"fontsize": 16})
pp.ylabel('y', {"fontsize": 16})
pp.legend(('data', 'fit'), loc=0)
pp.show()