fmin_cg giving unexpected error - python

I just learned the gradient descent algorithm and tried to implement it. The input is a set of coordinates on a 2D plane, and the aim is to predict the line that passes through most of the given input points.
Using Python, I wrote:
def cost(theta, data):
    X, y = data[:, 0], data[:, 1]
    m = shape(X)[0]
    y = y.reshape(m, 1)
    X = c_[ones((m, 1)), X]
    J = X.dot(theta) - y
    J = J.T.dot(J) / (m)
    # print(J[0, 0])
    return J

def gradDesc(theta, data):
    X = data[:, 0]
    y = data[:, 1]
    m = shape(X)[0]
    X = c_[ones((m, 1)), X]
    y = y.reshape(m, 1)
    hypo = X.dot(theta)
    grad = X.T.dot(hypo - y)/m
    # print(grad)
    return grad

def run(theta, data):
    result = scipy.optimize.fmin_cg(f=cost, fprime=gradDesc, x0=theta,
                                    args=(data), maxiter=50, disp=False, full_output=True)
    theta = result[0]
    minCost = result[1]
    return theta, minCost

def main():
    data = genfromtxt('in.txt', delimiter=',')
    theta = zeros((2, 1))
    # plot_samples(data)
    run(theta, data)
I tried using fmin_cg() to minimize the cost, but its 'args' parameter causes an error:
line 282, in function_wrapper
return function(*(wrapper_args + args))
TypeError: gradDesc() takes 2 positional arguments but 5 were given
In the docs I read that args is the list of parameters passed to f and fprime, other than the one that is varied to minimize f, which here is data. I need help understanding where I am going wrong.
Full code: http://ideone.com/E22yzl

Extra arguments need to be a tuple. If you mean to have a single parameter called data, you need to construct a one-element tuple (data,) -- notice the comma. This is not the same as (data), where brackets are effectively ignored.
Example:
>>> def f(x, data):
... return (x - data.sum())**2
...
>>> import numpy as np
>>> from scipy.optimize import fmin_cg
>>> data = np.asarray([1., 2., 3.])
>>> fmin_cg(f, x0=11., args=(data,))
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 2
Function evaluations: 15
Gradient evaluations: 5
array([ 6.])
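Applied to the code in the question, the minimal change is args=(data,). Note also that fmin_cg flattens x0 to a 1-D array before calling f and fprime, so it is easiest to keep theta and y one-dimensional and return a scalar cost. A sketch along those lines (not the only possible fix; data here is the two-column array read from in.txt):

import numpy as np
import scipy.optimize

def cost(theta, data):
    X, y = data[:, 0], data[:, 1]
    m = X.shape[0]
    X = np.c_[np.ones(m), X]
    r = X.dot(theta) - y          # theta and y are both 1-D here
    return r.dot(r) / m           # scalar cost

def gradDesc(theta, data):
    X, y = data[:, 0], data[:, 1]
    m = X.shape[0]
    X = np.c_[np.ones(m), X]
    return X.T.dot(X.dot(theta) - y) / m   # 1-D gradient

result = scipy.optimize.fmin_cg(f=cost, fprime=gradDesc, x0=np.zeros(2),
                                args=(data,), maxiter=50, disp=False,
                                full_output=True)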

Related

Implementing the Backwards Euler method in python to solve a pendulum

I am trying to set up an implicit solver for a pendulum. F and dF are defined as:
def func(y, t):
    ### Simplified the function to remove friction since it canceled out
    x, v = y[:3], y[3:6]
    grav = np.array([0., 0., -9.8])
    lambd = (grav.dot(x) + v.dot(v)) / x.dot(x)
    return np.concatenate([v, grav - lambd*x])

def dF_matrix(y):
    n = y.size
    dF = np.zeros((6, 6))
    xp = np.array([y[3], y[4], y[5]])[np.newaxis]
    mass = 1.
    F1 = 0.
    F2 = 0.
    F3 = -mass*9.8
    F = np.array([F1, F2, F3])[np.newaxis]
    phix = 2.*y[0]
    phiy = 2.*y[1]
    phiz = 2.*y[2]
    G = np.array([phix, phiy, phiz])[np.newaxis]
    H = 2.*np.eye(3)
    lambd = (mass*np.dot(xp, np.dot(H, xp.T)) + np.dot(F, G.T)) / np.dot(G, G.T)
    dF[0, 3] = 1
    dF[1, 4] = 1
    dF[2, 5] = 1
    dF[3, 0] = (y[0]*F1 + 2*lambd)/mass
    dF[3, 1] = (y[0]*F2)/mass
    dF[3, 2] = (y[0]*F3)/mass
    dF[3, 3] = phix*y[3]
    dF[3, 4] = phix*y[4]
    dF[3, 5] = phix*y[5]
    dF[4, 0] = (y[1]*F1)/mass
    dF[4, 1] = (y[1]*F2 + 2*lambd)/mass
    dF[4, 2] = (y[1]*F3)/mass
    dF[4, 3] = phiy*y[3]
    dF[4, 4] = phiy*y[4]
    dF[4, 5] = phiy*y[5]
    dF[5, 0] = (y[2]*F1)/mass
    dF[5, 1] = (y[2]*F2)/mass
    dF[5, 2] = (y[2]*F3 + 2*lambd)/mass
    dF[5, 3] = phiz*y[3]
    dF[5, 4] = phiz*y[4]
    dF[5, 5] = phiz*y[5]
    return dF
I am trying to set up an implicit solver that uses the function and dF to refine its guesses, in order to create a more accurate solver for my pendulum.
The error that I get is the following:
Traceback (most recent call last):
line 206, in test_function(backwards_Euler)
line 186, in test_function y1 = test_function(func, y0, t)
line 166, in backwards_Euler F = np.asarray(y[i] + dt * function(zold, time[i+1])-zold)
line 13, in func lambd = (grav.dot(x)+v.dot(v))/x.dot(x)
ValueError: shapes (3,6) and (3,6) not aligned: 6 (dim 1) != 3 (dim 0)
I think it is because one of the calculations I am doing is diverging to infinity, but I cannot figure out what might be causing this.
Here is my current backward Euler function:
### Backwards Euler Method
def backwards_Euler(function, y_matrix, time):
    y = np.zeros((np.size(time), np.size(y_matrix)))
    y[0, :] = y_matrix
    dt = time[1] - time[0]
    for i in range(len(time-1)):
        err = 1
        zold = y[i] + dt*function(y[i], time[i])  ## guess with forward euler
        I = 0
        while err > 10**(-10) and I < 5:
            F = y[i] + dt * function(zold, time[i+1]) - zold  ## Here is where my error occurs
            dF = dt*dF_matrix(y[i+1]) - 1
            znew = zold - F/dF
            zold = znew
            I += 1
        y[i+1] = znew
    return y
Here is how I am testing the function, although currently I get no output:
# initial condition
y0 = np.array([0.0, 1.0, 0.0, 0.8, 0.0, 1.2])

def test_function(test_function):
    print(test_function.__name__ + "...")
    nt = 2500
    time_start = process_time()
    # time points
    t = np.linspace(0, 25, nt)
    # solve ODE
    y1 = test_function(func, y0, t)
    time_elapsed = (process_time() - time_start)
    print('elapsed time', time_elapsed)
    # compute residual:
    r = y1[:, 0] ** 2 + y1[:, 1] ** 2 + y1[:, 2] ** 2 - 1
    rmax1 = np.max(np.abs(r))
    print('error', rmax1)
    fig = plt.figure()
    ax = plt.axes(projection='3d')
    ax.plot3D(y1[:, 0], y1[:, 1], y1[:, 2], 'gray')
    plt.show()

test_function(backwards_Euler)
No, the error is that the matrix shapes in some matrix operations don't match. The immediate cause is that x.dot(x) works when x is a simple 1-D array, but not for two-dimensional arrays, where the rules of matrix multiplication have to be strictly obeyed.
However, there is no reason for a matrix to appear in place of x; you do not call the function in a "vectorized" fashion. The real culprit is that Python is not Matlab: the division in
znew = zold - F/dF
does not do what you think it does. Instead it performs elementwise division, broadcasting the smaller object to fill the shape of the larger one; the same happens with the subtraction, so znew ends up with the same shape as the matrix dF, with the observed consequences in the next ODE function evaluation.
You need to explicitly invoke a linear solver. The standard one in np.linalg (there are others there and in scipy.sparse.linalg using other matrix factorizations) is simply called solve(A, b) and computes the solution of A*x = b:
znew = zold - np.linalg.solve(dF, F)
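A tiny standalone example of the difference (independent of the pendulum code):

import numpy as np

dF = np.array([[2., 1.],
               [1., 4.]])
F = np.array([1., 2.])

print(F / dF)                  # elementwise division with broadcasting: a 2x2 array
print(np.linalg.solve(dF, F))  # solves dF @ x = F: the intended 1-D Newton step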
Implicit Euler gives a diverging solution: the length of the pendulum increases rapidly. Applying the same corrections to the similar implicit trapezoidal method, which is also the 2nd-order Adams-Moulton method, gives the code
def Adams_Moulton_2nd(function, y_init, time):
    # solve F(z)=0 with simplified Newton, where
    #   F = y + 0.5*dt*(f(y,t)+f(z,t+dt)) - z
    # is the defect of the implicit trapezium method. Using the derivative
    #   dF_z = 0.5*dt*df_z(z,t+dt) - I
    # the non-linear solver step is the solution of the linear system
    #   dF_z*dz = -F
    y = np.zeros((np.size(time), np.size(y_init)))
    y[0, :] = y_init
    Id = np.eye(len(y_init))
    dt = time[1] - time[0]
    dt4 = dt**4
    for i in range(len(time)-1):
        f_0 = function(y[i], time[i])
        norm_0 = sum(f_0**2)**0.5 + 1e-6
        z = y[i] + dt*f_0  ## guess with forward euler
        dF = 0.5*dt*dF_matrix(z, time[i+1]) - Id
        for I in range(5):
            F = y[i] + 0.5*dt * (f_0 + function(z, time[i+1])) - z
            dz = -np.linalg.solve(dF, F)
            if max(abs(dz)) < norm_0*dt4:
                break
            z += dz
        y[i+1] = z
    return y  ### Adams_Moulton_2nd
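A minimal driver, mirroring the test harness from the question (this assumes dF_matrix has been adapted to accept (y, t), since Adams_Moulton_2nd calls it with two arguments):

t = np.linspace(0, 25, 2500)
y1 = Adams_Moulton_2nd(func, y0, t)
r = y1[:, 0]**2 + y1[:, 1]**2 + y1[:, 2]**2 - 1   # residual of the constraint |x| = 1
print('max constraint error', np.max(np.abs(r)))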

scipy.optimize.curve_fit can't fit non linear function

I have a very non-linear function with two parameters that curve_fit is not able to fit: it fits the first one but does not change the second one.
I also get the classical
.../.local/lib/python3.6/site-packages/scipy/optimize/minpack.py:794: OptimizeWarning: Covariance of the parameters could not be estimated
category=OptimizeWarning)
Here is the function I am trying to fit :
def tand(x):
    return np.tan(x*np.pi/180.)

def sind(x):
    return np.sin(x*np.pi/180.)

def cosd(x):
    return np.cos(x*np.pi/180.)

def coeffAx(A0, alpha):
    return A0*cosd(alpha)**2.

def coeffBx(B0, alpha):
    return B0*cosd(alpha)**2.

def coeffAy(A0, alpha):
    return (1./2.)*A0*cosd(alpha)*sind(alpha)

def coeffBy(B0, alpha):
    return (1./2.)*B0*cosd(alpha)*sind(alpha)

def Growth_rate(k, alpha, A0, B0, mu, r):
    c = (r**2. - 1.)/r**2.
    return (k**2./(1. + (k*cosd(alpha))**2.)) * (
        cosd(alpha)*(coeffBx(B0, alpha) - cosd(alpha)/(mu*r**2.))
        + sind(alpha)*(coeffBy(B0, alpha) - sind(alpha)/(mu*r))*c
        - k*cosd(alpha)*(cosd(alpha)*coeffAx(A0, alpha) + sind(alpha)*coeffAy(A0, alpha)*c))

def Get_most_unstable(Sigma, alpha, k):
    SigMax = np.amax(Sigma)
    Coord = np.argwhere(Sigma == SigMax)
    kmax = k[Coord[:, 1]]
    amax = alpha[Coord[:, 0]]
    return np.array([SigMax, kmax, amax])

def lambda_fit(V, C1, C2):
    A0 = 3.5
    B0 = 2
    mu = tand(35)
    # R = C2 * (V - 1) + 1
    k = np.linspace(0., 0.6, 1001)
    alpha = np.array([0])
    K, ALPHA = np.meshgrid(k, alpha)
    kM = []
    for v in V:
        Sigma = Growth_rate(K, ALPHA, A0, B0, mu, C2 * (v - 1) + 1)
        kM.append(Get_most_unstable(Sigma, alpha, k)[1])
    return 2*np.pi*C1/np.array(kM)
And here are the data :
V = np.array([1.0398639 , 1.13022518, 1.27846 , 1.31943454, 1.3898527 ,1.42114085])
Lambda_trans = [18.56117382616553, 13.747212426683717, 12.149968490349218, 12.034763392608163, 11.944807729994983, 12.6708866218023]
This is what I obtain :
p, pconv = curve_fit(lambda_fit, V, Lambda_trans, p0 = [1,10], check_finite = True)
/home/gadal/.local/lib/python3.6/site-packages/scipy/optimize/minpack.py:794: OptimizeWarning: Covariance of the parameters could not be estimated
category=OptimizeWarning)
>>> p
array([ 0.69145457, 10. ])
>>> pconv
array([[inf, inf],
[inf, inf]])
As you can see, the first parameter is fitted but the second is not. What is very odd is that I can obtain very good fits using values of the second parameter between 9.5 and 10. I can't understand why curve_fit is not able to do it. Any ideas? I tried adding bounds as bounds = ([0.5, 8], [1.2, 12]) but the result is the same.

Find optimal vector that minimizes function

I am trying to find a vector that minimizes the residual sum of squares when multiplied by a matrix.
I know of scipy's optimize package (which has a minimize function). However, there is an extra constraint for my code. The sum of all entries of w (see function below) must equal 1, and no entry of w can be less than 0. Is there a package that does this for me? If not, how can I do this?
Trying to minimize w:
def w_rss(w, x0, x1):
    predictions = np.dot(x0, w)
    errors = x1 - predictions
    rss = np.dot(errors.transpose(), errors).item(0)
    return rss

X0 = np.array([[3, 4, 5, 3],
               [1, 2, 2, 4],
               [6, 5, 3, 7],
               [1, 0, 5, 2]])

X1 = np.array([[4],
               [2],
               [4],
               [2]])

W = np.array([[.0],
              [.5],
              [.5],
              [.0]])

print w_rss(W, X0, X1)
So far this is my best attempt at looping through possible values of w, but it's not working properly.
def get_w(x0, x1):
    J = x0.shape[1]
    W0 = np.matrix([[1.0/J]*J]).transpose()
    rss0 = w_rss(W0, x0, x1)
    loop = range(J)
    for i in loop:
        W1 = W0
        rss1 = rss0
        while rss0 == rss1:
            den = len(loop) - 1
            W1[i][0] += 0.01
            for j in loop:
                if i == j:
                    continue
                W1[j][0] -= 0.01/den
                if W1[j][0] <= 0:
                    loop.remove(j)
            rss1 = w_rss(W1, x0, x1)
            if rss1 < rss0:
                #print W1
                W0 = W1
                rss0 = rss1
    print '--'
    print rss0
    print W0
    return W0, rss0
The SLSQP code in scipy can do this. You can use scipy.optimize.minimize with method='SLSQP', or you can use the function fmin_slsqp directly. In the following, I use fmin_slsqp.
The scipy solvers generally pass a one-dimensional array to the objective function, so to be consistent, I'll change W and X1 to be 1-d arrays, and I'll write the objective function (now called w_rss1) to expect a 1-d argument w.
The condition that all the elements in w must be between 0 and 1 is specified using the bounds argument, and the condition that the sum must be 1 is specified using the f_eqcons argument. The constraint function returns np.sum(w) - 1, so it is 0 when the sum of the elements is 1.
Here's the code:
import numpy as np
from scipy.optimize import fmin_slsqp

def w_rss1(w, x0, x1):
    predictions = np.dot(x0, w)
    errors = x1 - predictions
    rss = (errors**2).sum()
    return rss

def sum1constraint(w, x0, x1):
    return np.sum(w) - 1

X0 = np.array([[3, 4, 5, 3],
               [1, 2, 2, 4],
               [6, 5, 3, 7],
               [1, 0, 5, 2]])
X1 = np.array([4, 2, 4, 2])
W = np.array([.0, .5, .5, .0])

result = fmin_slsqp(w_rss1, W, f_eqcons=sum1constraint, bounds=[(0.0, 1.0)]*len(W),
                    args=(X0, X1), disp=False, full_output=True)
Wopt, fW, its, imode, smode = result
if imode != 0:
    print("Optimization failed: " + smode)
else:
    print(Wopt)
When I run this, the output is
[ 0.05172414 0.55172414 0.39655172 0. ]
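For reference, the same problem can be written with scipy.optimize.minimize and method='SLSQP', as mentioned above. A sketch reusing w_rss1, X0, X1 and W from the code above:

from scipy.optimize import minimize

res = minimize(w_rss1, W, args=(X0, X1), method='SLSQP',
               bounds=[(0.0, 1.0)] * len(W),
               constraints={'type': 'eq', 'fun': lambda w: np.sum(w) - 1})
print(res.x)   # should agree with the fmin_slsqp result above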

Reducing computation Time for a nested Loop

I would like to reduce the computation time for the code posted below. In essence, the code calculates the array Tf as the result of the following nested loop:
Af = lambda x: Approximationf(f, x)
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
        integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                      posterior(y*np.exp(mu[0]) * z)])))
        integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                      posterior(y * np.exp(mu[1])*z)])))
        Tf[idxy, idxp] = (h[idxy, idxp] +
                          beta * ((prior * integrateL) +
                                  (1-prior)*integrateH))
The objects posterior, integrate and Af are functions that are repeatedly called while iterating over the loop. The function posterior calculates a scalar called posterior. The function Af approximates the function f at sample points x and passes the result on to the function integrate, which calculates the conditional expectation of the function f.
The code posted below is a simplification of a more difficult problem. Instead of running the nested loop once, I have to run it multiple times to solve a fixed-point problem. The problem is initialized with an arbitrary function f, and an array Tf is created; this array is then used in the next iteration over the nested loop to calculate another array Tf. The process continues until convergence.
I decided not to report results from the cProfile module. When the nested loop is run only once, a lot of internal Python calls take a relatively long time; however, when iterating until convergence, these internal calls lose their relative importance and are relegated to lower positions in the cProfile output.
I tried to mimic various suggestions for lowering the computation time of loops that I found online for slightly modified problems. Unfortunately, I couldn't make them work and could not really figure out a common approach to tackle these problems. Does somebody have an idea how to lower the computation time of this loop? I am grateful for any help!
import numpy as np
from scipy import interpolate
from scipy.stats import lognorm
from scipy.integrate import fixed_quad

# == The following lines define the parameters for the problem == #
gamma, beta, sigma, mu = 2, 0.95, 0.0255, np.array([0.0113, -0.0016])
grid_y, grid_prior = np.linspace(7, 10, 15), np.linspace(0, 1, 5)
int_min, int_max = np.exp(- 7 * sigma), np.exp(+ 7 * sigma)
phi = lognorm(sigma)

f = np.array([[ 1.29824564, 1.29161017, 1.28379398, 1.2676886, 1.15320819],
              [ 1.26290108, 1.26147364, 1.24755837, 1.23819851, 1.11912802],
              [ 1.22847276, 1.23013194, 1.22128198, 1.20996971, 1.0864706 ],
              [ 1.19528104, 1.19645792, 1.19056084, 1.17980572, 1.05532966],
              [ 1.16344832, 1.16279841, 1.15997191, 1.15169942, 1.02564429],
              [ 1.13301675, 1.13109952, 1.12883038, 1.1236645, 0.99730795],
              [ 1.10398195, 1.10125013, 1.0988554, 1.09612933, 0.97019688],
              [ 1.07630046, 1.07356297, 1.07126087, 1.06878758, 0.94417658],
              [ 1.04989686, 1.04728542, 1.04514962, 1.04289665, 0.91910765],
              [ 1.02467087, 1.0221532, 1.02011384, 1.01797238, 0.89485162],
              [ 1.00050447, 0.99795025, 0.99576917, 0.99330549, 0.87127677],
              [ 0.97726849, 0.97443288, 0.97190614, 0.96861352, 0.84826362],
              [ 0.95482612, 0.94783816, 0.94340077, 0.93753641, 0.82569922],
              [ 0.93302433, 0.91985497, 0.9059118, 0.88895196, 0.80348449],
              [ 0.91165997, 0.88253486, 0.86126688, 0.84769975, 0.78147382]])

# == Calculate function h, used in the loop below == #
E0 = np.exp((1-gamma)*mu + (1-gamma)**2*sigma**2/2)
h = np.outer(beta*grid_y**(1-gamma), grid_prior*E0[0] + (1-grid_prior)*E0[1])

def integrate(g):
    """
    This function is repeatedly called in the loop below
    """
    integrand = lambda z: g(z) * phi.pdf(z)
    result = fixed_quad(integrand, int_min, int_max, n=15)[0]
    return result

def Approximationf(f, x):
    """
    This function approximates the function f and is repeatedly called in
    the loop
    """
    # == simplify notation == #
    fApprox = np.empty((x.shape[1]))
    lower, middle = (x[0] < grid_y[0]), (x[0] >= grid_y[0]) & (x[0] <= grid_y[-1])
    upper = (x[0] > grid_y[-1])
    # == Calculate Polynomial == #
    y_tile = np.tile(grid_y, len(grid_prior))
    prior_repeat = np.repeat(grid_prior, len(grid_y))
    s = interpolate.SmoothBivariateSpline(y_tile, prior_repeat,
                                          f.T.flatten(), kx=5, ky=5)
    # == Interpolation == #
    fApprox[middle] = s(x[0, middle], x[1, middle])[:, 0]
    # == Extrapolation == #
    if any(lower):
        s0 = s(lower[lower]*grid_y[0], x[1, lower])[:, 0]
        s1 = s(lower[lower]*grid_y[1], x[1, lower])[:, 0]
        slope_lower = (s0 - s1)/(grid_y[0] - grid_y[1])
        fApprox[lower] = s0 + slope_lower*(x[0, lower] - grid_y[0])
    if any(upper):
        sM1 = s(upper[upper]*grid_y[-1], x[1, upper])[:, 0]
        sM2 = s(upper[upper]*grid_y[-2], x[1, upper])[:, 0]
        slope_upper = (sM1 - sM2)/(grid_y[-1] - grid_y[-2])
        fApprox[upper] = sM1 + slope_upper*(x[0, upper] - grid_y[-1])
    return fApprox

def updated_posterior(prior, y, yPrime):
    """
    This function calculates the posterior weights put on each distribution.
    It is the third function repeatedly called in the loop below.
    """
    z_0 = yPrime/(y * np.exp(mu[0]))
    z_1 = yPrime/(y * np.exp(mu[1]))
    l0, l1 = phi.pdf(z_0), phi.pdf(z_1)
    posterior = l0*prior / (l0*prior + l1*(1-prior))
    return posterior

Tf = np.empty_like(f)
Af = lambda x: Approximationf(f, x)

# == Apply the T operator to f == #
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
        integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                      posterior(y*np.exp(mu[0]) * z)])))
        integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                      posterior(y * np.exp(mu[1])*z)])))
        Tf[idxy, idxp] = (h[idxy, idxp] +
                          beta * ((prior * integrateL) +
                                  (1-prior)*integrateH))
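To connect this with the fixed-point problem described earlier: the nested loop above is one application of the T operator, so the iteration until convergence would look roughly like the following sketch (apply_T is simply the loop above wrapped into a function; the tolerance and iteration cap are arbitrary choices):

def apply_T(f):
    # One application of the T operator: build Tf from the current guess f
    Tf = np.empty_like(f)
    Af = lambda x: Approximationf(f, x)
    for idxp, prior in enumerate(grid_prior):
        for idxy, y in enumerate(grid_y):
            posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
            integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                          posterior(y*np.exp(mu[0])*z)])))
            integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                          posterior(y*np.exp(mu[1])*z)])))
            Tf[idxy, idxp] = (h[idxy, idxp] +
                              beta * (prior*integrateL + (1-prior)*integrateH))
    return Tf

# iterate T until the fixed point is (approximately) reached
f_curr = f
for it in range(200):
    f_next = apply_T(f_curr)
    if np.max(np.abs(f_next - f_curr)) < 1e-6:
        break
    f_curr = f_next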
Some experience with multiprocessing: Following reptilicus' comment, I decided to investigate how to use the multiprocessing module. My idea was to begin by parallelizing the computation of the integrateL array. To do so, I fixed the outer loop to prior = 0.5 and wanted to iterate over the inner loop, grid_y. However, I still have to take into consideration that integrateL involves a lambda function in z. I tried to follow the advice of the Stack Overflow question "How to let Pool.map take a lambda function" and wrote the following code:
prior = 0.5
Af = lambda x: Approximationf(f, x)

class Iteration(object):
    def __init__(self, state):
        self.y = state
    def __call__(self, z):
        Af(np.array([self.y*np.exp(mu[0])*z,
                     updated_posterior(prior,
                                       self.y, self.y*np.exp(mu[0])*z)]))

with Pool(processes=4) as pool:
    out = pool.map(Iteration(y), np.nditer(grid_y))
Unfortunately, upon running the program Python returns:
IndexError: tuple index out of range
At first sight this smells like a trivial error, but I cannot remedy it. Does somebody have an idea how to tackle the problem? Again, I'm grateful for any advice!
I would target that nested loop, something like this. This is pseudo-code, but it should get you started.
import multiprocessing

def do_calc(idxp, idxy, y, prior):
    posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
    integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                  posterior(y*np.exp(mu[0])*z)])))
    integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                  posterior(y*np.exp(mu[1])*z)])))
    return (idxp, idxy, prior, integrateL, integrateH)

pool = multiprocessing.Pool(8)  # or however many cores you have
results = []
# This is the part that I would try to parallelize
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        results.append(pool.apply_async(do_calc, args=(idxp, idxy, y, prior)))
pool.close()
pool.join()
results = [r.get() for r in results]
for idxp, idxy, prior, integrateL, integrateH in results:
    Tf[idxy, idxp] = (h[idxy, idxp] +
                      beta * ((prior * integrateL) +
                              (1 - prior) * integrateH))
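One usage note on the lambda/pickling issue from the question: Pool sends the callable and its arguments to the workers by pickling them, so the function handed to map or apply_async should be defined at module level (like do_calc above); a lambda or a locally constructed callable is a common source of confusing errors. With a module-level worker, the same thing can also be written with starmap (a sketch, reusing do_calc and the globals from the question):

if __name__ == '__main__':
    tasks = [(idxp, idxy, y, prior)
             for idxp, prior in enumerate(grid_prior)
             for idxy, y in enumerate(grid_y)]
    with multiprocessing.Pool(processes=4) as pool:
        out = pool.starmap(do_calc, tasks)
    for idxp, idxy, prior, integrateL, integrateH in out:
        Tf[idxy, idxp] = (h[idxy, idxp] +
                          beta * (prior*integrateL + (1 - prior)*integrateH))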

Python parameter transformation according to MINUIT

I am writing an automated curve fitting routine for 2D data based on scipy's optimize.leastsq, and it works. However, when adding many curves with starting values slightly off, I get non-physical results (a negative amplitude, for example).
I found this post Scipy: bounds for fitting parameter(s) when using optimize.leastsq and was trying to use the parameter transformation according to MINUIT from CERN. In the above-mentioned question somebody provided a link to some Python code.
code.google.com/p/nmrglue/source/browse/trunk/nmrglue/analysis/leastsqbound.py
I wrote this minimal working example (extending the code)
"""
http://code.google.com/p/nmrglue/source/browse/trunk/nmrglue/analysis/leastsqbound.py
Constrained multivariate Levenberg-Marquardt optimization
"""
from scipy.optimize import leastsq
import numpy as np
import matplotlib.pyplot as plt #new
def internal2external_grad(xi, bounds):
"""
Calculate the internal to external gradiant
Calculates the partial of external over internal
"""
ge = np.empty_like(xi)
for i, (v, bound) in enumerate(zip(xi, bounds)):
a = bound[0] # minimum
b = bound[1] # maximum
if a == None and b == None: # No constraints
ge[i] = 1.0
elif b == None: # only min
ge[i] = v / np.sqrt(v ** 2 + 1)
elif a == None: # only max
ge[i] = -v / np.sqrt(v ** 2 + 1)
else: # both min and max
ge[i] = (b - a) * np.cos(v) / 2.
return ge
def i2e_cov_x(xi, bounds, cov_x):
grad = internal2external_grad(xi, bounds)
grad = grad = np.atleast_2d(grad)
return np.dot(grad.T, grad) * cov_x
def internal2external(xi, bounds):
""" Convert a series of internal variables to external variables"""
xe = np.empty_like(xi)
for i, (v, bound) in enumerate(zip(xi, bounds)):
a = bound[0] # minimum
b = bound[1] # maximum
if a == None and b == None: # No constraints
xe[i] = v
elif b == None: # only min
xe[i] = a - 1. + np.sqrt(v ** 2. + 1.)
elif a == None: # only max
xe[i] = b + 1. - np.sqrt(v ** 2. + 1.)
else: # both min and max
xe[i] = a + ((b - a) / 2.) * (np.sin(v) + 1.)
return xe
def external2internal(xe, bounds):
""" Convert a series of external variables to internal variables"""
xi = np.empty_like(xe)
for i, (v, bound) in enumerate(zip(xe, bounds)):
a = bound[0] # minimum
b = bound[1] # maximum
if a == None and b == None: # No constraints
xi[i] = v
elif b == None: # only min
xi[i] = np.sqrt((v - a + 1.) ** 2. - 1)
elif a == None: # only max
xi[i] = np.sqrt((b - v + 1.) ** 2. - 1)
else: # both min and max
xi[i] = np.arcsin((2.*(v - a) / (b - a)) - 1.)
return xi
def err(p, bounds, efunc, args):
pe = internal2external(p, bounds) # convert to external variables
return efunc(pe, *args)
def calc_cov_x(infodic, p):
"""
Calculate cov_x from fjac, ipvt and p as is done in leastsq
"""
fjac = infodic['fjac']
ipvt = infodic['ipvt']
n = len(p)
# adapted from leastsq function in scipy/optimize/minpack.py
perm = np.take(np.eye(n), ipvt - 1, 0)
r = np.triu(np.transpose(fjac)[:n, :])
R = np.dot(r, perm)
try:
cov_x = np.linalg.inv(np.dot(np.transpose(R), R))
except LinAlgError:
cov_x = None
return cov_x
def leastsqbound(func, x0, bounds, args=(), **kw):
    """
    Constrained multivariate Levenberg-Marquardt optimization

    Minimize the sum of squares of a given function using the
    Levenberg-Marquardt algorithm. Constraints on parameters are enforced using
    variable transformations as described in the MINUIT User's Guide by
    Fred James and Matthias Winkler.

    Parameters:
    * func    function to call for optimization.
    * x0      Starting estimate for the minimization.
    * bounds  (min, max) pair for each element of x, defining the bounds on
              that parameter. Use None for one of min or max when there is
              no bound in that direction.
    * args    Any extra arguments to func are placed in this tuple.

    Returns: (x, {cov_x, infodict, mesg}, ier)

    The return is described in the scipy.optimize.leastsq function. x and cov_x
    are corrected to take into account the parameter transformation; infodict
    is not corrected.

    Additional keyword arguments are passed directly to the
    scipy.optimize.leastsq algorithm.
    """
    # check for full output
    if "full_output" in kw and kw["full_output"]:
        full = True
    else:
        full = False
    # convert x0 to internal variables
    i0 = external2internal(x0, bounds)
    # perform unconstrained optimization using internal variables
    r = leastsq(err, i0, args=(bounds, func, args), **kw)
    # unpack the return, convert to external variables and return
    if full:
        xi, cov_xi, infodic, mesg, ier = r
        xe = internal2external(xi, bounds)
        cov_xe = i2e_cov_x(xi, bounds, cov_xi)
        # XXX correct infodic 'fjac', 'ipvt', and 'qtf'
        return xe, cov_xe, infodic, mesg, ier
    else:
        xi, ier = r
        xe = internal2external(xi, bounds)
        return xe, ier
# new below
def _evaluate(x, p):
    '''
    Linear plus Lorentzian curve
    p = list with five parameters ([a, b, I, Pos, FWHM])
    '''
    return p[0] + p[1] * x + p[2] / (1 + np.power((x - p[3]) / (p[4] / 2), 2))

def residuals(p, y, x):
    err = _evaluate(x, p) - y
    return err

if __name__ == '__main__':
    data = np.loadtxt('constraint.dat')  # read data
    p0 = [5000., 0., 500., 2450., 3]  # start values for a, b, I, Pos, FWHM
    constraints = [(4000., None), (-50., 20.), (0., 2000.), (2400., 2451.), (None, None)]
    p, res = leastsqbound(residuals, p0, constraints, args=(data[:, 1], data[:, 0]), maxfev=20000)
    print p, res
    plt.plot(data[:, 0], data[:, 1])                 # plot data
    plt.plot(data[:, 0], _evaluate(data[:, 0], p0))  # plot start values
    plt.plot(data[:, 0], _evaluate(data[:, 0], p))   # plot fit values
    plt.show()
That's the plot output, where green is the starting condition and red the fit result:
Is this the correct usage? The external2internal conversion just produces a nan if a value is outside the bounds; leastsq seems to be able to handle this?
I uploaded the fitting data here. Just paste into a text file named constraint.dat.
There is already an existing popular constrained Lev-Mar code
http://adsabs.harvard.edu/abs/2009ASPC..411..251M
with an implementation in Python
http://code.google.com/p/astrolibpy/source/browse/mpfit/mpfit.py
I would suggest not to reinvent the wheel.
Following sega_sai's answer I came up with this minimal working example using mpfit.py
import matplotlib.pyplot as plt
from mpfit import mpfit
import numpy as np

def _evaluate(p, x):
    '''
    Linear plus Lorentzian curve
    p = list with five parameters ([a, b, I, Pos, FWHM])
    '''
    return p[0] + p[1] * x + p[2] / (1 + np.power((x - p[3]) / (p[4] / 2), 2))

def residuals(p, fjac=None, x=None, y=None, err=None):
    status = 0
    error = _evaluate(p, x) - y
    return [status, error / err]

if __name__ == '__main__':
    data = np.loadtxt('constraint.dat')  # read data
    x = data[:, 0]
    y = data[:, 1]
    err = 0 * np.ones(y.shape, dtype='float64')
    parinfo = [{'value': 5000., 'fixed': 0, 'limited': [0, 0], 'limits': [0., 0.], 'parname': 'a'},
               {'value': 0., 'fixed': 0, 'limited': [0, 0], 'limits': [0., 0.], 'parname': 'b'},
               {'value': 500., 'fixed': 0, 'limited': [0, 0], 'limits': [0., 0.], 'parname': 'I'},
               {'value': 2450., 'fixed': 0, 'limited': [0, 0], 'limits': [0., 0.], 'parname': 'Pos'},
               {'value': 3., 'fixed': 0, 'limited': [0, 0], 'limits': [0., 0.], 'parname': 'FWHM'}]
    fa = {'x': x, 'y': y, 'err': err}
    m = mpfit(residuals, parinfo=parinfo, functkw=fa)
    print m
The fit results are:
mpfit.py: 3714.97545, 0.484193283, 2644.47271, 2440.13385, 22.1898496
leastsq: 3714.97187, 0.484194545, 2644.46890, 2440.13391, 22.1899295
So, in conclusion: both methods work and both allow constraints. But as mpfit comes from a very established source, I trust it more. It also honors error values, if available.
Try lmfit-py - https://github.com/newville/lmfit-py
It also uses the Levenberg-Marquardt (LM) algorithm via scipy.optimize.leastsq. Uncertainties are OK.
It allows you to constrain your fitting parameters not only with bounds but also with mathematical expressions between them, without modification of your fitting function.
Forget about using those awful p[0], p[1] ... in the fitting function. Just use the names of the fitting parameters via the Parameters class.
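A sketch of what that looks like for the linear-plus-Lorentzian model above, assuming a recent lmfit version (the parameter names and the residual helper are mine, not part of lmfit):

import numpy as np
from lmfit import Parameters, minimize, fit_report

def residual(params, x, y):
    p = params.valuesdict()
    model = (p['a'] + p['b']*x +
             p['I'] / (1 + ((x - p['pos']) / (p['fwhm']/2))**2))
    return model - y

params = Parameters()
params.add('a', value=5000., min=4000.)
params.add('b', value=0., min=-50., max=20.)
params.add('I', value=500., min=0., max=2000.)
params.add('pos', value=2450., min=2400., max=2451.)
params.add('fwhm', value=3.)   # unconstrained
# the bounds above mirror the 'constraints' list from the leastsqbound example

data = np.loadtxt('constraint.dat')
result = minimize(residual, params, args=(data[:, 0], data[:, 1]))
print(fit_report(result))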
