Scipy Optimize Minimize: Optimization terminated successfully but not iterating at all - python

I am trying to write an optimizer that finds the constant parameters minimizing the MSE between an array y and a generic function over X. The generic function is given in pre-order, so for example if the function over X is x1 + c*x2, it would be given as [+, x1, *, c, x2]. The objective in this example would be to minimize:
sum_for_all_x (y - (x1 + c*x2))^2
Below is what I have done to solve the problem. Some things that should be known:
X and y are torch tensors.
constants is the list of values to be optimized.
def loss(self, constants, X, y):
    stack = [] # Stack to save the partial results
    const = 0 # Index of constant to be used
    for idx in self.traversal[::-1]: # Reverse the prefix notation
        if idx > Language.max_variables: # If we are dealing with an operator
            function = Language.idx_to_token[idx] # Get its associated function
            first_operand = stack.pop() # Get first operand
            if function.arity == 1: # If the arity of the operator is one (e.g sin)
                stack.append(function.function(first_operand)) # Append result
            else: # Same but if arity is 2
                second_operand = stack.pop() # Need a second operand
                stack.append(function.function(first_operand, second_operand))
        elif idx == 0: # If it is a constant -> idx 0 indicates a constant
            stack.append(constants[const]*torch.ones(X.shape[0])) # Append constant
            const += 1 # Update
        else:
            stack.append(X[:, idx - 1]) # Else append the associated column of X
    prediction = stack[0]
    return (y - prediction).pow(2).mean().cpu().numpy()

def optimize_constants(self, X, y):
    '''
    # This function optimizes the constants of the expression tree.
    '''
    if 0 not in self.traversal: # If there are no constants to be optimized return
        return self.traversal
    x0 = [0 for i in range(len(self.constants))] # Initial guess
    ini = time.time()
    res = minimize(self.loss, x0, args=(X, y), method='BFGS', options={'disp': True})
    print(res)
    print('Time:', time.time() - ini)
The problem is that the optimizer reports successful termination but does not iterate at all. The output res looks something like this:
Optimization terminated successfully.
Current function value: 2.920725
Iterations: 0
Function evaluations: 2
Gradient evaluations: 1
fun: 2.9207253456115723
hess_inv: array([[1]])
jac: array([0.])
message: 'Optimization terminated successfully.'
nfev: 2
nit: 0
njev: 1
status: 0
success: True
x: array([0.])
So far I have tried to:
Change the method in the minimizer (e.g Nelder-Mead, SLSQP,...) but it happens the same with all of them.
Change the way I return the result (e.g (y - prediction).pow(2).mean().item())

It seems that scipy.optimize.minimize does not work well with PyTorch tensors. Changing the code to use NumPy ndarrays solved the problem.
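One plausible mechanism for the gradient-based methods, offered as an assumption rather than a diagnosis of the code above: if the loss is evaluated in float32 (the default dtype of torch tensors), SciPy's default finite-difference step of about 1.49e-8 can be too small to change the loss at all, so the estimated gradient is exactly zero and BFGS stops at the initial point. The sketch below reproduces this with NumPy only; the data and the names loss32 and loss64 are hypothetical.

```python
import numpy as np
from scipy.optimize import approx_fprime, minimize

# Hypothetical data: y = 3 * x, so the best constant is c = 3.
X = np.linspace(1.0, 2.0, 50)
y = 3.0 * X
X32, y32 = X.astype(np.float32), y.astype(np.float32)

def loss32(c):
    # Mimics a float32 tensor computation: the constant is cast to float32.
    pred = np.float32(c[0]) * X32
    return float(np.mean((y32 - pred) ** 2))

def loss64(c):
    # Same loss, evaluated in float64.
    pred = c[0] * X
    return float(np.mean((y - pred) ** 2))

eps = np.sqrt(np.finfo(float).eps)  # about 1.49e-8, the usual forward-difference step
c0 = np.zeros(1)
grad32 = approx_fprime(c0, loss32, eps)  # the step is lost in float32 rounding
grad64 = approx_fprime(c0, loss64, eps)
print(grad32, grad64)

res = minimize(loss64, c0, method='BFGS')
print(res.x, res.nit)
```

If this is what happens, evaluating the loss in float64 or passing an exact gradient through the jac argument would be ways to avoid it.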

Related

Problem minimizing a constrained function in Python with scipy.optimize.minimize

I'm trying to minimize a constrained function of several variables using scipy.optimize.minimize. The function has 3*N parameters to minimize over, where N is an input. More specifically, my minimization parameters are given in three arrays H = H[0],H[1],...,H[N-1], a = a[0],a[1],...,a[N-1] and b = b[0],b[1],...,b[N-1], which I concatenated into a single array named mins, with len(mins)=3*N.
Those parameters are also subject to the following constraints:
0 <= H and sum(H) = 0.5
0 <= a <= Pi/2
0 <= b <= Pi/2
So, my code for the constraints reads as:
import numpy as np
# constraints on x:
def Hlhs(mins): # left hand side
    return np.diag(np.ones(N)) @ mins.reshape(3,N)[0]
def Hrhs(mins): # right hand side
    return np.sum(mins.reshape(3,N)[0]) - 0.5
con1H = {'type': 'ineq', 'fun': lambda H: Hlhs(H)}
con2H = {'type': 'eq', 'fun': lambda H: Hrhs(H)}
# constraints on a:
def alhs(mins):
    return np.diag(np.ones(N)) @ mins.reshape(3,N)[1]
def arhs(mins):
    return -np.diag(np.ones(N)) @ mins.reshape(3,N)[1] + (np.ones(N))*np.pi/2
con1a = {'type': 'ineq', 'fun': lambda a: alhs(a)}
con2a = {'type': 'ineq', 'fun': lambda a: arhs(a)}
# constraints on b:
def blhs(mins):
    return np.diag(np.ones(N)) @ mins.reshape(3,N)[2]
def brhs(mins):
    return -np.diag(np.ones(N)) @ mins.reshape(3,N)[2] + (np.ones(N))*np.pi/2
con1b = {'type': 'ineq', 'fun': lambda b: blhs(b)}
con2b = {'type': 'ineq', 'fun': lambda b: brhs(b)}
My function, with the other parameters (and adopting N=3) to be minimized, is given by (I'm sorry if it is too long):
gamma = 17
C = 85
T = 0
Hf = 0.5
Li = 2
Bi = 1
N = 3
def FUN(mins):
    H, a, b = mins.reshape(3,N)
    S1 = 0; S2 = 0
    B = np.zeros(N); L = np.zeros(N);
    for i in range(N):
        sbi=Bi; sli=Li
        for j in range(i+1):
            sbi += 2*H[j]*np.tan(b[j])
            sli += 2*H[j]*np.tan(a[j])
        B[i]=sbi
        L[i]=sli
    for i in range(N):
        S1 += (C*(1-np.sin(a[i])) + T*np.sin(a[i])) * (Bi*H[i]+H[i]**2*np.tan(b[i]))/np.cos(a[i]) + \
              (C*(1-np.sin(b[i])) + T*np.sin(b[i])) * (Li*H[i]+H[i]**2*np.tan(a[i]))/np.cos(b[i])
    S2 += (gamma*H[0]/12)*(Bi*Li + 4*(B[0]-H[0]*np.tan(b[0]))*(L[0]-H[0]*np.tan(a[0])) + B[0]*L[0])
    j=1
    while j<(N):
        S2 += (gamma*H[j]/12)*(B[j-1]*L[j-1] + 4*(B[j]-H[j]*np.tan(b[j]))*(L[j]-H[j]*np.tan(a[j])) + B[j]*L[j])
        j += 1
    F = 2*(S1+S2)
    return F
And, finally, adopting an initial guess for the values as 0, the minimization is given by:
x0 = np.zeros(3*N)
res = scipy.optimize.minimize(FUN,x0,constraints=(con1H,con2H,con1a,con2a,con1b,con2b),tol=1e-25)
My problems are:
a) Observing the result res, some values got negative even though I have constraints for them to be positive. The success of the minimization was False, and the message was: Positive directional derivative for linesearch. Also, the result is very far from the minimum expected.
b) Adopting the method='trust-constr' I got a value closer to what I was expecting but with a false success and the message The maximum number of function evaluations is exceeded.. Is there any way to improve this?
I know that there is a minimum very close to these values:
H = [0.2,0.15,0.15]
a = [1.0053,1.0053,1.2566]
b = [1.0681,1.1310,1.3195]
where the value of the function is 123.45. I've checked the function several times and it seems to be working properly. Can anyone help me find where my problem is? I've tried changing xtol and maxiter, but with no success.
Here are a few hints:
Your initial point x0 is not feasible since it doesn't satisfy the constraint sum(H) = 0.5. Providing a feasible initial point should fix your first problem.
Except for the constraint sum(H) = 0.5, all constraints are simple bounds on the variables. In general, it's recommended to pass variable bounds via the bounds parameter of minimize. You can simply define and pass all the bounds like this
from scipy.optimize import minimize
import numpy as np
# ..your variables and functions ..
bounds = [(0, None)]*N + [(0, np.pi/2)]*2*N
x0 = np.zeros(3*N)
x0[0] = 0.5
res = minimize(FUN, x0, constraints=(con2H,), bounds=bounds,
method="trust-constr", options={'maxiter': 20000})
where each tuple contains the lower and upper bound for each variable.
Unfortunately, 'trust-constr' still has trouble converging to a local minimizer. In this case, you can either try other initial points or use the state-of-the-art open source solver Ipopt instead. The Cython wrapper cyipopt provides an interface similar to scipy's:
from cyipopt import minimize_ipopt
# rest as above
res = minimize_ipopt(FUN, x0, constraints=(con2H,), bounds=bounds)
this gives me a solution with objective value 122.9.
Last but not least, it's always a good idea to provide exact gradients, jacobians and hessians.
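As a rough illustration of these hints together (feasible start, bounds passed via bounds, exact derivatives), here is a minimal sketch on a much smaller toy problem, not on FUN itself. The objective, the target vector and all function names are hypothetical; only the constraint sum(x) = 0.5 and the non-negativity bounds mirror the question.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical target; the toy objective is the squared distance to it.
target = np.array([0.3, 0.3, 0.1])

def fun(x):
    return np.sum((x - target) ** 2)

def fun_grad(x):
    # Exact gradient of fun.
    return 2.0 * (x - target)

def sum_constraint(x):
    # Equality constraint sum(x) = 0.5, written as g(x) = 0.
    return np.sum(x) - 0.5

def sum_constraint_jac(x):
    # Exact Jacobian of the constraint (a row of ones).
    return np.ones_like(x)

bounds = [(0, None)] * 3
x0 = np.array([0.5, 0.0, 0.0])  # feasible: non-negative and sums to 0.5

res = minimize(fun, x0, jac=fun_grad, bounds=bounds, method='SLSQP',
               constraints=({'type': 'eq', 'fun': sum_constraint,
                             'jac': sum_constraint_jac},))
print(res.x, res.fun)
```

For this toy problem the minimizer can be checked by hand: subtracting the same amount from each component of the target until the sum is 0.5 gives roughly [0.2333, 0.2333, 0.0333].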

Gradient-Based Optimizations in Python

I am trying to solve a couple minimization problems using Python but the setup with constraints is difficult for me to understand. I have:
minimize: x+y+2z^2
subject to: x = 1 and x^2+y^2 = 1
This is very easy obviously and I know the solution is x=1,y=0,z=0. I tried to use scipy.optimize.L-BFGS-B but had issues.
I also have:
minimize: 2x1^2+x2^2
subject to: x1+x2=1
I need to use a gradient based optimizer so I chose scipy.optimizer.COBYLA but had issues using an equality constraint as it only takes inequality constraints. The code for this is:
def objective(x):
    x1 = x[0]
    x2 = x[1]
    return 2*(x1**2)+ x2
def constraint1(x):
    return x[0]+x[1]-1
#Try an initial condition of x1=1 and x2=0
#Our initial condition satisfies the constraint already
x0 = [0.3,0.7]
print(objective(x0))
xnew = [0.25,0.75]
print(objective(xnew))
#Since we have already calculated on paper we know that x1 and x2 fall between 0 and 1
#We can set our bounds for both variables as being between 0 and 1
b = (0,1)
bounds = (b,b)
#Lets make note of the type of constraint we have for out optimizer
con1 = {'type': 'eq', 'fun':constraint1}
cons = [con1]
sol_gradient = minimize(objective,x0,method='COBYLA',bounds=bounds, constraints=cons)
Then I get an error about using equality constraints with this optimizer.
A few things:
1. Your objective function does not match the description you have provided. Should it be 2*(x1**2) + x2**2?
2. From the docs for scipy.optimize.minimize you can see that COBYLA does not support eq as a constraint. From the page:
Note that COBYLA only supports inequality constraints.
3. Since you said you want to use a gradient-based optimizer, one option could be the Sequential Least Squares Programming (SLSQP) optimizer.
Below is the code replacing 'COBYLA' with 'SLSQP' and changing the objective function according to point 1:
from scipy.optimize import minimize

def objective(x):
    x1 = x[0]
    x2 = x[1]
    return 2*(x1**2)+ x2**2
def constraint1(x):
    return x[0]+x[1]-1
# Initial guess x1=0.3, x2=0.7
# Our initial guess satisfies the constraint already
x0 = [0.3,0.7]
print(objective(x0))
xnew = [0.25,0.75]
print(objective(xnew))
# Since we have already calculated on paper we know that x1 and x2 fall between 0 and 1
# We can set our bounds for both variables as being between 0 and 1
b = (0,1)
bounds = (b,b)
# Let's make note of the type of constraint we have for our optimizer
con1 = {'type': 'eq', 'fun':constraint1}
cons = [con1]
sol_gradient = minimize(objective,x0,method='SLSQP',bounds=bounds, constraints=cons)
print(sol_gradient)
Which gives the final answer as:
fun: 0.6666666666666665
jac: array([1.33333336, 1.33333335])
message: 'Optimization terminated successfully'
nfev: 7
nit: 2
njev: 2
status: 0
success: True
x: array([0.33333333, 0.66666667])
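This result can be checked by hand with a Lagrange multiplier. The short derivation below uses only the objective and constraint stated in the question; the symbol \lambda is introduced here for the multiplier.

```latex
\mathcal{L}(x_1, x_2, \lambda) = 2x_1^2 + x_2^2 - \lambda\,(x_1 + x_2 - 1)

\frac{\partial \mathcal{L}}{\partial x_1} = 4x_1 - \lambda = 0, \qquad
\frac{\partial \mathcal{L}}{\partial x_2} = 2x_2 - \lambda = 0, \qquad
x_1 + x_2 = 1

\Rightarrow\; x_2 = 2x_1, \quad x_1 = \tfrac{1}{3}, \quad x_2 = \tfrac{2}{3}, \quad
\lambda = \tfrac{4}{3}, \quad f = 2\cdot\tfrac{1}{9} + \tfrac{4}{9} = \tfrac{2}{3}
```

Both gradient components equal 4/3 at this point, which agrees with the jac entry in the SLSQP output.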

Object function can return None with certain parameters, how to skip or avoid while converging in Curvefit?

In my situation, the objective function is a numerical procedure that includes a root-finding step for an equation, using the bisection method.
For certain sets of parameters, the equation has no root for an intermediate variable. I thought that making the bisection routine return None would solve this problem.
However, when the objective function is fitted to a data set with scipy.optimize.curve_fit and such a parameter region lies between p0 and the solution, an error stops the process.
To study this, here is a simplified case.
import numpy as np
#Define object function:
def f(x,a1,a2):
    if a1 < 0:
        return None
    elif a2 < 0:
        return np.inf
    else:
        return a1 * x**2 + a2
#Making data:
x = np.linspace(-5,5,10)
i = 0
y = np.empty_like(x)
for xi in x:
    y[i] = f(xi,1,1)
    i += 1
import scipy.optimize as sp
para,pvoc = sp.curve_fit(f,x,y,p0=(-1,1))
#TypeError: unsupported operand type(s) for -: 'NoneType' and 'float'
para,pvoc = sp.curve_fit(f,x,y,p0=(1,-1))
#RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 600.
I also tried inf, and it is obviously not working.
What should I return to continue the curve_fit process?
Suppose it is trying to converge: what does curve_fit do when it meets such a situation?
Additional thoughts:
I tried try...except... to ignore the error, and also simulated a case in which p0 is in a solvable range but the path to the true fit passes through the unsolvable segment.
import numpy as np
def f(x,a1,a2):
    if a1 < 0:
        return None
    elif a1 < 2:
        return a1 * x**2 + a2
    elif a2 < 0:
        return np.inf
    else:
        return a1 * x**2 + a2
def ff(x,a1,a2):
    output = f(x,a1,a2)
    if output == None:
        return 0
    else:
        return output
x = np.linspace(-5,5,10)
i = 0
y = np.empty_like(x)
for xi in x:
    y[i] = f(xi,1,1)
    i += 1
import scipy.optimize as sp
#para,pvoc = sp.curve_fit(f,x,y,p0=(-1,1))
#TypeError: unsupported operand type(s) for -: 'NoneType' and 'float':
#para,pvoc = sp.curve_fit(f,x,y,p0=(1,-1))
try:
    para,pvoc = sp.curve_fit(f,x,y,p0=(-3,1))
except TypeError:
    pass
As expected, the error was raised during convergence and caught by the except clause.
What should I do to let curve_fit continue in its original direction of convergence?
Or, as a compromise, how can I tell curve_fit to return its last attempt for a1?
On the other hand, I tried putting this handling into the objective function, so that it returns 0 when the error occurs.
The result is then as I expect:
para,pvoc = sp.curve_fit(ff,x,y,p0=(-3,1))
#OptimizeWarning: Covariance of the parameters could not be estimated
category=OptimizeWarning)
I think you want to take a different approach. That is, you have written your objective function to return None or Inf to signal when the value for a1 or a2 is out of bounds: a1<0 and a2<0 are not acceptable values for the objective function.
If that is a correct interpretation of what you are trying to do, it would be better to place bounds on both a1 and a2 so that the objective function never gets those values at all. To do that with curve_fit you would need to create a tuple of arrays for lower and upper bounds, with an order matching your p0, so
pout, pcov = sp.curve_fit(f, x, y, p0=(1, 1), bounds=([0, 0], [np.inf, np.inf]))
As an aside: I have no idea why you're using a starting value for a1 that is < 0, and so out of bounds. That seems like you're asking for trouble.
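To make the bounds suggestion concrete, here is a minimal self-contained sketch. It assumes the model itself is valid everywhere inside the bounds, so the None and inf branches are dropped; the starting point p0=(0.5, 0.5) is an arbitrary in-bounds choice.

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, a1, a2):
    # Plain model; invalid parameter values are excluded by the bounds below.
    return a1 * x**2 + a2

x = np.linspace(-5, 5, 10)
y = f(x, 1, 1)  # noise-free synthetic data

# Lower bounds first, then upper bounds, in the same order as p0.
popt, pcov = curve_fit(f, x, y, p0=(0.5, 0.5), bounds=([0, 0], [np.inf, np.inf]))
print(popt)
```

With bounds in place, curve_fit never evaluates f at a negative a1 or a2, so no sentinel return value is needed.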
For an even better experience setting bounds on fitting parameters, you might consider using lmfit, which would allow you to write:
import numpy as np
from lmfit import Model
def f(x, a1, a2):
    return a1 * x**2 + a2
fmod = Model(f)
params = fmod.make_params(a1=1, a2=0.5)
params['a1'].min = 0
params['a2'].min = 0
x = np.linspace(-5, 5, 10)
np.random.seed(0)
y = f(x, 1, 1) + np.random.normal(size=len(x), scale=0.02)
result = fmod.fit(y, params, x=x)
print(result.fit_report())
which will print out
[[Model]]
Model(f)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 13
# data points = 10
# variables = 2
chi-square = 0.00374066
reduced chi-square = 4.6758e-04
Akaike info crit = -74.9107853
Bayesian info crit = -74.3056151
[[Variables]]
a1: 0.99998038 +/- 7.6225e-04 (0.08%) (init = 1)
a2: 1.01496025 +/- 0.01034565 (1.02%) (init = 0.5)
[[Correlations]] (unreported correlations are < 0.100)
C(a1, a2) = -0.750
Hope that helps.

Scipy.optimize.minimize SLSQP with linear constraints fails

Consider the following (convex) optimization problem:
minimize 0.5 * y.T * y
s.t. A*x - b == y
where the optimization (vector) variables are x and y and A, b are a matrix and vector, respectively, of appropriate dimensions.
The code below finds a solution easily using the SLSQP method from Scipy:
import numpy as np
from scipy.optimize import minimize
# problem dimensions:
n = 10 # arbitrary integer set by user
m = 2 * n
# generate parameters A, b:
np.random.seed(123) # for reproducibility of results
A = np.random.randn(m,n)
b = np.random.randn(m)
# objective function:
def obj(z):
    vy = z[n:]
    return 0.5 * vy.dot(vy)
# constraint function:
def cons(z):
    vx = z[:n]
    vy = z[n:]
    return A.dot(vx) - b - vy
# constraints input for SLSQP:
cons = ({'type': 'eq','fun': cons})
# generate a random initial estimate:
z0 = np.random.randn(n+m)
sol = minimize(obj, x0 = z0, constraints = cons, method = 'SLSQP', options={'disp': True})
Optimization terminated successfully. (Exit mode 0)
Current function value: 2.12236220865
Iterations: 6
Function evaluations: 192
Gradient evaluations: 6
Note that the constraint function is a convenient 'array-output' function.
Now, instead of an array-output function for the constraint, one could in principle use an equivalent set of 'scalar-output' constraint functions (actually, the scipy.optimize documentation discusses only this type of constraint functions as input to minimize).
Here is the equivalent constraint set followed by the output of minimize (same A, b, and initial value as the above listing):
# this is the i-th element of cons(z):
def cons_i(z, i):
    vx = z[:n]
    vy = z[n:]
    return A[i].dot(vx) - b[i] - vy[i]
# list of scalar-output constraints input for SLSQP:
cons_per_i = [{'type':'eq', 'fun': lambda z: cons_i(z, i)} for i in np.arange(m)]
sol2 = minimize(obj, x0 = z0, constraints = cons_per_i, method = 'SLSQP', options={'disp': True})
Singular matrix C in LSQ subproblem (Exit mode 6)
Current function value: 6.87999270692
Iterations: 1
Function evaluations: 32
Gradient evaluations: 1
Evidently, the algorithm fails (the returning objective value is actually the objective value for the given initialization), which I find a bit weird. Note that running [cons_per_i[i]['fun'](sol.x) for i in np.arange(m)] shows that sol.x, obtained using the array-output constraint formulation, satisfies all scalar-output constraints of cons_per_i as expected (within numerical tolerance).
I would appreciate if anyone has some explanation for this issue.
You've run into the "late binding closures" gotcha. All the calls to cons_i are being made with the second argument equal to 19.
A fix is to use the args dictionary element in the dictionary that defines the constraints instead of the lambda function closures:
cons_per_i = [{'type':'eq', 'fun': cons_i, 'args': (i,)} for i in np.arange(m)]
With this, the minimization works:
In [417]: sol2 = minimize(obj, x0 = z0, constraints = cons_per_i, method = 'SLSQP', options={'disp': True})
Optimization terminated successfully. (Exit mode 0)
Current function value: 2.1223622086
Iterations: 6
Function evaluations: 192
Gradient evaluations: 6
You could also use the suggestion made in the linked article, which is to use a lambda expression with a second argument that has the desired default value:
cons_per_i = [{'type':'eq', 'fun': lambda z, i=i: cons_i(z, i)} for i in np.arange(m)]
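The late-binding behaviour can be seen without SciPy. The following small sketch is independent of the problem above; the names late and bound are only illustrative.

```python
# Each lambda looks up i when it is called, after the loop has finished.
late = [lambda: i for i in range(3)]
print([g() for g in late])   # [2, 2, 2]

# A default argument captures the value of i when the lambda is created.
bound = [lambda i=i: i for i in range(3)]
print([g() for g in bound])  # [0, 1, 2]
```

In the constraint list, the same effect makes every lambda evaluate cons_i with the last index, so SLSQP sees m copies of one constraint.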

Minimizing a multivariable function with scipy. Derivative not known

I have a function which is actually a call to another program (some Fortran code). When I call this function (run_moog) I can pass 4 variables, and it returns 6 values. These values should all be close to 0 (in order to minimize). However, I combined them like this: np.sum(results**2). Now I have a scalar function. I would like to minimize this function, i.e. get the np.sum(results**2) as close to zero as possible.
Note: When this function (run_moog) takes the 4 input parameters, it creates an input file for the Fortran code that depends on these parameters.
I have tried several ways to optimize this from the scipy docs. But none works as expected. The minimization should be able to have bounds on the 4 variables. Here is an attempt:
from scipy.optimize import minimize # Tried others as well from the docs
x0 = 4435, 3.54, 0.13, 2.4
bounds = [(4000, 6000), (3.00, 4.50), (-0.1, 0.1), (0.0, None)]
a = minimize(fun_mmog, x0, bounds=bounds, method='L-BFGS-B') # I've tried several different methods here
print(a)
This then gives me
status: 0
success: True
nfev: 5
fun: 2.3194639999999964
x: array([ 4.43500000e+03, 3.54000000e+00, 1.00000000e-01,
2.40000000e+00])
message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
jac: array([ 0., 0., -54090399.99999981, 0.])
nit: 0
The third parameter changes slightly, while the others are exactly the same. Also there have been 5 function calls (nfev) but no iterations (nit). The output from scipy is shown here.
Couple of possibilities:
Try COBYLA. It should be derivative-free, and supports inequality constraints.
You can't use different epsilons via the normal interface; so try scaling your first variable by 1e4. (Divide it going in, multiply coming back out.)
Skip the normal automatic jacobian constructor, and make your own:
Say you're trying to use SLSQP, and you don't provide a jacobian function. It makes one for you. The code for it is in approx_jacobian in slsqp.py. Here's a condensed version:
def approx_jacobian(x,func,epsilon,*args):
    x0 = asfarray(x)
    f0 = atleast_1d(func(*((x0,)+args)))
    jac = zeros([len(x0),len(f0)])
    dx = zeros(len(x0))
    for i in range(len(x0)):
        dx[i] = epsilon
        jac[i] = (func(*((x0+dx,)+args)) - f0)/epsilon
        dx[i] = 0.0
    return jac.transpose()
You could try replacing that loop with:
    for (i, e) in zip(range(len(x0)), epsilon):
        dx[i] = e
        jac[i] = (func(*((x0+dx,)+args)) - f0)/e
        dx[i] = 0.0
You can't provide this as the jacobian to minimize, but fixing it up for that is straightforward:
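import numpy as np

def construct_jacobian(func, epsilon):
    # epsilon holds one finite-difference step per variable
    epsilon = np.asarray(epsilon, dtype=float)
    def jac(x, *args):
        x0 = np.asarray(x, dtype=float)
        f0 = func(x0, *args)
        grad = np.zeros(len(x0))
        dx = np.zeros(len(x0))
        for i in range(len(x0)):
            dx[i] = epsilon[i]
            grad[i] = (func(x0 + dx, *args) - f0) / epsilon[i]
            dx[i] = 0.0
        return grad  # 1-D gradient, as minimize expects for a scalar objective
    return jac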
You can then call minimize like:
minimize(fun_mmog, x0,
jac=construct_jacobian(fun_mmog, [1e0, 1e-4, 1e-4, 1e-4]),
bounds=bounds, method='SLSQP')
It sounds like your target function doesn't have well-behaved derivatives. The line jac: array([ 0., 0., -54090399.99999981, 0.]) in the output means that only a change in the third variable has any measurable effect. And because the derivative w.r.t. this variable is virtually infinite, there is probably something wrong in the function. That is also why the third variable ends up at its upper bound.
I would suggest that you take a look at the derivatives, at least at a few points in your parameter space. Compute them using finite differences and the default step size of SciPy's fmin_l_bfgs_b, 1e-8. Here is an example of how you could compute the derivatives.
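A minimal sketch of such a check, using scipy.optimize.approx_fprime with several step sizes. The function objective below is a hypothetical smooth stand-in for the external Fortran call, so the numbers it prints say nothing about the real problem; only the starting point is taken from the question.

```python
import numpy as np
from scipy.optimize import approx_fprime

def objective(p):
    # Hypothetical stand-in for the expensive external call.
    return ((p[0] / 1000.0 - 4.5) ** 2 + (p[1] - 3.5) ** 2
            + 10.0 * p[2] ** 2 + (p[3] - 2.0) ** 2)

x0 = np.array([4435.0, 3.54, 0.13, 2.4])

# Compare forward differences for several step sizes.
for eps in (1e-8, 1e-6, 1e-4):
    grad = approx_fprime(x0, objective, eps)
    print(eps, grad)
```

If the estimates change wildly with the step size, or are zero for most variables, the function is probably noisy or piecewise constant at that scale, for example because the input file is written with limited precision.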
Try also plotting your target function. For instance, keep two of the parameters constant and let the two others vary. If the function has multiple local optima, you shouldn't use gradient-based methods like BFGS.
How difficult is it to get an analytical expression for the gradient? If you have that you can then approximate the product of Hessian with a vector using finite difference. Then you can use other optimization routines available.
Among the various optimization routines available in SciPy, the one called TNC (a truncated Newton method) is quite robust to the numerical values associated with the problem.
The Nelder-Mead Simplex Method (suggested by Cristián Antuña in the comments above) is well known to be a good choice for optimizing (possibly ill-behaved) functions with no knowledge of derivatives (see Numerical Recipes in C, Chapter 10).
There are two somewhat specific aspects to your question. The first is the constraints on the inputs, and the second is a scaling problem. The following suggests solutions to these points, but you might need to manually iterate between them a few times until things work.
Input Constraints
Assuming your input constraints form a convex region (as your examples above indicate, but I'd like to generalize it a bit), then you can write a function
def is_in_bounds(p):
    # Return True if p is in the bounds, False otherwise
    ...
Using this function, assume that the algorithm wants to move from point from_ to point to, where from_ is known to be in the region. Then the following function will efficiently find the furthermost point on the line between the two points on which it can proceed:
import numpy as np
from numpy.linalg import norm
def progress_within_bounds(from_, to, eps):
    """
    from_ -- source (in region)
    to -- target point
    eps -- Euclidean precision along the line
    """
    from_, to = np.asarray(from_, dtype=float), np.asarray(to, dtype=float)
    # norm(from_, to) would treat `to` as the norm order, so take the difference
    if norm(to - from_) < eps:
        return from_
    mid = (from_ + to) / 2
    if is_in_bounds(mid):
        return progress_within_bounds(mid, to, eps)
    return progress_within_bounds(from_, mid, eps)
(Note that this function can be optimized for some regions, but it's hardly worth the bother, as it doesn't even call your original object function, which is the expensive one.)
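For illustration, here is an iterative variant of the same bisection, applied to a hypothetical box region. The names clip_step and in_unit_box and the unit-square region are assumptions made for this sketch; passing the region test as an argument is a design choice that keeps the helper reusable.

```python
import numpy as np

def clip_step(from_, to, is_in_bounds, eps=1e-6):
    # Bisect the segment [from_, to]; from_ is assumed to be inside the region.
    from_ = np.asarray(from_, dtype=float)
    to = np.asarray(to, dtype=float)
    while np.linalg.norm(to - from_) >= eps:
        mid = (from_ + to) / 2
        if is_in_bounds(mid):
            from_ = mid  # mid is feasible, keep moving toward the target
        else:
            to = mid     # mid is infeasible, shrink the segment
    return from_

def in_unit_box(p):
    # Hypothetical convex region: the unit square [0, 1] x [0, 1].
    return bool(np.all((p >= 0.0) & (p <= 1.0)))

point = clip_step([0.5, 0.5], [2.0, 0.5], in_unit_box)
print(point)
```

The returned point always lies inside the region, at most eps away from where the segment leaves it.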
One of the nice aspects of Nelder-Mead is that the function does a series of steps which are so intuitive. Some of these points can obviously throw you out of the region, but it's easy to modify this. Here is an implementation of Nelder Mead with modifications made marked between pairs of lines of the form ##################################################################:
import copy
'''
    Pure Python/Numpy implementation of the Nelder-Mead algorithm.
    Reference: https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
'''
def nelder_mead(f, x_start,
                step=0.1, no_improve_thr=10e-6, no_improv_break=10, max_iter=0,
                alpha = 1., gamma = 2., rho = -0.5, sigma = 0.5,
                prog_eps = 1e-6):
    '''
        @param f (function): function to optimize, must return a scalar score
            and operate over a numpy array of the same dimensions as x_start
        @param x_start (numpy array): initial position
        @param step (float): look-around radius in initial step
        @no_improve_thr, no_improv_break (float, int): break after no_improv_break iterations with
            an improvement lower than no_improve_thr
        @max_iter (int): always break after this number of iterations.
            Set it to 0 to loop indefinitely.
        @alpha, gamma, rho, sigma (floats): parameters of the algorithm
            (see Wikipedia page for reference)
        @prog_eps (float): precision passed to progress_within_bounds; it is used
            but not defined in the original post, so this default is an assumption
    '''
    # init
    dim = len(x_start)
    prev_best = f(x_start)
    no_improv = 0
    res = [[x_start, prev_best]]
    for i in range(dim):
        x = copy.copy(x_start)
        x[i] = x[i] + step
        score = f(x)
        res.append([x, score])
    # simplex iter
    iters = 0
    while 1:
        # order
        res.sort(key = lambda x: x[1])
        best = res[0][1]
        # break after max_iter
        if max_iter and iters >= max_iter:
            return res[0]
        iters += 1
        # break after no_improv_break iterations with no improvement
        print('...best so far:', best)
        if best < prev_best - no_improve_thr:
            no_improv = 0
            prev_best = best
        else:
            no_improv += 1
        if no_improv >= no_improv_break:
            return res[0]
        # centroid
        x0 = [0.] * dim
        for tup in res[:-1]:
            for i, c in enumerate(tup[0]):
                x0[i] += c / (len(res)-1)
        # reflection
        xr = x0 + alpha*(x0 - res[-1][0])
        ##################################################################
        ##################################################################
        xr = progress_within_bounds(x0, x0 + alpha*(x0 - res[-1][0]), prog_eps)
        ##################################################################
        ##################################################################
        rscore = f(xr)
        if res[0][1] <= rscore < res[-2][1]:
            del res[-1]
            res.append([xr, rscore])
            continue
        # expansion
        if rscore < res[0][1]:
            xe = x0 + gamma*(x0 - res[-1][0])
            ##################################################################
            ##################################################################
            xe = progress_within_bounds(x0, x0 + gamma*(x0 - res[-1][0]), prog_eps)
            ##################################################################
            ##################################################################
            escore = f(xe)
            if escore < rscore:
                del res[-1]
                res.append([xe, escore])
                continue
            else:
                del res[-1]
                res.append([xr, rscore])
                continue
        # contraction
        xc = x0 + rho*(x0 - res[-1][0])
        ##################################################################
        ##################################################################
        xc = progress_within_bounds(x0, x0 + rho*(x0 - res[-1][0]), prog_eps)
        ##################################################################
        ##################################################################
        cscore = f(xc)
        if cscore < res[-1][1]:
            del res[-1]
            res.append([xc, cscore])
            continue
        # reduction
        x1 = res[0][0]
        nres = []
        for tup in res:
            redx = x1 + sigma*(tup[0] - x1)
            score = f(redx)
            nres.append([redx, score])
        res = nres
Note: This implementation is GPL, which is either fine for you or not. It's extremely easy to modify NM from any pseudocode, though, and you might want to throw in simulated annealing in any case.
Scaling
This is a trickier problem, but jasaarim has made an interesting point regarding that. Once the modified NM algorithm has found a point, you might want to draw contour plots (e.g. with matplotlib.pyplot.contour) while fixing a few dimensions, in order to see how the function behaves. At this point, you might want to rescale one or more of the dimensions, and rerun the modified NM.
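Rescaling can be done with a thin wrapper around the objective, so that the optimizer only sees variables of comparable magnitude. The sketch below is one way to do this under stated assumptions: the objective is a hypothetical stand-in, the scale vector is a guess at typical magnitudes, and L-BFGS-B from SciPy is used only to keep the example short.

```python
import numpy as np
from scipy.optimize import minimize

def objective(p):
    # Hypothetical stand-in: the first variable lives on a scale of thousands.
    return ((p[0] - 5000.0) / 1000.0) ** 2 + (p[1] - 3.5) ** 2

scale = np.array([1000.0, 1.0])          # assumed typical magnitude of each variable
bounds = [(4000.0, 6000.0), (3.0, 4.5)]
x0 = np.array([4435.0, 3.54])

def scaled_objective(u):
    # The optimizer works on u = p / scale; convert back before evaluating.
    return objective(u * scale)

scaled_bounds = [(lo / s, hi / s) for (lo, hi), s in zip(bounds, scale)]
res = minimize(scaled_objective, x0 / scale, bounds=scaled_bounds, method='L-BFGS-B')
p_opt = res.x * scale  # back to the original units
print(p_opt)
```

The same wrapper can be handed to the modified Nelder-Mead routine, provided is_in_bounds is written in the scaled coordinates as well.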