Scipy.optimize.minimize method='SLSQP' ignores constraint - python

I'm using SciPy for optimization and the method SLSQP seems to ignore my constraints.
Specifically, I want x[3] and x[4] to be in the range [0-1]
I'm getting the message: 'Inequality constraints incompatible'
Here is the results of the execution followed by an example code (uses a dummy function):
status: 4
success: False
njev: 2
nfev: 24
fun: 0.11923608071680103
x: array([-10993.4278558 , -19570.77080806, -23495.15914299, -26531.4862831 ,
4679.97660534])
message: 'Inequality constraints incompatible'
jac: array([ 12548372.4766904 , 12967696.88362279, 39928956.72239509,
-9224613.99092537, 3954696.30747453, 0. ])
nit: 2
Here is my code:
from random import random
from scipy.optimize import minimize
def func(x):
""" dummy function to optimize """
print 'x'+str(x)
return random()
my_constraints = ({'type':'ineq', 'fun':lambda(x):1-x[3]-x[4]},
{'type':'ineq', 'fun':lambda(x):x[3]},
{'type':'ineq', 'fun':lambda(x):x[4]},
{'type':'ineq', 'fun':lambda(x):1-x[4]},
{'type':'ineq', 'fun':lambda(x):1-x[3]})
minimize(func, [57.9499 ,-18.2736,1.1664,0.0000,0.0765],
method='SLSQP',constraints=my_constraints)
EDIT -
The problem persists when even when removing the first constraint.
The problem persists when I try to use the bounds variables.
i.e.,
bounds_pairs = [(None,None),(None,None),(None,None),(0,1),(0,1)]
minimize(f,initial_guess,method=method_name,bounds=bounds_pairs,constraints=non_negative_prob)

I know this is a very old question, but I was intrigued.
When does it happen?
This problem occurs when the optimisation function is not reliably differentiable. If you use a nice smooth function like this:
opt = numpy.array([2, 2, 2, 2, 2])
def func(x):
return sum((x - opt)**2)
The problem goes away.
How do I impose hard constraints?
Note that none of the constrained algorithms in scipy.minimize have guarantees that the function will never be evaluated outside the constraints. If this is a requirement for you, you should rather use transformations. So for instance to ensure that no negative values for x[3] are ever used, you can use a transformation x3_real = 10^x[3]. This way x[3] can be any value but the variable you use will never be negative.
Deeper analysis
Investigating the Fortran code for slsqp yields the following insights into when this error occurs. The routine returns a MODE variable, which can take on these values:
C* MODE = -1: GRADIENT EVALUATION, (G&A) *
C* 0: ON ENTRY: INITIALIZATION, (F,G,C&A) *
C* ON EXIT : REQUIRED ACCURACY FOR SOLUTION OBTAINED *
C* 1: FUNCTION EVALUATION, (F&C) *
C* *
C* FAILURE MODES: *
C* 2: NUMBER OF EQUALITY CONTRAINTS LARGER THAN N *
C* 3: MORE THAN 3*N ITERATIONS IN LSQ SUBPROBLEM *
C* 4: INEQUALITY CONSTRAINTS INCOMPATIBLE *
C* 5: SINGULAR MATRIX E IN LSQ SUBPROBLEM *
C* 6: SINGULAR MATRIX C IN LSQ SUBPROBLEM *
The part which assigns mode 4 (which is the error you are getting) is as follows:
C SEARCH DIRECTION AS SOLUTION OF QP - SUBPROBLEM
CALL dcopy_(n, xl, 1, u, 1)
CALL dcopy_(n, xu, 1, v, 1)
CALL daxpy_sl(n, -one, x, 1, u, 1)
CALL daxpy_sl(n, -one, x, 1, v, 1)
h4 = one
CALL lsq (m, meq, n , n3, la, l, g, a, c, u, v, s, r, w, iw, mode)
C AUGMENTED PROBLEM FOR INCONSISTENT LINEARIZATION
IF (mode.EQ.6) THEN
IF (n.EQ.meq) THEN
mode = 4
ENDIF
ENDIF
So basically you can see it attempts to find a descent direction, if the constraints are active it attempts derivative evaluation along the constraint and fails with a singular matrix in the lsq subproblem (mode = 6), then it reasons that if all the constraint equations were evaluated and none yielded successful descent directions, this must be a contradictory set of constraints (mode = 4).

Related

Constraint with scipy

I have a problem of multi-objective optimization under constraint (maximization), in fact I transformed it into a mono-objective problem via the weighting technique and I added 2 variable x1, x2 (to optimize) with a constraint 0 <x1 + x2 <1 so their sum must be strictly less than 1 to give rise to the 3rd objective function as described in the code below.
When I execute the sum is always greater than 1.
for i in range(len(f1)):
def f(x):
x1= x[0]
x2= x[1]
return -(x1*f1[i]+ x2*f2[i]+ (1-x1-x2)*f3[i])
def constraint(x):
return x[0]+x[1]-1
b= (0.2, 0.8)
bnds= (b, b)
x0=[0.5,0.4]
cons= ({'type': 'ineq','fun':constraint})
res = minimize(f,x0, method= 'SLSQP', bounds=bnds, constraints=cons)
print('Vect_ponderation : ', res.x)
Output:
Vect_ponderation : [0.8 0.8]
Vect_ponderation : [0.8 0.8]
Vect_ponderation : [0.8 0.8]
Vect_ponderation : [0.8 0.8]
The documentation says:
Equality constraint means that the constraint function result is to be zero whereas inequality means that it is to be non-negative. Note that COBYLA only supports inequality constraints.
def constraint(x):
return x[0] + x[1] -1
means:
x0 + x1 - 1 >= 0 # non-negative
<->
x0 + x1 >= 1
which is respected in your solution.
You probably want:
def constraint(x):
return - x[0] - x[1] + 1
Also keep in mind, that there is no concept of strict inequality. You will need to introduce some epsilon constant a-priori like: eps = 1e-6:
def constraint(x):
return - x[0] - x[1] + 1 - EPS
As I understand, your problem is that the output has x1+x2>1. I've checked optimize documentation and it says: 'inequality means that it is to be non-negative'. Thus, you have requested something opposite to what you think you have requested.
Suggestions:
You have very simple function, so try providing jacobian as an additional parameter. The fit is much more efficient then.
Try to present minimal reproducible example. This makes answering much easier. Your question lacks f1, f2, and f3 definitions.
Try to ask a question. I've found something that might be the problem you're trying to solve, but I'm not sure because there is no question (i.e., sentence with question mark) in your question.

Minimize quadratic function subject to linear equality constraints with SciPy

I have a reasonably simple constrained optimization problem but get different answers depending on how I do it. Let's get the import and a pretty print function out of the way first:
import numpy as np
from scipy.optimize import minimize, LinearConstraint, NonlinearConstraint, SR1
def print_res( res, label ):
print("\n\n ***** ", label, " ***** \n")
print(res.message)
print("obj func value at solution", obj_func(res.x))
print("starting values: ", x0)
print("ending values: ", res.x.astype(int) )
print("% diff", (100.*(res.x-x0)/x0).astype(int) )
print("target achieved?",target,res.x.sum())
The sample data is very simple:
n = 5
x0 = np.arange(1,6) * 10_000
target = x0.sum() + 5_000 # increase sum from 15,000 to 20,000
Here's the constrained optimization (including jacobians). In words, the objective function I want to minimize is just the sum of squared percentage changes from the initial values to final values. The linear equality constraint is simply requiring x.sum() to equal a constant.
def obj_func(x):
return ( ( ( x - x0 ) / x0 ) ** 2 ).sum()
def obj_jac(x):
return 2. * ( x - x0 ) / x0 ** 2
def constr_func(x):
return x.sum() - target
def constr_jac(x):
return np.ones(n)
And for comparison, I've re-factored as an unconstrained minimization by using the equality constraint to replace x[0] with a function of x[1:]. Note that the unconstrained function is passed x0[1:] whereas the constrained function is passed x0.
def unconstr_func(x):
x_one = target - x.sum()
first_term = ( ( x_one - x0[0] ) / x0[0] ) ** 2
second_term = ( ( ( x - x0[1:] ) / x0[1:] ) ** 2 ).sum()
return first_term + second_term
I then try to minimize in three ways:
Unconstrained with 'Nelder-Mead'
Constrained with 'trust-constr' (w/ & w/o jacobian)
Constrained with 'SLSQP' (w/ & w/o jacobian)
Code:
##### (1) unconstrained
res0 = minimize( unconstr_func, x0[1:], method='Nelder-Mead') # OK, but weird note
res0.x = np.hstack( [target - res0.x.sum(), res0.x] )
print_res( res0, 'unconstrained' )
##### (2a) constrained -- trust-constr w/ jacobian
nonlin_con = NonlinearConstraint( constr_func, 0., 0., constr_jac )
resTCjac = minimize( obj_func, x0, method='trust-constr',
jac='2-point', hess=SR1(), constraints = nonlin_con )
print_res( resTCjac, 'trust-const w/ jacobian' )
##### (2b) constrained -- trust-constr w/o jacobian
nonlin_con = NonlinearConstraint( constr_func, 0., 0. )
resTC = minimize( obj_func, x0, method='trust-constr',
jac='2-point', hess=SR1(), constraints = nonlin_con )
print_res( resTC, 'trust-const w/o jacobian' )
##### (3a) constrained -- SLSQP w/ jacobian
eq_cons = { 'type': 'eq', 'fun' : constr_func, 'jac' : constr_jac }
resSQjac = minimize( obj_func, x0, method='SLSQP',
jac = obj_jac, constraints = eq_cons )
print_res( resSQjac, 'SLSQP w/ jacobian' )
##### (3b) constrained -- SLSQP w/o jacobian
eq_cons = { 'type': 'eq', 'fun' : constr_func }
resSQ = minimize( obj_func, x0, method='SLSQP',
jac = obj_jac, constraints = eq_cons )
print_res( resSQ, 'SLSQP w/o jacobian' )
Here is some simplified output (and of course you can run the code to get the full output):
starting values: [10000 20000 30000 40000 50000]
***** (1) unconstrained *****
Optimization terminated successfully.
obj func value at solution 0.0045454545454545305
ending values: [10090 20363 30818 41454 52272]
***** (2a) trust-const w/ jacobian *****
The maximum number of function evaluations is exceeded.
obj func value at solution 0.014635854609684874
ending values: [10999 21000 31000 41000 51000]
***** (2b) trust-const w/o jacobian *****
`gtol` termination condition is satisfied.
obj func value at solution 0.0045454545462939935
ending values: [10090 20363 30818 41454 52272]
***** (3a) SLSQP w/ jacobian *****
Optimization terminated successfully.
obj func value at solution 0.014636111111111114
ending values: [11000 21000 31000 41000 51000]
***** (3b) SLSQP w/o jacobian *****
Optimization terminated successfully.
obj func value at solution 0.014636111111111114
ending values: [11000 21000 31000 41000 51000]
Notes:
(1) & (2b) are plausible solutions in that they achieve significantly lower objective function values and intuitively we'd expect the variables with larger starting values to move more (both absolutely and in percentage terms) than the smaller ones.
Adding the jacobian to 'trust-const' causes it to get the wrong answer (or at least a worse answer) and also to exceed max iterations. Maybe the jacobian is wrong, but the function is so simple that I'm pretty sure it's correct (?)
'SLSQP' doesn't seem to work w/ or w/o the jacobian supplied, but works very fast and claims to terminate successfully. This seems very worrisome in that getting the wrong answer and claiming to have terminated successfully is pretty much the worst possible outcome.
Initially I used very small starting values and targets (just 1/1,000 of what I have above) and in that case all 5 approaches above work fine and give the same answers. My sample data is still extremely small, and it seems kinda bizarre for it to handle 1,2,..,5 but not 1000,2000,..5000.
FWIW, note that the 3 incorrect results all hit the target by adding 1,000 to each initial value -- this satisfies the constraint but comes nowhere near minimizing the objective function (b/c variables with higher initial values should be increased more than lower ones to minimize the sum of squared percentage differences).
So my question is really just what is happening here and why do only (1) and (2b) seem to work?
More generally, I'd like to find a good python-based approach to this and similar optimization problems and will consider answers using other packages besides scipy although the best answer would ideally also address what is going on with scipy here (e.g. is this user error or a bug I should post to github?).
Here is how this problem could be solved using nlopt which is a library for nonlinear optimization which I've been pretty impressed with.
First, the objective function and gradient are both defined using the same function:
def obj_func(x, grad):
if grad.size > 0:
grad[:] = obj_jac(x)
return ( ( ( x/x0 - 1 )) ** 2 ).sum()
def obj_jac(x):
return 2. * ( x - x0 ) / x0 ** 2
def constr_func(x, grad):
if grad.size > 0:
grad[:] = constr_jac(x)
return x.sum() - target
def constr_jac(x):
return np.ones(n)
Then, to run the minimization using Nelder-Mead and SLSQP:
opt = nlopt.opt(nlopt.LN_NELDERMEAD,len(x0)-1)
opt.set_min_objective(unconstr_func)
opt.set_ftol_abs(1e-15)
xopt = opt.optimize(x0[1:].copy())
xopt = np.hstack([target - xopt.sum(), xopt])
fval = opt.last_optimum_value()
print_res(xopt,fval,"Nelder-Mead");
opt = nlopt.opt(nlopt.LD_SLSQP,len(x0))
opt.set_min_objective(obj_func)
opt.add_equality_constraint(constr_func)
opt.set_ftol_abs(1e-15)
xopt = opt.optimize(x0.copy())
fval = opt.last_optimum_value()
print_res(xopt,fval,"SLSQP w/ jacobian");
And here are the results:
***** Nelder-Mead *****
obj func value at solution 0.00454545454546
result: 3
starting values: [ 10000. 20000. 30000. 40000. 50000.]
ending values: [10090 20363 30818 41454 52272]
% diff [0 1 2 3 4]
target achieved? 155000.0 155000.0
***** SLSQP w/ jacobian *****
obj func value at solution 0.00454545454545
result: 3
starting values: [ 10000. 20000. 30000. 40000. 50000.]
ending values: [10090 20363 30818 41454 52272]
% diff [0 1 2 3 4]
target achieved? 155000.0 155000.0
When testing this out, I think I discovered what the issue with the original attempt was. If I set the absolute tolerance on the function to 1e-8 which is what the scipy functions default to I get:
***** Nelder-Mead *****
obj func value at solution 0.0045454580693
result: 3
starting values: [ 10000. 20000. 30000. 40000. 50000.]
ending values: [10090 20363 30816 41454 52274]
% diff [0 1 2 3 4]
target achieved? 155000.0 155000.0
***** SLSQP w/ jacobian *****
obj func value at solution 0.0146361108503
result: 3
starting values: [ 10000. 20000. 30000. 40000. 50000.]
ending values: [10999 21000 31000 41000 51000]
% diff [9 5 3 2 2]
target achieved? 155000.0 155000.0
which is exactly what you were seeing. So my guess is that the minimizer ends up somewhere in the likelihood space during SLSQP where the next jump is less than 1e-8 from the last place.
This is a partial answer to the question that I'm putting here to keep the question from getting even bigger, but I'd still love to see a more comprehensive and explanatory answer. These answers are based on comments from two others, but neither of them fully wrote out the code, and I thought it would make sense to make that explicit so here it is:
Fixing 2a (trust-constr with jacobian)
It's seems that the key here with regard to the Jacobian and Hessian is to specify neither or both (but not the jacobian only). #SubhaneilLahiri commented to this effect and there was also an error message to this effect that I initially failed to notice:
UserWarning: delta_grad == 0.0. Check if the approximated function is linear. If the function is linear better results can be obtained by defining the Hessian as zero instead of using quasi-Newton approximations.
So I fixed it by defining the hessian function:
def constr_hess(x,v):
return np.zeros([n,n])
and adding it to the constraint
nonlin_con = NonlinearConstraint( constr_func, 0., 0., constr_jac, constr_hess )
Fixing 3a & 3b (SLSQP)
This just seemed to be a matter of making the tolerance smaller as suggested by #user545424. So I just added options={'ftol':1e-15} to the minimization:
resSQjac = minimize( obj_func, x0, method='SLSQP',
options={'ftol':1e-15},
jac = obj_jac, constraints = eq_cons )

Using numpy to solve a linear system with one uknown?

I am trying to solve a really simple problem programatically.
I have a single variable t that must satisfy 2 equations simultaneously as follows:
x_v*t = (x_1 - x_2)
y_v*t = (y_1 - y_2)
My first reaction is to just solve it by dividing the right side by the coefficient in the left, however that coefficient is not guaranteed to be non 0.
Thus we can always use the RREF algorithm and represent the system as:
a | b
c | d
where a = x_v, b = (x_1 - x_2), c = y_v, d = (y_1 - y_2)
After finding the RREF we could have:
The 0 matrix (system is solvable)
First row has a leading one and the second row is 0's (system is sovable)
Either row has a leading 0 and a non zero trailing number (system is not solvable)
Although I could try to code the above myself, I wanted to use a library instead where I can just setup the system and ask an api whether a solution exists or not, so I used numpy.
Currently however I can't even set a system where the non-extended matrix is not square.
Is this achievable?
This is achievable. You can use the function fsolve of the scipy library. An example
import numpy as np
import scipy.optimize as so
def f(t, x_v, x_1, x_2, y_v, y_1, y_2):
return np.sum(np.abs([
x_v*t - (x_1 - x_2),
y_v*t - (y_1 - y_2),
]))
and then you would do
sol_object = so.fsolve(
func = f, # the function that returns the (scalar) 0 you want.
x0 = 1, # The starting estimate
args = (1, 2, 3, 1, 2, 3), # Other arguments of f, i.e. x_v, x_1, x_2, y_v, y_1, y_2
full_output = True
)
sol = sol_object[0]
message = sol_object[-1]
print(sol)
print(message)
Output
[-1.]
The solution converged.
As mentioned in comment by jdhesa, this could have been done using linear-in-parameter solving methods. The one I use above a priori works with any kind of transformation.
An alternative is to just perform the division.
If both "sides" are zero then the result will be NaN (0/0).
If the rhs i.e. (x_1 - x_2) is nonzero the result will be inf.
# c1 is np.array([x_1, y_1, z_1, ...])
# c2 is np.array([x_2, y_2, z_2, ...])
c = c1 - c2
# Use this to supress numpy warnings
with np.warnings.catch_warnings():
np.warnings.filterwarnings('ignore', 'invalid value encountered in true_divide')
np.warnings.filterwarnings('ignore','divide by zero encountered in true_divide')
t = c / v
non_nan_t = t[~np.isnan(t)]
if np.isinf(t).any():
print('unsolvable because rhs is nonzero but v is zero')
elif not np.allclose(non_nan_t, non_nan_t[0]):
print('no solution because equations disagree')
else:
print('solution:', non_nan_t[0])

Scipy.optimize.minimize SLSQP with linear constraints fails

Consider the following (convex) optimization problem:
minimize 0.5 * y.T * y
s.t. A*x - b == y
where the optimization (vector) variables are x and y and A, b are a matrix and vector, respectively, of appropriate dimensions.
The code below finds a solution easily using the SLSQP method from Scipy:
import numpy as np
from scipy.optimize import minimize
# problem dimensions:
n = 10 # arbitrary integer set by user
m = 2 * n
# generate parameters A, b:
np.random.seed(123) # for reproducibility of results
A = np.random.randn(m,n)
b = np.random.randn(m)
# objective function:
def obj(z):
vy = z[n:]
return 0.5 * vy.dot(vy)
# constraint function:
def cons(z):
vx = z[:n]
vy = z[n:]
return A.dot(vx) - b - vy
# constraints input for SLSQP:
cons = ({'type': 'eq','fun': cons})
# generate a random initial estimate:
z0 = np.random.randn(n+m)
sol = minimize(obj, x0 = z0, constraints = cons, method = 'SLSQP', options={'disp': True})
Optimization terminated successfully. (Exit mode 0)
Current function value: 2.12236220865
Iterations: 6
Function evaluations: 192
Gradient evaluations: 6
Note that the constraint function is a convenient 'array-output' function.
Now, instead of an array-output function for the constraint, one could in principle use an equivalent set of 'scalar-output' constraint functions (actually, the scipy.optimize documentation discusses only this type of constraint functions as input to minimize).
Here is the equivalent constraint set followed by the output of minimize (same A, b, and initial value as the above listing):
# this is the i-th element of cons(z):
def cons_i(z, i):
vx = z[:n]
vy = z[n:]
return A[i].dot(vx) - b[i] - vy[i]
# listable of scalar-output constraints input for SLSQP:
cons_per_i = [{'type':'eq', 'fun': lambda z: cons_i(z, i)} for i in np.arange(m)]
sol2 = minimize(obj, x0 = z0, constraints = cons_per_i, method = 'SLSQP', options={'disp': True})
Singular matrix C in LSQ subproblem (Exit mode 6)
Current function value: 6.87999270692
Iterations: 1
Function evaluations: 32
Gradient evaluations: 1
Evidently, the algorithm fails (the returning objective value is actually the objective value for the given initialization), which I find a bit weird. Note that running [cons_per_i[i]['fun'](sol.x) for i in np.arange(m)] shows that sol.x, obtained using the array-output constraint formulation, satisfies all scalar-output constraints of cons_per_i as expected (within numerical tolerance).
I would appreciate if anyone has some explanation for this issue.
You've run into the "late binding closures" gotcha. All the calls to cons_i are being made with the second argument equal to 19.
A fix is to use the args dictionary element in the dictionary that defines the constraints instead of the lambda function closures:
cons_per_i = [{'type':'eq', 'fun': cons_i, 'args': (i,)} for i in np.arange(m)]
With this, the minimization works:
In [417]: sol2 = minimize(obj, x0 = z0, constraints = cons_per_i, method = 'SLSQP', options={'disp': True})
Optimization terminated successfully. (Exit mode 0)
Current function value: 2.1223622086
Iterations: 6
Function evaluations: 192
Gradient evaluations: 6
You could also use the the suggestion made in the linked article, which is to use a lambda expression with a second argument that has the desired default value:
cons_per_i = [{'type':'eq', 'fun': lambda z, i=i: cons_i(z, i)} for i in np.arange(m)]

Minimizing a multivariable function with scipy. Derivative not known

I have a function which is actually a call to another program (some Fortran code). When I call this function (run_moog) I can parse 4 variables, and it returns 6 values. These values should all be close to 0 (in order to minimize). However, I combined them like this: np.sum(results**2). Now I have a scalar function. I would like to minimize this function, i.e. get the np.sum(results**2) as close to zero as possible.
Note: When this function (run_moog) takes the 4 input parameters, it creates an input file for the Fortran code that depends on these parameters.
I have tried several ways to optimize this from the scipy docs. But none works as expected. The minimization should be able to have bounds on the 4 variables. Here is an attempt:
from scipy.optimize import minimize # Tried others as well from the docs
x0 = 4435, 3.54, 0.13, 2.4
bounds = [(4000, 6000), (3.00, 4.50), (-0.1, 0.1), (0.0, None)]
a = minimize(fun_mmog, x0, bounds=bounds, method='L-BFGS-B') # I've tried several different methods here
print a
This then gives me
status: 0
success: True
nfev: 5
fun: 2.3194639999999964
x: array([ 4.43500000e+03, 3.54000000e+00, 1.00000000e-01,
2.40000000e+00])
message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
jac: array([ 0., 0., -54090399.99999981, 0.])
nit: 0
The third parameter changes slightly, while the others are exactly the same. Also there have been 5 function calls (nfev) but no iterations (nit). The output from scipy is shown here.
Couple of possibilities:
Try COBYLA. It should be derivative-free, and supports inequality constraints.
You can't use different epsilons via the normal interface; so try scaling your first variable by 1e4. (Divide it going in, multiply coming back out.)
Skip the normal automatic jacobian constructor, and make your own:
Say you're trying to use SLSQP, and you don't provide a jacobian function. It makes one for you. The code for it is in approx_jacobian in slsqp.py. Here's a condensed version:
def approx_jacobian(x,func,epsilon,*args):
x0 = asfarray(x)
f0 = atleast_1d(func(*((x0,)+args)))
jac = zeros([len(x0),len(f0)])
dx = zeros(len(x0))
for i in range(len(x0)):
dx[i] = epsilon
jac[i] = (func(*((x0+dx,)+args)) - f0)/epsilon
dx[i] = 0.0
return jac.transpose()
You could try replacing that loop with:
for (i, e) in zip(range(len(x0)), epsilon):
dx[i] = e
jac[i] = (func(*((x0+dx,)+args)) - f0)/e
dx[i] = 0.0
You can't provide this as the jacobian to minimize, but fixing it up for that is straightforward:
def construct_jacobian(func,epsilon):
def jac(x, *args):
x0 = asfarray(x)
f0 = atleast_1d(func(*((x0,)+args)))
jac = zeros([len(x0),len(f0)])
dx = zeros(len(x0))
for i in range(len(x0)):
dx[i] = epsilon
jac[i] = (func(*((x0+dx,)+args)) - f0)/epsilon
dx[i] = 0.0
return jac.transpose()
return jac
You can then call minimize like:
minimize(fun_mmog, x0,
jac=construct_jacobian(fun_mmog, [1e0, 1e-4, 1e-4, 1e-4]),
bounds=bounds, method='SLSQP')
It sounds like your target function doesn't have well-behaving derivatives. The line in the output jac: array([ 0., 0., -54090399.99999981, 0.]) means that changing only the third variable value is significant. And because the derivative w.r.t. to this variable is virtually infinite, there is probably something wrong in the function. That is also why the third variable value ends up in its maximum.
I would suggest that you take a look at the derivatives, at least in a few points in your parameter space. Compute them using finite differences and the default step size of SciPy's fmin_l_bfgs_b, 1e-8. Here is an example of how you could compute the derivates.
Try also plotting your target function. For instance, keep two of the parameters constant and let the two others vary. If the function has multiple local optima, you shouldn't use gradient-based methods like BFGS.
How difficult is it to get an analytical expression for the gradient? If you have that you can then approximate the product of Hessian with a vector using finite difference. Then you can use other optimization routines available.
Among the various optimization routines available in SciPy, the one called TNC (Newton Conjugate Gradient with Truncation) is quite robust to the numerical values associated with the problem.
The Nelder-Mead Simplex Method (suggested by Cristián Antuña in the comments above) is well known to be a good choice for optimizing (posibly ill-behaved) functions with no knowledge of derivatives (see Numerical Recipies In C, Chapter 10).
There are two somewhat specific aspects to your question. The first is the constraints on the inputs, and the second is a scaling problem. The following suggests solutions to these points, but you might need to manually iterate between them a few times until things work.
Input Constraints
Assuming your input constraints form a convex region (as your examples above indicate, but I'd like to generalize it a bit), then you can write a function
is_in_bounds(p):
# Return if p is in the bounds
Using this function, assume that the algorithm wants to move from point from_ to point to, where from_ is known to be in the region. Then the following function will efficiently find the furthermost point on the line between the two points on which it can proceed:
from numpy.linalg import norm
def progress_within_bounds(from_, to, eps):
"""
from_ -- source (in region)
to -- target point
eps -- Eucliedan precision along the line
"""
if norm(from_, to) < eps:
return from_
mid = (from_ + to) / 2
if is_in_bounds(mid):
return progress_within_bounds(mid, to, eps)
return progress_within_bounds(from_, mid, eps)
(Note that this function can be optimized for some regions, but it's hardly worth the bother, as it doesn't even call your original object function, which is the expensive one.)
One of the nice aspects of Nelder-Mead is that the function does a series of steps which are so intuitive. Some of these points can obviously throw you out of the region, but it's easy to modify this. Here is an implementation of Nelder Mead with modifications made marked between pairs of lines of the form ##################################################################:
import copy
'''
Pure Python/Numpy implementation of the Nelder-Mead algorithm.
Reference: https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
'''
def nelder_mead(f, x_start,
step=0.1, no_improve_thr=10e-6, no_improv_break=10, max_iter=0,
alpha = 1., gamma = 2., rho = -0.5, sigma = 0.5):
'''
#param f (function): function to optimize, must return a scalar score
and operate over a numpy array of the same dimensions as x_start
#param x_start (numpy array): initial position
#param step (float): look-around radius in initial step
#no_improv_thr, no_improv_break (float, int): break after no_improv_break iterations with
an improvement lower than no_improv_thr
#max_iter (int): always break after this number of iterations.
Set it to 0 to loop indefinitely.
#alpha, gamma, rho, sigma (floats): parameters of the algorithm
(see Wikipedia page for reference)
'''
# init
dim = len(x_start)
prev_best = f(x_start)
no_improv = 0
res = [[x_start, prev_best]]
for i in range(dim):
x = copy.copy(x_start)
x[i] = x[i] + step
score = f(x)
res.append([x, score])
# simplex iter
iters = 0
while 1:
# order
res.sort(key = lambda x: x[1])
best = res[0][1]
# break after max_iter
if max_iter and iters >= max_iter:
return res[0]
iters += 1
# break after no_improv_break iterations with no improvement
print '...best so far:', best
if best < prev_best - no_improve_thr:
no_improv = 0
prev_best = best
else:
no_improv += 1
if no_improv >= no_improv_break:
return res[0]
# centroid
x0 = [0.] * dim
for tup in res[:-1]:
for i, c in enumerate(tup[0]):
x0[i] += c / (len(res)-1)
# reflection
xr = x0 + alpha*(x0 - res[-1][0])
##################################################################
##################################################################
xr = progress_within_bounds(x0, x0 + alpha*(x0 - res[-1][0]), prog_eps)
##################################################################
##################################################################
rscore = f(xr)
if res[0][1] <= rscore < res[-2][1]:
del res[-1]
res.append([xr, rscore])
continue
# expansion
if rscore < res[0][1]:
xe = x0 + gamma*(x0 - res[-1][0])
##################################################################
##################################################################
xe = progress_within_bounds(x0, x0 + gamma*(x0 - res[-1][0]), prog_eps)
##################################################################
##################################################################
escore = f(xe)
if escore < rscore:
del res[-1]
res.append([xe, escore])
continue
else:
del res[-1]
res.append([xr, rscore])
continue
# contraction
xc = x0 + rho*(x0 - res[-1][0])
##################################################################
##################################################################
xc = progress_within_bounds(x0, x0 + rho*(x0 - res[-1][0]), prog_eps)
##################################################################
##################################################################
cscore = f(xc)
if cscore < res[-1][1]:
del res[-1]
res.append([xc, cscore])
continue
# reduction
x1 = res[0][0]
nres = []
for tup in res:
redx = x1 + sigma*(tup[0] - x1)
score = f(redx)
nres.append([redx, score])
res = nres
Note This implementation is GPL, which is either fine for you or not. It's extremely easy to modify NM from any pseudocode, though, and you might want to throw in simulated annealing in any case.
Scaling
This is a trickier problem, but jasaarim has made an interesting point regarding that. Once the modified NM algorithm has found a point, you might want to run matplotlib.contour while fixing a few dimensions, in order to see how the function behaves. At this point, you might want to rescale one or more of the dimensions, and rerun the modified NM.
–

Categories