I'm trying to minimize a constrained function of several variables using the algorithm scipy.optimize.minimize. The function involves the minimization of 3*N parameters, where N is an input. More specifically, my minimization parameters are given in three arrays H = H[0],H[1],...,H[N-1], a = a[0],a[1],...,a[N-1] and b = b[0],b[1],...,b[N-1], which I concatenated into a single array named mins, with len(mins) = 3*N.
These parameters are also subject to the following constraints:
0 <= H and sum(H) = 0.5
0 <= a <= Pi/2
0 <= b <= Pi/2
So my code for the constraints reads:
import numpy as np

# constraints on H:
def Hlhs(mins):  # left-hand side
    return np.diag(np.ones(N)) @ mins.reshape(3,N)[0]
def Hrhs(mins):  # right-hand side
    return np.sum(mins.reshape(3,N)[0]) - 0.5
con1H = {'type': 'ineq', 'fun': lambda H: Hlhs(H)}
con2H = {'type': 'eq', 'fun': lambda H: Hrhs(H)}

# constraints on a:
def alhs(mins):
    return np.diag(np.ones(N)) @ mins.reshape(3,N)[1]
def arhs(mins):
    return -np.diag(np.ones(N)) @ mins.reshape(3,N)[1] + (np.ones(N))*np.pi/2
con1a = {'type': 'ineq', 'fun': lambda a: alhs(a)}
con2a = {'type': 'ineq', 'fun': lambda a: arhs(a)}

# constraints on b:
def blhs(mins):
    return np.diag(np.ones(N)) @ mins.reshape(3,N)[2]
def brhs(mins):
    return -np.diag(np.ones(N)) @ mins.reshape(3,N)[2] + (np.ones(N))*np.pi/2
con1b = {'type': 'ineq', 'fun': lambda b: blhs(b)}
con2b = {'type': 'ineq', 'fun': lambda b: brhs(b)}
My function to be minimized, with the other parameters (and adopting N = 3), is given below (sorry that it is a bit long):
gamma = 17
C = 85
T = 0
Hf = 0.5
Li = 2
Bi = 1
N = 3

def FUN(mins):
    H, a, b = mins.reshape(3,N)
    S1 = 0; S2 = 0
    B = np.zeros(N); L = np.zeros(N)
    for i in range(N):
        sbi = Bi; sli = Li
        for j in range(i+1):
            sbi += 2*H[j]*np.tan(b[j])
            sli += 2*H[j]*np.tan(a[j])
        B[i] = sbi
        L[i] = sli
    for i in range(N):
        S1 += (C*(1-np.sin(a[i])) + T*np.sin(a[i])) * (Bi*H[i]+H[i]**2*np.tan(b[i]))/np.cos(a[i]) + \
              (C*(1-np.sin(b[i])) + T*np.sin(b[i])) * (Li*H[i]+H[i]**2*np.tan(a[i]))/np.cos(b[i])
    S2 += (gamma*H[0]/12)*(Bi*Li + 4*(B[0]-H[0]*np.tan(b[0]))*(L[0]-H[0]*np.tan(a[0])) + B[0]*L[0])
    j = 1
    while j < N:
        S2 += (gamma*H[j]/12)*(B[j-1]*L[j-1] + 4*(B[j]-H[j]*np.tan(b[j]))*(L[j]-H[j]*np.tan(a[j])) + B[j]*L[j])
        j += 1
    F = 2*(S1+S2)
    return F
And finally, adopting an initial guess of 0 for all values, the minimization is given by:
import scipy.optimize

x0 = np.zeros(3*N)
res = scipy.optimize.minimize(FUN, x0, constraints=(con1H, con2H, con1a, con2a, con1b, con2b), tol=1e-25)
My problems are:
a) Looking at the result res, some values turn out negative even though I have constraints requiring them to be positive. The minimization reports success: False with the message Positive directional derivative for linesearch. Also, the result is very far from the expected minimum.
b) Using method='trust-constr' I get a value closer to what I expect, but success is still False, with the message The maximum number of function evaluations is exceeded. Is there any way to improve this?
I know that there is a minimum very close to these values:
H = [0.2,0.15,0.15]
a = [1.0053,1.0053,1.2566]
b = [1.0681,1.1310,1.3195]
where the value of the function is 123.45. I've checked the function several times and it seems to be working properly. Can anyone help me find where my problem is? I've tried changing xtol and maxiter, but with no success.
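For completeness, here is a quick sanity check (a sketch only) that simply evaluates FUN at the values quoted above; if the function is implemented correctly it should return roughly the quoted 123.45:
x_known = np.concatenate(([0.2, 0.15, 0.15],
                          [1.0053, 1.0053, 1.2566],
                          [1.0681, 1.1310, 1.3195]))
print(FUN(x_known))  # expected to be close to 123.45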
Here are a few hints:
Your initial point x0 is not feasible since it doesn't satisfy the constraint sum(H) = 0.5. Providing a feasible initial point should fix your first problem.
Except for the constraint sum(H) = 0.5, all constraints are simple bounds on the variables. In general, it's recommended to pass variable bounds via the bounds parameter of minimize. You can simply define and pass all the bounds like this
from scipy.optimize import minimize
import numpy as np

# ... your variables and functions ...

bounds = [(0, None)]*N + [(0, np.pi/2)]*2*N
x0 = np.zeros(3*N)
x0[0] = 0.5
res = minimize(FUN, x0, constraints=(con2H,), bounds=bounds,
               method="trust-constr", options={'maxiter': 20000})
where each tuple contains the lower and upper bound for each variable.
Unfortunately, 'trust-constr' still has trouble converging to a local minimizer. In this case, you can either try other initial points or use the state-of-the-art open-source solver Ipopt instead. The Cython wrapper cyipopt provides an interface similar to scipy's:
from cyipopt import minimize_ipopt
# rest as above
res = minimize_ipopt(FUN, x0, constraints=(con2H,), bounds=bounds)
This gives me a solution with an objective value of 122.9.
Last but not least, it's always a good idea to provide exact gradients, Jacobians and Hessians.
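As a minimal sketch (under the same mins layout as above): the equality constraint sum(H) - 0.5 has a trivial analytic Jacobian, which SLSQP accepts through the 'jac' key of the constraint dictionary, and an analytic gradient of the objective can be passed to minimize via its jac argument. FUN_grad below is a hypothetical placeholder, not derived here:
def Hrhs_jac(mins):
    # gradient of sum(H) - 0.5 with respect to mins: 1 for the N entries of H, 0 otherwise
    g = np.zeros(3*N)
    g[:N] = 1.0
    return g

con2H = {'type': 'eq', 'fun': Hrhs, 'jac': Hrhs_jac}

# res = minimize(FUN, x0, jac=FUN_grad, bounds=bounds, constraints=(con2H,), method="SLSQP")
# FUN_grad would return the analytic gradient of FUN (not shown).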
What I am trying to do is model the following using scipy.optimize.minimize.
What I'm trying to optimize is this function together with its constraints:
Here the variable V is a list of variables whose length is equal to the size of Omega.
What I have done so far is the following:
import numpy as np
from scipy.linalg import norm
from scipy.optimize import minimize
# set of concepts
M = ['linear algebra','seq2seq', 'artificial neural network','pointer networks']
#subchapters
S1=['linear algebra', 'differential calculus','backpropagation']
S2=['linear algebra','seq2seq', 'artificial neural network']
S3=['linear algebra','seq2seq', 'artificial neural network','pointer networks']
#vector representation of the subchapter in the concept space
x=[[1,1,0,0],
[1,1,1,0],
[1,1,1,1]]
# set of prerequisite relations among subchapters (entered manually for testing)
Omega = [(1, 2),(2,3),(1,3)]
# number of concepts
m = len(M)
# define of theta and lambda constants (manual)
theta = 2
lamda = 1
# matrix A is an m x m matrix, where m is the number of concepts
# it represents the prerequisite relations among concepts
# A is generated randomly
np.random.seed(43)
#A = np.zeros((m,m), dtype = int)
A = np.random.randint(2, size=(m,m))
# define the slack variable V as an array of size equal to len(Omega)
V = np.empty(len(Omega), dtype=float)
bnds=[]
# bounds -1 and 1 , create the array
# -1 <= a[i][j] <= 1
bnds_a_s_t = [bnds.append((-1, 1)) for _ in range(np.size(A))]
# bounds for the slack variable V, V is positive
bnds_V_i_j = [bnds.append((0,np.inf)) for _ in range(np.size(V))]
#constraints
cons=[]
#equality constraint
#a[s][t] + a[t][s] = 0
def equality_constraint(X):
    A_no_flatten = X[:len(X)-len(Omega)]
    # reconstruct matrix A
    A = np.reshape(A_no_flatten,(m,m))
    for s in range(m):
        for t in range(m):
            r = A[s][t] + A[t][s]
            # r = 0 constraint
            con = {'type': 'eq', 'fun': lambda X: r}
            cons.append(con)
# inequality constraint
# x[i].T @ (C[i][j] * A) @ x[j]
def inequality_constraint(X, x):
    for couple in Omega:
        # get the i and j
        i = couple[0]
        j = couple[1]
        # initialize C to 1s
        C = np.full((m,m), 1, dtype = int)
        # take all elements from X except the last len(Omega) elements
        A_no_flatten = X[:len(X)-len(Omega)]
        # reconstruct list V
        V = X[-len(Omega):]
        # index for V
        f = 0
        # reconstruct matrix A
        A = np.reshape(A_no_flatten,(m,m))
        # construct C[i][j]
        for s in range(m):
            for t in range(m):
                if x[i][t] > 0 or x[j][s] > 0:
                    C[s][t] = 0
                else:
                    C[s][t] = 1
        first = x[i].T
        second = C*A
        third = x[j]
        first_sec = first @ second
        res = first_sec @ third
        ineq_con = {'type': 'ineq', 'fun': lambda X: res - theta + V[f]}
        f += 1
        cons.append(ineq_con)
# arguments passed to the function
# here we pass x matrix
# arguments are passed and used in constraints and in the objective function
# the objective function will minimize A and V which are matrix A and slack variable V
arguments=(x,)
# objective function
def objective(X, x):
    A_no_flatten = X[:len(X)-len(Omega)]
    # reconstruct list V
    V = X[-len(Omega):]
    # reconstruct matrix A
    A = np.reshape(A_no_flatten,(m,m))
    # sum of squares of V
    sum_square = 0.0
    for it in V:
        sum_square += it**2
    # sum of squares of V times lambda
    sum_square_lambda = sum_square*lamda
    return norm(A, 1) + sum_square_lambda
# list of variables to pass to the objective function
# pass p0, which is A (flattened) and V combined; inside the objective function we recreate them:
# the first part is A, i.e. everything except the last s items, where s is the size of V,
# and V is the rest
B = A.flatten()
p0 = np.append(B,V)
# scipy minimize
sol = minimize(objective, x0 = p0, args=arguments, bounds=bnds, constraints=cons)
print(sol.x)
What I get is the following:
[-7.73458566e-010 0.00000000e+000 4.99999997e-001 1.00000000e+000
1.00000000e+000 0.00000000e+000 -5.00000003e-001 1.00000000e+000
1.00000000e+000 1.00000000e+000 4.99999997e-001 -7.73458566e-010
-7.73458566e-010 0.00000000e+000 4.99999997e-001 -7.73458566e-010
6.01347002e-154 1.07176259e-311 0.00000000e+000]
This doesn't respect the constraints and is not what I expected.
What I don't know is whether it is correct to add constraints like that, because I never seem to call the constraint functions; I need to add the constraints in a loop, and each one depends on X, which is the list of variables to minimize.
When I print the cons list it is empty, which I understand, but I didn't find another way to add the constraint a[s][t] + a[t][s] = 0 and the other one. I don't know if my approach is correct. Thank you in advance for your help, much appreciated.
This isn't a complete answer, but it may get you started. As already mentioned in the comments, your list of constraints cons is empty when passed to the minimize method. So let's consider your first equality constraint. There are a few problems:
Each time you call the function equality_constraint, you'd append new constraints to your list cons.
Since you pass each constraint A[s][t] + A[t][s] == 0 as a scalar function, it is quite cumbersome. Instead, you could use a single vector-valued function:
def constraint1(X):
    A_no_flatten = X[:len(X)-len(Omega)]
    A = np.reshape(A_no_flatten,(m,m))
    return A.flatten() + A.T.flatten()
As the name indicates, the .flatten() method flattens the matrix to a vector and A.T is just the transpose of A. Now you can add this constraint:
cons = []
cons.append({'type': 'eq', 'fun': constraint1})
Proceed similarly for the other constraint.
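For instance, a vector-valued version of the inequality constraints could look like the sketch below. It reuses the reconstruction of A and V from X shown in the question and assumes each index appearing in Omega is a valid row index into x:
def constraint2(X):
    A = np.reshape(X[:len(X)-len(Omega)], (m, m))
    V = X[-len(Omega):]
    vals = []
    for f, (i, j) in enumerate(Omega):
        xi = np.asarray(x[i])
        xj = np.asarray(x[j])
        # C[s][t] = 0 if x[i][t] > 0 or x[j][s] > 0, else 1 (as in the question)
        C = np.where((xi[None, :] > 0) | (xj[:, None] > 0), 0, 1)
        vals.append(xi @ (C * A) @ xj - theta + V[f])
    return np.array(vals)

cons.append({'type': 'ineq', 'fun': constraint2})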
I am trying to solve the following problem for many different values of K:
I am trying to use scipy optimize for greater generality (at some stage I would like to be able to change the functions).
This is my code:
import numpy as np
import matplotlib.pyplot as plt
import sympy as sp
from scipy import optimize

n = 10
p1 = 0.2
p2 = 0.3
orig = (0,0)
endw = (1,1)

def U1(x):
    return p1*(x[0])**0.5 + (1-p1)*(x[1])**0.5

def U2(x):
    return p2*(1-x[0])**0.5 + (1-p2)*(1-x[1])**0.5

itervals = np.linspace(endw, orig, n)
utvals = np.array([U2(vec) for vec in itervals])
parvals = np.zeros((2, len(utvals)))

for it in range(len(utvals)):
    def obj(x):
        return -U1(x)
    def constr(x):
        return -U2(x) + utvals[it]
    con = {'type': 'eq', 'fun': constr}
    res = optimize.minimize(obj, itervals[it], method='SLSQP', constraints=con)
    parvals[:, it] = res['x']
    print(constr(parvals[:,it]), utvals[it])
However, when I check whether the constraint is respected, I get negative values of constr(parvals[:,it]) in the code above, and if I change the constraint to
def constr(x):
    return U2(x) - utvals[it]
I get positive values of constr(parvals[:,it]). How come?
I mean, my initial guess (contained in itervals) always returns 0 for the constraint. Therefore it is always possible to reach 0, so why is the constraint value sometimes positive and sometimes negative?
Varying K, we find different solution pairs (x1, x2). Solving the maximization problem with a Lagrange multiplier (imposing the first-order conditions), it is easy to see that x2 must be a function of x1; this code gives the required relation:
Psi = (p1/p2*(1-p2)/(1-p1))**2

def realline(x1):
    return x1/(Psi+(1-Psi)*x1)
It can easily be seen that, defining the constraint with the right sign, the solution coincides with this relation almost everywhere:
def constr(x):
    return U2(x) - utvals[it]

con = {'type': 'ineq', 'fun': constr}
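A quick check (sketch): after rerunning the loop above with this 'ineq' formulation, the computed solutions should track the analytic relation closely.
for it in range(len(utvals)):
    x1, x2 = parvals[:, it]
    # x2 should be close to realline(x1) wherever the constraint binds
    print(x1, x2, realline(x1))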
I am trying to optimize the following function:
f(x, a+, a-, b) = a+ * (1/(1 + exp(-b*x)) - 1/2)   if x >= 0
                = a- * (1/(1 + exp(-b*x)) - 1/2)   if x < 0

constraints: a+ * b <= 4,  a- * b <= 4
             a+/2 <= max (over x > 0)
             a-/2 <= -min (over x < 0)
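For concreteness, here is a direct transcription of the piecewise function above (a sketch only; the objective func actually minimized in the code below is not shown in the question):
import numpy as np

def f(x, a_plus, a_minus, b):
    # s-shaped core: 1/(1 + exp(-b*x)) - 1/2
    s = 1.0 / (1.0 + np.exp(-b * x)) - 0.5
    return np.where(x >= 0, a_plus * s, a_minus * s)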
I have tried to use the minimize function in scipy by putting the bounds [(0,2), (0,2), (1,None)] and the constraints defined above, but the function is not providing the right results, especially if I pass beta as an args entry in the constraints.
import numpy as np
import pandas as pd
from scipy.optimize import minimize

init_params = [0.0, 0.0, 20.0]
bnds = [(0.0, 2.0), (0.0, 2.0), (1.0, None)]

# dfs2 is the input DataFrame and func the objective; both are defined elsewhere (not shown)
S_curve = pd.DataFrame()
S_curve['year'] = dfs2.index
S_curve['H_Change'] = np.array(dfs2.loc[:,'annualH_chg'])
S_curve['R_Change'] = np.array(dfs2.loc[:,'annualR_chg'])
S_curve['Weight'] = 1
S_curve.reset_index(drop=True)
weighted = S_curve[S_curve.Weight != 0]

minimum = -S_curve['R_Change'].min()
maximum = S_curve['R_Change'].max()
beta = init_params[2]

def constraint1(up, beta):
    return 4.0 - (up*beta)
def constraint2(down, beta):
    return 4.0 - (down*beta)
def constraint3(up, maximum):
    return maximum - up/2.0
def constraint4(down, minimum):
    return minimum - down/2.0

cons = [{'type':'ineq', 'fun':constraint1, 'args':(beta,)},
        {'type':'ineq', 'fun':constraint2, 'args':(beta,)},
        {'type':'ineq', 'fun':constraint3, 'args':(maximum,)},
        {'type':'ineq', 'fun':constraint4, 'args':(minimum,)}]

soln = minimize(func, init_params, bounds=bnds, constraints=cons, method='SLQSP')
I was expecting the first and second constraints to be satisfied, and beta (b) is not meant to be constant.
I have the same problem as in this question, but I don't want to add only one constraint; I want to add several constraints to the optimization problem.
So, e.g., I want to maximize x1 + 5 * x2 with the constraints that the sum of x1 and x2 is smaller than 5 and x2 is smaller than 3 (needless to say, the actual problem is far more complicated and cannot simply be thrown into scipy.optimize.minimize like this one; it just serves to illustrate the problem...).
I can do an ugly hack like this:
from scipy.optimize import differential_evolution
import numpy as np
def simple_test(x, more_constraints):
    # check whether all constraints evaluate to True
    if all(map(eval, more_constraints)):
        return -1 * (x[0] + 5 * x[1])
    # if not all constraints evaluate to True, return a positive number
    return 10
bounds = [(0., 5.), (0., 5.)]
additional_constraints = ['x[0] + x[1] <= 5.', 'x[1] <= 3']
result = differential_evolution(simple_test, bounds, args=(additional_constraints, ), tol=1e-6)
print(result.x, result.fun, sum(result.x))
This will print
[ 1.99999986 3. ] -16.9999998396 4.99999985882
as one would expect.
Is there a better/ more straightforward way to add several constraints than using the rather 'dangerous' eval?
An example would be something like this:
additional_constraints = [lambda x: x[0] + x[1] <= 5., lambda x: x[1] <= 3]
def simple_test(x, more_constraints):
    # check whether all constraints evaluate to True
    if all(constraint(x) for constraint in more_constraints):
        return -1 * (x[0] + 5 * x[1])
    # if not all constraints evaluate to True, return a positive number
    return 10
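With this lambda-based variant the differential_evolution call itself stays as before (sketch):
result = differential_evolution(simple_test, bounds,
                                args=(additional_constraints,), tol=1e-6)
print(result.x, result.fun, sum(result.x))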
There is a proper solution to the problem described in the question: multiple nonlinear constraints can be enforced with scipy.optimize.differential_evolution by using the scipy.optimize.NonlinearConstraint class.
Below I give a non-trivial example: optimizing the classic Rosenbrock function inside a region defined by the intersection of two circles.
import numpy as np
from scipy import optimize
# Rosenbrock function
def fun(x):
    return 100*(x[1] - x[0]**2)**2 + (1 - x[0])**2

# Function defining the nonlinear constraints:
# 1) x^2 + (y - 3)^2 < 4
# 2) (x - 1)^2 + (y + 1)^2 < 13
def constr_fun(x):
    r1 = x[0]**2 + (x[1] - 3)**2
    r2 = (x[0] - 1)**2 + (x[1] + 1)**2
    return r1, r2
# No lower limit on constr_fun
lb = [-np.inf, -np.inf]
# Upper limit on constr_fun
ub = [4, 13]
# Bounds are irrelevant for this problem, but are needed
# for differential_evolution to compute the starting points
bounds = [[-2.2, 1.5], [-0.5, 2.2]]
nlc = optimize.NonlinearConstraint(constr_fun, lb, ub)
sol = optimize.differential_evolution(fun, bounds, constraints=nlc)
# Accurate solution by Mathematica
true = [1.174907377273171, 1.381484428610871]
print(f"nfev = {sol.nfev}")
print(f"x = {sol.x}")
print(f"err = {sol.x - true}\n")
This prints the following with default parameters:
nfev = 636
x = [1.17490808 1.38148613]
err = [7.06260962e-07 1.70116282e-06]
Here is a visualization of the function (contours) and the feasible region defined by the nonlinear constraints (shading inside the green line). The constrained global minimum is indicated by the yellow dot, while the magenta one shows the unconstrained global minimum.
This constrained problem has an obvious local minimum at (x, y) ~ (-1.2, 1.4) on the boundary of the feasible region which will make local optimizers fail to converge to the global minimum for many starting locations. However, differential_evolution consistently finds the global minimum as expected.
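To illustrate that point (a sketch reusing fun, nlc and bounds from above; not part of the original example), a gradient-based local method started near that point may well stall at the local minimum instead of the constrained global one:
x0 = np.array([-1.2, 1.4])
local = optimize.minimize(fun, x0, method='trust-constr',
                          constraints=nlc, bounds=bounds)
print(local.x)  # may end up near (-1.2, 1.4) rather than near `true`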
Consider the following (convex) optimization problem:
minimize 0.5 * y.T * y
s.t. A*x - b == y
where the optimization (vector) variables are x and y and A, b are a matrix and vector, respectively, of appropriate dimensions.
The code below finds a solution easily using the SLSQP method from Scipy:
import numpy as np
from scipy.optimize import minimize
# problem dimensions:
n = 10 # arbitrary integer set by user
m = 2 * n
# generate parameters A, b:
np.random.seed(123) # for reproducibility of results
A = np.random.randn(m,n)
b = np.random.randn(m)
# objective function:
def obj(z):
    vy = z[n:]
    return 0.5 * vy.dot(vy)

# constraint function:
def cons(z):
    vx = z[:n]
    vy = z[n:]
    return A.dot(vx) - b - vy

# constraints input for SLSQP:
cons = ({'type': 'eq','fun': cons})
# generate a random initial estimate:
z0 = np.random.randn(n+m)
sol = minimize(obj, x0 = z0, constraints = cons, method = 'SLSQP', options={'disp': True})
Optimization terminated successfully. (Exit mode 0)
Current function value: 2.12236220865
Iterations: 6
Function evaluations: 192
Gradient evaluations: 6
Note that the constraint function is a convenient 'array-output' function.
Now, instead of an array-output function for the constraint, one could in principle use an equivalent set of 'scalar-output' constraint functions (actually, the scipy.optimize documentation discusses only this type of constraint function as input to minimize).
Here is the equivalent constraint set followed by the output of minimize (same A, b, and initial value as the above listing):
# this is the i-th element of cons(z):
def cons_i(z, i):
    vx = z[:n]
    vy = z[n:]
    return A[i].dot(vx) - b[i] - vy[i]

# list of scalar-output constraints input for SLSQP:
cons_per_i = [{'type':'eq', 'fun': lambda z: cons_i(z, i)} for i in np.arange(m)]
sol2 = minimize(obj, x0 = z0, constraints = cons_per_i, method = 'SLSQP', options={'disp': True})
Singular matrix C in LSQ subproblem (Exit mode 6)
Current function value: 6.87999270692
Iterations: 1
Function evaluations: 32
Gradient evaluations: 1
Evidently, the algorithm fails (the returned objective value is actually the objective value at the given initialization), which I find a bit weird. Note that running [cons_per_i[i]['fun'](sol.x) for i in np.arange(m)] shows that sol.x, obtained using the array-output constraint formulation, satisfies all scalar-output constraints of cons_per_i as expected (within numerical tolerance).
I would appreciate if anyone has some explanation for this issue.
You've run into the "late binding closures" gotcha. All the calls to cons_i are being made with the second argument equal to 19.
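A minimal illustration of the gotcha, unrelated to the optimization itself (sketch):
funcs = [lambda: i for i in range(3)]
print([f() for f in funcs])      # [2, 2, 2] -- each closure sees the final value of i
funcs = [lambda i=i: i for i in range(3)]
print([f() for f in funcs])      # [0, 1, 2] -- the default argument captures i at definition time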
A fix is to use the args dictionary element in the dictionary that defines the constraints instead of the lambda function closures:
cons_per_i = [{'type':'eq', 'fun': cons_i, 'args': (i,)} for i in np.arange(m)]
With this, the minimization works:
In [417]: sol2 = minimize(obj, x0 = z0, constraints = cons_per_i, method = 'SLSQP', options={'disp': True})
Optimization terminated successfully. (Exit mode 0)
Current function value: 2.1223622086
Iterations: 6
Function evaluations: 192
Gradient evaluations: 6
You could also use the suggestion made in the linked article, which is to use a lambda expression with a second argument that has the desired default value:
cons_per_i = [{'type':'eq', 'fun': lambda z, i=i: cons_i(z, i)} for i in np.arange(m)]