I am learning how to use Drake to solving optimization problems.
This problem was to find the optimal length and width of a fence, the fence must have a perimeter less than or equal to 40. The code below only works when the perimeter constraint is an equality constraint. It should work as an inequality constraint, but my optimal solution results in x=[nan nan]. Does anyone know why this is the case?
from pydrake.solvers.mathematicalprogram import MathematicalProgram, Solve
import numpy as np
import matplotlib.pyplot as plt
prog = MathematicalProgram()
#add two decision variables
x = prog.NewContinuousVariables(2, "x")
#adds objective function where
#
# min 0.5 xt * Q * x + bt * x
#
# Q = [0,-1
# -1,0]
#
# bt = [0,
# 0]
#
Q = [[0,-1],[-1,0]]
b = [[0],[0]]
prog.AddQuadraticCost(Q , b, vars=[x[0],x[1]])
# Adds the linear constraints.
prog.AddLinearEqualityConstraint(2*x[0] + 2*x[1] == 40)
#prog.AddLinearConstraint(2*x[0] + 2*x[1] <= 40)
prog.AddLinearConstraint(0*x[0] + -1*x[1] <= 0)
prog.AddLinearConstraint(-1*x[0] + 0*x[1] <= 0)
# Solve the program.
result = Solve(prog)
print(f"optimal solution x: {result.GetSolution(x)}")
I get [nan, nan] for both inequality constraint and equality constraint.
As Russ mentioned, the problem is the cost being non-convex, and Drake incurred the wrong solver. For the moment, I would suggest to explicitly designate a solver. You could do
from pydrake.solvers.ipopt_solver import IpoptSolver
from pydrake.solvers.mathematicalprogram import MathematicalProgram, Solve
import numpy as np
import matplotlib.pyplot as plt
prog = MathematicalProgram()
#add two decision variables
x = prog.NewContinuousVariables(2, "x")
#adds objective function where
#
# min 0.5 xt * Q * x + bt * x
#
# Q = [0,-1
# -1,0]
#
# bt = [0,
# 0]
#
Q = [[0,-1],[-1,0]]
b = [[0],[0]]
prog.AddQuadraticCost(Q , b, vars=[x[0],x[1]])
# Adds the linear constraints.
prog.AddLinearEqualityConstraint(2*x[0] + 2*x[1] == 40)
#prog.AddLinearConstraint(2*x[0] + 2*x[1] <= 40)
prog.AddLinearConstraint(0*x[0] + -1*x[1] <= 0)
prog.AddLinearConstraint(-1*x[0] + 0*x[1] <= 0)
# Solve the program.
solver = IpoptSolver()
result = solver.Solve(prog)
print(f"optimal solution x: {result.GetSolution(x)}")
I will work on a fix on the Drake side, to make sure it incur the right solver when you have non-convex quadratic cost.
Related
I am trying to assign an infinite value to a variable of gekko. I have tried with the numpy's infinite value and python's own infinite but it is still not working due to a problem of recognition of gekko.
The main objective of this idea is to force a variable to be strictly equal to 0, at least in the first iteration of the solver.
from gekko import GEKKO
from numpy import Inf
model=GEKKO()
R=model.FV(value=Inf)
T=model.Array(model.Var,2)
Q=model.FV()
model.Equation(Q==(T[1]-T[0])/R)
model.solve()
And the error I am getting:
Exception: #error: Model Expression
*** Error in syntax of function string: Invalid element: inf
Moreover, sometimes other variables are also required to be infinite, again, variables that are located in the denominator of a model equation. This is quite useful in order to try different scenarios of the simulation I am working with and check the systems behavior.
Hope you can help me, thank you.
The large-scale NLP and MINLP solvers don't know how to compute gradients with a np.nan value so initializing with NaN generally doesn't help. Please post example code that demonstrates the issue that you are observing with improved performance from NaN initialization.
Below are four unconstrained optimization methods compared on the same sample problem. The algorithms do not benefit from NaN for initialization. Some solvers substitute NaN with 0 or a high or low number. I suggest that you try giving np.nan as an initial condition to these solution methods to see how it affects the search for the minimum.
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
# define objective function
def f(x):
x1 = x[0]
x2 = x[1]
obj = x1**2 - 2.0 * x1 * x2 + 4 * x2**2
return obj
# define objective gradient
def dfdx(x):
x1 = x[0]
x2 = x[1]
grad = []
grad.append(2.0 * x1 - 2.0 * x2)
grad.append(-2.0 * x1 + 8.0 * x2)
return grad
# Exact 2nd derivatives (hessian)
H = [[2.0, -2.0],[-2.0, 8.0]]
# Start location
x_start = [-3.0, 2.0]
# Design variables at mesh points
i1 = np.arange(-4.0, 4.0, 0.1)
i2 = np.arange(-4.0, 4.0, 0.1)
x1_mesh, x2_mesh = np.meshgrid(i1, i2)
f_mesh = x1_mesh**2 - 2.0 * x1_mesh * x2_mesh + 4 * x2_mesh**2
# Create a contour plot
plt.figure()
# Specify contour lines
lines = range(2,52,2)
# Plot contours
CS = plt.contour(x1_mesh, x2_mesh, f_mesh,lines)
# Label contours
plt.clabel(CS, inline=1, fontsize=10)
# Add some text to the plot
plt.title(r'$f(x)=x_1^2 - 2x_1x_2 + 4x_2^2$')
plt.xlabel(r'$x_1$')
plt.ylabel(r'$x_2$')
##################################################
# Newton's method
##################################################
xn = np.zeros((2,2))
xn[0] = x_start
# Get gradient at start location (df/dx or grad(f))
gn = dfdx(xn[0])
# Compute search direction and magnitude (dx)
# with dx = -inv(H) * grad
delta_xn = np.empty((1,2))
delta_xn = -np.linalg.solve(H,gn)
xn[1] = xn[0]+delta_xn
plt.plot(xn[:,0],xn[:,1],'k-o')
##################################################
# Steepest descent method
##################################################
# Number of iterations
n = 8
# Use this alpha for every line search
alpha = 0.15
# Initialize xs
xs = np.zeros((n+1,2))
xs[0] = x_start
# Get gradient at start location (df/dx or grad(f))
for i in range(n):
gs = dfdx(xs[i])
# Compute search direction and magnitude (dx)
# with dx = - grad but no line searching
xs[i+1] = xs[i] - np.dot(alpha,dfdx(xs[i]))
plt.plot(xs[:,0],xs[:,1],'g-o')
##################################################
# Conjugate gradient method
##################################################
# Number of iterations
n = 8
# Use this alpha for the first line search
alpha = 0.15
neg = [[-1.0,0.0],[0.0,-1.0]]
# Initialize xc
xc = np.zeros((n+1,2))
xc[0] = x_start
# Initialize delta_gc
delta_cg = np.zeros((n+1,2))
# Initialize gc
gc = np.zeros((n+1,2))
# Get gradient at start location (df/dx or grad(f))
for i in range(n):
gc[i] = dfdx(xc[i])
# Compute search direction and magnitude (dx)
# with dx = - grad but no line searching
if i==0:
beta = 0
delta_cg[i] = - np.dot(alpha,dfdx(xc[i]))
else:
beta = np.dot(gc[i],gc[i]) / np.dot(gc[i-1],gc[i-1])
delta_cg[i] = alpha * np.dot(neg,dfdx(xc[i])) + beta * delta_cg[i-1]
xc[i+1] = xc[i] + delta_cg[i]
plt.plot(xc[:,0],xc[:,1],'y-o')
##################################################
# Quasi-Newton method
##################################################
# Number of iterations
n = 8
# Use this alpha for every line search
alpha = np.linspace(0.1,1.0,n)
# Initialize delta_xq and gamma
delta_xq = np.zeros((2,1))
gamma = np.zeros((2,1))
part1 = np.zeros((2,2))
part2 = np.zeros((2,2))
part3 = np.zeros((2,2))
part4 = np.zeros((2,2))
part5 = np.zeros((2,2))
part6 = np.zeros((2,1))
part7 = np.zeros((1,1))
part8 = np.zeros((2,2))
part9 = np.zeros((2,2))
# Initialize xq
xq = np.zeros((n+1,2))
xq[0] = x_start
# Initialize gradient storage
g = np.zeros((n+1,2))
g[0] = dfdx(xq[0])
# Initialize hessian storage
h = np.zeros((n+1,2,2))
h[0] = [[1, 0.0],[0.0, 1]]
for i in range(n):
# Compute search direction and magnitude (dx)
# with dx = -alpha * inv(h) * grad
delta_xq = -np.dot(alpha[i],np.linalg.solve(h[i],g[i]))
xq[i+1] = xq[i] + delta_xq
# Get gradient update for next step
g[i+1] = dfdx(xq[i+1])
# Get hessian update for next step
gamma = g[i+1]-g[i]
part1 = np.outer(gamma,gamma)
part2 = np.outer(gamma,delta_xq)
part3 = np.dot(np.linalg.pinv(part2),part1)
part4 = np.outer(delta_xq,delta_xq)
part5 = np.dot(h[i],part4)
part6 = np.dot(part5,h[i])
part7 = np.dot(delta_xq,h[i])
part8 = np.dot(part7,delta_xq)
part9 = np.dot(part6,1/part8)
h[i+1] = h[i] + part3 - part9
plt.plot(xq[:,0],xq[:,1],'r-o')
plt.tight_layout()
plt.savefig('contour.png',dpi=600)
plt.show()
More information is available in the design optimization course.
Response to Edit
Thanks for clarifying the question and for including a source code example. While it isn't possible to include Inf as a guess, an equivalent form with an additional variable x may be able to accomplish the desired behavior. This sets the term (T[1]-T[0])/R initially equal to zero at the beginning iteration.
from gekko import GEKKO
from numpy import Inf
model=GEKKO()
R=model.FV(value=1e20)
T=model.Array(model.Var,2)
x=model.Var(value=0)
Q=model.FV()
model.Equations([x==(T[1]-T[0])/R,
Q==x])
model.solve()
We’re trying to solve a one-dimensional Coupled Continuity-Poisson problem in Fipy. When applying
Dirichlet’s conditions, it gives the correct results, but when we change the boundaries conditions to Neumann’s which is closer to our problem, it gives “The Factor is exactly singular’’ error.
Any help is highly appreciated. The code is as follows (0<x<2.5):
from fipy import *
from fipy import Grid1D, CellVariable, TransientTerm, DiffusionTerm, Viewer
import numpy as np
import math
import matplotlib.pyplot as plt
from matplotlib import cm
from cachetools import cached, TTLCache #caching to increase the speed of python
cache = TTLCache(maxsize=100, ttl=86400) #creating the cache object: the
#first argument= the number of objects we store in the cache.
#____________________________________________________
nx=50
dx=0.05
L=nx*dx
e=math.e
m = Grid1D(nx=nx, dx=dx)
print(np.log(e))
#____________________________________________________
phi = CellVariable(mesh=m, hasOld=True, value=0.)
ne = CellVariable(mesh=m, hasOld=True, value=0.)
phi_face = phi.faceValue
ne_face = ne.faceValue
x = m.cellCenters[0]
t0 = Variable()
phi.setValue((x-1)**3)
ne.setValue(-6*(x-1))
#____________________________________________________
#cached(cache)
def S(x,t):
f=6*(x-1)*e**(-t)+54*((x-1)**2)*e**(-2.*t)
return f
#____________________________________________________
#Boundary Condition:
valueleft_phi=3*e**(-t0)
valueright_phi=6.75*e**(-t0)
valueleft_ne=-6*e**(-t0)
valueright_ne=-6*e**(-t0)
phi.faceGrad.constrain([valueleft_phi], m.facesLeft)
phi.faceGrad.constrain([valueright_phi], m.facesRight)
ne.faceGrad.constrain([valueleft_ne], m.facesLeft)
ne.faceGrad.constrain([valueright_ne], m.facesRight)
#____________________________________________________
eqn0 = DiffusionTerm(1.,var=phi)==ImplicitSourceTerm(-1.,var=ne)
eqn1 = TransientTerm(1.,var=ne) ==
VanLeerConvectionTerm(phi.faceGrad,var=ne)+S(x,t0)
eqn = eqn0 & eqn1
#____________________________________________________
steps = 1.e4
dt=1.e-4
T=dt*steps
F=dt/(dx**2)
print('F=',F)
#____________________________________________________
vi = Viewer(phi)
with open('out2.txt', 'w') as output:
while t0()<T:
print(t0)
phi.updateOld()
ne.updateOld()
res=1.e30
#for sweep in range(steps):
while res > 1.e-4:
res = eqn.sweep(dt=dt)
t0.setValue(t0()+dt)
for m in range(nx):
output.write(str(phi[m])+' ') #+ os.linesep
output.write('\n')
if __name__ == '__main__':
vi.plot()
#____________________________________________________
data = np.loadtxt('out2.txt')
X, T = np.meshgrid(np.linspace(0, L, len(data[0,:])), np.linspace(0, T,
len(data[:,0])))
fig = plt.figure(3)
ax = fig.add_subplot(111,projection='3d')
ax.plot_surface(X, T, Z=data)
plt.show(block=True)
The issue with these equations, particularly eqn0, is that they admit an infinite number of solutions when Neumann boundary conditions are applied on both boundaries. You can fix this by pinning a value somewhere with an internal fixed value. E.g., based on the analytical solution given in the comments, phi = (x-1)**3 * exp(-t), we can pin phi = 0 at x = 1 with
mask = (m.x > 1-dx/2) & (m.x < 1+dx/2)
largeValue = 1e6
value = 0.
#____________________________________________________
eqn0 = (DiffusionTerm(1.,var=phi)==ImplicitSourceTerm(-1.,var=ne)
+ ImplicitSourceTerm(mask * largeValue, var=phi) - mask * largeValue * value)
At this point, the solutions still do not agree with the expected solutions. This is because, while you have called ne.faceGrad.constrain() for the left and right boundaries, does not appear in the discretized equations. You can see this if you plot ne; the gradient is zero at both boundaries despite the constraint because FiPy never "sees" the constraint.
What does appear is the flux . By applying fixed flux boundary conditions, I obtain the expected solutions:
ne_left = 6 * numerix.exp(-t0)
ne_right = -9 * numerix.exp(-t0)
eqn1 = (TransientTerm(1.,var=ne)
== VanLeerConvectionTerm(phi.faceGrad * m.interiorFaces,var=ne)
+ S(x,t0)
+ (m.facesLeft * ne_left * phi.faceGrad).divergence
+ (m.facesRight * ne_right * phi.faceGrad).divergence)
You can probably get better convergence properties with
eqn1 = (TransientTerm(1.,var=ne)
== DiffusionTerm(coeff=ne.faceValue * m.interiorFaces, var=phi)
+ S(x,t0)
+ (m.facesLeft * ne_left * phi.faceGrad).divergence
+ (m.facesRight * ne_right * phi.faceGrad).divergence)
but either formulation seems to work.
Note: phi.faceGrad.constrain() is fine, because the flux does appear in DiffusionTerm(coeff=1., var=phi).
Separately, it appears (based on "The Factor is exactly singular") that you are solving with the SciPy LinearLUSolver. The PETSc LinearLUSolver does better, but the baseline value of the solution wanders all over the place. Calling
res = eqn.sweep(dt=dt, solver=LinearGMRESSolver())
also seems to produce stable results (without pinning an internal value). This behavior probably shouldn't be relied on; pinning a value is the right thing to do.
I have a complex model and I want to calculate the objective function value for different options (not just the optimal solution).
I created the toy example below:
import gurobipy as gp
from gurobipy import GRB
import numpy as np
m = gp.Model()
X = {}
for i in range(5):
X[i] = m.addVar(vtype= GRB.INTEGER, name=f"x[{i}]")
obj_func = 0
for i in range(5):
np.random.seed(i)
obj_func += np.random.rand() * X[i]
m.setObjective(obj_func, GRB.MINIMIZE)
m.addConstr(2 * X[1] + 5 * X[3] <= 10, name = 'const1')
m.addConstr(X[0] + 3 * X[4] >= 4, name = 'const2')
m.addConstr(3 * X[1] - X[4] <= 3, name = 'const3')
m.optimize()
for x_index, x in X.items():
if round(x.X)>0:
print(x_index, x.X)
# 0 1.0
# 4 1.0
How can I can calculate the objective function for the manual input of X = [0,1,1,0,0] or X=[1,0,0,0,1]. I want the output return the objective function value if this input is a feasible solution, otherwise a warning. I can hard-code the problem, but I would rather to extract the objective coefficients directly form m (model) and multiply them with the new X input.
A pretty simple idea: You could fix all the variables in the model by setting the lower and upper bounds to your given inputs and then solve the model. Then, the model.objVal attribute contains the objective value for your given input solution. A straightforward implementation looks like this:
import numpy as np
given_sols = np.array([[0, 1, 1, 0, 0], [1, 0, 0, 0, 1]])
# Evaluate the objective for multiple given solutions
def get_obj_vals(model, given_sols):
obj_vals = np.nan * np.zeros(given_sols.shape[0])
# for each given solution..
for k, x in enumerate(given_sols):
# reset the model
model.reset()
# Fix the lower/upper bounds
for i, var in enumerate(model.getVars()):
var.lb = x[i]
var.ub = x[i]
# solve the problem
model.optimize()
if model.Status == GRB.Status.OPTIMAL:
obj_vals[k] = model.objVal
return obj_vals
Here, nan in obj_vals[i] means that given_sols[i] is not feasible.
I am trying to find the minimum of a natural cubic spline. I have written the following code to find the natural cubic spline. (I have been given test data and have confirmed this method is correct.) Now I can not figure out how to find the minimum of this function.
This is the data
xdata = np.linspace(0.25, 2, 8)
ydata = 10**(-12) * np.array([1,2,1,2,3,1,1,2])
This is the function
import scipy as sp
import numpy as np
import math
from numpy.linalg import inv
from scipy.optimize import fmin_slsqp
from scipy.optimize import minimize, rosen, rosen_der
def phi(x, xd,yd):
n = len(xd)
h = np.array(xd[1:n] - xd[0:n-1])
f = np.divide(yd[1:n] - yd[0:(n-1)],h)
q = [0]*(n-2)
for i in range(n-2):
q[i] = 3*(f[i+1] - f[i])
A = np.zeros(((n-2),(n-2)))
#define A for j=0
A[0,0] = 2*(h[0] + h[1])
A[0,1] = h[1]
#define A for j = n-2
A[-1,-2] = h[-2]
A[-1,-1] = 2*(h[-2] + h[-1])
#define A for in the middle
for j in range(1,(n-3)):
A[j,j-1] = h[j]
A[j,j] = 2*(h[j] + h[j+1])
A[j,j+1] = h[j+1]
Ainv = inv(A)
B = Ainv.dot(q)
b = (n)*[0]
b[1:(n-1)] = B
# now we find a, b, c and d
a = [0]*(n-1)
c = [0]*(n-1)
d = [0]*(n-1)
s = [0]*(n-1)
for r in range(n-1):
a[r] = 1/(3*h[r]) * (b[r + 1] - b[r])
c[r] = f[r] - h[r]*((2*b[r] + b[r+1])/3)
d[r] = yd[r]
#solution 1 start
for m in range(n-1):
if xd[m] <= x <= xd[m+1]:
s = a[m]*(x - xd[m])**3 + b[m]*(x-xd[m])**2 + c[m]*(x-xd[m]) + d[m]
return(s)
#solution 1 end
I want to find the minimum on the domain of my xdata, so a fmin didn't work as you can not define bounds there. I tried both fmin_slsqp and minimize. They are not compatible with the phi function I wrote so I rewrote phi(x, xd,yd) and added an extra variable such that phi is phi(x, xd,yd, m). M indicates in which subfunction of the spline we are calculating a solution (from x_m to x_m+1). In the code we replaced #solution 1 by the following
# solution 2 start
return(a[m]*(x - xd[m])**3 + b[m]*(x-xd[m])**2 + c[m]*(x-xd[m]) + d[m])
# solution 2 end
To find the minimum in a domain x_m to x_(m+1) we use the following code: (we use an instance where m=0, so x from 0.25 to 0.5. The initial guess is 0.3)
fmin_slsqp(phi, x0 = 0.3, bounds=([(0.25,0.5)]), args=(xdata, ydata, 0))
What I would then do (I know it's crude), is iterate this with a for loop to find the minimum on all subdomains and then take the overall minimum. However, the function fmin_slsqp constantly returns the initial guess as the minimum. So there is something wrong, which I do not know how to fix. If you could help me this would be greatly appreciated. Thanks for reading this far.
When I plot your function phi and the data you feed in, I see that its range is of the order of 1e-12. However, fmin_slsqp is unable to handle that level of precision and fails to find any change in your objective.
The solution I propose is scaling the return of your objective by the same order of precision like so:
return(s*1e12)
Then you get good results.
>>> sol = fmin_slsqp(phi, x0=0.3, bounds=([(0.25, 0.5)]), args=(xdata, ydata))
>>> print(sol)
Optimization terminated successfully. (Exit mode 0)
Current function value: 1.0
Iterations: 2
Function evaluations: 6
Gradient evaluations: 2
[ 0.25]
I have the same problem as in this question but don't want to add only one but several constraints to the optimization problem.
So e.g. I want to maximize x1 + 5 * x2 with the constraints that the sum of x1 and x2 is smaller than 5 and x2 is smaller than 3 (needless to say that the actual problem is far more complicated and cannot just thrown into scipy.optimize.minimize as this one; it just serves to illustrate the problem...).
I can to an ugly hack like this:
from scipy.optimize import differential_evolution
import numpy as np
def simple_test(x, more_constraints):
# check wether all constraints evaluate to True
if all(map(eval, more_constraints)):
return -1 * (x[0] + 5 * x[1])
# if not all constraints evaluate to True, return a positive number
return 10
bounds = [(0., 5.), (0., 5.)]
additional_constraints = ['x[0] + x[1] <= 5.', 'x[1] <= 3']
result = differential_evolution(simple_test, bounds, args=(additional_constraints, ), tol=1e-6)
print(result.x, result.fun, sum(result.x))
This will print
[ 1.99999986 3. ] -16.9999998396 4.99999985882
as one would expect.
Is there a better/ more straightforward way to add several constraints than using the rather 'dangerous' eval?
An example is something like this::
additional_constraints = [lambda(x): x[0] + x[1] <= 5., lambda(x):x[1] <= 3]
def simple_test(x, more_constraints):
# check wether all constraints evaluate to True
if all(constraint(x) for constraint in more_constraints):
return -1 * (x[0] + 5 * x[1])
# if not all constraints evaluate to True, return a positive number
return 10
There is a proper solution to the problem described in the question, to enforce multiple nonlinear constraints with scipy.optimize.differential_evolution.
The proper way is by using the scipy.optimize.NonlinearConstraint function.
Here below I give a non-trivial example of optimizing the classic Rosenbrock function inside a region defined by the intersection of two circles.
import numpy as np
from scipy import optimize
# Rosenbrock function
def fun(x):
return 100*(x[1] - x[0]**2)**2 + (1 - x[0])**2
# Function defining the nonlinear constraints:
# 1) x^2 + (y - 3)^2 < 4
# 2) (x - 1)^2 + (y + 1)^2 < 13
def constr_fun(x):
r1 = x[0]**2 + (x[1] - 3)**2
r2 = (x[0] - 1)**2 + (x[1] + 1)**2
return r1, r2
# No lower limit on constr_fun
lb = [-np.inf, -np.inf]
# Upper limit on constr_fun
ub = [4, 13]
# Bounds are irrelevant for this problem, but are needed
# for differential_evolution to compute the starting points
bounds = [[-2.2, 1.5], [-0.5, 2.2]]
nlc = optimize.NonlinearConstraint(constr_fun, lb, ub)
sol = optimize.differential_evolution(fun, bounds, constraints=nlc)
# Accurate solution by Mathematica
true = [1.174907377273171, 1.381484428610871]
print(f"nfev = {sol.nfev}")
print(f"x = {sol.x}")
print(f"err = {sol.x - true}\n")
This prints the following with default parameters:
nfev = 636
x = [1.17490808 1.38148613]
err = [7.06260962e-07 1.70116282e-06]
Here is a visualization of the function (contours) and the feasible region defined by the nonlinear constraints (shading inside the green line). The constrained global minimum is indicated by the yellow dot, while the magenta one shows the unconstrained global minimum.
This constrained problem has an obvious local minimum at (x, y) ~ (-1.2, 1.4) on the boundary of the feasible region which will make local optimizers fail to converge to the global minimum for many starting locations. However, differential_evolution consistently finds the global minimum as expected.