Optimize a vector-to-scalar function with an output constraint using SciPy - python

I need to optimize a function f with respect to a vector x; f also takes a constant matrix m as input and must return a scalar v >= 0.
MWE with random numbers:
import numpy as np
from scipy.optimize import minimize
np.random.seed(1)
m = np.array([[1,0,0.15],[2,0,0.15],[1.5,0.2,0.2],[3,0.5,0.1],[2.2,0.1,0.15]])
x0 = np.random.rand(5)*2
def f(x, m):
    pg = -np.concatenate((-arr[:, :2], x.reshape(-1, 1)), axis=1).sum(axis=1)
    return sum(arr[:, 2] * pg)

res = minimize(
    f, x0,
    method='nelder-mead', args=(m,),
    options={'xatol': 1e-8, 'maxiter': 1e+4, 'disp': True}
)
How do I set up the constraint for the output value? As far as I read in the doc I can only set constraints for the inputs. I read this post saying to use minimize_scalar, but it can only be used when the input is scalar as well.

Simply add the constraint f(x,m) >= 0:
import numpy as np
from scipy.optimize import minimize
np.random.seed(1)
m = np.array([[1,0,0.15],[2,0,0.15],[1.5,0.2,0.2],[3,0.5,0.1],[2.2,0.1,0.15]])
x0 = np.random.rand(5)*2
def f(x, m):
    pg = -np.concatenate((-arr[:, :2], x.reshape(-1, 1)), axis=1).sum(axis=1)
    return sum(arr[:, 2] * pg)

# add the constraint f(x, m) >= 0
con = [{'type': 'ineq', 'fun': lambda x: f(x, m)}]

# note: constraints are only supported by SLSQP, COBYLA and trust-constr;
# Nelder-Mead would warn and ignore them
res = minimize(
    f, x0,
    constraints=con,
    method='SLSQP', args=(m,),
    options={'ftol': 1e-8, 'maxiter': 10000, 'disp': True}
)
Alternatively, you can enforce a non-negative objective value by minimizing a norm of your objective, e.g. f(x, m)**2; you wouldn't need a constraint then.
PS: The second argument of your function should probably be arr instead of m.
PPS: Since both your objective function and the constraint are continuously differentiable, a gradient-based algorithm will very likely perform much better than Nelder-Mead, even if the gradient is approximated by finite differences.
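As a rough illustration of both points, here is a minimal sketch of the squared-objective variant solved with a gradient-based method (BFGS chosen here as an example, and with the arr/m mix-up from the PS fixed):

import numpy as np
from scipy.optimize import minimize

np.random.seed(1)
m = np.array([[1, 0, 0.15], [2, 0, 0.15], [1.5, 0.2, 0.2],
              [3, 0.5, 0.1], [2.2, 0.1, 0.15]])
x0 = np.random.rand(5) * 2

def f(x, m):
    # same objective as above, with the second argument actually used
    pg = -np.concatenate((-m[:, :2], x.reshape(-1, 1)), axis=1).sum(axis=1)
    return np.sum(m[:, 2] * pg)

# minimize the square of the objective instead of constraining its sign;
# BFGS approximates the gradient by finite differences by default
res = minimize(lambda x: f(x, m)**2, x0, method='BFGS', options={'disp': True})
print(res.x, f(res.x, m))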

Related

When I apply approx_fprime to scipy.minimize, it doesn't iterate

I tried using the minimize function from the scipy package as in the code below.
When I use the jac option with approx_fprime, the iteration count is 0 and the optimization doesn't work.
But when I use the jac option with rosen_der, it works!
import numpy as np
from scipy.optimize import minimize, approx_fprime

def rosen(x):
    """The Rosenbrock function"""
    return sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0)

def rosen_der(x):
    # derivative of the Rosenbrock function
    xm = x[1:-1]
    xm_m1 = x[:-2]
    xm_p1 = x[2:]
    der = np.zeros_like(x)
    der[1:-1] = 200*(xm-xm_m1**2) - 400*(xm_p1 - xm**2)*xm - 2*(1-xm)
    der[0] = -400*x[0]*(x[1]-x[0]**2) - 2*(1-x[0])
    der[-1] = 200*(x[-1]-x[-2]**2)
    return der

x0 = np.array([1.3, 0.7])
eps = np.sqrt(np.finfo(float).eps)
fprime = lambda x: np.array(approx_fprime(x0, rosen, eps))
res = minimize(rosen, x0, method='CG', jac=fprime, options={'maxiter': 10, 'disp': True})
print(res.x)
[ 515.40001106 -197.99999905]
[ 515.4 -198. ]
98.10000000000005
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 98.100000
Iterations: 0
Function evaluations: 33
Gradient evaluations: 21
[1.3 0.7]
I checked that approx_fprime returns an ndarray, just like rosen_der, and the values are the same too.
Why doesn't the optimization work?
Your function fprime is a function of x but approximates the derivative at x0. Consequently, you're evaluating the gradient at the initial guess x0 in every iteration. You should evaluate/approximate the derivative at x instead:
fprime = lambda x : approx_fprime(x, rosen, eps)
Note that approx_fprime already returns an np.ndarray, so there's no need for the extra np.array call.
It's also worth mentioning that you don't need to pass approximated derivatives at all: if you don't pass any derivatives (i.e. jac=None), minimize approximates them by finite differences by default. However, minimize uses approx_derivative under the hood instead of approx_fprime, as it supports evaluating derivatives at variable bounds.
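Concretely, a minimal sketch of the corrected call (reusing rosen, x0 and eps from the question):

from scipy.optimize import minimize, approx_fprime

# approximate the gradient at the current iterate x, not at x0
fprime = lambda x: approx_fprime(x, rosen, eps)
res = minimize(rosen, x0, method='CG', jac=fprime, options={'maxiter': 10, 'disp': True})
print(res.x)

# or let minimize do the finite-difference approximation itself
res = minimize(rosen, x0, method='CG', jac=None, options={'maxiter': 10, 'disp': True})
print(res.x)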

How to correctly use the SLSQP algorithm with non-linear constraints?

I need to find the rectangle with max area inside an ellipse (which may be tilted).
The goal is to generalize this problem to N dimensions, so that when we set N = 2 we find our rectangle inside an ellipse.
So, mathematically speaking, I want to maximize the function f(x) = 2^N * prod(x_1, ..., x_N) with the constraint x^T A x <= 1.
I use SLSQP from scipy.optimize.minimize() to find the optimal x, but I don't know how to give the constraint correctly.
Here's my code. You will need these two functions to plot the ellipse and the rectangle:
import numpy as np
import matplotlib.pyplot as plt

def ellipse(A, ax=plt):
    eigenvalues, eigenvectors = np.linalg.eig(A)
    theta = np.linspace(0, 2*np.pi, 1000)
    ellipsis = (1/np.sqrt(eigenvalues[None, :]) * eigenvectors) @ [np.sin(theta), np.cos(theta)]
    ax.plot(ellipsis[0, :], ellipsis[1, :])

def rectangle(x, y, ax=plt):
    ax.plot([-x, -x, x, x, -x],
            [-y, y, y, -y, -y])
import scipy
from scipy.optimize import NonlinearConstraint

def f(X):
    return -2**len(X) * np.prod(X)  # we minimize -f

def g(X):
    A = np.array([[1, -0.7],
                  [-0.7, 4]])
    return X.T @ A @ X - 1

contraintes = {'type': 'ineq', 'fun': g}  # automatically g >= 0
res = scipy.optimize.minimize(f, x0, method="SLSQP", constraints=contraintes)
x, y = res.x

#================== or ===================
# contraintes = NonlinearConstraint(g, 0, np.inf)
# res = scipy.optimize.minimize(f, x0, method="SLSQP", constraints=contraintes)
# x, y = res.x
A = np.array([[1, -0.7],
              [-0.7, 4]])
fig, ax = plt.subplots(1, 1, figsize=(5, 5))
ellipse(A, ax=ax)
rectangle(x, y, ax=ax)
Here's the output (a plot of the ellipse and the rectangle): as you can see, the rectangle is not inside the ellipse.
How do I pass the constraints correctly to the minimize function?
I checked the SciPy documentation page and it didn't help me understand.
First of all, please note that scipy.optimize.minimize expects inequality constraints in the non-negative form g(x) >= 0, so passing your g directly as an 'ineq' constraint enforces the opposite of what you want (your comment # automatically g >= 0 points at exactly this issue). You need to explicitly transform your constraint by multiplying it by -1.0. And speaking of your constraints, please also note that the constraint x.T @ A @ x - 1 <= 0 only ensures that one vertex (x1, x2) and its mirror point (-x1, -x2) (due to symmetry) of your rectangle lie inside the ellipse, not the whole rectangle.
To this end, you need to impose an additional constraint so that (-x1, x2) and (x1, -x2) lie within the ellipse too:
import numpy as np
from scipy.optimize import minimize

def f(x):
    return -2**len(x) * np.prod(x)

def g(x, A):
    xx = [-1.0, 1.0] * x  # the mirrored vertex (-x1, x2)
    ineq1 = x.T @ A @ x - 1
    ineq2 = xx.T @ A @ xx - 1
    return np.array((ineq1, ineq2))

A = np.array([[1, -0.7], [-0.7, 4]])

# constraints
cons = [{'type': 'ineq', 'fun': lambda x: -1.0*g(x, A)}]

# optimize
res = minimize(f, x0=0.1*np.ones(2), method="SLSQP", constraints=cons)
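To check the result visually, here is a small optional sketch that reuses the ellipse and rectangle helpers from the question:

import matplotlib.pyplot as plt

x, y = res.x
print(res.x, g(res.x, A))  # both inequality values should now be <= 0

fig, ax = plt.subplots(1, 1, figsize=(5, 5))
ellipse(A, ax=ax)
rectangle(x, y, ax=ax)
plt.show()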

L1 convex optimization with equality constraints in python

I need to minimize L_1(x) subject to Mx = y.
x is a vector with dimension b, y is a vector with dimension a, and M is a matrix with dimensions (a,b).
After some reading I decided to use scipy.optimize.minimize:
import numpy as np
from scipy.optimize import minimize

def objective(x):  # L_1 norm objective function
    return np.linalg.norm(x, ord=1)

constraints = []  # list of all constraint functions
for i in range(a):
    def con(x, y=y, i=i):
        return np.matmul(M[i], x) - y[i]
    constraints.append(con)

# make constraints into a tuple of dicts as required by scipy
cons = tuple({'type': 'eq', 'fun': c} for c in constraints)

# perform the minimization with sequential least squares programming
opt = minimize(objective, x0=x0,
               constraints=cons, method='SLSQP', options={'disp': True})
First,
what can I use for x0? x is unknown, and I need an x0 which satisfies the constraint M*x0 = y. How can I find an initial guess which satisfies the constraint? M is a matrix of independent Gaussian variables (~N(0,1)), if that helps.
Second,
Is there a problem with the way I've set this up? When I use the true x (which I happen to know in the development phase) for x0, I expect it to return x = x0 quickly. Instead, it returns a zero vector x = [0,0,0...,0]. This behavior is unexpected.
Edit:
Here is a solution using cvxpy, solving min(L_1(x)) subject to Mx = y:
import cvxpy as cvx
x = cvx.Variable(b) #b is dim x
objective = cvx.Minimize(cvx.norm(x,1)) #L_1 norm objective function
constraints = [M*x == y] #y is dim a and M is dim a by b
prob = cvx.Problem(objective,constraints)
result = prob.solve(verbose=False)
#then clean up and chop the 1e-12 vals out of the solution
x = np.array(x.value) #extract array from variable
x = np.array([a for b in x for a in b]) #unpack the extra brackets
x[np.abs(x)<1e-9]=0 #chop small numbers to 0
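As a side note on the first question above: assuming M has full row rank (which holds almost surely for an a x b Gaussian matrix with a < b), the minimum-norm least-squares solution satisfies M @ x0 = y exactly and can be used as a feasible starting point for the SLSQP formulation. A minimal sketch with hypothetical stand-ins for a, b, M and y:

import numpy as np

a, b = 5, 10                # hypothetical dimensions
M = np.random.randn(a, b)   # stand-in for the Gaussian matrix
y = np.random.randn(a)      # stand-in for the right-hand side

x0 = np.linalg.lstsq(M, y, rcond=None)[0]  # minimum-norm solution of M x = y
# equivalently: x0 = np.linalg.pinv(M) @ y
print(np.allclose(M @ x0, y))              # True: x0 satisfies the equality constraint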

Doing many iterations of scipy's `curve_fit` in one go

Consider the following MWE
import numpy as np
from scipy.optimize import curve_fit

X = np.arange(1, 10, 1)
Y = abs(X + np.random.randn(15, 9))

def linear(x, a, b):
    return (x/b)**a

coeffs = []
for ix in range(Y.shape[0]):
    print(ix)
    c0, pcov = curve_fit(linear, X, Y[ix])
    coeffs.append(c0)

XX = np.tile(X, Y.shape[0])
c0, pcov = curve_fit(linear, XX, Y.flatten())
I have a problem where I have to do basically that, but instead of 15 iterations it's thousands and it's pretty slow.
Is there any way to do all of those iterations at once with curve_fit? I know the result from the function is supposed to be a 1D-array, so just passing the args like this
c0, pcov = curve_fit(nlinear, X, Y)
is not going to work. Also I think the answer has to be in flattening Y, so I can get a flattened result, but I just can't get anything to work.
EDIT
I know that if I do something like
XX=np.tile(X, Y.shape[0])
c0, pcov = curve_fit(nlinear, XX, Y.flatten())
then I get a "mean" value of the coefficients, but that's not what I want.
EDIT 2
For the record, I solved it using Jacques Kvam's set-up, but implemented it with NumPy (because of a limitation):
lX = np.log(X)
lY = np.log(Y)
A = np.vstack([lX, np.ones(len(lX))]).T
m, c=np.linalg.lstsq(A, lY.T)[0]
And then m is a, and to get b:
b=np.exp(-c/m)
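For completeness, the back-transformation follows from taking logs of the model y = (x/b)**a:
log y = a*log x - a*log b
so the fitted slope is m = a and the intercept is c = -a*log b, hence b = exp(-c/m).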
Least squares on the log-transformed data won't give the same result as curve_fit because the noise is transformed by the log in this case. If the noise is zero, both methods give the same result.
import numpy as np
from numpy import random as rng
from scipy.optimize import curve_fit

rng.seed(0)
X = np.arange(1, 7)
Y = np.zeros((4, 6))
for i in range(4):
    b = a = i + 1
    Y[i] = (X/b)**a + 0.01 * rng.randn(6)

def linear(x, a, b):
    return (x/b)**a

coeffs = []
for ix in range(Y.shape[0]):
    print(ix)
    c0, pcov = curve_fit(linear, X, Y[ix])
    coeffs.append(c0)
coeffs is
[array([ 0.99309127, 0.98742861]),
array([ 2.00197613, 2.00082722]),
array([ 2.99130237, 2.99390585]),
array([ 3.99644048, 3.9992937 ])]
I'll use scikit-learn's implementation of linear regression since I believe that scales well.
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
Take logs of X and Y
lX = np.log(X)[None, :]
lY = np.log(Y)
Now fit and check that the coefficients are the same as before.
lr.fit(lX.T, lY.T)
lr.coef_
Which gives similar exponents:
array([ 0.98613517, 1.98643974, 2.96602892, 4.01718514])
Now check the divisor.
np.exp(-lr.intercept_ / lr.coef_.ravel())
Which gives similar coefficients, though you can see the two methods diverging somewhat in their answers:
array([ 0.99199406, 1.98234916, 2.90677142, 3.73416501])
It might be useful in some situations to have the best fit parameters as a numpy array for further calculations. One can add the following after the for loop:
bestfit_par = np.asarray(coeffs)

Jacobian and Hessian inputs in `scipy.optimize.minimize`

I am trying to understand how the "dogleg" method works in Python's scipy.optimize.minimize function. I am adapting the example at the bottom of the help page.
The dogleg method requires a Jacobian and Hessian argument according to the notes. For this I use the numdifftools package:
import numpy as np
from scipy.optimize import minimize
from numdifftools import Jacobian, Hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

x0 = np.array([2, 0])  # initial guess
a = 2.5
res = minimize(fun, x0, args=(a), method='dogleg',
               jac=Jacobian(fun)([2, 0]), hess=Hessian(fun)([2, 0]))
print(res)
Edit:
If I make a change as suggested by a post below,
res = minimize(fun, x0, args=a, method='dogleg',
               jac=Jacobian(lambda x: fun(x, a)),
               hess=Hessian(lambda x: fun(x, a)))
I get an error TypeError: <lambda>() takes 1 positional argument but 2 were given. What am I doing wrong?
Also is it correct to calculate the Jacobian and Hessian at the initial guess x0?
I get that this is a toy example, but I would like to point out that using a tool like Jacobian or Hessian to calculate the derivatives instead of deriving the function itself is fairly costly. For example with your method:
x0 = np.array([2, 0])
a = 2.5
%timeit minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
100 loops, best of 3: 13.6 ms per loop
But you could calculate the derivative functions as such:
def fun_der(x, a):
    dx = 2 * (x[0] - 1)
    dy = 2 * (x[1] - a)
    return np.array([dx, dy])

def fun_hess(x, a):
    dx = 2
    dy = 2
    return np.diag([dx, dy])

%timeit minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
1000 loops, best of 3: 279 µs per loop
As you can see that is almost 50x faster. It really starts to add up with complex functions. As such I always try to derive the functions explicitly myself, regardless of how difficult that may be. One fun example is the kernel based implementation of Inductive Matrix Completion.
argmin --> sum((A - gamma_u(X) Z gamma_v(Y))**2 - lambda * ||Z||**2)
where gamma_u = (1/sqrt(m_x)) * [cos(UX), sin(UX)] and
gamma_v = (1/sqrt(m_y)) * [cos(VY), sin(VY)]
X.shape = n_x, p; Y.shape = n_y, q; U.shape = m_x, p; V.shape = m_y, q; Z.shape = 2m_x, 2m_y
Calculating the gradient and Hessian from this equation is extremely unreasonable in comparison to explicitly deriving and utilizing those functions. So, as @bnaul pointed out, if your function does have closed-form derivatives you really do want to calculate and use them.
That error is coming from the calls to Jacobian and Hessian, not from minimize. Replacing Jacobian(fun) with Jacobian(lambda x: fun(x, a)) and similarly for Hessian should do the trick (since now the function being differentiated only has a single vector argument).
One other thing: (a) is just a; if you want a tuple, use (a,).
import numpy as np
from scipy.optimize import minimize
from numdifftools import Jacobian, Hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

def fun_der(x, a):
    return Jacobian(lambda x: fun(x, a))(x).ravel()

def fun_hess(x, a):
    return Hessian(lambda x: fun(x, a))(x)

x0 = np.array([2, 0])  # initial guess
a = 2.5
res = minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
print(res)
You can use autograd instead
import numpy as np
from scipy.optimize import minimize
from autograd import jacobian, hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

def fun_der(x, a):
    return jacobian(lambda x: fun(x, a))(x).ravel()

def fun_hess(x, a):
    return hessian(lambda x: fun(x, a))(x)

x0 = np.array([2, 0])  # initial guess
a = 2.5
res = minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
print(res)
