Jacobian and Hessian inputs in `scipy.optimize.minimize` - python

I am trying to understand how the "dogleg" method works in Python's scipy.optimize.minimize function. I am adapting the example at the bottom of the help page.
The dogleg method requires a Jacobian and Hessian argument according to the notes. For this I use the numdifftools package:
import numpy as np
from scipy.optimize import minimize
from numdifftools import Jacobian, Hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

x0 = np.array([2, 0])  # initial guess
a = 2.5

res = minimize(fun, x0, args=(a), method='dogleg',
               jac=Jacobian(fun)([2, 0]), hess=Hessian(fun)([2, 0]))
print(res)
Edit:
If I make a change as suggested by a post below,
res = minimize(fun, x0, args=a, method='dogleg',
               jac=Jacobian(lambda x: fun(x, a)),
               hess=Hessian(lambda x: fun(x, a)))
I get an error TypeError: <lambda>() takes 1 positional argument but 2 were given. What am I doing wrong?
Also is it correct to calculate the Jacobian and Hessian at the initial guess x0?

I get that this is a toy example, but I would like to point out that using a tool like Jacobian or Hessian to calculate the derivatives numerically, instead of deriving them yourself, is fairly costly. For example, with the numdifftools-based approach:
x0 = np.array([2, 0])
a = 2.5
%timeit minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
100 loops, best of 3: 13.6 ms per loop
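(For reference, the fun_der and fun_hess being timed here are the numdifftools wrappers from the accepted answer below:)

def fun_der(x, a):
    return Jacobian(lambda x: fun(x, a))(x).ravel()

def fun_hess(x, a):
    return Hessian(lambda x: fun(x, a))(x)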
But you could calculate the derivative functions as such:
def fun_der(x, a):
    dx = 2 * (x[0] - 1)
    dy = 2 * (x[1] - a)
    return np.array([dx, dy])

def fun_hess(x, a):
    dx = 2
    dy = 2
    return np.diag([dx, dy])
%timeit minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
1000 loops, best of 3: 279 µs per loop
As you can see, that is almost 50x faster. It really starts to add up with complex functions. As such, I always try to derive the functions explicitly myself, regardless of how difficult that may be. One fun example is the kernel-based implementation of Inductive Matrix Completion:
argmin_Z  sum((A - gamma_u(X) Z gamma_v(Y))**2) + lambda * ||Z||**2

where gamma_u = (1/sqrt(m_x)) * [cos(UX), sin(UX)] and
      gamma_v = (1/sqrt(m_y)) * [cos(VY), sin(VY)],
with X.shape = (n_x, p), Y.shape = (n_y, q), U.shape = (m_x, p), V.shape = (m_y, q), Z.shape = (2m_x, 2m_y).
Numerically approximating the gradient and Hessian of that objective is extremely costly compared to deriving and using them explicitly. So, as @bnaul pointed out, if your function does have closed-form derivatives you really do want to calculate and use them.

That error is coming from the calls to Jacobian and Hessian, not from minimize. Replacing Jacobian(fun) with Jacobian(lambda x: fun(x, a)), and similarly for Hessian, should do the trick (since now the function being differentiated only has a single vector argument). Note also that jac and hess should be callables that minimize can evaluate at every iterate, not arrays computed once at x0.
One other thing: (a) is just a; if you want it to be a tuple, use (a,).
import numpy as np
from scipy.optimize import minimize
from numdifftools import Jacobian, Hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

def fun_der(x, a):
    return Jacobian(lambda x: fun(x, a))(x).ravel()

def fun_hess(x, a):
    return Hessian(lambda x: fun(x, a))(x)

x0 = np.array([2, 0])  # initial guess
a = 2.5

res = minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
print(res)

You can use autograd instead
import numpy as np
from scipy.optimize import minimize
from autograd import jacobian, hessian

def fun(x, a):
    return (x[0] - 1)**2 + (x[1] - a)**2

def fun_der(x, a):
    return jacobian(lambda x: fun(x, a))(x).ravel()

def fun_hess(x, a):
    return hessian(lambda x: fun(x, a))(x)

x0 = np.array([2, 0])  # initial guess
a = 2.5

res = minimize(fun, x0, args=(a,), method='dogleg', jac=fun_der, hess=fun_hess)
print(res)
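A small aside (my note, not from the original answer): since fun is scalar-valued, autograd.grad returns the gradient directly, so the ravel() is not strictly needed:

from autograd import grad

def fun_der(x, a):
    # gradient of the scalar objective; same shape as x, no ravel() required
    return grad(lambda x: fun(x, a))(x)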

Related

Optimize with SciPy vector to scalar function with constraint

I need to optimize a function f with respect to a vector x; it takes a constant matrix m as input and returns a scalar v >= 0.
MWE with random numbers:
import numpy as np
from scipy.optimize import minimize

np.random.seed(1)

m = np.array([[1, 0, 0.15], [2, 0, 0.15], [1.5, 0.2, 0.2], [3, 0.5, 0.1], [2.2, 0.1, 0.15]])
x0 = np.random.rand(5)*2

def f(x, m):
    pg = -np.concatenate((-arr[:, :2], x.reshape(-1, 1)), axis=1).sum(axis=1)
    return sum(arr[:, 2] * pg)

res = minimize(
    f, x0,
    method='nelder-mead', args=(m,),
    options={'xatol': 1e-8, 'maxiter': 1e+4, 'disp': True}
)
How do I set up a constraint on the output value? As far as I can tell from the docs, constraints can only be placed on the inputs. I read a post suggesting minimize_scalar, but that only applies when the input is scalar as well.
Simply add the constraint f(x, m) >= 0 and use a method that supports constraints (Nelder-Mead ignores them), e.g. SLSQP:
import numpy as np
from scipy.optimize import minimize

np.random.seed(1)

m = np.array([[1, 0, 0.15], [2, 0, 0.15], [1.5, 0.2, 0.2], [3, 0.5, 0.1], [2.2, 0.1, 0.15]])
x0 = np.random.rand(5)*2

def f(x, m):
    pg = -np.concatenate((-arr[:, :2], x.reshape(-1, 1)), axis=1).sum(axis=1)
    return sum(arr[:, 2] * pg)

# add the constraint f(x, m) >= 0
con = [{'type': 'ineq', 'fun': lambda x: f(x, m)}]

res = minimize(
    f, x0,
    constraints=con,
    method='SLSQP', args=(m,),
    options={'maxiter': 10000, 'disp': True}
)
Alternatively, you can keep the objective non-negative by minimizing a squared norm of it, e.g. f(x, m)**2; then no constraint is needed at all (a minimal sketch follows after the PPS).
PS: The second argument of your function should probably be named arr instead of m (the body refers to arr); otherwise f raises a NameError.
PPS: Since both your objective function and the constraint are continuously differentiable, a gradient-based algorithm will very likely perform much better than Nelder-Mead, even if the gradient is approximated by finite differences.
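A minimal sketch of the penalty-style alternative mentioned above, assuming the name mismatch in f is fixed so that the body uses its second argument:

import numpy as np
from scipy.optimize import minimize

np.random.seed(1)

m = np.array([[1, 0, 0.15], [2, 0, 0.15], [1.5, 0.2, 0.2], [3, 0.5, 0.1], [2.2, 0.1, 0.15]])
x0 = np.random.rand(5)*2

def f(x, m):
    # same objective as above, written against the m argument
    pg = -np.concatenate((-m[:, :2], x.reshape(-1, 1)), axis=1).sum(axis=1)
    return sum(m[:, 2] * pg)

# minimize the squared objective; its minimum lies at f(x, m) == 0, so no constraint is needed
res = minimize(lambda x, m: f(x, m)**2, x0, args=(m,))
print(res.x, f(res.x, m))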

Solving nonlinear least-squares with function returning both value and jacobian

I am trying to speed up the solving of a nonlinear least-squares problem in Python. I can compute both the function value and the Jacobian in one forward pass, (val, jac) = fun(x). A solver like scipy.optimize.least_squares only accepts two separate functions, fun and jac, which for my problem means that the function value has to be computed twice per iteration (once in fun, and once in jac).
Is there a trick for avoiding solving the primal problem twice?
The more general function scipy.optimize.minimize supports the above style with the jac=True keyword, but it's slow for my problem.
I think the best approach would be to use the MemoizeJac decorator. This is exactly what is done under the hood of scipy.optimize.minimize for jac=True:
import numpy as np
from scipy.optimize import least_squares
from scipy.optimize._optimize import MemoizeJac

def fun_and_jac(x):
    return x**2 - 5 * x + 3, 2 * x - 5

fun = MemoizeJac(fun_and_jac)
jac = fun.derivative

res = least_squares(fun, x0=0, jac=jac)
print(res)
print(res)
You can do a bit of a hack:
import numpy as np

val_cache = {}
jac_cache = {}

def _key(args):
    # numpy arrays are not hashable, so build a hashable cache key from the arguments
    return tuple(tuple(np.atleast_1d(a)) for a in args)

def val_fun(*args):
    try:
        # reuse a value that was already computed by jac_fun
        return val_cache.pop(_key(args))
    except KeyError:
        (val, jac) = fun(*args)
        jac_cache[_key(args)] = jac
        return val

def jac_fun(*args):
    try:
        # reuse a Jacobian that was already computed by val_fun
        return jac_cache.pop(_key(args))
    except KeyError:
        (val, jac) = fun(*args)
        val_cache[_key(args)] = val
        return jac
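As a usage sketch (my own example, not from the original answer), with fun being the combined value-and-Jacobian function that val_fun and jac_fun call:

import numpy as np
from scipy.optimize import least_squares

def fun(x):
    # combined residual and Jacobian, computed in one pass
    return x**2 - 5 * x + 3, 2 * x - 5

res = least_squares(val_fun, x0=np.array([0.0]), jac=jac_fun)
print(res.x)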
From the documentation of scipy.optimize.minimize:
If jac is a Boolean and is True, fun is assumed to return a tuple (f, g) containing the objective function and the gradient.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html?highlight=minimize
So you can simply do it like this:
from scipy.optimize import minimize
def function(x):
    '''Function that returns both fun and jac'''
    return x**2 - 5 * x + 3, 2 * x - 5

print(minimize(function, 0, jac=True))
Edit: rereading your question, it seems this option also works for least_squares, although it is undocumented.
This works as well:
from scipy.optimize import least_squares
def function(x):
    '''Function that returns both fun and jac'''
    return x**2 - 5 * x + 3, 2 * x - 5

print(least_squares(function, 0, jac=True))

scipy.minimize with two equations returns initial values only

I would like to get an optimal solution for the following equation set:
x_w * 1010 + x_m * d_m = 1017.7
x_w + x_m = 1
my code is as follows:
from scipy.optimize import minimize
import numpy as np

def f1(p):
    x_w, x_m, d_m = p
    return (x_w*1010 + x_m*d_m) - 1017.7

def f2(p):
    x_w, x_m, d_m = p
    return x_w + x_m - 1

bounds = [(0, 1), (0, 1), (1000, 10000)]
x0 = np.array([0.5, 0.5, 1500])

res = minimize(lambda p: f1(p)+f2(p), x0=x0, bounds=bounds)
However, all I get back (res.x) are the initial values (x0).
How do I make it work? Is there a better approach? There are just these two equations for the three variables.
In general, you can't solve the equation system by minimizing f1(p) + f2(p), since the minimum of that objective is not a solution of the equation system. Instead, minimize the sum of squared errors of the equations, i.e. f1(p)**2 + f2(p)**2:
minimize(lambda p: f1(p)**2 + f2(p)**2, x0=x0, bounds=bounds)
Alternatively, you could use scipy.optimize.fsolve, which unfortunately doesn't support bounds (and expects as many equations as unknowns).

How to Integrate Arc Lengths using python, numpy, and scipy?

On another thread, I saw someone manage to integrate the length of an arc using Mathematica. They wrote:
In[1]:= ArcTan[3.05*Tan[5Pi/18]/2.23]
Out[1]= 1.02051
In[2]:= x=3.05 Cos[t];
In[3]:= y=2.23 Sin[t];
In[4]:= NIntegrate[Sqrt[D[x,t]^2+D[y,t]^2],{t,0,1.02051}]
Out[4]= 2.53143
How exactly could this be translated to Python using numpy and scipy? In particular, I am stuck on line 4 of his code, with the "NIntegrate" function. Thanks for the help!
Also, if I already have the arc length and the vertical axis length, how would I be able to reverse the program to spit out the original parameters from the known values? Thanks!
To my knowledge scipy cannot perform symbolic computations (such as symbolic differentiation). You may want to have a look at http://www.sympy.org for a symbolic computation package. Therefore, in the example below, I compute derivatives analytically (the Dx(t) and Dy(t) functions).
>>> from scipy.integrate import quad
>>> import numpy as np
>>> Dx = lambda t: -3.05 * np.sin(t)
>>> Dy = lambda t: 2.23 * np.cos(t)
>>> quad(lambda t: np.sqrt(Dx(t)**2 + Dy(t)**2), 0, 1.02051)
(2.531432761012828, 2.810454936566873e-14)
EDIT: Second part of the question - inverting the problem
Since you know the value of the integral (arc), you can now solve for one of the parameters that determine the arc (semi-axes, angle, etc.). Let's assume you want to solve for the angle. Then you can use one of the non-linear solvers in scipy to solve the equation arc(theta) - arcval == 0. You can do it like this:
>>> from scipy.integrate import quad
>>> from scipy.optimize import broyden1
>>> import numpy as np
>>> a = 3.05
>>> b = 2.23
>>> Dx = lambda t: -a * np.sin(t)
>>> Dy = lambda t: b * np.cos(t)
>>> arc = lambda theta: quad(lambda t: np.sqrt(Dx(t)**2 + Dy(t)**2), 0, np.arctan((a / b) * np.tan(np.deg2rad(theta))))[0]
>>> invert = lambda arcval: float(broyden1(lambda x: arc(x) - arcval, np.rad2deg(arcval / np.sqrt((a**2 + b**2) / 2.0))))
Then:
>>> arc(50)
2.531419526553662
>>> invert(arc(50))
50.000031008458365
If you prefer a pure numerical approach, you could use the following barebones solution. This worked well for me given that I had two input numpy.ndarrays, x and y with no functional form available.
import numpy as np

def arclength(x, y, a, b):
    """
    Computes the arc length of the curve defined by
    (x0, y0), (x1, y1), ..., (xn, yn) over the provided
    bounds, `a` and `b`.

    Parameters
    ----------
    x: numpy.ndarray
        The array of x values
    y: numpy.ndarray
        The array of y values corresponding to each value of x
    a: int
        The lower limit to integrate from
    b: int
        The upper limit to integrate to

    Returns
    -------
    numpy.float64
        The arc length of the curve
    """
    bounds = (x >= a) & (x <= b)

    return np.trapz(
        np.sqrt(
            1 + np.gradient(y[bounds], x[bounds]) ** 2
        ),
        x[bounds]
    )
Note: I spread the return expression over several lines just to make the operations taking place easier to read.
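As a quick sanity check (my own example, not part of the original answer), the arc length of y = x**2 on [0, 1] is about 1.4789, and the function above reproduces it:

import numpy as np

x = np.linspace(0.0, 1.0, 1001)
y = x**2
print(arclength(x, y, 0.0, 1.0))  # ~1.4789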
As an aside, recall that the arc length of a curve y(x) between x = a and x = b is given by the integral from a to b of sqrt(1 + (dy/dx)**2) dx, which is exactly what the trapezoidal rule above approximates.

Using fsolve with scipy function

I have encountered the following problem with scipy.optimize.fsolve, but I don't know what to do:
U = 0.00043
ThC =1.19
Dist = 7
IncT = 0.2
pcw = 1180000
k = 1.19
B = U * pcw / (2 * k)
fugato = fsolve((((Ql/(2*math.pi* k))*math.exp(B * x)*special.kv(0, B * x))-IncT),0.01)
print fugato
I get the error TypeError: 'numpy.float64' object is not callable in fsolve.
How do I fix this problem?
The argument to fsolve must be a function.
I presume that you want to solve your equation for x? If so, writing:
fugato = fsolve(lambda x: Ql/(2*math.pi* k)*math.exp(B * x)*special.kv(0, B * x)-IncT,
0.01)
works.
To explain what's going on here, the construct lambda x: 2*x is a function definition. It is similar to writing:
def f(x):
    return 2*x
The lambda construction is commonly used to define functions that you only need once. This is often the case when registering callbacks, or to represent a mathematical expression. For instance, if you wanted to integrate f(x) = 2*x, you could write:
from scipy.integrate import quad
integral = quad(lambda x: 2*x, 0., 3.)
Similarly, if you want to solve 2*x = 1, you can write:
from scipy.optimize import fsolve
fsolve(lambda x: 2*x-1, 0.)
