I am trying to use a Scipy global optimizer to find a global min of a function, but instead of giving the global min as an answer, it stuck in a local min.
The code:
def f(x):
return x**2 + 10*np.sin(x)
x = np.arange(-10, 10, 0.1)
print(optimize.basinhopping(f, 3))
Can anyone tell me why? And which method in Scipy you think is the best one to solve global optimization?
Only SHGO will provide you guarantees. It also has a better performance based on my benchmarking exercise. Basin-hopping can spend too much time on one possible local minima whereas the homology based technique (i.e. shgo, which you can also pip install and use separately of scipy if you want) avoids this in a cunning fashion.
I should add that if you really want to be careful, you should change the default SHGO sampling (it will use Sobol by default, which technically breaks the guarantee).
The simple scipy.optimize is finding the global minimum for your problem. Run the following code:
import math
import numpy as np
from scipy.optimize import minimize
def objective(x):
# min f(x)
return x[0]**2+10*math.sin(x[0])
if __name__ == "__main__":
# initial guesses
n = 1
x0 = np.zeros(n)
x0[0] = 90.0
# show initial objective
print('\nInitial Objective: ' + str(objective(x0)))
# state bounds
b1 = (-100.0, 100.0)
bds = (b1,)
print("\n")
solution = minimize(objective, x0, method='SLSQP', jac=None, bounds=bds,
tol = 1e-20, constraints=(),
options={'maxiter': 1000000, 'ftol': 1e-20, 'disp': True})
# z is a numpy.ndarray vector
z = solution.x
# show final objective
print('\nFinal Objective')
print('f* = ' + str(objective(z)))
# print solution
print('\nSolution')
print('x1* = ' + str(z[0]))
which outputs:
Initial Objective: 10005.063656411097
Optimization terminated successfully. (Exit mode 0)
Current function value: -7.9458233756152845
Iterations: 7
Function evaluations: 34
Gradient evaluations: 7
Final Objective
f* = -7.9458233756152845
Solution
x1* = -1.3064400096083357
Related
I'm trying to plot the output from an ODE using a Kronecker delta function which should only become 'active' at a specific time = t1.
This should give a sawtooth like response where the initial value decays down exponentially until t=t1 where it rises again instantly before decaying down once again.
However, when I plot this it looks like the solver is seeing the Kronecker delta function as zero for all time t. Is there anyway to do this in Python?
from scipy import KroneckerDelta
import scipy.integrate as sp
import matplotlib.pyplot as plt
import numpy as np
def dy_dt(y,t):
dy_dt = 500*KroneckerDelta(t,t1) - 2y
return dy_dt
t1 = 4
y0 = 500
t = np.arrange(0,10,0.1)
y = sp.odeint(dy_dt,y0,t)
plt.plot(t,y)
In the case of a simple Kronecker delta using time, you can run the ode in pieces like so:
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import numpy as np
def dy_dt(y,t):
return -2*y
t_delta = 4
tend = 10
y0 = [500]
t1 = np.linspace(0,t_delta,50)
y1 = odeint(dy_dt,y0,t1)
y0 = y1[-1] + 500 # execute Kronecker delta
t2 = np.linspace(t_delta,tend,50)
y2 = odeint(dy_dt,y0,t2)
t = np.append(t1, t2)
y = np.append(y1, y2)
plt.plot(t,y)
Another option for complicated situations is to the events functionality of solve_ivp.
I think the problem could be internal rounding errors, because 0.1 cannot be represented exactly as a python float. I would try
import math
def dy_dt(y,t):
if math.isclose(t, t1):
return 500 - 2*y
else:
return -2y
Also the documentation of odeint suggests using the args parameter instead of global variables to give your derivative function access to additional arguments and replacing np.arange by np.linspace:
import scipy.integrate as sp
import matplotlib.pyplot as plt
import numpy as np
import math
def dy_dt(y, t, t1):
if math.isclose(t, t1):
return 500 - 2*y
else:
return -2*y
t1 = 4
y0 = 500
t = np.linspace(0, 10, num=101)
y = sp.odeint(dy_dt, y0, t, args=(t1,))
plt.plot(t, y)
I did not test the code so tell me if there is anything wrong with it.
EDIT:
When testing my code I took a look at the t values for which dy_dt is evaluated. I noticed that odeint does not only use the t values that where specified, but alters them slightly:
...
3.6636447422787928
3.743098503914526
3.822552265550259
3.902006027185992
3.991829287543431
4.08165254790087
4.171475808258308
...
Now using my method, we get
math.isclose(3.991829287543431, 4) # False
because the default tolerance is set to a relative error of at most 10^(-9), so the odeint function "misses" the bump of the derivative at 4. Luckily, we can fix that by specifying a higher error threshold:
def dy_dt(y, t, t1):
if math.isclose(t, t1, abs_tol=0.01):
return 500 - 2*y
else:
return -2*y
Now dy_dt is very high for all values between 3.99 and 4.01. It is possible to make this range smaller if the num argument of linspace is increased.
TL;DR
Your problem is not a problem of python but a problem of numerically solving an differential equation: You need to alter your derivative for an interval of sufficient length, otherwise the solver will likely miss the interesting spot. A kronecker delta does not work with numeric approaches to solving ODEs.
I have a need to pass internal calculations from a call to odeint. I normally recalculate the values again after I am finished with the integration, but I would prefer to do all of the calculations in the called function for odeint.
My problems are not computationally heavy, so taking a little extra performance hit in doing calculations inside of the ode solver is acceptable.
from scipy.integrate import odeint
import numpy as np
def eom(y, t):
internal_calc = y/(t+1)
xdot = y
return xdot, internal_calc
if __name__ == '__main__':
t = np.linspace(0, 5, 100)
y0 = 1.0 # the initial condition
output, internal_calc = odeint(eom, y0, t)
This code doesn't run, but hopefully shows what I am after. I want to get the 'internal_calc' value out of the eom function for each pass through the integrator.
I've looked around for options, but one of the best python programmers I know told me to write my own integrator so I can do what I want.
Before I did that, I thought I would ask if anyone else has a method for getting values out of the odeint solver.
It is possible, you just can't use the return values of your eom function. So you need some other way of smuggling data out from eom. There are many, many different ways to do this. The simplest is probably to just use a global variable:
import scipy.integrate as spi
count = 0
def pend(t, y):
global count
theta, omega = y
dydt = [omega, -.25*omega - 5*np.sin(theta)]
count += 1
return dydt
sol = spi.solve_ivp(pend, [0, 10], [np.pi - 0.1, 0.0])
print(count)
Output:
182
Also, note that I used solve_ivp instead of odeint in the code above. The odeint docs say that when writing new code, you should now use solve_ivp instead of the older odeint.
If it were my own code, I'd probably accomplish the task by passing an accumulator object into a partial version of my function:
class Acc:
def __init__(self):
self.x = 0
def __str__(self):
return str(self.x)
def pend_partial(acc):
def pend(t, y):
theta, omega = y
dydt = [omega, -.25*omega - 5*np.sin(theta)]
acc.x += 1
return dydt
return pend
count = Acc()
sol = spi.solve_ivp(pend_partial(count), [0, 10], [np.pi - 0.1, 0.0])
print(count)
Output:
182
However, if you're just writing a short script or something, you should probably just use the simpler global approach. This is a pretty good use case for it.
Let us assume I have an ODE with x'(t) = f(x) with the respective solution x(t) = ϕ(x(0),t) of a initial condition x(0). Now I intend to calculate numerically the equilibria as a function of their initial condition: eq(x0) := ϕ(x0, ∞). The ODEs are such that these equilibria exist unambiguously for all initial conditions (including eq = ∞).
My poor man's approach would be to integrate the ODE up to a late time and fetch that value (for brevity I do not show the plotting):
import numpy as np
from scipy.integrate import odeint
# ODE
def func(X,t):
return [ X[2]**2 * (X[0] - X[1]),
X[2]**3 * (X[0] + 3 * X[1]),
-X[2]**2]
# Forming a grid
n = 15
x0 = x1 = np.linspace(0,1,n)
x0_,x1_ = np.meshgrid(x0,x1)
eq = np.zeros([n,n,3])
t = np.linspace(0,100,1000)
x2 = 1
for i in range(n):
for j in range(n):
X = odeint(func,[x0_[j,i],x1_[j,i],x2], t)
eq[j,i,:] = X[-1,:]
Naive example above:
The problem with that approach is that you can never be sure if it converged. I know that you can just find the roots of f(x), but this would not yield the equilibria as a function of their initial conditions (You could trace them back, but since this function is not injective, you will not find values for all initial values). I somehow need a ODE solver which integrates until an equilibria is reached (or stops integrating if it goes beyond a limit). Do you have any ideas?
I have a function which is actually a call to another program (some Fortran code). When I call this function (run_moog) I can parse 4 variables, and it returns 6 values. These values should all be close to 0 (in order to minimize). However, I combined them like this: np.sum(results**2). Now I have a scalar function. I would like to minimize this function, i.e. get the np.sum(results**2) as close to zero as possible.
Note: When this function (run_moog) takes the 4 input parameters, it creates an input file for the Fortran code that depends on these parameters.
I have tried several ways to optimize this from the scipy docs. But none works as expected. The minimization should be able to have bounds on the 4 variables. Here is an attempt:
from scipy.optimize import minimize # Tried others as well from the docs
x0 = 4435, 3.54, 0.13, 2.4
bounds = [(4000, 6000), (3.00, 4.50), (-0.1, 0.1), (0.0, None)]
a = minimize(fun_mmog, x0, bounds=bounds, method='L-BFGS-B') # I've tried several different methods here
print a
This then gives me
status: 0
success: True
nfev: 5
fun: 2.3194639999999964
x: array([ 4.43500000e+03, 3.54000000e+00, 1.00000000e-01,
2.40000000e+00])
message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
jac: array([ 0., 0., -54090399.99999981, 0.])
nit: 0
The third parameter changes slightly, while the others are exactly the same. Also there have been 5 function calls (nfev) but no iterations (nit). The output from scipy is shown here.
Couple of possibilities:
Try COBYLA. It should be derivative-free, and supports inequality constraints.
You can't use different epsilons via the normal interface; so try scaling your first variable by 1e4. (Divide it going in, multiply coming back out.)
Skip the normal automatic jacobian constructor, and make your own:
Say you're trying to use SLSQP, and you don't provide a jacobian function. It makes one for you. The code for it is in approx_jacobian in slsqp.py. Here's a condensed version:
def approx_jacobian(x,func,epsilon,*args):
x0 = asfarray(x)
f0 = atleast_1d(func(*((x0,)+args)))
jac = zeros([len(x0),len(f0)])
dx = zeros(len(x0))
for i in range(len(x0)):
dx[i] = epsilon
jac[i] = (func(*((x0+dx,)+args)) - f0)/epsilon
dx[i] = 0.0
return jac.transpose()
You could try replacing that loop with:
for (i, e) in zip(range(len(x0)), epsilon):
dx[i] = e
jac[i] = (func(*((x0+dx,)+args)) - f0)/e
dx[i] = 0.0
You can't provide this as the jacobian to minimize, but fixing it up for that is straightforward:
def construct_jacobian(func,epsilon):
def jac(x, *args):
x0 = asfarray(x)
f0 = atleast_1d(func(*((x0,)+args)))
jac = zeros([len(x0),len(f0)])
dx = zeros(len(x0))
for i in range(len(x0)):
dx[i] = epsilon
jac[i] = (func(*((x0+dx,)+args)) - f0)/epsilon
dx[i] = 0.0
return jac.transpose()
return jac
You can then call minimize like:
minimize(fun_mmog, x0,
jac=construct_jacobian(fun_mmog, [1e0, 1e-4, 1e-4, 1e-4]),
bounds=bounds, method='SLSQP')
It sounds like your target function doesn't have well-behaving derivatives. The line in the output jac: array([ 0., 0., -54090399.99999981, 0.]) means that changing only the third variable value is significant. And because the derivative w.r.t. to this variable is virtually infinite, there is probably something wrong in the function. That is also why the third variable value ends up in its maximum.
I would suggest that you take a look at the derivatives, at least in a few points in your parameter space. Compute them using finite differences and the default step size of SciPy's fmin_l_bfgs_b, 1e-8. Here is an example of how you could compute the derivates.
Try also plotting your target function. For instance, keep two of the parameters constant and let the two others vary. If the function has multiple local optima, you shouldn't use gradient-based methods like BFGS.
How difficult is it to get an analytical expression for the gradient? If you have that you can then approximate the product of Hessian with a vector using finite difference. Then you can use other optimization routines available.
Among the various optimization routines available in SciPy, the one called TNC (Newton Conjugate Gradient with Truncation) is quite robust to the numerical values associated with the problem.
The Nelder-Mead Simplex Method (suggested by Cristián Antuña in the comments above) is well known to be a good choice for optimizing (posibly ill-behaved) functions with no knowledge of derivatives (see Numerical Recipies In C, Chapter 10).
There are two somewhat specific aspects to your question. The first is the constraints on the inputs, and the second is a scaling problem. The following suggests solutions to these points, but you might need to manually iterate between them a few times until things work.
Input Constraints
Assuming your input constraints form a convex region (as your examples above indicate, but I'd like to generalize it a bit), then you can write a function
is_in_bounds(p):
# Return if p is in the bounds
Using this function, assume that the algorithm wants to move from point from_ to point to, where from_ is known to be in the region. Then the following function will efficiently find the furthermost point on the line between the two points on which it can proceed:
from numpy.linalg import norm
def progress_within_bounds(from_, to, eps):
"""
from_ -- source (in region)
to -- target point
eps -- Eucliedan precision along the line
"""
if norm(from_, to) < eps:
return from_
mid = (from_ + to) / 2
if is_in_bounds(mid):
return progress_within_bounds(mid, to, eps)
return progress_within_bounds(from_, mid, eps)
(Note that this function can be optimized for some regions, but it's hardly worth the bother, as it doesn't even call your original object function, which is the expensive one.)
One of the nice aspects of Nelder-Mead is that the function does a series of steps which are so intuitive. Some of these points can obviously throw you out of the region, but it's easy to modify this. Here is an implementation of Nelder Mead with modifications made marked between pairs of lines of the form ##################################################################:
import copy
'''
Pure Python/Numpy implementation of the Nelder-Mead algorithm.
Reference: https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
'''
def nelder_mead(f, x_start,
step=0.1, no_improve_thr=10e-6, no_improv_break=10, max_iter=0,
alpha = 1., gamma = 2., rho = -0.5, sigma = 0.5):
'''
#param f (function): function to optimize, must return a scalar score
and operate over a numpy array of the same dimensions as x_start
#param x_start (numpy array): initial position
#param step (float): look-around radius in initial step
#no_improv_thr, no_improv_break (float, int): break after no_improv_break iterations with
an improvement lower than no_improv_thr
#max_iter (int): always break after this number of iterations.
Set it to 0 to loop indefinitely.
#alpha, gamma, rho, sigma (floats): parameters of the algorithm
(see Wikipedia page for reference)
'''
# init
dim = len(x_start)
prev_best = f(x_start)
no_improv = 0
res = [[x_start, prev_best]]
for i in range(dim):
x = copy.copy(x_start)
x[i] = x[i] + step
score = f(x)
res.append([x, score])
# simplex iter
iters = 0
while 1:
# order
res.sort(key = lambda x: x[1])
best = res[0][1]
# break after max_iter
if max_iter and iters >= max_iter:
return res[0]
iters += 1
# break after no_improv_break iterations with no improvement
print '...best so far:', best
if best < prev_best - no_improve_thr:
no_improv = 0
prev_best = best
else:
no_improv += 1
if no_improv >= no_improv_break:
return res[0]
# centroid
x0 = [0.] * dim
for tup in res[:-1]:
for i, c in enumerate(tup[0]):
x0[i] += c / (len(res)-1)
# reflection
xr = x0 + alpha*(x0 - res[-1][0])
##################################################################
##################################################################
xr = progress_within_bounds(x0, x0 + alpha*(x0 - res[-1][0]), prog_eps)
##################################################################
##################################################################
rscore = f(xr)
if res[0][1] <= rscore < res[-2][1]:
del res[-1]
res.append([xr, rscore])
continue
# expansion
if rscore < res[0][1]:
xe = x0 + gamma*(x0 - res[-1][0])
##################################################################
##################################################################
xe = progress_within_bounds(x0, x0 + gamma*(x0 - res[-1][0]), prog_eps)
##################################################################
##################################################################
escore = f(xe)
if escore < rscore:
del res[-1]
res.append([xe, escore])
continue
else:
del res[-1]
res.append([xr, rscore])
continue
# contraction
xc = x0 + rho*(x0 - res[-1][0])
##################################################################
##################################################################
xc = progress_within_bounds(x0, x0 + rho*(x0 - res[-1][0]), prog_eps)
##################################################################
##################################################################
cscore = f(xc)
if cscore < res[-1][1]:
del res[-1]
res.append([xc, cscore])
continue
# reduction
x1 = res[0][0]
nres = []
for tup in res:
redx = x1 + sigma*(tup[0] - x1)
score = f(redx)
nres.append([redx, score])
res = nres
Note This implementation is GPL, which is either fine for you or not. It's extremely easy to modify NM from any pseudocode, though, and you might want to throw in simulated annealing in any case.
Scaling
This is a trickier problem, but jasaarim has made an interesting point regarding that. Once the modified NM algorithm has found a point, you might want to run matplotlib.contour while fixing a few dimensions, in order to see how the function behaves. At this point, you might want to rescale one or more of the dimensions, and rerun the modified NM.
–
I'm trying to solve a second order ODE using odeint from scipy. The issue I'm having is the function is implicitly coupled to the second order term, as seen in the simplified snippet (please ignore the pretend physics of the example):
import numpy as np
from scipy.integrate import odeint
def integral(y,t,F_l,mass):
dydt = np.zeros_like(y)
x, v = y
F_r = (((1-a)/3)**2 + (2*(1+a)/3)**2) * v # 'a' implicit
a = (F_l - F_r)/mass
dydt = [v, a]
return dydt
y0 = [0,5]
time = np.linspace(0.,10.,21)
F_lon = 100.
mass = 1000.
dydt = odeint(integral, y0, time, args=(F_lon,mass))
in this case I realise it is possible to algebraically solve for the implicit variable, however in my actual scenario there is a lot of logic between F_r and the evaluation of a and algebraic manipulation fails.
I believe the DAE could be solved using MATLAB's ode15i function, but I'm trying to avoid that scenario if at all possible.
My question is - is there a way to solve implicit ODE functions (DAE) in python( scipy preferably)? And is there a better way to pose the problem above to do so?
As a last resort, it may be acceptable to pass a from the previous time-step. How could I pass dydt[1] back into the function after each time-step?
Quite Old , but worth updating so it may be useful for anyone, who stumbles upon this question. There are quite few packages currently available in python that can solve implicit ODE.
GEKKO (https://github.com/BYU-PRISM/GEKKO) is one of the packages, that specializes on dynamic optimization for mixed integer , non linear optimization problems, but can also be used as a general purpose DAE solver.
The above "pretend physics" problem can be solved in GEKKO as follows.
m= GEKKO()
m.time = np.linspace(0,100,101)
F_l = m.Param(value=1000)
mass = m.Param(value =1000)
m.options.IMODE=4
m.options.NODES=3
F_r = m.Var(value=0)
x = m.Var(value=0)
v = m.Var(value=0,lb=0)
a = m.Var(value=5,lb=0)
m.Equation(x.dt() == v)
m.Equation(v.dt() == a)
m.Equation (F_r == (((1-a)/3)**2 + (2*(1+a)/3)**2 * v))
m.Equation (a == (1000 - F_l)/mass)
m.solve(disp=False)
plt.plot(x)
if algebraic manipulation fails, you can go for a numerical solution of your constraint, running for example fsolve at each timestep:
import sys
from numpy import linspace
from scipy.integrate import odeint
from scipy.optimize import fsolve
y0 = [0, 5]
time = linspace(0., 10., 1000)
F_lon = 10.
mass = 1000.
def F_r(a, v):
return (((1 - a) / 3) ** 2 + (2 * (1 + a) / 3) ** 2) * v
def constraint(a, v):
return (F_lon - F_r(a, v)) / mass - a
def integral(y, _):
v = y[1]
a, _, ier, mesg = fsolve(constraint, 0, args=[v, ], full_output=True)
if ier != 1:
print "I coudn't solve the algebraic constraint, error:\n\n", mesg
sys.stdout.flush()
return [v, a]
dydt = odeint(integral, y0, time)
Clearly this will slow down your time integration. Always check that fsolve finds a good solution, and flush the output so that you can realize it as it happens and stop the simulation.
About how to "cache" the value of a variable at a previous timestep, you can exploit the fact that default arguments are calculated only at the function definition,
from numpy import linspace
from scipy.integrate import odeint
#you can choose a better guess using fsolve instead of 0
def integral(y, _, F_l, M, cache=[0]):
v, preva = y[1], cache[0]
#use value for 'a' from the previous timestep
F_r = (((1 - preva) / 3) ** 2 + (2 * (1 + preva) / 3) ** 2) * v
#calculate the new value
a = (F_l - F_r) / M
cache[0] = a
return [v, a]
y0 = [0, 5]
time = linspace(0., 10., 1000)
F_lon = 100.
mass = 1000.
dydt = odeint(integral, y0, time, args=(F_lon, mass))
Notice that in order for the trick to work the cache parameter must be mutable, and that's why I use a list. See this link if you are not familiar with how default arguments work.
Notice that the two codes DO NOT produce the same result, and you should be very careful using the value at the previous timestep, both for numerical stability and precision. The second is clearly much faster though.