Inequality constrained least squares in Python using the 'COBYLA' algorithm

My goal is to minimize the least-squares error (i.e. "fit" the function) subject to the requirement that the returned function is non-decreasing, which means that its derivative is >= 0 at every point of the interval I.
My function of choice is a 4th-degree polynomial, i.e.
f(x) = a*x**4 + b*x**3 + c*x**2 + d*x + e
For this task it seems best to use the scipy.optimize.minimize method. I built a fairly robust iterative algorithm that searches for the points where the function returned by 'minimize' is decreasing and sets an inequality constraint there. For example, if f(x) is decreasing at the point x0, my constraint is:
4*a*x0**3 + 3*b*x0**2 + 2*c*x0 + d >= 0
For some of my data I succeeded using the 'SLSQP' optimization method WITH the inequality constraint, as described here. That is odd, because the documentation of minimize states:
"Note that COBYLA only supports inequality constraints."
So my first question: 1] Is the tutorial from the first link mistaken?
Even if it is right, it seems I can't use 'SLSQP' for other data because of an 'incompatible constraints' error from the minimize process.
So now I want to use the 'COBYLA' algorithm, since there could be some points where f(x) is decreasing. Here is the sample code:
#STACK
#[ 1.01766416e-04, 1.80575564e-06, -7.51840485e-03, -7.51828086e-03, 9.84985357e-01]
import numpy as np
from scipy.optimize import minimize
def ecdf(arr):
    arr = np.array(arr)
    F = [len(arr[arr <= t]) / len(arr) for t in arr]
    return np.array(F)

def der(args_pol, point):
    a, b, c, d, e = args_pol
    return (4*a*point**3 + 3*b*point**2 + 2*c*point + d)

def least_sq(args_pol, x, y):
    a, b, c, d, e = args_pol
    return ((y - (a*x**4 + b*x**3 + c*x**2 + d*x + e))**2).sum()

var = np.array([6.8, 6.9, 7. , 7.4, 7.4, 7.5, 7.5, 7.6, 7.7, 7.8, 8. ,
                8. , 8. ])
ec = ecdf(var)
tip = [0., 0., 0., 0., 0.]
const = []
opt = minimize(least_sq, tip, method='COBYLA', args=(var, ec),
               constraints=const)
The result of the optimization process is in my second comment above. The resulting function, if you plot it, looks like this.
As you can see, the resulting function fits my data VERY poorly, even without any constraints. I saw similar behaviour for data where I needed some constraints as well; sometimes the resulting function was even worse than in this example. So my second question is:
2] Can anybody explain to me what I am doing wrong?
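For reference, the constraint-building loop I described above looks roughly like this (a simplified sketch only; the evaluation grid, the number of passes, and the dict-style 'ineq' constraints are arbitrary choices):
# Sketch of the iterative constraint construction (simplified)
grid = np.linspace(var.min(), var.max(), 50)
const = []
opt = minimize(least_sq, tip, method='COBYLA', args=(var, ec), constraints=const)
for _ in range(10):  # limit the number of refinement passes
    bad_points = [x0 for x0 in grid if der(opt.x, x0) < 0]
    if not bad_points:
        break
    for x0 in bad_points:
        # require a non-negative derivative at this point
        const.append({'type': 'ineq', 'fun': lambda p, x0=x0: der(p, x0)})
    opt = minimize(least_sq, opt.x, method='COBYLA', args=(var, ec), constraints=const)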


Converting output of scipy.interpolate.splprep into NURBS format for IGES display

I'm looking to convert a series of ordered (pretty dense) 2D points describing arbitrary curves into a NURBS representation, which can be written into an IGES file.
I'm using scipy.interpolate's splprep to get a B-spline representation of the given series of points, and then I had presumed the NURBS definition would essentially be this plus saying all weights are equal to 1. However I think I am fundamentally misinterpreting the output of splprep, specifically the relation between 'B-spline coefficients' and the control points needed to manually recreate the spline in some CAD package (I am using Siemens NX11).
I've tried a simple example of approximating the function y = x^3 from a sparse set of points:
import scipy.interpolate as si
import numpy as np
import matplotlib.pyplot as plt
# Sparse points defining cubic
x = np.linspace(-1,1,7)
y = x**3
# Get B-spline representation
tck, u = si.splprep([x,y],s=0.0)
# Get (x,y) coordinates of control points
c_x = tck[1][0]
c_y = tck[1][1]
# Plotting
u_fine = np.linspace(0,1,1000)
x_fine, y_fine = si.splev(u_fine, tck)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y, 'o', x_fine, y_fine)
ax.axis('equal')
plt.show()
Which gives the following parameters:
>>> t
array([ 0. , 0. , 0. , 0. , 0.39084883,
0.5 , 0.60915117, 1. , 1. , 1. , 1. ])
>>> c_x
array([ -1.00000000e+00, -9.17992269e-01, -6.42403598e-01,
-2.57934892e-16, 6.42403598e-01, 9.17992269e-01,
1.00000000e+00])
>>> c_y
array([ -1.00000000e+00, -7.12577481e-01, -6.82922469e-03,
-1.00363771e-18, 6.82922469e-03, 7.12577481e-01,
1.00000000e+00])
>>> k
3
>>> u
array([ 0. , 0.25341516, 0.39084883, 0.5 , 0.60915117,
0.74658484, 1. ])
>>>
I've assumed that the two sets of coefficients (c_x, c_y) describe the (x,y) coordinates of poles needed to construct the spline. Trying this manually in NX gives a similar spline, though not quite the same, with other points in the interval being evaluated differently than in Python. When I export this manual spline to IGES format, NX changes the knots to the below (while obviously keeping the same control points/poles and setting all weights = 1).
t_nx = np.array([0.0, 0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0])
Going the other way and writing the splprep knots (t) into the IGES definition (with said 'control points' and weights = 1) does not seem to give a valid spline. NX and at least one other package cannot evaluate it, citing 'invalid trim or parametric values for B-spline curve'.
There seem to me to be at least three possibilities:
1. A non-trivial conversion is necessary to go from non-rational to rational B-splines
2. There is an application-specific interpretation of IGES splines (i.e. my interpretation of splprep output is correct, but this is simplified/approximated by NX when manually drawn/during the IGES conversion routine). Seems unlikely.
3. The coefficients from splprep cannot be interpreted as control points in the manner I've described
I had written off the first possibility by comparing the equations for a scipy B-spline (link) and an IGES NURBS spline with all weights = 1 (link, page 14). They look identical, and it was this that led me to believe splprep coefficients = control points.
Any help clarifying any of the above points would be very much appreciated!
NB, I would like the possibility of representing closed curves, so want to stick to splprep if possible.
EDIT:
I thought it would be simpler to try this process first using splrep, as the outputs seemed more intuitive to me. I assumed the coefficients returned were the y-values of the control points, but didn't know to what x position they corresponded. I therefore tried to calculate them from the spline definition and input data using this matrix approach. The C matrix is just the input data. The N matrix is the evaluation of each basis function for each x-value, I did this using the (slightly modified) recursive functions shown here. Then all that remains is to invert N, and pre-multiply C by it to get the control points. The code and result is below:
import numpy as np
import scipy.interpolate as si

# Functions to evaluate B-spline basis functions
def B(x, k, i, t):
    if k == 0:
        return 1.0 if t[i] <= x < t[i+1] else 0.0
    if t[i+k] == t[i]:
        c1 = 0.0
    else:
        c1 = (x - t[i])/(t[i+k] - t[i]) * B(x, k-1, i, t)
    if t[i+k+1] == t[i+1]:
        c2 = 0.0
    else:
        c2 = (t[i+k+1] - x)/(t[i+k+1] - t[i+1]) * B(x, k-1, i+1, t)
    return c1 + c2

def bspline(x, t, c, k):
    n = len(t) - k - 1
    assert (n >= k+1) and (len(c) >= n)
    cont = []
    for i in range(n):
        res = B(x, k, i, t)
        cont.append(res)
    return cont

# Input data
x = np.linspace(-1, 1, 7)
y = x**3

# B-spline definition
t, c, k = si.splrep(x, y)

# Number of knots = m + 1 = n + k + 2
m = len(t) - 1
# Number of kth degree basis fcns
n = m - k - 1

# Define C and initialise N matrix
C_mat = np.column_stack((x, y))
N_mat = np.zeros(((n+1), (n+1)))

# Calculate basis functions for each x, store in matrix
for i, xs in enumerate(x):
    row = bspline(xs, t, c, k)
    N_mat[i, :] = row
# Last value must be one...
N_mat[-1, -1] = 1.0

# Invert the matrix
N_inv = np.linalg.inv(N_mat)

# Now calculate control points
P = np.dot(N_inv, C_mat)
Resulting in:
>>> P
array([[ -1.00000000e+00, -1.00000000e+00],
[ -7.77777778e-01, -3.33333333e-01],
[ -4.44444444e-01, -3.29597460e-17],
[ -3.12250226e-17, 8.67361738e-18],
[ 4.44444444e-01, -2.77555756e-17],
[ 7.77777778e-01, 3.33333333e-01],
[ 1.00000000e+00, 1.00000000e+00]])
I think it's correct because the y-values of P match the coefficients from splrep, c. Interestingly the x-values seem to be the knot averages (which could be separately calculated as below). Perhaps this result is obvious to someone very familiar with the maths, it certainly wasn't to me.
def knot_average(knots, degree):
    """
    Determines knot average vector from knot vector.
    :knots: A 1D numpy array describing knots of B-spline.
        (NB expected from scipy.interpolate.splrep)
    :degree: Integer describing degree of B-spline basis fcns
    """
    # Chop first and last vals off
    knots_to_average = knots[1:-1]
    num_averaged_knots = len(knots_to_average) - degree + 1
    knot_averages = np.zeros((num_averaged_knots,))
    for i in range(num_averaged_knots):
        avg = np.average(knots_to_average[i: i + degree])
        knot_averages[i] = avg
    return(knot_averages)
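As a quick sanity check (my own addition, reusing t, k and P from the snippets above), the knot averages can be compared directly to the x-coordinates of the computed control points:
greville = knot_average(t, k)
print(np.allclose(greville, P[:, 0]))  # expected to print True for this example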
Now, to convert these to IGES NURBS I thought it was a case of defining the normalised knot vector, setting the weights all equal to one, and including the P control points from above. I normalised it as below, and have included the IGES file below that.
However when I try to import the file into NX, it again fails stating invalid trim parameters in the definition. Can anyone tell me if this is a valid NURBS definition?
Or perhaps this is some limitation of NX? For instance, I noticed when interactively drawing studio splines that the knot vector was forced to be (clamped) uniform (as alluded to by fang). This constraint (and weights all = 1) must be required to uniquely define the curve. Interestingly, if I force splrep to return a spline representation using a uniform knot vector (that is, clamped but otherwise uniform), the IGES is read in. I shouldn't think this is necessary from NX's point of view, though, as it defeats the purpose of having a NURBS in the first place. So that doesn't seem likely, and I loop back round to wondering whether my interpretation of the output of splrep is correct... can someone please point out where I've gone wrong?
# Original knot vector
>>> t
array([-1. , -1. , -1. , -1. , -0.33333333,
0. , 0.33333333, 1. , 1. , 1. , 1. ])
mini = min(t)
maxi = max(t)
r = maxi - mini
norm_t = (t-mini)/r
# Giving:
>>> norm_t
array([ 0. , 0. , 0. , 0. , 0.33333333,
0.5 , 0.66666667, 1. , 1. , 1. , 1. ])
IGES definition:
S 1
,,11Hspline_test,13Hsome_path.igs,19HSpline to iges v1.0,4H 0.1,,,,,,, G 1
1.0, 2,2HMM,,,8H 8:58:19,,,,; G 2
126 1 1 1 0 0 0D 1
126 27 4 0 Spline1 1D 2
126,6,3,0,0,1,0,0.0,0.0,0.0,0.0,0.33333,0.5,0.6666666,1.0,1.0,1.0,1.0, 1P 1
1.0,1.0,1.0,1.0,1.0,1.0,1.0,-1.0,-1.0,0.0,-0.7777,-0.33333,0.0, 1P 2
-0.444444,0.0,0.0,0.0,0.0,0.0,0.4444444,0.0,0.0,0.777777777,0.33333, 1P 3
0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0; 1P 4
S 1G 2D 2P 4 T 1
On the off chance this niche query helps anyone else: it turns out the problem was incorrect formatting of the parameter data section in the IGES file. The data describing the spline can't take up more than 64 characters per line. The interpretation of the splprep output was correct: the (c_x, c_y) arrays describe the (x, y) coordinates of successive poles, and the equivalent NURBS definition just requires all weights to be set to 1.

Can Python optimize my function inputs to get a target value?

I have been trying to locate a method similar to Excel's Solver where I can target a specific value for a function to converge on. I do not want a minimum or maximum optimization.
For example, if my function is:
f(x) = A^2 + cos(B) - sqrt(C)
I want f(x) = 1.86; is there a Python method that can iterate a solution for A, B, and C to get as close to 1.86 as possible (given an acceptable error from the target value)?
You need a root-finding algorithm for your problem; only a small transformation is required. Find the roots of g(x):
g(x) = A^2 + cos(B) - sqrt(C) - 1.86
Use scipy.optimize.root (refer to the documentation):
import numpy as np
from scipy import optimize

# extra two 0's as dummy equations as root solves a system of equations
# rather than single multivariate equation
def func(x):  # A, B, C represented by x ndarray
    return [np.square(x[0]) + np.cos(x[1]) - np.sqrt(x[2]) - 1.86, 0, 0]

result = optimize.root(func, x0=[0.1, 0.1, 0.1])
x = result.x
A, B, C = x
x
# array([ 1.09328544, -0.37977694,  0.06970678])
You can now check your solution:
np.square(x[0]) + np.cos(x[1]) - np.sqrt(x[2])
# 1.8600000000000005
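As an alternative to padding the system with dummy equations, here is a sketch using scipy.optimize.least_squares, which accepts a residual vector with fewer entries than unknowns (this is my own variation on the same idea, not part of the answer above):
import numpy as np
from scipy.optimize import least_squares

def residual(x):
    A, B, C = x
    # single residual: distance from the 1.86 target
    return [A**2 + np.cos(B) - np.sqrt(C) - 1.86]

sol = least_squares(residual, x0=[0.1, 0.1, 0.1])
print(sol.x, residual(sol.x))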

custom bounds constraint in basinhopping optimization

I'm trying to maximize a sum of two monotonically increasing functions f(x) = f1(x) + f2(x) within the given bounds, say x = 0 to 6. The curves of the two functions are:
To solve this I'm using basinhopping function from scipy package.
I would like to specify a constraint in addition to the bounds. Specifically, I want the sum of the variables to be less than or equal to a constant value, i.e. in my implementation below I want x[0] + x[1] <= C where C = 6.
In the above figure, for C = 6, approximately x[0] = 2 and x[1] = 4 (2 + 4 <= 6) will yield the maximum value. My question is how to specify this constraint. If it's not possible, is there another optimization function that is better suited to this problem?
from scipy.optimize import basinhopping
from math import tanh

def f(x):
    return -(f1(x[0]) + f2(x[1]))  # -ve sign for maximization

def f1(x):
    return tanh(x)

def f2(x):
    return (10 ** (0.2 * x))

# Starting point
x0 = [1., 1.]

# Bounds
xmin = [1., 1.]
xmax = [6., 6.]

# rewrite the bounds in the way required by L-BFGS-B
bounds = [(low, high) for low, high in zip(xmin, xmax)]
minimizer_kwargs = dict(method="L-BFGS-B", bounds=bounds)

result = basinhopping(f,
                      x0,
                      minimizer_kwargs=minimizer_kwargs,
                      niter=200)
print("global max: x = [%.4f, %.4f], f(x0) = %.4f" % (result.x[0], result.x[1], result.fun))
This is possible with COBYLA constrained optimization, though I don't know how to do it with L-BFGS-B. You can add constraints like the following:
def constraint1(x):
    return 6 - x[0]

def constraint2(x):
    return 6 - x[0] - x[1]

def constraint3(x):
    return x[0] - 1
To add these constraints to the minimizer, use:
c1={"type":"ineq","fun":constraint1}
c2={"type":"ineq","fun":constraint2}
c3={"type":"ineq","fun":constraint3}
minimizer_kwargs = dict(method="COBYLA",constraints=(c1,c2,c3))
When I run this example, I get the result global max: x = [1.0000, 5.0000], f(x0) = -10.7616
Note, I haven't tested whether you need to add a constraint4 that x[1]>1.
For more reading, see the documentation for cobyla:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_cobyla.html
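Putting the pieces together, a self-contained sketch of the full call (my own assembly of the snippets above; the extra x[1] >= 1 constraint is the untested one mentioned in the note):
from math import tanh
from scipy.optimize import basinhopping

def f(x):
    return -(tanh(x[0]) + 10 ** (0.2 * x[1]))  # -ve sign for maximization

c1 = {"type": "ineq", "fun": lambda x: 6 - x[0]}          # x[0] <= 6
c2 = {"type": "ineq", "fun": lambda x: 6 - x[0] - x[1]}   # x[0] + x[1] <= 6
c3 = {"type": "ineq", "fun": lambda x: x[0] - 1}          # x[0] >= 1
c4 = {"type": "ineq", "fun": lambda x: x[1] - 1}          # x[1] >= 1 (optional, see above)

minimizer_kwargs = dict(method="COBYLA", constraints=(c1, c2, c3, c4))
result = basinhopping(f, [1., 1.], minimizer_kwargs=minimizer_kwargs, niter=200)
print("x = [%.4f, %.4f], max = %.4f" % (result.x[0], result.x[1], -result.fun))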

Optimization with Python (scipy.optimize)

I am trying to maximize the following function using Python's scipy.optimize. However, after lots of trying, it doesn't seem to work. The function and my code are pasted below. Thanks for helping!
Problem
Maximize [sum (x_i / y_i)**gamma]**(1/gamma)
subject to the constraint sum x_i = 1; x_i is in the interval (0,1).
x is a vector of choice variables; y is a vector of parameters; gamma is a parameter. The xs must sum to one. And each x must be in the interval (0,1).
Code
import math
import numpy as np
from scipy.optimize import minimize

def objective_function(x, y):
    sum_contributions = 0
    gamma = 0.2
    for count in range(len(x)):
        sum_contributions += (x[count] / y[count]) ** gamma
    value = math.pow(sum_contributions, 1 / gamma)
    return -value

cons = ({'type': 'eq', 'fun': lambda x: np.array([sum(x) - 1])})
y = [0.5, 0.3, 0.2]
initial_x = [0.2, 0.3, 0.5]
opt = minimize(objective_function, initial_x, args=(y,), method='SLSQP',
               constraints=cons, bounds=[(0, 1)] * len(initial_x))
Sometimes a numerical optimizer doesn't work for whatever reason. We can parametrize the problem slightly differently and it will just work (and might even work faster).
For example, for bounds of (0, 1), we can use a transform function such that values in (-inf, +inf), after being transformed, end up in (0, 1).
We can do a similar trick with the equality constraint. For example, we can reduce the dimension from 3 to 2, since the last element of x has to be 1 minus the sum of the others.
If it still won't work, we can switch to an optimizer that does not require derivative information, such as Nelder-Mead.
And there is also the Lagrange multiplier approach.
In [111]:
def trans_x(x):
    x1 = x**2 / (1 + x**2)
    z = np.hstack((x1, 1 - sum(x1)))
    return z

def F(x, y, gamma=0.2):
    z = trans_x(x)
    return -(((z/y)**gamma).sum())**(1./gamma)
In [112]:
opt = minimize(F, np.array([0., 1.]), args=(np.array(y),),
               method='Nelder-Mead')
opt
Out[112]:
status: 0
nfev: 96
success: True
fun: -265.27701747828007
x: array([ 0.6463264, 0.7094782])
message: 'Optimization terminated successfully.'
nit: 52
The result is:
In [113]:
trans_x(opt.x)
Out[113]:
array([ 0.29465097, 0.33482303, 0.37052601])
And we can visualize it, with:
In [114]:
x1 = np.linspace(0,1)
y1 = np.linspace(0,1)
X,Y = np.meshgrid(x1,y1)
Z = np.array([F(item, y) for item
in np.vstack((X.ravel(), Y.ravel())).T]).reshape((len(x1), -1), order='F')
Z = np.fliplr(Z)
Z = np.flipud(Z)
plt.contourf(X, Y, Z, 50)
plt.colorbar()
Even though this question is a bit dated, I wanted to add an alternative solution which might be useful for others stumbling upon it in the future.
It turns out your problem is solvable analytically. You can start by writing down the Lagrangian of the (equality-constrained) optimization problem:
L = \sum_i (x_i/y_i)^\gamma - \lambda (\sum_i x_i - 1)
The optimal solution is found by setting the first derivative of this Lagrangian to zero:
0 = \partial L / \partial x_i = \gamma x_i^{\gamma-1}/y_i^\gamma - \lambda
=> x_i \propto y_i^{\gamma/(\gamma - 1)}
Using this insight the optimization problem can be solved simply and efficiently by:
In [4]:
def analytical(y, gamma=0.2):
    y = np.asarray(y, dtype=float)
    x = y**(gamma / (gamma - 1.0))
    x /= np.sum(x)
    return x

xanalytical = analytical(y)
xanalytical, objective_function(xanalytical, y)
Out [4]:
(array([ 0.29466774,  0.33480719,  0.37052507]), -265.27701765929692)
CT Zhu's solution is elegant but it might violate the positivity constraint on the third coordinate. For gamma = 0.2 this does not seem to be a problem in practice, but for different gammas you easily run into trouble:
In [5]:
y = [0.2, 0.1, 0.8]
opt = minimize(F, np.array([0., 1.]), args=(np.array(y), 2.0),
               method='Nelder-Mead')
trans_x(opt.x), opt.fun
Out [5]:
(array([ 1., 1., -1.]), -11.249999999999998)
For other optimization problems with the same probability simplex constraints as your problem, but for which there is no analytical solution, it might be worth looking into projected gradient methods or similar. These methods leverage the fact that there is a fast algorithm for the projection of an arbitrary point onto this set; see https://en.wikipedia.org/wiki/Simplex#Projection_onto_the_standard_simplex.
(To see the complete code and a better rendering of the equations take a look at the Jupyter notebook http://nbviewer.jupyter.org/github/andim/pysnippets/blob/master/optimization-simplex-constraints.ipynb)
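For illustration, a short sketch of that Euclidean projection onto the standard simplex (the usual sort-and-threshold algorithm; this block is my addition and is not taken from the linked notebook):
import numpy as np

def project_to_simplex(v):
    # Euclidean projection of v onto {x : x_i >= 0, sum(x) = 1}
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > (css - 1))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

# e.g. project_to_simplex(np.array([0.4, 0.3, 0.6])) -> array([0.3, 0.2, 0.5])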

Minimizing a multivariable function with scipy. Derivative not known

I have a function which is actually a call to another program (some Fortran code). When I call this function (run_moog) I can pass 4 variables, and it returns 6 values. These values should all be close to 0 (in order to minimize). However, I combined them like this: np.sum(results**2). Now I have a scalar function. I would like to minimize this function, i.e. get np.sum(results**2) as close to zero as possible.
Note: When this function (run_moog) takes the 4 input parameters, it creates an input file for the Fortran code that depends on these parameters.
I have tried several ways to optimize this from the scipy docs. But none works as expected. The minimization should be able to have bounds on the 4 variables. Here is an attempt:
from scipy.optimize import minimize # Tried others as well from the docs
x0 = 4435, 3.54, 0.13, 2.4
bounds = [(4000, 6000), (3.00, 4.50), (-0.1, 0.1), (0.0, None)]
a = minimize(fun_mmog, x0, bounds=bounds, method='L-BFGS-B') # I've tried several different methods here
print a
This then gives me
status: 0
success: True
nfev: 5
fun: 2.3194639999999964
x: array([ 4.43500000e+03, 3.54000000e+00, 1.00000000e-01,
2.40000000e+00])
message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
jac: array([ 0., 0., -54090399.99999981, 0.])
nit: 0
The third parameter changes slightly, while the others are exactly the same. Also there have been 5 function calls (nfev) but no iterations (nit). The output from scipy is shown here.
Couple of possibilities:
Try COBYLA. It should be derivative-free, and supports inequality constraints.
You can't use different epsilons via the normal interface, so try scaling your first variable by 1e4. (Divide it going in, multiply coming back out; see the small wrapper sketch after the jacobian code below.)
Skip the normal automatic jacobian constructor, and make your own:
Say you're trying to use SLSQP, and you don't provide a jacobian function. It makes one for you. The code for it is in approx_jacobian in slsqp.py. Here's a condensed version:
def approx_jacobian(x, func, epsilon, *args):
    x0 = asfarray(x)
    f0 = atleast_1d(func(*((x0,)+args)))
    jac = zeros([len(x0), len(f0)])
    dx = zeros(len(x0))
    for i in range(len(x0)):
        dx[i] = epsilon
        jac[i] = (func(*((x0+dx,)+args)) - f0)/epsilon
        dx[i] = 0.0
    return jac.transpose()
You could try replacing that loop with:
for (i, e) in zip(range(len(x0)), epsilon):
    dx[i] = e
    jac[i] = (func(*((x0+dx,)+args)) - f0)/e
    dx[i] = 0.0
You can't provide this as the jacobian to minimize, but fixing it up for that is straightforward:
def construct_jacobian(func, epsilon):
    def jac(x, *args):
        x0 = asfarray(x)
        f0 = atleast_1d(func(*((x0,)+args)))
        jac = zeros([len(x0), len(f0)])
        dx = zeros(len(x0))
        # use a per-variable epsilon, as in the modified loop above
        for (i, e) in zip(range(len(x0)), epsilon):
            dx[i] = e
            jac[i] = (func(*((x0+dx,)+args)) - f0)/e
            dx[i] = 0.0
        return jac.transpose()
    return jac
You can then call minimize like:
minimize(fun_mmog, x0,
         jac=construct_jacobian(fun_mmog, [1e0, 1e-4, 1e-4, 1e-4]),
         bounds=bounds, method='SLSQP')
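A minimal sketch of the first-variable scaling mentioned above (my own illustration; fun_mmog, minimize and the original bounds are assumed from the question, and 1e4 is the suggested factor):
def scaled_fun(p):
    # the optimizer sees the first variable divided by 1e4;
    # scale it back up before calling the real objective
    return fun_mmog([p[0] * 1e4, p[1], p[2], p[3]])

x0_scaled = [4435 / 1e4, 3.54, 0.13, 2.4]
bounds_scaled = [(0.4, 0.6), (3.00, 4.50), (-0.1, 0.1), (0.0, None)]
a = minimize(scaled_fun, x0_scaled, bounds=bounds_scaled, method='L-BFGS-B')
best_x = [a.x[0] * 1e4, a.x[1], a.x[2], a.x[3]]  # rescale the solution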
It sounds like your target function doesn't have well-behaved derivatives. The line jac: array([ 0., 0., -54090399.99999981, 0.]) in the output means that changing only the third variable value is significant. And because the derivative w.r.t. this variable is virtually infinite, there is probably something wrong in the function. That is also why the third variable value ends up at its maximum.
I would suggest that you take a look at the derivatives, at least at a few points in your parameter space. Compute them using finite differences and the default step size of SciPy's fmin_l_bfgs_b, 1e-8. Here is an example of how you could compute the derivatives.
Try also plotting your target function. For instance, keep two of the parameters constant and let the other two vary. If the function has multiple local optima, you shouldn't use gradient-based methods like BFGS.
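For instance, a small finite-difference sketch along those lines (my own illustration; the 1e-8 step is the default mentioned above):
import numpy as np

def finite_diff_gradient(func, x, eps=1e-8):
    # forward-difference estimate of the gradient, one component at a time
    x = np.asarray(x, dtype=float)
    f0 = func(x)
    grad = np.zeros_like(x)
    for i in range(len(x)):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (func(x + step) - f0) / eps
    return grad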
How difficult is it to get an analytical expression for the gradient? If you have that you can then approximate the product of Hessian with a vector using finite difference. Then you can use other optimization routines available.
Among the various optimization routines available in SciPy, the one called TNC (Newton Conjugate Gradient with Truncation) is quite robust to the numerical values associated with the problem.
The Nelder-Mead Simplex Method (suggested by Cristián Antuña in the comments above) is well known to be a good choice for optimizing (possibly ill-behaved) functions with no knowledge of derivatives (see Numerical Recipes in C, Chapter 10).
There are two somewhat specific aspects to your question. The first is the constraints on the inputs, and the second is a scaling problem. The following suggests solutions to these points, but you might need to manually iterate between them a few times until things work.
Input Constraints
Assuming your input constraints form a convex region (as your examples above indicate, but I'd like to generalize it a bit), you can write a function
def is_in_bounds(p):
    # Return True if p is in the bounds
Using this function, assume that the algorithm wants to move from point from_ to point to, where from_ is known to be in the region. Then the following function will efficiently find the furthest point on the line between the two points to which it can proceed:
from numpy.linalg import norm

def progress_within_bounds(from_, to, eps):
    """
    from_ -- source (in region)
    to -- target point
    eps -- Euclidean precision along the line
    """
    if norm(to - from_) < eps:
        return from_
    mid = (from_ + to) / 2
    if is_in_bounds(mid):
        return progress_within_bounds(mid, to, eps)
    return progress_within_bounds(from_, mid, eps)
(Note that this function can be optimized for some regions, but it's hardly worth the bother, as it doesn't even call your original objective function, which is the expensive one.)
One of the nice aspects of Nelder-Mead is that it performs a series of steps which are very intuitive. Some of these steps can obviously take you out of the region, but it's easy to modify this. Here is an implementation of Nelder-Mead with the modifications marked between pairs of lines of the form ##################################################################:
import copy
'''
Pure Python/Numpy implementation of the Nelder-Mead algorithm.
Reference: https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
'''
def nelder_mead(f, x_start,
                step=0.1, no_improve_thr=10e-6, no_improv_break=10, max_iter=0,
                alpha=1., gamma=2., rho=-0.5, sigma=0.5):
    '''
    #param f (function): function to optimize, must return a scalar score
        and operate over a numpy array of the same dimensions as x_start
    #param x_start (numpy array): initial position
    #param step (float): look-around radius in initial step
    #no_improv_thr, no_improv_break (float, int): break after no_improv_break iterations with
        an improvement lower than no_improv_thr
    #max_iter (int): always break after this number of iterations.
        Set it to 0 to loop indefinitely.
    #alpha, gamma, rho, sigma (floats): parameters of the algorithm
        (see Wikipedia page for reference)
    '''

    # init
    dim = len(x_start)
    prev_best = f(x_start)
    no_improv = 0
    res = [[x_start, prev_best]]

    for i in range(dim):
        x = copy.copy(x_start)
        x[i] = x[i] + step
        score = f(x)
        res.append([x, score])

    # simplex iter
    iters = 0
    while 1:
        # order
        res.sort(key=lambda x: x[1])
        best = res[0][1]

        # break after max_iter
        if max_iter and iters >= max_iter:
            return res[0]
        iters += 1

        # break after no_improv_break iterations with no improvement
        print '...best so far:', best

        if best < prev_best - no_improve_thr:
            no_improv = 0
            prev_best = best
        else:
            no_improv += 1

        if no_improv >= no_improv_break:
            return res[0]

        # centroid
        x0 = [0.] * dim
        for tup in res[:-1]:
            for i, c in enumerate(tup[0]):
                x0[i] += c / (len(res)-1)

        # reflection
        xr = x0 + alpha*(x0 - res[-1][0])
        ##################################################################
        ##################################################################
        xr = progress_within_bounds(x0, x0 + alpha*(x0 - res[-1][0]), prog_eps)
        ##################################################################
        ##################################################################
        rscore = f(xr)
        if res[0][1] <= rscore < res[-2][1]:
            del res[-1]
            res.append([xr, rscore])
            continue

        # expansion
        if rscore < res[0][1]:
            xe = x0 + gamma*(x0 - res[-1][0])
            ##################################################################
            ##################################################################
            xe = progress_within_bounds(x0, x0 + gamma*(x0 - res[-1][0]), prog_eps)
            ##################################################################
            ##################################################################
            escore = f(xe)
            if escore < rscore:
                del res[-1]
                res.append([xe, escore])
                continue
            else:
                del res[-1]
                res.append([xr, rscore])
                continue

        # contraction
        xc = x0 + rho*(x0 - res[-1][0])
        ##################################################################
        ##################################################################
        xc = progress_within_bounds(x0, x0 + rho*(x0 - res[-1][0]), prog_eps)
        ##################################################################
        ##################################################################
        cscore = f(xc)
        if cscore < res[-1][1]:
            del res[-1]
            res.append([xc, cscore])
            continue

        # reduction
        x1 = res[0][0]
        nres = []
        for tup in res:
            redx = x1 + sigma*(tup[0] - x1)
            score = f(redx)
            nres.append([redx, score])
        res = nres
Note: this implementation is GPL, which is either fine for you or not. It's extremely easy to modify NM from any pseudocode, though, and you might want to throw in simulated annealing in any case.
Scaling
This is a trickier problem, but jasaarim has made an interesting point regarding that. Once the modified NM algorithm has found a point, you might want to run matplotlib.contour while fixing a few dimensions, in order to see how the function behaves. At this point, you might want to rescale one or more of the dimensions, and rerun the modified NM.
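A sketch of such a contour plot over two of the four parameters, with the other two held fixed (my own illustration; fun_mmog and the fixed values are assumptions based on the question):
import numpy as np
import matplotlib.pyplot as plt

t_vals = np.linspace(4000, 6000, 50)   # first parameter range
g_vals = np.linspace(3.0, 4.5, 50)     # second parameter range
T, G = np.meshgrid(t_vals, g_vals)
# hold the last two parameters fixed at their starting values
Z = np.array([[fun_mmog([t, g, 0.13, 2.4]) for t in t_vals] for g in g_vals])
plt.contourf(T, G, Z, 30)
plt.colorbar()
plt.xlabel('parameter 1')
plt.ylabel('parameter 2')
plt.show()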