Can scipy.optimize minimize functions of complex variables at all and how? - python

I am trying to minimize a function of a complex (vector) variable using scipy.optimize. My results so far indicate that it may not be possible. To investigate the problem, I have implemented a simple example - minimize the 2-norm of a complex vector with an offset:
import numpy as np
from scipy.optimize import fmin

def fun(x):
    return np.linalg.norm(x - 1j * np.ones(2), 2)

sol = fmin(fun, x0=np.ones(2) + 0j)
The output is
Optimization terminated successfully.
Current function value: 2.000000
Iterations: 38
Function evaluations: 69
>>> sol
array([-2.10235293e-05, 2.54845649e-05])
Clearly, the solution should be
array([0.+1.j, 0.+1.j])
Disappointed with this outcome, I have also tried scipy.optimize.minimize:
from scipy.optimize import minimize

def fun(x):
    return np.linalg.norm(x - 1j * np.ones(2), 1)

sol = minimize(fun, x0=np.ones(2) + 0j)
The output is
>>> sol
fun: 2.0
hess_inv: array([[ 9.99997339e-01, -2.66135332e-06],
[-2.66135332e-06, 9.99997339e-01]])
jac: array([0., 0.])
message: 'Optimization terminated successfully.'
nfev: 24
nit: 5
njev: 6
status: 0
success: True
x: array([6.18479071e-09+0.j, 6.18479071e-09+0.j])
Not good either. I have tried specifying all of the possible methods for minimize (supplying the Jacobian and Hessian as necessary), but none of them reach the correct result. Most of them cause ComplexWarning: Casting complex values to real discards the imaginary part, indicating that they cannot handle complex numbers correctly.
Is this possible at all using scipy.optimize?
If so, I would very much appreciate if someone can tell me what I am doing wrong.
If not, do you perhaps have suggestions for alternative optimization tools (for Python) that allow this?

The minimization methods of SciPy work with real arguments only. But minimization on the complex space C^n amounts to minimization on R^2n; the algebra of complex numbers never enters the consideration. Thus, by adding two wrappers that convert from C^n to R^2n and back, you can optimize over complex numbers.
def real_to_complex(z):    # real vector of length 2n -> complex vector of length n
    return z[:len(z)//2] + 1j * z[len(z)//2:]

def complex_to_real(z):    # complex vector of length n -> real vector of length 2n
    return np.concatenate((np.real(z), np.imag(z)))

sol = minimize(lambda z: fun(real_to_complex(z)), x0=complex_to_real(np.ones(2) + 0j))
print(real_to_complex(sol.x))    # [-7.40376620e-09+1.j  -8.77719406e-09+1.j]
You mention the Jacobian and Hessian... but minimization only makes sense for real-valued functions, and a non-constant real-valued function is never differentiable with respect to a complex variable. The Jacobian and Hessian would have to be computed over R^2n anyway, treating the real and imaginary parts as separate variables.
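If you do want to supply derivatives, here is a minimal sketch of that idea for the example above, reusing the two conversion helpers and switching to the squared 2-norm so the objective is smooth at the minimum (fun_r and grad_r are just illustrative names):
import numpy as np
from scipy.optimize import minimize

target = 1j * np.ones(2)

def fun_r(r):                        # objective on R^2n: |z - target|^2
    d = real_to_complex(r) - target
    return np.real(np.vdot(d, d))

def grad_r(r):                       # gradient with respect to (Re z, Im z)
    d = real_to_complex(r) - target
    return 2 * complex_to_real(d)

sol = minimize(fun_r, x0=complex_to_real(np.ones(2) + 0j), jac=grad_r)
print(real_to_complex(sol.x))        # expected to be close to [0.+1.j, 0.+1.j]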

I have needed to minimize the departure of a complex-valued model function, which depends on complex-valued parameters, from data over a real domain.
A toy example:
def f(x, a, b):
    ab = complex(a, b)
    return np.exp(x * ab)
And suppose that I have data DATA for x = np.arange(N). Note that x is real.
What I did was this:
def helper(x, a, b):
    return abs(f(x, a, b) - DATA[x])
and then I can use curve_fit():
curve_fit(helper, np.arange(N), np.zeros(N), p0 = [1,0])
What is happening is this: By subtracting the data from the model function, the new "ideal" output is all zeroes, which can be (must be) real in order for curve_fit() to work. The complex parameter ab = a + jb has been broken into its real and imaginary parts. The helper() function returns the absolute value of the difference between the model and the data.
A critical issue is that curve_fit() doesn't evaluate any other x values than those you give it. Otherwise DATA[x] would fail.
Note that because curve_fit() itself squares the residuals internally, returning abs() here amounts to an ordinary L2 (least-squares) fit on the complex residuals; returning abs()**2 would weight large errors even more heavily. Which weighting to prefer is a topic for another day.
You might object: what if the x[] values aren't integers (which my code requires) but arbitrary reals? That's doable, simply by putting the data into an array and indexing it by position. There's probably some clever hack using a dictionary that would address this issue, too.
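Here is a self-contained version of this idea with synthetic data; N, a_true, and b_true below are made up for illustration, and the starting point is chosen reasonably close to the true values:
import numpy as np
from scipy.optimize import curve_fit

N = 20
a_true, b_true = -0.05, 0.3
DATA = np.exp(np.arange(N) * complex(a_true, b_true))   # synthetic "measurements"

def f(x, a, b):
    return np.exp(x * complex(a, b))

def helper(x, a, b):
    # curve_fit passes x as floats, so round back to integer indices
    idx = np.rint(x).astype(int)
    return abs(f(idx, a, b) - DATA[idx])

popt, pcov = curve_fit(helper, np.arange(N), np.zeros(N), p0=[-0.1, 0.25])
print(popt)   # expected to be close to (a_true, b_true) for this starting point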

Related

What is the difference between scipy.optimize's 'root' and 'fixed_point' methods

There are two methods in scipy.optimize which are root and fixed_point.
I am very surprised to find that root offers many methods, whereas fixed_point has just one. Mathematically the two are equivalent: x is a fixed point of g exactly when it is a root of f(x) = g(x) - x.
How do I determine which function to use?
Also, neither of the two methods allows me to specify the regions where the functions are defined. Is there a way to limit the range of x?
Summary: if you don't know what to use, use root. The method fixed_point merits consideration if your problem is naturally a fixed-point problem g(x) = x where it's reasonable to expect that iterating g will help in solving the problem (i.e., g has some non-expanding behavior). Otherwise, use root or something else.
Although every root-finding problem is mathematically equivalent to a fixed-point problem, it's not always beneficial to restate it as such from the numerical methods point of view. Sometimes it is, as in Newton's method. But the trivial restatement, replacing f(x) = 0 as g(x) = x with g(x) = f(x) + x is not likely to help.
The method fixed_point iterates the provided function, optionally with adjustments that make convergence faster / more likely. This is going to be problematic if the iterated values move away from the fixed point (a repelling fixed point), which can happen despite the adjustments. An example: solving exp(x) = 1 directly and as a fixed point problem for exp(x) - 1 + x, with the same starting point:
import numpy as np
from scipy.optimize import fixed_point, root
root(lambda x: np.exp(x) - 1, 3) # converges to 0 in 14 steps
fixed_point(lambda x: np.exp(x) - 1 + x, 3) # RuntimeError: Failed to converge after 500 iterations, value is 2.9999533400931266
To directly answer the question: the difference is in the methods being used. The fixed-point solver is quite simple: it iterates the given function, boosted by some acceleration of convergence. When that doesn't work (and often it doesn't), too bad. The root-finding methods are more sophisticated and more robust; they should be preferred.
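For contrast, here is a case where the iteration is naturally contracting and fixed_point converges without trouble:
import numpy as np
from scipy.optimize import fixed_point

# x = cos(x) has an attracting fixed point; the plain iteration contracts towards it
print(fixed_point(np.cos, 1.0))   # approximately 0.739085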

Sympy function derivatives and sets of equations

I'm working with nonlinear systems of equations; these generally take the form of a nonlinear vector differential equation. I now want to work with functions, differentiate them with respect to time and to their time derivatives, and find equilibrium points by solving the nonlinear equations 0 = rhs(eqs).
Similar things are needed to calculate the Euler-Lagrange equations, where you need the derivative of L with respect to diff(x, t).
Now my question is, how do I implement this in Sympy?
My two main problems are these. First, differentiating a Symbol f with respect to t via diff(f, t) gives 0. I can see that with
x = Symbol('x', real=True)
diff(x.subs(x, x(t)), t)   # because diff(x, t) => 0
and
diff(x**2, x)
things kind of work.
However, with
x = Function('x')(t)
diff(x, t)
the time derivative works, but then I cannot differentiate with respect to the function x itself:
diff(x**2, x)   # does not work
Since I need these operations all the time, not only for scalars but also for vectors (using jacobian), I really want a clean and functional workflow.
As which type should I create my mathematical functions in SymPy in order to avoid strange substitutions?
It only gets worse for matrices, where I cannot get
eqns = Matrix([f1 - 5, f2 + 1])
variabs = Matrix([f1, f2])
nonlinsolve(eqns, variabs)
to work as expected, since it only accepts symbols as input. Is there an easy conversion here? Something like eqns.tolist(), which doesn't work either?
EDIT:
I just found this question, which was answered in terms of expressions and matrices. I want to be able to solve sets of nonlinear equations, build the Jacobian of a vector with respect to another vector, and differentiate with respect to functions as stated above. Can anyone point me in a direction to start a concise workflow for this purpose? I guess the most complex task is calculating the Lie derivative with respect to a vector or list of functions; the rest should be straightforward.
Edit 2:
def substi(expr, variables):
    return expr.subs({w: Function(w.name)(t) for w in variables})
would automate the substitution, so that substi(vector_expr, varlist_vector).diff(t) is not all zeros.
Yes, one has to insert an argument in a function before taking its derivative. But after that, differentiation with respect to x(t) works for me in SymPy 1.1.1, and I can also differentiate with respect to its derivative. Example of Euler-Lagrange equation derivation:
t = Symbol("t")
x = Function("x")(t)
L = x**2 + diff(x, t)**2 # Lagrangian
EL = -diff(diff(L, diff(x, t)), t) + diff(L, x)
Now EL is 2*x(t) - 2*Derivative(x(t), t, t) as expected.
That said, there is a built-in method for Euler-Lagrange:
from sympy.calculus.euler import euler_equations
EL = euler_equations(L)
would yield the same result, except presented as a differential equation with right-hand side 0: [Eq(2*x(t) - 2*Derivative(x(t), t, t), 0)]
The following defines x to be a function of t
import sympy as s
t = s.Symbol('t')
x = s.Function('x')(t)
This should solve your problem of diff(x,t) being evaluated as 0. But I think you will still run into problems later on in your calculations.
I also work with calculus of variations and Euler-Lagrange equations. In these calculations, x' needs to be treated as independent of x. So, it is generally better to use two entirely different variables for x and x' so as not to confuse Sympy with the relationship between those two variables. After we are done with the calculations in Sympy and we go back to our pen and paper we can substitute x' for the second variable.
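A minimal sketch of that two-variable approach, using the same toy Lagrangian as above (xdot is an illustrative name; here the substitution back to x(t) is done in SymPy rather than on paper, so the final time derivative can be taken):
import sympy as sp

t = sp.Symbol('t')
x, xdot = sp.symbols('x xdot')      # treat x and x' as independent symbols

L = x**2 + xdot**2                  # the same Lagrangian as above
dL_dx = sp.diff(L, x)
dL_dxdot = sp.diff(L, xdot)

# substitute the time-dependent function back in before taking d/dt
xt = sp.Function('x')(t)
EL = -sp.diff(dL_dxdot.subs(xdot, sp.diff(xt, t)), t) + dL_dx.subs(x, xt)
print(EL)                           # 2*x(t) - 2*Derivative(x(t), (t, 2)), up to ordering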

Python: multivariate non-linear solver with constraints

Given a function f(x) that takes an input vector x and returns a vector of the same length, how can you find the roots of the function setting constraints on x? (E.g. a range for each component of x.)
To my surprise I could not find a lot of useful information about this. In the scipy list of Optimization and Root finding algorithms there seem to be some options for scalar functions such as brentq. I cannot find any algorithm that supports such an option for the multivariate case though.
Of course one could do a work-around like squaring each component of the returned vector and then using one of the minimizers such as differential_evolution (this is the only one, I think, actually). I cannot imagine that this is a good strategy though, since it kills the quadratic convergence of Newton's method. Also I find it really surprising that there does not seem to be an option for this, since it must be a really common problem. Have I missed something?
One (not particularly nice but hopefully working) option to work around this problem would be to give the solver a function that only has roots in the constrained region and that is continued in a way ensuring that the solver is pushed back in the proper region (a little bit like here but in multiple dimensions).
What one might do to achieve this (at least for rectangular constraints) is to implement a constrainedFunction that is linearly continued starting from the border value of your function:
import numpy as np

def constrainedFunction(x, f, lower, upper, minIncr=0.001):
    x = np.asarray(x)
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    xBorder = np.where(x < lower, lower, x)
    xBorder = np.where(x > upper, upper, xBorder)
    fBorder = f(xBorder)
    distFromBorder = (np.sum(np.where(x < lower, lower - x, 0.))
                      + np.sum(np.where(x > upper, x - upper, 0.)))
    return (fBorder
            + (fBorder + np.where(fBorder > 0, minIncr, -minIncr)) * distFromBorder)
You can pass this function an x value, the function f that you want to continue, as well as two arrays lower and upper of the same shape like x giving the lower and upper bounds in all dimensions. Now you can pass this function rather than your original function to the solver to find the roots.
The steepness of the continuation is simply taken as the border value at the moment, to prevent steep jumps for sign changes at the border. To prevent roots outside the constrained region, a small value is added/subtracted to positive/negative boundary values. I agree that this is not a very nice way to handle this, but it seems to work.
Here are two examples. For both the initial guess is outside the constrained region but a correct root in the constrained region is found.
Finding the roots of a multidimensional cosine constrained to [-2, -1] x [1, 2]:
from scipy import optimize as opt

opt.root(constrainedFunction, x0=np.zeros(2),
         args=(np.cos, np.asarray([-2., 1.]), np.asarray([-1., 2.])))
gives:
fjac: array([[ -9.99999975e-01, 2.22992740e-04],
[ 2.22992740e-04, 9.99999975e-01]])
fun: array([ 6.12323400e-17, 6.12323400e-17])
message: 'The solution converged.'
nfev: 11
qtf: array([ -2.50050470e-10, -1.98160617e-11])
r: array([-1.00281376, 0.03518108, -0.9971942 ])
status: 1
success: True
x: array([-1.57079633, 1.57079633])
This also works for functions that are not diagonal:
def f(x):
    return np.asarray([0., np.cos(x.sum())])

opt.root(constrainedFunction, x0=np.zeros(2),
         args=(f, np.asarray([-2., 2.]), np.asarray([-1., 4.])))
gives:
fjac: array([[ 0.00254922, 0.99999675],
[-0.99999675, 0.00254922]])
fun: array([ 0.00000000e+00, 6.12323400e-17])
message: 'The solution converged.'
nfev: 11
qtf: array([ 1.63189544e-11, 4.16007911e-14])
r: array([-0.75738638, -0.99212138, -0.00246647])
status: 1
success: True
x: array([-1.65863336, 3.22942968])
If you want to handle an optimization with constraints, you can use the facile library, which is a lot easier than scipy.optimize.
Here is the link to the package :
https://pypi.python.org/pypi/facile/1.2
Here's how to use the facile library for your example. You will need to refine what I write here, which is only general. If errors are raised, tell me which.
import facile

# Your vector x
x = [facile.variable('name', min, max) for i in range(Size)]

# An example where the components of x are ordered and each lies in a range
# (you could also set the range directly when declaring the variables)
for i in range(len(x) - 1):
    facile.constraint(x[i] < x[i + 1])
    facile.constraint(range[i, 0] < x[i] < range[i, 1])   # assuming a 'range' array storing the bounds of each variable

def function(x):
    # Define here the function you want to find roots of
    ...

# Add as a constraint that you want the vector to be a root of function
facile.constraint(function(x) == 0)

# Use the facile solver
if facile.solve(x):
    print([x[i].value() for i in range(len(x))])
else:
    print("Impossible to find roots")
At the risk of suggesting something you might've already crossed off, I believe this should be feasible with just scipy.optimize.minimize. The catch is that the function must have only one argument, but that argument can be a vector/list.
So f(x, y) becomes just f(z) where z = [x, y].
A good example that you might find useful, if you haven't come across it already, is here.
If you want to impose bounds, as you mentioned, for a 2x1 vector, you could use:
# Specify a (lower, upper) tuple for each component of the vector
bnds = [(0., 1.) for i in range(len(x))]
And use this as the bounds parameter within minimize.
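To connect this back to root finding with bounds, here is a minimal sketch of the squared-residual idea the question mentions, applied to the componentwise cosine example from the first answer (keep in mind the question's own caveat that this sacrifices the quadratic convergence of a true root finder):
import numpy as np
from scipy.optimize import minimize

def sum_of_squares(x):                    # zero exactly where cos(x) = 0 componentwise
    return np.sum(np.cos(x) ** 2)

bnds = [(-2., -1.), (1., 2.)]             # one (lower, upper) pair per component
res = minimize(sum_of_squares, x0=np.array([-1.2, 1.2]), bounds=bnds)
print(res.x)                              # expected near [-pi/2, pi/2]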

Finding complex roots from set of non-linear equations in python

I have been testing an algorithm that has been published in literature that involves solving a set of 'm' non-linear equations in both Matlab and Python. The set of non-linear equations involves input variables that contain complex numbers, and therefore the resulting solutions should also be complex. As of now, I have been able to get pretty good results in Matlab by using the following lines of code:
lambdas0 = ones(1,m)*1e-5;
options = optimset('Algorithm','levenberg-marquardt',...
    'MaxFunEvals',1000000,'MaxIter',10000,'TolX',1e-20,...
    'TolFun',1e-20);
Eq = @(lambda)maxentfun(lambda,m,h,g);
[lambdasf] = fsolve(Eq,lambdas0,options);
where h and g are a complex matrix and vector, respectively. The solution converges very well for a wide range of initial values.
I have been trying to mimic these results in Python with very little success, however. The numerical solvers seem to be set up much differently, and the Levenberg-Marquardt algorithm is available through the function root. In Python this algorithm cannot handle complex roots, and when I run the following lines:
lambdas0 = np.ones(m)*1e-5
sol = root(maxentfun, lambdas0, args = (m,h,g), method='lm', tol = 1e-20, options = {'maxiter':10000, 'xtol':1e-20})
lambdasf = sol.x
I get the following error:
minpack.error: Result from function call is not a proper array of floats.
I have tried using some of the other algorithms, such as 'broyden2' and 'anderson', but they are much inferior to Matlab and only give okay results after playing around with the initial conditions. The function fsolve cannot handle complex variables either.
I was wondering if there is something I am applying incorrectly, and if anybody has an idea on maybe how to properly solve complex non-linear equations in Python.
Thank you very much
When I encounter this type of problem I try to rewrite my function as an array of real and imaginary parts. For example, if f is your function which takes complex input array x (say x has size 2, for simplicity)
from numpy import *

def f(x):
    # takes a complex-valued vector of size 2 and outputs a complex-valued vector of size 2
    return [x[0] - 3*x[1] + 1j + 2, x[0] + x[1]]   # <-- for example

def real_f(x1):
    # converts a real-valued vector of size 4 to a complex-valued vector of size 2,
    # then outputs a real-valued vector of size 4
    x = [x1[0] + 1j*x1[1], x1[2] + 1j*x1[3]]
    actual_f = f(x)
    return [real(actual_f[0]), imag(actual_f[0]), real(actual_f[1]), imag(actual_f[1])]
The new function, real_f can be used in fsolve: the real and imaginary parts of the function are simultaneously solved for, treating the real and imaginary parts of the input argument as independent.
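A short usage sketch, given the definitions above (the starting point [1, 0, 1, 0] is arbitrary):
from scipy.optimize import fsolve

sol_real = fsolve(real_f, x0=[1., 0., 1., 0.])                      # four real unknowns
sol = [sol_real[0] + 1j*sol_real[1], sol_real[2] + 1j*sol_real[3]]  # back to two complex unknowns
print(sol)   # the residual f(sol) should be close to zero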
Here the append() and extend() methods can be used to make this automatic and easily extendable to N variables:
def real_eqns(y1):
    y = []
    for i in range(N):
        y.append(y1[2*i + 0] + 1j*y1[2*i + 1])
    real_eqns1 = eqns(y)
    out = []
    for i in range(N):
        out.extend([real_eqns1[i].real, real_eqns1[i].imag])
    return out

On ordinary differential equations (ODE) and optimization, in Python

I want to solve this kind of problem:
dy/dt = 0.01*y*(1-y), find t when y = 0.8 (0<t<3000)
I've tried the ode function in Python, but it can only calculate y when t is given.
So are there any simple ways to solve this problem in Python?
PS: This function is just a simple example. My real problem is so complex that it can't be solved analytically. So I want to know how to solve it numerically. And I think this problem is more like an optimization problem:
Objective function y(t) = 0.8, Subject to dy/dt = 0.01*y*(1-y), and 0<t<3000
PPS: My real problem is:
objective function: F(t) = 0.85,
subject to: F(t) = sqrt(x(t)^2+y(t)^2+z(t)^2),
x''(t) = (1/F(t)-1)*250*x(t),
y''(t) = (1/F(t)-1)*250*y(t),
z''(t) = (1/F(t)-1)*250*z(t)-10,
x(0) = 0, y(0) = 0, z(0) = 0.7,
x'(0) = 0.1, y'(0) = 1.5, z'(0) = 0,
0<t<5
This differential equation can be solved analytically quite easily:
dy/dt = 0.01 * y * (1-y)
rearrange to gather y and t terms on opposite sides
dt / 100 = 1/(y * (1-y)) dy
The lhs integrates trivially to t / 100; the rhs is slightly more complicated. We can always write a product of two simple quotients as a sum of the two quotients times some constants (partial fractions):
1/(y * (1-y)) = A/y + B/(1-y)
The values for A and B can be worked out by putting the rhs on the same denominator and comparing constant and first order y terms on both sides. In this case it is simple, A=B=1. Thus we have to integrate
1/y + 1/(1-y) dy
The first term integrates to ln(y); the second term can be integrated with a change of variables u = 1-y to -ln(1-y). Our integrated equation therefore looks like:
t / 100 + C = ln(y) - ln(1-y)
not forgetting the constant of integration (it is convenient to write it on the lhs here). We can combine the two logarithm terms:
t / 100 + C = ln( y / (1-y) )
In order to solve for t at a given value of y, we first need to work out the value of C, using the initial condition. (Note that if y starts at 0 or 1, dy/dt = 0 and y never changes, so assume it does not.) Plug in the values of y and t at the beginning:
0 / 100 + C = ln( y(0) / (1 - y(0)) )
This gives a value for C (assuming y(0) is not 0 or 1); then use y = 0.8 to get a value for t, namely t = 100 * ( ln(0.8 / (1 - 0.8)) - C ). Because of the logarithm, y reaches 0.8 well within the given range 0 < t < 3000 unless the initial value of y is extremely small. It is of course also straightforward to rearrange the equation above to express y in terms of t, so you can plot the function as well.
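For instance, assuming an initial value of y(0) = 0.5 (the question does not specify one), the crossing time follows directly:
import numpy as np

y0, y_target = 0.5, 0.8                                # y0 = 0.5 is an assumed initial value
C = np.log(y0 / (1 - y0))                              # here C = 0
t = 100 * (np.log(y_target / (1 - y_target)) - C)
print(t)                                               # about 138.6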
Edit: Numerical integration
For a more complexed ODE which cannot be solved analytically, you will have to try numerically. Initially we only know the value of the function at zero time y(0) (we have to know at least that in order to uniquely define the trajectory of the function), and how to evaluate the gradient. The idea of numerical integration is that we can use our knowledge of the gradient (which tells us how the function is changing) to work out what the value of the function will be in the vicinity of our starting point. The simplest way to do this is Euler integration:
y(dt) = y(0) + dy/dt * dt
Euler integration assumes that the gradient is constant between t=0 and t=dt. Once y(dt) is known, the gradient can be calculated there also and in turn used to calculate y(2 * dt) and so on, gradually building up the complete trajectory of the function. If you are looking for a particular target value, just wait until the trajectory goes past that value, then interpolate between the last two positions to get the precise t.
The problem with Euler integration (and with all other numerical integration methods) is that its results are only accurate when its assumptions are valid. Because the gradient is not constant between pairs of time points, a certain amount of error will arise for each integration step, which over time will build up until the answer is completely inaccurate. In order to improve the quality of the integration, it is necessary to use more sophisticated approximations to the gradient. Check out for example the Runge-Kutta methods, which are a family of integrators which remove progressive orders of error term at the cost of increased computation time. If your function is differentiable, knowing the second or even third derivatives can also be used to reduce the integration error.
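A bare-bones sketch of this scheme for the example ODE, with the linear interpolation at the end as described above (euler_until is just an illustrative name, and y(0) = 0.5 is an assumed initial value):
def euler_until(deriv, y0, target, dt=0.01, t_max=3000.0):
    # step y' = deriv(t, y) forward until y crosses the target value
    t, y = 0.0, y0
    while t < t_max:
        y_next = y + deriv(t, y) * dt
        if (y - target) * (y_next - target) <= 0:
            # crossed the target: interpolate linearly between the last two points
            return t + dt * (target - y) / (y_next - y)
        t, y = t + dt, y_next
    return None

print(euler_until(lambda t, y: 0.01 * y * (1 - y), 0.5, 0.8))   # about 138.6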
Fortunately of course, somebody else has done the hard work here, and you don't have to worry too much about solving problems like numerical stability or have an in depth understanding of all the details (although understanding roughly what is going on helps a lot). Check out http://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.ode.html#scipy.integrate.ode for an example of an integrator class which you should be able to use straightaway. For instance
from scipy.integrate import ode

def deriv(t, y):
    return 0.01 * y * (1 - y)

my_integrator = ode(deriv)
my_integrator.set_initial_value(0.5)

t = 0.1                     # start with a small value of time
while t < 3000:
    y = my_integrator.integrate(t)
    if y > 0.8:
        print("y(%f) = %f" % (t, y))
        break
    t += 0.1
This code will print out the first t value when y passes 0.8 (or nothing if it never reaches 0.8). If you want a more accurate value of t, keep the y of the previous t as well and interpolate between them.
As an addition to Krastanov's answer:
Aside from PyDSTool, there are other packages, like Pysundials and Assimulo, which provide bindings to the solver IDA from SUNDIALS. This solver has root-finding capabilities.
Use scipy.integrate.odeint to handle your integration, and analyse the results afterward.
import numpy as np
from scipy.integrate import odeint

ts = np.arange(0, 3000, 1)   # time series - start, stop, step

def rhs(y, t):
    return 0.01*y*(1 - y)

y0 = np.array([0.5])         # initial value; must lie strictly between 0 and 0.8 for the trajectory to reach 0.8
ys = odeint(rhs, y0, ts)
Then analyse the numpy array ys to find your answer (the dimensions of the array ts match those of ys). (This may not work first time because I am constructing from memory.)
This might involve using the scipy interpolate function for the ys array, such that you get a result at time t.
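For example, a minimal way to read the crossing time off the arrays above (assuming the trajectory actually crosses 0.8, which it does with the initial value used here):
y_flat = ys.ravel()
idx = np.argmax(y_flat >= 0.8)                                   # first index at or above 0.8
t_cross = np.interp(0.8, y_flat[idx-1:idx+1], ts[idx-1:idx+1])   # interpolate between the last two samples
print(t_cross)                                                   # about 138.6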
EDIT: I see that you wish to solve a spring in 3D. This should be fine with the above method; the odeint documentation on the SciPy website has examples for systems such as coupled springs, and these could be extended.
What you are asking for is an ODE integrator with root-finding capabilities. They exist, and the low-level code for such integrators is supplied with scipy, but it has not yet been wrapped in Python bindings.
For more information see this mailing list post that provides a few alternatives: http://mail.scipy.org/pipermail/scipy-user/2010-March/024890.html
You can use the following example implementation which uses backtracking (hence it is not optimal as it is a bolt-on addition to an integrator that does not have root finding on its own): https://github.com/scipy/scipy/pull/4904/files
