I have a system of 21 polynomial equations in a total of 12 unknowns a, ..., l. Each equation has the general form V1*abc + V2*abd + ... + V64*jkl = x, where V1, ..., V64 are each either 0 or 1, i.e., each equation contains on the left hand side the sum of some products of three different unknowns.
There is a set of constraints: a + b + c + d = 1, e + f + g + h = 1, i + j + k + l = 1. The sum of all xs (right-hand sides) is equal to 1.
I have as input a vector of xs. Is there a solver which could provide me with the values of a, ..., l that yield a vector of xs as close as possible to the original xs while adhering to the constraints? I'm looking for a python implementation.
I looked in scipy.optimize but I'm not able to establish which method is preferable for my problem.
You might want to try this python binding for IPOPT. IPOPT is an optimization library that uses an interior-point solver for finding (local) optima of functions with generalized constraints, both equality and inequality constraints. As you've described your problem, you won't care about the inequality constraints.
A candidate function for your optimization objective would be the sum of the squared differences for your 21 polynomial equations. Let's say you start with your initial x, which is a 21-element vector, then your objective would be:
(V1_1*abc + V2_1*abd + ... + V64_1*jkl - x_1)^2 + (V1_2*abc + V2_2*abd + ... + V64_2*jkl - x_2)^2 + ... + (V1_21*abc + V2_21*abd + ... + V64_21*jkl - x_21)^2
To use IPOPT, you will need to compute the partial derivatives of your constraints and objective with respect to all of your variables a through l.
If IPOPT won't work for you, you might even be able to use scipy.optimize with this objective function. From the docs, it looks like scipy.optimize will try to pick the method appropriate for your problem based upon how you define it; if you define your constraints and objective correctly, scipy.optimize should pick the correct method.
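If you go the scipy.optimize route, a minimal sketch with scipy.optimize.minimize and the SLSQP method (which handles equality constraints) could look as follows; the V matrix and xs vector here are random placeholders standing in for your real inputs:

import numpy as np
from itertools import product
from scipy.optimize import minimize

# Placeholders: V is the 21 x 64 matrix of 0/1 coefficients, xs the 21
# observed right-hand sides (random here, summing to 1).
rng = np.random.default_rng(0)
V = rng.integers(0, 2, size=(21, 64))
xs = rng.dirichlet(np.ones(21))

# The 64 triple products abc, abd, ..., jkl: one unknown from each group.
# u[0:4] = a..d, u[4:8] = e..h, u[8:12] = i..l
triples = list(product(range(0, 4), range(4, 8), range(8, 12)))

def objective(u):
    prods = np.array([u[i] * u[j] * u[k] for i, j, k in triples])
    r = V @ prods - xs          # residuals of the 21 equations
    return r @ r                # sum of squared differences

constraints = [
    {'type': 'eq', 'fun': lambda u: u[0:4].sum() - 1},
    {'type': 'eq', 'fun': lambda u: u[4:8].sum() - 1},
    {'type': 'eq', 'fun': lambda u: u[8:12].sum() - 1},
]

u0 = np.full(12, 0.25)          # feasible start: each group of four sums to 1
res = minimize(objective, u0, method='SLSQP', constraints=constraints)
print(res.x, res.fun)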
I need to find the minimum distance from a point (X, Y) to a curve defined by four coefficients C0, C1, C2, C3, like y = C0 + C1*X + C2*X^2 + C3*X^3.
I have used a numerical approach: np.linspace and np.polyval to generate discrete (X, Y) points for the curve, then Shapely's Point, MultiPoint and nearest_points to find the nearest points, and finally np.linalg.norm to find the distance.
This is a numerical approach by discretizing the curve.
My question is how can I find the distance by analytical methods and code it?
Problem definition
For the sake of simplicity let's use P for the point and Px and Py for the coordinates. Let's call the function f(x).
Another way to look at your problem: you are trying to find the x that minimizes the distance between P and the point (x, f(x)).
The problem can then be formulated as a minimization problem.
Find x that minimizes (x-Px)² + (f(x)-Py)²
(Note that we can drop the square root that should be there, because the square root is a monotonic function and doesn't change the optima. Some details here.)
Analytical solution
The fully analytical way to solve this would be a pen-and-paper approach. You can expand the expression and compute its derivative, then see where the derivative vanishes to locate the extrema. (This is a lengthy process to do analytically; Yves Daoust addresses it in his answer. Either do that or use a numerical solver for this part, for example a version of Newton's method.) Then check whether the extrema are maxima or minima by evaluating the function there and sampling a few points around each to see how the function is evolving. From this you can find the global minimum, and that gives you the x you're looking for. But developing this further is content probably better suited for a mathematics site.
Optimization approach
So instead I'm gonna suggest a solution that uses numerical minimization without a sampling approach. You can use the minimize function from scipy to solve the minimization problem.
from math import pow
from scipy.optimize import minimize
# Define function
C0 = -1
C1 = 5
C2 = -5
C3 = 6
f = lambda x: C0 + C1 * x + C2 * pow(x, 2) + C3 * pow(x, 3)
# Define function to minimize
p_x = 12
p_y = -7
min_f = lambda x: pow(x-p_x, 2) + pow(f(x) - p_y, 2)
# Minimize
min_res = minimize(min_f, 0) # starting point (for a non-convex objective like this its choice can matter)
# Show result
print("Closest point is x=", min_res.x[0], " y=", f(min_res.x[0]))
Here I used your function with dummy values but you could use any function you want with this approach.
You need to differentiate (x - X)² + (C0 + C1 x + C2 x² + C3 x³ - Y)² and find the roots. But this is a quintic polynomial (fifth degree) with general coefficients so the Abel-Ruffini theorem fully applies, meaning that there is no solution in radicals.
There is a known solution anyway, by reducing the equation (via a lengthy substitution process) to the form x^5 - x + t = 0 known as the Bring–Jerrard normal form, and getting the solutions (called ultraradicals) by means of the elliptic functions of Hermite or evaluation of the roots by Taylor.
Personal note:
This approach is virtually foolish, as there exist ready-made numerical polynomial root-finders, and the ultraradical function is awkward to evaluate.
Anyway, looking at the plot of x^5 - x, one can see that it is intersected once or three times by any horizontal line, and finding an interval with a change of sign is easy. With that, you can obtain an accurate root by dichotomy (and far from the extrema, Newton's method will converge easily).
After having found this root, you can deflate the polynomial to a quartic, for which explicit formulas by radicals are known.
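For illustration, here is a minimal sketch of that recommendation, building the quintic from the dummy coefficients and point of the earlier answer and handing it to numpy's polynomial root-finder (np.roots returns all five roots at once, so no manual deflation is needed):

import numpy as np

C0, C1, C2, C3 = -1, 5, -5, 6    # dummy curve from the earlier answer
X, Y = 12, -7                    # dummy point

f = np.array([C3, C2, C1, C0])   # f(x), highest-degree coefficient first
df = np.polyder(f)               # f'(x)
# half the derivative of (x - X)^2 + (f(x) - Y)^2: (x - X) + f'(x)*(f(x) - Y)
g = np.polyadd([1, -X], np.polymul(df, np.polyadd(f, [-Y])))

roots = np.roots(g)                            # the five roots of the quintic
real = roots[np.isclose(roots.imag, 0)].real   # keep only the real ones
d2 = (real - X)**2 + (np.polyval(f, real) - Y)**2
x_min = real[np.argmin(d2)]                    # critical point of least distance
print(x_min, np.polyval(f, x_min))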
I want to solve an LPP which has more than one optimal solution. How can I do that?
For example:
Maximize 2000x1 + 3000x2
subject to
6x1 + 9x2 ≤ 100
2x1 + x2 ≤ 20
x1, x2 ≥ 0
In this LPP there is more than one optimal solution, i.e. (0, 100/9) and (20/3, 20/3). When I solve this problem using the pulp library, it gives me only the (0, 100/9) solution. I want all the possible solutions.
There is a good discussion on this here.
There are two questions here: (i) How to find multiple optimal solutions to an LP, and (ii) Why you would want to.
Answering (ii) first - typically you might want to ask for all optimal solutions because there is some secondary or lower-importance objective to pick between them (for example wanting to minimize the change from a current setting, or minimise risk in some way). If this is the case my personal recommendation would be to find a way of incorporating that lower-order preference into your objective function (for example by adding in a term with a low weighting).
Answering (i): I find it helps to look at an LP graphically - and the example you have given works nicely for this (two variables means you can plot it - see plot on wolfram). Each of your constraints puts a line on this inequality plot, and solutions can only be chosen on one side of that line.
Your objective function is like having a constant gradient across this feasible area, and you are trying to find the highest spot. You can draw contours of your objective function by setting it to a specific value and drawing that line. What you'll find if you do this is that your objective function contours are parallel to the top constraint line (your first constraint).
You can see this directly from the equation: 6x1 + 9x2 <= 100 divides down to 2x1 + 3x2 <= 100/3, and your objective divides down to have the same gradient. What this means is you can move along that top constraint from one corner to the other without changing the value of your objective function.
There are infinitely many optimal solutions which solve the equation:
2x1 + 3x2 == 100/3, between x1==0 and x1==20/3. You have already identified the solutions at the two corners.
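To make this concrete, here is a quick sketch sampling points along that optimal edge (the objective stays constant at 100000/3 on all of them):

import numpy as np

# sample optimal points on 2*x1 + 3*x2 == 100/3 with 0 <= x1 <= 20/3
for x1 in np.linspace(0, 20/3, 5):
    x2 = (100/3 - 2*x1) / 3
    print(x1, x2, 2000*x1 + 3000*x2)  # objective is constant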
If you want to find all the nodes which are equally optimal - for large problems there could be a large number of these - then the code below gives a basic implementation of the method discussed here. When you run it the first time it will give you one of the corner solutions - you then need to add this node (the set of variables and slacks which are zero) to A, and iterate until the objective degrades. You could put this within a loop, as sketched after the code below. Note that as currently implemented this only works for problems with variables which have 0 lower bound and are unbounded above.
import pulp as pulp
# Accounting:
# n structural variables (n = 2)
# m constraints (m = 2)
# => No. of basics = 2 (no. of constraints)
# => No. of non-basics = 2 (no. of variables)
nb = 2
M = 100 # large M value - upper bound for x1, x2 and the slacks
model = pulp.LpProblem('get all basis', pulp.LpMaximize)
# Variables
x = pulp.LpVariable.dicts('x', range(2), lowBound=0, upBound=None, cat='Continuous')
# Non-negative Slack Variables - one for each constraint
s = pulp.LpVariable.dicts('s', range(2), lowBound=0, upBound=None, cat='Continuous')
# Basis variables (binary)
# one for each variable & one for each constraint (& so slack)
B_x = pulp.LpVariable.dicts('b_x', range(len(x)), cat='Binary')
B_s = pulp.LpVariable.dicts('b_s', range(len(s)), cat='Binary')
# Objective
model += 2000*x[0] + 3000*x[1]
# Constraints - with explicit slacks
model += 6*x[0] + 9*x[1] + s[0] == 100
model += 2*x[0] + x[1] + s[1] == 20
# No. of basics is correct:
model += pulp.lpSum(B_x) + pulp.lpSum(B_s) == nb
# Enforce basic and non-basic behaviour
for i in range(len(x)):
    model += x[i] <= M*B_x[i]
for i in range(len(s)):
    model += s[i] <= M*B_s[i]
# Cuts - already discovered solutions
A = []
# A = [[1, 1, 0, 0]]
# A = [[1, 1, 0, 0], [0, 1, 0, 1]]
for a in A:
    model += (B_x[0]*a[0] + B_x[1]*a[1] +
              B_s[0]*a[2] + B_s[1]*a[3]) <= nb - 1
model.solve()
print('Status:', pulp.LpStatus[model.status])
print('Objective:', pulp.value(model.objective))
for v in model.variables():
    print(v.name, "=", v.varValue)
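And here is a rough sketch of that loop, rebuilding the model with one exclusion cut per basis already found and stopping once the objective degrades (the function name and the tolerance are my own):

import pulp

def solve_with_cuts(cuts, nb=2, M=100):
    # same model as above, plus one exclusion cut per previously found basis
    model = pulp.LpProblem('get_all_basis', pulp.LpMaximize)
    x = pulp.LpVariable.dicts('x', range(2), lowBound=0, cat='Continuous')
    s = pulp.LpVariable.dicts('s', range(2), lowBound=0, cat='Continuous')
    B_x = pulp.LpVariable.dicts('b_x', range(2), cat='Binary')
    B_s = pulp.LpVariable.dicts('b_s', range(2), cat='Binary')
    model += 2000*x[0] + 3000*x[1]
    model += 6*x[0] + 9*x[1] + s[0] == 100
    model += 2*x[0] + x[1] + s[1] == 20
    model += pulp.lpSum(B_x) + pulp.lpSum(B_s) == nb
    for i in range(2):
        model += x[i] <= M*B_x[i]
        model += s[i] <= M*B_s[i]
    for a in cuts:
        model += (B_x[0]*a[0] + B_x[1]*a[1] +
                  B_s[0]*a[2] + B_s[1]*a[3]) <= nb - 1
    model.solve()
    if pulp.LpStatus[model.status] != 'Optimal':
        return None, None, None
    basis = [int(round(v.varValue)) for v in (B_x[0], B_x[1], B_s[0], B_s[1])]
    return pulp.value(model.objective), basis, (x[0].varValue, x[1].varValue)

cuts = []
best = None
while True:
    obj, basis, point = solve_with_cuts(cuts)
    if obj is None or (best is not None and obj < best - 1e-6):
        break  # infeasible or degraded objective: all equally optimal bases found
    if best is None:
        best = obj
    print(point, obj)
    cuts.append(basis)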
Here is an example of how to add a cut to a PuLP LP problem with multiple optimal solutions:
from pulp import *
# Define the LP problem
prob = LpProblem("LP Problem", LpMaximize)
# Define variables
x = LpVariable("x", lowBound=0)
y = LpVariable("y", lowBound=0)
# Define an objective function
prob += 2 * x + 3 * y
# Define constraints
prob += x + 2 * y <= 10
prob += x + y <= 5
# Solve the LP problem
prob.solve()
# Check if the LP problem has multiple solutions
if LpStatus[prob.status] == "Optimal":
    if value(x) == 0:
        # Add a cut to eliminate one of the solutions
        prob += x >= 1
        prob.solve()
# Print the result
print("x =", value(x))
print("y =", value(y))
In this example, after solving the LP problem, we check whether it has an optimal solution and whether variable x is equal to 0. If both conditions hold, we take that as a sign that the problem may have multiple optimal solutions. To eliminate one of them, we add a cut requiring variable x to be greater than or equal to 1. We then solve the LP problem again, and this time we should obtain a different solution.
First of all, I know that these threads exist! So bear with me, my question is not fully answered by them.
As an example, assume we are in a 4-dimensional vector space, i.e. R^4. We are looking at the two linear equations:
3*x1 - 2* x2 + 7*x3 - 2*x4 = 6
1*x1 + 3* x2 - 2*x3 + 5*x4 = -2
The actual question is: Is there a way to generate a number N of points that solve both of these equations, making use of the linear solvers from NumPy etc.?
The main problem with all the python libraries I have tried so far is: they need n equations for an n-dimensional space.
Solving the problem is very easy for one equation, since you can simply use n-1 randomly generated values and adapt the last one such that the vector solves the equation.
My expected result would be a list of N "randomly" generated points that solve k linear equations in an n-dimensional space, where k<n.
A system of linear equations with more variables than equations is known as an underdetermined system.
An underdetermined linear system has either no solution or infinitely many solutions.
...
There are algorithms to decide whether an underdetermined system has solutions, and if it has any, to express all solutions as linear functions of k of the variables (same k as above). The simplest one is Gaussian elimination.
As you say, many functions available in libraries (e.g. np.linalg.solve) require a square matrix (i.e. n equations for n unknowns); what you are looking for is an implementation of Gaussian elimination for non-square linear systems.
This isn't 'random', but np.linalg.lstsq (least squares) will solve non-square systems:
Return the least-squares solution to a linear matrix equation.
Solves the equation a x = b by computing a vector x that minimizes the Euclidean 2-norm || b - a x ||^2. The equation may be under-, well-, or over- determined (i.e., the number of linearly independent rows of a can be less than, equal to, or greater than its number of linearly independent columns). If a is square and of full rank, then x (but for round-off error) is the “exact” solution of the equation.
For more info, see:
solving Ax =b for a non-square matrix A using python
Since you have an underdetermined system of equations (fewer equations than variables, so too few constraints to pin down a unique solution), you can just pick some arbitrary values for x3 and x4 and solve the system in x1, x2 (this leaves 2 variables/2 equations).
You will just need to check that the resulting system is not inconsistent (i.e. it admits no solution) and that there are no duplicate solutions.
You could, for instance, fix x3=0 and, choosing random values of x4, generate solutions for your equations in x1, x2.
Here's an example generating 10 "random" solutions
import numpy as np

# coefficients of x1, x2 once the x3, x4 terms are moved to the right-hand side
a = np.array([[3, -2],
              [1, 3]])
n = 10
x3 = 0
X = []
for x4 in np.random.choice(1000, n):
    b = np.array([[6-7*x3+2*x4], [-2+2*x3-5*x4]])
    x = np.linalg.solve(a, b)
    X.append(np.append(x, [x3, x4]))
# check solution nr. 3
[x1, x2, x3, x4] = X[3]
3*x1 - 2* x2 + 7*x3 - 2*x4
# output: 6.0
1*x1 + 3* x2 - 2*x3 + 5*x4
# output: -2.0
Thanks for the answers, which both helped me and pointed me in the right direction.
I now have an easy step-by-step solution to my problem for arbitrary k<n.
1. Find one solution to all equations given. This can be done by using
solution_vec = numpy.linalg.lstsq(A, b, rcond=None)[0]
(note that lstsq returns a tuple whose first element is the solution). This gives a solution as seen in ukemi's answer. In my example above, the matrix A is equal to the coefficients of the equations on the left side, and b represents the vector on the right side.
2. Determine the null space of your matrix A.
These are all vectors v such that the scalar product v*A_i = 0 for every(!) row A_i of A. The following function, found in this thread, can be used to get representatives of the null space of A:
import numpy as np
from scipy import linalg

def nullSpaceOfMatrix(A, eps=1e-15):
    u, s, vh = linalg.svd(A)
    # for a wide A, vh has more rows than there are singular values; those
    # extra rows are null-space directions too, so pad the mask to keep them
    null_mask = np.concatenate([s <= eps, np.ones(vh.shape[0] - len(s), dtype=bool)])
    null_space = np.compress(null_mask, vh, axis=0)
    return np.transpose(null_space)
3. Generate as many (N) "random" linear combinations (meaning with random coefficients) of solution_vec and the resulting vectors of the null space of the matrix as you want! This works because the scalar product is additive and null-space vectors have a scalar product of 0 with the rows of the equations. Each such linear combination must always contain solution_vec, as in:
linear_combination = solution_vec + a*null_space_vec_1 + b*null_space_vec_2 + ...
where a and b can be randomly chosen.
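Putting the three steps together for the example system above (reusing nullSpaceOfMatrix from step 2), a minimal sketch:

import numpy as np

A = np.array([[3, -2, 7, -2],
              [1, 3, -2, 5]], dtype=float)
b = np.array([6, -2], dtype=float)

solution_vec = np.linalg.lstsq(A, b, rcond=None)[0]  # step 1: one particular solution
null_space = nullSpaceOfMatrix(A)   # step 2: here a 4 x 2 matrix (n - k = 2 directions)

rng = np.random.default_rng()
points = []
for _ in range(10):                 # step 3: N = 10 random combinations
    coeffs = rng.normal(size=null_space.shape[1])
    p = solution_vec + null_space @ coeffs
    assert np.allclose(A @ p, b)    # every such point solves both equations
    points.append(p)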
I have a quadratic optimisation problem of the form
minimize over x: (1/2) x'x + q'x subject to Gx <= h
I have a rather big problem ( few million points and constraints), and while cvxopt's default solver proved effective, I'm curious about implementing it in SCS which should be faster (and without using the CVXPY interface).
After a literature search (mainly Boyd and Vandenberghe's Convex Optimization), the reformulation to SOCP form should yield
minimize over z: c'z subject to Az + s = b, s in K
with c = (1 q)' and z = (t x)' where t is a scalar, and K is the cartesian product of the linear cones associated with my original constraints (Gx \leq h) and of a quadratic cone Q = { (t,x) | t >= ||x||}
However, how should I actually define A and b ?
I imagined something like A = [[0 G], [2, 1, ..., 1]] and b = (h 0).
However, with the 'q' option set to [1] in the cone dictionary argument of python's scs.solve, I can't get it to work. What is the expected syntax of A's last row? (That is, assuming my mathematical reformulation is correct...)
Thank you for your help!
I want to solve this kind of problem:
dy/dt = 0.01*y*(1-y), find t when y = 0.8 (0<t<3000)
I've tried the ode function in Python, but it can only calculate y when t is given.
So are there any simple ways to solve this problem in Python?
PS: This function is just a simple example. My real problem is so complex that it can't be solved analytically, so I want to know how to solve it numerically. And I think this problem is more like an optimization problem:
Objective function y(t) = 0.8, Subject to dy/dt = 0.01*y*(1-y), and 0<t<3000
PPS: My real problem is:
objective function: F(t) = 0.85,
subject to: F(t) = sqrt(x(t)^2+y(t)^2+z(t)^2),
x''(t) = (1/F(t)-1)*250*x(t),
y''(t) = (1/F(t)-1)*250*y(t),
z''(t) = (1/F(t)-1)*250*z(t)-10,
x(0) = 0, y(0) = 0, z(0) = 0.7,
x'(0) = 0.1, y'(0) = 1.5, z'(0) = 0,
0<t<5
This differential equation can be solved analytically quite easily:
dy/dt = 0.01 * y * (1-y)
rearrange to gather y and t terms on opposite sides
dt / 100 = 1/(y * (1-y)) dy
The lhs integrates trivially to t / 100; the rhs is slightly more complicated. We can always write a quotient whose denominator is a product as a sum of simpler quotients times some constants (partial fractions):
1/(y * (1-y)) = A/y + B/(1-y)
The values for A and B can be worked out by putting the rhs on the same denominator and comparing constant and first order y terms on both sides. In this case it is simple, A=B=1. Thus we have to integrate
1/y + 1/(1-y) dy
The first term integrates to ln(y); the second term can be integrated with a change of variables u = 1-y to -ln(1-y). Our integrated equation therefore looks like:
t / 100 + C = ln(y) - ln(1-y)
not forgetting the constant of integration (it is convenient to write it on the lhs here). We can combine the two logarithm terms:
t / 100 + C = ln( y / (1-y) )
In order to solve for t at an exact value of y, we first need to work out the value of C. We do this using the initial conditions. (Note that if y starts at exactly 0 or 1, dy/dt = 0 and the value of y never changes, so such a trajectory never reaches 0.8.) Plug in the values for y and t at the beginning:
0 / 100 + C = ln( y(0) / (1 - y(0)) )
This gives a value for C (assuming y(0) is not 0 or 1); then use y = 0.8 to get a value for t, namely t = 100 * (ln(0.8/0.2) - C). Unless the initial value of y is incredibly small, y will reach 0.8 within a moderate range of t values. It is of course also straightforward to rearrange the equation above to express y in terms of t, so you can plot the function as well.
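For instance, assuming a starting value y(0) = 0.5 (the question does not give one), the closed form evaluates to:

from math import log

y0 = 0.5                        # assumed initial value
C = log(y0 / (1 - y0))          # from the initial condition; C = 0 for y0 = 0.5
t = 100 * (log(0.8 / (1 - 0.8)) - C)
print(t)                        # about 138.6 for y0 = 0.5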
Edit: Numerical integration
For a more complex ODE which cannot be solved analytically, you will have to try numerically. Initially we only know the value of the function at zero time y(0) (we have to know at least that in order to uniquely define the trajectory of the function), and how to evaluate the gradient. The idea of numerical integration is that we can use our knowledge of the gradient (which tells us how the function is changing) to work out what the value of the function will be in the vicinity of our starting point. The simplest way to do this is Euler integration:
y(dt) = y(0) + dy/dt * dt
Euler integration assumes that the gradient is constant between t=0 and t=dt. Once y(dt) is known, the gradient can be calculated there also and in turn used to calculate y(2 * dt) and so on, gradually building up the complete trajectory of the function. If you are looking for a particular target value, just wait until the trajectory goes past that value, then interpolate between the last two positions to get the precise t.
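A bare-bones sketch of this scheme for the equation in the question (again assuming y(0) = 0.5):

# Euler integration of dy/dt = 0.01*y*(1-y), with assumed y(0) = 0.5
y, t, dt = 0.5, 0.0, 0.1
while t < 3000:
    y_prev = y
    y += 0.01 * y * (1 - y) * dt
    t += dt
    if y >= 0.8:
        # interpolate between the last two steps for a more precise t
        print(t - dt * (y - 0.8) / (y - y_prev))
        break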
The problem with Euler integration (and with all other numerical integration methods) is that its results are only accurate when its assumptions are valid. Because the gradient is not constant between pairs of time points, a certain amount of error arises at each integration step, and over time these errors build up until the answer is completely inaccurate. In order to improve the quality of the integration, it is necessary to use more sophisticated approximations to the gradient. Check out, for example, the Runge-Kutta methods, a family of integrators that remove successively higher-order error terms at the cost of increased computation time. If your function is differentiable, knowing the second or even third derivatives can also be used to reduce the integration error.
Fortunately of course, somebody else has done the hard work here, and you don't have to worry too much about solving problems like numerical stability or have an in depth understanding of all the details (although understanding roughly what is going on helps a lot). Check out http://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.ode.html#scipy.integrate.ode for an example of an integrator class which you should be able to use straightaway. For instance
from scipy.integrate import ode

def deriv(t, y):
    return 0.01 * y * (1 - y)

my_integrator = ode(deriv)
my_integrator.set_initial_value(0.5)
t = 0.1  # start with a small value of time
while t < 3000:
    y = my_integrator.integrate(t)
    if y[0] > 0.8:
        print("y(%f) = %f" % (t, y[0]))
        break
    t += 0.1
This code will print out the first t value when y passes 0.8 (or nothing if it never reaches 0.8). If you want a more accurate value of t, keep the y of the previous t as well and interpolate between them.
As an addition to Krastanov's answer:
Aside from PyDSTool, there are other packages, like Pysundials and Assimulo, which provide bindings to the solver IDA from Sundials. This solver has root-finding capabilities.
Use scipy.integrate.odeint to handle your integration, and analyse the results afterward.
import numpy as np
from scipy.integrate import odeint
ts = np.arange(0,3000,1) # time series - start, stop, step
def rhs(y, t):
    return 0.01*y*(1-y)

y0 = np.array([0.5]) # initial value (starting exactly at y = 1 would keep dy/dt at zero forever)
ys = odeint(rhs, y0, ts)
Then analyse the numpy array ys to find your answer (the dimensions of ts match those of ys). (This may not work the first time because I am constructing it from memory.)
This might involve interpolating the ys array, for instance with scipy's interpolation functions, to find the time t at which y crosses 0.8.
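For example, a minimal sketch with np.interp (which assumes ys is monotonically increasing, as it is for a logistic curve started between 0 and 1):

# interpolate the crossing time; assumes ys increases monotonically
t_cross = np.interp(0.8, ys.flatten(), ts)
print(t_cross)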
EDIT: I see that you wish to solve a spring in 3D. This should be fine with the above method; Odeint on the scipy website has examples for systems such as coupled springs that can be solved for, and these could be extended.
What you are asking for is an ODE integrator with root-finding capabilities. They exist, and the low-level code for such integrators is supplied with scipy, but they have not yet been wrapped in python bindings.
For more information see this mailing list post that provides a few alternatives: http://mail.scipy.org/pipermail/scipy-user/2010-March/024890.html
You can use the following example implementation which uses backtracking (hence it is not optimal as it is a bolt-on addition to an integrator that does not have root finding on its own): https://github.com/scipy/scipy/pull/4904/files