Finding the minima/maxima of a multi-variable polynomial in Python

I have the following polynomial, and I would like to find its local minima and maxima.
I defined the function as follows. It uses a flatten function to flatten the nested list, which I'll include for testing purposes (found here: http://rightfootin.blogspot.com/2006/09/more-on-python-flatten.html).
flatten list
from itertools import combinations
import math

def flatten(l, ltypes=(list, tuple)):
    ltype = type(l)
    l = list(l)
    i = 0
    while i < len(l):
        while isinstance(l[i], ltypes):
            if not l[i]:
                l.pop(i)
                i -= 1
                break
            else:
                l[i:i + 1] = l[i]
        i += 1
    return ltype(l)
my polynomial
def poly(coefficients, factors):
    #quadratic terms
    constant = 1
    singles = factors
    products = [math.prod(c) for c in combinations(factors, 2)]
    squares = [f**2 for f in factors]
    sequence = flatten([constant, singles, products, squares])
    z = sum([math.prod(i) for i in zip(coefficients, sequence)])
    return z
It takes a list of coefficients, for example:
coefs = [12.19764959, -1.8233151, 2.50952816,-1.56344375, 1.00003828, -1.72128301, -2.54254877, -1.20377309, 5.53510616, 2.94755653, 4.83759279, -0.85507208, -0.48007208, -3.70507208, -0.27007208]
And a list of factor or variable values:
factors = [0.4714, 0.4714, -0.4714, 0.4714]
Plug these in and it calculates the value of the polynomial. I wrote it this way because the number of variables (factors) changes from fit to fit, so I wanted to keep it flexible. I now want to find the combination of "factors" values within a certain range (say, between -1 and 1) where the function reaches its maximum and minimum values. If the function were "hard coded" I could use scipy.optimize, but I can't figure out how to make it work as is.
Another option is a brute force grid search (which I use at the moment), but it's very slow as soon as you have more than 2 variables, especially with small step sizes. There may be no true minima/maxima where slope == 0 within the bounds, but as long as I can get the maximum and minimum values that is OK.
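To make the term ordering that poly relies on explicit, here is a minimal sketch that builds the same sequence (constant, linear terms, pairwise products, squares) without the flatten helper. The names quadratic_terms and n_coefficients are hypothetical, introduced only for this illustration.

```python
from itertools import combinations
import math

def quadratic_terms(factors):
    """Return [1, x_i..., x_i*x_j (i<j)..., x_i**2...] in the order poly uses."""
    factors = list(factors)
    products = [math.prod(pair) for pair in combinations(factors, 2)]
    squares = [f ** 2 for f in factors]
    return [1] + factors + products + squares

def n_coefficients(n_factors):
    # 1 constant + n linear terms + n*(n-1)/2 cross terms + n squares
    return 1 + n_factors + n_factors * (n_factors - 1) // 2 + n_factors

terms = quadratic_terms([2, 3])
print(terms)              # [1, 2, 3, 6, 4, 9]
print(n_coefficients(4))  # 15, matching the length of coefs above
```

Checking the coefficient count this way is a cheap guard, since zip silently drops terms when the two lists differ in length.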

OK, I figured it out. It came down to two really silly things:
The order of the arguments in the function had to be reversed, so that the first argument (the one I wanted to optimize) was the "factors" or X values, followed by the coefficients. That way an array of the same size can be passed as x0 and the coefficients can be passed as args.
That wasn't enough, because the function would return an array if an array was the input. I added factors = list(factors) to the function itself to put the input into the correct shape.
The new function:
def poly(factors, coefficients):
    factors = list(factors)
    #quadratic terms
    constant = 1
    singles = factors
    products = [math.prod(c) for c in combinations(factors, 2)]
    squares = [f**2 for f in factors]
    sequence = flatten([constant, singles, products, squares])
    z = sum([math.prod(i) for i in zip(coefficients, sequence)])
    return z
And the optimization:
from scipy.optimize import minimize

coefs = [4.08050532, -0.47042713, -0.08200181, -0.54184481, -0.18515675,
         -0.96751856, -1.10814625, -1.7831592, 5.2763512, 2.83505438, 4.7082153,
         0.22988773, 1.06488773, -0.70011227, 1.42988773]
x0 = [0.1, 0.1, 0.1, 0.1]
# args is passed as a one-element tuple so the whole list arrives as "coefficients"
minimize(poly, x0=x0, args=(coefs,), bounds=((-1, 1), (-1, 1), (-1, 1), (-1, 1)))
Which returns:
fun: -1.6736636102536673
hess_inv: <4x4 LbfgsInvHessProduct with dtype=float64>
jac: array([-2.10611305e-01, 2.19138777e+00, -8.16990766e+00, -1.11022302e-07])
message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
nfev: 85
nit: 12
njev: 17
status: 0
success: True
x: array([1., -1.,1., 0.03327357])
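The question also asked for the maximum. One common approach is to minimize the negated function, sketched below. The helper neg_poly is a hypothetical name, and poly is rewritten here without flatten only to keep the example self-contained. Note that minimize is a local optimizer, so on a non-convex quadratic the result can depend on x0; trying several starting points is a reasonable precaution.

```python
from itertools import combinations
import math
from scipy.optimize import minimize

def poly(factors, coefficients):
    # Same quadratic model as above, written without the flatten helper.
    factors = list(factors)
    terms = ([1] + factors
             + [math.prod(c) for c in combinations(factors, 2)]
             + [f ** 2 for f in factors])
    return sum(c * t for c, t in zip(coefficients, terms))

def neg_poly(factors, coefficients):
    # Minimizing -f is equivalent to maximizing f.
    return -poly(factors, coefficients)

coefs = [4.08050532, -0.47042713, -0.08200181, -0.54184481, -0.18515675,
         -0.96751856, -1.10814625, -1.7831592, 5.2763512, 2.83505438, 4.7082153,
         0.22988773, 1.06488773, -0.70011227, 1.42988773]
bounds = [(-1, 1)] * 4
x0 = [0.1] * 4

res_min = minimize(poly, x0, args=(coefs,), bounds=bounds)
res_max = minimize(neg_poly, x0, args=(coefs,), bounds=bounds)
print("min:", res_min.fun, res_min.x)
print("max:", -res_max.fun, res_max.x)
```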

Related

How to speed up an N dimensional interval tree in python?

Consider the following problem: Given a set of n intervals and a set of m floating-point numbers, determine, for each floating-point number, the subset of intervals that contain the floating-point number.
This problem has been addressed by constructing an interval tree (sometimes also called a range tree or segment tree). One-dimensional implementations exist, e.g. Python's intervaltree package. Usually, these implementations query one or a few floating-point numbers, i.e. a small "m" above.
In my problem setting, both n and m are extremely large (they come from an image processing problem). Further, I need to consider N-dimensional intervals (called cuboids when N=3, because I was modeling human brains with the Finite Element Method). I have implemented a simple N-dimensional interval tree in Python, but it runs in a loop and can only take one floating-point number at a time. Can anyone help improve the efficiency of the implementation? You can change the data structure freely.
import sys
import time
import numpy as np

# find the index of a satisfying x > a in one dimension
def find_index_smaller(a, x):
    idx = np.argsort(a)
    ss = np.searchsorted(a, x, sorter=idx)
    res = idx[0:ss]
    return res

# find the index of a satisfying x < a in one dimension
def find_index_larger(a, x):
    return find_index_smaller(-a, -x)

# find the index of a satisfying amin < x < amax in one dimension
def find_intv_at(amin, amax, x):
    idx = find_index_smaller(amin, x)
    idx2 = find_index_larger(amax[idx], x)
    res = idx[idx2]
    return res

# find the index of a satisfying amin < x < amax in N dimensions
def find_intv_at_nd(amin, amax, x):
    dim = amin.shape[0]
    res = np.arange(amin.shape[-1])
    for i in range(dim):
        idx = find_intv_at(amin[i, res], amax[i, res], x[i])
        res = res[idx]
    return res
I also have two test examples for sanity check and performance testing:
def demo1():
    print("By default, we do a correctness test")
    n_intv = 2
    n_point = 2
    # generate the test data
    point = np.random.rand(3, n_point)
    intv_min = np.random.rand(3, n_intv)
    intv_max = intv_min + np.random.rand(3, n_intv)*8
    print("point ")
    print(point)
    print("intv_min")
    print(intv_min)
    print("intv_max")
    print(intv_max)
    print("===Indexes of intervals that contain the point===")
    for i in range(n_point):
        print(find_intv_at_nd(intv_min, intv_max, point[:, i]))

def demo2():
    print("Performance:")
    n_points = 100
    n_intv = 1000000
    # generate the test data
    points = np.random.rand(n_points, 3)*512
    intv_min = np.random.rand(3, n_intv)*512
    intv_max = intv_min + np.random.rand(3, n_intv)*8
    print("point.shape = "+str(points.shape))
    print("intv_min.shape = "+str(intv_min.shape))
    print("intv_max.shape = "+str(intv_max.shape))
    starttime = time.time()
    for point in points:
        tmp = find_intv_at_nd(intv_min, intv_max, point)
    print("it took this long to run {} points, with {} intervals: {}".format(n_points, n_intv, time.time()-starttime))
My idea would be:
Remove np.argsort() from the algo, because the interval tree does not change, so sorting could have been done in pre-processing.
Vectorize x. The algo runs a loop for each x. It would be nice if we can get rid of the loop over x.
Any contribution would be appreciated.
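As one possible direction for the "vectorize x" idea, here is a minimal sketch that replaces the per-point loop with NumPy broadcasting over chunks of points. It is a brute-force containment test, not an interval tree, so the work is still proportional to n times m; it only removes Python-level overhead. The function name find_intv_at_nd_batch and the chunk parameter are hypothetical, and the chunk size should be tuned to memory, since each chunk allocates a boolean array of shape (chunk, N, n_intv).

```python
import numpy as np

def find_intv_at_nd_batch(amin, amax, points, chunk=32):
    """For each point, return the indices of intervals with amin < x < amax in every dimension.

    amin, amax: arrays of shape (N, n_intv); points: array of shape (n_points, N).
    """
    results = []
    for start in range(0, len(points), chunk):
        block = points[start:start + chunk]            # shape (c, N)
        # Broadcast to (c, N, n_intv), then require containment in all N dimensions.
        inside = (amin[None, :, :] < block[:, :, None]) & (block[:, :, None] < amax[None, :, :])
        mask = inside.all(axis=1)                       # shape (c, n_intv)
        results.extend(np.flatnonzero(row) for row in mask)
    return results

# Small made-up data set in the same layout as demo2
rng = np.random.default_rng(0)
intv_min = rng.random((3, 1000)) * 512
intv_max = intv_min + rng.random((3, 1000)) * 8
points = rng.random((50, 3)) * 512

hits = find_intv_at_nd_batch(intv_min, intv_max, points)
print(len(hits))  # 50, one index array per point
```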

compute the infinity norm of the difference between the two solutions

In the following code I have been able to:
Implement Gaussian elimination with no pivoting for a general square linear system.
I have tested it by solving Ax=b, where A is a random 100x100 matrix and b is a random 100x1 vector.
I have compared my solution against the solution obtained using numpy.linalg.solve
However, in the final task I need to compute the infinity norm of the difference between the two solutions. I know the infinity norm of a matrix is its greatest absolute row sum, but how do I apply that to the difference between my solution and the one from numpy.linalg.solve? Looking for some help with this!
import numpy as np

def GENP(A, b):
    '''
    Gaussian elimination with no pivoting.
    % input: A is an n x n nonsingular matrix
    %        b is an n x 1 vector
    % output: x is the solution of Ax=b.
    % post-condition: A and b have been modified.
    '''
    n = len(A)
    if b.size != n:
        raise ValueError("Invalid argument: incompatible sizes between A & b.", b.size, n)
    for pivot_row in range(n-1):
        for row in range(pivot_row+1, n):
            multiplier = A[row][pivot_row]/A[pivot_row][pivot_row]
            #the only one in this column since the rest are zero
            A[row][pivot_row] = multiplier
            for col in range(pivot_row + 1, n):
                A[row][col] = A[row][col] - multiplier*A[pivot_row][col]
            #Equation solution column
            b[row] = b[row] - multiplier*b[pivot_row]
    x = np.zeros(n)
    k = n-1
    x[k] = b[k]/A[k,k]
    while k >= 0:
        x[k] = (b[k] - np.dot(A[k,k+1:],x[k+1:]))/A[k,k]
        k = k-1
    return x

if __name__ == "__main__":
    A = np.round(np.random.rand(100, 100)*10)
    b = np.round(np.random.rand(100)*10)
    print (GENP(np.copy(A), np.copy(b)))
for example this code gives the following output for task 1 listed above:
[-6.61537666 0.95704368 1.30101768 -3.69577873 -2.51427519 -4.56927017
-1.61201589 2.88242622 1.67836096 2.18145556 2.60831672 0.08055869
-2.39347903 2.19672137 -0.91609732 -1.17994959 -3.87309152 -2.53330865
5.97476318 3.74687301 5.38585146 -2.71597978 2.0034079 -0.35045844
0.43988439 -2.2623829 -1.82137544 3.20545721 -4.98871738 -6.94378666
-6.5076601 3.28448129 3.42318453 -1.63900434 4.70352047 -4.12289961
-0.79514656 3.09744616 2.96397264 2.60408589 2.38707091 8.72909353
-1.33584905 1.30879264 -0.28008339 0.93560728 -1.40591226 1.31004142
-1.43422946 0.41875924 3.28412668 3.82169545 1.96675247 2.76094378
-0.90069455 1.3641636 -0.60520103 3.4814196 -1.43076816 5.01222382
0.19160657 2.23163261 2.42183726 -0.52941262 -7.35597457 -3.41685057
-0.24359225 -5.33856181 -1.41741354 -0.35654736 -1.71158503 -2.24469314
-3.26453092 1.0932765 1.58333208 0.15567584 0.02793548 1.59561909
0.31732915 -1.00695954 3.41663177 -4.06869021 3.74388762 -0.82868155
1.49789582 -1.63559124 0.2741194 -1.11709237 1.97177449 0.66410154
0.48397714 -1.96241854 0.34975886 1.3317751 2.25763568 -6.80055066
-0.65903682 -1.07105965 -0.40211347 -0.30507635]
then for task two my code gives the following:
my_solution = GENP(np.copy(A), np.copy(b))
numpy_solution = np.linalg.solve(A, b)
print(numpy_solution)
resulting in:
[-6.61537666 0.95704368 1.30101768 -3.69577873 -2.51427519 -4.56927017
-1.61201589 2.88242622 1.67836096 2.18145556 2.60831672 0.08055869
-2.39347903 2.19672137 -0.91609732 -1.17994959 -3.87309152 -2.53330865
5.97476318 3.74687301 5.38585146 -2.71597978 2.0034079 -0.35045844
0.43988439 -2.2623829 -1.82137544 3.20545721 -4.98871738 -6.94378666
-6.5076601 3.28448129 3.42318453 -1.63900434 4.70352047 -4.12289961
-0.79514656 3.09744616 2.96397264 2.60408589 2.38707091 8.72909353
-1.33584905 1.30879264 -0.28008339 0.93560728 -1.40591226 1.31004142
-1.43422946 0.41875924 3.28412668 3.82169545 1.96675247 2.76094378
-0.90069455 1.3641636 -0.60520103 3.4814196 -1.43076816 5.01222382
0.19160657 2.23163261 2.42183726 -0.52941262 -7.35597457 -3.41685057
-0.24359225 -5.33856181 -1.41741354 -0.35654736 -1.71158503 -2.24469314
-3.26453092 1.0932765 1.58333208 0.15567584 0.02793548 1.59561909
0.31732915 -1.00695954 3.41663177 -4.06869021 3.74388762 -0.82868155
1.49789582 -1.63559124 0.2741194 -1.11709237 1.97177449 0.66410154
0.48397714 -1.96241854 0.34975886 1.3317751 2.25763568 -6.80055066
-0.65903682 -1.07105965 -0.40211347 -0.30507635]
finally for task 3:
if np.allclose(my_solution, numpy_solution):
print("These solutions agree")
else:
print("These solutions do not agree")
resulting in:
These solutions agree
If what you want is the infinity norm of a matrix, i.e. the largest sum of absolute values along a row,
it generally looks something like this:
def inf_norm(matrix):
    return max(abs(row).sum() for row in matrix)
But since your my_solution and numpy_solution are just 1-D vectors, you
can either reshape them (I assume 100x1, which is what you have in your
example) for use with the above function:
alternative 1:
def inf_norm(matrix):
    return max(abs(row).sum() for row in matrix)
diff = my_solution - numpy_solution
inf_norm_result = inf_norm(diff.reshape((100, 1)))
alternative 2:
Or, if you know they will always be 1-D vectors, you can omit the sum
(because the rows will all have length 1) and compute it directly:
abs(my_solution - numpy_solution).max()
alternative 3:
Or, as written in the numpy.linalg.norm documentation (see below), which applies to a 2-D array, so the difference is reshaped first:
np.max(np.sum(np.abs((my_solution - numpy_solution).reshape((100, 1))), axis=1))
alternative 4:
Or use numpy.linalg.norm() (see: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.linalg.norm.html):
np.linalg.norm(my_solution - numpy_solution, np.inf)
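As a quick sanity check that these alternatives agree, here is a small sketch on made-up three-element vectors (stand-ins for my_solution and numpy_solution, chosen so the arithmetic is exact):

```python
import numpy as np

# Hypothetical stand-ins for the two solution vectors
my_solution = np.array([1.0, -2.0, 3.0])
numpy_solution = np.array([1.5, -2.0, 2.0])
diff = my_solution - numpy_solution           # [-0.5, 0.0, 1.0]

by_hand = np.abs(diff).max()                                  # largest absolute entry
as_column = np.abs(diff.reshape(-1, 1)).sum(axis=1).max()     # max absolute row sum of an n x 1 matrix
via_norm = np.linalg.norm(diff, np.inf)                       # library routine

print(by_hand, as_column, via_norm)  # 1.0 1.0 1.0
```

For a vector, the infinity norm reduces to the largest absolute entry, which is why all three give the same number.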

(Python) Markov, Chebyshev, Chernoff upper bound functions

I'm stuck with one task on my learning path.
For the binomial distribution X ∼ B_{p,n} with mean μ = np and variance σ² = np(1−p), we would like to upper bound the probability P(X ≥ c⋅μ) for c ≥ 1.
Three bounds introduced:
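Formulas (in the form used in the answer, with m = μ and s = σ):
Markov: P(X>=a*m) <= 1/a
Chebyshev: P(X>=m+k*s) <= 1/k**2
Chernoff: P(X>=(1+d)*m) <= exp(-d**2/(2+d)*m)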
The task is to write three functions respectively for each of the inequalities. They must take n , p and c as inputs and return the upper bounds for P(X≥c⋅np) given by the above Markov, Chebyshev, and Chernoff inequalities as outputs.
And there is an example of IO:
Code:
print(Markov(100.,0.2,1.5))
print(Chebyshev(100.,0.2,1.5))
print(Chernoff(100.,0.2,1.5))
Output
0.6666666666666666
0.16
0.1353352832366127
I'm completely stuck. I just can't figure out how to plug in all that math into functions (or how to think algorithmically here). If someone could help me out, that would be of great help!
P.S. The task conditions forbid all libraries except math.exp.
Ok, let's look at what's given:
Input and derived values:
n = 100
p = 0.2
c = 1.5
m = n*p = 100 * 0.2 = 20
s2 = n*p*(1-p) = 16
s = sqrt(s2) = sqrt(16) = 4
You have multiple inequalities of the form P(X>=a*m) and you need to provide bounds for the term P(X>=c*m), so you need to think how a relates to c in all cases.
Markov inequality: P(X>=a*m) <= 1/a
You're asked to implement Markov(n,p,c) that will return the upper bound for P(X>=c*m). Since from
P(X>=a*m)
= P(X>=c*m)
it's clear that a == c, you get 1/a = 1/c. Well, that's just
def Markov(n, p, c):
    return 1.0/c
>>> Markov(100,0.2,1.5)
0.6666666666666666
That was easy, wasn't it?
Chernoff inequality states that P(X>=(1+d)*m) <= exp(-d**2/(2+d)*m)
First, let's verify that if
P(X>=(1+d)*m)
= P(X>=c *m)
then
1+d = c
d = c-1
This gives us everything we need to calculate the upper bound:
import math

def Chernoff(n, p, c):
    d = c-1
    m = n*p
    return math.exp(-d**2/(2+d)*m)
>>> Chernoff(100,0.2,1.5)
0.1353352832366127
Chebyshev inequality bounds P(X>=m+k*s) by 1/k**2
So again, if
P(X>=c*m)
= P(X>=m+k*s)
then
c*m = m+k*s
m*(c-1) = k*s
k = m*(c-1)/s
Then it's straightforward to implement
def Chebyshev(n, p, c):
    m = n*p
    s = math.sqrt(n*p*(1-p))
    k = m*(c-1)/s
    return 1/k**2
>>> Chebyshev(100,0.2,1.5)
0.16
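To see how loose each bound is, the three functions can be put side by side with the exact binomial tail. This is only a sketch: exact_tail is a hypothetical helper added for comparison, it uses math.comb (Python 3.8+), which goes beyond the math.exp restriction in the task, and the Chebyshev function divides by zero when c == 1.

```python
import math

def markov(n, p, c):
    return 1.0 / c

def chebyshev(n, p, c):
    m = n * p
    s = math.sqrt(n * p * (1 - p))
    k = m * (c - 1) / s          # number of standard deviations above the mean
    return 1.0 / k ** 2

def chernoff(n, p, c):
    d = c - 1
    m = n * p
    return math.exp(-d ** 2 / (2 + d) * m)

def exact_tail(n, p, c):
    # P(X >= c*n*p) for X ~ Binomial(n, p), summed directly from the pmf.
    threshold = math.ceil(c * (n * p))
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(threshold, n + 1))

n, p, c = 100, 0.2, 1.5
for name, bound in [("Markov", markov), ("Chebyshev", chebyshev), ("Chernoff", chernoff)]:
    print(name, bound(n, p, c))
print("exact", exact_tail(n, p, c))
```

For these inputs the exact tail probability is smaller than all three bounds, as expected for upper bounds.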

SciPy Minimize with monotonically decreasing Xs constraint

I am looking to do a strenuous optimization in which I use SciPy to optimize discount factors for bond cashflows (the application is less important, but it's there if you're interested). Essentially, I take multiple known values 'P', where P[i] is a function of a known constant C[i] and an array X (X[j]=x(t), where x is a function of time), such that the sum-product of C[i] and X equals P.
I hope that makes some sense. For a sensible result, I want to constrain X (my array of x values) so that x[j] < x[j-1], that is, the x's are monotonically decreasing.
Here is my code snippet for the optimization function:
In [400]:
import numpy as np
import pandas as pd
import scipy as s
def MyOptimization(X):
    P=np.array([99.,100.,105.,110.]) #just example known "P" array, in reality closer to 40 values
    c=np.array([1.25,4.,3.1,2.5]) #Cash flows for each P
    t=np.array([[1.2,2.,4.,10.0],[0.5,1.],[2.3,5.,10.5],[1.7]]) #time t of each cash flow, multiple per 'P'
    #remember P=X(t)*c[i] and x(t) where x[i+1]<x[i]
    tlist=[] #t's will be used as index, so pulling individual values
    for i in t:
        for j in i:
            tlist.append(j)
    df=pd.DataFrame(data=X,index=tlist).drop_duplicates().sort() #dataframe to hold t (index) and x, x(t), and P(x,c) where c is known
    #print df
    sse=0
    for i in range(0,len(P)):
        pxi = np.sum(df.loc[t[i],0].values*c[i])+100*df.loc[t[i][-1],0]
        sse=sse+(pxi-P[i])**2 #want to minimize sum squared errors between calculated P(x,c) and known P
    return sse
cons=({'type':'ineq','fun': lambda x: x[1] < x[0]}) #trying to define constraint that x is decreasing with t
opti=s.optimize.minimize(MyOptimization,x0=[0.90,0.89,0.88,0.87,0.86,0.85,0.84,0.83,0.82,0.81],bounds=([0,1],)*10,constraints=cons)
In [401]:
opti
Out[401]:
status: 0
success: True
njev: 4
nfev: 69
fun: 5.445290696814009e-15
x: array([ 0.90092322, 0.89092322, 0.88092322, 0.94478062, 0.86301329,
0.92834564, 0.84444848, 0.83444848, 0.96794781, 1.07317073])
message: 'Optimization terminated successfully.'
jac: array([ -7.50609263e-05, -7.50609263e-05, -7.50609263e-05,
-5.92906077e-03, 3.46914830e-04, 9.17475767e-03,
-4.89504256e-04, -4.89504256e-04, -1.61263312e-02,
8.35321580e-03, 0.00000000e+00])
nit: 4
And it is clear to see where in the results the x array is not decreasing. (I tried adding (0,1) bounds as well, but the result failed, so I'm focusing on this for now.)
The important line here for the constraint that I'm really not sure about is:
cons=({'type':'ineq','fun': lambda x: x[1] < x[0]})
I tried following the documentation, but clearly it hasn't worked.
Any ideas greatly appreciated.
Let's try
def con(x):
    for i in range(len(x)-1):
        if x[i] <= x[i+1]:
            return -1
    return 1
cons=({'type':'ineq','fun': con})
This should reject lists that aren't ordered the way you want, but I'm not sure if scipy is going to like it.
I can't comment on the post below, but you need to have an i=i in there so that each lambda captures its own index: tuple([{'type':'ineq', 'fun': lambda x,i=i: x[i] - x[i+1]} for i in range(9)] + [{'type':'eq', 'fun': lambda x,j=j: 0 if x[j] != x[j+1] else 1} for j in range(9)])
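For context, an 'ineq' constraint in scipy.optimize.minimize is satisfied when its function returns a non-negative value, so a function returning the boolean x[1] < x[0] (True or False, i.e. 1 or 0) is never violated. A smooth alternative is one constraint per adjacent pair, x[i] - x[i+1] >= 0. The sketch below uses a made-up objective and made-up target values purely for illustration; it is not the bond pricing function from the question.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data that is NOT monotone; the fit should be pulled into decreasing order.
target = np.array([0.90, 0.95, 0.80, 0.85])

def objective(x):
    # Sum of squared errors against the made-up target
    return np.sum((x - target) ** 2)

# One constraint per adjacent pair: x[i] - x[i+1] >= 0.
# The default argument i=i binds the current index to each lambda.
cons = [{'type': 'ineq', 'fun': lambda x, i=i: x[i] - x[i + 1]}
        for i in range(len(target) - 1)]

res = minimize(objective, x0=np.linspace(0.9, 0.8, len(target)),
               method='SLSQP', bounds=[(0, 1)] * len(target), constraints=cons)
print(res.x)
```

These constraints are non-strict (x[i] >= x[i+1]). If a strict decrease is required, one option is to subtract a small margin, e.g. x[i] - x[i+1] - 1e-6.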

Too many indices for array

I am trying to create a 3D image mat1 from the data given to me by an object, but I am getting an error on the last line, mat1[x,y,z] = mat[x,y,z] + (R**2/U**2)**pf1[l,m,beta]:
IndexError: too many indices for array
What could possibly be the problem here?
Following is my code:
import math
from math import sin, cos
import numpy as np

# mat is the input data array (defined elsewhere)
mat1 = np.zeros((1024,1024,360),dtype=np.int32)
k = 498
gamma = 0.00774267
R = 0.37
g = np.zeros(1024)
g[0:512] = np.linspace(0,1,512)
g[513:] = np.linspace(1,0,511)
pf = np.zeros((1024,1024,360))
pf1 = np.zeros((1024,1024,360))
for b in range(0,1023) :
    for beta in range(0,359) :
        for a in range(0,1023) :
            pf[a,b,beta] = (R/(((R**2)+(a**2)+(b**2))**0.5))*mat[a,b,beta]
        pf1[:,b,beta] = np.convolve(pf[:,b,beta],g,'same')
for x in range(0,1023) :
    for y in range(0,1023) :
        for z in range(0,359) :
            for beta in range(0,359) :
                a = R*((-x*0.005)*(sin(beta)) + (y*0.005)*(cos(beta)))/(R+ (x*0.005)*(cos(beta))+(y*0.005)*(sin(beta)))
                b = z*R/(R+(x*0.005)*(cos(beta))+(y*0.005)*(sin(beta)))
                U = R+(x*0.005)*(cos(beta))+(y*0.005)*(sin(beta))
                l = math.trunc(a)
                m = math.trunc(b)
                if (0<=l<1024 and 0<=m<1024) :
                    mat1[x,y,z] = mat[x,y,z] + (R**2/U**2)**pf1[l,m,beta]
The line where you do the convolution:
pf1 = np.convolve(pf[:,b,beta],g)
generates a 1-dimensional array, and not 3-dimensional as your call in the last line: pf1[l,m,beta]
To solve this you can use:
pf1[:,b,beta] = np.convolve(pf[:,b,beta],g,'same')
and you also need to predefine pf1:
pf1 = np.zeros((1024,1024,360))
Note that the convolution f*g (np.convolve(f,g)) normally returns an array of length |f|+|g|-1. If, however, you call np.convolve with the mode 'same', it returns an array whose length is the larger of the two inputs (i.e. max(|f|,|g|)).
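The difference in output length can be checked on small made-up arrays (f and g here are placeholders, not the pf and g from the question):

```python
import numpy as np

f = np.ones(1024)
g = np.ones(5)

full = np.convolve(f, g)           # default mode is 'full': length |f| + |g| - 1
same = np.convolve(f, g, 'same')   # length max(|f|, |g|)

print(len(full), len(same))  # 1028 1024
```

Only the 'same' result fits into a slice such as pf1[:,b,beta] of length 1024.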
Edit:
Furthermore you have to be sure that the dimensions of the matrices and the indices you use are correct, for example:
You define mat1 = np.zeros((100,100,100),dtype=np.int32), thus a 100x100x100 matrix, but in the last line you do mat1[x,y,z] where the variables x, y and z clearly get out of these dimensions. In this case they get to the range of the mat matrix. Probably you have to change the dimensions of mat1 also to those:
mat1 = np.zeros((1024,1024,360),dtype=np.int32)
Also be sure that the last variable indices you calculate (l and m) are within the dimensions of pf1.
Edit 2: The range(a,b) function yields the integers from a up to, but not including, b. So instead of range(0,1023), for example, you should write range(0,1024) (or shorter: range(1024)).
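A short check of the half-open behaviour of range (the variable names are only for illustration):

```python
short = list(range(0, 1023))
print(short[-1], len(short))     # 1022 1023: index 1023 is never visited

whole = list(range(1024))
print(whole[-1], len(whole))     # 1023 1024: covers every index of a length-1024 axis
```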
Edit 3: To check whether l or m exceed the dimensions, you could print a message as soon as they do:
l = math.trunc(a)
if l>=1024:
    print('l exceeded bounds: ', l)
m = math.trunc(b)
if m>=1024:
    print('m exceeded bounds: ', m)
Edit 4: Note that your code, especially the last nested for loop, will take a long time! It results in 1024*1024*360*360=135895449600 iterations. From a rough timing I did (measuring the running time of the body of your for loop), your code might take about 5 days to run.
A small easy optimization you could do is instead of calculating the sin and cos several times, create a variable storing the value:
sinbeta = sin(beta)
cosbeta = cos(beta)
but it will probably still take several days. You might want to check how to optimize your calculations or calculate it with a C program for example.
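Taking the caching idea one step further, the sin and cos values for every beta can be computed once before the loops and looked up inside them. This is a minimal sketch: the table names and the denominator helper are hypothetical, and it keeps the question's convention of passing the integer beta directly to sin and cos.

```python
import math

N_BETA = 360  # number of beta values, as in the loops above
R = 0.37

# Computed once, instead of once per (x, y, z, beta) iteration
sin_table = [math.sin(beta) for beta in range(N_BETA)]
cos_table = [math.cos(beta) for beta in range(N_BETA)]

def denominator(x, y, beta):
    # U = R + x*0.005*cos(beta) + y*0.005*sin(beta), using the lookup tables
    return R + (x * 0.005) * cos_table[beta] + (y * 0.005) * sin_table[beta]

total_iterations = 1024 * 1024 * 360 * 360
print(total_iterations)  # 135895449600
```

The lookup removes the repeated trigonometric calls, but the iteration count itself is unchanged, so the overall runtime is still dominated by the four nested loops.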
