I am looking to do a strenuous optimization in which I use SciPy to optimize discount factors for bond cashflows (the application is less important, but mentioned in case you're interested). Essentially I take multiple known values P, where P[i] is a function of a known constant C[i] and an array X (X[j] = x(t), where x is a function of time), such that the sum-product of C[i] and X equals P[i].
Hope that makes some sense. Essentially, for a sensible result, I want to put a constraint on X (my array of x values) such that x[j] < x[j-1], that is, the x's are monotonically decreasing.
Here is my code snippet for the optimization function:
In [400]:
import numpy as np
import pandas as pd
import scipy as s
def MyOptimization(X):
    P=np.array([99.,100.,105.,110.]) #just an example known "P" array, in reality closer to 40 values
    c=np.array([1.25,4.,3.1,2.5]) #cash flows for each P
    t=np.array([[1.2,2.,4.,10.0],[0.5,1.],[2.3,5.,10.5],[1.7]],dtype=object) #time t of each cash flow, multiple per 'P' (dtype=object since rows have different lengths)
    #remember P=X(t)*c[i] and x(t) where x[i+1]<x[i]
    tlist=[] #t's will be used as index, so pulling individual values
    for i in t:
        for j in i:
            tlist.append(j)
    df=pd.DataFrame(data=X,index=tlist).drop_duplicates().sort_index() #dataframe to hold t (index) and x, x(t), and P(x,c) where c is known; sort_index replaces the removed DataFrame.sort
    #print(df)
    sse=0
    for i in range(0,len(P)):
        pxi = np.sum(df.loc[t[i],0].values*c[i])+100*df.loc[t[i][-1],0]
        sse=sse+(pxi-P[i])**2 #want to minimize sum of squared errors between calculated P(x,c) and known P
    return sse

cons=({'type':'ineq','fun': lambda x: x[1] < x[0]}) #trying to define constraint that x is decreasing with t
opti=s.optimize.minimize(MyOptimization,x0=[0.90,0.89,0.88,0.87,0.86,0.85,0.84,0.83,0.82,0.81],bounds=([0,1],)*10,constraints=cons)
In [401]:
opti
Out[401]:
status: 0
success: True
njev: 4
nfev: 69
fun: 5.445290696814009e-15
x: array([ 0.90092322, 0.89092322, 0.88092322, 0.94478062, 0.86301329,
0.92834564, 0.84444848, 0.83444848, 0.96794781, 1.07317073])
message: 'Optimization terminated successfully.'
jac: array([ -7.50609263e-05, -7.50609263e-05, -7.50609263e-05,
-5.92906077e-03, 3.46914830e-04, 9.17475767e-03,
-4.89504256e-04, -4.89504256e-04, -1.61263312e-02,
8.35321580e-03, 0.00000000e+00])
nit: 4
And it is clear to see in the results that the x array is not decreasing. (I tried adding (0,1) bounds as well, but the result failed, so I'm focusing on this for now.)
The important line here for the constraint that I'm really not sure about is:
cons=({'type':'ineq','fun': lambda x: x[1] < x[0]})
I tried following the documentation, but clearly it hasn't worked.
Any ideas greatly appreciated.
Let's try
def con(x):
    for i in range(len(x)-1):
        if x[i] <= x[i+1]:
            return -1
    return 1
cons=({'type':'ineq','fun': con})
This should reject lists that aren't set up like you want, but I'm not sure SciPy is going to like it: a step function like this gives the solver no usable gradient to follow.
I can't comment on the post below, but you need to have an i=i in there to bind the loop variable at definition time:

cons = tuple(
    [{'type': 'ineq', 'fun': lambda x, i=i: x[i] - x[i+1]} for i in range(9)] +
    [{'type': 'eq', 'fun': lambda x, j=j: 0 if x[j] != x[j+1] else 1} for j in range(9)]
)
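Putting the pieces together, here is a minimal sketch of the constraint setup (my consolidation of the above, not a tested answer; with SLSQP each 'ineq' function must return a value the solver keeps non-negative, so x[i] - x[i+1] >= 0 enforces decreasing x):

import numpy as np
from scipy.optimize import minimize

n = 10
x0 = np.linspace(0.90, 0.81, n)  # same decreasing start guess as above
# One inequality per adjacent pair; i=i binds the loop variable at definition time.
cons = [{'type': 'ineq', 'fun': lambda x, i=i: x[i] - x[i+1]} for i in range(n - 1)]
opti = minimize(MyOptimization, x0=x0, bounds=((0, 1),) * n,
                constraints=cons, method='SLSQP')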
I have the following polynomial equation that I would like to find the local minima and maxima for.
I defined the function as follows. It uses a flatten function to flatten the nested list; I'll include it for testing purposes (found here: http://rightfootin.blogspot.com/2006/09/more-on-python-flatten.html).
flatten list
from itertools import combinations
import math
def flatten(l, ltypes=(list, tuple)):
    ltype = type(l)
    l = list(l)
    i = 0
    while i < len(l):
        while isinstance(l[i], ltypes):
            if not l[i]:
                l.pop(i)
                i -= 1
                break
            else:
                l[i:i + 1] = l[i]
        i += 1
    return ltype(l)
my polynomial
def poly(coefficients, factors):
    # quadratic terms
    constant = 1
    singles = factors
    products = [math.prod(c) for c in combinations(factors, 2)]
    squares = [f**2 for f in factors]
    sequence = flatten([constant, singles, products, squares])
    z = sum([math.prod(i) for i in zip(coefficients, sequence)])
    return z
The arguments it takes are a list of coefficients, for example:
coefs = [12.19764959, -1.8233151, 2.50952816,-1.56344375, 1.00003828, -1.72128301, -2.54254877, -1.20377309, 5.53510616, 2.94755653, 4.83759279, -0.85507208, -0.48007208, -3.70507208, -0.27007208]
And a list of factor or variable values:
factors = [0.4714, 0.4714, -0.4714, 0.4714]
Plug these in and it calculates the result of the polynomial. The reason I wrote it like this is that the number of variables (factors) changes from fit to fit, so I wanted to keep it flexible. I now want to find the combination of "factors" values within a certain range (let's say between -1 and 1) where the function reaches its maximum and minimum values. If the function were "hard coded" I could use scipy.optimize, but I can't figure out how to make it work as-is.
Another option is a brute force grid search (which I use at the moment), but it's very slow as soon as you have more than 2 variables, especially with small step sizes. There may be no true minima/maxima where slope == 0 within the bounds, but as long as I can get the maximum and minimum values that is OK.
Ok, I figured it out. It was two really silly things:
the order of the arguments in the function had to be reversed, so that the first argument (the one I wanted to optimize over) was the "factors", i.e. the X values, followed by the coefficients. That way an array of the same size could be used as the x0 and the coefficients could be passed as args.
That wasn't enough, though, as the function would return an array if an array was the input. I just added a factors = list(factors) to the function itself to put the input into the correct shape.
The new function:
def poly(factors, coefficients):
    factors = list(factors)
    # quadratic terms
    constant = 1
    singles = factors
    products = [math.prod(c) for c in combinations(factors, 2)]
    squares = [f**2 for f in factors]
    sequence = flatten([constant, singles, products, squares])
    z = sum([math.prod(i) for i in zip(coefficients, sequence)])
    return z
And the optimization:
from scipy.optimize import minimize

coefs = [4.08050532, -0.47042713, -0.08200181, -0.54184481, -0.18515675,
         -0.96751856, -1.10814625, -1.7831592, 5.2763512, 2.83505438, 4.7082153,
         0.22988773, 1.06488773, -0.70011227, 1.42988773]
x0 = [0.1, 0.1, 0.1, 0.1]
minimize(poly, x0=x0, args=coefs, bounds=((-1, 1), (-1, 1), (-1, 1), (-1, 1)))
Which returns:
fun: -1.6736636102536673
hess_inv: <4x4 LbfgsInvHessProduct with dtype=float64>
jac: array([-2.10611305e-01, 2.19138777e+00, -8.16990766e+00, -1.11022302e-07])
message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
nfev: 85
nit: 12
njev: 17
status: 0
success: True
x: array([1., -1.,1., 0.03327357])
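The call above only finds the minimum. For the maximum, one option (my addition, not part of the original answer) is to minimize the negated polynomial over the same bounds:

from scipy.optimize import minimize

# Minimizing -poly finds the maximum of poly over the same box.
neg = minimize(lambda f, c: -poly(f, c), x0=x0, args=coefs,
               bounds=((-1, 1), (-1, 1), (-1, 1), (-1, 1)))
max_value = -neg.fun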
I have a particular function which calculates average cost of electricity ($/MWh) over the lifetime of a power plant.
An example function looks like this
def calc(a, b, c):
    res = 65*a + 74*b + 12*c
    return res
Where a, b and c are cost parameters, such as operating expenditure, construction cost and insurance.
I could vary a b and c in an infinite number of ways, but I would like to keep the ratios the same as an example data point I have, with a lower result for average cost of electricity.
For example
When a=1, b=2 and c=3, res = 249.
However, I would like to find the optimal values for a, b and c, keeping the same original ratios, when res=600.
I have tried to figure out a way to do this using scipy.optimize, but with some difficulty.
I'm not sure how I would program in the ratios for the constraints.
Many thanks.
Let's say you have two sets of values, (a_old, b_old, c_old) and (a_new, b_new, c_new). If you want their respective ratios to be the same (e.g., a_old:c_old is the same as a_new:c_new, and c_old:b_old is the same as c_new:b_new, and so on), then that's the same as saying there exists some constant k such that a_new = k*a_old, b_new = k*b_old, and c_new = k*c_old.
In your example, 65*a_old + 74*b_old + 12*c_old = 249. If you multiply both sides of this equation by k, you get 65*(k*a_old) + 74*(k*b_old) + 12*(k*c_old) = 249*k, which is the same as 65*a_new + 74*b_new + 12*c_new = 249*k.
You want 249*k to be equal to 600. Therefore, k = 600/249, or about 2.4096. You can then use this k value along with a_old, b_old, c_old to find the values of a_new, b_new, c_new. Remember, the new values are just k times the old values.
Here's a function that returns the set of scaled parameter values:
def optimize(a, b, c, opt_res):
    res = 65*a + 74*b + 12*c
    k = opt_res/res
    new_vals = [parameter * k for parameter in [a, b, c]]
    return new_vals
print(optimize(1,2,3,600.0))
## output: [2.4096385542168677, 4.819277108433735, 7.2289156626506035]
Note I used "600.0", not "600". Under Python 2 this forces float division instead of truncated integer arithmetic (in Python 3 it makes no difference).
From this answer, you can specify the constraints like this:
cons = [{'type':'eq', 'fun': con1},
        {'type':'eq', 'fun': con2}]
and use the minimize function like this:
scipy.optimize.minimize(func, x0, constraints=cons)
I managed to come to a solution which helped my particular use-case, even though it was pointed out that there was a simpler solution for this particular example.
from scipy.optimize import minimize
import numpy as np
a = 1
b = 2
c = 3

def calc(x):
    res = 65*x[0] + 74*x[1] + 12*x[2]
    return res

cons = [{'type': 'eq', 'fun': lambda x: x[0]/x[1] - a/b},
        {'type': 'eq', 'fun': lambda x: x[1]/x[2] - b/c},
        {'type': 'eq', 'fun': lambda x: calc(x) - 600}]

start_pos = np.ones(3)*(1/6.)
print(minimize(calc, x0=start_pos, constraints=cons))
The constraints keep the same ratios, and set the result of calc to equal 600.
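As a quick sanity check (my addition), the result should match the closed-form scaling k = 600/249 from the earlier answer:

k = 600.0 / 249.0
print([k * v for v in (a, b, c)])
# approximately [2.4096, 4.8193, 7.2289]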
In the following code I have been able to:
Implement Gaussian elimination with no pivoting for a general square linear system.
I have tested it by solving Ax=b, where A is a random 100x100 matrix and b is a random 100x1 vector.
I have compared my solution against the solution obtained using numpy.linalg.solve
However, in the final task I need to compute the infinity norm of the difference between the two solutions. I know the infinity norm is the greatest absolute row sum of a matrix. But how can I use this to compute the infinity norm of the difference between my solution and the numpy.linalg.solve one? Looking for some help with this!
import numpy as np
def GENP(A, b):
    '''
    Gaussian elimination with no pivoting.
    % input: A is an n x n nonsingular matrix
    %        b is an n x 1 vector
    % output: x is the solution of Ax=b.
    % post-condition: A and b have been modified.
    '''
    n = len(A)
    if b.size != n:
        raise ValueError("Invalid argument: incompatible sizes between A & b.", b.size, n)
    for pivot_row in range(n-1):
        for row in range(pivot_row+1, n):
            multiplier = A[row][pivot_row]/A[pivot_row][pivot_row]
            #the only one in this column since the rest are zero
            A[row][pivot_row] = multiplier
            for col in range(pivot_row + 1, n):
                A[row][col] = A[row][col] - multiplier*A[pivot_row][col]
            #Equation solution column
            b[row] = b[row] - multiplier*b[pivot_row]
    x = np.zeros(n)
    k = n-1
    x[k] = b[k]/A[k,k]
    while k >= 0:
        x[k] = (b[k] - np.dot(A[k,k+1:], x[k+1:]))/A[k,k]
        k = k-1
    return x

if __name__ == "__main__":
    A = np.round(np.random.rand(100, 100)*10)
    b = np.round(np.random.rand(100)*10)
    print(GENP(np.copy(A), np.copy(b)))
for example this code gives the following output for task 1 listed above:
[-6.61537666 0.95704368 1.30101768 -3.69577873 -2.51427519 -4.56927017
-1.61201589 2.88242622 1.67836096 2.18145556 2.60831672 0.08055869
-2.39347903 2.19672137 -0.91609732 -1.17994959 -3.87309152 -2.53330865
5.97476318 3.74687301 5.38585146 -2.71597978 2.0034079 -0.35045844
0.43988439 -2.2623829 -1.82137544 3.20545721 -4.98871738 -6.94378666
-6.5076601 3.28448129 3.42318453 -1.63900434 4.70352047 -4.12289961
-0.79514656 3.09744616 2.96397264 2.60408589 2.38707091 8.72909353
-1.33584905 1.30879264 -0.28008339 0.93560728 -1.40591226 1.31004142
-1.43422946 0.41875924 3.28412668 3.82169545 1.96675247 2.76094378
-0.90069455 1.3641636 -0.60520103 3.4814196 -1.43076816 5.01222382
0.19160657 2.23163261 2.42183726 -0.52941262 -7.35597457 -3.41685057
-0.24359225 -5.33856181 -1.41741354 -0.35654736 -1.71158503 -2.24469314
-3.26453092 1.0932765 1.58333208 0.15567584 0.02793548 1.59561909
0.31732915 -1.00695954 3.41663177 -4.06869021 3.74388762 -0.82868155
1.49789582 -1.63559124 0.2741194 -1.11709237 1.97177449 0.66410154
0.48397714 -1.96241854 0.34975886 1.3317751 2.25763568 -6.80055066
-0.65903682 -1.07105965 -0.40211347 -0.30507635]
then for task two my code gives the following:
my_solution = GENP(np.copy(A), np.copy(b))
numpy_solution = np.linalg.solve(A, b)
print(numpy_solution)
resulting in:
[-6.61537666 0.95704368 1.30101768 -3.69577873 -2.51427519 -4.56927017
-1.61201589 2.88242622 1.67836096 2.18145556 2.60831672 0.08055869
-2.39347903 2.19672137 -0.91609732 -1.17994959 -3.87309152 -2.53330865
5.97476318 3.74687301 5.38585146 -2.71597978 2.0034079 -0.35045844
0.43988439 -2.2623829 -1.82137544 3.20545721 -4.98871738 -6.94378666
-6.5076601 3.28448129 3.42318453 -1.63900434 4.70352047 -4.12289961
-0.79514656 3.09744616 2.96397264 2.60408589 2.38707091 8.72909353
-1.33584905 1.30879264 -0.28008339 0.93560728 -1.40591226 1.31004142
-1.43422946 0.41875924 3.28412668 3.82169545 1.96675247 2.76094378
-0.90069455 1.3641636 -0.60520103 3.4814196 -1.43076816 5.01222382
0.19160657 2.23163261 2.42183726 -0.52941262 -7.35597457 -3.41685057
-0.24359225 -5.33856181 -1.41741354 -0.35654736 -1.71158503 -2.24469314
-3.26453092 1.0932765 1.58333208 0.15567584 0.02793548 1.59561909
0.31732915 -1.00695954 3.41663177 -4.06869021 3.74388762 -0.82868155
1.49789582 -1.63559124 0.2741194 -1.11709237 1.97177449 0.66410154
0.48397714 -1.96241854 0.34975886 1.3317751 2.25763568 -6.80055066
-0.65903682 -1.07105965 -0.40211347 -0.30507635]
finally for task 3:
if np.allclose(my_solution, numpy_solution):
    print("These solutions agree")
else:
    print("These solutions do not agree")
resulting in:
These solutions agree
If what you want is only the infinity norm of a matrix, it generally looks something like this (note the abs goes inside the sum: the infinity norm is the greatest row sum of absolute values):

def inf_norm(matrix):
    return max(abs(row).sum() for row in matrix)
But since your my_solution and numpy_solution are just 1-D vectors, you may either reshape them (I assume 100x1, which is what you have in your example) for use with the above function:
alternative 1:

def inf_norm(matrix):
    return max(abs(row).sum() for row in matrix)

diff = my_solution - numpy_solution
inf_norm_result = inf_norm(diff.reshape((100, 1)))
alternative 2:
Or if you know they will always be 1-D vectors, you can omit the sum
(because the rows will all have length 1) and compute it directly:
abs(my_solution - numpy_solution).max()
alternative 3:
or, as it is written in the numpy.linalg.norm documentation (see below; this form expects a 2-D matrix, e.g. the (100, 1) reshaped difference):
max(np.sum(abs(diff.reshape((100, 1))), axis=1))
alternative 4:
or use the numpy.linalg.norm() (see: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.linalg.norm.html):
np.linalg.norm(my_solution - numpy_solution, np.inf)
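As a quick sanity check (my addition), alternatives 2 and 4 should agree, for example:

diff = my_solution - numpy_solution
assert np.isclose(abs(diff).max(), np.linalg.norm(diff, np.inf))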
Given an N x N symmetric matrix C and an N x N diagonal matrix I, find the solutions of the equation det(λI-C)=0. In other words, the (generalized) eigenvalues of C are to be found.
I know a few ways to solve this in MATLAB using built-in functions:
1st way:
function lambdas=eigenValues(C,I)
    syms x;
    lambdas=sort(roots(double(fliplr(coeffs(det(C-I*x))))));
2nd way:
[V,D]=eig(C,I);
However, I need to use Python. There are similar functions in NumPy and SymPy, but, according to the docs (numpy, sympy), they take only one matrix C as the input. Moreover, the result differs from the result produced by MATLAB, and the symbolic solutions produced by SymPy aren't helpful. Maybe I am doing something wrong? How do I find the solution?
Example
MATLAB:
%INPUT
I =
2 0 0
0 6 0
0 0 5
C =
4 7 0
7 8 -4
0 -4 1
[v,d]=eig(C,I)
%RESULT
v =
-0.3558 -0.3109 -0.5261
0.2778 0.1344 -0.2673
0.2383 -0.3737 0.0598
d =
-0.7327 0 0
0 0.4876 0
0 0 3.7784
Python 3.5:
%INPUT
I = np.matrix([[2, 0, 0],
               [0, 6, 0],
               [0, 0, 5]])
C = np.matrix([[4, 7, 0], [7, 8, -4], [0, -4, 1]])
np.linalg.eigh(C)
%RESULT
(array([-3., 1.91723747, 14.08276253]),
matrix(
[[-0.57735027, 0.60061066, -0.55311256],
[ 0.57735027, -0.1787042 , -0.79670037],
[ 0.57735027, 0.77931486, 0.24358781]]))
At least if I has positive diagonal entries you can simply solve a transformed system:
# example problem
>>> A = np.random.random((3, 3))
>>> A = A.T @ A
>>> I = np.identity(3) * np.random.random((3,))
# transform
>>> J = np.sqrt(np.einsum('ii->i', I))
>>> B = A / np.outer(J, J)
# solve
>>> eval_, evec = np.linalg.eigh(B)
# back transform result
>>> evec /= J[:, None]
# check
>>> A @ evec
array([[ -1.43653725e-02, 4.14643550e-01, -2.42340866e+00],
[ -1.75615960e-03, -4.17347693e-01, -8.19546081e-01],
[ 1.90178603e-02, 1.34837899e-01, -1.69999003e+00]])
>>> eval_ * (I @ evec)
array([[ -1.43653725e-02, 4.14643550e-01, -2.42340866e+00],
[ -1.75615960e-03, -4.17347693e-01, -8.19546081e-01],
[ 1.90178603e-02, 1.34837899e-01, -1.69999003e+00]])
OP's example. IMPORTANT: must use np.arrays for I and C, np.matrix will not work.
>>> I=np.array([[2,0,0],[0,6,0],[0,0,5]])
>>> C=np.array([[4,7,0],[7,8,-4],[0,-4,1]])
>>>
>>> J = np.sqrt(np.einsum('ii->i', I))
>>> B = C / np.outer(J, J)
>>> eval_, evec = np.linalg.eigh(B)
>>> evec /= J[:, None]
>>>
>>> evec
array([[-0.35578356, -0.31094779, -0.52605088],
[ 0.27778714, 0.1343625 , -0.267297 ],
[ 0.23826117, -0.37371199, 0.05975754]])
>>> eval_
array([-0.73271478, 0.48762792, 3.7784202 ])
If I has positive and negative entries use eig instead of eigh and before taking the square root cast to complex dtype.
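A minimal sketch of that variant (my illustration, reusing the names above):

# Complex sqrt handles negative diagonal entries of I.
J = np.sqrt(np.einsum('ii->i', I).astype(complex))
B = C / np.outer(J, J)
eval_, evec = np.linalg.eig(B)  # eig, not eigh: B is no longer Hermitian
evec /= J[:, None]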
Differing from other answers, I assume that by the symbol I you mean the identity matrix, Ix=x.
What you want to solve, Cx=λIx, is the so-called standard eigenvalue problem,
and most eigenvalue solvers tackle the problem described in that format, hence the
Numpy function has the signature eig(C).
If your C matrix is a symmetric matrix and your problem is indeed a standard eigenvalue problem I'd recommend the use of numpy.linalg.eigh, that is optimized for this type of problems.
On the contrary if your problem is really a generalized eigenvalue problem, as, e.g., the frequency equation Kx=ω²Mx you could use scipy.linalg.eigh, that supports that type of problem statement for symmetric matrices.
eigvals, eigvecs = scipy.linalg.eigh(C, I)
With respect to the discrepancies in eigenvalues: the NumPy implementation gives no guarantees with respect to their ordering, so it could be just a different ordering. But if your problem is indeed a generalized problem (I not being the identity matrix...), the solution is of course different, and you have to use the SciPy implementation of eigh.
If the discrepancies are within the eigenvectors, please remember that eigenvectors are known up to an arbitrary scale factor and, again, the ordering could be undefined (but, of course, their order is the same order in which you have the eigenvalues). The situation is a little different for scipy.linalg.eigh, because in this case the eigenvalues are sorted and the eigenvectors are normalized with respect to the second matrix argument (I in your example).
P.S.: scipy.linalg.eigh's behaviour (i.e., sorted eigenvalues and normalized eigenvectors) is so convenient for my use cases that I tend to use it also to solve standard eigenvalue problems.
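For the OP's matrices, a minimal sketch of the generalized call (expected values taken from the MATLAB output above):

import numpy as np
from scipy.linalg import eigh

I = np.array([[2, 0, 0], [0, 6, 0], [0, 0, 5]])
C = np.array([[4, 7, 0], [7, 8, -4], [0, -4, 1]])
eigvals, eigvecs = eigh(C, I)  # solves C x = lambda * I x
# eigvals should be approximately [-0.7327, 0.4876, 3.7784], matching MATLAB's d.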
Using SymPy:
>>> from sympy import *
>>> t = Symbol('t')
>>> D = diag(2,6,5)
>>> S = Matrix([[ 4, 7, 0],
...             [ 7, 8,-4],
...             [ 0,-4, 1]])
>>> (t*D - S).det()
60*t**3 - 212*t**2 - 77*t + 81
Computing the exact roots:
>>> roots = solve(60*t**3 - 212*t**2 - 77*t + 81,t)
>>> roots
[53/45 + (-1/2 - sqrt(3)*I/2)*(312469/182250 + sqrt(797521629)*I/16200)**(1/3) + 14701/(8100*(-1/2 - sqrt(3)*I/2)*(312469/182250 + sqrt(797521629)*I/16200)**(1/3)), 53/45 + 14701/(8100*(-1/2 + sqrt(3)*I/2)*(312469/182250 + sqrt(797521629)*I/16200)**(1/3)) + (-1/2 + sqrt(3)*I/2)*(312469/182250 + sqrt(797521629)*I/16200)**(1/3), 53/45 + 14701/(8100*(312469/182250 + sqrt(797521629)*I/16200)**(1/3)) + (312469/182250 + sqrt(797521629)*I/16200)**(1/3)]
Computing floating-point approximations of the roots:
>>> for r in roots:
...     r.evalf()
...
0.487627918145732 + 0.e-22*I
-0.73271478047926 - 0.e-22*I
3.77842019566686 - 0.e-21*I
Note that the roots are real.
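The spurious tiny imaginary parts can also be chopped during evaluation (an aside; chop is a standard evalf option):

>>> [r.evalf(chop=True) for r in roots]
[0.487627918145732, -0.73271478047926, 3.77842019566686]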
I have two numpy arrays, like X=[x1,x2,x3,x4], y=[y1,y2,y3,y4]. Three of the elements are close, and the fourth may or may not be close.
Like:
X [ 84.04467948 52.42447842 39.13555678 21.99846595]
y [ 78.86529444 52.42447842 38.74910101 21.99846595]
Or it can be:
X [ 84.04467948 60 52.42447842 39.13555678]
y [ 78.86529444 52.42447842 38.74910101 21.99846595]
I want to define a function to find the corresponding indices in the two arrays, like in the first case:
y[0] correspond to X[0],
y[1] correspond to X[1],
y[2] correspond to X[2],
y[3] correspond to X[3]
And in second case:
y[0] correspond to X[0],
y[1] correspond to X[2],
y[2] correspond to X[3]
and y[3] correspond to X[1].
I can't manage to write a function that solves the problem completely; please help.
You can start by precomputing the distance matrix, as shown in this answer:
import numpy as np
X = np.array([84.04467948,60.,52.42447842,39.13555678])
Y = np.array([78.86529444,52.42447842,38.74910101,21.99846595])
dist = np.abs(X[:, np.newaxis] - Y)
Now you can compute the minimums along one axis (I chose 1 corresponding to finding the closest element of Y for every X):
potentialClosest = dist.argmin(axis=1)
This still may contain duplicates (in your case 2). To check for that, you can find all Y indices that appear in potentialClosest by use of np.unique:
closestFound, closestCounts = np.unique(potentialClosest, return_counts=True)
Now you can check for duplicates by checking whether closestFound.shape[0] == X.shape[0]. If so, you're golden and potentialClosest will contain your partners for every element in X. In your case 2, though, one element will occur twice, so closestFound will only have X.shape[0]-1 elements, and closestCounts will not contain only 1s but one 2. For all elements with count 1 the partner is already found. For the two candidates with count 2, though, you will have to choose the closer one, while the partner of the one with the larger distance will be the one element of Y which is not in closestFound. This can be found as:
missingPartnerIndex = np.where(
    np.in1d(np.arange(Y.shape[0]), closestFound) == False
)[0][0]
You can do the matching in a loop (even though there might be some nicer way using numpy). This solution is rather ugly but works. Any suggestions for improvements are very appreciated:
partners = np.empty_like(X, dtype=int)
nonClosePartnerFound = False
for i in np.arange(X.shape[0]):
    if closestCounts[closestFound == potentialClosest[i]][0] == 1:
        # A unique partner was found
        partners[i] = potentialClosest[i]
    else:
        # Partner is not unique
        if nonClosePartnerFound:
            partners[i] = potentialClosest[i]
        else:
            if np.argmin(dist[:, potentialClosest[i]]) == i:
                partners[i] = potentialClosest[i]
            else:
                partners[i] = missingPartnerIndex
                nonClosePartnerFound = True
print(partners)
This answer will only work if only one pair is not close. If that is not the case, you will have to define how to find the correct partner for multiple non-close elements. Sadly it's neither a very generic nor a very nice solution, but hopefully you will find it a helpful starting point.
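As an aside (not part of the answer above), SciPy can solve this one-to-one matching globally: scipy.optimize.linear_sum_assignment minimizes the total matched distance over the dist matrix computed earlier. A minimal sketch:

from scipy.optimize import linear_sum_assignment

# partners[i] is the Y index assigned to X[i]; the summed |X - Y| is minimal.
x_ind, partners = linear_sum_assignment(dist)
print(partners)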
Using this answer https://stackoverflow.com/a/8929827/3627387 and https://stackoverflow.com/a/12141207/3627387
FIXED
def find_closest(alist, target):
    return min(alist, key=lambda x: abs(x - target))

X = [84.04467948, 52.42447842, 39.13555678, 21.99846595]
Y = [78.86529444, 52.42447842, 38.74910101, 21.99846595]

def list_matching(list1, list2):
    list1_copy = list1[:]
    pairs = []
    for i, e in enumerate(list2):
        elem = find_closest(list1_copy, e)
        pairs.append([i, list1.index(elem)])
        list1_copy.remove(elem)
    return pairs
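For example, calling it on the arrays above should give pairs of [Y index, X index] like this (my illustration):

print(list_matching(X, Y))
# [[0, 0], [1, 1], [2, 2], [3, 3]]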
It seems like the best approach would be to pre-sort both arrays (O(n log n)) and then perform a merge-like traverse through both arrays. That's definitely faster than the O(n*n) you indicated in a comment.
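A minimal sketch of that idea (my illustration; it assumes the arrays pair up element-for-element once sorted):

import numpy as np

def match_by_sort(X, Y):
    # Pair the i-th smallest of Y with the i-th smallest of X: O(n log n) overall.
    x_order = np.argsort(X)
    y_order = np.argsort(Y)
    return [(int(yi), int(xi)) for yi, xi in zip(y_order, x_order)]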
The below simply prints the corresponding indices of the two arrays, as you have done in your question, since I'm not sure what output you want your function to give.
X1 = [84.04467948, 52.42447842, 39.13555678, 21.99846595]
Y1 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
X2 = [84.04467948, 60, 52.42447842, 39.13555678]
Y2 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
def find_closest(x_array, y_array):
    # Copy x_array as we will later remove an item with each iteration and
    # require the original later
    remaining_x_array = x_array[:]
    for y in y_array:
        differences = []
        for x in remaining_x_array:
            differences.append(abs(y - x))
        # min_index_remaining is the index position of the closest x value
        # to the given y in remaining_x_array
        min_index_remaining = differences.index(min(differences))
        # related_x is the closest x value of the given y
        related_x = remaining_x_array[min_index_remaining]
        print('Y[%s] corresponds to X[%s]' % (y_array.index(y), x_array.index(related_x)))
        # Remove the corresponding x value in remaining_x_array so it
        # cannot be selected twice
        remaining_x_array.pop(min_index_remaining)
This then outputs the following
find_closest(X1,Y1)
Y[0] corresponds to X[0]
Y[1] corresponds to X[1]
Y[2] corresponds to X[2]
Y[3] corresponds to X[3]
and
find_closest(X2,Y2)
Y[0] corresponds to X[0]
Y[1] corresponds to X[2]
Y[2] corresponds to X[3]
Y[3] corresponds to X[1]
Hope this helps.