scipy.optimize Constrained Minimization SLSQP - Cannot Match the target 100% - python

I am trying to find the minimizing weights that are as close as possible to their average. My current problem is that with the SLSQP solver I cannot find weights that meet the target 100%. Is there another solver I could use, or any suggestions on the math? Please help.
My math right now is
**min ∑|x - mean(x)|**
**s.t.** Ax - b = 0, x >= 0
**bounds:** 0.2 < x < 0.5
Data:

nRow#    Variable1    Variable2        Variable3
1        3582.00      233445193.00     559090945.00
2        3394.00      217344811.00     496500751.00
3        3356.00      237746918.00     493639029.00
4        3256.00      219204892.00     461547877.00
5        3415.00      225272825.00     501057960.00
6        3505.00      242819442.00     505073223.00
7        3442.00      215258725.00     490458632.00
8        3381.00      227681178.00     503102998.00
9        3392.00      215189377.00     487026744.00
         w1           w2               w3
Target   8531.00      429386951.00     1079115532.00
Question: find the weights that are as close as possible to their average while matching the targets.
Python Code:
import numpy as np
from scipy.optimize import minimize

A = np.array([[3582.000000, 3394.000000, 3356.000000, 3256.000000, 3415.000000,
               3505.000000, 3442.000000, 3381.000000, 3392.000000],
              [233445193.000000, 217344811.000000, 237746918.000000,
               219204892.000000, 225272825.000000, 242819442.000000,
               215258725.000000, 227681178.000000, 215189377.000000],
              [559090945.000000, 496500751.000000, 493639029.000000,
               461547877.000000, 501057960.000000, 505073223.000000,
               490458632.000000, 503102998.000000, 487026744.000000]])
b = np.array([8531, 1079115532, 429386951])
n = 9

def fsolveMin(A, b, n):
    # The constraints: Ax = b
    def cons(x):
        return A.dot(x) - b
    cons = ({'type': 'eq', 'fun': cons},
            {'type': 'ineq', 'fun': lambda x: x[0]})
    # The objective: the total absolute difference between the coefficients and their mean
    def fn(x):
        return np.sum(np.abs(x - np.mean(x)))
    # Initialize the coefficients randomly
    z0 = abs(np.random.randn(len(A[1, :])))
    # Set up bounds
    # bnds = [(0, None)]*n
    # Solve the problem
    sol = minimize(fn, x0=z0, constraints=cons, method='SLSQP', options={'disp': True})
    # expected 35%
    print(sol.x)
    print(A.dot(sol.x))
    # print(fn(sol.x))

print(str(fsolveMin(A, b, n)) + "\n\n")

To give you an idea of how bloated this gets with low-level tools like scipy's linprog, where we have to mimic its standard form:
The basic idea is to:
- add an auxiliary variable for the mean, like Erwin proposed
- add n auxiliary variables to handle the absolute values, as documented in lpsolve's docs
  (and I'm using n extra auxiliary variables as temporaries for x - mean)
Now this example works.
For your data you have to be careful with the bounds. Make sure the problem stays feasible. The problem itself is pretty unstable because hard constraints are used for Ax = b; usually you would minimize some norm / least-squares of Ax - b here (no longer an LP; QP/SOCP) and add this error to the objective instead!
You might need to switch the solver from method='simplex' to method='interior-point' at some point (the latter is only available since scipy 1.0).
Alternative:
With cvxpy, the formulation is much easier (for both variants mentioned) and you get quite a good solver (ECOS) for free.
Code:
import numpy as np
import scipy.optimize as spo
np.set_printoptions(linewidth=120)
np.random.seed(1)
""" Create random data """
M, N = 2, 3
A = np.random.random(size=(M,N))
x_hidden = np.random.random(size=N)
b = A.dot(x_hidden) # target
n = N
print('Original A')
print(A)
print('hidden x: ', x_hidden)
print('target: ', b)
""" Optimize as LP """
def solve(A, b, n):
    print('Reformulation')
    am, an = A.shape

    # Introduce aux-vars
    # 1: y = mean
    # n: z = x - mean
    # n: abs(z)
    n_plus_aux_vars = 3*n + 1

    # Equality constraint: y = mean
    eq_mean_A = np.zeros(n_plus_aux_vars)
    eq_mean_A[:n] = 1. / n
    eq_mean_A[n] = -1.
    eq_mean_b = np.array([0])

    print('y = mean A:')
    print(eq_mean_A)
    print('y = mean b:')
    print(eq_mean_b)

    # Equality constraints: Ax = b
    eq_A = np.zeros((am, n_plus_aux_vars))
    eq_A[:, :n] = A[:, :n]
    eq_b = np.copy(b)

    print('Ax=b A:')
    print(eq_A)
    print('Ax=b b:')
    print(eq_b)

    # Equality constraints: z = x - mean
    eq_mean_A_z = np.hstack([-np.eye(n), np.ones((n, 1)), + np.eye(n), np.zeros((n, n))])
    eq_mean_b_z = np.zeros(n)

    print('z = x - mean A:')
    print(eq_mean_A_z)
    print('z = x - mean b:')
    print(eq_mean_b_z)

    # Inequality constraints: absolute values -> x <= x' ; -x <= x'
    ineq_abs_0_A = np.hstack([np.zeros((n, n)), np.zeros((n, 1)), np.eye(n), -np.eye(n)])
    ineq_abs_0_b = np.zeros(n)

    ineq_abs_1_A = np.hstack([np.zeros((n, n)), np.zeros((n, 1)), -np.eye(n), -np.eye(n)])
    ineq_abs_1_b = np.zeros(n)

    # Bounds
    # REMARK: Do not touch anything besides the first bounds-row!
    bounds = [(0., 1.) for i in range(n)] + \
             [(None, None)] + \
             [(None, None) for i in range(n)] + \
             [(0, None) for i in range(n)]

    # Objective
    c = np.zeros(n_plus_aux_vars)
    c[-n:] = 1

    A_eq = np.vstack((eq_mean_A, eq_A, eq_mean_A_z))
    b_eq = np.hstack([eq_mean_b, eq_b, eq_mean_b_z])
    A_ineq = np.vstack((ineq_abs_0_A, ineq_abs_1_A))
    b_ineq = np.hstack([ineq_abs_0_b, ineq_abs_1_b])

    print('solve...')
    result = spo.linprog(c, A_ineq, b_ineq, A_eq, b_eq, bounds=bounds, method='simplex')
    print(result)

    x = result.x[:n]
    print('x: ', x)
    print('residual Ax-b: ', A.dot(x) - b)
    print('mean: ', result.x[n])
    print('x - mean: ', x - result.x[n])
    print('l1-norm(x - mean) / objective: ', np.linalg.norm(x - result.x[n], 1))

solve(A, b, n)
Output:
Original A
[[ 4.17022005e-01 7.20324493e-01 1.14374817e-04]
[ 3.02332573e-01 1.46755891e-01 9.23385948e-02]]
hidden x: [ 0.18626021 0.34556073 0.39676747]
target: [ 0.32663584 0.14366255]
Reformulation
y = mean A:
[ 0.33333333 0.33333333 0.33333333 -1. 0. 0. 0. 0. 0. 0. ]
y = mean b:
[0]
Ax=b A:
[[ 4.17022005e-01 7.20324493e-01 1.14374817e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 0.00000000e+00]
[ 3.02332573e-01 1.46755891e-01 9.23385948e-02 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 0.00000000e+00]]
Ax=b b:
[ 0.32663584 0.14366255]
z = x - mean A:
[[-1. -0. -0. 1. 1. 0. 0. 0. 0. 0.]
[-0. -1. -0. 1. 0. 1. 0. 0. 0. 0.]
[-0. -0. -1. 1. 0. 0. 1. 0. 0. 0.]]
z = x - mean b:
[ 0. 0. 0.]
solve...
fun: 0.078779576294411263
message: 'Optimization terminated successfully.'
nit: 10
slack: array([ 0.07877958, 0. , 0. , 0. , 0.07877958, 0. , 0.76273076, 0.68395118,
0.72334097])
status: 0
success: True
x: array([ 0.23726924, 0.31604882, 0.27665903, 0.27665903, -0.03938979, 0.03938979, 0. , 0.03938979,
0.03938979, 0. ])
x: [ 0.23726924 0.31604882 0.27665903]
residual Ax-b: [ 5.55111512e-17 0.00000000e+00]
mean: 0.276659030053
x - mean: [-0.03938979 0.03938979 0. ]
l1-norm(x - mean) / objective: 0.0787795762944
Now for your original data, things get tough!
You need to:
- scale your data, as the magnitude of your variables will hurt the solver
  (depending on your task you might need to analyze the effects of scaling / invert it)
- use the interior-point solver
- be careful with the bounds
Example:
A = np.array([[3582.000000, 3394.000000, 3356.000000, 3256.000000, 3415.000000,
               3505.000000, 3442.000000, 3381.000000, 3392.000000],
              [233445193.000000, 217344811.000000, 237746918.000000,
               219204892.000000, 225272825.000000, 242819442.000000,
               215258725.000000, 227681178.000000, 215189377.000000],
              [559090945.000000, 496500751.000000, 493639029.000000,
               461547877.000000, 501057960.000000, 505073223.000000,
               490458632.000000, 503102998.000000, 487026744.000000]])
b = np.array([8531., 1079115532., 429386951.])

A /= 10000.  # scaling
b /= 10000.  # scaling
bounds = [(-50., 50.) for i in range(n)] + \
...
result = spo.linprog(c, A_ineq, b_ineq, A_eq, b_eq, bounds=bounds, method='interior-point')
Output:
solve...
con: array([ 3.98410760e-09, 1.18067724e-08, 8.12879938e-04, 1.75969041e-03, -3.93853838e-09, -3.96305566e-09,
-4.10043555e-09, -3.94957667e-09, -3.88362764e-09, -3.89452381e-09, -3.95134592e-09, -3.92182287e-09,
-3.85762178e-09])
fun: 52.742900697626389
message: 'Optimization terminated successfully.'
nit: 8
slack: array([ 5.13245265e+01, 1.89309145e-08, 1.83429094e-09, 4.28687782e-09, 1.03726911e-08, 2.77000474e-09,
1.41837413e+00, 6.75769654e-09, 8.65285462e-10, 2.78501844e-09, 3.09591539e-09, 5.27429006e+01,
1.30944103e-08, 5.32994799e-09, 3.15369669e-08, 2.51943821e-09, 7.54848797e-09, 3.22510447e-09])
status: 0
success: True
x: array([ -2.51938304e+01, 4.68432810e-01, 2.68398831e+01, 4.68432822e-01, 4.68432815e-01, 4.68432832e-01,
-2.40754247e-01, 4.68432818e-01, 4.68432819e-01, 4.68432822e-01, -2.56622633e+01, -7.91749954e-09,
2.63714503e+01, 4.40376624e-09, -2.52137156e-09, 1.43834811e-08, -7.09187065e-01, 3.95395716e-10,
1.17990950e-09, 2.56622633e+01, 1.10134149e-08, 2.63714503e+01, 8.69064406e-09, 7.85131955e-09,
1.71534858e-08, 7.09187068e-01, 7.15309226e-09, 2.04519496e-09])
x: [-25.19383044 0.46843281 26.83988313 0.46843282 0.46843282 0.46843283 -0.24075425 0.46843282 0.46843282]
residual Ax-b: [ -1.18067724e-08 -8.12879938e-04 -1.75969041e-03]
mean: 0.468432821891
x - mean: [ -2.56622633e+01 -1.18805552e-08 2.63714503e+01 4.54189575e-10 -6.40499920e-09 1.04889573e-08 -7.09187069e-01
-3.52642715e-09 -2.67771227e-09]
l1-norm(x - mean) / objective: 52.7429006758
Edit
Here is a SOCP-based least-squares (soft-constrained) approach, which I would recommend in terms of numerical stability! This approach can and should be tuned for whatever you need. It's implemented with the already-mentioned cvxpy modelling tool, using the ECOS solver.
The basic idea:
- instead of: min l1-norm(x - mean(x)) s.t. Ax = b
- solve: min l2-norm(Ax - b) + c * l1-norm(x - mean(x)),
  where c is a nonnegative trade-off parameter:
  - small c: Ax = b is more important
  - big c: x - mean(x) is more important
Example code & output for your data and bounds of -50, 50 and c=1e-3:
import numpy as np
import cvxpy as cvx

""" DATA """
A = np.array([[3582.000000, 3394.000000, 3356.000000, 3256.000000, 3415.000000,
               3505.000000, 3442.000000, 3381.000000, 3392.000000],
              [233445193.000000, 217344811.000000, 237746918.000000,
               219204892.000000, 225272825.000000, 242819442.000000,
               215258725.000000, 227681178.000000, 215189377.000000],
              [559090945.000000, 496500751.000000, 493639029.000000,
               461547877.000000, 501057960.000000, 505073223.000000,
               490458632.000000, 503102998.000000, 487026744.000000]])
b = np.array([8531., 1079115532., 429386951.])
n = 9

# A /= 10000.  # scaling would be a good idea
# b /= 10000.

""" SOCP-based least-squares approach """
def solve(A, b, n, c=1e-1):
    x = cvx.Variable(n)
    y = cvx.Variable(1)  # mean

    lower_bounds = np.zeros(n) - 50  # -50
    upper_bounds = np.zeros(n) + 50  # 50

    constraints = []
    constraints.append(x >= lower_bounds)
    constraints.append(x <= upper_bounds)
    constraints.append(y == cvx.sum_entries(x) / n)

    objective = cvx.Minimize(cvx.norm(A*x - b, 2) + c * cvx.norm(x - y, 1))
    problem = cvx.Problem(objective, constraints)
    problem.solve(solver=cvx.ECOS, verbose=True)

    print('Objective: ', problem.value)
    print('x: ', x.T.value)
    print('mean: ', y.value)
    print('Ax-b: ')
    print((A*x - b).value)
    print('x - mean: ', (x - y).T.value)

solve(A, b, n)
Output:
ECOS 2.0.4 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS
It pcost dcost gap pres dres k/t mu step sigma IR | BT
0 +2.637e-17 -1.550e+06 +7e+08 1e-01 2e-04 1e+00 2e+07 --- --- 1 1 - | - -
1 -8.613e+04 -1.014e+05 +8e+06 1e-03 2e-06 2e+03 2e+05 0.9890 1e-04 2 1 1 | 0 0
2 -1.287e+03 -1.464e+03 +1e+05 1e-05 9e-08 4e+01 3e+03 0.9872 1e-04 3 1 1 | 0 0
3 +1.794e+02 +1.900e+02 +2e+03 2e-07 1e-07 1e+01 5e+01 0.9890 5e-03 5 3 4 | 0 0
4 -1.388e+00 -6.826e-01 +1e+02 1e-08 7e-08 9e-01 3e+00 0.9458 4e-03 7 6 6 | 0 0
5 +5.491e+00 +5.683e+00 +1e+01 1e-09 8e-09 2e-01 3e-01 0.9617 6e-02 1 1 1 | 0 0
6 +6.480e+00 +6.505e+00 +1e+00 2e-10 5e-10 3e-02 4e-02 0.8928 2e-02 1 1 1 | 0 0
7 +6.746e+00 +6.746e+00 +2e-02 3e-12 5e-10 5e-04 6e-04 0.9890 5e-03 1 0 0 | 0 0
8 +6.759e+00 +6.759e+00 +3e-04 2e-12 2e-10 6e-06 7e-06 0.9890 1e-04 1 0 0 | 0 0
9 +6.759e+00 +6.759e+00 +3e-06 2e-13 2e-10 6e-08 8e-08 0.9890 1e-04 2 0 0 | 0 0
10 +6.758e+00 +6.758e+00 +3e-08 5e-14 2e-10 7e-10 9e-10 0.9890 1e-04 1 0 0 | 0 0
OPTIMAL (within feastol=2.0e-10, reltol=4.7e-09, abstol=3.2e-08).
Runtime: 0.002901 seconds.
Objective: 6.757722879805085
x: [[-18.09169736 -5.55768047 11.12130645 11.48355878 -1.13982006
12.4290884 -3.00165819 -1.05158589 -2.4468432 ]]
mean: 0.416074272576
Ax-b:
[[ 2.17051777e-03]
[ 1.90734863e-06]
[ -5.72204590e-06]]
x - mean: [[-18.50777164 -5.97375474 10.70523218 11.0674845 -1.55589434
12.01301413 -3.41773246 -1.46766016 -2.86291747]]
This approach will always output a feasible solution (for our task), and you can then decide, based on the observed residual, whether it works for you.
As you observed, a lower bound of 0 is deadly in all formulations (look at the magnitude differences in your data!).
With a lower bound of 0 you will get a solution with a high residual error.
E.g. with:
c = 1e-7
bounds = (0, 15)
Output:
Objective: 785913288.2410747
x: [[ -5.57966858e-08 -4.74997454e-08 1.56066068e+00 1.68021234e-07
-3.55602958e-08 1.75340641e-06 -4.69609562e-08 -3.10216680e-08
-4.39482554e-08]]
mean: 0.173406926909
Ax-b:
[[ -3.29341696e+03]
[ -7.08072860e+08]
[ 3.41016903e+08]]
x - mean: [[-0.17340698 -0.17340697 1.38725375 -0.17340676 -0.17340696 -0.17340517
-0.17340697 -0.17340696 -0.17340697]]

First introduce a free variable mu with the constraint:
mu = sum(i, x(i))/n
Then introduce non-negative variables y(i) with:
-y(i) <= x(i) - mu <= y(i)
Now you can minimize
min sum(i,y(i))
This is now a straight LP (linear objective and linear constraints) and can be solved with any LP solver.
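Not part of the original answer: a minimal sketch of this reformulation, written with cvxpy (modern API, unlike the older-style cvxpy snippets above) so the auxiliary variables map one-to-one onto the description; as the other answer points out, the hard equality constraints together with tight bounds may make the problem infeasible for the actual data.
import numpy as np
import cvxpy as cp

def solve_lp(A, b, n, lb=0.2, ub=0.5):
    x = cp.Variable(n)
    mu = cp.Variable()                # free variable: the mean
    y = cp.Variable(n, nonneg=True)   # y(i) >= |x(i) - mu|
    constraints = [
        mu == cp.sum(x) / n,          # mu = sum(i, x(i)) / n
        x - mu <= y,                  # x(i) - mu <= y(i)
        mu - x <= y,                  # -(x(i) - mu) <= y(i)
        A @ x == b,                   # the hard equality constraints
        x >= lb, x <= ub,             # bounds from the question
    ]
    prob = cp.Problem(cp.Minimize(cp.sum(y)), constraints)
    prob.solve()
    return x.value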

Related

Computationally efficient way of maximizing a complicated multivariable function

I've been trying to find an efficient way of maximizing the following monster function in four variables but the program is taking ages to run and I'm not even sure if the results are correct. Can anyone help me code it better in Python?
Here's the function: the Gaussian likelihood that is built in func below, where
a = [p, q, r, s].
Y is the measured data sampled at 30 points.
Here's my code.
import numpy as np
import math
import scipy.optimize as spo  # needed for spo.differential_evolution below

Y = Y_t  # Y_t is a predefined column vector with 30 entries.
tstep = 0.05  # in s
N = 30
cov = np.zeros([30, 30])

def R(p, q, r, t):
    om_D = p*np.sqrt(1-q**2)
    return np.pi*r*(np.exp(-q*p*abs(t)))*(np.cos(om_D*t)+(q/(np.sqrt(1-q**2)))*(np.sin(om_D*abs(t))))/(2*q*(p**3))

def I(m, p):
    if m == p:
        return 1
    else:
        return 0

def func(a):
    a1 = a[0]  # natural angular frequency bounds=[3,20]
    a2 = a[1]  # damping ratio bounds=[0,1]
    a3 = a[2]  # psd of forcing signal bounds=[300,600]
    a4 = a[3]  # variance of noise bounds=[0,0.0001] in m
    # assuming uniform prior for a, we only have to maximise the likelihood function
    for i in range(30):
        for j in range(30):
            cov[i,j] += R(a1, a2, a3, (j-i)*tstep) + a4*I(i, j)
    P = ((2*np.pi)**(-N/2)) * ((np.linalg.det(cov))**(-0.5)) * np.exp((-0.5) * np.linalg.multi_dot([np.transpose(Y), np.linalg.inv(cov), Y]))
    return (-1)*P[0]

a_start = [5, 0.05, 100, 0.00001]
bnds = ((5, 20), (0, 1), (300, 600), (0, 0.0001))
result = spo.differential_evolution(func, bounds=bnds)
print(result.x)
There is an issue in the cov initialization; that is why it does not converge. There is also an issue with the bound for the damping ratio: it was (0, 1) and is now (0.0001, 0.999), because the ratio must not be exactly 0 or 1, otherwise there is a division-by-zero error in R(). The code is fixed now; see also the output.
Code
import time
import numpy as np
from scipy.optimize import differential_evolution

Y = [[-0.00445551], [-0.01164452], [-0.02171495], [-0.03475491], [-0.00770873], [ 0.0492236 ],
     [ 0.07264838], [ 0.03066707], [-0.02457141], [-0.04065968], [-0.01135125], [ 0.02677074], [ 0.06517749],
     [ 0.09611112], [ 0.12300657], [ 0.0923581 ], [ 0.03982604], [-0.01473844], [-0.09024497], [-0.14304097],
     [-0.17447606], [-0.16926952], [-0.12006193], [-0.00120763], [ 0.11006087], [ 0.19978283], [ 0.24388584],
     [ 0.18768875], [ 0.12844553], [ 0.03099409]]  # Y_t is a predefined column vector with 30 entries.
tstep = 0.05  # in s
N = 30

def R(p, q, r, t):
    om_D = p*np.sqrt(1-q**2)
    return np.pi*r*(np.exp(-q*p*abs(t)))*(np.cos(om_D*t)+(q/(np.sqrt(1-q**2)))*(np.sin(om_D*abs(t))))/(2*q*(p**3))

def I(m, p):
    if m == p:
        return 1
    else:
        return 0

def func(a):
    cov = np.zeros([N, N])
    a1 = a[0]  # natural angular frequency bounds=[3,20]
    a2 = a[1]  # damping ratio bounds=[0,1]
    a3 = a[2]  # psd of forcing signal bounds=[300,600]
    a4 = a[3]  # variance of noise bounds=[0,0.0001] in m
    # assuming uniform prior for a, we only have to maximise the likelihood function
    for i in range(N):
        for j in range(N):
            cov[i,j] += R(a1, a2, a3, (j-i)*tstep) + a4*I(i, j)
    P = ((2*np.pi)**(-N/2)) * ((np.linalg.det(cov))**(-0.5)) * np.exp((-0.5) * np.linalg.multi_dot([np.transpose(Y), np.linalg.inv(cov), Y]))
    return (-1)*P[0]

if __name__ == '__main__':
    t0 = time.perf_counter()
    a_start = [5, 0.05, 350, 0.00001]
    bnds = ((5, 20), (0.0001, 0.999), (300, 600), (0, 0.0001))
    result = differential_evolution(func, x0=a_start, bounds=bnds, maxiter=1000)
    print(result)
    print(f'elapse: {time.perf_counter() - t0:0.0f}s')
Output
fun: array([-2.76736878e+11])
jac: array([-2.91459845e+11, -4.55652161e+12, 1.27377279e+10, 3.34234132e+14])
message: 'Optimization terminated successfully.'
nfev: 3430
nit: 56
success: True
x: array([ 20. , 0.999, 300. , 0. ])
elapse: 55s
Scipy minimize is very fast
Changes:
from scipy.optimize import minimize
result = minimize(func, x0=a_start, bounds=bnds, options={'maxiter': 100, 'disp': True})
Output:
fun: array([-2.76736878e+11])
hess_inv: <4x4 LbfgsInvHessProduct with dtype=float64>
jac: array([-2.91459845e+11, -4.55652161e+12, 1.27377279e+10, 3.34234132e+14])
message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
nfev: 30
nit: 4
njev: 6
status: 0
success: True
x: array([ 20. , 0.999, 300. , 0. ])
elapse: 0.5s
Optuna
Optuna, after 1000 trials, gets there too.
The value is positive here because I use the maximize direction; in both scipy's DE and minimize the values have to be negated.
best param: {'a1': 20, 'a2': 0.9989999999999999, 'a3': 300, 'a4': 0.0}
best value: 276736878140.3103
best trial num: 73
elapse: 22s
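The Optuna code isn't shown above; here is a rough sketch of what such a study could look like (an assumption, not the original code), reusing the same func as in the scipy snippets and negating its return value because the study maximizes:
import optuna

def objective(trial):
    a1 = trial.suggest_float('a1', 5, 20)
    a2 = trial.suggest_float('a2', 0.0001, 0.999)
    a3 = trial.suggest_float('a3', 300, 600)
    a4 = trial.suggest_float('a4', 0, 0.0001)
    # func returns the negated likelihood as a length-1 array, so flip the sign back
    return -float(func([a1, a2, a3, a4])[0])

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=1000)
print('best param:', study.best_params)
print('best value:', study.best_value)
print('best trial num:', study.best_trial.number)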

Rank issue with spdiags in Python

I am currently trying to create a sparse matrix that will look like this.
[[ 50. -25. 0. 0.]
[-25. 50. -25. 0.]
[ 0. -25. 50. -25.]
[ 0. 0. -25. 50.]]
But when I run it through I keep getting the value error
'data array must have rank 2' in my data array.
I am positive it is a problem with my B variable. I have tried several things but nothing is working. Any advice?
import numpy as np
from scipy.sparse import spdiags

def sparse(a, b, N):
    h = (b-a)/(N+1)
    e = np.ones([N,1])/h**2
    B = np.array([e, -2*e, e])
    diags = np.array([-1, 0, 1])
    A = spdiags(B, diags, N, N).toarray()
    return A

print(sparse(0, 1, 4))
Just change to this:
import numpy as np
from scipy.sparse import spdiags
def sparse(a, b, N):
    h = (b - a) / (N + 1)
    e = np.ones(N) / h ** 2
    diags = np.array([-1, 0, 1])
    A = spdiags([-1 * e, 2 * e, -1 * e], diags, N, N).toarray()
    return A
print(sparse(0, 1, 4))
Output
[[ 50. -25.   0.   0.]
 [-25.  50. -25.   0.]
 [  0. -25.  50. -25.]
 [  0.   0. -25.  50.]]
The main change is from this:
e = np.ones([N,1])/h**2
to this:
e = np.ones(N) / h ** 2
Note that toarray transforms the sparse matrix into a dense one, from the documentation:
Return a dense ndarray representation of this matrix.
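As a side note (not part of the original answer), scipy.sparse.diags is often a bit more convenient than spdiags here, since each diagonal can be passed with its natural length:
import numpy as np
from scipy.sparse import diags

def sparse(a, b, N):
    h = (b - a) / (N + 1)
    e = np.ones(N) / h ** 2
    # off-diagonals have length N-1, the main diagonal has length N
    return diags([-e[:-1], 2 * e, -e[:-1]], [-1, 0, 1]).toarray()

print(sparse(0, 1, 4))
# [[ 50. -25.   0.   0.]
#  [-25.  50. -25.   0.]
#  [  0. -25.  50. -25.]
#  [  0.   0. -25.  50.]]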

Calculate pixelwise distance of 3D tensor in tensorflow?

I am trying to create a 3d distance map (size: W * H * D) in tensorflow to be used in a loss function for training. I have a ground truth (binary volume of size W * H * D) that I will use to create the distance map, i.e. the value of each pixel of my distance map will be the minimum distance of that pixel to the positive valued (i.e pixel=1) shape in the ground truth.
I am having issues with the 3D shape, since the L2 norm reduces along an axis and leaves a 2D shape, and with making the problem fully differentiable. Any advice or pointers would be much appreciated.
If I understand correctly, you want to compute the distance from each position in the volume to the closest position of a given class. For simplicity, I will assume that the interesting class is labelled with 1, but hopefully you can adapt it to your case if it is different. The code is for TensorFlow 2.0, but should work the same for 1.x.
The simplest way to do this is to compute the distance between all the coordinates in the volume against every coordinate with a 1, and then pick the smallest distance from there. You can do that like this:
import tensorflow as tf
# Make input data
w, h, d = 10, 20, 30
w, h, d = 2, 3, 4
t = tf.random.stateless_uniform([w, h, d], (0, 0), 0, 2, tf.int32)
print(t.numpy())
# [[[0 1 0 0]
# [0 0 0 0]
# [1 1 0 1]]
#
# [[1 0 0 0]
# [0 0 0 0]
# [1 1 0 0]]]
# Make coordinates
coords = tf.meshgrid(tf.range(w), tf.range(h), tf.range(d), indexing='ij')
coords = tf.stack(coords, axis=-1)
# Find coordinates that are positive
m = t > 0
coords_pos = tf.boolean_mask(coords, m)
# Find every pairwise distance
vec_d = tf.reshape(coords, [-1, 1, 3]) - coords_pos
# You may choose a difference precision type here
dists = tf.linalg.norm(tf.dtypes.cast(vec_d, tf.float32), axis=-1)
# Find minimum distances
min_dists = tf.reduce_min(dists, axis=-1)
# Reshape
out = tf.reshape(min_dists, [w, h, d])
print(out.numpy().round(3))
# [[[1. 0. 1. 2. ]
# [1. 1. 1.414 1. ]
# [0. 0. 1. 0. ]]
#
# [[0. 1. 1.414 2.236]
# [1. 1. 1.414 1.414]
# [0. 0. 1. 1. ]]]
This may work well enough for you, although it may not be the most efficient solution. The smartest thing would be to search for the closest positive position in the neighboring area of each position, but that is complicated to do effectively, both in general and more so in a vectorized way in TensorFlow.
There are however a couple of ways we can improve on the code above. On the one hand, we know that positions with a 1 will always have zero distance, so computing for those is unnecessary. On the other hand, if the 1 class in the 3D volume represents some kind of dense shape, then we could save some time if we only computed the distances against the surface of that shape. All other positive positions will necessarily have a greater distance to positions outside the shape. So we can do the same thing we were doing, but computing only distances from non-positive positions to positive surface positions. You can do that like this:
import tensorflow as tf
# Make input data
w, h, d = 10, 20, 30
w, h, d = 2, 3, 4
t = tf.dtypes.cast(tf.random.stateless_uniform([w, h, d], (0, 0)) > .15, tf.int32)
print(t.numpy())
# [[[1 1 1 1]
# [1 1 1 1]
# [1 1 0 0]]
#
# [[1 1 1 1]
# [1 1 1 1]
# [1 1 1 1]]]
# Find coordinates that are positive and on the surface
# (surrounded but at least one 0)
t_pad_z = tf.pad(t, [(1, 1), (1, 1), (1, 1)]) <= 0
m_pos = t > 0
m_surround_z = tf.zeros_like(m_pos)
# Go through the 6 surrounding positions
for i in range(3):
for s in [slice(None, -2), slice(2, None)]:
slices = tuple(slice(1, -1) if i != j else s for j in range(3))
m_surround_z |= t_pad_z.__getitem__(slices)
# Surface points are positive points surrounded by some zero
m_surf = m_pos & m_surround_z
coords_surf = tf.where(m_surf)
# Find coordinates that are zero
coords_z = tf.where(~m_pos)
# Find every pairwise distance
vec_d = tf.reshape(coords_z, [-1, 1, 3]) - coords_surf
dists = tf.linalg.norm(tf.dtypes.cast(vec_d, tf.float32), axis=-1)
# Find minimum distances
min_dists = tf.reduce_min(dists, axis=-1)
# Put minimum distances in output array
out = tf.scatter_nd(coords_z, min_dists, [w, h, d])
print(out.numpy().round(3))
# [[[0. 0. 0. 0.]
# [0. 0. 0. 0.]
# [0. 0. 1. 1.]]
#
# [[0. 0. 0. 0.]
# [0. 0. 0. 0.]
# [0. 0. 0. 0.]]]
EDIT: Here is one way in which you can divide the distance computations in chunks with a TensorFlow loop:
# Following from before
coords_surf = ...
coords_z = ...
CHUNK_SIZE = 1_000 # Choose chunk size
dtype = tf.float32
# If using TF 2.x you can know in advance the size of the tensor array
# (although the element shape will not be constant due to the last chunk)
num_z = tf.shape(coords_z)[0]
arr = tf.TensorArray(dtype, size=(num_z - 1) // CHUNK_SIZE + 1, element_shape=[None], infer_shape=False)
_, arr = tf.while_loop(
    lambda i, arr: i < num_z,
    lambda i, arr: (i + CHUNK_SIZE,
                    arr.write(i // CHUNK_SIZE,
                              tf.reduce_min(tf.linalg.norm(tf.dtypes.cast(
                                  tf.reshape(coords_z[i:i + CHUNK_SIZE], [-1, 1, 3]) - coords_surf,
                                  dtype), axis=-1), axis=-1))),
    [tf.constant(0, tf.int32), arr])
min_dists = arr.concat()
out = tf.scatter_nd(coords_z, min_dists, [w, h, d])

Scipy - Nan when calculating Mahalanobis distance

When I try to calculate the Mahalanobis distance with the following python code I get some Nan entries in the result. Do you have any insight about why this happens?
My data.shape = (181, 1500)
import numpy as np
from scipy.spatial.distance import pdist, squareform
data_log = np.log2(data + 1)  # A log transform that I usually apply to my data
data_centered = data_log - data_log.mean(0)  # zero centering
D = squareform(pdist(data_centered, 'mahalanobis'))
I also tried:
data_standard = data_centered / data_centered.std(0, ddof=1)
D = squareform( pdist(data_standard, 'mahalanobis' ) )
Also got nans.
The input is not corrupted and other distances, such as correlation distance, can be computed just fine.
For some reason when I reduce the number of features I stop getting Nans. E.g the following examples does not get any Nan:
D = squareform( pdist(data_centered[:,:200], 'mahalanobis' ) )
D = squareform( pdist(data_centered[:,180:480], 'mahalanobis' ) )
while those others get Nans:
D = squareform( pdist(data_centered[:,:300], 'mahalanobis' ) )
D = squareform( pdist(data_centered[:,180:600], 'mahalanobis' ) )
Any clue? Is this an expected behaviour if some condition for the input is not satisfied?
You have fewer observations than features, so the covariance matrix V computed by the scipy code is singular. The code doesn't check this, and blindly computes the "inverse" of the covariance matrix. Because this numerically computed inverse is basically garbage, the product (x-y)*inv(V)*(x-y) (where x and y are observations) might turn out to be negative. Then the square root of that value results in nan.
For example, this array also results in a nan:
In [265]: x
Out[265]:
array([[-1. , 0.5, 1. , 2. , 2. ],
[ 2. , 1. , 2.5, -1.5, 1. ],
[ 1.5, -0.5, 1. , 2. , 2.5]])
In [266]: squareform(pdist(x, 'mahalanobis'))
Out[266]:
array([[ 0. , nan, 1.90394328],
[ nan, 0. , nan],
[ 1.90394328, nan, 0. ]])
Here's the Mahalanobis calculation done "by hand":
In [279]: V = np.cov(x.T)
In theory, V is singular; the following value is effectively 0:
In [280]: np.linalg.det(V)
Out[280]: -2.968550671342364e-47
But inv doesn't see the problem, and returns an inverse:
In [281]: VI = np.linalg.inv(V)
Let's compute the distance between x[0] and x[2] and verify that we get the same non-nan value (1.9039) returned by pdist when we use VI:
In [295]: delta = x[0] - x[2]
In [296]: np.dot(np.dot(delta, VI), delta)
Out[296]: 3.625
In [297]: np.sqrt(np.dot(np.dot(delta, VI), delta))
Out[297]: 1.9039432764659772
Here's what happens when we try to compute the distance between x[0] and x[1]:
In [300]: delta = x[0] - x[1]
In [301]: np.dot(np.dot(delta, VI), delta)
Out[301]: -1.75
Then the square root of that value gives nan.
In scipy 0.16 (to be released in June 2015), you will get an error instead of nan or garbage. The error message describes the problem:
In [4]: x = array([[-1. , 0.5, 1. , 2. , 2. ],
...: [ 2. , 1. , 2.5, -1.5, 1. ],
...: [ 1.5, -0.5, 1. , 2. , 2.5]])
In [5]: pdist(x, 'mahalanobis')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-a3453ff6fe48> in <module>()
----> 1 pdist(x, 'mahalanobis')
/Users/warren/local_scipy/lib/python2.7/site-packages/scipy/spatial/distance.pyc in pdist(X, metric, p, w, V, VI)
1298 "singular. For observations with %d "
1299 "dimensions, at least %d observations "
-> 1300 "are required." % (m, n, n + 1))
1301 V = np.atleast_2d(np.cov(X.T))
1302 VI = _convert_to_double(np.linalg.inv(V).T.copy())
ValueError: The number of observations (3) is too small; the covariance matrix is singular. For observations with 5 dimensions, at least 6 observations are required.
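Not part of the original answer, but if you really need a Mahalanobis-like distance with fewer observations than features, one workaround is to pass a pseudo-inverse of the covariance matrix explicitly through the VI argument (or reduce the dimensionality first, e.g. with PCA). Note that this changes the metric's meaning and should be interpreted with care:
import numpy as np
from scipy.spatial.distance import pdist, squareform

# pseudo-inverse instead of the numerically meaningless inverse of a singular V
VI = np.linalg.pinv(np.cov(data_centered.T))
D = squareform(pdist(data_centered, 'mahalanobis', VI=VI))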

Python Scipy Error

import scipy.sparse.linalg as scial
import scipy.sparse as scisp
import numpy
def buildB(A, x, col_size_A):
    d = numpy.zeros(col_size_A)
    for index in xrange(col_size_A):
        d[index] = 2*x[index]-1
    tmp = scisp.spdiags(d, 0, col_size_A, col_size_A)
    return scisp.bmat([[A], [tmp]])

def buildQ(l, row_size_A):
    q = numpy.zeros(row_size_A)
    for index in xrange(row_size_A):
        q[index] = 2*l[index]
    return scisp.spdiags(q, 0, row_size_A, row_size_A)

def buildh(A, x, b, col_size_A):
    p = A.dot(x)
    p = numpy.subtract(p, b)
    quad = numpy.zeros(col_size_A)
    for index in xrange(col_size_A):
        quad[index] = x[index]*x[index]-x[index]
    return numpy.concatenate((p, quad))

def ini():
    A = numpy.array([[1, 1], [1, -1]])
    b = [1, 0]
    c = [1, 1]
    col_size_A = 2
    row_size_A = 2
    main(A, b, c, col_size_A, row_size_A)

def main(A, b, c, col_size_A, row_size_A):
    x = numpy.zeros(col_size_A)
    l = numpy.zeros(row_size_A*2)
    eps = 10e-6
    k = 0
    while True:
        B = buildB(A, x, col_size_A)
        Q = buildQ(l[row_size_A/2:row_size_A+1], col_size_A)
        Bt = B.transpose()
        h = buildh(A, x, b, col_size_A)
        g = numpy.add(c, Bt.dot(l))
        F = numpy.concatenate((g, h))
        print "Iteration " + str(k),
        tol = numpy.amax(F)
        print "- Tol " + str(tol)
        if tol < eps:
            print "Done"
            break
        tF = -numpy.concatenate((c, h))
        FGrad2 = scisp.csc_matrix(scisp.bmat([[Q, Bt], [B, None]]))
        print FGrad2
        print FGrad2.todense()
        print " "
        print tF
        xdelta = scial.spsolve(FGrad2, tF)
        print xdelta
        x = x + xdelta[0:col_size_A]
        l = x[col_size_A:]
        k = k + 1

if __name__ == "__main__":
    ini()
The output is:
(2, 0) 1.0
(3, 0) 1.0
(4, 0) -1.0
(2, 1) 1.0
(3, 1) -1.0
(5, 1) -1.0
(0, 2) 1.0
(1, 2) 1.0
(0, 3) 1.0
(1, 3) -1.0
(0, 4) -1.0
(1, 5) -1.0
[[ 0. 0. 1. 1. -1. 0.]
[ 0. 0. 1. -1. 0. -1.]
[ 1. 1. 0. 0. 0. 0.]
[ 1. -1. 0. 0. 0. 0.]
[-1. 0. 0. 0. 0. 0.]
[ 0. -1. 0. 0. 0. 0.]]
lda must be >= MAX(N,1): lda=2 N=3
BLAS error: Parameter number 7 passed to cblas_dtrsv had an invalid value
[-1. -1. 1. -0. -0. -0.]
So FGrad2 seems to be a valid csc matrix and tF a valid numpy.array.
What is wrong with this code? I don't even understand why the error appears before the print of tF, even though the error comes from spsolve, which is called later.
Edit
OK, I fixed that: the first guess for the parameters was wrong, leading to a singular matrix. But supplying a valid guess for l leads to a wrong result from spsolve.
As mentioned, I labeled all the output; as you can see, spsolve returns the wrong solution:
FGrad2 * xdelta != tF
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import scipy.sparse.linalg as scial
import scipy.sparse as scisp
import numpy

def buildB(A, x, col_size_A):
    d = numpy.zeros(col_size_A)
    for index in xrange(col_size_A):
        d[index] = 2*x[index]-1
    tmp = scisp.spdiags(d, 0, col_size_A, col_size_A)
    return scisp.bmat([[A], [tmp]])

def buildQ(l, row_size_A):
    q = numpy.zeros(row_size_A)
    for index in xrange(row_size_A):
        q[index] = 2*l[index]
    return scisp.spdiags(q, 0, row_size_A, row_size_A)

def buildh(A, x, b, col_size_A):
    p = A.dot(x)
    p = numpy.subtract(p, b)
    quad = numpy.zeros(col_size_A)
    for index in xrange(col_size_A):
        quad[index] = x[index]*x[index]-x[index]
    return numpy.concatenate((p, quad))

def ini():
    A = numpy.array([[1, 1], [1, 0]])
    b = [1, 0]
    c = [1, 1]
    col_size_A = 2
    row_size_A = 2
    main(A, b, c, col_size_A, row_size_A)

def main(A, b, c, col_size_A, row_size_A):
    x = numpy.zeros(col_size_A)
    x[0] = 0
    x[1] = 1
    l = numpy.ones(row_size_A*2)
    eps = 10e-6
    k = 0
    while True:
        B = buildB(A, x, col_size_A)
        Q = buildQ(l[row_size_A:], col_size_A)
        Bt = B.transpose()
        h = buildh(A, x, b, col_size_A)
        g = numpy.add(c, Bt.dot(l))
        F = numpy.concatenate((g, h))
        print "Iteration " + str(k),
        tol = numpy.amax(numpy.absolute(F))
        print "- Tol " + str(tol)
        if tol < eps:
            print "Done"
            print x
            break
        tF = -numpy.concatenate((c, h))
        FGrad2 = scisp.csc_matrix(scisp.bmat([[Q, Bt], [B, None]]))
        print "FGrad2"
        print FGrad2.todense()
        print "tF"
        print tF
        xdelta = scial.spsolve(FGrad2, tF)
        print "spsolution"
        print xdelta
        print ""
        x = x + xdelta[0:col_size_A]
        l = xdelta[col_size_A:]
        k = k + 1

if __name__ == "__main__":
    ini()
Output:
Iteration 0 - Tol 3.0
FGrad2
[[ 2. 0. 1. 1. -1. 0.]
[ 0. 2. 1. 0. 0. 1.]
[ 1. 1. 0. 0. 0. 0.]
[ 1. 0. 0. 0. 0. 0.]
[-1. 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0.]]
tF
[-1. -1. -0. -0. -0. -0.]
spsolution
[-1. -1. -0. -0. -0. -0.]
I think this is failing for you because your matrix is singular. E.g. convert to dense and use the regular numpy.linalg.solve:
>>> xdelta = numpy.linalg.solve(FGrad2.todense(), tF)
...
raise LinAlgError('Singular matrix')
numpy.linalg.linalg.LinAlgError: Singular matrix
The error I get is:
File "stack27538259.py", line 62, in main
xdelta = scial.spsolve(FGrad2,tF)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/dsolve/linsolve.py", line 143, in spsolve
b, flag, options=options)
RuntimeError: superlu failure (singular matrix?) at line 100 in file scipy/sparse/linalg/dsolve/SuperLU/SRC/dsnode_bmod.c
As xnx wrote, FGrad2 is singular.
np.linalg.det(FGrad2.todense()) # 0.0
(scipy version 0.14.0)
after the change I get:
/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/dsolve/linsolve.py:145: MatrixRankWarning: Matrix is exactly singular
and
spsolution
[ nan nan nan nan nan nan]
and an infinite loop, unless I add a counter on k and break.
Documentation for cblas_dtrsv can be found online. Accordingly:
- the routine solves a triangular system A*X = B (presumably)
- lda is the leading dimension of the array storing the triangular matrix A
- N is the order of the matrix A
- the error message says lda = 2 and N = 3, but lda must be >= MAX(N,1)
Perhaps this helps track down the problem.
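Not from the original answers, but one way to guard against this silently-wrong behaviour is to check the rank before calling spsolve and fall back to a least-squares solution. A sketch (the dense rank check is only practical for small systems like this one):
import numpy as np
import scipy.sparse.linalg as scial

def robust_solve(FGrad2, tF):
    dense = np.asarray(FGrad2.todense())
    if np.linalg.matrix_rank(dense) < dense.shape[0]:
        # singular system: spsolve may return nan/garbage, use least squares instead
        return scial.lsqr(FGrad2, tF)[0]
    return scial.spsolve(FGrad2, tF)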
