I've been trying to find an efficient way of maximizing the following monster function in four variables but the program is taking ages to run and I'm not even sure if the results are correct. Can anyone help me code it better in Python?
Here's the function:
where
a=[p,q,r,s].
Y is the measured data sampled at 30 points.
Here's my code.
import numpy as np
import math
import scipy.optimize as spo
Y=Y_t #Y_t is a predefined column vector with 30 entries.
tstep=0.05 #in s
N=30
cov=np.zeros([30,30])
def R(p,q,r,t):
    om_D=p*np.sqrt(1-q**2)
    return np.pi*r*(np.exp(-q*p*abs(t)))*(np.cos(om_D*t)+(q/(np.sqrt(1-q**2)))*(np.sin(om_D*abs(t))))/(2*q*(p**3))
def I(m,p):
    if m==p:
        return 1
    else:
        return 0
def func(a):
    a1=a[0] #natural angular frequency bounds=[3,20]
    a2=a[1] #damping ratio bounds=[0,1]
    a3=a[2] #psd of forcing signal bounds=[300,600]
    a4=a[3] #variance of noise bounds=[0,0.0001] in m
    #assuming uniform prior for a, we only have to maximise the likelihood function
    for i in range(30):
        for j in range(30):
            cov[i,j]+=R(a1,a2,a3,(j-i)*tstep)+a4*I(i,j)
    P=((2*np.pi)**(-N/2)) * ((np.linalg.det(cov))**(-0.5)) * np.exp((-0.5) *np.linalg.multi_dot([np.transpose(Y),np.linalg.inv(cov),Y]))
    return (-1)*P[0]
a_start=[5,0.05,100,0.00001]
bnds=((5,20),(0,1),(300,600),(0,0.0001))
result=spo.differential_evolution(func,bounds=bnds)
print(result.x)
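The problem is where cov is initialized: it is created once at module level, so every call to func adds to whatever the previous call left behind (cov[i,j] += ...), and the optimizer never sees a consistent objective. That is why it does not converge. Creating cov inside func fixes this. The bounds for the damping ratio were also a problem: with (0, 1), R() divides by zero at q = 0 (the 2*q*p**3 denominator) and at q = 1 (the sqrt(1-q**2) denominator), so I changed them to (0.0001, 0.999). The fixed code and its output are below.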
Code
import time
import numpy as np
from scipy.optimize import differential_evolution
Y = [[-0.00445551], [-0.01164452], [-0.02171495], [-0.03475491], [-0.00770873], [ 0.0492236 ],
[ 0.07264838], [ 0.03066707], [-0.02457141], [-0.04065968], [-0.01135125], [ 0.02677074], [ 0.06517749],
[ 0.09611112], [ 0.12300657], [ 0.0923581 ], [ 0.03982604], [-0.01473844], [-0.09024497], [-0.14304097],
[-0.17447606], [-0.16926952], [-0.12006193], [-0.00120763], [ 0.11006087], [ 0.19978283], [ 0.24388584],
[ 0.18768875], [ 0.12844553], [ 0.03099409]] #Y_t is a predefined column vector with 30 entries.
tstep = 0.05 #in s
N = 30
def R(p,q,r,t):
    om_D = p*np.sqrt(1-q**2)
    return np.pi*r*(np.exp(-q*p*abs(t)))*(np.cos(om_D*t)+(q/(np.sqrt(1-q**2)))*(np.sin(om_D*abs(t))))/(2*q*(p**3))
def I(m,p):
    if m==p:
        return 1
    else:
        return 0
def func(a):
    cov=np.zeros([N,N])
    a1=a[0] #natural angular frequency bounds=[3,20]
    a2=a[1] #damping ratio bounds=[0,1]
    a3=a[2] #psd of forcing signal bounds=[300,600]
    a4=a[3] #variance of noise bounds=[0,0.0001] in m
    #assuming uniform prior for a, we only have to maximise the likelihood function
    for i in range(N):
        for j in range(N):
            cov[i,j]+=R(a1,a2,a3,(j-i)*tstep)+a4*I(i,j)
    P=((2*np.pi)**(-N/2)) * ((np.linalg.det(cov))**(-0.5)) * np.exp((-0.5) *np.linalg.multi_dot([np.transpose(Y),np.linalg.inv(cov),Y]))
    return (-1)*P[0]
if __name__ == '__main__':
    t0 = time.perf_counter()
    a_start = [5, 0.05, 350, 0.00001]
    bnds = ((5, 20), (0.0001, 0.999), (300, 600), (0, 0.0001))
    result=differential_evolution(func, x0=a_start, bounds=bnds, maxiter=1000)
    print(result)
    print(f'elapse: {time.perf_counter() - t0:0.0f}s')
Output
fun: array([-2.76736878e+11])
jac: array([-2.91459845e+11, -4.55652161e+12, 1.27377279e+10, 3.34234132e+14])
message: 'Optimization terminated successfully.'
nfev: 3430
nit: 56
success: True
x: array([ 20. , 0.999, 300. , 0. ])
elapse: 55s
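The double loop and the explicit det/inv calls can be avoided, which should also help with run time. What follows is a minimal sketch, not the code that produced the timings above. It assumes the covariance depends only on the lag (j-i)*tstep, so the matrix can be built with scipy.linalg.toeplitz, and it minimizes the negative log-likelihood instead of -P, using np.linalg.slogdet and np.linalg.solve. The name neg_log_likelihood and its y argument (the measurements as a flat array) are my own choices for illustration.

```python
import numpy as np
from scipy.linalg import toeplitz


def R(p, q, r, t):
    """Autocovariance at lag t (same expression as in the code above)."""
    t = np.abs(t)
    om_d = p * np.sqrt(1 - q**2)
    return (np.pi * r * np.exp(-q * p * t)
            * (np.cos(om_d * t) + q / np.sqrt(1 - q**2) * np.sin(om_d * t))
            / (2 * q * p**3))


def neg_log_likelihood(a, y, tstep):
    """Negative Gaussian log-likelihood; y holds the N samples."""
    p, q, r, s = a
    y = np.asarray(y, dtype=float).ravel()
    n = y.size
    # R is even in the lag, so cov is a symmetric Toeplitz matrix;
    # the noise variance s is added on the diagonal.
    cov = toeplitz(R(p, q, r, np.arange(n) * tstep)) + s * np.eye(n)
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        return np.inf  # cov is numerically not positive definite
    quad = y @ np.linalg.solve(cov, y)
    return 0.5 * (n * np.log(2 * np.pi) + logdet + quad)
```

It can be handed to differential_evolution or minimize as lambda a: neg_log_likelihood(a, Y, tstep). Because the logarithm is monotonic, the parameters that maximize P are the same ones that minimize this function.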
Scipy minimize is very fast
Changes:
from scipy.optimize import minimize
result = minimize(func, x0=a_start, bounds=bnds, options={'maxiter': 100, 'disp': True})
Output:
fun: array([-2.76736878e+11])
hess_inv: <4x4 LbfgsInvHessProduct with dtype=float64>
jac: array([-2.91459845e+11, -4.55652161e+12, 1.27377279e+10, 3.34234132e+14])
message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
nfev: 30
nit: 4
njev: 6
status: 0
success: True
x: array([ 20. , 0.999, 300. , 0. ])
elapse: 0.5s
Optuna
Optuna reaches the same point after 1000 trials.
The value is positive here because I use the maximize direction. With scipy's differential_evolution and minimize the objective has to be negated.
best param: {'a1': 20, 'a2': 0.9989999999999999, 'a3': 300, 'a4': 0.0}
best value: 276736878140.3103
best trial num: 73
elapse: 22s
Related
I'm solving an optimization problem to do a constrained nonlinear regression on experimental data. I use scipy minimize and it works with the original data, but it doesn't work after a simple data transformation. For the transformed data I use the Excel Solver solution of the same problem as the initial guess, so it should work, but I can't figure out why it does not. Any help is appreciated. Thanks in advance.
Here is the code with the original data (works) and the transformation (does not work)
import numpy as np
from scipy.optimize import Bounds, minimize
def yp(x, time, mode = 'fit'):
    y1 = x[0] + x[1]*time
    y2 = x[2] + (x[3] - x[2])*np.exp(-x[5]*(time - x[4])/(x[3] - x[2]))
    comparison = time < x[4]
    yp = y1*comparison + y2*(~comparison)
    if mode == 'fit':
        return yp
    elif mode == 'calc':
        return y1, y2
    else:
        print('Unsupported mode, returning default behavior for fitting data')
        return yp
def objective(x, time, y):
    ypred = yp(x, time)
    z = sum((ypred - y)**2)
    return z
#***********************
#Original data
#***********************
data_x = np.array([0,30,60,90,120,150,180,210,240,270,300,330,360,420,480,540,600,720,840])
data_y = np.array([11.06468023,10.03242418,9.771158736,8.873720137,8.618127786,
8.397702515,7.581607582,7.131636821,6.537043245,6.358885017,
5.898468977,5.25275811,4.983989976,4.141791045,2.602472349,
2.07395813,1.078129376,0.551764193,0.480052971])
x0 = [11.5, -0.0211, 0.6, 3.26, 400, 0.01919]
lbound = [9, -0.1, 0.3, 1, 200, 0]
ubound = [14, -1e-5, 1, 4, 800, 0.1]
bounds = Bounds(lbound,ubound)
constraint = dict(type = 'ineq',
fun = lambda x: 0.1 - abs(x[0] + x[1]*x[4] - x[3]))
res = minimize(fun = objective,
x0 = x0,
args = (data_x, data_y),
method = 'SLSQP',
constraints = constraint,
options = {'disp':True},
bounds = bounds)
print(res)
Optimization terminated successfully (Exit mode 0)
Current function value: 0.6681037696841838
Iterations: 20
Function evaluations: 149
Gradient evaluations: 20
fun: 0.6681037696841838
jac: array([ 1.19826198e-03, 5.93313336e-01, 9.38165262e-02, 6.77183270e-04,
1.15633011e-05, -3.35602835e-02])
message: 'Optimization terminated successfully'
nfev: 149
nit: 20
njev: 20
status: 0
success: True
x: array([ 1.06185481e+01, -1.59476490e-02, 3.00000000e-01, 3.86162000e+00,
4.29964822e+02, 2.80661182e-02])
#***********************
#Transformed data
#***********************
data_y_rel = data_y/data_y[0]
x0_rel = [1, -0.00207571, 0.03, 0.359269446, 313.497571, 0.001970666]
lbound_rel = [1, -0.1, 0.03, 0.1, 200, 0]
ubound_rel = [1, -1e-5, 0.1, 0.4, 800, 0.1]
bounds_rel = Bounds(lbound_rel,ubound_rel)
constraint_rel = dict(type = 'ineq',
fun = lambda x: 0.01 - abs(x[0] + x[1]*x[4] - x[3]))
res_rel = minimize(fun = objective,
x0 = x0_rel,
args = (data_x, data_y_rel),
method = 'SLSQP',
constraints = constraint_rel,
options = {'disp':True},
bounds = bounds_rel)
print(res_rel)
Inequality constraints incompatible (Exit mode 4)
Current function value: 0.1593965203706159
Iterations: 1
Function evaluations: 7
Gradient evaluations: 1
fun: 0.1593965203706159
jac: array([ nan, -2.88985475e+02, -1.53672213e-01, -1.13128023e+00,
-1.58630125e-03, 5.45240970e+01])
message: 'Inequality constraints incompatible'
nfev: 7
nit: 1
njev: 1
status: 4
success: False
x: array([ 1.00000000e+00, -2.07571000e-03, 3.00000000e-02, 3.59269446e-01,
3.13497571e+02, 1.97066600e-03])
C:\Users\username\Anaconda3\lib\site-packages\scipy\optimize\_numdiff.py:519: RuntimeWarning: invalid value encountered in true_divide
J_transposed[i] = df / dx
Changing the method from 'SLSQP' to 'trust-constr' worked for me. 'COBYLA' is also an option.
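A possible explanation for why SLSQP fails here, offered as an assumption rather than a confirmed diagnosis: in the transformed problem the first parameter has identical lower and upper bounds (both 1), the first entry of jac is nan, and the warning is raised in the finite-difference code in _numdiff.py. If that is the cause, taking the pinned parameter out of the optimization vector avoids the zero-width interval. The sketch below does that. X0_FIXED, expand and objective_free are hypothetical helper names, and yp/objective are repeated (in shortened form) from the question so that the block runs on its own.

```python
import numpy as np
from scipy.optimize import Bounds, minimize


def yp(x, time):
    # Piecewise model from the question: linear before x[4], exponential after.
    y1 = x[0] + x[1] * time
    y2 = x[2] + (x[3] - x[2]) * np.exp(-x[5] * (time - x[4]) / (x[3] - x[2]))
    comparison = time < x[4]
    return y1 * comparison + y2 * (~comparison)


def objective(x, time, y):
    return np.sum((yp(x, time) - y) ** 2)


data_x = np.array([0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330,
                   360, 420, 480, 540, 600, 720, 840])
data_y = np.array([11.06468023, 10.03242418, 9.771158736, 8.873720137,
                   8.618127786, 8.397702515, 7.581607582, 7.131636821,
                   6.537043245, 6.358885017, 5.898468977, 5.25275811,
                   4.983989976, 4.141791045, 2.602472349, 2.07395813,
                   1.078129376, 0.551764193, 0.480052971])
data_y_rel = data_y / data_y[0]

X0_FIXED = 1.0  # assumption: x[0] is meant to stay at exactly 1


def expand(x_free):
    """Rebuild the full 6-parameter vector from the 5 free parameters."""
    return np.concatenate(([X0_FIXED], x_free))


def objective_free(x_free, time, y):
    return objective(expand(x_free), time, y)


# Same starting point and bounds as in the question, minus the first entry.
x0_free = [-0.00207571, 0.03, 0.359269446, 313.497571, 0.001970666]
bounds_free = Bounds([-0.1, 0.03, 0.1, 200, 0], [-1e-5, 0.1, 0.4, 800, 0.1])
# Indices shift by one: old x[1], x[4], x[3] are now x[0], x[3], x[2].
constraint_free = dict(type='ineq',
                       fun=lambda x: 0.01 - abs(X0_FIXED + x[0] * x[3] - x[2]))

res_free = minimize(objective_free, x0_free, args=(data_x, data_y_rel),
                    method='SLSQP', constraints=constraint_free,
                    bounds=bounds_free)
print(res_free.message)
print(expand(res_free.x))
```

Whether SLSQP then converges to a good fit still depends on the starting point and the constraint tolerance; the sketch only removes the fixed variable.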
I am trying to find weights that are as close as possible to their common average while still meeting the targets. My current problem is that with the SLSQP solver I cannot find weights that meet the targets 100%. Is there another solver I could use to solve my problem? Or any math suggestions? Please help.
My math right now is
**min(∑|x-mean(x)|)**
**s.t.** Aw-b=0, w>=0
**bound** 0.2 < x < 0.5
Data
Row#    Variable1   Variable2       Variable3
1       3582.00     233445193.00    559090945.00
2       3394.00     217344811.00    496500751.00
3       3356.00     237746918.00    493639029.00
4       3256.00     219204892.00    461547877.00
5       3415.00     225272825.00    501057960.00
6       3505.00     242819442.00    505073223.00
7       3442.00     215258725.00    490458632.00
8       3381.00     227681178.00    503102998.00
9       3392.00     215189377.00    487026744.00
        w1          w2              w3
Target  8531.00     429386951.00    1079115532.00
Question: Find the minimized weight that are closer to average
Python Code:
A = np.array([[3582.000000, 3394.000000, 3356.000000, 3256.000000, 3415.000000,
3505.000000, 3442.000000, 3381.000000, 3392.000000],
[233445193.000000, 217344811.000000, 237746918.000000,
219204892.000000, 225272825.000000, 242819442.000000,
215258725.000000, 227681178.000000, 215189377.000000],
[559090945.000000, 496500751.000000, 493639029.000000,
461547877.000000, 501057960.000000, 505073223.000000,
490458632.000000, 503102998.000000, 487026744.000000]])
b = np.array([8531, 1079115532, 429386951])
n=9
def fsolveMin(A,b,n):
    # The constraints: Ax = b
    def cons(x):
        return A.dot(x)-b
    cons = ({'type':'eq','fun':cons},
            {'type':'ineq','fun':lambda x:x[0]})
    # The minimizing constraints: the total absolute difference between the coefficients
    def fn(x):
        return np.sum(np.abs(x-np.mean(x)))
    # Initialize the coefficients randomly
    z0 = abs(np.random.randn(len(A[1,:])))
    # Set up bound
    # bnds = [(0, None)]*n
    # Solve the problem
    sol = minimize(fn, x0 = z0, constraints = cons, method = 'SLSQP', options={'disp': True})
    #expected 35%
    print(sol.x)
    print(A.dot(sol.x))
    #print(fn(sol.x))
print(str(fsolveMin(A,b,n))+"\n\n")
To give you an idea of how bloated this gets with low-level tools like scipy's linprog, where we have to mimic its standard form:
The basic idea is to:
add an auxiliary variable for the mean, as Erwin proposed
add n auxiliary variables to handle the absolute values, as documented in lpsolve's docs
(I'm also using n extra auxiliary variables as temporary variables for x - mean)
Now this example works.
For your data you have to be careful with the bounds: make sure the problem stays feasible. The problem itself is pretty unstable because hard constraints are used for Ax=b; usually you would minimize some norm / least-squares error here (no longer an LP, but a QP/SOCP) and add this error to the objective.
It might be necessary to switch the solver from method='simplex' to method='interior-point' at some point (only available since scipy 1.0).
Alternative:
Using cvxpy, formulation is much easier (both variants mentioned) and you get quite a good solver (ECOS) for free.
Code:
import numpy as np
import scipy.optimize as spo
np.set_printoptions(linewidth=120)
np.random.seed(1)
""" Create random data """
M, N = 2, 3
A = np.random.random(size=(M,N))
x_hidden = np.random.random(size=N)
b = A.dot(x_hidden) # target
n = N
print('Original A')
print(A)
print('hidden x: ', x_hidden)
print('target: ', b)
""" Optimize as LP """
def solve(A, b, n):
    print('Reformulation')
    am, an = A.shape
    # Introduce aux-vars
    # 1: y = mean
    # n: z = x - mean
    # n: abs(z)
    n_plus_aux_vars = 3*n + 1
    # Equality constraint: y = mean
    eq_mean_A = np.zeros(n_plus_aux_vars)
    eq_mean_A[:n] = 1. / n
    eq_mean_A[n] = -1.
    eq_mean_b = np.array([0])
    print('y = mean A:')
    print(eq_mean_A)
    print('y = mean b:')
    print(eq_mean_b)
    # Equality constraints: Ax = b
    eq_A = np.zeros((am, n_plus_aux_vars))
    eq_A[:, :n] = A[:, :n]
    eq_b = np.copy(b)
    print('Ax=b A:')
    print(eq_A)
    print('Ax=b b:')
    print(eq_b)
    # Equality constraints: z = x - mean
    eq_mean_A_z = np.hstack([-np.eye(n), np.ones((n, 1)), + np.eye(n), np.zeros((n, n))])
    eq_mean_b_z = np.zeros(n)
    print('z = x - mean A:')
    print(eq_mean_A_z)
    print('z = x - mean b:')
    print(eq_mean_b_z)
    # Inequality constraints: absolute values -> x <= x' ; -x <= x'
    ineq_abs_0_A = np.hstack([np.zeros((n, n)), np.zeros((n, 1)), np.eye(n), -np.eye(n)])
    ineq_abs_0_b = np.zeros(n)
    ineq_abs_1_A = np.hstack([np.zeros((n, n)), np.zeros((n, 1)), -np.eye(n), -np.eye(n)])
    ineq_abs_1_b = np.zeros(n)
    # Bounds
    # REMARK: Do not touch anything besides the first bounds-row!
    bounds = [(0., 1.) for i in range(n)] + \
             [(None, None)] + \
             [(None, None) for i in range(n)] + \
             [(0, None) for i in range(n)]
    # Objective
    c = np.zeros(n_plus_aux_vars)
    c[-n:] = 1
    A_eq = np.vstack((eq_mean_A, eq_A, eq_mean_A_z))
    b_eq = np.hstack([eq_mean_b, eq_b, eq_mean_b_z])
    A_ineq = np.vstack((ineq_abs_0_A, ineq_abs_1_A))
    b_ineq = np.hstack([ineq_abs_0_b, ineq_abs_1_b])
    print('solve...')
    result = spo.linprog(c, A_ineq, b_ineq, A_eq, b_eq, bounds=bounds, method='simplex')
    print(result)
    x = result.x[:n]
    print('x: ', x)
    print('residual Ax-b: ', A.dot(x) - b)
    print('mean: ', result.x[n])
    print('x - mean: ', x - result.x[n])
    print('l1-norm(x - mean) / objective: ', np.linalg.norm(x - result.x[n], 1))
solve(A, b, n)
Output:
Original A
[[ 4.17022005e-01 7.20324493e-01 1.14374817e-04]
[ 3.02332573e-01 1.46755891e-01 9.23385948e-02]]
hidden x: [ 0.18626021 0.34556073 0.39676747]
target: [ 0.32663584 0.14366255]
Reformulation
y = mean A:
[ 0.33333333 0.33333333 0.33333333 -1. 0. 0. 0. 0. 0. 0. ]
y = mean b:
[0]
Ax=b A:
[[ 4.17022005e-01 7.20324493e-01 1.14374817e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 0.00000000e+00]
[ 3.02332573e-01 1.46755891e-01 9.23385948e-02 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 0.00000000e+00]]
Ax=b b:
[ 0.32663584 0.14366255]
z = x - mean A:
[[-1. -0. -0. 1. 1. 0. 0. 0. 0. 0.]
[-0. -1. -0. 1. 0. 1. 0. 0. 0. 0.]
[-0. -0. -1. 1. 0. 0. 1. 0. 0. 0.]]
z = x - mean b:
[ 0. 0. 0.]
solve...
fun: 0.078779576294411263
message: 'Optimization terminated successfully.'
nit: 10
slack: array([ 0.07877958, 0. , 0. , 0. , 0.07877958, 0. , 0.76273076, 0.68395118,
0.72334097])
status: 0
success: True
x: array([ 0.23726924, 0.31604882, 0.27665903, 0.27665903, -0.03938979, 0.03938979, 0. , 0.03938979,
0.03938979, 0. ])
x: [ 0.23726924 0.31604882 0.27665903]
residual Ax-b: [ 5.55111512e-17 0.00000000e+00]
mean: 0.276659030053
x - mean: [-0.03938979 0.03938979 0. ]
l1-norm(x - mean) / objective: 0.0787795762944
Now for your original data, things get tough!
You need to:
scale your data as the magnitude of your variables will hurt the solver
depending on your task you might need to analyze the effects of scaling / invert it
use the interior-point solver
be careful with the bounds
Example:
A = np.array([[3582.000000, 3394.000000, 3356.000000, 3256.000000, 3415.000000,
3505.000000, 3442.000000, 3381.000000, 3392.000000],
[233445193.000000, 217344811.000000, 237746918.000000,
219204892.000000, 225272825.000000, 242819442.000000,
215258725.000000, 227681178.000000, 215189377.000000],
[559090945.000000, 496500751.000000, 493639029.000000,
461547877.000000, 501057960.000000, 505073223.000000,
490458632.000000, 503102998.000000, 487026744.000000]])
b = np.array([8531., 1079115532., 429386951.])
A /= 10000. # scaling
b /= 10000. # scaling
bounds = [(-50., 50.) for i in range(n)] + \
...
result = spo.linprog(c, A_ineq, b_ineq, A_eq, b_eq, bounds=bounds, method='interior-point')
Output:
solve...
con: array([ 3.98410760e-09, 1.18067724e-08, 8.12879938e-04, 1.75969041e-03, -3.93853838e-09, -3.96305566e-09,
-4.10043555e-09, -3.94957667e-09, -3.88362764e-09, -3.89452381e-09, -3.95134592e-09, -3.92182287e-09,
-3.85762178e-09])
fun: 52.742900697626389
message: 'Optimization terminated successfully.'
nit: 8
slack: array([ 5.13245265e+01, 1.89309145e-08, 1.83429094e-09, 4.28687782e-09, 1.03726911e-08, 2.77000474e-09,
1.41837413e+00, 6.75769654e-09, 8.65285462e-10, 2.78501844e-09, 3.09591539e-09, 5.27429006e+01,
1.30944103e-08, 5.32994799e-09, 3.15369669e-08, 2.51943821e-09, 7.54848797e-09, 3.22510447e-09])
status: 0
success: True
x: array([ -2.51938304e+01, 4.68432810e-01, 2.68398831e+01, 4.68432822e-01, 4.68432815e-01, 4.68432832e-01,
-2.40754247e-01, 4.68432818e-01, 4.68432819e-01, 4.68432822e-01, -2.56622633e+01, -7.91749954e-09,
2.63714503e+01, 4.40376624e-09, -2.52137156e-09, 1.43834811e-08, -7.09187065e-01, 3.95395716e-10,
1.17990950e-09, 2.56622633e+01, 1.10134149e-08, 2.63714503e+01, 8.69064406e-09, 7.85131955e-09,
1.71534858e-08, 7.09187068e-01, 7.15309226e-09, 2.04519496e-09])
x: [-25.19383044 0.46843281 26.83988313 0.46843282 0.46843282 0.46843283 -0.24075425 0.46843282 0.46843282]
residual Ax-b: [ -1.18067724e-08 -8.12879938e-04 -1.75969041e-03]
mean: 0.468432821891
x - mean: [ -2.56622633e+01 -1.18805552e-08 2.63714503e+01 4.54189575e-10 -6.40499920e-09 1.04889573e-08 -7.09187069e-01
-3.52642715e-09 -2.67771227e-09]
l1-norm(x - mean) / objective: 52.7429006758
Edit
Here is an SOCP-based least-squares (soft-constrained) approach, which I would recommend in terms of numerical stability. This approach can and should be tuned for whatever you need. It's implemented using the already mentioned cvxpy modelling tool with the ECOS solver.
The basic idea:
instead of: min(l1-norm(x - mean(x))) s.t. Ax=b
solve: min(l2-norm(Ax-b) + c * l1-norm(x - mean(x)))
where c is the nonnegative trade-off parameter
small c: Ax=b is more important
big c: x - mean(x) is more important
Example code & output for your data with bounds of -50, 50 and c=1e-1 (the default value in the code):
import numpy as np
import cvxpy as cvx
""" DATA """
A = np.array([[3582.000000, 3394.000000, 3356.000000, 3256.000000, 3415.000000,
3505.000000, 3442.000000, 3381.000000, 3392.000000],
[233445193.000000, 217344811.000000, 237746918.000000,
219204892.000000, 225272825.000000, 242819442.000000,
215258725.000000, 227681178.000000, 215189377.000000],
[559090945.000000, 496500751.000000, 493639029.000000,
461547877.000000, 501057960.000000, 505073223.000000,
490458632.000000, 503102998.000000, 487026744.000000]])
b = np.array([8531., 1079115532., 429386951.])
n = 9
# A /= 10000.  # scaling would be a good idea
# b /= 10000.
""" SOCP-based least-squares approach """
# Written against cvxpy < 1.0. In cvxpy 1.0 and later, sum_entries is
# called cvx.sum, and A @ x is the preferred spelling of A*x.
def solve(A, b, n, c=1e-1):
    x = cvx.Variable(n)
    y = cvx.Variable(1) # mean
    lower_bounds = np.zeros(n) - 50 # -50
    upper_bounds = np.zeros(n) + 50 # 50
    constraints = []
    constraints.append(x >= lower_bounds)
    constraints.append(x <= upper_bounds)
    constraints.append(y == cvx.sum_entries(x) / n)
    objective = cvx.Minimize(cvx.norm(A*x-b, 2) + c * cvx.norm(x - y, 1))
    problem = cvx.Problem(objective, constraints)
    problem.solve(solver=cvx.ECOS, verbose=True)
    print('Objective: ', problem.value)
    print('x: ', x.T.value)
    print('mean: ', y.value)
    print('Ax-b: ')
    print((A*x - b).value)
    print('x - mean: ', (x - y).T.value)
solve(A, b, n)
Output:
ECOS 2.0.4 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS
It pcost dcost gap pres dres k/t mu step sigma IR | BT
0 +2.637e-17 -1.550e+06 +7e+08 1e-01 2e-04 1e+00 2e+07 --- --- 1 1 - | - -
1 -8.613e+04 -1.014e+05 +8e+06 1e-03 2e-06 2e+03 2e+05 0.9890 1e-04 2 1 1 | 0 0
2 -1.287e+03 -1.464e+03 +1e+05 1e-05 9e-08 4e+01 3e+03 0.9872 1e-04 3 1 1 | 0 0
3 +1.794e+02 +1.900e+02 +2e+03 2e-07 1e-07 1e+01 5e+01 0.9890 5e-03 5 3 4 | 0 0
4 -1.388e+00 -6.826e-01 +1e+02 1e-08 7e-08 9e-01 3e+00 0.9458 4e-03 7 6 6 | 0 0
5 +5.491e+00 +5.683e+00 +1e+01 1e-09 8e-09 2e-01 3e-01 0.9617 6e-02 1 1 1 | 0 0
6 +6.480e+00 +6.505e+00 +1e+00 2e-10 5e-10 3e-02 4e-02 0.8928 2e-02 1 1 1 | 0 0
7 +6.746e+00 +6.746e+00 +2e-02 3e-12 5e-10 5e-04 6e-04 0.9890 5e-03 1 0 0 | 0 0
8 +6.759e+00 +6.759e+00 +3e-04 2e-12 2e-10 6e-06 7e-06 0.9890 1e-04 1 0 0 | 0 0
9 +6.759e+00 +6.759e+00 +3e-06 2e-13 2e-10 6e-08 8e-08 0.9890 1e-04 2 0 0 | 0 0
10 +6.758e+00 +6.758e+00 +3e-08 5e-14 2e-10 7e-10 9e-10 0.9890 1e-04 1 0 0 | 0 0
OPTIMAL (within feastol=2.0e-10, reltol=4.7e-09, abstol=3.2e-08).
Runtime: 0.002901 seconds.
Objective: 6.757722879805085
x: [[-18.09169736 -5.55768047 11.12130645 11.48355878 -1.13982006
12.4290884 -3.00165819 -1.05158589 -2.4468432 ]]
mean: 0.416074272576
Ax-b:
[[ 2.17051777e-03]
[ 1.90734863e-06]
[ -5.72204590e-06]]
x - mean: [[-18.50777164 -5.97375474 10.70523218 11.0674845 -1.55589434
12.01301413 -3.41773246 -1.46766016 -2.86291747]]
This approach will always output a feasible solution (for our task) and you can then decide on the observed residual if it works for you.
As you observed, a lower-bound of 0 is deadly, in all formulations (look at the magnitude difference in your data!).
Here a lower-bound of 0 will get you a solution with some high residual-error.
E.g.:
c=1e-7
bounds = 0 / 15
Output:
Objective: 785913288.2410747
x: [[ -5.57966858e-08 -4.74997454e-08 1.56066068e+00 1.68021234e-07
-3.55602958e-08 1.75340641e-06 -4.69609562e-08 -3.10216680e-08
-4.39482554e-08]]
mean: 0.173406926909
Ax-b:
[[ -3.29341696e+03]
[ -7.08072860e+08]
[ 3.41016903e+08]]
x - mean: [[-0.17340698 -0.17340697 1.38725375 -0.17340676 -0.17340696 -0.17340517
-0.17340697 -0.17340696 -0.17340697]]
First introduce a free variable mu with the constraint:
mu = sum(i, x(i))/n
Then introduce non-negative variables y(i) with:
-y(i) <= x(i) - mu <= y(i)
Now you can minimize
min sum(i,y(i))
This is now a straight LP (linear objective and linear constraints) and can be solved with any LP solver.
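The formulation above can be written down fairly directly for scipy.optimize.linprog. This is a sketch under stated assumptions: the equality constraints Ax = b and the box bounds on x are added from the question, method='highs' needs SciPy 1.6 or newer, and the function name min_l1_deviation, the variable layout [x, mu, y] and the small random test problem are all illustrative.

```python
import numpy as np
from scipy.optimize import linprog


def min_l1_deviation(A, b, lower=0.0, upper=1.0):
    """Minimize sum_i |x_i - mean(x)| subject to A x = b, lower <= x <= upper."""
    m, n = A.shape
    nvar = 2 * n + 1                      # layout: [x (n), mu (1), y (n)]
    c = np.zeros(nvar)
    c[n + 1:] = 1.0                       # objective: sum of y(i)

    # Equalities: A x = b and sum(x)/n - mu = 0
    A_eq = np.zeros((m + 1, nvar))
    A_eq[:m, :n] = A
    A_eq[m, :n] = 1.0 / n
    A_eq[m, n] = -1.0
    b_eq = np.concatenate([b, [0.0]])

    # Inequalities: x - mu - y <= 0 and -(x - mu) - y <= 0
    eye = np.eye(n)
    ones = np.ones((n, 1))
    A_ub = np.vstack([np.hstack([eye, -ones, -eye]),
                      np.hstack([-eye, ones, -eye])])
    b_ub = np.zeros(2 * n)

    # x is boxed, mu is free, y is non-negative
    bounds = [(lower, upper)] * n + [(None, None)] + [(0, None)] * n
    return linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                   bounds=bounds, method='highs')


# Small feasible test problem: b is generated from a hidden x in [0, 1).
rng = np.random.default_rng(1)
A = rng.random((2, 3))
x_hidden = rng.random(3)
b = A @ x_hidden

res = min_l1_deviation(A, b)
x = res.x[:3]
print('status:', res.status)
print('x:', x)
print('residual Ax-b:', A @ x - b)
print('objective:', res.fun)
```

Compared with a formulation that carries separate variables for x - mu, this layout needs only 2n + 1 variables, because the difference is substituted directly into the two inequality blocks.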
Some hypothetical example solving a nonlinear equation system with fsolve:
from scipy.optimize import fsolve
import math
def equations(p):
    x, y = p
    return (x+y**2-4, math.exp(x) + x*y - 3)
x, y = fsolve(equations, (1, 1))
print(equations((x, y)))
Is it somehow possible to solve it using scipy.optimize.brentq with some interval, e.g. [-1,1]? How does the unpacking work in that case?
As sascha suggested, constrained optimization is the easiest way to proceed. The least_squares method is convenient here: you can directly pass your equations to it, and it will minimize the sum of squares of its components.
from scipy.optimize import least_squares
res = least_squares(equations, (1, 1), bounds = ((-1, -1), (2, 2)))
The structure of bounds is ((min_first_var, min_second_var), (max_first_var, max_second_var)), or similarly for more variables.
The resulting object has a bunch of fields, shown below. The most relevant ones are: res.cost is essentially zero, which means a root was found; and res.x says what the root is: [ 0.62034453, 1.83838393]
active_mask: array([0, 0])
cost: 1.1745369255773682e-16
fun: array([ -1.47918522e-08, 4.01353883e-09])
grad: array([ 5.00239352e-11, -5.18964300e-08])
jac: array([[ 1. , 3.67676787],
[ 3.69795254, 0.62034452]])
message: '`gtol` termination condition is satisfied.'
nfev: 7
njev: 7
optimality: 8.3872972696740977e-09
status: 1
success: True
x: array([ 0.62034453, 1.83838393])
I am trying to use the scipy.optimize.minimize function. First, I am trying something very simple.
I define:
lik1 = lambda n,k,p: math.log(stats.binom.pmf(k,n,p))
I am trying to see if minimize will give me the correct MLE, which is, k/n == p.
Then I try:
optimize.minimize(lik1, 0.5, args=(10,2))
where I am assuming n == 10 and k == 2 and my guess for p (the argument x0) is 0.5. I get the following error:
fun: nan
hess_inv: array([[1]])
jac: array([ nan])
message: 'Desired error not necessarily achieved due to precision loss.'
nfev: 3
nit: 0
njev: 1
status: 2
success: False
x: array([ 0.5])
What am I doing wrong?
A few changes:
Select a more appropriate minimization method for this problem. The minimize function defaults to the BFGS method when no constraints or bounds are provided which is a method for unconstrained optimization. It fails because it tries to evaluate the function for values of p > 1. You could provide some reasonable bounds, or I've found here that using the TNC method works in this instance.
The order of the function arguments should be (p, n, k)
You want to maximize the log, or equivalently minimize the negative of the log.
Code:
import numpy as np
import scipy.stats
import scipy.optimize
# np.log replaces sp.log, which was only a deprecated alias of the NumPy function
lik1 = lambda p, n, k: -np.log(scipy.stats.binom.pmf(k, n, p))
res = scipy.optimize.minimize(lik1, 0.5, args=(10, 2), method='TNC')
print(res)
Output:
fun: array([ 1.19736175])
jac: array([ 1.22124533e-05])
message: 'Converged (|f_n-f_(n-1)| ~= 0)'
nfev: 10
nit: 4
status: 1
success: True
x: array([ 0.20000019])
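The first point, keeping p inside its valid range, can also be handled with explicit bounds. Here is a minimal sketch for the same n = 10, k = 2 example. It uses minimize_scalar with method='bounded' and binom.logpmf, neither of which appears in the answer above, and neg_log_lik is just an illustrative name.

```python
from scipy import optimize, stats


def neg_log_lik(p, n, k):
    # Negative binomial log-likelihood; logpmf avoids taking log of pmf.
    return -stats.binom.logpmf(k, n, p)


# Keep p strictly inside (0, 1) so the log-likelihood stays finite.
res = optimize.minimize_scalar(neg_log_lik, bounds=(1e-9, 1 - 1e-9),
                               args=(10, 2), method='bounded')
print(res.x)
```

Since there is only one parameter, a scalar minimizer is enough here; for several parameters the same idea would go through the bounds argument of minimize.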
When I try to calculate the Mahalanobis distance with the following python code I get some Nan entries in the result. Do you have any insight about why this happens?
My data.shape = (181, 1500)
from scipy.spatial.distance import pdist, squareform
data_log = log2(data + 1) # A log transform that I usually apply to my data
data_centered = data_log - data_log.mean(0) # zero centering
D = squareform( pdist(data_centered, 'mahalanobis' ) )
I also tried:
data_standard = data_centered / data_centered.std(0, ddof=1)
D = squareform( pdist(data_standard, 'mahalanobis' ) )
Also got nans.
The input is not corrupted and other distances, such as correlation distance, can be computed just fine.
For some reason, when I reduce the number of features I stop getting NaNs. E.g. the following examples do not produce any NaN:
D = squareform( pdist(data_centered[:,:200], 'mahalanobis' ) )
D = squareform( pdist(data_centered[:,180:480], 'mahalanobis' ) )
while these do produce NaNs:
D = squareform( pdist(data_centered[:,:300], 'mahalanobis' ) )
D = squareform( pdist(data_centered[:,180:600], 'mahalanobis' ) )
Any clue? Is this an expected behaviour if some condition for the input is not satisfied?
You have fewer observations than features, so the covariance matrix V computed by the scipy code is singular. The code doesn't check this, and blindly computes the "inverse" of the covariance matrix. Because this numerically computed inverse is basically garbage, the product (x-y)*inv(V)*(x-y) (where x and y are observations) might turn out to be negative. Then the square root of that value results in nan.
For example, this array also results in a nan:
In [265]: x
Out[265]:
array([[-1. , 0.5, 1. , 2. , 2. ],
[ 2. , 1. , 2.5, -1.5, 1. ],
[ 1.5, -0.5, 1. , 2. , 2.5]])
In [266]: squareform(pdist(x, 'mahalanobis'))
Out[266]:
array([[ 0. , nan, 1.90394328],
[ nan, 0. , nan],
[ 1.90394328, nan, 0. ]])
Here's the Mahalanobis calculation done "by hand":
In [279]: V = np.cov(x.T)
In theory, V is singular; the following value is effectively 0:
In [280]: np.linalg.det(V)
Out[280]: -2.968550671342364e-47
But inv doesn't see the problem, and returns an inverse:
In [281]: VI = np.linalg.inv(V)
Let's compute the distance between x[0] and x[2] and verify that we get the same non-nan value (1.9039) returned by pdist when we use VI:
In [295]: delta = x[0] - x[2]
In [296]: np.dot(np.dot(delta, VI), delta)
Out[296]: 3.625
In [297]: np.sqrt(np.dot(np.dot(delta, VI), delta))
Out[297]: 1.9039432764659772
Here's what happens when we try to compute the distance between x[0] and x[1]:
In [300]: delta = x[0] - x[1]
In [301]: np.dot(np.dot(delta, VI), delta)
Out[301]: -1.75
Then the square root of that value gives nan.
In scipy 0.16 (to be released in June 2015), you will get an error instead of nan or garbage. The error message describes the problem:
In [4]: x = array([[-1. , 0.5, 1. , 2. , 2. ],
...: [ 2. , 1. , 2.5, -1.5, 1. ],
...: [ 1.5, -0.5, 1. , 2. , 2.5]])
In [5]: pdist(x, 'mahalanobis')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-a3453ff6fe48> in <module>()
----> 1 pdist(x, 'mahalanobis')
/Users/warren/local_scipy/lib/python2.7/site-packages/scipy/spatial/distance.pyc in pdist(X, metric, p, w, V, VI)
1298 "singular. For observations with %d "
1299 "dimensions, at least %d observations "
-> 1300 "are required." % (m, n, n + 1))
1301 V = np.atleast_2d(np.cov(X.T))
1302 VI = _convert_to_double(np.linalg.inv(V).T.copy())
ValueError: The number of observations (3) is too small; the covariance matrix is singular. For observations with 5 dimensions, at least 6 observations are required.
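The singularity can be checked before calling pdist, and one possible workaround is to pass an explicit pseudo-inverse through the VI argument. This is a sketch, not what scipy does internally: np.linalg.matrix_rank and np.linalg.pinv are my additions, and whether pseudo-inverse distances are meaningful for a given analysis is a separate question.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Same 3 x 5 example as above: fewer observations than features.
x = np.array([[-1.0, 0.5, 1.0, 2.0, 2.0],
              [2.0, 1.0, 2.5, -1.5, 1.0],
              [1.5, -0.5, 1.0, 2.0, 2.5]])

V = np.cov(x.T)
rank = np.linalg.matrix_rank(V)
print('rank of V:', rank, 'of', V.shape[0])  # rank below 5 means V is singular

# Workaround (assumption: a pseudo-inverse is acceptable for the task).
VI = np.linalg.pinv(V)
D = squareform(pdist(x, 'mahalanobis', VI=VI))
print(D)
```

With so few observations the pseudo-inverse distances are not very informative (here all three pairs come out equal), so collecting more observations or reducing the number of features is usually the better fix.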