Context:
My goal is to create a Python 3 program that performs differential operations on a vector V of size N. I did so, tested it for basic operations (differentiation, gradient, ...), and it works.
On that basis I tried to write more complex equations (Navier-Stokes, Orr-Sommerfeld, ...) and to validate my work by computing the eigenvalues of these equations.
As these eigenvalues were completely unexpected, I simplified my problem, and I am currently trying to compute the eigenvalues of the differentiation matrix alone (see below). But the results seem wrong...
Thanks in advance for your help, because I cannot find a solution to my problem...
Definition of DM:
I use the Chebyshev spectral method to differentiate vectors.
I use the following Chebyshev package (translated from Matlab to Python):
http://dip.sun.ac.za/%7Eweideman/research/differ.html
That package allows me to create a differentiation matrix DM, obtained with:
nodes, DM = chebyshev.chebdiff(N, maximal_order)
To obtain the 1st, 2nd, 3rd... order differentiation, I write for example:
dVdx1 = np.dot(DM[0,:,:], V)
d2Vdx2 = np.dot(DM[1,:,:], V)
d3Vdx3 = np.dot(DM[2,:,:], V)
I tested that and it works.
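For example, a quick sanity check of this kind can be written as follows (a minimal sketch, assuming chebdiff returns the nodes and the stacked matrices as described above):
import numpy as np
import chebyshev  # Weideman/Reddy package mentioned above

N, max_order = 20, 2
nodes, DM = chebyshev.chebdiff(N, max_order)

V = nodes**3                                # sample f(x) = x^3 on the Chebyshev nodes
dVdx1 = np.dot(DM[0, :, :], V)              # numerical f'(x)
print(np.max(np.abs(dVdx1 - 3*nodes**2)))   # expected to be close to machine precision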
I've built different operators based on that differentiation process.
I've tried to validate them by finding their eigenvalues. That did not go well, so right now I am trying with DM only.
I do not manage to find the right eigenvalues of DM.
I've tried with different functions:
numpy.linalg.eigvals(DM)
scipy.linalg.eig(DM)
scipy.sparse.linalg.eigs(DM)
sympy.solve( (DM - x*np.eye(N)).det(), x ) [for small sizes only]
Why I use scipy.sparse.LinearOperator:
I do not want to use the matrix DM directly, so I wrapped it in a function which performs the differentiation (see code below), like this:
dVdx1 = derivative(V)
The reason why I do that comes from the global project itself.
This is useful for more complicated equations.
Creating such a function prevents me from using the matrix DM directly to find its eigenvalues (because DM stays inside the function).
For that reason, I use a scipy.sparse.linalg.LinearOperator to wrap my method derivative() and use it as an input to scipy.sparse.linalg.eigs().
Code and results:
Here is the code to compute these eigenvalues:
import numpy as np
import scipy
import scipy.linalg
import sympy
from scipy.sparse.linalg import aslinearoperator
from scipy.sparse.linalg import eigs
from scipy.sparse.linalg import LinearOperator
import chebyshev

N = 20  # should be 4, 20, 50, 100, 300
max_order = 4

option = 1

# option 1: building the differentiation matrix DM for a given order
if option == 1:
    if 0:
        # usage of the chebyshev package; I saved the resulting matrix to a file
        nodes, DM = chebyshev.chebdiff(N, max_order)
        order = 1
        DM = DM[order-1, :, :]
        #outfile = TemporaryFile()
        np.save('DM20', DM)
    if 1:
        # loading the matrix from the file
        # uncomment depending on N
        #DM = np.load('DM4.npy')
        DM = np.load('DM20.npy')
        #DM = np.load('DM50.npy')
        #DM = np.load('DM100.npy')
        #DM = np.load('DM300.npy')

# option 2: building a random matrix
elif option == 2:
    j = 1j
    np.random.seed(0)
    Real = np.random.random((N, N)) - 0.5
    Im = np.random.random((N, N)) - 0.5
    # If I want DM symmetric:
    #Real = np.dot(Real, Real.T)
    #Im = np.dot(Im, Im.T)
    DM = Real + j*Im
    # If I want DM singular:
    #DM[0,:] = DM[1,:]

# Test DM symmetric
print('Is DM symmetric ? \n', (DM.transpose() == DM).all())
# Test DM Hermitian
print('Is DM hermitian ? \n', (DM.transpose().real == DM.real).all() and
      (DM.transpose().imag == -DM.imag).all())

# building a linear operator which wraps matrix DM
def derivative(v):
    return np.dot(DM, v)

linop_DM = LinearOperator((N, N), matvec=derivative)

# building a linear operator directly from matrix DM with aslinearoperator
aslinop_DM = aslinearoperator(DM)

# comparison of LinearOperator and direct dot product
V = np.random.random((N))
diff_lo = linop_DM.matvec(V)
diff_mat = np.dot(DM, V)
# diff_lo and diff_mat are equal

# FINDING EIGENVALUES

# number of eigenvalues to find
k = 1

if 1:
    # SCIPY SPARSE LINALG LINEAR OPERATOR
    vals_sparse, vecs = scipy.sparse.linalg.eigs(linop_DM, k, which='SR',
                                                 maxiter=10000,
                                                 tol=1E-3)
    vals_sparse = np.sort(vals_sparse)
    print('\nEigenvalues (scipy.sparse.linalg Linear Operator) : \n', vals_sparse)

if 1:
    # SCIPY SPARSE ARRAY
    vals_sparse2, vecs2 = scipy.sparse.linalg.eigs(DM, k, which='SR',
                                                   maxiter=10000,
                                                   tol=1E-3)
    vals_sparse2 = np.sort(vals_sparse2)
    print('\nEigenvalues (scipy.sparse.linalg with matrix DM) : \n', vals_sparse2)

if 1:
    # SCIPY SPARSE AS LINEAR OPERATOR
    vals_sparse3, vecs3 = scipy.sparse.linalg.eigs(aslinop_DM, k, which='SR',
                                                   maxiter=10000,
                                                   tol=1E-3)
    vals_sparse3 = np.sort(vals_sparse3)
    print('\nEigenvalues (scipy.sparse.linalg AS linear Operator) : \n', vals_sparse3)

if 0:
    # NUMPY LINALG / SAME RESULT AS SCIPY LINALG
    vals_np = np.linalg.eigvals(DM)
    vals_np = np.sort(vals_np)
    print('\nEigenvalues (numpy.linalg) : \n', vals_np)

if 1:
    # SCIPY LINALG
    vals_sp = scipy.linalg.eig(DM)
    vals_sp = np.sort(vals_sp[0])
    print('\nEigenvalues (scipy.linalg.eig) : \n', vals_sp)

if 0:
    # SYMPY (for small N only)
    x = sympy.Symbol('x')
    D = sympy.Matrix(DM)
    print('\ndet D (sympy):', D.det())
    E = D - x*np.eye(DM.shape[0])
    eig_sympy = sympy.solve(E.det(), x)
    print('\nEigenvalues (sympy) : \n', eig_sympy)
Here are my results (for N=20):
Is DM symmetric ?
False
Is DM hermitian ?
False
Eigenvalues (scipy.sparse.linalg Linear Operator) :
[-2.5838015+0.j]
Eigenvalues (scipy.sparse.linalg with matrix DM) :
[-2.58059801+0.j]
Eigenvalues (scipy.sparse.linalg AS linear Operator) :
[-2.36137671+0.j]
Eigenvalues (scipy.linalg.eig) :
[-2.92933791+0.j -2.72062839-1.01741142j -2.72062839+1.01741142j
-2.15314244-1.84770128j -2.15314244+1.84770128j -1.36473659-2.38021351j
-1.36473659+2.38021351j -0.49536645-2.59716913j -0.49536645+2.59716913j
0.38136094-2.53335888j 0.38136094+2.53335888j 0.55256471-1.68108134j
0.55256471+1.68108134j 1.26425751-2.25101241j 1.26425751+2.25101241j
2.03390489-1.74122287j 2.03390489+1.74122287j 2.57770573-0.95982011j
2.57770573+0.95982011j 2.77749810+0.j ]
The values returned by scipy.sparse.linalg should be included among the ones found by scipy/numpy, which is not the case (same for sympy).
I've tried with different random matrices instead of DM (see option 2): symmetric, non-symmetric, real, imaginary, etc., of small size N (4, 5, 6, ...) and also bigger ones (100, ...).
That worked.
I have also changed parameters like 'which' (LM, SM, LR, ...), 'tol' (1E-3, 1E-6, ...), 'maxiter' and 'sigma' (0): scipy.sparse.linalg.eigs always worked for random matrices but never for my matrix DM. In the best cases, the eigenvalues found are close to the ones found by scipy, but they never match.
I really do not know what is so particular about my matrix.
I also don't know why using scipy.sparse.linalg.eigs with a matrix, a LinearOperator or an aslinearoperator gives different results.
I do not know how I could attach my files containing the matrices DM...
For N = 4 :
[[ 3.16666667 -4. 1.33333333 -0.5 ]
[ 1. -0.33333333 -1. 0.33333333]
[-0.33333333 1. 0.33333333 -1. ]
[ 0.5 -1.33333333 4. -3.16666667]]
Every idea is welcome.
Perhaps a moderator could tag my question with:
scipy.sparse.linalg.eigs / weideman / eigenvalues / scipy.eig / scipy.sparse.linalg.LinearOperator
Geoffroy.
I spoke with a few colleagues and partly solved my problem.
My conclusion is that my matrix is simply very ill-conditioned...
In my project, I can simplify my matrix by imposing boundary conditions as follows:
DM[0,:] = 0
DM[:,0] = 0
DM[N-1,:] = 0
DM[:,N-1] = 0
which produces a matrix similar to that for N=4:
[[ 0 0 0 0]
[ 0 -0.33333333 -1. 0]
[ 0 1. 0.33333333 0]
[ 0 0 0 0]]
By using such conditions, I obtain eigenvalues from scipy.sparse.linalg.eigs that are equal to the ones from scipy.linalg.eig.
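For reference, a minimal sketch of that comparison (with DM loaded as in the code above):
import numpy as np
import scipy.linalg
import scipy.sparse.linalg

# impose the boundary conditions on DM
N = DM.shape[0]
DM[0, :] = 0
DM[:, 0] = 0
DM[N-1, :] = 0
DM[:, N-1] = 0

vals_dense = np.sort(scipy.linalg.eig(DM)[0])
vals_sparse, _ = scipy.sparse.linalg.eigs(DM, k=1, which='SR', maxiter=10000, tol=1E-3)
print(vals_sparse)       # now agrees with one of the dense eigenvalues
print(vals_dense[:3])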
I also tried using Matlab, and it returns the same values.
To continue my work, I actually need to solve a generalized eigenvalue problem of the standard form
λ B x = DM x
It seems that this does not work in my case because of my matrix B (which represents a Laplacian operator matrix).
If you have a similar problem, I advise you to visit that question:
https://scicomp.stackexchange.com/questions/10940/solving-a-generalised-eigenvalue-problem
(I think that) the matrix B needs to be positive definite to use scipy.sparse.linalg.eigs.
A solution would be to change B, to use scipy.linalg.eig or to use Matlab.
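For the dense route, scipy.linalg.eig accepts a second matrix and solves the generalized problem directly (a small sketch; B here is the Laplacian-operator matrix mentioned above):
import scipy.linalg

# Solves DM x = lambda * B x without requiring B to be positive definite
w, v = scipy.linalg.eig(DM, B)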
I will confirm that later.
EDIT:
I wrote a solution to the Stack Exchange question I posted above which explains how I solved my problem.
It appears that scipy.sparse.linalg.eigs indeed has a bug if matrix B is not positive definite, and will return bad eigenvalues.
Related
I am trying to solve the following simple optimization problem: find the global minimum variance portfolio, i.e. the portfolio that minimizes variance subject to a single constraint: the weights must sum to one.
Optimization program: minimize w' Σ w subject to 1'w = 1, where Σ is the covariance matrix, w the weight vector and 1 a vector of ones.
This problem has a well-known closed-form solution: w = inv(Σ) 1 / (1' inv(Σ) 1).
I'm trying to reproduce the results using CVXopt in Python, and I encounter a puzzling issue.
I have a 10x10 sample covariance matrix.
If I solve the problem with the entire 10x10 matrix, the output is incorrect: the "sum-to-one" constraint is not respected, and the weights differ from the closed-form solution.
If I solve the problem with a 7x7 subsample of the same matrix, the output is correct: the "sum-to-one" constraint is respected, and the weights match the closed-form solution.
Actually, with any subsample of size 7x7 or lower, it works. But for any subsample of size 8x8 or higher, it does not work anymore. I can't grasp where the problem is coming from.
Thank you for your help!
Here is a sample code
%reset -sf #Clear Environment
import numpy as np
import cvxopt
from cvxopt import matrix as dmatrix
from cvxopt.solvers import qp, options
from numpy.linalg import inv
from numpy import ones
from numpy import transpose as t
# Sample 10x10 covariance matrix
sigmafull = np.array([[0.01449082, 0.00846992, 0.00846171, 0.00773097, 0.00878925,
0.00748843, 0.00672341, 0.00665912, 0.0068593 , 0.00827341],
[0.00846992, 0.00952205, 0.00766057, 0.00726647, 0.00781524,
0.00672368, 0.00642426, 0.00609368, 0.00617965, 0.00704281],
[0.00846171, 0.00766057, 0.00842194, 0.00700168, 0.00772423,
0.0061137 , 0.00612574, 0.00601041, 0.00621007, 0.00712152],
[0.00773097, 0.00726647, 0.00700168, 0.00687784, 0.00726901,
0.00573606, 0.00567145, 0.00556391, 0.00575279, 0.00660916],
[0.00878925, 0.00781524, 0.00772423, 0.00726901, 0.00860462,
0.00612804, 0.0061301 , 0.00603605, 0.00630947, 0.0075281 ],
[0.00748843, 0.00672368, 0.0061137 , 0.00573606, 0.00612804,
0.00634431, 0.0054793 , 0.00513665, 0.00511852, 0.00575049],
[0.00672341, 0.00642426, 0.00612574, 0.00567145, 0.0061301 ,
0.0054793 , 0.0055722 , 0.0050824 , 0.00512499, 0.00576934],
[0.00665912, 0.00609368, 0.00601041, 0.00556391, 0.00603605,
0.00513665, 0.0050824 , 0.00521583, 0.00510142, 0.00576414],
[0.0068593 , 0.00617965, 0.00621007, 0.00575279, 0.00630947,
0.00511852, 0.00512499, 0.00510142, 0.00547566, 0.00603528],
[0.00827341, 0.00704281, 0.00712152, 0.00660916, 0.0075281 ,
0.00575049, 0.00576934, 0.00576414, 0.00603528, 0.00756009]])
# sigma = sigmafull[0:8,0:8] # With this subsample, output is incorrect (n=8, and for all n>=8)
sigma = sigmafull[0:7,0:7]   # With this subsample, output is correct (n=7, and for all n<=7)
n=len(sigma)
sigma = dmatrix(sigma) #Formatting sigma to be a dense matrix for cvxopt
mu = dmatrix(np.zeros(n)) #We just want to minimize variance, hence vector of zeroes
#Format of the equality constraint : Ax = b
#We want the sum of x to be equal to 1
Amatrix = dmatrix(ones(n)).T #Vector of ones
bmatrix = dmatrix(1.0) #Scalar = 1
sol = qp(sigma, mu, None, None, A=Amatrix, b=bmatrix) #No inequality constraint
w_gmv = (inv(sigma) @ ones(n)) / (t(ones(n)) @ inv(sigma) @ ones(n)) # Analytical solution, which indeed sums to 1
print(t(np.array(sol['x'])) - w_gmv) #If Vector of zeroes -> Weights are equivalent -> OK
print(sum(np.array(sol['x']))) #If equal to 1 -> OK
I'm trying to find approximate nonzero solutions of M @ x = 0 using SVD in scipy, where M is a complex-valued 4x4 matrix.
First a toy example:
import numpy as np
import scipy.linalg

M = np.array([
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, -1, 0, 0],
    [0, 0, 0, 1e-10]
])
U, s, Vh = scipy.linalg.svd(M)
print(s)        # [2.57554368e+00 1.49380718e+00 3.67579714e-01 7.07106781e-11]
print(Vh[-1])   # [ 0.00000000e+00  2.77555756e-16 -7.07106781e-01  7.07106781e-01]
print(np.linalg.norm(M @ Vh[-1]))  # 7.07106781193738e-11
So in this case, the smallest (last) value in s is very small, and the corresponding last row Vh[-1] (the last column of V) is the approximate solution to M @ x = 0, where M @ Vh[-1] is also very small, roughly the same order as s[-1].
Now the real example which doesn't work the same way:
M = np.array([[ 1.68572560e-01-3.98053448e-02j, 5.61165939e-01-1.22638499e-01j,
3.39625823e-02-1.16216469e+00j, 2.65140034e-06-4.10296457e-06j],
[ 4.17991622e-01+1.33504182e-02j, -4.79190633e-01-2.08562169e-01j,
4.87429517e-01+3.68070222e-01j, -3.63710538e-05+6.43912577e-06j],
[-2.18353842e+06-4.20344071e+05j, -2.52806647e+06-2.08794519e+05j,
-2.01808847e+06-1.96246695e+06j, -5.77147300e-01-3.12598394e+00j],
[-3.03044160e+05-6.45842521e+04j, -6.85879183e+05+2.07045473e+05j,
6.14194217e+04-1.28864668e+04j, -7.08794838e+00+9.70230041e+00j]])
U, s, Vh = scipy.linalg.svd(M)
print(s) # [4.42615634e+06 5.70600901e+05 4.68468171e-01 5.21600592e-13]
print(Vh[-1]) # [-5.35883825e-05+0.00000000e+00j 3.74712739e-05-9.89288566e-06j 4.03111556e-06+7.59306578e-06j -8.20834667e-01+5.71165865e-01j]
print(np.linalg.norm(M @ Vh[-1]))  # 35.950705194666476
What's going on here? s[-1] is very small, so M @ x = 0 should have an approximate solution in principle, but Vh[-1] doesn't look like a solution. Is this an issue with M and Vh being complex? A numerical stability/accuracy issue? Something else?
I'd really like to figure out what x would give M @ x with roughly the same order of magnitude as s[-1]; please let me know of any way to solve this.
You forgot the conjugate transpose
The decomposition given by the SVD satisfies np.allclose(M, U @ np.diag(s) @ Vh). If s[-1] is small, it means that the last column of U @ np.diag(s) ~ M @ np.linalg.inv(Vh) ~ M @ Vh.T.conj() is small. So you can use:
M @ Vh[-1].T.conj()  # [-7.77136331e-14-3.74441041e-13j,
                     #   4.67810503e-14+3.45797987e-13j,
                     #  -2.84217094e-14-1.06581410e-14j,
                     #   7.10542736e-15+3.10862447e-15j]
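A quick way to check this numerically (a small sketch using the matrices from the question):
import numpy as np

x = Vh[-1].conj()             # last column of V, i.e. the conjugate of the last row of Vh
print(np.linalg.norm(M @ x))  # ~ s[-1], so x is the approximate null vector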
I am trying to use Python (and at present failing) to come to a more efficient solution than Excel Solver provides for an optimization problem.
Matrices
The problem is of the form AB = C → D, where AB produces C, and the absolute value of C - D for each row of the matrix is to be minimized.
I have seven funds, contained in matrix B, all of which have geographic exposures of the form
FUND_NAME = np.array([UK, USA, EuroZone, Japan, EM, Apac])
as below:
RLS = np.array([0.788743177, 0.168048481,0,0.043208342,0,0])
LIOGLB=np.array([0.084313978,0.578528092,0,0.23641746,0.033709666,0.067030804])
LIONEUR=np.array([0.055032339,0,0,0.944967661,0,0])
STEW_WLDWD=np.array([0.09865472,0.210582713,0.053858632,0.431968002,0.086387178,0.118548755])
EMMK=np.array([0.080150377,0.025212864,0.597285513,0.031832241,0.212440426,0.053078578])
PAC=np.array([0,0.013177633,0.41273195,0,0.510644775,0.063445642])
PICTET=np.array([0.089520913,0.635857603,0,0.218148413,0.023290413,0.033182659])
From this I need to construct an optimal weighting of the seven funds using a weight vector (imaginatively named A) [x1, x2, x3, x4, x5, x6, x7] with x1 + x2 + ... + x7 = 1, and also, for i = 1..7:
xi lower bound = 0
xi upper bound = 0.25
The goal is to arrive at actual regional weights (matrix C) as close as possible to the Target array below (which corresponds to matrix D above):
Target=np.array([0.2310,0.2576,0.1047,0.1832,0.1103,0.1131])
I've tried using linprog, but I know that the answer I am getting is wrong.
from scipy.optimize import linprog

Funds = np.array([RLS, LIOGLB, LIONEUR, STEW_WLDWD, EMMK, PAC, PICTET])
twentyfive = np.full((1, 7), 0.25)
bounds = [0, 0.25]
res = linprog(Target, A_ub=Funds, b_ub=twentyfive, bounds=[bounds])
Can anyone help me move on from Excel?
This is really a LAD regression problem (LAD = Least Absolute Deviation) with some side constraints. Different LP formulations for LAD regression problems can be found here. Based on the sparse bounding formulation, we can state the LP model:
This is the mathematical model I am going to solve with linprog. The coloring is as follows: blue symbols represent data, red symbols are the decision variables. x are the allocations (fractions) we need to find, d are the residuals of the linear fit, and r are the absolute values of d.
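Spelled out in plain text (as reconstructed from the code below), the model is roughly:

    min   sum_j r[j]
    s.t.  sum_i B[j,i] * x[i] + d[j] = target[j]   for every region j
          sum_i x[i] = 1
          -r[j] <= d[j] <= r[j]
          0 <= x[i] <= 0.25,   d[j] free,   r[j] >= 0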
linprog requires an explicit LP matrix. For the model above, the constraint matrix has the block structure [[B, I, 0], [1', 0, 0]] for the equality part and [[0, -I, -I], [0, I, -I]] for the inequality part (see the np.block calls below).
With this it is no longer very difficult to develop a Python implementation. The Python code can look like:
import numpy as np
import scipy.optimize as sp

# Exposures transposed so that columns correspond to funds: B is (6 regions) x (7 funds)
B = np.array([[0.788743177, 0.168048481,0,0.043208342,0,0],
              [0.084313978,0.578528092,0,0.23641746,0.033709666,0.067030804],
              [0.055032339,0,0,0.944967661,0,0],
              [0.09865472,0.210582713,0.053858632,0.431968002,0.086387178,0.118548755],
              [0.080150377,0.025212864,0.597285513,0.031832241,0.212440426,0.053078578],
              [0,0.013177633,0.41273195,0,0.510644775,0.063445642],
              [0.089520913,0.635857603,0,0.218148413,0.023290413,0.033182659]]).T

target = np.array([0.2310,0.2576,0.1047,0.1832,0.1103,0.1131])

m, n = np.shape(B)   # m regions, n funds

# Decision variables are stacked as [x (n), d (m), r (m)].
# Equalities: B x + d = target, and sum(x) = 1.
A_eq = np.block([[B, np.eye(m), np.zeros((m,m))],
                 [np.ones(n), np.zeros(m), np.zeros(m)]])
# Inequalities: -d - r <= 0 and d - r <= 0, i.e. |d| <= r.
A_ub = np.block([[np.zeros((m,n)), -np.eye(m), -np.eye(m)],
                 [np.zeros((m,n)),  np.eye(m), -np.eye(m)]])
b_eq = np.block([target, 1])
b_ub = np.zeros(2*m)
# Objective: minimize the sum of r (the absolute residuals).
c = np.block([np.zeros(n), np.zeros(m), np.ones(m)])
# Bounds: 0 <= x <= 0.25, d free, r >= 0.
bnd = n*[(0, 0.25)] + m*[(None, None)] + m*[(0, None)]

res = sp.linprog(c, A_ub, b_ub, A_eq, b_eq, bnd, options={'disp': True})
allocation = res.x[0:n]
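Given the variable ordering used above, the fitted residuals and their absolute values can be read off the same solution vector, e.g. d = res.x[n:n+m] and r = res.x[n+m:].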
The results look like:
Primal Feasibility Dual Feasibility Duality Gap Step Path Parameter Objective
1.0 1.0 1.0 - 1.0 6.0
0.3777262386888 0.3777262386888 0.3777262386888 0.6478228594143 0.3777262386888 0.3200496644143
0.08438152300367 0.08438152300366 0.08438152300367 0.8087424108466 0.08438152300366 0.1335722585582
0.01563291142478 0.01563291142478 0.01563291142478 0.8341722620104 0.01563291142478 0.1118298108651
0.004083901923022 0.004083901923022 0.004083901923023 0.7432737130498 0.004083901923024 0.1049630948572
0.0006190254179117 0.0006190254179117 0.0006190254179116 0.8815177164943 0.000619025417913 0.1016021916581
3.504935403199e-05 3.504935403066e-05 3.504935403079e-05 0.9676694788778 3.504935402756e-05 0.1012177893279
5.983549975387e-07 5.98354980932e-07 5.983549810074e-07 0.9885372873161 5.983549719474e-07 0.1011921413019
3.056236812029e-11 3.056401712736e-11 3.056394819773e-11 0.9999489201822 3.056087926755e-11 0.1011915586046
Optimization terminated successfully.
Current function value: 0.101192
Iterations: 8
print(allocation)
[2.31621461e-01 2.50000000e-01 9.07425872e-12 2.50000000e-01
4.45030949e-10 2.39692743e-01 2.86857955e-02]
I wrote some code to implement the modified Gram-Schmidt process. When I tested it on real matrices, it is correct. However, when I tested it on complex matrices, it went wrong.
I believe my code is correct, based on a step-by-step check. Therefore, I wonder if there are numerical reasons why the modified Gram-Schmidt process fails on complex vectors.
Following is the code:
import numpy as np

def modifiedGramSchmidt(A):
    """
    Gives an orthonormal matrix, using the modified Gram-Schmidt procedure
    :param A: a matrix of column vectors
    :return: a matrix of orthonormal column vectors
    """
    # assuming A is a square matrix
    dim = A.shape[0]
    Q = np.zeros(A.shape, dtype=A.dtype)
    for j in range(0, dim):
        q = A[:, j]
        for i in range(0, j):
            rij = np.vdot(q, Q[:, i])
            q = q - rij*Q[:, i]
        rjj = np.linalg.norm(q, ord=2)
        if np.isclose(rjj, 0.0):
            raise ValueError("invalid input matrix")
        else:
            Q[:, j] = q/rjj
    return Q
Following is the test code:
import numpy as np

# If testing on random matrices:
# X = np.random.rand(dim, dim)*10 + np.random.rand(dim, dim)*5 * 1j

# If testing on some good one
v1 = np.array([1, 0, 1j]).reshape((3, 1))
v2 = np.array([-1, 1j, 1]).reshape((3, 1))
v3 = np.array([0, -1, 1j+1]).reshape((3, 1))
X = np.hstack([v1, v2, v3])
dim = X.shape[0]  # dimension of the (square) test matrix

Y = modifiedGramSchmidt(X)
Y3 = np.linalg.qr(X, mode="complete")[0]

if np.isclose(Y3.conj().T.dot(Y3), np.eye(dim, dtype=complex)).all():
    print("The QR-complete gives orthonormal vectors")

if np.isclose(Y.conj().T.dot(Y), np.eye(dim, dtype=complex)).all():
    print("The Gram Schmidt process is tested against a random matrix")
else:
    print("But My modified GS goes wrong!")
    print(Y.conj().T.dot(Y))
Update
The problem is that I implemented an algorithm designed for an inner product that is linear in the first argument, whereas I thought it was linear in the second argument.
Thanks @landogardner
Your problem is to do with how numpy.vdot handles complex numbers — the complex conjugate of the first argument is used for the calculation (see the docs). So you're calculating rij as q*.Q[:,i] instead of q.Q[:,i]*. Just swap the order of the args:
rij = np.vdot(Q[:,i], q)
This got the test code working for me.
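To see the conjugation behaviour in isolation (a tiny sketch):
import numpy as np

a = np.array([1j, 0])
b = np.array([1j, 0])
print(np.vdot(a, b))   # (1+0j): conjugates the first argument, conj(a).b
print(np.dot(a, b))    # (-1+0j): plain a.b, no conjugation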
I am implementing some basic linear equation solvers in Python.
I have currently implemented forward and backward substitution for triangular systems of equations (so very straightforward to solve!), but the precision of the solutions becomes very poor even with systems of about 50 equations (a 50x50 coefficient matrix).
The following code performs the forward/backward substitution:
FORWARD_SUBSTITUTION = 1
BACKWARD_SUBSTITUTION = 2

def solve_triang_subst(A: np.ndarray, b: np.ndarray,
                       substitution=FORWARD_SUBSTITUTION) -> np.ndarray:
    """Solves a triangular system via forward or backward substitution.

    A must be triangular. FORWARD_SUBSTITUTION means A should be
    lower-triangular, BACKWARD_SUBSTITUTION means A should be upper-triangular.
    """
    rows = len(A)
    x = np.zeros(rows, dtype=A.dtype)
    row_sequence = reversed(range(rows)) if substitution == BACKWARD_SUBSTITUTION else range(rows)
    for row in row_sequence:
        delta = b[row] - np.dot(A[row], x)
        cur_x = delta / A[row][row]
        x[row] = cur_x
    return x
I am using numpy and 64-bit floats.
Simple Testing Tool
I have set up a simple test suite which generates coefficient matrices and x vectors, computes b, and then uses forward or backward substitution to recover the x, comparing it to its known value for validity.
The following code performs these checks:
import numpy as np
import scipy.linalg as sp_la

RANDOM_SEED = 1984
np.random.seed(RANDOM_SEED)

def check(sol: np.ndarray, x_gt: np.ndarray, description: str) -> None:
    if not np.allclose(sol, x_gt, rtol=0.1):
        print("Found inaccurate solution:")
        print(sol)
        print("Ground truth (not achieved...):")
        print(x_gt)
        raise ValueError("{} did not work!".format(description))

def fuzz_test_solving():
    N_ITERATIONS = 100
    # refine_result toggles the iterative-refinement variant described below
    # (the snippet of solve_triang_subst shown above omits that parameter)
    refine_result = True
    for mode in [FORWARD_SUBSTITUTION, BACKWARD_SUBSTITUTION]:
        print("Starting mode {}".format(mode))
        for iteration in range(N_ITERATIONS):
            N = np.random.randint(3, 50)
            A = np.random.uniform(0.0, 1.0, [N, N]).astype(np.float64)
            if mode == BACKWARD_SUBSTITUTION:
                A = np.triu(A)
            elif mode == FORWARD_SUBSTITUTION:
                A = np.tril(A)
            else:
                raise ValueError()
            x_gt = np.random.uniform(0.0, 1.0, N).astype(np.float64)
            b = np.dot(A, x_gt)
            x_est = solve_triang_subst(A, b, substitution=mode,
                                       refine_result=refine_result)
            # TODO report error and count, don't throw!
            # Keep track of error norm!!
            check(x_est, x_gt,
                  "Mode {} custom triang iteration {}".format(mode, iteration))

if __name__ == '__main__':
    fuzz_test_solving()
Note that the maximum size of a test matrix is 49x49. Even in this case, the system cannot always compute decent solutions, and fails by more than a margin of 0.1. Here's an example of such a failure (this is doing backward substitution, so the biggest error is in the 0th coefficient; all the test data are sampled uniformly from [0, 1)):
Solution found with Mode 2 custom triang iteration 24:
[ 0.27876067 0.55200497 0.49499509 0.3259397 0.62420183 0.47041149
0.63557676 0.41155446 0.47191956 0.74385864 0.03002819 0.4700286
0.37989592 0.56527691 0.15072607 0.05659282 0.52587574 0.82252197
0.65662833 0.50250729 0.74139748 0.10852731 0.27864265 0.42981232
0.16327331 0.74097937 0.24411709 0.96934199 0.890266 0.9183985
0.14842446 0.51806495 0.36966843 0.18227989 0.85399593 0.89615663
0.39819336 0.90445931 0.21430972 0.61212349 0.85205597 0.66758689
0.1793689 0.38067267 0.39104614 0.6765885 0.4118123 ]
Ground truth (not achieved...)
[ 0.20881608 0.71009766 0.44735271 0.31169033 0.63982328 0.49075813
0.59669585 0.43844108 0.47764942 0.72222069 0.03497499 0.4707452
0.37679884 0.56439738 0.15120397 0.05635977 0.52616387 0.82230625
0.65670245 0.50251426 0.74139956 0.10845974 0.27864289 0.42981226
0.1632732 0.74097939 0.24411707 0.96934199 0.89026601 0.91839849
0.14842446 0.51806495 0.36966843 0.18227989 0.85399593 0.89615663
0.39819336 0.90445931 0.21430972 0.61212349 0.85205597 0.66758689
0.1793689 0.38067267 0.39104614 0.6765885 0.4118123 ]
I have also implemented the iterative refinement method described in Section 2.5 of [0], and while it did help a little, the results are still poor for larger matrices.
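In sketch form, one refinement step using the solver above would be (compute the residual, solve for the correction, add it back):
import numpy as np

def refine_once(A, b, x, substitution):
    """One step of iterative refinement: solve A*dx = r, where r = b - A*x."""
    r = b - np.dot(A, x)
    dx = solve_triang_subst(A, r, substitution=substitution)
    return x + dx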
MATLAB Sanity Check
I also did this experiment in MATLAB, and even there, once there are more than 100 equations, the estimation error shoots up exponentially.
Here is the MATLAB code I used for this experiment:
err_norms = [];
range = 1:3:120;
for size=range
    A = rand(size, size);
    A = tril(A);
    x_gt = rand(size, 1);
    b = A * x_gt;
    x_sol = A\b;
    err_norms = [err_norms, norm(x_gt - x_sol)];
end
plot(range, err_norms);
set(gca, 'YScale', 'log')
And here is the resulting plot:
Main Question
My question is: Is this normal behavior, seeing as there is essentially no structure in the problem, given that I randomly generate the A matrix and x?
What about solving linear systems of 100s of equations for various practical applications? Are these limitations simply an accepted fact, and e.g., optimization algorithms are just naturally robust to these issues? Or am I missing some important facets of this problem?
[0]: Press, William H. Numerical recipes 3rd edition: The art of scientific computing. Cambridge university press, 2007.
There are no limitations. This is a very fruitful exercise that we all come to realize at some point: writing linear solvers is not that easy, and that's why LAPACK, or its cousins in other languages, is almost always used with full confidence.
You are being hit by almost singular matrices, and because you are using MATLAB's backslash you don't see that MATLAB switches to least-squares solutions behind the scenes when near-singularity is hit. If you change A\b to linsolve(A,b), thereby restricting the solver to square systems, you'll probably see lots of warnings on your console.
I didn't test it because I don't have a license anymore, but writing it blindly, this should show you the condition numbers of the matrices at each step:
err_norms = [];
range = 1:3:120;
for i=1:40
    size = range(i);
    A = rand(size, size);
    A = tril(A);
    x_gt = rand(size, 1);
    b = A * x_gt;
    x_sol = linsolve(A,b);
    err_norms = [err_norms, norm(x_gt - x_sol)];
    zzz(i) = rcond(A);
end
semilogy(range, err_norms);
figure, semilogy(range, zzz);
Note that because you are picking numbers from a uniform distribution, it becomes more and more likely to hit ill-conditioned matrices (with respect to inversion) as the size grows, since the rows are more likely to be nearly rank deficient. That's why the error becomes bigger and bigger. Sprinkle in some identity matrix times a scalar and all errors should come back to eps*n levels.
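A minimal sketch of that last suggestion in Python:
import numpy as np

np.random.seed(0)
N = 100
A = np.tril(np.random.uniform(0.0, 1.0, (N, N)))
for M in (A, A + 5.0 * np.eye(N)):   # plain vs. diagonally shifted
    x_gt = np.random.uniform(0.0, 1.0, N)
    b = M @ x_gt
    x = np.linalg.solve(M, b)
    print(np.linalg.cond(M), np.linalg.norm(x - x_gt))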
But best, leave this to expert algorithms which have been tested over decades. It is really not that trivial to write any of these. You can read the Fortran codes; for example, dtrsm solves the triangular system.
On the Python side, you can use scipy.linalg.solve_triangular, which uses the ?trtrs routines from LAPACK.
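For example (a minimal sketch):
import numpy as np
from scipy.linalg import solve_triangular

N = 50
A = np.tril(np.random.uniform(0.0, 1.0, (N, N)))
x_gt = np.random.uniform(0.0, 1.0, N)
b = A @ x_gt
x = solve_triangular(A, b, lower=True)   # LAPACK ?trtrs under the hood
print(np.linalg.norm(x - x_gt))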