How to find the two largest eigenvectors in Python?

I can find eigenvectors of a matrix in Python as follows:
import numpy as np
from numpy import linalg as LA
w, v = LA.eig(np.diag((1, 2, 3)))
But how do I find the two largest eigenvectors for a larger matrix, say of size 100x200?

Eigenvalue decomposition is not defined for a non-square matrix. The closest operation is singular value decomposition (SVD). For a non-square matrix A, SVD and EIG are related in that the singular values of A are the square roots of the eigenvalues of A's transpose times A:
B = A' * A
SVD(A)**2 ~= EIG(B)
So one potential answer to your question is:
import numpy as np
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
B = np.matmul(np.transpose(A), A)
u, s, vh = np.linalg.svd(A)
eigvals, eigvecs = np.linalg.eig(B)
# eig does not order its eigenvalues, while svd returns the singular
# values in descending order, so sort before comparing.
eigvals_sorted = np.sort(eigvals)[::-1]
print(f'Compare s*s to the eigenvalues of B: {s*s - eigvals_sorted}')
While s does not contain eigenvalues of A itself (A is not square, so it has none), the squared singular values are exactly the eigenvalues of B = A'A.
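If the "two largest eigenvectors" of the 100x200 matrix are taken to mean the singular vectors belonging to the two largest singular values, a minimal sketch (the random A below is just a stand-in for the question's data):
import numpy as np

# Hypothetical 100 x 200 matrix standing in for the question's data
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 200))

# np.linalg.svd returns singular values in descending order, so the
# first two columns of U (rows of Vh) go with the two largest ones.
U, s, Vh = np.linalg.svd(A, full_matrices=False)
top2_left = U[:, :2]       # shape (100, 2)
top2_right = Vh[:2, :].T   # shape (200, 2)
print(s[:2])
For a large sparse matrix, scipy.sparse.linalg.svds(A, k=2) computes only the two largest singular triplets instead of the full decomposition.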

Related

How to find a non-normalized eigenvector?

My goal is to validate whether a given vector is an eigenvector of an n x n square matrix.
The formula I know is A * x = lambda * x.
Example:
import numpy as np
A = np.array([[1, -1],
              [6, 4]])
eigvalues, eigvectors = np.linalg.eig(A)
The eigvectors output contains normalized eigenvectors.
According to this website, a normalized eigenvector is an eigenvector of unit length.
In this case I want to check whether the following vector is an eigenvector of the n x n square matrix:
x = [1, 3]
So what I did is multiply the eigenvector in question by its length (the square root of the sum of its squared components):
normalized_eigvector = np.array([1, 3]) * np.sqrt(1 + 9)
np.array_equal(normalized_eigvector, np.absolute(eigvectors[1]))
Is this the correct way to do it?
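A minimal sketch of one way to run the A * x = lambda * x check directly (the is_eigenvector helper is hypothetical, not from the question):
import numpy as np

def is_eigenvector(A, x, tol=1e-9):
    """Return True if A @ x = lambda * x for some scalar lambda."""
    x = np.asarray(x, dtype=float)
    Ax = A @ x
    nz = np.flatnonzero(np.abs(x) > tol)
    if nz.size == 0:
        return False
    lam = Ax[nz[0]] / x[nz[0]]   # candidate eigenvalue from one component
    return np.allclose(Ax, lam * x, atol=tol)

A = np.array([[1, -1],
              [6, 4]])
print(is_eigenvector(A, [1, 3]))
To compare against np.linalg.eig's output instead, normalize by dividing by the length (np.array([1, 3]) / np.sqrt(10)), and remember that the eigenvectors are the columns eigvectors[:, i], each defined only up to sign.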

Return the eigenvector corresponding to the max eigenvalue of A

As the title says, I must compute the eigenvector v corresponding to the max eigenvalue. I'm not sure what commands do this. Any tips?
import numpy as np
import scipy.linalg as la

# x and y are both 1D NumPy arrays of the same length
def eigen_X(x, y):
    xa = np.mean(x)
    ya = np.mean(y)
    x_bar = x - xa
    y_bar = y - ya
    X = np.column_stack((x_bar, y_bar))
    A = X.transpose() @ X
    # The rest of the code goes here
scipy.linalg.eig calculates the eigenvalues and eigenvectors of a 2D, square matrix. To get the (right) eigenvector corresponding to the largest eigenvalue, use
w, vr = la.eig(A)
largest_eigenvector = vr[:, np.argmax(w)]
If you want the corresponding left eigenvector instead, call w, vl, vr = la.eig(A, left=True) and take vl[:, np.argmax(w)].
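Putting that together with the question's skeleton, a minimal sketch of the finished function (the sample x and y values below are made up):
import numpy as np
import scipy.linalg as la

def eigen_X(x, y):
    # Center the data and build the 2x2 scatter matrix X^T X.
    X = np.column_stack((x - np.mean(x), y - np.mean(y)))
    A = X.T @ X
    # Right eigenvectors are the columns of vr; pick the column that
    # belongs to the largest eigenvalue (real, since A is symmetric).
    w, vr = la.eig(A)
    return vr[:, np.argmax(w.real)]

# Hypothetical sample data
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.9])
print(eigen_X(x, y))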
It's possible to do this with just numpy's "linalg" library. The eig() function can give you the eigenvalues and eigenvectors. I converted the eigenvalues from a numpy array into a list in order to use "index" here to find the position of the largest eigenvalue. Then I picked the corresponding column from the eigenvector array.
>>> from numpy import linalg as LA
>>> M = ((1,-3,3), (3,-5,3), (6,-6,4))
>>> vals, vects = LA.eig(M)
>>> maxcol = list(vals).index(max(vals))
>>> eigenvect = vects[:,maxcol]
>>> print(eigenvect)
[-0.40824829+0.j -0.40824829+0.j -0.81649658+0.j]
If you need "dominant eigenvalue" you need to find the position of the largest eigenvalue, all in absolute value. Try
import numpy.linalg as npla
import numpy as np
M = np.array([[0.12,6.1,-5.2], [4.6,-7.8,9.3], [3.1,2.4,8.7]])
# egv, vects = npla.eig(M)
egv = npla.eigvals(M)
print('egv dominant of M ', egv[np.argmax(np.abs(egv))])
You can drop the np.abs() and use egv directly if you want the maximum eigenvalue rather than the dominant one.
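If the matching eigenvector is also needed, the commented-out eig call gives both; a minimal sketch using the same M:
import numpy as np

M = np.array([[0.12, 6.1, -5.2], [4.6, -7.8, 9.3], [3.1, 2.4, 8.7]])
egv, vects = np.linalg.eig(M)
idx = np.argmax(np.abs(egv))      # position of the dominant eigenvalue
print('dominant eigenvalue ', egv[idx])
print('dominant eigenvector', vects[:, idx])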

Scipy eigsh returning wrong results for complex input matrix

I am trying to find the eigenvalues and eigenvectors of a complex matrix with scipy.sparse.linalg.eigsh using its shift-invert mode. With just real numbers in the matrix I get the same result as the scipy.linalg.eigh solver, but when I add the imaginary parts the eigenvalues diverge. A tiny example:
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.linalg import eigsh
n = 10
X = np.random.random((n, n)) - 0.5 + (np.random.random((n, n)) - 0.5) * 1j
X = np.dot(X, X.T) # create a symmetric matrix
evals_all, evecs_all = eigh(X)
evals_small, evecs_small = eigsh(X, 3, sigma=0, which='LM')
print(sorted(evals_all, key=abs))
print(sorted(evals_small, key=abs))
The prints in this case are for example
[0.041577858515751132, -0.084104744918533481, -0.58668240775486691, 0.63845672501004724, -1.2311727737115068, 1.5193345703630159, -1.8652302423152105, 1.9970059660853923, -2.6414593461321654, 2.8624290667460293]
[-0.017278543470343462, -0.32684893256215408, 0.34551438015659475]
whereas in the real case, the first three eigenvalues are identical.
I am aware that I'm passing a dense matrix to the sparse solver, but this is just intended as an example.
I am probably missing something obvious somewhere, but I'd be happy about some hints where to look. Thank you!
scipy does not check whether your input is Hermitian.
Checking it as proposed in the link:
if not np.allclose(X, np.asmatrix(X).H):
    raise ValueError('expected symmetric or Hermitian matrix')
outputs:
ValueError: expected symmetric or Hermitian matrix
I think this is also indicated by those negative eigenvalues you see (but complex-based math is really not my speciality...).
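If the matrix was meant to be Hermitian, a minimal sketch of the fix (an assumption about the asker's intent, not from the answer): build it with the conjugate transpose, after which eigh and eigsh agree.
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.linalg import eigsh

n = 10
X = np.random.random((n, n)) - 0.5 + (np.random.random((n, n)) - 0.5) * 1j
X = np.dot(X, X.conj().T)  # conjugate transpose -> genuinely Hermitian

evals_all, evecs_all = eigh(X)
evals_small, evecs_small = eigsh(X, 3, sigma=0, which='LM')
print(sorted(evals_all, key=abs)[:3])
print(sorted(evals_small, key=abs))  # now matches the three smallest above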

How can I compute the null space/kernel (x: M·x = 0) of a sparse matrix in Python?

I found some examples online showing how to find the null space of a regular matrix in Python, but I couldn't find any examples for a sparse matrix (scipy.sparse.csr_matrix).
By null space I mean x such that M·x = 0, where '·' is matrix multiplication. Does anybody know how to do this?
Furthermore, in my case I know that the null space will consist of a single vector. Can this information be used to improve the efficiency of the method?
This isn't a complete answer yet, but hopefully it will be a starting point towards one. You should be able to compute the null space using a variant on the SVD-based approach shown for dense matrices in this question:
import numpy as np
from scipy import sparse
import scipy.sparse.linalg
def rand_rank_k(n, k, **kwargs):
    "generate a random (n, n) sparse matrix of rank <= k"
    a = sparse.rand(n, k, **kwargs)
    b = sparse.rand(k, n, **kwargs)
    return a.dot(b)
# I couldn't think of a simple way to generate a random sparse matrix with known
# rank, so I'm currently using a dense matrix for proof of concept
n = 100
M = rand_rank_k(n, n - 1, density=1)
# # this seems like it ought to work, but it doesn't
# u, s, vh = sparse.linalg.svds(M, k=1, which='SM')
# this works OK, but obviously converting your matrix to dense and computing all
# of the singular values/vectors is probably not feasible for large sparse matrices
u, s, vh = np.linalg.svd(M.todense(), full_matrices=False)
tol = np.finfo(M.dtype).eps * M.nnz
null_space = vh.compress(s <= tol, axis=0).conj().T
print(null_space.shape)
# (100, 1)
print(np.allclose(M.dot(null_space), 0))
# True
If you know that x is a single row vector then in principle you would only need to compute the smallest singular value/vector of M. It ought to be possible to do this using scipy.sparse.linalg.svds, i.e.:
u, s, vh = sparse.linalg.svds(M, k=1, which='SM')
null_space = vh.conj().ravel()
Unfortunately, scipy's svds seems to be badly behaved when finding small singular values of singular or near-singular matrices and usually either returns NaNs or throws an ArpackNoConvergence error.
I'm not currently aware of an alternative implementation of truncated SVD with Python bindings that will work on sparse matrices and can selectively find the smallest singular values - perhaps someone else knows of one?
Edit
As a side note, the second approach seems to work reasonably well using MATLAB or Octave's svds function:
>> M = rand(100, 99) * rand(99, 100);
% svds converges much more reliably if you set sigma to something small but nonzero
>> [U, S, V] = svds(M, 1, 1E-9);
>> max(abs(M * V))
ans = 1.5293e-10
I have been trying to find a solution to the same problem. Scipy's svds function gives unreliable results for small singular values, so I have been using a QR decomposition instead. The sparseqr package (https://github.com/yig/PySPQR) provides a wrapper around SuiteSparseQR (the routine behind MATLAB's spqr), and it works reasonably well. Using it, the null space can be calculated as:
from sparseqr import qr
# r is the numerical rank; the trailing columns of Q span the null space of M
Q, _, _, r = qr(M.transpose())
N = Q.tocsr()[:, r:]
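A quick sanity check on the result (assuming the sparse M from the earlier snippets): every column of N should be annihilated by M.
# Residual should be ~0 up to floating-point precision
print(abs(M.dot(N)).max())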

QR decomposition for rectangular matrices in which n > m in scipy/numpy

I have a m x n rectangular matrix A for which n > m. Given the rank r <= m of A, the reduced QR decomposition yields matrix Q with m x r dimensions, and R with r x n dimensions. The columns of Q are an orthonormal basis for the range of A. R will be upper triangular but in a staircase pattern. Columns in R with a pivot correspond to independent columns in A.
When I apply the qr function from numpy.linalg (there is also a version of this function in scipy.linalg, which seems to be the same), it returns matrix Q with m x m dimensions, and R with m x n dimensions, even when the rank of matrix A is less than m. This seems to be the "full" QR decomposition, for which the columns of Q are an orthonormal basis for R^m. Is it possible to identify the independent columns of A through this R matrix returned by the qr function in numpy.linalg / scipy.linalg?
Check for diagonal elements of R that are non-zero:
import numpy as np
min_tol = 1e-9
A = np.array([[1,2,3],[4,3,2],[1,1,1]])
print("Matrix rank of: {}".format(np.linalg.matrix_rank(A)))
Q,R = np.linalg.qr(A)
indep = np.where(np.abs(R.diagonal()) > min_tol)[0]
print(A[:, indep])
print("Independent columns are: {}".format(indep))
see also here:
How to find degenerate rows/columns in a covariance matrix
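An alternative worth noting (a sketch, not from the answers above): scipy.linalg.qr supports column pivoting, which reorders the columns so the magnitudes of R's diagonal are non-increasing. That makes picking out independent columns more reliable for rank-deficient A than reading the diagonal of an unpivoted R.
import numpy as np
from scipy.linalg import qr

A = np.array([[1, 2, 3], [4, 3, 2], [1, 1, 1]], dtype=float)

# pivoting=True returns a permutation P with A[:, P] = Q @ R and
# |diag(R)| non-increasing, so the first r pivots are independent columns.
Q, R, P = qr(A, pivoting=True)
tol = 1e-9
r = int(np.sum(np.abs(np.diag(R)) > tol))
print("rank:", r)
print("independent columns:", P[:r])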
