Solving a generalized eigenvalue system with a positive semidefinite B in Python

I am trying to use the Normalized Cut algorithm (Shi and Malik, 2000) to cut a matrix into two parts. For this I need the eigenvector corresponding to the second smallest eigenvalue of a generalized eigenvalue system (A x = lambda * B x). In my input, B is a positive semidefinite matrix. However, scipy.linalg.eigh requires B to be positive definite and raises an error when I use it. I need to know whether a solution exists for this input, and how I can find it.
I tried
eigvals, eigvecs = eigh(A, B, eigvals_only=False, subset_by_index=[0, 1])
But I got:
numpy.linalg.LinAlgError: The leading minor of order 2 of B is not positive definite. The factorization of B could not be completed and no eigenvalues or eigenvectors were computed.

If B is semidefinite but not definite, it has at least one eigenvector associated with the eigenvalue 0. You can still have solutions if the null space of B is also a null space of A, i.e. if B @ x = 0 implies A @ x = 0, but in that case the generalized eigenvalue associated with x is undetermined.
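As a practical workaround (not part of the original answer), two common options are to add a tiny multiple of the identity to B so that the Cholesky-based eigh succeeds, or to use the general solver scipy.linalg.eig, which accepts a singular B and reports null-space directions as infinite (or nan) eigenvalues. A minimal sketch, with a toy Laplacian/degree pair standing in for the question's A and B and a hypothetical tolerance eps:

import numpy as np
from scipy.linalg import eig, eigh

# Toy example: graph Laplacian A = D - W and degree matrix B = D, with one
# isolated node so that B is only positive *semi*definite.
W = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
B = np.diag(W.sum(axis=1))           # singular: last diagonal entry is 0
A = B - W

# Option 1: regularize B slightly so eigh's factorization of B succeeds.
eps = 1e-10                          # hypothetical tolerance; perturbs the spectrum slightly
vals, vecs = eigh(A, B + eps * np.eye(B.shape[0]), subset_by_index=[0, 1])

# Option 2: the general solver tolerates a singular B; directions in the
# null space of B come back as infinite (or nan) eigenvalues, so filter them.
w, v = eig(A, B)
finite = np.isfinite(w)
order = np.argsort(w[finite].real)
second_smallest_vec = v[:, finite][:, order[1]]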


Generating a Matrix from a sum of Matrices

I am trying to apply MCMC methods to the inverse problem A(u) x = b, where A is a symmetric positive definite square matrix. I was given that A can be expressed as A = A0 + Σ_i u_i A_i. I want to check whether the ergodic average converges to the initial u, but I need to create some random matrix A satisfying these conditions to test out my MCMC function.
Is it possible to create such a random matrix A of the form A = A0 + Σ_i u_i A_i, and how can I go about it in Python?
Any help is greatly appreciated, thank you!
One way would be to generate SPD (symmetric positive definite) matrices A[0], A[1], ... and positive numbers u[1], ... and then sum them up (a Python sketch follows the list of constructions below):
B = A[0] + Sum{ i>=1 | u[i]*A[i] }
This will be SPD. It would be possible for B to be SPD even though the A[i] were not and the u[i] not all positive, but I think it could be tricky to determine the A[i] and the u[i] so that B is SPD in that case.
One issue is whether you want B to be strictly positive definite, i.e. invertible, or not. B will be invertible if either A[0] is, or if at least one of the A[i] with u[i] > 0 is. Again, B could be invertible even if those conditions were not met, but it might be tricky to ensure that B is invertible in that case.
There are various ways you could generate a single SPD nxn matrix P:
a/ Generate an upper triangular nxn matrix U and compute
P = U'*U
P will be SPD, and invertible iff all the diagonal elements of U are non-zero
b/ Generate a mxn matrix M and compute
P = M'*M
P will be SPD, but not necessarily invertible. It definitely won't be invertible if m<n. To make it invertible, add a positive multiple of the identity matrix.
c/ use sklearn.datasets.make_spd_matrix
From the documentation it's not clear to me whether this will be invertible or not, so if you need an invertible one you might be best to add a multiple of the identity.
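A minimal sketch in Python of the sum construction, using method b/ plus a small multiple of the identity to guarantee invertibility (the sizes, the number of terms, and the distribution of the u[i] below are placeholder choices):

import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 3                              # matrix size and number of terms (placeholders)

def random_spd(n, rng, jitter=1e-6):
    # Random SPD n x n matrix: M.T @ M is symmetric positive semidefinite,
    # and the small identity shift makes it safely invertible.
    M = rng.standard_normal((n, n))
    return M.T @ M + jitter * np.eye(n)

A0 = random_spd(n, rng)
As = [random_spd(n, rng) for _ in range(k)]
u = rng.uniform(0.1, 1.0, size=k)        # positive coefficients u[1..k]

A = A0 + sum(ui * Ai for ui, Ai in zip(u, As))

# Sanity check: all eigenvalues of the symmetric matrix A should be positive.
assert np.all(np.linalg.eigvalsh(A) > 0)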

Understanding the logic behind numpy code for Moore-Penrose inverse

I was going through the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, and the author was explaining how the pseudo-inverse (Moore-Penrose inverse) of a matrix is calculated in the context of Linear Regression. I'm quoting verbatim here:
The pseudoinverse itself is computed using a standard matrix factorization technique called Singular Value Decomposition (SVD) that can decompose the training set matrix X into the matrix multiplication of three matrices U Σ Vᵀ (see numpy.linalg.svd()). The pseudoinverse is calculated as X⁺ = V Σ⁺ Uᵀ. To compute the matrix Σ⁺, the algorithm takes Σ and sets to zero all values smaller than a tiny threshold value, then it replaces all nonzero values with their inverse, and finally it transposes the resulting matrix. This approach is more efficient than computing the Normal equation.
I've got an understanding of how the pseudo-inverse and SVD are related from this post. But I'm not able to grasp the rationale behind setting all values less than the threshold to zero. The inverse of a diagonal matrix is obtained by taking the reciprocals of the diagonal elements. Then small values would be converted to large values in the inverse matrix, right? Then why are we removing the large values?
I went and looked into the NumPy code, and it looks as follows, just for reference:
@array_function_dispatch(_pinv_dispatcher)
def pinv(a, rcond=1e-15, hermitian=False):
    a, wrap = _makearray(a)
    rcond = asarray(rcond)
    if _is_empty_2d(a):
        m, n = a.shape[-2:]
        res = empty(a.shape[:-2] + (n, m), dtype=a.dtype)
        return wrap(res)
    a = a.conjugate()
    u, s, vt = svd(a, full_matrices=False, hermitian=hermitian)
    # discard small singular values
    cutoff = rcond[..., newaxis] * amax(s, axis=-1, keepdims=True)
    large = s > cutoff
    s = divide(1, s, where=large, out=s)
    s[~large] = 0
    res = matmul(transpose(vt), multiply(s[..., newaxis], transpose(u)))
    return wrap(res)
It's almost certainly an adjustment for numerical error. To see why this might be necessary, look what happens when you take the svd of a rank-one 2x2 matrix. We can create a rank-one matrix by taking the outer product of a vector like so:
>>> a = numpy.arange(2) + 1
>>> A = a[:, None] * a[None, :]
>>> A
array([[1, 2],
       [2, 4]])
Although this is a 2x2 matrix, it only has one linearly independent column, and so its rank is one instead of two. So we should expect that when we pass it to svd, one of the singular values will be zero. But look what happens:
>>> U, s, V = numpy.linalg.svd(A)
>>> s
array([5.00000000e+00, 1.98602732e-16])
What we actually get is a singular value that is not quite zero. This result is inevitable in many cases given that we are working with finite-precision floating point numbers. So although the problem you have identified is a real one, we will not be able to tell in practice the difference between a matrix that really has a very small singular value and a matrix that ought to have a zero singular value but doesn't. Setting small values to zero is the safest practical way to handle that problem.
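To make the effect of the cutoff concrete (this comparison is mine, not the original answer's), here is what pinv gives on the matrix above with the default rcond versus rcond=0, which disables the thresholding; exact values depend on the platform's rounding:

import numpy as np

a = np.arange(2) + 1
A = a[:, None] * a[None, :]        # rank-one matrix [[1, 2], [2, 4]]

print(np.linalg.pinv(A))           # default rcond: the tiny singular value is
                                   # zeroed, giving the exact pseudoinverse A / 25
print(np.linalg.pinv(A, rcond=0))  # no cutoff: the ~2e-16 singular value is
                                   # inverted, producing huge, meaningless entries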

Spectral norm 2x2 matrix in tensorflow

I've got a 2x2 matrix defined by the variables J00, J01, J10, J11 coming in from other inputs. Since the matrix is small, I was able to compute the spectral norm by first computing the trace and determinant
J_T = tf.reduce_sum([J00, J11])
J_ad = tf.reduce_prod([J00, J11])
J_cb = tf.reduce_prod([J01, J10])
J_det = tf.reduce_sum([J_ad, -J_cb])
and then solving the quadratic
L1 = J_T/2.0 + tf.sqrt(J_T**2/4.0 - J_det)
L2 = J_T/2.0 - tf.sqrt(J_T**2/4.0 - J_det)
spectral_norm = tf.maximum(L1, L2)
This works, but it looks rather ugly and it isn't generalizable to larger matrices. Is there cleaner way (maybe a method call that I'm missing) to compute spectral_norm?
The spectral norm of a matrix J equals the largest singular value of the matrix.
Therefore you can use tf.svd() to perform the singular value decomposition, and take the largest singular value:
spectral_norm = tf.svd(J, compute_uv=False)[..., 0]
where J is your matrix.
Notes:
I use compute_uv=False since we are interested only in singular values, not singular vectors.
J does not need to be square.
This solution works also for the case where J has any number of batch dimensions (as long as the two last dimensions are the matrix dimensions).
The ellipsis (...) operation works as in NumPy.
I take the 0 index because we are interested only in the largest singular value.
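For reference, in TensorFlow 2.x the same call lives under tf.linalg.svd; a minimal sketch (the example matrix below is a placeholder):

import tensorflow as tf

J = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])            # placeholder 2x2 matrix

s = tf.linalg.svd(J, compute_uv=False)   # singular values, in descending order
spectral_norm = s[..., 0]                # spectral norm = largest singular value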

LinAlgError: Last 2 dimensions of the array must be square

I need to solve a set of simultaneous equations of the form Ax = B for x. I've used the numpy.linalg.solve function, inputting A and B, but I get the error 'LinAlgError: Last 2 dimensions of the array must be square'. How do I fix this?
Here's my code:
A = matrix([[v1x, v2x], [v1y, v2y], [v1z, v2z]])
print A
B = [(p2x-p1x-nmag[0]), (p2y-p1y-nmag[1]), (p2z-p1z-nmag[2])]
print B
x = numpy.linalg.solve(A, B)
The values of the matrix/vector are calculated earlier in the code and this works fine, but the values are:
A =
(-0.56666301, -0.52472909)
(0.44034147, 0.46768087)
(0.69641397, 0.71129036)
B =
(-0.38038602567630364, -24.092279373295057, 0.0)
x should have the form (x1,x2,0)
In case you still haven't found an answer, or in case someone in the future has this question.
To solve Ax=b:
numpy.linalg.solve uses LAPACK gesv. As mentioned in the documentation of LAPACK, gesv requires A to be square:
LA_GESV computes the solution to a real or complex linear system of equations AX = B, where A is a square matrix and X and B are rectangular matrices or vectors. Gaussian elimination with row interchanges is used to factor A as A = PL*U , where P is a permutation matrix, L is unit lower triangular, and U is upper triangular. The factored form of A is then used to solve the above system.
If the A matrix is not square, it means that you either have more variables than equations or the other way around. In these situations you can have no solution or an infinite number of solutions. What determines the solution space is the rank of the matrix compared to the number of columns. Therefore, you first have to check the rank of the matrix.
That being said, you can use another method to solve your system of linear equations. I suggest having a look at factorization methods like LU, QR, or even SVD. In LAPACK you can use getrs; in Python you can do different things:
first do a factorization like QR and then feed the resulting matrices to a method like scipy.linalg.solve_triangular
solve the least-squares problem using numpy.linalg.lstsq (a sketch follows below)
Also have a look here where a simple example is formulated and solved.
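As a concrete illustration of the least-squares route (my addition, not part of the original answer), with the 3x2 system from the question stubbed in as literal numbers:

import numpy as np

A = np.array([[-0.56666301, -0.52472909],
              [ 0.44034147,  0.46768087],
              [ 0.69641397,  0.71129036]])
B = np.array([-0.38038602567630364, -24.092279373295057, 0.0])

# Least-squares solution of the overdetermined system A x ≈ B.
x, residuals, rank, sing_vals = np.linalg.lstsq(A, B, rcond=None)
print(x, rank)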
A square matrix is a matrix with the same number of rows and columns. The matrix you are passing is 3 by 2. Add a column of zeroes to fix this problem.

Quickly and efficiently calculating an eigenvector for known eigenvalue

Short version of my question:
What would be the optimal way of calculating an eigenvector for a matrix A, if we already know the eigenvalue belonging to the eigenvector?
Longer explanation:
I have a large stochastic matrix A which, because it is stochastic, has a non-negative left eigenvector x (such that A^Tx=x).
I'm looking for quick and efficient methods of numerically calculating this vector. (Preferably in MATLAB or numpy/scipy; since both of these wrap ARPACK/LAPACK, either one would be fine.)
I know that 1 is the largest eigenvalue of A, so I know that calling something like this Python code:
from scipy.sparse.linalg import eigs
vals, vecs = eigs(A, k=1)
will result in vals = 1 and vecs equalling the vector I need.
However, the thing that bothers me here is that calculating eigenvalues is, in general, a more difficult operation than solving a linear system. If a matrix M has eigenvalue l, then finding the corresponding eigenvector is a matter of solving the equation (M - l*I) * x = 0, which is, in theory at least, a simpler operation than calculating an eigenvalue, since we are only solving a linear system, or more specifically finding the nullspace of a matrix.
However, I find that all methods of nullspace calculation in MATLAB rely on svd calculation, a process I cannot afford to perform on a matrix of my size. I also cannot call solvers on the linear equation, because they all only find one solution, and that solution is 0 (which, yes, is a solution, but not the one I need).
Is there any way to avoid calls to eigs-like functions, and to solve my problem more quickly than by calculating the largest eigenvalue and the accompanying eigenvector?
Here's one approach using Matlab:
1. Let x denote the (row) left† eigenvector associated to eigenvalue 1. It satisfies the system of linear equations (or matrix equation) xA = x, or x(A−I) = 0.
2. To avoid the all-zeros solution to that system of equations, remove the first equation and arbitrarily set the first entry of x to 1 in the remaining equations.
3. Solve those remaining equations (with x1 = 1) to obtain the other entries of x.
Example using Matlab:
>> A = [.6 .1 .3
.2 .7 .1
.5 .1 .4]; %// example stochastic matrix
>> x = [1, -A(1, 2:end)/(A(2:end, 2:end)-eye(size(A,1)-1))]
x =
1.000000000000000 0.529411764705882 0.588235294117647
>> x*A %// check
ans =
1.000000000000000 0.529411764705882 0.588235294117647
Note that the code -A(1, 2:end)/(A(2:end, 2:end)-eye(size(A,1)-1)) is step 3.
In your formulation you define x to be a (column) right eigenvector of AT (such that ATx = x). This is just x.' from the above code:
>> x = x.'
x =
1.000000000000000
0.529411764705882
0.588235294117647
>> A.'*x %// check
ans =
1.000000000000000
0.529411764705882
0.588235294117647
You can of course normalize the eigenvector to sum 1:
>> x = x/sum(x)
x =
0.472222222222222
0.250000000000000
0.277777777777778
>> A.'*x %// check
ans =
0.472222222222222
0.250000000000000
0.277777777777778
† Following the usual convention. Equivalently, this corresponds to a right eigenvector of the transposed matrix.
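Since the question also mentions numpy/scipy, here is a sketch of the same three steps in Python (my addition, not from the original answer); for a genuinely large sparse A you would use scipy.sparse matrices and scipy.sparse.linalg.spsolve instead of dense arrays:

import numpy as np

A = np.array([[.6, .1, .3],
              [.2, .7, .1],
              [.5, .1, .4]])            # example stochastic matrix from above

n = A.shape[0]
M = A.T - np.eye(n)                     # we want (A.T - I) x = 0 with x[0] fixed to 1
# Drop the first equation and move the x[0] column to the right-hand side.
x_rest = np.linalg.solve(M[1:, 1:], -M[1:, 0])
x = np.concatenate(([1.0], x_rest))
x /= x.sum()                            # normalize to sum 1

print(x)                                # ~ [0.4722, 0.25, 0.2778]
print(A.T @ x)                          # check: should reproduce x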
