If I want to calculate the k smallest eigenvalues of the matrix product AA', where A is of size 300K by 512 and "'" denotes the transpose, it would be infeasible to do so in the traditional way. Matlab, however, provides a nice facility: you can pass a function handle that performs the product, Afun = @(x) A*(A'*x);, to the eigs function. Then, to find the 6 smallest eigenvalues/eigenvectors we call d = eigs(Afun,300000,6,'smallestabs'), where the second input is the size of the matrix AA'. Is there a function in Python that does something similar?
To my knowledge, there is no such functionality in numpy. However, I don't see anything stopping you from simply using numpy.linalg.eigvals to retrieve an array of the matrix's eigenvalues, and then finding the N smallest with a sort:
import numpy as np

A = ...  # your matrix, as a numpy array
eigvals = np.linalg.eigvals(A)
eigvals.sort()  # in-place, ascending
smallest_6_eigvals = eigvals[:6]
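For completeness (an addition of mine, not part of the original answer): numpy itself has no matrix-free interface, but scipy.sparse.linalg offers a close analogue of Matlab's Afun through a LinearOperator passed to eigsh. A minimal sketch with a smaller stand-in matrix; note that for the question's 300000x512 A, the product AA' has rank at most 512, so most of its smallest eigenvalues are exactly zero and convergence with which='SM' can be slow:

import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

A = np.random.rand(30000, 512)  # stand-in for the real 300000x512 data

def matvec(x):
    # plays the same role as Matlab's Afun = @(x) A*(A'*x)
    return A @ (A.T @ x)

n = A.shape[0]
op = LinearOperator((n, n), matvec=matvec, dtype=A.dtype)
vals = eigsh(op, k=6, which='SM', return_eigenvectors=False)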
Equation f(x,a,b) below requires an iterative solution, for which I am using one of the scipy optimisation methods ('brentq'), which essentially calculates the value of x for which f(x,a,b)=0.
However, I need to use array inputs for 'a' and 'b', and the arrays are very large, e.g. 1 to 100 million elements.
What is the most efficient/fastest way to do this with scipy/numpy? At present I am resorting to a for loop, as below, but this becomes slow with my actual underlying equations (not shown). Note that each element of the arrays is independent of the others.
import numpy as np
from scipy import optimize

# function to solve (simplified)
def f(x, a, b):
    return (a/x)**0.25 * (x**0.5) - b*x

# array size
N = 10000000

# example input arrays from which 'a' and 'b' are taken
# (in reality the values come from other, complex functions)
A = np.linspace(1, 500, N)
B = np.linspace(0.1, 1, N)

# solution using brentq, one element pair at a time
results = [optimize.brentq(f, 1e-10, 1000, args=(a, b)) for a, b in zip(A, B)]
results
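One vectorized possibility (my suggestion, not something from the original post) is scipy.optimize.newton, which accepts an array-valued x0 and iterates all the independent problems at once using the secant method; whether it converges for the real, unshown equations would need to be checked:

import numpy as np
from scipy import optimize

def f(x, a, b):
    return (a/x)**0.25 * (x**0.5) - b*x

N = 1000000  # smaller than above, just for illustration
A = np.linspace(1, 500, N)
B = np.linspace(0.1, 1, N)

# with an array x0, newton solves all N problems elementwise
roots = optimize.newton(f, np.full(N, 10.0), args=(A, B))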
I'm trying to compute a dot product that is supposed to be symmetric.
It turns out that it just isn't.
B is a 4D array whose last two dimensions I must transpose to obtain B^t.
D is a 2D array. (It's an expression of the stiffness matrix familiar to Finite Element Method programmers.)
I have already tried numpy.dot combined with numpy.transpose, and numpy.einsum as a second alternative (the idea came from this topic: Numpy Matrix Multiplication U*B*U.T Results in Non-symmetric Matrix), but the problem persists.
At the end of the calculations the product B^tDB is obtained, and when I verify whether it really is symmetric by subtracting its transpose (B^tDB)^t, there is still a residue.
The dot product or Einstein summation is applied only over the dimensions of interest (the last ones).
The question is: how can these residues be eliminated?
You need to use arbitrary-precision floating-point math. Here's how you can combine numpy and the mpmath package to define an arbitrary-precision version of matrix multiplication (i.e. the np.dot method):
from mpmath import mp, mpf
import numpy as np
# stands for "decimal places". Larger values
# mean higher precision, but slower computation
mp.dps = 75
def tompf(arr):
    """Convert any numpy array to one of arbitrary-precision mpmath.mpf floats."""
    if arr.size and not isinstance(arr.flat[0], mpf):
        return np.array([mpf(x) for x in arr.flat]).reshape(*arr.shape)
    else:
        return arr

def dotmpf(arr0, arr1):
    """An arbitrary-precision version of np.dot."""
    return tompf(arr0).dot(tompf(arr1))
As an example, if you then set up the B, B^t, and D matrices like so:
bshape = (8,8,8,8)
dshape = (8,8)
B = np.random.rand(*bshape)
BT = np.swapaxes(B, -2, -1)
d = np.random.rand(*dshape)
D = d.dot(d.T)
then B^tDB - (B^tDB)^t will almost always have a non-zero value if you calculate it with numpy's standard floating-point matrix multiplication:
M = np.dot(np.dot(B, D), BT)
np.sum(M - M.T)
but if you use the arbitrary precision version given above it won't have a residue:
M = dotmpf(dotmpf(B, D), BT)
np.sum(M - M.T)
Watch out though. Calculations using arbitrary precision math run much slower than those done using standard floating point numbers.
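As a cheaper alternative (my note, not part of the original answer): if the residue is pure floating-point round-off and exact symmetry is all that is needed, explicitly symmetrizing the result avoids arbitrary-precision math entirely, since a_ij + a_ji and a_ji + a_ij round identically in IEEE arithmetic:

M = np.dot(np.dot(B, D), BT)
M_sym = 0.5 * (M + M.T)   # exactly symmetric by construction
np.sum(M_sym - M_sym.T)   # exactly 0.0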
Suppose I have two dense matrices U (10000x50) and V (50x10000), and one sparse matrix A (10000x10000). Each element in A is either 1 or 0. I want to find A*(UV), where '*' denotes element-wise multiplication. To solve the problem, scipy/numpy will calculate the dense matrix UV first, and because UV is dense and large (10000x10000) this is very slow.
Because I only need the few elements of UV indicated by A, it should save a lot of time if only the necessary elements were calculated, instead of computing all of them and then filtering with A. Is there a way to instruct scipy to do this?
BTW, I used Matlab to solve this problem, and Matlab was smart enough to see what I was trying to do and worked efficiently.
Update:
I found that Matlab calculates UV in full, just as scipy does. My scipy installation is simply too slow...
Here's a test script and possible speedup. The basic idea is to use the nonzero coordinates of A to select rows and columns of U and V, and then use einsum to perform a subset of the possible dot products.
import numpy as np
from scipy import sparse

#M,N,d = 10,5,.1
#M,N,d = 1000,50,.1
M,N,d = 5000,50,.01  # about the limit for my memory

A = sparse.rand(M, M, d)
A.data[:] = 1  # a sparse 0/1 array (COO format)
U = (np.arange(M*N)/(M*N)).reshape(M, N)
V = (np.arange(M*N)/(M*N)).reshape(N, M)

A1 = A.multiply(U.dot(V))  # the direct solution
A2 = np.einsum('ij,ik,kj->ij', A.A, U, V)
print(np.allclose(A1.A, A2))

def foo(A, U, V):
    # use the nonzero coordinates of A to select elements of U and V
    A3 = A.copy()
    U1 = U[A.row, :]
    V1 = V[:, A.col]
    A3.data[:] = np.einsum('ij,ji->i', U1, V1)
    return A3

A3 = foo(A, U, V)
print(np.allclose(A1.A, A3.A))
The 3 solutions match. For large arrays, foo is about 2x faster than the direct solution. For small sizes the pure einsum version is competitive, but it bogs down for large arrays.
Using dot in foo would have computed far more products than needed (ij,jk->ik, as opposed to ij,ji->i).
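To make that difference concrete, here is a small standalone illustration (the shapes are hypothetical): dot forms the full (nnz, nnz) product, of which only the diagonal is wanted, while einsum computes just those nnz products:

import numpy as np

nnz, N = 1000, 50
U1 = np.random.rand(nnz, N)
V1 = np.random.rand(N, nnz)

via_dot = U1.dot(V1).diagonal()             # ij,jk->ik, then keep only i == k
via_einsum = np.einsum('ij,ji->i', U1, V1)  # computes only the nnz needed products
print(np.allclose(via_dot, via_einsum))     # True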
I am using linalg.eig(A) to get the eigenvalues and eigenvectors of a matrix. Is there an easy way to sort these eigenvalues (and associated vectors) in order?
You want to use the NumPy sort() and argsort() functions. argsort() returns the permutation of indices needed to sort an array, so if you want to sort the eigenvalues in ascending order (NumPy's default sort is smallest-to-largest), you can do:
import numpy as np

A = np.asarray([[1,2,3],[4,5,6],[7,8,9]])
eig_vals, eig_vecs = np.linalg.eig(A)

eig_vals_sorted = np.sort(eig_vals)
eig_vecs_sorted = eig_vecs[:, eig_vals.argsort()]

# Alternatively, to avoid making new arrays, do this:
sort_perm = eig_vals.argsort()
eig_vals.sort()  # <-- sorts the array in place
eig_vecs = eig_vecs[:, sort_perm]
np.linalg.eig will often return complex values. You may want to consider using np.sort_complex(eig_vals).
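A quick illustration of that suggestion:

import numpy as np

vals = np.array([2+1j, 1-1j, 1+2j])
print(np.sort_complex(vals))  # [1.-1.j 1.+2.j 2.+1.j]: real part first, ties broken by imaginary part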
I'm using numpy.linalg.eig to obtain a list of eigenvalues and eigenvectors:
A = someMatrixArray
from numpy.linalg import eig as eigenValuesAndVectors
solution = eigenValuesAndVectors(A)
eigenValues = solution[0]
eigenVectors = solution[1]
I would like to sort my eigenvalues (e.g. from lowest to highest) in a way that lets me know which eigenvector is associated with each one after sorting.
I haven't found any way of doing this with built-in Python functions. Is there a simple way, or do I have to code my own sort?
Use numpy.argsort. It returns the indices one would use to sort the array.
import numpy as np
import numpy.linalg as linalg
A = np.random.random((3,3))
eigenValues, eigenVectors = linalg.eig(A)
idx = eigenValues.argsort()[::-1]  # [::-1] reverses the ascending argsort, giving a descending order
eigenValues = eigenValues[idx]
eigenVectors = eigenVectors[:,idx]
If the eigenvalues are complex, the sort order is lexicographic (that is, complex numbers are sorted according to their real part first, with ties broken by their imaginary part).
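A small example of that ordering:

import numpy as np

vals = np.array([1+2j, 1-1j, 0+3j])
print(vals[np.argsort(vals)])  # [0.+3.j 1.-1.j 1.+2.j]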
The above answer by unutbu is very crisp and concise, but here is another way that is more general and can be used for lists as well.
import scipy.linalg as sp

eig_vals, eig_vecs = sp.eig(A)
# the eigenvectors are the *columns* of eig_vecs, hence the transpose
ev_list = list(zip(eig_vals, eig_vecs.T))
ev_list.sort(key=lambda tup: tup[0].real, reverse=False)
eig_vals, eig_vecs = zip(*ev_list)
The tup[0].real here is the eigenvalue key on which the sort orders the list (scipy returns the eigenvalues as complex numbers, which cannot be compared directly).
reverse=False gives increasing order.
unutbu's piece of code doesn't work on my Python 3.6.5; it leads to run-time errors. So I refactored it into the version below, which works fine on my test cases:
import numpy as np
from numpy import linalg as npla

def eigen(A):
    eigenValues, eigenVectors = npla.eig(A)
    idx = np.argsort(eigenValues)
    eigenValues = eigenValues[idx]
    eigenVectors = eigenVectors[:, idx]
    return (eigenValues, eigenVectors)
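A quick usage sketch (the example matrix is mine, not the original poster's):

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, vecs = eigen(A)
print(vals)        # [1. 3.]
print(vecs[:, 0])  # the eigenvector paired with the smallest eigenvalue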