python numpy sort eigenvalues

I am using linalg.eig(A) to get the eigenvalues and eigenvectors of a matrix. Is there an easy way to sort these eigenvalues (and associated vectors) in order?

You want to use the NumPy sort() and argsort() functions. argsort() returns the permutation of indices needed to sort an array, so if you want to sort the eigenvalues by value (NumPy sorts in ascending order, smallest to largest, by default), you can do:
import numpy as np
A = np.asarray([[1,2,3],[4,5,6],[7,8,9]])
eig_vals, eig_vecs = np.linalg.eig(A)
eig_vals_sorted = np.sort(eig_vals)
eig_vecs_sorted = eig_vecs[:, eig_vals.argsort()]
# Alternatively, to avoid making new arrays
# do this:
sort_perm = eig_vals.argsort()
eig_vals.sort() # <-- This sorts the array in place.
eig_vecs = eig_vecs[:, sort_perm]

np.linalg.eig will often return complex values. You may want to consider using np.sort_complex(eig_vals).
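When the eigenvalues are complex, it can help to say explicitly what "sorted" means. The following is a minimal sketch, not part of the original answers; the matrix A is chosen only for illustration because it has a complex-conjugate pair of eigenvalues. It sorts by magnitude with np.abs and argsort, keeps the eigenvector columns aligned, and compares that with np.sort_complex, which orders by real part and then imaginary part and returns values only.

```python
import numpy as np

# Illustrative matrix (an assumption): eigenvalues are +2j, -2j and 1.
A = np.array([[0.0, -2.0, 0.0],
              [2.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
eig_vals, eig_vecs = np.linalg.eig(A)

# Sort by magnitude |lambda|, smallest first, and reorder the
# eigenvector columns with the same permutation.
order = np.argsort(np.abs(eig_vals))
eig_vals_by_mag = eig_vals[order]
eig_vecs_by_mag = eig_vecs[:, order]

# np.sort_complex sorts by real part, then imaginary part.
# It returns the sorted values only, not a permutation.
vals_lex = np.sort_complex(eig_vals)

print(eig_vals_by_mag)
print(vals_lex)
```

Because np.sort_complex does not return indices, use argsort whenever the eigenvectors must stay paired with their eigenvalues.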

Related

python equivalent for `eigs` in matlab with a matrix function

I want to calculate the k smallest eigenvalues of the matrix product AA', where A is of size 300K by 512 and "'" denotes the transpose. Forming AA' explicitly and solving it the traditional way would be infeasible. MATLAB, however, lets you pass the eigs function a function handle that performs the product: Afun = @(x) A*(A'*x);. Then, to find the 6 smallest eigenvalues/eigenvectors, we call d = eigs(Afun,300000,6,'smallestabs'), where the second input is the size of the matrix AA'. Is there a function in Python that does something similar?
To my knowledge, there is no such functionality in NumPy. However, as long as the full square matrix fits in memory, you can simply use numpy.linalg.eigvals to retrieve an array of the matrix eigenvalues, and then find the N smallest with a sort:
import numpy as np
A = np.array([[2.0, 0.0], [0.0, 3.0]]) # placeholder; replace with your square matrix
eigvals = np.linalg.eigvals(A)
eigvals.sort()
smallest_6_eigvals = eigvals[:6]
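The function-handle form of MATLAB's eigs has a close counterpart in SciPy rather than NumPy: scipy.sparse.linalg.eigsh accepts a LinearOperator whose matvec applies A*(A'*x) without ever forming AA'. The sketch below is an illustration under assumptions, not a tuned solution: the small random A stands in for the 300K by 512 matrix, and it asks for the largest eigenvalues because those converge quickly. Passing which="SA" requests the smallest algebraic eigenvalues instead, though the iteration may converge slowly there. Note also that AA' has rank at most 512, so most of its eigenvalues are zero.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 8))   # small stand-in for the 300K x 512 matrix
n = A.shape[0]

# Matrix-free operator: computes (A @ A.T) @ x as A @ (A.T @ x),
# so the n x n product is never stored.
AAt = LinearOperator((n, n), matvec=lambda x: A @ (A.T @ x), dtype=A.dtype)

# Three largest (algebraic) eigenvalues and their eigenvectors.
vals, vecs = eigsh(AAt, k=3, which="LA")

print(np.sort(vals))
```

eigsh is meant for symmetric (Hermitian) operators, which AA' always is; for a general operator, scipy.sparse.linalg.eigs takes the same kind of LinearOperator.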

How to generate a number of random vectors starting from a given one

I have an array of values and would like to create a matrix from it, where each row is my starting vector multiplied by a sample from a (normal) distribution.
The number of rows of this matrix then depends on the number of samples I want.
%pylab
my_vec = array([1,2,3])
my_rand_vec = my_vec*randn(100)
The last command does not work because the array shapes do not match.
I could use a for loop, but I am trying to leverage array operations.
Try this
my_rand_vec = my_vec[None,:]*randn(100)[:,None]
For small numbers I get for example
import numpy as np
my_vec = np.array([1,2,3])
my_rand_vec = my_vec[None,:]*np.random.randn(5)[:,None]
my_rand_vec
# array([[ 0.45422416, 0.90844831, 1.36267247],
# [-0.80639766, -1.61279531, -2.41919297],
# [ 0.34203295, 0.6840659 , 1.02609885],
# [-0.55246431, -1.10492863, -1.65739294],
# [-0.83023829, -1.66047658, -2.49071486]])
Your attempt my_vec*randn(100) does not work because * is element-wise multiplication, which only works if the two shapes are identical or broadcast-compatible; the shapes (3,) and (100,) are neither.
What you have to do is add an extra dimension using [None,:] and [:,None] so that NumPy's broadcasting works: shapes (1,3) and (100,1) broadcast to (100,3).
As a side note, I would recommend not using pylab. Instead, import modules explicitly with import ... as (for example, import numpy as np).
It is the outer product of vectors:
my_rand_vec = numpy.outer(randn(100), my_vec)
You can pass the dimensions of the array you require to numpy.random.randn:
my_rand_vec = my_vec*np.random.randn(100,3)
To multiply each vector by the same random number, you need to add an extra axis:
my_rand_vec = my_vec*np.random.randn(100)[:,np.newaxis]
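The answers above that scale each row by a single random number all compute the same thing. Here is a small check, offered as a sketch: the seeded generator and the sample count of 5 are assumptions made only to keep the example short and repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded only for repeatability
my_vec = np.array([1, 2, 3])
samples = rng.standard_normal(5)    # one random factor per output row

by_broadcast = my_vec[None, :] * samples[:, None]   # (1, 3) * (5, 1) -> (5, 3)
by_newaxis = my_vec * samples[:, np.newaxis]        # (3,) * (5, 1) -> (5, 3)
by_outer = np.outer(samples, my_vec)                # outer product, also (5, 3)

print(by_broadcast.shape)                       # (5, 3)
print(np.allclose(by_broadcast, by_outer))      # True
print(np.allclose(by_broadcast, by_newaxis))    # True
```

By contrast, my_vec*np.random.randn(100,3) draws an independent random number for every element, so its rows are not scalar multiples of my_vec.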

How can this numpy 2D sorted array creation be optimized?

I have a NxM matrix called coefficients that I want to sort:
import numpy
N = 10
M = 42
coefficients = numpy.random.uniform(size=(N, M))
I have an array called order with N elements that says the order that the rows of coefficients should be in:
order = numpy.random.choice(range(N), N, False)
I'm sorting coefficients by sorting order:
coefficients = numpy.array([mag for (orig, mag)
in sorted(zip(order, coefficients),
key=lambda pair: pair[0])])
This works, but it's probably slower than it should be. If this was in 1D, I'd use fromiter, but I don't know how to tackle this since it's 2D. Is there an optimization I can make here?
To answer your question, just coefficients[order.argsort()] is enough :)
See also Numpy: sort by key function.
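To confirm that the one-liner matches the original loop, the two can be compared directly. This is a sketch under the assumption that order holds distinct keys, as it does in the question (drawn without replacement); rng.permutation is used here as an equivalent way to generate it.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 10, 42
coefficients = rng.uniform(size=(N, M))
order = rng.permutation(N)          # sort key for each row, no repeats

# Original approach: sort (key, row) pairs in pure Python.
slow = np.array([row for _, row in
                 sorted(zip(order, coefficients), key=lambda pair: pair[0])])

# Vectorized approach: index rows with the permutation that sorts the keys.
fast = coefficients[order.argsort()]

print(np.array_equal(slow, fast))   # True
```

The indexing version avoids building Python tuples for every row, which is where the original spends most of its time.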

specifying missing values to pdist in scipy

how can missing values be specified when calling pdist in scipy? i.e. the function described here:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html
for example if you have:
pdist(X, "euclidean")
but X might contain missing values like the string "NA" and you want those to be excluded in pairwise comparisons among X's columns. the behavior i'm looking for is to not consider missing values when getting the euclidean distance between any pair of columns in X.
The best way is to fill your X array with np.nan for the points to be excluded. For example, assuming a 2D case with X a (10,2) array:
import numpy as np
from scipy.spatial.distance import pdist
X = np.random.rand(10, 2)
Let's assume you want to exclude X[7] from the calculation:
X[7] = np.nan
my_dist = pdist(X, "euclidean")
Then, you'll see that my_dist has 'nan' for the pairs that involved calculating distance with the excluded element. You can exclude multiple elements.
A better idea would be to use a numpy masked array, but pdist ignores masked arrays and uses the data anyway. However, once you have the output my_dist, you can convert it to a masked array so that the nans don't get in the way of future array operations:
my_dist = np.ma.array(my_dist, mask = ~np.isfinite(my_dist))
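If the goal is instead to skip only the missing coordinates within each pair, rather than lose the whole pair to nan, one option is a small NumPy helper. The function below, nan_euclidean_pdist, is a hypothetical name and not part of SciPy. It is a sketch that assumes missing entries are already np.nan, that X has at least two rows, and that no rescaling for the number of skipped coordinates is wanted. It returns distances between rows in the same condensed order that pdist uses.

```python
import numpy as np

def nan_euclidean_pdist(X):
    """Pairwise Euclidean distances between rows, ignoring any
    coordinate that is NaN in either row (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    out = []
    for i in range(n - 1):
        diff = X[i + 1:] - X[i]     # NaN wherever either value is missing
        # nansum treats NaN terms as zero, so they drop out of the sum.
        out.append(np.sqrt(np.nansum(diff ** 2, axis=1)))
    return np.concatenate(out)      # order: (0,1), (0,2), ..., (1,2), ...

X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [np.nan, 1.0]])
print(nan_euclidean_pdist(X))       # distances 5, 1 and 3
```

A pair with no shared coordinates gets distance 0 under this scheme, so check for that case separately if it can occur in your data.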

sort eigenvalues and associated eigenvectors after using numpy.linalg.eig in python

I'm using numpy.linalg.eig to obtain a list of eigenvalues and eigenvectors:
A = someMatrixArray
from numpy.linalg import eig as eigenValuesAndVectors
solution = eigenValuesAndVectors(A)
eigenValues = solution[0]
eigenVectors = solution[1]
I would like to sort my eigenvalues (e.g. from lowest to highest), in a way I know what is the associated eigenvector after the sorting.
I'm not finding any way of doing that with python functions. Is there any simple way or do I have to code my sort version?
Use numpy.argsort. It returns the indices one would use to sort the array.
import numpy as np
import numpy.linalg as linalg
A = np.random.random((3,3))
eigenValues, eigenVectors = linalg.eig(A)
idx = eigenValues.argsort()[::-1]
eigenValues = eigenValues[idx]
eigenVectors = eigenVectors[:,idx]
If the eigenvalues are complex, the sort order is lexicographic (that is, complex numbers are sorted according to their real part first, with ties broken by their imaginary part).
The answer above by unutbu is crisp and concise. Here is another way to do it that is more general and can be used for lists as well.
import scipy.linalg as sp
evals, evecs = sp.eig(A)
# eig returns eigenvectors as columns, so pair each eigenvalue with a row of evecs.T
ev_list = sorted(zip(evals, evecs.T), key=lambda tup: tup[0], reverse=False)
evals, evecs = zip(*ev_list) # tuples; evecs[i] is the eigenvector for evals[i]
Here tup[0] is the eigenvalue, which is the key the sort uses.
reverse=False gives increasing order.
unutbu's code doesn't work on my Python 3.6.5. It leads to run-time errors. So I refactored his/her code into this version, which works on my test cases:
import numpy as np
from numpy import linalg as npla

def eigen(A):
    eigenValues, eigenVectors = npla.eig(A)
    idx = np.argsort(eigenValues)
    eigenValues = eigenValues[idx]
    eigenVectors = eigenVectors[:, idx]
    return (eigenValues, eigenVectors)
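If the matrix happens to be symmetric (or Hermitian), which the question does not state, the manual sort may be unnecessary: np.linalg.eigh returns real eigenvalues already in ascending order, with the eigenvectors as matching columns. A minimal sketch, using a random symmetric matrix purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
S = B + B.T                         # symmetric test matrix (an assumption)

# eigh: eigenvalues ascending, eigenvectors as matching columns.
w, V = np.linalg.eigh(S)

# For descending order, reverse both with the same slice.
w_desc, V_desc = w[::-1], V[:, ::-1]

# Each column still satisfies S v = lambda v after reordering.
print(np.allclose(S @ V_desc, V_desc * w_desc))   # True
```

For general, non-symmetric matrices, np.linalg.eig with argsort as shown above remains the way to go.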
