Issue 1
Can somebody recommend a less awkward way of doing a Cholesky factorization in Python? The last line in particular bugs me.
SigmaSqrt = matrix(Sigma)
cvxopt.lapack.potrf(SigmaSqrt)
SigmaSqrt = matrix(np.tril(SigmaSqrt))
Issue 2
I have the problem that when one entire row and column are zero (e.g. all elements in the first row and all elements in the first column), LAPACK fails with the error that the matrix is not positive definite. What is the best way of dealing with this?
Currently I am doing this (which seems uber-awkward):
try:
    SigmaSqrt = matrix(Sigma)
    cvxopt.lapack.potrf(SigmaSqrt)
    SigmaSqrt = matrix(np.tril(SigmaSqrt))
except ArithmeticError:
    SigmaSqrt = matrix(Sigma.ix[1:, 1:])
    cvxopt.lapack.potrf(SigmaSqrt)
    SigmaSqrt = matrix(np.tril(SigmaSqrt))
    SigmaSqrt = sparse([[v0], [v0[1:].T, SigmaSqrt]])
You could just use numpy.linalg.cholesky. Also, if all of one row or all of one column are zeros, the matrix will be singular: it will have at least one eigenvalue equal to zero and therefore will not be positive definite. Since the Cholesky factorization is only defined for matrices that are "Hermitian (symmetric if real-valued) and positive-definite", it will not work for such a matrix.
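For example, a minimal sketch of that route, assuming Sigma can be converted to a dense, symmetric positive definite NumPy array and that a cvxopt matrix is still wanted at the end:
import numpy as np
from cvxopt import matrix

# np.linalg.cholesky returns the lower-triangular factor directly,
# so no separate np.tril step is needed
L = np.linalg.cholesky(np.asarray(Sigma, dtype=float))
SigmaSqrt = matrix(L)  # convert back to a cvxopt matrix if you still need one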
EDIT: to "deal with" your problem depends on what you want. Anything you do to make it work would yeild a cholesky that will not be the Cholesky of the original matrix. If you are doing an iterative process and it can be fudge a little, if it is already symmetric then use numpy.linalg.eigvalsh to find the eigenvalues of the matrix. Let d be the most negative eigenvalue. Then set A += (abs(d) + 1e-4) * np.indentity(len(A)). this will make it positive definite.
EDIT: It is a trick used in the Levenberg–Marquardt algorithm. This links to a Wikipedia article on Newton's method that mentions it, since the article on Levenberg–Marquardt doesn't go into this. Also, here is a paper on it as well. Basically this shifts all the eigenvalues by (abs(d) + 1e-4), which makes them all positive, a sufficient condition for the matrix to be positive definite.
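A minimal sketch of that shift, assuming A is a real symmetric NumPy array that may fail the positive definiteness check:
import numpy as np

# assuming A is a real symmetric array that may not be positive definite
d = np.linalg.eigvalsh(A).min()          # most negative eigenvalue
if d <= 0:
    # shift the whole spectrum so every eigenvalue becomes strictly positive
    A = A + (abs(d) + 1e-4) * np.identity(len(A))
L = np.linalg.cholesky(A)                # succeeds on the shifted matrix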
Another option is the chompack module: see the chompack homepage and chompack.cholesky.
I use it myself in combination with the cvxopt module, and it works perfectly with the (sparse) matrices from cvxopt.
Related
I'm translating a Python class to MATLAB. Most of it is straightforward, but I'm not so good with Python syntax (I hardly ever use it). I'm stuck on the following:
# find the basis that will be uncorrelated using the covariance matrix
basis = (sqrt(eigenvalues)[newaxis,:] * eigenvectors).transpose()
Can someone help me figure out what the equivalent Matlab syntax would be?
I've found via Google that np.newaxis increases the dimensionality of the array, and transpose is pretty self-explanatory. So for newaxis, something involving cat in MATLAB would probably do it, but to be honest I'm really not clear on how Python handles arrays.
Assuming eigenvalues is a 1D array of length N in Python, then sqrt(eigenvalues)[newaxis,:] would be a 1xN array. This is translated to MATLAB as either sqrt(eigenvalues) or sqrt(eigenvalues).', depending on the orientation of the eigenvalues array in MATLAB.
The * operation then does broadcasting (in MATLAB this is called singleton expansion). It looks like the operation multiplies each eigenvector by the square root of the corresponding eigenvalue (assuming eigenvectors are the columns).
If in MATLAB you computed the eigendecomposition like this:
[eigenvectors, eigenvalues] = eig(A);
then you’d just do:
basis = sqrt(eigenvalues) * eigenvectors.';
or
basis = (eigenvectors * sqrt(eigenvalues)).';
(note the parentheses) because eigenvalues is a diagonal matrix.
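For reference, a small NumPy sketch (not from the original post) of what the Python line is doing, namely column-wise scaling of the eigenvectors, assuming a symmetric covariance matrix cov decomposed with numpy.linalg.eigh:
import numpy as np

# assuming cov is a symmetric covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # eigenvectors are the columns

# sqrt(eigenvalues)[np.newaxis, :] has shape (1, N); broadcasting multiplies
# column k of eigenvectors by sqrt(eigenvalues[k]); the result is then transposed
basis = (np.sqrt(eigenvalues)[np.newaxis, :] * eigenvectors).T

# equivalent formulation with a diagonal matrix, mirroring the MATLAB version
basis_alt = (eigenvectors @ np.diag(np.sqrt(eigenvalues))).T
assert np.allclose(basis, basis_alt)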
I've computed the eigenvalues and eigenstates of a Hamiltonian in Python. I have a matrix psi containing all the wavefunctions in discrete space. I'd like to normalise the total wavefunction (or the 'ket'), i.e. the matrix of vectors, such that its modulus squared integrates to 1.
I've tried the following:
A= np.linalg.norm(abs(psi.T)**2)
normed_psi=psi.T/np.sqrt(A)
print(np.linalg.norm(normed_psi))
The matrix is transposed so I can access each state using psi[n].
However, the output of the print statement is:
20.44795885105457
When it should be 1. I feel like I'm not using linalg.norm correctly. I've also tried writing my own integration function using the trapezium rule, without success.
I'm not really sure as to what to do at this point. Any help would be great.
It seems you're confusing np.linalg.norm and np.sum: np.linalg.norm(abs(psi.T)**2) is the square root of the sum of the fourth powers of the entries, whereas the normalising constant you want is the square root of the sum of the squares. Up to the usual floating point issues, these two snippets should be identical:
normed_psi = psi.T / np.sqrt(np.sum(psi.T**2))
normed_psi = psi.T / np.linalg.norm(psi.T)
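If psi is complex-valued (as wavefunctions generally are), the first form needs the modulus, since psi**2 is not |psi|**2 for complex entries; np.linalg.norm already uses the modulus internally, so the second form works unchanged. A sketch assuming the psi from the question:
import numpy as np

# for complex psi the squared modulus is |psi|**2, not psi**2
normed_psi = psi.T / np.sqrt(np.sum(np.abs(psi.T) ** 2))
print(np.linalg.norm(normed_psi))   # ~1.0 up to floating point error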
I am trying to find the inverse of this 9x9 covariance matrix so I can use it with the Mahalanobis distance. However, the result I'm getting from the matrix inverse is a matrix full of values like 1.02939420e+16. I have been trying to find out why, considering that Wolfram gives me the correct answer, and this seems to have something to do with the condition number of the matrix, which in this case is 3.98290435292e+16.
Although I would like to understand the math behind this, what I really need at this moment is just a solution to this problem so I can continue with the implementation. Is there a way to find the inverse of such a matrix? Or is it somehow possible to find the inverse covariance matrix directly from the data instead?
Edit: Matrix data (same as the pastebin link)
[[ 0.46811097 0.15024959 0.01806486 -0.03029948 -0.12472314 -0.11952018 -0.14738093 -0.14655549 -0.06794621]
[ 0.15024959 0.19338707 0.09046136 0.01293189 -0.05290348 -0.07200769 -0.09317139 -0.10125269 -0.12769464]
[ 0.01806486 0.09046136 0.12575072 0.06507481 -0.00951239 -0.02944675 -0.05349869 -0.07496244 -0.13193147]
[-0.03029948 0.01293189 0.06507481 0.12214787 0.04527352 -0.01478612 -0.02879678 -0.06006481 -0.1114809 ]
[-0.12472314 -0.05290348 -0.00951239 0.04527352 0.164018 0.05474073 -0.01028871 -0.02695087 -0.03965366]
[-0.11952018 -0.07200769 -0.02944675 -0.01478612 0.05474073 0.13397166 0.06839442 0.00403321 -0.02537928]
[-0.14738093 -0.09317139 -0.05349869 -0.02879678 -0.01028871 0.06839442 0.14424203 0.0906558 0.02984426]
[-0.14655549 -0.10125269 -0.07496244 -0.06006481 -0.02695087 0.00403321 0.0906558 0.17054466 0.14455264]
[-0.06794621 -0.12769464 -0.13193147 -0.1114809 -0.03965366 -0.02537928 0.02984426 0.14455264 0.32968928]]
The matrix m you provide has a determinant that is numerically zero and is hence not invertible from a numerical point of view (this explains the huge values you get, which tend to blow up towards Inf):
In [218]: np.linalg.det(m)
Out[218]: 2.8479946613617788e-16
If you start doing linear algebra operations and problem solving, I strongly advise checking some basic concepts first, which will help you avoid numerical mistakes:
https://en.wikipedia.org/wiki/Invertible_matrix
You are faced with a very important and fundamental mathematical problem. If your method produces a non-invertible matrix, the method is in trouble: it is trying to solve an ill-posed problem. Probably all well-posed problems were solved in the 19th century. The most common way to solve ill-posed problems is regularization. Sometimes the Moore-Penrose pseudoinverse may be convenient; scipy.linalg provides one. But the pseudoinverse is not a shortcut: using it, you replace the non-solvable problem A with a solvable problem B. Sometimes the solution of problem B can successfully stand in for the non-existent solution of problem A, but whether it does is a matter of mathematical analysis.
A zero determinant means that your matrix has linearly dependent rows (or columns). In other words, some information in your model is redundant (it contains excessive or duplicate information). Re-develop your model to exclude the redundancy.
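Two minimal sketches of the options mentioned above (pseudoinverse and regularization), assuming cov is the 9x9 covariance matrix from the question as a NumPy array:
import numpy as np
from scipy import linalg

# Moore-Penrose pseudoinverse of the (numerically singular) covariance matrix
cov_pinv = linalg.pinv(cov)

# alternatively, a simple regularization: adding a small multiple of the
# identity makes the matrix invertible (eps is a tuning parameter)
eps = 1e-6
cov_reg_inv = np.linalg.inv(cov + eps * np.eye(cov.shape[0]))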
I have a numpy array points of shape [N,2] which contains the (x,y) coordinates of N points. I'd like to compute the mean distance of every point to all other points using an existing function (which we'll call cmp_dist and which I just use as a black box).
First, a verbose solution in "normal" Python to illustrate what I want to do (written off the top of my head):
mean_dist = []
for i, (x0, y0) in enumerate(points):
    dist = []
    for j, (x1, y1) in enumerate(points):
        if i == j:
            continue
        dist.append(cmp_dist(x0, y0, x1, y1))
    mean_dist.append(np.array(dist).mean())
I already found a "better" solution using list comprehensions (assuming list comprehensions are usually better) which seems to work just fine:
mean_dist = [np.array([cmp_dist(x0, y0, x1, y1) for j, (x1, y1) in enumerate(points) if not i == j]).mean()
             for i, (x0, y0) in enumerate(points)]
However, I'm sure there's a much better solution for this in pure numpy, hopefully some function that allows performing an operation on every element using all the other elements.
How can I write this code in pure numpy/scipy?
I tried to find something myself, but this is quite hard to google without knowing what such operations are called (my relevant math classes were quite a while back).
Edit: Not a duplicate of Fastest pairwise distance metric in python
The author of that question has a 1D array r and is satisfied with what scipy.spatial.distance.pdist(r, 'cityblock') returns (an array containing the distances between all points). However, pdist returns a flat array, that is, it is not clear which of the distances belongs to which pair of points (see my answer).
(Although, as explained in that answer, pdist is what I was ultimately looking for, it doesn't solve the problem as I've specified it in the question.)
Based on @ali_m's comment on the question ("Take a look at scipy.spatial.distance.pdist"), I found a "pure" numpy/scipy solution:
from scipy.spatial.distance import cdist
...
fct = lambda p0,p1: great_circle_distance(p0[0],p0[1],p1[0],p1[1])
mean_dist = np.sort(cdist(points,points,fct))[:,1:].mean(1)
That's definitely an improvement over my list comprehension "solution".
What I don't really like about this, though, is that I have to sort and slice the array to remove the 0.0 values that result from computing the distance between identical points (basically, that's my way of removing the diagonal entries of the matrix I get back from cdist).
Note two things about the above solution:
I'm using cdist, not pdist as suggested by @ali_m.
I'm getting back an array of the same size as points, which contains the mean distance from every point to all other points, just as specified in the original question.
pdist unfortunately just returns the distances in a flat array, that is, the distances are unlinked from the points they refer to, and that link is needed for the problem as I've described it in the original question.
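For what it's worth, scipy.spatial.distance.squareform can expand pdist's condensed output back into the full square matrix, which restores that link. A sketch assuming the same points and fct as above:
from scipy.spatial.distance import pdist, squareform

# squareform turns the condensed (flat) distance array into the full N x N matrix
D = squareform(pdist(points, fct))
# the diagonal is zero, so summing each row and dividing by N-1 gives the mean
# distance from every point to all the other points
mean_dist = D.sum(axis=1) / (len(points) - 1)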
However, since in the actual problem at hand I only need the mean over the means of all points (which I did not mention in the question), pdist serves me just fine:
from scipy.spatial.distance import pdist
...
fct = lambda p0,p1: great_circle_distance(p0[0],p0[1],p1[0],p1[1])
mean_dist_overall = pdist(points,fct).mean()
This would for sure be the definitive answer if I had asked for the mean of the means, but I purposely asked for the array of means for all points. Because I think there's still room for improvement in the above cdist solution, I won't accept this as THE answer.
I have a vector of floats (coming from an operation on an array) and a float value (which is actually an element of the array, but that's unimportant), and I need to find the smallest float out of them all.
I'd love to be able to find the minimum between them in one line in a 'Pythony' way.
MinVec = N[i,:] + N[:,j]
Answer = min(min(MinVec),N[i,j])
Clearly I'm performing two minimisation calls, and I'd love to be able to replace this with one call. Perhaps I could eliminate the vector MinVec as well.
As an aside, this is for a short program in Dynamic Programming.
TIA.
EDIT: My apologies, I didn't specify I was using numpy. The variable N is an array.
You can append the value, then minimize. I'm not sure what the relative time considerations of the two approaches are, though - I wouldn't necessarily assume this is faster:
Answer = min(np.append(MinVec, N[i, j]))
This is the same thing as the answer above but without using numpy (note that list.append returns None, so it can't be used inline; build the combined list first):
Answer = min([*MinVec, N[i, j]])