I want to generate a zero-mean Gaussian matrix, e.g., M of size (n, n), in Python such that
where the four-dimensional array A with entries is given. Is there any way to do that without reshaping M into a vector?
I would like to generate invertible matrices (specifically those from GL(n), a general linear group of size n) using Tensorflow and/or Numpy for use with my neural network.
How can this be done and what would be the best way of doing so?
I understand there is a way to generate symmetric invertible matrices by computing (A + A.T)/2 for arbitrary square matrices A, however, I would like mine to not just be symmetric.
I happened to have found one way which I believe can generate a large variety of random invertible matrices using diagonal dominance.
The theorem is that an n x n matrix is invertible if, in every row, the absolute value of the diagonal element is strictly larger than the sum of the absolute values of the other elements in that row (strict diagonal dominance). Here is the corresponding Wikipedia article: https://en.wikipedia.org/wiki/Diagonally_dominant_matrix
Therefore the following code snippet generates an arbitrary invertible matrix.
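import numpy as np
n = 5  # size of the invertible matrix I wish to generate
m = np.random.rand(n, n)
mx = np.sum(np.abs(m), axis=1)  # row sums of absolute values (diagonal included)
np.fill_diagonal(m, mx)  # diagonal now exceeds the sum of the other entries in its row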
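A minimal sketch of the same idea with an explicit check, in case it helps. The function names, the signed entries, and the margin of 1.0 added to the diagonal are assumptions made for illustration, not part of the original snippet. Note that strictly diagonally dominant matrices are only a subset of GL(n).

```python
import numpy as np

def random_diagonally_dominant(n, rng=None):
    """Return a random (n, n) strictly diagonally dominant matrix (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    m = rng.uniform(-1.0, 1.0, size=(n, n))
    # Sum of the absolute off-diagonal entries in each row.
    off_diag = np.sum(np.abs(m), axis=1) - np.abs(np.diag(m))
    # Put each diagonal entry strictly above that sum (margin of 1.0 is an
    # arbitrary choice); random signs avoid an all-positive diagonal.
    signs = rng.choice([-1.0, 1.0], size=n)
    np.fill_diagonal(m, signs * (off_diag + 1.0))
    return m

def is_strictly_diagonally_dominant(m):
    """Check |m[i, i]| > sum of |m[i, j]| over j != i, for every row i."""
    diag = np.abs(np.diag(m))
    off_diag = np.sum(np.abs(m), axis=1) - diag
    return bool(np.all(diag > off_diag))

m = random_diagonally_dominant(5, np.random.default_rng(0))
```

Calling is_strictly_diagonally_dominant(m) on the result confirms that the condition of the theorem holds before the matrix is used.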
I have derived d sparse matrices m[d] of size (n, n) each, and I would like to stack them along a new axis in order to build a sparse matrix of size (n, n, d).
I tried building this stacked matrix with np.stack([m[i] for i in range(d)], axis=-1), but this yields a numpy.ndarray of size d and not a sparse matrix (in that format, I can't use scipy.sparse.save_npz, which is what I ultimately want to do). scipy.sparse only comes with vstack and hstack, neither of which suits my need here.
Is there a way to build such a matrix?
Is there a way to build a sparse matrix with more than two axes at all?
Notes:
All sparse matrices have the same number of stored elements m[d], and these elements have the same coordinates in the matrix, so stacking them should be straightforward.
To give some context, I encountered this problem trying to compute the gradient of a function f defined on a mesh surface. This function associates each vertex i of the mesh surface with a vector f(i) of size d. All edges (i,j) can be stored in a sparse matrix of size (n, n). Finally, each matrix m[d] contains the gradient along the dth dimension for each edge (i, j) of the mesh.
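One possible workaround, sketched under the assumption that the scipy.sparse matrix classes in use are strictly two-dimensional: keep the data 2-D by placing the d matrices side by side with hstack, so that entry (i, j, k) of the conceptual (n, n, d) array sits at (i, k * n + j). The edge list, the sizes, and the get_slice helper below are hypothetical and only serve the illustration.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as sp

n, d = 4, 3
rng = np.random.default_rng(0)

# Hypothetical shared sparsity pattern: every m[k] stores the same coordinates.
rows = np.array([0, 1, 2, 3])
cols = np.array([1, 2, 3, 0])
m = [sp.csr_matrix((rng.random(rows.size), (rows, cols)), shape=(n, n))
     for _ in range(d)]

# Entry (i, j, k) of the conceptual 3-D array is stored at (i, k * n + j).
stacked = sp.hstack(m, format="csr")  # shape (n, n * d)

# The result is an ordinary 2-D sparse matrix, so save_npz accepts it.
path = os.path.join(tempfile.mkdtemp(), "stacked.npz")
sp.save_npz(path, stacked)
loaded = sp.load_npz(path)

def get_slice(stacked_2d, k, n):
    """Recover the k-th (n, n) sparse matrix from the side-by-side layout."""
    return stacked_2d[:, k * n:(k + 1) * n]
```

Because all matrices share the same coordinates, another option would be to store the coordinates once together with a dense (number of stored elements, d) array of values.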
This question is the same as this one, but for a sparse matrix (scipy.sparse). The solution given to the linked question used indexing schemes that are incompatible with sparse matrices.
For context I am constructing a Jacobian for a large discretized PDE, so the B matrix in this case contains various relevant partial terms while A will be the complete Jacobian I need to invert for a Newton's method approximation. On a large grid A will be far too large to fit in memory, so I want to use sparse matrices.
I would like to construct an array with the following structure:
A[i,j,i,j] = B[i,j], with all other entries 0: A[i,j,l,k] = 0 for (i,j) != (l,k)
I.e., if I have the B matrix constructed, how can I create the matrix A, preferably in a vectorized manner?
Explicitly, let B = [[1,2],[3,4]]
Then:
A[1,1,:,:]=[[1,0],[0,0]]
A[1,2,:,:]=[[0,2],[0,0]]
A[2,1,:,:]=[[0,0],[3,0]]
A[2,2,:,:]=[[0,0],[0,4]]
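A minimal sketch of one way to get this structure, assuming the index pair (i, j) is flattened in row-major order so that the four-index array becomes a 2-D sparse matrix: A is then simply a diagonal matrix with B.ravel() on its diagonal. The names A_flat and A_dense are chosen for illustration, and the code uses Python's 0-based indices where the example above uses 1-based ones.

```python
import numpy as np
import scipy.sparse as sp

B = np.array([[1.0, 2.0], [3.0, 4.0]])
rows, cols = B.shape

# Flatten (i, j) -> p = i * cols + j. The condition A[i, j, i, j] = B[i, j]
# with zeros elsewhere becomes A_flat[p, p] = B.ravel()[p].
A_flat = sp.diags(B.ravel(), format="csr")  # shape (rows * cols, rows * cols)

# Dense 4-D view, only for checking small cases; avoid this on a large grid.
A_dense = A_flat.toarray().reshape(rows, cols, rows, cols)
```

In this flattened form A_flat can be passed to a sparse solver with the right-hand side flattened the same way, so the four-index array never has to be stored densely.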
I am using Python to find the covariance matrix between two images, e.g. of size (N, N), but numpy.cov or numpy.corrcoef always returns a matrix of size (2N, 2N), which I don't understand.
Shouldn't the covariance matrix have the same size as an (N, N) array?
The (2N, 2N) result is made of four (N, N) squares. The upper-left square is the covariance within the first image. The lower-right square is the covariance within the second image. The other two squares are the covariance between the images; each should be the transpose of the other.
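The block structure can be checked with a short sketch. With its default settings numpy.cov treats every row of each input as a separate variable, so two (N, N) images give 2N variables. The image contents and variable names below are made up for illustration.

```python
import numpy as np

N = 4
rng = np.random.default_rng(0)
img1 = rng.random((N, N))
img2 = rng.random((N, N))

# Each row is treated as a variable: N rows from img1 followed by N from img2.
full = np.cov(img1, img2)   # shape (2N, 2N)

within_1 = full[:N, :N]     # covariance among the rows of img1
within_2 = full[N:, N:]     # covariance among the rows of img2
between = full[:N, N:]      # covariance between rows of img1 and rows of img2

# Treating each whole image as a single variable with N * N observations
# gives a (2, 2) matrix instead; the off-diagonal entry is the image-to-image
# covariance.
pixelwise = np.cov(img1.ravel(), img2.ravel())
```

Which of the two forms is wanted depends on whether the rows or the whole images are meant to be the random variables.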
I'm using the module hcluster to calculate a dendrogram from a distance matrix. My distance matrix is an array of arrays generated like this:
import hcluster
import numpy as np
mols = (..a list of molecules)
distMatrix = np.zeros((10, 10))
for i in range(0,10):
    for j in range(0,10):
        sim = OETanimoto(mols[i],mols[j]) # a function to calculate similarity between molecules
        distMatrix[i][j] = 1 - sim
I then use the command distVec = hcluster.squareform(distMatrix) to convert the matrix into a condensed vector and calculate the linkage matrix with vecLink = hcluster.linkage(distVec).
All this works fine, but if I calculate the linkage matrix using the distance matrix and not the condensed vector, matLink = hcluster.linkage(distMatrix), I get a different linkage matrix (the distances between the nodes are a lot larger and the topology is slightly different).
Now I'm not sure whether this is because hcluster only works with condensed vectors or whether I'm making mistakes on the way there.
Thanks for your help!
I knocked up a quick random example similar to yours and experienced the same problem.
The docstring does say:
Performs hierarchical/agglomerative clustering on the
condensed distance matrix y. y must be a :math:`{n \choose 2}` sized
vector where n is the number of original observations paired
in the distance matrix.
However, having had a quick look at the code, it seems like the intent is for it to work with both vector-shaped and matrix-shaped input:
In hierarchy.py there is a switch based upon the shape of the matrix.
It seems however that the key bit of info is in the function linkage's docstring:
- Q : ndarray
    A condensed or redundant distance matrix. A condensed
    distance matrix is a flat array containing the upper
    triangular of the distance matrix. This is the form that
    ``pdist`` returns. Alternatively, a collection of
    :math:`m` observation vectors in n dimensions may be passed as
    a :math:`m` by :math:`n` array.
So I think that the interface doesn't allow passing a square distance matrix.
Instead, it thinks you are passing it m observation vectors in n dimensions.
Hence the difference in results?
Does that seem reasonable?
Otherwise, just take a look at the code itself; I'm sure you'll be able to debug it and figure out why your examples are different.
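A small sketch that reproduces the effect, assuming scipy.cluster.hierarchy behaves like hcluster here (it exposes the same linkage and squareform names). The random points stand in for the molecules and are purely illustrative.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
points = rng.random((10, 3))        # stand-in data, not the molecules
dist_vec = pdist(points)            # condensed distance vector
dist_matrix = squareform(dist_vec)  # redundant (10, 10) distance matrix

# Intended usage: pass the condensed vector.
link_from_vec = linkage(dist_vec)

with warnings.catch_warnings():
    # SciPy may warn that the 2-D input looks like a distance matrix.
    warnings.simplefilter("ignore")
    # A 2-D input is read as 10 observation vectors in 10 dimensions...
    link_from_matrix = linkage(dist_matrix)
    # ...which is the same as clustering the rows of the matrix as points.
    link_rows_as_points = linkage(pdist(dist_matrix))
```

Here link_from_matrix matches link_rows_as_points rather than link_from_vec, which is consistent with the square matrix being interpreted as observations.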
Cheers
Matt