I have a function that multiplies a matrix in scipy.sparse.csr_matrix form by a vector. I call this function many times for different values, so I would like the matrix * vector multiplication to be as efficient as possible. The matrix is an N x N matrix, but it contains only m x N non-zero elements, where m << N. The non-zero elements are currently scattered randomly about the matrix. I could perform row operations to rearrange this matrix so that all the elements lie on only m + 2 diagonals, and then use scipy.sparse.dia_matrix instead of scipy.sparse.csr_matrix. That would take quite a bit of work, so I was wondering whether anyone knows if this will even improve the computational efficiency?
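One way to answer the timing question before doing the row operations is to benchmark both formats on a synthetic banded matrix of the same size and bandwidth. A minimal sketch; N, m, and the placement of the m + 2 diagonals here are illustrative assumptions, not values from the question:

import numpy as np
import scipy.sparse as sp
from timeit import timeit

N, m = 100_000, 5                                    # assumed sizes, for illustration only
half = (m + 2) // 2
offsets = list(range(-half, m + 2 - half))           # m + 2 diagonals around the main one
diagonals = [np.random.rand(N - abs(k)) for k in offsets]
A_dia = sp.diags(diagonals, offsets, format='dia')   # banded matrix in DIA storage
A_csr = A_dia.tocsr()                                # the same matrix in CSR storage
v = np.random.rand(N)

print('dia:', timeit(lambda: A_dia @ v, number=100))
print('csr:', timeit(lambda: A_csr @ v, number=100))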
I would like to generate invertible matrices (specifically those from GL(n), a general linear group of size n) using Tensorflow and/or Numpy for use with my neural network.
How can this be done and what would be the best way of doing so?
I understand there is a way to generate symmetric invertible matrices by computing (A + A.T)/2 for an arbitrary square matrix A; however, I would like mine not to be restricted to symmetric ones.
I have found one way that I believe can generate a large variety of random invertible matrices, using diagonal dominance.
The theorem is that an n x n matrix is invertible if, for every row, the absolute value of the diagonal element is strictly larger than the sum of the absolute values of the other elements in that row. (Here is the corresponding Wikipedia article: https://en.wikipedia.org/wiki/Diagonally_dominant_matrix)
Therefore the following code snippet generates an arbitrary invertible matrix.
import numpy as np

n = 5                            # size of the invertible matrix to generate
m = np.random.rand(n, n)         # entries uniform in [0, 1)
mx = np.sum(np.abs(m), axis=1)   # row sums; these include the old (positive) diagonal, so dominance is strict
np.fill_diagonal(m, mx)          # m is now strictly diagonally dominant, hence invertible
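Since the question asks for TensorFlow and/or NumPy, the same construction translates directly to TensorFlow ops. A minimal sketch of that translation (my addition, not from the original post):

import tensorflow as tf

n = 5
m = tf.random.uniform((n, n))            # entries uniform in [0, 1)
mx = tf.reduce_sum(tf.abs(m), axis=1)    # row sums of absolute values
m = tf.linalg.set_diag(m, mx)            # strictly diagonally dominant, hence invertible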
I want to compute a sum of terms involving three jagged arrays x, tokens, and phi, all of the same shape:
x has length n, and each x[i] has its own length (n=5 in my example, but in reality n is very large, in the millions).
tokens has the same shape as x, but each entry is an integer between 1 and 200000.
Lastly, phi has the same shape as x and tokens.
I want to compute, for each v = 1, ..., 200000, the sum of the products x[i][j] * phi[i][j] over all i, j where tokens[i][j] == v.
Specifically, I want to get out a 200000-vector whose first entry is the sum over all indices i, j where tokens[i][j] == 1, and so on.
Now, the only ways I could think of to do this are brute force:
1. Turn x, tokens, and phi into sparse matrices of shape n x 200000 and then take the componentwise product np.multiply(x, phi). But this is way too large, since n is in the millions.
2. Manually find each location, something like t = np.argwhere(tokens == v), then take the product x[i][t] * phi[i][t]. But this also seems very slow, since I would have to search millions of arrays for 200000 different values.
Is there a better way, or which of these two would you pick?
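For what it's worth, here is a sketch of a third, vectorized option (my suggestion, not part of the original question): flatten all three jagged arrays once, then let np.bincount do the grouped summation, so no per-value search is needed:

import numpy as np

# x, tokens, phi are lists of 1-D arrays with matching lengths per entry
x_flat = np.concatenate(x)
tokens_flat = np.concatenate(tokens)
phi_flat = np.concatenate(phi)

# result[v] = sum of x[i][j] * phi[i][j] over all i, j with tokens[i][j] == v
result = np.bincount(tokens_flat, weights=x_flat * phi_flat, minlength=200001)
out = result[1:]   # entries for v = 1, ..., 200000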
This question is the same as this one, but for a sparse matrix (scipy.sparse). The solution given in the linked question uses indexing schemes that are incompatible with sparse matrices.
For context I am constructing a Jacobian for a large discretized PDE, so the B matrix in this case contains various relevant partial terms while A will be the complete Jacobian I need to invert for a Newton's method approximation. On a large grid A will be far too large to fit in memory, so I want to use sparse matrices.
I would like to construct an array with the following structure:
A[i,j,i,j] = B[i,j], with all other entries zero: A[i,j,l,k] = 0 for (i,j) != (l,k).
I.e., given the matrix B, how can I create the array A, preferably in a vectorized manner?
Explicitly, let B = [[1,2],[3,4]]
Then:
A[0,0,:,:] = [[1,0],[0,0]]
A[0,1,:,:] = [[0,2],[0,0]]
A[1,0,:,:] = [[0,0],[3,0]]
A[1,1,:,:] = [[0,0],[0,4]]
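Since scipy.sparse only supports 2-D matrices, one workable approach (my suggestion, not from the original post) is to note that flattening the index pairs (i,j) and (l,k) turns A into an (m*n) x (m*n) matrix whose diagonal is B.ravel(), which scipy.sparse can build directly. A minimal sketch:

import numpy as np
import scipy.sparse as sp

B = np.array([[1, 2], [3, 4]])
mB, nB = B.shape

# A[i,j,l,k] = B[i,j] if (i,j) == (l,k), else 0, with both index pairs flattened:
A_flat = sp.diags(B.ravel())   # sparse (mB*nB) x (mB*nB) diagonal matrix

# dense sanity check: reshape back to 4-D and compare with the definition
A = A_flat.toarray().reshape(mB, nB, mB, nB)
assert A[0, 1, 0, 1] == B[0, 1] and A[0, 1, 1, 0] == 0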
I need to write an algorithm that deals with square matrices of very low rank (compared to their dimension). I'd like to represent such a matrix as a sum of "products" of a (d, 1)-matrix with a (1, d)-matrix, storing only the list of vectors.
I'd also like left and right matrix multiplication to be carried out by applying the other matrix to the stored vectors: i.e., writing $M = \sum_i v_i w_i^T$, I'd like $TM = \sum_i (T v_i) w_i^T$, and similarly for right multiplication.
I've not seen anything like this in scipy, but it would be really useful, since matrix multiplication then reduces to a few matrix-vector multiplications.
Please note that the rank of my matrices is about 20, while their dimension is about 400,000, so this would save my computations a lot of time.
Please also note that such matrices are not sparse; they are just low rank and already decomposed into a sum of (d, 1)-by-(1, d) products.
How do you advise doing such a thing? Where can I find references on adding a matrix type to scipy?
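There is no built-in scipy type for this, but as a starting point you can wrap the factor lists yourself and expose the result as a scipy.sparse.linalg.LinearOperator. A minimal sketch under those assumptions (the class and method names are mine, purely illustrative):

import numpy as np
from scipy.sparse.linalg import LinearOperator

class LowRank:
    """M = V @ W.T, stored only via the (d, r) factor matrices V and W."""
    def __init__(self, V, W):
        self.V, self.W = V, W
    def matvec(self, x):
        # M @ x = V @ (W.T @ x): two thin products, O(d*r) instead of O(d^2)
        return self.V @ (self.W.T @ x)
    def left_mul(self, T):
        # T @ M = sum_i (T v_i) w_i^T: apply T to the left factors only
        return LowRank(T @ self.V, self.W)
    def right_mul(self, T):
        # M @ T = sum_i v_i (T.T w_i)^T: apply T.T to the right factors only
        return LowRank(self.V, T.T @ self.W)
    def as_operator(self):
        d = self.V.shape[0]
        return LinearOperator((d, d), matvec=self.matvec)

With rank about 20 and dimension about 400,000, this stores two 400,000 x 20 arrays instead of a 400,000 x 400,000 matrix, and every product costs a few matrix-vector multiplications.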
So I would like to generate a 50 x 50 covariance matrix for a random variable X, given the following conditions:
one variance is 10 times larger than the others
the parameters of X are only slightly correlated
Is there a way of doing this in Python/R etc? Or is there a covariance matrix that you can think of that might satisfy these requirements?
Thank you for your help!
OK, you only need one matrix and randomness isn't important. Here's a way to construct a matrix according to your description. Start with a 50 by 50 identity matrix. Assign 10 to the first (upper left) diagonal element. Assign a small number (I don't know what's appropriate for your problem; maybe 0.1? 0.01? It's up to you) to all the off-diagonal elements. Now take that matrix and square it (i.e. compute transpose(X) . X, where X is your matrix). Presto! You've squared the eigenvalues, so now you have a covariance matrix.
If the small element is small enough, X is already positive definite. But squaring guarantees it (assuming there are no zero eigenvalues, which you can verify by computing the determinant -- if the determinant is nonzero then there are no zero eigenvalues).
I assume you can find Python functions for these operations.
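In NumPy, the construction described above might look like the following sketch (the 0.01 is one arbitrary choice of the "small number" from the answer):

import numpy as np

n = 50
X = np.full((n, n), 0.01)      # small off-diagonal entries: slight correlation
np.fill_diagonal(X, 1.0)       # start from (roughly) the identity
X[0, 0] = 10.0                 # one entry much larger than the others

cov = X.T @ X                  # squaring makes the eigenvalues positive

assert np.allclose(cov, cov.T)                 # symmetric
assert np.all(np.linalg.eigvalsh(cov) > 0)     # positive definite

Note that squaring also squares the large entry, so the big variance ends up roughly 100 times the others; if you want a ratio close to 10, start from np.sqrt(10.) instead.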