I need to write an algorithm that deals with square matrices of very low rank (compared to their dimension). I'd like to represent such matrices as a sum of "products" of a (d, 1)-matrix with a (1, d)-matrix, storing only the list of vectors.
I'd also like left and right matrix multiplication to be carried out by applying the other matrix to the stored vectors: i.e. writing $M = \sum_i v_i w_i^T$, I'd like $TM = \sum_i (T v_i) w_i^T$ and the like.
I've not seen any such thing in scipy, but it would be really useful, since matrix-matrix multiplication then reduces to a few matrix-vector multiplications.
Please note that the rank of my matrices is about 20, while their dimension is about 400,000, so this would save a lot of computation time.
Please also note that such matrices are not sparse; they are just low rank and already decomposed into a sum of products of (d, 1)- and (1, d)-matrices.
How would you advise doing such a thing? Where can I find references on adding a matrix type to scipy?
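For concreteness, here is roughly what I have in mind, sketched with scipy.sparse.linalg.LinearOperator (V and W hold the v_i and w_i as columns; the sizes and random data are just placeholders for my actual matrices):

import numpy as np
from scipy.sparse.linalg import LinearOperator

d, r = 400_000, 20                       # dimension and rank, placeholders
V = np.random.rand(d, r)                 # columns are the v_i
W = np.random.rand(d, r)                 # columns are the w_i, so M = V @ W.T

M = LinearOperator((d, d),
                   matvec=lambda x: V @ (W.T @ x),    # M @ x without ever forming M
                   rmatvec=lambda x: W @ (V.T @ x))   # M.T @ x

y = M.matvec(np.random.rand(d))          # costs O(d * r) instead of O(d ** 2)

Left-multiplying by a matrix T then amounts to replacing V with T @ V (and similarly for right multiplication), but I don't see a ready-made type that keeps track of this factored form for me.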
Related
The bottleneck of some code I have is:
for _ in range(n):
    W = np.dot(A, W)
where n can vary, A is a fixed size MxM matrix, W is Mx1.
Is there a good way to optimize this?
Numpy Solution
Since np.dot is just matrix multiplication for your shapes, you can write what you want as A^n * W, with ^ being repeated matrix multiplication ("matrix_power") and * ordinary matrix multiplication. So you can rewrite your code as
np.linalg.matrix_power(A, n) @ W
Linear Algebra Solution
You can do even better with linear algebra. Assume for the moment that W is an eigenvector of A, i.e. that A*W = a*W with a just a number; then it follows that A^n*W = a^n*W. Now you might object that W is usually not an eigenvector. Since matrix multiplication is linear, the same idea works whenever W can be written as a linear combination of eigenvectors, and there is even a generalisation of this idea in case W cannot be written as such a combination. If you want to read more about this, google diagonalization and Jordan normal form.
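A rough sketch of that idea, assuming A is diagonalizable (A, W and n as in the question):

import numpy as np

vals, V = np.linalg.eig(A)                  # columns of V are eigenvectors of A
c = np.linalg.solve(V, W)                   # coordinates of W in the eigenbasis
AnW = V @ (vals.reshape(-1, 1)**n * c)      # A^n @ W using only scalar powers
AnW = AnW.real                              # for real A the imaginary parts are round-off

Whether this beats matrix_power depends on how large n is, since the decomposition itself costs roughly one dense factorization.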
I would like to generate invertible matrices (specifically those from GL(n), a general linear group of size n) using Tensorflow and/or Numpy for use with my neural network.
How can this be done and what would be the best way of doing so?
I understand there is a way to generate symmetric invertible matrices by computing (A + A.T)/2 for arbitrary square matrices A; however, I would like mine not to be restricted to symmetric matrices.
I happened to have found one way which I believe can generate a large variety of random invertible matrices using diagonal dominance.
The theorem is that, given an n x n matrix, if for every row the absolute value of the diagonal element is strictly larger than the sum of the absolute values of the other elements in that row, then the matrix is invertible (here is the corresponding Wikipedia article: https://en.wikipedia.org/wiki/Diagonally_dominant_matrix).
Therefore the following code snippet generates a random invertible matrix.
import numpy as np

n = 5                              # size of invertible matrix I wish to generate
m = np.random.rand(n, n)
mx = np.sum(np.abs(m), axis=1)     # absolute row sums
np.fill_diagonal(m, mx)            # make each diagonal entry dominate its row
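As a quick sanity check (not part of the construction itself) you can verify the result:

print(np.linalg.matrix_rank(m) == n)     # True: m has full rank, hence it is invertible
m_inv = np.linalg.inv(m)                 # would raise LinAlgError if m were singular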
I am implementing the Crank-Nicolson 2D finite-difference method.
I get a matrix A which is banded with 1 band above and below the main diagonal, but it also contains 2 additional bands farther from the main diagonal, so it is NOT penta-diagonal.
A picture showing the structure is below. My matrix is the RHS one. The LHS is easy, it's the penta-diagonal one.
I couldn't find up until now a way to solve Ax = b with A being the RHS matrix from the photo in python.
I could barely find a name for it, in these lecture notes https://ocw.mit.edu/ans7870/2/2.086/F12/MIT2_086F12_notes_unit5.pdf it is called an 'outrigger' matrix (page 403).
At the moment I am using spsolve from scipy.sparse.linalg, into which I feed two arguments, namely sparse.csc_matrix(A) and sparse.csc_array(b), where A and b were defined initially as A = sparse.dok_matrix((size, size), dtype=np.complex64) and b = sparse.dok_array((size, 1), dtype=np.complex64) and then populated with values by iterating element by element through them.
It is extremely slow and I was wondering maybe someone more experienced knows a way to exploit the structure appearing in A.
Thank you!
You should consider using the Gauss-Seidel method.
If your system is diagonally dominant it will converge; if it is not, you can probably make it so by using a higher-resolution grid.
The iteration is x_new = inv(L) @ (b - U @ x), where both x and b have shape (N, M) and A has shape (N, N).
Let L = np.diag(np.diag(A)), vL = np.diag(A).reshape(N, 1) and U = A - L.
The inv(L) @ (b - U @ x) step can then be written as (b - U @ x) / vL, so each iteration has O(N) complexity if you use sparse matrices.
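A minimal sketch of that iteration (A is your sparse matrix, b a dense right-hand side; the fixed iteration count stands in for a proper convergence test):

import numpy as np
from scipy import sparse

vL = A.diagonal().reshape(-1, 1)         # diagonal of A as a column vector
U = A - sparse.diags(A.diagonal())       # off-diagonal part of A, kept sparse

x = np.zeros_like(b)                     # initial guess, shape (N, M)
for _ in range(200):                     # or loop until the update is small enough
    x = (b - U @ x) / vL                 # one sweep; U @ x costs O(nnz)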
If you want to make it even more efficient, you can perform the multiplication as a sum over the (rolled) diagonals:
np.roll(np.diag(np.roll(A, k, axis=0)) * x[:,0], -k, axis=0).reshape(N, M)
You can precompute the rolled diagonals; then your matrix multiplication is performed with 4 (or 5, if the structure is not symmetric) element-wise vector multiplications, plus some additional rolling and adding operations.
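Separately from the hand-rolled rolling above, scipy's DIA sparse format stores a matrix by exactly such diagonals, so another option is to assemble A in that format and let it do the diagonal-wise products; a toy sketch with made-up values and offsets ±1 and ±m for an m-wide grid:

import numpy as np
from scipy import sparse

N, m = 25, 5                                    # placeholder sizes (m = grid width)
main = np.full(N, 4.0)
near = np.full(N - 1, -1.0)                     # bands right next to the diagonal
far = np.full(N - m, -1.0)                      # the two 'outrigger' bands
A = sparse.diags([far, near, main, near, far],
                 offsets=[-m, -1, 0, 1, m], format='dia')

y = A @ np.random.rand(N)                       # matvec using only the stored diagonals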
Given a query vector (a one-hot vector) q of size 50000 x 1 and a large sparse matrix A of size 50000 x 50000 with about 0.3 billion nonzeros, I want to compute r = (A + A^2 + ... + A^S) q (usually 4 <= S <= 6).
I can compute the above equation iteratively using a loop:
r = np.zeros((50000, 1))
for i in range(S):
    q = A.dot(q)
    r += q
but I want a faster method.
My first thought was that A can be symmetric, so an eigendecomposition would help compute powers of A. But since A is a large sparse matrix, the decomposition produces dense matrices of the same size as A, which degrades performance (in both memory and speed).
Low-rank approximation was also considered, but A is large and sparse, so I am not sure which rank r is appropriate.
It is totally fine to pre-compute something, like B = A + A^2 + ... + A^S, but I need the last computation to be fast: computing Bq in less than 40 ms.
Are there any references, papers, or tricks for that?
Even if the matrix were not sparse, the iterative method is the way to go.
Multiplying A.dot(q) has complexity O(N^2), while computing A.dot(A^i) has complexity O(N^3).
The fact that q is sparse (indeed much more sparse than A) may help.
For the first iteration, A @ q can be computed as A[q_hot_index, :].T, since q is one-hot (for a symmetric A this row equals the corresponding column).
For the second iteration, A @ q is expected to have about the same density as A (about 10%), so it is still worth doing it sparsely.
From the third iteration onwards, A^i @ q will be dense.
Since you are accumulating the result, it is good that your r is not sparse: it avoids index manipulation.
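A sketch of what exploiting this could look like (q_hot_index marks the single 1 in q; S as in the question; the 0.5 density threshold for switching to dense is an arbitrary placeholder):

import numpy as np
from scipy import sparse

v = A[:, [q_hot_index]]                   # A @ q for a one-hot q is just one column of A
r = v.toarray()                           # accumulate into a dense r, as discussed above
for _ in range(S - 1):
    v = A @ v                             # sparse product while v is still sparse
    if sparse.issparse(v) and v.nnz > 0.5 * v.shape[0]:
        v = v.toarray()                   # switch to dense once v has mostly filled in
    r += v.toarray() if sparse.issparse(v) else v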
There are several different ways to store sparse matrices. I can't say I understand all of them in depth, but I think csr_matrix and csc_matrix are the most compact for generic sparse matrices.
Eigendecomposition is good when you need to compute P(A) itself; for computing P(A) q, the eigendecomposition becomes advantageous only when P has degree on the order of the size of A. The eigendecomposition has complexity O(N^3), a matrix-vector product has complexity O(N^2), and evaluating P(A) q for a polynomial of degree D via the eigendecomposition can be achieved in O(N^3 + N*D).
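For reference, a sketch of that eigendecomposition route (assuming A is symmetric so eigh applies, and that a dense copy A_dense is even affordable, which is itself a problem at N = 50000):

import numpy as np

w, V = np.linalg.eigh(A_dense)                # O(N^3), done once
c = V.T @ q                                   # coordinates of q in the eigenbasis, O(N^2)
p = sum(w**i for i in range(1, S + 1))        # polynomial evaluated on the eigenvalues, O(N*S)
r = V @ (p.reshape(-1, 1) * c)                # back to the original basis, O(N^2)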
Edit: answering questions from the comments
"it prevents index manipulation" <- Could you elaborate this?
Suppose you have a sparse vector [0,0,0,2,0,7,0]. This could be described as ((3,2), (5,7)). Now suppose you assign 1 to one element so that it becomes [0,0,0,2,1,7,0]; it is now represented as ((3,2), (4,1), (5,7)). The assignment is performed by insertion into an array, and inserting into an array has complexity O(nnz), where nnz is the number of nonzero elements. If you have a dense matrix you can always modify one element with complexity O(1).
What is the N in the complexity?
It is the number of rows, or columns, of the matrix A
About the eigen decomposition, do you want to say that it is worth it, i.e. that computing r can be achieved in O(N^3 + N*D) and not O(N^3 + N^2)?
Computing P(A) directly has complexity O(N^3 * D) (with a different constant), so for big matrices computing P(A) via the eigendecomposition is probably the most efficient. But P(A) x has O(N^2 * D) complexity, so it is probably not a good idea to compute P(A) x with the eigendecomposition unless you have a big D (> N), as far as speed is concerned.
I'm trying to implement the idea I have suggested here, for the Cauchy product of multivariate finite power series (i.e. polynomials) represented as NumPy ndarrays. numpy.convolve does the job for 1D arrays, but to the best of my knowledge there is no implementation of convolution for arbitrary-dimensional arrays. In the above link I have suggested an equation for the convolution of two n-dimensional arrays Phi of shape P=[p1,...,pn] and Psi of shape Q=[q1,...,qn], where:
omegas are the elements of n dimensional array Omega of the shape O=P+Q-1
<A,B>_F is the generalization of Frobenius inner product for arbitrary dimensional arrays A and B of the same shape
A^F is A flipped in all n directions
{A}_[k1,...,kn] is a slice of A starting from [0,...,0] to [k1,...,kn]
Psi' is Psi extended with zeros to have the shape O as defined above
I tried implementing the above functions one by one:
import numpy as np
def crop(A, D1, D2):
    return A[tuple(slice(D1[i], D2[i]) for i in range(D1.shape[0]))]
as was suggested here, slices/crops A from D1 to D2,
def sumall(A):
    sum1 = A
    for k in range(A.ndim):
        sum1 = np.sum(sum1, axis=0)
    return sum1
is a generalization of numpy.sum for multidimensional ndarrays,
def flipall(A):
    A1 = A
    for k in range(A.ndim):
        A1 = np.flip(A1, k)
    return A1
flips A along all existing axes, and finally
def conv(A, B, K):
    D0 = np.zeros(K.shape, dtype=K.dtype)
    return sumall(np.multiply(
        crop(A, np.maximum(D0, np.minimum(A.shape, K - B.shape)), np.minimum(A.shape, K)),
        flipall(crop(B, np.maximum(D0, np.minimum(B.shape, K - A.shape)), np.minimum(B.shape, K)))))
where K = [k1, ..., kn] with 0 <= kj <= oj for all j, is a modified version of the formula above which only calculates the non-zero multiplications, to be more efficient (a naive loop that populates Omega with these helpers is shown after my questions below). Now I'm trying to populate the Omega array using fromfunction or meshgrid in combination with vectorize, as suggested here, but I have failed so far. My questions, in prioritized order, are:
how can I implement the final step and populate the final array in an efficient and pythonic way?
are there more efficient implementations of the functions above? Or how would you implement the formula?
is my equation correct? does this represent multiplication of multivariate finite power series?
have others really not implemented this before in NumPy, or am I reinventing the wheel here? I would appreciate it if you could point me towards other solutions.
I would appreciate it if you could help me with these questions. Thanks in advance for your help.
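For reference, the naive, non-vectorized way I can populate Omega with the helpers above is a plain loop over np.ndindex (Phi and Psi here are just small example arrays, and conv is given the exclusive end index K = k + 1, which is how I read my own indexing):

Phi = np.random.rand(2, 3)                    # example coefficient arrays
Psi = np.random.rand(4, 2)
O = tuple(np.array(Phi.shape) + np.array(Psi.shape) - 1)
Omega = np.zeros(O)
for idx in np.ndindex(*O):
    K = np.array(idx) + 1                     # exclusive end index of the slice
    Omega[idx] = conv(Phi, Psi, K)

For 1-D inputs this appears to match np.convolve(Phi, Psi), but it is exactly this Python-level loop that I would like to vectorize.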
P.S.1 You may find some examples and other information in this GitHub Gist
P.S.2 Here in the AstroPy mailing list I was told that scipy.signal.convolve and/or scipy.ndimage.convolve do the job for higher dimensions as well. There is also a scipy.ndimage.filters.convolve. Here I have explained why they are not what I'm looking for.