Numpy: smart matrix multiplication to sparse result matrix - python

In python with numpy, say I have two matrices:
S, a sparse x*x matrix
M, a dense x*y matrix
Now I want to compute np.dot(M, M.T), which will return a dense x*x matrix S_.
However, I only care about the cells that are nonzero in S, which means that it would not make a difference for my application if I did
S_ = S*S_
Obviously, that would be a waste of operations, as I would like to leave out the irrelevant cells (those that are zero in S) altogether. Remember that in this matrix multiplication
S_[i,j] = np.sum(M[i,:]*M[j,:])
So I want to do this operation only for i,j such that S[i,j]=True.
Is this supported somehow by numpy implementations that run in C so that I do not need to implement it with python loops?
EDIT 1 [solved]: I still have this problem, actually M is now also sparse.
Now, given rows and cols of S, I implemented it like this:
data = np.array([ M[rows[i],:].dot(M[cols[i],:]).data[0] for i in xrange(len(rows)) ])
S_ = csr( (data, (rows,cols)) )
... but it is still slow. Any new ideas?
EDIT 2: jdehesa has given a great solution, but I would like to save more memory.
The solution was to do the following:
data = M[rows,:].multiply(M[cols,:]).sum(axis=1)
and then build a new sparse matrix from rows, cols and data.
However, when running the above line, scipy builds a (contiguous) numpy array with as many elements as nnz of the first submatrix plus nnz of the second submatrix, which can lead to MemoryError in my case.
In order to save more memory, I would like to multiply iteratively each row with its respective 'partner' column, then sum over and discard the result vector. Using simple python to implement this, basically I am back to the extremely slow version.
Is there a fast way of solving this problem?

Here is how you can do it with NumPy/SciPy, both for dense and sparse M matrices:
import numpy as np
import scipy.sparse as sp
# Coordinates where S is True
S = np.array([[0, 1],
[3, 6],
[3, 4],
[9, 1],
[4, 7]])
# Dense M matrix
# Random big matrix
M = np.random.random(size=(1000, 2000))
# Take relevant rows and compute values
values = np.sum(M[S[:, 0]] * M[S[:, 1]], axis=1)
# Make result matrix from values
result = np.zeros((len(M), len(M)), dtype=values.dtype)
result[S[:, 0], S[:, 1]] = values
# Sparse M matrix
# Construct sparse M as COO matrix or any other way
M = sp.coo_matrix(([10, 20, 30, 40, 50], # Data
([0, 1, 3, 4, 6], # Rows
[4, 4, 5, 5, 8])), # Columns
shape=(1000, 2000))
# Convert to CSR for fast row slicing
M_csr = M.tocsr()
# Take relevant rows and compute values
values = M_csr[S[:, 0]].multiply(M_csr[S[:, 1]]).sum(axis=1)
values = np.squeeze(np.asarray(values))
# Construct COO sparse matrix from values
result = sp.coo_matrix((values, (S[:, 0], S[:, 1])), shape=(M.shape[0], M.shape[0]))
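To address the memory concern raised in EDIT 2, one option is to process the (row, col) pairs in chunks so that only a small pair of submatrices exists at any time. This is a minimal sketch under that assumption; the function name masked_gram and the chunk parameter are illustrative and not part of the original answer.

```python
import numpy as np
import scipy.sparse as sp


def masked_gram(M_csr, rows, cols, chunk=1024):
    """Compute (M @ M.T)[rows[k], cols[k]] for each k, a chunk of pairs at a time.

    Hypothetical helper: M_csr is assumed to be a CSR matrix.
    """
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    values = np.empty(len(rows), dtype=M_csr.dtype)
    for start in range(0, len(rows), chunk):
        stop = start + chunk
        # Element-wise product of the paired rows, then a row-wise sum
        prod = M_csr[rows[start:stop]].multiply(M_csr[cols[start:stop]])
        values[start:stop] = np.asarray(prod.sum(axis=1)).ravel()
    n = M_csr.shape[0]
    return sp.coo_matrix((values, (rows, cols)), shape=(n, n))


# Small usage example with random data
M_demo = sp.random(30, 40, density=0.2, format="csr", random_state=0)
rows_demo = np.array([0, 3, 3, 9, 4])
cols_demo = np.array([1, 6, 4, 1, 7])
result_demo = masked_gram(M_demo, rows_demo, cols_demo, chunk=2)
```

A smaller chunk lowers peak memory at the cost of more Python-level iterations, so the value is something to tune for the data at hand.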


Slice a 3d numpy array using a 1d lookup between indices

import numpy as np
a = np.arange(12).reshape(2, 3, 2)
b = np.array([2, 0])
b maps i to j, where i and j are the first two indices of a, as in a[i,j,k].
Desired result after applying b to a is:
[[4 5]
 [6 7]]
Naive solution:
c = np.empty(shape=(2, 2), dtype=int)
for i in range(2):
    j = b[i]
    c[i, :] = a[i, j, :]
Question: Is there a way to do this using a numpy or scipy routine or routines or fancy indexing?
Application: Reinforcement learning with finite MDPs, where b is a deterministic policy vector pi(a|s), a holds the state transition probabilities p(s'|s,a), and c is the state transition matrix p(s'|s) for that policy. The arrays will be large and this operation will be repeated many times, so it needs to be scalable and fast.
What I have tried:
Compiling using numba but line profiler suggests my code is slower compared to a similarly sized numpy routine. Also numpy is more widely understood and used.
Maintaining pi(a|s) as a sparse matrix (all zero except one 1 per row) b_as_a_matrix and then using einsum but this involves storing and updating the matrix and creates more work (an extra loop over j and sum operation).
c = np.einsum('ij,ijk->ik', b_as_a_matrix, a)
Numpy arrays can be indexed using other arrays as indices. See also: NumPy selecting specific column index per row by using a list of indexes.
With that in mind, we can vectorize your loop to simply use b for indexing:
>>> import numpy as np
>>> a = np.arange(12).reshape(2, 3, 2)
>>> b = np.array([2, 0])
>>> i = np.arange(len(b))
>>> i
array([0, 1])
>>> a[i, b, :]
array([[4, 5],
[6, 7]])
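As a quick check, the fancy-indexing result can be compared against the naive loop from the question. The sketch below also shows np.take_along_axis as an alternative; the variable names c_fancy, c_loop and c_take are illustrative.

```python
import numpy as np

a = np.arange(12).reshape(2, 3, 2)  # indexed as a[i, j, k]
b = np.array([2, 0])                # b[i] is the j selected for each i

# Fancy indexing: pair every i with its b[i]
c_fancy = a[np.arange(len(b)), b, :]

# Reference: the naive loop from the question
c_loop = np.empty((2, 2), dtype=int)
for i in range(2):
    c_loop[i, :] = a[i, b[i], :]

# Alternative: take_along_axis keeps the indexed axis (length 1), so drop it
c_take = np.take_along_axis(a, b[:, None, None], axis=1)[:, 0, :]

print(c_fancy)
```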

Change the data type of one element in a matrix

I'm looking to implement a hardware-efficient multiplication of a list of large matrices (on the order of 200,000 x 200,000). The matrices are very nearly the identity matrix, but with some elements changed to irrational numbers.
In an effort to reduce the memory footprint and make the computation go faster, I want to store the 0s and 1s of the identity as single bytes like so.
import numpy as np
size = 200000
large_matrix = np.identity(size, dtype=np.uint8)
and just change a few elements to a different data type.
import sympy as sp
# sympy object
irr1 = sp.sqrt(2)
# float
irr2 = np.e
large_matrix[123456, 100456] = irr1
large_matrix[100456, 123456] = irr2
Is it possible to hold only these elements of the matrix with a different data type, while all the other elements are still bytes? I don't want to have to change everything to a float just because I need one element to be a float.
If it's not possible in numpy, then how can I find a solution without numpy?
Maybe you can have a look at SciPy's coordinate-based sparse matrix (coo_matrix). In that case SciPy creates a sparse matrix (optimized for such large, mostly empty matrices), and its coordinate format stores only the values you actually set.
From its documentation:
>>> from scipy.sparse import coo_matrix
>>> # Constructing a matrix using ijv format
>>> row = np.array([0, 3, 1, 0])
>>> col = np.array([0, 3, 1, 2])
>>> data = np.array([4, 5, 7, 9])
>>> m = coo_matrix((data, (row, col)), shape=(4, 4))
>>> m.toarray()
array([[4, 0, 9, 0],
[0, 7, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 5]])
It does not create a matrix but a set of coordinates with values, which takes much less space than just filling a matrix with zeros.
>>> from sys import getsizeof
>>> getsizeof(m)
>>> getsizeof(m.toarray())
By definition, NumPy arrays only have one dtype. You can see in the NumPy documentation:
A numpy array is homogeneous, and contains elements described by a dtype object. A dtype object can be constructed from different combinations of fundamental numeric types.
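Applied to the near-identity matrices from the question, a sparse representation stores only the diagonal plus the few changed entries, so a single float dtype costs little. The following is a minimal sketch under that assumption; the helper name near_identity and its entries argument are hypothetical, and sympy objects are replaced by plain floats.

```python
import numpy as np
import scipy.sparse as sp


def near_identity(size, entries):
    """Sparse identity with a few entries overwritten.

    Hypothetical helper: entries maps (row, col) -> float value.
    """
    # LIL supports cheap item assignment; convert to CSR for arithmetic
    m = sp.identity(size, dtype=np.float64, format="csr").tolil()
    for (i, j), value in entries.items():
        m[i, j] = value
    return m.tocsr()


size = 200_000
large_matrix = near_identity(size, {
    (123456, 100456): np.sqrt(2.0),
    (100456, 123456): np.e,
})
# Only size + 2 values are stored instead of size * size
print(large_matrix.nnz)
```

Products of such matrices (for example large_matrix @ large_matrix) stay sparse as long as few entries differ from the identity.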

How to find linearly independent rows from a matrix

How to identify the linearly independent rows from a matrix? For instance,
The 4th row is independent.
First, your 3rd row is linearly dependent on the 1st and 2nd rows. However, your 1st and 4th columns are linearly dependent.
Two methods you could use:
If one eigenvalue of the matrix is zero, its corresponding eigenvector is linearly dependent. The documentation of eig states that the returned eigenvalues are repeated according to their multiplicity and are not necessarily ordered. However, assuming the eigenvalues correspond to your row vectors, one method would be:
import numpy as np
matrix = np.array(
    [
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 1, 0],
        [1, 0, 0, 1]
    ])
lambdas, V = np.linalg.eig(matrix.T)
# The linearly dependent row vectors
print(matrix[lambdas == 0, :])
Cauchy-Schwarz inequality
To test linear dependence of vectors and figure out which ones, you could use the Cauchy-Schwarz inequality. Basically, if the inner product of the vectors is equal to the product of the norm of the vectors, the vectors are linearly dependent. Here is an example for the columns:
import numpy as np
matrix = np.array(
    [
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 1, 0],
        [1, 0, 0, 1]
    ])
print(np.linalg.det(matrix))
for i in range(matrix.shape[0]):
    for j in range(matrix.shape[0]):
        if i != j:
            inner_product = np.inner(
                matrix[:, i],
                matrix[:, j]
            )
            norm_i = np.linalg.norm(matrix[:, i])
            norm_j = np.linalg.norm(matrix[:, j])
            print('I: ', matrix[:, i])
            print('J: ', matrix[:, j])
            print('Prod: ', inner_product)
            print('Norm i: ', norm_i)
            print('Norm j: ', norm_j)
            if np.abs(inner_product - norm_j * norm_i) < 1E-5:
                print('Dependent')
            else:
                print('Independent')
Testing the rows works the same way.
Then you could extend this to test all combinations of vectors, but I imagine this solution scales badly with size.
With sympy you can find the linearly independent rows using sympy.Matrix.rref:
>>> import sympy
>>> import numpy as np
>>> mat = np.array([[0,1,0,0],[0,0,1,0],[0,1,1,0],[1,0,0,1]]) # your matrix
>>> _, inds = sympy.Matrix(mat).T.rref() # to check the rows you need to transpose!
>>> inds
[0, 1, 3]
This tells you that rows 0, 1 and 3 are linearly independent while row 2 isn't (it's a linear combination of rows 0 and 1).
Then you can keep only the independent rows by indexing:
>>> mat[list(inds)]
array([[0, 1, 0, 0],
[0, 0, 1, 0],
[1, 0, 0, 1]])
This also works well for rectangular (not only square) matrices.
I edited the code for the Cauchy-Schwarz inequality so that it scales better with dimension: the inputs are the matrix and its dimension, while the output is a new rectangular matrix which contains along its rows the linearly independent columns of the starting matrix. This works under the assumption that the first column is never null, but it can readily be generalized to handle that case too. Another thing I observed is that 1e-5 seems to be a "sloppy" threshold, since some pathological vectors were reported as linearly dependent with it; 1e-4 doesn't give me the same problems. I hope this is of some help: it was pretty difficult for me to find a working routine to extract linearly independent (li) vectors, so I'm willing to share mine. If you find a bug, please report it!
from numpy import absolute, dot, zeros
from numpy.linalg import matrix_rank, norm
def find_li_vectors(dim, R):
    r = matrix_rank(R)
    index = zeros(r, dtype=int)  # this will save the positions of the li columns in the matrix
    counter = 0
    index[0] = 0  # without loss of generality we pick the first column as linearly independent
    j = 0  # therefore the second index is simply 0
    for i in range(R.shape[1]):  # loop over the columns
        if i != j:  # if the two columns are not the same
            inner_product = dot(R[:, i], R[:, j])  # compute the scalar product
            norm_i = norm(R[:, i])  # compute norms
            norm_j = norm(R[:, j])
            # inner product and the product of the norms are equal only if the two vectors are parallel
            # therefore we are looking for the ones which exhibit a difference which is bigger than a threshold
            if absolute(inner_product - norm_j * norm_i) > 1e-4:
                counter += 1  # counter is incremented
                index[counter] = i  # index is saved
                j = i  # j is refreshed
            # do not forget to refresh j: otherwise you would compute only the vectors li with the first column!!
    R_independent = zeros((r, dim))
    i = 0
    # now save everything in a new matrix
    while i < r:
        R_independent[i, :] = R[:, index[i]]
        i += 1
    return R_independent
I know this was asked a while ago, but here is a very simple (although probably inefficient) solution. Given an array, the following finds a set of linearly independent vectors by progressively adding a vector and testing if the rank has increased:
from numpy.linalg import matrix_rank
def LI_vecs(dim, M):
    LI = [M[0]]
    for i in range(dim):
        tmp = []
        for r in LI:
            tmp.append(r)
        tmp.append(M[i])  # set tmp=LI+[M[i]]
        if matrix_rank(tmp) > len(LI):  # test if M[i] is linearly independent from all (row) vectors in LI
            LI.append(M[i])  # note that matrix_rank does not need to take in a square matrix
    return LI  # return set of linearly independent (row) vectors
I interpret the problem as finding rows that are linearly independent from other rows.
That is equivalent to finding rows that are linearly dependent on other rows.
Gaussian elimination, treating numbers smaller than a threshold as zero, can do that. It is faster than finding eigenvalues of a matrix, testing all combinations of rows with the Cauchy-Schwarz inequality, or singular value decomposition.
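The Gaussian elimination idea can be sketched as follows. This is an illustrative implementation, not code from the answer: the function name independent_rows and the default tolerance tol are assumptions. It row-reduces the transpose with partial pivoting, so the pivot columns of the transpose identify independent rows of the input.

```python
import numpy as np


def independent_rows(A, tol=1e-10):
    """Return indices of a maximal set of linearly independent rows of A.

    Hypothetical helper: entries with absolute value <= tol count as zero.
    """
    # Rows of A are columns of A.T; pivot columns of the reduced A.T
    # correspond to independent rows of A.
    W = np.array(A, dtype=float).T.copy()
    n_rows, n_cols = W.shape
    pivots = []
    r = 0  # current pivot row in W
    for c in range(n_cols):
        if r == n_rows:
            break
        # Partial pivoting: largest remaining entry in column c
        p = r + int(np.argmax(np.abs(W[r:, c])))
        if abs(W[p, c]) <= tol:
            continue  # column c depends on the earlier pivot columns
        W[[r, p]] = W[[p, r]]
        W[r] /= W[r, c]
        # Eliminate column c from the rows below the pivot
        W[r + 1:] -= np.outer(W[r + 1:, c], W[r])
        pivots.append(c)
        r += 1
    return pivots


mat = np.array([[0, 1, 0, 0],
                [0, 0, 1, 0],
                [0, 1, 1, 0],
                [1, 0, 0, 1]])
print(independent_rows(mat))  # rows 0, 1 and 3; row 2 = row 0 + row 1
```

The tolerance plays the role of the threshold mentioned above and should be chosen relative to the scale of the data.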
Problem with floating point numbers:
With regards to the following discussion:
Find dependent rows/columns of a matrix using Matlab?
from sympy import *
A = Matrix([[1,1,1],[2,2,2],[1,7,5]])
print(A.nullspace())
It is obvious that the first and second rows are multiples of each other.
If we execute the above code we get [-1/3, -2/3, 1]. The indices of the zero elements in the null space show independence. But why is the third element here not zero? If we multiply the A matrix with the null space, we get a zero column vector. So what's wrong?
The answer which we are looking for is the null space of the transpose of A.
B = A.T
print(B.nullspace())
Now we get the [-2, 1, 0], which shows that the third row is independent.
Two important notes here:
Consider whether we want to check the row dependencies or the column dependencies.
Notice that the null space of a matrix is not equal to the null space of the transpose of that matrix unless it is symmetric.
You can basically find the vectors spanning the columnspace of the matrix by using SymPy library's columnspace() method of Matrix object. Automatically, they are the linearly independent columns of the matrix.
import sympy as sp
import numpy as np
M = sp.Matrix([[0, 1, 0, 0],
               [0, 0, 1, 0],
               [1, 0, 0, 1]])
for i in M.columnspace():
    print(np.array(i))
# The output is the following.
# [[0]
# [0]
# [1]]
# [[1]
# [0]
# [0]]
# [[0]
# [1]
# [0]]

Create matrix with 2 arrays in numpy

I want to find a NumPy command that multiplies a column vector by a row vector to give a matrix:
[1,1,1,1 ] ^T * [ 2,3 ] = [[2,3],[2,3],[2,3],[2,3]]
First, let's define your 1-D numpy arrays:
In [5]: one = np.array([ 1,1,1,1 ]); two = np.array([ 2,3 ])
Now, let's multiply them:
In [6]: one[:, np.newaxis] * two[np.newaxis, :]
array([[2, 3],
[2, 3],
[2, 3],
[2, 3]])
This used numpy's newaxis to add the appropriate axes to get a 4x2 output matrix.
The problem you are encountering is that both of your vectors are neither column nor row vectors - they're just vectors. If you look at len(vec.shape) it's 1.
What you can do is use numpy.reshape to turn your column vector into shape (m, 1) and your row vector into shape (1, n).
import numpy as np
colu = np.reshape(u, (u.shape[0], 1))
rowv = np.reshape(v, (1, v.shape[0]))
Now when you multiply colu and rowv you'll get a matrix with shape (m, n).
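A short runnable version of the reshape approach, using the vectors from the question as u and v (an assumption, since the snippet above leaves them undefined). NumPy's np.outer, which computes this product directly from 1-D inputs, is included as a cross-check.

```python
import numpy as np

u = np.array([1, 1, 1, 1])
v = np.array([2, 3])

colu = np.reshape(u, (u.shape[0], 1))  # shape (4, 1)
rowv = np.reshape(v, (1, v.shape[0]))  # shape (1, 2)

product = colu * rowv                  # broadcasts to shape (4, 2)
outer = np.outer(u, v)                 # same result from the 1-D vectors

print(product)
```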
If you need a matrix - use matrices. This way you can use your expression nearly verbatim:
np.matrix([1,1,1,1]).T * np.matrix([2,3])
You might want to use numpy.kron(a, b), which takes the Kronecker product of two arrays. You can see the b vector as a block: the function places this block, multiplied by the corresponding coefficient of the a vector, at the position of that coefficient. You can also use it for matrices.
For your example it would look like:
import numpy as np
vecA = np.array([[1],[1],[1],[1]])
vecB = np.array([2,3])
Out = np.kron(vecA,vecB)
this returns
>>> Out
array([[2, 3],
[2, 3],
[2, 3],
[2, 3]])
Hope this helps you.

Repeat a scipy csr sparse matrix along axis 0

I wanted to repeat the rows of a scipy csr sparse matrix, but when I tried to call numpy's repeat method, it simply treated the sparse matrix like an object, and would only repeat it as an object in an ndarray. I looked through the documentation, but I couldn't find any utility to repeat the rows of a scipy csr sparse matrix.
I wrote the following code that operates on the internal data, which seems to work
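import numpy as np
from scipy import sparse
def csr_repeat(csr, repeats):
    if isinstance(repeats, int):
        repeats = np.repeat(repeats, csr.shape[0])
    repeats = np.asarray(repeats)
    rnnz = np.diff(csr.indptr)  # nonzeros per original row
    ndata = rnnz.dot(repeats)  # nonzeros in the repeated matrix
    if ndata == 0:
        return sparse.csr_matrix((np.sum(repeats), csr.shape[1]),
                                 dtype=csr.dtype)
    indmap = np.ones(ndata, dtype=int)
    indmap[0] = 0
    rnnz_ = np.repeat(rnnz, repeats)
    indptr_ = rnnz_.cumsum()
    mask = indptr_ < ndata
    # step back to the start of a row at every row boundary
    indmap -= np.int_(np.bincount(indptr_[mask],
                                  weights=rnnz_[mask],
                                  minlength=ndata))
    jumps = (rnnz * repeats).cumsum()
    mask = jumps < ndata
    # step forward again where a new original row begins
    indmap += np.int_(np.bincount(jumps[mask],
                                  weights=rnnz[mask],
                                  minlength=ndata))
    indmap = indmap.cumsum()
    return sparse.csr_matrix((csr.data[indmap],
                              csr.indices[indmap],
                              np.r_[0, indptr_]),
                             shape=(np.sum(repeats), csr.shape[1]))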
and be reasonably efficient, but I'd rather not monkey patch the class. Is there a better way to do this?
As I revisit this question, I wonder why I posted it in the first place. Almost everything I could think to do with the repeated matrix would be easier to do with the original matrix, and then apply the repetition afterwards. My assumption is that post repetition will always be the better way to approach this problem than any of the potential answers.
from scipy.sparse import csr_matrix
repeated_row_matrix = csr_matrix(np.ones([repeat_number,1])) * sparse_row
It's not surprising that np.repeat does not work. It delegates the action to the hardcoded a.repeat method, and failing that, first turns a into an array (object if needed).
In the linear algebra world where sparse code was developed, most of the assembly work was done on the row, col, data arrays BEFORE creating the sparse matrix. The focus was on efficient math operations, and not so much on adding/deleting/indexing rows and elements.
I haven't worked through your code, but I'm not surprised that a csr format matrix requires that much work.
I worked out a similar function for the lil format (working from lil.copy):
def lil_repeat(S, repeat):
    # row repeat for lil sparse matrix
    # test for lil type and/or convert
    shape = list(S.shape)
    if isinstance(repeat, int):
        shape[0] = shape[0] * repeat
    else:
        shape[0] = sum(repeat)
    shape = tuple(shape)
    new = sparse.lil_matrix(shape, dtype=S.dtype)
    new.data = S.data.repeat(repeat)  # flat repeat
    new.rows = S.rows.repeat(repeat)
    return new
But it is also possible to repeat using indices. Both lil and csr support indexing that is close to that of regular numpy arrays (at least in new enough versions). Thus:
S = sparse.lil_matrix([[0,1,2],[0,0,0],[1,0,0]])
print(S.toarray().repeat([1,2,3], axis=0))
print(S.toarray()[(0,1,1,2,2,2),:])
print(lil_repeat(S,[1,2,3]).toarray())
print(S[(0,1,1,2,2,2),:].toarray())
give the same result
and best of all?
print(S[np.arange(3).repeat([1,2,3]),:].toarray())
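The same index-based repeat applies to csr matrices, which is the format the question asks about. A minimal sketch, assuming a SciPy version whose csr format supports array indexing of rows; the names row_index, repeated and expected are illustrative.

```python
import numpy as np
from scipy import sparse

S = sparse.csr_matrix([[0, 1, 2], [0, 0, 0], [1, 0, 0]])
repeats = [1, 2, 3]

# Row i appears repeats[i] times: [0, 1, 1, 2, 2, 2]
row_index = np.repeat(np.arange(S.shape[0]), repeats)
repeated = S[row_index, :]          # result is still sparse

# Dense reference for comparison
expected = S.toarray().repeat(repeats, axis=0)
print(repeated.toarray())
```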
After someone posted a really clever response for how best to do this, I revisited my original question to see if there was an even better way. I came up with one more way that has some pros and cons. Instead of repeating all of the data (as is done in the accepted answer), we can instruct scipy to reuse the data of the repeated rows, creating something akin to a view of the original sparse array (as you might do with broadcast_to). This can be done by simply tiling the indptr field.
repeated = sparse.csr_matrix((orig.data, orig.indices, np.tile(orig.indptr, repeat_num)))
This technique repeats the vector repeat_num times while only modifying the indptr. The downside is that, due to the way csr matrices encode data, instead of creating a matrix that's repeat_num x n in dimension, it creates one that's (2 * repeat_num - 1) x n where every odd row is 0. This shouldn't be too big of a deal, as any operation will be quick given that each such row is 0, and they should be pretty easy to slice out afterwards (with something like [::2]), but it's not ideal.
I think the marked answer is probably still the "best" way to do this.
One of the most efficient ways to repeat the sparse matrix would be the way OP suggested. I modified indptr so that it doesn't output rows of 0s.
## original sparse matrix
indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
x = scipy.sparse.csr_matrix((data, indices, indptr), shape=(3, 3))
array([[1, 0, 2],
[0, 0, 3],
[4, 5, 6]])
To repeat this, you need to repeat data and indices, and you need to fix-up the indptr. This is not the most elegant way, but it works.
## repeated sparse matrix
repeat = 5
new_indptr = indptr
for r in range(1,repeat):
new_indptr = np.concatenate((new_indptr, new_indptr[-1]+indptr[1:]))
x = scipy.sparse.csr_matrix((np.tile(data,repeat), np.tile(indices,repeat), new_indptr))
array([[1, 0, 2],
[0, 0, 3],
[4, 5, 6],
[1, 0, 2],
[0, 0, 3],
[4, 5, 6],
[1, 0, 2],
[0, 0, 3],
[4, 5, 6],
[1, 0, 2],
[0, 0, 3],
[4, 5, 6],
[1, 0, 2],
[0, 0, 3],
[4, 5, 6]])
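For tiling the whole matrix as above, scipy.sparse.vstack gives the same result without manual indptr bookkeeping. This is a sketch for comparison rather than a claim about relative speed; the name tiled is illustrative.

```python
import numpy as np
import scipy.sparse

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
x = scipy.sparse.csr_matrix((data, indices, indptr), shape=(3, 3))

repeat = 5
# Stack `repeat` copies of x on top of each other, keeping CSR format
tiled = scipy.sparse.vstack([x] * repeat, format="csr")
print(tiled.shape)
```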
