I'm working with numpy arrays of shape (N,), (N,3) and (N,3,3) which represent sequences of scalars, vectors and matrices in 3D space. I have implemented pointwise dot product, matrix multiplication, and matrix/vector multiplication as follows:
def dot_product(v, w):
    return np.einsum('ij, ij -> i', v, w)

def matrix_vector_product(M, v):
    return np.einsum('ijk, ik -> ij', M, v)

def matrix_matrix_product(A, B):
    return np.einsum('ijk, ikl -> ijl', A, B)
As you can see, I use einsum for lack of a better solution. To my surprise I was not able to use np.dot, which does not seem suitable for this need. Is there a more numpythonic way to implement these functions?
In particular it would be nice if the functions could work also on the shapes (3,) and (3,3) by broadcasting the first missing axis. I think I need ellipsis, but I don't quite understand how to achieve the result.
These operations cannot be reshaped into general BLAS calls and looping BLAS calls would be quite slow for arrays of this size. As such, einsum is likely optimal for this kind of operation.
Your functions can be generalized with ellipses as follows:
def dot_product(v, w):
    return np.einsum('...j,...j->...', v, w)

def matrix_vector_product(M, v):
    return np.einsum('...jk,...k->...j', M, v)

def matrix_matrix_product(A, B):
    return np.einsum('...jk,...kl->...jl', A, B)
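As a quick check of the ellipsis versions, here is a small sketch that calls them on stacked inputs, on single inputs, and on a mix of the two. The array names and sizes are made up for illustration.

```python
import numpy as np

def dot_product(v, w):
    return np.einsum('...j,...j->...', v, w)

def matrix_vector_product(M, v):
    return np.einsum('...jk,...k->...j', M, v)

def matrix_matrix_product(A, B):
    return np.einsum('...jk,...kl->...jl', A, B)

rng = np.random.default_rng(0)
v = rng.random((5, 3))       # 5 vectors
M = rng.random((5, 3, 3))    # 5 matrices

stacked_dot = dot_product(v, v)                 # shape (5,)
single_dot = dot_product(v[0], v[0])            # 0-d result for shape (3,) inputs
stacked_Mv = matrix_vector_product(M, v)        # shape (5, 3)
single_Mv = matrix_vector_product(M[0], v[0])   # shape (3,)
mixed_Mv = matrix_vector_product(M[0], v)       # one matrix applied to 5 vectors
stacked_MM = matrix_matrix_product(M, M)        # shape (5, 3, 3)
```

The ellipsis stands for whatever leading axes are present, so a missing stack axis is simply broadcast.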
Just as working notes, these 3 calculations can also be written as:
np.einsum(A,[0,1,2],B,[0,2,3],[0,1,3])
np.einsum(M,[0,1,2],v,[0,2],[0,1])
np.einsum(w,[0,1],v,[0,1],[0])
Or with Ophion's generalization
np.einsum(A,[Ellipsis,1,2], B, ...)
It shouldn't be hard to generate the [0,1,..] lists based on the dimensions of the input arrays.
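One possible way to build those lists, as a rough sketch: the helper name batched_matmul_sublists is hypothetical, and it assumes A and B have the same number of dimensions (at least 2).

```python
import numpy as np

def batched_matmul_sublists(A, B):
    """Hypothetical helper: stacked matrix product via einsum's list-of-axes form."""
    n_batch = A.ndim - 2                  # number of leading stack axes
    batch = list(range(n_batch))          # labels shared by A, B and the output
    j, k, l = n_batch, n_batch + 1, n_batch + 2
    # A has axes batch+[j,k], B has batch+[k,l]; k is summed away
    return np.einsum(A, batch + [j, k], B, batch + [k, l], batch + [j, l])

rng = np.random.default_rng(0)
A3 = rng.random((4, 3, 3))
B3 = rng.random((4, 3, 3))
stacked = batched_matmul_sublists(A3, B3)        # shape (4, 3, 3)
plain = batched_matmul_sublists(A3[0], B3[0])    # shape (3, 3)
```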
By focusing on generalizing the einsum expressions, I missed the fact that what you are trying to reproduce is N small dot products.
np.array([np.dot(i,j) for i,j in zip(a,b)])
It's worth keeping in mind that np.dot uses fast compiled code and focuses on calculations where the arrays are large, whereas your problem is one of calculating many small dot products.
And without extra arguments that define axes, np.dot performs just 2 of the possible combinations, ones which can be expressed as:
np.einsum('i,i', v1, v2)
np.einsum('...ij,...jk->...ik', m1, m2)
An operator version of dot would face the same limitation - no extra parameters to specify how the axes are to be combined.
It may also be instructive to note what tensordot does to generalize dot:
def tensordot(a, b, axes=2):
    ....
    newshape_a = (-1, N2)
    ...
    newshape_b = (N2, -1)
    ....
    at = a.transpose(newaxes_a).reshape(newshape_a)
    bt = b.transpose(newaxes_b).reshape(newshape_b)
    res = dot(at, bt)
    return res.reshape(olda + oldb)
It can perform a dot with summation over several axes. But after the transposing and reshaping is done, the calculation becomes the standard dot with 2d arrays.
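To make the transpose/reshape/dot idea concrete, here is a small sketch that reproduces one tensordot call by hand. The shapes and the choice of summed axes are arbitrary examples, not taken from the numpy source.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((5, 4, 6))
b = rng.random((5, 6, 7))

# Sum over axes 0 and 2 of a, paired with axes 0 and 1 of b.
ref = np.tensordot(a, b, axes=([0, 2], [0, 1]))      # shape (4, 7)

# By hand: move the summed axes of a to the end, flatten them,
# and hand the resulting 2-D arrays to the ordinary dot.
at = a.transpose(1, 0, 2).reshape(4, 5 * 6)
bt = b.reshape(5 * 6, 7)                             # summed axes are already first
manual = np.dot(at, bt)
```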
This could have been flagged as a duplicate issue. People have been asking about doing multiple dot products for some time.
Matrix vector multiplication along array axes
suggests using numpy.core.umath_tests.matrix_multiply
https://stackoverflow.com/a/24174347/901925 equates:
matrix_multiply(matrices, vectors[..., None])
np.einsum('ijk,ik->ij', matrices, vectors)
The C documentation for matrix_multiply notes:
* This implements the function
* out[k, m, p] = sum_n { in1[k, m, n] * in2[k, n, p] }.
inner1d from the same directory does the same for (N,n) vectors
inner1d(vector, vector)
np.einsum('ij,ij->i', vector, vector)
# out[n] = sum_i { in1[n, i] * in2[n, i] }
Both are ufuncs: the core operation acts on the rightmost dimensions, and any leading dimensions are broadcast. In numpy/core/tests/test_ufunc.py these functions are used to exercise the ufunc mechanism.
matrix_multiply(np.ones((4,5,6,2,3)),np.ones((3,2)))
https://stackoverflow.com/a/16704079/901925 adds that this kind of calculation can be done with * and sum, eg
(w*v).sum(-1)
(M*v[...,None,:]).sum(-1)
(A*B.swapaxes(...)).sum(-1)
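Spelled out in full, the multiply-and-sum versions might look like the sketch below, checked against the einsum expressions. The exact placement of the new axes in the matrix-matrix case is my own choice of one workable form, not something taken from the linked answer.

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.random((5, 3))
w = rng.random((5, 3))
M = rng.random((5, 3, 3))
A = rng.random((5, 3, 3))
B = rng.random((5, 3, 3))

# dot: multiply componentwise, sum over the component axis
dot_sum = (w * v).sum(-1)

# matrix-vector: line v up with the column index of M, sum over it
Mv_sum = (M * v[..., None, :]).sum(-1)

# matrix-matrix: A[n, j, 1, k] * B.T[n, 1, l, k], summed over k
AB_sum = (A[..., :, None, :] * B.swapaxes(-1, -2)[..., None, :, :]).sum(-1)
```

Note that the matrix-matrix form builds an intermediate array with one extra axis, so it costs more memory than einsum.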
On further testing, I think inner1d and matrix_multiply match your dot and matrix-matrix product cases, and the matrix-vector case if you add the [...,None]. Looks like they are 2x faster than the einsum versions (on my machine and test arrays).
https://github.com/numpy/numpy/blob/master/doc/neps/return-of-revenge-of-matmul-pep.rst
is the discussion of the @ infix operator on numpy. I think the numpy developers are less enthused about this PEP than the Python ones.
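The operator from that PEP is available as @ (np.matmul) in current Python and NumPy, and it treats leading axes as a stack of matrices. A sketch of the three products written that way, assuming a NumPy recent enough to have np.matmul:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 3, 3))
B = rng.random((5, 3, 3))
v = rng.random((5, 3))
w = rng.random((5, 3))

# stacked matrix-matrix product
AB = A @ B

# stacked matrix-vector product: view each vector as a (3, 1) column
Av = (A @ v[..., None])[..., 0]

# stacked dot product: (1, 3) row times (3, 1) column
vw = (v[..., None, :] @ w[..., :, None])[..., 0, 0]
```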
Related
Imagine having two sparse matrices:
> A, A.shape = (n,m)
> B, B.shape = (m,n)
I would like to compute the dot product A*B, but then only keep the diagonal. The matrices being big, I actually don't want to compute other values than the ones in the diagonal.
This is a variant of the question Is there a numpy/scipy dot product, calculating only the diagonal entries of the result?
Where the most relevant answer seems to be to use np.einsum:
np.einsum('ij,ji->i', A, B)
However this does not work:
ValueError: einstein sum subscripts string contains too many subscripts for operand 0
The solution is to use todense(), but it increases a lot the memory usage: np.einsum('ij,ji->i', A.todense(), B.todense())
The other solution, that I currently use, is to iterate over all the rows of A and compute each product in the loop :
for i in range(len_A):
    result = np.float32(A[i].dot(B[:, i])[0, 0])
    ...
None of these solutions seems perfect. Is there an equivalent to np.einsum that works with sparse matrices?
[sum(A[i]*B.T[i]) for i in range(min(A.shape[0], B.shape[1]))]
otherwise this is faster:
l = min(A.shape[0], B.shape[1])
(A[np.arange(l)]*B.T[np.arange(l)]).sum(axis=1)
In general you shouldn't try to use numpy functions on the scipy.sparse arrays. In your case I'd first make sure both arrays actually have a compatible shape, that is
A, A.shape = (r,m)
B, B.shape = (m,r)
where r = min(n, p), with n the number of rows of A and p the number of columns of B. Then we can compute the diagonal of the matrix product using
d = (A.multiply(B.T)).sum(axis=1)
Here we compute the entrywise row-column products and sum them up manually. This avoids all the unnecessary computations you'd get using dot/@/*. (Note that unlike in numpy, both * and @ perform matrix multiplication.)
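A small sketch of this approach, checked against the dense result. The sizes and densities are arbitrary, and the dense comparison is only there because the example is tiny.

```python
import numpy as np
from scipy import sparse

A = sparse.random(6, 8, density=0.4, format='csr', random_state=0)
B = sparse.random(8, 6, density=0.4, format='csr', random_state=1)

# Entrywise product of row i of A with column i of B, summed per row.
# .sum(axis=1) returns an (r, 1) matrix, so flatten it to a 1-D array.
d = np.asarray(A.multiply(B.T).sum(axis=1)).ravel()

# Dense reference: form the whole product and keep only its diagonal.
expected = np.diag(A.toarray() @ B.toarray())
```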
I'm trying to decompose a tensor of shape (m, n, o) into matrices A (m, r), B (n, r) and C (o, r). This is known as PARAFAC decomposition. Tensorly already does this kind of decomposition.
An important step is to multiply A, B, and C to get a tensor of shape (m, n, o).
Tensorly does this as follows:
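from functools import reduce
import numpy as np

def kt_to_tensor(A, B, C):
    factors = [A, B, C]
    for r in range(factors[0].shape[1]):
        # reshape the r-th column of each factor so the three broadcast together
        vecs = np.ix_(*[u[:, r] for u in factors])
        if r:
            res += reduce(np.multiply, vecs)
        else:
            res = reduce(np.multiply, vecs)
    return res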
However, the package I'm using (Autograd) does not support np.ix_ operations. I thus wrote a simpler definition as follows:
def new_kt_to_tensor(A, B, C):
    m, n, o = A.shape[0], B.shape[0], C.shape[0]
    out = np.zeros((m, n, o))
    k_max = A.shape[1]
    for alpha in range(0, m):
        for beta in range(0, n):
            for delta in range(0, o):
                for k in range(0, k_max):
                    out[alpha, beta, delta] = out[alpha, beta, delta] + A[alpha, k]*B[beta, k]*C[delta, k]
    return out
It turns out that this implementation also has some aspects that autograd does not support. Autograd does support np.tensordot, though.
I was wondering how to use np.tensordot to obtain this multiplication. I think that Tensorflow's tf.tensordot would also have a similar functionality.
Intended solution should be something like:
def tensordot_multplication(A, B, C):
    """
    use np.tensordot
    """
Don't think np.tensordot would help you here, as it needs to spread-out the axes that don't participate in sum-reductions, as we have the alignment requirement of keeping the last axis aligned between the three inputs while performing multiplication. Thus, with tensordot, you would need extra processing and have more memory requirements there.
I would suggest two methods - One with broadcasting and another with np.einsum.
Approach #1 : With broadcasting -
(A[:,None,None,:]*B[:,None,:]*C).sum(-1)
Explanation :
Extend A to 4D, by introducing new axes at axis=(1,2) with None/np.newaxis.
Similarly extend B to 3D, by introducing new axis at axis=(1).
Keep C as it is and perform elementwise multiplications resulting in a 4D array.
Finally, the sum-reduction comes in along the last axis of the 4D array.
Schematically put -
A : m r
B : n r
C : k r
=> A*B*C : m n k r
=> out : m n k # (sum-reduction along last axis)
Approach #2 : With np.einsum -
np.einsum('il,jl,kl->ijk',A,B,C)
The idea is the same here as with the previous broadcasting one, but with string notations helping us out in conveying the axes info in a more concise manner.
Broadcasting is available in TensorFlow, which has tools to expand dimensions, and TensorFlow also offers an einsum counterpart in tf.einsum.
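As a sanity check, the sketch below compares both approaches with a rank-by-rank sum of outer products on small random factors. The sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 2))   # (m, r)
B = rng.random((5, 2))   # (n, r)
C = rng.random((6, 2))   # (k, r)

# Reference: add up one outer product a_r (x) b_r (x) c_r per column r.
ref = sum(
    np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    for r in range(A.shape[1])
)

via_broadcast = (A[:, None, None, :] * B[:, None, :] * C).sum(-1)
via_einsum = np.einsum('il,jl,kl->ijk', A, B, C)
```

The broadcasting version materializes the full (m, n, k, r) array before summing, while einsum does not need that intermediate.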
The code you refer to isn't actually how TensorLy implements it but simply an alternative implementation given in the doc.
The actual code used in TensorLy is:
def kruskal_to_tensor(factors):
    shape = [factor.shape[0] for factor in factors]
    full_tensor = np.dot(factors[0], khatri_rao(factors[1:]).T)
    return fold(full_tensor, 0, shape)
where the khatri_rao is implemented using numpy.einsum in a way that generalizes what Divakar suggested.
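To illustrate the idea for three factors only, here is a rough sketch. It is not TensorLy's code: khatri_rao_pair and kruskal_to_tensor_3way are hypothetical names, and the sketch assumes that folding along mode 0 amounts to a plain C-order reshape.

```python
import numpy as np

def khatri_rao_pair(B, C):
    """Column-wise Kronecker product of B (n, r) and C (o, r), shape (n*o, r)."""
    r = B.shape[1]
    return np.einsum('jr,kr->jkr', B, C).reshape(-1, r)

def kruskal_to_tensor_3way(A, B, C):
    shape = (A.shape[0], B.shape[0], C.shape[0])
    # mode-0 unfolding of the full tensor, shape (m, n*o)
    unfolded = np.dot(A, khatri_rao_pair(B, C).T)
    # assumed fold: undo the unfolding with a C-order reshape
    return unfolded.reshape(shape)

rng = np.random.default_rng(0)
A = rng.random((4, 2))
B = rng.random((5, 2))
C = rng.random((6, 2))
full = kruskal_to_tensor_3way(A, B, C)
```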
I have two arrays A and B of shape NxKxD, and I want an array C of shape NxKxDxD where C[n, k] = A[n, k] x B[n, k].T (here "x" means the product of a Dx1 matrix and a 1xD matrix, so the result is DxD). My code currently looks like this (here A = B = X):
def square(X):
    N, K, D = X.shape
    out = np.zeros((N, K, D, D))
    for n in range(N):
        for k in range(K):
            out[n, k] = np.dot(X[n, k, :, np.newaxis], X[n, k, np.newaxis, :])
    return out
It may be slow for big N and K because of the Python for loops. Is there some way to do this multiplication in one numpy function?
It seems you are not using np.dot for sum-reduction, but just for expansion that results in broadcasting. So, you can simply extend the array to have one more dimension with the use of np.newaxis/None and let the implicit broadcasting help out.
Thus, an implementation would be -
X[...,None]*X[...,None,:]
More info on broadcasting, specifically on how to add new axes, can be found in this other post.
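A short sketch comparing the broadcasting form with an equivalent einsum expression and with np.outer on one (n, k) entry. The array sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((2, 3, 4))    # (N, K, D)

# (N, K, D, 1) * (N, K, 1, D) broadcasts to (N, K, D, D)
via_broadcast = X[..., None] * X[..., None, :]

# the same outer product written as an einsum without any summed index
via_einsum = np.einsum('...i,...j->...ij', X, X)

# one entry computed the way the loop in the question does it
one_entry = np.outer(X[1, 2], X[1, 2])
```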
I have a 1D array A = [a, b, c...] (length N_A) and a 3D array T of shape (N_A, N_B, N_A). A is meant to represent a diagonal N_A by N_A matrix.
I'd like to perform contractions of A with T without having to promote A to dense storage. In particular, I'd like to do
np.einsum('ij, ikl', A, T)
and
np.einsum('ikl, lm', T, A)
Is it possible to do such things while keeping A sparse?
Note this question is similar to
dot product with diagonal matrix, without creating it full matrix
but not identical, since it's not clear to me how one generalizes to more complicated index patterns.
np.einsum('ij, ikl', np.diag(a), t) is equivalent to (a * t.T).T.
np.einsum('ikl, lm', t, np.diag(a)) is equivalent to a * t.
(found by trial-and-error)
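The two equivalences can be checked numerically, and for more complicated index patterns one option is to keep einsum but pass the 1D array with a single repeated index. A sketch with arbitrary sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(4)            # diagonal entries, length N_A
t = rng.random((4, 3, 4))    # shape (N_A, N_B, N_A)

# left contraction: out[j, k, l] = a[j] * t[j, k, l]
left_dense = np.einsum('ij,ikl', np.diag(a), t)
left_bcast = (a * t.T).T
left_einsum = np.einsum('i,ikl->ikl', a, t)    # never builds the dense diagonal

# right contraction: out[i, k, m] = t[i, k, m] * a[m]
right_dense = np.einsum('ikl,lm', t, np.diag(a))
right_bcast = a * t
right_einsum = np.einsum('ikl,l->ikl', t, a)
```

The pattern is that the two indices of the diagonal matrix collapse into one label, which then stays in the output instead of being summed.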
I would like to solve a sparse linear system A x = b, where A is an (M x M) array, b is an (M x N) array and x is an (M x N) array.
I can solve this in three ways:
scipy.linalg.solve(A.toarray(), b.toarray()),
scipy.sparse.linalg.spsolve(A, b),
scipy.sparse.linalg.splu(A).solve(b.toarray()) # returns a dense array
I wish to solve the problem using the iterative scipy.sparse.linalg methods:
scipy.sparse.linalg.cg,
scipy.sparse.linalg.bicg,
...
However, these methods support only a right-hand side b with shape (M,) or (M, 1). Any ideas on how to extend them to an (M x N) array b?
A key difference between iterative solvers and direct solvers is that direct solvers can more efficiently solve for multiple right-hand values by using a factorization (usually either Cholesky or LU), while iterative solvers can't. This means that for direct solvers there is a computational advantage to solving for multiple columns simultaneously.
For iterative solvers, on the other hand, there's no computational gain to be had in simultaneously solving multiple columns, and this is probably why matrix solutions are not supported natively in the API of cg, bicg, etc.
Because of this, a direct solution like scipy.sparse.linalg.spsolve will probably be optimal for your case. If for some reason you still desire an iterative solution, I'd just create a simple convenience function like this:
from scipy.sparse.linalg import bicg
def bicg_solve(M, B):
    X, info = zip(*(bicg(M, b) for b in B.T))
    return np.transpose(X), info
Then you can create some data and call it as follows:
import numpy as np
from scipy.sparse import csc_matrix
# create some matrices
M = csc_matrix(np.random.rand(5, 5))
B = np.random.rand(5, 4)
X, info = bicg_solve(M, B)
print(X.shape)
# (5, 4)
Any iterative solver API which accepts a matrix on the right-hand-side will essentially just be a wrapper for something like this.
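For comparison, the direct route mentioned earlier in this answer can factorize once and reuse the factors for every column. A sketch using scipy.sparse.linalg.splu; the test matrix is made up, with a large diagonal added only to keep it well conditioned.

```python
import numpy as np
from scipy.sparse import identity, random as sparse_random
from scipy.sparse.linalg import splu

n = 50
# random sparse matrix plus a dominant diagonal, in the CSC format splu expects
A = (sparse_random(n, n, density=0.1, format='csc', random_state=0)
     + 10 * identity(n, format='csc')).tocsc()
B = np.random.default_rng(0).random((n, 4))   # four right-hand sides

lu = splu(A)       # LU factorization, computed once
X = lu.solve(B)    # triangular solves for all four columns
```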