Python: fast matrix multiplication with extra indices

I have two arrays, A and B, with dimensions (l,m,n) and (l,m,n,n), respectively. I would like to obtain an array C of dimensions (l,m,n) by treating A as a vector in its third index and B as a matrix in its third and fourth indices, and multiplying them for every pair of the first two indices. An easy way to do this is:
import numpy as np
#Define dimensions
l = 1024
m = l
n = 6
#Create some random arrays
A = np.random.rand(l,m,n)
B = np.random.rand(l,m,n,n)
C = np.zeros((l,m,n))
#Desired multiplication
for i in range(0,l):
    for j in range(0,m):
        C[i,j,:] = np.matmul(A[i,j,:],B[i,j,:,:])
It is, however, slow (about 3 seconds on my MacBook). What would be the fastest, fully vectorized way to do this?

Try to use einsum.
It has many use cases, check the docs: https://numpy.org/doc/stable/reference/generated/numpy.einsum.html
Or, for more info, a really good explanation can be also found at: https://ajcr.net/Basic-guide-to-einsum/
In your case, it seems like
np.einsum('dhi,dhij->dhj',A,B)
should work. Also, you can try the optimize=True flag to get more speed, if needed.
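As a quick sanity check, here is a minimal sketch that compares the einsum call with the explicit loop from the question. The small dimensions and the seeded random generator are assumptions made only to keep the check fast; the batched matmul variant is an alternative way to write the same contraction, not something the answer above proposes.

```python
import numpy as np

# Small dimensions (an assumption for this sketch) so the loop check is quick.
l, m, n = 8, 8, 6
rng = np.random.default_rng(0)
A = rng.random((l, m, n))
B = rng.random((l, m, n, n))

# Reference: the explicit double loop from the question.
C_loop = np.zeros((l, m, n))
for i in range(l):
    for j in range(m):
        C_loop[i, j, :] = A[i, j, :] @ B[i, j, :, :]

# einsum: sum over the shared index i, keep the batch indices d and h.
C_einsum = np.einsum('dhi,dhij->dhj', A, B)

# Equivalent batched matmul: insert a length-1 row axis, then drop it again.
C_matmul = (A[:, :, None, :] @ B)[:, :, 0, :]

print(np.allclose(C_loop, C_einsum), np.allclose(C_loop, C_matmul))
```

Which of the two vectorized forms is faster likely depends on the array sizes and the NumPy build, so it is worth timing both on the real data.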

Related

Numpy: Efficient way to create a complex array from two real arrays

I have two real arrays (a and b), and I would like to create a complex array (c) which takes the two real arrays as its real and imaginary parts respectively.
The simplest one would be
c = a + b * 1.0j
However, since my data size is quite large, such code is not very efficient.
We can also do the following,
c = np.empty(data_shape, dtype=np.complex128)
c.real = a
c.imag = b
I am wondering is there a better way to do that (e.g. using buffer or something)?
Thank you very much!

Since the real and imaginary parts of each element have to be contiguous, you will have to allocate another buffer to interleave the data no matter what. The second method shown in the question is therefore about as efficient as you're likely to get. One alternative would be
np.stack((a, b), axis=-1).view(np.complex128).squeeze(-1)
This works for any array shape, not just 1D. It ensures proper interleaving by stacking along the last dimension in C order.
This assumes that your datatype is np.float64. If not, either promote to float (e.g. a = a.astype(np.float64)), or change np.complex128 to the matching complex type (e.g. np.complex64 for np.float32 input).
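A minimal sketch comparing the two approaches, assuming float64 inputs of an arbitrary small shape (the shape and the seeded generator are illustrative choices, not taken from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4))  # real parts
b = rng.random((3, 4))  # imaginary parts

# Method 1: allocate a complex array, then fill the two halves in place.
c1 = np.empty(a.shape, dtype=np.complex128)
c1.real = a
c1.imag = b

# Method 2: interleave along a new last axis and reinterpret pairs as complex.
c2 = np.stack((a, b), axis=-1).view(np.complex128).squeeze(-1)

print(np.array_equal(c1, c2), np.allclose(c1, a + 1j * b))
```

Note that the preallocated array in method 1 must have a complex dtype; the imaginary part of a real-valued array cannot be assigned.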

Computation difference between function and manual computation

I am facing a mystery right now. I get strange results in some program and I think it may be related to the computation since I got different results with my functions compared to manual computation.
This is from my program, I am printing the values pre-computation :
print("\nPrecomputation:\nmatrix\n:", matrix)
tmp = likelihood_left * likelihood_right
print("\nconditional_dep:", tmp)
print("\nfinal result:", matrix @ tmp)
I got the following output:
Precomputation:
matrix:
[array([0.08078721, 0.5802404 , 0.16957052, 0.09629893, 0.07310294])
array([0.14633129, 0.45458744, 0.20096238, 0.02142105, 0.17669784])
array([0.41198731, 0.06197812, 0.05934063, 0.23325626, 0.23343768])
array([0.15686545, 0.29516415, 0.20095091, 0.14720275, 0.19981674])
array([0.15965914, 0.18383683, 0.10606946, 0.14234812, 0.40808645])]
conditional_dep: [0.01391123 0.01388155 0.17221067 0.02675524 0.01033257]
final result: [0.07995043 0.03485223 0.02184015 0.04721548 0.05323298]
The thing is when I compute the following code:
matrix = [np.array([0.08078721, 0.5802404 , 0.16957052, 0.09629893, 0.07310294]),
np.array([0.14633129, 0.45458744, 0.20096238, 0.02142105, 0.17669784]),
np.array([0.41198731, 0.06197812, 0.05934063, 0.23325626, 0.23343768]),
np.array([0.15686545, 0.29516415, 0.20095091, 0.14720275, 0.19981674]),
np.array([0.15965914, 0.18383683, 0.10606946, 0.14234812, 0.40808645])]
tmp = np.asarray([0.01391123, 0.01388155, 0.17221067, 0.02675524, 0.01033257])
matrix @ tmp
The values in use are exactly the same as they should be in the computation before but I get the following result:
array([0.04171218, 0.04535276, 0.02546353, 0.04688848, 0.03106443])
This result is then obviously different than the previous one and is the true one (I computed the dot product by hand).
I have been facing this problem the whole day and I did not find anything useful online. If any of you have any even tiny idea where it can come from I'd be really happy :D
Thanks in advance
Yann
PS: I can show more of the code if needed.
PS2: I don't know if it is relevant but this is used in a dynamic programming algorithm.

To recap our discussion in the comments, in the first part ("pre-computation"), the following is true about the matrix object:
>>> matrix.shape
(5,)
>>> matrix.dtype
dtype('O') # aka object
And as you say, this is due to matrix being a slice of a larger, non-uniform array. Let's recreate this situation:
>>> matrix = np.array([[], np.array([0.08078721, 0.5802404 , 0.16957052, 0.09629893, 0.07310294]), np.array([0.14633129, 0.45458744, 0.20096238, 0.02142105, 0.17669784]), np.array([0.41198731, 0.06197812, 0.05934063, 0.23325626, 0.23343768]), np.array([0.15686545, 0.29516415, 0.20095091, 0.14720275, 0.19981674]), np.array([0.15965914, 0.18383683, 0.10606946, 0.14234812, 0.40808645])], dtype=object)[1:]
It is now not a matrix with scalars in rows and columns, but a column vector of column vectors. Technically, matrix @ tmp is an operation between two 1-D arrays and hence NumPy should, according to the documentation, calculate the inner product of the two. This is true in this case, with the convention that the sum be over the first axis:
>>> np.array([matrix[i] * tmp[i] for i in range(5)]).sum(axis=0)
array([0.07995043, 0.03485222, 0.02184015, 0.04721548, 0.05323298])
>>> matrix @ tmp
array([0.07995043, 0.03485222, 0.02184015, 0.04721548, 0.05323298])
This is essentially the same as taking the transpose of the proper 2-D matrix before the multiplication:
>>> np.stack(matrix).T @ tmp
array([0.07995043, 0.03485222, 0.02184015, 0.04721548, 0.05323298])
Equivalently, as noted by @jirasssimok:
>>> tmp @ np.stack(matrix)
array([0.07995043, 0.03485222, 0.02184015, 0.04721548, 0.05323298])
Hence the erroneous or unexpected result.
As you have already resolved to do in the comments, this can be avoided in the future by ensuring all matrices are proper 2-D arrays.

It looks like you got the operands switched in one of your matrix multiplications.
Using the same values of matrix and tmp that you provided, matrix @ tmp and tmp @ matrix provide the two results you showed.[1]
matrix = [np.array([0.08078721, 0.5802404 , 0.16957052, 0.09629893, 0.07310294]),
np.array([0.14633129, 0.45458744, 0.20096238, 0.02142105, 0.17669784]),
np.array([0.41198731, 0.06197812, 0.05934063, 0.23325626, 0.23343768]),
np.array([0.15686545, 0.29516415, 0.20095091, 0.14720275, 0.19981674]),
np.array([0.15965914, 0.18383683, 0.10606946, 0.14234812, 0.40808645])]
tmp = np.asarray([0.01391123, 0.01388155, 0.17221067, 0.02675524, 0.01033257])
print(matrix @ tmp) # [0.04171218 0.04535276 0.02546353 0.04688848 0.03106443]
print(tmp @ matrix) # [0.07995043 0.03485222 0.02184015 0.04721548 0.05323298]
To make it a little more obvious what your code is doing, you might also consider using np.dot instead of @. If you pass matrix as the first argument and tmp as the second, it will have the result you want, and make it more clear that you're conceptually calculating dot products rather than multiplying matrices.
As an additional note, if you're performing matrix operations on matrix, it might be better if it was a single two-dimensional array instead of a list of 1-dimensional arrays. This will prevent errors of the sort you'll see right now if you try to run matrix @ matrix. This would also let you say matrix.dot(tmp) instead of np.dot(matrix, tmp) if you wanted to.
(I'd guess that you can use np.stack or a similar function to create matrix, or you can call np.stack on matrix after creating it.)
[1] Because tmp has only one dimension and matrix has two, NumPy can and will treat tmp as whichever type of vector makes the multiplication work (for a 1-D operand, matmul temporarily adds a length-1 axis and removes it again afterwards). So tmp is treated as a column vector in matrix @ tmp and a row vector in tmp @ matrix.
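To make the operand-order effect concrete, here is a small sketch with a hand-checkable 2x2 matrix. The names M and v and their values are chosen for illustration only and are unrelated to the data in the question.

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([10.0, 1.0])

right = M @ v      # v as a column: dot product of each ROW of M with v
left = v @ M       # v as a row: dot product of v with each COLUMN of M

print(right)       # [12. 34.]
print(left)        # [13. 24.]

# Swapping the operands is the same as transposing the matrix first.
print(np.allclose(left, M.T @ v))
```

If the two sides of a computation disagree in this way, checking whether one of them effectively works with the transpose is a reasonable first step.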

Fast way to construct a matrix in Python

I have been browsing through the questions, and could find some help, but I prefer having confirmation by asking it directly. So here is my problem.
I have a (NumPy) array u of dimension N, from which I want to build a square N x N matrix k. Basically, each matrix element k(i,j) is defined as k(i,j)=exp(-|u_i-u_j|^2).
My first naive way to do it was like this, which is, I believe, Fortran-like:
for i in range(N):
    for j in range(N):
        k[i][j]=np.exp(np.sum(-(u[i]-u[j])**2))
However, this is extremely slow. For N=1000, for example, it is taking around 15 seconds.
My other way to proceed is the following (inspired by other questions/answers):
i, j = np.ogrid[:N,:N]
k = np.exp(np.sum(-(u[i]-u[j])**2,axis=2))
This is way faster, as for N=1000, the result is almost instantaneous.
So I have two questions.
1) Why is the first method so slow, and why is the second one so fast ?
2) Is there a faster way to do it ? For N=10000, it is starting to take quite some time already, so I really don't know if this was the "right" way to do it.
Thank you in advance !
P.S: the matrix is symmetric, so there must also be a way to make the process faster by calculating only the upper half of the matrix, but my question was more related to the way to manipulate arrays, etc.

First, a small remark: there is no need to use np.sum if u is one-dimensional (for example u = np.arange(N)), which seems to be the case since you wrote that it is of dimension N.
1) First question:
Indexing element by element in Python is slow, so it is best to avoid [] when there is a vectorized alternative. In addition, the loop calls np.exp and np.sum once per element, whereas both can operate on whole vectors and matrices. Your second proposal is therefore better: it computes k all at once instead of element by element.
2) Second question:
Yes there is. You should consider using only numpy functions and not using indices (around 3 times faster):
k = np.exp(-np.power(np.subtract.outer(u,u),2))
(NB: You can keep **2 instead of np.power, which is a bit faster but has smaller precision)
edit (Take into account that u is an array of tuples)
With tuple data, it's a bit more complicated:
ma = np.subtract.outer(u[:,0],u[:,0])**2
mb = np.subtract.outer(u[:,1],u[:,1])**2
k = np.exp(-np.add(ma, mb))
You have to call np.subtract.outer twice, since calling it once on the full array would return a 4-dimensional array (and compute lots of useless data), whereas u[i]-u[j] returns a 3-dimensional array.
I used np.add instead of np.sum since it keeps the array dimensions.
NB: I checked with
N = 10000
u = np.random.random_sample((N,2))
It returns the same as your proposals (but 1.7 times faster).
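The variants discussed above can be checked against each other with a short sketch. The small N and the seeded generator are assumptions to keep the loop fast; the broadcasting form is one more way to write the same computation and is not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200                     # small N (an assumption) so the loop check is quick
u = rng.random((N, 2))      # N points in 2-D, as in the edited answer

# Reference: explicit double loop from the question.
k_loop = np.empty((N, N))
for i in range(N):
    for j in range(N):
        k_loop[i, j] = np.exp(-np.sum((u[i] - u[j]) ** 2))

# Broadcasting: (N, 1, 2) - (1, N, 2) -> (N, N, 2), then sum over coordinates.
diff = u[:, None, :] - u[None, :, :]
k_broadcast = np.exp(-np.sum(diff ** 2, axis=-1))

# Per-coordinate outer differences, as in the answer above.
ma = np.subtract.outer(u[:, 0], u[:, 0]) ** 2
mb = np.subtract.outer(u[:, 1], u[:, 1]) ** 2
k_outer = np.exp(-(ma + mb))

print(np.allclose(k_loop, k_broadcast), np.allclose(k_loop, k_outer))
```

The broadcasting form allocates an (N, N, 2) intermediate, so for large N the per-coordinate version may use less memory.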

clean summation involving index of numpy arrays

I've occasionally but not frequently used numpy. I'm now needing to do some summations where the sums involve the row/column indices.
I have an m x n array S. I want to create a new m x n array whose 's,i' entry is
-c i S[s,i] + g (i+1)S[s,i+1] + (s+1)S[s+1,i-1]
So say S=np.array([[1,2],[3,4], [5,6]]) the result I want is
-c*np.array([[0*1, 1*2],[0*3, 1*4],[0*5, 1*6]])
+ g*np.array([[1*2, 2*0],[1*4, 2*0],[1*6, 2*0]])
+ np.array([[1*0, 1*3],[2*0, 2*5],[3*0, 3*0]])
(that's not all the terms in my equation, but I feel like knowing how to do this would be enough to complete what I'm after).
I think what I will need to do is create a new array whose rows are just the index of the rows and another corresponding for columns. Then do some component-wise multiplication. But this is well outside what I normally do in my research, so I've taken a few wrong steps already.
note: It is understood that where the indices refer to something outside my array the value is zero.
Is there a clean way to do the summation I've described above?

I would do it in several steps, due to your possible out-of-bounds indexing:
import numpy as np
S = np.array([[1,2],[3,4], [5,6]])
c = np.random.rand()
g = np.random.rand()
m,n = S.shape
Stmp1 = S*np.arange(0,n) # i*S[s,i]
Stmp2 = S*np.arange(0,m)[:,None] # s*S[s,i]
# the answer:
Sout = -c*Stmp1
Sout[:,:-1] = Sout[:,:-1] + g*Stmp1[:,1:]
Sout[:-1,1:] = Sout[:-1,1:] + Stmp2[1:,:-1]
# only for control:
Sout2 = -c*np.array([[0*1, 1*2],[0*3, 1*4],[0*5, 1*6]]) \
+ g*np.array([[1*2, 2*0],[1*4, 2*0],[1*6, 2*0]]) \
+ np.array([[1*0, 1*3],[2*0, 2*5],[3*0, 3*0]])
Check:
In [431]: np.all(Sout==Sout2)
Out[431]: True
I introduced auxiliary arrays for i*S[s,i] and s*S[s,i]. While this is clearly not necessary, it makes the code easier to read. We could've easily sliced into the np.arange(0,n) calls directly, but unless memory is an issue, I find this approach much more straightforward.
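An alternative sketch of the same idea handles the out-of-bounds terms by zero-padding S once and then slicing shifted views. The function name shifted_index_sum is hypothetical, and the values of c and g are fixed here only so that the result can be compared with the expected arrays from the question.

```python
import numpy as np

def shifted_index_sum(S, c, g):
    """Hypothetical helper: -c*i*S[s,i] + g*(i+1)*S[s,i+1] + (s+1)*S[s+1,i-1]."""
    m, n = S.shape
    rows = np.arange(m)[:, None]   # s, as a column so it broadcasts over rows
    cols = np.arange(n)            # i, broadcasts over columns
    P = np.pad(S, 1)               # zero border: P[s+1, i+1] == S[s, i]
    right = P[1:-1, 2:]            # S[s, i+1], zero past the last column
    down_left = P[2:, :-2]         # S[s+1, i-1], zero outside the array
    return -c * cols * S + g * (cols + 1) * right + (rows + 1) * down_left

S = np.array([[1, 2], [3, 4], [5, 6]])
c, g = 0.5, 2.0
result = shifted_index_sum(S, c, g)

# Expected arrays copied from the question.
expected = (-c * np.array([[0*1, 1*2], [0*3, 1*4], [0*5, 1*6]])
            + g * np.array([[1*2, 2*0], [1*4, 2*0], [1*6, 2*0]])
            + np.array([[1*0, 1*3], [2*0, 2*5], [3*0, 3*0]]))
print(np.allclose(result, expected))
```

Padding costs one extra copy of S, but every additional shifted term then becomes a single slice of P.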

Multiply several matrices in numpy

Suppose you have n square matrices A1,...,An. Is there any way to multiply these matrices in a neat way? As far as I know dot in numpy accepts only two arguments. One obvious way is to define a function to call itself and get the result. Is there any better way to get it done?

This might be a relatively recent feature, but I like:
A.dot(B).dot(C)
or if you had a long chain you could do (in Python 3, reduce has to be imported from functools):
reduce(np.dot, [A1, A2, ..., An])
Update:
There is more info about reduce here. Here is an example that might help.
>>> from functools import reduce
>>> import numpy as np
>>> A = [np.random.random((5, 5)) for i in range(4)]
>>> product1 = A[0].dot(A[1]).dot(A[2]).dot(A[3])
>>> product2 = reduce(np.dot, A)
>>> np.all(product1 == product2)
True
Update 2016:
As of Python 3.5, there is a new matrix multiplication operator, @:
R = A @ B @ C

Resurrecting an old question with an update:
As of November 13, 2014 there is a np.linalg.multi_dot function which does exactly what you want. It also has the benefit of optimizing the call order, though that isn't necessary in your case.
Note that this is available starting with NumPy version 1.10.
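A minimal sketch comparing multi_dot with the left-to-right alternatives. The non-square shapes are an assumption chosen so that the multiplication order actually affects the cost; for equally sized square matrices, as in the question, every order costs the same.

```python
from functools import reduce
import numpy as np

rng = np.random.default_rng(0)
# Non-square shapes (an assumption) so that the multiplication order matters for cost.
A1 = rng.random((50, 5))
A2 = rng.random((5, 40))
A3 = rng.random((40, 3))
A4 = rng.random((3, 60))
mats = [A1, A2, A3, A4]

p_reduce = reduce(np.dot, mats)        # strictly left to right
p_operator = A1 @ A2 @ A3 @ A4         # also left to right
p_multi = np.linalg.multi_dot(mats)    # picks a cheap parenthesization itself

print(np.allclose(p_reduce, p_multi), np.allclose(p_operator, p_multi))
```

The results agree up to floating-point rounding; only the number of scalar multiplications differs between the evaluation orders.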

If you compute all the matrices a priori then you should use an optimization scheme for matrix chain multiplication. See this Wikipedia article.

Another way to achieve this would be using einsum, which implements the Einstein summation convention for NumPy.
To very briefly explain this convention with respect to this problem: When you write down your multiple matrix product as one big sum of products, you get something like:
P_im = sum_j sum_k sum_l A1_ij A2_jk A3_kl A4_lm
where P is the result of your product and A1, A2, A3, and A4 are the input matrices. Note that you sum over exactly those indices that appear twice in the summand, namely j, k, and l. As a sum with this property often appears in physics, vector calculus, and probably some other fields, there is a NumPy tool for it, namely einsum.
In the above example, you can use it to calculate your matrix product as follows:
P = np.einsum( "ij,jk,kl,lm", A1, A2, A3, A4 )
Here, the first argument tells the function which indices to apply to the argument matrices and then all doubly appearing indices are summed over, yielding the desired result.
Note that the computational efficiency depends on several factors (so you are probably best off with just testing it):
Why is numpy's einsum slower than numpy's built-in functions?
Why is numpy's einsum faster than numpy's built in functions?
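For reference, a small sketch of the einsum chain with the output indices written out explicitly and checked against the @ operator. The 6x6 random matrices are an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
A1, A2, A3, A4 = (rng.random((6, 6)) for _ in range(4))

# Implicit form: indices that appear once (i and m) form the output, in alphabetical order.
P_implicit = np.einsum("ij,jk,kl,lm", A1, A2, A3, A4)

# Explicit form spells out the output indices; optimize=True lets einsum
# choose a pairwise contraction order instead of one large nested sum.
P_explicit = np.einsum("ij,jk,kl,lm->im", A1, A2, A3, A4, optimize=True)

P_reference = A1 @ A2 @ A3 @ A4

print(np.allclose(P_implicit, P_reference), np.allclose(P_explicit, P_reference))
```

For longer chains or larger matrices, the optimize argument can make a substantial difference, so timing both forms on the real inputs is the safest guide.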

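from functools import reduce
import numpy as np

A_list = [np.random.randn(100, 100) for i in range(10)]
# Accumulate the product explicitly, starting from the identity
B = np.eye(A_list[0].shape[0])
for A in A_list:
    B = np.dot(B, A)
# Same product using reduce
C = reduce(np.dot, A_list)
# B == C is an array, so compare with np.allclose instead of a bare assert
assert np.allclose(B, C)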
