I used to perform an outer subtraction on two one-dimensional arrays as follows to obtain a single two-dimensional array that contains all pairwise differences:
import numpy as np
a = np.arange(5)
b = np.arange(3)
result = np.subtract.outer(a, b)
assert result.shape == (5, 3)
assert np.all(result == np.array([[aa - bb for bb in b] for aa in a ])) # no rounding errors
Now the state space switches to two dimensions, and I would like to perform the same operation, but only perform each subtraction on the two values on the last axis of the arrays A and B:
import numpy as np
A = np.arange(5 * 2).reshape(-1, 2)
B = np.arange(3 * 2).reshape(-1, 2)
result = np.subtract.outer(A, B)
# Obviously the following does not hold, because here we have got all subtractions, therefore the shape (5, 2, 3, 2)
# I would like to exchange np.subtract.outer such that the following holds:
# assert result.shape == (5, 3, 2)
expected_result = np.array([[aa - bb for bb in B] for aa in A ])
assert expected_result.shape == (5, 3, 2)
# That's what I want to hold:
# assert np.all(result == expected_result) # no rounding errors
Is there a "numpy-only" solution to perform this operation?
You can expand/reshape A to (5, 1, 2) and B to (1, 3, 2) and let the broadcasting do the job:
A[:, None, :] - B[None, :, :]
A[:, None] - B[None, :] does it.
A = np.arange(5 * 2).reshape(-1, 2)
B = np.arange(3 * 2).reshape(-1, 2)
expected_result = np.array([[aa - bb for bb in B] for aa in A ])
C = A[:, None] - B[None, :]
np.allclose(expected_result, C)
#> True
The exact same syntax works for your first example too. This is because with your requirement, you are combining every first axis element of A with every first axis element of B.
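As a quick check of that claim, here is a minimal sketch that compares the broadcasting form against np.subtract.outer for the 1-D case and against the nested list comprehension for the 2-D case. The names a, b, A and B are taken from the question; outer_1d, outer_2d and expected are introduced only for illustration.

```python
import numpy as np

# 1-D case: broadcasting reproduces np.subtract.outer
a = np.arange(5)
b = np.arange(3)
outer_1d = a[:, None] - b[None, :]          # (5, 1) - (1, 3) -> (5, 3)
assert outer_1d.shape == (5, 3)
assert np.array_equal(outer_1d, np.subtract.outer(a, b))

# 2-D case: every row of A is paired with every row of B,
# and the subtraction happens element-wise along the last axis
A = np.arange(5 * 2).reshape(-1, 2)
B = np.arange(3 * 2).reshape(-1, 2)
outer_2d = A[:, None] - B[None, :]          # (5, 1, 2) - (1, 3, 2) -> (5, 3, 2)
expected = np.array([[aa - bb for bb in B] for aa in A])
assert outer_2d.shape == (5, 3, 2)
assert np.array_equal(outer_2d, expected)
```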
I have two matrices. The first has the following structure:
[[1, 0, a],
[0, 1, b],
[1, 0, c],
[0, 1, d]]
where 1, 0, a, b, c, and d are scalars. The matrix is 4 by 3
The second is just a 2 by 3 matrix:
[[r1],
[r2]]
where r1 and r2 are the first and second rows respectively, each having 3 elements.
I would like the output to be:
[[r1, 0, a*r1],
[0, r1, b*r1],
[r2, 0, c*r2],
[0, r2, d*r2]]
which would be a 4 by 9 matrix.
This is similar to the Kronecker product, except separately for each row of the second matrix. Of course this could be done with cumbersome loops which I want to avoid.
How can I do this concisely?
You can do exactly what you said in the last line: compute a separate Kronecker product for each row of the second matrix and then concatenate the results.
Let's assume that the two matrices are called x (4 by 3) and y (2 by 3). The first thing to do is to split x into two parts, because only half of the matrix participates in each part of the product.
x = x.reshape(2, 2, 3)
Then you can calculate the two products separately:
z0 = np.kron(x[0], y[0])
z1 = np.kron(x[1], y[1])
Finally, concatenate the two results along the first axis:
z = np.concatenate([z0, z1], axis=0)
Or if, like me, you enjoy big ugly one-liners you can do:
z = np.concatenate([np.kron(xr, yr) for xr, yr in zip(x.reshape(2, 2, 3), y)], axis=0)
In the general case you mentioned in the comments, it would become:
z = np.concatenate([np.kron(xr, yr) for xr, yr in zip(x.reshape(n // 2, 2, 3), y)], axis=0)
This gives equal results to the explicit loop, which can be numba.jit compiled I believe:
def solve_explicit(x, y):
    # sanity checks
    assert x.shape[0] == 2 * y.shape[0]
    assert x.shape[1] == y.shape[1]
    n = x.shape[0]
    z = np.zeros((n, 9))
    for i in range(n):
        for j in range(3):
            for k in range(3):
                # rows 2m and 2m + 1 of x are both paired with row m of y
                z[i, k + 3 * j] = x[i, j] * y[i // 2, k]
    return z
Using broadcasting, with x.shape (n, 3), and y.shape (n//2, 3):
out = (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
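To see why the reshape works, the following sketch checks the broadcasting one-liner against the per-pair np.kron construction from the earlier answer. The seed, the value n = 6 and the name ref are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for the sketch
n = 6                            # any even number of rows
x = rng.random((n, 3))
y = rng.random((n // 2, 3))

# Broadcasting: (n//2, 2, 3, 1) * (n//2, 1, 1, 3) -> (n//2, 2, 3, 3) -> (n, 9)
out = (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)

# Reference: one Kronecker product per pair of rows of x
ref = np.concatenate(
    [np.kron(xr, yr) for xr, yr in zip(x.reshape(-1, 2, 3), y)], axis=0
)

assert out.shape == (n, 9)
assert np.allclose(out, ref)
```

Entry out[i, 3 * j + k] equals x[i, j] * y[i // 2, k], which is the same rule the explicit triple loop applies.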
I personally would use np.einsum in this situation because I think it's easier to understand than broadcasting.
import numpy as np
(a, b, c, d) = np.random.rand(4)
x = np.array([[1, 0, a], [0, 1, b], [1, 0, c], [0, 1, d]])
y = np.random.rand(2, 3)
z = np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)
# timeit magic commands.
# %timeit -n 50000 np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)
# %timeit -n 50000 (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
Some good references on Einstein summation in NumPy: [2, 3, 4].
Given two numpy arrays of shape (2, 25) and (2,), one can easily multiply them across:
import numpy as np
a = np.random.rand(2, 25)
b = np.random.rand(2)
(a.T * b).T # ok, shape (2, 25)
I have a similar situation where a is of shape (25, 2) and b is of shape (2, 4), and I'd like to get the same result as above for each of the 4 columns of b. The following works,
a = np.random.rand(25, 2)
b = np.random.rand(2, 4)
c = np.moveaxis([a * bb for bb in b.T], -1, 0) # shape (2, 4, 25)
but I have a hunch that this is possible without moveaxis.
Any ideas?
In [185]: a = np.random.rand(2, 25)
...: b = np.random.rand(2)
The multiplication is possible with broadcasting:
In [186]: a.shape
Out[186]: (2, 25)
In [187]: a.T.shape
Out[187]: (25, 2)
In [189]: (a.T*b).shape
Out[189]: (25, 2)
(25,2) * (2,) => (25,2) * (1,2) => (25,2). The transpose is a moveaxis, changing the result to (2,25)
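The same product can also be written without either transpose by giving b a trailing axis, so that it broadcasts along the rows of a. A minimal sketch, reusing the shapes from the question (the names via_transpose and via_newaxis are only for illustration):

```python
import numpy as np

a = np.random.rand(2, 25)
b = np.random.rand(2)

via_transpose = (a.T * b).T      # (25, 2) * (2,) -> (25, 2), transposed to (2, 25)
via_newaxis = a * b[:, None]     # (2, 25) * (2, 1) -> (2, 25)

assert via_newaxis.shape == (2, 25)
assert np.allclose(via_transpose, via_newaxis)
```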
In your second case.
In [191]: c = np.moveaxis([a * bb for bb in b.T], -1, 0)
In [192]: c.shape
Out[192]: (2, 4, 25)
In [193]: np.array([a * bb for bb in b.T]).shape
Out[193]: (4, 25, 2)
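b.T is (4,2), so each bb is (2,); multiplied with the (25,2) a, it produces (25,2) as above. The list comprehension adds the (4,) iteration as a new leading axis, giving (4,25,2). The same thing can be done with broadcasting: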
(25,1,2) * (1,4,2) => (25,4,2), which can be transposed to (2,4,25)
In [195]: (a[:,None]*b.T).shape
Out[195]: (25, 4, 2)
In [196]: np.allclose((a[:,None]*b.T).T,c)
Out[196]: True
(2,4,1) * (2,1,25) => (2,4,25)
In [197]: (b[:,:,None] * a.T[:,None]).shape
Out[197]: (2, 4, 25)
In [198]: np.allclose((b[:,:,None] * a.T[:,None]),c)
Out[198]: True
An alternative with numpy.einsum:
np.einsum('ij,jk->jki', a, b)
Check results are the same:
(np.einsum('ij,jk->jki', a, b) == c).all()
True
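For readers less used to einsum subscripts, the sketch below spells out what 'ij,jk->jki' computes with explicit loops. Every index appears in the output, so nothing is summed. The array ref is introduced only for this comparison.

```python
import numpy as np

a = np.random.rand(25, 2)
b = np.random.rand(2, 4)

out = np.einsum('ij,jk->jki', a, b)

# The subscripts read as out[j, k, i] = a[i, j] * b[j, k]
ref = np.empty((2, 4, 25))
for i in range(25):
    for j in range(2):
        for k in range(4):
            ref[j, k, i] = a[i, j] * b[j, k]

assert out.shape == (2, 4, 25)
assert np.allclose(out, ref)
```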
I'm trying to make a clean connection between the dimensions in a numpy array and the dimensions of a matrix via classical linear algebra. Suppose the following:
In [1]: import numpy as np
In [2]: rand = np.random.RandomState(42)
In [3]: a = rand.rand(3,2)
In [4] a
Out[4]:
array([[0.61185289, 0.13949386],
[0.29214465, 0.36636184],
[0.45606998, 0.78517596]])
In [5]: a[np.newaxis,:,:]
Out[5]:
array([[[0.61185289, 0.13949386],
[0.29214465, 0.36636184],
[0.45606998, 0.78517596]]])
In [6]: a[:,np.newaxis,:]
Out[6]:
array([[[0.61185289, 0.13949386]],
[[0.29214465, 0.36636184]],
[[0.45606998, 0.78517596]]])
In [7]: a[:,:,np.newaxis]
Out[7]:
array([[[0.61185289],
[0.13949386]],
[[0.29214465],
[0.36636184]],
[[0.45606998],
[0.78517596]]])
My questions are as follows:
1. Is it correct to say that the dimensions of a are 3 X 2? In other words, a 3 X 2 matrix?
2. Is it correct to say that the dimensions of a[np.newaxis,:,:] are 1 X 3 X 2? In other words, a matrix containing a 3 X 2 matrix?
3. Is it correct to say that the dimensions of a[:,np.newaxis,:] are 3 X 1 X 2? In other words, a matrix containing 3 1 X 2 matrices?
4. Is it correct to say that the dimensions of a[:,:,np.newaxis] are 3 X 2 X 1? In other words, a matrix containing 3 matrices, each of which contains 2 1 X 1 matrices?
1. Yes.
2. Yes.
3. Yes.
4. Not quite: it contains three 2 x 1 matrices, each of which contains two vectors of size 1.
Just find out using .shape:
import numpy as np
rand = np.random.RandomState(42)
# 1.
a = rand.rand(3, 2)
print(a.shape, a, sep='\n', end='\n\n')
# 2.
b = a[np.newaxis, :, :]
print(b.shape, b, sep='\n', end='\n\n')
# 3.
c = a[:, np.newaxis, :]
print(c.shape, c, sep='\n', end='\n\n')
# 4.a
d = a[:, :, np.newaxis]
print(d.shape, d, sep='\n', end='\n\n')
# 4.b
print(d[0].shape, d[0], sep='\n', end='\n\n')
print(d[0, 0].shape, d[0, 0])
output:
(3, 2)
[[0.37454012 0.95071431]
[0.73199394 0.59865848]
[0.15601864 0.15599452]]
(1, 3, 2)
[[[0.37454012 0.95071431]
[0.73199394 0.59865848]
[0.15601864 0.15599452]]]
(3, 1, 2)
[[[0.37454012 0.95071431]]
[[0.73199394 0.59865848]]
[[0.15601864 0.15599452]]]
(3, 2, 1)
[[[0.37454012]
[0.95071431]]
[[0.73199394]
[0.59865848]]
[[0.15601864]
[0.15599452]]]
(2, 1)
[[0.37454012]
[0.95071431]]
(1,) [0.37454012]
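As a complement to the printed shapes, here is a small sketch showing that np.newaxis is just None and that each indexing form has an np.expand_dims counterpart. The integer array used here is an arbitrary stand-in for the random one above.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)   # stand-in for the random (3, 2) array

# np.newaxis is an alias for None
assert np.newaxis is None

# Each indexing form matches np.expand_dims with the corresponding axis
assert a[np.newaxis, :, :].shape == np.expand_dims(a, axis=0).shape == (1, 3, 2)
assert a[:, np.newaxis, :].shape == np.expand_dims(a, axis=1).shape == (3, 1, 2)
assert a[:, :, np.newaxis].shape == np.expand_dims(a, axis=2).shape == (3, 2, 1)

# The data are unchanged; only the shape differs
assert np.array_equal(a[:, :, np.newaxis].reshape(3, 2), a)
```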
I have an array A (shape = (a, 1)) and a matrix B (shape = (b1, b2)). I want to multiply the latter by each element of the former to generate a three-dimensional array (shape = (a, b1, b2)).
Is there a vectorized way to do this?
import numpy as np
A = np.random.rand(3, 1)
B = np.random.rand(5, 4)
C = np.array([ a * B for a in A ])
There are several ways you can achieve this.
One is to use np.dot. Note that it is necessary to introduce a new axis in B, giving it shape (5, 1, 4), so that the length-1 last axis of A is contracted with that new axis:
C = np.dot(A,B[:,None])
print(C.shape)
# (3, 5, 4)
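Using np.multiply.outer, as @divakar suggests (A is flattened first, because passing the (3, 1) array directly would give shape (3, 1, 5, 4)):
C = np.multiply.outer(A.ravel(), B)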
print(C.shape)
# (3, 5, 4)
Or you could also use np.einsum:
C = np.einsum('ij,kl->ikl', A, B)
print(C.shape)
# (3, 5, 4)
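Plain broadcasting is another option that the answers above do not list. The following is a minimal sketch under the shapes from the question; the names C_loop and C_bcast are only for illustration.

```python
import numpy as np

A = np.random.rand(3, 1)
B = np.random.rand(5, 4)

# Reference from the question: one scaled copy of B per element of A
C_loop = np.array([a * B for a in A])

# Broadcasting: (3, 1, 1) * (5, 4) -> (3, 5, 4)
C_bcast = A[:, :, None] * B

assert C_bcast.shape == (3, 5, 4)
assert np.allclose(C_bcast, C_loop)

# The einsum variant agrees as well
assert np.allclose(np.einsum('ij,kl->ikl', A, B), C_loop)
```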
I have a numpy array a of size 5x5x4x5x5. I have another matrix b of size 5x5. I want to get a[i,j,b[i,j]] for i from 0 to 4 and for j from 0 to 4. This will give me a 5x5x1x5x5 matrix. Is there any way to do this without just using 2 for loops?
Let's think of the array a as 100 (= 5 x 5 x 4) matrices of size (5, 5). So, if you can get a linear index for each triplet (i, j, b[i, j]), you are done. That's where np.ravel_multi_index comes in. The code follows.
import numpy as np
import itertools
# create some matrices
a = np.random.randint(0, 10, (5, 5, 4, 5, 5))
b = np.random.randint(0, 4, (5, 5))
# creating all possible triplets - (ind1, ind2, ind3)
inds = list(itertools.product(range(5), range(5)))
(ind1, ind2), ind3 = zip(*inds), b.flatten()
allInds = np.array([ind1, ind2, ind3])
linearInds = np.ravel_multi_index(allInds, (5,5,4))
# reshaping the input array
a_reshaped = np.reshape(a, (100, 5, 5))
# selecting the appropriate indices
res1 = a_reshaped[linearInds, :, :]
# reshaping back into desired shape
res1 = np.reshape(res1, (5, 5, 1, 5, 5))
# verifying with the brute force method
res2 = np.empty((5, 5, 1, 5, 5))
for i in range(5):
    for j in range(5):
        res2[i, j, 0] = a[i, j, b[i, j], :, :]
print(np.all(res1 == res2))  # should print True
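For reference, this is what np.ravel_multi_index does for a single triplet. The sketch assumes the default C (row-major) order; the triplet (1, 2, 3) is an arbitrary example.

```python
import numpy as np

# Position (i, j, k) in a C-ordered array of shape (5, 5, 4)
# maps to the flat index i * (5 * 4) + j * 4 + k
flat = np.ravel_multi_index((1, 2, 3), (5, 5, 4))
assert flat == 1 * 20 + 2 * 4 + 3   # 31

# np.unravel_index is the inverse mapping
assert tuple(int(v) for v in np.unravel_index(flat, (5, 5, 4))) == (1, 2, 3)
```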
There's np.take_along_axis exactly for this purpose -
np.take_along_axis(a,b[:,:,None,None,None],axis=2)
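A short sketch that checks this one-liner against the explicit double loop and, as an alternative not mentioned above, against integer-array indexing. The seed and the names res, ref, alt, i_idx and j_idx are chosen only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily
a = rng.integers(0, 10, size=(5, 5, 4, 5, 5))
b = rng.integers(0, 4, size=(5, 5))

# The index array must have the same number of dimensions as a
res = np.take_along_axis(a, b[:, :, None, None, None], axis=2)
assert res.shape == (5, 5, 1, 5, 5)

# Reference: the explicit double loop
ref = np.empty((5, 5, 1, 5, 5), dtype=a.dtype)
for i in range(5):
    for j in range(5):
        ref[i, j, 0] = a[i, j, b[i, j]]
assert np.array_equal(res, ref)

# Integer-array indexing gives the same values with shape (5, 5, 5, 5)
i_idx, j_idx = np.ogrid[:5, :5]
alt = a[i_idx, j_idx, b]
assert alt.shape == (5, 5, 5, 5)
assert np.array_equal(alt[:, :, None], res)
```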