What is the meaning of the following operation in numpy?

I'm digging out a piece of numpy code and there's a line I don't understand at all:
W[:, :, None] * h[None, :, :] * diff[:, None, :]
where W, h and diff are 784x20, 20x100 and 784x100 matrices. The result of the multiplication is a 784x20x100 array, but I have no idea what this computation actually does or what the result means.
For what it's worth, the line is from machine learning related code: W corresponds to the weights array of a neural network's layer, h is the layer activation, and diff is the difference between the network's target and hypothesis (from Sida Wang's thesis on the transforming autoencoder).

For NumPy arrays, * corresponds to element-wise multiplication. In order for this to work, the two arrays have to be either:
the same shape as each other
such that one array can be broadcast to the other
One array can be broadcast to another if, when pairing the trailing dimensions of each array, either the lengths in each pair are equal or one of the lengths is 1.
For example, the following arrays A and B have shapes which are compatible for broadcasting:
A.shape == (20, 1, 3)
B.shape == (4, 3)
(3 is equal to 3 and then the next length in A is 1 which can be paired with any length. It doesn't matter that B has fewer dimensions than A.)
To make two incompatible arrays broadcastable with each other, extra dimensions can be inserted into one or both arrays. Indexing a dimension with None or np.newaxis inserts an extra dimension of length one into an array.
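For instance, a quick runnable sketch of these rules (the arrays here are just placeholders):
import numpy as np
A = np.ones((20, 1, 3))
B = np.ones((4, 3))
print((A * B).shape)  # (20, 4, 3): the length-1 axis broadcasts to 4, the missing axis to 20
v = np.arange(3)
print(v[None, :].shape)  # (1, 3)
print(v[:, None].shape)  # (3, 1)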
Let's look at the example in the question. Python evaluates repeated multiplications left to right:
W[:, :, None] has shape (784, 20, 1)
h[None, :, :] has shape ( 1, 20, 100)
These shapes are broadcastable according to the explanation above and the multiplication returns an array with shape (784, 20, 100).
the array from the first multiplication has shape (784, 20, 100)
diff[:, None, :] has shape (784, 1, 100)
The shapes of these two arrays are compatible, so the second multiplication also succeeds and returns an array with shape (784, 20, 100).
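As a runnable sanity check of those shapes (random arrays stand in for the real W, h and diff here):
import numpy as np
W = np.random.rand(784, 20)
h = np.random.rand(20, 100)
diff = np.random.rand(784, 100)
out = W[:, :, None] * h[None, :, :] * diff[:, None, :]
print(out.shape)  # (784, 20, 100)
# Each element out[i, j, k] equals W[i, j] * h[j, k] * diff[i, k]
assert np.allclose(out, np.einsum('ij,jk,ik->ijk', W, h, diff))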

Dynamically broadcast a numpy array

I currently have a 1D numpy array, epsilons, that I need to element-wise multiply with an array x. However, the dimensionality of x is dynamic and changes with each iteration of the following for loop:
for x in grads:
    x = x * epsilons
    print(x)
epsilons always has the shape (M,). However, for the first iteration, x takes the shape (M,4,2) while it takes the shape (M,4) for the second iteration (the shape of x changes as the code iterates over grads). Is there a way I can automatically broadcast epsilons to the shape of x so that I can perform this element-wise multiplication for any shape of x?
You can just reshape epsilons to the correct shape: NumPy then broadcasts it automatically (as an explicit broadcast_to call would) as long as the shapes are compatible, i.e. once reshaped the two arrays have the same number of dimensions and, along each dimension, the lengths are either equal or one of them is 1.
Thanks to @hpaulj for the improved solution.
# Reshape epsilons so that its values lie along the first dimension
# (the least contiguous one): shape (M,) becomes (M, 1, ..., 1)
reshapedEpsilons = epsilons.reshape((M,) + (1,) * (x.ndim - 1))
# NumPy broadcasts the reshaped vector along the remaining dimensions,
# so the element-wise multiplication works whatever the shape of x
x *= reshapedEpsilons
PS: note that a = a * b creates a new array and is less efficient than a *= b, which modifies the values in place.
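Putting it together, a minimal sketch of the loop (M = 3 and the shapes in grads are made up here to mirror the question):
import numpy as np
M = 3
epsilons = np.random.rand(M)
grads = [np.random.rand(M, 4, 2), np.random.rand(M, 4)]
for x in grads:
    x *= epsilons.reshape((M,) + (1,) * (x.ndim - 1))
    print(x.shape)  # (3, 4, 2), then (3, 4)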

Numpy Subtract two arrays of equal ndim but different shape

So I have two ndarrays:
A with shape (N,a,a), a stack of N arrays of shape (a,a) basically
B with shape (8,M,a,a), a matrix of 8 x M arrays of shape (a,a)
I need to subtract B from A (A-B) such that the resulting array is of shape (8,M*N,a,a).
More verbosely, each of the M arrays in each of the 8 rows of B needs to be subtracted from each of the N arrays in A, resulting in 8*M*N subtractions between (a,a)-shaped arrays.
How can I do this in a vectorized manner without loops?
This thread does something similar but in lower dimensions and I can't figure out how to extend it.
A = np.arange(8).reshape(2,2,2)
B = np.ones(shape=(8,4,2,2))
General broadcasting works if dimensions are the same or if one dimension is 1, so we do this:
a = A[np.newaxis, :, np.newaxis, :, :]
b = B[:, np.newaxis, :, :, :]
a.shape # <- (1,2,1,2,2)
b.shape # <- (8,1,4,2,2)
Now you can do the broadcasting:
c = a - b
c.shape # <- (8,2,4,2,2)
And when you reshape, the (2x4=8) components get aligned:
c.reshape(8,-1,2,2)
The ordering of the new axes dictates the reshaping, so be careful with that.
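For reference, the whole recipe in one runnable piece, with the toy sizes above (N=2, M=4, a=2), so the final shape is (8, M*N, a, a) = (8, 8, 2, 2):
import numpy as np
A = np.arange(8).reshape(2, 2, 2)  # N=2 arrays of shape (2, 2)
B = np.ones((8, 4, 2, 2))          # 8 x M=4 arrays of shape (2, 2)
c = A[np.newaxis, :, np.newaxis, :, :] - B[:, np.newaxis, :, :, :]
print(c.shape)                       # (8, 2, 4, 2, 2)
print(c.reshape(8, -1, 2, 2).shape)  # (8, 8, 2, 2)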

Numpy dot of shapes (2,) and (3,1) gives error but multiplication doesn't

I'm looking for a bit of clarification on broadcasting rules versus the numpy.dot method. I created two arrays, of shapes (2,) and (3,), which can be multiplied once a new axis is added (turning the second into shape (3,1)), but the same doesn't work with np.dot even after adding that new axis. Here's the little test below.
x_1 = np.random.rand(2,)
print(x_1)
x_2 = np.random.rand(3,)
print(x_2)
> [ 0.48362051 0.55892736]
> [ 0.16988562 0.09078386 0.04844093]
x_8 = np.dot(x_1, x_2[:, np.newaxis])
> ValueError: shapes (2,) and (3,1) not aligned: 2 (dim 0) != 3 (dim 0)
x_9 = x_1 * x_2[:, np.newaxis]
print(x_9)
> [[ 0.47231067 0.30899592]
[ 0.17436521 0.11407352]
[ 0.01312074 0.00858387]]
x__7 = x_1[:, np.newaxis] * x_2[:, np.newaxis]
> ValueError: operands could not be broadcast together with shapes (2,1) (3,1)
I understand that np.dot of (2,1) & (1,3) works, but why not (2,1) & (3,1)? Broadcasting rule number two says "Two dimensions are compatible when one of them is 1", so if one of the dimensions is 1, shouldn't np.dot work, or have I understood rule number two wrong? Also, why does x_9 (multiplication) work but not x_8 (np.dot), when both involve the same shapes?
np.dot is for matrix-matrix multiplication (where a column vector can be considered to be a matrix with one column and a row vector as a matrix with one row).
* (multiplication) is for scalar multiplication in the case that one of the arguments is a scalar, and broadcasting otherwise. So the broadcasting rules are not for np.dot.
x_9 works because, as stated in the broadcasting rules here https://docs.scipy.org/doc/numpy-1.12.0/user/basics.broadcasting.html
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when
they are equal, or
one of them is 1
so the (only) dimension of x_1 (which is 2) is compatible with the last dimension of x_2[:, np.newaxis] (which is 1, because you added a new axis), and the remaining dimension is 3, giving a result of shape (3, 2).
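To make the contrast concrete, here is a small sketch: once the inner dimensions are aligned for np.dot, it computes the same outer product that broadcasting produces:
import numpy as np
x_1 = np.random.rand(2)
x_2 = np.random.rand(3)
# np.dot needs the inner dimensions to match: (3, 1) dot (1, 2) -> (3, 2)
dot_result = np.dot(x_2[:, np.newaxis], x_1[np.newaxis, :])
# Broadcasting pairs trailing dimensions: (3, 1) * (2,) -> (3, 2)
mul_result = x_2[:, np.newaxis] * x_1
print(dot_result.shape, mul_result.shape)  # (3, 2) (3, 2)
print(np.allclose(dot_result, mul_result))  # True: both are outer products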

Why does indexing a matrix by an integer produce a different shape than the dot product with a one hot vector in numpy?

I have a matrix that I initialized with numpy.random.uniform like so:
W = np.random.uniform(-1, 1, (V,N))
In my case, V = 10000 and N = 50, and x is a positive integer.
When I multiply W by a one-hot vector x_vec of dimension V x 1, as in W.T.dot(x_vec), I get a column vector with a shape of (50,1). When I try to get the same vector by indexing W, as in W[x].T or W[x,:].T, I get shape (50,).
Can anyone explain why these two expressions return different shapes, and whether it's possible to get a (50,1) matrix (column vector) with the indexing method? The vector of shape (50,) is problematic because it doesn't behave the same way as the (50,1) vector when I multiply it with other matrices, but I'd like to use indexing to speed things up a little.
Sorry in advance if this question should be in a place like Cross Validated instead of Stack Exchange.
They are different operations: matrix (in the maths sense) times matrix gives a matrix; some of your matrices just happen to have width 1.
Indexing with an integer scalar eats the dimension you are indexing into. Once you are down to a single dimension, .T does nothing because it doesn't have enough axes to shuffle.
If you want to go from (50,) to (50, 1) shape-wise, the recipe is indexing with None, like so: v[:, None]. In your case you have at least two one-line options:
W[x, :][:, None] # or W[x][:, None] or
W[x:x+1, :].T # or W[x:x+1].T
The second option preserves the first dimension of W by requesting a subrange of length one. The first option can be contracted into a single indexing operation (thanks to @hpaulj for pointing this out), which gives arguably the most readable option:
W[x, :, None]
The first index (scalar integer x) consumes the first dimension of W, the second dimension (unaffected by :) becomes the first and None creates a new dimension on the right.
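A quick check of all three variants (with a smaller V = 100 than in the question and an arbitrary index x = 3):
import numpy as np
V, N, x = 100, 50, 3
W = np.random.uniform(-1, 1, (V, N))
print(W[x].shape)           # (50,)   - the integer index eats the first dimension
print(W[x, :, None].shape)  # (50, 1) - None appends a new length-one axis
print(W[x:x+1].T.shape)     # (50, 1) - a length-one slice keeps the dimension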

How to get these shapes to line up for a numpy matrix

I'm trying to input vectors into a numpy matrix by doing:
eigvec[:,i] = null
However I keep getting the error:
ValueError: could not broadcast input array from shape (20,1) into shape (20)
I've tried using flatten and reshape, but nothing seems to work
The shapes in the error message are a good clue.
In [161]: x = np.zeros((10,10))
In [162]: x[:,1] = np.ones((1,10)) # or x[:,1] = np.ones(10)
In [163]: x[:,1] = np.ones((10,1))
...
ValueError: could not broadcast input array from shape (10,1) into shape (10)
In [166]: x[:,1].shape
Out[166]: (10,)
In [167]: x[:,[1]].shape
Out[167]: (10, 1)
In [168]: x[:,[1]] = np.ones((10,1))
When the shape of the destination matches the shape of the new value, the copy works. It also works in some cases where the new value can be 'broadcasted' to fit. But it does not try more general reshaping. Also note that indexing with a scalar reduces the dimension.
I can guess that
eigvec[:,i] = null.flat
would work (null.flatten() should work too). In fact, it looks like NumPy complains because you are assigning a pseudo-1D array (shape (20, 1)) to a 1D array which is considered to be oriented differently (shape (1, 20), if you wish).
Another solution would be:
eigvec[:,i] = null.T
where you properly transpose the "vector" null.
The fundamental point here is that NumPy has "broadcasting" rules for converting between arrays with different numbers of dimensions. In the case of conversions between 2D and 1D, a 1D array of size n is broadcast into a 2D array of shape (1, n) (and not (n, 1)). More generally, missing dimensions are added to the left of the original dimensions.
The observed error message basically said that shapes (20,) and (20, 1) are not compatible: this is because (20,) becomes (1, 20) (and not (20, 1)). In fact, one is a column matrix, while the other is a row matrix.
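A short sketch of the working variants (sizes made up here: a 20x5 eigvec and the (20, 1) null from the error message):
import numpy as np
eigvec = np.zeros((20, 5))
null = np.ones((20, 1))        # the (20, 1) "column vector" from the question
eigvec[:, 0] = null.flatten()  # shape (20,) fits the (20,) destination
eigvec[:, 1] = null.T          # (1, 20) broadcasts into the (20,) destination
eigvec[:, [2]] = null          # (20, 1) destination matches the (20, 1) source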
