I am a bit puzzled by how to read and understand a simple line of code:
I have a tensor input of shape (19,4,64,64,3).
The line of code input[:, None] returns a tensor of shape (19, 1, 4, 64, 64, 3).
How should I understand the behavior of that line? It seems that None is adding a dimension with a size of 1. But why is it added at that specific position (between 19 and 4)?
Indeed, None adds a new dimension. You can also use tf.newaxis for this, which is a bit more explicit IMHO.
The new dimension is added in axis 1 because that's where it appears in the index. E.g. input[:, :, None] should result in shape (19, 4, 1, 64, 64, 3) and so on.
It might get clearer if we write all the dimensions in the slicing: input[:, None, :, :, :, :]. In slicing, : simply means taking all elements of the dimension. So by using one :, we take all elements of dimension 0 and then "move on" to dimension 1. Since None appears here, we know that the new size-1 axis should be in dimension 1. Accordingly, the remaining dimensions get "pushed back".
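Here is a minimal sketch of that behaviour, using NumPy for illustration (TensorFlow tensors follow the same indexing semantics here, and the shape is the one from the question):
import numpy as np

# Stand-in for the (19, 4, 64, 64, 3) tensor from the question.
x = np.zeros((19, 4, 64, 64, 3))

print(x[:, None].shape)     # (19, 1, 4, 64, 64, 3) -- new size-1 axis at position 1
print(x[:, :, None].shape)  # (19, 4, 1, 64, 64, 3) -- new axis at position 2
print(x[None].shape)        # (1, 19, 4, 64, 64, 3) -- new axis at position 0
print(x[..., None].shape)   # (19, 4, 64, 64, 3, 1) -- new axis at the end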
In the following,
x_6 = torch.cat((x_1, x_2_1, x_3_1, x_5_1), dim=-3)
Sizes of tensors x_1, x_2_1, x_3_1, x_5_1 are
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7]) respectively.
The size of x_6 turns out to be torch.Size([1, 1024, 7, 7])
I couldn't understand & visualise this concatenation along a negative dimension (-3 in this case).
What exactly is happening here?
How does the same go if dim = 3?
Is there any constraint on dim for a given set of tensors?
The answer by danin is not completely correct, and is actually wrong when looked at from the perspective of tensor algebra, since it suggests that the problem has to do with accessing or indexing a Python list. It doesn't.
The -3 means counting three dimensions from the end, which for these 4-dimensional tensors lands on axis 1, i.e. the 2nd dimension. (You could've just as well used 1 instead of the confusing -3.)
From taking a closer look at the tensor shapes, it seems that they represent (b, c, h, w) where b stands for batch_size, c stands for number of channels, h stands for height and w stands for width.
This is usually the case somewhere at the final stages of encoding (possibly images) in a deep neural network, where we arrive at these feature maps.
The torch.cat() operation with dim=-3 is meant to say that we concatenate these 4 tensors along the dimension of channels c (see above).
4 * 256 => 1024
Hence, the resultant tensor ends up with a shape torch.Size([1, 1024, 7, 7]).
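A minimal sketch of this, with the shapes from the question (dim=-3 and dim=1 address the same axis for these 4-D tensors):
import torch

# Four stand-ins for x_1, x_2_1, x_3_1 and x_5_1, each of shape (b, c, h, w).
tensors = [torch.zeros(1, 256, 7, 7) for _ in range(4)]

x_6 = torch.cat(tensors, dim=-3)  # concatenate along the channel axis
print(x_6.shape)                  # torch.Size([1, 1024, 7, 7])

# dim=-3 and dim=1 refer to the same axis here, so the results are identical:
print(torch.equal(x_6, torch.cat(tensors, dim=1)))  # True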
Notes: It is hard to visualize a 4-dimensional space since we humans live in an inherently 3D world. Nevertheless, here are some answers that I wrote a while ago which should help you get a mental picture.
How to understand the term tensor in TensorFlow?
Very Basic Numpy array dimension visualization
Python provides negative indexing, so you can access elements starting from the end of a list, e.g., -1 is the last element of a list.
In this case the tensor has 4 dimensions, so -3 actually refers to the 2nd dimension (axis 1).
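For example, treating the dimensions as an ordinary Python list makes the correspondence clear:
# Negative indices count from the end, exactly as with Python lists.
dims = ['batch', 'channels', 'height', 'width']  # the 4 dimensions (b, c, h, w)
print(dims[-3])  # 'channels' -- the same as dims[1]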
I'm trying to input vectors into a numpy matrix by doing:
eigvec[:,i] = null
However I keep getting the error:
ValueError: could not broadcast input array from shape (20,1) into shape (20)
I've tried using flatten and reshape, but nothing seems to work.
The shapes in the error message are a good clue.
In [161]: x = np.zeros((10,10))
In [162]: x[:,1] = np.ones((1,10)) # or x[:,1] = np.ones(10)
In [163]: x[:,1] = np.ones((10,1))
...
ValueError: could not broadcast input array from shape (10,1) into shape (10)
In [166]: x[:,1].shape
Out[166]: (10,)
In [167]: x[:,[1]].shape
Out[167]: (10, 1)
In [168]: x[:,[1]] = np.ones((10,1))
When the shape of the destination matches the shape of the new value, the copy works. It also works in some cases where the new value can be 'broadcasted' to fit. But it does not try more general reshaping. Also note that indexing with a scalar reduces the dimension.
I can guess that
eigvec[:,i] = null.flat
would work (however, null.flatten() should work too). In fact, it looks like NumPy complains because you are assigning a pseudo-1D array (shape (20, 1)) to a 1D array, which is considered to be oriented differently (shape (1, 20), if you wish).
Another solution would be:
eigvec[:,i] = null.T
where you properly transpose the "vector" null.
The fundamental point here is that NumPy has "broadcasting" rules for converting between arrays with different numbers of dimensions. In the case of conversions between 2D and 1D, a 1D array of size n is broadcast into a 2D array of shape (1, n) (and not (n, 1)). More generally, missing dimensions are added to the left of the original dimensions.
The observed error message basically says that shapes (20,) and (20, 1) are not compatible: this is because (20,) becomes (1, 20) (and not (20, 1)) under broadcasting. In effect, one is a row matrix while the other is a column matrix.
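A quick sketch of the fixes discussed above (the names mirror the question, and null is assumed to be the (20, 1) array from the error message):
import numpy as np

eigvec = np.zeros((20, 20))
null = np.ones((20, 1))  # assumed shape, taken from the error message

# eigvec[:, 0] = null          # ValueError: (20, 1) does not broadcast to (20,)
eigvec[:, 0] = null.flatten()  # flatten to shape (20,)
eigvec[:, 1] = null.ravel()    # ravel() works too and avoids a copy where possible
eigvec[:, 2] = null.T          # (1, 20) broadcasts into the (20,) destination
eigvec[:, 3:4] = null          # or keep both sides 2D: destination shape is (20, 1)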
I'm digging out a piece of numpy code and there's a line I don't understand at all:
W[:, :, None] * h[None, :, :] * diff[:, None, :]
where W, h and diff are 784x20, 20x100 and 784x100 matrices. The multiplication result is a 784x20x100 array, but I have no idea what this computation actually does or what the result means.
For what it's worth, the line is from machine-learning-related code; W corresponds to the weights array of a neural network's layer, h is the layer activation, and diff is the difference between the network's target and hypothesis (from Sida Wang's thesis on the transforming autoencoder).
For NumPy arrays, * corresponds to element-wise multiplication. In order for this to work, the two arrays have to be either:
the same shape as each other
such that one array can be broadcast to the other
One array can be broadcast to another if, when pairing the trailing dimensions of each array, either the lengths in each pair are equal or one of the lengths is 1.
For example, the following arrays A and B have shapes which are compatible for broadcasting:
A.shape == (20, 1, 3)
B.shape == (4, 3)
(3 is equal to 3 and then the next length in A is 1 which can be paired with any length. It doesn't matter that B has fewer dimensions than A.)
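A quick check of that example with placeholder arrays:
import numpy as np

A = np.zeros((20, 1, 3))
B = np.zeros((4, 3))
print((A * B).shape)  # (20, 4, 3)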
To make two incompatible arrays broadcastable with each other, extra dimensions can be inserted into one or both arrays. Indexing a dimension with None or np.newaxis inserts an extra dimension of length one into an array.
Let's look at the example in the question. Python evaluates repeated multiplications left to right:
W[:, :, None] has shape (784, 20, 1)
h[None, :, :] has shape ( 1, 20, 100)
These shapes are broadcastable according to the explanation above and the multiplication returns an array with shape (784, 20, 100).
Array shape from last multiplication, (784, 20, 100)
diff[:, None, :] has a shape of (784, 1, 100)
The shapes of these two arrays are compatible, so the second multiplication succeeds. An array with shape (784, 20, 100) is returned.
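A sketch verifying this walkthrough, including what the result means element-wise; the einsum line is my own cross-check, not from the original code:
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((784, 20))      # weights
h = rng.standard_normal((20, 100))      # layer activation
diff = rng.standard_normal((784, 100))  # target minus hypothesis

result = W[:, :, None] * h[None, :, :] * diff[:, None, :]
print(result.shape)  # (784, 20, 100)

# Element-wise, result[i, j, k] == W[i, j] * h[j, k] * diff[i, k]:
check = np.einsum('ij,jk,ik->ijk', W, h, diff)
print(np.allclose(result, check))  # True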
I've got a question...
I created a numpy.array with the shape=(4,128,256,256).
If I print out the following:
print shape(x[:][3][1][:])
the output is shape=(256,256), not (4,256) as I expected...
Also the statement
print x[:][4][1][1]
produces an error: index out of bounds
After some trial and error, it seems to me that the [:] does not work if another argument with a discrete value follows...
I solved my current problem by using loops, but for the future I want to understand what I did wrong...
Thank you for your help...
To get what you want you must do the indexing properly:
x[:, 3, 1, :].shape => (4, 256)
numpy arrays are not standard lists
If you do x[:][3][1][:] you actually do the following:
x1 = x[:] # get the whole array
x2 = x1[3] # get the fourth element along the first dimension
x2.shape => (128, 256, 256)
x3 = x2[1] # get the second element along the first dimension of `x2`
x3.shape => (256, 256)
x3[:] # get all `x3`
For more explanation about indexing, see the numpy documentation.
About the error when you do
x[:][4][1][1]
You get an index out of bounds error because x[:] is the whole array and its first dimension has size 4, so x[:][4] does not exist (valid indices run from 0 to 3).
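A runnable sketch of the difference, using the shape from the question:
import numpy as np

x = np.zeros((4, 128, 256, 256))

print(x[:, 3, 1, :].shape)  # (4, 256) -- one indexing operation over all dimensions
print(x[:][3][1][:].shape)  # (256, 256) -- x[:] is just x, then chained indexing
# x[:][4]                   # IndexError: index 4 is out of bounds for axis 0 with size 4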
I generally use MATLAB and Octave, and I have recently switched to Python and NumPy.
In numpy when I define an array like this
>>> a = np.array([[2,3],[4,5]])
it works great and the size of the array is
>>> a.shape
(2, 2)
which is also the same as in MATLAB.
But when I extract the entire first column and check the size
>>> b = a[:,0]
>>> b.shape
(2,)
I get size (2,). What is this? I expected the size to be (2,1). Perhaps I misunderstood a basic concept. Can anyone clarify this for me?
A 1D numpy array* is literally 1D - it has no size in any second dimension, whereas in MATLAB, a '1D' array is actually 2D, with a size of 1 in its second dimension.
If you want your array to have size 1 in its second dimension you can use its .reshape() method:
a = np.zeros(5,)
print(a.shape)
# (5,)
# explicitly reshape to (5, 1)
print(a.reshape(5, 1).shape)
# (5, 1)
# or use -1 in the first dimension, so that its size in that dimension is
# inferred from its total length
print(a.reshape(-1, 1).shape)
# (5, 1)
Edit
As Akavall pointed out, I should also mention np.newaxis as another method for adding a new axis to an array. Although I personally find it a bit less intuitive, one advantage of np.newaxis over .reshape() is that it allows you to add multiple new axes in an arbitrary order without explicitly specifying the shape of the output array, which is not possible with the .reshape(-1, ...) trick:
a = np.zeros((3, 4, 5))
print(a[np.newaxis, :, np.newaxis, ..., np.newaxis].shape)
# (1, 3, 1, 4, 5, 1)
np.newaxis is just an alias of None, so you could do the same thing a bit more compactly using a[None, :, None, ..., None].
* An np.matrix, on the other hand, is always 2D, and will give you the indexing behavior you are familiar with from MATLAB:
a = np.matrix([[2, 3], [4, 5]])
print(a[:, 0].shape)
# (2, 1)
For more info on the differences between arrays and matrices, see here.
Typing help(np.shape) gives some insight into what is going on here. For starters, you can get the output you expect by typing:
b = np.array([a[:,0]]).T
(note the transpose: without it, np.array([a[:,0]]) has shape (1, 2) rather than the (2, 1) you expected).
Basically numpy defines things a little differently than MATLAB. In the numpy environment, a vector only has one dimension, and an array is a vector of vectors, so it can have more. In your first example, your array is a vector of two vectors, i.e.:
a = np.array([vec1, vec2])
So a has two dimensions, and in your example the number of elements in both dimensions is the same, 2. Your array is therefore 2 by 2. When you take a slice out of this, you are reducing the number of dimensions that you have by one. In other words, you are taking a vector out of your array, and that vector only has one dimension, which also has 2 elements, but that's it. Your vector is now 2 by _. There is nothing in the second spot because the vector is not defined there.
You could think of it in terms of spaces too. Your first array is in the space R^(2x2) and your second vector is in the space R^(2). This means that the array is defined on a different (and bigger) space than the vector.
That was a lot to basically say that you took a slice out of your array, and unlike MATLAB, numpy does not represent vectors (1 dimensional) in the same way as it does arrays (2 or more dimensions).
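A short sketch of the behaviour described in both answers, including two more ways (a slice or a list index) to keep the second dimension:
import numpy as np

a = np.array([[2, 3], [4, 5]])

print(a[:, 0].shape)    # (2,)   -- a scalar index drops that dimension
print(a[:, 0:1].shape)  # (2, 1) -- a slice keeps it
print(a[:, [0]].shape)  # (2, 1) -- so does a list index
print(a[:, 0].reshape(-1, 1).shape)  # (2, 1) -- or reshape, as shown above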