I am working on argmax function of PyTorch which is defined as:
torch.argmax(input, dim=None, keepdim=False)
Consider an example
a = torch.randn(4, 4)
print(a)
print(torch.argmax(a, dim=1))
Here when I use dim=1 instead of searching column vectors, the function searches for row vectors as shown below.
print(a) :
tensor([[-1.7739, 0.8073, 0.0472, -0.4084],
[ 0.6378, 0.6575, -1.2970, -0.0625],
[ 1.7970, -1.3463, 0.9011, -0.8704],
[ 1.5639, 0.7123, 0.0385, 1.8410]])
print(torch.argmax(a, dim=1))
tensor([1, 1, 0, 3])
As far as my assumption goes dim = 0 represents rows and dim =1 represent columns.
It's time to correctly understand how the axis or dim argument work in PyTorch:
The following example should make sense once you comprehend the above picture:
|
v
dim-0 ---> -----> dim-1 ------> -----> --------> dim-1
| [[-1.7739, 0.8073, 0.0472, -0.4084],
v [ 0.6378, 0.6575, -1.2970, -0.0625],
| [ 1.7970, -1.3463, 0.9011, -0.8704],
v [ 1.5639, 0.7123, 0.0385, 1.8410]]
|
v
# argmax (indices where max values are present) along dimension-1
In [215]: torch.argmax(a, dim=1)
Out[215]: tensor([1, 1, 0, 3])
Note: dim (short for 'dimension') is the torch equivalent of 'axis' in NumPy.
Dimensions are defined as shown in the above excellent answer. I have highlighted the way I understand dimensions in Torch and Numpy (dim and axis respectively) and hope that this will be helpful to others.
Notice that only the specified dimension’s index varies during the argmax operation, and the specified dimension’s index range reduces to a single index once the operation is completed. Let tensor A have M rows and N columns and consider the sum operation for simplicity. The shape of A is (M, N). If dim=0 is specified, then the vectors A[0,:], A[1,:], ..., A[M-1,:] are summed elementwise and the result is another tensor with 1 row and N columns. Notice that only the 0th dimension’s indices vary from 0 throughout M-1. Similarly, If dim=1 is specified, then the vectors A[:,0], A[:,1], ..., A[:,N-1] are summed elementwise and the result is another tensor with M rows and 1 column.
An example is given below:
>>> A = torch.tensor([[1,2,3], [4,5,6]])
>>> A
tensor([[1, 2, 3],
[4, 5, 6]])
>>> S0 = torch.sum(A, dim = 0)
>>> S0
tensor([5, 7, 9])
>>> S1 = torch.sum(A, dim = 1)
>>> S1
tensor([ 6, 15])
In the above sample code, the first sum operation specifies dim=0, therefore A[0,:] and A[1,:], which are [1,2,3] and [4,5,6], are summed and resulted in [5, 7, 9]. When dim=1 was specified, the vectors A[:,0], A[:,1], and A[:2], which are the vectors [1, 4], [2, 5], and [3, 6], are elementwise added to find [6, 15].
Note also that the specified dimension collapses. Again let A have the shape (M, N). If dim=0, then the result will have the shape (1, N), where dimension 0 is reduced from M to 1. Similarly if dim=1, then the result would have the shape (M, 1), where N is reduced to 1. Note also that shapes (1, N) and (M,1) are represented by a single-dimensional tensor with N and M elements respectively.
I got a 2d numpy array (shape(y,x)=601,1200) and a 3d numpy array (shape(z,y,x)=137,601,1200).
In my 2d array, I saved the z values at the y, x point which I now want to access from my 3d array and save it into a new 2d array.
I tried something like this without success.
levels = array2d.reshape(-1)
y = np.arange(601)
x = np.arange(1200)
newArray2d=oldArray3d[levels,y,x]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (721200,) (601,) (1200,)
I don't want to try something with loops, so is there any faster method?
This is the data you have:
x_len = 12 # In your case, 1200
y_len = 6 # In your case, 601
z_len = 3 # In your case, 137
import numpy as np
my2d = np.random.randint(0,z_len,(y_len,x_len))
my3d = np.random.randint(0,5,(z_len,y_len,x_len))
This is one way to build your new 2d array:
yindices,xindices = np.indices(my2d.shape)
new2d = my3d[my2d[:], yindices, xindices]
Notes:
We're using Integer Advanced Indexing.
This means we index the 3d array my3d with 3 integer index arrays.
For more explanation on how integer array indexing works, please refer to my answer on this other question
In your attempt, there was no need to reshape your 2d with reshape(-1), since the shape of the integer index array that we pass, will (after any broadcasting) become the shape of the resulting 2d array.
Also, in your attempt, your second and third index arrays need to have opposite orientations. That is, they must be of shape (y_len,1) and (1, x_len). Notice the different positions of the 1. This ensures that these two index arrays will get broadcasted
There's some vagueness in your question, but I think you want to advanced indexing like this:
In [2]: arr = np.arange(24).reshape(4,3,2)
In [3]: levels = np.random.randint(0,4,(3,2))
In [4]: levels
Out[4]:
array([[1, 2],
[3, 1],
[0, 2]])
In [5]: arr
Out[5]:
array([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15],
[16, 17]],
[[18, 19],
[20, 21],
[22, 23]]])
In [6]: arr[levels, np.arange(3)[:,None], np.arange(2)]
Out[6]:
array([[ 6, 13],
[20, 9],
[ 4, 17]])
levels is (3,2). I created the other 2 indexing arrays to they broadcast with it (3,1) and (2,). The result is a (3,2) array of values from arr, selected by their combined indices.
In [28]: arr = np.arange(16).reshape((2, 2, 4))
In [29]: arr
Out[29]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[12, 13, 14, 15]]])
In [32]: arr.transpose((1, 0, 2))
Out[32]:
array([[[ 0, 1, 2, 3],
[ 8, 9, 10, 11]],
[[ 4, 5, 6, 7],
[12, 13, 14, 15]]])
When we pass a tuple of integers to the transpose() function, what happens?
To be specific, this is a 3D array: how does NumPy transform the array when I pass the tuple of axes (1, 0 ,2)? Can you explain which row or column these integers refer to? And what are axis numbers in the context of NumPy?
To transpose an array, NumPy just swaps the shape and stride information for each axis. Here are the strides:
>>> arr.strides
(64, 32, 8)
>>> arr.transpose(1, 0, 2).strides
(32, 64, 8)
Notice that the transpose operation swapped the strides for axis 0 and axis 1. The lengths of these axes were also swapped (both lengths are 2 in this example).
No data needs to be copied for this to happen; NumPy can simply change how it looks at the underlying memory to construct the new array.
Visualising strides
The stride value represents the number of bytes that must be travelled in memory in order to reach the next value of an axis of an array.
Now, our 3D array arr looks this (with labelled axes):
This array is stored in a contiguous block of memory; essentially it is one-dimensional. To interpret it as a 3D object, NumPy must jump over a certain constant number of bytes in order to move along one of the three axes:
Since each integer takes up 8 bytes of memory (we're using the int64 dtype), the stride value for each dimension is 8 times the number of values that we need to jump. For instance, to move along axis 1, four values (32 bytes) are jumped, and to move along axis 0, eight values (64 bytes) need to be jumped.
When we write arr.transpose(1, 0, 2) we are swapping axes 0 and 1. The transposed array looks like this:
All that NumPy needs to do is to swap the stride information for axis 0 and axis 1 (axis 2 is unchanged). Now we must jump further to move along axis 1 than axis 0:
This basic concept works for any permutation of an array's axes. The actual code that handles the transpose is written in C and can be found here.
As explained in the documentation:
By default, reverse the dimensions, otherwise permute the axes according to the values given.
So you can pass an optional parameter axes defining the new order of dimensions.
E.g. transposing the first two dimensions of an RGB VGA pixel array:
>>> x = np.ones((480, 640, 3))
>>> np.transpose(x, (1, 0, 2)).shape
(640, 480, 3)
In C notation, your array would be:
int arr[2][2][4]
which is an 3D array having 2 2D arrays. Each of those 2D arrays has 2 1D array, each of those 1D arrays has 4 elements.
So you have three dimensions. The axes are 0, 1, 2, with sizes 2, 2, 4. This is exactly how numpy treats the axes of an N-dimensional array.
So, arr.transpose((1, 0, 2)) would take axis 1 and put it in position 0, axis 0 and put it in position 1, and axis 2 and leave it in position 2. You are effectively permuting the axes:
0 -\/-> 0
1 -/\-> 1
2 ----> 2
In other words, 1 -> 0, 0 -> 1, 2 -> 2. The destination axes are always in order, so all you need is to specify the source axes. Read off the tuple in that order: (1, 0, 2).
In this case your new array dimensions are again [2][2][4], only because axes 0 and 1 had the same size (2).
More interesting is a transpose by (2, 1, 0) which gives you an array of [4][2][2].
0 -\ /--> 0
1 --X---> 1
2 -/ \--> 2
In other words, 2 -> 0, 1 -> 1, 0 -> 2. Read off the tuple in that order: (2, 1, 0).
>>> arr.transpose((2,1,0))
array([[[ 0, 8],
[ 4, 12]],
[[ 1, 9],
[ 5, 13]],
[[ 2, 10],
[ 6, 14]],
[[ 3, 11],
[ 7, 15]]])
You ended up with an int[4][2][2].
You'll probably get better understanding if all dimensions were of different size, so you could see where each axis went.
Why is the first inner element [0, 8]? Because if you visualize your 3D array as two sheets of paper, 0 and 8 are lined up, one on one paper and one on the other paper, both in the upper left. By transposing (2, 1, 0) you're saying that you want the direction of paper-to-paper to now march along the paper from left to right, and the direction of left to right to now go from paper to paper. You had 4 elements going from left to right, so now you have four pieces of paper instead. And you had 2 papers, so now you have 2 elements going from left to right.
Sorry for the terrible ASCII art. ¯\_(ツ)_/¯
It seems the question and the example originates from the book Python for Data Analysis by Wes McKinney. This feature of transpose is mentioned in Chapter 4.1. Transposing Arrays and Swapping Axes.
For higher dimensional arrays, transpose will accept a tuple of axis numbers to permute the axes (for extra mind bending).
Here "permute" means "rearrange", so rearranging the order of axes.
The numbers in .transpose(1, 0, 2) determines how the order of axes are changed compared to the original. By using .transpose(1, 0, 2), we mean, "Change the 1st axis with the 2nd." If we use .transpose(0, 1, 2), the array will stay the same because there is nothing to change; it is the default order.
The example in the book with a (2, 2, 4) sized array is not very clear since 1st and 2nd axes has the same size. So the end result doesn't seem to change except the reordering of rows arr[0, 1] and arr[1, 0].
If we try a different example with a 3 dimensional array with each dimension having a different size, the rearrangement part becomes more clear.
In [2]: x = np.arange(24).reshape(2, 3, 4)
In [3]: x
Out[3]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [4]: x.transpose(1, 0, 2)
Out[4]:
array([[[ 0, 1, 2, 3],
[12, 13, 14, 15]],
[[ 4, 5, 6, 7],
[16, 17, 18, 19]],
[[ 8, 9, 10, 11],
[20, 21, 22, 23]]])
Here, original array sizes are (2, 3, 4). We changed the 1st and 2nd, so it becomes (3, 2, 4) in size. If we look closer to see how the rearrangement exactly happened; arrays of numbers seems to have changed in a particular pattern. Using the paper analogy of #RobertB, if we were to take the 2 chunks of numbers, and write each one on sheets, then take one row from each sheet to construct one dimension of the array, we would now have a 3x2x4-sized array, counting from the outermost to the innermost layer.
[ 0, 1, 2, 3] \ [12, 13, 14, 15]
[ 4, 5, 6, 7] \ [16, 17, 18, 19]
[ 8, 9, 10, 11] \ [20, 21, 22, 23]
It could be a good idea to play with different sized arrays, and change different axes to gain a better intuition of how it works.
I ran across this in Python for Data Analysis by Wes McKinney as well.
I will show the simplest way of solving this for a 3-dimensional tensor, then describe the general approach that can be used for n-dimensional tensors.
Simple 3-dimensional tensor example
Suppose you have the (2,2,4)-tensor
[[[ 0 1 2 3]
[ 4 5 6 7]]
[[ 8 9 10 11]
[12 13 14 15]]]
If we look at the coordinates of each point, they are as follows:
[[[ (0,0,0) (0,0,1) (0,0,2) (0,0,3)]
[ (0,1,0) (0,1,1) (0,1,2) (0,1,3)]]
[[ (1,0,0) (1,0,1) (1,0,2) (0,0,3)]
[ (1,1,0) (1,1,1) (1,1,2) (0,1,3)]]
Now suppose that the array above is example_array and we want to perform the operation: example_array.transpose(1,2,0)
For the (1,2,0)-transformation, we shuffle the coordinates as follows (note that this particular transformation amounts to a "left-shift":
(0,0,0) -> (0,0,0)
(0,0,1) -> (0,1,0)
(0,0,2) -> (0,2,0)
(0,0,3) -> (0,3,0)
(0,1,0) -> (1,0,0)
(0,1,1) -> (1,1,0)
(0,1,2) -> (1,2,0)
(0,1,3) -> (1,3,0)
(1,0,0) -> (0,0,1)
(1,0,1) -> (0,1,1)
(1,0,2) -> (0,2,1)
(0,0,3) -> (0,3,0)
(1,1,0) -> (1,0,1)
(1,1,1) -> (1,1,1)
(1,1,2) -> (1,2,1)
(0,1,3) -> (1,3,0)
Now, for each original value, place it into the shifted coordinates in the result matrix.
For instance, the value 10 has coordinates (1, 0, 2) in the original matrix and will have coordinates (0, 2, 1) in the result matrix. It is placed into the first 2d tensor submatrix in the third row of that submatrix, in the second column of that row.
Hence, the resulting matrix is:
array([[[ 0, 8],
[ 1, 9],
[ 2, 10],
[ 3, 11]],
[[ 4, 12],
[ 5, 13],
[ 6, 14],
[ 7, 15]]])
General n-dimensional tensor approach
For n-dimensional tensors, the algorithm is the same. Consider all of the coordinates of a single value in the original matrix. Shuffle the axes for that individual coordinate. Place the value into the resulting, shuffled coordinates in the result matrix. Repeat for all of the remaining values.
To summarise a.transpose()[i,j,k] = a[k,j,i]
a = np.array( range(24), int).reshape((2,3,4))
a.shape gives (2,3,4)
a.transpose().shape gives (4,3,2) shape tuple is reversed.
when is a tuple parameter is passed axes are permuted according to the tuple.
For example
a = np.array( range(24), int).reshape((2,3,4))
a[i,j,k] equals a.transpose((2,0,1))[k,i,j]
axis 0 takes 2nd place
axis 1 takes 3rd place
axis 2 tales 1st place
of course we need to take care that values in tuple parameter passed to transpose are unique and in range(number of axis)