Numpy: Get matrices from tensor given list of indexes - python

I have a tensor with the shape (4, 3, 20). When I do X[:, 0, :].shape I get (4, 20). When I do X[:, [0,2,0,1], :].shape I get (4, 4, 20).
What I have is a list of indexes representing the second dimension of my tensor. I want to get a two-dimensional matrix like I get when I do X[:, 0, :] but I have different indexes for the second dimension instead of only one. How do I do that?

Your question is unclear, but I'll make a guess
In [58]: X=np.arange(24).reshape(4,3,2)
In [59]: X[range(4),[0,2,0,1],:]
Out[59]:
array([[ 0, 1],
[10, 11],
[12, 13],
[20, 21]])
This picks row 0 from the 1st plane; row 2 from the 2nd, etc. The result has the same shape as X[:,0,:], but values are pulled from different 1st dimension planes.
In [61]: X[:,0,:]
Out[61]:
array([[ 0, 1], # same
[ 6, 7],
[12, 13], # same
[18, 19]])

I think you are looking for np.squeeze. So, for cases when the indexing list, say L has just one element and upon indexing the input array with it would result in a 3D array with a singleton second dimension (dimension of length 1), would result in a 2D output with that squeez-ing. For L with more than one element, the indexing would result in a 3D array without any singleton dimension and thus, no change with that squeez-ing and hence the desired output. Thus, the solution with it would be -
np.squeeze(X[:,L,:])
Sample run to test out shapes on a random array -
In [25]: A = np.random.rand(4,3,20)
In [26]: L = [0]
In [27]: np.squeeze(A[:,L,:]).shape
Out[27]: (4, 20)
In [28]: L = [0,2,0,1]
In [29]: np.squeeze(A[:,L,:]).shape
Out[29]: (4, 4, 20)

Related

Is there a function for automatic broadcasting when multiplying two numpy arrays with different shapes?

I want to multiply two numpy arrays with different shapes. The result should be broadcasted in such a way that e.g. the multiplication of arrays with shape (3,) and (5,) returns an array with shape (3,5). I know that this is possible using array1[:,numpy.newaxis]*array2[numpy.newaxis,:]. But what I am looking for is something more general, a function that does also automatically multiply the arrays with shapes (3,5) and (4,) to an array with shape (3,5,4). Is there any numpy function to do this? Sure, a can write myself a function but is there any function existing?
So I am looking for a function numpy.func(array1, array2) that does return an array array3 with shape (*array1.shape, *array2.shape) and values array3[i1,j1,..,i2,j2,..] = array1[i1,j1,...]*array2[i2,j2,...].
Thanks
Take a look at numpy.multiply.outer. outer is a standard method that all the "ufuncs" have.
For example,
In [19]: a # a has shape (3, 4)
Out[19]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [20]: b # b has shape (5,)
Out[20]: array([0, 1, 2, 3, 4])
In [21]: c = np.multiply.outer(a, b)
In [22]: c.shape
Out[22]: (3, 4, 5)
In [23]: c[1, 2, 3]
Out[23]: 18
In [24]: a[1, 2]*b[3]
Out[24]: 18

reshape numpy array with a[:,None] [duplicate]

What is numpy.newaxis and when should I use it?
Using it on a 1-D array x produces:
>>> x
array([0, 1, 2, 3])
>>> x[np.newaxis, :]
array([[0, 1, 2, 3]])
>>> x[:, np.newaxis]
array([[0],
[1],
[2],
[3]])
Simply put, numpy.newaxis is used to increase the dimension of the existing array by one more dimension, when used once. Thus,
1D array will become 2D array
2D array will become 3D array
3D array will become 4D array
4D array will become 5D array
and so on..
Here is a visual illustration which depicts promotion of 1D array to 2D arrays.
Scenario-1: np.newaxis might come in handy when you want to explicitly convert a 1D array to either a row vector or a column vector, as depicted in the above picture.
Example:
# 1D array
In [7]: arr = np.arange(4)
In [8]: arr.shape
Out[8]: (4,)
# make it as row vector by inserting an axis along first dimension
In [9]: row_vec = arr[np.newaxis, :] # arr[None, :]
In [10]: row_vec.shape
Out[10]: (1, 4)
# make it as column vector by inserting an axis along second dimension
In [11]: col_vec = arr[:, np.newaxis] # arr[:, None]
In [12]: col_vec.shape
Out[12]: (4, 1)
Scenario-2: When we want to make use of numpy broadcasting as part of some operation, for instance while doing addition of some arrays.
Example:
Let's say you want to add the following two arrays:
x1 = np.array([1, 2, 3, 4, 5])
x2 = np.array([5, 4, 3])
If you try to add these just like that, NumPy will raise the following ValueError :
ValueError: operands could not be broadcast together with shapes (5,) (3,)
In this situation, you can use np.newaxis to increase the dimension of one of the arrays so that NumPy can broadcast.
In [2]: x1_new = x1[:, np.newaxis] # x1[:, None]
# now, the shape of x1_new is (5, 1)
# array([[1],
# [2],
# [3],
# [4],
# [5]])
Now, add:
In [3]: x1_new + x2
Out[3]:
array([[ 6, 5, 4],
[ 7, 6, 5],
[ 8, 7, 6],
[ 9, 8, 7],
[10, 9, 8]])
Alternatively, you can also add new axis to the array x2:
In [6]: x2_new = x2[:, np.newaxis] # x2[:, None]
In [7]: x2_new # shape is (3, 1)
Out[7]:
array([[5],
[4],
[3]])
Now, add:
In [8]: x1 + x2_new
Out[8]:
array([[ 6, 7, 8, 9, 10],
[ 5, 6, 7, 8, 9],
[ 4, 5, 6, 7, 8]])
Note: Observe that we get the same result in both cases (but one being the transpose of the other).
Scenario-3: This is similar to scenario-1. But, you can use np.newaxis more than once to promote the array to higher dimensions. Such an operation is sometimes needed for higher order arrays (i.e. Tensors).
Example:
In [124]: arr = np.arange(5*5).reshape(5,5)
In [125]: arr.shape
Out[125]: (5, 5)
# promoting 2D array to a 5D array
In [126]: arr_5D = arr[np.newaxis, ..., np.newaxis, np.newaxis] # arr[None, ..., None, None]
In [127]: arr_5D.shape
Out[127]: (1, 5, 5, 1, 1)
As an alternative, you can use numpy.expand_dims that has an intuitive axis kwarg.
# adding new axes at 1st, 4th, and last dimension of the resulting array
In [131]: newaxes = (0, 3, -1)
In [132]: arr_5D = np.expand_dims(arr, axis=newaxes)
In [133]: arr_5D.shape
Out[133]: (1, 5, 5, 1, 1)
More background on np.newaxis vs np.reshape
newaxis is also called as a pseudo-index that allows the temporary addition of an axis into a multiarray.
np.newaxis uses the slicing operator to recreate the array while numpy.reshape reshapes the array to the desired layout (assuming that the dimensions match; And this is must for a reshape to happen).
Example
In [13]: A = np.ones((3,4,5,6))
In [14]: B = np.ones((4,6))
In [15]: (A + B[:, np.newaxis, :]).shape # B[:, None, :]
Out[15]: (3, 4, 5, 6)
In the above example, we inserted a temporary axis between the first and second axes of B (to use broadcasting). A missing axis is filled-in here using np.newaxis to make the broadcasting operation work.
General Tip: You can also use None in place of np.newaxis; These are in fact the same objects.
In [13]: np.newaxis is None
Out[13]: True
P.S. Also see this great answer: newaxis vs reshape to add dimensions
What is np.newaxis?
The np.newaxis is just an alias for the Python constant None, which means that wherever you use np.newaxis you could also use None:
>>> np.newaxis is None
True
It's just more descriptive if you read code that uses np.newaxis instead of None.
How to use np.newaxis?
The np.newaxis is generally used with slicing. It indicates that you want to add an additional dimension to the array. The position of the np.newaxis represents where I want to add dimensions.
>>> import numpy as np
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a.shape
(10,)
In the first example I use all elements from the first dimension and add a second dimension:
>>> a[:, np.newaxis]
array([[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
>>> a[:, np.newaxis].shape
(10, 1)
The second example adds a dimension as first dimension and then uses all elements from the first dimension of the original array as elements in the second dimension of the result array:
>>> a[np.newaxis, :] # The output has 2 [] pairs!
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>>> a[np.newaxis, :].shape
(1, 10)
Similarly you can use multiple np.newaxis to add multiple dimensions:
>>> a[np.newaxis, :, np.newaxis] # note the 3 [] pairs in the output
array([[[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]]])
>>> a[np.newaxis, :, np.newaxis].shape
(1, 10, 1)
Are there alternatives to np.newaxis?
There is another very similar functionality in NumPy: np.expand_dims, which can also be used to insert one dimension:
>>> np.expand_dims(a, 1) # like a[:, np.newaxis]
>>> np.expand_dims(a, 0) # like a[np.newaxis, :]
But given that it just inserts 1s in the shape you could also reshape the array to add these dimensions:
>>> a.reshape(a.shape + (1,)) # like a[:, np.newaxis]
>>> a.reshape((1,) + a.shape) # like a[np.newaxis, :]
Most of the times np.newaxis is the easiest way to add dimensions, but it's good to know the alternatives.
When to use np.newaxis?
In several contexts is adding dimensions useful:
If the data should have a specified number of dimensions. For example if you want to use matplotlib.pyplot.imshow to display a 1D array.
If you want NumPy to broadcast arrays. By adding a dimension you could for example get the difference between all elements of one array: a - a[:, np.newaxis]. This works because NumPy operations broadcast starting with the last dimension 1.
To add a necessary dimension so that NumPy can broadcast arrays. This works because each length-1 dimension is simply broadcast to the length of the corresponding1 dimension of the other array.
1 If you want to read more about the broadcasting rules the NumPy documentation on that subject is very good. It also includes an example with np.newaxis:
>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[ 1., 2., 3.],
[ 11., 12., 13.],
[ 21., 22., 23.],
[ 31., 32., 33.]])
You started with a one-dimensional list of numbers. Once you used numpy.newaxis, you turned it into a two-dimensional matrix, consisting of four rows of one column each.
You could then use that matrix for matrix multiplication, or involve it in the construction of a larger 4 x n matrix.
newaxis object in the selection tuple serves to expand the dimensions of the resulting selection by one unit-length dimension.
It is not just conversion of row matrix to column matrix.
Consider the example below:
In [1]:x1 = np.arange(1,10).reshape(3,3)
print(x1)
Out[1]: array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Now lets add new dimension to our data,
In [2]:x1_new = x1[:,np.newaxis]
print(x1_new)
Out[2]:array([[[1, 2, 3]],
[[4, 5, 6]],
[[7, 8, 9]]])
You can see that newaxis added the extra dimension here, x1 had dimension (3,3) and X1_new has dimension (3,1,3).
How our new dimension enables us to different operations:
In [3]:x2 = np.arange(11,20).reshape(3,3)
print(x2)
Out[3]:array([[11, 12, 13],
[14, 15, 16],
[17, 18, 19]])
Adding x1_new and x2, we get:
In [4]:x1_new+x2
Out[4]:array([[[12, 14, 16],
[15, 17, 19],
[18, 20, 22]],
[[15, 17, 19],
[18, 20, 22],
[21, 23, 25]],
[[18, 20, 22],
[21, 23, 25],
[24, 26, 28]]])
Thus, newaxis is not just conversion of row to column matrix. It increases the dimension of matrix, thus enabling us to do more operations on it.

Numpy: index array by array

>>> idx = np.random.randint(2, size=(9, 31))
>>> a = np.random.random((9, 31, 2))
>>> a[idx].shape
(9, 31, 31, 2)
Why is the above not resulting in at least a shape of (9, 31, 1), or even better (9, 31)? How can I get it to return a selection based on the values in idx?
Update
This is perhaps a more concrete and hopefully analogue example: Assume this array
a = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
How would I go about selection the array [1, 4, 5, 8] (i.e. the 0th, 1st, 0th, 1st element of each row)?
I think this is what you want:
>>> a[np.arange(9)[:, None], np.arange(31), idx].shape
(9, 31)
For your second example you would do:
>>> a[np.arange(4), [0, 1, 0, 1]]
array([1, 4, 5, 8])
Read the docs on fancy indexing, especially the part on what happens when you don't have an index array for each dimension here: those extra np.arange arrays are placed there to avoid that behavior.
Note also how they are reshaped (indexing with [:, None] is equivalent to .reshape(-1, 1)) so that their broadcast shape has the shape of the desired output array.
You're doing advanced indexing on the ndarray http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing.
Advanced indexes always are broadcast and iterated as one:
This is triggered because in your case the number of elements in the ndarray-index is not equal to the number of dimensions in the ndarray you are indexing into. Effectively you're producing an outer-product of slices: each element in your index produces a slice of the indexed array and not an element.
UPDATE:
>>> map(lambda idx: a[idx[0],idx[1]], [[0,0], [1,1], [2,0], [3,1]])
This will return:
[1, 4, 5, 8]

How do I use np.newaxis?

What is numpy.newaxis and when should I use it?
Using it on a 1-D array x produces:
>>> x
array([0, 1, 2, 3])
>>> x[np.newaxis, :]
array([[0, 1, 2, 3]])
>>> x[:, np.newaxis]
array([[0],
[1],
[2],
[3]])
Simply put, numpy.newaxis is used to increase the dimension of the existing array by one more dimension, when used once. Thus,
1D array will become 2D array
2D array will become 3D array
3D array will become 4D array
4D array will become 5D array
and so on..
Here is a visual illustration which depicts promotion of 1D array to 2D arrays.
Scenario-1: np.newaxis might come in handy when you want to explicitly convert a 1D array to either a row vector or a column vector, as depicted in the above picture.
Example:
# 1D array
In [7]: arr = np.arange(4)
In [8]: arr.shape
Out[8]: (4,)
# make it as row vector by inserting an axis along first dimension
In [9]: row_vec = arr[np.newaxis, :] # arr[None, :]
In [10]: row_vec.shape
Out[10]: (1, 4)
# make it as column vector by inserting an axis along second dimension
In [11]: col_vec = arr[:, np.newaxis] # arr[:, None]
In [12]: col_vec.shape
Out[12]: (4, 1)
Scenario-2: When we want to make use of numpy broadcasting as part of some operation, for instance while doing addition of some arrays.
Example:
Let's say you want to add the following two arrays:
x1 = np.array([1, 2, 3, 4, 5])
x2 = np.array([5, 4, 3])
If you try to add these just like that, NumPy will raise the following ValueError :
ValueError: operands could not be broadcast together with shapes (5,) (3,)
In this situation, you can use np.newaxis to increase the dimension of one of the arrays so that NumPy can broadcast.
In [2]: x1_new = x1[:, np.newaxis] # x1[:, None]
# now, the shape of x1_new is (5, 1)
# array([[1],
# [2],
# [3],
# [4],
# [5]])
Now, add:
In [3]: x1_new + x2
Out[3]:
array([[ 6, 5, 4],
[ 7, 6, 5],
[ 8, 7, 6],
[ 9, 8, 7],
[10, 9, 8]])
Alternatively, you can also add new axis to the array x2:
In [6]: x2_new = x2[:, np.newaxis] # x2[:, None]
In [7]: x2_new # shape is (3, 1)
Out[7]:
array([[5],
[4],
[3]])
Now, add:
In [8]: x1 + x2_new
Out[8]:
array([[ 6, 7, 8, 9, 10],
[ 5, 6, 7, 8, 9],
[ 4, 5, 6, 7, 8]])
Note: Observe that we get the same result in both cases (but one being the transpose of the other).
Scenario-3: This is similar to scenario-1. But, you can use np.newaxis more than once to promote the array to higher dimensions. Such an operation is sometimes needed for higher order arrays (i.e. Tensors).
Example:
In [124]: arr = np.arange(5*5).reshape(5,5)
In [125]: arr.shape
Out[125]: (5, 5)
# promoting 2D array to a 5D array
In [126]: arr_5D = arr[np.newaxis, ..., np.newaxis, np.newaxis] # arr[None, ..., None, None]
In [127]: arr_5D.shape
Out[127]: (1, 5, 5, 1, 1)
As an alternative, you can use numpy.expand_dims that has an intuitive axis kwarg.
# adding new axes at 1st, 4th, and last dimension of the resulting array
In [131]: newaxes = (0, 3, -1)
In [132]: arr_5D = np.expand_dims(arr, axis=newaxes)
In [133]: arr_5D.shape
Out[133]: (1, 5, 5, 1, 1)
More background on np.newaxis vs np.reshape
newaxis is also called as a pseudo-index that allows the temporary addition of an axis into a multiarray.
np.newaxis uses the slicing operator to recreate the array while numpy.reshape reshapes the array to the desired layout (assuming that the dimensions match; And this is must for a reshape to happen).
Example
In [13]: A = np.ones((3,4,5,6))
In [14]: B = np.ones((4,6))
In [15]: (A + B[:, np.newaxis, :]).shape # B[:, None, :]
Out[15]: (3, 4, 5, 6)
In the above example, we inserted a temporary axis between the first and second axes of B (to use broadcasting). A missing axis is filled-in here using np.newaxis to make the broadcasting operation work.
General Tip: You can also use None in place of np.newaxis; These are in fact the same objects.
In [13]: np.newaxis is None
Out[13]: True
P.S. Also see this great answer: newaxis vs reshape to add dimensions
What is np.newaxis?
The np.newaxis is just an alias for the Python constant None, which means that wherever you use np.newaxis you could also use None:
>>> np.newaxis is None
True
It's just more descriptive if you read code that uses np.newaxis instead of None.
How to use np.newaxis?
The np.newaxis is generally used with slicing. It indicates that you want to add an additional dimension to the array. The position of the np.newaxis represents where I want to add dimensions.
>>> import numpy as np
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a.shape
(10,)
In the first example I use all elements from the first dimension and add a second dimension:
>>> a[:, np.newaxis]
array([[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
>>> a[:, np.newaxis].shape
(10, 1)
The second example adds a dimension as first dimension and then uses all elements from the first dimension of the original array as elements in the second dimension of the result array:
>>> a[np.newaxis, :] # The output has 2 [] pairs!
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>>> a[np.newaxis, :].shape
(1, 10)
Similarly you can use multiple np.newaxis to add multiple dimensions:
>>> a[np.newaxis, :, np.newaxis] # note the 3 [] pairs in the output
array([[[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]]])
>>> a[np.newaxis, :, np.newaxis].shape
(1, 10, 1)
Are there alternatives to np.newaxis?
There is another very similar functionality in NumPy: np.expand_dims, which can also be used to insert one dimension:
>>> np.expand_dims(a, 1) # like a[:, np.newaxis]
>>> np.expand_dims(a, 0) # like a[np.newaxis, :]
But given that it just inserts 1s in the shape you could also reshape the array to add these dimensions:
>>> a.reshape(a.shape + (1,)) # like a[:, np.newaxis]
>>> a.reshape((1,) + a.shape) # like a[np.newaxis, :]
Most of the times np.newaxis is the easiest way to add dimensions, but it's good to know the alternatives.
When to use np.newaxis?
In several contexts is adding dimensions useful:
If the data should have a specified number of dimensions. For example if you want to use matplotlib.pyplot.imshow to display a 1D array.
If you want NumPy to broadcast arrays. By adding a dimension you could for example get the difference between all elements of one array: a - a[:, np.newaxis]. This works because NumPy operations broadcast starting with the last dimension 1.
To add a necessary dimension so that NumPy can broadcast arrays. This works because each length-1 dimension is simply broadcast to the length of the corresponding1 dimension of the other array.
1 If you want to read more about the broadcasting rules the NumPy documentation on that subject is very good. It also includes an example with np.newaxis:
>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[ 1., 2., 3.],
[ 11., 12., 13.],
[ 21., 22., 23.],
[ 31., 32., 33.]])
You started with a one-dimensional list of numbers. Once you used numpy.newaxis, you turned it into a two-dimensional matrix, consisting of four rows of one column each.
You could then use that matrix for matrix multiplication, or involve it in the construction of a larger 4 x n matrix.
newaxis object in the selection tuple serves to expand the dimensions of the resulting selection by one unit-length dimension.
It is not just conversion of row matrix to column matrix.
Consider the example below:
In [1]:x1 = np.arange(1,10).reshape(3,3)
print(x1)
Out[1]: array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Now lets add new dimension to our data,
In [2]:x1_new = x1[:,np.newaxis]
print(x1_new)
Out[2]:array([[[1, 2, 3]],
[[4, 5, 6]],
[[7, 8, 9]]])
You can see that newaxis added the extra dimension here, x1 had dimension (3,3) and X1_new has dimension (3,1,3).
How our new dimension enables us to different operations:
In [3]:x2 = np.arange(11,20).reshape(3,3)
print(x2)
Out[3]:array([[11, 12, 13],
[14, 15, 16],
[17, 18, 19]])
Adding x1_new and x2, we get:
In [4]:x1_new+x2
Out[4]:array([[[12, 14, 16],
[15, 17, 19],
[18, 20, 22]],
[[15, 17, 19],
[18, 20, 22],
[21, 23, 25]],
[[18, 20, 22],
[21, 23, 25],
[24, 26, 28]]])
Thus, newaxis is not just conversion of row to column matrix. It increases the dimension of matrix, thus enabling us to do more operations on it.

How does the axis parameter from NumPy work?

Can someone explain exactly what the axis parameter in NumPy does?
I am terribly confused.
I'm trying to use the function myArray.sum(axis=num)
At first I thought if the array is itself 3 dimensions, axis=0 will return three elements, consisting of the sum of all nested items in that same position. If each dimension contained five dimensions, I expected axis=1 to return a result of five items, and so on.
However this is not the case, and the documentation does not do a good job helping me out (they use a 3x3x3 array so it's hard to tell what's happening)
Here's what I did:
>>> e
array([[[1, 0],
[0, 0]],
[[1, 1],
[1, 0]],
[[1, 0],
[0, 1]]])
>>> e.sum(axis = 0)
array([[3, 1],
[1, 1]])
>>> e.sum(axis=1)
array([[1, 0],
[2, 1],
[1, 1]])
>>> e.sum(axis=2)
array([[1, 0],
[2, 1],
[1, 1]])
>>>
Clearly the result is not intuitive.
Clearly,
e.shape == (3, 2, 2)
Sum over an axis is a reduction operation so the specified axis disappears. Hence,
e.sum(axis=0).shape == (2, 2)
e.sum(axis=1).shape == (3, 2)
e.sum(axis=2).shape == (3, 2)
Intuitively, we are "squashing" the array along the chosen axis, and summing the numbers that get squashed together.
To understand the axis intuitively, refer the picture below (source: Physics Dept, Cornell Uni)
The shape of the (boolean) array in the above figure is shape=(8, 3). ndarray.shape will return a tuple where the entries correspond to the length of the particular dimension. In our example, 8 corresponds to length of axis 0 whereas 3 corresponds to length of axis 1.
If someone need this visual description:
There are good answers for visualization however it might help to think purely from analytical perspective.
You can create array of arbitrary dimension with numpy.
For example, here's a 5-dimension array:
>>> a = np.random.rand(2, 3, 4, 5, 6)
>>> a.shape
(2, 3, 4, 5, 6)
You can access any element of this array by specifying indices. For example, here's the first element of this array:
>>> a[0, 0, 0, 0, 0]
0.0038908603263844155
Now if you take out one of the dimensions, you get number of elements in that dimension:
>>> a[0, 0, :, 0, 0]
array([0.00389086, 0.27394775, 0.26565889, 0.62125279])
When you apply a function like sum with axis parameter, that dimension gets eliminated and array of dimension less than original gets created. For each cell in new array, the operator will get list of elements and apply the reduction function to get a scaler.
>>> np.sum(a, axis=2).shape
(2, 3, 5, 6)
Now you can check that the first element of this array is sum of above elements:
>>> np.sum(a, axis=2)[0, 0, 0, 0]
1.1647502999560164
>>> a[0, 0, :, 0, 0].sum()
1.1647502999560164
The axis=None has special meaning to flatten out the array and apply function on all numbers.
Now you can think about more complex cases where axis is not just number but a tuple:
>>> np.sum(a, axis=(2,3)).shape
(2, 3, 6)
Note that we use same technique to figure out how this reduction was done:
>>> np.sum(a, axis=(2,3))[0,0,0]
7.889432081931909
>>> a[0, 0, :, :, 0].sum()
7.88943208193191
You can also use same reasoning for adding dimension in array instead of reducing dimension:
>>> x = np.random.rand(3, 4)
>>> y = np.random.rand(3, 4)
# New dimension is created on specified axis
>>> np.stack([x, y], axis=2).shape
(3, 4, 2)
>>> np.stack([x, y], axis=0).shape
(2, 3, 4)
# To retrieve item i in stack set i in that axis
Hope this gives you generic and full understanding of this important parameter.
Some answers are too specific or do not address the main source of confusion. This answer attempts to provide a more general but simple explanation of the concept, with a simple example.
The main source of confusion is related to expressions such as "Axis along which the means are computed", which is the documentation of the argument axis of the numpy.mean function. What the heck does "along which" even mean here? "Along which" essentially means that you will sum the rows (and divide by the number of rows, given that we are computing the mean), if the axis is 0, and the columns, if the axis is 1. In the case of axis is 0 (or 1), the rows can be scalars or vectors or even other multi-dimensional arrays.
In [1]: import numpy as np
In [2]: a=np.array([[1, 2], [3, 4]])
In [3]: a
Out[3]:
array([[1, 2],
[3, 4]])
In [4]: np.mean(a, axis=0)
Out[4]: array([2., 3.])
In [5]: np.mean(a, axis=1)
Out[5]: array([1.5, 3.5])
So, in the example above, np.mean(a, axis=0) returns array([2., 3.]) because (1 + 3)/2 = 2 and (2 + 4)/2 = 3. It returns an array of two numbers because it returns the mean of the rows for each column (and there are two columns).
Both 1st and 2nd reply is great for understanding ndarray concept in numpy. I am giving a simple example.
And according to this image by #debaonline4u
https://i.stack.imgur.com/O5hBF.jpg
Suppose , you have an 2D array -
[1, 2, 3]
[4, 5, 6]
In, numpy format it will be -
c = np.array([[1, 2, 3],
[4, 5, 6]])
Now,
c.ndim = 2 (rows/axis=0)
c.shape = (2,3) (axis0, axis1)
c.sum(axis=0) = [1+4, 2+5, 3+6] = [5, 7, 9] (sum of the 1st elements of each rows, so along axis0)
c.sum(axis=1) = [1+2+3, 4+5+6] = [6, 15] (sum of the elements in a row, so along axis1)
So for your 3D array,

Categories