What's the difference between shape(150,) and shape (150,1)? - python

What's the difference between shape(150,) and shape (150,1)?
I think they are the same, I mean they both represent a column vector.

Both hold the same values, but one is a 1D array (a vector) and the other is a 2D array (a matrix with a single column). Here's an example:
import numpy as np
x = np.array([1, 2, 3, 4, 5])
y = np.array([[1], [2], [3], [4], [5]])
print(x.shape)
print(y.shape)
And the output is:
(5,)
(5, 1)

Although they both occupy the same space and positions in memory, about this statement:
"I think they are the same, I mean they both represent a column vector."
No, they are not, and certainly not according to NumPy (ndarrays).
The main difference is that
shape (150,) is a 1D array, whereas
shape (150,1) is a 2D array.
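A minimal sketch of how that 1D/2D distinction shows up in practice. The variable names `a` and `b` are illustrative only and do not come from the question:

```python
import numpy as np

a = np.arange(150)           # shape (150,): a single axis
b = a.reshape(150, 1)        # shape (150, 1): two axes, the second of length 1

print(a.ndim, b.ndim)        # 1 2
print(a[3], b[3])            # 3 [3]  (one index into a 2D array returns a row)
print(a.T.shape, b.T.shape)  # (150,) (1, 150)  (transposing a 1D array changes nothing)
```

So even though the values are identical, indexing and transposition already behave differently.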

Questions like this seem to come from two misconceptions:
- not realizing that (5,) is a 1-element tuple
- expecting MATLAB-like matrices
Make an array with the handy arange function:
In [424]: x = np.arange(5)
In [425]: x.shape
Out[425]: (5,) # 1 element tuple
In [426]: x.ndim
Out[426]: 1
numpy does not automatically make 2d arrays (matrices). It does not follow MATLAB in that regard.
We can reshape that array, adding a 2nd dimension. The result is a view (sooner or later you need to learn what that means):
In [427]: y = x.reshape(5,1)
In [428]: y.shape
Out[428]: (5, 1)
In [429]: y.ndim
Out[429]: 2
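As a hedged aside on what "view" means here: the reshaped array shares its data buffer with the original, so writing through one is visible in the other. The names `base`, `view` and `independent` are made up for this sketch so they do not clash with x and y in the session:

```python
import numpy as np

base = np.arange(5)
view = base.reshape(5, 1)            # same data, different shape

print(np.shares_memory(base, view))  # True
view[0, 0] = 99                      # write through the view...
print(base.tolist())                 # [99, 1, 2, 3, 4]  ...and base changes too

independent = base.reshape(5, 1).copy()  # an explicit copy is decoupled
independent[1, 0] = -1
print(base.tolist())                 # [99, 1, 2, 3, 4]
```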
These 2 arrays display very differently. The numbers are the same, but the layout and the number of brackets reflect the respective shapes:
In [430]: x
Out[430]: array([0, 1, 2, 3, 4])
In [431]: y
Out[431]:
array([[0],
[1],
[2],
[3],
[4]])
The shape difference may seem academic - until you try to do math with the arrays:
In [432]: x+x
Out[432]: array([0, 2, 4, 6, 8]) # element wise sum
In [433]: x+y
Out[433]:
array([[0, 1, 2, 3, 4],
[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[3, 4, 5, 6, 7],
[4, 5, 6, 7, 8]])
How did that end up producing a (5,5) array? Broadcasting a (5,) array with a (5,1) array!
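A short sketch of why the result is (5,5), assuming the same x and y as in the session above. Shapes are aligned from the right, so (5,) is treated as (1,5), and every length-1 axis is then stretched to match the other operand:

```python
import numpy as np

x = np.arange(5)        # shape (5,)
y = x.reshape(5, 1)     # shape (5, 1)

# Shape that broadcasting x against y produces, without doing any arithmetic
print(np.broadcast(x, y).shape)   # (5, 5)

# The sum is an "outer" addition: result[i, j] = y[i, 0] + x[j]
result = x + y
print(result[3, 4])               # 7
```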

Related

reshape numpy array with a[:,None] [duplicate]

What is numpy.newaxis and when should I use it?
Using it on a 1-D array x produces:
>>> x
array([0, 1, 2, 3])
>>> x[np.newaxis, :]
array([[0, 1, 2, 3]])
>>> x[:, np.newaxis]
array([[0],
[1],
[2],
[3]])
Simply put, numpy.newaxis is used to increase the dimension of the existing array by one more dimension, when used once. Thus,
1D array will become 2D array
2D array will become 3D array
3D array will become 4D array
4D array will become 5D array
and so on..
Here is a visual illustration which depicts promotion of 1D array to 2D arrays.
Scenario-1: np.newaxis might come in handy when you want to explicitly convert a 1D array to either a row vector or a column vector, as depicted in the above picture.
Example:
# 1D array
In [7]: arr = np.arange(4)
In [8]: arr.shape
Out[8]: (4,)
# make it a row vector by inserting an axis along the first dimension
In [9]: row_vec = arr[np.newaxis, :] # arr[None, :]
In [10]: row_vec.shape
Out[10]: (1, 4)
# make it a column vector by inserting an axis along the second dimension
In [11]: col_vec = arr[:, np.newaxis] # arr[:, None]
In [12]: col_vec.shape
Out[12]: (4, 1)
Scenario-2: When we want to make use of numpy broadcasting as part of some operation, for instance while doing addition of some arrays.
Example:
Let's say you want to add the following two arrays:
x1 = np.array([1, 2, 3, 4, 5])
x2 = np.array([5, 4, 3])
If you try to add these just like that, NumPy will raise the following ValueError :
ValueError: operands could not be broadcast together with shapes (5,) (3,)
In this situation, you can use np.newaxis to increase the dimension of one of the arrays so that NumPy can broadcast.
In [2]: x1_new = x1[:, np.newaxis] # x1[:, None]
# now, the shape of x1_new is (5, 1)
# array([[1],
# [2],
# [3],
# [4],
# [5]])
Now, add:
In [3]: x1_new + x2
Out[3]:
array([[ 6, 5, 4],
[ 7, 6, 5],
[ 8, 7, 6],
[ 9, 8, 7],
[10, 9, 8]])
Alternatively, you can also add new axis to the array x2:
In [6]: x2_new = x2[:, np.newaxis] # x2[:, None]
In [7]: x2_new # shape is (3, 1)
Out[7]:
array([[5],
[4],
[3]])
Now, add:
In [8]: x1 + x2_new
Out[8]:
array([[ 6, 7, 8, 9, 10],
[ 5, 6, 7, 8, 9],
[ 4, 5, 6, 7, 8]])
Note: Observe that we get the same result in both cases (but one being the transpose of the other).
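The transpose relationship mentioned in the note can be checked directly. This is only a small verification sketch; `first` and `second` are names introduced here:

```python
import numpy as np

x1 = np.array([1, 2, 3, 4, 5])
x2 = np.array([5, 4, 3])

first = x1[:, np.newaxis] + x2     # (5, 1) + (3,) -> (5, 3)
second = x1 + x2[:, np.newaxis]    # (5,) + (3, 1) -> (3, 5)

print(first.shape, second.shape)        # (5, 3) (3, 5)
print(np.array_equal(first.T, second))  # True
```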
Scenario-3: This is similar to scenario-1. But, you can use np.newaxis more than once to promote the array to higher dimensions. Such an operation is sometimes needed for higher order arrays (i.e. Tensors).
Example:
In [124]: arr = np.arange(5*5).reshape(5,5)
In [125]: arr.shape
Out[125]: (5, 5)
# promoting 2D array to a 5D array
In [126]: arr_5D = arr[np.newaxis, ..., np.newaxis, np.newaxis] # arr[None, ..., None, None]
In [127]: arr_5D.shape
Out[127]: (1, 5, 5, 1, 1)
As an alternative, you can use numpy.expand_dims, which has an intuitive axis kwarg (passing a tuple of axes, as below, needs NumPy 1.18 or later).
# adding new axes at 1st, 4th, and last dimension of the resulting array
In [131]: newaxes = (0, 3, -1)
In [132]: arr_5D = np.expand_dims(arr, axis=newaxes)
In [133]: arr_5D.shape
Out[133]: (1, 5, 5, 1, 1)
More background on np.newaxis vs np.reshape
newaxis is also called a pseudo-index: it allows the temporary addition of an axis into a multi-dimensional array.
np.newaxis is used inside an indexing expression to insert a length-1 axis, while numpy.reshape reshapes the array to the desired layout (assuming that the total number of elements matches, which is a must for a reshape to happen).
Example
In [13]: A = np.ones((3,4,5,6))
In [14]: B = np.ones((4,6))
In [15]: (A + B[:, np.newaxis, :]).shape # B[:, None, :]
Out[15]: (3, 4, 5, 6)
In the above example, we inserted a temporary axis between the first and second axes of B (to use broadcasting). A missing axis is filled-in here using np.newaxis to make the broadcasting operation work.
General Tip: You can also use None in place of np.newaxis; they are in fact the same object.
In [13]: np.newaxis is None
Out[13]: True
P.S. Also see this great answer: newaxis vs reshape to add dimensions
What is np.newaxis?
The np.newaxis is just an alias for the Python constant None, which means that wherever you use np.newaxis you could also use None:
>>> np.newaxis is None
True
It's just more descriptive if you read code that uses np.newaxis instead of None.
How to use np.newaxis?
The np.newaxis is generally used with slicing. It indicates that you want to add an additional dimension to the array. The position of the np.newaxis determines where the new dimension is added.
>>> import numpy as np
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a.shape
(10,)
In the first example I use all elements from the first dimension and add a second dimension:
>>> a[:, np.newaxis]
array([[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
>>> a[:, np.newaxis].shape
(10, 1)
The second example adds a dimension as first dimension and then uses all elements from the first dimension of the original array as elements in the second dimension of the result array:
>>> a[np.newaxis, :] # The output has 2 [] pairs!
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>>> a[np.newaxis, :].shape
(1, 10)
Similarly you can use multiple np.newaxis to add multiple dimensions:
>>> a[np.newaxis, :, np.newaxis] # note the 3 [] pairs in the output
array([[[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]]])
>>> a[np.newaxis, :, np.newaxis].shape
(1, 10, 1)
Are there alternatives to np.newaxis?
There is another very similar functionality in NumPy: np.expand_dims, which can also be used to insert one dimension:
>>> np.expand_dims(a, 1) # like a[:, np.newaxis]
>>> np.expand_dims(a, 0) # like a[np.newaxis, :]
But given that it just inserts 1s in the shape you could also reshape the array to add these dimensions:
>>> a.reshape(a.shape + (1,)) # like a[:, np.newaxis]
>>> a.reshape((1,) + a.shape) # like a[np.newaxis, :]
Most of the time np.newaxis is the easiest way to add dimensions, but it's good to know the alternatives.
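As a quick check that these alternatives really are interchangeable for adding a trailing axis, here is a small sketch. The `candidates` dictionary and its keys are just labels chosen for this example:

```python
import numpy as np

a = np.arange(10)

candidates = {
    "newaxis": a[:, np.newaxis],
    "expand_dims": np.expand_dims(a, 1),
    "reshape": a.reshape(a.shape + (1,)),
    "reshape(-1, 1)": a.reshape(-1, 1),   # -1 lets NumPy infer the length
}

for label, arr in candidates.items():
    print(label, arr.shape)   # every line ends with (10, 1)
```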
When to use np.newaxis?
Adding dimensions is useful in several contexts:
- If the data should have a specified number of dimensions. For example, if you want to use matplotlib.pyplot.imshow to display a 1D array.
- If you want NumPy to broadcast arrays. By adding a dimension you could, for example, get the difference between all elements of one array: a - a[:, np.newaxis]. This works because NumPy operations broadcast starting with the last dimension [1].
- To add a necessary dimension so that NumPy can broadcast arrays. This works because each length-1 dimension is simply broadcast to the length of the corresponding [1] dimension of the other array.
[1] If you want to read more about the broadcasting rules, the NumPy documentation on that subject is very good. It also includes an example with np.newaxis:
>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[ 1., 2., 3.],
[ 11., 12., 13.],
[ 21., 22., 23.],
[ 31., 32., 33.]])
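The pairwise-difference idiom a - a[:, np.newaxis] mentioned in the list above can be sketched on a tiny array. The values here are chosen only to make the result easy to read:

```python
import numpy as np

a = np.array([1, 4, 9])

# (3,) minus (3, 1) broadcasts to (3, 3): diff[i, j] = a[j] - a[i]
diff = a - a[:, np.newaxis]
print(diff.tolist())
# [[0, 3, 8], [-3, 0, 5], [-8, -5, 0]]
```

Row i holds the differences between every element and a[i], so the diagonal is zero.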
You started with a one-dimensional list of numbers. Once you used numpy.newaxis, you turned it into a two-dimensional matrix, consisting of four rows of one column each.
You could then use that matrix for matrix multiplication, or involve it in the construction of a larger 4 x n matrix.
The newaxis object in the selection tuple serves to expand the dimensions of the resulting selection by one unit-length dimension.
It is not just a conversion of a row matrix to a column matrix.
Consider the example below:
In [1]:x1 = np.arange(1,10).reshape(3,3)
print(x1)
Out[1]: array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Now let's add a new dimension to our data:
In [2]:x1_new = x1[:,np.newaxis]
print(x1_new)
Out[2]:array([[[1, 2, 3]],
[[4, 5, 6]],
[[7, 8, 9]]])
You can see that newaxis added the extra dimension here: x1 had shape (3,3) and x1_new has shape (3,1,3).
Here is how the new dimension enables different operations:
In [3]:x2 = np.arange(11,20).reshape(3,3)
print(x2)
Out[3]:array([[11, 12, 13],
[14, 15, 16],
[17, 18, 19]])
Adding x1_new and x2, we get:
In [4]:x1_new+x2
Out[4]:array([[[12, 14, 16],
[15, 17, 19],
[18, 20, 22]],
[[15, 17, 19],
[18, 20, 22],
[21, 23, 25]],
[[18, 20, 22],
[21, 23, 25],
[24, 26, 28]]])
Thus, newaxis is not just a conversion of a row matrix to a column matrix. It increases the number of dimensions of the array, which enables more operations on it.
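To make the (3,1,3) + (3,3) addition less mysterious, here is a sketch that spells out which elements get combined. The name `total` is introduced for this example:

```python
import numpy as np

x1 = np.arange(1, 10).reshape(3, 3)
x2 = np.arange(11, 20).reshape(3, 3)

# (3, 1, 3) + (3, 3): x2 is treated as (1, 3, 3), giving a (3, 3, 3) result
total = x1[:, np.newaxis] + x2
print(total.shape)            # (3, 3, 3)

# total[i, j, k] == x1[i, k] + x2[j, k], i.e. total[i, j] is row i of x1 plus row j of x2
print(total[2, 0].tolist())   # [18, 20, 22]
```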

Very Basic Numpy array dimension visualization

I'm a beginner to numpy with no experience in matrices. I understand basic 1d and 2d arrays but I'm having trouble visualizing a 3d numpy array like the one below. How do the following python lists form a 3d array with height, length and width? Which are the rows and columns?
b = np.array([[[1, 2, 3],[4, 5, 6]],
[[7, 8, 9],[10, 11, 12]]])
The anatomy of an ndarray in NumPy looks like this red cube below: (source: Physics Dept, Cornell Uni)
Once you leave the 2D space and enter 3D or higher dimensional spaces, the concept of rows and columns doesn't make much sense anymore. But still you can intuitively understand 3D arrays. For instance, considering your example:
In [41]: b
Out[41]:
array([[[ 1, 2, 3],
[ 4, 5, 6]],
[[ 7, 8, 9],
[10, 11, 12]]])
In [42]: b.shape
Out[42]: (2, 2, 3)
Here the shape of b is (2, 2, 3). You can think of it as two (2x3) matrices stacked to form a 3D array. To access the first matrix, index into the array with b[0]; to access the second matrix, use b[1].
# gives you the 2D array (i.e. matrix) at position `0`
In [43]: b[0]
Out[43]:
array([[1, 2, 3],
[4, 5, 6]])
# gives you the 2D array (i.e. matrix) at position 1
In [44]: b[1]
Out[44]:
array([[ 7, 8, 9],
[10, 11, 12]])
However, if you enter 4D space or higher, it becomes very hard to make sense of the arrays themselves, since we humans have a hard time visualizing four or more dimensions. So, one would rather just consider the ndarray.shape attribute and work with it.
More information about how we build higher dimensional arrays using (nested) lists:
For 1D arrays, the array constructor needs a sequence (tuple, list, etc) but conventionally list is used.
In [51]: oneD = np.array([1, 2, 3,])
In [52]: oneD.shape
Out[52]: (3,)
For 2D arrays, it's list of lists but can also be tuple of lists or tuple of tuples etc:
In [53]: twoD = np.array([[1, 2, 3], [4, 5, 6]])
In [54]: twoD.shape
Out[54]: (2, 3)
For 3D arrays, it's list of lists of lists:
In [55]: threeD = np.array([[[1, 2, 3], [2, 3, 4]], [[5, 6, 7], [6, 7, 8]]])
In [56]: threeD.shape
Out[56]: (2, 2, 3)
P.S. Internally, the ndarray is stored in a memory block as shown in the below picture. (source: Enthought)
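A rough sketch of that flat memory layout, assuming the default C (row-major) order that np.array uses for nested lists. The helper name `steps` is introduced here and simply expresses the strides in elements instead of bytes:

```python
import numpy as np

b = np.array([[[1, 2, 3], [4, 5, 6]],
              [[7, 8, 9], [10, 11, 12]]])

# The elements sit in one contiguous run; the shape only says how to step through it
print(b.ravel().tolist())   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

# Strides converted from bytes to element counts
steps = tuple(s // b.itemsize for s in b.strides)
print(steps)                # (6, 3, 1)

# b[i, j, k] lives at flat position i*6 + j*3 + k
i, j, k = 1, 0, 2
print(b[i, j, k], b.ravel()[i * 6 + j * 3 + k])   # 9 9
```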

numpy: how to construct a matrix of vectors from vector of matrix

I'm new to numpy.
So, with numpy, is it possible to use a vector of matrices to get a matrix of vectors?
for example:
matrix1(
[
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]
])
matrix2(
[
[2, 4, 6],
[2, 4, 6],
[2, 4, 6]
])
-->
matrix(
[
[array('1 2'), array('2 4'), array('3 6')],
[array('1 2'), array('2 4'), array('3 6')],
[array('1 2'), array('2 4'), array('3 6')]
])
I'm new to numpy, so I'm not sure whether it is allowed to put anything in a numpy matrix or just numbers.
And it's not easy to get an answer from Google with descriptions like "matrix of vectors and vectors of matrix".
numpy doesn't have a concept of "vector" separate from "matrix." It does have distinct concepts of "matrix" and "array," but most people avoid the matrix representation entirely. If you use arrays, the concepts of "vector," "matrix," and "tensor" are all subsumed under the general concept of an array's "shape" attribute.
In this worldview, vectors and matrices are both 2-dimensional arrays, distinguished only by their shape. Row vectors are arrays with the shape (1, n), while column vectors are arrays with the shape (n, 1). Matrices are arrays with the shape (n, m). 1-dimensional arrays can behave like vectors sometimes, depending on context, but often you'll find that you won't get what you want unless you "upgrade" them.
With all that in mind, here's one possible answer to your question. First, we create a 1-d array:
>>> a1d = numpy.array([1, 2, 3])
>>> a1d
array([1, 2, 3])
Now we reshape it to create a column vector. The -1 here tells numpy to figure out the right size given the input.
>>> vcol = a1d.reshape((-1, 1))
>>> vcol
array([[1],
[2],
[3]])
Observe the doubled brackets at the beginning and ending of this. That's a subtle cue that this is a 2-d array, even though one dimension has a size of just 1.
We can do the same thing, swapping the dimensions, to get a row. Note again the doubled brackets.
>>> vrow = a1d.reshape((1, -1))
>>> vrow
array([[1, 2, 3]])
You can tell that these are 2-d arrays, because a 1-d array would have only one value in its shape tuple:
>>> a1d.shape
(3,)
>>> vcol.shape
(3, 1)
>>> vrow.shape
(1, 3)
To build a matrix from column vectors we can use hstack. There are lots of other methods that may be faster, but this is a good starting point. Here, note that [vcol] is not a numpy object, but an ordinary python list, so [vcol] * 3 means the same thing as [vcol, vcol, vcol].
>>> mat = numpy.hstack([vcol] * 3)
>>> mat
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
And vstack gives us the same thing from row vectors.
>>> mat2 = numpy.vstack([vrow] * 3)
>>> mat2
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
It's unlikely that any other interpretation of "construct a matrix of vectors from vector of matrix" will generate something you actually want in numpy!
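For completeness, one more possible reading of the question, offered only as a sketch and not as part of the answer above: if the goal is to pair up corresponding elements of the two matrices, stacking them along a new last axis gives a 3-d array in which each [i, j] position holds a length-2 vector. The names `matrix1`, `matrix2` and `stacked` follow the question's example but are otherwise assumptions:

```python
import numpy as np

matrix1 = np.array([[1, 2, 3]] * 3)
matrix2 = np.array([[2, 4, 6]] * 3)

# New trailing axis of length 2: stacked[i, j] == [matrix1[i, j], matrix2[i, j]]
stacked = np.stack([matrix1, matrix2], axis=-1)

print(stacked.shape)            # (3, 3, 2)
print(stacked[0, 1].tolist())   # [2, 4]
```

This stays an ordinary numeric array, so broadcasting and element-wise math keep working on it.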
Since you mention wanting to do linear algebra, here are a couple of operations that are possible. This assumes you're using a recent-enough version of Python to use the @ operator, which provides an unambiguous inline notation for matrix multiplication of arrays.[1]
For arrays, multiplication is always element-wise. But sometimes there is broadcasting. For values with the same shape, it's plain element-wise multiplication:
>>> vrow * vrow
array([[1, 4, 9]])
>>> vcol * vcol
array([[1],
[4],
[9]])
When values have different shapes, they are broadcast together if possible to produce a sensible result:
>>> vrow * vcol
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
>>> vcol * vrow
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
Broadcasting works in the way you'd expect for other shapes as well:
>>> vrow * mat
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
>>> vcol * mat
array([[1, 1, 1],
[4, 4, 4],
[9, 9, 9]])
If you want a dot product, you have to use the @ operator:
>>> vrow @ vcol
array([[14]])
Note that unlike the * operator, this is not symmetric:
>>> vcol @ vrow
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
This can be a bit confusing at first, because this looks the same as vrow * vcol, but don't be fooled: * will produce the same result regardless of argument order. Finally, for a matrix-vector product:
>>> mat @ vcol
array([[ 6],
[12],
[18]])
Observe again the difference between @ and *:
>>> mat * vcol
array([[1, 1, 1],
[4, 4, 4],
[9, 9, 9]])
[1] Sadly, this only exists as of Python 3.5. If you need to work with an earlier version, all the same advice applies, except that instead of using inline notation for a @ b, you have to use np.dot(a, b). numpy's matrix type overrides * to behave like @... but then you can't do element-wise multiplication or broadcasting the same way! So even if you have an earlier version, I don't recommend using the matrix type.
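A small sketch comparing the inline matrix-multiplication operator (@) with the function forms np.dot and np.matmul for the 2-d arrays used in this answer. It rebuilds vcol, vrow and mat so it can run on its own:

```python
import numpy as np

vcol = np.array([1, 2, 3]).reshape(-1, 1)   # shape (3, 1)
vrow = vcol.T                               # shape (1, 3)
mat = np.hstack([vcol] * 3)                 # shape (3, 3)

# For 2-d arrays, @, np.matmul and np.dot all compute the same matrix product
print(np.array_equal(vrow @ vcol, np.dot(vrow, vcol)))    # True
print(np.array_equal(mat @ vcol, np.matmul(mat, vcol)))   # True

# The (1, 1) result can be turned into a plain Python number
print((vrow @ vcol).item())                 # 14
```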

slice a 3d numpy array using a 2d numpy array

Is it possible to slice a 3d array using a 2d array? I'm assuming it can be done but would require that you specify the axis?
If I have 3 arrays, such that:
A = [[1,2,3,4,5],
[1,3,5,7,9],
[5,4,3,2,1]] # shape (3,5)
B1 = [[1],
[2],
[3]] # shape (3, 1)
B2 = [[4],
[3],
[4]] # shape (3,1)
Is it possible to slice A using B1 and B2 like:
Out = A[B1:B2]
so that it would return me:
Out = [[2,3,4,5],
[5, 7],
[2, 1]]
or would this not work if the slices created arrays in Out of different lengths?
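NumPy is optimized for homogeneous arrays of numbers with fixed dimensions, so it does not support varying row or column sizes.
However, you can achieve what you want by using a list of arrays (this assumes A, B1 and B2 are NumPy arrays, with B1 and B2 of shape (3, 1) as in the question, so each bound is read out as a scalar):
Out = [A[i, B1[i, 0]:B2[i, 0] + 1] for i in range(len(B1))]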
Here's one approach using vectorization:
n_range = np.arange(A.shape[1])
elems = A[(n_range >= B1) & (n_range <= B2)]
idx = (B2 - B1 + 1).ravel().cumsum()
out = np.split(elems,idx)[:-1]
The trick is to use broadcasting to create a mask of the elements to be selected for the output, then split the array of those elements at the specified positions to get a list of arrays.
Sample input, output -
In [37]: A
Out[37]:
array([[1, 2, 3, 4, 5],
[1, 3, 5, 7, 9],
[5, 4, 3, 2, 1]])
In [38]: B1
Out[38]:
array([[1],
[2],
[3]])
In [39]: B2
Out[39]:
array([[4],
[3],
[4]])
In [40]: out
Out[40]: [array([2, 3, 4, 5]), array([5, 7]), array([2, 1])]
# Please note that the output is a list of arrays
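To see what the broadcast mask looks like, here is a self-contained sketch of the same idea. It differs slightly from the snippet above in that the split points come from the mask itself; the names `cols`, `mask`, `counts` and `pieces` are introduced here:

```python
import numpy as np

A = np.array([[1, 2, 3, 4, 5],
              [1, 3, 5, 7, 9],
              [5, 4, 3, 2, 1]])
B1 = np.array([[1], [2], [3]])   # inclusive start column per row
B2 = np.array([[4], [3], [4]])   # inclusive end column per row

cols = np.arange(A.shape[1])           # shape (5,)
mask = (cols >= B1) & (cols <= B2)     # (5,) against (3, 1) -> (3, 5)
print(mask.astype(int).tolist())
# [[0, 1, 1, 1, 1], [0, 0, 1, 1, 0], [0, 0, 0, 1, 1]]

counts = mask.sum(axis=1)                          # selected elements per row
pieces = np.split(A[mask], counts.cumsum()[:-1])   # cut the flat selection per row
print([p.tolist() for p in pieces])
# [[2, 3, 4, 5], [5, 7], [2, 1]]
```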
Your desired result has a different number of terms in each row - that's a strong indicator that a fully vectorized solution is not possible. It is not doing the same thing for each row or each column.
Secondly, n:m translates to slice(n,m). slice only takes integers, not lists or arrays.
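A brief illustration of the n:m to slice(n, m) translation, using a made-up 1-d array `row`:

```python
import numpy as np

row = np.array([1, 3, 5, 7, 9])

# The colon syntax and an explicit slice object select the same elements
print(np.array_equal(row[2:4], row[slice(2, 4)]))   # True

# A slice resolves to plain integer (start, stop, step) for a given length
print(slice(2, 4).indices(len(row)))                # (2, 4, 1)
```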
The obvious solution is some sort of iteration over rows:
In [474]: A = np.array([[1,2,3,4,5],
[1,3,5,7,9],
[5,4,3,2,1]]) # shape (3,5)
In [475]: B1=[1,2,3] # no point in making these 2d
In [476]: B2=[5,4,5] # corrected values
In [477]: [a[b1:b2] for a,b1,b2 in zip(A,B1,B2)]
Out[477]: [array([2, 3, 4, 5]), array([5, 7]), array([2, 1])]
This solution works just as well if A is a nested list:
In [479]: [a[b1:b2] for a,b1,b2 in zip(A.tolist(),B1,B2)]
Out[479]: [[2, 3, 4, 5], [5, 7], [2, 1]]
The 2 lists could also be converted to an array of 1d indices, and then used to select values from A.ravel(). That would produce a 1d array, e.g.
array([2, 3, 4, 5, 5, 7, 2, 1])
which in theory could be passed to np.split, but recent experience with other questions indicates that this doesn't save much time.
If the length of the row selections were all the same we can get a 2d array. Iterative version taking 2 elements per row:
In [482]: np.array([a[b1:b1+2] for a,b1 in zip(A,B1)])
Out[482]:
array([[2, 3],
[5, 7],
[2, 1]])
I've discussed in earlier SO questions how to produce this sort of result with one indexing operation.
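One way such a single indexing operation could look, as a sketch only (it is not necessarily the approach from those earlier questions): build a row index and a column index that broadcast against each other, then index once. The names `width`, `rows` and `cols` are assumptions made for this example:

```python
import numpy as np

A = np.array([[1, 2, 3, 4, 5],
              [1, 3, 5, 7, 9],
              [5, 4, 3, 2, 1]])
B1 = np.array([1, 2, 3])   # start column for each row
width = 2                  # same number of elements taken from every row

rows = np.arange(A.shape[0])[:, np.newaxis]    # shape (3, 1)
cols = B1[:, np.newaxis] + np.arange(width)    # shape (3, 2)

# Integer-array indexing broadcasts rows against cols to a (3, 2) selection
print(A[rows, cols].tolist())   # [[2, 3], [5, 7], [2, 1]]
```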
On what slice accepts:
In [486]: slice([1,2],[3,4]).indices(10)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-486-0c3514e61cf6> in <module>()
----> 1 slice([1,2],[3,4]).indices(10)
TypeError: slice indices must be integers or None or have an __index__ method
'Vectorized' ravel indexing:
In [505]: B=np.array([B1,B2])
In [506]: bb=A.shape[1]*np.arange(3)+B
In [508]: ri =np.r_[tuple([slice(i,j) for i,j in bb.T])]
# or np.concatenate([np.arange(i,j) for i,j in bb.T])
In [509]: ri
Out[509]: array([ 1, 2, 3, 4, 7, 8, 13, 14])
In [510]: A.ravel()[ri]
Out[510]: array([2, 3, 4, 5, 5, 7, 2, 1])
It still has an iteration - to generate the slices that go into np.r_ (which expands them into a single indexing array)

