Explain this 4D numpy array indexing intuitively - python

import numpy as np
x = np.random.randn(4, 3, 3, 2)
print(x[1,1])
output:
[[ 1.68158825 -0.03701415]
[ 1.0907524 -1.94530359]
[ 0.25659178 0.00475093]]
I am a Python newbie and can't really understand indexing into a 4-D array like the one above. What does x[1,1] mean?
For example, for a vector
a = [2, 3, 8, 9], a[0] = 2, a[3] = 9.
I get this but I don't know what x[1,1] refers to.
Please explain in detail. Thank you.

A 2D array is a matrix: an array of arrays.
A 4D array is basically a matrix of matrices.
Specifying one index gives you an array of matrices:
>>> x[1]
array([[[-0.37387191, -0.19582887],
[-2.88810217, -0.8249608 ],
[-0.46763329, 1.18628611]],
[[-1.52766397, -0.2922034 ],
[ 0.27643125, -0.87816021],
[-0.49936658, 0.84011388]],
[[ 0.41885001, 0.16037164],
[ 1.21510322, 0.01923682],
[ 0.96039904, -0.22761806]]])
Specifying two indices gives you a matrix:
>>> x[1, 1]
array([[-1.52766397, -0.2922034 ],
[ 0.27643125, -0.87816021],
[-0.49936658, 0.84011388]])
Specifying three indices gives you an array:
>>> x[1, 1, 1]
array([ 0.27643125, -0.87816021])
Specifying four indices gives you a single element:
>>> x[1, 1, 1, 1]
-0.87816021212791107
x[1,1] gives you the small (3, 2) matrix stored in the 2nd row and 2nd column of the large (4, 3) matrix of matrices (indices start at 0).
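One way to check this is to look at the shapes rather than the values. The following is a minimal sketch; the seeded generator is only an assumption made so the run is repeatable, and the original random values are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, for repeatability only
x = rng.standard_normal((4, 3, 3, 2))   # same shape as in the question

# Each index you supply removes the leading remaining axis.
print(x.shape)              # (4, 3, 3, 2)
print(x[1].shape)           # (3, 3, 2)  array of matrices
print(x[1, 1].shape)        # (3, 2)     one matrix
print(x[1, 1, 1].shape)     # (2,)       one row
print(x[1, 1, 1, 1].shape)  # ()         a single number

# x[1, 1] selects the same data as indexing twice in a row.
same = np.array_equal(x[1, 1], x[1][1])
print(same)                 # True
```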

A 4d numpy array is an array nested 4 layers deep, so at the top level it would look like this:
[ # 1st level Array (Outer)
    [ # 2nd level Array
        [[1, 2], [3, 4]], # 3rd level arrays, containing 2 4th level arrays
        [[5, 6], [7, 8]]
    ],
    [ # 2nd level Array
        [[9, 10], [11, 12]],
        [[13, 14], [15, 16]]
    ]
]
x[1,1] selects the same element as x[1][1]. Let's unpack this one expression at a time. Indices start at 0, so the first expression x[1] selects the second element of the outer array, which is the following object from the earlier array:
[
    [[9, 10], [11, 12]],
    [[13, 14], [15, 16]]
]
The next expression now looks like this:
[
    [[9, 10], [11, 12]],
    [[13, 14], [15, 16]]
][1]
So evaluating that (selecting the second element of this array) gives us the following result:
[[13, 14], [15, 16]]
As you can see, selecting an element in a 4d array gives us a 3d array, selecting an element from a 3d array gives a 2d array, and selecting an element from a 2d array gives us a 1d array.
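The nested example can be reproduced in NumPy to confirm which block each index picks. This is a small sketch that assumes the 16 values are laid out in order, which np.arange plus reshape gives us.

```python
import numpy as np

# Values 1..16 arranged as a 2 x 2 x 2 x 2 array, matching the nested lists above.
x = np.arange(1, 17).reshape(2, 2, 2, 2)

first = x[1]         # 3D: the second element of the outer array
second = x[1, 1]     # 2D: the second element of that 3D block
third = x[1, 1, 0]   # 1D: the first row of that matrix

print(first.shape)   # (2, 2, 2)
print(second)
# [[13 14]
#  [15 16]]
print(third)         # [13 14]
```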

Related

Numpy Search & Slice 3D Array

I'm very new to Python & Numpy and am trying to accomplish the following:
Given, 3D Array:
arr_3d = [[[1,2,3],[4,5,6],[0,0,0],[0,0,0]],
          [[3,2,1],[0,0,0],[0,0,0],[0,0,0]],
          [[1,2,3],[4,5,6],[7,8,9],[0,0,0]]]
arr_3d = np.array(arr_3d)
Get the indices where [0,0,0] appears in the given 3D array.
Slice the given 3D array from where [0,0,0] appears first.
In other words, I'm trying to remove the padding (In this case: [0,0,0]) from the given 3D array.
Here is what I have tried,
arr_zero = np.zeros(3)
for index in range(0, len(arr_3d)):
    rows, cols = np.where(arr_3d[index] == arr_zero)
    arr_3d[index] = np.array(arr_3d[0][:rows[0]])
But doing this, I keep getting the following error:
Could not broadcast input array from shape ... into shape ...
I'm expecting something like this:
[[[1,2,3],[4,5,6]],
 [[3,2,1]],
 [[1,2,3],[4,5,6],[7,8,9]]]
Any help would be appreciated.
Get the first occurrence of those indices with an all() reduction along with argmax(), and then slice each 2D slice off the 3D array -
In [106]: idx = (arr_3d == [0,0,0]).all(-1).argmax(-1)
# Output as list of arrays
In [107]: [a[:i] for a,i in zip(arr_3d,idx)]
Out[107]:
[array([[1, 2, 3],
[4, 5, 6]]), array([[3, 2, 1]]), array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])]
# Output as list of lists
In [108]: [a[:i].tolist() for a,i in zip(arr_3d,idx)]
Out[108]: [[[1, 2, 3], [4, 5, 6]], [[3, 2, 1]], [[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
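For reference, here is the same idea as a self-contained script. It is a sketch rather than a definitive version: the names is_pad, has_pad and first_pad are made up for illustration, and the np.where guard is an addition for the case where a block has no padding row at all (argmax would return 0 there and the block would be emptied). The trimmed blocks have different lengths, so the result is a plain list and cannot be written back into the rectangular arr_3d, which is consistent with the broadcast error in the question.

```python
import numpy as np

arr_3d = np.array([[[1, 2, 3], [4, 5, 6], [0, 0, 0], [0, 0, 0]],
                   [[3, 2, 1], [0, 0, 0], [0, 0, 0], [0, 0, 0]],
                   [[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, 0, 0]]])

# True where an entire row equals [0, 0, 0]; shape (3, 4).
is_pad = (arr_3d == 0).all(axis=-1)

# argmax gives the first True per block, but also 0 when there is no True,
# so fall back to the full length for blocks without any padding row.
has_pad = is_pad.any(axis=-1)
first_pad = np.where(has_pad, is_pad.argmax(axis=-1), arr_3d.shape[1])

# The blocks end up with different lengths, so collect them in a list.
trimmed = [block[:n].tolist() for block, n in zip(arr_3d, first_pad)]
print(trimmed)
# [[[1, 2, 3], [4, 5, 6]], [[3, 2, 1]], [[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
```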

numpy einsum: nested dot products

I have two n-by-k-by-3 arrays a and b, e.g.,
import numpy as np
a = np.array([
[
[1, 2, 3],
[3, 4, 5]
],
[
[4, 2, 4],
[1, 4, 5]
]
])
b = np.array([
[
[3, 1, 5],
[0, 2, 3]
],
[
[2, 4, 5],
[1, 2, 4]
]
])
and I'd like to compute the dot-product of all pairs of "triplets", i.e.,
np.sum(a*b, axis=2)
A better way to do that is perhaps einsum, but I can't seem to get the indices straight.
Any hints here?
You are losing the third axis of those two 3D input arrays with that sum-reduction, while keeping the first two axes aligned. Thus, with np.einsum, the subscript strings for the two inputs are identical, and the third index is left out of the output string to signal that we are reducing along that axis for both inputs. Thus, the solution would be -
np.einsum('ijk,ijk->ij',a,b)
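A quick check that the einsum call matches the sum-based version, using the arrays from the question (the variable names via_sum and via_einsum are only for illustration):

```python
import numpy as np

a = np.array([[[1, 2, 3], [3, 4, 5]],
              [[4, 2, 4], [1, 4, 5]]])
b = np.array([[[3, 1, 5], [0, 2, 3]],
              [[2, 4, 5], [1, 2, 4]]])

via_sum = np.sum(a * b, axis=2)
# i and j appear in the output, so they are kept; k does not, so it is summed over.
via_einsum = np.einsum('ijk,ijk->ij', a, b)

print(via_einsum)
# [[20 23]
#  [36 29]]
print(np.array_equal(via_sum, via_einsum))  # True
```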

Numpy 3D array transposed when indexed in single step vs two steps

import numpy as np
x = np.random.randn(2, 3, 4)
mask = np.array([1, 0, 1, 0], dtype=bool)
y = x[0, :, mask]
z = x[0, :, :][:, mask]
print(y)
print(z)
print(y.T)
Why does doing the above operation in two steps result in the transpose of doing it in one step?
Here's the same behavior with a list index:
In [87]: x=np.arange(2*3*4).reshape(2,3,4)
In [88]: x[0,:,[0,2]]
Out[88]:
array([[ 0, 4, 8],
[ 2, 6, 10]])
In [89]: x[0,:,:][:,[0,2]]
Out[89]:
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])
In the 2nd case, x[0,:,:] returns a (3,4) array, and the next index picks 2 columns.
In the 1st case, it first selects on the first and last dimensions, and appends the slice (the middle dimension). The 0 and [0,2] together produce a dimension of size 2, and the 3 from the middle is appended, giving a (2,3) shape.
This is a case of mixed basic and advanced indexing.
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing
In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that.
This is not an easy case to comprehend or explain. Basically there's some ambiguity as to what the final dimension should be. The documentation tries to illustrate with an example x[:,ind_1,:,ind_2] where ind_1 and ind_2 are 3d (or together broadcast to that).
Earlier attempts to explain this are:
How does numpy order array slice indices?
Combining slicing and broadcasted indexing for multi-dimensional numpy arrays
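The two shapes can be compared directly. This sketch uses a deterministic array instead of the random one from the question so the values can be checked by eye:

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)
mask = np.array([True, False, True, False])

# One step: the scalar 0 and the mask are separated by a slice, so the
# dimension they produce (size 2) goes first and the sliced axis (size 3) follows.
y = x[0, :, mask]

# Two steps: x[0, :, :] is a plain (3, 4) array, then the mask picks 2 columns.
z = x[0, :, :][:, mask]

print(y.shape)  # (2, 3)
print(z.shape)  # (3, 2)
print(np.array_equal(y.T, z))  # True
```

If the (3, 2) layout is the one you want, indexing in two steps as in z is the simplest way to get it.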
===========================
A way around this problem is to replace the slice with an array - a column vector
In [221]: x[0,np.array([0,1,2])[:,None],[0,2]]
Out[221]:
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])
In [222]: np.ix_([0],[0,1,2],[0,2])
Out[222]:
(array([[[0]]]), array([[[0],
[1],
[2]]]), array([[[0, 2]]]))
In [223]: x[np.ix_([0],[0,1,2],[0,2])]
Out[223]:
array([[[ 0, 2],
[ 4, 6],
[ 8, 10]]])
Though this last result is 3d, with shape (1,3,2): ix_ didn't like the scalar 0, so the list [0] was used and its dimension is kept. An alternate way of using ix_:
In [224]: i,j=np.ix_([0,1,2],[0,2])
In [225]: x[0,i,j]
Out[225]:
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])
And here's a way of getting the same numbers, but in a (2,1,3) array:
In [232]: i,j=np.ix_([0,2],[0])
In [233]: x[j,:,i]
Out[233]:
array([[[ 0, 4, 8]],
[[ 2, 6, 10]]])

Create array of outer products in numpy

I have an array of n vectors of length m. For example, with n = 3, m = 2:
x = np.array([[1, 2], [3, 4], [5, 6]])
I want to take the outer product of each vector with itself, then concatenate them into an array of square matrices of shape (n, m, m). So for the x above I would get
array([[[ 1, 2],
[ 2, 4]],
[[ 9, 12],
[12, 16]],
[[25, 30],
[30, 36]]])
I can do this with a for loop like so
np.concatenate([np.outer(v, v) for v in x]).reshape(3, 2, 2)
Is there a numpy expression that does this without the Python for loop?
Bonus question: since the outer products are symmetric, I don't need m x m multiplication operations to calculate them. Can I get this symmetry optimization from numpy?
Maybe use einsum?
>>> x = np.array([[1, 2], [3, 4], [5,6]])
>>> np.einsum('ij...,i...->ij...',x,x)
array([[[ 1, 2],
[ 2, 4]],
[[ 9, 12],
[12, 16]],
[[25, 30],
[30, 36]]])
I used the following snippet when I was trying to do the same in Theano:
def multiouter(A,B):
    '''Provided NxK (Theano) matrices A and B it returns a NxKxK tensor C with C[i,:,:]=A[i,:]*B[i,:].T'''
    return A.dimshuffle(0,1,'x')*B.dimshuffle(0,'x',1)
Doing a straightforward conversion to Numpy yields
def multiouter(A,B):
    '''Provided NxK (Numpy) arrays A and B it returns a NxKxK tensor C with C[i,:,:]=A[i,:]*B[i,:].T'''
    return A[:,:,None]*B[:,None,:]
I think I got the inspiration for it from another StackOverflow posting, so I am not sure I can take all the credit.
Note: indexing with None is equivalent to indexing with np.newaxis and inserts a new axis of length 1.
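The approaches above can be checked against the loop from the question. This sketch also uses the explicit subscript string 'ij,ik->ijk', which is not from the answers above but may be easier to read than the ellipsis form: i is the vector index, and j and k are the two axes of each outer product.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])

# Explicit einsum: out[i, j, k] = x[i, j] * x[i, k]
outer_einsum = np.einsum('ij,ik->ijk', x, x)

# Broadcasting: (n, m, 1) * (n, 1, m) -> (n, m, m)
outer_broadcast = x[:, :, None] * x[:, None, :]

# Reference result using the Python loop from the question.
outer_loop = np.array([np.outer(v, v) for v in x])

print(outer_einsum.shape)  # (3, 2, 2)
print(np.array_equal(outer_einsum, outer_loop))     # True
print(np.array_equal(outer_broadcast, outer_loop))  # True
```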

How to delete column in 3d numpy array

I have a numpy array that looks like this
[
[[1,2,3], [4,5,6]],
[[3,8,9], [2,9,4]],
[[7,1,3], [1,3,6]]
]
I want it like this after deleting first column
[
[[2,3], [5,6]],
[[8,9], [9,4]],
[[1,3], [3,6]]
]
so currently the dimension is 3*3*3, after removing the first column it should be 3*3*2
You can slice it as so, where 1: signifies that you only want the second and all remaining columns from the inner most array (i.e. you 'delete' its first column).
>>> a[:, :, 1:]
array([[[2, 3],
[5, 6]],
[[8, 9],
[9, 4]],
[[1, 3],
[3, 6]]])
Since you are using numpy, I'll mention the numpy way of doing this. First of all, the dimensions you have specified in the question seem wrong. See below
x = np.array([
[[1,2,3], [4,5,6]],
[[3,8,9], [2,9,4]],
[[7,1,3], [1,3,6]]
])
The shape of x is
x.shape
(3, 2, 3)
You can use numpy.delete to remove a column as shown below
a = np.delete(x, 0, 2)
a
array([[[2, 3],
[5, 6]],
[[8, 9],
[9, 4]],
[[1, 3],
[3, 6]]])
To find the shape of a
a.shape
(3, 2, 2)
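Both answers produce the same values. One practical difference worth knowing is that basic slicing returns a view on the original data, while np.delete returns a new array. A short sketch to confirm this (the names sliced and deleted are only for illustration):

```python
import numpy as np

x = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[3, 8, 9], [2, 9, 4]],
    [[7, 1, 3], [1, 3, 6]],
])

sliced = x[:, :, 1:]               # view: skips the first column of the last axis
deleted = np.delete(x, 0, axis=2)  # copy: removes index 0 along axis 2

print(sliced.shape, deleted.shape)      # (3, 2, 2) (3, 2, 2)
print(np.array_equal(sliced, deleted))  # True
print(np.shares_memory(sliced, x))      # True
print(np.shares_memory(deleted, x))     # False
```

If you plan to modify the result without touching x, call sliced.copy() or use np.delete.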
