Indexing numpy array with list of slices - python

I have a list of slices and use them to index a numpy array.
arr = np.arange(25).reshape(5, 5)
# array([[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19],
# [20, 21, 22, 23, 24]])
slice_list = list(map(lambda i: slice(i, i+2), [1, 2]))
# [slice(1, 3, None), slice(2, 4, None)]
print(arr[slice_list])
# == arr[1:3, 2:4]
# [[ 7 8]
# [12 13]]
This works fine but it breaks if I have fewer slices than the number of dimensions
of the array I want to index.
arr3d = arr[np.newaxis, :, :] # dims: [1, 5, 5]
arr3d[:, slice_list]
# IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`)
# and integer or boolean arrays are valid indices
The following examples work however:
arr3d[:, slice_list[0], slice_list[1]]
arr3d[[slice(None)] + slice_list]
arr3d[:, [[1], [2]], [2, 3]]
Is there a way I can use a list of slices to index an array with more dimensions?
I want to do things like:
arr[..., slice_list]
arr[..., slice_list, :]
arr[:, slice_list, :]
without thinking about the dimensions of the array and figuring out how many [slice(None)]*X
I have to pad on either side of my slice_list.

You can do that using tuples of slices and Ellipsis objects. Just put all the elements that you want to use for indexing into a tuple and use it as the index:
import numpy as np
arr = np.arange(24).reshape(2, 3, 4)
print(arr)
# [[[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]]
#
# [[12 13 14 15]
# [16 17 18 19]
# [20 21 22 23]]]
slice_tup = tuple(map(lambda i: slice(i, i+2), [1, 2]))
print(slice_tup)
# (slice(1, 3, None), slice(2, 4, None))
print(arr[slice_tup])
# [[[20 21 22 23]]]
# arr[..., slice_list]
print(arr[(Ellipsis, *slice_tup)])
# [[[ 6 7]
# [10 11]]
#
# [[18 19]
# [22 23]]]
# arr[..., slice_list, :]
print(arr[(Ellipsis, *slice_tup, slice(None))])
# [[[20 21 22 23]]]
# arr[:, slice_list, :]
print(arr[(slice(None), *slice_tup, slice(None))])
# IndexError: too many indices for array
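For the last case, where the slices should start at a particular axis, one option is a small helper that does the slice(None) padding for you. This is only a minimal sketch: slice_from_axis and its start_axis parameter are hypothetical names, not part of NumPy. It also converts the list to a tuple, since recent NumPy versions no longer accept a list of slices as a multidimensional index.

```python
import numpy as np

def slice_from_axis(arr, slices, start_axis=0):
    """Apply `slices` to consecutive axes, beginning at `start_axis`.

    Hypothetical helper: axes before `start_axis` are taken whole,
    and axes after the last slice are left untouched by NumPy.
    """
    index = (slice(None),) * start_axis + tuple(slices)
    return arr[index]

arr = np.arange(24).reshape(2, 3, 4)
slice_list = [slice(1, 3), slice(2, 4)]

# Equivalent to arr[:, 1:3, 2:4]
out = slice_from_axis(arr, slice_list, start_axis=1)
print(out.shape)
```

Because trailing axes are implicitly taken whole, only the left-hand padding needs to be computed.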

Related

Numpy indexing - Using an unraveled index for basic indexing

If I have the following 3D array:
mat = np.array(np.arange(27)).reshape((3,3,3))
[[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]]
[[ 9 10 11]
[12 13 14]
[15 16 17]]
[[18 19 20]
[21 22 23]
[24 25 26]]]
and the following unraveled index:
ind = np.unravel_index([7], mat.shape[1:])
(array([2], dtype=int64), array([1], dtype=int64))
what is the best way to access
mat[:, 2, 1]
[ 7 16 25]
using the unraveled index? I am looking for a generic solution to this issue where the number of dimensions mat has can vary.
I am aware that I could do something like this:
new_ind = (np.arange(mat.shape[0]),) + ind
mat[new_ind]
[ 7 16 25]
but I was wondering if there is a way to do this which does not require the explicit construction of a new index?
You do need to construct a new indexing tuple:
In [8]: ind=np.unravel_index([7,8],(3,3))
In [9]: ind
Out[9]: (array([2, 2]), array([1, 2]))
In [10]: (slice(None),*ind)
Out[10]: (slice(None, None, None), array([2, 2]), array([1, 2]))
In [11]: np.arange(27).reshape(3,3,3)[_]
Out[11]:
array([[ 7, 8],
[16, 17],
[25, 26]])
The Out[10] is equivalent to adding a : to your unraveled indices:
In [12]: np.s_[:,[2,2],[1,2]]
Out[12]: (slice(None, None, None), [2, 2], [1, 2])
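When the unraveled index addresses the trailing axes, an Ellipsis can stand in for however many leading axes there are, so the tuple does not depend on mat.ndim. A small sketch, assuming the index is unraveled from a scalar so that the result matches mat[:, 2, 1]:

```python
import numpy as np

mat = np.arange(27).reshape(3, 3, 3)

# A scalar input gives a tuple of plain integers: (2, 1)
ind = np.unravel_index(7, mat.shape[1:])

# Same as mat[..., 2, 1]; works for any number of leading axes
result = mat[(Ellipsis, *ind)]
print(result)
# [ 7 16 25]
```

This still builds a new tuple, but the construction is the same whatever the number of dimensions.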

From a 2d array, create another 2d array composed of randomly selected values from original array (values not shared among rows) without using a loop

To select random values from a 2d array, you can use this
pool = np.random.randint(0, 30, size=[4,5])
seln = np.random.choice(pool.reshape(-1), 3, replace=False)
print(pool)
print(seln)
>[[29 7 19 26 22]
[26 12 14 11 14]
[ 6 1 13 11 1]
[ 7 3 27 1 12]]
[11 14 26]
pool needs to be reshaped into a 1-d vector because np.random.choice can not handle 2d objects. So in order to create a 2d array composed of randomly selected values from the original 2d array, I had to do one row at a time using a loop.
pool = np.random.randint(0, 30, size=[4,5])
seln = np.empty([4,3], int)
for i in range(0, pool.shape[0]):
    seln[i] = np.random.choice(pool[i], 3, replace=False)
print('pool = ', pool)
print('seln = ', seln)
>pool = [[ 1 11 29 4 13]
[29 1 2 3 24]
[ 0 25 17 2 14]
[20 22 18 9 29]]
seln = [[ 8 12 0]
[ 4 19 13]
[ 8 15 24]
[12 12 19]]
However, I am looking for a parallel method; handling all the rows at the same time, instead of one at a time in a loop.
Is this possible? If not numpy, how about Tensorflow?
Here's a way avoiding for loops:
pool = np.random.randint(0, 30, size=[4,5])
print(pool)
array([[ 4, 18, 0, 15, 9],
[ 0, 9, 21, 26, 9],
[16, 28, 11, 19, 24],
[20, 6, 13, 2, 27]])
# New array shape
new_shape = (pool.shape[0],3)
# Indices where to randomly choose from
ix = np.random.choice(pool.shape[1], new_shape)
array([[0, 3, 3],
[1, 1, 4],
[2, 4, 4],
[1, 2, 1]])
Each row of ix is a set of random column indices for the corresponding row of pool. Each row is then offset by the position where that row starts in the flattened pool, so that the indices can be applied to the flattened array:
ixs = (ix.T + range(0,np.prod(pool.shape),pool.shape[1])).T
array([[ 0, 3, 3],
[ 6, 6, 9],
[12, 14, 14],
[16, 17, 16]])
And ixs can be used to sample from pool with:
pool.flatten()[ixs].reshape(new_shape)
array([[ 4, 15, 15],
[ 9, 9, 9],
[11, 24, 24],
[ 6, 13, 6]])
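Note that the approach above draws indices with replacement, so a row can pick the same position more than once (as in the output shown). If each row must be sampled without replacement, as in the question's loop, one possible sketch is to argsort a matrix of random keys, which gives an independent random permutation of column indices per row. The variable names and the fixed seed here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)           # seed chosen only for repeatability
pool = rng.integers(0, 30, size=(4, 5))
k = 3                                    # number of values to draw per row

# argsort of random keys yields a random permutation of 0..4 in every row
order = rng.random(pool.shape).argsort(axis=1)
cols = order[:, :k]                      # first k columns: k distinct indices per row

# Pick pool[i, cols[i, j]] for every i, j without a Python loop
seln = np.take_along_axis(pool, cols, axis=1)
print(seln.shape)
```

Positions are unique within a row, although values can still repeat if pool itself contains duplicates.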

How to assign new different values to numpy array at the same dimension using iteration function

I have the following multidimensional numpy array:
a = np.arange(16).reshape(2,2,2,2)
I want to assign new, different values to the elements along a certain dimension, e.g. the 4th dimension.
I used the following code:
for i in range(a.shape[3]):
    if i == 0:
        for t in np.nditer(a[:,:,:,i], op_flags = ['readwrite']):
            t[...] = t*2
    if i == 1:
        for t in np.nditer(a[:,:,:,i], op_flags = ['readwrite']):
            t[...] = t*3
    print(a)
    print(a.shape)
the output is shown as
[[[[ 0 1]
[ 4 3]]
[[ 8 5]
[12 7]]]
[[[16 9]
[20 11]]
[[24 13]
[28 15]]]]
[[[[ 0 3]
[ 4 9]]
[[ 8 15]
[12 21]]]
[[[16 27]
[20 33]]
[[24 39]
[28 45]]]]
(2, 2, 2, 2)
(2, 2, 2, 2)
My understanding is that the loop iterates over the array: on the first i it assigns new values, and on the next i it assigns new values again, creating a new array that holds both sets of new values alongside the array from the first i, which is why I got two arrays in one variable. I only care about the last array, where all values have been updated. How can I extract only the last array? Or is there simpler and faster code for this task?
You can simply do this:
>>> a[:,:,:,0] = a[:,:,:,0]*2
>>> a[:,:,:,1] = a[:,:,:,1]*3
>>> a
array([[[[ 0, 3],
[ 4, 9]],
[[ 8, 15],
[12, 21]]],
[[[16, 27],
[20, 33]],
[[24, 39],
[28, 45]]]])
Or even this:
>>> a[:,:,:,0] *= 2
>>> a[:,:,:,1] *= 3
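With more than a couple of entries along the axis, broadcasting may be more convenient than one assignment per index. A minimal sketch, assuming the per-element multipliers are collected in an array (factors is a name introduced here for illustration):

```python
import numpy as np

a = np.arange(16).reshape(2, 2, 2, 2)

# One multiplier per position along the last axis
factors = np.array([2, 3])

# Broadcasting matches `factors` against the last axis of `a`
a *= factors
print(a[0, 0])
# [[0 3]
#  [4 9]]
```

Broadcasting aligns shapes from the right, so this form only targets the last axis; for another axis the factors would need to be reshaped accordingly.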

I want to reshape 2D array into 3D array

I want to reshape a 2D array into a 3D array. I wrote this code:
for i in range(len(array)):
    i = np.reshape(i,(2,2,2))
    print(i)
The i variable holds an array whose length is an even number, like [["100","150","2","4"],["140","120","3","5"]] or
[["1","5","6","2"],["4","2","3","7"],["7","5","6","6"],["9","1","8","3"],["3","4","5","6"],["7","8","9","2"],["1","5","2","8"],["6","7","2","1"],["9","3","1","2"],["6","8","3","3"]]
The length is >= 6.
When I run this code, I get the error ValueError: cannot reshape array of size 148 into shape (2,2,2).
My ideal output is
[[['100', '150'], ['2', '4']], [['140', '120'], ['3', '5']]] or [[["1","5"],["6","2"]],[["4","2"],["3","7"]],[["7","5"],["6","6"]],[["9","1"],["8","3"]],[["3","4"],["5","6"]],[["7","8"],["9","2"]],[["1","5"],["2","8"]],[["6","7"],["2","1"]],[["9","3"],["1","2"]],[["6","8"],["3","3"]]]
I rewrote the code as y = [[x[:2], x[2:]] for x in i], but the output is not what I want. What is wrong with my code?
First of all, you are misunderstanding what reshaping means. Say your original array has shape (A, B) and you want to reshape it to shape (M, N, O); you have to make sure that A * B = M * N * O. Obviously 148 != 2 * 2 * 2, right?
In your case, you want to reshape an array of shape (N, 4) to an array of shape (N, 2, 2). You can do it like this:
x = np.reshape(y, (-1, 2, 2))
Hope this helps :)
You don't need to loop to reshape the way you want to, just use arr.reshape((-1,2,2))
In [3]: x = np.random.randint(low=0, high=10, size=(2,4))
In [4]: x
Out[4]:
array([[1, 1, 2, 5],
[8, 8, 0, 5]])
In [5]: x.reshape((-1,2,2))
Out[5]:
array([[[1, 1],
[2, 5]],
[[8, 8],
[0, 5]]])
This approach will work for both of your arrays. The -1 as the first argument means numpy will infer the value of the unknown dimension.
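Applied to the first example from the question, this gives the desired nesting. A short sketch, assuming the data starts out as a plain list of lists of strings:

```python
import numpy as np

data = [["100", "150", "2", "4"], ["140", "120", "3", "5"]]

# Shape (2, 4) becomes (2, 2, 2); -1 lets NumPy infer the first dimension
reshaped = np.array(data).reshape(-1, 2, 2)
print(reshaped.tolist())
# [[['100', '150'], ['2', '4']], [['140', '120'], ['3', '5']]]
```

The same call handles the ten-row example, since each row of four elements becomes one 2x2 block.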
In [76]: arr = np.arange(24).reshape(3,8)
In [77]: for i in range(len(arr)):
...: print(i)
...: i = np.reshape(i, (2,2,2))
...: print(i)
...:
0
....
AttributeError: 'int' object has no attribute 'reshape'
len(arr) is 3, so range(3) produces values, 0,1,2. You can't reshape the number 0.
Or did you mean to reshape arr[0], arr[1], etc?
In [79]: for i in arr:
    ...:     print(i)
    ...:     i = np.reshape(i, (2,2,2))
    ...:     print(i)
    ...:
[0 1 2 3 4 5 6 7]
[[[0 1]
[2 3]]
[[4 5]
[6 7]]]
[ 8 9 10 11 12 13 14 15]
[[[ 8 9]
[10 11]]
[[12 13]
[14 15]]]
[16 17 18 19 20 21 22 23]
[[[16 17]
[18 19]]
[[20 21]
[22 23]]]
That works - sort of. The prints look ok, but arr itself does not get changed:
In [80]: arr
Out[80]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7],
[ 8, 9, 10, 11, 12, 13, 14, 15],
[16, 17, 18, 19, 20, 21, 22, 23]])
That's because i is the iteration variable. Assigning a new value to it does not change the original object. If that's confusing, you need to review basic Python iteration.
Or we could iterate the range, and use it as an index:
In [81]: for i in range(len(arr)):
    ...:     print(i)
    ...:     x = np.reshape(arr[i], (2,2,2))
    ...:     print(x)
    ...:     arr[i] = x
    ...:
    ...:
    ...:
    ...:
0
[[[0 1]
[2 3]]
[[4 5]
[6 7]]]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-81-5f0985cb2277> in <module>()
      3     x = np.reshape(arr[i], (2,2,2))
      4     print(x)
----> 5     arr[i] = x
      6
      7
ValueError: could not broadcast input array from shape (2,2,2) into shape (8)
The reshape works, but you can't put a (2,2,2) array back into a slot of shape (8,). The number of elements is right, but the shape isn't.
In other words, you can't reshape an array piecemeal. You have to reshape the whole thing. (If arr was a list of lists, this kind of piecemeal reshaping would work.)
In [82]: np.reshape(arr, (3,2,2,2))
Out[82]:
array([[[[ 0, 1],
[ 2, 3]],
[[ 4, 5],
[ 6, 7]]],
[[[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15]]],
[[[16, 17],
[18, 19]],
[[20, 21],
[22, 23]]]])

Slicing a numpy image array into blocks

I'm doing image processing for object detection using python. I need to divide my image into all possible blocks. For example given this toy image:
x = np.arange(25)
x = x.reshape((5, 5))
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
I want to retrieve all possible blocks of a given size, for example the 2x2 blocks are:
[[0 1]
[5 6]]
[[1 2]
[6 7]]
.. and so on. How can I do this?
scikit-learn's extract_patches_2d does that:
>>> from sklearn.feature_extraction import image
>>> one_image = np.arange(16).reshape((4, 4))
>>> one_image
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> patches = image.extract_patches_2d(one_image, (2, 2))
>>> print(patches.shape)
(9, 2, 2)
>>> patches[0]
array([[0, 1],
[4, 5]])
>>> patches[1]
array([[1, 2],
[5, 6]])
>>> patches[8]
array([[10, 11],
[14, 15]])
You can use something like this:
import numpy as np

def rolling_window(arr, window):
    """Very basic multi-dimensional rolling window. `window` should be the shape
    of the desired subarrays: either a scalar or a tuple of the same size
    as `arr.shape`.
    """
    shape = np.array(arr.shape*2)
    strides = np.array(arr.strides*2)
    window = np.asarray(window)
    shape[arr.ndim:] = window # new dimensions size
    shape[:arr.ndim] -= window - 1
    if np.any(shape < 1):
        raise ValueError('window size is too large')
    return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
# Now:
slices = rolling_window(arr, 2)
# Slices will be 4-d, not 3-d as you wanted. You can reshape,
# but this will generally need a copy, because the windows overlap:
slices = slices.reshape(-1, *slices.shape[2:])
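On NumPy 1.20 or newer, np.lib.stride_tricks.sliding_window_view provides the same kind of windowed view without hand-written stride arithmetic. A short sketch using the toy image from the question:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(25).reshape(5, 5)

# 4-d read-only view: one 2x2 window per valid top-left corner
windows = sliding_window_view(x, (2, 2))

# Flatten the two window-position axes; this step makes a copy
blocks = windows.reshape(-1, 2, 2)
print(blocks.shape)
# (16, 2, 2)
print(blocks[1])
# [[1 2]
#  [6 7]]
```

The view itself costs no extra memory; only the final reshape materializes all 16 blocks.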
Simple code with a double loop and slice:
>>> a = np.arange(12).reshape(3,4)
>>> print(a)
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
>>> r = 2
>>> n_rows, n_cols = a.shape
>>> for row in range(n_rows - r + 1):
...     for col in range(n_cols - r + 1):
...         print(a[row:row + r, col:col + r])
...
[[0 1]
[4 5]]
[[1 2]
[5 6]]
[[2 3]
[6 7]]
[[4 5]
[8 9]]
[[ 5 6]
[ 9 10]]
[[ 6 7]
[10 11]]
