numpys fancy indexing pattern [duplicate] - python

This question already has answers here:
Understanding slicing
(38 answers)
Closed 1 year ago.
I am trying to get my head around numpy's fancy indexing.
While trying to approach the following I am currently unable to solve the problem:
Given the following np.array t.
t = np.array([[6, 1, 8],
[4, 3, 7],
[9, 5, 2]])
I want to achieve the following pattern using fancy indexing
array([[8, 1, 8],
[1, 8, 1],
[8, 1, 8]])
With my closest approach getting to
array([[8, 1, 8],
[8, 1, 8],
[8, 1, 8]])
Using t[:,[2,1,2]][[0,0,0]]
how to tackle such problems?

I found two approaches which solve the problem.
Map the indices with the corresponding points to a 2d array of row and column coordinates.
row = np.array([[0,0,0],[0,0,0],[0,0,0]])
col = np.array([[2,1,2],[1,2,1],[2,1,2]])
t[row,col]
array([[8, 1, 8],
[1, 8, 1],
[8, 1, 8]])
which is the same as using
t[ [[0,0,0],[0,0,0],[0,0,0]] , [[2,1,2],[1,2,1],[2,1,2]] ]
Start by creating a 1d Array including only required numbers and then cast to a 2d array, allthough is approach triggers a Future Warning: Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]. In the future this will be interpreted as an array index, arr[np.array(seq)], which will result either in an error or a different result.
t[0,[1,2]] [[ [[1,0,1],[0,1,0],[1,0,1]] ]]
array([[8, 1, 8],
[1, 8, 1],
[8, 1, 8]])

In [198]: t = np.array([[6, 1, 8],
...: [4, 3, 7],
...: [9, 5, 2]])
Looks like you want to take 2 elements, and repeat them.
In [199]: t[0,[2,1]] # select the elements
Out[199]: array([8, 1])
Then taking advantage of how np.resize "pads" a larger array:
In [200]: np.resize(_,(3,3))
Out[200]:
array([[8, 1, 8],
[1, 8, 1],
[8, 1, 8]])
That's not a very general solution, but then I don't know how you'd imagine generalizing the problem
Another way to expand and reshape
In [217]: np.repeat(t[[[0]],[2,1]],5,0).ravel()[:9].reshape(3,3)
Out[217]:
array([[8, 1, 8],
[1, 8, 1],
[8, 1, 8]])

Related

Numpy Array: Slice several values at every step

I am trying to extract several values at once from an array but I can't seem to find a way to do it in a one-liner in Numpy.
Simply put, considering an array:
a = numpy.arange(10)
> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
I would like to be able to extract, say, 2 values, skip the next 2, extract the 2 following values etc. This would result in:
array([0, 1, 4, 5, 8, 9])
This is an example but I am ideally looking for a way to extract x values and skip y others.
I thought this could be done with slicing, doing something like:
a[:2:2]
but it only returns 0, which is the expected behavior.
I know I could obtain the expected result by combining several slicing operations (similarly to Numpy Array Slicing) but I was wondering if I was not missing some numpy feature.
If you want to avoid creating copies and allocating new memory, you could use a window_view of two elements:
win = np.lib.stride_tricks.sliding_window_view(a, 2)
array([[0, 1],
[1, 2],
[2, 3],
[3, 4],
[4, 5],
[5, 6],
[6, 7],
[7, 8],
[8, 9]])
And then only take every 4th window view:
win[::4].ravel()
array([0, 1, 4, 5, 8, 9])
Or directly go with the more dangerous as_strided, but heed the warnings in the documentation:
np.lib.stride_tricks.as_strided(a, shape=(3,2), strides=(32,8))
You can use a modulo operator:
x = 2 # keep
y = 2 # skip
out = a[np.arange(a.shape[0])%(x+y)<x]
Output: array([0, 1, 4, 5, 8, 9])
Output with x = 2 ; y = 3:
array([0, 1, 5, 6])

How to loop back to beginning of the array for out of bounds index in numpy?

I have a 2D numpy array that I want to extract a submatrix from.
I get the submatrix by slicing the array as below.
Here I want a 3*3 submatrix around an item at the index of (2,3).
>>> import numpy as np
>>> a = np.array([[0, 1, 2, 3],
... [4, 5, 6, 7],
... [8, 9, 0, 1],
... [2, 3, 4, 5]])
>>> a[1:4, 2:5]
array([[6, 7],
[0, 1],
[4, 5]])
But what I want is that for indexes that are out of range, it goes back to the beginning of array and continues from there. This is the result I want:
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
I know that I can do things like getting mod of the index to the width of the array; but I'm looking for a numpy function that does that.
And also for an one dimensional array this will cause an index out of range error, which is not really useful...
This is one way using np.pad with wraparound mode.
>>> a = np.array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 0, 1],
[2, 3, 4, 5]])
>>> pad_width = 1
>>> i, j = 2, 3
>>> startrow, endrow = i-1+pad_width, i+2+pad_width # for 3 x 3 submatrix
>>> startcol, endcol = j-1+pad_width, j+2+pad_width
>>> np.pad(a, (pad_width, pad_width), 'wrap')[startrow:endrow, startcol:endcol]
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
Depending on the shape of your patch (eg. 5 x 5 instead of 3 x 3) you can increase the pad_width and start and end row and column indices accordingly.
np.take does have a mode parameter which can wrap-around out of bound indices. But it's a bit hacky to use np.take for multidimensional arrays since the axis must be a scalar.
However, In your particular case you could do this:
a = np.array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 0, 1],
[2, 3, 4, 5]])
np.take(a, np.r_[2:5], axis=1, mode='wrap')[1:4]
Output:
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
EDIT
This function might be what you are looking for (?)
def select3x3(a, idx):
x,y = idx
return np.take(np.take(a, np.r_[x-1:x+2], axis=0, mode='wrap'), np.r_[y-1:y+2], axis=1, mode='wrap')
But in retrospect, i recommend using modulo and fancy indexing for this kind of operation (it's basically what the mode='wrap' is doing internally anyways):
def select3x3(a, idx):
x,y = idx
return a[np.r_[x-1:x+2][:,None] % a.shape[0], np.r_[y-1:y+2][None,:] % a.shape[1]]
The above solution is also generalized for any 2d shape on a.

Reorganizing a 3d numpy array

I've tried and searched for a few days, I've come closer but need your help.
I have a 3d array in python,
shape(files)
>> (31,2049,2)
which corresponds to 31 input files with 2 columns of data with 2048 rows and a header.
I'd like to sort this array based on the header, which is a number, in each file.
I tried to follow NumPy: sorting 3D array but keeping 2nd dimension assigned to first , but i'm incredibly confused.
First I try to setup get my headers for the argsort, I thought I could do
sortval=files[:][0][0]
but this does not work..
Then I simply did a for loop to iterate and get my headers
for i in xrange(shape(files)[0]:
sortval.append([i][0][0])
Then
sortedIdx = np.argsort(sortval)
This works, however I dont understand whats happening in the last line..
files = files[np.arange(len(deck))[:,np.newaxis],sortedIdx]
Help would be appreciated.
Another way to do this is with np.take
header = a[:,0,0]
sorted = np.take(a, np.argsort(header), axis=0)
Here we can use a simple example to demonstrate what your code is doing:
First we create a random 3D numpy matrix:
a = (np.random.rand(3,3,2)*10).astype(int)
array([[[3, 1],
[3, 7],
[0, 3]],
[[2, 9],
[1, 0],
[9, 2]],
[[9, 2],
[8, 8],
[8, 0]]])
Then a[:] will gives a itself, and a[:][0][0] is just the first row in first 2D array in a, which is:
a[:][0]
# array([[3, 1],
# [3, 7],
# [0, 3]])
a[:][0][0]
# array([3, 1])
What you want is the header which are 3,2,9 in this example, so we can use a[:, 0, 0] to extract them:
a[:,0,0]
# array([3, 2, 9])
Now we sort the above list and get an index array:
np.argsort(a[:,0,0])
# array([1, 0, 2])
In order to rearrange the entire 3D array, we need to slice the array with correct order. And np.arange(len(a))[:,np.newaxis] is equal to np.arange(len(a)).reshape(-1,1) which creates a sequential 2D index array:
np.arange(len(a))[:,np.newaxis]
# array([[0],
# [1],
# [2]])
Without the 2D array, we will slice the array to 2 dimension
a[np.arange(3), np.argsort(a[:,0,0])]
# array([[3, 7],
# [2, 9],
# [8, 0]])
With the 2D array, we can perform 3D slicing and keeps the shape:
a[np.arange(3).reshape(-1,1), np.argsort(a[:,0,0])]
array([[[3, 7],
[3, 1],
[0, 3]],
[[1, 0],
[2, 9],
[9, 2]],
[[8, 8],
[9, 2],
[8, 0]]])
And above is the final result you want.
Edit:
To arange the 2D arrays:, one could use:
a[np.argsort(a[:,0,0])]
array([[[2, 9],
[1, 0],
[9, 2]],
[[3, 1],
[3, 7],
[0, 3]],
[[9, 2],
[8, 8],
[8, 0]]])

NumPy complicated slicing

I have a NumPy array, for example:
>>> import numpy as np
>>> x = np.random.randint(0, 10, size=(5, 5))
>>> x
array([[4, 7, 3, 7, 6],
[7, 9, 5, 7, 8],
[3, 1, 6, 3, 2],
[9, 2, 3, 8, 4],
[0, 9, 9, 0, 4]])
Is there a way to get a view (or copy) that contains indices 1:3 of the first row, indices 2:4 of the second row and indices 3:5 of the forth row?
So, in the above example, I wish to get:
>>> # What to write here?
array([[7, 3],
[5, 7],
[8, 4]])
Obviously, I would like a general method that would work efficiently also for multi-dimensional large arrays (and not only for the toy example above).
Try:
>>> np.array([x[0, 1:3], x[1, 2:4], x[3, 3:5]])
array([[7, 3],
[5, 7],
[8, 4]])
You can use numpy.lib.stride_tricks.as_strided as long as the offsets between rows are uniform:
# How far to step along the rows
offset = 1
# How wide the chunk of each row is
width = 2
view = np.lib.stride_tricks.as_strided(x, shape=(x.shape[0], width), strides=(x.strides[0] + offset * x.strides[1],) + x.strides[1:])
The result is guaranteed to be a view into the original data, not a copy.
Since as_strided is ridiculously powerful, be very careful how you use it. For example, make absolutely sure that the view does not go out of bounds in the last few rows.
If you can avoid it, try not to assign anything into a view returned by as_strided. Assignment just increases the dangers of unpredictable behavior and crashing a thousandfold if you don't know exactly what you're doing.
I guess something like this :D
In:
import numpy as np
x = np.random.randint(0, 10, size=(5, 5))
Out:
array([[7, 3, 3, 1, 9],
[6, 1, 3, 8, 7],
[0, 2, 2, 8, 4],
[8, 8, 1, 8, 8],
[1, 2, 4, 3, 4]])
In:
list_of_indicies = [[0,1,3], [1,2,4], [3,3,5]] #[row, start, stop]
def func(array, row, start, stop):
return array[row, start:stop]
for i in range(len(list_of_indicies)):
print(func(x,list_of_indicies[i][0],list_of_indicies[i][1], list_of_indicies[i][2]))
Out:
[3 3]
[3 8]
[3 4]
So u can modify it for your needs. Good luck!
I would extract diagonal vectors and stack them together, like this:
def diag_slice(x, start, end):
n_rows = min(*x.shape)-end+1
columns = [x.diagonal(i)[:n_rows, None] for i in range(start, end)]
return np.hstack(columns)
In [37]: diag_slice(x, 1, 3)
Out[37]:
array([[7, 3],
[5, 7],
[3, 2]])
For the general case it will be hard to beat a row by row list comprehension:
In [28]: idx = np.array([[0,1,3],[1,2,4],[4,3,5]])
In [29]: [x[i,j:k] for i,j,k in idx]
Out[29]: [array([7, 8]), array([2, 0]), array([9, 2])]
If the resulting arrays are all the same size, they can be combined into one 2d array:
In [30]: np.array(_)
Out[30]:
array([[7, 8],
[2, 0],
[9, 2]])
Another approach is to concatenate the indices before. I won't get into the details, but create something like this:
In [27]: x[[0,0,1,1,3,3],[1,2,2,3,3,4]]
Out[27]: array([7, 8, 2, 0, 3, 8])
Selecting from different rows complicates this 2nd approach. Conceptually the first is simpler. Past experience suggests the speed is about the same.
For uniform length slices, something like the as_strided trick may be faster, but it requires more understanding.
Some masking based approaches have also been suggested. But the details are more complicated, so I'll leave those to people like #Divakar who have specialized in them.
Someone has already pointed out the as_strided tricks, and yes, you should really use it with caution.
Here is a broadcast / fancy index approach which is less efficient than as_strided but still works pretty well IMO
window_size, step_size = 2, 1
# index within window
index = np.arange(2)
# offset
offset = np.arange(1, 4, step_size)
# for your case it's [0, 1, 3], I'm not sure how to generalize it without further information
fancy_row = np.array([0, 1, 3]).reshape(-1, 1)
# array([[1, 2],
# [2, 3],
# [3, 4]])
fancy_col = offset.reshape(-1, 1) + index
x[fancy_row, fancy_col]

Intersperse items of a numpy array

I have a 3 dimensional numpy array similar to this:
a = np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]],
[[9, 10],
[11, 12]]])
What I'd like to do is intersperse each 2D array contained inside the outer array to produce this result:
t = np.array([[[1, 2], [5, 6], [9, 10]],
[[3, 4], [7, 8], [11, 12]]])
I could do this in Python like this, but I'm hoping there's a more efficient, numpy version:
t = np.empty((a.shape[1], a.shape[0], a.shape[2]), a.dtype)
for i, x in np.ndenumerate(a):
t[i[1], i[0], i[2]] = x
As #UdayrajDeshmukh said, you can use the transpose method (which, despite the name that evokes the "transpose" operator in linear algebra, is better understood as "permuting the axes"):
>>> t = a.transpose(1, 0, 2)
>>> t
array([[[ 1, 2],
[ 5, 6],
[ 9, 10]],
[[ 3, 4],
[ 7, 8],
[11, 12]]])
The newly created object t is a shallow array looking into a's data with a different permutation of indices. To replicate your own example, you need to copy it, e.g. t = a.transpose(1, 0, 2).copy()
Try the transpose function. You simply change the first two axes.
t = np.transpose(a, axes=(1, 0, 2))

Categories