numpy - select multiple elements from each row of an array - python

I need to select multiple different values from each row of a 2D array.
A = np.array([[ 1, 2, 3, 4],
              [ 5, 6, 7, 8],
              [ 9,10,11,12]])
A[something]
>>> np.array([[ 1, 2],
[ 6, 7],
[11,12]])
I know I can create a boolean array with the same shape as A and set each element in a for loop, but I'm hoping to come up with a better solution.

You can try the following:
import numpy as np
A = np.array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9,10,11,12]])
i = [[0],[1],[2]]
j = [[0,1], [1,2],[2,3]]
B = A[i,j]
print(B)
#Prints
[[ 1 2]
[ 6 7]
[11 12]]
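This works because the two index arrays broadcast against each other: i has shape (3, 1) and j has shape (3, 2), so A[i, j] picks A[i[r, 0], j[r, c]] for every position (r, c). As a minimal sketch of the same idea, assuming the row indices are simply 0..n-1, they can be generated with np.arange, or np.take_along_axis (NumPy 1.15 and later) can be used so that only the column indices are needed. The names rows, cols, B1 and B2 are illustrative only.

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

# Column indices to pick from each row (one list per row, equal lengths)
cols = np.array([[0, 1], [1, 2], [2, 3]])

# Row indices as a column vector of shape (3, 1); broadcasts against cols
rows = np.arange(A.shape[0])[:, None]
B1 = A[rows, cols]

# Same selection, letting NumPy supply the row indices
B2 = np.take_along_axis(A, cols, axis=1)

print(B1)
```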

Related

Numpy array: iterate through column and change value depending on the next value

I have a numpy array like this:
data = np.array([
[1,2,3],
[1,2,3],
[1,2,101],
[4,5,111],
[4,5,6],
[4,5,6],
[4,5,101],
[4,5,112],
[4,5,6],
])
In the third column, I want a value to be replaced with 10001 if the next value down is 101, which would result in an array like this:
data = np.array([
[1,2,3],
[1,2,10001],
[1,2,101],
[4,5,111],
[4,5,6],
[4,5,10001],
[4,5,101],
[4,5,112],
[4,5,6],
])
I tried this, which I was certain would work, but it doesn't:
dith = np.nditer(data[:, 2], op_flags=['readwrite'])
for i in dith:
    if i+1 == 3:
        i[...] = 10001
If anyone could help with this then that would be great.
Try this -
data[np.roll(data==101,-1,0)] = 10001
array([[ 1, 2, 3],
[ 1, 2, 10001],
[ 1, 2, 101],
[ 4, 5, 111],
[ 4, 5, 6],
[ 4, 5, 10001],
[ 4, 5, 101],
[ 4, 5, 112],
[ 4, 5, 6]])
The only assumption here is that the first row doesn't contain a 101: np.roll wraps around, so a 101 in the first row would cause the last row to be overwritten.
If 101 may occur in the first row, use the approach below instead, which discards the wrapped-around row:
idx = np.vstack([np.roll(data==101,-1,0)[:-1], np.array([False, False, False])])
data[idx] = 10001
array([[ 1, 2, 3],
[ 1, 2, 10001],
[ 1, 2, 101],
[ 4, 5, 111],
[ 4, 5, 6],
[ 4, 5, 10001],
[ 4, 5, 101],
[ 4, 5, 112],
[ 4, 5, 6]])
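Alternatively, using np.where on the third column:
indices_of_101 = np.where(data[:, 2] == 101)[0]
# Drop a match in the first row: it has no previous row to update
if indices_of_101.size and indices_of_101[0] == 0:
    indices_of_101 = indices_of_101[1:]
# Set the third column of the row just before each 101
data[indices_of_101 - 1, 2] = 10001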
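A slice-based variant avoids both np.roll and the boundary check. This is only a sketch, and it assumes that just the third column needs to be inspected, as in the question; the name next_is_101 is illustrative. Comparing data[1:, 2] with 101 gives one flag for every row except the last, saying whether the following row holds 101, so no wrap-around can occur.

```python
import numpy as np

data = np.array([
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 101],
    [4, 5, 111],
    [4, 5, 6],
    [4, 5, 6],
    [4, 5, 101],
    [4, 5, 112],
    [4, 5, 6],
])

# next_is_101[k] is True when row k+1 has 101 in the third column
next_is_101 = data[1:, 2] == 101

# data[:-1, 2] is a view, so assigning through it modifies data itself
data[:-1, 2][next_is_101] = 10001

print(data[:, 2])
```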

Python reshaping 1D array into 2D array by 'rows' or 'columns'

I would like to control how array.reshape() populates the new array. For example:
a = np.arange(12).reshape(3,4)
## array([[ 0, 1, 2, 3],
## [ 4, 5, 6, 7],
## [ 8, 9, 10, 11]])
but what I would like to be able to do is populate the array column-wise, with something like:
a = np.arange(12).reshape(3,4, 'columnwise')
## array([[ 0, 3, 6, 9],
## [ 1, 4, 7, 10],
## [ 2, 5, 8, 11]])
Use np.transpose.
import numpy as np
print(np.arange(9).reshape(3,3).transpose())
Output:
[[0 3 6]
[1 4 7]
[2 5 8]]
In [22]: np.arange(12).reshape(3,4, order='F')
Out[22]:
array([[ 0, 3, 6, 9],
[ 1, 4, 7, 10],
[ 2, 5, 8, 11]])
If you take the transpose of the original matrix, you will get the desired effect.
import numpy as np
a = np.arange(9).reshape(3,3).transpose()
or
a = np.arange(9).reshape(3,3).T
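Note that for a non-square target, the transpose approach needs the reversed shape before transposing. A small sketch comparing it with order='F' for the 3x4 case from the question (the variable names are illustrative):

```python
import numpy as np

# Fill a 3x4 array column by column using Fortran (column-major) order
by_order = np.arange(12).reshape(3, 4, order='F')

# Equivalent: fill a 4x3 array row by row, then transpose to 3x4
by_transpose = np.arange(12).reshape(4, 3).T

print(by_order)
```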

numpy - slicing a 3d array, how to apply two slices of different length in a certain axis

I have been stuck on a question about slicing a numpy array for a while.
Below is the array I have right now:
a = np.array([[[ 1, 2],
               [ 3, 4],
               [ 5, 6]],
              [[ 7, 8],
               [ 9, 10],
               [11, 12]]])
How can I use slicing to get an array like the following?
np.array([[[ 1, 2]],
          [[ 9, 10],
           [11, 12]]])
I have tried a[[0,1],[0,[1,2]]], but it didn't work and gave an error:
ValueError: setting an array element with a sequence.
Thank you in advance!
The exact thing you give as your desired output is not possible, since arrays have to be "hyper-rectangles", so X[0].shape has to be the same as X[1].shape.
What you can do is:
a[[0,1,1],[0,1,2]]
# array([[ 1, 2],
# [ 9, 10],
# [11, 12]])
You can do this, for example:
import numpy as np
a = np.array([[[ 1, 2], [ 3, 4], [ 5, 6]], [[ 7, 8], [ 9, 10], [11, 12]]])
print(np.array([[a[0, 0, :], a[1, 1, :], a[1, 2, :]]]))
Result:
[[[ 1 2]
[ 9 10]
[11 12]]]
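You can apply the two slices separately and merge them afterwards:
# dtype=object is required for ragged input in recent NumPy versions
np.array((a[0,0:1].tolist(), a[1,1:].tolist()), dtype=object)
# 1-D object array holding two lists: [[1, 2]] and [[9, 10], [11, 12]]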
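If the two pieces really must keep different lengths, a plain Python list of arrays is often simpler to work with than an object array. A minimal sketch, with the name parts chosen for illustration:

```python
import numpy as np

a = np.array([[[1, 2], [3, 4], [5, 6]],
              [[7, 8], [9, 10], [11, 12]]])

# Each slice stays a regular 2-D array; the list holds the ragged result
parts = [a[0, 0:1], a[1, 1:]]

for p in parts:
    print(p.shape, p.tolist())
```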

Numpy where for 2 dimensional array

I have a 2D numpy array. I need to keep all the rows whose value in a specific column is greater than or equal to a certain number. Right now, I have:
f_left = np.where(f_sorted[:,attribute] >= split_point)
And it is failing with: "IndexError: too many indices for array"
How should I do this? I can't figure it out from the NumPy documentation.
You actually don't even need where.
yy = np.array(range(12)).reshape((4,3))
yy[yy[:,1] > 2]
Output
array([[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
x = np.array([[2,3,4],[5,6,7],[1,2,3],[8,9,10]])
array([[ 2, 3, 4],
[ 5, 6, 7],
[ 1, 2, 3],
[ 8, 9, 10]])
Find the rows where the second element is >= 4:
x[np.where(x[:,1] >= 4)]
array([[ 5, 6, 7],
[ 8, 9, 10]])
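Both answers select the same rows. np.where called with a single argument returns a tuple of index arrays, which is why its result can be passed straight back as an index, while the boolean mask skips that step. A small sketch showing the equivalence on the array x from above (mask and idx are illustrative names):

```python
import numpy as np

x = np.array([[2, 3, 4], [5, 6, 7], [1, 2, 3], [8, 9, 10]])

# Boolean mask with one entry per row
mask = x[:, 1] >= 4

# np.where with one argument returns a tuple holding the matching row indices
idx = np.where(mask)

print(x[mask])
print(idx[0])
```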

Efficient way of making a list of pairs from an array in Numpy

I have a numpy array x (with (n,4) shape) of integers like:
[[0 1 2 3],
[1 2 7 9],
[2 1 5 2],
...]
I want to transform the array into an array of pairs:
[0,1]
[0,2]
[0,3]
[1,2]
...
so that the first element of each row is paired with each of the other elements in the same row. I already have a for-loop solution:
y=np.array([[x[j,0],x[j,i]] for i in range(1,4) for j in range(0,n)],dtype=int)
but since looping over a numpy array is not efficient, I tried slicing instead. I can do the slicing for each column like this:
y[1]=np.array([x[:,0],x[:,1]]).T
# [[0,1],[1,2],[2,1],...]
I can repeat this for all columns. My questions are:
How can I append y[2] to y[1],... such that the shape is (N,2)?
If number of columns is not small (in this example 4), how can I find y[i] elegantly?
What are the alternative ways to achieve the final array?
The cleanest way of doing this I can think of would be:
>>> x = np.arange(12).reshape(3, 4)
>>> x
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> n = x.shape[1] - 1
>>> y = np.repeat(x, (n,)+(1,)*n, axis=1)
>>> y
array([[ 0, 0, 0, 1, 2, 3],
[ 4, 4, 4, 5, 6, 7],
[ 8, 8, 8, 9, 10, 11]])
>>> y.reshape(-1, 2, n).transpose(0, 2, 1).reshape(-1, 2)
array([[ 0, 1],
[ 0, 2],
[ 0, 3],
[ 4, 5],
[ 4, 6],
[ 4, 7],
[ 8, 9],
[ 8, 10],
[ 8, 11]])
This will make two copies of the data, so it will not be the most efficient method. A more efficient version would probably be something like:
>>> y = np.empty((x.shape[0], n, 2), dtype=x.dtype)
>>> y[..., 0] = x[:, 0, None]
>>> y[..., 1] = x[:, 1:]
>>> y.shape = (-1, 2)
>>> y
array([[ 0, 1],
[ 0, 2],
[ 0, 3],
[ 4, 5],
[ 4, 6],
[ 4, 7],
[ 8, 9],
[ 8, 10],
[ 8, 11]])
Like Jaimie, I first tried a repeat of the 1st column followed by reshaping, but then decided it was simpler to make 2 intermediary arrays, and hstack them:
x=np.array([[0,1,2,3],[1,2,7,9],[2,1,5,2]])
m,n=x.shape
x1=x[:,0].repeat(n-1)[:,None]
x2=x[:,1:].reshape(-1,1)
np.hstack([x1,x2])
producing
array([[0, 1],
[0, 2],
[0, 3],
[1, 2],
[1, 7],
[1, 9],
[2, 1],
[2, 5],
[2, 2]])
There probably are other ways of doing this sort of rearrangement. The result will copy the original data in one way or other. My guess is that as long as you are using compiled functions like reshape and repeat, the time differences won't be significant.
Suppose the numpy array is
arr = np.array([[0, 1, 2, 3],
[1, 2, 7, 9],
[2, 1, 5, 2]])
You can get the array of pairs as
import itertools
m, n = arr.shape
new_arr = np.array([x for i in range(m)
                    for x in itertools.product(arr[i, 0:1], arr[i, 1:n])])
The output would be
array([[0, 1],
[0, 2],
[0, 3],
[1, 2],
[1, 7],
[1, 9],
[2, 1],
[2, 5],
[2, 2]])
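For larger inputs, the Python-level loop can be avoided with broadcasting. This is a sketch, assuming the pairs should be ordered row by row as in the outputs above; first, rest and pairs are illustrative names.

```python
import numpy as np

arr = np.array([[0, 1, 2, 3],
                [1, 2, 7, 9],
                [2, 1, 5, 2]])

# Every column except the first, shape (m, n-1)
rest = arr[:, 1:]

# First column viewed as if repeated n-1 times, without copying
first = np.broadcast_to(arr[:, :1], rest.shape)

# Stack along a new last axis to get (m, n-1, 2), then flatten to (N, 2)
pairs = np.stack([first, rest], axis=-1).reshape(-1, 2)

print(pairs)
```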
