how to index a numpy array using conditions? - python

Suppose I have an array like this:
a = np.array([[2,1],
[4,2],
[1,3],...])
I want to retrieve the elements of the second column where the corresponding elements in the first column match some condition. So something like
a[a[:,0] == np.array([2,4]),1] (?)
should give
np.array([1,2])

This approach uses a list to collect the results and needs a for loop, but it does gather the second-column values whose first-column value meets the criterion (here, membership in a list of acceptable values).
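import numpy as np
a = np.array([[2, 1],
              [4, 2],
              [1, 3]])
b = []
criteria = [2, 4]  # acceptable first-column values
for entry in a:
    # keep the second-column value when the first-column value qualifies
    if entry[0] in criteria:
        b.append(entry[1])
b = np.array(b)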

You could build a boolean mask from the first column, and then use it to select from the second column.
import numpy as np
a = np.array([[2, 1],
[4, 2],
[1, 3]])
mask = np.logical_or(a[:,0] == 2, a[:,0] == 4)
b = a[:,1][mask]
print(b)
Prints:
[1 2]
This gets clumsy if you have many values to compare against.
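For a longer list of acceptable values, one option is np.isin, which builds the same kind of mask in a single call. This is a minimal sketch using the arrays from the answers above; the name criteria is borrowed from the loop-based answer.

```python
import numpy as np

a = np.array([[2, 1],
              [4, 2],
              [1, 3]])
criteria = [2, 4]  # acceptable first-column values

# True wherever the first-column value appears in criteria
mask = np.isin(a[:, 0], criteria)

# Select the second column for the rows that passed
b = a[mask, 1]
print(b)
```

The mask has one entry per row, so it can be combined with other conditions using & and | before indexing.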

Related

Given the indexes corresponding to each row, get the corresponding elements from a matrix

Given indexes for each row, how to return the corresponding elements in a 2-d matrix?
For instance, for the array np.array([[1,2,3,4],[4,5,6,7]]) I expect the output [[1,2],[4,5]] given indxs = np.array([[0,1],[0,1]]). Below is what I've tried:
a= np.array([[1,2,3,4],[4,5,6,7]])
indxs = np.array([[0,1],[0,1]]) #means return the elements located at 0 and 1 for each row
#I tried this, but it returns an array with shape (2, 2, 4)
a[indxs]
You get your array twice because a[[0,1]] selects rows 0 and 1 of a, which together are the entire array. Since indxs contains [0,1] twice, a[indxs] stacks two such copies.
In[]: a[[0,1]]
Out[]: array([[1, 2, 3, 4],
[4, 5, 6, 7]])
You can get the desired output using slices. That would be the easiest way.
a = np.array([[1,2,3,4],[4,5,6,7]])
a[:,0:2]
Out []: array([[1, 2],
[4, 5]])
In case you are still interested in indexing, you could also get your output with:
In[]: [list(a[[0],[0,1]]),list(a[[1],[0,1]])]
Out[]: [[1, 2], [4, 5]]
The NumPy documentation gives a really nice overview of how indexing works.
In [120]: a = np.array([[1,2,3,4],[4,5,6,7]])
In [121]: indxs = np.array([[0,1],[0,1]])
You need to provide an index for the first dimension, one that broadcasts with indxs.
In [122]: a[np.arange(2)[:,None], indxs]
Out[122]:
array([[1, 2],
[4, 5]])
indxs is (2,n), so you need a (2,1) array to give a (2,n) result
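The broadcasting rule above can be sketched as follows, together with np.take_along_axis, which performs the same per-row lookup. The second index array, indxs2, is a made-up example (not from the question) that makes the per-row behaviour easier to see.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [4, 5, 6, 7]])
indxs = np.array([[0, 1],
                  [0, 1]])

# Column vector of row numbers, shape (2, 1); broadcasts against indxs, shape (2, 2)
rows = np.arange(a.shape[0])[:, None]
picked = a[rows, indxs]

# Same result without building the row index by hand
same = np.take_along_axis(a, indxs, axis=1)

# Hypothetical indices that differ per row
indxs2 = np.array([[3, 0],
                   [1, 2]])
picked2 = a[rows, indxs2]

print(picked)   # [[1 2], [4 5]]
print(picked2)  # [[4 1], [5 6]]
```

By contrast, a[indxs] alone treats every entry of indxs as a row number, which is why it returns shape (2, 2, 4).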

Delete specific column in an array

I have a NumPy array of size NxD called X.
I have created a mask of size D, a NumPy vector of 1s and 0s called ind_delete.
I would like to delete every column of X that corresponds to a 1 in ind_delete.
I have tried:
X = np.delete(X,ind_delete,1)
but it does not work. I have tried to find an easy way to do that in Python, but while it is trivial in MATLAB, it does not seem to be here. Thanks for pointing out the best way to achieve it.
Use boolean array indexing. This example keeps the columns where d is 1; use d != 1 to drop them instead:
>>> x = np.array([[1, 2, 3],
... [4, 5, 6]])
>>> d = np.array([1, 0, 1])
>>> x[:, d==1]
array([[1, 3],
[4, 6]])
You need to create a boolean array, which you can then use to select the columns you want:
X = X[:, ind_delete != 1]
This keeps the columns where ind_delete is not 1.
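As a small worked example of the deletion the question asks for, the sketch below reuses the 2x3 array from the earlier answer as X. It also shows why the original np.delete call misbehaves: an integer 0/1 array is read as a list of column positions, so it has to be converted to positions first (np.flatnonzero is one way to do that).

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])
ind_delete = np.array([1, 0, 1])  # 1 marks a column to drop

# Boolean mask over columns: True means "keep"
keep = ind_delete != 1
X_kept = X[:, keep]

# np.delete wants integer positions, so convert the 0/1 flags to positions
X_deleted = np.delete(X, np.flatnonzero(ind_delete), axis=1)

# Passing the 0/1 array directly deletes columns 1 and 0 instead
X_wrong = np.delete(X, ind_delete, axis=1)

print(X_kept)   # [[2], [5]]
print(X_wrong)  # [[3], [6]]
```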

python how to create an array based on condition

I want to create an array whose values come from another array. My input array has three columns. Each row of the new array should hold all values from the third column for which the second column is equal. In this example the first three values in the second column are equal, so the first row of the new array should hold the third value of each of those rows.
a =
[[1, 1, 4],
[2, 1, 6],
[3, 1, 7],
[4, 2, 0],
[5, 2, 7],
[6, 3, 1]]
result:
b =
[[4, 6 , 7],
[0, 7],
[1]]
I tried:
c = []
x = 1
for row in a:
    if row[0] == x
        c.extend[row[2]]
    else:
        x = x + 1
        c.append(row[2])
But the result is a list of all 3rd values
a = np.asarray(a)
c = []
for i in range(1, a[-1,1] + 1):  # keys run from 1 up to a[-1,1], the maximum when the second column is sorted
    save = a[a[:,1]==i]  # take all rows that have i in the second column
    c.append(save[:,2])  # of those, keep the third column
It's important that a is converted to an np.array for this.
If the second column is sorted, you can use np.diff to find out the index where the value changes and then split on it:
np.split(a[:,2], np.flatnonzero(np.diff(a[:,1]) != 0)+1)
# [array([4, 6, 7]), array([0, 7]), array([1])]
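If the second column is not sorted, one possible extension of the same idea is to sort by that column first with a stable sort and then split. This is only a sketch: the shuffled array below is the question's data in a different row order, and the names order, keys and vals are illustrative.

```python
import numpy as np

# Same rows as in the question, shuffled so the second column is unsorted
a = np.array([[1, 1, 4],
              [4, 2, 0],
              [2, 1, 6],
              [6, 3, 1],
              [3, 1, 7],
              [5, 2, 7]])

# A stable sort keeps rows with equal keys in their original relative order
order = np.argsort(a[:, 1], kind="stable")
keys = a[order, 1]
vals = a[order, 2]

# Split wherever the sorted key changes
groups = np.split(vals, np.flatnonzero(np.diff(keys)) + 1)
print(groups)
# [array([4, 6, 7]), array([0, 7]), array([1])]
```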
The below works for me:
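import numpy as np
c = [[]]
x = 1
for row in a:
    if row[1] == x:
        c[-1].append(row[2])  # same key: extend the current group
    else:
        x = x + 1
        c.append([row[2]])  # new key: start a new group
c = np.asarray(c, dtype=object)  # groups differ in length, so recent NumPy versions need dtype=object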

Modify different columns in each row of a 2D NumPy array

I have the following problem:
Let's say I have an array defined like this:
A = np.array([[1,2,3],[4,5,6],[7,8,9]])
What I would like to do is to make use of Numpy multiple indexing and set several elements to 0. To do that I'm creating a vector:
indices_to_remove = [1, 2, 0]
What I want it to mean is the following:
Remove element with index '1' from the first row
Remove element with index '2' from the second row
Remove element with index '0' from the third row
The result should be the array [[1,0,3],[4,5,0],[0,8,9]]
I've managed to get the values of the elements I would like to modify with the following code:
values = np.diagonal(np.take(A, indices_to_remove, axis=1))
However, that doesn't allow me to modify them. How could this be solved?
You could use integer array indexing to assign those zeros -
A[np.arange(len(indices_to_remove)), indices_to_remove] = 0
Sample run -
In [445]: A
Out[445]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [446]: indices_to_remove
Out[446]: [1, 2, 0]
In [447]: A[np.arange(len(indices_to_remove)), indices_to_remove] = 0
In [448]: A
Out[448]:
array([[1, 0, 3],
[4, 5, 0],
[0, 8, 9]])
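The same row/column pairing can also be used to read the values before overwriting them, which replaces the np.diagonal(np.take(...)) approach from the question. The sketch below additionally shows np.put_along_axis as an alternative way to write; it works on a copy, B, so the original array is left alone. The names rows and B are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
indices_to_remove = np.array([1, 2, 0])
rows = np.arange(A.shape[0])

# Read one element per row: A[0, 1], A[1, 2], A[2, 0]
values = A[rows, indices_to_remove]

# Alternative write: indices must have the same number of dimensions as B
B = A.copy()
np.put_along_axis(B, indices_to_remove[:, None], 0, axis=1)

print(values)  # [2 6 7]
print(B)
```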

select elements of different columns at different rows of numpy array

In [62]: a
Out[62]:
array([[1, 2],
[3, 4]])
Is there an easy way to get [2,3], i.e. the second element of the first row, and the first element of the second row? I have the list of the indices for each row, i.e. [1,0] in this case. I have tried a[:,[1,0]], but it doesn't work.
You need to specify both i and j for all the elements you want. For example:
import numpy as np
a = np.array([[1, 2],
[3, 4]])
i = [0, 1]
j = [1, 0]
print(a[i, j])
# [2, 3]
If you need one item from each row, you can use i = np.arange(a.shape[0])
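Putting that together with the question's per-row column list gives a short sketch like the one below. It also shows what a[:, [1, 0]] returns, to make clear why that attempt does not give the wanted result: it reorders whole columns rather than picking one element per row.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
j = [1, 0]                  # column to take from each row
i = np.arange(a.shape[0])   # row numbers 0, 1, ...

picked = a[i, j]            # pairs (0, 1) and (1, 0)
reordered = a[:, j]         # every row, columns in the order 1, 0

print(picked)     # [2 3]
print(reordered)  # [[2 1], [4 3]]
```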
