Element-wise in-operator between two arrays - python

I'm wondering if there is a nice and elegant way to do an element-wise in comparison between two arrays.
arr1 = [[1, 2],
[3, 4],
[5, 6]]
arr2 = [3,
5,
6]
result = arr2 in arr1
Now I want a result like :
[False, False, True]
Thanks a lot in advance!
Edit: I'm sorry, my example was a bit misleading. I want this to be performed element-wise, meaning I want to check whether arr2[0] is in arr1[0], arr2[1] is in arr1[1], and so on. I updated the example.
Also the real arrays are much larger, so I would like to do it without loops

You can use operator.contains:
>>> from operator import contains
>>> arr1 = [[1, 2], [4, 5], [7, 8]]
>>> arr2 = [3, 4, 7]
>>> list(map(contains, arr1, arr2))
[False, True, True]
Or for numpy use np.isin:
>>> import numpy as np
>>> arr1 = np.array([[1, 2], [4, 5], [7, 8]])
>>> arr2 = np.array([3, 4, 7])
>>> np.isin(arr1, arr2).any(1)
array([False,  True,  True])
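Note that np.isin tests membership against every value of the second array, not only the value at the same index. A minimal sketch of a strictly element-wise check, assuming arr1 is a regular 2-D array with one row per entry of arr2 (the arrays are the ones from the question):

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4], [5, 6]])
arr2 = np.array([3, 5, 6])

# arr2[:, None] has shape (3, 1), so row i of arr1 is compared only with arr2[i].
result = (arr1 == arr2[:, None]).any(axis=1)
print(result.tolist())  # [False, False, True]
```

This avoids a Python-level loop, which matters for the large arrays mentioned in the question.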

IIUC, there is the wonderful np.in1d to do this:
In [16]: np.in1d(arr2, arr1)
Out[16]: array([False, True, True])
From the docs, this function does the following:
Test whether each element of a 1-D array is also present in a second array.

comprehension and zip
[a in b for a, b in zip(arr2, arr1)]
[False, False, True]

You can do print([any(x in arr2 for x in a) for a in arr1])

Can be done with a list comprehension
result = [arr2[i] in arr1[i] for i in range(len(arr1))]
Then you have
[False, True, True]

Here's a quick way:
for i in zip(arr2,arr1):
    print(i[0] in i[1])

Related

For each row in one array: is it in another array?

How can I check, for each row in a numpy array, whether it can be found in another array? So if I have
import numpy as np
a = np.array([[1, 2.6], [3, 4], [2.6, 1]])
b = np.array([[0, 0], [3, 4], [1, 2.6], [4.3, 4], [5, 5]])
I would like to get
array([True, True, False])
Obviously
np.isin(a,b)
array([[ True, True],
       [ True, True],
       [ True, True]])
is not an answer and of course I can write something like
return_ = np.zeros(a.shape[0], dtype=bool)
for index, loc in enumerate(a):
    for loc2 in b:
        if np.allclose(loc, loc2):
            return_[index] = True
            break
but this is slow and looks horrible. I would prefer using proper numpy commands.
You could try creating a boolean index with appropriate broadcasting and then checking in the result for a True subarray
np.any(np.all(b == a[:, None, :], axis=-1), axis=1)
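The loop in the question compares rows with np.allclose, while the broadcast expression above uses exact equality. A sketch of the same broadcasting idea with a floating-point tolerance, assuming the default tolerances of np.isclose are acceptable:

```python
import numpy as np

a = np.array([[1, 2.6], [3, 4], [2.6, 1]])
b = np.array([[0, 0], [3, 4], [1, 2.6], [4.3, 4], [5, 5]])

# a[:, None, :] has shape (3, 1, 2) and broadcasts against b with shape (5, 2),
# giving a (3, 5, 2) array of element-wise closeness tests.
close = np.isclose(a[:, None, :], b)

# A row of a matches a row of b when all of its columns are close.
row_in_b = close.all(axis=-1).any(axis=1)
print(row_in_b.tolist())  # [True, True, False]
```

The intermediate array has len(a) * len(b) * n_columns entries, so for very large inputs the memory cost may become the limiting factor.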

What's the fastest way to return the indices of values of two arrays that are equal to each other?

Say I have these two numpy arrays:
A = np.array([[1,2,3],[4,5,6],[8,7,3]])
B = np.array([[1,2,3],[3,2,1],[8,7,3]])
It should return
[0,2]
Since the values at the 0th and 2nd index are equal to each other.
What's the most efficient way of doing this?
I tried something like:
[val for val in range(len(A)) if A[val]==B[val]]
but got the error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
It is better to look for a vectorized solution. You can try:
>>> np.where(np.all(A == B, axis=1))
(array([0, 2]),)
You can see the speed benefit of vectorization here: https://chelseatroy.com/2018/11/07/code-mechanic-numpy-vectorization/amp/
Assuming A.shape == B.shape (otherwise just take A=A[:len(B)] and B=B[:len(A)]) consider:
>>> A==B
[[ True True True]
[False False False]
[ True True True]]
>>> (A==B).all(axis=1)
[ True False True]
>>> np.argwhere((A==B).all(axis=1))
[[0]
[2]]
You can do something like that
>>> [a in B for a in A]
[True, False, True]
>>> A[[a in B for a in A]]
array([[1, 2, 3],
[8, 7, 3]])
>>> np.where((A==B).all(axis=1))
(array([0, 2]),)
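One caveat with the `a in B` form: as far as I understand NumPy's behaviour, `in` on an array is true as soon as any single element matches at its position, so it can report rows that are not actually present. A small sketch with a hypothetical probe row that is not in B:

```python
import numpy as np

B = np.array([[1, 2, 3], [3, 2, 1], [8, 7, 3]])
row = np.array([1, 9, 9])  # hypothetical row, not present in B

# `in` reduces the element-wise comparison with any(), so the single
# match in the first column of B[0] is enough to make it True.
loose = row in B

# Requiring all columns of one row to match gives the strict test.
strict = bool((B == row).all(axis=1).any())

print(loose, strict)  # True False
```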
The following solution works also for arrays that do not match in their first dimension, i.e., have a different number of rows. It also works if a match occurs multiple times.
import numpy as np
from scipy.spatial import distance
A = np.array([[1, 2 ,3],
[4, 5 ,6],
[8, 7, 3]])
B = np.array([[1, 2, 3],
[3, 2, 1],
[1, 2, 3],
[9, 9, 9]])
res = np.nonzero(distance.cdist(A, B) == 0)
# ==> (array([0, 0]), array([0, 2]))
The result res is a tuple of two arrays, which hold the matching row indices in the first and the second input array, respectively. So, in this example, the row at index 0 of the first array matches the row at index 0 of the second array, and the row at index 0 of the first array also matches the row at index 2 of the second array.
In [174]: A = np.array([[1,2,3],[4,5,6],[8,7,3]])
...: B = np.array([[1,2,3],[3,2,1],[8,7,3]])
Your list comprehension works fine for lists:
In [175]: Al = A.tolist(); Bl = B.tolist()
In [177]: [val for val in range(len(Al)) if Al[val]==Bl[val]]
Out[177]: [0, 2]
For lists, == is a simple boolean test (same or not); for arrays it returns a boolean array, which can't be used in an if:
In [178]: Al[0]==Bl[0]
Out[178]: True
In [179]: A[0]==B[0]
Out[179]: array([ True, True, True])
With arrays, you need to add an all, as suggested by the error:
In [180]: [val for val in range(len(A)) if np.all(A[val]==B[val])]
Out[180]: [0, 2]
The list version will be faster.
But you can also compare the whole arrays, and take row by row all:
In [181]: A==B
Out[181]:
array([[ True, True, True],
[False, False, False],
[ True, True, True]])
In [182]: np.all(A==B, axis=1)
Out[182]: array([ True, False, True])
In [183]: np.nonzero(np.all(A==B, axis=1))
Out[183]: (array([0, 2]),)

How exactly does numpy.where() select the elements in this example?

From numpy docs
>>> np.where([[True, False], [True, True]],
... [[1, 2], [3, 4]],
... [[9, 8], [7, 6]])
array([[1, 8],
[3, 4]])
Am I right in assuming that the [[True, False], [True, True]] part is the condition, and that [[1, 2], [3, 4]] and [[9, 8], [7, 6]] are x and y respectively, according to the parameters in the docs?
Then how exactly is the function choosing the elements in the following examples?
Also, why is the element type in these examples a list?
>>> np.where([[True, False,True], [False, True]], [[1, 2,56], [3, 4]], [[9, 8,79], [7, 6]])
array([list([1, 2, 56]), list([3, 4])], dtype=object)
>>> np.where([[False, False,True,True], [False, True]], [[1, 2,56,69], [3, 4]], [[9, 8,90,100], [7, 6]])
array([list([1, 2, 56, 69]), list([3, 4])], dtype=object)
In the first case, each term is a (2,2) array (or rather list that can be made into such an array). For each True in the condition, it returns the corresponding term in x, the [[1 -][3,4]], and for each False, the term from y [[- 8][- -]]
In the second case, the lists are ragged
In [1]: [[True, False,True], [False, True]]
Out[1]: [[True, False, True], [False, True]]
In [2]: np.array([[True, False,True], [False, True]])
Out[2]: array([list([True, False, True]), list([False, True])], dtype=object)
The array has shape (2,) and holds 2 lists. When cast to boolean, it becomes a 2-element array with both values True. Only an empty list would produce False.
In [3]: _.astype(bool)
Out[3]: array([ True, True])
The where then returns just the x values.
This second case is understandable, but pathological.
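To my understanding, recent NumPy releases refuse to build an array from ragged nested lists unless dtype=object is given, so the ragged call may raise a ValueError instead of returning a result. A sketch that reproduces the behaviour described above by constructing the object arrays explicitly (the explicit astype(bool) is my addition, to make the cast visible):

```python
import numpy as np

# Ragged inputs must be created as object arrays explicitly.
cond = np.array([[True, False, True], [False, True]], dtype=object)
x = np.array([[1, 2, 56], [3, 4]], dtype=object)
y = np.array([[9, 8, 79], [7, 6]], dtype=object)

# Each element is a whole list; a non-empty list is truthy.
mask = cond.astype(bool)
print(cond.shape, mask.tolist())  # (2,) [True, True]

# Both entries of the mask are True, so both lists come from x.
result = np.where(mask, x, y)
print(result.tolist())  # [[1, 2, 56], [3, 4]]
```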
more details
Let's demonstrate where in more detail, with a simpler case. Same condition array:
In [57]: condition = np.array([[True, False], [True, True]])
In [58]: condition
Out[58]:
array([[ True, False],
[ True, True]])
The single argument version, which is the equivalent to condition.nonzero():
In [59]: np.where(condition)
Out[59]: (array([0, 1, 1]), array([0, 0, 1]))
Some find it easier to visualize the transpose of that tuple - the 3 pairs of coordinates where condition is True:
In [60]: np.argwhere(condition)
Out[60]:
array([[0, 0],
[1, 0],
[1, 1]])
Now the simplest version with 3 arguments, with scalar values.
In [61]: np.where(condition, True, False) # same as condition
Out[61]:
array([[ True, False],
[ True, True]])
In [62]: np.where(condition, 100, 200)
Out[62]:
array([[100, 200],
[100, 100]])
A good way of visualizing this action is with two masked assignments.
In [63]: res = np.zeros(condition.shape, int)
In [64]: res[condition] = 100
In [65]: res[~condition] = 200
In [66]: res
Out[66]:
array([[100, 200],
[100, 100]])
Another way to do this is to initialize an array with the y value(s), and use the nonzero where to fill in the x value.
In [69]: res = np.full(condition.shape, 200)
In [70]: res
Out[70]:
array([[200, 200],
[200, 200]])
In [71]: res[np.where(condition)] = 100
In [72]: res
Out[72]:
array([[100, 200],
[100, 100]])
If x and y are arrays, not scalars, this masked assignment will require refinements, but hopefully for a start this will help.
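One possible refinement for array-valued x and y, assuming all three arrays already share the same shape (no broadcasting), is to index x with the same mask on the right-hand side:

```python
import numpy as np

condition = np.array([[True, False], [True, True]])
x = np.array([[1, 2], [3, 4]])
y = np.array([[9, 8], [7, 6]])

# Start from a copy of y, then overwrite the True positions with
# the values of x taken at those same positions.
res = y.copy()
res[condition] = x[condition]

print(res.tolist())  # [[1, 8], [3, 4]]
print(np.array_equal(res, np.where(condition, x, y)))  # True
```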
np.where(condition,x,y)
It checks the condition element by element: where it is True it returns the element from x, otherwise the element from y.
np.where([[True, False], [True, True]],
[[1, 2], [3, 4]],
[[9, 8], [7, 6]])
Here your condition is [[True, False], [True, True]]
x = [[1 , 2] , [3 , 4]]
y = [[9 , 8] , [7 , 6]]
The first condition is True, so it returns 1 instead of 9.
The second condition is False, so it returns 8 instead of 2.
After reading about broadcasting as @hpaulj suggested, I think I know how the function works.
It will try to broadcast the 3 arrays; if the broadcast is successful, it will use the True and False values to pick elements from either x or y.
In the example
>>>np.where([[True, False,True], [False, True]], [[1, 2,56], [3, 4]], [[9, 8,79], [7, 6]])
We have
cnd=np.array([[True, False,True], [False, True]])
x=np.array([[1, 2,56], [3, 4]])
y=np.array([[9, 8,79], [7, 6]])
Now
>>>x.shape
Out[7]: (2,)
>>>y.shape
Out[8]: (2,)
>>>cnd.shape
Out[9]: (2,)
So all three are just arrays with 2 elements (of type list), even the condition (cnd). So both [True, False,True] and [False, True] will be evaluated as True, and both elements will be selected from x.
>>>np.where([[True, False,True], [False, True]], [[1, 2,56], [3, 4]], [[9, 8,79], [7, 6]])
Out[10]: array([list([1, 2, 56]), list([3, 4])], dtype=object)
I also tried it with a more complex example (a 2x2x2 broadcast) and it still explains it.
np.where([[[True,False],[True,True]], [[False,False],[True,False]]],
[[[12,45],[10,50]], [[100,10],[17,81]]],
[[[90,93],[85,13]], [[12,345], [190,56,34]]])
Where
cnd=np.array([[[True,False],[True,True]], [[False,False],[True,False]]])
x=np.array([[[12,45],[10,50]], [[100,10],[17,81]]])
y=np.array( [[[90,93],[85,13]], [[12,345], [190,56,34]]])
Here cnd and x have the shape (2,2,2) and y has the shape (2,2).
>>>cnd.shape
Out[14]: (2, 2, 2)
>>>x.shape
Out[15]: (2, 2, 2)
>>>y.shape
Out[16]: (2, 2)
Now, as @hpaulj commented, y will be broadcast to (2,2,2).
And it'll probably look like this
>>>cnd
Out[6]:
array([[[ True, False],
[ True, True]],
[[False, False],
[ True, False]]])
>>>x
Out[7]:
array([[[ 12, 45],
[ 10, 50]],
[[100, 10],
[ 17, 81]]])
>>>np.broadcast_to(y,(2,2,2))
Out[8]:
array([[[list([90, 93]), list([85, 13])],
[list([12, 345]), list([190, 56, 34])]],
[[list([90, 93]), list([85, 13])],
[list([12, 345]), list([190, 56, 34])]]], dtype=object)
And the result can be easily predicted to be
>>>np.where([[[True,False],[True,True]], [[False,False],[True,False]]], [[[12,45],[10,50]], [[100,10],[17,81]]],[[[90,93],[85,13]], [[12,345], [190,56,34]]])
Out[9]:
array([[[12, list([85, 13])],
[10, 50]],
[[list([90, 93]), list([85, 13])],
[17, list([190, 56, 34])]]], dtype=object)
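For regular (non-ragged) inputs, the broadcasting step described in this answer can be checked with a small sketch. The shapes and values below are made up for illustration and are not taken from the question:

```python
import numpy as np

cond = np.array([[True], [False]])  # shape (2, 1)
x = np.array([1, 2, 3])             # shape (3,)
y = np.array([[10], [20]])          # shape (2, 1)

# All three inputs are broadcast to the common shape (2, 3) first;
# then each output element is taken from x or y according to cond.
out = np.where(cond, x, y)

print(out.shape)     # (2, 3)
print(out.tolist())  # [[1, 2, 3], [20, 20, 20]]
```

Row 0 of cond is True, so the whole row comes from x; row 1 is False, so it comes from the broadcast y.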

Numpy arrays - Multidimensional logic comparison

I am trying to find the entries in a two-dimensional array that are above a certain threshold. The thresholds for the individual columns are given by a one-dimensional array. To exemplify,
[[1, 2, 3],
[4, 5, 6],
[2, 0, 4]]
is the two-dimensional array and I want to see where the values in each column are bigger than
[2, 1, 3]
so the output of running the operation should be
[[False, True, False],
[True, True, True],
[False, False, True]]
Thanks!
Well, assuming there's an error in the example, I would simply do:
import numpy as np
A = np.array([[1, 2, 3],[4, 5, 6],[2, 0, 4]])
T = np.array([2, 1, 3])
X = A > T
Which gives
array([[False, True, False],
[ True, True, True],
[False, False, True]], dtype=bool)
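The comparison A > T works because T, with shape (3,), is broadcast along the rows, so T[j] acts as the threshold for column j. A short sketch contrasting that with per-row thresholds (the per-row variant is my addition and not something the question asked for):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [2, 0, 4]])
T = np.array([2, 1, 3])

# T is aligned with the last axis: T[j] is the threshold for column j.
per_column = A > T
print(per_column.tolist())
# [[False, True, False], [True, True, True], [False, False, True]]

# T[:, None] has shape (3, 1): T[i] becomes the threshold for row i.
per_row = A > T[:, None]
print(per_row.tolist())
# [[False, False, True], [True, True, True], [False, False, True]]
```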
I think there may be inconsistencies in your example (e.g. 2 > 1 is True, yet 2 > 4 is True) - can you clarify this?
Assuming you want to know, for each row, which values in the first list are greater than the corresponding entries of the [2,1,3] list you gave, I suggest the following:
import numpy as np
tmp = [[1, 2, 3],
[4, 5, 6],
[2, 0, 4]]
output = [ np.less([2, 1, 3], tmp[i]) for i in range(len(tmp))]
Similarly, try greater or greater_equal or less_equal for the result you're after:
http://docs.scipy.org/doc/numpy/reference/routines.logic.html

list comprehension to create list of list

What is the list comprehension to achieve this:
a=[1,2,3,4,5]
b=[[x,False] for x in a]
will give,
[[1,False],[2,False],[3,False],[4,False],[5,False]]
How can I get True for some number in the list? I need something like this:
[[1,False],[2,False],[3,False],[4,True],[5,False]]
My random playing has not solved the problem.
Use if-else conditional:
>>> a = [1,2,3,4,5]
>>> b = [[x, True if x == 4 else False] for x in a]
>>> b
[[1, False], [2, False], [3, False], [4, True], [5, False]]
or just:
>>> b = [[x, x == 4] for x in a]
Maybe this?
b=[[x, x==4] for x in a]
>>> a = [1, 2, 3, 4, 5]
>>> b = [[x, x==4] for x in a]
>>> b
[[1, False], [2, False], [3, False], [4, True], [5, False]]
>>>
This takes advantage of the fact that x==4 will return True if x is equal to 4; otherwise, it will return False.
Use the ternary operator to choose different values based on conditions:
conditional_expression ::= or_test ["if" or_test "else" expression]
Example:
>>> [[x,False if x%4 else True] for x in a]
[[1, False], [2, False], [3, False], [4, True], [5, False]]
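If more than one number should be marked True, the same pattern extends naturally to a membership test. A minimal sketch, where the set named flagged is a hypothetical choice of values to mark:

```python
a = [1, 2, 3, 4, 5]
flagged = {2, 4}  # hypothetical set of values that should get True

# `x in flagged` already evaluates to a bool, so no if/else is needed.
b = [[x, x in flagged] for x in a]
print(b)  # [[1, False], [2, True], [3, False], [4, True], [5, False]]
```

Using a set keeps each membership test at roughly constant time, which helps when the list is long.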
