Can the numpy.where function be used for more than one specific value?
I can specify a specific value:
>>> x = numpy.arange(5)
>>> numpy.where(x == 2)[0][0]
2
But I would like to do something like the following. It raises an error as written, of course; the output shown is what I am after.
>>> numpy.where(x in [3,4])[0][0]
[3,4]
Is there a way to do this without iterating through the list and combining the resulting arrays?
EDIT: I also have lists of lists of unknown lengths and unknown values, so I cannot easily form the parameters of np.where() to search for multiple items. It would be much easier to pass a list.
You can use the numpy.in1d function with numpy.where:
import numpy
numpy.where(numpy.in1d(x, [2,3]))
# (array([2, 3]),)
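On NumPy 1.13 or later the same lookup can be written with np.isin, which newer releases recommend over np.in1d (np.in1d is deprecated from NumPy 2.0 onward). A minimal sketch, where the name wanted is only illustrative:

```python
import numpy as np

x = np.arange(5)
wanted = [3, 4]  # any list of values to look for (illustrative name)

# np.isin returns a boolean mask with the same shape as x
mask = np.isin(x, wanted)

# np.where on a 1-D mask returns a one-element tuple of index arrays
indices = np.where(mask)[0]
print(indices)  # [3 4]
```

Because wanted is an ordinary list, this also works when the values are only known at run time.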
I guess np.in1d might help you, instead:
>>> x = np.arange(5)
>>> np.in1d(x, [3,4])
array([False, False, False, True, True], dtype=bool)
>>> np.argwhere(_)
array([[3],
       [4]])
If you only need to check for a few values you can:
import numpy as np
x = np.arange(4)
ret_arr = np.where([x == 1, x == 2, x == 4, x == 0])[1]  # [1] picks the positions within x
print("Ret arr =", ret_arr)
Output:
Ret arr = [1 2 0]
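The stacked comparisons above can also be built by broadcasting, so the values do not have to be typed out one by one. This is only a sketch; values, matches, value_idx, and pos are illustrative names:

```python
import numpy as np

x = np.arange(4)
values = np.array([1, 2, 4, 0])  # values to look for; 4 is not present in x

# Compare every value against every element of x: shape (len(values), len(x))
matches = values[:, None] == x

# Row index = which value matched, column index = where it sits in x
value_idx, pos = np.nonzero(matches)
print(pos)  # [1 2 0]
```

The intermediate array has len(values) * len(x) entries, so this suits small value lists only.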
Related
I want to check if two arrays share at least one common element. For two arrays of equal size I can do the following:
import numpy as np
A = np.array([0,1,2,3,4])
B = np.array([5,6,7,8,9])
print(np.isin(A,B).any())
False
In my task, however, I want to do this over a 2d array of variable size. Example:
A = np.array([[0,1,2,3,4],[3,4,5], [2,4,7], [12,14]], dtype=object)
B = np.array([5,6,7,8,9])
function(A,B)
should return:
[False, True, True, False]
How can this task be performed efficiently?
A = np.array([[0,1,2,3,4], [3,4,5], [2,4,7], [12,14]], dtype=object)  # ragged rows need dtype=object in recent NumPy
B = np.array([5,6,7,8,9])
result = [np.isin(x, B).any() for x in A]
This might be what you're looking for.
Solution without a loop (note that it returns the shared values rather than one flag per row):
import numpy as np
A = np.array([[0,1,2,3,4],[3,4,5], [2,4,7], [12,14]], dtype=object)
B = np.array([5,6,7,8,9])
result = np.intersect1d(np.hstack(A), B)
print(result)
Prints:
[5 7]
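If one flag per row is still wanted while calling np.isin only once, one possible sketch is to flatten the rows, test membership in a single pass, and then reduce each row's segment with np.logical_or.reduceat. The names rows, lengths, and starts are illustrative, and the sketch assumes every row is non-empty, since reduceat does not treat an empty segment as False.

```python
import numpy as np

rows = [[0, 1, 2, 3, 4], [3, 4, 5], [2, 4, 7], [12, 14]]
B = np.array([5, 6, 7, 8, 9])

# Length of each row and the offset where each row starts in the flat array
lengths = np.array([len(r) for r in rows])
starts = np.concatenate(([0], np.cumsum(lengths)[:-1]))

# One membership test over all elements at once
flat = np.concatenate(rows)
hits = np.isin(flat, B)

# OR together the hits inside each row's segment (assumes no empty rows)
result = np.logical_or.reduceat(hits, starts)
print(result.tolist())  # [False, True, True, False]
```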
I attempted to compare slices of a list in Python, but to no avail. Is there a better way to do this?
My Code (Attempt to make slice return True)
a = [1,2,3]
# Slice Assignment
a[0:1] = [0,0]
print(a)
# Slice Comparisons???
print(a[0:2])
print(a[0:2] == True)
print(a[0:2] == [True, True])
My Results
[0, 0, 2, 3]
[0, 0]
False
False
Since slicing returns lists and lists automatically compare element-wise, all you need to do is use ==:
>>> a = [1, 2, 3, 1, 2, 3]
>>> a[:3] == a[3:]
True
To compare to a fixed value, you need a little more effort:
>>> b = [1, 1, 1, 3]
>>> all(e == 1 for e in b[:3])
True
>>> all(e == 1 for e in b[2:])
False
Bonus: if you are doing lots of array calculations, you might benefit from using numpy arrays:
>>> import numpy as np
>>> c = np.array(b)
>>> c[:3] == 1 # this automatically gets applied to all elements
array([ True, True, True])
>>> (c[:3] == 1).all()
True
It is not quite clear what exactly you're trying to do.
As you printed, a[0:2] is [0, 0]. In the first comparison you are comparing a list to a boolean; these are different types, so they are never equal.
In the second one, you are comparing [0, 0] to [True, True]. Python compares lists element by element, and 0 is equal to False, not True, so [0, 0] is clearly not == to [True, True].
Could you edit your question and add what you want the code to do? I would add this in a comment, but I don't have enough rep yet :)
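To make the comparisons concrete, here is a small sketch using the list from the question after the slice assignment. The variable names are only illustrative.

```python
a = [0, 0, 2, 3]

# A list is never equal to a bare boolean
list_vs_bool = a[0:2] == True

# Lists compare element by element, and 0 == False in Python
list_vs_falses = a[0:2] == [False, False]
list_vs_trues = a[0:2] == [True, True]

# To ask "is every element of the slice equal to some value?", use all()
all_zero = all(v == 0 for v in a[0:2])

print(list_vs_bool, list_vs_falses, list_vs_trues, all_zero)
# False True False True
```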
Say I have the following arrays:
a = np.array([1,1,1,2,2,2])
b = np.array([4,6,1,8,2,1])
Is it possible to do the following:
a[np.where(b>3)[0]]
#array([1, 1, 2])
Thus select values from a according to the indices in which a condition in b is satisfied, but using exclusively np.where or a similar numpy function?
In other words, can np.where be used specifying only an array from which to get values when the condition is True? Or is there another numpy function to do this in one step?
Yes, there is a function: numpy.extract(condition, array) returns all values from array that satisfy the condition.
There is not much benefit in using this function over np.where or boolean indexing. All of these approaches create a temporary boolean array that stores the result of b>3. np.where creates an additional index array, while a[b>3] and np.extract use the boolean array directly.
Personally, I would use a[b>3] because it is the tersest form.
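For comparison, a short sketch running the three approaches side by side on the arrays from the question (the via_* names are illustrative):

```python
import numpy as np

a = np.array([1, 1, 1, 2, 2, 2])
b = np.array([4, 6, 1, 8, 2, 1])

# np.where builds an index array first, then a is indexed with it
via_where = a[np.where(b > 3)[0]]

# np.extract and boolean indexing use the boolean array directly
via_extract = np.extract(b > 3, a)
via_mask = a[b > 3]

print(via_where, via_extract, via_mask)  # [1 1 2] [1 1 2] [1 1 2]
```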
Just use boolean indexing.
>>> a = np.array([1,1,1,2,2,2])
>>> b = np.array([4,6,1,8,2,1])
>>>
>>> a[b > 3]
array([1, 1, 2])
b > 3 will give you array([True, True, False, True, False, False]) and with a[b > 3] you select all elements from a where the indexing array is True.
Let's use list comprehension to solve this -
a = np.array([1,1,1,2,2,2])
b = np.array([4,6,1,8,2,1])
indices = [i for i in range(len(b)) if b[i]>3] # Returns indexes of b where b > 3 - [0, 1, 3]
a[indices]
array([1, 1, 2])
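If explicit indices are wanted, as in the comprehension above, np.flatnonzero should give the same positions without a Python-level loop. A minimal sketch:

```python
import numpy as np

a = np.array([1, 1, 1, 2, 2, 2])
b = np.array([4, 6, 1, 8, 2, 1])

# Positions where the condition holds, computed inside NumPy
indices = np.flatnonzero(b > 3)
print(indices)     # [0 1 3]
print(a[indices])  # [1 1 2]
```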
What is the Python equivalent of R's %in% operator? I am trying to filter a pandas DataFrame so that a row remains only if one of its columns holds a value found in my list.
I tried using any() and am having immense difficulty with this.
The pandas docs include a comparison with R that covers this.
s <- 0:4
s %in% c(2,4)
The isin method is similar to R %in% operator:
In [13]: s = pd.Series(np.arange(5),dtype=np.float32)
In [14]: s.isin([2, 4])
Out[14]:
0 False
1 False
2 True
3 False
4 True
dtype: bool
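For the row filtering described in the question, the boolean Series returned by isin can be used to index the DataFrame directly. A minimal sketch with a made-up frame; the column names name and score and the list keep are hypothetical:

```python
import pandas as pd

# Made-up data for illustration only
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "score": [1, 2, 3, 4]})
keep = [2, 4]

# Keep only the rows whose "score" value appears in the list
filtered = df[df["score"].isin(keep)]
print(filtered["name"].tolist())  # ['b', 'd']
```

Prefixing the mask with ~ (that is, df[~df["score"].isin(keep)]) selects the complementary rows.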
FWIW: without having to call pandas, here's the answer using a for loop and a list comprehension in pure Python
x = [2, 3, 5]
y = [1, 2, 3]
# for loop
result = []
for i in x:
    result.append(i in y)
result
Out: [True, True, False]
# list comprehension
[i in y for i in x]
Out: [True, True, False]
If you want to use only numpy without pandas (as in a use case I had), then you can:
import numpy as np
x = np.array([1, 2, 3, 10])
y = np.array([10, 11, 2])
np.isin(y, x)
This is equivalent to:
c(10, 11, 2) %in% c(1, 2, 3, 10)
Note that np.isin is available only in numpy >= 1.13.0; for older versions you'll need to use np.in1d.
As others indicate, the in operator of base Python works well.
myList = ["a00", "b000", "c0"]
"a00" in myList
# True
"a" in myList
# False
I frequently use the numpy.where function to gather a tuple of indices of a matrix having some property. For example
import numpy as np
X = np.random.rand(3,3)
>>> X
array([[ 0.51035326, 0.41536004, 0.37821622],
[ 0.32285063, 0.29847402, 0.82969935],
[ 0.74340225, 0.51553363, 0.22528989]])
>>> ix = np.where(X > 0.5)
>>> ix
(array([0, 1, 2, 2]), array([0, 2, 0, 1]))
ix is now a tuple of ndarray objects that contain the row and column indices, whereas the sub-expression X>0.5 contains a single boolean matrix indicating which cells had the >0.5 property. Each representation has its own advantages.
What is the best way to take ix object and convert it back to the boolean form later when it is desired? For example
>>> G = np.zeros(X.shape, dtype=bool)
>>> G[ix] = True
Is there a one-liner that accomplishes the same thing?
Something like this maybe?
mask = np.zeros(X.shape, dtype='bool')
mask[ix] = True
but if it's something simple like X > 0, you're probably better off doing mask = X > 0 unless mask is very sparse or you no longer have a reference to X.
mask = X > 0
imask = np.logical_not(mask)
Edit: Sorry for being so concise before. Shouldn't be answering things on the phone :P
As the example above shows, it's better to just invert the boolean mask. That is much more efficient and easier than going back from the result of where.
The bottom of the np.where docstring suggests using np.in1d for this.
>>> x = np.array([1, 3, 4, 1, 2, 7, 6])
>>> indices = np.where(x % 3 == 1)[0]
>>> indices
array([0, 2, 3, 5])
>>> np.in1d(np.arange(len(x)), indices)
array([ True, False, True, True, False, True, False], dtype=bool)
(While this is a nice one-liner, it is a lot slower than @Bi Rico's solution.)
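For the 2-D index tuple in the question, the same idea can be sketched by flattening the indices with np.ravel_multi_index first. This is an assumption-laden illustration rather than a recommended approach: it uses np.isin (NumPy 1.13+) in place of np.in1d, and it fixes X to the values printed in the question so the result is reproducible.

```python
import numpy as np

# Values copied from the question's printed X so the example is deterministic
X = np.array([[0.51035326, 0.41536004, 0.37821622],
              [0.32285063, 0.29847402, 0.82969935],
              [0.74340225, 0.51553363, 0.22528989]])
ix = np.where(X > 0.5)

# Two-step version: allocate a False mask and set the indexed cells
mask_assign = np.zeros(X.shape, dtype=bool)
mask_assign[ix] = True

# One-liner: turn (row, col) pairs into flat positions and test membership
mask_oneliner = np.isin(np.arange(X.size),
                        np.ravel_multi_index(ix, X.shape)).reshape(X.shape)

print(np.array_equal(mask_assign, mask_oneliner))  # True
```

The assignment version does less work, so the one-liner mainly helps where an expression is required.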