Numpy slice of a slice assignment fails unexpectedly - python

This attempt at a slice+assignment operation fails unexpectedly:
>>> x = np.array([True, True, True, True])
>>> x[x][0:2] = False
>>> x
array([ True, True, True, True])
I'd like to understand why the above simplified code snippet fails to assign the underlying array values.
Seemingly equivalent slicing+assignment operations do work, for example:
>>> x = np.array([True, True, True, True])
>>> x[0:4][0:2] = False
>>> x
array([False, False, True, True])
np.version.version == 1.17.0

This does not work because x[x] is not a "view" but a copy: indexing with a boolean mask always creates a new array. The assignment then goes to a slice of that temporary copy, which is never stored anywhere, so x itself is unchanged. Indeed, if we evaluate x[x], we see it has no base:
>>> x[x].base is None
True
We can however assign to the first two, or last five, etc. items, by first calculating the indices:
>>> x = np.array([True, True, True, True])
>>> x[np.where(x)[0][:2]] = False
>>> x
array([False, False, True, True])
Here np.where(x) will return a 1-tuple that contains the indices for which x is True:
>>> np.where(x)
(array([0, 1, 2, 3]),)
We then slice that index array and assign through the sliced indices, which writes directly into x.
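To make the copy-versus-view distinction concrete, here is a minimal sketch using np.shares_memory. The variable names (masked, sliced, before_view_write) are just illustrative choices, not part of the original question.

```python
import numpy as np

x = np.array([True, True, True, True])

masked = x[x]    # boolean-mask indexing: a new array (copy)
sliced = x[0:4]  # basic slicing: a view onto x's buffer

copy_shares = np.shares_memory(x, masked)  # False
view_shares = np.shares_memory(x, sliced)  # True

# Writing through the copy leaves x untouched ...
masked[0:2] = False
before_view_write = x.copy()

# ... while writing through the view changes x.
sliced[0:2] = False

print(copy_shares, view_shares)
print(before_view_write)
print(x)
```

Only the write through the view reaches x, which matches the behaviour seen in the question.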

Related

element wise "contains" in Python

Say I have an array:
import numpy as np
arr = np.random.randint(0, 5, 20)
then arr>3 results in an array of type bool with shape (20,). How can I most efficiently do the same thing with the "contains" operator? The simple
arr in [2, 4]
will result in "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()". Is there another way than
np.array([ x in [2, 4] for x in arr])
?
You can use np.isin:
import numpy as np
arr = np.random.randint(0, 5, 20)
np.isin(arr, [2, 4])
Output:
array([False, True, False, False, False, True, False, False, True,
False, False, False, True, False, False, True, False, True,
True, True])
The function returns a boolean array of the same shape as arr that is True where an element of arr is in the second argument [2, 4] and False otherwise.
pandas offers this through pd.Series.isin, and NumPy through the np.isin function (an np.ndarray has no isin method of its own). So far I don't know of any other array module that provides this.
import numpy as np
import pandas as pd
a = pd.Series([0,1,2,3,4,5,6,7,8,9])
print(a.isin([0,3]).any()) # returns True
print(np.isin(a.values, [0,3]).any()) # returns True (a.values is np.ndarray)
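For a reproducible illustration, here is a small sketch with a fixed array instead of random data (the values are made up for the example). It also shows the invert parameter of np.isin, which gives the "not in" mask.

```python
import numpy as np

arr = np.array([0, 2, 4, 1, 3, 4])

# True where the element of arr is one of 2 or 4
mask = np.isin(arr, [2, 4])

# True where the element of arr is NOT one of 2 or 4
not_mask = np.isin(arr, [2, 4], invert=True)

print(mask.tolist())
print(arr[mask].tolist())
print(arr[not_mask].tolist())
```

The mask can be used directly for boolean indexing, as in the last two lines.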

Logical operation between two Boolean lists

I get a weird result when I try to apply the and or the or operator to two Boolean lists in Python. I actually get the exact opposite of what I was expecting.
[True, False, False] and [True, True, False]
> [True, True, False]
[True, False, False] or [True, True, False]
> [True, False, False]
Is that normal, and if yes, why?
If what you actually wanted was element-wise boolean operations between your two lists, consider using the numpy module:
>>> import numpy as np
>>> a = np.array([True, False, False])
>>> b = np.array([True, True, False])
>>> a & b
array([ True, False, False], dtype=bool)
>>> a | b
array([ True, True, False], dtype=bool)
This is normal, because and and or actually evaluate to one of their operands. x and y is like
def and_(x, y):  # "and" is a reserved word, hence the trailing underscore
    if x:
        return y
    return x
while x or y is like
def or_(x, y):  # "or" is a reserved word, hence the trailing underscore
    if x:
        return x
    return y
Since both of your lists contain values, they are both "truthy" so and evaluates to the second operand, and or evaluates to the first.
I think you need something like this:
[x and y for x, y in zip([True, False, False], [True, True, False])]
Both lists are truthy because they are non-empty.
Both and and or return the operand that decided the operation's value.
If the left side of and is truthy, then it must evaluate the right side, because it could be falsy, which would make the entire operation false (false and anything is false). Therefore, it returns the right side.
If the left side of or is truthy, it does not need to evaluate the right side, because it already knows that the expression is true (true or anything is true). So it returns the left side.
If you wish to perform pairwise comparisons of items in the list, use a list comprehension, e.g.:
[x or y for (x, y) in zip(a, b)] # a and b are your lists
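The operand-returning behaviour described above can be checked with identity tests. This is only a small sketch; the names a, b and empty are illustrative.

```python
a = [True, False, False]
b = [True, True, False]

# Both lists are non-empty, hence truthy.
and_result = a and b  # left side truthy, so the right operand is returned
or_result = a or b    # left side truthy, so the left operand is returned

print(and_result is b)  # True
print(or_result is a)   # True

# An empty list is falsy, so the outcome flips.
empty = []
print(empty and b)  # []
print(empty or b)   # [True, True, False]

# Element-wise results need an explicit loop over pairs.
pairwise_and = [x and y for x, y in zip(a, b)]
pairwise_or = [x or y for x, y in zip(a, b)]
print(pairwise_and)  # [True, False, False]
print(pairwise_or)   # [True, True, False]
```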
Your lists aren't comparing each individual value, they're comparing the existence of values in the list.
For any truthy variables a and b:
a and b
> b #The program evaluates a, a is truthy, it evaluates b, b is truthy, so it returns the last evaluated value, b.
a or b
> a #The program evaluates a, a is truthy, so the or statement is true, so it returns the last evaluated value, a.
Now, truthy depends on the type. For example, integers are truthy for my_int != 0, and are falsy for my_int == 0. So if you have:
a = 0
b = 1
a or b
> b #The program evaluates a, a is falsy, so the or statement goes on to evaluate b, b is truthy, so the or statement is true and it returns the last evaluated value b.
Very convenient way:
>>> import numpy as np
>>> np.logical_and([True, False, False], [True, True, False])
array([ True, False, False], dtype=bool)
>>> np.logical_or([True, False, False], [True, True, False])
array([ True, True, False], dtype=bool)
More functional:
from operator import or_, and_
from itertools import starmap
a = [True, False, False]
b = [True, True, False]
# starmap returns a lazy iterator, so wrap it in list() to see the values
list(starmap(or_, zip(a,b))) # [True, True, False]
list(starmap(and_, zip(a,b))) # [True, False, False]

Python: numpy array larger and smaller than a value

How do I find the numbers that fall within a range?
>>> import numpy as np
>>> c = np.array([2,3,4,5,6])
>>> c>3
array([False, False, True, True, True])
However, when I test c against two bounds at once, it raises an error:
>>> 2<c<5
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
The desired output is
array([False, True, True, False, False])
Try this,
(c > 2) & (c < 5)
Result
array([False, True, True, False, False], dtype=bool)
Python evaluates 2<c<5 as (2<c) and (c<5), which would be valid, except that the and keyword doesn't work the way we would want with numpy arrays. (It attempts to cast each array to a single boolean, and that behavior can't be overridden.) So for a vectorized and operation with numpy arrays you need to do this:
(2<c) & (c<5)
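A short sketch that puts the failing chained comparison and the working element-wise version side by side. The variable names (error_message, in_range, same) are illustrative only.

```python
import numpy as np

c = np.array([2, 3, 4, 5, 6])

# The chained form expands to (2 < c) and (c < 5); "and" asks for the
# truth value of a whole array, which NumPy refuses to guess.
error_message = None
try:
    2 < c < 5
except ValueError as exc:
    error_message = str(exc)

# The element-wise form combines the two masks position by position.
in_range = (2 < c) & (c < 5)

# np.logical_and is an equivalent spelling.
same = np.logical_and(c > 2, c < 5)

print(error_message)
print(in_range.tolist())  # [False, True, True, False, False]
print(c[in_range].tolist())  # [3, 4]
```

The parentheses around each comparison matter, because & binds more tightly than < and >.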
You can do something like this :
import numpy as np
c = np.array([2,3,4,5,6])
output = [(i and j) for i, j in zip(c>2, c<5)]
Output :
[False, True, True, False, False]

python: convert ascii character to boolean array

I have a character. I want to represent its ascii value as a numpy array of booleans.
This works, but seems contorted. Is there a better way?
bin_str = bin(ord(mychar))
bool_array = array([int(x)>0 for x in list(bin_str[2:])], dtype=bool)
for
mychar = 'd'
the desired resulting value for bool_array is
array([ True, True, False, False, True, False, False], dtype=bool)
You can extract the bits from a uint8 array directly using np.unpackbits:
np.unpackbits(np.array(ord(mychar), dtype=np.uint8))
EDIT: To get only the 7 relevant bits in a boolean array:
np.unpackbits(np.array(ord(mychar), dtype=np.uint8)).astype(bool)[1:]
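If this is needed in more than one place, the unpackbits approach could be wrapped in a small helper. The function name char_to_bits and its width parameter are hypothetical, introduced only for this sketch; it assumes the character code fits in the requested number of bits (at most 8).

```python
import numpy as np

def char_to_bits(ch, width=7):
    """Return the low `width` bits of ord(ch) as a boolean array, MSB first.

    Hypothetical helper for illustration; supports 1 <= width <= 8.
    """
    if not 1 <= width <= 8:
        raise ValueError("width must be between 1 and 8")
    code = ord(ch)
    if code >= 2 ** width:
        raise ValueError("character code does not fit in the given width")
    # unpackbits always yields 8 bits per uint8 value, most significant first
    bits = np.unpackbits(np.array([code], dtype=np.uint8))
    return bits[-width:].astype(bool)

print(char_to_bits('d').tolist())
# [True, True, False, False, True, False, False]
```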
This is more or less the same thing:
>>> import numpy as np
>>> mychar = 'd'
>>> np.array(list(np.binary_repr(ord(mychar), width=7))) == '1'
array([ True, True, False, False, True, False, False], dtype=bool)
Is it less contorted?

Use of python's logical operators when slicing a numpy array

I would like to perform a slicing on a two dimensional numpy array:
type1_c = type1_c[
(type1_c[:,10]==2) or
(type1_c[:,10]==3) or
(type1_c[:,10]==4) or
(type1_c[:,10]==5) or
(type1_c[:,10]==6)
]
The syntax looks right; however I got the following error message:
'The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()'
I really don't understand what's going wrong. Any idea?
The or operator is unambiguous between two scalars, but what is the right generalization to vectors? If x == array([0, 0]) and y == array([0, 1]), should x or y be (1) False, because not all pairwise terms or-ed together are True; (2) True, because at least one pairwise or result is true; (3) array([0, 1]), because that is the pairwise result of an or; or (4) array([0, 0]), because [0, 0] or [0, 1] would return [0, 0], since nonempty lists are truthy and so arrays should be too?
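Each of the candidate interpretations has its own explicit spelling in NumPy, which is why the bare or refuses to pick one. A minimal sketch using the same x and y as above (the result names are illustrative):

```python
import numpy as np

x = np.array([0, 0])
y = np.array([0, 1])

# Pairwise "or": one result per position
pairwise = np.logical_or(x, y)

# Reduce the pairwise result to a single truth value, two different ways
all_true = bool(pairwise.all())  # are all pairwise results true?
any_true = bool(pairwise.any())  # is at least one pairwise result true?

# The plain keyword needs bool(x), which is ambiguous for 2 elements
try:
    x or y
    raised = False
except ValueError:
    raised = True

print(pairwise.tolist())  # [False, True]
print(all_true, any_true, raised)  # False True True
```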
You could use | here, and treat it as a bitwise issue:
>>> import numpy as np
>>> vec = np.arange(10)
>>> vec[(vec == 2) | (vec == 7)]
array([2, 7])
Or explicitly use NumPy's vectorized logical or:
>>> np.logical_or(vec==3, vec==5)
array([False, False, False, True, False, True, False, False, False, False], dtype=bool)
>>> vec[np.logical_or(vec==3, vec==5)]
array([3, 5])
Or use in1d, which is far more efficient here (the NumPy documentation recommends the equivalent np.isin for new code):
>>> np.in1d(vec, [2, 7])
array([False, False, True, False, False, False, False, True, False, False], dtype=bool)
>>> vec[np.in1d(vec, [2, 7])]
array([2, 7])
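Applied to the two-dimensional case from the question, a sketch could look like the following. The array contents are made-up stand-in data (the real type1_c is not shown in the question); only column index 10 and the wanted values 2 to 6 come from the original.

```python
import numpy as np

# Stand-in for type1_c: 6 rows, 11 columns, so that column 10 exists
type1_c = np.zeros((6, 11), dtype=int)
type1_c[:, 10] = [1, 2, 3, 7, 6, 0]

col = type1_c[:, 10]

# Membership test against all wanted values in one call
mask = np.isin(col, [2, 3, 4, 5, 6])

# Equivalent mask built from element-wise | instead of "or"
mask_or = (col == 2) | (col == 3) | (col == 4) | (col == 5) | (col == 6)

# Boolean mask on the first axis keeps only the matching rows
filtered = type1_c[mask]

print(mask.tolist())             # [False, True, True, False, True, False]
print(filtered[:, 10].tolist())  # [2, 3, 6]
```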
