I have a 2D numpy array of boolean masks with n rows where each row is an array of m masks.
maskArr = [
[[True, False, True, False], [True, True, False, True], [True, True, False, True]],
[[False, True, False, True], [False, True, True, True], [True, True, False, True]],
[[True, False, True, False], [True, True, False, True], [True, True, False, True]],
[[False, True, False, True], [False, True, True, True], [True, True, False, True]],
[[True, False, True, False], [True, True, False, True], [True, True, False, True]],
[[False, True, False, True], [False, True, True, True], [True, True, False, True]]
]
Is there a way to vectorize the combining of mask arrays in each row to get the following result?
combinedMaskArr = [
[True, False, False, False],
[False, True, False, True],
[True, False, False, False],
[False, True, False, True],
[True, False, False, False],
[False, True, False, True]
]
Thank you for any guidance or suggestions in advance.
You're testing whether all elements are True along a specific axis. Use np.all:
np.all(maskArr, axis=1)
Output
array([[ True, False, False, False],
[False, True, False, True],
[ True, False, False, False],
[False, True, False, True],
[ True, False, False, False],
[False, True, False, True]])
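As a quick check, here is a minimal sketch on the first two rows of the question's data. The name mask_arr is illustrative, and np.logical_and.reduce is shown only as an equivalent spelling; it is not something the answer above used.

```python
import numpy as np

# Shape (2, 3, 4): 2 rows, 3 masks per row, 4 values per mask.
mask_arr = np.array([
    [[True, False, True, False], [True, True, False, True], [True, True, False, True]],
    [[False, True, False, True], [False, True, True, True], [True, True, False, True]],
])

# AND the 3 masks of each row together, position by position.
combined = np.all(mask_arr, axis=1)

# Equivalent formulation via the ufunc's reduce method.
same = np.logical_and.reduce(mask_arr, axis=1)

print(combined)
```

axis=1 is the axis that indexes the masks within a row, so that is the axis that gets collapsed.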
I need all permutations of a bool array. The following code does what I want, but it is inefficient:
from itertools import permutations
import numpy as np
n1=2
n2=3
a = np.array([True]*n1+[False]*n2)
perms = set(permutations(a))
However, it is inefficient and fails for long arrays. Is there a more efficient implementation?
What about enumerating the combinations of indices of the True values?
from itertools import combinations
import numpy as np
a = np.arange(n1+n2)
out = [np.isin(a, x).tolist() for x in combinations(range(n1+n2), r=n1)]
Output:
[[True, True, False, False, False],
[True, False, True, False, False],
[True, False, False, True, False],
[True, False, False, False, True],
[False, True, True, False, False],
[False, True, False, True, False],
[False, True, False, False, True],
[False, False, True, True, False],
[False, False, True, False, True],
[False, False, False, True, True]]
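If a single boolean array is more convenient than a list of lists, the same index combinations can be written into a preallocated array with fancy indexing. This is a sketch of a variation on the approach above, not part of the original answer; idx and out are illustrative names, and it assumes n1 >= 1.

```python
from itertools import combinations
import numpy as np

n1, n2 = 2, 3          # number of True and False values, as in the question
n = n1 + n2

# One row per combination: the positions that should be True.
idx = np.array(list(combinations(range(n), n1)))   # shape (C(n, n1), n1)

# Start with all False, then switch on the chosen positions row by row.
out = np.zeros((len(idx), n), dtype=bool)
out[np.arange(len(idx))[:, None], idx] = True

print(out.shape)
```

Each row of out has exactly n1 True values, and no row is repeated because each combination of positions is distinct.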
I am using Platypus in Python. The problem is, after the optimizations, when I need to see the variables it shows them in boolean format.
When I change the variable types to "Real", the variable results are shown as real numbers, but when I choose the "Integer" type it shows them as booleans.
So this is what one of the individuals looks like:
[[True, False, True], [False, True, True, False, True, True, False, True], [False, True, True], [False, False, True, False, True, True, False, True], [True, False, True], [False, False, True, True, False, False, True, False], [False, True, True], [False, False, True, False, False, False, True, False], [True, True, False], [False, False, False, True, True, False, False, True], [False, False, True], [False, True, True, False, True, True, True, False], [True, True, False], [False, True, True, True, True, True, False, False], [True, True, False], [False, False, False, True, False, True, False, True], [False, True, False], [False, True, False, True, False, True, True, True], [True, True, True], [False, True, True, False, True, False, True, True]]
Before entering the function, they are integers (I am doing this intentionally) but, after entering the function, the variables become a mixture of integers and real numbers.
I have a 2D array of masks that I want to collapse along axis 0 using logical OR operation for values that are True. I was wondering whether there was a numpy function to do this process. My code looks something like:
>>> all_masks
array([[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, True, False, ..., False, True, False],
[False, False, False, ..., False, False, False],
[False, True, False, ..., False, True, False]])
>>> all_masks.shape
(6, 870)
>>> output_mask
array([False, True, False, ..., False, True, False])
>>> output_mask.shape
(870,)
I currently build output_mask with a for loop. However, I know a for loop makes my code slower (and kinda messy), so I was wondering whether this could be done with a numpy function or similar?
Code for collapsing masks using for loop:
mask_out = np.zeros(all_masks.shape[1], dtype=bool)
for mask in all_masks:
    mask_out = mask_out | mask
return mask_out
You can use ndarray.any:
all_masks = np.array([[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, True, False, False, True, False],
[False, False, False, False, False, False],
[False, True, False, False, True, False]])
all_masks.any(axis=0)
Output:
array([False, True, False, False, True, False])
You could use np.logical_or.reduce:
In [200]: all_masks = np.array([[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, True, False, False, True, False],
[False, False, False, False, False, False],
[False, True, False, False, True, False]])
In [201]: np.logical_or.reduce(all_masks, axis=0)
Out[201]: array([False, True, False, False, True, False])
np.logical_or is a ufunc, and every ufunc has a reduce method.
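To confirm that both suggestions match the loop from the question, here is a small sketch. The 3x4 array is made up for illustration; mask_out follows the question's loop, while via_any and via_reduce are hypothetical names for the two answers.

```python
import numpy as np

all_masks = np.array([[False, False, False, False],
                      [False, True, False, True],
                      [True, False, False, True]])

# The loop from the question: OR every row into an accumulator.
mask_out = np.zeros(all_masks.shape[1], dtype=bool)
for mask in all_masks:
    mask_out = mask_out | mask

# The two vectorized alternatives suggested above.
via_any = all_masks.any(axis=0)
via_reduce = np.logical_or.reduce(all_masks, axis=0)

print(via_any)
```

All three give the same mask; any(axis=0) is the shortest to read, while the reduce form generalizes to other ufuncs such as np.logical_and or np.logical_xor.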
TLDR; How do I set values in a numpy array dependent on values in columns to the left of each value...?
I am running some simulations where I am predicting survival rates; below is the core of what I'm trying to do. I predict survival at a series of discrete points in time, represented as True and False values. Each row is a simulation, and each column is a point in time (i.e. col 0 is the first point in time, and col 1 is the one after it).
mc = (8, 4)
survival = np.random.random(mc) > np.random.random(mc)
survival
This will give me output like this.
array([[False, True, True, False],
[True, False, True, False],
[ True, True, True, True], ...
But if something dies in the first point in time, it is dead forever. So my output needs to be:
array([[False, False, False, False],
[True, False, False, False],
[ True, True, True, True], ...
So for a row, I want to set everything False to the right of the first False I find. Is there a way to do this without two nested loops? I'm looking for a better approach but struggling to know if I can do this with built-in functions.
The perfect tool exists:
np.logical_and.accumulate(survival,axis=1)
Example :
array([[False, True, False, True],
[ True, True, False, True],
[False, True, True, True],
[False, True, False, False],
[ True, False, False, False],
[False, True, True, True],
[False, False, True, False],
[False, False, True, True]])
=>
array([[False, False, False, False],
[ True, True, False, False],
[False, False, False, False],
[False, False, False, False],
[ True, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False]])
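A small check against the three rows shown in the question (a sketch; the array literal is copied from the question rather than generated randomly, and dead_forever is an illustrative name):

```python
import numpy as np

survival = np.array([[False, True, True, False],
                     [True, False, True, False],
                     [True, True, True, True]])

# Running AND along each row: once a False appears, everything after it is False.
dead_forever = np.logical_and.accumulate(survival, axis=1)

print(dead_forever)
```

This reproduces the desired output from the question: the first row becomes all False, the second keeps only its leading True, and the third is unchanged.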
Try not to use pure for loops when working with numpy arrays.
Instead, use a cumulative product along axis=1:
survival.cumprod(axis=1).astype(bool)
>>> mc = (8, 4)
>>> survival = np.random.random(mc) > np.random.random(mc)
>>> survival
array([[ True, True, True, True],
[ True, False, False, True],
[ True, False, True, True],
[ True, False, True, False],
[False, True, False, False],
[ True, True, False, True],
[ True, True, False, False],
[False, False, True, True]])
and
>>> death = [x.tolist().index(False) if False in x else -1 for x in survival]
>>> [s[ : d].tolist() + [False] * (survival.shape[1] - d) if d != -1 else s.tolist() for s, d in zip(survival, death)]
[[True, True, True, True],
[True, False, False, False],
[True, False, False, False],
[True, False, False, False],
[False, False, False, False],
[True, True, False, False],
[True, True, False, False],
[False, False, False, False]]
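For comparison, here is a sketch of the cumulative-product idea applied to the same 8x4 array printed above. The array literal is copied from that session; the name result is illustrative. A running product of 0/1 values stays 1 only until the first 0 appears.

```python
import numpy as np

survival = np.array([[True, True, True, True],
                     [True, False, False, True],
                     [True, False, True, True],
                     [True, False, True, False],
                     [False, True, False, False],
                     [True, True, False, True],
                     [True, True, False, False],
                     [False, False, True, True]])

# cumprod treats True/False as 1/0; cast the integer result back to bool.
result = survival.cumprod(axis=1).astype(bool)

print(result)
```

The result agrees with the list-comprehension output above, without any Python-level loop over rows.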
By using np.argwhere:
import numpy as np
bob = np.array([[True,True,False,True,True],[True,True,False,True,True],[False,True,True,True,True],[True,True,False,True,True],[False,True,True,True,True]])
for row, col in np.argwhere(bob == False):
    bob[row, col:] = False
np.argwhere returns the (row, column) of each False; I use those values to set the rest of that row to False (from each False onward).
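The loop above still iterates in Python once per False value. A loop-free alternative (not from the answer above; a sketch using argmin and a broadcast comparison, with first_false and result as illustrative names) is to find the first False in each row and compare column indices against it:

```python
import numpy as np

bob = np.array([[True, True, False, True, True],
                [True, True, False, True, True],
                [False, True, True, True, True],
                [True, True, False, True, True],
                [False, True, True, True, True]])

n_cols = bob.shape[1]

# argmin on a bool array gives the index of the first False in each row.
# Rows with no False at all get n_cols, so they stay entirely True.
first_false = np.where(bob.all(axis=1), n_cols, bob.argmin(axis=1))

# A column survives only if it lies strictly left of the first False.
result = np.arange(n_cols) < first_false[:, None]

print(result)
```

Unlike the argwhere loop, this builds a new array instead of modifying bob in place.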
How can I convert this loop code to vector notation? I tried a bunch of things including trying to get a logical_and but it doesn't broadcast.
import numpy as np
coord_mask = np.zeros((10, 5), dtype=bool)
latx = np.random.choice(a=[False, True], size=10)
laty = np.random.choice(a=[False, True], size=5)
for i in range(0, coord_mask.shape[0]):
    for j in range(0, coord_mask.shape[1]):
        coord_mask[i, j] = latx[i] * laty[j]
print(coord_mask)
Can anyone help?
Take your pick:
In [629]: coord_mask = np.zeros((10, 5), dtype=bool)
     ...: latx = np.random.choice(a=[False, True], size=10)
     ...: laty = np.random.choice(a=[False, True], size=5)
     ...:
     ...: for i in range(0, coord_mask.shape[0]):
     ...:     for j in range(0, coord_mask.shape[1]):
     ...:         coord_mask[i, j] = latx[i] * laty[j]
     ...:
In [630]: coord_mask
Out[630]:
array([[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, False, False, False, False]], dtype=bool)
Broadcasted multiplication: the None turns latx into an (n,1) column array, which multiplies an (m,) laty (treated as (1,m)), producing an (n,m) result. This is a very convenient and powerful numpy tool.
In [631]: latx[:,None]*laty
Out[631]:
array([[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, False, False, False, False]], dtype=bool)
Outer product with np.outer:
In [632]: np.outer(latx, laty)
Out[632]:
array([[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, False, False, False, False]], dtype=bool)
einsum, a generalization of the dot product:
In [633]: np.einsum('i,j',latx, laty)
Out[633]:
array([[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, False, False, False, False]], dtype=bool)
With the broadcasting approach you can substitute another binary operation like &:
In [634]: latx[:,None] & laty
Out[634]:
array([[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, True, True, True, False],
[False, False, False, False, False],
[False, False, False, False, False]], dtype=bool)
Seems to me that
coord_mask = np.outer(latx, laty)
should do the trick.
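To see that these options agree with the original double loop, here is a sketch with small fixed inputs instead of random ones. The values of latx and laty are made up for illustration; np.logical_and.outer is an extra variant that the answers above did not mention, relying on the fact that every binary ufunc has an outer method.

```python
import numpy as np

latx = np.array([False, True, True, False])
laty = np.array([True, False, True])

# Reference: the double loop from the question.
coord_mask = np.zeros((latx.size, laty.size), dtype=bool)
for i in range(coord_mask.shape[0]):
    for j in range(coord_mask.shape[1]):
        coord_mask[i, j] = latx[i] * laty[j]

# latx[:, None] has shape (4, 1); broadcasting against (3,) gives (4, 3).
by_mul = latx[:, None] * laty
by_and = latx[:, None] & laty
by_outer = np.outer(latx, laty)
by_ufunc_outer = np.logical_and.outer(latx, laty)

print(by_and)
```

For boolean masks the & form arguably states the intent most directly, since the operation is a logical AND rather than arithmetic.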