Populate numpy matrix dynamically from array values? - python

I'm trying to dynamically construct a 2-D matrix with numpy based on the values of an array, like this:
In [113]: A = np.zeros((5,5),dtype=bool)
In [114]: A
Out[114]: array([[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False]], dtype=bool)
In [116]: B = np.array([0,1,3,0,2])
In [117]: B
Out[117]: array([0, 1, 3, 0, 2])
Now, I'd like to use the values of B to set the first n values of each row of A to True, where n is the corresponding entry of B. For this A and B, the correct output would be:
In [118]: A
Out[118]: array([[False, False, False, False, False],
[ True, False, False, False, False],
[ True, True, True, False, False],
[False, False, False, False, False],
[ True, True, False, False, False]], dtype=bool)
The length of B will always equal the number of rows of A, and the values of B will always be less than or equal to the number of columns of A. The size of A and the values of B are constantly changing, so I need to build these on the fly.
I'm certain that this has a simple(-ish) solution in numpy, but I've spent the last hour banging my head against variations of repeat, tile, and anything else I can think of. Can anyone help me out before I give myself a concussion? :)
EDIT: I'm going to need to do this a lot, so speed will be an issue. The only version that I can come up with for now is something like:
np.vstack([ [True]*x + [False]*(500-x) for x in B ])
but I expect that this will be slow due to the for loop (I would time it if I had anything to compare it to).

How about:
>>> A = np.zeros((5, 7),dtype=bool)
>>> B = np.array([0,1,3,0,2])
>>> (np.arange(len(A[0])) < B[:,None])
array([[False, False, False, False, False, False, False],
[ True, False, False, False, False, False, False],
[ True, True, True, False, False, False, False],
[False, False, False, False, False, False, False],
[ True, True, False, False, False, False, False]], dtype=bool)
(I changed the shape from (5,5) because I was getting confused about which axis was which, and I wanted to make sure I was using the right one.)
[Simplified from (np.arange(len(A[0]))[:,None] < B).T -- if we expand B and not A, there's no need for the transpose.]
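A minimal sketch of the same broadcasting comparison wrapped in a reusable function, so the matrix can be rebuilt whenever B or the column count changes. The helper name prefix_mask and its parameters are made up for illustration and are not part of numpy.

```python
import numpy as np

def prefix_mask(counts, n_cols):
    """Return a boolean matrix whose row i has its first counts[i] entries True.

    prefix_mask is a hypothetical helper name, not a numpy function.
    """
    counts = np.asarray(counts)
    # Column indices have shape (n_cols,); counts[:, None] has shape (n_rows, 1).
    # Broadcasting the comparison gives a result of shape (n_rows, n_cols).
    return np.arange(n_cols) < counts[:, None]

B = np.array([0, 1, 3, 0, 2])
A = prefix_mask(B, 5)
print(A.astype(int))
```

Because the comparison is a single vectorized operation, no Python-level loop over the rows is needed, unlike the vstack version in the question.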

Related

Collapse mask array along axis - Numpy in Python

I have a 2D array of masks that I want to collapse along axis 0 using logical OR operation for values that are True. I was wondering whether there was a numpy function to do this process. My code looks something like:
>>> all_masks
array([[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, True, False, ..., False, True, False],
[False, False, False, ..., False, False, False],
[False, True, False, ..., False, True, False]])
>>> all_masks.shape
(6, 870)
>>> output_mask
array([False, True, False, ..., False, True, False])
>>> output_mask.shape
(870,)
I currently produce output_mask with a for loop. However, I know that a for loop makes my code slower (and kind of messy), so I was wondering whether this could be done with a numpy function or something similar.
Code for collapsing masks using for loop:
mask_out = np.zeros(all_masks.shape[1], dtype=bool)
for mask in all_masks:
    mask_out = mask_out | mask
return mask_out
You can use ndarray.any:
all_masks = np.array([[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, True, False, False, True, False],
[False, False, False, False, False, False],
[False, True, False, False, True, False]])
all_masks.any(axis=0)
Output:
array([False, True, False, False, True, False])
You could use np.logical_or.reduce:
In [200]: all_masks = np.array([[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, True, False, False, True, False],
[False, False, False, False, False, False],
[False, True, False, False, True, False]])
In [201]: np.logical_or.reduce(all_masks, axis=0)
Out[201]: array([False, True, False, False, True, False])
np.logical_or is a ufunc, and every ufunc has a reduce method.
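As a quick check, the loop from the question and the two vectorized answers can be compared on a mask array of the same shape. This is only a sketch: the random data and the fixed seed are assumptions made so the example is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed only so the sketch is repeatable
all_masks = rng.random((6, 870)) > 0.8  # made-up masks, same shape as in the question

# The explicit loop from the question: OR the rows together one at a time.
loop_result = np.zeros(all_masks.shape[1], dtype=bool)
for mask in all_masks:
    loop_result = loop_result | mask

# The two vectorized equivalents suggested in the answers.
any_result = all_masks.any(axis=0)
reduce_result = np.logical_or.reduce(all_masks, axis=0)

print(np.array_equal(loop_result, any_result))
print(np.array_equal(loop_result, reduce_result))
```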

Numpy: Set false where anything to the left is false

TLDR; How do I set values in a numpy array dependent on values in columns to the left of each value...?
I am running some simulations where I am predicting survival rates, but below is the core of what I'm trying to do. I predict a bunch of discrete point in time survivals, represented as True and Falses. Each row is a simulation, and each column is a point in time (i.e. col 0 is the first point in time, col 1 is subsequent to that)
mc = (8, 4)
survival = np.random.random(mc) > np.random.random(mc)
survival
This will give me output like this.
array([[False, True, True, False],
[True, False, True, False],
[ True, True, True, True], ...
But if something dies in the first point in time, it is dead forever. So my output needs to be:
array([[False, False, False, False],
[True, False, False, False],
[ True, True, True, True], ...
So for a row, I want to set everything False to the right of the first False I find. Is there a way to do this without two nested loops? I'm looking for a better approach but struggling to know if I can do this with built-in functions.
The perfect tool exists :
np.logical_and.accumulate(survival,axis=1)
Example :
array([[False, True, False, True],
[ True, True, False, True],
[False, True, True, True],
[False, True, False, False],
[ True, False, False, False],
[False, True, True, True],
[False, False, True, False],
[False, False, True, True]])
=>
array([[False, False, False, False],
[ True, True, False, False],
[False, False, False, False],
[False, False, False, False],
[ True, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False]])
Try not to use pure for loops when working with numpy arrays.
Instead, use a cumulative product along axis=1 and convert the result back to booleans:
arr.cumprod(1).astype(bool)
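The accumulate and cumulative-product approaches should give the same result. Here is a small sketch on the three rows shown in the question; the variable names via_accumulate and via_cumprod are only illustrative.

```python
import numpy as np

# The three rows shown in the question.
survival = np.array([[False, True,  True,  False],
                     [True,  False, True,  False],
                     [True,  True,  True,  True]])

# Running logical AND along each row: once a False appears, every later entry is False.
via_accumulate = np.logical_and.accumulate(survival, axis=1)

# Running product along each row: a single 0 (False) zeroes out everything after it.
via_cumprod = survival.cumprod(axis=1).astype(bool)

print(via_accumulate)
print(np.array_equal(via_accumulate, via_cumprod))
```

The built-in bool is used as the dtype here because the np.bool alias is not available in every numpy release.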
>>> mc = (8, 4)
>>> survival = np.random.random(mc) > np.random.random(mc)
>>> survival
array([[ True, True, True, True],
[ True, False, False, True],
[ True, False, True, True],
[ True, False, True, False],
[False, True, False, False],
[ True, True, False, True],
[ True, True, False, False],
[False, False, True, True]])
and
>>> death = [x.tolist().index(False) if False in x else -1 for x in survival]
>>> [s[ : d].tolist() + [False] * (survival.shape[1] - d) if d != -1 else s.tolist() for s, d in zip(survival, death)]
[[True, True, True, True],
[True, False, False, False],
[True, False, False, False],
[True, False, False, False],
[False, False, False, False],
[True, True, False, False],
[True, True, False, False],
[False, False, False, False]]
By using np.argwhere:
import numpy as np
bob = np.array([[True,True,False,True,True],[True,True,False,True,True],[False,True,True,True,True],[True,True,False,True,True],[False,True,True,True,True]])
for arr in np.argwhere(bob == False):
    bob[arr[0], arr[1]:] = False
np.argwhere returns the (row, column) index of each False. I use those values to set the rest of that row, from the False onward, to False.

Compare a numpy array to each element of another one

A = np.array([5,1,5,8])
B = np.array([2,5])
I want to compare the A array to each element of B. In other words, I'm looking for a function that does the following computations:
A>2
A>5
(array([ True, False, True, True]), array([False, False, False, True]))
Not particularly fancy but a list comprehension will work:
[A > b for b in B]
[array([ True, False, True, True], dtype=bool),
array([False, False, False, True], dtype=bool)]
You can also use np.greater(), which requires the dimension-adding trick that Brenlla uses in the comments:
np.greater(A, B[:,np.newaxis])
array([[ True, False, True, True],
[False, False, False, True]], dtype=bool)
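The same broadcasting comparison can also be written with the > operator. This is a minimal sketch checking that it matches the list comprehension; the names rows and result are only illustrative.

```python
import numpy as np

A = np.array([5, 1, 5, 8])
B = np.array([2, 5])

# B[:, np.newaxis] has shape (2, 1), so comparing it with A (shape (4,))
# broadcasts to shape (2, 4): one row per element of B.
result = A > B[:, np.newaxis]

# The list-comprehension version, stacked into a 2-D array for comparison.
rows = np.array([A > b for b in B])

print(result)
print(np.array_equal(result, rows))
```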

Intersect two boolean arrays for True

Having the numpy arrays
a = np.array([ True, False, False, True, False], dtype=bool)
b = np.array([False, True, True, True, False], dtype=bool)
how can I make the intersection of the two so that only the True values match? I can do something like:
a == b
array([False, False, False, True, True], dtype=bool)
but the last item is True (understandably because both are False), whereas I would like the result array to be True only in the 4th element, something like:
array([False, False, False, True, False], dtype=bool)
Numpy provides logical_and() for that purpose:
a = np.array([ True, False, False, True, False], dtype=bool)
b = np.array([False, True, True, True, False], dtype=bool)
c = np.logical_and(a, b)
# array([False, False, False, True, False], dtype=bool)
More at Numpy Logical operations.
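For boolean arrays, the & operator gives the same result as np.logical_and. The sketch below also shows that the two differ on integer arrays, where & is a bitwise AND; the integer arrays are made up for illustration.

```python
import numpy as np

a = np.array([True, False, False, True, False])
b = np.array([False, True, True, True, False])

# On boolean arrays, & is an element-wise logical AND.
both = a & b
print(both)

# On integer arrays, & works bit by bit, while logical_and treats any nonzero value as True.
ints_x = np.array([1, 2])
ints_y = np.array([2, 1])
bitwise = ints_x & ints_y                 # 0b01 & 0b10 == 0 for both pairs
logical = np.logical_and(ints_x, ints_y)  # both pairs are nonzero
print(bitwise, logical)
```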

Elementwise comparison of numpy arrays with different lengths

I want to compare the elements of two 3D numpy arrays of different lengths. The goal is, to find overlapping elements in the two arrays.
All the functions I have found so far rely on the two arrays being the same length.
Is there an efficient way to compare the 2D elements? (For loops would be very inefficient, since each array has tens of thousands of elements.)
Here are a few ways of comparing two 1-D arrays:
In [325]: n=np.arange(0,10)
In [326]: m=np.arange(3,9)
In [327]: np.in1d(n,m)
Out[327]: array([False, False, False, True, True, True, True, True, True, False], dtype=bool)
In [328]: np.in1d(m,n)
Out[328]: array([ True, True, True, True, True, True], dtype=bool)
In [329]: n[:,None]==m[None,:]
Out[329]:
array([[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[ True, False, False, False, False, False],
[False, True, False, False, False, False],
[False, False, True, False, False, False],
[False, False, False, True, False, False],
[False, False, False, False, True, False],
[False, False, False, False, False, True],
[False, False, False, False, False, False]], dtype=bool)
and farenorth's suggestion:
In [330]: np.intersect1d(n,m)
Out[330]: array([3, 4, 5, 6, 7, 8])
In [331]: np.where(np.in1d(n,m))
Out[331]: (array([3, 4, 5, 6, 7, 8], dtype=int64),)
Is intersect1d what you want? For example, if your arrays are a and b, you could simply do:
duplicates = np.intersect1d(a, b)
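If the positions of the shared values are needed as well, np.intersect1d accepts return_indices=True (available from numpy 1.15 on). The sketch below reuses the 1-D arrays n and m from the answer above and does not cover the 3-D case from the question. It uses np.isin for the membership mask, which recent numpy releases recommend over np.in1d.

```python
import numpy as np

n = np.arange(0, 10)
m = np.arange(3, 9)

# Shared values, plus the index of each shared value in n and in m.
common, n_idx, m_idx = np.intersect1d(n, m, return_indices=True)

# Boolean mask over n marking the elements that also occur in m.
in_m = np.isin(n, m)

print(common)
print(n_idx, m_idx)
print(in_m)
```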
