Merging Two NumPy Boolean Arrays after an Index - python

I have two boolean NumPy arrays that I would like to merge from a given index onward. Currently I am using np.logical_or(arr1, arr2), but it operates on the entire array. I only want the operation applied from a given index.
I would like to use arr1 as the master and merge arr2 into it from that index onward.
For example, take the arrays and index below:
arr1 = np.array([ True, False, False, True, False])
arr2 = np.array([False, True, True, True, False])
index = 2
Expected result:
# array([True, False, True, True, False])

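You can use array slicing and np.concatenate to implement this. Here arr3 consists of the elements of arr1 at positions 0 up to (but not including) index, followed by the elements of arr2 from index onward. Note that this takes arr2's values outright after index rather than OR-ing them with arr1; the two give the same result for this example.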
import numpy as np
arr1 = np.array([True, False, False, True, False])
arr2 = np.array([False, True, True, True, False])
index = 2
arr3 = np.concatenate((arr1[:index], arr2[index:]), axis=0)
print(arr3)  # [ True False  True  True False]
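If the goal is a true logical OR from index onward, as the question describes, one option is to copy arr1 and OR the tail in place. This is a minimal sketch rather than the answerer's method; the name merged is illustrative.

```python
import numpy as np

arr1 = np.array([True, False, False, True, False])
arr2 = np.array([False, True, True, True, False])
index = 2

# Copy so that arr1 itself is left untouched.
merged = arr1.copy()
# OR only the tail; elements before `index` keep arr1's values.
merged[index:] |= arr2[index:]
print(merged.tolist())  # [True, False, True, True, False]
```

Unlike the concatenate version, this keeps a True from arr1 after index even where arr2 is False.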

Related

Count number of transitions in each row of a Numpy array

I have a 2D boolean array
a = np.array([[True, False, True, False, True], [True, True, True, True, True], [True, True, False, False, False], [False, True, True, False, False], [True, True, False, True, False]])
I would like to create a new array, providing count of True-False transitions in each row of this array.
The desired result is count=[2, 0, 1, 1, 2]
I am working with a large NumPy array, so I want to avoid looping over every row.
I tried to adopt available solutions to a 2D array with counting for each line separately, but did not succeed.
Here is a possible solution:
b = a.astype(int)  # True -> 1, False -> 0
c = b[:, :-1] - b[:, 1:]  # 1 where a True is followed by a False
count = (c == 1).sum(axis=1)  # True-to-False transitions per row
Result:
>>> count
array([2, 0, 1, 1, 2])
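The same count can also be computed without the integer cast by combining the shifted boolean slices directly. A minimal sketch, assuming the array a from the question; the name count_bool is illustrative.

```python
import numpy as np

a = np.array([[True, False, True, False, True],
              [True, True, True, True, True],
              [True, True, False, False, False],
              [False, True, True, False, False],
              [True, True, False, True, False]])

# A True-to-False transition is an element that is True
# while its right-hand neighbour is False.
count_bool = (a[:, :-1] & ~a[:, 1:]).sum(axis=1)
print(count_bool.tolist())  # [2, 0, 1, 1, 2]
```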

How to return a numpy array of the indices of the first element in each row of a numpy array with a given value?

Given a numpy array of shape (2, 4):
input = np.array([[False, True, False, True], [False, False, True, True]])
I want to return an array of shape (N,), where N is the number of rows and each element is the index of the first True value in that row:
expected = np.array([1, 2])
Is there an easy way to do this using numpy functions and without resorting to standard loops?
np.max with axis finds the maximum along that dimension; np.argmax returns the index of the first maximum, which for a boolean array is the first True:
In [42]: arr = np.array([[False, True, False, True], [False, False, True, True]])
In [43]: np.argmax(arr, axis=1)
Out[43]: array([1, 2])
This also worked for me (rows that contain no True are simply left out of the result):
nonzeros = np.nonzero(input)
u, indices = np.unique(nonzeros[0], return_index=True)
expected = nonzeros[1][indices]
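One thing to watch with the argmax approach: a row with no True at all also yields 0, which is indistinguishable from "first True at column 0". A minimal sketch of one way to flag such rows, assuming -1 is an acceptable sentinel (that choice, and the extra all-False row, are illustrative):

```python
import numpy as np

arr = np.array([[False, True, False, True],
                [False, False, True, True],
                [False, False, False, False]])  # last row has no True

first = np.argmax(arr, axis=1)   # 0 for the all-False row
has_true = arr.any(axis=1)       # which rows contain at least one True
# Keep the argmax result where a True exists, otherwise use the sentinel -1.
first_true = np.where(has_true, first, -1)
print(first_true.tolist())  # [1, 2, -1]
```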

Using entrywise sum of boolean arrays as inclusive `or`

I would like to compare many m-by-n boolean numpy arrays and get an array of the same shape whose entries are True if the corresponding entry in at least one of the inputs is True.
The easiest way I've found to do this is:
In [5]: import numpy as np
In [6]: a = np.array([True, False, True])
In [7]: b = np.array([True, True, False])
In [8]: a + b
Out[8]: array([ True, True, True])
But I can also use
In [11]: np.stack([a, b]).sum(axis=0) > 0
Out[11]: array([ True, True, True])
Are these equivalent operations? Are there any gotchas I should be aware of? Is one method preferable to the other?
You can use np.logical_or:
a = np.array([True, False, True])
b = np.array([True, True, False])
np.logical_or(a,b)
It also works for (m, n) arrays:
a = np.random.rand(3,4) < 0.5
b = np.random.rand(3,4) < 0.5
print('a\n',a)
print('b\n',b)
np.logical_or(a,b)
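Since the question is about many arrays, np.logical_or.reduce (or np.any over a stacked array) extends this beyond two inputs. A minimal sketch; the third array c and the variable names are illustrative.

```python
import numpy as np

a = np.array([True, False, True])
b = np.array([True, True, False])
c = np.array([False, False, False])

# OR across any number of same-shaped boolean arrays.
combined = np.logical_or.reduce([a, b, c])
# Equivalent: stack along a new leading axis and test for any True.
combined_any = np.any(np.stack([a, b, c]), axis=0)
# By contrast, sum() counts the Trues, so it needs the "> 0" comparison.
counts = np.stack([a, b, c]).sum(axis=0)

print(combined.tolist())  # [True, True, True]
print(counts.tolist())    # [2, 1, 1]
```

On boolean arrays, a + b also behaves as a logical OR and stays boolean, but np.logical_or states the intent more clearly and does not depend on the inputs already being boolean.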

Filling a numpy array with x random values

I have a NumPy array of size x, which I need to fill with exactly 700 True values.
For example:
a = np.zeros(5956)
If I want to fill this with roughly 70% True, I can write this:
msk = np.random.rand(len(a)) < 0.7
b = spam_df[msk]
But what if I need exactly 700 True values, and the rest False?
import numpy as np
x = 5956
a = np.zeros((x), dtype=bool)
random_places = np.random.choice(x, 700, replace=False)
a[random_places] = True
import numpy as np
zeros = np.zeros(5956 - 700, dtype=bool)
ones = np.ones(700, dtype=bool)
arr = np.concatenate((ones, zeros), axis=0)
np.random.shuffle(arr)  # arr is now shuffled in place, with 700 True values and the rest False
Example: an array of 5 elements with 3 True and the rest False.
ones = np.ones(3, dtype=bool)  # array([True, True, True])
zeros = np.zeros(5 - 3, dtype=bool)  # array([False, False])
arr = np.concatenate((ones, zeros), axis=0)  # array([True, True, True, False, False])
np.random.shuffle(arr)  # shuffled in place, e.g. array([False, True, True, True, False])
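For reproducible results, the same idea can be written with NumPy's Generator API. A minimal sketch, assuming a fixed seed is wanted; the seed value and the names n_total, n_true and mask are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for illustration
n_total, n_true = 5956, 700

mask = np.zeros(n_total, dtype=bool)
# Pick n_true distinct positions and set them to True.
mask[rng.choice(n_total, size=n_true, replace=False)] = True

print(mask.sum())  # 700
```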

How to create a multi-column list of booleans from a given list of integers in Python?

I am new to Python. I want to do the following.
Input: A list of n integers, each in the range 0 to 3.
Output: A multi-column NumPy array with n rows (4 columns in this case, since the integers 0 to 3 take 4 distinct values). In each row, the column matching the corresponding input integer is True and the remaining columns are False.
E.g. Input list : [0, 3, 2, 1, 1, 2], size = 6, Each integer is in range of 0-3
Output list :
Row 0: True False False False
Row 1: False False False True
Row 2: False False True False
Row 3: False True False False
Row 4: False True False False
Row 5: False False True False
Now, I can start with 4 columns, traverse the input list, and build the output as follows:
output_columns[]
for i in input list:
    output_columns[i] = True
create an output numpy list from output_columns
Is this the best way to do this in Python? Especially for creating numpy list as an output.
If yes, how do I merge output_columns[] at the end to create a multidimensional NumPy array in which each dimension is a column of output_columns?
If not, what would be the best (most time efficient way) to do this in Python?
Thank you,
Is this the best way to do this in Python?
No, a more Pythonic and probably the best way is to use a simple broadcasting comparison, as follows:
In [196]: a = np.array([0, 3, 2, 1, 1, 2])
In [197]: r = list(range(0, 4))
In [198]: a[:,None] == r
Out[198]:
array([[ True, False, False, False],
       [False, False, False,  True],
       [False, False,  True, False],
       [False,  True, False, False],
       [False,  True, False, False],
       [False, False,  True, False]])
You are creating so-called one-hot vectors (each row of the matrix is a one-hot vector, meaning that exactly one value is True).
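import numpy as np
mylist = [0, 3, 2, 1, 1, 2]
# Use the builtin bool: the np.bool alias is unavailable in some NumPy releases.
one_hot = np.zeros((len(mylist), 4), dtype=bool)
for i, v in enumerate(mylist):
    one_hot[i, v] = True  # row i gets True in column v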
Output
array([[ True, False, False, False],
       [False, False, False,  True],
       [False, False,  True, False],
       [False,  True, False, False],
       [False,  True, False, False],
       [False, False,  True, False]], dtype=bool)
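The loop above can also be replaced by a single indexed assignment. A minimal sketch, assuming every value lies in the range 0 to n_classes - 1; the names values and n_classes are illustrative.

```python
import numpy as np

values = np.array([0, 3, 2, 1, 1, 2])
n_classes = 4  # assumed number of distinct integer values (0..3)

one_hot = np.zeros((values.size, n_classes), dtype=bool)
# Pair each row index with its value and set all those cells at once.
one_hot[np.arange(values.size), values] = True

print(one_hot.tolist()[:2])  # [[True, False, False, False], [False, False, False, True]]
```

This gives the same array as the broadcasting comparison values[:, None] == np.arange(n_classes); which reads better is largely a matter of taste.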
