Python: How to count the intersection of a specific value in both arrays?

I have two arrays A and B of the same size, both containing only the values [0, 1, 2].
I want to count the indices at which both arrays hold the value 1. In other words, I want to check the precision for value 1, based on array A.
So far I have tried the map function, but it doesn't work:
temp = list(map(lambda x,y: (x is y) == 1 ,A ,B))
However, the result is not what I expected. Can you give some advice or an example of how to solve this problem?

Try this:
x = np.array([0, 1, 2, 3, 1, 4, 5])
y = np.array([0, 1, 2, 4, 1, 3, 5])
print(np.sum(list(map(lambda x,y: (x==y==1) , x, y))))
output:
2
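If both inputs are already NumPy arrays, the same count can be obtained without map by combining two boolean masks. A minimal sketch (my addition, reusing the x and y above):
import numpy as np

x = np.array([0, 1, 2, 3, 1, 4, 5])
y = np.array([0, 1, 2, 4, 1, 3, 5])
# positions where both arrays hold the value 1
both_one = (x == 1) & (y == 1)
print(np.count_nonzero(both_one))  # 2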
Tensorflow code:
elems = (np.array([0, 1, 2, 3, 1, 4, 5, 0, 1, 2, 3, 1, 4, 5]), np.array([0, 1, 2, 4, 1, 3, 5, 0, 1, 2, 3, 1, 4, 5]))
alternate = tf.map_fn(lambda x: tf.math.logical_and(tf.equal(x[0], 1), tf.equal(x[0], x[1])), elems, dtype=tf.bool)
print(alternate)
print(tf.reduce_sum(tf.cast(alternate, tf.float32)))
output:
tf.Tensor([False True False False True False False False True False False True False False], shape=(14,), dtype=bool)
tf.Tensor(4.0, shape=(), dtype=float32)
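If the full tensors are available up front, the same count can likely be computed without tf.map_fn by operating on whole tensors; a hedged sketch (the names a and b are mine):
import tensorflow as tf

a = tf.constant([0, 1, 2, 3, 1, 4, 5])
b = tf.constant([0, 1, 2, 4, 1, 3, 5])
# mark positions where both tensors equal 1, then count them
both_one = tf.logical_and(tf.equal(a, 1), tf.equal(b, 1))
print(tf.reduce_sum(tf.cast(both_one, tf.int32)))  # tf.Tensor(2, shape=(), dtype=int32)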

Related

Replace consecutive identical elements at the beginning of an array with 0

I want to replace the first N identical consecutive numbers in an array with 0.
import numpy as np
x = np.array([1, 1, 1, 1, 2, 3, 1, 2, 3, 2, 2, 2, 3, 3, 3, 1, 1, 2, 2])
OUT -> np.array([0, 0, 0, 0, 2, 3, 1, 2, 3, 2, 2, 2, 3, 3, 3, 1, 1, 2, 2])
A loop works, but what would be a faster, vectorized implementation?
i = 0
first = x[0]
while i <= x.size - 1 and x[i] == first:
    x[i] = 0
    i += 1
You can use argmax on a boolean array to get the index of the first changing value.
Then slice and replace:
n = (x!=x[0]).argmax() # 4
x[:n] = 0
output:
array([0, 0, 0, 0, 2, 3, 1, 2, 3, 2, 2, 2, 3, 3, 3, 1, 1, 2, 2])
intermediate array:
(x!=x[0])
# n=4
# [False False False False True True True True True True True True
# True True True True True True True]
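One caveat (my addition): if every element equals x[0], the boolean array is all False and argmax returns 0, so nothing would be replaced. A hedged guard for that edge case:
n = (x != x[0]).argmax() if (x != x[0]).any() else x.size  # fall back to the full length
x[:n] = 0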
My solution is based on itertools.groupby, so start with import itertools.
This function creates groups of consecutive equal values, unlike e.g. the
pandas version of groupby, which collects all equal values from the input
into a single group.
Another important feature is that you can assign any value to N, and only
the first N elements of a run of consecutive equal values will be replaced.
To test my code, I set N = 4 and defined the source array as:
x = np.array([1, 1, 1, 1, 2, 3, 1, 2, 3, 2, 2, 2, 3, 3, 3, 1, 1, 2, 2, 2, 2, 2])
Note that it contains 5 consecutive values of 2 at the end.
Then, to get the expected result, run:
rv = []
for key, grp in itertools.groupby(x):
    lst = list(grp)
    lgth = len(lst)
    if lgth >= N:
        lst[0:N] = [0] * N
    rv.extend(lst)
xNew = np.array(rv)
The result is:
[0, 0, 0, 0, 2, 3, 1, 2, 3, 2, 2, 2, 3, 3, 3, 1, 1, 0, 0, 0, 0, 2]
Note that a sequence of 4 zeroes occurs:
at the beginning (all 4 values of 1 have been replaced),
near the end (the first 4 of the 5 consecutive values of 2 have been replaced).
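For convenience, the same logic can be wrapped in a small self-contained function (the name replace_first_n is my own, not from the original answer):
import itertools
import numpy as np

def replace_first_n(arr, N):
    # zero out the first N elements of every run of at least N equal values
    rv = []
    for key, grp in itertools.groupby(arr):
        lst = list(grp)
        if len(lst) >= N:
            lst[0:N] = [0] * N
        rv.extend(lst)
    return np.array(rv)

x = np.array([1, 1, 1, 1, 2, 3, 1, 2, 3, 2, 2, 2, 3, 3, 3, 1, 1, 2, 2, 2, 2, 2])
print(replace_first_n(x, 4))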

How do I extract rows of a 2D NumPy array by condition?

I have a 4*5 NumPy array and I want to retrieve rows in which all elements are less than 5.
arr = np.array([[0,2,3,4,5],[1,2,4,1,3], [2,2,5,4,6], [0,2,3,4,3]])
arr[np.where(arr[:,:] <= 4)]
expected output:
[[1,2,4,1,3],[0,2,3,4,3]]
actual output:
array([0, 2, 3, 4, 1, 2, 4, 1, 3, 2, 2, 4, 0, 2, 3, 4, 3])
Any help is appreciated!
This is actually quite simple. Just convert the entire array to booleans (each value is True if it's less than 5, False otherwise), and use np.all with axis=1 to return True for each row where all items are True:
>>> arr[np.all(arr < 5, axis=1)]
array([[1, 2, 4, 1, 3],
[0, 2, 3, 4, 3]])
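For reference, np.where is not needed here; the boolean mask alone does the row selection. A short self-contained sketch of the equivalent spelling:
import numpy as np

arr = np.array([[0,2,3,4,5], [1,2,4,1,3], [2,2,5,4,6], [0,2,3,4,3]])
mask = (arr < 5).all(axis=1)   # one boolean per row
print(arr[mask])               # rows where every element is < 5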

Index targeting on new list from old list

So let's say I have a list that looks like:
x = [1, 0, 0, 1, 1, 1, 0, 0, 0, 0]
I then have another list with indices that needs to be removed from list x:
x_remove = [1, 4, 5]
I can then use the numpy command delete to remove this from x and end up with:
x_final = np.delete(x, x_remove)
>>> x_final = [1, 0, 1, 0, 0, 0, 0]
So far so good. Now I then figure out that I don't want to use the entire list x, but start perhaps from index 2. So basically:
x_new = x[2:]
>>> x_new = [0, 1, 1, 1, 0, 0, 0, 0]
I do, however, still need to remove the indices in x_remove, but now, as you can see, those indices no longer point at the same positions as before, so the wrong items are removed. The same thing happens the other way around (i.e. first removing the indices and then slicing from index 2). So the results look like:
x_new_final = [0, 1, 1, 0, 0] (first slice, then apply the remove list)
x_new_final_v2 = [1, 0, 0, 0, 0] (first apply the remove list, then slice)
x_new_final_correct_one = [0, 1, 0, 0, 0, 0] (as it should be)
So is there some way in which I can start my list at various indices (through slicing) and still use the delete command to remove the correct indices corresponding to the full list?
You could change the x_remove list depending on the slice location. For example:
import numpy as np

slice_location = 2
x = [1, 0, 0, 1, 1, 1, 0, 0, 0, 0]
x_remove = [1, 4, 5]
x_new = x[slice_location:]
# shift the indices by the slice offset and drop those that fall before the slice
x_remove = [i - slice_location for i in x_remove if i - slice_location >= 0]
x_new = np.delete(x_new, x_remove)
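With the values above this yields the expected result (a quick check, assuming slice_location = 2 as in the question):
print(x_new)  # [0 1 0 0 0 0], i.e. x_new_final_correct_one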
x = [1, 0, 0, 1, 1, 1, 0, 0, 0, 0]
x_remove = [1, 4, 5]
for index, value in enumerate(x):
    for remove_index in x_remove:
        if index == remove_index:
            x[index] = ""   # mark the element for removal
final_list = [final_value for final_value in x if final_value != ""]
print(final_list)
Try it this simple way...
First, let's explore alternatives for the simple removal (without the change-of-starting-position issue):
First make an x with unique and easily recognized values:
In [787]: x = list(range(10))
In [788]: x
Out[788]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
A list comprehension method - maybe not the fastest, but fairly clear and bug free:
In [789]: [v for i,v in enumerate(x) if i not in x_remove]
Out[789]: [0, 2, 3, 6, 7, 8, 9]
Your np.delete approach:
In [790]: np.delete(x, x_remove)
Out[790]: array([0, 2, 3, 6, 7, 8, 9])
That has the downside of converting x to an array, which is not a trivial task (time-wise). It also makes a new array. My guess is that it is slower.
Try in-place removal:
In [791]: y=x[:]
In [792]: for i in x_remove:
     ...:     del y[i]
     ...:
In [793]: y
Out[793]: [0, 2, 3, 4, 6, 8, 9]
oops - wrong. We need to start from the end (largest index). This is a well known Python 'recipe':
In [794]: y=x[:]
In [795]: for i in x_remove[::-1]:
     ...:     del y[i]
     ...:
In [796]: y
Out[796]: [0, 2, 3, 6, 7, 8, 9]
Under the covers np.delete is taking a masked approach:
In [797]: arr = np.array(x)
In [798]: arr
Out[798]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [799]: mask = np.ones(arr.shape, bool)
In [800]: mask[x_remove] = False
In [801]: mask
Out[801]:
array([ True, False, True, True, False, False, True, True, True,
True])
In [802]: arr[mask]
Out[802]: array([0, 2, 3, 6, 7, 8, 9])
Now to the question of applying x_remove to a slice of x. The slice of x does not keep a record of the slice parameters. That is, you can't readily determine that y = x[2:] is missing two values. (Well, I could deduce it by comparing some attributes of x and y, but not from y alone.)
So regardless of how you do the delete, you will have to first adjust the values of x_remove.
In [803]: x2 = np.array(x_remove)-2
In [804]: x2
Out[804]: array([-1, 2, 3])
In [805]: [v for i,v in enumerate(x[2:]) if i not in x2]
Out[805]: [2, 3, 6, 7, 8, 9]
This works OK, but that -1 is potentially a problem: we don't want it to mean the last element. So we have to filter out the negative indices first to be safe.
In [806]: np.delete(x[2:], x2)
/usr/local/bin/ipython3:1: FutureWarning: in the future negative indices will not be ignored by `numpy.delete`.
#!/usr/bin/python3
Out[806]: array([2, 3, 6, 7, 8, 9])
If delete didn't ignore negative indices, it could get a mask like this - with a False at the end:
In [808]: mask = np.ones(arr[2:].shape, bool)
In [809]: mask[x2] = False
In [810]: mask
Out[810]: array([ True, True, False, False, True, True, True, False])
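A hedged way to make the delete itself safe is to drop the negative (before-the-slice) indices before calling it, e.g.:
x2 = np.array(x_remove) - 2    # shift indices for the x[2:] slice
x2 = x2[x2 >= 0]               # drop indices that fall before the slice
print(np.delete(x[2:], x2))    # array([2, 3, 6, 7, 8, 9])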

generate numbers between min and max with equal counts

I have a minimum value and maximum value, I'd like to generate a list of numbers between them such that all the numbers have equal counts. Is there a numpy function or any function out there?
Example: GenerateNums(start=1, stop=5, nums=10)
Expected output: [1,1,2,2,3,3,4,4,5,5], i.e. each number has an almost equal count.
Takes "almost equal" to heart -- the difference between the most common and least common number is at most 1. No guarantee about which number is the mode.
def gen_nums(start, stop, nums):
    binsize = (1 + stop - start) / nums
    return list(map(lambda x: int(start + binsize * x), range(nums)))

print(gen_nums(1, 5, 10))
# [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
There is a numpy function:
In [3]: np.arange(1,6).repeat(2)
Out[3]: array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
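If nums is not an exact multiple of the number of distinct values, one hedged generalization (the function name generate_nums is mine) is to index the values at evenly spaced positions:
import numpy as np

def generate_nums(start, stop, nums):
    # spread nums values as evenly as possible over start..stop (inclusive)
    vals = np.arange(start, stop + 1)
    return vals[np.linspace(0, len(vals), nums, endpoint=False).astype(int)]

print(generate_nums(1, 5, 10))  # [1 1 2 2 3 3 4 4 5 5]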
def GenerateNums(start=1, stop=5, nums=10):
    result = []
    rep = nums // (stop - start + 1)
    for i in range(start, stop + 1):
        for j in range(rep):
            result.append(i)
    return result
For almost equal counts, you can sample from a uniform distribution. numpy.random.randint does this:
>>> import numpy as np
>>> np.random.randint(low=1, high=6, size=10)
array([4, 5, 5, 4, 5, 5, 2, 1, 4, 2])
To get these values in sorted order:
>>> sorted(np.random.randint(low=1, high=6, size=10))
[1, 1, 1, 2, 3, 3, 3, 3, 5, 5]
This process is just like rolling dice :) As you sample more times, the counts of each value should become very similar:
>>> from collections import Counter
>>> Counter(np.random.randint(low=1, high=6, size=10000))
Counter({1: 1978, 2: 1996, 3: 2034, 4: 1982, 5: 2010})
For exactly equal counts:
>>> list(range(1, 6)) * 2
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
>>> sorted(list(range(1, 6)) * 2)
[1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
def GenerateNums(start=0, stop=0, nums=0):
    assert (nums and stop > 0), "ZeroDivisionError"
    result = []   # build the list here instead of using a mutable default argument
    # get the repeat count for each value
    iter_val = int(round(nums / stop))
    # go through start..stop-1 and append each value iter_val times
    [[result.append(x) for __ in range(iter_val)] for x in range(start, stop)]
    return result
print (GenerateNums(start=0, stop=5, nums=30))
>>> [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4]

Comparing 2 numpy arrays

I have 2 NumPy arrays, and whenever an element of B is 1, I want the corresponding element of A set to 0. Both arrays always have the same dimensions:
A = [1, 2, 3, 4, 5]
B = [0, 0, 0, 1, 0]
I tried NumPy slicing, but I still can't get it to work:
B[A==1]=0
How can I achieve this in NumPy without the conventional loop?
First, you need them to be NumPy arrays, not lists. Second, you have A and B swapped in your indexing.
import numpy as np
A = np.array([1, 2, 3, 4, 5])
B = np.array([0, 0, 0, 1, 0])
A[B==1]=0 ## array([1, 2, 3, 0, 5])
If you use lists instead, here is what you get
A = [1, 2, 3, 4, 5]
B = [0, 0, 0, 1, 0]
A[B==1]=0 ## [0, 2, 3, 4, 5]
That's because for plain lists B == 1 evaluates to False, i.e. 0 (instead of a boolean array), so you essentially write A[0] = 0.
Isn't this what you want to do?
A[B==1] = 0
A
array([1, 2, 3, 0, 5])
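If you prefer not to modify A in place, np.where can build a new array instead (a small variation, not from the original answers):
import numpy as np

A = np.array([1, 2, 3, 4, 5])
B = np.array([0, 0, 0, 1, 0])
C = np.where(B == 1, 0, A)   # take 0 where B is 1, otherwise keep A
print(C)                     # [1 2 3 0 5]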
