Mask from max values in numpy array, specific axis - python

Input example:
I have a numpy array, e.g.
a=np.array([[0,1], [2, 1], [4, 8]])
Desired output:
I would like to produce a mask array with the max value along a given axis, in my case axis 1, being True and all others being False. e.g. in this case
mask = np.array([[False, True], [True, False], [False, True]])
Attempt:
I have tried approaches using np.amax but this returns the max values in a flattened list:
>>> np.amax(a, axis=1)
array([1, 2, 8])
and np.argmax similarly returns the indices of the max values along that axis.
>>> np.argmax(a, axis=1)
array([1, 0, 1])
I could iterate over this in some way but once these arrays become bigger I want the solution to remain something native in numpy.

Method #1
Using broadcasting, we can compare the array against its row-wise maximum, keeping the reduced dimension (keepdims) so that the result broadcasts against a -
a.max(axis=1,keepdims=1) == a
Sample run -
In [83]: a
Out[83]:
array([[0, 1],
[2, 1],
[4, 8]])
In [84]: a.max(axis=1,keepdims=1) == a
Out[84]:
array([[False, True],
[ True, False],
[False, True]], dtype=bool)
Method #2
Alternatively, use the argmax indices in another broadcasted comparison, this time against the range of column indices -
In [92]: a.argmax(axis=1)[:,None] == range(a.shape[1])
Out[92]:
array([[False, True],
[ True, False],
[False, True]], dtype=bool)
Method #3
To finish off the set, and if we are looking for performance, use initialization and then advanced indexing -
out = np.zeros(a.shape, dtype=bool)
out[np.arange(len(a)), a.argmax(axis=1)] = 1
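As a quick check, the three methods can be run side by side on the sample array from the question. This is only a minimal sketch; the names m1, m2 and m3 are made up for illustration.

```python
import numpy as np

a = np.array([[0, 1], [2, 1], [4, 8]])

# Method 1: compare every element with its row maximum (marks every tied maximum).
m1 = a == a.max(axis=1, keepdims=True)

# Method 2: compare the argmax column index with each column position.
m2 = a.argmax(axis=1)[:, None] == np.arange(a.shape[1])

# Method 3: start from an all-False array and set one entry per row.
m3 = np.zeros(a.shape, dtype=bool)
m3[np.arange(len(a)), a.argmax(axis=1)] = True

print(m1)
print(np.array_equal(m1, m2), np.array_equal(m2, m3))  # True True
```

On this input all three agree because no row has a tied maximum.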

Create an identity matrix and select from its rows using argmax on your array:
np.identity(a.shape[1], bool)[a.argmax(axis=1)]
# array([[False, True],
# [ True, False],
# [False, True]], dtype=bool)
Please note that this ignores ties: only the position returned by argmax (the first maximum in each row) is marked.
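To make the tie behavior concrete, here is a small sketch on a made-up array t whose first row contains a tie. It contrasts the argmax-based mask with a plain comparison against the row maximum.

```python
import numpy as np

t = np.array([[3, 3], [1, 2]])  # illustrative input; the first row has a tie

# argmax returns the first maximum only, so exactly one True per row.
by_argmax = np.identity(t.shape[1], bool)[t.argmax(axis=1)]

# Comparing against the row maximum marks every tied position.
by_compare = t == t.max(axis=1, keepdims=True)

print(by_argmax.tolist())   # [[True, False], [False, True]]
print(by_compare.tolist())  # [[True, True], [False, True]]
```

Which behavior is right depends on whether you need exactly one True per row.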

You're already halfway to the answer. Once you compute the max along an axis, you can compare it with the input array and you'll have the required binary mask!
In [7]: maxx = np.amax(a, axis=1)
In [8]: maxx
Out[8]: array([1, 2, 8])
In [12]: a >= maxx[:, None]
Out[12]:
array([[False, True],
[ True, False],
[False, True]], dtype=bool)
Note: This uses NumPy broadcasting when doing the comparison between a and maxx

In one line: np.equal(a.max(1)[:,None], a) or np.equal(a.max(1), a.T).T.
But this can mark several True values in a row when the maximum is tied.

In a multi-dimensional case you can also use np.indices. Let's suppose you have an array:
a = np.array([[
[0, 1, 2],
[3, 8, 5],
[6, 7, -1],
[9, 5, 8]],[
[5, 2, 8],
[7, 6, -3],
[-1, 2, 1],
[3, 5, 6]]
])
you can access argmax values calculated for axis 0 like so:
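# ind[0] and ind[1] are index grids for the two axes that remain after reducing axis 0
ind = np.indices(a.shape[1:])
k = np.zeros(a.shape, dtype=bool)
k[a.argmax(0), ind[0], ind[1]] = True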
The output would be:
array([[[False, False, False],
[False, True, True],
[ True, True, False],
[ True, True, True]],
[[ True, True, True],
[ True, False, False],
[False, False, True],
[False, False, False]]])
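For an arbitrary number of dimensions, one possible alternative is np.put_along_axis (available since NumPy 1.15), which avoids building the index grids by hand. The helper name argmax_mask below is hypothetical; treat this as a sketch rather than the only way to do it.

```python
import numpy as np

def argmax_mask(arr, axis):
    """Boolean mask that is True at the first maximum along `axis` (hypothetical helper)."""
    # Re-insert the reduced axis so the indices have the same ndim as arr.
    idx = np.expand_dims(arr.argmax(axis=axis), axis)
    out = np.zeros(arr.shape, dtype=bool)
    np.put_along_axis(out, idx, True, axis=axis)
    return out

a = np.array([[[0, 1, 2], [3, 8, 5], [6, 7, -1], [9, 5, 8]],
              [[5, 2, 8], [7, 6, -3], [-1, 2, 1], [3, 5, 6]]])

k = argmax_mask(a, axis=0)
print(k[0])
```

Like the argmax-based approaches, this marks exactly one position along the chosen axis, so ties resolve to the first maximum.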

Related

NumPy 2D array boolean indexing with each axis

I created a 2D array and did boolean indexing with two boolean index arrays.
The first one is for axis 0, the next one is for axis 1.
I expected the values where True meets True on each axis to be selected, as in pandas,
but that is not the result.
I wonder how the code below works,
and I would like a link to the official NumPy documentation describing this behavior.
Thanks in advance.
a = np.arange(9).reshape(3,3)
a
----------------------------
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
a[ [True, False, True], [True, False, True] ]
--------------------------
array([0, 8])
My expectation is [0, 6, 2, 8].
(I know how to get the result that I expect.)
In [20]: a = np.arange(9).reshape(3,3)
If the lists are passed to np.ix_, the result is two arrays that can be used, via broadcasting, to index the desired block:
In [21]: np.ix_([True, False, True], [True, False, True] )
Out[21]:
(array([[0],
[2]]),
array([[0, 2]]))
In [22]: a[_]
Out[22]:
array([[0, 2],
[6, 8]])
This isn't 1d, but can be easily raveled.
Trying to make equivalent boolean arrays does not work:
In [23]: a[[[True], [False], [True]], [True, False, True]]
Traceback (most recent call last):
File "<ipython-input-23-26bc93cfc53a>", line 1, in <module>
a[[[True], [False], [True]], [True, False, True]]
IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed
Boolean indexes must be either 1d, or nd matching the target, here (3,3).
In [26]: np.array([True, False, True])[:,None]& np.array([True, False, True])
Out[26]:
array([[ True, False, True],
[False, False, False],
[ True, False, True]])
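A combined (3,3) mask like the one above can be used directly as an index. The sketch below (variable names rows, cols and mask are illustrative) shows both element orders.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
rows = np.array([True, False, True])
cols = np.array([True, False, True])

# Outer "and": True only where both the row and the column are selected.
mask = rows[:, None] & cols

print(a[mask])      # row-major order: [0 2 6 8]
print(a.T[mask.T])  # column-major order: [0 6 2 8]
```

Indexing with a full-shape boolean mask always returns a 1d result, so no extra ravel is needed.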
What you want is consecutive slices: a[[True, False, True]][:,[True, False, True]]
a = np.arange(9).reshape(3,3)
x = [True, False, True]
y = [True, False, True]
a[x][:,y]
as flat array
a[[True, False, True]][:,[True, False, True]].flatten(order='F')
output: array([0, 6, 2, 8])
alternative
NB. this requires arrays for slicing
a = np.arange(9).reshape(3,3)
x = np.array([True, False, True])
y = np.array([True, False, True])
a.T[x&y[:,None]]
output: array([0, 6, 2, 8])

Boolean indices of extended slice

I have the following numpy array:
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
I can use extended slicing to select e.g. columns:
>>> a[:,0::2]
array([[1, 3],
[4, 6],
[7, 9]])
>>> a[:,1::2]
array([[2],
[5],
[8]])
But I want to produce the following:
array([[True, False, True],
[True, False, True],
[True, False, True]])
array([[False, True, False],
[False, True, False],
[False, True, False]])
import numpy as np
bools = np.array([[False, False, False],
[False, False, False],
[False, False, False]])
bools[:, 0::2] = True
print(bools)
Output:
[[ True False True]
[ True False True]
[ True False True]]
np.array([[True if y%2==0 else False for y,z in enumerate(x)] for x in bools])
np.array([[False if y%2==0 else True for y,z in enumerate(x)] for x in bools])
Explanation:
The outer list comprehension iterates over the rows of bools (as x), and the inner one iterates over the elements of each row. In your desired output, the True and False values depend on the position of each element rather than on its value. enumerate() therefore gives the index of each element in y and its value in z, and the condition on y decides whether the entry becomes True or False.
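Since the masks depend only on the column index, they can also be built without a Python-level loop. This is a minimal sketch assuming the 3x3 array a from the question; the names col_is_even, even_mask and odd_mask are made up here.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# One boolean per column: True for columns 0, 2, ...
col_is_even = np.arange(a.shape[1]) % 2 == 0

# Repeat that row pattern for every row; copy() makes the result writable.
even_mask = np.broadcast_to(col_is_even, a.shape).copy()
odd_mask = ~even_mask

print(even_mask)
print(odd_mask)
```

even_mask corresponds to the columns selected by a[:, 0::2] and odd_mask to those selected by a[:, 1::2].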

Numpy element-wise in operation

Suppose I have a column vector y with length n, and I have a matrix X of size n*m. I want to check for each element i in y, whether the element is in the corresponding row in X. What is the most efficient way of doing this?
For example:
y = [1,2,3,4].T
and
X =[[1, 2, 3],[3, 4, 5],[4, 3, 2],[2, 2, 2]]
Then the output should be
[1, 0, 1, 0] or [True, False, True, False]
whichever is easier.
Of course we can use a for loop to iterate through both y and X, but is there any more efficient way of doing this?
Vectorized approach using broadcasting -
((X == y[:,None]).any(1)).astype(int)
Sample run -
In [41]: X # Input 1
Out[41]:
array([[1, 2, 3],
[3, 4, 5],
[4, 3, 2],
[2, 2, 2]])
In [42]: y # Input 2
Out[42]: array([1, 2, 3, 4])
In [43]: X == y[:,None] # Broadcasted comparison
Out[43]:
array([[ True, False, False],
[False, False, False],
[False, True, False],
[False, False, False]], dtype=bool)
In [44]: (X == y[:,None]).any(1) # Check for any match along each row
Out[44]: array([ True, False, True, False], dtype=bool)
In [45]: ((X == y[:,None]).any(1)).astype(int) # Convert to 1s and 0s
Out[45]: array([1, 0, 1, 0])
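To confirm that the broadcasted comparison matches the row-by-row loop mentioned in the question, a small self-check can be written as follows (the names vectorized and looped are illustrative).

```python
import numpy as np

X = np.array([[1, 2, 3], [3, 4, 5], [4, 3, 2], [2, 2, 2]])
y = np.array([1, 2, 3, 4])

# Broadcasting: compare each row of X with the matching element of y.
vectorized = (X == y[:, None]).any(axis=1)

# Reference implementation with an explicit Python loop over rows.
looped = np.array([yi in row for yi, row in zip(y, X)])

print(vectorized)                          # [ True False  True False]
print(np.array_equal(vectorized, looped))  # True
```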

Different starting indices for slices in NumPy

I'm wondering if it's possible without iterating with a for loop to do something like this:
a = np.array([[1, 2, 5, 3, 4],
[4, 5, 6, 7, 8]])
cleaver = np.argmax(a == 5, axis=1) # np.array([2, 1])
foo(a, cleaver)
>>> np.array([[False, False, True, True, True],
[False, True, True, True, True]])
Is there a way to accomplish this through slicing or some other non-iterative function? The arrays I'm using are quite large and iterating over them row by row is prohibitively expensive.
You can use some broadcasting magic -
cleaver[:,None] <= np.arange(a.shape[1])
Sample run -
In [60]: a
Out[60]:
array([[1, 2, 5, 3, 4],
[4, 5, 6, 7, 8]])
In [61]: cleaver
Out[61]: array([2, 1])
In [62]: cleaver[:,None] <= np.arange(a.shape[1])
Out[62]:
array([[False, False, True, True, True],
[False, True, True, True, True]], dtype=bool)
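One caveat worth sketching: np.argmax returns 0 for a row that contains no match at all, so such a row would come out all True. The example below adds a third row without a 5 (an assumption made purely for illustration) and clears it with an extra any() check.

```python
import numpy as np

a = np.array([[1, 2, 5, 3, 4],
              [4, 5, 6, 7, 8],
              [1, 2, 3, 4, 6]])  # third row added for illustration; it has no 5

hits = a == 5
cleaver = hits.argmax(axis=1)  # [2, 1, 0]; the 0 comes from a row with no match

mask = cleaver[:, None] <= np.arange(a.shape[1])
mask &= hits.any(axis=1)[:, None]  # clear rows where 5 never occurs

print(mask)
```

If every row is guaranteed to contain the value, the extra check is unnecessary.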

numpy.equal with nested lists

I want to find a rectangle in a picture. The picture comes from PIL, which means I get a 2D array where each item is a list with three entries for the colors.
To find where the rectangle with the searched color is, I'm using np.equal. Here is a shrunk-down example:
>>> l = np.array([[1,1], [2,1], [2,2], [1,0]])
>>> np.equal(l, [2,1]) # where [2,1] is the searched color
array([[False, True],
[ True, True],
[ True, False],
[False, False]], dtype=bool)
But I expected:
array([False, True, False, False], dtype=bool)
or
array([[False, False],
[ True, True],
[ False, False],
[False, False]], dtype=bool)
How can I achieve a nested list comparison with numpy?
Note: I then want to use np.where to extract the indices of the rectangle from the result of np.equal.
You could use the all method along the second axis:
>>> result = numpy.array([[1, 1], [2, 1], [2, 2], [1, 0]]) == [2, 1]
>>> result.all(axis=1)
array([False, True, False, False], dtype=bool)
And to get the indices:
>>> result.all(axis=1).nonzero()
(array([1]),)
I prefer nonzero to where for this, because where does two very different things depending on how many arguments are passed to it. I use where when I need its unique functionality; when I need the behavior of nonzero, I use nonzero explicitly.
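The same idea carries over to an actual image array of shape (height, width, 3). The sketch below uses a synthetic image and a made-up target color instead of a PIL image, and derives the bounding box of the matching pixels; it assumes at least one pixel matches.

```python
import numpy as np

# Synthetic 4x5 RGB image (stand-in for np.asarray of a PIL image).
img = np.zeros((4, 5, 3), dtype=np.uint8)
img[1:3, 2:4] = [255, 0, 0]  # paint a small red rectangle

# A pixel matches only if all three channels match.
match = (img == [255, 0, 0]).all(axis=-1)  # shape (4, 5)

rows, cols = np.nonzero(match)
top, bottom = int(rows.min()), int(rows.max())
left, right = int(cols.min()), int(cols.max())

print(top, bottom, left, right)  # 1 2 2 3
```

Reducing over the last axis (axis=-1) keeps the sketch independent of the image size.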
