Python array filtering out of bounds

I tried doing this in python, but I get an error:
import numpy as np
array_to_filter = np.array([1,2,3,4,5])
equal_array = np.array([1,2,5,5,5])
array_to_filter[equal_array]
and this results in:
IndexError: index 5 is out of bounds for axis 0 with size 5
What gives? I thought I was doing the right operation here.
I am expecting that if I do
array_to_filter[equal_array]
That it would return
np.array([1,2,5])
If I am not on the right track, how would I get it to do that?

In the last statement, the indices into your array are 1, 2, 5, 5 and 5. Index 5 refers to the 6th element, but the array has only 5 elements (valid indices are 0 to 4), so array_to_filter[5] does not exist.
[i for i in np.unique(equal_array) if i in array_to_filter]
would return the answer you want, as a Python list. It returns each unique value in equal_array that also exists in array_to_filter.
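To see why the original expression fails, it may help to note that an integer array used as an index selects by position, not by value. A small sketch (the name positions is only for illustration and is not from the question):

```python
import numpy as np

array_to_filter = np.array([1, 2, 3, 4, 5])

# An integer index array picks elements by position (valid positions: 0..4).
positions = np.array([0, 1, 4, 4, 4])  # hypothetical positions, for illustration
picked = array_to_filter[positions]
print(picked)  # [1 2 5 5 5]

# Position 5 is past the end, which is what the original code asked for.
try:
    array_to_filter[np.array([1, 2, 5, 5, 5])]
    raised = False
except IndexError as exc:
    raised = True
    print(exc)
```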

If array_to_filter is guaranteed to have unique values, you can do:
>>> array_to_filter[np.in1d(array_to_filter, equal_array)]
array([1, 2, 5])
From the documentation: np.in1d can be considered as an element-wise function version of the python keyword in, for 1-D sequences. in1d(a, b) is roughly equivalent to np.array([item in b for item in a]).
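In newer NumPy versions, np.isin is the recommended replacement for np.in1d (as far as I know, np.in1d is deprecated as of NumPy 2.0). A minimal sketch with the arrays from the question; np.intersect1d is shown as an alternative that returns the sorted unique common values:

```python
import numpy as np

array_to_filter = np.array([1, 2, 3, 4, 5])
equal_array = np.array([1, 2, 5, 5, 5])

# Boolean mask: True where an element of array_to_filter occurs in equal_array.
mask = np.isin(array_to_filter, equal_array)
filtered = array_to_filter[mask]
print(filtered)  # [1 2 5]

# Alternative: sorted unique values present in both arrays.
common = np.intersect1d(array_to_filter, equal_array)
print(common)  # [1 2 5]
```

The mask version keeps the order (and any duplicates) of array_to_filter, while np.intersect1d always returns sorted unique values.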

Related

How to return the positions of the maximum value in an array

As stated I want to return the positions of the maximum value of the array. For instance if I have the array:
A = np.matrix([[1,2,3,33,99],[4,5,6,66,22],[7,8,9,99,11]])
np.argmax(A) returns only the first position of the maximum, which in this case is 4. However, how do I write code so that it returns [4, 13]? Maybe there is a better function than argmax() for this, as all I actually need is the position of the final maximum value.
Find the max value of the array and then use np.where.
>>> m = A.max()
>>> np.where(A.reshape(1,-1) == m)
(array([0, 0]), array([ 4, 13]))
After that, just take the second element of the tuple. Note that we have to reshape the array to a single row so that the column indices returned by np.where are the flat positions you are interested in.
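A sketch of the same idea without the reshape, assuming A is a plain ndarray rather than an np.matrix (that swap is my assumption, not part of the question). np.flatnonzero returns the flat positions directly:

```python
import numpy as np

# Assumption: a plain ndarray instead of the np.matrix used in the question.
A = np.array([[1, 2, 3, 33, 99],
              [4, 5, 6, 66, 22],
              [7, 8, 9, 99, 11]])

# Flat (row-major) positions of every element equal to the maximum.
flat_positions = np.flatnonzero(A == A.max())
print(flat_positions)  # [ 4 13]
```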
Since you mentioned that you're interested only in the last position of the maximum value, a slightly faster solution could be:
A.size - 1 - np.argmax(A.flat[::-1])
Here:
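A.flat is a flat iterator over the elements of A.
A.flat[::-1] gives those elements in reversed order.
np.argmax(A.flat[::-1]) returns the first occurrence of the maximum in that reversed sequence.
Subtracting that from A.size - 1 converts it back to a position in the original order, which is the last occurrence of the maximum.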
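Worked through on the array from the question, again assuming a plain ndarray (my assumption) and using ravel() in place of .flat, this might look like:

```python
import numpy as np

# Assumption: a plain ndarray instead of the np.matrix used in the question.
A = np.array([[1, 2, 3, 33, 99],
              [4, 5, 6, 66, 22],
              [7, 8, 9, 99, 11]])

flat = A.ravel()                      # 1-D row-major view of A
first_in_reversed = np.argmax(flat[::-1])
last_pos = A.size - 1 - first_in_reversed
print(last_pos)  # 13

# Optional: convert the flat position back to (row, column).
row, col = np.unravel_index(last_pos, A.shape)
```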

How to get the two smallest values from a numpy array

I would like to take the two smallest values from an array x. But when I use np.where:
A,B = np.where(x == x.min())[0:1]
I get this error:
ValueError: need more than 1 value to unpack
How can I fix this error? And do I need to arrange the numbers in the array in ascending order?
You can use numpy.partition to get the lowest k+1 items:
A, B = np.partition(x, 1)[0:2] # k=1, so the first two are the smallest items
In Python 3.x you could also use:
A, B, *_ = np.partition(x, 1)
For example:
import numpy as np
x = np.array([5, 3, 1, 2, 6])
A, B = np.partition(x, 1)[0:2]
print(A) # 1
print(B) # 2
How about using sorted instead of np.where?
A,B = sorted(x)[:2]
There are two errors in the code. The first is that the slice is [0:1] when it should be [0:2]. The second is actually a very common issue with np.where. If you look into the documentation, you will see that it always returns a tuple, with one element if you only pass one parameter. Hence you have to access the tuple element first and then index the array normally:
A,B = np.where(x == x.min())[0][0:2]
This will give you the first two indices at which the minimum value occurs (not the two smallest values). If the minimum occurs fewer than two times you will get an exception, so you may want to check for that.
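If what you actually want are the positions of the two smallest values, np.argpartition is one option. A small sketch contrasting it with the np.where approach (the variable names here are only illustrative):

```python
import numpy as np

x = np.array([5, 3, 1, 2, 6])

# Indices of the two smallest values; their relative order is not guaranteed in general.
idx = np.argpartition(x, 1)[:2]
smallest_two = np.sort(x[idx])
print(smallest_two)  # [1 2]

# np.where(x == x.min()) only finds where the single minimum value occurs.
min_positions = np.where(x == x.min())[0]
print(min_positions)  # [2]
```

Here the minimum occurs only once, so unpacking min_positions into two names would fail, which is the exception mentioned above.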

What are the semantics of numpy advanced indexing in-place increments when the indices overlap?

I want to increment a numpy array using advanced indexing, e.g.
import numpy
x = numpy.array([0,0])
indices = numpy.array([1,1])
x[indices] += [1,2]
print(x)  # prints [0 2]
I would have expected that the result is [0 3], since both 1 and 2 should be added to the second zero of x, but apparently numpy only adds the last element that matches a particular index.
Is this the general behaviour and I can rely on that, or is this undefined behaviour and could change with a different version of numpy?
Additionally, is there an (easy) way to get numpy to add all elements which match the index and not just the last one?
From numpy docs:
For advanced assignments, there is in general no guarantee for the iteration order. This means that if an element is set more than once, it is not possible to predict the final result.
You can use np.add.at to get the desired behaviour:
Help on built-in function at in numpy.add:
numpy.add.at = at(...) method of numpy.ufunc instance
at(a, indices, b=None)
Performs unbuffered in place operation on operand 'a' for elements
specified by 'indices'. For addition ufunc, this method is equivalent to
`a[indices] += b`, except that results are accumulated for elements that
are indexed more than once. For example, `a[[0,0]] += 1` will only
increment the first element once because of buffering, whereas
`add.at(a, [0,0], 1)` will increment the first element twice.
.. versionadded:: 1.8.0
< snip >
Example:
>>> b = np.ones(2, int)
>>> a = np.zeros(2, int)
>>> c = np.arange(2,4)
>>> np.add.at(a, b, c)
>>> a
array([0, 5])
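Applied to the arrays from the question, a minimal sketch could look like this. The np.bincount line is an alternative I am adding for comparison; it returns a new float array of per-index sums instead of modifying x in place:

```python
import numpy as np

x = np.array([0, 0])
indices = np.array([1, 1])

# Unbuffered in-place add: both 1 and 2 are accumulated into x[1].
np.add.at(x, indices, [1, 2])
print(x)  # [0 3]

# Alternative: sum the weights per index into a new array.
totals = np.bincount(indices, weights=[1, 2], minlength=2)
print(totals)  # [0. 3.]
```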

How to index a numpy array element with an array

I've got a numpy array, and would like to get the value at a specific element. For example, I might like to access the value at [1,1]
import numpy as np
A = np.arange(9).reshape(3,3)
print(A[1, 1])
# 4
Now, say I've got the coordinates in an array:
i = np.array([1,1])
How can I index A with my i coordinate array. The following doesn't work:
print(A[i])
# [[3 4 5]
#  [3 4 5]]
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
In Python, x[(exp1, exp2, ..., expN)] is equivalent to x[exp1, exp2, ..., expN]; the latter is just syntactic sugar for the former.
So to get the same result as with A[1,1], you have to index with a tuple.
If you use an ndarray as the indexing object, advanced indexing is triggered:
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing
Your best bet is A[tuple(i)]. The tuple(i) call just treats i as a sequence and puts the sequence items into a tuple. Note that if i has more than one dimension, this won't make a nested tuple. It doesn't matter in this case, though.
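A short sketch of the difference, using the arrays from the question (the names element and rows are only for illustration):

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
i = np.array([1, 1])

# A tuple index is treated like A[1, 1]: one element.
element = A[tuple(i)]
print(element)  # 4

# An ndarray index triggers advanced indexing: row 1 is selected twice.
rows = A[i]
print(rows.shape)  # (2, 3)
```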

Argmax of numpy array returning non-flat indices

I'm trying to get the indices of the maximum element in a Numpy array.
This can be done using numpy.argmax. My problem is that I would like to find the biggest element in the whole array and get its indices.
numpy.argmax can be either applied along one axis, which is not what I want, or on the flattened array, which is kind of what I want.
My problem is that using numpy.argmax with axis=None returns the flat index when I want the multi-dimensional index.
I could use divmod to get a non-flat index but this feels ugly. Is there any better way of doing this?
You could use numpy.unravel_index() on the result of numpy.argmax():
>>> a = numpy.random.random((10, 10))
>>> numpy.unravel_index(a.argmax(), a.shape)
(6, 7)
>>> a[6, 7] == a.max()
True
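Since the random array above gives a different result on every run, here is a small deterministic sketch of the same call, with the divmod approach from the question shown for comparison (it only works this simply for a 2-D, row-major array):

```python
import numpy as np

a = np.array([[1, 2, 4],
              [4, 3, 4]])

flat_index = a.argmax()  # first occurrence of the maximum in flattened order
multi_index = tuple(int(k) for k in np.unravel_index(flat_index, a.shape))
print(multi_index)  # (0, 2)

# Manual equivalent for the 2-D case: (row, column) = divmod(flat, n_columns).
manual = tuple(int(k) for k in divmod(flat_index, a.shape[1]))
print(manual)  # (0, 2)
```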
np.where(a==a.max())
returns the coordinates of the maximum element(s), but has to pass over the array twice (once to find the maximum, once for the comparison).
>>> a = np.array(((3,4,5),(0,1,2)))
>>> np.where(a==a.max())
(array([0]), array([2]))
Compared to argmax, this returns the coordinates of all elements equal to the maximum. argmax returns just the first of them (np.ones(5).argmax() returns 0).
To get the non-flat index of all occurrences of the maximum value, you can modify the np.where answer above slightly by using argwhere instead of where:
np.argwhere(a==a.max())
>>> a = np.array([[1,2,4],[4,3,4]])
>>> np.argwhere(a==a.max())
array([[0, 2],
       [1, 0],
       [1, 2]])
