I'm a beginner in the Python world and hope someone can answer my question. I have an array and need to access the indices of certain elements, as below:
x = np.random.rand(10)
x
array([ 0.56807058, 0.8404783 , 0.86835717, 0.76030882, 0.40242679,
0.22941009, 0.56842643, 0.94541468, 0.92813747, 0.95980955])
indx = np.where(x < 0.5)
indx
(array([4, 5], dtype=int64),)
However, when I try to access the first element with indx[0], it returns array([4, 5], dtype=int64). What I want to do is access the elements 4 and 5 inside indx. Thank you for looking into my question and for any support.
np.where returns a tuple of index arrays, one per dimension of the input. In this case the tuple contains only one array of indices, because x is one-dimensional. This is consistent with how where handles multi-dimensional arrays: it returns a tuple containing multiple arrays which together define the indices of the non-zero elements.
To access 4 from indx you would do: indx[0][0]. The first [0] selects the first element of the indx tuple, which is array([4, 5], dtype=int64) and the second accesses an element of this array.
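A minimal sketch of the indexing described above. It assumes a fixed array (the values printed in the question) instead of a random one, so the result is reproducible; the variable names other than x and indx are made up for illustration.

```python
import numpy as np

# Fixed values copied from the question's output so the result is reproducible.
x = np.array([0.56807058, 0.8404783, 0.86835717, 0.76030882, 0.40242679,
              0.22941009, 0.56842643, 0.94541468, 0.92813747, 0.95980955])

indx = np.where(x < 0.5)      # tuple with one index array per dimension
positions = indx[0]           # the index array itself: array([4, 5])
first = int(indx[0][0])       # a single index as a plain int: 4

# The tuple can be used directly to pull out the matching values.
values = x[indx]

# For a 1-D array, np.flatnonzero returns the index array without the tuple.
flat_positions = np.flatnonzero(x < 0.5)
```

If only the 1-D index array is needed, np.flatnonzero avoids the extra [0].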
Short version: Given a list with two elements [i, j], how do I get the i, j -element of a 2d array instead of the i, j rows: arr[[i,j]] to arr[i,j].
I've seen similar cases where *list has been used, but I don't entirely understand how that operator works.
Deeper context:
I have a function that returns a nested list, where each sub-list is a pair of indices to be used in an array:
grid = np.full((3,3), 1)
def path():
...
return [[i_1, j_1], [i_2, j_2], ...]
for i in path():
grid[path()[i]] = 0
But since path()[i] is a list, grid[path()[i]] = 0 sets two rows equal to zero, and not a single element. How do I prevent that?
While not strictly necessary, a faster solution would be preferable, as this operation is to be done many times.
The thing that is confusing you is the difference between mathematical notation and the way 2D (or n-dimensional) lists are indexed in Python and other programming languages.
If you have a 2D matrix in mathematics, called X, and let's say you'd want to access the element in the first row and first column, you'd write X(1, 1).
If you have a 2D array built from nested lists, its elements are lists. So, if you want to access the 1st row of an array called X you'd do:
X[0] # 0 because indexing starts at 0
Keep in mind that the previous statement returns the inner list for that row (the same list object, not a copy). You can index this list as well. Since we need the first column, we do:
X[0][0] # element in 1st row and 1st column of matrix X
Thus the answer is you need to successively index the matrix, i.e. apply chain indexing.
As for your original question, here is my answer. Let's say a is the 2-element list which contains the indices i and j which you want to access in the 2D array. I'll denote the 2D array as b. Then you apply chain indexing, the first index is the first element of a and the second index is the second element of a:
a = [0, 0] # list with 2 elements containing the indices
b = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] # 2D array in which you want to access b[0][0]
i, j = a[0], a[1]
print(b[i][j])
Obviously you don't need to use variables i, j, I just did so for clarity:
print(b[a[0]][a[1]])
Both statements print out the element in the 1st row and 1st column of b:
1
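If grid is a NumPy array, as in the question, chained indexing is not the only option. One possible sketch is to convert each index pair to a tuple, or to index with whole arrays of row and column indices at once. The path list below is a made-up stand-in for the result of the question's path() function, and grid2 and idx are names introduced only for illustration.

```python
import numpy as np

grid = np.full((3, 3), 1)

# Hypothetical stand-in for the result of path(): a list of [row, column] pairs.
path = [[0, 0], [1, 2], [2, 1]]

# Option 1: a tuple index addresses a single element; a list index selects rows.
for pair in path:
    grid[tuple(pair)] = 0

# Option 2: no Python loop; index rows and columns with two index arrays.
grid2 = np.full((3, 3), 1)
idx = np.asarray(path)
grid2[idx[:, 0], idx[:, 1]] = 0
```

The second form is likely to be the faster one when the list of pairs is long, since the assignment happens in a single NumPy call.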
I was wondering whether anyone could explain to me the syntax of this line of code
a[[1, 6, 7]] = 10
where a is an array.
The need for the double brackets is what is confusing me the most
Indexing with 1, 6, 7 (i.e. a[1, 6, 7]) would suggest you are indexing along three separate axes of a which is not what you are looking for. This corresponds to the following assignment:
a[1][6][7] = 10
Indexing with [1, 6, 7] (i.e. a[[1, 6, 7]]) means you are indexing a single axis with a list, which accesses several different elements of array a at once. This corresponds to the following assignments:
a[1] = 10
a[6] = 10
a[7] = 10
If a is a numpy.array, the syntax a[x, y, z] has another use, which is to select from a 3-dimensional array the element at that position. In your case, a seems to be a 1-dimensional array, so you can only select by x (that is a[x], and for example a[x,y] would be incorrect since there is no second dimension)
And that 'x' value can be either just one index (for instance a[0], here the x is an integer) or multiple ones (for instance a[[0,1]], and here x is a list of integers).
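As a small sketch of the two readings, assuming a is a 1-dimensional array of ten integers (the question does not show how a was created):

```python
import numpy as np

a = np.arange(10)          # assumed example data: 0, 1, ..., 9
a[[1, 6, 7]] = 10          # assign to positions 1, 6 and 7 in one step

b = np.arange(10)
for i in [1, 6, 7]:        # the equivalent element-by-element assignments
    b[i] = 10

# On a 1-D array, indexing with 1, 6, 7 asks for three axes and fails.
try:
    np.arange(10)[1, 6, 7]
    too_many = False
except IndexError:
    too_many = True
```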
I was playing with numpy array indexing and found this odd behavior. When I index with an np.array or a list, it works as expected:
In[1]: arr = np.arange(10).reshape(5,2)
arr[ [1, 1] ]
Out[1]: array([[2, 3],
[2, 3]])
But when I use a tuple, it gives me a single element:
In[1]: arr = np.arange(10).reshape(5,2)
arr[ (1, 1) ]
Out[1]: 3
A similar tuple vs. list difference also occurs with arr.flat:
In[1]: arr = np.arange(10).reshape(5,2)
In[2]: arr.flat[ [3, 4] ]
Out[2]: array([3, 4])
In[3]: arr.flat[ (3, 4) ]
Out[3]: IndexError: unsupported iterator index
I can't understand what is going on under the hood. What is the difference between a tuple and a list in this case?
Python 3.5.2
NumPy 1.11.1
What's happening is called fancy indexing, or advanced indexing. There's a difference between indexing with slices, or with a list/array. The trick is that multidimensional indexing actually works with tuples due to the implicit tuple syntax:
import numpy as np
arr = np.arange(10).reshape(5,2)
arr[2,1] == arr[(2,1)] # exact same thing: 2,1 matrix element
However, using a list (or array) inside an index expression will behave differently:
arr[[2,1]]
will index into arr with 2, then with 1, so first it fetches arr[2]==arr[2,:], then arr[1]==arr[1,:], and returns these two rows (row 2 and row 1) as the result.
It gets funkier:
print(arr[1:3,0:2])
print(arr[[1,2],[0,1]])
The first one is regular indexing, and it slices rows 1 to 2 and columns 0 to 1 inclusive; giving you a 2x2 subarray. The second one is fancy indexing, it gives you arr[1,0],arr[2,1] in an array, i.e. it indexes selectively into your array using, essentially, the zip() of the index lists.
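The difference can be checked with a short sketch. The names block, picked, paired and rows are introduced here only for illustration.

```python
import numpy as np

arr = np.arange(10).reshape(5, 2)

block = arr[1:3, 0:2]            # slicing: rows 1-2, columns 0-1 (2x2 subarray)
picked = arr[[1, 2], [0, 1]]     # fancy indexing: elements (1, 0) and (2, 1)

# The same pairing written out explicitly with zip().
paired = [int(arr[i, j]) for i, j in zip([1, 2], [0, 1])]

rows = arr[[2, 1]]               # a single list index: row 2, then row 1
```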
Now here's why flat works like that: it returns a flatiter of your array. From help(arr.flat):
class flatiter(builtins.object)
| Flat iterator object to iterate over arrays.
|
| A `flatiter` iterator is returned by ``x.flat`` for any array `x`.
| It allows iterating over the array as if it were a 1-D array,
| either in a for-loop or by calling its `next` method.
So the resulting iterator from arr.flat behaves as a 1d array. When you do
arr.flat[ [3, 4] ]
you're accessing two elements of that virtual 1d array using fancy indexing; it works. But when you're trying to do
arr.flat[ (3,4) ]
you're attempting to access the (3,4) element of a 1d (!) array, which is erroneous. The reason that this raises "unsupported iterator index" rather than the usual "too many indices" IndexError is probably only that arr.flat itself handles this indexing case.
In [387]: arr=np.arange(10).reshape(5,2)
With this list, you are selecting 2 rows from arr
In [388]: arr[[1,1]]
Out[388]:
array([[2, 3],
[2, 3]])
It's the same as if you explicitly marked the column slice (with : or ...)
In [389]: arr[[1,1],:]
Out[389]:
array([[2, 3],
[2, 3]])
Using an array instead of a list works: arr[np.array([1,1]),:]. (It also eliminates some ambiguities.)
With the tuple, the result is the same as if you wrote the indexing without the tuple wrapper. So it selects an element with row index of 1, column index of 1.
In [390]: arr[(1,1)]
Out[390]: 3
In [391]: arr[1,1]
Out[391]: 3
The arr[1,1] is translated by the interpreter to arr.__getitem__((1,1)). As is common in Python, 1,1 is shorthand for (1,1).
In the arr.flat cases you are indexing the array as if it were 1d. np.arange(10)[[2,3]] selects 2 items, while np.arange(10)[(2,3)] is 2d indexing, hence the error.
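To see what the interpreter actually passes to __getitem__, a tiny probe class can help. Probe is a hypothetical helper written for this illustration, not part of NumPy.

```python
class Probe:
    """Hypothetical helper: returns whatever key Python passes to __getitem__."""

    def __getitem__(self, key):
        return key


p = Probe()
key_plain = p[1, 1]      # the comma builds a tuple: (1, 1)
key_tuple = p[(1, 1)]    # explicit tuple: the very same key
key_list = p[[1, 1]]     # a list stays a list: [1, 1]
```

The first two keys are indistinguishable to the indexed object, which is why arr[1,1] and arr[(1,1)] behave identically, while the list arrives as a different type and can be treated differently.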
A couple of recent questions touch on a messier corner case. Sometimes the list is treated as a tuple. The discussion might be enlightening, but don't go there if it's confusing.
Advanced slicing when passed list instead of tuple in numpy
numpy indexing: shouldn't trailing Ellipsis be redundant?
What I have is a list where the elements are an array like this:
([1,2,3],[4,5,6],[7,8,9])
What I want is to find the index of an element in this list, something like:
list.index([4,5,6]) #should return 1.
Problem is numpy array comparison throws up errors unless you put something like: (A==B).all()
But this comparison happens inside the index function, so I can't, and don't really want to, add the all() bit to the function. Is there an easier solution to this?
Your last error message indicates that you are still mixing lists and arrays. I'll try to recreate the situation:
Make a list of lists. Finding a sublist works just fine:
In [256]: ll=[[1,2,3],[4,5,6],[7,8,9]]
In [257]: ll.index([4,5,6])
Out[257]: 1
Make an array from it - it's 2d.
In [258]: la=np.array(ll)
In [259]: la
Out[259]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
It does not have an index method
In [260]: la.index([4,5,6])
...
AttributeError: 'numpy.ndarray' object has no attribute 'index'
Make it a list - but we get your ValueError:
In [265]: list(la).index([4,5,6])
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
That's because list(la) returns a list of arrays, and arrays produce multiple values in == expressions:
In [266]: list(la)
Out[266]: [array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
The correct way to produce a list from an array is tolist, which returns the original ll list of lists:
In [267]: la.tolist().index([4,5,6])
Out[267]: 1
If you are starting with a numpy array, you can get the result that you want by converting it to a list of lists before using the index() function, e.g.:
import numpy as np
arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
lst = [list(x) for x in arr]
print(lst.index([4,5,6]))
... which gives the expected output 1.
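If converting to a list is undesirable, one possible alternative (not taken from the answers above) is to do the comparison row-wise in NumPy and apply all() along the column axis. The names target, matches, row_indices and first_index are made up for this sketch.

```python
import numpy as np

la = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
target = [4, 5, 6]

# Compare every row with the target, then keep rows where all columns match.
matches = (la == target).all(axis=1)
row_indices = np.flatnonzero(matches)     # positions of all matching rows

# Mimic list.index: first match, or ValueError if there is none.
if row_indices.size == 0:
    raise ValueError(f"{target} is not a row of la")
first_index = int(row_indices[0])
```

This assumes the target has the same length as the rows of la, so that the comparison broadcasts row by row.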
I'm trying to get the indices of the maximum element in a Numpy array.
This can be done using numpy.argmax. My problem is that I would like to find the biggest element in the whole array and get its indices.
numpy.argmax can be either applied along one axis, which is not what I want, or on the flattened array, which is kind of what I want.
My problem is that using numpy.argmax with axis=None returns the flat index when I want the multi-dimensional index.
I could use divmod to get a non-flat index but this feels ugly. Is there any better way of doing this?
You could use numpy.unravel_index() on the result of numpy.argmax():
>>> a = numpy.random.random((10, 10))
>>> numpy.unravel_index(a.argmax(), a.shape)
(6, 7)
>>> a[6, 7] == a.max()
True
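A reproducible sketch of the same idea, assuming a small fixed array instead of random data, and showing that the result agrees with the divmod approach mentioned in the question for the 2-D case:

```python
import numpy as np

a = np.array([[3, 9, 1],
              [7, 2, 8]])

flat = a.argmax()                          # index into the flattened array
idx = np.unravel_index(flat, a.shape)      # tuple of per-axis indices

# For a 2-D array with the default row-major flattening,
# this matches the divmod approach from the question.
row, col = divmod(int(flat), a.shape[1])
```

Unlike divmod, unravel_index works unchanged for arrays with more than two dimensions.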
np.where(a==a.max())
returns coordinates of the maximum element(s), but has to parse the array twice.
>>> a = np.array(((3,4,5),(0,1,2)))
>>> np.where(a==a.max())
(array([0]), array([2]))
Compared with argmax, this returns the coordinates of all elements equal to the maximum, whereas argmax returns just one of them (np.ones(5).argmax() returns 0).
To get the non-flat index of all occurrences of the maximum value, you can modify eumiro's answer slightly by using argwhere instead of where:
np.argwhere(a==a.max())
>>> a = np.array([[1,2,4],[4,3,4]])
>>> np.argwhere(a==a.max())
array([[0, 2],
[1, 0],
[1, 2]])
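The two functions return the same information in different layouts, which the following sketch illustrates. The names mask, rows, cols, coords and maxima are introduced only for this example.

```python
import numpy as np

a = np.array([[1, 2, 4], [4, 3, 4]])
mask = a == a.max()

rows, cols = np.where(mask)        # one index array per axis
coords = np.argwhere(mask)         # one [row, col] pair per match

# The output of where can index the array directly;
# the output of argwhere is more convenient for reading or looping.
maxima = a[rows, cols]
```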