python finding index of an array within a list - python

What I have is a list where the elements are an array like this:
([1,2,3],[4,5,6],[7,8,9])
What I want is to find the index of an element in this list, something like:
list.index([4,5,6]) #should return 1.
The problem is that comparing numpy arrays raises an error unless you write something like (A==B).all().
But this comparison happens inside the index function, so I can't add the all() there, and I don't really want to. Is there an easier solution?

Your last error message indicates that you are still mixing lists and arrays. I'll try to recreate the situation:
Make a list of lists. Finding a sublist works just fine:
In [256]: ll=[[1,2,3],[4,5,6],[7,8,9]]
In [257]: ll.index([4,5,6])
Out[257]: 1
Make an array from it - it's 2d.
In [258]: la=np.array(ll)
In [259]: la
Out[259]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
It does not have an index method
In [260]: la.index([4,5,6])
...
AttributeError: 'numpy.ndarray' object has no attribute 'index'
Make it a list - but we get your ValueError:
In [265]: list(la).index([4,5,6])
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
That's because list(la) returns a list of arrays, and arrays produce multiple values in == expressions:
In [266]: list(la)
Out[266]: [array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
The correct way to produce a list from an array is tolist, which returns the original ll list of lists:
In [267]: la.tolist().index([4,5,6])
Out[267]: 1
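If you would rather keep the list of arrays as it is, one option is to do the comparison yourself instead of relying on list.index. This is only a minimal sketch: index_of_array is a hypothetical helper name (not a numpy function), and it uses np.array_equal for the element-wise test.

```python
import numpy as np


def index_of_array(arrays, target):
    """Hypothetical helper: position of the first array equal to target."""
    for i, candidate in enumerate(arrays):
        # np.array_equal returns a single bool, so there is no ambiguity
        if np.array_equal(candidate, target):
            return i
    raise ValueError("target array is not in the list")


arrays = [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]
result = index_of_array(arrays, [4, 5, 6])
print(result)
```

Like list.index, the sketch raises ValueError when nothing matches; returning -1 or None instead would be an equally reasonable choice.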

If you are starting with a numpy array, you can get the result that you want by converting it to a list of lists before using the index() function, e.g.:
import numpy as np
arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
lst = [list(x) for x in arr]
print(lst.index([4,5,6]))
... which gives the expected output 1.
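If the data is already a 2d array, the conversion to a list can probably be avoided altogether. The sketch below compares every row against the target and keeps the rows where all elements match; the names matches and row_index are just for illustration, and -1 as the "not found" value is an assumption of this example.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# arr == [4, 5, 6] broadcasts the target against every row;
# all(axis=1) reduces each row to one bool
matches = np.flatnonzero((arr == [4, 5, 6]).all(axis=1))

# first matching row, or -1 if there is none (a choice made for this sketch)
row_index = int(matches[0]) if matches.size else -1
print(row_index)
```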

Related

Rationale for numpy.split returning a list and not an array

I was surprised that numpy.split yields a list and not an array. I would have thought it would be better to return an array, since numpy has put a lot of work into making arrays more useful than lists. Can anyone justify numpy returning a list instead of an array? Why would that be a better programming decision for the numpy developers to have made?
A comment pointed out that if the split is uneven, the result can't be an array, at least not one that has the same dtype. At best it would be an object dtype.
But let's consider the case of equal-length subarrays:
In [124]: x = np.arange(10)
In [125]: np.split(x,2)
Out[125]: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]
In [126]: np.array(_) # make an array from that
Out[126]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
But we can get the same array without split - just reshape:
In [127]: x.reshape(2,-1)
Out[127]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
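The equal and uneven cases can be checked side by side. In this sketch np.array_split stands in for the uneven case, since np.split itself refuses a split that does not divide evenly; the variable names are only illustrative.

```python
import numpy as np

x = np.arange(10)

# equal split: the pieces stack into a regular 2d array
parts = np.split(x, 2)
stacked = np.array(parts)
same_as_reshape = np.array_equal(stacked, x.reshape(2, -1))

# uneven split: the pieces have different lengths, so a list is the natural container
uneven = np.array_split(x, 3)
uneven_lengths = [len(p) for p in uneven]

print(same_as_reshape, uneven_lengths)
```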
Now look at the code for split. It just passes the task to array_split. Ignoring the details about alternative axes, it just does
sub_arys = []
for i in range(Nsections):
    # st and end come from div_points
    st, end = div_points[i], div_points[i + 1]
    sub_arys.append(sary[st:end])
return sub_arys
In other words, it just steps through array and returns successive slices. Those (often) are views of the original.
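The "views" remark is easy to check. A small sketch, assuming a plain 1d integer array: np.shares_memory reports whether a piece overlaps the original buffer, and writing into a piece shows up in the original.

```python
import numpy as np

x = np.arange(10)
pieces = np.split(x, 2)

# each piece is a slice of x, so it shares memory with it
shares = np.shares_memory(pieces[0], x)

# writing through a piece changes the original array too
pieces[1][0] = 99
print(shares, x[5])
```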
So split is not that sophisticated a function. You could generate such a list of subarrays yourself without a lot of numpy expertise.
Another point: the documentation notes that split can be reversed with an appropriate stack. concatenate (and family) takes a list of arrays. If given an array of arrays, or a higher-dimensional array, it effectively iterates on the first dimension, e.g. concatenate(arr) => concatenate(list(arr)).
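A short round trip illustrates both points; this is just a sketch with illustrative variable names.

```python
import numpy as np

x = np.arange(10)
pieces = np.split(x, 2)

# concatenate undoes the split
rebuilt = np.concatenate(pieces)

# a 2d array is treated like the list of its rows
grid = np.array(pieces)                 # shape (2, 5)
from_array = np.concatenate(grid)
from_list = np.concatenate(list(grid))

print(rebuilt, from_array, from_list)
```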
You are right, it returns a list:
import numpy as np
a=np.random.randint(1,30,(2,2))
b=np.hsplit(a,2)
type(b)
type(b) is list, so there is nothing wrong with the documentation. At first I also thought the documentation was wrong because it doesn't return an array, but when I checked
type(b[0])
type(b[1])
both returned ndarray.
That means it returns a list of ndarrays.

numpy where returns an array. I need just index

I have a numpy array: k = np.array([100,20,25,10,1,2]) and I'm trying to use np.where as index=np.where(k<10), which gives me
index = (array([4, 5]),). I'm interested in something that gives me just the indices, so that I'd have index[0]=4 rather than index[0]=[4 5].
I couldn't find anything about this in the numpy docs.
You can take the first element in the result that you get, as follows:
index=np.where(k<10)[0]
Then index is array([4, 5], dtype=int64), and you can access index[0] and index[1] as you wanted.
You can use numpy.flatnonzero, which returns an array of indices instead of a tuple of arrays:
k = np.array([100,20,25,10,1,2])
np.flatnonzero(k < 10)
# array([4, 5])
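Putting the two suggestions next to each other, as a sketch: both give the same index array, and a single element can be turned into a plain Python int if that is what is needed downstream. The -1 fallback for "no match" is an assumption of this example, not numpy behaviour.

```python
import numpy as np

k = np.array([100, 20, 25, 10, 1, 2])

via_where = np.where(k < 10)[0]      # unpack the 1-tuple that where returns
via_flat = np.flatnonzero(k < 10)    # already a plain index array

# first matching index as a Python int, or -1 when nothing matches
first = int(via_flat[0]) if via_flat.size else -1
print(via_where, via_flat, first)
```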

Numpy array indexing behavior

I was playing with numpy array indexing and found this odd behavior. When I index with an np.array or a list, it works as expected:
In[1]: arr = np.arange(10).reshape(5,2)
arr[ [1, 1] ]
Out[1]: array([[2, 3],
               [2, 3]])
But when I put tuple, it gives me a single element:
In[1]: arr = np.arange(10).reshape(5,2)
arr[ (1, 1) ]
Out[1]: 3
Also some kind of this strange tuple vs list behavior occurs with arr.flat:
In[1]: arr = np.arange(10).reshape(5,2)
In[2]: arr.flat[ [3, 4] ]
Out[2]: array([3, 4])
In[3]: arr.flat[ (3, 4) ]
Out[3]: IndexError: unsupported iterator index
I can't understand what is going on under the hood. What is the difference between a tuple and a list in this case?
Python 3.5.2
NumPy 1.11.1
What's happening is called fancy indexing, or advanced indexing. There's a difference between indexing with slices, or with a list/array. The trick is that multidimensional indexing actually works with tuples due to the implicit tuple syntax:
import numpy as np
arr = np.arange(10).reshape(5,2)
arr[2,1] == arr[(2,1)] # exact same thing: 2,1 matrix element
However, using a list (or array) inside an index expression will behave differently:
arr[[2,1]]
will index into arr with 2, then with 1, so first it fetches arr[2]==arr[2,:], then arr[1]==arr[1,:], and returns these two rows (row 2 and row 1) as the result.
It gets funkier:
print(arr[1:3,0:2])
print(arr[[1,2],[0,1]])
The first one is regular indexing, and it slices rows 1 to 2 and columns 0 to 1 inclusive; giving you a 2x2 subarray. The second one is fancy indexing, it gives you arr[1,0],arr[2,1] in an array, i.e. it indexes selectively into your array using, essentially, the zip() of the index lists.
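To make the zip() remark concrete, here is a small sketch that computes the fancy-indexed result both ways; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(10).reshape(5, 2)

# regular slicing: rows 1-2 crossed with columns 0-1, a 2x2 block
block = arr[1:3, 0:2]

# fancy indexing: index lists are paired up element by element
paired = arr[[1, 2], [0, 1]]

# the same pairing written out by hand with zip()
zipped = np.array([arr[i, j] for i, j in zip([1, 2], [0, 1])])

print(block)
print(paired, zipped)
```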
Now here's why flat works like that: it returns a flatiter of your array. From help(arr.flat):
class flatiter(builtins.object)
| Flat iterator object to iterate over arrays.
|
| A `flatiter` iterator is returned by ``x.flat`` for any array `x`.
| It allows iterating over the array as if it were a 1-D array,
| either in a for-loop or by calling its `next` method.
So the resulting iterator from arr.flat behaves as a 1d array. When you do
arr.flat[ [3, 4] ]
you're accessing two elements of that virtual 1d array using fancy indexing; it works. But when you're trying to do
arr.flat[ (3,4) ]
you're attempting to access the (3,4) element of a 1d (!) array, which is an error. The reason this raises "unsupported iterator index" rather than the usual too-many-indices IndexError is probably that arr.flat handles this indexing case itself.
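If the goal is to address a 2d position through arr.flat, the position has to be converted to a flat offset first. One way to do that, sketched here under the assumption of the default C (row-major) order, is np.ravel_multi_index.

```python
import numpy as np

arr = np.arange(10).reshape(5, 2)

# a list selects several positions of the virtual 1d array
picked = arr.flat[[3, 4]]

# convert the 2d position (row 1, column 1) to its flat offset
flat_pos = np.ravel_multi_index((1, 1), arr.shape)
value = arr.flat[flat_pos]

print(picked, flat_pos, value)
```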
In [387]: arr=np.arange(10).reshape(5,2)
With this list, you are selecting 2 rows from arr
In [388]: arr[[1,1]]
Out[388]:
array([[2, 3],
       [2, 3]])
It's the same as if you explicitly marked the column slice (with : or ...)
In [389]: arr[[1,1],:]
Out[389]:
array([[2, 3],
       [2, 3]])
Using an array instead of a list works: arr[np.array([1,1]),:]. (It also eliminates some ambiguities.)
With the tuple, the result is the same as if you wrote the indexing without the tuple wrapper. So it selects an element with row index of 1, column index of 1.
In [390]: arr[(1,1)]
Out[390]: 3
In [391]: arr[1,1]
Out[391]: 3
The arr[1,1] is translated by the interpreter to arr.__getitem__((1,1)). As is common in Python, 1,1 is shorthand for (1,1).
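What the brackets actually hand to __getitem__ can be seen without numpy at all. KeyEcho below is a hypothetical throwaway class that simply returns the key it receives.

```python
class KeyEcho:
    """Hypothetical helper: returns whatever key the brackets receive."""

    def __getitem__(self, key):
        return key


echo = KeyEcho()

comma_key = echo[1, 1]      # bare comma: Python builds a tuple
tuple_key = echo[(1, 1)]    # explicit tuple: same key
list_key = echo[[1, 1]]     # list: passed through as a list
slice_key = echo[1:3, 0]    # tuple holding a slice object and an int

print(comma_key, tuple_key, list_key, slice_key)
```

So the tuple and the bare comma form are indistinguishable to the array, while the list arrives as a different object, and numpy is free to treat it differently.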
In the arr.flat cases you are indexing the array as if it were 1d. np.arange(10)[[2,3]] selects 2 items, while np.arange(10)[(2,3)] is 2d indexing, hence the error.
A couple of recent questions touch on a messier corner case. Sometimes the list is treated as a tuple. The discussion might be enlightening, but don't go there if it's confusing.
Advanced slicing when passed list instead of tuple in numpy
numpy indexing: shouldn't trailing Ellipsis be redundant?

Index a numpy array with another array

I feel silly, because this is such a simple thing, but I haven't found the answer either here or anywhere else.
Is there no straightforward way of indexing a numpy array with another?
Say I have a 2D array
>> A = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
if I want to access element [3,1] I type
>> A[3,1]
8
Now, say I store this index in an array
>> ind = np.array([3,1])
and try using the index this time:
>> A[ind]
array([[7, 8],
       [3, 4]])
the result is not A[3,1]
The question is: having arrays A and ind, what is the simplest way to obtain A[3,1]?
Just use a tuple:
>>> A[(3, 1)]
8
>>> A[tuple(ind)]
8
The A[] actually calls the special method __getitem__:
>>> A.__getitem__((3, 1))
8
and using a comma creates a tuple:
>>> 3, 1
(3, 1)
Putting these two basic Python principles together solves your problem.
You can store your index in a tuple in the first place, if you don't need NumPy array features for it.
That is because by giving an array you are actually asking for
A[[3,1]]
which returns rows 3 and 1 of the 2d array, instead of the element at column 1 of row 3 that you want.
You can use
A[ind[0],ind[1]]
You can also use the following (if you want several indexes at the same time):
A[indx,indy]
Where indx and indy are numpy arrays of indexes for the first and second dimension accordingly.
See here for all possible indexing methods for numpy arrays: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.indexing.html
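The suggestions from both answers fit in one short sketch. The pairs array is made up for illustration: each of its rows is one (row, column) position in A.

```python
import numpy as np

A = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
ind = np.array([3, 1])

# one position: convert the index array to a tuple
single = A[tuple(ind)]
also_single = A[ind[0], ind[1]]

# several positions at once: one index array per dimension
pairs = np.array([[3, 1], [0, 0], [2, 1]])  # illustrative positions
several = A[pairs[:, 0], pairs[:, 1]]

print(single, also_single, several)
```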

Access elements inside numpy.where index

I'm a beginner to the Python world and hope someone can answer my question. I have an array and need to access certain indices of elements as below
x = np.random.rand(10)
x
array([ 0.56807058, 0.8404783 , 0.86835717, 0.76030882, 0.40242679,
        0.22941009, 0.56842643, 0.94541468, 0.92813747, 0.95980955])
indx = np.where(x < 0.5)
indx
(array([4, 5], dtype=int64),)
However, when I try to access the first element with indx[0], it returns array([4, 5], dtype=int64). What I want to do is access the elements 4 and 5 inside indx. Thank you for looking into my question and for any support.
np.where returns a tuple of index arrays. In this case the tuple contains only one array of indices. This is consistent with how where handles multi-dimensional arrays: it returns a tuple containing one array per dimension, which together define the indices of the non-zero elements.
To access 4 from indx you would do: indx[0][0]. The first [0] selects the first element of the indx tuple, which is array([4, 5], dtype=int64) and the second accesses an element of this array.
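Here is a sketch of both the 1d and the 2d case. The values in x are made up (fixed rather than random, so the result is reproducible), and m is an illustrative 2d array.

```python
import numpy as np

# fixed values instead of np.random.rand, chosen so positions 4 and 5 are below 0.5
x = np.array([0.57, 0.84, 0.87, 0.76, 0.40, 0.23, 0.57, 0.95, 0.93, 0.96])

indx = np.where(x < 0.5)      # 1-tuple holding one index array
positions = indx[0]           # the index array itself
first = positions[0]          # same as indx[0][0]
values = x[indx]              # the tuple can be used directly as an index

# 2d case: one index array per dimension
m = np.arange(6).reshape(2, 3)
rows, cols = np.where(m > 3)

print(positions, first, values, rows, cols)
```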
