I am learning numpy newly and confused about syntax used in indexing of arrays. For example:
arr[2, 3]
This means element at intersection of 3nd row and 4th column. What confuses me separation of different indices by comma inside square brackets (like in function arguments). Doing so with python lists is not valid:
l = [[1, 2], [3, 4]]
l[1, 1]
Traceback (most recent call last):
File "", line 1, in
TypeError: list indices must be integers or slices, not tuple
So, if this not a valid python syntax, how numpy arrays work?
Use Colon ':' instead of commas ','.
In slicing or indexing is done using colon ':'
In your above example,
l = [[1, 2], [3, 4]]
->l[0] is [1,2] and -> l[1] is [3,4]
Read further documentation for better understanding.
Thank You
In your given example, you're comparing a numpy array to a list of lists. The main difference between the two is that a numpy array is predictable in terms of shape, data type of its elements, and so on, while a list can contain an arbitrary combination of any other python objects (lists, tuples, strings, etc.)
Take this as an example, say you create a numpy array like so:
arr = np.array([[0, 1], [2, 3], [4, 5]])
Here, the shape of arr is known right after instantiation "arr.shape returns (3,2)", so you can easily index the array with only a comma separated square bracket. On the other hand, take the list example:
l = [[0, 1], [2, 3], [4, 5]]
l[0] # This returns the list [0, 1]
l[0].append("HELLO")
l[0] # This returns the list [0, 1, "HELLO"]
A list is very unpredictable, as there's no way to know what each list element will return to you. So, the way we index a specific element in a list of lists is by using 2 square brackets "e.g. l[0][0]"
What if we created a non-uniform numpy array? Well, you get a similar behaviour to a list of lists:
arr = np.array([[0, 1], [2, 3], [4]]) # Here, you get a Warning!
print(arr) # Returns: array([list([0, 1]), list([2, 3]), list([4])], dtype=object)
In this case, you can't index the numpy array using [0, 0]. Instead, you have to use two square brackets, just like a list of lists
You can also check the documentation of ndarray for more info.
Related
I want to "set" the values of a row of a Python nested list to another row without using NumPy.
I have a sample list:
lst = [[0, 0, 1],
[0, 2, 3],
[5, 2, 3]]
I want to make row 1 to row 2, row 2 to row 3, and row 3 to row 1. My desired output is:
lst = [[0, 2, 3],
[5, 2, 3],
[0, 0, 1]]
How can I do this without using Numpy?
I tried to do something like arr[[0, 1]] = arr[[1, 0]] but it gives the error 'NoneType' object is not subscriptable.
One very straightforward way:
arr = [arr[-1], *arr[:-1]]
Or another way to achieve the same:
arr = [arr[-1]] + arr[:-1]
arr[-1] is the last element of the array. And arr[:-1] is everything up to the last element of the array.
The first solution builds a new list and adds the last element first and then all the other elements. The second one constructs a list with only the last element and then extends it with the list containing the rest.
Note: naming your list an array doesn't make it one. Although you can access a list of lists like arr[i1][i2], it's still just a list of lists. Look at the array documentation for Python's actual array.
The solution user #MadPhysicist provided comes down to the second solution provided here, since [arr[-1]] == arr[-1:]
Since python does not actually support multidimensional lists, your task becomes simpler by virtue of the fact that you are dealing with a list containing lists of rows.
To roll the list, just reassemble the outer container:
result = lst[-1:] + lst[:-1]
Numpy has a special interpretation for lists of integers, like [0, 1], tuples, like :, -1, single integers, and slices. Python lists only understand single integers and slices as indices, and do not accept tuples as multidimensional indices, because, again, lists are fundamentally one-dimensional.
use this generalisation
arr = [arr[-1]] + arr[:-1]
which according to your example means
arr[0],arr[1],arr[2] = arr[1],arr[2],arr[0]
or
arr = [arr[2]]+arr[:2]
or
arr = [arr[2]]+arr[:-1]
You can use this
>>> lst = [[0, 0, 1],
[0, 2, 3],
[5, 2, 3]]
>>> lst = [*lst[1:], *lst[:1]]
>>>lst
[[0, 2, 3],
[5, 2, 3],
[0, 0, 1]]
I need to access the first element of each list within a list. Usually I do this via numpy arrays by indexing:
import numpy as np
nparr=np.array([[1,2,3],[4,5,6],[7,8,9]])
first_elements = nparr[:,0]
b/c:
print(nparr[0,:])
[1 2 3]
print(nparr[:,0])
[1 4 7]
Unfortunately I have to tackle non-rectangular dynamic arrays now so numpy won't work.
But Pythons standard lists behave strangely (at least for me):
pylist=[[1,2,3],[4,5,6],[7,8,9]]
print(pylist[0][:])
[1, 2, 3]
print(pylist[:][0])
[1, 2, 3]
I guess either lists doesn't support this (which would lead to a second question: What to use instead) or I got the syntax wrong?
You have a few options. Here's one.
pylist=[[1,2,3],[4,5,6],[7,8,9]]
print(pylist[0]) # [1, 2, 3]
print([row[0] for row in pylist]) # [1, 4, 7]
Alternatively, if you want to transpose pylist (make its rows into columns), you could do the following.
pylist_transpose = [*zip(*pylist)]
print(pylist_transpose[0]) # [1, 4, 7]
pylist_transpose will always be a rectangular array with a number of rows equal to the length of the shortest row in pylist.
I have a list of list that looks like this:a = [[1,1],[1,1]].
I would like to convert it into a numpy array that looks like this:a = [[1,1] [1,1]].
Is there any way I can achieve this? I don't seem to be able to mimic the X_train dataset structure given by keras.datasets.imdb even though type(X_train) = numpy.ndarray.
You need to explain the keras target better. The presence or not of commas is an indication of the nature of the data structure, but it is really just a display convention.
Python list uses the comma delimiter all the time:
In [751]: alist = [[1,2],[3,4]]
In [752]: alist
Out[752]: [[1, 2], [3, 4]]
A numpy array can be displayed with and without the comma.
In [753]: arr=np.array(arr)
In [754]: arr
Out[754]:
array([[1, 2],
[3, 4]])
In [755]: str(arr)
Out[755]: '[[1 2]\n [3 4]]'
In [756]: print(arr)
[[1 2]
[3 4]]
To get your mix of commas and space I'd have to make an object dtype array:
In [757]: a=np.empty((2,), dtype=object)
In [758]: a
Out[758]: array([None, None], dtype=object)
In [759]: print(a)
[None None]
In [760]: a[0]=[1,2]
In [761]: a[1]=[3,4]
In [762]: a
Out[762]: array([[1, 2], [3, 4]], dtype=object)
In [763]: print(a)
[[1, 2] [3, 4]]
I can imagine the keras package constructing such an array, but I wouldn't recommend it to a beginner.
From the keras imdb documentation
X_train, X_test: list of sequences, which are lists of indexes (integers). If the nb_words argument was specific, the maximum possible index value is nb_words-1. If the maxlen argument was specified, the largest possible sequence length is maxlen.
It says list of lists, but it could be object dtype array of lists. Sounds like those inner lists vary in length, which explains why it isn't a 2d array.
np.array([[1,2], [3,4,5]])
Produces such an array.
I feel silly, because this is such a simple thing, but I haven't found the answer either here or anywhere else.
Is there no straightforward way of indexing a numpy array with another?
Say I have a 2D array
>> A = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
array([[1, 2],
[3, 4],
[5, 6],
[7, 8]])
if I want to access element [3,1] I type
>> A[3,1]
8
Now, say I store this index in an array
>> ind = np.array([3,1])
and try using the index this time:
>> A[ind]
array([[7, 8],
[3, 4]])
the result is not A[3,1]
The question is: having arrays A and ind, what is the simplest way to obtain A[3,1]?
Just use a tuple:
>>> A[(3, 1)]
8
>>> A[tuple(ind)]
8
The A[] actually calls the special method __getitem__:
>>> A.__getitem__((3, 1))
8
and using a comma creates a tuple:
>>> 3, 1
(3, 1)
Putting these two basic Python principles together solves your problem.
You can store your index in a tuple in the first place, if you don't need NumPy array features for it.
That is because by giving an array you actually ask
A[[3,1]]
Which gives the third and first index of the 2d array instead of the first index of the third index of the array as you want.
You can use
A[ind[0],ind[1]]
You can also use (if you want more indexes at the same time);
A[indx,indy]
Where indx and indy are numpy arrays of indexes for the first and second dimension accordingly.
See here for all possible indexing methods for numpy arrays: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.indexing.html
I have a quite involved nested list: each element is a tuple with two elements: one is an object, the other is an 3x2xn array. Here is a toy model.
toy=[('mol1',array([[[1,1,1],[2,2,2]],[[1,1,1],[2,2,2]]])),('mol2',array([[[1,1,1],[2,2,2]],[[1,1,1],[2,2,2]]]))]
How can I get a single column from that?
I am looking for
('mol1', 'mol2')
and for the 2Darrays like:
array([[1,1,1],[1,1,1],[1,1,1],[1,1,1]])
I have a solution but I think it is pretty inefficient:
zip(*toy)[0]
it returns
('mol1', 'mol2')
then
zip(*toy)[1][0][:,0]
which returns
array([[1, 1, 1],
[1, 1, 1]])
a for cycle like that
for i in range(len(toy)):
zip(*toy)[1][i][:,0]
gives all the element of the column and I can build it with a vstack
This should be reasonably efficient:
>>> tuple(t[0] for t in toy)
('mol1', 'mol2')
For the 2D array, with the help of numpy's vstack function:
>>> from numpy import vstack
>>> vstack([t[1][:, 0] for t in toy])
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
You can use the array in numpy to store your data or convert yours to that, then use the column slicing function built in. In general numpy slicing is very fast.
import numpy as np
np.asarray(toy)[::, 0] # first column
# output
array(['mol1', 'mol2'],
dtype='|S4')