NumPy indexing using List? - python

SOF,
I noticed an interesting NumPy demo in this URL:
http://cs231n.github.io/python-numpy-tutorial/
I see this:
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
# An example of integer array indexing.
# The returned array will have shape (3,).
print( a[[0, 1, 2], [0, 1, 0]] )
# Prints "[1 4 5]"
I understand using integers as index arguments:
a[1,1]
and this syntax:
a[0:2,:]
Generally,
If I use a list as index syntax, what does that mean?
Specifically,
I do not understand why:
print( a[[0, 1, 2], [0, 1, 0]] )
# Prints "[1 4 5]"

The last statement prints (in matrix notation) a(0,0), a(1,1) and a(2,0). In Python notation, that is a[0][0], a[1][1] and a[2][0].
The first index list contains the indices for the first axis (matrix notation: row index), the second list contains the indices for the second axis (column index).
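A minimal sketch of that pairing, using the array from the question. The names rows, cols, picked and by_hand are only illustrative; the explicit loop is there to show which elements the two index lists select, not how NumPy implements it.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
rows = [0, 1, 2]  # indices along axis 0 (row index)
cols = [0, 1, 0]  # indices along axis 1 (column index)

# Integer array indexing pairs the two lists element by element.
picked = a[rows, cols]

# The same selection written out by hand, one (row, column) pair at a time.
by_hand = np.array([a[r, c] for r, c in zip(rows, cols)])

print(picked)   # [1 4 5]
print(by_hand)  # [1 4 5]
```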

Related

Check shape of numpy array

I want to write a function that takes a numpy array and I want to check if it meets the requirements. One thing that confuses me is that:
np.array([1,2,3]).shape == np.array([[1,2,3],[2,3],[2,43,32]]).shape == (3,)
[1,2,3] should be allowed, while [[1,2,3],[2,3],[2,43,32]] shouldn't.
Allowed shapes:
[0, 1, 2, 3, 4]
[0, 1, 2]
[[1],[2]]
[[1, 2], [2, 3], [3, 4]]
Not Allowed:
[] (empty array is not allowed)
[[0], [1, 2]] (inner dimensions must have same size 1!=2)
[[[4,5,6],[4,3,2]],[[2,3,2],[2,3,4]]] (more than 2 dimensions)
You should start with defining what you want in terms of shape. I tried to understand it from the question, please add more details if it is not correct.
So here we have (1) empty array is not allowed and (2) no more than two dimensions. It translates the following way:
def is_allowed(arr):
    return arr.shape != (0, ) and len(arr.shape) <= 2
The first condition compares your array's shape with the shape of an empty one-dimensional array. The second condition checks that the array has no more than two dimensions.
The inner dimensions are a problem, though. Some of the lists you gave as examples cannot be converted to regular numpy arrays. If you call np.array([[1,2,3],[2,3],[2,43,32]]), you get a one-dimensional array in which each element is a Python list (recent NumPy versions raise a ValueError here unless you pass dtype=object). It is not a "real" numpy array with direct access to all the elements. See example:
>>> np.array([[1,2,3],[2,3],[2,43,32]])
array([list([1, 2, 3]), list([2, 3]), list([2, 43, 32])], dtype=object)
>>> np.array([[1,2,3],[2,3, None],[2,43,32]])
array([[1, 2, 3],
[2, 3, None],
[2, 43, 32]], dtype=object)
So, if you are working with ordinary Python lists, I would recommend checking that all inner lists have the same length before converting anything to numpy.
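One way to do that check without numpy is sketched below. The function name is_allowed_nested is hypothetical, and the rules it encodes (non-empty, at most two levels of nesting, equal inner lengths) are my reading of the allowed and not-allowed lists in the question, so adjust them if your requirements differ.

```python
def is_allowed_nested(data):
    """Return True for a non-empty flat list or a non-empty rectangular list of lists."""
    if not isinstance(data, list) or len(data) == 0:
        return False  # empty input is not allowed

    inner_is_list = [isinstance(item, list) for item in data]
    if not any(inner_is_list):
        return True   # flat list of scalars: one dimension
    if not all(inner_is_list):
        return False  # scalars mixed with lists

    first_len = len(data[0])
    if first_len == 0:
        return False  # empty inner lists are treated as empty input
    for row in data:
        if len(row) != first_len:
            return False  # inner dimensions differ
        if any(isinstance(item, list) for item in row):
            return False  # more than two dimensions
    return True


print(is_allowed_nested([[1, 2], [2, 3], [3, 4]]))  # True
print(is_allowed_nested([[0], [1, 2]]))             # False
```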

Add 2D arrays side by side in python

Hello everyone, I want to put two 2x2 arrays side by side in Python. In the end I want a 2x4 array that keeps the rows, with the 1st and 2nd columns taken from the first array and the 3rd and 4th columns taken from the second array. Instead, I get an array that sums the two arrays rather than placing them side by side. Can you help me please?
Example:
Array 1:
[[1 2]
[1 2]]
Array 2:
[[1 2]
[1 2]]
Expected Result:
[[1 2 1 2]
[1 2 1 2]]
Real Result:
[[2 4]
[2 4]]
import numpy as np
a = np.matrix('1 2; 1 2')
b = np.matrix('1 2; 1 2')
x = a + b
print(x)
Using np.concatenate
>>> np.concatenate((a, b), axis=1)
matrix([[1, 2, 1, 2],
[1, 2, 1, 2]])
Another option is using np.hstack:
>>> np.hstack((a, b))
matrix([[1, 2, 1, 2],
[1, 2, 1, 2]])
I think this happens because + performs ordinary matrix addition, which adds the two matrices component by component.
Try np.concatenate(); it might help, as @sacul has suggested.
numpy arrays do not act in the same way as python lists. Whereas the + operator can do some sort of list concatenation, when you use it with numpy arrays, you are doing vector addition.
Instead, you can flatten each array and concatenate:
np.concatenate([a.flatten(),b.flatten()])
matrix([[1, 2, 1, 2],
[1, 2, 1, 2]])
[Edit:]
Re-reading your question, it seems I misunderstood what you were after. @Thomas's answers make more sense in your scenario, and an alternative would be np.column_stack:
>>> np.column_stack((a,b))
matrix([[1, 2, 1, 2],
[1, 2, 1, 2]])
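For comparison, here is a small sketch of the same operations on plain ndarrays instead of np.matrix. The variable names summed, side_by_side and same are only illustrative.

```python
import numpy as np

a = np.array([[1, 2], [1, 2]])
b = np.array([[1, 2], [1, 2]])

# + adds element by element, so the shape stays (2, 2).
summed = a + b

# Joining along axis 1 places the columns of b to the right of a.
side_by_side = np.hstack((a, b))
same = np.concatenate((a, b), axis=1)

print(summed)
# [[2 4]
#  [2 4]]
print(side_by_side)
# [[1 2 1 2]
#  [1 2 1 2]]
```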

numpy array indexing with lists and arrays

I have:
>>> a
array([[1, 2],
[3, 4]])
>>> type(l), l # list of scalars
(<type 'list'>, [0, 1])
>>> type(i), i # a numpy array
(<type 'numpy.ndarray'>, array([0, 1]))
>>> type(j), j # list of numpy arrays
(<type 'list'>, [array([0, 1]), array([0, 1])])
When I do
>>> a[l] # Case 1, l is a list of scalars
I get
array([[1, 2],
[3, 4]])
which means indexing happened only on 0th axis.
But when I do
>>> a[j] # Case 2, j is a list of numpy arrays
I get
array([1, 4])
which means indexing happened along axis 0 and axis 1.
Q1: When used for indexing, why are a list of scalars and a list of numpy arrays treated differently? (Case 1 vs Case 2). In Case 2, I was hoping to see indexing happen only along axis 0 and get
array( [[[1,2],
[3,4]],
[[1,2],
[3,4]]])
Now, when using numpy array of arrays instead
>>> j1 = np.array(j) # numpy array of arrays
The result below indicates that indexing happened only along axis 0 (as expected)
>>> a[j1] # Case 3, j1 is a numpy array of numpy arrays
array([[[1, 2],
[3, 4]],
[[1, 2],
[3, 4]]])
Q2: When used for indexing, why is there a difference in treatment of list of numpy arrays and numpy array of numpy arrays? (Case 2 vs Case 3)
Case1, a[l] is actually a[(l,)] which expands to a[(l, slice(None))]. That is, indexing the first dimension with the list l, and an automatic trailing : slice. Indices are passed as a tuple to the array __getitem__, and extra () may be added without confusion.
Case2, a[j] is treated as a[array([0, 1]), array([0, 1])] or a[(array([0, 1]), array([0, 1]))]. In other words, as a tuple of indexing objects, one per dimension. It ends up returning a[0,0] and a[1,1].
Case3, a[j1] is a[(j1, slice(None))], applying the j1 index to just the first dimension.
Case2 is a bit of an anomaly. Your intuition is valid, but for historical reasons, this list of arrays (or list of lists) is interpreted as a tuple of arrays.
This has been discussed in other SO questions, and I think it is documented. But off hand I can't find those references.
So it's safer to use either a tuple of indexing objects, or an array. Indexing with a list has a potential ambiguity.
numpy array indexing: list index and np.array index give different result
This SO question touches on the same issue, though the clearest statement of what is happening is buried in a code link in a comment by @user2357112.
Another way of forcing Case3-like indexing is to make the 2nd dimension slice explicit, a[j,:]:
In [166]: a[j]
Out[166]: array([1, 4])
In [167]: a[j,:]
Out[167]:
array([[[1, 2],
[3, 4]],
[[1, 2],
[3, 4]]])
(I often include the trailing : even if it isn't needed. It makes it clear to me, and readers, how many dimensions we are working with.)
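The unambiguous spellings can be sketched as follows. This is only an illustration of the advice to index with a tuple or an array rather than a bare list of arrays; the names pairs, rows and stacked are mine. It deliberately avoids a[j] with a plain list, since how that form is interpreted may depend on the NumPy version.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
i = np.array([0, 1])

# A tuple of index arrays: one array per dimension, paired element by element.
pairs = a[(i, i)]            # same as a[i, i], i.e. a[0, 0] and a[1, 1]

# A single index array plus an explicit trailing slice: axis 0 only.
rows = a[i, :]

# A 2-D index array applied to axis 0 only: each entry selects a whole row.
stacked = a[np.array([i, i]), :]

print(pairs)          # [1 4]
print(rows.shape)     # (2, 2)
print(stacked.shape)  # (2, 2, 2)
```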
A1: The structure of l is not the same as j.
l is one-dimensional while j is two-dimensional. If you change one of them:
# l = [0, 1] # just one dimension!
l = [[0, 1], [0, 1]] # two dimensions
j = [np.array([0,1]), np.array([0, 1])] # two dimensions
They behave the same way.
A2: The same, the structure of arrays in Case 2 and Case 3 are not the same.

How to access numpy array with a set of indices stored in another numpy array?

I have a numpy array which stores a set of indices I need to access another numpy array.
I tried to use a for loop but it doesn't work as I expected.
The situation is like this:
>>> a
array([[1, 2],
[3, 4]])
>>> c
array([[0, 0],
[0, 1]])
>>> a[c[0]]
array([[1, 2],
[1, 2]])
>>> a[0,0] # the result I want
1
Above is a simplified version of my actual code, where the c array is much larger so I have to use a for loop to get every index.
Convert it to a tuple:
>>> a[tuple(c[0])]
1
This works because list and array indices trigger advanced indexing, while tuples are (mostly) treated as basic indexing, with one entry per dimension.
To avoid the loop, index a with the columns of c, passing the first column as the row indices and the second as the column indices:
In [23]: a[c[:,0], c[:,1]]
Out[23]: array([1, 2])
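Both suggestions can be compared side by side in a short sketch, using the a and c from the question. The names looped and vectorized are only illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
c = np.array([[0, 0], [0, 1]])  # each row of c is one (row, column) index pair

# Loop version: convert each index pair to a tuple before indexing.
looped = [a[tuple(idx)] for idx in c]

# Vectorized version: first column of c gives rows, second gives columns.
vectorized = a[c[:, 0], c[:, 1]]

print(looped)      # two scalar values: a[0, 0] and a[0, 1]
print(vectorized)  # [1 2]
```

For a large c, the vectorized form avoids the Python-level loop entirely.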

finding indices that would sort numpy column returns zeros

I am trying to get the indices that would sort each column of an array using the function argsort. However, it keeps simply returning zeros instead of the true indices. For example:
x = np.matrix([[5, 2, 6], [3, 4, 1]])
print(x)
print(x[:,0])
print(x[:,1])
print(x[:,2])
print(x[:,0].argsort())
print(x[:,1].argsort())
print(x[:,2].argsort())
I am expecting this to return three arrays. [1 0], [0 1] and [1 0] denoting the indices of each column if it were sorted, however, instead I get three arrays that all contain zeros.
Any help much appreciated!
Indexing a matrix with a slice always returns another 2-d matrix. (This behavior is not the same as for a regular numpy array.) See, for example, the output of x[:,0]:
In [133]: x[:,0]
Out[133]:
matrix([[5],
[3]])
x[:,0] is a matrix with shape (2, 1).
To argsort the first (and only) column of that matrix, you have to tell argsort to use the first axis:
In [135]: x[:,0].argsort(axis=0)
Out[135]:
matrix([[1],
[0]])
The default (axis=-1) is to use the last axis, and since the rows in that matrix have length 1, the result when axis is not given is a column of zeros.
By the way, you can do all the columns at once:
In [138]: x
Out[138]:
matrix([[5, 2, 6],
[3, 4, 1]])
In [139]: x.argsort(axis=0)
Out[139]:
matrix([[1, 0, 1],
[0, 1, 0]])
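As a point of comparison, here is a sketch of the same calls on a regular ndarray instead of np.matrix. With an ndarray, x[:, 0] is one-dimensional, so argsort() without an axis argument already gives the expected indices. The names col0, order0 and all_cols are only illustrative.

```python
import numpy as np

x = np.array([[5, 2, 6], [3, 4, 1]])

# Slicing a column out of an ndarray drops that axis: shape (2,), not (2, 1).
col0 = x[:, 0]
order0 = col0.argsort()

# All columns at once: sort along axis 0 (down each column).
all_cols = x.argsort(axis=0)

print(col0.shape)  # (2,)
print(order0)      # [1 0]
print(all_cols)
# [[1 0 1]
#  [0 1 0]]
```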
