Numpy: index array by array - python

>>> idx = np.random.randint(2, size=(9, 31))
>>> a = np.random.random((9, 31, 2))
>>> a[idx].shape
(9, 31, 31, 2)
Why is the above not resulting in at least a shape of (9, 31, 1), or even better (9, 31)? How can I get it to return a selection based on the values in idx?
Update
This is perhaps a more concrete and hopefully analogous example: Assume this array
a = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
How would I go about selecting the array [1, 4, 5, 8] (i.e. the 0th, 1st, 0th, 1st element of each row)?

I think this is what you want:
>>> a[np.arange(9)[:, None], np.arange(31), idx].shape
(9, 31)
For your second example you would do:
>>> a[np.arange(4), [0, 1, 0, 1]]
array([1, 4, 5, 8])
Read the docs on fancy indexing, especially the part on what happens when you don't have an index array for each dimension: those extra np.arange arrays are placed there to avoid that behavior.
Note also how they are reshaped (indexing with [:, None] is equivalent to .reshape(-1, 1)) so that their broadcast shape has the shape of the desired output array.
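As a minimal sketch of the same selection, assuming NumPy 1.15 or later, np.take_along_axis should give the same (9, 31) result without writing the np.arange arrays by hand. The names picked and picked_alt are only illustrative.

```python
import numpy as np

idx = np.random.randint(2, size=(9, 31))
a = np.random.random((9, 31, 2))

# Explicit index arrays: shapes (9, 1), (31,) and (9, 31) broadcast to (9, 31)
picked = a[np.arange(9)[:, None], np.arange(31), idx]

# take_along_axis needs idx to have as many dimensions as a,
# so add a trailing axis and drop it again afterwards
picked_alt = np.take_along_axis(a, idx[..., None], axis=2)[..., 0]

print(picked.shape)                        # (9, 31)
print(np.array_equal(picked, picked_alt))  # True
```

Either way, picked[i, j] is a[i, j, idx[i, j]].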

You're doing advanced indexing on the ndarray: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing.
Advanced indexes are always broadcast and iterated as one.
In your case you supply a single index array for an array with three dimensions, so it only indexes the first axis. Each element of idx therefore selects a whole slice a[i] of shape (31, 2) rather than a single element, and the result has shape idx.shape + a.shape[1:], i.e. (9, 31, 31, 2).
UPDATE:
>>> list(map(lambda idx: a[idx[0], idx[1]], [[0, 0], [1, 1], [2, 0], [3, 1]]))
This will return:
[1, 4, 5, 8]
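To make the shape in the question concrete, here is a small sketch (variable names are only illustrative) checking that a single index array replaces the first axis of a with the shape of idx and keeps the remaining axes.

```python
import numpy as np

idx = np.random.randint(2, size=(9, 31))
a = np.random.random((9, 31, 2))

# A single index array only indexes the first axis of a
result = a[idx]
print(result.shape)                             # (9, 31, 31, 2)
print(result.shape == idx.shape + a.shape[1:])  # True

# Each entry of the result is a whole (31, 2) slice of a
i, j = 4, 7
print(np.array_equal(result[i, j], a[idx[i, j]]))  # True
```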

Related

How to use a combination of integer indices and an array as indices for a multi-dimensional array?

x = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11]])
x = np.array([np.array(x), np.array(x), np.array(x)])
arr = [[1, 1], [2, 2]]
print(x[:, arr])
I need (:, 1, 1) and (:, 2, 2) of that array
That is:
[5, 5, 5]
[10, 10, 10]
But it is returning (:, 1) and (:, 2) twice
I've tried using tuple as well.
Edit: using x[(slice(None), *zip(*arr))] worked. But what if I need to use ':' in between the two values of arr?
Like x[arr[0], :, arr[1]]
What you can do is obtain these elements with
x[:,[1,2],[1,2]]
You may not be able to write this directly, since arr can have an arbitrary number of elements. In that case we can unpack the indices into a tuple:
x[(slice(None), *zip(*arr))]
Here the zip(*arr) will transpose the elements in arr, and we will then unpack the transpose as extra elements in the tuple. The slice(None) is basically what happens behind the curtains if you write a : in a subscript.
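The follow-up about putting ':' between the two index lists can be handled the same way, by building the index tuple by hand. This is only a sketch: the names x0, rows, cols, first and between are illustrative, and it assumes the index lists come from transposing arr, as in the answer above.

```python
import numpy as np

x0 = np.array([[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11]])
x = np.array([x0, x0, x0])      # shape (3, 3, 4)
arr = [[1, 1], [2, 2]]

rows, cols = zip(*arr)          # (1, 2) and (1, 2)

# Slice first: equivalent to x[:, [1, 2], [1, 2]], shape (3, 2)
first = x[(slice(None), rows, cols)]
print(first.T)                  # [[ 5  5  5], [10 10 10]]

# Slice in between: equivalent to x[[1, 2], :, [1, 2]]
between = x[(rows, slice(None), cols)]
print(between.shape)            # (2, 3)
```

Note that when the index arrays are separated by a slice, NumPy puts the broadcast index dimension first in the result, which is why between has shape (2, 3) rather than (3, 2).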

Check shape of numpy array

I want to write a function that takes a numpy array and I want to check if it meets the requirements. One thing that confuses me is that:
np.array([1,2,3]).shape == np.array([[1,2,3],[2,3],[2,43,32]]).shape == (3,)
[1,2,3] should be allowed, while [[1,2,3],[2,3],[2,43,32]] shouldn't.
Allowed shapes:
[0, 1, 2, 3, 4]
[0, 1, 2]
[[1],[2]]
[[1, 2], [2, 3], [3, 4]]
Not Allowed:
[] (empty array is not allowed)
[[0], [1, 2]] (inner dimensions must have the same size, 1 != 2)
[[[4,5,6],[4,3,2]],[[2,3,2],[2,3,4]]] (more than 2 dimensions)
You should start with defining what you want in terms of shape. I tried to understand it from the question, please add more details if it is not correct.
So here we have (1) an empty array is not allowed and (2) no more than two dimensions. This translates to:
def is_allowed(arr):
    return arr.shape != (0, ) and len(arr.shape) <= 2
The first condition just compares your array's shape with the shape of an empty array. The second condition checks that the array has no more than two dimensions.
With inner dimensions there is a problem. Some of the lists you provided as examples cannot be converted into regular numpy arrays. If you call np.array([[1,2,3],[2,3],[2,43,32]]), you just get a one-dimensional array where each element is a list (recent NumPy versions raise a ValueError for such ragged input unless you pass dtype=object). It is not a "real" numpy array with direct access to all the elements. See example:
>>> np.array([[1,2,3],[2,3],[2,43,32]])
array([list([1, 2, 3]), list([2, 3]), list([2, 43, 32])], dtype=object)
>>> np.array([[1,2,3],[2,3, None],[2,43,32]])
array([[1, 2, 3],
       [2, 3, None],
       [2, 43, 32]], dtype=object)
So I would recommend (if you are operating on ordinary lists) checking that all inner lists have the same length without numpy.
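A pure-Python check along those lines might look like the sketch below. The function name is_allowed_list is hypothetical, and the rules it encodes (non-empty, at most two levels of nesting, equal inner lengths) are read off the examples in the question, so adjust them if your requirements differ.

```python
def is_allowed_list(data):
    """Return True for a non-empty 1-D list or a rectangular 2-D list of lists."""
    if not isinstance(data, list) or len(data) == 0:
        return False
    nested = [isinstance(row, list) for row in data]
    if not any(nested):
        return True                      # plain 1-D list
    if not all(nested):
        return False                     # mix of scalars and lists
    width = len(data[0])
    if width == 0:
        return False                     # empty inner lists
    for row in data:
        if len(row) != width:
            return False                 # ragged inner dimension
        if any(isinstance(el, list) for el in row):
            return False                 # more than two dimensions
    return True


print(is_allowed_list([0, 1, 2]))        # True
print(is_allowed_list([[0], [1, 2]]))    # False
```

Once the list passes this check, np.array(data) produces a regular numeric array whose shape can be trusted.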

how to reshape an array of tuples

I need to reshape numpy arrays in order to plot some data.
The following works fine:
import numpy as np
target_shape = (350, 277)
arbitrary_array = np.random.normal(size = 96950)
reshaped_array = np.reshape(arbitrary_array, target_shape)
However, if instead of an array of shape (96950,) I have an array of tuples with 3 elements each, i.e. shape (96950, 3), I get a
cannot reshape array of size 290850 into shape (350,277)
Here is the code to replicate the error
array_of_tuple = np.array([(el, el, el) for el in arbitrary_array])
reshaped_array = np.reshape(array_of_tuple, target_shape)
I guess that what reshape is doing is flattening the array of tuples (hence the size 290850) and then trying to reshape it. However, what I would like to have is an array of tuples in the shape (350, 277), basically ignoring the second dimension and just reshaping the tuples as if they were scalars. Is there a way of achieving this?
You could reshape to (350, 277, 3) instead:
>>> a = np.array([(x,x,x) for x in range(10)])
>>> a.reshape((2,5,3))
array([[[0, 0, 0],
        [1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [4, 4, 4]],
       [[5, 5, 5],
        [6, 6, 6],
        [7, 7, 7],
        [8, 8, 8],
        [9, 9, 9]]])
Technically, the result will not be a 350x277 2D array of 3-tuples but a 350x277x3 3D array. Then again, your array_of_tuple is not an actual "array of tuples" either, but a 2D array of shape (96950, 3).
reshaped_array = np.reshape(array_of_tuple, (350, -1))
reshaped_array.shape
gives (350, 831)
You are getting the error because the requested number of rows and columns does not account for all the elements of the array:
350*831 = 290850, whereas
350*277 = 96950
so numpy doesn't know what to do with the additional elements. You could reduce the size of the original array to reduce the number of elements. If you don't want to remove any elements, then
np.reshape(array_of_tuple, (350, 277, 3))
is an option.
Your problem stems from a misconception about the result of np.array(iterable). Have a look at this:
In [7]: import numpy as np
In [8]: np.array([(el, el, el) for el in (1,)])
Out[8]: array([[1, 1, 1]])
In [9]: _.shape
Out[9]: (1, 3)
and ask yourself what the shape of the following is:
array_of_tuple = np.array([(el, el, el) for el in np.random.normal(size = 96950)])
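Putting the answers together, a short sketch of the shapes involved: the list of 3-tuples becomes a 2D array, and passing -1 as the last dimension lets reshape keep the 3 components together. The name reshaped is only illustrative.

```python
import numpy as np

target_shape = (350, 277)
arbitrary_array = np.random.normal(size=96950)

# A list of 3-tuples becomes a 2D array, not a 1D array of tuples
array_of_tuple = np.array([(el, el, el) for el in arbitrary_array])
print(array_of_tuple.shape)      # (96950, 3)

# Keep the last axis intact; -1 lets NumPy infer its length (3)
reshaped = array_of_tuple.reshape(*target_shape, -1)
print(reshaped.shape)            # (350, 277, 3)

# reshaped[i, j] is the triple that was at flat position i * 277 + j
print(np.array_equal(reshaped[1, 2], array_of_tuple[1 * 277 + 2]))  # True
```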

Numpy: Get matrices from tensor given list of indexes

I have a tensor with the shape (4, 3, 20). When I do X[:, 0, :].shape I get (4, 20). When I do X[:, [0,2,0,1], :].shape I get (4, 4, 20).
What I have is a list of indexes representing the second dimension of my tensor. I want to get a two-dimensional matrix like I get when I do X[:, 0, :] but I have different indexes for the second dimension instead of only one. How do I do that?
Your question is unclear, but I'll make a guess
In [58]: X=np.arange(24).reshape(4,3,2)
In [59]: X[range(4),[0,2,0,1],:]
Out[59]:
array([[ 0,  1],
       [10, 11],
       [12, 13],
       [20, 21]])
This picks row 0 from the 1st plane, row 2 from the 2nd, and so on. The result has the same shape as X[:,0,:], but each row of the result is pulled from a different row of its plane along the first dimension.
In [61]: X[:,0,:]
Out[61]:
array([[ 0,  1],   # same
       [ 6,  7],
       [12, 13],   # same
       [18, 19]])
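The same idea applied to the (4, 3, 20) tensor from the question, as a sketch that assumes the index list L has exactly one entry per element of the first axis:

```python
import numpy as np

X = np.random.rand(4, 3, 20)
L = [0, 2, 0, 1]                 # one second-axis index per first-axis entry

# Pair plane i with row L[i]; np.arange generalizes range(4)
picked = X[np.arange(X.shape[0]), L, :]
print(picked.shape)              # (4, 20)
```

Here picked[i] is X[i, L[i], :], so the result is 2D just like X[:, 0, :].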
I think you are looking for np.squeeze. When the indexing list L has just one element, indexing the input array with it gives a 3D array with a singleton second dimension (a dimension of length 1), and squeezing turns that into a 2D output. When L has more than one element, indexing gives a 3D array without any singleton dimension, so squeezing changes nothing and you get the desired output. Thus, the solution would be -
np.squeeze(X[:,L,:])
Sample run to test out shapes on a random array -
In [25]: A = np.random.rand(4,3,20)
In [26]: L = [0]
In [27]: np.squeeze(A[:,L,:]).shape
Out[27]: (4, 20)
In [28]: L = [0,2,0,1]
In [29]: np.squeeze(A[:,L,:]).shape
Out[29]: (4, 4, 20)
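One caveat worth sketching: np.squeeze without an axis argument removes every axis of length 1, not just the indexed one. The helper below (select_rows is a hypothetical name) only drops the second axis, and only when it has length 1.

```python
import numpy as np

def select_rows(X, L):
    """Index the second axis with L; drop that axis only if it has length 1."""
    out = X[:, L, :]
    return np.squeeze(out, axis=1) if out.shape[1] == 1 else out

A = np.random.rand(1, 3, 20)               # first axis also has length 1
print(np.squeeze(A[:, [0], :]).shape)      # (20,)  both singleton axes removed
print(select_rows(A, [0]).shape)           # (1, 20)
print(select_rows(A, [0, 2, 0, 1]).shape)  # (1, 4, 20)
```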

Numpy array slicing using colons

I am trying to learn numpy array slicing.
But this is a syntax I cannot seem to understand.
What does
a[:, 1] do?
I ran it in Python.
a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])
a = a.reshape(2,2,2,2)
a[:, 1]
Output:
array([[[ 5,  6],
        [ 7,  8]],
       [[13, 14],
        [15, 16]]])
Can someone explain to me the slicing and how it works. The documentation doesn't seem to answer this question.
Another question: would there be a way to generate the array a using something like
np.array(1:16), or something like in Python where
x = [x for x in range(16)]
The commas in slicing are to separate the various dimensions you may have. In your first example you are reshaping the data to have 4 dimensions each of length 2. This may be a little difficult to visualize so if you start with a 2D structure it might make more sense:
>>> a = np.arange(16).reshape((4, 4))
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
>>> a[0] # access the first "row" of data
array([0, 1, 2, 3])
>>> a[0, 2] # access the 3rd column (index 2) in the first row of the data
2
If you want to access multiple values using slicing you can use the colon to express a range:
>>> a[:, 1] # get the entire 2nd (index 1) column
array([ 1,  5,  9, 13])
>>> a[1:3, -1] # get the second and third elements from the last column
array([ 7, 11])
>>> a[1:3, 1:3] # get the data in the second and third rows and columns
array([[ 5,  6],
       [ 9, 10]])
You can do steps too:
>>> a[::2, ::2] # get every other element (column-wise and row-wise)
array([[ 0,  2],
       [ 8, 10]])
Hope that helps. Once that makes more sense you can look into things like adding dimensions with None or np.newaxis, or using the ... ellipsis:
>>> a[:, None].shape
(4, 1, 4)
You can find more here: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
It might pay to explore the shape and individual entries as we go along.
Let's start with
>>> a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])
>>> a.shape
(16,)
This is a one-dimensional array of length 16.
Now let's try
>>> a = a.reshape(2,2,2,2)
>>> a.shape
(2, 2, 2, 2)
It's a multi-dimensional array with 4 dimensions.
Let's look at the element at index [0, 1]:
>>> a[0, 1]
array([[5, 6],
       [7, 8]])
Since there are two dimensions left, it's a two-dimensional matrix.
Now a[:, 1] says: take a[i, 1] for all possible values of i:
>>> a[:, 1]
array([[[ 5,  6],
        [ 7,  8]],
       [[13, 14],
        [15, 16]]])
It gives you an array where the first item is a[0, 1], and the second item is a[1, 1].
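Since a[:, 1] and a[:1] are easy to mix up, here is a short sketch comparing the two on the same 4-dimensional array. The names col and head are only illustrative.

```python
import numpy as np

a = np.arange(1, 17).reshape(2, 2, 2, 2)

# Comma: slice the first axis fully, pick index 1 on the second axis
col = a[:, 1]
print(col.shape)        # (2, 2, 2)

# No comma: an ordinary slice of the first axis, keeping entries before index 1
head = a[:1]
print(head.shape)       # (1, 2, 2, 2)

print(np.array_equal(col[0], a[0, 1]))  # True
print(np.array_equal(head[0], a[0]))    # True
```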
To answer the second part of your question (generating arrays of sequential values) you can use np.arange(start, stop, step) or np.linspace(start, stop, num_elements). Both of these return a numpy array with the corresponding range of values.
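For the array in the question, that could look like the following sketch. Note that np.arange excludes the stop value while np.linspace includes it and returns floats.

```python
import numpy as np

a = np.arange(1, 17)             # integers 1, 2, ..., 16 (stop is exclusive)
b = np.linspace(1, 16, 16)       # 16 evenly spaced floats from 1.0 to 16.0

a4 = a.reshape(2, 2, 2, 2)       # same array as in the question
print(a4.shape)                  # (2, 2, 2, 2)
```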
