numpy array integer indexing in more than one dimension - python

I'm pretty sure I'm missing something with integer indexing and could use some help. Say that I create a 2D array:
>>> import numpy as np
>>> x=np.array(range(24)).reshape((4,6))
>>> x
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])
I can then select rows 1 and 2 with:
>>> x[[1,2],:]
array([[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
Or column 1 of rows 1 and 2 with:
>>> x[[1,2],1]
array([ 7, 13])
So it would make sense to me that I can select columns 3, 4 and 5 of rows 1 and 2 with this:
>>> x[[1,2],[3,4,5]]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape
And instead I need to do it in two steps:
>>> a=x[[1,2],:]
>>> a
array([[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
>>> a[:,[3,4,5]]
array([[ 9, 10, 11],
[15, 16, 17]])
Coming from R, my expectations seem to be wrong. Can you confirm that this is indeed not possible in one step, or suggest a better alternative? Thanks!
EDIT: please note that the rows and columns I chose in the example happen to be consecutive, but they don't have to be. In other words, slice indexing won't do for my case.

You also have the option of using broadcasting among the indexing arrays, which is what I would normally do, rather than indexing twice, which creates an intermediate copy of your data:
>>> x[[[1], [2]],[[3, 4, 5]]]
array([[ 9, 10, 11],
[15, 16, 17]])
To see a little better what is going on and how to handle larger numbers of indices:
>>> row_idx = np.array([1, 2])
>>> col_idx = np.array([3, 4, 5])
>>> x[row_idx.reshape(-1, 1), col_idx]
array([[ 9, 10, 11],
[15, 16, 17]])
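If reshaping the row index by hand feels fiddly, np.ix_ builds the same kind of broadcastable index arrays for you. A minimal sketch, assuming the same x as in the question (row_idx and col_idx are just the illustrative names used above):

```python
import numpy as np

x = np.arange(24).reshape(4, 6)
row_idx = [1, 2]
col_idx = [3, 4, 5]

# np.ix_ returns index arrays of shape (2, 1) and (1, 3); they broadcast
# against each other to select the full 2x3 cross product of rows and columns.
sub = x[np.ix_(row_idx, col_idx)]
print(sub)
```

This is still a single advanced-indexing step, so no intermediate copy of the selected rows is made.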

Something like this:
In [28]: x[1:3, 3:6]
Out[28]:
array([[ 9, 10, 11],
[15, 16, 17]])

Related

Figuring out correct numpy transpose for 3*3*3 array

Suppose I have a 3*3*3 array x. I would like to find an array y such that y[0,1,2] = x[1,2,0], or more generally, y[a,b,c] = x[b,c,a]. I can try numpy.transpose:
import numpy as np
x = np.arange(27).reshape((3,3,3))
y = np.transpose(x, [2,0,1])
print(x[0,1,2],x[1,2,0])
print(y[0,1,2])
The output is
5 15
15
The result 15, 15 is what I expected (the first 15 is the reference value from x[1,2,0]; the second is from y[0,1,2]). However, I found the transpose [2,0,1] by drawing it out on paper:
B C A
A B C
By inspection, the transpose should be [2,0,1]: the last entry in the upper row goes first in the lower row, the middle one goes last, and the first one goes to the middle. Is there an automatic, and hopefully efficient, way to do this (such as a standard function in numpy/sympy)?
That is, given the input y[a,b,c] = x[b,c,a], it would output [2,0,1].
I find it easier to explore transpose with an example of shape like (2,3,4), where each axis has a different length.
But sticking with your (3,3,3):
In [23]: x = np.arange(27).reshape(3,3,3)
In [24]: x
Out[24]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
In [25]: x[0,1,2]
Out[25]: 5
Your sample transpose:
In [26]: y = x.transpose(2,0,1)
In [27]: y
Out[27]:
array([[[ 0, 3, 6],
[ 9, 12, 15],
[18, 21, 24]],
[[ 1, 4, 7],
[10, 13, 16],
[19, 22, 25]],
[[ 2, 5, 8],
[11, 14, 17],
[20, 23, 26]]])
We get the same 5 with
In [28]: y[2,0,1]
Out[28]: 5
We could get that (2,0,1) by applying the same transposing values:
In [31]: idx = np.array((0,1,2)) # use an array for ease of indexing
In [32]: idx[[2,0,1]]
Out[32]: array([2, 0, 1])
The way I think about the transpose (2,0,1): we are moving the last axis, 2, to the front, and preserving the order of the other two.
With differing dimensions, it's easier to visualize the change:
In [33]: z=np.arange(2*3*4).reshape(2,3,4)
In [34]: z
Out[34]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [35]: z.transpose(2,0,1)
Out[35]:
array([[[ 0, 4, 8],
[12, 16, 20]],
[[ 1, 5, 9],
[13, 17, 21]],
[[ 2, 6, 10],
[14, 18, 22]],
[[ 3, 7, 11],
[15, 19, 23]]])
In [36]: _.shape
Out[36]: (4, 2, 3)
np.swapaxes is another compiled function for making these changes. np.rollaxis is another, though it's Python code that ends up calling transpose.
I haven't tried to follow all of your reasoning, but I think you want a kind of reverse of the transpose numbers: you specify the desired result order and want to know which axes produce it.
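As for getting the axes automatically from a spec like y[a,b,c] = x[b,c,a]: one possible approach, sketched below, is to look up where each output letter sits in the input index string. The helper name axes_from_spec is made up for this example and is not a NumPy function. np.einsum accepts the same kind of letter spec directly, so it can serve as a cross-check.

```python
import numpy as np

def axes_from_spec(out_idx, in_idx):
    """Hypothetical helper: return axes such that y = x.transpose(axes)
    satisfies y[out_idx] == x[in_idx], e.g. out_idx='abc', in_idx='bca'."""
    return [in_idx.index(letter) for letter in out_idx]

# Distinct axis lengths make a wrong permutation show up in the shape.
x = np.arange(24).reshape(2, 3, 4)
axes = axes_from_spec("abc", "bca")
y = x.transpose(axes)

# einsum with an explicit output spec performs the same relabelling.
y2 = np.einsum("bca->abc", x)

print(axes)                      # [2, 0, 1]
print(y.shape, np.array_equal(y, y2))
```

The list comprehension is just the inverse permutation of the "bca" ordering, so np.argsort([1, 2, 0]) would give the same [2, 0, 1].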

Numpy N-Dimensional Array Unfolding

I am basically trying to achieve this, but need the unfolding to be done in a different fashion. I want all samples of the (N-1)th dimension to be concatenated. For example, if my unfolding were applied to an RGB image of shape (100,100,3), the new array would become a (100,300) array in which the 3 colour channel images sit side by side.
All my attempts to use a neat built-in numpy function like flatten or concatenate yielded no results. (flatten, because the end goal is to apply this unfolding until the array is 1D.)
Can't even think of a slicing way of doing it in a loop since the starting number of dimensions isn't constant (array = array[:,...,:,0]+...+array[:,...,:,0])
EDIT
I just came up with this way of achieving what I want, but would still welcome better, more pure, numpy solutions.
shape = numpy.random.randint(100, size=numpy.random.randint(100))
array = numpy.random.uniform(size=shape)
array = array.T
for i in range(0, len(shape)-1, -1):
array = numpy.concatenate(array)
Am I right in deducing that you want to flatten the last 2 dimensions of the array?
In [96]: shape = numpy.random.randint(10, size=numpy.random.randint(10))+1
In [97]: shape
Out[97]: array([2, 7, 2])
In [98]: newshape=tuple(shape[:-2])+(-1,)
In [99]: arr = np.arange(np.prod(shape)).reshape(shape)
In [100]: arr.shape
Out[100]: (2, 7, 2)
In [101]: arr.reshape(newshape).shape
Out[101]: (2, 14)
In [102]: arr.reshape(newshape)
Out[102]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]])
If you don't like the order of terms in the last dimension, you may need to transpose
In [109]: np.swapaxes(arr, -1,-2).reshape(newshape)
Out[109]:
array([[ 0, 2, 4, 6, 8, 10, 12, 1, 3, 5, 7, 9, 11, 13],
[14, 16, 18, 20, 22, 24, 26, 15, 17, 19, 21, 23, 25, 27]])
I can't test it against your code because range(0,len(shape)-1, -1) is an empty range.
I don't think you want
In [112]: np.concatenate(arr,axis=-1).shape
Out[112]: (7, 4)
In [113]: np.concatenate((arr[0,...],arr[1,...]), axis=-1)
Out[113]:
array([[ 0, 1, 14, 15],
[ 2, 3, 16, 17],
[ 4, 5, 18, 19],
[ 6, 7, 20, 21],
[ 8, 9, 22, 23],
[10, 11, 24, 25],
[12, 13, 26, 27]])
That splits the arr on the 1st axis, and then joins it on the last.
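For the RGB case in the question, here is a sketch of the swapaxes-then-reshape idea on a small stand-in image. The name img and its (2, 4, 3) shape are only for illustration; a real image would be (100, 100, 3). The check against np.hstack assumes "side by side" means channel 0 first, then channel 1, then channel 2.

```python
import numpy as np

img = np.arange(2 * 4 * 3).reshape(2, 4, 3)   # (height, width, channels)
w, c = img.shape[-2:]

# Move channels in front of width, then merge the last two axes, so that
# unfolded[..., k*w:(k+1)*w] is channel k of the image.
unfolded = np.swapaxes(img, -1, -2).reshape(img.shape[:-2] + (-1,))

# Reference result: the per-channel images stacked side by side.
side_by_side = np.hstack([img[:, :, k] for k in range(c)])

print(unfolded.shape)                          # (2, 12)
print(np.array_equal(unfolded, side_by_side))  # True
```

Because the new shape is built from img.shape[:-2], the same line works for any number of leading dimensions, and it can be applied repeatedly until the array is 1D.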

Matrix to Vector with python/numpy

Numpy ravel works well if I need to create a vector by reading by rows or by columns. However, I would like to transform a matrix to a 1d array, by using a method that is often used in image processing. This is an example with initial matrix A and final result B:
A = np.array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
B = np.array([ 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15])
Is there an existing function already that could help me with that? If not, can you give me some hints on how to solve this problem? PS. the matrix A is NxN.
I've been using numpy for several years, and I've never seen such a function.
Here's one way you could do it (not necessarily the most efficient):
In [47]: a
Out[47]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [48]: np.concatenate([np.diagonal(a[::-1,:], k)[::(2*(k % 2)-1)] for k in range(1-a.shape[0], a.shape[0])])
Out[48]: array([ 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15])
Breaking down the one-liner into separate steps:
a[::-1, :] reverses the rows:
In [59]: a[::-1, :]
Out[59]:
array([[12, 13, 14, 15],
[ 8, 9, 10, 11],
[ 4, 5, 6, 7],
[ 0, 1, 2, 3]])
(This could also be written a[::-1] or np.flipud(a).)
np.diagonal(a, k) extracts the kth diagonal, where k=0 is the main diagonal. So, for example,
In [65]: np.diagonal(a[::-1, :], -3)
Out[65]: array([0])
In [66]: np.diagonal(a[::-1, :], -2)
Out[66]: array([4, 1])
In [67]: np.diagonal(a[::-1, :], 0)
Out[67]: array([12, 9, 6, 3])
In [68]: np.diagonal(a[::-1, :], 2)
Out[68]: array([14, 11])
In the list comprehension, k gives the diagonal to be extracted. We want to reverse the elements in every other diagonal. The expression 2*(k % 2) - 1 gives the values 1, -1, 1, ... as k varies from -3 to 3. Indexing with [::1] leaves the order of the array being indexed unchanged, and indexing with [::-1] reverses the order of the array. So np.diagonal(a[::-1, :], k)[::(2*(k % 2)-1)] gives the kth diagonal, but with every other diagonal reversed:
In [71]: [np.diagonal(a[::-1,:], k)[::(2*(k % 2)-1)] for k in range(1-a.shape[0], a.shape[0])]
Out[71]:
[array([0]),
array([1, 4]),
array([8, 5, 2]),
array([ 3, 6, 9, 12]),
array([13, 10, 7]),
array([11, 14]),
array([15])]
np.concatenate() puts them all into a single array:
In [72]: np.concatenate([np.diagonal(a[::-1,:], k)[::(2*(k % 2)-1)] for k in range(1-a.shape[0], a.shape[0])])
Out[72]: array([ 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15])
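The one-liner can also be wrapped in a small function. This is only a sketch: zigzag is an illustrative name, not a NumPy API, and it assumes a square input. One deliberate change from the one-liner: the reversal is keyed on the anti-diagonal number d (counted from the top-left corner) instead of on k, so that the scan starts in the same direction when N is odd.

```python
import numpy as np

def zigzag(a):
    """Zigzag scan of a square 2-D array (illustrative helper)."""
    a = np.asarray(a)
    n = a.shape[0]
    flipped = a[::-1, :]              # anti-diagonals of a become diagonals
    pieces = []
    for d in range(2 * n - 1):        # d = 0 is the top-left corner
        # np.diagonal lists each anti-diagonal from bottom-left to top-right
        diag = np.diagonal(flipped, d - (n - 1))
        # odd-numbered anti-diagonals are read top-right to bottom-left
        pieces.append(diag[::-1] if d % 2 else diag)
    return np.concatenate(pieces)

A = np.arange(16).reshape(4, 4)
print(zigzag(A))
```

For the 4x4 matrix A from the question this reproduces B.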
I found discussion of zigzag scan for MATLAB, but not much for numpy. One project appears to use a hardcoded indexing array for 8x8 blocks
https://github.com/lot9s/lfv-compression/blob/master/scripts/our_mpeg/zigzag.py
ZIG = np.array([[0, 1, 5, 6, 14, 15, 27, 28],
[2, 4, 7, 13, 16, 26, 29, 42],
[3, 8, 12, 17, 25, 30, 41, 43],
[9, 11, 18, 24, 31, 40, 44, 53],
[10, 19, 23, 32, 39, 45, 52, 54],
[20, 22, 33, 38, 46, 51, 55, 60],
[21, 34, 37, 47, 50, 56, 59, 61],
[35, 36, 48, 49, 57, 58, 62,63]])
Apparently it's used in JPEG and MPEG compression.

Selecting every alternate group of n columns - NumPy

I would like to select every other group of n columns in a numpy array. That is, I want to keep the first n columns, skip the next n, keep the next n, skip the next n, and so on.
For example, with the following array and n=2:
import numpy as np
arr = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
I would like to get:
[[1, 2, 5, 6, 9, 10],
[11, 12, 15, 16, 19, 20]]
And with n=3:
[[1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]]
With n=1 we can simply use the syntax arr[:,::2], but is there something similar for n>1?
You can use modulus to create ramps running from 0 to 2n-1 and then select the first n positions of each ramp. Each ramp is then True for its first n entries and False for the rest, which gives a boolean array covering the full width of the array. Boolean indexing along the columns then selects the wanted columns for the final output. The implementation would look something like this -
arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
A step-by-step run to give a better idea -
In [43]: arr
Out[43]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
In [44]: n = 3
In [45]: np.mod(np.arange(arr.shape[-1]),2*n)
Out[45]: array([0, 1, 2, 3, 4, 5, 0, 1, 2, 3])
In [46]: np.mod(np.arange(arr.shape[-1]),2*n)<n
Out[46]: array([ True,True,True,False,False,False,True,True,True,False])
In [47]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[47]:
array([[ 1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]])
Sample runs across various n -
In [29]: arr
Out[29]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
In [30]: n = 1
In [31]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[31]:
array([[ 1, 3, 5, 7, 9],
[11, 13, 15, 17, 19]])
In [32]: n = 2
In [33]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[33]:
array([[ 1, 2, 5, 6, 9, 10],
[11, 12, 15, 16, 19, 20]])
In [34]: n = 3
In [35]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[35]:
array([[ 1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]])
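The same mask can be packaged as a small function. A minimal sketch: alternate_groups is an illustrative name, not part of NumPy, and it builds the mask with integer division (group number is even) instead of the modulus ramp, which selects the same columns.

```python
import numpy as np

def alternate_groups(arr, n):
    """Keep columns 0..n-1, skip the next n, keep the next n, and so on.
    (Illustrative helper, not a NumPy function.)"""
    cols = np.arange(arr.shape[-1])
    mask = (cols // n) % 2 == 0       # True for even-numbered groups of n columns
    return arr[..., mask]

arr = np.arange(1, 21).reshape(2, 10)
print(alternate_groups(arr, 2))
print(alternate_groups(arr, 3))
```

The column count does not need to be a multiple of 2n; a trailing partial group is kept or dropped according to its group number.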

indexing multidimensional arrays with an array

I have a multidimensional NumPy array:
In [1]: m = np.arange(1,26).reshape((5,5))
In [2]: m
Out[2]:
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
and another array p = np.asarray([[1,1],[3,3]]). I wanted p to act as an array of indices for m, i.e.:
m[p]
array([7, 19])
However I get:
In [4]: m[p]
Out[4]:
array([[[ 6, 7, 8, 9, 10],
[ 6, 7, 8, 9, 10]],
[[16, 17, 18, 19, 20],
[16, 17, 18, 19, 20]]])
How can I get the desired slice of m using p?
Numpy is using your array to index the first dimension only. As a general rule, indices for a multidimensional array should be in a tuple. This will get you a little closer to what you want:
>>> m[tuple(p)]
array([9, 9])
But now you are indexing the first dimension twice with 1, and the second twice with 3. To index the first dimension with a 1 and a 3, and then the second with a 1 and a 3 also, you could transpose your array:
>>> m[tuple(p.T)]
array([ 7, 19])
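An equivalent way to write this, if the transpose feels opaque, is to pass the two columns of p explicitly. A short sketch, assuming as in the question that each row of p is one (row, column) pair:

```python
import numpy as np

m = np.arange(1, 26).reshape(5, 5)
p = np.asarray([[1, 1], [3, 3]])      # each row of p is one (row, col) pair

# Column 0 of p holds the row indices, column 1 the column indices.
picked = m[p[:, 0], p[:, 1]]
print(picked)                          # [ 7 19]
```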
