splitting ND arrays using numpy - python

I have a 3D numpy array and I want to partition it by the first 2 dimensions (and select all elements in the last one). Is there a simple way I can do that using numpy?
Example: given array
a = array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
I would like to split it N ways by the first two axes (while retaining all elements in the last one), e.g.,:
a[0:2, 0:2, :], a[2:3, 2:3, :]
But it doesn't need to be evenly split. Seems like numpy.array_split will split on all axes?

In [179]: np.array_split(a,2,0)
Out[179]:
[array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]]]),
array([[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])]
is the same as [a[:2,:,:], a[2:,:,:]]
Call that list a1 (a1 = np.array_split(a,2,0)). You could loop over those 2 arrays and apply the split on the next axis.
In [182]: a2=[np.array_split(aa,2,1) for aa in a1]
In [183]: a2 # edited for clarity
Out[183]:
[[array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 9, 10, 11],
[12, 13, 14]]]), # (2,2,3)
array([[[ 6, 7, 8]],
[[15, 16, 17]]])], # (2,1,3)
[array([[[18, 19, 20],
[21, 22, 23]]]), # (1,2,3)
array([[[24, 25, 26]]])]] # (1,1,3)
In [184]: a2[0][0].shape
Out[184]: (2, 2, 3)
In [185]: a2[0][1].shape
Out[185]: (2, 1, 3)
In [187]: a2[1][0].shape
Out[187]: (1, 2, 3)
In [188]: a2[1][1].shape
Out[188]: (1, 1, 3)
With the potential of splitting into uneven arrays in each dimension, it is hard to do this in a fully vectorized form. And even if the splits were even, this sort of grid splitting is tricky because the values of each block are not contiguous in memory. In this example there's a gap between 5 and 9 in the first subarray.
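The contiguity point can be checked directly. The following is a minimal sketch, assuming the 3x3x3 example array from the question; the names block and packed are only illustrative.

```python
import numpy as np

a = np.arange(27).reshape(3, 3, 3)
block = a[0:2, 0:2, :]                       # first grid cell from the question

is_view = np.shares_memory(block, a)         # basic slicing returns a view of a
is_contiguous = block.flags["C_CONTIGUOUS"]  # False: elements 6, 7, 8 sit between the two planes
packed = np.ascontiguousarray(block)         # explicit copy with densely packed memory
```

Because each block is a view, splitting costs no copying; a copy is only made if a packed block is requested explicitly.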

A quick list comprehension will do the trick
[np.array_split(arr, 2, axis=1)
for arr in np.array_split(a, 2, axis=0)]
This will result in a list of lists, the items of which contain the arrays you're looking for.
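The comprehension above can be wrapped in a small helper so that uneven splits are also possible. This is a sketch only; grid_split is a hypothetical name, and it relies on np.array_split accepting either a number of sections or a list of split indices.

```python
import numpy as np

def grid_split(arr, row_sections, col_sections):
    """Split arr along axis 0, then split each piece along axis 1.

    Each *_sections argument is passed straight to np.array_split, so it
    may be an int (number of pieces) or a list of split indices.
    """
    return [np.array_split(block, col_sections, axis=1)
            for block in np.array_split(arr, row_sections, axis=0)]

a = np.arange(27).reshape(3, 3, 3)

# Two pieces per axis, as in the answer above.
blocks = grid_split(a, 2, 2)
shapes = [[b.shape for b in row] for row in blocks]

# Explicit split points: rows [0:1] and [1:], columns [0:2] and [2:].
custom = grid_split(a, [1], [2])
custom_shapes = [[b.shape for b in row] for row in custom]
```

The last axis is never touched, so every block keeps all elements along it.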

Related

Problem simultaneously indexing several dimensions of a multidimensional numpy array

Consider a 4-dimensional numpy array (variable a). We have a.shape = (16, 5, 66, 717).
From the second dimension, which contains 5 elements, I want to select the second and the fifth:
b = a[:, [1,4],:,:]
b.shape returns (16, 2, 66, 717), so I guess what I did is correct. Now I want to extract 5 elements from the first dimension (eighth, eleventh, thirteenth, fourteenth, fifteenth) and two elements from the second dimension (second and fifth):
b = a[[7,10,12,13,14], [1,4],:,:]
which gives an error:
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (5,) (2,)
I don't understand why this simultaneous indexing across >1 dimensions of numpy array doesn't work. I guess I could sequentially do b = a[:, [1,4],:,:] and c = b[[7,10,12,13,14],:,:,:] to get what I want, but there must be a way to do that in one step. Could you please help?
Make a smaller 3d array:
In [155]: a = np.arange(24).reshape(2,3,4)
In [158]: a
Out[158]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
Selecting two "rows" (on the middle dimension):
In [159]: a[:,[0,2],:]
Out[159]:
array([[[ 0, 1, 2, 3],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[20, 21, 22, 23]]])
If we use 2 lists (or arrays) of the same shape, we end up selecting 2 "rows" from [159]:
In [160]: a[[0,1],[0,2],:]
Out[160]:
array([[ 0, 1, 2, 3],
[20, 21, 22, 23]])
If instead the first list/array is a "column vector", we select a (2,2) "block":
In [161]: a[[[0],[1]],[0,2],:]
Out[161]:
array([[[ 0, 1, 2, 3],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[20, 21, 22, 23]]])
ix_ can be used to create the same 2 arrays:
In [162]: np.ix_([0,1],[0,2])
Out[162]:
(array([[0],
[1]]),
array([[0, 2]]))
So using ix_ arrays:
In [163]: I,J = np.ix_([0,1],[0,2])
In [164]: a[I,J,:]
Out[164]:
array([[[ 0, 1, 2, 3],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[20, 21, 22, 23]]])
The index arrays broadcast against each other, in the same sense as broadcasting during addition or multiplication:
In [165]: I*10 + J
Out[165]:
array([[ 0, 2],
[10, 12]])
reference: https://numpy.org/doc/stable/user/basics.indexing.html#advanced-indexing
edit
In [166]: np.ix_([7,10,12,13,14], [1,4])
Out[166]:
(array([[ 7],
[10],
[12],
[13],
[14]]),
array([[1, 4]]))
Regarding your error:
In [167]: np.ix_([7,10,12,13,14], [1,4],:,:)
Input In [167]
np.ix_([7,10,12,13,14], [1,4],:,:)
^
SyntaxError: invalid syntax
ix_ is a function, and ':' isn't allowed in a function call. It only works inside an indexing expression, where it's converted to a slice. That's why you get a syntax error.
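Applied to the question's case, the ix_ result is used inside the square brackets and the trailing dimensions are left implicit. The sketch below assumes a smaller stand-in array of shape (16, 5, 6, 7) instead of (16, 5, 66, 717) to keep it light; the variable names are illustrative.

```python
import numpy as np

# Stand-in for the question's array, with shorter trailing dimensions.
a = np.arange(16 * 5 * 6 * 7).reshape(16, 5, 6, 7)
rows = [7, 10, 12, 13, 14]
cols = [1, 4]

# ix_ builds a (5,1) and a (1,2) index array that broadcast to a (5,2) block.
b = a[np.ix_(rows, cols)]

# The same thing written by hand: make the first index a column vector.
b_manual = a[np.array(rows)[:, None], cols]

# The two-step version from the question, for comparison.
b_two_step = a[:, cols][rows]
```

All three results have shape (5, 2, 6, 7); the one-step forms avoid building the intermediate array.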

Figuring out correct numpy transpose for 3*3*3 array

Suppose I have a 3*3*3 array x. I would like to find an array y such that y[0,1,2] = x[1,2,0], or more generally, y[a,b,c] = x[b,c,a]. I can try numpy.transpose:
import numpy as np
x = np.arange(27).reshape((3,3,3))
y = np.transpose(x, [2,0,1])
print(x[0,1,2],x[1,2,0])
print(y[0,1,2])
The output is
5 15
15
The result 15,15 is what I expected (the first 15 is the reference value from x[1,2,0]; the second is from y[0,1,2]). However, I found the transpose [2,0,1] by drawing it on paper.
B C A
A B C
By inspection, the transpose should be [2,0,1]: the last entry in the upper row goes first in the lower row, the middle goes last, and the first goes to the middle. Is there any automatic and hopefully efficient way to do it (like any standard function in numpy/sympy)?
Given the input y[a,b,c]= x[b,c,a], output [2,0,1]?
I find it easier to explore transpose with an example with a shape like (2,3,4), where each axis has a different length.
But sticking with your (3,3,3)
In [23]: x = np.arange(27).reshape(3,3,3)
In [24]: x
Out[24]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
In [25]: x[0,1,2]
Out[25]: 5
Your sample transpose:
In [26]: y = x.transpose(2,0,1)
In [27]: y
Out[27]:
array([[[ 0, 3, 6],
[ 9, 12, 15],
[18, 21, 24]],
[[ 1, 4, 7],
[10, 13, 16],
[19, 22, 25]],
[[ 2, 5, 8],
[11, 14, 17],
[20, 23, 26]]])
We get the same 5 with
In [28]: y[2,0,1]
Out[28]: 5
We could get that (2,0,1) by applying the same transposing values:
In [31]: idx = np.array((0,1,2)) # use an array for ease of indexing
In [32]: idx[[2,0,1]]
Out[32]: array([2, 0, 1])
The way I think about the transpose (2,0,1): we are moving the last axis, 2, to the front, and preserving the order of the other 2.
With differing dimensions, it's easier to visualize the change:
In [33]: z=np.arange(2*3*4).reshape(2,3,4)
In [34]: z
Out[34]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [35]: z.transpose(2,0,1)
Out[35]:
array([[[ 0, 4, 8],
[12, 16, 20]],
[[ 1, 5, 9],
[13, 17, 21]],
[[ 2, 6, 10],
[14, 18, 22]],
[[ 3, 7, 11],
[15, 19, 23]]])
In [36]: _.shape
Out[36]: (4, 2, 3)
np.swapaxes is another compiled function for making these changes. np.rollaxis is another, though it's python code that ends up calling transpose.
I haven't tried to follow all of your reasoning, though I think you want a kind of reverse of the transpose numbers: you specify the result order and want to find the axes that produce it.
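One possible way to get the axes automatically from a relation like y[a,b,c] = x[b,c,a] is to look up, for each output label, where it sits in the input labels. This is a sketch; axes_from_labels is a hypothetical helper, not a NumPy function. np.einsum accepts the same labels directly, which gives an independent check.

```python
import numpy as np

def axes_from_labels(src, dst):
    """Axes for x.transpose(...) so that y[dst indices] == x[src indices].

    src labels the axes of x, dst labels the axes of y, e.g. "bca" and "abc"
    for y[a,b,c] = x[b,c,a]. Output axis i is the x axis carrying dst[i].
    """
    return [src.index(label) for label in dst]

# Distinct axis lengths make a wrong permutation show up as a shape error.
x = np.arange(24).reshape(2, 3, 4)       # indexed as x[b, c, a]

axes = axes_from_labels("bca", "abc")    # the permutation found on paper
y = x.transpose(axes)

# einsum takes the labels directly and returns the same rearrangement.
y_einsum = np.einsum("bca->abc", x)
```

With x of shape (2, 3, 4), y has shape (4, 2, 3) and y[a, b, c] equals x[b, c, a] for every index.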

Understanding Numpy rot90 axes

Numpy's rot90 function promises to rotate a 2d or higher array by 90 degrees, taking an axes parameter. The method:
numpy.rot90(m, k=1, axes=(0, 1))
Rotate an array by 90 degrees in the plane specified by axes.
Rotation direction is from the first towards the second axis.
I'm very confused about the axes part. An object can be rotated around the x, y, or z axis. Typically, this is defined by something such as a Vector3f, with 3 floats defining the axis value (ex, (0, 0, 1) to rotate around z axis.) I do not understand how these two numbers can be used to rotate a 3d object, shouldn't it be 3 like a Vector3f? Can anyone help me understand what these two axes mean, and what two numbers would be used for, respectively, rotation around the x, y, and z axis? I've tried many different combinations of numbers and they all have various results (I can't put in two of the same numbers), but I have no idea how it's possible to have enough information with two numbers (k represents amount of times to rotate.)
I like to work with a sample array with distinct values and dimensions, such as m = np.arange(24).reshape(2,3,4).
In this case I also looked at the code. After some preliminaries to make sure axes and k are right it does different things depending on the 4 possible values of k.
axes define a "plane". With a 3d array, axes=(0,1) can be thought of as rotation about the third axis (2), imagining a "vector" perpendicular to that "plane". But it is the two axes values themselves that are used in the code. While I suspect we could construct equivalent operations with trig-based rotation matrices, this code does not do any calculations; it only rearranges elements. (Note that integers are not changed to floats.)
k=0 changes nothing:
In [104]: np.rot90(m,k=0, axes=(0,1))
Out[104]:
array([[[ 0, 1, 2, 3], # shape (2,3,4)
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
k=1 involves a flip followed by a transpose
In [105]: np.rot90(m,k=1, axes=(0,1))
Out[105]:
array([[[ 8, 9, 10, 11], # shape (3,2,4)
[20, 21, 22, 23]],
[[ 4, 5, 6, 7],
[16, 17, 18, 19]],
[[ 0, 1, 2, 3],
[12, 13, 14, 15]]])
k=2 is simpler - just a flip on both axes. This is easy to visualize. Last dimension is unchanged (across rows), but planes and rows are reversed:
In [106]: np.rot90(m,k=2, axes=(0,1))
Out[106]:
array([[[20, 21, 22, 23],
[16, 17, 18, 19],
[12, 13, 14, 15]],
[[ 8, 9, 10, 11],
[ 4, 5, 6, 7],
[ 0, 1, 2, 3]]])
k=3 does the transpose first, then the flip
In [107]: np.rot90(m,k=3, axes=(0,1))
Out[107]:
array([[[12, 13, 14, 15], # again (3,2,4)
[ 0, 1, 2, 3]],
[[16, 17, 18, 19],
[ 4, 5, 6, 7]],
[[20, 21, 22, 23],
[ 8, 9, 10, 11]]])
Looking at the strides:
In [111]: m.strides
Out[111]: (96, 32, 8)
In [112]: np.rot90(m,k=2, axes=(0,1)).strides
Out[112]: (-96, -32, 8) # the double flip
The transpose switches the order, while the flip again changes the sign:
In [113]: np.rot90(m,k=1, axes=(0,1)).strides
Out[113]: (-32, 96, 8)
In [114]: np.rot90(m,k=3, axes=(0,1)).strides
Out[114]: (32, -96, 8)
Because it just uses flip and transpose the result is a view.
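The flip and transpose combinations can be reproduced by hand and compared against np.rot90. This is a sketch for axes=(0,1) on the sample array; it mirrors the behaviour shown above rather than quoting the library source, and the names swap, k1, k2, k3 are illustrative.

```python
import numpy as np

m = np.arange(24).reshape(2, 3, 4)
swap = (1, 0, 2)                              # transpose that exchanges axes 0 and 1

k1 = np.transpose(np.flip(m, axis=1), swap)   # flip axis 1, then transpose
k2 = np.flip(m, axis=(0, 1))                  # flip both axes of the plane
k3 = np.flip(np.transpose(m, swap), axis=1)   # transpose, then flip axis 1

# rot90 only rearranges strides, so its result still shares memory with m.
shares = np.shares_memory(np.rot90(m, k=1, axes=(0, 1)), m)
```

Writing to a rotated result therefore also changes m; use .copy() if an independent array is needed.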
Simpler (1,3,4) array
It may be easier to visualize in an array that represents values in one plane, a (3,4) array:
In [120]: m = np.arange(12).reshape(1,3,4)
In [121]: m
Out[121]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]])
In [122]: np.rot90(m,k=2, axes=(1,2))
Out[122]:
array([[[11, 10, 9, 8],
[ 7, 6, 5, 4],
[ 3, 2, 1, 0]]])
In [123]: np.rot90(m,k=1, axes=(1,2)) # visualize a counterclockwise rotation
Out[123]:
array([[[ 3, 7, 11],
[ 2, 6, 10],
[ 1, 5, 9],
[ 0, 4, 8]]])
In [124]: np.rot90(m,k=3, axes=(1,2)) # clockwise
Out[124]:
array([[[ 8, 4, 0],
[ 9, 5, 1],
[10, 6, 2],
[11, 7, 3]]])
k=3 can also be visualized as three successive counterclockwise rotations, passing through the k=1 and k=2 results.
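The relationship between k and the order of the two axes can be checked on the same (1,3,4) array. A small sketch, with illustrative variable names: a clockwise turn can be written as k=3, as three counterclockwise turns, or as a single turn with the axes pair reversed.

```python
import numpy as np

m = np.arange(12).reshape(1, 3, 4)
plane = (1, 2)                                   # rotate within the last two axes

ccw = np.rot90(m, k=1, axes=plane)               # one counterclockwise quarter turn
cw = np.rot90(m, k=3, axes=plane)                # one clockwise quarter turn

# Three counterclockwise turns applied one after another.
three_ccw = m
for _ in range(3):
    three_ccw = np.rot90(three_ccw, k=1, axes=plane)

# Reversing the axes pair reverses the rotation direction.
reversed_axes = np.rot90(m, k=1, axes=(2, 1))
```

For a 3d array, the three possible planes (0,1), (0,2) and (1,2) play the role of the three rotation axes, and the order within each pair selects the direction.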

Find all n-dimensional lines and diagonals with NumPy

Using NumPy, I would like to produce a list of all lines and diagonals of an n-dimensional array with lengths of k.
Take the case of the following three-dimensional array with lengths of three.
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
For this case, I would like to obtain all of the following types of sequences. For any given case, I would like to obtain all of the possible sequences of each type. Examples of desired sequences are given in parentheses below, for each case.
1D lines
x axis (0, 1, 2)
y axis (0, 3, 6)
z axis (0, 9, 18)
2D diagonals
x/y axes (0, 4, 8, 2, 4, 6)
x/z axes (0, 10, 20, 2, 10, 18)
y/z axes (0, 12, 24, 6, 12, 18)
3D diagonals
x/y/z axes (0, 13, 26, 2, 13, 24)
The solution should be generalized, so that it will generate all lines and diagonals for an array, regardless of the array's number of dimensions or length (which is constant across all dimensions).
This solution generalized over n
Let's rephrase this problem as "find the list of indices".
We're looking for all of the 2d index arrays of the form
array[i[0], i[1], i[2], ..., i[n-1]]
Let n = arr.ndim
Where i is an array of shape (n, k)
Each of i[j] can be one of:
The same index repeated k times, ri[j] = [j, ..., j]
The forward sequence, fi = [0, 1, ..., k-1]
The backward sequence, bi = [k-1, ..., 1, 0]
With the requirements that each sequence is of the form ^(ri)*(fi)(fi|bi|ri)*$ (using regex to summarize it). This is because:
there must be at least one fi so the "line" is not a point selected repeatedly
no bis come before fis, to avoid getting reversed lines
import numpy as np

def product_slices(n):
    for i in range(n):
        yield (
            np.index_exp[np.newaxis] * i +
            np.index_exp[:] +
            np.index_exp[np.newaxis] * (n - i - 1)
        )

def get_lines(n, k):
    """
    Returns:
        index (tuple): an object suitable for advanced indexing to get all possible lines
        mask (ndarray): a boolean mask to apply to the result of the above
    """
    fi = np.arange(k)
    bi = fi[::-1]
    ri = fi[:,None].repeat(k, axis=1)
    all_i = np.concatenate((fi[None], bi[None], ri), axis=0)
    # index which looks up every possible line, some of which are not valid
    index = tuple(all_i[s] for s in product_slices(n))
    # We incrementally allow lines that start with some number of `ri`s, and an `fi`
    # [0] here means we chose fi for that index
    # [2:] here means we chose an ri for that index
    mask = np.zeros((all_i.shape[0],)*n, dtype=bool)  # plain bool works on every NumPy version
    sl = np.index_exp[0]
    for i in range(n):
        mask[sl] = True
        sl = np.index_exp[2:] + sl
    return index, mask
Applied to your example:
# construct your example array
n = 3
k = 3
data = np.arange(k**n).reshape((k,)*n)
# apply my index_creating function
index, mask = get_lines(n, k)
# apply the index to your array
lines = data[index][mask]
print(lines)
array([[ 0, 13, 26],
[ 2, 13, 24],
[ 0, 12, 24],
[ 1, 13, 25],
[ 2, 14, 26],
[ 6, 13, 20],
[ 8, 13, 18],
[ 6, 12, 18],
[ 7, 13, 19],
[ 8, 14, 20],
[ 0, 10, 20],
[ 2, 10, 18],
[ 0, 9, 18],
[ 1, 10, 19],
[ 2, 11, 20],
[ 3, 13, 23],
[ 5, 13, 21],
[ 3, 12, 21],
[ 4, 13, 22],
[ 5, 14, 23],
[ 6, 16, 26],
[ 8, 16, 24],
[ 6, 15, 24],
[ 7, 16, 25],
[ 8, 17, 26],
[ 0, 4, 8],
[ 2, 4, 6],
[ 0, 3, 6],
[ 1, 4, 7],
[ 2, 5, 8],
[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 13, 17],
[11, 13, 15],
[ 9, 12, 15],
[10, 13, 16],
[11, 14, 17],
[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 22, 26],
[20, 22, 24],
[18, 21, 24],
[19, 22, 25],
[20, 23, 26],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26]])
Another good set of test data is np.moveaxis(np.indices((k,)*n), 0, -1), which gives an array where every value is its own index
I've solved this problem before to implement a higher dimensional tic-tac-toe
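As a cross-check on the vectorized indexing above, the same set of lines can be enumerated with plain loops. This is a slow brute-force sketch, not the answer's method; brute_force_lines is a hypothetical name. Each direction is a tuple over {0, 1, -1}, and requiring the first nonzero entry to be +1 plays the same role as the rule that no bi comes before an fi.

```python
import itertools
import numpy as np

def brute_force_lines(data):
    """Enumerate every axis line and diagonal of a k**n cube, one row each."""
    n, k = data.ndim, data.shape[0]
    steps = np.arange(k)
    lines = []
    for direction in itertools.product((0, 1, -1), repeat=n):
        nonzero = [d for d in direction if d != 0]
        if not nonzero or nonzero[0] != 1:
            continue  # skip the all-zero "point" and reversed duplicates
        fixed_axes = [ax for ax, d in enumerate(direction) if d == 0]
        # Every combination of constant coordinates on the non-moving axes.
        for fixed in itertools.product(range(k), repeat=len(fixed_axes)):
            fixed_iter = iter(fixed)
            index = tuple(
                steps if d == 1
                else steps[::-1] if d == -1
                else np.full(k, next(fixed_iter))
                for d in direction
            )
            lines.append(data[index])
    return np.array(lines)

data = np.arange(27).reshape(3, 3, 3)
lines = brute_force_lines(data)
```

For n = 3 and k = 3 this yields 49 rows, the same number as the output listed above, which agrees with the count ((k+2)**n - k**n) / 2.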
In [1]: x=np.arange(27).reshape(3,3,3)
Selecting individual rows is easy:
In [2]: x[0,0,:]
Out[2]: array([0, 1, 2])
In [3]: x[0,:,0]
Out[3]: array([0, 3, 6])
In [4]: x[:,0,0]
Out[4]: array([ 0, 9, 18])
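You could iterate over dimensions with an index list (converted to a tuple when indexing, since current NumPy no longer accepts a list that contains slices as a multidimensional index):
In [10]: idx=[slice(None),0,0]
In [11]: x[tuple(idx)]
Out[11]: array([ 0, 9, 18])
In [12]: idx[2]+=1
In [13]: x[tuple(idx)]
Out[13]: array([ 1, 10, 19])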
Look at the code for np.apply_along_axis to see how it implements this sort of iteration.
Reshape and split can also produce a list of rows. For some dimensions this might require a transpose:
In [20]: np.split(x.reshape(x.shape[0],-1),9,axis=1)
Out[20]:
[array([[ 0],
[ 9],
[18]]), array([[ 1],
[10],
[19]]), array([[ 2],
[11],
...
np.diag can get diagonals from 2d subarrays
In [21]: np.diag(x[0,:,:])
Out[21]: array([0, 4, 8])
In [22]: np.diag(x[1,:,:])
Out[22]: array([ 9, 13, 17])
In [23]: np.diag?
In [24]: np.diag(x[1,:,:],1)
Out[24]: array([10, 14])
In [25]: np.diag(x[1,:,:],-1)
Out[25]: array([12, 16])
And explore np.diagonal for direct application to the 3d. It's also easy to index the array directly, with range and arange, x[0,range(3),range(3)].
As far as I know there isn't a function to step through all these alternatives. Since dimensions of the returned arrays can differ, there's little point to producing such a function in compiled numpy code. So even if there was a function, it would step through the alternatives as I outlined.
==============
All the 1d lines
x.reshape(-1,3)
x.transpose(0,2,1).reshape(-1,3)
x.transpose(1,2,0).reshape(-1,3)
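The three expressions above generalize to any number of dimensions by moving each axis to the end in turn. A sketch, assuming a cube with the same length k on every axis; axis_lines is a hypothetical helper name.

```python
import numpy as np

def axis_lines(arr):
    """Stack every axis-parallel line of a k**n cube as rows of length k."""
    k = arr.shape[0]
    return np.concatenate(
        [np.moveaxis(arr, axis, -1).reshape(-1, k) for axis in range(arr.ndim)]
    )

x = np.arange(27).reshape(3, 3, 3)
lines = axis_lines(x)   # 9 lines per axis, 27 rows in total
```

The rows are grouped by axis: lines along axis 0 come first, then axis 1, then axis 2.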
y/z diagonal and anti-diagonal
In [154]: i=np.arange(3)
In [155]: j=np.arange(2,-1,-1)
In [156]: np.concatenate((x[:,i,i],x[:,i,j]),axis=1)
Out[156]:
array([[ 0, 4, 8, 2, 4, 6],
[ 9, 13, 17, 11, 13, 15],
[18, 22, 26, 20, 22, 24]])
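The same result can be obtained with np.diagonal applied to the 3d array directly, as suggested earlier. A minimal sketch; the anti-diagonal is taken by flipping one of the two axes first.

```python
import numpy as np

x = np.arange(27).reshape(3, 3, 3)

# Diagonal of every x[i, :, :] plane; the diagonal becomes the last axis.
main = np.diagonal(x, axis1=1, axis2=2)

# Flip the last axis first to pick up the anti-diagonals instead.
anti = np.diagonal(np.flip(x, axis=2), axis1=1, axis2=2)

both = np.concatenate((main, anti), axis=1)
```

Changing axis1 and axis2 selects the other two families of 2d diagonals.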
np.einsum can be used to build all these kind of expressions; for instance:
# 3d diagonals
print(np.einsum('iii->i', a))
# 2d diagonals
print(np.einsum('iij->ij', a))
print(np.einsum('iji->ij', a))
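A self-contained variant of the einsum idea, assuming a is the 3x3x3 example array from the question. Repeated subscripts pick a diagonal, and reversing one axis before the call gives the matching anti-diagonal; the variable names are illustrative.

```python
import numpy as np

a = np.arange(27).reshape(3, 3, 3)

# 3d diagonal a[i, i, i] and one 3d anti-diagonal a[i, i, 2 - i].
space_diag = np.einsum("iii->i", a)
space_anti = np.einsum("iii->i", a[:, :, ::-1])

# 2d diagonals within each a[j, :, :] plane: out[j, i] = a[j, i, i].
plane_diags = np.einsum("jii->ji", a)
```

Reversing other axes, or combinations of axes, reaches the remaining anti-diagonals.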

How to select values in a n-dimensional array

I have been trying to perform a simple operation, but I can't seem to find a simple way to do it using Numpy functions without creating unnecessary copies of the array.
Suppose we have the following 3-dimensional array :
In [171]: x = np.arange(24).reshape((4, 3, 2))
In [172]: x
Out[172]:
array([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15],
[16, 17]],
[[18, 19],
[20, 21],
[22, 23]]])
And the following array :
In [173]: y = np.array([0, 1, 1, 0])
I want to select in x, for each row, the value of the last dimension whose index is the corresponding element in y. In other words, I want :
array([[ 0, 2, 4],
[ 7, 9, 11],
[13, 15, 17],
[18, 20, 22]])
The only solution that I have for now is using a for loop over the first dimension of x and y, as follows :
z = np.zeros((4, 3), dtype=int)
for i, row in enumerate(x):
z[i, :] = row[:, y[i]]
Is there a way of avoiding a for loop here, using numpy functions or fancy indexing?
Thanks!
The tricky aspect is that you don't want all of the 0th-dimension for each slice, you want the slices to correspond to each element in the 0th-dimension. So you could do something like:
>>> x[np.arange(x.shape[0]), :, y]
array([[ 0, 2, 4],
[ 7, 9, 11],
[13, 15, 17],
[18, 20, 22]])
Fancy indexing:
x[np.arange(y.size),:,y]
gives:
array([[ 0, 2, 4],
[ 7, 9, 11],
[13, 15, 17],
[18, 20, 22]])
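Another option, offered here as a sketch and not taken from the answers above, is np.take_along_axis. The index array needs the same number of dimensions as x, so y is reshaped to (4, 1, 1) and broadcast over the middle axis.

```python
import numpy as np

x = np.arange(24).reshape((4, 3, 2))
y = np.array([0, 1, 1, 0])

# y[:, None, None] has shape (4, 1, 1) and broadcasts against x on axis 1.
picked = np.take_along_axis(x, y[:, None, None], axis=2)   # shape (4, 3, 1)
z = picked[..., 0]                                          # drop the length-1 axis

# The fancy-indexing form from the answers, for comparison.
z_fancy = x[np.arange(y.size), :, y]
```

Both forms return a new (4, 3) array; like the loop, they copy the selected values.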
