Subsampling 3D array using the neighbourhood sum - python

The title is probably confusing. I have a reasonably large 3D numpy array. I'd like to cut its size by a factor of 2^3 by binning blocks of size (2,2,2). Each element in the new 3D array should then contain the sum of the elements in its respective block in the original array.
As an example, consider a 4x4x4 array:
input = [[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],
[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],
... ]]]
(I'm only representing half of it to save space). Notice that all the elements with the same value constitute a (2x2x2) block. The output should be a 2x2x2 array such that each element is the sum of a block:
output = [[[8, 16],
[24, 32]],
... ]]]
So 8 is the sum of all 1's, 16 is the sum of the 2's, and so on.

scikit-image has a built-in for such block-wise reductions, skimage.measure.block_reduce:
In [36]: a
Out[36]:
array([[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],
[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]]])
In [37]: from skimage.measure import block_reduce
In [39]: block_reduce(a, block_size=(2,2,2), func=np.sum)
Out[39]:
array([[[ 8, 16],
[24, 32]]])
Other reduction functions work the same way, for example a max-reduction:
In [40]: block_reduce(a, block_size=(2,2,2), func=np.max)
Out[40]:
array([[[1, 2],
[3, 4]]])
Implementing such a function isn't that difficult with NumPy tools and could be done like so -
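import numpy as np

def block_reduce_numpy(a, block_size, func):
    shp = a.shape
    # Split each axis of length i into (i//j, j); assumes i is divisible by j
    new_shp = np.hstack([(i//j, j) for (i, j) in zip(shp, block_size)])
    # The within-block axes are the odd ones: 1, 3, 5, ...
    select_axes = tuple(np.arange(a.ndim)*2 + 1)
    return func(a.reshape(new_shp), axis=select_axes)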
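As a quick check of the reshape approach, here is a minimal sketch run on the 4x4x4 example from the question. The function is restated (building the shape as a plain list instead of using np.hstack) so the block runs on its own. Two assumptions: every dimension is an exact multiple of the block size, and, since the second half of the question's array is elided, the sketch simply repeats the same plane four times.

```python
import numpy as np

def block_reduce_numpy(a, block_size, func):
    # Split every axis of length n into (n // b, b); assumes n % b == 0.
    new_shape = []
    for n, b in zip(a.shape, block_size):
        new_shape.extend((n // b, b))
    # The within-block axes sit at the odd positions 1, 3, 5, ...
    block_axes = tuple(range(1, 2 * a.ndim, 2))
    return func(a.reshape(new_shape), axis=block_axes)

# Assumption: the elided half of the array repeats the plane shown.
plane = np.array([[1, 1, 2, 2],
                  [1, 1, 2, 2],
                  [3, 3, 4, 4],
                  [3, 3, 4, 4]])
a = np.stack([plane, plane, plane, plane])   # shape (4, 4, 4)

block_sums = block_reduce_numpy(a, (2, 2, 2), np.sum)
block_maxs = block_reduce_numpy(a, (2, 2, 2), np.max)
print(block_sums.shape)   # (2, 2, 2)
print(block_sums[0])      # first output plane: [[8, 16], [24, 32]]
```

If a dimension is not divisible by the block size, the reshape raises a ValueError, so the input would need to be cropped or padded first.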

Related

Numpy 2D to 3D array based on data in a column

Let's say I have data structured in a 2D array like this:
[[1, 3, 4, 6],
[1, 4, 8, 2],
[1, 3, 2, 9],
[2, 2, 4, 8],
[2, 4, 9, 1],
[2, 2, 9, 3]]
The first column denotes a third dimension, so I want to convert this to the following 3D array:
[[[3, 4, 6],
[4, 8, 2],
[3, 2, 9]],
[[2, 4, 8],
[4, 9, 1],
[2, 9, 3]]]
Is there a built-in numpy function to do this?
You can try code below:
import numpy as np
array = np.array([[1, 3, 4, 6],
[1, 4, 8, 2],
[1, 3, 2, 9],
[2, 2, 4, 8],
[2, 4, 9, 1],
[2, 2, 9, 3]])
array = np.delete(array, 0, 1)
array.reshape(2,3,-1)
Output
array([[[3, 4, 6],
[4, 8, 2],
[3, 2, 9]],
[[2, 4, 8],
[4, 9, 1],
[2, 9, 3]]])
However, this code only works when you know the array's shape in advance. If you only know that the number of rows is a multiple of 3, you can compute the first dimension instead:
array.reshape(array.shape[0]//3,3,-1)
Use numpy array slicing with reshape function.
import numpy as np
arr = [[1, 3, 4, 6],
[1, 4, 8, 2],
[1, 3, 2, 9],
[2, 2, 4, 8],
[2, 4, 9, 1],
[2, 2, 9, 3]]
# convert the list to numpy array
arr = np.array(arr)
# remove first column from numpy array
arr = arr[:,1:]
# reshape the remaining array to desired shape
arr = arr.reshape(len(arr)//3,3,-1)
print(arr)
Output:
[[[3 4 6]
[4 8 2]
[3 2 9]]
[[2 4 8]
[4 9 1]
[2 9 3]]]
You show a plain list rather than a numpy array. I am unsure whether you are just suggesting numpy as a means to get a non-numpy result, or whether you are actually looking for a numpy array as the result. If you don't actually need numpy, you could do something like this:
arr = [[1, 3, 4, 6],
[1, 4, 8, 2],
[1, 3, 2, 9],
[2, 2, 4, 8],
[2, 4, 9, 1],
[2, 2, 9, 3]]
# Length of the 3rd and 2nd dimension.
nz = arr[-1][0] + (arr[0][0]==0)
ny = int(len(arr)/nz)
res = [[arr[ny*z_idx+y_idx][1:] for y_idx in range(ny)] for z_idx in range(nz)]
OUTPUT:
[[[3, 4, 6], [4, 8, 2], [3, 2, 9]], [[2, 4, 8], [4, 9, 1], [2, 9, 3]]]
Note that the calculation of nz takes into account that the 3rd dimension index in your array is either 0-based (as python is per default) or 1-based (as you show in your example).
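The answers above hard-code the group size of 3. As a hedged alternative, the group size can be read from the first column itself. This is only a sketch under two assumptions that are not stated in the question: every label in the first column occurs the same number of times, and rows within a group should keep their original order.

```python
import numpy as np

arr = np.array([[1, 3, 4, 6],
                [1, 4, 8, 2],
                [1, 3, 2, 9],
                [2, 2, 4, 8],
                [2, 4, 9, 1],
                [2, 2, 9, 3]])

# Count how many rows carry each label in the first column.
labels, counts = np.unique(arr[:, 0], return_counts=True)

# Assumption: equal group sizes; otherwise no rectangular 3D array exists.
if not np.all(counts == counts[0]):
    raise ValueError("groups have different sizes")

# A stable sort keeps the original row order inside each group.
order = np.argsort(arr[:, 0], kind="stable")
result = arr[order, 1:].reshape(len(labels), counts[0], -1)
print(result.shape)   # (2, 3, 3)
```

Sorting first means the rows do not have to arrive grouped by label, which the plain reshape silently relies on.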

Given indexes, get values from numpy matrix

Let's say I have this numpy matrix:
>>> mat = np.matrix([[3,4,5,2,1], [1,2,7,6,5], [8,9,4,5,2]])
>>> mat
matrix([[3, 4, 5, 2, 1],
[1, 2, 7, 6, 5],
[8, 9, 4, 5, 2]])
Now let's say I have some indexes in this form:
>>> ind = np.matrix([[0,2,3], [0,4,2], [3,1,2]])
>>> ind
matrix([[0, 2, 3],
[0, 4, 2],
[3, 1, 2]])
What I would like to do is to get three values from each row of the matrix, specifically values at columns 0, 2, and 3 for the first row, values at columns 0, 4 and 2 for the second row, etc. This is the expected output:
matrix([[3, 5, 2],
[1, 5, 7],
[5, 9, 4]])
I've tried using np.take but it doesn't seem to work. Any suggestion?
This is take_along_axis.
>>> np.take_along_axis(mat, ind, axis=1)
matrix([[3, 5, 2],
[1, 5, 7],
[5, 9, 4]])
This will do it: mat[np.arange(3).reshape(-1, 1), ind]
In [245]: mat[np.arange(3).reshape(-1, 1), ind]
Out[245]:
matrix([[3, 5, 2],
[1, 5, 7],
[5, 9, 4]])
(but take_along_axis in @user3483203's answer is simpler).
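Both answers can be compared side by side. The sketch below uses plain np.array inputs instead of np.matrix (an assumption on my part, since the question uses np.matrix) and checks that the two approaches agree. np.take_along_axis needs NumPy 1.15 or later.

```python
import numpy as np

mat = np.array([[3, 4, 5, 2, 1],
                [1, 2, 7, 6, 5],
                [8, 9, 4, 5, 2]])
ind = np.array([[0, 2, 3],
                [0, 4, 2],
                [3, 1, 2]])

# Option 1: pick ind[r, k] from row r for every r and k.
picked_a = np.take_along_axis(mat, ind, axis=1)

# Option 2: broadcast a column of row numbers against the column indices.
rows = np.arange(mat.shape[0])[:, None]   # shape (3, 1)
picked_b = mat[rows, ind]

print(picked_a)
print(np.array_equal(picked_a, picked_b))
```

Using mat.shape[0] instead of the literal 3 keeps the second option working for any number of rows.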

Split nested numpy array

I have a numpy array of shape 28 x 1875. Each element is a 3-element list (only floats). I need to split each of these elements into individual ones, to obtain an array of shape 28 x 5625 (1875*3). I've tried np.split; however, it only separates each element, not each sub-element. Is there a fast way to do this?
Making a 2d array of lists:
In [522]: arr = np.empty(6,object)
In [523]: arr[:] = [list(range(i,i+3)) for i in range(6)]
In [524]: arr = arr.reshape(2,3)
In [525]: arr
Out[525]:
array([[list([0, 1, 2]), list([1, 2, 3]), list([2, 3, 4])],
[list([3, 4, 5]), list([4, 5, 6]), list([5, 6, 7])]], dtype=object)
It's easier to fill such an array if it is 1d, which is why I start with (6,) and reshape after.
Paul Panzer's suggestion:
In [526]: np.array(arr.tolist())
Out[526]:
array([[[0, 1, 2],
[1, 2, 3],
[2, 3, 4]],
[[3, 4, 5],
[4, 5, 6],
[5, 6, 7]]])
In [527]: _.reshape(2,-1)
Out[527]:
array([[0, 1, 2, 1, 2, 3, 2, 3, 4],
[3, 4, 5, 4, 5, 6, 5, 6, 7]])
You can also use np.stack (a version of np.concatenate) to create an nd array. It does, though, require a 1d object array, hence the ravel:
In [536]: np.stack(arr.ravel())
Out[536]:
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7]])
That can be reshaped as needed:
In [537]: np.stack(arr.ravel()).reshape(2,-1)
Out[537]:
array([[0, 1, 2, 1, 2, 3, 2, 3, 4],
[3, 4, 5, 4, 5, 6, 5, 6, 7]])
In some cases we need to transpose axes to get the desired order.
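To illustrate the remark about transposing, here is a small sketch with made-up data (the values 0, 1, 2, 10, 11, 12, ... are my own choice so that the two orderings are easy to tell apart). It assumes the alternative layout wanted is "all first sub-elements, then all second ones, and so on" within each row.

```python
import numpy as np

# Object array of six 3-element lists, filled one element at a time.
arr = np.empty(6, dtype=object)
for i in range(6):
    arr[i] = [10 * i, 10 * i + 1, 10 * i + 2]
arr = arr.reshape(2, 3)

# Convert to a regular (2, 3, 3) integer array.
cube = np.array(arr.tolist())

# Plain reshape: the sub-elements of each list stay next to each other.
row_major = cube.reshape(2, -1)

# Swap the last two axes first: sub-element k of every list comes together.
interleaved = cube.transpose(0, 2, 1).reshape(2, -1)

print(row_major[0])
print(interleaved[0])
```

The transpose only changes which axis varies fastest before flattening; the set of values in each row is the same in both results.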

Reshaping a multidimensional Numpy array

I have a numpy array of shape (1429,1) where each row itself is a numpy array of shape (3,100) where l may vary from row to row.
How can I reshape this array by flattening each row such that the resulting numpy array will have the shape (1429, 300)?
I guess your initial array's shape is (1429, 3, 100). If that's true, you can change its shape as below:
import numpy as np
a = a.flatten().reshape((1429, 300)) #a is the initial numpy array
The type of your embedding structure is probably object. It's just a collection of references on 1429 numpy.ndarrays.
As an example:
a=np.empty((1429,1),object)
for x in a :
    x[0]=np.random.rand(3,100)
In [19]: a.shape,a.dtype
Out[19]: ((1429, 1), dtype('O'))
In [20]: a[0,0].shape
Out[20]: (3, 100)
The structure is probably not contiguous. To obtain a single block containing all your data, you must rebuild it with the right layout:
b=np.array([x.ravel() for x in a.ravel()])
In [21]: b.shape
Out[21]: (1429, 300)
ravel discards the unwanted dimensions.
Assuming it is an object dtype array with shape (1429,1), and all elements are 2d of shape (3,100), a good way to 'flatten' is to use concatenate or stack.
np.stack(arr.ravel()).reshape(-1,300)
I use arr.ravel() so the array looks like a (1429) element list to stack. stack then concatenates the elements, creating a (1429, 3, 100) array. The reshape then converts that to (1429, 300).
In [939]: arr = np.empty((5,1),object)
In [940]: arr[:,0] = [np.arange(6).reshape(2,3) for _ in range(5)]
In [941]: arr
Out[941]:
array([[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])]], dtype=object)
In [942]: np.stack(arr.ravel())
Out[942]:
array([[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]]])
In [943]: np.stack(arr.ravel()).reshape(-1,6)
Out[943]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
np.stack with the default axis=0 is the same as np.array(...).
Or with concatenate
In [950]: np.concatenate(arr.ravel(),axis=0)
Out[950]:
array([[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5]])
In [951]: np.concatenate(arr.ravel(),axis=0).reshape(5,6)
Out[951]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
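Both np.stack and np.concatenate followed by a fixed reshape only give the intended result when every element has the same shape. A minimal sketch of a guard for that, using a small made-up (4, 1) object array of (3, 5) blocks in place of the real (1429, 1) data:

```python
import numpy as np

# Hypothetical stand-in data: 4 rows, each holding one (3, 5) array.
arr = np.empty((4, 1), dtype=object)
for i in range(4):
    arr[i, 0] = np.arange(15).reshape(3, 5) + 100 * i

# Collect the distinct element shapes; stacking needs exactly one.
shapes = {element.shape for element in arr.ravel()}
if len(shapes) != 1:
    raise ValueError(f"elements have different shapes: {shapes}")

# Stack to (4, 3, 5), then flatten each row to length 15.
flat = np.stack(list(arr.ravel())).reshape(len(arr), -1)
print(flat.shape)   # (4, 15)
```

If the shapes do differ, there is no rectangular result, and the rows would have to be padded or kept as separate arrays.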

numpy array operation method

>>> c= array([[[1, 2],
[3, 4]],
[[2, 1],
[4, 3]],
[[3, 2],
[1, 4]]])
>>> x
array([[0, 1, 2],
[3, 4, 5]])
I want a matrix in which each column is the product of the corresponding matrix in c with the corresponding column of x, using regular matrix multiplication. I'm trying to find a way to vectorize this, or at least to avoid a for loop. The result should be:
array([[ 6,  6, 16],
       [12, 16, 22]])
To extend this operation further, let's say that I have an array of matrices, say
>>> c
array([[[1, 2],
[3, 4]],
[[2, 1],
[4, 3]],
[[3, 2],
[1, 4]]])
>>> x
array([[[1, 2, 3],
[1, 2, 3]],
[[1, 0, 2],
[1, 0, 2]],
[[2, 3, 1],
[0, 1, 0]]])
def fun(c,x):
    for i in range(len(x)):
        np.einsum('ijk,ki->ji',c,x[i])
        ##something
So basically, I want each matrix in x to be multiplied with all of c, returning a structure similar to c, without introducing this for loop.
The reason I'm doing this is that I ran into it while trying to vectorize
Xc (the operation follows normal matrix-times-column-vector multiplication). c is a 3D array like the c above: a column vector in which each element is a matrix (in numpy it has the form shown above). X is a matrix in which each element is a 1D array. The output of Xc should be a 1D array.
You can use np.einsum -
np.einsum('ijk,ki->ji',c,x)
Sample run -
In [155]: c
Out[155]:
array([[[1, 2],
[3, 4]],
[[2, 1],
[4, 3]],
[[3, 2],
[1, 4]]])
In [156]: x
Out[156]:
array([[0, 1, 2],
[3, 4, 5]])
In [157]: np.einsum('ijk,ki->ji',c,x)
Out[157]:
array([[ 6, 6, 16],
[12, 16, 22]])
For the 3D case of x, simply append the new dimension at the start of the string notation for x and correspondingly at the output string notation too, like so -
np.einsum('ijk,lki->lji',c,x)
Sample run -
In [151]: c
Out[151]:
array([[[1, 2],
[3, 4]],
[[2, 1],
[4, 3]],
[[3, 2],
[1, 4]]])
In [152]: x
Out[152]:
array([[[1, 2, 3],
[1, 2, 3]],
[[1, 0, 2],
[1, 0, 2]],
[[2, 3, 1],
[0, 1, 0]]])
In [153]: np.einsum('ijk,lki->lji',c,x)
Out[153]:
array([[[ 3, 6, 15],
[ 7, 14, 15]],
[[ 3, 0, 10],
[ 7, 0, 10]],
[[ 2, 7, 3],
[ 6, 15, 1]]])
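To see what the subscripts do, the einsum results can be checked against explicit loops. This is a sketch for verification only (the names looped, batched and batched_loop are mine): 'ijk,ki->ji' computes c[i] @ x[:, i] and stores it as column i of the output.

```python
import numpy as np

c = np.array([[[1, 2], [3, 4]],
              [[2, 1], [4, 3]],
              [[3, 2], [1, 4]]])
x = np.array([[0, 1, 2],
              [3, 4, 5]])

# 2D case: column i of the result is c[i] @ x[:, i].
vectorized = np.einsum('ijk,ki->ji', c, x)
looped = np.stack([c[i] @ x[:, i] for i in range(len(c))], axis=1)

# 3D case: the extra leading axis l is carried through unchanged.
x3 = np.array([[[1, 2, 3], [1, 2, 3]],
               [[1, 0, 2], [1, 0, 2]],
               [[2, 3, 1], [0, 1, 0]]])
batched = np.einsum('ijk,lki->lji', c, x3)
batched_loop = np.stack([np.einsum('ijk,ki->ji', c, xl) for xl in x3])

print(np.array_equal(vectorized, looped))
print(np.array_equal(batched, batched_loop))
```

The loops are there only to make the meaning of the index string concrete; the einsum calls are the vectorized form.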
