How would you split a 2D NumPy array into n x n chunks?
For example, the following array of shape (4,4):
arr = [[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
[13,14,15,16]]
Transformed into this array of shape (4,2,2) by splitting it into 2x2 blocks:
new_arr = [[[1,2],
[5,6]],
[[3,4],
[7,8]],
[[9,10],
[13,14]],
[[11,12],
[15,16]]]
You can use np.vsplit to split the array into subarrays vertically (row-wise), and np.hsplit to split it horizontally (column-wise). The generalized ressample function below combines the two:
import numpy as np
def ressample(arr, N):
    A = []
    # cut the array into bands of N rows, then cut each band into blocks of N columns
    for v in np.vsplit(arr, arr.shape[0] // N):
        A.extend(np.hsplit(v, arr.shape[1] // N))
    return np.array(A)
Example 1:
The given 2D array has shape 4x4, and we want to split it into chunks of shape 2x2.
arr = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]])
print(ressample(arr, 2)) #--> chunk size 2
Output 1:
[[[ 1 2]
[ 5 6]]
[[ 3 4]
[ 7 8]]
[[ 9 10]
[13 14]]
[[11 12]
[15 16]]]
Example 2:
Now consider a 2D array with 8 rows and 8 columns, which we split into chunks of shape 4x4.
arr = np.random.randint(0, 10, 64).reshape(8, 8)
print(ressample(arr, 4)) #--> chunk size 4
Sample Output 2:
[[[8 3 7 5]
[7 2 6 1]
[7 9 2 2]
[3 1 8 8]]
[[2 0 3 2]
[2 9 0 8]
[2 6 3 9]
[2 4 4 8]]
[[9 9 1 8]
[9 1 5 0]
[8 5 1 2]
[2 7 5 1]]
[[7 8 9 6]
[9 0 9 5]
[8 9 8 3]
[7 3 6 3]]]
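The same vsplit/hsplit idea also extends to rectangular arrays if the number of column splits is taken from the column count. Below is a minimal sketch under that assumption; split_blocks is a hypothetical helper name, and the divisibility check is an addition, not part of the answer above.

```python
import numpy as np

def split_blocks(arr, n):
    """Split a 2D array into n x n blocks, band by band (hypothetical helper)."""
    rows, cols = arr.shape
    if rows % n or cols % n:
        raise ValueError("both dimensions must be divisible by n")
    blocks = []
    for band in np.vsplit(arr, rows // n):         # bands of n rows
        blocks.extend(np.hsplit(band, cols // n))  # blocks of n columns
    return np.array(blocks)

arr = np.arange(24).reshape(4, 6)
blocks = split_blocks(arr, 2)
print(blocks.shape)  # (6, 2, 2)
print(blocks[1])     # rows 0-1, columns 2-3 of arr
```

The blocks come out in row-major block order: left to right within a band, then the next band down.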
You could do the following, and adjust it to your array:
import numpy as np
arr = [[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
[13,14,15,16]]
# step through rows and columns two at a time so the 2x2 blocks do not overlap
arr_new = np.array([[arr[i][j:j+2], arr[i+1][j:j+2]] for i in range(0, len(arr), 2) for j in range(0, len(arr[0]), 2)])
print(arr_new)
print(arr_new.shape)
This gives the following output:
[[[ 1 2]
[ 5 6]]
[[ 3 4]
[ 7 8]]
[[ 9 10]
[13 14]]
[[11 12]
[15 16]]]
(4, 2, 2)
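You could use the hsplit() and vsplit() functions to achieve this. Note that this version collects the blocks column by column, so their order differs from the one in the question.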
import numpy as np
arr = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
ls1,ls2 = np.hsplit(arr, 2)
ls1 = np.vsplit(ls1,2)
ls2 = np.vsplit(ls2,2)
ls = ls1 + ls2
result = np.array(ls)
print(result)
>>>
[[[ 1 2]
[ 5 6]]
[[ 9 10]
[13 14]]
[[ 3 4]
[ 7 8]]
[[11 12]
[15 16]]]
print(result.tolist())
>>> [[[1, 2], [5, 6]], [[9, 10], [13, 14]], [[3, 4], [7, 8]], [[11, 12], [15, 16]]]
There is no need to split anything; the same result can be obtained by reshaping and reordering the axes.
result = np.swapaxes(arr.reshape(2, 2, 2, 2), 1, 2).reshape(-1, 2, 2)
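For other sizes, the hard-coded 2s in the reshape can be derived from the array shape and the chunk size. Here is a rough sketch, assuming both dimensions are divisible by n; the name blockshaped is made up for illustration.

```python
import numpy as np

def blockshaped(arr, n):
    """Return the n x n blocks of a 2D array in row-major block order."""
    rows, cols = arr.shape
    if rows % n or cols % n:
        raise ValueError("both dimensions must be divisible by n")
    # axes after the first reshape: (block row, row in block, block col, col in block)
    return (arr.reshape(rows // n, n, cols // n, n)
               .swapaxes(1, 2)      # -> (block row, block col, row in block, col in block)
               .reshape(-1, n, n))  # flatten the two block axes

arr = np.arange(1, 17).reshape(4, 4)
result = blockshaped(arr, 2)
print(result.shape)  # (4, 2, 2)
```

The first reshape and the swapaxes are views; the final reshape has to copy because the blocks are not contiguous in memory.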
Dividing an (N, N) array into (n, n) chunks is also essentially a sliding-window operation with an (n, n) window and a stride of n.
from numpy.lib.stride_tricks import sliding_window_view
result = sliding_window_view(arr, (2, 2))[::2, ::2].reshape(-1, 2, 2)
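To make the window-plus-stride idea concrete, here is the same approach with the chunk size as a variable. This is only a sketch: it assumes NumPy 1.20 or later (where sliding_window_view was added) and that n divides both dimensions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.arange(36).reshape(6, 6)
n = 3  # chunk size, assumed to divide both dimensions

windows = sliding_window_view(arr, (n, n))  # every overlapping n x n window
chunks = windows[::n, ::n]                  # keep windows that start at multiples of n
result = chunks.reshape(-1, n, n)

print(windows.shape)  # (4, 4, 3, 3)
print(result.shape)   # (4, 3, 3)
```

Up to the final reshape nothing is copied: windows and chunks are views on arr.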
Related
I had to make a matrix using numpy.array method. How can I now update every third element of my matrix? I have made a for loop for the problem but that is not the optimal solution. Is there a way to avoid loops? For example if I have this matrix:
matrix = np.array([[1,2,3,4],
[5,6,7,8],
[4,7,6,9]])
is there a way to add 1 to every third element and get this matrix:
[[2,2,3,5],[5,6,8,8],[4,8,6,9]]
Solution:
matrix = np.ascontiguousarray(matrix)
matrix.ravel()[::3] += 1
Why is ascontiguousarray needed? Because matrix may not be C-contiguous (for example, it may be in Fortran order, i.e. column-major). In that case ravel returns a copy instead of a view, so the in-place operation matrix.ravel()[::3] += 1 modifies only that temporary copy and leaves matrix unchanged.
Example 1
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
arr.ravel()[::3] += 1
print(arr)
Works as expected:
[[2 2 3 5]
[5 6 8 8]
[4 8 6 9]]
Example 2
But with Fortran order:
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
arr = np.asfortranarray(arr)
arr.ravel()[::3] += 1
print(arr)
produces:
[[1 2 3 4]
[5 6 7 8]
[4 7 6 9]]
Example 3
Will work as expected in both cases
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
# arr = np.asfortranarray(arr)
arr = np.ascontiguousarray(arr)
arr.ravel()[::3] += 1
print(arr)
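The view-versus-copy behaviour behind these three examples can be checked directly with np.shares_memory. The sketch below also uses the .flat iterator as an alternative that does not depend on memory layout; that alternative is my own addition, not something the answer above proposes.

```python
import numpy as np

c_arr = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [4, 7, 6, 9]])
f_arr = np.asfortranarray(c_arr)

# ravel() is a view for C-ordered data but a copy for Fortran-ordered data
print(np.shares_memory(c_arr, c_arr.ravel()))  # True
print(np.shares_memory(f_arr, f_arr.ravel()))  # False

# .flat always indexes in row-major order and writes back into the array
f_arr.flat[::3] += 1
print(f_arr)
```

After the last statement f_arr holds the updated values even though it is stored in Fortran order.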
I want to replace the last element of every row in an ndarray with a constant. Currently I can solve this using loops, but I'm looking for an elegant solution, preferably using NumPy functions.
For example, I have an ndarray:
[1 3 4 5]
[4 2 4 1]
[3 2 7 3]
[7 9 4 3]
[6 9 7 2]
Here is the result I want, with the last element of every row replaced by 10:
[1 3 4 10]
[4 2 4 10]
[3 2 7 10]
[7 9 4 10]
[6 9 7 10]
Use NumPy indexing to select the last column:
import numpy as np
arr = np.array([[1,3,4,5],
[4,2,4,1],
[3,2,7,3],
[7,9,4,3],
[6,9,7,2]])
arr[:,-1]=10
arr
array([[ 1, 3, 4, 10],
[ 4, 2, 4, 10],
[ 3, 2, 7, 10],
[ 7, 9, 4, 10],
[ 6, 9, 7, 10]])
I am trying to use the function as_strided from numpy.lib.stride_tricks to extract sub series from a larger 2D array, but I struggled to find the right thing to write for the strides argument.
Let's say I have a matrix m which contains five 1D arrays of length (a=)10. I want to extract sub 1D arrays of length (b=)4 from each 1D array in m.
import numpy
from numpy.lib.stride_tricks import as_strided
a, b = 10, 4
m = numpy.array([range(i,i+a) for i in range(5)])
# first try
sub_m = as_strided(m, shape=(m.shape[0], m.shape[1]-b+1, b))
print(sub_m.shape) # (5,7,4) which is what I expected
print(sub_m[-1,-1,-1]) # Some unexpected strange number: 8227625857902995061
# second try with strides argument
sub_m = as_strided(m, shape=(m.shape[0], m.shape[1]-b+1, b), strides=(m.itemize,m.itemize,m.itemize))
# gives error, see below
AttributeError: 'numpy.ndarray' object has no attribute 'itemize'
As you can see, I managed to get the right shape for sub_m in my first try. However, I can't work out what to write in strides=().
For information:
m = [[ 0 1 2 3 4 5 6 7 8 9]
[ 1 2 3 4 5 6 7 8 9 10]
[ 2 3 4 5 6 7 8 9 10 11]
[ 3 4 5 6 7 8 9 10 11 12]
[ 4 5 6 7 8 9 10 11 12 13]]
Expected output:
sub_m = [
[[0 1 2 3] [1 2 3 4] ... [5 6 7 8] [6 7 8 9]]
[[1 2 3 4] [2 3 4 5] ... [6 7 8 9] [7 8 9 10]]
[[2 3 4 5] [3 4 5 6] ... [7 8 9 10] [8 9 10 11]]
[[3 4 5 6] [4 5 6 7] ... [8 9 10 11] [9 10 11 12]]
[[4 5 6 7] [5 6 7 8] ... [9 10 11 12] [10 11 12 13]]
]
edit: I have much more data, that's the reason why I want to use as_strided (efficiency)
Here's one approach with np.lib.stride_tricks.as_strided -
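import numpy as np
def strided_lastaxis(a, L):
    s0, s1 = a.strides  # byte steps to the next row and to the next element
    m, n = a.shape
    # the windows overlap in memory, so treat the result as read-only
    return np.lib.stride_tricks.as_strided(a, shape=(m, n-L+1, L), strides=(s0, s1, s1))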
A bit of explanation on the strides passed to as_strided:
The output is 3D, so it needs three strides. Moving along the last (third) axis advances by one element, so its stride is s1. Moving along the second axis shifts the window start by one element as well, so that stride is also s1. Moving along the first axis jumps to the next row of the input, so its stride is the input's own first-axis stride, s0.
Sample run -
In [46]: a
Out[46]:
array([[0, 5, 6, 2, 3, 6, 7, 1, 4, 8],
[2, 1, 3, 7, 0, 3, 5, 4, 0, 1]])
In [47]: strided_lastaxis(a, L=4)
Out[47]:
array([[[0, 5, 6, 2],
[5, 6, 2, 3],
[6, 2, 3, 6],
[2, 3, 6, 7],
[3, 6, 7, 1],
[6, 7, 1, 4],
[7, 1, 4, 8]],
[[2, 1, 3, 7],
[1, 3, 7, 0],
[3, 7, 0, 3],
[7, 0, 3, 5],
[0, 3, 5, 4],
[3, 5, 4, 0],
[5, 4, 0, 1]]])
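As a sanity check on the stride reasoning, the strided result can be compared with windows built by plain slicing on the matrix m from the question. This is a sketch only; passing writeable=False is my own precaution, since the windows overlap in memory.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

m = np.array([range(i, i + 10) for i in range(5)])
L = 4
s0, s1 = m.strides  # bytes to the next row, bytes to the next element

sub_m = as_strided(m,
                   shape=(m.shape[0], m.shape[1] - L + 1, L),
                   strides=(s0, s1, s1),
                   writeable=False)  # read-only view, no data copied

# reference result built with ordinary slicing (this one copies the data)
expected = np.array([[row[j:j + L] for j in range(m.shape[1] - L + 1)]
                     for row in m])

print(sub_m.shape)                     # (5, 7, 4)
print(np.array_equal(sub_m, expected)) # True
print(sub_m[-1, -1])                   # last window of the last row
```

Because sub_m is a view, np.shares_memory(sub_m, m) is True, which is where the efficiency gain over the slicing version comes from.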
I have a 3D numpy array that looks like this:
X = [[[10 1] [ 2 10] [-5 3]]
[[-1 10] [ 0 2] [ 3 10]]
[[ 0 3] [10 3] [ 1 2]]
[[ 0 2] [ 0 0] [10 0]]]
First I want the maximum along axis zero, with X.max(axis=0),
which gives me:
[[10 10] [10 10] [10 10]]
The next step is my problem: I would like to take the location of each 10 and create a new 2D array from another 3D array which has the same dimensions as X.
For example, the array with the same dimensions looks like this:
Y = [[[11 2] [ 3 11] [-4 100]]
[[ 0 11] [ 100 3] [ 4 11]]
[[ 1 4] [11 100] [ 2 3]]
[[ 100 3] [ 1 1] [11 1]]]
I want to find the location of the maximum in X and create a 2D array from the numbers and location in Y.
the answer in this case should then be:
[[11 11] [11 11] [11 11]]
Thank you for your help in advance :)
You can do this with numpy.argmax and numpy.indices.
import numpy as np
X = np.array([[[10, 1],[ 2,10],[-5, 3]],
[[-1,10],[ 0, 2],[ 3,10]],
[[ 0, 3],[10, 3],[ 1, 2]],
[[ 0, 2],[ 0, 0],[10, 0]]])
Y = np.array([[[11, 2],[ 3,11],[-4, 100]],
[[ 0,11],[ 100, 3],[ 4,11]],
[[ 1, 4],[11, 100],[ 2, 3]],
[[ 100, 3],[ 1, 1],[11, 1]]])
ind = X.argmax(axis=0)
a1,a2=np.indices(ind.shape)
print(X[ind,a1,a2])
# [[10 10]
# [10 10]
# [10 10]]
print(Y[ind,a1,a2])
# [[11 11]
# [11 11]
# [11 11]]
The answer here provided the inspiration for this
You could try
Y[X==X.max(axis=0)].reshape(X.max(axis=0).shape)
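One caveat with the boolean mask: it returns the selected elements in the row-major order of the 3D array (layer by layer), which is not necessarily the order of the 2D positions, and it breaks when a maximum is tied. A sketch of an alternative using np.take_along_axis (available since NumPy 1.15) is shown below; Y2 is a made-up array with distinct values, used only to make the ordering difference visible.

```python
import numpy as np

X = np.array([[[10, 1], [2, 10], [-5, 3]],
              [[-1, 10], [0, 2], [3, 10]],
              [[0, 3], [10, 3], [1, 2]],
              [[0, 2], [0, 0], [10, 0]]])
Y2 = np.arange(24).reshape(4, 3, 2)  # hypothetical data, all values distinct

ind = X.argmax(axis=0)  # shape (3, 2): index of the winning layer per position
picked = np.take_along_axis(Y2, ind[np.newaxis], axis=0)[0]

# boolean-mask version from above, for comparison
masked = Y2[X == X.max(axis=0)].reshape(ind.shape)

print(picked)
print(masked)
```

With the Y from the question both versions print the same thing, because every selected value is 11; with distinct values the two results differ.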
I'm doing image processing for object detection using python. I need to divide my image into all possible blocks. For example given this toy image:
x = np.arange(25)
x = x.reshape((5, 5))
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
I want to retrieve all possible blocks of a given size, for example the 2x2 blocks are:
[[0 1]
[5 6]]
[[1 2]
[6 7]]
.. and so on. How can I do this?
scikit-learn's extract_patches_2d does that:
>>> from sklearn.feature_extraction import image
>>> one_image = np.arange(16).reshape((4, 4))
>>> one_image
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> patches = image.extract_patches_2d(one_image, (2, 2))
>>> print(patches.shape)
(9, 2, 2)
>>> patches[0]
array([[0, 1],
[4, 5]])
>>> patches[1]
array([[1, 2],
[5, 6]])
>>> patches[8]
array([[10, 11],
[14, 15]])
You can use something like this:
def rolling_window(arr, window):
    """Very basic multi-dimensional rolling window. window should be the shape
    of the desired subarrays. window is either a scalar or a tuple of the same
    size as `arr.shape`.
    """
    shape = np.array(arr.shape*2)
    strides = np.array(arr.strides*2)
    window = np.asarray(window)
    shape[arr.ndim:] = window # new dimensions size
    shape[:arr.ndim] -= window - 1
    if np.any(shape < 1):
        raise ValueError('window size is too large')
    return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
# Now:
slices = rolling_window(arr, 2)
# Slices will be 4-d not 3-d as you wanted. You can reshape,
# but the reshape generally has to copy because the windows overlap in memory:
slices = slices.reshape(-1, *slices.shape[2:])
Simple code with a double loop and slice:
>>> a = np.arange(12).reshape(3,4)
>>> print(a)
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
>>> r = 2
>>> n_rows, n_cols = a.shape
>>> for row in range(n_rows - r + 1):
...     for col in range(n_cols - r + 1):
...         print(a[row:row + r, col:col + r])
...
[[0 1]
[4 5]]
[[1 2]
[5 6]]
[[2 3]
[6 7]]
[[4 5]
[8 9]]
[[ 5 6]
[ 9 10]]
[[ 6 7]
[10 11]]
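If the loop becomes a bottleneck, the same six blocks can be produced in one call. This is a sketch assuming NumPy 1.20 or later, where sliding_window_view is available; it is not part of the answer above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(12).reshape(3, 4)
r = 2

# one r x r window per valid top-left corner, without a Python loop
windows = sliding_window_view(a, (r, r))  # shape (2, 3, 2, 2), read-only view
blocks = windows.reshape(-1, r, r)        # shape (6, 2, 2); this reshape copies

print(blocks.shape)  # (6, 2, 2)
print(blocks[-1])    # same as the last block printed by the loop
```

The blocks appear in the same order as in the double loop: left to right, then top to bottom.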