Numpy padding a 4D array is slow


I'm trying to pad a 4D numpy array. The array is a collection of 3D arrays, so its shape can be read as (samples, height, width, depth). Before padding, the shape is (1682160, 21, 200, 3). I would like to pad each row by 10 pixels on each side (along the width axis), so the padded array would have shape (1682160, 21, 220, 3). I currently use numpy's pad function this way:
matrix = np.pad(matrix, ((0, 0), (0, 0), (10, 10), (0, 0)), mode='reflect')
I assign the padded array back to the same variable because the array is float16 and takes roughly 40 GB; writing it into a new variable results in a memory error. The command above takes roughly 700 seconds to run.
Is there a faster way to do this in Python, perhaps in a vectorized fashion?
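One possible approach, sketched here under stated assumptions, is to skip np.pad and fill a preallocated output array with plain slice assignments. The helper name reflect_pad_axis2 is hypothetical, and no timings are claimed: whether it beats np.pad on the full array would have to be measured. The output buffer still has to fit in memory next to the input, which is also true of np.pad since it returns a new array.

```python
import numpy as np

def reflect_pad_axis2(a, pad):
    """Reflect-pad axis 2 of a 4D array by `pad` on both sides.

    Hypothetical helper; mirrors np.pad(..., mode='reflect') for this one axis.
    """
    n, h, w, d = a.shape
    if not 0 < pad < w:
        raise ValueError("pad must satisfy 0 < pad < width")
    out = np.empty((n, h, w + 2 * pad, d), dtype=a.dtype)
    out[:, :, pad:pad + w, :] = a                        # original data in the middle
    out[:, :, :pad, :] = a[:, :, pad:0:-1, :]            # left border: columns pad..1
    out[:, :, pad + w:, :] = a[:, :, -2:-pad - 2:-1, :]  # right border: columns w-2..w-pad-1
    return out

# Small stand-in for the real (1682160, 21, 200, 3) float16 array.
small = np.arange(2 * 3 * 8 * 3, dtype=np.float16).reshape(2, 3, 8, 3)
padded = reflect_pad_axis2(small, 3)
print(padded.shape)  # (2, 3, 14, 3)
```

If memory is the limiting factor, the same slice assignments could be applied to blocks of samples at a time, writing into an output that lives on disk (for example a np.memmap); that variant is not shown here.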

Related

Creating 3d Numpy array by looping through and appending new 2d Numpy array to it

I initially had a 3D numpy array of shape (85, 150, 150) (float values), referred to as original_npy_3d (85 layers, with 150 grid cells in the x direction and 150 in the y direction).
I took the first layer, original_npy_3d[0] of shape (150, 150), performed some operation on it and modified the data in 2D. Now I would like to repeat the same operation on every layer (indices 0 to 84) and then put everything back into an array of the original shape (85, 150, 150). Is there a way in Python to do that?
import numpy as np
arr_3d = np.empty((85, 150, 150), float)
for item in range(85):
    arr = original_npy_3d[item, :, :]
    # some operation on arr of the size (150, 150)
    arr_3d[item] = arr  # store the processed layer back at its position
# arr_3d now holds the generated arr for every layer
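A minimal self-contained sketch of that loop, assuming dummy random data in place of the real array and a hypothetical process_layer function standing in for the unspecified 2D operation:

```python
import numpy as np

# Dummy stand-in for the original data (assumption: same shape as in the question).
original_npy_3d = np.random.rand(85, 150, 150)

def process_layer(layer):
    """Hypothetical placeholder for the per-layer 2D operation."""
    return layer * 2.0

# Preallocate the result and write each processed layer into its slot,
# instead of appending to a growing array.
arr_3d = np.empty_like(original_npy_3d)
for item in range(original_npy_3d.shape[0]):
    arr_3d[item] = process_layer(original_npy_3d[item])

print(arr_3d.shape)  # (85, 150, 150)
```

Writing into a preallocated array avoids the repeated copying that np.append would do on every iteration.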

How to make a 2D ndarray from 3D so that (100, 50, 20) becomes (100, 1000)

I want to merge two dimensions (y, z) of a 3D array (x, y, z) into one. Each corresponding value from y should be copied next to z.
For example, I have 100 frames of a video with the coordinates of 15 key points in 3 dimensions. The array shape is (100, 15, 3). I want the output to have shape (100, 45), which merges y and z as 15 x 3.

Just use numpy.reshape. It can be used to flatten dimensions selectively.
import numpy as np
mat_3d = np.random.randn(2, 3, 4)
mat_2d = mat_3d.reshape((mat_3d.shape[0], -1))
print(mat_3d)
print(mat_2d)
In this example, I'm using (mat_3d.shape[0], -1) as the argument of reshape. It means that the first dimension must stay unchanged, but all the other ones must be flattened (-1 is extra sugar that lets numpy infer the right size; using np.prod(mat_3d.shape[1:]) would give the same result).
In such a case, numpy first walks through the values along the last axis (z here), then the second-to-last axis (y here), and so on in higher dimensions.
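To make that ordering concrete for the (100, 15, 3) case from the question, here is a small sketch with dummy integer data (the variable names are illustrative only). With the default C order, element [f, k, c] of the 3D array ends up at column 3 * k + c of row f:

```python
import numpy as np

# Dummy data: 100 frames, 15 key points, 3 coordinates each.
frames = np.arange(100 * 15 * 3).reshape(100, 15, 3)

# Keep the first axis, merge the last two.
flat = frames.reshape(frames.shape[0], -1)
print(flat.shape)  # (100, 45)

# Coordinate c of key point k in frame f lands at column 3 * k + c.
f, k, c = 7, 4, 2
print(flat[f, 3 * k + c] == frames[f, k, c])  # True
```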

Duplicating vector along an arbitrary number of dimensions

I want to repeat a 1D-array along the dimensions of another array, knowing that this number of dimensions can change.
For example:
import numpy as np
to_repeat = np.linspace(0, 100, 10)
base_array = np.random.random((24, 60)) ## this one can have more than two dimensions.
final_array = np.array([[to_repeat for i in range(base_array.shape[0])] for j in range(base_array.shape[1])]).T
print(final_array.shape)
# >>> (10, 24, 60)
How can this be extended to an array base_array with an arbitrary number of dimensions?
Possibly using numpy vectorized functions in order to avoid loops?
EDIT (bigger picture):
base_array is in fact of shape (10, 24, 60) (if we stick to this example), where the coordinates along the first dimension are given by the vector to_repeat.
I'm looking for the minimum along the first dimension of base_array, and want to create the array of the corresponding coordinates, here of shape (24, 60).

You don't need final_array; you can get the result you want with:
to_repeat[base_array.argmin(0)]
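As a sketch with dummy data, the indexing above works for any number of trailing dimensions, because argmin(0) returns an integer array with the shape of those trailing dimensions. If the repeated array is still wanted, one option is to reshape and broadcast; the helper repeat_along below is hypothetical and not part of numpy:

```python
import numpy as np

to_repeat = np.linspace(0, 100, 10)
base_array = np.random.random((10, 24, 60))  # first axis matches to_repeat

# Coordinate of the minimum along the first axis, for every trailing position.
coords = to_repeat[base_array.argmin(axis=0)]
print(coords.shape)  # (24, 60)

def repeat_along(vec, trailing_shape):
    """Hypothetical helper: repeat a 1D vector over arbitrary trailing dimensions.

    Returns a read-only broadcast view; call .copy() if a writable array is needed.
    """
    expanded = vec.reshape((-1,) + (1,) * len(trailing_shape))
    return np.broadcast_to(expanded, (vec.size,) + tuple(trailing_shape))

final_array = repeat_along(to_repeat, base_array.shape[1:])
print(final_array.shape)  # (10, 24, 60)
```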

Numpy: How to multiply (N,N) and (N,N,M,M) numpy arrays?

I want to multiply two numpy arrays. One is a matrix of shape (10, 10) and the other is a matrix of matrices, i.e. of shape (10, 10, 256, 256).
I simply want to multiply each matrix in the second array by the corresponding component of the first matrix. For instance, the matrix at position (0, 0) in the second array should be multiplied by the value at position (0, 0) in the first matrix.
Intuitively, this is not really complicated, but numpy does not seem to support it, or at least I have not managed to make it work. The ValueError that is thrown says:
ValueError: operands could not be broadcast together with shapes (10,10) (10,10,256,256)
Can anybody help me, please? How can I achieve what I want in a NumPy way?

You can use the NumPy einsum function, e.g. (using zero-filled arrays as dummies in this example):
import numpy as np
x = np.zeros((10, 10))
y = np.zeros((10, 10, 256, 256))
z = np.einsum("ij,ijkm->km", x, y)
print(z.shape)
# >>> (256, 256)
See the numpy.einsum documentation for a description of einsum's usage.
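Note that the subscripts "ij,ijkm->km" also sum over i and j, which is why the result has shape (256, 256). If the goal is only to scale each inner matrix and keep all of them, a sketch using broadcasting could look like the following (random dummy data, with a smaller inner size than in the question to keep the example light):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((10, 10))
y = rng.random((10, 10, 4, 4))  # the question uses (10, 10, 256, 256)

# Add two trailing axes of length 1 so that x broadcasts against y.
scaled = x[:, :, None, None] * y
print(scaled.shape)  # (10, 10, 4, 4)

# The einsum from the answer equals the sum of the scaled matrices over i and j.
summed = np.einsum("ij,ijkm->km", x, y)
print(np.allclose(summed, scaled.sum(axis=(0, 1))))  # True
```

The original ValueError arises because broadcasting aligns shapes from the right, so (10, 10) is compared against the trailing (256, 256) axes; adding the two length-1 axes lines it up with the leading axes instead.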

Change shape and dtype of numpy array

I have a list of numpy arrays. Each one has dtype=object and three elements along its first dimension, holding arrays of different shapes, e.g. (200, 10), (100, 10), (50, 10).
This works well, but when the three inner arrays happen to have the same shape (e.g. (200, 10), (200, 10), (200, 10)), the array automatically goes from dtype=object to dtype=float. So I end up with one array of shape (3, 200, 10) instead of a (3,) array of object type.
That causes an error when I try to turn the list into a numpy array, since one of the arrays has a different shape.
Is there a solution to that?
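One way to keep the (3,) object layout regardless of the inner shapes, sketched here with dummy zero-filled arrays, is to create the empty object array first and fill it element by element. The helper name as_object_array is hypothetical:

```python
import numpy as np

def as_object_array(parts):
    """Hypothetical helper: wrap a sequence of arrays in a 1D object array."""
    out = np.empty(len(parts), dtype=object)
    for i, part in enumerate(parts):
        out[i] = part  # element-wise assignment stores the array itself
    return out

# Inner arrays that happen to share the same shape.
same = as_object_array([np.zeros((200, 10)) for _ in range(3)])
print(same.shape, same.dtype)  # (3,) object

# Inner arrays with different shapes behave the same way.
mixed = as_object_array([np.zeros((200, 10)), np.zeros((100, 10)), np.zeros((50, 10))])
print(mixed.shape, mixed.dtype)  # (3,) object
```

Because every element is built the same way, a list of such arrays can be stacked afterwards without one of them having an unexpected (3, 200, 10) shape.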
