Stacking a single 2D array into 3D efficiently - python

I have a single 2D array that I want to stack identical copies of along a third dimension, specifically the last axis (axis=2). The following code does the job, but it's very slow for a (300, 300) array stacked into a third dimension of length 300.
arr_2d = arr_2d.reshape(arr_2d.shape[0], arr_2d.shape[1], 1)  # add a trailing axis
arr_3d = np.empty((sampling, sampling, sampling))  # allocate space
for i in range(sampling):
    arr_3d[:, :, i] = arr_2d[:, :, 0]  # copy the 2D array into each slice
Is there a better, more efficient way of doing this?

You can use numpy.repeat after you add a new third dimension to stack on:
import numpy as np
arr = np.random.rand(300, 300)
# arr.shape => (300, 300)
dup_arr = np.repeat(arr.reshape(*arr.shape, 1), repeats=10, axis=-1)
# dup_arr.shape => (300, 300, 10)
As commented by @xdurch0, since you're stacking your copies along the last dimension, you can also use numpy.tile:
dup_arr = np.tile(arr.reshape(*arr.shape, 1), reps=10)
# dup_arr.shape => (300, 300, 10)
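If the stacked copies never need to be modified independently, a read-only broadcast view avoids allocating the copies at all. A sketch beyond the original answer (np.broadcast_to returns a read-only view; call .copy() if you need a writable array):
view = np.broadcast_to(arr[:, :, None], (*arr.shape, 10))
# view.shape => (300, 300, 10), but no data has been copied
dup_arr = view.copy()  # only if a writable, contiguous array is required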

Related

Update a 3d numpy array at given indices in a vectorized way using a 2d numpy array

I am trying to place data in a 3d numpy array from a 2d array using the following code:
data = ...    # data values, shape e.g. (300, 400)
di = ...      # axis-0 indices into output, same shape as data
dj = ...      # axis-1 indices into output, same shape as data
dz = ...      # axis-2 indices into output, same shape as data
output = ...  # output array, shape (80, 90, 5)
for idx in range(data.size):
    data_pt = data.flat[idx]
    di_idx = di.flat[idx]
    dj_idx = dj.flat[idx]
    dz_idx = dz.flat[idx]
    output[di_idx, dj_idx, dz_idx] = data_pt
But this code becomes very slow for large data numpy arrays.
I tried to vectorize using:
indices = np.stack([di, dj, dz])
output[indices] += data
But I get out-of-bounds error:
IndexError: index 80 is out of bounds for axis 0 with size 80
What am I doing wrong?
Update: I am able to speed up the code by using numba.jit(nopython=True) but I want to understand how to do this myself.
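The IndexError comes from how the stacked array is interpreted: output[indices] with a (3, 300, 400) integer array indexes axis 0 only, so every value in di, dj, and dz (including 80) is treated as a row index into an axis of size 80. Passing the three index arrays as a tuple gives per-axis indexing; a hedged sketch of the vectorized equivalent:
# Plain assignment: equivalent to the loop (later writes win on duplicate indices)
output[di, dj, dz] = data
# Accumulation: += on duplicate indices needs the unbuffered ufunc form
np.add.at(output, (di, dj, dz), data)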

Creating 3d Numpy array by looping through and appending new 2d Numpy array to it

I initially had a 3D numpy array of shape (85, 150, 150) (float values), referred to as original_npy_3d (85 layers, with 150 grid points in the x direction and 150 in the y direction).
I extracted the first layer as a 2D array, original_npy_3d[0, :, :], to perform some operation and modify the data in 2D. Now I would like to repeat the same operation on every layer (indices 0 to 84) and assemble everything back into the original shape (85, 150, 150). Is there a way in Python to do that?
arr_3d = np.empty((85, 150, 150), float)
for item in range(85):
    arr = original_npy_3d[item, :, :]
    # some operation on arr of size (150, 150)
    arr_3d[item, :, :] = arr  # store the processed layer back
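Equivalently, if the per-layer operation is wrapped in a function, the layers can be collected with np.stack instead of writing into a preallocated array. A sketch, where process_layer is a hypothetical stand-in for the real (150, 150) operation:
import numpy as np

def process_layer(layer):
    return layer * 2.0  # hypothetical placeholder for the 2D operation

arr_3d = np.stack([process_layer(original_npy_3d[i]) for i in range(85)], axis=0)
# arr_3d.shape => (85, 150, 150)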

How to make a 2D ndarray from 3D so that (100, 50, 20) becomes (100, 1000)

I want to merge two dimensions (y, z) of a 3D array (x, y, z) into one, so that for each y the corresponding z values are laid out next to each other.
For example, I have 100 frames of a video with the coordinates of 15 key points in 3 dimensions. The array shape is (100, 15, 3). I want the output to be (100, 45), merging y and z into 15 × 3 = 45 values per frame.
Just use numpy.reshape. It can be used to flatten dimensions selectively.
import numpy as np
mat_3d = np.random.randn(2, 3, 4)
mat_2d = mat_3d.reshape((mat_3d.shape[0], -1))
print(mat_3d)
print(mat_2d)
In this example, I'm using (mat_3d.shape[0], -1) as the argument of reshape. It means that the first dimension must stay unchanged while all the others are flattened together (-1 is syntactic sugar that lets numpy infer the right size; using np.prod(mat_3d.shape[1:]) would be equivalent).
In such a case, numpy fetches values across the last axis first (z here), then the second-to-last axis (y here), and so on for higher dimensions.
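Applied to the shapes from the question, with random data standing in for the video key points:
import numpy as np

frames = np.random.randn(100, 15, 3)        # 100 frames, 15 key points, 3 coords
flat = frames.reshape(frames.shape[0], -1)  # merge key points and coords
# flat.shape => (100, 45); each row lists the 3 coordinates of key point 0,
# then key point 1, and so on.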

Duplicating vector along an arbitrary number of dimensions

I want to repeat a 1D array along the dimensions of another array, where the number of dimensions is not fixed.
For example:
import numpy as np
to_repeat = np.linspace(0, 100, 10)
base_array = np.random.random((24, 60)) ## this one can have more than two dimensions.
final_array = np.array([[to_repeat for i in range(base_array.shape[0])] for j in range(base_array.shape[1])]).T
print(final_array.shape)
# >>> (10, 24, 60)
How can this be extended to an array base_array with an arbitrary number of dimensions?
Possibly using numpy vectorized functions in order to avoid loops?
EDIT (bigger picture):
base_array is in fact of shape (10, 24, 60) (if we stick to this example), where the coordinates along the first dimension are the vector to_repeat.
I'm looking for the minimum along the first dimension of base_array, and want to create the array of corresponding coordinates, here of shape (24, 60).
You don't need final_array; you can get the result you want with:
to_repeat[base_array.argmin(0)]
base_array.argmin(0) returns, for each of the (24, 60) positions, the index of the minimum along the first axis; fancy-indexing to_repeat with that integer array maps every index to its coordinate.
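For the literal duplication question, a sketch using broadcasting that works for a base_array of any number of dimensions (np.broadcast_to returns a read-only view; call .copy() on the result if you need a writable array):
import numpy as np

to_repeat = np.linspace(0, 100, 10)
base_array = np.random.random((24, 60))  # could have any number of dimensions

# Reshape to (10, 1, 1, ...), one singleton axis per base_array dimension,
# then broadcast across base_array's shape without copying any data.
expanded = to_repeat.reshape(-1, *([1] * base_array.ndim))
final_array = np.broadcast_to(expanded, (to_repeat.size, *base_array.shape))
# final_array.shape => (10, 24, 60)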

Numpy group scalars into arrays

I have a numpy array U with shape (20, 50): 20 spatial points, in a space of 50 dimensions.
How can I transform it into a (20, 1, 50) array, i.e. 20 rows, 1 column, where each element is a 50-dimensional point? Kind of encapsulating each row as a numpy array.
Context
The point is that I want to expand the array along the columns (actually, replicating the same array along the columns X times) using numpy.concatenate. But if I did that straight away I would not get the result I want.
E.g., if I expanded it once along the columns, I would get an array with shape (20, 100). What I would like is to access each element as a 50-dimensional point, so after expanding I would expect a new U' with shape (20, 2, 50).
You can do U[:, None, :] to add a new dimension to the array.
You can also use reshape:
import numpy as np
a = np.zeros((20, 50))
print(a.shape)  # (20, 50)
b = a.reshape((20, 1, 50))
print(b.shape)  # (20, 1, 50)
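With the extra axis in place, the replication described in the context works as expected. A sketch, reusing a and b from above and picking X = 2 copies arbitrarily:
U2 = np.concatenate([b, b], axis=1)       # shape (20, 2, 50)
# or, without building the list by hand:
U2 = np.repeat(a[:, None, :], 2, axis=1)  # shape (20, 2, 50)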
