Extracting 1d arrays from 3d numpy array using 2d boolean - python

Say I have a 3d numpy array:
i, j, k = 10, 3, 4
arr = np.arange(120).reshape(i, j, k)
and a 2d boolean array:
mask = np.random.random((j, k)) > 0.5
n = mask.sum()
I want to extract the 1d arrays from arr along its first dimension that correspond to the True values of mask. The result should have shape (i, n). How could this be done?
I was pulling up some old code and for some reason I was doing arr[mask], but this gives a shape of (n, k) (I'm not sure why) and a warning:
VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 10949 but corresponding boolean dimension is 11

Keep the first axis with a full slice and apply the mask to the last two axes:
arr[:,mask]
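As a quick check of the shape, here is a minimal sketch using the sizes from the question. The fixed seed and the names rng, result and flat are my own additions for illustration.

```python
import numpy as np

i, j, k = 10, 3, 4
arr = np.arange(i * j * k).reshape(i, j, k)

# Fixed seed so the example is reproducible (an assumption, not in the question)
rng = np.random.default_rng(0)
mask = rng.random((j, k)) > 0.5
n = mask.sum()

# The leading slice keeps axis 0; the 2d mask consumes axes 1 and 2 together
result = arr[:, mask]

# Same selection written as "flatten the last two axes, then mask"
flat = arr.reshape(i, j * k)[:, mask.ravel()]

print(result.shape == (i, n))
```

A boolean index always lines up with the axes at the position where it appears, so a (j, k) mask has to sit after the slice for axis 0.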

Related

Inserting an array inside the list of arrays

I have an issue with numpy arrays and I can't understand what I am doing wrong. I need to create a 100x100 matrix with random int (non zero) and the last row should be the combination of all previous rows. Here is my code:
non_zero_m = np.random.randint(0,10,(99,100))
arr = non_zero_m.sum(axis=0)
singular_m = np.concatenate((non_zero_m, arr))
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
I can't understand why Python says the arrays have different numbers of dimensions.
The problem is that arr is a 1-dimensional array, and you are trying to concatenate it to a matrix (2-dimensional).
Just replace the second line with:
arr = non_zero_m.sum(axis=0).reshape(1, -1)
This reshapes arr to a 2-dimensional array whose first axis has length 1 (making arr effectively a row vector) and whose second axis takes whatever length is needed to hold all of arr's elements (this is the meaning of -1 in this context).
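A minimal sketch of the fix end to end. The names last_row, last_row_2d and alt are mine, and I draw from 1 to 9 here on the assumption that "non zero" means no entry may be 0 (the upper bound of randint is exclusive).

```python
import numpy as np

# Integers from 1..9, so no entry is zero
non_zero_m = np.random.randint(1, 10, (99, 100))

last_row = non_zero_m.sum(axis=0)        # shape (100,), 1-dimensional
last_row_2d = last_row.reshape(1, -1)    # shape (1, 100), 2-dimensional

# Both inputs are now 2-dimensional, so they can be stacked along axis 0
singular_m = np.concatenate((non_zero_m, last_row_2d), axis=0)

# keepdims=True yields the (1, 100) shape directly, without a reshape
alt = np.concatenate((non_zero_m, non_zero_m.sum(axis=0, keepdims=True)))

print(singular_m.shape)
```

Either form should work; keepdims just saves the separate reshape step.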

Average of a 3D numpy slice based on 2D arrays

I am trying to calculate the average of a 3D array between two indices on the 1st axis. The start and end indices vary from cell to cell and are represented by two separate 2D arrays that are the same shape as a slice of the 3D array.
I have managed to implement a piece of code that loops through the pixels of my 3D array, but this method is painfully slow in the case of my array with a shape of (70, 550, 350). Is there a way to vectorise the operation using numpy or xarray (the arrays are stored in an xarray dataset)?
Here is a snippet of what I would like to optimise:
# My 3D raster containing values; shape = (time, x, y)
values = np.random.rand(10, 55, 60)
# A 2D raster containing start indices for the averaging
start_index = np.random.randint(0, 4, size=(values.shape[1], values.shape[2]))
# A 2D raster containing end indices for the averaging
end_index = np.random.randint(5, 9, size=(values.shape[1], values.shape[2]))
# Initialise an array that will contain results
mean_array = np.zeros_like(values[0, :, :])
# Loop over 3D raster to calculate the average between indices on axis 0
for i in range(0, values.shape[1]):
    for j in range(0, values.shape[2]):
        mean_array[i, j] = np.mean(values[start_index[i, j]: end_index[i, j], i, j], axis=0)
One way to do this without loops is to zero out the entries you don't want to use, compute the sum of the remaining items, then divide by the number of entries that were kept (end_index - start_index for each cell). For example:
i = np.arange(values.shape[0])[:, None, None]
mean_array_2 = np.where((i >= start_index) & (i < end_index), values, 0).sum(0) / (end_index - start_index)
np.allclose(mean_array, mean_array_2)
# True
Note that this assumes that the indices are in the range 0 <= i < values.shape[0]; if this is not the case you can use np.clip or other means to standardize the indices before computation.
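For anyone who wants to run the comparison in one go, here is a self-contained sketch that puts the loop and the masked sum side by side. The seed and the names t, keep, expected and vectorized are my own. Dividing by keep.sum(axis=0) instead of end_index - start_index is a small variation that counts only the entries the mask actually kept.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.random((10, 55, 60))                         # (time, x, y)
start_index = rng.integers(0, 4, size=values.shape[1:])   # inclusive start
end_index = rng.integers(5, 9, size=values.shape[1:])     # exclusive end

# Reference result from the explicit double loop
expected = np.empty(values.shape[1:])
for x in range(values.shape[1]):
    for y in range(values.shape[2]):
        expected[x, y] = values[start_index[x, y]:end_index[x, y], x, y].mean()

# Vectorized: compare a (time, 1, 1) index column against the 2D bounds
t = np.arange(values.shape[0])[:, None, None]
keep = (t >= start_index) & (t < end_index)               # (time, x, y) boolean
vectorized = np.where(keep, values, 0).sum(axis=0) / keep.sum(axis=0)

print(np.allclose(expected, vectorized))
```

The trade-off is memory: keep and the np.where result are both as large as values, which should be fine for a (70, 550, 350) array but is worth knowing about.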

mask first k elements in a 3D tensor in PyTorch (different k for each row)

I have a tensor M of dimensions [NxQxD] and a 1d tensor of indices idx (of size N). I want to efficiently create a tensor mask of dimensions [NxQxD] such that mask[i,j,k] = 1 iff j < idx[i], i.e. I want to keep only the first idx[i] entries out of Q along the second dimension (dim=1) of M, for every row i.
Thanks!
It turns out this can be done via a broadcasting trick:
mask_2d = torch.arange(Q)[None, :] < idx[:, None]  # (N, Q)
mask_3d = mask_2d[..., None]  # (N, Q, 1)
masked = mask_3d.float() * M  # (N, Q, D), broadcast over the last dimension
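To see what the comparison produces, here is the same trick written in NumPy rather than PyTorch (a swapped-in library, since the broadcasting rules involved are the same). The sizes N, Q, D and the contents of idx are made up for illustration.

```python
import numpy as np

N, Q, D = 3, 5, 2                                  # hypothetical sizes
M = np.arange(1, N * Q * D + 1, dtype=float).reshape(N, Q, D)
idx = np.array([0, 2, 5])                          # hypothetical per-row counts

# Row i of mask_2d is True for the first idx[i] positions along Q
mask_2d = np.arange(Q)[None, :] < idx[:, None]     # (N, Q)
mask_3d = mask_2d[..., None]                       # (N, Q, 1)

# The trailing length-1 axis broadcasts against D
masked = mask_3d * M                               # (N, Q, D)

print(mask_2d.astype(int))
```

Row 0 keeps nothing, row 1 keeps its first two positions and row 2 keeps all five, which matches the "first idx[i] entries" reading of the question.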

How to quickly mask different slices in my array?

I have a 3d array where all axis lengths are the same (for example (5,5,5)). I need to mask all of the array and keep certain slices in the array unmasked as per the code below. I managed to accomplish this using a for loop but I wondered if there was a faster solution out there.
import numpy as np
import numpy.ma as ma
array = np.random.rand(125).reshape(5, 5, 5)
array = ma.array(array, mask=True)
for i in range(array.shape[0]):
    for j in range(array.shape[1]):
        array[i, j, :].mask[i:j] = False
This allows me to sum this array with another array of the same size while ignoring the masked values.
You can create the entire mask in one step using broadcasting:
i, j, k = np.ogrid[:5, :5, :5]
mask = (i>k) | (k>=j)
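A short sketch that checks the broadcast mask against a loop-built one and then applies it. The names n, data, expected, ii, jj and kk are mine, and the reference mask is built on a plain boolean array rather than through the masked array's .mask attribute.

```python
import numpy as np
import numpy.ma as ma

n = 5
rng = np.random.default_rng(0)
data = rng.random((n, n, n))

# Reference: start fully masked, then unmask positions i <= k < j
expected = np.ones((n, n, n), dtype=bool)
for i in range(n):
    for j in range(n):
        expected[i, j, i:j] = False

# Broadcast version: ii is (n,1,1), jj is (1,n,1), kk is (1,1,n)
ii, jj, kk = np.ogrid[:n, :n, :n]
mask = (ii > kk) | (kk >= jj)       # True means masked

masked = ma.array(data, mask=mask)

print(np.array_equal(mask, expected))
```

Sums on masked skip the masked entries, so masked.sum() should match data[~mask].sum().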

filtering a 3D numpy array according to 2D numpy array

I have a 2D numpy array with the shape (3024, 4032).
I have a 3D numpy array with the shape (3024, 4032, 3).
2D numpy array is filled with 0s and 1s.
3D numpy array is filled with values between 0 and 255.
Based on the values in the 2D array, I want to change the values in the 3D array. If a value in the 2D array is 0, I want to set all 3 pixel values along the last axis of the 3D array to 0. If a value in the 2D array is 1, I leave the pixel unchanged.
I have checked this question, How to filter a numpy array with another array's values, but it applies for 2 arrays which have same dimensions. In my case, dimensions are different.
How can the filtering be applied to two arrays that have the same size in the first 2 dimensions, when one of them has an extra last dimension?
Ok, I'll answer this to highlight one peculiarity regarding "missing" dimensions. Let's assume a.shape==(5,4,3) and b.shape==(5,4).
When indexing, existing dimensions are left-aligned, which is why @Divakar's solution a[b == 0] = 0 works.
When broadcasting, existing dimensions are right-aligned, which is why @InvaderZim's a*b does not work. What you need to do is a*b[..., None], which inserts a broadcastable dimension on the right.
I think this one is very simple:
If a is a 3D array (a.shape == (5, 4, 3)) filled with values, and b is a 2D array (b.shape == (5, 4)) filled with 1 and 0, then reshape b and multiply them:
a = a * b.reshape(5, 4, 1)
NumPy will automatically broadcast b across the last axis.
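The approaches mentioned above can be compared side by side on the small (5, 4, 3) and (5, 4) shapes. This is only a sketch: the random contents, the seed and the variable names are placeholders of mine.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(5, 4, 3))   # small stand-in for the image
b = rng.integers(0, 2, size=(5, 4))        # 0/1 filter

# Indexing: the boolean array lines up with the leading axes of a
by_index = a.copy()
by_index[b == 0] = 0

# Broadcasting: b needs a trailing length-1 axis to line up with a
by_newaxis = a * b[..., None]
by_reshape = a * b.reshape(5, 4, 1)

# Without the extra axis, shapes (5, 4, 3) and (5, 4) cannot be broadcast
try:
    a * b
    plain_product_ok = True
except ValueError:
    plain_product_ok = False

print(np.array_equal(by_index, by_newaxis), plain_product_ok)
```

The indexing form modifies the array in place (hence the copy here), while the two multiplications return a new array.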
