Python numpy array padding issue

I have a numpy array v with shape (1000, 68). I want to pad v with zeros along the second dimension up to length 100, so that the resulting shape is (1000, 100).
I tried the following approaches:
t = np.lib.pad(v, (16, 16), 'minimum') # numpy method
t = sequence.pad_sequences(v, maxlen = 100, padding = 'post') # Keras text processing method
Both methods returned t with the correct shape (1000, 100), but each array t[n] (n from 0 to 99) is a zero vector [0, 0, ..., 0].

Following numpy.pad documentation, I tried
np.pad(v, [(0,0), (16,16)], 'constant')
with the expected result: 16 columns of zeros added on the left, and 16 on the right.
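As a minimal sketch of the call above (the stand-in data for v is an assumption, not the asker's array): the pad_width argument takes one (before, after) pair per axis, so (0, 0) leaves the rows alone and (16, 16) pads the columns on both sides. If all the zeros should go at the end of each row instead, (0, 32) does that.

```python
import numpy as np

# Stand-in data with the shape from the question; values are all non-zero.
v = np.arange(1000 * 68, dtype=float).reshape(1000, 68) + 1.0

# One (before, after) pair per axis: rows untouched, 16 zeros on each side of the columns.
both = np.pad(v, [(0, 0), (16, 16)], mode="constant")

# Same target shape, but all 32 zeros appended after the original 68 columns.
post = np.pad(v, [(0, 0), (0, 32)], mode="constant")

print(both.shape, post.shape)
```

In both cases the original values are kept unchanged inside the padded array.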

Related

Subtract 2D array from each pixel of a 3D image and get a 4D array

I have a 2D array of shape (10, 3) and an image represented as a 3D array of shape (480, 640, 3). I'd like to perform a difference between each pixel and each element of the 2D array, to get a final result of shape (10, 480, 640, 3).
For now, my code looks like this:
arr_2d = np.random.rand(10, 3)
arr_3d = np.random.rand(480, 640, 3)
res = np.ones_like(arr_3d)
res = np.tile(res, (10, 1, 1, 1))
for i in range(10):
    res[i] = arr_3d - arr_2d[i]
My question is if there's a way to do this without the for loop, only using numpy operations.
You can use NumPy broadcasting, like this:
arr_2d = arr_2d.reshape(-1,1,1,3)
arr_3d = arr_3d.reshape((-1,*arr_3d.shape))
res = arr_3d - arr_2d
This should give the same result as your original code
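To check that claim, here is a small sketch that compares the loop with a broadcast subtraction. It uses a smaller image than (480, 640, 3) to keep the check fast, and indexes with None instead of reshape; both choices are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
arr_2d = rng.random((10, 3))
arr_3d = rng.random((48, 64, 3))  # smaller stand-in for the (480, 640, 3) image

# Loop version from the question.
expected = np.empty((10,) + arr_3d.shape)
for i in range(10):
    expected[i] = arr_3d - arr_2d[i]

# Broadcast version: (1, 48, 64, 3) - (10, 1, 1, 3) -> (10, 48, 64, 3)
res = arr_3d[None, ...] - arr_2d[:, None, None, :]

print(res.shape, np.array_equal(res, expected))
```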

How to stack numpy array along an axis

I have two numpy arrays, one with shape (10, 5, 200) and another with shape (1, 200). How can I stack them to get an array of shape (10, 6, 200)? Basically, I want to append the second array to each 2-D array along the first dimension.
a = np.random.random((10, 5, 200))
b = np.zeros((1, 200))
I've tried hstack and vstack, but I get an error about an incorrect number of axes.
Let's say:
a = np.random.random((10, 5, 200))
b = np.zeros((1, 200))
Let's look at the volume (number of elements) of each array:
The volume of a is 10*5*200 = 10000.
The volume of an array with shape (10, 6, 200) is 10*6*200 = 12000.
That is, you want to create an array that has 2000 more elements.
However, the volume of b is 1*200 = 200.
This means a and b can't be stacked directly: b has to be repeated 10 times, once for each 2-D slice of a.
As hpaulj mentioned in the comments, one way is to allocate a numpy array and then fill it:
result = np.empty((a.shape[0], a.shape[1] + b.shape[0], a.shape[2]))
result[:, :a.shape[1], :] = a
result[:, a.shape[1]:, :] = b
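An alternative sketch, not taken from the answer above: repeat b explicitly with np.broadcast_to and then join along axis 1 with np.concatenate. The name b_rep is introduced here for illustration only.

```python
import numpy as np

a = np.random.random((10, 5, 200))
b = np.zeros((1, 200))

# View b as (10, 1, 200) without copying: one copy of b per 2-D slice of a.
b_rep = np.broadcast_to(b, (a.shape[0],) + b.shape)

# Join along axis 1: (10, 5, 200) + (10, 1, 200) -> (10, 6, 200)
result = np.concatenate([a, b_rep], axis=1)

print(result.shape)
```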

Binary mask of top n-th quantile in a batch of 2D tensors, but with individual n for each tensor

I have a tensor A of shape (100, 16, 16) and tensor B of shape (100), where 100 is the batch size. I want to create a binary mask of A that has shape (100, 16, 16), where in each element (element has shape (1, 16, 16)) of the mask, the value is 1 if the element is greater than the computed quantile value, else 0. Each element in tensor B indicates the percentile value for each individual element in A, in sequence. If B is simply a scalar, I can use:
flat_A = torch.reshape(A, (100, -1))
quants = torch.quantile(flat_A, B, dim=1)
quants = torch.reshape(quants, (100, 1, 1))
mask = torch.where(A >= quants, 1, 0)
# quants will have shape (100, 1, 1)
The question is: if B is a 1D tensor of shape (100) like I said above, how can I compute the percentile value for each individual element in A? I tried the following, but the results did not look like what I expected:
>>> torch.quantile(flat_A, B, dim=1).shape
torch.Size([100, 100])
>>> torch.quantile(flat_A, B, dim=0).shape
torch.Size([100, 256])
I think the result's shape should be (100), so I can use mask = torch.where(A >= quants, 1, 0), or maybe I misunderstand it?
For more context, this question is also the extension of the scalar B value question I had previously here.
This is one way using torch.quantile() function. Note that here I am using tensors of shape (5, 2, 2) instead of (100, 16, 16) for simplicity.
import torch
# Generate some data of shape (5, 2, 2)
A = torch.arange(5 * 2 * 2).reshape(5, 2, 2) + 1.0
B = torch.linspace(0, 1, 5) # 5 quantile values for each element in A
Af = A.reshape(A.shape[0], -1) # flattens A to a 2D tensor
quantiles = torch.quantile(Af, B, dim = 1, keepdim = True)
quants = quantiles[torch.arange(A.shape[0]), torch.arange(A.shape[0]), 0]
mask = (A >= quants[:, None, None]).type(torch.uint8)
Here the tensor quantiles is of shape torch.Size([5, 5, 1]) because it stores the thresholds for each quantile value in B for each element in A (or row in Af). Since we have 5 quantile values, we get 5 thresholds for each element in A.
For instance, quantiles[i, j, 0] has the threshold for B[i]th quantile of A[j] or Af[j], and you essentially need the values quantiles[k, k, 0] for k in range of batch size or 5 here.
Now to satisfy the requirement that you need thresholds for corresponding quantiles in B and elements in A, simply index out the diagonal elements from quantiles and populate quants that has shape torch.Size([5]).
Finally to get the mask, compare A with the corresponding thresholds for each element. Note that this uses a broadcasted elementwise comparison with the thresholds. mask has the required shape of torch.Size([5, 2, 2]).
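As a cross-check of the diagonal trick, the sketch below computes each threshold separately in a plain Python loop and compares it with the diagonal of the all-pairs result. The names quants_diag and quants_loop are assumptions introduced here; the loop is slower but avoids building the full (batch, batch) table.

```python
import torch

# Same small example as above: 5 elements of shape (2, 2), one quantile each.
A = torch.arange(5 * 2 * 2).reshape(5, 2, 2) + 1.0
B = torch.linspace(0, 1, 5)
Af = A.reshape(A.shape[0], -1)

# All-pairs result has shape (5, 5); entry [k, k] pairs B[k] with row k of Af.
quants_diag = torch.quantile(Af, B, dim=1).diagonal()

# Reference: one scalar quantile per row.
quants_loop = torch.stack(
    [torch.quantile(Af[k], B[k].item()) for k in range(Af.shape[0])]
)

# Broadcast the per-element thresholds over the last two dimensions.
mask = (A >= quants_loop[:, None, None]).to(torch.uint8)

print(torch.allclose(quants_diag, quants_loop), mask.shape)
```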

combining multi numpy arrays (images) in one array (image) in python

I have some numpy arrays whose elements are the pixels of 28*28 images.
25 of these arrays are stored in one array of shape (25, 28, 28) or (5, 5, 28, 28). Is there an efficient way to combine them into one image, a 5*5 grid of the 28*28 images?
I tried np.reshape to a (140, 140) array and plt.imshow, but the output was a scrambled image.
"I tried np.reshape to (140,140)..." That will work if you first transpose the input appropriately.
Suppose the input x has shape (5, 5, 28, 28). To get the array y with shape (140, 140) that contains the images arranged the way you want, you can do:
xshp = x.shape
y = x.transpose((0, 2, 1, 3)).reshape((xshp[0]*xshp[2], xshp[1]*xshp[3]))
If x always has shape (5, 5, 28, 28), you can hardcode the constant 140:
y = x.transpose((0, 2, 1, 3)).reshape((140, 140))
For example, here I create x with shape (5, 5, 28, 28) where each 28x28 image is a constant. The constants are chosen randomly. The transposed, reshaped array y is plotted, and you can see that all the constant blocks are arranged correctly.
In [148]: rng = np.random.default_rng()
In [149]: x = np.repeat(rng.integers(0, 256, size=(5, 5)), 28*28, axis=-1).reshape((5, 5, 28, 28))
In [150]: y = x.transpose((0, 2, 1, 3)).reshape((140, 140))
In [151]: imshow(y)
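The same arrangement can be checked without plotting. The sketch below (test data and variable names are assumptions) confirms that tile (i, j) of y equals x[i, j], and that a plain reshape without the transpose gives a different array.

```python
import numpy as np

# Distinct values everywhere, so any misplacement of pixels is detectable.
x = np.arange(5 * 5 * 28 * 28).reshape(5, 5, 28, 28)
rows, cols, h, w = x.shape

# Move the image-row axis next to the grid-row axis, then merge the pairs.
y = x.transpose(0, 2, 1, 3).reshape(rows * h, cols * w)

# Reshape alone keeps the memory order, which interleaves the images.
naive = x.reshape(rows * h, cols * w)

# Every 28x28 tile of y should be the corresponding input image.
tiles_ok = all(
    np.array_equal(y[i * h:(i + 1) * h, j * w:(j + 1) * w], x[i, j])
    for i in range(rows)
    for j in range(cols)
)
print(y.shape, tiles_ok, np.array_equal(naive, y))
```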

Stacking numpy arrays with padding

I have a list of 32 numpy arrays, each of which has shape (n, 108, 108, 2), where n is different in each array. I want to stack all of them to create a numpy array of shape (32, m, 108, 108, 2), where m is the maximum among the ns, and the shorter arrays are padded with zeros.
How do I do this?
I asked something similar yesterday, but the answers there seem to break when using deep arrays like in my case.
Concretely, I went with this solution in the end, which produced the cleanest code:
data = np.column_stack(zip_longest(*data, fillvalue=0))
But now it is throwing this error:
ValueError: setting an array element with a sequence.
I have found a godly answer in this webpage.
The pad_sequences function is exactly what I needed.
from tensorflow.python.keras.preprocessing.sequence import pad_sequences
result = pad_sequences(imgs, padding='post')
In my case I needed to stack images of different widths, padded with zeros. Note that zip_longest appends the fill values at the end of each row, so the code below pads on the right side.
For me this works well:
import numpy as np
from itertools import zip_longest

np.random.seed(42)
image_batch = []
for i in np.random.randint(50, 500, size=10):
    image_batch.append(np.random.randn(32, i))
for im in image_batch:
    print(im.shape)
output: (32, 152)
(32, 485)
(32, 398)
(32, 320)
(32, 156)
(32, 121)
(32, 238)
(32, 70)
(32, 152)
(32, 171)
def stack_images_rows_with_pad(list_of_images):
    func = lambda x: np.array(list(zip_longest(*x, fillvalue=0)))  # applied row wise
    return np.array(list(map(func, zip(*list_of_images)))).transpose(2, 0, 1)

res = stack_images_rows_with_pad(image_batch)
for im in res:
    print(im.shape)
output: (32, 485)
(32, 485)
(32, 485)
(32, 485)
(32, 485)
(32, 485)
(32, 485)
(32, 485)
(32, 485)
(32, 485)
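If the zeros should sit on the left of each image instead, one possible sketch is to pad every image with np.pad before stacking. This is a swapped-in technique, not the zip_longest approach above, and the names max_w and left_padded are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
# Ten images of height 32 and random widths, as in the example above.
image_batch = [rng.standard_normal((32, w)) for w in rng.integers(50, 500, size=10)]
max_w = max(im.shape[1] for im in image_batch)

# (0, 0) keeps the rows; (max_w - width, 0) puts all zeros before the data.
left_padded = np.stack(
    [np.pad(im, [(0, 0), (max_w - im.shape[1], 0)], mode="constant") for im in image_batch]
)

print(left_padded.shape)
```

Swapping the pair to (0, max_w - im.shape[1]) pads on the right instead.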
Try this:
# Create matrices with random first axis length.
depth = np.random.randint(3, 20, size=32)
l = []
lmax = 0
for i in depth:
    l.append(np.ones((i, 10, 10, 2)))
    lmax = i if i > lmax else lmax
# Join the matrices:
new_l = []
for m in l:
    new_l.append(np.vstack([m, np.zeros((lmax - m.shape[0], 10, 10, 2))]))
master = np.stack(new_l, axis=0)
master.shape
>>> (32, 19, 10, 10, 2)
I find np.pad awkward to work with on higher-dimensional arrays. Luckily, what you asked for is simple: only one dimension has to be extended, so it's easy to use np.vstack to append an array of zeros that brings each array to the new shape.
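For comparison, the same result can be sketched with np.pad by giving one (before, after) pair per axis and leaving every axis except the first at (0, 0). The array sizes below mirror the example above; the names arrays, m, and padded are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# 32 arrays of ones with a random first-axis length between 3 and 19.
arrays = [np.ones((n, 10, 10, 2)) for n in rng.integers(3, 20, size=32)]
m = max(arr.shape[0] for arr in arrays)

# Pad only axis 0, and only at the end; the other three axes stay unchanged.
padded = [
    np.pad(arr, [(0, m - arr.shape[0]), (0, 0), (0, 0), (0, 0)], mode="constant")
    for arr in arrays
]
master = np.stack(padded, axis=0)

print(master.shape)
```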
import numpy as np

A = np.ones((4, 3))
# Row of zeros to add above and below A.
border_top_bottom = np.zeros((1, A.shape[1]))
temp = np.vstack([border_top_bottom, A, border_top_bottom])
print(temp)
# Column of zeros to add to the left and right of the result.
border_right_left = np.zeros((temp.shape[0], 1))
print(np.hstack([border_right_left, temp, border_right_left]))
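For a uniform one-element border of zeros like this, np.pad with an integer pad width is a shorter option. The sketch below is an alternative to the vstack/hstack approach, not part of the original answer; the name framed is an assumption.

```python
import numpy as np

A = np.ones((4, 3))

# An integer pad width applies to both sides of every axis.
framed = np.pad(A, 1, mode="constant", constant_values=0)

print(framed.shape)
print(framed)
```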
