Bootstrap weighted mean of tensor - python

I have some masked 1D signal data stacked in a 3D tensor, so the dimensions are [batch_size, num_signals, num_samples]. I would like to compute bootstrap averages of each set of non-masked signals, so the final dimensions should be [batch_size, num_samples].
I have a solution that is pretty painful in a mix of Python, NumPy, and TensorFlow: it loops over each batch element, gets the non-masked signals, samples from that list with replacement, and computes an average. Is there a better way?
import numpy as np
import tensorflow as tf

# set up data
data = tf.cast(tf.random.uniform((2, 4, 5), minval=1.0, maxval=100.0), tf.float32)
mask = tf.constant([[1, 1, 0, 0], [1, 1, 0, 1]])
# gather results for each element in batch
results = []
for n_in_batch in range(data.shape[0]):
    # select signals by mask
    valid_idxs = np.nonzero(mask[n_in_batch, :])[0]
    # sample from valid indexes with replacement
    boot_idxs = tf.numpy_function(np.random.choice, (valid_idxs, len(valid_idxs), True), tf.int64)
    # select from the data using the bootstrapped indices
    boot_signals = tf.gather(data[n_in_batch, :, :], boot_idxs)
    # compute average and save results
    av_signal = tf.reduce_mean(boot_signals, axis=0)
    results.append(av_signal)
final = tf.stack(results)
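For reference, here is one possible vectorized sketch (my own addition, not from the original post): it draws bootstrap indices for every batch element at once from a categorical distribution whose logits mask out the invalid signals, so the Python loop and tf.numpy_function disappear. Note that it draws a fixed num_signals samples per batch element rather than one draw per valid signal, which slightly changes the bootstrap sample size.
import tensorflow as tf

data = tf.cast(tf.random.uniform((2, 4, 5), minval=1.0, maxval=100.0), tf.float32)
mask = tf.constant([[1, 1, 0, 0], [1, 1, 0, 1]])

num_signals = tf.shape(mask)[1]
# log(0) = -inf, so masked-out signals can never be drawn
logits = tf.math.log(tf.cast(mask, tf.float32))
# one categorical draw per signal slot, with replacement: [batch_size, num_signals]
boot_idxs = tf.random.categorical(logits, num_samples=num_signals)
# gather the resampled signals per batch element: [batch_size, num_signals, num_samples]
boot_signals = tf.gather(data, boot_idxs, batch_dims=1)
# average over the bootstrap draws: [batch_size, num_samples]
final = tf.reduce_mean(boot_signals, axis=1)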

Related

Perform image transformations on batches of images in tensorflow

I am trying to perform some transformations on a batch of images of size [128,128,3] with a batch_size of 20. I have managed to do part of the transformations on a single image but I'm not sure how to extend the same for the entire batch.
x is the batched image tensor of shape [20,128,128,3]
grads is also a tensor of shape [20,128,128,3]
I want to find the maximum value of grads per image, i.e. I will get 20 max values. Let us call the max value for image_i max_grads_i.
add_max_grads is a tensor of shape [20,128,128,3].
I want to update add_max_grads such that, for image_i, the entry at the position of max_grads_i in the grads tensor is updated, and this update is applied across all 3 channels.
For instance, for image i the shape is [128,128,3], if the max_grads_i was found at grads[50][60][1], I want to update add_max_grads at [50][60][0], [50][60][1], [50][60][2] with the same constant value.
Eventually, this needs to be extended for all the images in the batch to create add_max_grads which has a shape [20,128,128,3].
How can I achieve this?
Currently I am using tf.math.reduce_max(grads,(1,2,3)) to find the max value in the grads tensor for each image in the batch which returns a tensor of shape [20,] but I'm stuck at the rest.
I think this does what you're asking, if I've understood it right.
This is a self-contained example.
import tensorflow as tf
#tshape = (20,128,128,3) # The real shape needed
tshape = (2,4,4,3) # A small shape, to allow print output to be readable
# A sample x vector, set to zeros here so we can see what gets updated easily
x = tf.zeros(shape=tshape, dtype=tf.float32)
# grads, same shape as x, let's use a random matrix here
grads = tf.random.normal(shape=tshape)
print(f"grads: \n{grads}")
# ---- Now calculate the indices we will use in tf.tensor_scatter_nd_update
# 4-D mask tensor (m, x, y, c) with 1 in max locations (by m), zeros otherwise
mask4d = tf.cast(grads == tf.reduce_max(grads, axis=(1,2,3), keepdims=True), dtype=tf.int32)
# 3D mask tensor (m, x, y) with 1 in max pixel locations (by m) across channel
mask3d = tf.reduce_max(mask4d, axis=3)
# indices of maximum values, each item gives m, x, y
indices = tf.where(mask3d)
print(f"\nindices\n{indices}")
# ---- Now calculate the updates we will use in tf.tensor_scatter_nd_update. This has shape (m, c)
newval = 999
updates = tf.constant(newval, shape=(tshape[0], tshape[3]), dtype=tf.float32)
# ---- Now modify x by scattering newval through it. Each entry in indices indicates a slice x[m, x, y, :]
x_updated = tf.tensor_scatter_nd_update(x, indices, updates)
print(f"\nx_updated\n{x_updated}")

How to do argmax in group in pytorch?

Is there any way to implement max pooling according to the norm of sub-vectors in a group in PyTorch? Specifically, this is what I want to implement:
Input:
x: a 2-D float tensor, shape #Nodes * dim
cluster: a 1-D long tensor, shape #Nodes
Output:
y, a 2-D float tensor, and:
y[i]=x[k] where k=argmax_{cluster[k]=i}(torch.norm(x[k],p=2)).
I tried torch.scatter with reduce="max", but this only works for dim=1 and x[i]>0.
Can someone help me to solve the problem?
I don't think there's any built-in function to do what you want. Basically this would be some form of scatter_reduce on the norm of x, but instead of selecting the max norm you want to select the row corresponding to the max norm.
A straightforward implementation may look something like this
"""
input
x: float tensor of size [NODES, DIMS]
cluster: long tensor of size [NODES]
output
float tensor of size [cluster.max()+1, DIMS]
"""
num_clusters = cluster.max().item() + 1
y = torch.zeros((num_clusters, DIMS), dtype=x.dtype, device=x.device)
for cluster_id in torch.unique(cluster):
x_cluster = x[cluster == cluster_id]
y[cluster_id] = x_cluster[torch.argmax(torch.norm(x_cluster, dim=1), dim=0)]
This should work just fine if cluster.max() is relatively small. If there are many clusters, though, this approach has to unnecessarily create masks over cluster for every unique cluster id. To avoid this you can make use of argsort. The best I could come up with in pure Python is the following.
num_clusters = cluster.max().item() + 1
DIMS = x.size(1)  # feature dimensionality of each node
x_norm = torch.norm(x, dim=1)
cluster_sortidx = torch.argsort(cluster)
cluster_ids, cluster_counts = torch.unique_consecutive(cluster[cluster_sortidx], return_counts=True)
end_indices = torch.cumsum(cluster_counts, dim=0).cpu().tolist()
start_indices = [0] + end_indices[:-1]
y = torch.zeros((num_clusters, DIMS), dtype=x.dtype, device=x.device)
for cluster_id, a, b in zip(cluster_ids, start_indices, end_indices):
    indices = cluster_sortidx[a:b]
    y[cluster_id] = x[indices[torch.argmax(x_norm[indices], dim=0)]]
For example, in random tests with NODES = 60000, DIMS = 512, cluster.max() = 6000, the first version takes about 620 ms while the second version takes about 78 ms.
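For reference, a setup like the following (my own construction, mirroring the timing test above) can be used to try either snippet:
import torch

NODES, DIMS, NUM_CLUSTERS = 60000, 512, 6000
x = torch.randn(NODES, DIMS)
cluster = torch.randint(NUM_CLUSTERS, (NODES,))
# running either snippet above now produces y with shape [cluster.max() + 1, DIMS]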

Handling non-image inputs of different shapes in a batch without zero-filling/zero-padding

I have (for the most part) gigapixel images that I have divided into 512x512 patches. I feed each 512x512 3-channel patch into a frozen ResNet18 network for feature extraction and end up with a 1D tensor of 512 features. Eventually, I concatenate all these 1D tensors and end up with an Nx512 intermediate representation, where N is the number of patches in the gigapixel image.
Since my original gigapixel images are not all the same size, these intermediate representations range from 17x512 to 6000x512, and I am using the following strategy to feed them to my model. However, my preference would be a more standardized method in PyTorch (for 2D images with 3 channels we could easily use torch transforms, but not here).
feature_path = 'features.pt'
features = torch.load(feature_path, map_location=lambda storage, loc: storage)
if features.shape[0] <= median_num_patches:
    a = torch.zeros((median_num_patches - features.shape[0], 512))  # zero padding to length median_num_patches
    embeddings = torch.cat((features, a), axis=0)
    sample['image'] = embeddings
else:
    random_indices = torch.randint(features.shape[0], (median_num_patches, ))  # max size: 6000 patches in an image
    sample['image'] = features[random_indices, :]
^ As mentioned earlier, the 2D intermediate representation (Nx512) is created in an offline process and saved in features.pt files.
The above solution first finds the median number of patches across the 2D intermediate representations. If the current representation in the batch has fewer patches than the median, it is zero-filled up to the median size; if it has more, the median number of patches is randomly sampled from it.
I am looking for a better solution than the current one. Perhaps something without sampling or zero-filling and without loss of data. Thanks for any possible lead.

How to extract vector of exact coordinates of the image in tensorflow?

I'm trying to write the code for a GAN generator model using Keras with the TensorFlow backend. I want the generator output to be a vector (of the same size for each image in the batch) of image values at specific coordinates. These coordinates are also given as an input to the generator.
I've tried using tf.gather_nd to do a numpy-like operation of extracting values at specific coordinates.
img is a generated from the noise image with shape=(?,28,28,1),
coordinates is an input tensor of the shape (?,80,2) with 80 points to be extracted from the generated image img,
vect is an output vector, should be the size of (?, 80),
where ? is a batch size.
vect = Lambda(lambda x: tf.gather_nd(x, tf.cast(coordinates, 'int64')))(img)
Finally the output shape of this function is (?,80,28,1) instead of (?,80).
What is a better way to extract such points?
You can do that with tf.gather_nd like this:
import tensorflow as tf

def extract_pixels(img, coords):
    # Number of images and pixels
    s = tf.shape(coords, out_type=coords.dtype)
    n = s[0]
    p = s[1]
    # Make gather index
    i = tf.range(n)
    ii = tf.tile(i[:, tf.newaxis, tf.newaxis], [1, p, 1])
    idx = tf.concat([ii, coords], axis=-1)
    # Gather pixel values
    pixels = tf.gather_nd(tf.squeeze(img, axis=-1), idx)
    return pixels

# ...
vect = Lambda(lambda x: extract_pixels(x, tf.cast(coordinates, 'int64')))(img)
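For example (my own construction), with a fake batch of 3 generated images and 80 integer (row, col) coordinates per image:
img = tf.random.normal((3, 28, 28, 1))
coordinates = tf.random.uniform((3, 80, 2), maxval=28, dtype=tf.int64)
pixels = extract_pixels(img, coordinates)
print(pixels.shape)  # (3, 80)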

SIFT Input to ANN

I'm trying to classify images using an Artificial Neural Network and the approach I want to try is:
Get feature descriptors (using SIFT for now)
Classify using a Neural Network
I'm using OpenCV3 and Python for this.
I'm relatively new to Machine Learning and I have the following question -
Each image that I analyse will have a different number of 'keypoints' and hence a different shape of the 2D 'descriptor' array. How do I decide the input for my ANN? For example, for one sample image the descriptor shape is (12211, 128), so do I flatten this array and use it as an input (in which case I have to worry about varying input sizes for each image), or do I compute something else for the input?
I'm not sure if this is an exact solution but this worked for me. The main idea is as follows:
Divide your image into a MxN grid.
Obtain a set number of feature points for each sub-image.
Concatenate the results for all the sub-images to obtain a feature vector for the entire image.
The supporting code is roughly given below (the function pre_process_images):
import cv2
import numpy as np
from itertools import product

def tiles(arr, nrows, ncols):
    """
    If arr is a 2D array, the returned list contains nrows x ncols numpy arrays
    with each array preserving the "physical" layout of arr.
    When the array shape (rows, cols) is not divisible by (nrows, ncols) then
    some of the array dimensions can change according to numpy.array_split.
    """
    rows, cols, channel = arr.shape
    col_arr = np.array_split(range(cols), ncols)
    row_arr = np.array_split(range(rows), nrows)
    return [arr[r[0]: r[-1] + 1, c[0]: c[-1] + 1]
            for r, c in product(row_arr, col_arr)]

def pre_process_images(data, dimensions=(28, 28)):
    images = data['image']
    features = []
    count = 1
    nrows = dimensions[0]
    ncols = dimensions[1]
    sift = cv2.xfeatures2d.SIFT_create(1)
    for arr in images:
        image_feature = []
        cut_image = tiles(arr, nrows, ncols)
        for small_image in cut_image:
            (kps, descs) = sift.detectAndCompute(small_image, None)
            image_feature.append(descs.flatten())
        features.append(image_feature)
        print(count)
        count += 1
    data['sift_features'] = features
    return data
However, this is extremely slow. I'm currently working on a way to optimally select features for this using PCA.
It also helps to apply normalization to each image before extracting the features.
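A minimal sketch of one such step (my own addition; the answer does not specify a normalization method, and img here is a hypothetical 8-bit grayscale image): stretch each image to the full 0-255 range with OpenCV's min-max normalization before running SIFT on it.
import cv2

norm_img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX).astype('uint8')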
