Perform image transformations on batches of images in tensorflow - python

I am trying to perform some transformations on a batch of images of size [128,128,3] with a batch_size of 20. I have managed to do part of the transformations on a single image, but I'm not sure how to extend them to the entire batch.
x is the batched image tensor of shape [20,128,128,3]
grads is also a tensor of shape [20,128,128,3]
I want to find the maximum value of grads per image, i.e. I will get 20 max values. Let us call the max value for image_i max_grads_i.
add_max_grads is a tensor of shape [20,128,128,3]
I want to update add_max_grads such that, for image_i, the corresponding entry of add_max_grads at the position of max_grads_i in the grads tensor gets updated, and this update needs to happen across all 3 channels.
For instance, for image i the shape is [128,128,3], if the max_grads_i was found at grads[50][60][1], I want to update add_max_grads at [50][60][0], [50][60][1], [50][60][2] with the same constant value.
Eventually, this needs to be extended for all the images in the batch to create add_max_grads which has a shape [20,128,128,3].
How can I achieve this?
Currently I am using tf.math.reduce_max(grads, (1,2,3)) to find the max value in the grads tensor for each image in the batch, which returns a tensor of shape [20,], but I'm stuck on the rest.

I think this does what you're asking, if I've understood it right.
This is a self-contained example.
import tensorflow as tf
#tshape = (20,128,128,3) # The real shape needed
tshape = (2,4,4,3) # A small shape, to allow print output to be readable
# A sample x tensor, set to zeros here so we can see what gets updated easily
x = tf.zeros(shape=tshape, dtype=tf.float32)
# grads, same shape as x, let's use a random matrix here
grads = tf.random.normal(shape=tshape)
print(f"grads: \n{grads}")
# ---- Now calculate the indices we will use in tf.tensor_scatter_nd_update
# 4-D mask tensor (m, x, y, c) with 1 in max locations (by m), zeros otherwise
mask4d = tf.cast(grads == tf.reduce_max(grads, axis=(1,2,3), keepdims=True), dtype=tf.int32)
# 3D mask tensor (m, x, y) with 1 in max pixel locations (by m) across channel
mask3d = tf.reduce_max(mask4d, axis=3)
# indices of maximum values, each item gives m, x, y
indices = tf.where(mask3d)
print(f"\nindices\n{indices}")
# ---- Now calculate the updates we will use in tf.tensor_scatter_nd_update. This has shape (m, c)
newval = 999
updates = tf.constant(newval, shape=(tshape[0], tshape[3]), dtype=tf.float32)
# ---- Now modify x by scattering newval through it. Each entry in indices indicates a slice x[m, x, y, :]
x_updated = tf.tensor_scatter_nd_update(x, indices, updates)
print(f"\nx_updated\n{x_updated}")

Related

Bootstrap weighted mean of tensor

I have some masked 1D signal data stacked in a 3D tensor, so the dimensions are [batch_size, num_signals, num_samples]. I would like to compute bootstrap averages of each set of non-masked signals, so the final dimensions should be [batch_size, num_samples].
I have a solution that is a pretty painful mix of Python, NumPy, and TensorFlow - it loops over each batch element, gets the non-masked signals, samples from that list with replacement, and computes an average - is there a better way?
import numpy as np
import tensorflow as tf
# set up data
data = tf.cast(tf.random.uniform((2, 4, 5), minval=1.0, maxval=100.0), tf.float32)
mask = tf.constant([[1,1,0,0], [1,1,0,1]])
# gather results for each element in batch
results = []
for n_in_batch in range(data.shape[0]):
    # select signals by mask
    valid_idxs = np.nonzero(mask[n_in_batch,:])[0]
    # sample from the valid indexes with replacement
    boot_idxs = tf.numpy_function(np.random.choice, (valid_idxs, len(valid_idxs), True), tf.int64)
    # select from the data using the bootstrapped indices
    boot_signals = tf.gather(data[n_in_batch,:,:], boot_idxs)
    # compute average and save results
    av_signal = tf.reduce_mean(boot_signals, axis=0)
    results.append(av_signal)
final = tf.stack(results)
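For what it's worth, here is a vectorized sketch that avoids the Python loop, under one assumption I'm making: each batch element draws a fixed num_signals bootstrap samples, rather than one draw per valid signal as the loop above does. It uses tf.random.categorical to sample only non-masked indices and tf.gather with batch_dims:
import tensorflow as tf
# set up data as above
data = tf.cast(tf.random.uniform((2, 4, 5), minval=1.0, maxval=100.0), tf.float32)
mask = tf.constant([[1,1,0,0], [1,1,0,1]])
# log(0) = -inf for masked signals, log(1) = 0 for valid ones,
# so tf.random.categorical never draws a masked index
logits = tf.math.log(tf.cast(mask, tf.float32))
num_draws = tf.shape(data)[1]  # assumption: one draw per signal slot
# (batch_size, num_draws) bootstrap indices, sampled with replacement
boot_idxs = tf.random.categorical(logits, num_samples=num_draws)
# (batch_size, num_draws, num_samples) bootstrapped signals
boot_signals = tf.gather(data, boot_idxs, batch_dims=1)
# (batch_size, num_samples) bootstrap averages
final = tf.reduce_mean(boot_signals, axis=1)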

Tensorflow: How to gather from indices along inner dimension

I have a 4D tensor output from an object detector that outputs per-pixel, per-class box predictions, i.e. shape H x W x C x 6, where the innermost dimension of width 6 holds the box parameters for that class. Now, when computing the loss, I want to update only the predictions for the ground-truth class. To do this, I'd like to have a tensor of shape H x W whose elements are the ground-truth class indices. This tensor would then be used to extract the relevant class only from the input, giving a tensor of shape H x W x 6. I know this should be possible using gather or gather_nd, but I can't get the parameters right to get the desired output. I'm also confused about the purpose of the batch_dims parameter for gather_nd, though that may be relevant to solving this. Any suggestions on how I can properly use these, or some other tf function, to achieve this result?
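Not a verified answer, but a sketch of one way this could work: tf.gather with batch_dims treats the leading dimensions of params and indices as shared batch dimensions, so the H x W class-index tensor can select along the class axis per pixel. The sizes below are made up for illustration:
import tensorflow as tf
H, W, C = 4, 5, 3
preds = tf.random.normal((H, W, C, 6))  # per-pixel, per-class box parameters
gt_class = tf.random.uniform((H, W), maxval=C, dtype=tf.int32)  # ground-truth class per pixel
# batch_dims=2 treats (H, W) as batch dimensions, so each pixel's scalar
# index picks one class slice along axis 2, giving shape (H, W, 6)
selected = tf.gather(preds, gt_class, batch_dims=2)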

How to extract vector of exact coordinates of the image in tensorflow?

I'm trying to write the code for a GAN generator model using Keras with the TensorFlow backend. I want the generator output to be a vector (of the same size for each image in the batch) of the image's values at exact coordinates. These coordinates are also given as an input to the generator.
I've tried using tf.gather_nd to do a NumPy-like operation of extracting values at exact coordinates.
img is an image generated from noise, with shape=(?,28,28,1),
coordinates is an input tensor of the shape (?,80,2) with 80 points to be extracted from the generated image img,
vect is an output vector, should be the size of (?, 80),
where ? is a batch size.
vect = Lambda(lambda x: tf.gather_nd(x, tf.cast(coordinates, 'int64')))(img)
Finally the output shape of this function is (?,80,28,1) instead of (?,80).
What is a better way to extract such points?
You can do that with tf.gather_nd like this:
import tensorflow as tf
def extract_pixels(img, coords):
    # Number of images and pixels
    s = tf.shape(coords, out_type=coords.dtype)
    n = s[0]
    p = s[1]
    # Make gather index
    i = tf.range(n)
    ii = tf.tile(i[:, tf.newaxis, tf.newaxis], [1, p, 1])
    idx = tf.concat([ii, coords], axis=-1)
    # Gather pixel values
    pixels = tf.gather_nd(tf.squeeze(img, axis=-1), idx)
    return pixels
# ...
vect = Lambda(lambda x: extract_pixels(x, tf.cast(coordinates, 'int64')))(img)
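For reference, a quick shape check of extract_pixels with concrete eager tensors (the sizes just mirror the question):
img = tf.random.normal((2, 28, 28, 1))
coordinates = tf.random.uniform((2, 80, 2), maxval=28, dtype=tf.int64)
print(extract_pixels(img, coordinates).shape)  # (2, 80)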

Tensor multiply along axis in tensorflow

The problem I have using tensorflow is as follows:
For two tensors
[[x11,x12...],[x21,x22...],...[xn1,xn2...]]
and
[y1,y2,...yn],
I want to multiply them along axis 0 to get
[[x11*y1,x12*y1...],[x21*y2,x22*y2...]...]
For example, for
[[1,2],[3,4]] and [1,2], I want to get the result tensor [[1,2],[6,8]].
The real scenario is that I have two tensors A and B shaped (batches,height,width,n_channels) and (batches,1). Both are tensors defined in TensorFlow. For each image of A in the batch, I want to multiply it by the corresponding value in B.
Given a 2-dimensional tensor x and a vector y, you just need to do:
result = x * tf.expand_dims(y, axis=-1)
Or, if you like it more:
result = x * y[:, tf.newaxis]
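For the real scenario in the question, where A is shaped (batches,height,width,n_channels) and B is shaped (batches,1), the same idea applies, but B needs trailing singleton dimensions so broadcasting lines up. A minimal sketch:
import tensorflow as tf
A = tf.random.normal((8, 32, 32, 3))  # (batches, height, width, n_channels)
B = tf.random.normal((8, 1))          # (batches, 1)
# reshape B to (batches, 1, 1, 1) so it broadcasts over the image dims
result = A * tf.reshape(B, (-1, 1, 1, 1))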

TensorFlow, how to index so that (batch_size x num_labels)+(batch_size) -> (batch_size)

Suppose I have a matrix (2D tensor) X of shape (batch_size x num_labels), in which the scores of the labels for each sample are stored. Now I want to extract the true labels' scores, where the true labels are stored in another 1D tensor y of shape (batch_size).
What can I do ?
I know that in Theano or NumPy it can be done with a single expression: X[y].
BUT in TensorFlow, what is the most convenient or least costly way to achieve this?
X = tf.get_variable("X",[batch_size,num_labels])
y = tf.placeholder(tf.int32,[batch_size])
Note 0 <= y[i] <= num_labels - 1. The output z should be 1D tensor where z[i]= X[i][y[i]]
I understand that X is a matrix containing probabilities for each class and batch instance, and that you want to get the probability of the true label. I propose one solution, though it may not be the optimal one:
# Create mask for values
increasing = tf.range(start=0, limit=tf.shape(X)[0], delta=1)
# Concatenate batch index and true label
# Note that in Tensorflow < 1.0.0 you must call tf.pack
mask = tf.stack([increasing, y], axis=1)
# Extract values
masked = tf.gather_nd(params=X, indices=mask)
Hope it helps.
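As a side note, in more recent TensorFlow versions I believe the same result can be had in one call with tf.gather and batch_dims:
# z[i] = X[i, y[i]], using the first axis of both tensors as the batch axis
z = tf.gather(X, y, batch_dims=1)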
