scatter_nd_add() example for sparse addition in TensorFlow - python

I am having difficulty applying tf.scatter_nd_add() to 2D tensors. The documentation is a bit unclear: it only contains an example for full slice updates, not for sparse updates.
My case is the following:
updates - 2D tensor of shape [None, 6]
indices - 2D tensor of shape [None, 6]
ref - 2D Variable of zeros of shape [None, 6]
It is guaranteed that updates, indices and ref will always have their first dimension equal, but the size of that dimension can be varying. The update I want to perform looks like
for i, j:
    k = indices[i][j]
    ref[i][k] += updates[i][j]
Note that indices contains duplicates. tf.scatter_nd_add(ref, indices, updates) complains about a shape mismatch, and I cannot figure out how I need to restructure the tensors in order to perform the update.

I figured it out. Each 2D entry in indices must actually specify the absolute location that will get updated in ref. This means that indices must be 3D and then the non-vectorized update looks like:
for i, j:
    r, k = indices[i][j]
    ref[r][k] += updates[i][j]
In the above question it just happens that r is always equal to i.
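To make the intended semantics concrete, here is a minimal NumPy reference sketch of the loop above. It is not the TensorFlow implementation; the helper name scatter_add_rows and the small input arrays are made up for illustration. It builds the row index r explicitly (here r == i) and relies on np.add.at, which accumulates duplicate indices instead of overwriting them.

```python
import numpy as np

def scatter_add_rows(col_indices, updates, num_cols):
    """Reference semantics: ref[i][col_indices[i][j]] += updates[i][j]."""
    n_rows = col_indices.shape[0]
    ref = np.zeros((n_rows, num_cols), dtype=updates.dtype)
    # Row index r for every (i, j) entry; in this problem r == i.
    row_indices = np.broadcast_to(np.arange(n_rows)[:, None], col_indices.shape)
    # np.add.at accumulates duplicate (r, k) pairs instead of overwriting.
    np.add.at(ref, (row_indices, col_indices), updates)
    return ref

# Hypothetical example data with duplicate column indices in each row.
col_indices = np.array([[0, 0, 2], [1, 1, 1]])
updates = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
result = scatter_add_rows(col_indices, updates, num_cols=3)
print(result)
```

Pairing each column index with its row index is the same idea as stacking row_indices and col_indices into a 3D indices tensor for tf.scatter_nd_add.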
Here is a full Tensorflow implementation with varying shapes. For clarity, in the following example, col_indices corresponds to indices from the original question:
import tensorflow as tf
import numpy as np
updates = tf.placeholder(dtype=tf.float32, shape=[None, 6])
col_indices = tf.placeholder(dtype=tf.int32, shape=[None, 6])
# Row index of every entry: 0 for the first row, 1 for the second, ...
row_indices = tf.cumsum(tf.ones_like(col_indices), axis=0, exclusive=True)
indices = tf.concat([tf.expand_dims(row_indices, axis=-1),
                     tf.expand_dims(col_indices, axis=-1)], axis=-1)
tmp_var = tf.Variable(0, trainable=False, dtype=tf.float32, validate_shape=False)
ref = tf.assign(tmp_var, tf.zeros_like(updates), validate_shape=False)
# This makes sure that ref is always 0 before scatter_nd_add() runs
with tf.control_dependencies([ref]):
    result = tf.scatter_nd_add(ref, indices, updates)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Create example input data
np_input = np.arange(0, 6, 1, dtype=np.int32)
np_input = np.tile(np_input[None, :], [10, 1])
res = sess.run(result, feed_dict={updates: np_input, col_indices: np_input})
print(res)

Related

how to change torch.scatter_add to tensorflow function

I need to port PyTorch code to TensorFlow.
The PyTorch code, from NADST, is here:
encoded_context = ft['encoded_context2']
encoded_in_domainslots = ft['encoded_in_domainslots2']
self.pointer_attn(ft['out_states'], encoded_context, encoded_context, context_mask)
pointer_attn = self.pointer_attn.attn.squeeze(1)
p_vocab = F.softmax(vocab_attn, dim = -1)
context_index = context.unsqueeze(1).expand_as(pointer_attn)
p_context_ptr = torch.zeros(p_vocab.size()).cuda()
p_context_ptr.scatter_add_(2, context_index, pointer_attn)
I want to convert the line "p_context_ptr.scatter_add_(2, context_index, pointer_attn)" to TensorFlow.
So I used TensorFlow's "tf.compat.v1.tensor_scatter_nd_add()", but it does not perform the same operation as torch's scatter_add_() function.
I have been trying until now but have not found a solution. My code so far looks like this:
def get_scatter_add(tensor, indices, updates):
    if indices.shape.rank > 2:
        tensor = tf.compat.v1.reshape(tensor, shape=[-1, tensor.shape[-1]])
        indices = tf.compat.v1.reshape(indices, shape=[-1, indices.shape[-1]])
        updates = tf.compat.v1.reshape(updates, shape=[-1, updates.shape[-1]])
    one_hot_index = tf.compat.v1.one_hot(indices=indices, depth=tensor.shape[-1])
    tile_update = tf.compat.v1.expand_dims(updates, axis=-1)
    updates = tf.compat.v1.to_float(one_hot_index) * tf.compat.v1.to_float(tile_update)
    indices = tf.compat.v1.expand_dims(indices, axis=-1)
    update = tensor.shape[indices.shape[-1]:]
    res = indices.shape[:-1] + update
    scatter = tf.compat.v1.tensor_scatter_nd_add(tensor, indices, updates)
    return scatter
But this runs out of memory. My variable shapes are tensor.shape -> [1100, 19200], update.shape -> [1100, 900], updates.shape -> [1100, 900].
How can I solve this problem?
Thank you for your reply.
I found a solution myself.
The problem with TensorFlow's tensor_scatter_nd_add function is that the vector dimension is expanded for the target tensor.
There is one case, however, in which it performs the same operation as torch's scatter_add_ function.
This is that case:
import tensorflow as tf
indices = tf.constant([[4], [3], [1], [7]])
updates = tf.constant([9, 10, 11, 12])
tensor = tf.ones([8], dtype=tf.int32)
updated = tf.tensor_scatter_nd_add(tensor, indices, updates)
print(updated)
Here the tensor being updated is one-dimensional and indices has rank 2.
So I reshape my inputs to match this case:
tensor -> reshape [-1]
update -> reshape [-1]
indices -> reshape [-1, 1]
This is the same as the case above, except that the index values must also be adjusted. In a pointer generator for a DST task, the last dimension of tensor is the vocabulary size, so after flattening, the indices of the next batch row must be offset by vocab_size, those of the row after that by 2 * vocab_size, and so on.
With that, the function performs the same operation as torch's scatter_add_.
example:
tensor = [35, 32, vocab_size], indices = [35, 32, 900], update = [35, 32, 900]
Torch case:
tensor.scatter_add_(2, indices, update)
Tensorflow case:
tensor = my_tensorflow_scatter_add(tensor, indices, update)
This performs the same operation as the Torch call for tensors of the shapes above.
The my_tensorflow_scatter_add function:
def my_tensorflow_scatter_add(tensor, indices, updates):
    original_tensor = tensor
    # expand index value from vocab size
    indices = tf.compat.v1.reshape(indices, shape=[-1, tf.shape(indices)[-1]])
    indices_add = tf.compat.v1.expand_dims(tf.range(0, tf.shape(indices)[0], 1)*(tf.shape(tensor)[-1]), axis=-1)
    indices += indices_add
    # resize
    tensor = tf.compat.v1.reshape(tensor, shape=[-1])
    indices = tf.compat.v1.reshape(indices, shape=[-1, 1])
    updates = tf.compat.v1.reshape(updates, shape=[-1])
    #check_
    """
    update = tensor.shape[indices.shape[-1]:]
    res = indices.shape[:-1] + update
    """
    #same Torch scatter_add_
    scatter = tf.compat.v1.tensor_scatter_nd_add(tensor, indices, updates)
    scatter = tf.compat.v1.reshape(scatter, shape=[tf.shape(original_tensor)[0], tf.shape(original_tensor)[1], -1])
    return scatter
This solved my problem.
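The flattening trick can be checked on a small case with NumPy. The sketch below is only a reference for the index arithmetic, not a TensorFlow implementation; the function name scatter_add_last_axis and the toy shapes are assumptions made for illustration. Each row of the flattened tensor starts at offset n * vocab_size, so adding that offset to the indices of row n turns the problem into a one-dimensional scatter-add.

```python
import numpy as np

def scatter_add_last_axis(tensor, indices, updates):
    """Add updates into tensor along the last axis; duplicates accumulate."""
    vocab_size = tensor.shape[-1]
    flat_indices = indices.reshape(-1, indices.shape[-1])
    # Row n of the flattened tensor starts at offset n * vocab_size.
    offsets = np.arange(flat_indices.shape[0])[:, None] * vocab_size
    flat_indices = (flat_indices + offsets).reshape(-1)
    out = tensor.reshape(-1).copy()
    # One-dimensional scatter-add over the flattened tensor.
    np.add.at(out, flat_indices, updates.reshape(-1))
    return out.reshape(tensor.shape)

# Hypothetical toy shapes: tensor [2, 2, 4], indices and updates [2, 2, 3].
tensor = np.zeros((2, 2, 4))
indices = np.array([[[0, 0, 3], [1, 2, 2]],
                    [[3, 3, 3], [0, 1, 2]]])
updates = np.ones((2, 2, 3))
out = scatter_add_last_axis(tensor, indices, updates)
print(out)
```

With all updates equal to one, each output entry simply counts how often its index occurs in the corresponding row of indices, which makes the result easy to verify by hand.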
Here is an alternative solution that does not flatten the tensors. It assumes the tensor shapes tensor = [35, 32, vocab_size], indices = [35, 32, 900], update = [35, 32, 900] (based on Proper usage of `tf.scatter_nd` in tensorflow-r1.2):
def scatter_add(tensor, indices, updates):
    """
    Args:
        tensor: (seq_len, batch_size, vocab_size)
        indices: (seq_len, batch_size, dim)
        updates: (seq_len, batch_size, dim)
    Returns:
        (seq_len, batch_size, vocab_size)
    """
    seq_len, batch_size, dim = indices.shape
    # Create additional indices
    i1, i2 = tf.meshgrid(tf.range(seq_len),
                         tf.range(batch_size), indexing="ij")
    i1 = tf.tile(i1[:, :, tf.newaxis], [1, 1, dim])
    i2 = tf.tile(i2[:, :, tf.newaxis], [1, 1, dim])
    # Create final indices
    idx = tf.stack([i1, i2, indices], axis=-1)
    # Get scatter-added tensor
    scatter = tf.tensor_scatter_nd_add(tensor, idx, updates)
    return scatter

Explicit broadcasting of variable batch-size tensor

I'm trying to implement a custom Keras Layer in Tensorflow 2.0RC and need to concatenate a [None, Q] shaped tensor onto a [None, H, W, D] shaped tensor to produce a [None, H, W, D + Q] shaped tensor. It is assumed that the two input tensors have the same batch size even though it is not known beforehand. Also, none of H, W, D, and Q are known at write-time but are evaluated in the layer's build method when the layer is first called. The issue that I'm experiencing is when broadcasting the [None, Q] shaped tensor up to a [None, H, W, Q] shaped tensor in order to concatenate.
Here is an example of trying to create a Keras Model using the Functional API that performs variable-batch broadcasting from shape [None, 3] to shape [None, 5, 5, 3]:
import tensorflow as tf
import tensorflow.keras.layers as kl
import numpy as np
x = tf.keras.Input([3]) # Shape [None, 3]
y = kl.Reshape([1, 1, 3])(x) # Need to add empty dims before broadcasting
y = tf.broadcast_to(y, [-1, 5, 5, 3]) # Broadcast to shape [None, 5, 5, 3]
model = tf.keras.Model(inputs=x, outputs=y)
print(model(np.random.random(size=(8, 3))).shape)
Tensorflow produces the error:
InvalidArgumentError: Dimension -1 must be >= 0
And then when I change -1 to None it gives me:
TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [None, 5, 5, 3]. Consider casting elements to a supported type.
How can I perform the specified broadcasting?
You need to use the dynamic shape of y to determine the batch size. The dynamic shape of a tensor y is given by tf.shape(y) and is a tensor op representing the shape of y evaluated at runtime. The modified example demonstrates this by selecting between the old shape, [None, 1, 1, 3], and the new shape using tf.where.
import tensorflow as tf
import tensorflow.keras.layers as kl
import numpy as np
x = tf.keras.Input([3]) # Shape [None, 3]
y = kl.Reshape([1, 1, 3])(x) # Need to add empty dims before broadcasting
# Retain the batch and depth dimensions, but broadcast along H and W
broadcast_shape = tf.where([True, False, False, True],
tf.shape(y), [0, 5, 5, 0])
y = tf.broadcast_to(y, broadcast_shape) # Broadcast to shape [None, 5, 5, 3]
model = tf.keras.Model(inputs=x, outputs=y)
print(model(np.random.random(size=(8, 3))).shape)
# prints: "(8, 5, 5, 3)"
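The shape selection can be looked at in isolation. The following is a NumPy analogue, not TensorFlow code: y.shape stands in for the runtime value of tf.shape(y), and the batch size of 8 is just an example. The mask keeps the batch and depth entries of the existing shape and takes H and W from the second list, whose zeros are placeholders that are never selected.

```python
import numpy as np

# Stand-in for the reshaped tensor of shape [batch, 1, 1, 3] with batch = 8.
y = np.zeros((8, 1, 1, 3))

# Keep entries 0 and 3 (batch, depth) from y.shape; take 5, 5 for H and W.
broadcast_shape = np.where([True, False, False, True], y.shape, [0, 5, 5, 0])
out = np.broadcast_to(y, tuple(broadcast_shape))
print(broadcast_shape, out.shape)
```

Because the batch entry is read from the actual shape rather than written as -1 or None, the same expression works for any batch size.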
References:
"TensorFlow: Shapes and dynamic dimensions"

Tensorflow tensordot for unknown batch size

There might be an obvious solution, but I haven't found it yet. I want to do a simple multiplication, where one tensor gives me a kind of weight vector and another consists of stacked tensors (as many as there are weights). It seems straightforward using tf.tensordot, but that doesn't work for unknown batch sizes.
import collections
import tensorflow as tf
tf.reset_default_graph()
x = tf.placeholder(shape=(None, 4, 1), dtype=tf.float32, name='x')
y_true = tf.placeholder(shape=(None, 4, 1), dtype=tf.float32, name='y_true')
# These are the models that I want to combine
linear_model0 = tf.layers.Dense(units=1, name='linear_model0')
linear_model1 = tf.layers.Dense(units=1, name='linear_model1')
agents = collections.OrderedDict()
agents[0] = linear_model0(x) # shape (?,4,1)
agents[1] = linear_model1(x) # shape (?,4,1)
stacked = tf.stack(list(agents.values()), axis=1) # shape (?,2,4,1)
# This is the model that produces the weights
x_flat = tf.layers.Flatten()(x)
weight_model = tf.layers.Dense(units=2, name='weight_model')
weights = weight_model(x_flat) # shape: (?,2)
# This is the final output
y_pred = tf.tensordot(weights, stacked, axes = 2, name='y_pred')
# PROBLEM HERE: shape: (4,1) instead of (?,4,1)
# Running the whole thing
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
# Example1 (output of shape (1,4,1) expected, got (4,1))
print('model', sess.run(y_pred,
{x: [[[1], [2], [3], [4]]]}).shape)
# Example2 (output of (2,4,1) expected, got (4,1))
print('model', sess.run(y_pred,
{x: [[[1], [2], [3], [4]], [[1], [2], [3], [4]]]}).shape)
So, the multiplication works as expected for the first input, but does only that first one and not a batch of inputs. Any help?
Similar questions that didn't resolve my issue:
obstacles in tensorflow's tensordot using batch multiplication
tf.tensordot is not suitable in this case because, based on your explanation, it would be necessary to set axes equal to 1, which causes an incompatibility in the matrix sizes: one is [batch_size, 2] and the other is [batch_size, 8]. On the other hand, if you set axes to [[1],[1]], the result is not what you expect:
tf.tensordot(weights, stacked, axes=[[1],[1]]) # shape = (?,?,4,1)
How to fix the issue?
Use tf.einsum, which performs contractions between tensors of arbitrary dimension:
tf.einsum('ij,ijkl->ikl', weights, stacked)
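The subscripts 'ij,ijkl->ikl' sum over the agent axis j while keeping the batch axis i in both inputs. np.einsum uses the same subscript notation, so the contraction can be checked against an explicit loop in NumPy. This is a sketch with randomly generated stand-in data and an assumed batch size of 3, not the original model.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 2))        # (batch, agents)
stacked = rng.normal(size=(3, 2, 4, 1))  # (batch, agents, 4, 1)

# Contract over the agent axis j, keep the batch axis i.
y_einsum = np.einsum('ij,ijkl->ikl', weights, stacked)

# The same weighted sum written as an explicit loop.
y_loop = np.zeros((3, 4, 1))
for i in range(3):
    for j in range(2):
        y_loop[i] += weights[i, j] * stacked[i, j]

print(y_einsum.shape, np.allclose(y_einsum, y_loop))
```

The output keeps the leading batch dimension, which is what the tensordot call with axes=2 loses.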

How do I zip tensors in tensorflow when the dimensions don't match

I have two tensors, one of shape [None, 20, 2], and one of shape [None, 1].
I would like to do an operation on each of the sub-tensors in lockstep to produce a value such that I would end up with a tensor of shape [None, 1].
In python land, I would zip these two, and iterate over the result.
So, just to be clear, I'd like to write a function that takes a [20, 2]-shape tensor and a [1]-shape tensor, and produces a [1]-shape tensor, then apply this function to the [None, 20, 2] and [None, 1] tensors, to produce a [None, 1] tensor.
Hope I articulated that well enough; higher dimensionality makes my head spin sometimes.
This works for me (TensorFlow version 1.4.0)
import numpy as np
import tensorflow as tf
tf.reset_default_graph()
sess = tf.Session()
# Define placeholders with undefined first dimension.
a = tf.placeholder(dtype=tf.float32, shape=[None, 3, 4])
b = tf.placeholder(dtype=tf.float32, shape=[None, 1])
# Create some input data.
a_input = np.arange(24).reshape(2, 3, 4)
b_input = np.arange(2).reshape(2, 1)
# TensorFlow map function.
def f_tf(x):
    return tf.reduce_sum(x[0]) + tf.reduce_sum(x[1])
# Numpy map function (for validation of results).
def f_numpy(x):
    return np.sum(x[0]) + np.sum(x[1])
# Run TensorFlow function.
s = tf.map_fn(f_tf, [a, b], dtype=tf.float32)
sess.run(s, feed_dict={a: a_input, b: b_input})
# array([ 66., 211.], dtype=float32)
# Run Numpy function.
for inp in zip(a_input, b_input):
    print(f_numpy(inp))
# 66
# 211

Looping over a tensor

I am trying to process a tensor of variable size, in a python way that would be something like:
# X is of shape [m, n]
for x in X:
    process(x)
I have tried to use tf.scan. The thing is that I want to process every sub-tensor, so I tried a nested scan, but I was unable to make it work: tf.scan works with an accumulator, and if no initializer is given it takes the first entry of elems as the initializer, which I don't want.
As an example, suppose I want to add one to every element of my tensor (this is just an example), and I want to process it element by element. If I run the code below, one is only added to a sub-tensor, because scan treats the first tensor as the initializer, along with the first element of every sub-tensor.
import numpy as np
import tensorflow as tf
batch_x = np.random.randint(0, 10, size=(5, 10))
x = tf.placeholder(tf.float32, shape=[None, 10])
def inner_loop(x_in):
    return tf.scan(lambda _, x_: x_ + 1, x_in)
outer_loop = tf.scan(lambda _, input_: inner_loop(input_), x, back_prop=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    rs = sess.run(outer_loop, feed_dict={x: batch_x})
Any suggestions?
To loop over a tensor you could try tf.unstack
Unpacks the given dimension of a rank-R tensor into rank-(R-1) tensors.
So adding 1 to each tensor would look something like:
import numpy as np
import tensorflow as tf
# tf.unstack needs a statically known size along the unpacked axis,
# so the batch size is fixed to 5 here instead of None.
x = tf.placeholder(tf.float32, shape=(5, 10))
x_unpacked = tf.unstack(x) # defaults to axis 0, returns a list of tensors
processed = [] # this will be the list of processed tensors
for t in x_unpacked:
    # do whatever
    result_tensor = t + 1
    processed.append(result_tensor)
output = tf.concat(processed, 0)
with tf.Session() as sess:
    print(sess.run([output], feed_dict={x: np.zeros((5, 10))}))
Obviously you can further unpack each tensor from the list to process it, down to single elements. To avoid lots of nested unpacking though, you could maybe try flattening x with tf.reshape(x, [-1]) first, and then loop over it like
flattened_unpacked = tf.unstack(tf.reshape(x, [-1]))
for elem in flattened_unpacked:
    process(elem)
In this case elem is a scalar.
Most of tensorflow built-in functions could be applied elementwise. So you could just pass a tensor into a function. Like:
outer_loop = inner_loop(x)
However, if you have some function that could not be applied this way (it's really tempting to see that function), you could use map_fn.
Say, your function simply adds 1 to every element of a tensor (or whatever):
inputs = tf.placeholder...
def my_elementwise_func(x):
    return x + 1
def recursive_map(inputs):
    # Recurse until the static rank reaches 0, i.e. a scalar
    if inputs.shape.ndims > 0:
        return tf.map_fn(recursive_map, inputs)
    else:
        return my_elementwise_func(inputs)
result = recursive_map(inputs)
