I'm trying to implement a custom Keras Layer in Tensorflow 2.0RC and need to concatenate a [None, Q] shaped tensor onto a [None, H, W, D] shaped tensor to produce a [None, H, W, D + Q] shaped tensor. It is assumed that the two input tensors have the same batch size even though it is not known beforehand. Also, none of H, W, D, and Q are known at write-time but are evaluated in the layer's build method when the layer is first called. The issue that I'm experiencing is when broadcasting the [None, Q] shaped tensor up to a [None, H, W, Q] shaped tensor in order to concatenate.
Here is an example of trying to create a Keras Model using the Functional API that performs variable-batch broadcasting from shape [None, 3] to shape [None, 5, 5, 3]:
import tensorflow as tf
import tensorflow.keras.layers as kl
import numpy as np
x = tf.keras.Input([3]) # Shape [None, 3]
y = kl.Reshape([1, 1, 3])(x) # Need to add empty dims before broadcasting
y = tf.broadcast_to(y, [-1, 5, 5, 3]) # Broadcast to shape [None, 5, 5, 3]
model = tf.keras.Model(inputs=x, outputs=y)
print(model(np.random.random(size=(8, 3))).shape)
Tensorflow produces the error:
InvalidArgumentError: Dimension -1 must be >= 0
And then when I change -1 to None it gives me:
TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [None, 5, 5, 3]. Consider casting elements to a supported type.
How can I perform the specified broadcasting?
You need to use the dynamic shape of y to determine the batch size. The dynamic shape of a tensor y is given by tf.shape(y) and is a tensor op representing the shape of y evaluated at runtime. The modified example demonstrates this by selecting between the old shape, [None, 1, 1, 3], and the new shape using tf.where.
import tensorflow as tf
import tensorflow.keras.layers as kl
import numpy as np
x = tf.keras.Input([3]) # Shape [None, 3]
y = kl.Reshape([1, 1, 3])(x) # Need to add empty dims before broadcasting
# Retain the batch and depth dimensions, but broadcast along H and W
broadcast_shape = tf.where([True, False, False, True],
                           tf.shape(y), [0, 5, 5, 0])
y = tf.broadcast_to(y, broadcast_shape) # Broadcast to shape [None, 5, 5, 3]
model = tf.keras.Model(inputs=x, outputs=y)
print(model(np.random.random(size=(8, 3))).shape)
# prints: "(8, 5, 5, 3)"
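For the original goal (concatenating a [None, Q] tensor onto a [None, H, W, D] tensor), the same dynamic-shape idea can be sketched outside the Functional API. This is only a sketch: the helper name concat_broadcast and the example sizes are made up for illustration, and it assumes TensorFlow 2.x with float32 inputs.

```python
import tensorflow as tf

@tf.function(input_signature=[
    tf.TensorSpec([None, None, None, None], tf.float32),  # [batch, H, W, D]
    tf.TensorSpec([None, None], tf.float32),              # [batch, Q]
])
def concat_broadcast(feature_map, vec):
    # Hypothetical helper, not part of TensorFlow.
    dyn = tf.shape(feature_map)                # runtime shape, batch size included
    vec = vec[:, tf.newaxis, tf.newaxis, :]    # [batch, 1, 1, Q]
    target = tf.stack([dyn[0], dyn[1], dyn[2], tf.shape(vec)[3]])
    tiled = tf.broadcast_to(vec, target)       # [batch, H, W, Q]
    return tf.concat([feature_map, tiled], axis=-1)  # [batch, H, W, D + Q]

out = concat_broadcast(tf.zeros([8, 5, 5, 4]), tf.ones([8, 3]))
print(out.shape)
```

Because every size is read from tf.shape at runtime, none of the batch size, H, W, D, or Q has to be known when the function is traced.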
References:
"TensorFlow: Shapes and dynamic dimensions"
Related
In my Tensorflow 2 model, I want my batch size to be parametric, such that I can build tensors which have appropriate batch size dynamically. I have the following code:
batch_size_param = 128
tf_batch_size = tf.keras.Input(shape=(), name="tf_batch_size", dtype=tf.int32)
batch_indices = tf.range(0, tf_batch_size, 1)
md = tf.keras.Model(inputs={"tf_batch_size": tf_batch_size}, outputs=[batch_indices])
res = md(inputs={"tf_batch_size": batch_size_param})
The code throws an error in tf.range:
ValueError: Shape must be rank 0 but is rank 1
for 'limit' for '{{node Range}} = Range[Tidx=DT_INT32](Range/start, tf_batch_size, Range/delta)' with input shapes: [], [?], []
I think the problem is that tf.keras.Input automatically adds a leading batch dimension to the input: it expects the shape of a single example, without the batch size, and infers the batch size from the array that is fed in, which in my case is a scalar. I could feed the scalar into tf.range as a constant integer instead, but then I would not be able to change it after the model graph has been compiled.
Interestingly, I failed to find a proper way to input only a scalar into a TF-2 model even though I checked the documentation, too. So, what would be the best way to handle such a case?
Don't use tf.keras.Input and just define the model by subclassing.
import tensorflow as tf
class ScalarModel(tf.keras.Model):
    def __init__(self):
        super().__init__()

    def call(self, x):
        return tf.range(0, x, 1)
print(ScalarModel()(10))
# tf.Tensor([0 1 2 3 4 5 6 7 8 9], shape=(10,), dtype=int32)
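If a full Keras model is not required, a plain tf.function with a scalar input signature might also do the job. This is a sketch under that assumption; the function name batch_indices is made up for illustration.

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.int32)])
def batch_indices(batch_size):
    # shape=[] declares a true scalar, so no batch dimension is prepended.
    return tf.range(0, batch_size, 1)

print(batch_indices(tf.constant(10)))
```

The batch size stays a runtime value here, so the same traced graph can be called with different sizes.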
I'm not sure if this is actually a good idea, but you could use tf.squeeze like
import tensorflow as tf
from tensorflow import keras
inp = keras.Input(shape=(), dtype=tf.int32)
batch_indices = tf.range(tf.squeeze(inp))
model = keras.Model(inputs=inp, outputs=batch_indices)
so that
model(6)
gives
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5])>
Edit:
Depending on what you want to achieve, it might also be worth looking into ragged tensors:
inp = keras.Input(shape=(), dtype=tf.int32)
batch_indices = tf.ragged.range(inp)
model = keras.Model(inputs=inp, outputs=batch_indices)
would make
model(np.array([6,7]))
return
<tf.RaggedTensor [[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5, 6]]>
Say I have two rank 1 tensors of different (important) length:
import tensorflow as tf
x = tf.constant([1, 2, 3])
y = tf.constant([4, 5])
Now I want to append y to the end of x to give me the tensor:
<tf.Tensor: shape=(5,), dtype=int32, numpy=array([1, 2, 3, 4, 5], dtype=int32)>
But I can't seem to figure out how.
I will be doing this inside a function that I will decorate with tf.function, and it is my understanding that everything needs to be tensorflow operations for the tf.function decorator to work. That is, converting x and y to numpy arrays and back to a tensor will cause problems.
Thanks!
EDIT:
The solution is to use tf.concat() as pointed out by @Andrey:
tf.concat([x, y], axis=0)
It turns out that the problem originated when trying to append a single number to the end of a rank 1 tensor as follows:
x = tf.constant([1, 2, 3])
y = tf.constant(5)
tf.concat([x, y], axis=0)
which fails since here y is a rank 0 tensor of shape (). This can be solved by writing:
x = tf.constant([1, 2, 3])
y = tf.constant([5])
tf.concat([x, y], axis=0)
since y will then be a rank 1 tensor of shape (1,).
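When the value to append may arrive as either a scalar or a vector, one possible approach is to flatten it to rank 1 before concatenating. The sketch below assumes TensorFlow 2.x; the helper name append_value is made up for illustration.

```python
import tensorflow as tf

@tf.function
def append_value(x, y):
    # Reshape to [-1]: a rank 0 tensor becomes shape (1,), a rank 1 tensor is unchanged.
    y = tf.reshape(y, [-1])
    return tf.concat([x, y], axis=0)

a = append_value(tf.constant([1, 2, 3]), tf.constant(5))
b = append_value(tf.constant([1, 2, 3]), tf.constant([4, 5]))
print(a.numpy(), b.numpy())
```

Everything inside the function is a TensorFlow op, so it can be traced by tf.function without a round trip through NumPy.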
Use tf.concat():
import tensorflow as tf
t1 = tf.constant([1, 2, 3])
t2 = tf.constant([4, 5])
output = tf.concat([t1, t2], 0)
I have two tensors, one of shape [None, 20, 2], and one of shape [None, 1].
I would like to do an operation on each of the sub-tensors in lockstep to produce a value such that I would end up with a tensor of shape [None, 1].
In python land, I would zip these two, and iterate over the result.
So, just to be clear, I'd like to write a function that takes a [20, 2]-shape tensor and a [1]-shape tensor, and produces a [1]-shape tensor, then apply this function to the [None, 20, 2] and [None, 1] tensors, to produce a [None, 1] tensor.
Hope I articulated that well enough; higher dimensionality makes my head spin sometimes.
This works for me (TensorFlow version 1.4.0)
import numpy as np
import tensorflow as tf
tf.reset_default_graph()
sess = tf.Session()
# Define placeholders with undefined first dimension.
a = tf.placeholder(dtype=tf.float32, shape=[None, 3, 4])
b = tf.placeholder(dtype=tf.float32, shape=[None, 1])
# Create some input data.
a_input = np.arange(24).reshape(2, 3, 4)
b_input = np.arange(2).reshape(2, 1)
# TensorFlow map function.
def f_tf(x):
    return tf.reduce_sum(x[0]) + tf.reduce_sum(x[1])
# Numpy map function (for validation of results).
def f_numpy(x):
    return np.sum(x[0]) + np.sum(x[1])
# Run TensorFlow function.
s = tf.map_fn(f_tf, [a, b], dtype=tf.float32)
sess.run(s, feed_dict={a: a_input, b: b_input})
# Output: array([ 66., 211.], dtype=float32)
# Run Numpy function.
for inp in zip(a_input, b_input):
    print(f_numpy(inp))
# Output:
# 66
# 211
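When the per-example function can be written with batched ops, as this sum can, the mapping step can be skipped entirely. The following is a sketch assuming TensorFlow 2.x eager execution, with concrete tensors standing in for the placeholders; it also keeps a trailing dimension so the result has the [None, 1] shape asked for in the question.

```python
import tensorflow as tf

# Concrete stand-ins for the [None, 3, 4] and [None, 1] inputs.
a = tf.reshape(tf.range(24, dtype=tf.float32), [2, 3, 4])
b = tf.reshape(tf.range(2, dtype=tf.float32), [2, 1])

# Reduce over every axis except the batch axis, then add per example.
per_example = tf.reduce_sum(a, axis=[1, 2]) + tf.reduce_sum(b, axis=1)  # shape [2]
result = per_example[:, tf.newaxis]                                     # shape [2, 1]
print(result.numpy())
```

This only applies when the function vectorizes cleanly; for arbitrary per-example logic, tf.map_fn remains the more general tool.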
I am having difficulty applying tf.scatter_nd_add() to 2D tensors. The documentation is a bit unclear and does not contain an example for sparse updates, only for full slice updates.
My case is the following:
updates - 2D tensor of shape [None, 6]
indices - 2D tensor of shape [None, 6]
ref - 2D Variable of zeros of shape [None, 6]
It is guaranteed that updates, indices and ref will always have their first dimension equal, but the size of that dimension can be varying. The update I want to perform looks like
for i, j:
    k = indices[i][j]
    ref[i][k] += updates[i][j]
Note that indices contains duplicates. tf.scatter_nd_add(ref, indices, updates) complains about a shape mismatch, and I cannot figure out how I need to restructure the tensors in order to perform the update.
I figured it out. Each 2D entry in indices must actually specify the absolute location that will get updated in ref. This means that indices must be 3D and then the non-vectorized update looks like:
for i, j:
    r, k = indices[i][j]
    ref[r][k] += updates[i][j]
In the above question it just happens that r is always equal to i.
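As a quick sanity check of these semantics, the loop can be mirrored in NumPy. This is only a reference sketch for validating results, not the TensorFlow solution; the helper name scatter_add_reference and the sample data are made up for illustration.

```python
import numpy as np

def scatter_add_reference(updates, col_indices):
    # Hypothetical helper: ref[i][col_indices[i][j]] += updates[i][j]
    ref = np.zeros_like(updates)
    rows = np.arange(updates.shape[0])[:, None]   # r == i for every entry
    # np.add.at is unbuffered, so duplicate indices accumulate.
    np.add.at(ref, (rows, col_indices), updates)
    return ref

updates = np.array([[1., 2., 3.], [4., 5., 6.]])
col_indices = np.array([[0, 0, 2], [1, 1, 1]])
print(scatter_add_reference(updates, col_indices))
```

With these inputs the first row collects 1 + 2 in column 0 and 3 in column 2, and the second row collects 4 + 5 + 6 in column 1.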
Here is a full Tensorflow implementation with varying shapes. For clarity, in the following example, col_indices corresponds to indices from the original question:
import tensorflow as tf
import numpy as np
updates = tf.placeholder(dtype=tf.float32, shape=[None, 6])
col_indices = tf.placeholder(dtype=tf.int32, shape=[None, 6])
row_indices = tf.cumsum(tf.ones_like(col_indices), axis=0, exclusive=True)
indices = tf.concat([tf.expand_dims(row_indices, axis=-1),
                     tf.expand_dims(col_indices, axis=-1)], axis=-1)
tmp_var = tf.Variable(0, trainable=False, dtype=tf.float32, validate_shape=False)
ref = tf.assign(tmp_var, tf.zeros_like(updates), validate_shape=False)
# This makes sure that ref is always 0 before scatter_nd_add() runs
with tf.control_dependencies([ref]):
    result = tf.scatter_nd_add(ref, indices, updates)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Create example input data
np_input = np.arange(0, 6, 1, dtype=np.int32)
np_input = np.tile(np_input[None,:], [10, 1])
res = sess.run(result, feed_dict={updates: np_input, col_indices: np_input})
print(res)
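In TensorFlow 2.x, where placeholders and sessions are not the default, a similar result can probably be obtained without a variable by using tf.tensor_scatter_nd_add on a zero tensor. This is a sketch under that assumption; the helper name scatter_add_rows and the sample data are made up for illustration.

```python
import tensorflow as tf

def scatter_add_rows(updates, col_indices):
    # Hypothetical helper: out[i][col_indices[i][j]] += updates[i][j]
    dyn_shape = tf.shape(col_indices)                        # runtime [rows, cols]
    row_indices = tf.broadcast_to(
        tf.range(dyn_shape[0])[:, tf.newaxis], dyn_shape)    # row number of each entry
    indices = tf.stack([row_indices, col_indices], axis=-1)  # [rows, cols, 2]
    # Start from zeros; duplicate indices are summed.
    return tf.tensor_scatter_nd_add(tf.zeros_like(updates), indices, updates)

updates = tf.constant([[1., 2., 3.], [4., 5., 6.]])
col_indices = tf.constant([[0, 0, 2], [1, 1, 1]])
print(scatter_add_rows(updates, col_indices).numpy())
```

Since the output is a fresh tensor rather than an assigned variable, no control dependency is needed to reset it between calls.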
I have two tensors: weighted with shape (None, 3) and D with shape (None, 3, 5). I want to multiply D by weighted so that the result, weighted * D, has shape (None, 3, 5).
I attached an image below: each scalar value in weighted should multiply every element of the corresponding row of D.
So I tried multiply([weighted, D]), but I got the error ValueError: Operands could not be broadcast together with shapes (3, 5) (3,). I assume this is caused by the different input shapes. How do I fix this?
Update
multiply([weighted, Permute((2, 1))(D)]) worked. I am not sure, but it seems the last dimension of the two shapes must match.
You can reshape weighted and use broadcasting to accomplish that. Like this:
weighted = weighted.reshape(-1, 3, 1)
result = weighted * D
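To see why the extra trailing axis matters, here is a small NumPy sketch with made-up values. Broadcasting aligns shapes from the right, so (2, 3, 1) lines up with (2, 3, 5), while (2, 3) would be compared against (3, 5) and fail.

```python
import numpy as np

weighted = np.array([[1., 2., 10.], [2., 1., 10.]])   # shape (2, 3)
D = np.ones((2, 3, 5))                                # shape (2, 3, 5)

# Add a trailing axis so each weight scales a whole row of 5 elements.
result = weighted[:, :, np.newaxis] * D               # shape (2, 3, 5)
print(result.shape)
print(result[0])
```

Each row result[b, i, :] is D[b, i, :] scaled by the single weight weighted[b, i].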
Update 1: The same concept (broadcasting) can be used in TensorFlow, for instance with tf.expand_dims(weighted, dim=2). My proof of concept:
import tensorflow as tf
import numpy as np
tf.reset_default_graph()
anp = np.array([[1, 2, 10], [2, 1, 10]])
bnp = np.random.random((2, 3, 5))
with tf.Session() as sess:
    weighted = tf.placeholder(tf.float32, shape=(None, 3))
    D = tf.placeholder(tf.float32, shape=(None, 3, 5))
    rweighted = tf.expand_dims(weighted, dim=2)
    result = rweighted * D
    r = sess.run(result, feed_dict={weighted: anp, D: bnp})
print(bnp)
print("--")
print(r)
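For TensorFlow 2.x, the same proof of concept can be sketched in eager mode, without a session or placeholders. Note that the keyword is written axis here; to my knowledge dim is the older, deprecated spelling. The input values are made up for illustration.

```python
import numpy as np
import tensorflow as tf

weighted = tf.constant([[1., 2., 10.], [2., 1., 10.]])            # shape (2, 3)
D = tf.constant(np.random.random((2, 3, 5)).astype(np.float32))   # shape (2, 3, 5)

# Insert a trailing axis so the weights broadcast across the last dimension of D.
result = tf.expand_dims(weighted, axis=2) * D                     # shape (2, 3, 5)
print(result.shape)
```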
For keras use the backend API:
from keras import backend as K
...
K.expand_dims(weighted, 2)