Proper way to input a scalar into a Tensorflow 2 model - python

In my Tensorflow 2 model, I want my batch size to be parametric, such that I can build tensors which have appropriate batch size dynamically. I have the following code:
batch_size_param = 128
tf_batch_size = tf.keras.Input(shape=(), name="tf_batch_size", dtype=tf.int32)
batch_indices = tf.range(0, tf_batch_size, 1)
md = tf.keras.Model(inputs={"tf_batch_size": tf_batch_size}, outputs=[batch_indices])
res = md(inputs={"tf_batch_size": batch_size_param})
The code throws an error in tf.range:
ValueError: Shape must be rank 0 but is rank 1
for 'limit' for '{{node Range}} = Range[Tidx=DT_INT32](Range/start, tf_batch_size, Range/delta)' with input shapes: [], [?], []
I think the problem is that tf.keras.Input automatically prepends a batch dimension to the input: it expects the shape of a single sample without the batch size and infers the batch size from the array it is fed, which in my case is a scalar. I could feed the scalar as a constant integer into tf.range, but then I would not be able to change it after the model graph has been built.
Interestingly, I failed to find a proper way to input only a scalar into a TF-2 model even though I checked the documentation, too. So, what would be the best way to handle such a case?

Don't use tf.keras.Input and just define the model by subclassing.
import tensorflow as tf
class ScalarModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
    def call(self, x):
        return tf.range(0, x, 1)
print(ScalarModel()(10))
# tf.Tensor([0 1 2 3 4 5 6 7 8 9], shape=(10,), dtype=int32)

I'm not sure if this is actually a good idea, but you could use tf.squeeze like
import tensorflow as tf
from tensorflow import keras
inp = keras.Input(shape=(), dtype=tf.int32)
batch_indices = tf.range(tf.squeeze(inp))
model = keras.Model(inputs=inp, outputs=batch_indices)
so that
model(6)
gives
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5])>
Edit:
Depending on what you want to achieve, it might also be worth looking into ragged tensors:
inp = keras.Input(shape=(), dtype=tf.int32)
batch_indices = tf.ragged.range(inp)
model = keras.Model(inputs=inp, outputs=batch_indices)
would make
model(np.array([6,7]))
return
<tf.RaggedTensor [[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5, 6]]>

Related

tf slices with lambda layer only uses last index

TLDR
My for loop of Lambda layers, meant to produce tensor slices, only returns the last column of the data.
I have a (Batch_size, R)-shaped tensor that I will run through an embedding layer, separately for each of the R features. I wrote the following code to split the (Batch_size, R) input tensor into R slices of shape (None,).
R=2
inp = tf.keras.Input(shape = (R,), dtype=tf.int32)
SLICES = []
for i in range(R):
    slice_ = tf.keras.layers.Lambda(lambda a: a[:,i], name=f"slice_{i}", dtype=tf.int32)(inp)
    SLICES.append(slice_)
model = tf.keras.Model(inputs= inp, outputs = SLICES)
Running tf.keras.utils.plot_model(model, show_shapes=True, show_dtype=True) makes it appear that the code works. Running data into the model shows that there is a problem: the model takes the last feature and copies it for all layers.
input_ = np.array([[1,2],[3,4],[5,6]])
model.predict(input_)
[array([2, 4, 6], dtype=int32), array([2, 4, 6], dtype=int32)]
Approach 1
I "fixed" the problem in the R=2 case by getting rid of the for loop and writing each layer by hand.
slice1 = tf.keras.layers.Lambda(lambda a: a[:,0], name=f"first_slice", dtype=tf.int32)(inp)
slice2 = tf.keras.layers.Lambda(lambda a: a[:,1], name=f"second_slice", dtype=tf.int32)(inp)
model = tf.keras.Model(inputs= inp, outputs = [slice1, slice2])
input_ = np.array([[1,2],[3,4],[5,6]])
model.predict(input_)
[array([1, 3, 5], dtype=int32), array([2, 4, 6], dtype=int32)]
This is clearly undesirable for any number of reasons.
Approach 2
Another approach is to do the embedding on the raw features. Unfortunately, I have a CutMix like layer in front of the embedding operation, preventing me from embedding the raw features.
How can I get the for loop to correctly copy each slice of the tensor?
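Your first block does not work because Python closures bind late: each lambda looks up i when the layer is actually called, by which point the loop has finished and i holds its final value. Bind the current value as a default argument instead: lambda a,k=i: a[:,k]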
R=2
inp = tf.keras.Input(shape = (R,), dtype=tf.int32)
SLICES = []
for i in range(R):
    slice_ = tf.keras.layers.Lambda(lambda a,k=i: a[:,k], name=f"slice_{i}", dtype=tf.int32)(inp)
    SLICES.append(slice_)
model = tf.keras.Model(inp, SLICES)
input_ = np.array([[1,2],[3,4],[5,6]])
print(model.predict(input_))
Outputs:
[array([1, 3, 5], dtype=int32), array([2, 4, 6], dtype=int32)]
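The late binding behind this bug is ordinary Python closure behavior and can be reproduced without TensorFlow. A minimal sketch (the names `late`, `bound` and the sample `row` are illustrative and not taken from the question):

```python
# Closures look up `i` when they are called, not when they are defined,
# so every lambda here ends up using the final value of `i`.
late = [lambda row: row[i] for i in range(2)]

# A default argument is evaluated at definition time, which freezes
# the value `i` had in that iteration.
bound = [lambda row, k=i: row[k] for i in range(2)]

row = [10, 20]
late_result = [f(row) for f in late]
bound_result = [f(row) for f in bound]
print(late_result)   # [20, 20]
print(bound_result)  # [10, 20]
```

The same reasoning applies to the Lambda layers: the default argument `k=i` gives each layer its own column index.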

Formatting inputs for LSTM layer with variable timestep using Tensorflow

According to documentation, the LSTM layer should handle inputs with (None, CONST, CONST) shape. For variable timestep, it should be able to handle inputs with (None, None, CONST) shape.
Let's say my data is the following:
X = [
[
[1, 2, 3],
[4, 5, 6]
],
[
[7, 8, 9]
]
]
Y = [0, 1]
And my model :
model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(32, activation='tanh',input_shape=(None, 3)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X, Y)
My question is: how should I format these inputs to make this code work?
I cannot use pandas DataFrames here, as I am used to doing. If I run the code above, I get this error:
Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 2 arrays:
And if I change the last line with :
model.fit(np.array(X), np.array(Y))
The error is now :
Error when checking input: expected lstm_8_input to have 3 dimensions, but got array with shape (2, 1)
You are close, but in Keras / TensorFlow you need to pad your sequences and then use a Masking layer so that the LSTM skips the padded timesteps. Why? Because all entries in a tensor must have the same shape (batch_size, max_length, features), so variable-length sequences are padded to a common length.
You can use keras.preprocessing.sequence.pad_sequences to pad your sequences and obtain something like:
X = np.array([
    [
        [1, 2, 3],
        [4, 5, 6]
    ],
    [
        [7, 8, 9],
        [0, 0, 0],
    ]
])
# X.shape == (2, 2, 3)
Y = np.array([0, 1])
# Y.shape == (2,)
And then use the masking layer:
model = tf.keras.models.Sequential([
    tf.keras.layers.Masking(), # this tells LSTM to skip certain timesteps
    tf.keras.layers.LSTM(32, activation='tanh',input_shape=(None, 3)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(X, Y)
You also want binary_crossentropy since you have a binary classification problem with sigmoid output.
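To make the padding step concrete, here is a plain-Python sketch of what padding to a common length and deriving a mask amounts to. The helper `pad_post` is hypothetical and only illustrates the idea; in practice pad_sequences and the Masking layer do this work.

```python
def pad_post(sequences, pad_value=0):
    """Pad each sequence of feature vectors at the end, up to the longest length.

    Returns the padded sequences and a boolean mask that is True for real
    timesteps and False for padded ones.
    """
    max_len = max(len(seq) for seq in sequences)
    n_features = len(sequences[0][0])
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padding = [[pad_value] * n_features for _ in range(n_pad)]
        padded.append([list(step) for step in seq] + padding)
        mask.append([True] * len(seq) + [False] * n_pad)
    return padded, mask

X = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9]]]
X_padded, X_mask = pad_post(X)
print(X_padded)  # [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [0, 0, 0]]]
print(X_mask)    # [[True, True], [True, False]]
```

Note that pad_sequences pads at the start of each sequence by default; pass padding='post' to get the layout shown in the answer.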

Concatenate identity matrix to each vector

I want to modify my input by adding several different suffixes to the input vectors. For example, if the (single) input is [1, 5, 9, 3] I want to create three vectors (stored as matrix) like this:
[[1, 5, 9, 3, 1, 0, 0],
[1, 5, 9, 3, 0, 1, 0],
[1, 5, 9, 3, 0, 0, 1]]
Of course, this is just one observation so the input to the model is (None, 4) in this case. The simple way is to prepare the input data somewhere else (numpy most probably) and adjust the shape of input accordingly. That I can do but I would prefer doing it inside TensorFlow/Keras.
I have isolated the problem into this code:
import keras.backend as K
from keras import Input, Model
from keras.layers import Lambda
def build_model(dim_input: int, dim_eye: int):
    input = Input((dim_input,))
    concat = Lambda(lambda x: concat_eye(x, dim_input, dim_eye))(input)
    return Model(inputs=[input], outputs=[concat])
def concat_eye(x, dim_input, dim_eye):
    x = K.reshape(x, (-1, 1, dim_input))
    x = K.repeat_elements(x, dim_eye, axis=1)
    eye = K.expand_dims(K.eye(dim_eye), axis=0)
    eye = K.tile(eye, (-1, 1, 1))
    out = K.concatenate([x, eye], axis=2)
    return out
def main():
    import numpy as np
    n = 100
    dim_input = 20
    dim_eye = 3
    model = build_model(dim_input, dim_eye)
    model.compile(optimizer='sgd', loss='mean_squared_error')
    x_train = np.zeros((n, dim_input))
    y_train = np.zeros((n, dim_eye, dim_eye + dim_input))
    model.fit(x_train, y_train)
if __name__ == '__main__':
    main()
The problem seems to be the -1 in the second argument of the tile function. I tried replacing it with 1 and with None. Each has its own error:
-1: error during model.fit
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected multiples[0] >= 0, but got -1
1: error during model.fit
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [32,3,20] vs. shape[1] = [1,3,3]
None: error during build_model:
Failed to convert object of type <class 'tuple'> to Tensor. Contents: (None, 1, 1). Consider casting elements to a supported type.
Use K.shape() to get the symbolic (dynamic) shape of the input tensor. The batch size is None while the graph is being built, so passing K.int_shape(x)[0], None, or -1 as part of the second argument of K.tile() does not work:
eye = K.tile(eye, (K.shape(x)[0], 1, 1))
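The same reshape, repeat, tile, concatenate sequence can be checked eagerly in NumPy, where the batch size is always known. This is a sketch under that assumption, not the Keras code itself; `concat_eye_np` is a hypothetical name.

```python
import numpy as np

def concat_eye_np(x, dim_eye):
    """Append each row of a dim_eye identity matrix to every input vector."""
    batch, dim_input = x.shape
    x = x.reshape(batch, 1, dim_input)          # (batch, 1, dim_input)
    x = np.repeat(x, dim_eye, axis=1)           # (batch, dim_eye, dim_input)
    eye = np.eye(dim_eye, dtype=x.dtype)[None]  # (1, dim_eye, dim_eye)
    eye = np.tile(eye, (batch, 1, 1))           # multiples[0] must equal the batch size
    return np.concatenate([x, eye], axis=2)     # (batch, dim_eye, dim_input + dim_eye)

out = concat_eye_np(np.array([[1, 5, 9, 3]]), dim_eye=3)
print(out.shape)        # (1, 3, 7)
print(out[0].tolist())  # [[1, 5, 9, 3, 1, 0, 0], [1, 5, 9, 3, 0, 1, 0], [1, 5, 9, 3, 0, 0, 1]]
```

In the Keras version the batch size is not available as a Python integer, which is why the first tile multiple has to come from K.shape(x)[0].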

tensorflow: how to set the shape of tensor with different conditional statements?

I would like to train a network with two different input tensor shapes, where each epoch uses one of them.
Here is a small example:
import tensorflow as tf
import numpy as np
with tf.Session() as sess:
    imgs1 = tf.placeholder(tf.float32, [4, 224, 224, 3], name = 'input_imgs1')
    imgs2 = tf.placeholder(tf.float32, [4, 180, 180, 3], name = 'input_imgs2')
    epoch_num_tf = tf.placeholder(tf.int32, [], name = 'input_epoch_num')
    imgs = tf.cond(tf.equal(tf.mod(epoch_num_tf, 2), 0),
                   lambda: tf.Print(imgs2, [imgs2.get_shape()], message='(even number) input epoch number is '),
                   lambda: tf.Print(imgs1, [imgs1.get_shape()], message='(odd number) input epoch number is'))
    print(imgs.get_shape())
    for epoch in range(10):
        epoch_num = np.array(epoch).astype(np.int32)
        imgs1_input = np.ones([4, 224, 224, 3], dtype = np.float32)
        imgs2_input = np.ones([4, 180, 180, 3], dtype = np.float32)
        output = sess.run(imgs, feed_dict = {epoch_num_tf: epoch_num,
                                             imgs1: imgs1_input,
                                             imgs2: imgs2_input})
When I execute it, imgs.get_shape() returns (4, ?, ?, 3),
i.e. imgs.get_shape()[1]=None and imgs.get_shape()[2]=None.
However, later code will use the values from imgs.get_shape() to define the kernel size (ksize) and strides of tf.nn.max_pool(), e.g. ksize=[1,imgs.get_shape()[1]/6, imgs.get_shape()[2]/6, 1].
I think ksize and strides do not accept tf.Tensor values.
How can I solve this problem? Or how can I set the shape of imgs conditionally?
When you do print(a.get_shape()), you are getting the static shape of the tensor a. Assuming you mean imgs.get_shape() and not a.get_shape() in the code above, dimensions 1 and 2 of imgs vary dynamically with the value of epoch_num_tf. Therefore the static shape in those dimensions is unknown, which TensorFlow represents as None.
If you want to use the dynamic shape of imgs in subsequent code, you should use the tf.shape() operator to get the shape as a tf.Tensor. For example, instead of imgs.get_shape()[2], you can use tf.shape(imgs)[2].
Unfortunately, the ksize and strides arguments of tf.nn.max_pool() do not accept tf.Tensor values. (I think this is a historical limitation, because these were configured as "attrs" rather than "inputs" of the corresponding kernel. Please open a GitHub issue if you'd like to request this feature!) One possible workaround would be to use another tf.cond():
imgs = ...
# Could also use `tf.equal(tf.mod(epoch_num_tf, 2), 0)` as the predicate.
# Floor division keeps the window sizes integral under Python 3.
pool_output = tf.cond(tf.equal(tf.shape(imgs)[2], 180),
                      lambda: tf.nn.max_pool(imgs, ksize=[1, 180 // 6, 180 // 6, 1], ...),
                      lambda: tf.nn.max_pool(imgs, ksize=[1, 224 // 6, 224 // 6, 1], ...))
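Because ksize must be built from plain Python integers, the value for each branch can be computed ahead of time from the known input sizes. A small sketch, with `pool_ksize` as a hypothetical helper; note that in Python 3 the `/` operator yields a float, so floor division is used:

```python
def pool_ksize(height, width, divisor=6):
    """Return an NHWC ksize list whose window is 1/divisor of the image size."""
    return [1, height // divisor, width // divisor, 1]

ksize_small = pool_ksize(180, 180)
ksize_large = pool_ksize(224, 224)
print(ksize_small)  # [1, 30, 30, 1]
print(ksize_large)  # [1, 37, 37, 1]
```

Each list can then be passed as the ksize of the max_pool call in the matching tf.cond branch.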

tf.shape() get wrong shape in tensorflow

I define a tensor like this:
x = tf.get_variable("x", [100])
But when I try to print the shape of the tensor:
print( tf.shape(x) )
I get Tensor("Shape:0", shape=(1,), dtype=int32). Why is the output not shape=(100,)?
tf.shape(input, name=None) returns a 1-D integer tensor representing the shape of input.
You're looking for: x.get_shape() that returns the TensorShape of the x variable.
Update: I wrote an article to clarify the dynamic/static shapes in Tensorflow because of this answer: https://pgaleone.eu/tensorflow/2018/07/28/understanding-tensorflow-tensors-shape-static-dynamic/
Clarification:
tf.shape(x) creates an op and returns an object which stands for the output of the constructed op, which is what you are printing currently. To get the shape, run the operation in a session:
matA = tf.constant([[7, 8], [9, 10]])
shapeOp = tf.shape(matA)
print(shapeOp) #Tensor("Shape:0", shape=(2,), dtype=int32)
with tf.Session() as sess:
    print(sess.run(shapeOp)) #[2 2]
credit: After looking at the above answer, I saw the answer to tf.rank function in Tensorflow which I found more helpful and I have tried rephrasing it here.
Just a quick example, to make things clear:
a = tf.Variable(tf.zeros(shape=(2, 3, 4)))
print('-'*60)
print("v1", tf.shape(a))
print('-'*60)
print("v2", a.get_shape())
print('-'*60)
with tf.Session() as sess:
    print("v3", sess.run(tf.shape(a)))
print('-'*60)
print("v4",a.shape)
Output will be:
------------------------------------------------------------
v1 Tensor("Shape:0", shape=(3,), dtype=int32)
------------------------------------------------------------
v2 (2, 3, 4)
------------------------------------------------------------
v3 [2 3 4]
------------------------------------------------------------
v4 (2, 3, 4)
Also this should be helpful:
How to understand static shape and dynamic shape in TensorFlow?
Similar question is nicely explained in TF FAQ:
In TensorFlow, a tensor has both a static (inferred) shape and a
dynamic (true) shape. The static shape can be read using the
tf.Tensor.get_shape method: this shape is inferred from the operations
that were used to create the tensor, and may be partially complete. If
the static shape is not fully defined, the dynamic shape of a Tensor t
can be determined by evaluating tf.shape(t).
So tf.shape() returns a tensor, which always has shape (N,) where N is the rank of the input, and whose value can be computed in a session:
a = tf.Variable(tf.zeros(shape=(2, 3, 4)))
with tf.Session() as sess:
    print(sess.run(tf.shape(a)))
On the other hand, you can read the static shape with x.get_shape().as_list(), which works anywhere and does not need a session.
Simply, use tensor.shape to get the static shape:
In [102]: a = tf.placeholder(tf.float32, [None, 128])
# returns [None, 128]
In [103]: a.shape.as_list()
Out[103]: [None, 128]
Whereas to get the dynamic shape, use tf.shape():
dynamic_shape = tf.shape(a)
You can also get the shape as you would in NumPy, with your_tensor.shape, as in the following example.
In [11]: tensr = tf.constant([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]])
In [12]: tensr.shape
Out[12]: TensorShape([Dimension(2), Dimension(5)])
In [13]: list(tensr.shape)
Out[13]: [Dimension(2), Dimension(5)]
In [16]: print(tensr.shape)
(2, 5)
For tensors that can be evaluated, this also works:
In [33]: tf.shape(tensr).eval().tolist()
Out[33]: [2, 5]
TensorFlow 2.x (>= 2.0) compatible version of nessuno's solution:
x = tf.compat.v1.get_variable("x", [100])
print(x.get_shape())
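With eager execution in TensorFlow 2, the static/dynamic distinction still shows up inside tf.function. The following is a sketch, assuming TensorFlow 2.x is installed; `describe` and `seen_static` are illustrative names, not part of any answer above.

```python
import tensorflow as tf

seen_static = []  # filled at trace time with the static shape

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 128], dtype=tf.float32)])
def describe(a):
    # Python side effect: runs only while the function is being traced,
    # when the batch dimension is still unknown.
    seen_static.append(a.shape.as_list())
    # Dynamic shape: a tensor that is evaluated on every call.
    return tf.shape(a)

dyn_4 = describe(tf.zeros([4, 128])).numpy().tolist()
dyn_7 = describe(tf.zeros([7, 128])).numpy().tolist()
print(seen_static)   # [[None, 128]]
print(dyn_4, dyn_7)  # [4, 128] [7, 128]
```

The fixed input_signature means the function is traced once, so the static shape is recorded a single time while tf.shape reports the actual batch size of each call.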
