max value of a tensor in graph mode tensorflow - python

I have this tf code in graph mode (it has a training function wrapped by @tf.function) where I need to get the max value of a tensor x with type
<class 'tensorflow.python.framework.ops.Tensor'> Tensor("x_3:0", shape=(100,), dtype=int64).
Then I need to use the max value of x as one of the arguments in the shape argument of tf.reshape(). If I print the output of tf.reduce_max(x) I get Tensor("Max:0", shape=(), dtype=int64), which is an invalid argument for tf.reshape(). I have tried tf.reduce_max(x).numpy() and it throws the error message 'Tensor' object has no attribute 'numpy'
So how do I get the max value of a tensor in graph mode in tf 2.6.0?
UPDATE This is my code with the necessary details (I hope) to see what is going on:
MyModel.py
class MyModel(tf.keras.Model):
    def __init__(self, ..., hidden_size, name='model', **kwargs):
        super(MyModel, self).__init__(name=name, **kwargs)
        self.hidden_size = hidden_size
    def call(self, inputs, training=True):
        x1, input, target, length, y = inputs
        batch_size = input.shape[0]
        print('check point 2', length, tf.reduce_max(length))
        padded_outputs = tf.reshape(tf.boolean_mask(outputs_dec,mask), shape=(batch_size,tf.reduce_max(length),self.hidden_size))
        print('check point 3',padded_outputs.shape)
    @tf.function
    def train(self, inputs, optimizer):
        with tf.GradientTape() as tape:
            costs = self.call(inputs)
        gradients = tape.gradient(self.loss, self.params)
        optimizer.apply_gradients(zip(gradients, self.params))
train_mymodel.py
tr_data = tf.data.Dataset.from_tensor_slices((x1_tr,
                                              x2.input,
                                              x2.target,
                                              x2.length,
                                              y_tr))\
    .batch(args.batch_size)
while int(checkpoint.step) < args.epochs:
    for i, (x1_batch, input_batch, target_batch, length_batch, y_batch) in enumerate(tr_data):
        print('check point 1', length_batch)
        costs, obj_fn = mymodel.train((x1_batch, input_batch, target_batch, length_batch, y_batch),optimizer)
check point 1 tf.Tensor([300 300 ... 300 300], shape=(100,), dtype=int64)
check point 2 Tensor("x_3:0", shape=(100,), dtype=int64) Tensor("Max_1:0", shape=(), dtype=int64)
check point 3 (100, None, 500)
The shape of padded_outputs should be (100, 300, 500).
UPDATE2 The error happens when the graph is traced. If I hard-code shape=(batch_size,300,self.hidden_size) and use tf.print(batch_size,tf.reduce_max(length),self.hidden_size), then the code runs without error messages and the output of tf.print() is (100,300,500). Is there any way to avoid this behavior?

It should work by simply passing the reduced tensor as an argument:
import tensorflow as tf
tf.random.set_seed(1)
@tf.function
def reshape_on_max_value():
    tensor1 = tf.random.uniform((5, 2), maxval=5, dtype=tf.int32)
    tensor2 = tf.random.uniform((4, 1), maxval=5, dtype=tf.int32)
    x = tf.reduce_max(tensor1)
    tf.print(type(tensor1), type(tensor2))
    tf.print(tf.reshape(tensor2, [x, 1, 1]).shape)
reshape_on_max_value()
<class 'tensorflow.python.framework.ops.Tensor'> <class 'tensorflow.python.framework.ops.Tensor'>
TensorShape([4, 1, 1])
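To connect this to the question's setup: the (100, None, 500) printed at "check point 3" is the static shape seen while tracing, and it can contain None even when the reshape itself is fine. Here is a minimal sketch of that, assuming a flat tensor that holds exactly batch * max_len * hidden_size elements. The names reshape_to_max_length, flat, lengths and hidden_size are made up for illustration and are not from the question's code.

```python
import tensorflow as tf

@tf.function
def reshape_to_max_length(flat, lengths, hidden_size):
    # While the function is traced, lengths is a symbolic int64 tensor,
    # so its maximum is a scalar tensor rather than a Python int.
    max_len = tf.cast(tf.reduce_max(lengths), tf.int32)
    batch_size = tf.shape(lengths)[0]
    # tf.reshape accepts a shape built from scalar tensors.
    out = tf.reshape(flat, tf.stack([batch_size, max_len, hidden_size]))
    # tf.shape(out) is evaluated at run time, unlike the static out.shape.
    return out, tf.shape(out)

lengths = tf.constant([3, 3], dtype=tf.int64)
flat = tf.zeros([2 * 3 * 4])
out, runtime_shape = reshape_to_max_length(flat, lengths, 4)
print(runtime_shape.numpy())
```

If you need to inspect the shape inside the traced function, tf.print(tf.shape(...)) reports the run-time value, whereas print(x.shape) only shows what is known at trace time.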

Related

keras.Model ValueError/TypeError when output is from customized layer

I'm attempting to model a function as a Keras Functional model combined with a customized Keras Layer class. The idea is to simply have the Keras layer's call method use a predefined function. The function will take as input a given tensor, say tensor1, and compute a certain dot product for every value t in tensor1. The dot product relies on tensors (vectors) of different lengths, but the output is a tensor (vector) of the same length as tensor1.
When passing tensor1 into the function TheFunction(tensor) (defined below), it returns the output I expected. Moreover, when building the TheFunctionLayer object and passing in tensor1, the output is also as expected, being the same as that of TheFunction(tensor1).
The issue arises when I try to use Keras's Functional model on the same thing. I want to build the model object and pass tensor1 into it. However, when attempting to build the object, I get a ValueError regarding the dimensions of the tensors at play in the dot product. And if I make one small change to the output declaration in the definition of the model, I get a TypeError instead. Below is my source code and the issues in the order they arise:
Imports:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Layer
import numpy as np
Tensors involved:
tensor1 = [1,2,3,4,5,6,7,8,9,10]
tensor1 = tf.convert_to_tensor(tensor1, dtype=tf.float32)
tensor2 = [1,2,3,4,5]
tensor2 = tf.convert_to_tensor(tensor2, dtype=tf.float32)
tensor3 = [6,7,8,9,10]
tensor3 = tf.convert_to_tensor(tensor3, dtype=tf.float32)
The dot product function:
def TheFunction(tensor):
    tensor = tf.convert_to_tensor(tensor)
    def prod_term(t):
        return tensor3*t
    def dot(t):
        return tf.tensordot(tensor2,prod_term(t),axes=1)
    return tf.map_fn(lambda t: dot(t), tensor)
Result of passing tensor1 into the dot product function (no issues with dimensions):
<tf.Tensor: shape=(10,), dtype=float32, numpy=
array([ 130., 260., 390., 520., 650., 780., 910., 1040., 1170.,
1300.], dtype=float32)>
The customized Keras layer (only call, no __init__ needed):
class TheFunctionLayer(Layer):
    def call(self,tensor):
        return TheFunction(tensor)
Result of TheFunctionLayer()(tensor1) (again, no issues with dimensions):
<tf.Tensor: shape=(10,), dtype=float32, numpy=
array([ 130., 260., 390., 520., 650., 780., 910., 1040., 1170.,
1300.], dtype=float32)>
The Keras Functional model:
def TheFunctionModel(input_shape):
    x = Input(shape=input_shape)
    y = TheFunctionLayer()(x)
    model = keras.Model(inputs=x, outputs=y)
    return model
Result of building model object TheFunctionModel(tf.shape(tensor1)) (conflict with dimensions):
ValueError: Dimensions must be equal, but are 5 and 10 for '{{node the_function_layer_19/map/while/mul}} = Mul[T=DT_FLOAT](the_function_layer_19/map/while/mul/x, the_function_layer_19/map/while/TensorArrayV2Read/TensorListGetItem)' with input shapes: [5], [10].
What can possibly be going on here? I'm still new to the way TensorFlow and Keras work, but if the output of my function works and has no conflict with tensor dimensions, why does trying to build the model object give me this dimensionality issue?
The other thing I'll add is the TypeError I get when changing the output y = TheFunctionLayer()(x) in the definition of TheFunctionModel to y = TheFunctionLayer().call(x):
def TheFunctionModel(input_shape):
    x = Input(shape=input_shape)
    y = TheFunctionLayer().call(x)
    model = keras.Model(inputs=x, outputs=y)
    return model
Result of TheFunctionModel(tf.shape(tensor1)) (TypeError occurs instead of original ValueError):
TypeError: Could not build a TypeSpec for <KerasTensor: shape=(None, 10) dtype=float32 (created by layer 'tf.convert_to_tensor')> with type KerasTensor
If changing the output in the model definition is the way to get rid of the value error, then I'm not sure I understand the type error I get and how to fix it.
Suggestions and/or solutions to the first value error or even the second type error are very much appreciated.
You are forgetting the batch dimension and that is why it is not working. You will have to rewrite TheFunction like this:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Layer
import numpy as np
tensor2 = tf.constant([1,2,3,4,5], dtype=tf.float32)
tensor3 = tf.constant([6,7,8,9,10], dtype=tf.float32)
def TheFunction(tensor):
    def prod_term(t):
        return tensor3*t
    def dot(t):
        return tf.tensordot(tensor2,prod_term(t),axes=1)
    e = tf.zeros_like(tensor, dtype=tf.float32)
    i = tf.constant(0)
    while_condition = lambda i, e, tensor: tf.math.less(i, tf.shape(tensor)[0])
    def body(i, ta, tensor):
        tensor_shape = tf.shape(tensor)
        j = tf.repeat([i], tensor_shape[-1])
        indices = tf.stack([j, tf.range(tensor_shape[-1])], axis=1)
        ta = tf.tensor_scatter_nd_update(tf.cast(ta, dtype=tf.float32), indices, tf.map_fn(lambda t: dot(t), tensor[i]))
        return tf.add(i, 1), ta, tensor
    _, e, _ = tf.while_loop(while_condition, body, loop_vars=(i, e, tensor))
    return e
The tf.while_loop ensures that each sample in a batch is calculated independently of the others.
Your model:
tensor1 = [1,2,3,4,5,6,7,8,9,10]
tensor1 = tf.convert_to_tensor(tensor1, dtype=tf.float32)
class TheFunctionLayer(Layer):
    def call(self,tensor):
        return TheFunction(tensor)
def TheFunctionModel(input_shape = (10, )):
    x = Input(shape=input_shape)
    y = TheFunctionLayer()(x)
    model = keras.Model(inputs=x, outputs=y)
    return model
model = TheFunctionModel()
print(model(tf.expand_dims(tensor1, axis=0)))
tf.Tensor([[ 130. 260. 390. 520. 650. 780. 910. 1040. 1170. 1300.]], shape=(1, 10), dtype=float32)
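As a side note, for this particular function the loop may not be needed at all. For a scalar t, tensordot(tensor2, tensor3*t) equals t * tensordot(tensor2, tensor3), so the scalar product can be computed once and broadcast over the whole batch. The sketch below is an alternative I am suggesting, not part of the approach above, and the name the_function_vectorized is hypothetical.

```python
import tensorflow as tf

tensor2 = tf.constant([1, 2, 3, 4, 5], dtype=tf.float32)
tensor3 = tf.constant([6, 7, 8, 9, 10], dtype=tf.float32)

def the_function_vectorized(tensor):
    # dot(tensor2, tensor3 * t) == t * dot(tensor2, tensor3) for a scalar t,
    # so the dot product is computed once: 1*6 + 2*7 + 3*8 + 4*9 + 5*10 = 130.
    scale = tf.tensordot(tensor2, tensor3, axes=1)
    # Broadcasting applies the scalar to every element, whatever the batch size.
    return tensor * scale

batch = tf.reshape(tf.range(1.0, 11.0), (1, 10))  # shape (batch, features) = (1, 10)
result = the_function_vectorized(batch)
print(result.numpy())
```

Because it only relies on broadcasting, the same function handles inputs with or without a leading batch dimension.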

TensorFlow Custom Layer: Get the actual Batch Size

I would like to implement a custom tf layer that performs a mathematical operation involving the actual batch-size of the input tensor:
import tensorflow as tf
from tensorflow import keras
class MyLayer(keras.layers.Layer):
    def build(self, input_shape):
        self.batch_size = input_shape[0]
        super().build(input_shape)
    def call(self,input):
        self.batch_size + 1 # do something with the batch size
        return input
However, when building a graph, its value is initially None, which breaks the functionality in MyLayer:
input = keras.Input(shape=(10,))
x = MyLayer()(input)
TypeError: in user code:
<ipython-input-41-98e23e82198d>:11 call *
self.batch_size + 1 # do something with the batch size
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
Is there any way to make such layers work after the model has been constructed?
Use tf.shape to grab the batch size inside your layer's call method.
Example:
import tensorflow as tf
# custom layer
class MyLayer(tf.keras.layers.Layer):
    def __init__(self):
        super().__init__()
    def call(self, x):
        bs = tf.shape(x)[0]
        return x, tf.add(bs, 1)
# network
x_in = tf.keras.Input(shape=(10,))
x = MyLayer()(x_in)
# model def
model = tf.keras.models.Model(x_in, x)
# forward pass
_, shp = model(tf.random.normal([5, 10]))
# shape value
print(shp)
# tf.Tensor(6, shape=(), dtype=int32)
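The difference between the static shape (x.shape, known at trace time) and the dynamic shape (tf.shape(x), known at run time) can also be seen without Keras. A minimal sketch, assuming a tf.function whose input signature leaves the batch dimension open; the function name batch_size_plus_one is made up for illustration.

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 10], dtype=tf.float32)])
def batch_size_plus_one(x):
    # The static batch dimension is unknown while the graph is traced ...
    static_batch = x.shape[0]
    assert static_batch is None
    # ... but tf.shape(x) is a tensor that is filled in on every call.
    dynamic_batch = tf.shape(x)[0]
    return dynamic_batch + 1

a = batch_size_plus_one(tf.zeros([5, 10]))
b = batch_size_plus_one(tf.zeros([8, 10]))
print(int(a), int(b))  # 6 9
```

The same traced graph serves both calls, which is why the batch size has to be read with tf.shape rather than stored as a Python value in build().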

Tensorflow custom layer that loops over ragged tensor cannot be built

I am trying to customize a layer in tensorflow. The layer has to take a ragged tensor of unknown length as input, but the code gets stuck when trying to build the layer. Even the simple code attached below does not work properly.
import tensorflow as tf
class myLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(myLayer, self).__init__()
        self._supports_ragged_inputs = True
    def call(self, inputs):
        # Try to loop over ragged tensor
        for x in inputs:
            pass
        return tf.constant(0)
# Input is ragged tensor
inputs = tf.keras.layers.Input(shape=(None, 1), ragged=True)
layer1 = myLayer()
output = layer1(inputs)
When I ran your code in Tensorflow version 2.2.0, I got the below error in the for loop -
Error -
ValueError: in user code:
<ipython-input-24-1681d59017fc>:10 call *
for x in inputs:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/operators/control_flow.py:359 for_stmt
iter_, extra_test, body, get_state, set_state, symbol_names, opts)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/operators/control_flow.py:491 _tf_ragged_for_stmt
opts)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/operators/control_flow.py:885 _tf_while_stmt
aug_test, aug_body, init_vars, **opts)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py:2688 while_loop
back_prop=back_prop)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/while_v2.py:104 while_loop
maximum_iterations)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/while_v2.py:1258 _build_maximum_iterations_loop_var
maximum_iterations, dtype=dtypes.int32, name="maximum_iterations")
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:1317 convert_to_tensor
(dtype.name, value.dtype.name, value))
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype int64: <tf.Tensor 'my_layer_15/strided_slice:0' shape=() dtype=int64>
So I ran the experiment below to understand the types produced when iterating over inputs directly and with enumerate. The plain for loop yields a Tensor, whereas the index produced by enumerate is a Python int.
Experiment Code -
inputs = tf.keras.layers.Input(shape=(None, 1), ragged=True)
for x in inputs:
    print(type(x))
    break
for i,x in enumerate(inputs):
    print(type(i))
    break
Output -
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'int'>
So I modified your code as below and it worked fine -
Fixed Code -
import tensorflow as tf
class myLayer(tf.keras.layers.Layer):
    def __init__(self):
        super(myLayer, self).__init__()
        self._supports_ragged_inputs = True
    def call(self, inputs):
        # Try to loop over ragged tensor
        # for x in inputs: # Throws Error
        for i,x in enumerate(inputs): #Enumerate Works fine
            break #Using break as pass will go into loop
        return tf.constant(0)
# Input is ragged tensor
inputs = tf.keras.layers.Input(shape=(None, 1), ragged=True)
layer1 = myLayer()
output = layer1(inputs)
print(output)
Output -
Tensor("my_layer_17/Identity:0", shape=(), dtype=int32)
Hope this answers your question. Happy Learning.
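If the goal is to actually process every row rather than break out of the loop, tf.map_fn may be worth trying instead of a Python for loop. The sketch below is an alternative, not what the fix above does, and it assumes TensorFlow 2.3 or later, where tf.map_fn accepts ragged inputs and the fn_output_signature argument. The function name row_sums is made up.

```python
import tensorflow as tf

@tf.function
def row_sums(ragged):
    # tf.map_fn applies the function to each variable-length row;
    # fn_output_signature declares the dtype of each per-row result.
    return tf.map_fn(tf.reduce_sum, ragged, fn_output_signature=tf.int32)

rt = tf.ragged.constant([[1, 2, 3], [], [4, 5], [6]])
sums = row_sums(rt)
print(sums.numpy())
```

For simple reductions like this one, tf.reduce_sum(rt, axis=1) does the same job without any loop.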

how to write a keras custom loss function when you need the input value to calculate loss?

I'm trying to duplicate a fast style transfer paper (see diagram above) using the method described in keras built-in training and evaluation loops
I'm having problems understanding how to do this with a custom loss class (see below).
In order to calculate the loss components, I need the following:
y_hat, the generated image to get
(generated_content_features, generated_style_features) = VGG(y_hat)
generated_style_gram = [ utils.gram(value) for value in generated_style_features ]
target_style_gram which is static so I can derive once from target_style_features and cache, (_,target_style_features) = VGG(y_s)
x, the InputImage (same as y_c ContentTarget) to get (target_content_features, _) = VGG(x)
I find that I'm monkey-patching a whole lot of stuff in the loss class, tf.keras.losses.Loss, in order to derive these values and ultimately perform a loss calculation. This is particularly true of the target_content_features which requires the input image, something that I passed in through y_true, but that is obviously a hack
y_pred = generated_image # y_hat from diagram, shape=(b,256,256,3)
y_true = x # hack: access the input image here
lossFn = PerceptualLosses_Loss(VGG, target_style_gram)
loss = lossFn(y_true, y_pred)
class PerceptualLosses_Loss(tf.losses.Loss):
    name="PerceptualLosses_Loss"
    reduction=tf.keras.losses.Reduction.AUTO
    RGB_MEAN_NORMAL_VGG = tf.constant( [0.48501961, 0.45795686, 0.40760392], dtype=tf.float32)
    def __init__(self, loss_network, target_style_gram, loss_weights=None):
        super(PerceptualLosses_Loss, self).__init__( name=self.name, reduction=self.reduction )
        self.target_style_gram = target_style_gram # repeated in y_true
        print("PerceptualLosses_Loss init()", type(target_style_gram), type(self.target_style_gram))
        self.VGG = loss_network
    def call(self, y_true, y_pred):
        b,h,w,c = y_pred.shape
        #???: y_pred.shape=(None, 256,256,3), need batch dim for utils.gram(value)
        generated_batch = tf.reshape(y_pred, (BATCH_SIZE,h,w,c) )
        # generated_batch: expecting domain=(+-int), mean centered
        generated_batch = tf.nn.tanh(generated_batch) # domain=(-1.,1.), mean centered
        # reverse VGG mean_center
        generated_batch = tf.add( generated_batch, self.RGB_MEAN_NORMAL_VGG) # domain=(0.,1.)
        generated_batch_BGR_centered = tf.keras.applications.vgg19.preprocess_input(generated_batch*255.)/255.
        generated_content_features, generated_style_features = self.VGG( generated_batch_BGR_centered, preprocess=False )
        generated_style_gram = [ utils.gram(value) for value in generated_style_features ] # list
        y_pred = generated_content_features + generated_style_gram
        # print("PerceptualLosses_Loss: y_pred, output_shapes=", type(y_pred), [v.shape for v in y_pred])
        # PerceptualLosses_Loss: y_pred, output_shapes= [
        # TensorShape([4, 16, 16, 512]),
        # TensorShape([4, 64, 64]),
        # TensorShape([4, 128, 128]),
        # TensorShape([4, 256, 256]),
        # TensorShape([4, 512, 512]),
        # TensorShape([4, 512, 512])
        # ]
        if tf.is_tensor(y_true):
            # print("detect y_true is image", type(y_true), y_true.shape)
            x_train = y_true
            x_train_BGR_centered = tf.keras.applications.vgg19.preprocess_input(x_train*255.)/255.
            target_content_features, _ = self.VGG(x_train_BGR_centered, preprocess=False )
            # ???: target_content_features[0].shape=(None, None, None, 512), should be shape=(4, 16, 16, 512)
            target_content_features = [tf.reshape(v, generated_content_features[i].shape) for i,v in enumerate(target_content_features)]
        elif isinstance(y_true, tuple):
            print("detect y_true is tuple(target_content_features + self.target_style_gram)", y_true[0].shape)
            target_content_features = y_true[:len(generated_content_features)]
            if self.target_style_gram is None:
                self.target_style_gram = y_true[len(generated_content_features):]
        else:
            assert False, "unexpected result for y_true"
        # losses = tf.keras.losses.MSE(y_true, y_pred)
        def batch_reduce_sum(y_true, y_pred, weight, name):
            losses = tf.zeros(BATCH_SIZE)
            for a,b in zip(y_true, y_pred):
                # batch_reduce_sum()
                loss = tf.keras.losses.MSE(a,b)
                loss = tf.reduce_sum(loss, axis=[i for i in range(1,len(loss.shape))] )
                losses = tf.add(losses, loss)
            return tf.multiply(losses, weight, name="{}_loss".format(name)) # shape=(BATCH_SIZE,)
        c_loss = batch_reduce_sum(target_content_features, generated_content_features, CONTENT_WEIGHT, 'content_loss')
        s_loss = batch_reduce_sum(self.target_style_gram, generated_style_gram, STYLE_WEIGHT, 'style_loss')
        return (c_loss, s_loss)
I also tried to pre-calculate y_true in the tf.data.Dataset, but while it worked fine under eager execution, it caused an error during model.fit()
xy_true_Dataset = tf.data.Dataset.from_generator(
    xyGenerator_y_true(image_ds, VGG, target_style_gram),
    output_types=(tf.float32, (tf.float32, tf.float32,tf.float32,tf.float32,tf.float32,tf.float32) ),
    output_shapes=(
        (256,256,3),
        ( (16, 16, 512), (64, 64), (128, 128), (256, 256), (512, 512), (512, 512))
    ),
)
# eager execution, y_true: <class 'tuple'> [TensorShape([4, 16, 16, 512]), TensorShape([4, 64, 64]), TensorShape([4, 128, 128]), TensorShape([4, 256, 256]), TensorShape([4, 512, 512]), TensorShape([4, 512, 512])]
# model.fit(), y_true: <class 'tensorflow.python.framework.ops.Tensor'> (None, None, None, None)
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), for inputs ['output_1'] but instead got the following list of 6 arrays: [<tf.Tensor 'args_1:0' shape=(None, 16, 16, 512) dtype=float32>, <tf.Tensor 'args_2:0' shape=(None, 64, 64) dtype=float32>, <tf.Tensor 'args_3:0' shape=(None, 128, 128) dtype=float32>, <tf.Tensor 'arg...
Do I have the complete wrong approach to this problem?
Since you didn't show the model, I'm not entirely sure what the problem is. But you can try some of the following:
You said:
I also tried to pre-calculate y_true in the tf.data.Dataset, but while it worked fine under eager execution, it caused an error during model.fit()
You can enable eager execution when compiling the model, like: model.compile(..., run_eagerly=True)
The returned error says you passed the wrong number of arrays to "output_1". Use model.summary() to see the whole model and find which output is "output_1". Then check the model.
If you want to use other parameters for loss function, one can do something like:
def other_parameters(para1):
    def loss_fn(y_true, y_pred):
        # just an example
        return y_true - para1*y_pred
    # Don't forget to return "loss_fn"
    return loss_fn
When compiling the model, do model.compile(..., loss=other_parameters(para1)). Or you can define the loss as a class:
class CustomMSE(keras.losses.Loss):
    def __init__(self, regularization_factor=0.1, name="custom_mse"):
        super().__init__(name=name)
        self.regularization_factor = regularization_factor
    def call(self, y_true, y_pred):
        mse = tf.math.reduce_mean(tf.square(y_true - y_pred))
        reg = tf.math.reduce_mean(tf.square(0.5 - y_pred))
        return mse + reg * self.regularization_factor
...
model.compile(optimizer=keras.optimizers.Adam(), loss=CustomMSE(0.2))
...
model.fit(...)
For more details, see Keras: Training and evaluation with the built-in methods, in particular the sections Custom losses and Handling losses and metrics that don't fit the standard signature. Note that sometimes you may need to write your own training loop; then you can see this and this.
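To illustrate the custom training loop option: once you write the step yourself, the loss can take the input as an ordinary argument and nothing has to be smuggled in through y_true. This is only a rough sketch with a hypothetical one-parameter model (y_pred = w * x) and a made-up loss_with_input function; it is not the style-transfer loss from the question.

```python
import tensorflow as tf

# Hypothetical one-parameter "model": y_pred = w * x
w = tf.Variable(0.0)

def loss_with_input(x, y_true, y_pred):
    # Squared error plus a term that needs the input x itself, which the
    # (y_true, y_pred) signature of a Keras loss does not provide.
    mse = tf.reduce_mean(tf.square(y_true - y_pred))
    input_term = tf.reduce_mean(tf.square(y_pred - x))
    return mse + 0.1 * input_term

@tf.function
def train_step(x, y_true, learning_rate=0.1):
    with tf.GradientTape() as tape:
        y_pred = w * x
        loss = loss_with_input(x, y_true, y_pred)
    grad = tape.gradient(loss, w)
    w.assign_sub(learning_rate * grad)  # plain gradient-descent update
    return loss

x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([2.0, 4.0, 6.0])
losses = [float(train_step(x, y)) for _ in range(5)]
print(losses)
```

In a real model you would replace w with model.trainable_variables and the manual update with optimizer.apply_gradients.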
Hope these can help you.

How to fix "ValueError: Operands could not be broadcast together with shapes (2592,) (4,)" in Tensorflow?

I am currently designing a NoisyNet layer, as proposed here: "Noisy Networks for Exploration", in Tensorflow, and I get the dimensionality error shown in the title. The error is raised at the line filtered_output = keras.layers.merge.Multiply()([output, actions_input]), even though the two tensors multiplied element-wise there, output and actions_input, should (in principle) be compatible: according to the printed output, both have shape=(1, 4).
I am using Tensorflow 1.12.0 in Python3.
The relevant code looks as follows:
import numpy as np
import tensorflow as tf
import keras
class NoisyLayer(keras.layers.Layer):
    def __init__(self, in_shape=(1,2592), out_units=256, activation=tf.identity):
        super(NoisyLayer, self).__init__()
        self.in_shape = in_shape
        self.out_units = out_units
        self.mu_interval = 1.0/np.sqrt(float(self.out_units))
        self.sig_0 = 0.5
        self.activation = activation
        self.assign_resampling()
    def build(self, input_shape):
        # Initializer
        self.mu_initializer = tf.initializers.random_uniform(minval=-self.mu_interval, maxval=self.mu_interval) # Mu-initializer
        self.si_initializer = tf.initializers.constant(self.sig_0/np.sqrt(float(self.out_units))) # Sigma-initializer
        # Weights
        self.w_mu = tf.Variable(initial_value=self.mu_initializer(shape=(self.in_shape[-1], self.out_units), dtype='float32'), trainable=True) # (1,2592)x(2592,4) = (1,4)
        self.w_si = tf.Variable(initial_value=self.si_initializer(shape=(self.in_shape[-1], self.out_units), dtype='float32'), trainable=True)
        # Biases
        self.b_mu = tf.Variable(initial_value=self.mu_initializer(shape=(self.in_shape[0], self.out_units), dtype='float32'), trainable=True)
        self.b_si = tf.Variable(initial_value=self.si_initializer(shape=(self.in_shape[0], self.out_units), dtype='float32'), trainable=True)
    def call(self, inputs, resample_noise_flag):
        if resample_noise_flag:
            self.assign_resampling()
        # Putting it all together
        self.w = tf.math.add(self.w_mu, tf.math.multiply(self.w_si, self.w_eps))
        self.b = tf.math.add(self.b_mu, tf.math.multiply(self.b_si, self.q_eps))
        return self.activation(tf.linalg.matmul(inputs, self.w) + self.b)
    def assign_resampling(self):
        self.p_eps = self.f(self.resample_noise([self.in_shape[-1], 1]))
        self.q_eps = self.f(self.resample_noise([1, self.out_units]))
        self.w_eps = self.p_eps * self.q_eps # Cartesian product of input_noise x output_noise
    def resample_noise(self, shape):
        return tf.random.normal(shape, mean=0.0, stddev=1.0, seed=None, name=None)
    def f(self, x):
        return tf.math.multiply(tf.math.sign(x), tf.math.sqrt(tf.math.abs(x)))
frames_input = tf.ones((1, 84, 84, 4)) # Toy input
conv1 = keras.layers.Conv2D(16, (8, 8), strides=(4, 4), activation="relu")(frames_input)
conv2 = keras.layers.Conv2D(32, (4, 4), strides=(2, 2), activation="relu")(conv1)
flattened = keras.layers.Flatten()(conv2)
actionspace_size = 4
# NoisyNet
hidden = NoisyLayer(activation=tf.nn.relu)(inputs=flattened, resample_noise_flag=True)
output = NoisyLayer(in_shape=(1,256), out_units=actionspace_size)(inputs=hidden, resample_noise_flag=True)
actions_input = tf.ones((1,actionspace_size))
print('hidden:\n', hidden)
print('output:\n', output)
print('actions_input:\n', actions_input)
filtered_output = keras.layers.merge.Multiply()([output, actions_input])
The output, when I run the code, looks as follows:
hidden:
Tensor("noisy_layer_5/Relu:0", shape=(1, 256), dtype=float32)
output:
Tensor("noisy_layer_6/Identity:0", shape=(1, 4), dtype=float32)
actions_input:
Tensor("ones_5:0", shape=(1, 4), dtype=float32)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-f6df621eacab> in <module>()
68 print('actions_input:\n', actions_input)
69
---> 70 filtered_output = keras.layers.merge.Multiply()([output, actions_input])
2 frames
/usr/local/lib/python3.6/dist-packages/keras/layers/merge.py in _compute_elemwise_op_output_shape(self, shape1, shape2)
59 raise ValueError('Operands could not be broadcast '
60 'together with shapes ' +
---> 61 str(shape1) + ' ' + str(shape2))
62 output_shape.append(i)
63 return tuple(output_shape)
ValueError: Operands could not be broadcast together with shapes (2592,) (4,)
Particularly, I am wondering where the number 2592 in Operands could not be broadcast together with shapes (2592,) (4,) comes from, since the number coincides with the length of the flattened input tensor flattened to the first noisy layer, but is -as it seems to me- not part of the output dimension of the second noisy layer output anymore, which in turn serves as the input to the erroneous line indicated above.
Does anyone know what's going wrong?
Thanks in advance, Daniel
As stated in the custom layer documentation, you need to implement the compute_output_shape(input_shape) method:
compute_output_shape(input_shape): in case your layer modifies the
shape of its input, you should specify here the shape transformation
logic. This allows Keras to do automatic shape inference.
If you don't implement this method, Keras can't do shape inference without actually executing the computation, so the inferred output shape stays the same as the input shape:
print(keras.backend.int_shape(hidden))
print(keras.backend.int_shape(output))
(1, 2592)
(1, 2592)
So you need to add it as follows:
def compute_output_shape(self, input_shape):
    return (input_shape[0], self.out_units)
In addition, the build() method must set self.built = True at the end, which can be done by calling super(NoisyLayer, self).build(input_shape), according to the documentation.
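Putting both points together, a stripped-down layer could look like the sketch below. It is written against tf.keras rather than the standalone keras 1.12-era setup in the question, and ProjectionLayer is a simplified stand-in I made up (no noise terms), so treat it as an illustration of where compute_output_shape and the super().build() call go rather than as a drop-in NoisyLayer.

```python
import tensorflow as tf

class ProjectionLayer(tf.keras.layers.Layer):
    """Simplified stand-in for NoisyLayer: maps (batch, in_dim) to (batch, out_units)."""

    def __init__(self, out_units=4, **kwargs):
        super().__init__(**kwargs)
        self.out_units = out_units

    def build(self, input_shape):
        # Create the weights from the actual input shape, not a hard-coded one.
        self.w = self.add_weight(
            name="w", shape=(int(input_shape[-1]), self.out_units),
            initializer="glorot_uniform", trainable=True)
        self.b = self.add_weight(
            name="b", shape=(self.out_units,), initializer="zeros", trainable=True)
        super().build(input_shape)  # marks the layer as built

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def compute_output_shape(self, input_shape):
        # Only the last axis changes; the batch axis is passed through.
        return (input_shape[0], self.out_units)

layer = ProjectionLayer(out_units=4)
out = layer(tf.ones((1, 2592)))
print(out.shape, layer.compute_output_shape((1, 2592)))
```

With the output shape reported as (1, 4), a following merge layer has the information it needs to check that the operands are compatible.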
