How do I assign names to outputs in a subclassed Keras Model? - python

I want to name the outputs of a subclassed TensorFlow Keras Model, so I can pass targets to them in fit(), e.g. self.model.fit(np_inputs, {'q_values': np_targets}, verbose=0)
The model looks like this:
class MyModel(tf.keras.models.Model):
    def __init__(self, name):
        super(MyModel, self).__init__()
        self.input_layer = tf.keras.Input(shape=(BOARD_SIZE * 3,))
        self.d1 = tf.keras.layers.Dense(BOARD_SIZE * 3 * 9, activation='relu')
        self.d2 = tf.keras.layers.Dense(BOARD_SIZE * 3 * 100, activation='relu')
        self.d3 = tf.keras.layers.Dense(BOARD_SIZE * 3 * 9, activation='relu')
        self.q_values_l = tf.keras.layers.Dense(BOARD_SIZE, activation=None, name='q_values')
        self.probabilities_l = tf.keras.layers.Softmax(name='probabilities')

    @tf.function
    def call(self, input_data):
        x = self.d1(input_data)
        x = self.d2(x)
        x = self.d3(x)
        q = self.q_values_l(x)
        p = self.probabilities_l(q)
        return p, q
I naively assumed the names of the corresponding layers would also be assigned to the outputs, but this does not seem to be the case.
I only have targets for one of the outputs, hence the need to specify exactly which output the targets are for when calling fit().
With the functional Keras API this works well, but I can't replicate it with the subclassing approach. I can't use the functional API in my case for unrelated reasons.

Why not just pass a dummy target?
model.fit(np_inputs, [np.zeros((len(np_inputs),)), np_targets], ...)
Maybe even None can be passed instead of np.zeros.
You can compile the model exactly the same way:
model.compile(loss=[p_loss, q_loss], ...)
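Another option, sketched here under assumptions, is to return a dict from call() so each output carries an explicit name, and to train with a small custom step that only touches the named output you have targets for. Whether fit() accepts a dict of targets keyed by these names depends on the TensorFlow/Keras version, so this sketch avoids relying on it. BOARD_SIZE = 9, the layer sizes, the class name NamedOutputModel and the random data are placeholders, not taken from the question.

```python
import numpy as np
import tensorflow as tf

BOARD_SIZE = 9  # assumed value; the question does not state it


class NamedOutputModel(tf.keras.Model):
    """Hypothetical, reduced version of the model from the question."""

    def __init__(self):
        super().__init__()
        self.hidden = tf.keras.layers.Dense(32, activation="relu")
        self.q_values_l = tf.keras.layers.Dense(BOARD_SIZE)
        self.probabilities_l = tf.keras.layers.Softmax()

    def call(self, inputs):
        q = self.q_values_l(self.hidden(inputs))
        # Returning a dict gives every output an explicit name.
        return {"q_values": q, "probabilities": self.probabilities_l(q)}


model = NamedOutputModel()
optimizer = tf.keras.optimizers.Adam(1e-3)
mse = tf.keras.losses.MeanSquaredError()


def train_step(x, q_targets):
    with tf.GradientTape() as tape:
        outputs = model(x)
        # Only the named output that has targets contributes to the loss.
        loss = mse(q_targets, outputs["q_values"])
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


np_inputs = np.random.rand(16, BOARD_SIZE * 3).astype("float32")
np_targets = np.random.rand(16, BOARD_SIZE).astype("float32")
loss = train_step(np_inputs, np_targets)
print(float(loss))
```

The probabilities output is still available by name at prediction time, it just never needs a target.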

Related

Obtain the output of intermediate layer (Functional API) and use it in SubClassed API

The Keras documentation says that to get an intermediate layer's output from a (Sequential or functional) model, all we need to do is the following:
model = ... # create the original model
layer_name = 'my_layer'
intermediate_layer_model = keras.Model(inputs=model.input,
                                       outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model(data)
So, here we get two models; intermediate_layer_model is a sub-model of its parent model, and the two can be used independently. Likewise, we can take the intermediate layer's output feature maps of the parent (base) model, apply some operation to them, and feed the resulting feature maps back into the parent model.
input = tf.keras.Input(shape=(size,size,3))
model = tf.keras.applications.DenseNet121(input_tensor = input)
layer_name = "conv1_block1" # for example
output_feat_maps = SomeOperationLayer()(model.get_layer(layer_name).output)
# assume, they're able to add up
base = Add()([model.output, output_feat_maps])
# bind all
imputed_model = tf.keras.Model(inputs=[model.input], outputs=base)
So, in this way we have one modified model. It's quite easy with the functional API. Most of the Keras ImageNet models are written with the functional API, and we can use these models inside the model subclassing API. My concern is what to do if we need the intermediate output feature maps of these functional API models inside the call function.
class Subclass(tf.keras.Model):
    def __init__(self, dim):
        super(Subclass, self).__init__()
        self.dim = dim
        self.base = DenseNet121(input_shape=self.dim)
        # building new model with the desired output layer of base model
        self.mid_layer_model = tf.keras.Model(self.base.inputs,
                                              self.base.get_layer(layer_name).output)

    def call(self, inputs):
        # forward with base model
        x = self.base(inputs)
        # forward with mid_layer_model
        mid_feat = self.mid_layer_model(inputs)
        # do some op with it
        mid_x = SomeOperationLayer()(mid_feat)
        # assume, they're able to add up
        out = tf.keras.layers.add([x, mid_x])
        return out
The issue is that here we technically have two models joined together. But rather than building a model like this, we simply want the intermediate output feature maps (for some inputs) from the base model's forward pass, to use them somewhere else and get some output. Like this:
mid_x = SomeOperationLayer()(self.base.get_layer(layer_name).output)
But it gives ValueError: Graph disconnected. So, currently, we have to build a new model from the base model based on our desired intermediate layer. In the __init__ method we create a new self.mid_layer_model that gives our desired output feature maps: mid_feat = self.mid_layer_model(inputs). Next, we take mid_feat, apply some operation to get mid_x, and lastly add it to the base output with tf.keras.layers.add([x, mid_x]). So creating a new model with the desired intermediate output works, but at the same time we repeat the same computation twice, i.e. in the base model and in its sub-model. Maybe I'm missing something obvious; please point it out. Is this just how it is, or are there strategies we can adopt? I've asked in the forum here, no response yet.
Update
Here is a working example. Let's say we have a custom layer like this
import tensorflow as tf
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.layers import Add
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
class ConvBlock(tf.keras.layers.Layer):
    def __init__(self, kernel_num=32, kernel_size=(3,3), strides=(1,1), padding='same'):
        super(ConvBlock, self).__init__()
        # conv layer
        self.conv = tf.keras.layers.Conv2D(kernel_num,
                                           kernel_size=kernel_size,
                                           strides=strides, padding=padding)
        # batch norm layer
        self.bn = tf.keras.layers.BatchNormalization()

    def call(self, input_tensor, training=False):
        x = self.conv(input_tensor)
        x = self.bn(x, training=training)
        return tf.nn.relu(x)
And we want to insert this layer into an ImageNet model and construct a model like this
input = tf.keras.Input(shape=(32, 32, 3))
base = DenseNet121(weights=None, input_tensor = input)
# get output feature maps of at certain layer, ie. conv2_block1_0_relu
cb = ConvBlock()(base.get_layer("conv2_block1_0_relu").output)
flat = Flatten()(cb)
dense = Dense(1000)(flat)
# adding up
adding = Add()([base.output, dense])
model = tf.keras.Model(inputs=[base.input], outputs=adding)
from tensorflow.keras.utils import plot_model
plot_model(model,
show_shapes=True, show_dtype=True,
show_layer_names=True,expand_nested=False)
Here the computation from the input to layer conv2_block1_0_relu is performed only once. Next, if we want to translate this functional API model to the subclassing API, we have to build a model from the base model's input to layer conv2_block1_0_relu first. Like
class ModelWithMidLayer(tf.keras.Model):
    def __init__(self, dim=(32, 32, 3)):
        super().__init__()
        self.dim = dim
        self.base = DenseNet121(input_shape=self.dim, weights=None)
        # building sub-model from self.base which gives
        # desired output feature maps: ie. conv2_block1_0_relu
        self.mid_layer = tf.keras.Model(self.base.inputs,
                                        self.base.get_layer("conv2_block1_0_relu").output)
        self.flat = Flatten()
        self.dense = Dense(1000)
        self.add = Add()
        self.cb = ConvBlock()

    def call(self, x):
        # forward with base model
        bx = self.base(x)
        # forward with mid layer
        mx = self.mid_layer(x)
        # make same shape or do whatever
        mx = self.dense(self.flat(mx))
        # combine
        out = self.add([bx, mx])
        return out

    def build_graph(self):
        x = tf.keras.layers.Input(shape=(self.dim))
        return tf.keras.Model(inputs=[x], outputs=self.call(x))

mwml = ModelWithMidLayer()
plot_model(mwml.build_graph(),
           show_shapes=True, show_dtype=True,
           show_layer_names=True,expand_nested=False)
Here model_1 is actually a sub-model of DenseNet, which probably leads the whole model (ModelWithMidLayer) to compute the same operations twice. If this observation is correct, then this is a concern.
I thought it might be quite complex, but it's actually very simple. We just need to build a model with the desired output layers in the __init__ method and use it normally in the call method.
import tensorflow as tf
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.layers import Add
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
class ConvBlock(tf.keras.layers.Layer):
    def __init__(self, kernel_num=32, kernel_size=(3,3), strides=(1,1), padding='same'):
        super(ConvBlock, self).__init__()
        # conv layer
        self.conv = tf.keras.layers.Conv2D(kernel_num,
                                           kernel_size=kernel_size,
                                           strides=strides, padding=padding)
        # batch norm layer
        self.bn = tf.keras.layers.BatchNormalization()

    def call(self, input_tensor, training=False):
        x = self.conv(input_tensor)
        x = self.bn(x, training=training)
        return tf.nn.relu(x)
class ModelWithMidLayer(tf.keras.Model):
    def __init__(self, dim=(32, 32, 3)):
        super().__init__()
        self.dim = dim
        self.base = DenseNet121(input_shape=self.dim, weights=None)
        # building sub-model from self.base which gives
        # desired output feature maps: ie. conv2_block1_0_relu
        self.mid_layer = tf.keras.Model(
            inputs=self.base.inputs,
            outputs=[
                self.base.get_layer("conv2_block1_0_relu").output,
                self.base.output])
        self.flat = Flatten()
        self.dense = Dense(1000)
        self.add = Add()
        self.cb = ConvBlock()

    def call(self, x):
        # one forward pass returns both tensors, so the shared layers run once:
        # mx is base.get_layer("conv2_block1_0_relu").output, bx is self.base.output
        mx, bx = self.mid_layer(x)
        # make same shape or do whatever
        mx = self.dense(self.flat(mx))
        # combine
        out = self.add([bx, mx])
        return out

    def build_graph(self):
        x = tf.keras.layers.Input(shape=(self.dim))
        return tf.keras.Model(inputs=[x], outputs=self.call(x))

mwml = ModelWithMidLayer()
tf.keras.utils.plot_model(mwml.build_graph(),
                          show_shapes=True, show_dtype=True,
                          show_layer_names=True,expand_nested=False)
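To make the idea easier to try without downloading or building DenseNet121, here is a minimal sketch on a tiny functional model. The two-layer network and the layer names "mid" and "head" are stand-ins I made up for illustration; only the pattern (one model with two outputs, called once) mirrors the answer above.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for DenseNet121 (an assumption, to keep the sketch light).
inputs = tf.keras.Input(shape=(8,))
h = tf.keras.layers.Dense(16, activation="relu", name="mid")(inputs)
out = tf.keras.layers.Dense(4, name="head")(h)
base = tf.keras.Model(inputs, out)

# One extractor that exposes the intermediate tensor and the final output.
extractor = tf.keras.Model(
    inputs=base.input,
    outputs=[base.get_layer("mid").output, base.output],
)

x = np.random.rand(2, 8).astype("float32")
mid_feat, base_out = extractor(x)  # a single call yields both results
print(mid_feat.shape, base_out.shape)
```

Because the extractor reuses the base model's layers, its weights are shared with base, so training one updates the other.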

How to learn two functions simultaneously in Python (either PyTorch or TensorFlow)?

I have three series of observations, namely Y, T, and X. I would like to study the differences between the predicted values of the two models. The first model is to learn g such that Y=g(T, X). The second model is to learn L and f such that Y=L(T)f(X). I have no problem in learning the first model using the PyTorch package or the Tensorflow package. However, I am not sure how to learn L and f. In using the PyTorch package, I can set up two feedforward MLPs with different hidden layers and inputs. For simplicity, I define a Feedforward MLP class as follows:
import torch as t  # the code below refers to torch through the alias t

class Feedforward(t.nn.Module): # the definition of a feedforward neural network
    # Basic definition
    def __init__(self, input_size, hidden_size):
        super(Feedforward, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.fc1 = t.nn.Linear(self.input_size, self.hidden_size)
        self.relu = t.nn.ReLU()
        self.fc2 = t.nn.Linear(self.hidden_size, 1)
        self.sigmoid = t.nn.Sigmoid()

    # Forward pass
    def forward(self, x):
        hidden = self.fc1(x)
        relu = self.relu(hidden)
        output = self.fc2(relu)
        output = self.sigmoid(output)
        return output
Suppose L=Feedforward(2,10) and f=Feedforward(3,9). From my understanding, I can only learn either L or f, but not both simultaneously. Is it possible to learn L and f simultaneously using Y, T, and X?
I may be missing something, but I think you can:
from torch.optim import Adam

L = Feedforward(2,10)
f = Feedforward(3,9)
L_opt = Adam(L.parameters(), lr=...)
f_opt = Adam(f.parameters(), lr=...)
for (x,t,y) in dataset:
    L.zero_grad()
    f.zero_grad()
    y_pred = L(t)*f(x)
    loss = ((y-y_pred)**2).mean()  # reduce to a scalar so backward() also works on batches
    loss.backward()
    L_opt.step()
    f_opt.step()
You can also fuse them together in one single model:
class ProductModel(t.nn.Module):
    def __init__(self, L, f):
        super().__init__()  # must run before submodules are assigned
        self.L = L
        self.f = f

    def forward(self, x, t):
        return self.L(t)*self.f(x)
and then train this model like you trained g
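As a rough end-to-end sketch of the fused version: a single optimizer over model.parameters() updates both networks in the same step, because the gradient of the loss flows through the product into L and f alike. The synthetic tensors, the learning rate, the number of steps and the Sequential-based Feedforward below are assumptions for illustration only.

```python
import torch


class Feedforward(torch.nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(input_size, hidden_size),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden_size, 1),
            torch.nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)


class ProductModel(torch.nn.Module):
    def __init__(self, L, f):
        super().__init__()  # required before assigning submodules
        self.L = L
        self.f = f

    def forward(self, x, t):
        return self.L(t) * self.f(x)


torch.manual_seed(0)
model = ProductModel(Feedforward(2, 10), Feedforward(3, 9))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Synthetic stand-ins for the observations T, X and Y.
T = torch.rand(64, 2)
X = torch.rand(64, 3)
Y = torch.rand(64, 1)

for _ in range(5):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(X, T), Y)
    loss.backward()   # gradients reach the parameters of both L and f
    optimizer.step()
print(loss.item())
```

Note that L and f are only identified up to a scale factor here (c * L and f / c give the same product), so comparing the product with g is more meaningful than interpreting L or f alone.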

tensorflow 2 : loss using hidden layers output

I am trying to implement the OSME MAMC model described in the article https://arxiv.org/abs/1806.05372.
I'm stuck where I have to add a cost that doesn't depend on y_true and y_pred but on hidden layers and y_true.
It can't be written as a standard TensorFlow custom loss, which takes only y_true and y_pred.
I wrote the model as a class, then tried to use GradientTape to add NPairLoss to the softmax output loss, but the gradient is NaN during training.
I think my approach isn't good, but I have no idea how to design / write it.
Here is my model:
class OSME_network(tf.keras.Model):
    def __init__(self, nbrclass=10, weight="imagenet",input_tensor=(32,32,3)):
        super(OSME_network, self).__init__()
        self.nbrclass = nbrclass
        self.weight = weight
        self.input_tensor=input_tensor
        self.Resnet_50=ResNet50(include_top=False, weights=self.weight, input_shape=self.input_tensor)
        self.Resnet_50.trainable=False
        self.split=Lambda(lambda x: tf.split(x,num_or_size_splits=2,axis=-1))
        self.s_1=OSME_Layer(ch=1024,ratio=16)
        self.s_2=OSME_Layer(ch=1024,ratio=16)
        self.fl1=tf.keras.layers.Flatten()
        self.fl2=tf.keras.layers.Flatten()
        self.d1=tf.keras.layers.Dense(1024, name='fc1')
        self.d2=tf.keras.layers.Dense(1024,name='fc2')
        self.fc=Concatenate()
        self.preds=tf.keras.layers.Dense(self.nbrclass,activation='softmax')

    @tf.function
    def call(self,x): # used to build the model sequentially
        x=self.Resnet_50(x)
        x_1,x_2=self.split(x)
        xx_1 = self.s_1(x_1)
        xx_2 = self.s_2(x_2)
        xxx_1 = self.d1(xx_1)
        xxx_2 = self.d2(xx_2)
        xxxx_1 = self.fl1(xxx_1)
        xxxx_2 = self.fl2(xxx_2)
        fc = self.fc([xxxx_1,xxxx_2]) #fc1 + fc2
        ret=self.preds(fc)
        return xxxx_1,xxxx_2,ret
class OSME_Layer(tf.keras.layers.Layer):
    def __init__(self,ch,ratio):
        super(OSME_Layer,self).__init__()
        self.GloAvePool2D=GlobalAveragePooling2D()
        self.Dense1=Dense(ch//ratio,activation='relu')
        self.Dense2=Dense(ch,activation='sigmoid')
        self.Mult=Multiply()
        self.ch=ch

    def call(self,inputs):
        squeeze=self.GloAvePool2D(inputs)
        se_shape = (1, 1, self.ch)
        se = Reshape(se_shape)(squeeze)
        excitation=self.Dense1(se)
        excitation=self.Dense2(excitation)
        scale=self.Mult([inputs,excitation])
        return scale
class NPairLoss():
    def __init__(self):
        self._inputs = None
        self._y=None

    @tf.function
    def __call__(self,inputs,y):
        targets=tf.argmax(y, axis=1)
        b, p, _ = inputs.shape
        n = b * p
        inputs=tf.reshape(inputs, [n, -1])
        targets = tf.repeat(targets,repeats=p)
        parts = tf.tile(tf.range(p),[b])
        prod=tf.linalg.matmul(inputs,inputs,transpose_a=False,transpose_b=True)
        same_class_mask = tf.math.equal(tf.broadcast_to(targets,[n, n]),tf.transpose(tf.broadcast_to(targets,(n, n))))
        same_atten_mask = tf.math.equal(tf.broadcast_to(parts,[n, n]),tf.transpose(tf.broadcast_to(parts,(n, n))))
        s_sasc = same_class_mask & same_atten_mask
        s_sadc = (~same_class_mask) & same_atten_mask
        s_dasc = same_class_mask & (~same_atten_mask)
        s_dadc = (~same_class_mask) & (~same_atten_mask)
        loss_sasc = 0
        loss_sadc = 0
        loss_dasc = 0
        for i in range(n):
            #loss_sasc
            pos = prod[i][s_sasc[i]]
            neg = prod[i][s_sadc[i] | s_dasc[i] | s_dadc[i]]
            n_pos=tf.shape(pos)[0]
            n_neg=tf.shape(neg)[0]
            pos = tf.transpose(tf.broadcast_to(pos,[n_neg,n_pos]))
            neg = tf.broadcast_to(neg,[n_pos,n_neg])
            exp=tf.clip_by_value(tf.math.exp(neg - pos),clip_value_min=0,clip_value_max=9e6) # need to clip value, else inf
            loss_sasc += tf.reduce_sum(tf.math.log(1 + tf.reduce_sum(exp,axis=1)))
            #loss_sadc
            pos = prod[i][s_sadc[i]]
            neg = prod[i][s_dadc[i]]
            n_pos = tf.shape(pos)[0]
            n_neg = tf.shape(neg)[0]
            pos = tf.transpose(tf.broadcast_to(pos,[n_neg,n_pos])) #np.transpose(np.tile(pos,[n_neg,1]))
            neg = tf.broadcast_to(neg,[n_pos,n_neg])#np.tile(neg,[n_pos,1])
            exp=tf.clip_by_value(tf.math.exp(neg - pos),clip_value_min=0,clip_value_max=9e6)
            loss_sadc += tf.reduce_sum(tf.math.log(1 + tf.reduce_sum(exp,axis=1)))
            #loss_dasc
            pos = prod[i][s_dasc[i]]
            neg = prod[i][s_dadc[i]]
            n_pos = tf.shape(pos)[0]
            n_neg = tf.shape(neg)[0]
            pos = tf.transpose(tf.broadcast_to(pos,[n_neg,n_pos])) #np.transpose(np.tile(pos,[n_neg,1]))
            neg = tf.broadcast_to(neg,[n_pos,n_neg])#np.tile(neg,[n_pos,1])
            exp=tf.clip_by_value(tf.math.exp(neg - pos),clip_value_min=0,clip_value_max=9e6)
            loss_dasc += tf.reduce_sum(tf.math.log(1 + tf.reduce_sum(exp,axis=1)))
        return (loss_sasc + loss_sadc + loss_dasc) / n
then, for training :
@tf.function
def train_step(x,y):
    with tf.GradientTape() as tape:
        fc1,fc2,y_pred=model(x,training=True)
        stacked=tf.stack([fc1,fc2],axis=1)
        layerLoss=npair(stacked,y)
        loss=cce(y, y_pred) +0.001*layerLoss
    grads=tape.gradient(loss,model.trainable_variables)
    opt.apply_gradients(zip(grads,model.trainable_variables))
    return loss
model=OSME_network(weight="imagenet",nbrclass=10,input_tensor=(32, 32, 3))
model.compile(optimizer=opt, loss=categorical_crossentropy, metrics=["acc"])
model.build(input_shape=(None,32,32,3))
cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True,name='categorical_crossentropy')
npair=NPairLoss()
for each batch :
    x=tf.Variable(x_train[start:end])
    y=tf.Variable(y_train[start:end])
    train_loss=train_step(x,y)
Thanks for any help :)
You can use TensorFlow's add_loss.
Loss functions passed to model.compile() in TensorFlow always take two parameters, y_true and y_pred. model.add_loss() has no such restriction and allows you to write much more complex losses that depend on many other tensors, but it has the inconvenience of tying the loss to the model, whereas standard loss functions work with any model.
You can find the official documentation of add_loss here. It adds loss tensor(s), potentially dependent on layer inputs. This method can be used inside a subclassed layer or model's call function, in which case losses should be a Tensor or list of Tensors. There are a few examples in the documentation that explain add_loss.
This method can also be called directly on a Functional Model during construction. In this case, any loss Tensors passed to this Model must be symbolic and be able to be traced back to the model's Inputs. These losses become part of the model's topology and are tracked in get_config.
Example :
inputs = tf.keras.Input(shape=(10,))
x = tf.keras.layers.Dense(10)(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
# Activity regularization.
model.add_loss(tf.abs(tf.reduce_mean(x)))
You can call self.add_loss(loss_value) from inside the call method of a custom layer. Here's a simple example that adds activity regularization.
Example:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class ActivityRegularizationLayer(layers.Layer):
    def call(self, inputs):
        self.add_loss(tf.reduce_sum(inputs) * 0.1)
        return inputs # Pass-through layer.

inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
# Insert activity regularization as a layer
x = ActivityRegularizationLayer()(x)
x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# The displayed loss will be much higher than before
# due to the regularization component.
model.fit(x_train, y_train,
          batch_size=64,
          epochs=1)
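For a loss that needs both hidden features and y_true, as in the question, one possible arrangement is to pass the labels to the model as a second input and call add_loss inside call(). The sketch below is only an illustration under assumptions: the class name WithAuxLoss, the layer sizes and the auxiliary term are hypothetical placeholders (the auxiliary term stands in for something like an n-pair loss), and the exact bookkeeping of model.losses may differ between Keras versions.

```python
import numpy as np
import tensorflow as tf


class WithAuxLoss(tf.keras.Model):
    def __init__(self, num_classes=3):
        super().__init__()
        self.hidden = tf.keras.layers.Dense(8, activation="relu")
        self.preds = tf.keras.layers.Dense(num_classes)  # logits

    def call(self, inputs):
        x, y_true = inputs  # labels are passed in as a second input
        h = self.hidden(x)
        # Placeholder auxiliary term (hypothetical): it depends on the hidden
        # features AND on the labels, which a compile() loss cannot see.
        per_example = tf.reduce_sum(tf.square(h), axis=-1)
        aux = tf.reduce_mean(per_example * y_true[:, 0])
        self.add_loss(0.001 * aux)
        return self.preds(h)


model = WithAuxLoss()
cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(1e-3)

x = tf.constant(np.random.rand(16, 5).astype("float32"))
y = tf.one_hot(np.random.randint(0, 3, size=16), depth=3)

with tf.GradientTape() as tape:
    logits = model((x, y))
    # Main loss plus whatever was registered through add_loss.
    total = cce(y, logits) + tf.add_n(model.losses)
grads = tape.gradient(total, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
print(float(total))
```

At prediction time the labels input still has to be supplied (for example as zeros), which is the main cost of this design.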
You can find good examples using add_loss here and here, with explanations.
Hope this answers your question. Happy Learning.
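A separate point, and only a guess at the cause of the NaN gradients mentioned in the question: in log(1 + sum(exp(neg - pos))), the exponential can overflow to inf before the clip is applied, and during backpropagation the zero gradient of the clipped region is multiplied by that inf, which yields NaN. Rewriting the term as a log-sum-exp over [0, neg - pos] with the maximum subtracted avoids the overflow. The NumPy sketch below shows the difference; the function names are mine, and in TensorFlow tf.reduce_logsumexp offers the same shifted computation.

```python
import numpy as np


def naive_term(pos, neg):
    # log(1 + sum_j exp(neg_j - pos)), written directly.
    with np.errstate(over="ignore"):
        return np.log(1.0 + np.sum(np.exp(neg - pos)))


def stable_term(pos, neg):
    # Same quantity as logsumexp over [0, neg_j - pos], shifted by the max
    # so that no exponent is ever positive.
    z = np.concatenate([[0.0], neg - pos])
    m = np.max(z)
    return m + np.log(np.sum(np.exp(z - m)))


pos = 0.0
neg = np.array([1000.0, 999.0])
print(naive_term(pos, neg))   # overflows to inf
print(stable_term(pos, neg))  # finite, about 1000.3133
```

For moderate values both functions agree, so the stable form can replace the clipped one without changing the loss where it was already well behaved.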

Custom loss function on Keras

I have a dataset containing a matrix of features X and a matrix of labels y of size N where each element y_i belongs to [0,1]. I have the following loss function
where g(.) is a function that depends on the input matrix X.
I know that Keras custom loss function has to be of the form customLoss(y_true,y_predicted), however, I'm having difficulties incorporating the term g(X) in the loss function since this depends on the input matrix.
For each data point in my dataset, my input is of the form X_i = (H, P) where these two parameters are matrices and the function g is defined for each data point as g(X_i) = H x P. Can I pass a = (H, P) in the loss function since this depends on each example or do I need to pass all the matrices at once by concatenating them?
Edit (based on Daniel's answer):
original_model_inputs = keras.layers.Input(shape=X_train.shape[1])
y_true_inputs = keras.layers.Input(shape=y_train.shape[1])
hidden1 = keras.layers.Dense(256, activation="relu")(original_model_inputs)
hidden2 = keras.layers.Dense(128, activation="relu")(hidden1)
output = keras.layers.Dense(K)(hidden2)
def lambdaLoss(x):
    yTrue, yPred, alpha = x
    return (K.log(yTrue) - K.log(yPred))**2+alpha*yPred
loss = Lambda(lambdaLoss)(y_true_inputs, output, a)
model = Keras.Model(inputs=[original_model_inputs, y_true_inputs], outputs=[output], loss)
def dummyLoss(true, pred):
    return pred
model.compile(loss = dummyLoss, optimizer=Adam())
train_model = model.fit([X_train, y_train], None, batch_size = 32,
                        epochs = 50,
                        validation_data = ([X_valid, y_valid], None),
                        callbacks=callbacks)
Fixing the understanding of my answer:
original_model_inputs = keras.layers.Input(shape=X_train.shape[1:]) #must be a tuple, not an int
y_true_inputs = keras.layers.Input(shape=y_train.shape[1:]) #must be a tuple, not an int
hidden1 = keras.layers.Dense(256, activation="relu")(original_model_inputs)
hidden2 = keras.layers.Dense(128, activation="relu")(hidden1)
output = keras.layers.Dense(K)(hidden2)
You need something to compute g(X). I have no idea what it is, but you need to do it somewhere.
And yes, you need to pass the whole tensor at once; you cannot handle each x_i separately.
def g(x):
    return something
gResults = Lambda(g)(original_model_inputs)
Continuing my answer:
def lambdaLoss(x):
    yTrue, yPred, G = x
    .... #wait.... where is Y_true in your loss formula?
loss = Lambda(lambdaLoss)([y_true_inputs, output, gResults]) #must be a list of inputs including G
You need one model for training and another to get the outputs, because the different loss forces us to build a Frankenstein model.
training_model = keras.Model(inputs=[original_model_inputs, y_true_inputs], outputs=loss)
prediction_model = keras.Model(original_model_inputs, output)
Only the training model must be compiled:
def dummyLoss(true, pred):
    return pred
training_model.compile(loss = dummyLoss, optimizer=Adam())
# fit() is called on training_model; its return value is the training history
history = training_model.fit([X_train, y_train], None, batch_size = 32,
                             epochs = 50,
                             validation_data = ([X_valid, y_valid], None),
                             callbacks=callbacks)
Use the other model to get result data:
results = prediction_model.predict(some_x)
Looks like a GAN of some sort. I will refer to (x) as "x_input". Two methods:
Method 1) Inherit from the tf.keras.Model class and write your own (not recommended, not shown)
Method 2) Inherit from the tf.keras.losses.Loss class, and return a tuple of a (custom) tf.keras.losses.Loss instance and a tf.keras.layers.Layer that does nothing more than act as a shell to grab and save a copy of x_input (x). This layer instance can then be added as the top layer in the model. The (custom) tf.keras.losses.Loss instance can then access the input on demand. This method also has the best future support throughout the life of TensorFlow.
First, create a custom layer and custom loss class:
class Acrylic_Layer(tf.keras.layers.Layer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.x_input = None

    def build(self, *args, **kwargs):
        pass

    def call(self, input):
        self.x_input = input
        return input # Pass input directly through to next layer

class Custom_Loss(tf.keras.losses.Loss):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.input_thief = Acrylic_Layer() # <<< Magic, python is pass by reference!

    def __call__(self, y_true, y_pred, sample_weight=None):
        x_input = self.input_thief.x_input # <<< x_input pulled from model
        # (the loss computed from y_true, y_pred and x_input is not shown and must be returned here)
Second, add the layer and loss function to the model:
loss_fn = Custom_Loss(*args, **kwargs)
input_thief = loss_fn.input_thief
model = tf.keras.models.Sequential([
    input_thief, # <<< transparent layer
    Other_layers,
])
model.compile(loss=loss_fn) # <<< loss function (passed to compile(), not fit())
Lastly, I'm in the market looking for an ML/Python role, giving a shout out.

TypeError: __init__() takes at least 3 arguments (2 given) when subclassing Model class

I want to create a simple neural network using Tensorflow and Keras.
When I try to instantiate a Model by subclassing the Model class
class TwoLayerFC(tf.keras.Model):
    def __init__(self, hidden_size, num_classes):
        super(TwoLayerFC, self).__init__()
        self.fc1 = keras.layers.Dense(hidden_size,activation=tf.nn.relu)
        self.fc2 = keras.layers.Dense(num_classes)

    def call(self, x, training=None):
        x = tf.layers.flatten(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return x
This is how I test the network
def test_TwoLayerFC():
    tf.reset_default_graph()
    input_size, hidden_size, num_classes = 50, 42, 10
    model = TwoLayerFC(hidden_size, num_classes)
    with tf.device(device):
        x = tf.zeros((64, input_size))
        scores = model(x)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        scores_np = sess.run(scores)
        print(scores_np.shape)
I get an error:
TypeError: __init__() takes at least 3 arguments (2 given)
I followed this tutorial, and it seems that there should be two parameters.
I read your code and I see a PyTorch model being created, including the mistake in the second Dense layer with two passed numbers.
Keras models should not follow the same logic of PyTorch models.
This model should be created like this:
input_tensor = Input(input_shape)
output_tensor = Flatten()(input_tensor)
output_tensor = Dense(hidden_size, activation='relu')(output_tensor)
output_tensor = Dense(num_classes)(output_tensor)  # the layer must be called on the previous tensor
model = keras.models.Model(input_tensor, output_tensor)
This model instance is ready to be compiled and trained:
model.compile(optimizer=..., loss = ..., metrics=[...])
model.fit(x_train, y_train, epochs=..., batch_size=..., ...)
There is no reason in Keras to subclass Model, unless you're a really advanced user trying some very unconventional things.
By the way, be careful not to mix tf.keras.anything with keras.anything. The first is a version of Keras maintained directly by TensorFlow, while the second is the original Keras. They're not exactly the same; TensorFlow's version seems more buggy, and mixing the two in the same code sounds like a bad idea.
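For comparison, here is how the subclassed version might look under TensorFlow 2.x with eager execution, which is an assumption on my part since the question uses TF 1.x session APIs. It sticks to tf.keras throughout and replaces tf.layers.flatten with a Flatten layer; the sizes are the ones from the question's test.

```python
import tensorflow as tf


class TwoLayerFC(tf.keras.Model):
    def __init__(self, hidden_size, num_classes):
        super().__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.fc1 = tf.keras.layers.Dense(hidden_size, activation="relu")
        self.fc2 = tf.keras.layers.Dense(num_classes)

    def call(self, x, training=None):
        x = self.flatten(x)
        x = self.fc1(x)
        return self.fc2(x)


model = TwoLayerFC(hidden_size=42, num_classes=10)
scores = model(tf.zeros((64, 50)))  # eager call, no Session needed
print(scores.shape)  # (64, 10)
```

The weights are created on the first call, so no explicit variable initializer is needed in this setup.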
