What actually happens on model creation and on model compile? Why is compile not part of Model creation? What happens in terms of the TensorFlow graph and session?
Example code:
# model creation
model = Model(inputs, outputs)
# model compile
model.compile(optimizer='adadelta', loss='binary_crossentropy')
model = Model(inputs, outputs)
The above statement groups the layers together and defines the flow of data between them. The optimizer, loss function, and metrics used for evaluation are not specified here.
model.compile(optimizer='adadelta', loss='binary_crossentropy')
The above statement, model.compile(), configures the model for training with a loss function, an optimizer, and optional metrics.
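For illustration, a minimal sketch of the two steps (the input shape and layer here are made up): Model(inputs, outputs) only records the graph of layers, while compile() attaches the loss, optimizer, and metrics that fit()/evaluate() need to build the training step.
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(32,))
outputs = Dense(1, activation='sigmoid')(inputs)

# Model creation: groups the layers and defines the data flow;
# no loss, optimizer, or training function exists yet.
model = Model(inputs, outputs)

# Compile: configures the model for training (loss, optimizer, metrics),
# which is what fit()/evaluate() use to build the training step.
model.compile(optimizer='adadelta', loss='binary_crossentropy')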
I tried load_model and then model.fit to load an existing model and train it on some more data. It seems to work, and the epochs ran without any issue. But after saving the newly trained model, it looks like the old model: exactly the same file size, same data. What am I doing wrong?
from keras.models import load_model
model = load_model('/content/drive/MyDrive/Trained_database/diu_project.h5')
model.fit(x=X_train, y=y_train, epochs=30, batch_size=5, shuffle=False, validation_split=0.2)
model.save('/content/drive/MyDrive/Trained_database/diu_project_3.h5')
Did you check if the layers of the model you are loading are trainable?
model.summary()
Is the number of trainable parameters what you would expect? You can make all the layers trainable with the following code:
model.trainable = True
Or train only from a certain layer (let's say number 100). This is particularly useful for transfer learning between close domains.
# Fine-tune from this layer onwards
fine_tune_at = 100
# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False
The examples are from:
https://www.tensorflow.org/tutorials/images/transfer_learning
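One hedged follow-up on the freezing snippet above (base_model stands for the loaded model, and the optimizer and loss below are placeholders, not from the original post): changes to trainable only take effect in training functions built afterwards, so compile again before the next fit.
# After changing `trainable`, compile again so the change takes effect
# (optimizer and loss are placeholders; use whatever the model was trained with)
base_model.compile(optimizer='adam', loss='binary_crossentropy')
base_model.summary()  # verify the trainable parameter count changed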
I am creating a Keras model like this one:
model_1 = Model(inputs=[input_1], outputs=main_network)
This model will be used for predictions, but not for training. Instead, this model will belong to a larger one:
error = Subtract()([y, model_1.output])
model_2 = Model(inputs = model_1.inputs+[input_2], outputs=error)
where the variable y is the output of other layers. I have checked that both models are correctly built and they both have the right layers, but when I train model_2, the weights of the layers in model_1 are not updated. I thought they would be, since the layers in model_1 and model_2 should be the same instances (in fact, they have the same addresses). What can I do so the weights of both models update at the same time?
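For reference, a minimal self-contained version of the setup described above might look like this (the layer sizes and the layers producing y are made up):
from keras.layers import Input, Dense, Subtract
from keras.models import Model

input_1 = Input((5,))
main_network = Dense(1)(Dense(10)(input_1))
model_1 = Model(inputs=[input_1], outputs=main_network)   # prediction model

input_2 = Input((3,))
y = Dense(1)(input_2)                                      # "y" comes from other layers
error = Subtract()([y, model_1.output])
model_2 = Model(inputs=model_1.inputs + [input_2], outputs=error)  # training model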
Thanks!!!
EDIT: there was a mistake in the code
I'm trying to implement an adversarial loss in keras.
The model consists of two networks, one auto-encoder (the target model) and one discriminator. The two models share the encoder.
I created the adversarial loss of the auto-encoder using a Keras variable:
def get_adv_loss(d_loss):
    def loss(y_true, y_pred):
        return some_loss(y_true, y_pred) - d_loss
    return loss

discriminator_loss = K.variable(0.)
L = get_adv_loss(discriminator_loss)
autoencoder.compile(..., loss=L)
and during training I interleave train_on_batch calls on the discriminator and the autoencoder to update discriminator_loss:
d_loss = discriminator.train_on_batch(x, y_domain)
discriminator_loss.assign(d_loss)
a_loss, ... = self.segmenter.train_on_batch(x, y_target)
However, I found out that the value of these variables is frozen when the model is compiled. I tried to recompile the model during training, but that raises the error
Node 'IsVariableInitialized_13644': Unknown input node
'training_12/Adam/Variable'
which I guess means I can't recompile during training? Any suggestions on how I can inject the discriminator loss into the autoencoder?
Keras models support multiple outputs. So just include your discriminator in your Keras model and freeze the discriminator layers if the discriminator should not be trained.
The next question would be how to combine the autoencoder loss and the discriminator loss. Luckily, Keras model.compile supports loss weights. If the autoencoder is your first output and the discriminator is your second, you could do something like loss_weights=[1, -1], so a better discriminator is worse for the autoencoder.
Edit: Here is an example of how to implement an adversarial network:
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

# Build your architecture
auto_encoder_input = Input((5,))
auto_encoder_net = Dense(10)(auto_encoder_input)
auto_encoder_output = Dense(5)(auto_encoder_net)
discriminator_net = Dense(20)(auto_encoder_output)
discriminator_output = Dense(5)(discriminator_net)

# Define the outputs of your models
train_autoencoder_model = Model(auto_encoder_input, [auto_encoder_output, discriminator_output])
train_discriminator_model = Model(auto_encoder_input, discriminator_output)

# Compile the models (set the trainable attribute before each compile)
for layer_index, layer in enumerate(train_autoencoder_model.layers):
    layer.trainable = layer_index < 3
train_autoencoder_model.compile('Adam', loss=['mse', 'mse'], loss_weights=[1, -1])

for layer_index, layer in enumerate(train_discriminator_model.layers):
    layer.trainable = layer_index >= 3
train_discriminator_model.compile('Adam', loss='mse')

# A simple example of how training can look
for i in range(10):
    auto_input = np.random.sample((10, 5))
    discrimi_output = np.random.sample((10, 5))
    train_discriminator_model.fit(auto_input, discrimi_output, steps_per_epoch=5, epochs=1)
    train_autoencoder_model.fit(auto_input, [auto_input, discrimi_output], steps_per_epoch=1, epochs=1)
As you can see, there is not much magic behind building an adversarial model with Keras.
Unless you decide to dig deep into the Keras source code, I don't think you can do this easily. Before writing your own adversarial module, you should check the existing work carefully. As far as I know, keras-adversarial is still used by many people. Of course, it only supports old Keras versions, e.g. 2.0.8.
Several other things:
Be careful when you freeze your model weights. If you first compile a model and then freeze some weights, those weights are still trainable, because the train function is generated at compile time. So you should freeze the weights first and then compile, as sketched below.
keras-adversarial does this job in a more elegant way. Instead of making two models that share weights and freezing different subsets of them, it creates two train functions, one for each player.
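A minimal sketch of the freeze-then-compile point (the layer sizes here are arbitrary):
from keras.layers import Input, Dense
from keras.models import Model

inp = Input((4,))
out = Dense(1)(Dense(8)(inp))
model = Model(inp, out)

# Freeze first, then compile: the train function built by compile()
# treats the frozen layer as non-trainable.
model.layers[1].trainable = False
model.compile('adam', loss='mse')

# If you compile first and set trainable afterwards, the already-built
# train function still updates those weights until you compile again.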
I created a Sequential model using tf.keras as follows:
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(8, input_dim=4))
model.add(tf.keras.layers.Dense(3, activation=tf.nn.softmax))
opt = tf.train.AdamOptimizer(learning_rate=0.001)
model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
After that, I created a training process using train_on_batch:
EPOCHS = 50
for epoch in range(EPOCHS):
    for metrics, labels in dataset:
        # Calculate training loss and accuracy
        tr_loss, tr_accuracy = model.train_on_batch(metrics, labels)
When I try to save the model, I receive a warning. I can't understand why, because I included the optimizer in model.compile:
tf.keras.models.save_model(
    model,
    "./model/iris_model.h5",
    overwrite=True,
    include_optimizer=True
)
WARNING:tensorflow:TensorFlow optimizers do not make it possible to access optimizer attributes or optimizer state after instantiation. As a result, we cannot save the optimizer as part of the model save file.You will have to compile your model again after loading it. Prefer using a Keras optimizer instead (see keras.io/optimizers).
The TF version I used is 1.9.0-rc2.
As the warning says, TensorFlow optimizers cannot be saved when saving the model. Instead, use the optimizers provided by Keras:
opt = tf.keras.optimizers.Adam(lr=0.001)
An optimizer like tf.keras.optimizers.Adam() will be saved by model.save(), whereas tf.train.AdamOptimizer() will not.
At the time of writing, some official TensorFlow tutorials use the tf.train.* optimizers, but I strongly believe that choosing tf.keras.optimizers.* is the better way to go.
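A minimal sketch of the fix for the example above (the path is the one from the question; apart from swapping the optimizer, everything else is unchanged):
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(8, input_dim=4))
model.add(tf.keras.layers.Dense(3, activation=tf.nn.softmax))

# A Keras optimizer, so its state can be serialized with the model
opt = tf.keras.optimizers.Adam(lr=0.001)
model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])

tf.keras.models.save_model(model, "./model/iris_model.h5",
                           overwrite=True, include_optimizer=True)

# Reloading restores the architecture, the weights, and the optimizer state
restored_model = tf.keras.models.load_model("./model/iris_model.h5")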
I'm training a modified InceptionV3 model with the multi_gpu_model in Keras, and I use model.save to save the whole model.
Then I closed and restarted the IDE and used load_model to reinstantiate the model.
The problem is that I am not able to resume the training exactly where I left off.
Here is the code:
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
history = parallel_model.fit_generator(generate_batches(path), steps_per_epoch = num_images/batch_size, epochs = num_epochs)
model.save('my_model.h5')
Before the IDE closed, the loss is around 0.8.
After restarting the IDE, reloading the model and re-running the above code, the loss became 1.5.
But, according to the Keras FAQ, model.save should save the whole model (architecture + weights + optimizer state), and load_model should return a compiled model that is identical to the previous one.
So I don't understand why the loss becomes larger after resuming the training.
EDIT: If I don't use the multi_gpu_model and just use the ordinary model, I'm able to resume exactly where I left off.
When you call multi_gpu_model(...), Keras automatically sets the weights of your model to some default values (at least in version 2.2.0, which I am currently using). That's why you were not able to resume the training at the same point as when you saved it.
I just solved the issue by replacing the weights of the parallel model with the weights from the sequential model:
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.layers[-2].set_weights(model.get_weights()) # you can check the index of the sequential model with parallel_model.summary()
parallel_model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
history = parallel_model.fit_generator(generate_batches(path), steps_per_epoch = num_images/batch_size, epochs = num_epochs)
I hope this will help you.
@saul19am When you compile it, you can only load the weights and the model structure, but you still lose the optimizer state. I think this can help.