I am creating a Keras model like this one:
model_1 = Model(inputs=[input_1], outputs=main_network)
This model will be used for predictions, but not for training. Instead, this model will belong to a larger one:
error = Subtract()([y, model_1.output])
model_2 = Model(inputs = model_1.inputs+[input_2], outputs=error)
Where the variable y is the output of other layers. I have checked that both models are correctly built and that they both have the right layers, but when I train model_2, the weights of the layers in model_1 are not updated. I thought they would be, since the layers in model_1 and model_2 should be the same instances (in fact, they have the same memory addresses). What can I do so the weights of both models update at the same time?
Thanks!!!
EDIT: there was a mistake in the code
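For reference, here is a minimal, self-contained version of the setup described above (the layer sizes and the second input are illustrative placeholders, not the real architecture):
from tensorflow.keras.layers import Input, Dense, Subtract
from tensorflow.keras.models import Model

# Inner model, used on its own for predictions
input_1 = Input(shape=(10,))
main_network = Dense(1)(Dense(32, activation='relu')(input_1))
model_1 = Model(inputs=[input_1], outputs=main_network)

# Outer model, built on top of model_1's own input/output tensors
input_2 = Input(shape=(4,))
y = Dense(1)(input_2)  # stand-in for "the output of other layers"
error = Subtract()([y, model_1.output])
model_2 = Model(inputs=model_1.inputs + [input_2], outputs=error)

# model_2 reuses the exact same layer objects as model_1, so training it
# should move the weights that model_1 reads as well.
model_2.compile(optimizer='adam', loss='mse')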
I used load_model and then model.fit to load an existing model and train it on some more data. It seems to work, and the epochs ran without any issue. But after saving the newly trained model, it looks like the old model: exactly the same file size, same data. What am I doing wrong?
from keras.models import load_model
model = load_model('/content/drive/MyDrive/Trained_database/diu_project.h5')
model.fit(x=X_train, y=y_train, epochs=30, batch_size=5, shuffle=False, validation_split=0.2)
model.save('/content/drive/MyDrive/Trained_database/diu_project_3.h5')
Did you check if the layers of the model you are loading are trainable?
model.summary()
Is the number of trainable parameters what you would expect? You can make all the layers trainable with the following code:
model.trainable = True
Or train only from a certain layer (let's say number 100). This is particularly useful for transfer learning between close domains.
# Fine-tune from this layer onwards
fine_tune_at = 100
# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False
The examples are from:
https://www.tensorflow.org/tutorials/images/transfer_learning
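As a hedged sketch of how this applies to the question above (the optimizer and loss below are placeholders, since the original compile settings are not shown):
from keras.models import load_model

model = load_model('/content/drive/MyDrive/Trained_database/diu_project.h5')

# Make sure the layers you want to update are actually trainable ...
model.trainable = True
# ... and recompile, because changes to `trainable` only take effect at compile time.
model.compile(optimizer='adam', loss='binary_crossentropy')  # placeholders

model.summary()  # check the "Trainable params" count here before fitting
model.fit(x=X_train, y=y_train, epochs=30, batch_size=5, shuffle=False, validation_split=0.2)
model.save('/content/drive/MyDrive/Trained_database/diu_project_3.h5')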
I want to use ResNet model architecture and want to change last few layers; how can I only use model architecture from model zoo in Tensorflow?
To use a ResNet model, you can choose from a select few in tensorflow.keras.applications, including ResNet50, ResNet101, and ResNet152. You will then need to change a few of the default arguments. For your question, set the weights parameter to None; otherwise the 'imagenet' weights are loaded. You also need to set include_top to False, since the number of classes for your problem will likely differ from ImageNet's. Finally, provide the shape of your data in input_shape. This would look something like this.
base = tf.keras.applications.ResNet50(include_top=False, weights=None, input_shape=shape)
To get a summary of the model, you can do
base.summary()
To add your own head, you can use the Functional API. You will need to add an Input layer and your own Dense layer, which will correspond to your task. This could be
inputs = tf.keras.layers.Input(shape=shape)
x = base(inputs)
out = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
Finally, to construct a model, you can do
model = tf.keras.models.Model(inputs, out)
The Model constructor takes two arguments: the first is the inputs to your model, and the second is the outputs. Note that calling model.summary() will show the ResNet base as a single layer. To view all layers of the ResNet base, you can do model.layers[1].summary(), or you can change how you build your model. The second way would be
out = tf.keras.layers.Dense(num_classes, activation='softmax')(base.output)
model = tf.keras.models.Model(base.input, out)
Now you can view all layers with just model.summary().
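Putting the pieces together, a minimal end-to-end sketch could look like the following (shape and num_classes are placeholders; pooling='avg' is added so the Dense head receives a flat feature vector instead of a 4-D feature map):
import tensorflow as tf

shape = (224, 224, 3)   # placeholder input shape
num_classes = 10        # placeholder number of classes

base = tf.keras.applications.ResNet50(include_top=False, weights=None,
                                      input_shape=shape, pooling='avg')
out = tf.keras.layers.Dense(num_classes, activation='softmax')(base.output)
model = tf.keras.models.Model(base.input, out)

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()  # every ResNet layer is listed directly, plus the new Dense head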
What actually happens on model creation and on model compile? Why is compile not part of the Model constructor? What is happening in terms of the TensorFlow graph and session?
Example code:
# model creation
model = Model(inputs, outputs)
# model compile
model.compile(optimizer='adadelta', loss='binary_crossentropy')
model = Model(inputs, outputs)
The above statement groups the various layers together and defines the flow of data between them. The optimizer, loss function, and metrics used for evaluation are not specified here.
model.compile(optimizer='adadelta', loss='binary_crossentropy')
The above statement, model.compile(), is used to configure the model for training: it specifies the optimizer, the loss, and any metrics.
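A minimal sketch of the difference (layer sizes are illustrative): weights are created when the model is built, a forward pass already works before compiling, and compiling does not change the weights, it only attaches the training configuration.
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(8,))
outputs = Dense(1, activation='sigmoid')(Dense(16, activation='relu')(inputs))

# model creation: wires the graph of layers; the weights already exist
model = Model(inputs, outputs)
weights_before = model.get_weights()
model.predict(np.zeros((1, 8)))  # inference works without compiling

# model compile: configures the optimizer, loss, and metrics for training
model.compile(optimizer='adadelta', loss='binary_crossentropy')
weights_after = model.get_weights()
assert all(np.array_equal(a, b) for a, b in zip(weights_before, weights_after))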
I want to train a model in a sequential manner. That is, I want to train the model initially with a simple architecture, and once it is trained, add a couple of layers and continue training. Is it possible to do this in Keras? If so, how?
I tried to modify the model architecture, but the changes do not take effect until I compile. And once I compile, all the weights are re-initialized and I lose all the trained information.
All the questions I found on the web and on SO are either about loading a pre-trained model and continuing training, or about modifying the architecture of a pre-trained model and then only testing it. I didn't find anything related to my question. Any pointers are highly appreciated.
PS: I'm using Keras in tensorflow 2.0 package.
Without knowing the details of your model, the following snippet might help:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
# Train your initial model
def get_initial_model():
    ...
    return model
model = get_initial_model()
model.fit(...)
model.save_weights('initial_model_weights.h5')
# Use Model API to create another model, built on your initial model
initial_model = get_initial_model()
initial_model.load_weights('initial_model_weights.h5')
nn_input = Input(...)
x = initial_model(nn_input)
x = Dense(...)(x) # This is the additional layer, connected to your initial model
nn_output = Dense(...)(x)
# Combine your model
full_model = Model(inputs=nn_input, outputs=nn_output)
# Compile and train as usual
full_model.compile(...)
full_model.fit(...)
Basically, you train your initial model and save it. Then you reload it and wrap it together with your additional layers using the Model API. If you are not familiar with the Model API, you can check out the Keras documentation here (afaik the API remains the same for tf.keras in TensorFlow 2.0).
Note that you need to check if your initial model's final layer's output shape is compatible with the additional layers (e.g. you might want to remove the final Dense layer from your initial model if you are just doing feature extraction).
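For example, here is a hedged sketch of that feature-extraction case, reusing initial_model from the snippet above and assuming its last layer is a Dense head you want to drop (the new layer sizes are illustrative):
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

# Truncate the initial model just before its final Dense layer
feature_extractor = Model(inputs=initial_model.input,
                          outputs=initial_model.layers[-2].output)

nn_input = Input(shape=initial_model.input_shape[1:])
x = feature_extractor(nn_input)
x = Dense(64, activation='relu')(x)            # new layer on top of the features
nn_output = Dense(10, activation='softmax')(x)

full_model = Model(inputs=nn_input, outputs=nn_output)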
I'm trying to implement an adversarial loss in keras.
The model consists of two networks, one auto-encoder (the target model) and one discriminator. The two models share the encoder.
I created the adversarial loss of the auto-encoder by using a Keras variable:
from keras import backend as K

def get_adv_loss(d_loss):
    def loss(y_true, y_pred):
        return some_loss(y_true, y_pred) - d_loss
    return loss

discriminator_loss = K.variable(0.)
L = get_adv_loss(discriminator_loss)
autoencoder.compile(..., loss=L)
and during training I interleave train_on_batch calls on the discriminator and the autoencoder to update discriminator_loss:
d_loss = discriminator.train_on_batch(x, y_domain)
discriminator_loss.assign(d_loss)
a_loss, ... = self.segmenter.train_on_batch(x, y_target)
However, I found out that the value of these variables is frozen when the model is compiled. I tried to recompile the model during training, but that raises the error
Node 'IsVariableInitialized_13644': Unknown input node
'training_12/Adam/Variable'
which I guess means I can't recompile during training? Any suggestions on how I can inject the discriminator loss into the autoencoder?
Keras models support multiple outputs. So just include your discriminator in your Keras model and freeze the discriminator layers if the discriminator should not be trained.
The next question is how to combine the autoencoder loss and the discriminator loss. Luckily, Keras model.compile supports loss weights. If the autoencoder is your first output and the discriminator is your second, you could do something like loss_weights=[1, -1], so a better discriminator is worse for the autoencoder.
Edit: Here is an example of how to implement an adversarial network:
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

# Build your architecture
auto_encoder_input = Input((5,))
auto_encoder_net = Dense(10)(auto_encoder_input)
auto_encoder_output = Dense(5)(auto_encoder_net)
discriminator_net = Dense(20)(auto_encoder_output)
discriminator_output = Dense(5)(discriminator_net)

# Define the outputs of your models
train_autoencoder_model = Model(auto_encoder_input, [auto_encoder_output, discriminator_output])
train_discriminator_model = Model(auto_encoder_input, discriminator_output)

# Compile the models (compile the first model, then change the trainable attribute for the second)
for layer_index, layer in enumerate(train_autoencoder_model.layers):
    layer.trainable = layer_index < 3
train_autoencoder_model.compile('Adam', loss=['mse', 'mse'], loss_weights=[1, -1])

for layer_index, layer in enumerate(train_discriminator_model.layers):
    layer.trainable = layer_index >= 3
train_discriminator_model.compile('Adam', loss='mse')

# A simple example of what training can look like
for i in range(10):
    auto_input = np.random.sample((10, 5))
    discrimi_output = np.random.sample((10, 5))
    train_discriminator_model.fit(auto_input, discrimi_output, steps_per_epoch=5, epochs=1)
    train_autoencoder_model.fit(auto_input, [auto_input, discrimi_output], steps_per_epoch=1, epochs=1)
As you can see, there is not much magic behind building an adversarial model with Keras.
Unless you decide to go deep into the Keras source code, I don't think you can do this easily. Before writing your own adversarial module, you should check the existing work carefully. As far as I know, keras-adversarial is still used by many people. Of course, it only supports old Keras versions, e.g. 2.0.8.
Several other things:
Be careful when you freeze your model weights. If you first compile a model and then freeze some weights, those weights are still trainable, because the training function is generated during compiling. So you should freeze the weights first and then compile (see the short sketch after these notes).
keras-adversarial does this job in a more elegant way. Instead of making two models that share weights but are frozen in different ways, it creates two training functions, one for each player.
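A short sketch of the freeze-then-compile point above (layer sizes are illustrative):
from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(5,))
hidden = Dense(10)(inp)
out = Dense(1)(hidden)
model = Model(inp, out)

# Freeze BEFORE compiling; flipping `trainable` afterwards has no effect
# on training until the model is compiled again.
model.layers[1].trainable = False
model.compile(optimizer='adam', loss='mse')
model.summary()  # the frozen Dense(10) now shows up under "Non-trainable params"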