My Keras model is made up of multiple models. Each "sub-model" has multiple layers. How do I access the layers inside a "sub-model" and set their trainability, i.e. freeze specific layers?
I'll use the VGG19 convolutional neural network in Keras as an example, although this applies to any neural network architecture:
from keras.applications.vgg19 import VGG19
model = VGG19(weights='imagenet')
You can visualise the layers using:
model.summary()
The summary will show the number of trainable parameters in the network. To freeze specific layers, for example every layer except the last 5 in the network:
for layer in model.layers[:-5]:
    layer.trainable = False
Calling the summary again, you'll see that the number of trainable parameters has decreased.
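Note that in Keras, changes to a layer's trainable attribute only take effect once the model is compiled (again), so recompile before training. A minimal sketch; the optimizer and loss here are placeholders, not prescriptions:

model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()  # the trainable-parameter count is now lower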
I am using a pretrained ResNet50 (from the tensorflow.keras.applications package) and fine-tuning it for multilabel classification (with 2 classes), and I'd like to extract saliency maps from the fine-tuned model.
To make a classifier, I add 2 dense layers to the ResNet model, creating a new sequential model as follows:
self.model = tf.keras.Sequential([
    resnet50,
    layers.Dense(1024, activation='relu', name='hidden_layer'),
    layers.Dense(2, activation='sigmoid', name='output')
])
But my problem is that resnet50 becomes a single "layer", as if its individual layers were no longer accessible: the model summary only contains 3 layers. I'd like to know if there is a way to add layers to a functional model without creating a sequential model, so that each layer of the ResNet model remains accessible.
Thank you in advance,
To access the layers of the resnet50 model, you first have to access the resnet50 layer and then, from that, access the convolution layer for which you want to create the saliency map:
self.model.get_layer("resnet50").get_layer("conv5_block2_3_conv")
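Alternatively, if you want every ResNet layer to stay visible in the combined model, you can skip Sequential and use the functional API instead. A minimal sketch, where the 224x224 RGB input shape, include_top=False, and average pooling are assumptions rather than details from the question:

import tensorflow as tf
from tensorflow.keras import layers

resnet50 = tf.keras.applications.ResNet50(
    include_top=False, pooling='avg', input_shape=(224, 224, 3))
x = layers.Dense(1024, activation='relu', name='hidden_layer')(resnet50.output)
outputs = layers.Dense(2, activation='sigmoid', name='output')(x)
model = tf.keras.Model(inputs=resnet50.input, outputs=outputs)
model.summary()  # every ResNet layer now appears individually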
I'm having trouble deciding, or being sure, which layer of a DenseNet-121 (fine-tuned model) to use for feature extraction.
I have the following model (it is based on DenseNet-121, but I added a classification layer because I have trained it to classify images into 7 classes). These are the last layers of my model:
However, I'm having trouble figuring out which layer to use (the BatchNormalization or the ReLU). I want to have a vector of length 4096. Is there a difference between the outputs of the two layers? Which one is recommended?
If you are doing classification, you want the dense_3 layer as the model output. The batch normalization layer and the ReLU layer each produce an output of shape (2, 2, 1024); the 4096 you are seeing is the number of trainable parameters for that layer, not its output size.
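That said, if what you want is a length-4096 feature vector, flattening the (2, 2, 1024) output gives exactly 2 * 2 * 1024 = 4096 values per image. A sketch, where 'relu' is an assumed layer name (check model.summary() for the real one in your fine-tuned model) and images stands for a batch of preprocessed inputs:

from tensorflow import keras

feature_extractor = keras.Model(inputs=model.input,
                                outputs=model.get_layer('relu').output)
features = feature_extractor.predict(images)      # shape (n, 2, 2, 1024)
features = features.reshape(len(features), -1)    # shape (n, 4096)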
I am following Datacamp's tutorial on using convolutional autoencoders for classification here. I understand from the tutorial that we only need the autoencoder's head (i.e. the encoder part) stacked with a fully-connected layer to do the classification.
After stacking, the resulting network (convolutional autoencoder) is trained twice. The first time, the encoder's layers are set to non-trainable:
for layer in full_model.layers[0:19]:
    layer.trainable = False
and then they are set back to trainable, and the network is re-trained:
for layer in full_model.layers[0:19]:
    layer.trainable = True
I cannot understand why we are doing this twice. Can anyone with experience working with conv-nets or autoencoders explain?
It's because the first 19 layers are already trained in this line:
autoencoder_train = autoencoder.fit(train_X, train_ground,
                                    batch_size=batch_size, epochs=epochs,
                                    verbose=1,
                                    validation_data=(valid_X, valid_ground))
The point of an autoencoder is dimensionality reduction. Let's say you have 1000 features and you want to reduce them to 100. You train an autoencoder with an encoder layer followed by a decoder layer, so that the encoded features (output by the encoder) can be decoded back into the original features.
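As a toy illustration of that idea (the sizes follow the 1000 → 100 example above; the activations and loss are assumptions):

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(1000,))
encoded = layers.Dense(100, activation='relu', name='encoder')(inputs)      # 1000 -> 100
decoded = layers.Dense(1000, activation='linear', name='decoder')(encoded)  # 100 -> 1000
autoencoder = keras.Model(inputs, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# The autoencoder is trained to reconstruct its own input:
# autoencoder.fit(train_X, train_X, epochs=10)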
After the autoencoder is trained, the decoder part is thrown away and a fully connected classification layer is added on top of the encoder, so that a classification network can be trained on the reduced set of features. This is why the encoder layers' trainable attribute is first set to False: only the fully connected classification layer is trained, which speeds up training.
The reason the encoder layers' trainable attribute is then set back to True is to fine-tune the entire network, which can be slow because changes are back-propagated through the whole network.
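Putting the two phases together, a minimal sketch of that schedule (full_model and layers[0:19] come from the question; the optimizers, epoch counts, and the train_labels name for the classification targets are assumptions):

import tensorflow as tf

# Phase 1: freeze the pre-trained encoder and train only the classifier head.
for layer in full_model.layers[0:19]:
    layer.trainable = False
full_model.compile(optimizer='adam', loss='categorical_crossentropy')
full_model.fit(train_X, train_labels, epochs=10)

# Phase 2: unfreeze everything and fine-tune the whole network,
# typically with a lower learning rate.
for layer in full_model.layers[0:19]:
    layer.trainable = True
full_model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                   loss='categorical_crossentropy')
full_model.fit(train_X, train_labels, epochs=10)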
They freeze the autoencoder's layers first because the newly stacked classification layers need to be trained from their random initialization so that they "catch up" with the pre-trained weights.
If you skip this step, the uninitialized layers would degrade your autoencoder's layers during the backward pass, because the gradients flowing between the two networks can be large. You would eventually end up at the same point, but you would have lost the time savings of using a pre-trained network.
Should the inference model in a chatbot built with a Keras LSTM have the same number of layers as the main model, or does it not matter?
I don't know exactly what you mean by the inference model.
The number of layers of a model is a hyperparameter that you tune during training. Let's say you train an LSTM model with 3 layers; then the model used for inference must have the same number of layers and use the weights resulting from the training.
Otherwise, if you add untrained layers at inference time, the results won't make any sense.
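As a concrete sketch of that constraint (build_model and the architecture inside it are hypothetical, just to make the point runnable):

from tensorflow import keras
from tensorflow.keras import layers

def build_model(num_lstm_layers=3, units=128, vocab=1000):
    # Hypothetical chatbot-style architecture over token-id sequences.
    model = keras.Sequential([keras.Input(shape=(None,))])
    model.add(layers.Embedding(vocab, units))
    for _ in range(num_lstm_layers):
        model.add(layers.LSTM(units, return_sequences=True))
    model.add(layers.Dense(vocab, activation='softmax'))
    return model

training_model = build_model(num_lstm_layers=3)
training_model.save_weights('chatbot_weights.h5')    # after training

inference_model = build_model(num_lstm_layers=3)     # must match exactly
inference_model.load_weights('chatbot_weights.h5')   # errors out if the
                                                     # layer shapes differ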
Hope this helps
I am using a pre-trained Keras model (a convolutional network) and I am retraining this model on my dataset.
Now, I need to get the output of some layers to visualize the gradient activations. I just found out that each trained model ends up with different layer names: for example, the input layer in one model is input_7 (InputLayer) and in another model is input_5 (InputLayer).
Do you know how to prevent this behavior? How can I keep the naming consistent without having to manually name all the layers, given that I have more than 53 convolutional layers?
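One common workaround is to reset Keras's global layer-name counters before building each model, so the auto-generated names start from the same index every run; alternatively, pass explicit name= arguments only to the few layers you need to look up. A minimal sketch of the first option (the VGG19 model here is just an illustration):

import tensorflow as tf

tf.keras.backend.clear_session()  # resets Keras's auto-naming counters
model = tf.keras.applications.VGG19(weights='imagenet')
# The input layer is now consistently named 'input_1' on every run,
# instead of drifting to input_5, input_7, ...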