Loading & Freezing a Pretrained Model to Combine with a New Network - python

I have a pretrained model and would like to build a classifier on top of it. I’m trying to load and freeze the weights of the pretrained model and pass its outputs to the new classifier, which I’d like to optimise. Here is what I have so far. I’m stuck on an error raised by the nn.Sequential line: TypeError: forward() missing 1 required positional argument: 'x'
import torch.nn as nn
import model  # model.py contains the architecture of the pretrained model

class Classifier(nn.Module):
    def __init__(self):
        ...
    def forward(self, x):
        ...

net = model.Model()
net.load_state_dict(checkpoint["net"])
# freeze every parameter of the pretrained model
for child in net.children():
    for param in child.parameters():
        param.requires_grad = False
model = nn.Sequential(nn.ModuleList(net()), Classifier())  # this line raises the TypeError

TL;DR
model = nn.Sequential(nn.ModuleList(net), Classifier())
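Writing net() calls net.forward with no input, which is what raises the TypeError. By contrast, Classifier() calls the __init__ method of the Classifier class and creates a new instance. Pass the existing instance net, not net().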

I finally solved this issue after a discussion with @ptrblck from the PyTorch Forums. The solution is similar to Shai's answer, except that net is already an instance of the model.Model class, so it can be passed directly, without wrapping it in nn.ModuleList(): model = nn.Sequential(net, Classifier())
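The accepted fix can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the asker's actual code: Backbone is a hypothetical stand-in for model.Model, the layer sizes are invented, and the checkpoint loading step is only indicated by a comment.

```python
import torch
from torch import nn

class Backbone(nn.Module):
    """Hypothetical stand-in for the pretrained model.Model."""
    def __init__(self):
        super().__init__()
        self.features = nn.Linear(16, 8)

    def forward(self, x):
        return torch.relu(self.features(x))

class Classifier(nn.Module):
    """New head trained on top of the frozen backbone (sizes are assumptions)."""
    def __init__(self, in_features=8, num_classes=3):
        super().__init__()
        self.fc = nn.Linear(in_features, num_classes)

    def forward(self, x):
        return self.fc(x)

net = Backbone()
# net.load_state_dict(checkpoint["net"]) would go here in the real setup

# Freeze the pretrained weights
for param in net.parameters():
    param.requires_grad = False

# Pass the module instance itself: no net() call and no nn.ModuleList
combined = nn.Sequential(net, Classifier())

# Give the optimizer only the parameters that still require gradients
trainable = [p for p in combined.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

x = torch.randn(4, 16)
logits = combined(x)
logits.sum().backward()  # gradients reach only the classifier
```

After the backward pass the frozen backbone parameters have no gradient, while the classifier parameters do, which is a quick way to check that the freeze took effect.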

Related

Tensorflow Keras Model subclassing -- call function

I am experimenting with self-supervised learning using TensorFlow. The example code I'm running can be found on the Keras examples website. This is the link to the NNCLR example. The GitHub link to download the code can be found here. While I have no issues running the examples, I run into problems when I try to save the pretrained or the finetuned model using model.save().
The error I'm getting is this:
f"Model {model} cannot be saved either because the input shape is not "
ValueError: Model <__main__.NNCLR object at 0x7f6bc0f39550> cannot be saved either
because the input shape is not available or because the forward pass of the model is
not defined. To define a forward pass, please override `Model.call()`.
To specify an input shape, either call `build(input_shape)` directly, or call the model on actual data using `Model()`, `Model.fit()`, or `Model.predict()`.
If you have a custom training step, please make sure to invoke the forward pass in train step through
`Model.__call__`, i.e. `model(inputs)`, as opposed to `model.call()`.
I am unsure how to override the Model.call() method. I would appreciate some help.
One way to save the model in such cases is to override the save (or save_weights) method of keras.Model. In your case, first define the finetuning model inside the NNCLR class, then override save so that it saves that model. Done this way, you may also be able to use the ModelCheckpoint API.
class NNCLR(keras.Model):
    def __init__(...):
        super().__init__()
        ...
        self.finetuning_model = keras.Sequential(
            [
                layers.Input(shape=input_shape),
                self.classification_augmenter,
                self.encoder,
                layers.Dense(10),
            ],
            name="finetuning_model",
        )
    ...
    # delegate saving to the finetuning model, which has a defined input shape
    def save(
        self, filepath, overwrite=True, include_optimizer=True,
        save_format=None, signatures=None, options=None
    ):
        self.finetuning_model.save(
            filepath=filepath,
            overwrite=overwrite,
            save_format=save_format,
            options=options,
            include_optimizer=include_optimizer,
            signatures=signatures
        )

model = NNCLR(...)
model.compile(...)
model.fit(...)
Next, you can do
model.save('finetune_model') # SavedModel format
finetune_model = tf.keras.models.load_model('finetune_model', compile=False)
'''
NNCLR code example: Evaluate sections
"A popular way to evaluate a SSL method in computer vision or
for that fact any other pre-training method as such is to learn
a linear classifier on the frozen features of the trained backbone
model and evaluate the classifier on unseen images."
'''
for layer in finetune_model.layers:
    if not isinstance(layer, layers.Dense):
        layer.trainable = False
finetune_model.summary() # OK
finetune_model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")],
)
finetune_model.fit(...)

How to make pytorch lightning module have injected, nested models?

I have some nets, such as the following (augmented) resnet18:
num_classes = 10
resnet = models.resnet18(pretrained=True)
for param in resnet.parameters():
    param.requires_grad = True
num_ftrs = resnet.fc.in_features
resnet.fc = nn.Linear(num_ftrs, num_classes)
And I want to use them inside a lightning module, and have it handle all optimizations, to_device, stages and so on. In other words, I want to register those modules for my lightning module.
I also want to be able to access their public members.
class MyLightning(LightningModule):
    def __init__(self, resnet):
        super().__init__()
        self._resnet = resnet
        self._criterion = lambda x: 1.0
    def forward(self, x):
        resnet_out = self._resnet(x)
        loss = self._criterion(resnet_out)
        return loss

my_lightning = MyLightning(resnet)
The above doesn't optimize any parameters.
Trying
    def __init__(self, resnet):
        ...
        _layers = list(resnet.children())[:-1]
        self._resnet = nn.Sequential(*_layers)
doesn't take resnet.fc into account. It also doesn't seem to be the intended way of nesting models inside PyTorch Lightning.
How can I nest models in PyTorch Lightning and have them fully accessible and handled by the framework?
The training loop and the optimization process are handled by the Trainer class. You can use it by initializing a new instance:
>>> trainer = Trainer()
and passing your PyTorch Lightning module to it. This way you can perform fitting, tuning, validating, and testing with that instance, given a DataLoader or a LightningDataModule:
>>> trainer.fit(my_lightning, train_dataloader, val_dataloader)
You will have to implement the following methods on your Lightning module (in your case, MyLightning):
Name                  Description
__init__              Define computations here
forward               Use for inference only (separate from training_step)
training_step         the complete training loop
validation_step       the complete validation loop
test_step             the complete test loop
predict_step          the complete prediction loop
configure_optimizers  define optimizers and LR schedulers
Source: LightningModule documentation page.
Keep in mind that a LightningModule is an nn.Module, so whenever you assign an nn.Module as an attribute of a LightningModule in its __init__ method, that module ends up registered as a sub-module of the parent Lightning module.
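The registration behaviour described above comes from nn.Module itself, so it can be checked without Lightning. The following is a minimal sketch under that assumption: Wrapper is a hypothetical plain nn.Module standing in for the LightningModule, and a small nn.Sequential stands in for the resnet.

```python
import torch
from torch import nn

class Wrapper(nn.Module):
    """Hypothetical stand-in for a LightningModule; registration works the same way."""
    def __init__(self, backbone):
        super().__init__()
        # Assigning an nn.Module to an attribute registers it as a sub-module
        self.backbone = backbone

    def forward(self, x):
        return self.backbone(x)

# Small stand-in for the injected resnet
backbone = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
wrapper = Wrapper(backbone)

# The injected model's parameters show up under the attribute name
names = [name for name, _ in wrapper.named_parameters()]
print(names)

# An optimizer built from wrapper.parameters() therefore covers the nested model,
# which is what configure_optimizers would typically return in a LightningModule
optimizer = torch.optim.SGD(wrapper.parameters(), lr=0.1)
```

The injected model stays reachable as wrapper.backbone, so its public members remain accessible.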
The PyTorch model should inherit from nn.Module, so first locate resnet18 in torchvision; you can then use it as-is or modify it yourself.
The original ResNet code is at this path: ...\python\Lib\site-packages\torchvision\models\resnet.py. This is where the resnet network is imported from, so you can use it directly.
There you will find the original code:
class ResNet(nn.Module):...
https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py#L166
Import it like this:
from torchvision.models import ResNet
Finally, you can inherit from ResNet:
class MyLightning(ResNet):

Loading a modified pretrained model using strict=False in PyTorch

I want to use a pretrained model as the encoder part of my model. Here is a version of my model:
import os
import torch
import torch.nn as nn

class MyClass(nn.Module):
    def __init__(self, pretrained=False):
        super(MyClass, self).__init__()
        self.encoder=S3D_featureExtractor_multi_output()
        if pretrained:
            weight_dict=torch.load(os.path.join('models','weights.pt'))
            model_dict=self.encoder.state_dict()
            list_weight_dict=list(weight_dict.items())
            list_model_dict=list(model_dict.items())
            # copy the weights position by position, checking shapes first
            for i in range(len(list_model_dict)):
                assert list_model_dict[i][1].shape==list_weight_dict[i][1].shape
                model_dict[list_model_dict[i][0]].copy_(weight_dict[list_weight_dict[i][0]])
            # verify that every tensor was copied correctly
            for i in range(len(list_model_dict)):
                assert torch.all(torch.eq(model_dict[list_model_dict[i][0]],weight_dict[list_weight_dict[i][0]].to('cpu')))
            print('Loading finished!')
    def forward(self, x):
        a, b = self.encoder(x)
        return a, b
Because I modified some parts of the code of this pretrained model, based on this post I need to apply strict=False to avoid an error. However, given the way I load the pretrained weights, I cannot find a place in the code to apply strict=False. How can I apply it, or how can I change the way I load the pretrained model so that strict=False can be applied?
strict=False is an argument of the load_state_dict() method. A state_dict is just a Python dictionary that lets you save and load model weights.
(for more details, see https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html)
If you use strict=False in load_state_dict, you inform PyTorch that the target model and the original model are not identical, so it loads only the weights of layers whose keys are present in both and ignores the rest.
(see https://pytorch.org/docs/stable/generated/torch.nn.Module.html?highlight=load_state_dict#torch.nn.Module.load_state_dict)
So you need to pass the strict argument at the point where you load the pretrained weights, which means calling load_state_dict at that step.
If the model that should receive the weights is self.encoder, and the file you load contains its state_dict, you can simply do this:
loaded_weights = torch.load(os.path.join('models','weights.pt'))
self.encoder.load_state_dict(loaded_weights, strict=False)
For more details and a tutorial, see https://pytorch.org/tutorials/beginner/saving_loading_models.html.
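To make the effect of strict=False concrete, here is a small sketch. The two classes are hypothetical stand-ins for the original and the modified encoder, with invented layer names and sizes. load_state_dict returns an object with missing_keys and unexpected_keys, which shows exactly what was skipped. Note that strict=False only tolerates keys that are missing or unexpected; a tensor whose key matches but whose shape differs still raises an error.

```python
import torch
from torch import nn

class OriginalEncoder(nn.Module):
    """Hypothetical architecture the pretrained weights were saved from."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Linear(4, 4)
        self.head = nn.Linear(4, 2)

class ModifiedEncoder(nn.Module):
    """Hypothetical modified architecture: head removed, extra layer added."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Linear(4, 4)   # shared with the original
        self.extra = nn.Linear(4, 3)  # new layer, no pretrained weights

# Stands in for torch.load(os.path.join('models', 'weights.pt'))
pretrained = OriginalEncoder().state_dict()

encoder = ModifiedEncoder()
result = encoder.load_state_dict(pretrained, strict=False)

# Keys the model expects but the checkpoint lacks (left at their initial values)
print(result.missing_keys)
# Keys in the checkpoint that the model does not have (ignored)
print(result.unexpected_keys)
```

Inspecting both lists after loading is a cheap safeguard, because with strict=False a renamed layer is skipped silently rather than reported as an error.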

Unable to restore custom object of type _tf_keras_metric currently using the HammingLoss metric from TensorFlow Addons module

I have created and trained a TensorFlow model using the HammingLoss metric from TensorFlow Addons, so it is not a custom metric that I created myself. I use callbacks with ModelCheckpoint() and EarlyStopping() to save the best weights of the best model and to stop training at a given threshold, respectively. When I save the model checkpoint I serialize the whole model structure (similar to model.save()) instead of using model.save_weights(), which would have saved only the model weights (more about ModelCheckpoint here).
TL;DR: Here is a colab notebook with the code I post below in case you want to skip this.
The model I have trained is saved in GoogleDrive in the link here. To load the specific model I use the following code:
neural_network_parameters = {}
#======================================================================
# PARAMETERS THAT DEFINE THE NEURAL NETWORK STRUCTURE =
#======================================================================
neural_network_parameters['model_loss'] = tf.keras.losses.BinaryCrossentropy(from_logits=False, name='binary_crossentropy')
neural_network_parameters['model_metric'] = [tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss"),
                                             tfa.metrics.F1Score(17, average="micro", name="f1_score_micro"),
                                             tfa.metrics.F1Score(17, average=None, name="f1_score_none"),
                                             tfa.metrics.F1Score(17, average="macro", name="f1_score_macro"),
                                             tfa.metrics.F1Score(17, average="weighted", name="f1_score_weighted")]
"""Initialize the hyper parameters tuning the model using Tensorflow's hyperparameters module"""
HP_HIDDEN_UNITS = hp.HParam('batch_size', hp.Discrete([32]))
HP_EMBEDDING_DIM = hp.HParam('embedding_dim', hp.Discrete([50]))
HP_LEARNING_RATE = hp.HParam('learning_rate', hp.Discrete([0.001])) # Adam default: 0.001, SGD default: 0.01, RMSprop default: 0.001....0.1 to be removed
HP_DECAY_STEPS_MULTIPLIER = hp.HParam('decay_steps_multiplier', hp.Discrete([10]))
METRIC_ACCURACY = "hamming_loss"
dependencies = {
'hamming_loss': tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss"),
'attention': attention(return_sequences=True)
}
def import_trained_keras_model(model_index, method, decay_steps_mode, optimizer_name, hparams):
    """Load the model"""
    training_date="2021-02-27"
    model_path_structure=f"{folder_path_model_saved}/{initialize_notebbok_variables.saved_model_name}_{hparams[HP_EMBEDDING_DIM]}dim_{hparams[HP_HIDDEN_UNITS]}batchsize_{hparams[HP_LEARNING_RATE]}lr_{hparams[HP_DECAY_STEPS_MULTIPLIER]}decaymultiplier_{training_date}"
    model_imported=load_model(f"{model_path_structure}", custom_objects=dependencies)
    if optimizer_name=="adam":
        optimizer = optimizer_adam_v2(hparams)
    elif optimizer_name=="sgd":
        optimizer = optimizer_sgd_v1(hparams, "step decay")
    else:
        optimizer = optimizer_rmsprop_v1(hparams)
    model_imported.compile(optimizer=optimizer,
                           loss=neural_network_parameters['model_loss'],
                           metrics=neural_network_parameters['model_metric'])
    print(f"Model {model_index} is loaded successfully\n")
    return model_imported
Calling the function import trained keras model
"""Now that the functions have been created it's time to import each trained classifier from the selected dictionary of hyper parameters, calculate the evaluation metric per model and finally serialize the scores dataframe for later use."""
list_models=[] #a list to store imported models
model_optimizer="adam"
for batch_size in HP_HIDDEN_UNITS.domain.values:
    for embedding_dim in HP_EMBEDDING_DIM.domain.values:
        for learning_rate in HP_LEARNING_RATE.domain.values:
            for decay_steps_multiplier in HP_DECAY_STEPS_MULTIPLIER.domain.values:
                hparams = {
                    HP_HIDDEN_UNITS: batch_size,
                    HP_EMBEDDING_DIM: embedding_dim,
                    HP_LEARNING_RATE: learning_rate,
                    HP_DECAY_STEPS_MULTIPLIER: decay_steps_multiplier
                }
                print(f"\n{len(list_models)+1}/{(len(HP_HIDDEN_UNITS.domain.values)*len(HP_EMBEDDING_DIM.domain.values)*len(HP_LEARNING_RATE.domain.values)*len(HP_DECAY_STEPS_MULTIPLIER.domain.values))}")
                print({h.name: hparams[h] for h in hparams},'\n')
                model_object=import_trained_keras_model(len(list_models)+1, "import custom trained model", "on", model_optimizer, hparams)
                list_models.append(model_object)
When I call the function I get the following error
ValueError: Unable to restore custom object of type _tf_keras_metric currently. Please make sure that the layer implements get_configand from_config when saving. In addition, please use the custom_objects arg when calling load_model().
It's strange that I get this error, since the metric used to compile the network comes from a TensorFlow package (TensorFlow Addons) and is NOT a custom metric that I developed myself.
I have also looked at this thread on GitHub, which was closed without explaining the root of the problem.
[UPDATE]--Found a temporary solution
I managed to import the model successfully by setting the compile argument to False and then re-compiling the imported model inside the function.
So I did something like model_imported=load_model(f"{model_path_structure}", custom_objects=dependencies, compile=False).
This action produced the following result:
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements get_config and from_config when saving. In addition, please use the custom_objects arg when calling load_model().
Model 1 is loaded successfully.
So TensorFlow still does not recognise that HammingLoss is not a custom metric but a metric imported from TensorFlow Addons. However, despite the warning, the model loaded successfully.

How do we create a reusable block that share architecture in a single model but learn different set of weight in the single model in Keras?

I am using tensorflow.keras and want to know whether it is possible to create reusable blocks of built-in Keras layers. For example, I would like to use the same set of layers repeatedly at different positions in a model, with each use learning its own weights. This is the block I would like to reuse:
keep_prob_=0.5
input_features=Input(shape=(29, 1664))
Imortant_features= SelfAttention(activation='tanh',
    kernel_regularizer=tf.keras.regularizers.l2(0.), kernel_initializer='glorot_uniform'
)(input_features)
drop3=tf.keras.layers.Dropout(keep_prob_)(Imortant_features)
Layer_norm_feat=tf.keras.layers.Add()([input_features, drop3])
Layer_norm=tf.keras.layers.LayerNormalization(axis=-1)(Layer_norm_feat)
ff_out=tf.keras.layers.Dense(Layer_norm.shape[2], activation='relu')(Layer_norm)
ff_out=tf.keras.layers.Dense(Layer_norm.shape[2])(ff_out)
drop4=tf.keras.layers.Dropout(keep_prob_)(ff_out)
Layer_norm_input=tf.keras.layers.Add()([Layer_norm, drop4])
Attention_block_out=tf.keras.layers.LayerNormalization(axis=-1)(Layer_norm_input)
intraEpoch_att_block=tf.keras.Model(inputs=input_features, outputs=Attention_block_out)
I have read about creating custom layers in Keras, but I did not find the documentation clear enough. I want to reuse this sub-model several times within a single functional API model in tensorflow.keras, with each copy able to learn a different set of weights.
Use this code (I removed SelfAttention, so add it back):
import tensorflow as tf
class my_model(tf.keras.layers.Layer):
    def __init__(self):
        super(my_model, self).__init__()
        keep_prob_=0.5
        input_features=tf.keras.layers.Input(shape=(29, 1664))
        drop3=tf.keras.layers.Dropout(keep_prob_)(input_features)
        Layer_norm_feat=tf.keras.layers.Add()([input_features, drop3])
        Layer_norm=tf.keras.layers.LayerNormalization(axis=-1)(Layer_norm_feat)
        ff_out=tf.keras.layers.Dense(Layer_norm.shape[2], activation='relu')(Layer_norm)
        ff_out=tf.keras.layers.Dense(Layer_norm.shape[2])(ff_out)
        drop4=tf.keras.layers.Dropout(keep_prob_)(ff_out)
        Layer_norm_input=tf.keras.layers.Add()([Layer_norm, drop4])
        Attention_block_out=tf.keras.layers.LayerNormalization(axis=-1)(Layer_norm_input)
        self.intraEpoch_att_block=tf.keras.Model(inputs=input_features, outputs=Attention_block_out)
    def call(self, inp, training=False):
        # forward the training flag so the Dropout layers are only active during training
        x = self.intraEpoch_att_block(inp, training=training)
        return x

# each instance builds its own layers, so model1 and model2 learn separate weights
model1 = my_model()
model2 = my_model()
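As a usage sketch of the same idea, the block below defines a reusable layer and places two instances of it in one functional model. It is an illustration under assumptions, not the answerer's code: FeedForwardBlock and its attribute names are hypothetical, the feature size is shrunk to 16 to keep it light, and SelfAttention is left out because it is not a built-in Keras layer.

```python
import tensorflow as tf

class FeedForwardBlock(tf.keras.layers.Layer):
    """Hypothetical reusable block: feed-forward, dropout, residual add, layer norm."""
    def __init__(self, units, rate=0.5, **kwargs):
        super().__init__(**kwargs)
        # Sub-layers are created per instance, so each block owns its weights
        self.dense1 = tf.keras.layers.Dense(units, activation="relu")
        self.dense2 = tf.keras.layers.Dense(units)
        self.dropout = tf.keras.layers.Dropout(rate)
        self.residual_add = tf.keras.layers.Add()
        self.norm = tf.keras.layers.LayerNormalization(axis=-1)

    def call(self, inputs, training=False):
        x = self.dense2(self.dense1(inputs))
        x = self.dropout(x, training=training)
        return self.norm(self.residual_add([inputs, x]))

# Same architecture used at two positions in one functional model
inputs = tf.keras.Input(shape=(29, 16))
block_a = FeedForwardBlock(16)
block_b = FeedForwardBlock(16)
outputs = block_b(block_a(inputs))
model = tf.keras.Model(inputs, outputs)
```

Calling the same instance twice would share its weights between the two positions; creating a new instance for each position is what gives each one an independent set of weights.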
