Loading a modified pretrained model using strict=False in PyTorch

Loading a modified pretrained model using strict=False in PyTorch - python

I want to use a pretrained model as the encoder part in my model. You can find a version of my model:
class MyClass(nn.Module):
def __init__(self, pretrained=False):
super(MyClass, self).__init__()
self.encoder=S3D_featureExtractor_multi_output()
if pretrained:
weight_dict=torch.load(os.path.join('models','weights.pt'))
model_dict=self.encoder.state_dict()
list_weight_dict=list(weight_dict.items())
list_model_dict=list(model_dict.items())
for i in range(len(list_model_dict)):
assert list_model_dict[i][1].shape==list_weight_dict[i][1].shape
model_dict[list_model_dict[i][0]].copy_(weight_dict[list_weight_dict[i][0]])
for i in range(len(list_model_dict)):
assert torch.all(torch.eq(model_dict[list_model_dict[i][0]],weight_dict[list_weight_dict[i][0]].to('cpu')))
print('Loading finished!')
def forward(self, x):
a, b = self.encoder(x)
return a, b
Because I modified some parts of the code of this pretrained model, based on this post I need to apply strict=False to avoid facing error, but based on the scenario that I load the pretrained weights, I cannot find a place in the code to apply strict=False. How can I apply that or how can I change the scenario of loading the pretrained model taht makes it possible to apply strict=False?

strict = False is to specify when you use load_state_dict() method. state_dict are just Python dictionaries that helps you save and load model weights.
(for more details, see https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html)
If you use strict=False in load_state_dict, you inform PyTorch that the target model and the original model are not identical, so it just initialises the weights of layers which are present in both and ignores the rest.
(see https://pytorch.org/docs/stable/generated/torch.nn.Module.html?highlight=load_state_dict#torch.nn.Module.load_state_dict)
So, you will need to specify the strict argument when you load the pretrained model weights. load_state_dict can be called at this step.
If the model for which weights must be loaded is self.encoder
and if state_dict can be retrieved from the model you just loaded, you can just do this
loaded_weights = torch.load(os.path.join('models','weights.pt'))
self.encoder.load_state_dict(loaded_weights, strict=False)
for more details and a tutorial, see https://pytorch.org/tutorials/beginner/saving_loading_models.html .

Related

Is there way to embed non-tf functions to a tf.Keras model graph as SavedModel Signature?

I want to add preprocessing functions and methods to the model graph as a SavedModel signature.
example:
# suppose we have a keras model
# ...
# defining the function I want to add to the model graph
#tf.function
def process(model, img_path):
# do some preprocessing using different libs. and modules...
outputs = {"preds": model.predict(preprocessed_img)}
return outputs
# saving the model with a custom signature
tf.saved_model.save(new_model, dst_path,
signatures={"process": process})
or we can use tf.Module here. However, the problem is I can not embed custom functions into the saved model graph.
Is there any way to do that?

I think you slightly misunderstand the purpose of save_model method in Tensorflow.
As per the documentation the intent is to have a method which serialises the model's graph so that it can be loaded with load_model afterwards.
The model returned by load_model is a class of tf.Module with all it's methods and attributes. Instead you want to serialise the prediction pipeline.
To be honest, I'm not aware of a good way to do that, however what you can do is to use a different method for serialisation of your preprocessing parameters, for example pickle or a different one, provided by the framework you use and write a class on top of that, which would do the following:
class MyModel:
def __init__(self, model_path, preprocessing_path):
self.model = load_model(model_path)
self.preprocessing = load_preprocessing(preprocessing_path)
def predict(self, img_path):
return self.model.predict(self.preprocessing(img_path))

Tensorflow Keras Model subclassing -- call function

I am experimenting with self supervised learning using tensorflow. The example code I'm running can be found in the Keras examples website. This is the link to the NNCLR example. The Github link to download the code can be found here. While I have no issues running the examples, I am running into issues when I try to save the pretrained or the finetuned model using model.save().
The error I'm getting is this:
f"Model {model} cannot be saved either because the input shape is not "
ValueError: Model <__main__.NNCLR object at 0x7f6bc0f39550> cannot be saved either
because the input shape is not available or because the forward pass of the model is
not defined. To define a forward pass, please override `Model.call()`.
To specify an input shape, either call `build(input_shape)` directly, or call the model on actual data using `Model()`, `Model.fit()`, or `Model.predict()`.
If you have a custom training step, please make sure to invoke the forward pass in train step through
`Model.__call__`, i.e. `model(inputs)`, as opposed to `model.call()`.
I am unsure how to override the Model.call() method. Appreciate some help.

One way to achieve model saving in such cases is to override the save (or save_weights) method in the keras.Model class. In your case, first initialize the finetune model in the NNCLR class. And next, override the save method for it. FYI, in this way, you may also able to use ModelCheckpoint API.
As said, define the finetune model in the NNCLR model class and override the save method for it.
class NNCLR(keras.Model):
def __init__(...):
super().__init__()
...
self.finetuning_model = keras.Sequential(
[
layers.Input(shape=input_shape),
self.classification_augmenter,
self.encoder,
layers.Dense(10),
],
name="finetuning_model",
)
...
def save(
self, filepath, overwrite=True, include_optimizer=True,
save_format=None, signatures=None, options=None
):
self.finetuning_model.save(
filepath=filepath,
overwrite=overwrite,
save_format=save_format,
options=options,
include_optimizer=include_optimizer,
signatures=signatures
)
model = NNCLR(...)
model.compile
model.fit
Next, you can do
model.save('finetune_model') # SavedModel format
finetune_model = tf.keras.models.load_model('finetune_model', compile=False)
'''
NNCLR code example: Evaluate sections
"A popular way to evaluate a SSL method in computer vision or
for that fact any other pre-training method as such is to learn
a linear classifier on the frozen features of the trained backbone
model and evaluate the classifier on unseen images."
'''
for layer in finetune_model.layers:
if not isinstance(layer, layers.Dense):
layer.trainable = False
finetune_model.summary() # OK
finetune_model.compile(
optimizer=keras.optimizers.Adam(),
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")],
)
finetune_model.fit

How do we create a reusable block that share architecture in a single model but learn different set of weight in the single model in Keras?

I am using tensorflow.keras and want to know if it is possible to create reusable blocks of inbuilt Keras layers. For example, I would like to repeatedly use the same set of layers (that able to learn the different weights) at a different position in a model. I would like to use the following block at different times in my model.
keep_prob_=0.5
input_features=Input(shape=(29, 1664))
Imortant_features= SelfAttention(activation='tanh',
kernel_regularizer=tf.keras.regularizers.l2(0.), kernel_initializer='glorot_uniform'
(input_features)
drop3=tf.keras.layers.Dropout(keep_prob_)(Imortant_features)
Layer_norm_feat=tf.keras.layers.Add()([input_features, drop3])
Layer_norm=tf.keras.layers.LayerNormalization(axis=-1)(Layer_norm_feat)
ff_out=tf.keras.layers.Dense(Layer_norm.shape[2], activation='relu')(Layer_norm)
ff_out=tf.keras.layers.Dense(Layer_norm.shape[2])(ff_out)
drop4=tf.keras.layers.Dropout(keep_prob_)(ff_out)
Layer_norm_input=tf.keras.layers.Add()([Layer_norm, drop4])
Attention_block_out=tf.keras.layers.LayerNormalization(axis=-1)(Layer_norm_input)
intraEpoch_att_block=tf.keras.Model(inputs=input_features, outputs=Attention_block_out)
I have read about creating custom layers in Keras but I did not find the documentation to be clear enough. I want to reuse the sub-model which able to learn the different set of weight in a single functional API model in tensorflow.keras.

Use this code (I removed SelfAttention, so add it back):
import tensorflow as tf
class my_model(tf.keras.layers.Layer):
def __init__(self):
super(my_model, self).__init__()
keep_prob_=0.5
input_features=tf.keras.layers.Input(shape=(29, 1664))
drop3=tf.keras.layers.Dropout(keep_prob_)(input_features)
Layer_norm_feat=tf.keras.layers.Add()([input_features, drop3])
Layer_norm=tf.keras.layers.LayerNormalization(axis=-1)(Layer_norm_feat)
ff_out=tf.keras.layers.Dense(Layer_norm.shape[2], activation='relu')(Layer_norm)
ff_out=tf.keras.layers.Dense(Layer_norm.shape[2])(ff_out)
drop4=tf.keras.layers.Dropout(keep_prob_)(ff_out)
Layer_norm_input=tf.keras.layers.Add()([Layer_norm, drop4])
Attention_block_out=tf.keras.layers.LayerNormalization(axis=-1)(Layer_norm_input)
self.intraEpoch_att_block=tf.keras.Model(inputs=input_features, outputs=Attention_block_out)
def call(self, inp, training=False):
x = self.intraEpoch_att_block(inp)
return x
model1 = my_model()
model2 = my_model()

Loading & Freezing a Pretrained Model to Combine with a New Network

I have a pretrained model and would like to build a classifier on top of it. I’m trying to load and freeze the weights of the pretrained model, and pass its outputs to the new classifier, which I’d like to optimise. Here is what I have so far, I’m a little stuck on a TypeError: forward() missing 1 required positional argument: 'x' error from the nn.Sequential line:
import model #model.py contains the architecture of the pretrained model
class Classifier(nn.Module):
def __init__(self):
...
def forward(self, x):
...
net = model.Model()
net.load_state_dict(checkpoint["net"])
for c in net.children():
for param in child.parameters():
params.requires_grad = False
model = nn.Sequential(nn.ModuleList(net()), Classifier())

TL;DR
model = nn.Sequential(nn.ModuleList(net), Classifier())
You are "calling" net.forward by net(), as opposed to the __init__ method of Classifier class in Classifier().

I finally solved this issue after a discussion with #ptrblck from the PyTorch Forums. The solution is similar to Shai's answer, only that because net contains an instance of the model.Model class, one should do model = nn.Sequential(net, Classifier()) instead, without calling nn.ModuleList().

Keras Lambda CTC unable to get model to load

Hi I have a model which is based on this https://github.com/igormq/asr-study/tree/keras-2 that is able to just about save okay but is unable to load (either full mode or json/weights) due to the fact the loss isn't defined properly.
inputs = Input(name='inputs', shape=(None, num_features))
...
o = TimeDistributed(Dense(num_hiddens))(inputs)
# Output layer
outputs = TimeDistributed(Dense(num_classes))(o)
# Define placeholders
labels = Input(name='labels', shape=(None,), dtype='int32', sparse=True)
inputs_length = Input(name='inputs_length', shape=(None,), dtype='int32')
# Define a decoder
dec = Lambda(ctc_utils.decode, output_shape=ctc_utils.decode_output_shape,
arguments={'is_greedy': True}, name='decoder')
y_pred = dec([output, inputs_length])
loss = ctc_utils.ctc_loss(output, labels, input_length)
model = Model(input=[inputs, labels, inputs_length], output=y_pred)
model.add_loss(loss)
opt = Adam(lr=args.lr, clipnorm=args.clipnorm)
# Compile with dummy loss
model.compile(optimizer=opt, loss=None, metrics=[metrics.ler])
This will compile and run (note it uses the add_loss function which isn't very well documented). It can even be convinced to save with a bit of work - as this post hints (https://github.com/fchollet/keras/issues/5179) you can make it save by forcing the graph to be complete. I did this by making a dummy lambda loss function to bring in the inputs that weren't fully part of the graph, now this appears to work.
#this captures all the dangling nodes so will now save
fake_dummy_loss = Lambda(fake_ctc_loss,output_shape(1,),name=ctc)([y_pred,labels,inputs_length])
def fake_ctc_loss(args):
return tf.Variable(tf.zeros([1]),name="fakeloss")
We can add this to the model like so:
model = Model(input=[inputs, labels, inputs_length], output=[y_pred, fake_dummy_loss])
Now the loss when trying to load, says that it cannot due to the fact that it is missing a loss function (i guess this is because it's set to None despite add_loss being used.
Any help here appreciated

I faced a similar problem in a project of mine in which add_loss is used to manually add a custom loss function to my model. You can see my model here: Keras Loss Function with Additional Dynamic Parameter As you found, loading the model with load_model fails, complaining about a missing loss function.
Anyway, my solution was to save and load the model's weights rather than the whole model. The Model class has a save_weights method, which is discussed here: https://keras.io/models/about-keras-models/ Likewise, there's a load_weights method. Using these methods, you should be able to save and load the model just fine. The downside is that you have to define the model upfront, and then load the weights. In my project that wasn't an issue and only involved a small refactor.
Hope that helps.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Loading a modified pretrained model using strict=False in PyTorch - python

Related

Is there way to embed non-tf functions to a tf.Keras model graph as SavedModel Signature?

Tensorflow Keras Model subclassing -- call function

How do we create a reusable block that share architecture in a single model but learn different set of weight in the single model in Keras?

Loading & Freezing a Pretrained Model to Combine with a New Network

Keras Lambda CTC unable to get model to load

Categories

Resources