Tensorflow Keras Model subclassing -- call function - python

I am experimenting with self supervised learning using tensorflow. The example code I'm running can be found in the Keras examples website. This is the link to the NNCLR example. The Github link to download the code can be found here. While I have no issues running the examples, I am running into issues when I try to save the pretrained or the finetuned model using model.save().
The error I'm getting is this:
f"Model {model} cannot be saved either because the input shape is not "
ValueError: Model <__main__.NNCLR object at 0x7f6bc0f39550> cannot be saved either
because the input shape is not available or because the forward pass of the model is
not defined. To define a forward pass, please override `Model.call()`.
To specify an input shape, either call `build(input_shape)` directly, or call the model on actual data using `Model()`, `Model.fit()`, or `Model.predict()`.
If you have a custom training step, please make sure to invoke the forward pass in train step through
`Model.__call__`, i.e. `model(inputs)`, as opposed to `model.call()`.
I am unsure how to override the Model.call() method. Appreciate some help.

One way to achieve model saving in such cases is to override the save (or save_weights) method in the keras.Model class. In your case, first initialize the finetune model in the NNCLR class. And next, override the save method for it. FYI, in this way, you may also able to use ModelCheckpoint API.
As said, define the finetune model in the NNCLR model class and override the save method for it.
class NNCLR(keras.Model):
def __init__(...):
super().__init__()
...
self.finetuning_model = keras.Sequential(
[
layers.Input(shape=input_shape),
self.classification_augmenter,
self.encoder,
layers.Dense(10),
],
name="finetuning_model",
)
...
def save(
self, filepath, overwrite=True, include_optimizer=True,
save_format=None, signatures=None, options=None
):
self.finetuning_model.save(
filepath=filepath,
overwrite=overwrite,
save_format=save_format,
options=options,
include_optimizer=include_optimizer,
signatures=signatures
)
model = NNCLR(...)
model.compile
model.fit
Next, you can do
model.save('finetune_model') # SavedModel format
finetune_model = tf.keras.models.load_model('finetune_model', compile=False)
'''
NNCLR code example: Evaluate sections
"A popular way to evaluate a SSL method in computer vision or
for that fact any other pre-training method as such is to learn
a linear classifier on the frozen features of the trained backbone
model and evaluate the classifier on unseen images."
'''
for layer in finetune_model.layers:
if not isinstance(layer, layers.Dense):
layer.trainable = False
finetune_model.summary() # OK
finetune_model.compile(
optimizer=keras.optimizers.Adam(),
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")],
)
finetune_model.fit

Related

Is there way to embed non-tf functions to a tf.Keras model graph as SavedModel Signature?

I want to add preprocessing functions and methods to the model graph as a SavedModel signature.
example:
# suppose we have a keras model
# ...
# defining the function I want to add to the model graph
#tf.function
def process(model, img_path):
# do some preprocessing using different libs. and modules...
outputs = {"preds": model.predict(preprocessed_img)}
return outputs
# saving the model with a custom signature
tf.saved_model.save(new_model, dst_path,
signatures={"process": process})
or we can use tf.Module here. However, the problem is I can not embed custom functions into the saved model graph.
Is there any way to do that?
I think you slightly misunderstand the purpose of save_model method in Tensorflow.
As per the documentation the intent is to have a method which serialises the model's graph so that it can be loaded with load_model afterwards.
The model returned by load_model is a class of tf.Module with all it's methods and attributes. Instead you want to serialise the prediction pipeline.
To be honest, I'm not aware of a good way to do that, however what you can do is to use a different method for serialisation of your preprocessing parameters, for example pickle or a different one, provided by the framework you use and write a class on top of that, which would do the following:
class MyModel:
def __init__(self, model_path, preprocessing_path):
self.model = load_model(model_path)
self.preprocessing = load_preprocessing(preprocessing_path)
def predict(self, img_path):
return self.model.predict(self.preprocessing(img_path))

Loading a modified pretrained model using strict=False in PyTorch

I want to use a pretrained model as the encoder part in my model. You can find a version of my model:
class MyClass(nn.Module):
def __init__(self, pretrained=False):
super(MyClass, self).__init__()
self.encoder=S3D_featureExtractor_multi_output()
if pretrained:
weight_dict=torch.load(os.path.join('models','weights.pt'))
model_dict=self.encoder.state_dict()
list_weight_dict=list(weight_dict.items())
list_model_dict=list(model_dict.items())
for i in range(len(list_model_dict)):
assert list_model_dict[i][1].shape==list_weight_dict[i][1].shape
model_dict[list_model_dict[i][0]].copy_(weight_dict[list_weight_dict[i][0]])
for i in range(len(list_model_dict)):
assert torch.all(torch.eq(model_dict[list_model_dict[i][0]],weight_dict[list_weight_dict[i][0]].to('cpu')))
print('Loading finished!')
def forward(self, x):
a, b = self.encoder(x)
return a, b
Because I modified some parts of the code of this pretrained model, based on this post I need to apply strict=False to avoid facing error, but based on the scenario that I load the pretrained weights, I cannot find a place in the code to apply strict=False. How can I apply that or how can I change the scenario of loading the pretrained model taht makes it possible to apply strict=False?
strict = False is to specify when you use load_state_dict() method. state_dict are just Python dictionaries that helps you save and load model weights.
(for more details, see https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html)
If you use strict=False in load_state_dict, you inform PyTorch that the target model and the original model are not identical, so it just initialises the weights of layers which are present in both and ignores the rest.
(see https://pytorch.org/docs/stable/generated/torch.nn.Module.html?highlight=load_state_dict#torch.nn.Module.load_state_dict)
So, you will need to specify the strict argument when you load the pretrained model weights. load_state_dict can be called at this step.
If the model for which weights must be loaded is self.encoder
and if state_dict can be retrieved from the model you just loaded, you can just do this
loaded_weights = torch.load(os.path.join('models','weights.pt'))
self.encoder.load_state_dict(loaded_weights, strict=False)
for more details and a tutorial, see https://pytorch.org/tutorials/beginner/saving_loading_models.html .

How to download a HuggingFace model 'transformers.trainer.Trainer'?

In 1 code., I have uploaded hugging face 'transformers.trainer.Trainer' based model using save_pretrained() function
In 2nd code, I want to download this uploaded model and use it to make predictions. I need help in this step - How to download the uploaded model & then make a prediction?
Steps to create model:
from transformers import AutoModelForQuestionAnswering, TrainingArguments, Trainer
model = AutoModelForQuestionAnswering.from_pretrained('xlm-roberta-large)
trainer = Trainer(
model,
args,
train_dataset=tokenized_train_ds,
eval_dataset=tokenized_val_ds,
data_collator=data_collator,
tokenizer=tokenizer,)
#Arguments used above not mentioned here - model, args, tokenized_train_ds, tokenized_val_ds, data_collator, tokenizer
#Below step train the pre-trained model
trainer.train()
I then uploaded this 'trainer' model using the below command:-
trainer.save_model('./trainer_sm')
In a different code, I now want to download this model & use it for making predictions, Can someone advise how to do this? I tried the below command to upload it:-
model_sm=AutoModelForQuestionAnswering.from_pretrained("./trainer_sm")
And used it to make predictions by this line of code:-
model_sm.predict(test_features)
AttributeError: 'XLMRobertaForQuestionAnswering' object has no attribute 'predict'
I also used 'use_auth_token=True' as an argument for from_pretrained, but that also didn't work.
Also, type(trainer) is 'transformers.trainer.Trainer' , while type(model_sm) is transformers.models.xlm_roberta.modeling_xlm_roberta.XLMRobertaForQuestionAnswering
What you have saved is the model which the trainer was going to tune and you should be aware that predicting, training, evaluation and etc, are the utilities of transformers.trainer.Trainer object, not transformers.models.xlm_roberta.modeling_xlm_roberta.XLMRobertaForQuestionAnswering. Based on what was mentioned the easiest way to keep things going is creating another instance of the trainer.
model_sm=AutoModelForQuestionAnswering.from_pretrained("./trainer_sm")
reloaded_trainer = Trainer(
model = model_sm,
tokenizer = tokenizer,
# other arguments if you have changed the defaults
)
reloaded_trainer.predict(test_dataset)

Unable to restore custom object of type _tf_keras_metric currently using the HammingLoss metric from TensorFlow Addons module

I have created and trained a TensorFlow model using the HammingLoss metric from TensorFlow addons. Thus, it's not a custom metric that I have created on my own. I use a callbacks function with the methords ModelCheckpoint() and EarlyStopping to save the best weights of the best model and stop model training at a given threshold repsectively. When I save the model checkpoint I serialize the whole model structure (similar to model.save()), istead of model.save_weights(), which would have saved only the model weights (more about ModelCheckpoint here).
TL;DR: Here is a colab notebook with the code I post below in case you want to skip this.
The model I have trained is saved in GoogleDrive in the link here. To load the specific model I use the following code:
neural_network_parameters = {}
#======================================================================
# PARAMETERS THAT DEFINE THE NEURAL NETWORK STRUCTURE =
#======================================================================
neural_network_parameters['model_loss'] = tf.keras.losses.BinaryCrossentropy(from_logits=False, name='binary_crossentropy')
neural_network_parameters['model_metric'] = [tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss"),
tfa.metrics.F1Score(17, average="micro", name="f1_score_micro"),
tfa.metrics.F1Score(17, average=None, name="f1_score_none"),
tfa.metrics.F1Score(17, average="macro", name="f1_score_macro"),
tfa.metrics.F1Score(17, average="weighted", name="f1_score_weighted")]
"""Initialize the hyper parameters tuning the model using Tensorflow's hyperparameters module"""
HP_HIDDEN_UNITS = hp.HParam('batch_size', hp.Discrete([32]))
HP_EMBEDDING_DIM = hp.HParam('embedding_dim', hp.Discrete([50]))
HP_LEARNING_RATE = hp.HParam('learning_rate', hp.Discrete([0.001])) # Adam default: 0.001, SGD default: 0.01, RMSprop default: 0.001....0.1 to be removed
HP_DECAY_STEPS_MULTIPLIER = hp.HParam('decay_steps_multiplier', hp.Discrete([10]))
METRIC_ACCURACY = "hamming_loss"
dependencies = {
'hamming_loss': tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss"),
'attention': attention(return_sequences=True)
}
def import_trained_keras_model(model_index, method, decay_steps_mode, optimizer_name, hparams):
"""Load the model"""
training_date="2021-02-27"
model_path_structure=f"{folder_path_model_saved}/{initialize_notebbok_variables.saved_model_name}_{hparams[HP_EMBEDDING_DIM]}dim_{hparams[HP_HIDDEN_UNITS]}batchsize_{hparams[HP_LEARNING_RATE]}lr_{hparams[HP_DECAY_STEPS_MULTIPLIER]}decaymultiplier_{training_date}"
model_imported=load_model(f"{model_path_structure}", custom_objects=dependencies)
if optimizer_name=="adam":
optimizer = optimizer_adam_v2(hparams)
elif optimizer_name=="sgd":
optimizer = optimizer_sgd_v1(hparams, "step decay")
else:
optimizer = optimizer_rmsprop_v1(hparams)
model_imported.compile(optimizer=optimizer,
loss=neural_network_parameters['model_loss'],
metrics=neural_network_parameters['model_metric'])
print(f"Model {model_index} is loaded successfully\n")
return model_imported
Calling the function import trained keras model
"""Now that the functions have been created it's time to import each trained classifier from the selected dictionary of hyper parameters, calculate the evaluation metric per model and finally serialize the scores dataframe for later use."""
list_models=[] #a list to store imported models
model_optimizer="adam"
for batch_size in HP_HIDDEN_UNITS.domain.values:
for embedding_dim in HP_EMBEDDING_DIM.domain.values:
for learning_rate in HP_LEARNING_RATE.domain.values:
for decay_steps_multiplier in HP_DECAY_STEPS_MULTIPLIER.domain.values:
hparams = {
HP_HIDDEN_UNITS: batch_size,
HP_EMBEDDING_DIM: embedding_dim,
HP_LEARNING_RATE: learning_rate,
HP_DECAY_STEPS_MULTIPLIER: decay_steps_multiplier
}
print(f"\n{len(list_models)+1}/{(len(HP_HIDDEN_UNITS.domain.values)*len(HP_EMBEDDING_DIM.domain.values)*len(HP_LEARNING_RATE.domain.values)*len(HP_DECAY_STEPS_MULTIPLIER.domain.values))}")
print({h.name: hparams[h] for h in hparams},'\n')
model_object=import_trained_keras_model(len(list_models)+1, "import custom trained model", "on", model_optimizer, hparams)
list_models.append(model_object)
When I call the function I get the following error
ValueError: Unable to restore custom object of type _tf_keras_metric currently. Please make sure that the layer implements get_configand from_config when saving. In addition, please use the custom_objects arg when calling load_model().
It's strange that I get this error since the model metric to compile the NN is from a built in method of TensorFlow and NOT some sort of a custom metric that I developed myself.
I have searched also this thread in GitHub which closed without explaining the root of the problem.
[UPDATE]--Found a temporary solution
I managed to successfully import the model by turning the compile argument to False in order to re-compile the model imported inside the function.
So I did smth like model_imported=load_model(f"{model_path_structure}", custom_objects=dependencies, compile=False).
This action produced the following result:
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements get_config and from_config when saving. In addition, please use the custom_objects arg when calling load_model().
Model 1 is loaded successfully.
So TensorFlow still cannot understand that HammingLoss is not a custom metric but rather a metric imported from Tensorflow Addons. However, despite the warning the model loaded successfully.

Keras Lambda CTC unable to get model to load

Hi I have a model which is based on this https://github.com/igormq/asr-study/tree/keras-2 that is able to just about save okay but is unable to load (either full mode or json/weights) due to the fact the loss isn't defined properly.
inputs = Input(name='inputs', shape=(None, num_features))
...
o = TimeDistributed(Dense(num_hiddens))(inputs)
# Output layer
outputs = TimeDistributed(Dense(num_classes))(o)
# Define placeholders
labels = Input(name='labels', shape=(None,), dtype='int32', sparse=True)
inputs_length = Input(name='inputs_length', shape=(None,), dtype='int32')
# Define a decoder
dec = Lambda(ctc_utils.decode, output_shape=ctc_utils.decode_output_shape,
arguments={'is_greedy': True}, name='decoder')
y_pred = dec([output, inputs_length])
loss = ctc_utils.ctc_loss(output, labels, input_length)
model = Model(input=[inputs, labels, inputs_length], output=y_pred)
model.add_loss(loss)
opt = Adam(lr=args.lr, clipnorm=args.clipnorm)
# Compile with dummy loss
model.compile(optimizer=opt, loss=None, metrics=[metrics.ler])
This will compile and run (note it uses the add_loss function which isn't very well documented). It can even be convinced to save with a bit of work - as this post hints (https://github.com/fchollet/keras/issues/5179) you can make it save by forcing the graph to be complete. I did this by making a dummy lambda loss function to bring in the inputs that weren't fully part of the graph, now this appears to work.
#this captures all the dangling nodes so will now save
fake_dummy_loss = Lambda(fake_ctc_loss,output_shape(1,),name=ctc)([y_pred,labels,inputs_length])
def fake_ctc_loss(args):
return tf.Variable(tf.zeros([1]),name="fakeloss")
We can add this to the model like so:
model = Model(input=[inputs, labels, inputs_length], output=[y_pred, fake_dummy_loss])
Now the loss when trying to load, says that it cannot due to the fact that it is missing a loss function (i guess this is because it's set to None despite add_loss being used.
Any help here appreciated
I faced a similar problem in a project of mine in which add_loss is used to manually add a custom loss function to my model. You can see my model here: Keras Loss Function with Additional Dynamic Parameter As you found, loading the model with load_model fails, complaining about a missing loss function.
Anyway, my solution was to save and load the model's weights rather than the whole model. The Model class has a save_weights method, which is discussed here: https://keras.io/models/about-keras-models/ Likewise, there's a load_weights method. Using these methods, you should be able to save and load the model just fine. The downside is that you have to define the model upfront, and then load the weights. In my project that wasn't an issue and only involved a small refactor.
Hope that helps.

Categories