In the first script, I saved a Hugging Face 'transformers.trainer.Trainer'-based model using the save_pretrained() function.
In the second script, I want to load this saved model and use it to make predictions. I need help with this step: how do I load the saved model and then make a prediction?
Steps to create the model:
from transformers import AutoModelForQuestionAnswering, TrainingArguments, Trainer

model = AutoModelForQuestionAnswering.from_pretrained('xlm-roberta-large')
trainer = Trainer(
    model,
    args,
    train_dataset=tokenized_train_ds,
    eval_dataset=tokenized_val_ds,
    data_collator=data_collator,
    tokenizer=tokenizer,
)
# Arguments used above are defined elsewhere and not shown here: args, tokenized_train_ds, tokenized_val_ds, data_collator, tokenizer
# The step below fine-tunes the pre-trained model
trainer.train()
I then saved this trained model using the command below:
trainer.save_model('./trainer_sm')
In a different script, I now want to load this model and use it to make predictions. Can someone advise how to do this? I tried the command below to load it:
model_sm=AutoModelForQuestionAnswering.from_pretrained("./trainer_sm")
And used it to make predictions with this line of code:
model_sm.predict(test_features)
AttributeError: 'XLMRobertaForQuestionAnswering' object has no attribute 'predict'
I also passed 'use_auth_token=True' as an argument to from_pretrained, but that didn't work either.
Also, type(trainer) is transformers.trainer.Trainer, while type(model_sm) is transformers.models.xlm_roberta.modeling_xlm_roberta.XLMRobertaForQuestionAnswering.
What you have saved is the model that the trainer was fine-tuning. Be aware that predicting, training, evaluating, etc. are utilities of the transformers.trainer.Trainer object, not of transformers.models.xlm_roberta.modeling_xlm_roberta.XLMRobertaForQuestionAnswering. Given that, the easiest way to keep things going is to create another instance of the trainer:
model_sm = AutoModelForQuestionAnswering.from_pretrained("./trainer_sm")
reloaded_trainer = Trainer(
    model=model_sm,
    tokenizer=tokenizer,
    # other arguments if you have changed the defaults
)
reloaded_trainer.predict(test_dataset)
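Note that Trainer.predict() returns a PredictionOutput named tuple (fields predictions, label_ids, metrics) rather than final answers, so you typically post-process its predictions field. A minimal sketch for this QA setup; the two logit arrays per example are an assumption based on the question-answering head:

import numpy as np

output = reloaded_trainer.predict(test_dataset)  # PredictionOutput named tuple
start_logits, end_logits = output.predictions    # QA heads emit start/end logits
start_idx = np.argmax(start_logits, axis=-1)     # most likely answer-start token
end_idx = np.argmax(end_logits, axis=-1)         # most likely answer-end token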
I am experimenting with self-supervised learning using TensorFlow. The example code I'm running can be found on the Keras examples website; this is the link to the NNCLR example, and the GitHub link to download the code can be found here. While I have no issues running the example, I run into issues when I try to save the pretrained or the fine-tuned model using model.save().
The error I'm getting is this:
f"Model {model} cannot be saved either because the input shape is not "
ValueError: Model <__main__.NNCLR object at 0x7f6bc0f39550> cannot be saved either
because the input shape is not available or because the forward pass of the model is
not defined. To define a forward pass, please override `Model.call()`.
To specify an input shape, either call `build(input_shape)` directly, or call the model on actual data using `Model()`, `Model.fit()`, or `Model.predict()`.
If you have a custom training step, please make sure to invoke the forward pass in train step through
`Model.__call__`, i.e. `model(inputs)`, as opposed to `model.call()`.
I am unsure how to override the Model.call() method. I'd appreciate some help.
One way to achieve model saving in such cases is to override the save (or save_weights) method of the keras.Model class. In your case, first initialize the fine-tuning model in the NNCLR class, and next override the save method for it. FYI, this way you may also be able to use the ModelCheckpoint API.
As said, define the fine-tuning model in the NNCLR model class and override the save method for it:
class NNCLR(keras.Model):
    def __init__(...):
        super().__init__()
        ...
        self.finetuning_model = keras.Sequential(
            [
                layers.Input(shape=input_shape),
                self.classification_augmenter,
                self.encoder,
                layers.Dense(10),
            ],
            name="finetuning_model",
        )
        ...

    def save(
        self, filepath, overwrite=True, include_optimizer=True,
        save_format=None, signatures=None, options=None
    ):
        self.finetuning_model.save(
            filepath=filepath,
            overwrite=overwrite,
            save_format=save_format,
            options=options,
            include_optimizer=include_optimizer,
            signatures=signatures,
        )
model = NNCLR(...)
model.compile(...)
model.fit(...)
Next, you can do
model.save('finetune_model') # SavedModel format
finetune_model = tf.keras.models.load_model('finetune_model', compile=False)
'''
NNCLR code example: Evaluate sections
"A popular way to evaluate a SSL method in computer vision or
for that fact any other pre-training method as such is to learn
a linear classifier on the frozen features of the trained backbone
model and evaluate the classifier on unseen images."
'''
for layer in finetune_model.layers:
    if not isinstance(layer, layers.Dense):
        layer.trainable = False

finetune_model.summary()  # OK
finetune_model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")],
)
finetune_model.fit(...)
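Since save is overridden on NNCLR, the ModelCheckpoint API mentioned above should also work, because the callback calls model.save() under the hood. A minimal sketch, assuming the model from above and a placeholder train_dataset from your own pipeline:

from tensorflow import keras

# Hypothetical checkpoint path; each save writes only self.finetuning_model.
checkpoint_cb = keras.callbacks.ModelCheckpoint("nnclr_ckpt", save_best_only=False)
model.fit(train_dataset, epochs=10, callbacks=[checkpoint_cb])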
So I defined my Keras model and used a custom_loss function to train it:
model.compile(optimizer='adam', loss=custom_loss, metrics=[custom_loss])
Then I am training the model:
history = model.fit(X_train, y_train, batch_size=1024, epochs=125, validation_split=0.2, shuffle=True)
Then I save this history object using the following code:
with open('history.pkl', 'wb') as file:
    pickle.dump(history, file)
Now, when I am trying to read the history object as follows:
with open('history.pkl', 'rb') as file:
    history = pickle.load(file)
I get the following error:
ValueError: Unknown loss function:custom_loss
How can I read the history object? I don't get this error when I am not using the custom_loss function.
I am using Keras 2.2.4 and TensorFlow 1.15.5.
Edit: Complete error traceback as requested:
For most use cases, you don't want to serialize the history object. What you are usually interested in is history.history, which is a dict of the logs / metrics / losses / etc.
Try this:
pickle.dump(history.history, file)
The fuller answer is that the returned history object is a tf.keras.callbacks.History, which subclasses tf.keras.callbacks.Callback. Callback itself holds a reference to the model, which in turn holds references to all kinds of things, including custom objects like your custom loss. Serialization of Keras custom objects is a whole other big topic; tl;dr, the recommended way to serialize Keras models is not to use pickle.
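A minimal round-trip sketch of that suggestion, using the history object from the model.fit call above:

import pickle

# Dump only the metrics dict; it holds plain lists, so no custom objects are pickled.
with open('history.pkl', 'wb') as file:
    pickle.dump(history.history, file)

# Reloading later works without custom_loss in scope.
with open('history.pkl', 'rb') as file:
    logs = pickle.load(file)
print(logs['loss'][-1], logs['val_loss'][-1])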
+1 to @Yaoshiang's answer. That's the right answer.
This is just a big note.
Reading that trace, it looks like Keras has custom pickle logic that uses the standard Keras save/load logic. Keras doesn't know about your loss function unless you tell it.
Search for "custom objects" in this guide: https://keras.io/guides/serialization_and_saving/
Try something like:
custom_objects = {"custom_loss": custom_loss}

with keras.utils.custom_object_scope(custom_objects):
    with open('history.pkl', 'rb') as file:
        history = pickle.load(file)
I have created and trained a TensorFlow model using the HammingLoss metric from TensorFlow Addons, so it's not a custom metric that I created on my own. I use a callbacks function with the methods ModelCheckpoint() and EarlyStopping() to save the best weights of the best model and to stop model training at a given threshold, respectively. When I save the model checkpoint I serialize the whole model structure (similar to model.save()), instead of model.save_weights(), which would save only the model weights (more about ModelCheckpoint here).
TL;DR: Here is a colab notebook with the code I post below in case you want to skip this.
The model I have trained is saved to Google Drive at the link here. To load the specific model I use the following code:
neural_network_parameters = {}
#======================================================================
# PARAMETERS THAT DEFINE THE NEURAL NETWORK STRUCTURE =
#======================================================================
neural_network_parameters['model_loss'] = tf.keras.losses.BinaryCrossentropy(from_logits=False, name='binary_crossentropy')
neural_network_parameters['model_metric'] = [
    tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss"),
    tfa.metrics.F1Score(17, average="micro", name="f1_score_micro"),
    tfa.metrics.F1Score(17, average=None, name="f1_score_none"),
    tfa.metrics.F1Score(17, average="macro", name="f1_score_macro"),
    tfa.metrics.F1Score(17, average="weighted", name="f1_score_weighted"),
]
"""Initialize the hyper parameters tuning the model using Tensorflow's hyperparameters module"""
HP_HIDDEN_UNITS = hp.HParam('batch_size', hp.Discrete([32]))
HP_EMBEDDING_DIM = hp.HParam('embedding_dim', hp.Discrete([50]))
HP_LEARNING_RATE = hp.HParam('learning_rate', hp.Discrete([0.001])) # Adam default: 0.001, SGD default: 0.01, RMSprop default: 0.001....0.1 to be removed
HP_DECAY_STEPS_MULTIPLIER = hp.HParam('decay_steps_multiplier', hp.Discrete([10]))
METRIC_ACCURACY = "hamming_loss"
dependencies = {
    'hamming_loss': tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss"),
    'attention': attention(return_sequences=True)
}
def import_trained_keras_model(model_index, method, decay_steps_mode, optimizer_name, hparams):
    """Load the model"""
    training_date = "2021-02-27"
    model_path_structure = f"{folder_path_model_saved}/{initialize_notebbok_variables.saved_model_name}_{hparams[HP_EMBEDDING_DIM]}dim_{hparams[HP_HIDDEN_UNITS]}batchsize_{hparams[HP_LEARNING_RATE]}lr_{hparams[HP_DECAY_STEPS_MULTIPLIER]}decaymultiplier_{training_date}"
    model_imported = load_model(f"{model_path_structure}", custom_objects=dependencies)

    if optimizer_name == "adam":
        optimizer = optimizer_adam_v2(hparams)
    elif optimizer_name == "sgd":
        optimizer = optimizer_sgd_v1(hparams, "step decay")
    else:
        optimizer = optimizer_rmsprop_v1(hparams)

    model_imported.compile(optimizer=optimizer,
                           loss=neural_network_parameters['model_loss'],
                           metrics=neural_network_parameters['model_metric'])
    print(f"Model {model_index} is loaded successfully\n")
    return model_imported
Calling the import_trained_keras_model function:
"""Now that the functions have been created it's time to import each trained classifier from the selected dictionary of hyper parameters, calculate the evaluation metric per model and finally serialize the scores dataframe for later use."""
list_models = []  # a list to store imported models
model_optimizer = "adam"

for batch_size in HP_HIDDEN_UNITS.domain.values:
    for embedding_dim in HP_EMBEDDING_DIM.domain.values:
        for learning_rate in HP_LEARNING_RATE.domain.values:
            for decay_steps_multiplier in HP_DECAY_STEPS_MULTIPLIER.domain.values:
                hparams = {
                    HP_HIDDEN_UNITS: batch_size,
                    HP_EMBEDDING_DIM: embedding_dim,
                    HP_LEARNING_RATE: learning_rate,
                    HP_DECAY_STEPS_MULTIPLIER: decay_steps_multiplier
                }
                print(f"\n{len(list_models)+1}/{(len(HP_HIDDEN_UNITS.domain.values)*len(HP_EMBEDDING_DIM.domain.values)*len(HP_LEARNING_RATE.domain.values)*len(HP_DECAY_STEPS_MULTIPLIER.domain.values))}")
                print({h.name: hparams[h] for h in hparams}, '\n')
                model_object = import_trained_keras_model(len(list_models)+1, "import custom trained model", "on", model_optimizer, hparams)
                list_models.append(model_object)
When I call the function I get the following error:
ValueError: Unable to restore custom object of type _tf_keras_metric currently. Please make sure that the layer implements get_config and from_config when saving. In addition, please use the custom_objects arg when calling load_model().
It's strange that I get this error, since the metric used to compile the NN is a built-in one from TensorFlow Addons and NOT some sort of custom metric that I developed myself.
I have also searched this thread on GitHub, which was closed without explaining the root of the problem.
[UPDATE] -- Found a temporary solution
I managed to successfully import the model by setting the compile argument to False and re-compiling the imported model inside the function.
So I did something like model_imported=load_model(f"{model_path_structure}", custom_objects=dependencies, compile=False).
This action produced the following result:
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements get_config and from_config when saving. In addition, please use the custom_objects arg when calling load_model().
Model 1 is loaded successfully.
So TensorFlow still cannot tell that HammingLoss is not a custom metric but rather a metric imported from TensorFlow Addons. However, despite the warning, the model loaded successfully.
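For reference, a minimal sketch of that workaround, reusing the dependencies dict and path variable from the snippets above; the "adam" optimizer string here is a placeholder:

import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow.keras.models import load_model

# Load without compiling so the serialized metric state is skipped...
model_imported = load_model(model_path_structure, custom_objects=dependencies, compile=False)

# ...then recompile with freshly constructed loss and metrics.
model_imported.compile(optimizer="adam",
                       loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
                       metrics=[tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss")])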
After creating a FastText model using Gensim, I want to load it but am running into errors seemingly related to callbacks.
The code used to create the model is
TRAIN_EPOCHS = 30
WINDOW = 5
MIN_COUNT = 50
DIMS = 256
vocab_model = gensim.models.FastText(sentences=model_input,
                                     size=DIMS,
                                     window=WINDOW,
                                     iter=TRAIN_EPOCHS,
                                     workers=6,
                                     min_count=MIN_COUNT,
                                     callbacks=[EpochSaver("./ftchkpts/")])
vocab_model.save('ft_256_min_50_model_30eps')
and the callback EpochSaver is defined as
from gensim.models.callbacks import CallbackAny2Vec

class EpochSaver(CallbackAny2Vec):
    '''Callback to save the model after each epoch and show training parameters.'''
    def __init__(self, savedir):
        self.savedir = savedir
        self.epoch = 0
        os.makedirs(self.savedir, exist_ok=True)

    def on_epoch_end(self, model):
        savepath = os.path.join(self.savedir, f"ft256_{self.epoch}e")
        model.save(savepath)
        print(f"Epoch saved: {self.epoch + 1}")
        if os.path.isfile(os.path.join(self.savedir, f"ft256_{self.epoch-1}e")):
            os.remove(os.path.join(self.savedir, f"ft256_{self.epoch-1}e"))
            print("Previous model deleted")
        self.epoch += 1
Aside from the type of model, this is identical to my process for Word2Vec, which worked without issue. However, when I open another file and try to load the model with
from gensim.models import FastText
vocab = FastText.load(r'vocab/ft_256_min_50_model_30eps')
I'm greeted with the error
AttributeError: Can't get attribute 'EpochSaver' on <module '__main__'>
What can I do to get the vocabulary to load so I can create the embedding layer for my keras model? If it's relevant, this is happening in JupyterLab.
This extra difficulty loading models with custom callbacks is a known, open issue (at least through gensim-3.8.1 and October 2019).
You can see discussions of possible workarounds and fixes there; the gensim team is also considering simply disabling the auto-saving of callbacks altogether, requiring them to be re-specified for each later train() (or similar) call that needs them.
You may be able to load existing models saved with your custom callbacks by importing those same callback classes, as the same names, into the code context where you're doing a load().
You could save callback-free versions of your trained models by blanking the model's callbacks property to its empty default value just before you save(), e.g.:
model.callbacks = ()
model.save(save_path)
Then, you wouldn't need to do any special importing of custom classes before a load(). (Of course if you again needed callback functionality on the re-loaded model, they'd then have to be explicitly reestablished after load()).
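For the first workaround, a minimal sketch: the model was pickled from a notebook, so pickle looks for __main__.EpochSaver at load time, and re-declaring a class with that name in the loading notebook (which is also __main__) satisfies that lookup. The stub below assumes unpickling restores instance state without calling __init__:

from gensim.models.callbacks import CallbackAny2Vec
from gensim.models import FastText

class EpochSaver(CallbackAny2Vec):
    # Stub with the same name as at save time; the real epoch-saving logic
    # is only needed if you train again with this callback attached.
    def __init__(self, savedir):
        self.savedir = savedir
        self.epoch = 0

vocab = FastText.load(r'vocab/ft_256_min_50_model_30eps')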
What is the best way to store a trainer and all necessary components?
1. Storing:
Store checkpoint of the trainer: Use its trainer.save_checkpoint(filename, external_state={}) function
Additionally store the model separately: Use the z.save(filename) method, every cntk operation has. You can also get z = trainer.model.
2. Reloading:
Restore the model: Use C.load_model(...). (Don't get confused by the deprecated persist namespace from CNTK 1.)
Get the inputs from the restored model.
Restore the trainer itself: Use trainer.restore_from_checkpoint, as e.g. shown here. The problem is that this function already needs a trainer object, which probably has to be initialized in the same way as the trainer used to create the checkpoint!?
How do I now restore the label inputs which go into the error function used by the trainer? In the following code I marked (in bold) the variables which I think I have to restore after storing them.
z = C.layers.Dense(.... )
loss = error = C.squared_error(z, **l**)
**trainer** = C.Trainer(**z**, (loss, error), [mylearner], my_tensorboard_writer)
You can restore your trainer, but I actually prefer to just load my model m. The simple reason is that it is much easier to create a whole new trainer, because then you can change all the other parameters of the trainer more easily.
Then you can get the input variable from the loaded model (if your network has only one input):
input_var = m.arguments[0]
then you need the output of your model:
output = m(input_var)
and define the loss function using your target output target_output:
C.squared_error(output, target_output)
Using your model and the loss function, you can recreate your trainer from there, setting the learning rate etc. as you like.
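Putting those steps together, a minimal sketch under the assumption of a single-input model saved with z.save(...); the file name, learning rate, and squared-error criterion are placeholders, and learner construction varies slightly across CNTK 2.x versions:

import cntk as C

m = C.load_model("model.cntk")                    # restore the saved model
input_var = m.arguments[0]                        # recover its input variable
target_output = C.input_variable(m.output.shape)  # fresh label input
loss = error = C.squared_error(m, target_output)  # redefine the criterion
learner = C.sgd(m.parameters, C.learning_parameter_schedule(0.01))
trainer = C.Trainer(m, (loss, error), [learner])  # whole new trainer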