After creating a FastText model using Gensim, I want to load it but am running into errors seemingly related to callbacks.
The code used to create the model is
TRAIN_EPOCHS = 30
WINDOW = 5
MIN_COUNT = 50
DIMS = 256
vocab_model = gensim.models.FastText(sentences=model_input,
size=DIMS,
window=WINDOW,
iter=TRAIN_EPOCHS,
workers=6,
min_count=MIN_COUNT,
callbacks=[EpochSaver("./ftchkpts/")])
vocab_model.save('ft_256_min_50_model_30eps')
and the callback EpochSaver is defined as
from gensim.models.callbacks import CallbackAny2Vec
class EpochSaver(CallbackAny2Vec):
'''Callback to save model after each epoch and show training parameters '''
def __init__(self, savedir):
self.savedir = savedir
self.epoch = 0
os.makedirs(self.savedir, exist_ok=True)
def on_epoch_end(self, model):
savepath = os.path.join(self.savedir, f"ft256_{self.epoch}e")
model.save(savepath)
print(f"Epoch saved: {self.epoch + 1}")
if os.path.isfile(os.path.join(self.savedir, f"ft256_{self.epoch-1}e")):
os.remove(os.path.join(self.savedir, f"ft256_{self.epoch-1}e"))
print("Previous model deleted ")
self.epoch += 1
Aside from the type of model, this is identical to my process for Word2Vec which worked without issue. However when I open another file and try to load the model with
from gensim.models import FastText
vocab = FastText.load(r'vocab/ft_256_min_50_model_30eps')
I'm greeted with the error
AttributeError: Can't get attribute 'EpochSaver' on <module '__main__'>
What can I do to get the vocabulary to load so I can create the embedding layer for my keras model? If it's relevant, this is happening in JupyterLab.
This extra difficulty loading models with custom callbacks is a known, open issue (at least through gensim-3.8.1 and October 2019).
You can see discussions of possible workarounds and fixes there – and the gensim team is considering simply disabling the auto-saving of callbacks at all, requiring them to be re-specified for each later train()/etc call that needs them.
You may be able to load existing models saved with your custom callbacks by importing those same callback classes, as the same names, into the code context where you're doing a load().
You could save callback-free versions of your trained models by blanking the model's callbacks property to its empty default value, just before you save(), eg:
model.callbacks = ()
model.save(save_path)
Then, you wouldn't need to do any special importing of custom classes before a load(). (Of course if you again needed callback functionality on the re-loaded model, they'd then have to be explicitly reestablished after load()).
Related
I am experimenting with self supervised learning using tensorflow. The example code I'm running can be found in the Keras examples website. This is the link to the NNCLR example. The Github link to download the code can be found here. While I have no issues running the examples, I am running into issues when I try to save the pretrained or the finetuned model using model.save().
The error I'm getting is this:
f"Model {model} cannot be saved either because the input shape is not "
ValueError: Model <__main__.NNCLR object at 0x7f6bc0f39550> cannot be saved either
because the input shape is not available or because the forward pass of the model is
not defined. To define a forward pass, please override `Model.call()`.
To specify an input shape, either call `build(input_shape)` directly, or call the model on actual data using `Model()`, `Model.fit()`, or `Model.predict()`.
If you have a custom training step, please make sure to invoke the forward pass in train step through
`Model.__call__`, i.e. `model(inputs)`, as opposed to `model.call()`.
I am unsure how to override the Model.call() method. Appreciate some help.
One way to achieve model saving in such cases is to override the save (or save_weights) method in the keras.Model class. In your case, first initialize the finetune model in the NNCLR class. And next, override the save method for it. FYI, in this way, you may also able to use ModelCheckpoint API.
As said, define the finetune model in the NNCLR model class and override the save method for it.
class NNCLR(keras.Model):
def __init__(...):
super().__init__()
...
self.finetuning_model = keras.Sequential(
[
layers.Input(shape=input_shape),
self.classification_augmenter,
self.encoder,
layers.Dense(10),
],
name="finetuning_model",
)
...
def save(
self, filepath, overwrite=True, include_optimizer=True,
save_format=None, signatures=None, options=None
):
self.finetuning_model.save(
filepath=filepath,
overwrite=overwrite,
save_format=save_format,
options=options,
include_optimizer=include_optimizer,
signatures=signatures
)
model = NNCLR(...)
model.compile
model.fit
Next, you can do
model.save('finetune_model') # SavedModel format
finetune_model = tf.keras.models.load_model('finetune_model', compile=False)
'''
NNCLR code example: Evaluate sections
"A popular way to evaluate a SSL method in computer vision or
for that fact any other pre-training method as such is to learn
a linear classifier on the frozen features of the trained backbone
model and evaluate the classifier on unseen images."
'''
for layer in finetune_model.layers:
if not isinstance(layer, layers.Dense):
layer.trainable = False
finetune_model.summary() # OK
finetune_model.compile(
optimizer=keras.optimizers.Adam(),
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")],
)
finetune_model.fit
I have some nets, such as the following (augmented) resnet18:
num_classes = 10
resnet = models.resnet18(pretrained=True)
for param in resnet.parameters():
param.requires_grad = True
num_ftrs = resnet.fc.in_features
resnet.fc = nn.Linear(num_ftrs, num_classes)
And I want to use them inside a lightning module, and have it handle all optimizations, to_device, stages and so on. In other words, I want to register those modules for my lightning module.
I also want to be able to access their public members.
class MyLightning(LightningModule):
def __init__(self, resnet):
super().__init__()
self._resnet = resnet
self._criterion = lambda x: 1.0
def forward(self, x):
resnet_out = self._resnet(x)
loss = self._criterion(resnet_out)
return loss
my_lightning = MyLightning(resnet)
The above doesn't optimize any parameters.
Trying
def __init__(self, resnet)
...
_layers = list(resnet.children())[:-1]
self._resnet = nn.Sequential(*_layers)
Doesn't take resnet.fc into account. This also doesn't make sense to be the intended way of nesting models inside pytorch lightning.
How to nest models in pytorch lightning, and have them fully accessible and handled by the framework?
The training loop and optimization process is handles by the Trainer class. You can do so by initializing a new instance:
>>> trainer = Trainer()
And wrapping your PyTorch Lightning module with it. This way you can perform fitting, tuning, validating, and testing on that instance provided a DataLoader or LightningDataModule:
>>> trainer.fit(my_lightning, train_dataloader, val_dataloader)
You will have to implement the following functions on your Lightning module (i.e. in your case MyLightning):
Name
Description
init
Define computations here
forward
Use for inference only (separate from training_step)
training_step
the complete training loop
validation_step
the complete validation loop
test_step
the complete test loop
predict_step
the complete prediction loop
configure_optimizers
define optimizers and LR schedulers
source LightningModule documentation page.
Keep in mind a LightningModule is a nn.Module, so whenever you define a nn.Module as attribute to a LightningModule in the __init__ function, this module will end being registered as a sub-module to the parent pytorch lightning module.
The pytorch model should inherit from nn.Module, So you should find firstly the resnet18 in pytorch, then you can use the resnet18 or revise it by youself
The origin resnet codes is in this path: ...\python\Lib\site-packages\torchvision\models\resnet.py, you import the resnet network from here, so you can use it directly.
Now, you will find the original codes
class ResNet(nn.Module):...
https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py#L166
And import it like
from torchvision.models import ResNet
Finally, you can inherit from ResNet
class MyLightning(ResNet):
I saved a nn.Module model using (logically):
model = MyWeirdModel()
model.patched_features = .....
train(model)
torch.save(model, file)
Ideally, one would load this model using
model = torch.load(file)
However, in my case this doesn't work because Pickle uses the static class definition when un-pickling, so I get AttributeError: 'MyWeirdModel' object has no attribute 'patched_features' (this attribute was added to MyWeirdModel at runtime).
I would like to avoid having to re-train the model, so I don't want to change the code for saving, only loading.
# Initialise the model in the same way as before
model = MyWeirdModel()
model.patched_features = .....
state_dict = load_state_dict_only(file) # How does one do this?
model.load_state_dict(state_dict)
My understanding is that torch.save() saves the model AND the state dict. How do I load only the state dict from the pickled model, such that I can recover the model?
Referring to the documentation,
only layers with learnable parameters (convolutional layers, linear
layers, etc.) and registered buffers (batchnorm’s running_mean) have
entries in the model’s state_dict.
Your patched_features is not stored in the model’s state_dict, but you can do so by registering it with register_parameter as pointed out by ptrblck in this post but it may involve retraining.
I have created and trained a TensorFlow model using the HammingLoss metric from TensorFlow addons. Thus, it's not a custom metric that I have created on my own. I use a callbacks function with the methords ModelCheckpoint() and EarlyStopping to save the best weights of the best model and stop model training at a given threshold repsectively. When I save the model checkpoint I serialize the whole model structure (similar to model.save()), istead of model.save_weights(), which would have saved only the model weights (more about ModelCheckpoint here).
TL;DR: Here is a colab notebook with the code I post below in case you want to skip this.
The model I have trained is saved in GoogleDrive in the link here. To load the specific model I use the following code:
neural_network_parameters = {}
#======================================================================
# PARAMETERS THAT DEFINE THE NEURAL NETWORK STRUCTURE =
#======================================================================
neural_network_parameters['model_loss'] = tf.keras.losses.BinaryCrossentropy(from_logits=False, name='binary_crossentropy')
neural_network_parameters['model_metric'] = [tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss"),
tfa.metrics.F1Score(17, average="micro", name="f1_score_micro"),
tfa.metrics.F1Score(17, average=None, name="f1_score_none"),
tfa.metrics.F1Score(17, average="macro", name="f1_score_macro"),
tfa.metrics.F1Score(17, average="weighted", name="f1_score_weighted")]
"""Initialize the hyper parameters tuning the model using Tensorflow's hyperparameters module"""
HP_HIDDEN_UNITS = hp.HParam('batch_size', hp.Discrete([32]))
HP_EMBEDDING_DIM = hp.HParam('embedding_dim', hp.Discrete([50]))
HP_LEARNING_RATE = hp.HParam('learning_rate', hp.Discrete([0.001])) # Adam default: 0.001, SGD default: 0.01, RMSprop default: 0.001....0.1 to be removed
HP_DECAY_STEPS_MULTIPLIER = hp.HParam('decay_steps_multiplier', hp.Discrete([10]))
METRIC_ACCURACY = "hamming_loss"
dependencies = {
'hamming_loss': tfa.metrics.HammingLoss(mode="multilabel", name="hamming_loss"),
'attention': attention(return_sequences=True)
}
def import_trained_keras_model(model_index, method, decay_steps_mode, optimizer_name, hparams):
"""Load the model"""
training_date="2021-02-27"
model_path_structure=f"{folder_path_model_saved}/{initialize_notebbok_variables.saved_model_name}_{hparams[HP_EMBEDDING_DIM]}dim_{hparams[HP_HIDDEN_UNITS]}batchsize_{hparams[HP_LEARNING_RATE]}lr_{hparams[HP_DECAY_STEPS_MULTIPLIER]}decaymultiplier_{training_date}"
model_imported=load_model(f"{model_path_structure}", custom_objects=dependencies)
if optimizer_name=="adam":
optimizer = optimizer_adam_v2(hparams)
elif optimizer_name=="sgd":
optimizer = optimizer_sgd_v1(hparams, "step decay")
else:
optimizer = optimizer_rmsprop_v1(hparams)
model_imported.compile(optimizer=optimizer,
loss=neural_network_parameters['model_loss'],
metrics=neural_network_parameters['model_metric'])
print(f"Model {model_index} is loaded successfully\n")
return model_imported
Calling the function import trained keras model
"""Now that the functions have been created it's time to import each trained classifier from the selected dictionary of hyper parameters, calculate the evaluation metric per model and finally serialize the scores dataframe for later use."""
list_models=[] #a list to store imported models
model_optimizer="adam"
for batch_size in HP_HIDDEN_UNITS.domain.values:
for embedding_dim in HP_EMBEDDING_DIM.domain.values:
for learning_rate in HP_LEARNING_RATE.domain.values:
for decay_steps_multiplier in HP_DECAY_STEPS_MULTIPLIER.domain.values:
hparams = {
HP_HIDDEN_UNITS: batch_size,
HP_EMBEDDING_DIM: embedding_dim,
HP_LEARNING_RATE: learning_rate,
HP_DECAY_STEPS_MULTIPLIER: decay_steps_multiplier
}
print(f"\n{len(list_models)+1}/{(len(HP_HIDDEN_UNITS.domain.values)*len(HP_EMBEDDING_DIM.domain.values)*len(HP_LEARNING_RATE.domain.values)*len(HP_DECAY_STEPS_MULTIPLIER.domain.values))}")
print({h.name: hparams[h] for h in hparams},'\n')
model_object=import_trained_keras_model(len(list_models)+1, "import custom trained model", "on", model_optimizer, hparams)
list_models.append(model_object)
When I call the function I get the following error
ValueError: Unable to restore custom object of type _tf_keras_metric currently. Please make sure that the layer implements get_configand from_config when saving. In addition, please use the custom_objects arg when calling load_model().
It's strange that I get this error since the model metric to compile the NN is from a built in method of TensorFlow and NOT some sort of a custom metric that I developed myself.
I have searched also this thread in GitHub which closed without explaining the root of the problem.
[UPDATE]--Found a temporary solution
I managed to successfully import the model by turning the compile argument to False in order to re-compile the model imported inside the function.
So I did smth like model_imported=load_model(f"{model_path_structure}", custom_objects=dependencies, compile=False).
This action produced the following result:
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements get_config and from_config when saving. In addition, please use the custom_objects arg when calling load_model().
Model 1 is loaded successfully.
So TensorFlow still cannot understand that HammingLoss is not a custom metric but rather a metric imported from Tensorflow Addons. However, despite the warning the model loaded successfully.
I am working on Project 2 of a course with Udacity (Artificial Intelligence with Python Programming).
I have trained a model and saved it in checkpoint.pth and I want to load the checkpoint.pth so I can rebuild the model .
I have written the code to save checkpoint.pth and also to load checkpoint.
model.class_to_idx = image_datasets['train_dir'].class_to_idx
model.cpu()
checkpoint = {'input_size': 25088,
'output_size': 102,
'hidden_layers': 4096,
'epochs': epochs,
'optimizer': optimizer.state_dict(),
'state_dict': model.state_dict(),
'class_to_index' : model.class_to_idx
}
torch.save(checkpoint, 'checkpoint.pth')
def load_checkpoint(filepath):
checkpoint = torch.load(filepath)
model = checkpoint.Network(checkpoint['input_size'],
checkpoint['output_size'],
checkpoint['hidden_layers'],
checkpoint['epochs'],
checkpoint['optimizer'],
checkpoint['class_to_index']
)
model.load_state_dict(checkpoint['state_dict'])
return model
model = load_checkpoint('checkpoint.pth')
While loading checkpoint.pth, I get an error:
AttributeError: 'dict' object has no attribute 'Network'
I want successfully load checkpoint.
Thank you
UPDATE: With the full code visibile, I think the issues is in the implementation. torch.load will load the information from the dict that has been deserialized to the file. This loads as the original dict object, so in the function, you should expect checkpoint == checkpoint(original definition).
In this instance, I think what you are actually looking to do is calling the load on the file saved as checkpoint.pth and the first call might not be necessary.
def load_checkpoint(filepath):
model = torch.load(filepath)
return model
The other possibility is that the nested object must be what the object is called, and then it would be just a small adjustment:
def load_checkpoint(filepath):
checkpoint = torch.load(filepath)
model = torch.load_state_dict(checkpoint['state_dict'])
return model
The most likely problem is that you are calling on the Network class, which is not contained within the checkpoint dictionary object.
I can't speak to the actual lesson or other nuances within the lesson, the simplest solution might be to just call the Network class definition with the variables already in the checkpoint dictionary like so:
model = Network(checkpoint['input_size'],
checkpoint['output_size'],
checkpoint['hidden_layers'],
checkpoint['epochs'],
checkpoint['optimizer'],
checkpoint['class_to_index'])
model.load_state_dict(checkpoint['state_dict'])
return model
The checkpoint dict may only have the values you expect ('input_size', 'output_size' etc) But this is just the most obvious issue I see.