How can I save a PyTorch model without a need for the model class to be defined somewhere?
Disclaimer:
In Best way to save a trained model in PyTorch?, there is no solution (or working solution) for saving the model without access to the model class code.
If you plan to do inference with the PyTorch library available (i.e. PyTorch in Python, C++, or any other platform it supports), then the best way to do this is via TorchScript.
I think the simplest thing is to use trace = torch.jit.trace(model, typical_input) and then torch.jit.save(trace, path). You can then load the traced model with torch.jit.load(path).
Here's a really simple example. We make two files:
train.py :
import torch
class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        x = torch.relu(self.linear(x))
        return x

model = Model()
x = torch.FloatTensor([[0.2, 0.3, 0.2, 0.7], [0.4, 0.2, 0.8, 0.9]])

with torch.no_grad():
    print(model(x))

traced_cell = torch.jit.trace(model, (x))
torch.jit.save(traced_cell, "model.pth")
infer.py :
import torch
x = torch.FloatTensor([[0.2, 0.3, 0.2, 0.7], [0.4, 0.2, 0.8, 0.9]])
loaded_trace = torch.jit.load("model.pth")
with torch.no_grad():
    print(loaded_trace(x))
Running these sequentially gives results:
python train.py
tensor([[0.0000, 0.1845, 0.2910, 0.2497],
[0.0000, 0.5272, 0.3481, 0.1743]])
python infer.py
tensor([[0.0000, 0.1845, 0.2910, 0.2497],
[0.0000, 0.5272, 0.3481, 0.1743]])
The results are the same, so we are good. (Note that your actual numbers will differ from these on each fresh run of train.py, due to the random initialisation of the nn.Linear layer.)
TorchScript provides for much more complex architectures and graph definitions (including if statements, while loops, and more) to be saved in a single file, without needing to redefine the graph at inference time. See the docs (linked above) for more advanced possibilities.
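If your forward pass contains data-dependent control flow, tracing will only record the branch taken by the example input, so torch.jit.script is usually the better fit. A minimal sketch (the GatedModel class below is just an illustration, not part of the example above):
import torch

class GatedModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        # Data-dependent branch: tracing would freeze whichever path the
        # example input happened to take; scripting preserves both.
        if x.sum() > 0:
            return torch.relu(self.linear(x))
        return self.linear(x)

scripted = torch.jit.script(GatedModel())
torch.jit.save(scripted, "scripted_model.pth")
loaded = torch.jit.load("scripted_model.pth")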
I recommend converting your PyTorch model to ONNX and saving it. It is probably the best way to store a model without access to the class.
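For instance, a minimal sketch of what that could look like, reusing the model and the sample input x from train.py above (the file name and the ONNX Runtime check are only illustrative, and assume the onnx and onnxruntime packages are installed):
import onnxruntime
import torch

# Export the traced graph to ONNX; the Python class is not needed later
# to run inference on the exported file.
torch.onnx.export(model, x, "model.onnx",
                  input_names=["input"], output_names=["output"])

# Run it with ONNX Runtime, without PyTorch or the class definition.
session = onnxruntime.InferenceSession("model.onnx")
outputs = session.run(None, {"input": x.numpy()})
print(outputs[0])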
Supplying an official answer by one of the core PyTorch devs (smth):
There are limitations to loading a pytorch model without code.
First limitation:
We only save the source code of the class definition. We do not save beyond that (like the package sources that the class is referring to).
For example:
import foo

class MyModel(...):
    def forward(input):
        foo.bar(input)
Here the package foo is not saved in the model checkpoint.
Second limitation:
There are limitations on robustly serializing python constructs. For example the default picklers cannot serialize lambdas. There are helper packages that can serialize more python constructs than the standard, but they still have limitations. Dill is one such package.
Given these limitations, there is no robust way to have torch.load work without having the original source files.
There is no solution (or working solution) for saving a model without access to the class.
You can save whatever you like.
You can save the model, torch.save(model, filepath). It saves the model object itself.
You can save just the model state dict.
torch.save(model.state_dict(), filepath)
Further, you can save anything you like, since torch.save is just a pickle based save.
state = {
    'hello_text': 'just the optimizer sd will be saved',
    'optimizer': optimizer.state_dict(),
}
torch.save(state, filepath)
You may check what I wrote on torch.save some time ago.
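For completeness, a minimal sketch of restoring from such a checkpoint; note that loading a state dict still requires a compatible object (here the Model class and the SGD settings are assumptions standing in for whatever you actually trained with), which is exactly the limitation discussed above:
import torch

# Recreate the objects first, then load their states from the checkpoint.
model = Model()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

state = torch.load(filepath)
optimizer.load_state_dict(state['optimizer'])
print(state['hello_text'])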
I want to add preprocessing functions and methods to the model graph as a SavedModel signature.
example:
# suppose we have a keras model
# ...
# defining the function I want to add to the model graph
@tf.function
def process(model, img_path):
    # do some preprocessing using different libs. and modules...
    outputs = {"preds": model.predict(preprocessed_img)}
    return outputs
# saving the model with a custom signature
tf.saved_model.save(new_model, dst_path,
                    signatures={"process": process})
Or we can use tf.Module here. However, the problem is that I cannot embed custom functions into the saved model graph.
Is there any way to do that?
I think you slightly misunderstand the purpose of the save_model method in TensorFlow.
As per the documentation the intent is to have a method which serialises the model's graph so that it can be loaded with load_model afterwards.
The model returned by load_model is an instance of tf.Module with all its methods and attributes. What you want instead is to serialise the prediction pipeline.
To be honest, I'm not aware of a good way to do that. However, what you can do is serialise your preprocessing parameters by a different method (for example pickle, or whatever your preprocessing framework provides) and write a class on top of that which does the following:
class MyModel:
    def __init__(self, model_path, preprocessing_path):
        self.model = load_model(model_path)
        self.preprocessing = load_preprocessing(preprocessing_path)

    def predict(self, img_path):
        return self.model.predict(self.preprocessing(img_path))
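A minimal sketch of how the missing pieces might look, assuming the preprocessing callable was pickled at training time; load_preprocessing, the file paths, and the pickled object are illustrative assumptions:
import pickle
from tensorflow.keras.models import load_model

def load_preprocessing(preprocessing_path):
    # Returns whatever preprocessing callable was pickled at training time.
    with open(preprocessing_path, "rb") as f:
        return pickle.load(f)

wrapper = MyModel("saved_model_dir", "preprocessing.pkl")
preds = wrapper.predict("some_image.jpg")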
I am experimenting with self-supervised learning using TensorFlow. The example code I'm running can be found on the Keras examples website. This is the link to the NNCLR example. The GitHub link to download the code can be found here. While I have no issues running the examples, I run into issues when I try to save the pretrained or the finetuned model using model.save().
The error I'm getting is this:
f"Model {model} cannot be saved either because the input shape is not "
ValueError: Model <__main__.NNCLR object at 0x7f6bc0f39550> cannot be saved either
because the input shape is not available or because the forward pass of the model is
not defined. To define a forward pass, please override `Model.call()`.
To specify an input shape, either call `build(input_shape)` directly, or call the model on actual data using `Model()`, `Model.fit()`, or `Model.predict()`.
If you have a custom training step, please make sure to invoke the forward pass in train step through
`Model.__call__`, i.e. `model(inputs)`, as opposed to `model.call()`.
I am unsure how to override the Model.call() method. Appreciate some help.
One way to achieve model saving in such cases is to override the save (or save_weights) method in the keras.Model class. In your case, first initialize the finetune model in the NNCLR class, and next override the save method for it. FYI, in this way you may also be able to use the ModelCheckpoint API.
As said, define the finetune model in the NNCLR model class and override the save method for it.
class NNCLR(keras.Model):
    def __init__(...):
        super().__init__()
        ...
        self.finetuning_model = keras.Sequential(
            [
                layers.Input(shape=input_shape),
                self.classification_augmenter,
                self.encoder,
                layers.Dense(10),
            ],
            name="finetuning_model",
        )
        ...

    def save(
        self, filepath, overwrite=True, include_optimizer=True,
        save_format=None, signatures=None, options=None
    ):
        self.finetuning_model.save(
            filepath=filepath,
            overwrite=overwrite,
            save_format=save_format,
            options=options,
            include_optimizer=include_optimizer,
            signatures=signatures
        )
model = NNCLR(...)
model.compile
model.fit
Next, you can do
model.save('finetune_model') # SavedModel format
finetune_model = tf.keras.models.load_model('finetune_model', compile=False)
'''
NNCLR code example: Evaluate sections
"A popular way to evaluate a SSL method in computer vision or
for that fact any other pre-training method as such is to learn
a linear classifier on the frozen features of the trained backbone
model and evaluate the classifier on unseen images."
'''
for layer in finetune_model.layers:
    if not isinstance(layer, layers.Dense):
        layer.trainable = False

finetune_model.summary() # OK
finetune_model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")],
)
finetune_model.fit
I have trained an LGBM model (gbdt) with Python on a dataset with 5 classes (classification problem), and I'm able to make correct inferences on a test set by loading that model in a Python script.
Now I need to use this model in a C++ program. To do this I have exported the model and loaded it in C++ to make inference. The problem is that in C++ the output probabilities are always the same, so I can't choose a winning class (each class always comes out as 0.2).
To save the model I've tried these two ways.
First, I've tried to save the model as a string:
s = lgb_model.model_to_string(num_iteration=114)
f = open('model_out.txt','w')
f.write(s)
f.close()
Second, directly with the save_model method:
lgb_model.save_model('model_out.txt')
To load the model in C++ I've used this with no error:
int ret = LGBM_BoosterLoadModelFromString(model_string, &num_iter, &booster_handle);
To make inference I have prepared an input buffer and passed it to this function:
int res = LGBM_BoosterPredictForMat(booster_handle, input_data, C_API_DTYPE_FLOAT64,
                                    n_row, n_cols, 1, C_API_PREDICT_NORMAL, 0, -1, "", &out_len, out_result);
I obtained a matrix with 5 rows, and a column for each sample like this:
0.2
0.2
0.2
0.2
0.2
I have tried to make inference with a lot of changes but the results are always the same (random inputs, different parameters, etc.). Moreover, I have checked the loaded model by re-dumping it with this function, and the result seemed correct:
LGBM_BoosterDumpModel(booster_handle, 0, -1, C_API_FEATURE_IMPORTANCE_SPLIT, 1, &out_len, out_string);
Where am I going wrong?
I had a similar issue and in my case I found that the problem was the is_linear property in the model.
I compared the model that I generated from the binary_classification example with the model I was using and I noticed that the model in the example has the is_linear=0 property for each tree. On my model it was missing.
Then I checked the C++ code and found that if this property is missing, the variable describing it defaults to true. I set it to false as the default and that works for me.
I can't give more details as I only recently began working with LGBM models and C++.
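One way to narrow this down from the Python side is to reload the text dump into a Python Booster and compare; if the predictions look sensible here but come out as a uniform 0.2 in C++, the problem is on the C++ side (e.g. the missing is_linear property above). A small sketch, assuming the model_out.txt produced in the question:
import numpy as np
import lightgbm as lgb

with open("model_out.txt") as f:
    model_text = f.read()

# Quick check for the property mentioned above.
print("is_linear present:", "is_linear" in model_text)

# Reload the dumped model and predict on a dummy sample.
booster = lgb.Booster(model_str=model_text)
sample = np.random.rand(2, booster.num_feature())
preds = booster.predict(sample)
print(preds)  # expected shape (2, 5): one row of class probabilities per sample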
I want to use hub at training and serving time, but I am getting a little confused about how to do it in the same graph. Namely, I have something like:
def build_graph(..., mode, ...):
    tags_and_args = ...  # one for training, one for serving
    if mode == 'training':
        hub.create_module_spec(module_fn, tags_and_args=tags_and_args)
        module_output = hub.Module(...)
        hub.register_module_for_export(module_fn, tags_and_args=tags_and_args)
        loss, output = ...
    else:
        module_output = hub.Module(XXX)
Should I reload the module from disk, so that XXX would be the path where I saved it before? Or is it somehow kept as a graph object in memory?
I will call my code as
estimator.train(...)
exporter = hub.LatestModuleExporter(...)
exporter.export(...)
estimator.export_savedmodel(...) # for serving
You can use a hub.Module in the model_fn of an Estimator without ever exporting it. At the start of Estimator.train(), the module's variables will be initialized from their pre-trained values (much like other variables are initialized randomly). After that, the module's variables behave much like the other variables of your model - they are part of the model's checkpoint, and restored from there for evaluation, resumed training, or export to a SavedModel for serving, like any other variable.
Exporting a hub.Module is only needed in case you want to create a new version of the module (with the weights updated from your training) available to yet another, separate Estimator.
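To make that concrete, here is a minimal sketch of a TF1-style model_fn that uses a hub.Module directly; the module URL, the feature key, and the classification head are illustrative assumptions, not part of the question above:
import tensorflow as tf
import tensorflow_hub as hub

def model_fn(features, labels, mode):
    # The module's pre-trained variables become part of this model's graph
    # and checkpoint; trainable=True lets them be fine-tuned.
    embed = hub.Module("https://tfhub.dev/google/nnlm-en-dim128/1",
                       trainable=True)
    embeddings = embed(features["text"])
    logits = tf.layers.dense(embeddings, 2)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions={"logits": logits})

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.AdamOptimizer().minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)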
What is the best way to store a trainer and all necessary components?
1. Storing:
Store checkpoint of the trainer: Use its trainer.save_checkpoint(filename, external_state={}) function
Additionally store the model separately: Use the z.save(filename) method, which every CNTK operation has. You can also get z = trainer.model.
2. Reloading:
Restore the model: Use C.load_model(...). (Don't get confused by the deprecated persist namespace from CNTK 1.)
Get the inputs from the restored model.
Restore the trainer itself: Use trainer.restore_from_checkpoint, as e.g. shown here. The problem is that this function already needs a trainer object, which probably has to be initialized in the same way as the trainer used to create the checkpoint!?
How do I now restore the label inputs which go into the error function used by the trainer? In the following code I have marked the variables which I think I have to restore once I have stored them.
z = C.layers.Dense(.... )
loss = error = C.squared_error(z, **l**)
**trainer** = C.Trainer(**z**, (loss, error), [mylearner], my_tensorboard_writer)
You can restore your trainer, but I actually prefer to just load my model m. The simple reason is that it is much easier to create a whole new trainer, because then you can change all the other parameters of the trainer more easily.
Then you can get the input variable from the loaded model (if your network has only one input):
input_var = m.arguments[0]
then you need the output of your model:
output = m(input_var)
and define the loss function using your target output target_output:
C.squared_error(output, target_output)
Using your model and the loss function, you can recreate your trainer from there, setting the learning rate etc. as you like.
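Putting those steps together, a minimal sketch assuming CNTK 2.x (the checkpoint path and the learner settings are just placeholders):
import cntk as C

# Load the previously saved model (saved with z.save("model.cntk") at training time).
m = C.load_model("model.cntk")

# Get the input variable back from the loaded model (single-input network).
input_var = m.arguments[0]
output = m(input_var)

# Recreate the label/target input and the loss on top of the loaded model.
target_output = C.input_variable(m.output.shape)
loss = error = C.squared_error(output, target_output)

# Build a brand-new trainer with whatever learner settings you want now.
learner = C.sgd(m.parameters, lr=C.learning_parameter_schedule(0.01))
trainer = C.Trainer(output, (loss, error), [learner])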