Accuracy drops around 10% after exporting from PyTorch to ONNX - python

I've been training an EfficientNetV2 network using this repository.
The training process goes well and I reach around 93-95% validation accuracy. After that, I run inference over a test set containing new images and get acceptable accuracy, around 88% for example.
Once I've checked that the model works fine in PyTorch, I need to convert it to ONNX and then to a TensorRT engine. I have a script that runs inference with an ONNX model to check whether I'm having problems with the conversion process.
I'm using this code to convert the model:
import torch
from timm.models import create_model

# create model from the timm checkpoint
base_model = create_model(
    model_arch,
    num_classes=num_classes,
    in_chans=3,
    checkpoint_path=model_path)

# append softmax so the exported model outputs probabilities
model = torch.nn.Sequential(
    base_model,
    torch.nn.Softmax(dim=1)
)
model.cpu()
model.eval()

dummy_input = torch.randn(1, 3, 224, 224, requires_grad=True)
torch.onnx.export(model,
                  dummy_input,
                  model_export,
                  verbose=False,
                  export_params=True,
                  do_constant_folding=True
                  )
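For illustration, here is a minimal sketch of the kind of ONNX inference check mentioned above (this is not the original script; it assumes onnxruntime is installed and uses the standard ImageNet normalization that timm models typically expect, with model_export being the path from the snippet above):

import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession(model_export, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Preprocess exactly like the PyTorch pipeline (resize, scale to [0, 1],
# normalize, NCHW). A mismatch here is a common cause of accuracy drops.
img = Image.open("test.jpg").convert("RGB").resize((224, 224))
x = np.asarray(img, dtype=np.float32) / 255.0
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
x = (x - mean) / std
x = x.transpose(2, 0, 1)[None]  # HWC -> NCHW, add batch dimension

probs = session.run(None, {input_name: x})[0]
print(probs.argmax(axis=1))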
I've tried several tutorials like this one, but unfortunately I get the same result.
I've tried different opset combinations, with and without do_constant_folding, and I've even trained another model with the 'exportable' parameter, a bool that tells the training script whether the model should be exportable (an experimental feature according to the repository's documentation).
Do you have any idea about this issue?
Thanks in advance.

Hard to guess which bug you're hitting; here are a few typical ones:
some layers are not converted properly even after eval()
you may need to create the model with timm.create_model(..., scriptable=True, exportable=True)
different preprocessing, e.g. the timm model's input is normalized to specific values, but after conversion it is not.
Do the two models output nearly the same values on model(dummy_input)?
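For example, a quick parity check along those lines could look like this (a sketch, assuming onnxruntime is installed and that model and model_export are the objects/paths from the question):

import numpy as np
import onnxruntime as ort
import torch

dummy_input = torch.randn(1, 3, 224, 224)

# PyTorch output
with torch.no_grad():
    torch_out = model(dummy_input).numpy()

# ONNX Runtime output on the exact same tensor
session = ort.InferenceSession(model_export, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
onnx_out = session.run(None, {input_name: dummy_input.numpy()})[0]

print("max abs diff:", np.abs(torch_out - onnx_out).max())
# If this assertion fails, the export itself is broken; if it passes,
# look at preprocessing differences between the two pipelines instead.
np.testing.assert_allclose(torch_out, onnx_out, rtol=1e-3, atol=1e-5)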

Related

Why does "load_model" cause RAM memory problems while predicting?

I trained a neural network (transformer architecture) and saved it by using:
model.save(directory + args.name, save_format="tf")
After that, I want to load the model again with another script to test it by letting it make iterative predictions:
from keras.models import load_model

model = load_model(args.model)
for i in range(very_big_number):
    out, _ = model(something, training=False)
However, I have noticed that the RAM usage increases with each prediction and I don't know why. At some point the programme stops because there is no more memory available. You can also see the RAM consumption in the following screenshot:
If I use the same architecture but only load the weights of the model with model.load_weights(...), I do not have the problem.
My question now is, why does load_model seem to cause this and how do I solve the problem?
I'm using tensorflow 2.5.0.
Edit:
As I was not able to solve the problem and the answers did not help either, I simply used the load_weights method: I created a new model and loaded the weights of the saved model into it like this:
model = myModel()
model.load_weights(args.model + "/variables/variables")
This way, the RAM usage remained constant. Still a non-optimal solution, in my opinion.
There is a fundamental difference between load_model and load_weights. When you save a model using save_model, you save the following things:
A Keras model consists of multiple components:
The architecture, or configuration, which specifies what layers the model contains, and how they're connected.
A set of weights values (the "state of the model").
An optimizer (defined by compiling the model).
A set of losses and metrics (defined by compiling the model or calling add_loss() or add_metric()).
However, when you save the weights using save_weights, you only save the weights, which is useful for inference. When you want to resume the training process, you need a model object, and that is why everything is saved in the model. When you just want to predict and get the result, save_weights is enough. To learn more, you can check the documentation on saving and loading models.
So, as you can see, load_model has many more things to load than load_weights, thus it has more overhead and hence your higher RAM usage.
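As an illustration, the two paths look roughly like this for pure inference (a sketch; build_model is a hypothetical helper that recreates the architecture, and inputs stands in for your input tensors):

import tensorflow as tf

# Full model: architecture, weights and optimizer/compile state are all restored.
full_model = tf.keras.models.load_model("saved_model_dir")
pred_a = full_model(inputs, training=False)

# Weights only: rebuild the architecture yourself, then load just the variables.
model = build_model()  # hypothetical helper that recreates the same architecture
model.load_weights("saved_model_dir/variables/variables")
pred_b = model(inputs, training=False)

# Both give the same predictions; only load_model restores the optimizer
# and compile configuration needed to resume training.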

How to fix inconsistent predictions right after training and after loading the saved model?

I trained my Keras (version 2.3.1) Sequential models for a regression problem and achieved very good results. Right after training, I make predictions on the test set and then save the model as well as the weights in separate files.
To check for the speed of the models, I recently loaded them and made predictions on a single test input array but the results are way off, which should mean that the weights at the end of the training are different from the ones being loaded.
I tried making predictions using the loaded model as is and from the loaded weights too. The results for both of them are consistent. So at least, it saves the same weights in both files, however wrong they are.
From what I have read, this looks like a common issue with Keras. I came across this suggestion at several places - set the global variable initializer manually.
My problem is that this suggestion, along with a few others (like setting a fixed seed), are to be put in place before training. Training my models takes 4-5 days! How can I fix this without having to retrain the models?
Here is how I fit the models:
hist = model.fit(
    X_train, y_train,
    batch_size=batch_size,
    verbose=1,
    epochs=epochs,
    validation_split=0.2
)
Then I save the model as well as the weights:
model.save("path to .h5 file")
model.save_weights("path to .hdf5 file")
Eventually, I am loading the model and predicting from it like so:
from keras.models import load_model
model = load_model("path to the same .h5 file")
ypred = model.predict(input_arr)
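A quick way to confirm that the two saved files really do hold the same weights (the check described above) is to load both and compare them tensor by tensor. A sketch, where build_model is a hypothetical helper that recreates the same architecture and the paths are the placeholders from the question:

import numpy as np
from keras.models import load_model

full = load_model("path to .h5 file")        # saved with model.save(...)

rebuilt = build_model()                      # hypothetical: recreates the architecture
rebuilt.load_weights("path to .hdf5 file")   # saved with model.save_weights(...)

# If these all pass, both files store identical weights, so any remaining
# discrepancy likely comes from the inputs fed at prediction time (e.g. scaling).
for w_full, w_weights in zip(full.get_weights(), rebuilt.get_weights()):
    np.testing.assert_allclose(w_full, w_weights, rtol=1e-6, atol=1e-6)
print("weights in .h5 and .hdf5 match")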

What is the proper way to load a transfer learning model for inference in PyTorch?

I am training a model using transfer learning based on ResNet-152. Based on the PyTorch tutorial, I have no problem saving a trained model and loading it for inference. However, loading the model is slow, and I don't know if I did it correctly. Here is my code:
To save the trained model as state dict:
torch.save(model.state_dict(), 'model.pkl')
To load it for inference:
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet152()
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, len(classes))
st = torch.load('model.pkl', map_location='cuda:0' if torch.cuda.is_available() else 'cpu')
model.load_state_dict(st)
model.eval()
I timed the code and found that the first line model = models.resnet152() takes the longest time to load. On CPU, it takes 10 seconds to test one image. So my thinking is that this might not be the proper way to load it?
If I save the entire model instead of the state.dict like this:
torch.save(model, 'model_entire.pkl')
and test it like this:
model = torch.load('model_entire.pkl')
model.eval()
on the same machine it takes only 5 seconds to test one image.
So my question is: is it the proper way to load the state_dict for inference?
In the first code snippet, you are constructing a fresh ResNet-152 from TorchVision (with random weights) and then loading your locally stored weights into it.
In the second example, you are loading a locally stored model object (architecture and weights) directly.
The former is slower because the whole architecture has to be built and its weights randomly initialized before your state_dict is loaded, but it is more reproducible, since it does not depend on a pickled copy of the model. Also, the time difference is a one-off initialization cost: by the time you are performing inference, the model has already been loaded in both cases, and they are equivalent.
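To see that the cost is a one-off, you can time the construction/loading step separately from the inference step. A rough sketch; 'model.pkl' is the file from the question and the class count of 10 is a placeholder:

import time
import torch
import torch.nn as nn
from torchvision import models

t0 = time.time()
model = models.resnet152()                      # build architecture (random weights)
model.fc = nn.Linear(model.fc.in_features, 10)  # 10 is a placeholder class count
model.load_state_dict(torch.load('model.pkl', map_location='cpu'))
model.eval()
print("load time:", time.time() - t0)

x = torch.randn(1, 3, 224, 224)
t0 = time.time()
with torch.no_grad():
    out = model(x)
print("inference time:", time.time() - t0)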

UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually

After a training procedure, I wanted to check the accuracy by loading the created model.h5 and running an evaluation procedure. However, I am getting the following warning:
/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py:269: UserWarning: No training configuration found in save file: the model was not compiled. Compile it manually. warnings.warn('No training configuration found in save file:
The warning comes from this dist-packages/keras/engine/saving.py file, so the problem is in loading the created model, i.e. this line of code:
train_model = load_model('model.h5')
The warning says that the model was not compiled; however, I did compile it:
optimizer = Adam(lr=lr, clipnorm=0.001)
train_model.compile(loss=dummy_loss, optimizer=optimizer)
I can't understand what I am doing wrong...
Please help me! SOS :-(
Intro
I'd like to add to olejorgenb's answer - for a specific scenario, where you don't want to train the model, just use it (e.g. in production).
"Compile" means "prepare for training", which includes mainly setting up the optimizer. It could also have been saved before, and then you can continue the "same" training after loading the saved model.
The fix
But what about the scenario where I just want to run the model? Well, use the compile=False argument to load_model, like this:
trained_model = load_model('model.h5', compile=False)
You won't be able to .fit() this model without using trained_model.compile(...) first, but most importantly - the warning will go away.
Misc Notes
Btw, in my Keras version the argument include_optimizer defaults to True. This should also work for training callbacks like ModelCheckpoint. It means that when loading a model saved by Keras, you can usually count on the optimizer being included (except for the situation described in Hull Gasper's answer).
But when you have a model that was not trained by Keras (e.g. when converting a model trained with Darknet), the model is saved un-compiled. This produces the warning, and you can get rid of it in the way described above.
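For the evaluation use case from the question, the flow would look roughly like this (a sketch; the loss, metrics, and test arrays are placeholders for whatever the model was actually trained with):

from keras.models import load_model
from keras.optimizers import Adam

# compile=False silences the warning; compile manually before calling evaluate()
model = load_model('model.h5', compile=False)
model.compile(loss='mse', optimizer=Adam(lr=1e-4), metrics=['mae'])  # placeholder loss/metrics
results = model.evaluate(x_test, y_test)  # x_test, y_test: your evaluation data
print(results)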
Do you get this warning when saving the model?
WARNING:tensorflow:TensorFlow optimizers do not make it possible to access optimizer attributes or optimizer state after instantiation. As a result, we cannot save the optimizer as part of the model save file. You will have to compile your model again after loading it. Prefer using a Keras optimizer instead (see keras.io/optimizers).
Seems tensorflow optimizers can't be preserved by keras :/
As mentioned, Keras can't save TensorFlow optimizers. Use the Keras one:
optimizer = keras.optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999,
                                  epsilon=None, decay=0.0, amsgrad=False)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(...)
model.save('...')
This works for me without manually compiling after calling load_model.

Keras Visualization of Model Built from Functional API

I wanted to ask if there was an easy way to visualize a Keras model built from the Functional API?
Right now, the best way for me to debug a sequential model at a high level is:
model = Sequential()
model.add(...
...
print(model.summary())
SVG(model_to_dot(model).create(prog='dot', format='svg'))
However, I am having a hard time finding a good way to visualize the Keras API if we build a more complex, non-sequential model.
Yes there is: check keras.utils, which has a plot_model() method, as explained in detail here. It seems you are already familiar with keras.utils.vis_utils and the model_to_dot method, but this is another option. Its usage is something like:
from keras.utils import plot_model
plot_model(model, to_file='model.png')
To be honest, that is the best I have managed to find using Keras alone. Using model.summary() as you did is also useful sometimes. I also wish there were a tool that enabled better visualization of one's models, perhaps even showing the weights per layer so as to decide on optimal network structures and initializations (if you know of one, please tell :] ).
Probably the best option you currently have is to visualize things in TensorBoard, which you can include in Keras with the TensorBoard callback. This lets you visualize your training and the metrics of interest, as well as some info on the activations of your layers, your biases and kernels, etc. Basically you have to add this code to your program before fitting your model:
from keras.callbacks import TensorBoard

# indicate folder to save, plus other options
tensorboard = TensorBoard(log_dir='./logs/run1', histogram_freq=1,
                          write_graph=True, write_images=False)

# save it in your callback list, where you can include other callbacks
callbacks_list = [tensorboard]

# then pass to fit as callback, remember to use validation_data also
regressor.fit(X, Y, callbacks=callbacks_list, epochs=64,
              validation_data=(X_test, Y_test), shuffle=True)
You can then run TensorBoard (which serves a local web page) with the following command in your terminal:
tensorboard --logdir=./logs/run1
This will then tell you on which port to visualize your training. If you have different runs, you can pass --logdir=./logs instead to visualize them together for comparison. There are of course more options for using TensorBoard, so I suggest you check the included links if you are considering its use.
After a bit of googling and trial/error... Turns out you have to just convert the entire functional api model back into a "model format".
model = some_model()
output_layer = _build_output()
finalmodel = Model(inputs=model.input, outputs=output_layer)
then, you can run finalmodel.summary(), or any of the plotting features for sequential modeling.
However, this requires I guess careful tracking of the model, which I admittedly did not do.
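For completeness, here is a small self-contained example of that pattern (not from the question's code): build a functional graph, wrap it into a Model, then use the same summary/plotting calls as for sequential models.

from keras.layers import Input, Dense, concatenate
from keras.models import Model
from keras.utils import plot_model

# Two-input functional model, just to illustrate the pattern.
a = Input(shape=(16,), name='input_a')
b = Input(shape=(8,), name='input_b')
x = concatenate([Dense(32, activation='relu')(a), Dense(32, activation='relu')(b)])
out = Dense(1, name='output')(x)

model = Model(inputs=[a, b], outputs=out)
model.summary()
plot_model(model, to_file='functional_model.png', show_shapes=True)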
tf.keras.utils.plot_model(
    model,
    to_file="model.png",
    show_shapes=False,
    show_layer_names=True,
    rankdir="TB",
    expand_nested=False,
    dpi=96,
)
