How can I load and use a PyTorch (.pth.tar) model - python

I am not very familiar with Torch, and I primarily use TensorFlow. However, I need to use an Inception model that was retrained in Torch. Because retraining an Inception model for my particular application requires a large amount of computing resources, I would like to use the model that has already been retrained.
This model is saved as a .pth.tar file.
I would like to be able to first load this model. So far, I have been able to figure out that I must use the following:
model = torch.load('iNat_2018_InceptionV3.pth.tar', map_location='cpu')
This seems to work, because print(model) prints out a large set of numbers and other values, which I presume are the values of the weights and biases.
After this, I need to be able to classify an image with it. I haven't been able to figure this out. How must I format the image? Should the image be converted into an array? After this, how must I pass the input data to the network?

You basically need to do the same as in TensorFlow. That is, when you store a network, only the parameters (i.e. the trainable objects in your network) are stored, but not the "glue", that is, all the logic you need to use a trained model.
So if you have a .pth.tar file, you can load it, thereby overriding the parameter values of a model already defined.
That means that the general procedure of saving/loading a model is as follows:
write your network definition (i.e. your nn.Module object)
train or otherwise change the network's parameters in a way you want
save the parameters using torch.save
when you want to use that network, use the same definition of an nn.Module object to first instantiate a pytorch network
then override the values of the network's parameters using torch.load
Here's a discussion with some references on how to do this: pytorch forums
And here's a super short MWE (minimal working example):
# to store
torch.save({
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
}, 'filename.pth.tar')
# to load
checkpoint = torch.load('filename.pth.tar')
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])
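Putting the pieces together for the original question, here is a minimal sketch of the whole flow. It assumes the checkpoint matches torchvision's InceptionV3 architecture and stores its weights under a 'state_dict' key; the number of classes (8142) and the image file name are placeholders you would adjust for your checkpoint.
import torch
from PIL import Image
from torchvision import models, transforms

# Instantiate a network definition, then override its parameters.
# num_classes=8142 and the 'state_dict' key are assumptions about this checkpoint;
# depending on how it was saved, key names may need adjusting (e.g. a 'module.' prefix).
model = models.inception_v3(num_classes=8142)
checkpoint = torch.load('iNat_2018_InceptionV3.pth.tar', map_location='cpu')
model.load_state_dict(checkpoint['state_dict'])
model.eval()

# Format the image: InceptionV3 expects 299x299 RGB tensors,
# normalized with ImageNet statistics.
preprocess = transforms.Compose([
    transforms.Resize(299),
    transforms.CenterCrop(299),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
img = Image.open('some_image.jpg').convert('RGB')   # placeholder file name
batch = preprocess(img).unsqueeze(0)                # add a batch dimension: [1, 3, 299, 299]

# Run the forward pass and read off the predicted class index.
with torch.no_grad():
    logits = model(batch)
print(logits.argmax(dim=1).item())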

Related

Is it possible to obtain the output of an intermediate layer?

If a big model consists of several individual models trained end-to-end, can I (after training) keep only one of them and freeze/discard the others during inference?
An example: struct2depth has three models trained in an unsupervised fashion. However, what I really need is the object motion, namely the 3D Object Motion Estimation part. So I wonder whether it is feasible to
train on the original networks, but
run inference with only the Object Motion Estimator, i.e. with the other parts frozen or discarded?
I saw that in TensorFlow one can obtain the tensor output of a specified layer, but to save unnecessary computation I'd like to simply freeze all the other parts... I don't know if that's possible.
Looking forward to some insights. Thanks in advance!
You can ignore weights by setting them to 0. For this, you can directly get a weight W and do W.assign(tf.multiply(W, 0)). I know that you care about speeding up inference, but unless you rewrite your code to use sparse representations, you will probably not speed up inference, since the weights can't be removed fully.
What you can alternatively do, is look at existing solutions for pruning in custom layers:
import tensorflow as tf
import tensorflow_model_optimization as tfmot

class MyDenseLayer(tf.keras.layers.Dense, tfmot.sparsity.keras.PrunableLayer):
    def get_prunable_weights(self):
        # Prune bias also, though that usually harms model accuracy too much.
        return [self.kernel, self.bias]

# Use `prune_low_magnitude` to make the `MyDenseLayer` layer train with pruning.
input_shape = (20,)  # example input shape for this sketch
model_for_pruning = tf.keras.Sequential([
    tfmot.sparsity.keras.prune_low_magnitude(MyDenseLayer(20, input_shape=input_shape)),
    tf.keras.layers.Flatten()
])
You can e.g. use ConstantSparsity (see here) and set the parameters such that your layers are fully pruned.
Another alternative is to construct a second, smaller model that you only use for inference. You can then save the required weights separately (instead of saving the entire model) after training and load them in the second model.
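As a rough illustration of that last idea, here is a small tf.keras sketch. The layer names, sizes, and the split into a "motion" and a "depth" branch are placeholders, not the actual struct2depth architecture: the point is to give the layers you want to keep the same names in both models and load the saved weights by name.
import tensorflow as tf

# Full training model with two branches; only the 'motion' branch is needed later.
inp = tf.keras.Input(shape=(128,))
hidden = tf.keras.layers.Dense(64, activation='relu', name='motion_dense')(inp)
motion = tf.keras.layers.Dense(6, name='motion_out')(hidden)
depth = tf.keras.layers.Dense(1, name='depth_out')(inp)
full_model = tf.keras.Model(inp, [motion, depth])

# ... train full_model ..., then store only its weights:
full_model.save_weights('full_weights.h5')

# Smaller inference-only model that rebuilds just the motion branch with the same layer names.
inp2 = tf.keras.Input(shape=(128,))
x = tf.keras.layers.Dense(64, activation='relu', name='motion_dense')(inp2)
out = tf.keras.layers.Dense(6, name='motion_out')(x)
small_model = tf.keras.Model(inp2, out)
small_model.load_weights('full_weights.h5', by_name=True)   # only layers with matching names are loaded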

How to reset classes while retaining class specific weights in TensorFlow Object Detection API

I am currently using the TensorFlow Object Detection API and am attempting to fine-tune a pre-trained Faster-RCNN from the model zoo. Currently, if I choose a number of classes different from the number used in the original network, it simply does not initialise the weights and biases from SecondStageBoxPredictor/ClassPredictor, as this now has different dimensions from the original ClassPredictor. However, since all of the classes I would like to train the network on are classes the original network has already been trained to identify, I would like to retain the weights and biases associated with the classes I want to use in SecondStageBoxPredictor/ClassPredictor and prune all the others, rather than simply initialising these values from scratch (similar to the behaviour of this function).
Is this possible, and if so, how would I go about modifying the structure of this layer in the Estimator?
n.b. This question asks a similar thing, and the response there is to ignore irrelevant classes in the network output. In this situation, however, I am attempting to fine-tune the network, and I assume the presence of these redundant classes would complicate the training / evaluation process?
If all the classes you would like to train the network on are ones the network has already been trained to identify, you could simply use the network for detection as-is, couldn't you?
However, if you have extra classes and you would like to do transfer learning, you can have as many variables as possible restored from the checkpoint by setting:
fine_tune_checkpoint_type: 'detection'
load_all_detection_checkpoint_vars: True
in the train_config field of the pipeline config file.
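For reference, these fields sit inside the train_config block of the pipeline config, roughly like this (the checkpoint path is a placeholder):
train_config: {
  fine_tune_checkpoint: "path/to/model.ckpt"
  fine_tune_checkpoint_type: "detection"
  load_all_detection_checkpoint_vars: true
  # ... the rest of your training settings ...
}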
Finally, by looking at the computation graph, you can see that the shape of SecondStageBoxPredictor/ClassPredictor/weights depends on the number of output classes.
Note that in TensorFlow you can only restore at the variable level; if two variables have different shapes, one cannot be used to initialize the other. So in your case the idea of preserving some values of the weights variable is not feasible.

How to pass output of one graph as input to another graph in tensorflow

I want to do transfer learning with the INCEPTION_V4 model as a feature extractor,
and I downloaded the code and checkpoint file from
the GitHub repository of TensorFlow.
Then I added my own layer for classification into 5 classes. But during model restore using tf.train.Saver, it shows an error that it cannot find variable values for the layer I added.
To solve this, I created two separate graphs, one for loading the pre-trained model and one for my classification layers. But I can't pass the output of one graph as input to the second graph.
Can you suggest another way to do transfer learning, or a way to solve this problem?
The easy solution is to just construct the Inception model (without your layers), then create the saver and use it to restore, and only then create your layers.
There are more complex solutions (you can pass the var_list parameter of tf.train.Saver with the list of all variables you want to restore, and set that to the list of all the Inception variables), but this one should be straightforward, and I do it with my transfer-learning models.
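A rough TF1-style sketch of that ordering trick; build_inception_v4 is a hypothetical stand-in for the model-building code from the repository, and the checkpoint path is a placeholder:
import tensorflow as tf

images = tf.placeholder(tf.float32, [None, 299, 299, 3])
features = build_inception_v4(images)   # hypothetical: builds the pretrained graph, returns features

# Create the Saver before adding the new layers, so it only knows about the
# Inception variables that actually exist in the downloaded checkpoint.
inception_saver = tf.train.Saver()

# Now add the 5-class head; its variables are not part of inception_saver.
logits = tf.layers.dense(features, 5, name='my_classifier')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())          # random init for everything, including the new head
    inception_saver.restore(sess, 'inception_v4.ckpt')   # then overwrite the pretrained part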

Modify pretrained model in tensorflow

I want to know how to make changes to a graph loaded from tensorflow's meta and checkpoint files like:
saver = tf.train.import_meta_graph('***.meta')
saver.restore(sess,tf.train.latest_checkpoint('./'))
For example, there is old_layer1 -> old_layer2 in the existing graph with pretrained weights. I want to insert a layer so that it becomes old_layer1 -> new_layer -> old_layer2, with new_layer randomly initialized since there are no pretrained parameters for it. The answer here said it's impossible, since TF's graph only allows appending; is this true?
So I wonder if this can be worked around by loading the pretrained layers as individual variables, assigning the pre-trained weights as initial values, and connecting them myself, so that I can add new layers between the old ones. But I don't know how to do this in code.
Doing this with raw TensorFlow can be complicated, since the TF graph does not directly encode the structure of the layers. If your model was built with tf.keras, however, this is fairly straightforward, since loading a Keras model also loads its layer structure.
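For example, if the pretrained model is a simple feed-forward tf.keras model, one way to do the insertion is to rebuild it layer by layer, reusing the existing (already trained) layer objects and splicing in the new one. The file name, layer names, and the new layer's size below are placeholders; note that the new layer's output shape must match what old_layer2 expects, and this simple loop only works for linear (non-branching) architectures.
import tensorflow as tf

old_model = tf.keras.models.load_model('pretrained.h5')   # placeholder file name

x = old_model.input
for layer in old_model.layers:
    if isinstance(layer, tf.keras.layers.InputLayer):
        continue
    if layer.name == 'old_layer2':
        # Splice in the new, randomly initialized layer just before old_layer2.
        x = tf.keras.layers.Dense(128, activation='relu', name='new_layer')(x)
    x = layer(x)   # reusing the existing layer object keeps its pretrained weights

new_model = tf.keras.Model(old_model.input, x)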

Saving just model & weights in Keras (in single file)

I have Python code that generates a deep convolutional neural network using Keras. I'm trying to save the model, but the result is gigantic (100s of MBs). I'd like to pare that down a bit to make something more manageable.
The problem is that model.save() stores (quoting the Keras FAQ):
the architecture of the model, allowing to re-create the model
the weights of the model
the training configuration (loss, optimizer)
the state of the optimizer, allowing to resume training exactly where you left off.
If I'm not doing any more training, I think I just need the first two.
I can use model.to_json() to make a JSON string of the architecture and save that off, and model.save_weights() to make a separate file containing the weights. That's about a third the size of the full model.save() result. But I'm wondering if there's some way to store these in a single self-contained file? (Short of outputting two files, zipping them together, and deleting the originals.) Alternatively, maybe there's a way to delete the training configuration and optimizer state when training is complete, so that model.save() doesn't give me something nearly so big?
Thanks.
The save function of a Model has a parameter exactly for this, called include_optimizer; setting it to False will save the model without the optimizer state, which should lead to a much smaller HDF5 file:
model.save("something.hdf5", include_optimizer=False)
