I want to know how to make changes to a graph loaded from tensorflow's meta and checkpoint files like:
saver = tf.train.import_meta_graph('***.meta')
saver.restore(sess,tf.train.latest_checkpoint('./'))
For example, suppose the existing graph contains old_layer1 -> old_layer2 with pretrained weights. I want to insert a layer so that it becomes old_layer1 -> new_layer -> old_layer2, where new_layer is randomly initialized since there are no pretrained parameters for it. An answer here said it's impossible, since TF's graph only allows appending; is this true?
So I wonder if this can be worked around by loading the pretrained layers as individual variables, assigning the pretrained weights as initial values, and connecting them myself, so that I can add new layers between the old ones. But I don't know how to do this in code.
Doing this with raw TensorFlow can be complicated, since the TF graph does not directly encode the structure of the layers. If your model was built with tf.keras, however, this is fairly straightforward, as loading a Keras model also loads its layer structure.
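For instance, a hedged sketch using the Keras functional API might look like the following; the file name, the layer names old_layer1/old_layer2, and the Dense layer are placeholders for your own model, and new_layer's output shape must match what old_layer2 expects:
import tensorflow as tf

# Load the pretrained model; this recovers the layer objects together with their weights.
old_model = tf.keras.models.load_model('pretrained_model.h5')

inputs = old_model.input
x = old_model.get_layer('old_layer1').output           # pretrained graph up to old_layer1
x = tf.keras.layers.Dense(64, activation='relu',
                          name='new_layer')(x)          # randomly initialized new layer
outputs = old_model.get_layer('old_layer2')(x)          # reuses the pretrained old_layer2
new_model = tf.keras.Model(inputs, outputs)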
I was trying to load a Keras model which I saved during my training, so I went to the Keras documentation, where I saw this:
Only topological loading (by_name=False) is supported when loading
weights from the TensorFlow format. Note that topological loading
differs slightly between TensorFlow and HDF5 formats for user-defined
classes inheriting from tf.keras.Model: HDF5 loads based on a
flattened list of weights, while the TensorFlow format loads based on
the object-local names of attributes to which layers are assigned in
the Model's constructor.
Could you please explain the above?
For clarity, let's consider two cases.
Case 1: Simple model, and
Case 2: Complex model, where user-defined classes inheriting from tf.keras.Model are used.
Case 1: Simple model (as in Keras Functional and Sequential models)
When you save model weights (using model.save_weights) and then load weights (using model.load_weights), by default the load_weights method uses topological loading. This is the same for the TensorFlow saved_model ('tf') format as well as the 'h5' format. For example,
loadedh5_model.load_weights('./MyModel_h5.h5')
# the line above is same as the line below (as second and third arguments are default)
#loadedh5_model.load_weights('./MyModel_h5.h5',by_name=False, skip_mismatch=False)
If you want to load the weights of only specific layers of a saved model, you need to use by_name=True (note that by_name=True is only supported with the HDF5 format). There are use cases that require this type of loading.
loadedh5_model.load_weights('./MyModel_h5.h5',by_name=True, skip_mismatch=False)
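As a minimal sketch of Case 1 (the file names, layer names, and layer sizes below are only illustrative), saving and then topologically reloading the weights of a Sequential model could look like this:
import tensorflow as tf

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation='relu', input_shape=(8,), name='dense_1'),
        tf.keras.layers.Dense(4, name='dense_2'),
    ])

model = build_model()
model.save_weights('./MyModel_h5.h5')            # HDF5 format (inferred from the .h5 extension)
model.save_weights('./MyModel_tf')               # TensorFlow checkpoint format

loadedh5_model = build_model()
loadedh5_model.load_weights('./MyModel_h5.h5')   # topological loading (by_name=False)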
Case 2: Complex model (as in Keras Subclassed models)
As of now, only the 'tf' format is supported when user-defined classes inheriting from tf.keras.Model are used in the model creation.
Only topological loading (by_name=False) is supported when loading
weights from the TensorFlow format. Note that topological loading
differs slightly between TensorFlow and HDF5 formats for user-defined
classes inheriting from tf.keras.Model: HDF5 loads based on a
flattened list of weights, while the TensorFlow format loads based on
the object-local names of attributes to which layers are assigned in
the Model's constructor.
The main reason is the way the weights are stored in the h5 format versus the tf format.
For example, consider Case 1, where HDF5 loads based on a flattened list of weights. The weights are loaded without any error. In Case 2, however, the model has user-defined classes, which require a different approach than just loading flattened weights. In order to take care of assigning the weights of custom classes, the 'tf' format loads the weights based on the object-local names of the attributes to which layers are assigned in the Model's constructor.
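A hedged sketch of Case 2 (the class, layer sizes, and checkpoint path are only illustrative) shows how the 'tf' format tracks weights under the constructor attribute names dense_a and dense_b:
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # weights are tracked under the attribute names 'dense_a' and 'dense_b'
        self.dense_a = tf.keras.layers.Dense(16, activation='relu')
        self.dense_b = tf.keras.layers.Dense(4)

    def call(self, inputs):
        return self.dense_b(self.dense_a(inputs))

model = MyModel()
model(tf.zeros((1, 8)))                               # build the model by calling it once
model.save_weights('./subclassed_ckpt')               # saved in the 'tf' checkpoint format

restored = MyModel()
status = restored.load_weights('./subclassed_ckpt')   # topological loading only
restored(tf.zeros((1, 8)))                            # restore ops run once the network is built
status.assert_consumed()                              # optional check that everything was restored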
The following paragraph from the Keras documentation further clarifies this:
When loading a weight file in TensorFlow format, returns the same
status object as tf.train.Checkpoint.restore. When graph building,
restore ops are run automatically as soon as the network is built (on
first call for user-defined classes inheriting from Model, immediately
if it is already built).
Another point to understand is that Keras Functional or Sequential models are static graphs of layers that can use flattened weights without any problem. A Keras Subclassed model (as in our Case 2) is a piece of Python code (a call method); there is no graph of layers. So as soon as the network is built with the custom classes, restore ops are run to update the status objects. Hope it helps.
I am currently using the TensorFlow Object Detection API and am attempting to fine tune a pre-trained Faster-RCNN from the model zoo. Currently, if I choose a different number of classes to the number used in the original network, it will simply not initialise the weights and biases from the SecondStageBoxPredictor/ClassPredictor as this now has different dimensions from the original ClassPredictor. However, as all of the classes I would like to train the network on are classes the original network has been trained to identify, I would like to retain the weights and biases associated with the classes I want to use in SecondStageBoxPredictor/ClassPredictor and prune all the others, rather than simply initialising these values from scratch (similar to the behaviour of this function).
Is this possible, and if so, how would I go about modifying the structure of this layer in the Estimator?
n.b. This question asks a similar thing, and their response is to ignore irrelevant classes from the network output - in this situation, however, I am attempting to fine tune the network and I assume the presence of these redundant classes would complicate the training / evaluation process?
If all the classes you would like to train the network on are ones the network has already been trained to identify, you could simply use the network for detection as-is, couldn't you?
However, if you have extra classes and would like to do transfer learning, you can have as many variables as possible restored from the checkpoint by setting:
fine_tune_checkpoint_type: 'detection'
load_all_detection_checkpoint_vars: True
in the train_config field of the pipeline config file.
Finally, by looking at the computation graph, it can be seen that the shape of SecondStageBoxPredictor/ClassPredictor/weights is dependent on the number of output classes.
Note that in TensorFlow you can only restore at the variable level; if two variables have different shapes, you cannot use one to initialize the other. So in your case, the idea of preserving some of the values of the weights variable is not feasible.
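If you still want to restore whatever does fit outside of the Object Detection API config, a hedged sketch of the same "skip variables whose shape changed" idea in plain TensorFlow could look like this (the checkpoint path is a placeholder); variables such as the resized ClassPredictor weights simply keep their fresh random initialization:
import tensorflow as tf

ckpt_path = '/path/to/model.ckpt'                 # placeholder path
reader = tf.train.NewCheckpointReader(ckpt_path)
ckpt_shapes = reader.get_variable_to_shape_map()

# Keep only the variables that exist in the checkpoint with the same shape.
restorable = [v for v in tf.global_variables()
              if v.op.name in ckpt_shapes
              and v.shape.as_list() == ckpt_shapes[v.op.name]]

saver = tf.train.Saver(var_list=restorable)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())   # mismatched variables stay randomly initialized
    saver.restore(sess, ckpt_path)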
I want to do transfer learning with the INCEPTION_V4 model as a feature extractor,
and downloaded the code and checkpoint file from the
GitHub repository of Tensorflow
Then I added my own layer for classification into 5 classes. But while restoring the model using tf.train.Saver, it shows an error that it cannot find variable values for the layer that I added.
To solve this, I created two separate graphs, one for loading the pre-trained model and one for my classification layers. But I can't pass the output of one graph as input to the second graph.
Can you suggest any other way to do transfer learning, or how to solve the problem?
The easy solution is to just construct the inception model (without your layers), then create the saver and use it to restore, and only then create your layers.
There are more complex solutions (you can pass the var_list parameter of tf.train.Saver with the list of all the variables you want to restore, and set that to be the list of all the inception variables), but this one should be straightforward, and it is what I do with my transfer-learning models.
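A hedged sketch of that ordering; build_inception_v4 is a placeholder for whatever function in the downloaded code builds the pretrained graph and returns its feature tensor, and the checkpoint path and layer size are illustrative:
import tensorflow as tf

images = tf.placeholder(tf.float32, [None, 299, 299, 3])
features = build_inception_v4(images)            # pretrained part of the graph (placeholder)

# Create the saver now: it only captures the inception variables,
# so restore() will not look for your not-yet-created layers.
saver = tf.train.Saver()

# Only after that, add your own classification layer for the 5 classes.
logits = tf.layers.dense(features, 5, name='my_classifier')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initializes all variables, including yours
    saver.restore(sess, 'inception_v4.ckpt')     # then overwrites the pretrained ones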
I am not very familiar with Torch, and I primarily use Tensorflow. I, however, need to use a retrained inception model that was retrained in Torch. Due to the large amount of computing resources required to retrain an inception model for my particular application, I would like to use the model that was already retrained.
This model is saved as a .pth.tar file.
I would like to be able to first load this model. So far, I have been able to figure out that I must use the following:
model = torch.load('iNat_2018_InceptionV3.pth.tar', map_location='cpu')
This seems to work, because print(model) prints out a large set of numbers and other values, which I presume are the values of the weights and biases.
After this, I need to be able to classify an image with it. I haven't been able to figure this out. How must I format the image? Should the image be converted into an array? After this, how must I pass the input data to the network?
You basically need to do the same as in TensorFlow: when you store a network, only the parameters (i.e. the trainable objects in your network) are stored, but not the "glue", that is, all the logic you need to use a trained model.
So if you have a .pth.tar file, you can load it, thereby overriding the parameter values of a model that is already defined.
That means that the general procedure of saving/loading a model is as follows:
write your network definition (i.e. your nn.Module object)
train or otherwise change the network's parameters in a way you want
save the parameters using torch.save
when you want to use that network, use the same definition of an nn.Module object to first instantiate a pytorch network
then override the values of the network's parameters using torch.load
Here's a discussion with some references on how to do this: pytorch forums
And here's a super short mwe:
# to store
torch.save({
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
}, 'filename.pth.tar')

# to load
checkpoint = torch.load('filename.pth.tar')
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])
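Once the weights are loaded, a hedged inference sketch might look like the following; the 299x299 input size and the ImageNet normalization statistics are assumptions, so adjust them to whatever the iNat model was actually trained with:
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(299),
    transforms.CenterCrop(299),
    transforms.ToTensor(),                       # PIL image -> CHW float tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open('example.jpg').convert('RGB')   # placeholder image path
batch = preprocess(img).unsqueeze(0)             # add a batch dimension: (1, 3, 299, 299)

model.eval()                                     # disable dropout, use running BatchNorm stats
with torch.no_grad():
    logits = model(batch)
predicted_class = logits.argmax(dim=1).item()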
I have a trained model with 10 classes, and I need to add a new class without losing the weights of the pre-trained model. I know that in order to do this, I should first load the pre-trained model, remove/replace the FC layer, freeze the lower layers of the pre-trained model, and train the network. So the theory is clear, but practically, which method should I call to load the previously trained model? I am confused between just restoring the last checkpoint with saver.restore(sess, tf.train.latest_checkpoint) and importing the meta graph by calling tf.train.import_meta_graph.
As for freezing all the weights, I don't know which methods are responsible for that. Any idea? Or is there any other possible way to add a new class?