I have been unable to figure out how to use transfer learning/last layer retraining with the new TF Estimator API.
The Estimator requires a model_fn which contains the architecture of the network, and training and eval ops, as defined in the documentation. An example of a model_fn using a CNN architecture is here.
If I want to retrain the last layer of, for example, the inception architecture, I'm not sure whether I will need to specify the whole model in this model_fn, then load the pre-trained weights, or whether there is a way to use the saved graph as is done in the 'traditional' approach (example here).
This has been brought up as an issue, but is still open and the answers are unclear to me.
It is possible to load the metagraph during model definition and use SessionRunHook to load the weights from a ckpt file.
def model(features, labels, mode, params):
    # Create the graph here
    return tf.estimator.EstimatorSpec(mode,
                                      predictions,
                                      loss,
                                      train_op,
                                      training_hooks=[RestoreHook()])
The SessionRunHook can be:
class RestoreHook(tf.train.SessionRunHook):
    def after_create_session(self, session, coord=None):
        # Only restore on a fresh run, i.e. while the global step is still 0
        if session.run(tf.train.get_or_create_global_step()) == 0:
            pass  # load weights here
This way, the weights are loaded on the first step and then saved to the model checkpoints during training.
I am using Weights & Biases (link) to manage hyperparameter optimization and log the results. I am training using Keras with a Tensorflow backend, and I am using the out-of-the-box logging functionality of Weights & Biases, in which I run
wandb.init(project='project_name', entity='username', config=config)
and then add a WandbCallback() to the callbacks of classifier.fit(). By default, Weights & Biases appears to save the model parameters (i.e., the model's weights and biases) and store them in the cloud. This eats up my account's storage quota, and it is unnecessary --- I only care about tracking the model loss/accuracy as a function of the hyperparameters.
Is it possible for me to train a model and log the loss and accuracy using Weights & Biases, but not store the model parameters in the cloud? How can I do this?
To avoid saving the trained model weights during hyperparameter optimization, pass save_model=False to the callback:
classifier.fit(..., callbacks=[WandbCallback(..., save_model=False)])
This will only track the metrics (train/validation loss/acc, etc.).
If a pretrained model such as ResNet101 was trained on the ImageNet dataset and I then change some layers inside it, can I still use the pretrained weights on a different dataset, ABC?
Let's say this is the ResNet34 model:
It is pretrained on ImageNet and saved as a ResNet.pt file.
Suppose I change some layers inside it, for example making it deeper by adding some layers in conv4_x (see image):
model = Resnet34()  # I have changed some layers inside this Resnet34()
optimizer = optim.Adam(model.parameters(), lr=0.00005)
model.load_state_dict(torch.load('Resnet.pt')['state_dict'])  # This is the pretrained ResNet from before the changes
optimizer.load_state_dict(torch.load('Resnet.pt')['optimizer'])
Can I do this, or is there another method?
You can do anything you like - the question is: would it be better than training from scratch?
Here are a few issues you might encounter:
1. A mismatch between the weights saved in ResNet.pt (the trained weights of the original ResNet34) and the state_dict of your modified model.
You would probably need to manually make sure that the old weights are correctly assigned to the original layers and that only the new layers are left without pretrained weights.
2. Initializing the weights of the new layer.
Since you are training a ResNet, you can take advantage of the residual connections and initialize the weights of the new layer so that it initially makes no contribution to the predicted value and only passes its input directly to the output via the residual link.
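Both points can be sketched without any framework. The snippet below is only an illustration under stated assumptions: the parameter names and shapes are hypothetical stand-ins for a real state_dict, and the residual block is a bare two-layer toy with no normalization or final activation. In PyTorch the filtered entries would typically be passed to model.load_state_dict(..., strict=False); filtering on shape as well as name matters because a size mismatch raises an error even when strict=False.

```python
# Shapes stand in for tensors; every parameter name below is hypothetical.
pretrained = {
    "conv1.weight": (64, 3, 7, 7),
    "layer3.0.conv1.weight": (256, 128, 3, 3),
    "fc.weight": (1000, 512),
}
modified = {
    "conv1.weight": (64, 3, 7, 7),
    "layer3.0.conv1.weight": (256, 128, 3, 3),
    "layer3.6.conv1.weight": (256, 256, 3, 3),  # newly added block
    "fc.weight": (10, 512),                     # new head for the new dataset
}


def select_compatible(pretrained, target):
    """Return (loadable, skipped, uninitialised) parameter names.

    An entry is loadable only if the name exists in the target model
    and the shapes are identical.
    """
    loadable = sorted(k for k, shape in pretrained.items() if target.get(k) == shape)
    skipped = sorted(k for k in pretrained if k not in loadable)
    uninitialised = sorted(k for k in target if k not in loadable)
    return loadable, skipped, uninitialised


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def residual_block(x, w1, w2):
    """Toy residual block: x + w2 @ relu(w1 @ x)."""
    hidden = [max(0.0, h) for h in matvec(w1, x)]
    branch = matvec(w2, hidden)
    return [xi + bi for xi, bi in zip(x, branch)]


loadable, skipped, uninitialised = select_compatible(pretrained, modified)
print("load from checkpoint:", loadable)
print("ignored checkpoint entries:", skipped)
print("need fresh initialisation:", uninitialised)

# Zeroing the last layer of the new branch makes the block an identity map.
x = [0.5, -1.0]
w1 = [[0.3, -0.2], [0.1, 0.4]]
w2_zero = [[0.0, 0.0], [0.0, 0.0]]
identity_out = residual_block(x, w1, w2_zero)
print(identity_out)
```

Note that inserting a block in the middle of a stage can shift the indices of the blocks after it, so names alone may not line up; that is the manual mapping step mentioned in point 1.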
I have been browsing the documentation for the tf.keras.models.save_model() API and came across the include_optimizer parameter. What would be the advantage of not including the optimizer, and what problems could arise if the optimizer isn't saved with the model?
To give more context for my specific use case, I want to save a model and then use the generated .pb file with TensorFlow Serving. Is there any reason I would need to save the optimizer state? Would not saving it reduce the overall size of the resulting file? If I don't save it, is it possible that the model will not work correctly in TF Serving?
Saving the optimizer state will require more space, as the optimizer has parameters that are adjusted during training. For some optimizers, this space can be significant, as several meta-parameters are saved for each tuned model parameter.
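As a rough back-of-the-envelope illustration of that overhead, the sketch below counts float32 values only and ignores graph metadata and scalar counters. The parameter count is a hypothetical figure, and the per-parameter slot counts are the usual ones for these optimizer families (for example, Adam keeps a first- and second-moment estimate for every parameter).

```python
BYTES_PER_FLOAT32 = 4

# Extra per-parameter values each optimizer keeps (its "slots").
SLOTS_PER_PARAM = {
    "sgd": 0,       # plain SGD keeps no per-parameter state
    "momentum": 1,  # one velocity value per parameter
    "adam": 2,      # first- and second-moment estimates per parameter
}


def estimated_bytes(num_params, optimizer=None):
    """Rough float32 size of the weights plus optional optimizer slots."""
    slots = SLOTS_PER_PARAM[optimizer] if optimizer is not None else 0
    return num_params * BYTES_PER_FLOAT32 * (1 + slots)


num_params = 25_000_000  # hypothetical model size
weights_only = estimated_bytes(num_params)
with_adam = estimated_bytes(num_params, "adam")
print(weights_only / 1e6, "MB without optimizer state")
print(with_adam / 1e6, "MB with Adam state")
```

Under these assumptions, saving Adam's state roughly triples the amount of numeric data written.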
Saving the optimizer parameters allows you to restart training in exactly the same state as you saved the checkpoint, whereas without saving the optimizer state, even the same model parameters might result in a variety of training outcomes with different optimizer parameters.
Thus, if you plan on continuing to train your model from the saved checkpoint, you'd probably want to save the optimizer's state as well. However, if you're saving the model only for future inference, you don't need the optimizer state for anything. Based on your description of wanting to deploy the model on TF Serving, it sounds like you'll only be doing inference with the saved model, so you are safe to exclude the optimizer.
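The point about resuming training can be illustrated with a toy example that involves no TensorFlow at all. This is a sketch under assumptions: a hand-written momentum SGD minimizing f(w) = w**2, where the velocity plays the role of the optimizer state and the learning rate and momentum values are arbitrary.

```python
def sgd_momentum_steps(w, velocity, steps, lr=0.1, beta=0.9):
    """Run momentum SGD on f(w) = w**2 and return the final (w, velocity)."""
    for _ in range(steps):
        grad = 2.0 * w                      # derivative of w**2
        velocity = beta * velocity + grad   # optimizer state update
        w = w - lr * velocity               # parameter update
    return w, velocity


# Reference: ten uninterrupted steps.
w_full, _ = sgd_momentum_steps(1.0, 0.0, steps=10)

# "Checkpoint" after five steps.
w5, v5 = sgd_momentum_steps(1.0, 0.0, steps=5)

# Resume with the saved optimizer state: matches the uninterrupted run.
w_resumed, _ = sgd_momentum_steps(w5, v5, steps=5)

# Resume with the weights only (velocity reset to zero): a different result.
w_reset, _ = sgd_momentum_steps(w5, 0.0, steps=5)

print("uninterrupted:       ", w_full)
print("resumed with state:  ", w_resumed)
print("resumed without state:", w_reset)
```

For inference none of this matters, since only the weights are used to compute predictions.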
I am training a model in tf.keras and I want to save all the activations of a given layer during training (so at each batch for instance) in order to be able to track boxplots/histograms of these activations in Tensorboard.
I am getting lost among the TensorBoard callback options, and I have not managed to use any of them for this purpose.
I have tried to write a custom callback, but I get an error when I use .numpy on model.layers[i].output.
I have also tried custom metrics, but it seems from the example that they only store a variable with shape=().
I have found answers about visualizing activations at inference time, but not during training on the training data itself.
thanks
I have trained a CNN model using tf.estimator and tf.data.TFRecordDataset, which defines the model in a model_fn function and the input in an input_fn function. I also use a one-shot iterator to get one batch of examples at a time.
Now I have the trained model files (ckpt, meta, index) in a directory. What I want to do is predict an image's label with the trained model, without training and evaluating again. The image can be a numpy array, but it cannot be a TFRecords file (which is what was used for training).
I couldn't find an effective solution after trying all day. I can only get the values of the weights and biases, and I don't know how to make my input image compatible with the model.
FYI, my training code is here.
A similar question is "Prediction from model saved with tf.estimator.Estimator in Tensorflow", but it has no accepted answer and my model input uses the Dataset API.
So I really need help. Thanks.
I have answered a similar question here.
To make predictions using a custom input, you need to use the built-in predict method of Estimators:
estimator = tf.estimator.Estimator(model_fn, ...)
predict_input_fn = ... # define this using tf.data
predict_results = estimator.predict(predict_input_fn)
for idx, prediction in enumerate(predict_results):
    print(idx)
    for key in prediction:
        print("...{}: {}".format(key, prediction[key]))