I want to train a model in a sequential manner. That is I want to train the model initially with a simple architecture and once it is trained, I want to add a couple of layers and continue training. Is it possible to do this in Keras? If so, how?
I tried modifying the model architecture directly, but the changes do not take effect until I compile. And once I compile, all the weights are re-initialized and I lose all the trained information.
All the questions I found on the web and SO are either about loading a pre-trained model and continuing training, or about modifying the architecture of a pre-trained model and then only testing it. I couldn't find anything that matches my question. Any pointers are highly appreciated.
PS: I'm using Keras from the TensorFlow 2.0 package.
Without knowing the details of your model, the following snippet might help:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
# Train your initial model
def get_initial_model():
    ...
    return model
model = get_initial_model()
model.fit(...)
model.save_weights('initial_model_weights.h5')
# Use Model API to create another model, built on your initial model
initial_model = get_initial_model()
initial_model.load_weights('initial_model_weights.h5')
nn_input = Input(...)
x = initial_model(nn_input)
x = Dense(...)(x) # This is the additional layer, connected to your initial model
nn_output = Dense(...)(x)
# Combine your model
full_model = Model(inputs=nn_input, outputs=nn_output)
# Compile and train as usual
full_model.compile(...)
full_model.fit(...)
Basically, you train your initial model and save its weights. Then you reload those weights into a fresh copy of the initial model and wrap it together with your additional layers using the Model (functional) API. If you are not familiar with the Model API, you can check the Keras documentation (afaik the API is the same for tf.keras in TensorFlow 2.0).
Note that you need to check that the output shape of your initial model's final layer is compatible with the additional layers (e.g. you might want to remove the final Dense layer from your initial model if you are only using it for feature extraction), as sketched below.
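For instance, a minimal sketch of cutting off the final Dense layer, assuming the initial model was built with the functional API and its penultimate layer produces the features you want:

# Hypothetical: reuse everything up to the penultimate layer
feature_extractor = Model(
    inputs=initial_model.input,
    outputs=initial_model.layers[-2].output,  # drop the final Dense layer
)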
Related
If a pretrained model such as ResNet101 was trained on the ImageNet dataset, and I then change some layers inside it, can I still use the pretrained weights on a different dataset ABC?
Let's say this is the ResNet34 model. It is pretrained on ImageNet and saved as a ResNet.pt file. If I change some layers inside it, let's say I make it deeper by introducing some layers in the conv4_x stage (see the architecture diagram),
import torch
import torch.optim as optim

model = Resnet34()  # I have changed some layers inside this ResNet34()
optimizer = optim.Adam(model.parameters(), lr=0.00005)
checkpoint = torch.load('Resnet.pt')  # pretrained checkpoint from before my changes
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])
Can I do this? Or is there another method?
You can do anything you like - the question is: would it be better than training from scratch?
Here are a few issues you might encounter:
1. A mismatch between the weights saved in ResNet.pt (the trained weights of the original ResNet34) and the state_dict of your modified model.
You would probably need to manually make sure that the old weights are correctly assigned to the original layers and that only the new layers keep their fresh initialization.
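A minimal sketch of that pattern, assuming the checkpoint layout shown in the question:

import torch

pretrained = torch.load('Resnet.pt')['state_dict']
model_dict = model.state_dict()
# Keep only the old weights whose names and shapes still match the
# modified model; the new layers stay at their fresh initialization.
compatible = {k: v for k, v in pretrained.items()
              if k in model_dict and v.shape == model_dict[k].shape}
model_dict.update(compatible)
model.load_state_dict(model_dict)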
2. Initializing the weights of the new layer.
Since you are training a ResNet, you can take advantage of the residual connections and initialize the weights of the new layers so that they initially make no contribution to the predicted value and simply pass the input through to the output via the residual link.
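As an illustration, here is a minimal sketch of such a block (a generic design, not tied to any particular ResNet implementation): zero-initializing the final BatchNorm makes the residual branch output zero at the start of training, so the block initially acts as an identity (plus ReLU).

import torch
import torch.nn as nn

class ZeroInitBlock(nn.Module):
    """Residual block whose final BatchNorm is zero-initialized,
    so initially it computes relu(x + 0) = relu(x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        nn.init.zeros_(self.bn2.weight)  # residual branch contributes nothing at init
        nn.init.zeros_(self.bn2.bias)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(x + out)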
I'm working on supporting automatic model detection/logging for Keras/TensorFlow models for our machine learning platform https://iko.ai, and I have some questions:
What are the different ways we can define a tf/keras model ?
tf.keras.Model
tf.estimator.Estimator
tensorflow_estimator
Any other ways I'm not aware of? Why are there so many ways to do the same thing?
What are the proper functions to save/load them?
How could we differentiate TF/Keras model instances from other non-model objects? I want to be able to write a function that checks if an object is a TF/Keras model, something like
def is_tf_or_keras_model(obj):
    # check somehow if obj is a TF/Keras model
    pass
Regarding questions 1 and 2:
Another way to define a neural network model is tf.keras.Sequential. It allows you to easily create a model that follows a sequential structure, for example:
import tensorflow as tf
# tf.__version__ == 2.4
model_seq = tf.keras.Sequential()
model_seq.add(tf.keras.Input(shape=(64,1)))
model_seq.add(tf.keras.layers.Conv1D(32, 2, activation='relu'))
model_seq.add(tf.keras.layers.Dense(16, activation='relu'))
model_seq.add(tf.keras.layers.Dense(4, activation='softmax'))
The same model can be built with the tf.keras functional API, which also allows you to build more complex networks that do not follow a purely sequential structure:
i = tf.keras.Input(shape=(64,1))
x = tf.keras.layers.Conv1D(32, 2, activation='relu')(i)
x = tf.keras.layers.Dense(16, activation='relu')(x)
x = tf.keras.layers.Dense(4, activation='softmax')(x)
model_func = tf.keras.Model(inputs=i, outputs=x)
tf.estimator.Estimator is used to train prebuilt networks, so you only have to define some hyperparameter values and you get an out-of-the-box trainable model.
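For example, a premade estimator such as tf.estimator.DNNClassifier only needs feature columns and a few hyperparameters; a minimal sketch (the feature name 'x' and the layer sizes are made up):

feature_cols = [tf.feature_column.numeric_column('x', shape=(64,))]
model_est = tf.estimator.DNNClassifier(
    feature_columns=feature_cols,
    hidden_units=[16, 8],  # two hidden layers
    n_classes=4)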
In conclusion, the method used to build a model depends on how complex/ad-hoc the network is.
Regarding question 3:
The tf.keras implementation allows you to either save only the weights of a model (model.save_weights) or save the whole model (model.save). To load saved weights afterwards, you need to create the model first, following the same architecture as the model that produced those weights.
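A minimal sketch of both options, reusing model_seq from above (the filenames are arbitrary):

model_seq.save('full_model.h5')  # architecture + weights + optimizer state
restored = tf.keras.models.load_model('full_model.h5')

model_seq.save_weights('weights_only.h5')  # weights only
# To restore, first rebuild an identical architecture, then call
# new_model.load_weights('weights_only.h5') on it.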
Regarding question 4:
# isinstance expects a type or a tuple of types, not a list;
# add keras.Model to the tuple if you also use standalone Keras.
MODEL_CLASSES = (tf.keras.Model, tf.estimator.Estimator)

def is_keras_or_tf_model(obj):
    return isinstance(obj, MODEL_CLASSES)
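For example, with the model_seq defined above:

print(is_keras_or_tf_model(model_seq))   # True
print(is_keras_or_tf_model('a string'))  # False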
All I have said here is in the TensorFlow documentation.
I really don't have much idea of what I'm doing; I followed this tutorial to process DeepDream images: https://www.youtube.com/watch?v=Wkh72OKmcKI
I'm trying to change the base model to any of the models listed here, https://keras.io/api/applications/#models-for-image-classification-with-weights-trained-on-imagenet, currently InceptionResNetV2 in particular. InceptionV3 uses layer names "mixed0" up to "mixed10", whereas InceptionResNetV2 apparently uses a different naming scheme.
I would have to change this section:
# Maximize the activations of these layers
names = ['mixed3', 'mixed5']
layers = [base_model.get_layer(name).output for name in names]
# Create the feature extraction model
dream_model = tf.keras.Model(inputs=base_model.input, outputs=layers)
I'm getting the error "no such layer: mixed3".
So yeah, I'm just trying to figure out how to get the layer names for this model as well as others.
You can simply run the following code to print the model architecture (including layer names):
# model denotes the Inception base model
model.summary()
Or, to visualize more complex relationships:
tf.keras.utils.plot_model(model, to_file='model.png')
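To get just the layer names programmatically, a minimal sketch (assuming InceptionResNetV2 as the base model):

import tensorflow as tf

base_model = tf.keras.applications.InceptionResNetV2(
    include_top=False, weights='imagenet')

# Print every layer name so you can pick substitutes for 'mixed3'/'mixed5'.
for layer in base_model.layers:
    print(layer.name)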
I have a few TensorFlow models saved as .h5 files.
Due to poor record-keeping and documentation on my part, I can't recall the exact architecture each has. So, I was wondering if there was a way, from the h5 files saved for each model, to inspect the models and determine the architecture.
For example, is there a way to find out the number of layers, the activation functions, the input/output sizes, etc.?
Any help is appreciated.
As suggested by Edwin in the comments, you can load the model and inspect its summary for layer details.
You can use the code below to get information about both the activation functions and the model architecture.
from tensorflow.keras.models import load_model

model = load_model('saved_model.h5')

# Print each layer's activation function
# (some layers do not have an activation function).
for layer in model.layers:
    if hasattr(layer, 'activation'):
        print(layer.name, layer.activation)

# Get the names of all layers in the model.
layer_names = [layer.name for layer in model.layers]

# Print the model's summary and details.
model.summary()
What is the simplest way to use a tf.estimator-trained model A during the training of another model B?
The weights in model A are fixed. In model B, I would like to take some inputs, do some computation, feed the results into model A, and then do some more computation on its output.
A simple example:
ModelA returns tf.matmul(input, weights)
In ModelB, I would like to do the following:
x1 = tf.matmul(new_inputs, new_weights1)
x2 = modelA(x1)  # with fixed weights
return tf.matmul(x2, new_weights2)
But with more complicated models A and B, each of which is trained as a tf.estimator (though I'm happy to not use estimators if there's another easy solution -- I'm using them because I would like to use ML Engine).
This question is related, but the proposed solution does not work for training model B, because the gradients of tf.py_func are [None]. I have tried registering a gradient for tf.py_func, but this fails with
Unsupported object type Tensor
I have also tried tf.import_graph_def for model A, but this seems to load the pretrained graph, but not the actual weights.
For model composability, Keras works a whole lot better. You can convert a Keras model to an estimator:
https://cloud.google.com/blog/products/gcp/new-in-tensorflow-14-converting-a-keras-model-to-a-tensorflow-estimator
So you can still train on ML Engine.
With Keras, it is then just a matter of loading the intermediate layers' weights and biases from a checkpoint and making those layers non-trainable. See:
Is it possible to save a trained layer to use layer on Keras?
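A minimal sketch of the composition, assuming model A has been exported as a Keras model to 'model_a.h5' (a hypothetical filename) and that the shapes line up:

import tensorflow as tf
from tensorflow.keras import layers

# Load model A and freeze its weights.
model_a = tf.keras.models.load_model('model_a.h5')
model_a.trainable = False

# Model B: compute, run through the frozen A, compute some more.
inputs = tf.keras.Input(shape=(16,))  # input shape is made up
x1 = layers.Dense(32)(inputs)         # ~ tf.matmul(new_inputs, new_weights1)
x2 = model_a(x1, training=False)      # fixed-weight model A
outputs = layers.Dense(8)(x2)         # ~ tf.matmul(x2, new_weights2)

model_b = tf.keras.Model(inputs, outputs)
model_b.compile(optimizer='adam', loss='mse')

# Convert to an estimator if you still want to train on ML Engine.
estimator_b = tf.keras.estimator.model_to_estimator(keras_model=model_b)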