Store/Reload CNTK Trainer, Model, Inputs, Outputs - python

What is the best way to store a trainer and all necessary components?
1. Storing:
Store checkpoint of the trainer: Use its trainer.save_checkpoint(filename, external_state={}) function
Additionally store the model separately: Use the z.save(filename) method, which every CNTK function has. You can also get the model via z = trainer.model.
2. Reloading:
Restore the model: Use C.load_model(...). (Don't be confused by the deprecated persist namespace from CNTK 1.)
Get the inputs from the restored model.
Restore the trainer itself: Use trainer.restore_from_checkpoint. The problem is that this function already needs a trainer object, which presumably has to be initialized in the same way as the trainer used to create the checkpoint.
How do I now restore the label-inputs which are going into the error function used by the trainer? In the following code I marked the variables which I think I have to restore after I once stored them.
z = C.layers.Dense(.... )
loss = error = C.squared_error(z, **l**)
**trainer** = C.Trainer(**z**, (loss, error), [mylearner], my_tensorboard_writer)

You can restore your trainer, but I actually prefer to just load my model m. The simple reason is that it is much easier to create a whole new trainer, because then you can change all the other parameters of the trainer more easily.
Then you can get the input variable from the loaded model (if your network has only one input):
input_var = m.arguments[0]
then you need the output of your model:
output = m(input_var)
and define the loss function using your target output target_output:
C.squared_error(output, target_output)
Using your model and the loss function, you can recreate your trainer from there, setting the learning rate etc. as you like.
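The steps in this answer could be put together roughly as follows. This is a minimal sketch, not a definitive recipe: it assumes CNTK 2.x and a network with exactly one input and one output. The function name rebuild_trainer, the plain SGD learner, the learning rate, and the use of m.output.shape for the label input are illustrative assumptions that do not come from the answer.

```python
def rebuild_trainer(model_path, learning_rate=0.01):
    """Load a saved CNTK model and build a fresh Trainer around it.

    Sketch only: assumes CNTK 2.x and a single-input, single-output network.
    The learner and learning rate are placeholders to adjust as needed.
    """
    import cntk as C  # imported lazily so the sketch can be read without CNTK

    m = C.load_model(model_path)   # restore the stored network
    input_var = m.arguments[0]     # the single input of the network
    output = m(input_var)          # model output for that input

    # New label input; assumed to have the same shape as the model output.
    target_output = C.input_variable(m.output.shape)

    loss = C.squared_error(output, target_output)
    # Assumption: a float learning rate is accepted here; older CNTK 2.x
    # releases may require a learning-rate schedule object instead.
    learner = C.sgd(m.parameters, lr=learning_rate)
    trainer = C.Trainer(output, (loss, loss), [learner])
    return trainer, input_var, target_output
```

The returned input_var and target_output are the variables to feed during training, which answers the question of how to get the label input back: it is simply created anew rather than restored.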

Related

Tensorflow Keras Model subclassing -- call function

I am experimenting with self-supervised learning using TensorFlow. The example code I'm running is the NNCLR example from the Keras examples website (the code is also available on GitHub). While I have no issues running the examples, I run into problems when I try to save the pretrained or the finetuned model using model.save().
The error I'm getting is this:
f"Model {model} cannot be saved either because the input shape is not "
ValueError: Model <__main__.NNCLR object at 0x7f6bc0f39550> cannot be saved either
because the input shape is not available or because the forward pass of the model is
not defined. To define a forward pass, please override `Model.call()`.
To specify an input shape, either call `build(input_shape)` directly, or call the model on actual data using `Model()`, `Model.fit()`, or `Model.predict()`.
If you have a custom training step, please make sure to invoke the forward pass in train step through
`Model.__call__`, i.e. `model(inputs)`, as opposed to `model.call()`.
I am unsure how to override the Model.call() method. I would appreciate some help.
One way to achieve model saving in such cases is to override the save (or save_weights) method of the keras.Model class. In your case, first define the finetuning model inside the NNCLR class, and then override the save method so that it saves that inner model. FYI, this way you may also be able to use the ModelCheckpoint API.
from tensorflow import keras
from tensorflow.keras import layers

class NNCLR(keras.Model):
    def __init__(...):
        super().__init__()
        ...
        self.finetuning_model = keras.Sequential(
            [
                layers.Input(shape=input_shape),
                self.classification_augmenter,
                self.encoder,
                layers.Dense(10),
            ],
            name="finetuning_model",
        )
        ...

    def save(
        self, filepath, overwrite=True, include_optimizer=True,
        save_format=None, signatures=None, options=None
    ):
        # Delegate saving to the inner Sequential model
        self.finetuning_model.save(
            filepath=filepath,
            overwrite=overwrite,
            save_format=save_format,
            options=options,
            include_optimizer=include_optimizer,
            signatures=signatures
        )

model = NNCLR(...)
model.compile(...)
model.fit(...)
Next, you can do
import tensorflow as tf

model.save('finetune_model')  # SavedModel format
finetune_model = tf.keras.models.load_model('finetune_model', compile=False)

'''
NNCLR code example: Evaluate sections
"A popular way to evaluate a SSL method in computer vision or
for that fact any other pre-training method as such is to learn
a linear classifier on the frozen features of the trained backbone
model and evaluate the classifier on unseen images."
'''
# Freeze everything except the final Dense classifier
for layer in finetune_model.layers:
    if not isinstance(layer, layers.Dense):
        layer.trainable = False

finetune_model.summary()  # OK
finetune_model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")],
)
finetune_model.fit(...)

Saving/Restoring weights under different variable scopes in Tensorflow

I've been trying to research model/weight saving for a while, but I still can't fully grasp it. I feel what I'd like to do should be simple enough, but I've not found a solution.
The final goal is to do transfer learning with a collection of pretrained networks. I write my models/layers as classes, so class method(s) for saving and restoring the weights would be ideal.
Example:
If I have a graph, features > A > B > labels, where A and B are sub-networks, I'd like to save and/or restore weights for these sections. Say I already have trained weights for A, but the variable scope is now different: how would I restore the weights I trained for A in a different training session? At the end of training this new graph I'd like one directory for my new A weights, one directory for my new B weights, and one directory for the full graph (I can handle the full graph bit).
It's very possible I keep overlooking the solution, but model saving is so poorly documented.
Hope I've explained the scenario well.
You can do this with tf.train.init_from_checkpoint.
Define your model:
def model_fn():
    with tf.variable_scope('One'):
        layer = any_tf_layer
    with tf.variable_scope('Two'):
        layer = any_tf_layer
List the variable names stored in the checkpoint file:
ckpt_vars = [i[0] for i in tf.train.list_variables(ckpt_file)]
Then you can create an assignment map that loads only the variables defined in your model. You can also assign new names to restored variables.
assignment_map = {variable.op.name: variable for variable in tf.global_variables() if variable.op.name in ckpt_vars}
This line is placed before the session is created, or outside the model function when using the Estimator API:
tf.train.init_from_checkpoint(ckpt_file, assignment_map)
https://www.tensorflow.org/api_docs/python/tf/train/init_from_checkpoint
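When the scope name has changed between training sessions, the assignment map has to translate names rather than match them exactly. The sketch below builds such a mapping in plain Python, so the renaming logic can be checked without TensorFlow. The helper name remap_scope and all scope and variable names are hypothetical. It relies on tf.train.init_from_checkpoint accepting a dict whose keys are checkpoint variable names and whose values are current variable names (or variables).

```python
def remap_scope(ckpt_names, current_names, old_scope, new_scope):
    """Map checkpoint variable names under old_scope to current names under new_scope.

    Returns {checkpoint_name: current_name} for every variable that exists
    on both sides once the scope prefix has been swapped.
    """
    current = set(current_names)
    mapping = {}
    for name in ckpt_names:
        if not name.startswith(old_scope + "/"):
            continue  # variable belongs to a different sub-network
        renamed = new_scope + name[len(old_scope):]
        if renamed in current:
            mapping[name] = renamed
    return mapping


# Hypothetical names: sub-network A was trained under scope "A"
# and now lives under "model/A" in the new graph.
ckpt_names = ["A/dense/kernel", "A/dense/bias", "global_step"]
current_names = ["model/A/dense/kernel", "model/A/dense/bias", "model/B/dense/kernel"]

assignment_map = remap_scope(ckpt_names, current_names, "A", "model/A")
print(assignment_map)

# With TensorFlow 1.x available, the map would then be used as:
#   tf.train.init_from_checkpoint(ckpt_file, assignment_map)
```

If every variable under the old scope should be restored, init_from_checkpoint also accepts a scope-level entry such as {'A/': 'model/A/'}, which avoids listing variables one by one.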
You can also do it with tf.train.Saver.
First you need to know the names of the variables:
vars_dict = {}
for var_current in tf.global_variables():
    print(var_current)
    print(var_current.op.name)  # this gets only the name
for var_ckpt in tf.train.list_variables(ckpt):
    print(var_ckpt[0])  # this gets only the name
Once you know the exact names of all variables, you can assign whatever value you need, provided the variables have the same shape and dtype. So, to build the dictionary:
vars_dict[var_ckpt[0]] = tf.get_variable(var_current.op.name, shape)  # remember to specify shape; you can always get it from var_current
saver = tf.train.Saver(vars_dict)
Take a look at my other answer to a similar question:
How to restore pretrained checkpoint for current model in Tensorflow?

Using the same script for training and serving (Estimator + hub)

I want to use hub at training and serving, but I am getting a little confused how to do it on the same graph. Namely I have something like
def build_graph(..., mode, ...):
    tags_and_args = ...  # one for training, one for serving
    if mode == 'training':
        hub.create_module_spec(module_fn, tags_and_args=tags_and_args)
        module_output = hub.Module(...)
        hub.register_module_for_export(module_fn, tags_and_args=tags_and_args)
        loss, output = ...
    else:
        module_output = hub.Module(XXX)
Should I reload the module from disk, so that XXX is the path where I saved it before? Or is it somehow kept as a graph object in memory?
I will call my code as
estimator.train(...)
exporter = hub.LatestModuleExporter(...)
exporter.export(...)
estimator.export_savedmodel(...)  # for serving
You can use a hub.Module in the model_fn of an Estimator without ever exporting it. At the start of Estimator.train(), the module's variables will be initialized from their pre-trained values (much like other variables are initialized randomly). After that, the module's variables behave much like the other variables of your model - they are part of the model's checkpoint, and restored from there for evaluation, resumed training, or export to a SavedModel for serving, like any other variable.
Exporting a hub.Module is only needed if you want to make a new version of the module (with the weights updated by your training) available to yet another, separate Estimator.
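To make the answer concrete, here is a minimal sketch of a model_fn that instantiates the module the same way in every mode, with no export or reload step in between. It assumes TF 1.x with tensorflow_hub. The factory name make_model_fn, the "text" feature key, the dense classifier head, and the optimizer are illustrative assumptions, and module_handle stands for whatever module spec or path you already pass to hub.Module.

```python
def make_model_fn(module_handle, num_classes=2):
    """Build an Estimator model_fn that uses a hub.Module directly.

    Sketch for TF 1.x. module_handle, the "text" feature key and the
    classifier head are placeholders, not taken from the question.
    """
    def model_fn(features, labels, mode):
        import tensorflow as tf          # TF 1.x API assumed
        import tensorflow_hub as hub

        # Same call in every mode: the module's variables start from their
        # pre-trained values on the first train() call and are restored
        # from the Estimator checkpoint afterwards.
        module = hub.Module(module_handle, trainable=True)
        embeddings = module(features["text"])
        logits = tf.layers.dense(embeddings, num_classes)

        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(mode, predictions={"logits": logits})

        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
        train_op = tf.train.AdamOptimizer().minimize(
            loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

    return model_fn
```

The resulting function would be passed as tf.estimator.Estimator(model_fn=make_model_fn(...)), after which train() and export_savedmodel() can be called on the same Estimator.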

Keras Lambda CTC unable to get model to load

Hi, I have a model based on https://github.com/igormq/asr-study/tree/keras-2 that can just about be saved, but cannot be loaded (either as a full model or as JSON plus weights) because the loss isn't defined properly.
inputs = Input(name='inputs', shape=(None, num_features))
...
o = TimeDistributed(Dense(num_hiddens))(inputs)
# Output layer
outputs = TimeDistributed(Dense(num_classes))(o)
# Define placeholders
labels = Input(name='labels', shape=(None,), dtype='int32', sparse=True)
inputs_length = Input(name='inputs_length', shape=(None,), dtype='int32')
# Define a decoder
dec = Lambda(ctc_utils.decode, output_shape=ctc_utils.decode_output_shape,
             arguments={'is_greedy': True}, name='decoder')
y_pred = dec([outputs, inputs_length])
loss = ctc_utils.ctc_loss(outputs, labels, inputs_length)
model = Model(input=[inputs, labels, inputs_length], output=y_pred)
model.add_loss(loss)
opt = Adam(lr=args.lr, clipnorm=args.clipnorm)
# Compile with dummy loss
model.compile(optimizer=opt, loss=None, metrics=[metrics.ler])
This will compile and run (note it uses the add_loss function which isn't very well documented). It can even be convinced to save with a bit of work - as this post hints (https://github.com/fchollet/keras/issues/5179) you can make it save by forcing the graph to be complete. I did this by making a dummy lambda loss function to bring in the inputs that weren't fully part of the graph, now this appears to work.
# This captures all the dangling nodes, so the model will now save
def fake_ctc_loss(args):
    return tf.Variable(tf.zeros([1]), name="fakeloss")

fake_dummy_loss = Lambda(fake_ctc_loss, output_shape=(1,), name='ctc')([y_pred, labels, inputs_length])
We can add this to the model like so:
model = Model(input=[inputs, labels, inputs_length], output=[y_pred, fake_dummy_loss])
Now, when I try to load the model, it fails with an error saying that a loss function is missing (I guess this is because loss is set to None, despite add_loss being used).
Any help here is appreciated.
I faced a similar problem in a project of mine in which add_loss is used to manually add a custom loss function to my model. You can see my model here: Keras Loss Function with Additional Dynamic Parameter. As you found, loading the model with load_model fails, complaining about a missing loss function.
Anyway, my solution was to save and load the model's weights rather than the whole model. The Model class has a save_weights method, which is discussed here: https://keras.io/models/about-keras-models/ Likewise, there's a load_weights method. Using these methods, you should be able to save and load the model just fine. The downside is that you have to define the model upfront, and then load the weights. In my project that wasn't an issue and only involved a small refactor.
Hope that helps.
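The weights-only workflow described above can be sketched as two small helpers. They only rely on the save_weights and load_weights methods of the Keras Model class; the helper names, the build_model factory, and the file name in the usage comment are hypothetical.

```python
def save_weights_only(model, weights_path):
    """Persist only the weights; the architecture and loss are not stored."""
    model.save_weights(weights_path)


def load_into_fresh_model(build_model, weights_path):
    """Rebuild the architecture in code, then load the stored weights into it.

    build_model is a hypothetical zero-argument function that runs the same
    model-definition code (including the add_loss call) used for training.
    """
    model = build_model()
    model.load_weights(weights_path)
    return model


# Usage with Keras (the file name is an assumption):
#   save_weights_only(model, "ctc_model.h5")
#   model = load_into_fresh_model(build_model, "ctc_model.h5")
#   model.compile(optimizer=opt, loss=None)
```

Because the loss is re-attached by build_model rather than read from disk, load_model never has to resolve the missing loss function.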

How to keep lookup tables initialized for prediction (and not just training)?

I create a lookup table from tf.contrib.lookup, using the training data (as input). Then, I pass every input through that lookup table, before passing it through my model.
This works for training, but when it comes to online prediction from this same model, it raises the error:
Table not initialized
I'm using SavedModel to save the model. I run the prediction from this saved model.
How can I initialize this table so that it stays initialized? Or is there a better way to save the model so that the table is always initialized?
I think you would be better off using tf.tables_initializer() as the legacy_init_op.
tf.saved_model.main_op.main_op() adds local and global variable initialization ops in addition to table initialization.
When you load the saved model and it runs the legacy_init_op, that would reset your variables, which is not what you want.
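A minimal sketch of this suggestion, assuming TF 1.x: only the table initializer is registered as the init op, so loading the SavedModel initializes the lookup tables without touching restored variables. The function name add_serving_graph and its parameters are hypothetical; builder is expected to be a tf.saved_model.builder.SavedModelBuilder, and session a session on the prediction graph.

```python
def add_serving_graph(builder, session, signature_def):
    """Add the serving meta graph with table initialization as the init op.

    Sketch for TF 1.x. Call it while the prediction graph is the default
    graph, so that tf.tables_initializer() picks up that graph's tables.
    """
    import tensorflow as tf  # TF 1.x API assumed

    builder.add_meta_graph_and_variables(
        session,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                signature_def
        },
        # Only tables are (re)initialized at load time; variables keep
        # the values restored from the SavedModel.
        legacy_init_op=tf.tables_initializer(),
    )
```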
You can specify an "initialization" operation when you add a meta graph to your SavedModel bundle with tf.saved_model.builder.SavedModelBuilder.add_meta_graph, using the main_op or legacy_init_op kwarg. You can either use a single operation, or group together a number of operations with tf.group if you need more than one.
Note that in Cloud ML Engine you'll have to use the legacy_init_op. However, in future runtime_versions you will be able to use main_op (IIRC, starting with runtime_version == 1.2).
The saved_model module provides a built-in tf.saved_model.main_op.main_op to wrap up common initialization actions in a single op (local variable initialization and table initialization).
So in summary, code should look like this (adapted from this example):
import os
import tensorflow as tf

exporter = tf.saved_model.builder.SavedModelBuilder(
    os.path.join(job_dir, 'export', name))

# signature_def gets constructed here

with tf.Session(graph=prediction_graph) as session:
    # Need to be initialized before saved variables are restored
    session.run([tf.local_variables_initializer(), tf.tables_initializer()])
    # Restore the value of the saved variables
    saver.restore(session, latest)
    exporter.add_meta_graph_and_variables(
        session,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
        },
        # Relevant change to the linked example is here!
        legacy_init_op=tf.saved_model.main_op.main_op()
    )
NOTE: If you are using the high level libraries (such as tf.estimator) this should be the default, and if you need to specify additional initialization actions you can specify them as part of the tf.train.Scaffold object that you pass to your tf.estimator.EstimatorSpec in your model_fn.
