Create hub module from existing checkpoint - python

Is it possible to create a hub module from existing checkpoints without changing the training code?

Yes, absolutely. You need a session with (1) a Module and (2) the proper values in its variables. It doesn't matter if those come from actual training or merely restoring a checkpoint. Given a Python library for model building that knows nothing about TensorFlow Hub, you can have a tool on the side for export to a Hub Module that looks like:
import tensorflow as tf
import tensorflow_hub as hub
from your_library import build_model_body

def module_fn():
    inputs = tf.placeholder(...)
    logits = build_model_body(inputs)
    hub.add_signature(inputs=inputs, outputs=logits)
def main(_):
    spec = hub.create_module_spec(module_fn)
    # Supply a checkpoint trained on a model from the same Python code.
    checkpoint_path = "..."
    # Output will be written here:
    export_path = "..."
    with tf.Graph().as_default():
        module = hub.Module(spec)
        init_fn = tf.contrib.framework.assign_from_checkpoint_fn(
            checkpoint_path, module.variable_map)
        with tf.Session() as session:
            init_fn(session)
            module.export(export_path, session=session)
Fine points to note:
build_model_body() should transform inputs to outputs (say, pixels to feature vectors) as suitable for a Hub module, but should not include data reading, losses, or optimizers. For transfer learning, these are best left to the consumer of the module. Some refactoring may be required.
Supplying module.variable_map is essential: it translates from the plain variable names created by running build_model_body() on its own to the variable names created by instantiating the Module, which live in scope module/state.
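For completeness, a consumer could then load the exported module roughly like this (a minimal sketch in the same TF1-style hub API; export_path and the input placeholder are carried over as placeholders from the snippet above):

import tensorflow as tf
import tensorflow_hub as hub

with tf.Graph().as_default():
    module = hub.Module(export_path)   # the directory written by module.export()
    inputs = tf.placeholder(...)       # consumer-side input tensor
    outputs = module(inputs)           # applies the default signature added above
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        session.run(tf.tables_initializer())
        # session.run(outputs, feed_dict={inputs: ...})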


Unable to restore a layer of class TextVectorization - Text Classification

System information
Google Colab
When I run the example provided by the official TensorFlow basic text classification tutorial, everything runs fine up to saving the model, but when I load the model it gives me this error:
RuntimeError: Unable to restore a layer of class TextVectorization. Layers of class TextVectorization require that the class be provided to the model loading code, either by registering the class using @keras.utils.register_keras_serializable on the class def and including that file in your program, or by passing the class in a keras.utils.CustomObjectScope that wraps this load call.
Expected Behavior: Model should be loaded successfully and process the raw input
https://colab.research.google.com/gist/amahendrakar/8b65a688dc87ce9ca07ffb0ce50b84c7/44199.ipynb#scrollTo=fEjmSrKIqiiM
Example Link: https://tensorflow.google.cn/tutorials/keras/text_classification
I also ran into this error message (RuntimeError: Unable to restore a layer of class TextVectorization. [...]) when I implemented (and customized) the code from the "Basic Text Classification" tutorial.
Instead of running the code in a notebook, I have two scripts, one for building, training and saving the model and the other one for loading it and making predictions. (Thus, the error does not seem to be limited to Google Colab).
This is what I had to do (see https://github.com/tensorflow/tensorflow/issues/45231):
First, I added this line in the first script before the function definition and built, trained and saved the model again:
@tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
    [...]

# Save model as SavedModel
export_model.save(model_path, save_format='tf')
Secondly, I also had to add the same line and the whole function definition to the second script, so that it still works if I restart(!) ipython (where I currently run the scripts) and only run the second script:
import re
import string

import tensorflow as tf

@tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
    lowercase = tf.strings.lower(input_data)
    stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
    return tf.strings.regex_replace(stripped_html,
                                    '[%s]' % re.escape(string.punctuation),
                                    '')

[...]

# Load model
reloaded_model = tf.keras.models.load_model(model_path)

# Make predictions
predictions = reloaded_model.predict(examples)
Note: If I run the second script without restarting ipython after running the first script, I get this error:
ValueError: Custom>custom_standardization has already been registered [...]
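The error message also points to a second route: instead of registering the function, it can be supplied at load time through a custom object scope. A minimal sketch, assuming custom_standardization is defined or imported in the loading script:

import tensorflow as tf

# Wrap the load call so Keras can resolve the custom standardization function.
with tf.keras.utils.custom_object_scope(
        {'custom_standardization': custom_standardization}):
    reloaded_model = tf.keras.models.load_model(model_path)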
Alternatively, you can just use the default standardization method in the vectorizer layer when building the model:
vectorize_layer = TextVectorization(
    standardize="lower_and_strip_punctuation",
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=sequence_length)
I got something working as Hassan describes, I think. Not sure it's the right way, but it seems to work for me...
I define, train, and archive the model in one notebook
I un-archive it, load it, and use it for predictions from another notebook.
See here: https://github.com/OlivierLD/oliv-ai/tree/master/JupyterNotebooks/tf-tutorials/sentiment-analysis

Tensorflow Dataset .map() API

A couple of questions about this.
For occasions when I'd like to do something like the following in Tensorflow (assume I'm creating training examples by loading WAV files):
import tensorflow as tf
from tensorflow.contrib.framework.python.ops import audio_ops as contrib_audio
from tensorflow.python.ops import io_ops

def _some_audio_preprocessing_func(filename):
    # ... some logic here which mostly uses Tensorflow ops ...
    with tf.Session(graph=tf.Graph()) as sess:
        wav_filename_placeholder = tf.placeholder(tf.string, [])
        wav_loader = io_ops.read_file(wav_filename_placeholder)
        wav_decoder = contrib_audio.decode_wav(wav_loader, desired_channels=1)
        data = sess.run(
            [wav_decoder],
            feed_dict={wav_filename_placeholder: filename})
        return data

dataset = tf.data.Dataset.list_files('*.wav')
dataset = dataset.map(_some_audio_preprocessing_func)
If I have a parse_image() function that uses tensor ops, should this be part of the main graph? Following the example set in Google's own audio TF tutorial, it looks like they create a separate graph! Doesn't this defeat the point of using TensorFlow to make things faster?
Do I use tf.py_func() any time any single line isn't from the tensorflow library? Again, I wonder what the performance implications are and when I should use this...
Thanks!
When you use Dataset.map(map_func), TensorFlow defines a subgraph for all the ops created in the function map_func, and arranges to execute it efficiently in the same session as the rest of your graph. There is almost never any need to create a tf.Graph or tf.Session inside map_func: if your parsing function is made up of TensorFlow ops, these ops can be embedded directly in the graph that defines the input pipeline.
The modified version of the code using tf.data would look like this:
import tensorflow as tf
from tensorflow.contrib.framework.python.ops import audio_ops as contrib_audio

def _some_audio_preprocessing_func(filename):
    wav_loader = tf.read_file(filename)
    return contrib_audio.decode_wav(wav_loader, desired_channels=1)

dataset = tf.data.Dataset.list_files('*.wav')
dataset = dataset.map(_some_audio_preprocessing_func)
If your map_func contains non-TensorFlow operations that you want to apply to each element, you should wrap them in a tf.py_func() (or Dataset.from_generator(), if the data generation process is defined in Python logic). The main performance implication is that any code running in a tf.py_func() is subject to the Global Interpreter Lock, so I would generally recommend trying to find a native TensorFlow implementation for anything that is performance critical.
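For illustration, here is a minimal TF1-style sketch of wrapping plain Python logic in tf.py_func() inside a map function; the decode function and shapes below are made up for the example:

import numpy as np
import tensorflow as tf

def _python_only_decode(filename):
    # Arbitrary Python/NumPy logic; runs under the Global Interpreter Lock.
    return np.zeros((16000, 1), dtype=np.float32)

def _map_fn(filename):
    audio = tf.py_func(_python_only_decode, [filename], tf.float32)
    audio.set_shape([16000, 1])  # tf.py_func drops static shape information
    return audio

dataset = tf.data.Dataset.list_files('*.wav').map(_map_fn)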

TensorFlow Eager Mode: How to restore a model from a checkpoint?

I've trained a CNN model in TensorFlow eager mode. Now I'm trying to restore the trained model from a checkpoint file, but so far without any success.
All the examples I've found (as shown below) are talking about restoring a checkpoint into a Session. But what I need is to restore the model in eager mode, i.e. without creating a session.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
Basically what I need is something like:
tfe.enable_eager_execution()
model = tfe.restore('model.ckpt')
model.predict(...)
and then I can use the model to make predictions.
Can someone please help?
Update
The example code can be found at: mnist eager mode demo
I've tried to follow the steps from @Jay Shah's answer and it almost worked, but the restored model doesn't have any variables in it.
tfe.save_network_checkpoint(model, './test/my_model.ckpt')
Out[58]: './test/my_model.ckpt-1720'

model2 = MNISTModel()
tfe.restore_network_checkpoint(model2, './test/my_model.ckpt-1720')
model2.variables
Out[72]: []
The original model has lots of variables in it:
model.variables
[<tf.Variable 'mnist_model_1/conv2d/kernel:0' shape=(5, 5, 1, 32) dtype=float32, numpy=
array([[[[ -8.25184360e-02, 6.77833706e-03, 6.97569922e-02,...
Eager execution is still a new feature in TensorFlow and was only recently included, so not all features are supported, but fortunately loading a model from a saved checkpoint is.
You'll need to use the tfe.Saver class (which is a thin wrapper over the tf.train.Saver class), and your code should look something like this:
saver = tfe.Saver([x, y])
saver.restore('/tmp/ckpt')
Here [x, y] represents the list of variables and/or models you wish to restore. It should precisely match the list of variables that was passed to the saver which originally created the checkpoint.
More details, including sample code, can be found here, and the API details of the saver can be found here.
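For context, a minimal sketch of the matching save side (the variables x and y are purely illustrative):

import tensorflow as tf
import tensorflow.contrib.eager as tfe

tfe.enable_eager_execution()

x = tfe.Variable(10.0, name='x')
y = tfe.Variable(5.0, name='y')

tfe.Saver([x, y]).save('/tmp/ckpt')
# Later, after recreating the same variables:
# tfe.Saver([x, y]).restore('/tmp/ckpt')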
Ok, after spending a few hours running the code in line-by-line mode, I've figured out a way to restore a checkpoint to a new TensorFlow Eager Mode model.
Using the examples from TF Eager Mode MNIST
Steps:
After your model has been trained, find the latest checkpoint (or the checkpoint you want) index file in the checkpoint folder created during training, such as 'ckpt-25800.index'. Use only the filename 'ckpt-25800' when restoring in step 5.
Start a new python terminal and enable TensorFlow Eager mode by running:
tfe.enable_eager_execution()
Create a new instance of the MNISTModel:
model_new = MNISTModel()
Initialise the variables for model_new by running a dummy training pass once. (This step is important: without initialising the variables first, they can't be restored in the following step. However, I can't find another way to initialise variables in Eager mode other than what I did below.)
model_new(tfe.Variable(np.zeros((1,784),dtype=np.float32)), training=True)
Restore the variables to model_new using the checkpoint identified in step 1.
tfe.Saver((model_new.variables)).restore('./tf_checkpoints/ckpt-25800')
If the restore process is successful, you should see something like:
INFO:tensorflow:Restoring parameters from ./tf_checkpoints/ckpt-25800
Now the checkpoint has been successfully restored to model_new and you can use it to make predictions on new data.
I'd like to share the TFLearn library, which is a deep learning library featuring a higher-level API for TensorFlow. With it you can easily save and restore a model.
Saving a model
model = tflearn.DNN(net) #Here 'net' is your designed network model.
#This is a sample example for training the model
model.fit(train_x, train_y, n_epoch=10, validation_set=(test_x, test_y), batch_size=10, show_metric=True)
model.save("model_name.ckpt")
Restore a model
model = tflearn.DNN(net)
model.load("model_name.ckpt")
For more TFLearn examples you can check sites like:
My first CNN in TFLearn.
Github Link
First, you save your model to a checkpoint by doing the following:
saver.save(sess, './my_model.ckpt')
The above line saves your session to the "my_model.ckpt" checkpoint. The following code restores the model:
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, './my_model.ckpt')
When you restore the session this way, your model is restored from the checkpoint.
To save in eager mode:
tf.contrib.eager.save_network_checkpoint(sess, './my_model.ckpt')
To restore in eager mode:
tf.contrib.eager.restore_network_checkpoint(sess, './my_model.ckpt')
Here sess is an object of class Network. Any object of class Network can be saved and restored. A quick explanation of Network objects:
class TwoLayerNetwork(tfe.Network):
    def __init__(self, name):
        super(TwoLayerNetwork, self).__init__(name=name)
        self.layer_one = self.track_layer(tf.layers.Dense(16, input_shape=(8,)))
        self.layer_two = self.track_layer(tf.layers.Dense(1, input_shape=(16,)))

    def call(self, inputs):
        return self.layer_two(self.layer_one(inputs))
After constructing an object and calling the Network, a list of variables created by tracked Layers is available via Network.variables:
sess = TwoLayerNetwork(name="net")  # sess is an object of Network
output = sess(tf.ones([1, 8]))
print([v.name for v in sess.variables])

This example prints variable names, one kernel and one bias per `tf.layers.Dense` layer:

['net/dense/kernel:0',
 'net/dense/bias:0',
 'net/dense_1/kernel:0',
 'net/dense_1/bias:0']

These variables can be passed to a `Saver` (`tf.train.Saver`, or `tf.contrib.eager.Saver` when executing eagerly) to save or restore the `Network`.
tfe.save_network_checkpoint(sess,'./my_model.ckpt') # saving the model
tfe.restore_network_checkpoint(sess,'./my_model.ckpt') # restoring
Saving variables with tfe.Saver().save():
for epoch in range(epochs):
    train_and_optimize()

all_variables = model.variables + optimizer.variables()
# save the variables
tfe.Saver(all_variables).save(checkpoint_prefix)
And then reload the saved variables with tfe.Saver().restore():
tfe.Saver((model.variables + optimizer.variables())).restore(checkpoint_prefix)
Then the model is loaded with the saved variables, and there is no need to create a new one as in @Stefan Falk's answer.

Using 'read_batch_record_features' with an Estimator

(I'm using tensorflow 1.0 and Python 2.7)
I'm having trouble getting an Estimator to work with queues. If I use the deprecated SKCompat interface with custom data files and a given batch size, the model trains properly. I'm trying to use the new interface with an input_fn that batches features out of TFRecord files (equivalent to my custom data files). The script runs properly, but the loss value doesn't change after 200 or 300 steps. It seems that the model is looping on a small input batch (which would explain why the loss converges so fast).
I have a 'run.py' script that looks like the following:
import tensorflow as tf
from tensorflow.contrib import learn, metrics

# [...]

evalMetrics = {'accuracy': learn.MetricSpec(metric_fn=metrics.streaming_accuracy)}
runConfig = learn.RunConfig(save_summary_steps=10)

estimator = learn.Estimator(model_fn=myModel,
                            params=myParams,
                            model_dir='/tmp/myDir',
                            config=runConfig)

session = tf.Session(graph=tf.get_default_graph())
with session.as_default():
    tf.global_variables_initializer()
    coordinator = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=session, coord=coordinator)

    estimator.fit(input_fn=lambda: inputToModel(trainingFileList), steps=10000)
    estimator.evaluate(input_fn=lambda: inputToModel(evalFileList),
                       steps=10000, metrics=evalMetrics)

    coordinator.request_stop()
    coordinator.join(threads)

session.close()
My inputToModel function looks like this:
import tensorflow as tf

def inputToModel(fileList):
    features = {'rawData': tf.FixedLenFeature([100], tf.float32),
                'label': tf.FixedLenFeature([], tf.int64)}
    tensorDict = tf.contrib.learn.read_batch_record_features(fileList,
                                                             batch_size=100,
                                                             features=features,
                                                             randomize_input=True,
                                                             reader_num_threads=4,
                                                             num_epochs=1,
                                                             name='inputPipeline')
    tf.local_variables_initializer()
    data = tensorDict['rawData']
    labelTensor = tensorDict['label']
    inputTensor = tf.reshape(data, [-1, 10, 10, 1])
    return inputTensor, labelTensor
Any help or suggestions are welcome!
Try to use: tf.global_variables_initializer().run()
I want to do a similar thing, but I do not know how to use the Estimator API with multi-threading. There is also an Experiment class for serving, which might be useful.
Delete the line session = tf.Session(graph=tf.get_default_graph()) and the session.close() call, and try:
with tf.Session() as sess:
    tf.global_variables_initializer().run()
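Putting both suggestions together, the question's run.py would look roughly like this (a sketch only, reusing the hypothetical names myModel, myParams, inputToModel and the file lists from the question):

import tensorflow as tf
from tensorflow.contrib import learn, metrics

evalMetrics = {'accuracy': learn.MetricSpec(metric_fn=metrics.streaming_accuracy)}
runConfig = learn.RunConfig(save_summary_steps=10)
estimator = learn.Estimator(model_fn=myModel,
                            params=myParams,
                            model_dir='/tmp/myDir',
                            config=runConfig)

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    coordinator = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coordinator)

    estimator.fit(input_fn=lambda: inputToModel(trainingFileList), steps=10000)
    estimator.evaluate(input_fn=lambda: inputToModel(evalFileList),
                       steps=10000, metrics=evalMetrics)

    coordinator.request_stop()
    coordinator.join(threads)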

Tensorflow: How to give variables scope

I have to first pretrain a network before training it. I do this using code in separate files with their own sessions, but the variables from the first session are still getting carried over and causing problems (as I'm running both these files within one 'main' file).
I could get around this problem by simply running my pretrain file which saves the trained layers and then running my training file which loads the saved layers in. But it would be nice to be able to do these two things in one step. How can I 'break the link' and avoid unwanted variables having a global scope?
The 'main' file looks something like this:
from util import pretrain_nn
from NN import Network
shape = [...]
layer_save_file = ''
data = get_data()
# Trains and saves layers
pretrain_nn(shape, data, layer_save_file)
# If I were to print all variables (using tf.all_variables)
# variables only used in pretrain_nn show up
# (the printing would be done inside `Network`)
NN = Network(shape, pretrain=True, layer_save_file=layer_save_file)
NN.train(data)
# Doesn't work because apparently some variables haven't been initialized.
NN.save()
The variables' lifetime is implicitly tied to the TensorFlow graph, and by default both of your computations will be added to the same (global) graph. You can scope them appropriately using with tf.Graph().as_default(): blocks around each of the subcomputations:
with tf.Graph().as_default():
    # Trains and saves layers
    pretrain_nn(shape, data, layer_save_file)

with tf.Graph().as_default():
    NN = Network(shape, pretrain=True, layer_save_file=layer_save_file)
    NN.train(data)
    NN.save()
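As a small illustration of why this works (hypothetical variable name, using the same tf.all_variables the question mentions), variables created under an explicit Graph do not show up in the global default graph:

import tensorflow as tf

with tf.Graph().as_default():
    pretrain_var = tf.Variable(0, name='pretrain_only')

# The global default graph does not contain 'pretrain_only'.
print([v.name for v in tf.all_variables()])  # -> []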
