I want to load a TensorFlow model (checkpoint) and use it in a while loop.
Loading the model takes some time, so I want to do that before the while loop.
If I use:
import tensorflow as tf  # TensorFlow 1.x API

with tf.Graph().as_default():
    with tf.Session() as sess:
        print("loading checkpoint ...")
        saver = tf.train.import_meta_graph(str(modelpath / 'mfn.ckpt.meta'))
        saver.restore(sess, str(modelpath / 'mfn.ckpt'))
        while True:  # loop condition elided in the original
            ...
the problem is that the session is closed once the while loop ends.
I have now seen this post about a similar problem.
The answer there seemed to be to use TensorFlow Serving. Unfortunately, that requires the model to be in the SavedModel format. I do not have a SavedModel, only the checkpoints.
I tried saving the loaded checkpoint with tf.saved_model.builder.SavedModelBuilder()
but ran into some issues. I made a separate post about those issues here.
Is there another way of running a loaded model (as in the code above) outside of a session?
From your illustration code, I guess you're using TensorFlow version 1.x (with tf.Graph, tf.Session ...). Is this right?
So, about your question: "Is there another way of running a loaded model (as in the code above) outside of a session?",
I have a suggestion: have you ever tried to convert your code to TensorFlow version 2.x?
If you can do that, then:
You can easily save and load your TF model with tf.saved_model.save() and tf.saved_model.load().
You can then use the loaded model in a while loop.
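If porting to 2.x is not practical, the session from the question can also be kept open under 1.x by creating it without a with block and closing it explicitly once the loop is done. A minimal sketch, assuming TensorFlow 1.x (or the tf.compat.v1 API under 2.x); load_session and run_loop are hypothetical helper names, and the tensor names in the usage comment are placeholders, not taken from the question's model:

```python
def load_session(meta_path, ckpt_path):
    """Load a checkpoint once and return an open session; the caller closes it."""
    # Imported lazily so this module can be imported without TensorFlow.
    import tensorflow.compat.v1 as tf

    graph = tf.Graph()
    with graph.as_default():
        # No `with tf.Session()` here, so the session is not closed on exit.
        sess = tf.Session(graph=graph)
        saver = tf.train.import_meta_graph(meta_path)
        saver.restore(sess, ckpt_path)
    return sess


def run_loop(run_step, batches):
    """Call run_step once per batch and collect the results."""
    results = []
    for batch in batches:
        results.append(run_step(batch))
    return results


# Usage (tensor names "input:0" and "output:0" are hypothetical):
# sess = load_session("mfn.ckpt.meta", "mfn.ckpt")
# try:
#     run_loop(lambda b: sess.run("output:0", {"input:0": b}), batches)
# finally:
#     sess.close()
```

The slow loading step runs once in load_session; the loop only calls sess.run, and the session lives until sess.close() is called.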
I need to have a frozen graph (GraphDef file) while using TensorFlow 2.X.
That is because I use a tool which expects a frozen graph, but my training had to be done with TF2.X and Keras.
I tried many different ways to save my TF2 model. The variant with which I was able to get the most useful formats is the following:
sess = tf.compat.v1.Session()
saver = tf.compat.v1.train.Saver(var_list=cnn.trainable_variables)
save_path = saver.save(sess, os.path.join(CHKPT_DIR, CHKPT_FILE))
tf.compat.v1.train.write_graph(sess.graph_def, CHKPT_DIR, TRAIN_GRAPH, as_text=False)
That way I was able to get the following files:
float_model.ckpt.data-00000-of-00001
float_model.ckpt.index
checkpoint
training_model.pb
Of these files I need the *.ckpt and training_model.pb to freeze my model. However, when using the freeze_graph.sh (with TF1.X, different virtual environment), it throws the error
ValueError: No variables to save
This happens even though I pass the variables as a list via var_list=cnn.trainable_variables. cnn.trainable_variables is not empty either and seems to contain all the variables my model uses.
Thus, I tried using the following method, according to TF2.X standards (assuming cnn is my model):
cnn.save(CHKPT_PATH)
checkpoint = tf.train.Checkpoint(cnn)
save_path = checkpoint.save(CHKPT_PATH)
Here I get the following files:
float_model.ckpt-1.data-00000-of-00001
float_model.ckpt-1.index
checkpoint
floating_model.ckpt/keras_metadata.pb
floating_model.ckpt/saved_model.pb
floating_model.ckpt/assets
floating_model.ckpt/variables
But here is where I get confused. Is there some kind of frozen graph available already? Or is there some kind of equivalent in here? And if not, how to get it with TF2.X if possible? I found the sentence
The .save() method is already saving a *.pb ready for inference.
in this post. So the frozen graph is ready for inference, and thus one of these files must be equivalent to a frozen graph, right?
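As far as I know, saved_model.pb is not a frozen graph: it holds the graph definition, while the weights stay in the variables directory. One commonly used way to get a frozen GraphDef under TF 2.x is to trace the model and turn its variables into constants. A sketch under stated assumptions: a single-input tf.keras model, a hypothetical helper name freeze_keras_model, and a conversion function that lives in a non-public TensorFlow module, so its location may change between versions:

```python
def freeze_keras_model(model, out_dir, name="frozen_graph.pb"):
    """Write a frozen GraphDef for a single-input tf.keras model (sketch)."""
    # Lazy imports: both need TensorFlow 2.x to be installed.
    import tensorflow as tf
    from tensorflow.python.framework.convert_to_constants import (
        convert_variables_to_constants_v2,
    )

    # Trace the model into a concrete function with a fixed input signature.
    spec = tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype)
    concrete = tf.function(lambda x: model(x)).get_concrete_function(spec)

    # Replace every variable with a constant holding its current value.
    frozen = convert_variables_to_constants_v2(concrete)

    # Serialize the resulting graph as a binary .pb file; returns its path.
    return tf.io.write_graph(
        frozen.graph.as_graph_def(), out_dir, name, as_text=False
    )


# Usage, with the names from the question:
# freeze_keras_model(cnn, CHKPT_DIR)
```

Whether the downstream tool accepts the resulting file depends on the ops and node names it expects, so treat this as a starting point rather than a guaranteed fix.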
I am trying to build a TensorFlow model in which I use tf.py_func to implement part of the graph in ordinary Python code. The problem is that when I save the model to a .pb file, the file itself is very small and does not include the py_func:0 tensor. When I try to load and run the model from the .pb file I get this error: ValueError: callback pyfunc_0 is not found.
It works when I don't save and load as a .pb file.
Is anyone able to help? This is super important to me and has given me a couple of sleepless nights.
model_version = "465555564"
tensorboard = TensorBoard(log_dir='./logs', histogram_freq = 0, write_graph = True, write_images = False)
sess = tf.Session()
K.set_session(sess)
K.set_learning_phase(0)
def my_func(x):
    some_function
input = tf.placeholder(tf.float32)
y = tf.py_func(my_func, [input], tf.float32)
prediction_signature = tf.saved_model.signature_def_utils.predict_signature_def({"inputs": input}, {"prediction": y})
builder = saved_model_builder.SavedModelBuilder('./'+model_version)
legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
builder.add_meta_graph_and_variables(
    sess, [tag_constants.SERVING],
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature,
    },
    legacy_init_op=legacy_init_op)
builder.save()
There is a way to save TF models with tf.py_func, but you have to do it without using a SavedModel.
TF has 2 levels of model saving: checkpoints and SavedModels. See this answer for more details, but to quote it here:
A checkpoint contains the value of (some of the) variables in a TensorFlow model. It is created by a Saver. To use a checkpoint, you need to have a compatible TensorFlow Graph, whose Variables have the same names as the Variables in the checkpoint.
SavedModel is much more comprehensive: It contains a set of Graphs (MetaGraphs, in fact, saving collections and such), as well as a checkpoint which is supposed to be compatible with these Graphs, and any asset files that are needed to run the model (e.g. Vocabulary files). For each MetaGraph it contains, it also stores a set of signatures. Signatures define (named) input and output tensors.
The tf.py_func op cannot be saved with a SavedModel (noted on this page in the docs), which is what you tried to do here. There is a good reason for this. SavedModels are supposed to be totally independent of the original code, so that they can be loaded from any other language that can deserialize them. This allows the models to be loaded by things like ML Engine, which is probably written in C++ or something like that. The problem is that a SavedModel cannot serialize arbitrary Python code, so py_func is a no-go.
You can work around this by using checkpoints, as long as you are okay with staying in Python. You will not get the independence that SavedModels provide. You can save a checkpoint after training with a tf.train.Saver, and then in a new Session, re-build the whole graph and load it with that Saver. There is even a way to use that code in ML Engine, which used to be exclusively for SavedModels. You can use custom prediction routines to side-step the need for a SavedModel.
More info on saving/restoring models in the docs.
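The checkpoint workaround can be sketched roughly like this. The key point is that the graph, including the py_func op, is rebuilt by the same Python code before restoring, so the callback is registered again. This assumes TensorFlow 1.x (or tf.compat.v1); my_func is a stand-in for the question's function, and the variable w, the tensor names and the helper names are hypothetical:

```python
def my_func(x):
    # Arbitrary Python code; here it just doubles its input.
    return x + x


def build_graph(tf):
    """Build the same graph every time, so py_func is registered again."""
    x = tf.placeholder(tf.float32, name="x")
    # A Saver needs at least one variable; w stands in for model weights.
    w = tf.get_variable("w", shape=[], initializer=tf.ones_initializer())
    y = tf.py_func(my_func, [x * w], tf.float32, name="y")
    return x, y


def save_checkpoint(path):
    """Initialize the variables and write a checkpoint; returns its path."""
    import tensorflow.compat.v1 as tf  # lazy import, needs TensorFlow

    with tf.Graph().as_default():
        build_graph(tf)
        saver = tf.train.Saver()
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            return saver.save(sess, path)


def restore_and_run(path, value):
    """Rebuild the graph in a fresh session, restore, and evaluate y."""
    import tensorflow.compat.v1 as tf  # lazy import, needs TensorFlow

    with tf.Graph().as_default():
        x, y = build_graph(tf)
        saver = tf.train.Saver()
        with tf.Session() as sess:
            saver.restore(sess, path)
            return sess.run(y, feed_dict={x: value})
```

Only the variable values travel through the checkpoint; the Python function itself still has to be importable wherever restore_and_run is called.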
I am using Tensorflow + Python.
I am curious if I can release a saved Tensorflow model (architecture + trained variables) without detailed source code. I'm aware of tf.train.Saver(), but it looks to save only variables, and in order to restore/run them, a user needs to "define" the same architecture.
For the testing/running purpose only, is there a way to release a saved {architecture+trained variables} without source code, so that a user can just cast a query and get a result?
The TensorFlow Serving project is intended to make this use case straightforward (assuming that the end user is only using the model for inference, not training). TensorFlow Serving includes an Exporter class that takes your tf.train.Saver, the tf.GraphDef that defines your overall model, and a "signature" that describes the inputs to and output from your model.
The basics tutorial has a good introduction to exporting your model.
You can build a Saver from the MetaGraphDef (saved with checkpoints by default: those .meta files) and then use that Saver to restore your model, so users don't have to redefine your graph in their code. But they still need to figure out the model signature (input and output tensors). I solve this with graph collections (tf.add_to_collection / tf.get_collection), but I am interested in finding better ways to do it as well.
You can take a look at my example implementation (eval.py evaluates a model without redefining it):
reconstruct saver from meta graph https://github.com/falcondai/cifar10/blob/master/eval.py#L18
get input variables from collections https://github.com/falcondai/cifar10/blob/master/eval.py#L58
how to define your model https://github.com/falcondai/cifar10/blob/master/models/cp2f3d.py
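The collection approach can be sketched as follows. This assumes TensorFlow 1.x (or tf.compat.v1); the collection keys "inputs" and "outputs" and the function names are hypothetical and not taken from the linked repository:

```python
INPUT_KEY = "inputs"    # hypothetical collection key for input tensors
OUTPUT_KEY = "outputs"  # hypothetical collection key for output tensors


def export_model(x, y, sess, ckpt_path):
    """Author side: tag the input/output tensors, then save the checkpoint."""
    import tensorflow.compat.v1 as tf  # lazy import, needs TensorFlow

    with x.graph.as_default():
        tf.add_to_collection(INPUT_KEY, x)
        tf.add_to_collection(OUTPUT_KEY, y)
        # save() also writes ckpt_path + ".meta" (the MetaGraphDef) by default.
        return tf.train.Saver().save(sess, ckpt_path)


def load_and_predict(ckpt_path, value):
    """User side: restore graph and weights without the model source code."""
    import tensorflow.compat.v1 as tf  # lazy import, needs TensorFlow

    with tf.Graph().as_default():
        # Rebuilds the graph and returns a Saver for its variables.
        saver = tf.train.import_meta_graph(ckpt_path + ".meta")
        # Collections are stored in the MetaGraphDef, so the tags survive.
        x = tf.get_collection(INPUT_KEY)[0]
        y = tf.get_collection(OUTPUT_KEY)[0]
        with tf.Session() as sess:
            saver.restore(sess, ckpt_path)
            return sess.run(y, feed_dict={x: value})
```

The user only needs the checkpoint files, the .meta file and the two collection keys, which serve as a lightweight stand-in for a signature.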
I want to train a cnn for 20000 steps. In the 100th step I want to save all variables and after that I want to re-run my code restoring model and starting from the 100th step. I am trying to make it work with tensorflow documentation: https://www.tensorflow.org/versions/r0.10/how_tos/variables/index.html but I can't. Any help?
I'm stuck on something similar, but maybe this link can help you. I'm new to TensorFlow, but I think you can't restore and fit without training your model again.
This functionality is still unstable, and the documentation is outdated and therefore confusing. What worked for me (a suggestion from people at Google who work directly on TensorFlow) was to pass the model_dir parameter to the constructor of my model before training; it tells TensorFlow where to store the model. After training, you just instantiate a model again with the same model_dir, and it restores the model from the files and checkpoints that were generated.