I am trying to build a TensorFlow model where I use tf.py_func to implement part of the computation in ordinary Python code. The problem is that when I save the model to a .pb file, the .pb file itself is very small and does not include the py_func:0 tensor. When I try to load and run the model from the .pb file, I get this error: ValueError: callback pyfunc_0 is not found.
It works when I don't save and load as a .pb file.
Is anyone able to help? This is super important to me and has given me a couple of sleepless nights.
# imports assumed by this snippet (not shown in the question)
import tensorflow as tf
from keras import backend as K
from keras.callbacks import TensorBoard
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants, tag_constants

model_version = "465555564"
tensorboard = TensorBoard(log_dir='./logs', histogram_freq=0, write_graph=True, write_images=False)
sess = tf.Session()
K.set_session(sess)
K.set_learning_phase(0)

def my_func(x):
    some_function  # the actual Python computation is elided in the question

input = tf.placeholder(tf.float32)
y = tf.py_func(my_func, [input], tf.float32)

prediction_signature = tf.saved_model.signature_def_utils.predict_signature_def(
    {"inputs": input}, {"prediction": y})
builder = saved_model_builder.SavedModelBuilder('./' + model_version)
legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
builder.add_meta_graph_and_variables(
    sess, [tag_constants.SERVING],
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature,
    },
    legacy_init_op=legacy_init_op)
builder.save()
There is a way to save TF models with tf.py_func, but you have to do it without using a SavedModel.
TF has 2 levels of model saving: checkpoints and SavedModels. See this answer for more details, but to quote it here:
A checkpoint contains the value of (some of the) variables in a TensorFlow model. It is created by a Saver. To use a checkpoint, you need to have a compatible TensorFlow Graph, whose Variables have the same names as the Variables in the checkpoint.
SavedModel is much more comprehensive: It contains a set of Graphs (MetaGraphs, in fact, saving collections and such), as well as a checkpoint which is supposed to be compatible with these Graphs, and any asset files that are needed to run the model (e.g. Vocabulary files). For each MetaGraph it contains, it also stores a set of signatures. Signatures define (named) input and output tensors.
The tf.py_func op cannot be saved with a SavedModel (noted on this page in the docs), which is what you tried to do here. There is a good reason for this. SavedModels are supposed to be totally independent from the original code, able to be loaded in any other language that can deserialize it. This allows the models to be loaded by things like ML Engine, which is probably written in C++ or something like that. The problem is that it cannot serialize arbitrary Python code, so py_func is a no-go.
You can work around this by using checkpoints, as long as you are okay with staying in Python. You will not get the independence that SavedModels provide. You can save a checkpoint after training with a tf.train.Saver, and then in a new Session, re-build the whole graph and load it with that Saver. There is even a way to use that code in ML Engine, which used to be exclusively for SavedModels. You can use custom prediction routines to side-step the need for a SavedModel.
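For concreteness, here is a minimal sketch of that round trip. The graph, variable, and checkpoint path are hypothetical; the py_func survives because the serving program re-executes the Python code that defines it:

import numpy as np
import tensorflow as tf

def build_graph():
    # hypothetical graph: a py_func op plus one trainable variable
    x = tf.placeholder(tf.float32, name="x")
    doubled = tf.py_func(lambda v: (v * 2.0).astype(np.float32), [x], tf.float32)
    w = tf.Variable(1.0, name="w")
    return x, doubled * w

# training program: the checkpoint stores only the variable values
x, out = build_graph()
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "./model.ckpt")

# serving program: re-run the SAME graph-building code, then restore
tf.reset_default_graph()
x, out = build_graph()
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "./model.ckpt")
    print(sess.run(out, feed_dict={x: 21.0}))  # 42.0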
More info on saving/restoring models in the docs.
I need to have a frozen graph (GraphDef file) while using TensorFlow 2.X.
That is because I use a tool which expects a frozen graph, however, my training needed to be done on TF2.X and Keras.
I tried many different ways to save my TF2 model. The variant with which I was able to get the most useful formats is the following:
import os
import tensorflow as tf

# cnn is my trained Keras model; CHKPT_DIR, CHKPT_FILE and TRAIN_GRAPH are defined elsewhere
sess = tf.compat.v1.Session()
saver = tf.compat.v1.train.Saver(var_list=cnn.trainable_variables)
save_path = saver.save(sess, os.path.join(CHKPT_DIR, CHKPT_FILE))
tf.compat.v1.train.write_graph(sess.graph_def, CHKPT_DIR, TRAIN_GRAPH, as_text=False)
That way I was able to get the following files:
float_model.ckpt.data-00000-of-00001
float_model.ckpt.index
checkpoint
training_model.pb
Of these files I need the *.ckpt files and training_model.pb to freeze my model. However, when using freeze_graph.sh (with TF1.X, in a different virtual environment), it throws the error
ValueError: No variables to save
This happens even though I pass the variables as a list via var_list=cnn.trainable_variables, and cnn.trainable_variables is not empty and appears to contain all the variables used in my model.
Thus, I tried using the following method, according to TF2.X standards (assuming cnn is my model):
cnn.save(CHKPT_PATH)
checkpoint = tf.train.Checkpoint(cnn)
save_path = checkpoint.save(CHKPT_PATH)
Here I get the following files:
float_model.ckpt-1.data-00000-of-00001
float_model.ckpt-1.index
checkpoint
floating_model.ckpt/keras_metadata.pb
floating_model.ckpt/saved_model.pb
floating_model.ckpt/assets
floating_model.ckpt/variables
But here is where I get confused. Is there some kind of frozen graph already available among these files? Or is there some equivalent here? And if not, how do I get one with TF2.X, if that is possible? I found the sentence
The .save() method is already saving a *.pb ready for inference.
in this post. So that .pb is ready for inference, and thus one of these files must be equivalent to a frozen graph, right?
I am working on Object Detection using Tensorflow 2 API in Python. This works great so far. However, if I want to save the model, I am using exporter_main_v2.py which exports a graph (.pb) and a checkpoint (checkpoint, ckpt-0.data, ckpt-0.index). The graph does not include any weights, I always have to use the checkpoint to work with the saved model.
Is there any way to save all weights into the Protobuf (.pb) file?
Here's what I've tried:
Save frozen model: TF2 obviously does not support frozen graphs any more. export_inference_graph.py, which would freeze the graph including all weights, does not work under TF2.
Same goes for freeze_graph.py: it is only usable with TF1.
You can still use the freezing technique from TF1 in TF2, using the compat.v1 module:
In the following snippet, I assume that you have a pretrained model with weights saved in the TF2 fashion, with tf.saved_model.save.
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    sess = tf.compat.v1.Session()
    with sess.as_default():
        # create the model / load it from a TF2 pb file
        # (if you have a keras model, you can use
        # `tf.keras.models.load_model` instead)
        model = tf.saved_model.load("/path/to/model")
        # the default signature might be different
        sign = model.signatures["serving_default"]
        # if using keras, just use model.outputs
        tensor_out_names = [out.name.split(":")[0] for out in sign.outputs]
        graphdef = tf.compat.v1.graph_util.convert_variables_to_constants(
            sess, graph.as_graph_def(), tensor_out_names
        )
        # the following is optional; use it only if no more training is required
        graphdef = tf.compat.v1.graph_util.remove_training_nodes(graphdef)
        # tf.io.write_graph is the public entry point for graph_io.write_graph
        tf.io.write_graph(graphdef, "./", "/path/to/frozengraph", as_text=False)
However, I would refrain from doing this except for compatibility with an old tool. The compat module might be deprecated one day, and as far as I can tell, there is not much value in having a single file containing the graph plus the weights rather than keeping them split.
I am using Tensorflow + Python.
I am curious whether I can release a saved TensorFlow model (architecture + trained variables) without detailed source code. I'm aware of tf.train.Saver(), but it appears to save only variables, and in order to restore/run them, a user needs to "define" the same architecture.
For the testing/running purpose only, is there a way to release a saved {architecture+trained variables} without source code, so that a user can just cast a query and get a result?
The TensorFlow Serving project is intended to make this use case straightforward (assuming that the end user is only using the model for inference, not training). TensorFlow Serving includes an Exporter class that takes your tf.train.Saver, the tf.GraphDef that defines your overall model, and a "signature" that describes the inputs to and outputs from your model.
The basics tutorial has a good introduction to exporting your model.
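As a rough sketch of what that export looks like, based on the old session_bundle Exporter from the Serving basics tutorial (the tiny stand-in model is hypothetical, and the helper names may differ between Serving versions):

import tensorflow as tf
from tensorflow_serving.session_bundle import exporter

# a trivial stand-in model; replace with your own graph and trained session
x = tf.placeholder(tf.float32, shape=[None, 4], name="x")
w = tf.Variable(tf.zeros([4, 2]))
y = tf.nn.softmax(tf.matmul(x, w), name="y")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver(sharded=True)
    model_exporter = exporter.Exporter(saver)
    signature = exporter.classification_signature(input_tensor=x, scores_tensor=y)
    model_exporter.init(sess.graph.as_graph_def(), default_graph_signature=signature)
    model_exporter.export("/tmp/export", tf.constant(1), sess)  # model version 1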
You can build a Saver from the MetaGraphDef (saved alongside checkpoints by default: those .meta files) and then use that Saver to restore your model, so users don't have to re-define your graph in their code. But they still need to figure out the model signature (input/output variables). I solve this using tf.Collection (but I am interested in finding better ways to do it as well); see the sketch after the links below.
You can take a look at my example implementation (eval.py evaluates a model without re-defining it):
reconstruct saver from meta graph https://github.com/falcondai/cifar10/blob/master/eval.py#L18
get input variables from collections https://github.com/falcondai/cifar10/blob/master/eval.py#L58
how to define your model https://github.com/falcondai/cifar10/blob/master/models/cp2f3d.py
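Here is a minimal sketch of that collection-based pattern (the tiny model, collection keys, and paths are hypothetical):

import numpy as np
import tensorflow as tf

# exporting side: tag the signature tensors in named collections before saving
x = tf.placeholder(tf.float32, shape=[None, 3], name="x")
y = tf.layers.dense(x, 1, name="y")
tf.add_to_collection("inputs", x)
tf.add_to_collection("outputs", y)
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "./model.ckpt")  # also writes ./model.ckpt.meta

# consuming side: rebuild the graph from the .meta file, no model code needed
tf.reset_default_graph()
with tf.Session() as sess:
    saver = tf.train.import_meta_graph("./model.ckpt.meta")
    saver.restore(sess, "./model.ckpt")
    x_in = tf.get_collection("inputs")[0]
    y_out = tf.get_collection("outputs")[0]
    print(sess.run(y_out, feed_dict={x_in: np.ones((1, 3))}))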
From what I've gathered so far, there are several different ways of dumping a TensorFlow graph into a file and then loading it into another program, but I haven't been able to find clear examples/information on how they work. What I already know is this:
Save the model's variables into a checkpoint file (.ckpt) using a tf.train.Saver() and restore them later (source)
Save a model into a .pb file and load it back in using tf.train.write_graph() and tf.import_graph_def() (source)
Load in a model from a .pb file, retrain it, and dump it into a new .pb file using Bazel (source)
Freeze the graph to save the graph and weights together (source)
Use as_graph_def() to save the model, and for weights/variables, map them into constants (source)
However, I haven't been able to clear up several questions regarding these different methods:
Regarding checkpoint files, do they only save the trained weights of a model? Could checkpoint files be loaded into a new program, and be used to run the model, or do they simply serve as ways to save the weights in a model at a certain time/stage?
Regarding tf.train.write_graph(), are the weights/variables saved as well?
Regarding Bazel, can it only save into/load from .pb files for retraining? Is there a simple Bazel command just to dump a graph into a .pb?
Regarding freezing, can a frozen graph be loaded in using tf.import_graph_def()?
The Android demo for TensorFlow loads in Google's Inception model from a .pb file. If I wanted to substitute my own .pb file, how would I go about doing that? Would I need to change any native code/methods?
In general, what exactly is the difference between all these methods? Or more broadly, what is the difference between as_graph_def()/.ckpt/.pb?
In short, what I'm looking for is a method to save both a graph (as in, the various operations and such) and its weights/variables into a file, which can then be used to load the graph and weights into another program, for use (not necessarily continuing/retraining).
Documentation about this topic isn't very straightforward, so any answers/information would be greatly appreciated.
There are many ways to approach the problem of saving a model in TensorFlow, which can make it a bit confusing. Taking each of your sub-questions in turn:
The checkpoint files (produced e.g. by calling saver.save() on a tf.train.Saver object) contain only the weights, and any other variables defined in the same program. To use them in another program, you must re-create the associated graph structure (e.g. by running code to build it again, or calling tf.import_graph_def()), which tells TensorFlow what to do with those weights. Note that calling saver.save() also produces a file containing a MetaGraphDef, which contains a graph and details of how to associate the weights from a checkpoint with that graph. See the tutorial for more details.
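A small sketch of that save/import_meta_graph pairing (the trivial variable and paths are placeholders):

import tensorflow as tf

v = tf.Variable(42.0, name="v")
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # writes model.ckpt.* (weights) plus model.ckpt.meta (the MetaGraphDef)
    saver.save(sess, "./model.ckpt")

# another program: recover graph AND weights without re-running model code
tf.reset_default_graph()
with tf.Session() as sess:
    new_saver = tf.train.import_meta_graph("./model.ckpt.meta")
    new_saver.restore(sess, "./model.ckpt")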
tf.train.write_graph() only writes the graph structure; not the weights.
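For example (a sketch; the directory and file name are arbitrary):

import tensorflow as tf

# writes only the GraphDef (node structure) as graph.pbtxt; no variable values
tf.train.write_graph(tf.get_default_graph().as_graph_def(),
                     "./models", "graph.pbtxt", as_text=True)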
Bazel is unrelated to reading or writing TensorFlow graphs. (Perhaps I misunderstand your question: feel free to clarify it in a comment.)
A frozen graph can be loaded using tf.import_graph_def(). In this case, the weights are (typically) embedded in the graph, so you don't need to load a separate checkpoint.
The main change would be to update the names of the tensor(s) that are fed into the model, and the names of the tensor(s) that are fetched from the model. In the TensorFlow Android demo, this would correspond to the inputName and outputName strings that are passed to TensorFlowClassifier.initializeTensorFlow().
The GraphDef is the program structure, which typically does not change through the training process. The checkpoint is a snapshot of the state of a training process, which typically changes at every step of the training process. As a result, TensorFlow uses different storage formats for these types of data, and the low-level API provides different ways to save and load them. Higher-level libraries, such as the MetaGraphDef libraries, Keras, and skflow build on these mechanisms to provide more convenient ways to save and restore an entire model.
You can try the following code:
import tensorflow as tf

with tf.gfile.FastGFile('model/frozen_inference_graph.pb', "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
# import_graph_def populates a graph in place rather than returning one
g_in = tf.Graph()
with g_in.as_default():
    tf.import_graph_def(graph_def, name="")
sess = tf.Session(graph=g_in)
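Once loaded, you run the graph by tensor name; the names below are hypothetical and depend on your model:

# "input:0" / "output:0" and input_data are placeholders for your own graph
result = sess.run("output:0", feed_dict={"input:0": input_data})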