From what I've gathered so far, there are several different ways of dumping a TensorFlow graph into a file and then loading it into another program, but I haven't been able to find clear examples/information on how they work. What I already know is this:
Save the model's variables into a checkpoint file (.ckpt) using a tf.train.Saver() and restore them later (source)
Save a model into a .pb file and load it back in using tf.train.write_graph() and tf.import_graph_def() (source)
Load in a model from a .pb file, retrain it, and dump it into a new .pb file using Bazel (source)
Freeze the graph to save the graph and weights together (source)
Use as_graph_def() to save the model, and for weights/variables, map them into constants (source)
However, I haven't been able to clear up several questions regarding these different methods:
Regarding checkpoint files, do they only save the trained weights of a model? Could checkpoint files be loaded into a new program, and be used to run the model, or do they simply serve as ways to save the weights in a model at a certain time/stage?
Regarding tf.train.write_graph(), are the weights/variables saved as well?
Regarding Bazel, can it only save into/load from .pb files for retraining? Is there a simple Bazel command just to dump a graph into a .pb?
Regarding freezing, can a frozen graph be loaded in using tf.import_graph_def()?
The Android demo for TensorFlow loads in Google's Inception model from a .pb file. If I wanted to substitute my own .pb file, how would I go about doing that? Would I need to change any native code/methods?
In general, what exactly is the difference between all these methods? Or more broadly, what is the difference between as_graph_def()/.ckpt/.pb?
In short, what I'm looking for is a method to save both a graph (as in, the various operations and such) and its weights/variables into a file, which can then be used to load the graph and weights into another program, for use (not necessarily continuing/retraining).
Documentation about this topic isn't very straightforward, so any answers/information would be greatly appreciated.
There are many ways to approach the problem of saving a model in TensorFlow, which can make it a bit confusing. Taking each of your sub-questions in turn:
The checkpoint files (produced e.g. by calling saver.save() on a tf.train.Saver object) contain only the weights, and any other variables defined in the same program. To use them in another program, you must re-create the associated graph structure (e.g. by running code to build it again, or calling tf.import_graph_def()), which tells TensorFlow what to do with those weights. Note that calling saver.save() also produces a file containing a MetaGraphDef, which contains a graph and details of how to associate the weights from a checkpoint with that graph. See the tutorial for more details.
tf.train.write_graph() writes only the graph structure, not the weights.
Bazel is unrelated to reading or writing TensorFlow graphs. (Perhaps I misunderstand your question: feel free to clarify it in a comment.)
A frozen graph can be loaded using tf.import_graph_def(). In this case, the weights are (typically) embedded in the graph, so you don't need to load a separate checkpoint.
The main change would be to update the names of the tensor(s) that are fed into the model, and the names of the tensor(s) that are fetched from the model. In the TensorFlow Android demo, this would correspond to the inputName and outputName strings that are passed to TensorFlowClassifier.initializeTensorFlow().
The GraphDef is the program structure, which typically does not change through the training process. The checkpoint is a snapshot of the state of a training process, which typically changes at every step of the training process. As a result, TensorFlow uses different storage formats for these types of data, and the low-level API provides different ways to save and load them. Higher-level libraries, such as the MetaGraphDef libraries, Keras, and skflow build on these mechanisms to provide more convenient ways to save and restore an entire model.
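To make the checkpoint/GraphDef split concrete, here is a minimal sketch using the TF 1.x-style API through tf.compat.v1. The toy graph, the names x, w and y, and the temporary output directory are assumptions made up for illustration. It writes a checkpoint and a GraphDef side by side, then lists what the checkpoint holds.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph/session API

export_dir = tempfile.mkdtemp()  # hypothetical output location
ckpt_prefix = os.path.join(export_dir, "model.ckpt")

with tf.Graph().as_default():
    # Toy model: y = x @ w, with a single variable standing in for the weights.
    x = tf1.placeholder(tf.float32, [None, 2], name="x")
    w = tf1.get_variable("w", initializer=tf.constant([[1.0], [2.0]]))
    y = tf.matmul(x, w, name="y")

    saver = tf1.train.Saver()
    with tf1.Session() as sess:
        sess.run(tf1.global_variables_initializer())
        # Checkpoint: variable values (saver.save() also writes a .meta file
        # containing a MetaGraphDef).
        saver.save(sess, ckpt_prefix)
        # GraphDef: the graph structure only.
        tf1.train.write_graph(sess.graph_def, export_dir, "graph.pb", as_text=False)

# The checkpoint maps variable names to saved tensors.
ckpt_vars = tf.train.list_variables(ckpt_prefix)
print(ckpt_vars)
print(sorted(os.listdir(export_dir)))
```

The directory listing should show the checkpoint files (.index, .data-*, .meta, checkpoint) next to graph.pb, which matches the separation described above.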
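You can try the following code (TensorFlow 1.x API) to load a frozen graph:
import tensorflow as tf
# Read the serialized, frozen GraphDef from disk.
with tf.gfile.GFile('model/frozen_inference_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
# import_graph_def() adds nodes to the current default graph; it does not return a Graph.
with tf.Graph().as_default() as g_in:
    tf.import_graph_def(graph_def, name="")
sess = tf.Session(graph=g_in)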
I need to have a frozen graph (GraphDef file) while using TensorFlow 2.X.
That is because I use a tool which expects a frozen graph, but my training had to be done with TF2.X and Keras.
I tried many different ways to save my TF2 model. The variant with which I was able to get the most useful formats is the following:
sess = tf.compat.v1.Session()
saver = tf.compat.v1.train.Saver(var_list=cnn.trainable_variables)
save_path = saver.save(sess, os.path.join(CHKPT_DIR, CHKPT_FILE))
tf.compat.v1.train.write_graph(sess.graph_def, CHKPT_DIR, TRAIN_GRAPH, as_text=False)
That way I was able to get the following files:
float_model.ckpt.data-00000-of-00001
float_model.ckpt.index
checkpoint
training_model.pb
Of these files I need the *.ckpt and training_model.pb to freeze my model. However, when using the freeze_graph.sh (with TF1.X, different virtual environment), it throws the error
ValueError: No variables to save
This happens even though I pass the variables as a list via var_list=cnn.trainable_variables. cnn.trainable_variables is not empty and seems to contain all the variables used by my model.
Thus, I tried using the following method, according to TF2.X standards (assuming cnn is my model):
cnn.save(CHKPT_PATH)
checkpoint = tf.train.Checkpoint(cnn)
save_path = checkpoint.save(CHKPT_PATH)
Here I get the following files:
float_model.ckpt-1.data-00000-of-00001
float_model.ckpt-1.index
checkpoint
floating_model.ckpt/keras_metadata.pb
floating_model.ckpt/saved_model.pb
floating_model.ckpt/assets
floating_model.ckpt/variables
But here is where I get confused. Is some kind of frozen graph already available here, or an equivalent? If not, how can I get one with TF2.X, if that is possible? I found the sentence
The .save() method is already saving a *.pb ready for inference.
in this post. So the frozen graph is ready for inference, and thus one of these files must be equivalent to a frozen graph, right?
I have a TensorFlow model that I built (a 1D CNN) that I would now like to implement into .NET.
In order to do so I need to know the Input and Output nodes.
When I uploaded the model on Netron I get a different graph depending on my save method and the only one that looks correct comes from an h5 upload. Here is the model.summary():
If I save the model as an h5 model.save("Mn_pb_model.h5") and load that into the Netron to graph it, everything looks correct:
However, ML.NET will not accept h5 format so it needs to be saved as a pb.
In looking through samples of adopting TensorFlow in ML.NET, this sample shows a TensorFlow model that is saved in a similar format to the SavedModel format - recommended by TensorFlow (and also recommended by ML.NET here "Download an unfrozen [SavedModel format] ..."). However when saving and loading the pb file into Netron I get this:
And zoomed in a little further (on the far right side),
As you can see, it looks nothing like it should.
Additionally the input nodes and output nodes are not correct so it will not work for ML.NET (and I think something is wrong).
I am using the recommended way from TensorFlow to determine the Input / Output nodes:
When I try to obtain a frozen graph and load it into Netron, at first it looks correct, but I don't think that it is:
There are four reasons I do not think this is correct.
it is very different from the graph when it was uploaded as an h5 (which looks correct to me).
as you can see from earlier, I am using 1D convolutions throughout and this is showing that it goes to 2D (and remains that way).
this file size is 128MB whereas the one in the TensorFlow to ML.NET example is only 252KB. Even the Inception model is only 56MB.
if I load the Inception model in TensorFlow and save it as an h5, it looks the same as from the ML.NET resource, yet when I save it as a frozen graph it looks different. If I take the same model and save it in the recommended SavedModel format, it shows up all messed up in Netron. Take any model you want and save it in the recommended SavedModel format and you will see for yourself (I've tried it on a lot of different models).
Additionally, looking at the model.summary() of Inception alongside its graph, it is similar to its graph in the same way my model.summary() is to the h5 graph.
It seems like there should be an easier way (and a correct way) to save a TensorFlow model so it can be used in ML.NET.
Please show that your suggested solution works: In the answer that you provide, please check that it works (load the pb model [this should also have a Variables folder in order to work for ML.NET] into Netron and show that it is the same as the h5 model, e.g., screenshot it). So that we are all trying the same thing, here is a link to a MNIST ML crash course example. It takes less than 30s to run the program and produces a model called my_model. From here you can save it according to your method and upload it to see the graph on Netron. Here is the h5 model upload:
This answer is made of 3 parts:
going through other programs
NOT going through other programs
Difference between op-level graph and conceptual graph (and why Netron shows you different graphs)
1. Going through other programs:
ML.NET can load ONNX models, so one route is to convert your model to ONNX rather than use a pb file directly.
There are several ways to convert your model from TensorFlow to an ONNX model you could load in ML.NET:
With WinMLTools tools: https://learn.microsoft.com/en-us/windows/ai/windows-ml/convert-model-winmltools
With MMdnn: https://github.com/microsoft/MMdnn
With tf2onnx: https://github.com/onnx/tensorflow-onnx
If trained with Keras, with keras2onnx: https://github.com/onnx/keras-onnx
This SO post could help you too: Load model with ML.NET saved with keras
And here you will find more information on the h5 and pb file formats, what they contain, etc.: https://www.tensorflow.org/guide/keras/save_and_serialize#weights_only_saving_in_savedmodel_format
2. But you are asking "TensorFlow -> ML.NET without going through other programs":
2.A An overview of the problem:
First, the pb file you produced with the code you provided seems, from what you say, not to be the same as the one used in the example you mentioned in a comment (https://learn.microsoft.com/en-us/dotnet/machine-learning/tutorials/text-classification-tf)
Could you try using the pb file generated by tf.saved_model.save? Does it work?
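For reference, here is a minimal sketch of what that export could look like, and of how to read the input and output names back from the saved signature. TinyModel, its variables, and the export path are hypothetical stand-ins for your own model, and a tf.Module is used instead of a Keras model only to keep the example small.

```python
import os
import tempfile

import tensorflow as tf


class TinyModel(tf.Module):
    """Hypothetical stand-in for a trained model: y = x @ w + b."""

    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.ones([4, 2]), name="w")
        self.b = tf.Variable(tf.zeros([2]), name="b")

    @tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32, name="x")])
    def __call__(self, x):
        return tf.matmul(x, self.w) + self.b


model = TinyModel()
export_dir = os.path.join(tempfile.mkdtemp(), "tiny_saved_model")

# Export in the SavedModel format with an explicit serving signature.
tf.saved_model.save(
    model,
    export_dir,
    signatures={"serving_default": model.__call__.get_concrete_function()},
)

# Reload and inspect the signature to find the input/output names.
loaded = tf.saved_model.load(export_dir)
serving_fn = loaded.signatures["serving_default"]
input_specs = serving_fn.structured_input_signature[1]  # keyword inputs by name
output_specs = serving_fn.structured_outputs  # outputs by name
print(input_specs)
print(output_specs)
print(sorted(os.listdir(export_dir)))
```

The export directory contains saved_model.pb together with a variables folder, which is the layout the question says ML.NET expects.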
A thought about this Microsoft blog post:
From this page we can read:
In ML.NET you can load a frozen TensorFlow model .pb file (also called
“frozen graph def” which is essentially a serialized graph_def
protocol buffer written to disk)
and:
That TensorFlow .pb model file that you see in the diagram (and the
labels.txt codes/Ids) is what you create/train in Azure Cognitive
Services Custom Vision then exporte as a frozen TensorFlow model file
to be used by ML.NET C# code.
So, this pb file is a type of file generated from Azure Cognitive Services Custom Vision.
Perhaps you could try this route too?
2.B Now, we'll try to provide the solution:
In fact, in TensorFlow 1.x you could save a frozen graph easily, using freeze_graph.
But TensorFlow 2.x does not support freeze_graph and convert_variables_to_constants.
You could read some useful information here too: Tensorflow 2.0 : frozen graph support
Some users are wondering how to do this in TF 2.x: how to freeze graph in tensorflow 2.0 (https://github.com/tensorflow/tensorflow/issues/27614)
There are, however, some solutions for creating the pb file you want to load in ML.NET:
https://leimao.github.io/blog/Save-Load-Inference-From-TF2-Frozen-Graph/
How to save Keras model as frozen graph? (already linked in your question though)
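The approach in those links boils down to tracing the model into a concrete function and folding its variables into constants. Below is a minimal sketch of that idea, not a definitive recipe. It assumes the convert_variables_to_constants_v2 helper from tensorflow.python.framework.convert_to_constants (an internal module path, so it may move between versions), and the toy forward function, variable names, and output directory are made up for illustration.

```python
import os
import tempfile

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# Toy "model": two variables used inside a traced function (hypothetical).
w = tf.Variable(tf.ones([4, 2]), name="w")
b = tf.Variable(tf.zeros([2]), name="b")


@tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32, name="x")])
def forward(x):
    return tf.matmul(x, w) + b


# Trace to a concrete function, then replace variables with constants.
concrete_fn = forward.get_concrete_function()
frozen_fn = convert_variables_to_constants_v2(concrete_fn)
frozen_graph_def = frozen_fn.graph.as_graph_def()

# Serialize the frozen GraphDef as a binary .pb file.
export_dir = tempfile.mkdtemp()  # hypothetical output location
tf.io.write_graph(frozen_graph_def, export_dir, "frozen_graph.pb", as_text=False)

# Tensor names to feed and fetch when loading the frozen graph elsewhere.
print([t.name for t in frozen_fn.inputs])
print([t.name for t in frozen_fn.outputs])
op_types = {node.op for node in frozen_graph_def.node}
print(sorted(op_types))
```

For a Keras model, the same pattern is usually applied to a tf.function that wraps the model call, as the linked blog post describes.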
3. Difference between op-level graph and conceptual graph (and why Netron shows you different graphs):
As @mlneural03 said in a comment on your question, Netron shows a different graph depending on the file format you give it:
If you load an h5 file, Netron will display the conceptual graph
If you load a pb file, Netron will display the op-level graph
What is the difference between an op-level graph and a conceptual graph?
In TensorFlow, the nodes of the op-level graph represent the operations ("ops"), like tf.add, tf.matmul, tf.linalg.inv, etc.
The conceptual graph will show you your model's structure.
These are completely different things.
"ops" is an abbreviation for "operations".
Operations are nodes that perform the computations.
So, that's why you get a very large graph with a lot of nodes when you load the pb file in Netron: you see all the computation nodes of the graph.
But when you load the h5 file in Netron, you "just" see your model's structure, the design of your model.
In TensorFlow, you can view your graph with TensorBoard:
By default, TensorBoard displays the op-level graph.
To view the conceptual graph, in TensorBoard, select the "keras" tag.
There is a Jupyter Notebook that explains very clearly the difference between the op-level graph and the conceptual graph here: https://colab.research.google.com/github/tensorflow/tensorboard/blob/master/docs/graphs.ipynb
You can also read this "issue" on the TensorFlow GitHub, which is related to your question: https://github.com/tensorflow/tensorflow/issues/39699
In a nutshell:
In fact there is no problem, just a little misunderstanding (and that's OK, we can't know everything).
You would like to see the same graphs when loading the h5 file and the pb file in Netron, but that cannot work, because the files do not contain the same graphs. These graphs are two ways of displaying the same model.
The pb file created with the method we described will be the correct pb file to load with ML.NET, as described in the Microsoft tutorial we talked about. So, if you load your correct pb file as described in these tutorials, you will load your real model.
I am using Tensorflow + Python.
I am curious if I can release a saved Tensorflow model (architecture + trained variables) without detailed source code. I'm aware of tf.train.Saver(), but it looks to save only variables, and in order to restore/run them, a user needs to "define" the same architecture.
For the testing/running purpose only, is there a way to release a saved {architecture+trained variables} without source code, so that a user can just cast a query and get a result?
The TensorFlow Serving project is intended to make this use case straightforward (assuming that the end user is only using the model for inference, not training). TensorFlow Serving includes an Exporter class that takes your tf.train.Saver, the tf.GraphDef that defines your overall model, and a "signature" that describes the inputs to and output from your model.
The basics tutorial has a good introduction to exporting your model.
You can build a Saver from the MetaGraphDef (saved with checkpoints by default: those .meta files) and then use that Saver to restore your model. So users don't have to re-define your graph in their code. But then they still need to figure out the model signature (input, output variables). I solve this using graph collections (tf.add_to_collection() / tf.get_collection()), but I am interested in finding better ways to do it as well.
You can take a look at my example implementation (eval.py evaluates a model without re-defining it):
reconstruct saver from meta graph https://github.com/falcondai/cifar10/blob/master/eval.py#L18
get input variables from collections https://github.com/falcondai/cifar10/blob/master/eval.py#L58
how to define your model https://github.com/falcondai/cifar10/blob/master/models/cp2f3d.py
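The pattern in those links can be condensed into the following sketch. It is not the code from the linked repository: the toy graph, the collection names "inputs" and "outputs", and the temporary directory are assumptions chosen for illustration, and it uses the TF 1.x-style API through tf.compat.v1.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph/session API

export_dir = tempfile.mkdtemp()  # hypothetical output location
ckpt_prefix = os.path.join(export_dir, "model.ckpt")

# Producer side: build the model, tag its endpoints, save checkpoint + .meta.
with tf.Graph().as_default():
    x = tf1.placeholder(tf.float32, [None, 3], name="x")
    w = tf1.get_variable("w", initializer=tf.ones([3, 1]))
    y = tf.matmul(x, w, name="y")
    tf1.add_to_collection("inputs", x)  # hypothetical collection names
    tf1.add_to_collection("outputs", y)
    saver = tf1.train.Saver()
    with tf1.Session() as sess:
        sess.run(tf1.global_variables_initializer())
        saver.save(sess, ckpt_prefix)  # also writes model.ckpt.meta

# Consumer side: no model-building code, only the saved files.
with tf.Graph().as_default():
    saver = tf1.train.import_meta_graph(ckpt_prefix + ".meta")
    x = tf1.get_collection("inputs")[0]
    y = tf1.get_collection("outputs")[0]
    with tf1.Session() as sess:
        saver.restore(sess, ckpt_prefix)
        result = sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]})

print(result)
```

The consumer only needs to know the collection names, which is a much smaller contract than the full model definition.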
Is it possible to share TensorFlow checkpoint files with other users (platform and CPU/GPU independent)? I shared a TensorFlow implementation of the DeconvNet and now I want to provide the trained weights. Can I simply upload the saved model, or is there another TensorFlow way? I'm asking because I read a tutorial where the weights were stored using numpy.savetxt and then restored during weight initialization. But this method was used for the MNIST example, which uses a very small net.
Thanks!
You could save the MetaGraph and provide code to restore and run your model:
http://tensorflow.org/how_tos/meta_graph
One downside of this is that it doesn't provide annotations of which tensors to feed/fetch, so you need to provide some code showing how to use it.
SavedModel is the next iteration of TensorFlow checkpoint format that takes care of that, but it doesn't have much documentation yet.
I use pickle, in binary mode, to dump and load big NumPy matrices, and it works quite well.
I've built and trained some networks with TensorFlow and successfully managed to save and restore the model's parameters.
However, for some scenarios (e.g. deploying a trained network in a customer's infrastructure) it is not the best solution to ship the full code/model. Thus, I am wondering if there is any way to restore/run a trained network without the original code/model used for training?
I guess this leads to the question of whether TensorFlow is able to save a (compressed?) version of the network architecture into the checkpoint files in addition to the weights of the variables.
Is this somehow possible?
If you really need to restore just from the graphdef file (*.pb), to load it from another application for instance, you will need to use the freeze_graph.py script from here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py
This script takes a graphdef (.pb) and a checkpoint (.ckpt) file as input and outputs a graphdef file which contains the weights in the form of constants (you can read the docs on the script for more details).
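As a rough in-process counterpart to what the script does from files, the sketch below folds the variables of a live session into constants with tf.compat.v1.graph_util.convert_variables_to_constants() and then runs the result through tf.import_graph_def(). The toy graph and the node names x and y are assumptions for illustration, and this is not the script's own code.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph/session API

# Build and "train" a toy model, then freeze it from the live session.
with tf.Graph().as_default():
    x = tf1.placeholder(tf.float32, [None, 2], name="x")
    w = tf1.get_variable("w", initializer=tf.constant([[1.0], [2.0]]))
    y = tf.matmul(x, w, name="y")
    with tf1.Session() as sess:
        sess.run(tf1.global_variables_initializer())
        # Replace every variable needed to compute "y" with a constant node.
        frozen_def = tf1.graph_util.convert_variables_to_constants(
            sess, sess.graph_def, output_node_names=["y"]
        )

# Another program only needs the frozen GraphDef: no checkpoint, no model code.
with tf.Graph().as_default() as graph:
    tf.import_graph_def(frozen_def, name="")
    with tf1.Session(graph=graph) as sess:
        out = sess.run("y:0", feed_dict={"x:0": [[3.0, 4.0]]})

print(out)
```

In practice frozen_def would be serialized to a .pb file and parsed back in the consuming application, but the feed and fetch names stay the same.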