TensorFlow restore/deploy network without the model?

I've built and trained some networks with TensorFlow and successfully managed to save and restore the model's parameters.
However, for some scenarios, e.g. deploying a trained network in a customer's infrastructure, shipping the full code/model is not the best solution. Thus, I am wondering if there is any way to restore/run a trained network without the original code/model used for training?
I guess this leads to the question if TensorFlow is able to save a (compressed?) version of the network architecture into the checkpoint files in addition to the weights of the variables.
Is this somehow possible?

If you really need to restore just from the graphdef file (*.pb), to load it from another application for instance, you will need to use the freeze_graph.py script from here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py
This script takes a graphdef (.pb) and a checkpoint (.ckpt) file as input and outputs a graphdef file which contains the weights in the form of constants (you can read the docs on the script for more details).
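For illustration, here is a minimal sketch of the same idea done in-process rather than through the script. It assumes the TF 1.x-style API exposed as tensorflow.compat.v1 (TF 1.14 or later, or 2.x) and uses tf.graph_util.convert_variables_to_constants, which as far as I know is what freeze_graph.py relies on internally. The toy model, the tensor names "input"/"output" and the file name frozen.pb are made up for the example.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf  # TF 1.x-style API (assumption: TF >= 1.14 or 2.x)


def build_and_freeze(export_dir):
    """Training side: build a toy graph, then freeze it into a single .pb file."""
    with tf.Graph().as_default() as graph:
        x = tf.placeholder(tf.float32, shape=[None, 2], name="input")
        w = tf.get_variable("w", initializer=tf.constant([[1.0], [2.0]]))
        tf.identity(tf.matmul(x, w), name="output")
        with tf.Session(graph=graph) as sess:
            sess.run(tf.global_variables_initializer())
            # Replace every variable by a constant holding its current value.
            frozen = tf.graph_util.convert_variables_to_constants(
                sess, graph.as_graph_def(), ["output"])
    path = os.path.join(export_dir, "frozen.pb")
    with tf.gfile.GFile(path, "wb") as f:
        f.write(frozen.SerializeToString())
    return path


def run_frozen(path, batch):
    """Deployment side: needs only the .pb file and the tensor names."""
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(path, "rb") as f:
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="")
    with tf.Session(graph=graph) as sess:
        return sess.run("output:0", feed_dict={"input:0": batch})


frozen_path = build_and_freeze(tempfile.mkdtemp())
result = run_frozen(frozen_path, np.array([[3.0, 4.0]], dtype=np.float32))
```

Note that run_frozen() never touches the model-building code; the only things the deployment side has to know are the names of the input and output tensors.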

Related

Deploying TensorFlow Checkpoint to Google Cloud Platform

I have trained a TensorFlow model and saved a checkpoint, and would like to deploy it to Google Cloud Platform. In the model deployment documentation it says that you need to create a SavedModel. It seems that others also use checkpoints instead of SavedModel.
Given that I have already spent time training this model and only have checkpoints instead of a SavedModel, is there a method I can use to deploy the model still or will I need to retrain?

A checkpoint maps variable names to tensor values. This, as is, is not enough for higher-level systems to use your model. On the other hand, a SavedModel is complete and airtight. As is made clear in the answer you link to in your post, a SavedModel provides all the info needed for serving TensorFlow models: a set of MetaGraphs, a checkpoint compatible with these Graphs and all necessary asset files. If you look at it this way, it makes sense that you need to export your model to a SavedModel in order to deploy it to ML Engine. Now, this does not imply that you need to retrain. What you need to do instead is to wrap one of your checkpoints into a SavedModel.
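A rough sketch of what "wrapping a checkpoint into a SavedModel" can look like, assuming the TF 1.x-style API (tensorflow.compat.v1) and that you can still run the code that builds the graph. build_graph() is a hypothetical stand-in for your own model code, and the first block only exists to produce a checkpoint for the example; the signature keys "x" and "y" are made up as well.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf  # TF 1.x-style API (assumption: TF >= 1.14 or 2.x)


def build_graph():
    """Hypothetical stand-in for the original model-building code."""
    x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
    w = tf.get_variable("w", initializer=tf.constant([[1.0], [2.0]]))
    y = tf.matmul(x, w, name="y")
    return x, y


work_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(work_dir, "model.ckpt")
export_dir = os.path.join(work_dir, "saved_model")  # must not exist yet

# Stand-in for the training job that produced the checkpoint.
with tf.Graph().as_default(), tf.Session() as sess:
    build_graph()
    sess.run(tf.global_variables_initializer())
    tf.train.Saver().save(sess, ckpt_path)

# Wrap the checkpoint: rebuild the graph, restore the weights, export.
with tf.Graph().as_default(), tf.Session() as sess:
    x, y = build_graph()
    tf.train.Saver().restore(sess, ckpt_path)
    tf.saved_model.simple_save(sess, export_dir, inputs={"x": x}, outputs={"y": y})

# Check that the export loads without calling build_graph().
with tf.Graph().as_default(), tf.Session() as sess:
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], export_dir)
    prediction = sess.run("y:0", feed_dict={"x:0": [[3.0, 4.0]]})
```

No training step runs anywhere in the second block, which is the point: the weights come from the checkpoint as they are.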

How to Optimize a Trained Frozen Model for Inference?

My goal is to decrease the size and complexity of a pre-trained Model (a Tensorflow Frozen Graph as Protobuf .pb file) as far as possible to make Inference (in my case realtime object detection using Webcams) as fast as possible.
(See my project repo for more information: https://github.com/GustavZ/realtime_object_detection)
Let's take a look at the pre-trained ssd_mobilenet_v1_coco provided by the tensorflow object detection API:
link to ssd_mobilenet graph
Which Layers are not necessary for inference (so only for the already completed training) and thus can be removed (from the config file to export a new frozen model using the export_inference_graph.py script)?
It would be very nice to get a general answer on how to optimize Models for inference as well as an answer on my special case as I think this could be of interest for others.
EDIT: I now know about the optimize_for_inference.py script provided by TensorFlow. But I have no experience using it. For example, how do I know which input and output nodes are really necessary, or how do I read them from TensorBoard?

Export and import tensorflow network for evaluating states in application

I am writing a neural network in tensorflow and I want to be able to export my final trained network and import it in another program to play a game. I have found multiple forum posts like:
Tensorflow: How to use a trained model in a application?
Tensorflow: how to save/restore a model?
I also saw in the tf documentations they were using estimators to save the model but I am not sure if that is what I'm looking for and how to apply it.
But those talk about exporting the entire session and importing it into the application and using Session.run, but as I understand it that requires an input of the predicted output and will run another training step on my network. I don't want to continue training my network - it's finished - I now want to evaluate a specific state given to me by the game only.
Thanks in advance for any help available.

As far as I know, there are two ways of doing it:
checkpoint files (MetaGraph)
SavedModel
SavedModel is very convenient, but its learning curve is steeper than that of checkpoint files. You can check this tutorial.
Importing a model does not continue the training; it basically restores all the variables you learned.
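To make the last point concrete, here is a small sketch of the checkpoint + MetaGraph route, assuming the TF 1.x-style API (tensorflow.compat.v1). The toy model and the names "state", "target", "value" and "train_op" are invented for the example. The application only fetches the prediction tensor, so no target has to be fed and no training step runs.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf  # TF 1.x-style API (assumption: TF >= 1.14 or 2.x)

ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# Stand-in for the training program: a toy model that has a training op.
with tf.Graph().as_default(), tf.Session() as sess:
    state = tf.placeholder(tf.float32, shape=[None, 2], name="state")
    target = tf.placeholder(tf.float32, shape=[None, 1], name="target")
    w = tf.get_variable("w", initializer=tf.constant([[1.0], [2.0]]))
    value = tf.matmul(state, w, name="value")
    loss = tf.reduce_mean(tf.square(value - target))
    tf.train.GradientDescentOptimizer(0.1).minimize(loss, name="train_op")
    sess.run(tf.global_variables_initializer())
    tf.train.Saver().save(sess, ckpt_path)  # also writes model.ckpt.meta

# The application: restore graph and weights, then fetch only the prediction.
with tf.Graph().as_default(), tf.Session() as sess:
    saver = tf.train.import_meta_graph(ckpt_path + ".meta")
    saver.restore(sess, ckpt_path)
    restored_w = tf.global_variables()[0]  # the only variable in this toy model
    weights_before = sess.run(restored_w)
    # train_op is not fetched and no target is fed, so nothing gets updated.
    value_out = sess.run("value:0", feed_dict={"state:0": [[3.0, 4.0]]})
    weights_after = sess.run(restored_w)
```

Session.run only executes what is needed for the tensors you ask for, so the weights are identical before and after the call.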

How to share tensorflow weights with other users

Is it possible to share TensorFlow checkpoint files with other users (platform and CPU/GPU independent)? I have shared a TensorFlow implementation of DeconvNet and now I want to provide the trained weights. Can I simply upload the saved model, or is there another TensorFlow way? I'm asking because I read a tutorial where the weights were stored using numpy.savetxt and then restored during weight initialization. But this method was used for the MNIST example, which uses a very small net.
Thanks!

You could save metagraph + provide code to restore and run your model --
http://tensorflow.org/how_tos/meta_graph
One downside of this is that it doesn't provide annotations of which tensors to feed/fetch, so you need to provide some code showing how to use it.
SavedModel is the next iteration of TensorFlow checkpoint format that takes care of that, but it doesn't have much documentation yet.

I use pickle, in binary mode, to dump and load big NumPy matrices and it works quite well.
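A minimal sketch of that approach. The dictionary keys and array shapes are made up for the example; in practice you would fill the dictionary with the values of your own variables.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical weights, keyed by variable name.
weights = {
    "conv1/kernel": np.random.rand(3, 3, 1, 8).astype(np.float32),
    "conv1/bias": np.zeros(8, dtype=np.float32),
}

path = os.path.join(tempfile.mkdtemp(), "weights.pkl")

# Dump in binary mode; the highest protocol handles large arrays more efficiently.
with open(path, "wb") as f:
    pickle.dump(weights, f, protocol=pickle.HIGHEST_PROTOCOL)

# Load in binary mode on the receiving side.
with open(path, "rb") as f:
    restored = pickle.load(f)
```

One caveat when sharing with other users: unpickling can execute arbitrary code, so people should only load pickle files from sources they trust.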

TensorFlow saving into/loading a graph from a file

From what I've gathered so far, there are several different ways of dumping a TensorFlow graph into a file and then loading it into another program, but I haven't been able to find clear examples/information on how they work. What I already know is this:
Save the model's variables into a checkpoint file (.ckpt) using a tf.train.Saver() and restore them later (source)
Save a model into a .pb file and load it back in using tf.train.write_graph() and tf.import_graph_def() (source)
Load in a model from a .pb file, retrain it, and dump it into a new .pb file using Bazel (source)
Freeze the graph to save the graph and weights together (source)
Use as_graph_def() to save the model, and for weights/variables, map them into constants (source)
However, I haven't been able to clear up several questions regarding these different methods:
Regarding checkpoint files, do they only save the trained weights of a model? Could checkpoint files be loaded into a new program, and be used to run the model, or do they simply serve as ways to save the weights in a model at a certain time/stage?
Regarding tf.train.write_graph(), are the weights/variables saved as well?
Regarding Bazel, can it only save into/load from .pb files for retraining? Is there a simple Bazel command just to dump a graph into a .pb?
Regarding freezing, can a frozen graph be loaded in using tf.import_graph_def()?
The Android demo for TensorFlow loads in Google's Inception model from a .pb file. If I wanted to substitute my own .pb file, how would I go about doing that? Would I need to change any native code/methods?
In general, what exactly is the difference between all these methods? Or more broadly, what is the difference between as_graph_def()/.ckpt/.pb?
In short, what I'm looking for is a method to save both a graph (as in, the various operations and such) and its weights/variables into a file, which can then be used to load the graph and weights into another program, for use (not necessarily continuing/retraining).
Documentation about this topic isn't very straightforward, so any answers/information would be greatly appreciated.

There are many ways to approach the problem of saving a model in TensorFlow, which can make it a bit confusing. Taking each of your sub-questions in turn:
The checkpoint files (produced e.g. by calling saver.save() on a tf.train.Saver object) contain only the weights, and any other variables defined in the same program. To use them in another program, you must re-create the associated graph structure (e.g. by running code to build it again, or calling tf.import_graph_def()), which tells TensorFlow what to do with those weights. Note that calling saver.save() also produces a file containing a MetaGraphDef, which contains a graph and details of how to associate the weights from a checkpoint with that graph. See the tutorial for more details.
tf.train.write_graph() only writes the graph structure; not the weights.
Bazel is unrelated to reading or writing TensorFlow graphs. (Perhaps I misunderstand your question: feel free to clarify it in a comment.)
A frozen graph can be loaded using tf.import_graph_def(). In this case, the weights are (typically) embedded in the graph, so you don't need to load a separate checkpoint.
The main change would be to update the names of the tensor(s) that are fed into the model, and the names of the tensor(s) that are fetched from the model. In the TensorFlow Android demo, this would correspond to the inputName and outputName strings that are passed to TensorFlowClassifier.initializeTensorFlow().
The GraphDef is the program structure, which typically does not change through the training process. The checkpoint is a snapshot of the state of a training process, which typically changes at every step of the training process. As a result, TensorFlow uses different storage formats for these types of data, and the low-level API provides different ways to save and load them. Higher-level libraries, such as the MetaGraphDef libraries, Keras, and skflow build on these mechanisms to provide more convenient ways to save and restore an entire model.
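As a small illustration of the GraphDef/checkpoint split described above, the following sketch writes both for a toy model and then inspects each file. It assumes the TF 1.x-style API (tensorflow.compat.v1); the model and the file names graph.pb and model.ckpt are made up for the example.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf  # TF 1.x-style API (assumption: TF >= 1.14 or 2.x)

out_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(out_dir, "model.ckpt")

with tf.Graph().as_default() as graph, tf.Session() as sess:
    x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
    w = tf.get_variable("w", initializer=tf.constant([[1.0], [2.0]]))
    tf.matmul(x, w, name="y")
    sess.run(tf.global_variables_initializer())
    # Structure: the operations and how they are connected.
    tf.train.write_graph(graph.as_graph_def(), out_dir, "graph.pb", as_text=False)
    # State: variable names mapped to their current values.
    tf.train.Saver().save(sess, ckpt_path)

# The GraphDef lists operations by name.
graph_def = tf.GraphDef()
with tf.gfile.GFile(os.path.join(out_dir, "graph.pb"), "rb") as f:
    graph_def.ParseFromString(f.read())
op_names = [node.name for node in graph_def.node]

# The checkpoint lists saved variables as (name, shape) pairs.
checkpoint_vars = tf.train.list_variables(ckpt_path)
```

Here op_names contains entries such as "x" and "y", while checkpoint_vars only knows about the variable "w"; neither file is enough on its own to run the model.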

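You can try the following code (TensorFlow 1.x API):
import tensorflow as tf  # TensorFlow 1.x API

# Read the serialized frozen GraphDef from disk.
with tf.gfile.GFile('model/frozen_inference_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
# import_graph_def() returns None here, so import into an explicit Graph.
with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
sess = tf.Session(graph=graph)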
