Google says:
But as far as I know, if I want to use TensorFlow for inference I need a protobuf file (.pb), which I can get using the freeze_graph method. So what is the difference between those two?
As a heads-up, freeze_graph is generally deprecated in TensorFlow 2.x; you should be using SavedModels for the same functionality there. I'll be answering this question from the perspective of TensorFlow 1.x.
Before understanding the difference between the two, you need to know how a TensorFlow model is shaped.
Each TensorFlow model is composed of a graph data structure, which contains Operation objects (the units of computation) and Tensor objects (the units of data that flow between them).
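For example, here is a minimal TensorFlow 1.x sketch of a graph containing three operations, each producing a tensor:

import tensorflow as tf  # TensorFlow 1.x

g = tf.Graph()
with g.as_default():
    a = tf.constant(2.0, name='a')  # an Operation whose output is a Tensor
    b = tf.constant(3.0, name='b')
    c = tf.add(a, b, name='c')      # another Operation; c is its output Tensor

print([op.name for op in g.get_operations()])  # ['a', 'b', 'c']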
However, a graph alone is not enough to do anything like inference. When you train a model, it learns and optimizes a unique set of parameters for the different parts of your graph.
A PB file, then, can include both of these parts: the graph that represents the structure of the model, and the parameters that the model has learned through training.
So back to the original question - what's the difference between write_graph and freeze_graph?
write_graph writes out the graph of the model (the structure) into the PB file.
This does not require any training, so it doesn't include any parameters the model may have learned.
freeze_graph takes the trained parameters of the model from a training checkpoint and saves them to the PB file as well.
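A rough TensorFlow 1.x sketch of the two calls side by side; the checkpoint path and the output node name ('logits') below are placeholders for whatever your own model uses:

import tensorflow as tf
from tensorflow.python.tools import freeze_graph

sess = tf.Session()
# ... build and train the model, saving checkpoints with tf.train.Saver ...

# 1) write_graph: structure only, no learned parameters.
tf.train.write_graph(sess.graph_def, './export', 'graph.pbtxt', as_text=True)

# 2) freeze_graph: structure + parameters pulled from a checkpoint.
freeze_graph.freeze_graph(
    input_graph='./export/graph.pbtxt',
    input_saver='',
    input_binary=False,
    input_checkpoint='./export/model.ckpt',   # written by your Saver
    output_node_names='logits',               # your model's output node(s)
    restore_op_name='save/restore_all',
    filename_tensor_name='save/Const:0',
    output_graph='./export/frozen_graph.pb',
    clear_devices=True,
    initializer_nodes='')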
I was trying to load the Keras model which I saved during training. So I went to the Keras documentation, where I saw this:
Only topological loading (by_name=False) is supported when loading
weights from the TensorFlow format. Note that topological loading
differs slightly between TensorFlow and HDF5 formats for user-defined
classes inheriting from tf.keras.Model: HDF5 loads based on a
flattened list of weights, while the TensorFlow format loads based on
the object-local names of attributes to which layers are assigned in
the Model's constructor.
Could you please explain the above?
For clarity, let's consider two cases:
Case 1: Simple model, and
Case 2: Complex model where user-defined classes inheriting from tf.keras.Model were used.
Case 1: Simple model (as in keras Functional and Sequential models)
When you save model weights (using model.save_weights) and then load them (using model.load_weights), the load_weights method uses topological loading by default. This is the same for the TensorFlow ('tf') format as well as the 'h5' format. For example,
loadedh5_model.load_weights('./MyModel_h5.h5')
# the line above is the same as the line below (the second and third arguments are the defaults)
# loadedh5_model.load_weights('./MyModel_h5.h5', by_name=False, skip_mismatch=False)
If you want to load the weights of only specific layers of a saved model, then you need to use by_name=True. There are use cases that require this type of loading.
loadedh5_model.load_weights('./MyModel_h5.h5', by_name=True, skip_mismatch=False)
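A minimal end-to-end sketch of Case 1; the layer sizes and file names here are only illustrative:

import tensorflow as tf

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation='relu', input_shape=(8,), name='hidden'),
        tf.keras.layers.Dense(1, name='out'),
    ])

model = build_model()
model.save_weights('./MyModel_h5.h5')        # 'h5' format (inferred from the extension)
model.save_weights('./MyModel_tf_ckpt')      # 'tf' (checkpoint) format

restored = build_model()
restored.load_weights('./MyModel_h5.h5')     # topological loading by default
restored.load_weights('./MyModel_tf_ckpt')   # same default for the 'tf' format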
Case 2: Complex model (as in Keras subclassed models)
As of now, only the 'tf' format is supported when user-defined classes inheriting from tf.keras.Model were used in the model creation.
Only topological loading (by_name=False) is supported when loading
weights from the TensorFlow format. Note that topological loading
differs slightly between TensorFlow and HDF5 formats for user-defined
classes inheriting from tf.keras.Model: HDF5 loads based on a
flattened list of weights, while the TensorFlow format loads based on
the object-local names of attributes to which layers are assigned in
the Model's constructor.
The main reason is the way weights are stored in the h5 format versus the tf format.
For example, consider Case 1, where HDF5 loads based on a flattened list of weights; the weights are loaded without any error. In Case 2, however, the model has user-defined classes, which require a different approach than just loading flattened weights. In order to take care of assigning weights of custom classes, the 'tf' format loads the weights based on the object-local names of the attributes to which layers are assigned in the Model's constructor.
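A minimal sketch of Case 2; the attribute names dense_a and dense_b are exactly the "object-local names" the 'tf' format uses to match weights (the layer sizes and checkpoint path are only illustrative):

import tensorflow as tf

class MySubclassedModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense_a = tf.keras.layers.Dense(16, activation='relu')
        self.dense_b = tf.keras.layers.Dense(1)

    def call(self, inputs):
        return self.dense_b(self.dense_a(inputs))

model = MySubclassedModel()
model(tf.zeros([1, 8]))                        # create the variables with a dummy call
model.save_weights('./my_subclassed_ckpt')     # 'tf' (checkpoint) format by default

restored = MySubclassedModel()
restored(tf.zeros([1, 8]))
restored.load_weights('./my_subclassed_ckpt')  # topological loading (by_name=False)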
The following paragraph from the Keras website clarifies this further:
When loading a weight file in TensorFlow format, returns the same
status object as tf.train.Checkpoint.restore. When graph building,
restore ops are run automatically as soon as the network is built (on
first call for user-defined classes inheriting from Model, immediately
if it is already built).
Another point to understand is that Keras Functional or Sequential models are static graphs of layers that can use flattened weights without any problem. A Keras subclassed model (as in our Case 2) is a piece of Python code (a call method); there is no graph of layers. So, as soon as the network is built with custom classes, restore ops are run to update the status objects. Hope it helps.
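To see the status object mentioned in the quote, you can capture the return value of load_weights when using the 'tf' format. A small sketch continuing the Case 2 example above (the assertion shown is one of several methods available on the returned status object):

fresh = MySubclassedModel()                          # variables not created yet
status = fresh.load_weights('./my_subclassed_ckpt')  # returns a restore-status object

# Restore ops run as soon as the variables exist, i.e. on the first call.
_ = fresh(tf.zeros([1, 8]))
status.assert_existing_objects_matched()             # verify the restore completed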
Note: this question has an accompanying, documented Colab notebook.
TensorFlow's documentation can, at times, leave a lot to be desired. Some of the older docs for lower-level APIs seem to have been expunged, and most newer documents point towards using higher-level APIs such as TensorFlow's subset of Keras or estimators. This would not be so problematic if the higher-level APIs did not so often rely closely on their lower levels. Case in point: estimators (especially the input_fn when using TensorFlow Records).
Over the following Stack Overflow posts:
Tensorflow v1.10: store images as byte strings or per channel?
Tensorflow 1.10 TFRecordDataset - recovering TFRecords
Tensorflow v1.10+ why is an input serving receiver function needed when checkpoints are made without it?
TensorFlow 1.10+ custom estimator early stopping with train_and_evaluate
TensorFlow custom estimator stuck when calling evaluate after training
and with the gracious assistance of the TensorFlow / StackOverflow community, we have moved closer to doing what the TensorFlow "Creating Custom Estimators" guide has not: demonstrating how to make an estimator one might actually use in practice (rather than a toy example), e.g. one which:
has a validation set for early stopping if performance worsens,
reads from TF Records because many datasets are larger than TensorFlow's recommended 1 GB limit for in-memory data, and
saves its best version whilst training.
While I still have many questions regarding this (from the best way to encode data into a TF Record, to what exactly the serving_input_fn expects), there is one question that stands out more prominently than the rest:
How to predict with the custom estimator we just made?
Under the documentation for predict, it states:
input_fn: A function that constructs the features. Prediction continues until input_fn raises an end-of-input exception (tf.errors.OutOfRangeError or StopIteration). See Premade Estimators for more information. The function should construct and return one of the following:
A tf.data.Dataset object: Outputs of Dataset object must have same constraints as below.
features: A tf.Tensor or a dictionary of string feature name to Tensor. features are consumed by model_fn. They should satisfy the expectation of model_fn from inputs.
A tuple, in which case the first item is extracted as features.
(perhaps) Most likely, if one is using estimator.predict, they are using data in memory such as a dense tensor (because a held out test set would likely go through evaluate).
So I, in the accompanying Colab, create a single dense example, wrap it up in a tf.data.Dataset, and call predict to get a ValueError.
I would greatly appreciate it if someone could explain to me how I can:
load my saved estimator
given a dense, in memory example, predict the output with the estimator
to_predict = random_onehot((1, SEQUENCE_LENGTH, SEQUENCE_CHANNELS))\
             .astype(tf_type_string(I_DTYPE))
pred_features = {'input_tensors': to_predict}
pred_ds = tf.data.Dataset.from_tensor_slices(pred_features)
predicted = est.predict(lambda: pred_ds, yield_single_examples=True)
next(predicted)
ValueError: Tensor("IteratorV2:0", shape=(), dtype=resource) must be from the same graph as Tensor("TensorSliceDataset:0", shape=(), dtype=variant).
When you use the tf.data.Dataset module, it actually defines an input graph which is independent from the model graph. What happens here is that you first created a small graph by calling tf.data.Dataset.from_tensor_slices(), then the estimator API created a second graph by calling dataset.make_one_shot_iterator() automatically. These two graphs can't communicate, so it throws an error.
To circumvent this, you should never create a dataset outside of estimator.train/evaluate/predict. This is why everything data-related is wrapped inside input functions.
def predict_input_fn(data, batch_size=1):
    dataset = tf.data.Dataset.from_tensor_slices(data)
    return dataset.batch(batch_size).prefetch(None)
predicted = est.predict(lambda: predict_input_fn(pred_features), yield_single_examples=True)
next(predicted)
Now, the graph is not created outside of the predict call.
I also added dataset.batch() because the rest of your code expects batched data and it was throwing a shape error. Prefetch just speeds things up.
I'm trying to train my LSTM model in TensorFlow, and my module has to calculate a parameter inside another parameter. I want to train both parameters together.
More details are in the picture below.
I think that the TensorFlow LSTM module's input must be a complete sequence, with parameters fed in like tf.placeholder.
How can I do this in TensorFlow? Or can you recommend another framework better suited than TensorFlow for this task?
Sorry for my poor English.
First of all, your usage of the word parameter is quite confusing. Normally, parameters refers to trainable parameters, i.e. every variable that is trained by the optimizer. There are also so-called hyper-parameters, which have to be set by hand, e.g. the model topology.
TensorFlow works with tensors, which are representations of the data used to build the workflow; they are filled with data at run time via placeholders, which act as entry points for the data.
Also, if you have trouble building your model in TensorFlow, there is Keras. Keras can run with TensorFlow as its backend, but model building is much easier. Keras is also available in the TensorFlow API as tf.keras. In Keras, one or multiple LSTMs are simplified as a layer that can be added to your model.
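For example, a minimal sketch of a Keras model with a single LSTM layer; the sequence length, feature count, and layer sizes here are placeholders, not taken from your problem:

import tensorflow as tf

# A toy sequence model: 20 timesteps, 8 features per step, one output value.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(20, 8)),  # the LSTM is just a layer
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')
# model.fit(x_train, y_train, epochs=10)  # x_train shape: (batch, 20, 8)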
If you like a more specific answer to your question, please provide code to describe your problem.
My goal is to decrease the size and complexity of a pre-trained model (a TensorFlow frozen graph as a protobuf .pb file) as far as possible to make inference (in my case real-time object detection using webcams) as fast as possible.
(See my project repo for more information: https://github.com/GustavZ/realtime_object_detection)
Let's take a look at the pre-trained ssd_mobilenet_v1_coco provided by the TensorFlow Object Detection API:
link to ssd_mobilenet graph
Which layers are not necessary for inference (i.e. needed only for the already completed training) and thus can be removed (from the config file, to export a new frozen model using the export_inference_graph.py script)?
It would be very nice to get a general answer on how to optimize models for inference, as well as an answer for my specific case, as I think this could be of interest to others.
EDIT: I now know about the optimize_for_inference.py script provided by TensorFlow, but I have no experience using it. For example, how do I know which are the really necessary input and output nodes, or how do I read them from TensorBoard?
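For reference, a minimal sketch of calling the optimize_for_inference library directly. The node names 'image_tensor' and 'detection_boxes'/'detection_scores'/'detection_classes'/'num_detections' are the usual ones for Object Detection API models, but they are assumptions here; verify them for your graph, e.g. by printing the node names or inspecting the graph in TensorBoard:

import tensorflow as tf
from tensorflow.python.tools import optimize_for_inference_lib

# Load the frozen graph.
graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# print([n.name for n in graph_def.node])  # one way to check the node names

optimized = optimize_for_inference_lib.optimize_for_inference(
    graph_def,
    ['image_tensor'],                         # assumed input node
    ['detection_boxes', 'detection_scores',   # assumed output nodes
     'detection_classes', 'num_detections'],
    tf.uint8.as_datatype_enum)                # image_tensor is uint8

with tf.gfile.GFile('optimized_inference_graph.pb', 'wb') as f:
    f.write(optimized.SerializeToString())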
I am using TensorFlow + Python.
I am curious if I can release a saved TensorFlow model (architecture + trained variables) without detailed source code. I'm aware of tf.train.Saver(), but it appears to save only variables, and in order to restore/run them, a user needs to "define" the same architecture.
For testing/running purposes only, is there a way to release a saved {architecture + trained variables} without source code, so that a user can simply submit a query and get a result?
The TensorFlow Serving project is intended to make this use case straightforward (assuming that the end user is only using the model for inference, not training). TensorFlow Serving includes an Exporter class that takes your tf.train.Saver, the tf.GraphDef that defines your overall model, and a "signature" that describes the inputs to and outputs from your model.
The basics tutorial has a good introduction to exporting your model.
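For a rough idea of the export step, here is a minimal TF 1.x sketch that writes a SavedModel which TensorFlow Serving can load, using tf.saved_model.simple_save as a simpler alternative to the Exporter class described above; the tiny linear model and tensor names are placeholders for your real graph:

import tensorflow as tf

# A stand-in "model": y = x * W + b.
x = tf.placeholder(tf.float32, shape=[None, 1], name='x')
W = tf.Variable(tf.ones([1, 1]), name='W')
b = tf.Variable(tf.zeros([1]), name='b')
y = tf.add(tf.matmul(x, W), b, name='y')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Writes the graph, variables, and a serving signature; the end user can
    # load and query this SavedModel without your model-building source code.
    tf.saved_model.simple_save(sess, './exported_model',
                               inputs={'x': x}, outputs={'y': y})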
You can build a Saver from the MetaGraphDef (saved with checkpoints by default: those .meta files), and then use that Saver to restore your model, so users don't have to re-define your graph in their code. But they still need to figure out the model signature (input and output variables). I solve this using graph collections (but I am interested in finding better ways to do it as well).
You can take a look at my example implementation (the eval.py evaluates a model without re-defining it):
reconstruct saver from meta graph https://github.com/falcondai/cifar10/blob/master/eval.py#L18
get input variables from collections https://github.com/falcondai/cifar10/blob/master/eval.py#L58
how to define your model https://github.com/falcondai/cifar10/blob/master/models/cp2f3d.py
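A minimal sketch of this approach, assuming the checkpoint was written by a tf.train.Saver and the graph's author added the input and output tensors to collections named 'inputs' and 'outputs' (those collection names are just examples, not a TensorFlow convention):

import tensorflow as tf

with tf.Session() as sess:
    # Rebuild the graph and a Saver from the .meta file; no model code needed.
    saver = tf.train.import_meta_graph('model.ckpt.meta')
    saver.restore(sess, 'model.ckpt')

    # Recover the model "signature" from collections that were stored at
    # training time, e.g. via tf.add_to_collection('inputs', x).
    x = tf.get_collection('inputs')[0]
    y = tf.get_collection('outputs')[0]

    # Run a query against the restored model.
    result = sess.run(y, feed_dict={x: [[1.0]]})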