Context:
I have a simple classifier based on tf.estimator.DNNClassifier that takes text and outputs probabilities over a set of intent tags. I am able to train the model, export it to a servable, and serve the servable with TensorFlow Serving. The problem is that the servable is too big (around 1GB), so I wanted to try some TensorFlow graph transforms to reduce the size of the files being served.
Problem:
I understand how to take the saved_model.pb and use freeze_graph.py to create a new .pb file that transforms can be called on. The result of these transforms (also a .pb file) is not a servable and cannot be used with TensorFlow Serving.
How can a developer go from:
saved model -> graph transforms -> back to a servable
There's documentation that suggests this is certainly possible, but it's not at all intuitive from the docs how to do it.
What I've Tried:
import tensorflow as tf
from tensorflow.saved_model import simple_save
from tensorflow.saved_model import signature_constants
from tensorflow.saved_model import tag_constants
from tensorflow.tools.graph_transforms import TransformGraph

with tf.Session(graph=tf.Graph()) as sess_meta:
    meta_graph_def = tf.saved_model.loader.load(
        sess_meta,
        [tag_constants.SERVING],
        "/model/path")

    graph_def = meta_graph_def.graph_def

    other_graph_def = TransformGraph(
        graph_def,
        ["Placeholder"],
        ["dnn/head/predictions/probabilities"],
        ["quantize_weights"])

    with tf.Graph().as_default():
        graph = tf.get_default_graph()
        tf.import_graph_def(other_graph_def)
        in_tensor = graph.get_tensor_by_name(
            "import/Placeholder:0")
        out_tensor = graph.get_tensor_by_name(
            "import/dnn/head/predictions/probabilities:0")

        inputs = {"inputs": in_tensor}
        outputs = {"outputs": out_tensor}

        simple_save(sess_meta, "./new", inputs, outputs)
My idea was to load the servable, extract the graph_def from the meta_graph_def, transform the graph_def and then try to recreate the servable. This seems to be the incorrect approach.
Is there a way to successfully perform transforms (to reduce file size at inference) on a graph from an exported servable, and then recreate a servable with the transformed graph?
Thanks.
Update (2018-08-28):
Found contrib.meta_graph_transform() which looks promising.
Update (2018-12-03):
I opened a related GitHub issue, which seems to be resolved by a detailed blog post listed at the end of the ticket.
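In case it helps future readers, here is a rough sketch of the general pattern (untested, and not the blog post's exact code): import the transformed graph_def into a fresh graph and rebuild the serving signature with SavedModelBuilder. Since quantize_weights folds the weights into constants, there should be no variables left to save. other_graph_def is the transformed GraphDef from the snippet above; node names are the ones from my model.

import tensorflow as tf

with tf.Graph().as_default() as graph:
    # Import the transformed graph into a clean graph (no "import/" prefix).
    tf.import_graph_def(other_graph_def, name="")
    in_tensor = graph.get_tensor_by_name("Placeholder:0")
    out_tensor = graph.get_tensor_by_name(
        "dnn/head/predictions/probabilities:0")

    with tf.Session(graph=graph) as sess:
        # Build a prediction signature from the input/output tensors.
        signature = tf.saved_model.signature_def_utils.predict_signature_def(
            inputs={"inputs": in_tensor}, outputs={"outputs": out_tensor})

        key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
        builder = tf.saved_model.builder.SavedModelBuilder("./new_servable")
        builder.add_meta_graph_and_variables(
            sess,
            [tf.saved_model.tag_constants.SERVING],
            signature_def_map={key: signature})
        builder.save()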
Related
I trained a Pipeline model, which uses CountVectorizer, TfidfTransformer, OneVsRestClassifier and also a GridSearchCV.
Now I want to save it into a tflite file, to use it on my Android app.
For a Sequential model (where my tflite file was created successfully), I did:
sequential_model = Sequential()
...
# train and fit the model
...
h5_file = "h5_model.h5"
tflite_file = "tflite_model.tflite"
sequential_model.save(h5_file)
converter = tf.lite.TFLiteConverter.from_keras_model_file(h5_file)
tflite_model = converter.convert()
open(tflite_file, "wb").write(tflite_model)
So saving a Sequential model to a tflite file works fine.
Well, Pipeline has no attribute "save", unlike a Sequential model, so I tried saving the Pipeline model with joblib and then with pickle, but neither of them worked.
Let's say that pipeline_model is my trained model (the one described in the first sentence).
import joblib  # or: from sklearn.externals import joblib on older scikit-learn

pb_file = 'pipeline_model.pb'
# I also tried with other extensions, like h5, hdf5, sav, pkl
joblib.dump(pipeline_model, pb_file)
# or with the pickle equivalent and a pkl extension
# pickle.dump(pipeline_model, open(pb_file, 'wb'))
Now the pb file is created and I want to create a tflite one. Since it's not a Keras model, I can't use from_keras_model_file, so I tried instead with from_saved_model.
pb_file = 'pipeline_model.pb'
tflite_file = "tflite_model.tflite"
converter = tf.lite.TFLiteConverter.from_saved_model(pb_file)
tflite_model = converter.convert()
open(tflite_file, "wb").write(tflite_model)
It generates the following error on the converter = ... line:
OSError: SavedModel file does not exist at: pb_file.pb/{saved_model.pbtxt|saved_model.pb}
I tried running it on Kaggle, Colab, PyCharm IDE, with both versions of tensorflow (1 and 2), with different file extensions and nothing seems to work.
I also noticed that TFLiteConverter contains the methods from_frozen_graph and from_session, but these two require extra parameters, so I don't think they could be the solution.
So, how can I obtain my tflite file from the trained Pipeline model? Please, if you find any solution, tell me the library versions that you used, since there could be a different behaviour on different libs.
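For context on that OSError: tf.lite.TFLiteConverter.from_saved_model expects the path to a SavedModel directory (the folder containing saved_model.pb and a variables/ subfolder, as produced by TensorFlow's own export APIs), not the path to an arbitrary .pb or pickled file. A hypothetical call would look like this (the directory path is made up):

import tensorflow as tf

# saved_model_dir must be a directory produced by a TensorFlow export API
# (e.g. tf.saved_model.simple_save or estimator.export_savedmodel),
# containing saved_model.pb and a variables/ subfolder.
saved_model_dir = "./exported_model/1519232535"  # hypothetical path
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()
open("tflite_model.tflite", "wb").write(tflite_model)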
I trained a model on darknet using the YOLOv3-SPP model. I need to be able to use this model in my iPhone app, so I need to convert it to CoreML. I started by converting the .weights file to a .pb file. Now I am trying to convert it from TensorFlow to CoreML with tfcoreml, but I cannot seem to determine my input and output tensor names. I tried to use TensorBoard to visualize the model and determine the inputs and outputs, but since I am quite new to TensorFlow I can't figure out what to use. I am using this script to convert the model from TensorFlow to CoreML:
import tfcoreml
import os
import tensorflow as tf

frozen_model_file = os.path.abspath('frozen_darknet_yolov3_model.pb')
input_tensor_shapes = {"input/placeholder:0": [1, 32, 32, 9]}
# Output CoreML model path
coreml_model_file = './model.mlmodel'
output_tensor_names = ['output/prediction:0']

def convert():
    # Read the pb model
    with tf.gfile.GFile(frozen_model_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    # Then, we import the graph_def into a new Graph
    tf.import_graph_def(graph_def, name="")

    # Convert
    tfcoreml.convert(
        tf_model_path=frozen_model_file,
        mlmodel_path=coreml_model_file,
        input_name_shape_dict=input_tensor_shapes,
        output_feature_names=output_tensor_names)

convert()
This is what my TensorBoard graph looks like:
What should I set input_tensor_shapes and output_tensor_names to so that I don't get an error saying that my TensorFlow graph does not contain a tensor with that name?
I suggest using Netron to view the TensorFlow file. It makes the graph much easier to understand.
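Alternatively, you can list likely inputs and outputs directly from the frozen .pb: Placeholder nodes are the graph inputs, and nodes whose output is never consumed by another node are candidate outputs. A quick sketch (TF 1.x, untested against this particular model):

import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile("frozen_darknet_yolov3_model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Placeholders are the graph inputs.
inputs = [n.name for n in graph_def.node if n.op == "Placeholder"]

# Nodes that no other node consumes are candidate outputs.
consumed = {i.split(":")[0].lstrip("^") for n in graph_def.node for i in n.input}
outputs = [n.name for n in graph_def.node if n.name not in consumed]

print("inputs:", inputs)
print("candidate outputs:", outputs)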
I am trying to freeze a pre-trained model, then convert it to TF Lite and deploy it to an Android device.
By inspecting the resulting .pb file with xxd, I see that it only contains the placeholder output variable. The size of the .pb file is a few bytes.
Why are the graph and variables not included in the model?
I used the code below derived from https://github.com/sankit1/cv-tricks.com/tree/master/Tensorflow-tutorials/freeze_model_and_deploy. It works fine with other models but not with mine.
import tensorflow as tf
from tensorflow.python.framework import graph_util
import os, sys

path = './model-gaze/'
output_node_names = "pos"
model_name = 'model-23'

saver = tf.train.import_meta_graph(path+model_name+'.meta', clear_devices=True)
graph = tf.get_default_graph()
input_graph_def = graph.as_graph_def()
sess = tf.Session()
saver.restore(sess, path+model_name)

output_graph_def = graph_util.convert_variables_to_constants(
    sess,                         # The session is used to retrieve the weights
    input_graph_def,              # The graph_def is used to retrieve the nodes
    output_node_names.split(",")  # The output node names are used to select the useful nodes
)

output_graph = path+model_name+".pb"
with tf.gfile.GFile(output_graph, "wb") as f:
    f.write(output_graph_def.SerializeToString())

sess.close()
I would expect all the weights and graph data to be included in the .pb, but I cannot manage to get them there.
The link you are following describes the right procedure for freezing a TensorFlow model.
Freezing a model reduces its size because it only keeps the subgraph needed to compute the output_node_names you ask it to store.
Please refer to the link below for the entire process.
https://cv-tricks.com/how-to/freeze-tensorflow-models/
Here, can you please elaborate on what "pos" is?
Also, if you pass the prediction op here, since that is the final op required for predictions, it should work fine.
If this does not help, please share your model and the code you used to save it, so we can debug the issue further.
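One way to check whether the weights actually made it into the frozen file is to load the .pb back and look at its nodes, e.g. (a quick TF 1.x check; the path matches the script above):

import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile('./model-gaze/model-23.pb', "rb") as f:
    graph_def.ParseFromString(f.read())

# A properly frozen graph should contain many nodes, including the Const
# nodes that hold the weights, not just a single placeholder.
print(len(graph_def.node))
print([n.name for n in graph_def.node][:20])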
I have the .meta and .ckpt files of a TensorFlow model. I want to know the exact input and output node names, but I am getting a long list of node names by following this.
When I have a frozen protobuf model, I can get the input and output node names as the first and last entries of the list using this code:
import tensorflow as tf
from tensorflow.python.platform import gfile

GRAPH_PB_PATH = 'frozen_model.pb'
with tf.Session() as sess:
    print("load graph")
    with gfile.FastGFile(GRAPH_PB_PATH, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        sess.graph.as_default()
        tf.import_graph_def(graph_def, name='')
        graph_nodes = [n for n in graph_def.node]
        names = []
        for t in graph_nodes:
            names.append(t.name)
        print(names)
Can I do something similar for the .ckpt or .meta files?
The .meta file contains information about the different nodes in the TensorFlow graph. This is explained in more detail here.
The values of the different variables in the graph at that moment are stored separately in the checkpoint folder, in the checkpoint.data-xxxx-of-xxxx files.
There is no concept of an input or output node in the normal checkpoint process, as opposed to the case of a frozen model. Freezing a model outputs a subset of the whole TensorFlow graph: this subset contains only the nodes that the output node depends on. Because freezing a model is done for serving purposes, it converts the TensorFlow variables to constants, eliminating the need to store additional information like the gradients of the different variables at each step.
If you still want to identify the nodes you would be interested in, you can restore your graph from the .meta file and visualize it in tensorboard.
import tensorflow as tf
from tensorflow.summary import FileWriter
sess = tf.Session()
tf.train.import_meta_graph("your-meta-graph-file.meta")
FileWriter("__tb", sess.graph)
This will create a __tb folder in your current directory and you can then view the graph by issuing the following command.
tensorboard --logdir __tb
Here is a link to the screenshot of some model with a node selected. You can get the name of the node from the top right corner.
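If you just want the raw list of node names without TensorBoard, a minimal sketch (TF 1.x) that restores the graph structure from the .meta file and prints the operation names:

import tensorflow as tf

# Rebuild the graph structure from the .meta file (no checkpoint needed
# just to list node names).
tf.train.import_meta_graph("your-meta-graph-file.meta", clear_devices=True)
graph = tf.get_default_graph()
print([op.name for op in graph.get_operations()])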
I trained my model with TensorFlow's Estimator API. It seems that export_savedmodel should be used to make a .pb file, but I don't really know how to construct the serving_input_receiver_fn. Does anybody have any ideas?
Example code is welcome.
Extra questions:
Is the .pb the only file I need when I want to reload the model? Are the variables unnecessary?
How much will the .pb reduce the model file size compared with the .ckpt (with the Adam optimizer)?
You can use freeze_graph.py to produce a .pb from the .ckpt + .pbtxt.
If you're using tf.estimator.Estimator, you'll find these two files in the model_dir.

python freeze_graph.py \
    --input_graph=graph.pbtxt \
    --input_checkpoint=model.ckpt-308 \
    --output_graph=output_graph.pb \
    --output_node_names=<output_node>
Is the .pb the only file I need when I want to reload the model? Are the variables unnecessary?
Yes, but you'll have to know your model's input and output node names too. Then use import_graph_def to load the .pb file and get the input and output operations with get_operation_by_name.
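A sketch of that reload path (TF 1.x; the node names below are placeholders, substitute the ones from your own graph):

import tensorflow as tf

# Load the frozen .pb into a GraphDef.
graph_def = tf.GraphDef()
with tf.gfile.GFile("output_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Import it into a fresh graph and look up the input/output operations.
with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
    input_tensor = graph.get_operation_by_name("<input_node>").outputs[0]
    output_tensor = graph.get_operation_by_name("<output_node>").outputs[0]
    with tf.Session(graph=graph) as sess:
        # predictions = sess.run(output_tensor, feed_dict={input_tensor: batch})
        pass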
How much will the .pb reduce the model file size compared with the .ckpt (with the Adam optimizer)?
A .pb file is not a compressed .ckpt file, so there is no "compression rate".
However, there is a way to optimize your .pb file for inference, and this optimization may reduce the file size, as it removes the parts of the graph that are training-only operations (see the complete description here).
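For example, the optimize_for_inference tool shipped with TF 1.x can be run like this (flag names taken from that tool; substitute your own node names):

python -m tensorflow.python.tools.optimize_for_inference \
    --input=output_graph.pb \
    --output=optimized_graph.pb \
    --frozen_graph=True \
    --input_names=<input_node> \
    --output_names=<output_node>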
[comment] How can I get the input and output node names?
You choose the input and output node names yourself when you build the graph, via the name parameter of the corresponding ops.
To list the node names in your .pbtxt file, use the following script.
import tensorflow as tf
from google.protobuf import text_format

with open('graph.pbtxt') as f:
    graph_def = text_format.Parse(f.read(), tf.GraphDef())

print([n.name for n in graph_def.node])
[comment] I found that there is a tf.estimator.Estimator.export_savedmodel(); is that the function to store the model as a .pb directly? I'm struggling with its parameter serving_input_receiver_fn. Any ideas?
export_savedmodel() generates a SavedModel, which is a universal serialization format for TensorFlow models. It should contain everything needed to work with the TensorFlow Serving APIs.
serving_input_receiver_fn() is one of the things you have to provide in order to generate a SavedModel; it determines the input signature of your model by adding placeholders to the graph.
From the doc:

This function has the following purposes:
- To add placeholders to the graph that the serving system will feed with inference requests.
- To add any additional ops needed to convert data from the input format into the feature Tensors expected by the model.
If you're receiving your inference requests in the form of serialized tf.Examples (which is a typical pattern) then you can use the example provided in the doc.
feature_spec = {'foo': tf.FixedLenFeature(...),
                'bar': tf.VarLenFeature(...)}

def serving_input_receiver_fn():
    """An input receiver that expects a serialized tf.Example."""
    serialized_tf_example = tf.placeholder(dtype=tf.string,
                                           shape=[default_batch_size],
                                           name='input_example_tensor')
    receiver_tensors = {'examples': serialized_tf_example}
    features = tf.parse_example(serialized_tf_example, feature_spec)
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)
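With a receiver function like the one above, exporting is then a single call (assuming estimator is your trained tf.estimator.Estimator; the export directory below is arbitrary):

estimator.export_savedmodel("./exported_model", serving_input_receiver_fn)

# If you prefer to feed raw feature tensors instead of serialized tf.Examples,
# tf.estimator.export.build_raw_serving_input_receiver_fn builds a receiver fn
# from a dict of placeholders (the feature name and shape here are hypothetical):
raw_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    {"foo": tf.placeholder(dtype=tf.float32, shape=[None, 10], name="foo")})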
[comment] Any idea how to list the node names in a .pb?
It depends on how it was generated.
If it's a SavedModel, then use:
import tensorflow as tf

with tf.Session() as sess:
    meta_graph_def = tf.saved_model.loader.load(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        './saved_models/1519232535')
    print([n.name for n in meta_graph_def.graph_def.node])
If it's a plain GraphDef (e.g. a frozen graph), then use:
import tensorflow as tf
from tensorflow.python.platform import gfile

with tf.Session() as sess:
    with gfile.FastGFile('model.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        sess.graph.as_default()
        tf.import_graph_def(graph_def, name='')
        print([n.name for n in graph_def.node])