Quantize a Keras neural network model


Recently, I've started creating neural networks with TensorFlow + Keras and I would like to try the quantization feature available in TensorFlow. So far, experimenting with examples from the TF tutorials has worked fine, and I have this basic working example (from https://www.tensorflow.org/tutorials/keras/basic_classification):
import tensorflow as tf
from tensorflow import keras
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# fashion mnist data labels (indexes related to their respective labelling in the data set)
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# preprocess the train and test images
train_images = train_images / 255.0
test_images = test_images / 255.0
# settings variables
input_shape = (train_images.shape[1], train_images.shape[2])
# create the model layers
model = keras.Sequential([
    keras.layers.Flatten(input_shape=input_shape),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
# compile the model with added settings
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# train the model
epochs = 3
model.fit(train_images, train_labels, epochs=epochs)
# evaluate the accuracy of model on test data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)
Now I would like to employ quantization in the learning and classification process. The quantization documentation (https://www.tensorflow.org/performance/quantization; the page has been unavailable since about September 15, 2018) suggests using this piece of code:
loss = tf.losses.get_total_loss()
tf.contrib.quantize.create_training_graph(quant_delay=2000000)
optimizer = tf.train.GradientDescentOptimizer(0.00001)
optimizer.minimize(loss)
However, it gives no information about where this code should be used or how it should be connected to TF code (let alone to a high-level model created with Keras). I have no idea how this quantization part relates to the previously created neural network model. Simply inserting it after the neural network code produces the following error:
Traceback (most recent call last):
  File "so.py", line 41, in <module>
    loss = tf.losses.get_total_loss()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/losses/util.py", line 112, in get_total_loss
    return math_ops.add_n(losses, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 2119, in add_n
    raise ValueError("inputs must be a list of at least one Tensor with the "
ValueError: inputs must be a list of at least one Tensor with the same dtype and shape
Is it possible to quantize a Keras NN model in this way or am I missing something basic?
A possible solution that crossed my mind could be using low level TF API instead of Keras (needing to do quite a bit of work to construct the model), or maybe trying to extract some of the lower level methods from the Keras models.

As mentioned in other answers, TensorFlow Lite can help you with network quantization.
TensorFlow Lite provides several levels of support for quantization.
Tensorflow Lite post-training quantization quantizes weights and
activations post training easily. Quantization-aware training allows
for training of networks that can be quantized with minimal accuracy
drop; this is only available for a subset of convolutional neural
network architectures.
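To make the term concrete: 8-bit quantization stores each float as one of 256 integer levels, together with a scale and a zero point that map the integers back to floats. The following is a minimal plain-Python sketch of that idea. The function names `quantize` and `dequantize` are illustrative and not part of TensorFlow Lite, whose actual scheme has more detail than is shown here.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using an affine (scale + zero point) scheme."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The represented range must contain 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:          # all values are zero; any positive scale works
        scale = 1.0
    zero_point = round(qmin - lo / scale)
    quantized = [
        min(qmax, max(qmin, round(v / scale) + zero_point))  # round, shift, clamp
        for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the integer representation."""
    return [scale * (q - zero_point) for q in quantized]


# Hypothetical weights standing in for one layer of a trained model.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
quantized, scale, zero_point = quantize(weights)
restored = dequantize(quantized, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(quantized, scale, zero_point, max_error)
```

The round trip is lossy: each restored weight differs from the original by at most about one quantization step (`scale`), which is the accuracy cost that both approaches described here try to keep small.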
So first, you need to decide whether you need post-training quantization or quantization-aware training. For example, if you have already saved the model as *.h5 files, you would probably want to follow @Mitiku's instructions and do post-training quantization.
If you prefer to achieve higher performance by simulating the effect of quantization during training (using the method you quoted in the question), and your model is in the subset of CNN architectures supported by quantization-aware training, this example may help you with the interaction between Keras and TensorFlow. Basically, you only need to add this code between the model definition and its fitting:
sess = tf.keras.backend.get_session()
tf.contrib.quantize.create_training_graph(sess.graph)
sess.run(tf.global_variables_initializer())
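What "simulating the effect of quantization in training" means can be sketched without TensorFlow. In the toy example below, the forward pass uses a weight rounded to a coarse grid, while the gradient update is applied to the underlying float weight as if the rounding were not there (often called a straight-through estimator). This is only an illustration under those assumptions, not a description of what tf.contrib.quantize does internally; `fake_quant`, `train`, the grid step and the data are all hypothetical.

```python
def fake_quant(w, step=0.25):
    """Round w to the nearest multiple of `step`, keeping it a float."""
    return round(w / step) * step


def train(xs, ys, w=0.0, lr=0.05, epochs=50, step=0.25):
    """Fit y = w * x while the forward pass only ever sees the quantized weight."""
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            wq = fake_quant(w, step)   # forward pass uses the quantized weight
            err = wq * x - y
            grad = 2 * err * x         # straight-through: treat d(wq)/d(w) as 1
            w -= lr * grad             # update the underlying float weight
    return w, fake_quant(w, step)


xs = [1.0, 2.0, 3.0]
ys = [0.5 * x for x in xs]             # the target weight 0.5 lies on the grid
float_w, quantized_w = train(xs, ys)
print(float_w, quantized_w)
```

The float weight settles wherever its quantized version fits the data, so the model that is eventually exported in quantized form has already been trained against its own rounding error.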

As your network looks quite simple, you can maybe use Tensorflow lite.

TensorFlow Lite can be used to quantize a Keras model.
The following code was written for TensorFlow 1.14. It might not work for earlier versions.
First, after training the model, save it to an HDF5 (.h5) file:
model.fit(train_images, train_labels, epochs=epochs)
# evaluate the accuracy of model on test data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)
model.save("model.h5")
To load the Keras model, use tf.lite.TFLiteConverter.from_keras_model_file:
# load the previously saved model
converter = tf.lite.TFLiteConverter.from_keras_model_file("model.h5")
tflite_model = converter.convert()
# Save the model to file
with open("tflite_model.tflite", "wb") as output_file:
    output_file.write(tflite_model)
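Note that a conversion without further options keeps the weights in float32. The sketch below shows one way to request post-training quantization and compare the resulting sizes. It assumes TensorFlow 2.x, where `tf.lite.TFLiteConverter.from_keras_model` takes an in-memory model (the 1.14 code above uses `from_keras_model_file` instead). The helper `to_tflite` and the untrained stand-in model are illustrative only.

```python
import tensorflow as tf

# Untrained stand-in with the same layer sizes as the model in the question.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])


def to_tflite(keras_model, quantize):
    """Convert a Keras model to a TFLite flatbuffer, optionally quantized."""
    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    if quantize:
        # Ask the converter to apply its default post-training optimizations,
        # which include storing large weight tensors as 8-bit integers.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()


float_bytes = to_tflite(model, quantize=False)
quant_bytes = to_tflite(model, quantize=True)
print("float model size:", len(float_bytes))
print("quantized model size:", len(quant_bytes))
```

Either result can be written to disk and run with the interpreter in the same way as the unquantized file.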
The saved model can be loaded from a Python script or on other platforms and in other languages. To use a saved TFLite model, tf.lite provides the Interpreter class. The following example from here shows how to load a TFLite model from a local file in Python.
import numpy as np
import tensorflow as tf
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="tflite_model.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

Related

Saved pre-train model layer weights but cannot load the weights through H5PY

I have been trying to save the weights of my neural network model so that I could use a few of its layers for another neural network model to be trained on another dataset.
pre-trained model:
model = Sequential()
model.add(tf.keras.layers.Dense(100, input_shape=(X_train_orig_sm.shape)))
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10))
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(1))
model.add(tf.keras.layers.Activation('sigmoid'))
model.summary()
# need sparse otherwise shape is wrong. check why
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print('Fitting the data to the model')
batch_size = 20
epochs = 10
history = model.fit(X_train_orig_sm, Y_train_orig_sm, batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.2)
print('Evaluating the test data on the model')
How I saved the weights of neural network:
model.save_weights("dnn_model.h5")
How I try to load the weights of neural network:
dnn_model=model.load_weights("dnn_model.h5")
dnn_model.layers[5]
While trying to load the model, I get the following error:
AttributeError: 'NoneType' object has no attribute 'layers'
I don't seem to understand why the layers of the neural network are not recognised even though the pre-trained neural network was trained before the model was saved. Any advice, solution or direction will be highly appreciated. Thank you.

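When you call model.save_weights("dnn_model.h5"), you save only the weights of the model, not its structure. In addition, model.load_weights() restores those weights into the existing model in place and, for an HDF5 file, returns None. That is why dnn_model is None and dnn_model.layers raises the AttributeError; the layers are still available on model itself.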
To save the actual model, you can call the below.
# save
model.save('dnn_model') # save as pb
model.save('dnn_model.h5') # save as HDF5
# load
dnn_model = tf.keras.models.load_model('dnn_model') # load as pb
dnn_model = tf.keras.models.load_model('dnn_model.h5') # load as HDF5
Note: You do not need to add an extension to the name to save as pb.
Source: https://www.tensorflow.org/tutorials/keras/save_and_load
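If only the weights were saved, they can still be reused by rebuilding the same architecture in code and loading the weights into it. A minimal sketch, assuming TensorFlow 2.x; `build_model`, the layer sizes and the file name are hypothetical stand-ins, not the asker's actual model.

```python
import os
import tempfile

import tensorflow as tf


def build_model():
    """Hypothetical stand-in for the architecture used during pre-training."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(3,)),
        tf.keras.layers.Dense(4, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])


# Save only the weights of a (here untrained) model.
pretrained = build_model()
weights_path = os.path.join(tempfile.mkdtemp(), "dnn_demo.weights.h5")
pretrained.save_weights(weights_path)

# Rebuild the same architecture, then load the weights into it in place.
# The return value of load_weights() is not the model, so it is not assigned.
restored = build_model()
restored.load_weights(weights_path)

# Layers are accessed on the model object itself.
last_layer = restored.layers[-1]
print(last_layer.name, [w.shape for w in last_layer.get_weights()])
```

The individual layers of `restored` can then be reused when assembling another model.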

Getting keras predictions as a tensor graph for use in tensorflow

I currently have a custom LSTM model that I have saved as a .h5 file using save(). I am loading this model using load_model() during a tensorflow graph construction, and want to construct a part of the graph using the LSTM model's prediction output (which I therefore need in the form of a tensor). I have established the same session for the tensorflow graph and the keras backend graph, but I am having trouble connecting the output into my tensorflow code graph. Using the standard predict() seems to attempt to run the keras model's session, and I have scoured the internet for something other than hideously converting it to a .pb file and messing with it like that. It seems like it should be easy, considering I am using tensorflow as the Keras backend...Any ideas on how to achieve this?

I will show how to import a saved Keras model into a TensorFlow graph, using a simple single-layer feed-forward model.
inputs = tf.keras.layers.Input(shape=(1,), name="inputs")
outputs = tf.keras.layers.Dense(1, activation="linear", name="outputs")(inputs)
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)
model.compile(loss="mse", optimizer="adam")
model.save("model.h5")
Now let's load the model using Keras's load_model method and use it in TensorFlow to multiply the output of the model by a new placeholder tensor.
model = tf.keras.models.load_model("model.h5")
model_output = model.output
new_tensor_ph = tf.placeholder(tf.float32, [None, 1])
new_output = tf.multiply(model_output, new_tensor_ph)
sess = tf.keras.backend.get_session()
prediction = sess.run(new_output, feed_dict={model.input: [[3]], new_tensor_ph: [[4]]})
# This works without error

Training accuracy graph with model_to_estimator

I have a Keras sequential model and I'm using:
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
I can see the training accuracy printed when I use Keras fit() function to train the model.
I need to use Estimator API to train the model and I'm using model_to_estimator to convert the model to estimator. Then I use train_and_evaluate() to train the model.
However I don't see the accuracy graph in Tensorboard. There's only one accuracy value (from evaluation), so the graph is just a dot.
What I need is the accuracy graph from training like the one shown here:
https://www.tensorflow.org/guide/custom_estimators#tensorboard
I checked the examples and all I could find were ones where they use Estimator API to build the model and use following code to define a summary scalar.
# Compute evaluation metrics.
accuracy = tf.metrics.accuracy(labels=labels,
                               predictions=predicted_classes,
                               name='acc_op')
metrics = {'accuracy': accuracy}
tf.summary.scalar('accuracy', accuracy[1])
Does anyone know how to use this with models converted from Keras?
I'm using Tensorflow version r1.10.

Loading trained Tensorflow model into estimator

Say that I have trained a Tensorflow Estimator:
estimator = tf.contrib.learn.Estimator(
    model_fn=model_fn,
    model_dir=MODEL_DIR,
    config=some_config)
And I fit it to some train data:
estimator.fit(input_fn=input_fn_train, steps=None)
The idea is that a model is fit to my MODEL_DIR. This folder contains a checkpoint and several files of .meta and .index.
This works perfectly. I want to do some predictions using my functions:
estimator = tf.contrib.learn.Estimator(
    model_fn=model_fn,
    model_dir=MODEL_DIR,
    config=some_config)
predictions = estimator.predict(input_fn=input_fn_test)
My solution works perfectly but there is one big disadvantage: you need to know model_fn, which is my model defined in Python. But if I change the model by adding a dense layer in my Python code, this model is incorrect for the saved data in MODEL_DIR, leading to incorrect results:
NotFoundError (see above for traceback): Key xxxx/dense/kernel not found in checkpoint
How do I cope with this? How can I load my model / estimator such that I can make predictions on some new data? How can I load model_fn or the estimator from MODEL_DIR?

Avoiding a bad restoration
Restoring a model's state from a checkpoint only works if the model and checkpoint are compatible. For example, suppose you trained a DNNClassifier Estimator containing two hidden layers, each having 10 nodes:
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 10],
    n_classes=3,
    model_dir='models/iris')
classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size=100),
    steps=200)
After training (and, therefore, after creating checkpoints in models/iris), imagine that you changed the number of neurons in each hidden layer from 10 to 20 and then attempted to retrain the model:
classifier2 = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[20, 20],  # Change the number of neurons in the model.
    n_classes=3,
    model_dir='models/iris')
classifier2.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size=100),
    steps=200)
Since the state in the checkpoint is incompatible with the model described in classifier2, retraining fails with the following error:
...
InvalidArgumentError (see above for traceback): tensor_name =
dnn/hiddenlayer_1/bias/t_0/Adagrad; shape in shape_and_slice spec [10]
does not match the shape stored in checkpoint: [20]
To run experiments in which you train and compare slightly different versions of a model, save a copy of the code that created each model_dir, possibly by creating a separate git branch for each version. This separation will keep your checkpoints recoverable.
Copied from the TensorFlow checkpoints documentation:
https://www.tensorflow.org/get_started/checkpoints
Hope that helps.
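One way to see in advance whether a checkpoint and a modified model are compatible is to compare their variable names and shapes. The sketch below does this in plain Python on two name-to-shape mappings. The helper `diff_checkpoint` and all names and shapes in the example are hypothetical; in practice the checkpoint side could be built from the (name, shape) pairs returned by tf.train.list_variables for the model directory.

```python
def diff_checkpoint(model_vars, checkpoint_vars):
    """Compare two {variable_name: shape} mappings.

    Returns the variables the model expects but the checkpoint lacks, and the
    variables present in both whose shapes differ, as
    (name, checkpoint_shape, model_shape) tuples.
    """
    missing = sorted(name for name in model_vars if name not in checkpoint_vars)
    mismatched = sorted(
        (name, tuple(checkpoint_vars[name]), tuple(shape))
        for name, shape in model_vars.items()
        if name in checkpoint_vars and tuple(checkpoint_vars[name]) != tuple(shape)
    )
    return missing, mismatched


# Hypothetical checkpoint written by a model with hidden_units=[10, 10].
checkpoint_vars = {
    "dnn/hiddenlayer_0/kernel": (4, 10),
    "dnn/hiddenlayer_0/bias": (10,),
    "dnn/hiddenlayer_1/kernel": (10, 10),
    "dnn/hiddenlayer_1/bias": (10,),
}
# Hypothetical model rebuilt with hidden_units=[20, 20] and one extra layer.
model_vars = {
    "dnn/hiddenlayer_0/kernel": (4, 20),
    "dnn/hiddenlayer_0/bias": (20,),
    "dnn/hiddenlayer_1/kernel": (20, 20),
    "dnn/hiddenlayer_1/bias": (20,),
    "dnn/hiddenlayer_2/kernel": (20, 20),
}

missing, mismatched = diff_checkpoint(model_vars, checkpoint_vars)
for name in missing:
    print("not in checkpoint:", name)
for name, ckpt_shape, model_shape in mismatched:
    print("shape mismatch:", name, ckpt_shape, "vs", model_shape)
```

A missing name corresponds to the "Key ... not found in checkpoint" error from the question, and a shape difference corresponds to the shape error quoted above.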

Learning Keras model by using Distributed Tensorflow

I have two GPUs installed on two different machines. I want to build a cluster that allows me to train a Keras model using the two GPUs together.
The Keras blog shows two snippets of code in its Distributed training section and links to the official TensorFlow documentation.
My problem is that I don't know how to train my model and put into practice what is described in the TensorFlow documentation.
For example, what should I do if I want to execute the following code on a cluster of multiple GPUs?
# For a single-input model with 2 classes (binary classification):
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)

In the first and second part of the blog he explains how to use keras models with tensorflow.
Also I found this example of keras with distributed training.
And here is another with horovod.
