I have a simple PyTorch model which I converted to ONNX and eventually to TFLite.
When I load the model and run inference with tf.lite, all goes well.
However, when I try to load the model and run inference with tflite_runtime, I get the following error:
RuntimeError: external/org_tensorflow/tensorflow/lite/kernels/add.cc:385 Type INT64 is unsupported by op Add. Node number 70 (ADD) failed to invoke.
Here is the conversion code I'm currently using with TF2.6:
converter = tf.lite.TFLiteConverter.from_saved_model(path)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
converter.allow_custom_ops=True # if omitted, conversion fails
tflite_rep = converter.convert()
open('exports/deep_snore.tflite', 'wb').write(tflite_rep)
I have checked many TensorFlow blog posts, but I can't figure out where the issue is.
The only solution I can think of is to rewrite the model in TensorFlow, retrain it, and convert it to TFLite.
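For reference, here is the kind of check I can run with the full tf.lite Interpreter (which does load the model) to see which tensors end up as INT64; the model path matches the export above, and this is only a diagnostic sketch, not a fix:
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='exports/deep_snore.tflite')
interpreter.allocate_tensors()

# List every tensor stored as INT64 so the inputs of the failing ADD node
# can be traced back to the original ONNX/PyTorch graph.
for detail in interpreter.get_tensor_details():
    if detail['dtype'] == np.int64:
        print(detail['index'], detail['name'], detail['shape'])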
Related
I'm trying to convert a PyTorch ML model into Core ML. As shown in this WWDC video, I first converted it to a TorchScript version.
But the problem happens when converting the TorchScript version to Core ML.
At first, I did the following:
import coremltools as ct
model = ct.convert(
    traced_model,
    inputs=[ct.TensorType(shape=input_ids.shape), ct.TensorType(shape=attention_mask.shape)],
    outputs=[ct.TensorType(shape=decoder_input_ids.shape)]
)
But it was giving me an error saying:
ValueError: The 'shape' argument must not be specified for the outputs, since it is automatically inferred from the input shapes and the ops in the model
So I used the following code to convert it to Core ML:
import coremltools as ct
model = ct.convert(
    traced_model,
    inputs=[ct.TensorType(shape=input_ids.shape), ct.TensorType(shape=attention_mask.shape), ct.TensorType(shape=decoder_input_ids.shape)]
)
This time the code block ran successfully, but when I actually downloaded the converted Core ML model, it was not detecting decoder_input_ids as one of the inputs.
How can I fix this problem, and what am I doing wrong? By the way, the model is a Seq2Seq model.
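One variant I can think of is to name the inputs explicitly, so that decoder_input_ids at least appears under its own name; the sketch below assumes the argument order of the traced model, and I am not sure it addresses the underlying issue:
import coremltools as ct

# Sketch: same conversion, but with explicitly named inputs
# (the names are assumptions based on the traced model's argument order).
model = ct.convert(
    traced_model,
    inputs=[
        ct.TensorType(name="input_ids", shape=input_ids.shape),
        ct.TensorType(name="attention_mask", shape=attention_mask.shape),
        ct.TensorType(name="decoder_input_ids", shape=decoder_input_ids.shape),
    ],
)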
I want to use a generator to quantize an LSTM model.
Questions
I start with the question, as this is quite a long post.
I actually want to know if you have managed to quantize (int8) an LSTM model with post-training quantization.
I tried different TF versions but always bumped into an error. Below are some of my attempts. Maybe you can spot a mistake I made or have a suggestion.
Thanks
Working Part
The input is expected as (batch, 1, 45). Running inference with the unquantized model works fine. The model and CSV file can be found here:
csv file: https://mega.nz/file/5FciFDaR#Ev33Ij124vUmOF02jWLu0azxZs-Yahyp6PPGOqr8tok
modelfile: https://mega.nz/file/UAMgUBQA#oK-E0LjZ2YfShPlhHN3uKg8t7bALc2VAONpFirwbmys
import tensorflow as tf
import numpy as np
import pathlib as path
import pandas as pd

def reshape_for_Lstm(data):
    timesteps = 1
    samples = int(np.floor(data.shape[0] / timesteps))
    data = data.reshape((samples, timesteps, data.shape[1]))  # samples, timesteps, sensors
    return data

if __name__ == '__main__':
    # GET DATA
    data = pd.read_csv('./test_x_data_OOP3.csv', index_col=[0])
    data = np.array(data)
    data = reshape_for_Lstm(data)

    # LOAD MODEL
    saved_model_dir = path.Path.cwd() / 'model' / 'singnature_model_tf_2.7.0-dev20210914'
    model = tf.keras.models.load_model(saved_model_dir)

    # INFERENCE
    [yhat, yclass] = model.predict(data)
    Yclass = [np.argmax(yclass[i], 0) for i in range(len(yclass))]  # get final class
    print('all good')
The shape and dtype of the variable data are (20000, 1, 45) and float64.
Where it goes wrong
Now I want to quantize the model, but depending on the TensorFlow version I run into different errors.
The converter options I have tried are merged below (the commented-out lines are alternatives I experimented with):
converter=tf.lite.TFLiteConverter.from_saved_model('./model/singnature_model_tf_2.7.0-dev20210914')
converter.representative_dataset = batch_generator
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.experimental_new_converter = False
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.TFLITE_BUILTINS]
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
#converter._experimental_lower_tensor_list_ops = False
converter.target_spec.supported_types = [tf.int8]
quantized_tflite_model = converter.convert()
TensorFlow 2.2
Using TF 2.2, as often suggested on GitHub, I run into unsupported operators in TFLite. I used a model created with TF 2.2 to ensure version support. Here, only TOCO conversion is supported.
Some of the operators in the model are not supported by the standard
TensorFlow Lite runtime and are not recognized by TensorFlow.
The error does not depend on the converter.target_spec.supported_ops options, so I could not find a solution there. allow_custom_ops only shifts the problem.
There are quite a few GitHub issues on this (just some examples), but none of the suggested options worked.
One is to try the new MLIR converter; however, in 2.2 the integer-only conversion for MLIR was not implemented yet.
So let's try a newer version.
TensorFlow 2.5.0
Then I tried a well-vetted version. Here, no matter which converter.target_spec.supported_ops I use, I run into the following error with the MLIR conversion, in calibrator.py:
ValueError: Failed to parse the model: pybind11::init(): factory
function returned nullptr.
The solution suggested on GitHub is to use TF==2.2.0.
With TOCO conversion, I get the following error:
tensorflow/lite/toco/allocate_transient_arrays.cc:181] An array,
StatefulPartitionedCall/StatefulPartitionedCall/model/lstm/TensorArrayUnstack/TensorListFromTensor,
still does not have a known data type after all graph transformations
have run. Fatal Python error: Aborted
I did not find anything on this error.
Maybe it is solved in 2.6.
TensorFlow 2.6.0
Here, no matter which converter.target_spec.supported_ops I use, I run into the following error:
ValueError: Failed to parse the model: Only models with a single
subgraph are supported, model had 5 subgraphs.
The model is a five-layer model, so it seems that each layer is seen as a subgraph. I did not find an answer on how to merge them into one subgraph. The issue apparently exists in 2.6.0 and is solved in 2.7, so let's try the nightly build.
TensorFlow 2.7-nightly (tried 2.7.0-dev20210914 and 2.7.0-dev20210921)
Here we have to use Python 3.7, as 3.6 is no longer supported.
We also have to use
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
However, even though it is stated that
converter._experimental_lower_tensor_list_ops = False
should be set, it does not seem necessary.
The problem here is that, to my knowledge, tf.lite.OpsSet.SELECT_TF_OPS goes through calibrator.py. In calibrator.py, the representative_dataset expects specific generator output: from line 93 onwards, in the _feed_tensor() function, the generator is expected to yield either a dict, a list, or a tuple.
The tf.lite.RepresentativeDataset documentation (and the TFLite class description) states that the dataset should look the same as the input to the model, which in my case (and most cases) is just a NumPy array with the correct dimensions.
Here I could try to wrap my data in a list or tuple, as in the sketch below; however, this does not seem right.
Or is that actually the way to go?
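For concreteness, this is roughly what the generator looks like; wrapping the yielded batch in a list is the variant I am unsure about, and the number of representative samples is arbitrary:
def batch_generator():
    # data has shape (20000, 1, 45); yield one float32 batch at a time.
    for i in range(100):  # arbitrary number of representative samples
        sample = data[i:i + 1].astype(np.float32)  # shape (1, 1, 45)
        yield [sample]  # wrapped in a list, which calibrator.py's _feed_tensor() seems to expect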
Thanks so much for reading all this. If I find an answer, I will of course update the post.
I have the same problem as you, and I'm still trying to solve it, but I noticed a couple of differences in our code, so sharing them could be useful.
I'm using TF 2.7.0 and the conversion works fine when using:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS, tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
Anyway, as far as I know, using these options (as you mentioned) does not guarantee full quantization of the model, so you will likely not be able to deploy it completely on microcontrollers or Edge TPU systems such as the Google Coral.
When using the conversion options recommended by the official guide for complete quantization:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
The conversion fails.
I recently succeeded in solving the problem! There is an extra line of code to add when configuring the converter:
converter.target_spec.supported_types = [tf.int8]
Here is the link to the tutorial I followed: https://colab.research.google.com/github/google-coral/tutorials/blob/master/train_lstm_timeseries_ptq_tf2.ipynb#scrollTo=EBRDh9SZVBX1
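Putting it together, the converter configuration I ended up with looks roughly like this (not copied verbatim from the tutorial; the saved-model path and the representative dataset generator are placeholders):
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset  # your generator
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = [tf.int8]  # the extra line mentioned above
converter.inference_input_type = tf.int8   # optional: integer input/output for Coral-style deployment
converter.inference_output_type = tf.int8
quantized_tflite_model = converter.convert()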
If possible, you can try modifying your LSTM so that it can be converted to TFLite's fused LSTM operator (https://www.tensorflow.org/lite/convert/rnn). It supports full-integer quantization for the basic fused LSTM and UnidirectionalSequenceLSTM operators.
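As a rough illustration of the kind of model that maps to the fused operator (per the RNN guide, a standard tf.keras.layers.LSTM converts to UnidirectionalSequenceLSTM out of the box; the layer sizes below are placeholders, and the input shape mirrors the (batch, 1, 45) layout from the question):
import tensorflow as tf

inputs = tf.keras.Input(shape=(1, 45))
x = tf.keras.layers.LSTM(64)(inputs)                          # placeholder number of units
outputs = tf.keras.layers.Dense(6, activation='softmax')(x)   # placeholder number of classes
model = tf.keras.Model(inputs, outputs)

converter = tf.lite.TFLiteConverter.from_keras_model(model)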
I have been studying Grad-CAM and I noticed most examples use a Keras/TensorFlow model. However, I have a TensorFlow Lite model that has been compiled to the .tflite format. I am not sure if it's even possible to access my CNN layers after compilation; when I tried using the Keras library to load the model, it only accepted specific file types, not .tflite, and threw errors:
from tensorflow.keras.models import load_model
model = load_model("/content/drive/My Drive/tensorflow_lite_model.tflite")
It gives the error:
OSError: SavedModel file does not exist
What I was trying to do was print the .tflite model using model.summary() as a way to confirm whether I could perform any operation on the model's layers. If I can't, then I don't think it's possible to use Grad-CAM with a TensorFlow Lite model.
Therefore, I would like to know if that is true, or did I just try to validate it the wrong way?
A TFLite model file is a different serialization format from the TensorFlow model formats (Keras and SavedModel).
Since you already have a TFLite model, you need to use the TensorFlow Lite Interpreter API instead of the TensorFlow API.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()
Please refer to this link for the details.
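A minimal end-to-end sketch of running one inference with the Interpreter API (the model path is the same placeholder as above; the zero-filled input is only for illustration):
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input with the expected shape/dtype and read back the output.
dummy_input = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], dummy_input)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
print(output.shape)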
The TF Grad-CAM model can be converted into a TFLite model; it is technically possible to convert any TF model to a corresponding TFLite model. If you have any issues with the conversion, please file a bug on the TensorFlow GitHub.
I'm running into an issue where I convert my Keras model into TensorFlow Lite format, but once I do, the accuracy of the converted model drops significantly. The model is a fairly simple natural language processing model. Before conversion the model has an accuracy of around 96%, but once it is converted into the TensorFlow Lite format (without any optimizations) it drops to around 20%. This is a ridiculous drop in performance, so I was wondering: is this something that can happen, or am I doing something wrong here? I am running the TFLite model on a BeagleBone SBC running Debian and running the inferences in Python.
My tflite conversion code:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
My model code:
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 128, input_length=maxlen),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
I encountered the same problem. I solved it with quantization-aware training: I applied quantization to my already-trained model and then retrained it. That reduced the accuracy gap significantly, to no more than roughly a 2-10% difference between Keras and TFLite.
It seems that when a Keras model is converted to TFLite, a sort of quantization is also applied and the float parameters are converted to integers, which results in the accuracy drop. By quantizing the model first, we trained the model with the integer parameters. I think this is more or less what happened; correct me if I'm wrong.
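If it helps, here is a minimal sketch of that workflow with the TensorFlow Model Optimization Toolkit, assuming the Keras model from the question and placeholder training arguments (note that quantize_model does not support every layer type out of the box, so the Embedding layer may need special handling):
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Wrap the trained Keras model with quantization-aware layers, then retrain briefly.
q_aware_model = tfmot.quantization.keras.quantize_model(model)
q_aware_model.compile(optimizer='adam',
                      loss='binary_crossentropy',
                      metrics=['accuracy'])
q_aware_model.fit(train_data, train_labels, epochs=2)  # placeholder data and epochs

# Convert the quantization-aware model to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()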
References
https://www.tensorflow.org/model_optimization/guide/quantization/training
https://www.tensorflow.org/lite/performance/model_optimization
Is there any way to convert the data-00000-of-00001 file to a TensorFlow Lite model?
The file structure is like this:
|-semantic_model.data-00000-of-00001
|-semantic_model.index
|-semantic_model.meta
Using TensorFlow Version: 1.15
The following 2 steps will convert it to a .tflite model.
1. Generate a TensorFlow Model for Inference (a frozen graph .pb file) using the answer posted here
What you currently have is a model checkpoint (a TensorFlow 1 model saved in three files: .data..., .meta, and .index; this model can be further trained if needed). You need to convert this to a frozen graph (a TensorFlow 1 model saved in a single .pb file; this model cannot be trained further and is optimized for inference/prediction).
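A minimal sketch of step 1, assuming TF 1.15 and that the output node names have been looked up (for example in Netron); 'OUTPUT_NODE_NAME' is a placeholder:
import tensorflow as tf  # TF 1.15

saver = tf.train.import_meta_graph('semantic_model.meta')
with tf.Session() as sess:
    saver.restore(sess, 'semantic_model')  # checkpoint prefix, no extension
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), ['OUTPUT_NODE_NAME'])  # placeholder output node name
    with open('frozen_graph.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())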
2. Generate a TensorFlow lite model ( .tflite file)
A. Initialize the TFLiteConverter: the .from_frozen_graph API can be used as shown below, and the attributes which can be added are listed here. To find the names of the input and output arrays, visualize the .pb file in Netron.
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='....path/to/frozen_graph.pb',
    input_arrays=...,
    output_arrays=....,
    input_shapes={'...' : [_, _,....]}
)
B. Optional: Perform the simplest optimization known as post-training dynamic range quantization. You can refer to the same document for other types of optimizations/quantization methods.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
C. Convert it to a .tflite file and save it
tflite_model = converter.convert()
tflite_model_size = open('model.tflite', 'wb').write(tflite_model)
print('TFLite Model is %d bytes' % tflite_model_size)