I was converting my TensorFlow model to TensorFlow.js format using the following command:
tensorflowjs_converter \
    --input_format=tf_saved_model \
    --output_node_names="my_output_node" \
    --saved_model_tags=serve \
    my_saved_model_dir \
    ./web_model
I encountered the following mysterious error:
ValueError: Unsupported Ops in the model before optimization NonMaxSuppression, ResizeArea
I do have those operations in my graph. Do I need to swap them out for something more TensorFlow.js-friendly?
I dug deep into Google and only came across a reference to the following flag that I could add to the tensorflowjs_converter command: --skip_op_check=SKIP_OP_CHECK
With this flag the conversion did indeed complete, but when trying to serve the JS model I encountered a JS error similar to the one above:
Error: Tensorflow Op is not supported: ResizeArea
Any ideas how to modify my graph or my command to navigate this?
Thank you
The short answer is yes, you will need to change them.
TensorFlow.js rewrites ops for optimisation purposes, but not every op has an equivalent TFJS implementation.
The full list of supported ops is here: https://github.com/tensorflow/tfjs-converter/blob/master/docs/supported_ops.md
Oddly, NonMaxSuppression does appear on the list, but ResizeArea does not and simply will not work.
An alternative is to create a custom operation yourself and use that instead, but I'm not sure how to do that in TFJS.
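If you control the original Python graph, another option (just a sketch, not tested against your model) is to swap the area resize for a TFJS-supported method such as bilinear or nearest-neighbour before re-exporting the SavedModel; whether that is acceptable accuracy-wise depends on your model:

import tensorflow as tf

# Hypothetical example: wherever the graph currently performs an area resize
# (which lowers to the unsupported ResizeArea op), use a TFJS-supported
# method such as bilinear instead, then export the SavedModel again.
images = tf.zeros([1, 64, 64, 3])  # placeholder input
resized = tf.image.resize(images, [128, 128], method='bilinear')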
I have a pickled sklearn model that I need to get running. The model, however, was trained with an unknown version of sklearn.
When I inspect the model in the debugger, I find a bunch of strange tracebacks inside instead of the attributes you'd expect, for example:
decision_function -> 'RandomForestClassifier' object has no attribute 'decision_function'
fit_predict -> 'RandomForestClassifier' object has no attribute 'fit_predict'
score_samples -> 'RandomForestClassifier' object has no attribute 'score_samples'
How can I get this model to run? Do these error messages hint at anything?
EDIT: The solution is to brute-force search the sklearn version. In my case, once I got to the correct major version, the error message pointed me to the correct minor version.
Just like #rickhg12hs suggested, python -m pickletools your_pickled_model_file does the job!
The output is quite long, so I recommend using head:
python -m pickletools your_pickled_model_file | head -100
You can find out the version of a model pickled with scikit-learn 0.18 or later using
model.__getstate__()['_sklearn_version']
So at least you will know whether it is pre-0.18 or newer.
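A quick sketch of that check (the file name is just a placeholder):

import pickle

# Load the pickled model and print the sklearn version it was saved with.
# __getstate__()['_sklearn_version'] is only present for models pickled
# with scikit-learn >= 0.18.
with open('your_pickled_model_file', 'rb') as f:
    model = pickle.load(f)

print(model.__getstate__()['_sklearn_version'])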
I want to use a generator to quantize an LSTM model.
Questions
I start with the question as this is quite a long post.
What I actually want to know is whether you have managed to quantize (int8) an LSTM model with post-training quantization.
I tried different TF versions but always bumped into an error. Below are some of my attempts. Maybe you can spot a mistake I made, or have a suggestion.
Thanks
Working Part
The input is expected as (batch, 1, 45). Inference with the un-quantized model runs fine. The model and CSV can be found here:
csv file: https://mega.nz/file/5FciFDaR#Ev33Ij124vUmOF02jWLu0azxZs-Yahyp6PPGOqr8tok
modelfile: https://mega.nz/file/UAMgUBQA#oK-E0LjZ2YfShPlhHN3uKg8t7bALc2VAONpFirwbmys
import tensorflow as tf
import numpy as np
import pathlib as path
import pandas as pd


def reshape_for_Lstm(data):
    timesteps = 1
    samples = int(np.floor(data.shape[0] / timesteps))
    data = data.reshape((samples, timesteps, data.shape[1]))  # samples, timesteps, sensors
    return data


if __name__ == '__main__':
    # GET DATA
    data = pd.read_csv('./test_x_data_OOP3.csv', index_col=[0])
    data = np.array(data)
    data = reshape_for_Lstm(data)

    # LOAD MODEL
    saved_model_dir = path.Path.cwd() / 'model' / 'singnature_model_tf_2.7.0-dev20210914'
    model = tf.keras.models.load_model(saved_model_dir)

    # INFERENCE
    [yhat, yclass] = model.predict(data)
    Yclass = [np.argmax(yclass[i], 0) for i in range(len(yclass))]  # get final class
    print('all good')
The shape and dtype of the variable data are (20000, 1, 45) and float64.
Where it goes wrong
Now I want to quantize the model, but depending on the TensorFlow version I run into different errors.
The converter options I have tried are merged as follows:
converter=tf.lite.TFLiteConverter.from_saved_model('./model/singnature_model_tf_2.7.0-dev20210914')
converter.representative_dataset = batch_generator
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.experimental_new_converter = False
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.TFLITE_BUILTINS]
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
#converter._experimental_lower_tensor_list_ops = False
converter.target_spec.supported_types = [tf.int8]
quantized_tflite_model = converter.convert()
TensorFlow 2.2
Using TF 2.2, as often suggested on GitHub, I run into unsupported operators from tflite. I used a model created with TF 2.2 to ensure version support. Here, only TOCO conversion is supported.
Some of the operators in the model are not supported by the standard
TensorFlow Lite runtime and are not recognized by TensorFlow.
The error does not depend on the converter.target_spec.supported_ops options, so I could not find a solution there. allow_custom_ops only shifts the problem.
There are quite a few GitHub issues on this out there (just some examples), but none of the suggested options worked.
One is to try the new MLIR converter; however, in 2.2 the integer-only conversion for MLIR was not yet done.
So let's try a newer version.
TensorFlow 2.5.0
Then I tried a well-vetted version. Here, no matter which converter.target_spec.supported_ops I use, I run into the following error with the MLIR conversion, in calibrator.py:
ValueError: Failed to parse the model: pybind11::init(): factory
function returned nullptr.
The solution suggested on GitHub is to use TF 2.2.0.
With TOCO conversion, I get the following error:
tensorflow/lite/toco/allocate_transient_arrays.cc:181] An array,
StatefulPartitionedCall/StatefulPartitionedCall/model/lstm/TensorArrayUnstack/TensorListFromTensor,
still does not have a known data type after all graph transformations
have run. Fatal Python error: Aborted
I did not find anything on this error. Maybe it is solved in 2.6.
TensorFlow 2.6.0
Here, no matter which converter.target_spec.supported_ops I use, I run into the following error:
ValueError: Failed to parse the model: Only models with a single
subgraph are supported, model had 5 subgraphs.
The model has five layers, so it seems that each layer is seen as a subgraph. I did not find an answer on how to merge them into one subgraph. The issue apparently exists in 2.6.0 and is solved in 2.7, so let's try the nightly build.
TensorFlow 2.7-nightly (tried 2.7.0-dev20210914 and 2.7.0-dev20210921)
Here we have to use Python 3.7, as 3.6 is no longer supported.
Here we have to use
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
However, even though it is stated that
converter._experimental_lower_tensor_list_ops = False
should be set, it does not seem necessary.
The problem here is that, to my knowledge, tf.lite.OpsSet.SELECT_TF_OPS calls calibrator.py. In calibrator.py the representative_dataset expects specific generator data: from line 93 onwards, in the _feed_tensor() function, the generator wants either a dict, list or tuple.
The description of the tf.lite.RepresentativeDataset function and of the tflite class states that the dataset should look the same as the input to the model, which in my case (and most cases) is just a NumPy array with the correct dimensions.
Here I could try to convert my data into a tuple; however, this does not seem right. Or is that actually the way to go?
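For reference, the pattern the TFLite docs show is a generator that yields each calibration sample wrapped in a list of model inputs; a minimal sketch for an input of shape (1, 45) looks like this (random placeholder data here, in practice the data array loaded from the CSV would be reused):

import numpy as np

def batch_generator():
    # Placeholder calibration data with the model's input shape (1, 45);
    # each yielded element is a list containing one float32 input of
    # shape (1, 1, 45).
    samples = np.random.rand(200, 1, 45).astype(np.float32)
    for s in samples:
        yield [s[np.newaxis, ...]]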
Thanks so much for reading all this. If I find an answer, I will of course update the post.
I have the same problem as you, and I'm still trying to solve it, but I noticed a couple of differences in our code, so sharing them could be useful.
I'm using TF 2.7.0 and the conversion works fine when using:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS, tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
Anyway, as far as I know, using these options (as you mentioned) does not guarantee full quantization of the model, so it's likely you won't be able to deploy it completely on microcontrollers or Edge TPU systems such as the Google Coral.
When using the conversion option recommended by the official guide for complete quantization:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
The conversion fails.
I recently succeeded in solving the problem! There is an extra line of code to add when configuring the converter:
converter.target_spec.supported_types = [tf.int8]
Here is the link to the tutorial I followed: https://colab.research.google.com/github/google-coral/tutorials/blob/master/train_lstm_timeseries_ptq_tf2.ipynb#scrollTo=EBRDh9SZVBX1
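Putting the pieces of this answer together, the converter configuration looks roughly like this (paths and calibration data are placeholders, not the exact ones from my project):

import numpy as np
import tensorflow as tf

def batch_generator():
    # Placeholder representative dataset: yields lists of float32 inputs
    # with shape (1, 1, 45), matching the model input.
    samples = np.random.rand(200, 1, 45).astype(np.float32)
    for s in samples:
        yield [s[np.newaxis, ...]]

converter = tf.lite.TFLiteConverter.from_saved_model('./model/singnature_model_tf_2.7.0-dev20210914')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = batch_generator
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8,
]
converter.target_spec.supported_types = [tf.int8]
quantized_tflite_model = converter.convert()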
If possible, you can try modifying your LSTM so that it can be converted to TFLite's fused LSTM operator: https://www.tensorflow.org/lite/convert/rnn. It supports full-integer quantization for the basic fused LSTM and UnidirectionalSequenceLSTM operators.
I trained a model using Matterport Mask R-CNN. I already have the .h5 model file, but I am not able to convert it to .mlmodel, as there are many custom layers involved. I have already tried everything I could find on Google about this, including https://github.com/edouardlp/Mask-RCNN-CoreML for the conversion. So far, no success.
Has anybody managed to do the conversion successfully? If yes, can you share the codebase or a tutorial for it?
I was able to convert using the same GitHub repo mentioned in the question, but you can't debug the code in Xcode, as Mask R-CNN is too memory-heavy. It's better to use another architecture like DeepLab.
Here's a GitHub project, https://github.com/edouardlp/Mask-RCNN-CoreML/releases/tag/0.2, with a MaskRCNN .mlmodel.
Note: you have to copy the models into the project to get it to compile.
I am trying to deploy a trained U-Net with TensorRT. The model was trained using Keras (with TensorFlow as backend). The code is very similar to this one: https://github.com/zhixuhao/unet/blob/master/model.py
When I converted the model to UFF format using code like this:
import uff
import os

uff_fname = os.path.join("./models/", "model_" + idx + ".uff")
uff_model = uff.from_tensorflow_frozen_model(
    frozen_file=os.path.join('./models', trt_fname),
    output_nodes=output_names,
    output_filename=uff_fname
)
I got the following warnings:
Warning: No conversion function registered for layer: ResizeNearestNeighbor yet.
Converting up_sampling2d_32_12/ResizeNearestNeighbor as custom op: ResizeNearestNeighbor
Warning: No conversion function registered for layer: DataFormatVecPermute yet.
Converting up_sampling2d_32_12/Shape-0-0-VecPermuteNCHWToNHWC-LayoutOptimizer as custom op: DataFormatVecPermute
I tried to avoid this by replacing the upsampling layer with bilinear upsampling and with transposed convolution, but the converter threw similar errors. I checked https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html and it seemed none of these operations are supported yet.
Is there any workaround to this problem? Is there another format/framework that TensorRT likes and that supports upsampling? Or is it possible to replace the layer with some other supported operations?
I also saw somewhere that one can add custom operations to replace the unsupported ones for TensorRT, though I am not sure what that workflow would look like. It would also be really helpful if someone could point to an example of custom layers.
Thank you in advance!
The warnings appear because these operations are not yet supported by TensorRT, as you already mentioned.
Unfortunately there is no easy way to fix this. You either have to modify the graph (even after training) to use a combination of supported operations only, or write these operations yourself as a custom layer.
However, there is a better way to run inference on other devices in C++: you can use TensorFlow and TensorRT together. TensorRT will analyze the graph for ops that it supports and convert them to TensorRT nodes, and the remainder of the graph will be handled by TensorFlow as usual. More information here. This solution is much faster than rewriting the operations yourself. The only complicated part is building TensorFlow from source on your target device and generating the dynamic library tensorflow_cc. There are now many guides and a lot of support for TensorFlow ports to various architectures, e.g. ARM.
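For reference, the Python side of this TF-TRT path looks roughly like the sketch below (TF 1.14+ built with TensorRT support; exact module paths and class names differ between TF versions, so treat this purely as an illustration):

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert a SavedModel: supported ops are fused into TensorRT engines,
# everything else keeps running through TensorFlow. Paths are placeholders.
converter = trt.TrtGraphConverter(
    input_saved_model_dir='./saved_model',
    precision_mode='FP16')
converter.convert()
converter.save('./saved_model_trt')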
Update 09/28/2019
Nvidia released TensorRT 6.0.1 about two weeks ago and added a new API called "IResizeLayer". This layer supports "Nearest" interpolation and can thus be used to implement upsampling. No need to use custom layers/plugins any more!
Original answer:
Thanks for all the answers and suggestions posted here!
In the end, we implemented the network in TensorRT C++ API directly and loaded the weights from the .h5 model file. We haven't got the time to profile and polish the solution yet, but the inference seems to be working according to the test images we fed in.
Here's the workflow we've adopted:
Step 1: Code the upsampling layer.
In our U-Net model, all the upsampling layers have a scaling factor of (2, 2) and they all use ResizeNearestNeighbor interpolation. Essentially, the pixel value at (x, y) in the original tensor goes to four pixels in the new tensor: (2x, 2y), (2x+1, 2y), (2x, 2y+1) and (2x+1, 2y+1). This can easily be coded up as a CUDA kernel function.
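As a sanity check, the same mapping can be written in a few lines of NumPy (the CUDA kernel implements the same indexing per output pixel):

import numpy as np

def upsample_nearest_2x(a):
    # a has shape (H, W); the value at (x, y) is copied to
    # (2x, 2y), (2x+1, 2y), (2x, 2y+1) and (2x+1, 2y+1).
    return np.repeat(np.repeat(a, 2, axis=0), 2, axis=1)

assert upsample_nearest_2x(np.arange(4).reshape(2, 2)).shape == (4, 4)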
Once we have the upsampling kernel, we need to wrap it with the TensorRT API, specifically the IPluginV2Ext class. The developer reference describes which functions need to be implemented. I'd say enqueue() is the most important one, because the CUDA kernel gets executed there.
There are also examples in the TensorRT Samples folder. For my version, these resources are helpful:
Github: Leaky Relu as custom layer
TensorRT-5.1.2.2/samples/sampleUffSSD
TensorRT-5.1.2.2/samples/sampleSSD
Step 2: Code the rest of the network using TensorRT API
The rest of the network should be quite straightforward: just call the different "addXxxLayer" functions from the TensorRT network definition API.
One thing to keep in mind:
Depending on which version of TRT you are using, the way to add padding can differ. I think the newest version (5.1.5) allows developers to pass parameters to addConvolution() so that the proper padding mode can be selected.
My model was trained using Keras, where the default padding mode is that the right and bottom get more padding if the total amount of padding is not even. Check this Stack Overflow link for details. There is a mode in 5.1.5 that represents this padding scheme.
If you are on an older version (5.1.2.2), you will need to add the padding as a separate layer before the convolution layer, which has two parameters: pre-padding and post-padding.
Also, everything is NCHW in TensorRT.
Helpful sample:
TensorRT-5.1.2.2/samples/sampleMNISTAP
Step 3: Load the weights
TensorRT wants weights in the format [out_c, in_c, filter_h, filter_w], which is mentioned in archived documentation. Keras stores weights in the format [filter_h, filter_w, in_c, out_c].
We got a pure weights file by calling model.save_weights('weight.h5') in Python. We then read the weights into NumPy arrays using h5py, performed the transposition and saved the transposed weights to a new file. We also figured out the group and dataset names using h5py; this info was used when loading the weights into the C++ code via the HDF5 C++ API.
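A hypothetical sketch of that reordering step (file and dataset names are placeholders; real Keras HDF5 files nest the kernels under layer groups):

import h5py
import numpy as np

# Keras stores conv kernels as (filter_h, filter_w, in_c, out_c);
# TensorRT expects (out_c, in_c, filter_h, filter_w).
with h5py.File('weight.h5', 'r') as f_in, h5py.File('weight_trt.h5', 'w') as f_out:
    def copy_transposed(name, obj):
        if isinstance(obj, h5py.Dataset):
            w = np.array(obj)
            if w.ndim == 4:                    # conv kernel
                w = w.transpose(3, 2, 0, 1)    # HWIO -> OIHW
            f_out.create_dataset(name, data=w)
    f_in.visititems(copy_transposed)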
We compared the output layer by layer between the C++ and Python code. For our U-Net, all the activation maps are the same until about the third block (after 2 poolings). After that, there is a tiny difference between pixel values. The absolute percentage error is about 10^-8, so we don't think it's that bad. We are still in the process of polishing the C++ implementation.
Again, thanks for all the suggestions and answers we got in this post. Hope our solution can be helpful as well!
Hey, I've done something similar. I'd say the best way to tackle the issue is to export your model to .onnx with a guide like this one; if you check the support matrix for ONNX, upsample is supported.
Then you can use https://github.com/onnx/onnx-tensorrt to convert the ONNX model to TensorRT. I've used this to convert a network that I trained in PyTorch and that had upsampling. The repo for onnx-tensorrt is a bit more active, and if you check the PR tab you can see other people writing custom layers and fork from there.
I have a dataset which contains complex numbers, and when I feed the data into the network I get an error:
ValueError: An initializer for variable encoder/conv2d/kernel of <dtype: 'complex64'> is required
Here is some of the code in my network:
self.input_placeholder = tf.placeholder(
    tf.complex64,
    [None, self.train_data[0].shape[1], self.train_data[0].shape[2], self.train_data[0].shape[3]])
The error occurs in the convolution step, before all of the parameters are initialized:
layer = tf.layers.conv2d(inputs, 64, [1, self.F], strides=(1, 1), padding='same', activation=None)
Is there any solution?
Does TensorFlow have any support for complex numbers?
Thank you very much!
Support for complex initializers is not available yet.
There is an open issue describing a feature request here:
https://github.com/tensorflow/tensorflow/issues/17097
According to the discussion in that ticket, it seems that Keras already provides a way to do it. Maybe you can do something similar to that.
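As a stopgap (a common workaround, not something taken from the linked issue), you could split the complex input into real and imaginary channels and feed the real-valued stack through the standard layers:

import tensorflow as tf

# Hypothetical sketch (TF 1.x style, matching the question): treat the real
# and imaginary parts as separate channels so the default (real-valued)
# kernel initializer of conv2d can be used.
inputs_c = tf.placeholder(tf.complex64, [None, 32, 32, 1])
inputs_r = tf.concat([tf.real(inputs_c), tf.imag(inputs_c)], axis=-1)
layer = tf.layers.conv2d(inputs_r, 64, [1, 3], strides=(1, 1),
                         padding='same', activation=None)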