Getting a prediction from an ONNX model in Python

I can't find anyone who explains to a layman how to load an onnx model into a python script, then use that model to make a prediction when fed an image. All I could find were these lines of code:
sess = rt.InferenceSession("onnx_model.onnx")
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
pred = sess.run([label_name], {input_name: X.astype(np.float32)})[0]
But I don't know what any of that means. And everywhere I look, everybody already seems to know what they mean, so nobody's explaining it. That would be one thing if I could just run this code, but I can't. It gives me this error:
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid rank for input: Input3 Got: 2 Expected: 4 Please fix either the inputs or the model.
So I need to actually know what those things mean so I can figure out how to fix the error. Will someone knowledgeable please explain?

Let's first start by going over the code you provided, to make everything clear.
sess = rt.InferenceSession("onnx_model.onnx")
This line loads the model into a session object, meaning the layers, functions and weights used in the model are made ready to perform inference.
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
The two methods get_inputs and get_outputs each retrieve some meta information about the model: what inputs the model expects, and what outputs it can provide. In these lines, only the first input and the first output of that meta information are actually used, and of those, only the name is extracted and saved into a variable.
Let's tackle the last line part by part.
pred = sess.run(...)[0]
This performs an inference on the model. We'll go over the inputs to this method next; for now, note that the output is a list of outputs, each of which is a numpy array. In this case only the first output in that list is used and saved to the pred variable.
([label_name], {input_name: X.astype(np.float32)})
These are the inputs to sess.run. The first is a list of names of the outputs you want the session to compute. The second argument is a dict in which each input's name maps to a numpy array. These arrays are expected to have the same shape as the inputs used during creation of the model, and similarly their data types should match the types used during creation of the model.
The error you encountered indicates that the supplied array doesn't have the expected number of dimensions: the model expects 4 dimensions, but your array only has 2.
To gain clarity about the exact shape and data type the input array should have, you can use a visualization tool like Netron.
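You can also read the expected input metadata straight from the session object. Below is a minimal sketch, assuming the model file is "onnx_model.onnx" and that it expects a 4-D (batch, channel, height, width) input; the 28x28 size is only a placeholder for whatever shape your own model reports:
import numpy as np
import onnxruntime as rt

sess = rt.InferenceSession("onnx_model.onnx")
inp = sess.get_inputs()[0]
print(inp.name, inp.shape, inp.type)    # e.g. Input3 [1, 1, 28, 28] tensor(float)

X = np.random.rand(28, 28)              # stand-in for your 2-D image array

# If the model expects 4 dimensions (batch, channel, height, width) and X is
# only (height, width), add the two missing axes before running inference:
X = X.astype(np.float32)[np.newaxis, np.newaxis, :, :]
pred = sess.run([sess.get_outputs()[0].name], {inp.name: X})[0]
print(pred)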

Related

TensorFlow Federated: How can I write an Input Spec for a model with more than one input

I'm trying to make an image captioning model using the federated learning library provided by tensorflow, but I'm stuck at this error
Input 0 of layer dense is incompatible with the layer: : expected min_ndim=2, found ndim=1.
this is my input_spec:
input_spec=collections.OrderedDict(x=(tf.TensorSpec(shape=(2048,), dtype=tf.float32), tf.TensorSpec(shape=(34,), dtype=tf.int32)), y=tf.TensorSpec(shape=(None), dtype=tf.int32))
The model takes image features as the first input and a list of vocabulary as a second input, but I can't express this in the input_spec variable. I tried expressing it as a list of lists but it still didn't work. What can I try next?
Great question! It looks to me like this error is coming out of TensorFlow proper, indicating that you probably have the correct nested structure, but the leaves may be off. Your input spec looks like it "should work" from TFF's perspective, so it is probably slightly mismatched with the data you have.
The first thing I would try: if you have an example tf.data.Dataset which will be passed in to your client computation, you can simply read input_spec directly off this dataset as the element_spec attribute. This would look something like:
# ds = example dataset
input_spec = ds.element_spec
This is the easiest path. If you have something like "lists of lists of numpy arrays", there is still a way for you to pull this information off the data itself--the following code snippet should get you there:
# data = list of list of numpy arrays
input_spec = tf.nest.map_structure(lambda x: tf.TensorSpec(x.shape, x.dtype), data)
Finally, if you have a list of lists of tf.Tensors, TensorFlow provides a similar function:
# tensor_structure = list of lists of tensors
tf.nest.map_structure(tf.TensorSpec.from_tensor, tensor_structure)
In short, I would recommend not specifying input_spec by hand, but rather letting the data tell you what its input spec should be.
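For instance, here is a rough sketch of the first approach with dummy tensors whose shapes mirror the input_spec in the question (2048-dim image features, 34-token captions, integer labels; the sizes and number of examples are placeholders only):
import collections
import tensorflow as tf

features = tf.random.normal([10, 2048])          # image features
captions = tf.zeros([10, 34], dtype=tf.int32)    # tokenised captions
labels = tf.zeros([10], dtype=tf.int32)

ds = tf.data.Dataset.from_tensor_slices(
    collections.OrderedDict(x=(features, captions), y=labels))

input_spec = ds.element_spec                     # read the spec off the data itself
print(input_spec)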

Visualize TFLite graph and get intermediate values of a particular node?

I was wondering if there is a way to know the list of inputs and outputs for a particular node in tflite? I know that I can get input/outputs details, but this does not allow me to reconstruct the computation process that happens inside an Interpreter. So what I do is:
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.get_tensor_details()
The last 3 commands basically give me dictionaries which don't seem to have the necessary information.
So I was wondering if there is a way to know where each node's outputs go? Surely the Interpreter knows this somehow. Can we? Thanks.
Note: this answer was written for Tensorflow 1.x and, while the concept and core idea remains the same in TensorFlow 2.x, the commands in this answer might be deprecated.
The mechanism of TF-Lite makes the whole process of inspecting the graph and getting the intermediate values of inner nodes a bit tricky. The get_tensor(...) method suggested by the other answer does not work.
How to visualize TF-Lite inference graph?
TensorFlow Lite models can be visualized using the visualize.py script in the TensorFlow Lite repository. You just need to:
Clone the TensorFlow repository
Run the visualize.py script with bazel:
bazel run //tensorflow/lite/tools:visualize \
    model.tflite \
    visualized_model.html
Do the nodes in my TF model have an equivalent one in TF-Lite?
NO! In fact, TF-Lite can modify your graph so that it becomes more optimal. Here are some words about it from the TF-Lite documentation:
A number of TensorFlow operations can be processed by TensorFlow Lite even though they have no direct equivalent. This is the case for operations that can be simply removed from the graph (tf.identity), replaced by tensors (tf.placeholder), or fused into more complex operations (tf.nn.bias_add). Even some supported operations may sometimes be removed through one of these processes.
Moreover, the TF-Lite API currently doesn't allow getting the node correspondence; it's hard to interpret the inner format of TF-Lite. So, you can't get the intermediate outputs for any nodes you want, even setting aside the further issue below...
Can I get intermediate values of some TF-Lite nodes?
NO! Here, I will explain why get_tensor(...) wouldn't work in TF-Lite. Suppose that in the inner representation the graph contains 3 tensors, together with some dense operations (nodes) in between (you can think of tensor1 as the input and tensor3 as the output of your model). During inference of this particular graph, TF-Lite only needs 2 buffers; let's show how.
First, use tensor1 to compute tensor2 by applying a dense operation. This only requires 2 buffers to store the values:
            dense              dense
[tensor1] -------> [tensor2] -------> [tensor3]
 ^^^^^^^            ^^^^^^^
 bufferA            bufferB
Second, use the value of tensor2 stored in bufferB to compute tensor3... but wait! We don't need bufferA anymore, so let's use it to store the value of tensor3:
            dense              dense
[tensor1] -------> [tensor2] -------> [tensor3]
                    ^^^^^^^            ^^^^^^^
                    bufferB            bufferA
Now is the tricky part. The "output value" of tensor1 will still point to bufferA, which now holds the values of tensor3. So if you call get_tensor(...) for the 1st tensor, you'll get incorrect values. The documentation of this method even states:
This function cannot be used to read intermediate results.
How to get around this?
Easy but limited way. During conversion, you can specify the names of the nodes whose output tensors you want to get the values of:
tflite_convert \
    -- # other options of your model
    --output_arrays="output_node,intermediate/node/n1,intermediate/node/n2"
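Once the model has been converted with the extra output_arrays, those tensors show up as ordinary outputs of the interpreter. A quick sketch (the file name is a placeholder):
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='converted_with_extra_outputs.tflite')
interpreter.allocate_tensors()
for detail in interpreter.get_output_details():
    print(detail['name'], detail['shape'])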
Hard but flexible way. You can compile TF-Lite with Bazel (using these instructions). Then you can inject some logging code into Interpreter::Invoke() in the file tensorflow/lite/interpreter.cc. An ugly hack, but it works.
As @FalconUA has pointed out, we cannot directly get intermediate inputs and outputs from a TFLite model. But we can get the inputs and outputs of layers by modifying the model buffer. This repo shows how it is done. We need to modify the flatbuffer schema for this to work. The modified TFLite schema (the tflite folder in the repo) is available in the repo.
For completeness, below is the relevant code. Note that tflite_model is the Python module generated from the modified flatbuffer schema in that repo, and that the original snippet used input_tensor as an undefined global, so it is made an explicit parameter here:
import tensorflow as tf
import tflite_model  # generated from the modified flatbuffer schema in the repo above

def buffer_change_output_tensor_to(model_buffer, new_tensor_i):
    # from https://github.com/raymond-li/tflite_tensor_outputter
    # Set subgraph 0's output(s) to new_tensor_i.
    # Reads model_buffer as a proper flatbuffer file and gets the offset programmatically.
    # It might be much more efficient if Model.subgraphs[0].outputs[] was set to a list
    # of all the tensor indices.
    fb_model_root = tflite_model.Model.GetRootAsModel(model_buffer, 0)
    # OutputsOffset is a custom-added function that returns the file offset to this vector
    # (e.g. 0x5ae07e0 for inception_v3.tflite, 0x16C5A5c for inception_v3_quant.tflite).
    output_tensor_index_offset = fb_model_root.Subgraphs(0).OutputsOffset(0)
    # Flatbuffer scalars are stored in little-endian.
    new_tensor_i_bytes = bytes([
        new_tensor_i & 0x000000FF,
        (new_tensor_i & 0x0000FF00) >> 8,
        (new_tensor_i & 0x00FF0000) >> 16,
        (new_tensor_i & 0xFF000000) >> 24
    ])
    # Replace the 4 bytes corresponding to the first output tensor index
    return (model_buffer[:output_tensor_index_offset]
            + new_tensor_i_bytes
            + model_buffer[output_tensor_index_offset + 4:])

def get_tensor(path_tflite, tensor_id, input_tensor):
    with open(path_tflite, 'rb') as fp:
        model_buffer = fp.read()
    # Rewrite the model so that tensor_id becomes the subgraph's output tensor
    model_buffer = buffer_change_output_tensor_to(model_buffer, int(tensor_id))
    interpreter = tf.lite.Interpreter(model_content=model_buffer)
    interpreter.allocate_tensors()
    tensor_details = interpreter._get_tensor_details(tensor_id)  # private API; only used to look up the name
    tensor_name = tensor_details['name']
    # Run the model on input_tensor and read back the now-exposed tensor
    input_details = interpreter.get_input_details()
    interpreter.set_tensor(input_details[0]['index'], input_tensor)
    interpreter.invoke()
    tensor = interpreter.get_tensor(tensor_id)
    return tensor
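A hypothetical usage example of get_tensor above (the tensor index 42 and the 1x224x224x3 input shape are placeholders; pick the index from interpreter.get_tensor_details() or from the visualized graph, and use your model's real input shape):
import numpy as np

input_tensor = np.zeros((1, 224, 224, 3), dtype=np.float32)   # dummy input image
intermediate = get_tensor('model.tflite', 42, input_tensor)
print(intermediate.shape, intermediate.dtype)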

BucketIterator not returning batches of correct size

I'm implementing a simple LSTM language model in PyTorch, and wanted to check out the BucketIterator that is provided by torchtext.
It turns out that the batch that is returned has the size of my entire corpus, so I must be doing something wrong during its initialisation.
I've already got the BPTTIterator working, but as I want to be able to train on batches of complete sentences as well, I thought the BucketIterator should be the way to go.
I use the following setup, with my corpus a simple txt file containing sentences at each line.
field = Field(use_vocab=True, batch_first=True)
corpus = PennTreebank('project_2_data/train_lines.txt', field)
field.build_vocab(corpus)
iterator = BucketIterator(corpus,
                          batch_size=64,
                          repeat=False,
                          sort_key=lambda x: len(x.text),
                          sort_within_batch=True,
                          )
I expect a batch from this iterator to have the shape (batch_size, max_len), but it appends the entire corpus into 1 tensor of shape (1, corpus_size).
What am I missing in my setup?
Edit: it seems the PennTreebank object is not compatible with a BucketIterator (it contains only 1 Example as noted here http://mlexplained.com/2018/02/15/language-modeling-tutorial-in-torchtext-practical-torchtext-part-2/). Using a TabularDataset with only 1 Field got it working.
If someone has an idea how language modelling with padded sentence batches can be done in torchtext in a more elegant manner I'd love to hear it!
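For reference, a rough sketch of the TabularDataset workaround described in the edit, assuming the legacy torchtext (pre-0.9) API used in the question and a plain text file with one sentence per line and no tab characters:
from torchtext.data import Field, TabularDataset, BucketIterator

field = Field(use_vocab=True, batch_first=True)
dataset = TabularDataset(path='project_2_data/train_lines.txt',
                         format='tsv',
                         fields=[('text', field)])
field.build_vocab(dataset)

iterator = BucketIterator(dataset,
                          batch_size=64,
                          repeat=False,
                          sort_key=lambda ex: len(ex.text),
                          sort_within_batch=True)

for batch in iterator:
    print(batch.text.shape)   # (batch_size, max_len), padded per batch
    break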

Create Version failed. Model validation failed: Outer dimension for outputs must be unknown, outer dimension of 'Const_1:0' is 5

I trained an image classifier with tf.keras and exported the model after training to serve it in the cloud and make online predictions.
I served my model on localhost using:
tensorflow_model_server --model_base_path=$PATH_TO_SAVEDMODEL --rest_api_port=9000 --model_name=saved_model
I was able to make predictions and receive results. When I tried to deploy the model in the cloud I got the error in the title.
The thing is, I want to map the class names to the prediction results, and I was able to achieve that by doing the following:
# after i got the label names i convert the variable to a tensor
label_names_tensor = tf.convert_to_tensor(label_names) # shape (5,)
To export the model I use this:
tf.saved_model.simple_save(
    sess,
    "./saved_models/v1",
    inputs={'image': model.input},
    outputs={'label': label_names_tensor, 'prediction': model.output[0]})
NOTE:
model.output has shape (?, 5)
model.output[0] has shape (5,)
This works locally and I get the class names mapped to the prediction results.
It is obvious where the problem is... how can I get this to work and map the class names correctly with the prediction result?
I tried to use the reshape function but I couldn't get it to work. I think I need to end up with this:
shape of label_names_tensor --> (?, 5)
so I can do this:
--outputs = {'label' : label_names_tensor,'prediction': model.output}
any help is much appreciated
A few introductory notes. First, the reason for the requirement that the outer dimension of inputs be None is to allow for optimizations involving batching of inputs. The inputs are row-based: one row per input feature vector/matrix/tensor. Another assumption is that each input row produces exactly one output row. Since the number of input rows is variable, the number of output rows will be, too.
One consequence of this is that there is no way to output "static" information without repeating it in each of the rows. That said, if you're generally going to only be passing in one input at a time, there won't be any repetition, but you do have the extra overhead of handling the case as if there would be more than one input/output row. You can repeat the labeled rows as follows:
batch_size = tf.expand_dims(tf.shape(model.output)[0], [-1])
new_shape = tf.stack([batch_size[0], -1])
label_names_tensor = tf.reshape(tf.expand_dims(tf.tile(label_names, batch_size), [-1]), new_shape)
# ...
tf.saved_model.simple_save(
    sess,
    "./saved_models/v1",
    inputs={'image': model.input},
    outputs={'label': label_names_tensor, 'prediction': model.output})
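To see what the tiling does in isolation, here is a small TF 1.x sketch (a slightly simplified variant of the snippet above) with made-up class names; each row of the batched output gets its own copy of the label list, so the label output ends up with shape (?, 5), matching the prediction output:
import tensorflow as tf

label_names = tf.constant(['daisy', 'dandelion', 'rose', 'sunflower', 'tulip'])
logits = tf.placeholder(tf.float32, shape=[None, 5])      # stands in for model.output

batch_size = tf.expand_dims(tf.shape(logits)[0], -1)       # shape (1,), holds the batch size
new_shape = tf.stack([batch_size[0], -1])
labels_per_row = tf.reshape(tf.tile(label_names, batch_size), new_shape)

with tf.Session() as sess:
    out = sess.run(labels_per_row, {logits: [[0.1] * 5, [0.2] * 5]})
    print(out.shape)                                        # (2, 5): one label row per input row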

Keras: How to make average/maximum operation on top of some layers results?

I'm designing a neural network with a dynamic number of inputs. My idea is to process a set of data with a shared model, then average/maximum their results and place a classifier on top of it.
After reading the documentation I was sure this is possible with Keras, but I ran into the problem that keras.layers.average can only receive input tensors, whereas I need to apply this operation to the outputs of models or other layers instead.
Here is the code I have now:
inputs = [Input((countInputCount,))] # count is always here
downs = []
for i in range(count):
    inputs.append(Input((inputCount,)))
    downs.append(sharedDown(inputs[-1]))
avg = keras.layers.average(downs)
max = keras.layers.maximum(downs)
middle = keras.layers.concatenate([inputs[0], avg, max])
For this I got the following error: ValueError: A merge layer should be called on a list of inputs.
Probably I'm misunderstanding this error. Any help will be highly appreciated.
Thank you.
In fact keras.layers.average works for any tensors, but it does not accept a list containing only one tensor. So if you face something similar, you can handle it with a single if statement:
if count == 1:
    avg = downs[0]
else:
    avg = keras.layers.average(downs)
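Putting the fix in context, a minimal sketch with made-up layer sizes (countInputCount, inputCount, the Dense width, and count=1 are all placeholders):
from tensorflow import keras
from tensorflow.keras.layers import Input, Dense

countInputCount, inputCount, count = 4, 16, 1
sharedDown = Dense(8, activation='relu')        # shared model applied to every input

inputs = [Input((countInputCount,))]
downs = []
for i in range(count):
    inputs.append(Input((inputCount,)))
    downs.append(sharedDown(inputs[-1]))

# Guard against the single-input case, since the merge layers need at least 2 tensors
avg = downs[0] if count == 1 else keras.layers.average(downs)
mx = downs[0] if count == 1 else keras.layers.maximum(downs)
middle = keras.layers.concatenate([inputs[0], avg, mx])
print(middle.shape)   # (None, countInputCount + 8 + 8)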
