I have converted the .pb file to a tflite file using bazel. Now I want to load this tflite model in my Python script, just to test whether it gives me the correct output.
You can use the TensorFlow Lite Python interpreter to load the tflite model in a Python shell and test it with your input data.
The code will be like this:
import numpy as np
import tensorflow as tf
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
# Run inference; without this call the output tensor holds no result.
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
The above code is from the official TensorFlow Lite guide; see that guide for more detailed information.
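The random input in the snippet above is hard-coded to float32. As a minimal sketch, the helper below builds a random input from the 'shape' and 'dtype' fields of one entry of get_input_details(), so the same smoke test could also be used with integer-quantized models. The name make_random_input is hypothetical (it is not part of the TensorFlow Lite API), and the dictionary at the bottom is a hand-written stand-in for a real input_details entry.

```python
import numpy as np


def make_random_input(input_detail, seed=0):
    """Build a random array matching one entry of interpreter.get_input_details().

    Only the 'shape' and 'dtype' keys are read. This is an illustrative
    helper, not a TensorFlow Lite function.
    """
    rng = np.random.default_rng(seed)
    shape = tuple(int(d) for d in input_detail["shape"])
    dtype = np.dtype(input_detail["dtype"])
    if np.issubdtype(dtype, np.floating):
        # Float models: uniform values in [0, 1).
        return rng.random(shape).astype(dtype)
    # Integer (quantized) models: cover the full value range of the dtype.
    info = np.iinfo(dtype)
    return rng.integers(info.min, info.max, size=shape, endpoint=True).astype(dtype)


# Hand-written stand-in for input_details[0]; a real one comes from the interpreter.
fake_detail = {"shape": np.array([1, 224, 224, 3]), "dtype": np.uint8}
sample = make_random_input(fake_detail)
print(sample.shape, sample.dtype)
```

With a real interpreter, the result would be passed to interpreter.set_tensor(input_details[0]['index'], sample).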
Using TensorFlow lite models in Python:
The verbosity of TensorFlow Lite is powerful because it allows you more control, but in many cases you just want to pass input and get an output, so I made a class that wraps this logic:
The following works with classification models from tfhub.dev, for example: https://tfhub.dev/tensorflow/lite-model/mobilenet_v2_1.0_224/1/metadata/1
# Usage (labels is the list of class names, in the order of the model's outputs)
model = TensorflowLiteClassificationModel("path/to/model.tflite", labels)
(label, probability) = model.run_from_filepath("path/to/image.jpeg")[0]  # most probable class

import tensorflow as tf
import numpy as np
from PIL import Image


class TensorflowLiteClassificationModel:
    def __init__(self, model_path, labels, image_size=224):
        self.interpreter = tf.lite.Interpreter(model_path=model_path)
        self.interpreter.allocate_tensors()
        self._input_details = self.interpreter.get_input_details()
        self._output_details = self.interpreter.get_output_details()
        self.labels = labels
        self.image_size = image_size

    def run_from_filepath(self, image_path):
        input_data_type = self._input_details[0]["dtype"]
        image = np.array(Image.open(image_path).resize((self.image_size, self.image_size)), dtype=input_data_type)
        if input_data_type == np.float32:
            image = image / 255.
        if image.ndim == 2:
            # grayscale image: repeat the single channel three times
            image = np.stack([image] * 3, axis=-1)
        # add the batch dimension: (1, image_size, image_size, 3)
        image = np.expand_dims(image, axis=0)
        return self.run(image)

    def run(self, image):
        """
        args:
          image: a (1, image_size, image_size, 3) np.array

        Returns a list of [label, probability] pairs, ordered by descending probability
        """
        self.interpreter.set_tensor(self._input_details[0]["index"], image)
        self.interpreter.invoke()
        tflite_interpreter_output = self.interpreter.get_tensor(self._output_details[0]["index"])
        probabilities = np.array(tflite_interpreter_output[0])
        # create list of ["label", probability], ordered descending probability
        label_to_probabilities = []
        for i, probability in enumerate(probabilities):
            label_to_probabilities.append([self.labels[i], float(probability)])
        return sorted(label_to_probabilities, key=lambda element: element[1], reverse=True)
However, you'll need to modify this to support different use cases, since I am passing images as input and getting classification ([label, probability]) output. The same applies if you need text input (NLP) or a different kind of output (object detection yields bounding boxes, labels and probabilities; plain classification yields just labels; and so on).
Also, if you are expecting image inputs of different sizes, you'd have to change the input size and reallocate the model (self.interpreter.allocate_tensors()), which is slow. It's better to use the platform's resizing functionality (e.g. the Android graphics library) instead of using a TensorFlow Lite model to do the resizing. Alternatively, you could resize the image with a separate, small model, for which allocate_tensors() would be much quicker.
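For consuming the list of [label, probability] pairs that run() returns, a small helper along these lines may be handy. It is only a sketch: top_k is a hypothetical name, and the example output is made up rather than produced by a real model. It sorts the pairs itself, so it does not rely on the order in which they arrive.

```python
def top_k(label_probabilities, k=3):
    """Return the k most probable [label, probability] pairs, most probable first."""
    ranked = sorted(label_probabilities, key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up output in the same [label, probability] format that run() returns.
example_output = [["cat", 0.10], ["dog", 0.75], ["bird", 0.15]]
best_label, best_probability = top_k(example_output, k=1)[0]
print(best_label, best_probability)
```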
I have a pre-trained PyTorch model that I want to convert to TFlite. The model is from the seisbench API. I have used the code below for the conversion. The code has some checks to confirm that the various format conversions worked.
I have followed the flow .pt -> .onnx -> tensorflow -> tflite, but I obtain an .onnx file which is smaller (98 kB) than the final tflite model (108 kB). I am using the onnx-tensorflow library to convert the .onnx file to tensorflow (https://github.com/onnx/onnx-tensorflow)
import torch
import onnx
import seisbench.models as sbm

model = sbm.PhaseNet.from_pretrained("instance") #load the model from the seisbench api
print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())
# Save model information
input_lenght = model.in_samples
input_depth = model.in_channels
# save to .pt
model.eval() #switch layers such as dropout and batch normalization to inference behaviour
torch.save(model, 'pNET.pt')
# check if the model has been saved correctly
temp_model = torch.load('pNET.pt')
print("Model's state_dict:")
for param_tensor in temp_model.state_dict():
    print(param_tensor, "\t", temp_model.state_dict()[param_tensor].size())
# save to .onnx
# define an input vector (random vector)
sample_input = torch.randn(1, input_depth, input_lenght, requires_grad=True) #order is width, depth, lenght of input
#width fixed to 1 for time series data
# export
torch.onnx.export(
    model,                    # PyTorch Model
    sample_input,             # Input tensor
    'pNET.onnx',              # Output file name
    input_names=['input'],    # Input tensor name (arbitrary)
    output_names=['output']   # Output tensor name (arbitrary)
)
# check if the model has been saved correctly
onnx_model = onnx.load('pNET.onnx')
# Check that the IR is well formed
onnx.checker.check_model(onnx_model)
# Print a Human readable representation of the graph
print(onnx.helper.printable_graph(onnx_model.graph))
# Try to run an inference with the newly saved onnx model
import onnxruntime as ort
import numpy as np
ort_session = ort.InferenceSession('pNET.onnx')
outputs = ort_session.run(
    None, # None returns all model outputs
    {'input': np.random.randn(1, input_depth, input_lenght).astype(np.float32)} #random input
)
print(outputs) #check if you get a tensor of the right shape
from onnx_tf.backend import prepare
# Converting to TensorFlow model
onnx_model = onnx.load("pNET.onnx") # load onnx model
tf_rep = prepare(onnx_model) # prepare tf representation
tf_rep.export_graph("pNET") # export the model
# Check if the conversion worked
# Run a TF inference
import tensorflow as tf
model = tf.saved_model.load("./pNET")
model.trainable = False
input_tensor = tf.random.uniform([1, input_depth, input_lenght])
out = model(**{'input': input_tensor})
print(out) #check if you get a tensor of the right shape
# float16 quantization
converter = tf.lite.TFLiteConverter.from_saved_model("./pNET")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_quant_model = converter.convert()
# Save the model
with open('pNETlite16float.tflite', 'wb') as f:
    f.write(tflite_quant_model) # same size as when I use interpreter instead of converter?
My confusion stems from the fact that I was expecting post-training quantization to reduce model size. Does TFLite add some extra wrappers or methods to a model, increasing the size compared to .onnx?
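One rough way to reason about this is to compare each file size with the size of the raw weights alone. The sketch below is only arithmetic: the parameter count is a made-up placeholder, not the real PhaseNet count (which could be obtained with sum(p.numel() for p in model.parameters())), and the helper names are hypothetical. Whatever is left after subtracting the weight payload is graph structure, metadata, or tensors that were not stored in the expected precision.

```python
def weight_payload_bytes(num_parameters, bytes_per_parameter):
    """Size of the raw weights alone, ignoring graph structure and metadata."""
    return num_parameters * bytes_per_parameter


def non_weight_bytes(file_size_bytes, num_parameters, bytes_per_parameter):
    """Part of a model file that the raw weights do not account for."""
    return file_size_bytes - weight_payload_bytes(num_parameters, bytes_per_parameter)


# Placeholder value for illustration only; NOT the real PhaseNet parameter count.
HYPOTHETICAL_NUM_PARAMETERS = 20_000

fp32_payload = weight_payload_bytes(HYPOTHETICAL_NUM_PARAMETERS, 4)
fp16_payload = weight_payload_bytes(HYPOTHETICAL_NUM_PARAMETERS, 2)
print(fp32_payload, fp16_payload)  # 80000 40000

# File sizes quoted in the question: 98 kB (.onnx) and 108 kB (.tflite).
print(non_weight_bytes(98_000, HYPOTHETICAL_NUM_PARAMETERS, 4))   # 18000
print(non_weight_bytes(108_000, HYPOTHETICAL_NUM_PARAMETERS, 2))  # 68000
```

For a model this small, the non-weight part can plausibly be a large share of the file, so halving the weight precision need not shrink the file by much.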
I am attempting to understand more about computer vision models, and I'm trying to do some exploring of how they work. In an attempt to understand how to interpret feature vectors more I'm trying to use Pytorch to extract a feature vector. Below is my code that I've pieced together from various places.
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from torch.autograd import Variable
from PIL import Image
# Load the pretrained model
model = models.resnet18(pretrained=True)
# Use the model object to select the desired layer
layer = model._modules.get('avgpool')
# Set model to evaluation mode
model.eval()
transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
def get_vector(image_name):
    # Load the image with Pillow library
    img = Image.open("Documents/Documents/Driven Data Competitions/Hateful Memes Identification/data/01235.png")
    # Create a PyTorch Variable with the transformed image
    t_img = transforms(img)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)
    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.data)
    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    model(t_img)
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding
pic_vector = get_vector(img)
When I do this I get the following error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [3, 224, 224] instead
I'm sure this is an elementary error, but I can't seem to figure out how to fix this. It was my impression that the "totensor" transformation would make my data 4-d, but it seems it's either not working correctly or I'm misunderstanding it. Appreciate any help or resources I can use to learn more about this!
All the default nn.Modules in pytorch expect an additional batch dimension. If the input to a module is shape (B, ...) then the output will be (B, ...) as well (though the later dimensions may change depending on the layer). This behavior allows efficient inference on batches of B inputs simultaneously. To make your code conform you can just unsqueeze an additional singleton dimension onto the front of the t_img tensor before sending it into your model, making it a (1, ...) tensor. You will also need to flatten the output of layer before storing it if you want to copy it into your one-dimensional my_embedding tensor.
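The shape bookkeeping described above can be illustrated with plain NumPy arrays standing in for tensors. Using NumPy here is an assumption made to keep the sketch dependency-light; torch.Tensor.unsqueeze(0) and flatten() play the same roles as np.expand_dims and reshape(-1).

```python
import numpy as np

# Stand-in for the transformed image t_img: (channels, height, width).
t_img = np.zeros((3, 224, 224), dtype=np.float32)

# Add a leading batch dimension of size 1 (analogous to t_img.unsqueeze(0)).
batched = np.expand_dims(t_img, axis=0)
print(batched.shape)  # (1, 3, 224, 224)

# Stand-in for the 'avgpool' output of ResNet-18 for a batch of one image.
layer_output = np.zeros((1, 512, 1, 1), dtype=np.float32)

# Flatten to one dimension so it fits a length-512 embedding (analogous to o.flatten()).
embedding = layer_output.reshape(-1)
print(embedding.shape)  # (512,)
```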
A couple of other things:
You should infer within a torch.no_grad() context to avoid computing gradients since you won't be needing them (note that model.eval() just changes the behavior of certain layers like dropout and batch normalization, it doesn't disable construction of the computation graph, but torch.no_grad() does).
I assume this is just a copy paste issue but transforms is the name of an imported module as well as a global variable.
o.data just returns a tensor that shares o's data but is detached from the computation graph. In the old Variable interface (circa PyTorch 0.3.1 and earlier) this used to be necessary, but the Variable interface was deprecated way back in PyTorch 0.4.0 and no longer does anything useful; now its use just creates confusion. Unfortunately, many tutorials are still being written using this old and unnecessary interface.
Updated code is then as follows:
import torch
import torchvision
import torchvision.models as models
from PIL import Image
img = Image.open("Documents/01235.png")
# Load the pretrained model
model = models.resnet18(pretrained=True)
# Use the model object to select the desired layer
layer = model._modules.get('avgpool')
# Set model to evaluation mode
model.eval()
transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
def get_vector(image):
    # Create a PyTorch tensor with the transformed image
    t_img = transforms(image)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)
    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.flatten())                 # <-- flatten
    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    with torch.no_grad():                               # <-- no_grad context
        model(t_img.unsqueeze(0))                       # <-- unsqueeze
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding
pic_vector = get_vector(img)
Instead of model(t_img), just do
model(t_img.unsqueeze(0))
This adds an extra (batch) dimension, so the image has shape [1, 3, 224, 224] and it will work.
I successfully converted a Keras H5 model into a TensorFlow pb file, but I get totally different results when making a prediction.
In Python I use 2 Keras modules to preprocess the data before feeding the network:
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import img_to_array
Here is how I preprocess the data in my Python code:
# extract the object ROI, convert it from BGR to RGB channel
# ordering, resize it to 224x224, and preprocess it
moving_object = img_orig[startY:endY, startX:endX]
moving_object = cv2.cvtColor(moving_object, cv2.COLOR_BGR2RGB)
moving_object = cv2.resize(moving_object, (224, 224))
moving_object = img_to_array(moving_object)
moving_object = preprocess_input(moving_object)
Then I make batch predictions via the Keras predict method:
# only make predictions if at least one object was detected
if len(objects) > 0:
    objects = np.array(objects, dtype="float32")
    preds = wine_plant_model.predict(objects)
Here is how I preprocess the data in C++:
vector<Mat> detected_objects;
//extract the object ROI
Mat image_roi = img_orig(roi);
and how I make batch predictions in C++:
if (detected_objects.size() > 0) {
    vector<Mat> preds;
    Mat inputBlobs = cv::dnn::blobFromImages(detected_objects, 1.0, Size(224, 224));
    net.setInput(inputBlobs);
    Mat outputs = net.forward();
}
It seems that I am not preprocessing the image the right way in C++ and therefore I am not getting the same results. But I cannot find an equivalent of the Keras preprocess_input() method in C++.
Looking at the Keras documentation, the Python preprocess_input() method scales the data to the range -1 to 1. So I do not know if I should normalize the data using the cv::normalize method or do something with the blobFromImages scale factor. I am a bit confused here.
Could you please tell me how I should preprocess the data the same way in C++, even without Keras, which does not seem to be available in C++?
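For reference, preprocess_input for MobileNetV2 maps pixel values from [0, 255] to [-1, 1] by computing x / 127.5 - 1. As far as I know, cv::dnn::blobFromImages subtracts the mean first and then multiplies by the scale factor, so a scale factor of 1/127.5 together with a mean of 127.5 per channel should reproduce the same values. The NumPy sketch below only checks that arithmetic; the function names are illustrative and neither Keras nor OpenCV is called. The Python pipeline also converts BGR to RGB, which on the C++ side would correspond to the swapRB argument of blobFromImages unless the channels are already swapped elsewhere.

```python
import numpy as np


def keras_style_preprocess(pixels):
    """x / 127.5 - 1, the scaling used for MobileNetV2 inputs."""
    return pixels.astype(np.float32) / 127.5 - 1.0


def blob_style_preprocess(pixels, scale_factor=1.0 / 127.5, mean=127.5):
    """(x - mean) * scale_factor, the order assumed here for blobFromImages."""
    return (pixels.astype(np.float32) - mean) * scale_factor


pixels = np.array([0.0, 64.0, 127.5, 255.0], dtype=np.float32)
keras_values = keras_style_preprocess(pixels)
blob_values = blob_style_preprocess(pixels)
print(keras_values)
print(np.allclose(keras_values, blob_values))
```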
I am developing an image classification model/program for Raspberry Pi 0 W. I was wondering if it is possible to make a code upgrade that will accelerate image processing.
General information:
the main model was trained on EfficientNetB5
image dimensions are 240x320 in grayscale
on Raspberry, it should be an image classification, no possibility of 'live streaming' and object detection
I acknowledge that Raspberry Pi 0 W is not the best match for TF, but anyway maybe there is a way for acceleration
at the moment one image is being predicted in 60 seconds, which is too much
My thoughts about this are that maybe I should train the model with lower dimensions and maybe the learning_rate of the main model can affect rpi's speed?
Below I am attaching two scripts.
Tensorflow save_model transformation into tf_lite quantized model
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras.models import load_model
model = load_model('../models/effnet_v22.h5')
TFLITE_QUANT_MODEL = "../tflite_models/effnet_v22_quant.tflite"
run_model = tf.function(lambda x : model(x))
# Save the concrete function.
concrete_func = run_model.get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype)
)
# Convert the model to quantized version with post-training quantization
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_quant_model = converter.convert()
open(TFLITE_QUANT_MODEL, "wb").write(tflite_quant_model)
print("TFLite Quantized Model Is Created")
One image processing on Raspberry Pi 0
import tensorflow as tf
import numpy as np
import matplotlib.image as img
import cv2
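# uploading tflite model (path as written by the conversion script above; adjust it for the Raspberry Pi)
tflite_interpreter = tf.lite.Interpreter(model_path="../tflite_models/effnet_v22_quant.tflite")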
# taking pre-trained model parameters
input_details = tflite_interpreter.get_input_details()
output_details = tflite_interpreter.get_output_details()
img_width = input_details[0]['shape'][2]
img_height = input_details[0]['shape'][1]
# uploading and processing the image to be predicted
testimg=cv2.imread("path/to/image.jpg")  # placeholder path; the original image path is not shown
testimg=cv2.resize(testimg, (img_width,img_height))
testimg=cv2.cvtColor(testimg, cv2.COLOR_BGR2GRAY)
testimg=testimg[np.newaxis, ..., np.newaxis]
testimg=np.array(testimg, dtype=np.float32)
# resizing tflite's tensors
tflite_interpreter.resize_tensor_input(input_details[0]['index'], (1, img_height, img_width, 1))
tflite_interpreter.resize_tensor_input(output_details[0]['index'], (1, 8))
tflite_interpreter.allocate_tensors()
input_details = tflite_interpreter.get_input_details()
output_details = tflite_interpreter.get_output_details()
tflite_interpreter.set_tensor(input_details[0]['index'], testimg)
# run the inference
tflite_interpreter.invoke()
tflite_model_predictions = tflite_interpreter.get_tensor(output_details[0]['index'])
# TFLite prediction results
classes = np.array([101,102,104,105, 107, 110, 113, 115]) # class array creation
mat = np.vstack([classes, tflite_model_predictions])
np.set_printoptions(suppress=True, precision = 10) # to get rid of scientific numbers
if np.max(mat[1,:]) > 0.50:
    theclass = int(mat[0, np.argmax(mat[1,:])])
else:
    theclass = "NO_CLASS"
print("The predicted class is", theclass)
You are using the EfficientNet-B5 model, which has nearly 30M parameters. Even with the benefits of TensorFlow Lite and quantization, it is very hard to get inference latency below 30 ms, and that is assuming a high-performance CPU like the one in the Pixel 4. Considering that you are using a very limited embedded system, 60 seconds per inference is to be expected.
The TensorFlow blog has a well-explained post about the latency of EfficientNet-Lite models: https://blog.tensorflow.org/2020/03/higher-accuracy-on-vision-models-with-efficientnet-lite.html
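When trying smaller models or lower input resolutions, it helps to measure latency the same way each time. The sketch below is a generic timing harness using only the standard library; measure_latency and dummy_inference are hypothetical names, and the dummy workload merely stands in for a real inference call.

```python
import statistics
import time


def measure_latency(run_once, warmup=1, repeats=5):
    """Time a zero-argument callable; return (median, fastest, slowest) in seconds."""
    for _ in range(warmup):
        run_once()  # warm-up runs are not timed
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings), min(timings), max(timings)


def dummy_inference():
    """Stand-in workload; replace with the real inference call."""
    sum(i * i for i in range(10_000))


median_s, fastest_s, slowest_s = measure_latency(dummy_inference)
print(f"median {median_s * 1000:.2f} ms")
```

On the Raspberry Pi, the callable would be tflite_interpreter.invoke (after set_tensor), with fewer repeats given the long run time.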
I want to train a convolutional neural network (using tf.keras from Tensorflow version 1.13) using numpy arrays as input data. The training data (which I currently store in a single >30GB '.npz' file) does not fit in RAM all at once. What is the best way to save and load large data-sets into a neural network for training? Since I didn't manage to find a good answer to this (surely ubiquitous?) problem, I'm hoping to hear one here. Thank you very much in advance for any help!
Similar questions seem to have been asked many times (e.g. training-classifier-from-tfrecords-in-tensorflow, tensorflow-synchronize-readings-from-tfrecord, how-to-load-data-parallelly-in-tensorflow) but are several years old and usually contain no conclusive answer.
My current understanding is that using TFRecord files is a good way to approach this problem. The most promising tutorial I found so far explaining how to use TFRecord files with keras is medium.com. Other helpful sources were machinelearninguru.com and medium.com_source2 and sources therein.
The official tensorflow documentation and tutorials (on tf.data.Dataset, Importing Data, tf_records etc.) did not help me. In particular, several of the examples given there didn't work for me even without modifications.
My Attempt at using TFRecord files
I'm assuming TFRecords are a good way to solve my problem but I'm having a hard time using them. Here is an example I made based on the tutorial medium.com. I stripped down the code as much as I could.
# python 3.6, tensorflow 1.13.
# Adapted from https://medium.com/#moritzkrger/speeding-up-keras-with-tfrecord-datasets-5464f9836c36
import tensorflow as tf
import numpy as np
from tensorflow.python import keras as keras
# Helper functions (see also https://www.tensorflow.org/tutorials/load_data/tf_records)
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def writeTFRecords():
    number_of_samples = 100  # create some random data to play with
    images, labels = (np.random.sample((number_of_samples, 256, 256, 1)), np.random.randint(0, 30, number_of_samples))
    writer = tf.python_io.TFRecordWriter("bla.tfrecord")
    for index in range(images.shape[0]):
        image = images[index]
        label = labels[index]
        feature = {'image': _bytes_feature(tf.compat.as_bytes(image.tostring())),
                   'label': _int64_feature(int(label))}
        example = tf.train.Example(features=tf.train.Features(feature=feature))
        writer.write(example.SerializeToString())
    writer.close()
def loadTFRecord(data_path):
    with tf.Session() as sess:
        feature = {'train/image': tf.FixedLenFeature([], tf.string),
                   'train/label': tf.FixedLenFeature([], tf.int64)}
        # Create a list of filenames and pass it to a queue
        filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)
        # Define a reader and read the next record
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
        # Decode the record read by the reader
        features = tf.parse_single_example(serialized_example, features=feature)
        # Convert the image data from string back to the numbers
        image = tf.decode_raw(features['train/image'], tf.float32)
        # Cast label data into int32
        label = tf.cast(features['train/label'], tf.int32)
        # Reshape image data into the original shape
        image = tf.reshape(image, [256, 256, 1])
        return image, label  # I'm not 100% sure that's how this works...
# ######### generate a TFRecords file in the working directory containing random data. #################################
writeTFRecords()
# ######## Load the TFRecords file and use it to train a simple example neural network. ################################
image, label = loadTFRecord("bla.tfrecord")
model_input = keras.layers.Input(tensor=image)
model_output = keras.layers.Flatten(input_shape=(-1, 256, 256, 1))(model_input)
model_output = keras.layers.Dense(16, activation='relu')(model_output)
train_model = keras.models.Model(inputs=model_input, outputs=model_output)
print("\n \n start training \n \n") # Execution gets stuck on fitting
train_model.fit(epochs=1, steps_per_epoch=10) # no output or error messages.
The code creates a TFRecord file and starts fitting, then just gets stuck with no output or error messages. I don't know what the problem is or how I could try to fix it.
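Two things in the code above may be worth checking; this is an observation from reading it, not a confirmed cause of the hang. The feature keys differ between writing ('image', 'label') and reading ('train/image', 'train/label'), and np.random.sample returns float64 arrays while the reader decodes the bytes as float32. The NumPy-only sketch below shows what such a dtype mismatch does to the element count; np.frombuffer is used here as a stand-in for tf.decode_raw.

```python
import numpy as np

# Same kind of data as in writeTFRecords(): np.random.sample yields float64.
image = np.random.sample((256, 256, 1))
raw_bytes = image.tobytes()

# Decoding with the wrong dtype doubles the number of elements ...
decoded_wrong = np.frombuffer(raw_bytes, dtype=np.float32)
# ... while the matching dtype restores the original count.
decoded_right = np.frombuffer(raw_bytes, dtype=np.float64)
print(decoded_wrong.size, decoded_right.size)  # 131072 65536

# Only the correctly decoded buffer can be reshaped back to (256, 256, 1).
restored = decoded_right.reshape(256, 256, 1)
print(np.array_equal(restored, image))  # True
```

Casting the data to float32 before serializing, or decoding as float64, keeps the two sides consistent.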
While this is no real answer to the original question (i.e. "what is the optimal way to train on large datasets"), I managed to get tfrecords and datasets to work. Of particular help was this tutorial on YouTube. I include a minimal example with working code for anyone struggling with the same problem.
# Developed using python 3.6, tensorflow 1.14.0.
# This code writes data (pairs (label, image) where label is int64 and image is np.ndarray) into .tfrecord files and
# uses them for training a simple neural network. It is meant as a minimal working example of how to use tfrecords. This
# solution is likely not optimal. If you know how to improve it, please comment on
# https://stackoverflow.com/q/57717004/9988487. Refer to links therein for further information.
import tensorflow as tf
import numpy as np
from tensorflow.python import keras as keras
# Helper functions (see also https://www.tensorflow.org/tutorials/load_data/tf_records)
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def write_tfrecords_file(out_path: str, images: np.ndarray, labels: np.ndarray) -> None:
    """Write all image-label pairs into a single .tfrecord file.
    :param out_path: File path of the .tfrecord file to generate or overwrite.
    :param images: array with first dimension being the image index. Every images[i].tostring() is
        serialized and written into the file as 'image': _bytes_feature(img_bytes)
    :param labels: 1d array of integers. labels[i] is the label of images[i]. Written as 'label': _int64_feature(label)"""
    assert len(images) == len(labels)
    with tf.io.TFRecordWriter(out_path) as writer:  # could use writer_options parameter to enable compression
        for i in range(len(labels)):
            img_bytes = images[i].tostring()  # Convert the image to raw bytes.
            label = labels[i]
            data = {'image': _bytes_feature(img_bytes), 'label': _int64_feature(label)}
            feature = tf.train.Features(feature=data)  # Wrap the data as TensorFlow Features.
            example = tf.train.Example(features=feature)  # Wrap again as a TensorFlow Example.
            serialized = example.SerializeToString()  # Serialize the data.
            writer.write(serialized)  # Write the serialized data to the TFRecords file.
def parse_example(serialized, shape=(256, 256, 1)):
    features = {'image': tf.io.FixedLenFeature([], tf.string), 'label': tf.io.FixedLenFeature([], tf.int64)}
    # Parse the serialized data so we get a dict with our data.
    parsed_example = tf.io.parse_single_example(serialized=serialized, features=features)
    label = parsed_example['label']
    image_raw = parsed_example['image']  # Get the image as raw bytes.
    image = tf.decode_raw(image_raw, tf.float32)  # Decode the raw bytes so it becomes a tensor with type.
    image = tf.reshape(image, shape=shape)
    return image, label  # this function will be called once (to add it to tf graph; then parse images individually)
# create some arbitrary data to play with: 1000 images sized 256x256 with one colour channel. Use your custom np-arrays
# using float32 to save memory. Must match type in parse_example(), tf.decode_raw(image_raw, tf.float32)
IMAGE_WIDTH, NUM_OF_IMAGES, NUM_OF_CLASSES, COLOUR_CHANNELS = 256, 1000, 10, 1
features_train = np.random.sample((NUM_OF_IMAGES, IMAGE_WIDTH, IMAGE_WIDTH, COLOUR_CHANNELS)).astype(np.float32)
labels_train = np.random.randint(low=0, high=NUM_OF_CLASSES, size=NUM_OF_IMAGES) # one random label for each image
features_eval = features_train[:200] # use the first 200 images as evaluation data for simplicity.
labels_eval = labels_train[:200]
write_tfrecords_file("train.tfrecord", features_train, labels_train) # normal: split the data files of several GB each
write_tfrecords_file("eval.tfrecord", features_eval, labels_eval) # this may take a while. Consider a progressbar
# The files are complete. Now define a model and use datasets to feed the data from the .tfrecord files into the model.
model = keras.Sequential([keras.layers.Flatten(input_shape=(256, 256, 1)),
keras.layers.Dense(128, activation='relu'),
keras.layers.Dense(10, activation='softmax')])
# Check docs for parameters (compression, buffer size, thread count. Also www.tensorflow.org/guide/performance/datasets
train_dataset = tf.data.TFRecordDataset("train.tfrecord") # specify a list (or dataset) of file names for large data
train_dataset = train_dataset.map(parse_example) # parse tfrecords. Parameter num_parallel_calls may help performance.
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
validation_dataset = tf.data.TFRecordDataset("eval.tfrecord")
validation_dataset = validation_dataset.map(parse_example).batch(64)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=3)
# evaluate the results
results = model.evaluate(validation_dataset)
print('\n\nvalidation loss, validation acc:', results)
Note that it's tricky to use some_keras_model.fit(..., validation_data=some_dataset) with dataset objects. It may result in
TypeError: 'DatasetV1Adapter' object does not support indexing.
This seems to be a bug (see github.com/tensorflow/tensorflow/issues/28995) and is supposedly fixed as of tf-nightly version '1.15.0-dev20190808'; The official tutorial uses this too, although it doesn't work in most versions. An easy but dirty-ish fix is to use verbose=0 (which only suppresses program output) and plot the validation results using tensorboard. Also see Keras model.fit() with tf.dataset API + validation_data.
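The comments in the example above suggest splitting large data into several .tfrecord files and passing a list of file names to the dataset. A minimal sketch of the index bookkeeping for that is shown below; shard_ranges, shard_filenames and the file naming pattern are hypothetical choices, not something TensorFlow requires.

```python
def shard_ranges(num_samples, samples_per_shard):
    """Yield (start, stop) index pairs that together cover range(num_samples)."""
    for start in range(0, num_samples, samples_per_shard):
        yield start, min(start + samples_per_shard, num_samples)


def shard_filenames(prefix, num_samples, samples_per_shard):
    """One file name per shard, e.g. train-00000-of-00004.tfrecord."""
    count = len(list(shard_ranges(num_samples, samples_per_shard)))
    return [f"{prefix}-{i:05d}-of-{count:05d}.tfrecord" for i in range(count)]


ranges = list(shard_ranges(1000, 300))
names = shard_filenames("train", 1000, 300)
print(ranges)    # [(0, 300), (300, 600), (600, 900), (900, 1000)]
print(names[0])  # train-00000-of-00004.tfrecord
```

Each (start, stop) pair could then be written with write_tfrecords_file(name, images[start:stop], labels[start:stop]), and the list of names passed to tf.data.TFRecordDataset.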