On a Jetson TX2 I am running:
Linux4Tegra R32.2.1
UFF Version 0.6.3
TensorRT 5.1.6.1
CUDA 10
Python 3.6.8
I get this error message:
[TensorRT] ERROR: UffParser: Validator error: sequential/batch_normalization_1/FusedBatchNormV3: Unsupported operation _FusedBatchNormV3
From this code:
import uff
import tensorrt as trt

output_nodes = [args.output_node_names]
input_node = args.input_node_name
frozen_graph_pb = args.frozen_graph_pb

uff_model = uff.from_tensorflow(frozen_graph_pb, output_nodes)  # successfully creates the UFF model

G_LOGGER = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(G_LOGGER)
builder.max_batch_size = 10
builder.max_workspace_size = 1 << 30
data_type = trt.DataType.FLOAT

network = builder.create_network()
parser = trt.UffParser()
input_verified = parser.register_input(input_node, (1, 234, 234, 3))   # returns True
output_verified = parser.register_output(output_nodes[0])              # returns True
buffer_verified = parser.parse_buffer(uff_model, network, data_type)   # returns False
The uff model was created successfully.
The parser successfully registered the inputs and outputs.
Parsing the buffer fails with the error above.
Does anyone know whether FusedBatchNormV3 is truly unsupported in TensorRT, and if so, is there an existing plugin I can pull in using the graphsurgeon module?
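One workaround I have seen suggested (untested on my side) is to rewrite the V3 batch-norm nodes with graphsurgeon before the UFF conversion, roughly like this sketch; dropping the extra "U" attribute is my assumption about what the V3 op adds on top of FusedBatchNorm:

import graphsurgeon as gs
import uff

# Load the frozen graph and downgrade FusedBatchNormV3 nodes to FusedBatchNorm
graph = gs.DynamicGraph(frozen_graph_pb)
for node in graph.find_nodes_by_op("FusedBatchNormV3"):
    node.op = "FusedBatchNorm"
    if "U" in node.attr:        # extra dtype attribute introduced by the V3 op (assumption)
        del node.attr["U"]

uff_model = uff.from_tensorflow(graph.as_graph_def(), output_nodes)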
I am trying to run an optical flow model, RAFT, on Google Colab. I have run its setup file and installed the libraries it needs, but when I try to run the demo file, I get a Qt error that looks like this:
!python demo.py --model=raft-things.pth --path=demo-frames
/usr/local/lib/python3.10/site-packages/torch-1.13.0-py3.10-linux-x86_64.egg/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
qt.qpa.xcb: could not connect to display
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/usr/local/lib/python3.10/site-packages/opencv_python-4.6.0.66-py3.10-linux-x86_64.egg/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: xcb, eglfs, minimal, minimalegl, offscreen, vnc.
I have never had any GUI errors on Colab with other libraries, so I am not sure why torch is giving this one.
Here is the demo.py file:
import argparse
import os
import cv2
import glob
import numpy as np
import torch
from PIL import Image
from raft import RAFT
from raft.utils import flow_viz
from raft.utils.utils import InputPadder
DEVICE = 'cuda'
def load_image(imfile):
    img = np.array(Image.open(imfile)).astype(np.uint8)
    img = torch.from_numpy(img).permute(2, 0, 1).float()
    return img[None].to(DEVICE)

def viz(img, flo):
    img = img[0].permute(1, 2, 0).cpu().numpy()
    flo = flo[0].permute(1, 2, 0).cpu().numpy()

    # map flow to rgb image
    flo = flow_viz.flow_to_image(flo)
    img_flo = np.concatenate([img, flo], axis=0)

    # import matplotlib.pyplot as plt
    # plt.imshow(img_flo / 255.0)
    # plt.show()

    cv2.imshow('image', img_flo[:, :, [2, 1, 0]] / 255.0)
    cv2.waitKey()

def demo(args):
    model = torch.nn.DataParallel(RAFT(args))
    model.load_state_dict(torch.load(args.model, map_location=DEVICE))

    model = model.module
    model.to(DEVICE)
    model.eval()

    with torch.no_grad():
        images = glob.glob(os.path.join(args.path, '*.png')) + \
                 glob.glob(os.path.join(args.path, '*.jpg'))

        images = sorted(images)
        for imfile1, imfile2 in zip(images[:-1], images[1:]):
            image1 = load_image(imfile1)
            image2 = load_image(imfile2)

            padder = InputPadder(image1.shape)
            image1, image2 = padder.pad(image1, image2)

            flow_low, flow_up = model(image1, image2, iters=20, test_mode=True)
            viz(image1, flow_up)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', help="restore checkpoint")
    parser.add_argument('--path', help="dataset for evaluation")
    parser.add_argument('--small', action='store_true', help='use small model')
    parser.add_argument('--mixed_precision', action='store_true', help='use mixed precision')
    parser.add_argument('--alternate_corr', action='store_true', help='use efficient correlation implementation')
    args = parser.parse_args()

    demo(args)
My hunch is that the issue might come from the fact that the model is somewhat outdated (2 years old). But I have tried both the PyTorch and CUDA versions the authors specified and the most recent stable versions (both installed through conda), and I am still getting the same error.
Before this, I was getting a CudaCheck error, but I assumed that was because setup.py wasn't installed correctly due to not having the correct Python version (3.8+); I resolved that by creating a separate kernel for Python 3.10 and installing with it. I am now getting this error first.
My other hunch is that it has something to do with cv2 functions like cv2.namedWindow or cv2.imshow. That is what I gathered from this other post, but nothing from there solved my issue. They are necessary, though, as a lot of the architecture is built on cv2.
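My current workaround idea is to skip the GUI calls entirely on Colab and write the visualisation to disk instead; a sketch of a headless viz, reusing the imports from demo.py (the output path is just an example):

def viz_headless(img, flo, out_path='flow_vis.png'):
    img = img[0].permute(1, 2, 0).cpu().numpy()
    flo = flo[0].permute(1, 2, 0).cpu().numpy()
    flo = flow_viz.flow_to_image(flo)                 # map flow to an RGB image
    img_flo = np.concatenate([img, flo], axis=0)
    cv2.imwrite(out_path, img_flo[:, :, [2, 1, 0]])   # RGB -> BGR for OpenCV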
I am a beginner TensorFlow user and am running into the following issue when attempting to load an already saved model for segmentation on test images.
I installed all the libraries in a virtual environment that I created.
The same code runs on Google Colab, and now I am trying to run it on my machine.
My Environment
Ubuntu 16
tensorflow 2.5.0
My code
When running the code:
import os
from glob import glob
from tqdm import tqdm
import cv2
import tensorflow as tf
import nibabel as nib
import numpy as np
import matplotlib.pyplot as plt
test_images = sorted(glob("/mnt/DATA2To/projet/all/Souris/SOD/data/20210726_C321/EXVIVO/IRM/test-20210726_C321/*")) # images 160x120x1
i = 0 # iterator initialized to zero
model = tf.keras.models.load_model("/mnt/DATA2To/projet/all/Souris/SOD/segmentation/segmentation-moelle.h5", compile= False )
for path in tqdm(test_images, total=len(test_images)):
    x = nib.load(path)                            # load the image (160x1x120)
    new_header = x.header.copy()                  # copy the header for writing the results at the end
    x = nib.load(path).get_data()                 # get the data from the loaded image
    original_image = x
    original_image_bis = original_image.transpose((0, 2, 1))
    h, w, _ = original_image_bis.shape
    original_image_bis = cv2.resize(original_image_bis, (w, h))

    x = x.transpose((0, 2, 1))                    # permute the image axes to (160x120x1)
    x = cv2.resize(x, (128, 128))                 # resize the image to a shape of (128x128)
    x = (x - x.min()) / (x.max() - x.min())       # min-max normalisation
    x.shape = x.shape + (1,)                      # add the third axis (128x128x1)
    x = x.astype(np.float32)
    x1 = np.expand_dims(x, axis=0)

    pred_mask = model.predict(x1)[0]
    #pred_mask = (np.where(pred_mask > np.mean(pred_mask), 1, 0))
    pred_mask = pred_mask.astype(np.float32)
    pred_mask1 = cv2.resize(pred_mask, (w, h))
    pred_mask1 = (np.where(pred_mask1 > 0.92, 1, 0))
    pred_mask1.shape = pred_mask1.shape + (1,)    # add the third axis (160x120x1)
    pred_mask1 = pred_mask1.transpose((0, 2, 1))  # permute the image axes to (160x1x120)

    Sform = new_header.get_base_affine()
    pred_mask2 = nib.Nifti1Image(pred_mask1, None, header=new_header)
    fname = "/mnt/DATA2To/projet/all/Souris/SOD/data/20210726_C321/EXVIVO/IRM/results-moelle/image%04d.nii" % i
    nib.save(pred_mask2, fname)

    i += 1
My Error
I am greeted with this error:
(venv) etudiant#PTT:~$ python3 '/home/etudiant/Documents/code/Segmentation.py'
2021-07-28 09:58:12.539200: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /mnt/software//mrtrix/lib::/opt/minc/lib:/opt/minc/lib/InsightToolkit
2021-07-28 09:58:12.539221: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-07-28 09:58:13.429146: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /mnt/software//mrtrix/lib::/opt/minc/lib:/opt/minc/lib/InsightToolkit
2021-07-28 09:58:13.429164: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-07-28 09:58:13.429179: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (PTT): /proc/driver/nvidia/version does not exist
2021-07-28 09:58:13.429322: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
0it [00:00, ?it/s]
Can anyone tell me what is wrong and how to fix it?
W stands for "Warning" and I stands for "Information".
There are no problems with your code; TF is just telling you that it did not find the libraries required for GPU computation. TensorFlow will still run successfully on the CPU.
If you want to avoid receiving such messages in the future, you can suppress the warnings.
Solution 1:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
Level 2 means ignore warning and information messages and print only errors.
Solution 2:
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)  # tf.logging was removed in TF 2.x; use the compat.v1 module
Solution 3:
import logging
import tensorflow as tf

tf.get_logger().setLevel(logging.ERROR)
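If you also want to confirm whether TensorFlow can see a GPU at all, a quick check is:

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))  # an empty list means TensorFlow is running CPU-only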
I am loading a saved TensorFlow model (a .pb file) and trying to evaluate how much memory it allocates for the model with the guppy package. Following a simple tutorial, here is what I tried:
from guppy import hpy
import tensorflow as tf
heap = hpy()
print("Heap Status at starting: ")
heap_status1 = heap.heap()
print("Heap Size : ", heap_status1.size, " bytes\n")
print(heap_status1)
heap.setref()
print("\nHeap Status after setting reference point: ")
heap_status2 = heap.heap()
print("Heap size: ", heap_status2.size, " bytes\n")
print(heap_status2)
model_path = "./saved_model/" #.pb file directory
model = tf.saved_model.load(model_path)
print("\nHeap status after creating model: ")
heap_status3 = heap.heap()
print("Heap size: ", heap_status3.size, " bytes\n")
print(heap_status3)
print("Memory used by the model: ", heap_status3.size - heap_status2.size)
I don't know why, but when I run the code it suddenly stops executing when it reaches heap_status1 = heap.heap(). It doesn't throw any error.
The same code runs fine when I don't use anything related to TensorFlow, i.e. it runs successfully when I just create some random lists, strings, etc. instead of loading a TensorFlow model.
Note: my model will run on a CPU device. Unfortunately, tf.config.experimental.get_memory_info works with GPUs only.
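As a rough CPU-side alternative I am considering, one can compare the process RSS before and after loading the model with psutil (not part of the code above, and only an approximation, since RSS includes everything the process allocates):

import psutil
import tensorflow as tf

proc = psutil.Process()                        # current process
rss_before = proc.memory_info().rss            # resident set size in bytes
model = tf.saved_model.load("./saved_model/")
rss_after = proc.memory_info().rss
print("Approximate memory used by the model:", rss_after - rss_before, "bytes")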
If you are on Windows, the crash may be related to https://github.com/zhuyifei1999/guppy3/issues/25. Check your pywin32 version, and if it is < 300, upgrade pywin32 with
pip install -U pywin32
System information
OS Platform and Distribution: Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1034-azure x86_64)
TensorFlow installed from: pip install
TensorFlow version: 2.3.0
Command used to run the converter
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.allow_custom_ops = True
converter.experimental_new_converter = True
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS     # enable TensorFlow ops.
]
converter.optimizations = [ tf.lite.Optimize.DEFAULT ]
tflite_model = converter.convert()
link to Jupyter notebook and tflite model
https://drive.google.com/drive/folders/1pTB33fTSo5ENzevobTvuG7hN4YmiCPF_?usp=sharing
Commands used for inference
import numpy as np
import tensorflow as tf

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="model_2.3.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on random input data.
input_data_1 = np.array(np.random.random_sample(input_details[0]['shape']), dtype=np.float32)
input_data_2 = np.array(np.random.random_sample(input_details[1]['shape']), dtype=np.float32)
input_data_3 = np.array(np.random.random_sample(input_details[2]['shape']), dtype=np.float32)

interpreter.set_tensor(input_details[0]['index'], input_data_1)
interpreter.set_tensor(input_details[1]['index'], input_data_2)
interpreter.set_tensor(input_details[2]['index'], input_data_3)

interpreter.invoke()  # ---> the kernel gets stuck here with no output when I execute this from Jupyter
The output from the converter invocation
No output in Jupyter.
Segmentation fault (core dumped) -- when executed from the command line.
Failure details
Conversion is successful, but there is no output from the model.
Could you please provide some ideas? I am stuck here and don't know how to proceed.
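One thing I have not tried yet is enabling faulthandler when running from the command line, to at least get a Python-level traceback at the point of the segmentation fault; a sketch (it assumes all inputs are float32, as in the snippet above):

import faulthandler
import numpy as np
import tensorflow as tf

faulthandler.enable()  # print a Python traceback if the process receives a fatal signal such as SIGSEGV

interpreter = tf.lite.Interpreter(model_path="model_2.3.tflite")
interpreter.allocate_tensors()
for d in interpreter.get_input_details():
    # feed random data of the right shape, as in the snippet above
    interpreter.set_tensor(d['index'], np.random.random_sample(d['shape']).astype(np.float32))
interpreter.invoke()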
I want to run my Python script with the GPU, as you can see in this photo.
I used the command watch nvidia-smi to show the GPU processes; unfortunately, the Python script uses just 41 MiB of GPU memory:
This is a part of my code:
import time
import math
import cv2
import numpy as np
labelsPath = "./coco.names"
LABELS = open(labelsPath).read().strip().split("\n")
np.random.seed(42)
weightsPath = "./yolov3.weights"
configPath = "./yolov3.cfg"
net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)
ln = net.getLayerNames()
ln = [ln[i[0] - 1] for i in net.getUnconnectedOutLayers()]
FR=0
vs = cv2.VideoCapture(vid_path)
# vs = cv2.VideoCapture(0) ## USe this if you want to use webcam feed
writer = None
(W, H) = (None, None)
fl = 0
q = 0
while True:
    (grabbed, frame) = vs.read()
    if not grabbed:
        break

    if W is None or H is None:
        (H, W) = frame.shape[:2]
        FW = W
        if W < 1075:
            FW = 1075
        FR = np.zeros((H + 210, FW, 3), np.uint8)
        col = (255, 255, 255)
        FH = H + 210

    FR[:] = col

    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)

    start = time.time()
    layerOutputs = net.forward(ln)
    end = time.time()
I tried adding these lines to force it to run on the GPU:
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
Then, after running the script again, it gives me this message and continues running the script on the CPU:
[ WARN:0] global /io/opencv/modules/dnn/src/dnn.cpp (1363) setUpNet DNN module was not built with CUDA backend; switching to CPU
You'll need to manually build OpenCV to work with your GPU.
Here is a great tutorial on how to do so.
You might have to uninstall your existing opencv-python package using pip, if you already have one; only then will the custom-built OpenCV be accessible to the program.
pip3 uninstall opencv-python
Compatibility chart of CUDA and cuDNN:
https://docs.nvidia.com/deeplearning/cudnn/support-matrix/index.html#cudnn-cuda-hardware-versions
Check the compute capability of your GPU from:
https://en.wikipedia.org/wiki/CUDA
In my case it is 7.5.
In the "GPUs supported" table, compute capability 7.5 falls under CUDA SDK 11.0 – 11.2, which supports compute capability 3.5 – 8.6 (Kepler (in part), Maxwell, Pascal, Volta, Turing, Ampere); check for your supported NVIDIA hardware there.
In my case I was using a Tesla T4 (Turing architecture), which is compatible with cuDNN.
So in the compilation report, you can see that CMake returns cuDNN availability as "NO":
I got the Docker image using:
sudo docker pull nvidia/cuda:11.1-cudnn8-runtime-ubuntu18.04
I compiled OpenCV with CUDA following:
https://www.pyimagesearch.com/2020/02/03/how-to-use-opencvs-dnn-module-with-nvidia-gpus-cuda-and-cudnn/
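Once the custom build is installed, a quick sanity check from Python that CUDA and cuDNN actually made it into the build (just a verification snippet, not part of the build itself):

import cv2

print(cv2.getBuildInformation())              # look for "CUDA: YES" and "cuDNN: YES" in the output
print(cv2.cuda.getCudaEnabledDeviceCount())   # should be > 0 with a CUDA-enabled build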