I have written an object detection script that uses a model trained on the COCO dataset to detect its 91 classes of objects. I would like to know whether I can add a custom object detection model so that additional classes are detected as well. For instance, I trained a custom model to detect two classes (dog and panda), and I want a Python script that detects the 91 COCO classes plus those 2 additional classes, for a total of 93 classes.
import cv2
import pyttsx3
cap = cv2.VideoCapture(0)
cap.set(3, 640)
cap.set(4, 480)
# Model 1 - Coco Dataset (91 classes)
classNames = []
classFile = 'coco.names'
with open(classFile,'rt') as f:
    classNames = [line.rstrip() for line in f]
# Found documentation for SSD mobilenet V3
configPath = 'ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt'
weightsPath = 'frozen_inference_graph_coco.pb'
net = cv2.dnn_DetectionModel(weightsPath, configPath)
net.setInputSize(320, 320)
net.setInputScale(1.0 / 127.5)
net.setInputMean((127.5, 127.5, 127.5))
net.setInputSwapRB(True)
# Model 2 - 2 classifier Model dog and panda
classNames2 = []
classFile2 = 'coco.names2' #this file contains 1. dog and 2. panda
with open(classFile2,'rt') as f:
    classNames2 = [line.rstrip() for line in f]
configPath2 = 'ssd_mobilenet_v1_coco.config' # This is where I get the error - I am not sure which configuration path to use
weightsPath2 = 'frozen_inference_graph_dog_panda.pb'
net2 = cv2.dnn_DetectionModel(weightsPath2, configPath2)
net2.setInputSize(320, 320)
net2.setInputScale(1.0 / 127.5)
net2.setInputMean((127.5, 127.5, 127.5))
net2.setInputSwapRB(True)
engine = pyttsx3.init()
while True:
    success, img = cap.read()
    if not success:
        break
    classIds, confs, bbox = net.detect(img, confThreshold=0.60, nmsThreshold=0.2)
    classIds2, confs2, bbox2 = net2.detect(img, confThreshold=0.60, nmsThreshold=0.2)
    print(classIds, bbox)
    print(classIds2, bbox2)
    # Announce detections from the custom model (dog, panda)
    if len(classIds2) != 0:
        for classId2, confidence2, box2 in zip(classIds2.flatten(), confs2.flatten(), bbox2):
            className2 = classNames2[classId2 - 1]
            print(className2)
            engine.say(className2 + " detected")
            engine.runAndWait()
    # Announce detections from the COCO model
    if len(classIds) != 0:
        for classId, confidence, box in zip(classIds.flatten(), confs.flatten(), bbox):
            className = classNames[classId - 1]
            print(className)
            engine.say(className + " detected")
            engine.runAndWait()
    # Both results must be empty; `classIds and classIds2` does not test that
    if len(classIds) == 0 and len(classIds2) == 0:
        engine.say("no objects detected")
        engine.runAndWait()
    cv2.imshow('Output', img)
    cv2.waitKey(1)
My Error:
/Users/venuchannarayappa/PycharmProjects/ObjectDetector/venv/bin/python /Users/venuchannarayappa/PycharmProjects/ObjectDetector/Custom_Object_Detection_Voice_Feedback.py
[ERROR:0] global /Users/runner/work/opencv-python/opencv-python/opencv/modules/dnn/src/dnn.cpp (3554) getLayerShapesRecursively OPENCV/DNN: []:(_input): getMemoryShapes() throws exception. inputs=1 outputs=0/0 blobs=0
[ERROR:0] global /Users/runner/work/opencv-python/opencv-python/opencv/modules/dnn/src/dnn.cpp (3557) getLayerShapesRecursively input[0] = [ 1 3 300 300 ]
[ERROR:0] global /Users/runner/work/opencv-python/opencv-python/opencv/modules/dnn/src/dnn.cpp (3567) getLayerShapesRecursively Exception message: OpenCV(4.5.4) /Users/runner/work/opencv-python/opencv-python/opencv/modules/dnn/src/dnn.cpp:795: error: (-215:Assertion failed) inputs.size() == requiredOutputs in function 'getMemoryShapes'
Traceback (most recent call last):
File "/Users/venuchannarayappa/PycharmProjects/ObjectDetector/Custom_Object_Detection_Voice_Feedback.py", line 43, in <module>
classIds2, confs2, bbox2 = net2.detect(img, confThreshold=0.60, nmsThreshold=0.2)
cv2.error: OpenCV(4.5.4) /Users/runner/work/opencv-python/opencv-python/opencv/modules/dnn/src/dnn.cpp:795: error: (-215:Assertion failed) inputs.size() == requiredOutputs in function 'getMemoryShapes'
Process finished with exit code 1
Here is a link to the files that I used to create my custom object detection model https://www.icloud.com/iclouddrive/0c9q7nxUN4bceeLg29XY157cw#my_model
Please help me in determining how to set the configuration path. I believe the .config file of the SSD Mobilenet V1 should be used but I keep getting a memory shape error. Any help is appreciated!
Also, I have tried changing net.setInputSize(320, 320) -> net.setInputSize(300, 300). This change gave me the same error.
Finally, I also tried using the COCO configuration and weights for both models (configPath = 'ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt' and weightsPath = 'frozen_inference_graph_coco.pb' both times), and the code runs properly. I believe this means the code itself works, but there is some shape error when I use my own frozen inference graph with any .config or .pbtxt file.
To reproduce this with your own custom model, you would need to change the class names file to list the correct classes and set the configuration and weights paths. I would highly appreciate any help.
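One way to get a combined 93-class result, once both networks load, is to run them side by side and merge their outputs in Python instead of building a single model. The sketch below assumes each detection has already been unpacked into a `(class_id, confidence, box)` tuple with 1-based IDs, as in the loops above. `merge_detections` and its parameters are hypothetical names for illustration, not part of OpenCV.

```python
def merge_detections(coco_dets, custom_dets, coco_names, custom_names):
    """Merge detections from two models into one list of (label, confidence, box).

    Class IDs are assumed to be 1-based within each model. Custom classes are
    placed after the COCO classes, so with 91 COCO names the first custom
    class becomes class 92 of the combined list.
    """
    all_names = list(coco_names) + list(custom_names)
    offset = len(coco_names)
    merged = []
    for class_id, conf, box in coco_dets:
        merged.append((all_names[int(class_id) - 1], float(conf), tuple(box)))
    for class_id, conf, box in custom_dets:
        merged.append((all_names[offset + int(class_id) - 1], float(conf), tuple(box)))
    # Highest confidence first
    merged.sort(key=lambda det: det[1], reverse=True)
    return merged
```

Since the original loops only iterate when `len(classIds) != 0`, each input list would be built under the same guard, for example with `list(zip(classIds.flatten(), confs.flatten(), bbox))`, and left empty otherwise.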
I'm new to TensorFlow and object detection, and any help would be greatly appreciated! I have a dataset of 50 photos and used this video to get started. It DID work with Google's sample model (I'm using an RPi 4B with 8 GB of RAM), so I then wanted to create my own model. I tried a couple of options but ultimately failed, because the files I need are a .TFLITE model and a .txt file with the labels. I only managed to get a .LITE file, which did not work when I tested it.
I tried his Google Colab notebook, but the terminal got stuck at step 5 when I pressed the button to train the model. I then tried Edge Impulse, but the output models were all .LITE files, and it didn't provide a labelmap.txt file for the code. I tried manually changing the extension from .LITE to .TFLITE, since according to this thread that was supposed to work, but it didn't!
I need this to be ready in 3 days... Isn't there a more beginner-friendly way to do this? How can I get a valid .TFLITE model to work with my RPi 4? If I have to, I will change the code for this to work. Here's the code the tutorial provided:
######## Webcam Object Detection Using Tensorflow-trained Classifier #########
#
# Author: Evan Juras
# Date: 10/27/19
# Description:
# This program uses a TensorFlow Lite model to perform object detection on a live webcam
# feed. It draws boxes and scores around the objects of interest in each frame from the
# webcam. To improve FPS, the webcam object runs in a separate thread from the main program.
# This script will work with either a Picamera or regular USB webcam.
#
# This code is based off the TensorFlow Lite image classification example at:
# https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/python/label_image.py
#
# I added my own method of drawing boxes and labels using OpenCV.
# Import packages
import os
import argparse
import cv2
import numpy as np
import sys
import time
from threading import Thread
import importlib.util
# Define VideoStream class to handle streaming of video from webcam in separate processing thread
# Source - Adrian Rosebrock, PyImageSearch: https://www.pyimagesearch.com/2015/12/28/increasing-raspberry-pi-fps-with-python-and-opencv/
class VideoStream:
    """Camera object that controls video streaming from the Picamera"""
    def __init__(self,resolution=(640,480),framerate=30):
        # Initialize the PiCamera and the camera image stream
        self.stream = cv2.VideoCapture(0)
        ret = self.stream.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*'MJPG'))
        ret = self.stream.set(3,resolution[0])
        ret = self.stream.set(4,resolution[1])
        # Read first frame from the stream
        (self.grabbed, self.frame) = self.stream.read()
        # Variable to control when the camera is stopped
        self.stopped = False

    def start(self):
        # Start the thread that reads frames from the video stream
        Thread(target=self.update,args=()).start()
        return self

    def update(self):
        # Keep looping indefinitely until the thread is stopped
        while True:
            # If the camera is stopped, stop the thread
            if self.stopped:
                # Close camera resources
                self.stream.release()
                return
            # Otherwise, grab the next frame from the stream
            (self.grabbed, self.frame) = self.stream.read()

    def read(self):
        # Return the most recent frame
        return self.frame

    def stop(self):
        # Indicate that the camera and thread should be stopped
        self.stopped = True
# Define and parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument('--modeldir', help='Folder the .tflite file is located in',
required=True)
parser.add_argument('--graph', help='Name of the .tflite file, if different than detect.tflite',
default='detect.lite')
parser.add_argument('--labels', help='Name of the labelmap file, if different than labelmap.txt',
default='labelmap.txt')
parser.add_argument('--threshold', help='Minimum confidence threshold for displaying detected objects',
default=0.5)
parser.add_argument('--resolution', help='Desired webcam resolution in WxH. If the webcam does not support the resolution entered, errors may occur.',
default='1280x720')
parser.add_argument('--edgetpu', help='Use Coral Edge TPU Accelerator to speed up detection',
action='store_true')
args = parser.parse_args()
MODEL_NAME = args.modeldir
GRAPH_NAME = args.graph
LABELMAP_NAME = args.labels
min_conf_threshold = float(args.threshold)
resW, resH = args.resolution.split('x')
imW, imH = int(resW), int(resH)
use_TPU = args.edgetpu
# Import TensorFlow libraries
# If tflite_runtime is installed, import interpreter from tflite_runtime, else import from regular tensorflow
# If using Coral Edge TPU, import the load_delegate library
pkg = importlib.util.find_spec('tflite_runtime')
if pkg:
    from tflite_runtime.interpreter import Interpreter
    if use_TPU:
        from tflite_runtime.interpreter import load_delegate
else:
    from tensorflow.lite.python.interpreter import Interpreter
    if use_TPU:
        from tensorflow.lite.python.interpreter import load_delegate
# If using Edge TPU, assign filename for Edge TPU model
if use_TPU:
    # If user has specified the name of the .tflite file, use that name, otherwise use default 'edgetpu.tflite'
    if (GRAPH_NAME == 'detect.lite'):
        GRAPH_NAME = 'edgetpu.tflite'
# Get path to current working directory
CWD_PATH = os.getcwd()
# Path to .tflite file, which contains the model that is used for object detection
PATH_TO_CKPT = os.path.join(CWD_PATH,MODEL_NAME,GRAPH_NAME)
# Path to label map file
PATH_TO_LABELS = os.path.join(CWD_PATH,MODEL_NAME,LABELMAP_NAME)
# Load the label map
with open(PATH_TO_LABELS, 'r') as f:
    labels = [line.strip() for line in f.readlines()]
# Have to do a weird fix for label map if using the COCO "starter model" from
# https://www.tensorflow.org/lite/models/object_detection/overview
# First label is '???', which has to be removed.
if labels[0] == '???':
    del(labels[0])
# Load the Tensorflow Lite model.
# If using Edge TPU, use special load_delegate argument
if use_TPU:
    interpreter = Interpreter(model_path=PATH_TO_CKPT,
                              experimental_delegates=[load_delegate('libedgetpu.so.1.0')])
    print(PATH_TO_CKPT)
else:
    interpreter = Interpreter(model_path=PATH_TO_CKPT)
interpreter.allocate_tensors()
# Get model details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]
floating_model = (input_details[0]['dtype'] == np.float32)
input_mean = 127.5
input_std = 127.5
# Check output layer name to determine if this model was created with TF2 or TF1,
# because outputs are ordered differently for TF2 and TF1 models
outname = output_details[0]['name']
if ('StatefulPartitionedCall' in outname): # This is a TF2 model
    boxes_idx, classes_idx, scores_idx = 1, 3, 0
else: # This is a TF1 model
    boxes_idx, classes_idx, scores_idx = 0, 1, 2
# Initialize frame rate calculation
frame_rate_calc = 1
freq = cv2.getTickFrequency()
# Initialize video stream
videostream = VideoStream(resolution=(imW,imH),framerate=30).start()
time.sleep(1)
#for frame1 in camera.capture_continuous(rawCapture, format="bgr",use_video_port=True):
while True:
    # Start timer (for calculating frame rate)
    t1 = cv2.getTickCount()
    # Grab frame from video stream
    frame1 = videostream.read()
    # Acquire frame and resize to expected shape [1xHxWx3]
    frame = frame1.copy()
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame_resized = cv2.resize(frame_rgb, (width, height))
    input_data = np.expand_dims(frame_resized, axis=0)
    # Normalize pixel values if using a floating model (i.e. if model is non-quantized)
    if floating_model:
        input_data = (np.float32(input_data) - input_mean) / input_std
    # Perform the actual detection by running the model with the image as input
    interpreter.set_tensor(input_details[0]['index'],input_data)
    interpreter.invoke()
    # Retrieve detection results
    boxes = interpreter.get_tensor(output_details[boxes_idx]['index'])[0] # Bounding box coordinates of detected objects
    classes = interpreter.get_tensor(output_details[classes_idx]['index'])[0] # Class index of detected objects
    scores = interpreter.get_tensor(output_details[scores_idx]['index'])[0] # Confidence of detected objects
    # Loop over all detections and draw detection box if confidence is above minimum threshold
    for i in range(len(scores)):
        if ((scores[i] > min_conf_threshold) and (scores[i] <= 1.0)):
            # Get bounding box coordinates and draw box
            # Interpreter can return coordinates that are outside of image dimensions, need to force them to be within image using max() and min()
            ymin = int(max(1,(boxes[i][0] * imH)))
            xmin = int(max(1,(boxes[i][1] * imW)))
            ymax = int(min(imH,(boxes[i][2] * imH)))
            xmax = int(min(imW,(boxes[i][3] * imW)))
            cv2.rectangle(frame, (xmin,ymin), (xmax,ymax), (10, 255, 0), 2)
            # Draw label
            object_name = labels[int(classes[i])] # Look up object name from "labels" array using class index
            label = '%s: %d%%' % (object_name, int(scores[i]*100)) # Example: 'person: 72%'
            if object_name=='person' and int(scores[i]*100)>65:
                print("YES")
            else:
                print("NO")
            labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.7, 2) # Get font size
            label_ymin = max(ymin, labelSize[1] + 10) # Make sure not to draw label too close to top of window
            cv2.rectangle(frame, (xmin, label_ymin-labelSize[1]-10), (xmin+labelSize[0], label_ymin+baseLine-10), (255, 255, 255), cv2.FILLED) # Draw white box to put label text in
            cv2.putText(frame, label, (xmin, label_ymin-7), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 0), 2) # Draw label text
    # Draw framerate in corner of frame
    cv2.putText(frame,'FPS: {0:.2f}'.format(frame_rate_calc),(30,50),cv2.FONT_HERSHEY_SIMPLEX,1,(255,255,0),2,cv2.LINE_AA)
    # All the results have been drawn on the frame, so it's time to display it.
    cv2.imshow('Object detector', frame)
    # Calculate framerate
    t2 = cv2.getTickCount()
    time1 = (t2-t1)/freq
    frame_rate_calc= 1/time1
    # Press 'q' to quit
    if cv2.waitKey(1) == ord('q'):
        break
# Clean up
cv2.destroyAllWindows()
videostream.stop()
Easy: just downgrade to OpenCV version 3.4.16 and use TensorFlow 1.0 instead of 2.0, and that should solve all your problems. That will allow the use of .LITE files as well as .TFLITE files.
Also, try increasing the resolution to 720x1280; the resolution can most likely cause a ton of errors as well when working with TensorFlow.
Take a look here: https://www.tensorflow.org/tutorials/images/classification
This notebook sets up a new classification model, and ends with "Convert the Keras Sequential model to a TensorFlow Lite model"
https://www.tensorflow.org/tutorials/images/classification#convert_the_keras_sequential_model_to_a_tensorflow_lite_model
# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
This reliably produces a tflite model from a standard tf model.
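The tutorial script also expects a labelmap.txt next to the model, with one class name per line. A minimal sketch for writing one follows. `write_labelmap` and the example class names are hypothetical, and the order of the names is assumed to match the class indices the model was trained with.

```python
from pathlib import Path


def write_labelmap(class_names, path="labelmap.txt"):
    """Write one class name per line and return the names that were written."""
    # Drop blank entries and surrounding whitespace
    names = [name.strip() for name in class_names if name.strip()]
    Path(path).write_text("\n".join(names) + "\n", encoding="utf-8")
    return names
```

For example, `write_labelmap(["cat", "dog"], "labelmap.txt")` produces a file that the script's `line.strip() for line in f.readlines()` loop reads back as two labels.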
Can you help me? I'm trying to build a dataset from my own data (path = data) by running MediaPipe on my videos. The processed output (.npy) will go in the output folder (path = O_Video), which I declared previously. The number of sequences is 30, as I have 30 videos, with start_folder = 0 and start_video = 0.
videos = cv2.VideoCapture(IMPORT_DATA)
videos_input = cv2.cvtColor(videos, cv2.COLOR_BGR2RGB)
# Set mediapipe model
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    # NEW LOOP
    # Loop through actions
    for action in actions:
        # Loop through sequences aka videos
        for sequence in range(start_folder, start_video+no_sequences):
            # get results
            results = mp_face.process(videos_input)
            for detection in results.detections: # iterate over each detection and draw on image
                mp_drawing.draw_detection(videos, detection)
            # NEW Export keypoints
            keypoints = extract_keypoints(results)
            npy_path = os.path.join(DATA_PATH, action, str(sequence))
            np.save(npy_path, keypoints)
            # Break gracefully
            if cv2.waitKey(10) & 0xFF == ord('q'):
                break
cap.release()
cv2.destroyAllWindows()
Error that I'm getting is as below:
Error Traceback (most recent call last)
Input In [15], in <module>
1 videos = cv2.VideoCapture(IMPORT_DATA)
----> 2 videos_input = cv2.cvtColor(videos, cv2.COLOR_BGR2RGB)
4 # Set mediapipe model
5 with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
6
7 # NEW LOOP
8 # Loop through actions
error: OpenCV(4.5.5) :-1: error: (-5:Bad argument) in function 'cvtColor'
> Overload resolution failed:
> - src is not a numpy array, neither a scalar
> - Expected Ptr<cv::UMat> for argument 'src'
You can't pass a VideoCapture object to cvtColor.
You have to pass each frame (numpy array) individually.
vid = cv.VideoCapture(...)
assert vid.isOpened()
while True:
    (valid, frame) = vid.read()
    if not valid:
        break
    converted = cv.cvtColor(frame, ...)
    ...
You need to read a frame from the capture first and convert that frame, as shown below.
videos = cv2.VideoCapture(IMPORT_DATA)
# Capture the video by frame
ret, frame = videos.read()
# check ret for success and then do this
videos_input = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
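To process a whole video instead of a single frame, the read-and-convert step can be wrapped in a generator. This is a sketch under assumptions: `iter_rgb_frames` is a hypothetical helper, it accepts any object whose `read()` returns `(ok, frame)` the way `cv2.VideoCapture` does, and it reverses the channel axis with NumPy, which gives the same result as `cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)` for 3-channel frames.

```python
import numpy as np


def iter_rgb_frames(capture):
    """Yield RGB frames from a VideoCapture-like object until it runs out."""
    while True:
        valid, frame = capture.read()
        if not valid:
            break
        # Reverse the last axis (BGR -> RGB) and make the result contiguous
        yield np.ascontiguousarray(frame[:, :, ::-1])
```

Each yielded array can then be handed to the model one frame at a time, for example inside `for rgb in iter_rgb_frames(videos):`.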
I will present my problem in the following order:
First, I will show the .py script I use to run the web app on localhost (a Flask app). This web app is a classifier that reports whether a person has viral pneumonia, bacterial pneumonia, or is normal. There are thus three classes (Viral, Bacterial, Normal), predicted from chest X-rays in JPEG format.
Second, I will show a different .py script, for binary pneumonia classification, which takes in raw DICOM files and converts them into NumPy arrays before they are diagnosed.
To achieve a diagnosis, I am trying to integrate my app.py script, which takes in JPEGs, with the binary pneumonia classification script, which takes in DICOM files. The aim is to use the DICOM processing function of the second script together with the model and weights of the viral/bacterial classifier, so that it can be used in a clinical setup. Clinical setups use DICOM files, not JPEGs, which is why I am trying to combine these two scripts.
Below is my app.py script for viral and bacterial pneumonia classification, which takes in JPEGs and which I am trying to integrate with the other script shown further below:
#::: Import modules and packages :::
# Flask utils
from flask import Flask, redirect, url_for, request, render_template
from werkzeug.utils import secure_filename
from gevent.pywsgi import WSGIServer
# Import Keras dependencies
from tensorflow.keras.models import model_from_json
from tensorflow.python.framework import ops
ops.reset_default_graph()
from keras.preprocessing import image
# Import other dependencies
import numpy as np
import h5py
from PIL import Image
import PIL
import os
#::: Flask App Engine :::
# Define a Flask app
app = Flask(__name__)
# ::: Prepare Keras Model :::
# Model files
MODEL_ARCHITECTURE = './model/model_adam.json'
MODEL_WEIGHTS = './model/model_100_eopchs_adam_20190807.h5'
# Load the model from external files
json_file = open(MODEL_ARCHITECTURE)
loaded_model_json = json_file.read()
json_file.close()
model = model_from_json(loaded_model_json)
# Get weights into the model
model.load_weights(MODEL_WEIGHTS)
print('Model loaded. Check http://127.0.0.1:5000/')
# ::: MODEL FUNCTIONS :::
def model_predict(img_path, model):
    '''
    Args:
    -- img_path : an URL path where a given image is stored.
    -- model : a given Keras CNN model.
    '''
    IMG = image.load_img(img_path).convert('L')
    print(type(IMG))
    # Pre-processing the image
    IMG_ = IMG.resize((257, 342))
    print(type(IMG_))
    IMG_ = np.asarray(IMG_)
    print(IMG_.shape)
    IMG_ = np.true_divide(IMG_, 255)
    IMG_ = IMG_.reshape(1, 342, 257, 1)
    print(type(IMG_), IMG_.shape)
    print(model)
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='rmsprop')
    predict_x = model.predict(IMG_)
    print(predict_x)
    prediction = np.argmax(predict_x,axis=1)
    print(prediction)
    return prediction
# ::: FLASK ROUTES
@app.route('/', methods=['GET'])
def index():
    # Main Page
    return render_template('index.html')

@app.route('/predict', methods=['GET', 'POST'])
def upload():
    # Constants:
    classes = {'TRAIN': ['BACTERIA', 'NORMAL', 'VIRUS'],
               'VALIDATION': ['BACTERIA', 'NORMAL'],
               'TEST': ['BACTERIA', 'NORMAL', 'VIRUS']}
    if request.method == 'POST':
        # Get the file from post request
        f = request.files['file']
        # Save the file to ./uploads
        basepath = os.path.dirname(__file__)
        file_path = os.path.join(
            basepath, 'uploads', secure_filename(f.filename))
        f.save(file_path)
        # Make a prediction
        prediction = model_predict(file_path, model)
        predicted_class = classes['TRAIN'][prediction[0]]
        print('We think that is {}.'.format(predicted_class.lower()))
        return str(predicted_class).lower()

if __name__ == '__main__':
    app.run(debug = True)
Below again is the already functioning script of Pneumonia binary classification which is taking in dicom files that I am trying to integrate with the weights and preprocessing information of the Viral and Bacterial classifier that I want to use:
## Loading standard modules and libraries
import numpy as np
import pandas as pd
import pydicom
%matplotlib inline
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.models import model_from_json
from skimage.transform import resize
# This function reads in a .dcm file, checks the important fields for our device, and returns a numpy array
# of just the imaging data
def check_dicom(filename):
    print('Loading file {} ...'.format(filename))
    ds = pydicom.dcmread(filename)
    if (ds.BodyPartExamined !='CHEST') | (ds.Modality !='DX') | (ds.PatientPosition not in ['PA', 'AP']):
        print('The image is not valid because the image position, the image type or the body part is not as per standards')
        return
    else:
        print('ID:', ds.PatientID,
              'Age:', ds.PatientAge,
              'Modality:', ds.Modality,
              'Postion: ', ds.PatientPosition,
              'Body Part: ', ds.BodyPartExamined,
              'Study Desc: ', ds.StudyDescription)
        img = ds.pixel_array
        return img

# This function takes the numpy array output by check_dicom and
# runs the appropriate pre-processing needed for our model input
def preprocess_image(img,img_mean,img_std,img_size):
    # todo
    img = resize(img, (224,224))
    img = img / 255.0
    grey_img = (img - img_mean) / img_std
    proc_img = np.zeros((224,224,3))
    proc_img[:, :, 0] = grey_img
    proc_img[:, :, 1] = grey_img
    proc_img[:, :, 2] = grey_img
    proc_img = np.resize(proc_img, img_size)
    return proc_img

# This function loads in our trained model w/ weights and compiles it
def load_model(model_path, weight_path):
    # todo
    json_file = open(model_path, 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    model = model_from_json(loaded_model_json)
    model.load_weights(weight_path)
    return model

# This function uses our device's threshold parameters to predict whether or not
# the image shows the presence of pneumonia using our trained model
def predict_image(model, img, thresh):
    # todo
    result = model.predict(img)
    print('Predicted value:', result)
    predict=result[0]
    prediction = "Negative"
    if(predict > thresh):
        prediction = "Positive"
    return prediction
test_dicoms = ['test1.dcm','test2.dcm','test3.dcm','test4.dcm','test5.dcm','test6.dcm']
model_path = "my_model2.json" #path to saved model
weight_path = "xray_class_my_model2.best.hdf5" #path to saved best weights
IMG_SIZE=(1,224,224,3) # This might be different if you did not use vgg16
img_mean = 0.49262813 # mean image value from Build and train model line 22
img_std = 0.24496286 # loads the std dev from Build and train model line 22
my_model = load_model(model_path, weight_path) #loads model
thresh = 0.62786263 #threshold value for New Model2 from Build and train model line 66 at 80% Precision
# use the .dcm files to test your prediction
for i in test_dicoms:
    img = np.array([])
    img = check_dicom(i)
    if img is None:
        continue
    img_proc = preprocess_image(img,img_mean,img_std,IMG_SIZE)
    pred = predict_image(my_model,img_proc,thresh)
    print('Model Classification:', pred , 'for Pneumonia' )
    print('--------------------------------------------------------------------------------------------------------')
Output of above script:
Loading file test1.dcm ...
ID: 2 Age: 81 Modality: DX Postion: PA Body Part: CHEST Study Desc: No Finding
Predicted value: [[0.4775539]]
Model Classification: Negative for Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test2.dcm ...
ID: 1 Age: 58 Modality: DX Postion: AP Body Part: CHEST Study Desc: Cardiomegaly
Predicted value: [[0.47687072]]
Model Classification: Negative for Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test3.dcm ...
ID: 61 Age: 77 Modality: DX Postion: AP Body Part: CHEST Study Desc: Effusion
Predicted value: [[0.47764364]]
Model Classification: Negative for Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test4.dcm ...
The image is not valid because the image position, the image type or the body part is not as per standards
Loading file test5.dcm ...
The image is not valid because the image position, the image type or the body part is not as per standards
Loading file test6.dcm ...
The image is not valid because the image position, the image type or the body part is not as per standards
Threshold of 0.62786263 is considered at 80% Precision
Below is what I have tried so far, but the diagnosis I get is always Viral for every DICOM sample:
## Loading standard modules and libraries
import numpy as np
import pandas as pd
import pydicom
from PIL import Image
#%matplotlib inline
import matplotlib.pyplot as plt
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.models import model_from_json
from keras.preprocessing import image
from skimage.transform import resize
# This function reads in a .dcm file, checks the important fields for our device, and returns a numpy array
# of just the imaging data
def check_dicom(filename):
    print('Loading file {} ...'.format(filename))
    ds = pydicom.dcmread(filename)
    if (ds.BodyPartExamined !='CHEST'): #| (ds.Modality !='DX'): #| (ds.PatientPosition not in ['PA', 'AP']):
        print('The image is not valid because the image position, the image type or the body part is not as per standards')
        return
    else:
        print('ID:', ds.PatientID,
              'Age:', ds.PatientAge,
              'Modality:', ds.Modality,
              'Postion: ', ds.PatientPosition,
              'Body Part: ', ds.BodyPartExamined,
              'Study Desc: ', ds.StudyDescription)
        img = ds.pixel_array
        return img

# This function takes the numpy array output by check_dicom and
# runs the appropriate pre-processing needed for our model input
def preprocess_image(img):
    # todo
    #im = np.reshape(img, (342,257 ))
    #im = np.arange(257)
    #img = Image.fromarray(im)
    #img = image.load_img(img).convert('L')
    img = resize(img, (342,257))
    grey_img = img / 255.0
    #grey_img = (img - img_mean) / img_std
    proc_img = np.zeros((1,342,257,1))
    proc_img[:, :, :, 0] = grey_img
    #proc_img[:, :, :, 1] = grey_img
    #proc_img[:, :, :, 2] = grey_img
    proc_img = proc_img.reshape(1, 342, 257, 1)
    return proc_img

# This function loads in our trained model w/ weights and compiles it
def load_model(model_path, weight_path):
    # todo
    json_file = open(model_path, 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    model = model_from_json(loaded_model_json)
    model.load_weights(weight_path)
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='rmsprop')
    return model

# This function uses our device's threshold parameters to predict whether or not
# the image shows the presence of pneumonia using our trained model
def predict_image(model, img):
    # todo
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='rmsprop')
    #x = np.expand_dims(img, axis=0)
    predict_x= model.predict(img)
    print(predict_x)
    prediction = np.argmax(predict_x,axis=1)
    print(prediction)
    return prediction
test_dicoms = ['test3.dcm','test2.dcm','test1.dcm','test4.dcm','test5.dcm','test6.dcm']
model_path = "model_adam.json" #path to saved model
weight_path = "model.h5" #path to saved best weights
#IMG_SIZE=(1,342,257,1) # This might be different if you did not use vgg16
#img_mean = 0.49262813 # mean image value from Build and train model line 22
#img_std = 0.24496286 # loads the std dev from Build and train model line 22
#my_model = load_model(model_path, weight_path) #loads model
#thresh = 0.62786263 #threshold value for New Model2 from Build and train model line 66 at 80% Precision
# use the .dcm files to test your prediction
for i in test_dicoms:
    img = np.array([])
    img = check_dicom(i)
    if img is None:
        continue
    classes = {'TRAIN': ['BACTERIAL', 'NORMAL', 'VIRAL'],
               'VALIDATION': ['BACTERIA', 'NORMAL'],
               'TEST': ['BACTERIA', 'NORMAL', 'VIRUS']}
    img_proc = preprocess_image(img)
    prediction = predict_image(load_model(model_path, weight_path),img_proc)
    predicted_class = classes['TRAIN'][int(prediction[0])]
    print('Model Classification:', predicted_class, 'Pneumonia' )
    print('--------------------------------------------------------------------------------------------------------')
Below is the output:
2022-01-02 10:50:00.817561: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-01-02 10:50:00.817601: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Loading file test3.dcm ...
ID: 61 Age: 77 Modality: DX Postion: AP Body Part: CHEST Study Desc: Effusion
2022-01-02 10:50:02.652828: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-01-02 10:50:02.652859: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-01-02 10:50:02.652899: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (Wisdom-HP-250-G3-Notebook-PC): /proc/driver/nvidia/version does not exist
2022-01-02 10:50:02.653123: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[[0.01132523 0.00254696 0.98612785]]
[2]
Model Classification: VIRAL Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test2.dcm ...
ID: 1 Age: 58 Modality: DX Postion: AP Body Part: CHEST Study Desc: Cardiomegaly
[[0.01112939 0.00251635 0.9863543 ]]
[2]
Model Classification: VIRAL Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test1.dcm ...
ID: 2 Age: 81 Modality: DX Postion: PA Body Part: CHEST Study Desc: No Finding
[[0.01128576 0.00255111 0.9861631 ]]
[2]
Model Classification: VIRAL Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test4.dcm ...
The image is not valid because the image position, the image type or the body part is not as per standards
Loading file test5.dcm ...
ID: 2 Age: 81 Modality: CT Postion: PA Body Part: CHEST Study Desc: No Finding
[[0.01128576 0.00255111 0.9861631 ]]
[2]
Model Classification: VIRAL Pneumonia
--------------------------------------------------------------------------------------------------------
Loading file test6.dcm ...
ID: 2 Age: 81 Modality: DX Postion: XX Body Part: CHEST Study Desc: No Finding
WARNING:tensorflow:5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fba38ed19d0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating #tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your #tf.function outside of the loop. For (2), #tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for more details.
[[0.01128576 0.00255111 0.9861631 ]]
[2]
Model Classification: VIRAL Pneumonia
---------------------------------------
My suspicion is that I got the image preprocessing steps wrong when I integrated the two scripts (remember: the goal is to take advantage of the DICOM-reading function of the second script). As a result, the model may be receiving incorrectly arranged arrays and therefore predicting on the wrong input altogether.
If you need more information on the parameters used in the Jupyter training notebook for the model, kindly let me know.
When a classifier works well in train/test but not at inference time in production, a very common reason is that the training data was processed differently from the production data. The fix is to make sure both are processed the same way, ideally by the same piece of code.
1. How were the JPEGs the classifier was trained on processed? Did they originally come from DICOMs? If yes, what was the exact code for the conversion?
2. How were the JPEGs loaded during training? Pay special attention to bits that modify the data rather than merely copy it, such as grey_img = (img - img_mean) / img_std and the other commented-out lines in your code (maybe they were not commented out during training).
If you copy the DICOM-to-JPEG conversion from 1 and the JPEG loading from 2, you will probably have a working prediction.
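One way to keep training and inference identical is to route both through a single function. The sketch below is a minimal illustration using NumPy only. The name `preprocess_array` is hypothetical, and the target size is an assumption taken from the shapes that appear in the question (342 rows by 257 columns), not from the original training code.

```python
import numpy as np

TARGET_HEIGHT = 342  # assumed from the reshape used in the question
TARGET_WIDTH = 257


def preprocess_array(pixels):
    """Scale a 2-D grayscale array to [0, 1] and add batch/channel axes.

    Hypothetical helper: call it from both the training loader and the
    prediction script so the two pipelines cannot drift apart.
    """
    pixels = np.asarray(pixels, dtype=np.float32)
    if pixels.shape != (TARGET_HEIGHT, TARGET_WIDTH):
        raise ValueError(
            f"expected {(TARGET_HEIGHT, TARGET_WIDTH)}, got {pixels.shape}"
        )
    pixels = pixels / 255.0
    return pixels.reshape(1, TARGET_HEIGHT, TARGET_WIDTH, 1)


# Example: an all-white 8-bit image of the expected size
batch = preprocess_array(np.full((342, 257), 255, dtype=np.uint8))
print(batch.shape, batch.max())
```

If training applied any extra step (mean subtraction, for instance), it would belong inside this same function so that inference picks it up automatically.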
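The DICOM-to-JPEG conversion function below did the job for me:
from pydicom import dcmread      # assumed import; the original called the older alias read_file
from PIL.Image import fromarray  # assumed import

def take_dicom(dicomname):
    # Read the DICOM file and convert its pixel data to a PIL image
    ds = dcmread('Dicom_files/' + dicomname)
    im = fromarray(ds.pixel_array)
    # Image.save() returns None, so its result is not stored
    pure_jpg = dicomname + '.jpg'
    im.save('./Jpeg/' + pure_jpg)
    return pure_jpg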
I just had to use os.path to point my prediction function to the folder where these JPEGs are saved before they are preprocessed and diagnosed:
import os
import numpy as np
from tensorflow.keras.preprocessing import image  # assumed import; adjust to your Keras install

def preprocess_image(pure_jpg):
    '''
    Args:
    -- pure_jpg : file name of a JPEG stored in ./Jpeg/.
    Returns:
    -- array of shape (1, 342, 257, 1) scaled to [0, 1].
    '''
    basepath = os.path.dirname('./Jpeg/')
    # Use the function argument; the original referenced `img`, which is not a parameter here
    file_path = os.path.join(basepath, pure_jpg)
    # Load the image and convert it to single-channel grayscale
    IMG = image.load_img(file_path).convert('L')
    # PIL's resize takes (width, height), so the array has 342 rows and 257 columns
    IMG_ = IMG.resize((257, 342))
    IMG_ = np.asarray(IMG_)
    # Scale pixel values to [0, 1]
    IMG_ = np.true_divide(IMG_, 255)
    IMG_ = IMG_.reshape(1, 342, 257, 1)
    return IMG_
However, it only works for the following two DICOM imaging modalities:
DX (Digital X-Ray)
CT (Computed Tomography)
CR (Computed Radiography) DICOM images fail to convert.
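The post does not say why the CR files fail. One possible cause, offered here only as an assumption, is bit depth: JPEG stores 8-bit samples, while DICOM pixel data is frequently stored as 16-bit integers, so it may help to rescale the array to `uint8` before handing it to `fromarray`. The helper name `to_uint8` is hypothetical.

```python
import numpy as np


def to_uint8(pixel_array):
    """Linearly rescale an integer or float array to the 0-255 range."""
    arr = np.asarray(pixel_array, dtype=np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # Constant image: avoid division by zero and return all zeros
        return np.zeros(arr.shape, dtype=np.uint8)
    scaled = (arr - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


# Example with 12-bit values stored in a 16-bit array
sample = np.array([[0, 2048], [4095, 1024]], dtype=np.uint16)
converted = to_uint8(sample)
```

In `take_dicom` this would become `fromarray(to_uint8(ds.pixel_array))`. Whether it resolves the CR case depends on the actual error message, which is not shown here.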
While running the following Python code from C++ using pybind11 and PyTorch 1.6.0, I get an "invalid pointer" error. In Python alone, the code runs without any error. What is the reason, and how can I solve this problem?
import torch
import torch.nn.functional as F
import numpy as np
import cv2
import torchvision
import eval_widerface
import torchvision_model
def resize(image, size):
image = F.interpolate(image.unsqueeze(0), size=size, mode="nearest").squeeze(0)
return image
# define constants
model_path = '/path/to/model.pt'
image_path = '/path/to/image_pad.jpg'
scale = 1.0 #Image resize scale (2 for half size)
font = cv2.FONT_HERSHEY_SIMPLEX
MIN_SCORE = 0.9
image_bgr = cv2.imread(image_path)
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)#skimage.io.imread(args.image_path)
cv2.imshow("input image",image_bgr)
cv2.waitKey()
cv2.destroyAllWindows()
# load pre-trained model
return_layers = {'layer2':1,'layer3':2,'layer4':3}
RetinaFace = torchvision_model.create_retinaface(return_layers)
print('RetinaFace.state_dict().')
retina_dict = RetinaFace.state_dict()
The following function generates the error:
def create_retinaface(return_layers,backbone_name='resnet50',anchors_num=3,pretrained=True):
print('In create_retinaface.')
print(resnet.__dict__)
backbone = resnet.__dict__[backbone_name](pretrained=pretrained)
print('backbone.')
# freeze layer1
for name, parameter in backbone.named_parameters():
print('freeze layer 1.');
# if 'layer2' not in name and 'layer3' not in name and 'layer4' not in name:
# parameter.requires_grad_(False)
if name == 'conv1.weight':
# print('freeze first conv layer...')
parameter.requires_grad_(False)
model = RetinaFace(backbone,return_layers,anchor_nums=3)
return model
The statement backbone = resnet.__dict__[backbone_name](pretrained=pretrained) generates an error that looks like this:
*** Error in `./p': munmap_chunk(): invalid pointer: 0x00007f4461866db0 ***
======= Backtrace: =========
/usr/lib64/libc.so.6(+0x7f3e4)[0x7f44736b43e4]
/usr/local/lib64/libopencv_gapi.so.4.1(_ZNSt10_HashtableISsSsSaISsENSt8__detail9_IdentityESt8equal_toISsESt4hashISsENS1_18_Mod_range_hashingENS1_20_Default_ranged_hashENS1_20_Prime_rehash_policyENS1_17_Hashtable_traitsILb1ELb1ELb1EEEE21_M_insert_unique_nodeEmmPNS1_10_Hash_nodeISsLb1EEE+0xc9)[0x7f4483dee1a9]
/home/20face/.virtualenvs/torch/lib64/python3.6/site-packages/torch/lib/libtorch_python.so(+0x4403b5)[0x7f4460bb73b5]
/home/20face/.virtualenvs/torch/lib64/python3.6/site-packages/torch/lib/libtorch_python.so(+0x44570a)[0x7f4460bbc70a]
/home/20face/.virtualenvs/torch/lib64/python3.6/site-packages/torch/lib/libtorch_python.so(+0x275b20)[0x7f44609ecb20]
/usr/lib64/libpython3.6m.so.1.0(_PyCFunction_FastCallDict+0x147)[0x7f4474307167]
/usr/lib64/libpython3.6m.so.1.0(+0x1507df)[0x7f44743727df]
/usr/lib64/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x3a7)[0x7f44743670f7]
/usr/lib64/libpython3.6m.so.1.0(+0x1505ca)[0x7f44743725ca]
/usr/lib64/libpython3.6m.so.1.0(+0x150903)[0x7f4474372903]
/usr/lib64/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x3a7)[0x7f44743670f7]
/usr/lib64/libpython3.6m.so.1.0(+0x14fb69)[0x7f4474371b69]
/usr/lib64/libpython3.6m.so.1.0(_PyFunction_FastCallDict+0x24f)[0x7f44743739ff]
/usr/lib64/libpython3.6m.so.1.0(_PyObject_FastCallDict+0x10e)[0x7f44742ca1de]
/usr/lib64/libpython3.6m.so.1.0(_PyObject_Call_Prepend+0x61)[0x7f44742ca2f1]
/usr/lib64/libpython3.6m.so.1.0(PyObject_Call+0x43)[0x7f44742c9f63]
/usr/lib64/libpython3.6m.so.1.0(+0xfa7e5)[0x7f447431c7e5]
/usr/lib64/libpython3.6m.so.1.0(+0xf71e2)[0x7f44743191e2]
/usr/lib64/libpython3.6m.so.1.0(PyObject_Call+0x43)[0x7f44742c9f63]
/usr/lib64/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x2067)[0x7f4474368db7]
/usr/lib64/libpython3.6m.so.1.0(PyEval_EvalCodeEx+0x24f)[0x7f4474372c9f]
This line is causing the error because it assumes __dict__ has a backbone_name element:
backbone = resnet.__dict__[backbone_name](pretrained=pretrained)
When that isn't the case, it basically tries to access invalid memory. Check __dict__ first with an if statement or make sure that it has the backbone_name element before trying to use it.
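The check suggested above could be sketched as follows. This is an illustration under assumptions: `get_backbone_factory` is a hypothetical helper, and `fake_resnet` is a stand-in for the real `resnet` module so the example runs without torchvision. Note that in plain Python a missing key would normally surface as a `KeyError`, so a guard like this mainly helps to rule that cause in or out.

```python
import types


def get_backbone_factory(module, backbone_name):
    """Return the constructor registered under backbone_name, or raise clearly."""
    factory = vars(module).get(backbone_name)
    if not callable(factory):
        available = sorted(k for k, v in vars(module).items() if callable(v))
        raise ValueError(
            f"unknown backbone {backbone_name!r}; available: {available}"
        )
    return factory


# Stand-in for the resnet module (hypothetical, for illustration only)
fake_resnet = types.SimpleNamespace(
    resnet50=lambda pretrained=True: f"resnet50(pretrained={pretrained})"
)

backbone = get_backbone_factory(fake_resnet, "resnet50")(pretrained=False)
```

In `create_retinaface`, the lookup line would then read `backbone = get_backbone_factory(resnet, backbone_name)(pretrained=pretrained)`.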
I am using Caffe to do image classification on Mac OS X with Python.
Right now I know how to classify a list of images using Caffe with Python, but to make it faster I want to use Spark.
Therefore, I tried to apply the image classification to each element of an RDD created from a list of image paths. However, Spark does not allow me to do so.
Here is my code:
This is the code for image classification:
# display image name, class number, predicted label
def classify_image(image_path, transformer, net):
image = caffe.io.load_image(image_path)
transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image
output = net.forward()
output_prob = output['prob'][0]
pred = output_prob.argmax()
labels_file = caffe_root + 'data/ilsvrc12/synset_words.txt'
labels = np.loadtxt(labels_file, str, delimiter='\t')
lb = labels[pred]
image_name = image_path.split(images_folder_path)[1]
result_str = 'image: '+image_name+' prediction: '+str(pred)+' label: '+lb
return result_str
This is the code that sets up the Caffe parameters and applies the classify_image method to each element of the RDD:
def main():
sys.path.insert(0, caffe_root + 'python')
caffe.set_mode_cpu()
model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'
model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
net = caffe.Net(model_def,
model_weights,
caffe.TEST)
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))
transformer.set_mean('data', mu)
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2,1,0))
net.blobs['data'].reshape(50,
3,
227, 227)
image_list= []
for image_path in glob.glob(images_folder_path+'*.jpg'):
image_list.append(image_path)
images_rdd = sc.parallelize(image_list)
transformer_bc = sc.broadcast(transformer)
net_bc = sc.broadcast(net)
image_predictions = images_rdd.map(lambda image_path: classify_image(image_path, transformer_bc, net_bc))
print image_predictions
if __name__ == '__main__':
main()
As you can see, here I tried to broadcast the caffe parameters, transformer_bc = sc.broadcast(transformer), net_bc = sc.broadcast(net)
The error is:
RuntimeError: Pickling of "caffe._caffe.Net" instances is not enabled
Before I added the broadcast, the error was:
Driver stacktrace.... Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):....
So, is there any way I can classify images with Caffe while also taking advantage of Spark?
When you work with complex, non-native objects, initialization has to be moved directly to the workers, for example with a singleton module:
net_builder.py:
import caffe
net = None
def build_net(*args, **kwargs):
    ... # Initialize net here
    return net
def get_net(*args, **kwargs):
    global net
    if net is None:
        net = build_net(*args, **kwargs)
    return net
main.py:
import net_builder
sc.addPyFile("net_builder.py")
def classify_image(image_path, transformer, *args, **kwargs):
    net = net_builder.get_net(*args, **kwargs)
This means you'll have to distribute all required files as well. That can be done either manually or using the SparkFiles mechanism.
On a side note you should take a look at the SparkNet package.