I am working on a face detection and recognition app in Python using TensorFlow and OpenCV. The overall flow is as follows:
while True:
    #1) Read one frame using OpenCV: ret, frame = video_capture.read()
    #2) Detect faces in the current frame using TensorFlow (mxnet_mtcnn_face_detection)
    #3) For each detected face in the current frame, run the FaceNet algorithm (TensorFlow) and compare it against my database to find the name of the detected face/person
    #4) Display a box around each face with the person's name using OpenCV
Now, my issue is that the overhead (runtime) of face detection and recognition is very high, so the output video sometimes looks like slow motion! I tried tracking methods (e.g., MIL, KCF), but then I cannot detect new faces coming into the frame! Is there any approach to speed this up, or at least to skip the face recognition step for faces that have already been recognized in previous frames?
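For illustration, a minimal sketch of one common approach: run the expensive detection/recognition only every N frames and track the already-recognized faces in between. Here detect_faces and recognize_face are hypothetical wrappers around the MTCNN and FaceNet steps, and the KCF tracker factory is named cv2.legacy.TrackerKCF_create in recent OpenCV builds.

import cv2

DETECT_EVERY = 10   # re-run full detection every N frames (tune to taste)
trackers = []       # (tracker, name) pairs for faces already recognized

video_capture = cv2.VideoCapture(0)
frame_idx = 0
while True:
    ret, frame = video_capture.read()
    if not ret:
        break
    if frame_idx % DETECT_EVERY == 0:
        # Expensive path: detect + recognize, then re-seed the trackers
        trackers = []
        for (x, y, w, h) in detect_faces(frame):        # hypothetical MTCNN wrapper
            name = recognize_face(frame, (x, y, w, h))  # hypothetical FaceNet lookup
            tracker = cv2.TrackerKCF_create()           # cv2.legacy.* on newer OpenCV
            tracker.init(frame, (x, y, w, h))
            trackers.append((tracker, name))
    else:
        # Cheap path: only update the trackers, no detection or recognition
        kept = []
        for tracker, name in trackers:
            ok, (x, y, w, h) = tracker.update(frame)
            if ok:
                x, y, w, h = int(x), int(y), int(w), int(h)
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
                cv2.putText(frame, name, (x, y - 5),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
                kept.append((tracker, name))
        trackers = kept
    cv2.imshow('faces', frame)
    frame_idx += 1
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

New faces are still picked up, at worst N-1 frames late, because detection is re-run periodically rather than dropped entirely.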
I'm writing a program to detect objects using the OpenCV DNN module with a pre-trained model like SSD MobileNet or YOLOv3.
I want to connect several cameras to my program and do object detection.
I don't know what the best approach to this problem is.
Can I have, for example, 20 threads which grab frames from different cameras and each in turn hand their frame to a single OpenCV DNN object for detection (using a queue to store the frames to analyse, as sketched below)?
OR
Can I instantiate 20 detection objects, one for each camera, and run detection whenever a frame is available? That way I could potentially have 20 detections running at the same time.
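A minimal sketch of the first layout (many capture threads feeding one shared detector through a bounded queue), assuming YOLOv3 model files and RTSP URLs purely as placeholders:

import queue
import threading
import time
import cv2

frame_queue = queue.Queue(maxsize=50)   # bounded, so slow detection drops frames

def camera_worker(cam_id, url):
    cap = cv2.VideoCapture(url)
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        try:
            frame_queue.put((cam_id, frame), timeout=1)
        except queue.Full:
            pass  # drop the frame rather than fall behind

def detector_worker(net):
    # single consumer: only this thread ever touches the network
    while True:
        cam_id, frame = frame_queue.get()
        blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True)
        net.setInput(blob)
        detections = net.forward()
        # ... handle `detections` for camera `cam_id` ...

net = cv2.dnn.readNetFromDarknet('yolov3.cfg', 'yolov3.weights')  # placeholder files
threading.Thread(target=detector_worker, args=(net,), daemon=True).start()
for i, url in enumerate(['rtsp://cam1/stream', 'rtsp://cam2/stream']):  # placeholders
    threading.Thread(target=camera_worker, args=(i, url), daemon=True).start()
while True:
    time.sleep(1)  # keep the main thread alive

A single consumer serializes access to the network, which sidesteps any thread-safety questions around the DNN module; instantiating one net per camera instead trades memory for parallelism, so the right choice mostly depends on whether one net can keep up with the combined frame rate.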
I am building a type of "person counter" that is getting face images from live video footage.
If a new face is detected in some frame, the program will count that face/person. I thus need a way to check whether a particular face has already been detected.
I have tried using a training program to recognize a template image so the same face isn't counted multiple times, but with only one template the system was massively inaccurate and also slightly too slow to run on every frame of the feed.
To better understand the process: at the beginning, when a face is detected, the frame is cropped and the (new) face is saved to a file location. Afterwards, faces detected in subsequent frames go through a check of whether a similar face has been detected before and already exists in the database (if it does, it shouldn't be added to the database).
One recipe to face (pun! ;) this could be, for every frame:
get all the faces in the frame (with OpenCV you can detect and crop them)
generate face embeddings for the collected faces (e.g. using a tool built for the purpose <- most likely this is the pre-trained component you are looking for; it allows you to "condense" the face image into a vector)
add all the so-obtained face embeddings to a list
At some pre-defined time interval, run a clustering algorithm (see also Face clustering using Chinese Whispers algorithm) on the list of face embeddings collected so far. This will allow you to group together faces belonging to the same person, and thus count the people appearing in the video.
Once the clusters are consolidated, you could prune some of the faces belonging to the same cluster/person (to save storage, in case you want to). A sketch of this pipeline follows.
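A minimal sketch of the recipe, assuming the face_recognition package for the embedding step and scikit-learn's DBSCAN standing in for Chinese Whispers (both are interchangeable choices, not requirements):

import face_recognition
import numpy as np
from sklearn.cluster import DBSCAN

embeddings = []   # grows as frames arrive

def collect_faces(frame_rgb):
    # detect, crop and embed every face in the frame
    boxes = face_recognition.face_locations(frame_rgb)
    embeddings.extend(face_recognition.face_encodings(frame_rgb, boxes))

def count_people():
    # run periodically: each cluster of embeddings = one person
    # eps=0.5 is a guess for 128-d face embeddings; tune on your data
    labels = DBSCAN(eps=0.5, min_samples=3).fit(np.array(embeddings)).labels_
    return len(set(labels)) - (1 if -1 in labels else 0)   # -1 = noise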
I have used OpenCV's detectMultiScale and the res10_300x300_ssd_iter_140000.caffemodel (via net.forward()) to detect faces in images. The exact code is shown below.
Both of these methods worked well, providing good results, until a day ago. Now both processes hang at the detectMultiScale and net.forward calls respectively. Additionally, when the DNN-based model runs, system memory slowly builds up until the system hangs.
There has been no modification to any of the Python libraries or to the system configuration in the last day. So far I have tried reinstalling OpenCV and Python, which has not been beneficial.
# Code for cascade-based detection (`frame` is a BGR image read earlier from the stream):
import cv2

faceCascade = cv2.CascadeClassifier('./haarcascade_frontalface_default.xml')
faces = faceCascade.detectMultiScale(frame)
# Python code for DNN-based detection:
modelFile = "res10_300x300_ssd_iter_140000.caffemodel"
configFile = "deploy.prototxt"
net = cv2.dnn.readNetFromCaffe(configFile, modelFile)
# the input blob is built from the current frame before each forward pass,
# using the standard preprocessing for this model
blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                             (300, 300), (104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()
I am unable to understand the reason behind this memory leak and would appreciate a possible solution for overcoming it.
I am new to using Python and object detection and need some help with a uni project!
1- I am wondering if it is possible to use the same Haar cascade file I used to train the Pi to detect a specific object in a still image (using the picamera) for detecting the same object in a live video feed, as my uni project requires the robot to search for, find, and retrieve the specific object. Or is it a completely separate process (i.e. setting up classes etc.)? If so, where could I find the most helpful information on doing this? I haven't been too successful looking online. (pyimagesearch has a blog post on this but goes about it using classes, and I'm not sure how to even create a class just for a specific object, or whether you need one.)
2- Currently, my Pi can sort of detect the object (a specific cube) in still images, but it isn't very accurate or consistent. It often detects around the edges of the object, and also incorrectly detects other things in the background, or shadows that are part of the image, as the object. It tends to find more than one cube (lots of small rectangles, mostly in close proximity), so it reports 2+ cubes in an image (sometimes 15-20 or more) rather than just the one. I am wondering how I could reduce this error and increase the accuracy and consistency, so that it detects just the one cube, or at least doesn't wrongly detect background shadows or other things in the image. I understand that lighting affects the result, but I am wondering if it's due to the quality of the original image I took with the picamera of the cube used to train the Haar cascade (it was quite a dark photo due to insufficient lighting), or maybe the size of the image itself (I cropped it down to the edges so it is just the cube and resized it to 50x50), or maybe that I didn't train with more than one image of the object against the negatives when training the cascade file? Do the images you supply for training have to be taken with the picamera, or could you take clearer pictures, say with your phone, and use them to train the cascade for detection via the picamera? I tried increasing the resolution in the code, but that made the data analysis take too long and didn't make much difference. Apologies for the long post; I am new to all this and wondering if there's any way to improve the results, or whether the only way to get higher accuracy is to retrain the cascade, which I'd rather not do as it took two days to complete on a Pi Zero W board!
Much Appreciated!
My specs: a Raspberry Pi Zero W board with a 16GB SD card on Mac, running OpenCV 3.2 and Python 2.7.
Code for object detection in an image taken using the pi camera:
import io
import picamera
import cv2
import numpy as np

# Capture one JPEG frame from the Pi camera into an in-memory stream
stream = io.BytesIO()
with picamera.PiCamera() as camera:
    camera.resolution = (320, 240)
    camera.capture(stream, format='jpeg')

# Decode the JPEG bytes into an OpenCV BGR image
buff = np.fromstring(stream.getvalue(), dtype=np.uint8)
image = cv2.imdecode(buff, 1)

# Run the trained cascade on the grayscale image
cube_cascade = cv2.CascadeClassifier('/home/pi/data/cube1-cascade-10stages.xml')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Cube = cube_cascade.detectMultiScale(gray, 1.1, 5)
print "Found " + str(len(Cube)) + " cube(s)"

# Draw a rectangle around each detection and save the result
for (x, y, w, h) in Cube:
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 0), 2)
cv2.imwrite('result.jpg', image)
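For reference, the positional arguments in the detectMultiScale call above are scaleFactor=1.1 and minNeighbors=5; raising minNeighbors and adding a minSize (the values below are illustrative guesses, not tested settings) is a common way to suppress clusters of small spurious rectangles:

Cube = cube_cascade.detectMultiScale(gray,
                                     scaleFactor=1.1,
                                     minNeighbors=8,    # demand more overlapping hits per detection
                                     minSize=(40, 40))  # ignore boxes smaller than the real cube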
Thanks in advance.
I am working on face detection using the cascade classifier in OpenCV Python. It's working fine, but I want to develop my code to detect only one face: specifically, only the largest face.
Sort the detected faces by size and keep the biggest one only?
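A minimal sketch of that suggestion, reusing the asker's faceCascade and assuming gray and frame are the usual grayscale and display images: detectMultiScale returns (x, y, w, h) boxes, so the largest face is simply the one with the biggest w*h.

faces = faceCascade.detectMultiScale(gray, 1.1, 5)
if len(faces) > 0:
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest area wins
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)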