I am currently trying to run this code on a dataset that includes multiple full-body photos of people wearing hard hats.
Here is the code:
import cv2
import glob
import os

def detect_face(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.7, minNeighbors=5)
    return faces

filename = "C:\\Users\\Vitaliy Yashchenko\\Desktop\\OpenCV Face recognition\\!!!\\dataset\\hattocrop"
for img in glob.glob(filename + '/*.*'):
    var_img = cv2.imread(img)
    face = detect_face(var_img)
    print(face)
    if len(face) == 0:
        continue
    for (ex, ey, ew, eh) in face:
        crop_image = var_img[ey:ey + eh, ex:ex + ew]
        cv2.imshow("cropped", crop_image)
        cv2.waitKey(0)
        cv2.imwrite(os.path.join("outputs/", str(img)), crop_image)
As the Haar cascade recognizes only the face, I tried to crop the face together with the hard hat, and I was slightly confused by the coordinates ex, ey, ew, eh.
If I run the following in the corresponding line:
crop_image = var_img[ey:eh+ey+100, ex:ex+ew]
I get the lower part of the face.
What is the appropriate way to define the upper part of the face (head) in the cropped image, so that the crop also includes the safety hard hat?
I am not sure what the issue is, but:
Image coordinates in OpenCV (and in computer vision generally) run from the top-left corner (0, 0) to the bottom-right corner (img_width, img_height), so y grows downward. To include the hard hat you have to start the crop at a smaller y value, i.e. extend it upward rather than downward (see the sketch below these notes).
cv2 has some deep learning tools built in, but if you are really interested in the project you should use other tools such as PyTorch; you can also load models trained there into OpenCV.
Last tip (side note): don't put spaces in your directory names :)
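To make the first point concrete, here is a minimal sketch of the crop, reusing var_img and face from the loop in the question; the 0.6 extension factor is an assumption you would tune for your photos:
for (ex, ey, ew, eh) in face:
    top = max(0, ey - int(0.6 * eh))           # move the top of the crop up toward the hat
    bottom = min(var_img.shape[0], ey + eh)    # keep the bottom of the detected face box
    crop_image = var_img[top:bottom, ex:ex + ew]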
I am using Cascade Trainer GUI to get an XML file. I have 100 positive images and 400 negative images. The training process only took about 5 minutes, and the results are not accurate. The object I trained the model for is a small screwdriver. The resulting .xml file was only 31.5 KB. Please see image.
Also, the rectangle drawn in the photo is quite small, not to mention inaccurate.
Besides adding more positive and negative images, what should I do to create a more accurate model? I eventually need to do image tracking as well. Thanks
#import numpy as np
import cv2
import time

"""
This program uses OpenCV to detect faces, smiles, and eyes. It uses Haar cascades, which are public domain. Haar cascades rely on
xml files which contain model training data. An xml file can be generated by training on many positive and negative images.
Try your built-in camera with 'cap = cv2.VideoCapture(0)' or use any video: cap = cv2.VideoCapture("videoNameHere.mp4")
"""

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_eye.xml')
smile = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_smile.xml')
screw = cv2.CascadeClassifier('cascade.xml')

cap = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_SIMPLEX
prev_frame_time, new_frame_time = 0, 0

while 1:
    ret, img = cap.read()
    img = cv2.resize(img, (1920, 1080))

    #faces = face_cascade.detectMultiScale(img, 1.5, 5)
    #eyes = eye_cascade.detectMultiScale(img,1.5,6)
    # smiles = smile.detectMultiScale(img,1.1,400)
    screws = screw.detectMultiScale(img, 1.2, 3)

    new_frame_time = time.time()  # was: new_frame = time.time(), which left the FPS counter stuck at 0
    try:
        fps = 1 / (new_frame_time - prev_frame_time)
    except ZeroDivisionError:
        fps = 0
    fps = int(fps)
    cv2.putText(img, "FPS: " + str(fps), (10, 450), font, 3, (0, 0, 0), 5, cv2.LINE_AA)

    # for (x,y,w,h) in smiles:
    #     cv2.rectangle(img,(x,y),(x+w,y+h),(0,69,255),2)
    #     cv2.putText(img,"smile",(int(x-.1*x),int(y-.1*y)),font,1,(255,255,255),2)

    for (x, y, w, h) in screws:
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 255), 2)
        cv2.putText(img, "screwdriver", (int(x - .1 * x), int(y - .1 * y)), font, 1, (255, 0, 255), 2)

    # for (x,y,w,h) in faces:
    #     cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
    #     cv2.putText(img,"FACE",(int(x-.1*x),int(y-.1*y)),font,1,(255,255,255),2)
    #     roi_color = img[y:y+h, x:x+w]
    #     eyes = eye_cascade.detectMultiScale(roi_color)
    #     for (ex,ey,ew,eh) in eyes:
    #         cv2.rectangle(roi_color,(ex,ey),(ex+ew,ey+eh),(0,255,0),2)

    cv2.imshow('img', img)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
    prev_frame_time = new_frame_time

cap.release()
cv2.destroyAllWindows()
Most resources on the topic recommend 3000-5000 positive and negative images each. That might very well be the reason for the low accuracy.
Some resources:
Link 1 - sonots
Link 2 - opencv-user-blog
Link 3 - computer vision software
Link 4 - pythonprogramming.net
If your image above is a 'typical' one, then it can never work using cascades.
Those need reliable texture and pose; your scene lacks both.
(I also guess that you do not really have 100 positive images, but that you tried to "synthesize" them from a few, or even a single image, which is proven NOT to work in real life.)
Don't waste more time on this.
Get more (real!) images, and read up on object-detection CNNs like SSD or YOLO, which are far more robust for your situation.
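For example, a rough sketch of the CNN route with the ultralytics YOLO package; the package choice, the yolov8n.pt starting weights, and the screwdriver.yaml dataset config are assumptions for illustration, not something the answer above specifies:
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                        # start from pretrained weights
model.train(data="screwdriver.yaml", epochs=50)   # fine-tune on your (real!) screwdriver images

results = model("test_frame.jpg")                 # run detection on a new image
annotated = results[0].plot()                     # image with the predicted boxes drawn on it
cv2.imwrite("detections.jpg", annotated)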
I want to compare two photos. The first has the face of one individual. The second is a group photo with many faces. I want to see if the individual from the first photo appears in the second photo.
I have tried to do this with the deepface and face_recognition libraries in python, by pulling faces one by one from the group photo and comparing them to the original photo.
face_locations = face_recognition.face_locations(img2_loaded)
for face in face_locations:
    top, right, bottom, left = face
    face_img = img2_loaded[top:bottom, left:right]
    face_recognition.compare_faces(img1_loaded, face_img)
This results in an error saying the operands could not be broadcast together with shapes (3088,2316,3) (90,89,3). I also get the same error when I take the faces I pulled out of the group photo, save them using PIL, and then try passing them into deepface. Can anyone recommend any alternative ways to achieve this functionality, or fix my current attempt? Thank you so much!
deepface is designed to verify two faces, but you can still apply it to one-to-many face recognition.
You have two pictures. The first has just a single face; I call this img1.jpg. The second has many faces; I call this img2.jpg.
You can first detect the faces in img2.jpg with OpenCV.
import cv2

img2 = cv2.imread("img2.jpg")
face_detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
faces = face_detector.detectMultiScale(img2, 1.3, 5)

detected_faces = []
for face in faces:
    x, y, w, h = face
    detected_face = img2[int(y):int(y + h), int(x):int(x + w)]
    detected_faces.append(detected_face)
Then, you need to compare each item of the detected_faces list with img1.jpg.
img1 = cv2.imread("img1.jpg")
targets = face_detector.detectMultiScale(img1, 1.3, 5)
x, y, w, h = targets[0]  # this image has just a single face
target = img1[int(y):int(y + h), int(x):int(x + w)]

for face in detected_faces:
    # compare face and target in each iteration
    compare(face, target)
We should design the compare function:
from deepface import DeepFace

def compare(img1, img2):
    resp = DeepFace.verify(img1, img2)
    print(resp["verified"])
So, you can adapt deepface for your case like that.
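As a side note not covered in the answer above: recent deepface versions also expose a one-to-many search, DeepFace.find, which runs the detection and comparison loop for you. A minimal sketch, assuming the group photo(s) sit in a folder used as the database:
from deepface import DeepFace

# find every face in the database folder that matches the single-face photo
results = DeepFace.find(img_path="img1.jpg", db_path="group_photos/")
print(results)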
I really don't know if "UVs" is the right word, as I'm from the world of Unity and am trying to write some stuff in Python. What I'm trying to do is take a picture of a human (from a webcam), take the placement of their landmarks/key features, and alter a second image (of a different person) so that their key features end up in the same place, while morphing/warping the parts of the skin within the face to fit the position of the first (webcam) input image's landmarks. After I do that, I need to put the face back on the non-webcam input. (I'm sorry for how much that made me sound like a serial killer, stretching and cutting faces.) I know that probably didn't make any sense, but I want it to look like this.
I have the face landmark detection and cutting done with dlib and OpenCV, but I need a way to take these "cut" face chunks and stretch them "dynamically". What I mean by dynamically is that you don't just put a mask on by linearly resizing it on one or two axes; you can select a point of the mask and change just that point. I want to do that, but my mask is my cut chunk, and the point is a section of that chunk that needs to change for the chunk to comply with the position of the generated landmarks. I know this is a very hard topic to think about, and if you need any clarification, just ask. My code:
import cv2
import numpy as np
import dlib

cap = cv2.VideoCapture(0)

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

while True:
    _, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = detector(gray)
    for face in faces:
        x1 = face.left()
        y1 = face.top()
        x2 = face.right()
        y2 = face.bottom()
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)

        landmarks = predictor(gray, face)
        for n in range(0, 68):
            x = landmarks.part(n).x
            y = landmarks.part(n).y
            cv2.circle(frame, (x, y), 4, (255, 0, 0), -1)

    cv2.imshow("Frame", frame)

    key = cv2.waitKey(1)
    if key == 27:
        break
EDIT: No, I'm not a serial killer.
If you need to deform the source image like a rubber sheet using two sets of keypoints, you need to use a thin plate spline (TPS) or, better, a piecewise affine transformation like here. The latter is more similar to texture rasterization methods (triangle-to-triangle texture transforms).
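To illustrate the piecewise affine idea, here is a hedged sketch of its basic building block in OpenCV: warping one landmark triangle of the source face onto the corresponding triangle of the target. The helper name is made up, and in a full pipeline you would Delaunay-triangulate the 68 landmarks (e.g. with cv2.Subdiv2D), call this for every triangle pair, and blend the result back (cv2.seamlessClone is one option):
import cv2
import numpy as np

def warp_triangle(src_img, dst_img, tri_src, tri_dst):
    # bounding rectangles around the source and destination triangles
    r1 = cv2.boundingRect(np.float32([tri_src]))
    r2 = cv2.boundingRect(np.float32([tri_dst]))

    # triangle coordinates relative to their bounding rectangles
    t1 = np.float32([[p[0] - r1[0], p[1] - r1[1]] for p in tri_src])
    t2 = np.float32([[p[0] - r2[0], p[1] - r2[1]] for p in tri_dst])

    # affine transform that maps the source triangle onto the destination triangle
    M = cv2.getAffineTransform(t1, t2)
    src_patch = src_img[r1[1]:r1[1] + r1[3], r1[0]:r1[0] + r1[2]]
    warped = cv2.warpAffine(src_patch, M, (r2[2], r2[3]),
                            flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101)

    # copy only the pixels that fall inside the destination triangle
    mask = np.zeros((r2[3], r2[2], 3), dtype=np.float32)
    cv2.fillConvexPoly(mask, np.int32(t2), (1.0, 1.0, 1.0))
    roi = dst_img[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]]
    dst_img[r2[1]:r2[1] + r2[3], r2[0]:r2[0] + r2[2]] = roi * (1 - mask) + warped * mask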
I am developing a project for my university assignment which has an AR part that I tried to do with Unity and Vuforia. I want to use a simple T shape (or any shape that is easy for the user to draw on a body part such as a hand) as the image target, because I'm developing an app similar to inkHunter. In that app they use a smiley as the image target: when the customer draws a smiley on the body and points the camera at it, the app finds it and shows the selected tattoo design on it. I tried it with the Vuforia SDK, but they give a rating for the image target, so I can't use what I want as the image target. I think using OpenCV is the right way to do it, but it's hard to learn and I have little time. I think this is not a big thing to implement, so please try to help me with this problem. I think you get my idea. In inkHunter, even if I draw the target on a sheet of paper, they show the tattoo on it. I need the same, which means I need to detect the drawn target. It would be great if you could help me in this situation. Thanks.
The target can be like this:
I was able to do template matching on still pictures, and I applied the same approach to real time, meaning I looped through the frames. But it does not seem to match the template with the frames, and I realized that found (the bookkeeping variable) is always None.
import cv2
import numpy as np
import imutils

def main():
    template = cv2.imread("C:\\Users\\Manthika\\Desktop\\opencvtest\\template.jpg")
    template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
    template = cv2.Canny(template, 50, 200)
    (tH, tW) = template.shape[:2]
    cv2.imshow("Template", template)

    windowName = "Something"
    cv2.namedWindow(windowName)
    cap = cv2.VideoCapture(0)

    if cap.isOpened():
        ret, frame = cap.read()
    else:
        ret = False

    # loop over the frames to find the template
    while ret:
        # load the image, convert it to grayscale, and initialize the
        # bookkeeping variable to keep track of the matched region
        ret, frame = cap.read()
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        found = None

        # loop over the scales of the image
        for scale in np.linspace(0.2, 1.0, 20)[::-1]:
            # resize the image according to the scale, and keep track
            # of the ratio of the resizing
            resized = imutils.resize(gray, width=int(gray.shape[1] * scale))
            r = gray.shape[1] / float(resized.shape[1])

            # if the resized image is smaller than the template, then break
            # from the loop
            if resized.shape[0] < tH or resized.shape[1] < tW:
                break

            # detect edges in the resized, grayscale image and apply template
            # matching to find the template in the image
            edged = cv2.Canny(resized, 50, 200)
            result = cv2.matchTemplate(edged, template, cv2.TM_CCOEFF)
            (_, maxVal, _, maxLoc) = cv2.minMaxLoc(result)

            # if we have found a new maximum correlation value, then update
            # the bookkeeping variable
            if found is None or maxVal > found[0]:
                found = (maxVal, maxLoc, r)
                print(found)

        # unpack the bookkeeping variable and compute the (x, y) coordinates
        # of the bounding box based on the resized ratio
        print(found)
        if found is None:
            # just show only the frames if the template is not detected
            cv2.imshow(windowName, frame)
        else:
            (_, maxLoc, r) = found
            (startX, startY) = (int(maxLoc[0] * r), int(maxLoc[1] * r))
            (endX, endY) = (int((maxLoc[0] + tW) * r), int((maxLoc[1] + tH) * r))

            # draw a bounding box around the detected result and display the image
            cv2.rectangle(frame, (startX, startY), (endX, endY), (0, 0, 255), 2)
            cv2.imshow(windowName, frame)

        if cv2.waitKey(1) == 27:
            break

    cv2.destroyAllWindows()
    cap.release()


if __name__ == "__main__":
    main()
Please help me solve this problem.
I can give you a hint for the OpenCV part, but without the Unity and Vuforia side; hope it may help.
So, the way I see the pipeline for the project:
Detect location, size, and aspect ratio
Use homography for transformation of the image that should be put over the original
Overlay: put one image on top of the other
I will assume that the target will be a dark "T" on a white piece of paper; it may appear in different locations on the paper, and the paper itself may also move.
1. Detect location, size, and aspect ratio
Firstly, you need to detect the piece of paper; since you know its color and aspect ratio, you may use RGB/HSV thresholding for segmentation. You could also try Deep/Machine Learning (some strategy similar to R-CNN, HOG-SVM, etc.), but that will take time. Then you can use the findContours() function from OpenCV to get the largest object. From that contour you can get the location, size, and aspect ratio of the paper.
After that, you do the same thing within the piece of paper, this time looking for the "T". Here you can use the template matching method: just scan the Region Of Interest with a predefined mask at different sizes, or simply repeat the steps above.
A useful resource may be this credit card characters recognition example. It helped me a lot one day:)
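Below is a rough sketch of this first step, not from the original answer: HSV thresholding for the white paper, the largest contour, and its bounding box. The HSV bounds and the file name are placeholders to tune for your own footage:
import cv2
import numpy as np

frame = cv2.imread("frame.jpg")
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
paper_mask = cv2.inRange(hsv, (0, 0, 180), (180, 60, 255))   # low saturation, high value ~ white paper

# OpenCV 4 returns (contours, hierarchy)
contours, _ = cv2.findContours(paper_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    paper = max(contours, key=cv2.contourArea)                # largest object = the piece of paper
    x, y, w, h = cv2.boundingRect(paper)
    aspect_ratio = w / float(h)
    print("paper at", (x, y), "size", (w, h), "aspect ratio", aspect_ratio)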
2. Use homography for transformation of the image that should be put over the original
After extracting the aspect ratio you will know the approximate size and shape that should appear on top of the "T". This will let you use a homography to transform the image you want to put over the "T". Here is a good point to start; you can also google for other sources, there should be plenty of them, and as far as I know OpenCV has functions for that.
After the transformation, I would recommend using interpolation, because there might be some missing pixels afterwards.
3. Overlay: put one image on top of the other
The last step is just to go through all the pixels of the input image and put the transformed image over the target pixels.
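Here is a hedged sketch of steps 2 and 3 together, not from the original answer: warp the tattoo design onto four detected corner points and paste the warped pixels over the frame. design.jpg, frame.jpg, and the corner coordinates are placeholders:
import cv2
import numpy as np

design = cv2.imread("design.jpg")
frame = cv2.imread("frame.jpg")

h, w = design.shape[:2]
src_corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
dst_corners = np.float32([[220, 140], [420, 150], [410, 330], [210, 320]])   # detected target corners

H = cv2.getPerspectiveTransform(src_corners, dst_corners)
warped = cv2.warpPerspective(design, H, (frame.shape[1], frame.shape[0]),
                             flags=cv2.INTER_LINEAR)          # interpolation fills in missing pixels

mask = warped.sum(axis=2) > 0          # pixels where the warped design ended up
frame[mask] = warped[mask]             # overlay: put one image on top of the other
cv2.imwrite("composited.jpg", frame)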
Hope this helps, good luck!:)
Currently I am trying to create a pattern recognition program as a pet project. It involves JPEG files of knitting swatches and basically recognizing the stitches in the swatch. Each stitch essentially takes the shape of an inverted 'V'.
So far I have managed to get a current version of OpenCV for Python up and running in a Visual Studio environment using the built-in Canny edge detection, but I am unsure how to progress from there, because I am reading up on edge detection methods and finding there are quite a few.
If anyone can point me in the right direction I would appreciate it a lot.
So here's the code:
import numpy as np
import cv2

# Defining the auto-canny function
def auto_canny(image, sigma=0.10):
    # compute the median of the pixel intensities
    v = np.median(image)

    # apply automatic Canny edge detection using the computed median
    lower = int(max(0, (1.0 - sigma) * v))
    upper = int(min(255, (1.0 + sigma) * v))
    edged = cv2.Canny(image, lower, upper)

    # return the edged image
    return edged

# defining the image, grayscale, blurred
image = cv2.imread('img_knit_sample2.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3, 3), 0)

# apply Canny edge detection using a wide threshold, tight
# threshold, and automatically determined threshold
wide = cv2.Canny(blurred, 10, 200)
tight = cv2.Canny(blurred, 225, 250)
auto = auto_canny(blurred)

# show the images
cv2.imshow("Original", image)
cv2.imshow("Edges-wide", wide)
cv2.imshow("Edges-tight", tight)
cv2.imshow("Edges-auto", auto)

# Save the images to disk
cv2.imwrite('Wide_config.jpg', wide)
cv2.imwrite('Tight_config.jpg', tight)
cv2.imwrite('Autocanny.jpg', auto)

cv2.waitKey(0)
cv2.destroyAllWindows()
Unfortunately I cannot upload more than 2 images, but I am more than happy to provide the URLs to anyone willing to go further.
(Apologies for the crappy description, since I am new to this; if you do understand my query and can still help, then kudos and much appreciation to you.)
Cheers
Edges appear where there is contrast, i.e. at the limit between zones of a different color (intensity). In your picture, this is essentially between the blue and black wools.
You can see some separation between the blue threads, but these are ridges, not edges, and you'd better use a ridge detector.
In the black areas, seeing the edges is hopeless. Don't even try.
If your goal is to locate the stitches, you may be more lucky with template matching.
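Since the last suggestion is template matching, here is a minimal sketch of locating stitches that way; swatch.jpg, stitch_template.jpg (one cropped inverted-V stitch), and the 0.6 threshold are assumptions to adjust for real swatches:
import cv2
import numpy as np

swatch = cv2.cvtColor(cv2.imread("swatch.jpg"), cv2.COLOR_BGR2GRAY)
stitch = cv2.cvtColor(cv2.imread("stitch_template.jpg"), cv2.COLOR_BGR2GRAY)
th, tw = stitch.shape[:2]

# normalized cross-correlation of the stitch template over the whole swatch
result = cv2.matchTemplate(swatch, stitch, cv2.TM_CCOEFF_NORMED)
ys, xs = np.where(result >= 0.6)          # keep every location that matches well enough

output = cv2.cvtColor(swatch, cv2.COLOR_GRAY2BGR)
for x, y in zip(xs, ys):
    cv2.rectangle(output, (int(x), int(y)), (int(x + tw), int(y + th)), (0, 255, 0), 1)
cv2.imwrite("stitches_located.jpg", output)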