I am trying to create a program that can input an image (I am doing it by imageGrab from PIL) and detect some known symbols in it, and their locations. The good thing is that I am pretty sure I don't need neural networks, because I know the exact shape and size of each symbol. the problem is that I have no idea how much of these will be, and what is the color in the background of each symbol. some of the symbols are numbers, I have an image of each digit 0-9, but there may be up to 3-digit numbers. I think I will be able to find a way to know which digits are part of the same number by their location, but lets talk about it later. right now, I have turned the image into grayscale and imshow it using opencv2.
do you have any idea how can I do it with opencv? some other library?
and I need it to be fast enough, hopefuly 10 frames per second.
this is my current code (modified sentdex's "python plays GTA" code, the most bottom of the page):
import numpy as np
from PIL import ImageGrab
import cv2
def screen_record():
while(True):
global printscreen
image = ImageGrab.grab(bbox=(20,270,430,685))
printscreen = np.array(image)
grayscale_image = cv2.cvtColor(printscreen, cv2.COLOR_BGR2GRAY)
cv2.imshow('window', grayscale_image)
if cv2.waitKey(25) & 0xFF == ord('q'):
cv2.destroyAllWindows()
break
if cv2.waitKey(25) & 0xFF == ord('w'):
image.save("screen_shot.png")
print("Saved current window as image")
screen_record()
EDIT: I managed to get to something with opencv's template match, only with the digit 2 (for now). I found a nice tutorial here. my problem is when there is not exactly 1 match of the template, means no number 2s, or more then 1. when there aren't any it looks like its choosing something random on the image, and when there's more then one, I have only 1 of them detected. is it ossible to apply it in a different way to match my needs?
So, I have a solution to my problem.
For all of those who reach this page in the future to get help, here are the steps to regognize templates in images:
create 2 images, the one you want to detect, and another one for your template.
then, upload the whoever you want using opencv, and copy this function:
def locate_symbol(x, template):
w, h = filter_num2.shape[::-1]
res = cv2.matchTemplate(x, template, cv2.TM_SQDIFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
min_thresh = 0.45
match_locations = np.where(res<=min_thresh)
return w, h, match_locations
and use these lines to draw bounding boxes on the image:
w, h, locs = locate_symbol(grayscale_image, filter_num2)
for (x, y) in zip(locs[1], locs[0]):
cv2.rectangle(printable_image, (x, y), (x+w, y+h), [255, 0, 0], 2)
then you can draw everything with cv2.imshow()
Related
I really don't know if "UV's" is the right word as i'm from the world of Unity and am trying to write some stuff in python. What i'm trying to do is to take a picture of a human (from webcam) take the placement of their landmarks/key features and alter a second image (of a different person) to make their key features in the same place whilst morphing / warping the parts of their skin that are within the face to fit the position of the first input image (webcam)'s landmarks. After i do that I need to put the face back on the non-webcam input. (i'm sorry for how much that made me sound like a serial killer, stretching and cutting faces) I know that probably didn't make any sense but I want it to look like this.
I have the face landmark and cutting done with DLIB and OpenCV but i need a way to find a way to take these "cut" face chunks and stretch them "dynamically". What I mean by dynamically is that you don't just put a mask on by linearly re-sizing it on 1 or 2 axises. You can select a point of the mask and change that, I wanna do that but my mask is my cut chunk and the point is a section of that chunk that needs to change for the chunk to comply with the position of the generated landmarks. I know this is a very hard topic to think about and if you guys need any clarification just ask. My code:
import cv2
import numpy as np
import dlib
cap = cv2.VideoCapture(0)
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
while True:
_, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = detector(gray)
for face in faces:
x1 = face.left()
y1 = face.top()
x2 = face.right()
y2 = face.bottom()
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)
landmarks = predictor(gray, face)
for n in range(0, 68):
x = landmarks.part(n).x
y = landmarks.part(n).y
cv2.circle(frame, (x, y), 4, (255, 0, 0), -1)
cv2.imshow("Frame", frame)
key = cv2.waitKey(1)
if key == 27:
break
EDIT: No i'm not a serial killer
If you need to deform source image like a rubber sheet using 2 sets of keypoints, you need to use thin plate spline (TPS), or, better, piecewice affine transformation like here. The last one is more similar to texture rasterization methods (triangle to triangle texture transform).
I have few images where I need to increase or decrease the contrast and brightness of the image in a dynamic way so that it is visible clearly. And the program needs to be dynamic so that it even works for new images also. I also want character should be dark.
I was able to increase brightness and contrast but it is not working properly for each image.
import cv2
import numpy as np
img = cv2.imread('D:\Bright.png')
image = cv2.GaussianBlur(img, (5, 5), 0)
#image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY)[1]
#kernel = np.ones((2,1),np.uint8)
#dilation = cv2.dilate(img,kernel)
cv2.imshow('test', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
imghsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
imghsv[:,:,2] = [[max(pixel - 25, 0) if pixel < 190 else min(pixel + 25, 255) for pixel in row] for row in imghsv[:,:,2]]
cv2.imshow('contrast', cv2.cvtColor(imghsv, cv2.COLOR_HSV2BGR))
#cv2.imwrite('D:\\112.png',cv2.cvtColor(imghsv, cv2.COLOR_HSV2BGR))
cv2.waitKey(0)
cv2.destroyAllWindows()
#raw_input()
I want a program which works fine for every image and words are a little darker so that they are easily visible.
As Tilarion suggested, you could try "Auto Brightness And Contrast" to see if it works well. The theory behind this is explained well here in the solution section. The solution is in C++. I've written a version of it in python which you can directly use, works only on 1 channel at a time for colour images:
def auto_brightandcontrast(input_img, channel, clip_percent=1):
histSize=180
alpha=0
beta=0
minGray=0
maxGray=0
accumulator=[]
if(clip_percent==0):
#min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(hist)
return input_img
else:
hist = cv2.calcHist([input_img],[channel],None,[256],[0, 256])
accumulator.insert(0,hist[0])
for i in range(1,histSize):
accumulator.insert(i,accumulator[i-1]+hist[i])
maxx=accumulator[histSize-1]
minGray=0
clip_percent=clip_percent*(maxx/100.0)
clip_percent=clip_percent/2.0
while(accumulator[minGray]<clip_percent[0]):
minGray=minGray+1
maxGray=histSize-1
while(accumulator[maxGray]>=(maxx-clip_percent[0])):
maxGray=maxGray-1
inputRange=maxGray-minGray
alpha=(histSize-1)/inputRange
beta=-minGray*alpha
out_img=input_img.copy()
cv2.convertScaleAbs(input_img,out_img,alpha,beta)
return out_img
It is a very few lines of code to do it in Python Wand (which is based upon ImageMagick). Here is a script.
#!/bin/python3.7
from wand.image import Image
with Image(filename='task4.jpg') as img:
img.contrast_stretch(black_point=0.02, white_point=0.99)
img.save(filename='task4_stretch2_99.jpg')
Input:
Result:
Increase the black point value to make the text darker and/or decrease the white point value to make the lighter parts brighter.
Thanks to Eric McConville (the Wand developer) for correcting my arguments to make the code work.
I am developing a project for my university assignment which has a AR part that I tried to with Unity and Vuforia. I want to get a simple T shape (or any shape which is easy for user to draw on a body part such as hand) as the image target, because I'm developing an app similar to inkHunter. In this app they have got a smiley as the image target and when the customer draws a smiley on the body and places the camera on it, the camera finds that and shows the selected tattoo design on it. I tried it with Vuforia SDK but they give a rating for the image target, so I can't get what I want as the image target. I think using openCV is the right way to do it but it's so hard to learn and I got less time. I think this is not a big thing to implement so please try to help me with this problem. I think you get my idea. in inkHunter even if I draw the target in a sheet also they show the tattoo on it. I need the same which means I need to detect the Drawn target. It would be great if you could help me in this situation. Thanks.
target can be like this,
I was able to do template matching from pictures and I applied the same to real-time which means I looped through the frames. But it does not seem to be matching the template with frames, And I realized that found(bookkeeping variable) is always None.
import cv2 as cv2
import numpy as np
import imutils
def main():
template = cv2.imread("C:\\Users\\Manthika\\Desktop\\opencvtest\\template.jpg")
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
template = cv2.Canny(template, 50, 200)
(tH, tW) = template.shape[:2]
cv2.imshow("Template", template)
windowName = "Something"
cv2.namedWindow(windowName)
cap = cv2.VideoCapture(0)
if cap.isOpened():
ret, frame = cap.read()
else:
ret = False
# loop over the frames to find the template
while ret:
# load the image, convert it to grayscale, and initialize the
# bookkeeping variable to keep track of the matched region
ret, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
found = None
# loop over the scales of the image
for scale in np.linspace(0.2, 1.0, 20)[::-1]:
# resize the image according to the scale, and keep track
# of the ratio of the resizing
resized = imutils.resize(gray, width=int(gray.shape[1] * scale))
r = gray.shape[1] / float(resized.shape[1])
# if the resized image is smaller than the template, then break
# from the loop
if resized.shape[0] < tH or resized.shape[1] < tW:
break
# detect edges in the resized, grayscale image and apply template
# matching to find the template in the image
edged = cv2.Canny(resized, 50, 200)
result = cv2.matchTemplate(edged, template, cv2.TM_CCOEFF)
(_, maxVal, _, maxLoc) = cv2.minMaxLoc(result)
# if we have found a new maximum correlation value, then update
# the bookkeeping variable
if found is None or maxVal > found[0]:
found = (maxVal, maxLoc, r)
print(found)
# unpack the bookkeeping variable and compute the (x, y) coordinates
# of the bounding box based on the resized ratio
print(found)
if found is None:
# just show only the frames if the template is not detected
cv2.imshow(windowName, frame)
else:
(_, maxLoc, r) = found
(startX, startY) = (int(maxLoc[0] * r), int(maxLoc[1] * r))
(endX, endY) = (int((maxLoc[0] + tW) * r), int((maxLoc[1] + tH) * r))
# draw a bounding box around the detected result and display the image
cv2.rectangle(frame, (startX, startY), (endX, endY), (0, 0, 255), 2)
cv2.imshow(windowName, frame)
if cv2.waitKey(1) == 27:
break
cv2.destroyAllWindows()
cap.release()
if __name__ == "__main__":
main()
Please help me to solve this problem
I can hint you with the OpenCV part, but without Unity and Vuforia, hope it may help.
So, the way I see the pipeline for the project:
Detect location, size, and aspect ratio
Use homography for transformation of the image that should be put over the original
Overlay put one image on top of the other
I will assume that the target will be a dark "T" on a white piece of paper, and it may appear in different locations of the paper, as well as the paper itself may move.
1. Detect location, size, and aspect ratio
Firstly, you need to detect the piece of paper, as you know its color and aspect ration, you may use RGB/HSV thresholding for segmentation. You may also try using Deep/Machine Learning (some similar strategy like in R-CNN, HOG-SVM etc.), but it will take time. Then, you can use findContours() function from OpenCV to get the largest object. From the contour you can get the location, size, and aspect ratio of the paper.
After that you do the same thing but within the piece of paper and looking for the "T". Here you can use template matching method, just scan the Region Of Interest with predefined mask of different sizes, or just repeat what steps above.
A useful resource may be this credit card characters recognition example. It helped me a lot one day:)
2. Use homography for transformation of the image that should be put over the original
After extracting aspect ratio you will know the approximate size and shape that should appear on top of the "T". This will let you to use homograpy for transformation of the image you want to put over "T". Here is a good point to start, you can also google for some other sources, there should be plenty of them, and as far as I know, OpenCV should have functions for that.
After the transformation, I would recommend you to use interpolation, because there might be some missing pixels afterwards.
3. Overlay put one image on top of the other
The last step is just to go through all pixels of the input image and put the transformed image over target pixels.
Hope this helps, good luck!:)
I'm a 1st-grade cs student and I know only a little bit of python. For a project, I need to use OpenCV to detect several traffic signs. I searched a little bit on the web and I decided to use Haar-Cascade classifier. I followed this tutorial : haar-cascade
I trained the code for this sign left-sign
Everything was fine up to this point. However my code (trained with 3000 positive 1500 negative jpgs and finished 8 stages)detects both right and left signs. Code needs to recognize right and left signs separately because my aim is to command my robot to turn left or turn right.
Here is my code:
import numpy as np
import cv2
ok_cascade = cv2.CascadeClassifier('new_kocum.xml')
cap = cv2.VideoCapture(0)
while 1:
ret, img = cap.read()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
oks = ok_cascade.detectMultiScale(gray,3,5)
for (x,y,w,h) in oks:
cv2.rectangle(img,(x,y),(x+w,y+h),(255,255,0),2)
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(img,'ok',(x-w,y-h), font, 0.5, (11,255,255), 2, cv2.LINE_AA)
cv2.imshow('img',img)
k = cv2.waitKey(30) & 0xff
if k == 27:
break
cap.release()
cv2.destroyAllWindows()
Here is the right sign : right-sign
So my question: Is it possible to fix that just by changing the code? If easier, which method should I use to detect these signs?
The correct way: you need add to the negative set a lot of the right signs.
The best way: don't use haar-cascade.
Simplest way: train the second classifier (for example naive Bayes) for comparing left and right signs after work you cascade. Features: correlation between images, Hu-moments etc.
This is a simple code for face detection using openCV :
import cv2
img = cv2.imread("one.jpg")
hc = cv2.CascadeClassifier("haarcascade_frontalface_alt2.xml")
faces = hc.detectMultiScale(img)
for face in faces:
print 'inside for loop ! '
cv2.rectangle(img, (face[0], face[1]), (face[0] + face[2], face[0] + face[3]), (255, 0, 0), 3)
cv2.imshow("Face", img)
if cv2.waitKey(5000) == 27:
cv2.destroyWindow("Face")
cv2.imwrite("two.jpg", img)
but when I run this code, the final image displayed ie two.jpg is the same as given in input ie one.jpg! without any face being detected.. the code inside the for loop is never executed ... why is it so ? Are there any changes which I should make in the code ?
this is the image I am giving as one.jpg & the final image ie two.jpg also looks the same
It seems none of the faces in the image you use have been detected (in which case, the for loop will not be executed). you can:
Use an image that has an easier-to-detect (large, no glasses) face.
Step through the code using a debugger (I don't know how to do this in Python, but it should be easier to find out)
Check the following attributes of faces
a. size
b. location coordinates of each face if detected.