Get the coordinates of the objects found by a template - python

I wanted to make a bot for a game that looks for a certain item on the floor and then clicks on it. I managed to get the first part right (it even draws a rectangle around the item), but embarrassingly I can't get the coordinates of that object. I use the cv2.matchTemplate method. This is my code:
import cv2
import numpy as np
import pyautogui

img_bgr = cv2.imread('gra.png')
img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
template = cv2.imread('bones2.png', 0)
w, h = template.shape[::-1]

res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.90
loc = np.where(res >= threshold)

for pt in zip(*loc[::-1]):
    cv2.rectangle(img_bgr, pt, (pt[0] + w, pt[1] + h), (0, 255, 255), 2)
    # here I wanted to move the mouse to the coordinates of a found item,
    # but I can't work out these two arguments ↓ ↓
    pyautogui.moveTo( ? , ? , duration=0.5)

cv2.imshow('znalezione', img_bgr)
cv2.waitKey()
cv2.destroyAllWindows()
I tried this:
pyautogui.moveTo((pt[0] * 2 + w) / 2, (pt[1] * 2 + h) / 2, duration=0.5)
but it doesn't work at all. Can someone explain to me what pt actually is and how to get the coordinates?
Also, here is a screenshot of what I have achieved so far:

From my understanding, both OpenCV and pyautogui use the same coordinate system, illustrated here with an example 1920x1080 resolution:
0,0                        X increases -->
+---------------------------+
|                           |  Y increases
|                           |      |
|    1920 x 1080 screen     |      |
|                           |      V
|                           |
|                           |
+---------------------------+ 1919, 1079
OpenCV's cv2.rectangle function takes the top-left and bottom-right coordinates of the rectangle as parameters. Since you were able to draw the bounding box in your image, you already have the correct coordinates of the ROI you want to examine. From the docs, the moveTo function takes two parameters: x and y. Assuming you want to move the mouse to the center of the bounding box, you can do
pyautogui.moveTo(pt[0] + w/2, pt[1] + h/2, duration=0.5)
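Putting that together with your loop, a minimal sketch might look like this (the click() call and the break after the first hit are my additions, not part of the original code):

import cv2
import numpy as np
import pyautogui

img_bgr = cv2.imread('gra.png')
img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
template = cv2.imread('bones2.png', 0)
w, h = template.shape[::-1]

res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
loc = np.where(res >= 0.90)

for pt in zip(*loc[::-1]):      # pt is the (x, y) of the match's top-left corner
    pyautogui.moveTo(pt[0] + w / 2, pt[1] + h / 2, duration=0.5)
    pyautogui.click()           # assumption: one click per found item
    break                       # stop after the first match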

First of all, you don't need such a complex calculation:
x = pt[0]
y = pt[1]
center_x = x + 0.5 * w
center_y = y + 0.5 * h
In terms of the points, I don't see any issues; it is not a coordinate problem. There is a high chance it is an issue with the pyautogui call itself, but I could not verify that, as I can't seem to install it on my PC.
Based on the example from the docs:
>>> pyautogui.moveTo(100, 200, 2)
Try calling exactly that first, to rule out an issue with the last parameter. If it works, then it is a simple argument-format issue.
If it does not work, then it might be an image conversion issue. pyautogui uses Pillow, which gives a format that must be adapted to work with OpenCV. So it is either a datatype issue (RGB vs. BGR) or a coordinate issue (e.g. OpenCV refers to image coordinates, while pyautogui uses desktop coordinates).
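For the RGB/BGR part, a minimal sketch of the conversion (the screenshot-based workflow here is my assumption, not something the original poster confirmed):

import cv2
import numpy as np
import pyautogui

# pyautogui.screenshot() returns a Pillow image in RGB order,
# while OpenCV expects BGR, so swap the channels before matching
shot = pyautogui.screenshot()
frame_bgr = cv2.cvtColor(np.array(shot), cv2.COLOR_RGB2BGR)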


Rotating a Bitmap with 3 Shears

I am rotating a bitmap using the three-shear method documented in these articles [1][2].
From about 0-90°, the quality is acceptable, but beyond that it gets progressively more distorted until it is unintelligible.
Can anyone help me locate what is going wrong? There are a few calls to methods from the application Cinema 4D's API, but I believe the issue is coming from the math. Thank you!
This is my shear function:
import math

def shear(angle, x, y):
    '''
    |1  -tan(𝜃/2) |   |1       0|   |1  -tan(𝜃/2) |
    |0      1     |   |sin(𝜃)  1|   |0      1     |
    '''
    # shear 1
    tangent = math.tan(angle / 2)
    new_x = round(x - y * tangent)
    new_y = y
    # shear 2
    new_y = round(new_x * math.sin(angle) + new_y)  # no change in new_x in this shear matrix
    # shear 3
    new_x = round(new_x - new_y * tangent)  # no change in new_y in this shear matrix
    return new_x, new_y
This is the code in the draw function:
cos = math.cos(self.rotation)
sin = math.sin(self.rotation)

# Define the width and height of the destination image
newWidth = round(abs(w * cos) + abs(h * sin)) + 1
newHeight = round(abs(h * cos) + abs(w * sin)) + 1

destBmp = c4d.bitmaps.BaseBitmap()           # creates a new BaseBitmap instance for the destination image
destBmp.Init(newWidth, newHeight)            # initializes the bitmap
destAlpha = destBmp.AddChannel(True, False)  # adds an alpha channel

# Find the center of the source image for rotation
origCenterWidth = round(((w + 1) / 2) - 1)   # with respect to the source image
origCenterHeight = round(((h + 1) / 2) - 1)  # with respect to the source image

# Find the center of the destination image
newCenterWidth = round(((newWidth + 1) / 2) - 1)    # with respect to the destination image
newCenterHeight = round(((newHeight + 1) / 2) - 1)  # with respect to the destination image

for xP in range(newWidth):
    for yP in range(newHeight):
        destBmp.SetPixel(int(xP), int(yP), 0, 0, 255)  # fills the destination bitmap's background with blue

for i in range(h):
    for j in range(w):
        # coordinates of the pixel with respect to the center of the source image
        x = w - 1 - j - origCenterWidth
        y = h - 1 - i - origCenterHeight
        # apply the shear transformation
        new_x, new_y = shear(self.rotation, x, y)
        # with rotation the center changes, so new_x and new_y are offset from the new center
        new_y = newCenterHeight - new_y
        new_x = newCenterWidth - new_x
        alphaValue = sourceBmp.GetAlphaPixel(alphaChannel, j, i)  # gets the source pixel's alpha
        col = sourceBmp.GetPixelDirect(j, i)                      # gets the source pixel's color as a Vector
        destBmp.SetAlphaPixel(destAlpha, int(new_x), int(new_y), alphaValue)               # sets the destination pixel's alpha
        destBmp.SetPixel(int(new_x), int(new_y), int(col.x), int(col.y), int(col.z))       # sets the destination pixel's color
I ran across the exact same problem today. There seems to be something about the shearing method that is optimal between 315 and 45 degrees and noticeably degrades between 90 and 270. What I did to get around this was flip the image on both x and y if the rotation is between 90 and 270, and then tack on an extra 180° to bring it back into the well-behaved range of rotation.
Here's basically what that was:
if (rotation > 90 && rotation < 270) {
    scale.x = scale.x * -1.0f;
    scale.y = scale.y * -1.0f;
    rotation += 180;
    if (rotation >= 360) {
        rotation -= 360;
    }
}
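In Python terms, a minimal sketch of that normalization might look like this (the function name and the flip flag are illustrative; applying the actual mirror to the bitmap is left to the caller):

def normalize_rotation(rotation_deg):
    # Map an angle in (90, 270) back into the shear-friendly range.
    # flip=True means: mirror the source on both x and y (a 180° rotation)
    # before applying the three shears.
    flip = False
    if 90 < rotation_deg < 270:
        flip = True
        rotation_deg += 180   # compensate so the total rotation is unchanged
        if rotation_deg >= 360:
            rotation_deg -= 360
    return rotation_deg, flip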

Locate and click a portion of an image inside another image

I have an image1 which I locate with pyautogui, center, and click.
That works fine.
But I also have a portion of this image that I need to click once I have located the first, and I am not able to get its coordinates.
I found the cv2 module and was able to match the template with the image, but I am not able to get the TEMPLATE's coordinates once I have found the first image.
So basically I have image1, which I locate, and there is a portion of it, called template, which I need to locate.
I need to do this because the first image can change position on screen. How do I get x, y to center on the template image, for something like
pyautogui.moveTo(x, y, 1)
This is the script that matches the template within the image:

import cv2
import numpy as np
import pyautogui as p

img_rgb = cv2.imread('big.png')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('portion.png', 0)
w, h = template.shape[::-1]
##print (w, h)

# TM_CCOEFF_NORMED gives a normalized score; with the original TM_SQDIFF
# a plain >= threshold test would be wrong, since lower is better there
res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.8
loc = np.where(res >= threshold)

for pt in zip(*loc[::-1]):
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 255, 255), 2)
##cv2.imshow('Detected', template)

# (of course before this I will center the x and y with locate/center somehow)
p.moveTo(x_of_portion, y_of_portion, 1)
OK, I did it.
I just locate the first image, then I locate the second one, giving the first image's coordinates as the region to check.
It was simple.
Sorry to bother you, guys.
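For anyone landing here, a minimal sketch of that region-based approach (file names and the move duration are illustrative):

import pyautogui

# Locate the big image first; returns a (left, top, width, height) box or None
region = pyautogui.locateOnScreen('big.png')
if region is not None:
    # Search for the portion only inside the box found above
    portion = pyautogui.locateOnScreen('portion.png', region=region)
    if portion is not None:
        center = pyautogui.center(portion)
        pyautogui.moveTo(center.x, center.y, 1)
        pyautogui.click()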

PIL zoom into image at a particular point

I'm creating some images with the Python Imaging Library (PIL). Now, like we zoom into a map at a particular location, I want to similarly zoom into my image at a specified point. Note that this is different from resizing the image; I want the size to remain the same. I couldn't find any built-in method in the documentation that does this. Is anyone aware of a method that might achieve this? I'd ideally like to do this without other dependencies like OpenCV.
I think you mean this:
from PIL import Image

def zoom_at(img, x, y, zoom):
    w, h = img.size
    zoom2 = zoom * 2
    img = img.crop((x - w / zoom2, y - h / zoom2,
                    x + w / zoom2, y + h / zoom2))
    return img.resize((w, h), Image.LANCZOS)
This will crop the image around the point you zoom into and then upscale the result to the original size.
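Hypothetical usage (the file name and coordinates are made up for illustration):

from PIL import Image

img = Image.open('map.png')
zoomed = zoom_at(img, 250, 150, zoom=2)   # 2x zoom centred on the point (250, 150)
zoomed.save('map_zoomed.png')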

Face and hair detection for Python

I am using OpenCV or dlib to detect faces in images, and the results are very good. Here is an example:
However, I also want to extract the hair and the neck from the image, like this:
I have tried to find a library or framework to help me achieve that, but I can't find one.
Is there any way to do that?
If you want to extract exactly the hair and neck regions, you need to train your own model, because the current dlib model does not include them.
Otherwise, if a rough capture is enough, you can use OpenPose, which gives you the landmarks of the face + ears + shoulders (even the body and hand fingers). From those landmarks you can derive the area of interest, as shown in the sketch below.
Example:
the width of the rectangle = the length of the shoulders (point 2 -> point 5)
the height = twice the length from the neck (point 1) to the nose (point 0): (point 1 - point 0) * 2
(image: landmarks by OpenPose)
(image: face + hair + neck)
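A minimal sketch of that rectangle construction, assuming kp is a list of OpenPose BODY_25 keypoints as (x, y) tuples (index 0 = nose, 1 = neck, 2 and 5 = shoulders); the function name and the nose-centred placement are my assumptions:

import math

def head_box(kp):
    width = math.dist(kp[2], kp[5])        # shoulder-to-shoulder length
    height = 2 * math.dist(kp[1], kp[0])   # twice the neck-to-nose length
    cx, cy = kp[0]                         # place the box around the nose
    return (cx - width / 2, cy - height / 2, width, height)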
You can use this code to increase the bounding box by a percentage:

rects = detector(original_image, 1)
for rect in rects:
    (x, y, w, h) = rect_to_bb(rect)
    x_inc = int(w * 0.3)
    y_inc = int(h * 0.3)
    # clamp to 0 so the numpy slice does not wrap around at the image border
    y0 = max(0, y - y_inc)
    x0 = max(0, x - x_inc)
    sub_face = original_image[y0:y + h + y_inc, x0:x + w + x_inc]
    newimg = cv2.resize(sub_face, (224, 224))

Most efficient way to find center of two circles in a picture

I'm trying to take a picture (.jpg file) and find the exact centers (x/y coords) of two differently colored circles in it. I've done this in Python 2.7. My program works well, but it takes a long time, and I need to drastically reduce that time. I currently check every pixel and test its color, and I know I could greatly improve efficiency by pre-sampling a subset of pixels (e.g. every tenth pixel in both the horizontal and vertical directions to find areas of the picture to home in on). My question is whether there are pre-built functions or ways of finding the x/y coords of objects that are much more efficient than my code. I've already removed function calls from within the loop, but that only reduced the run time by a few percent.
Here is my code:
from PIL import Image
import numpy as np

i = Image.open('colors4.jpg')
iar = np.asarray(i)
(numCols, numRows) = i.size
print numCols
print numRows

yellowPixelCount = 0
redPixelCount = 0
yellowWeightedCountRow = 0
yellowWeightedCountCol = 0
redWeightedCountRow = 0
redWeightedCountCol = 0

for row in range(numRows):
    for col in range(numCols):
        pixel = iar[row][col]
        r = pixel[0]
        g = pixel[1]
        b = pixel[2]
        brightEnough = r > 200 and g > 200
        if r > 2 * b and g > 2 * b and brightEnough:  # yellow pixel
            yellowPixelCount = yellowPixelCount + 1
            yellowWeightedCountRow = yellowWeightedCountRow + row
            yellowWeightedCountCol = yellowWeightedCountCol + col
        if r > 2 * g and r > 2 * b and r > 100:  # red pixel
            redPixelCount = redPixelCount + 1
            redWeightedCountRow = redWeightedCountRow + row
            redWeightedCountCol = redWeightedCountCol + col

print "Yellow circle location"
print yellowWeightedCountRow / yellowPixelCount
print yellowWeightedCountCol / yellowPixelCount
print " "
print "Red circle location"
print redWeightedCountRow / redPixelCount
print redWeightedCountCol / redPixelCount
print " "
Update: As I mentioned below, the picture is somewhat arbitrary, but here is an example of one frame from the video I am using:
First you have to clear up a few things:
- What do you consider fast enough?
- Where is a sample image, so we can see what you are dealing with (resolution, bits per pixel)?
- What platform (especially the CPU), so we can estimate speed?
As you are dealing with circles (each one coded with a different color), it should be enough to find each circle's bounding box. So find the min and max x, y coordinates of the pixels of each color. Then your circle is:

center.x = (xmin + xmax) / 2
center.y = (ymin + ymax) / 2
radius   = ((xmax - xmin) + (ymax - ymin)) / 4
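A numpy sketch of that bounding-box idea, assuming iar is the H x W x 3 RGB array from the question (the yellow thresholds are illustrative, loosely following the question's test):

import numpy as np

mask = (iar[:, :, 0] > 200) & (iar[:, :, 1] > 200) & (iar[:, :, 2] < 100)  # yellow-ish pixels
rows, cols = np.nonzero(mask)
if rows.size:
    center_x = (cols.min() + cols.max()) / 2.0
    center_y = (rows.min() + rows.max()) / 2.0
    radius = ((cols.max() - cols.min()) + (rows.max() - rows.min())) / 4.0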
If coded right, even your approach should take just a few ms; on images around 1024x1024 resolution I estimate 10-100 ms on an average machine. You wrote that your approach is too slow, but you did not specify the time itself (in some cases 1 µs is slow, in others 1 min is enough, so we can only guess what you need and what you got). Anyway, if you have a similar resolution and the time is 1-10 s, then you are most likely using some slow pixel access (most likely GDI) like get/setpixel; use the bitmap's Scanline[], direct pixel access with bitblt, or your own memory for the images.
Your approach can be sped up by using ray casting to find the approximate location of the circles (see the sketch after this list):
1. Cast horizontal rays. Their spacing should be smaller than the radius of the smallest circle you are searching for; cast as many rays as needed until each circle is hit by at least 2 of them.
2. Cast 2 vertical rays. You can use the intersection points found in #1, so there is no need to cast many rays, just 2. Use the horizontal ray whose intersection points are closer together, but not too close.
3. Compute your circle's properties. From the 4 intersection points, compute the center and radius. As they form an axis-aligned rectangle (+/- pixel error), it is as easy as finding the midpoint of either diagonal; the radius is also obvious, as half the diagonal's size.
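A rough sketch of the horizontal pass in step 1, assuming mask is a boolean H x W numpy array for one circle's color and step is smaller than the smallest expected radius (both names are mine):

import numpy as np

def horizontal_hits(mask, step):
    hits = []
    for y in range(0, mask.shape[0], step):
        xs = np.nonzero(mask[y])[0]          # columns where this ray crosses the circle
        if xs.size:
            hits.append((y, xs[0], xs[-1]))  # row plus entry/exit x
    return hits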
As you did not share any image, we can only guess what you have. In case you do not have circles, or need an idea for a different approach, see:
Algorithms: Ellipse matching
find archery target in image of different perspectives
If you are sure of the colours of the circles, an easier method is to filter the colours using a mask and then apply Hough circles, as Mathew Pope suggested.
Here is a snippet to get you started quickly:
import cv2
import numpy as np

fn = '200px-Traffic_lights_dark_red-yellow.svg.png'

# OpenCV reads the image in BGR format
img = cv2.imread(fn)

# Convert to HSV format
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# lower red mask (hue 0-10)
lower_red = np.array([0, 50, 50])
upper_red = np.array([10, 255, 255])
mask = cv2.inRange(img_hsv, lower_red, upper_red)

# Bitwise-AND mask and original image
masked_red = cv2.bitwise_and(img, img, mask=mask)

# Check for circles using HoughCircles in OpenCV
# (in OpenCV 3+ the constant is cv2.HOUGH_GRADIENT instead of cv2.cv.CV_HOUGH_GRADIENT)
circles = cv2.HoughCircles(mask, cv2.cv.CV_HOUGH_GRADIENT, 1, 20, param1=30, param2=15, minRadius=0, maxRadius=0)
print 'Center: x = ' + str(circles[0][0][0]) + ' y = ' + str(circles[0][0][1])
One example of applying it to an image looks like this. First is the original image, followed by the red colour mask obtained, and last is the result after the circle is found using OpenCV's Hough circle function.
The circle found using the above method is centred at x = 97.5, y = 99.5 (circles[0][0][2] would give the radius).
Hope this helps! :)
