How to find the template matching accuracy - Python

I am doing template matching. Now, what I want to do is find the accuracy of that template matching. I have done the matching, but how do I get the accuracy? I think I have to subtract the matched region from the template image. How do I achieve this?

Code:
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

img = cv.imread('image.jpg', 0)
img1 = img.copy()
template = cv.imread('template.jpg', 0)
w, h = template.shape[::-1]

methods = ['cv.TM_CCOEFF_NORMED', 'cv.TM_CCORR_NORMED']
for meth in methods:
    img = img1.copy()
    method = eval(meth)
    res = cv.matchTemplate(img, template, method)
    min_val, max_val, min_loc, max_loc = cv.minMaxLoc(res)
    # For these two methods the best match is at the maximum
    top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    cv.rectangle(img, top_left, bottom_right, 255, 2)
    plt.subplot(121)
    plt.imshow(res, cmap='gray')
    plt.title('Matching Result')
    plt.subplot(122)
    plt.imshow(img, cmap='gray')
    plt.title('Detected Point')
    plt.show()

Please don't use absolute difference or any similar method to calculate accuracy. You already have accuracy values in the variables min_val and max_val.
The OpenCV template matching uses various forms of correlation to calculate the match. So when you use cv.matchTemplate(img,template,method) the value stored in the res image is the result of this correlation.
So, when you use cv.minMaxLoc(res) you are calculating the minimum and maximum result of this correlation. I simply use max_val to tell me how well it has matched. Since both min_val and max_val are in the range [-1.0, 1.0], if max_val is 1.0 I take that as a 100% match, a max_val of 0.5 as a 50% match, and so on.
I've tried using a combination of min_val and max_val to scale the values to get a better understanding, but I found that simply using max_val gives me the desired results.
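As a minimal sketch of this idea (the file names are assumptions):

import cv2 as cv

img = cv.imread('image.jpg', 0)
template = cv.imread('template.jpg', 0)
res = cv.matchTemplate(img, template, cv.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv.minMaxLoc(res)
# For the normalized methods max_val lies in [-1.0, 1.0], so it can be
# read directly as a match score
print("match accuracy: {:.0f}%".format(max_val * 100))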

There are several examples of metrics one can use to compare images. Some of them are:
SAD: Sum of Absolute Differences
SSD: Sum of Squared Differences
Cross-correlation
All of these require only elementary operations on the pixel values of the two images; a minimal numpy sketch of these metrics follows below.
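This sketch assumes patch and template are equally-sized grayscale arrays; the function names are mine, for illustration:

import numpy as np

def sad(patch, template):
    # Sum of Absolute Differences: 0 means a perfect match
    return np.sum(np.abs(patch.astype(np.float32) - template.astype(np.float32)))

def ssd(patch, template):
    # Sum of Squared Differences: 0 means a perfect match
    return np.sum((patch.astype(np.float32) - template.astype(np.float32)) ** 2)

def ncc(patch, template):
    # Normalized cross-correlation: 1.0 means a perfect match
    p = patch.astype(np.float32) - patch.mean()
    t = template.astype(np.float32) - template.mean()
    return np.sum(p * t) / (np.linalg.norm(p) * np.linalg.norm(t))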

Related

How to properly filter an image using OpenCV? To read and extract the text with the highest possible percentage of effectiveness

This is what I do: I take some images, and some of them contain information that I need. These are the images:
How do I find that information? I use a template that contains two symbols (Euro and Dollar); when these symbols are found in any of the images, I can process the image and try to extract the data that I need.
How do I extract the data? I take the dimensions of the found match, and since I know that the information to extract will always be to the right of the match, I draw a box towards the right edge of my image, which makes sure I have a box with the data to extract.
Here is the code; I will divide it into several sections to explain the process a little better:
1) Initial settings for the code (imports, a list of images to process, a couple of functions to filter the image, and finally the configuration for reading data with Tesseract):
import cv2
import numpy as np
from matplotlib import pyplot as plt
import pytesseract
from pytesseract import Output

imagenes = ["monitor1.jpg", "monitor2.jpg", "monitor3.jpg"]

# get grayscale image
def get_grayscale(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Noise removal (this is the filter I am using)
def remove_noise(image):
    return cv2.medianBlur(image, 5)

# The configuration we will use to read the images:
my_config = r"--psm 11 --oem 3"
2) Next, the template we will try to match is read, and we take its dimensions (w = width and h = height).
We present the methods for finding the matches and enter a loop, reviewing image by image and trying to find a match:
# Reading the template (Euro and Dollar):
# template_simbolo = cv2.imread('template_euro_dolar.jpg', 0)
template = cv2.imread('template_simbolos.jpg', 0)
w, h = template.shape[::-1]

# The methods we will use to find the matches (this should be a list of 6
# methods, but working with a big list of images takes an eternity, so for
# now we will only use one for the tests):
methods = ['cv2.TM_CCOEFF']

# A loop where we filter every image, find the matches if there are any,
# and extract the data:
for img in imagenes:
    print("**************************")
    # Image to read:
    img_rgb = cv2.imread(img)
    # The image filtered in gray:
    gray = get_grayscale(img_rgb)
    img_gray = remove_noise(gray)
    # With res we will find the matches, but we only take the accurate
    # ones (80% or more):
    res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
    threshold = 0.8
    loc = np.where(res >= threshold)
    print(loc)
    print(len(loc[0]))
3) In this part of the code, we first enclose the match in a box, filter the original image, and look for the coordinates of the match; once this is done, we enclose in a box the section where the desired information is found.
    # If loc contains values, it is because there is a match:
    if len(loc[0]) > 0:
        print("Match Found")
        # We enclose the found matches in a box and save the result:
        for pt in zip(*loc[::-1]):
            cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
        cv2.imwrite('res_monitor.png', img_rgb)

        # A loop of matching methods:
        for meth in methods:
            # We filter the image and change it to a gray color
            gray = get_grayscale(img_rgb)
            img_gray = remove_noise(gray)
            # We evaluate the method to use and according to it we have
            # some default coordinates
            method = eval(meth)
            min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
            print("min_val:", min_val)
            print("max_val:", max_val)
            print("min_loc:", min_loc)
            print("max_loc:", max_loc)
            # If the method is TM_SQDIFF or TM_SQDIFF_NORMED, take the minimum
            if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
                top_left = min_loc
            else:
                top_left = max_loc
            # To know the bottom right coordinate, we take the value of
            # top_left and add the width w and height h.
            w, h = template.shape[::-1]
            bottom_right = (top_left[0] + w, top_left[1] + h)
            print("top_left:", top_left)
            print("bottom_right:", bottom_right)
            print("x:", top_left[0])
            print("y:", top_left[1])
            # Now, in our original image, which we previously filtered, we
            # place a box or rectangle according to the dimensions
            # established before (top_left and bottom_right).
            w, h = img_gray.shape[::-1]
            print("w:", w)
            print("h:", bottom_right[1])
            cv2.rectangle(img_gray, top_left, bottom_right, 255, 2)
            imagen = cv2.rectangle(img_gray, top_left, (w, bottom_right[1]), 255, 2)
            x = top_left[0]
            y = top_left[1]
            w = w
            h = bottom_right[1]
            # Finally we crop the section where we established the area to
            # review, and with pytesseract we look for the data we can
            # obtain from the cropped image.
            crop_image = img_gray[y:h, x:w]
            cv2.imwrite("croped.jpg", crop_image)
            data = pytesseract.image_to_data(crop_image, config=my_config,
                                             output_type=Output.DICT)
            print(data, "\n")
4) Finally, we create a dictionary to save the rates of the euro and the dollar; if everything goes well, they will be saved correctly.
At the end of this process, a plot is shown to verify that the information was extracted correctly.
            # We create a dictionary to store the values of Euro and Dollar
            i = 0
            currencies = {}
            for value in data["text"]:
                print(value)
                try:
                    currency = value.replace(".", "").replace(",", ".")
                    currency = float(currency)
                    i = i + 1
                    if i == 1:
                        currencies["Euro"] = currency
                    elif i == 2 and currency < currencies["Euro"]:
                        currencies["Dolar"] = currency
                except ValueError:
                    pass

            # We pass the image to string to obtain the rates of the currencies
            text = pytesseract.image_to_string(crop_image, config=my_config)
            print(text)
            print(currencies)

            # We graph the results and confirm that the data extraction and
            # the demarcated area are correct.
            plt.subplot(121), plt.imshow(res, cmap='gray')
            plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
            plt.subplot(122), plt.imshow(img_gray, cmap='gray')
            plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
            plt.suptitle(meth)
            plt.show()
    else:
        print("DOES NOT MATCH")
The results:
With the code and all the logic presented above, it usually works very well on these images, but for some reason it sometimes doesn't read the image properly and doesn't save the information in the desired way.
As can be seen, the area that the code takes is the desired one, but the currencies dictionary does not record any information:
Which is strange, because if I run the code on a longer list of images, that same image is recognized perfectly.
So the problem here is that sometimes it works and sometimes it doesn't, and I'm not quite sure why. Does anyone know what I can polish? What am I doing wrong? Any advice?

Finding the darkest region in a depth map using numpy and/or cv2

I am attempting to consistently find the darkest region in a series of depth map images generated from a video. The depth maps are generated using the PyTorch implementation here
Their sample run script generates a prediction of the same size as the input where each pixel is a floating point value, with the highest/brightest value being the closest. Standard depth estimation using ConvNets.
The depth prediction is then normalized as follows to make a png for review
bits = 2
depth_min = prediction.min()
depth_max = prediction.max()
max_val = (2**(8*bits))-1
out = max_val * (prediction - depth_min) / (depth_max - depth_min)
I am attempting to identify the darkest region in each image in the video, with the assumption that this region has the most "open space".
I've tried several methods:
cv2 template matching
Using cv2 template matching and minMaxLoc, I created a template of np.zeros((100, 100)), then applied the template similar to the docs:
img2 = out.copy().astype("uint8")
template = np.zeros((100, 100)).astype("uint8")
w, h = template.shape[::-1]
res = cv2.matchTemplate(img2, template, cv2.TM_SQDIFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
top_left = min_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
val = out.max()
cv2.rectangle(out, top_left, bottom_right, int(val), 2)
As you can see, this implementation is very inconsistent, with many false positives.
np.argmin
Using np.argmin(out, axis=1), which generates many indices. I take the first two and write the word MIN at those coordinates:
text = "MIN"
textsize = cv2.getTextSize(text, font, 1, 2)[0]
textX, textY = np.argmin(prediction, axis=1)[:2]
cv2.putText(out, text, (textX, textY), font, 1, (int(917*max_val), int(917*max_val), int(917*max_val)), 2)
This is less inconsistent but still lacking.
np.argwhere
Using np.argwhere(prediction == np.min(prediction)), then writing the word MIN at the coordinates. I imagined this would give me the darkest pixel on the image, but this is not the case.
I've also thought of running a convolution operation with a 50x50 kernel, then taking the region with the smallest value as the darkest region.
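A hedged sketch of that idea (the 50x50 window and file name are assumptions): average the depth map over the window with cv2.boxFilter, then take the location of the smallest windowed mean as the darkest region.

import cv2
import numpy as np

out = cv2.imread('frame.png', cv2.IMREAD_GRAYSCALE)  # hypothetical file name
# boxFilter with normalize=True (the default) computes the mean over the window
means = cv2.boxFilter(out.astype(np.float32), -1, (50, 50))
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(means)
# boxFilter centers the window on each pixel, so shift back to the corner
top_left = (max(min_loc[0] - 25, 0), max(min_loc[1] - 25, 0))
cv2.rectangle(out, top_left, (top_left[0] + 50, top_left[1] + 50), 255, 2)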
My question is: why are there inconsistencies and false positives, and how can I fix that? Intuitively this seems like a very simple thing to do.
UPDATE
Thanks to Hans for the idea. Please follow this link to download the output depths in png format.
The minimum is not a single point but, as a rule, a larger area. argmin finds the first x and y (the top left corner) of this area:
In case of multiple occurrences of the minimum values, the indices corresponding to the first occurrence are returned.
What you need is the center of this minimum region. You can find it using moments. Sometimes you have multiple minimum regions for instance in frame107.png. In this case we take the biggest one by finding the contour with the largest area.
We still have some jumping markers, as sometimes a tiny area is the minimum, e.g. in frame25.png. Therefore we use a minimum area threshold min_area, i.e. we don't use the absolute minimum region but the region with the smallest value among all regions greater than or equal to that threshold.
import numpy as np
import cv2
import glob

min_area = 500
for file in glob.glob("*.png"):
    img = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
    for i in range(img.min(), 255):
        if np.count_nonzero(img == i) >= min_area:
            b = np.where(img == i, 1, 0).astype(np.uint8)
            break
    contours, _ = cv2.findContours(b, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    max_contour = max(contours, key=cv2.contourArea)
    m = cv2.moments(max_contour)
    x = int(m["m10"] / m["m00"])
    y = int(m["m01"] / m["m00"])
    out = cv2.circle(img, (x, y), 10, 255, 2)
    cv2.imwrite(file, out)
frame107 with five regions where the image is 0, shown with enhanced gamma:
frame25 with a very small min region (red arrow); we take the fifth largest min region instead (white circle):
The result (for min_area=500) is still a bit jumpy in some places, but if you further increase min_area you'll get false results for frames with a very steeply descending (and hence small per value) dark area. Maybe you can use the time axis (frame number) to filter out frames where the location of the darkest region jumps back and forth within 3 frames; a hedged sketch of that idea follows.
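This sketch assumes centers is a list with one (x, y) marker per frame, collected in the loop above; a 3-frame median discards a single-frame jump.

import numpy as np

centers = np.array(centers, dtype=float)  # assumed: one (x, y) per frame
smoothed = np.array([
    np.median(centers[max(0, i - 1):i + 2], axis=0)  # median over a 3-frame window
    for i in range(len(centers))
])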

How to find one template in several images?

I have one template and several images. The problem is to find out whether this template is in each image or not. I wrote a loop, but I think it doesn't return a Boolean value...
for i in images:
    res = cv2.matchTemplate(i, templateDealer, cv2.TM_CCOEFF_NORMED)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    cv2.rectangle(i, top_left, bottom_right, (0, 255, 0), 2)
    result.append(res)
Please help me make this work...
matchTemplate returns a 2D array with the match values for each pixel location - how well the template matches that location.
cv2.minMaxLoc(res) returns the value and location of the best match in the image. It may not be an actual match when you compare it visually, but it is the highest value the algorithm returned. You can compare this highest value over multiple images; the highest overall is the one you seek.
Note: you should not use a normalizing method for this; use cv2.TM_CCOEFF instead of cv2.TM_CCOEFF_NORMED.
What you need to do is append a tuple that holds the highest value and its location:
result.append((max_val, max_loc))
After you have processed all images, find the highest max_val and draw a rectangle using its max_loc; a minimal sketch follows.
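This sketch reuses the names from the question's loop (images, templateDealer, w, and h are assumed to be defined):

result = []
for i in images:
    res = cv2.matchTemplate(i, templateDealer, cv2.TM_CCOEFF)
    _, max_val, _, max_loc = cv2.minMaxLoc(res)
    result.append((max_val, max_loc))

# The image with the highest max_val is the one that contains the template
best_index = max(range(len(result)), key=lambda k: result[k][0])
best_val, best_loc = result[best_index]
cv2.rectangle(images[best_index], best_loc,
              (best_loc[0] + w, best_loc[1] + h), (0, 255, 0), 2)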
minMaxLoc will always give you something.
It totally depends on your task. There are a few possibilities I can think of:
(1) You are looking for only the single most likely image among the N images.
In this case, concatenate all images along one direction, then run standard OpenCV template matching and minMaxLoc to find the most likely location:
vis = np.concatenate((img1, img2), axis=0)  # do it for N images if necessary
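For instance, a hedged sketch of mapping the best location back to an image index (equal image heights and no seam-straddling matches are assumed; img1, img2, and template are the assumed inputs):

res = cv2.matchTemplate(vis, template, cv2.TM_CCOEFF_NORMED)
_, _, _, max_loc = cv2.minMaxLoc(res)
# max_loc[1] is the y coordinate in the stacked image; integer division
# by one image's height tells us which of the stacked images matched
image_index = max_loc[1] // img1.shape[0]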
(2) You just want to check the similarity of the template to each of the N images.
Then you need to declare a threshold after minMaxLoc and check whether there is a point above it: return 1 if there is, 0 if not (a minimal sketch of this option follows the function below).
def getBestMatch():
    images = [
        cv2.imread('tmp/1.png'),
        cv2.imread('tmp/2.png'),
        cv2.imread('tmp/3.png'),
        cv2.imread('tmp/4.png'),
        cv2.imread('tmp/5.png'),
        cv2.imread('tmp/6.png')
    ]
    template = cv2.imread('template.png')
    result = []
    for i in images:
        match = cv2.matchTemplate(i, template, cv2.TM_CCOEFF_NORMED)
        _, confidence, _, _ = cv2.minMaxLoc(match)
        result.append(confidence)
    posNum = result.index(max(result))
    return posNum
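And a hedged sketch of option (2), returning a Boolean per image (the 0.8 threshold is an assumption that needs tuning for your data):

def containsTemplate(image, template, threshold=0.8):
    # Best normalized match anywhere in the image
    match = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
    _, confidence, _, _ = cv2.minMaxLoc(match)
    # True if the best match clears the threshold
    return confidence >= threshold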

Template Matching Returns Zero

I'm simply using template matching with the method cv2.TM_CCOEFF_NORMED in OpenCV. The difference is that both the reference image and the warped image are divided into small pieces, for instance of 128x108 resolution. Generally it works well, but sometimes it returns zero even when both pieces are almost the same. Below are the sample image pairs; one of them has a line with intensity 1, while all other values are zero. Is there a specific reason why it fails for this example? Maybe because of the low resolution of the images?
Thanks in advance.
grab_image
ref_image
import cv2
import numpy as np

np.set_printoptions(threshold='nan')

def main():
    img_ref = cv2.imread('folderoftheimage')
    img_grab = cv2.imread('folderoftheimage')
    max_val_array = []
    template_matching_array = []
    # Apply template matching
    res = cv2.matchTemplate(img_ref, img_grab, cv2.TM_CCOEFF_NORMED)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    max_val_array.append(max_val)
    print "min_val", min_val
    print "max_val", max_val
    print "min_loc", min_loc
    print "max_loc", max_loc
    template_matching_array.append(np.min(max_val_array))
    index_min = max_val_array.index(np.min(max_val_array))
    print "template matching", template_matching_array
    print "zero element", index_min

main()

OpenCV has found an image that doesn't exist on screen

I am trying to use OpenCV to find a button's location on screen. If the button exists on screen, OpenCV works perfectly, but it returns some non-zero x, y even when the image doesn't exist. How do I fix it?
import cv2
from PIL import ImageGrab  # ImageGrab comes from Pillow (PIL)

def buttonlocation(image):
    im = ImageGrab.grab()
    im.save('screenshot.png')
    img = cv2.imread(image, 0)
    img2 = img.copy()
    template = cv2.imread('screenshot.png', 0)
    w, h = template.shape[::-1]
    meth = 'cv2.TM_SQDIFF'
    img = img2.copy()
    method = eval(meth)
    res = cv2.matchTemplate(img, template, method)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    top_left = min_loc
    x, y = top_left
    return x, y
The documentation of OpenCV details the two steps of the template matching procedure.
R = cv2.matchTemplate(I, T, method) computes an image R. Each pixel (x, y) of this image represents a mark depending on the similarity between the template T and the sub-image of I starting at (x, y). For instance, if the method cv.TM_SQDIFF is applied, the mark is computed as
R(x, y) = sum over (x', y') of ( T(x', y') - I(x + x', y + y') )^2
where the sum runs over the template coordinates (x', y').
If R[x,y] is zero, then the sub-image I[x:x+sxT, y:y+syT] is exactly identical to the template T. The smaller R[x,y] is, the closer the sub-image is to the template.
cv2.minMaxLoc(R) is applied to find the minimum of R. The corresponding sub-image of I is expected to be closer to the template than any other sub-image of I.
If the image I does not contain the template, the sub-image of I corresponding to the minimum of R can be very different from T. But the value of the minimum reflects this! Indeed, a threshold on R can be applied as a way to decide whether the template is in the image or not.
Choosing the value for the threshold is a tricky task. It could be a fraction of the maximum value of R or a fraction of the mean value of R. The influence of the size of the template can be discarded by dividing R by sxT*syT. The maximum value of R depends on the template size and the type of the image: for instance, for CV_8UC3 (unsigned char, 3 channels) the maximum value of R with TM_SQDIFF is 255^2*3*sxT*syT.
Here is an example:
import cv2

img = cv2.imread('image.jpg', eval('cv2.CV_LOAD_IMAGE_COLOR'))
template = cv2.imread('template.jpg', eval('cv2.CV_LOAD_IMAGE_COLOR'))
cv2.imshow('image', img)
#cv2.waitKey(0)
#cv2.destroyAllWindows()

meth = 'cv2.TM_SQDIFF'
method = eval(meth)
res = cv2.matchTemplate(img, template, method)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
top_left = min_loc
x, y = top_left
h, w, c = template.shape
print 'R=' + str(min_val)
if min_val < h*w*3*(20*20):
    cv2.rectangle(img, min_loc, (min_loc[0] + w, min_loc[1] + h), (0, 255, 0), 3)
else:
    print 'first template not found'

template = cv2.imread('template2.jpg', eval('cv2.CV_LOAD_IMAGE_COLOR'))
res = cv2.matchTemplate(img, template, method)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
top_left = min_loc
x, y = top_left
h, w, c = template.shape
print 'R=' + str(min_val)
if min_val < h*w*3*(20*20):
    cv2.rectangle(img, min_loc, (min_loc[0] + w, min_loc[1] + h), (0, 0, 255), 3)
else:
    print 'second template not found'

cv2.imwrite("result.jpg", img)
cv2.namedWindow('res', 0)
cv2.imshow('res', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
The image:
The first template is to be found:
The second template is not to be found:
The result:
