I have been following a tutorial about computer vision and doing a little project to read the time from a game. The game time is formatted as h:m. So far I have the h and m figured out using findContours, but I'm having trouble isolating the colon, as the character shape is not continuous. Because of this, when I try matchTemplate the code freaks out and starts matching the dot against all the other digits.
Are there ways to group the contours by X?
Here is simplified code to get the reference digits; the code to get the digits from the screen is basically the same.
import cv2
import imutils
from imutils import contours

# ref is the thresholded reference image loaded earlier
refCnts = cv2.findContours(ref.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
refCnts = imutils.grab_contours(refCnts)
refCnts = contours.sort_contours(refCnts, method="left-to-right")[0]
digits = {}

# loop over the OCR-A reference contours
for (i, c) in enumerate(refCnts):
    # compute the bounding box for the digit, extract it, and resize
    # it to a fixed size
    (x, y, w, h) = cv2.boundingRect(c)
    roi = ref[y:y + h, x:x + w]
    roi = cv2.resize(roi, (10, 13))
    digits[i] = roi
I'm new to Python and OpenCV. Apologies in advance if this is a dumb question.
Here is the reference image I'm using:
Here is the input image I'm trying to read:
Do you have to use findContours? There are better-suited methods for such problems. For instance, you can use template matching as shown below:
These are input, template (cut out from your reference image), and output images:
import cv2
import numpy as np

# Read the input image & convert to grayscale
input_rgb = cv2.imread('input.png')
input_gray = cv2.cvtColor(input_rgb, cv2.COLOR_BGR2GRAY)

# Read the template (using 0 to read the image in grayscale mode)
template = cv2.imread('template.png', 0)

# Perform template matching - more on this here: https://docs.opencv.org/4.0.1/df/dfb/group__imgproc__object.html#ga3a7850640f1fe1f58fe91a2d7583695d
res = cv2.matchTemplate(input_gray, template, cv2.TM_CCOEFF_NORMED)

# Store the coordinates of the matched area
# I found the threshold value of .56 by trial & error on the input image - it might be different in your game
lc = np.where(res >= 0.56)

# Draw a rectangle around each matched region
# I used the width and height of the template image, but in practice you need a better method to accomplish this
w, h = template.shape[::-1]
for pt in zip(*lc[::-1]):
    cv2.rectangle(input_rgb, pt, (pt[0] + w, pt[1] + h), (0, 255, 255), 1)

# Display the output
cv2.imshow('Detected', input_rgb)
# cv2.imwrite('output.png', input_rgb)
cv2.waitKey(0)
cv2.destroyAllWindows()
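As the comment in the code says, drawing a box at every location above the threshold is crude: the pixels around each character will usually pass the threshold too, giving many overlapping boxes. A minimal sketch of a greedy non-maximum suppression pass that reuses res, lc, w and h from the snippet above could look like this (the 0.3 overlap fraction is just an assumption to tune):

def non_max_suppression(points, w, h, overlap_thresh=0.3):
    # points: candidate (x, y) top-left corners, best match first
    # greedily keep a point, then drop later points whose box overlaps a kept one too much
    kept = []
    for (x, y) in points:
        keep = True
        for (kx, ky) in kept:
            ix = max(0, min(x + w, kx + w) - max(x, kx))
            iy = max(0, min(y + h, ky + h) - max(y, ky))
            if ix * iy > overlap_thresh * w * h:
                keep = False
                break
        if keep:
            kept.append((x, y))
    return kept

# sort the candidate locations by match score (best first), then suppress duplicates
candidates = sorted(zip(*lc[::-1]), key=lambda p: res[p[1], p[0]], reverse=True)
matches = non_max_suppression(candidates, w, h)
print(matches)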
You may also look into text detection & recognition using OpenCV.
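For instance, pytesseract (which is also used elsewhere on this page) can often read the whole time string in one go. This is only a rough sketch; the --psm 7 page-segmentation mode and the character whitelist are assumptions you may need to adjust for your game:

import cv2
import pytesseract

img = cv2.imread('input.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# treat the crop as a single text line and restrict recognition to digits and ':'
config = r'--psm 7 -c tessedit_char_whitelist=0123456789:'
time_text = pytesseract.image_to_string(gray, config=config).strip()
print(time_text)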
Related
I am currently working on a small project, but I have an unresolved problem: I want to draw a shape through the desired objects. The first step is to determine the coordinates of the starting and ending points, but I don't have a specific idea of how to do that yet. I hope you can give me suggestions; glad to have your help.
I want the result to look like this:
Here is an example using OpenCV that draws a rectangle over an object:
By giving a template image cut from the source image, you can draw the shape over every match in the source image.
# importing needed libraries
import cv2
import numpy as np

img_rgb = cv2.imread('source_image.png')  # opening the source image
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)  # converting to grayscale
template = cv2.imread('template_image.png', 0)  # opening the template image
w, h = template.shape[::-1]  # getting the shape of the template image
res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)  # matching the template against the image
threshold = 0.9
loc = np.where(res >= threshold)
for pt in zip(*loc[::-1]):
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 1)
cv2.imshow('screen', img_rgb)
cv2.waitKey(0)
Source Image
Template Image
Result Image
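If you also need a single pair of starting and ending coordinates that covers all of the matched objects (as mentioned in the question), a small sketch building on the loc, w, h and img_rgb variables from the code above could be:

# collect all matched top-left corners; only proceed if something matched
points = list(zip(*loc[::-1]))
if points:
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # starting point = top-left of the left/upper-most match,
    # ending point = bottom-right of the right/lower-most match
    start = (min(xs), min(ys))
    end = (max(xs) + w, max(ys) + h)
    cv2.rectangle(img_rgb, start, end, (0, 255, 0), 2)
    print("start:", start, "end:", end)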
This is what I do: I take some images, and some of them contain information that I need. These are the images:
How do I find that information? I use a template that contains two symbols (Euro and Dollar). When these symbols are found in any of the images, I can process that image and try to extract the data I need.
How do I extract the data? I take the dimensions of the found match, and since I know the information to extract will always be to the right of the match, I extend a box from the match to the right edge of my image, which ensures the box contains the data to extract.
Here is the code; I will divide it into several sections to explain the process a little better:
1) Initial Settings for the Code (Imports, a list of images which will be processed, a couple of functions to filter the image and finally the configuration set for reading data from Tesseract):
import cv2
import numpy as np
from matplotlib import pyplot as plt
import pytesseract
from pytesseract import Output

imagenes = ["monitor1.jpg", "monitor2.jpg", "monitor3.jpg"]

# get grayscale image
def get_grayscale(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Noise Removal (This is the filter I am using)
def remove_noise(image):
    return cv2.medianBlur(image, 5)

# The configuration we will use to read the images:
my_config = r"--psm 11 --oem 3"
2) Next, the template we will try to match against is read, and we take its dimensions (w = width and h = height).
We define the methods used to find the matches and enter a loop that reviews the images one by one, trying to find a match:
# Reading the Template (Euro and Dollar):
# template_simbolo = cv2.imread('template_euro_dolar.jpg', 0)
template = cv2.imread('template_simbolos.jpg', 0)
w, h = template.shape[::-1]

# The methods we will use to find the matches (this should be a list of 6 methods,
# but working with a big list of images it takes an eternity, so for now we will
# only use one for the tests):
methods = ['cv2.TM_CCOEFF']

# A loop where we filter every image, find the matches if there are any,
# and extract the data:
for img in imagenes:
    print("**************************")
    # Image to read:
    img_rgb = cv2.imread(img)
    # The image filtered in gray:
    gray = get_grayscale(img_rgb)
    img_gray = remove_noise(gray)
    # With res we find the matches, but we only keep the accurate ones (80% or more)
    res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
    threshold = 0.8
    loc = np.where(res >= threshold)
    print(loc)
    print(len(loc[0]))
3) In this part of the code, we first enclose the match in a box, filter the original image, and look for the coordinates of the match. Once this is done, we enclose in a box the section where the desired information is found.
    # If loc contains values, it is because there is a match:
    if len(loc[0]) > 0:
        print("Match Found")
        # We enclose the found matches in a box and save the result:
        for pt in zip(*loc[::-1]):
            cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
        cv2.imwrite('res_monitor.png', img_rgb)

        # A loop over the matching methods:
        for meth in methods:
            # We filter the image and change it to gray
            gray = get_grayscale(img_rgb)
            img_gray = remove_noise(gray)

            # We evaluate the method to use and, according to it, take some
            # default coordinates
            method = eval(meth)
            min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
            print("min_val:", min_val)
            print("max_val:", max_val)
            print("min_loc:", min_loc)
            print("max_loc:", max_loc)

            # If the method is TM_SQDIFF or TM_SQDIFF_NORMED, take the minimum
            if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
                top_left = min_loc
            else:
                top_left = max_loc

            # To get the bottom-right coordinate, we take top_left and add the
            # template width w and height h.
            w, h = template.shape[::-1]
            bottom_right = (top_left[0] + w, top_left[1] + h)
            print("top_left:", top_left)
            print("bottom_right:", bottom_right)
            print("x:", top_left[0])
            print("y:", top_left[1])

            # Now, on our previously filtered image, we place a box or rectangle
            # according to the dimensions established before (top_left and
            # bottom_right), plus a second box that extends to the right edge.
            w, h = img_gray.shape[::-1]
            print("w:", w)
            print("h:", bottom_right[1])
            cv2.rectangle(img_gray, top_left, bottom_right, 255, 2)
            imagen = cv2.rectangle(img_gray, top_left, (w, bottom_right[1]), 255, 2)

            x = top_left[0]
            y = top_left[1]
            h = bottom_right[1]

            # Finally, we crop the section of the image where we established the
            # area to review, and with pytesseract we extract whatever data we
            # can from that cropped image.
            crop_image = img_gray[y:h, x:w]
            cv2.imwrite("croped.jpg", crop_image)
            data = pytesseract.image_to_data(crop_image, config=my_config,
                                             output_type=Output.DICT)
            print(data, "\n")
4) Finally, we create a dictionary to save the euro and dollar rates; if everything goes well, they will be stored correctly.
At the end of this process, a plot is shown to verify that the information was extracted correctly.
            # We create a dictionary to store the values of Euro and Dollar
            i = 0
            currencies = {}
            for value in data["text"]:
                print(value)
                try:
                    currency = value.replace(".", "").replace(",", ".")
                    currency = float(currency)
                    i = i + 1
                    if i == 1:
                        currencies["Euro"] = currency
                    elif i == 2 and currency < currencies["Euro"]:
                        currencies["Dolar"] = currency
                except ValueError:
                    pass

            # We pass the image to string to obtain the currency rates
            text = pytesseract.image_to_string(crop_image, config=my_config)
            print(text)
            print(currencies)

            # We plot the results and confirm that the data extraction and the
            # demarcated area are correct.
            plt.subplot(121), plt.imshow(res, cmap='gray')
            plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
            plt.subplot(122), plt.imshow(img_gray, cmap='gray')
            plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
            plt.suptitle(meth)
            plt.show()
    else:
        print("DOES NOT MATCH")
The results:
With the code and all the logic presented above, it usually works very well on these images, but for some reason it sometimes doesn't read an image properly and doesn't save the information in the desired way.
As can be seen, the area that the code takes is the desired one, but the currency dictionary does not record any information:
This is strange, because if I run the code on a longer list of images, that same image is recognized perfectly.
So the problem is that sometimes it works and sometimes it doesn't, and I'm not quite sure why. Does anyone know what I can polish? What am I doing wrong? Any advice?
I have a 3D film image that contains star-like figures. I have a template of shape (21, 21, 1) that helps to find the initial point (rectangle) on the image. I have done the block matching part and am able to determine some matched coordinates, which are sometimes correct and sometimes incorrect due to the varying pixel intensity of the image. The image and template are both grayscale. The following are my code, results, and the expected results. Any help or idea to solve this problem will be appreciated.
Code for template matching
import cv2
import numpy as np

image = cv2.imread('45_gray.jpg', 0)
template = cv2.imread('45tmpl.jpg', 0)
(tW, tH) = template.shape[::-1]

result = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
#print(result)
threshold = 0.8
(yCoords, xCoords) = np.where(result >= threshold)
clone = image.copy()
#print(yCoords, xCoords)

for (x, y) in zip(xCoords, yCoords):
    # draw the bounding box on the image (it is grayscale, so a single intensity is enough)
    cv2.rectangle(clone, (x, y), (x + tW, y + tH), 255, 3)

# show our output image *before* applying non-maxima suppression
#cv2.imshow("Before NMS", clone)
#cv2.waitKey(0)
cv2.imwrite('output match.jpg', clone)
I don't think this template matching will work out very well here. As you may notice, in the input image there is a gradient from bottom left to top right; the image appears to fade in that direction. So on the top left side the features are not pronounced enough for template matching to work efficiently. I would recommend first converting this image to a binary image using the adaptiveThreshold technique:
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_adaptive = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 15, 5)
Once you have the binary image, which looks consistent (i.e., it does not contain any gradient), you can either follow the template matching path or, as I chose to do in this case, follow an alternative method of detecting the straight lines and then treating their intersections as the points of interest.
Now, detecting the lines could be done using cv2.HoughLinesP, but it has its own set of parameters that need to be tweaked properly to get it working, so I simply counted the number of white pixels present in each row and column respectively and filtered the local maxima. That is, I chose the rows which had a higher white-pixel count than their neighbours and did the same for each column as well (see the HoughLinesP sketch after the code below if you want to try that route instead).
import cv2
import numpy as np
from scipy.signal import argrelextrema
img = cv2.imread("/home/anmol/Downloads/x5JQF.jpg")
img = cv2.resize(img, None, fx=0.4, fy=0.4)
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_adaptive = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 5)
cv2.imwrite("./debug.png", img_adaptive)
row_white_pixel_count = np.array([np.count_nonzero(img_adaptive[i,:]) for i in range(img_adaptive.shape[0])])
col_white_pixel_count = np.array([np.count_nonzero(img_adaptive[:,i]) for i in range(img_adaptive.shape[1])])
row_maxima = argrelextrema(row_white_pixel_count, np.greater, order=50)[0]
col_maxima = argrelextrema(col_white_pixel_count, np.greater, order=50)[0]
all_intersection_coords = []
for row_idx in row_maxima:
    for col_idx in col_maxima:
        all_intersection_coords.append((col_idx, row_idx))

for coord in all_intersection_coords:
    img = cv2.circle(img, coord, 10, (0, 240, 0), 2)
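For reference, if you would rather experiment with the cv2.HoughLinesP route mentioned above, a rough sketch building on img_adaptive and img from the code above could look like the following; the threshold, minLineLength and maxLineGap values here are only guesses and, as noted, would need proper tuning for this image:

# probabilistic Hough transform on the binary image; parameters are rough guesses
lines = cv2.HoughLinesP(img_adaptive, 1, np.pi / 180, 200,
                        minLineLength=img_adaptive.shape[1] // 2, maxLineGap=20)
if lines is not None:
    for line in lines:
        x1, y1, x2, y2 = line[0]
        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)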
I would like to know if a big image contains a small image. The small image can be semi-transparent (similar to a watermark, so it's not a fully filled photo). I've tried following different SO answers on this topic, but they all match the EXACT photo, whereas what I am looking for is whether the photo exists with roughly 80% accuracy, as the photo will be a lossy rendered version of the original one.
This is the procedure for how the images I will be searching in are generated:
Use any photo, put a semi-transparent "watermark" on it within Photoshop and save it. Then I want to check whether the "watermark" exists within the created photo with a certain percentage of accuracy (80% is good enough).
I've tried using the original template matching example provided on their docs page, but I'm getting barely any matches at all.
This is the code I'm using:
import cv2
import numpy as np
img_rgb = cv2.imread('photo2.jpeg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('small-image.png', 0)
w, h = template.shape[::-1]
res = cv2.matchTemplate(img_gray,template,cv2.TM_CCOEFF_NORMED)
threshold = 0.7
loc = np.where( res >= threshold)
for pt in zip(*loc[::-1]):
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
cv2.imshow('output', img_rgb)
cv2.waitKey(0)
Here are the photos I've been using for the test, as this is similar to what I am trying to match.
small-image.png
photo2.jpeg
I am assuming the whole watermark will have the same RGB values and the text will have slightly different RGB values, otherwise this technique will not work. Based on this, we can obtain the RGB value of a pixel of the small image and use it with cv2.inRange to build a mask of those pixel values in the large image. Similarly, a mask is also created for the small image using those pixel values.
import cv2
import numpy as np

small = cv2.imread('small_find.png')
large = cv2.imread('large_find.jpg')

pixel = np.reshape(small[3, 3], (1, 3))
lower = [pixel[0, 0] - 10, pixel[0, 1] - 10, pixel[0, 2] - 10]
lower = np.array(lower, dtype='uint8')
upper = [pixel[0, 0] + 10, pixel[0, 1] + 10, pixel[0, 2] + 10]
upper = np.array(upper, dtype='uint8')

mask = cv2.inRange(large, lower, upper)
mask2 = cv2.inRange(small, lower, upper)
I had to take a buffer value of 20 because the values were not matching exactly in the large image; otherwise a value of only 1 in either upper or lower would be enough. Then we find contours in mask, take the values of its bounding rectangle, cut that part out, and reshape it to the size of mask2.
im, contours, hierarchy = cv2.findContours(mask,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
#cv2.drawContours(large, contours, -1, (0,0,255), 1)
cnt = max(contours, key = cv2.contourArea)
x,y,w,h = cv2.boundingRect(cnt)
wanted_part = mask[y:y+h, x:x+w]
wanted_part = cv2.resize(wanted_part, (mask2.shape[1], mask2.shape[0]), interpolation = cv2.INTER_LINEAR)
The two masks side by side (I inverted them, otherwise they were not visible).
For comparing them you can use any parameter and check whether it satisfies your condition or not. I used mean squared error and got an error of only 6.20, which is very low.
def MSE(img1, img2):
    squared_diff = img1 - img2  # note: the raw difference is used here, it is not actually squared
    summed = np.sum(squared_diff)
    num_pix = img1.shape[0] * img1.shape[1]  # img1 and img2 should have the same shape
    err = summed / num_pix
    return err
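As a usage sketch building on wanted_part and mask2 from above, you could then decide whether the watermark is present with a simple cutoff; the tolerance value is only an assumption (the error of about 6.20 reported above for a good match gives a starting point for tuning it):

error = MSE(wanted_part, mask2)
print("error between the cropped region and the small-image mask:", error)

# the answer above reports ~6.20 for a match, so a tolerance in that range is a reasonable start
tolerance = 10.0  # assumption to tune
if error < tolerance:
    print("Watermark found in the large image")
else:
    print("No convincing match")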
I am trying to identify portions of the image that contain text. For this, I am using OpenCV (v.3) first to pre-process the image and then to add rectangles/boxes to the text portions.
My code below does report some contours. See code, input image and output below.
Code:
import os,sys,cv2,pytesseract
## IMAGE
afile = "test-small.jpg"
def reader(afile):
    aimg = cv2.imread(afile, 0)
    print("Image Shape%s | Size:%s" % (aimg.shape, aimg.size))
    return aimg

def boundbox(aimg):
    out_path2 = "%s-tagged.jpg" % (afile.rpartition(".")[0])
    ret, thresh = cv2.threshold(aimg, 127, 255, 0)
    image, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    acount = 0
    for contour in contours:
        acount += 1
        x, y, w, h = cv2.boundingRect(contour)
        print("Coordinates", x, y, w, h)
        if w < 100 and h < 100:  ## Avoid tagging small objects i.e. false positives
            continue
        cv2.rectangle(aimg, (x, y), (x + w, y + h), (255, 0, 0), 8)
    print("Total contours found:%s" % (acount))
    cv2.imwrite(out_path2, aimg)
    return out_path2

def main():
    aimg = reader(afile)
    bimg = boundbox(aimg)

if __name__ == '__main__':
    main()
Test image:
Output:
The problem is that (1) the rectangles are not visible on the image and (2) the detection of text portions is inaccurate. How can I improve the above code to detect the portions with text?
Thanks for helping.
Bade
Try resizing the image before applying the threshold. You can also try out the erosion and dilation functions before finding contours.
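A minimal sketch of that idea, applied to the code from the question (the scale factor, kernel size, iteration counts and the inverted threshold are only assumptions to tune for your image), could look like:

import cv2
import numpy as np

aimg = cv2.imread("test-small.jpg", 0)

# upscale so that thin strokes survive thresholding (the factor is an assumption)
aimg = cv2.resize(aimg, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)

# invert so the (assumed dark) text becomes white foreground for the morphology steps
ret, thresh = cv2.threshold(aimg, 127, 255, cv2.THRESH_BINARY_INV)

# dilate to merge characters into word/line blobs, then erode to remove small noise
kernel = np.ones((3, 3), np.uint8)
thresh = cv2.dilate(thresh, kernel, iterations=3)
thresh = cv2.erode(thresh, kernel, iterations=1)

# [-2] picks the contour list on both OpenCV 3.x and 4.x
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    if w < 100 and h < 100:  # skip small blobs i.e. false positives, as in the question
        continue
    cv2.rectangle(aimg, (x, y), (x + w, y + h), (255, 0, 0), 8)

cv2.imwrite("test-small-tagged.jpg", aimg)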