Python OCR Sudoku image - python

I have searched and found the following python code but it doesn't return the result as expected. I need to use ocr to convert the numbers on the sudoku image and read it as a grid
import cv2
from imutils import contours
import numpy as np
# Load image, grayscale, and adaptive threshold
image = cv2.imread('Sample.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,57,5)
# Filter out all numbers and noise to isolate only boxes
cnts = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
area = cv2.contourArea(c)
if area < 1000:
cv2.drawContours(thresh, [c], -1, (0,0,0), -1)
# Fix horizontal and vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, vertical_kernel, iterations=9)
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,1))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, horizontal_kernel, iterations=4)
# Sort by top to bottom and each row by left to right
invert = 255 - thresh
cnts = cv2.findContours(invert, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts, _) = contours.sort_contours(cnts, method="top-to-bottom")
sudoku_rows = []
row = []
for (i, c) in enumerate(cnts, 1):
area = cv2.contourArea(c)
if area < 50000:
row.append(c)
if i % 9 == 0:
(cnts, _) = contours.sort_contours(row, method="left-to-right")
sudoku_rows.append(cnts)
row = []
# Iterate through each box
for row in sudoku_rows:
for c in row:
mask = np.zeros(image.shape, dtype=np.uint8)
cv2.drawContours(mask, [c], -1, (255,255,255), -1)
result = cv2.bitwise_and(image, mask)
result[mask==0] = 255
cv2.imshow('result', result)
cv2.waitKey(175)
cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.waitKey()
I have no great idea how to solve such a problem and forgive me if I was a beginner.
Here's sample of the image.

The best I could do CLI wise was run the image via any converter into PNM format which is preferred for most OCR apps, however most OCR apps will convert to Plain Text and those 7 may sometimes be seen as T (easy enough in this simplified case to Find and Replace).
The BIGGER hurdle is OCR just like PDF has no concept of Indents or margins so now we get this output. And no amount of correction of char spacing would help.
Thus your solution may rely on convert Image to vector placement by convert to PDF XY positions then with PDF OCR attempt to get the Character Layout from the pdf extraction result.
Python libs have data frame solutions that attempt to maintain tabular positions, however I don't do python to suggest which one can do this well.

Related

How can I get the required ROI in the below image using OpenCV (Python)?

So, I'm working on finding the small components on an electric chip that can be seen in the image. Till now, I have been working on finding the contours and then applying the Morphology operation and drawing the rectangles. I am attaching the original, required, and achieved image along with code so that it can be easy for the community to understand the problem.
Input:
import cv2
import os
# Load iamge, grayscale, adaptive threshold
image = cv2.imread('4.jpg')
result = image.copy()
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
cv2.THRESH_BINARY_INV,51,9)
# Fill rectangular contours
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(thresh, [c], -1, (255,255,255), -1)
# Morph opend
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=4)
# Draw rectangles
cnts = cv2.findContours(opening, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 3)
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('image', image)
cv2.waitKey()
Current result:
Expected output:
Any help will be appreciated. Thanks
Template matching could be already enough. Just give it a try with the ROI as the template image. You may vary the score to get the tilted elements.
Have a look here:
https://docs.opencv.org/master/d4/dc6/tutorial_py_template_matching.html
i think you can use matchTemplate, but while you use the function, you will get more then one similar point,
for example, your image have three object, the first object while have multi similarity points, it depends on your threshold, and you need to eliminate that.
in my experience, i use half size of the target image to be the bound distance, sorting the result point, then suppose the first point is the best point, then eliminate the similarity point inside bound distance near the best point. the next point will be your second object's point.

How to find the contour of a completed form scanned image?

I would like to detect the contour of the completed form in this scan.
Ideally I would want to find the corners of the table painted with red.
My final goal is to detect that the whole document was scanned and that the four corners are within the boundaries of the scan.
I used OpenCV from python - but it was not able to find the contour of the big container.
Any ideas?
With the observation that the form can be identified using the table grid, here's a simple approach:
Obtain binary image. Load the image, grayscale, Gaussian blur, then Otsu's threshold to get a binary image
Find horizontal sections. We create a horizontal shaped kernel and find horizontal table lines and draw onto a mask
Find vertical sections. We create a vertical shaped kernel and find vertical table lines and draw onto a mask
Fill text document body and morph open. We perform morph operations to close the table then find contours and fill the mask to obtain a contour of the shape. This step fulfills your needs since you can just find contours on the mask but we can go further and extract only the desired sections.
Perform four-point perspective transform. We find contours, sort for the largest contour, sort using contour approximation then perform a four-point perspective transform to obtain a birds eye view of the image.
Here's the results:
Input image
Detected contour to extract highlighted in green
Output after 4-point perspective transform
Code
import cv2
import numpy as np
from imutils.perspective import four_point_transform
# Load image, create mask, grayscale, and Otsu's threshold
image = cv2.imread('1.jpg')
mask = np.zeros(image.shape, dtype=np.uint8)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,11,3)
# Find horizontal sections and draw on mask
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (80,1))
detect_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detect_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(mask, [c], -1, (255,255,255), -1)
# Find vertical sections and draw on mask
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,50))
detect_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(detect_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(mask, [c], -1, (255,255,255), -1)
# Fill text document body
mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
close = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, close_kernel, iterations=3)
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(mask, [c], -1, 255, -1)
# Perform morph operations to remove noise
# Find contours and sort for largest contour
opening = cv2.morphologyEx(mask, cv2.MORPH_OPEN, close_kernel, iterations=5)
cnts = cv2.findContours(opening, cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None
for c in cnts:
# Perform contour approximation
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.02 * peri, True)
if len(approx) == 4:
displayCnt = approx
break
# Obtain birds' eye view of image
warped = four_point_transform(image, displayCnt.reshape(4, 2))
cv2.imwrite('mask.png', mask)
cv2.imwrite('thresh.png', thresh)
cv2.imwrite('warped.png', warped)
cv2.imwrite('opening.png', opening)
What about using the Hough transform with a narrow direction range, to find the verticals and horizontals ? If you are lucky, those that you need will be the longest, and after selecting them you can reconstruct the rectangle.

How to extract bottom part from engineering drawing image using python?

My input image
To extract highlighted part
My desired output
Please someone help and give me a suggestion. My images looks like this. This is just sample one. I need to crop the bottom template part and do OCR. I have attached my desire output picture. Please have a look. How to implement it using python?
PS: The sheet size will differ and there may the chance of template to dislocate. but mostly it will be in lower left corner
Here's a potential approach:
Obtain binary image. We convert to grayscale, Gaussian blur, then Otsu's threshold
Fill in potential contours. We iterate through contours and filter using contour approximation to determine if they are rectangular.
Perform morphological operations. We morph open to remove non-rectangular contours using a rectangular kernel.
Filter and extract desired contour. Find contours and filter using contour approximation, aspect ratio, and contour area to isolate the desired contour. Then extract using Numpy slicing.
Binary image
Filled in contours
Morphological operation to remove non-rectangular contours
Desired contour highlighted in green
Extracted ROI
Code
import cv2
# Grayscale, blur, and threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Fill in potential contours
cnts = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.05 * peri, True)
if len(approx) == 4:
cv2.drawContours(thresh, [c], -1, (255,255,255), -1)
# Remove non rectangular contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40,10))
close = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)
# Filtered for desired contour
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.05 * peri, True)
x,y,w,h = cv2.boundingRect(approx)
aspect_ratio = w / float(h)
area = cv2.contourArea(approx)
if len(approx) == 4 and w > h and aspect_ratio > 2.75 and area > 45000:
cv2.drawContours(image, [c], -1, (36,255,12), -1)
ROI = original[y:y+h, x:x+w]
cv2.imwrite('image.png', image)
cv2.imwrite('ROI.png', ROI)
cv2.waitKey()

How to extract text from table in image?

I have data which in a structured table image. The data is like below:
I tried to extract the text from this image using this code:
import pytesseract
from PIL import Image
value=Image.open("data/pic_table3.png")
text = pytesseract.image_to_string(value, lang="eng")
print(text)
and, here is the output:
EA Domains
Traditional role
Future role
Technology e Closed platforms ¢ Open platforms
e Physical e Virtualized
Applicationsand |e Proprietary e Inter-organizational
Integration e Siloed composite
e P2P integrations applications
e EAI technology e Software asa Service
e Enterprise Systems e Service-Oriented
e Automating transactions Architecture
e “Informating”
interactions
However, the expected data output should be aligned according to the column and row. How can I do that?
You must preprocess the image to remove the table lines and dots before throwing it into OCR. Here's an approach using OpenCV.
Load image, grayscale, and Otsu's threshold
Remove horizontal lines
Remove vertical lines
Dilate to connect text and remove dots using contour area filtering
Bitwise-and to reconstruct image
OCR
Here's the processed image:
Result from Pytesseract
EA Domains Traditional role Future role
Technology Closed platforms Open platforms
Physical Virtualized
Applications and Proprietary Inter-organizational
Integration Siloed composite
P2P integrations applications
EAI technology Software as a Service
Enterprise Systems Service-Oriented
Automating transactions Architecture
“‘Informating”
interactions
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, and Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (50,1))
detect_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detect_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(thresh, [c], -1, (0,0,0), 2)
# Remove vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,15))
detect_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(detect_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(thresh, [c], -1, (0,0,0), 3)
# Dilate to connect text and remove dots
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10,1))
dilate = cv2.dilate(thresh, kernel, iterations=2)
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
area = cv2.contourArea(c)
if area < 500:
cv2.drawContours(dilate, [c], -1, (0,0,0), -1)
# Bitwise-and to reconstruct image
result = cv2.bitwise_and(image, image, mask=dilate)
result[dilate==0] = (255,255,255)
# OCR
data = pytesseract.image_to_string(result, lang='eng',config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.imshow('dilate', dilate)
cv2.waitKey()
You might want to detect the cells first, as shown in this image. You can do it using a hough line transform, a library provided by OpenCV. After that, you can use the detected lines to select the ROI and then extract the text for each cell.
For detailed explanation, kindly visit my blogpost

How to remove all portrait pictures from a document

I am working on OCRing a document image. I want to detect all pictures and remove from the document image. I want to retain tables in the document image. Once I detect pictures I will remove and then want to OCR. I tried to find contour tried to detect all the bigger areas. unfortunately it detects tables also. Also how to remove the objects keeping other data in the doc image. I am using opencv and python
Here's my code
import os
from PIL import Image
import pytesseract
img = cv2.imread('block2.jpg' , 0)
mask = np.ones(img.shape[:2], dtype="uint8") * 255
ret,thresh1 = cv2.threshold(img,127,255,0)
contours, sd = cv2.findContours(thresh1,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
areacontainer = []
for cnt in contours:
area = cv2.contourArea(cnt)
areacontainer.append(area)
avgArea = sum(areacontainer)/len(areacontainer)
[enter code here][1]
for c in contours:# average area heuristics
if cv2.contourArea(c)>6*avgArea:
cv2.drawContours(mask, [c], -1, 0, -1)
binary = cv2.bitwise_and(img, img, mask=mask) # subtracting
cv2.imwrite("bin.jpg" , binary)
cv2.imwrite("mask.jpg" , mask)
Here's an approach:
Convert image to grayscale and Gaussian blur
Perform canny edge detection
Perform morphological operations to smooth image
Find contours and filter using a minimum/maximum threshold area
Remove portrait images
Here's the detected portraits highlighted in green
Now that we have the bounding box ROIs, we can effectively remove the pictures by filling them in with white. Here's the result
import cv2
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
canny = cv2.Canny(blur, 120, 255, 1)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
close = cv2.morphologyEx(canny, cv2.MORPH_CLOSE, kernel, iterations=2)
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
area = cv2.contourArea(c)
if area > 15000 and area < 35000:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (255,255,255), -1)
cv2.imshow('image', image)
cv2.waitKey()

Categories