How to find a table-like structure in an image with OpenCV - Python

I have an unbordered table like the one in the picture.
I tried to use the example from this post.
I got this result, but I need something like this.
How can I tune OpenCV to get the needed result?

You can achieve this easily using the image_to_data method of pytesseract.
You also need to know:
Pre-processing methods
Page-segmentation-modes
Steps:
Load the image in BGR mode and convert it to grayscale
Get the region-of-interest (ROI) areas
From each area, get the coordinates and draw a rectangle
Result:
Code:
# Load the libraries
import cv2
import pytesseract

# Load the image
img = cv2.imread("1Tksb.jpg")

# Convert to grayscale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# OCR detection
d = pytesseract.image_to_data(gry, config="--psm 6", output_type=pytesseract.Output.DICT)

# Get the ROI parts from the detection
n_boxes = len(d['level'])

# For each detected part
for i in range(n_boxes):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # Draw a rectangle around the detected region
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 1)

# Display
cv2.imshow("img", img)
cv2.waitKey(0)
If you want to read the text, you can use the image_to_string method.
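For example, a minimal sketch that reads all the text at once, reusing the grayscale image and page-segmentation mode from above:

# Read all of the text in the image
txt = pytesseract.image_to_string(gry, config="--psm 6")
print(txt)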

Related

OpenCV Image Classification - Small Dots On Map

I am attempting to create a model that can recognize green circles or green rectangular arrows as shown in the "image".
I figured I would need to make my own classifier and I am using Cascade Trainer to accomplish this. (See repo for the classification images and .xml file.)
Complete code and supporting files here:
https://github.com/ThePieMonster/Computer-Vision
import cv2

# Analyze Screenshots
print("Analyze Screenshots")
img = cv2.imread('Data/image.png')
classifier_path = 'Data/train/classifier/cascade.xml'
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

print("- load classifier")
cascade = cv2.CascadeClassifier(classifier_path)

print("- detect objects")
objects = cascade.detectMultiScale(image=img_rgb, scaleFactor=1.10, minNeighbors=3)

print("- draw rectangle around objects")
for (x, y, w, h) in objects:
    # cv2.rectangle(<image>, (x,y), (x+w,y+h), <rectangle rgb color>, <rectangle thickness>)
    img = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imshow('img', img)
cv2.waitKey(0)
print()
As you can see in the image below, I can't seem to get the classifier to recognize the green dots. There are about 41 green dots in the image and I am hoping to spot them all; currently it spots a few red dots and some random map squares instead. Any help is much appreciated!
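For comparison, strongly colored dots like these can often be isolated with a plain HSV color threshold rather than a trained cascade; a minimal sketch, assuming the same input image (the green hue/saturation bounds below are rough values to tune):

import cv2
import numpy as np

img = cv2.imread('Data/image.png')
# Convert to HSV, where a color range is easy to express as a hue band
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Approximate bounds for saturated green (assumed values, tune for your map)
mask = cv2.inRange(hsv, np.array([40, 80, 80]), np.array([85, 255, 255]))
# Each connected blob of green pixels becomes a candidate dot
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imshow('green dots', img)
cv2.waitKey(0)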

OpenCV and Tesseract on door label detection

I'm fairly new to OpenCV and Tesseract. I'm currently building a project that uses computer vision to detect door labels; hopefully it will be beneficial to the visually impaired.
The idea of the program is to preprocess the input image by converting it to binary, use Canny edge detection to find the outlines of the door label, and then dilate the Canny result. After that, feed the image to Tesseract while showing the detected text with boxes.
The expected result is green rectangles around the text, while printing out the text itself.
The issue is the missing rectangles and the failure of text detection.
I have tried going through these:
Recognize Text in images using Canny Edge detection in Opencv
OpenCv pytesseract for OCR
Image preprocessing with OpenCV before doing character recognition (tesseract)
The questions and solutions there are either too simple or not relevant enough, and some are not in Python.
Attached below is my attempt on the code:
import pytesseract as pytess
import cv2 as cv
import numpy as np
from PIL import Image
from pytesseract import Output

img = cv.imread(r"C:\Users\User\Desktop\dataset\p\Image_31.jpg", 0)

# edges stores the Canny version of img
edges = cv.Canny(img, 100, 200)

# ker as in kernel
# (3, 3) is the matrix while uint8 is the datatype
ker = np.ones((3, 3), np.uint8)

# dil as in dilation
# edges is the src, ker is the kernel we set above, iterations is the number of dilations
dil = cv.dilate(edges, ker, iterations=1)

# set up the pytesseract parameters
configs = r'--oem 3 --psm 6'

# feed the image to tesseract
result = pytess.image_to_data(dil, output_type=Output.DICT, config=configs, lang='eng')
print(result.keys())

boxes = len(result['text'])

# make a new copy of the dilated image
new_item = dil.copy()
for sequence_number in range(boxes):
    if int(result['conf'][sequence_number]) > 30:  # removed constraints
        (x, y, w, h) = (result['left'][sequence_number], result['top'][sequence_number],
                        result['width'][sequence_number], result['height'][sequence_number])
        new_item = cv.rectangle(new_item, (x, y), (x + w, y + h), (0, 255, 0), 2)

# detect sentence with tesseract
# pending as rectangle not achieved
cv.imshow("original", img)
cv.imshow("canny", edges)
cv.imshow("dilation", dil)
cv.imshow("capturedText", new_item)

# ignore below this line, it is only for testing
# testobj = Image.fromarray(dil)
# testtext = pytess.image_to_string(testobj, lang='eng')
# print(testtext)
cv.waitKey(0)
cv.destroyAllWindows()
Resultant image:
The testing part of the code returns results as shown below:
a)
Meets
This obviously does not satisfy the objective.
EDIT
After posting the question, I realized I may have gone about it wrong from the beginning. I should use OpenCV to detect the contour of the door label and isolate the part containing text before sending whatever is in the rectangle for OCR recognition.
EDIT2
Now that I have identified the issue thanks to the Stack Overflow members here, I'm attempting to add an image rectification / image warping technique to retrieve a straight front view and get better accuracy for the system. Update soon.
EDIT3
After some bug fixing, reducing the constraint, and letting the function draw on the original image, I have achieved the results below. The updated code is attached as well.
import cv2 as cv
import numpy as np
import pytesseract as pytess
from pytesseract import Output

# input image source
img = cv.imread(r"C:\Users\User\Desktop\dataset\p\Image_31.jpg")

# necessary image color conversion
img2 = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

# edges stores the Canny version of img
edges = cv.Canny(img2, 100, 200)

# ker as in kernel
# (3, 3) is the matrix while uint8 is the datatype
ker = np.ones((3, 3), np.uint8)

# dil as in dilation
# edges is the src, ker is the kernel we set above, iterations is the number of dilations
dil = cv.dilate(edges, ker, iterations=1)

# set up the pytesseract parameters
configs = r'--oem 3 --psm 6'

# feed the image to tesseract
result = pytess.image_to_data(dil, output_type=Output.DICT, config=configs, lang='eng')

# number of detected boxes
boxes = len(result['text'])

# make a new copy of the dilated image
new_item = dil.copy()
for sequence_number in range(boxes):
    if int(result['conf'][sequence_number]) > 0:  # removed constraints
        (x, y, w, h) = (result['left'][sequence_number], result['top'][sequence_number],
                        result['width'][sequence_number], result['height'][sequence_number])
        # draw rectangle boxes on the original img
        cv.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 3)
        # crop the region
        crp = new_item[y:y + h, x:x + w]
        # OCR
        txt = pytess.image_to_string(crp, config=configs)
        # print the recognised text
        print(txt)
        cv.imshow("capturedText", crp)
        cv.waitKey(0)

# cv.imshow("original", img)
# cv.imshow("canny", edges)
# cv.imshow("dilation", dil)
cv.imshow("results", img)
cv.waitKey(0)
cv.destroyAllWindows()
You have found all the detected text in the image:
for sequence_number in range(boxes):
    if int(result['conf'][sequence_number]) > 30:
        (x, y, w, h) = (result['left'][sequence_number], result['top'][sequence_number],
                        result['width'][sequence_number], result['height'][sequence_number])
        new_item = cv.rectangle(new_item, (x, y), (x + w, y + h), (0, 255, 0), 2)
But you also say the confidence should be more than 70%. If we remove the constraint and OCR each new item, the result will be:
Now if you read:
txt = pytesseract.image_to_string(new_item, config="--psm 6")
print(txt)
The OCR output will be:
Meeting Room ยง
This is the output of the current pytesseract version, 0.3.7.
Code:
# Load the libraries
import cv2
import pytesseract

# Load the image
img = cv2.imread("fsUSw.png")

# Convert it to grayscale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# OCR detection
d = pytesseract.image_to_data(gry, config="--psm 6", output_type=pytesseract.Output.DICT)

# Get the ROI parts from the detection
n_boxes = len(d['level'])

# For each detected part
for i in range(1, 2):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # Draw a rectangle around the detected region
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 5)
    # Crop the region
    crp = gry[y:y + h, x:x + w]
    # OCR
    txt = pytesseract.image_to_string(crp, config="--psm 6")
    print(txt)
    # Display the cropped image
    cv2.imshow("crp", crp)
    cv2.waitKey(0)

# Display
cv2.imshow("img", img)
cv2.waitKey(0)
I think what you are looking for here is image rectification (warping the image to make it look as if it was taken from another point of view), and there are tools for this in Python. However, the problem gets more complicated, since in your case you need to detect how you want to rectify it. I am not sure how you should go about that.
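For reference, once the four corners of a label are known, the warp itself is only a few lines in OpenCV; a minimal sketch, where the corner coordinates and the input path are placeholders standing in for whatever your detection step finds:

import cv2
import numpy as np

img = cv2.imread("label.jpg")  # placeholder path
# Corners of the skewed label in the source image, ordered
# top-left, top-right, bottom-right, bottom-left (assumed values)
src = np.float32([[52, 30], [410, 55], [400, 290], [40, 260]])
# Where those corners should land in the rectified image
w, h = 400, 240
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
# Compute the 3x3 perspective transform and apply it
M = cv2.getPerspectiveTransform(src, dst)
rectified = cv2.warpPerspective(img, M, (w, h))
cv2.imshow("rectified", rectified)
cv2.waitKey(0)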

How to customize parameters for face detection by using face cascade

I have a simple face detection implementation, as follows:
import cv2

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
filename = "path/to/image"
img = cv2.imread(filename)
cv2.imshow("Original image", img)

face_region = face_cascade.detectMultiScale(img, 1.1, 4)
for (x, y, w, h) in face_region:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)

cv2.imshow("Output", img)
cv2.waitKey(0)
After running the code, I got the following result.
As you can see, the implementation detects two faces! How can I get rid of this kind of false detection?
First delete the textual data, as in this link: Delete OCR word from Image (OpenCV, Python).
After that, try your face detection code again; it should improve your accuracy.

OpenCV crop image and display the original with crop removed

I have an image as follows:
On this image I am running a face detection code as follows:
More details, in case you want to run this code, are here.
Face detector:
import cv2

# Load the cascade
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Read the input image
img = cv2.imread('5.jpg')

# Convert into grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(gray, 1.1, 4)

# Draw a rectangle around each face
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x - 20, y - 70), (x + w + 50, y + h + 50), (255, 0, 0), 2)
    crop_img = img[y:y + h, x:x + w]
    cv2.imwrite("cropped.jpg", crop_img)
As output of this code I get:
What I want is:
In the output I want the entire image, but with the face removed. How can I achieve this result?
Just use numpy to overwrite the pixels:
img[y1:y1+h, x1:x1+w, 0] = 0
img[y1:y1+h, x1:x1+w, 1] = 0
img[y1:y1+h, x1:x1+w, 2] = 0
Note that NumPy indexing puts rows first and columns second (the opposite of the common (x, y) convention).
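Applied to the detection loop from the question (same variable names; the output filename is a placeholder), this might look like:

for (x, y, w, h) in faces:
    # Zero all three channels of the face region in one slice assignment
    img[y:y+h, x:x+w] = 0
cv2.imwrite("face_removed.jpg", img)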

How to use openCV and HAAR Cascades to blur faces?

I would like to know if there is a way to blur the faces that have been automatically identified by the Haar cascade face classifier.
Using the code below, I'm able to detect the faces, crop the image around each face, or draw a rectangle on it.
image = cv2.imread(imagepath)

# Specify the trained cascade classifier
face_cascade_name = "./haarcascade_frontalface_alt.xml"

# Create a cascade classifier
face_cascade = cv2.CascadeClassifier()

# Load the specified classifier
face_cascade.load(face_cascade_name)

# Preprocess the image
grayimg = cv2.cvtColor(image, cv2.cv.CV_BGR2GRAY)
grayimg = cv2.equalizeHist(grayimg)

# Run the classifier
faces = face_cascade.detectMultiScale(grayimg, 1.1, 2, 0|cv2.cv.CV_HAAR_SCALE_IMAGE, (30, 30))
print "Faces detected"

if len(faces) != 0:    # If there are faces in the image
    for f in faces:    # For each face in the image
        # Get the origin coordinates and the length and width of the face
        x, y, w, h = [v for v in f]
        # Draw rectangles around all the faces
        cv2.rectangle(image, (x, y), (x+w, y+h), (255, 255, 255))
        sub_face = image[y:y+h, x:x+w]
        for i in xrange(1, 31, 2):
            cv2.blur(sub_face, (i, i))
        face_file_name = "./face_" + str(y) + ".jpg"
        cv2.imwrite(face_file_name, sub_face)
But I would like to blur the face of the people so they can't be recognized.
Do you have an idea on how to do that?
Thanks for your help
Arnaud
I finally succeeded in doing what I wanted.
To do that, apply a Gaussian blur as Hammer suggested.
The code is:
image = cv2.imread(imagepath)
result_image = image.copy()

# Specify the trained cascade classifier
face_cascade_name = "./haarcascade_frontalface_alt.xml"

# Create a cascade classifier
face_cascade = cv2.CascadeClassifier()

# Load the specified classifier
face_cascade.load(face_cascade_name)

# Preprocess the image
grayimg = cv2.cvtColor(image, cv2.cv.CV_BGR2GRAY)
grayimg = cv2.equalizeHist(grayimg)

# Run the classifier
faces = face_cascade.detectMultiScale(grayimg, 1.1, 2, 0|cv2.cv.CV_HAAR_SCALE_IMAGE, (30, 30))
print "Faces detected"

if len(faces) != 0:    # If there are faces in the image
    for f in faces:    # For each face in the image
        # Get the origin coordinates and the length and width of the face
        x, y, w, h = [v for v in f]
        # Get the rectangle img around all the faces
        cv2.rectangle(image, (x, y), (x+w, y+h), (255, 255, 0), 5)
        sub_face = image[y:y+h, x:x+w]
        # Apply a Gaussian blur on this new rectangle image
        sub_face = cv2.GaussianBlur(sub_face, (23, 23), 30)
        # Merge this blurry rectangle into our final image
        result_image[y:y+sub_face.shape[0], x:x+sub_face.shape[1]] = sub_face
        face_file_name = "./face_" + str(y) + ".jpg"
        cv2.imwrite(face_file_name, sub_face)

# cv2.imshow("Detected face", result_image)
cv2.imwrite("./result.png", result_image)
Arnaud
The whole end of your code can be replaced by:
img[startX:endX, startY:endY] = cv2.blur(img[startX:endX, startY:endY], (23, 23))
instead of:
# Get the origin coordinates and the length and width of the face
x, y, w, h = [v for v in f]
# get the rectangle img around all the faces
cv2.rectangle(image, (x, y), (x+w, y+h), (255, 255, 0), 5)
sub_face = image[y:y+h, x:x+w]
# apply a Gaussian blur on this new rectangle image
sub_face = cv2.GaussianBlur(sub_face, (23, 23), 30)
# merge this blurry rectangle into our final image
result_image[y:y+sub_face.shape[0], x:x+sub_face.shape[1]] = sub_face
Especially since you don't need a circular mask, this is (to me) much easier to read.
PS: Sorry for not commenting, I don't have enough reputation to do it. Even though the post is 5 years old, I guess this may be worth it, as I found it via this particular question.
Note: neural networks (like ResNet) are now more accurate than Haar cascades at detecting faces, and they are also integrated into OpenCV. Using them may work better than the solutions mentioned in this question.
However, the code to blur / pixelate a face is still applicable.
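For illustration, a minimal sketch of that approach with OpenCV's DNN module and the res10 SSD face model from the OpenCV samples; the model file paths are assumptions about where you downloaded them, and the 0.5 confidence threshold is a value to tune:

import cv2
import numpy as np

# Paths are placeholders: the prototxt and caffemodel ship with OpenCV's samples
net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "res10_300x300_ssd_iter_140000.caffemodel")
img = cv2.imread("image.jpg")
h, w = img.shape[:2]
# The model expects a 300x300 BGR blob with the training-set mean subtracted
blob = cv2.dnn.blobFromImage(cv2.resize(img, (300, 300)), 1.0, (300, 300), (104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()
for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.5:
        # Box coordinates are returned normalized to [0, 1]
        x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * np.array([w, h, w, h])).astype(int)
        # Blur the detected face exactly as with the cascade results
        img[y1:y2, x1:x2] = cv2.blur(img[y1:y2, x1:x2], (23, 23))
cv2.imwrite("result_dnn.png", img)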
You can also pixelate the face region by replacing it with blocks filled with the average RGB value of each zone of the face.
A function performing this could look like this:
from itertools import product

import cv2
import numpy as np

def pixelate_image(image: np.ndarray, nb_blocks=5, in_place=False) -> np.ndarray:
    """Return a pixelated version of a picture (needs to be fed a face to pixelate)"""
    # To pixelate, we will split into a given number of blocks
    # For each block, we will compute the average of RGB values of the block
    # And then we can just replace with a rectangle of this color
    # divide the input image into NxN blocks
    if not in_place:
        image = np.copy(image)
    h, w = image.shape[:2]
    blocks = tuple(
        np.linspace(0, d, nb_blocks + 1, dtype="int") for d in (w, h)
    )
    for i, j in product(*[range(1, len(s)) for s in blocks]):
        # compute the starting and ending (x, y)-coordinates
        # for the current block
        start = blocks[0][i - 1], blocks[1][j - 1]
        end = blocks[0][i], blocks[1][j]
        # extract the ROI using NumPy array slicing, compute the
        # mean of the ROI, and then draw a rectangle with the
        # mean RGB values over the ROI in the original image
        roi = image[start[1]:end[1], start[0]:end[0]]
        bgr = [int(x) for x in cv2.mean(roi)[:3]]
        cv2.rectangle(image, start, end, bgr, -1)
    return image
You then just need to use it in a function like this (updated to Python 3 with pathlib and type hints):
from pathlib import Path
from typing import Union

import cv2
import numpy as np

PathLike = Union[Path, str]

face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_alt.xml")

def pixelate_faces_haar(img_path: PathLike, dest: Path):
    """Pixelate faces of people with OpenCV and save to a destination file"""
    img = cv2.imread(str(img_path))
    # To use the cascade, we need grayscale images
    # We can then detect faces
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    for (x, y, width, height) in faces:
        roi = img[y:y+height, x:x+width]
        pixelate_image(roi, 15, in_place=True)
    dest.parent.mkdir(parents=True, exist_ok=True)
    cv2.imwrite(str(dest), img)
    print(f"Saved pixelated version of {img_path} to {dest}")
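A call might then look like this (paths are placeholders):

pixelate_faces_haar("people.jpg", Path("output/people_pixelated.jpg"))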
