I have some sketched images where the images contain text captions. I am trying to remove those caption.
I am using this code:
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Find contours and filter using contour area
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
area = cv2.contourArea(c)
if area > 500:
cv2.drawContours(thresh, [c], -1, 0, -1)
# Invert image and OCR
invert = 255 - thresh
Output= thresh - invert
cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.imshow('output', output)
cv2.waitKey()
Code not working for these images.
The cv2 pre-processing is unecessary here, tesseract is able to find the text on its own. See the example below, commented inline:
results = pytesseract.image_to_data('1.png', config='--psm 11', output_type='dict')
for i in range(len(results["text"])):
# extract the bounding box coordinates of the text region from
# the current result
x = results["left"][i]
y = results["top"][i]
w = results["width"][i]
h = results["height"][i]
# Extract the confidence of the text
conf = int(results["conf"][i])
if conf > 60: # adjust to your liking
# Cover the text with a white rectangle
cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 255), -1)
Detected text on the left, cleaned image on the right:
Another option, without using Tesseract. Just use the area of the contours to filter the smaller ones by covering them with white-filled rectangles:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "C://opencvImages//"
inputImage = cv2.imread(imagePath+"0enxN.png")
# Convert BGR to grayscale:
binaryImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Invert image:
binaryImage = 255 - binaryImage
# Find the external contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Invert image:
binaryImage = 255 - binaryImage
# Look for the bounding boxes:
for _, c in enumerate(contours):
# Get the contour's bounding rectangle:
boundRect = cv2.boundingRect(c)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Get Bounding Rectangle Area:
rectArea = rectWidth * rectHeight
# Set minimum area threshold:
minArea = 1000
# Check for minimum area:
if rectArea < minArea:
# Draw white rectangle to cover small contour:
cv2.rectangle(binaryImage, (rectX, rectY), (rectX + rectWidth, rectY + rectHeight),
(255, 255, 255), -1)
cv2.imshow("Binary Mask", binaryImage)
cv2.waitKey(0)
This produces:
Related
I would like to find all the big elements in the document, but I do not know how to control the size (the document is downloaded from the Internet :))
I have a document
And I wrote a simple code
import cv2
import pytesseract
image = cv2.imread('2.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7, 7), 0)
thresh = cv2.threshold(
blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernal = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 50))
dilate = cv2.dilate(thresh, kernal, iterations=1)
cv2.imwrite('1_dilated.png', dilate)
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=lambda x: cv2.boundingRect(x)[1])
for c in cnts:
x, y, w, h = cv2.boundingRect(c)
if h > 100 and w > 100:
roi = image[y:y+h, x:x+w]
cv2.rectangle(image, (x, y), (x+w, y+h), (36, 255, 12), 2)
# ocr = pytesseract.image_to_string(roi)
# print(ocr)
cv2.imwrite('1_boxes4.png', image)
But only detects it
And I would like this
How to control the size of the detected area ?
Thank you very much for all your comments
You are close, but you need to increase the number of iterations of the dilate operation. Also, a rectangular structuring element might help better forming the blobs of text. Let's check out some possible improvements of your code:
# imports:
import cv2
import numpy as np
# Set image path
imagePath = "D://opencvImages//"
imageName = "F74Yq.png"
# Read image:
inputImage = cv2.imread(imagePath + imageName)
# Store a deeep copy for results:
inputCopy = inputImage.copy()
# Convert BGR to grayscale:
grayInput = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu
_, binaryImage = cv2.threshold(grayInput, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
The first part produces the binary image of the input image, there's nothing fancy going on here - just a direct thresholding via Otsu's method. This is the binary image obtained:
Now, let's apply the dilate operation. Let's use a 9 x 9 rectangular kernel and set the number of iterations to 5. Gotta be careful you don't dilate too much, because blobs of text from different portions of the document could end up joined:
# Set kernel (structuring element) size:
kernelSize = (9, 9)
# Set operation iterations:
opIterations = 5
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, kernelSize)
# Perform Dilate:
dilateImage = cv2.morphologyEx(binaryImage, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
This is the result:
Ok, now let's just detect external contours and get their bounding boxes so we can draw rectangles around the target areas. Note that I'm drawing the rectangles on a deep copy of the input:
# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(dilateImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):
# Get the contours bounding rectangle:
boundRect = cv2.boundingRect(c)
# Get the dimensions of the bounding rectangle:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Set bounding rectangle:
color = (0, 0, 255)
cv2.rectangle( inputCopy, (int(rectX), int(rectY)),
(int(rectX + rectWidth), int(rectY + rectHeight)), color, 5 )
cv2.imshow("Bounding Rectangles", inputCopy)
cv2.waitKey()
This is the final result:
I would like to get the coordinates of the box around the initial ("H") on the following page (and similar ones with other initials, so opencv template matching is not an option):
Following this tutorial, I tried to solve the problem with opencv contours:
import cv2
import matplotlib.pyplot as plt
page = "image.jpg"
# read the image
image = cv2.imread(page)
# convert to RGB
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# create a binary thresholded image
_, binary = cv2.threshold(gray, 0,150,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# find the contours from the thresholded image
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# draw all contours
image = cv2.drawContours(image, contours, 3, (0, 255, 0), 2)
plt.savefig("result.png")
The result is of course not exactly what I wanted:
Does anyone know of an viable algorithm (and possibly an implementation thereof) that could provide an easy solution to my task?
You can find the target area by filtering your contours. Now, there's at least two filtering criteria that you can use. One is filter by area - that is, discard too small and too large contours until you get the contour you are looking for. The other one is by computing the extent of every contour. The extent is the ratio of the contour's area to its bounding rectangle area. You are looking for a square-like contour, so its extent should be close to 1.0.
Let's see the code:
# imports:
import cv2
import numpy as np
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert RGB to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Get binary image via Otsu:
_, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
The first portion of the code gets you a binary image that you can use as a mask to compute contours:
Now, let's filter contours. Let's use the area approach first. You need to define a range of minimum area and maximum area to filter everything that does not fall in this range. I've heuristically determined a range of areas from 30000 px to 150000 px:
# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):
# Get blob area:
currentArea = cv2.contourArea(c)
print("Contour Area: "+str(currentArea))
# Set an area range:
minArea = 30000
maxArea = 150000
if minArea < currentArea < maxArea:
# Get the contour's bounding rectangle:
boundRect = cv2.boundingRect(c)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Set bounding rect:
color = (0, 0, 255)
cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
(int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )
cv2.imshow("Rectangles", inputImageCopy)
cv2.waitKey(0)
Once you successfully filter the area, you can then compute the bounding rectangle of the contour with cv2.boundingRect. You can retrieve the bounding rectangle's x, y (top left) coordinates as well as its width and height. After that just draw the rectangle on a deep copy of the original input.
Now, let's see the second option, using the contour's extent. The for loop gets modified as follows:
# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):
# Get blob area:
currentArea = cv2.contourArea(c)
# Get the contour's bounding rectangle:
boundRect = cv2.boundingRect(c)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Calculate extent:
extent = float(currentArea)/(rectWidth *rectHeight)
print("Extent: " + str(extent))
# Set the extent filter, look for an extent close to 1.0:
delta = abs(1.0 - extent)
epsilon = 0.1
if delta < epsilon:
# Set bounding rect:
color = (0, 0, 255)
cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
(int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )
cv2.imshow("Rectangles", inputImageCopy)
cv2.waitKey(0)
Both approaches yield this result:
You almost have it. You just need to filter contours on area and aspect ratio. Here is my approach in Python/OpenCV.
Input:
import cv2
import numpy as np
# read image as grayscale
img = cv2.imread('syriados.jpg')
# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold to binary
#thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY)[1]
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
# invert threshold
thresh = 255 - thresh
# apply morphology to remove small white regions and to close the rectangle boundary
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
morph = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7,7))
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# find contours
result = img.copy()
cntrs = cv2.findContours(morph, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cntrs = cntrs[0] if len(cntrs) == 2 else cntrs[1]
# filter on area and aspect ratio
for c in cntrs:
area = cv2.contourArea(c)
x,y,w,h = cv2.boundingRect(c)
if area > 10000 and abs(w-h) < 100:
cv2.drawContours(result, [c], 0, (0,0,255), 2)
# write results
cv2.imwrite("syriados_thresh.jpg", thresh)
cv2.imwrite("syriados_morph.jpg", morph)
cv2.imwrite("syriados_box.jpg", result)
# show results
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.imshow("result", result)
cv2.waitKey(0)
Threshold image:
Morphology image:
Resulting contour image:
To get a result like this:
You'll need to detect the contour in the image with the second to the greatest area, as the one possessing the greatest area would be the border of the image.
So with the list of contours, we can get the one with the second greatest area via the built-in sorted method, using the cv2.contourArea method as the custom key:
import cv2
import numpy as np
def process(img):
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_blur = cv2.GaussianBlur(img_gray, (7, 7), 2)
img_canny = cv2.Canny(img_blur, 50, 50)
kernel = np.ones((6, 6))
img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
img_erode = cv2.erode(img_dilate, kernel, iterations=2)
return img_erode
def get_contours(img):
contours, _ = cv2.findContours(process(img), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnt = sorted(contours, key=cv2.contourArea)[-2]
peri = cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, 0.02 * peri, True)
cv2.drawContours(img, [approx], -1, (0, 255, 0), 2)
page = "image.jpg"
image = cv2.imread(page)
get_contours(image)
cv2.imshow("Image", image)
cv2.waitKey(0)
The above only puts the area of the contours into consideration; if you want more reliable results, you can make it so that it will only detect contours that are 4-sided.
I am using pytessearct to extract the text from images. But it doesn't work on images which are inclined. Consider the image given below:
Here is the code to extract text, which is working fine on images which are not inclined.
img = cv2.imread(<path_to_image>)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5,5),0)
ret3, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
def findSignificantContours (img, edgeImg):
contours, heirarchy = cv2.findContours(edgeImg, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
# Find level 1 contours
level1 = []
for i, tupl in enumerate(heirarchy[0]):
# Each array is in format (Next, Prev, First child, Parent)
# Filter the ones without parent
if tupl[3] == -1:
tupl = np.insert(tupl, 0, [i])
level1.append(tupl)
significant = []
tooSmall = edgeImg.size * 5 / 100 # If contour isn't covering 5% of total area of image then it probably is too small
for tupl in level1:
contour = contours[tupl[0]];
area = cv2.contourArea(contour)
if area > tooSmall:
significant.append([contour, area])
# Draw the contour on the original image
cv2.drawContours(img, [contour], 0, (0,255,0),2, cv2.LINE_AA, maxLevel=1)
significant.sort(key=lambda x: x[1])
#print ([x[1] for x in significant]);
mx = (0,0,0,0) # biggest bounding box so far
mx_area = 0
for cont in contours:
x,y,w,h = cv2.boundingRect(cont)
area = w*h
if area > mx_area:
mx = x,y,w,h
mx_area = area
x,y,w,h = mx
# Output to files
roi = img[y:y+h,x:x+w]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5,5),0)
ret3, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
cv2_imshow(thresh)
text = pytesseract.image_to_string(roi);
print(text); print("\n"); print(pytesseract.image_to_string(thresh));
print("\n")
return [x[0] for x in significant];
edgeImg_8u = np.asarray(thresh, np.uint8)
# Find contours
significant = findSignificantContours(img, edgeImg_8u)
mask = thresh.copy()
mask[mask > 0] = 0
cv2.fillPoly(mask, significant, 255)
# Invert mask
mask = np.logical_not(mask)
#Finally remove the background
img[mask] = 0;
Tesseract can't extract the text from this image. Is there a way I can rotate it to align the text perfectly and then feed it to pytesseract? Please let me know if my question require any more clarity.
Here's a simple approach:
Obtain binary image. Load image, convert to grayscale,
Gaussian blur, then Otsu's threshold.
Find contours and sort for largest contour. We find contours then filter using contour area with cv2.contourArea() to isolate the rectangular contour.
Perform perspective transform. Next we perform contour approximation with cv2.contourArea() to obtain the rectangular contour. Finally we utilize imutils.perspective.four_point_transform to actually obtain the bird's eye view of the image.
Binary image
Result
To actually extract the text, take a look at
Use pytesseract OCR to recognize text from an image
Cleaning image for OCR
Detect text area in an image using python and opencv
Code
from imutils.perspective import four_point_transform
import cv2
import numpy
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread("1.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Find contours and sort for largest contour
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None
for c in cnts:
# Perform contour approximation
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.02 * peri, True)
if len(approx) == 4:
displayCnt = approx
break
# Obtain birds' eye view of image
warped = four_point_transform(image, displayCnt.reshape(4, 2))
cv2.imshow("thresh", thresh)
cv2.imshow("warped", warped)
cv2.waitKey()
To Solve this problem you can also use minAreaRect api in opencv which will give you a minimum area rotated rectangle with an angle of rotation. You can then get the rotation matrix and apply warpAffine for the image to straighten it. I have also attached a colab notebook which you can play around on.
Colab notebook : https://colab.research.google.com/drive/1SKxrWJBOHhGjEgbR2ALKxl-dD1sXIf4h?usp=sharing
import cv2
from google.colab.patches import cv2_imshow
import numpy as np
def rotate_image(image, angle):
image_center = tuple(np.array(image.shape[1::-1]) / 2)
rot_mat = cv2.getRotationMatrix2D(image_center, angle, 1.0)
result = cv2.warpAffine(image, rot_mat, image.shape[1::-1], flags=cv2.INTER_LINEAR)
return result
img = cv2.imread("/content/sxJzw.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
mask = np.zeros((img.shape[0], img.shape[1]))
blur = cv2.GaussianBlur(gray, (5,5),0)
ret, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
cv2_imshow(thresh)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
largest_countour = max(contours, key = cv2.contourArea)
binary_mask = cv2.drawContours(mask, [largest_countour], 0, 1, -1)
new_img = img * np.dstack((binary_mask, binary_mask, binary_mask))
minRect = cv2.minAreaRect(largest_countour)
rotate_angle = minRect[-1] if minRect[-1] < 0 else -minRect[-1]
new_img = rotate_image(new_img, rotate_angle)
cv2_imshow(new_img)
I need to extract the 12 oval shapes from the image and store them in separate variables say 1 to 12.
The original image was as follows
Original Image:
Output image:
Can someone help me extract all those oval shapes into different variables ?
my code is
import cv2
import numpy as np
path = r'/home/parallels/Desktop/Opencv/data/test.JPG'
i = cv2.imread(path, -1)
img_rgb = cv2.resize(i, (1280,720))
cv2.namedWindow("Original Image",cv2.WINDOW_NORMAL)
img = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2HSV)
img = cv2.bilateralFilter(img,9,105,105)
r,g,b=cv2.split(img)
equalize1= cv2.equalizeHist(r)
equalize2= cv2.equalizeHist(g)
equalize3= cv2.equalizeHist(b)
equalize=cv2.merge((r,g,b))
equalize = cv2.cvtColor(equalize,cv2.COLOR_RGB2GRAY)
ret,thresh_image = cv2.threshold(equalize,0,255,cv2.THRESH_OTSU+cv2.THRESH_BINARY)
equalize= cv2.equalizeHist(thresh_image)
canny_image = cv2.Canny(equalize,250,255)
canny_image = cv2.convertScaleAbs(canny_image)
kernel = np.ones((3,3), np.uint8)
dilated_image = cv2.dilate(canny_image,kernel,iterations=1)
contours, hierarchy = cv2.findContours(dilated_image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
contours= sorted(contours, key = cv2.contourArea, reverse = True)[:10]
c=contours[0]
print(cv2.contourArea(c))
final = cv2.drawContours(img, [c], -1, (255,0, 0), 3)
mask = np.zeros(img_rgb.shape,np.uint8)
new_image = cv2.drawContours(mask,[c],0,255,-1,)
new_image = cv2.bitwise_and(img_rgb, img_rgb, mask=equalize)
cv2.namedWindow("new",cv2.WINDOW_NORMAL)
cv2.imshow("new",new_image)
cv2.waitKey(0)
You're on the right track. After obtaining your binary image, you can perform contour area + aspect ratio filtering. We can sort the contours in order from left-to-right using imutils.contours.sort_contours(). We find contours then filter using cv2.contourArea
and aspect ratio with cv2.approxPolyDP + cv2.arcLength. If they pass this filter, we draw the contours and append it to a oval list to keep track of the contours. Here's the results:
Filtered mask
Results
Isolated ovals
Output from oval list
Oval contours: 12
Code
import cv2
import numpy as np
from imutils import contours
# Load image, resize, convert to HSV, bilaterial filter
image = cv2.imread('1.jpg')
resize = cv2.resize(image, (1280,720))
original = resize.copy()
mask = np.zeros(resize.shape[:2], dtype=np.uint8)
hsv = cv2.cvtColor(resize, cv2.COLOR_RGB2HSV)
hsv = cv2.bilateralFilter(hsv,9,105,105)
# Split into channels and equalize
r,g,b=cv2.split(hsv)
equalize1 = cv2.equalizeHist(r)
equalize2 = cv2.equalizeHist(g)
equalize3 = cv2.equalizeHist(b)
equalize = cv2.merge((r,g,b))
equalize = cv2.cvtColor(equalize,cv2.COLOR_RGB2GRAY)
# Blur and threshold for binary image
blur = cv2.GaussianBlur(equalize, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Find contours, sort from left-to-right
# Filter using contour area and aspect ratio filtering
ovals = []
num = 0
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts, _) = contours.sort_contours(cnts, method="left-to-right")
for c in cnts:
area = cv2.contourArea(c)
x,y,w,h = cv2.boundingRect(c)
ar = w / float(h)
if area > 1000 and ar < .8:
cv2.drawContours(resize, [c], -1, (36,255,12), 3)
cv2.drawContours(mask, [c], -1, (255,255,255), -1)
cv2.putText(resize, str(num), (x,y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (36,55,12), 2)
ovals.append(c)
num += 1
result = cv2.bitwise_and(original, original, mask=mask)
result[mask==0] = (255,255,255)
print('Oval contours: {}'.format(len(ovals)))
cv2.imshow('equalize', equalize)
cv2.imshow('thresh', thresh)
cv2.imshow('resize', resize)
cv2.imshow('result', result)
cv2.imshow('mask', mask)
cv2.waitKey()
Right now I am trying to create one program, which remove text from background but I am facing a lot of problem going through it
My approach is to use pytesseract to get text boxes and once I get boxes, I use cv2.inpaint to paint it and remove text from there. In short:
d = pytesseract.image_to_data(img, output_type=Output.DICT) # Get text
n_boxes = len(d['level']) # get boxes
for i in range(n_boxes): # Looping through boxes
# Get coordinates
(x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
crop_img = img[y:y+h, x:x+w] # Crop image
gray = cv2.cvtColor(crop_img, cv2.COLOR_BGR2GRAY)
gray = inverte(gray) # Inverse it
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)[1]
dst = cv2.inpaint(crop_img, thresh, 10, cv2.INPAINT_TELEA) # Then Inpaint
img[y:y+h, x:x+w] = dst # Place back cropped image back to the source image
Now the problem is that I am not able to remove text completely
Image:
Now I am not sure what other method I can use to remove text from image, I am new to this that's why I am facing problem. Any help is much appreciated
Note: Image looks stretched because I resized it to show it in screen size
Original Image:
Here's an approach using morphological operations + contour filtering
Convert image to grayscale
Otsu's threshold to obtain a binary image
Perform morph close to connect words into a single contour
Dilate to ensure that all bits of text are contained in the contour
Find contours and filter using contour area
Remove text by "filling" in the contour rectangle with the background color
I used chrome developer tools to determine the background color of the image which was (222,228,251). If you want to dynamically determine the background color, you could try finding the dominant color using k-means. Here's the result
import cv2
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, close_kernel, iterations=1)
dilate_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,3))
dilate = cv2.dilate(close, dilate_kernel, iterations=1)
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
area = cv2.contourArea(c)
if area > 800 and area < 15000:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (222,228,251), -1)
cv2.imshow('image', image)
cv2.waitKey()