OpenCV: How to remove text from background - python

Right now I am trying to create one program, which remove text from background but I am facing a lot of problem going through it
My approach is to use pytesseract to get text boxes and once I get boxes, I use cv2.inpaint to paint it and remove text from there. In short:
d = pytesseract.image_to_data(img, output_type=Output.DICT) # Get text
n_boxes = len(d['level']) # get boxes
for i in range(n_boxes): # Looping through boxes
# Get coordinates
(x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
crop_img = img[y:y+h, x:x+w] # Crop image
gray = cv2.cvtColor(crop_img, cv2.COLOR_BGR2GRAY)
gray = inverte(gray) # Inverse it
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)[1]
dst = cv2.inpaint(crop_img, thresh, 10, cv2.INPAINT_TELEA) # Then Inpaint
img[y:y+h, x:x+w] = dst # Place back cropped image back to the source image
Now the problem is that I am not able to remove text completely
Image:
Now I am not sure what other method I can use to remove text from image, I am new to this that's why I am facing problem. Any help is much appreciated
Note: Image looks stretched because I resized it to show it in screen size
Original Image:

Here's an approach using morphological operations + contour filtering
Convert image to grayscale
Otsu's threshold to obtain a binary image
Perform morph close to connect words into a single contour
Dilate to ensure that all bits of text are contained in the contour
Find contours and filter using contour area
Remove text by "filling" in the contour rectangle with the background color
I used chrome developer tools to determine the background color of the image which was (222,228,251). If you want to dynamically determine the background color, you could try finding the dominant color using k-means. Here's the result
import cv2
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, close_kernel, iterations=1)
dilate_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,3))
dilate = cv2.dilate(close, dilate_kernel, iterations=1)
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
area = cv2.contourArea(c)
if area > 800 and area < 15000:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (222,228,251), -1)
cv2.imshow('image', image)
cv2.waitKey()

Related

How to crop image to only text section with Python OpenCV?

I want to crop the image to only extract the text sections. There are thousands of them with different sizes so I can't hardcode coordinates. I'm trying to remove the unwanted lines on the left and on the bottom. How can I do this?
Original
Expected
Determine the least spanning bounding box by finding all the non-zero points in the image. Finally, crop your image using this bounding box. Finding the contours is time-consuming and unnecessary here, especially because your text is axis-aligned. You may accomplish your goal by combining cv2.findNonZero and cv2.boundingRect.
Hope this will work ! :
import numpy as np
import cv2
img = cv2.imread(r"W430Q.png")
# Read in the image and convert to grayscale
img = img[:-20, :-20] # Perform pre-cropping
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = 255*(gray < 50).astype(np.uint8) # To invert the text to white
gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, np.ones(
(2, 2), dtype=np.uint8)) # Perform noise filtering
coords = cv2.findNonZero(gray) # Find all non-zero points (text)
x, y, w, h = cv2.boundingRect(coords) # Find minimum spanning bounding box
# Crop the image - note we do this on the original image
rect = img[y:y+h, x:x+w]
cv2.imshow("Cropped", rect) # Show it
cv2.waitKey(0)
cv2.destroyAllWindows()
in above code from forth line of code is where I set the threshold below 50 to make the dark text white. However, because this outputs a binary image, I convert to uint8, then scale by 255. The text is effectively inverted.
Then, using cv2.findNonZero, we discover all of the non-zero locations for this image.We then passed this to cv2.boundingRect, which returns the top-left corner of the bounding box, as well as its width and height. Finally, we can utilise this to crop the image. This is done on the original image, not the inverted version.
Here's a simple approach:
Obtain binary image. Load the image, grayscale, Gaussian blur, then Otsu's threshold to obtain a binary black/white image.
Remove horizontal lines. Since we're trying to only extract text, we remove horizontal lines to aid us in our next step so incorrect contours will not merge together.
Merge text into a single contour. The idea is that characters which are adjacent to each other are part of the wall of text. So we can dilate individual contours together to obtain a single contour to extract.
Find contours and extract ROI. We find contours, sort contours by area, then extract the largest contour ROI using Numpy slicing.
Here's the visualization of each step:
Binary image -> Removed horizontal lines in green
1
2
Dilate to combine into a single contour -> Detected ROI to extract in green
3
4
Result
Code
import cv2
import numpy as np
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(thresh, [c], -1, 0, -1)
# Dilate to merge into a single contour
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,30))
dilate = cv2.dilate(thresh, vertical_kernel, iterations=3)
# Find contours, sort for largest contour and extract ROI
cnts, _ = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:-1]
for c in cnts:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 4)
ROI = original[y:y+h, x:x+w]
break
cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.imshow('thresh', thresh)
cv2.imshow('ROI', ROI)
cv2.waitKey()

Detect squares (paintings) in images and draw contour around them using python

I'm trying to detect and draw a rectangular contour on every painting on for example this image:
I followed some guides and did the following:
Grayscale conversion
Applied median blur
Sharpen image
Applied adaptive Threshold
Applied Morphological Gradient
Find contours
Draw contours
And got the following result:
I know it's messy but is there a way to somehow detect and draw a contour around the paintings better?
Here is the code I used:
path = '<PATH TO THE PICTURE>'
#reading in and showing original image
image = cv2.imread(path)
image = cv2.resize(image,(880,600)) # resize was nessecary because of the large images
cv2.imshow("original", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
# grayscale conversion
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow("painting_gray", gray)
cv2.waitKey(0)
cv2.destroyAllWindows()
# we need to find a way to detect the edges better so we implement a couple of things
# A little help was found on stackoverflow: https://stackoverflow.com/questions/55169645/square-detection-in-image
median = cv2.medianBlur(gray,5)
cv2.imshow("painting_median_blur", median) #we use median blur to smooth the image
cv2.waitKey(0)
cv2.destroyAllWindows()
# now we sharpen the image with help of following URL: https://www.analyticsvidhya.com/blog/2021/08/sharpening-an-image-using-opencv-library-in-python/
kernel = np.array([[0, -1, 0],
[-1, 5,-1],
[0, -1, 0]])
image_sharp = cv2.filter2D(src=median, ddepth=-1, kernel=kernel)
cv2.imshow('painting_sharpend', image_sharp)
cv2.waitKey(0)
cv2.destroyAllWindows()
# now we apply adapptive thresholding
# thresholding: https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html#adaptive-thresholding
thresh = cv2.adaptiveThreshold(src=image_sharp,maxValue=255,adaptiveMethod=cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
thresholdType=cv2.THRESH_BINARY,blockSize=61,C=20)
cv2.imshow('thresholded image', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
# lets apply a morphological transformation
kernel = np.ones((7,7),np.uint8)
gradient = cv2.morphologyEx(thresh, cv2.MORPH_GRADIENT, kernel)
cv2.imshow('dilated image', gradient)
cv2.waitKey(0)
cv2.destroyAllWindows()
# # lets now find the contours of the image
# # find contours: https://docs.opencv.org/4.x/dd/d49/tutorial_py_contour_features.html
contours, hierarchy = cv2.findContours(gradient, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
print("contours: ", len(contours))
print("hierachy: ", len(hierarchy))
print(hierarchy)
cv2.drawContours(image, contours, -1, (0,255,0), 3)
cv2.imshow("contour image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Tips, help or code is appreciated!
Here's a simple approach:
Obtain binary image. We load the image, grayscale, Gaussian blur, then Otsu's threshold to obtain a binary image.
Two pass dilation to merge contours. At this point, we have a binary image but individual separated contours. Since we can assume that a painting is a single large square contour, we can merge small individual adjacent contours together to form a single contour. To do this, we create a vertical and horizontal kernel using cv2.getStructuringElement then dilate to merge them together. Depending on the image, you may need to adjust the kernel sizes or number of dilation iterations.
Detect paintings. Now we find contours and filter using contour area using a minimum threshold area to filter out small contours. Finally we obtain the bounding rectangle coordinates and draw the rectangle with cv2.rectangle.
Code
import cv2
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.jpeg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (13,13), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Two pass dilate with horizontal and vertical kernel
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,5))
dilate = cv2.dilate(thresh, horizontal_kernel, iterations=2)
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,9))
dilate = cv2.dilate(dilate, vertical_kernel, iterations=2)
# Find contours, filter using contour threshold area, and draw rectangle
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
area = cv2.contourArea(c)
if area > 20000:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 3)
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
So here is the actual size of the portrait frame.
So here is small code.
#!/usr/bin/python 37
#OpenCV 4.3.0, Raspberry Pi 3/B/4B-w/4/8GB RAM, Buster,v10.
#Date: 3rd, June, 2020
import cv2
# Load the image
img = cv2.imread('portrait.jpeg')
# convert to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edged = cv2.Canny(img, 120,890)
# Apply adaptive threshold
thresh = cv2.adaptiveThreshold(edged, 255, 1, 1, 11, 2)
thresh_color = cv2.cvtColor(thresh, cv2.COLOR_GRAY2BGR)
# apply some dilation and erosion to join the gaps - change iteration to detect more or less area's
thresh = cv2.dilate(thresh,None,iterations = 50)
thresh = cv2.erode(thresh,None,iterations = 50)
# Find the contours
contours,hierarchy = cv2.findContours(thresh,
cv2.RETR_TREE,
cv2.CHAIN_APPROX_SIMPLE)
# For each contour, find the bounding rectangle and draw it
for cnt in contours:
area = cv2.contourArea(cnt)
if area > 20000:
x,y,w,h = cv2.boundingRect(cnt)
cv2.rectangle(img,
(x,y),(x+w,y+h),
(0,255,0),
2)
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here is output:

How to crop and save segmented objects from an image?

I have a hyperspectral image. In each image, there are many objects. I have to segment, crop, and save them as separate images. The segmented image after applying the thresholding is as below:
The problem is that I have to crop the segmented objects and save them. How it can be done?
Here's a simple approach:
Obtain binary image. Load the image, convert to grayscale, and Otsu's threshold to obtain a binary image.
Remove noise. We morphological operations to remove any particles of noise in the image.
Extract objects. From here we find contours, obtain each bounding rectangle then extract and save each ROI using Numpy slicing.
Detected objects
Saved ROIs
import cv2
import numpy as np
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Morph open to remove noise
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
# Find contours, obtain bounding box, extract and save ROI
ROI_number = 0
cnts = cv2.findContours(opening, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
ROI = original[y:y+h, x:x+w]
cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
ROI_number += 1
cv2.imshow('image', image)
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.waitKey()

How to remove all portrait pictures from a document

I am working on OCRing a document image. I want to detect all pictures and remove from the document image. I want to retain tables in the document image. Once I detect pictures I will remove and then want to OCR. I tried to find contour tried to detect all the bigger areas. unfortunately it detects tables also. Also how to remove the objects keeping other data in the doc image. I am using opencv and python
Here's my code
import os
from PIL import Image
import pytesseract
img = cv2.imread('block2.jpg' , 0)
mask = np.ones(img.shape[:2], dtype="uint8") * 255
ret,thresh1 = cv2.threshold(img,127,255,0)
contours, sd = cv2.findContours(thresh1,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
areacontainer = []
for cnt in contours:
area = cv2.contourArea(cnt)
areacontainer.append(area)
avgArea = sum(areacontainer)/len(areacontainer)
[enter code here][1]
for c in contours:# average area heuristics
if cv2.contourArea(c)>6*avgArea:
cv2.drawContours(mask, [c], -1, 0, -1)
binary = cv2.bitwise_and(img, img, mask=mask) # subtracting
cv2.imwrite("bin.jpg" , binary)
cv2.imwrite("mask.jpg" , mask)
Here's an approach:
Convert image to grayscale and Gaussian blur
Perform canny edge detection
Perform morphological operations to smooth image
Find contours and filter using a minimum/maximum threshold area
Remove portrait images
Here's the detected portraits highlighted in green
Now that we have the bounding box ROIs, we can effectively remove the pictures by filling them in with white. Here's the result
import cv2
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
canny = cv2.Canny(blur, 120, 255, 1)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
close = cv2.morphologyEx(canny, cv2.MORPH_CLOSE, kernel, iterations=2)
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
area = cv2.contourArea(c)
if area > 15000 and area < 35000:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (255,255,255), -1)
cv2.imshow('image', image)
cv2.waitKey()

How do I make masks to set all of image background, except the text, to white?

I'm trying to extract the text in this region to run OCR, but the stray black edges are interfering with some results. Is there a way to isolate this text?
After finding this contour, I've cropped it out of the original image with a black background mask. I'm not too sure how to change the background to white, nor can I figure out a way to get rid of the black edges around the contour. Thresholding the image seems to get rid of some of the black pixels in the text, which I don't want.
Ideally the output should be simply the black text, and a white background.
This is a section in the code of the original masking that I've attempted-
mask = np.ones(orig_img.shape).astype(orig_img.dtype)
cv2.fillPoly(mask, [cnt], (255,255,255))
cropped_contour = cv2.bitwise_and(orig_img, mask)
To isolate the text, one approach is to obtain the bounding box coordinates of the desired ROI and then mask that ROI onto a blank white image. The main idea is:
Convert image to grayscale
Threshold image
Dilate image to connect text as a single bounding box
Find contours and filter used contour area to find ROI
Place ROI onto mask
Threshold image (left) then dilate to connect text (right)
You can find contours using cv2.boundingRect() then once you have the ROI, you can place this ROI onto the mask with
mask = np.zeros(image.shape, dtype='uint8')
mask.fill(255)
mask[y:y+h, x:x+w] = original_image[y:y+h, x:x+w]
Find contours then filter for ROI (left), final result (right)
Depending on your image size, you may need to adjust the filter for the contour area.
import cv2
import numpy as np
original_image = cv2.imread('1.png')
image = original_image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
dilate = cv2.dilate(thresh, kernel, iterations=5)
# Find contours
cnts = cv2.findContours(dilate, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
# Create a blank white mask
mask = np.zeros(image.shape, dtype='uint8')
mask.fill(255)
# Iterate thorugh contours and filter for ROI
for c in cnts:
area = cv2.contourArea(c)
if area < 15000:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
mask[y:y+h, x:x+w] = original_image[y:y+h, x:x+w]
cv2.imshow("mask", mask)
cv2.imshow("image", image)
cv2.imshow("dilate", dilate)
cv2.imshow("thresh", thresh)
cv2.imshow("result", image)
cv2.waitKey(0)

Categories