I want to crop the image to only extract the text sections. There are thousands of them with different sizes so I can't hardcode coordinates. I'm trying to remove the unwanted lines on the left and on the bottom. How can I do this?
Original
Expected
Find all the non-zero points in the image, compute their minimum spanning bounding box, then crop the image with that box. Finding contours is time-consuming and unnecessary here, especially because your text is axis-aligned. You can accomplish this by combining cv2.findNonZero and cv2.boundingRect.
The following code should do it:
import numpy as np
import cv2
img = cv2.imread(r"W430Q.png")  # Read in the image
img = img[:-20, :-20]  # Pre-crop 20 pixels off the bottom and right edges
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
gray = 255*(gray < 50).astype(np.uint8)  # Invert so the dark text becomes white
gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, np.ones((2, 2), dtype=np.uint8))  # Perform noise filtering
coords = cv2.findNonZero(gray) # Find all non-zero points (text)
x, y, w, h = cv2.boundingRect(coords) # Find minimum spanning bounding box
# Crop the image - note we do this on the original image
rect = img[y:y+h, x:x+w]
cv2.imshow("Cropped", rect) # Show it
cv2.waitKey(0)
cv2.destroyAllWindows()
In the code above, the thresholding line (gray < 50) marks every pixel darker than 50, which turns the dark text white. Because the comparison yields a boolean array, I convert it to uint8 and then scale by 255. The text is effectively inverted.
Then, using cv2.findNonZero, we find all of the non-zero locations in this image. We pass these to cv2.boundingRect, which returns the top-left corner of the bounding box as well as its width and height. Finally, we use this box to crop the image. The crop is done on the original image, not the inverted version.
Here's a simple approach:
Obtain binary image. Load the image, grayscale, Gaussian blur, then Otsu's threshold to obtain a binary black/white image.
Remove horizontal lines. Since we're trying to only extract text, we remove horizontal lines to aid us in our next step so incorrect contours will not merge together.
Merge text into a single contour. The idea is that characters which are adjacent to each other are part of the wall of text. So we can dilate individual contours together to obtain a single contour to extract.
Find contours and extract ROI. We find contours, sort contours by area, then extract the largest contour ROI using Numpy slicing.
Here's the visualization of each step:
Binary image -> Removed horizontal lines in green
Dilate to combine into a single contour -> Detected ROI to extract in green
Result
Code
import cv2
import numpy as np
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(thresh, [c], -1, 0, -1)
# Dilate to merge into a single contour
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,30))
dilate = cv2.dilate(thresh, vertical_kernel, iterations=3)
# Find contours, sort for largest contour and extract ROI
cnts, _ = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for c in cnts:
    # Only the largest contour is used, so stop after the first one
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 4)
    ROI = original[y:y+h, x:x+w]
    break
cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.imshow('thresh', thresh)
cv2.imshow('ROI', ROI)
cv2.waitKey()
Related
I am new to OpenCV.
I have a "simple" image of a stamp that I have already processed a bit, as you can see in the code below.
Now I have the problem of cropping the image to get just the stamp.
The dots and the stripes on the edges interfere with my current code recognizing the stamp.
The images can be different so it is not an option to fix the location of the image.
Code:
img = cv2.imread('./images/image.JPG')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_blur = cv2.GaussianBlur(img_gray, (3,3), 0)
edges = cv2.Canny(image=img_blur, threshold1=100, threshold2=200)
Real Images
Here's a simple method:
Obtain binary image. We load the image, convert to grayscale, Gaussian blur, then Otsu's threshold to obtain a binary image.
Remove small artifacts and noise. Create a rectangular structuring element and morph open to remove small bits of noise. Then morph close to combine individual contours into a single contour with the assumption that a stamp is a single contour.
Detect and extract stamp ROI. Find contours, then filter using contour area and shape approximation. The idea is that if a contour has four vertices, it is a square or rectangular shape. We can then extract the stamp ROI using Numpy slicing and save it.
Extracted ROI results
Here are the results with the other two input images from the comments. The assumption is that each image contains only one stamp, or one group of adjacent stamps. For these cases, we sort by contour area and assume the largest contour is the stamp.
Code
import cv2
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread("1.jpg")
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5,5), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Morph operations to remove small artifacts and noise
open_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, open_kernel, iterations=1)
close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7,7))
close = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, close_kernel, iterations=2)
# Find contours, filter using contour area, and shape approximation
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for c in cnts:
    peri = cv2.arcLength(c, True)
    area = cv2.contourArea(c)
    approx = cv2.approxPolyDP(c, 0.05 * peri, True)
    # Assumption is if the contour has 4 vertices then its a square shape
    # 2nd assumption is that there's only one stamp, or one group of stamps
    if len(approx) == 4 and area > 100:
        x,y,w,h = cv2.boundingRect(approx)
        ROI = original[y:y+h, x:x+w]
        cv2.imshow("ROI", ROI)
        cv2.imwrite("ROI.png", ROI)
        break
cv2.waitKey()
I'm trying to detect and draw a rectangular contour around every painting in an image such as this one:
I followed some guides and did the following:
Grayscale conversion
Applied median blur
Sharpen image
Applied adaptive Threshold
Applied Morphological Gradient
Find contours
Draw contours
And got the following result:
I know it's messy but is there a way to somehow detect and draw a contour around the paintings better?
Here is the code I used:
import cv2
import numpy as np
path = '<PATH TO THE PICTURE>'
# reading in and showing original image
image = cv2.imread(path)
image = cv2.resize(image,(880,600)) # resize was necessary because of the large images
cv2.imshow("original", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
# grayscale conversion
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow("painting_gray", gray)
cv2.waitKey(0)
cv2.destroyAllWindows()
# we need to find a way to detect the edges better so we implement a couple of things
# A little help was found on stackoverflow: https://stackoverflow.com/questions/55169645/square-detection-in-image
median = cv2.medianBlur(gray,5)
cv2.imshow("painting_median_blur", median) #we use median blur to smooth the image
cv2.waitKey(0)
cv2.destroyAllWindows()
# now we sharpen the image with help of following URL: https://www.analyticsvidhya.com/blog/2021/08/sharpening-an-image-using-opencv-library-in-python/
kernel = np.array([[0, -1, 0],
[-1, 5,-1],
[0, -1, 0]])
image_sharp = cv2.filter2D(src=median, ddepth=-1, kernel=kernel)
cv2.imshow('painting_sharpend', image_sharp)
cv2.waitKey(0)
cv2.destroyAllWindows()
# now we apply adaptive thresholding
# thresholding: https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html#adaptive-thresholding
thresh = cv2.adaptiveThreshold(src=image_sharp,maxValue=255,adaptiveMethod=cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
thresholdType=cv2.THRESH_BINARY,blockSize=61,C=20)
cv2.imshow('thresholded image', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
# lets apply a morphological transformation
kernel = np.ones((7,7),np.uint8)
gradient = cv2.morphologyEx(thresh, cv2.MORPH_GRADIENT, kernel)
cv2.imshow('dilated image', gradient)
cv2.waitKey(0)
cv2.destroyAllWindows()
# # lets now find the contours of the image
# # find contours: https://docs.opencv.org/4.x/dd/d49/tutorial_py_contour_features.html
contours, hierarchy = cv2.findContours(gradient, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
print("contours: ", len(contours))
print("hierarchy: ", len(hierarchy))
print(hierarchy)
cv2.drawContours(image, contours, -1, (0,255,0), 3)
cv2.imshow("contour image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Tips, help or code is appreciated!
Here's a simple approach:
Obtain binary image. We load the image, grayscale, Gaussian blur, then Otsu's threshold to obtain a binary image.
Two pass dilation to merge contours. At this point, we have a binary image but individual separated contours. Since we can assume that a painting is a single large square contour, we can merge small individual adjacent contours together to form a single contour. To do this, we create a vertical and horizontal kernel using cv2.getStructuringElement then dilate to merge them together. Depending on the image, you may need to adjust the kernel sizes or number of dilation iterations.
Detect paintings. Now we find contours and discard any whose area falls below a minimum threshold, which filters out small contours. Finally, we obtain the bounding rectangle coordinates and draw the rectangle with cv2.rectangle.
Code
import cv2
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.jpeg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (13,13), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Two pass dilate with horizontal and vertical kernel
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,5))
dilate = cv2.dilate(thresh, horizontal_kernel, iterations=2)
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,9))
dilate = cv2.dilate(dilate, vertical_kernel, iterations=2)
# Find contours, filter using contour threshold area, and draw rectangle
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area > 20000:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 3)
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
Here is the portrait frame at its actual size.
Here is a short script:
#!/usr/bin/python3.7
#OpenCV 4.3.0, Raspberry Pi 3/B/4B-w/4/8GB RAM, Buster,v10.
#Date: 3rd, June, 2020
import cv2
# Load the image
img = cv2.imread('portrait.jpeg')
# convert to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edged = cv2.Canny(img, 120,890)
# Apply adaptive threshold (Gaussian method, inverted binary output)
thresh = cv2.adaptiveThreshold(edged, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
thresh_color = cv2.cvtColor(thresh, cv2.COLOR_GRAY2BGR)
# apply some dilation and erosion to join the gaps - change iterations to detect more or fewer areas
thresh = cv2.dilate(thresh,None,iterations = 50)
thresh = cv2.erode(thresh,None,iterations = 50)
# Find the contours
contours,hierarchy = cv2.findContours(thresh,
                                      cv2.RETR_TREE,
                                      cv2.CHAIN_APPROX_SIMPLE)
# For each contour, find the bounding rectangle and draw it
for cnt in contours:
    area = cv2.contourArea(cnt)
    if area > 20000:
        x,y,w,h = cv2.boundingRect(cnt)
        cv2.rectangle(img,
                      (x,y),(x+w,y+h),
                      (0,255,0),
                      2)
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here is the output:
I have two pictures of the same dimension and I want to detect and replace the white region in the first picture (black image) at the same location in the second picture. Is there any way to do this using OpenCV? I want to replace the blue region in the original image with the white region in the first picture.
First picture
Original image
If I'm understanding you correctly, you want to replace the white ROI on the black image onto the original image. Here's a simple approach:
Obtain binary image. Load the image, convert to grayscale, Gaussian blur, then apply Otsu's threshold.
Extract ROI and replace. Find contours with cv2.findContours, then filter using contour approximation with cv2.arcLength and cv2.approxPolyDP. Assuming the region is a rectangle, a contour whose approximation has 4 vertices is our desired region. In addition, we filter using cv2.contourArea to ensure that we don't include noise. We then obtain the bounding box coordinates with cv2.boundingRect and extract the ROI with Numpy slicing. Finally, we copy the ROI into the original image at the same location.
Detected region to extract/replace highlighted in green
Extracted ROI
Result
Code
import cv2
# Load images, grayscale, Gaussian blur, Otsu's threshold
original = cv2.imread('1.jpg')
image = cv2.imread('2.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Find contours, filter using contour approximation + area, then extract
# ROI using Numpy slicing and replace into original image
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)
    area = cv2.contourArea(c)
    if len(approx) == 4 and area > 1000:
        x,y,w,h = cv2.boundingRect(c)
        ROI = image[y:y+h,x:x+w]
        original[y:y+h, x:x+w] = ROI
cv2.imshow('thresh', thresh)
cv2.imshow('ROI', ROI)
cv2.imshow('original', original)
cv2.waitKey()
I'm trying to extract the text in this region to run OCR, but the stray black edges are interfering with some results. Is there a way to isolate this text?
After finding this contour, I've cropped it out of the original image with a black background mask. I'm not too sure how to change the background to white, nor can I figure out a way to get rid of the black edges around the contour. Thresholding the image seems to get rid of some of the black pixels in the text, which I don't want.
Ideally the output should be simply the black text, and a white background.
This is a section in the code of the original masking that I've attempted-
mask = np.ones(orig_img.shape).astype(orig_img.dtype)
cv2.fillPoly(mask, [cnt], (255,255,255))
cropped_contour = cv2.bitwise_and(orig_img, mask)
To isolate the text, one approach is to obtain the bounding box coordinates of the desired ROI and then mask that ROI onto a blank white image. The main idea is:
Convert image to grayscale
Threshold image
Dilate image to connect text as a single bounding box
Find contours and filter using contour area to find ROI
Place ROI onto mask
Threshold image (left) then dilate to connect text (right)
You can find contours with cv2.findContours() and get each contour's bounding box with cv2.boundingRect(). Once you have the ROI, you can place it onto the mask with
mask = np.zeros(image.shape, dtype='uint8')
mask.fill(255)
mask[y:y+h, x:x+w] = original_image[y:y+h, x:x+w]
Find contours then filter for ROI (left), final result (right)
Depending on your image size, you may need to adjust the filter for the contour area.
import cv2
import numpy as np
original_image = cv2.imread('1.png')
image = original_image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
dilate = cv2.dilate(thresh, kernel, iterations=5)
# Find contours
cnts = cv2.findContours(dilate, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
# Create a blank white mask
mask = np.zeros(image.shape, dtype='uint8')
mask.fill(255)
# Iterate through contours and filter for ROI
for c in cnts:
    area = cv2.contourArea(c)
    if area < 15000:
        x,y,w,h = cv2.boundingRect(c)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
        mask[y:y+h, x:x+w] = original_image[y:y+h, x:x+w]
cv2.imshow("mask", mask)
cv2.imshow("image", image)
cv2.imshow("dilate", dilate)
cv2.imshow("thresh", thresh)
cv2.imshow("result", image)
cv2.waitKey(0)
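The effect of pasting the ROI onto a white canvas can be verified without an input file. In this sketch the image contents and the ROI coordinates are hypothetical (in the answer they come from cv2.boundingRect), and np.full is used as a one-call equivalent of np.zeros followed by fill(255).

```python
import numpy as np

# Synthetic stand-in for the input (assumption): white page, a dark text
# block, and a stray black edge along the left side.
original_image = np.full((80, 120, 3), 255, dtype=np.uint8)
original_image[:, :5] = 0
original_image[30:50, 40:90] = 60

# Hypothetical ROI; normally returned by cv2.boundingRect on the text contour
x, y, w, h = 40, 30, 50, 20

# Blank white canvas of the same shape, then copy only the ROI onto it
mask = np.full(original_image.shape, 255, dtype=np.uint8)
mask[y:y + h, x:x + w] = original_image[y:y + h, x:x + w]

print(mask[:, :5].min(), mask[y, x])
```

Everything outside the ROI stays white, so the stray edge never reaches the OCR step.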
I am using OpenCV to put bounding boxes on handwritten math equation inputs. Currently, my code sometimes places multiple smaller bounding boxes around different parts of a singular image instead of creating one large box around the image. I'm not sure why this is happening. My current code to filter the image and find the contours to draw the bounding box is as follows:
img = cv2.imread(imgpath)
morph = img.copy()
morph = cv2.fastNlMeansDenoising(img)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 1))
morph = cv2.morphologyEx(morph, cv2.MORPH_CLOSE, kernel)
morph = cv2.morphologyEx(morph, cv2.MORPH_OPEN, kernel)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 15))
# take morphological gradient
gradient_image = cv2.morphologyEx(morph, cv2.MORPH_GRADIENT, kernel)
gray = cv2.cvtColor(gradient_image, cv2.COLOR_BGR2GRAY)
img_grey = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)
blur = cv2.medianBlur(img_grey,3)
ret, thing = cv2.threshold(blur, 0.0, 255.0, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
img_dilation = cv2.dilate(thing, kernel, iterations=3)
conturs_lst = cv2.findContours(img_dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
An example of the actual result is as follows:
OG Image:
You have the right idea, but I think you're overusing cv2.morphologyEx to repeatedly erode and dilate the image. You mention your problem:
Currently, my code sometimes places multiple smaller bounding boxes around different parts of a singular image instead of creating one large box around the image.
When you use cv2.findContours, it's working correctly, but since your contours are actually separate blobs instead of one interconnected shape, it creates multiple bounding boxes. To remedy this problem, you can dilate the image to connect the blobs together.
I've rewritten your code without the extra cv2.morphologyEx repetitions. The main idea is as follows:
Convert the image into grayscale
Blur image
Threshold image to separate background from desired object
Dilate image to connect blobs to form a singular image
Find contours and filter contours using threshold min/max area
Threshold image to isolate desired sections. Note some of the contours have broken connections. To fix this, we dilate the image to connect the blobs.
Dilate image to form singular objects. Now note we have the unwanted horizontal section at the bottom, we can find contours and then filter using area to remove that section.
Results
import numpy as np
import cv2
original_image = cv2.imread("1.jpg")
image = original_image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.threshold(blurred, 160, 255, cv2.THRESH_BINARY_INV)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
dilate = cv2.dilate(thresh, kernel , iterations=4)
cv2.imshow("thresh", thresh)
cv2.imshow("dilate", dilate)
# Find contours in the image
cnts = cv2.findContours(dilate.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
contours = []
threshold_min_area = 400
threshold_max_area = 3000
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    area = cv2.contourArea(c)
    if area > threshold_min_area and area < threshold_max_area:
        # cv2.drawContours(original_image,[c], 0, (0,255,0), 3)
        cv2.rectangle(original_image, (x,y), (x+w, y+h), (0,255,0),1)
        contours.append(c)
cv2.imshow("detected", original_image)
print('contours detected: {}'.format(len(contours)))
cv2.waitKey(0)