I have found contours of some rectangles in an image and created a mask as shown below. What I am trying to do is finding those two columns of rectangles as highlighted in the image.
The source image:
Columns highlighted:
Desired output:
I'm not sure if one can achieve that with simple algorithm. If those inline-vertical rectangle does not change their position, You could just define (hardcoded) ROI (region of interest) for them, based on pixel coordinates.
If not using any machine learning to solve this, defining ROI is Your best option.
Feel free to ask, if needed.
Assuming the desired columns have significantly more contours than the other columns – as shown in the given example image – simple dilation with some vertical structuring element might be sufficient to find those columns:
Load image as grayscale, get rid of JPG artifacts.
Dilation with "small" vertical structuring element to combine all contours within a column.
Opening with "large" vertical structuring element to neglect all smaller contours. The two columns in question will now have quite large contours.
For safety reasons: Again, dilation with "small" vertical structuring element. Since the columns are slightly rotated, skewed, ... single pixels might've been falsely removed by the opening.
Find remaining contours.
For each contour: Get the bounding boxes in two steps, since we have to refine the bounding box by using the original image (because of step 4).
That'd be the full Python code:
import cv2
import numpy as np
# Read image, get rid of JPG artifacts
img = cv2.imread('gXylF.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)[1]
# Dilating using vertical structuring element to combine contours
mask = cv2.dilate(img, kernel=np.ones((21, 1)))
# Opening using vertical structuring element to neglect small contours
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel=np.ones((201, 1)))
# Dilating using rectangular structuring element; for safety reasons:
# Single pixels might've been falsely removed by the opening
mask = cv2.dilate(mask, kernel=np.ones((11, 11)))
# Find remaining contours w.r.t. the OpenCV version
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
# Iterate remaining contours
for i, cnt in enumerate(cnts):
# Get bounding rectangle of contour, and region of interest (ROI)
(x, y, w, h) = cv2.boundingRect(cnt)
roi = img[y:y+h, x:x+w]
# Get bounding rectangle of actual values in ROI; the contour of the
# mask is larger than the actual contours in the original image
(x, y, w, h) = cv2.boundingRect(roi)
roi = roi[y:y+h, x:x+w]
cv2.imwrite('{}.png'.format(i), roi)
The two exported images look like this:
If you, instead, wanted to mark the regions of interest (ROIs) within the original image, pay attention to use the correct coordinates of the bounding boxes (refined coordinates are w.r.t. to the first cut ROI).
If your other images show more rotation, skew, ... you'd might need to re-draw the contours w.r.t. the original image, if neighbouring, small contours should be neglected completely.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
NumPy: 1.20.1
OpenCV: 4.5.1
----------------------------------------
Related
I am planning to split the questions from this PDF document. The challenge is that the questions are not orderly spaced. For example the first question occupies an entire page, second also the same while the third and fourth together make up one page. If I have to manually slice it, it will be ages. So, I thought to split it up into images and work on them. Is there a possibility to take image as this
and split into individual components like this?
This is a classic situation for dilate. The idea is that adjacent text corresponds with the same question while text that is farther away is part of another question. Whenever you want to connect multiple items together, you can dilate them to join adjacent contours into a single contour. Here's a simple approach:
Obtain binary image. Load the image, convert to grayscale, Gaussian blur, then Otsu's threshold to obtain a binary image.
Remove small noise and artifacts. We create a rectangular kernel and morph open to remove small noise and artifacts in the image.
Connect adjacent words together. We create a larger rectangular kernel and dilate to merge individual contours together.
Detect questions. From here we find contours, sort contours from top-to-bottom using imutils.sort_contours(), filter with a minimum contour area, obtain the rectangular bounding rectangle coordinates and highlight the rectangular contours. We then crop each question using Numpy slicing and save the ROI image.
Otsu's threshold to obtain a binary image
Here's where the interesting section happens. We assume that adjacent text/characters are part of the same question so we merge individual words into a single contour. A question is a section of words that are close together so we dilate to connect them all together.
Individual questions highlighted in green
Top question
Bottom question
Saved ROI questions (assumption is from top-to-bottom)
Code
import cv2
from imutils import contours
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove small artifacts and noise with morph open
open_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, open_kernel, iterations=1)
# Create rectangular structuring element and dilate
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
dilate = cv2.dilate(opening, kernel, iterations=4)
# Find contours, sort from top to bottom, and extract each question
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts, _) = contours.sort_contours(cnts, method="top-to-bottom")
# Get bounding box of each question, crop ROI, and save
question_number = 0
for c in cnts:
# Filter by area to ensure its not noise
area = cv2.contourArea(c)
if area > 150:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
question = original[y:y+h, x:x+w]
cv2.imwrite('question_{}.png'.format(question_number), question)
question_number += 1
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
We may solve it using (mostly) morphological operations:
Read the input image as grayscale.
Apply thresholding with inversion.
Automatic thresholding using cv2.THRESH_OTSU is working well.
Apply opening morphological operation for removing small artifacts (using the kernel np.ones(1, 3))
Dilate horizontally with very long horizontal kernel - make horizontal lines out of the text lines.
Apply closing vertically - create two large clusters.
The size of the vertical kernel should be tuned according to the typical gap.
Finding connected components with statistics.
Iterate the connected components and crop the relevant area in the vertical direction.
Complete code sample:
import cv2
import numpy as np
img = cv2.imread('scanned_image.png', cv2.IMREAD_GRAYSCALE) # Read image as grayscale
thesh = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)[1] # Apply automatic thresholding with inversion.
thesh = cv2.morphologyEx(thesh, cv2.MORPH_OPEN, np.ones((1, 3), np.uint8)) # Apply opening morphological operation for removing small artifacts.
thesh = cv2.dilate(thesh, np.ones((1, img.shape[1]), np.uint8)) # Dilate horizontally - make horizontally lines out of the text.
thesh = cv2.morphologyEx(thesh, cv2.MORPH_CLOSE, np.ones((50, 1), np.uint8)) # Apply closing vertically - create two large clusters
nlabel, labels, stats, centroids = cv2.connectedComponentsWithStats(thesh, 4) # Finding connected components with statistics
parts_list = []
# Iterate connected components:
for i in range(1, nlabel):
top = int(stats[i, cv2.CC_STAT_TOP]) # Get most top y coordinate of the connected component
height = int(stats[i, cv2.CC_STAT_HEIGHT]) # Get the height of the connected component
roi = img[top-5:top+height+5, :] # Crop the relevant part of the image (add 5 extra rows from top and bottom).
parts_list.append(roi.copy()) # Add the cropped area to a list
cv2.imwrite(f'part{i}.png', roi) # Save the image part for testing
cv2.imshow(f'part{i}', roi) # Show part for testing
# Show image and thesh testing
cv2.imshow('img', img)
cv2.imshow('thesh', thesh)
cv2.waitKey()
cv2.destroyAllWindows()
Results:
Stage 1:
Stage 2:
Stage 3:
Stage 4:
Top area:
Bottom area:
My code selects a frame from a video which is than subtracted with a background frame selected from the same video. It is then converted to grayscale, blurred, and then an image threshold is applied. Then a contour is drawn which outputs this image. However, I would only like to have the outermost contour and also not have any contours drawn above y=500. How can I implement this?
Contour code:
contours, hierarchy = cv2.findContours(tframe,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
image = cv2.drawContours(sampleframe, contours, -1, (0, 255, 0), 2)
I have tried using cv2.dilate which works given enough iteration to remove internal contours but the iteration causes the contour to be overestimated which is not desired.
If you filter out the contours by their area size, you can keep the biggest contour. check this link please: https://docs.opencv.org/3.4/dd/d49/tutorial_py_contour_features.html
And for the second problem you can cut your frame on y axis to be lower than 500.
Let's say I have image with embossed and debossed object like this
or
Is there a way to determine the above object is embossed and the below object is debossed using OpenCV? Preferably using C++, but Python is also fine. I couldn't find any good resource on the internet.
Here's an approach which takes advantage of the sunken and lifted contours of the embossed/debossed image. The main idea is:
Convert the image to grayscale
Perform a morphological transformation
Find outlines using Canny edge detection
Dilate canny image to merge individual contours into a single contour
Perform contour detection to find the ROI dimensions of top/bottom halves
Obtain ROI of top/bottom canny image
Count non-zero array elements for each half
Convert to grayscale and perform morphological transformation
Perform canny edge detection to find outlines. The key to determine if an object is embossed/debossed is to compare the canny edges. Here's the approach: We look at the object, if its upper half has more contour/lines/pixels than its lower half then it is debossed. Similarly, if the upper half has less pixels than its lower half then it is embossed.
Now that we have the canny edges, we dialte the image until all the contours connect so we obtain one single object.
We then perform contour detection to obtain the ROI of the objects
From here, we separate each object into top and bottom sections
Now that we have the ROI of the top and bottom sections, we crop the ROI in the canny image
With each half, we count non-zero array elements using cv2.countNonZero(). For the embossed object, we get this
('top', 1085)
('bottom', 1899)
For the debossed object, we get this
('top', 979)
('bottom', 468)
Therefore by comparing the values between the two halves, if the top half has less pixels than the bottom, it is embossed. If it has more, it is debossed.
import numpy as np
import cv2
original_image = cv2.imread("1.jpg")
image = original_image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
morph = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)
canny = cv2.Canny(morph, 130, 255, 1)
# Dilate canny image so contours connect and form a single contour
dilate = cv2.dilate(canny, kernel, iterations=4)
cv2.imshow("morph", morph)
cv2.imshow("canny", canny)
cv2.imshow("dilate", dilate)
# Find contours in the image
cnts = cv2.findContours(dilate.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
contours = []
# For each image separate it into top/bottom halfs
for c in cnts:
# Obtain bounding rectangle for each contour
x,y,w,h = cv2.boundingRect(c)
# Draw bounding box rectangle
cv2.rectangle(original_image,(x,y),(x+w,y+h),(0,255,0),3)
# cv2.rectangle(original_image,(x,y),(x+w,y+h/2),(0,255,0),3) # top
# cv2.rectangle(original_image,(x,y+h/2),(x+w,y+h),(0,255,0),3) # bottom
top_half = ((x,y), (x+w, y+h/2))
bottom_half = ((x,y+h/2), (x+w, y+h))
# Collect top/bottom ROIs
contours.append((top_half, bottom_half))
for index, c in enumerate(contours):
top_half, bottom_half = c
top_x1,top_y1 = top_half[0]
top_x2,top_y2 = top_half[1]
bottom_x1,bottom_y1 = bottom_half[0]
bottom_x2,bottom_y2 = bottom_half[1]
# Grab ROI of top/bottom section from canny image
top_image = canny[top_y1:top_y2, top_x1:top_x2]
bottom_image = canny[bottom_y1:bottom_y2, bottom_x1:bottom_x2]
cv2.imshow('top_image', top_image)
cv2.imshow('bottom_image', bottom_image)
# Count non-zero array elements
top_pixels = cv2.countNonZero(top_image)
bottom_pixels = cv2.countNonZero(bottom_image)
print('top', top_pixels)
print('bottom', bottom_pixels)
cv2.imshow("detected", original_image)
print('contours detected: {}'.format(len(contours)))
cv2.waitKey(0)
One insight you could use is that an embossed object is usually brighter than a debossed object.
I would probably do an edge detection to find the "boss-edges" which should form a closed polygon, and compare the relative lightness value of the enclosed "bossment". Special care must be taken for objects with holes, e.g. the letter O, but it is do-able.
You can probably do more sophisticated processing if you know the light direction that is hitting the bossment. e.g. if you know the light is coming from top left, you can try only focusing on the top left edge pixels
I'm currently working on an image where I have to find the box outer region. But I failed to find the white and black boxes regions.
input image:
https://i.imgur.com/gec9eP5.png
output image:
https://i.imgur.com/Giz1DAW.png
Update edit:
if I use HLS instead of HSV I can find 3 more box region but 2 is still missing.
here is new output:
https://i.imgur.com/eUqltKI.png
and here is my code:
import cv2
import numpy as np
img = cv2.imread("1.png")
imghsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower_blue = np.array([0,50,0])
upper_blue = np.array([255,255,255])
mask_blue = cv2.inRange(imghsv, lower_blue, upper_blue)
_, contours, _ = cv2.findContours(mask_blue, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
im = np.copy(img)
cv2.drawContours(im, contours, -1, (0, 255, 0), 2)
cv2.imwrite("contours_blue.png", im)
The mask you're generating with
mask_blue = cv2.inRange(imghsv, lower_blue, upper_blue)
does not include the bottom row at all, so it's impossible to detect these outlines with
_, contours, _ = cv2.findContours(mask_blue, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
You could try to work with multiple masks / thresholds to account for the different color ranges and merge the detected contours.
Color channel threshold is not the optimal solution in cases when you are dealing with objects of many different colors (that are not known in advance) and with background color that is not necessarily distinctly different from all object colors. Combination of multiple thresholds/conditions could solve the job for this particular image but this same combination can fail for slightly different input, so I think this approach is generally not too good.
I think the problem is very elementary in nature so I would recommend sticking to a simple approach. For example, if you apply Sobel operator on your image, you get result like one below. The intensity of the result is weak on some borders, so I inverted the image colors to make it better visible.
There are tons of tutorials on Sobel operator on the web so I won't go into detail here. On your input image there is no noise, so the intensity outside and within the boxes is zero. I would therefore suggest masking-out all zero values. If you do contour detection after that, you will end up with two contours per square - one will be on the inner side of the border and one on the outer side of the border. If you only want to extract contours on the outer border, see how contour hirearchy works in the OpenCV documentation. If you want to have contour exactly on the border, help yourself with outer contour and erosion.
I have the following code in Python to find contours in my image:
import cv2
im = cv2.imread('test.jpg')
imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(imgray, 127, 255, 0)
im2, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
Now I want to copy the area inside the first contour to another image, but I can't find any tutorial or example code that shows how to do that.
Here's a fully working example. It's a bit overkill in that it outputs all the contours but I think you may find a way to tweak it to your liking. Also not sure what you mean by copying, so I'll assume you just want the contours outputted to a file.
We will start with an image like so (in this case you will notice I don't need to threshold the image). The script below can be broken down into 6 major steps:
Canny filter to find the edges
cv2.findContours to keep track of our contours, note that we only need outer contours hence the cv2.RETR_EXTERNAL flag.
cv2.drawContours draws the shapes of each contour to our image
Loop through all contours and put bounding boxes around them.
Use the x,y,w,h information of our boxes to help us crop every contour
Write the cropped image to a file.
import cv2
image = cv2.imread('images/blobs1.png')
edged = cv2.Canny(image, 175, 200)
contours, hierarchy = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(image, contours, -1, (0,255,0), 3)
cv2.imshow("Show contour", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
for i,c in enumerate(contours):
rect = cv2.boundingRect(c)
x,y,w,h = rect
box = cv2.rectangle(image, (x,y), (x+w,y+h), (0,0,255), 2)
cropped = image[y: y+h, x: x+w]
cv2.imshow("Show Boxes", cropped)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite("blobby"+str(i)+".png", cropped)
cv2.imshow("Show Boxes", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Not familiar with cv2.findContours but I imagine that a contour is represented by an array of points with row/column (x/y values. If this is the case and the contour is a single pixel in width then there should be two points for every row - one each on the left and right extremes of the contour.
For each row in the contour
*select* all the points in the image that are between the two contour points for that row
save those points to a new array.
As #DanMašek points out, if the points in the contour array describe a simple shape with with only the ends, corners or breakpoints represented then you would need to fill in the gaps to use the method above.
Also if the contour shape is something like a star you would need to figure out a different method for determining if an image point is inside of the contour. The method I posted is a bit naive - but might be a good starting point. For a convoluted shape like a star there might be multiple points per row of the contour but it seems like the points would come in pairs and the points you are interested in would be between the pairs.
.