Let's say I have image with embossed and debossed object like this
or
Is there a way to determine the above object is embossed and the below object is debossed using OpenCV? Preferably using C++, but Python is also fine. I couldn't find any good resource on the internet.
Here's an approach which takes advantage of the sunken and lifted contours of the embossed/debossed image. The main idea is:
Convert the image to grayscale
Perform a morphological transformation
Find outlines using Canny edge detection
Dilate canny image to merge individual contours into a single contour
Perform contour detection to find the ROI dimensions of top/bottom halves
Obtain ROI of top/bottom canny image
Count non-zero array elements for each half
Convert to grayscale and perform morphological transformation
Perform canny edge detection to find outlines. The key to determine if an object is embossed/debossed is to compare the canny edges. Here's the approach: We look at the object, if its upper half has more contour/lines/pixels than its lower half then it is debossed. Similarly, if the upper half has less pixels than its lower half then it is embossed.
Now that we have the canny edges, we dialte the image until all the contours connect so we obtain one single object.
We then perform contour detection to obtain the ROI of the objects
From here, we separate each object into top and bottom sections
Now that we have the ROI of the top and bottom sections, we crop the ROI in the canny image
With each half, we count non-zero array elements using cv2.countNonZero(). For the embossed object, we get this
('top', 1085)
('bottom', 1899)
For the debossed object, we get this
('top', 979)
('bottom', 468)
Therefore by comparing the values between the two halves, if the top half has less pixels than the bottom, it is embossed. If it has more, it is debossed.
import numpy as np
import cv2
original_image = cv2.imread("1.jpg")
image = original_image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
morph = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)
canny = cv2.Canny(morph, 130, 255, 1)
# Dilate canny image so contours connect and form a single contour
dilate = cv2.dilate(canny, kernel, iterations=4)
cv2.imshow("morph", morph)
cv2.imshow("canny", canny)
cv2.imshow("dilate", dilate)
# Find contours in the image
cnts = cv2.findContours(dilate.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
contours = []
# For each image separate it into top/bottom halfs
for c in cnts:
# Obtain bounding rectangle for each contour
x,y,w,h = cv2.boundingRect(c)
# Draw bounding box rectangle
cv2.rectangle(original_image,(x,y),(x+w,y+h),(0,255,0),3)
# cv2.rectangle(original_image,(x,y),(x+w,y+h/2),(0,255,0),3) # top
# cv2.rectangle(original_image,(x,y+h/2),(x+w,y+h),(0,255,0),3) # bottom
top_half = ((x,y), (x+w, y+h/2))
bottom_half = ((x,y+h/2), (x+w, y+h))
# Collect top/bottom ROIs
contours.append((top_half, bottom_half))
for index, c in enumerate(contours):
top_half, bottom_half = c
top_x1,top_y1 = top_half[0]
top_x2,top_y2 = top_half[1]
bottom_x1,bottom_y1 = bottom_half[0]
bottom_x2,bottom_y2 = bottom_half[1]
# Grab ROI of top/bottom section from canny image
top_image = canny[top_y1:top_y2, top_x1:top_x2]
bottom_image = canny[bottom_y1:bottom_y2, bottom_x1:bottom_x2]
cv2.imshow('top_image', top_image)
cv2.imshow('bottom_image', bottom_image)
# Count non-zero array elements
top_pixels = cv2.countNonZero(top_image)
bottom_pixels = cv2.countNonZero(bottom_image)
print('top', top_pixels)
print('bottom', bottom_pixels)
cv2.imshow("detected", original_image)
print('contours detected: {}'.format(len(contours)))
cv2.waitKey(0)
One insight you could use is that an embossed object is usually brighter than a debossed object.
I would probably do an edge detection to find the "boss-edges" which should form a closed polygon, and compare the relative lightness value of the enclosed "bossment". Special care must be taken for objects with holes, e.g. the letter O, but it is do-able.
You can probably do more sophisticated processing if you know the light direction that is hitting the bossment. e.g. if you know the light is coming from top left, you can try only focusing on the top left edge pixels
Related
I am planning to split the questions from this PDF document. The challenge is that the questions are not orderly spaced. For example the first question occupies an entire page, second also the same while the third and fourth together make up one page. If I have to manually slice it, it will be ages. So, I thought to split it up into images and work on them. Is there a possibility to take image as this
and split into individual components like this?
This is a classic situation for dilate. The idea is that adjacent text corresponds with the same question while text that is farther away is part of another question. Whenever you want to connect multiple items together, you can dilate them to join adjacent contours into a single contour. Here's a simple approach:
Obtain binary image. Load the image, convert to grayscale, Gaussian blur, then Otsu's threshold to obtain a binary image.
Remove small noise and artifacts. We create a rectangular kernel and morph open to remove small noise and artifacts in the image.
Connect adjacent words together. We create a larger rectangular kernel and dilate to merge individual contours together.
Detect questions. From here we find contours, sort contours from top-to-bottom using imutils.sort_contours(), filter with a minimum contour area, obtain the rectangular bounding rectangle coordinates and highlight the rectangular contours. We then crop each question using Numpy slicing and save the ROI image.
Otsu's threshold to obtain a binary image
Here's where the interesting section happens. We assume that adjacent text/characters are part of the same question so we merge individual words into a single contour. A question is a section of words that are close together so we dilate to connect them all together.
Individual questions highlighted in green
Top question
Bottom question
Saved ROI questions (assumption is from top-to-bottom)
Code
import cv2
from imutils import contours
# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove small artifacts and noise with morph open
open_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, open_kernel, iterations=1)
# Create rectangular structuring element and dilate
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
dilate = cv2.dilate(opening, kernel, iterations=4)
# Find contours, sort from top to bottom, and extract each question
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts, _) = contours.sort_contours(cnts, method="top-to-bottom")
# Get bounding box of each question, crop ROI, and save
question_number = 0
for c in cnts:
# Filter by area to ensure its not noise
area = cv2.contourArea(c)
if area > 150:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
question = original[y:y+h, x:x+w]
cv2.imwrite('question_{}.png'.format(question_number), question)
question_number += 1
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
We may solve it using (mostly) morphological operations:
Read the input image as grayscale.
Apply thresholding with inversion.
Automatic thresholding using cv2.THRESH_OTSU is working well.
Apply opening morphological operation for removing small artifacts (using the kernel np.ones(1, 3))
Dilate horizontally with very long horizontal kernel - make horizontal lines out of the text lines.
Apply closing vertically - create two large clusters.
The size of the vertical kernel should be tuned according to the typical gap.
Finding connected components with statistics.
Iterate the connected components and crop the relevant area in the vertical direction.
Complete code sample:
import cv2
import numpy as np
img = cv2.imread('scanned_image.png', cv2.IMREAD_GRAYSCALE) # Read image as grayscale
thesh = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)[1] # Apply automatic thresholding with inversion.
thesh = cv2.morphologyEx(thesh, cv2.MORPH_OPEN, np.ones((1, 3), np.uint8)) # Apply opening morphological operation for removing small artifacts.
thesh = cv2.dilate(thesh, np.ones((1, img.shape[1]), np.uint8)) # Dilate horizontally - make horizontally lines out of the text.
thesh = cv2.morphologyEx(thesh, cv2.MORPH_CLOSE, np.ones((50, 1), np.uint8)) # Apply closing vertically - create two large clusters
nlabel, labels, stats, centroids = cv2.connectedComponentsWithStats(thesh, 4) # Finding connected components with statistics
parts_list = []
# Iterate connected components:
for i in range(1, nlabel):
top = int(stats[i, cv2.CC_STAT_TOP]) # Get most top y coordinate of the connected component
height = int(stats[i, cv2.CC_STAT_HEIGHT]) # Get the height of the connected component
roi = img[top-5:top+height+5, :] # Crop the relevant part of the image (add 5 extra rows from top and bottom).
parts_list.append(roi.copy()) # Add the cropped area to a list
cv2.imwrite(f'part{i}.png', roi) # Save the image part for testing
cv2.imshow(f'part{i}', roi) # Show part for testing
# Show image and thesh testing
cv2.imshow('img', img)
cv2.imshow('thesh', thesh)
cv2.waitKey()
cv2.destroyAllWindows()
Results:
Stage 1:
Stage 2:
Stage 3:
Stage 4:
Top area:
Bottom area:
I have found contours of some rectangles in an image and created a mask as shown below. What I am trying to do is finding those two columns of rectangles as highlighted in the image.
The source image:
Columns highlighted:
Desired output:
I'm not sure if one can achieve that with simple algorithm. If those inline-vertical rectangle does not change their position, You could just define (hardcoded) ROI (region of interest) for them, based on pixel coordinates.
If not using any machine learning to solve this, defining ROI is Your best option.
Feel free to ask, if needed.
Assuming the desired columns have significantly more contours than the other columns – as shown in the given example image – simple dilation with some vertical structuring element might be sufficient to find those columns:
Load image as grayscale, get rid of JPG artifacts.
Dilation with "small" vertical structuring element to combine all contours within a column.
Opening with "large" vertical structuring element to neglect all smaller contours. The two columns in question will now have quite large contours.
For safety reasons: Again, dilation with "small" vertical structuring element. Since the columns are slightly rotated, skewed, ... single pixels might've been falsely removed by the opening.
Find remaining contours.
For each contour: Get the bounding boxes in two steps, since we have to refine the bounding box by using the original image (because of step 4).
That'd be the full Python code:
import cv2
import numpy as np
# Read image, get rid of JPG artifacts
img = cv2.imread('gXylF.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)[1]
# Dilating using vertical structuring element to combine contours
mask = cv2.dilate(img, kernel=np.ones((21, 1)))
# Opening using vertical structuring element to neglect small contours
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel=np.ones((201, 1)))
# Dilating using rectangular structuring element; for safety reasons:
# Single pixels might've been falsely removed by the opening
mask = cv2.dilate(mask, kernel=np.ones((11, 11)))
# Find remaining contours w.r.t. the OpenCV version
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
# Iterate remaining contours
for i, cnt in enumerate(cnts):
# Get bounding rectangle of contour, and region of interest (ROI)
(x, y, w, h) = cv2.boundingRect(cnt)
roi = img[y:y+h, x:x+w]
# Get bounding rectangle of actual values in ROI; the contour of the
# mask is larger than the actual contours in the original image
(x, y, w, h) = cv2.boundingRect(roi)
roi = roi[y:y+h, x:x+w]
cv2.imwrite('{}.png'.format(i), roi)
The two exported images look like this:
If you, instead, wanted to mark the regions of interest (ROIs) within the original image, pay attention to use the correct coordinates of the bounding boxes (refined coordinates are w.r.t. to the first cut ROI).
If your other images show more rotation, skew, ... you'd might need to re-draw the contours w.r.t. the original image, if neighbouring, small contours should be neglected completely.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
NumPy: 1.20.1
OpenCV: 4.5.1
----------------------------------------
I'm wondering how to count objects that are connected to each other as two distinct objects using cv2.findContours in Python
For example this image :
contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
Will output one contour.
What can I do to get two contours instead ?
This can be done by converting your input image to edge image and then detecting contours. But in this case, there is a break in edge image(I tried with canny) at the intersection of the two objects as shown below.
Break in Canny:
Whereas the expected edge image should have all the pixels at the boundary as white.
Expected edge image:
Thus to get this perfect edge image, I have created an algo shared below(This algo will work only on a binary image like this with objects filled with white color).
Before using this algo, make sure that the object does not lie on the boundary, that is, all the boundary pixels of the image should be black. If not black, add a border on all sides of the image of black colour and length 1 pixel.
# Creating black image of same shape and single channel
edge = np.zeros(img.shape, dtype = np.uint8)
h, w = img.shape[:2]
# Iterating over each pixel except those at the boundary
for i in range(1, h-1):
for j in range(1, w-1):
# if current pixel is white
if img[i][j] == 255:
# roi is the image of 9 pixel from around the current pixel
roi = img[i-1:i+2, j-1:j+2].copy()
# Counting number of black pixel in the neighbourhood of current pixel
blackCount = np.sum(roi == 0)
# if a neighbouring pixel is black, then current pixel is a boundary pixel.
if blackCount > 0:
edge[i][j] = 255
After finding the edge image, get all the contours in the image:
cont, hier = cv2.findContours(edge, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
For this image, you will get 3 contours, 2 for the two objects and 1 for the two objects combined. To eliminate the contour of both the objects combined, use the hierarchy information.
# I am taking only those contours which do not have a child contour.
finalContours = np.asarray([cont[i] for i in range(len(cont)) if hier[0][i][2] == -1])
"finalContours" will have the 2 contours for the two objects.
Refer to this link for more information about the parent-child relationship of the contours
I have the following code in Python to find contours in my image:
import cv2
im = cv2.imread('test.jpg')
imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(imgray, 127, 255, 0)
im2, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
Now I want to copy the area inside the first contour to another image, but I can't find any tutorial or example code that shows how to do that.
Here's a fully working example. It's a bit overkill in that it outputs all the contours but I think you may find a way to tweak it to your liking. Also not sure what you mean by copying, so I'll assume you just want the contours outputted to a file.
We will start with an image like so (in this case you will notice I don't need to threshold the image). The script below can be broken down into 6 major steps:
Canny filter to find the edges
cv2.findContours to keep track of our contours, note that we only need outer contours hence the cv2.RETR_EXTERNAL flag.
cv2.drawContours draws the shapes of each contour to our image
Loop through all contours and put bounding boxes around them.
Use the x,y,w,h information of our boxes to help us crop every contour
Write the cropped image to a file.
import cv2
image = cv2.imread('images/blobs1.png')
edged = cv2.Canny(image, 175, 200)
contours, hierarchy = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(image, contours, -1, (0,255,0), 3)
cv2.imshow("Show contour", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
for i,c in enumerate(contours):
rect = cv2.boundingRect(c)
x,y,w,h = rect
box = cv2.rectangle(image, (x,y), (x+w,y+h), (0,0,255), 2)
cropped = image[y: y+h, x: x+w]
cv2.imshow("Show Boxes", cropped)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite("blobby"+str(i)+".png", cropped)
cv2.imshow("Show Boxes", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Not familiar with cv2.findContours but I imagine that a contour is represented by an array of points with row/column (x/y values. If this is the case and the contour is a single pixel in width then there should be two points for every row - one each on the left and right extremes of the contour.
For each row in the contour
*select* all the points in the image that are between the two contour points for that row
save those points to a new array.
As #DanMašek points out, if the points in the contour array describe a simple shape with with only the ends, corners or breakpoints represented then you would need to fill in the gaps to use the method above.
Also if the contour shape is something like a star you would need to figure out a different method for determining if an image point is inside of the contour. The method I posted is a bit naive - but might be a good starting point. For a convoluted shape like a star there might be multiple points per row of the contour but it seems like the points would come in pairs and the points you are interested in would be between the pairs.
.
I am working on Retinal fundus images.The image consists of a circular retina on a black background. With OpenCV, I have managed to get a contour which surrounds the whole circular Retina. What I need is to crop out the circular retina from the black background.
It is unclear in your question whether you want to actually crop out the information that is defined within the contour or mask out the information that isn't relevant to the contour chosen. I'll explore what to do in both situations.
Masking out the information
Assuming you ran cv2.findContours on your image, you will have received a structure that lists all of the contours available in your image. I'm also assuming that you know the index of the contour that was used to surround the object you want. Assuming this is stored in idx, first use cv2.drawContours to draw a filled version of this contour onto a blank image, then use this image to index into your image to extract out the object. This logic masks out any irrelevant information and only retain what is important - which is defined within the contour you have selected. The code to do this would look something like the following, assuming your image is a grayscale image stored in img:
import numpy as np
import cv2
img = cv2.imread('...', 0) # Read in your image
# contours, _ = cv2.findContours(...) # Your call to find the contours using OpenCV 2.4.x
_, contours, _ = cv2.findContours(...) # Your call to find the contours
idx = ... # The index of the contour that surrounds your object
mask = np.zeros_like(img) # Create mask where white is what we want, black otherwise
cv2.drawContours(mask, contours, idx, 255, -1) # Draw filled contour in mask
out = np.zeros_like(img) # Extract out the object and place into output image
out[mask == 255] = img[mask == 255]
# Show the output image
cv2.imshow('Output', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
If you actually want to crop...
If you want to crop the image, you need to define the minimum spanning bounding box of the area defined by the contour. You can find the top left and lower right corner of the bounding box, then use indexing to crop out what you need. The code will be the same as before, but there will be an additional cropping step:
import numpy as np
import cv2
img = cv2.imread('...', 0) # Read in your image
# contours, _ = cv2.findContours(...) # Your call to find the contours using OpenCV 2.4.x
_, contours, _ = cv2.findContours(...) # Your call to find the contours
idx = ... # The index of the contour that surrounds your object
mask = np.zeros_like(img) # Create mask where white is what we want, black otherwise
cv2.drawContours(mask, contours, idx, 255, -1) # Draw filled contour in mask
out = np.zeros_like(img) # Extract out the object and place into output image
out[mask == 255] = img[mask == 255]
# Now crop
(y, x) = np.where(mask == 255)
(topy, topx) = (np.min(y), np.min(x))
(bottomy, bottomx) = (np.max(y), np.max(x))
out = out[topy:bottomy+1, topx:bottomx+1]
# Show the output image
cv2.imshow('Output', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
The cropping code works such that when we define the mask to extract out the area defined by the contour, we additionally find the smallest horizontal and vertical coordinates which define the top left corner of the contour. We similarly find the largest horizontal and vertical coordinates that define the bottom left corner of the contour. We then use indexing with these coordinates to crop what we actually need. Note that this performs cropping on the masked image - that is the image that removes everything but the information contained within the largest contour.
Note with OpenCV 3.x
It should be noted that the above code assumes you are using OpenCV 2.4.x. Take note that in OpenCV 3.x, the definition of cv2.findContours has changed. Specifically, the output is a three element tuple output where the first image is the source image, while the other two parameters are the same as in OpenCV 2.4.x. Therefore, simply change the cv2.findContours statement in the above code to ignore the first output:
_, contours, _ = cv2.findContours(...) # Your call to find contours
Here's another approach to crop out a rectangular ROI. The main idea is to find the edges of the retina using Canny edge detection, find contours, and then extract the ROI using Numpy slicing. Assuming you have an input image like this:
Extracted ROI
import cv2
# Load image, convert to grayscale, and find edges
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY)[1]
# Find contour and sort by contour area
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
# Find bounding box and extract ROI
for c in cnts:
x,y,w,h = cv2.boundingRect(c)
ROI = image[y:y+h, x:x+w]
break
cv2.imshow('ROI',ROI)
cv2.imwrite('ROI.png',ROI)
cv2.waitKey()
This is a pretty simple way. Mask the image with transparency.
Read the image
Make a grayscale version.
Otsu Threshold
Apply morphology open and close to thresholded image as a mask
Put the mask into the alpha channel of the input
Save the output
Input
Code
import cv2
import numpy as np
# load image as grayscale
img = cv2.imread('retina.jpeg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold input image using otsu thresholding as mask and refine with morphology
ret, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
kernel = np.ones((9,9), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
# put mask into alpha channel of result
result = img.copy()
result = cv2.cvtColor(result, cv2.COLOR_BGR2BGRA)
result[:, :, 3] = mask
# save resulting masked image
cv2.imwrite('retina_masked.png', result)
Output