I am trying to detect all of the overlapping circle/ellipse shapes in this image, all of which have digits on them. I have tried different image processing techniques using OpenCV, but I cannot detect the shapes that overlap the tree. I have tried erosion and dilation, but it has not helped.
Any pointers on how to go about this would be great. I have attached my code below:
original = frame.copy()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3, 3), 0)
canny = cv2.Canny(blurred, 120, 255, 1)
kernel = np.ones((5, 5), np.uint8)
dilate = cv2.dilate(canny, kernel, iterations=1)
# Find contours
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
image_number = 0
for c in cnts:
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(frame, (x, y), (x + w, y + h), (36, 255, 12), 2)
    ROI = original[y:y + h, x:x + w]
    cv2.imwrite("ROI_{}.png".format(image_number), ROI)
    image_number += 1
cv2.imshow('canny', canny)
cv2.imshow('image', frame)
cv2.waitKey(0)
Here's a possible solution. I'm assuming that the target blobs (the saucer-like things) are always labeled - that is, they always have a white number inside them. The idea is to create a digits mask, because their size and color seem to be constant. I use the digits as guide to obtain sample pixels of the ellipses. Then, I convert these BGR pixels to HSV, create a binary mask and use that info to threshold and locate the ellipses. Let's check out the code:
# imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "4dzfr.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Get binary image via a fixed threshold of 200:
binaryImage = np.where(grayscaleImage >= 200, 255, 0)
# np.where returns a wider integer type here,
# so convert back to 8-bit unsigned:
binaryImage = binaryImage.astype(np.uint8)
The first step is to make a mask of the digits. I also created a deep copy of the BGR image. The digits are close to white (That is, an intensity close to 255). I use 200 as threshold and obtain this result:
Now, let's locate these contours on this binary mask. I'm filtering based on aspect ratio, as the digits have a distinct aspect ratio close to 0.70. I'm also filtering contours based on hierarchy, as I'm only interested in external contours (the ones that do not have a parent). That's because I really don't need contours like the "holes" inside the digit 8:
# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# Store the sampled pixels here:
sampledPixels = []
# Look for the outer bounding boxes (no parent):
for i, c in enumerate(contours):
    # Get the contour bounding rectangle:
    boundRect = cv2.boundingRect(c)
    # Get the dimensions of the bounding rect:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]
    # Compute the aspect ratio:
    aspectRatio = rectWidth / rectHeight
    # Create the filtering threshold value:
    delta = abs(0.7 - aspectRatio)
    epsilon = 0.1
    # Get the hierarchy (index of the parent contour, -1 if there is none):
    currentHierarchy = hierarchy[0][i][3]
    # Prepare the list of sampling points (One for the ellipse, one for the circle):
    samplingPoints = [ (rectX - rectWidth, rectY), (rectX, rectY - rectHeight) ]
    # Look for the target contours:
    if delta < epsilon and currentHierarchy == -1:
        # This list will hold both sampling pixels:
        pixelList = []
        # Get sampling pixels from the two locations:
        for s in range(2):
            # Get sampling point:
            sampleX = samplingPoints[s][0]
            sampleY = samplingPoints[s][1]
            # Get sample BGR pixel:
            samplePixel = inputImageCopy[sampleY, sampleX]
            # Store into temp list:
            pixelList.append(samplePixel)
        # convert list to tuple:
        pixelList = tuple(pixelList)
        # Save pixel value:
        sampledPixels.append(pixelList)
OK, there are a couple of things happening in the last snippet of code. We want to sample pixels from both the ellipse and the circle. We will use two sampling locations that are a function of each digit's position. These positions are defined in the samplingPoints list. For the ellipse, I'm sampling a little to the left of the top-left corner of the digit's bounding box. For the circle, I'm sampling directly above that top-left corner, so we end up with two pixels for each figure.
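As a side note, the two sampling locations can fall outside the image when a digit sits near the left or top border, and a negative NumPy index silently wraps around to the opposite side. The sketch below is one possible way to guard against that; `sampling_points` is a hypothetical helper, not part of the code above, and the rectangles are made-up values.

```python
def sampling_points(rect, image_shape):
    """Return the two sampling points for a digit box, clamped to the image.

    rect is (x, y, w, h) as returned by cv2.boundingRect.
    This helper is hypothetical and only illustrates the clamping idea.
    """
    x, y, w, h = rect
    height, width = image_shape[:2]
    # One point to the left of the top-left corner, one above it:
    candidates = [(x - w, y), (x, y - h)]
    return [(min(max(px, 0), width - 1), min(max(py, 0), height - 1))
            for px, py in candidates]


# A digit well inside a 100 x 200 (rows x cols) image:
print(sampling_points((40, 30, 7, 10), (100, 200)))
# A digit close to the top-left border, where clamping kicks in:
print(sampling_points((3, 4, 7, 10), (100, 200)))
```

Clamping keeps the lookup valid, but a clamped point may land on the digit itself rather than on the ellipse, so skipping such digits could be the safer choice.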
You'll notice I'm doing a little bit of data type juggling, converting lists to tuples. I want these pixels stored as a tuple for convenience. If I draw bounding rectangles of the digits, this would be the resulting image:
Now, let's loop through the pixel list, convert them to HSV and create a HSV mask over the original BGR image. The final bounding rectangles of the ellipses are stored in boundingRectangles, additionally I draw results on the deep copy of the original input:
# Final bounding rectangles are stored here:
boundingRectangles = []
# Loop through sampled pixels:
for p in range(len(sampledPixels)):
    # Get current pixel tuple:
    currentPixelTuple = sampledPixels[p]
    # Prepare the HSV mask:
    imageHeight, imageWidth = binaryImage.shape[:2]
    hsvMask = np.zeros((imageHeight, imageWidth), np.uint8)
    # Process the two sampling pixels:
    for m in range(len(currentPixelTuple)):
        # Get current pixel in the list:
        currentPixel = currentPixelTuple[m]
        # Create BGR Mat:
        pixelMat = np.zeros([1, 1, 3], dtype=np.uint8)
        pixelMat[0, 0] = currentPixel
        # Convert the BGR pixel to HSV:
        hsvPixel = cv2.cvtColor(pixelMat, cv2.COLOR_BGR2HSV)
        # Cast to int so the subtractions below cannot wrap around in uint8:
        H = int(hsvPixel[0][0][0])
        S = int(hsvPixel[0][0][1])
        V = int(hsvPixel[0][0][2])
        # Create HSV range for this pixel:
        rangeThreshold = 5
        lowerValues = np.array([H - rangeThreshold, S - rangeThreshold, V - rangeThreshold])
        upperValues = np.array([H + rangeThreshold, S + rangeThreshold, V + rangeThreshold])
        # Create HSV mask:
        hsvImage = cv2.cvtColor(inputImage.copy(), cv2.COLOR_BGR2HSV)
        tempMask = cv2.inRange(hsvImage, lowerValues, upperValues)
        hsvMask = hsvMask + tempMask
First, I create a 1 x 1 Matrix (or Numpy Array) with just a BGR pixel value - the first of two I previously sampled. In this way, I can use cv2.cvtColor to get the corresponding HSV values. Then, I create lower and upper threshold values for the HSV mask. However, the image seems synthetic, and a range-based thresholding could be reduced to a unique tuple. After that, I create the HSV mask using cv2.inRange.
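To make the range construction concrete, here is a minimal NumPy-only sketch of building lower and upper HSV bounds around one sampled pixel. `hsv_bounds` is a hypothetical helper, and the clipping is my addition; it assumes OpenCV's 8-bit HSV convention (H in 0..179, S and V in 0..255).

```python
import numpy as np


def hsv_bounds(h, s, v, margin=5):
    """Build inclusive lower/upper HSV bounds around one sampled pixel.

    Hypothetical helper: the values are clipped to the 8-bit HSV limits,
    so hue wrap-around (reds near 0/179) is NOT handled here.
    """
    center = np.array([int(h), int(s), int(v)])
    limit = np.array([179, 255, 255])
    lower = np.clip(center - margin, 0, limit).astype(np.uint8)
    upper = np.clip(center + margin, 0, limit).astype(np.uint8)
    return lower, upper


lower, upper = hsv_bounds(2, 250, 128)
print(lower, upper)
```

The two arrays can then be passed as the bounds of cv2.inRange. Doing the arithmetic on plain integers first avoids unsigned 8-bit wrap-around for pixels close to 0 or 255.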
This will yield the HSV mask for the ellipse. After applying the method for the circle we will end up with two HSV masks. Well, I just added the two arrays to combine both masks. At the end you will have something like this, this is the "composite" HSV mask created for the first saucer-like figure:
We can apply a little bit of morphology to join both shapes, just a little closing will do:
# Set kernel (structuring element) size:
kernelSize = 3
# Set morph operation iterations:
opIterations = 2
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
hsvMask = cv2.morphologyEx(hsvMask, cv2.MORPH_CLOSE, morphKernel, None, None, opIterations,cv2.BORDER_REFLECT101)
This is the result:
Nice. Let's continue and get the bounding rectangles of all the shapes. I'm using the boundingRectangles list to store each bounding rectangle, like this:
# Process current contour:
currentContour, _ = cv2.findContours(hsvMask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for _, c in enumerate(currentContour):
    # Get the contour's bounding rectangle:
    boundRect = cv2.boundingRect(c)
    # Get the dimensions of the bounding rect:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]
    # Store and set bounding rect:
    boundingRectangles.append(boundRect)
    color = (0, 0, 255)
    cv2.rectangle(inputImageCopy, (int(rectX), int(rectY)),
                  (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2)
cv2.imshow("Objects", inputImageCopy)
cv2.waitKey(0)
This is the image of the located rectangles once every sampled pixel is processed:
I'm trying to crop both columns from several pages like this one in order to OCR them later. I'm looking at splitting the page along the vertical line.
What I've got so far is finding the header, so that it can be cropped out:
image = cv2.imread('014-page1.jpg')
im_h, im_w, im_d = image.shape
base_image = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Create rectangular structuring element and dilate
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (50,10))
dilate = cv2.dilate(thresh, kernel, iterations=1)
# Find contours and draw rectangle
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=lambda x: cv2.boundingRect(x)[1])
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    if h < 20 and w > 250:
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
How could I split the page vertically, and grab the text in sequence from the columns? Or alternatively, is there a better way to go about this?
Here's my take on the problem. It involves selecting a middle portion of the image, assuming the vertical line runs through the whole image (or at least passes through the middle of the page). I process this Region of Interest (ROI) and then reduce it to a row. Then, I get the starting and ending horizontal coordinates of the crop. With this information I produce the final cropped images.
I tried to make the algorithm general. It can split all the columns if you have more than two columns in the original image. Let's check out the code:
# Imports:
import numpy as np
import cv2
# Image path
path = "D://opencvImages//"
fileName = "pmALU.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# To grayscale:
grayImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Otsu Threshold:
_, binaryImage = cv2.threshold(grayImage, 0, 255, cv2.THRESH_OTSU)
# Get image dimensions:
(imageHeight, imageWidth) = binaryImage.shape[:2]
# Set middle ROI dimensions:
middleVertical = 0.5 * imageHeight
roiWidth = imageWidth
roiHeight = int(0.1 * imageHeight)
middleRoiVertical = 0.5 * roiHeight
roiY = int(0.5 * imageHeight - middleRoiVertical)
The first portion of the code gets the ROI. I set it to crop around the middle of the image. Let's just visualize the ROI that will be used for processing:
The next step is to crop this:
# Slice the ROI:
middleRoi = binaryImage[roiY:roiY + roiHeight, 0:imageWidth]
# Show and save the ROI (the ".png" extension is an assumption):
cv2.imshow("middleRoi", middleRoi)
cv2.waitKey(0)
cv2.imwrite(path + "middleRoi.png", middleRoi)
This produces the following crop:
Alright. The idea is to reduce this image to one row. If I take the minimum value of each column and store it in one row, only the columns that contain no text stay white, so I should get a big white portion where the vertical line passes through.
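The reduction can be illustrated with plain NumPy on a made-up ROI. This is only a toy stand-in for cv2.reduce with the MIN mode along dimension 0; the array values are invented for the example.

```python
import numpy as np

# Toy binary ROI: 255 = white background, 0 = black ink.
roi = np.array([
    [255,   0, 255, 255, 255,   0, 255],
    [255, 255,   0, 255, 255,   0, 255],
    [255,   0,   0, 255, 255, 255,   0],
], dtype=np.uint8)

# Per-column minimum: a column stays white only if every pixel in it is white.
reduced_row = roi.min(axis=0, keepdims=True)
print(reduced_row)
```

Columns 0, 3 and 4 contain no ink, so they are the only ones left white in the single-row result.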
Now, there's a problem here. If I directly reduce this image, this would be the result (the following is an image of the reduced row):
The image is a little bit small, but you can see the row produces two black columns at the sides, followed by two white blobs. That's because the image has been scanned, additionally the text seems to be justified and some margins are produced at the sides. I only need the central white blob with everything else in black.
I can solve this in two steps: first, draw a white rectangle around the image before reducing it, which takes care of the black columns. After this, I can flood-fill with black at both ends of the reduced image:
# White rectangle around ROI:
rectangleThickness = int(0.01 * imageHeight)
cv2.rectangle(middleRoi, (0, 0), (roiWidth, roiHeight), 255, rectangleThickness)
# Image reduction to a row:
reducedImage = cv2.reduce(middleRoi, 0, cv2.REDUCE_MIN)
# Flood fill at the extreme corners:
fillPositions = [0, imageWidth - 1]
for i in range(len(fillPositions)):
    # Get flood-fill coordinate:
    x = fillPositions[i]
    currentCorner = (x, 0)
    fillColor = 0
    cv2.floodFill(reducedImage, None, currentCorner, fillColor)
Now, the reduced image looks like this:
Nice. But there's another problem. The central black line produced a "gap" at the center of the row. Not a problem really, because I can fill that gap with a closing:
# Apply Closing:
kernel = np.ones((3, 3), np.uint8)
reducedImage = cv2.morphologyEx(reducedImage, cv2.MORPH_CLOSE, kernel, iterations=2)
This is the result. No more central gap:
Cool. Let's get the horizontal positions (indices) where the transitions from black to white and vice versa occur, starting at 0:
# Get horizontal transitions:
whiteSpaces = np.where(np.diff(reducedImage, prepend=np.nan))[1]
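For intuition, here is a small sketch of transition detection on a made-up single-row image. It uses a slightly different expression (a signed difference plus `np.flatnonzero`) and, unlike the snippet above, it does not add a leading 0 to the list.

```python
import numpy as np

# Toy reduced row: two white runs on a black background.
row = np.array([[0, 0, 255, 255, 0, 0, 255, 255, 0]], dtype=np.uint8)

# Use a signed type so that 0 - 255 does not wrap around in uint8.
signed = row.astype(np.int16)

# A non-zero difference between neighbours marks a black/white transition;
# +1 converts "position in the diff" to "index of the first changed pixel".
transitions = np.flatnonzero(np.diff(signed[0])) + 1
print(transitions)
```

Consecutive pairs of indices delimit each white run: here the runs cover columns 2 to 3 and 6 to 7.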
I now know where to crop. Let's see:
# Crop the image:
colWidth = len(whiteSpaces)
spaceMargin = 0
for x in range(0, colWidth, 2):
    # Get horizontal cropping coordinates:
    if x != colWidth - 1:
        x2 = whiteSpaces[x + 1]
        spaceMargin = (whiteSpaces[x + 2] - whiteSpaces[x + 1]) // 2
    else:
        x2 = imageWidth
    # Set horizontal cropping coordinates:
    x1 = whiteSpaces[x] - spaceMargin
    x2 = x2 + spaceMargin
    # Clamp and Crop original input:
    x1 = clamp(x1, 0, imageWidth)
    x2 = clamp(x2, 0, imageWidth)
    currentCrop = inputImage[0:imageHeight, x1:x2]
    cv2.imshow("currentCrop", currentCrop)
    cv2.waitKey(0)
You'll note I calculate a margin. This is to crop the margins of the columns. I also use a clamp function to make sure the horizontal cropping points are always within image dimensions. This is the definition of that function:
# Clamps an integer to a valid range:
def clamp(val, minval, maxval):
    if val < minval: return minval
    if val > maxval: return maxval
    return val
These are the results (resized for the post, open them in a new tab to see the full image):
Let's check out how this scales to more than two columns. This is a modification of the original input, with more columns added manually, just to check out the results:
These are the four images produced:
In order to separate out the two columns you have to find the dividing line in the center.
You can use a Sobel derivative filter along the x-axis to find the black vertical line. Follow this tutorial for more details on the Sobel operator.
sobel_vertical = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=3) # (1,0) for x direction derivatives
Extract the line position by thresholding the sobel result:
ret, sobel_thresh = cv2.threshold(sobel_vertical,127,255,cv2.THRESH_BINARY)
Then scan the center columns for a column with a high concentration of white values.
One way to do this would be to do a column-wise sum and then find the column with the maximum values. But there are other ways to do it.
sum_cols = np.add.reduce(sobel_thresh, axis = 0) # axis 0 sums down each column
max_col = np.argmax(sum_cols)
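As a quick sanity check of the column-wise sum idea, here is a toy example with a synthetic edge map (the array is made up for illustration):

```python
import numpy as np

# Toy thresholded edge map: a full-height vertical line at column 4
# plus one stray response elsewhere.
edges = np.zeros((5, 8), dtype=np.uint8)
edges[:, 4] = 255
edges[2, 1] = 255

# Summing along axis 0 collapses the rows, leaving one total per column.
column_sums = edges.sum(axis=0, dtype=np.int64)
line_column = int(np.argmax(column_sums))
print(column_sums, line_column)
```

Summing along axis 1 would instead give one total per row, which cannot locate a vertical line.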
In a case where there is no black dividing line you can skip the sobel. Just resize aggressively and search for the columns in the center with high concentration of white pixels.
I am trying to convert the result of a skeletonization into a set of line segments, where the vertices correspond to the junction points of the skeleton. The shape is not a closed polygon and it may be somewhat noisy (the segments are not as straight as they should be).
Here is an example input image:
And here are the points I want to retrieve:
I have tried using the Harris corner detector, but it has trouble in some areas (such as the angled section at the bottom of the image), even after tweaking the algorithm's parameters. Here are the results:
Do you know of any method capable of doing this? I am using python with mostly OpenCV and Numpy but I am not bound to any library. Thanks in advance.
Edit: I've gotten some good responses regarding the junction points, and I am really grateful. I would also appreciate any solutions for extracting line segments from the junction points. I think @nathancy's answer could be used to extract line segments by subtracting the intersection mask from the line masks, but I am not sure.
My approach is based on my previous answer here. It involves convolving the image with a special kernel. This convolution identifies the end-points of the lines, as well as the intersections. This will result in a points mask containing the pixels that match the points you are looking for. After that, apply a little bit of morphology to join possible duplicated points. The method is sensitive to the corners produced by the skeleton.
This is the code:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "Repn3.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
inputImageCopy = inputImage.copy()
# Convert to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Compute the skeleton:
skeleton = cv2.ximgproc.thinning(grayscaleImage, None, 1)
# Threshold the image so that white pixels get a value of 10 and
# black pixels a value of 0:
_, binaryImage = cv2.threshold(skeleton, 128, 10, cv2.THRESH_BINARY)
# Set the convolution kernel:
h = np.array([[1, 1, 1],
              [1, 10, 1],
              [1, 1, 1]])
# Convolve the image with the kernel:
imgFiltered = cv2.filter2D(binaryImage, -1, h)
So far I convolved the skeleton image with my special kernel. You can inspect the image produced and search for the numerical values at the corners and intersections.
This is the output so far:
Next, identify a corner or an intersection. This bit is tricky, because the threshold value depends directly on the skeleton image, which sometimes doesn't produce good (close to straight) corners:
# Create list of thresholds:
thresh = [130, 110, 40]
# Prepare the final mask of points:
(height, width) = binaryImage.shape
pointsMask = np.zeros((height, width, 1), np.uint8)
# Perform convolution and create points mask:
for t in range(len(thresh)):
    # Get current threshold:
    currentThresh = thresh[t]
    # Locate the threshold in the filtered image:
    tempMat = np.where(imgFiltered == currentThresh, 255, 0)
    # Convert and shape the image to a uint8 height x width x channels
    # numpy array:
    tempMat = tempMat.astype(np.uint8)
    tempMat = tempMat.reshape(height,width,1)
    # Accumulate mask:
    pointsMask = cv2.bitwise_or(pointsMask, tempMat)
This is the binary mask:
Let's dilate to join close points:
# Set kernel (structuring element) size:
kernelSize = 3
# Set operation iterations:
opIterations = 4
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform Dilate:
pointsMask = cv2.morphologyEx(pointsMask, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
This is the output:
Now simply extract the external contours. Get their bounding boxes and calculate their centroids:
# Look for the outer contours (no children):
contours, _ = cv2.findContours(pointsMask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Store the points here:
pointsList = []
# Loop through the contours:
for i, c in enumerate(contours):
    # Get the contours bounding rectangle:
    boundRect = cv2.boundingRect(c)
    # Get the centroid of the rectangle:
    cx = int(boundRect[0] + 0.5 * boundRect[2])
    cy = int(boundRect[1] + 0.5 * boundRect[3])
    # Store centroid into list:
    pointsList.append( (cx,cy) )
    # Set centroid circle and text:
    color = (0, 0, 255)
    cv2.circle(inputImageCopy, (cx, cy), 3, color, -1)
    font = cv2.FONT_HERSHEY_COMPLEX
    cv2.putText(inputImageCopy, str(i), (cx, cy), font, 0.5, (0, 255, 0), 1)
# Show image:
cv2.imshow("Circles", inputImageCopy)
cv2.waitKey(0)
This is the result. Some corners are missed, so you might want to improve the solution before computing the skeleton.
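To see why values such as 110 and 130 single out end-points and junctions in the approach above, here is a NumPy-only sketch that applies the same 3 x 3 kernel to a tiny hand-made skeleton. `weighted_neighbourhood` is a hypothetical stand-in for cv2.filter2D; it pads with zeros, whereas filter2D uses a reflected border by default, and the Y-shaped skeleton is invented for the example.

```python
import numpy as np


def weighted_neighbourhood(skeleton):
    """Apply the kernel [[1,1,1],[1,10,1],[1,1,1]] to a 0/1 skeleton scaled to 0/10."""
    image = skeleton.astype(np.int32) * 10
    padded = np.pad(image, 1)  # zero border
    height, width = image.shape
    result = np.zeros_like(image)
    for dy in range(3):
        for dx in range(3):
            weight = 10 if (dy, dx) == (1, 1) else 1
            result += weight * padded[dy:dy + height, dx:dx + width]
    return result


# A small Y-shaped skeleton: three branches meeting at (3, 3).
skeleton = np.zeros((7, 7), dtype=np.uint8)
for row, col in [(1, 1), (2, 2), (3, 3), (2, 4), (1, 5), (4, 3), (5, 3)]:
    skeleton[row, col] = 1

filtered = weighted_neighbourhood(skeleton)
print(filtered[1, 1], filtered[2, 2], filtered[3, 3])
```

A skeleton pixel contributes 100 through the center weight and 10 for every neighbouring skeleton pixel, so an end-point (one neighbour) scores 110, an ordinary line pixel (two neighbours) scores 120 and a three-way junction scores 130. Real skeletons are less tidy, which is why the threshold list has to be tuned per image.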
Here's a simple approach, the idea is:
Obtain binary image. Load image, convert to grayscale, Gaussian blur, then Otsu's threshold.
Obtain horizontal and vertical line masks. Create horizontal and vertical structuring elements with cv2.getStructuringElement then perform cv2.morphologyEx to isolate the lines.
Find joints. We cv2.bitwise_and the two masks together to get the joints. The idea is that the intersection points on the two masks are the joints.
Find centroid on joint mask. We find contours then calculate the centroid.
Find leftover endpoints. Endpoints do not correspond to an intersection so to find those, we can use the Shi-Tomasi Corner Detector
Horizontal and vertical line masks
Results (joints in green and endpoints in blue)
Code
import cv2
import numpy as np
# Load image, grayscale, Gaussian blur, Otsus threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Find horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,1))
horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
# Find vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,5))
vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=1)
# Find joint intersections then the centroid of each joint
joints = cv2.bitwise_and(horizontal, vertical)
cnts = cv2.findContours(joints, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    # Find centroid and draw center point
    x,y,w,h = cv2.boundingRect(c)
    # minAreaRect returns ((center_x, center_y), (width, height), angle)
    center, size, angle = cv2.minAreaRect(c)
    cx, cy = int(center[0]), int(center[1])
    cv2.circle(image, (cx, cy), 5, (36,255,12), -1)
# Find endpoints
corners = cv2.goodFeaturesToTrack(thresh, 5, 0.5, 10)
# np.int0 was removed in NumPy 2.0, so cast explicitly:
corners = corners.astype(int)
for corner in corners:
    x, y = corner.ravel()
    cv2.circle(image, (int(x), int(y)), 5, (255,100,0), -1)
cv2.imshow('thresh', thresh)
cv2.imshow('joints', joints)
cv2.imshow('horizontal', horizontal)
cv2.imshow('vertical', vertical)
cv2.imshow('image', image)
cv2.waitKey()
I have these images and there is a shadow in all of them. My goal is to make a single image of the car without a shadow by using these three images:
Finally, how can I get this kind of image as shown below:
Any kind of help or suggestions are appreciated.
EDITED
Following the comments, I used np.maximum and easily reached my goal:
import cv2
import numpy as np
# The second argument of imread is a read flag, not a color conversion code,
# so the images are loaded with the default (BGR) mode:
img_1 = cv2.imread('1.png')
img_2 = cv2.imread('2.png')
img = np.maximum(img_1, img_2)
cv2.imshow('img1', img_1)
cv2.imshow('img2', img_2)
cv2.imshow('img', img)
cv2.waitKey(0)
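The reason np.maximum works here can be seen on a toy example. The sketch below assumes two perfectly aligned shots of the same scene in which the shadow is darker than whatever it covers; the arrays are made up.

```python
import numpy as np

# Two aligned toy "photos" of the same uniform scene (value 200),
# each with a dark shadow band (value 40) in a different place.
img_1 = np.full((2, 6), 200, dtype=np.uint8)
img_1[:, 0:2] = 40
img_2 = np.full((2, 6), 200, dtype=np.uint8)
img_2[:, 3:5] = 40

# The element-wise maximum keeps the brighter pixel at every position,
# so each shadow is replaced by the unshadowed pixel from the other image.
combined = np.maximum(img_1, img_2)
print(combined)
```

If the shadows overlapped, or if the images were not aligned, the maximum alone would not be enough.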
Here's a possible solution. The overall idea is to compute the location of the shadows, produce a binary mask identifying the location of the shadows and use this information to copy pixels from all the cropped sub-images.
Let's see the code. The first problem is to locate the three images. I used the black box to segment and crop each car, like this:
# Imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "qRLI7.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Get the HSV image:
hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
# Get the grayscale image:
grayImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
cv2.imshow("grayImage", grayImage)
# Fixed threshold at 5 (inverted), so near-black pixels become white:
_, binaryImage = cv2.threshold(grayImage, 5, 255, cv2.THRESH_BINARY_INV)
cv2.imshow("binaryImage", binaryImage)
cv2.waitKey(0)
The previous bit uses the grayscale version of the image and applies a fixed binarization using a threshold of 5. I also pre-compute the HSV version of the original image. The result of the thresholding is this:
I'm trying to get the black rectangles and use them to crop each car. Let's get the contours and filter them by area, as the black rectangles on the binary image have the biggest area:
# Find the contours on the binary image (this call was missing from the
# snippet; the retrieval mode is an assumption):
currentContour, _ = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# The (mask, cropped image) pairs are stored here:
shadowRois = []
for i, c in enumerate(currentContour):
    # Get the contour's bounding rectangle:
    boundRect = cv2.boundingRect(c)
    # Get the dimensions of the bounding rect:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]
    # Get the area:
    blobArea = rectWidth * rectHeight
    minArea = 20000
    if blobArea > minArea:
        # Deep local copies:
        hsvImage = hsvImage.copy()
        localImage = inputImage.copy()
        # Split the HSV image (only the V channel is used):
        (H, S, V) = cv2.split(hsvImage)
        # Crop image:
        croppedImage = V[rectY:rectY + rectHeight, rectX:rectX + rectWidth]
        localImage = localImage[rectY:rectY + rectHeight, rectX:rectX + rectWidth]
        _, binaryMask = cv2.threshold(croppedImage, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)
After filtering each contour to get the biggest one, I need to locate the position of the shadow. The shadow is mostly visible in the HSV color space, particularly in the V channel. I cropped two versions of the image: the original BGR image, now cropped, and the cropped V channel of the HSV image. This is the binary mask that results from applying an automatic thresholding on the V channel:
To locate the shadow I only need the starting x coordinate and its width, because the shadow is uniform across every cropped image. Its height is equal to each cropped image's height. I reduced the binary mask of the V channel to a row, using the SUM mode. This sums the pixels of each column. The biggest values will correspond to the position of the shadow:
# Image reduction:
reducedImg = cv2.reduce(binaryMask, 0, cv2.REDUCE_SUM, dtype=cv2.CV_32S)
# Normalize image:
maxValue = np.max(reducedImg)  # renamed so the built-in max() is not shadowed
reducedImg = reducedImg / maxValue
# Clip the values to [0,255]
reducedImg = np.clip((255 * reducedImg), 0, 255)
# Convert the mat type from float to uint8:
reducedImg = reducedImg.astype("uint8")
_, shadowMask = cv2.threshold(reducedImg, 250, 255, cv2.THRESH_BINARY)
The reduced image is just a row:
The white pixels denote the largest values. The location of the shadow is drawn like a horizontal line with the largest area, that is, the most contiguous white pixels. I process this row by getting contours and filtering, again, to the largest area:
# Get the biggest rectangle:
subContour, _ = cv2.findContours(shadowMask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for j, s in enumerate(subContour):
    # Get the contour's bounding rectangle:
    boundRect = cv2.boundingRect(s)
    # Get the dimensions of the bounding rect:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]
    # Get the area:
    blobArea = rectWidth * rectHeight
    minArea = 30
    if blobArea > minArea:
        # Get image dimensions:
        (imageHeight, imageWidth) = localImage.shape[:2]
        # Set an empty array, this will be the binary mask
        shadowMask = np.zeros((imageHeight, imageWidth, 3), np.uint8)
        color = (255, 255, 255)
        cv2.rectangle(shadowMask, (int(rectX), int(0)),
                      (int(rectX + rectWidth), int(0 + imageHeight)), color, -1)
        # Invert mask:
        shadowMask = 255 - shadowMask
        # Store mask and cropped image:
        shadowRois.append((shadowMask.copy(), localImage.copy()))
Alright, with that information I create a mask in which, after the inversion, the shadow region is black and everything else is white. I store this mask and the original BGR crop in the shadowRois list.
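The column-sum idea used to locate the shadow can be sketched in plain NumPy on a made-up mask. `widest_band` is a hypothetical helper that replaces the reduce, threshold and contour steps with run-length logic; the 0.98 level loosely mirrors the 250-out-of-255 threshold used above.

```python
import numpy as np


def widest_band(binary_mask, level=0.98):
    """Return (start_x, width) of the widest run of nearly full-height columns."""
    column_sums = binary_mask.sum(axis=0, dtype=np.int64)
    hot = column_sums >= level * column_sums.max()
    # Pad with False so every run has both a rising and a falling edge:
    edges = np.diff(np.concatenate(([False], hot, [False])).astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    longest = int(np.argmax(ends - starts))
    return int(starts[longest]), int(ends[longest] - starts[longest])


# Toy mask: a 5-column full-height band, a 1-column sliver and a partial column.
mask = np.zeros((6, 12), dtype=np.uint8)
mask[:, 3:8] = 255
mask[:, 0] = 255
mask[0:5, 10] = 255

print(widest_band(mask))
```

Only the x position and width are needed, since the shadow spans the full height of each crop.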
What follows is a possible method to use this information and create a full image. The idea is that I use the information of each mask to copy all the non-masked pixels. I accumulate this information on a buffer, initially an empty image, like this:
# Prepare image buffer:
buffer = np.zeros((100, 100, 3), np.uint8)
# Loop through cropped images and produce the final image:
for r in range(len(shadowRois)):
    # Get data from the list:
    (mask, img) = shadowRois[r]
    # Get image dimensions:
    (imageHeight, imageWidth) = img.shape[:2]
    # Resize the buffer:
    newSize = (imageWidth, imageHeight)
    buffer = cv2.resize(buffer, newSize, interpolation=cv2.INTER_AREA)
    # Get the image mask:
    temp = cv2.bitwise_and(img, mask)
    # Set info in buffer, substitute the black pixels
    # for the new data:
    buffer = np.where(temp == (0, 0, 0), buffer, temp)
cv2.imshow("Composite Image", buffer)
cv2.waitKey(0)
The result is this:
I have an image that has cereal items below:
The image has:
3 walnuts
3 raisins
3 pumpkin seeds
27 similar looking cereal
I wish to count them separately using OpenCV; I do not want to recognize them. So far, I have tailored the adaptive threshold method to count all the seeds, but I am not sure how to count each kind separately. This is my script:
import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('/Users/vaibhavsaxena/Desktop/Screen Shot 2021-04-27 at 12.22.46.png', 0)
#img = cv2.fastNlMeansDenoisingColored(img,None,10,10,7,21)
windowSize = 31
windowConstant = 40
mask = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
plt.imshow(mask)
stats = cv2.connectedComponentsWithStats(mask, 8)[2]
label_area = stats[1:, cv2.CC_STAT_AREA]
min_area, max_area = 345, max(list(label_area)) # min/max for a single circle
singular_mask = (min_area < label_area) & (label_area <= max_area)
circle_area = np.mean(label_area[singular_mask])
n_circles = int(np.sum(np.round(label_area / circle_area)))
print('Total circles:', n_circles)
36
But this one seems extremely hard coded. For example, if I zoom in or zoom out the image, it yields a different count.
Can anyone please help?
Your lighting is not good. As HansHirse suggested, try normalizing the conditions in which you take your photos. There is, however, a method that can somewhat normalize the lighting and make it as uniform as possible. The method is called gain division. The idea is to build a model of the background and then weight each input pixel by that model. The output gain should be relatively constant across most of the image. Let's give it a try:
# imports:
import cv2
import numpy as np
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Get local maximum:
kernelSize = 30
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
localMax = cv2.morphologyEx(inputImage, cv2.MORPH_CLOSE, maxKernel, None, None, 1, cv2.BORDER_REFLECT101)
# Perform gain division
gainDivision = np.where(localMax == 0, 0, (inputImage/localMax))
# Clip the values to [0,255]
gainDivision = np.clip((255 * gainDivision), 0, 255)
# Convert the mat type from float to uint8:
gainDivision = gainDivision.astype("uint8")
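The division and the type conversions can be checked on a tiny example. In this sketch the background model is written by hand rather than computed with a morphological closing, and `gain_divide` is a hypothetical helper.

```python
import numpy as np


def gain_divide(image, background):
    """Divide an image by a background model and rescale to uint8."""
    image = image.astype(np.float64)
    background = background.astype(np.float64)
    # Divide only where the background is non-zero; leave 0 elsewhere.
    gain = np.divide(image, background,
                     out=np.zeros_like(image), where=background != 0)
    return np.clip(255 * gain, 0, 255).astype(np.uint8)


# Uneven lighting: the right column is twice as bright as the left one.
# The bottom row holds an object that reflects half of the light.
image = np.array([[100, 200], [50, 100]], dtype=np.uint8)
background = np.array([[100, 200], [100, 200]], dtype=np.uint8)

print(gain_divide(image, background))
```

After the division the background maps to 255 in both columns and the object to the same gray level in both columns, regardless of the lighting gradient.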
Gotta be careful with those data types and their conversions. This is the result:
As you can see, most of the background is now uniform, that's pretty cool, because now we can apply a simple thresholding method. Let's try Otsu's Thresholding to get a nice binary mask of the elements:
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(gainDivision, cv2.COLOR_BGR2GRAY)
# Get binary image via Otsu:
_, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
Which yields this binary mask:
The mask can be improved by applying morphology, let's try to join those blobs applying a gentle closing operation:
# Set kernel (structuring element) size:
kernelSize = 3
# Set morph operation iterations:
opIterations = 2
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
binaryImage = cv2.morphologyEx( binaryImage, cv2.MORPH_CLOSE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101 )
This is the result:
Alright, now, just for completeness, let's try to compute the bounding rectangles of all the elements. We can also filter blobs of small area and store each bounding rectangle in a list:
# Find the blobs on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Store the bounding rectangles here:
rectanglesList = []
# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):
    # Get blob area:
    currentArea = cv2.contourArea(c)
    # Set a min area threshold:
    minArea = 100
    if currentArea > minArea:
        # Approximate the contour to a polygon:
        contoursPoly = cv2.approxPolyDP(c, 3, True)
        # Get the polygon's bounding rectangle:
        boundRect = cv2.boundingRect(contoursPoly)
        # Store rectangles in list:
        rectanglesList.append(boundRect)
        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]
        # Set bounding rect:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )
cv2.imshow("Rectangles", inputImageCopy)
cv2.waitKey(0)
The final image is this:
This is the total of detected elements:
print("Elements found: "+str(len(rectanglesList)))
Elements found: 37
As you can see, there's a false positive. A bit of the shadow of a grain gets detected as an actual grain. Maybe adjusting the minimum area will get rid of the problem. Or maybe, if you are classifying each grain anyway, you could filter this kind of noise.
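A possible sketch of that extra filtering step, working on the stored bounding rectangles. The rectangle values and both thresholds are hypothetical, and the aspect-ratio check is an added assumption (that grains are roughly as wide as they are tall), not something measured on the image.

```python
# Hypothetical bounding rectangles in (x, y, width, height) form,
# the same layout cv2.boundingRect returns
rectangles = [(10, 12, 40, 38), (90, 15, 42, 36), (60, 70, 14, 9), (130, 80, 39, 41)]

min_box_area = 400       # assumed threshold; tune on your own images
max_aspect_ratio = 2.0   # assumed: grains are not strongly elongated

def looks_like_grain(rect):
    _, _, w, h = rect
    aspect = max(w, h) / min(w, h)
    return w * h >= min_box_area and aspect <= max_aspect_ratio

kept = [r for r in rectangles if looks_like_grain(r)]
print("Elements kept: " + str(len(kept)))  # Elements kept: 3
```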
I am attempting to find the area inside an arbitrarily-shaped closed curve plotted in Python (example image below). So far, I have tried to use both the alphashape and polygon methods to achieve this, but both have failed. I am now attempting to use OpenCV and the floodfill method to count the number of pixels inside the curve, and I will later convert that to an area given the area that a single pixel encloses on the plot.
Example image:
testplot.jpg
In order to do this, I am doing the following, which I adapted from another post about OpenCV.
import cv2
import numpy as np
# Input image
img = cv2.imread('testplot.jpg', cv2.IMREAD_GRAYSCALE)
# Dilate to better detect contours
temp = cv2.dilate(img, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))
# Find largest contour
cnts, _ = cv2.findContours(255-temp, cv2.RETR_TREE , cv2.CHAIN_APPROX_NONE) #255-img and cv2.RETR_TREE is to account for how cv2 expects the background to be black, not white, so I convert the background to black.
largestCnt = [] #I expect this to yield the blue contour
for cnt in cnts:
    if (len(cnt) > len(largestCnt)):
        largestCnt = cnt
# Determine center of area of largest contour
M = cv2.moments(largestCnt)
x = int(M["m10"] / M["m00"])
y = int(M["m01"] / M["m00"])
# Initial mask for flood filling, should cover entire figure
width, height = temp.shape
mask = img2 = np.ones((width + 2, height + 2), np.uint8) * 255
mask[1:width, 1:height] = 0
# Generate intermediate image, draw largest contour onto it, flood fill this contour
temp = np.zeros(temp.shape, np.uint8)
temp = cv2.drawContours(temp, largestCnt, -1, 255, cv2.FILLED)
_, temp, mask, _ = cv2.floodFill(temp, mask, (x, y), 255)
temp = cv2.morphologyEx(temp, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))
area = cv2.countNonZero(temp) #Number of pixels encircled by blue line
I expect from this to get to a place where I have the same image as above, but with the center of the contour filled in white and the background and original blue contour in black. I end up with this:
result.jpg
While this at first glance appears to have accurately turned the area inside the contour white, the white area is actually larger than the area inside the contour and so the result I get is overestimating the number of pixels inside it.
Any input on this would be greatly appreciated. I am fairly new to OpenCV so I may have misunderstood something.
EDIT:
Thanks to a comment below, I made some edits and this is now my code, with edits noted:
import cv2
import numpy as np
# EDITED INPUT IMAGE: Input image
img = cv2.imread('testplot2.jpg', cv2.IMREAD_GRAYSCALE)
# EDIT: threshold
_, temp = cv2.threshold(img, 250, 255, cv2.THRESH_BINARY_INV)
# EDIT, REMOVED: Dilate to better detect contours
# Find largest contour
cnts, _ = cv2.findContours(temp, cv2.RETR_EXTERNAL , cv2.CHAIN_APPROX_NONE)
largestCnt = [] #I expect this to yield the blue contour
for cnt in cnts:
    if (len(cnt) > len(largestCnt)):
        largestCnt = cnt
# Determine center of area of largest contour
M = cv2.moments(largestCnt)
x = int(M["m10"] / M["m00"])
y = int(M["m01"] / M["m00"])
# Initial mask for flood filling, should cover entire figure
width, height = temp.shape
mask = img2 = np.ones((width + 2, height + 2), np.uint8) * 255
mask[1:width, 1:height] = 0
# Generate intermediate image, draw largest contour, flood filled
temp = np.zeros(temp.shape, np.uint8)
temp = cv2.drawContours(temp, largestCnt, -1, 255, cv2.FILLED)
_, temp, mask, _ = cv2.floodFill(temp, mask, (x, y), 255)
temp = cv2.morphologyEx(temp, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))
area = cv2.countNonZero(temp) #Number of pixels encircled by blue line
I input a different image, with the axes and the frame that Python adds by default removed for ease. I get what I expect at the second step. However, in the resulting image both the original contour and the area it encircles appear to have been made white, whereas I want the original contour to be black and only the area it encircles to be white. How might I achieve this?
The problem is that the contour line itself ends up white and is counted together with the flood-filled interior. The opening at the end does not help here: it is an erosion followed by a dilation, so a solid blob comes out at about the same size, outline included. Let’s try a different approach where no morphology is involved. These are the steps:
Convert your image to grayscale
Apply Otsu’s thresholding to get a binary image, let’s work with black and white pixels only.
Apply a first flood-fill operation at image location (0,0) to get rid of the outer white space.
Filter small blobs using an area filter
Find the “Curve Canvas” (The white space that encloses the curve) and locate and store its starting point at (targetX, targetY)
Apply a second flood-fill at location (targetX, targetY)
Get the area of the isolated blob with cv2.countNonZero
Let’s take a look at the code:
import cv2
import numpy as np
# Set image path
path = "C:/opencvImages/"
fileName = "cLIjM.jpg"
# Read Input image
inputImage = cv2.imread(path+fileName)
inputCopy = inputImage.copy()
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
This is the binary image you get:
Now, let’s flood-fill at the corner located at (0,0) with a black color to get rid of the first white space. This step is very straightforward:
# Flood-fill background, seed at (0,0) and use black color:
cv2.floodFill(binaryImage, None, (0, 0), 0)
This is the result, note how the first big white area is gone:
Let’s get rid of the small blobs applying an area filter. Everything below an area of 100 is gonna be deleted:
# Perform an area filter on the binary blobs:
componentsNumber, labeledImage, componentStats, componentCentroids = \
cv2.connectedComponentsWithStats(binaryImage, connectivity=4)
# Set the minimum pixels for the area filter:
minArea = 100
# Get the indices/labels of the remaining components based on the area stat
# (skip the background component at index 0)
remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]
# Filter the labeled pixels based on the remaining labels,
# assign pixel intensity to 255 (uint8) for the remaining pixels
filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels) == True, 255, 0).astype('uint8')
This is the result of the filter:
Now, what remains is the second white area, I need to locate its starting point because I want to apply a second flood-fill operation at this location. I’ll traverse the image to find the first white pixel. Like this:
# Get Image dimensions:
height, width = filteredImage.shape
# Store the flood-fill point here:
targetX = -1
targetY = -1
for i in range(0, width):
    for j in range(0, height):
        # Get current binary pixel:
        currentPixel = filteredImage[j, i]
        # Check if it is the first white pixel:
        if targetX == -1 and targetY == -1 and currentPixel == 255:
            targetX = i
            targetY = j
print("Flooding in X = "+str(targetX)+" Y: "+str(targetY))
There’s probably a more elegant, Python-oriented way of doing this, but I’m still learning the language. Feel free to improve the script (and share it here). The loop, however, gets me the location of the first white pixel, so I can now apply a second flood-fill at this exact location:
# Flood-fill background, seed at (targetX, targetY) and use black color:
cv2.floodFill(filteredImage, None, (targetX, targetY), 0)
You end up with this:
Only the area enclosed by the curve is left in white, so just count the number of non-zero pixels:
# Get the area of the target curve:
area = cv2.countNonZero(filteredImage)
print("Curve Area is: "+str(area))
The result is:
Curve Area is: 1510
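Regarding the more Python-oriented way of locating the first white pixel that the answer asks for, one possible sketch uses np.argwhere on the transposed image, which scans column by column just like the nested loop. The array filtered is a synthetic stand-in for filteredImage.

```python
import numpy as np

# Synthetic stand-in for filteredImage: a white block on a black background
filtered = np.zeros((8, 10), np.uint8)
filtered[3:6, 4:7] = 255

# Transposing makes argwhere return (x, y) pairs in column-by-column order
hits = np.argwhere(filtered.T == 255)
if hits.size:
    target_x, target_y = (int(v) for v in hits[0])
else:
    target_x = target_y = -1   # same "not found" marker as the loop

print(target_x, target_y)  # 4 3
```

For the flood-fill itself any pixel of the white region works as a seed, so the scan order only matters if you want to reproduce the loop's exact output.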
Here is another approach using Python/OpenCV.
Read the input
Convert to HSV colorspace
Threshold on the color range of blue
Find the largest contour
Get its area and print it
Draw the contour as a white filled contour on a black background
Save the results
Input:
import cv2
import numpy as np
# read image (OpenCV loads it as BGR color)
img = cv2.imread('closed_curve.jpg')
# convert to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# select blue color range in HSV
lower = (24,128,115)
upper = (164,255,255)
# threshold on blue in hsv
thresh = cv2.inRange(hsv, lower, upper)
# get largest contour
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
contours = contours[0] if len(contours) == 2 else contours[1]
big_contour = max(contours, key=cv2.contourArea)
area = cv2.contourArea(big_contour)
print("Area =",area)
# draw filled contour on black background
result = np.zeros_like(thresh)
cv2.drawContours(result, [big_contour], -1, 255, cv2.FILLED)
# save result
cv2.imwrite("closed_curve_thresh.jpg", thresh)
cv2.imwrite("closed_curve_result.jpg", result)
# view result
cv2.imshow("threshold", thresh)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Threshold Image:
Result Filled Contour On Black Background:
Area Result:
Area = 2347.0