I'm trying to take real time input for hand gestures with web cam, then processing the images to feed them to a neural network. I wrote this processing function to make the hand features look prominent:
img = cv2.imread('hand.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray,(5,5),2)
th3 = cv2.adaptiveThreshold(blur,10,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY_INV,11,2)
ret, res = cv2.threshold(th3, 225, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
res = cv2.Canny(res,100,200)
cv2.imshow("Canny", res)
The input and the output images are as follows:
It's obvious that double lines, instead of one, are detected along the edges (allover the hand, not only contour). I want to make them single. If I apply just Canny edge detection algo, then the edges are not very prominent.
One straightforward solution would be flood-fill the background with white and then with black using cv2.floodFill, like this:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "hand.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Convert the image to Grayscale:
binaryImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Flood fill bakcground (white + black):
cv2.floodFill(binaryImage, mask=None, seedPoint=(int(0), int(0)), newVal=(255))
cv2.floodFill(binaryImage, mask=None, seedPoint=(int(0), int(0)), newVal=(0))
cv2,imshow("floodFilled", binaryImage)
cv2.waitKey(0)
This is the result:
If you want to get a solid mask of the hand, you could try to fill the holes inside the hand's contour, also using flood-fill and some image arithmetic, like this:
# image path
path = "D://opencvImages//"
fileName = "hand.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Convert the image to Grayscale:
binaryImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Isolate holes on input image:
holes = binaryImage.copy()
# Get rows and cols from input:
(rows, cols) = holes.shape[:2]
# Remove background via flood-fill on 4 outermost corners
cv2.floodFill(holes, mask=None, seedPoint=(int(0), int(0)), newVal=(255))
cv2.floodFill(holes, mask=None, seedPoint=(int(10), int(rows-10)), newVal=(255))
cv2.floodFill(holes, mask=None, seedPoint=(int(cols-10), int(10)), newVal=(255))
cv2.floodFill(holes, mask=None, seedPoint=(int(cols-10), int(rows-10)), newVal=(255))
# Get holes:
holes = 255 - holes
# Final image is original imput + isolated holes:
mask = binaryImage + holes
# Deep copy for further results:
maskCopy = mask.copy()
maskCopy = cv2.cvtColor(maskCopy, cv2.COLOR_GRAY2BGR)
These are the isolated holes and hand mask:
You can then detect the bounding rectangle by processing contours, filtering small-area blobs and approximating to a rectangle, like this:
# Find the big contours/blobs on the processed image:
contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# Get bounding rectangles:
for c in contours:
# Filter contour by area:
blobArea = cv2.contourArea(c)
maxArea = 100
if blobArea > maxArea:
# Approximate the contour to a polygon:
contoursPoly = cv2.approxPolyDP(c, 3, True)
# Get the polygon's bounding rectangle:
boundRect = cv2.boundingRect(contoursPoly)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Draw rectangle:
color = (0, 255, 0)
cv2.rectangle(maskCopy, (int(rectX), int(rectY)), (int(rectX + rectWidth), int(rectY + rectHeight)), color, 3)
cv2.imshow("Bounding Rectangle", maskCopy)
cv2.waitKey(0)
This is the result:
It looks like you are on the correct way, but as #CrisLuengo mentioned, Canny is applied on grayscale images rather than binary images. Here is an approach.
import numpy as np
import matplotlib.pyplot as plt
import cv2
img_gray = cv2.imread('hand.png',0)
sigma = 2
threshold1=30
threshold2=60
img_blur = cv2.GaussianBlur(img_gray,(5,5),sigmaX=sigma,sigmaY=sigma)
res = cv2.Canny(img_blur,threshold1=threshold1,threshold2=threshold2)
fig,ax = plt.subplots(1,2,sharex=True,sharey=True)
ax[0].imshow(img_gray,cmap='gray')
ax[1].imshow(res,cmap='gray')
plt.show()
After playing around with the parameters of the gaussian filter and the Canny threshold values, this is what I am getting:
As you can see most of the fingers are clearly detected except the thumb. The lighting conditions make it difficult for Canny to calculate a proper gradient there. You might either try to improve the contrast of your images through your setup (which is the easiest solution to me), or to apply some contrast enhancements methods like Contrast Limited Adaptive Histogram Equalization (CLAHE) before going for Canny. I did not get any better results than the one above after a few trials with CLAHE, though, but it might be worth to look at it. Good luck!
Related
I am trying to remove the checkered background (which represents transparent background in Adobe Illustrator and Photoshop) with transparent color (alpha channel) in some PNGs with Python script.
First, I use template matching:
import cv2
import numpy as np
from matplotlib import pyplot as plt
img_rgb = cv2.imread('testimages/fake1.png', cv2.IMREAD_UNCHANGED)
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('pattern.png', 0)
w, h = template.shape[::-1]
res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.8
loc = np.where( res >= threshold)
for pt in zip(*loc[::-1]):
if len(img_rgb[0][0]) == 3:
# add alpha channel
rgba = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2RGBA)
rgba[:, :, 3] = 255 # default not transparent
img_rgb = rgba
# replace the area with a transparent rectangle
cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (255, 255, 255, 0), -1)
cv2.imwrite('result.png', img_rgb)
Source Image: fake1.png
Pattern Template: pattern.png
Output: result.png (the gray area is actually transparent; enlarge a bit for viewing easier)
I know this approach has problems, as the in some cases, the template cannot be identified fully, as part of the pattern is hidden by the graphics in the PNG image.
My question is: How can I match such a pattern perfectly using OpenCV? via FFT Filtering?
References:
How particular pixel to transparent in opencv python?
Detecting a pattern in an image and retrieving its position
https://python.plainenglish.io/how-to-remove-image-background-using-python-6f7ffa8eab15
https://answers.opencv.org/question/232506/make-the-background-of-the-image-transparent-using-a-mask/
https://dsp.stackexchange.com/questions/36679/which-image-filter-can-be-applied-to-remove-gridded-pattern-from-corrupt-jpegs
Here is one way to do that in Python/OpenCV simply by thresholding on the checks color range.
Input:
import cv2
import numpy as np
# read input
img = cv2.imread("fake.png")
# threshold on checks
low = (230,230,230)
high = (255,255,255)
mask = cv2.inRange(img, low, high)
# invert alpha
alpha = 255 - mask
# convert img to BGRA
result = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
result[:,:,3] = alpha
# save output
cv2.imwrite('fake_transparent.png', result)
cv2.imshow('img', img)
cv2.imshow('mask', mask)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Download the resulting image to see that it is actually transparent.
Here is one way to use DFT to process the image in Python/OpenCV/Numpy. One does need to know the size of the checkerboard pattern (light or dark square size).
Read the input
Separate channels
Apply DFT to each channel
Shift origin from top left to center of each channel
Extract magnitude and phase images from each channel
Define the checkerboard pattern size
Create a black and white checkerboard image of the same size
Apply similar DFT processing to the checkerboard image
Get the spectrum from the log(magnitude)
Threshold the spectrum to form a mask
Zero out the DC center point in the mask
OPTION: If needed apply morphology dilate to thicken the white dots. But does not seem to be needed here
Invert the mask so the background is white and the dots are black
Convert the mask to range 0 to 1 and make 2 channels
Apply the two-channel mask to the center shifted DFT channels
Shift the center back to the top left in each masked image
Do the IDFT to get back from complex domain to real domain on each channel
Merge the resulting channels back to a BGR image as the final reconstituted image
Save results
Input:
import numpy as np
import cv2
import math
# read input
# note: opencv fft only works on grayscale
img = cv2.imread('fake.png')
hh, ww = img.shape[:2]
# separate channels
b,g,r = cv2.split(img)
# convert images to floats and do dft saving as complex output
dft_b = cv2.dft(np.float32(b), flags = cv2.DFT_COMPLEX_OUTPUT)
dft_g = cv2.dft(np.float32(g), flags = cv2.DFT_COMPLEX_OUTPUT)
dft_r = cv2.dft(np.float32(r), flags = cv2.DFT_COMPLEX_OUTPUT)
# apply shift of origin from upper left corner to center of image
dft_b_shift = np.fft.fftshift(dft_b)
dft_g_shift = np.fft.fftshift(dft_g)
dft_r_shift = np.fft.fftshift(dft_r)
# extract magnitude and phase images
mag_b, phase_b = cv2.cartToPolar(dft_b_shift[:,:,0], dft_b_shift[:,:,1])
mag_g, phase_g = cv2.cartToPolar(dft_g_shift[:,:,0], dft_g_shift[:,:,1])
mag_r, phase_r = cv2.cartToPolar(dft_r_shift[:,:,0], dft_r_shift[:,:,1])
# set check size (size of either dark or light square)
check_size = 15
# create checkerboard pattern
white = np.full((check_size,check_size), 255, dtype=np.uint8)
black = np.full((check_size,check_size), 0, dtype=np.uint8)
checks1 = np.hstack([white,black])
checks2 = np.hstack([black,white])
checks3 = np.vstack([checks1,checks2])
numht = math.ceil(hh / (2*check_size))
numwd = math.ceil(ww / (2*check_size))
checks = np.tile(checks3, (numht,numwd))
checks = checks[0:hh, 0:ww]
# apply dft to checkerboard pattern
dft_c = cv2.dft(np.float32(checks), flags = cv2.DFT_COMPLEX_OUTPUT)
dft_c_shift = np.fft.fftshift(dft_c)
mag_c, phase_c = cv2.cartToPolar(dft_c_shift[:,:,0], dft_c_shift[:,:,1])
# get spectrum from magnitude (add tiny amount to avoid divide by zero error)
spec = np.log(mag_c + 0.00000001)
# theshold spectrum
mask = cv2.threshold(spec, 1, 255, cv2.THRESH_BINARY)[1]
# mask DC point (center spot)
centx = int(ww/2)
centy = int(hh/2)
dot = np.zeros((3,3), dtype=np.uint8)
mask[centy-1:centy+2, centx-1:centx+2] = dot
# If needed do morphology dilate by small amount.
# But does not seem to be needed in this case
# invert mask
mask = 255 - mask
# apply mask to real and imaginary components
mask1 = (mask/255).astype(np.float32)
mask2 = cv2.merge([mask1,mask1])
complex_b = dft_b_shift*mask2
complex_g = dft_g_shift*mask2
complex_r = dft_r_shift*mask2
# shift origin from center to upper left corner
complex_ishift_b = np.fft.ifftshift(complex_b)
complex_ishift_g = np.fft.ifftshift(complex_g)
complex_ishift_r = np.fft.ifftshift(complex_r)
# do idft with normalization saving as real output and crop to original size
img_notch_b = cv2.idft(complex_ishift_b, flags=cv2.DFT_SCALE+cv2.DFT_REAL_OUTPUT)
img_notch_b = img_notch_b.clip(0,255).astype(np.uint8)
img_notch_b = img_notch_b[0:hh, 0:ww]
img_notch_g = cv2.idft(complex_ishift_g, flags=cv2.DFT_SCALE+cv2.DFT_REAL_OUTPUT)
img_notch_g = img_notch_g.clip(0,255).astype(np.uint8)
img_notch_g = img_notch_g[0:hh, 0:ww]
img_notch_r = cv2.idft(complex_ishift_r, flags=cv2.DFT_SCALE+cv2.DFT_REAL_OUTPUT)
img_notch_r = img_notch_r.clip(0,255).astype(np.uint8)
img_notch_r = img_notch_r[0:hh, 0:ww]
# combine b,g,r components
img_notch = cv2.merge([img_notch_b, img_notch_g, img_notch_r])
# write result to disk
cv2.imwrite("fake_checks.png", checks)
cv2.imwrite("fake_spectrum.png", (255*spec).clip(0,255).astype(np.uint8))
cv2.imwrite("fake_mask.png", mask)
cv2.imwrite("fake_notched.png", img_notch)
# show results
cv2.imshow("ORIGINAL", img)
cv2.imshow("CHECKS", checks)
cv2.imshow("SPECTRUM", spec)
cv2.imshow("MASK", mask)
cv2.imshow("NOTCH", img_notch)
cv2.waitKey(0)
cv2.destroyAllWindows()
Checkerboard image:
Spectrum of checkerboard:
Mask:
Result (notch filtered image):
The checkerboard pattern in the result is mitigated from the original, but still there upon close inspection.
From here one needs to threshold on the white background and invert to make an image for the alpha channel. Then convert the image to 4 BGRA and insert the alpha channel into the BGRA image as I described in my other answer below.
since you're working on PNG's with transparent backgrounds, it would probably be equally viable to instead of trying to detect the checkered background, you try to extract the stuff that isn't checkered. This could probably be achieved using a color check on all pixels. You could use opencv's inRange() function. I'll link a StackOverflow link below that tries to detect dark spots on a image.
Inrange example
I have these images and there is a shadow in all images. I target is making a single image of a car without shadow by using these three images:
Finally, how can I get this kind of image as shown below:
Any kind of help or suggestions are appreciated.
EDITED
According to the comments, I used np.maximum and achieved easily to my target:
import cv2
import numpy as np
img_1 = cv2.imread('1.png', cv2.COLOR_BGR2RGB)
img_2 = cv2.imread('2.png', cv2.COLOR_BGR2RGB)
img = np.maximum(img_1, img_2)
cv2.imshow('img1', img_1)
cv2.imshow('img2', img_2)
cv2.imshow('img', img)
cv2.waitKey(0)
Here's a possible solution. The overall idea is to compute the location of the shadows, produce a binary mask identifying the location of the shadows and use this information to copy pixels from all the cropped sub-images.
Let's see the code. The first problem is to locate the three images. I used the black box to segment and crop each car, like this:
# Imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "qRLI7.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Get the HSV image:
hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
# Get the grayscale image:
grayImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
showImage("grayImage", grayImage)
# Threshold via Otsu:
_, binaryImage = cv2.threshold(grayImage, 5, 255, cv2.THRESH_BINARY_INV)
cv2.imshow("binaryImage", binaryImage)
cv2.waitKey(0)
The previous bit uses the grayscale version of the image and applies a fixed binarization using a threshold of 5. I also pre-compute the HSV version of the original image. The result of the thresholding is this:
I'm trying to get the black rectangles and use them to crop each car. Let's get the contours and filter them by area, as the black rectangles on the binary image have the biggest area:
for i, c in enumerate(currentContour):
# Get the contour's bounding rectangle:
boundRect = cv2.boundingRect(c)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Get the area:
blobArea = rectWidth * rectHeight
minArea = 20000
if blobArea > minArea:
# Deep local copies:
hsvImage = hsvImage.copy()
localImage = inputImage.copy()
# Get the S channel from the HSV image:
(H, S, V) = cv2.split(hsvImage)
# Crop image:
croppedImage = V[rectY:rectY + rectHeight, rectX:rectX + rectWidth]
localImage = localImage[rectY:rectY + rectHeight, rectX:rectX + rectWidth]
_, binaryMask = cv2.threshold(croppedImage, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)
After filtering each contour to get the biggest one, I need to locate the position of the shadow. The shadow is mostly visible in the HSV color space, particularly, in the V channel. I cropped two versions of the image: The original BGR image, now cropped, and the V cropped channel of the HSV image. This is the binary mask that results from applying an automatic thresholding on the S channel :
To locate the shadow I only need the starting x coordinate and its width, because the shadow is uniform across every cropped image. Its height is equal to each cropped image's height. I reduced the V image to a row, using the SUM mode. This will sum each pixel across all columns. The biggest values will correspond to the position of the shadow:
# Image reduction:
reducedImg = cv2.reduce(binaryMask, 0, cv2.REDUCE_SUM, dtype=cv2.CV_32S)
# Normalize image:
max = np.max(reducedImg)
reducedImg = reducedImg / max
# Clip the values to [0,255]
reducedImg = np.clip((255 * reducedImg), 0, 255)
# Convert the mat type from float to uint8:
reducedImg = reducedImg.astype("uint8")
_, shadowMask = cv2.threshold(reducedImg, 250, 255, cv2.THRESH_BINARY)
The reduced image is just a row:
The white pixels denote the largest values. The location of the shadow is drawn like a horizontal line with the largest area, that is, the most contiguous white pixels. I process this row by getting contours and filtering, again, to the largest area:
# Get the biggest rectangle:
subContour, _ = cv2.findContours(shadowMask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for j, s in enumerate(subContour):
# Get the contour's bounding rectangle:
boundRect = cv2.boundingRect(s)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Get the area:
blobArea = rectWidth * rectHeight
minArea = 30
if blobArea > minArea:
# Get image dimensions:
(imageHeight, imageWidth) = localImage.shape[:2]
# Set an empty array, this will be the binary mask
shadowMask = np.zeros((imageHeight, imageWidth, 3), np.uint8)
color = (255, 255, 255)
cv2.rectangle(shadowMask, (int(rectX), int(0)),
(int(rectX + rectWidth), int(0 + imageHeight)), color, -1)
# Invert mask:
shadowMask = 255 - shadowMask
# Store mask and cropped image:
shadowRois.append((shadowMask.copy(), localImage.copy()))
Alright, with that information I create a mask, where the only thing drawn in white is the location of the mask. I store this mask and the original BGR crop in the shadowRois list.
What follows is a possible method to use this information and create a full image. The idea is that I use the information of each mask to copy all the non-masked pixels. I accumulate this information on a buffer, initially an empty image, like this:
# Prepare image buffer:
buffer = np.zeros((100, 100, 3), np.uint8)
# Loop through cropped images and produce the final image:
for r in range(len(shadowRois)):
# Get data from the list:
(mask, img) = shadowRois[r]
# Get image dimensions:
(imageHeight, imageWidth) = img.shape[:2]
# Resize the buffer:
newSize = (imageWidth, imageHeight)
buffer = cv2.resize(buffer, newSize, interpolation=cv2.INTER_AREA)
# Get the image mask:
temp = cv2.bitwise_and(img, mask)
# Set info in buffer, substitute the black pixels
# for the new data:
buffer = np.where(temp == (0, 0, 0), buffer, temp)
cv2.imshow("Composite Image", buffer)
cv2.waitKey(0)
The result is this:
I have an image that has cereal items below:
The image has:
3 walnuts
3 raisins
3 pumpkin seeds
27 similar looking cereal
I wish to count them separately using opencv, I do not want to recognize them. So far, I have tailored the AdaptiveThreshold method to count all the seeds, but not sure how to do it separately. This is my scripts:
import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('/Users/vaibhavsaxena/Desktop/Screen Shot 2021-04-27 at 12.22.46.png', 0)
#img = cv2.fastNlMeansDenoisingColored(img,None,10,10,7,21)
windowSize = 31
windowConstant = 40
mask = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
plt.imshow(mask)
stats = cv2.connectedComponentsWithStats(mask, 8)[2]
label_area = stats[1:, cv2.CC_STAT_AREA]
min_area, max_area = 345, max(list(label_area)) # min/max for a single circle
singular_mask = (min_area < label_area) & (label_area <= max_area)
circle_area = np.mean(label_area[singular_mask])
n_circles = int(np.sum(np.round(label_area / circle_area)))
print('Total circles:', n_circles)
36
But this one seems extremely hard coded. For example, if I zoom in or zoom out the image, it yields a different count.
Can anyone please help?
Your lighting is not good, as HansHirse suggested, try normalizing the conditions in which you take your photos. There's, however, a method that can somewhat normalize the lighting and get it as uniform as possible. The method is called gain division. The idea is that you try to build a model of the background and then weight each input pixel by that model. The output gain should be relatively constant during most of the image. Let's give it a try:
# imports:
import cv2
import numpy as np
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Get local maximum:
kernelSize = 30
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
localMax = cv2.morphologyEx(inputImage, cv2.MORPH_CLOSE, maxKernel, None, None, 1, cv2.BORDER_REFLECT101)
# Perform gain division
gainDivision = np.where(localMax == 0, 0, (inputImage/localMax))
# Clip the values to [0,255]
gainDivision = np.clip((255 * gainDivision), 0, 255)
# Convert the mat type from float to uint8:
gainDivision = gainDivision.astype("uint8")
Gotta be careful with those data types and their conversions. This is the result:
As you can see, most of the background is now uniform, that's pretty cool, because now we can apply a simple thresholding method. Let's try Otsu's Thresholding to get a nice binary mask of the elements:
# Convert RGB to grayscale:
grayscaleImage = cv2.cvtColor(gainDivision, cv2.COLOR_BGR2GRAY)
# Get binary image via Otsu:
_, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
Which yields this binary mask:
The mask can be improved by applying morphology, let's try to join those blobs applying a gentle closing operation:
# Set kernel (structuring element) size:
kernelSize = 3
# Set morph operation iterations:
opIterations = 2
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
binaryImage = cv2.morphologyEx( binaryImage, cv2.MORPH_CLOSE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101 )
This is the result:
Alright, now, just for completeness, let's try to compute the bounding rectangles of all the elements. We can also filter blobs of small area and store each bounding rectangle in a list:
# Find the blobs on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Store the bounding rectangles here:
rectanglesList = []
# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):
# Get blob area:
currentArea = cv2.contourArea(c)
# Set a min area threshold:
minArea = 100
if currentArea > minArea:
# Approximate the contour to a polygon:
contoursPoly = cv2.approxPolyDP(c, 3, True)
# Get the polygon's bounding rectangle:
boundRect = cv2.boundingRect(contoursPoly)
# Store rectangles in list:
rectanglesList.append(boundRect)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Set bounding rect:
color = (0, 0, 255)
cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
(int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )
cv2.imshow("Rectangles", inputImageCopy)
cv2.waitKey(0)
The final image is this:
This is the total of detected elements:
print("Elements found: "+str(len(rectanglesList)))
Elements found: 37
As you can see, there's a false positive. A bit of the shadow of a grain gets detected as an actual grain. Maybe adjusting the minimum area will get rid of the problem. Or maybe, if you are classifying each grain anyway, you could filter this kind of noise.
I am trying to detect all of the overlapping circle/ellipses shapes in this image all of which have digits on them. I have tried different types of image processing techniques using OpenCV, however I cannot detect the shapes that overlap the tree. I have tried erosion and dilation however it has not helped.
Any pointers on how to go about this would be great. I have attached my code below
original = frame.copy()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3, 3), 0)
canny = cv2.Canny(blurred, 120, 255, 1)
kernel = np.ones((5, 5), np.uint8)
dilate = cv2.dilate(canny, kernel, iterations=1)
# Find contours
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
image_number = 0
for c in cnts:
x, y, w, h = cv2.boundingRect(c)
cv2.rectangle(frame, (x, y), (x + w, y + h), (36, 255, 12), 2)
ROI = original[y:y + h, x:x + w]
cv2.imwrite("ROI_{}.png".format(image_number), ROI)
image_number += 1
cv2.imshow('canny', canny)
cv2.imshow('image', frame)
cv2.waitKey(0)
Here's a possible solution. I'm assuming that the target blobs (the saucer-like things) are always labeled - that is, they always have a white number inside them. The idea is to create a digits mask, because their size and color seem to be constant. I use the digits as guide to obtain sample pixels of the ellipses. Then, I convert these BGR pixels to HSV, create a binary mask and use that info to threshold and locate the ellipses. Let's check out the code:
# imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "4dzfr.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert RGB to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Get binary image via Otsu:
binaryImage = np.where(grayscaleImage >= 200, 255, 0)
# The above operation converted the image to 32-bit float,
# convert back to 8-bit uint
binaryImage = binaryImage.astype(np.uint8)
The first step is to make a mask of the digits. I also created a deep copy of the BGR image. The digits are close to white (That is, an intensity close to 255). I use 200 as threshold and obtain this result:
Now, let's locate these contours on this binary mask. I'm filtering based on aspect ratio, as the digits have a distinct aspect ratio close to 0.70. I'm also filtering contours based on hierarchy - as I'm only interested on external contours (the ones that do not have children). That's because I really don't need contours like the "holes" inside the digit 8:
# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# Store the sampled pixels here:
sampledPixels = []
# Look for the outer bounding boxes (no children):
for i, c in enumerate(contours):
# Get the contour bounding rectangle:
boundRect = cv2.boundingRect(c)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Compute the aspect ratio:
aspectRatio = rectWidth / rectHeight
# Create the filtering threshold value:
delta = abs(0.7 - aspectRatio)
epsilon = 0.1
# Get the hierarchy:
currentHierarchy = hierarchy[0][i][3]
# Prepare the list of sampling points (One for the ellipse, one for the circle):
samplingPoints = [ (rectX - rectWidth, rectY), (rectX, rectY - rectHeight) ]
# Look for the target contours:
if delta < epsilon and currentHierarchy == -1:
# This list will hold both sampling pixels:
pixelList = []
# Get sampling pixels from the two locations:
for s in range(2):
# Get sampling point:
sampleX = samplingPoints[s][0]
sampleY = samplingPoints[s][1]
# Get sample BGR pixel:
samplePixel = inputImageCopy[sampleY, sampleX]
# Store into temp list:
pixelList.append(samplePixel)
# convert list to tuple:
pixelList = tuple(pixelList)
# Save pixel value:
sampledPixels.append(pixelList)
Ok, there area a couple of things happening in the last snippet of code. We want to sample pixels from both the ellipse and the circle. We will use two sampling locations that are function of each digit's original position. These positions are defined in the samplingPoints tuple. For the ellipse, I'm sampling at a little before the top right position of the digit. For the circle, I'm sapling directly above the top right position - we end up with two pixels for each figure.
You'll notice I'm doing a little bit of data type juggling, converting lists to tuples. I want these pixels stored as a tuple for convenience. If I draw bounding rectangles of the digits, this would be the resulting image:
Now, let's loop through the pixel list, convert them to HSV and create a HSV mask over the original BGR image. The final bounding rectangles of the ellipses are stored in boundingRectangles, additionally I draw results on the deep copy of the original input:
# Final bounding rectangles are stored here:
boundingRectangles = []
# Loop through sampled pixels:
for p in range(len(sampledPixels)):
# Get current pixel tuple:
currentPixelTuple = sampledPixels[p]
# Prepare the HSV mask:
imageHeight, imageWidth = binaryImage.shape[:2]
hsvMask = np.zeros((imageHeight, imageWidth), np.uint8)
# Process the two sampling pixels:
for m in range(len(currentPixelTuple)):
# Get current pixel in the list:
currentPixel = currentPixelTuple[m]
# Create BGR Mat:
pixelMat = np.zeros([1, 1, 3], dtype=np.uint8)
pixelMat[0, 0] = currentPixel
# Convert the BGR pixel to HSV:
hsvPixel = cv2.cvtColor(pixelMat, cv2.COLOR_BGR2HSV)
H = hsvPixel[0][0][0]
S = hsvPixel[0][0][1]
V = hsvPixel[0][0][2]
# Create HSV range for this pixel:
rangeThreshold = 5
lowerValues = np.array([H - rangeThreshold, S - rangeThreshold, V - rangeThreshold])
upperValues = np.array([H + rangeThreshold, S + rangeThreshold, V + rangeThreshold])
# Create HSV mask:
hsvImage = cv2.cvtColor(inputImage.copy(), cv2.COLOR_BGR2HSV)
tempMask = cv2.inRange(hsvImage, lowerValues, upperValues)
hsvMask = hsvMask + tempMask
First, I create a 1 x 1 Matrix (or Numpy Array) with just a BGR pixel value - the first of two I previously sampled. In this way, I can use cv2.cvtColor to get the corresponding HSV values. Then, I create lower and upper threshold values for the HSV mask. However, the image seems synthetic, and a range-based thresholding could be reduced to a unique tuple. After that, I create the HSV mask using cv2.inRange.
This will yield the HSV mask for the ellipse. After applying the method for the circle we will end up with two HSV masks. Well, I just added the two arrays to combine both masks. At the end you will have something like this, this is the "composite" HSV mask created for the first saucer-like figure:
We can apply a little bit of morphology to join both shapes, just a little closing will do:
# Set kernel (structuring element) size:
kernelSize = 3
# Set morph operation iterations:
opIterations = 2
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
hsvMask = cv2.morphologyEx(hsvMask, cv2.MORPH_CLOSE, morphKernel, None, None, opIterations,cv2.BORDER_REFLECT101)
This is the result:
Nice. Let's continue and get the bounding rectangles of all the shapes. I'm using the boundingRectangles list to store each bounding rectangle, like this:
# Process current contour:
currentContour, _ = cv2.findContours(hsvMask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for _, c in enumerate(currentContour):
# Get the contour's bounding rectangle:
boundRect = cv2.boundingRect(c)
# Get the dimensions of the bounding rect:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Store and set bounding rect:
boundingRectangles.append(boundRect)
color = (0, 0, 255)
cv2.rectangle(inputImageCopy, (int(rectX), int(rectY)),
(int(rectX + rectWidth), int(rectY + rectHeight)), color, 2)
cv2.imshow("Objects", inputImageCopy)
cv2.waitKey(0)
This is the image of the located rectangles once every sampled pixel is processed:
I'm pre-processing some images in order to remove the background from my area of interest. However, the images on my bench have rounded edges due to the focus of the camera. How do I discard these rounded edges and be able to remove only my object of interest from the image? The code below I can remove the background of the image, but it does not work right due to the edges around.
import numpy as np
import cv2
#Read the image and perform threshold and get its height and weight
img = cv2.imread('IMD408.bmp')
h, w = img.shape[:2]
# Transform to gray colorspace and blur the image.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray,(5,5),0)
# Make a fake rectangle arround the image that will seperate the main contour.
cv2.rectangle(blur, (0,0), (w,h), (255,255,255), 10)
# Perform Otsu threshold.
_,thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# Create a mask for bitwise operation
mask = np.zeros((h, w), np.uint8)
# Search for contours and iterate over contours. Make threshold for size to
# eliminate others.
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
for i in contours:
cnt = cv2.contourArea(i)
if 1000000 >cnt > 100000:
cv2.drawContours(mask, [i],-1, 255, -1)
# Perform the bitwise operation.
res = cv2.bitwise_and(img, img, mask=mask)
# Display the result.
cv2.imwrite('IMD408.png', res)
cv2.imshow('img', res)
cv2.waitKey(0)
cv2.destroyAllWindows()
input image:
Exit:
Error:
Since you mentioned that all the images have the same hue, then this should work well for them. Steps is to do some white balancing which will increase the contrast a bit.
Get the greyscale.
Threshold the grayscale image. Values less than 127 are set to 255 (white). This will give you a binary image, which will become a mask for the original image.
Apply the mask
You might have to play around with the thresholding if you want better results, here is the link for that. But this should get you started. I'm using a different OpenCV version compared to you might have to tweak the code a bit.
import cv2
def equaliseWhiteBalance(image):
''' Return equilised WB of an image '''
wb = cv2.xphoto.createSimpleWB() #Create WB Object
imgWB = wb.balanceWhite(img) #Balance White on image
r,g,b = cv2.split(imgWB) #Get individual r,g,b channels
r_equ = cv2.equalizeHist(r) #Equalise RED channel
g_equ = cv2.equalizeHist(g) #Equalise GREEN channel
b_equ = cv2.equalizeHist(b) #Equalise BLUE channel
img_equ_WB = cv2.merge([r_equ,g_equ,b_equ]) #Merge equalised channels
return imgWB
#Read the image
img = cv2.imread('IMD408.bmp')
result = img.copy()
#Get whiteBalance of image
imgWB = equaliseWhiteBalance(img)
cv2.imshow('img', imgWB)
cv2.waitKey(0)
# Get gray image
gray = cv2.cvtColor(imgWB,cv2.COLOR_RGB2GRAY)
cv2.imshow('img', gray)
cv2.waitKey(0)
# Perform threshold
_, thresh = cv2.threshold(gray,127,255,cv2.THRESH_BINARY)
cv2.imshow('img', thresh)
cv2.waitKey(0)
# Apply mask
result[thresh!=0] = (255,255,255)
cv2.imshow('img', result)
cv2.waitKey(0)
If all the dark corner vignettes have different sizes per image, then I suggest looking for centroid of contours on the binary (mask) image. Centroids with a 'short' distance to any corner of your image will be the dark vignettes, so their value can be changed from black to white.