Related
I made a program that applies a mask over an object as described in this StackOverflow question. I did so using colour thresholding and making the mask select only the colour range of human skin (I don't know if it works for white people as I am not white and it works well for me). the problem is when I run it, some greys (grey area on the wall or a shadow) are also picked up on the mask and it is applied there.
I wanted to know whether there was a way to remove the unnecessary bits in the background, and/or if there was a way using object detection I could solve this. PS I tried using createBackgroundSubtractorGMG/MOG/etc but that came out very weird and way worse.
Here is my code:
import cv2
from cv2 import bitwise_and
from cv2 import COLOR_HSV2BGR
import numpy as np
from matplotlib import pyplot as plt
cap = cv2.VideoCapture(0)
image = cv2.imread('yesh1.jpg')
bg = cv2.imread('kruger.jpg')
bg = cv2.cvtColor(bg, cv2.COLOR_BGR2RGB)
kernel1 = np.ones((1,1),np.uint8)
kernel2 = np.ones((10,10),np.uint8)
while (1):
ret, frame = cap.read()
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
lowerBound = np.array([1, 1, 1])
upperBound = np.array([140, 255 ,140])
mask = cv2.inRange(hsv, lowerBound, upperBound)
blur = cv2.GaussianBlur(mask,(5,5),0)
ret1,mask = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel1)
contourthickness = cv2.cvtColor(mask, cv2.IMREAD_COLOR)
res = bitwise_and(frame, frame, mask = mask)
crop_bg = bg[0:480, 0:640]
final = frame + res
final = np.where(contourthickness != 0, crop_bg, final)
cv2.imshow('frame', frame)
cv2.imshow('Final', final) # TIS WORKED BBYY
key = cv2.waitKey(1) & 0xFF
if key == 27:
break
cv2.destroyAllWindows()
EDIT:
Following #fmw42 's comment, I am adding the original image as well as a screenshot of how the different frames look. The masked image also changes colour. Something to fix that will also be helpful.
#Jeremi. Your code working 100%. Using White wall for background. Avoid door(it is not white, it is cream), shadow around edge, to prevent noising. If you have white bed sheet or white walls. I am using Raspberry pi 4b/8gb, 4k monitor. I can't get actual size of window.
Here is output:
What you see on my output. I placed my hand behind white sheet closer to camera. I do not have white wall on my room. My room is greener. That why you see logo on background. Btw, I can move my hand no problem.
Do you know of an algorithm that can see that there is handwriting on an image? I am not interested in knowing what the handwriting says, but only that there is one present?
I have a video of someone filling a slide with handwriting. My goal is to determine how much of the slide has been filled with handwriting already.
The video in question can be downloaded here: http://www.filedropper.com/00_6
For this particular video, a great solution was already suggested in Quantify how much a slide has been filled with handwriting
The solution is based on summing the amount of the specific color used for the handwriting. However, if the handwriting is not in blue but any other color that can also be found on non-handwriting, this approach will not work.
Therefore, I am interested to know, if there exists a more general solution to determine if there is handwriting present on an image?
What I have done so far:
I was thinking of extracting the contours of an image, and then somehow detect the handwriting part based on how curvy the contours are (but I have no clue how to do that part). it might not be the best idea, though, as again it's not always correct...
import cv2
import matplotlib.pyplot as plt
img = cv2.imread(PATH TO IMAGE)
print("img shape=", img.shape)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow("image", gray)
cv2.waitKey(1)
#### extract all contours
# Find Canny edges
edged = cv2.Canny(gray, 30, 200)
cv2.waitKey(0)
# Finding Contours
# Use a copy of the image e.g. edged.copy()
# since findContours alters the image
contours, hierarchy = cv2.findContours(edged,
cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cv2.imshow('Canny Edges After Contouring', edged)
cv2.waitKey(0)
print("Number of Contours found = " + str(len(contours)))
# Draw all contours
# -1 signifies drawing all contours
cv2.drawContours(img, contours, -1, (0, 255, 0), 3)
cv2.imshow('Contours', img)
cv2.waitKey(0)
You can identify the space taken by hand-writing by masking the pixels from the template, and then do the same for the difference between further frames and the template. You can use dilation, opening, and thresholding for this.
Let's start with your template. Let's identify the parts we will mask:
import cv2
import numpy as np
template = cv2.imread('template.jpg')
Now, let's broaden the occupied pixels to make a zone that we will mask (hide) later:
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5, 5),np.uint8)
dilation = cv2.dilate(255 - template, kernel,iterations = 5)
Then, we will threshold to turn this into a black and white mask:
_, thresh = cv2.threshold(dilation,25,255,cv2.THRESH_BINARY_INV)
In later frames, we will subtract this mask from the picture, by turning all these pixels to white. For instance:
import numpy as np
import cv2
vidcap = cv2.VideoCapture('0_0.mp4')
success,image = vidcap.read()
count = 0
frames = []
while count < 500:
frames.append(image)
success,image = vidcap.read()
count += 1
mask = np.where(thresh == 0)
example = frames[300]
example[mask] = [255, 255, 255]
cv2.imshow('', example)
cv2.waitKey(0)
Now, we will create a function that will return the difference between the template and a given picture. We will also use opening to get rid of the left over single pixels that would make it ugly.
def difference_with_mask(image):
grayscale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5, 5), np.uint8)
dilation = cv2.dilate(255 - grayscale, kernel, iterations=5)
_, thresh = cv2.threshold(dilation, 25, 255, cv2.THRESH_BINARY_INV)
thresh[mask] = 255
closing = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
return closing
cv2.imshow('', difference_with_mask(frames[400]))
cv2.waitKey(0)
To address the fact that you don't want to have the hand detected as hand-writing, I suggest that instead of using the mask for every individual frame, you use the 95th percentile of the 15 last 30th frame... hang on. Look at this:
results = []
for ix, frame in enumerate(frames):
if ix % 30 == 0:
history.append(frame)
results.append(np.quantile(history, 0.95, axis=0))
print(ix)
Now, the example frame becomes this (the hand is removed because it wasn't mostly present in the 15 last 30th frames):
As you can see a little part of the hand-writing is missing. It will come later, because of the time-dependent percentile transformation we're doing. You'll see later: in my example with frame 18,400, the text that is missing in the image above is present. Then, you can use the function I gave you and this will be the result:
And here we go! Note that this solution, which doesn't include the hand, will take longer to compute because there's a few calculations needing to be done. Using just an image with no regard to the hand would calculate instantly, to the extent that you could probably run it on your webcam feed in real time.
Final Example:
Here's the frame 18,400:
Final image:
You can play with the function if you want the mask to wrap more thinly around the text:
Full code:
import os
import numpy as np
import cv2
vidcap = cv2.VideoCapture('0_0.mp4')
success,image = vidcap.read()
count = 0
from collections import deque
frames = deque(maxlen=700)
while count < 500:
frames.append(image)
success,image = vidcap.read()
count += 1
template = cv2.imread('template.jpg')
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5, 5),np.uint8)
dilation = cv2.dilate(255 - template, kernel,iterations = 5)
cv2.imwrite('dilation.jpg', dilation)
cv2.imshow('', dilation)
cv2.waitKey(0)
_, thresh = cv2.threshold(dilation,25,255,cv2.THRESH_BINARY_INV)
cv2.imwrite('thresh.jpg', thresh)
cv2.imshow('', thresh)
cv2.waitKey(0)
mask = np.where(thresh == 0)
example = frames[400]
cv2.imwrite('original.jpg', example)
cv2.imshow('', example)
cv2.waitKey(0)
example[mask] = 255
cv2.imwrite('example_masked.jpg', example)
cv2.imshow('', example)
cv2.waitKey(0)
def difference_with_mask(image):
grayscale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5, 5), np.uint8)
dilation = cv2.dilate(255 - grayscale, kernel, iterations=5)
_, thresh = cv2.threshold(dilation, 25, 255, cv2.THRESH_BINARY_INV)
thresh[mask] = 255
closing = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
return closing
cv2.imshow('', difference_with_mask(frames[400]))
cv2.waitKey(0)
masked_example = difference_with_mask(frames[400])
cv2.imwrite('masked_example.jpg', masked_example)
from collections import deque
history = deque(maxlen=15)
results = []
for ix, frame in enumerate(frames):
if ix % 30 == 0:
history.append(frame)
results.append(np.quantile(history, 0.95, axis=0))
print(ix)
if ix > 500:
break
cv2.imshow('', frames[400])
cv2.waitKey(0)
cv2.imshow('', results[400].astype(np.uint8))
cv2.imwrite('percentiled_frame.jpg', results[400].astype(np.uint8))
cv2.waitKey(0)
cv2.imshow('', difference_with_mask(results[400].astype(np.uint8)))
cv2.imwrite('final.jpg', difference_with_mask(results[400].astype(np.uint8)))
cv2.waitKey(0)
You could try to make a template before detection which you could use to deduct it on the current frame of the video. One way you could make such a template is to iterate through every pixel of the frame and look-up if it has a higher value (white) in that coordinate than the value that is stored in the list.
Here is an example of such a template from your video by iterating through the first two seconds:
Once you have that it is simple to detect the text. You can use the cv2.absdiff() function to make difference of template and frame. Here is an example:
Once you have this image it is trivial to search for writting (threshold + contour search or something similar).
Here is an example code:
import numpy as np
import cv2
cap = cv2.VideoCapture('0_0.mp4') # read video
bgr = cap.read()[1] # get first frame
frame = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY) # transform to grayscale
template = frame.copy() # make a copy of the grayscale
h, w = frame.shape[:2] # height, width
matrix = [] # a list for [y, x] coordinares
# fill matrix with all coordinates of the image (height x width)
for j in range(h):
for i in range(w):
matrix.append([j, i])
fps = cap.get(cv2.CAP_PROP_FPS) # frames per second of the video
seconds = 2 # How many seconds of the video you wish to look the template for
k = seconds * fps # calculate how many frames of the video is in that many seconds
i = 0 # some iterator to count the frames
lowest = [] # list that will store highest values of each pixel on the fram - that will build our template
# store the value of the first frame - just so you can compare it in the next step
for j in matrix:
y = j[0]
x = j[1]
lowest.append(template[y, x])
# loop through the number of frames calculated before
while(i < k):
bgr = cap.read()[1] # bgr image
frame = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY) # transform to grayscale
# iterate through every pixel (pixels are located in the matrix)
for l, j in enumerate(matrix):
y = j[0] # x coordinate
x = j[1] # y coordinate
temp = template[y, x] # value of pixel in template
cur = frame[y, x] # value of pixel in the current frame
if cur > temp: # if the current frame has higher value change the value in the "lowest" list
lowest[l] = cur
i += 1 # increment the iterator
# just for vizualization
cv2.imshow('frame', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
i = 0 # new iteratir to increment position in the "lowest" list
template = np.ones((h, w), dtype=np.uint8)*255 # new empty white image
# iterate through the matrix and change the value of the new empty white image to that value
# in the "lowest" list
for j in matrix:
template[j[0], j[1]] = lowest[i]
i += 1
# just for visualization - template
cv2.imwrite("template.png", template)
cv2.imshow("template", template)
cv2.waitKey(0)
cv2.destroyAllWindows()
counter = 0 # counter of countours: logicaly if the number of countours would
# rapidly decrease than that means that a new template is in order
mean_compare = 0 # this is needed for a simple color checker if the contour is
# the same color as the oders
# this is the difference between the frame of the video and created template
while(cap.isOpened()):
bgr = cap.read()[1] # bgr image
frame = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY) # grayscale
img = cv2.absdiff(template, frame) # resulted difference
thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1] # thresholded image
kernel = np.ones((5, 5), dtype=np.uint8) # simple kernel
thresh = cv2.dilate(thresh, kernel, iterations=1) # dilate thresholded image
cnts, h = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) # contour search
if len(cnts) < counter*0.5 and counter > 50: # check if new template is in order
# search for new template again
break
else:
counter = len(cnts) # update counter
for cnt in cnts: # iterate through contours
size = cv2.contourArea(cnt) # size of contours - to filter out noise
if 20 < size < 30000: # noise criterion
mask = np.zeros(frame.shape, np.uint8) # empry mask - needed for color compare
cv2.drawContours(mask, [cnt], -1, 255, -1) # draw contour on mask
mean = cv2.mean(bgr, mask=mask) # the mean color of the contour
if not mean_compare: # first will set the template color
mean_compare = mean
else:
k1 = 0.85 # koeficient how much each channels value in rgb image can be smaller
k2 = 1.15 # koeficient how much each channels value in rgb image can be bigger
# condition
b = bool(mean_compare[0] * k1 < mean[0] < mean_compare[0] * k2)
g = bool(mean_compare[1] * k1 < mean[1] < mean_compare[1] * k2)
r = bool(mean_compare[2] * k1 < mean[2] < mean_compare[2] * k2)
if b and g and r:
cv2.drawContours(bgr, [cnt], -1, (0, 255, 0), 2) # draw on rgb image
# just for visualization
cv2.imshow('img', bgr)
if cv2.waitKey(1) & 0xFF == ord('s'):
cv2.imwrite(str(j)+".png", img)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
# release the video object and destroy window
cap.release()
cv2.destroyAllWindows()
One possible result with a simple size and color filter:
NOTE: This template search algorithm is very slow because of the nested loops and can probably be optimized to make it faster - you need a little more math knowledge than me. Also, you will need to make a check if the template changes in the same video - I'm guessing that shouldn't be too difficult.
A simpler idea on how to make it a bit faster is to resize the frames to let's say 20% and make the same template search. After that resize it back to the original and dilate the template. It will not be as nice of a result but it will make a mask on where the text and lines of the template are. Then simply draw it over the frame.
I don't think you really need the code in this case and it would be rather long if you did. But here's an algorithm to do it.
Use OpenCV's EAST (Efficient Accurate Scene Text detector) model at the beginning to establish the starting text on the slide. That gives you a bounding box(es) of the initial percentage of the slide covered with slide text as opposed to handwritten explanatory text.
Every, say 1-5 seconds (people don't write all that fast), compare that baseline image with the current image and the previous image.
If the current image has more text than the previous image but the initial bounding boxes are NOT the same, you have a new and rather busy slide.
If the current image has more text than the previous image but the initial bounding boxes are ARE the same, more text is being added.
If the current image had less text than the previous image but the initial bounding boxes are NOT the same, you again have a new slide -- only, not busy and with space like the last one to write.
If the current image has less text than the previous image but the initial bounding boxes are ARE the same, you either have a duplicate slide with what will presumably be more text or the teacher is erasing a section to continue, or modify their explanation. Meaning, you'll need some way of addressing this.
When you have a new slide, take the previous image, and compare the bounding boxes of all text, subtracting the boxes for the initial state.
Computationally, this isn't going to be cheap (you certainly won't be doing this life, at least not for a number of years) but it's robust, and sampling the text every so many seconds of time will help.
Personally, I would approach this as an ensemble. That is an initial bounding box then look at the color of the text. If you can get away with the percentage of different color text, do. And when you can't, you'll still be good.
In addition to the great answers that people provided, I have two other suggestions.
The first one, is the CNN methods. It's totally workable to use some object detection routine, or even a segmentation method (like U-NET) to differentiate between the texts. It is easy because you can find millions of images from digital text books and also handwritten documents to train your model.
The Second approach is to locate and to extract every single symbol on the image, separately (with a simple method like the one you used so far, or with connectedcomponent). Since typographic letters and symbols have a unique shape and style (similar fonts - unlike the handwritten letters) you can match all the found letters with sample typographic letters that you gathered separately to distinguish between the handwritten and the typographic. Feature-point-based matching (like SURF) could be a good tool for this approach.
kinda stuck trying to figure out how I can expand the background color inwards.
I have this image that has been generated through a mask after noisy background subtraction.
I am trying to make it into this:
So far I have tried to this, but to no avail:
import cv2
from PIL import Image
import numpy as np
Image.open("example_of_misaligned_frame.png") # open poor frame
img_copy = np.asanyarray(img).copy()
contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX) # find contours
# create bounding box around blob and figure out the row.cols to iterate over
x,y,w,h = cv2.boundingRect(max(contours, key = cv2.contourArea))
# flood fill the entire region with back, hoping that off-white region gets filled due to connected components.
for row in range(y, y+h):
for col in range(x, x+w):
cv2.floodFill(img_copy, None, seedPoint=(col,row), newVal=0)
This results in a completely black image :(
Any help, pointing me in the right direction, is greatly appreciated.
You can solve it by using floodFill twice:
First time - fill the black pixels with Off-White color.
Second time - fill the Off-White pixels with black color.
There is still an issue for finding the RGB values of the Off-White color.
I found an improvised solution for finding the Off-White color (I don't know the exact rules for what color is considered to be background).
Here is a working code sample:
import cv2
import numpy as np
#Image.open("example_of_misaligned_frame.png") # open poor frame
img = cv2.imread("example_of_misaligned_frame.png")
#img_copy = np.asanyarray(img).copy()
img_copy = img.copy()
#Improvised way to find the Off White color (it's working because the Off White has the maximum color components values).
tmp = cv2.dilate(img, np.ones((50,50), np.uint8), iterations=10)
# Color of Off-White pixel
offwhite = tmp[0, 0, :]
# Convert to tuple
offwhite = tuple((int(offwhite[0]), int(offwhite[1]), int(offwhite[2])))
# Fill black pixels with off-white color
cv2.floodFill(img_copy, None, seedPoint=(0,0), newVal=offwhite)
# Fill off-white pixels with black color
cv2.floodFill(img_copy, None, seedPoint=(0,0), newVal=0, loDiff=(2, 2, 2, 2), upDiff=(2, 2, 2, 2))
cv2.imshow("img_copy", img_copy)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result of cv2.dilate:
Result of first cv2.floodFill:
Result of second cv2.floodFill:
In Python/OpenCV, you can simply extract a binary mask from your flood filled process image and erode that mask. Then reapply it to the input or to your flood filled result.
Input:
import cv2
# read image
img = cv2.imread("masked_image.png")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# make anything not black into white
gray[gray!=0] = 255
# erode mask
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (51,51))
mask = cv2.morphologyEx(gray, cv2.MORPH_ERODE, kernel)
# make mask into 3 channels
mask = cv2.merge([mask,mask,mask])
# apply new mask to img
result = img.copy()
result = cv2.bitwise_and(img, mask)
# write result to disk
cv2.imwrite("masked_image_original_mask.png", gray)
cv2.imwrite("masked_image_eroded_mask.png", mask)
cv2.imwrite("masked_image_eroded_image.png", result)
# display it
cv2.imshow("IMAGE", img)
cv2.imshow("MASK", mask)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
Mask:
Eroded Mask:
Result:
Adjust the size of the circular (elliptical) morphology kernel as desired for more or less erosion.
I am working with skin images, in recognition of skin blemishes, and due to the presence of noises, mainly by the presence of hairs, this work becomes more complicated.
I have an image example in which I work in an attempt to highlight only the skin spot, but due to the large number of hairs, the algorithm is not effective. With this, I would like you to help me develop an algorithm to remove or reduce the amount of hair so that I can only highlight my area of interest (ROI), which are the spots.
Algorithm used to highlight skin blemishes:
import numpy as np
import cv2
#Read the image and perform threshold
img = cv2.imread('IMD006.bmp')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.medianBlur(gray,5)
_,thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
#Search for contours and select the biggest one
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
cnt = max(contours, key=cv2.contourArea)
#Create a new mask for the result image
h, w = img.shape[:2]
mask = np.zeros((h, w), np.uint8)
#Draw the contour on the new mask and perform the bitwise operation
cv2.drawContours(mask, [cnt],-1, 255, -1)
res = cv2.bitwise_and(img, img, mask=mask)
#Display the result
cv2.imwrite('IMD006.png', res)
#cv2.imshow('img', res)
cv2.waitKey(0)
cv2.destroyAllWindows()
Example image used:
How to deal with these noises to the point of improving my region of interest?
This is quite a difficult task becasue the hair goes over your ROI (mole). I don't know how to help remove it from the mole but I can help to remove the backround like in the picture without hairs. For the removal of hairs from mole I advise you to search for "removing of watermarks from image" and "deep neural networks" to maybe train a model to remove the hairs (note that this task will be quite difficult).
That being said, for the removing of background you could try the same code that you allready have for detection without hairs. You will get a binary image like this:
Now your region is filled with white lines (hairs) that go over your contour that is your ROI and cv2.findContours() would also pick them out because they are connected. But if you look at the picture you will find out that the white lines are quite thin and you can remove it from the image by performing opening (cv2.morphologyEx) on the image. Opening is erosion followed by dilation so when you erode the image with a big enough kernel size the white lines will dissapear:
Now you have a white spot with some noises arround which you can connect by performing another dilation (cv2.dilate()):
To make the ROI a bit smoother you can blur the image cv2.blur():
After that you can make another treshold and search for the biggest contour. The final result:
Hope it helps a bit. Cheers!
Example code:
import numpy as np
import cv2
# Read the image and perfrom an OTSU threshold
img = cv2.imread('hair.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# Remove hair with opening
kernel = np.ones((5,5),np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 2)
# Combine surrounding noise with ROI
kernel = np.ones((6,6),np.uint8)
dilate = cv2.dilate(opening,kernel,iterations=3)
# Blur the image for smoother ROI
blur = cv2.blur(dilate,(15,15))
# Perform another OTSU threshold and search for biggest contour
ret, thresh = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
_, contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
cnt = max(contours, key=cv2.contourArea)
# Create a new mask for the result image
h, w = img.shape[:2]
mask = np.zeros((h, w), np.uint8)
# Draw the contour on the new mask and perform the bitwise operation
cv2.drawContours(mask, [cnt],-1, 255, -1)
res = cv2.bitwise_and(img, img, mask=mask)
# Display the result
cv2.imshow('img', res)
cv2.waitKey(0)
cv2.destroyAllWindows()
i am trying to remove noise in an image less and am currently running this code
import numpy as np
import argparse
import cv2
from skimage import morphology
# Construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required = True,
help = "Path to the image")
args = vars(ap.parse_args())
# Load the image, convert it to grayscale, and blur it slightly
image = cv2.imread(args["image"])
cv2.imshow("Image", image)
cv2.imwrite("image.jpg", image)
greenLower = np.array([50, 100, 0], dtype = "uint8")
greenUpper = np.array([120, 255, 120], dtype = "uint8")
green = cv2.inRange(image, greenLower, greenUpper)
#green = cv2.GaussianBlur(green, (3, 3), 0)
cv2.imshow("green", green)
cv2.imwrite("green.jpg", green)
cleaned = morphology.remove_small_objects(green, min_size=64, connectivity=2)
cv2.imshow("cleaned", cleaned)
cv2.imwrite("cleaned.jpg", cleaned)
cv2.waitKey(0)
However, the image does not seem to have changed from "green" to "cleaned" despite using the remove_small_objects function. why is this and how do i clean the image up? Ideally i would like to isolate only the image of the cabbage.
My thought process is after thresholding to remove pixels less than 100 in size, then smoothen the image with blur and fill up the black holes surrounded by white - that is what i did in matlab. If anybody could direct me to get the same results as my matlab implementation, that would be greatly appreciated. Thanks for your help.
Edit: made a few mistakes when changing the code, updated to what it currently is now and display the 3 images
image:
green:
clean:
my goal is to get somthing like this picture below from matlab implementation:
Preprocessing
A good idea when you're filtering an image is to lowpass the image or blur it a bit; that way neighboring pixels become a little more uniform in color, so it will ease brighter and darker spots on the image and keep holes out of your mask.
img = cv2.imread('image.jpg')
blur = cv2.GaussianBlur(img, (15, 15), 2)
lower_green = np.array([50, 100, 0])
upper_green = np.array([120, 255, 120])
mask = cv2.inRange(blur, lower_green, upper_green)
masked_img = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow('', masked_img)
cv2.waitKey()
Colorspace
Currently, you're trying to contain an image by a range of colors with different brightness---you want green pixels, regardless of whether they are dark or light. This is much more easily accomplished in the HSV colorspace. Check out my answer here going in-depth on the HSV colorspace.
img = cv2.imread('image.jpg')
blur = cv2.GaussianBlur(img, (15, 15), 2)
hsv = cv2.cvtColor(blur, cv2.COLOR_BGR2HSV)
lower_green = np.array([37, 0, 0])
upper_green = np.array([179, 255, 255])
mask = cv2.inRange(hsv, lower_green, upper_green)
masked_img = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow('', masked_img)
cv2.waitKey()
Removing noise in a binary image/mask
The answer provided by ngalstyan shows how to do this nicely with morphology. What you want to do is called opening, which is the combined process of eroding (which more or less just removes everything within a certain radius) and then dilating (which adds back to any remaining objects however much was removed). In OpenCV, this is accomplished with cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel). The tutorials on that page show how it works nicely.
img = cv2.imread('image.jpg')
blur = cv2.GaussianBlur(img, (15, 15), 2)
hsv = cv2.cvtColor(blur, cv2.COLOR_BGR2HSV)
lower_green = np.array([37, 0, 0])
upper_green = np.array([179, 255, 255])
mask = cv2.inRange(hsv, lower_green, upper_green)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
opened_mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
masked_img = cv2.bitwise_and(img, img, mask=opened_mask)
cv2.imshow('', masked_img)
cv2.waitKey()
Filling in gaps
In the above, opening was shown as the method to remove small bits of white from your binary mask. Closing is the opposite operation---removing chunks of black from your image that are surrounded by white. You can do this with the same idea as above, but using cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel). This isn't even necessary after the above in your case, as the mask doesn't have any holes. But if it did, you could close them up with closing. You'll notice my opening step actually removed a small bit of the plant at the bottom. You could actually fill those gaps with closing first, and then opening to remove the spurious bits elsewhere, but it's probably not necessary for this image.
Trying out new values for thresholding
You might want to get more comfortable playing around with different colorspaces and threshold levels to get a feel for what will work best for a particular image. It's not complete yet and the interface is a bit wonky, but I have a tool you can use online to try out different thresholding values in different colorspaces; check it out here if you'd like. That's how I quickly found values for your image.
Although, the above problem is solved using cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel). But, if somebody wants to use morphology.remove_small_objects to remove area less than a specified size, for those this answer may be helpful.
Code I used to remove noise for above image is:
import numpy as np
import cv2
from skimage import morphology
# Load the image, convert it to grayscale, and blur it slightly
image = cv2.imread('im.jpg')
cv2.imshow("Image", image)
#cv2.imwrite("image.jpg", image)
greenLower = np.array([50, 100, 0], dtype = "uint8")
greenUpper = np.array([120, 255, 120], dtype = "uint8")
green = cv2.inRange(image, greenLower, greenUpper)
#green = cv2.GaussianBlur(green, (3, 3), 0)
cv2.imshow("green", green)
cv2.imwrite("green.jpg", green)
imglab = morphology.label(green) # create labels in segmented image
cleaned = morphology.remove_small_objects(imglab, min_size=64, connectivity=2)
img3 = np.zeros((cleaned.shape)) # create array of size cleaned
img3[cleaned > 0] = 255
img3= np.uint8(img3)
cv2.imshow("cleaned", img3)
cv2.imwrite("cleaned.jpg", img3)
cv2.waitKey(0)
Cleaned image is shown below:
To use morphology.remove_small_objects, first labeling of blobs is essential. For that I use imglab = morphology.label(green). Labeling is done like, all pixels of 1st blob is numbered as 1. similarly, all pixels of 7th blob numbered as 7 and so on. So, after removing small area, remaining blob's pixels values should be set to 255 so that cv2.imshow() can show these blobs. For that I create an array img3 of the same size as of cleaned image. I used img3[cleaned > 0] = 255 line to convert all pixels which value is more than 0 to 255.
It seems what you want to remove is a disconnected group of small blobs.
I think erode() will do a good job remove them with the right kernel.
Given an nxn kernel, erode moves the kernel through the image and replaces the center pixel by the minimum pixel in the kernel.
Then you can dilate() the resulting image to restore eroded edges of the green part.
Another option would be to use fastndenoising
##### option 1
kernel_size = (5,5) # should roughly have the size of the elements you want to remove
kernel_el = cv2.getStructuringElement(cv2.MORPH_RECT, kernel_size)
eroded = cv2.erode(green, kernel_el, (-1, -1))
cleaned = cv2.dilate(eroded, kernel_el, (-1, -1))
##### option 2
cleaned = cv2.fastNlMeansDenoisingColored(green, h=10)