Edge Detection minimum line length? - python

I'm trying to filter out short lines from my Canny edge detection. Here's what I'm currently using, along with a brief explanation:
I start by taking a single channel of the image and running OpenCV's Canny edge detection. Then I scan through each pixel and check whether any of its neighbours are white (True, 255). If so, I add it to a group of true pixels and check every pixel around that one, looping until no white/True pixels are left. I then replace all the pixels in the group with black/False if the group count is less than a designated threshold (in this case, 100 pixels).
While this works (as shown below), it's awfully slow. I'm wondering if there's a faster, easier way to do this.
import cv2

img = cv2.imread("edtest.jpg")
img_r = img.copy()
img_r[:, :, 0] = 0
img_r[:, :, 1] = 0
img_r = cv2.GaussianBlur(img_r, (3, 3), 0)
basic_edge = cv2.Canny(img_r, 240, 250)

culled_edge = basic_edge.copy()
min_threshold = 100
for x in range(len(culled_edge)):
    print(x)
    for y in range(len(culled_edge[x])):
        test_pixels = [(x, y)]
        true_pixels = [(x, y)]
        while len(test_pixels) != 0:
            xorigin = test_pixels[0][0]
            yorigin = test_pixels[0][1]
            if 0 < xorigin < len(culled_edge) - 1 and 0 < yorigin < len(culled_edge[0]) - 1:
                for testx in range(3):
                    for testy in range(3):
                        if culled_edge[xorigin-1+testx][yorigin-1+testy] == 255 and (xorigin-1+testx, yorigin-1+testy) not in true_pixels:
                            test_pixels.append((xorigin-1+testx, yorigin-1+testy))
                            true_pixels.append((xorigin-1+testx, yorigin-1+testy))
            test_pixels.pop(0)
        if 1 < len(true_pixels) < min_threshold:
            for i in range(len(true_pixels)):
                culled_edge[true_pixels[i][0]][true_pixels[i][1]] = 0

cv2.imshow("basic_edge", basic_edge)
cv2.imshow("culled_edge", culled_edge)
cv2.waitKey(0)
Source Image:
Canny Detection and Filtered (Ideal) Results:

The operation you are applying is called an area opening. I don't think there is an implementation in OpenCV, but you can find one in either scikit-image (skimage.morphology.area_opening) or DIPlib (dip.BinaryAreaOpening).
For example with DIPlib (disclosure: I'm an author) you'd amend your code as follows:
import diplib as dip
# ...
basic_edge = cv2.Canny(img_r, 240, 250)
min_threshold = 100
culled_edge = dip.BinaryAreaOpening(basic_edge > 0, min_threshold)
The output, culled_edge, is now a dip.Image object, which is compatible with NumPy arrays and you should be able to use it as such in many situations. If there's an issue, then you can cast it back to a NumPy array with culled_edge = np.array(culled_edge).
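If you'd rather stay within OpenCV, connected component statistics can achieve the same culling. A minimal sketch (not part of the original answer; it assumes basic_edge and min_threshold as defined in the question's code):
import cv2
import numpy as np
# label each 8-connected white region and get its pixel count
num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(basic_edge, connectivity=8)
culled_edge = basic_edge.copy()
for label in range(1, num_labels):  # label 0 is the background
    if stats[label, cv2.CC_STAT_AREA] < min_threshold:
        culled_edge[labels == label] = 0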


"filling" areas based on mask from other image

I have the following problem, with a minimal example:
I have two images of the same size (see below); one shows different ellipses and the other shows regions of interest within some of those ellipses. What I want is to color the ellipses that contain those regions red and those that don't blue, or use some other labeling that separates the two types of ellipses.
For this I already tried using connectedComponents, with partial success:
import cv2 as cv
import numpy as np
example = cv.imread('example.png', cv.IMREAD_GRAYSCALE)
roi = cv.imread('roi.png', cv.IMREAD_GRAYSCALE)
_, comp = cv.connectedComponents(example)
# get list of labels where the roi is present
redmarks = np.unique(comp[roi != 0])
# components with labels mentioned above are red otherwise blue
redmask = np.isin(comp, redmarks)
bluemask = np.invert(redmask) & comp > 0
result = cv.cvtColor(example, cv.COLOR_GRAY2RGB)
result[redmask] = [0, 0, 255]
result[bluemask] = [255, 0, 0]
cv.imshow("Test", result)
k = cv.waitKey(0)
Now there are three questions:
For the bluemask, two ellipses are missed for some reason. The two parts of the logic, the inverted redmask and not being the black background label 0, individually include both of them, so both should be present. Am I missing something?
Is there potentially a better and faster way to achieve what I want? There may be a built-in method that does what I'm trying to do that I haven't encountered yet, as I'm fairly new to OpenCV.
In the end I also want to assess whether a particle is "for reals" red or not, e.g. if an ROI is very small in comparison to the whole associated ellipse, it shall not be labeled red. I haven't thought about it much yet, as I want the basic thing to work properly first, and I already feel that my implementation would be slow as hell. So if you have any ideas regarding this, I would be very happy.
Thanks!
example.png:
roi.png:
Overlapped:
Current result:
Edit:
Alright, I have a solution for 3.:
import cv2 as cv
import numpy as np
if __name__ == '__main__':
    example = cv.imread('example.png', cv.IMREAD_GRAYSCALE)
    roi = cv.imread('roi.png', cv.IMREAD_GRAYSCALE)
    result = np.zeros((example.shape[0], example.shape[1], 3), dtype=np.uint8)
    count, comp = cv.connectedComponents(example)
    for c in range(1, count+1):
        # mask of the c-th connected component
        mask = np.zeros(example.shape, np.uint8)
        mask[comp == c] = 1
        # average of roi in this mask
        mn, _, _, _ = cv.mean(roi, mask=mask)
        # everything over 20% shared area red, else blue
        pr = 20
        if mn > 255*pr/100:
            result[comp == c] = [0, 0, 255]
        else:
            result[comp == c] = [255, 0, 0]
    cv.imshow("Test", result)
    k = cv.waitKey(0)
BUT the problem of 2. is, as expected, pretty big for this solution. While for this example it is as fast as the other code that ignores area, it is much slower for examples with a larger resolution and number of components (can't/won't share this example).
Is there a way to optimize the code inside the for-loop?
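One possible direction (an added sketch, not a tested drop-in from the thread; it assumes the same example and roi inputs as above): np.bincount can compute the per-component ROI overlap in a single vectorized pass, avoiding the per-label Python loop entirely.
import cv2 as cv
import numpy as np

example = cv.imread('example.png', cv.IMREAD_GRAYSCALE)
roi = cv.imread('roi.png', cv.IMREAD_GRAYSCALE)
count, comp = cv.connectedComponents(example)
# pixels per label and ROI pixels per label, each in one pass
area = np.bincount(comp.ravel(), minlength=count)
overlap = np.bincount(comp.ravel(), weights=(roi.ravel() != 0), minlength=count)
share = overlap / np.maximum(area, 1)
share[0] = 0.0  # ignore the background label
# everything over 20% shared area red, else blue
result = np.zeros((example.shape[0], example.shape[1], 3), np.uint8)
result[share[comp] > 0.2] = (0, 0, 255)
result[(share[comp] <= 0.2) & (comp > 0)] = (255, 0, 0)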

Image blending with laplacian pyramids yields bad image

I'm trying to implement an image blending algorithm that receives two images im1, im2 and a binary mask, and blends the two images according to that mask.
I'm using the following formula:
https://i.stack.imgur.com/uxz34.png
to build the Laplacian pyramid of the blended image, and then I reconstruct it to obtain the final image.
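(Since only an image link is given, here is the formula as reconstructed from the code below, in the usual Burt-Adelson form: $L_k^{out} = G_k^{m} \cdot L_k^{1} + (1 - G_k^{m}) \cdot L_k^{2}$, where $L_k^{1}$ and $L_k^{2}$ are the Laplacian pyramids of the two images and $G_k^{m}$ is the Gaussian pyramid of the mask.)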
My code:
mask = mask.astype(np.float64)
pyr_im1, filter_vec = build_laplacian_pyramid(im1, max_levels, filter_size_im)
pyr_im2 = build_laplacian_pyramid(im2, max_levels, filter_size_im)[0]
pyr_mask = build_gaussian_pyramid(mask, max_levels, filter_size_mask)[0]
# build the blended laplacian pyramid
l_out = [np.multiply(pyr_mask[k], pyr_im1[k])
         + np.multiply(1 - pyr_mask[k], pyr_im2[k])
         for k in range(len(pyr_im1))]
im_blend = laplacian_to_image(l_out, filter_vec, [1] * len(l_out))
im_blend = np.clip(im_blend, 0, 1)  # clip values to the range of [0,1]
return im_blend
Where mask is a binary mask with values of 0 or 1, and im1 and im2 are np.float64 images normalized to the range [0, 1]. The functions build_laplacian_pyramid, build_gaussian_pyramid, and laplacian_to_image work perfectly; I have tested them and made sure they work properly. When I use this code to try and blend two images I get something like this:
https://i.stack.imgur.com/0NlKv.png
Are there any apparent issues with my code that come to mind?
Thanks in advance
Hi, I'm not sure exactly what is wrong, but I think you need to reverse the mask when applying the blend.
Here is my code that seems to work:
def laplacianSameSize(outerImage, innerImage, mask, levels):
    gpCEye = gaussianPyramid(innerImage, levels)
    lpCEye = laplacianPyramid(gpCEye)
    gpFrame = gaussianPyramid(outerImage, levels)
    lpFrame = laplacianPyramid(gpFrame)
    gpMask = gaussianPyramid(mask, levels)
    gpMask.reverse()
    LS = []
    # applying the mask
    for lFrame, lCEye, gMask in zip(lpFrame, lpCEye, gpMask):
        lFrame[gMask == 255] = lCEye[gMask == 255]
        LS.append(lFrame)
    # now reconstruct
    ls_ = LS[0]
    for i in range(1, levels + 1):
        size = (LS[i].shape[1], LS[i].shape[0])
        ls_ = cv2.pyrUp(ls_, dstsize=size)
        ls_ = ls_ + LS[i]
    # bound to [0, 255] before casting to uint8
    bound(ls_)
    return ls_.astype(np.uint8)
With functions
def laplacianPyramid(gp):
    levels = len(gp) - 1
    lp = [gp[levels]]
    for i in range(levels, 0, -1):
        size = (gp[i - 1].shape[1], gp[i - 1].shape[0])
        GE = cv2.pyrUp(gp[i], dstsize=size)
        L = gp[i - 1] - GE
        lp.append(L)
    return lp

def gaussianPyramid(img, num_levels):
    lower = img.copy()
    gp = [np.float32(lower)]
    for i in range(num_levels):
        lower = cv2.pyrDown(lower)
        gp.append(np.float32(lower))
    return gp

def bound(im):
    im[im < 0] = 0
    im[im > 255] = 255
I also make sure to use floats until the end, then bound the result within 0 to 255.
Hope this helps!

Fast and Robust Image Stitching Algorithm for many images in Python?

I have a stationary camera that rapidly takes photos of a continuously moving product, from a fixed position and always at the same angle (translational perspective). I need to stitch all the images into a panoramic picture. I've tried using the Stitcher class. It worked, but it took a long time to compute.
I also tried another method using the SIFT detector, a FLANN-based matcher, finding the homography, and then warping the images. This method works fine if I only use two images. For multiple images it still doesn't stitch them properly. Does anyone know the best and fastest image stitching algorithm for this case?
This is my code which uses the Stitcher class.
import time
import cv2
import os
import numpy as np
import sys

def main():
    # read input images
    imgs = []
    path = 'pics_rotated/'
    for (root, dirs, files) in os.walk(path):
        images = [f for f in files]
        print(images)
        for i in range(0, len(images)):
            curImg = cv2.imread(path + images[i])
            imgs.append(curImg)
    stitcher = cv2.Stitcher.create(mode=0)
    status, result = stitcher.stitch(imgs)
    if status != cv2.Stitcher_OK:
        print("Can't stitch images, error code = %d" % status)
        sys.exit(-1)
    cv2.imwrite("imagesout/output.jpg", result)
    cv2.waitKey(0)

if __name__ == '__main__':
    start = time.time()
    main()
    end = time.time()
    print("Time --->>>>>", end - start)
    cv2.destroyAllWindows()
Briefing
Although the OpenCV Stitcher class provides lots of methods and options to perform stitching, I find it hard to use because of its complexity.
Therefore, I will try to provide the minimal and fastest way to perform stitching.
In case you are wondering about more sophisticated approaches such as exposure compensation, I highly recommend looking at the detailed sample code.
As a side note, I will be grateful if someone can convert the following functions to use the Stitcher class.
Introduction
In order to combine multiple images into the same perspective, the following operations are needed:
Detect and match features.
Compute homography (perspective transform between frames).
Warp one image onto the other perspective.
Combine the base and warped images while keeping track of the shift in origin.
Given the combination pattern, stitch multiple images.
Feature detection and matching
What are features?
They are distinguishable parts, like corners of a square, that are preserved across images.
There are different algorithms proposed for obtaining these characteristic points, like Harris, ORB, SIFT, SURF, etc.
See cv::Feature2d for the full list.
I will use SIFT because it is accurate and sufficiently fast.
A feature consists of a KeyPoint, which is the location in the image, and a descriptor, which is a set of numbers (e.g. a 128-D vector) that represents the properties of the feature.
After finding distinct points in images, we need to match the corresponding point pairs.
See cv::DescriptorMatcher.
I will use Flann-based descriptor matcher.
First, we initialize the descriptor and matcher classes.
descriptor = cv.SIFT.create()
matcher = cv.DescriptorMatcher.create(cv.DescriptorMatcher.FLANNBASED)
Then, we find the features in each image.
(kps, desc) = descriptor.detectAndCompute(image, mask=None)
Now we find the corresponding point pairs.
if desc1 is not None and desc2 is not None and len(desc1) >= 2 and len(desc2) >= 2:
    rawMatch = matcher.knnMatch(desc2, desc1, k=2)
    matches = []
    # ensure the distance is within a certain ratio of each other (i.e. Lowe's ratio test)
    ratio = 0.75
    for m in rawMatch:
        if len(m) == 2 and m[0].distance < m[1].distance * ratio:
            matches.append((m[0].trainIdx, m[0].queryIdx))
Homography computation
Homography is the perspective transformation from one view to another.
The parallel lines in one view may not be parallel in another, like a road to sunset.
We need to have at least 4 corresponding point pairs.
More pairs mean redundant data that has to be reconciled or eliminated.
The homography matrix transforms a point in the initial view to its warped position.
It is a 3x3 matrix that is computed by the Direct Linear Transform algorithm.
There are 8 DoF and the last element in the matrix is 1.
[pt2] = H * [pt1]
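Written out with homogeneous coordinates (the standard form, added here for clarity), the perspective division becomes explicit:

$\begin{bmatrix} x' \\ y' \\ w \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad pt_2 = \left(\tfrac{x'}{w},\ \tfrac{y'}{w}\right)$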
Now that we have corresponding point matches, we compute the homography.
The method we use to handle redundant data is RANSAC, which randomly selects 4 point pairs and uses the best fitting result.
See cv::findHomography for more options.
if len(matches) > 4:
    (H, status) = cv.findHomography(pts1, pts2, cv.RANSAC)
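The pts1 and pts2 arrays are left implicit in the snippet above; one way to gather them (a sketch, assuming kps1 and kps2 are the keypoint lists returned by detectAndCompute for each image):
pts1 = np.float32([kps1[i].pt for (i, _) in matches])  # trainIdx indexes image 1
pts2 = np.float32([kps2[j].pt for (_, j) in matches])  # queryIdx indexes image 2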
Warping to perspective
By computing homography, we know which point in the source image corresponds to which point in the destination image.
In order not to lose information from the source image, we need to pad the destination image by the amount by which transformed points fall into negative coordinates.
At the same time, we need to keep track of the shift amount of the origin for stitching multiple images.
Auxiliary functions
# find the ROI of a transformation result
def warpRect(rect, H):
    x, y, w, h = rect
    corners = np.float64([[x, y], [x, y + h - 1], [x + w - 1, y], [x + w - 1, y + h - 1]])
    # apply H including the perspective division (cv.transform would skip the division)
    extremum = cv.perspectiveTransform(corners.reshape(-1, 1, 2), H).reshape(-1, 2)
    minx, miny = np.min(extremum[:, 0]), np.min(extremum[:, 1])
    maxx, maxy = np.max(extremum[:, 0]), np.max(extremum[:, 1])
    xo = int(np.floor(minx))
    yo = int(np.floor(miny))
    wo = int(np.ceil(maxx - minx))
    ho = int(np.ceil(maxy - miny))
    outrect = (xo, yo, wo, ho)
    return outrect
# homography matrix is translated to fit in the screen
def coverH(rect, H):
    # obtain bounding box of the result
    x, y, _, _ = warpRect(rect, H)
    # shift amount to the first quadrant
    xpos = int(-x if x < 0 else 0)
    ypos = int(-y if y < 0 else 0)
    # correct the homography matrix so that no point is thrown out
    T = np.array([[1, 0, xpos], [0, 1, ypos], [0, 0, 1]])
    H_corr = T.dot(H)
    return (H_corr, (xpos, ypos))
# pad image to cover ROI, return the shift amount of origin
def addBorder(img, rect):
    x, y, w, h = rect
    tl = (x, y)
    br = (x + w, y + h)
    top = int(-tl[1] if tl[1] < 0 else 0)
    bottom = int(br[1] - img.shape[0] if br[1] > img.shape[0] else 0)
    left = int(-tl[0] if tl[0] < 0 else 0)
    right = int(br[0] - img.shape[1] if br[0] > img.shape[1] else 0)
    img = cv.copyMakeBorder(img, top, bottom, left, right, cv.BORDER_CONSTANT, value=[0, 0, 0])
    orig = (left, top)
    return img, orig

def size2rect(size):
    return (0, 0, size[1], size[0])
Warping function
def warpImage(img, H):
    # tweak the homography matrix to move the result to the first quadrant
    H_cover, pos = coverH(size2rect(img.shape), H)
    # find the bounding box of the output
    x, y, w, h = warpRect(size2rect(img.shape), H_cover)
    width, height = x + w, y + h
    # warp the image using the corrected homography matrix
    warped = cv.warpPerspective(img, H_cover, (width, height))
    # make the external boundary solid black, useful for masking
    warped = np.ascontiguousarray(warped, dtype=np.uint8)
    gray = cv.cvtColor(warped, cv.COLOR_RGB2GRAY)
    _, bw = cv.threshold(gray, 1, 255, cv.THRESH_BINARY)
    # https://stackoverflow.com/a/55806272/12447766
    major = cv.__version__.split('.')[0]
    if major == '3':
        _, cnts, _ = cv.findContours(bw, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_NONE)
    else:
        cnts, _ = cv.findContours(bw, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_NONE)
    warped = cv.drawContours(warped, cnts, 0, [0, 0, 0], lineType=cv.LINE_4)
    return (warped, pos)
Combining warped and destination images
This is the step where image enhancement such as exposure compensation becomes involved.
In order to keep things simple, we will use mean value blending.
The easiest solution would be overwriting the existing data in the destination image, but the averaging operation is not a burden for us.
# only the non-zero pixels are weighted to the average
def mean_blend(img1, img2):
    assert(img1.shape == img2.shape)
    locs1 = np.where(cv.cvtColor(img1, cv.COLOR_RGB2GRAY) != 0)
    blended1 = np.copy(img2)
    blended1[locs1[0], locs1[1]] = img1[locs1[0], locs1[1]]
    locs2 = np.where(cv.cvtColor(img2, cv.COLOR_RGB2GRAY) != 0)
    blended2 = np.copy(img1)
    blended2[locs2[0], locs2[1]] = img2[locs2[0], locs2[1]]
    blended = cv.addWeighted(blended1, 0.5, blended2, 0.5, 0)
    return blended
def warpPano(prevPano, img, H, orig):
    # correct homography matrix
    T = np.array([[1, 0, -orig[0]], [0, 1, -orig[1]], [0, 0, 1]])
    H_corr = H.dot(T)
    # warp the image and obtain shift amount of origin
    result, pos = warpImage(prevPano, H_corr)
    xpos, ypos = pos
    # zero pad the result
    rect = (xpos, ypos, img.shape[1], img.shape[0])
    result, _ = addBorder(result, rect)
    # mean value blending
    idx = np.s_[ypos : ypos + img.shape[0], xpos : xpos + img.shape[1]]
    result[idx] = mean_blend(result[idx], img)
    # crop extra paddings
    x, y, w, h = cv.boundingRect(cv.cvtColor(result, cv.COLOR_RGB2GRAY))
    result = result[y : y + h, x : x + w]
    # return the resulting image with shift amount
    return (result, (xpos - x, ypos - y))
Stitching multiple images given combination pattern
# base image is the last image in each iteration
def blend_multiple_images(images, homographies):
    N = len(images)
    assert(N >= 2)
    assert(len(homographies) == N - 1)
    pano = np.copy(images[0])
    pos = (0, 0)
    for i in range(N - 1):
        img = images[i + 1]
        # get homography matrix
        H = homographies[i]
        # warp pano onto image
        pano, pos = warpPano(pano, img, H, pos)
    return (pano, pos)
The method above successively warps the previously combined image, called pano, onto the next image.
A pattern, however, may have junction points that give the best stitching view.
For example
1 2 3
4 5 6
The best pattern to combine these images is
1 -> 2 <- 3
     |
     V
4 -> 5 <- 6
Therefore, we need one last function to combine 1 & 2 with 2 & 3, or 1235 with 456 at node 5.
from operator import sub

# no warping here, useful for combining two different stitched images
# the image at given origin coordinates must be the same
def patchPano(img1, img2, orig1=(0, 0), orig2=(0, 0)):
    # bottom right points
    br1 = (img1.shape[1] - 1, img1.shape[0] - 1)
    br2 = (img2.shape[1] - 1, img2.shape[0] - 1)
    # distance from orig to br
    diag2 = tuple(map(sub, br2, orig2))
    # possible pano corner coordinates based on img1
    extremum = np.array([(0, 0), br1,
                         tuple(map(sum, zip(orig1, diag2))),
                         tuple(map(sub, orig1, orig2))])
    bb = cv.boundingRect(extremum)
    # patch img1 to img2
    pano, shift = addBorder(img1, bb)
    orig = tuple(map(sum, zip(orig1, shift)))
    idx = np.s_[orig[1] : orig[1] + img2.shape[0] - orig2[1],
                orig[0] : orig[0] + img2.shape[1] - orig2[0]]
    subImg = img2[orig2[1] : img2.shape[0], orig2[0] : img2.shape[1]]
    pano[idx] = mean_blend(pano[idx], subImg)
    return (pano, orig)
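To make the pipeline concrete, here is a minimal pairwise driver (an added sketch, not from the original answer; it chains the snippets above and assumes BGR images and the helper functions already defined):

import cv2 as cv
import numpy as np

def stitchPair(img1, img2):
    # detect and match features (same settings as above)
    descriptor = cv.SIFT.create()
    matcher = cv.DescriptorMatcher.create(cv.DescriptorMatcher.FLANNBASED)
    kps1, desc1 = descriptor.detectAndCompute(img1, mask=None)
    kps2, desc2 = descriptor.detectAndCompute(img2, mask=None)
    rawMatch = matcher.knnMatch(desc2, desc1, k=2)
    matches = [(m[0].trainIdx, m[0].queryIdx) for m in rawMatch
               if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]
    # homography mapping img1 points onto img2's frame
    pts1 = np.float32([kps1[i].pt for (i, _) in matches])
    pts2 = np.float32([kps2[j].pt for (_, j) in matches])
    H, _ = cv.findHomography(pts1, pts2, cv.RANSAC)
    # warp img1 onto img2 and blend, tracking the origin shift
    return warpPano(img1, img2, H, (0, 0))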
For a quick demo, you can run the Python code in GitHub.
If you want to use the above methods in C++, you can have a look at Stitch library.
Any PR or edit to this post is welcome.
As an alternative to the last step that @Burak gave, this is the approach I used, since I had the number of images for each of the rows (chunks); multiStitching is nothing but a function that stitches images horizontally:
def stitchingImagesHV(img_list, size):
    """
    As our multi stitching algorithm works on the horizontal line, we hack it
    to also do vertical stitching by rotating each row "stitch_img" and
    applying the same technique; after that, the final result is rotated back
    to the original direction.
    """
    # generate row chunks of "size" length from the image list
    chunks = [img_list[i:i + size] for i in range(0, len(img_list), size)]
    list_rotated_images = []
    for i in range(len(chunks)):
        stitch_img = multiStitching(chunks[i])
        stitch_img_rotated = cv2.rotate(stitch_img, cv2.ROTATE_90_COUNTERCLOCKWISE)
        list_rotated_images.append(stitch_img_rotated.astype('uint8'))
    stitch_img2 = multiStitching(list_rotated_images)
    return cv2.rotate(stitch_img2, cv2.ROTATE_90_CLOCKWISE)
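A hypothetical usage example (file names and row length are illustrative; multiStitching is the poster's horizontal stitcher):
import glob
images = [cv2.imread(f) for f in sorted(glob.glob('pics/*.jpg'))]
pano = stitchingImagesHV(images, size=4)  # 4 images per row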

OpenCV: How do I calculate minimum for a window around a central pixel?

I am trying to extract features from an image by calculating the minimum of a given window around each pixel and subtracting it from the original pixel value.
However, that is turning out to be very slow, as I am iterating over the whole picture. Is there an optimized way to do it?
f = np.zeros(img.shape)  # output array, same size as img
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        if mask[i][j] == 255:
            row, col = i, j
            begin_row = row - 4
            end_row = row + 4
            begin_col = col - 4
            end_col = col + 4
            if begin_row < 0:
                begin_row = 0
            if begin_col < 0:
                begin_col = 0
            if end_col > img.shape[1]:
                end_col = img.shape[1]
            if end_row > img.shape[0]:
                end_row = img.shape[0]
            window = img[begin_row:end_row, begin_col:end_col]
            curr = img.item(row, col)
            f.itemset((row, col), curr - window.min())
I'm not sure if I understand your goal correctly:
Input image I
For every pixel p in I, subtract the minimum of the surrounding window (including the original pixel).
You could then use the morphological filter erode, which acts like a minimum filter:
I_new(p) = I(p) - erode(I, p, window)
where you'd parameterize erode to have the correct window size and anchor.
As for an actual implementation, you could use the Python version of OpenCV with its erode function. It is fast since it's implemented in C++/C. It could look like so (untested):
import cv2
import numpy as np
img = cv2.imread('path/to/image.jpg')
kernel = np.ones((5,5),np.float32)
dst = img - cv2.erode(img,kernel)
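For the ±4-pixel window in the question, the kernel would presumably be np.ones((9, 9), np.uint8), since the kernel must span the full neighbourhood (9 = 4 + 1 + 4).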
Most importantly: avoid looping through image arrays with Python loops; that can only ever be slow.

OpenCV Python Bindings for GrabCut Algorithm

I've been trying to use the OpenCV implementation of the GrabCut method via the Python bindings. I have tried using the versions in both cv and cv2, but I am having trouble finding the correct parameters to get the method to run correctly. I have tried several permutations of the parameters and nothing seems to work (basically every example I've seen on GitHub). Here are a couple of examples I have tried to follow:
Example 1
Example 2
And here is the method's documentation and a known bug report:
Documentation
Known Grabcut Bug
I can get the code to execute using the example below, but it returns a blank (all black) image mask.
img = Image("pills.png")
mask = img.getEmpty(1)
bgModel = cv.CreateMat(1, 13*5, cv.CV_64FC1)
fgModel = cv.CreateMat(1, 13*5, cv.CV_64FC1)
for i in range(0, 13*5):
    cv.SetReal2D(fgModel, 0, i, 0)
    cv.SetReal2D(bgModel, 0, i, 0)
rect = (150,70,170,220)
tmp1 = np.zeros((1, 13 * 5))
tmp2 = np.zeros((1, 13 * 5))
cv.GrabCut(img.getBitmap(),mask,rect,tmp1,tmp2,5,cv.GC_INIT_WITH_RECT)
I am using SimpleCV to load the images. The mask type and return type from img.getBitmap() are:
iplimage(nChannels=1 width=730 height=530 widthStep=732 )
iplimage(nChannels=3 width=730 height=530 widthStep=2192 )
If someone has a working example of this code I would love to see it. For what it is worth I am running on OSX Snow Leopard, and my version of OpenCV was installed from the SVN repository (as of a few weeks ago). For reference my input image is this:
I've tried changing the result mask enum values to something more visible. It is not the return values that are the problem. This returns a completely black image. I will try a couple more values.
img = Image("pills.png")
mask = img.getEmpty(1)
bgModel = cv.CreateMat(1, 13*5, cv.CV_64FC1)
fgModel = cv.CreateMat(1, 13*5, cv.CV_64FC1)
for i in range(0, 13*5):
    cv.SetReal2D(fgModel, 0, i, 0)
    cv.SetReal2D(bgModel, 0, i, 0)
rect = (150,70,170,220)
tmp1 = np.zeros((1, 13 * 5))
tmp2 = np.zeros((1, 13 * 5))
cv.GrabCut(img.getBitmap(), mask, rect, tmp1, tmp2, 5, cv.GC_INIT_WITH_MASK)
mask[mask == cv.GC_BGD] = 0
mask[mask == cv.GC_PR_BGD] = 0
mask[mask == cv.GC_FGD] = 255
mask[mask == cv.GC_PR_FGD] = 255
result = Image(mask)
result.show()
result.save("result.png")
Kat, this version of your code seems to work for me.
import numpy as np
import matplotlib.pyplot as plt
import cv2
filename = "pills.png"
im = cv2.imread(filename)
h,w = im.shape[:2]
mask = np.zeros((h,w),dtype='uint8')
rect = (150,70,170,220)
tmp1 = np.zeros((1, 13 * 5))
tmp2 = np.zeros((1, 13 * 5))
cv2.grabCut(im,mask,rect,tmp1,tmp2,10,mode=cv2.GC_INIT_WITH_RECT)
plt.figure()
plt.imshow(mask)
plt.colorbar()
plt.show()
It produces a figure like this, with labels 0, 2, and 3.
Your mask is filled with the following values:
GC_BGD defines an obvious background pixel.
GC_FGD defines an obvious foreground (object) pixel.
GC_PR_BGD defines a possible background pixel.
GC_PR_FGD defines a possible foreground pixel.
Which are all part of an enum:
enum { GC_BGD    = 0,  // background
       GC_FGD    = 1,  // foreground
       GC_PR_BGD = 2,  // most probably background
       GC_PR_FGD = 3   // most probably foreground
     };
Which translates to the colors: completely black, very black, dark black, and black. I think you'll find that if you add the following code (taken from your example 1 and slightly modified) your mask will look nicer:
mask[mask == cv.GC_BGD] = 0       # certain background is black
mask[mask == cv.GC_PR_BGD] = 63   # possible background is dark grey
mask[mask == cv.GC_FGD] = 255     # foreground is white
mask[mask == cv.GC_PR_FGD] = 192  # possible foreground is light grey
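If the end goal is the segmented image rather than a viewable mask, the usual follow-up (a sketch in the cv2 API, matching the working example above) is to keep the certain and probable foreground labels and mask the input:
fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype('uint8')
segmented = cv2.bitwise_and(im, im, mask=fg)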
