I am using opencv (cv2 module) in python to recognize objects in a video. In each frame, I want to extract a particular region aka, the contour. After learning from opencv docs, I have the following code snippet:
# np is numpy module, contours are expected results,
# frame is each frame of the video
# Iterate through the contours.
for contour in contours:
# Compute the bounding box for the contour, draw
# it on the frame, and update the text.
x, y, w, h = cv2.boundingRect(contour)
# Find the mask and build a histogram for the object.
mask = np.zeros(frame.shape[:2], np.uint8)
mask[y:h, x:w] = 255
masked_img = cv2.bitwise_and(frame, frame, mask = mask)
obj_hist = cv2.calcHist([masked_img], [0], None, [256], [0, 256])
However, when I use matplotlib to show the masked_img, it returns a dark image. The obj_hist has only one element with number greater than 0, which is the first one. What is wrong?
The problem is the way you are setting the values in your mask. Specifically this line:
mask[y:h, x:w] = 255
You are trying to slice into each dimension of the image by using y:h and x:w to set up the mask. The left of the colon is the starting row or column, and the right of the colon denotes the end row or column. Given that you start at y, you need to offset by h using the same reference y... the same goes for x and w.
Doing the slicing where the right value of the colon is less than the left will not modify the array in any way, and that's most why you aren't getting any output as you are not modifying the mask when it's initially all zeroes.
You probably meant to do:
mask[y:y+h, x:x+w] = 255
This will properly set the proper region given by cv2.boundingRect to white (255).
Related
I have used:
result = cv2.matchTemplate(frame, template, cv2.TM_CCORR_NORMED)
to generate this output:
I need a list of (x, y) tuples at each of the local maxima (bright spots) in the result. Simply finding all points above a threshold doesn't work, since there are many such points around each maximum.
I can guarantee the minimum distance between any two maxima, which ought to help speed things up.
Is there an efficient technique for doing this?
(P.S.: this is cross-posted from https://forum.opencv.org/t/locating-local-maximums/1534)
update
Based on an excellent suggestion by Michael Lee, I've added skeletonizing to the thresholded image. It's close, but the skeletonizing still has many "worms" rather than single points. My processing flow is as follows:
# read the image
im = cv.imread("image.png", cv.IMREAD_GRAYSCALE)
# apply thresholding
ret, im2 = cv.threshold(im, args.threshold, 255, cv.THRESH_BINARY)
# dilate the thresholded image to eliminate "pinholes"
im3 = cv.dilate(im2, None, iterations=2)
# skeletonize the result
im4 = cv.ximgproc.thinning(im3, None, cv.ximgproc.THINNING_ZHANGSUEN)
# print the number of points found
x, y = np.nonzero(im5)
print(x.shape)
# => 1208
This is a step in the right direction, but there should be more like 220 points, not 1208.
Here are the intermediate results. As you can see in the last picture (skeletonized), there are still lots of little "worms" rather than single point. Is there a better approach?
Thresholded:
Dilated:
Skeletonized:
Update 2/14: Seems like skeletonization only took you part of the way there. Here's a better solution which I believe should get you the rest of the way. Here's how you would do it in scikit-image - maybe you can find the analog in OpenCV - seems like cv2.findContours would be a good start.
# mask is the thresholded image (before or after dilation should work, no skeletonization.
from skimage.measure import label, regionprops
labeled_image = label(mask)
output_points = [region.centroid for region in regionprops(labeled_image)]
Explanation: Label will convert your binary image into a labeled image, where each mask has a different integer value. Then, regionprops uses these labels in order to separate each mask, from which we can use the centroid property to compute the middle point from each - this is guaranteed to be a single point.
Simply finding all points above a threshold doesn't work, since there
are many such points around each maximum.
Actually, this does work - as long as you apply one more processing step. After thresholding, then we want to skeletonize. Scikit-image has a good function to achieve that here, which should give you a binary mask with single points.
Afterwards, you're probably going to want to run something like:
indices = zip(*np.where(skeleton))
to get your final points!
Based on Michael Lee's answer, here's the solution that worked for me (using all openCV rather than skimage):
# read in color image and create a grayscale copy
im = cv.imread("image.png")
img = cv.cvtColor(im, cv.COLOR_BGR2GRAY)
# apply thresholding
ret, im2 = cv.threshold(img, args.threshold, 255, cv.THRESH_BINARY)
# dilate the thresholded peaks to eliminate "pinholes"
im3 = cv.dilate(im2, None, iterations=2)
contours, hier = cv.findContours(im3, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
print('found', len(contours), 'contours')
# draw a bounding box around each contour
for contour in contours:
x,y,w,h = cv.boundingRect(contour)
cv.rectangle(im, (x,y), (x+w,y+h), (255,0,0), 2)
cv.imshow('Contours', im)
cv.waitKey()
which results in just what we're looking for:
I have an image of a circular-shaped mask, which is essentially a colored circle within a black image.
I want to remove all the blank space around the mask, such that the boundaries of the image align with the circle as such:
I've written up a script to do this by searching through every column and row until a pixel with a value greater than 0 appears. searching from left to right, right to left, top to bottom, and bottom to the top gets me the mask boundaries, allowing me to crop the original image. Here is the code:
ROWS, COLS, _ = img.shape
BORDER_RIGHT = (0,0)
BORDER_LEFT = (0,0)
right_found = False
left_found = False
# find borders of blank space for removal.
# left and right border
print('Searching for Right and Left corners')
for col in tqdm(range(COLS), position=0, leave=True):
for row in range(ROWS):
if left_found and right_found:
break
# searching from left to right
if not left_found and N.sum(img[row][col]) > 0:
BORDER_LEFT = (row, col)
left_found = True
# searching from right to left
if not right_found and N.sum(img[row][-col]) > 0:
BORDER_RIGHT = (row, img.shape[1] + (-col))
right_found = True
BORDER_TOP = (0,0)
BORDER_BOTTOM = (0,0)
top_found = False
bottom_found = False
# top and bottom borders
print('Searching for Top and Bottom corners')
for row in tqdm(range(ROWS), position=0, leave=True):
for col in range(COLS):
if top_found and bottom_found:
break
# searching top to bottom
if not top_found and N.sum(img[row][col]) > 0:
BORDER_TOP = (row, col)
top_found = True
# searching bottom to top
if not bottom_found and N.sum(img[-row][col]) > 0:
BORDER_BOTTOM = (img.shape[0] + (-row), col)
bottom_found = True
# crop left and right borders
new_img = img[:,BORDER_LEFT[1]: BORDER_RIGHT[1] ,:]
# crop top and bottom borders
new_img = new_img[BORDER_TOP[0] : BORDER_BOTTOM[0],:,:]
I was wondering whether there was a more efficient way to do this. With larger images, this can be quite time-consuming especially if the mask is relatively small with respect to the original image shape. thanks!
Assuming you have only this object inside the image, there are two ways to do this:
You can threshold the image, then use numpy.where to find all locations that are non-zero, then use numpy.min and numpy.max on the appropriate row and column locations that come out of numpy.where to give you the bounding rectangle.
You can first find the contour points of the object after you threshold with cv2.findContours. This should result in a single contour, so once you have these points you put this through cv2.boundingRect to return the top-left corner of the rectangle followed by the width and height of its extent.
The first method will work if there is a single object and efficiently at that. The second one will work if there is more than one object, but you have to know which contour the object of interest is in, then you simply index into the output of cv2.findContours and pipe this through cv2.boundingRect to get the rectangular dimensions of the object of interest.
However, the takeaway is that either of these methods is much more efficient than the approach you have proposed where you are manually looping over each row and column and calculating sums.
Pre-processing
These sets of steps are going to be common to both methods. In summary, we read in the image, then convert it to grayscale then threshold. I didn't have access to your original image so I read it in from Stack Overflow and cropped it so that the axes are not showing. This will apply to the second method as well.
Here's a reconstruction of your image where I've taken a snapshot.
First I'll read in the image directly from the Internet as well as import the relevant packages I need to get the job done:
import skimage.io as io
import numpy as np
import cv2
img = io.imread('https://i.stack.imgur.com/dj1a8.png')
Thankfully, Scikit image has a method that reads in images directly from the Internet: skimage.io.imread.
After, I'm going to convert the image to grayscale, then threshold it:
img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
im = img_gray > 40
I use OpenCV's cv2.cvtColor to convert the image from colour to grayscale. After, I threshold the image so that any intensity above 40 is set to True and everything else is set to False. The threshold of 40 I chose by trial and error until I get a mask that appeared to be circular. Taking a look at this image we get:
Method #1
As I illustrated above, use numpy.where on the thresholded image, then use numpy.min and numpy.max find the appropriate top-left and bottom-right corners and crop the image:
(r, c) = np.where(im == 1)
min_row, min_col = np.min(r), np.min(c)
max_row, max_col = np.max(r), np.max(c)
im_crop = img[min_row:max_row+1, min_col:max_col+1]
numpy.where for a 2D array will return a tuple of row and column locations that are non-zero. If we find the minimum row and column location, that corresponds to the top-left corner of the bounding rectangle. Similarly, the maximum row and column location corresponds to the bottom-right corner of the bounding rectangle. What's nice is that numpy.min and numpy.max work in a vectorised fashion, meaning that it operates on entire NumPy arrays in a single sweep. This logic is used above, then we index into the original colour image to crop out the range of rows and columns that contain the object of interest. im_crop contains that result. Note that the maximum row and column needs to be added with 1 when we're indexing as slicing with the end indices are exclusive so adding with 1 ensures we include the pixel locations at the bottom right corner of the rectangle.
We therefore get:
Method #2
We will use cv2.findContours to find all contour points of all objects in the image. Because there's a single object, only one contour should result, so we use this contour to pipe into cv2.boundingRect to find the top-left corner of the bounding rectangle of the object, combined with its width and height to crop out the image.
cnt, _ = cv2.findContours(im.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
x, y, w, h = cv2.boundingRect(cnt[0])
im_crop = img[y:y+h, x:x+w]
Take note that we have to convert the thresholded image into unsigned 8-bit integer, as that is the type that the function is expecting. Furthermore, we use cv2.RETR_EXTERNAL as we only want to retrieve the coordinates of the outer perimeter of any objects we see in the image. We also use cv2.CHAIN_APPROX_NONE to return every possible contour point on the object. The cnt is a list of contours that was found in the image. The size of this list should only be 1, so we index into this directly and pipe this into cv2.boundingRect. We then use the top-left corner of the rectangle, combined with its width and height to crop out the object.
We therefore get:
Full Code
Here's the full code listing from start to finish. I've left comments below to delineate what methods #1 and #2 are. For now, method #2 has been commented out, but you can decide whichever one you want to use by simply commenting and uncommenting the relevant code.
import skimage.io as io
import cv2
import numpy as np
img = io.imread('https://i.stack.imgur.com/dj1a8.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
im = img_gray > 40
# Method #1
(r, c) = np.where(im == 1)
min_row, min_col = np.min(r), np.min(c)
max_row, max_col = np.max(r), np.max(c)
im_crop = img[min_row:max_row+1, min_col:max_col+1]
# Method #2
#cnt, _ = cv2.findContours(im.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
#x, y, w, h = cv2.boundingRect(cnt[0])
#im_crop = img[y:y+h, x:x+w]
I have a image which I have binarized and the problem is I want to remove the white spots in that image which are not connected in a loop i.e. small white dots. The reason is I want to take a measurement of that section like shown here.
I have tried some OpenCV morphology functions like erode, open, close but the results are not what I require. I have also tried using canny edge but some diagonal lines which I want for some processing are also gone. here
is the result of both thresh(left) and canny(right)
I have read that pruning is a process in which we remove pixels which are not connected but I don't know how it works? Is there a function in Opencv for that?
th3 = cv2.adaptiveThreshold(gray_img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,11,2)
th3=~th3
th3 = cv2.dilate(th3, cv2.getStructuringElement(cv2.MORPH_RECT,(3,3)))
th3 = cv2.erode(th3, cv2.getStructuringElement(cv2.MORPH_RECT,(5,5)))
edged = cv2.Canny(gray_img,lower,upper)
This answer explains how to measure the diameter of the drill bit:
The first step is to read the image as 2 channel (grayscale) and find the edges using a Canny filter:
img = cv2.imread('/home/stephen/Desktop/drill.jpg',0)
canny = cv2.Canny(img, 100, 100)
The edges of the drill using np.argmax():
diameters = []
# Iterate through each column in the image
for row in range(img.shape[1]-1):
img_row = img[:, row:row+1]
start = np.argmax(img_row)
end = np.argmax(np.flip(img_row))
diameters.append(end+start)
a = row, 0
b = row, start
c = row, img.shape[1]
d = row, img.shape[0] - end
cv2.line(img, a, b, 123, 1)
cv2.line(img, c, d, 123, 1)
The diameter of the drill in each column is plotted here:
All of this assumes that you have an aligned image. I used this code to align the image.
I would like to know if a big image contains a small image. The small image can be semi-transparent (similar to watermark, so it's not a fully filled photo). I've tried following different SO answers on this topic, but they're all matching the EXACT photo, but what I am looking for is whether the photo exists with 80% accuracy as the photo will be a lossy rendered version of the original one.
This is a procedure of how the images I am searching in will be generated:
Use any photo, put a semi-transparent "watermark" on it within Photoshop and save it. Then I want to check if the "watermark" exists within created photo with certain percent of accuracy (80% is good enough).
I've tried using the original template matching example provided on their docs page but I'm getting barely any match at all.
This is the code I'm using:
import cv2
import numpy as np
img_rgb = cv2.imread('photo2.jpeg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('small-image.png', 0)
w, h = template.shape[::-1]
res = cv2.matchTemplate(img_gray,template,cv2.TM_CCOEFF_NORMED)
threshold = 0.7
loc = np.where( res >= threshold)
for pt in zip(*loc[::-1]):
cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)
cv2.imshow('output', img_rgb)
cv2.waitKey(0)
Here are the photos I've been using for the test, as this is something similar I am trying to make a match on.
small-image.png
photo2.jpeg
I am assuming the whole watermark will have the same RGB values and the text will have a little different RGB values otherwise this technique will not work. Based on this we can obtain the RGB values of a pixel of the small image and treated it as a mask by using cv2.inRange to find those pixel values in the large image. Similarly a mask is also created for the small image using those pixel values.
small = cv2.imread('small_find.png')
large = cv2.imread('large_find.jpg')
pixel = np.reshape(small[3,3], (1,3))
lower =[pixel[0,0]-10,pixel[0,1]-10,pixel[0,2]-10]
lower = np.array(lower, dtype = 'uint8')
upper =[pixel[0,0]+10,pixel[0,1]+10,pixel[0,2]+10]
upper = np.array(upper, dtype = 'uint8')
mask = cv2.inRange(large,lower, upper)
mask2 = cv2.inRange(small, lower, upper)
I had to take a buffer value of 20 because the values were not clearly matching in the large image otherwise only 1 is enough in either upper or lower. Then we find contours in mask and find values of its bounding rectangle which is cut out and reshaped to the size of mask2.
im, contours, hierarchy = cv2.findContours(mask,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
#cv2.drawContours(large, contours, -1, (0,0,255), 1)
cnt = max(contours, key = cv2.contourArea)
x,y,w,h = cv2.boundingRect(cnt)
wanted_part = mask[y:y+h, x:x+w]
wanted_part = cv2.resize(wanted_part, (mask2.shape[1], mask2.shape[0]), interpolation = cv2.INTER_LINEAR)
The two masks side by side (inverted them otherwise they were not visible).
For comparing they you can use any parameter and check whether it satisfies your condition or not. I used mean square error and got error of only 6.20 which is very low.
def MSE(img1, img2):
squared_diff = img1 - img2
summed = np.sum(squared_diff)
num_pix = img1.shape[0] * img1.shape[1] #img1 and 2 should have same shape
err = summed / num_pix
return err
I followed this tutorial from official documentation. I run their code:
import numpy as np
import cv2
im = cv2.imread('test.jpg')
imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(imgray,127,255,0)
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(img, contours, -1, (0,255,0), 3)
That is ok: no errors, but nothing is displayed.I want to display the result they got as they showed it on the picture:
How can I display the result of the countours like that (just the left result or the right one) ?
I know I must use cv2.imshow(something) but how in this specific case ?
First off, that example only shows you how to draw contours with the simple approximation. Bear in mind that even if you draw the contours with the simple approximation, it will be visualized as having a blue contour drawn completely around the rectangle as seen in the left image. You will not be able to get the right image by simply drawing the contours onto the image. In addition, you want to compare two sets of contours - the simplified version on the right with its full representation on the left. Specifically, you need to replace the cv2.CHAIN_APPROX_SIMPLE flag with cv2.CHAIN_APPROX_NONE to get the full representation. Take a look at the OpenCV doc on findContours for more details: http://docs.opencv.org/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html#findcontours
In addition, even though you draw the contours onto the image, it doesn't display the results. You'll need to call cv2.imshow for that. However, drawing the contours themselves will not show you the difference between the full and simplified version. The tutorial mentions that you need to draw circles at each contour point so we shouldn't use cv2.drawContours for this task. What you should do is extract out the contour points and draw circles at each point.
As such, create two images like so:
# Your code
import numpy as np
import cv2
im = cv2.imread('test.jpg')
imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(imgray,127,255,0)
## Step #1 - Detect contours using both methods on the same image
contours1, _ = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
contours2, _ = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
### Step #2 - Reshape to 2D matrices
contours1 = contours1[0].reshape(-1,2)
contours2 = contours2[0].reshape(-1,2)
### Step #3 - Draw the points as individual circles in the image
img1 = im.copy()
img2 = im.copy()
for (x, y) in contours1:
cv2.circle(img1, (x, y), 1, (255, 0, 0), 3)
for (x, y) in contours2:
cv2.circle(img2, (x, y), 1, (255, 0, 0), 3)
Take note that the above code is for OpenCV 2. For OpenCV 3, there is an additional output to cv2.findContours that is the first output which you can ignore in this case:
_, contours1, _ = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
_, contours2, _ = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
Now let's walk through the code slowly. The first part of the code is what you provided. Now we move onto what is new.
Step #1 - Detect contours using both methods
Using the thresholded image, we detect contours using both the full and simple approximations. This gets stored in two lists, contours1 and contours2.
Step #2 - Reshape to 2D matrices
The contours themselves get stored as a list of NumPy arrays. For the simple image provided, there should only be one contour detected, so extract out the first element of the list, then use numpy.reshape to reshape the 3D matrices into their 2D forms where each row is a (x, y) point.
Step #3 - Draw the points as individual circles in the image
The next step would be to take each (x, y) point from each set of contours and draw them on the image. We make two copies of the original image in colour form, then we use cv2.circle and iterate through each pair of (x, y) points for both sets of contours and populate two different images - one for each set of contours.
Now, to get the figure you see above, there are two ways you can do this:
Create an image that stores both of these results together side by side, then show this combined image.
Use matplotlib, combined with subplot and imshow so that you can display two images in one window.
I'll show you how to do it using both methods:
Method #1
Simply stack the two images side by side, then show the image after:
out = np.hstack([img1, img2])
# Now show the image
cv2.imshow('Output', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
I stack them horizontally so that they are a combined image, then show this with cv2.imshow.
Method #2
You can use matplotlib:
import matplotlib.pyplot as plt
# Spawn a new figure
plt.figure()
# Show the first image on the left column
plt.subplot(1,2,1)
plt.imshow(img1[:,:,::-1])
# Turn off axis numbering
plt.axis('off')
# Show the second image on the right column
plt.subplot(1,2,2)
plt.imshow(img2[:,:,::-1])
# Turn off the axis numbering
plt.axis('off')
# Show the figure
plt.show()
This should display both images in separate subfigures within an overall figure window. If you take a look at how I'm calling imshow here, you'll see that I am swapping the RGB channels because OpenCV reads in images in BGR format. If you want to display images with matplotlib, you'll need to reverse the channels as the images are in RGB format (as they should be).
To address your question in your comments, you would take which contour structure you want (contours1 or contours2) and search the contour points. contours is a list of all possible contours, and within each contour is a 3D matrix that is shaped in a N x 1 x 2 format. N would be the total number of points that represent the contour. I'm going to remove the singleton second dimension so we can get this to be a N x 2 matrix. Also, let's use the full representation of the contours for now:
points = contours1[0].reshape(-1,2)
I am going to assume that your image only has one object, hence my indexing into contours1 with index 0. I unravel the matrix so that it becomes a single row vector, then reshape the matrix so that it becomes N x 2. Next, we can find the minimum point by:
min_x = np.argmin(points[:,0])
min_point = points[min_x,:]
np.argmin finds the location of the smallest value in an array that you supply. In this case, we want to operate along the x coordinate, or the columns. Once we find this location, we simply index into our 2D contour point array and extract out the contour point.
You should add cv2.imshow("Title", img) at the end of your code. It should look like this:
import numpy as np
import cv2
im = cv2.imread('test.jpg')
imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(imgray,127,255,0)
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(im, contours, -1, (0,255,0), 3)
cv2.imshow("title", im)
cv2.waitKey()
Add these 2 lines at the end:
cv2.imshow("title", im)
cv2.waitKey()
Also, be aware that you have img instead of im in your last line.