OpenCV concave and convex corner points of polygons - python

Problem
I am working on a project where I need to get the bounding boxes of dumbbell-like shapes. However, I need the fewest points possible, and the boxes need to fit the shapes at all corners. Here's an image I made to test: Blurry, cracked, dumbbell shape
I don't care about the gaps going into the shape; I just want to clean it up and straighten the edges so that I can get the contours of a shape like this: Cleaned up
I have been attempting to threshold() it, get its contours using findContours(), and then use approxPolyDP() to simplify the huge number of points the contours end up with. So, after fiddling with this for about three days now, how can I simply get either:
Two boxes specifying the ends of the dumbbell and a rectangle in the middle, or
One contour with the twelve points for all the corners
The second option would be preferred since that really is my ultimate goal: getting the points that are at those corners.
A few things to note:
I am using OpenCV for Python
There will generally be many of these shapes of all sizes all over the input image
They will have only horizontal or vertical positioning. No strange 27 degree angles...
What I need:
I really don't need someone to write the code for me, I just need some method or algorithm in order to get this done, preferably with some simple examples.
My Code
Here is my overly clean code with functions I don't even use but figure I would use them eventually:
import cv2
import numpy as np

class traceImage():
    def __init__(self, imageLocation):
        self.threshNum = 127
        self.im = cv2.imread(imageLocation)
        self.imOrig = self.im
        self.imGray = cv2.cvtColor(self.im, cv2.COLOR_BGR2GRAY)
        self.ret, self.imThresh = cv2.threshold(self.imGray, self.threshNum, 255, 0)
        self.contours, self.hierarchy = cv2.findContours(self.imThresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

    def createGray(self):
        self.imGray = cv2.cvtColor(self.im, cv2.COLOR_BGR2GRAY)

    def adjustThresh(self, threshNum):
        self.ret, self.imThresh = cv2.threshold(self.imGray, threshNum, 255, 0)

    def getContours(self):
        self.contours, self.hierarchy = cv2.findContours(self.imThresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

    def approximatePoly(self, percent):
        i=0
        for shape in self.contours:
            shape = cv2.approxPolyDP(shape, percent*cv2.arcLength(shape, True), True)
            self.contours[i] = shape
            i+=1

    def drawContours(self, blobWidth, color=(255,255,255)):
        cv2.drawContours(self.im, self.contours, -1, color, blobWidth)

    def newWindow(self, name):
        cv2.namedWindow(name)

    def showImage(self, window):
        cv2.imshow(window, self.im)

    def display(self):
        while True:
            cv2.waitKey()

    def displayUntil(self, key):
        while True:
            pressed = cv2.waitKey()
            if pressed == key:
                break

if __name__ == "__main__":
    blobWidth = 30
    ti = traceImage("dumbell.png")
    ti.approximatePoly(0.01)
    for thresh in range(127,256):
        ti.adjustThresh(thresh)
        ti.getContours()
        ti.drawContours(blobWidth)
    ti.showImage("Image")
    ti.displayUntil(10)
    ti.createGray()
    ti.adjustThresh(127)
    ti.getContours()
    ti.approximatePoly(0.0099)
    ti.drawContours(2, (0,255,0))
    ti.showImage("Image")
    ti.display()
Code Explanation
I know I might not be doing some things right here, but hey, I'm proud of it :)
So, the idea is that there are very often holes and gaps in these dumbbells, so I figured that if I iterated through all the threshold values from 127 to 255 and drew the contours onto the image with a large enough thickness, the thick contour lines would fill in any small enough holes. I could then use the new, blobby image to get the edges and scale the sides back down to size. That was my thinking. There's got to be another, better way though...
Summary
I want to end up with 12 points; one for each corner of the shape.
EDIT:
After trying out some erosion and dilation, it seems that the best option would be to slice the contours at certain points and then use bounding boxes around the sliced shapes to get the right boxy corners, and then doing some calculations to rejoin the boxes into one shape. A rather interesting challenge...
EDIT 2:
I discovered something that works well! I made my own line detection system, that only detects horizontal or vertical lines, and then on a detected line/contour edge, the program draws a black line that extends across the whole image, thus effectively slicing the image at the straight lines of the contours. Once it does that, it gets new contours of the sliced up boxes, draws bounding boxes around the pieces and then uses dilation to close the gaps. So far, it works well on large shapes, but when the shapes are small, it tends to lose a bit of the shape.

So, after fiddling with erosion, dilation, closing, opening, and looking at straight contours, I have figured out a solution that works. Thank you @Ante and @a.alsram! Your two ideas combined helped me get to my solution. So here's how it works.
Method
The program iterates over each contour and over every pair of points in the contour, looking for point pairs that lie on the same axis and calculating the distance between them. If the distance is greater than an adjustable threshold, the program treats those points as an edge of the shape. It then draws a black line along that edge across the whole contour, cutting the contour at that edge. Next, the program re-detects the contours; because the shape was cut, the pieces that were cut off are now their own contours, which are then enclosed in bounding boxes. Finally, all shapes are dilated and eroded (closed) to rejoin the boxes that were cut off.
This method can be done several times, but each time there is a little bit of accuracy loss. But it works for what I need and certainly was a fun challenge! Thanks for your help guys!
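As a rough illustration of the edge-finding step described above, here is a minimal sketch. The helper name axis_aligned_edges and the min_length parameter are made up for this example, and it only compares consecutive contour points rather than every pair, which is a simplification of the method as described.

```python
import numpy as np

def axis_aligned_edges(points, min_length):
    """Return ((x1, y1), (x2, y2)) pairs of consecutive contour points that
    share an x or y coordinate and are at least min_length apart."""
    # reshape(-1, 2) also accepts OpenCV's (N, 1, 2) contour layout
    pts = np.asarray(points).reshape(-1, 2)
    edges = []
    # pair every point with its successor, wrapping around to close the contour
    for a, b in zip(pts, np.roll(pts, -1, axis=0)):
        dx = abs(int(b[0]) - int(a[0]))
        dy = abs(int(b[1]) - int(a[1]))
        if (dx == 0 and dy >= min_length) or (dy == 0 and dx >= min_length):
            edges.append((tuple(int(v) for v in a), tuple(int(v) for v in b)))
    return edges
```

Each returned pair could then be drawn as a black line to cut the shape, as the method describes; how far to extend that line is left to the caller.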
natebot13

Maybe a simple solution can help. If there is a threshold length for closing gaps, it is possible to split the image into a grid with cell lengths >= threshold and use the cells that have something inside. That way there will be only horizontal and vertical lines, and by taking care that the grid follows the original horizontal and vertical lines, it will cover the main line features.
Update
Take a look at mathematical morphology. I think a closing operation with a structuring element of (2*k+1)x(2*k+1) pixels can do what you are looking for.
The algorithm should take a threshold parameter k and perform dilation and then erosion. That means: change the image so that for each white pixel, all neighbours within distance k (a (2*k+1)x(2*k+1) box) are set to white, and then change the image so that for each black pixel, all neighbours within distance k are set to black.
It is enough to do these operations on boundary pixels.

Related

Generating a segmentation mask for circular particles from threshold mask?

I am trying to find all the circular particles in the attached image. This is the only image I have (along with its inverse).
I have read this post and yet I can't use hsv values for thresholding. I have tried using Hough Transform.
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=0.01, minDist=0.1, param1=10, param2=5, minRadius=3,maxRadius=6)
and using the following code to plot
names =[circles]
for nums in names:
    color_img = cv2.imread(path)
    blue = (211,211,211)
    for x, y, r in nums[0]:
        cv2.circle(color_img, (x,y), r, blue, 1)
    plt.figure(figsize=(15,15))
    plt.title("Hough")
    plt.imshow(color_img, cmap='gray')
The following code was to plot the mask:
for masks in names:
    black = np.zeros(img_gray.shape)
    for x, y, r in masks[0]:
        cv2.circle(black, (x,y), int(r), 255, -1) # -1 to draw filled circles
    plt.imshow(black, gray)
Yet I am only able to get the following mask, which is fairly poor.
This is an image of what is considered a particle and what is not.
One simple approach involves slightly eroding the image, to separate touching circular objects, then doing a connected component analysis and discarding all objects larger than some chosen threshold, and finally dilating the image back so the circular objects are approximately of the original size again. We can do this dilation on the labelled image, such that you retain the separated objects.
I'm using DIPlib because I'm most familiar with it (I'm an author).
import diplib as dip
a = dip.ImageRead('6O0Oe.png')
a = a(0) > 127 # the PNG is a color image, but OP's image is binary,
# so we binarize here to simulate OP's condition.
separation = 7 # tweak these two parameters as necessary
size_threshold = 500
b = dip.Erosion(a, dip.SE(separation))
b = dip.Label(b, maxSize=size_threshold)
b = dip.Dilation(b, dip.SE(separation))
Do note that the image we use here seems to be a zoomed-in screen grab rather than the original image OP is dealing with. If so, the parameters must be made smaller to identify the smaller objects in the smaller image.
My approach is based on a simple observation that most of the particles in your image have approximately same perimeter and the "not particles" have greater perimeter than them.
First, have a look at the RANSAC algorithm and how it finds inliers and outliers. It is usually applied to 2D data, but in our case we have to adapt it to 1D data.
In your case, I call the correct particles inliers and the incorrect particles outliers.
The data we work on is the perimeter of these particles. To get it, find the contours in the image and compute the perimeter of each contour. Refer to this for information about contours.
Now we have the data, knowledge about RANSAC algo and our simple observation mentioned above. Now in this data, we have to find the most dense and compact cluster which will contain all the inliers and others will be outliers.
Now let's assume the inliers are in the range of 40-60 and the outliers are beyond 60. Let's define a threshold value T = 0. We say that for each point in the data, inliers for that point are in the range of (value of that point - T, value of that point + T).
Now first iterate over all the points in the data and count number of inliers to that point for a T and store this information. Find the maximum number of inliers possible for a value of T. Now increment the value of T by 1 and again find the maximum number of inliers possible for that T. Repeat these steps by incrementing value of T one by one.
There will be a range of values of T for which Maximum number of inliers are the same. These inliers are the particles in your image and the particles having perimeter greater than these inliers are the outliers thus the "not particles" in your image.
I have tried this algorithm on test cases similar to yours and it works; I am always able to determine the outliers. I hope it works for you too.
One last thing: I see that the boundaries of your particles are irregular rather than smooth. If this algorithm does not work on this image, try smoothing them first and then apply it again.
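One possible reading of the procedure above, written out as a small pure-Python sketch. The function names and the t_max parameter are invented for the example, and picking the longest run of T values with an unchanged maximum inlier count is my interpretation of "a range of values of T for which the maximum number of inliers is the same". In practice the perimeters would come from something like cv2.arcLength on each contour.

```python
def count_inliers(values, center, t):
    """Number of values within +/- t of center."""
    return sum(1 for v in values if abs(v - center) <= t)

def best_center(values, t):
    """Value that gathers the most inliers for this t, with its count."""
    center = max(values, key=lambda c: count_inliers(values, c, t))
    return count_inliers(values, center, t), center

def split_by_perimeter(values, t_max=100):
    """Grow T from 0 to t_max, find the longest run of T values over which the
    best inlier count stays the same, and split the data at the start of it."""
    runs = []  # each entry: [count, first_t, run_length, center]
    previous = None
    for t in range(t_max + 1):
        count, center = best_center(values, t)
        if count == previous:
            runs[-1][2] += 1
        else:
            runs.append([count, t, 1, center])
        previous = count
    _, t, _, center = max(runs, key=lambda r: r[2])
    inliers = [v for v in values if abs(v - center) <= t]
    outliers = [v for v in values if abs(v - center) > t]
    return inliers, outliers
```

For perimeters such as 40, 45, 50, 55, 60, 120 and 200, the count stays at five for a long stretch of T, so the first five values come out as particles and the last two as "not particles".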

keeping all details of object removing the shape in the smoothest and most efficient way

I have some images for which I also have their mask (green in the picture). I am producing a bounding box (dot line in the picture) around the object, and take only this part of the image.
Now I would like to replace the gray part with pixel that extend the car color in the most natural way. For example, taking the same color as the closest car pixel. At the end I would like to have an image with all the car details but without any shape anymore.
I tried simply inverting the mask, so that it represents the gray pixels around the car, and then using OpenCV's inpaint function to paint this new mask with a suitable color:
result = cv2.inpaint(car_image,new_mask,50,cv2.INPAINT_NS)
It's not working well, as we can still clearly see the borders all around the car.
Any hints would be greatly appreciated. I am working on python and it would need to be quite efficient as I have a huge number of images.
Here is a good working solution. It's a function that, given an image with zero values outside the mask, outputs a similar image where the zero values are replaced by the most suitable color, so that we keep all the details of the object but remove its outline:
# replace black pixels by inpainting with a suitable color, keeping all info but removing the shape
def remove_shape_keep_all_info(img):
    # create mask (non-zero pixels indicate the area that needs to be inpainted)
    mask_ = np.array([[1 if j==0 else 0 for j in i] for i in cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)]).astype('uint8')
    # grow the mask a little so the border pixels of the object are repainted too
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7,7))
    mask_ = cv2.dilate(mask_, kernel, iterations=4)
    result = cv2.inpaint(img,mask_,5,cv2.INPAINT_TELEA) #,cv2.INPAINT_TELEA, INPAINT_NS
    return (result)
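Since the question asks for efficiency on a large number of images, the nested list comprehension that builds the mask is probably the slowest part. A vectorised comparison should give the same mask; the helper name zero_mask is just for illustration, and gray is assumed to be the single-channel result of cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).

```python
import numpy as np

def zero_mask(gray):
    """1 where the grayscale pixel is exactly 0, else 0, as uint8."""
    return (gray == 0).astype(np.uint8)
```

The result can be passed to cv2.dilate and cv2.inpaint exactly like mask_ in the function above.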

How to draw a line from two points and then let the line complete drawing until reaching a contour point with opencv, python?

I am using opencv and python for programming and I am trying to draw a line between two points whose coordinates I know, and then let the line continue until it reaches the end of the contour, as shown in the image below. The contour in my case is actually of an image face, but I have provided a circle here for explanation. So what I am trying to achieve is to get the edge of the head at the point where the line intersects the contour. Is there a way to draw a line from two points and then let the line continue until it reaches the contour?
I can think of one easy method off the top of my head that doesn't involve incrementally updating the image: on one blank image, draw a long line extending from point one in the direction of point two, and then AND the resulting image with an image of the single contour drawn (filled). This will stop the line at the end of the contour. Then you can either use that mask to draw the line, or get the minimum/maximum x, y coords if you want the coordinates of the line.
To walk through an example, first we'll find the contour and draw it on a blank image:
contours = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # [-2] is the contour list in OpenCV 2.x, 3.x and 4.x
contour_img = np.zeros_like(img)
cv2.fillPoly(contour_img, contours, 255)
Then, if we have the points p1 and p2, find the direction they're heading in, find a point far off in that direction, and draw that line on a new blank image (here I used a distance of 1000 pixels from p1):
p1 = (250, 250)
p2 = (235, 210)
theta = np.arctan2(p1[1]-p2[1], p1[0]-p2[0])
endpt_x = int(p1[0] - 1000*np.cos(theta))
endpt_y = int(p1[1] - 1000*np.sin(theta))
line_img = np.zeros_like(img)
cv2.line(line_img, (p1[0], p1[1]), (endpt_x, endpt_y), 255, 2)
Then simply cv2.bitwise_and() the two images together
contour_line_img = cv2.bitwise_and(line_img, contour_img)
Here is an image showing the points, the line extending past the contour, and the line breaking off at the contour.
Edit: Note that this will only work if your contours are convex. If there is any concavity and the line goes through that concave part, it will continue to draw on the other side of it. For e.g. in Silencer's answer, if both points were inside one of the ear and pointed towards the other ear, you'd want the contour to stop once it hit an edge, but mine will continue to draw on the other ear. I think an iterative method like Silencer's is the best for the general case, but I like the simplicity of this method if you know you have convex contours or if your points will be in a place to not have this issue.
Edit2: Someone else on Stack answered their own question about the Line Iterator class in Python by creating one: openCV 3.0 python LineIterator
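For the concave case mentioned in the first edit, an iterative walk along the ray is one option. The following is a minimal NumPy sketch, not the LineIterator approach linked above: the function name extend_until_edge and the step parameter are invented here, mask is assumed to be a filled contour image such as contour_img, and p1 and p2 are assumed to be distinct (x, y) points with p1 inside the mask.

```python
import numpy as np

def extend_until_edge(mask, p1, p2, step=0.5):
    """Walk from p1 in the direction of p2 and return the last (x, y) pixel
    that is still inside the non-zero region of mask, or None if p1 is outside."""
    pos = np.asarray(p1, dtype=float)
    direction = np.asarray(p2, dtype=float) - pos
    direction /= np.linalg.norm(direction)
    h, w = mask.shape[:2]
    last = None
    while True:
        x, y = int(round(pos[0])), int(round(pos[1]))
        # stop at the image border or at the first background pixel
        if not (0 <= x < w and 0 <= y < h) or mask[y, x] == 0:
            return last
        last = (x, y)
        pos = pos + step * direction
```

Because the walk stops at the first background pixel, it does not jump across a concavity the way the bitwise AND does, at the cost of a Python-level loop per line.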

Python OpenCV HoughLinesP Fails to Detect Lines

I am using OpenCV HoughLinesP to find horizontal and vertical lines. Most of the time it does not find any lines. Even when it finds a line, it is not even close to the actual image.
import cv2
import numpy as np

img = cv2.imread('image_with_edges.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
flag,b = cv2.threshold(gray,0,255,cv2.THRESH_OTSU)
element = cv2.getStructuringElement(cv2.MORPH_CROSS,(1,1))
cv2.erode(b,element)
edges = cv2.Canny(b,10,100,apertureSize = 3)
lines = cv2.HoughLinesP(edges,1,np.pi/2,275, minLineLength = 100, maxLineGap = 200)[0].tolist()
for x1,y1,x2,y2 in lines:
    for index, (x3,y3,x4,y4) in enumerate(lines):
        if y1==y2 and y3==y4: # Horizontal Lines
            diff = abs(y1-y3)
        elif x1==x2 and x3==x4: # Vertical Lines
            diff = abs(x1-x3)
        else:
            diff = 0
        if diff < 10 and diff is not 0:
            del lines[index]
    gridsize = (len(lines) - 2) / 2
    cv2.line(img,(x1,y1),(x2,y2),(0,0,255),2)
    cv2.imwrite('houghlines3.jpg',img)
Input Image:
Output Image: (see the Red Line):
@ljetibo Try this with: c_6.jpg
There's quite a bit wrong here so I'll just start from the beginning.
Ok, the first thing you do after opening an image is thresholding. I strongly recommend that you have another look at the OpenCV manual on thresholding and the exact meaning of the threshold methods.
The manual mentions that
cv2.threshold(src, thresh, maxval, type[, dst]) → retval, dst
the special value THRESH_OTSU may be combined with one of the above
values. In this case, the function determines the optimal threshold
value using the Otsu’s algorithm and uses it instead of the specified
thresh .
I know it's a bit confusing because in your call you don't actually combine THRESH_OTSU with any of the other methods (THRESH_BINARY etc.); unfortunately that manual can be like that. Because THRESH_BINARY has the numeric value 0, passing THRESH_OTSU on its own behaves the same as THRESH_BINARY + THRESH_OTSU. What this method does is assume that there is a "foreground" and a "background" that follow a bimodal histogram, pick the cutoff between them, and then apply THRESH_BINARY with that cutoff.
Imagine this as if you're taking an image of a cathedral or a high building mid day. On a sunny day the sky will be very bright and blue, and the cathedral/building will be quite a bit darker. This means the group of pixels belonging to the sky will all have high brightness values, that is will be on the right side of the histogram, and the pixels belonging to the church will be darker, that is to the middle and left side of the histogram.
Otsu uses this to try and guess the right "cutoff" point, called thresh. For your image Otsu's alg. supposes that all that white on the side of the map is the background, and the map itself the foreground. Therefore your image after thresholding looks like this:
After this point it's not hard to guess what goes wrong. But let's go on, What you're trying to achieve is, I believe, something like this:
flag,b = cv2.threshold(gray,160,255,cv2.THRESH_BINARY)
Then you go on, and try to erode the image. I'm not sure why you're doing this, was your intention to "bold" the lines, or was your intention to remove noise. In any case you never assigned the result of erosion to something. Numpy arrays, which is the way images are represented, are mutable but it's not the way the syntax works:
cv2.erode(src, kernel, [optionalOptions] ) → dst
So you have to write:
b = cv2.erode(b,element)
Ok, now for the element and how the erosion works. Erosion drags a kernel over an image. Kernel is a simple matrix with 1's and 0's in it. One of the elements of that matrix, usually centre one, is called an anchor. An anchor is the element that will be replaced at the end of the operation. When you created
cv2.getStructuringElement(cv2.MORPH_CROSS, (1, 1))
what you created is actually a 1x1 matrix (1 column, 1 row). This makes erosion completely useless.
What erosion does, is firstly retrieves all the values of pixel brightness from the original image where the kernel element, overlapping the image segment, has a "1". Then it finds a minimal value of retrieved pixels and replaces the anchor with that value.
What this means, in your case, is that you drag [1] matrix over the image, compare if the source image pixel brightness is larger, equal or smaller than itself and then you replace it with itself.
If your intention was to remove "noise", then it's probably better to use a rectangular kernel over the image. Think of it this way: "noise" is the thing that "doesn't fit in" with its surroundings. So if you compare your centre pixel with its surroundings and you find it doesn't fit, it's most likely noise.
Additionally, I've said it replaces the anchor with the minimal value retrieved by the kernel. Numerically, minimal value is 0, which is coincidentally how black is represented in the image. This means that in your case of a predominantly white image, erosion would "bloat up" the black pixels. Erosion would replace the 255 valued white pixels with 0 valued black pixels if they're in the reach of the kernel. In any case it shouldn't be of a shape (1,1), ever.
>>> cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
array([[0, 1, 0],
       [1, 1, 1],
       [0, 1, 0]], dtype=uint8)
If we erode the second image with a 3x3 rectangular kernel we get the image below.
Ok, now we got that out of the way, next thing you do is you find edges using Canny edge detection. The image you get from that is:
Ok, now we look for EXACTLY vertical and EXACTLY horizontal lines ONLY. Of course there are no such lines apart from the meridian on the left of the image (is that what it's called?) and the end image you get after you did it right would be this:
Now since you never described your exact idea, and my best guess is that you want the parallels and meridians, you'll have more luck on maps with lesser scale because those aren't lines to begin with, they are curves. Additionally, is there a specific reason to get a Probability Hough done? The "regular" Hough doesn't suffice?
Sorry for the too-long post, hope it helps a bit.
Text here was added as a request for clarification from the OP Nov. 24th. because there's no way to fit the answer into a char limited comment.
I'd suggest OP asks a new question more specific to the detection of curves because you are dealing with curves op, not horizontal and vertical lines.
There are several ways to detect curves but none of them are easy. In order from simplest to hardest to implement:
1. Use the RANSAC algorithm. Develop a formula describing the nature of the long. and lat. lines depending on the map in question. I.e. latitude curves will be almost perfectly straight lines on the map when you're near the equator, with the equator being the perfectly straight line, but will be very curved, resembling circle segments, when you're at high latitudes (near the poles). SciPy already has RANSAC implemented as a class; all you have to do is find and then programmatically define the model you want to try to fit to the curves. Of course there's the ever-useful 4dummies text here. This is the easiest because all you have to do is the math.
2. A bit harder to do would be to create a rectangular grid and then try to use cv findHomography to warp the grid into place on the image. For the various geometric transformations you can apply to the grid, check out the OpenCV manual. This is sort of a hack-ish approach and might work worse than 1. because it depends on being able to re-create a grid with enough details and objects on it that cv can identify the structures on the image you're trying to warp it to. This one requires you to do similar math to 1. and just a bit of coding to compose the end solution out of several different functions.
3. To actually do it. There are mathematically neat ways of describing curves as a list of tangent lines on the curve. You can try to fit a bunch of shorter HoughLines to your image or image segment and then try to group all the found lines and determine, by assuming that they're tangents to a curve, whether they really follow a curve of the desired shape or are random. See this paper on this matter. Out of all the approaches this one is the hardest because it requires quite a bit of solo coding and some math about the method.
There could be easier ways, I've never actually had to deal with curve detection before. Maybe there are tricks to do it easier, I don't know. If you ask a new question, one that hasn't been closed as an answer already you might have more people notice it. Do make sure to ask a full and complete question on the exact topic you're interested in. People won't usually spend so much time writing on such a broad topic.
To show you what you can do with just the Hough transform, check out the code below:
import cv2
import numpy as np

def draw_lines(hough, image, nlines):
    n_x, n_y=image.shape
    #convert to color image so that you can see the lines
    draw_im = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
    for (rho, theta) in hough[0][:nlines]:
        try:
            x0 = np.cos(theta)*rho
            y0 = np.sin(theta)*rho
            pt1 = ( int(x0 + (n_x+n_y)*(-np.sin(theta))),
                    int(y0 + (n_x+n_y)*np.cos(theta)) )
            pt2 = ( int(x0 - (n_x+n_y)*(-np.sin(theta))),
                    int(y0 - (n_x+n_y)*np.cos(theta)) )
            alph = np.arctan( (pt2[1]-pt1[1])/( pt2[0]-pt1[0]) )
            alphdeg = alph*180/np.pi
            #OpenCv uses weird angle system, see: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html
            if abs( np.cos( alph - 180 )) > 0.8: #0.995:
                cv2.line(draw_im, pt1, pt2, (255,0,0), 2)
            if rho>0 and abs( np.cos( alphdeg - 90)) > 0.7:
                cv2.line(draw_im, pt1, pt2, (0,0,255), 2)
        except:
            pass
    cv2.imwrite("/home/dino/Desktop/3HoughLines.png", draw_im,
                [cv2.IMWRITE_PNG_COMPRESSION, 12])

img = cv2.imread('a.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
flag,b = cv2.threshold(gray,160,255,cv2.THRESH_BINARY)
cv2.imwrite("1tresh.jpg", b)
element = np.ones((3,3))
b = cv2.erode(b,element)
cv2.imwrite("2erodedtresh.jpg", b)
edges = cv2.Canny(b,10,100,apertureSize = 3)
cv2.imwrite("3Canny.jpg", edges)
hough = cv2.HoughLines(edges, 1, np.pi/180, 200)
draw_lines(hough, b, 100)
As you can see from the image below, the only straight lines are the longitudes. The latitudes are not as straight, so for each latitude you get several detected lines that behave like tangents to it. The blue lines are drawn by the if abs( np.cos( alph - 180 )) > 0.8: condition, while the red lines are drawn by the rho>0 and abs( np.cos( alphdeg - 90)) > 0.7 condition. Pay close attention when comparing the original image with the image with lines drawn on it. The resemblance is uncanny (heh, get it?), but because they're not lines a lot of it only looks like junk (especially the highest detected latitude line, which seems too "angled" but in reality makes a perfect tangent to the latitude line at its thickest point, just as the Hough algorithm demands). Acknowledge that there are limitations to detecting curves with a line detection algorithm.
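Regarding the angle convention mentioned in the code comment: in the (rho, theta) pairs returned by cv2.HoughLines, theta is the angle of the line's normal, so theta near 0 means a vertical line and theta near pi/2 means a horizontal one. A small helper can make that explicit; the name classify_hough_line and the tol_deg tolerance are assumptions made for this sketch, not part of the code above.

```python
import numpy as np

def classify_hough_line(theta, tol_deg=5.0):
    """Classify a HoughLines theta (radians, angle of the line's normal)
    as 'vertical', 'horizontal' or None within a tolerance in degrees."""
    deg = np.degrees(theta) % 180.0
    if min(deg, 180.0 - deg) <= tol_deg:
        return "vertical"
    if abs(deg - 90.0) <= tol_deg:
        return "horizontal"
    return None
```

Filtering on theta directly avoids reconstructing two endpoints and taking an arctangent just to recover the orientation.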

Detecting an approaching object

I read this blog post where he uses a laser and a webcam to estimate the distance of the cardboard from the webcam.
I had another idea about that. I don't want to calculate the distance from the webcam.
I want to check if an object is approaching the webcam. The algorithm, according to me, will be something like:
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Since I want to detect random objects, I am using the findContours() method to find the contours in the video feed. Using that, I will at least have the outlines of the objects in the video feed. The source code is:
import numpy as np
import cv2

vid=cv2.VideoCapture(0)
ans, instant=vid.read()
average=np.float32(instant)
cv2.accumulateWeighted(instant, average, 0.01)
background=cv2.convertScaleAbs(average)
while(1):
    _,f=vid.read()
    imgray=cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
    ret, thresh=cv2.threshold(imgray,127,255,0)
    diff=cv2.absdiff(f, background)
    cv2.imshow("input", f)
    cv2.imshow("Difference", diff)
    if cv2.waitKey(5)==27:
        break
cv2.destroyAllWindows()
The output is:
I am stuck here. I have the contours stored in an array. What do I do with it when the size increases? How do I proceed?
One trouble here is recognising and differentiating the moving objects from other stuff in the video feed. An approach might be to let the camera 'learn' what the background looks like with no object. Then you can constantly compare its input against this background. One way to get the background is to use a running average.
Any difference greater than a small threshold means there is a moving object. If you constantly display this difference, you basically have a motion tracker. The size of the objects is simply the sum of all the non-zero (thresholded) pixels, or their bounding rectangles. You can track this size and use it to guess whether the object is moving closer or further. Morphological operations can help group the contours into one cohesive object.
Since it will be tracking ANY movement, if there are two objects, they will be counted together. Here is where you can use the contours to find and track individual objects, e.g. using the contour bounds or centroids. You could also possibly separate them by colour.
Here are some results using this strategy (the grey blob is my hand):
It actually did a fairly good job of guessing which way my hand was moving.
Code:
import cv2
import numpy as np

AVERAGE_ALPHA = 0.2 # 0-1 where 0 never adapts, and 1 instantly adapts
MOVEMENT_THRESHOLD = 30 # Lower values pick up more movement
REDUCED_SIZE = (400, 600)
MORPH_KERNEL = np.ones((10, 10), np.uint8)

def reduce_image(input_image):
    """Make the image easier to deal with."""
    reduced = cv2.resize(input_image, REDUCED_SIZE)
    reduced = cv2.cvtColor(reduced, cv2.COLOR_BGR2GRAY)
    return reduced

# Initialise
vid = cv2.VideoCapture(0)
average = None
old_sizes = np.zeros(20)
size_update_index = 0
while (True):
    got_frame, frame = vid.read()
    if got_frame:
        # Reduce image
        reduced = reduce_image(frame)
        if average is None: average = np.float32(reduced)
        # Get background
        cv2.accumulateWeighted(reduced, average, AVERAGE_ALPHA)
        background = cv2.convertScaleAbs(average)
        # Get thresholded difference image
        movement = cv2.absdiff(reduced, background)
        _, threshold = cv2.threshold(movement, MOVEMENT_THRESHOLD, 255, cv2.THRESH_BINARY)
        # Apply morphology to help find object
        dilated = cv2.dilate(threshold, MORPH_KERNEL, iterations=10)
        closed = cv2.morphologyEx(dilated, cv2.MORPH_CLOSE, MORPH_KERNEL)
        # Get contours
        contours, _ = cv2.findContours(closed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        cv2.drawContours(closed, contours, -1, (150, 150, 150), -1)
        # Find biggest bounding rectangle
        areas = [cv2.contourArea(c) for c in contours]
        if (areas != list()):
            max_index = np.argmax(areas)
            max_cont = contours[max_index]
            x, y, w, h = cv2.boundingRect(max_cont)
            cv2.rectangle(closed, (x, y), (x+w, y+h), (255, 255, 255), 5)
            # Guess movement direction
            size = w*h
            if size > old_sizes.mean():
                print("Towards")
            else:
                print("Away")
            # Update object size
            old_sizes[size_update_index] = size
            size_update_index += 1
            if (size_update_index) >= len(old_sizes): size_update_index = 0
        # Display image
        cv2.imshow('RaptorVision', closed)
Obviously this needs more work in terms of identifying, selecting and tracking the objects etc (at the moment it does horribly if there is something else moving in the background). There are also many parameters to vary and tweak (the ones set are what worked well for my system). I'll leave that up to you though.
Some links:
background extraction
motion tracking
If you want to get a bit more high-tech with the background removal, have a look here:
wallflower
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Good idea.
If you want to use the contour detection approach, you could do it the following way:
You have a series of images I1, I2, ... In
Do a contour detection on each one: C1, C2, ..., Cn (a contour is a set of points in OpenCV)
Take a large enough sample of contour points on images i and i+1: S_i \leq C_i, i \in 1...n
For every point in your sample, find the nearest point on image i+1. Then you have trajectories for all your points.
Check whether these trajectories point mostly outwards (tricky part ;)
If they point outwards for a sufficient number of frames, your contour got bigger.
Alternatively, you could try to prune the points that are not part of the correct contour and work with a covering rectangle. It's very easy to check the size that way, but I don't know how easy it will be to choose the "correct" points.
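The "tricky part" of the steps above, deciding whether the trajectories point outwards, could be sketched with NumPy as follows. The function name outward_fraction is invented for this example, and using the centroid of the earlier contour as the reference for "outwards" is an assumption that only makes sense for a single, roughly blob-like object.

```python
import numpy as np

def outward_fraction(prev_pts, next_pts):
    """Fraction of sampled points whose nearest neighbour in the next frame
    lies further from the previous contour's centroid (i.e. moved outwards)."""
    prev_pts = np.asarray(prev_pts, dtype=float).reshape(-1, 2)
    next_pts = np.asarray(next_pts, dtype=float).reshape(-1, 2)
    centroid = prev_pts.mean(axis=0)
    # pairwise distances between every previous and every next point
    dists = np.linalg.norm(prev_pts[:, None, :] - next_pts[None, :, :], axis=2)
    nearest = next_pts[dists.argmin(axis=1)]
    motion = nearest - prev_pts      # trajectory of each sampled point
    radial = prev_pts - centroid     # direction pointing away from the centre
    outward = np.einsum('ij,ij->i', motion, radial) > 0
    return float(outward.mean())
```

A value close to 1 over several consecutive frames would suggest the contour is growing, which is the condition the last step checks for.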
