I'm working on a project that requires detecting people, and due to the complexity of the system I decided to use movement detection.
I faced some problems, and after asking on Stack Overflow, this answer seemed the best.
So I implemented the algorithm in the following steps:
Implement saliency on the input video
Apply K-means clustering
Apply background subtraction
Apply morphological transformations
Here is the code
import cv2
import time
import numpy as np
cap=cv2.VideoCapture(0)
#i wanted to try different background subtractors to get the best result.
fgbg=cv2.createBackgroundSubtractorMOG2()
fgbg1 = cv2.bgsegm.createBackgroundSubtractorMOG()
h = cap.get(4)
w = cap.get(3)
frameArea = h*w
areaTH = frameArea/150
while(cap.isOpened()):
    #time.sleep(0.05)
    _,frame=cap.read()
    cv2.imshow("frame",frame)
    image=frame
    ################Implementing Saliency########################
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    (success, saliencyMap) = saliency.computeSaliency(image)
    saliencyMap = (saliencyMap * 255).astype("uint8")
    #cv2.imshow("Image", image)
    #cv2.imshow("Output", saliencyMap)
    saliency = cv2.saliency.StaticSaliencyFineGrained_create()
    (success, saliencyMap) = saliency.computeSaliency(image)
    saliencyMap = (saliencyMap * 255).astype("uint8")
    threshMap = cv2.threshold(saliencyMap.astype("uint8"), 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    # show the images
    #cv2.imshow("Image", image)
    #cv2.imshow("saliency", saliencyMap)
    #cv2.imshow("Thresh", threshMap)
    kouts=saliencyMap
    #cv2.imshow("kouts", kouts)
    ##############implementing k-means clustering#######################
    clusters=12
    z=kouts.reshape((-1,3))
    #convert to np.float32
    z=np.float32(z)
    #define criteria and accuracy
    criteria= (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER,5,1.0)
    #apply k-means
    ret,label,center=cv2.kmeans(z,clusters,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)
    #convert the float32 centers back to uint8 and rebuild the image
    center=np.uint8(center)
    res=center[label.flatten()]
    kouts=res.reshape((kouts.shape))
    cv2.imshow('clustered image',kouts)
    ############applying background subtraction#######################
    fgmask=fgbg.apply(kouts)
    fgmask1=fgbg1.apply(kouts)
    cv2.imshow('fg',fgmask)
    cv2.imshow('fgmask1',fgmask1)
    #as I said earlier, I wanted to find the best background subtractor
    #########################morphological transformation#####################
    #Below I tried various techniques to get the best possible result
    kernel=np.ones((5,5),np.uint8)
    erosion=cv2.erode(fgmask1,kernel,iterations=1)
    cv2.imshow('erosion',erosion)
    dilation=cv2.dilate(fgmask1,kernel,iterations=1)
    cv2.imshow('dilation',dilation)
    gradient = cv2.morphologyEx(fgmask1, cv2.MORPH_GRADIENT, kernel)
    cv2.imshow("gradient",gradient)
    opening=cv2.morphologyEx(fgmask1,cv2.MORPH_OPEN,kernel)
    closing=cv2.morphologyEx(fgmask1,cv2.MORPH_CLOSE,kernel)
    cv2.imshow('opening',opening)
    cv2.imshow('closing',closing)
    #########for detection of contours##################
    contours0, hierarchy = cv2.findContours(erosion,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
    for cnt in contours0:
        area = cv2.contourArea(cnt)
        if area > areaTH and area<frameArea*0.50:
            M = cv2.moments(cnt)
            x,y,f,g = cv2.boundingRect(cnt)
            img = cv2.rectangle(frame,(x,y),(x+f,y+g),(0,255,0),2)
    cv2.imshow('Original',frame)
    k = cv2.waitKey(1) & 0xff
    if k == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
I tried this algorithm on this video, but there was still a lot of noise in the output. I previously thought the problem might be the quality of the video, but when I switched to cv2.VideoCapture(0) the problem persisted: the code doesn't seem to remove the noise, and the situation I'm working in sometimes has a lot of noise.
Please tell me where I went wrong, or suggest a different approach to the problem.
Thanks in advance.
I spent some time trying to see if something can be done with noise reduction, but I believe you have already tried many of the known techniques in OpenCV. My opinion is to approach your problem using neural networks, as they will be more accurate at detecting the objects.
I created a Colab notebook, to illustrate this:
https://colab.research.google.com/drive/1rBrcu46sfo0F7fsQf4BC9hKoXTk_wNBk?usp=sharing
Even with this simple approach it's possible to detect objects: persons and clothing. You can set a criterion that only considers the top 10 items, since a bus entrance limits how many people can enter at the same time.
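To sketch that filtering step (this is only an illustration; the detection format below is hypothetical, not the notebook's actual output), you would just sort by confidence and keep the first N person detections:
def top_people(detections, n=10):
    # 'detections' is assumed to be a list of (label, score, box) tuples
    # produced by whatever detector you end up using.
    people = [d for d in detections if d[0] == "person"]
    people.sort(key=lambda d: d[1], reverse=True)  # highest confidence first
    return people[:n]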
This is not a final solution because I am using a general purpose detector. This can be improved in your application by training the network with your video inputs. Labeling will be required but I believe this will give you the most accurate results.
I also thought about the challenge of tracking the people that are already inside the bus and the ones entering. For that you can track the rectangles. There is an excellent example using dlib: https://www.pyimagesearch.com/2018/10/22/object-tracking-with-dlib/
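If you want something runnable right away without any training, here is a rough sketch using OpenCV's built-in pretrained HOG people detector. This is not the network used in the notebook, just a quick baseline, and the winStride/padding/scale values are only starting points:
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # detectMultiScale returns one rectangle per detected person
    rects, weights = hog.detectMultiScale(frame, winStride=(8, 8), padding=(8, 8), scale=1.05)
    for (x, y, w, h) in rects:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("people", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()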
Related
I'm stuck with a problem where I want to identify different coffee beans in a mix.
I created a neural network which is able to identify different beans individually. But in practice, I want to develop an algorithm that can detect these beans in a bigger batch. It is not necessary to identify all the beans in the picture; being able to identify 10-15 beans in a bigger batch would be enough.
The problem is now that I'm able to segment the beans when there is just one layer of beans with a contrasting background, but when there are multiple layers of beans below this first layer, it gets really hard.
I tried to use distance transform and the watershed algorithm from openCV and as mentioned, this worked for just single beans and for some small overlap between beans (just as in this example). The picture below shows the results: results of single layer segmentation
My code was based on the example mentioned before:
import cv2
import numpy as np
from matplotlib import pyplot as plt
from scipy.ndimage import label
from scipy.ndimage import morphology
# load the image as normal and grayscale
img_path = "FINAL/segmentation/IMG_6699.JPG"
img= cv2.imread(img_path,0)
img0 = cv2.imread(img_path)
#preprocess the image
img= cv2.medianBlur(img,5)
ret,th1 = cv2.threshold(img,80,255,cv2.THRESH_BINARY_INV)
kernel = np.ones((5,5),np.uint8)
opening = cv2.morphologyEx(th1, cv2.MORPH_OPEN, kernel)
dilation = cv2.dilate(opening, None, iterations=2)
erosion = cv2.erode(dilation,kernel,iterations = 50)
border_nonseg = dilation - cv2.erode(dilation, None, iterations = 1)
#distance transform
#dt = morphology.distance_transform_bf(dilation, metric='chessboard')
dt = cv2.distanceTransform(dilation, 2, 5)
dt = ((dt - dt.min()) / (dt.max() - dt.min()) * 255).astype(np.uint8)
hier, dt1 = cv2.threshold(dt, 170, 255, cv2.THRESH_BINARY)
# label the centers found by the distance transform
lbl, ncc = label(dt1)
lbl = lbl * (255/ncc)
# Completing the markers now.
lbl[border_nonseg == 255] = 255
lbl = lbl.astype(np.int32)
# Watershed algorithm
cv2.watershed(img0, lbl)
lbl[lbl == -1] = 0
lbl = lbl.astype(np.uint8)
result = 255 - lbl
lbl_cont = result
# Draw the borders
result[result != 255] = 0
result = cv2.dilate(result, None, iterations=1)
img0[result == 255] = (255, 0, 0)
cv2.imwrite("output.png", img0)
contours, _ = cv2.findContours(result, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
titles = ['Original Image', 'dilation',
'gradient morph', 'erode']
images = [ border_nonseg, dt, lbl_cont, img0]
plt.figure(figsize=(20,20))
for i in range(4):
    plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()
But the problem starts when there is a picture like this (which is a real situation): multi layer segmentation and harder multi layer segmentation
I don't think I can reuse the code mentioned before, and I need a different approach. The contrast between the first and second layer is just too small; because the beans are so small they don't cast much of a shadow, which would otherwise give a nice contrast, and the color of the beans is also quite dark, which doesn't make it easier.
So do you have any suggestions for different approaches to tackle this problem, or maybe adjustments to the current code, to solve my problem?
I'm very curious to hear different opinions on this!
If I've understood correctly, you just need some image segmentation method to separate the beans from a big picture into many small pictures with just one bean in each, so you can feed them to your NN to train/test it.
With the kind of pictures that you showed, the segmentation is almost an identification by itself. I mean, you would almost need a trained NN to identify the beans in the picture to then separate them and feed them into your non-trained NN.
For these kinds of problems, I believe there are some (unsupervised) NN architectures that are trained to extract the relevant features for you. I think autoencoders were one of the options, but I'm not sure right now.
The other approach is to use some kind of more general pattern recognition:
a) Shape-Based: They try to match a contour model over the gradient image
b) Correlation-Based: They try to match a sample image over your original image using grayscale correlations
These methods use pyramidal search to increase speed, but you might also want to try different pyramid levels of the model for each pyramid level of the image being analyzed, to cope with different zoom levels of the beans; this is equivalent to a matching method with scaling. You will also need several models of your beans (different perspectives of a single bean) to increase the number of results per image.
You could also try c) a region expansion method that expands a seed region to the neighboring pixels under some conditions based on color smoothness, or d) a border finder combined with some contour-closing algorithm; but I fear they could cause you many problems given the variability of your images.
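As a rough sketch of option b), a multi-scale correlation search with cv2.matchTemplate could look like the following. The file names and the 0.7 score threshold are placeholders, and a real version would add non-maximum suppression and several bean templates:
import cv2
import numpy as np

img = cv2.imread("batch.jpg", cv2.IMREAD_GRAYSCALE)               # picture of the whole batch
template = cv2.imread("bean_template.png", cv2.IMREAD_GRAYSCALE)  # one cropped bean

found = []
# search several scales to cope with different zoom levels of the beans
for scale in np.linspace(0.5, 1.5, 11):
    t = cv2.resize(template, None, fx=scale, fy=scale)
    if t.shape[0] > img.shape[0] or t.shape[1] > img.shape[1]:
        continue
    res = cv2.matchTemplate(img, t, cv2.TM_CCOEFF_NORMED)
    ys, xs = np.where(res > 0.7)          # keep locations above a hand-picked score
    for x, y in zip(xs, ys):
        found.append((int(x), int(y), t.shape[1], t.shape[0]))

vis = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
for x, y, w, h in found:
    cv2.rectangle(vis, (x, y), (x + w, y + h), (0, 255, 0), 1)
cv2.imwrite("matches.png", vis)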
I'm making a document scanner for a college project. My code works quite well for uniformly lit images; however, I ran into issues detecting images with even a small amount of light reflection (or too much light) on the background surface.
I first tried different simple codes I found online, then different morphological operations, with the result that my code is now a little messy and inaccurate.
Here's the code:
import cv2
import numpy as np
from matplotlib import pyplot as plt

def scanner(img):
    clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8,8))
    image = cv2.imread(img)
    original = image.copy()
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    contrast = clahe.apply(gray)
    blurred = cv2.medianBlur(contrast, 21)
    canny = cv2.Canny(blurred, 0, 70)
    dilated = cv2.dilate(canny, cv2.getStructuringElement(cv2.MORPH_RECT,(5,5)), iterations = 3)
    closing = cv2.morphologyEx(dilated, cv2.MORPH_CLOSE, np.ones((5,5),np.uint8), iterations = 10)
    contimage, contours, hierarchy = cv2.findContours(closing, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    contours = sorted(contours, key = cv2.contourArea, reverse = True)[:5]
    target = None
    for c in contours:
        p = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.09 * p, True)
        if len(approx) == 4:
            target = approx
            cv2.drawContours(image, [target], -1, (0, 255, 0), 2)
            break
    plt.figure(figsize = (20,20))
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title("final")
    plt.show()
Here's an example of the code working:
input1, output1, input2, output2
And not working:
input1, output1, input2, output2
It looks like the main issue here is that in the failing images there is not sufficient contrast between the paper and the table it's placed on. The ones that work are basically white paper on a dark background, so it's relatively easy to segment out the paper; however, when you place the paper on a light-colored surface, there isn't enough contrast between the paper and the background to tell which is which. Unfortunately, in image processing there is only so much you can do when your input image is bad, so there is no easy automated fix for this, but I can think of a few workarounds; they will all require extra work.
One would be, instead of having your program automatically detect where the paper is, to have a static box that the user has to place the document inside of, and simply capture the contents that way. This is probably the simplest to implement; however, it seems like you WANT to detect the paper automatically, so this probably isn't what you're looking for.
Two would be to have some intermediate step allowing the user to select a specific threshold value to apply to the image. Basically you would take the picture, then have the user set a threshold value such that the paper ends up being white, and the background is dark, then you could use that as a template to create the boundary of the paper which you can then segment from the original image. This is probably the most work but closest to what you're looking for.
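A minimal sketch of that second workaround, assuming a simple trackbar is acceptable as the intermediate step (the window name and function are made up for illustration):
import cv2

def pick_threshold(img_path):
    gray = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2GRAY)
    cv2.namedWindow("threshold picker")
    cv2.createTrackbar("thresh", "threshold picker", 128, 255, lambda v: None)
    while True:
        t = cv2.getTrackbarPos("thresh", "threshold picker")
        _, mask = cv2.threshold(gray, t, 255, cv2.THRESH_BINARY)
        cv2.imshow("threshold picker", mask)
        if cv2.waitKey(30) & 0xFF == ord('q'):   # press q to accept the current value
            break
    cv2.destroyWindow("threshold picker")
    return mask
The accepted mask can then go through findContours / approxPolyDP exactly like the automatically produced edge map in your code.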
Three would be similar to number one, but instead of having a set area you place the documents within, you could take the picture and then have the user manually select where the corners are and segment it that way: more work than #1, less work than #2, but probably still not what you're looking for.
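And a minimal sketch of that third workaround, using a mouse callback for the corner clicks and a perspective warp to straighten the page (the file path, click order and output size are all assumptions):
import cv2
import numpy as np

points = []

def on_click(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN and len(points) < 4:
        points.append((x, y))

image = cv2.imread("document.jpg")          # placeholder path
cv2.namedWindow("pick corners")
cv2.setMouseCallback("pick corners", on_click)
while len(points) < 4:
    vis = image.copy()
    for p in points:
        cv2.circle(vis, p, 5, (0, 255, 0), -1)
    cv2.imshow("pick corners", vis)
    cv2.waitKey(30)

# assumed click order: top-left, top-right, bottom-right, bottom-left
src = np.float32(points)
w, h = 800, 1100                            # arbitrary output size
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
M = cv2.getPerspectiveTransform(src, dst)
scan = cv2.warpPerspective(image, M, (w, h))
cv2.imwrite("scan.png", scan)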
Finally, you could just leave it as is and use it knowing that you need a sufficiently dark background for it to work correctly. There are probably some other workarounds, but with a lot of image processing you are very constrained by the quality of your image, and there isn't always a software solution for exactly what you're looking to do.
First, I want to measure with a camera how water dripped onto a fabric is absorbed over time. The user drips water onto the fabric; the algorithm then detects the movement until the water is absorbed completely and plots a change-vs-time graph showing the absorption time, area, etc.
To detect the movement I used the absdiff function with a constant change rate, and I take frames from the start of detection to the end, like this image. There is no problem here. But to calculate the absorption of the water, I thresholded the frame and used the countNonZero function to count the black pixels. There is one problem, though: the black pixels (shown by the red lines in the thresholded images) are continuously changing (shaking, vibration, etc.), so the plotting process fails.
Try
I tried changing the webcam device (using a phone camera via IPcam)
I tried adaptive threshold methods (Otsu, etc.) to find the optimum threshold
I smoothed the lighting conditions and captured without a background
Success
When I use the video taken with the phone camera as input, the shaking and vibration effects decrease and I get the expected result, this graph.
QUESTION
How can I smooth the thresholded image in real time?
Or is there another approach I should take?
Code
import cv2
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
import operator
def pixelHesaplayici(x):  # counts the black pixels in a thresholded frame
    siyaholmayanpixel=cv2.countNonZero(x)
    height,width=x.shape
    toplampixel=height*width
    siyahpixelsayisi=toplampixel-siyaholmayanpixel
    return siyahpixelsayisi
def grafikciz(sure,newblackpixlist,maxValue,index,totaltime,cm):  # plots black-pixel count vs time
    plt.figure(figsize=(15,15))
    plt.plot(sure,newblackpixlist)
    line,=plt.plot(sure,newblackpixlist)
    plt.setp(line,color='r')
    plt.text(totaltime/2,maxValue/2, r'$Max-Pixel=%d$'%maxValue,fontsize=14,color='r')
    plt.text(totaltime/2,maxValue/2.5, r'$Max-emilim-zamanı=%f$'%sure[index],fontsize=14,color='b')
    plt.text(totaltime/2,maxValue/3, r'$Max-Alan=%fcm^2$'%cm,fontsize=14,color='g')
    plt.ylabel('Black Pixels')
    plt.xlabel('Time(s)')
    plt.grid(True)
    plt.show()
static_back=None
i=0
blackpixlist=[]
newblackpixlist=[]
t=[]
video=cv2.VideoCapture("kumas1.mp4")
while(True):
    ret,frame=video.read()
    if ret==True:
        gray=cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
        gray=cv2.GaussianBlur(gray,(5,5),0)
        _,threshforgraph=cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
        if static_back is None:
            static_back=gray
            continue
        diff_frame=cv2.absdiff(static_back,gray)
        threshfortime=cv2.threshold(diff_frame,127,255,cv2.THRESH_BINARY)[1]
        #threshfortime=cv2.dilate(threshfortime,None,iterations=2)
        (_,cnts,_)=cv2.findContours(threshfortime.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
        for contour in cnts:
            if cv2.contourArea(contour)<450:
                continue
            an=datetime.now()
            t.append(an.minute*60+an.second+(an.microsecond/1000000))
            cv2.fillPoly(frame,contour, (255,255,255), 8,0)
            cv2.imwrite("samples/frame%d.jpg"%i,threshforgraph)
            i+=1
        cv2.imshow("org2",frame)
        #cv2.imshow("Difference Frame",diff_frame)
        #cv2.imshow("Threshold Frame",threshfortime)
        #cv2.imshow("Threshforgraph",threshforgraph)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break
ti=t[1::3]
lasttime=ti[-1]
firsttime=ti[-len(ti)]
totaltime=lasttime-firsttime
for i in range(0,i):
    img=cv2.imread('samples/frame%d.jpg'%i,0)
    blackpixlist.append(pixelHesaplayici(img))
ilkpix=blackpixlist[0]
for a in blackpixlist:
    newblackpixlist.append(a-ilkpix)
newblackpixlisti=newblackpixlist[1::3]
index , maxValue=max(enumerate(newblackpixlisti),
key=operator.itemgetter(1))
sure=np.linspace(0,totaltime,len(newblackpixlisti))
cm=0.0007*maxValue # For 96 dpi
grafikciz(sure,newblackpixlisti,maxValue,index,totaltime,cm)
What about subtracting the first frame from the next frames? If you know, or can detect, when there is no drop and subtract that frame, the difference will only give you the result of the drop.
This approach might also be interesting if you have several drops in different places and want to discard the previous drop.
Note that you can do the subtraction either before or after thresholding. I would recommend before thresholding.
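Here is a minimal sketch of that idea, reusing your video file and assuming the very first frame contains no drop; the threshold of 30 is just a placeholder, the important part is that it stays fixed for every frame:
import cv2

cap = cv2.VideoCapture("kumas1.mp4")
first = None
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (5, 5), 0)
    if first is None:
        first = gray                      # assumed to contain no drop yet
        continue
    diff = cv2.absdiff(first, gray)       # subtraction before thresholding
    _, wet = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
    wet_area = cv2.countNonZero(wet)      # grows as the water spreads
    cv2.imshow("wet area", wet)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()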
If you know that you have a lot of shaking in your process, you probably need to apply a digital stabilization, in which case I would advise to look at this tutorial:
https://www.learnopencv.com/video-stabilization-using-point-feature-matching-in-opencv/
Of course, the stabilization should be done before the subtraction.
In general for your problem, I wouldn't use an adaptive method. The threshold should be the same for all frames, if it adapts depending on the image, you could have invalid results.
I hope I properly understood your problem!
I am trying to make a program capable of identifying a road in a scene, and I proceeded using morphological filtering and the watershed algorithm. However, the program produces either mediocre or bad results. It seems to do okay (though not good enough) if the road takes up most of the scene. However, in other pictures the sky gets segmented instead (watershed with the clouds).
I tried to see if I could perform more image processing to improve the results, but this is the best I have so far, and I don't know how to move forward to improve my program.
How can I improve my program?
Code:
import numpy as np
import cv2
from matplotlib import pyplot as plt
import imutils
def invert_img(img):
    img = (255-img)
    return img
#img = cv2.imread('images/coins_clustered.jpg')
img = cv2.imread('images/road_4.jpg')
img = imutils.resize(img, height = 300)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
thresh = invert_img(thresh)
# noise removal
kernel = np.ones((3,3), np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 4)
# sure background area
sure_bg = cv2.dilate(opening,kernel,iterations=3)
#sure_bg = cv2.morphologyEx(sure_bg, cv2.MORPH_TOPHAT, kernel)
# Finding sure foreground area
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg,sure_fg)
# Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
'''
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
imgray = cv2.GaussianBlur(imgray, (5, 5), 0)
img = cv2.Canny(imgray,200,500)
'''
markers = cv2.watershed(img,markers)
img[markers == -1] = [255,0,0]
cv2.imshow('background',sure_bg)
cv2.imshow('foreground',sure_fg)
cv2.imshow('threshold',thresh)
cv2.imshow('result',img)
cv2.waitKey(0)
For a start, segmentation problems are hard. The more general you want the solution to be, the harder it gets. Road segmentation is a well-known problem, and I'm sure you can find many papers which tackle this issue from various directions.
Something that helps me get ideas for computer vision problems is trying to think about what makes something so easy for me to detect and so hard for a computer.
For example, let's look at the road in your images. What makes it unique compared to the background?
Distinct gray color.
It always has 2 white shoulder lines.
It is always in the bottom section of the image.
It always has a separation line in the middle (yellow/white).
It is pretty smooth.
It is wider at the bottom, vanishing into the horizon.
Now, after we have found some unique features, we need to find ways to quantify them, so it will be obvious to the algorithm as it is obvious to us.
Work on the RGB (or, even better, HSV) image; don't convert it to gray at the beginning and lose all the color data. Look for gray areas!
Again, let's find white regions (inside gray ones). You can try doing edge detection in the specific orientation of the shoulder lines. You are looking for a line that spans about half of the height of the image, etc...
Let's delete the upper half of the image. It is unlikely that you will ever have road there, and you will get rid of a lot of noise in your algorithm.
See 2...
Let's check the local standard deviation, or some other smoothness feature.
If we found some shape, let's check whether it fits what we expect.
I know these are just ideas and I don't claim they are easy to implement, but if you want to improve your algorithm you must give it more "knowledge", just as you have.
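To make a couple of those points concrete, here is a rough sketch (not a complete solution) of the "look for gray in HSV" and "drop the upper half of the image" ideas; the inRange bounds are guesses that would need tuning per scene:
import cv2
import numpy as np

img = cv2.imread("images/road_4.jpg")
h = img.shape[0]
lower_half = img[h // 2:, :]                                # ignore the sky half

hsv = cv2.cvtColor(lower_half, cv2.COLOR_BGR2HSV)
gray_mask = cv2.inRange(hsv, (0, 0, 60), (180, 40, 220))    # low saturation = gray-ish pixels

# clean up the mask before handing it to watershed / contour analysis
kernel = np.ones((5, 5), np.uint8)
gray_mask = cv2.morphologyEx(gray_mask, cv2.MORPH_OPEN, kernel, iterations=2)

cv2.imshow("road candidates", gray_mask)
cv2.waitKey(0)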
Exploit some domain knowledge; in other words, make some simplifying assumptions. Even basic things like "the camera's not upside down" and "the pavement has a uniform hue" will improve the common case.
If you can treat crossroads as a special case, then finding the edges of the roadway may be a simpler and more useful task than finding the roadway itself.
I read this blog post where he uses a laser and a webcam to estimate the distance of the cardboard from the webcam.
I had another idea about that. I don't want to calculate the distance from the webcam.
I want to check if an object is approaching the webcam. The algorithm, according to me, will be something like:
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Since I want to detect random objects, I am using the findContours() method to find the contours in the video feed. Using that, I will at least have the outlines of the objects in the video feed. The source code is:
import numpy as np
import cv2
vid=cv2.VideoCapture(0)
ans, instant=vid.read()
average=np.float32(instant)
cv2.accumulateWeighted(instant, average, 0.01)
background=cv2.convertScaleAbs(average)
while(1):
    _,f=vid.read()
    imgray=cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
    ret, thresh=cv2.threshold(imgray,127,255,0)
    diff=cv2.absdiff(f, background)
    cv2.imshow("input", f)
    cv2.imshow("Difference", diff)
    if cv2.waitKey(5)==27:
        break
cv2.destroyAllWindows()
The output is:
I am stuck here. I have the contours stored in an array. What do I do with it when the size increases? How do I proceed?
One trouble here is recognising and differentiating the moving objects from other stuff in the video feed. An approach might be to let the camera 'learn' what the background looks like with no object. Then you can constantly compare its input against this background. One way to get the background is to use a running average.
Any difference greater than a small threshold means there is a moving object. If you constantly display this difference, you basically have a motion tracker. The size of the objects is simply the sum of all the non-zero (thresholded) pixels, or their bounding rectangles. You can track this size and use it to guess whether the object is moving closer or further. Morphological operations can help group the contours into one cohesive object.
Since it will be tracking ANY movement, if there are two objects, they will be counted together. Here is where you can use the contours to find and track individual objects, e.g. using the contour bounds or centroids. You could also possibly separate them by colour.
Here are some results using this strategy (the grey blob is my hand):
It actually did a fairly good job of guessing which way my hand was moving.
Code:
import cv2
import numpy as np
AVERAGE_ALPHA = 0.2 # 0-1 where 0 never adapts, and 1 instantly adapts
MOVEMENT_THRESHOLD = 30 # Lower values pick up more movement
REDUCED_SIZE = (400, 600)
MORPH_KERNEL = np.ones((10, 10), np.uint8)
def reduce_image(input_image):
    """Make the image easier to deal with."""
    reduced = cv2.resize(input_image, REDUCED_SIZE)
    reduced = cv2.cvtColor(reduced, cv2.COLOR_BGR2GRAY)
    return reduced

# Initialise
vid = cv2.VideoCapture(0)
average = None
old_sizes = np.zeros(20)
size_update_index = 0

while (True):
    got_frame, frame = vid.read()
    if got_frame:
        # Reduce image
        reduced = reduce_image(frame)
        if average is None: average = np.float32(reduced)
        # Get background
        cv2.accumulateWeighted(reduced, average, AVERAGE_ALPHA)
        background = cv2.convertScaleAbs(average)
        # Get thresholded difference image
        movement = cv2.absdiff(reduced, background)
        _, threshold = cv2.threshold(movement, MOVEMENT_THRESHOLD, 255, cv2.THRESH_BINARY)
        # Apply morphology to help find object
        dilated = cv2.dilate(threshold, MORPH_KERNEL, iterations=10)
        closed = cv2.morphologyEx(dilated, cv2.MORPH_CLOSE, MORPH_KERNEL)
        # Get contours
        contours, _ = cv2.findContours(closed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        cv2.drawContours(closed, contours, -1, (150, 150, 150), -1)
        # Find biggest bounding rectangle
        areas = [cv2.contourArea(c) for c in contours]
        if (areas != list()):
            max_index = np.argmax(areas)
            max_cont = contours[max_index]
            x, y, w, h = cv2.boundingRect(max_cont)
            cv2.rectangle(closed, (x, y), (x+w, y+h), (255, 255, 255), 5)
            # Guess movement direction
            size = w*h
            if size > old_sizes.mean():
                print("Towards")
            else:
                print("Away")
            # Update object size
            old_sizes[size_update_index] = size
            size_update_index += 1
            if (size_update_index) >= len(old_sizes): size_update_index = 0
        # Display image
        cv2.imshow('RaptorVision', closed)
        cv2.waitKey(1)  # needed so the imshow window actually refreshes
Obviously this needs more work in terms of identifying, selecting and tracking the objects etc (at the moment it does horribly if there is something else moving in the background). There are also many parameters to vary and tweak (the ones set are what worked well for my system). I'll leave that up to you though.
Some links:
background extraction
motion tracking
If you want to get a bit more high-tech with the background removal, have a look here:
wallflower
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Good idea.
If you want to use the contour detection approach, you could do it the following way:
You have a series of images I1, I2, ..., In.
Do a contour detection on each one: C1, C2, ..., Cn (a contour is a set of points in OpenCV).
Take a large enough sample of contour points for each image i: S_i ⊆ C_i, i ∈ 1...n.
For every point in your sample, find the nearest point on contour i+1. That gives you trajectories for all your points.
Check whether these trajectories point mostly outwards (tricky part ;).
If they point outwards for a sufficient number of frames, your contour got bigger.
Alternatively, you could try to prune the points that are not part of the correct contour and work with a covering rectangle. It's very easy to check the size that way, but I don't know how easy it will be to choose the "correct" points.
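For the covering-rectangle variant, a small sketch could look like this (it assumes OpenCV 4's two-value findContours return and that the masks are thresholded difference images like the ones produced above):
import cv2

def largest_box_area(mask):
    # area of the bounding box of the biggest contour, 0 if nothing was found
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0
    biggest = max(contours, key=cv2.contourArea)
    _, _, w, h = cv2.boundingRect(biggest)
    return w * h

def growing(prev_mask, curr_mask, tolerance=1.1):
    # the object is "approaching" if its box grew by more than the tolerance factor
    return largest_box_area(curr_mask) > tolerance * largest_box_area(prev_mask)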