I have a script in Python which acts as a motion detector. I read a video file using cv2, convert to grayscale, and do simple background subtraction against the current frame to detect motion, which I draw a rectangle over. The video is eventually saved as a new file, where I can finally view it.
This works fine, except sometimes the starting frame (background frame) already has motion in it, or there are features in the background which move but which I don't want to detect (e.g. if I was detecting people, I wouldn't be interested in a flag blowing in the breeze). So I want to somehow disregard 'stationary' movement (i.e. motion which does not move vertically/horizontally over the course of the video). However I'm having trouble with my approach, and there don't seem to be any functions or scripts on the internet that solve this.
One idea I had was to draw a larger rectangle over the original, and then if the original rectangle doesn't leave the outer rectangle (which stays put) over the video, then that motion can be cancelled altogether. I have no idea how to implement this. I have managed to draw a larger rectangle, but it follows the original and doesn't stay in place.
Does anyone have any idea how I might be able to do this, or any resources they could point me to? Thank you. Below is my code, starting from where I draw the rectangles.
for c in cnts:
    # if the contour is too small, ignore it
    if cv2.contourArea(c) < min_area:
        continue
    # compute the bounding box for the contour, draw it on the frame, and update the text
    (x, y, w, h) = cv2.boundingRect(c)
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    text = "Occupied"  # frame is occupied
    half_w = int(w / 2)  # get 50% sizing width
    half_h = int(h / 2)  # get 50% sizing height
    x_surr = int(x - (half_w / 2))
    y_surr = int(y - (half_h / 2))
    w_surr = (w + half_w)
    h_surr = (h + half_h)
    cv2.rectangle(frame, (x_surr, y_surr), (x_surr + w_surr, y_surr + h_surr), (255, 255, 255), 2)
I think this code might help you. Basically it compares the value of each pixel in the current frame to the corresponding pixel in the average of the previous n frames. When no motion is present, the difference image is all black. When there is motion, it shows the colour of the moving object. Since it keeps a running average of recent frames, you should be able to filter out slight movements such as flags fluttering. You will probably need to play around with some thresholding on the final image to get the result you want.
Stillness:
Motion:
import cv2

def main():
    # define the length of the list of the number of recent frames to keep track of
    NUMBER_FRAMES_TO_TRACK = 30

    # start the webcam
    cap = cv2.VideoCapture(1)
    ret, frame = cap.read()
    if ret == False:
        print("No webcam detected.")
        return

    # generate a list of recent frames
    recent_frames = [frame for n in range(NUMBER_FRAMES_TO_TRACK)]

    # start the video loop
    while True:
        ret, frame = cap.read()
        if ret == False:
            break

        # update the list of recent frames with the most recent frame
        recent_frames = recent_frames[1:]
        recent_frames.append(frame)

        # calculate the average of all recent frames
        average = recent_frames[0]
        for i in range(len(recent_frames)):
            if i == 0:
                pass
            else:
                alpha = 1.0 / (i + 1)
                beta = 1.0 - alpha
                average = cv2.addWeighted(recent_frames[i], alpha, average, beta, 0.0)

        # find the difference between the current frame and the average of recent frames
        difference = cv2.subtract(frame, average)

        # show the results
        cv2.imshow("video", frame)
        cv2.imshow("average", average)
        cv2.imshow("difference", difference)

        key = cv2.waitKey(1)
        if key == ord('q'):
            break

    cv2.destroyAllWindows()
    cap.release()

if __name__ == "__main__":
    main()
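If you want to go from that difference image to an actual occupied/unoccupied decision, a minimal sketch might look like this (the threshold of 25 and the minimum area are arbitrary values you would tune, and it assumes OpenCV 4's two-value findContours return):

import cv2

def motion_boxes(difference, thresh_val=25, min_area=500):
    """Return bounding boxes of regions that differ enough from the running average."""
    gray = cv2.cvtColor(difference, cv2.COLOR_BGR2GRAY)
    # Pixels brighter than thresh_val are treated as motion
    _, mask = cv2.threshold(gray, thresh_val, 255, cv2.THRESH_BINARY)
    # Dilate so each moving object becomes a single blob
    mask = cv2.dilate(mask, None, iterations=2)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

Small, stationary flutter (a flag, leaves) tends to fall below the threshold once it has been absorbed into the running average.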
I really don't know if "UVs" is the right word, as I'm from the world of Unity and am trying to write some stuff in Python. What I'm trying to do is take a picture of a human (from a webcam), take the placement of their landmarks/key features, and alter a second image (of a different person) so that their key features end up in the same place, while morphing/warping the parts of the skin within the face to fit the position of the first (webcam) image's landmarks. After I do that I need to put the face back on the non-webcam input. (I'm sorry for how much that made me sound like a serial killer, stretching and cutting faces.) I know that probably didn't make much sense, but I want it to look like this.
I have the face landmark detection and cutting done with dlib and OpenCV, but I need a way to take these "cut" face chunks and stretch them "dynamically". What I mean by dynamically is that you don't just put a mask on by linearly resizing it on one or two axes. You can select a point of the mask and change that; I want to do that, but my mask is my cut chunk, and the point is a section of that chunk that needs to change for the chunk to comply with the position of the generated landmarks. I know this is a very hard topic to think about, and if you need any clarification just ask. My code:
import cv2
import numpy as np
import dlib

cap = cv2.VideoCapture(0)

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

while True:
    _, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = detector(gray)
    for face in faces:
        x1 = face.left()
        y1 = face.top()
        x2 = face.right()
        y2 = face.bottom()
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)

        landmarks = predictor(gray, face)

        for n in range(0, 68):
            x = landmarks.part(n).x
            y = landmarks.part(n).y
            cv2.circle(frame, (x, y), 4, (255, 0, 0), -1)

    cv2.imshow("Frame", frame)

    key = cv2.waitKey(1)
    if key == 27:
        break
EDIT: No, I'm not a serial killer.
If you need to deform the source image like a rubber sheet using two sets of keypoints, you need to use a thin plate spline (TPS) or, better, a piecewise affine transformation like here. The latter is more similar to texture rasterization methods (triangle-to-triangle texture transforms).
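For illustration, here is a rough piecewise affine sketch using scikit-image (the warp_face helper and its arguments are my own placeholders; opencv-contrib's thin plate spline shape transformer is an alternative if you prefer TPS):

# A sketch, assuming scikit-image is installed; landmark arrays are
# (N, 2) float arrays of (x, y) points, e.g. the 68 dlib landmarks.
import numpy as np
from skimage.transform import PiecewiseAffineTransform, warp

def warp_face(src_image, src_landmarks, dst_landmarks):
    """Deform src_image so that its landmarks move onto dst_landmarks."""
    tform = PiecewiseAffineTransform()
    # warp() expects an inverse map (output coords -> input coords),
    # so estimate the transform from the destination points to the source points.
    tform.estimate(np.asarray(dst_landmarks, float), np.asarray(src_landmarks, float))
    # Result is a float image in [0, 1]; convert back if you need uint8
    return warp(src_image, tform, output_shape=src_image.shape[:2])

Note that the piecewise affine mesh is only defined inside the convex hull of the landmarks, so you may want to add a few fixed anchor points (for example the image corners) to keep the surrounding pixels in place.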
I am developing a project for my university assignment which has an AR part that I tried to do with Unity and Vuforia. I want to get a simple T shape (or any shape which is easy for the user to draw on a body part such as a hand) as the image target, because I'm developing an app similar to inkHunter. In that app they use a smiley as the image target, and when the customer draws a smiley on their body and points the camera at it, the camera finds it and shows the selected tattoo design on it. I tried it with the Vuforia SDK, but they give a rating for the image target, so I can't get what I want as the image target. I think using OpenCV is the right way to do it, but it's hard to learn and I have little time. I think this is not a big thing to implement, so please try to help me with this problem. I think you get my idea. In inkHunter, even if I draw the target on a sheet of paper they still show the tattoo on it. I need the same, which means I need to detect the drawn target. It would be great if you could help me in this situation. Thanks.
The target can be like this:
I was able to do template matching from pictures, and I applied the same to real time, which means I looped through the frames. But it does not seem to match the template with the frames, and I realized that found (the bookkeeping variable) is always None.
import cv2
import numpy as np
import imutils

def main():
    template = cv2.imread("C:\\Users\\Manthika\\Desktop\\opencvtest\\template.jpg")
    template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
    template = cv2.Canny(template, 50, 200)
    (tH, tW) = template.shape[:2]
    cv2.imshow("Template", template)

    windowName = "Something"
    cv2.namedWindow(windowName)
    cap = cv2.VideoCapture(0)

    if cap.isOpened():
        ret, frame = cap.read()
    else:
        ret = False

    # loop over the frames to find the template
    while ret:
        # load the image, convert it to grayscale, and initialize the
        # bookkeeping variable to keep track of the matched region
        ret, frame = cap.read()
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        found = None

        # loop over the scales of the image
        for scale in np.linspace(0.2, 1.0, 20)[::-1]:
            # resize the image according to the scale, and keep track
            # of the ratio of the resizing
            resized = imutils.resize(gray, width=int(gray.shape[1] * scale))
            r = gray.shape[1] / float(resized.shape[1])

            # if the resized image is smaller than the template, then break
            # from the loop
            if resized.shape[0] < tH or resized.shape[1] < tW:
                break

            # detect edges in the resized, grayscale image and apply template
            # matching to find the template in the image
            edged = cv2.Canny(resized, 50, 200)
            result = cv2.matchTemplate(edged, template, cv2.TM_CCOEFF)
            (_, maxVal, _, maxLoc) = cv2.minMaxLoc(result)

            # if we have found a new maximum correlation value, then update
            # the bookkeeping variable
            if found is None or maxVal > found[0]:
                found = (maxVal, maxLoc, r)
                print(found)

        # unpack the bookkeeping variable and compute the (x, y) coordinates
        # of the bounding box based on the resized ratio
        print(found)
        if found is None:
            # just show only the frames if the template is not detected
            cv2.imshow(windowName, frame)
        else:
            (_, maxLoc, r) = found
            (startX, startY) = (int(maxLoc[0] * r), int(maxLoc[1] * r))
            (endX, endY) = (int((maxLoc[0] + tW) * r), int((maxLoc[1] + tH) * r))

            # draw a bounding box around the detected result and display the image
            cv2.rectangle(frame, (startX, startY), (endX, endY), (0, 0, 255), 2)
            cv2.imshow(windowName, frame)

        if cv2.waitKey(1) == 27:
            break

    cv2.destroyAllWindows()
    cap.release()

if __name__ == "__main__":
    main()
Please help me to solve this problem.
I can give you a hint for the OpenCV part, but not for Unity and Vuforia; I hope it still helps.
So, the way I see the pipeline for the project:
1. Detect the location, size, and aspect ratio of the target.
2. Use a homography to transform the image that should be put over the original.
3. Overlay: put one image on top of the other.
I will assume that the target will be a dark "T" on a white piece of paper, and it may appear in different locations of the paper, as well as the paper itself may move.
1. Detect location, size, and aspect ratio
Firstly, you need to detect the piece of paper. Since you know its colour and aspect ratio, you can use RGB/HSV thresholding for segmentation. You could also try deep/machine learning (a strategy similar to R-CNN, HOG+SVM, etc.), but that will take time. Then you can use the findContours() function from OpenCV to get the largest object. From the contour you can get the location, size, and aspect ratio of the paper.
After that you do the same thing, but within the piece of paper, looking for the "T". Here you can use a template matching method, scanning the region of interest with a predefined mask at different sizes, or simply repeat the steps above.
A useful resource may be this credit card character recognition example. It helped me a lot one day. :)
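As a sketch of step 1 (the thresholds are arbitrary; it assumes a dark mark on a bright sheet and OpenCV 4's two-value findContours return):

import cv2

def find_paper_and_target(frame):
    """Locate the bright sheet of paper, then the dark mark drawn on it."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Bright regions are candidate paper
    _, paper_mask = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(paper_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    px, py, pw, ph = cv2.boundingRect(max(contours, key=cv2.contourArea))

    # Dark regions inside the paper are candidate targets ("T")
    roi = gray[py:py + ph, px:px + pw]
    _, mark_mask = cv2.threshold(roi, 100, 255, cv2.THRESH_BINARY_INV)
    marks, _ = cv2.findContours(mark_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not marks:
        return None
    tx, ty, tw, th = cv2.boundingRect(max(marks, key=cv2.contourArea))

    # Return the target's box in full-frame coordinates, plus its aspect ratio
    return (px + tx, py + ty, tw, th), tw / float(th)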
2. Use homography for transformation of the image that should be put over the original
After extracting the aspect ratio you will know the approximate size and shape that should appear on top of the "T". This will let you use a homography to transform the image you want to put over the "T". Here is a good point to start; you can also google for other sources, there should be plenty of them, and as far as I know OpenCV has functions for that.
After the transformation, I would recommend using interpolation, because there might be some missing pixels afterwards.
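A possible sketch for step 2 with OpenCV's homography functions (target_corners is assumed to be the four corners of the detected region, ordered the same way as the design's corners):

import cv2
import numpy as np

def project_design(design, frame_shape, target_corners):
    """Warp the tattoo design so its corners land on the detected target corners."""
    h, w = design.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    dst = np.float32(target_corners)
    H, _ = cv2.findHomography(src, dst)
    # Bilinear interpolation fills in the missing pixels mentioned above
    return cv2.warpPerspective(design, H, (frame_shape[1], frame_shape[0]),
                               flags=cv2.INTER_LINEAR)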
3. Overlay: put one image on top of the other
The last step is just to go through all pixels of the input image and put the transformed image over the target pixels.
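Instead of a per-pixel Python loop, the same overlay can be done with a mask (just a sketch; it assumes the warped design is black outside the target region):

import cv2

def overlay(frame, warped_design):
    """Copy every non-black pixel of the warped design onto the frame."""
    gray = cv2.cvtColor(warped_design, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)
    out = frame.copy()
    out[mask > 0] = warped_design[mask > 0]
    return out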
Hope this helps, good luck!:)
I am working on a project to detect a car in a video using a Haar cascade. It works fine, but it's still a bit unstable: for example, an unexpected object which is not a car gets detected and vanishes in a matter of a second or two. So I tried to add the logic that if a detected object's coordinates change abruptly, it is not what we expect, but if they don't change much, it is a car. So I created the following code:
import cv2

cap = cv2.VideoCapture('C:\\Users\\john\\Desktop\\bbd3.avi')
car_cascade = cv2.CascadeClassifier('C:\\Users\\john\\Desktop\\cars.xml')

i = 0
x = [None] * 10000000
y = [None] * 10000000

while (cap.isOpened()):
    ret, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    roi_gray = gray[0:480, 100:300]
    roi_color = frame[0:480, 100:300]
    # roi_gray defines the area in the video that we apply haar cascade detection to
    # we add roi_color with the same area to draw rectangles on the colour frame

    cars = car_cascade.detectMultiScale(roi_gray, 1.05, 5, minSize=(30, 30))
    # we can specify a parameter's value by naming it before the value;
    # the parameters are image, scaleFactor, minNeighbors, minSize

    for (x, y, w, h) in cars:
        x[i] = x
        y[i] = y
        if i >= 1:
            if abs(x[i] - x[i-1]) < 10 and abs(y[i] - y[i-1]) < 10:
                cv2.rectangle(roi_color, (x, y), (x + w, y + h), (255, 0, 0), 2)

    cv2.imshow('frame', frame)
    i = i + 1

    if cv2.waitKey(1) == 27:
        break

cv2.destroyAllWindows()
Here detectMultiScale returns a list of rectangles where the detected objects are located. The idea is basically to create an empty array, assign the bottom-left coordinate of each detection to the array, and compare the coordinates between consecutive frames. However it keeps returning
TypeError: 'numpy.int32' object does not support item assignment
Can anyone give me an idea of why this happens in the first place and how to solve it? Thanks in advance.
*Also, for those who haven't dealt with OpenCV before: detectMultiScale returns a list of rectangles, but there can be multiple rectangles returned depending on how many objects are detected in the video. For example, if one car is detected in the first frame it returns only one rectangle, but if three cars are detected in the second frame it returns three rectangles. I assume this is the main problem here: assigning multiple values to one element x[i]. However, how can I assign values to a fixed array if I don't know how much data will be given in one frame?
When writing for (x,y,w,h) in cars, the object named x becomes the numpy.int32 (an integer) returned by detectMultiScale.
Now, when Python arrives to x[i]=x, it first evaluates the right value. It's an integer.
Then it tries to "evaluate" the left value, but indexing into an integer (something like 3[2]) has no meaning in Python, so it fails.
Here's how I would refactor your code:
import cv2

MAX_DIST = 10

cap = cv2.VideoCapture('C:\\Users\\john\\Desktop\\bbd3.avi')
car_cascade = cv2.CascadeClassifier('C:\\Users\\john\\Desktop\\cars.xml')

cars_coordinates = []

while (cap.isOpened()):
    ret, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    roi_gray = gray[0:480, 100:300]
    roi_color = frame[0:480, 100:300]

    cars = car_cascade.detectMultiScale(roi_gray, 1.05, 5, minSize=(30, 30))

    for (x, y, w, h) in cars:
        if cars_coordinates:
            last_x, last_y = cars_coordinates[-1]
            if abs(x - last_x) < MAX_DIST and abs(y - last_y) < MAX_DIST:
                cv2.rectangle(roi_color, (x, y), (x + w, y + h), (255, 0, 0), 2)
                cars_coordinates.append((x, y))  # adding the coordinates here allows you to keep track of only your past detected cars

    cv2.imshow('frame', frame)

    if cv2.waitKey(1) == 27:
        break
A warning about your algorithm: I corrected it as is, but with that algorithm, if the first detection from detectMultiScale is a false positive, you won't "get back" to the real cars. Plus, you can only track one car.
Pseudo code for a slightly better algorithm:
During 5 frames, only memorize the detected areas
When handling a new frame, for each detected car:
    if the car is near a memorized area in at least 4 of the last 5 frames:
        consider it as a car
    append it anyway to the detected areas for this frame
Doing it like this, you won't miss a new car appearing in the middle of your video, nor lose a car that happened not to be detected in one frame for whatever reason, and false positives have a good chance of being avoided.
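A rough Python sketch of that pseudo code (the names and thresholds are placeholders, not tested code):

from collections import deque

MAX_DIST = 10      # how close two detections must be to count as "near"
HISTORY = 5        # number of past frames to remember
MIN_HITS = 4       # must be near a memorized area in this many of the last HISTORY frames

recent_detections = deque(maxlen=HISTORY)   # one list of (x, y) per frame

def confirmed_cars(detections):
    """Return only the detections seen near a memorized area in >= MIN_HITS recent frames."""
    confirmed = []
    for (x, y, w, h) in detections:
        hits = sum(
            any(abs(x - px) < MAX_DIST and abs(y - py) < MAX_DIST for (px, py) in frame_dets)
            for frame_dets in recent_detections
        )
        if hits >= MIN_HITS:
            confirmed.append((x, y, w, h))
    # Memorize this frame's detections regardless, as in the pseudo code
    recent_detections.append([(x, y) for (x, y, w, h) in detections])
    return confirmed

Inside the while loop you would call confirmed_cars(cars) and only draw rectangles for the detections it returns.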
I have the following code that finds the optical flow of two images (or two frames of a video), colour coded. What I want is the horizontal and vertical components of the optical flow separately (as separate images).
Here is the code I have so far:
import cv2
import numpy as np

frame1 = cv2.imread('my1.bmp')
frame2 = cv2.imread('my2.bmp')

prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

hsv = np.zeros_like(frame1)
hsv[..., 1] = 255

while(1):
    next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

    flow = cv2.calcOpticalFlowFarneback(prvs, next, 0.5, 3, 15, 3, 5, 1.2, 0)

    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang * 180 / np.pi / 2
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

    cv2.imshow('frame2', rgb)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
    elif k == ord('s'):
        cv2.imwrite('opticalmyhsv.pgm', rgb)

cap.release()
cv2.destroyAllWindows()
This is what the optical flow looks like given my two images:
If you want to visualize the horizontal and vertical components separately, you can display both as grayscale images. I'll make it such that a gray colour denotes no motion, black denotes the maximum amount of motion in the frame going to the left (negative), and white denotes the maximum amount of motion in the frame going to the right (positive).
The output of calcOpticalFlowFarneback is a 3D numpy array where the first slice denotes the amount of horizontal (x) displacement while the second slice denotes the amount of vertical (y) displacement.
As such, all you need to do is define two separate 2D numpy arrays that will store these values so we can display them to the user. However, you're going to need to normalize the flow for display such that no motion is a rough gray, motion to the extreme left is black, or intensity 0, and motion to the extreme right is white, or intensity 255.
Therefore, all you would need to do is modify your code to show two OpenCV windows for the horizontal and vertical motion like so:
import cv2
import numpy as np

frame1 = cv2.imread('my1.bmp')
frame2 = cv2.imread('my2.bmp')

prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

flow = cv2.calcOpticalFlowFarneback(prvs, next, 0.5, 3, 15, 3, 5, 1.2, 0)

# Change here
horz = cv2.normalize(flow[..., 0], None, 0, 255, cv2.NORM_MINMAX)
vert = cv2.normalize(flow[..., 1], None, 0, 255, cv2.NORM_MINMAX)
horz = horz.astype('uint8')
vert = vert.astype('uint8')

# Change here too
cv2.imshow('Horizontal Component', horz)
cv2.imshow('Vertical Component', vert)

k = cv2.waitKey(0) & 0xff
if k == ord('s'):  # Change here
    cv2.imwrite('opticalflow_horz.pgm', horz)
    cv2.imwrite('opticalflow_vert.pgm', vert)

cv2.destroyAllWindows()
I've modified the code so that there is no while loop, as you're only finding the optical flow between two predetermined frames. You're not grabbing frames off a live source like a camera, so we can just show both images without a while loop. I've set the wait time for waitKey to 0 so that you wait indefinitely until you push a key. This pretty much simulates your previous while loop behaviour, but it doesn't burden your CPU needlessly with wasted cycles. I've also removed some unnecessary variables, like the hsv variable, as we aren't displaying the colour-coded components. We also just compute the optical flow once.
In any case, with the above code we compute the optical flow, extract the horizontal and vertical components separately, normalize the components between the range of [0,255], cast to uint8 so that we can display the results then show the results. I've also modified your code so that if you wanted to save the components, it'll save the horizontal and vertical components as two separate images.
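As an aside, if you specifically want the gray-centred display described earlier (mid-gray for no motion, black for extreme motion one way, white for the other), a small sketch of that mapping could look like this; it is an alternative to the min-max normalization used above, not part of the original code:

import numpy as np

def flow_to_gray(component):
    """Map a flow component to [0, 255] so that zero flow lands on mid-gray (128)."""
    scale = np.abs(component).max()
    if scale == 0:
        # No motion at all: return a flat gray image
        return np.full(component.shape, 128, dtype=np.uint8)
    return np.clip(128 + 127 * component / scale, 0, 255).astype(np.uint8)

You would call flow_to_gray(flow[..., 0]) and flow_to_gray(flow[..., 1]) in place of the cv2.normalize lines.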
Edit
In your comments, you want to display a sequence of images using the same logic we have created above. You have a list of file names that you want to cycle through. That isn't very difficult to do. Simply take your strings and put them into a list and compute the optical flow between pairs of images by using the file names stored in this list. I'll modify the code such that when we reach the last element of the list, we will wait for the user to push something. Until then, we will cycle through each pair of images until the end. In other words:
import cv2
import numpy as np

# Create list of names here from my1.bmp up to my20.bmp
list_names = ['my' + str(i+1) + '.bmp' for i in range(20)]

# Read in the first frame
frame1 = cv2.imread(list_names[0])
prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)

# Set counter to read the second frame at the start
counter = 1

# Until we reach the end of the list...
while counter < len(list_names):
    # Read the next frame in
    frame2 = cv2.imread(list_names[counter])
    next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

    # Calculate optical flow between the two frames
    flow = cv2.calcOpticalFlowFarneback(prvs, next, 0.5, 3, 15, 3, 5, 1.2, 0)

    # Normalize horizontal and vertical components
    horz = cv2.normalize(flow[..., 0], None, 0, 255, cv2.NORM_MINMAX)
    vert = cv2.normalize(flow[..., 1], None, 0, 255, cv2.NORM_MINMAX)
    horz = horz.astype('uint8')
    vert = vert.astype('uint8')

    # Show the components as images
    cv2.imshow('Horizontal Component', horz)
    cv2.imshow('Vertical Component', vert)

    # Change - Make next frame previous frame
    prvs = next.copy()

    # If we get to the end of the list, simply wait indefinitely
    # for the user to push something
    if counter == len(list_names) - 1:
        k = cv2.waitKey(0) & 0xff
    else:  # Else, wait for 1 second for a key
        k = cv2.waitKey(1000) & 0xff

    if k == 27:
        break
    elif k == ord('s'):  # Change
        cv2.imwrite('opticalflow_horz' + str(counter) + '-' + str(counter+1) + '.pgm', horz)
        cv2.imwrite('opticalflow_vert' + str(counter) + '-' + str(counter+1) + '.pgm', vert)

    # Increment counter to go to next frame
    counter += 1

cv2.destroyAllWindows()
The above code will cycle through pairs of frames and wait for 1 second between each pair, to give you the opportunity to either break out of the display or save the horizontal and vertical components to file. Bear in mind that whatever frames you save are indexed with two numbers that tell you which pair of frames they show. Before the next iteration happens, the next frame becomes the previous frame, so prvs gets replaced by a copy of next. At the beginning of the loop, the next frame is read in appropriately.
Hope this helps. Good luck!
I read this blog post where he uses a laser and a webcam to estimate the distance of the cardboard from the webcam.
I had another idea about that. I don't want to calculate the distance from the webcam.
I want to check if an object is approaching the webcam. The algorithm, according to me, will be something like:
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Since I want to detect random objects, I am using the findContours() method to find the contours in the video feed. Using that, I will at least have the outlines of the objects in the video feed. The source code is:
import numpy as np
import cv2

vid = cv2.VideoCapture(0)
ans, instant = vid.read()

average = np.float32(instant)
cv2.accumulateWeighted(instant, average, 0.01)
background = cv2.convertScaleAbs(average)

while(1):
    _, f = vid.read()
    imgray = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(imgray, 127, 255, 0)

    diff = cv2.absdiff(f, background)

    cv2.imshow("input", f)
    cv2.imshow("Difference", diff)

    if cv2.waitKey(5) == 27:
        break

cv2.destroyAllWindows()
The output is:
I am stuck here. I have the contours stored in an array. What do I do with it when the size increases? How do I proceed?
One trouble here is recognising and differentiating the moving objects from other stuff in the video feed. An approach might be to let the camera 'learn' what the background looks like with no object. Then you can constantly compare its input against this background. One way to get the background is to use a running average.
Any difference greater than a small threshold means there is a moving object. If you constantly display this difference, you basically have a motion tracker. The size of the objects is simply the sum of all the non-zero (thresholded) pixels, or their bounding rectangles. You can track this size and use it to guess whether the object is moving closer or further. Morphological operations can help group the contours into one cohesive object.
Since it will be tracking ANY movement, if there are two objects, they will be counted together. Here is where you can use the contours to find and track individual objects, e.g. using the contour bounds or centroids. You could also possibly separate them by colour.
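For example, contour centroids can be computed from image moments (a small sketch; how you associate centroids between frames is up to you):

import cv2

def contour_centroids(contours):
    """Return the (cx, cy) centre of each contour, skipping degenerate ones."""
    centroids = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] != 0:
            centroids.append((int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])))
    return centroids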
Here are some results using this strategy (the grey blob is my hand):
It actually did a fairly good job of guessing which way my hand was moving.
Code:
import cv2
import numpy as np

AVERAGE_ALPHA = 0.2          # 0-1 where 0 never adapts, and 1 instantly adapts
MOVEMENT_THRESHOLD = 30      # Lower values pick up more movement
REDUCED_SIZE = (400, 600)
MORPH_KERNEL = np.ones((10, 10), np.uint8)

def reduce_image(input_image):
    """Make the image easier to deal with."""
    reduced = cv2.resize(input_image, REDUCED_SIZE)
    reduced = cv2.cvtColor(reduced, cv2.COLOR_BGR2GRAY)
    return reduced

# Initialise
vid = cv2.VideoCapture(0)
average = None
old_sizes = np.zeros(20)
size_update_index = 0

while (True):
    got_frame, frame = vid.read()
    if got_frame:
        # Reduce image
        reduced = reduce_image(frame)
        if average is None: average = np.float32(reduced)

        # Get background
        cv2.accumulateWeighted(reduced, average, AVERAGE_ALPHA)
        background = cv2.convertScaleAbs(average)

        # Get thresholded difference image
        movement = cv2.absdiff(reduced, background)
        _, threshold = cv2.threshold(movement, MOVEMENT_THRESHOLD, 255, cv2.THRESH_BINARY)

        # Apply morphology to help find object
        dilated = cv2.dilate(threshold, MORPH_KERNEL, iterations=10)
        closed = cv2.morphologyEx(dilated, cv2.MORPH_CLOSE, MORPH_KERNEL)

        # Get contours
        contours, _ = cv2.findContours(closed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        cv2.drawContours(closed, contours, -1, (150, 150, 150), -1)

        # Find biggest bounding rectangle
        areas = [cv2.contourArea(c) for c in contours]
        if (areas != list()):
            max_index = np.argmax(areas)
            max_cont = contours[max_index]
            x, y, w, h = cv2.boundingRect(max_cont)
            cv2.rectangle(closed, (x, y), (x + w, y + h), (255, 255, 255), 5)

            # Guess movement direction
            size = w * h
            if size > old_sizes.mean():
                print("Towards")
            else:
                print("Away")

            # Update object size
            old_sizes[size_update_index] = size
            size_update_index += 1
            if (size_update_index) >= len(old_sizes): size_update_index = 0

        # Display image
        cv2.imshow('RaptorVision', closed)
Obviously this needs more work in terms of identifying, selecting and tracking the objects etc (at the moment it does horribly if there is something else moving in the background). There are also many parameters to vary and tweak (the ones set are what worked well for my system). I'll leave that up to you though.
Some links:
background extraction
motion tracking
If you want to get a bit more high-tech with the background removal, have a look here:
wallflower
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Good idea.
If you want to use the contour detection approach, you could do it the following way:
You have a series of images I1, I2, ..., In.
Do a contour detection on each one: C1, C2, ..., Cn (a contour is a set of points in OpenCV).
Take a large enough sample of points on each contour: S_i ⊆ C_i, i ∈ 1...n.
For all points in your sample, check for the nearest point on contour i+1. Then you get trajectories for all your points (a sketch of this matching step follows below).
Check whether these trajectories point mostly outwards (tricky part ;)).
If they point outwards for a sufficient number of frames, your contour got bigger.
Alternatively, you could try to prune the points that are not part of the correct contour and work with a covering rectangle. It's very easy to check the size that way, but I don't know how easy it will be to choose the "correct" points.
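Here is a rough sketch of the nearest-point matching mentioned in the list above, using the previous contour's centroid as the reference for "outwards" (sample_step is an arbitrary parameter):

import cv2
import numpy as np

def fraction_moving_outwards(contour_prev, contour_next, sample_step=10):
    """Match sampled points of the previous contour to their nearest point on the
    next contour and report how many of them moved away from the previous centroid."""
    prev_pts = contour_prev.reshape(-1, 2)[::sample_step].astype(float)
    next_pts = contour_next.reshape(-1, 2).astype(float)
    centroid = prev_pts.mean(axis=0)
    outward = 0
    for p in prev_pts:
        # Nearest point on the next contour
        q = next_pts[np.argmin(np.linalg.norm(next_pts - p, axis=1))]
        # Did the point move further from the centroid?
        if np.linalg.norm(q - centroid) > np.linalg.norm(p - centroid):
            outward += 1
    return outward / float(len(prev_pts))

If the returned fraction stays well above 0.5 for a sufficient number of frames, the contour is growing, i.e. the object is probably approaching.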