I have the following code that finds the optical flow between 2 images (or 2 frames of a video) and colour-codes the result. What I want is the horizontal and vertical components of the optical flow separately (as in separate images).
Here is the code I have so far:
import cv2
import numpy as np
frame1 = cv2.imread('my1.bmp')
frame2 = cv2.imread('my2.bmp')
prvs = cv2.cvtColor(frame1,cv2.COLOR_BGR2GRAY)
next = cv2.cvtColor(frame2,cv2.COLOR_BGR2GRAY)
hsv = np.zeros_like(frame1)
hsv[...,1] = 255
while(1):
    next = cv2.cvtColor(frame2,cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prvs, next, 0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[...,0], flow[...,1])
    hsv[...,0] = ang*180/np.pi/2
    hsv[...,2] = cv2.normalize(mag,None,0,255,cv2.NORM_MINMAX)
    rgb = cv2.cvtColor(hsv,cv2.COLOR_HSV2BGR)
    cv2.imshow('frame2',rgb)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
    elif k == ord('s'):
        cv2.imwrite('opticalmyhsv.pgm',rgb)
cap.release()
cv2.destroyAllWindows()
This is what the optical flow looks like given my two images:
If you want to visualize the horizontal and vertical components separately, you can display each as a grayscale image. I'll make it such that a mid-gray denotes no motion, black denotes the maximum amount of motion in the frame going to the left (negative), and white denotes the maximum amount of motion in the frame going towards the right (positive).
The output of calcOpticalFlowFarneback is a 3D numpy array where the first slice denotes the amount of horizontal (x) displacement while the second slice denotes the amount of vertical (y) displacement.
As such, all you need to do is define two separate 2D numpy arrays that will store these values so we can display them to the user. However, you're going to need to normalize the flow for display such that no motion is roughly mid-gray, motion to the extreme left is black (intensity 0), and motion to the extreme right is white (intensity 255).
Therefore, all you would need to do is modify your code to show two OpenCV windows for the horizontal and vertical motion like so:
import cv2
import numpy as np
frame1 = cv2.imread('my1.bmp')
frame2 = cv2.imread('my2.bmp')
prvs = cv2.cvtColor(frame1,cv2.COLOR_BGR2GRAY)
next = cv2.cvtColor(frame2,cv2.COLOR_BGR2GRAY)
flow = cv2.calcOpticalFlowFarneback(prvs, next, 0.5, 3, 15, 3, 5, 1.2, 0)
# Change here
horz = cv2.normalize(flow[...,0], None, 0, 255, cv2.NORM_MINMAX)
vert = cv2.normalize(flow[...,1], None, 0, 255, cv2.NORM_MINMAX)
horz = horz.astype('uint8')
vert = vert.astype('uint8')
# Change here too
cv2.imshow('Horizontal Component', horz)
cv2.imshow('Vertical Component', vert)
k = cv2.waitKey(0) & 0xff
if k == ord('s'): # Change here
    cv2.imwrite('opticalflow_horz.pgm', horz)
    cv2.imwrite('opticalflow_vert.pgm', vert)
cv2.destroyAllWindows()
I've modified the code so that there is no while loop, as you're only finding the optical flow between two predetermined frames. You're not grabbing frames off of a live source like a camera, so we can simply show both images without looping. I've set the wait time for waitKey to 0 so that you wait indefinitely until you press a key. This pretty much simulates your while-loop behaviour from before, but it doesn't burden your CPU needlessly with wasted cycles. I've also removed some unnecessary variables, like the hsv variable, as we aren't displaying the colour-coded result. We also compute the optical flow just once.
In any case, with the above code we compute the optical flow, extract the horizontal and vertical components separately, normalize the components to the range [0,255], cast them to uint8 so that we can display them, and then show the results. I've also modified your code so that if you want to save the components, it saves the horizontal and vertical components as two separate images.
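One thing to be aware of: cv2.normalize with NORM_MINMAX stretches whatever minimum and maximum happen to be present in each frame, so zero motion won't always land exactly on mid-gray. If you want gray to strictly mean "no motion", a minimal sketch of a symmetric scaling around zero (my own suggestion, not part of the code above) would be:
import numpy as np

def flow_component_to_gray(comp):
    """Map one flow component to uint8 so that 0 -> 128 (gray),
    the largest negative value -> 0 (black) and the largest positive -> 255 (white)."""
    max_abs = np.abs(comp).max()
    if max_abs == 0:
        return np.full(comp.shape, 128, dtype=np.uint8)
    scaled = 128 + 127 * (comp / max_abs)
    return np.clip(scaled, 0, 255).astype(np.uint8)

# for example: horz = flow_component_to_gray(flow[..., 0])
#              vert = flow_component_to_gray(flow[..., 1])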
Edit
In your comments, you want to display a sequence of images using the same logic we have created above. You have a list of file names that you want to cycle through. That isn't very difficult to do. Simply take your strings and put them into a list and compute the optical flow between pairs of images by using the file names stored in this list. I'll modify the code such that when we reach the last element of the list, we will wait for the user to push something. Until then, we will cycle through each pair of images until the end. In other words:
import cv2
import numpy as np
# Create list of names here from my1.bmp up to my20.bmp
list_names = ['my' + str(i+1) + '.bmp' for i in range(20)]
# Read in the first frame
frame1 = cv2.imread(list_names[0])
prvs = cv2.cvtColor(frame1,cv2.COLOR_BGR2GRAY)
# Set counter to read the second frame at the start
counter = 1
# Until we reach the end of the list...
while counter < len(list_names):
    # Read the next frame in
    frame2 = cv2.imread(list_names[counter])
    next = cv2.cvtColor(frame2,cv2.COLOR_BGR2GRAY)
    # Calculate optical flow between the two frames
    flow = cv2.calcOpticalFlowFarneback(prvs, next, 0.5, 3, 15, 3, 5, 1.2, 0)
    # Normalize horizontal and vertical components
    horz = cv2.normalize(flow[...,0], None, 0, 255, cv2.NORM_MINMAX)
    vert = cv2.normalize(flow[...,1], None, 0, 255, cv2.NORM_MINMAX)
    horz = horz.astype('uint8')
    vert = vert.astype('uint8')
    # Show the components as images
    cv2.imshow('Horizontal Component', horz)
    cv2.imshow('Vertical Component', vert)
    # Change - Make next frame previous frame
    prvs = next.copy()
    # If we get to the end of the list, simply wait indefinitely
    # for the user to push something
    if counter == len(list_names)-1:
        k = cv2.waitKey(0) & 0xff
    else: # Else, wait for 1 second for a key
        k = cv2.waitKey(1000) & 0xff
    if k == 27:
        break
    elif k == ord('s'): # Change
        cv2.imwrite('opticalflow_horz' + str(counter) + '-' + str(counter+1) + '.pgm', horz)
        cv2.imwrite('opticalflow_vert' + str(counter) + '-' + str(counter+1) + '.pgm', vert)
    # Increment counter to go to next frame
    counter += 1
cv2.destroyAllWindows()
The above code will cycle through pairs of frames and wait 1 second between each pair, giving you the opportunity to either break out of the display or save the horizontal and vertical components to file. Bear in mind that whatever frames you save are indexed with two numbers that tell you which pair of frames they correspond to. Before the next iteration happens, the next frame becomes the previous frame, so prvs gets replaced by a copy of next. At the beginning of the loop, the next frame gets read in appropriately.
Hope this helps. Good luck!
Related
I have a script in python which acts as a motion detector. I read a video file using cv2, convert to grayscale, and do simple background subtraction from the current frame to detect motion, which I draw a rectangle over. The video is eventually saved as a new file, where I can finally view it.
This works fine, except sometimes the starting frame (background frame) already has motion in it, or sometimes there are features in the background which move but which I don't want to detect (e.g. if I were detecting people, I wouldn't be interested in a flag blowing in the breeze). So I want to somehow disregard 'stationary' movement (i.e. motion which does not move vertically/horizontally over the course of the video). However, I'm having trouble with my approach; there don't seem to be any functions or scripts on the internet that solve this.
One idea I had was to draw a larger rectangle over the original, and then if the original rectangle doesn't leave the outer rectangle (which stays put) over the video, then that motion can be cancelled altogether. I have no idea how to implement this. I have managed to draw a larger rectangle, but it follows the original and doesn't stay in place.
Does anyone have any idea how I might be able to do this? Or any resources they could point me in? Thank you. Below is my code starting from when I draw the rectangles.
for c in cnts:
    # if the contour is too small, ignore it
    if cv2.contourArea(c) < min_area:
        continue
    # compute the bounding box for the contour, draw it on the frame, and update the text
    (x, y, w, h) = cv2.boundingRect(c)
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    text = "Occupied" # frame is occupied
    half_w = int(w/2) # get 50% sizing width
    half_h = int(h/2) # get 50% sizing height
    x_surr = int(x - (half_w/2))
    y_surr = int(y - (half_h/2))
    w_surr = (w + half_w)
    h_surr = (h + half_h)
    cv2.rectangle(frame, (x_surr, y_surr), (x_surr + w_surr, y_surr + h_surr), (255, 255, 255), 2)
I think this code might help you. Basically, it compares the value of each pixel in the current frame to the corresponding value of that pixel in the average of the previous n frames. When no motion is present, the difference is all black. When there is motion, the difference shows the colour of the moving object. Since it keeps a running average of recent frames, you should be able to filter out slight movements such as flags fluttering. You will probably need to play around with some thresholding on the final image to get the result you want.
Stillness:
Motion:
import cv2
def main():
    # define the number of recent frames to keep track of
    NUMBER_FRAMES_TO_TRACK = 30
    # start the webcam
    cap = cv2.VideoCapture(1)
    ret, frame = cap.read()
    if ret == False:
        print("No webcam detected.")
        return
    # generate a list of recent frames
    recent_frames = [frame for n in range(NUMBER_FRAMES_TO_TRACK)]
    # start the video loop
    while True:
        ret, frame = cap.read()
        if ret == False:
            break
        # update the list of recent frames with the most recent frame
        recent_frames = recent_frames[1:]
        recent_frames.append(frame)
        # calculate the average of all recent frames
        average = recent_frames[0]
        for i in range(len(recent_frames)):
            if i == 0:
                pass
            else:
                alpha = 1.0/(i + 1)
                beta = 1.0 - alpha
                average = cv2.addWeighted(recent_frames[i], alpha, average, beta, 0.0)
        # find the difference between the current frame and the average of recent frames
        difference = cv2.subtract(frame, average)
        # show the results
        cv2.imshow("video", frame)
        cv2.imshow("average", average)
        cv2.imshow("difference", difference)
        key = cv2.waitKey(1)
        if key == ord('q'):
            break
    cv2.destroyAllWindows()
    cap.release()

if __name__ == "__main__":
    main()
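As an aside, OpenCV has cv2.accumulateWeighted, which maintains the running average in one call instead of re-averaging the whole list every frame. A rough sketch of that variation (the camera index and the alpha value are assumptions you would tune):
import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # assumed camera index
ret, frame = cap.read()
average = np.float32(frame)  # running average must be a float image

while ret:
    ret, frame = cap.read()
    if not ret:
        break
    # blend the new frame into the running average; alpha controls how fast it adapts
    cv2.accumulateWeighted(frame, average, 0.05)
    background = cv2.convertScaleAbs(average)  # convert back to uint8 for display/diff
    difference = cv2.absdiff(frame, background)
    cv2.imshow("difference", difference)
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()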
Good day. In the following code, I am able to crop a rectangular ROI from the first frame.
The final outcome of this while loop is an ROI stored as a numpy array named "monitor_region".
video = cv2.VideoCapture("Rob.mp4")
ret, frame = video.read()
roi_status = False
while(roi_status == False):
    roi = cv2.selectROI("Region Selection by ROI", frame, False)
    if(not all(roi)):
        print("Undefined monitor region.")
        continue
    monitor_region = frame[int(roi[1]):int(roi[1]+roi[3]),int(roi[0]):int(roi[0]+roi[2])]
    cv2.imshow("Selected Region", monitor_region)
    if(cv2.waitKey(0) & 0xFF == 8): #backspace to save
        print("Monitor region has been saved.")
        roi_status = True
cv2.destroyAllWindows()
Since this "monitor_region" is a rectangular portion of the entire frame, I am looking for a feasible way to obtain its left and top coordinates in order to define a range for checking (as in the illustration provided). In the following code, I am able to get the width and height of the ROI.
monitor_width = monitor_region.shape[1]
monitor_height = monitor_region.shape[0]
However, I still lack the left and top points. Once I have obtained both the top and left points of the ROI, I can perform the x and y check below, which I use to determine whether the object exists within the ROI or not.
if((monitor_left < Point_x < (monitor_left + monitor_width)) and (monitor_top < Point_y < (monitor_top + monitor_height))):
selectROI returns a tuple of exactly the values you seek: the left bound, the top bound, the width and the height of the selected ROI.
If you print(roi) you will get something like (100,150,300,200)
You can unpack the values like this:
x,y,width,height = roi
Where x is the left bound and y is the top bound.
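To tie this back to the check in the question, a minimal sketch could look like the following (Point_x and Point_y are the coordinates of the object being checked, taken from the question):
# roi comes from cv2.selectROI: (left, top, width, height)
monitor_left, monitor_top, monitor_width, monitor_height = roi

# Point_x, Point_y: coordinates of the object being checked
if((monitor_left < Point_x < (monitor_left + monitor_width)) and (monitor_top < Point_y < (monitor_top + monitor_height))):
    print("Object is inside the monitor region.")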
def run(self):
    while True:
        _ret, frame = self.cam.read()
        frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        vis = frame.copy()
        if len(self.tracks) > 0:
            img0, img1 = self.prev_gray, frame_gray
            p0 = np.float32([tr[-1] for tr in self.tracks]).reshape(-1, 1, 2)
            p1, _st, _err = cv2.calcOpticalFlowPyrLK(img0, img1, p0, None, **lk_params)
            p0r, _st, _err = cv2.calcOpticalFlowPyrLK(img1, img0, p1, None, **lk_params)
            d = abs(p0-p0r).reshape(-1, 2).max(-1)
            good = d < 1
            new_tracks = []
            for i in range(len(p1)):
                A.append(math.sqrt((p1[i][0][0])**2 + (p1[i][0][1])**2))
            counts, bins, bars = plt.hist(A)
            for tr, (x, y), good_flag in zip(self.tracks, p1.reshape(-1, 2), good):
                if not good_flag:
                    continue
                tr.append((x, y))
                if len(tr) > self.track_len:
                    del tr[0]
                new_tracks.append(tr)
                cv2.circle(vis, (x, y), 2, (0, 255, 0), -1)
            self.tracks = new_tracks
            cv2.polylines(vis, [np.int32(tr) for tr in self.tracks], False, (0, 255, 0))
            draw_str(vis, (20, 20), 'track count: %d' % len(self.tracks))
        if self.frame_idx % self.detect_interval == 0:
            mask = np.zeros_like(frame_gray)
            mask[:] = 255
            for x, y in [np.int32(tr[-1]) for tr in self.tracks]:
                cv2.circle(mask, (x, y), 5, 0, -1)
            p = cv2.goodFeaturesToTrack(frame_gray, mask = mask, **feature_params)
            if p is not None:
                for x, y in np.float32(p).reshape(-1, 2):
                    self.tracks.append([(x, y)])
        self.frame_idx += 1
        self.prev_gray = frame_gray
        cv2.imshow('lk_track', vis)
        ch = cv2.waitKey(1)
        if ch == 27:
            break
I am using lk_track.py from the OpenCV samples to try to detect a moving object. I am trying to find the camera motion using the histogram of the magnitudes of the optical flow vectors, and then calculating the average of the similar values, which should be directly proportional to the camera motion. I have calculated the magnitudes of the vectors and saved them in a list A. Can someone suggest how to find the most frequent similar values from it and calculate the average of only those values?
I created a toy problem to model the approach of binarizing the images by optical flow. This is a massively simplified view of the problem, but gives the general idea well. I'll split the problem up into a few chunks and give functions for them. If you're working directly with video, there will be a lot of additional code needed of course, and I just hardcoded a lot of values that you'll need to turn into parameters.
The first function is just for generating the image sequence: a view moving through a scene with an object inside it. The sequence simply translates through the scene and the object appears stationary in it, which of course means the object is actually moving in the opposite direction of the camera.
import numpy as np
import cv2
def gen_seq():
    """Generate motion sequence with an object"""
    scene = cv2.GaussianBlur(np.uint8(255*np.random.rand(400, 500)), (21, 21), 3)
    h, w = 400, 400
    step = 4
    obj_mask = np.zeros((h, w), bool)
    obj_h, obj_w = 50, 50
    obj_x, obj_y = 175, 175
    obj_mask[obj_y:obj_y+obj_h, obj_x:obj_x+obj_w] = True
    obj_data = np.uint8(255*np.random.rand(obj_h, obj_w)).ravel()
    imgs = []
    for i in range(0, 1+w//step, step):
        img = scene[:, i:i+w].copy()
        img[obj_mask] = obj_data
        imgs.append(img)
    return imgs
# generate image sequence
imgs = gen_seq()
# display images
for img in imgs:
    cv2.imshow('Image', img)
    k = cv2.waitKey(100) & 0xFF
    if k == ord('q'):
        break
cv2.destroyWindow('Image')
So here's the basic image sequence visualized. I just used a random scene, translated through, and added a random object in the center.
Great! Now we need to calculate the flow between each frame. I used dense flow here, but sparse flow would be more robust for actual images.
def find_flows(imgs):
    """Finds the dense optical flows"""
    optflow_params = [0.5, 3, 15, 3, 5, 1.2, 0]
    prev = imgs[0]
    flows = []
    for img in imgs[1:]:
        flow = cv2.calcOpticalFlowFarneback(prev, img, None, *optflow_params)
        flows.append(flow)
        prev = img
    return flows
# find optical flows between images
flows = find_flows(imgs)
# display flows
h, w = imgs[0].shape[:2]
hsv = np.zeros((h, w, 3), dtype=np.uint8)
hsv[..., 1] = 255
for flow in flows:
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang*180/np.pi/2
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    cv2.imshow('Flow', rgb)
    k = cv2.waitKey(100) & 0xFF
    if k == ord('q'):
        break
cv2.destroyWindow('Flow')
Here I colorized the flow based on its angle and magnitude. The angle determines the color and the magnitude determines the intensity/brightness of the color. This is the same view the OpenCV tutorial on dense optical flow uses.
Then, we need to binarize this flow so that we get two distinct sets of pixels based on how they're moving. In the sparse case, this works out the same except you will get two distinct sets of features.
def label_flows(flows):
    """Binarizes the flows by direction and magnitude"""
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    flags = cv2.KMEANS_RANDOM_CENTERS
    h, w = flows[0].shape[:2]
    labeled_flows = []
    for flow in flows:
        flow = flow.reshape(h*w, -1)
        comp, labels, centers = cv2.kmeans(flow, 2, None, criteria, 10, flags)
        n = np.sum(labels == 1)
        camera_motion_label = np.argmax([labels.size-n, n])
        labeled = np.uint8(255*(labels.reshape(h, w) == camera_motion_label))
        labeled_flows.append(labeled)
    return labeled_flows
# binarize the flows
labeled_flows = label_flows(flows)
# display binarized flows
for labeled_flow in labeled_flows:
    cv2.imshow('Labeled Flow', labeled_flow)
    k = cv2.waitKey(100) & 0xFF
    if k == ord('q'):
        break
cv2.destroyWindow('Labeled Flow')
The annoying thing here is that the labels will be set randomly, i.e. the labels will be different for each frame. If you visualized the binary image, it would flip between black and white randomly. I'm only using binary labels, 0 and 1, so what I did was consider the label that is assigned to more pixels to be the "camera motion label", set that label to white in the resulting images and the other label to black; that way the camera motion label is always the same in each frame. This may need to be much more sophisticated for working on a video feed.
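If you need something a bit more stable on a real video feed, one idea (just a sketch of my own, not part of the code above) is to keep the k-means centers from the previous frame and flip the labels whenever swapping them matches the previous centers better:
import numpy as np

def stabilize_labels(labels, centers, prev_centers):
    """Reorder two k-means clusters so cluster 0 stays consistent across frames."""
    if prev_centers is None:
        return labels, centers
    keep = (np.linalg.norm(centers[0] - prev_centers[0]) +
            np.linalg.norm(centers[1] - prev_centers[1]))
    swap = (np.linalg.norm(centers[1] - prev_centers[0]) +
            np.linalg.norm(centers[0] - prev_centers[1]))
    if swap < keep:
        labels = 1 - labels       # flip the 0/1 labels
        centers = centers[::-1]   # and the corresponding centers
    return labels, centers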
But here we have it, a binarized flow where the color is just showing the two distinct sets of flow vectors.
Now if we wanted to find the target in this flow, we could invert the image and find the connected components of the binary image. The inversion will make the camera motion the background label (0). Then each of the black blobs will be white and will be labeled, and we could find the blob relating to the largest component which, in this case, will be the target. That will give a mask around the target, and we can draw the contours of that mask on the original images to see the target being detected. I'll also cut the borders of the image off before finding the connected components so edge effects from dense flow are ignored.
def find_target_in_labeled_flow(labeled_flow):
    labeled_flow = cv2.bitwise_not(labeled_flow)
    bw = 10
    h, w = labeled_flow.shape[:2]
    border_cut = labeled_flow[bw:h-bw, bw:w-bw]
    conncomp, stats = cv2.connectedComponentsWithStats(border_cut, connectivity=8)[1:3]
    target_label = np.argmax(stats[1:, cv2.CC_STAT_AREA]) + 1
    img = np.zeros_like(labeled_flow)
    img[bw:h-bw, bw:w-bw] = 255*(conncomp == target_label)
    return img

for labeled_flow, img in zip(labeled_flows, imgs[:-1]):
    target_mask = find_target_in_labeled_flow(labeled_flow)
    display_img = cv2.merge([img, img, img])
    contours = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[1]
    display_img = cv2.drawContours(display_img, contours, -1, (0, 255, 0), 2)
    cv2.imshow('Detected Target', display_img)
    k = cv2.waitKey(100) & 0xFF
    if k == ord('q'):
        break
And of course this could get some cleaning up, and you won't be doing exactly this for sparse flow. You could just define a region of interest around the tracked points.
Now, there is still a lot of work to do. You have a binarized flow, and you can probably safely assume that the label which occurs most frequently is the camera motion (like I did). However, you'll have to make sure that the other label is the object you're interested in tracking. You'll have to keep track of it between flows so that if it stops moving, you'll know where it is while the camera is moving. When you do the k-means step, you'll want to make sure that the centers from k-means are "far enough" apart so that you know whether the object is moving or not.
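For the "far enough" check, a minimal sketch could be the following, where centers comes from the cv2.kmeans call above and the threshold is an arbitrary value you would have to tune:
import numpy as np

# centers from cv2.kmeans on the flow vectors: two (dx, dy) cluster centers
center_separation = np.linalg.norm(centers[0] - centers[1])
MIN_SEPARATION = 1.0  # assumed threshold, in pixels of flow per frame
object_is_moving = center_separation > MIN_SEPARATION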
The basic steps for that would be, from the starting frame of the video:
If the two centers are "close", then you can assume your object is either not in the scene or not moving in the scene.
Once the centers are split enough apart, you'll have found the object to track. Keep track of the location of the object.
During tracking of the object, verify that the location is near a prediction. You can use the optical flow velocity vectors from the previous frame to predict the location of each pixel/feature in the new frame (see the sketch after this list), so make sure your predictions agree with your tracking result.
If the object stops moving, the centers from k-means should be close. Keep track of the optical flow vectors around the object location and follow them to have a prediction of where the object is again once it resumes moving, and again verify the detected location with this prediction.
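As for the prediction step mentioned above, the simplest version is to shift each tracked point by the flow vector sampled at that point. This is only a sketch, assuming you have the dense flow of the previous frame and a list of (x, y) point locations:
import numpy as np

def predict_points(points, flow):
    """Predict where (x, y) points move to, using the dense flow of the previous frame."""
    predicted = []
    for x, y in points:
        dx, dy = flow[int(y), int(x)]  # flow is indexed (row, col) = (y, x)
        predicted.append((x + dx, y + dy))
    return np.array(predicted)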
I've never used these methods before, so I'm not sure how robust they are. The typical approach for HOOF, or "Histogram of Oriented Optical Flow", is much more advanced than this (see the seminal paper here). Instead of just binarizing, the idea is to use histograms from each frame as a probability distribution, and the way this probability distribution changes over time can be analyzed with tools from time series analysis, which I assume gives a more robust framework for this approach.
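As a rough illustration of that histogram idea (not the descriptor from the paper, just a sketch), you could build a magnitude-weighted histogram of flow orientations for each frame and then compare those distributions over time:
import cv2
import numpy as np

def flow_orientation_histogram(flow, bins=30):
    """Magnitude-weighted histogram of flow directions, normalized to sum to 1."""
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hist, _ = np.histogram(ang.ravel(), bins=bins, range=(0, 2*np.pi),
                           weights=mag.ravel())
    total = hist.sum()
    return hist / total if total > 0 else hist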
To avoid the following error with #alkasm's answer:
(-215:Assertion failed) npoints > 0 in function 'drawContours'
simply replace:
contours = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[1]
with
contours, _ = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
I can't post this as a comment below that answer because my account is new and has low reputation.
Updated question:
Would anyone be able to point me in the direction of any material that could help me to plot an optical flow map in Python? Ideally I want to find something that provides a similar output to the video shown here: http://study.marearts.com/2014/04/opencv-study-calcopticalflowfarneback.html, or something with a similar functional output.
I have implemented the dense optical flow algorithm (cv2.calcOpticalFlowFarneback), and from this I have been able to sample the magnitudes at specified points of the image.
The video feed that is being input is 640x480, and I have set sample points at every fifth pixel vertically and horizontally.
import cv2
import numpy as np
import matplotlib.pyplot as plt
cap = cv2.VideoCapture("T5.avi")
ret, frame1 = cap.read()
prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
hsv = np.zeros_like(frame1)
hsv[..., 1] = 255
[R,C]=prvs.shape
count=0
while (1):
    ret, frame2 = cap.read()
    next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prvs, next, None, 0.5, 3, 15, 2, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])

    RV = np.arange(5, 480, 5)
    CV = np.arange(5, 640, 5)
    # These give arrays of points to sample at increments of 5
    if count == 0:
        count = 1  # so that the following creation is only done once
        [Y, X] = np.meshgrid(CV, RV)
        # makes an x and y array of the points specified at sample increments

    temp = mag[np.ix_(RV, CV)]
    # this makes a temp array that stores the magnitude of flow at each of the sample points
    motionvectors = np.array((Y[:], X[:], Y[:]+temp.real[:], X[:]+temp.imag[:]))

    Ydist = motionvectors[0, :, :] - motionvectors[2, :, :]
    Xdist = motionvectors[1, :, :] - motionvectors[3, :, :]
    Xoriginal = X - Xdist
    Yoriginal = Y - Ydist

    plot2 = plt.figure()
    plt.quiver(Xoriginal, Yoriginal, X, Y,
               color='Teal',
               headlength=7)
    plt.title('Quiver Plot, Single Colour')
    plt.show(plot2)

    hsv[..., 0] = ang * 180 / np.pi / 2
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    cv2.imshow('frame2', bgr)

    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
    prvs = next

cap.release()
cv2.destroyAllWindows()
I think I have calculated the original and final X,Y positions of the pixels and the distances they moved, and I have put these into a matplotlib quiver plot.
The result I get does not coincide with the HSV plot of the dense optical flow (which I know to be correct, as it was taken from the OpenCV tutorials), and the quiver plot only shows one frame at a time; the plot must be closed before the next one displays.
Can anyone see where I have gone wrong in my calculations, and how I can make the plot update automatically with each frame?
I do not know how to change the behaviour of matplotlib quiver plots, but I'm sure it is possible.
An alternative is to create a function to draw lines on top of the original image, based on the calculated optical flow. The following code should achieve this:
def dispOpticalFlow( Image,Flow,Divisor,name ):
    "Display image with a visualisation of a flow over the top. A divisor controls the density of the quiver plot."
    PictureShape = np.shape(Image)
    #determine number of quiver points there will be
    Imax = int(PictureShape[0]/Divisor)
    Jmax = int(PictureShape[1]/Divisor)
    #create a blank mask, on which lines will be drawn.
    mask = np.zeros_like(Image)
    for i in range(1, Imax):
        for j in range(1, Jmax):
            X1 = (i)*Divisor
            Y1 = (j)*Divisor
            X2 = int(X1 + Flow[X1,Y1,1])
            Y2 = int(Y1 + Flow[X1,Y1,0])
            X2 = np.clip(X2, 0, PictureShape[0])
            Y2 = np.clip(Y2, 0, PictureShape[1])
            #add all the lines to the mask
            mask = cv2.line(mask, (Y1,X1),(Y2,X2), [255, 255, 255], 1)
    #superpose lines onto image
    img = cv2.add(Image,mask)
    #print image
    cv2.imshow(name,img)
    return []
This code only creates lines rather than arrows, but with some effort it could be modified to display arrows.
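If you do want arrows, cv2.arrowedLine (available in OpenCV 3.0 and later) has essentially the same signature as cv2.line, so the change should only touch one line; a sketch of the replacement inside the loop above:
# inside the inner loop of dispOpticalFlow, instead of cv2.line:
mask = cv2.arrowedLine(mask, (Y1,X1), (Y2,X2), [255, 255, 255], 1, tipLength=0.3)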
I'm still hacking together a book scanning script, and for now, all I need is to be able to automagically detect a page turn. The book fills up 90% of the screen (I'm using a cruddy webcam for the motion detection), so when I turn a page, the direction of motion is basically in that same direction.
I have modified a motion-tracking script, but derivatives are getting me nowhere:
#!/usr/bin/env python
import cv, numpy
class Target:
    def __init__(self):
        self.capture = cv.CaptureFromCAM(0)
        cv.NamedWindow("Target", 1)

    def run(self):
        # Capture first frame to get size
        frame = cv.QueryFrame(self.capture)
        frame_size = cv.GetSize(frame)
        grey_image = cv.CreateImage(cv.GetSize(frame), cv.IPL_DEPTH_8U, 1)
        moving_average = cv.CreateImage(cv.GetSize(frame), cv.IPL_DEPTH_32F, 3)
        difference = None
        movement = []

        while True:
            # Capture frame from webcam
            color_image = cv.QueryFrame(self.capture)

            # Smooth to get rid of false positives
            cv.Smooth(color_image, color_image, cv.CV_GAUSSIAN, 3, 0)

            if not difference:
                # Initialize
                difference = cv.CloneImage(color_image)
                temp = cv.CloneImage(color_image)
                cv.ConvertScale(color_image, moving_average, 1.0, 0.0)
            else:
                cv.RunningAvg(color_image, moving_average, 0.020, None)

            # Convert the scale of the moving average.
            cv.ConvertScale(moving_average, temp, 1.0, 0.0)

            # Minus the current frame from the moving average.
            cv.AbsDiff(color_image, temp, difference)

            # Convert the image to grayscale.
            cv.CvtColor(difference, grey_image, cv.CV_RGB2GRAY)

            # Convert the image to black and white.
            cv.Threshold(grey_image, grey_image, 70, 255, cv.CV_THRESH_BINARY)

            # Dilate and erode to get object blobs
            cv.Dilate(grey_image, grey_image, None, 18)
            cv.Erode(grey_image, grey_image, None, 10)

            # Calculate movements
            storage = cv.CreateMemStorage(0)
            contour = cv.FindContours(grey_image, storage, cv.CV_RETR_CCOMP, cv.CV_CHAIN_APPROX_SIMPLE)
            points = []

            while contour:
                # Draw rectangles
                bound_rect = cv.BoundingRect(list(contour))
                contour = contour.h_next()
                pt1 = (bound_rect[0], bound_rect[1])
                pt2 = (bound_rect[0] + bound_rect[2], bound_rect[1] + bound_rect[3])
                points.append(pt1)
                points.append(pt2)
                cv.Rectangle(color_image, pt1, pt2, cv.CV_RGB(255,0,0), 1)

            num_points = len(points)
            if num_points:
                x = 0
                for point in points:
                    x += point[0]
                x /= num_points
                movement.append(x)

            if len(movement) > 0 and numpy.average(numpy.diff(movement[-30:-1])) > 0:
                print 'Left'
            else:
                print 'Right'

            # Display frame to user
            cv.ShowImage("Target", color_image)

            # Listen for ESC or ENTER key
            c = cv.WaitKey(7) % 0x100
            if c == 27 or c == 10:
                break

if __name__=="__main__":
    t = Target()
    t.run()
It detects the average motion of the average center of all of the boxes, which is extremely inefficient. How would I go about detecting such motions quickly and accurately (i.e. within a threshold)?
I'm using Python, and I plan to stick with it, as my whole framework is based on Python.
Any help is appreciated, so thank you all in advance. Cheers.
I haven't used OpenCV in Python before, just a bit in C++ with openframeworks.
For this I presume OpticalFlow's velx,vely properties would work.
For more on how Optical Flow works check out this paper.
HTH
Why don't you use cv.GoodFeaturesToTrack? It may improve the script's runtime and shorten the code.
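For what that might look like with the newer cv2 API, here is only a sketch of the idea (the camera index, feature parameters and thresholds are assumptions you would tune, not a drop-in replacement for the script above): track corners with cv2.goodFeaturesToTrack, follow them with cv2.calcOpticalFlowPyrLK, and use the sign of the average horizontal displacement to decide left vs. right.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # assumed camera index
ret, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # find corners in the previous frame and track them into the current one
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01, minDistance=10)
    if p0 is not None:
        p1, st, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None)
        good_new = p1[st == 1]
        good_old = p0[st == 1]
        if len(good_new) > 0:
            dx = np.mean(good_new[:, 0] - good_old[:, 0])  # average horizontal motion
            if dx < -2:        # thresholds are arbitrary and need tuning
                print("Page turning left")
            elif dx > 2:
                print("Page turning right")

    prev_gray = gray
    if cv2.waitKey(7) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()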