Create FlowMap in Python OpenCV - python

Updated question:
Would anyone be able to point me in the direction of any material that could help me to plot an optical flow map in python? Ideally i want to find something that provides a similar output to the video shown here: http://study.marearts.com/2014/04/opencv-study-calcopticalflowfarneback.html . Or something with a similar functional output
I have implemented the dense optical flow algorithm (cv2.calcOpticalFlowFarneback). And from this i have been able to sample the magnitudes at specified points of the image.
The video feed that is being input is 640x480, and i have set sample points to be at every fifth pixel vertically and horizontally.
import cv2
import numpy as np
import matplotlib.pyplot as plt
cap = cv2.VideoCapture("T5.avi")
ret, frame1 = cap.read()
prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
hsv = np.zeros_like(frame1)
hsv[..., 1] = 255
[R,C]=prvs.shape
count=0
while (1):
ret, frame2 = cap.read()
next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
flow = cv2.calcOpticalFlowFarneback(prvs, next, None, 0.5, 3, 15, 2, 5, 1.2, 0)
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
RV=np.arange(5,480,5)
CV=np.arange(5,640,5)
# These give arrays of points to sample at increments of 5
if count==0:
count =1 #so that the following creation is only done once
[Y,X]=np.meshgrid(CV,RV)
# makes an x and y array of the points specified at sample increments
temp =mag[np.ix_(RV,CV)]
# this makes a temp array that stores the magnitude of flow at each of the sample points
motionvectors=np.array((Y[:],X[:],Y[:]+temp.real[:],X[:]+temp.imag[:]))
Ydist=motionvectors[0,:,:]- motionvectors[2,:,:]
Xdist=motionvectors[1,:,:]- motionvectors[3,:,:]
Xoriginal=X-Xdist
Yoriginal=Y-Ydist
plot2 = plt.figure()
plt.quiver(Xoriginal, Yoriginal, X, Y,
color='Teal',
headlength=7)
plt.title('Quiver Plot, Single Colour')
plt.show(plot2)
hsv[..., 0] = ang * 180 / np.pi / 2
hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
cv2.imshow('frame2', bgr)
k = cv2.waitKey(30) & 0xff
if k == 27:
break
prvs = next
cap.release()
cv2.destroyAllWindows()
I think i have calculated the original and final X,Y positions of the pixels and the distances the moved and have put these into a matplotlib quiver plot.
The result i get does not coincide with the hsv plot of the dense optical flow (which i know to be correct as it was taken from the OpenCV tutorials) and the quiver plot also only shows one frame at a time and the plot must be exited before the next one displays.
Can anyone see where i have gone wrong in my calculations and how i can make the plot update automatically with each frame?

I do not know how to change the behaviour of matplotlib quiver plots, but I'm sure it is possible.
An alternative is to create a function to draw lines on top of the original image, based on the calculated optical flow. The following code should achieve this:
def dispOpticalFlow( Image,Flow,Divisor,name ):
"Display image with a visualisation of a flow over the top. A divisor controls the density of the quiver plot."
PictureShape = np.shape(Image)
#determine number of quiver points there will be
Imax = int(PictureShape[0]/Divisor)
Jmax = int(PictureShape[1]/Divisor)
#create a blank mask, on which lines will be drawn.
mask = np.zeros_like(Image)
for i in range(1, Imax):
for j in range(1, Jmax):
X1 = (i)*Divisor
Y1 = (j)*Divisor
X2 = int(X1 + Flow[X1,Y1,1])
Y2 = int(Y1 + Flow[X1,Y1,0])
X2 = np.clip(X2, 0, PictureShape[0])
Y2 = np.clip(Y2, 0, PictureShape[1])
#add all the lines to the mask
mask = cv2.line(mask, (Y1,X1),(Y2,X2), [255, 255, 255], 1)
#superpose lines onto image
img = cv2.add(Image,mask)
#print image
cv2.imshow(name,img)
return []
This code only creates lines rather than arrows, but with some effort it could be modified to display arrows.

Related

How to detect a grainy line?

I am trying to detect a grainy printed line on a paper with cv2. I need the angle of the line. I dont have much knowledge in image processing and I only need to detect the line. I tried to play with the parameters but the angle is always detected wrong. Could someone help me. This is my code:
import cv2
import numpy as np
import matplotlib.pylab as plt
from matplotlib.pyplot import figure
img = cv2.imread('CamXY1_1.bmp')
crop_img = img[100:800, 300:900]
blur = cv2.GaussianBlur(crop_img, (1,1), 0)
ret,thresh = cv2.threshold(blur,150,255,cv2.THRESH_BINARY)
gray = cv2.cvtColor(thresh,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 60, 150)
figure(figsize=(15, 15), dpi=150)
plt.imshow(edges, 'gray')
lines = cv2.HoughLines(edges,1,np.pi/180,200)
for rho,theta in lines[0]:
a = np.cos(theta)
b = np.sin(theta)
x0 = a*rho
y0 = b*rho
x1 = int(x0 + 3000*(-b))
y1 = int(y0 + 3000*(a))
x2 = int(x0 - 3000*(-b))
y2 = int(y0 - 3000*(a))
cv2.line(img,(x1,y1),(x2,y2),(0, 255, 0),2)
imagetobedetected
Here's a possible solution to estimate the line (and its angle) without using the Hough line transform. The idea is to locate the start and ending points of the line using the reduce function. This function can reduce an image to a single column or row. If we reduce the image we can also get the total SUM of all the pixels across the reduced image. Using this info we can estimate the extreme points of the line and calculate its angle. This are the steps:
Resize your image because it is way too big
Get a binary image via adaptive thresholding
Define two extreme regions of the image and crop them
Reduce the ROIs to a column using the SUM mode, which is the sum of all rows
Accumulate the total values above a threshold value
Estimate the starting and ending points of the line
Get the angle of the line
Here's the code:
# imports:
import cv2
import numpy as np
import math
# image path
path = "D://opencvImages//"
fileName = "mmCAb.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Scale your BIG image into a small one:
scalePercent = 0.3
# Calculate the new dimensions
width = int(inputImage.shape[1] * scalePercent)
height = int(inputImage.shape[0] * scalePercent)
newSize = (width, height)
# Resize the image:
inputImage = cv2.resize(inputImage, newSize, None, None, None, cv2.INTER_AREA)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert BGR to grayscale:
grayInput = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Adaptive Thresholding:
windowSize = 51
windowConstant = 11
binaryImage = cv2.adaptiveThreshold(grayInput, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
The first step is to get the binary image. Note that I previously downscaled your input because it is too big and we don't need all that info. This is the binary mask:
Now, we don't need most of the image. In fact, since the line is across the whole image, we can only "trim" the first and last column and check out where the white pixels begin. I'll crop a column a little bit wider, though, so we can ensure we have enough data and as less noise as possible. I'll define two Regions of Interest (ROIs) and crop them. Then, I'll reduce each ROI to a column using the SUM mode, this will give me the summation of all intensity across each row. After that, I can accumulate the locations where the sum exceeds a certain threshold and approximate the location of the line, like this:
# Define the regions that will be cropped
# from the original image:
lineWidth = 5
cropPoints = [(0, 0, lineWidth, height), (width-lineWidth, 0, lineWidth, height)]
# Store the line points here:
linePoints = []
# Loop through the crop points and
# crop de ROI:
for p in range(len(cropPoints)):
# Get the ROI:
(x,y,w,h) = cropPoints[p]
# Crop the ROI:
imageROI = binaryImage[y:y+h, x:x+w]
# Reduce the ROI to a n row x 1 columns matrix:
reducedImg = cv2.reduce(imageROI, 1, cv2.REDUCE_SUM, dtype=cv2.CV_32S)
# Get the height (or lenght) of the arry:
reducedHeight = reducedImg.shape[0]
# Define a threshold and accumulate
# the coordinate of the points:
threshValue = 100
pointSum = 0
pointCount = 0
for i in range(reducedHeight):
currentValue = reducedImg[i]
if currentValue > threshValue:
pointSum = pointSum + i
pointCount = pointCount + 1
# Get average coordinate of the line:
y = int(accX / pixelCount)
# Store in list:
linePoints.append((x, y))
The red rectangles show the regions I cropped from the input image:
Note that I've stored both points in the linePoints list. Let's check out our approximation by drawing a line that connects both points:
# Get the two points:
p0 = linePoints[0]
p1 = linePoints[1]
# Draw the line:
cv2.line(inputImageCopy, (p0[0], p0[1]), (p1[0], p1[1]), (255, 0, 0), 1)
cv2.imshow("Line", inputImageCopy)
cv2.waitKey(0)
Which yields:
Not bad, huh? Now that we have both points, we can estimate the angle of this line:
# Get angle:
adjacentSide = p1[0] - p0[0]
oppositeSide = p0[1] - p1[1]
# Compute the angle alpha:
alpha = math.degrees(math.atan(oppositeSide / adjacentSide))
print("Angle: "+str(alpha))
This prints:
Angle: 0.534210901840831

OpenCV: undistort (for images) and undistortPoints are inconsistent

For testing I generate a grid image as matrix and again the grid points as point array:
This represents a "distorted" camera image along with some feature points.
When I now undistort both the image and the grid points, I get the following result:
(Note that the fact that the "distorted" image is straight and the "undistorted" image is morphed is not the point, I'm just testing the undistortion functions with a straight test image.)
The grid image and the red grid points are totally misaligned now. I googled and found that some people forget to specify the "new camera matrix" parameter in undistortPoints but I didn't. The documentation also mentions a normalization but I still have the problem when I use the identity matrix as camera matrix. Also, in the central region it fits perfectly.
Why is this not identical, do I use something in a wrong way?
I use cv2 (4.1.0) in Python. Here is the code for testing:
import numpy as np
import matplotlib.pyplot as plt
import cv2
w = 401
h = 301
# helpers
#--------
def plotImageAndPoints(im, pu, pv):
plt.imshow(im, cmap="gray")
plt.scatter(pu, pv, c="red", s=16)
plt.xlim(0, w)
plt.ylim(0, h)
plt.show()
def cv2_undistortPoints(uSrc, vSrc, cameraMatrix, distCoeffs):
uvSrc = np.array([np.matrix([uSrc, vSrc]).transpose()], dtype="float32")
uvDst = cv2.undistortPoints(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix)
uDst = [uv[0] for uv in uvDst[0]]
vDst = [uv[1] for uv in uvDst[0]]
return uDst, vDst
# test data
#----------
# generate grid image
img = np.ones((h, w), dtype = "float32")
img[0::20, :] = 0
img[:, 0::20] = 0
# generate grid points
uPoints, vPoints = np.meshgrid(range(0, w, 20), range(0, h, 20), indexing='xy')
uPoints = uPoints.flatten()
vPoints = vPoints.flatten()
# see if points align with the image
plotImageAndPoints(img, uPoints, vPoints) # perfect!
# undistort both image and points individually
#---------------------------------------------
# camera matrix parameters
fx = 1
fy = 1
cx = w/2
cy = h/2
# distortion parameters
k1 = 0.00003
k2 = 0
p1 = 0
p2 = 0
# convert for opencv
mtx = np.matrix([
[fx, 0, cx],
[ 0, fy, cy],
[ 0, 0, 1]
], dtype = "float32")
dist = np.array([k1, k2, p1, p2], dtype = "float32")
# undistort image
imgUndist = cv2.undistort(img, mtx, dist)
# undistort points
uPointsUndist, vPointsUndist = cv2_undistortPoints(uPoints, vPoints, mtx, dist)
# test if they still match
plotImageAndPoints(imgUndist, uPointsUndist, vPointsUndist) # awful!
Any help appreciated!
A bit late to the party, but to help others running into this issue:
The problem is that UndistortPoints is an iterative calculation which in some cases exits before a stable solution has been reached. This can be fixed by modifying the termination criteria for the calculation, which can be done by using UndistortPointsIter. You should replace:
uvDst = cv2.undistortPoints(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix)
with:
uvDst = cv2.undistortPointsIter(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix,(cv2.TERM_CRITERIA_COUNT | cv2.TERM_CRITERIA_EPS, 40, 0.03))
Now, it tries 40 iterations to find a solution, rather than the default 5 iterations.

Detect a certain Object in a video stream

I am trying to detect a white Object on a black/white road to let an autonmous RC car drive around it. And i am detecting everything but the white box on the road.
What I tried can be seen in my code Example
#input= one video stream frame 320x240
frame = copy.deepcopy(input)
grayFrame = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
threshGray = cv2.adaptiveThreshold(
grayFrame,
255,
cv2.ADAPTIVE_THRESH_MEAN_C,
cv2.THRESH_BINARY,
blockSize=123,
C=-19,
)
contours,_ = cv2.findContours(threshGray, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
#some filtering needs to be done
#
#after filtering append contour
filteredContours.append(cnt)
cv2.rectangle(frame, (x, y), (x + w, y + h), (3, 244, 244), 1)
cv2.drawContours(frame, filteredContours, -1, (255, 0, 255),1 )
cv2.imshow("with contours", frame)
cv2.imshow("adaptiveThreshhold", threshGray)
cv2.imshow("input", input)
I'm looking for a way to draw a bounding box around the obstacle.
Problem is I dont know how to extract this box from the rest.
It is probably because the contour of the box and the lines on the right are connected and thats why the bounding box is that big. Would be great if someone knows a way to do that.
Click here to see the Result
First: Input image
Second: after adaptiveThreshold
third: with contours(pink) and bounding boxes(yellow)
At this point in time, you got several candidates of white color value.
You need to add code in to the #some filtering needs to be done to rid candidate list of NOT bounding box you want to find.
I suggest you to compare your candidate list with square box as bigger as enough.
Because all of contours without BOX(that you want to find on the road) do not satisfy condition about square box as I mentioned above.
I think what you are looking for is triangular masking, as seen in the input image you have lane marking as well. Did try using a lane detector with this all the areas out of lane can be masked and only the spaces in lane can be processed.
Below I have tried to use Lane detector using HoughLinesP and added Contours as well. Try to use this, I did not test this code but I see no issues.
#! /usr/bin/env python 3
"""
Lane detector using the Hog transform method
"""
import cv2 as cv
import numpy as np
# import matplotlib.pyplot as plt
import random as rng
rng.seed(369)
def do_canny(frame):
# Converts frame to grayscale because we only need the luminance channel for detecting edges - less computationally expensive
gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
# Applies a 5x5 gaussian blur with deviation of 0 to frame - not mandatory since Canny will do this for us
blur = cv.GaussianBlur(gray, (5, 5), 0)
# Applies Canny edge detector with minVal of 50 and maxVal of 150
canny = cv.Canny(blur, 50, 150)
return canny
def do_segment(frame):
# Since an image is a multi-directional array containing the relative intensities of each pixel in the image, we can use frame.shape to return a tuple: [number of rows, number of columns, number of channels] of the dimensions of the frame
# frame.shape[0] give us the number of rows of pixels the frame has. Since height begins from 0 at the top, the y-coordinate of the bottom of the frame is its height
height = frame.shape[0]
# Creates a triangular polygon for the mask defined by three (x, y) coordinates
polygons = np.array([
[(0, height), (800, height), (380, 290)]
])
# Creates an image filled with zero intensities with the same dimensions as the frame
mask = np.zeros_like(frame)
# Allows the mask to be filled with values of 1 and the other areas to be filled with values of 0
cv.fillPoly(mask, polygons, 255)
# A bitwise and operation between the mask and frame keeps only the triangular area of the frame
segment = cv.bitwise_and(frame, mask)
return segment
def calculate_lines(frame, lines):
# Empty arrays to store the coordinates of the left and right lines
left = []
right = []
# Loops through every detected line
for line in lines:
# Reshapes line from 2D array to 1D array
x1, y1, x2, y2 = line.reshape(4)
# Fits a linear polynomial to the x and y coordinates and returns a vector of coefficients which describe the slope and y-intercept
parameters = np.polyfit((x1, x2), (y1, y2), 1)
slope = parameters[0]
y_intercept = parameters[1]
# If slope is negative, the line is to the left of the lane, and otherwise, the line is to the right of the lane
if slope < 0:
left.append((slope, y_intercept))
else:
right.append((slope, y_intercept))
# Averages out all the values for left and right into a single slope and y-intercept value for each line
left_avg = np.average(left, axis = 0)
right_avg = np.average(right, axis = 0)
# Calculates the x1, y1, x2, y2 coordinates for the left and right lines
left_line = calculate_coordinates(frame, left_avg)
right_line = calculate_coordinates(frame, right_avg)
return np.array([left_line, right_line])
def calculate_coordinates(frame, parameters):
slope, intercept = parameters
# Sets initial y-coordinate as height from top down (bottom of the frame)
y1 = frame.shape[0]
# Sets final y-coordinate as 150 above the bottom of the frame
y2 = int(y1 - 150)
# Sets initial x-coordinate as (y1 - b) / m since y1 = mx1 + b
x1 = int((y1 - intercept) / slope)
# Sets final x-coordinate as (y2 - b) / m since y2 = mx2 + b
x2 = int((y2 - intercept) / slope)
return np.array([x1, y1, x2, y2])
def visualize_lines(frame, lines):
# Creates an image filled with zero intensities with the same dimensions as the frame
lines_visualize = np.zeros_like(frame)
# Checks if any lines are detected
if lines is not None:
for x1, y1, x2, y2 in lines:
# Draws lines between two coordinates with green color and 5 thickness
cv.line(lines_visualize, (x1, y1), (x2, y2), (0, 255, 0), 5)
return lines_visualize
# The video feed is read in as a VideoCapture object
cap = cv.VideoCapture(1)
while (cap.isOpened()):
# ret = a boolean return value from getting the frame, frame = the current frame being projected in the video
ret, frame = cap.read()
canny = do_canny(frame)
cv.imshow("canny", canny)
# plt.imshow(frame)
# plt.show()
segment = do_segment(canny)
hough = cv.HoughLinesP(segment, 2, np.pi / 180, 100, np.array([]), minLineLength = 100, maxLineGap = 50)
# Averages multiple detected lines from hough into one line for left border of lane and one line for right border of lane
lines = calculate_lines(frame, hough)
# Visualizes the lines
lines_visualize = visualize_lines(frame, lines)
cv.imshow("hough", lines_visualize)
# Overlays lines on frame by taking their weighted sums and adding an arbitrary scalar value of 1 as the gamma argument
output = cv.addWeighted(frame, 0.9, lines_visualize, 1, 1)
contours, _ = cv.findContours(output, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
contours_poly = [None]*len(contours)
boundRect = [None]*len(contours)
centers = [None]*len(contours)
radius = [None]*len(contours)
for i, c in enumerate(contours):
contours_poly[i] = cv.approxPolyDP(c, 3, True)
boundRect[i] = cv.boundingRect(contours_poly[i])
centers[i], radius[i] = cv.minEnclosingCircle(contours_poly[i])
## [allthework]
## [zeroMat]
drawing = np.zeros((output.shape[0], output.shape[1], 3), dtype=np.uint8)
## [zeroMat]
## [forContour]
# Draw polygonal contour + bonding rects + circles
for i in range(len(contours)):
color = (rng.randint(0,256), rng.randint(0,256), rng.randint(0,256))
cv.drawContours(drawing, contours_poly, i, color)
cv.rectangle(drawing, (int(boundRect[i][0]), int(boundRect[i][1])), \
(int(boundRect[i][0]+boundRect[i][2]), int(boundRect[i][1]+boundRect[i][3])), color, 2)
# Opens a new window and displays the output frame
cv.imshow('Contours', drawing)
# Frames are read by intervals of 10 milliseconds. The programs breaks out of the while loop when the user presses the 'q' key
if cv.waitKey(10) & 0xFF == ord('q'):
break
# The following frees up resources and closes all windows
cap.release()
cv.destroyAllWindows()
try different values in the threshold for canny.

How can I extract image segment with specific color in OpenCV?

I work with logos and other simple graphics, in which there are no gradients or complex patterns. My task is to extract from the logo segments with letters and other elements.
To do this, I define the background color, and then I go through the picture in order to segment the images. Here is my code for more understanding:
MAXIMUM_COLOR_TRANSITION_DELTA = 100 # 0 - 765
def expand_segment_recursive(image, unexplored_foreground, segment, point, color):
height, width, _ = image.shape
# Unpack coordinates from point
py, px = point
# Create list of pixels to check
neighbourhood_pixels = [(py, px + 1), (py, px - 1), (py + 1, px), (py - 1, px)]
allowed_zone = unexplored_foreground & np.invert(segment)
for y, x in neighbourhood_pixels:
# Add pixel to segment if its coordinates within the image shape and its color differs from segment color no
# more than MAXIMUM_COLOR_TRANSITION_DELTA
if y in range(height) and x in range(width) and allowed_zone[y, x]:
color_delta = np.sum(np.abs(image[y, x].astype(np.int) - color.astype(np.int)))
print(color_delta)
if color_delta <= MAXIMUM_COLOR_TRANSITION_DELTA:
segment[y, x] = True
segment = expand_segment_recursive(image, unexplored_foreground, segment, (y, x), color)
allowed_zone = unexplored_foreground & np.invert(segment)
return segment
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Pass image as the argument to use the tool")
exit(-1)
IMAGE_FILENAME = sys.argv[1]
print(IMAGE_FILENAME)
image = cv.imread(IMAGE_FILENAME)
height, width, _ = image.shape
# To filter the background I use median value of the image, as background in most cases takes > 50% of image area.
background_color = np.median(image, axis=(0, 1))
print("Background color: ", background_color)
# Create foreground mask to find segments in it (TODO: Optimize this part)
foreground = np.zeros(shape=(height, width, 1), dtype=np.bool)
for y in range(height):
for x in range(width):
if not np.array_equal(image[y, x], background_color):
foreground[y, x] = True
unexplored_foreground = foreground
for y in range(height):
for x in range(width):
if unexplored_foreground[y, x]:
segment = np.zeros(foreground.shape, foreground.dtype)
segment[y, x] = True
segment = expand_segment_recursive(image, unexplored_foreground, segment, (y, x), image[y, x])
cv.imshow("segment", segment.astype(np.uint8) * 255)
while cv.waitKey(0) != 27:
continue
Here is the desired result:
In the end of run-time I expect 13 extracted separated segments (for this particular image). But instead I got RecursionError: maximum recursion depth exceeded, which is not surprising as expand_segment_recursive() can be called for every pixel of the image. And since even with small image resolution of 600x500 i got at maximum 300K calls.
My question is how can I get rid of recursion in this case and possibly optimize the algorithm with Numpy or OpenCV algorithms?
You can actually use a thresholded image (binary) and connectedComponents to do this job in a couple of steps. Also, you may use findContours or other methods.
Here is the code:
import numpy as np
import cv2
# load image as greyscale
img = cv2.imread("hp.png", 0)
# puts 0 to the white (background) and 255 in other places (greyscale value < 250)
_, thresholded = cv2.threshold(img, 250, 255, cv2.THRESH_BINARY_INV)
# gets the labels and the amount of labels, label 0 is the background
amount, labels = cv2.connectedComponents(thresholded)
# lets draw it for visualization purposes
preview = np.zeros((img.shape[0], img.shape[2], 3), dtype=np.uint8)
print (amount) #should be 3 -> two components + background
# draw label 1 blue and label 2 green
preview[labels == 1] = (255, 0, 0)
preview[labels == 2] = (0, 255, 0)
cv2.imshow("frame", preview)
cv2.waitKey(0)
At the end, the thresholded image will look like this:
and the preview image (the one with the colored segments) will look like this:
With the mask you can always use numpy functions to get things like, coordinates of the segments you want or to color them (like I did with preview)
UPDATE
To get different colored segments, you may try to create a "border" between the segments. Since they are plain colors and not gradients, you can try to do an edge detector like canny and then put it black in the image....
import numpy as np
import cv2
img = cv2.imread("total.png", 0)
# background to black
img[img>=200] = 0
# get edges
canny = cv2.Canny(img, 60, 180)
# make them thicker
kernel = np.ones((3,3),np.uint8)
canny = cv2.morphologyEx(canny, cv2.MORPH_DILATE, kernel)
# apply edges as border in the image
img[canny==255] = 0
# same as before
amount, labels = cv2.connectedComponents(img)
preview = np.zeros((img.shape[0], img.shape[1], 3), dtype=np.uint8)
print (amount) #should be 14 -> 13 components + background
# color them randomly
for i in range(1, amount):
preview[labels == i] = np.random.randint(0,255, size=3, dtype=np.uint8)
cv2.imshow("frame", preview )
cv2.waitKey(0)
The result is:

Detecting moving object in moving camera(monitoring one area mounted on a drone)

def run(self):
while True:
_ret, frame = self.cam.read()
frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
vis = frame.copy()
if len(self.tracks) > 0:
img0, img1 = self.prev_gray, frame_gray
p0 = np.float32([tr[-1] for tr in self.tracks]).reshape(-1, 1, 2)
p1, _st, _err = cv2.calcOpticalFlowPyrLK(img0, img1, p0, None, **lk_params)
p0r, _st, _err = cv2.calcOpticalFlowPyrLK(img1, img0, p1, None, **lk_params)
d = abs(p0-p0r).reshape(-1, 2).max(-1)
good = d < 1
new_tracks = []
for i in range(len(p1)):
A.append(math.sqrt((p1[i][0][0])**2 + (p1[i][0][1])**2))
counts,bins,bars = plt.hist(A)
for tr, (x, y), good_flag in zip(self.tracks, p1.reshape(-1, 2), good):
if not good_flag:
continue
tr.append((x, y))
if len(tr) > self.track_len:
del tr[0]
new_tracks.append(tr)
cv2.circle(vis, (x, y), 2, (0, 255, 0), -1)
self.tracks = new_tracks
cv2.polylines(vis, [np.int32(tr) for tr in self.tracks], False, (0, 255, 0))
draw_str(vis, (20, 20), 'track count: %d' % len(self.tracks))
if self.frame_idx % self.detect_interval == 0:
mask = np.zeros_like(frame_gray)
mask[:] = 255
for x, y in [np.int32(tr[-1]) for tr in self.tracks]:
cv2.circle(mask, (x, y), 5, 0, -1)
p = cv2.goodFeaturesToTrack(frame_gray, mask = mask, **feature_params)
if p is not None:
for x, y in np.float32(p).reshape(-1, 2):
self.tracks.append([(x, y)])
self.frame_idx += 1
self.prev_gray = frame_gray
cv2.imshow('lk_track', vis)
ch = cv2.waitKey(1)
if ch == 27:
break
i am using lk_track.py from opencv samples to try and detect a moving object. I am trying to find the camera motion using the histogram of magnitude of optical flow vectors and then calculate the average for similar values which should be directly proportional to the camera motion. I have calculated the magnitude of the vectors and saved it in a list A. Can some suggest on how to find highest similar values from it and calculate the average for only those values?
I created a toy problem to model the approach of binarizing the images by optical flow. This is a massively simplified view of the problem, but gives the general idea well. I'll split the problem up into a few chunks and give functions for them. If you're working directly with video, there will be a lot of additional code needed of course, and I just hardcoded a lot of values that you'll need to turn into parameters.
The first function is just for generating the image sequence. The images are moving through a scene with an object moving inside the sequence. The image sequence is just simply translating through the scene, and the object appears stationary in the sequence, but that means that the object is actually moving in the opposite direction of the camera of course.
import numpy as np
import cv2
def gen_seq():
"""Generate motion sequence with an object"""
scene = cv2.GaussianBlur(np.uint8(255*np.random.rand(400, 500)), (21, 21), 3)
h, w = 400, 400
step = 4
obj_mask = np.zeros((h, w), np.bool)
obj_h, obj_w = 50, 50
obj_x, obj_y = 175, 175
obj_mask[obj_y:obj_y+obj_h, obj_x:obj_x+obj_w] = True
obj_data = np.uint8(255*np.random.rand(obj_h, obj_w)).ravel()
imgs = []
for i in range(0, 1+w//step, step):
img = scene[:, i:i+w].copy()
img[obj_mask] = obj_data
imgs.append(img)
return imgs
# generate image sequence
imgs = gen_seq()
# display images
for img in imgs:
cv2.imshow('Image', img)
k = cv2.waitKey(100) & 0xFF
if k == ord('q'):
break
cv2.destroyWindow('Image')
So here's the basic image sequence visualized. I just used a random scene, translated through, and added a random object in the center.
Great! Now we need to calculate the flow between each frame. I used dense flow here, but sparse flow would be more robust for actual images.
def find_flows(imgs):
"""Finds the dense optical flows"""
optflow_params = [0.5, 3, 15, 3, 5, 1.2, 0]
prev = imgs[0]
flows = []
for img in imgs[1:]:
flow = cv2.calcOpticalFlowFarneback(prev, img, None, *optflow_params)
flows.append(flow)
prev = img
return flows
# find optical flows between images
flows = find_flows(imgs)
# display flows
h, w = imgs[0].shape[:2]
hsv = np.zeros((h, w, 3), dtype=np.uint8)
hsv[..., 1] = 255
for flow in flows:
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
hsv[..., 0] = ang*180/np.pi/2
hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
cv2.imshow('Flow', rgb)
k = cv2.waitKey(100) & 0xFF
if k == ord('q'):
break
cv2.destroyWindow('Flow')
Here I colorized the flow based on it's angle and magnitude. The angle will determine the color and the magnitude will determine the intensity/brightness of the color. This is the same view the OpenCV tutorial on dense optical flow uses.
Then, we need to binarize this flow so that we get two distinct sets of pixels based on how they're moving. In the sparse case, this works out the same except you will get two distinct sets of features.
def label_flows(flows):
"""Binarizes the flows by direction and magnitude"""
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
flags = cv2.KMEANS_RANDOM_CENTERS
h, w = flows[0].shape[:2]
labeled_flows = []
for flow in flows:
flow = flow.reshape(h*w, -1)
comp, labels, centers = cv2.kmeans(flow, 2, None, criteria, 10, flags)
n = np.sum(labels == 1)
camera_motion_label = np.argmax([labels.size-n, n])
labeled = np.uint8(255*(labels.reshape(h, w) == camera_motion_label))
labeled_flows.append(labeled)
return labeled_flows
# binarize the flows
labeled_flows = label_flows(flows)
# display binarized flows
for labeled_flow in labeled_flows:
cv2.imshow('Labeled Flow', labeled_flow)
k = cv2.waitKey(100) & 0xFF
if k == ord('q'):
break
cv2.destroyWindow('Labeled Flow')
The annoying thing here is the labels will be set randomly, i.e. the labels will be different for each frame. If you visualized the binary image, it would flip between black and white randomly. I'm only using binary labels, 0 and 1, so what I did was considered the label that is assigned to more pixels to be the "camera motion label" and then I set that label to be white in the resulting images, and the other label to be black, that way the camera motion label is always the same in each frame. This may need to be much more sophisticated for working on video feed.
But here we have it, a binarized flow where the color is just showing the two distinct sets of flow vectors.
Now if we wanted to find the target in this flow, we could invert the image and find the connected components of the binary image. The inversion will make the camera motion the background label (0). Then each of the black blobs will be white and will be labeled, and we could find the blob relating to the largest component which, in this case, will be the target. That will give a mask around the target, and we can draw the contours of that mask on the original images to see the target being detected. I'll also cut the borders of the image off before finding the connected components so edge effects from dense flow are ignored.
def find_target_in_labeled_flow(labeled_flow):
labeled_flow = cv2.bitwise_not(labeled_flow)
bw = 10
h, w = labeled_flow.shape[:2]
border_cut = labeled_flow[bw:h-bw, bw:w-bw]
conncomp, stats = cv2.connectedComponentsWithStats(border_cut, connectivity=8)[1:3]
target_label = np.argmax(stats[1:, cv2.CC_STAT_AREA]) + 1
img = np.zeros_like(labeled_flow)
img[bw:h-bw, bw:w-bw] = 255*(conncomp == target_label)
return img
for labeled_flow, img in zip(labeled_flows, imgs[:-1]):
target_mask = find_target_in_labeled_flow(labeled_flow)
display_img = cv2.merge([img, img, img])
contours = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[1]
display_img = cv2.drawContours(display_img, contours, -1, (0, 255, 0), 2)
cv2.imshow('Detected Target', display_img)
k = cv2.waitKey(100) & 0xFF
if k == ord('q'):
break
And of course this could get some cleaning up, and you won't be doing exactly this for sparse flow. You could just define a region of interest around the tracked points.
Now, there is still a lot of work to do. You have a binarized flow...you can probably assume that the label which occurs most frequently is the camera motion (like I did) safely. However, you'll have to make sure that the other label is the object you're interested in tracking. You'll have to keep track of it between flows so that if it stops moving, you'll know where it is as the camera is moving. When you do the k-means step, you'll want to make sure that the centers from k-means are "far enough" apart so that you know the object is moving or not.
The basic steps for that would be, from the starting frame of the video:
If the two centers are "close", then you can assume your object is either not in the scene or not moving in the scene.
Once the centers are split enough apart, you'll have found the object to track. Keep track of the location of the object.
During tracking of the object, verify the location is nearby a prediction. You can use the optical flow velocity vectors from the previous frame to predict the location each pixel/feature in the new frame, so make sure your predictions agree with your tracking result.
If the object stops moving, the centers from k-means should be close. Keep track of the optical flow vectors around the object location and follow them to have a prediction of where the object is again once it resumes moving, and again verify the detected location with this prediction.
I've never used these methods before so I'm not sure how robust they are. The typical approach for HOOF or "Histogram of oriented optical flow" is much more advanced than this (see the seminal paper here). Instead of just binarizing, the idea is to use histograms from each frame as a probability distribution, and the way this probability distribution changes over time can be analyzed with the tools from time series analysis, which I assume give a more robust framework to this approach.
with #alkasm's answer to avoid the following error:
(-215:Assertion failed) npoints > 0 in function 'drawContours'
simply replace:
contours = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[1]
with
contours, _ = cv2.findContours(target_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
I can't comment this below as an answer due to new account with low reputation.

Categories