I am trying to determine the centroid of one specific object using OpenCV and Python.
I am using the following code, but it is taking too much time to calculate the centroid.
I need a faster approach for this -- should I change the resolution of the cameras in order to increase the computing speed?
This is my code:
meanI=[0]
meanJ=[0]
#taking infinite frames continuously to make a video
while(True):
ret, frame = capture.read()
rgb_image = cv2.cvtColor(frame , 0)
content_red = rgb_image[:,:,2] #red channel of image
content_green = rgb_image[:,:,1] #green channel of image
content_blue = rgb_image[:,:,0] #blue channel of image
r = rgb_image.shape[0] #gives the rows of the image matrix
c = rgb_image.shape[1] # gives the columns of the image matrix
d = rgb_image.shape[2] #gives the depth order of the image matrux
binary_image = np.zeros((r,c),np.float32)
for i in range (1,r): #thresholding the object as per requirements
for j in range (1,c):
if((content_red[i][j]>186) and (content_red[i][j]<230) and \
(content_green[i][j]>155) and (content_green[i][j]<165) and \
(content_blue[i][j]> 175) and (content_blue[i][j]< 195)):
binary_image[i][j] = 1
meanI.append(i)
meanJ.append(j)
cv2.imshow('frame1',binary_image)
cv2.waitKey()
cox = np.mean(meanI) #x-coordinate of centroid
coy = np.mean(meanJ) #y-coordinate of centroid
As you have discovered, nested loops in Python are very slow. It is best to avoid iterating over every pixel using nested loops. Fortunately, OpenCV has some built-in functions that do exactly what you are trying to achieve: inRange(), which creates a binary image of pixels which fall in between the specified bounds, and moments(), which you can use to calculate the centroid of a binary image. I strongly suggest reading over OpenCV's documentation to get a feel for what the library offers.
Combining these two functions gives the following code:
import numpy as np
import cv2
lower = np.array([175, 155, 186], np.uint8) # Note these ranges are BGR ordered
upper = np.array([195, 165, 230], np.uint8)
binary = cv2.inRange(im, lower, upper) # im is your BGR image
moments = cv2.moments(binary, True)
cx = moments['m10'] / moments['m00']
cy = moments['m01'] / moments['m00']
cx and cy are the x- and y-coordinates of the image centroid. This version is a whopping 3000 times faster than using nested loops.
Related
I've been researching and trying a couple functions to get what I want and I feel like I might be overthinking it.
One version of my code is below. The sample image is here.
My end goal is to find the angle (yellow) of the approximated line with respect to the frame (green line) Final
I haven't even got to the angle portion of the program yet.
The results I was obtaining from the below code were as follows. Canny Closed Small Removed
Anybody have a better way of creating the difference and establishing the estimated line?
Any help is appreciated.
import cv2
import numpy as np
pX = int(512)
pY = int(768)
img = cv2.imread('IMAGE LOCATION', cv2.IMREAD_COLOR)
imgS = cv2.resize(img, (pX, pY))
aimg = cv2.imread('IMAGE LOCATION', cv2.IMREAD_GRAYSCALE)
# Blur image to reduce noise and resize for viewing
blur = cv2.medianBlur(aimg, 5)
rblur = cv2.resize(blur, (384, 512))
canny = cv2.Canny(rblur, 120, 255, 1)
cv2.imshow('canny', canny)
kernel = np.ones((2, 2), np.uint8)
#fringeMesh = cv2.dilate(canny, kernel, iterations=2)
#fringeMesh2 = cv2.dilate(fringeMesh, None, iterations=1)
#cv2.imshow('fringeMesh', fringeMesh2)
closing = cv2.morphologyEx(canny, cv2.MORPH_CLOSE, kernel)
cv2.imshow('Closed', closing)
nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(closing, connectivity=8)
#connectedComponentswithStats yields every separated component with information on each of them, such as size
sizes = stats[1:, -1]; nb_components = nb_components - 1
min_size = 200 #num_pixels
fringeMesh3 = np.zeros((output.shape))
for i in range(0, nb_components):
if sizes[i] >= min_size:
fringeMesh3[output == i + 1] = 255
#contours, _ = cv2.findContours(fringeMesh3, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
#cv2.drawContours(fringeMesh3, contours, -1, (0, 255, 0), 1)
cv2.imshow('final', fringeMesh3)
#cv2.imshow("Natural", imgS)
#cv2.imshow("img", img)
cv2.imshow("aimg", aimg)
cv2.imshow("Blur", rblur)
cv2.waitKey()
cv2.destroyAllWindows()
You can fit a straight line to the first white pixel you encounter in each column, starting from the bottom.
I had to trim your image because you shared a screen grab of it with a window decoration, title and frame rather than your actual image:
import cv2
import math
import numpy as np
# Load image as greyscale
im = cv2.imread('trimmed.jpg', cv2.IMREAD_GRAYSCALE)
# Get index of first white pixel in each column, starting at the bottom
yvals = (im[::-1,:]>200).argmax(axis=0)
# Make the x values 0, 1, 2, 3...
xvals = np.arange(0,im.shape[1])
# Fit a line of the form y = mx + c
z = np.polyfit(xvals, yvals, 1)
# Convert the slope to an angle
angle = np.arctan(z[0]) * 180/math.pi
Note 1: The value of z (the result of fitting) is:
array([ -0.74002694, 428.01463745])
which means the equation of the line you are looking for is:
y = -0.74002694 * x + 428.01463745
i.e. the y-intercept is at row 428 from the bottom of the image.
Note 2: Try to avoid JPEG format as an intermediate format in image processing - it is lossy and changes your pixel values - so where you have thresholded and done your morphology you are expecting values of 255 and 0, JPEG will lossily alter those values and you end up testing for a range or thresholding again.
Your 'Closed' image seems to quite clearly segment the two regions, so I'd suggest you focus on turning that boundary into a line that you can do something with. Connected components analysis and contour detection don't really provide any useful information here, so aren't necessary.
One quite simple approach to finding the line angle is to find the first white pixel in each row. To get only the rows that are part of your diagonal, don't include rows where that pixel is too close to either side (e.g. within 5%). That gives you a set of points (pixel locations) on the boundary of your two types of grass.
From there you can either do a linear regression to get an equation for the straight line, or you can get two points by averaging the x values for the top and bottom half of the rows, and then calculate the gradient angle from that.
An alternative approach would be doing another morphological close with a very large kernel, to end up with just a solid white region and a solid black region, which you could turn into a line with canny or findContours. From there you could either get some points by averaging, use the endpoints, or given a smooth enough result from a large enough kernel you could detect the line with hough lines.
I'm trying to take a picture (.jpg file) and find the exact centers (x/y coords) of two differently colored circles in this picture. I've done this in python 2.7. My program works well, but it takes a long time and I need to drastically reduce the amount of time it takes to do this. I currently check every pixel and test its color, and I know I could greatly improve efficiency by pre-sampling a subset of pixels (e.g. every tenth pixel in both horizontal and vertical directions to find areas of the picture to hone in on). My question is if there are pre-developed functions or ways of finding the x/y coords of objects that are much more efficient than my code. I've already removed function calls within the loop, but that only reduced the run time by a few percent.
Here is my code:
from PIL import Image
import numpy as np
i = Image.open('colors4.jpg')
iar = np.asarray(i)
(numCols,numRows) = i.size
print numCols
print numRows
yellowPixelCount = 0
redPixelCount = 0
yellowWeightedCountRow = 0
yellowWeightedCountCol = 0
redWeightedCountRow = 0
redWeightedCountCol = 0
for row in range(numRows):
for col in range(numCols):
pixel = iar[row][col]
r = pixel[0]
g = pixel[1]
b = pixel[2]
brightEnough = r > 200 and g > 200
if r > 2*b and g > 2*b and brightEnough: #yellow pixel
yellowPixelCount = yellowPixelCount + 1
yellowWeightedCountRow = yellowWeightedCountRow + row
yellowWeightedCountCol = yellowWeightedCountCol + col
if r > 2*g and r > 2*b and r > 100: # red pixel
redPixelCount = redPixelCount + 1
redWeightedCountRow = redWeightedCountRow + row
redWeightedCountCol = redWeightedCountCol + col
print "Yellow circle location"
print yellowWeightedCountRow/yellowPixelCount
print yellowWeightedCountCol/yellowPixelCount
print " "
print "Red circle location"
print redWeightedCountRow/redPixelCount
print redWeightedCountCol/redPixelCount
print " "
Update: As I mentioned below, the picture is somewhat arbitrary, but here is an example of one frame from the video I am using:
First you have to do some clearing:
what do you consider fast enough? where is the sample image so we can see what are you dealing with (resolution, bit per pixel). what platform (especially CPU so we can estimate speed).
As you are dealing with circles (each one encoded with different color) then it should be enough to find bounding box. So find min and max x,y coordinates of the pixels of each color. Then your circle is:
center.x=(xmin+xmax)/2
center.y=(ymin+ymax)/2
radius =((xmax-xmin)+(ymax-ymin))/4
If coded right even with your approach it should take just few ms. on images around 1024x1024 resolution I estimate 10-100 ms on average machine. You wrote your approach is too slow but you did not specify the time itself (in some cases 1us is slow in other 1min is enough so we can only guess what you need and got). Anyway if you got similar resolution and time is 1-10 sec then you most likelly use some slow pixel access (most likely from GDI) like get/setpixel use bitmap Scanline[] or direct Pixel access with bitblt or use own memory for images.
Your approach can be speeded up by using ray cast to find approximate location of circles.
cast horizontal lines
their distance should be smaller then radius of smallest circle you search for. cast as many rays until you hit each circle with at least 2 rays
cast 2 vertical lines
you can use found intersection points from #1 so no need to cast many rays just 2 ... use the H ray where intersection points are closer together but not too close.
compute you circle properties
so from the 4 intersection points compute center and radius as it is axis aligned rectangle +/- pixel error it should be as easy just find the mid point of any diagonal and radius is also obvious as half of diagonal size.
As you did not share any image we can only guess what you got in case you do no have circles or need an idea for different approach see:
Algorithms: Ellipse matching
find archery target in image of different perspectives
If you are sure of the colours of the circle, easier method be to filter the colors using a mask and then apply Hough circles as Mathew Pope suggested.
Here is a snippet to get you started quick.
import cv2 as cv2
import numpy as np
fn = '200px-Traffic_lights_dark_red-yellow.svg.png'
# OpenCV reads image with BGR format
img = cv2.imread(fn)
# Convert to HSV format
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# lower mask (0-10)
lower_red = np.array([0, 50, 50])
upper_red = np.array([10, 255, 255])
mask = cv2.inRange(img_hsv, lower_red, upper_red)
# Bitwise-AND mask and original image
masked_red = cv2.bitwise_and(img, img, mask=mask)
# Check for circles using HoughCircles on opencv
circles = cv2.HoughCircles(mask, cv2.cv.CV_HOUGH_GRADIENT, 1, 20, param1=30, param2=15, minRadius=0, maxRadius=0)
print 'Radius ' + 'x = ' + str(circles[0][0][0]) + ' y = ' + str(circles[0][0][1])
One example of applying it on image looks like this. First is the original image, followed by the red colour mask obtained and the last is after circle is found using Hough circle function of OpenCV.
Radius found using the above method is Radius x = 97.5 y = 99.5
Hope this helps! :)
Background
A raster file collected via LIDAR records the topography of a watershed. To properly model the watershed, the river must appear continuous without any breaks or interruptions. The roads in the raster file appear like dams that interrupt the river as seen in the picture below
Specific Area Under Consideration in the Watershed
Objective
These river breaks are the main problem and I am trying but failing to remove them.
Approach
Via Python, I used the various tools and prebuilt functions in the OpenCV library. The primary function I used in this approach is the cv2.inpaint function. This function takes in an image file and a mask file and interpolates the original image wherever the mask file pixels are nonzero.
The main step here is determining the mask file which I did by detecting the corners at the break in the river. The mask file will guide the inpaint function to fill in the pixels according to the patterns in the surrounding pixels.
Problem
My issue is that this happens from all directions whereas I require it to only extrapolate pixel data from the river itself. The image below shows the flawed result: inpaint works but it considers data from outside the river too.
Inpainted Result
Here is my code if you are so kind as to help:
import scipy.io as sio
import numpy as np
from matplotlib import pyplot as plt
import cv2
matfile = sio.loadmat('box.mat') ## box.mat original raster file linked below
ztopo = matfile['box']
#Crop smaller section for analysis
ztopo2 = ztopo[200:500, 1400:1700]
## Step 1) Isolate river
river = ztopo2.copy()
river[ztopo2<217.5] = 0
#This will become problematic for entire watershed w/o proper slicing
## Step 2) Detect Corners
dst = cv2.cornerHarris(river,3,7,0.04)
# cornerHarris arguments adjust qualities of corner markers
# Dilate Image (unnecessary)
#dst = cv2.dilate(dst,None)
# Threshold for an optimal value, it may vary depending on the image.
# This adjusts what defines a corner
river2 = river.copy()
river2[dst>0.01*dst.max()]=[150]
## Step 3) Remove river and keep corners
#Initiate loop to isolate detected corners
i=0
j=0
rows,columns = river2.shape
for i in np.arange(rows):
for j in np.arange(columns):
if river2[i,j] != 150:
river2[i,j] = 0
j = j + 1
i = i + 1
# Save corners as new image for import during next step.
# Import must be via cv2 as thresholding and contour detection can only work on BGR files. Sio import in line 6 (matfile = sio.loadmat('box.mat')) imports 1 dimensional image rather than BGR.
cv2.imwrite('corners.png', river2)
## Step 4) Create mask image by defining and filling a contour around the previously detected corners
#Step 4 code retrieved from http://dsp.stackexchange.com/questions/2564/opencv-c-connect-nearby-contours-based-on-distance-between-them
#Article: OpenCV/C++ connect nearby contours based on distance between them
#Initiate function to specify features of contour connections
def find_if_close(cnt1,cnt2):
row1,row2 = cnt1.shape[0],cnt2.shape[0]
for i in xrange(row1):
for j in xrange(row2):
dist = np.linalg.norm(cnt1[i]-cnt2[j])
if abs(dist) < 50 :
return True
elif i==row1-1 and j==row2-1:
return False
#import image of corners created in step 3 so thresholding can function properly
img = cv2.imread('corners.png')
#Thresholding and Finding contours only works on grayscale image
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(gray,127,255,cv2.THRESH_BINARY)
contours,hier = cv2.findContours(thresh,cv2.RETR_EXTERNAL,2)
LENGTH = len(contours)
status = np.zeros((LENGTH,1))
for i,cnt1 in enumerate(contours):
x = i
if i != LENGTH-1:
for j,cnt2 in enumerate(contours[i+1:]):
x = x+1
dist = find_if_close(cnt1,cnt2)
if dist == True:
val = min(status[i],status[x])
status[x] = status[i] = val
else:
if status[x]==status[i]:
status[x] = i+1
unified = []
maximum = int(status.max())+1
for i in xrange(maximum):
pos = np.where(status==i)[0]
if pos.size != 0:
cont = np.vstack(contours[i] for i in pos)
hull = cv2.convexHull(cont) # I don't know why/how this is used
unified.append(hull)
cv2.drawContours(img,unified,-1,(0,255,0),1) #Last argument specifies contour width
cv2.drawContours(thresh,unified,-1,255,-1)
# Thresh is the filled contour while img is the contour itself
# The contour surrounds the corners
#cv2.imshow('pic', thresh) #Produces black and white image
## Step 5) Merge via inpaint
river = np.uint8(river)
ztopo2 = np.uint8(ztopo2)
thresh[thresh>0] = 1
#new = river.copy()
merged = cv2.inpaint(river,thresh,12,cv2.INPAINT_TELEA)
plt.imshow(merged)
plt.colorbar()
plt.show()
I read this blog post where he uses a Laser and a Webcam to estimated the distance of the cardboard from the Webcam.
I had another idea about that. I don't want to calculate the distance from the webcam.
I want to check if an object is approaching the webcam. The algorithm, according to me, will be something like:
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Since I want to detect random objects, I am using the findContours() method to find the contours in the video feed. Using that, I will at least have the outlines of the objects in the video feed. The source code is:
import numpy as np
import cv2
vid=cv2.VideoCapture(0)
ans, instant=vid.read()
average=np.float32(instant)
cv2.accumulateWeighted(instant, average, 0.01)
background=cv2.convertScaleAbs(average)
while(1):
_,f=vid.read()
imgray=cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
ret, thresh=cv2.threshold(imgray,127,255,0)
diff=cv2.absdiff(f, background)
cv2.imshow("input", f)
cv2.imshow("Difference", diff)
if cv2.waitKey(5)==27:
break
cv2.destroyAllWindows()
The output is:
I am stuck here. I have the contours stored in an array. What do I do with it when the size increases? How do I proceed?
One trouble here is recognising and differentiating the moving objects from other stuff in the video feed. An approach might be to let the camera 'learn' what the background looks like with no object. Then you can constantly compare its input against this background. One way to get the background is to use a running average.
Any difference greater than a small threshold means there is a moving object. If you constantly display this difference, you basically have a motion tracker. The size of the objects is simply the sum of all the non-zero (thresholded) pixels, or their bounding rectangles. You can track this size and use it to guess whether the object is moving closer or further. Morphological operations can help group the contours into one cohesive object.
Since it will be tracking ANY movement, if there are two objects, they will be counted together. Here is where you can use the contours to find and track individual objects, e.g. using the contour bounds or centroids. You could also possibly separate them by colour.
Here are some results using this strategy (the grey blob is my hand):
It actually did a fairly good job of guessing which way my hand was moving.
Code:
import cv2
import numpy as np
AVERAGE_ALPHA = 0.2 # 0-1 where 0 never adapts, and 1 instantly adapts
MOVEMENT_THRESHOLD = 30 # Lower values pick up more movement
REDUCED_SIZE = (400, 600)
MORPH_KERNEL = np.ones((10, 10), np.uint8)
def reduce_image(input_image):
"""Make the image easier to deal with."""
reduced = cv2.resize(input_image, REDUCED_SIZE)
reduced = cv2.cvtColor(reduced, cv2.COLOR_BGR2GRAY)
return reduced
# Initialise
vid = cv2.VideoCapture(0)
average = None
old_sizes = np.zeros(20)
size_update_index = 0
while (True):
got_frame, frame = vid.read()
if got_frame:
# Reduce image
reduced = reduce_image(frame)
if average is None: average = np.float32(reduced)
# Get background
cv2.accumulateWeighted(reduced, average, AVERAGE_ALPHA)
background = cv2.convertScaleAbs(average)
# Get thresholded difference image
movement = cv2.absdiff(reduced, background)
_, threshold = cv2.threshold(movement, MOVEMENT_THRESHOLD, 255, cv2.THRESH_BINARY)
# Apply morphology to help find object
dilated = cv2.dilate(threshold, MORPH_KERNEL, iterations=10)
closed = cv2.morphologyEx(dilated, cv2.MORPH_CLOSE, MORPH_KERNEL)
# Get contours
contours, _ = cv2.findContours(closed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(closed, contours, -1, (150, 150, 150), -1)
# Find biggest bounding rectangle
areas = [cv2.contourArea(c) for c in contours]
if (areas != list()):
max_index = np.argmax(areas)
max_cont = contours[max_index]
x, y, w, h = cv2.boundingRect(max_cont)
cv2.rectangle(closed, (x, y), (x+w, y+h), (255, 255, 255), 5)
# Guess movement direction
size = w*h
if size > old_sizes.mean():
print "Towards"
else:
print "Away"
# Update object size
old_sizes[size_update_index] = size
size_update_index += 1
if (size_update_index) >= len(old_sizes): size_update_index = 0
# Display image
cv2.imshow('RaptorVision', closed)
Obviously this needs more work in terms of identifying, selecting and tracking the objects etc (at the moment it does horribly if there is something else moving in the background). There are also many parameters to vary and tweak (the ones set are what worked well for my system). I'll leave that up to you though.
Some links:
background extraction
motion tracking
If you want to get a bit more high-tech with the background removal, have a look here:
wallflower
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Good idea.
If you want to use the contour detection approach, you could do it the following way:
You have a series of Images I1, I2, ... In
Do a contour detection on each one. C1, C2, ..., Cn (Contour is a set of points in OpenCV)
Take a large enough sample on your Image i and i+1: S_i \leq C_i, i \in 1...n
Check for all points in your sample for the nearest point on i+1. Then you trajectorys for all your points.
Check if this trajectorys point mostly outwards (tricky part ;)
If they appear outwards for a suffiecent number of frames your contour got bigger.
Alternative you could try to prune the points that are not part of the correct contour and work with a covering rectangle. It's very easy to check the size that way, but i don't knwo how easy it will be to choose the "correct" points.
I have an image of a face and I have used haar cascades to detect the locations (x,y,width,height) of the mouth, nose and each eye. I would like to set all pixels outside these regions to zero. What would be the fastest (computationally) way to do this? I'll eventually be doing it to video frames in real time.
I don't know whether it is the fastest way, but It is a way to do it.
Create a mask image with region of face as white, then apply bitwise_and function with original image and mask image.
x = y = 30
w = h = 100
mask = np.zeros(img.shape[:2],np.uint8)
mask[y:y+h,x:x+w] = 255
res = cv2.bitwise_and(img,img,mask = mask)
It takes 0.16 ms in my system (core i5,4GB RAM) for an image of size 400x300
EDIT - BETTER METHOD: You need not do as above. Simply create a zero image and then copy ROI from original image to zero image. that's all.
mask = np.zeros(img.shape,np.uint8)
mask[y:y+h,x:x+w] = img[y:y+h,x:x+w]
It takes only 0.032 ms in my system for above parameters, 5 times faster than above.
Results :
Input Image :
Output :
If a polygon ROI is to be made.
Create the polygon and make a mask for it. Multiply the image with the created frame.
ret,frame = cv2.imread()
xr=1
yr=1
# y,x
pts = np.array([[int(112*yr),int(32*xr)],[int(0*yr),int(623*xr)],[int(789*yr),int(628*xr)],[int(381*yr),int(4*xr)]], np.int32)
pts = pts.reshape((-1,1,2))
cv2.polylines(frame,[pts],True,(0,255,255))
mask = np.zeros(frame.shape[:2],np.uint8)
cv2.fillPoly(mask,[pts],(255,255,255))
frame = cv2.bitwise_and(frame,frame,mask = mask)
cv2.imshow("masked frame", frame)