Most efficient way to find center of two circles in a picture - python

I'm trying to take a picture (.jpg file) and find the exact centers (x/y coords) of two differently colored circles in this picture. I've done this in python 2.7. My program works well, but it takes a long time and I need to drastically reduce the amount of time it takes to do this. I currently check every pixel and test its color, and I know I could greatly improve efficiency by pre-sampling a subset of pixels (e.g. every tenth pixel in both horizontal and vertical directions to find areas of the picture to home in on). My question is whether there are pre-developed functions or ways of finding the x/y coords of objects that are much more efficient than my code. I've already removed function calls within the loop, but that only reduced the run time by a few percent.
Here is my code:
from PIL import Image
import numpy as np
i = Image.open('colors4.jpg')
iar = np.asarray(i)
(numCols,numRows) = i.size
print numCols
print numRows
yellowPixelCount = 0
redPixelCount = 0
yellowWeightedCountRow = 0
yellowWeightedCountCol = 0
redWeightedCountRow = 0
redWeightedCountCol = 0
for row in range(numRows):
    for col in range(numCols):
        pixel = iar[row][col]
        r = pixel[0]
        g = pixel[1]
        b = pixel[2]
        brightEnough = r > 200 and g > 200
        if r > 2*b and g > 2*b and brightEnough: #yellow pixel
            yellowPixelCount = yellowPixelCount + 1
            yellowWeightedCountRow = yellowWeightedCountRow + row
            yellowWeightedCountCol = yellowWeightedCountCol + col
        if r > 2*g and r > 2*b and r > 100: # red pixel
            redPixelCount = redPixelCount + 1
            redWeightedCountRow = redWeightedCountRow + row
            redWeightedCountCol = redWeightedCountCol + col
print "Yellow circle location"
print yellowWeightedCountRow/yellowPixelCount
print yellowWeightedCountCol/yellowPixelCount
print " "
print "Red circle location"
print redWeightedCountRow/redPixelCount
print redWeightedCountCol/redPixelCount
print " "
Update: As I mentioned below, the picture is somewhat arbitrary, but here is an example of one frame from the video I am using:

First, a few things need clarifying:
What do you consider fast enough? Where is a sample image, so we can see what you are dealing with (resolution, bits per pixel)? What platform are you on (especially the CPU, so we can estimate speed)?
As you are dealing with circles (each one encoded with a different color), it should be enough to find the bounding box. So find the min and max x,y coordinates of the pixels of each color. Then your circle is:
center.x=(xmin+xmax)/2
center.y=(ymin+ymax)/2
radius =((xmax-xmin)+(ymax-ymin))/4
If coded right, even your approach should take just a few ms. On images around 1024x1024 resolution I estimate 10-100 ms on an average machine. You wrote that your approach is too slow, but you did not specify the time itself (in some cases 1 us is slow, in others 1 min is enough, so we can only guess what you need and what you got). Anyway, if you have a similar resolution and the time is 1-10 sec, then you most likely use some slow pixel access (most likely from GDI) like get/setpixel. Use bitmap Scanline[] or direct pixel access with bitblt, or use your own memory for images.
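For illustration, both the per-pixel colour test from the question and the bounding-box formulas above can be written with NumPy array operations instead of a Python loop. This is only a minimal sketch, not the answerer's code: the names color_masks and bbox_circle and the synthetic test frame are assumptions made for the example, while the colour thresholds are the ones from the question.

```python
import numpy as np

def color_masks(rgb):
    """Return boolean masks (yellow, red) using the thresholds from the question."""
    # Widen to a signed integer type so that 2 * b cannot wrap around in uint8.
    r = rgb[..., 0].astype(np.int32)
    g = rgb[..., 1].astype(np.int32)
    b = rgb[..., 2].astype(np.int32)
    yellow = (r > 200) & (g > 200) & (r > 2 * b) & (g > 2 * b)
    red = (r > 2 * g) & (r > 2 * b) & (r > 100)
    return yellow, red

def bbox_circle(mask):
    """Center (x, y) and radius from the bounding box of the True pixels."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None  # this colour does not occur in the image
    xmin, xmax = cols.min(), cols.max()
    ymin, ymax = rows.min(), rows.max()
    center = ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)
    radius = ((xmax - xmin) + (ymax - ymin)) / 4.0
    return center, radius

# Synthetic frame (an assumption for the example): a red disc of radius 20
# centred at x=60, y=40 and a yellow disc of radius 15 centred at x=150, y=90.
h, w = 120, 200
yy, xx = np.mgrid[0:h, 0:w]
frame = np.zeros((h, w, 3), dtype=np.uint8)
frame[(xx - 60) ** 2 + (yy - 40) ** 2 <= 20 ** 2] = (255, 0, 0)
frame[(xx - 150) ** 2 + (yy - 90) ** 2 <= 15 ** 2] = (255, 255, 0)

yellow_mask, red_mask = color_masks(frame)
red_center, red_radius = bbox_circle(red_mask)
yellow_center, yellow_radius = bbox_circle(yellow_mask)
print(red_center, red_radius, yellow_center, yellow_radius)
```

With a real photo, the array would come from np.asarray(Image.open(...)) as in the question. A bounding box is sensitive to stray pixels of the same colour, so on noisy frames the mean of the mask coordinates (the centroid the question computes) may be the more robust choice.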
Your approach can be sped up by using ray casting to find the approximate location of the circles.
1. cast horizontal lines
   Their distance should be smaller than the radius of the smallest circle you search for. Cast rays until you hit each circle with at least 2 rays.
2. cast 2 vertical lines
   You can use the intersection points found in #1, so there is no need to cast many rays, just 2 ... use the H ray where the intersection points are closer together, but not too close.
3. compute your circle properties
   From the 4 intersection points compute the center and radius. As they form an axis-aligned rectangle (+/- pixel error), it should be easy: just find the midpoint of any diagonal. The radius is also obvious, as half of the diagonal size.
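The three steps above might look roughly like this on a boolean mask that contains a single filled circle (for example one colour mask per circle). It is a sketch under assumptions: the function name circle_from_rays, the min_chord parameter (my reading of "closer together but not too close") and the synthetic mask are all made up for the example.

```python
import numpy as np

def circle_from_rays(mask, step, min_chord=4):
    """Estimate (cx, cy, radius) of one filled circle in a boolean mask.

    step is the spacing of the horizontal rays and should be smaller than
    the radius, so that the circle is hit by at least two rays.
    """
    height, width = mask.shape
    chord = None
    # Step 1: cast sparse horizontal rays and keep the shortest usable chord.
    for y in range(0, height, step):
        xs = np.flatnonzero(mask[y])
        if xs.size == 0:
            continue
        x0, x1 = int(xs[0]), int(xs[-1])
        length = x1 - x0
        if length >= min_chord and (chord is None or length < chord[1] - chord[0]):
            chord = (x0, x1)
    if chord is None:
        return None
    # Step 2: cast two vertical rays through the end points of that chord.
    x0, x1 = chord
    ys_left = np.flatnonzero(mask[:, x0])
    ys_right = np.flatnonzero(mask[:, x1])
    ymin = int(min(ys_left[0], ys_right[0]))
    ymax = int(max(ys_left[-1], ys_right[-1]))
    # Step 3: the four hits span an axis-aligned rectangle inscribed in the
    # circle; its diagonal is a diameter.
    cx = (x0 + x1) / 2.0
    cy = (ymin + ymax) / 2.0
    radius = 0.5 * float(np.hypot(x1 - x0, ymax - ymin))
    return cx, cy, radius

# Synthetic mask (an assumption): a disc of radius 20 centred at x=60, y=40.
yy, xx = np.mgrid[0:120, 0:200]
mask = (xx - 60) ** 2 + (yy - 40) ** 2 <= 20 ** 2
found = circle_from_rays(mask, step=8)
print(found)
```

Only every eighth row and two columns are inspected here, which is where the saving over a full scan comes from. With two circles of different colours you would run it once per colour mask.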
As you did not share any image, we can only guess what you have. In case you do not have circles, or need an idea for a different approach, see:
Algorithms: Ellipse matching
find archery target in image of different perspectives

If you are sure of the colours of the circles, an easier method would be to filter the colours using a mask and then apply Hough circles, as Mathew Pope suggested.
Here is a snippet to get you started quickly.
import cv2 as cv2
import numpy as np
fn = '200px-Traffic_lights_dark_red-yellow.svg.png'
# OpenCV reads image with BGR format
img = cv2.imread(fn)
# Convert to HSV format
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# lower mask (0-10)
lower_red = np.array([0, 50, 50])
upper_red = np.array([10, 255, 255])
mask = cv2.inRange(img_hsv, lower_red, upper_red)
# Bitwise-AND mask and original image
masked_red = cv2.bitwise_and(img, img, mask=mask)
# Check for circles using HoughCircles on opencv
circles = cv2.HoughCircles(mask, cv2.cv.CV_HOUGH_GRADIENT, 1, 20, param1=30, param2=15, minRadius=0, maxRadius=0)
print 'Center ' + 'x = ' + str(circles[0][0][0]) + ' y = ' + str(circles[0][0][1])
One example of applying it to an image looks like this. First is the original image, followed by the red colour mask obtained, and the last is after the circle is found using the Hough circle function of OpenCV.
The circle center found using the above method is Center x = 97.5 y = 99.5
Hope this helps! :)

Related

Finding the darkest region in a depth map using numpy and/or cv2

I am attempting to consistently find the darkest region in a series of depth map images generated from a video. The depth maps are generated using the PyTorch implementation here
Their sample run script generates a prediction of the same size as the input where each pixel is a floating point value, with the highest/brightest value being the closest. Standard depth estimation using ConvNets.
The depth prediction is then normalized as follows to make a png for review
bits = 2
depth_min = prediction.min()
depth_max = prediction.max()
max_val = (2**(8*bits))-1
out = max_val * (prediction - depth_min) / (depth_max - depth_min)
I am attempting to identify the darkest region in each image in the video, with the assumption that this region has the most "open space".
I've tried several methods:
cv2 template matching
Using cv2 template matching and minMaxLoc, I created a template of np.zeros((100, 100)), then applied the template similarly to the docs
img2 = out.copy().astype("uint8")
template = np.zeros((100, 100)).astype("uint8")
w, h = template.shape[::-1]
res = cv2.matchTemplate(img2,template,cv2.TM_SQDIFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
top_left = min_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
val = out.max()
cv2.rectangle(out,top_left, bottom_right, int(val) , 2)
As you can see, this implementation is very inconsistent with many false positives
np.argmin
Using np.argmin(out, axis=1) which generates many indices. I take the first two, and write the word MIN at those coordinates
text = "MIN"
textsize = cv2.getTextSize(text, font, 1, 2)[0]
textX, textY = np.argmin(prediction, axis=1)[:2]
cv2.putText(out, text, (textX, textY), font, 1, (int(917*max_val), int(917*max_val), int(917*max_val)), 2)
This is less inconsistent but still lacking
np.argwhere
Using np.argwhere(prediction == np.min(prediction)), then writing the word MIN at the coordinates. I imagined this would give me the darkest pixel in the image, but this is not the case
I've also thought of running a convolution operation with a kernel of 50x50, then taking the region with the smallest value as the darkest region
My question is why are there inconsistencies and false positives. How can I fix that? Intuitively this seems like a very simple thing to do.
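As a side note on the 50x50 convolution idea mentioned above: summing every window with an all-ones kernel can be done cheaply with a summed-area table. The following is a rough NumPy-only sketch, not something taken from the original post; the function name darkest_window and the toy image are assumptions.

```python
import numpy as np

def darkest_window(img, size):
    """Top-left (row, col) of the size x size window with the smallest sum."""
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    sat[1:, 1:] = img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)
    # Sum of every window from four table lookups.
    sums = (sat[size:, size:] - sat[:-size, size:]
            - sat[size:, :-size] + sat[:-size, :-size])
    r, c = np.unravel_index(np.argmin(sums), sums.shape)
    return int(r), int(c)

# Toy depth map (an assumption): uniform grey with one 10x10 black patch.
depth = np.full((40, 60), 100, dtype=np.uint8)
depth[10:20, 30:40] = 0
corner = darkest_window(depth, 10)
print(corner)
```

If several windows tie for the minimum, np.argmin returns the first one, so the result can still jump between frames.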
UPDATE
Thanks to Hans for the idea. Please follow this link to download the output depths in png format.
The minimum is not a single point but as a rule a larger area. argmin finds the first x and y (top left corner) of this area:
In case of multiple occurrences of the minimum values, the indices
corresponding to the first occurrence are returned.
What you need is the center of this minimum region. You can find it using moments. Sometimes you have multiple minimum regions for instance in frame107.png. In this case we take the biggest one by finding the contour with the largest area.
We still have some jumping markers, as sometimes the minimum covers only a tiny area, e.g. in frame25.png. Therefore we use a minimum area threshold min_area, i.e. we don't use the absolute minimum region but the region with the smallest value among all regions whose area is greater than or equal to that threshold.
import numpy as np
import cv2
import glob
min_area = 500
for file in glob.glob("*.png"):
    img = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
    for i in range(img.min(), 255):
        if np.count_nonzero(img==i) >= min_area:
            b = np.where(img==i, 1, 0).astype(np.uint8)
            break
    contours,_ = cv2.findContours(b, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    max_contour = max(contours, key=cv2.contourArea)
    m = cv2.moments(max_contour)
    x = int(m["m10"] / m["m00"])
    y = int(m["m01"] / m["m00"])
    out = cv2.circle(img, (x,y), 10, 255, 2 )
    cv2.imwrite(file,out)
frame107 with five regions where the image is 0 shown with enhanced gamma:
frame25 with very small min region (red arrow), we take the fifth largest min region instead (white circle):
The result (for min_area=500) is still a bit jumpy at some places, but if you further increase min_area you'll get false results for frames with a very steeply descending (and hence small per value) dark area. Maybe you can use the time axis (frame number) to filter out frames where the location of the darkest region jumps back and forth within 3 frames.
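One possible way to use the time axis, sketched here as an assumption rather than as part of the answer's code, is a sliding median over the per-frame marker positions with a window of 3 frames. The function name median_smooth and the sample track are made up for the example.

```python
import numpy as np

def median_smooth(points, window=3):
    """Sliding median over per-frame (x, y) positions; window must be odd."""
    pts = np.asarray(points, dtype=float)
    half = window // 2
    # Repeat the first and last positions so every frame gets an output.
    padded = np.pad(pts, ((half, half), (0, 0)), mode="edge")
    return np.array([np.median(padded[i:i + window], axis=0)
                     for i in range(len(pts))])

# Marker positions for five frames; the third one is a single-frame jump.
track = [(100, 50), (102, 51), (300, 200), (104, 52), (105, 53)]
smoothed = median_smooth(track)
print(smoothed)
```

A window of 3 only suppresses jumps that last a single frame; longer excursions would need a wider window, at the cost of more lag.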

Find minimal number of rectangles in the image

I have binary images where rectangles are placed randomly and I want to get the positions and sizes of those rectangles.
If possible I want the minimal number of rectangles necessary to exactly recreate the image.
On the left is my original image and on the right the image I get after applying scipy's find_objects()
(as suggested for this question).
import numpy as np
import scipy.ndimage
# image = scipy.ndimage.zoom(image, 9, order=0)
labels, n = scipy.ndimage.label(image, np.ones((3, 3)))
bboxes = scipy.ndimage.find_objects(labels)
img_new = np.zeros_like(image)
for bb in bboxes:
    img_new[bb[0], bb[1]] = 1
This works fine if the rectangles are far apart, but if they overlap and build more complex structures this algorithm just gives me the largest bounding box (upsampling the image made no difference). I have the feeling that there should already exist a scipy or opencv method which does this.
I would be glad to know if somebody has an idea on how to tackle this problem or even better knows of an existing solution.
As a result I want a list of rectangles (i.e. lower-left-corner : upper-right-corner) in the image. The condition is that when I redraw those filled rectangles I want to get exactly the same image as before. If possible the number of rectangles should be minimal.
Here is the code for generating sample images (and a more complex example original vs scipy)
import numpy as np
def random_rectangle_image(grid_size, n_obstacles, rectangle_limits):
    n_dim = 2
    rect_pos = np.random.randint(low=0, high=grid_size-rectangle_limits[0]+1,
                                 size=(n_obstacles, n_dim))
    rect_size = np.random.randint(low=rectangle_limits[0],
                                  high=rectangle_limits[1]+1,
                                  size=(n_obstacles, n_dim))
    # Crop rectangle size if it goes over the boundaries of the world
    diff = rect_pos + rect_size
    ex = np.where(diff > grid_size, True, False)
    rect_size[ex] -= (diff - grid_size)[ex].astype(int)
    img = np.zeros((grid_size,)*n_dim, dtype=bool)
    for i in range(n_obstacles):
        p_i = np.array(rect_pos[i])
        ps_i = p_i + np.array(rect_size[i])
        img[tuple(map(slice, p_i, ps_i))] = True
    return img
img = random_rectangle_image(grid_size=64, n_obstacles=30,
                             rectangle_limits=[4, 10])
Here is something to get you started: a naïve algorithm that walks your image and creates rectangles as large as possible. As it is now, it only marks the rectangles but does not report back coordinates or counts. This is to visualize the algorithm alone.
It does not need any external libraries except for PIL, to load and access the left side image when saved as a PNG. I'm assuming a border of 15 pixels all around can be ignored.
from PIL import Image
def fill_rect (pixels,xp,yp,w,h):
    for y in range(h):
        for x in range(w):
            pixels[xp+x,yp+y] = (255,0,0,255)
    for y in range(h):
        pixels[xp,yp+y] = (255,192,0,255)
        pixels[xp+w-1,yp+y] = (255,192,0,255)
    for x in range(w):
        pixels[xp+x,yp] = (255,192,0,255)
        pixels[xp+x,yp+h-1] = (255,192,0,255)
def find_rect (pixels,x,y,maxx,maxy):
    # assume we're at the top left
    # get max horizontal span
    width = 0
    height = 1
    while x+width < maxx and pixels[x+width,y] == (0,0,0,255):
        width += 1
    # now walk down while every pixel of the span in the next row is black
    # (the test has to use the loop variable w, not the fixed column x)
    while y+height < maxy:
        for w in range(x,x+width,1):
            if pixels[w,y+height] != (0,0,0,255):
                break
        if pixels[w,y+height] != (0,0,0,255):
            break
        height += 1
    # fill rectangle
    fill_rect (pixels,x,y,width,height)
image = Image.open('A.png')
pixels = image.load()
width, height = image.size
print (width,height)
for y in range(16,height-15,1):
    for x in range(16,width-15,1):
        if pixels[x,y] == (0,0,0,255):
            find_rect (pixels,x,y,width,height)
image.show()
From the output
you can observe that the detection algorithm can be improved: for example, the "obvious" two top left rectangles are split up into 3. Similarly, the larger structure in the center also contains one rectangle more than absolutely needed.
Possible improvements are either to adjust the find_rect routine to locate a best fit¹, or store the coordinates and use math (beyond my ken) to find which rectangles may be joined.
¹ A further idea on this. Currently all found rectangles are immediately filled with the "found" color. You could try to detect obviously multiple rectangles, and then, after marking the first, the other rectangle(s) to check may then either be black or red. Off the cuff I'd say you'd need to try different scan orders (top-to-bottom or reverse, left-to-right or reverse) to actually find the minimally needed number of rectangles in any combination.
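Since the code above only marks the rectangles, here is a hedged sketch of the same greedy idea working directly on a NumPy boolean array (such as the output of random_rectangle_image in the question) and returning the rectangles as a list. The names greedy_rectangles and redraw and the small test image are assumptions for the example, and the result is not guaranteed to be minimal.

```python
import numpy as np

def greedy_rectangles(img):
    """Cover the True pixels of a 2-D boolean array with non-overlapping
    rectangles. Returns a list of (row, col, height, width)."""
    todo = img.copy()
    rects = []
    rows, cols = todo.shape
    for r in range(rows):
        for c in range(cols):
            if not todo[r, c]:
                continue
            # Widest horizontal run starting at (r, c).
            w = 1
            while c + w < cols and todo[r, c + w]:
                w += 1
            # Extend downwards while the whole run is still set.
            h = 1
            while r + h < rows and todo[r + h, c:c + w].all():
                h += 1
            todo[r:r + h, c:c + w] = False  # mark these pixels as covered
            rects.append((r, c, h, w))
    return rects

def redraw(shape, rects):
    """Paint the rectangles back into an empty image."""
    out = np.zeros(shape, dtype=bool)
    for r, c, h, w in rects:
        out[r:r + h, c:c + w] = True
    return out

# Two overlapping rectangles (an assumed toy example).
img = np.zeros((8, 8), dtype=bool)
img[1:4, 1:6] = True
img[2:7, 4:7] = True
rects = greedy_rectangles(img)
print(rects)
```

Redrawing the returned rectangles reproduces the input exactly, which satisfies the first requirement of the question. On this toy image the scan finds three rectangles although the image was drawn from two overlapping ones, which illustrates the remark above that scan order and best-fit selection matter for getting the count down.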

Connect the nearest points in segment and label segment

I am using OpenCV and skimage for document analysis of datasheets.
I am trying to segment out the shaded regions separately.
I am currently able to segment out the part and number as different clusters.
Using felzenszwalb() from skimage I segment the parts:
import matplotlib.pyplot as plt
import numpy as np
from skimage.segmentation import felzenszwalb
from skimage.io import imread
img = imread('test.jpg')
segments_fz = felzenszwalb(img, scale=100, sigma=0.2, min_size=50)
print("Felzenszwalb number of segments {}".format(len(np.unique(segments_fz))))
plt.imshow(segments_fz)
plt.tight_layout()
plt.show()
But I am not able to connect them. Any idea how to connect them methodically and label each segment with its part and part number would be of great help.
Thanks in advance for your time – if I’ve missed out anything, over- or under-emphasised a specific point let me know in the comments.
Preliminaries
Some preliminary code:
%matplotlib inline
%load_ext Cython
import numpy as np
import cv2
from matplotlib import pyplot as plt
import skimage as sk
import skimage.morphology as skm
import itertools
def ShowImage(title,img,ctype):
    plt.figure(figsize=(20, 20))
    if ctype=='bgr':
        b,g,r = cv2.split(img) # get b,g,r
        rgb_img = cv2.merge([r,g,b]) # switch it to rgb
        plt.imshow(rgb_img)
    elif ctype=='hsv':
        rgb = cv2.cvtColor(img,cv2.COLOR_HSV2RGB)
        plt.imshow(rgb)
    elif ctype=='gray':
        plt.imshow(img,cmap='gray')
    elif ctype=='rgb':
        plt.imshow(img)
    else:
        raise Exception("Unknown colour type")
    plt.axis('off')
    plt.title(title)
    plt.show()
For reference, here's your original image:
#Read in image
img = cv2.imread('part.jpg')
ShowImage('Original',img,'bgr')
Identifying Numbers
To simplify things, we'll want to classify pixels as being either on or off. We can do so with thresholding. Since our image contains two clear classes of pixels (black and white), we can use Otsu's method. We'll invert the colour scheme since the libraries we're using consider black pixels boring and white pixels interesting.
#Convert image to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
#Apply Otsu's method to eliminate pixels of intermediate colour
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
ShowImage('Applying Otsu',thresh,'gray')
#Verify that pixels are either black or white and nothing in between
np.unique(thresh)
Our strategy will be to locate numbers and then follow the line(s) near them to parts and then to label those parts. Since, conveniently, all of the Arabic numerals are formed from contiguous pixels, we can start by finding the connected components.
ret, components = cv2.connectedComponents(thresh)
#Each component is a different colour
ShowImage('Connected Components', components, 'rgb')
We can then filter the connected components to find the numbers by filtering for dimension. Note that this is not a super robust method of doing this. A better option would be to use character recognition, but this is left as an exercise to the reader :-)
class Box:
    def __init__(self,x0,x1,y0,y1):
        self.x0, self.x1, self.y0, self.y1 = x0,x1,y0,y1
    def overlaps(self,box2,tol):
        if self.x0 is None or box2.x0 is None:
            return False
        return not (self.x1+tol<=box2.x0 or self.x0-tol>=box2.x1 or self.y1+tol<=box2.y0 or self.y0-tol>=box2.y1)
    def merge(self,box2):
        self.x0 = min(self.x0,box2.x0)
        self.x1 = max(self.x1,box2.x1)
        self.y0 = min(self.y0,box2.y0)
        self.y1 = max(self.y1,box2.y1)
        box2.x0 = None #Used to mark `box2` as being no longer valid. It can be removed later
    def dist(self,x,y):
        #Get center point
        ax = (self.x0+self.x1)/2
        ay = (self.y0+self.y1)/2
        #Get distance to center point
        return np.sqrt((ax-x)**2+(ay-y)**2)
    def good(self):
        return not (self.x0 is None)
def ExtractComponent(original_image, component_matrix, component_number):
    """Extracts a component from a ConnectedComponents matrix"""
    #Create a true-false matrix indicating if a pixel is part of a particular component
    is_component = component_matrix==component_number
    #Find the coordinates of those pixels
    coords = np.argwhere(is_component)
    # Bounding box of non-black pixels.
    y0, x0 = coords.min(axis=0)
    y1, x1 = coords.max(axis=0) + 1 # slices are exclusive at the top
    # Get the contents of the bounding box.
    return x0,x1,y0,y1,original_image[y0:y1, x0:x1]
numbers_img = thresh.copy() #This is used purely to show that we can identify numbers
numbers = []
#Labels run from 0 (the background) to components.max() inclusive, so skip 0 and include the last label
for component in range(1, components.max()+1):
    tx0,tx1,ty0,ty1,this_component = ExtractComponent(thresh, components, component)
    #ShowImage('Component #{0}'.format(component), this_component, 'gray')
    cheight, cwidth = this_component.shape
    #print(cwidth,cheight) #Enable this to see dimensions
    #Identify numbers based on their width and height in pixels
    if (abs(cwidth-14)<3 or abs(cwidth-7)<3) and abs(cheight-24)<3:
        numbers_img[ty0:ty1,tx0:tx1] = 128
        numbers.append(Box(tx0,tx1,ty0,ty1))
ShowImage('Numbers', numbers_img, 'gray')
We now connect the numbers into contiguous blocks by expanding their bounding boxes slightly and looking for overlaps.
#This is kind of a silly way to do this, but it will work fine for small quantities (hundreds)
merged=True #If true, then a merge happened this round
while merged: #Continue until there are no more mergers
    merged=False #Reset merge indicator
    for a,b in itertools.combinations(numbers,2): #Consider all pairs of numbers
        if a.overlaps(b,10): #If this pair overlaps
            a.merge(b) #Merge it
            merged=True #Make a note that we've merged
    numbers = [x for x in numbers if x.good()] #Eliminate those boxes that were gobbled by the mergers
#This is used purely to show that we can identify numbers
numbers_img = thresh.copy()
for n in numbers:
    numbers_img[n.y0:n.y1,n.x0:n.x1] = 128
    thresh[n.y0:n.y1,n.x0:n.x1] = 0 #Drop numbers from thresholded image
ShowImage('Numbers', numbers_img, 'gray')
Okay, so now we've identified the numbers! We'll use these later to identify parts.
Identifying Arrows
Next, we'll want to figure out what parts the numbers are pointing to. To do so, we want to detect lines. The Hough transform is good for this. To reduce the number of false positives, we skeletonize the data, which transforms it into a representation which is at most one pixel wide.
skel = sk.img_as_ubyte(skm.skeletonize(thresh>0))
ShowImage('Skeleton', skel, 'gray')
Now we perform the Hough transform. We're looking for one that identifies all of the lines going from the numbers to the parts. Getting this right may take some fiddling with the parameters.
lines = cv2.HoughLinesP(
    skel,
    1,           #Resolution of r in pixels
    np.pi / 180, #Resolution of theta in radians
    30,          #Minimum number of intersections to detect a line
    None,
    80,          #Min line length
    10           #Max line gap
)
lines = [x[0] for x in lines]
line_img = thresh.copy()
line_img = cv2.cvtColor(line_img, cv2.COLOR_GRAY2BGR)
for l in lines:
    color = tuple(map(int, np.random.randint(low=0, high=255, size=3)))
    cv2.line(line_img, (l[0], l[1]), (l[2], l[3]), color, 3, cv2.LINE_AA)
ShowImage('Lines', line_img, 'bgr')
We now want to find the line or lines which are closest to each number and retain only these. We're essentially filtering out all of the lines which are not arrows. To do so, we compare the end points of each line to the center point of each number box.
comp_labels = np.zeros(img.shape[0:2], dtype=np.uint8)
for n_idx,n in enumerate(numbers):
    distvals = []
    for i,l in enumerate(lines):
        #Distances from each point of line to midpoint of rectangle
        dists = [n.dist(l[0],l[1]),n.dist(l[2],l[3])]
        #Minimum distance and the end point (0 or 1) of the line associated with that point
        #Tuples of (Line Number, Line Point, Dist to Line Point) are produced
        distvals.append( (i,np.argmin(dists),np.min(dists)) )
    #Sort by distance between the number box and the line
    distvals = sorted(distvals, key=lambda x: x[2])
    #Include nearby lines, not just the closest one. This accounts for forking.
    distvals = [x for x in distvals if x[2]<1.5*distvals[0][2]]
    #Draw a white rectangle where the number box was
    cv2.rectangle(comp_labels, (n.x0,n.y0), (n.x1,n.y1), 1, cv2.FILLED)
    #Draw white lines where the arrows are
    for dv in distvals:
        l = lines[dv[0]]
        lp = (l[0],l[1]) if dv[1]==0 else (l[2],l[3])
        cv2.line(comp_labels, (l[0], l[1]), (l[2], l[3]), 1, 3, cv2.LINE_AA)
        cv2.line(comp_labels, (lp[0], lp[1]), ((n.x0+n.x1)//2, (n.y0+n.y1)//2), 1, 3, cv2.LINE_AA)
ShowImage('Lines', comp_labels, 'gray')
Finding Parts
This part was hard! We now want to segment the parts in the image. If there was some way to disconnect the lines linking subparts together, this would be easy. Unfortunately, the lines connecting the subparts are the same width as many of the lines which constitute the parts.
To work around this, we could use a lot of logic. It would be painful and error-prone.
Alternatively, we could assume you have an expert-in-the-loop. This expert's sole job is to cut the lines connecting the subparts. This should be both easy and fast for them. Labeling everything would be slow and sad for humans, but is fast for computers. Separating things is easy for humans, but hard for computers. So we let both do what they do best.
In this case, you could probably train someone to do this job in a few minutes, so a true "expert" isn't really necessary. Just a mildly competent human.
If you pursue this, you'll need to write the expert in the loop tool. To do so, save the skeleton images, have your expert modify them, and read the skeletonized images back in. Like so.
#Save the image, or display it on a GUI
#cv2.imwrite("/z/skel.png", skel);
#EXPERT DOES THEIR THING HERE
#Read the expert-mediated image back in
skelhuman = cv2.imread('/z/skel.png')
#Convert back to the form we need
skelhuman = cv2.cvtColor(skelhuman,cv2.COLOR_BGR2GRAY)
ret, skelhuman = cv2.threshold(skelhuman,0,255,cv2.THRESH_OTSU)
ShowImage('SkelHuman', skelhuman, 'gray')
Now that we have the parts separated, we'll eliminate as much of the arrows as possible. We've already extracted these above, so we can add them back later if we need to.
To eliminate the arrows, we'll find all of the lines that terminate in locations other than by another line. That is, we'll locate pixels which have only one neighbouring pixel. We'll then eliminate the pixel and look at its neighbour. Doing this iteratively eliminates the arrows. Since I don't know another term for it, I'll call this a Fuse Transform. Since this will require manipulating individual pixels, which would be super slow in Python, we'll write the transform in Cython.
%%cython -a --cplus
import cython
from libcpp.queue cimport queue
import numpy as np
cimport numpy as np
#cython.boundscheck(False)
#cython.wraparound(False)
#cython.nonecheck(False)
#cython.cdivision(True)
cpdef void FuseTransform(unsigned char [:, :] image):
    # set the variable extension types
    cdef int c, x, y, nx, ny, width, height, neighbours
    cdef queue[int] q
    # grab the image dimensions
    height = image.shape[0]
    width = image.shape[1]
    cdef int dx[8]
    cdef int dy[8]
    #Offsets to neighbouring cells
    dx[:] = [-1,-1,0,1,1,1,0,-1]
    dy[:] = [0,-1,-1,-1,0,1,1,1]
    #Find seed cells: those with only one neighbour
    for y in range(1, height-1):
        for x in range(1, width-1):
            if image[y,x]==0: #Seed cells cannot be blank cells
                continue
            neighbours = 0
            for n in range(0,8): #Looks at all neighbours
                nx = x+dx[n]
                ny = y+dy[n]
                if image[ny,nx]>0: #This neighbour has a value
                    neighbours += 1
            if neighbours==1: #Was there only one neighbour?
                q.push(y*width+x) #If so, this is a seed cell
    #Starting with the seed cells, gobble up the lines
    while not q.empty():
        c = q.front()
        q.pop()
        y = c//width #Convert flat index into 2D x-y index
        x = c%width
        image[y,x] = 0 #Gobble up this part of the fuse
        neighbour = -1 #No neighbours yet
        for n in range(0,8): #Look at all neighbours
            nx = x+dx[n] #Find coordinates of neighbour cells
            ny = y+dy[n]
            #If the neighbour would be off the side of the matrix, ignore it
            if nx<0 or ny<0 or nx==width or ny==height:
                continue
            if image[ny,nx]>0: #Is the neighbouring cell active?
                if neighbour!=-1: #If we've already found an active neighbour
                    neighbour=-1 #Then pretend we found no neighbours
                    break #And stop looking. This is the end of the fuse.
                else: #Otherwise, make a note of the neighbour's index.
                    neighbour = ny*width+nx
        if neighbour!=-1: #If there was only one neighbour
            q.push(neighbour) #Continue burning the fuse
Back in standard Python:
#Apply the Fuse Transform
skh_dilated=skelhuman.copy()
FuseTransform(skh_dilated)
ShowImage('Fuse Transform', skh_dilated, 'gray')
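For readers without a Cython setup, the same idea can be sketched in plain Python with a deque. This is an illustrative re-implementation under assumptions, not the Cython routine itself: the name fuse_transform and the toy image are made up, and unlike the Cython version it also looks for seed pixels on the image border. It will be far slower on full-size images.

```python
from collections import deque
import numpy as np

# Offsets (dx, dy) to the eight neighbouring cells.
NEIGHBOURS = [(-1, 0), (-1, -1), (0, -1), (1, -1),
              (1, 0), (1, 1), (0, 1), (-1, 1)]

def fuse_transform(image):
    """Erase dangling one-pixel-wide lines in place, from their free ends."""
    height, width = image.shape

    def active_neighbours(y, x):
        found = []
        for dx, dy in NEIGHBOURS:
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height and image[ny, nx] > 0:
                found.append((ny, nx))
        return found

    # Seed cells: set pixels with exactly one set neighbour (line ends).
    queue = deque((y, x) for y in range(height) for x in range(width)
                  if image[y, x] > 0 and len(active_neighbours(y, x)) == 1)
    # Burn each fuse until it reaches a junction or runs out.
    while queue:
        y, x = queue.popleft()
        image[y, x] = 0
        rest = active_neighbours(y, x)
        if len(rest) == 1:
            queue.append(rest[0])

# Toy skeleton (an assumption): a closed square ring with a 3-pixel tail.
image = np.zeros((6, 9), dtype=np.uint8)
image[1, 1:5] = 255
image[4, 1:5] = 255
image[1:5, 1] = 255
image[1:5, 4] = 255
ring = image.copy()
image[2, 5:8] = 255  # the tail, standing in for an arrow
fuse_transform(image)
print(np.array_equal(image, ring))
```

The tail burns away pixel by pixel and stops where it meets the ring, because the last tail pixel has more than one remaining neighbour there.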
Now that we've eliminated all of the arrows and lines connecting the parts, we dilate the remaining pixels a lot.
kernel = np.ones((3,3),np.uint8)
dilated = cv2.dilate(skh_dilated, kernel, iterations=6)
ShowImage('Dilation', dilated, 'gray')
Putting It All Together
And overlay the labels and arrows we segmented out earlier...
comp_labels_dilated = cv2.dilate(comp_labels, kernel, iterations=5)
labels_combined = np.uint8(np.logical_or(comp_labels_dilated,dilated))
ShowImage('Comp Labels', labels_combined, 'gray')
Finally, we take the merged number boxes, component arrows, and parts and color each of them using pretty colors from Color Brewer. We then overlay this on the original image to obtain the desired highlighting.
ret, labels = cv2.connectedComponents(labels_combined)
colormask = np.zeros(img.shape, dtype=np.uint8)
#Colors from Color Brewer
colors = [(228,26,28),(55,126,184),(77,175,74),(152,78,163),(255,127,0),(255,255,51),(166,86,40),(247,129,191),(153,153,153)]
for l in range(labels.max()):
    if l==0: #Background component
        colormask[labels==0] = (255,255,255)
    else:
        colormask[labels==l] = colors[l]
ShowImage('Comp Labels', colormask, 'bgr')
blended = cv2.addWeighted(img,0.7,colormask,0.3,0)
ShowImage('Blended', blended, 'bgr')
The final image
So, to recap, we identified numbers, arrows, and parts. In some cases, we were able to separate them automatically. In other cases, we used expert in the loop. Where we had to manipulate pixels individually, we used Cython for speed.
Of course, the danger with this sort of thing is that some other image will break the (many) assumptions I've made here. But that's a risk that you take when you try to use a single image to present a problem.

time lag in program for centroid calculation

I am trying to determine the centroid of one specific object using OpenCV and Python.
I am using the following code, but it is taking too much time to calculate the centroid.
I need a faster approach for this -- should I change the resolution of the cameras in order to increase the computing speed?
This is my code:
meanI=[0]
meanJ=[0]
#taking infinite frames continuously to make a video
while(True):
    ret, frame = capture.read()
    rgb_image = cv2.cvtColor(frame , 0)
    content_red = rgb_image[:,:,2] #red channel of image
    content_green = rgb_image[:,:,1] #green channel of image
    content_blue = rgb_image[:,:,0] #blue channel of image
    r = rgb_image.shape[0] #gives the rows of the image matrix
    c = rgb_image.shape[1] # gives the columns of the image matrix
    d = rgb_image.shape[2] #gives the depth order of the image matrix
    binary_image = np.zeros((r,c),np.float32)
    for i in range (1,r): #thresholding the object as per requirements
        for j in range (1,c):
            if((content_red[i][j]>186) and (content_red[i][j]<230) and \
               (content_green[i][j]>155) and (content_green[i][j]<165) and \
               (content_blue[i][j]> 175) and (content_blue[i][j]< 195)):
                binary_image[i][j] = 1
                meanI.append(i)
                meanJ.append(j)
    cv2.imshow('frame1',binary_image)
    cv2.waitKey()
    cox = np.mean(meanI) #x-coordinate of centroid
    coy = np.mean(meanJ) #y-coordinate of centroid
As you have discovered, nested loops in Python are very slow. It is best to avoid iterating over every pixel using nested loops. Fortunately, OpenCV has some built-in functions that do exactly what you are trying to achieve: inRange(), which creates a binary image of pixels which fall in between the specified bounds, and moments(), which you can use to calculate the centroid of a binary image. I strongly suggest reading over OpenCV's documentation to get a feel for what the library offers.
Combining these two functions gives the following code:
import numpy as np
import cv2
lower = np.array([175, 155, 186], np.uint8) # Note these ranges are BGR ordered
upper = np.array([195, 165, 230], np.uint8)
binary = cv2.inRange(im, lower, upper) # im is your BGR image
moments = cv2.moments(binary, True)
cx = moments['m10'] / moments['m00']
cy = moments['m01'] / moments['m00']
cx and cy are the x- and y-coordinates of the image centroid. This version is a whopping 3000 times faster than using nested loops.
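To make clear what the two OpenCV calls compute, here is a NumPy-only sketch of the same range test and centroid. It is not a replacement for the code above: the function name centroid_in_range and the synthetic image are assumptions for the example. The bounds are inclusive here, which matches my understanding of inRange(), whereas the loop in the question uses strict comparisons.

```python
import numpy as np

def centroid_in_range(im, lower, upper):
    """Centroid (cx, cy) of the pixels whose channels all lie in [lower, upper]."""
    mask = np.all((im >= lower) & (im <= upper), axis=2)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # nothing matched; avoids a division by zero
    return float(xs.mean()), float(ys.mean())

lower = np.array([175, 155, 186], np.uint8)  # BGR ordered, as above
upper = np.array([195, 165, 230], np.uint8)

# Synthetic BGR image (an assumption): one 11x11 patch inside the range.
im = np.zeros((50, 80, 3), np.uint8)
im[10:21, 30:41] = (180, 160, 200)
centroid = centroid_in_range(im, lower, upper)
print(centroid)
```

The empty-mask check matters in the OpenCV version as well: if no pixel falls inside the range, moments['m00'] is zero and the divisions fail.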

Trim scanned images with PIL?

What would be the approach to trim an image that's been input using a scanner and therefore has a large white/black area?
The entropy solution seems problematic and overly intensive computationally. Why not edge detect?
I just wrote this python code to solve this same problem for myself. My background was dirty white-ish, so the criteria that I used were darkness and color. I simplified these criteria by just taking the smallest of the R, G or B values for each pixel, so that black or saturated red both stood out the same. I also used the average of the however many darkest pixels for each row or column. Then I started at each edge and worked my way in till I crossed a threshold.
Here is my code:
#these values set how sensitive the bounding box detection is
threshold = 200 #the average of the darkest values must be _below_ this to count (0 is darkest, 255 is lightest)
obviousness = 50 #how many of the darkest pixels to include (1 would mean a single dark pixel triggers it)
from PIL import Image
import numpy as np
def find_line(vals):
    #implement edge detection once, use many times
    for i,tmp in enumerate(vals):
        tmp.sort()
        average = float(sum(tmp[:obviousness]))/len(tmp[:obviousness])
        if average <= threshold:
            return i
    return i #i is left over from failed threshold finding, it is the bounds
def getbox(img):
    #get the bounding box of the interesting part of a PIL image object
    #this is done by getting the darkest of the R, G or B value of each pixel
    #and finding where the edge gets dark/colored enough
    #returns a tuple of (left,upper,right,lower)
    width, height = img.size #for making a 2d array
    retval = [0,0,width,height] #values will be disposed of, but this is a black image's box
    pixels = list(img.getdata())
    vals = [] #store the value of the darkest color
    for pixel in pixels:
        vals.append(min(pixel)) #the darkest of the R,G or B values
    #make 2d array
    vals = np.array([vals[i * width:(i + 1) * width] for i in xrange(height)])
    #start with upper bounds
    forupper = vals.copy()
    retval[1] = find_line(forupper)
    #next, do lower bounds
    forlower = vals.copy()
    forlower = np.flipud(forlower)
    retval[3] = height - find_line(forlower)
    #left edge, same as before but rotate the data so left edge is top edge
    forleft = vals.copy()
    forleft = np.swapaxes(forleft,0,1)
    retval[0] = find_line(forleft)
    #and right edge is bottom edge of rotated array
    forright = vals.copy()
    forright = np.swapaxes(forright,0,1)
    forright = np.flipud(forright)
    retval[2] = width - find_line(forright)
    if retval[0] >= retval[2] or retval[1] >= retval[3]:
        print "error, bounding box is not legit"
        return None
    return tuple(retval)
if __name__ == '__main__':
    image = Image.open('cat.jpg')
    box = getbox(image)
    print "result is: ",box
    result = image.crop(box)
    result.show()
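The same criterion (mean of the darkest few values per row or column, compared against a threshold) can also be evaluated for all rows and columns at once with NumPy. The following is a sketch in Python 3 under assumptions: the function name find_box and the synthetic page are made up, the input is an RGB array rather than a PIL image, and when nothing is dark enough it returns None instead of the fallback index used above.

```python
import numpy as np

def find_box(rgb, threshold=200, obviousness=50):
    """Bounding box (left, upper, right, lower) of the dark or coloured part."""
    vals = rgb.min(axis=2).astype(np.float64)  # darkest channel per pixel

    def dark_lines(a):
        # Indices of the rows of `a` whose `obviousness` darkest values
        # average at or below the threshold.
        k = min(obviousness, a.shape[1])
        darkest = np.sort(a, axis=1)[:, :k]
        return np.flatnonzero(darkest.mean(axis=1) <= threshold)

    rows = dark_lines(vals)
    cols = dark_lines(vals.T)
    if rows.size == 0 or cols.size == 0:
        return None
    return int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1

# Synthetic scan (an assumption): a white page with one dark block.
page = np.full((300, 400, 3), 255, dtype=np.uint8)
page[100:200, 150:300] = 0
box = find_box(page)
print(box)
```

With a PIL image, np.asarray(image.convert('RGB')) would give the array, and the returned tuple has the (left, upper, right, lower) order that image.crop() expects.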
For starters, here is a similar question. Here is a related question. And another related question.
Here is just one idea; there are certainly other approaches. I would select an arbitrary crop edge and then measure the entropy* on either side of the line, then proceed to re-select the crop line (probably using something like a bisection method) until the entropy of the cropped-out portion falls below a defined threshold. I think you may need to resort to a brute-force root-finding method, as you will not have a good indication of when you have cropped too little. Then repeat for the remaining 3 edges.
*I recall discovering that the entropy method in the referenced website was not completely accurate, but I could not find my notes (I'm sure it was in a SO post, however.)
Edit:
Other criteria for the "emptiness" of an image portion (other than entropy) might be contrast ratio or contrast ratio on an edge-detect result.
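As a rough illustration of the entropy measure mentioned above, the Shannon entropy of the grey-level histogram of a strip can be computed as follows. This is a sketch under assumptions, not the method from the referenced website: the function name shannon_entropy and the toy strips are made up, and the input is assumed to be an 8-bit greyscale array.

```python
import numpy as np

def shannon_entropy(gray):
    """Shannon entropy in bits of the grey-level histogram of a uint8 array."""
    counts = np.bincount(gray.ravel(), minlength=256)
    p = counts[counts > 0] / gray.size  # probabilities of the levels present
    return float(-(p * np.log2(p)).sum())

# An empty (uniformly white) margin versus a strip with two grey levels.
blank_strip = np.full((20, 100), 255, dtype=np.uint8)
mixed_strip = blank_strip.copy()
mixed_strip[:, :50] = 0
blank_entropy = shannon_entropy(blank_strip)
mixed_entropy = shannon_entropy(mixed_strip)
print(blank_entropy, mixed_entropy)
```

A candidate crop line could then be moved inwards while the entropy of the cropped-out strip stays below a chosen threshold. Real scanner margins are noisy, so their entropy will not be exactly zero and the threshold has to be tuned.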
