I have frames of a video taken from a microscope. I need to crop them to a square inscribed to the circle but the issue is that the circle isn't whole (like in the following image). How can I do it?
My idea was to use contour finding to get the center of the circle and then find the distance from each point over the whole array of coordinates to the center, take the maximum distance as the radius and find the corners of the square analytically but there must be a better way to do it (also I don't really have a formula to find the corners).
This may not be adequate in terms of centered at center of circle, but using my iterative processing, one can crop to an approximation of the largest rectangle inside your circle area.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('img.jpg')
h, w = img.shape[:2]
# threshold so border is black and rest is white (invert as needed).
# Here I needed to specify the upper threshold at 20 as your black is not pure black.
lower = (0,0,0)
upper = (20,20,20)
mask = cv2.inRange(img, lower, upper)
mask = 255 - mask
# define top and left starting coordinates and starting width and height
top = 0
left = 0
bottom = h
right = w
# compute the mean of each side of the image and its stop test
mean_top = np.mean( mask[top:top+1, left:right] )
mean_left = np.mean( mask[top:bottom, left:left+1] )
mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
mean_right = np.mean( mask[top:bottom, right-1:right] )
mean_minimum = min(mean_top, mean_left, mean_bottom, mean_right)
top_test = "stop" if (mean_top == 255) else "go"
left_test = "stop" if (mean_left == 255) else "go"
bottom_test = "stop" if (mean_bottom == 255) else "go"
right_test = "stop" if (mean_right == 255) else "go"
# iterate to compute new side coordinates if mean of given side is not 255 (all white) and it is the current darkest side
while top_test == "go" or left_test == "go" or right_test == "go" or bottom_test == "go":
# top processing
if top_test == "go":
if mean_top != 255:
if mean_top == mean_minimum:
top += 1
mean_top = np.mean( mask[top:top+1, left:right] )
mean_left = np.mean( mask[top:bottom, left:left+1] )
mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
mean_right = np.mean( mask[top:bottom, right-1:right] )
mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
#print("top",mean_top)
continue
else:
top_test = "stop"
# left processing
if left_test == "go":
if mean_left != 255:
if mean_left == mean_minimum:
left += 1
mean_top = np.mean( mask[top:top+1, left:right] )
mean_left = np.mean( mask[top:bottom, left:left+1] )
mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
mean_right = np.mean( mask[top:bottom, right-1:right] )
mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
#print("left",mean_left)
continue
else:
left_test = "stop"
# bottom processing
if bottom_test == "go":
if mean_bottom != 255:
if mean_bottom == mean_minimum:
bottom -= 1
mean_top = np.mean( mask[top:top+1, left:right] )
mean_left = np.mean( mask[top:bottom, left:left+1] )
mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
mean_right = np.mean( mask[top:bottom, right-1:right] )
mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
#print("bottom",mean_bottom)
continue
else:
bottom_test = "stop"
# right processing
if right_test == "go":
if mean_right != 255:
if mean_right == mean_minimum:
right -= 1
mean_top = np.mean( mask[top:top+1, left:right] )
mean_left = np.mean( mask[top:bottom, left:left+1] )
mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
mean_right = np.mean( mask[top:bottom, right-1:right] )
mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
#print("right",mean_right)
continue
else:
right_test = "stop"
# crop input
result = img[top:bottom, left:right]
# print crop values
print("top: ",top)
print("bottom: ",bottom)
print("left: ",left)
print("right: ",right)
print("height:",result.shape[0])
print("width:",result.shape[1])
# save cropped image
#cv2.imwrite('border_image1_cropped.png',result)
cv2.imwrite('img_cropped.png',result)
cv2.imwrite('img_mask.png',mask)
# show the images
cv2.imshow("mask", mask)
cv2.imshow("cropped", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
Let's start with an illustration of the problem to help with the explanation.
Of course, we have to begin with loading the image. Let's also grab its width and height, since they will be useful later on.
img = cv2.imread('TUP74.jpg', cv2.IMREAD_COLOR)
height, width = img.shape[:2]
First, let's convert the image to grayscale and then apply threshold to make the circle all white, and the background black. I arbitrarily picked a threshold value of 31, which seems to give reasonable results.
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(img_gray, 31, 255, cv2.THRESH_BINARY)
The result of those operations looks like this:
Now, we can determine the "top" and "bottom" of the circle (first_yd and last_yd), by finding the first and last row that contains at least one white pixel. I chose to use cv2.reduce to find the maximum of each row (since the thresholded image only contains 0's and 255's, a non-zero result means there is at least 1 white pixel), followed by cv2.findNonZero to get the row numbers.
reduced = cv2.reduce(thresh, 1, cv2.REDUCE_MAX)
row_info = cv2.findNonZero(reduced)
first_yd, last_yd = row_info[0][0][1], row_info[-1][0][1]
This information allows us to determine the diameter of the circle d, its radius r (r = d/2), as well as the Y coordinate of the center of the circle center_y.
diameter = last_yd - first_yd
radius = int(diameter / 2)
center_y = first_yd + radius
Next, we need to determine the X coordinate of the center of the circle center_x.
Let's take advantage of the fact that the circle is cropped on the left-hand side. The white pixels in the first column of the threshold image represent a chord c of the circle (red in the diagram).
Again, we begin with finding the "top" and "bottom" of the chord (first_yc and last_yc), but since we're working with a single column, we only need cv2.findNonZero.
row_info = cv2.findNonZero(thresh[:,0])
first_yc, last_yc = row_info[0][0][1], row_info[-1][0][1]
c = last_yc - first_yc
Now we have a nice right-angled triangle with one side adjacent to the right angle being half of the chord c (red in the diagram), the other adjacent side being the unknown offset o, and the hypotenuse (green in the diagram) being the radius of the circle r. Let's apply Pythagoras' theorem:
r2 = (c/2)2 + o2
o2 = r2 - (c/2)2
o = sqrt(r2 - (c/2)2)
And in Python:
center_x = int(math.sqrt(radius**2 - (c/2)**2))
Now we're ready to determine the parameters of the inscribed square. Let's keep in mind that the center of the circle and center of its inscribed square are co-located. Here is another illustration:
We will again use Pythagoras' theorem. The hypotenuse of the right triangle is again the radius r. Both of the sides adjacent to the right angle are of equal length, which is half the length of the side of inscribed square s.
r2 = (s/2)2 + (s/2)2
r2 = 2 × (s/2)2
r2 = 2 × s2/22
r2 = s2/2
s2 = 2 × r2
s = sqrt(2) × r
And in Python:
s = int(math.sqrt(2) * radius)
Finally, we can determine the top-left and bottom-right corners of the inscribed square. Both of those points are offset by s/2 from the common center.
half_s = int(s/2)
tl = (center_x - half_s, center_y - half_s)
br = (center_x + half_s, center_y + half_s)
We have determined all the parameters we need. Let's print them out...
Circle diameter = 1167 pixels
Circle radius = 583 pixels
Circle center = (404,1089)
Inscribed square side = 824 pixels
Inscribed square top-left = (-8,677)
Inscribed square bottom-right = (816,1501)
and visualize the center (green), the detected circle (red) and the inscribed square (blue) on a copy of the input image:
Now we can do the cropping, but first we have to make sure we don't go out of bounds of the source image.
crop_left = max(tl[0], 0)
crop_top = max(tl[1], 0) # Kinda redundant, but why not
crop_right = min(br[0], width)
crop_bottom = min(br[1], height) # ditto
cropped = img[crop_top:crop_bottom, crop_left:crop_right]
And that's it. Here's the cropped image (it's rectangular, since small part of the inscribed square falls outside the source image, and scaled down for embedding -- click to get the full-sized image):
Complete Script
import cv2
import numpy as np
import math
img = cv2.imread('TUP74.jpg', cv2.IMREAD_COLOR)
height, width = img.shape[:2]
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(img_gray, 31, 255, cv2.THRESH_BINARY)
# Find top/bottom of the circle, to determine radius and center
reduced = cv2.reduce(thresh, 1, cv2.REDUCE_MAX)
row_info = cv2.findNonZero(reduced)
first_yd, last_yd = row_info[0][0][1], row_info[-1][0][1]
diameter = last_yd - first_yd
radius = int(diameter / 2)
center_y = first_yd + radius
# Repeat again, just on first column, to find length of a chord of the circle
row_info = cv2.findNonZero(thresh[:,0])
first_yc, last_yc = row_info[0][0][1], row_info[-1][0][1]
c = last_yc - first_yc
# Apply Pythagoras theorem to find the X offset of the center from the chord
# Since the chord is in row 0, this is also the X coordinate
center_x = int(math.sqrt(radius**2 - (c/2)**2))
# Find length of the side of the inscribed square (Pythagoras again)
s = int(math.sqrt(2) * radius)
# Now find the top-left and bottom-right corners of the square
half_s = int(s/2)
tl = (center_x - half_s, center_y - half_s)
br = (center_x + half_s, center_y + half_s)
# Let's print out what we found
print("Circle diameter = %d pixels" % diameter)
print("Circle radius = %d pixels" % radius)
print("Circle center = (%d,%d)" % (center_x, center_y))
print("Inscribed square side = %d pixels" % s)
print("Inscribed square top-left = (%d,%d)" % tl)
print("Inscribed square bottom-right = (%d,%d)" % br)
# And visualize it...
vis = img.copy()
cv2.line(vis, (center_x-5,center_y), (center_x+5,center_y), (0,255,0), 3)
cv2.line(vis, (center_x,center_y-5), (center_x,center_y+5), (0,255,0), 3)
cv2.circle(vis, (center_x,center_y), radius, (0,0,255), 3)
cv2.rectangle(vis, tl, br, (255,0,0), 3)
# Write some illustration images
cv2.imwrite('circ_thresh.png', thresh)
cv2.imwrite('circ_vis.png', vis)
# Time to do some cropping, but we need to make sure the coordinates are inside the bounds of the image
crop_left = max(tl[0], 0)
crop_top = max(tl[1], 0) # Kinda redundant, but why not
crop_right = min(br[0], width)
crop_bottom = min(br[1], height) # ditto
cropped = img[crop_top:crop_bottom, crop_left:crop_right]
cv2.imwrite('circ_cropped.png', cropped)
NB: The main focus of this was the explanation of the algorithm. I've been kinda blunt on rounding the values, and there may be some off-by-one errors. For the sake of brevity, error checking is minimal. It's left as an excercise to the reader to address those issues as necessary.
Furthermore, the assumption is that the left-hand side of the circle is cropped as in the sample image. It should be fairly trivial to extend this to handle other possible scenarios, using the techniques I've demonstrated.
Building on Dan Mašek's answer, here is an alternate method of computing the center and radius in Python/OpenCV/Numpy, in particular, the x-coordinate of the center.
The idea is simply find the coordinate of column that has the largest non-zero count in the thresholded image.
Input:
import cv2
import numpy as np
import math
img = cv2.imread('img_circle.jpg')
height, width = img.shape[:2]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 31, 255, cv2.THRESH_BINARY)[1]
# Find top/bottom of the circle, to determine radius and y coordinate center
reduced = cv2.reduce(thresh, 1, cv2.REDUCE_MAX)
row_info = cv2.findNonZero(reduced)
first_yd, last_yd = row_info[0][0][1], row_info[-1][0][1]
diameter = last_yd - first_yd
radius = int(diameter / 2)
center_y = first_yd + radius
# count non-zero pixels in columns to find the column with the largest count
# that will give us the x coordinate center
col_counts = np.count_nonzero(thresh, axis=0)
max_counts = np.amax(col_counts)
# find index (x-coordinate) where col_counts=max_counts
max_coords = np.argwhere(col_counts==max_counts)
# get number of max values in case more than one
num_max = len(max_coords)
# compute center_y
center_x = max_coords[0][0] + num_max//2
print("radius:", radius, "center_x:", center_x, "center_y:", center_y)
print('')
Result:
radius: 583 center_x: 388 center_y: 1089
The rest is the same as in Dan Mašek's answer.
Find the edge points of the image circle, and then fit a circle to the edge.
Or, you may be able to use minEnclosingCircle() instead of circle fitting.
(I omit the explanation of the subsequent steps for obtaining a square.)
Related
Is there a Contour Method to detect arrows in Python CV? Maybe with Contours, Shapes, and Vertices.
# find contours in the thresholded image and initialize the shape detector
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
perimeterValue = cv2.arcLength(cnts , True)
vertices = cv2.approxPolyDP(cnts , 0.04 * perimeterValue, True)
Perhaps we can look at tips of the contours, and also detect triangles?
Hopefully it can detect arrows among different objects, among squares, rectangles, and circles. (otherwise, will have to use machine learning).
Also nice to get these three results if possible (arrow length, thickness, directionAngle)
This question recommends template matching, and doesn't specify any code base. Looking for something workable that can be code created
how to detect arrows using open cv python?
If PythonOpenCV doesn't have capability, open to utilizing another library.
The solution you are asking for is too complex to be solved by one function or particular algorithm. In fact, the problem could be broken down into smaller steps, each with their own algorithms and solutions. Instead of offering you a free, complete, copy-paste solution, I'll give you a general outline of the problem and post part of the solution I'd design. These are the steps I propose:
Identify and extract all the arrow blobs from the image, and process them one by one.
Try to find the end-points of the arrow. That is end and starting point (or "tail" and "tip")
Undo the rotation, so you have straightened arrows always, no matter their angle.
After this, the arrows will always point to one direction. This normalization let's itself easily for classification.
After processing, you can pass the image to a Knn classifier, a Support Vector Machine or even (if you are willing to call the "big guns" on this problem) a CNN (in which case, you probably won't need to undo the rotation - as long as you have enough training samples). You don't even have to compute features, as passing the raw image to a SVM would be probably enough. However, you need more than one training sample for each arrow class.
Alright, let's see. First, let's extract each arrow from the input. This is done using cv2.findCountours, this part is very straightforward:
# Imports:
import cv2
import math
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "arrows.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Grayscale conversion:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
grayscaleImage = 255 - grayscaleImage
# Find the big contours/blobs on the binary image:
contours, hierarchy = cv2.findContours(grayscaleImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
Now, let's check out the contours and process them one by one. Let's compute a (non-rotated) bounding box of the arrow and crop that sub-image. Now, note that some noise could come up. In which case, we won't be processing that blob. I apply an area filter to bypass blobs of small area. Like this:
# Process each contour 1-1:
for i, c in enumerate(contours):
# Approximate the contour to a polygon:
contoursPoly = cv2.approxPolyDP(c, 3, True)
# Convert the polygon to a bounding rectangle:
boundRect = cv2.boundingRect(contoursPoly)
# Get the bounding rect's data:
rectX = boundRect[0]
rectY = boundRect[1]
rectWidth = boundRect[2]
rectHeight = boundRect[3]
# Get the rect's area:
rectArea = rectWidth * rectHeight
minBlobArea = 100
We set a minBlobArea and process that contour. Crop the image if the contour is above that area threshold value:
# Check if blob is above min area:
if rectArea > minBlobArea:
# Crop the roi:
croppedImg = grayscaleImage[rectY:rectY + rectHeight, rectX:rectX + rectWidth]
# Extend the borders for the skeleton:
borderSize = 5
croppedImg = cv2.copyMakeBorder(croppedImg, borderSize, borderSize, borderSize, borderSize, cv2.BORDER_CONSTANT)
# Store a deep copy of the crop for results:
grayscaleImageCopy = cv2.cvtColor(croppedImg, cv2.COLOR_GRAY2BGR)
# Compute the skeleton:
skeleton = cv2.ximgproc.thinning(croppedImg, None, 1)
There are some couple of things going on here. After I crop the ROI of the current arrow, I extend borders on that image. I store a deep-copy of this image for further processing and, lastly, I compute the skeleton. The border-extending is done prior to skeletonizing because the algorithm produces artifacts if the contour is too close to the image limits. Padding the image in all directions prevents these artifacts. The skeleton is needed for the way I'm finding ending and starting points of the arrow. More of this latter, this is the first arrow cropped and padded:
This is the skeleton:
Note that the "thickness" of the contour is normalized to 1 pixel. That's cool, because that's what I need for the following processing step: Finding start/ending points. This is done by applying a convolution with a kernel designed to identify one-pixel wide end-points on a binary image. Refer to this post for the specifics. We will prepare the kernel and use cv2.filter2d to get the convolution:
# Threshold the image so that white pixels get a value of 0 and
# black pixels a value of 10:
_, binaryImage = cv2.threshold(skeleton, 128, 10, cv2.THRESH_BINARY)
# Set the end-points kernel:
h = np.array([[1, 1, 1],
[1, 10, 1],
[1, 1, 1]])
# Convolve the image with the kernel:
imgFiltered = cv2.filter2D(binaryImage, -1, h)
# Extract only the end-points pixels, those with
# an intensity value of 110:
binaryImage = np.where(imgFiltered == 110, 255, 0)
# The above operation converted the image to 32-bit float,
# convert back to 8-bit uint
binaryImage = binaryImage.astype(np.uint8)
After the convolution, all end-points have a value of 110. Setting these pixels to 255, while the rest are set to black, yields the following image (after proper conversion):
Those tiny pixels correspond to the "tail" and "tip" of the arrow. Notice there's more than one point per "Arrow section". This is because the end-points of the arrow do not perfectly end in one pixel. In the case of the tip, for example, there will be more end-points than in the tail. This is a characteristic we will exploit latter. Now, pay attention to this. There are multiple end-points but we only need an starting point and an ending point. I'm gonna use K-Means to group the points in two clusters.
Using K-means will also let me identify which end-points belong to the tail and which to the tip, so I'll always know the direction of the arrow. Let's roll:
# Find the X, Y location of all the end-points
# pixels:
Y, X = binaryImage.nonzero()
# Check if I got points on my arrays:
if len(X) > 0 or len(Y) > 0:
# Reshape the arrays for K-means
Y = Y.reshape(-1,1)
X = X.reshape(-1,1)
Z = np.hstack((X, Y))
# K-means operates on 32-bit float data:
floatPoints = np.float32(Z)
# Set the convergence criteria and call K-means:
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, label, center = cv2.kmeans(floatPoints, 2, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
Be careful with the data types. If I print the label and center matrices, I get this (for the first arrow):
Center:
[[ 6. 102. ]
[104. 20.5]]
Labels:
[[1]
[1]
[0]]
center tells me the center (x,y) of each cluster – That is the two points I was originally looking for. label tells me on which cluster the original data falls in. As you see, there were originally 3 points. 2 of those points (the points belonging to the tip of the arrow) area assigned to cluster 1, while the remaining end-point (the arrow tail) is assigned to cluster 0. In the centers matrix the centers are ordered by cluster number. That is – first center is that one of cluster 0, while second cluster is the center of cluster 1. Using this info I can easily look for the cluster that groups the majority of points - that will be the tip of the arrow, while the remaining will be the tail:
# Set the cluster count, find the points belonging
# to cluster 0 and cluster 1:
cluster1Count = np.count_nonzero(label)
cluster0Count = np.shape(label)[0] - cluster1Count
# Look for the cluster of max number of points
# That cluster will be the tip of the arrow:
maxCluster = 0
if cluster1Count > cluster0Count:
maxCluster = 1
# Check out the centers of each cluster:
matRows, matCols = center.shape
# We need at least 2 points for this operation:
if matCols >= 2:
# Store the ordered end-points here:
orderedPoints = [None] * 2
# Let's identify and draw the two end-points
# of the arrow:
for b in range(matRows):
# Get cluster center:
pointX = int(center[b][0])
pointY = int(center[b][1])
# Get the "tip"
if b == maxCluster:
color = (0, 0, 255)
orderedPoints[0] = (pointX, pointY)
# Get the "tail"
else:
color = (255, 0, 0)
orderedPoints[1] = (pointX, pointY)
# Draw it:
cv2.circle(grayscaleImageCopy, (pointX, pointY), 3, color, -1)
cv2.imshow("End Points", grayscaleImageCopy)
cv2.waitKey(0)
This is the result; the tip of the end-point of the arrow will always be in red and the end-point for the tail in blue:
Now, we know the direction of the arrow, let's compute the angle. I will measure this angle from 0 to 360. The angle will always be the one between the horizon line and the tip. So, we manually compute the angle:
# Store the tip and tail points:
p0x = orderedPoints[1][0]
p0y = orderedPoints[1][1]
p1x = orderedPoints[0][0]
p1y = orderedPoints[0][1]
# Compute the sides of the triangle:
adjacentSide = p1x - p0x
oppositeSide = p0y - p1y
# Compute the angle alpha:
alpha = math.degrees(math.atan(oppositeSide / adjacentSide))
# Adjust angle to be in [0,360]:
if adjacentSide < 0 < oppositeSide:
alpha = 180 + alpha
else:
if adjacentSide < 0 and oppositeSide < 0:
alpha = 270 + alpha
else:
if adjacentSide > 0 > oppositeSide:
alpha = 360 + alpha
Now you have the angle, and this angle is always measured between the same references. That's cool, we can undo the rotation of the original image like follows:
# Deep copy for rotation (if needed):
rotatedImg = croppedImg.copy()
# Undo rotation while padding output image:
rotatedImg = rotateBound(rotatedImg, alpha)
cv2. imshow("rotatedImg", rotatedImg)
cv2.waitKey(0)
else:
print( "K-Means did not return enough points, skipping..." )
else:
print( "Did not find enough end points on image, skipping..." )
This yields the following result:
The arrow will always point top the right regardless of its original angle. Use this as normalization for a batch of training images, if you want to classify each arrow in its own class.
Now, you noticed that I used a function to rotate the image: rotateBound. This function is taken from here. This functions correctly pads the image after rotation, so you do not end up with a rotated image that is cropped incorrectly.
This is the definition and implementation of rotateBound:
def rotateBound(image, angle):
# grab the dimensions of the image and then determine the
# center
(h, w) = image.shape[:2]
(cX, cY) = (w // 2, h // 2)
# grab the rotation matrix (applying the negative of the
# angle to rotate clockwise), then grab the sine and cosine
# (i.e., the rotation components of the matrix)
M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])
# compute the new bounding dimensions of the image
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
# adjust the rotation matrix to take into account translation
M[0, 2] += (nW / 2) - cX
M[1, 2] += (nH / 2) - cY
# perform the actual rotation and return the image
return cv2.warpAffine(image, M, (nW, nH))
These are results for the rest of your arrows. The tip (always in red), the tail (always in blue) and their "projective normalization" - always pointing to the right:
What remains is collect samples of your different arrow classes, set up a classifier, train it with your samples and test it with the straightened image coming from the last processing block we examined.
Some remarks: Some arrows, like the one that is not filled, failed the end-point identification part, thus, not yielding enough points for clustering. That arrow is by-passed by the algorithm. The problem is tougher than initially though, right? I recommend doing some research on the topic, because not matter how "easy" the task seems, at the end, it will be performed by an automated "smart" system. And those systems aren't really that smart at the end of the day.
Here is the workflow I put together that would make this work:
Import the necessary libraries:
import cv2
import numpy as np
Define a function that will take in an image, and process it into something that can allow python to more easily find the necessary contours of each shape. The values can be adjusted to better suit your needs:
def preprocess(img):
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_blur = cv2.GaussianBlur(img_gray, (5, 5), 1)
img_canny = cv2.Canny(img_blur, 50, 50)
kernel = np.ones((3, 3))
img_dilate = cv2.dilate(img_canny, kernel, iterations=2)
img_erode = cv2.erode(img_dilate, kernel, iterations=1)
return img_erode
Define a function that will take in two lists; an approximate contour of a shape, points, and the indices of the convex hull of that contour, convex_hull. For the below function, you must make sure that the length of the points list is exactly 2 units greater than the length of the convex_hull list before calling the function. The reasoning is that optimally, the arrow should have exactly 2 more points that aren't present in the convex hull of the arrow.
def find_tip(points, convex_hull):
In the find_tip function, define a list of the indices of the points array where the values are not present in the convex_hull array:
length = len(points)
indices = np.setdiff1d(range(length), convex_hull)
In order to find the tip of the arrow, given we have the approximate outline of the arrow as points and the indices of the two points that are concave to the arrow, indices, we can find the tip by either subtracting 2 from the first index in the indices list, or by adding 2 to the first index of the indices list. See the below examples for reference:
In order to know whether you should subtract 2 from the first element of the indices list, or add 2, you'll need to do the exact opposite to the second (which is the last) element of the indices list; if the resulting two indices returns the same value from the points list, then you found the tip of the arrow. I used a for loop that loops through numbers 0 and 1. The first iteration will add 2 to the second element of the indices list: j = indices[i] + 2, and subtract 2 from the first element of the indices list: indices[i - 1] - 2:
for i in range(2):
j = indices[i] + 2
if j > length - 1:
j = length - j
if np.all(points[j] == points[indices[i - 1] - 2]):
return tuple(points[j])
This part:
if j > length - 1:
j = length - j
is there for cases like this:
where if you try adding 2 to the index 5, you will get an IndexError. So if, say j becomes 7 from the j = indices[i] + 2, the above condition will convert j to len(points) - j.
Read the image and get its contours, utilizing the preprocess function defined earlier before passing it into the cv2.findContours method:
img = cv2.imread("arrows.png")
contours, hierarchy = cv2.findContours(preprocess(img), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
Loop through the contours, and find the approximate contour and convex hull of each shape:
for cnt in contours:
peri = cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, 0.025 * peri, True)
hull = cv2.convexHull(approx, returnPoints=False)
sides = len(hull)
If the number of sides of the convex hull is 4 or 5 (the extra side in case the arrow has a flat bottom), and if the shape of the arrow has exactly two more points that are not present in the convex hull, find the tip of the arrow:
if 6 > sides > 3 and sides + 2 == len(approx):
arrow_tip = find_tip(approx[:,0,:], hull.squeeze())
If there is indeed a tip, then congratulation! You found a decent arrow! Now the arrow can be highlighted, and a circle can be drawn at the location of the tip of the arrow:
if arrow_tip:
cv2.drawContours(img, [cnt], -1, (0, 255, 0), 3)
cv2.circle(img, arrow_tip, 3, (0, 0, 255), cv2.FILLED)
Finally, show the image:
cv2.imshow("Image", img)
cv2.waitKey(0)
Altogether:
import cv2
import numpy as np
def preprocess(img):
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_blur = cv2.GaussianBlur(img_gray, (5, 5), 1)
img_canny = cv2.Canny(img_blur, 50, 50)
kernel = np.ones((3, 3))
img_dilate = cv2.dilate(img_canny, kernel, iterations=2)
img_erode = cv2.erode(img_dilate, kernel, iterations=1)
return img_erode
def find_tip(points, convex_hull):
length = len(points)
indices = np.setdiff1d(range(length), convex_hull)
for i in range(2):
j = indices[i] + 2
if j > length - 1:
j = length - j
if np.all(points[j] == points[indices[i - 1] - 2]):
return tuple(points[j])
img = cv2.imread("arrows.png")
contours, hierarchy = cv2.findContours(preprocess(img), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
for cnt in contours:
peri = cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, 0.025 * peri, True)
hull = cv2.convexHull(approx, returnPoints=False)
sides = len(hull)
if 6 > sides > 3 and sides + 2 == len(approx):
arrow_tip = find_tip(approx[:,0,:], hull.squeeze())
if arrow_tip:
cv2.drawContours(img, [cnt], -1, (0, 255, 0), 3)
cv2.circle(img, arrow_tip, 3, (0, 0, 255), cv2.FILLED)
cv2.imshow("Image", img)
cv2.waitKey(0)
Original image:
Python program output:
Here is an approach with cv2.connectedComponentsWithStats. After extracting every arrow individually, I am getting the farthest points on the arrow. The distance between these points give me (more or less) the length of the arrow. Also, I am calculating the angle of the arrow by using these two points, i.e., slope between two points. Finally, in order to find the thickness, I am drawing a straight line between these points. And, I am calculating the shortest distance of each pixel of the arrow to the line. The most repeated distance value should give me the thickness of arrow.
The algorithm is not perfect, as it is. Especially, if the arrow is tilted. But, I feel like it is a good starting point and you can improve it.
import cv2
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import distance
import math
img = cv2.imread('arrows.png',0)
_,img = cv2.threshold(img,10,255,cv2.THRESH_BINARY_INV)
labels, stats = cv2.connectedComponentsWithStats(img, 8)[1:3]
for label in np.unique(labels)[1:]:
arrow = labels==label
indices = np.transpose(np.nonzero(arrow)) #y,x
dist = distance.cdist(indices, indices, 'euclidean')
far_points_index = np.unravel_index(np.argmax(dist), dist.shape) #y,x
far_point_1 = indices[far_points_index[0],:] # y,x
far_point_2 = indices[far_points_index[1],:] # y,x
### Slope
arrow_slope = (far_point_2[0]-far_point_1[0])/(far_point_2[1]-far_point_1[1])
arrow_angle = math.degrees(math.atan(arrow_slope))
### Length
arrow_length = distance.cdist(far_point_1.reshape(1,2), far_point_2.reshape(1,2), 'euclidean')[0][0]
### Thickness
x = np.linspace(far_point_1[1], far_point_2[1], 20)
y = np.linspace(far_point_1[0], far_point_2[0], 20)
line = np.array([[yy,xx] for yy,xx in zip(y,x)])
thickness_dist = np.amin(distance.cdist(line, indices, 'euclidean'),axis=0).flatten()
n, bins, patches = plt.hist(thickness_dist,bins=150)
thickness = 2*bins[np.argmax(n)]
print(f"Thickness: {thickness}")
print(f"Angle: {arrow_angle}")
print(f"Length: {arrow_length}\n")
plt.figure()
plt.imshow(arrow,cmap='gray')
plt.scatter(far_point_1[1],far_point_1[0],c='r',s=10)
plt.scatter(far_point_2[1],far_point_2[0],c='r',s=10)
plt.scatter(line[:,1],line[:,0],c='b',s=10)
plt.show()
Thickness: 4.309328382835436
Angle: 58.94059117029002
Length: 102.7277956543408
Thickness: 7.851144897915465
Angle: -3.366460663429801
Length: 187.32325002519042
Thickness: 2.246710258748367
Angle: 55.51004336926862
Length: 158.93709447451215
Thickness: 25.060450615293227
Angle: -37.184706453233126
Length: 145.60219778561037
As your main concern is to filter out arrows from different shapes. I have implemented a method using convexityDefects. you can read more about convexity defects here.
Also, I have added more arrow inside other shapes to demonstrate the robustness of the method.
Updated Image
Method to filter arrows from image using convexity defects.
def get_filter_arrow_image(threslold_image):
blank_image = np.zeros_like(threslold_image)
# dilate image to remove self-intersections error
kernel_dilate = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
threslold_image = cv2.dilate(threslold_image, kernel_dilate, iterations=1)
contours, hierarchy = cv2.findContours(threslold_image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
if hierarchy is not None:
threshold_distnace = 1000
for cnt in contours:
hull = cv2.convexHull(cnt, returnPoints=False)
defects = cv2.convexityDefects(cnt, hull)
if defects is not None:
for i in range(defects.shape[0]):
start_index, end_index, farthest_index, distance = defects[i, 0]
# you can add more filteration based on this start, end and far point
# start = tuple(cnt[start_index][0])
# end = tuple(cnt[end_index][0])
# far = tuple(cnt[farthest_index][0])
if distance > threshold_distnace:
cv2.drawContours(blank_image, [cnt], -1, 255, -1)
return blank_image
else:
return None
filter arrow image
I have added methods for the angle and length of the arrow, If this isn't good enough, let me know; there are more complicated methods for angle detection based on 3 coordinate points.
def get_max_distace_point(cnt):
max_distance = 0
max_points = None
for [[x1, y1]] in cnt:
for [[x2, y2]] in cnt:
distance = get_length((x1, y1), (x2, y2))
if distance > max_distance:
max_distance = distance
max_points = [(x1, y1), (x2, y2)]
return max_points
def angle_beween_points(a, b):
arrow_slope = (a[0] - b[0]) / (a[1] - b[1])
arrow_angle = math.degrees(math.atan(arrow_slope))
return arrow_angle
def get_arrow_info(arrow_image):
arrow_info_image = cv2.cvtColor(arrow_image.copy(), cv2.COLOR_GRAY2BGR)
contours, hierarchy = cv2.findContours(arrow_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
arrow_info = []
if hierarchy is not None:
for cnt in contours:
# draw single arrow on blank image
blank_image = np.zeros_like(arrow_image)
cv2.drawContours(blank_image, [cnt], -1, 255, -1)
point1, point2 = get_max_distace_point(cnt)
angle = angle_beween_points(point1, point2)
lenght = get_length(point1, point2)
cv2.line(arrow_info_image, point1, point2, (0, 255, 255), 1)
cv2.circle(arrow_info_image, point1, 2, (255, 0, 0), 3)
cv2.circle(arrow_info_image, point2, 2, (255, 0, 0), 3)
cv2.putText(arrow_info_image, "angle : {0:0.2f}".format(angle),
point2, cv2.FONT_HERSHEY_PLAIN, 0.8, (0, 0, 255), 1)
cv2.putText(arrow_info_image, "lenght : {0:0.2f}".format(lenght),
(point2[0], point2[1]+20), cv2.FONT_HERSHEY_PLAIN, 0.8, (0, 0, 255), 1)
return arrow_info_image, arrow_info
else:
return None, None
angle and length image
CODE
import math
import cv2
import numpy as np
def get_filter_arrow_image(threslold_image):
blank_image = np.zeros_like(threslold_image)
# dilate image to remove self-intersections error
kernel_dilate = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
threslold_image = cv2.dilate(threslold_image, kernel_dilate, iterations=1)
contours, hierarchy = cv2.findContours(threslold_image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
if hierarchy is not None:
threshold_distnace = 1000
for cnt in contours:
hull = cv2.convexHull(cnt, returnPoints=False)
defects = cv2.convexityDefects(cnt, hull)
if defects is not None:
for i in range(defects.shape[0]):
start_index, end_index, farthest_index, distance = defects[i, 0]
# you can add more filteration based on this start, end and far point
# start = tuple(cnt[start_index][0])
# end = tuple(cnt[end_index][0])
# far = tuple(cnt[farthest_index][0])
if distance > threshold_distnace:
cv2.drawContours(blank_image, [cnt], -1, 255, -1)
return blank_image
else:
return None
def get_length(p1, p2):
line_length = ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5
return line_length
def get_max_distace_point(cnt):
max_distance = 0
max_points = None
for [[x1, y1]] in cnt:
for [[x2, y2]] in cnt:
distance = get_length((x1, y1), (x2, y2))
if distance > max_distance:
max_distance = distance
max_points = [(x1, y1), (x2, y2)]
return max_points
def angle_beween_points(a, b):
arrow_slope = (a[0] - b[0]) / (a[1] - b[1])
arrow_angle = math.degrees(math.atan(arrow_slope))
return arrow_angle
def get_arrow_info(arrow_image):
arrow_info_image = cv2.cvtColor(arrow_image.copy(), cv2.COLOR_GRAY2BGR)
contours, hierarchy = cv2.findContours(arrow_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
arrow_info = []
if hierarchy is not None:
for cnt in contours:
# draw single arrow on blank image
blank_image = np.zeros_like(arrow_image)
cv2.drawContours(blank_image, [cnt], -1, 255, -1)
point1, point2 = get_max_distace_point(cnt)
angle = angle_beween_points(point1, point2)
lenght = get_length(point1, point2)
cv2.line(arrow_info_image, point1, point2, (0, 255, 255), 1)
cv2.circle(arrow_info_image, point1, 2, (255, 0, 0), 3)
cv2.circle(arrow_info_image, point2, 2, (255, 0, 0), 3)
cv2.putText(arrow_info_image, "angle : {0:0.2f}".format(angle),
point2, cv2.FONT_HERSHEY_PLAIN, 0.8, (0, 0, 255), 1)
cv2.putText(arrow_info_image, "lenght : {0:0.2f}".format(lenght),
(point2[0], point2[1] + 20), cv2.FONT_HERSHEY_PLAIN, 0.8, (0, 0, 255), 1)
return arrow_info_image, arrow_info
else:
return None, None
if __name__ == "__main__":
image = cv2.imread("image2.png")
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, thresh_image = cv2.threshold(gray_image, 100, 255, cv2.THRESH_BINARY_INV)
cv2.imshow("thresh_image", thresh_image)
arrow_image = get_filter_arrow_image(thresh_image)
if arrow_image is not None:
cv2.imshow("arrow_image", arrow_image)
cv2.imwrite("arrow_image.png", arrow_image)
arrow_info_image, arrow_info = get_arrow_info(arrow_image)
cv2.imshow("arrow_info_image", arrow_info_image)
cv2.imwrite("arrow_info_image.png", arrow_info_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
convexity defects on a thin arrow.
Blue point - start point of defect
Green point - far point if defect
Red point - end point of defect
yellow line = defect line from start point to end point.
image
defect-1
defect-2
and so on...
..
for my home project I want to create some kind of ambilight. The backlighting already works by me passing a photo into a Python program and putting the photo into a 78x42 grid and then sending each edge color to an LED light. The LEDs are located behind the TV to create a background glow.
For now, the effect is already very nice.
Now the next step:
A Raspberry-Pi with a connected RPI camera should be pointed at the TV and the edge colors should be determined and sent to the LEDs.
So that the program learns where the TV is, I have already written a program, which first creates 4 images (all white, all red, all green and all blue), sends these images to the Chromecast and displays them. From each picture a photo is made. So I have 4 photos on which the TV can be seen. Each time the TV shows only one color.
I made the pictures in red, green and blue to get a color correction later, or to calibrate the colors. But that's not what I want to talk about here.
First I want to recognize the TV. I have no experience with OpenCV. I already have the following script, but I can't get any further from here.
import skimage
import skimage.feature
import skimage.viewer
import matplotlib
import cv2
image = skimage.io.imread(fname='tvImageW.jpg', as_gray=True)
edges = skimage.feature.canny(
image=image,
sigma=1,
low_threshold=0.1,
high_threshold=1.2,
)
matplotlib.image.imsave('edges.png', edges)
Original Image of TV with plain white content
Result of the python script
Does anyone have an idea or a plan how I can proceed?
I am convinced that it would be relatively easy to implement with OpenCV, if you knew your way around...
THANKS for every hint.
With Mario's help, I am now at the point of finding the contour. I have also discovered how to figure out the 4 corners of the TV. Here is my code so far:
import cv2
from math import sqrt
im = cv2.imread('tvImageW.jpg')
edges = cv2.Canny(im, 100, 200)
contours, hierarchy = cv2.findContours(edges,
cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# find the biggest countour (c) by the area
c = max(contours, key = cv2.contourArea)
# find the perimeter of the contour
perim = cv2.arcLength(c, True)
# setting the precision
epsilon = 0.02*perim
# approximating the contour with a polygon
approxCorners = cv2.approxPolyDP(c, epsilon, True)
cv2.drawContours(im, [c], 0, (255,255,0), 2)
cv2.drawContours(im, approxCorners, -1, (0, 0, 255), 3)
cv2.imwrite('result.png', im)
Now I have 3 ideas that I will test further:
divide the lines between the points into 78 (or 42) parts. Then examine the individual grids for the color average.
use the 4 corner-points and the following guide to bring the image into the right perspective and then use my (already existing) routine: https://www.pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/
create a new test image and there mark the areas I am interested in and again recognize these contours. Each detected contour within the TV contour would then be a LED pixel at the end.
I suspect that point 2 is too slow at run time. Therefore I will start with the first point. Here, however, there could be problems with the perspective. If the problems are acceptable, I have the result. If it is not acceptable, I will try point 3....
Here is an attempt to represent the problem graphically. Below the trapezoid of the TV (tv-contour). Above the corrected image of the TV with the areas I am interested in.
Here is an image of the TV displaying the 78x42 grid:
Seems that tv is the max area contour.
In the dark you need different parameters
create a folder and name it "pyimagesearch".
Behind this folder we will put the empty __init__.py file and transform.py file that contains the functions to warp the image.
This is our main file:
import cv2
import numpy as np
from pyimagesearch.transform import *
im=cv2.imread('images/tv-wall.jpg')
img = im.copy()
edges = cv2.dilate(cv2.Canny(im,1,100),(2,2) ,iterations=1)
contours, hierarchy = cv2.findContours(edges,
cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# find the biggest countour (c) by the area
c = max(contours, key = cv2.contourArea)
cv2.drawContours(img, [c], 0, (255,255,0), 2)
# find the perimeter of the contour
perim = cv2.arcLength(c, True)
# setting the precision
epsilon = 0.02*perim
# approximating the contour with a polygon
approxCorners = cv2.approxPolyDP(c, epsilon, True)
#draw 4 yellow corners points
for p in approxCorners:
cv2.circle(img,(p[0],p[1]),3,(0,255,255),3)
cv2.imshow('img',img)
Now we use 'approxCorners' to warp the 'im' image
with corrected functions from https://www.pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/
pts = np.array( approxCorners , dtype = "float32")
warped = four_point_transform(im, pts)
#Your tv w/h ratio
ratio=130/70
wrp_h, wrp_w ,depth = warped.shape
rsz_h = int(wrp_h * ratio)
#warped image stretch resize and show
rwarped = cv2.resize(warped,(wrp_w,rsz_h))
cv2.imshow("Warped",rwarped)
cv2.waitKey(0)
Folder structure:
pyimagesearch/__init__.py
pyimagesearch/transform.py
This is the 'transform.py' file behind 'pyimagesearch' folder
import numpy as np
import cv2
#This function is different from the original at pyimagesearch
def order_points(pts):
spts=sorted(pts,key=lambda k:[k[1],k[0]])
return np.array( [
tuple(spts[0]),#0 ok
tuple(spts[1]),#1 ok
tuple(spts[3]),#3 ok
tuple(spts[2]) #2 ok
])
def four_point_transform(image, pts):
# obtain a consistent order of the points and unpack them
# individually
rect = order_points(pts)
print("ptsorderd rect",rect)
print("rect type", type(rect))
print("rect shape", rect.shape)
(tl, tr, br, bl) = rect
# compute the width of the new image, which will be the
# maximum distance between bottom-right and bottom-left
# x-coordiates or the top-right and top-left x-coordinates
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))
# compute the height of the new image, which will be the
# maximum distance between the top-right and bottom-right
# y-coordinates or the top-left and bottom-left y-coordinates
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))
# now that we have the dimensions of the new image, construct
# the set of destination points to obtain a "birds eye view",
# (i.e. top-down view) of the image, again specifying points
# in the top-left, top-right, bottom-right, and bottom-left
# order
dst = np.array([
[0, 0],
[maxWidth - 1, 0],
[maxWidth - 1, maxHeight - 1],
[0, maxHeight - 1]], dtype = "float32")
# compute the perspective transform matrix and then apply it
M = cv2.getPerspectiveTransform(rect, dst)
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
# return the warped image
return warped
For the sake of completeness, I would like to post my final result here. The script creates a JSON at the end, in which a pixel area is stored for each individual LED. This pixel area is then used by the main program.
All in all it runs very smooth. Surprisingly good even.
In the main program, a higher speed could possibly lead to an (even) better experience, but it's already running better than I would have expected.
If there is a wish, I will post the complete source code on Github:
Script that sends the images to the TV via Chromecast.
Script that detects the TV and saves the result of the interesting pixels into a JSON file for the main program.
Main script that uses the RPI camera, detects the colors and sends them to a NodeMCU.
Arduino code of the NodeMCU with which transmits the colors to a tethered, 4 meter long, WS2812B behind the TV.
detectTv.py
import constant
import detectTvHelper as helper
import cv2
import numpy as np
import json
im = cv2.imread('tvImageW.jpg')
edges = cv2.dilate(cv2.Canny(im,1,100),(2,2) ,iterations=1) #cv2.Canny(im, 100, 200)
contours, hierarchy = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# find the biggest countour (c) by the area
c = max(contours, key = cv2.contourArea)
# find the perimeter of the contour
perim = cv2.arcLength(c, True)
# setting the precision
epsilon = 0.02*perim
# approximating the contour with a polygon
approxCorners = cv2.approxPolyDP(c, epsilon, True)
cv2.drawContours(im, [c], 0, (255,255,0), 2)
cv2.drawContours(im, approxCorners, -1, (0, 0, 255), 3)
cv2.imwrite('result.png', im)
orderedPoints = helper.order_points(approxCorners)
points = helper.getPointsBetweenPoints(orderedPoints[0][0], orderedPoints[0][1], orderedPoints[1][0], orderedPoints[1][1], 78)
points += helper.getPointsBetweenPoints(orderedPoints[1][0], orderedPoints[1][1], orderedPoints[2][0], orderedPoints[2][1], 42)
points += helper.getPointsBetweenPoints(orderedPoints[2][0], orderedPoints[2][1], orderedPoints[3][0], orderedPoints[3][1], 78)
points += helper.getPointsBetweenPoints(orderedPoints[3][0], orderedPoints[3][1], orderedPoints[0][0], orderedPoints[0][1], 42)
result={}
cimg = np.zeros_like(im)
cv2.fillPoly(cimg, pts =[c], color=255)
cBRx,cBRy,cBRw,cBRh = cv2.boundingRect(c)
pts = helper.getMarkedCoordinates(cimg, cBRx, cBRy, cBRw, cBRh)
led = 0
for p in points:
cimg2 = np.zeros_like(im)
cv2.circle(cimg2, (p[0], p[1]), radius=constant.COLOR_RADIUS, color=255, thickness=-1)
pts2 = helper.getMarkedCoordinates(cimg2, p[0]-5, p[1]-5, 10, 10)
inters = helper.intersect(pts, pts2)
result[led] = inters
print(led, len(inters))
led += 1
with open('pixels.json', 'w') as f:
json.dump(result, f)
detectTvHelper.py
#order corners tl, tr, bl, br
def order_points(approxCorners):
pts = [
[approxCorners[0][0][0], approxCorners[0][0][1]],
[approxCorners[1][0][0], approxCorners[1][0][1]],
[approxCorners[2][0][0], approxCorners[2][0][1]],
[approxCorners[3][0][0], approxCorners[3][0][1]]]
rect = [[0, 0]] * 4
pts.sort()
if (pts[0][1] < pts[1][1]):
rect[0] = pts[0]
rect[3] = pts[1]
else:
rect[0] = pts[1]
rect[3] = pts[0]
if (pts[2][1] < pts[3][1]):
rect[1] = pts[2]
rect[2] = pts[3]
else:
rect[1] = pts[3]
rect[2] = pts[2]
# return the ordered coordinates
return rect
def getPointsBetweenPoints(x1, y1, x2, y2, points):
res=[]
for i in range(points):
newX = x1 + ((x2 - x1) / (points) * (i + 0.5))
newY = y1 + ((y2 - y1) / (points) * (i + 0.5))
res.append([round(newX), round(newY)])
return res
def getMarkedCoordinates(image, x, y, w, h):
res = []
for _x in range(x, x + w):
for _y in range(y, y + h):
pixel = image[_y][_x]
if (pixel.max() > 0):
res.append([_x, _y])
return res
def intersect(pixels1, pixels2):
res = []
m = {}
if len(pixels1)<len(pixels2):
pixels1,pixels2 = pixels2,pixels1
for i in pixels1:
m[i[0]*1000+i[1]] = True
for i in pixels2:
if i[0]*1000+i[1] in m:
res.append(i)
return res
Since I have only roundimentary Python knowledge, I can imagine that there is room for improvement in one or the other place. I am open for hints.
However, performance does not play the biggest role here. In the main program, every bit of performance counts.
I wrote a small script in python where I'm trying to extract or crop the part of the playing card that represents the artwork only, removing all the rest. I've been trying various methods of thresholding but couldn't get there. Also note that I can't simply record manually the position of the artwork because it's not always in the same position or size, but always in a rectangular shape where everything else is just text and borders.
from matplotlib import pyplot as plt
import cv2
img = cv2.imread(filename)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,binary = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY)
binary = cv2.bitwise_not(binary)
kernel = np.ones((15, 15), np.uint8)
closing = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
plt.imshow(closing),plt.show()
The current output is the closest thing I could get. I could be on the right way and try some further wrangling to draw a rectangle around the white parts, but I don't think it's a sustainable method :
As a last note, see the cards below, not all frames are exactly the same sizes or positions, but there's always a piece of artwork with only text and borders around it. It doesn't have to be super precisely cut, but clearly the art is a "region" of the card, surrounded by other regions containing some text. My goal is to try to capture the region of the artwork as well as I can.
I used Hough line transform to detect linear parts of the image.
The crossings of all lines were used to construct all possible rectangles, which do not contain other crossing points.
Since the part of the card you are looking for is always the biggest of those rectangles (at least in the samples you provided), i simply chose the biggest of those rectangles as winner.
The script works without user interaction.
import cv2
import numpy as np
from collections import defaultdict
def segment_by_angle_kmeans(lines, k=2, **kwargs):
#Groups lines based on angle with k-means.
#Uses k-means on the coordinates of the angle on the unit circle
#to segment `k` angles inside `lines`.
# Define criteria = (type, max_iter, epsilon)
default_criteria_type = cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER
criteria = kwargs.get('criteria', (default_criteria_type, 10, 1.0))
flags = kwargs.get('flags', cv2.KMEANS_RANDOM_CENTERS)
attempts = kwargs.get('attempts', 10)
# returns angles in [0, pi] in radians
angles = np.array([line[0][1] for line in lines])
# multiply the angles by two and find coordinates of that angle
pts = np.array([[np.cos(2*angle), np.sin(2*angle)]
for angle in angles], dtype=np.float32)
# run kmeans on the coords
labels, centers = cv2.kmeans(pts, k, None, criteria, attempts, flags)[1:]
labels = labels.reshape(-1) # transpose to row vec
# segment lines based on their kmeans label
segmented = defaultdict(list)
for i, line in zip(range(len(lines)), lines):
segmented[labels[i]].append(line)
segmented = list(segmented.values())
return segmented
def intersection(line1, line2):
#Finds the intersection of two lines given in Hesse normal form.
#Returns closest integer pixel locations.
#See https://stackoverflow.com/a/383527/5087436
rho1, theta1 = line1[0]
rho2, theta2 = line2[0]
A = np.array([
[np.cos(theta1), np.sin(theta1)],
[np.cos(theta2), np.sin(theta2)]
])
b = np.array([[rho1], [rho2]])
x0, y0 = np.linalg.solve(A, b)
x0, y0 = int(np.round(x0)), int(np.round(y0))
return [[x0, y0]]
def segmented_intersections(lines):
#Finds the intersections between groups of lines.
intersections = []
for i, group in enumerate(lines[:-1]):
for next_group in lines[i+1:]:
for line1 in group:
for line2 in next_group:
intersections.append(intersection(line1, line2))
return intersections
def rect_from_crossings(crossings):
#find all rectangles without other points inside
rectangles = []
# Search all possible rectangles
for i in range(len(crossings)):
x1= int(crossings[i][0][0])
y1= int(crossings[i][0][1])
for j in range(len(crossings)):
x2= int(crossings[j][0][0])
y2= int(crossings[j][0][1])
#Search all points
flag = 1
for k in range(len(crossings)):
x3= int(crossings[k][0][0])
y3= int(crossings[k][0][1])
#Dont count double (reverse rectangles)
if (x1 > x2 or y1 > y2):
flag = 0
#Dont count rectangles with points inside
elif ((((x3 >= x1) and (x2 >= x3))and (y3 > y1) and (y2 > y3) or ((x3 > x1) and (x2 > x3))and (y3 >= y1) and (y2 >= y3))):
if(i!=k and j!=k):
flag = 0
if flag:
rectangles.append([[x1,y1],[x2,y2]])
return rectangles
if __name__ == '__main__':
#img = cv2.imread('TAJFp.jpg')
#img = cv2.imread('Bj2uu.jpg')
img = cv2.imread('yi8db.png')
width = int(img.shape[1])
height = int(img.shape[0])
scale = 380/width
dim = (int(width*scale), int(height*scale))
# resize image
img = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
img2 = img.copy()
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray,(5,5),cv2.BORDER_DEFAULT)
# Parameters of Canny and Hough may have to be tweaked to work for as many cards as possible
edges = cv2.Canny(gray,10,45,apertureSize = 7)
lines = cv2.HoughLines(edges,1,np.pi/90,160)
segmented = segment_by_angle_kmeans(lines)
crossings = segmented_intersections(segmented)
rectangles = rect_from_crossings(crossings)
#Find biggest remaining rectangle
size = 0
for i in range(len(rectangles)):
x1 = rectangles[i][0][0]
x2 = rectangles[i][1][0]
y1 = rectangles[i][0][1]
y2 = rectangles[i][1][1]
if(size < (abs(x1-x2)*abs(y1-y2))):
size = abs(x1-x2)*abs(y1-y2)
x1_rect = x1
x2_rect = x2
y1_rect = y1
y2_rect = y2
cv2.rectangle(img2, (x1_rect,y1_rect), (x2_rect,y2_rect), (0,0,255), 2)
roi = img[y1_rect:y2_rect, x1_rect:x2_rect]
cv2.imshow("Output",roi)
cv2.imwrite("Output.png", roi)
cv2.waitKey()
These are the results with the samples you provided:
The code for finding line crossings can be found here: find intersection point of two lines drawn using houghlines opencv
You can read more about Hough Lines here.
We know that cards have straight boundaries along the x and y axes. We can use this to extract parts of the image. The following code implements detecting horizontal and vertical lines in the image.
import cv2
import numpy as np
def mouse_callback(event, x, y, flags, params):
global num_click
if num_click < 2 and event == cv2.EVENT_LBUTTONDOWN:
num_click = num_click + 1
print(num_click)
global upper_bound, lower_bound, left_bound, right_bound
upper_bound.append(max(i for i in hor if i < y) + 1)
lower_bound.append(min(i for i in hor if i > y) - 1)
left_bound.append(max(i for i in ver if i < x) + 1)
right_bound.append(min(i for i in ver if i > x) - 1)
filename = 'image.png'
thr = 100 # edge detection threshold
lined = 50 # number of consequtive True pixels required an axis to be counted as line
num_click = 0 # select only twice
upper_bound, lower_bound, left_bound, right_bound = [], [], [], []
winname = 'img'
cv2.namedWindow(winname)
cv2.setMouseCallback(winname, mouse_callback)
img = cv2.imread(filename, 1)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
bw = cv2.Canny(gray, thr, 3*thr)
height, width, _ = img.shape
# find horizontal lines
hor = []
for i in range (0, height-1):
count = 0
for j in range (0, width-1):
if bw[i,j]:
count = count + 1
else:
count = 0
if count >= lined:
hor.append(i)
break
# find vertical lines
ver = []
for j in range (0, width-1):
count = 0
for i in range (0, height-1):
if bw[i,j]:
count = count + 1
else:
count = 0
if count >= lined:
ver.append(j)
break
# draw lines
disp_img = np.copy(img)
for i in hor:
cv2.line(disp_img, (0, i), (width-1, i), (0,0,255), 1)
for i in ver:
cv2.line(disp_img, (i, 0), (i, height-1), (0,0,255), 1)
while num_click < 2:
cv2.imshow(winname, disp_img)
cv2.waitKey(10)
disp_img = img[min(upper_bound):max(lower_bound), min(left_bound):max(right_bound)]
cv2.imshow(winname, disp_img)
cv2.waitKey() # Press any key to exit
cv2.destroyAllWindows()
You just need to click two areas to include. A sample click area and the corresponding result are as follows:
Results from other images:
I don't think it is possible to automatically crop the artwork ROI using traditional image processing techniques due to the dynamic nature of the colors, dimensions, locations, and textures for each card. You would have to look into machine/deep learning and train your own classifier if you want to do it automatically. Instead, here's a manual approach to select and crop a static ROI from an image.
The idea is to use cv2.setMouseCallback() and event handlers to detect if the mouse has been clicked or released. For this implementation, you can extract the artwork ROI by holding down the left mouse button and dragging to select the desired ROI. Once you have selected the desired ROI, press c to crop and save the ROI. You can reset the ROI using the right mouse button.
Saved artwork ROIs
Code
import cv2
class ExtractArtworkROI(object):
def __init__(self):
# Load image
self.original_image = cv2.imread('1.png')
self.clone = self.original_image.copy()
cv2.namedWindow('image')
cv2.setMouseCallback('image', self.extractROI)
self.selected_ROI = False
# ROI bounding box reference points
self.image_coordinates = []
def extractROI(self, event, x, y, flags, parameters):
# Record starting (x,y) coordinates on left mouse button click
if event == cv2.EVENT_LBUTTONDOWN:
self.image_coordinates = [(x,y)]
# Record ending (x,y) coordintes on left mouse button release
elif event == cv2.EVENT_LBUTTONUP:
# Remove old bounding box
if self.selected_ROI:
self.clone = self.original_image.copy()
# Draw rectangle
self.selected_ROI = True
self.image_coordinates.append((x,y))
cv2.rectangle(self.clone, self.image_coordinates[0], self.image_coordinates[1], (36,255,12), 2)
print('top left: {}, bottom right: {}'.format(self.image_coordinates[0], self.image_coordinates[1]))
print('x,y,w,h : ({}, {}, {}, {})'.format(self.image_coordinates[0][0], self.image_coordinates[0][1], self.image_coordinates[1][0] - self.image_coordinates[0][0], self.image_coordinates[1][1] - self.image_coordinates[0][1]))
# Clear drawing boxes on right mouse button click
elif event == cv2.EVENT_RBUTTONDOWN:
self.selected_ROI = False
self.clone = self.original_image.copy()
def show_image(self):
return self.clone
def crop_ROI(self):
if self.selected_ROI:
x1 = self.image_coordinates[0][0]
y1 = self.image_coordinates[0][1]
x2 = self.image_coordinates[1][0]
y2 = self.image_coordinates[1][1]
# Extract ROI
self.cropped_image = self.original_image.copy()[y1:y2, x1:x2]
# Display and save image
cv2.imshow('Cropped Image', self.cropped_image)
cv2.imwrite('ROI.png', self.cropped_image)
else:
print('Select ROI before cropping!')
if __name__ == '__main__':
extractArtworkROI = ExtractArtworkROI()
while True:
cv2.imshow('image', extractArtworkROI.show_image())
key = cv2.waitKey(1)
# Close program with keyboard 'q'
if key == ord('q'):
cv2.destroyAllWindows()
exit(1)
# Crop ROI
if key == ord('c'):
extractArtworkROI.crop_ROI()
I am trying to detect a white Object on a black/white road to let an autonmous RC car drive around it. And i am detecting everything but the white box on the road.
What I tried can be seen in my code Example
#input= one video stream frame 320x240
frame = copy.deepcopy(input)
grayFrame = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
threshGray = cv2.adaptiveThreshold(
grayFrame,
255,
cv2.ADAPTIVE_THRESH_MEAN_C,
cv2.THRESH_BINARY,
blockSize=123,
C=-19,
)
contours,_ = cv2.findContours(threshGray, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
#some filtering needs to be done
#
#after filtering append contour
filteredContours.append(cnt)
cv2.rectangle(frame, (x, y), (x + w, y + h), (3, 244, 244), 1)
cv2.drawContours(frame, filteredContours, -1, (255, 0, 255),1 )
cv2.imshow("with contours", frame)
cv2.imshow("adaptiveThreshhold", threshGray)
cv2.imshow("input", input)
I'm looking for a way to draw a bounding box around the obstacle.
Problem is I dont know how to extract this box from the rest.
It is probably because the contour of the box and the lines on the right are connected and thats why the bounding box is that big. Would be great if someone knows a way to do that.
Click here to see the Result
First: Input image
Second: after adaptiveThreshold
third: with contours(pink) and bounding boxes(yellow)
At this point in time, you got several candidates of white color value.
You need to add code in to the #some filtering needs to be done to rid candidate list of NOT bounding box you want to find.
I suggest you to compare your candidate list with square box as bigger as enough.
Because all of contours without BOX(that you want to find on the road) do not satisfy condition about square box as I mentioned above.
I think what you are looking for is triangular masking, as seen in the input image you have lane marking as well. Did try using a lane detector with this all the areas out of lane can be masked and only the spaces in lane can be processed.
Below I have tried to use Lane detector using HoughLinesP and added Contours as well. Try to use this, I did not test this code but I see no issues.
#! /usr/bin/env python 3
"""
Lane detector using the Hog transform method
"""
import cv2 as cv
import numpy as np
# import matplotlib.pyplot as plt
import random as rng
rng.seed(369)
def do_canny(frame):
# Converts frame to grayscale because we only need the luminance channel for detecting edges - less computationally expensive
gray = cv.cvtColor(frame, cv.COLOR_RGB2GRAY)
# Applies a 5x5 gaussian blur with deviation of 0 to frame - not mandatory since Canny will do this for us
blur = cv.GaussianBlur(gray, (5, 5), 0)
# Applies Canny edge detector with minVal of 50 and maxVal of 150
canny = cv.Canny(blur, 50, 150)
return canny
def do_segment(frame):
# Since an image is a multi-directional array containing the relative intensities of each pixel in the image, we can use frame.shape to return a tuple: [number of rows, number of columns, number of channels] of the dimensions of the frame
# frame.shape[0] give us the number of rows of pixels the frame has. Since height begins from 0 at the top, the y-coordinate of the bottom of the frame is its height
height = frame.shape[0]
# Creates a triangular polygon for the mask defined by three (x, y) coordinates
polygons = np.array([
[(0, height), (800, height), (380, 290)]
])
# Creates an image filled with zero intensities with the same dimensions as the frame
mask = np.zeros_like(frame)
# Allows the mask to be filled with values of 1 and the other areas to be filled with values of 0
cv.fillPoly(mask, polygons, 255)
# A bitwise and operation between the mask and frame keeps only the triangular area of the frame
segment = cv.bitwise_and(frame, mask)
return segment
def calculate_lines(frame, lines):
# Empty arrays to store the coordinates of the left and right lines
left = []
right = []
# Loops through every detected line
for line in lines:
# Reshapes line from 2D array to 1D array
x1, y1, x2, y2 = line.reshape(4)
# Fits a linear polynomial to the x and y coordinates and returns a vector of coefficients which describe the slope and y-intercept
parameters = np.polyfit((x1, x2), (y1, y2), 1)
slope = parameters[0]
y_intercept = parameters[1]
# If slope is negative, the line is to the left of the lane, and otherwise, the line is to the right of the lane
if slope < 0:
left.append((slope, y_intercept))
else:
right.append((slope, y_intercept))
# Averages out all the values for left and right into a single slope and y-intercept value for each line
left_avg = np.average(left, axis = 0)
right_avg = np.average(right, axis = 0)
# Calculates the x1, y1, x2, y2 coordinates for the left and right lines
left_line = calculate_coordinates(frame, left_avg)
right_line = calculate_coordinates(frame, right_avg)
return np.array([left_line, right_line])
def calculate_coordinates(frame, parameters):
slope, intercept = parameters
# Sets initial y-coordinate as height from top down (bottom of the frame)
y1 = frame.shape[0]
# Sets final y-coordinate as 150 above the bottom of the frame
y2 = int(y1 - 150)
# Sets initial x-coordinate as (y1 - b) / m since y1 = mx1 + b
x1 = int((y1 - intercept) / slope)
# Sets final x-coordinate as (y2 - b) / m since y2 = mx2 + b
x2 = int((y2 - intercept) / slope)
return np.array([x1, y1, x2, y2])
def visualize_lines(frame, lines):
# Creates an image filled with zero intensities with the same dimensions as the frame
lines_visualize = np.zeros_like(frame)
# Checks if any lines are detected
if lines is not None:
for x1, y1, x2, y2 in lines:
# Draws lines between two coordinates with green color and 5 thickness
cv.line(lines_visualize, (x1, y1), (x2, y2), (0, 255, 0), 5)
return lines_visualize
# The video feed is read in as a VideoCapture object
cap = cv.VideoCapture(1)
while (cap.isOpened()):
# ret = a boolean return value from getting the frame, frame = the current frame being projected in the video
ret, frame = cap.read()
canny = do_canny(frame)
cv.imshow("canny", canny)
# plt.imshow(frame)
# plt.show()
segment = do_segment(canny)
hough = cv.HoughLinesP(segment, 2, np.pi / 180, 100, np.array([]), minLineLength = 100, maxLineGap = 50)
# Averages multiple detected lines from hough into one line for left border of lane and one line for right border of lane
lines = calculate_lines(frame, hough)
# Visualizes the lines
lines_visualize = visualize_lines(frame, lines)
cv.imshow("hough", lines_visualize)
# Overlays lines on frame by taking their weighted sums and adding an arbitrary scalar value of 1 as the gamma argument
output = cv.addWeighted(frame, 0.9, lines_visualize, 1, 1)
contours, _ = cv.findContours(output, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
contours_poly = [None]*len(contours)
boundRect = [None]*len(contours)
centers = [None]*len(contours)
radius = [None]*len(contours)
for i, c in enumerate(contours):
contours_poly[i] = cv.approxPolyDP(c, 3, True)
boundRect[i] = cv.boundingRect(contours_poly[i])
centers[i], radius[i] = cv.minEnclosingCircle(contours_poly[i])
## [allthework]
## [zeroMat]
drawing = np.zeros((output.shape[0], output.shape[1], 3), dtype=np.uint8)
## [zeroMat]
## [forContour]
# Draw polygonal contour + bonding rects + circles
for i in range(len(contours)):
color = (rng.randint(0,256), rng.randint(0,256), rng.randint(0,256))
cv.drawContours(drawing, contours_poly, i, color)
cv.rectangle(drawing, (int(boundRect[i][0]), int(boundRect[i][1])), \
(int(boundRect[i][0]+boundRect[i][2]), int(boundRect[i][1]+boundRect[i][3])), color, 2)
# Opens a new window and displays the output frame
cv.imshow('Contours', drawing)
# Frames are read by intervals of 10 milliseconds. The programs breaks out of the while loop when the user presses the 'q' key
if cv.waitKey(10) & 0xFF == ord('q'):
break
# The following frees up resources and closes all windows
cap.release()
cv.destroyAllWindows()
try different values in the threshold for canny.
I got two detected contours in an image and need the diameter between the two vertical-edges of the top contour and the diameter between the vertical-edges of the lower contour. I achieved this with this code.
import cv2
import numpy as np
import math, os
import imutils
img = cv2.imread("1.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
gray = cv2.GaussianBlur(gray, (7, 7), 0)
edges = cv2.Canny(gray, 200, 100)
edges = cv2.dilate(edges, None, iterations=1)
edges = cv2.erode(edges, None, iterations=1)
cnts = cv2.findContours(edges.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
# sorting the contours to find the largest and smallest one
c1 = max(cnts, key=cv2.contourArea)
c2 = min(cnts, key=cv2.contourArea)
# determine the most extreme points along the contours
extLeft1 = tuple(c1[c1[:, :, 0].argmin()][0])
extRight1 = tuple(c1[c1[:, :, 0].argmax()][0])
extLeft2 = tuple(c2[c2[:, :, 0].argmin()][0])
extRight2 = tuple(c2[c2[:, :, 0].argmax()][0])
# show contour
cimg = cv2.drawContours(img, cnts, -1, (0,200,0), 2)
# set y of left point to y of right point
lst1 = list(extLeft1)
lst1[1] = extRight1[1]
extLeft1 = tuple(lst1)
lst2 = list(extLeft2)
lst2[1] = extRight2[1]
extLeft2= tuple(lst2)
# compute the distance between the points (x1, y1) and (x2, y2)
dist1 = math.sqrt( ((extLeft1[0]-extRight1[0])**2)+((extLeft1[1]-extRight1[1])**2) )
dist2 = math.sqrt( ((extLeft2[0]-extRight2[0])**2)+((extLeft2[1]-extRight2[1])**2) )
# draw lines
cv2.line(cimg, extLeft1, extRight1, (255,0,0), 1)
cv2.line(cimg, extLeft2, extRight2, (255,0,0), 1)
# draw the distance text
font = cv2.FONT_HERSHEY_SIMPLEX
fontScale = 0.5
fontColor = (255,0,0)
lineType = 1
cv2.putText(cimg,str(dist1),(155,100),font, fontScale, fontColor, lineType)
cv2.putText(cimg,str(dist2),(155,280),font, fontScale, fontColor, lineType)
# show image
cv2.imshow("Image", img)
cv2.waitKey(0)
Now I would also need the angle of the slope lines on the bottom side of the upper contour.
Any ideas how I can get this? Is it possible using contours?
Or is it necessary to use HoughLinesP and sort the regarding lines somehow?
And continued question: Maybe its also possible to get function which describes parabola slope of that sides ?
Thanks alot for any help!
There are several ways to obtain just the slopes. In order to know the slope, we can can use cv2.HoughLines to detect the bottom horizontal line, detect to end points of that line and from those, obtain the slopes. As an illustration,
lines = cv2.HoughLines(edges, rho=1, theta=np.pi/180, threshold=int(dist2*0.66) )
on edges in your code gives 4 lines, and if we force the angle to be horizontal
for line in lines:
rho, theta = line[0]
# here we filter out non-horizontal lines
if abs(theta - np.pi/2) > np.pi/180:
continue
a = np.cos(theta)
b = np.sin(theta)
x0 = a*rho
y0 = b*rho
x1 = int(x0 + 1000*(-b))
y1 = int(y0 + 1000*(a))
x2 = int(x0 - 1000*(-b))
y2 = int(y0 - 1000*(a))
cv2.line(img_lines,(x1,y1),(x2,y2),(0,0,255),1)
we get:
For the extended question concerns with the parabolas, we first compose a function that returns the left and right points:
def horizontal_scan(gray_img, thresh=50, start=50):
'''
scan horizontally for left and right points until we met an all-background line
#param thresh: threshold for background pixel
#param start: y coordinate to start scanning
'''
ret = []
thickness = 0
for i in range(start,len(gray_img)):
row = gray_img[i]
# scan for left:
left = 0
while left < len(row) and row[left]<thresh:
left += 1
if left==len(row):
break;
# scan for right:
right = left
while right < len(row) and row[right] >= thresh:
right+=1
if thickness == 0:
thickness = right - left
# prevent sudden drop, error/noise
if (right-left) < thickness//5:
continue
else:
thickness = right - left
ret.append((i,left,right))
return ret
# we start scanning from extLeft1 down until we see a blank line
# with some tweaks, we can make horizontal_scan run on edges,
# which would be simpler and faster
horizontal_lines = horizontal_scan(gray, start = extLeft1[1])
# check if horizontal_line[0] are closed to extLeft1 and extRight1
print(horizontal_lines[0], extLeft1, extRight1[0])
Note that we can use this function to find the end points of the horizontal line returned by HoughLines.
# last line of horizontal_lines would be the points we need:
upper_lowest_y, upper_lowest_left, upper_lowest_right = horizontal_lines[-1]
img_lines = img.copy()
cv2.line(img_lines, (upper_lowest_left, upper_lowest_y), extLeft1, (0,0,255), 1)
cv2.line(img_lines, (upper_lowest_right, upper_lowest_y), extRight1, (0,0,255),1)
and that gives:
Let's return to the extended question, where we have those left and right points:
left_points = [(x,y) for y,x,_ in horizontal_lines]
right_points = [(x,y) for y,_,x in horizontal_lines]
Obviously, they would not fit perfectly in a parabola, so we need some sort of approximation/fitting here. For that, we can build a LinearRegression model:
from sklearn.linear_model import LinearRegression
class BestParabola:
def __init__(self, points):
x_x2 = np.array([(x**2,x) for x,_ in points])
ys = np.array([y for _,y in points])
self.lr = LinearRegression()
self.lr.fit(x_x2,ys)
self.a, self.b = self.lr.coef_
self.c = self.lr.intercept_
self.coef_ = (self.c,self.b,self.a)
def transform(self,points):
x_x2 = np.array([(x**2,x) for x,_ in points])
ys = self.lr.predict(x_x2)
return np.array([(x,y) for (_,x),y in zip(x_x2,ys)])
And then, we can fit the given left_points, right_points to get the desired parabolas:
# construct the approximate parabola
# the parabollas' coefficients are accessible by BestParabola.coef_
left_parabola = BestParabola(left_points)
right_parabola = BestParabola(right_points)
# get points for rendering
left_parabola_points = left_parabola.transform(left_points)
right_parabola_points = right_parabola.transform(right_points)
# render with matplotlib, cv2.drawContours would work
plt.figure(figsize=(8,8))
plt.imshow(cv2.cvtColor(img,cv2.COLOR_BGR2RGB))
plt.plot(left_parabola_points[:,0], left_parabola_points[:,1], linewidth=3)
plt.plot(right_parabola_points[:,0], right_parabola_points[:,1], linewidth=3, color='r')
plt.show()
Which gives:
The left parabola is not perfect, but you should work out that if need be :-)