Python: Detecting tables using cv2 with incomplete or no internal grid

I am working on a project with the goal of extracting structured data from a series of tables captured in images.
I have achieved some success adapting the process outlined in this extremely helpful medium post.
As best I understand, this program works by creating a contour mask, of sorts, to outline the borders of a table. Here is the relevant code performing that function:
import cv2
import numpy as np

#Load image as a numpy array (img is assumed to be a grayscale image loaded earlier, e.g. via PIL)
img = np.array(img)
#Threshold image to binary image
thresh,img_bin = cv2.threshold(img,128,255,cv2.THRESH_BINARY |cv2.THRESH_OTSU)
#inverting the image
img_bin = 255-img_bin
# Kernel length as 1/100th of the total image width
kernel_len = np.array(img).shape[1]//100
# Defining a vertical kernel to detect all vertical lines of image
ver_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_len))
# Defining a horizontal kernel to detect all horizontal lines of image
hor_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_len, 1))
# A kernel of 2x2
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
#Use vertical kernel to detect and save the vertical lines in a jpg
image_1 = cv2.erode(img_bin, ver_kernel, iterations=3)
vertical_lines = cv2.dilate(image_1, ver_kernel, iterations=3)
#Use horizontal kernel to detect and save the horizontal lines in a jpg
image_2 = cv2.erode(img_bin, hor_kernel, iterations=3)
horizontal_lines = cv2.dilate(image_2, hor_kernel, iterations=3)
# Combine horizontal and vertical lines in a new third image, with both having same weight.
img_vh = cv2.addWeighted(vertical_lines, 0.5, horizontal_lines, 0.5, 0.0)
#Eroding and thresholding the image
img_vh = cv2.erode(~img_vh, kernel, iterations=2)
thresh, img_vh = cv2.threshold(img_vh,128,255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
This process produces a NumPy array that can be interpreted as an image like this:
From there, the program can identify the table cells outlined on four sides by the contour mask.
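For reference, that cell-identification step essentially comes down to finding contours on the line mask and taking their bounding boxes as candidate cells. A rough sketch of the idea (not the post's exact code; the size filter is purely illustrative):
# Hedged sketch: candidate table cells as bounding boxes of contours in the line mask
contours, hierarchy = cv2.findContours(img_vh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
boxes = []
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    # discard the page-sized outer contour and tiny specks (thresholds are illustrative)
    if w < 0.9 * img.shape[1] and 10 < h < 500:
        boxes.append((x, y, w, h))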
Unfortunately, many of the tables that I seek to process, including the one above, lack perfect border formatting. The left-most column above lacks a left border (though there is still data inside it). Other tables I have lack internal borders entirely, relying on white space to format the data for the human eye.
As best I can tell, my path forward here is to add the missing contour lines myself using some kind of logic based on visual elements on the page. In the first example, I could attempt to add a left-side vertical line to the contour mask based on the position of the other contours. In the second example, I could try to add table borders based on consistencies in the position of the text.
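As a very rough illustration of both ideas (hedged; this assumes the vertical_lines, horizontal_lines and img_bin arrays from the code above, and every threshold here is a placeholder):
# Hypothetical sketch for the first case: close a missing outer border by drawing the
# bounding rectangle of everything the line detection already found
line_mask = cv2.bitwise_or(vertical_lines, horizontal_lines)
ys, xs = np.where(line_mask > 0)
if len(xs) > 0:
    cv2.rectangle(line_mask, (int(xs.min()), int(ys.min())),
                  (int(xs.max()), int(ys.max())), 255, 2)

# Hypothetical sketch for the second case: a projection profile of the text pixels,
# where runs of empty columns suggest where vertical separators could be drawn
ink_per_column = (img_bin > 0).sum(axis=0)
separator_columns = np.where(ink_per_column == 0)[0]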
That being said, this strategy would require a significant amount of logic, and may not be flexible enough to deal with the various table formats I may come into contact with.
Am I approaching this challenge with the right strategy? Is there a deployable software solution that I am not seeing? Ideally, I'd like this to be as automated as possible.
Any help would be greatly appreciated!

Related

How to measure the central angle with Python cv2 package

Our team set up a vision system with a camera, a microscope and a tunable lens to look at the internal surface of a cone.
Visually speaking, the camera takes 12 images for one cone, with each image covering 30 degrees.
Now we've collected many sample images and want to make sure each "fan" (as shown below) spans at least 30 degrees.
Is there any way in Python, with cv2 or other packages, to measure this central angle? Thanks.
Here is one way to do that in Python/OpenCV.
Read the image
Convert to gray
Threshold
Use morphology open and close to smooth and fill out the boundary
Apply Canny edge extraction
Separate the image into top edge and bottom edge by blackening the opposite side to each edge
Fit lines to the top and bottom edges
Compute the angle of each edge
Compute the difference between the two angles
Draw the lines on the input
Save the results
Input:
import cv2
import numpy as np
import math
# read image
img = cv2.imread('cone_shape.jpg')
# convert to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# threshold
thresh = cv2.threshold(gray,11,255,cv2.THRESH_BINARY)[1]
# apply open then close to smooth boundary
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (13,13))
morph = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
kernel = np.ones((33,33), np.uint8)
morph = cv2.morphologyEx(morph, cv2.MORPH_CLOSE, kernel)
# apply canny edge detection
edges = cv2.Canny(morph, 150, 200)
hh, ww = edges.shape
hh2 = hh // 2
# split edge image in half vertically and blacken opposite half
top_edge = edges.copy()
top_edge[hh2:hh, 0:ww] = 0
bottom_edge = edges.copy()
bottom_edge[0:hh2, 0:ww] = 0
# get coordinates of white pixels in top and bottom
# note: need to transpose y,x in numpy to x,y for opencv
top_white_pts = np.argwhere(top_edge.transpose()==255)
bottom_white_pts = np.argwhere(bottom_edge.transpose()==255)
# fit lines to white pixels
# (x,y) is point on line, (vx,vy) is unit vector along line
(vx1,vy1,x1,y1) = cv2.fitLine(top_white_pts, cv2.DIST_L2, 0, 0.01, 0.01)
(vx2,vy2,x2,y2) = cv2.fitLine(bottom_white_pts, cv2.DIST_L2, 0, 0.01, 0.01)
# compute angle for vectors vx,vy
top_angle = (180/math.pi)*math.atan(vy1/vx1)
bottom_angle = (180/math.pi)*math.atan(vy2/vx2)
print(top_angle, bottom_angle)
# cone angle is the difference
cone_angle = math.fabs(top_angle - bottom_angle)
print(cone_angle)
# draw lines on input
lines = img.copy()
p1x1 = int(x1-1000*vx1)
p1y1 = int(y1-1000*vy1)
p1x2 = int(x1+1000*vx1)
p1y2 = int(y1+1000*vy1)
cv2.line(lines, (p1x1,p1y1), (p1x2,p1y2), (0, 0, 255), 1)
p2x1 = int(x2-1000*vx2)
p2y1 = int(y2-1000*vy2)
p2x2 = int(x2+1000*vx2)
p2y2 = int(y2+1000*vy2)
cv2.line(lines, (p2x1,p2y1), (p2x2,p2y2), (0, 0, 255), 1)
# save resulting images
cv2.imwrite('cone_shape_thresh.jpg',thresh)
cv2.imwrite('cone_shape_morph.jpg',morph)
cv2.imwrite('cone_shape_edges.jpg',edges)
cv2.imwrite('cone_shape_lines.jpg',lines)
# show thresh and result
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.imshow("edges", edges)
cv2.imshow("top edge", top_edge)
cv2.imshow("bottom edge", bottom_edge)
cv2.imshow("lines", lines)
cv2.waitKey(0)
cv2.destroyAllWindows()
Thresholded image:
Morphology processed image:
Edge Image:
Lines on input:
Cone Angle (in degrees):
42.03975696357633
That sounds possible. You will need to do some preprocessing and filtering to figure out what works, and there is probably some tweaking involved.
There are three approaches that could work.
1.)
The basic idea is to somehow get two lines and measure the angle between them.
Define a threshold to separate out the outer black region (outside the central angle) and set all values below it to zero.
This will also set some of the blurry stripes inside the central angle to zero so we have to try to "heal" them away. This is done by using Morphological Transformations. You can read about them here and here.
You could try the operation Closing, but I don't know if it fixes stripes. Usually it fixes dots or scratches. This answer seems to indicate that it should work on lines.
Maybe at that point apply some Gaussian blurring and do the thresholding again. Then try to use some edge or line detection.
It's basically trial and error; you have to see what works.
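A rough sketch of what this first approach could look like (the file name, thresholds and kernel sizes are all guesses that would need tuning against the real images):
import cv2
import numpy as np
import math

gray = cv2.imread('cone_shape.jpg', 0)          # placeholder filename
_, bw = cv2.threshold(gray, 20, 255, cv2.THRESH_BINARY)
bw = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, np.ones((15, 15), np.uint8))   # heal the stripes
edges = cv2.Canny(bw, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=20)
angles = []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angles.append(math.degrees(math.atan2(y2 - y1, x2 - x1)))
    print(max(angles) - min(angles))            # rough estimate of the opening angle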
2.)
Another thing that could work is to try to use the arc-like scratches, maybe even strengthen them, and use the Hough Circle Transform. I think it detects arcs as well.
Just try it and see what the function returns. In the best case there are several circles / arcs that you can use to estimate the central angle.
There are several approaches to arc detection here on Stack Overflow, or here.
I am not sure if that's the same with all your images, but the one above looks like it has some thin green and pink arcs that seem to stretch all along the central angle. You could filter for that color, then convert to grayscale.
This question might be helpful.
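A hedged sketch of the Hough-circle route (the file name and every parameter value here are guesses; in practice you would tune minDist, param2 and the radius bounds):
import cv2
import numpy as np

gray = cv2.imread('cone_shape.jpg', 0)          # placeholder filename
gray = cv2.medianBlur(gray, 5)
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                           param1=100, param2=30, minRadius=50, maxRadius=0)
if circles is not None:
    for x, y, r in np.uint16(np.around(circles))[0]:
        print(x, y, r)                          # centre and radius of each detected circle/arc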
3.)
Apply an edge filter, e.g. Canny (skimage.feature.canny).
Try several sigmas and post the images in your question, then we can try to think about how to continue.
What could work is to calculate the convex hull around all points that are part of an edge. Then get the two lines that form the central angle from the convex hull.
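And a minimal sketch of the convex-hull idea (again with a placeholder file name and thresholds); the two long straight sides of the hull would be the candidate lines forming the central angle:
import cv2

gray = cv2.imread('cone_shape.jpg', 0)          # placeholder filename
edges = cv2.Canny(gray, 50, 150)
pts = cv2.findNonZero(edges)                    # all edge pixels as (x, y) points
if pts is not None:
    hull = cv2.convexHull(pts)                  # convex hull around the edge points
    # the two long, straight sides of this hull are the candidates for the
    # lines that form the central angle
    print(hull.reshape(-1, 2))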

How to get expected behavior from OpenCV's morphologyEx with regard to image boundaries?

I'm using opencv (version 4.1.0, with Python 3.7) to perform morphological closing on binary images. I'm having trouble with boundaries when using big closing kernels.
My code is :
close_size = 20
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (close_size, close_size))
result = cv2.morphologyEx(im, cv2.MORPH_CLOSE, kernel)
I read this question and this question, and also the docs, which led me to also try to change the
borderValue argument in morphologyEx() like so:
result = cv2.morphologyEx(im, cv2.MORPH_CLOSE, kernel,borderValue=[cv2.BORDER_CONSTANT,0])
But both methods do not lead to what I want. I've summed up their behaviors in the image below.
My original image is on top. The behavior I expect is for the two dots to remain separate for small kernels (e.g., kernel = 1), and to merge together for big enough kernels.
As you can see, for the default border (left column of the image), the merge is correct when the kernel = 6, but as soon as it gets bigger, the dots start to merge with the boundary.
With a constant border (right column of the image), bigger kernels can be used, but unexpected behavior still occurs with really big kernels (e.g. kernel = 20), where the points disappear.
The closing kernel is left as a parameter for the user in the final software, in order to be able to merge dots which can be really far apart. So ideally, I need to be able to smoothly handle kernels which are much bigger than the distance between the objects and the boundaries.
Original image :
This answer explains how to use MORPH_CLOSE around the edge of an image by adding a buffer to the image.
You can add a buffer by creating an image of zeros using numpy:
# Import packages
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read in the image
img = cv2.imread('/home/stephen/Desktop/OKoUh.png', 0)
# Create a black buffer around the image
h,w = img.shape
buffer = max(h,w)
bg = np.zeros((h+buffer*2, w+buffer*2), np.uint8)
bg[buffer:buffer+h, buffer:buffer+w] = img
Then you can iterate and check how it looks at different kernel sizes:
for close_size in range(1, 11):
    temp = bg.copy()
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (close_size, close_size))
    result = cv2.morphologyEx(temp, cv2.MORPH_CLOSE, kernel)
    # crop the buffer back off before displaying
    result = result[buffer:buffer+h, buffer:buffer+w]
    cv2.imshow('img', result)
    cv2.waitKey()
My results:
Based on Stephen's answer, here is the snippet I ended up implementing:
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (close_size, close_size))
# Padding
im = cv2.copyMakeBorder(im, close_size, close_size, close_size, close_size,
                        borderType=cv2.BORDER_CONSTANT, value=0)
# Closing
im = cv2.morphologyEx(im, cv2.MORPH_CLOSE, kernel)
# Unpadding
im = im[close_size:-close_size,close_size:-close_size]
As I mentioned in a comment, this can lead to a much longer computation time for large kernel sizes. It's possible that padding with lower values, e.g. close_size/2, would be enough to prevent border issues (I didn't test it).

Filling edges using flood fill not working properly

I am using OpenCV in Python to detect cracks in concrete. I am able to use Canny edge detection to detect cracks. Next, I need to fill the edges. I used the floodFill operation of OpenCV, but some of the gaps are filled whereas others are not. The image on the left is the input image, whereas the one on the right is the flood-filled image. I am guessing this is because my edges have breaks at some points. How do I solve this?
My code for floodfilling:
import cv2
import numpy as np

# imginput is the binary edge image produced by the Canny step
im_th1 = imginput
im_floodfill = im_th1.copy()
# Mask used for flood filling.
# Notice the size needs to be 2 pixels larger than the image.
h, w = im_th1.shape[:2]
mask = np.zeros((h + 2, w + 2), np.uint8)
# Floodfill from point (5, 5)
cv2.floodFill(im_floodfill, mask, (5, 5), 255)
# Invert floodfilled image
im_floodfill_inv = cv2.bitwise_not(im_floodfill)
# Combine the two images to get the foreground.
im_out = im_th1 | im_floodfill_inv
cv2.imshow("Foreground", im_out)
cv2.waitKey(0)
I found the solution to what I was looking for. Posting it here as it might be of use to others. After some research on the internet, it was just a couple of lines of code, as suggested in this: How to complete/close a contour in python opencv?
The code that worked for me is:
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 9))
dilated = cv2.dilate(image, kernel)
eroded=cv2.erode(dilated,kernel)
The result is in the image attached that shows before and after results.
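As a side note, a dilation followed by an erosion with the same kernel is exactly a morphological closing, so the same two lines can be written as a single call:
# Equivalent single-call closing (same kernel as above)
closed = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)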
I see this so often here on SO: everybody wants to use edge detection and then fill in the area between the edges.
Unless you use a method for edge detection that purposefully creates a closed contour, detected edges will likely not form a closed contour. And you cannot flood-fill a region unless you have a closed contour.
In most of these cases, some filtering and a simple threshold suffice. For example:
import PyDIP as dip
import matplotlib.pyplot as pp
img = dip.Image(pp.imread('oJAo7.jpg')).TensorElement(1) # From OP's other question
img = img[4:698,6:]
lines = dip.Tophat(img, 10, polarity='black')
dip.SetBorder(lines, [0], [2])
lines = dip.PathOpening(lines, length=100, polarity='opening', mode={'robust'})
lines = dip.Threshold(lines, method='otsu')[0]
This result is obtained after a simple top-hat filter, which keeps only thin things, followed by a path opening, which keeps only long things. This combination removes large-scale shading, as well as the small bumps and things. After the filtering, a simple Otsu threshold yields a binary image that marks all pixels in the crack.
Notes:
The input image is the one the OP posted in another question; it is the input used to produce the images posted in this question.
I'm using PyDIP, which you can get on GitHub and need to compile yourself. Hopefully soon we'll have a binary distribution. I'm an author.

Masking horizontal and vertical lines with OpenCV

I'm trying to remove horizontal and vertical lines in this image in order to have more distinct text areas.
I'm using the below code, which follows this guide
import cv2
import math
import numpy as np

image = cv2.imread('image.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# blur before thresholding (the original snippet referenced 'blurred' without defining it)
blurred = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.adaptiveThreshold(
    blurred, 255,
    cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV,
    25,
    15
)
# Create the images that will use to extract the horizontal and vertical lines
horizontal = np.copy(thresh)
vertical = np.copy(thresh)
# Specify size on horizontal axis
cols = horizontal.shape[1]
horizontal_size = math.ceil(cols / 20)
# Create structure element for extracting horizontal lines through morphology operations
horizontalStructure = cv2.getStructuringElement(cv2.MORPH_RECT, (horizontal_size, 1))
# Apply morphology operations
horizontal = cv2.erode(horizontal, horizontalStructure)
horizontal = cv2.dilate(horizontal, horizontalStructure)
# Show extracted horizontal lines
cv2.imwrite("horizontal.jpg", horizontal)
# Specify size on vertical axis
rows = vertical.shape[0]
verticalsize = math.ceil(rows / 20)
# Create structure element for extracting vertical lines through morphology operations
verticalStructure = cv2.getStructuringElement(cv2.MORPH_RECT, (1, verticalsize))
# Apply morphology operations
vertical = cv2.erode(vertical, verticalStructure)
vertical = cv2.dilate(vertical, verticalStructure)
After this, I know I would need to isolate the lines and mask the original image with the white lines; however, I'm not really sure how to proceed.
Does anyone have any suggestions?
Jeru's answer already gives you what you want. But I wanted to add an alternative that is maybe a bit more general than what you have so far.
You are converting the color image to gray-value, then apply adaptive threshold in an attempt to find lines. You filter this to get only the long horizontal and vertical lines, then use that mask to paint the original image white at those locations.
Here we look for all lines, and remove them from the image by painting them with whatever the surrounding color is. This process does not involve thresholding at all; all morphological operations are applied to the channels of the color image.
Ideally we'd use color morphology, but implementations of that are rare. Mathematical morphology is based on maximum and minimum operations, and the maximum or minimum of a color triplet (i.e. a vector) is not well defined.
So instead we apply the following procedure to each of the three color channels independently. This should produce results that are good enough for this application:
Extract the red channel: take the input RGB image, and extract the first channel. This is a gray-value image. We'll call this image channel.
Apply a top-hat filter to detect the thin structures: the difference between a closing with a small structuring element (SE) applied to channel, and channel itself (a closing is a dilation followed by an erosion with the same SE; you're using this to find lines as well). We'll call this output thin: thin = closing(channel) - channel. This step is similar to your local thresholding, but no actual threshold is applied. The resulting intensities indicate how dark the lines are w.r.t. the background. If you add thin to channel, you'll fill in these thin structures. The size of the SE here determines what is considered "thin". (A rough OpenCV analogue of this step is sketched after this list.)
Filter out the short lines, to keep only the long ones: apply an opening with a long horizontal SE to thin, and an opening with a long vertical SE to thin, and take the maximum of the two result. We'll call this lines. Note that this is the same process you used to generate horizontal and vertical. Instead of adding them together as Jeru suggested, we take the maximum. This makes it so that output intensities still match the contrast in channel. (In Mathematical Morphology parlance, the supremum of openings is an opening). The length of the SEs here determines what is long enough to be a line.
Fill in the lines in the original image channel: now simply add lines to channel. Write the result to the first channel of the output image.
Repeat the same process with the other two channels.
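For anyone following along in OpenCV rather than DIPlib, step 2 is essentially a black-hat transform (closing minus input). A rough, non-authoritative analogue (the filename and structuring-element size are placeholders):
import cv2

img = cv2.imread('input.png')                              # placeholder filename
channel = img[:, :, 2]                                     # red channel (OpenCV loads BGR)
se = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
thin = cv2.morphologyEx(channel, cv2.MORPH_BLACKHAT, se)   # closing(channel) - channel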
Using DIPlib this is quite a simple script:
import diplib as dip
input = dip.ImageReadTIFF('/home/cris/tmp/T4tbM.tif')
output = input.Copy()
for ii in range(0, 3):
    channel = output.TensorElement(ii)
    thin = dip.Closing(channel, dip.SE(5, 'rectangular')) - channel
    vertical = dip.Opening(thin, dip.SE([100, 1], 'rectangular'))
    horizontal = dip.Opening(thin, dip.SE([1, 100], 'rectangular'))
    lines = dip.Supremum(vertical, horizontal)
    channel += lines  # overwrites output image
Edit:
Increasing the size of the first SE (set to 5 above) to be large enough to also remove the thicker gray bar in the middle of the example image causes part of the block containing the inverted text "POWERLIFTING" to be left in thin.
To filter out those parts as well, we can change the definition of thin as follows:
notthin = dip.Closing(channel, dip.SE(11, 'rectangular'), ["add max"])
notthin = dip.MorphologicalReconstruction(notthin, channel, 1, "erosion")
thin = notthin - channel
That is, instead of thin=closing(channel)-channel, we do thin=reconstruct(closing(channel))-channel. The reconstruction simply expands selected (not thin) structures so that where part of a structure was selected, now the full structure is selected. The only thing that is now in thin are lines that are not connected to thicker structures.
I've also added "add max" as a boundary condition -- this causes the closing to treat the area outside the image as white, so lines touching the edges of the image are still detected as lines.
To elaborate more, here is what to do:
First, add the resulting images of vertical and horizontal. This will give you an image containing both the horizontal and vertical lines. Since both the images are of type uint8 (unsigned 8-bit integer) adding them won't be a problem:
res = vertical + horizontal
Finally, mask the resulting image obtained above with the original 3-channel image. This can be accomplished using cv2.bitwise_and:
fin = cv2.bitwise_and(image, image, mask = cv2.bitwise_not(res))
A sample for removing horizontal lines.
Sample image:
import cv2
import numpy as np
img = cv2.imread("Image path", 0)
if len(img.shape) != 2:
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
else:
    gray = img
gray = cv2.bitwise_not(gray)
bw = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                           cv2.THRESH_BINARY, 15, -2)
horizontal = np.copy(bw)
cols = horizontal.shape[1]
horizontal_size = cols // 30
horizontalStructure = cv2.getStructuringElement(cv2.MORPH_RECT, (horizontal_size, 1))
horizontal = cv2.erode(horizontal, horizontalStructure)
horizontal = cv2.dilate(horizontal, horizontalStructure)
cv2.imwrite("horizontal_lines_extracted.png", horizontal)
horizontal_inv = cv2.bitwise_not(horizontal)
cv2.imwrite("inverse_extracted.png", horizontal_inv)
masked_img = cv2.bitwise_and(gray, gray, mask=horizontal_inv)
masked_img_inv = cv2.bitwise_not(masked_img)
cv2.imwrite("masked_img.jpg", masked_img_inv)
=> horizontal_lines_extracted.png:
=> inverse_extracted.png
=> masked_img.png(resultant image after masking)
Do you want something like this?
import cv2
import numpy as np

image = cv2.imread('image.jpg', cv2.IMREAD_UNCHANGED)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
ret,binary = cv2.threshold(gray, 170, 255, cv2.THRESH_BINARY)#|cv2.THRESH_OTSU)
V = cv2.Sobel(binary, cv2.CV_8U, dx=1, dy=0)
H = cv2.Sobel(binary, cv2.CV_8U, dx=0, dy=1)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
V = cv2.morphologyEx(V, cv2.MORPH_DILATE, kernel, iterations = 2)
H = cv2.morphologyEx(H, cv2.MORPH_DILATE, kernel, iterations = 2)
rows,cols = image.shape[:2]
mask = np.zeros(image.shape[:2], dtype=np.uint8)
contours = cv2.findContours(V, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]  # [-2] works in both OpenCV 3 and 4
for cnt in contours:
    (x,y,w,h) = cv2.boundingRect(cnt)
    # manipulate these values to change accuracy
    if h > rows/2 and w < 10:
        cv2.drawContours(mask, [cnt], -1, 255, -1)
contours = cv2.findContours(H, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
for cnt in contours:
    (x,y,w,h) = cv2.boundingRect(cnt)
    # manipulate these values to change accuracy
    if w > cols/2 and h < 10:
        cv2.drawContours(mask, [cnt], -1, 255, -1)
mask = cv2.morphologyEx(mask, cv2.MORPH_DILATE, kernel, iterations = 2)
image[mask == 255] = (255,255,255)
So I have found a solution by using part of Jeru's suggestion. Eventually I would need to continue processing the image in binary mode anyway, so I figured I might as well keep it that way.
First, as in the answer above, add the resulting vertical and horizontal images to get a single image containing both sets of lines (both are of type uint8, so adding them is not a problem):
res = vertical + horizontal
Then, subtract res from the original input image thresh, which was used to find the lines. This will remove the white lines, and the result can then be used for other morphology transformations.
fin = thresh - res
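One small caveat: with uint8 arrays, plain NumPy subtraction wraps around wherever res happens to be brighter than thresh. If that turns out to matter, cv2.subtract saturates at zero instead:
fin = cv2.subtract(thresh, res)  # clamps negative results to 0 instead of wrapping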

Trying to improve my road segmentation program in OpenCV

I am trying to make a program that is capable of identifying a road in a scene, and I proceeded to use morphological filtering and the watershed algorithm. However, the program produces either mediocre or bad results. It seems to do okay (not good enough, though) if the road takes up most of the scene. However, in other pictures, it turns out that the sky gets segmented instead (watershed with the clouds).
I tried to see if I can perform more image processing to improve the results, but this is the best I have so far and I don't know how to move forward to improve my program.
How can I improve my program?
Code:
import numpy as np
import cv2
from matplotlib import pyplot as plt
import imutils
def invert_img(img):
    img = (255-img)
    return img
#img = cv2.imread('images/coins_clustered.jpg')
img = cv2.imread('images/road_4.jpg')
img = imutils.resize(img, height = 300)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
thresh = invert_img(thresh)
# noise removal
kernel = np.ones((3,3), np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 4)
# sure background area
sure_bg = cv2.dilate(opening,kernel,iterations=3)
#sure_bg = cv2.morphologyEx(sure_bg, cv2.MORPH_TOPHAT, kernel)
# Finding sure foreground area
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg,sure_fg)
# Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
'''
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
imgray = cv2.GaussianBlur(imgray, (5, 5), 0)
img = cv2.Canny(imgray,200,500)
'''
markers = cv2.watershed(img,markers)
img[markers == -1] = [255,0,0]
cv2.imshow('background',sure_bg)
cv2.imshow('foreground',sure_fg)
cv2.imshow('threshold',thresh)
cv2.imshow('result',img)
cv2.waitKey(0)
For a start, segmentation problems are hard. The more general you want the solution to be, the harder it gets. Road segmentation is a well-known problem, and I'm sure you can find many papers that tackle this issue from various directions.
Something that helps me get ideas for computer vision problems is trying to think about what makes it so easy for me to detect something and so hard for a computer.
For example, let's look at the road in your images. What makes it distinct from the background?
Distinct gray color.
Always has two white shoulder lines.
Always in the bottom section of the image.
Always has a separation line in the middle (yellow/white).
Pretty smooth.
Wider at the bottom, vanishing toward the horizon.
Now, after we have found some unique features, we need to find ways to quantify them, so it will be obvious to the algorithm as it is obvious to us.
Work on the RGB (or even better, HSV) image; don't convert it to gray at the beginning and lose all the color data. Look for the gray area (a rough sketch of this is given after this list)!
Again, let's find white regions (inside gray ones). You can try doing edge detection in the specific orientation of the shoulder lines. You are looking for a line that spans about half the height of the image, etc.
Let's delete the upper half of the image. It is unlikely there is ever a road there, and you will get rid of a lot of noise in your algorithm.
See 2.
Let's check the local standard deviation, or some other smoothness feature.
If we find some shape, let's check whether it fits what we expect.
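As a purely illustrative sketch of points 1 and 3 (the HSV bounds and the crop fraction are assumptions you would have to tune):
import cv2

img = cv2.imread('images/road_4.jpg')
h = img.shape[0]
lower_half = img[h // 2:, :]                          # point 3: drop the upper half (sky)
hsv = cv2.cvtColor(lower_half, cv2.COLOR_BGR2HSV)
# point 1: "gray" road pixels have low saturation and mid-range value (bounds are guesses)
road_candidates = cv2.inRange(hsv, (0, 0, 60), (180, 40, 200))
cv2.imshow('road candidates', road_candidates)
cv2.waitKey(0)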
I know these are just ideas and I don't claim they are easy to implement, but if you want to improve your algorithm you must give it more "knowledge", just as you have.
Exploit some domain knowledge; in other words, make some simplifying assumptions. Even basic things like "the camera's not upside down" and "the pavement has a uniform hue" will improve the common case.
If you can treat crossroads as a special case, then finding the edges of the roadway may be a simpler and more useful task than finding the roadway itself.
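A hedged sketch of the "find the roadway edges" idea: look for long straight edges in the lower half of the frame (all parameters here are guesses):
import cv2
import numpy as np

img = cv2.imread('images/road_4.jpg')
h = img.shape[0]
lower_half = cv2.cvtColor(img[h // 2:, :], cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(lower_half, 80, 160)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                        minLineLength=h // 4, maxLineGap=20)
# the longest, most steeply sloped lines are candidates for the roadway's edges
print(0 if lines is None else len(lines))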
