I am trying to make a program that can identify a road in a scene, and my approach uses morphological filtering and the watershed algorithm. However, the program produces either mediocre or bad results. It does okay (though not well enough) when the road takes up most of the scene, but in other pictures the sky gets segmented instead (the watershed latches onto the clouds).
I tried to see whether more image processing would improve the results, but this is the best I have so far and I don't know how to move forward to improve my program.
How can I improve my program?
Code:
import numpy as np
import cv2
from matplotlib import pyplot as plt
import imutils
def invert_img(img):
    img = 255 - img
    return img
#img = cv2.imread('images/coins_clustered.jpg')
img = cv2.imread('images/road_4.jpg')
img = imutils.resize(img, height = 300)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
thresh = invert_img(thresh)
# noise removal
kernel = np.ones((3,3), np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 4)
# sure background area
sure_bg = cv2.dilate(opening,kernel,iterations=3)
#sure_bg = cv2.morphologyEx(sure_bg, cv2.MORPH_TOPHAT, kernel)
# Finding sure foreground area
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg,sure_fg)
# Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
'''
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
imgray = cv2.GaussianBlur(imgray, (5, 5), 0)
img = cv2.Canny(imgray,200,500)
'''
markers = cv2.watershed(img,markers)
img[markers == -1] = [255,0,0]
cv2.imshow('background',sure_bg)
cv2.imshow('foreground',sure_fg)
cv2.imshow('threshold',thresh)
cv2.imshow('result',img)
cv2.waitKey(0)
For a start, segmentation problems are hard, and the more general you want the solution to be, the harder it gets. Road segmentation is a well-known problem, and I'm sure you can find many papers that tackle it from various directions.
Something that helps me get ideas for computer vision problems is to think about what makes the object so easy for me to detect and so hard for the computer.
For example, let's look at the road in your images. What makes it unique from the background?
Distinct gray color.
Always has two white shoulder lines.
Always in the bottom section of the image.
Always has a separation line in the middle (yellow/white).
Pretty smooth.
Wider at the bottom, vanishing into the horizon.
Now that we have found some unique features, we need to find ways to quantify them, so that they are as obvious to the algorithm as they are to us.
Work on the RGB (or, even better, HSV) image; don't convert it to gray at the beginning and lose all the color data. Look for gray areas!
Again, let's find white regions (inside gray ones). You can try edge detection in the specific orientation of the shoulder lines. You are looking for lines that span roughly half of the image height, etc.
Let's delete the upper half of the image. It is unlikely you will ever have a road there, and you will get rid of a lot of noise in your algorithm.
See 2...
Let's check the local standard deviation, or some other smoothness feature.
If we have found some shape, let's check whether it fits what we expect.
I know these are just ideas and I don't claim they are easy to implement, but if you want to improve your algorithm you must give it more "knowledge", just as you have. A rough sketch of a few of these ideas follows below.
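Putting a couple of these ideas into code: below is only a rough sketch of ideas 1, 3 and 5 (gray color in HSV, dropping the upper half, and a local-smoothness check). The HSV bounds, window sizes and thresholds are guesses that will need tuning on your images, and the file name simply reuses the one from the question.

import cv2
import numpy as np

img = cv2.imread('images/road_4.jpg')  # same test image as in the question
h, w = img.shape[:2]

# Idea 3: ignore the upper half of the frame; the road is almost never there.
bottom = img[h // 2:, :]
hsv = cv2.cvtColor(bottom, cv2.COLOR_BGR2HSV)

# Idea 1: "gray" in HSV terms means low saturation and medium value.
# These bounds are guesses, not tuned values.
gray_mask = cv2.inRange(hsv, (0, 0, 60), (180, 60, 200))

# Idea 5: roads are smooth, so suppress areas with high local variation.
gray = cv2.cvtColor(bottom, cv2.COLOR_BGR2GRAY).astype(np.float32)
mean = cv2.blur(gray, (15, 15))
mean_sq = cv2.blur(gray * gray, (15, 15))
local_std = np.sqrt(np.maximum(mean_sq - mean * mean, 0))
smooth_mask = (local_std < 12).astype(np.uint8) * 255  # threshold is a guess

road_candidate = cv2.bitwise_and(gray_mask, smooth_mask)
road_candidate = cv2.morphologyEx(road_candidate, cv2.MORPH_OPEN,
                                  np.ones((5, 5), np.uint8))

cv2.imshow('road candidate (bottom half)', road_candidate)
cv2.waitKey(0)

The resulting mask (or its largest connected component) could then serve as the sure-foreground marker for the existing watershed step instead of the Otsu threshold.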
Exploit some domain knowledge; in other words, make some simplifying assumptions. Even basic things like "the camera's not upside down" and "the pavement has a uniform hue" will improve the common case.
If you can treat crossroads as a special case, then finding the edges of the roadway may be a simpler and more useful task than finding the roadway itself.
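If you go the road-edge route, one minimal sketch is Canny followed by a probabilistic Hough transform, keeping only long, roughly diagonal lines (road borders converge towards the horizon). All thresholds below are placeholder values, not something tested on the question's images.

import cv2
import numpy as np

img = cv2.imread('images/road_4.jpg')  # placeholder path
h, w = img.shape[:2]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)

edges = cv2.Canny(gray, 50, 150)
edges[:h // 2, :] = 0  # again, ignore the upper half

lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=h // 4, maxLineGap=20)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
        if 20 < angle < 70:  # discard near-horizontal and near-vertical lines
            cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)

cv2.imshow('road border candidates', img)
cv2.waitKey(0)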
I am having trouble figuring out why cv2.aruco.detectMarkers() has problems finding more than just a few markers on my calibration board. Playing around with the parameters didn't substantially improve the quality. The dictionary is correct, as I tried it with the digital template before printing.
Here is what I do to detect ChAruco markers in a real image:
import cv2
from cv2 import aruco
#ChAruco board variables
CHARUCOBOARD_ROWCOUNT = 26
CHARUCOBOARD_COLCOUNT = 26
ARUCO_DICT = cv2.aruco.Dictionary_get(aruco.DICT_4X4_1000)
#Create constants to be passed into OpenCV and Aruco methods
CHARUCO_BOARD = aruco.CharucoBoard_create(
    squaresX=CHARUCOBOARD_COLCOUNT,
    squaresY=CHARUCOBOARD_ROWCOUNT,
    squareLength=5,  # mm
    markerLength=4,  # mm
    dictionary=ARUCO_DICT)
#load image
img = cv2.imread('imgs\\frame25_crop.png', 1)
test image with CHAruco markers
#initialize detector
parameters = aruco.DetectorParameters_create()
parameters.adaptiveThreshWinSizeMin = 150
parameters.adaptiveThreshWinSizeMax = 186
#Find aruco markers in the query image
corners, ids, _ = aruco.detectMarkers(
    image=img,
    dictionary=ARUCO_DICT,
    parameters=parameters)
#Outline the ChAruco markers found in our image
img = aruco.drawDetectedMarkers(
    image=img,
    corners=corners)
The result is the following: only 3 markers are found, which is bad.
resulting image with found markers
Does anyone have an idea how to considerably improve the results of the detector?
Your image is flipped.
Fix it with this line of code:
img = cv2.flip(img, 0)
Without looking at your code, I'd say the image quality and the perspective you chose are a bit poor. Try to work with a clearer view of your markers. For instance, hang the board on a wall, take a step or two back, and take the photo with better light; don't add extra rotation if it isn't necessary, and keep the contrast high :). This will probably give better results.
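For completeness, here is roughly how the flip suggested above slots into the question's pipeline. Going back to the stock detector parameters is my own assumption, not something either answer prescribes.

import cv2
from cv2 import aruco

ARUCO_DICT = aruco.Dictionary_get(aruco.DICT_4X4_1000)

img = cv2.imread('imgs\\frame25_crop.png', 1)
img = cv2.flip(img, 0)  # undo the vertical flip, as suggested above

# Stock parameters; reverting the hand-tuned window sizes is an assumption.
parameters = aruco.DetectorParameters_create()

corners, ids, rejected = aruco.detectMarkers(
    image=img,
    dictionary=ARUCO_DICT,
    parameters=parameters)
img = aruco.drawDetectedMarkers(image=img, corners=corners, ids=ids)
cv2.imshow('detected markers', img)
cv2.waitKey(0)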
I'm making a document scanner for a college project. My code works quite well for uniformly lit images, but I ran into issues detecting documents when there is even a small amount of light reflection (or too much light) on the background surface.
I first tried different simple codes I found online, then various morphological operations, with the result that my code is now a little messy and inaccurate.
Here's the code:
import cv2
import numpy as np
from matplotlib import pyplot as plt

def scanner(img):
    clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8))
    image = cv2.imread(img)
    original = image.copy()
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    contrast = clahe.apply(gray)
    blurred = cv2.medianBlur(contrast, 21)
    canny = cv2.Canny(blurred, 0, 70)
    dilated = cv2.dilate(canny, cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5)), iterations=3)
    closing = cv2.morphologyEx(dilated, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8), iterations=10)
    # OpenCV 3.x return signature (drop the first value on OpenCV 4.x)
    contimage, contours, hierarchy = cv2.findContours(closing, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    contours = sorted(contours, key=cv2.contourArea, reverse=True)[:5]
    target = None
    for c in contours:
        p = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.09 * p, True)
        if len(approx) == 4:
            target = approx
            cv2.drawContours(image, [target], -1, (0, 255, 0), 2)
            break
    plt.figure(figsize=(20, 20))
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title("final")
    plt.show()
Here's an example of the code working:
input1, output1, input2, output2
And not working:
input1, output1, input2, output2
This image shows a successful segmentation:
This image shows a failed segmentation:
It looks like the main issue is that in the failing images there is not enough contrast between the paper and the table it's placed on. The ones that work are basically white paper on a dark background, so it's relatively easy to segment out the paper; when you place the paper on a light-colored surface, there isn't sufficient contrast between the paper and the background to tell what is what. Unfortunately, in image processing there is only so much you can do when your input image is bad, so there is no easy automated fix, but I can think of a few workarounds. They all require extra work.
One would be: instead of having your program automatically detect where the paper is, have a static box that the user must place the document inside of, and simply capture the contents that way. This is probably the simplest to implement, but it seems like you WANT to detect the page automatically, so this probably isn't what you're looking for.
Two would be to add an intermediate step that lets the user select a specific threshold value to apply to the image. Basically, you would take the picture and have the user set a threshold such that the paper ends up white and the background dark; you could then use that as a template to create the boundary of the paper, which you can segment from the original image. This is probably the most work, but closest to what you're looking for (a rough sketch follows after these options).
Three would be similar to number one, but instead of a set area you place the document within, you take the picture and then have the user manually select where the corners are, segmenting it that way. More work than #1, less than #2, but probably still not what you're looking for.
Finally, you could leave it as is and use it knowing that you need a sufficiently dark background for it to work correctly. There are probably other workarounds, but with a lot of image processing you are constrained by the quality of your input image, and there isn't always a software solution for exactly what you're looking to do.
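A rough sketch of option two, using an OpenCV trackbar so the user can dial in the threshold interactively. The window name, file name and helper function are placeholders of mine, not part of the original code.

import cv2
import numpy as np

def pick_threshold(image_path):
    """Let the user slide a threshold until the paper is a clean white blob."""
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    cv2.namedWindow('pick threshold')
    cv2.createTrackbar('thresh', 'pick threshold', 128, 255, lambda v: None)
    while True:
        t = cv2.getTrackbarPos('thresh', 'pick threshold')
        _, binary = cv2.threshold(gray, t, 255, cv2.THRESH_BINARY)
        cv2.imshow('pick threshold', binary)
        if cv2.waitKey(30) & 0xFF == 27:  # Esc confirms the current value
            break
    cv2.destroyWindow('pick threshold')
    return binary

mask = pick_threshold('document.jpg')  # placeholder file name
# The confirmed mask can then replace the Canny/closing result and be fed
# into the existing findContours / approxPolyDP code above.
contours = cv2.findContours(mask, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]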
I am using OpenCV in Python to detect cracks in concrete. I am able to use Canny edge detection to detect the cracks. Next, I need to fill the edges. I used OpenCV's floodFill operation, but some of the gaps are filled whereas others are not. The image on the left is the input image, and the one on the right is the flood-filled image. I am guessing this is because my edges have breaks at some points. How do I solve this?
My code for floodfilling:
im_th1 = imginput
im_floodfill = im_th1.copy()
# Mask used to flood filling.
# Notice the size needs to be 2 pixels larger than the image.
h, w = im_th1.shape[:2]
mask = np.zeros((h + 2, w + 2), np.uint8)
# Floodfill from point (5, 5), just inside the corner
cv2.floodFill(im_floodfill, mask, (5, 5), 255);
# Invert floodfilled image
im_floodfill_inv = cv2.bitwise_not(im_floodfill)
# Combine the two images to get the foreground.
im_out = im_th1 | im_floodfill_inv
cv2.imshow("Foreground", im_out)
cv2.waitKey(0)
I found the solution to what I was looking for. Posting it here as it might be of use to others. After some research on the internet, it came down to just two lines of code, as suggested in this: How to complete/close a contour in python opencv?
The code that worked for me is :
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 9))
dilated = cv2.dilate(image, kernel)
eroded = cv2.erode(dilated, kernel)
The result is in the attached image, which shows the before and after results.
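As a side note, a dilation followed by an erosion with the same structuring element is exactly a morphological closing, so the same fix can be written as a single call:

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 9))
closed = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)  # dilate then erode in one step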
I see this so often here on SO, everybody wants to use edge detection, and then fill in the area in between the edges.
Unless you use a method for edge detection that purposefully creates a closed contour, detected edges will likely not form a closed contour. And you cannot flood-fill a region unless you have a closed contour.
In most of these cases, some filtering and a simple threshold suffice. For example:
import PyDIP as dip
import matplotlib.pyplot as pp
img = dip.Image(pp.imread('oJAo7.jpg')).TensorElement(1) # From OP's other question
img = img[4:698,6:]
lines = dip.Tophat(img, 10, polarity='black')
dip.SetBorder(lines, [0], [2])
lines = dip.PathOpening(lines, length=100, polarity='opening', mode={'robust'})
lines = dip.Threshold(lines, method='otsu')[0]
This result is obtained after a simple top-hat filter, which keeps only thin things, followed by a path opening, which keeps only long things. This combination removes large-scale shading, as well as the small bumps and things. After the filtering, a simple Otsu threshold yields a binary image that marks all pixels in the crack.
Notes:
The input image is the one OP posted in another question, and is the input to the images posted in this question.
I'm using PyDIP, which you can get on GitHub and need to compile yourself. Hopefully soon we'll have a binary distribution. I'm an author.
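If building PyDIP is not an option, a rough OpenCV approximation of the same idea is a black-hat filter (which keeps dark structures thinner than the structuring element) followed by an Otsu threshold. OpenCV has no path opening, so a crude size filter on connected components stands in for it here; the kernel size, the length cut-off and the file name are guesses.

import cv2
import numpy as np

img = cv2.imread('oJAo7.jpg', cv2.IMREAD_GRAYSCALE)  # same input as above

# Black-hat: keeps dark structures thinner than the structuring element.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (21, 21))
thin_dark = cv2.morphologyEx(img, cv2.MORPH_BLACKHAT, kernel)

# Otsu threshold on the filtered image.
_, mask = cv2.threshold(thin_dark, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Poor man's substitute for the path opening: drop small, compact components.
n, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
crack = np.zeros_like(mask)
for i in range(1, n):
    x, y, w, h, area = stats[i]
    if max(w, h) >= 100:  # keep only components that are long in some direction
        crack[labels == i] = 255

cv2.imshow('crack mask', crack)
cv2.waitKey(0)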
I am trying to detect ellipses in some images.
After some functions I got this edges map:
I tried using the Hough transform to detect ellipses, but this transform has very high complexity, so my computer didn't finish running the command even after 5 hours(!).
I also tried doing connected components and got this:
In the last case I also tried to continue by binarizing the image.
In all cases I am stuck at these steps and have no idea how to continue from here.
My mission is to detect the tomatoes in the image. I am approaching this by trying to detect circles and ellipses and finding the radius (or average radius, in the ellipse case) of each one.
edited:
I am adding my code for the first method (the result is the edge map shown above):
img = cv2.imread(r'../images/assorted_tomatoes.jpg')
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgAfterLight=lightreduce(img)
imgAfterGamma=gamma_correctiom(imgAfterLight,0.8)
th2 = 255 - cv2.adaptiveThreshold(imgAfterGamma,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,5,3)
median2 = cv2.medianBlur(th2,3)
where median2 is the edge map shown above,
and the code for connected components:
import scipy
from scipy import ndimage
import matplotlib.pyplot as plt
import cv2
import numpy as np
fname=r'../images/assorted_tomatoes.jpg'
blur_radius = 1.0
threshold = 50
img = scipy.misc.imread(fname) # gray-scale image
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(img.shape)
# smooth the image (to remove small objects)
imgf = ndimage.gaussian_filter(gray_img, blur_radius)
threshold = 80
# find connected components
labeled, nr_objects = ndimage.label(imgf > threshold)
where labeled is the result above
another edit:
this is the input image:
Input
The problem is that after edge detection there are a lot of unnecessary edges in sub-regions, which gets in the way of producing a clean edge map.
To me this looks like a classic problem for the watershed algorithm. It is designed for segmenting out touching objects like the tomatoes. My example is in Matlab (I'm on the wrong computer today), but it should translate to Python easily. First convert to grayscale as you do, and then invert the image:
I=rgb2gray(img)
I2=imcomplement(I)
The image as-is will over-segment, so we remove minima that are too shallow. This can be done with the h-minima transform:
I3=imhmin(I2,50);
You might need to play with the value 50, which is the height threshold for suppressing shallow minima. Now run the watershed algorithm and we get the following result:
L=watershed(I3);
The results are not perfect. It needs additional logic to remove some of the small regions, but it will give a reasonable estimate. The watershed and h-minima are contained in the skimage.morphology package in python.
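A rough Python translation of the Matlab steps above, using scikit-image and SciPy. The exact module layout depends on your skimage version, and h plays the same role as the 50 above (a depth threshold you may need to tune).

import cv2
from scipy import ndimage
from skimage.morphology import h_minima
from skimage.segmentation import watershed  # skimage.morphology.watershed in older versions

img = cv2.imread('../images/assorted_tomatoes.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
inverted = 255 - gray  # imcomplement(I)

# Suppress minima shallower than h (the role of imhmin) and use the
# remaining deep minima as watershed markers.
h = 50
minima = h_minima(inverted, h)
markers, _ = ndimage.label(minima)
labels = watershed(inverted, markers)

print(labels.max(), 'regions found')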
I read this blog post where the author uses a laser and a webcam to estimate the distance of a piece of cardboard from the webcam.
I had another idea about that. I don't want to calculate the distance from the webcam.
I want to check if an object is approaching the webcam. The algorithm, according to me, will be something like:
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Since I want to detect random objects, I am using the findContours() method to find the contours in the video feed. Using that, I will at least have the outlines of the objects in the video feed. The source code is:
import numpy as np
import cv2

vid = cv2.VideoCapture(0)
ans, instant = vid.read()
average = np.float32(instant)
cv2.accumulateWeighted(instant, average, 0.01)
background = cv2.convertScaleAbs(average)

while(1):
    _, f = vid.read()
    imgray = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(imgray, 127, 255, 0)
    diff = cv2.absdiff(f, background)
    cv2.imshow("input", f)
    cv2.imshow("Difference", diff)
    if cv2.waitKey(5) == 27:
        break

cv2.destroyAllWindows()
The output is:
I am stuck here. I have the contours stored in an array. What do I do with it when the size increases? How do I proceed?
One trouble here is recognising and differentiating the moving objects from other stuff in the video feed. An approach might be to let the camera 'learn' what the background looks like with no object. Then you can constantly compare its input against this background. One way to get the background is to use a running average.
Any difference greater than a small threshold means there is a moving object. If you constantly display this difference, you basically have a motion tracker. The size of the objects is simply the sum of all the non-zero (thresholded) pixels, or their bounding rectangles. You can track this size and use it to guess whether the object is moving closer or further. Morphological operations can help group the contours into one cohesive object.
Since it will be tracking ANY movement, if there are two objects, they will be counted together. Here is where you can use the contours to find and track individual objects, e.g. using the contour bounds or centroids. You could also possibly separate them by colour.
Here are some results using this strategy (the grey blob is my hand):
It actually did a fairly good job of guessing which way my hand was moving.
Code:
import cv2
import numpy as np
AVERAGE_ALPHA = 0.2 # 0-1 where 0 never adapts, and 1 instantly adapts
MOVEMENT_THRESHOLD = 30 # Lower values pick up more movement
REDUCED_SIZE = (400, 600)
MORPH_KERNEL = np.ones((10, 10), np.uint8)
def reduce_image(input_image):
"""Make the image easier to deal with."""
reduced = cv2.resize(input_image, REDUCED_SIZE)
reduced = cv2.cvtColor(reduced, cv2.COLOR_BGR2GRAY)
return reduced
# Initialise
vid = cv2.VideoCapture(0)
average = None
old_sizes = np.zeros(20)
size_update_index = 0
while (True):
got_frame, frame = vid.read()
if got_frame:
# Reduce image
reduced = reduce_image(frame)
if average is None: average = np.float32(reduced)
# Get background
cv2.accumulateWeighted(reduced, average, AVERAGE_ALPHA)
background = cv2.convertScaleAbs(average)
# Get thresholded difference image
movement = cv2.absdiff(reduced, background)
_, threshold = cv2.threshold(movement, MOVEMENT_THRESHOLD, 255, cv2.THRESH_BINARY)
# Apply morphology to help find object
dilated = cv2.dilate(threshold, MORPH_KERNEL, iterations=10)
closed = cv2.morphologyEx(dilated, cv2.MORPH_CLOSE, MORPH_KERNEL)
# Get contours
contours, _ = cv2.findContours(closed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(closed, contours, -1, (150, 150, 150), -1)
# Find biggest bounding rectangle
areas = [cv2.contourArea(c) for c in contours]
if (areas != list()):
max_index = np.argmax(areas)
max_cont = contours[max_index]
x, y, w, h = cv2.boundingRect(max_cont)
cv2.rectangle(closed, (x, y), (x+w, y+h), (255, 255, 255), 5)
# Guess movement direction
size = w*h
if size > old_sizes.mean():
print "Towards"
else:
print "Away"
# Update object size
old_sizes[size_update_index] = size
size_update_index += 1
if (size_update_index) >= len(old_sizes): size_update_index = 0
# Display image
cv2.imshow('RaptorVision', closed)
Obviously this needs more work in terms of identifying, selecting and tracking the objects, etc. (at the moment it does horribly if something else is moving in the background). There are also many parameters to vary and tweak (the ones set here are what worked well for my system). I'll leave that up to you, though.
Some links:
background extraction
motion tracking
If you want to get a bit more high-tech with the background removal, have a look here:
wallflower
Detect the object in the webcam feed.
If the object is approaching the webcam it'll grow larger and larger in the video feed.
Use this data for further calculations.
Good idea.
If you want to use the contour detection approach, you could do it the following way:
You have a series of images I1, I2, ..., In.
Do a contour detection on each one: C1, C2, ..., Cn (a contour is a set of points in OpenCV).
Take a large enough sample of points on your contours for images i and i+1: S_i ⊆ C_i, i ∈ 1...n.
For each point in your sample, find the nearest point on contour i+1. This gives you trajectories for all your points.
Check whether these trajectories point mostly outwards (the tricky part ;).
If they point outwards for a sufficient number of frames, your contour got bigger (a small sketch of this check follows below).
Alternatively, you could try to prune the points that are not part of the correct contour and work with a covering rectangle. It's very easy to check the size that way, but I don't know how easy it will be to choose the "correct" points.
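A small sketch of the nearest-point matching and outward check described in the list above. The sampling step, the 60% majority vote and the helper name are my own choices, and it assumes you already have one contour per frame for the object of interest (e.g. the largest contour from findContours).

import numpy as np

def mostly_outwards(prev_contour, cur_contour, step=10):
    """Return True if sampled points of the previous contour moved away from
    its centroid when matched to their nearest points on the current contour,
    i.e. the contour appears to be growing."""
    prev_pts = prev_contour.reshape(-1, 2)[::step].astype(np.float32)
    cur_pts = cur_contour.reshape(-1, 2).astype(np.float32)
    centroid = prev_pts.mean(axis=0)

    outward_votes = 0
    for p in prev_pts:
        # Nearest point on the current contour.
        q = cur_pts[np.argmin(np.linalg.norm(cur_pts - p, axis=1))]
        # Did the point move further away from the centroid?
        if np.linalg.norm(q - centroid) > np.linalg.norm(p - centroid):
            outward_votes += 1
    return outward_votes > 0.6 * len(prev_pts)  # 60% majority; tune as needed

Calling this on each consecutive pair of frames and requiring a run of True results over several frames corresponds to steps 5 and 6 in the list above.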