I'm trying to detect underlines and boxes in the following image:
For example, this is the output I'm aiming for:
Here's what I've attempted:
import cv2
import numpy as np
# Load image
img = cv2.imread('document.jpg')
# Convert image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Apply Gaussian blur to reduce noise
blur = cv2.GaussianBlur(gray, (3, 3), 0)
# Threshold the image
ret, thresh = cv2.threshold(blur, 127, 255, cv2.THRESH_TRUNC)
# Apply Canny Edge Detection
edges = cv2.Canny(thresh, 155, 200)
# Use HoughLinesP to detect lines
lines = cv2.HoughLinesP(edges, rho=1, theta=1*np.pi/180, threshold=100, minLineLength=100, maxLineGap=50)
# Draw lines on image
for line in lines:
    x1, y1, x2, y2 = line[0]
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 4)
However, this is the result I get:
Here are my thoughts regarding this problem:
I might need to use adaptive thresholding with Otsu's algorithm to provide a proper binary image to cv2.Canny(). However, I doubt this is the core issue. Here is how the image looks with the current thresholding applied:
cv2.threshold() already does a decent job separating the notes from the page.
Once I get HoughLinesP() to properly draw all the lines (and not the weird scribbles it's currently outputting), I can write some sort of box detector function to find the boxes based on the intersections (or near-intersections) of four lines. As for underlines, I simply need to detect horizontal lines from the output of HoughLinesP(), which shouldn't be difficult (e.g., for any given line, check if the two y coordinates are within some range of each other).
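A minimal sketch of the underline filter described above, assuming the input has the shape HoughLinesP() returns (rows of [[x1, y1, x2, y2]]). The function names and the 5-degree tolerance are hypothetical, and it uses an angle tolerance instead of a raw y difference so that the tolerance does not depend on segment length:

```python
import math

def is_horizontal(segment, max_angle_deg=5.0):
    """Return True if segment (x1, y1, x2, y2) is within max_angle_deg of horizontal."""
    x1, y1, x2, y2 = segment
    angle = abs(math.degrees(math.atan2(y2 - y1, x2 - x1)))
    # atan2 depends on direction; fold so right-to-left segments also count.
    angle = min(angle, 180.0 - angle)
    return angle <= max_angle_deg

def horizontal_segments(lines, max_angle_deg=5.0):
    """Keep only the near-horizontal segments from HoughLinesP-style output."""
    if lines is None:  # HoughLinesP returns None when it finds nothing
        return []
    return [tuple(int(v) for v in line[0]) for line in lines
            if is_horizontal(line[0], max_angle_deg)]

# Example with hand-written segments: one slightly tilted, one vertical,
# one horizontal but drawn right to left.
example = [[[0, 10, 100, 12]], [[5, 0, 7, 90]], [[100, 50, 0, 50]]]
print(horizontal_segments(example))
```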
So the fundamental problem I have is this: how do I get HoughLinesP() to output smoother lines and not the current mess it's giving so I can then move forward with detecting boxes and lines?
Additionally, do my proposed methods for finding boxes and underlines make sense from an efficiency standpoint? Does OpenCV provide a better way for achieving what I want to accomplish?
As part of a program which contains a series of images to be processed, I first need to detect a green-coloured rectangle. I'm trying to write a program that doesn't use colour masking, since the lighting and glare on the images will make it difficult to find the appropriate HSV ranges.
(P.S. I already have two questions based on this program, but this one is unrelated to those. It's not a follow-up; I want to address a separate issue.)
I used the standard rectangle detection technique, making use of the findContours() and approxPolyDP() methods. I added some constraints to get rid of unnecessary rectangles (aspectRatio>2.5, since my desired rectangle is clearly the "widest", and area>1500, to discard random small rectangles).
import numpy as np
import cv2 as cv
img = cv.imread("t19.jpeg")
width=0
height=0
start_x=0
start_y=0
end_x=0
end_y=0
output = img.copy()
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
#threshold
th = cv.adaptiveThreshold(gray,255,cv.ADAPTIVE_THRESH_GAUSSIAN_C,cv.THRESH_BINARY,9,2)
cv.imshow("th",th)
#rectangle detection
contours, _ = cv.findContours(th, cv.RETR_TREE, cv.CHAIN_APPROX_NONE)
for contour in contours:
    approx = cv.approxPolyDP(contour, 0.01* cv.arcLength(contour, True), True)
    cv.drawContours(img, [approx], 0, (0, 0, 0), 5)
    x = approx.ravel()[0]
    y = approx.ravel()[1]
    x1 ,y1, w, h = cv.boundingRect(approx)
    a=w*h
    if len(approx) == 4 and x>15 :
        aspectRatio = float(w)/h
        if aspectRatio >= 2.5 and a>1500:
            print(x1,y1,w,h)
            width=w
            height=h
            start_x=x1
            start_y=y1
            end_x=start_x+width
            end_y=start_y+height
            cv.rectangle(output, (start_x,start_y), (end_x,end_y), (0,0,255),3)
            cv.putText(output, "rectangle "+str(x1)+" , " +str(y1-5), (x1, y1-5), cv.FONT_HERSHEY_COMPLEX, 0.5, (0, 0, 0))
cv.imshow("op",output)
print("start",start_x,start_y)
print("end", end_x,end_y)
print("width",width)
print("height",height)
# keep the imshow windows open until a key is pressed
cv.waitKey(0)
It is working flawlessly for all the images, except one:
I used adaptive thresholding to create the threshold, which was used by the findContours() method.
I tried displaying the threshold and the output, and they look like this:
The thresholds for the other images also looked similar...so I can't pinpoint what exactly has gone wrong in the rectangle detection procedure.
Some tweaks I have tried:
Changing the last two parameters of the adaptive thresholding method. I tried 11,1 and 9,1; in both cases the rectangle in the threshold looked more prominent, but no rectangles were detected in the output at all.
I have already ruled out Otsu thresholding, as it does not work for about 4 of my test images.
What exactly can I tweak in the rectangle detection procedure for it to detect this rectangle?
I would also prefer, if possible, only slight modifications to this method rather than an entirely new one. As I have mentioned, this method works perfectly for all of my other test images, and if a new method works for this image but fails for the others, I'll find myself back here asking why it failed.
Edit: The method that abss suggested worked for this image, but failed for:
image 4
image 1, far off
Other test images:
image 1, normal
image 2
image 3
image 9, part 1
image 9, part 2
You can do it easily by adding these lines of code after your thresholding step:
kernel = cv.getStructuringElement(cv.MORPH_RECT,(3,3))
th = cv.morphologyEx(th,cv.MORPH_OPEN,kernel)
This will remove noise within the image. See this link for more about morphologyEx: https://docs.opencv.org/master/d9/d61/tutorial_py_morphological_ops.html
The result I got is shown below:
I have made a few modifications to your code so that it works with all of your test images. There are a few false positives that you may have to filter based on an HSV color range for green (since your target is always a shade of green). Alternatively, you can use the fact that one of the child contours of your ROI contour will be roughly 0.4 times as large as the outer contour, or larger. Here are the modifications:
Used a difference of Gaussians (DoG) to threshold useful contours
Changed the arcLength multiplier to 0.05 instead of 0.01, as square corners are not smooth
Used cv2.RETR_CCOMP to get a 2-level hierarchy
Moved approxPolyDP inside the aspect-ratio/area check to make it more efficient
Contour filter area changed to 600 to filter ROI for all test images
Removed a little bit of unnecessary code
Check with all the other test images that you may have and modify the parameters accordingly.
import cv2
import matplotlib.pyplot as plt

img = cv2.imread("/path/to/your_image")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Difference of Gaussians (DoG): blur at two scales and subtract
gw, gs, gw1, gs1, gw2, gs2 = (3,1.0,7,3.0, 3, 2.0)
img_blur = cv2.GaussianBlur(gray, (gw, gw), gs)
g1 = cv2.GaussianBlur(img_blur, (gw1, gw1), gs1)
g2 = cv2.GaussianBlur(img_blur, (gw2, gw2), gs2)
# g2-g1 is uint8 NumPy arithmetic, so negative differences wrap around to large values
ret, thg = cv2.threshold(g2-g1, 127, 255, cv2.THRESH_BINARY)

# RETR_CCOMP gives a 2-level hierarchy: outer contours and their holes
contours, hier = cv2.findContours(thg, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)

img_cpy = img.copy()
width=0
height=0
start_x=0
start_y=0
end_x=0
end_y=0
for i in range(len(contours)):
    # Skip contours that have no child contour
    if hier[0][i][2] == -1:
        continue
    x ,y, w, h = cv2.boundingRect(contours[i])
    a=w*h
    aspectRatio = float(w)/h
    if aspectRatio >= 2.5 and a>600:
        approx = cv2.approxPolyDP(contours[i], 0.05* cv2.arcLength(contours[i], True), True)
        if len(approx) == 4 and x>15 :
            width=w
            height=h
            start_x=x
            start_y=y
            end_x=start_x+width
            end_y=start_y+height
            cv2.rectangle(img_cpy, (start_x,start_y), (end_x,end_y), (0,0,255),3)
            cv2.putText(img_cpy, "rectangle "+str(x)+" , " +str(y-5), (x, y-5), cv2.FONT_HERSHEY_COMPLEX, 0.5, (0, 0, 0))

# OpenCV images are BGR; convert to RGB so matplotlib shows the right colours
plt.imshow(cv2.cvtColor(img_cpy, cv2.COLOR_BGR2RGB))
plt.show()
print("start",start_x,start_y)
print("end", end_x,end_y)
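The child-contour check mentioned earlier in this answer as an alternative to HSV filtering is not part of the code above. A rough sketch of it follows, assuming that "0.4 times" refers to contour area (my interpretation, not confirmed) and that areas come from something like cv2.contourArea. The function name has_large_child and the sample numbers are hypothetical:

```python
def has_large_child(i, areas, hierarchy, min_ratio=0.4):
    """Return True if contour i has a direct child with area >= min_ratio * its own area.

    areas[k] is the area of contour k; hierarchy[k] is
    [next, previous, first_child, parent], i.e. hier[0] from cv2.findContours.
    """
    if areas[i] <= 0:
        return False
    child = hierarchy[i][2]           # first child, or -1 if there is none
    while child != -1:
        if areas[child] / areas[i] >= min_ratio:
            return True
        child = hierarchy[child][0]   # next sibling on the same level
    return False

# Hand-written example: contour 0 is the outer one, contours 1 and 2 are its holes.
sample_hierarchy = [[-1, -1, 1, -1], [2, -1, -1, 0], [-1, 1, -1, 0]]
sample_areas = [1000.0, 100.0, 450.0]
print(has_large_child(0, sample_areas, sample_hierarchy))
```

Inside the loop above it could be used as an extra condition next to the aspect-ratio and area checks; the 0.4 ratio would need tuning on the real images.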
I am trying to detect the underlines where students write their answers in their homework, but I cannot get the Hough Line Transform to work. It detects way too many lines, and if I increase the thresholds, it only detects vertical lines. Is there any other method to do this?
This is the code I have based on another post:
import cv2
import numpy as np

# image_path is a directory prefix defined elsewhere in the script
gray = cv2.imread(image_path + '000003.png')
edges = cv2.Canny(gray,50,150,apertureSize = 3)
cv2.imwrite('edges-50-150.jpg',edges)
minLineLength=100
lines = cv2.HoughLinesP(image=edges,rho=10,theta=np.pi/180, threshold=100,lines=np.array([]), minLineLength=minLineLength,maxLineGap=80)
# lines has shape (number of segments, 1, 4)
a,b,c = lines.shape
for i in range(a):
    cv2.line(gray, (lines[i][0][0], lines[i][0][1]), (lines[i][0][2], lines[i][0][3]), (0, 0, 255), 3, cv2.LINE_AA)
cv2.imwrite('houghlines5.jpg',gray)
When I run the code above I get these lines: Hough Transform Result
Edit: original image - Original Image
I'm studying OpenCV with python by working on a project which aims to detect the palm lines.
What I have done is basically use Canny edge detection and then apply Hough line detection on the edges but the outcome is not so good.
Here is the source code I am using:
original = cv2.imread(file)
img = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
save_image_file(img, "gray")
img = cv2.equalizeHist(img)
save_image_file(img, "equalize")
img = cv2.GaussianBlur(img, (9, 9), 0)
save_image_file(img, "blur")
img = cv2.Canny(img, 40, 80)
save_image_file(img, "canny")
lined = np.copy(original) * 0
lines = cv2.HoughLinesP(img, 1, np.pi / 180, 15, np.array([]), 50, 20)
for line in lines:
    for x1, y1, x2, y2 in line:
        cv2.line(lined, (x1, y1), (x2, y2), (0, 0, 255))
save_image_file(lined, "lined")
output = cv2.addWeighted(original, 0.8, lined, 1, 0)
save_image_file(output, "output")
I tried different combinations of Gaussian kernel size and Canny low/high thresholds, but the outcome either has too much noise or misses (parts of) the major lines. The picture above is the best I have got so far.
Is there anything I can do to improve the result, or is there another approach that would work better?
Any help would be appreciated!
What you are looking for is really experimental. You have already done the most important part. I suggest tuning your parameters until you get a reasonable, if noisy, set of lines, and then filtering them:
using morphological filters,
classifying the lines (according to their lengths, how well they fit contrasted areas, etc.),
improving your categories by dividing the palm area (without fingers) into a grid (4x4, where the corners of the 4 vertical fingers can define the grid configuration),
calculating the gradient image,
using the orientation of the lines, which may help as well.
Search for the "cumulative level lines detection" algorithm; it can help with the certainty of the detected lines.
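As a rough sketch of the length and orientation filtering suggested above (not a palm-line detector in itself), segments in HoughLinesP format can be grouped by orientation after dropping short ones. The function names, the 60-pixel minimum length and the 45-degree bins are placeholder assumptions:

```python
import math

def describe_segment(segment):
    """Return (length, orientation in degrees folded into [0, 180))."""
    x1, y1, x2, y2 = segment
    length = math.hypot(x2 - x1, y2 - y1)
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    return length, angle

def classify_segments(lines, min_length=60.0, angle_bin=45.0):
    """Drop short segments and group the rest by orientation bin index."""
    groups = {}
    if lines is None:  # HoughLinesP returns None when it finds nothing
        return groups
    for line in lines:
        seg = tuple(int(v) for v in line[0])
        length, angle = describe_segment(seg)
        if length < min_length:
            continue  # treat short segments as noise
        groups.setdefault(int(angle // angle_bin), []).append(seg)
    return groups

# Hand-written segments in the same layout HoughLinesP uses.
example = [[[0, 0, 100, 0]], [[0, 0, 0, 100]], [[0, 0, 10, 10]], [[0, 0, 100, 60]]]
print(classify_segments(example))
```

Bins with many long segments would then be candidates for major lines, while sparse bins are more likely noise; the thresholds need to be found experimentally.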
Using various methods I have changed an image captcha to look somewhat like this
However while using Pytesseract OCR, the package is unable to identify any character and I think it is due to the line above the letters.
script.py
cv2.imwrite(filename, imgOP)
text = pytesseract.image_to_string(Image.open(filename))
Output in the console for the image is none
However, when I tried another image (given below), I got this output:
PGKQKf
Which is wrong again because of the line above the letter T
I have used various techniques to clean the images such as erosion, dilation and also Probabilistic Hough Transform (result given below)
#Hough Line Transform
img = cv2.imread('Output1.png')
edges = cv2.Canny(img, 1000, 1500)
minLineLength = 0
maxLineGap = 10000000000
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 15, minLineLength, maxLineGap)
for x in range(0, len(lines)):
    for x1, y1, x2, y2 in lines[x]:
        cv2.line(img, (x1, y1), (x2, y2), (255, 255, 255), 2)
cv2.imwrite('houghlines3.jpg', img)
where the image after transformation looks somewhat like this
Any other combination of values for minLineLength and maxLineGap does not work either.
How should I proceed? I have looked at various techniques for making Tesseract more accurate, but I am unsure which one to use.
Other than Tesseract, are there any other techniques that could be applied to get the desired results?
I had also thought of creating a mask: using an online tool, I converted the image into the 0s and 1s given below. However, how would I go about using it to identify the characters?