I'm trying to find my way around OpenCV to detect lines using HoughLines or any other method. I'm starting from a document image and using a structuring element and erosion to obtain a binary image with lines.
I managed to obtain the following file, but I can't get HoughLines to follow what seem to me (this is probably the issue) to be obvious lines. Any idea how to go forward, or should I start from scratch using other methods?
The ultimate goal is to extract the lines of the document as separate images and then try some ML algorithm for handwritten text recognition.
I think that Hough Lines should work in your case. Running
lines = cv2.HoughLines(img_thr, 1, np.pi / 180, threshold=800)
where img_thr is your binary image gives a quite good result:
The lines can be sorted by the y coordinate of their left end (for example); two consecutive lines then form a quadrilateral, which can be extracted using cv2.getPerspectiveTransform together with cv2.warpPerspective (cv2.perspectiveTransform only maps point coordinates, not images).
There are a few problems that need to be solved to make this procedure more robust:
The algorithm can return multiple lines for each line in the image, so they need to be deduplicated.
There may be some false positives, so you need some condition to remove them. I think that looking at the slope of each line and the distances between consecutive lines should do the job.
The effect of the threshold parameter in cv2.HoughLines depends heavily on the image resolution, so you should resize images to a constant size before running this procedure.
Full code:
import cv2
import numpy as np

# url_to_image is a helper (not shown here) that downloads an image into a NumPy array
img_orig = url_to_image('https://i.stack.imgur.com/PXDKG.png')  # original image
img_thr = url_to_image('https://i.stack.imgur.com/jZChK.png')  # binary image
h, w, _ = img_thr.shape
img_thr = img_thr[:, :, 0]  # keep a single channel
lines = cv2.HoughLines(img_thr, 1, np.pi / 180, threshold=800)
img_copy = img_orig.copy()
points = []
for rho, theta in lines[:, 0]:
    a, b = np.cos(theta), np.sin(theta)
    x0, y0 = a * rho, b * rho
    # intersect the line with the left (x = 0) and right (x = w) image borders
    x1, x2 = 0, w
    y1 = y0 + a * ((0 - x0) / -b)
    y2 = y0 + a * ((w - x0) / -b)
    cv2.line(img_copy, (int(x1), int(y1)), (int(x2), int(y2)), (255, 0, 0), 4)
    points.append([[x1, y1], [x2, y2]])
points = np.array(points)
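The robustness steps listed above (deduplication, slope filtering, sorting by the left end) could look roughly like the sketch below. It is NumPy-only and works on an (N, 2) array of (rho, theta) pairs, for example `lines[:, 0]`. The function names and the tolerance values are assumptions made for illustration, not part of the original answer, and the simple deduplication ignores the wrap-around of theta near 0 and pi, which does not matter once only near-horizontal lines are kept.

```python
import numpy as np

def dedupe_lines(lines, rho_tol=20.0, theta_tol=np.deg2rad(2.0)):
    """Keep the first (rho, theta) of every group of near-identical lines."""
    kept = []
    for rho, theta in lines:
        is_dup = any(abs(rho - r) < rho_tol and abs(theta - t) < theta_tol
                     for r, t in kept)
        if not is_dup:
            kept.append((rho, theta))
    return np.array(kept)

def keep_near_horizontal(lines, max_dev=np.deg2rad(10.0)):
    """Drop lines whose normal angle is far from pi/2 (horizontal lines have theta = pi/2)."""
    return lines[np.abs(lines[:, 1] - np.pi / 2) < max_dev]

def left_right_ends(lines, width):
    """Return an (N, 2, 2) array [[0, y_left], [width, y_right]], sorted by y_left."""
    rho, theta = lines[:, 0], lines[:, 1]
    # From rho = x*cos(theta) + y*sin(theta): y = (rho - x*cos(theta)) / sin(theta)
    y_left = rho / np.sin(theta)
    y_right = (rho - width * np.cos(theta)) / np.sin(theta)
    left = np.stack([np.zeros_like(y_left), y_left], axis=1)
    right = np.stack([np.full_like(y_right, width), y_right], axis=1)
    ends = np.stack([left, right], axis=1)
    return ends[np.argsort(y_left)]

# Synthetic input: two horizontal lines, one near-duplicate, one vertical line
raw = np.array([[300.0, np.pi / 2],
                [100.0, np.pi / 2],
                [104.0, np.pi / 2 + 0.01],
                [50.0, 0.0]])
clean = keep_near_horizontal(dedupe_lines(raw))
ends = left_right_ends(clean, width=200)
# Each pair of consecutive lines bounds one text strip
strips = list(zip(ends[:-1], ends[1:]))
```

The four corner points of each strip are what you would then hand to a perspective warp to cut the strip out of the original image.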
I am looking to OCR some digital numbers in a couple of different formats. I have a function that levels text on the horizontal plane so that I can create bounding boxes in OpenCV, and it works for one of my digit images. However, the second digit style leans slightly (italicised). This sometimes works, but the decimal point mostly gets lost because it is absorbed into one of the digits' bounding rectangles.
Is there a way to align the digits based on the vertical lines of the actual digit?
Below is my working function for the horizontal plane:
import math

import cv2
import numpy as np
from scipy import ndimage

def deskew(img):
    img_edges = cv2.Canny(img, 100, 100, apertureSize=3)
    lines = cv2.HoughLinesP(img_edges, 1, math.pi / 180.0, 100, minLineLength=20, maxLineGap=50)
    angles = []
    # reshape(-1, 4) visits every segment whether the array is (1, N, 4) or (N, 1, 4);
    # the layout differs between OpenCV versions
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        angles.append(angle)
    med_angle = np.median(angles)
    rotated_img = ndimage.rotate(img, med_angle, cval=255)
    cv2.imshow("rotated image", rotated_img)
    cv2.waitKey(0)
    return rotated_img
Below is the type of image/digit format I am trying to deskew and OCR. Through some manual trials I have found that an angle of around 5 degrees works for drawing separate bounding rectangles that capture the digits and the decimal point.
Below is the result with the manually adjusted angle, showing all digits and the decimal point captured, ready to be OCR'd.
I need a way to split an image into its header and blueprint parts, but right now I'm out of ideas on how to do this. (I can't use the original image, so I tried to recreate it.)
I've tried to do this using OpenCV's HoughLines, but it detects a lot of different lines because of the blueprint, so I can't find a clear spot to cut the image.
(like this:)
What I need is two separate images, one of the header and one of the blueprint, but I can't find a good way to do this, so any help would be appreciated.
For your given input image :
If I use the following piece of code :
import cv2
import numpy as np
img = cv2.imread('test1.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,50,150,apertureSize = 3)
lines = cv2.HoughLines(edges,1,np.pi/180,200)
for rho,theta in lines[0]:
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a*rho
    y0 = b*rho
    x1 = int(x0 + 1000*(-b))
    y1 = int(y0 + 1000*(a))
    x2 = int(x0 - 1000*(-b))
    y2 = int(y0 - 1000*(a))
    cv2.line(img,(x1,y1),(x2,y2),(0,0,255),2)
cv2.imwrite('houghlines3.jpg',img)
I get the following output :
Notice there is only one long straight line, which separates the header from the blueprint. You can get the starting and ending y coordinates of this line by printing y1 and y2. In this case they are y1 = 140 and y2 = 141. Now you just need to crop the picture at y = 141, like this:
img = cv2.imread('path/to/original/image')
img_header = img[:141,:]
img_blueprint = img[141:, :]
cv2.imwrite("header.png", img_header)
cv2.imwrite("blueprint.png", img_blueprint)
Now, to your problem. Here is a possible way. In your output, the long straight line separating the header from the blueprint was detected as three different red lines by the Hough transform. For these three lines, the y coordinates will be very close to each other, for example 142, 145 and 143. Append the ending y coordinates (y2) of all detected lines to a list, cluster them by adjacency with a threshold of 5-10 pixels, take the biggest cluster, take the largest ending y coordinate in that cluster, and crop the original image there.
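The clustering step described above can be done without any library, since the values are one-dimensional. The following is a minimal sketch; the function names `cluster_1d` and `cut_row`, the sample y values and the 5-pixel gap are assumptions for illustration.

```python
def cluster_1d(values, max_gap=5):
    """Group values so that neighbours inside a cluster differ by at most max_gap."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)   # close enough: extend the current cluster
        else:
            clusters.append([v])     # gap too large: start a new cluster
    return clusters

def cut_row(y_ends, max_gap=5):
    """Largest y inside the most populated cluster of line end points."""
    biggest = max(cluster_1d(y_ends, max_gap), key=len)
    return max(biggest)

# Hypothetical y2 values collected from the Hough lines
y_ends = [142, 145, 143, 30, 410, 415]
split_y = cut_row(y_ends)
# img[:split_y, :] would be the header and img[split_y:, :] the blueprint
```

With these sample values the three coordinates around 142-145 form the biggest cluster, so the cut lands at the bottom edge of the separator line.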
If the line separating the header from the content is the longest black line across the width of the image, you can find it by summing the pixels across each row and then finding which row adds up to the least, i.e. has most black pixels:
# get sums across rows
rowSums = np.sum(image,axis=1)
# get index of blackest row
longestBlack = np.argmin(rowSums)
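To see the row-sum idea at work, here is a small self-contained sketch on a synthetic grayscale array (the 6x8 image and its contents are made up for illustration). Note that it assumes a single-channel image; for a colour image you would convert to grayscale first, otherwise the sums come out per channel.

```python
import numpy as np

# Synthetic 6x8 grayscale "page": white (255) with one full-width black row
image = np.full((6, 8), 255, dtype=np.uint8)
image[2, :] = 0      # the separator line, spanning the whole width
image[4, 1:4] = 0    # a shorter black stroke that should not be selected

# Sum each row with an explicit integer accumulator type
row_sums = np.sum(image, axis=1, dtype=np.int64)
longest_black = int(np.argmin(row_sums))

# Split at the separator row
header, body = image[:longest_black], image[longest_black:]
```

The shorter stroke in row 4 lowers that row's sum too, but only the full-width line gives the minimum, which is why this works when the separator is the longest dark line.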
I'm working on an OpenCV-based project in Python, and I have to calculate/extract and visually show the vanishing point from existing lines.
My first task is to detect lines, which is very easy with the Canny and HoughLinesP functions:
import cv2
import numpy as np
img = cv2.imread('.image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cv2.imread returns BGR, not RGB
edges = cv2.Canny(gray, 500, 460)
lines = cv2.HoughLinesP(edges, 1, np.pi/180, 30, maxLineGap=250)
for line in lines:
    x1, y1, x2, y2 = line[0]
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 128), 1)
cv2.imwrite('linesDetected.jpg', img)
But I want to calculate/extrapolate the vanishing point of all lines, to find (and plot) where they cross each other, like in the image below.
I know I need to add a bigger frame to plot the continuation of the lines and find where they cross (the vanishing point), but I'm very lost at this point.
Thanks very much!
Instead of the probabilistic Hough transform, cv2.HoughLinesP, you can use the standard one, cv2.HoughLines, in which the lines are represented in the parametric space (ρ,θ). The parametric space relates to the actual point coordinates as ρ=xcosθ+ysinθ, where ρ is the perpendicular distance from the origin to the line and θ is the angle between this perpendicular and the horizontal axis, measured counter-clockwise.
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)
for line in lines:
    rho,theta = line[0]
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a*rho
    y0 = b*rho
    x1 = int(x0 + 10000*(-b))
    y1 = int(y0 + 10000*(a))
    x2 = int(x0 - 10000*(-b))
    y2 = int(y0 - 10000*(a))
    cv2.line(img,(x1,y1),(x2,y2),(0,255,0),1)
As you can see below, the projection of vanishing lines already starts to appear.
Now if we play with the parameters for this specific image and skip already-parallel vertical lines, we can get a better set of vanishing lines.
# fine tune parameters
lines = cv2.HoughLines(edges, 0.7, np.pi/120, 120, min_theta=np.pi/36, max_theta=np.pi-np.pi/36)
for line in lines:
    rho,theta = line[0]
    # skip near-vertical lines
    if abs(theta-np.pi/90) < np.pi/9:
        continue
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a*rho
    y0 = b*rho
    x1 = int(x0 + 10000*(-b))
    y1 = int(y0 + 10000*(a))
    x2 = int(x0 - 10000*(-b))
    y2 = int(y0 - 10000*(a))
    cv2.line(img,(x1,y1),(x2,y2),(0,255,0),1)
At this step, there are multiple options to find the intersection point of the lines, the vanishing points. I will list some of them below.
Best approximation: Each of these lines has a known (ρ,θ) and (ideally) passes through one of only two (x,y) points; let's call the left one (x0,y0) and the right one (x1,y1). If you write the equation above, ρ=xcosθ+ysinθ, for every line n, you get ρ_n=[x y][cosθ_n sinθ_n]^T, a linear system with more equations than unknowns. This turns the problem into a linear regression, and you can solve for the best (x,y) point in the least-squares sense. You can split the lines into two groups based on their slope and create two linear systems, one for (x0,y0) and one for (x1,y1).
Cumbersome solution: As mentioned in one of the comments, you can find the pairwise intersections of all lines, then cluster them based on proximity, and threshold the clusters based on number of intersections. Then you can output the cluster means of the two most populated clusters.
Trivial image-based solution: Since you already have the image of the intersections, you can do some image processing to find the points. This is by no means an exact solution; it is meant as a quick approximation. You can get rid of the lines with an erosion whose kernel is about the same size as the line width. Then you can strengthen the intersections with a dilation using a larger kernel. Finally, an opening (erosion followed by dilation) with a slightly larger kernel leaves only the strongest intersections. You can output the mean position of each remaining blob as a vanishing point.
Below, you can see the line image before, and the resulting image with the left and right blobs after running the code below.
# delete lines (img2 is the image containing the drawn lines)
kernel = np.ones((3,3),np.uint8)
img2 = cv2.erode(img2,kernel,iterations = 1)
# strengthen intersections
kernel = np.ones((9,9),np.uint8)
img2 = cv2.dilate(img2,kernel,iterations = 1)
# opening (erode, then dilate): keep only the strongest blobs
kernel = np.ones((11,11),np.uint8)
img2 = cv2.erode(img2,kernel,iterations = 1)
img2 = cv2.dilate(img2,kernel,iterations = 1)
cv2.imwrite('points.jpg', img2)
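For the least-squares option listed above, a minimal NumPy sketch might look like this. It assumes the lines have already been split into the group that belongs to one vanishing point; the function name `vanishing_point` and the synthetic test lines are illustrative, not taken from the answer. Because (cosθ, sinθ) is a unit normal, the residual of each equation is the perpendicular distance from the point to that line, so the solution minimizes the sum of squared distances.

```python
import numpy as np

def vanishing_point(rho, theta):
    """Least-squares (x, y) for the system rho_n = x*cos(theta_n) + y*sin(theta_n)."""
    A = np.column_stack([np.cos(theta), np.sin(theta)])  # one row per line
    xy, *_ = np.linalg.lstsq(A, rho, rcond=None)
    return xy

# Synthetic check: build four lines that all pass through a known point
true_pt = np.array([320.0, 150.0])
theta = np.deg2rad([30.0, 60.0, 100.0, 140.0])
rho = true_pt[0] * np.cos(theta) + true_pt[1] * np.sin(theta)

est = vanishing_point(rho, theta)  # should recover true_pt up to rounding
```

With real Hough output you would pass `lines[:, 0, 0]` and `lines[:, 0, 1]` of each group; outlier lines pull a plain least-squares estimate, so filtering them first is advisable.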
If you want to find the vanishing point in an image, you first need to detect the lines. For this you can use the Hough transform, which finds the possible lines in the image; you can tune its parameters according to your needs. The point where most of the detected lines intersect is then the vanishing point. This is only an estimate rather than an exact result, but it is usually a good one. You can also use other forms of the Hough transform according to your needs.
In this case standard Hough transform is enough.
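One way to turn "where most of the lines intersect" into a number is to intersect every pair of (ρ,θ) lines and take a robust average. The sketch below uses the coordinate-wise median as a simple stand-in for proper clustering, which is a simplification and only suits a single vanishing point; the function names and the synthetic lines are assumptions for illustration.

```python
import itertools
import numpy as np

def intersect(l1, l2, eps=1e-9):
    """Intersection of two lines given as (rho, theta); None if they are near-parallel."""
    (r1, t1), (r2, t2) = l1, l2
    A = np.array([[np.cos(t1), np.sin(t1)],
                  [np.cos(t2), np.sin(t2)]])
    if abs(np.linalg.det(A)) < eps:
        return None
    return np.linalg.solve(A, np.array([r1, r2]))

def estimate_vanishing_point(lines):
    """Coordinate-wise median of all pairwise intersections (needs at least one)."""
    pts = []
    for a, b in itertools.combinations(lines, 2):
        p = intersect(a, b)
        if p is not None:
            pts.append(p)
    return np.median(np.array(pts), axis=0)

# Synthetic check: four lines through one point plus one unrelated line
true_pt = np.array([100.0, 50.0])
thetas = np.deg2rad([20.0, 70.0, 120.0, 160.0])
lines = [(true_pt[0] * np.cos(t) + true_pt[1] * np.sin(t), t) for t in thetas]
lines.append((10.0, np.deg2rad(45.0)))  # outlier that misses the point

vp = estimate_vanishing_point(lines)
```

The number of pairs grows quadratically with the number of lines, so it helps to deduplicate the Hough output before doing this.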
I'm trying to write a piece of code that can detect and isolate straight lines from an image. I'm using the opencv library, together with Canny edge detection and Hough transformation to achieve this. So far I've come up with the following:
import numpy as np
import cv2
# Reading the image
img = cv2.imread('sudoku-original.jpg')
# Convert the image to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Edge detection
edges = cv2.Canny(gray,50,150,apertureSize = 3)
# Line detection
lines = cv2.HoughLines(edges,1,np.pi/180,200)
for rho,theta in lines[0]:
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a*rho
    y0 = b*rho
    x1 = int(x0 + 1000*(-b))
    y1 = int(y0 + 1000*(a))
    x2 = int(x0 - 1000*(-b))
    y2 = int(y0 - 1000*(a))
    cv2.line(img,(x1,y1),(x2,y2),(0,0,255),2)
cv2.imwrite('linesDetected.jpg',img)
In theory this code snippet should do the job, but unfortunately it doesn't. The resulting picture clearly shows only one line being found. I'm not quite sure what I'm doing wrong here and why it's only detecting one specific line. Could someone possibly figure out the problem here?
Although OpenCV's Hough Transform tutorial uses just one loop, the shape of lines is actually (N, 1, 2), so lines[0] gives you only one (rho, theta) pair, and therefore only one line. My suggestion is to use a double loop (below) or some NumPy slicing to keep a single loop. To get the whole grid detected, as Dan Masek mentioned, you will need to play with the edge detection logic. Maybe see the solution that uses HoughLinesP.
for item in lines:
    for rho,theta in item:
        ...
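The single-loop alternative mentioned above can be illustrated without running a Hough transform at all. The array below is a hand-made stand-in with the same (N, 1, 2) layout; its values are arbitrary.

```python
import numpy as np

# Stand-in for the output of cv2.HoughLines: shape (N, 1, 2), dtype float32
lines = np.array([[[120.0, 0.0]],
                  [[80.0, np.pi / 2]],
                  [[200.0, np.pi / 4]]], dtype=np.float32)

# lines[0] has shape (1, 2): iterating it yields a single (rho, theta) pair
first_only = [tuple(p) for p in lines[0]]

# Slicing away the middle axis gives shape (N, 2): one pair per detected line
all_pairs = [tuple(p) for p in lines[:, 0]]

# reshape(-1, 2) does the same and does not depend on where the singleton axis is
also_all = [tuple(p) for p in lines.reshape(-1, 2)]
```

So `for rho, theta in lines[:, 0]:` is a drop-in replacement for the double loop.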
I have an image that has items inside a grid. One stage of the problem is to detect and build a mask of the full grid, which can be slightly rotated clockwise or anticlockwise. My current CV pipeline extracts (probabilistic) Hough lines from an image and then uses its contours to filter for a set of rectangles (call the actual and the detected sets R' and R respectively). However, due to occlusions and lighting conditions, the Hough lines (and consequently all downstream contours and line segments) are incomplete (R << R'). Schematically, the problem I must solve is to infer the missing grid components (R' - R) given the detected grid cells.
One strategy I am considering is the following. For each detected rectangle contour r in R do:
1- using fitLine(), find the vertical and horizontal lines that pass through the center of r (see code and image below)
rect = cv2.minAreaRect(r)
box = cv2.boxPoints(rect)
box = np.intp(box)  # np.int0 was an alias of np.intp and is no longer available in NumPy 2.0
cx = int(rect[0][0])
cy = int(rect[0][1])
w = int(rect[1][0])
h = int(rect[1][1])
cv2.drawContours(img,[box],0,255,1)
[vx,vy,x,y] = cv2.fitLine(box, cv2.DIST_L2,0,0.01,0.01)
lefty = int((-x*vy/vx) + y)
righty = int(((cols-x)*vy/vx)+y)
start = (cols-1,righty)
end = (0,lefty)
cv2.line(img,start,end,255,1)
# HORIZONTAL
nx,ny = 1,-vx/vy
mag = np.sqrt((1+ny**2))
vx,vy = nx/mag,ny/mag
# Now find two extreme points on the line to draw line
lefty = int((-x*vy/vx) + y)
righty = int(((cols-x)*vy/vx)+y)
start = (cols-1,righty)
end = (0,lefty)
cv2.putText(img,'start',(start[0]-60,start[1]),font,0.5,255,1)
cv2.putText(img,'end',end,font,0.5,255,1)
cv2.line(img,start,end,255,1)
2- create a mask of the same size and rotation as the source rectangle and move it along the line defined in (1), drawing a rectangle of the same shape as the source wherever there is no previously detected contour in the same area
ie.
My questions are:
Is there a more efficient and better way to solve this problem?
What is the best way to move the mask along a line that is at an angle (given by cv2.minAreaRect)?
Thank you