Automatic skew correction using OpenCV [duplicate] - python

This question already has answers here:
Python OpenCV skew correction for OCR
(3 answers)
Closed 10 months ago.
I want a way to automatically detect and correct the skew of an image of a receipt.
I tried to compute the variance between the rows for various angles of rotation and choose the angle with the maximum variance.
To calculate the variance I did the following:
1. For each row, I calculated the sum of the pixel values and stored it in a list.
2. Found the variance of the list using np.var(list)
import cv2 as cv
import numpy as np

src = cv.imread(f_name, cv.IMREAD_GRAYSCALE)  # f_name: path to the receipt image
blurred = cv.medianBlur(src, 9)
# threshold the blurred image (the original thresholded src, leaving the blur unused)
ret, thresh2 = cv.threshold(blurred, 127, 255, cv.THRESH_BINARY_INV)
height, width = thresh2.shape[:2]
print(height, width)

res = [-1, 0]  # [best variance, best angle]
for angle in range(0, 100, 10):
    rotated_temp = deskew(thresh2, angle)  # deskew() is my rotation helper (not shown)
    cv.imshow('rotated_temp', rotated_temp)
    cv.waitKey(0)
    height, width = rotated_temp.shape[:2]
    # sum of pixel values per row (np.sum(rotated_temp, axis=1) does the same in one call)
    li = []
    for i in range(height):
        row_sum = 0
        for j in range(width):
            row_sum += rotated_temp[i][j]
        li.append(row_sum)
    curr_variance = np.var(li)
    print(curr_variance, angle)
    if curr_variance > res[0]:
        res[0] = curr_variance
        res[1] = angle

print(res)
final_rot = deskew(src, res[1])
cv.imshow('final_rot', final_rot)
cv.waitKey(0)
However, the variance for a skewed image comes out higher than for the properly aligned image. Is there any way to correct this?
variance for the horizontal text-aligned image (required): 122449908.009789
variance for the vertical text-aligned image: 1840071444.404522
I have also tried HoughLines, but since the spacing between the lines of text is too small, vertical lines are detected as well, so this approach also fails.
Any modifications or other approaches are appreciated.

Working code for skew correction
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image as im
from scipy.ndimage import interpolation as inter

input_file = r'E:\flaskV8\test1.jpg'
img = im.open(input_file)

# convert to binary
wd, ht = img.size
pix = np.array(img.convert('1').getdata(), np.uint8)
bin_img = 1 - (pix.reshape((ht, wd)) / 255.0)
plt.imshow(bin_img, cmap='gray')
plt.savefig(r'E:\flaskV8\binary.png')

def find_score(arr, angle):
    # rotate, project onto the y-axis, and score by how sharply adjacent
    # row sums differ (this peaks when the text lines are horizontal)
    data = inter.rotate(arr, angle, reshape=False, order=0)
    hist = np.sum(data, axis=1)
    score = np.sum((hist[1:] - hist[:-1]) ** 2)
    return hist, score

delta = 1
limit = 5
angles = np.arange(-limit, limit + delta, delta)
scores = []
for angle in angles:
    hist, score = find_score(bin_img, angle)
    scores.append(score)

best_score = max(scores)
best_angle = angles[scores.index(best_score)]
print('Best angle: {}'.format(best_angle))

# correct the skew and save
data = inter.rotate(bin_img, best_angle, reshape=False, order=0)
img = im.fromarray((255 * data).astype("uint8")).convert("RGB")
img.save(r'E:\flaskV8\skew_corrected.png')
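Note that scipy.ndimage.interpolation is deprecated in newer SciPy releases (scipy.ndimage.rotate works directly). If you would rather stay within OpenCV, a rotation helper along these lines (my sketch, not part of the original answer) can stand in for inter.rotate:

import cv2

def rotate_cv(image, angle):
    # rotate about the image center, keeping the output size fixed,
    # mirroring inter.rotate(..., reshape=False)
    h, w = image.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_NEAREST)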

Related

Histogram equalization gives me all black image

I am studying image processing and I am writing code to do histogram equalization, but the output always gives me an all-black image.
This is my code:
import numpy as np
import scipy.misc, math
from PIL import Image
img = Image.open('/home/thaidy/Desktop/ex.jpg').convert('L')
#converting to ndarray
img1 = np.asarray(img)
#converting to 1D
fl = img1.flatten()
#histogram and the bins are computed
hist, bins = np.histogram(img1,256,[0,255])
#cdf computed
cdf = hist.cumsum()
#places where cdf = 0 is ignored
#rest stored in cdf_m
cdf_m = np.ma.masked_equal(cdf,0)
#histogram eq is performed
num_cdf_m = (cdf_m - cdf_m.min())*255
den_cdf_m = (cdf_m.max()-cdf_m.min())*255
cdf_m = num_cdf_m/den_cdf_m
#the masked places are now 0
cdf = np.ma.filled(cdf_m,0).astype('uint8')
#cdf values assigned in the flattened array
im2 = cdf[fl]
#transforming back to 2D
im3 = np.reshape(im2,img1.shape)
im4 = Image.fromarray(im3)
im4.save('/home/thaidy/Desktop/output.jpg')
im4.show()
The cdf needs to be normalized before the equalization.
One way to do that is to set the optional parameter density of np.histogram to True:
hist, bins = np.histogram(img1,256,[0,255],density=True)
Another way is to divide the cdf by the total pixel count after computing it (note the cast to float, since in-place division fails on an integer array):
cdf = hist.cumsum().astype(np.float64)
cdf /= cdf[-1]
I would also change the equalization part to simply:
T = (255 * cdf).astype(np.uint8)
im2 = T[fl]
Wikipedia suggests this transformation formula instead:
T = (np.ceil(256 * cdf) - 1).astype(np.uint8)
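Putting these pieces together, a minimal corrected version might look like this (a sketch; the paths are placeholders):

import numpy as np
from PIL import Image

img = Image.open('input.jpg').convert('L')
img1 = np.asarray(img)

hist, bins = np.histogram(img1, 256, [0, 255])
cdf = hist.cumsum().astype(np.float64)
cdf /= cdf[-1]                      # normalize so the cdf ends at 1.0

T = (255 * cdf).astype(np.uint8)    # lookup table mapping old -> new intensity
im3 = T[img1.flatten()].reshape(img1.shape)
Image.fromarray(im3).save('output.jpg')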

How to detect a grainy line?

I am trying to detect a grainy printed line on a paper with cv2, and I need the angle of the line. I don't have much knowledge of image processing and I only need to detect the line. I tried to play with the parameters, but the angle is always detected wrong. Could someone help me? This is my code:
import cv2
import numpy as np
import matplotlib.pylab as plt
from matplotlib.pyplot import figure

img = cv2.imread('CamXY1_1.bmp')
crop_img = img[100:800, 300:900]
blur = cv2.GaussianBlur(crop_img, (1, 1), 0)
ret, thresh = cv2.threshold(blur, 150, 255, cv2.THRESH_BINARY)
gray = cv2.cvtColor(thresh, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 60, 150)
figure(figsize=(15, 15), dpi=150)
plt.imshow(edges, 'gray')
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)
for rho, theta in lines[0]:
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a * rho
    y0 = b * rho
    x1 = int(x0 + 3000 * (-b))
    y1 = int(y0 + 3000 * (a))
    x2 = int(x0 - 3000 * (-b))
    y2 = int(y0 - 3000 * (a))
    cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
[image: the printed line to be detected]
Here's a possible solution to estimate the line (and its angle) without using the Hough line transform. The idea is to locate the start and end points of the line using cv2.reduce. This function can reduce an image to a single column or row; if we reduce with the SUM mode, we get the total sum of all the pixels across the reduced image. Using this info we can estimate the extreme points of the line and calculate its angle. These are the steps:
Resize your image because it is way too big
Get a binary image via adaptive thresholding
Define two extreme regions of the image and crop them
Reduce the ROIs to a column using the SUM mode, which is the sum of all rows
Accumulate the total values above a threshold value
Estimate the starting and ending points of the line
Get the angle of the line
Here's the code:
# imports:
import cv2
import numpy as np
import math
# image path
path = "D://opencvImages//"
fileName = "mmCAb.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Scale your BIG image into a small one:
scalePercent = 0.3
# Calculate the new dimensions
width = int(inputImage.shape[1] * scalePercent)
height = int(inputImage.shape[0] * scalePercent)
newSize = (width, height)
# Resize the image:
inputImage = cv2.resize(inputImage, newSize, None, None, None, cv2.INTER_AREA)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert BGR to grayscale:
grayInput = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Adaptive Thresholding:
windowSize = 51
windowConstant = 11
binaryImage = cv2.adaptiveThreshold(grayInput, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
The first step is to get the binary image. Note that I previously downscaled your input because it is too big and we don't need all that info. This is the binary mask:
Now, we don't need most of the image. In fact, since the line runs across the whole image, we only need to "trim" the first and last columns and check where the white pixels begin. I'll crop columns a little wider, though, so we can ensure we have enough data and as little noise as possible. I'll define two Regions of Interest (ROIs) and crop them. Then, I'll reduce each ROI to a column using the SUM mode, which gives me the sum of all intensities across each row. After that, I can accumulate the locations where the sum exceeds a certain threshold and approximate the location of the line, like this:
# Define the regions that will be cropped
# from the original image:
lineWidth = 5
cropPoints = [(0, 0, lineWidth, height), (width - lineWidth, 0, lineWidth, height)]

# Store the line points here:
linePoints = []

# Loop through the crop points and
# crop the ROI:
for p in range(len(cropPoints)):
    # Get the ROI:
    (x, y, w, h) = cropPoints[p]
    # Crop the ROI:
    imageROI = binaryImage[y:y+h, x:x+w]
    # Reduce the ROI to an n rows x 1 column matrix:
    reducedImg = cv2.reduce(imageROI, 1, cv2.REDUCE_SUM, dtype=cv2.CV_32S)
    # Get the height (or length) of the array:
    reducedHeight = reducedImg.shape[0]
    # Define a threshold and accumulate
    # the coordinates of the points:
    threshValue = 100
    pointSum = 0
    pointCount = 0
    for i in range(reducedHeight):
        currentValue = reducedImg[i]
        if currentValue > threshValue:
            pointSum = pointSum + i
            pointCount = pointCount + 1
    # Get average coordinate of the line
    # (the original snippet used the undefined names accX / pixelCount here):
    y = int(pointSum / pointCount)
    # Store in list:
    linePoints.append((x, y))
The red rectangles show the regions I cropped from the input image:
Note that I've stored both points in the linePoints list. Let's check out our approximation by drawing a line that connects both points:
# Get the two points:
p0 = linePoints[0]
p1 = linePoints[1]
# Draw the line:
cv2.line(inputImageCopy, (p0[0], p0[1]), (p1[0], p1[1]), (255, 0, 0), 1)
cv2.imshow("Line", inputImageCopy)
cv2.waitKey(0)
Which yields:
Not bad, huh? Now that we have both points, we can estimate the angle of this line:
# Get angle:
adjacentSide = p1[0] - p0[0]
oppositeSide = p0[1] - p1[1]
# Compute the angle alpha:
alpha = math.degrees(math.atan(oppositeSide / adjacentSide))
print("Angle: "+str(alpha))
This prints:
Angle: 0.534210901840831
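One caveat (my addition, not from the original answer): math.atan divides the opposite side by the adjacent side, so it breaks down for a perfectly vertical line, where adjacentSide is zero. math.atan2 handles that case and returns a signed angle directly:

# atan2 avoids the division by zero when the line is vertical:
alpha = math.degrees(math.atan2(oppositeSide, adjacentSide))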

How to measure distance on linear polar transformed data using OpenCV

I am working on some image data on which I am doing a polar transform, where I want to measure the width of bright rings in a circular type of object.
Example image:
So far I have something like this using faux data:
import cv2
import numpy as np
img = cv2.imread('testimg.tif')
img_gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
#threshold image to calculate center of object
ret,thresh = cv2.threshold(img_gry,254,255,cv2.THRESH_BINARY_INV)
M = cv2.moments(thresh)
cX = int(M["m10"] / M["m00"])
cY = int(M["m01"] / M["m00"])
#convert white space around object to 0 intensity
img_gry[img_gry == 255] = 0
#calculate radius of image to be used for polar transform
radius = np.sqrt(((img_gry.shape[0]/2.0)**2.0)+((img_gry.shape[1]/2.0)**2.0))
#transform using center coordinates and radius
polar_image = cv2.linearPolar(img_gry,(cX, cY), radius, cv2.WARP_FILL_OUTLIERS)
polar_image = polar_image.astype(np.uint8)
#add gaussian smoothing
polar_blurred = cv2.GaussianBlur(polar_image,(3,3),0)
This transform looks something like this:
And I will be looking at slices of the data that show intensity, like such:
My question from here is what formula to use to calculate the width of the bright peaks in the image. I don't really know what type of axes are used for displaying this transformation, which underlies my problem. For example, my non-transformed peaks have a width of ~3px, but the transformed data has a peak width of 8 units (radians? no clue). I'm wondering how exactly I can estimate the actual width of my non-transformed data based off the "distance" in this polar transformed data.
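For reference, cv2.linearPolar maps the output axes linearly: column x corresponds to radius rho = x * maxRadius / width, and row y corresponds to angle theta = y * 2*pi / height (in radians). So a width measured along the x-axis of the polar image can be converted back to radial pixels; a small sketch under that assumption, reusing the radius variable from the code above:

# convert a peak width measured in polar columns back to radial pixels
h, w = polar_image.shape[:2]
peak_width_cols = 8                            # width read off the polar plot
peak_width_px = peak_width_cols * radius / w   # radial extent in original pixels
print(peak_width_px)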

finding edge in tilted image with Canny

I'm trying to find the tilt angle in a series of images which look like the created example data below. There should be a clear edge which is visible by eye. However, I've been struggling to extract the edge so far. Is Canny the right way of finding the edge here, or is there a better way?
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage.filters import gaussian_filter
# create data
xvals = np.arange(0,2000)
yvals = 10000 * np.exp((xvals - 1600)/200) + 100
yvals[1600:] = 100
blurred = gaussian_filter(yvals, sigma=20)
# create image
img = np.tile(blurred,(2000,1))
img = np.swapaxes(img,0,1)
# rotate image
rows,cols = img.shape
M = cv.getRotationMatrix2D((cols/2,rows/2),3.7,1)
img = cv.warpAffine(img,M,(cols,rows))
# convert to uint8 for Canny
img_8 = cv.convertScaleAbs(img,alpha=(255.0/65535.0))
fig,ax = plt.subplots(3)
ax[0].plot(xvals,blurred)
ax[1].imshow(img)
# find edge
ax[2].imshow(cv.Canny(img_8, 20, 100, apertureSize=5))
You can find the angle by converting your image to binary (cv2.threshold with cv2.THRESH_BINARY) and then searching for contours.
When you locate your contour (the line), you can fit a line to it with cv2.fitLine() and get two points on your line. My math is not very good, but I think the linear equation goes f(x) = k*x + n, and you can get k from those two points (k = (y2-y1)/(x2-x1)) and finally the angle phi = arctan(k). (If I'm wrong, please correct it.)
You can also use the rotated bounding rectangle - cv2.minAreaRect() - which already returns the angle of the rectangle (rect = cv2.minAreaRect() --> rect[2]). Hope it helps. Cheers!
Here is an example code:
import cv2
import numpy as np
import math

img = cv2.imread('angle.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, threshold = cv2.threshold(gray, 170, 255, cv2.THRESH_BINARY)
# OpenCV 3.x returns three values here; in OpenCV 4.x, drop `im`:
im, contours, hierarchy = cv2.findContours(threshold, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)

for c in contours:
    area = cv2.contourArea(c)
    perimeter = cv2.arcLength(c, False)
    if area < 10001 and 100 < perimeter < 1000:
        # first approach - fit a line and calculate with y=kx+n --> angle=tan^(-1)k
        rows, cols = img.shape[:2]
        [vx, vy, x, y] = cv2.fitLine(c, cv2.DIST_L2, 0, 0.01, 0.01)
        lefty = int((-x * vy / vx) + y)
        righty = int(((cols - x) * vy / vx) + y)
        cv2.line(img, (cols - 1, righty), (0, lefty), (0, 255, 0), 2)
        (x1, y1) = (cols - 1, righty)
        (x2, y2) = (0, lefty)
        k = (y2 - y1) / (x2 - x1)
        angle = math.atan(k) * 180 / math.pi
        print(angle)
        # second approach - cv2.minAreaRect --> returns (center (x,y), (width, height), angle of rotation)
        rect = cv2.minAreaRect(c)
        box = cv2.boxPoints(rect)
        box = np.int0(box)
        cv2.drawContours(img, [box], 0, (0, 0, 255), 2)
        print(rect[2])

cv2.imshow('img2', img)
Original image:
Output:
-3.8493663478518627
-3.7022125720977783
tribol,
it seems like you can take the gradient image G = |Gx| + |Gy| (normalized to some known range), compute its histogram, and take the top bins of it. That will give you an approximate mask of the line. Then you can do line fitting; it'll give you a good initial guess.
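A minimal sketch of that idea (the Sobel kernel size and the 98th-percentile cutoff are my assumptions, and 'angle.png' just reuses the example file from the answer above):

import cv2
import numpy as np

img = cv2.imread('angle.png', cv2.IMREAD_GRAYSCALE)

# gradient image G = |Gx| + |Gy|, normalized to 0-255
gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)
grad = cv2.normalize(np.abs(gx) + np.abs(gy), None, 0, 255, cv2.NORM_MINMAX)

# top histogram bins -> approximate mask of the line
mask = grad > np.percentile(grad, 98)
ys, xs = np.nonzero(mask)
pts = np.column_stack((xs, ys)).astype(np.float32)

# fit a line to the masked points as an initial guess
vx, vy, x0, y0 = cv2.fitLine(pts, cv2.DIST_L2, 0, 0.01, 0.01).ravel()
print(np.degrees(np.arctan2(vy, vx)))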
A very simple way of doing it is as follows... adjust my numbers to suit your knowledge of the data.
Normalise your image to a scale of 0-255.
Choose two points A and B, where A is 10% of the image width in from the left side and B is 10% in from the right side. The distance AB is now 0.8 x 2000, or 1600 px.
Go North from point A sampling your image till you exceed some sensible threshold that means you have met the tilted line. Note the Y value at this point, as YA.
Do the same, going North from point B till you meet the tilted line. Note the Y value at this point, as YB.
The angle you seek is:
arctan((YB - YA) / 1600)
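A rough sketch of those steps (the threshold, scan direction, and margins are my assumptions; the image is a single-channel array already normalized to 0-255):

import numpy as np

def tilt_angle(img, thresh=128):
    h, w = img.shape
    ax, bx = int(0.1 * w), int(0.9 * w)   # points A and B, 10% in from each side
    # scan upward from the bottom row until the threshold is exceeded
    ya = next(y for y in range(h - 1, -1, -1) if img[y, ax] > thresh)
    yb = next(y for y in range(h - 1, -1, -1) if img[y, bx] > thresh)
    return np.degrees(np.arctan2(yb - ya, bx - ax))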
Thresholding as suggested by kavko didn't work that well, as the intensity varied from image to image (I could of course consider the histogram of each image to improve this approach). I ended up taking the maximum of the gradient in the y-direction:

import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage, stats

def rotate_image(image):
    blur = ndimage.gaussian_filter(image, sigma=10)  # blur image first
    grad = np.gradient(blur, axis=0)                 # take gradient along y-axis
    grad[grad > 10000] = 0                           # filter unreasonably high values
    idx_maxline = np.argmax(grad, axis=0)            # y-indices of max slope = indices of edge
    mean = np.mean(idx_maxline)
    std = np.std(idx_maxline)
    idx = np.arange(idx_maxline.shape[0])
    # filter positions where the highest slope is at a different position (blobs)
    idx_filtered = idx[(idx_maxline < mean + std) & (idx_maxline > mean - std)]
    slope, intercept, r_value, p_value, std_err = stats.linregress(idx_filtered, idx_maxline[idx_filtered])
    # arctan turns the fitted slope into an angle in degrees; for small tilts
    # this is nearly identical to the original slope*180/np.pi shortcut
    out = ndimage.rotate(image, np.degrees(np.arctan(slope)), reshape=False)
    return out

out = rotate_image(img)  # img comes from the question's snippet above
plt.imshow(out)

Convert 2D Shape to 1D Space (Shape Classification)

I'm looking for an example of code/library in Python to convert a 2D shape to 1D space based on the following steps:
Find the centroid of the shape.
By choosing the centroid as a reference origin, unwrap the outer contour counterclockwise to turn it into a distance signal composed of the distances between each boundary pixel and the centroid (like the image).
Thank you!
I did something like this a while back for fun, inspired by a Kaggle competition on leaf classification. I used opencv for finding the contours of the images. Below is the code for python 2.7. See here for the orientation of the returned contour. You may have to adapt it for your needs, specifically the thresholding part. Hope this helps.
import cv2
import numpy as np
import matplotlib.pyplot as plt

def shape_desc(im):
    # threshold image
    _, bw = cv2.threshold(im, 128, 255, cv2.THRESH_BINARY)
    # find contours on the thresholded image (the original passed `im` here,
    # leaving the threshold result unused)
    contours, hierarchy = cv2.findContours(bw.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    # extract largest contour
    largest_idx = np.argmax([len(contours[i]) for i in range(0, len(contours))])
    # get (x,y) coordinates
    x = np.array([contours[largest_idx][i][0][0] for i in range(0, len(contours[largest_idx]))], dtype=float).reshape((len(contours[largest_idx]), 1))
    y = np.array([contours[largest_idx][i][0][1] for i in range(0, len(contours[largest_idx]))], dtype=float).reshape((len(contours[largest_idx]), 1))
    # find the centroid
    m = cv2.moments(np.array([[x[i][0], y[i][0]] for i in range(0, len(x))]).reshape((-1, 1, 2)).astype(np.int32))
    x_bar = m['m10'] / m['m00']
    y_bar = m['m01'] / m['m00']
    x_1 = np.array([i[0] for i in x])
    y_1 = np.array([i[0] for i in y])
    # take the centroid as the reference
    x = x_1 - x_bar
    y = y_1 - y_bar
    return np.sqrt(x * x + y * y)
Here are the results of applying this for the following images that are similar in shape. Note that the images and plots have been rescaled.
filename = '19.jpg'
im = cv2.imread(filename, 0)
desc = shape_desc(im)
plt.stem(desc)
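One practical follow-up (my suggestion, not part of the original answer): to compare descriptors across shapes of different sizes and contour lengths, it is common to normalize by scale and resample to a fixed length:

import numpy as np

def normalize_desc(desc, n=256):
    desc = desc / desc.max()   # scale invariance
    # resample to a fixed number of samples so descriptors are comparable
    xs = np.linspace(0, len(desc) - 1, n)
    return np.interp(xs, np.arange(len(desc)), desc)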
