I am trying to remove the grayish “noise” surrounding the dates using Python/OpenCV to help OCR (Optical Character Recognition) recognize the dates.
The original image looks like this: https://static.mothership.sg/1/2017/03/10-Feb-MC-1.jpg
The Python script I tried is below. However, I have other similar images in which the contrast or lighting conditions vary.
import cv2
import numpy as np
img = cv2.imread("mc.jpeg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
alpha = 3.5
beta = -2
new = alpha * img + beta
new = np.clip(new, 0, 255).astype(np.uint8)
cv2.imwrite("cleaned.png", new)
I also tried thresholding and/or adaptiveThreshold, and sometimes I was able to separate the dates from the grayish background, but sometimes it was very challenging. Is there an automatic way to determine the threshold value?
Below is an example of what I hope to achieve.
Blurry Image:
Otsu's Binarization automatically calculates a threshold value from an image histogram.
# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(img,(5,5),0)
ret,Otsu = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
cv2.imwrite("Otsu's_thresholding", Otsu)
see this link
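As a small aside (my addition, not from the linked page): the first value returned by cv2.threshold in the snippet above is the threshold Otsu actually computed, so you can print it to see what was picked for a given image:
print("Otsu's threshold:", ret)  # ret holds the value Otsu selected from the blurred image's histogram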
You can try to build a model of the background and then weight each input pixel by that model. The gain should be relatively constant across most of the image. These are the steps for this method:
Apply a soft median blur filter to get rid of small noise
Get the model of the background via local maximum. Apply a very strong close operation, with a big structuring element (I’m using a rectangular kernel of size 15)
Perform gain adjustment by dividing 255 by each local-maximum pixel and weighting each input pixel by that value.
You should get a nice image where the background illumination is pretty much normalized; threshold this image to get a binary mask of the text.
This is the code:
import numpy as np
import cv2
# image path
path = "C:/opencvImages/sheet01.jpg"
# Read an image in default mode:
inputImage = cv2.imread(path)
# Remove small noise via median:
filterSize = 5
imageMedian = cv2.medianBlur(inputImage, filterSize)
# Get local maximum:
kernelSize = 15
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
localMax = cv2.morphologyEx(imageMedian, cv2.MORPH_CLOSE, maxKernel, None, None, 1, cv2.BORDER_REFLECT101)
# Adjust image gain:
height, width, depth = localMax.shape
# Create output Mat:
outputImage = np.zeros(shape=[height, width, depth], dtype=np.uint8)
for i in range(0, height):
    for j in range(0, width):
        # Get current BGR pixels:
        v1 = inputImage[i, j]
        v2 = localMax[i, j]
        # Gain adjust:
        tempArray = []
        for c in range(0, 3):
            currentPixel = v2[c]
            if currentPixel != 0:
                gain = 255 / v2[c]
                gain = v1[c] * gain
            else:
                gain = 0
            # Gain set and clamp:
            tempArray.append(np.clip(gain, 0, 255))
        # Set pixel vec to out image:
        outputImage[i, j] = tempArray
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(outputImage, cv2.COLOR_BGR2GRAY)
# Threshold:
threshValue = 110
_, binaryImage = cv2.threshold(grayscaleImage, threshValue, 255, cv2.THRESH_BINARY)
# Write image:
imageFilename = "C:/opencvImages/binaryMask2.png"
cv2.imwrite(imageFilename, binaryImage)
I get the following results testing the complete image:
And the cropped text:
Please note that the gain adjustment operations are not vectorized. The script is slow, mainly because I'm starting with Python and don't know the proper NumPy syntax to speed up this operation. I've been using C++ for a long time, so feel free to further improve the code (a possible vectorized sketch follows below).
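For what it's worth, here is a possible vectorized version of the gain adjustment (a sketch of the same idea using NumPy broadcasting; I have not benchmarked it against the loop above):
# Vectorized gain adjustment: divide 255 by the background model, guard against division by zero,
# then weight the input image and clamp to the 0-255 range.
localMaxFloat = localMax.astype(np.float64)
gain = np.zeros_like(localMaxFloat)
np.divide(255.0, localMaxFloat, out=gain, where=localMaxFloat != 0)
outputImage = np.clip(inputImage * gain, 0, 255).astype(np.uint8)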
Edit:
Please be aware that your result can only be as good as the quality of your input. Look at your input and ask yourself "Is this a good input for an automated process?" (Automated processes are usually not very smart.) The second picture you posted is very low quality. Not only is it blurry, but it is also low-res and has compression artifacts. All these factors will hinder automated processing.
With that said, here's an improvement you can include in the original:
Try to normalize brightness-contrast on the grayscale output:
grayscaleImage = np.uint8(cv2.normalize(grayscaleImage, grayscaleImage, 0, 255, cv2.NORM_MINMAX))
Your grayscale image goes from this:
to this:
A little bit darker and with improved contrast. Let's try to compute the optimal threshold value automatically via Otsu thresholding:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
It gets you this:
However, we can adjust the result if we add bias to Otsu's threshold, like this:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
bias = 0.9
threshValue = bias * threshValue
_, binaryImage = cv2.threshold(grayscaleImage, threshValue, 255, cv2.THRESH_BINARY)
That's the best quality you can get with these images using this method.
If you find these suggestions and tips useful, please, at least up-vote my answer.
I have this image with tables, and I want to remove the tabular structure from the image so that Tesseract can work on it more effectively. I used the following code to create a boundary around the table (and the individual cells) so that it can be deleted.
img = cv2.imread('bfir.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,50,150,apertureSize = 3)
img1 = np.ones(img.shape, dtype=np.uint8)*255
ret,thresh = cv2.threshold(gray,127,255,1)
(_,contours,h) = cv2.findContours(thresh,1,2)
for cnt in contours:
    approx = cv2.approxPolyDP(cnt, 0.01*cv2.arcLength(cnt, True), True)
    if len(approx) == 4:
        cv2.drawContours(img1, [cnt], 0, (0,255,0), 2)
This draws green lines around the table like this image.
Next, I tried the cv2.subtract method to subtract the table from the image, somewhat like this.
final_img = cv2.subtract(img1, img)
But this didn't work as I expected; it gives me a grayscale image with the table still in it. Link
I just want the original image in B&W with the table removed. I am using OpenCV for the first time, so I don't know what I am doing wrong, and I am sorry for the long post, but if anybody can help with how to go about this, or just point me in the right direction on how to remove the table, that would be very much appreciated.
EDIT:
As suggested by RobAu, it could also work by simply drawing the contours in white in the first place, but I don't know how to do that without losing the rest of the data in the preprocessing stage.
You could try to simply overwrite the cells that represent the borders. This can be done by creating a mask image, and then using that as a reference for where to overwrite pixels in the original.
This can be done with:
mask_image = np.zeros(img.shape[0:2], np.uint8)
cv2.drawContours(mask_image, contours, -1, color=255, thickness=2)
border_points = np.array(np.where(mask_image == 255)).transpose()
background = [0, 0, 0] # Change this to the colour you want
for point in border_points:
    img[point[0], point[1]] = background
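As a quicker alternative (a small aside of mine, not in the original answer), NumPy boolean indexing overwrites all masked pixels at once, without the Python loop:
img[mask_image == 255] = background  # same effect as the loop above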
Update:
You could use the 3-channel image you already created as the mask, but that slightly complicates the algorithm. The mask image I propose is better suited to the task, but I will try to adapt it to your code:
# Create your mask image as usual...
border_points = np.array(np.where(img1[:,:,1] == 255)).transpose()  # Only look at channel 2
background = [0, 0, 0]  # Change this to the colour you want
for point in border_points:
    img[point[0], point[1]] = background
Update to do as #RobAu suggested (quicker than my previous methods):
line_thickness = 3 # Change this value until it looks the best.
cv2.drawContours(img, contours, -1, color=(0,0,0), thickness=line_thickness )
Please note I didn't test this code. So it might need some further fiddling.
As a reference to the comments on this question, this is an example of code that locates rectangles and creates a new image for each one. It was an attempt at creating individual images from a picture of shredded paper. Some of the values will need to be changed for it to locate rectangles of the right size.
There is also some code for tracking the sizes of the images; the code is made up of roughly 50% what I have written and 50% Stack Overflow help.
import cv2
import numpy as np
fileName = ['9','8','7','6','5','4','3','2','1','0']
img = cv2.imread('#YOUR IMAGE#')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)
kernel = np.ones((5,5),np.uint8)
erosion = cv2.erode(gray,kernel,iterations = 2)
kernel = np.ones((4,4),np.uint8)
dilation = cv2.dilate(erosion,kernel,iterations = 2)
edged = cv2.Canny(dilation, 30, 200)
_, contours, hierarchy = cv2.findContours(edged, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
rects = [cv2.boundingRect(cnt) for cnt in contours]
rects = sorted(rects,key=lambda x:x[1],reverse=True)
i = -1
j = 1
y_old = 5000
x_old = 5000
for rect in rects:
    x, y, w, h = rect
    area = w * h
    print('width: %d and height: %d' % (w, h))
    if w > 50 and h > 500:
        print('abs:')
        print(abs(x_old - x))
        if abs(x_old - x) > 0:
            print('writing')
            x_old = x
            x, y, w, h = rect
            out = img[y+10:y+h-10, x+10:x+w-10]
            cv2.imwrite('assets/newImage' + fileName[i] + '.jpg', out)
            j += 1
    if (y_old - y) > 1000:
        i += 1
        y_old = y
Even though the given input image links are not working, and so I obviously don't know whether the following is what you asked for, I learnt something from your question while I was working on removing table structure lines from a given image, and I would like to share what I learnt for future readers.
I followed the steps provided in opencv documentation to remove the lines.
But that only removed the horizontal lines. When I tried to remove vertical lines, the result image only had the vertical lines. The text in the table was not there.
Then I came across your question & saw final_img = cv2.subtract(img1, img) in the question. Tried that & it worked great.
Here are the steps that I followed:
import sys
import numpy as np
import cv2 as cv

def show_wait_destroy(winname, img):
    # Helper from the OpenCV docs tutorial: show a window, wait for a key press, then close it
    cv.imshow(winname, img)
    cv.waitKey(0)
    cv.destroyWindow(winname)

argv = sys.argv[1:]
# Load the image
src = cv.imread(argv[0], cv.IMREAD_COLOR)
# Check if image is loaded fine
if src is None:
    print('Error opening image: ' + argv[0])
    sys.exit(-1)
# Show source image
cv.imshow("src", src)
# [load_image]
# [gray]
# Transform source image to gray if it is not already
if len(src.shape) != 2:
    gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
else:
    gray = src
# Show gray image
# show_wait_destroy("gray", gray)
# [gray]
# [bin]
# Apply adaptiveThreshold to the bitwise_not of gray
gray = cv.bitwise_not(gray)
bw = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_MEAN_C, \
                          cv.THRESH_BINARY, 15, -2)
# Show binary image
# show_wait_destroy("binary", bw)
# [bin]
# [init]
# Create the images that will be used to extract the horizontal and vertical lines
horizontal = np.copy(bw)
vertical = np.copy(bw)
# [horiz]
# [vert]
# Specify size on vertical axis
rows = vertical.shape[0]
verticalsize = rows // 10
# Create structure element for extracting vertical lines through morphology operations
verticalStructure = cv.getStructuringElement(cv.MORPH_RECT, (1, verticalsize))
# Apply morphology operations
vertical = cv.erode(vertical, verticalStructure)
vertical = cv.dilate(vertical, verticalStructure)
# [init]
# [horiz]
# Specify size on horizontal axis
cols = horizontal.shape[1]
horizontal_size = cols // 30
# Create structure element for extracting horizontal lines through morphology operations
horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (horizontal_size, 1))
# Apply morphology operations
horizontal = cv.erode(horizontal, horizontalStructure)
horizontal = cv.dilate(horizontal, horizontalStructure)
lines_removed = cv.subtract(gray, vertical + horizontal)
show_wait_destroy("lines_removed", ~lines_removed)
Input:
Output:
A few things that I changed from the sources:
verticalsize = rows // 10: here, I do not understand the significance of the number 10. In the documentation, 30 was used, but I got a better result with 10. I guess the smaller the divisor, the larger the structuring element, and since we are targeting long straight lines here, reducing the number works (a small worked example follows this list).
In the documentation, vertical lines are processed after horizontal lines. I reversed the order.
I swapped the parameters to cv2.subtract(); I used cv2.subtract(img, img1).
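To make the structuring-element point concrete (my own illustration, not from the documentation), for a scan that is, say, 900 rows tall:
rows = 900
verticalsize = rows // 30   # 30 px tall element: shorter vertical runs of foreground pixels survive the erosion, so tall text strokes may be kept as "lines"
verticalsize = rows // 10   # 90 px tall element: only long, unbroken vertical strokes such as table borders survive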
I am working on lane lines detection. My current working strategy is:
Defining a region of interest where lane lines could be
Warping the image to get a bird eye view
Converting the image to YUV color space
Normalizing the Y channel
Fitting the second order polynomial and sliding window approach
Everything works fine, but where there are shadows the algorithm does not work.
I have tried adaptive thresholding and Otsu thresholding, but have not succeeded.
Source Image without Shadow:
Processed Source Image without Shadow:
Source Image with Shadow:
Processed Source Image with Shadow:
In the second image it can be seen that the shadowed area is not detected. Shadows drop the image values down, so I tried to threshold the image with new values lower than the previous ones; the new image can be found here:
This technique does not work, as it comes with a lot of noise.
Currently I am trying background subtraction and shadow removal techniques, but it's not working. I have been stuck on this problem for the last 2-3 weeks.
Any help will really be appreciated...
import cv2
import matplotlib.pyplot as plt
import numpy as np
from helper_functions import undistort, threshholding, unwarp,sliding_window_polyfit
from helper_functions import polyfit_using_prev_fit,calc_curv_rad_and_center_dist
from Lane_Lines_Finding import RoI
img = cv2.imread('./test_images/new_test.jpg')
new =undistort(img)
new = cv2.cvtColor(new, cv2.COLOR_RGB2BGR)
#new = threshholding(new)
h,w = new.shape[:2]
# define source and destination points for transform
imshape = img.shape
vertices = np.array([[
(257,670),
(590, 446),
(722, 440),
(1150,650)
]],
dtype=np.int32)
p1 = (170,670)
p2 = (472, 475)
p3 = (745, 466)
p4 = (1050,650)
vertices = np.array([[p1,
p2,
p3,
p4
]],
dtype=np.int32)
masked_edges = RoI(new, vertices)
#masked_edges = cv2.cvtColor(masked_edges, cv2.COLOR_RGB2BGR)
src = np.float32([(575,464),
(707,464),
(258,682),
(1049,682)])
dst = np.float32([(450,0),
(w-450,0),
(450,h),
(w-450,h)])
warp_img, M, Minv = unwarp(masked_edges, src, dst)
warp_img = increase_brightness_img(warp_img)  # helper defined elsewhere (not imported above)
warp_img = contrast_img(warp_img)  # helper defined elsewhere (not imported above)
YUV = cv2.cvtColor(warp_img, cv2.COLOR_RGB2YUV)
Y,U,V = cv2.split(YUV)
Y_equalized= cv2.equalizeHist(Y)
YUV = cv2.merge((Y,U,V))
thresh_min = 253
thresh_max = 255
binary = np.zeros_like(Y)
binary[(Y_equalized>= thresh_min) & (Y_equalized <= thresh_max)] = 1
kernel_opening= np.ones((3,3),np.uint8)
opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel_opening)
kernel= np.ones((7,7),np.uint8)
dilation = cv2.dilate(opening,kernel,iterations = 3)
I developed a script in Matlab which analyses engraved text on coloured steel. I'm using a range of morphological techniques to extract the text and read it with OCR. I need to implement it on a Raspberry Pi, therefore I decided to transfer my Matlab code into OpenCV (in Python). I tried to transfer some methods and they work similarly, but how do I implement imreconstruct and imbinarize (shown below) in OpenCV? (The challenge here is to appropriately differentiate foreground and background.)
Maybe I should try adding grabCut or getStructuringElement or morphologyEx or dilate? I tried them in a range of combinations but have not found a perfect solution.
I will put the whole script for both below; if anyone could give me suggestions on how to generally improve this extraction and the accuracy of the OCR process, I would greatly appreciate it.
Based on bin values of the grey-scale image, I change some parameters in those functions:
Matlab:
se = strel('disk', 300);
img = imtophat(img, se);
maker = imerode(img, strel('line',100,0)); %for whiter ones
maker = imerode(img, strel('line',85,0)); %for medium
maker = imerode(img, strel('line',5,0));
imgClear = imreconstruct(maker, img);
imgBlur = imgaussfilt(imgClear,1); %less blur for whiter frames
BW = imbinarize(imgBlur,'adaptive','ForegroundPolarity','Bright',...
'Sensitivity',0.7); %process for medium
BW = imbinarize(imgBlur, 'adaptive', 'ForegroundPolarity',...
'Dark', 'Sensitivity', 0.4); % process for black and white
res = ocr(BW, 'CharacterSet', '0123456789', 'TextLayout', 'Block');
res.Text;
OpenCV:
import numpy
import cv2
import pytesseract
from PIL import Image

# img is the grey-scale input image (loaded elsewhere)
kernel = numpy.ones((5,5), numpy.uint8)
blur = cv2.GaussianBlur(img, (5,5), 0)
erosion = cv2.erode(blur, kernel, iterations=1)
opening = cv2.morphologyEx(erosion, cv2.MORPH_OPEN, kernel)
#bremove = cv2.grabCut(opening,mask,rect,bgdModelmode==GC_INIT_WITH_MASK)
#th3 = cv2.adaptiveThreshold(opening,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU,11,2)
ret, thresh = cv2.threshold(opening, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
ocr = pytesseract.image_to_string(Image.open('image2.png'), config='stdout -c tessedit_char_whitelist=0123456789')
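For imreconstruct specifically, I have not found a single OpenCV call; from what I have read, greyscale reconstruction by dilation can be approximated by repeatedly dilating the marker and clamping it to the mask image until it stops changing. A rough sketch only, not validated against Matlab:
# Approximation of Matlab's imreconstruct(marker, mask): iterative geodesic dilation
def reconstruct_by_dilation(marker, mask, kernel=None):
    if kernel is None:
        kernel = numpy.ones((3, 3), numpy.uint8)
    while True:
        expanded = cv2.dilate(marker, kernel)
        expanded = numpy.minimum(expanded, mask)  # never grow above the mask image
        if (expanded == marker).all():
            return expanded
        marker = expanded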
Here is the input image:
I am surprised at how much difference there is between Matlab and OpenCV when they both appear to use the same algorithm. Why do you run imbinarize twice? What does the sensitivity keyword actually do (mathematically, behind the scenes)? They obviously have several more steps than just the bare Otsu.
import cv2
import numpy as np
import matplotlib.pyplot as plt
def show(img):
    plt.imshow(img, cmap="gray")
    plt.show()
img = cv2.imread("letters.jpg", cv2.IMREAD_GRAYSCALE)
kernel = np.ones((3,3), np.uint8)
blur = cv2.GaussianBlur(img,(3,3), 0)
erosion = cv2.erode(blur, kernel, iterations=3)
opening = cv2.dilate(erosion, kernel)
th3 = cv2.adaptiveThreshold(opening, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
cv2.THRESH_BINARY, 45, 2)
show(th3)
kernel2 = cv2.getGaussianKernel(6, 2) #np.ones((6,6))
kernel2 = np.outer(kernel2, kernel2)
th3 = cv2.dilate(th3, kernel2)
th3 = cv2.erode(th3, kernel)
show(th3)
The images that get displayed are:
After a bit of cleaning up:
So, all in all, not the same and certainly not as nice as Matlab. But the basic principle seems the same; it's just that the numbers need playing with.
A better approach would probably be to threshold by the mean of the image and then use the output of that as a mask to adaptive-threshold the original image. Hopefully the results would then be better than both OpenCV and Matlab.
Try doing it with ADAPTIVE_THRESH_MEAN_C; you can get some really nice results, but there's more trash lying around. Again, if you can use it as a mask to isolate the text and then do thresholding again, it might turn out better. Also, the shape of the erosion and dilation kernels will make a big difference here (a rough sketch of the mask idea follows below).
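A minimal sketch of that mask-then-threshold idea (untested; the block size is a guess you would tune):
# Coarse mask: keep only pixels brighter than the image mean
_, coarse_mask = cv2.threshold(img, int(img.mean()), 255, cv2.THRESH_BINARY)
# Adaptive threshold on the original image
adaptive = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 45, 2)
# Keep the adaptive result only inside the coarse mask
masked = cv2.bitwise_and(adaptive, adaptive, mask=coarse_mask)
show(masked)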
I worked out the code below, which gives a positive result on your engraved-text sample.
import cv2
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
def show(img):
    plt.imshow(img, cmap="gray")
    plt.show()
# load the input image
img = cv2.imread('./imagesStackoverflow/engraved_text.jpg',0);
show(img)
ret, mask = cv2.threshold(img, 60, 120, cv2.THRESH_BINARY) # tune 60, 120 for the best OCR results
kernel = np.ones((5,3),np.uint8)
mask = cv2.erode(mask,kernel,iterations = 1)
show(mask)
# I used a version of OpenCV with Tesseract; you may use your pytesseract and set the modes as:
# OCR Engine Mode (OEM) = 3 (default = 3)
# Page Segmentation Mode (PSmode) = 11 (default = 3)
tesser = cv2.text.OCRTesseract_create('C:/Program Files/Tesseract 4.0.0/tessdata/','eng','0123456789',11,3)
retval = tesser.run(mask, 0) # return string type
print('OCR:' + retval)
Processed image and OCR output:
It would be great if you can feedback your test results with more sample images.
What I can see from your code is that you have used top-hat filtering in your Matlab code as the first step. However, I couldn't see the same in your Python OpenCV code.
OpenCV in Python has a built-in top-hat filter; try applying that to get a similar result:
kernel = np.ones((5,5),np.uint8)
tophat = cv2.morphologyEx(img, cv2.MORPH_TOPHAT, kernel)
Also, try using CLAHE; it gives better contrast to your image. Then apply blackhat to filter out small details (a sketch of that step follows after the CLAHE snippet).
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
cl1 = clahe.apply(img)
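The blackhat step mentioned above might look something like this (a sketch; the kernel size is a guess to tune):
# Blackhat highlights small dark details against the brighter background
blackhat_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (13, 5))
blackhat = cv2.morphologyEx(cl1, cv2.MORPH_BLACKHAT, blackhat_kernel)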
I have got better results by applying these transformations.
I tried the code below; it works to recognize the lighter engraved-text sample. Hope it helps.
import cv2
import numpy as np
import matplotlib.pyplot as plt

def show(img):
    plt.imshow(img, cmap="gray")
    plt.show()
# load the input image
img = cv2.imread('./imagesStackoverflow/engraved_text2.jpg',0);
show(img)
# apply CLAHE to adjust the contrast
clahe = cv2.createCLAHE(clipLimit=5.1, tileGridSize=(5,3))
cl1 = clahe.apply(img)
img = cl1.copy()
show(img)
img = cv2.GaussianBlur(img,(3,3), 1)
ret, mask = cv2.threshold(img, 125, 150, cv2.THRESH_BINARY) # tune 125, 150 for the best OCR results
kernel = np.ones((5,3),np.uint8)
mask = cv2.erode(mask,kernel,iterations = 1)
show(mask)
# I used a version of OpenCV with Tesseract; you may use your pytesseract and set the modes as:
# Page Segmentation Mode (PSmode) = 11 (default = 3)
# OCR Engine Mode (OEM) = 3 (default = 3)
tesser = cv2.text.OCRTesseract_create('C:/Program Files/Tesseract 4.0.0/tessdata/','eng','0123456789',11,3)
retval = tesser.run(mask, 0) # return string type
print('OCR:' + retval)
I'm trying to skeletonize the features below in order to extract information about 1) the length of the features and 2) the curvature of the features. I came across this skeletonization approach, which iteratively erodes and opens the image. Shown below is the result, which looks awful. Can someone recommend a different approach to skeletonizing my features?
Here's the example from the link above:
import cv2
import numpy as np
img = cv2.imread('sofsk.png',0)
size = np.size(img)
skel = np.zeros(img.shape,np.uint8)
ret,img = cv2.threshold(img,127,255,0)
element = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
done = False
while not done:
    eroded = cv2.erode(img, element)
    temp = cv2.dilate(eroded, element)
    temp = cv2.subtract(img, temp)
    skel = cv2.bitwise_or(skel, temp)
    img = eroded.copy()
    zeros = size - cv2.countNonZero(img)
    if zeros == size:
        done = True

cv2.imshow("skel", skel)
cv2.waitKey(0)
cv2.destroyAllWindows()
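As one possible alternative (my suggestion, not from the original post): a thinning-based skeleton tends to give cleaner centre-lines than the erode/open loop above. A sketch assuming the opencv-contrib package (cv2.ximgproc) is installed:
import cv2

img = cv2.imread('sofsk.png', 0)
_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
# Zhang-Suen thinning from opencv-contrib; produces a one-pixel-wide skeleton
skeleton = cv2.ximgproc.thinning(binary, thinningType=cv2.ximgproc.THINNING_ZHANGSUEN)
cv2.imwrite('skeleton.png', skeleton)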