Smoothing the edges of a pixelated binary image in Python

I'm using pytesseract to convert images into text, but the accuracy isn't 100% because the images become pixelated when resized. A Gaussian blur would smooth the edges, but it also blurs the whole image, making it impossible for the OCR to detect the text.
What sort of filter would smooth the edges without blurring the image too much? The image looks something like this:
Image

You can apply a median blur to the image and then try a series of morphological transformations. Specifically, cv2.MORPH_CLOSE with a 3x3 kernel seems to work well here. You can adjust the kernel size and the number of iterations to get the desired result.
import cv2
image = cv2.imread('1.png')
blur = cv2.medianBlur(image, 7)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray,125, 255,cv2.THRESH_BINARY_INV)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=2)
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)
cv2.imwrite('result.png', result)
cv2.waitKey()
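To see why a median filter helps here, the following is a minimal pure-Python sketch on a tiny binary patch. The helper name `median3x3` and the sample grid are made up for illustration, and border pixels are simply left unchanged (`cv2.medianBlur` handles borders differently), so treat it as a picture of the idea rather than a reimplementation. The point is that the median of a black-and-white neighborhood is itself black or white, so bumps and notches disappear without introducing gray values.

```python
from statistics import median

def median3x3(grid):
    """Apply a 3x3 median filter; border pixels are copied unchanged."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [grid[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out

# Hypothetical 5x5 patch: a white region with a one-pixel bump sticking
# out at the top (row 1) and a one-pixel notch cut into it (row 3).
grid = [
    [0,   0,   0,   0,   0],
    [0,   0,   255, 0,   0],
    [255, 255, 255, 255, 255],
    [255, 255, 0,   255, 255],
    [255, 255, 255, 255, 255],
]

smoothed = median3x3(grid)
for row in smoothed:
    print(row)
```

Both the bump and the notch are removed, and every output value is still exactly 0 or 255, which is what keeps the edges crisp for OCR.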

Related

OCR not performing well on clean image | Python Pytesseract

I have been working on a project that involves extracting text from an image. From what I have read, Tesseract is one of the best libraries available, so I decided to use it along with OpenCV, which is needed for image manipulation.
I have been experimenting a lot with the Tesseract engine, and it does not seem to give the expected results. I have attached the image as a reference. The output I got is:
1] =501 [
Instead, expected output is
TM10-50%L
What I have done so far:
Remove noise
Apply an adaptive threshold
Send the result to the Tesseract OCR engine
Are there any other suggestions to improve the algorithm?
Thanks in advance.
Snippet of the code:
import cv2
import sys
import pytesseract
import numpy as np
from PIL import Image
if __name__ == '__main__':
    if len(sys.argv) < 2:
        print('Usage: python ocr_simple.py image.jpg')
        sys.exit(1)
    # Read image path from command line
    imPath = sys.argv[1]
    gray = cv2.imread(imPath, 0)
    # Blur
    blur = cv2.GaussianBlur(gray,(9,9), 0)
    # Binarizing
    thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 5, 3)
    # Pass the binarized image to Tesseract
    text = pytesseract.image_to_string(thresh)
    print(text)
Images attached.
The first image is the original image.
The second image is what was fed to Tesseract.
Before performing OCR on an image, it's important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. For this specific image, we need to obtain the ROI before we can OCR.
To do this, we can convert to grayscale, apply a slight Gaussian blur, then adaptive threshold to obtain a binary image. From here, we can apply morphological closing to merge individual letters together. Next we find contours, filter using contour area filtering, and then extract the ROI. We perform text extraction using the --psm 6 configuration option to assume a single uniform block of text. Take a look here for more options.
Detected ROI
Extracted ROI
Result from Pytesseract OCR
TM10=50%L
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Grayscale, Gaussian blur, Adaptive threshold
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 5, 5)
# Perform morph close to merge letters together
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=3)
# Find contours, contour area filtering, extract ROI
cnts, _ = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for c in cnts:
    area = cv2.contourArea(c)
    if area > 1800 and area < 2500:
        x,y,w,h = cv2.boundingRect(c)
        ROI = original[y:y+h, x:x+w]
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 3)
# Perform text extraction
ROI = cv2.GaussianBlur(ROI, (3,3), 0)
data = pytesseract.image_to_string(ROI, lang='eng', config='--psm 6')
print(data)
cv2.imshow('ROI', ROI)
cv2.imshow('close', close)
cv2.imshow('image', image)
cv2.waitKey()
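One thing to watch in the code above: if no contour falls inside the area range, ROI is never assigned and the later GaussianBlur call fails. A small sketch of a more defensive selection step follows. The helper name `select_roi_box` and the sample numbers are hypothetical; in the real pipeline each pair would come from `cv2.contourArea(c)` and `cv2.boundingRect(c)`, and the 1800 and 2500 limits are the ones used above, which are specific to that image.

```python
def select_roi_box(candidates, min_area=1800, max_area=2500):
    """Return the box of the largest candidate whose area lies strictly
    between the limits, or None when nothing matches."""
    in_range = [(area, box) for area, box in candidates
                if min_area < area < max_area]
    if not in_range:
        return None
    return max(in_range, key=lambda pair: pair[0])[1]

# Hypothetical (area, (x, y, w, h)) pairs standing in for real contours.
candidates = [
    (120.0, (5, 5, 12, 10)),      # small speck, rejected
    (2100.0, (40, 60, 90, 25)),   # plausible text region
    (9000.0, (0, 0, 300, 30)),    # too large, rejected
]

box = select_roi_box(candidates)
if box is None:
    print("no region in the expected area range; adjust the limits")
else:
    x, y, w, h = box
    print("crop with original[%d:%d, %d:%d]" % (y, y + h, x, x + w))
```

Returning None lets the caller report the problem or relax the limits instead of crashing on an undefined variable.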

How to remove noise in image OpenCV, Python?

I have some cropped images, and I need them to have black text on a white background. First I apply adaptive thresholding, and then I try to remove the noise. I have tried many noise removal techniques, but whenever the image changes, the techniques I used fail.
The best method for converting image color to binary for my images is Adaptive Gaussian Thresholding. Here is my code:
im_gray = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
image = cv2.GaussianBlur(im_gray, (5,5), 1)
th = cv2.adaptiveThreshold(image,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,3,2)
I need the values to come out smooth, with the decimal separator (dot) and the postfix letters preserved. How can I do this?
Before binarization, it is necessary to correct the nonuniform illumination of the background. For example, like this:
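import cv2
# load the image and convert it to grayscale
image = cv2.imread('9qBsB.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# estimate the bright background with a grayscale dilation
se = cv2.getStructuringElement(cv2.MORPH_RECT, (8,8))
bg = cv2.morphologyEx(image, cv2.MORPH_DILATE, se)
# divide by the background to flatten the illumination
out_gray = cv2.divide(image, bg, scale=255)
# Otsu picks the global threshold automatically
out_binary = cv2.threshold(out_gray, 0, 255, cv2.THRESH_OTSU)[1]
cv2.imshow('binary', out_binary)
cv2.imwrite('binary.png', out_binary)
cv2.imshow('gray', out_gray)
cv2.imwrite('gray.png', out_gray)
# keep the windows open until a key is pressed
cv2.waitKey(0)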
Result:
You can do slightly better using division normalization in Python/OpenCV.
Input:
import cv2
import numpy as np
# load image
img = cv2.imread("license_plate.jpg")
# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# blur
blur = cv2.GaussianBlur(gray, (0,0), sigmaX=33, sigmaY=33)
# divide
divide = cv2.divide(gray, blur, scale=255)
# otsu threshold
thresh = cv2.threshold(divide, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
# apply morphology
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# write result to disk
cv2.imwrite("hebrew_text_division.jpg", divide)
cv2.imwrite("hebrew_text_division_threshold.jpg", thresh)
cv2.imwrite("hebrew_text_division_morph.jpg", morph)
# display it
cv2.imshow("gray", gray)
cv2.imshow("divide", divide)
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.waitKey(0)
cv2.destroyAllWindows()
Division Image:
Thresholded Image:
Morphology Cleaned Image:
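Both answers above rely on the same idea: divide each pixel by an estimate of its local background, so that uneven lighting cancels out before a single global threshold is applied. Here is a minimal pure-Python sketch on one row of pixels. The function name `divide_normalize`, the sample values, and the fixed threshold of 128 are assumptions made for illustration; the answers use `cv2.divide` and Otsu's method on real images.

```python
def divide_normalize(pixels, background, scale=255):
    """Divide each pixel by its background estimate, scale, clip to 255."""
    out = []
    for p, b in zip(pixels, background):
        value = round(p * scale / b) if b else 0
        out.append(min(255, value))
    return out

# Hypothetical row: the lighting falls from 200 to 80 left to right, and
# the "ink" pixels (indices 1, 4, 7) are at 40% of the local background.
background = [200, 200, 200, 160, 160, 160, 80, 80, 80]
pixels     = [200, 80,  200, 160, 64,  160, 80, 32, 80]

# On the raw row, the value 80 is ink on the left but paper on the
# right, so no single global threshold can separate the two.
normalized = divide_normalize(pixels, background)
binary = [255 if v > 128 else 0 for v in normalized]

print(normalized)  # [255, 102, 255, 255, 102, 255, 255, 102, 255]
print(binary)      # [255, 0, 255, 255, 0, 255, 255, 0, 255]
```

After the division, paper is 255 everywhere and ink sits at the same level regardless of the lighting, which is why a global threshold then works.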
I'm assuming that you are preprocessing the image for OCR (Optical Character Recognition).
I had a project to detect license plates, and these were the steps I used; you can apply them to your project. After converting the image to grayscale, try applying histogram equalization, which gives the lower-contrast areas of the image a higher contrast. Then blur the image to reduce the noise in the background. Next, apply edge detection; make sure the noise has been sufficiently removed first, as edge detection is susceptible to it. Lastly, apply closing (dilation followed by erosion) to close the small holes inside the characters.
Instead of calling erode and dilate separately, you can use a morphological opening, which is basically both in one:
import cv2
import matplotlib.pyplot as plt
# img_grey is the grayscale input image
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,2))
morphology_img = cv2.morphologyEx(img_grey, cv2.MORPH_OPEN, kernel,iterations=1)
plt.imshow(morphology_img,'Greys_r')
MORPHOLOGICAL_TRANSFORMATIONS
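Since opening and closing are easy to mix up, here is a small pure-Python sketch of both on a 7x7 binary grid with a 3x3 square neighborhood. All function names and the grid are made up for illustration. Neighbors outside the grid are ignored, which I believe matches OpenCV's default border handling for these operations, but take that as an assumption.

```python
def _apply(grid, pick):
    """Replace each pixel by pick() over its in-bounds 3x3 neighborhood."""
    h, w = len(grid), len(grid[0])
    return [[pick(grid[ny][nx]
                  for ny in range(max(0, y - 1), min(h, y + 2))
                  for nx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]

def dilate(grid):
    return _apply(grid, max)

def erode(grid):
    return _apply(grid, min)

def closing(grid):
    # dilation followed by erosion: fills small holes
    return erode(dilate(grid))

def opening(grid):
    # erosion followed by dilation: removes small specks
    return dilate(erode(grid))

# Hypothetical shape: a 3x3 blob with a hole in the middle, plus a
# one-pixel speck in the top-right corner.
grid = [
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

closed = closing(grid)      # hole filled, speck still there
cleaned = opening(closed)   # speck removed, blob kept
for row in cleaned:
    print(row)
```

The order matters: opening this grid first would erase the blob entirely, because every blob pixel touches either the hole or the background.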

Detect optic disk in a retina image using contour detection in opencv?

I have the following retina image and I'm trying to draw a circle around the optic disk (the white round shape in retinal image). Here is the original image:
I applied adaptive thresholding and then cv2.findContours:
import cv2
def detectBlob(file):
    # read image
    img = cv2.imread(file)
    imageName = file.split('.')[0]
    # convert img to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # do adaptive threshold on gray image
    thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 101, 3)
    # apply morphology open then close
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
    blob = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (20,20))
    blob = cv2.morphologyEx(blob, cv2.MORPH_CLOSE, kernel)
    # invert blob
    blob = (255 - blob)
    # Get contours
    cnts,hierarchy = cv2.findContours(blob, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # write results to disk
    result = img.copy()
    cv2.drawContours(result, cnts, -1, (0, 0, 255), 3)
    cv2.imwrite(imageName+"_threshold.jpg", thresh)
    cv2.imwrite(imageName+"_blob.jpg", blob)
    cv2.imwrite(imageName+"_contour.jpg", result)
detectBlob('16.png')
Here is what the threshold looks like:
Here is the final output of contours:
Ideally I'm looking for such an output:
Adaptive thresholding fails because the filter size is much too small. And although it is not obvious at first glance, the wavy texture in the background disturbs the result considerably.
I obtained an interesting result by reducing the image resolution by a factor of 16 and applying an adaptive filter of extent 99x99.
You need to identify larger structures. Ideally you need a structure size about 1/4 of the radius of the optic disk to balance results and processing time (experiment with larger sizes until acceptable).
Or you could downsample the image (reduce the resolution and make the picture smaller), which is more or less the same thing, even if you lose precision on the optic disk borders.
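The effect of the window size can be seen on a single row of pixels. The sketch below is a pure-Python mean adaptive threshold with made-up names and data. Two assumptions are worth calling out: windows are clipped at the ends of the row (OpenCV pads the border instead), and the constant c is negative so that flat regions come out black, unlike the c=3 used in the question.

```python
def adaptive_mean_threshold(row, block_size, c):
    """Mark a pixel 255 when it exceeds the mean of its window minus c."""
    half = block_size // 2
    out = []
    for i, value in enumerate(row):
        window = row[max(0, i - half): i + half + 1]
        local_mean = sum(window) / len(window)
        out.append(255 if value > local_mean - c else 0)
    return out

# Hypothetical row: dark background (10) with a bright structure (100)
# that is 9 pixels wide, standing in for the optic disk.
row = [10] * 8 + [100] * 9 + [10] * 8

small = adaptive_mean_threshold(row, block_size=3, c=-5)
large = adaptive_mean_threshold(row, block_size=21, c=-5)

print([i for i, v in enumerate(small) if v])  # [8, 16]
print([i for i, v in enumerate(large) if v])  # [8, 9, 10, 11, 12, 13, 14, 15, 16]
```

With a window narrower than the structure, only its two edges are detected, because the interior looks flat from inside the window. Once the window is wider than the structure, the whole structure stands out against its surroundings. Downsampling the image achieves the same thing by shrinking the structure relative to a fixed window.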

Removing Borders/Margins from Video Frames

I am working with videos that have borders (margins) around them. Some have them along all four sides, some along the left and right only, and some along the top and bottom only. The width of these margins is also not fixed.
I am extracting frames from these videos, such as the two example frames attached.
Both of these contain borders on the top and bottom.
Can anyone please suggest some methods to remove these borders from these images (in Python, preferably).
I came across some methods, such as this one on Stack Overflow, but it deals with an ideal situation where the borders are perfectly black (0,0,0). In my case, they may not be pitch black and may also contain jittery noise.
Any help/suggestions would be highly appreciated.
Here is one way to do that in Python/OpenCV.
Read the image
Convert to grayscale and invert
Threshold
Apply morphology to remove small black or white regions then invert again
Get the contour of the one region
Get the bounding box of that contour
Use numpy slicing to crop that area of the image to form the resulting image
Save the resulting image
import cv2
import numpy as np
# read image
img = cv2.imread('gymnast.png')
# convert to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# invert gray image
gray = 255 - gray
# gaussian blur
blur = cv2.GaussianBlur(gray, (3,3), 0)
# threshold
thresh = cv2.threshold(blur,236,255,cv2.THRESH_BINARY)[1]
# apply close and open morphology to fill tiny black and white holes
kernel = np.ones((5,5), np.uint8)
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
# invert thresh
thresh = 255 -thresh
# get contours (presumably just one around the nonzero pixels)
# then crop it to bounding rectangle
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
cntr = contours[0]
x,y,w,h = cv2.boundingRect(cntr)
crop = img[y:y+h, x:x+w]
cv2.imshow("IMAGE", img)
cv2.imshow("THRESH", thresh)
cv2.imshow("CROP", crop)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save cropped image
cv2.imwrite('gymnast_crop.png',crop)
Input:
Thresholded and cleaned image:
Cropped Result:
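If the threshold and contour route is more than you need, a simpler alternative (a different technique from the answer above) is to look at the mean brightness of each row and column and cut off those that stay below a tolerance. The pure-Python sketch below uses hypothetical names, a tiny made-up frame, and an arbitrary tolerance of 20. It assumes the borders are darker than the content on average, which tolerates noisy, not-quite-black borders but will also trim genuinely dark content at the frame edges.

```python
def _bright(lines, tolerance):
    """Indices of the lines whose mean brightness exceeds tolerance."""
    return [i for i, line in enumerate(lines)
            if sum(line) / len(line) > tolerance]

def crop_dark_borders(frame, tolerance=20):
    """Crop dark border rows first, then dark border columns."""
    rows = _bright(frame, tolerance)
    if not rows:
        return frame  # nothing brighter than the tolerance; leave as is
    band = frame[rows[0]:rows[-1] + 1]
    cols = _bright(list(zip(*band)), tolerance)
    if not cols:
        return band
    return [list(row[cols[0]:cols[-1] + 1]) for row in band]

# Hypothetical grayscale frame: noisy dark bars on the top and bottom
# and a noisy dark column on the left, around a 2x5 bright content area.
frame = [
    [3, 0,   7,   2,   5,   1],
    [6, 4,   1,   8,   0,   3],
    [4, 120, 130, 125, 118, 122],
    [7, 119, 127, 131, 124, 120],
    [2, 9,   4,   0,   6,   5],
    [5, 1,   3,   7,   2,   4],
]

cropped = crop_dark_borders(frame)
for row in cropped:
    print(row)
```

On real frames the same logic maps onto NumPy with `gray.mean(axis=1)` and `gray.mean(axis=0)`, followed by ordinary array slicing.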

How to smooth and make thinner these very rough images using OpenCV?

I have some black and white images of a single digit. I am using a NN model trained on MNIST to classify them. However, the digits are too rough and thick compared to the MNIST dataset. For example:
TLDR: I need to smooth the image and possibly make the overall shape thinner using OpenCV.
You can use a combination of morphology close, open and erode (and optionally skeletonize and dilate) in Python/OpenCV as follows:
Input:
import cv2
import numpy as np
from skimage.morphology import skeletonize
# load image
img = cv2.imread("5.png")
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold image
thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY)[1]
# apply morphology close
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11,11))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# apply morphology open
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11,11))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
# apply morphology erode
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (21,21))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_ERODE, kernel)
# write result to disk
cv2.imwrite("5_thinned.png", thresh)
# skeletonize image and dilate
skeleton = cv2.threshold(thresh,0,1,cv2.THRESH_BINARY)[1]
skeleton = (255*skeletonize(skeleton)).astype(np.uint8)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15,15))
skeleton_dilated = cv2.morphologyEx(skeleton, cv2.MORPH_DILATE, kernel)
# write result to disk
cv2.imwrite("5_skeleton_dilated.png", skeleton_dilated)
cv2.imshow("IMAGE", img)
cv2.imshow("RESULT1", thresh)
cv2.imshow("RESULT2", skeleton_dilated)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result1 (close, open, erode):
Result2 (close, open, erode, skeletonize, dilate):
You will most likely benefit from morphological operations. Specifically, it sounds like you want erosion.
You do have some noise, though, so you should try OpenCV's smoothing operations first. Based on my experience, I think you need a median blur with a kernel area of around 9 (although it depends on what you want). Then apply erosion.
