How to process and extract text from image - python

I'm trying to extract text from image using python cv2. The result is pathetic and I can't figure out a way to improve my code.
I believe the image needs to be processed before the extraction of text but not sure how.
I've tried to convert it into black and white but no luck.
import cv2
import os
import pytesseract
from PIL import Image
import time
pytesseract.pytesseract.tesseract_cmd='C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
cam = cv2.VideoCapture(1,cv2.CAP_DSHOW)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, 8000)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 6000)
while True:
return_value,image = cam.read()
image=cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
image = image[127:219, 508:722]
#(thresh, image) = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imwrite('test.jpg',image)
print('Text detected: {}'.format(pytesseract.image_to_string(Image.open('test.jpg'))))
time.sleep(2)
cam.release()
#os.system('del test.jpg')

Preprocessing to clean the image before performing text extraction can help. Here's a simple approach
Convert image to grayscale and sharpen image
Adaptive threshold
Perform morpholgical operations to clean image
Invert image
First we convert to grayscale then sharpen the image using a sharpening kernel
Next we adaptive threshold to obtain a binary image
Now we perform morphological transformations to smooth the image
Finally we invert the image
import cv2
import numpy as np
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=1)
result = 255 - close
cv2.imshow('sharpen', sharpen)
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)
cv2.waitKey()

Related

OCR not performing well on clean image | Python Pytesseract

I have been working on project which involves extracting text from an image. I have researched that tesseract is one of the best libraries available and I decided to use the same along with opencv. Opencv is needed for image manipulation.
I have been playing a lot with tessaract engine and it does not seems to be giving the expected results to me. I have attached the image as an reference. Output I got is:
1] =501 [
Instead, expected output is
TM10-50%L
What I have done so far:
Remove noise
Adaptive threshold
Sending it tesseract ocr engine
Are there any other suggestions to improve the algorithm?
Thanks in advance.
Snippet of the code:
import cv2
import sys
import pytesseract
import numpy as np
from PIL import Image
if __name__ == '__main__':
if len(sys.argv) < 2:
print('Usage: python ocr_simple.py image.jpg')
sys.exit(1)
# Read image path from command line
imPath = sys.argv[1]
gray = cv2.imread(imPath, 0)
# Blur
blur = cv2.GaussianBlur(gray,(9,9), 0)
# Binarizing
thres = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 5, 3)
text = pytesseract.image_to_string(thresh)
print(text)
Images attached.
First image is original image. Original image
Second image is what has been fed to tessaract. Input to tessaract
Before performing OCR on an image, it's important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. For this specific image, we need to obtain the ROI before we can OCR.
To do this, we can convert to grayscale, apply a slight Gaussian blur, then adaptive threshold to obtain a binary image. From here, we can apply morphological closing to merge individual letters together. Next we find contours, filter using contour area filtering, and then extract the ROI. We perform text extraction using the --psm 6 configuration option to assume a single uniform block of text. Take a look here for more options.
Detected ROI
Extracted ROI
Result from Pytesseract OCR
TM10=50%L
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Grayscale, Gaussian blur, Adaptive threshold
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 5, 5)
# Perform morph close to merge letters together
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=3)
# Find contours, contour area filtering, extract ROI
cnts, _ = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for c in cnts:
area = cv2.contourArea(c)
if area > 1800 and area < 2500:
x,y,w,h = cv2.boundingRect(c)
ROI = original[y:y+h, x:x+w]
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 3)
# Perform text extraction
ROI = cv2.GaussianBlur(ROI, (3,3), 0)
data = pytesseract.image_to_string(ROI, lang='eng', config='--psm 6')
print(data)
cv2.imshow('ROI', ROI)
cv2.imshow('close', close)
cv2.imshow('image', image)
cv2.waitKey()

How to crop image based on the object radius using OpenCV?

I'm new to python. I know basics about image-preprocessing. I don't know how to remove background and crop an image using OpenCV.
Here is the processing in Python/OpenCV for your new image.
Input:
import cv2
import numpy as np
# load image as grayscale
img = cv2.imread('Diabetic-Retinopathy_G_RM_151064169.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold input image
mask = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)[1]
# optional morphology to clean up small spots
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
# put mask into alpha channel of image
result = np.dstack((img, mask))
# save resulting masked image
cv2.imwrite('Diabetic-Retinopathy_G_RM_151064169_masked.png', result)
# display result, though it won't show transparency
cv2.imshow("mask", mask)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result
Here is one way to do that in Python/OpenCV.
Read input
Convert to grayscale
Threshold and invert as mask
Put mask into alpha channel of input
Save result
Input:
import cv2
import numpy as np
# load image
img = cv2.imread('black_circle.png')
# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold
threshold = cv2.threshold(gray,128,255,cv2.THRESH_BINARY)[1]
# invert so circle is white on black
mask = 255 - threshold
# put mask into alpha channel of image
result = np.dstack((img, mask))
# save resulting masked image
cv2.imwrite('black_circle_masked.png', result)
# display result, though it won't show transparency
cv2.imshow("MASK", mask)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
Note: download the result to see the transparency.

py_tesseract doesnt recognize numbers in this image

I am trying to use py-tesseract on google colabs to parse the following image containing readings from a meter. However it fails to get me the result expected. Looks like i need to do some pre-processing on the image. I am new to py-tesseract. Can you please help what i need to do to get this to work?
Here is my current code followed by output seen and the image:
!sudo apt install tesseract-ocr
!pip install pytesseract
import pytesseract
import shutil
import os
import random
try:
from PIL import Image
except ImportError:
import Image
image_path='drive/MyDrive/cropped_image.jpg'
print(pytesseract.image_to_string(image_path, config='--psm 13 --oem=3'))
Output
ey
Image being parsed
Cropped_image
Thanks
Try this:
print(pytesseract.image_to_string(image_path, config='-1 eng --oem 2 --psm 12'))
Output:
194735787
First, you need to preprocess the image which helps to reduce the noise and help in text extraction.
Extract text area from the image
Convert the image to grayscale and sharpen the image
Apply adaptive threshold
Clean the image by performing morphological operations and invert the image
import cv2
import numpy as np
img = cv2.imread('qKiJi.jpg')
#crop the text extraction area
crop = img[200:350, 100:400]
gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=1)
result = 255 - close
cv2.imshow('img', crop)
cv2.imshow('sharpen', sharpen)
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)
cv2.waitKey()

Remove letter artifacts from mammography image

I want to remove the letter artifacts "L:CC and Strin" from breast mammography using python. How could I get that done? this is my image
Here is one way to do that in Python/OpenCV.
Read the input
Convert to grayscale
Threshold
Dilate as mask
Apply mask to change white letters to black
Save the results
import cv2
import numpy as np
# read image
img = cv2.imread('mammogram_letters.png')
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# create mask
thresh = cv2.threshold(gray, 247, 255, cv2.THRESH_BINARY)[1]
# dilate mask
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
mask = cv2.morphologyEx(thresh, cv2.MORPH_DILATE, kernel)
# apply change
result = img.copy()
result[mask == 255] = (0,0,0)
# save result
cv2.imwrite("mammogram_letters_thresh.png", thresh)
cv2.imwrite("mammogram_letters_mask.png", mask)
cv2.imwrite("mammogram_letters_blackened.png", result)
# show results
cv2.imshow("THRESH", thresh)
cv2.imshow("MASK", mask)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
Threshold image:
Mask image:
Result:
You have to get pixel coordinate of the box containing test, if they are always the same my code will work.
from PIL import Image
im = Image.open('SqbIx.png')
img =im.load()
for i in range (73,116):
for j in range (36,57):
img[i,j]= (0, 0, 0)
im.save('mod.png')

is it possible to extracting text inside colored background region using opencv in python?

I having the following table area from the original image:
I'm trying extract the text,from this table.But when using threshold the whole gray regions get darkening.For example like below,
The threshold type which i did used,
thresh_value = cv2.threshold(original_gray, 128, 255, cv2.THRESH_BINARY_INV +cv2.THRESH_OTSU)[1]
is it possible to extract and change gray background into white and lets remain text pixel as it is if black then?
You should use adaptive thresholding in Python/OpenCV.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread("text_table.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# do adaptive threshold on gray image
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 11)
# write results to disk
cv2.imwrite("text_table_thresh.jpg", thresh)
# display it
cv2.imshow("thresh", thresh)
cv2.waitKey(0)
Result

Categories