Python Tesseract License Plate Recognition

I'm a programmer, but I have no prior experience with Python or any of its libraries, or even with OCR/ALPR as a whole. I have a script (basically copied and pasted from other scripts all over the web) that I intend to use to recognize license plates. But the reality is that my code is very bad right now. It can recognize text in images pretty well, but it is poor at capturing license plates; only very rarely do I get a plate out of it.
So I would like some help on how I should change my code to make it better.
In my code, I simply choose an image, convert it to grayscale and binary, and try to read it.
I ONLY NEED the string (image_to_string); I do NOT need the image.
It is also important to note that, as I said, I have no expertise with this code or the functions I'm using.
My code:
import cv2
import pytesseract

image = cv2.imread('redcar.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# estimate the background with a dilation, then divide it out to flatten the lighting
se = cv2.getStructuringElement(cv2.MORPH_RECT, (8, 8))
bg = cv2.morphologyEx(image, cv2.MORPH_DILATE, se)
out_gray = cv2.divide(image, bg, scale=255)
out_binary = cv2.threshold(out_gray, 0, 255, cv2.THRESH_OTSU)[1]

cv2.imwrite('binary.png', out_binary)
cv2.imwrite('gray.png', out_gray)

# OCR the normalized grayscale image; the array can be passed directly,
# there is no need to write it to disk and reload it
text = pytesseract.image_to_string(out_gray, config='--psm 6')
print(text)
The image I'm using:

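One common way to improve plate capture is to localize the plate region first and OCR only that crop, rather than the whole frame. Below is a minimal sketch using OpenCV contour filtering; it assumes OpenCV 4.x (where findContours returns two values), and the blur, Canny, and rectangle heuristics are illustrative values that need tuning per image:
import cv2
import pytesseract

img = cv2.imread('redcar.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)  # smooth noise while keeping edges
edges = cv2.Canny(gray, 30, 200)

# keep the ten largest contours and look for a roughly rectangular one
contours, _ = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
contours = sorted(contours, key=cv2.contourArea, reverse=True)[:10]

plate = None
for c in contours:
    approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
    if len(approx) == 4:  # plates are roughly rectangular
        x, y, w, h = cv2.boundingRect(approx)
        plate = gray[y:y + h, x:x + w]
        break

if plate is not None:
    # --psm 7: treat the crop as a single line of text
    text = pytesseract.image_to_string(plate, config='--psm 7')
    print(text.strip())
--psm 7 (a single text line) tends to fit a cropped plate better than --psm 6, which expects a uniform block of text.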
I hope easyocr will be helpful in such a case. You can install easyocr with pip install easyocr (tested with opencv-python 4.5.4.60).
import easyocr

IMAGE_PATH = 'AQFCB.jpg'
reader = easyocr.Reader(['en'])
result = reader.readtext(IMAGE_PATH)
for detection in result:
    if detection[2] > 0.5:   # detection[2] is the confidence score
        print(detection[1])  # detection[1] is the recognized text
The output is:
HR.26 BR.9044
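If the plate format is known, it may also help to restrict the characters easyocr is allowed to output. Recent easyocr versions accept an allowlist argument to readtext; treat the exact parameter as an assumption to check against your installed version:
# hypothetical refinement: limit recognition to plate-style characters
result = reader.readtext(IMAGE_PATH, allowlist='ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .')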

Related

Reading text from image, facing problem because of the font

I am trying to read this image and do the arithmetic operations shown in it. For some reason I am not able to read the 7 because of the font it has. I am relatively new to image processing. Can you please help me with a solution? I tried pixelating the image, but that did not help.
import cv2
import pytesseract
from PIL import Image
img = cv2.imread('modules/visual_basic_math/temp2.png', cv2.IMREAD_GRAYSCALE)
# note: thresh is computed but never used below; the raw grayscale image is OCRed
thresh = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
print(pytesseract.image_to_string(img, config='--psm 6'))
The response I am getting is:
+44 849559
+46653% 14
+7776197
+6415995
+*9156346
x4463310
+54Q%433
+1664 20%
Right now, tesseract is a bit outdated. There are much more powerful libraries. I recommend PaddleOCR. To install it:
pip install paddlepaddle
pip install paddleocr
Then:
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='es')
predictions = ocr.ocr("ietDJ.png")[0]

filtered_text = []
for pred in predictions:
    filtered_text.append(pred[-1][0])
filtered_text = [t.replace(" ", "") for t in filtered_text]  # Remove spaces
['+4487559', '+4665714', '+7776157', ':6415995', ':9156346', 'x4463310', '-54q7433', '+1664207']
The output is not completely correct: the division symbols come out as : and one of them is wrong, and it also confuses a 9 with a q. However, the results are better, and the library is as comfortable to use as tesseract.
Hope it helps!

Pytesseract wrong number detection

My code is:
import cv2,numpy
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe" # For Windows OS
def scan(image):
    try:
        img = cv2.cvtColor(numpy.array(image), cv2.COLOR_RGB2BGR)
    except Exception:
        img = cv2.imread(image)
    # Apply OCR, restricted to digits
    data = pytesseract.image_to_string(
        img, config="-c tessedit_char_whitelist=1234567890 --psm 6")
    return data
And when I make it scan this image it just gives me ''. Nothing. I don't know what's wrong; it works on every other digit image. What should I change? If you have some Python OCR that works on this image, you can also send it.
Using Tesseract or any OCR can get really tricky. The pictures you mentioned that worked perfectly might have better quality, or be closer to the dataset of the trained-data version you are using on your computer.
Some basic steps you can do to improve this:
Add a new trained-data file whose font is similar to the one you are trying to detect
Do some preprocessing on the image: sharpen it, change resolution and color, basically the whole routine until you find the perfect mix (see the sketch after this list)
Try a different OCR
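As an illustration of the preprocessing point, a minimal pass before Tesseract might look like the sketch below; the filename, upscaling factor, and threshold settings are assumptions to tune:
import cv2
import pytesseract

img = cv2.imread('digit.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input file

# upscale: Tesseract tends to do better on larger glyphs
img = cv2.resize(img, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)

# denoise and binarize
img = cv2.GaussianBlur(img, (3, 3), 0)
img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

print(pytesseract.image_to_string(
    img, config='-c tessedit_char_whitelist=1234567890 --psm 6'))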
Let me know if this works!
Read the documentation, understand what you are doing, and you will get the correct result. Hint: pretending that a single character is a uniform block of text is not wise.
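Following that hint: the page segmentation mode that treats the input as a single character is --psm 10. A sketch under that assumption:
import cv2
import pytesseract

img = cv2.imread('niner.png')  # the single-digit image from the question
# --psm 10: treat the image as a single character rather than a block of text
print(pytesseract.image_to_string(
    img, config='-c tessedit_char_whitelist=1234567890 --psm 10'))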
Your picture works for me. My guess is that you didn't successfully read the image. You can debug with print(img.shape), or with if img is None: print('None'). Python might be operating in a different directory: os.getcwd() gets Python's current working directory, and os.path.isfile(image) tells you whether Python can find the file where you are looking.
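A minimal version of those checks (the filename is assumed):
import os
import cv2

path = 'niner.png'
print('cwd:', os.getcwd())               # where Python is actually running
print('exists:', os.path.isfile(path))   # can Python find the file from here?

img = cv2.imread(path)
if img is None:
    print('imread failed')               # cv2.imread returns None instead of raising
else:
    print('shape:', img.shape)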
This is what I tried:
import cv2, numpy
import pytesseract
# ~ pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe" # For Windows OS

img = cv2.imread('niner.png')

# Apply OCR
data = pytesseract.image_to_string(
    img, config="-c tessedit_char_whitelist=1234567890 --psm 6")

print('tesseract version: ', pytesseract.get_tesseract_version())
print('=============================================')
print(data)
and the result is:
tesseract version: 4.0.0.20181030
leptonica-1.76.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0
=============================================
9
♀

Preprocessing images for QR detection in python

I used ZBar and OpenCV to read the QR code in the image below, but both failed to detect it. For ZBar, I use the pyzbar library as the Python wrapper. There are images where the QR code is detected correctly, and images really similar to the successful ones that fail. My phone camera can read the QR code in the uploaded image, which means it is a valid one. Below is the code snippet:
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol
import cv2
# zbar
results = decode(cv2.imread(image_path), symbols=[ZBarSymbol.QRCODE])
print(results)
# opencv
qr_decoder = cv2.QRCodeDetector()
data, bbox, rectified_image = qr_decoder.detectAndDecode(cv2.imread(image_path))
print(data, bbox)
What type of pre-processing will help to increase the rate of success for detecting QR codes?
zbar, which does some preprocessing, does not detect the QR code here, as you can verify by running zbarimg image.jpg.
Good binarization is useful here. I got this to work using the kraken.binarization.nlbin() function of the Kraken library. The library is for OCR, but works very well for QR codes, too, by using non-linear processing. The Kraken binarization code is here.
Here is the code for the sample:
from kraken import binarization
from PIL import Image
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol

image_path = "image.jpg"

# binarization using kraken
im = Image.open(image_path)
bw_im = binarization.nlbin(im)

# zbar
print(decode(bw_im, symbols=[ZBarSymbol.QRCODE]))
[Decoded(data=b'DE-AAA002065', type='QRCODE', rect=Rect(left=1429, top=361, width=300, height=306), polygon=[Point(x=1429, y=361), Point(x=1429, y=667), Point(x=1729, y=667), Point(x=1723, y=365)])]
The following picture shows the clear image of the QR code after binarization:
I had a similar issue, and Seanpue's answer got me on the right track for this problem. Since I was already using the OpenCV library for image processing rather than PIL, I used it to perform Otsu's Binarization using the directions in an OpenCV tutorial on Image Thresholding. Here's my code:
import cv2
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol

image_path = "qr.jpg"

# preprocessing using opencv
im = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(im, (5, 5), 0)
ret, bw_im = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# zbar
print(decode(bw_im, symbols=[ZBarSymbol.QRCODE]))
[Decoded(data=b'DE-AAA002065', type='QRCODE', rect=Rect(left=1429, top=362, width=300, height=305), polygon=[Point(x=1429, y=362), Point(x=1430, y=667), Point(x=1729, y=667), Point(x=1724, y=366)])]
Applying the Gaussian blur is supposed to remove noise from the picture to make the binarization more effective, but for my application it didn't actually make much difference. What was vital was converting the image to grayscale so that the threshold function works (done here by opening the file with the cv2.IMREAD_GRAYSCALE flag).
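If the image is already loaded in color, the equivalent conversion is a one-liner; a sketch, effectively the same as using the IMREAD_GRAYSCALE flag at load time:
import cv2

im = cv2.imread("qr.jpg")                    # BGR color image
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)  # grayscale input for the threshold step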
QReader tends to work quite well for these cases.
from qreader import QReader
import cv2

if __name__ == '__main__':
    # Initialize QReader
    detector = QReader()
    img = cv2.cvtColor(cv2.imread('92iKG.jpg'), cv2.COLOR_BGR2RGB)
    # Detect and Decode the QR
    print(detector.detect_and_decode(image=img))
This code's output for this QR is:
DE-AAA002065

How to get text from image

I want to read the text from an image.
I use pytesseract in Python.
Here is my code:
import pytesseract
from PIL import Image
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
image = Image.open(r'a.jpg')
image.resize((150, 50),Image.ANTIALIAS).save("pic.jpg")
image = Image.open("pic.jpg")
captcha = pytesseract.image_to_string(image).replace(" ", "").replace("-", "").replace("$", "")
image
However, it returns an empty string.
What should be the correct way?
Thanks.
I agree with @Jon Betts:
tesseract is not very strong at OCR, only good in binary cases with the right settings
CAPTCHAs are meant to fool OCRs!
But if you really need to do it, you need to come up with a manual procedure for it. I created the code below specifically for the type of CAPTCHA that you gave (but it is completely rigid and is not generalized/optimized for all cases).
Pseudocode:
apply a median blur
apply a threshold to keep blue colors only (binary image output from this stage)
apply an opening to reduce small white pixels in the binary image
give the image to tesseract with options:
a limited whitelist of output chars
OEM 3: tesseract + cube
PSM 8: one word per image
Code:
from PIL import Image
import pytesseract
import numpy as np
import cv2

img = cv2.imread('a.jpg')
img = cv2.medianBlur(img, 3)

# extract blue parts (OpenCV loads BGR, so channel 0 is blue, channel 2 is red)
img2 = np.zeros((img.shape[0], img.shape[1]), dtype=np.uint8)
cond = np.bitwise_and(img[:, :, 0] >= 100, img[:, :, 2] < 100)
img2[np.where(cond)] = 255
img = img2

# delete the noise
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)

str1 = pytesseract.image_to_string(
    Image.fromarray(img),
    config='-c tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyz0123456789 --oem 3 --psm 8')

cv2.imwrite("frame.png", img)
print(str1)
Output:
f2e4
In order to see the full options of tesseract, type tesseract --help-extra or refer to the documentation.
Tesseract is intended for performing OCR on text documents. In my experience it's good but a bit patchy even with very clean data.
In this case it appears you are trying to solve a CAPTCHA which is specifically designed to defeat OCR software. It's very likely you cannot use Tesseract to solve this issue, because:
It's not really designed for that
The scenario is adversarial:
The example is specifically designed to prevent what you are trying to do
If you could get it to work, the other party would likely change it to break again
If you want to proceed I would suggest:
Working on cleaning up the image before attempting to process it (can you get a nice readable black and white image?)
Train your own recognition network using a lot of examples

Using python to save a JPG image that was edited in the script

Referring to the answer to this question, I tried to save my own JPG image files, after some basic image processing. I've only applied a rotation and a shear. This is my code:
import numpy as np
from skimage import io, color
import skimage.transform as tf
from skimage.transform import rotate, AffineTransform
from PIL import Image

mypath = PATH_TO_FILENAME
readfile = FILENAME

img = color.rgb2gray(io.imread(mypath + readfile))
myimg = rotate(img, angle=10, order=2)
afine_tf = tf.AffineTransform(shear=0.1)
editedimg = tf.warp(myimg, afine_tf)

# IF I UNCOMMENT THE TWO LINES BELOW, I CAN SEE THE EDITED IMAGE AS EXPECTED
#io.imshow(editedimg)
#io.show()

saveimg = np.array(editedimg)
result = Image.fromarray(saveimg.astype(np.uint8))
newfile = "edited_" + readfile
result.save(mypath + newfile)
I know that the image processing was fine, because if I display the image before saving, it's just the original with a bit of rotation and shearing, as expected. But I'm doing something wrong while saving it. I tried without the astype(np.uint8) part but got an error. Then I removed some of the code from the link mentioned above, because I guessed it was specifically for Fourier transforms; when I included some of their code, I got an image that was all gray but with white lines in the direction of the shear I'd applied. But now the image that gets saved is just 2KB of nothing but blackness.
And when I tried something as simple as:
result = Image.fromarray(editedimg)
result.save(path+newfile)
then I got this error:
raise IOError("cannot write mode %s as JPEG" % im.mode)
IOError: cannot write mode F as JPEG
I don't really need to use PIL, if there's another way to simply save my image, I'm fine with that.
Look into the PIL fork, Pillow; it is not as outdated and is what you should probably be using for this.
Also, depending on your operating system, you may need a few other libraries to compile PIL with JPEG support properly; see here.
This may also help: it says you need to convert your image to RGB mode before saving.
Image.open('old.jpeg').convert('RGB').save('new.jpeg')
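As for the black output described in the question: skimage's rotate and warp return float images with values in [0, 1], so casting straight to uint8 truncates almost every pixel to 0. Scaling to [0, 255] before the cast should fix it; a sketch reusing the question's variables:
import numpy as np
from PIL import Image

# editedimg is the float array in [0, 1] returned by skimage.transform.warp
saveimg = (np.asarray(editedimg) * 255).astype(np.uint8)
Image.fromarray(saveimg).save(mypath + "edited_" + readfile)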
