I am having a problem with pytesseract: it does not recognise the digits in this image:
Any suggestions? Here is my code:
import pytesseract
from PIL import Image

img = r'/content/inv_thresh.png'
# Avoid naming the result `str`, which would shadow the built-in type.
text = pytesseract.image_to_string(
    Image.open(img), lang='eng',
    config='--psm 8 --oem 3 -c tessedit_char_whitelist=0123456789')
It returns the string COTO.
Why do you specify --oem 3 (the default, based on what is available)?
Which model do you use? Which Tesseract version?
Tesseract expects a clean image without artifacts to give correct results, so you will need to preprocess the image better.
I got the following result with the tessdata_best model and a recent Tesseract (4.1/5.0alpha):
tesseract a9Uq4.png - --psm 8 --dpi 70
I am trying to read this image and perform the arithmetic operations shown in it. For some reason I am not able to read the 7, because of the font it uses. I am relatively new to image processing. Can you please help me with a solution? I tried pixelating the image, but that did not help.
import cv2
import pytesseract

img = cv2.imread('modules/visual_basic_math/temp2.png', cv2.IMREAD_GRAYSCALE)
thresh = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)[1]
# Note: the grayscale image is passed to OCR here; `thresh` is computed but never used.
print(pytesseract.image_to_string(img, config='--psm 6'))
The response I am getting is:
+44 849559
+46653% 14
+1664 20%
Right now, tesseract is a bit outdated. There are much more powerful libraries. I recommend PaddleOCR. To install it:
pip install paddlepaddle
pip install paddleocr
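from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='es')
predictions = ocr.ocr("ietDJpng")[0]
filtered_text = []
for pred in predictions:
    # Assumes the PaddleOCR 2.x result layout: each pred is [box, (text, score)].
    filtered_text.append(pred[1][0])
filtered_text = [t.replace(" ", "") for t in filtered_text]  # Remove spaces
print(filtered_text)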
['+4487559', '+4665714', '+7776157', ':6415995', ':9156346', 'x4463310', '-54q7433', '+1664207']
The output is not completely correct: the division symbols come back as ":" (and one of them is wrong), and it confuses a 9 with a q. However, the results are better, and the library is as convenient to use as tesseract.
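The two problems noted above (":" returned for the division sign, and a letter such as q in place of a digit) can be caught with a small post-processing step. This is only a minimal sketch and not part of PaddleOCR: the name normalize_expression and the symbol mapping are assumptions made for illustration.

```python
import re

# Hypothetical mapping from symbols the OCR returned to Python-style operators.
SYMBOL_MAP = {":": "/", "÷": "/", "x": "*", "X": "*", "×": "*"}

# One operator followed by digits only, e.g. "+4487559".
TOKEN_RE = re.compile(r"[+\-*/]\d+")


def normalize_expression(token):
    """Return (operator, number) for a clean token, or None if it needs review."""
    token = token.replace(" ", "")
    if not token:
        return None
    # Only the leading character is treated as the operator symbol.
    candidate = SYMBOL_MAP.get(token[0], token[0]) + token[1:]
    if TOKEN_RE.fullmatch(candidate) is None:
        return None  # e.g. "-54q7433" contains a letter, so flag it
    return candidate[0], int(candidate[1:])


for raw in ['+4487559', ':6415995', 'x4463310', '-54q7433']:
    print(raw, "->", normalize_expression(raw))
```

Tokens that come back as None are the ones worth checking by eye or re-running through OCR with different preprocessing.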
Hope it helps!
My code is:
import cv2,numpy
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe" # For Windows OS
def scan(image):
    img = cv2.cvtColor(numpy.array(image), cv2.COLOR_RGB2BGR)
    img = cv2.imread(image)
    # Apply OCR
    data = pytesseract.image_to_string(img, config="-c tessedit"
                                                   " --psm 6"
                                                   " ")
    return data
When I make it scan this image, it just gives me ''. Nothing. I don't know what's wrong; it works on every other digit. What should I change? If you know of a Python OCR library that works on this image, you can also send it.
Using Tesseract, or any OCR, can get really tricky. The pictures that worked perfectly for you might be of better quality, or closer to the data that your installed trained-data version was built from.
Some basic steps you can do to improve this are:
Add a new trained data file built from a font similar to the one you are trying to detect.
Preprocess the image: sharpen it, change its resolution and colour, and so on, until you find a combination that works.
Try a different OCR.
Let me know if this works!
Read the documentation, understand what you are doing, and you will get the correct result. Hint: pretending that a single character is a uniform block of text is not wise.
Your picture works for me. My guess is that you didn't successfully read the image. You can debug with print(img.shape) or if img is None: print('None'). Python might be running in a different directory: os.getcwd() returns Python's current working directory, and os.path.isfile(image) tells you whether Python can find the file where you are looking.
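Those checks can be bundled into one small helper. This is just a sketch using the standard library; check_image_path is a made-up name, and the file name is the one from my test below.

```python
import os


def check_image_path(path):
    """Collect the facts that explain why an image might fail to load."""
    return {
        "cwd": os.getcwd(),                 # directory that relative paths resolve against
        "absolute": os.path.abspath(path),  # where Python will actually look
        "exists": os.path.isfile(path),     # False means the path is wrong
    }


report = check_image_path("niner.png")
print(report)
if not report["exists"]:
    # cv2.imread does not raise for a missing file; it returns None.
    print("File not found, so cv2.imread would return None here.")
```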
This is what I tried:
import cv2,numpy
import pytesseract
# ~ pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe" # For Windows OS
img = cv2.imread('niner.png')
# Apply OCR
data = pytesseract.image_to_string(img, config="-c tessedit"
" --psm 6"
" ")
print('tesseract version: ', pytesseract.get_tesseract_version())
and the result is:
tesseract version:
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0
I have this image, and I'm trying to read it with Tesseract:
My code is like this:
But all I get is LOW: 56, so Tesseract is unable to read the 1 in the first line. I've also tried specifying a whitelist of digits only, like
pytesseract.image_to_string(im, config="tessedit_char_whitelist=0123456789.")
and eroding the image, but nothing works. Any suggestions?
Improving the quality of the output is your "holy scripture" when working with Tesseract. In particular, the page segmentation method should always be set explicitly. Here (as in most cases), I'd opt for --psm 6:
Assume a single uniform block of text.
Even without further preprocessing of your image, you already get the desired result:
import cv2
import pytesseract
image = cv2.imread('gBrcd.png')
text = pytesseract.image_to_string(image, config='--psm 6')
print(text.replace('\f', ''))
# 1
# LOW: 56
System information
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.1
OpenCV: 4.5.2
pytesseract: 5.0.0-alpha.20201127
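On the whitelist attempt in the question: Tesseract variables are set with -c NAME=VALUE, so a bare tessedit_char_whitelist=... in config is not applied as a variable. Below is a minimal sketch of building such a config string. The helper names digits_config and keep_allowed are hypothetical, and the fallback filter is my own assumption for setups where the whitelist does not seem to take effect.

```python
def digits_config(psm=6, allowed="0123456789."):
    """Build a pytesseract config string with an explicit PSM and whitelist."""
    # Variables must be introduced with -c, written as NAME=VALUE without spaces.
    return f"--psm {psm} -c tessedit_char_whitelist={allowed}"


def keep_allowed(text, allowed="0123456789."):
    """Fallback filter: drop characters outside the allowed set, line by line."""
    lines = ("".join(ch for ch in line if ch in allowed) for line in text.splitlines())
    return [line for line in lines if line]


print(digits_config())
print(keep_allowed("1\nLOW: 56\n\f"))
```

The config string would then be passed as pytesseract.image_to_string(image, config=digits_config()).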
I am trying to detect the text in these images,
but it fails and I don't know why.
import pytesseract as pt
from PIL import Image
import re
image = Image.open('sample.jpg')
custom_config = r'--oem 3 --psm 7 outbase digits'
number = pt.image_to_string(image, config=custom_config)
print('Number: ', number)
Number: 0 50 100 200 250 # This is the output that I am getting.
Expected --> 0,0,0,0,0,1,0,8
Running Tesseract OCR on crude/raw input images might not give you the expected result.
For the given image, a somewhat better result can be obtained by converting it to grayscale and then applying a threshold.
To perform the conversion and thresholding, you may use ImageMagick as follows:
$ convert input_image.jpg -colorspace gray grayscale_image.jpg
$ convert grayscale_image.jpg -threshold 45% thresholded_image.jpg
$ convert thresholded_image.jpg -morphology Dilate Rectangle:4,3 dilated_binary.jpg
$ python run_tesseract.py
A more robust approach to OCR is to train the Tesseract engine, as discussed here
I am using a combination of pyautogui and pytesseract to capture small regions of the screen and then pull the number or text out of each region. I have written a script that reads the majority of the captured images perfectly, but single-digit numbers seem to cause an issue. For example, small regions containing numbers are saved to .png files: the numbers 11, 14, and 18 were read perfectly, but the number 7 just comes back as a blank string.
Question: What could be causing this to happen?
Code: Scaled down drastically to make it very easy to follow:
import pyautogui
import pytesseract
from PIL import Image

def get_text(image):
    return pytesseract.image_to_string(image)

answer2 = pyautogui.screenshot('answer2.png', region=(727, 566, 62, 48))
img = Image.open('answer2.png')
answer2 = get_text(img)
This code is repeated 4 times, once for each image. It worked for 11, 14, and 18, but not for 7.
Just to show the files being read, here is a screenshot of the images after they were saved by the screenshot command.
Here is a screenshot of what I am working from:
I found the question "Why pytesseract does not recognise single digits?" and in its comments I found the option --psm 6.
I checked tesseract with the option --psm 6 and it can recognize the single digit in your image.
tesseract --psm 6 number-7.jpg result.txt
I checked pytesseract.image_to_string() with the option config='--psm 6' and it can recognize the single digit in your image too.
#!/usr/bin/env python3
from PIL import Image
import pytesseract
img = Image.open('number-7.jpg')
print(pytesseract.image_to_string(img, config='--psm 6'))
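If --psm 6 alone does not work for some of your captures, one option is to try a few page segmentation modes in turn and keep the first non-empty result. This is only a sketch: ocr_with_fallback is a made-up helper, the order of modes is my own guess, and fake_ocr is a stand-in so the example runs without Tesseract installed.

```python
def ocr_with_fallback(image, ocr, psm_modes=(6, 7, 8, 10, 13)):
    """Try several page segmentation modes; return (psm, text) for the first hit.

    `ocr` is any callable with the signature ocr(image, config=...), for
    example pytesseract.image_to_string.
    Modes: 6 uniform block, 7 single line, 8 single word, 10 single character,
    13 raw line.
    """
    for psm in psm_modes:
        text = ocr(image, config=f"--psm {psm}").strip()
        if text:
            return psm, text
    return None, ""


def fake_ocr(image, config=""):
    # Stand-in for pytesseract.image_to_string; its behaviour is arbitrary.
    return "7\n\f" if "--psm 10" in config else ""


print(ocr_with_fallback(None, fake_ocr))
```

With the real library the call would be ocr_with_fallback(Image.open('number-7.jpg'), pytesseract.image_to_string).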