I am using a combination of pyautogui and pytesseract to capture small regions of the screen and then pull the number/text out of each region. I have written a script that reads the majority of captured images perfectly, but single-digit numbers seem to cause an issue. For example, small regions of an image containing numbers are saved to .png files: the numbers 11, 14, and 18 were pulled perfectly, but the number 7 just returns a blank string.
Question: What could be causing this to happen?
Code: Scaled down drastically to make it very easy to follow:
import pyautogui
import pytesseract
from PIL import Image

def get_text(image):
    return pytesseract.image_to_string(image)

answer2 = pyautogui.screenshot('answer2.png', region=(727, 566, 62, 48))
img = Image.open('answer2.png')
answer2 = get_text(img)
This code is repeated 4 times, once for each image. It worked for 11, 14, and 18, but not for 7.
Just to show the files being read, here is a screenshot of the images after they were saved through the screenshot command.
https://gyazo.com/0acbf5be2d970abeb29561113c171fbe
here is a screenshot of what I am working from:
https://gyazo.com/311913217a1302382b700b07ad3e3439
I found the question "Why pytesseract does not recognise single digits?" and in its comments I found the option --psm 6.
I checked tesseract with the option --psm 6 and it can recognize the single digit in your image.
tesseract --psm 6 number-7.jpg result.txt
I checked pytesseract.image_to_string() with the option config='--psm 6' and it can recognize the single digit in your image too.
#!/usr/bin/env python3
from PIL import Image
import pytesseract
img = Image.open('number-7.jpg')
print(pytesseract.image_to_string(img, config='--psm 6'))
I am trying to detect some numbers with tesseract in python. Below you will find my starting image and what I can get it down to. Here is the code I used to get it there.
import pytesseract
import cv2
import numpy as np
pytesseract.pytesseract.tesseract_cmd = "C:\\Users\\choll\\AppData\\Local\\Programs\\Tesseract-OCR\\tesseract.exe"
image = cv2.imread(r'64normalwart.png')
lower = np.array([254, 254, 254])
upper = np.array([255, 255, 255])
image = cv2.inRange(image, lower, upper)
image = cv2.bitwise_not(image)
#Uses a language that should work with minecraft text, I have tried with and without, no luck
text = pytesseract.image_to_string(image, lang='mc')
print(text)
cv2.imwrite("Wartthreshnew.jpg", image)
cv2.imshow("Image", image)
cv2.waitKey(0)
I end up with black numbers on a white background which seems pretty good but tesseract can still not detect the numbers. I also noticed the numbers were pretty jagged but I don't know how to fix that. Does anyone have recommendations for how I could make tesseract be able to recognize these numbers?
Starting Image
What I end up with
Your problem is with the page segmentation mode. Tesseract segments every image in a different way. When you don't choose an appropriate PSM, it goes for mode 3, which is automatic and might not be suitable for your case. I've just tried your image and it works perfectly with PSM 6.
df = pytesseract.image_to_string(np.array(image),lang='eng', config='--psm 6')
These are all the PSMs available at this moment:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
Use pytesseract.image_to_string(img, config='--psm 8'), or try different configs to see whether the image gets recognized. Useful link: Pytesseract OCR multiple config options
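Trying several configs by hand gets tedious, so here is a minimal sketch of how that could be automated. The helper names build_config and try_psms are made up for illustration and are not part of pytesseract; only pytesseract.image_to_string and the --psm, --oem and tessedit_char_whitelist options come from the posts here. The set of modes tried by default is just a guess at what suits small crops.

```python
def build_config(psm, whitelist=None, oem=None):
    """Assemble a Tesseract config string, e.g. '--psm 8'."""
    parts = []
    if oem is not None:
        parts.append(f"--oem {oem}")
    parts.append(f"--psm {psm}")
    if whitelist:
        # Restrict recognition to the given characters (digits, for example)
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def try_psms(image, psms=(6, 7, 8, 10, 13), whitelist=None):
    """Run OCR once per page segmentation mode and return {psm: stripped text}."""
    # Imported lazily so build_config can be used without Tesseract installed
    import pytesseract

    results = {}
    for psm in psms:
        config = build_config(psm, whitelist)
        results[psm] = pytesseract.image_to_string(image, config=config).strip()
    return results
```

Calling try_psms(img, whitelist='0123456789') on a single crop should make it easy to see which mode, if any, returns the expected digit.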
I am trying to analyze a page footer in a video and retrieve the current page number. I got the frame collection working but I am struggling on reading the page number itself, using EasyOCR.
I already tried using pytesseract, but that doesn't work well. I get misinterpreted numbers: 10 gets recognized as 113, 6 as 41, and so on. Overall it's very inconsistent, even though I prepare my input image with grayscale conversion, thresholding, and cropping (only analyzing the page number area of the footer).
Here is the code:
import cv2
import easyocr

# Assumption: the reader is created once like this (not shown in the original post)
reader = easyocr.Reader(['en'])

def getPageNumberTest(path, psm):
    image = cv2.imread(path)
    height = len(image)
    width = len(image[0])
    # the height of the footer
    footerHeight = 90 # int(height / 15.5)
    # retrieve only the footer from the image
    cropped = image[height-footerHeight:height,0:width]
    results = reader.readtext(cropped)
Which gives me the following output:
Is there a setting I am missing? Is there a way to instruct EasyOCR to look for numbers only?
Any help or hint is appreciated!
EDIT:
After some fiddling around with optimizations of the number images, I am now back at the beginning, not optimizing the images at all. All that's left is the conversion to grayscale and a resize.
This is what a normal input looks like:
But the results are:
Which is weird, because for most numbers (especially for single digits) this works flawlessly, yielding certainties of over 95%...
I tried deblurring, thresholding, denoising with cv2.filter2D(), blurring,...
When I use thresholding, for example, my output looks like this (ignoring the "1", same applies for the single digit "1"):
I had a look into pattern matching, which isn't an option because I don't know the page number shape beforehand...
txt = pytesseract.image_to_string(final_image, config='--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789')
According to my tests, PaddleOCR works better than EasyOCR in most scenes.
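On the question of whether EasyOCR can be told to look for numbers only: readtext() accepts an allowlist argument, so something along these lines may be worth a try. This is only a sketch. The helper names best_numeric_result and read_page_number are invented for illustration, and the filtering assumes the default result shape of (bounding box, text, confidence).

```python
def best_numeric_result(results, min_confidence=0.0):
    """Pick the highest-confidence all-digit entry from EasyOCR-style results.

    Each entry is expected to be (bounding_box, text, confidence), which is
    what reader.readtext() returns with its default detail level.
    """
    candidates = [
        (text, conf)
        for _box, text, conf in results
        if text.isdigit() and conf >= min_confidence
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])


def read_page_number(reader, image):
    """Hypothetical wrapper: ask EasyOCR for digits only, then keep the best hit."""
    results = reader.readtext(image, allowlist="0123456789")
    return best_numeric_result(results)
```

With the reader from the question, read_page_number(reader, cropped) would return a (text, confidence) pair, or None when nothing numeric was found.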
I need to extract the number from a display (LED dot matrix).
Sample Image:
I am using the example code given by pytesseract to test. But I am failing.
try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract
# If you don't have tesseract executable in your PATH, include the following:
pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'
# Example tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'
# Simple image to string
print(pytesseract.image_to_string(Image.open('test.png')))
From the sample image I need the output to be the numbers and letters shown.
If my image has only one number filling the whole display, that number alone should be in the result. Am I using the right tools? Or should I look into OpenCV and machine learning to extract characters from the display?
Why do I get different output with tesseract and pytesseract?
In tesseract:
tesseract t10.tiff output -l eng
In Python:
ocr_text = pytesseract.image_to_string(image, lang='eng', config='-psm 3')
If you look closely at pytesseract.run_tesseract(), you will see that pytesseract runs a subprocess that creates another .PNG file, and then runs a tesseract subprocess on that image. I put a Python debugger breakpoint right after the file is created and copied the file on disk for inspection. It turned out that the file's color profile is different from the original image's. Further, the new image has 3 color channels while the original image has an alpha channel. Try running tesseract from the command line on this new image and you'll get the same result you get from running pytesseract on the original image.
Generated PNG vs Original PNG
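If the alpha channel is what differs, one way to take the guesswork out is to flatten the transparency yourself before handing the image to either tool. The sketch below uses Pillow for that. flatten_alpha is a made-up helper name, and the white background is an assumption that suits dark text; it is not what pytesseract does internally.

```python
from PIL import Image


def flatten_alpha(img, background=(255, 255, 255)):
    """Return an RGB copy of img with any transparency composited onto a solid color."""
    has_alpha = img.mode in ("RGBA", "LA") or (
        img.mode == "P" and "transparency" in img.info
    )
    if not has_alpha:
        return img.convert("RGB")
    rgba = img.convert("RGBA")
    canvas = Image.new("RGB", rgba.size, background)
    # Use the alpha band as the paste mask so transparent pixels keep the background
    canvas.paste(rgba, mask=rgba.split()[3])
    return canvas
```

Saving the flattened image to disk and running both the tesseract command and pytesseract on that same file should then show whether the channel difference explains the mismatch.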
I am currently facing a problem with pytesseract where the software is unable to detect a number in this image:
https://i.stack.imgur.com/kmH2R.png
This is taken from a bigger image with threshold filter applied.
For some reason, pytesseract doesn't want to recognise the 6 in this image. Any suggestions? Here is my code:
image = #Insert raw image here. My code takes a screenshot.
image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
image = cv2.medianBlur(image, 3)
rel, gray = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
# If you want to use the image from above, start here.
image = Image.fromarray(image)
string = pytesseract.image_to_string(image)
print(string)
EDIT: After some further investigation, my code works fine with numbers containing 2 digits, but not with single digits.
pytesseract defaults to a mode that looks for large chunks of text (PSM_AUTO, i.e. --psm 3); in order to have it detect a single character you need to run it with the option --psm 10 (PSM_SINGLE_CHAR). However, the black spots in the corners of the image you provided are detected as random dashes, so Tesseract thinks there are multiple characters and returns nothing in this mode. In this case you need to use --psm 8 (PSM_SINGLE_WORD):
string = pytesseract.image_to_string(image, config='--psm 8')
The output from this will include those random characters so you would need to strip them after pytesseract runs or improve your bounding box around the numbers to remove any noise. Also, if all of your characters being detected are numbers you can add '-c tessedit_char_whitelist=0123456789' after '--psm 8' to improve the detection.
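The stripping step could look something like this. It is a minimal sketch that assumes you only ever expect ASCII digits in the result; digits_only is a name made up for illustration.

```python
import re


def digits_only(ocr_text):
    """Drop every character that is not an ASCII digit from raw OCR output."""
    return re.sub(r"[^0-9]", "", ocr_text)
```

You would call it on the string returned by image_to_string, for example digits_only(string).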
Another minor tip to simplify your code: cv2.imread has an option to read the image as grayscale, so you don't need to run cvtColor afterwards. Just do:
image = cv2.imread('/path/to/image/6.png', 0)
also you can create the PIL image object within your call to pytesseract, so that line simplifies to:
string = pytesseract.image_to_string(Image.fromarray(image), config='--psm 8')
as long as you have 'from PIL import Image' at the top of your script.