I'm learning OCR using PyTesser and Tesseract. As the first milestone, I want to write a tool to recognize captcha that simply consists of some digits. I read some tutorials and wrote such a test program.
from pytesser.pytesser import *
from PIL import Image, ImageFilter, ImageEnhance
im = Image.open("test.tiff")
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
text = image_to_string(im)
print "text={}".format(text)
I tested my code with the image below. But the result is 2(T?770. And I've tested some other similar images as well, in 80% case the results are incorrect.
I'm not familiar with imaging processing. I've two questions here:
Is it possible to tell PyTesser to guess digits only?
I think the image is quite easy for human to read. If it is so difficult for PyTesser to read digits only image, is there any alternatives can do a better OCR?
Any hints are very appreciated.
I think your code is quite okay. It can recognize 207770. The problem is at pytesser installation. The Tesseract in pytesser is out-of-date. You'd download a most recent version and overwrite corresponding files. You'd also edit pytesser.py and change
tesseract_exe_name = 'tesseract'
to
import os.path
tesseract_exe_name = os.path.join(os.path.dirname(__file__), 'tesseract')
Related
Binary image B2
Binary image Y2
I think these images are quite simple and clear. Still pytesseract does not work. I really wonder why.
Here is my code
from pytesseract import pytesseract as tesseract
import cv2 as cv
binary = cv.imread(filepath)
lang = 'eng'
config = 'tessedit_char_whitelist=RGB123'
print(tesseract.image_to_string(binary, lang=lang, config=config))
The output is just blank string.
To Dennlinger's point, I would definitely rotate it before sending it through PyTess. PyTess should rotate it automatically though. Should.
Alternatively, I see in your configuration that you have white listed "RGB123" which, correct me if I'm wrong, may mean that PyTess is mainly looking for those specific numbers and characters.
I'd try changing your configuration by omiting that configuration so that it can pick up the "Y" in there.
so basicly im trying to make tesseract ocr do some elementary school math, but it wont work it prints out the math but not the answer to the math, here is my code
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\alloa026\Downloads\project\photo.png'
content = pytesseract.image_to_string(r'C:\Users\alloa026\Downloads\project\photo.png')
print(eval("content.strip()"))
anyone know how to fix this?
have you note that, you should correct the last line
print(eval("content.strip()")
to
print(eval(content.strip())
Obviously this image is pretty tough as it is low clarity and is not a real word. However, with this code, I'm detecting nothing close:
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
image_name = 'NedNoodleArms.jpg'
im = Image.open(image_name)
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save(image_name)
text = pytesseract.image_to_string(Image.open(image_name))
print(text)
outputs
, Mdfiaodfiamms
Any ideas here? The image my contrasting function produces is:
Which looks decent? I don't have a ton of OCR experience. What preprocessing would you recommend here? I've tried resizing the image larger, which helps a little bit but not enough, along with a bunch of different filters from PIL. Nothing getting particularly close though
You are right, tesseract works better with higher resolutions so sometimes resizing the image helps - but don't convert to 1 bit.
I got good results converting to grayscale, making it 3 times as large and making the letters a bit brighter:
>>> im = Image.open('j78TY.png')\
.convert('L').resize([3 * _ for _ in im.size], Image.BICUBIC)\
.point(lambda p: p > 75 and p + 100)
>>> pytesseract.image_to_string(im)
'NedNoodleArms'
Check this jupyter notebook:
I have been trying to teach myself more advanced methods in Python but can't seem to find anything similar to this problem to base my code off of.
First question: Is this only way to display an image in the terminal to install Pillow? I would prefer not to, as I'm trying to then teach what I learn to a very beginner student. My image.show() function doesn't do anything.
Second question: What is the best way to go about lowering the brightness of all RGB pixels in an image by 20%? What I have below doesn't do anything to the alter the brightness, but it also can compile completely. I would prefer the most simple way to go about this as far as importing minimal libraries.
Third Question: How do I made a new picture instead of changing the original? (IE- lower brightness 20%, "image-decreasedBrightness.jpg" is created from "image.jpg")
here is my code - sorry it isn't formatted correctly. Every time i tried to indent it would tab down to the tags bar.
import Image
import ImageEnhance
fileToBeOpened = raw_input("What is the file name? Include file type.")
image = Image.open(fileToBeOpened)
def decreaseBrightness(image):
image.show()
image = image.convert('L')
brightness = ImageEnhance.Brightness(image)
image = brightness.enhance(20)
image.show()
return image
decreaseBrightness(image)
To save the image as a file, there's an example on the documentation:
from PIL import ImageFile
fp = open("lena.pgm", "rb")
p = ImageFile.Parser()
while 1:
s = fp.read(1024)
if not s:
break
p.feed(s)
im = p.close()
im.save("copy.jpg")
The key function is im.save.
For a more in-depth solution, get a nice beverage, find a comfortable place to sit and enjoy your read:
Pillow 3.4.x Documentation.
So I have implemented the following screenshot functionality into my game just to log progress and stuff like that. This is my code:
pygame.image.save(screen, save_file)
Pretty basic. I recently upgraded to python 3.3 and have since been having the issue of distorted colors using this function. Here is what I mean:
Distorted Color:
So it looks quite nice, but it isn't how it supposed to be. This is the actual image:
Is this a known issue or is it just me? Are there any fixes to it or is it just a broken function at the moment. I am using pygame 1.9.2pre and I am assuming it is just a bug with the pre release but I was having issues using any other versions of pygame with python 3.3.
Some users have reported difficulty with saving images as pngs:
I only get .tga files even when I specify .png. Very frustrating.
If you use .PNG (uppercase), it will result in an invalid file (at least on my win32). Use .png (lowercase) instead.
PNG does not seem to work, I am able to get a preview of it in Thunar, but everywhere else It says that it is not a valid PNG.
Saving in a different format may be helpful. For example, BMP is a simple format, so it's unlikely that Pygame's implementation will be buggy.
If you really want to save as PNG, you can reverse the distortion by swapping the red channel with the green one. This is fairly easy. For example, using PIL:
from PIL import Image
im = Image.open("screenshot.png")
width, height = im.size
pix = im.load()
for i in range(width):
for j in range(height):
r,g,b = pix[i,j]
pix[i,j] = (g,r,b)
im.save("output.png")
Or you can save as BMP and convert to PNG after the fact:
from PIL import Image
im = Image.open("screenshot.bmp")
im.save("screenshot.png")
for future reference, this trick worked for me:
from PIL import Image
imgdata = pygame.surfarray.array3d(screen).transpose([1,0,2])[:,:,::-1]
Image.fromarray(imgdata).save('output.png')