I'm trying to write an OCR script with Python (2.7, Windows OS) to get text from images. First I downloaded PyTesser and extracted it to Python27/Lib/site-packages as 'pytesser', and I installed tesseract with pip install tesseract. Then I wrote the following script as self.py:
from PIL import Image
from pytesser.pytesser import *
image_file = 'C:/Users/blabla/test.png'
im = Image.open(image_file)
text = image_to_string(im)
text = image_file_to_string(image_file)
text = image_file_to_string(image_file, graceful_errors=True)
print text
But I'm getting the following error:
Traceback (most recent call last):
File "C:/Users/blabla/self.py", line 7, in <module>
text = image_file_to_string(image_file)
File "C:\Python27\lib\site-packages\pytesser\pytesser.py", line 44, in image_file_to_string
call_tesseract(filename, scratch_text_name_root)
File "C:\Python27\lib\site-packages\pytesser\pytesser.py", line 24, in call_tesseract
errors.check_for_errors()
File "C:\Python27\lib\site-packages\pytesser\errors.py", line 10, in check_for_errors
inf = file(logfile)
IOError: [Errno 2] No such file or directory: 'tesseract.log'
And yes, there's no 'tesseract.log' file anywhere. What should I do? How should I solve this problem?
Thank you in advance.
Note: I've changed tesseract_exe_name in pytesser.py from tesseract to C:/Python27/Lib/site-packages/pytesser/tesseract, but it doesn't work.
Edit: Alright, I've just run tesseract.exe, which is in 'pytesser', and it created the 'tesseract.log' file, but I'm still getting the same error.
I've changed the line from def check_for_errors(logfile = "tesseract.log"): to def check_for_errors(logfile = "C:/Python27/Lib/site-packages/pytesser/tesseract.log"): in ../pytesser/errors.py and it worked.
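The hard-coded path works because the default "tesseract.log" is a relative path, so it is resolved against the current working directory rather than against the pytesser folder. A more portable variant would derive the path from the module's own location. The sketch below shows the idea; the helper name path_beside is hypothetical and not part of pytesser.

```python
import os

def path_beside(module_file, name):
    """Return the absolute path of `name` in the same folder as `module_file`."""
    return os.path.join(os.path.dirname(os.path.abspath(module_file)), name)

# Inside errors.py this would be called as path_beside(__file__, "tesseract.log").
# A made-up relative location is used here so the sketch runs anywhere.
example = path_beside(os.path.join("pytesser", "errors.py"), "tesseract.log")
print(os.path.basename(example))
```

With this approach the log path no longer depends on where Python was installed or on the directory the script is started from.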
I'm having an issue just reading from a file, which I don't think I've had an issue with before. I'm not sure if it's because I switched from PyCharm to IDLE.
Here is my current code:
import time
import os
keep_running = True
last_time = 0
file = os.path.abspath(r'C:\Users\AUser\Desktop\test.txt')
current_time = os.path.getmtime(file)
print(file)
Here is the output I'm getting:
Traceback (most recent call last):
File "C:/Users/AUser/Desktop/Scripts/FileAlert.py", line 9, in <module>
current_time = os.path.getmtime(file)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2032.0_x64__qbz5n2kfra8p0\lib\genericpath.py", line 55, in getmtime
return os.stat(filename).st_mtime
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:\\Users\\AUser\\Desktop\\test.txt'
If I remove 'r' from the file path, I get a unicode error instead. The file is in a different directory from the script so I'm not sure what the issue is. This is happening with Python 3.9 on a Windows 10 machine.
In this case the issue was that the file being searched for was actually named 'test.txt.txt'.
This suggestion from user tdelaney helped me locate the error on my end:
Does that file exist? It seems it does not. You could try import os and then os.listdir(r'C:\Users\AUser\Desktop') to get the directory list.
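The directory listing can be narrowed to names that resemble the one you expected, which makes a doubled extension such as 'test.txt.txt' stand out. This is a minimal sketch; the function name similar_names is hypothetical.

```python
import os

def similar_names(directory, expected_name):
    """List entries in `directory` whose names start with `expected_name`, ignoring case."""
    wanted = expected_name.lower()
    return sorted(n for n in os.listdir(directory) if n.lower().startswith(wanted))

# Example call (path taken from the question):
# similar_names(r'C:\Users\AUser\Desktop', 'test.txt')
```

Windows Explorer hides known file extensions by default, so a file shown there as "test.txt" may really be "test.txt.txt" on disk.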
from PIL import Image
from pytesser import *
image_file = 'D:\plate.jpg'
im = Image.open(image_file)
text = image_to_string(im)
text = image_file_to_string(image_file)
text = image_file_to_string(image_file, graceful_errors=True)
print "=====output=======\n"
print text
It gives me this error:
C:\Users\KEN\Anaconda2\python.exe C:/Users/KENIL/PycharmProjects/plate/ocr.py
Traceback (most recent call last):
File "C:/Users/KEN/PycharmProjects/plate/ocr.py", line 2, in <module>
from pytesser import *
File "C:\Users\KEN\PycharmProjects\plate\pytesser.py", line 6, in <module>
import Image
ImportError: No module named Image
even though I have installed Anaconda properly, as well as Pillow.
The code you posted isn't where the error occurs. According to the traceback, it is raised in pytesser.py, line 6, by the statement import Image. Pillow does not install a top-level Image module, so that line needs to be from PIL import Image instead.
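If pytesser.py has to run both with Pillow (where Image lives inside the PIL package) and with an old standalone PIL install, the import can fall back from one name to the other. The sketch below uses a hypothetical helper, import_first, built only on the standard importlib module.

```python
import importlib

def import_first(*module_names):
    """Import and return the first module in `module_names` that is available."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules could be imported: %s"
                      % ", ".join(module_names))

# In pytesser.py the failing line could then become:
# Image = import_first("PIL.Image", "Image")
```

If only Pillow is in use, simply writing from PIL import Image is the shorter fix.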
When I use only PIL it works properly, but when I use pytesser it doesn't. What can I do about it?
from PIL import Image
from pytesser import *
image_file = Image.open("vote.jpg")
im = Image.open(image_file)
text = image_to_string(im)
print text
Traceback (most recent call last):
File "C:/Users/Tanvir/Desktop/py thon hand/hala.py", line 4, in <module>
image_file = Image.open("vote.jpg")
File "C:\Pythons27\lib\site-packages\PIL\Image.py", line 2286, in open
% (filename if filename else fp))
IOError: cannot identify image file 'vote.jpg'
I found the solution. It was a problem with PIL: I had to uninstall the PIL module and install it again, and after that everything worked.
I want to freeze a python script which is using dropbox to upload a file. I am using python 2.7 and windows 7. If I try just this example code:
import dropbox
client = dropbox.client.DropboxClient(<authtoken>)
client.put_file('FileName',"", overwrite=True)
I created an app with dropbox and generated a token which is explained on the dropbox homepage. The example works if I just use the python script. For future applications I want to freeze the script to use it on computers without python.
If I execute the .exe file, I get the error message below (I named the Python script dropbox.py):
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
exec(code, m.__dict__)
File "dropbox.py", line 2, in <module>
File "H:\dropbox.py", line 3, in <module>
client = dropbox.client.DropboxClient("authToken")
AttributeError: 'module' object has no attribute 'client'
Solved: Do not use dropbox.py for your example scripts
The error states that it can't find the client module, but how can I fix this error?
My setup.py file is:
import cx_Freeze
import sys
base = None
executables = [
    cx_Freeze.Executable("dropbox.py", base = base),
]
build_exe_options = {"includes":[],
                     "include_files":[],
                     "excludes":[],
                     "packages":[]
                     }
cx_Freeze.setup(
    name = "script",
    options = {"build_exe": build_exe_options},
    version = "0.0",
    description = "A basic example",
    executables = executables)
I also tried to add in packages "dropbox" but it doesn't work.
I hope someone can help me. Maybe there is another way to upload a file to dropbox?
Cheers Max
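The first error comes from the script being named dropbox.py, so import dropbox loads the script itself rather than the installed package (the traceback shows H:\dropbox.py importing dropbox.py). One way to confirm this kind of name clash is to print where a module was actually loaded from. The sketch below uses a hypothetical helper, module_location, and the standard json module as a stand-in for dropbox so that it runs without extra packages.

```python
import importlib
import os

def module_location(name):
    """Import `name` and return the absolute path of the file it was loaded from."""
    module = importlib.import_module(name)
    return os.path.abspath(module.__file__)

# "json" stands in for "dropbox" here.
print(module_location("json"))
```

If the printed path points at your own script instead of site-packages, the script is shadowing the package and should be renamed.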
Edit1:
Indeed it was a problem with the name of my example script. Though it still doesn't work. The new error is:
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
exec(code, m.__dict__)
File "mydropbox.py", line 2, in <module>
File "C:\Python27\lib\site-packages\dropbox\__init__.py", line 3, in <module>
from . import client, rest, session
File "C:\Python27\lib\site-packages\dropbox\client.py", line 22, in <module>
from .rest import ErrorResponse, RESTClient, params_to_urlencoded
File "C:\Python27\lib\site-packages\dropbox\rest.py", line 26, in <module>
TRUSTED_CERT_FILE = pkg_resources.resource_filename(__name__, 'trusted-certs.crt')
File "build\bdist.win32\egg\pkg_resources.py", line 950, in resource_filename
File "build\bdist.win32\egg\pkg_resources.py", line 1638, in get_resource_filename
NotImplementedError: resource_filename() only supported for .egg, not .zip
I tried to solve the error with the approach described on another site, but it doesn't work. Does anybody have a solution for my new error?
I've solved the problem this way:
1) In your dropbox module location find the file "rest.py" (C:\Python27\Lib\site-packages\dropbox-2.2.0-py2.7.egg\dropbox - in my case)
2) Comment out the line TRUSTED_CERT_FILE = pkg_resources.resource_filename(__name__, 'trusted-certs.crt')
3) Replace it with either
TRUSTED_CERT_FILE = 'trusted-certs.crt' (this is what I used) or
if hasattr(sys, 'frozen'):
    TRUSTED_CERT_FILE = 'trusted-certs.crt'
else:
    TRUSTED_CERT_FILE = pkg_resources.resource_filename(__name__, 'trusted-certs.crt')
(the second form requires sys to be imported in rest.py)
4) Remove all .pyc files from the dropbox directory and run the .py file of your program
5) Rebuild your executable with cx_freeze
6) Copy trusted-certs.crt file from dropbox directory into library.zip/dropbox
7) Enjoy!
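The frozen check in step 3 can be wrapped in a small helper that returns a full path in both cases. This is only a sketch of a variation, not what the steps above do: it assumes you copy trusted-certs.crt next to the built executable instead of into library.zip, and the name resource_path is hypothetical. cx_Freeze sets sys.frozen in the built program, which is what the check relies on.

```python
import os
import sys

def resource_path(name, module_file):
    """Locate `name` next to the executable when frozen, else next to `module_file`."""
    if getattr(sys, "frozen", False):
        # Frozen build: assume the data file was copied beside the .exe.
        base = os.path.dirname(os.path.abspath(sys.executable))
    else:
        # Normal run: the data file sits beside the module that uses it.
        base = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base, name)

# In rest.py this could be used as:
# TRUSTED_CERT_FILE = resource_path('trusted-certs.crt', __file__)
```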
I am trying to execute the following code
from pytesser import *
import Image
i="C:/Documents and Settings/Administrator/Desktop/attachments/R1PNDTCB.jpg"
print i
im = Image.open(i.strip())
text = image_to_string(im)
print text
I get the following error
C:/Documents and Settings/Administrator/Desktop/attachments/R1PNDTCB.jpg
Traceback (most recent call last):
File "C:\Python27\Lib\site-packages\Pythonwin\pywin\framework\scriptutils.py", line 322, in RunScript
debugger.run(codeObject, __main__.__dict__, start_stepping=0)
File "C:\Python27\Lib\site-packages\Pythonwin\pywin\debugger\__init__.py", line 60, in run
_GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
File "C:\Python27\Lib\site-packages\Pythonwin\pywin\debugger\debugger.py", line 655, in run
exec cmd in globals, locals
File "C:\Documents and Settings\Administrator\Desktop\attachments\ocr.py", line 1, in <module>
from pytesser import *
File "C:\Python27\lib\site-packages\PIL\Image.py", line 1952, in open
fp = __builtin__.open(fp, "rb")
IOError: [Errno 2] No such file or directory: 'C:/Documents and Settings/Administrator/Desktop/attachments/R1PNDTCB.jpg'
Can someone please explain what I am doing wrong here?
I renamed the image file, moved the Python file and the images to a new folder, and moved that folder to the E: drive.
Now the code is as follows:
from pytesser import *
import Image
import os
i=os.path.join("E:\\","ocr","a.jpg")
print i
im = Image.open(i.strip())
text = image_to_string(im)
print text
Now the error is as follows:
E:\ocr\a.jpg
Traceback (most recent call last):
File "or.py", line 8, in <module>
text = image_to_string(im)
File "C:\Python27\lib\pytesser.py", line 31, in image_to_string
call_tesseract(scratch_image_name, scratch_text_name_root)
File "C:\Python27\lib\pytesser.py", line 21, in call_tesseract
proc = subprocess.Popen(args)
File "C:\Python27\lib\subprocess.py", line 679, in __init__
errread, errwrite)
File "C:\Python27\lib\subprocess.py", line 893, in _execute_child
startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
You need to install Tesseract first; just installing pytesseract is not enough. Then edit the tesseract_cmd variable in pytesseract.py to point to the tesseract binary. For example, in my installation I set it to
tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'
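The WindowsError in the second traceback is raised by subprocess.Popen, which means Windows could not find the program to launch, not the image. Before editing any wrapper, it can help to check whether the tesseract binary is reachable through PATH at all. The sketch below does that with the standard library only; the function name locate_binary is hypothetical.

```python
import os

def locate_binary(name, search_path=None):
    """Return the full path of `name` found on `search_path`, or None."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    # On Windows the binary is tesseract.exe, so try that suffix as well.
    candidates = [name, name + ".exe"]
    for folder in search_path.split(os.pathsep):
        if not folder:
            continue
        for candidate in candidates:
            full = os.path.join(folder, candidate)
            if os.path.isfile(full):
                return full
    return None

print(locate_binary("tesseract"))
```

If this prints None, either add the Tesseract install folder to PATH or give the wrapper the full path to the executable, as described above.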
The exception is pretty clear: the file either doesn't exist, or you lack sufficient permissions to access it. If neither is the case, please provide evidence (e.g. relevant dir commands with output, run as the same user).
your image path maybe?
i="C:\\Documents and Settings\\Administrator\\Desktop\\attachments\\R1PNDTCB.jpg"
try this:
import os
os.path.join("C:\\", "Documents and Settings", "Administrator")
you should get a string similar to the one in the previous line
Try this first:
os.path.expanduser('~/Desktop/attachments/R1PNDTCB.jpg')
It could be that the space in the 'Documents and Settings' is causing this problem.
EDIT:
Use os.path.join so it uses the correct directory separator.
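A quick way to combine these suggestions is to build the path with os.path.join and confirm the file exists before handing it to Image.open. This is a minimal sketch; the helper name build_and_check is hypothetical, and the folder names are taken from the question.

```python
import os

def build_and_check(*parts):
    """Join `parts` with the platform separator and report whether the file exists."""
    path = os.path.join(*parts)
    return path, os.path.isfile(path)

path, exists = build_and_check(os.path.expanduser("~"), "Desktop",
                               "attachments", "R1PNDTCB.jpg")
# repr() makes stray whitespace or unexpected characters in the path visible.
print("%r exists: %s" % (path, exists))
```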
Just add these two lines to your code
import os
os.chdir(r'C:\Python27\Lib\site-packages\pytesser')
before
from pytesser import *
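Note that os.chdir changes the working directory for the whole process, so any relative paths used later in the script will also be resolved against the pytesser folder. A context manager can limit the change to the OCR call and restore the previous directory afterwards. This is a sketch; the name working_directory is hypothetical.

```python
import contextlib
import os

@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the current working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the original directory, even if the body raises.
        os.chdir(previous)

# Usage with the folder from the answer above:
# with working_directory(r'C:\Python27\Lib\site-packages\pytesser'):
#     text = image_to_string(im)
```

When using this, pass the image as an absolute path, since relative paths inside the with block are resolved against the pytesser folder.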
If you are using pytesseract, you have to make sure that you have installed Tesseract-OCR in your system. After that you have to insert the path of the tesseract in your code, as below
from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
You can download Tesseract-OCR from https://github.com/UB-Mannheim/tesseract/wiki