Problem
My tesseract installation (tesserocr) is not found by the Python interpreter that Emacs launches, although I can use tesseract from the terminal and from my Spyder installation. The Emacs Python interpreter can import pytesseract, but it cannot find the tesseract executable. I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/eghx/agent18/project-gym/tests/thresholding.py", line 34, in image_to_string2
print(image_to_string(img_open))
File "/home/eghx/anaconda3/lib/python3.6/site-packages/pytesseract-0.1.7-py3.6.egg/pytesseract/pytesseract.py", line 122, in image_to_string
File "/home/eghx/anaconda3/lib/python3.6/site-packages/pytesseract-0.1.7-py3.6.egg/pytesseract/pytesseract.py", line 46, in run_tesseract
File "/home/eghx/anaconda3/lib/python3.6/subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "/home/eghx/anaconda3/lib/python3.6/subprocess.py", line 1344, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'tesseract': 'tesseract'
when I run
pytesseract.image_to_string(img)
However, I don't get this error when I open Emacs from a terminal instead of from the desktop. It appears that the PATH variable is inherited differently by the desktop-launched and the terminal-launched Emacs. Odd!
Explanation
I have an Anaconda installation here: /path/to/anaconda3
I added this line to my init file so that Emacs runs this particular Python installation:
(setq python-shell-interpreter "/path/to/anaconda3/bin/python")
I installed both pytesseract and tesserocr using conda install
which tesseract gives:
/path/to/anaconda3/bin/tesseract
$ echo $PATH gives:
/path/to/anaconda3/bin:/usr/local/sbin:/usr/lo....
What I did
I copied sys.path from the working Spyder IDE into the Emacs Python interpreter, and it still didn't work.
I looked around and found this, but the top answer does not apply to my case, because my $PATH variable already contains the necessary path.
Can someone guide me? I am a noob. I have Emacs 27, Ubuntu 16 and conda 4.5.0.
This is a possible duplicate of OSError: [Errno 2] No such file or directory using pytesser
The answer came from the third point in that link, quoted below:
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'path-to-tesseract-including-bin'
In my case,
import pytesseract
pytesseract.pytesseract.tesseract_cmd = '/home/anaconda3/bin/tesseract'
This is only a temporary hack to get image_to_string to work, and it means adding the lines above to every file.
Why having /home/anaconda3/bin in $PATH is not enough on its own is not known. For now this serves as a slightly longer-term temporary solution.
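One way to investigate is to print the PATH that the Python process actually inherited, since an editor started from the desktop may not receive the same value as one started from a terminal. The sketch below uses only the standard library; `find_executable` and its `extra_dirs` fallback are illustrative names, not part of pytesseract.

```python
import os
import shutil


def find_executable(name, extra_dirs=()):
    """Return the full path to `name`, or None if it cannot be found.

    Looks on the PATH inherited by this Python process first, then in
    `extra_dirs` (hypothetical fallback locations such as a conda bin dir).
    """
    found = shutil.which(name)
    if found:
        return found
    if extra_dirs:
        # shutil.which accepts an explicit search path as a single string.
        return shutil.which(name, path=os.pathsep.join(extra_dirs))
    return None


if __name__ == "__main__":
    print("PATH seen by this interpreter:")
    for entry in os.environ.get("PATH", "").split(os.pathsep):
        print("  ", entry)
    print("tesseract resolves to:", find_executable("tesseract"))
```

If this returns a path, it could be assigned to pytesseract.pytesseract.tesseract_cmd once at start-up rather than hard-coded in every file.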
Related
I recently installed Anaconda because I want to delve further into data science and machine learning, and I am trying to set up Sublime Text as my main editor, which it used to be when I had only plain Python installed.
I uninstalled everything Python-related and installed just Anaconda, changing the PATH from the normal Python path to C:\ProgramData\Anaconda3\Scripts and C:\ProgramData\Anaconda3. I reinstalled ST3, where the code did work, and then updated to ST4. (This is an edit: I had not realised my ST3 had updated to 4 when I first posted.)
I made sure that the command python works in cmd, and although I do get a warning:
"Warning:
This Python interpreter is in a conda environment, but the environment has
not been activated. Libraries may fail to load. To activate this environment
please see https://conda.io/activation"
a print("Hello World") statement works.
Moving on to Sublime Text, I tried to run the same statement and was met with an error:
[WinError 2] The system cannot find the file specified
[cmd: ['py', '-u', '']]
[dir: E:\Programs\Sublime Text 3]
[path: C:\Program Files\Oculus\Support\oculus-runtime;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\WINDOWS\System32\OpenSSH\;C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\PuTTY\;C:\ProgramData\Anaconda3\Scripts;C:\ProgramData\Anaconda3;C:\Users\seabr\AppData\Local\Microsoft\WindowsApps;C:\MinGW-w64\x86_64-8.1.0-posix-seh-rt_v6-rev0\mingw64\bin\;C:\Users\seabr\AppData\Local\GitHubDesktop\bin;]
[Finished]
I attempted to fix this by opening the sublime-build file for Python with PackageResourceViewer and changing python3 to python in there, which did not fix the issue. I then installed the Conda package, which produces no build output when run, but in the Sublime Text console I can see:
Traceback (most recent call last):
File "./python3.3/subprocess.py", line 1104, in _execute_child
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:\Programs\Sublime Text 3\Lib\python33\sublime_plugin.py", line 1456, in run_
return self.run(**args)
File "C:\Users\seabr\AppData\Roaming\Sublime Text\Installed Packages\Conda.sublime-package\commands.py", line 682, in run
File "C:\Users\seabr\AppData\Roaming\Sublime Text\Installed Packages\Conda.sublime-package\commands.py", line 645, in __enter__
File "C:\Users\seabr\AppData\Roaming\Sublime Text\Installed Packages\Conda.sublime-package\commands.py", line 629, in conda_version
File "./python3.3/subprocess.py", line 576, in check_output
File "./python3.3/subprocess.py", line 819, in __init__
File "./python3.3/subprocess.py", line 1110, in _execute_child
FileNotFoundError: [WinError 2] The system cannot find the file specified
This is the closest I can find to my situation, but following the fixes there has not resolved my issue either.
In essence, I'm not sure whether this is an issue with my Anaconda environment, Sublime Text or my Windows PATH, so any help would be greatly appreciated, and I'm happy to provide more information if needed.
I had the same issue after updating to Sublime Text 4.
I fixed it by changing python3 to python (like you mentioned trying) and also py to python in my Python.sublime-build.
Hope this works.
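The `[cmd: ['py', '-u', '']]` line in the error output suggests the build system tried to launch `py`, which may not exist in an Anaconda-only setup. A small check like the following, run from a normal Python prompt, shows which launcher names resolve on the current PATH. The function name and the list of candidate names are assumptions made for illustration.

```python
import shutil


def available_launchers(names=("py", "python3", "python")):
    """Map each candidate command name to its resolved path, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in available_launchers().items():
        print(f"{name:8} -> {path or 'not found on PATH'}")
```

Whichever name resolves is a reasonable candidate for the command in Python.sublime-build.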
I have installed the pytesseract library using
pip install pytesseract
When I tried to use the image_to_string method, it raised a
FileNotFoundError: [WinError 2] The system can not find the file specified
I googled it and found that I should edit the pytesseract.py file, so that the line
tesseract_cmd = 'tesseract'
should become
tesseract_cmd = path_to_folder_that_contains_tesseractEXE + 'tesseract'
I searched and didn't find any tesseract.exe file in my Python folder. I then reinstalled the library, but the file still wasn't there. Finally, I replaced the line with:
tesseract_cmd = path_to_folder_that_contains_pytesseractEXE + 'pytesseract'
and my program threw:
pytesseract.pytesseract.TesseractError: (2, 'Usage: python pytesseract.py [-l lang] input_file')
What can I do to make my program work?
P.S. Here is my program code:
from pytesseract import image_to_string
from PIL import Image, ImageEnhance, ImageFilter
im = Image.open(r'C:\Users\Филипп\Desktop\ImageToText_Python\NoName.png')
print(im)
txt = image_to_string(im)
print(txt)
Full traceback of first attempt:
File "C:/Users/user/Desktop/ImageToText.py", line 10, in <module>
text = pytesseract.image_to_string(im)
File "C:\Python\lib\site-packages\pytesseract\pytesseract.py", line 122, in image_to_string
config=config)
File "C:\Python\lib\site-packages\pytesseract\pytesseract.py", line 46, in run_tesseract
proc = subprocess.Popen(command, stderr=subprocess.PIPE)
File "C:\Python\lib\subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "C:\Python\lib\subprocess.py", line 1224, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system can not find the file specified
Full traceback of second attempt:
Traceback (most recent call last):
File "C:\Users\user\Desktop\ImageToText.py", line 6, in <module>
txt = image_to_string(im)
File "C:\Python\lib\site-packages\pytesseract\pytesseract.py", line 125, in image_to_string
raise TesseractError(status, errors)
pytesseract.pytesseract.TesseractError: (2, 'Usage: python pytesseract.py [-l lang] input_file')
From project's README:
try:
import Image
except ImportError:
from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = '<full_path_to_your_tesseract_executable>'
# Include the above line, if you don't have tesseract executable in your PATH
# Example tesseract_cmd: 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
print(pytesseract.image_to_string(Image.open('test.png')))
print(pytesseract.image_to_string(Image.open('test-european.jpg'), lang='fra'))
So, you have to make sure tesseract.exe is on your computer (for example by installing Tesseract-OCR), and then either add the containing folder to your PATH environment variable or declare its location using the pytesseract.pytesseract.tesseract_cmd attribute.
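The two options above (rely on PATH, or set tesseract_cmd explicitly) can be combined in a small helper. This is a minimal sketch, not part of pytesseract: `configure_tesseract`, its parameters and the example Windows path are all assumptions for illustration. It only relies on the `tesseract_cmd` attribute shown in the README.

```python
import os
import shutil


def configure_tesseract(target, candidates=(), name="tesseract"):
    """Set target.tesseract_cmd to a usable executable and return its path.

    `target` is any object with a `tesseract_cmd` attribute, for example the
    `pytesseract.pytesseract` module. `candidates` are hypothetical fallback
    locations to try when `name` is not on PATH.
    """
    resolved = shutil.which(name)
    if resolved is None:
        # Fall back to the first candidate that exists as a regular file.
        resolved = next((c for c in candidates if os.path.isfile(c)), None)
    if resolved is None:
        raise FileNotFoundError(
            f"{name!r} is not on PATH and none of the candidates exist"
        )
    target.tesseract_cmd = resolved
    return resolved


# Example use (the Windows path is only an illustration, adjust it locally):
#   import pytesseract
#   configure_tesseract(
#       pytesseract.pytesseract,
#       [r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"],
#   )
```

Raising early with a clear message avoids the less obvious FileNotFoundError that otherwise surfaces from inside subprocess.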
For people in the same situation as me: here is a Tesseract-OCR downloader. After the download finishes, go to the install path you chose; there should be a file named tesseract.exe. Copy the path to this file and paste it into pytesseract.py as the value of tesseract_cmd.
If you are using Windows, you have to install tesseract-ocr from this link (3.05.01 is the stable version and supports foreign-language extraction), and then add the path where you installed the software to the PATH environment variable.
If you are using Ubuntu, type "sudo apt-get install tesseract-ocr" in a terminal.
Pytesseract is a Python wrapper that lets you access the tesseract-ocr software.
Note 1: if you want to extract foreign languages, you have to include the tessdata files in the installation path.
Note 2: Python 2 does not have good support for foreign-language extraction, so it is better to use Python 3.
I'm using the Pannellum panorama viewer to build a tour experience of a campus. To improve support for mobile devices I want to use Pannellum's multiresolution format, using the included Python script to generate it from images.
I'm a Mac user with very little Python knowledge, and I have managed to get as far as executing the script from the terminal (yay!)
The comments on the python script state the requirements:
# Requires Python 3.2+ (or Python 2.7), the Python Pillow package,
# and nona (from Hugin)
I'm running Python 2.7.10, I installed the Pillow package, and I got the 'nona' executable from the package contents of 'HuginStitchProject.app', which was downloaded from their SourceForge mirror.
When executing the code with python generate.py basketball_court.jpg from the folder containing the nona executable mentioned earlier, the *.jpg file and generate.py, I get the following error, which I am not able to resolve:
Processing input image information...
Generating cube faces...
Traceback (most recent call last):
File "generate.py", line 106, in <module>
subprocess.check_call([args.nona, '-o', os.path.join(args.output, 'face'), os.path.join(args.output, 'cubic.pto')])
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 535, in check_call
retcode = call(*popenargs, **kwargs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 522, in call
return Popen(*popenargs, **kwargs).wait()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
I see the OSError: [Errno 2] No such file or directory. Which file am I missing? Which file is it unable to see?
Any help would be greatly appreciated :)
EDIT: I'm pretty sure that the nona executable I've got is not the right dependency. I don't even know what an executable should look like. Should I be looking for a nona.py, nona.exe, nona.sh or something else?
subprocess.check_call is used to run an external program, but it looks like it cannot find the nona executable.
If you edit generate.py and put print(args.nona) before line 106, you'll see what it's trying to call.
Most likely you need to check paths; perhaps put the nona executable in the same directory as generate.py or in the one you're running from. Also check that it has the executable flag set.
Checking the generate.py source, I see it gets args.nona from its own parameters, or uses a default if you don't explicitly set it (line 59).
The default is set by trying to find an executable called "nona", else set it to None (line 37).
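The checks suggested above (does the file exist, is it marked executable) can be sketched with the standard library as follows. The helper names and the `./nona` location are assumptions for illustration. Note that on POSIX systems a bare command name is looked up on PATH rather than in the current directory, so passing a full path as the script's nona argument is usually the safer choice.

```python
import os
import stat


def check_executable(path):
    """Return a short status string for a would-be executable at `path`."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        return "is a directory"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


def make_executable(path):
    """Add the execute bits to `path`, similar to `chmod +x`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    # "./nona" is an assumed location next to generate.py.
    print(check_executable(os.path.abspath("./nona")))
```

A native executable on macOS or Linux typically has no file extension at all, so a plain file named nona is what to look for.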
I'm trying to use pytesseract for OCR.
I have installed google tesseract 3.03
I have installed pytesseract 0.1.6
I am running Python 3.5.1
I am running Windows 8
Tesseract is also in my path (I can call it from anywhere in a normal CMD and it will return the help function)
And this is the code I try to execute:
try:
import Image
except ImportError:
from PIL import Image
import pytesseract
im=Image.open('C:/Users/NeusAap/Google Drive/School/Jaar 1/Periode 1/Programming/Miniproject/GarageProject/scripts/test.png')
print(pytesseract.image_to_string(im))
But it returns this error:
Traceback (most recent call last):
File "C:/Users/NeusAap/Google Drive/School/Jaar 1/Periode 1/Programming/Miniproject/GarageProject/scripts/main.py", line 8, in <module>
print(pytesseract.image_to_string(im))
File "C:\Users\NeusAap\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
config=config)
File "C:\Users\NeusAap\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
stderr=subprocess.PIPE)
File "C:\Users\NeusAap\AppData\Local\Programs\Python\Python35-32\lib\subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "C:\Users\NeusAap\AppData\Local\Programs\Python\Python35-32\lib\subprocess.py", line 1224, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
Process finished with exit code 1
I know that both tesseract and pytesseract work because if I run this from CMD:
python pytesseract.py -l eng+nld test.png
It does work, and returns me the characters as expected.
What am I doing wrong?
Thanks in advance!
Mats de Waard
I finally got it working. It seems everything was set up correctly and I was calling everything correctly, but I needed to reboot Windows before Python could find the files.
I forgot that Windows debugging always starts with a reboot :P
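A likely reason a restart helps is that a process receives its environment variables when it starts, so programs launched before PATH was edited keep the old value. As a sketch of a workaround that avoids restarting, the directory can be added for the current Python process only. The function name and the install directory below are hypothetical.

```python
import os


def prepend_to_path(directory, environ=None):
    """Put `directory` at the front of PATH in `environ` (default: os.environ).

    Only the current process and the children it starts afterwards see the
    change; the system-wide setting is left alone.
    """
    env = os.environ if environ is None else environ
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


# Example use (hypothetical install directory), before calling pytesseract:
#   prepend_to_path(r"C:\Program Files (x86)\Tesseract-OCR")
```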
First of all, I did everything mentioned here: pytesseract-no such file or directory error
It still doesn't work. I'm now using the PyCharm IDE with the following code:
from PIL import Image
import pytesseract
import subprocess
im = Image.open('test.png')
im.show()
subprocess.call(['tesseract','test.png','out'])
print pytesseract.image_to_string(Image.open('test.png'))
im.show() opens the image successfully.
subprocess.call() with tesseract test.png out also extracts the text from the image.
But pytesseract.image_to_string() fails.
I don't get it. Why am I able to use tesseract in the shell but not in Python? And in Python I can open the same image, but when it is used with tesseract the image can't be found.
Below you can see the error output.
File "/home/hamza-c/Schreibtisch/Android/JioShare/orc.py", line 7, in <module>
print pytesseract.image_to_string(Image.open('/home/hamza-c/Schreibtisch/Android/JioShare/test.png'))
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 162, in image_to_string
config=config)
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 95, in run_tesseract
stderr=subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1340, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
I tested the code you mentioned in your question. It works fine. I was facing the same error:
No such file or directory found
The problem was that the directory containing 'tesseract.exe' had not been added to the PATH environment variable. You should be able to run the command 'tesseract' in a command prompt.
If tesseract is not installed, you can download it from the tesseract wiki (https://github.com/tesseract-ocr/tesseract/wiki). For Windows, use the third-party installer available here.
Maybe you need to install tesseract. If your OS is CentOS, enter:
yum install tesseract
I've used the following command and it worked for me:
brew install tesseract
I solved my own question.
im = Image.open('test.png')
print pytesseract.image_to_string(im)
It's still unclear why it works when a reference is passed but not when I open the image directly inside the argument.