PyCharm can't find Tesseract on Mac - python

I'm starting out with some basic python tutorials with OpenCV, and the first tutorial uses Tesseract, Pytesseract, and OpenCV. I have Tesseract downloaded and pip installed, and I have Pytesseract and OpenCV downloaded, installed, and included in my PyCharm packages, so I think the problem is how I'm addressing the Tesseract file in my code, since I'm new to using a Mac.
(I'm using Python 3.8, but also have Python 2.7 installed, because I needed it to get to this point. Weirdly enough, up to this point, the code only ran without error if I had Python 2.7 installed, but had 3.8 as my PyCharm interpreter.)
When I put Tesseract into my terminal, it tells me that the file address is simply 'Applications/tesseract'. But when I use this as the address in PyCharm, I get the error message below. If anyone could help me figure out how to handle this error, I would appreciate it a lot!!! (I'm new to everything computers, btw. This is how I'm learning.)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/george/PycharmProjects/pythonProject2/main.py", line 7, in <module>
print(pytesseract.image_to_string(img))
File "/Users/george/Library/Python/3.8/lib/python/site-packages/pytesseract/pytesseract.py", line 370, in image_to_string
return {
File "/Users/george/Library/Python/3.8/lib/python/site-packages/pytesseract/pytesseract.py", line 373, in <lambda>
Output.STRING: lambda: run_and_get_output(*args),
File "/Users/george/Library/Python/3.8/lib/python/site-packages/pytesseract/pytesseract.py", line 282, in run_and_get_output
run_tesseract(**kwargs)
File "/Users/george/Library/Python/3.8/lib/python/site-packages/pytesseract/pytesseract.py", line 254, in run_tesseract
raise TesseractNotFoundError()
pytesseract.pytesseract.TesseractNotFoundError: \Applications\tesseract is not installed or it's not in your PATH. See README file for more information."
I don't know what to look for in the README file, though.
Here is the code that kicked off the error message:
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = '\\Applications\\tesseract'
img = cv2.imread('im1.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
print(pytesseract.image_to_string(img))
cv2.imshow('Result', img)
cv2.waitKey(0)

On Macs (and other Unix-like OSes), the path separator is the forward slash, not the backslash, so
pytesseract.pytesseract.tesseract_cmd = '/Applications/tesseract'
is more likely to work, provided the executable really lives at that location.
However, do you actually need to set that explicitly? By default pytesseract runs the command tesseract, so it should find Tesseract on its own as long as the executable is on your PATH.
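A minimal sketch of letting Python locate the executable instead of hard-coding a path. It relies on the standard-library shutil.which; the helper name find_tesseract and the fallback locations are assumptions for illustration (Homebrew commonly installs into these directories, but check your own machine).

```python
import os
import shutil

# Hypothetical fallback locations; adjust them to wherever Tesseract lives on your system.
FALLBACK_PATHS = ["/opt/homebrew/bin/tesseract", "/usr/local/bin/tesseract"]


def find_tesseract(fallbacks=FALLBACK_PATHS):
    """Return a path to the tesseract executable, or None if none is found."""
    # First ask the OS the same way a shell would: search the PATH.
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise try the explicit candidates, keeping only executable files.
    for candidate in fallbacks:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


path = find_tesseract()
if path is None:
    print("tesseract was not found; it may not be installed")
else:
    print("tesseract found at", path)
    # With pytesseract installed you could then set:
    # pytesseract.pytesseract.tesseract_cmd = path
```

If this prints a path, that string is what tesseract_cmd should point to; if it prints nothing useful, the executable itself is probably missing rather than mis-addressed.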

Related

Problem occurred when using PyTesseract to recognize text from an image

I was trying to make an automatic login program for our school website, which requires recognizing text from a captcha code. So I installed pytesseract from pip, and ran the program in PyCharm: (the image is in the directory /Users/macintosh/Documents/PythonOutputs/2.jpg)
import pytesseract
from PIL import Image
image = Image.open("/Users/macintosh/Documents/PythonOutputs/2.jpg")
text = pytesseract.image_to_string(image)
print(text)
But this error occurred:
Traceback (most recent call last):
File "/Users/macintosh/Library/Preferences/PyCharmCE2018.2/scratches/scratch_3.py", line 5, in <module>
text = pytesseract.image_to_string(image)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pytesseract/pytesseract.py", line 294, in image_to_string
return run_and_get_output(*args)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pytesseract/pytesseract.py", line 202, in run_and_get_output
run_tesseract(**kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pytesseract/pytesseract.py", line 178, in run_tesseract
raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (2, 'Usage: python pytesseract.py [-l lang] input_file')
What's the problem?
Well, although your error message is not crystal clear, I bet (judging from your actions) that you haven't installed Tesseract itself.
The pytesseract documentation states:
Python-tesseract is a wrapper for Google’s Tesseract-OCR Engine.
So you also need to install the actual program (Tesseract itself) to do the job.
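One way to tell whether the Tesseract program itself is present, as opposed to the pytesseract wrapper, is to ask the executable for its version. This is only a sketch: the helper name binary_version is hypothetical, and it assumes the program is called tesseract and accepts --version.

```python
import subprocess


def binary_version(command="tesseract"):
    """Return the first line printed by `command --version`, or None if it cannot be run."""
    try:
        proc = subprocess.run(
            [command, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr in case the version is printed there
            universal_newlines=True,
        )
    except OSError:
        # Raised when the executable is missing or cannot be executed.
        return None
    lines = proc.stdout.strip().splitlines()
    return lines[0] if lines else ""


version = binary_version()
if version is None:
    print("The tesseract program is not installed or is not on the PATH.")
else:
    print("Found:", version)
```

A result of None means pip installed only the wrapper and the OCR engine still has to be installed separately.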

Emacs python not able to find package/module

Problem
My tesseract (tesserocr) is not found by the emacs python interpreter, but I am able to use tesseract on the terminal as well as in my Spyder installation. Emacs python interpreter is able to import pytesseract, but not find tesserocr. I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/eghx/agent18/project-gym/tests/thresholding.py", line 34, in image_to_string2
print(image_to_string(img_open))
File "/home/eghx/anaconda3/lib/python3.6/site-packages/pytesseract-0.1.7-py3.6.egg/pytesseract/pytesseract.py", line 122, in image_to_string
File "/home/eghx/anaconda3/lib/python3.6/site-packages/pytesseract-0.1.7-py3.6.egg/pytesseract/pytesseract.py", line 46, in run_tesseract
File "/home/eghx/anaconda3/lib/python3.6/subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "/home/eghx/anaconda3/lib/python3.6/subprocess.py", line 1344, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'tesseract': 'tesseract'
when I run
pytesseract.image_to_string(img)
However I don't get this error when I open EMACS from a terminal instead of the desktop. It appears that the path variable is inherited differently in the desktop version and terminal version of emacs. ODD!
Explanation
I have anaconda installation here:/path/to/anaconda3
I have added this line to my init file to run this particular python installation
(setq python-shell-interpreter "/path/to/anaconda3/bin/python")
I installed both pytesseract and tesserocr using conda install
which tesseract gives:
/path/to/anaconda3/bin/tesseract
$ echo $PATH gives:
/path/to/anaconda3/bin:/usr/local/sbin:/usr/lo....
What I did
I copied the sys.path from the working Spyder IDE to emacs python interpreter and still didn't work.
I looked around and found this but the top answer does not pertain to my case, as my $PATH variable contains the necessary path.
Can someone guide me? I am a noob. I have emacs 27 and ubuntu 16 and conda 4.5.0.
This is a possible duplicate of OSError: [Errno 2] No such file or directory using pytesser
Answer was found as per the 3rd point in the link, quoted below:
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'path-to-tesseract-including-bin'
In my case,
import pytesseract
pytesseract.pytesseract.tesseract_cmd = '/home/anaconda3/bin/tesseract'
This is only a temporary hack to get image_to_string to work, by typing the above in every file.
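Why having /home/anaconda3/bin in the $PATH variable is not enough to get it to work is not known for certain. Since Emacs works when started from a terminal, the desktop-launched Emacs most likely does not inherit the shell's $PATH. This seems to be a slightly long-term temporary solution.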
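To check what the interpreter started by Emacs actually sees, and to work around it once instead of in every call, something like the following could be run at the top of a script. It is a sketch under the assumption that the cause is a different PATH in the Python process; the helper name prepend_to_path and the example directory are illustrative, not part of pytesseract.

```python
import os
import shutil


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH in `env` (default: os.environ) if it is missing."""
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


# Diagnostic: compare this output between Emacs started from the desktop and from a terminal.
print("PATH seen by this interpreter:", os.environ.get("PATH", ""))
print("tesseract resolves to:", shutil.which("tesseract"))

# Possible workaround (the directory is an assumption; use the one reported by `which tesseract`):
# prepend_to_path("/path/to/anaconda3/bin")
```

Child processes started afterwards inherit the modified environment, so the executable lookup should then succeed without setting tesseract_cmd in every file.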

Moviepy OSError Exec format error - Missing Shebang?

I am attempting to use MoviePy with Python 3.2.3 on Raspbian.
I have installed it (for Python 2.7, 3.2 and 3.5... long story) and the line
from moviepy.editor import *
works fine.
When I try
clip = VideoFileClip("vid.mov")
which is the most basic command, it gives the error
Traceback (most recent call last):
File "/home/pi/QuickFlicsPics/moviepytest.py", line 8, in <module>
clip = VideoFileClip("vid.mov")
File "/usr/local/lib/python3.2/dist-packages/moviepy/video/io/VideoFileClip.py", line 55, in __init__
reader = FFMPEG_VideoReader(filename, pix_fmt=pix_fmt)
File "/usr/local/lib/python3.2/dist-packages/moviepy/video/io/ffmpeg_reader.py", line 32, in __init__
infos = ffmpeg_parse_infos(filename, print_infos, check_duration)
File "/usr/local/lib/python3.2/dist-packages/moviepy/video/io/ffmpeg_reader.py", line 237, in ffmpeg_parse_infos
proc = sp.Popen(cmd, **popen_params)
File "/usr/lib/python3.2/subprocess.py", line 745, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.2/subprocess.py", line 1371, in _execute_child
raise child_exception_type(errno_num, err_msg)
OSError: [Errno 8] Exec format error
I have researched this error, and it appears to be something to do with a shebang line missing somewhere. Is this correct, if so, how do I go about finding where it is missing, and what do I add?
Thanks
Edit:
As per cxw's comment, I installed moviepy using the command
pip-3.2 install moviepy
(I may have used 'sudo' as well)
FFMPEG was supposed to download automatically when I first used moviepy:
MoviePy depends on the software FFMPEG for video reading and writing. You don’t need to worry about that, as FFMPEG should be automatically downloaded/installed by ImageIO during your first use of MoviePy (it takes a few seconds). If you want to use a specific version of FFMPEG, follow the instructions in file config_defaults.py.
[Quote from installation guide here]
Manually download ffmpeg, then before running your Python code, do
export FFMPEG_BINARY=path/to/ffmpeg
at the shell/terminal prompt.
As far as I can tell from the source, the automatic download of ffmpeg does not know about Raspberry Pis. The auto-download code pulls from the imageio github repo, which only knows "linux32" vs. "linux64". It doesn't look like it has an ARM-linux option. When the ARM kernel sees a non-ARM image, it throws the error you see.
Rather than using the environment variable, you can edit your moviepy config_defaults.py file to specify FFMPEG_BINARY = r"/path/to/ffmpeg".
Edit: to find the path/to/ffmpeg after installing it with apt-get, run
dpkg -L ffmpeg | grep bin
at the shell/terminal prompt. It will probably be in /bin or /usr/bin, and will probably be called ffmpeg or ffmpeg-x.xx (with some version number).
Thanks to this answer for dpkg
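The environment variable can also be set from Python itself, as long as that happens before moviepy is imported. The sketch below assumes a system-wide ffmpeg has already been installed and that shutil.which is available (Python 3.3 or later, so not the 3.2 interpreter from the question); the helper name configure_ffmpeg is made up for this example.

```python
import os
import shutil


def configure_ffmpeg(env=None, binary_name="ffmpeg"):
    """Set FFMPEG_BINARY in `env` to an ffmpeg found on PATH, unless it is already set.

    Returns the configured path, or None if no binary was found.
    """
    env = os.environ if env is None else env
    if env.get("FFMPEG_BINARY"):
        return env["FFMPEG_BINARY"]  # respect an explicit setting
    path = shutil.which(binary_name)
    if path:
        env["FFMPEG_BINARY"] = path
    return path


ffmpeg_path = configure_ffmpeg()
print("FFMPEG_BINARY:", ffmpeg_path)

# Import moviepy only after the variable is set, for example:
# from moviepy.editor import VideoFileClip
```

If this prints None, no ffmpeg was found on the PATH and the variable is left unset, so the automatic download would still be attempted.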

Pytesseract No such file or directory error

First of all I did everything mentioned here pytesseract-no such file or directory error
Still doesn't work. Now I'm using Pycharm IDE with following code:
from PIL import Image
import pytesseract
import subprocess
im = Image.open('test.png')
im.show()
subprocess.call(['tesseract','test.png','out'])
print pytesseract.image_to_string(Image.open('test.png'))
im.show() opens the image successfully.
subprocess.call() with tesseract test.png out also extracts the text
from the image..
but pytesseract.image_to_string() fails.
I don't get it. Why am I able to use tesseract in the shell but not in Python? And in Python I can open the same image, but when it is used with Tesseract the image can't be found.
Below you can see the error output.
File "/home/hamza-c/Schreibtisch/Android/JioShare/orc.py", line 7, in <module>
print pytesseract.image_to_string(Image.open('/home/hamza-c/Schreibtisch/Android/JioShare/test.png'))
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 162, in image_to_string
config=config)
File "/usr/local/lib/python2.7/dist-packages/pytesseract/pytesseract.py", line 95, in run_tesseract
stderr=subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1340, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
I tested the code you mentioned in your question. It works fine. I was facing the same error
No such file or directory found
The problem was that the directory containing 'tesseract.exe' had not been added to the PATH environment variable. You should be able to run the command 'tesseract' in a command prompt.
If Tesseract is not installed, you can download it from https://github.com/tesseract-ocr/tesseract/wiki; for Windows, use a third-party installer.
You may need to install Tesseract. If your OS is CentOS, run:
yum install tesseract
I've used the following command and it worked for me:
brew install tesseract
I solved my own question.
im = Image.open('test.png')
print pytesseract.image_to_string(im)
It's still unclear why it works when the opened image is first assigned to a variable but not when I open the image directly inside the call.

Python: PIL - [Errno 32] Broken pipe when saving .png

What I'm trying to do here is save the contents of a Tkinter Canvas as a .png image using PIL.
This is my save function ('graph' is the canvas).
def SaveAs():
filename = tkFileDialog.asksaveasfilename(initialfile="Untitled Graph", parent=master)
graph.postscript(file=filename+".eps")
img = Image.open(filename+".eps")
img.save(filename+".png", "png")
But it's getting this error:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python27\lib\lib-tk\Tkinter.py", line 1410, in __call__
return self.func(*args)
File "C:\Users\Adam\Desktop\Graphing Calculator\Graphing Calculator.py", line 352, in SaveAs
img.save(filename+".png", "png")
File "C:\Python27\lib\site-packages\PIL\Image.py", line 1406, in save
self.load()
File "C:\Python27\lib\site-packages\PIL\EpsImagePlugin.py", line 283, in load
self.im = Ghostscript(self.tile, self.size, self.fp)
File "C:\Python27\lib\site-packages\PIL\EpsImagePlugin.py", line 72, in Ghostscript
gs.write(s)
IOError: [Errno 32] Broken pipe
I'm running this on Windows 7, Python 2.7.1.
How do I make this work?
I just got the same error and have solved it now.
Do the following after installing PIL and Ghostscript:
1) Open C:\Python27\Lib\site-packages\PIL\EpsImagePlugin.py
2) Change the code near line 50 so that it looks like this:
# Build ghostscript command
command = ["gswin32c",
"-q", # quiet mode
"-g%dx%d" % size, # set output geometry (pixels)
"-dNOPAUSE -dSAFER", # don't pause between pages, safe mode
"-sDEVICE=ppmraw", # ppm driver
"-sOutputFile=%s" % file,# output file
"-"
]
Make sure that gswin32c.exe is in the PATH
good luck
It looks like the Ghostscript executable is erroring out and then closing the connection. Others have had this same problem on different OSes.
So, first I would recommend that you confirm that PIL is installed correctly--see the FAQ page for hints. Next, ensure that Ghostscript is installed and working. Lastly, ensure that Python can find Ghostscript, for example by running a PIL script that works elsewhere.
Oh, also--here are some tips on catching the broken pipe error so your program can be more resilient, recognize the problem, and warn the end-user. Hope that helps!
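As a rough illustration of catching the broken pipe error, the wrapper below turns errno 32 (EPIPE) into a readable message and re-raises anything else. The function name save_with_pipe_check and the wording of the message are assumptions for this sketch, not part of PIL.

```python
import errno


def save_with_pipe_check(save_func, *args, **kwargs):
    """Call save_func; return (True, None) on success or (False, message) on a broken pipe."""
    try:
        save_func(*args, **kwargs)
    except (IOError, OSError) as exc:
        if exc.errno == errno.EPIPE:
            # The helper process (here presumably Ghostscript) closed its end of the pipe.
            return False, "Broken pipe: check that Ghostscript is installed and on the PATH."
        raise  # any other I/O error is not handled here
    return True, None


# Possible use inside SaveAs(), with img and filename as in the question:
# ok, message = save_with_pipe_check(img.save, filename + ".png", "png")
# if not ok:
#     print(message)
```

This does not fix the underlying problem; it only lets the program report it instead of dying inside the Tkinter callback.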
I have realized that while Python 2.7 has this EpsImagePlugin.py, Anaconda also has it, and unfortunately Anaconda's file is an older version. Unfortunately again, when you run your code from the Spyder environment, it picks up the EpsImagePlugin.py file from the Anaconda folder.
So I was getting a similar broken pipe error.
When I went into the Python 2.7 directory, opened a Python console, and then ran my code, it ran just fine.
That is because the latest EpsImagePlugin.py file takes the Windows environment and the appropriate Ghostscript exe files into account. Hope this helps.
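To see which copy of a module a given environment would actually load, you can ask the import system for its file location. A small sketch using the standard importlib; the helper name module_origin is hypothetical, and the PIL call is commented out because it assumes PIL is installed.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the module has no usable spec.
        return None
    return spec.origin if spec is not None else None


# Works for any importable module; json is used here only because it is always available.
print(module_origin("json"))

# In the situation described above, run this in both Spyder and the plain console:
# print(module_origin("PIL.EpsImagePlugin"))
```

If the two environments print different paths, they are loading different copies of the plugin.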
