Running the sample code in pytesseract - python

I am running Python 2.6.6 and want to use the pytesseract package. After extraction and installation, I can call pytesseract from the command line. However, I want to run Tesseract from within Python. I have the following code (ocr.py):
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract
print(pytesseract.image_to_string(Image.open('test.png')))
print(pytesseract.image_to_string(Image.open('test-european.jpg'), lang='fra'))
When I run the code with python ocr.py, I get the following output:
Traceback (most recent call last):
File "ocr.py", line 6, in <module>
print(pytesseract.image_to_string(Image.open('test.png')))
File "/pytesseract-0.1.6/build/lib/pytesseract/pytesseract.py", line 164, in image_to_string
raise TesseractError(status, errors)
pytesseract.TesseractError: (2, 'Usage: python tesseract.py [-l language] input_file')
test.png and test-european.jpg are in the working directory. Can someone help me run this code?
I have tried the following:
Adjusted the tesseract_cmd to 'pytesseract'
Installed tesseract-ocr
Any help is appreciated, as I have been trying to solve this problem for hours.

tesseract_cmd should point to the command line program tesseract, not pytesseract.
For instance on Ubuntu you can install the program using:
sudo apt install tesseract-ocr
And then set the variable to just tesseract or /usr/bin/tesseract.
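As a minimal sketch of that advice, the snippet below looks for the tesseract binary and, if both the binary and the pytesseract package are available, assigns its path to pytesseract.pytesseract.tesseract_cmd. The helper name find_tesseract and the candidate list are assumptions made for illustration, and shutil.which requires Python 3.3 or later, which is newer than the interpreter in the question.

```python
import shutil


def find_tesseract(candidates=("tesseract", "/usr/bin/tesseract")):
    """Return the first candidate that resolves to an executable, else None.

    Hypothetical helper: the candidate names mirror the two values
    mentioned in the answer above.
    """
    for candidate in candidates:
        path = shutil.which(candidate)
        if path:
            return path
    return None


tesseract_path = find_tesseract()

try:
    import pytesseract
except ImportError:
    pytesseract = None  # the package is not installed in this environment

if pytesseract is not None and tesseract_path is not None:
    # Point pytesseract at the Tesseract program, not at pytesseract itself.
    pytesseract.pytesseract.tesseract_cmd = tesseract_path

print("tesseract binary:", tesseract_path)
```

If the printed path is None, the tesseract program itself is probably not installed or not on PATH, which is a separate problem from the Python package.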

Related

Unable to import pdfkit Python 3.9

I am running Python 3.9 on an Ubuntu Linux box and went through the steps to install pdfkit:
pip3 install pdfkit
sudo apt install wkhtmltopdf
The error says:
Traceback (most recent call last):
File "/home/shawn/Development/Websites/MDSova/restapi/app.py", line 8, in <module>
from messaging import Mailbox
File "/home/shawn/Development/Websites/MDSova/restapi/messaging.py", line 8, in <module>
import pdfkit
ModuleNotFoundError: No module named 'pdfkit'
The app was run inside an active environment.
I've read many responses to a similar question; however I have not run across a way to fix it. Does anyone have any suggestions?
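One way to narrow this down, offered only as a diagnostic sketch, is to check which interpreter actually runs the app and whether that interpreter can see pdfkit. The helper name describe_module is an assumption made for illustration.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether a top-level module is importable by this interpreter."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "{} is NOT importable from {}".format(name, sys.executable)
    return "{} found at {}".format(name, spec.origin)


# The interpreter printed here is the one that needs pdfkit installed.
print("interpreter:", sys.executable)
print(describe_module("pdfkit"))
```

If the module is reported as not importable, installing with python3 -m pip install pdfkit from inside the same environment ties the install to that interpreter, whereas a bare pip3 may belong to a different Python.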

PyAutoGui screenshot function weird [duplicate]

I'm trying to use pyautogui's screenshot functions with Python 3.6.5 on OSX 10.11.
>>> import pyautogui
>>> image = pyautogui.screenshot()
I get:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyscreeze/__init__.py", line 331, in _screenshot_osx
im = Image.open(tmpFilename)
NameError: name 'Image' is not defined
My understanding is that pyscreeze is failing to get the name Image from Pillow for some reason. I tried updating pyautogui (it was already up to date) and then reinstalling it, which brings all its dependencies, including pyscreeze and Pillow, along with it.
I found this question with the same issue, but the fix that worked there (reinstalling) isn't working for me.
Install Pillow, since Image is a module from PIL:
pip install Pillow
pip3 uninstall pyautogui
pip3 uninstall Pillow
Then reinstall the modules and restart your editor.
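One plausible mechanism for this NameError, stated here as an assumption rather than a description of pyscreeze's actual source, is an import wrapped in try/except that swallows the failure, so the name is simply never bound. The sketch below reproduces that pattern with a deliberately missing, hypothetical package name.

```python
# Guarded import: if the package is missing, the error is swallowed
# and the name 'Image' is never defined in this module.
try:
    from package_that_is_not_installed_xyz import Image  # hypothetical name
except ImportError:
    pass


def open_screenshot(path):
    # Fails with NameError (not ImportError) when the import above failed.
    return Image.open(path)


message = None
try:
    open_screenshot("screenshot.png")
except NameError as exc:
    message = str(exc)

print(message)
```

Running python3 -c "from PIL import Image" with the same interpreter that runs pyautogui shows whether Pillow itself imports cleanly; if it does not, the real error appears there instead of being hidden.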

Python error in opening image with PIL Image.Open()

I am trying to do some studies and automation related to image metadata.
from PIL import Image
Image.open("/Users/carlo/Desktop/JPEG 2/DSC_0393.jpeg")
This is the error that I am receiving:
Traceback (most recent call last):
File "/Users/carlo/PythonProjects/ImageMetaData_00/main.py", line 1, in <module>
from PIL import Image
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/PIL/Image.py", line 114, in <module>
from . import _imaging as core
ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/PIL/_imaging.cpython-310-darwin.so, 2): no suitable image found. Did find:
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/PIL/_imaging.cpython-310-darwin.so: mach-o, but wrong architecture
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/PIL/_imaging.cpython-310-darwin.so: mach-o, but wrong architecture
I am using Python 3.10, not sure what I am missing. Thanks!
It’s telling you you’ve got a version of PIL downloaded/installed, but it’s not suitable for your computer architecture. You’re probably on an M1 Mac instead of an Intel one. To fix this try these:
pip3 install wheel
pip3 install --no-cache-dir pillow
If that doesn't work, you can try to switch to using python via Rosetta.
Go to the Application folder -> Right-click on Terminal App -> Get Info
Tick Open with Rosetta option.
Also try reinstalling it: pip3 install pillow.
If all else fails try downgrading python and see if anything clicks.
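Before reinstalling, it can help to confirm which architecture the running interpreter reports, since the "wrong architecture" message means the compiled extension does not match it. This is a small diagnostic sketch; the helper name describe_interpreter is an assumption made for illustration.

```python
import platform
import sys


def describe_interpreter():
    """Collect the interpreter path and the architecture it runs as."""
    return {
        "executable": sys.executable,
        "machine": platform.machine(),  # e.g. 'arm64' or 'x86_64'
        "system": platform.system(),
    }


info = describe_interpreter()
for key, value in info.items():
    print("{}: {}".format(key, value))
```

On an Apple Silicon Mac, a natively running Python typically reports arm64, while one started under Rosetta typically reports x86_64; Pillow should be installed with the same interpreter, and in the same mode, that will later import it.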

Using pytesseract on Python 2.7 and Windows XP [duplicate]

I need OCR for a certain project. After searching online, I decided to use Python and Tesseract. Right now I am trying to run the following code just to see if it works:
import pytesseract
from PIL import Image
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
print(pytesseract.image_to_string(Image.open(r"C:\Documents and Settings\Yerutnik\Desktop\file.bmp")))
However, I am getting the following error:
Traceback (most recent call last):
File "C:\Documents and Settings\Yerutnik\Desktop\test1.py", line 2, in <module>
import pytesseract
File "C:\Python27\lib\site-packages\pytesseract\__init__.py", line 2, in <module>
from .pytesseract import ALTONotSupported
File "C:\Python27\lib\site-packages\pytesseract\pytesseract.py", line 89
f"{tesseract_cmd} is not installed or it's not in your PATH."
^
SyntaxError: invalid syntax
I am running this on a Windows XP 32bit machine (must use this machine), Python 2.7.9, Tesseract 4.0.0 (tested working separately in cmd, and I checked that it is in PATH).
You are using a pytesseract release written for Python 3 (f-strings are a Python 3.6+ feature) under Python 2.7. Try an older version of pytesseract.
I was able to fix this by downgrading pytesseract (as suggested by user898678) from 4.0.0 to 0.2.2, upgrading pip from 1.5.2 to 20, and installing pytesseract from the web instead of using a wheel file.
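To illustrate why the line in the traceback fails to parse on Python 2.7, here is a small sketch. It checks whether an interpreter version supports f-strings and builds the same message with str.format, which works on both Python 2.7 and 3.x. Both function names are assumptions made for illustration and are not part of pytesseract.

```python
import sys


def supports_fstrings(version_info=sys.version_info):
    """f-strings were added in Python 3.6; older parsers reject them."""
    return tuple(version_info[:2]) >= (3, 6)


def not_installed_message(tesseract_cmd):
    # str.format is valid syntax on both Python 2.7 and Python 3.
    return "{} is not installed or it's not in your PATH.".format(tesseract_cmd)


print(supports_fstrings((2, 7, 9)))
print(not_installed_message("tesseract"))
```

Because an f-string is a syntax error on older interpreters, the failure happens when the module is compiled at import time, before any of your own code runs. This is why the traceback points inside pytesseract.py rather than at the script.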

why do I have a traceback with Image class?

I wrote this code:
import Image
im = raw_input("Insert Image file: ")
handle = Image.open(im)
print handle.size
to read an Image file and print its size, but when I run this code I get a traceback:
Traceback (most recent call last):
File "image.py", line 1, in <module>
import Image
P.S. I wrote the program on Mac OS X, if that matters.
You most likely need to install PIL or Pillow. Here is the installation guide which can be summarized as:
Install Brew:
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
Install some dependencies for Pillow:
brew install libtiff libjpeg webp little-cms2
Install Pillow:
pip install Pillow
And change:
import Image
To:
from PIL import Image
If you name your file image.py, and then try to import Image there is a high risk that Python tries to import your own file and that all that ends in a stack overflow (the bad event, not the nice site).
Never give your files names that clash with modules from the Python standard library or from packages you have installed.
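As a rough sketch of that naming check, the helper below compares a script's file name with the module it intends to import. The function name shadows_import is hypothetical, and the default case-insensitive comparison is a deliberately conservative assumption; whether Python really picks up the local file depends on the platform and its settings.

```python
import os


def shadows_import(script_path, module_name, case_sensitive=False):
    """Return True if the script's base name matches the imported module name."""
    stem = os.path.splitext(os.path.basename(script_path))[0]
    if case_sensitive:
        return stem == module_name
    return stem.lower() == module_name.lower()


print(shadows_import("image.py", "Image"))       # risky name
print(shadows_import("read_size.py", "Image"))   # safe name
```

Renaming the script to something like read_size.py removes the ambiguity regardless of how the filesystem treats case.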
