Running pytesseract in Sagemaker Jupyter notebook - python

I want to use pytesseract in my Sagemaker Jupyter notebook.
I am following this tutorial for installing pytesseract. After running pip install:
!pip install pytesseract
Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
Requirement already satisfied: pytesseract in /home/ec2-user/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages (0.3.10)
Requirement already satisfied: Pillow>=8.0.0 in /home/ec2-user/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages (from pytesseract) (9.0.1)
Requirement already satisfied: packaging>=21.3 in /home/ec2-user/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages (from pytesseract) (21.3)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /home/ec2-user/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages (from packaging>=21.3->pytesseract) (3.0.6)
The tutorial indicates I should add the tesseract executable to my PATH, but I don't know where pip installs this executable:
# If you don't have tesseract executable in your PATH, include the following:
pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'
If I try to run pytesseract without this, I get an error message:
from PIL import Image
import pytesseract
print(pytesseract.image_to_string(Image.open(testimage)))
results in:
~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/pytesseract/pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice, timeout)
258 raise
259 else:
--> 260 raise TesseractNotFoundError()
261
262 with timeout_manager(proc, timeout) as error_string:
TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.
I was able to find the pytesseract installation here:
/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/pytesseract
However, when I set tesseract_cmd to that location and run the same code, I get:
PermissionError: [Errno 13] Permission denied: '/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/pytesseract'
My question is related to this question but distinct from it: I am getting a permission denied error when I link to the tesseract binary.
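One detail worth noting about the attempt above: the path under site-packages is the pytesseract Python package directory, not the tesseract program itself, and pip does not install that program. Below is a minimal sketch of how one might look the executable up before setting tesseract_cmd, assuming the tesseract binary has been installed separately through the system's package manager. The helper name find_tesseract is hypothetical.

```python
import shutil


def find_tesseract(name="tesseract"):
    """Return the full path of the executable found on PATH, or None."""
    # shutil.which searches PATH the same way the shell would.
    return shutil.which(name)


tesseract_path = find_tesseract()
if tesseract_path is None:
    print("tesseract binary not found on PATH; it has to be installed separately")
else:
    print("tesseract binary found at:", tesseract_path)
    # Assumption: pytesseract is importable in this environment.
    # import pytesseract
    # pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

If the lookup returns None, no value assigned to tesseract_cmd will help until the binary itself is installed.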

Related

Pytesseract keeps on getting syntax or path error

I'm trying to run some Python code on my Raspberry Pi, which I got from a tutorial. I followed everything there and installed the packages and libraries needed to run the script.
The first one:
sudo apt-get install tesseract-ocr
And I get the following messages:
Reading package lists... Done
Building dependency tree
Reading state information... Done
tesseract-ocr is already the newest version (4.0.0-2).
The following package was automatically installed and is no longer required:
  python-colorzero
Use 'sudo apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
Proof here:Screen shot 1
The second one is this:
pip3 install pytesseract
I changed pip to pip3, since I read somewhere that pip is already obsolete, and I got these messages:
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: pytesseract in /usr/local/lib/python3.7/dist-packages (0.3.8)
Requirement already satisfied: Pillow in /usr/lib/python3/dist-packages (from pytesseract) (5.4.1)
Proof here:Screenshot 2
And lastly, in order to run the script, I installed this:
pip3 install pillow
As above, I used pip3 to install pillow, and I got these messages:
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: pillow in /usr/lib/python3/dist-packages (5.4.1)
Proof here:Screenshot 3
After installing all the necessary requirements, I tried to run this simple script from the tutorial.
import pytesseract
from PIL import Image
import cv2
img = cv2.imread('platen5.jpeg', cv2.IMREAD_COLOR)  # Open the image from which characters are to be recognized
# img = cv2.resize(img, (620, 480))  # Resize the image if required
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale to reduce detail
gray = cv2.bilateralFilter(gray, 11, 17, 17)  # Blur to reduce noise
original = pytesseract.image_to_string(gray, config='')
# test = pytesseract.image_to_data(gray, lang=None, config='', nice=0)  # Get confidence levels if required
# print(pytesseract.image_to_boxes(gray))
print(original)
Unfortunately, I still get a syntax error. The only thing I changed from the original script is 'platen5.jpeg', which is the saved image of a plate number that I have in my files.
Here's the error:
Traceback (most recent call last):
  File "tut.py", line 1, in <module>
    import pytesseract
  File "/home/pi/pytesseract/__init__.py", line 2, in <module>
    from .pytesseract import ALTONotSupported
  File "/home/pi/pytesseract/pytesseract.py", line 88
    f"{tesseract_cmd} is not installed or it's not in your PATH."
SyntaxError: invalid syntax
Proof here: Screenshot 4
I am sorry, I'm fairly new to Python and the Raspberry Pi, so I depend heavily on tutorials from the internet and have no idea which part I got wrong. Am I missing something? Or is everything in the tutorial already obsolete, so it doesn't work?
Thanks.
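Two details in the traceback may be worth checking. The line it stops on is an f-string, which is only valid syntax from Python 3.6 onward, and the module is loaded from /home/pi/pytesseract rather than from the dist-packages directory that pip3 reported. Below is a rough diagnostic sketch, assuming it is run from the same directory and with the same command as tut.py. The helper name supports_fstrings is made up for illustration.

```python
import sys


def supports_fstrings(version_info=sys.version_info):
    """f-strings were added in Python 3.6."""
    return tuple(version_info[:2]) >= (3, 6)


# %-formatting is used so this file itself parses on older interpreters.
print("Interpreter: %s" % sys.executable)
print("Version: %s" % sys.version.split()[0])

if supports_fstrings():
    import importlib.util

    # find_spec locates a module without importing it.
    spec = importlib.util.find_spec("pytesseract")
    origin = spec.origin if spec is not None else "not found"
    print("pytesseract would be loaded from: %s" % origin)
else:
    print("This interpreter predates f-strings (Python 3.6)")
```

If the reported location is a local directory next to the script, that copy takes precedence over the one pip3 installed.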

Google Colab No Such File or Directory Error

Hello, I'm trying to start a TensorFlow training process on Google Colab. I'm trying to run this code block in the integrated notebook:
!apt-get install protobuf-compiler python-pil python-lxml python-tk
!pip install Cython
%cd '/content/gdrive/My Drive/models/research/'
!protoc object_detection/protos/*.proto --python_out=.
import os
os.environ['PYTHONPATH'] += ':/content/gdrive/My Drive/models/research/:/content/gdrive/My Drive/models/research/slim'
!python setup.py build
!python setup.py install
It gives this output:
Reading package lists... Done
Building dependency tree
Reading state information... Done
protobuf-compiler is already the newest version (3.0.0-9.1ubuntu1).
python-lxml is already the newest version (4.2.1-1ubuntu0.4).
python-pil is already the newest version (5.1.0-1ubuntu0.6).
python-tk is already the newest version (2.7.17-1~18.04).
0 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.
Requirement already satisfied: Cython in /usr/local/lib/python3.7/dist-packages (0.29.23)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
/content/gdrive/My Drive/models/research
python3: can't open file 'setup.py': [Errno 2] No such file or directory
python3: can't open file 'setup.py': [Errno 2] No such file or directory
I think I can't set the Python path correctly. Can anyone help me, please?
You have to first mount your Google drive:
from google.colab import drive
drive.mount('/content/gdrive')
Then try running:
import os
os.listdir(os.getcwd())
This should return
['.config', 'sample_data']
before you mount the drive, and
['.config', 'gdrive', 'sample_data']
after you've mounted the drive.
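The same kind of check can be narrowed to the file that the failing command needs. Below is a small sketch that uses the research directory from the question and reports whether setup.py is actually there. The helper name has_setup_py is hypothetical.

```python
import os


def has_setup_py(directory):
    """Return True if `directory` contains a setup.py file."""
    return os.path.isfile(os.path.join(directory, "setup.py"))


# Path taken from the question; it only exists once the drive is mounted.
research_dir = "/content/gdrive/My Drive/models/research"

if not os.path.isdir(research_dir):
    print("Directory not found (is the drive mounted?):", research_dir)
elif not has_setup_py(research_dir):
    print("No setup.py here; directory contains:", sorted(os.listdir(research_dir)))
else:
    print("setup.py found in", research_dir)
```

Running this before `!python setup.py build` separates a missing mount from a missing file.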

Is my python environment broken? Unable to install gastrodon using PIP

Using Python 3.8 on Windows, having installed a number of other modules I have tried to install gastrodon with
(property) C:\Users\andyt>pip install gastrodon
The result is this:
Requirement already satisfied: gastrodon in c:\users\andyt\anaconda3\envs\property\lib\site-packages (0.9.3)
Requirement already satisfied: pandas in c:\users\andyt\anaconda3\envs\property\lib\site-packages (from gastrodon) (1.0.3)
Requirement already satisfied: IPython in c:\users\andyt\anaconda3\envs\property\lib\site-packages (from gastrodon) (7.13.0)
WARNING: No metadata found in c:\users\andyt\anaconda3\envs\property\lib\site-packages
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: 'c:\\users\\andyt\\anaconda3\\envs\\property\\lib\\site-packages\\ipython-7.13.0.dist-info\\METADATA'
Does anyone know how to fix this? I am able to install it in base...
Have you tried turning it on and off again? I mean potentially uninstall and reinstall it. I had a similar problem downloading numpy a while back and that seemed to fix it.
So I renamed metadata.json to metadata in a few similar locations where the install failed one after another, and it appeared to succeed, except I am now dealing with another error when trying to import gastrodon...

No Module named Slugify in Odoo 12

I came across a module from GitHub and I went through the steps to install, but I am getting this error:
Unable to install module because an external dependency is not met: No module named slugify
However, Slugify is installed:
Requirement already satisfied: python-slugify in c:\program files (x86)\python37-32\lib\site-packages (from -r requirements.txt (line 1)) (3.0.3)
Requirement already satisfied: text-unidecode==1.2 in c:\program files (x86)\python37-32\lib\site-packages (from python-slugify->-r requirements.txt (line 1)) (1.2)
I am using the following parameters for testing:
OS: Windows 10 Pro 64 Bit
Odoo 12.0-20181022 (Community Edition)
Can anyone please advise me where I failed?
Thanks in advance for your help.
Open the same python virtual environment that Odoo uses and run:
try:
    import slugify
except ModuleNotFoundError:
    import sys
    print("Module not found under the following directories: %s" % sys.path)
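A variation on the snippet above, sketched under the assumption that it is run with the interpreter Odoo uses. It also prints which interpreter is running, since the pip output in the question refers to the Python under c:\program files (x86)\python37-32, which may or may not be the one Odoo uses. The helper name describe_module is hypothetical.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


print("Running under:", sys.executable)
location = describe_module("slugify")
if location is None:
    print("slugify is not importable from this interpreter")
else:
    print("slugify found at:", location)
```

If the interpreter printed here differs from the one pip installed into, the package would need to be installed for this interpreter as well.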

NLTK | Sentiment Classifier | Issues with Install

I am having trouble installing the sentiment_classifier.
What I have currently done:
pip install sentiment_classifier
python setup.py install
Downloaded sentiment_classifier-0.5.tar.gz
Placed the package into my directory
Error in shell:
pip install sentiment_classifier:
Requirement already satisfied: sentiment_classifier in c:\users\ac\anaconda3\lib\site-packages
Requirement already satisfied: numpy in c:\users\ac\anaconda3\lib\site-packages (from sentiment_classifier)
Requirement already satisfied: nltk in c:\users\ac\anaconda3\lib\site-packages (from sentiment_classifier)
Requirement already satisfied: argparse in c:\users\ac\anaconda3\lib\site-packages (from sentiment_classifier)
python setup.py install - C:\Users\AC\Anaconda3\python.exe: can't open file 'setup.py': [Errno 2] No such file or directory
When I call it in Jupyter Notebook:
from senti_classifier import senti_classifier
I get:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\AC\Anaconda3\lib\site-packages\senti_classifier\data\SentiWn.p'
Any help would be greatly appreciated.
Docs I've been referring to:
https://pypi.python.org/pypi/sentiment_classifier
Sentiment Analysis using senti_classifier and NLTK
https://github.com/kevincobain2000/sentiment_classifier
http://pythonhosted.org/sentiment_classifier/
The package is missing some data files it needs in order to work, and those files are not downloaded when you install it with pip. Download the repository for the library from https://github.com/kevincobain2000/sentiment_classifier, then copy the files inside '/src/senti_classifier/data/' into your installed library's data directory, which is 'C:\Users\AC\Anaconda3\lib\site-packages\senti_classifier\data'.
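The copy step can also be scripted. Below is a minimal sketch, assuming the repository has already been downloaded and unpacked locally. The helper name copy_data_files is hypothetical and the source path in the usage comment is a placeholder.

```python
import shutil
from pathlib import Path


def copy_data_files(src_dir, dest_dir):
    """Copy every regular file in src_dir into dest_dir; return the copied names."""
    src, dest = Path(src_dir), Path(dest_dir)
    # Create the destination directory if the pip install did not.
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(src.iterdir()):
        if item.is_file():
            shutil.copy2(item, dest / item.name)
            copied.append(item.name)
    return copied


# Hypothetical usage; adjust the first path to wherever the repository was unpacked:
# copy_data_files(
#     r"C:\path\to\sentiment_classifier\src\senti_classifier\data",
#     r"C:\Users\AC\Anaconda3\lib\site-packages\senti_classifier\data",
# )
```

The returned list makes it easy to confirm that the file named in the error message was among those copied.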
