OSError: decoder zip not available - python

import pytesseract as pt
from PIL import Image
img = Image.open("C:/Users/Abir Khan/Desktop/IIT B/Untitled.png")
text = pt.image_to_string(img)
print(text)

OS: MacOS Catalina 10.15.7
pip version: 20.2.4
requirements.txt: Pillow==2.8.1
Upgrading Pillow to version 8.0.1 via pip install --upgrade Pillow worked perfectly for me.
I had tried several different approaches, like installing zlib and lzlib through brew, but nothing worked.
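To confirm which Pillow version the running interpreter actually sees, before and after an upgrade, a small standard-library check can help. This is only a sketch: the helper name installed_version is made up for illustration, and it assumes Python 3.8 or later for importlib.metadata.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string, or None if Pillow is not installed
# in the interpreter that runs this script.
print("Pillow:", installed_version("Pillow"))
```

If this prints an old version such as the one pinned in requirements.txt, the upgrade probably went to a different environment than the one running the script.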

It likely only needs the zip decoder to save the jpeg. I think I needed to follow these steps in OS X to preview jpegs.
This might help:
https://stackoverflow.com/a/3544159/8074624

Related

pyautogui.locateCenterOnScreen ERROR "pyscreeze.PyScreezeException: The Pillow package is required to use this function"

I am running into an issue and I am not sure what is causing it.
The code I am running is as follows:
gs_empty_search_btn_x, gs_empty_search_btn_y = pyautogui.locateCenterOnScreen('pictures/shopping/search_bar_empty.png',
confidence=.8)
print(gs_empty_search_btn_x)
For reference:
My OS is Windows
I am using PyCharm Community 2022.2.1
I have PIL (9.2.0) installed
PyAutoGui (0.9.53) installed
I already ran pip install Pillow and pip install --upgrade Pillow and got "Requirement already satisfied" for both
I tried File -> Settings -> Project name -> Python Interpreter -> + -> Type pyautogui -> Install Package
I tried Python Packages > Pillow > uninstalled Pillow > re-installed Pillow, and still got the same error.
I added import PIL and installed Pillow manually; it is still not working
I did pip freeze and got Pillow==9.2.0
Nothing I am doing is working. Please help!
I figured it out! Had to delete everything and start over and delete a few packages that relied on other Python versions
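Since the fix turned out to involve packages tied to other Python versions, it may help to check which interpreter is running and whether it can see Pillow at all. The sketch below uses only the standard library; the function name describe_environment is an illustrative choice, not part of PyAutoGUI or Pillow.

```python
import importlib.util
import sys


def describe_environment(module_name="PIL"):
    """Report which interpreter is running and whether a module is importable."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "python_version": sys.version.split()[0],
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this from inside PyCharm and again from the terminal where pip was used shows whether both point at the same interpreter.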

Unknown OpenCV exception while using EasyOcr

Code:
import easyocr
reader = easyocr.Reader(['en'])
result = reader.readtext('R.png')
Output:
CUDA not available - defaulting to CPU. Note: This module is much faster with a GPU.
cv2.error: Unknown C++ exception from OpenCV code
I would truly appreciate any support!
The new version of OpenCV has some issues. Uninstall the newer version of OpenCV and install the older one using:
pip install opencv-python==4.5.4.60
Install this specific version of OpenCV:
pip install opencv-python==4.5.4.60

Can't use SIFT in Python OpenCV v4.2.0

I am using OpenCV v4.2.0 and the PyCharm IDE. I want to use the SIFT algorithm, but I get this error. I looked for solutions on the internet, but none of them helped. Do you know how to fix this error? (With pip, the lowest version of OpenCV I can install is 3.4.2.16.)
Here is my error:
Traceback (most recent call last):
File "C:/Users/HP/PycharmProjects/features/featuredetect.py", line 7, in
sift = cv.xfeatures2d_SIFT.create()
cv2.error: OpenCV(4.2.0) C:\projects\opencv-python\opencv_contrib\modules\xfeatures2d\src\sift.cpp:1210: error: (-213:The function/feature is not implemented) This algorithm is patented and is excluded in this configuration; Set OPENCV_ENABLE_NONFREE CMake option and rebuild the library in function 'cv::xfeatures2d::SIFT::create'
Here is my code:
import cv2 as cv
image = cv.imread("the_book_thief.jpg")
gray_image = cv.cvtColor(image,cv.COLOR_BGR2GRAY)
sift = cv.xfeatures2d_SIFT.create()
keyPoints = sift.detect(image,None)
output = cv.drawKeypoints(image,keyPoints,None)
cv.imshow("FEATURES DETECTED",output)
cv.imshow("NORMAL",image)
cv.waitKey(0)
cv.destroyAllWindows()
SIFT's patent expired last July. In versions 4.4 and later, the detector init command has changed to cv2.SIFT_create().
If you're not using opencv's GUI, it's recommended to install the headless version: pip install opencv-python-headless
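Because the constructor lives in different places depending on the OpenCV build, one option is a small helper that tries both. This is a sketch under assumptions: create_sift is a hypothetical helper name, it takes an already-imported cv2 module as its argument, and a contrib build compiled without the nonfree option (such as the 4.2.0 build in the question) would still raise cv2.error when the older constructor is called.

```python
def create_sift(cv):
    """Create a SIFT detector from an already-imported cv2 module."""
    # OpenCV 4.4 and later expose SIFT in the main module.
    if hasattr(cv, "SIFT_create"):
        return cv.SIFT_create()
    # Older contrib builds expose it under xfeatures2d instead.
    xfeatures2d = getattr(cv, "xfeatures2d", None)
    if xfeatures2d is not None and hasattr(xfeatures2d, "SIFT_create"):
        return xfeatures2d.SIFT_create()
    raise RuntimeError("This OpenCV build does not provide SIFT")


# Usage, assuming OpenCV is installed:
#   import cv2
#   sift = create_sift(cv2)
#   key_points = sift.detect(gray_image, None)
```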
Unfortunately, according to this Github issue, SIFT is no longer available in opencv > 3.4.2. Since you're using OpenCV v4.2.0, it's not included even if you have installed opencv-contrib-python as shown in this post. A workaround is to downgrade to any previous OpenCV version that includes SIFT (I believe any version below 3.4.3). I've been successful when downgrading to v3.4.2.16.
pip install opencv-python==3.4.2.16
pip install opencv-contrib-python==3.4.2.16
Using your code with v3.4.2.16, SIFT seems to work
I had the same issue previously. I tried every method I could find, and in the end a very simple one that many others have already described worked for me, with one small change.
Step 1:
Uninstall the previously installed opencv library
pip uninstall opencv-python
Step 2:
Install the opencv contrib library, which is needed because of the licensing issue. Here, we are using version 3.4.2.17
pip install opencv-contrib-python==3.4.2.17
Version 3.4.2.16 gave me a "version not found" error during installation, so I tried version 3.4.2.17. If this version doesn't work, try other 3.4.x versions.
Step 3:
Then run the following
import cv2
sift = cv2.xfeatures2d.SIFT_create()
That's all. It works for me. I hope it works for you as well.
I had the same issue. I tried installing opencv-contrib-python several times without success, but it worked today. Just to be sure, I installed both opencv-python and opencv-contrib-python.
pip install opencv-python
And
pip install opencv-contrib-python
The version that installed was 4.4.0.46 for both opencv-python and opencv-contrib-python. I mention it in case later versions don't support SIFT. (A few of the previous versions, including the one from a month ago, didn't support it; the latest opencv-contrib-python patch was released on Nov 2nd, 2020.)
The solution to your problem should be installing opencv-contrib-python-nonfree (available via pip).
$ pip install opencv-contrib-python-nonfree
As the error states, SIFT is patented and therefore not included in OpenCV for license reasons. It's included in the nonfree part.

jupyter notebook won't launch due to "Library not loaded" error

Jupyter notebook always launched with no problem. Until yesterday... I tried to pip install pytesseract then went off to do something else and now when I try to start jupyter notebook, this is what I get every single time:
File "/usr/local/Cellar/python3/3.6.4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: dlopen(/System/Library/Frameworks/Foundation.framework/Foundation, 6): Library not loaded: /System/Library/Frameworks/ImageIO.framework/Versions/A/Resources/libGIF.dylib
Referenced from: /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO
Reason: Incompatible library version: ImageIO requires version 1.0.0 or later, but libGIF.dylib provides version 0.0.0
I figured all I had to do is pip uninstall imageio but no such luck...
On their webpage they state that you need to download PIL or Pillow, not imageio, so I would just try pip install Pillow and check again whether that fixes the general problem. The link to their github:
https://github.com/madmaze/pytesseract
Prerequisites:
Python-tesseract requires Python 2.7 or Python 3.5+
You will need the Python Imaging Library (PIL) (or the Pillow fork).
Under Debian/Ubuntu, this is the package python-imaging or python3-imaging.

Pytesseract : "TesseractNotFound Error: tesseract is not installed or it's not in your path", how do I fix this?

I'm trying to run a basic and very simple code in python.
from PIL import Image
import pytesseract
im = Image.open("sample1.jpg")
text = pytesseract.image_to_string(im, lang = 'eng')
print(text)
This is what it looks like, I have actually installed tesseract for windows through the installer. I'm very new to Python, and I'm unsure how to proceed?
Any guidance here would be very helpful. I've tried restarting my Spyder application but to no avail.
I see that the steps are scattered across different answers. Based on my recent experience with this pytesseract error on Windows, I am writing the steps in sequence to make the error easier to resolve:
1. Install tesseract using windows installer available at: https://github.com/UB-Mannheim/tesseract/wiki
2. Note the tesseract path from the installation. Default installation path at the time of this edit was: C:\Users\USER\AppData\Local\Tesseract-OCR. It may change so please check the installation path.
3. pip install pytesseract
4. Set the tesseract path in the script before calling image_to_string:
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'
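Before hard-coding a path as in step 4, it can be useful to check whether the executable is already on PATH or exists at one of the usual locations. The following is a minimal sketch using only the standard library; find_tesseract is a made-up helper name, and the candidate paths are simply locations mentioned in this thread, so they may not match your installation.

```python
import os
import shutil


def find_tesseract(candidates=()):
    """Return a path to the tesseract executable, or None if not found."""
    # First, see whether the executable is already reachable via PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise, try the explicit locations supplied by the caller.
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


# Install locations mentioned in this thread; adjust to your machine.
CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]

path = find_tesseract(CANDIDATES)
print("tesseract found at:", path)
# If pytesseract is installed and a path was found, it could then be set with:
#   pytesseract.pytesseract.tesseract_cmd = path
```

If this prints None, the Tesseract binary itself is most likely not installed, and no Python-side setting will fix the error.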
First you should install binary:
On Linux
sudo apt-get update
sudo apt-get install libleptonica-dev tesseract-ocr tesseract-ocr-dev libtesseract-dev python3-pil tesseract-ocr-eng tesseract-ocr-script-latn
On Mac
brew install tesseract
On Windows
Download the binary from https://github.com/UB-Mannheim/tesseract/wiki, then add pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe' to your script. (The r prefix keeps Python from reading \t as a tab.)
Then install the Python package using pip:
pip install pytesseract
references:
https://pypi.org/project/pytesseract/ (INSTALLATION section) and
https://tesseract-ocr.github.io/tessdoc/Installation.html
For Windows Only
1 - You need to have Tesseract OCR installed on your computer.
get it from here.
https://github.com/UB-Mannheim/tesseract/wiki
Download the suitable version.
2 - Add Tesseract path to your System Environment. i.e. Edit system variables.
3 - Run pip install pytesseract and pip install tesseract
4 - Add this line to your python script every time
pytesseract.pytesseract.tesseract_cmd = 'C:/OCR/Tesseract-OCR/tesseract.exe' # your path may be different
5 - Run the code.
This error is because tesseract is not installed on your computer.
If you are using Ubuntu install tesseract using following command:
sudo apt-get install tesseract-ocr
For mac:
brew install tesseract
From https://pypi.org/project/pytesseract/ :
pytesseract.pytesseract.tesseract_cmd = '<full_path_to_your_tesseract_executable>'
# Include the above line, if you don't have tesseract executable in your PATH
# Example tesseract_cmd: 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
In windows:
pip install tesseract
pip install tesseract-ocr
Then check the pytesseract.py file, which is stored on your system under usr/appdata/local/programs/site-packages/python/python36/lib/pytesseract/pytesseract.py,
and compile the file
On Mac, you can install it like shown below. This works for me.
brew install tesseract
For Linux Distribution (Ubuntu)
try
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
You can install this package:
https://github.com/UB-Mannheim/tesseract/wiki
After that, go to this path: C:\Program Files (x86)\Tesseract-OCR\tesseract.exe
Then run the tesseract file.
I think this will help you.
Step 1:
Install tesseract on your system as per the OS.
Latest installers can be found at https://github.com/UB-Mannheim/tesseract/wiki
Step 2:
Install the following dependency libraries using :
pip install pytesseract
pip install opencv-python
pip install numpy
Step 3:
Sample code
import cv2
import numpy as np
import pytesseract
from PIL import Image
from pytesseract import image_to_string
# Path of working folder on Disk Replace with your working folder
src_path = "C:\\Users\\<user>\\PycharmProjects\\ImageToText\\input\\"
# If you don't have tesseract executable in your PATH, include the following:
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
TESSDATA_PREFIX = 'C:/Program Files (x86)/Tesseract-OCR'
def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)
    # Apply threshold to get image with only black and white
    #img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)
    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
    # Remove template file
    #os.remove(temp)
    return result
print('--- Start recognize text from image ---')
print(get_string(src_path + "image.png") )
print("------ Done -------")
On Windows 64 bits, just add the following to the PATH environment variable:
"C:\Program Files\Tesseract-OCR" and it will work.
I solved it by updating the tesseract_cmd variable with the bin/tesseract path in the pytesseract.py file
I had the same issue on Windows.
I tried to update the environment variables for the path of tesseract which did not work.
What worked for me was to modify pytesseract.py, which can be found at the path C:\Program Files\Python37\Lib\site-packages\pytesseract or usually under C:\Users\YOUR USER\APPDATA\Python
I changed one line as shown below:
#tesseract_cmd = 'tesseract'
tesseract_cmd = 'C:\Program Files\Tesseract-OCR\\tesseract.exe'
Note that I had to put an extra \ before tesseract because Python would otherwise interpret \t as a tab, and you would get the error message below:
pytesseract.pytesseract.TesseractNotFoundError: C:\Program Files\Tesseract-OCR esseract.exe is not installed or it's not in your path
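The effect of the unescaped \t can be reproduced without Tesseract installed. In this small sketch the variable names are illustrative only; it compares a path with the stray escape against the doubled-backslash and raw-string forms.

```python
# Only "\t" is left unescaped here, to isolate the problem described above.
broken = "C:\\Program Files\\Tesseract-OCR\tesseract.exe"
escaped = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
raw = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

print(repr(broken))    # the repr shows a tab escape where "\t" was written
print(escaped == raw)  # True
print("\t" in broken)  # True
```

Using a raw string (the r prefix) or forward slashes avoids having to remember which backslash sequences Python treats as escapes.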
For me it worked by using single quotes:
pytesseract.pytesseract.tesseract_cmd =r'C:/Program Files/Tesseract-OCR/tesseract.exe'
Putting the path inside double quotes was automatically inserting an unwanted character.
Anaconda Installation:
Works on Mac, Linux, and Windows
conda-forge/packages/tesseract 4.1.1
Step 1:
conda install -c conda-forge tesseract
Step 2: Find Tesseract PATH if you haven't already
import os
# Walk the whole filesystem and print every file whose name contains "tesseract"
for r,s,f in os.walk("/"):
    for i in f:
        if "tesseract" in i:
            print(os.path.join(r,i))
For example, my Tesseract PATH is /anaconda/bin/tesseract
Step 3: Add tesseract to PATH
pytesseract.pytesseract.tesseract_cmd = r'/anaconda/bin/tesseract'
Perhaps this is happening because, even if Tesseract is correctly installed, you have not installed your language, as was my case. Fortunately this is very easy to fix, and I did not even need to mess with tesseract_cmd.
sudo apt-get install tesseract-ocr -y
sudo apt-get install tesseract-ocr-spa -y
tesseract --list-langs
Note that in the second line we have specified -spa for Spanish.
If installation has been successful, you should get a list of your available languages, like:
List of available languages (3):
eng
osd
spa
I found this at this blog post (Spanish). There is also a post for installation of Spanish language in Windows (not as easy apparently).
Note: since the question uses lang = 'eng', it is likely this is not the answer in that specific case. But the same error may happen in this other situation, which is why I posted the answer here.
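To check for a missing language from Python rather than by eye, the output of tesseract --list-langs can be parsed into a list. This is a sketch under assumptions: parse_list_langs is a hypothetical helper, and the sample text is the output quoted above rather than something captured from a real run.

```python
def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first line is a header such as "List of available languages (3):".
    if lines and lines[0].lower().startswith("list of available languages"):
        lines = lines[1:]
    return lines


sample = """List of available languages (3):
eng
osd
spa
"""
print(parse_list_langs(sample))  # ['eng', 'osd', 'spa']
```

In a real script the text could come from subprocess.run(["tesseract", "--list-langs"], capture_output=True, text=True); some Tesseract versions may write this list to stderr instead of stdout, so both streams are worth checking.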
For Windows in simple steps:
Download Windows version from https://github.com/UB-Mannheim/tesseract/wiki
Install
Write following in your .py file (check installed location)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
img_text = pytesseract.image_to_string(Image.open(filename))
You need to install tesseract.
https://github.com/tesseract-ocr/tesseract/wiki
Check out the above documentation on the installation.
On Windows, the command path must be set explicitly for a default Tesseract installation.
On a 32-bit system, add this line after the import statements (note the r prefix, which stops \t from being read as a tab):
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
On a 64-bit system, add this line instead:
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
For Windows users only:
Install tesseract using:
pip install tesseract
and then add this line to your code, mind the "\"
pytesseract.pytesseract.tesseract_cmd = "C:\Program Files (x86)\Tesseract-OCR\\tesseract.exe"
It did work for me just by installing tesseract using conda.
conda install -c conda-forge tesseract
Use the following command to install tesseract
pip install tesseract
# {Windows 10 instructions}
# before you use the script you need to install the dependencies
# 1. download tesseract from the official link:
# https://github.com/UB-Mannheim/tesseract/wiki
# 2. install tesseract
# I chose this path
# *replace the user string in the path below with the user name you are using on your current machine
# C:\Users\user\AppData\Local\Tesseract-OCR\
# 3. Install pillow for your python version
# * the best way for me is to install it in this form (I am using python 3.7, and in my CMD I run this version of python by typing py -3.7):
# * if you are using another version of python, first check how you start python from your CMD
# * on some machines, python is started from the CMD differently
# [examples]
# =================================
# PYTHON VERSION 3.7
# python
# python3.7
# python -3.7
# python 3.7
# python3
# python -3
# python 3
# py3.7
# py -3.7
# py 3.7
# py3
# py -3
# py 3
# PYTHON VERSION 3.6
# python
# python3.6
# python -3.6
# python 3.6
# python3
# python -3
# python 3
# py3.6
# py -3.6
# py 3.6
# py3
# py -3
# py 3
# PYTHON VERSION 2.7
# python
# python2.7
# python -2.7
# python 2.7
# python2
# python -2
# python 2
# py2.7
# py -2.7
# py 2.7
# py2
# py -2
# py 2
# ================================
# we are using pip to install the dependencies
# because I start python version 3.7 with the following line
# py -3.7
# open the CMD on the Windows machine and type the following line:
# py -3.7 -m pip install pillow
# 4. Install pytesseract and tesseract for your python version
# * the best way for me is to install it in this form (I am using python 3.7, and in my CMD I run this version of python by typing py -3.7):
# we are using pip to install the dependencies
# open the CMD on the Windows machine and type the following lines:
# py -3.7 -m pip install pytesseract
# py -3.7 -m pip install tesseract
#!/usr/bin/python
from PIL import Image
import pytesseract
import os
import getpass
def extract_text_from_image(image_file_name_arg):
    # IMPORTANT
    # this path assumes you followed the installation instructions in the explanation above
    # if you don't put the right path for tesseract.exe the script will not work
    username = getpass.getuser()
    # the line above gets the username for your machine automatically
    tesseract_exe_path_installation="C:\\Users\\"+username+"\\AppData\\Local\\Tesseract-OCR\\tesseract.exe"
    pytesseract.pytesseract.tesseract_cmd=tesseract_exe_path_installation
    # specify the directory of your image files manually, or use the line below if the images are in the "images" folder of the script directory
    # image_dir="D:\\GIT\\ai_example\\extract_text_from_image\\images"
    image_dir=os.getcwd()+"\\images"
    dir_seperator="\\"
    image_file_name=image_file_name_arg
    # if your images are in a different format, change the extension (ex. ".png")
    image_ext=".jpg"
    image_path_dir=image_dir+dir_seperator+image_file_name+image_ext
    print("=============================================================================")
    print("image used is in the following path dir:")
    print("\t"+image_path_dir)
    print("=============================================================================")
    img=Image.open(image_path_dir)
    text=pytesseract.image_to_string(img, lang="eng")
    print(text)
# change the name "image_1" to the name of your image, without the extension
# image_file_name_arg="image_1"
image_file_name_arg="image_2"
# image_file_name_arg="image_3"
# image_file_name_arg="image_4"
# image_file_name_arg="image_5"
extract_text_from_image(image_file_name_arg)
# ==================================
# CREATED BY: SHERIFI
# e-mail: sherif_co#yahoo.com
# git-link for script: https://github.com/sherifi/ai_example.git
# ==================================
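The string concatenation used for the paths in the script above can also be expressed with pathlib, which takes care of the separators. This is a sketch, not the author's method: the two function names are hypothetical, and PureWindowsPath is used so the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath


def build_image_path(image_dir, image_name, ext=".jpg"):
    """Join a directory, a file name, and an extension into one Windows path."""
    return PureWindowsPath(image_dir) / (image_name + ext)


def default_tesseract_path(username):
    """Per-user install location used in the script above."""
    return PureWindowsPath("C:/Users") / username / "AppData/Local/Tesseract-OCR/tesseract.exe"


print(build_image_path(r"D:\GIT\ai_example\extract_text_from_image\images", "image_2"))
print(default_tesseract_path("user"))
```

Both results can be converted with str() before being handed to Image.open or assigned to tesseract_cmd.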
For Ubuntu 18.04
If you are getting an error like
tesseract is not installed or it's not in your path
and
OSError: [Errno 12] Cannot allocate memory
That might be an issue with swap memory allocation.
You can check this answer about allocating more swap memory. Hope that helps :)
https://askubuntu.com/questions/920595/fallocate-fallocate-failed-text-file-busy-in-ubuntu-17-04?answertab=active#tab-top
There are already many nice answers to this problem, but I would like to share a wonderful site that I came across when I couldn't solve the "TesseractNotFound Error: tesseract is not installed or it's not in your path" error. Please refer to this site: https://www.thetopsites.net/article/50655738.shtml
I realised that I got this error because I had installed pytesseract with pip but forgot to install the binary.
You are probably missing tesseract-ocr from your machine. Check the installation instructions here: https://github.com/tesseract-ocr/tesseract/wiki
On a Mac, you can just install using homebrew:
brew install tesseract
It should run fine after that!
Under Windows 10 OS environment, the following method works for me:
Go to this link and Download tesseract and install it. Windows version is available here: https://github.com/UB-Mannheim/tesseract/wiki
Find script file pytesseract.py from C:\Users\User\Anaconda3\Lib\site-packages\pytesseract and open it. Change the following code from tesseract_cmd = 'tesseract' to: tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
(This is the path where you install Tesseract-OCR so please check where you install it and accordingly update the path)
You may also need to add environment variable C:/Program Files (x86)/Tesseract-OCR/
Hope it works for you!
Solution for UBUNTU Worked for me:
Installed tesseract in ubuntu by following below link
https://medium.com/quantrium-tech/installing-tesseract-4-on-ubuntu-18-04-b6fcd0cbd78f
Later added traindata language to tessdata by following below link
Tesseract running error
There looks to be an issue with the latest version of the pip module, pytesseract==0.3.7.
I have downgraded it to pytesseract==0.3.6 and don't see the error.
The above tips did not help me fix the problem, because the error specified in the section occurred when installing pytesseract (pycharm, python 2.7). The oddity was also that tesseract worked from the command line, so the installation was done correctly.
I was able to fix this problem by following these steps:
download pytesseract.py from the vault https://github.com/madmaze/pytesseract
remove all syntax errors related to the difference in the interpreters (2.7 and 3.*), including the try catch methods
import the edited script into your program as a self-written one and configure the tesseract_cmd variable according to the recommendations in the repository.
Subsequently, the image-to-text translation function worked in python 2.7
I already tried this one on my Raspberry Pi. I just changed the path from this:
'C:/Program Files/Tesseract-OCR/tesseract.exe'
(since that is for Windows) to this:
/usr/local/lib/python3.7/dist-packages
Since, it is the path I see whenever I try to run this command:
pip3 show pytesseract
I also faced the same error while installing Tesseract on Windows.
Based on my recent troubleshooting, I followed the steps below:
1. Install tesseract using the Windows installer available at the given link: https://github.com/UB-Mannheim/tesseract/wiki
2. Note the tesseract path from the installation. Default installation path at the time of this edit was: C:\Users\USER\AppData\Local\Tesseract-OCR. It may change so please check the installation path.
If you still get the error after installation, press the Windows + R keys and run the executable by its file path (C:\Program Files\Tesseract-OCR\tesseract.exe); that worked for me.
3. pip install pytesseract
4. Set the tesseract path in the script before calling image_to_string.
For the Windows file path:
pytesseract.pytesseract.tesseract_cmd=r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
For installing opencv, please refer to this question link
For Linux installations:
1. Install Tesseract and check its version:
$ sudo apt install tesseract-ocr
$ sudo apt install libtesseract-dev
$ tesseract --version
2. After running this command, you should see something like this:
tesseract 4.0.0-beta.1
leptonica-1.75.3
3. Once your tesseract installation is successful, you can run the following command to check the available languages:
$ tesseract --list-langs
4. You can expect the following output:
List of available languages (2):
eng
osd
5. The Linux file path is given below:
pytesseract.pytesseract.tesseract_cmd = r'/home/user/bin/tesseract'
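The version check from the steps above can also be run from Python, which confirms that the same process that will call pytesseract can reach the binary. This is a minimal standard-library sketch; tesseract_version_line is a made-up helper name, and it simply returns None when no tesseract executable is found on PATH.

```python
import shutil
import subprocess


def tesseract_version_line():
    """Return the first line printed by `tesseract --version`, or None."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some Tesseract releases print the version to stderr rather than stdout.
    output = result.stdout or result.stderr
    lines = output.splitlines()
    return lines[0] if lines else None


print(tesseract_version_line())
```

A result of None here while tesseract --version works in a terminal suggests that the script runs with a different PATH than the shell.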
