raise "pytesseract.pytesseract.TesseractError: (3221225477, '')" - python

I got the following error when I tried to recognize the Chinese text in a picture with Python. (I already had the "chi_sim.traineddata" file in the tessdata directory and had successfully extracted English sentences from a picture, so this error really confused me.)
C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\python.exe E:/PKU1.3/python_math/set_for_recognition.py
Traceback (most recent call last):
File "E:/PKU1.3/python_math/set_for_recognition.py", line 5, in <module>
text=pytesseract.image_to_string(Image.open('climb_high.jpeg'),lang='chi_sim')
File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 295, in image_to_string
return run_and_get_output(*args)
File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 203, in run_and_get_output
run_tesseract(**kwargs)
File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 179, in run_tesseract
raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (3221225477, '')

I think this problem is caused by the traineddata file.
I used to develop an OCR project with Tesseract on Windows 7. After I moved to Windows 10, I started getting this problem.
However, I found that the issue is related to the traineddata file: if I use the traineddata that I trained on Windows 7, it works fine without any error message.

Actually, the error code 3221225477 is 0xC0000005 (ACCESS_VIOLATION), which means that Tesseract itself crashed, so switching to a different version of Tesseract may help you.
For me the problem occurs in 4.00 (beta) and 3.02, while 3.05 works fine (I use Windows 7).
Hope this helps.
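As a quick check of the conversion mentioned above, the decimal status code reported by pytesseract can be turned into the hexadecimal form used in Windows documentation. This is only a small sketch; the helper name to_ntstatus_hex is made up for illustration and is not part of pytesseract.

```python
def to_ntstatus_hex(status_code):
    """Format a process exit code as an 8-digit hex string (hypothetical helper)."""
    # Mask to 32 bits so that negative (signed) exit codes are handled as well.
    return "0x{:08X}".format(status_code & 0xFFFFFFFF)


print(to_ntstatus_hex(3221225477))  # prints 0xC0000005
```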

I got this error because my UZN file extended beyond the image area. To find this, I patched pytesseract.py to print the command line (print(' '.join(cmd_args)) in run_tesseract()); Tesseract was throwing an assertion error.
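If you suspect the same cause, the zones can be checked against the image size before calling Tesseract. The sketch below assumes that each UZN line has the form "left top width height label"; the function name zones_outside_image and the sample zones are invented for illustration.

```python
def zones_outside_image(uzn_text, image_width, image_height):
    """Return the UZN lines whose rectangle does not fit inside the image.

    Assumes each non-empty line has the form: left top width height label
    """
    bad = []
    for line in uzn_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        left, top, width, height = (int(p) for p in parts[:4])
        fits = (left >= 0 and top >= 0
                and left + width <= image_width
                and top + height <= image_height)
        if not fits:
            bad.append(line)
    return bad


# Made-up zones for a 100x50 image; the second one sticks out on the right.
example = "0 0 100 50 Text\n90 40 20 20 Text"
print(zones_outside_image(example, 100, 50))  # prints ['90 40 20 20 Text']
```

The image size itself can be read with Pillow, where Image.open(path).size gives (width, height).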

Please try the code below:
import pytesseract
from PIL import Image
# Path to the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files/Tesseract-OCR/tesseract.exe'
# Tell Tesseract explicitly where the traineddata files are
tessdata_dir_config = '--tessdata-dir "C:/Program Files/Tesseract-OCR/tessdata"'
# Raw string so the backslash is not treated as an escape character
img = Image.open(r'images\Capture2.JPG')
text = pytesseract.image_to_string(img, config=tessdata_dir_config)
print(text)

Related

NFLfastpy installed but won't import

I have installed nflfastpy with pip, but when I import it, running only
import nflfastpy
I get this error message:
(pythonCoursera) C:\Users\austi\PycharmProjects\pythonCoursera>python sportsbet.py
Traceback (most recent call last):
File "C:\Users\austi\PycharmProjects\pythonCoursera\sportsbet.py", line 2, in <module>
import nflfastpy as nfl
File "C:\Users\austi\anaconda3\envs\pythonCoursera\lib\site-packages\nflfastpy\__init__.py", line 16, in <module>
default_headshot = mpl_image.imread(headshot_url)
File "C:\Users\austi\anaconda3\envs\pythonCoursera\lib\site-packages\matplotlib\image.py", line 1536, in imread
raise ValueError(
ValueError: Please open the URL for reading and pass the result to Pillow, e.g. with ``np.array(PIL.Image.open(urllib.request.urlopen(url)))``.
That is with a single line of code and nothing else in the file.
I've tried a few versions of it and can't seem to figure it out.
Any suggestions?
That library has a bug and does not seem to be actively maintained, so you are on your own. At least that image-loading error can be avoided by commenting out the dead code in nflfastpy/__init__.py, like the following.
...
#default_headshot = mpl_image.imread(headshot_url)

PyTesseract failing to load languages

My code is as follows:
import pytesseract
from PIL import Image
pytesseract.pytesseract.tesseract_cmd = 'B:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'
img = Image.open("sample.png")
text = pytesseract.image_to_string(img, lang="eng")
print(text)
The error I get is:
Traceback (most recent call last):
File "C:/PY/tesseract test.py", line 11, in <module>
text = pytesseract.image_to_string(img, lang="eng")
File "C:\PY\lib\site-packages\pytesseract\pytesseract.py", line 346, in image_to_string
return {
File "C:\PY\lib\site-packages\pytesseract\pytesseract.py", line 349, in <lambda>
Output.STRING: lambda: run_and_get_output(*args),
File "C:\PY\lib\site-packages\pytesseract\pytesseract.py", line 260, in run_and_get_output
run_tesseract(**kwargs)
File "C:\PY\lib\site-packages\pytesseract\pytesseract.py", line 236, in run_tesseract
raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
I have tried searching for other solutions but cannot find anything.
I'm not familiar with Tesseract in Python, but you may need to make the eng.traineddata file available for it to work. Add a TESSDATA_PREFIX environment variable and point it to the folder where that file is located.
You may want to look at this answer, which looks similar to your case: pytesseract Failed loading language 'eng'
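If you would rather not change the system-wide environment variables, TESSDATA_PREFIX can also be set from the script itself before pytesseract is called. This is a minimal sketch: the helper name set_tessdata_prefix is made up, and the tessdata location is only guessed from the install path in the question. Depending on the Tesseract version, the variable may need to point at the tessdata folder itself or at its parent folder.

```python
import os


def set_tessdata_prefix(tessdata_dir):
    """Point TESSDATA_PREFIX at tessdata_dir for this process and its children."""
    # pytesseract starts tesseract as a subprocess, which inherits os.environ.
    os.environ["TESSDATA_PREFIX"] = tessdata_dir
    return os.environ["TESSDATA_PREFIX"]


# Assumed location, based on the install path in the question; adjust as needed.
set_tessdata_prefix(r"B:\Program Files (x86)\Tesseract-OCR\tessdata")
```

Call it before pytesseract.image_to_string so the setting is in place when Tesseract starts.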
I fixed this issue by uninstalling tesseract and installing an older version (3.0.2). So far I haven't noticed any functionality loss. I'm personally just happy that it works.

How to fix "Unable to load sound" error with Python Playsound Module?

A while ago I made a little project, and I recently thought it would be cool to add some sounds to it as well. So I looked for ways to play sound in Python 3.x, and Playsound was well reviewed. I have it set up like so: I have a folder named Python Projects, and inside that I have a folder named Sound Test. But when I try to play my audio file (test.wav), it throws the following error:
Traceback (most recent call last):
File "soundtest.py", line 2, in <module>
playsound('test.wav')
File "/Users/rhett/env/lib/python3.7/site-packages/playsound.py", line 67, in _playsoundOSX
raise IOError('Unable to load sound named: ' + sound)
OSError: Unable to load sound named: file:///Users/rhett/Desktop/Python Projects/Sound Test/test.wav
I tried using the direct path, e.g.:
from playsound import playsound
playsound("/Users/Rhett/Desktop/Python\ Projects/Sound\ Test/test.wav")
I received the exact same error:
Traceback (most recent call last):
File "Sound Test/soundtest.py", line 2, in <module>
playsound("/Users/Rhett/Desktop/Python\ Projects/Sound\ Test/test.wav")
File "/Users/rhett/env/lib/python3.7/site-packages/playsound.py", line 67, in _playsoundOSX
raise IOError('Unable to load sound named: ' + sound)
OSError: Unable to load sound named: file:///Users/Rhett/Desktop/Python\ Projects/Sound\ Test/test.wav
I have found where the problem is.
It is in the python3.7/site-packages/playsound.py file.
The developer has not handled spaces in the file path, so any space character in the path causes trouble.
A quick fix that does not require changing the code of playsound.py is to rename your folders, replacing
~/Desktop/Python Projects/Sound Test/test.wav with ~/Desktop/PythonProjects/SoundTest/test.wav
(i.e., remove the spaces from the folder names).
This will fix your error.
Replacing the spaces with %20 allows it to work:
s_musicfile = "/Users/xxxxxxxxxx/Desktop/play this file.mp3"
s_musicfile = s_musicfile.replace(" ", "%20")
playsound(s_musicfile)
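The replacement above only covers spaces. As a more general variant of the same idea, the standard library can percent-encode the path. This is a sketch for Python 3; the helper name encode_sound_path is made up, and whether playsound accepts the encoded path may depend on the playsound version and platform.

```python
from urllib.parse import quote


def encode_sound_path(path):
    """Percent-encode a file path, leaving the '/' separators untouched."""
    return quote(path)


print(encode_sound_path("/Users/xxxxxxxxxx/Desktop/play this file.mp3"))
# prints /Users/xxxxxxxxxx/Desktop/play%20this%20file.mp3
```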
I think this will work; try it, especially if you are on a Raspberry Pi:
Python 2:
sudo apt install python-gst-1.0
Python 3:
sudo apt install python3-gst-1.0

File 'tesseract.log' is Missing (Python 2.7, Windows)

I'm trying to write an OCR script with Python (2.7, Windows OS) to get text from images. First I downloaded PyTesser and extracted it to Python27/Lib/site-packages as 'pytesser', and I installed tesseract with pip install tesseract. Then I wrote the following script as self.py:
from PIL import Image
from pytesser.pytesser import *
image_file = 'C:/Users/blabla/test.png'
im = Image.open(image_file)
text = image_to_string(im)
text = image_file_to_string(image_file)
text = image_file_to_string(image_file, graceful_errors=True)
print text
But I'm getting the following error:
Traceback (most recent call last):
File "C:/Users/blabla/self.py", line 7, in <module>
text = image_file_to_string(image_file)
File "C:\Python27\lib\site-packages\pytesser\pytesser.py", line 44, in image_file_to_string
call_tesseract(filename, scratch_text_name_root)
File "C:\Python27\lib\site-packages\pytesser\pytesser.py", line 24, in call_tesseract
errors.check_for_errors()
File "C:\Python27\lib\site-packages\pytesser\errors.py", line 10, in check_for_errors
inf = file(logfile)
IOError: [Errno 2] No such file or directory: 'tesseract.log'
And yes, there's no 'tesseract.log' file anywhere. What should I do? How can I solve this problem?
Thank you in advance.
Note: I changed the tesseract_exe_name line in pytesser.py from tesseract to C:/Python27/Lib/site-packages/pytesser/tesseract, but it doesn't work.
Edit: Alright, I've just run the tesseract.exe that is in 'pytesser' and it created the 'tesseract.log' file, but I'm still getting the same error.
I changed the line def check_for_errors(logfile = "tesseract.log"): to def check_for_errors(logfile = "C:/Python27/Lib/site-packages/pytesser/tesseract.log"): in ../pytesser/errors.py, and it worked.
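A relative name such as "tesseract.log" is looked up in the current working directory, which is presumably why the file created inside the pytesser folder was not found. Instead of hardcoding the absolute path, it can be built from the location of the module. This is only a sketch; the helper log_path_next_to is hypothetical and not part of PyTesser.

```python
import os


def log_path_next_to(module_file, log_name="tesseract.log"):
    """Build an absolute path to log_name in the same directory as module_file."""
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(module_dir, log_name)


# Inside errors.py this would be called as log_path_next_to(__file__).
print(log_path_next_to("pytesser/errors.py"))
```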

Cannot determine type of file

Hi, I have just started learning image processing using Python.
When I try to open an image that I downloaded from the net, I keep getting this error, and I have no idea how to resolve it. Can anyone please help me with this?
>>> dna=mahotas.imread('dna.jpeg')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\mahotas\io\freeimage.py", line 773, in imread
img = read(filename)
File "C:\Python27\lib\site-packages\mahotas\io\freeimage.py", line 444, in read
bitmap = _read_bitmap(filename, flags)
File "C:\Python27\lib\site-packages\mahotas\io\freeimage.py", line 490, in _read_bitmap
'mahotas.freeimage: cannot determine type of file %s' % filename)
ValueError: mahotas.freeimage: cannot determine type of file dna.jpeg
Hello, this looks like a pretty old thread, but I found it recently because I had the same problem.
I think the error message is misleading because it implies that the file type is incorrect.
I fixed the problem by including the full path to the image file. For example, it could look something like:
dna = mahotas.imread(r'C:\Documents\dna.jpeg')
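If the full path fixes it, the relative name was probably not resolving to an existing file in the current working directory. One way to confirm this before calling imread is to resolve the path and check it first. This is a sketch for Python 3 that uses only the standard library; the helper name resolve_image_path is made up.

```python
import os


def resolve_image_path(path):
    """Return the absolute form of path, or raise if no such file exists."""
    full_path = os.path.abspath(path)
    if not os.path.isfile(full_path):
        raise FileNotFoundError(
            "No file at %s (current directory: %s)" % (full_path, os.getcwd()))
    return full_path
```

The returned path can then be passed to mahotas.imread, and a missing file is reported as such instead of as an unknown file type.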
