I have been learning Python PyPDF2, This was the code on geeksforgeeks.org/
# importing required modules
import PyPDF2
# creating a pdf file object
pdfFileObj = open('English.pdf', 'rb')
# creating a pdf reader object
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
# printing number of pages in pdf file
print(pdfReader.numPages)
# creating a page object
pageObj = pdfReader.getPage(0)
# extracting text from page
print(pageObj.extractText())
# closing the pdf file object
pdfFileObj.close()
After running this program, this error pops up:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python38-32\lib\site-packages\PyPDF2\pdf.py", line 1147, in getNumPages
self.decrypt('')
File "C:\Program Files (x86)\Python38-32\lib\site-packages\PyPDF2\pdf.py", line 1987, in decrypt
return self._decrypt(password)
File "C:\Program Files (x86)\Python38-32\lib\site-packages\PyPDF2\pdf.py", line 1996, in _decrypt
raise NotImplementedError("only algorithm code 1 and 2 are supported")
NotImplementedError: only algorithm code 1 and 2 are supported
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 11, in <module>
print(pdfReader.getNumPages())
File "C:\Program Files (x86)\Python38-32\lib\site-packages\PyPDF2\pdf.py", line 1150, in getNumPages
raise utils.PdfReadError("File has not been decrypted")
PyPDF2.utils.PdfReadError: File has not been decrypted
I tried different ways to resolve this, but this error stays, can you please guide me here?
PyPDF2 only supports very old PDF files. It doesn't support the formats from Acrobat 6. You'll either need to convert this to an older format or find a different PDF library.
https://github.com/mstamy2/PyPDF2/issues/378
Related
I'm using the pd2image module to convert a list of .ai files into .png files in a python loop. When I use the module in a loop it will successful convert the first .ai file into a .png file per page in the .ai file, but it seems to break on the second .ai file.
Here's the code
import os
from pdf2image import convert_from_path
directory = '/Users/jacobpatty/vscode_projects/badger_colors/test_ai_work_orders'
def ai_to_png(ai_file):
convert_from_path(
ai_file,
dpi=200,
output_folder='/Users/jacobpatty/vscode_projects/badger_colors/ai_to_ping_temp_storage',
fmt='png'
)
def end_loop(database):
for filename in os.scandir(database):
ai_to_png(filename)
print(filename)
end_loop(directory)
and here's the error
<DirEntry '8904_Heather_B_12-6-17.ai'>
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.9/site-packages/pdf2image/pdf2image.py", line 479, in pdfinfo_from_path
raise ValueError
ValueError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/jacobpatty/vscode_projects/badger_colors/TS_pdf2image.py", line 21, in <module>
end_loop(directory)
File "/Users/jacobpatty/vscode_projects/badger_colors/TS_pdf2image.py", line 17, in end_loop
ai_to_png(filename)
File "/Users/jacobpatty/vscode_projects/badger_colors/TS_pdf2image.py", line 7, in ai_to_png
convert_from_path(
File "/opt/homebrew/lib/python3.9/site-packages/pdf2image/pdf2image.py", line 98, in convert_from_path
page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
File "/opt/homebrew/lib/python3.9/site-packages/pdf2image/pdf2image.py", line 488, in pdfinfo_from_path
raise PDFPageCountError(
pdf2image.exceptions.PDFPageCountError: Unable to get page count.
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
I am very new to programming, but it seems odd to me that a line of code would work for one iteration but not the next. To fix it I tried using different ai files, but nothing changed and the same error occurred. I tried uninstalling and reinstalling pdf2image but that also didn't help.
Any ideas?
I am trying to play mp3 file using pyglet module.
Following some suggestions, I have already installed avbin64 and moved avbin64.dll to the directory where my python code is. but still, I am getting 2 errors
import pyglet
music = pyglet.resource.media('song.mp3')
music.play()
pyglet.app.run()
error code
Traceback (most recent call last):
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\codecs\wave.py", line 59, in __init__
self._wave = wave.open(file)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\wave.py", line 510, in open
return Wave_read(f)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\wave.py", line 164, in __init__
self.initfp(f)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\wave.py", line 131, in initfp
raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:/PycharmProjects/test/test2.py", line 3, in <module>
music = pyglet.resource.media('song.mp3')
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\resource.py", line 678, in media
return media.load(path, streaming=streaming)
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\__init__.py", line 143, in load
raise first_exception
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\__init__.py", line 133, in load
loaded_source = decoder.decode(file, filename, streaming)
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\codecs\wave.py", line 109, in decode
return WaveSource(filename, file)
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\codecs\wave.py", line 61, in __init__
raise WAVEDecodeException(e)
pyglet.media.codecs.wave.WAVEDecodeException: file does not start with RIFF id
As per "Loading media", you're supposed to open audio (and video) files with pyglet.media.load:
music = pyglet.media.load('song.mp3')
You must also have ffmpeg installed for pyglet to be able to read mp3 files (as per Supported media types). Make sure to follow the installation instructions.
import docx
f = open('~/Desktop/python/test/draft.docx','rb')
document = docx.Document(f)
Traceback (most recent call last):
File "./test.py", line 56, in <module>
document = docx.Document(f)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/docx/api.py", line 25, in Document
document_part = Package.open(docx).main_document_part
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/docx/opc/package.py", line 116, in open
pkg_reader = PackageReader.from_file(pkg_file)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/docx/opc/pkgreader.py", line 32, in from_file
phys_reader = PhysPkgReader(pkg_file)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/docx/opc/phys_pkg.py", line 101, in __init__
self._zipf = ZipFile(pkg_file, 'r')
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/zipfile.py", line 1200, in __init__
self._RealGetContents()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/zipfile.py", line 1267, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
Uninstalled docx and installed python-docx
Uninstalled lxml too
None of it works.
Any help would be appreciated
Running python3.7 on OS X10.13
Don't open the file before you pass it to Document(). Just give it the path like you did in the open() call above.
It needs to be an actual Word .docx file. Note that you can just call document = Document() to get started. The "save as" file name is provided in the document.save() call. The file (if any) provided in the Document() call is just the starting-point "template" to use.
See the related documentation here:
https://python-docx.readthedocs.io/en/latest/user/documents.html
I try to read a Minecraft world with Python from the filesystem and the .mca region/anvil files using the NBT 1.4.1 module (Named Binary Tag Reader/Writer), which is supposed to read the NBT format used in Minecraft. It works fine for files such as level.dat, but throws an error for the region files such as r.0.0.mca
Edit: I am referring to the auto generated world files that minecraft stores in the .minecraft/saves/"MyWorld"/ folder. Such as the level.dat (which works), and the mca files stored in the .minecraft/saves/"MyWorld"/region/ folder such as r.0.0.mca which don't work. I uploaded two sample files from one of my worlds.
Code:
from nbt import nbt
level_file = nbt.NBTFile("level.dat", "rb") # works
region_file = nbt.NBTFile("r.0.0.mca", "rb")# does not work
Error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 508, in __init__
self.parse_file()
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 532, in parse_file
type = TAG_Byte(buffer=self.file)
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 85, in __init__
self._parse_buffer(buffer)
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 90, in _parse_buffer
self.value = self.fmt.unpack(buffer.read(self.fmt.size))[0]
File "/usr/lib/python3.5/gzip.py", line 274, in read
return self._buffer.read(size)
File "/usr/lib/python3.5/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/lib/python3.5/gzip.py", line 461, in read
if not self._read_gzip_header():
File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'\x00\x00')
Any suggestions how to get this working?
r.0.0.mca is most definitely not compressed. About 80% of the bytes are zeros.
It turns out that the NBT library only supports .mcr region files which have been replaced by .mca files about 6 years ago. However, mcedit is written in Python and supports those files. Due the changes in the Minecraft save format, the interpretation of the content needs to be adjusted though, but the files can be successfully read.
I am trying to use the SpeechRecognition library (https://pypi.python.org/pypi/SpeechRecognition/). When running the example code (full example) below:
#!/usr/bin/env python3
import speech_recognition as sr
# obtain path to "english.wav" in the same folder as this script
from os import path
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "english.wav")
#AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "french.aiff")
#AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "chinese.flac")
# use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
audio = r.record(source) # read the entire audio file
I receive this error:
/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/bin/python3.4 /Users/adamg/te/Polli/ASR/SpeechRecognitionTest0.py
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/speech_recognition/__init__.py", line 174, in __enter__
self.audio_reader = wave.open(self.filename_or_fileobject, "rb")
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/wave.py", line 497, in open
return Wave_read(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/wave.py", line 163, in __init__
self.initfp(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/wave.py", line 130, in initfp
raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/speech_recognition/__init__.py", line 179, in __enter__
self.audio_reader = aifc.open(self.filename_or_fileobject, "rb")
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 887, in open
return Aifc_read(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 340, in __init__
self.initfp(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 305, in initfp
raise Error('file does not start with FORM id')
aifc.Error: file does not start with FORM id
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/adamg/te/Polli/ASR/SpeechRecognitionTest0.py", line 13, in <module>
with sr.AudioFile(AUDIO_FILE) as source:
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/speech_recognition/__init__.py", line 199, in __enter__
self.audio_reader = aifc.open(aiff_file, "rb")
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 887, in open
return Aifc_read(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 340, in __init__
self.initfp(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 303, in initfp
chunk = Chunk(file)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/chunk.py", line 63, in __init__
raise EOFError
EOFError
Process finished with exit code 1
How can this be fixed?
I had the same problem for a while, my mistake is to download the wav file using "save as". When I double click to play the wav file, I am not able to play the wav file using any player. After cloning or downloading the entire zip file, I am able to play the wav file and the error disappears.
Your wav file is probably corrupted. To check try to play the file using any media player if possible. Download the audio file into your environment correctly and then it should work.
I have faced this issue when I did a mistake by downloading the file the incorrect way, I downloaded the audio file using the following command in the google colab:
[!wget 'https://github.com/mozilla/DeepSpeech/blob/master/data/smoke_test/LDC93S1_pcms16le_1_16000.wav']
This threw an error as the audio file was corrupted(could not be played my media player when downloaded).
I was able to correct the issue by downloading the following way:
[!wget 'https://raw.githubusercontent.com/mozilla/DeepSpeech/master/data/smoke_test/LDC93S1_pcms16le_1_16000.wav']
For what it's worth, I ran into this problem when git LFS WAV files I had stored in a repo were not cloned properly. Their pointers were present, but the files were not. A
git lfs pull
fixed the problem.
That being said, this might just happen any time you a file is pointed to that isn't actually an audio file that is "complete". Hopefully that helps!