Decompressing .bz2 files in a directory in python - python

I would like to decompress a bunch of .bz2 files contained in a folder (where there are also .zst files). What I am doing is the following:
destination_folder = "/destination_folder_path/"
compressed_files_path="/compressedfiles_folder_path/"
dirListing = os.listdir(compressed_files_path)
for file in dirListing:
if ".bz2" in file:
unpackedfile = bz2.BZ2File(file)
data = unpackedfile.read()
open(destination_folder, 'wb').write(data)
But I keep on getting the following error message:
Traceback (most recent call last):
File "mycode.py", line 34, in <module>
unpackedfile = bz2.BZ2File(file)
File ".../miniconda3/lib/python3.9/bz2.py", line 85, in __init__
self._fp = _builtin_open(filename, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'filename.bz2'
Why do I receive this error?

You must be sure that all the file paths you are using exist.
It is better to use the full path to the file being opened.
import os
import bz2
# this path must exist
destination_folder = "/full_path_to/folder/"
compressed_files_path = "/full_path_to_other/folder/"
# get list with filenames (strings)
dirListing = os.listdir(compressed_files_path)
for file in dirListing:
# ^ this is only filename.ext
if ".bz2" in file:
# concatenation of directory path and filename.bz2
existing_file_path = os.path.join(compressed_files_path, file)
# read the file as you want
unpackedfile = bz2.BZ2File(existing_file_path)
data = unpackedfile.read()
new_file_path = os.path.join(destination_folder, file)
with bz2.open(new_file_path, 'wb') as f:
f.write(data)
You can also use the shutil module to copy or move files.
os.path.exists
os.path.join
shutil
bz2 examples

Related

Iterating over a multiple files in a folder

Im trying to develop a program that can iterate over different files in the same folder. The files are all the same format but will have different names. Right now if there is only 1 file in the folder the code executes with no problems but with different files i get the error:
Traceback (most recent call last):
File "D:/Downloads/FYP/Feedback draft.py", line 24, in <module>
wb = openpyxl.load_workbook(filename)
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 315, in load_workbook
reader = ExcelReader(filename, read_only, keep_vba,
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 124, in __init__
self.archive = _validate_archive(fn)
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 96, in _validate_archive
archive = ZipFile(filename, 'r')
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\zipfile.py", line 1251, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'tester2.xlsx'
The code im using is :
directory = r'D:\Downloads\FYP\TEST'
for filename in os.listdir(directory):
if filename.endswith(".xlsx"):
wb = openpyxl.load_workbook(filename)
sh1=wb['test']
doc = DocxTemplate('Assignment1feedback.docx')
context = {
'acc': acceleration
}
doc.render(context)
doc.save('D:\\Downloads\\FYP\\TEST\\' + filename + '.docx')
This is incomplete code as the full thing would be quite long but overall i want to access these excel files and then create a corresponding docx
So os.listdir only provides the basename of the directory files, which will cause problems if your working directory does not match the value of directory. If your working directory is D:\Downloads, ./file.xlsx does not exist but D:\Downloads\FYP\TEST/file.xlsx does.
You will want to use the absolute path to the file, you have two options here. You could follow #IronMan's suggestion in the their comment to produce the file path from the directory path and file basename:
import os
directory = r'D:\Downloads\FYP\TEST'
for filename in os.listdir():
wb = openpyxl.load_workbook(os.path.join(directory, filename))
This is a simple and useful approach; however, its functionality is somewhat limited and may make it harder to make changes in the future. The alternative is to use python's paathlib and scandir, and access the path directly from there:
import pathlib
directory = r'D:\Downloads\FYP\TEST'
for entry in pathlib.scandir(diectory):
wb = openpyxl.load_workbook(entry.path)

Python File Not Found Error even though file is in same directory

I'm running a python code (filename- images.py) that reads-
import gzip
f = gzip.open('i1.gz','r')
But it is showing the FileNotFoundError.
My folder containing images.py looks like-
New Folder/
images.py
i1.gz
(...Some other files...)
The problem is that you are not running the script from within the New Folder.
You can easily solve it by using the absolute path without hard-coding it:
from os import path
file_path = path.abspath(__file__) # full path of your script
dir_path = path.dirname(file_path) # full path of the directory of your script
zip_file_path = path.join(dir_path,'i1.gz') # absolute zip file path
# and now you can open it
f = gzip.open(zip_file_path,'r')
Check the current working directory of the script by doing:
import os
os.getcwd()
Then, compare this with your i1.gz absolute path. Then you should be able to see if there are any inconsistencies.
Are you run the script from the New Folder?
If you are in the folder, it should work:
c:\Data\Python\Projekty\Random\gzip_example>python load_gzip.py
but if you run the script from a parent folder with the folder name, it returned the error:
c:\Data\Python\Projekty\Random>python gzip_example\load_gzip.py
Traceback (most recent call last):
File "C:\Data\Python\Projekty\Random\gzip_example\load_gzip.py", line 2, in <module>
f = gzip.open('file.gz', 'r')
File "C:\Python\Python 3.8\lib\gzip.py", line 58, in open
binary_file = GzipFile(filename, gz_mode, compresslevel)
File "C:\Python\Python 3.8\lib\gzip.py", line 173, in __init__
fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'file.gz'
The way I usually set working directory and work with files is as follow:
import os
pwd_path= os.path.dirname(os.path.abspath(__file__))
myfile = os.path.join(pwd_path, 'i1.gz')

Extracting zip files using python

I'm trying to get all zip files in a specific directory name "downloaded" and to extract all of their content to a directory named "extracted".
I don't know why, after I'm iterating only existing files name, I get an error that there is no such file...
allFilesList = os.listdir(os.getcwd()+"/downloaded")
print allFilesList #verify - correct expected list
from zipfile import ZipFile
os.chdir(os.getcwd()+"/extracted/")
print os.getcwd() #verify - correct expected dir
for fileName in allFilesList:
print fileName
with ZipFile(fileName, 'r') as zipFileObject:
if os.path.exists(fileName):
print "Skipping extracting " + fileName
continue
zipFileObject.extractall(pwd='hello')
print "Saving extracted file to extracted/",fileName
print "all files has been successfully extracted"
Error message:
File "program.py", line 77, in <module>
with ZipFile(fileName, 'r') as zipFileObject:
File "/usr/lib/python2.7/zipfile.py", line 779, in __init__
self.fp = open(file, modeDict[mode])
IOError: [Errno 2] No such file or directory: 'zipFile1.zip'
You're getting the list of filenames from one directory, then changing to another, and trying to extract the files from that directory that likely don't exist:
allFilesList = os.listdir(os.getcwd()+"/downloaded")
# ...
os.chdir(os.getcwd()+"/extracted/")
# ...
with ZipFile(fileName, 'r') as zipFileObject:
If you change that file ZipFile command to something like this:
with ZipFile(os.path.join("..", "downloaded", fileName), 'r') as zipFileObject:
You should be able to open the file in the directory you found it in.

Getting a error: FileNotFoundError: [Errno 2] No such file or directory: while trying to open a file

In a Folder called Assignment Parser, I've my parsing.py file along with a auth.txt file. Trying to open this auth.txt file. But getting an error that says :
(base) C:\Users\Ajay\Desktop\Python\Assignment Parser>python parsing.py
Traceback (most recent call last):
File "parsing.py", line 27, in <module>
main()
File "parsing.py", line 8, in main
file = open(file_path / "auth.txt","r")
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\Ajay\\Desktop\\Python\\Assignment Parser\\auth.txt'
Code:
from pathlib import Path
import os
def main():
# read file
# C:\Users\Ajay\Desktop\Python\Assignment Parser\
file_path = Path("C:/Users/Ajay/Desktop/Python/Assignment Parser/")
file = open(file_path / "auth.txt","r")
# file = open("auth.txt", "r")
lines = file.readlines()
file.close()
Where is this going wrong? PFA for the screenprint.
Try this:
from pathlib import Path
import os
def main():
# read file
# C:\Users\Ajay\Desktop\Python\Assignment Parser\
file_path = Path("C:/Users/Ajay/Desktop/Python/Assignment Parser/")
file = open(os.path.join(file_path, "auth.txt"), "r")
# file = open("auth.txt", "r")
lines = file.readlines()
file.close()
I think the problem is in file extension, I see parsing has .py extension but auth is not
please try file = open(file_path / "auth", "r") again (just delete .txt extension)
As you have your python file in same folder as your text file. You can directly use below code.
def main():
file = open("./auth.txt")
lines = file.readlines()
file.close()
Also make sure, your syder working directory is set to this folder path "C:/Users/Ajay/Desktop/Python/Assignment Parser"

Python IOError: [Errno 2] from recursive directory call

The code below is part of a program I am writing that runs a method on every .py, .sh. or .pl file in a directory and its folders.
for root, subs, files in os.walk("."):
for a in files:
if a.endswith('.py') or a.endswith('.sh') or a.endswith('.pl'):
scriptFile = open(a, 'r')
writer(writeFile, scriptFile)
scriptFile.close()
else:
continue
When writing the program, it worked in the directory tree I wrote it in, but when I moved it to another folder to try it there I get this error message:
Traceback (most recent call last):
File "versionTEST.py", line 75, in <module>
scriptFile = open(a, 'r')
IOError: [Errno 2] No such file or directory: 'enabledLogSources.sh'
I know something weird is going on because the file is most definitely there...
You'll need to prepend the root directory to your filename
scriptFile = open(root + '/' + a, 'r')
files contains only the file names, not the entire path. The path to the file can be obtained by joining the file name and the root:
scriptFile = open(os.path.join(root, a), "r")
You might want to have a look at
https://docs.python.org/2/library/os.html#os.walk

Categories