I have 80 zipped files. Each of them contains about 20 folders (which I will call first-level folders). What is the Python code to get a list of all of the first-level folder names from each of the zipped files?
I need an Excel spreadsheet listing the names of the first-level folders from all 80 zipped files.
Tricky part: There are 2 types of zipped files amongst those 80. Some have .zip extension while others have .7z extension.
The Python zipfile module documentation answers your question well.
ZipFile.namelist()
Return a list of archive members by name.
For 7zip, it may be necessary to use the subprocess module and run 7zip; not all 7zip files can be opened by the zipfile module.
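As a rough sketch of how namelist() could be used here: the helper names first_level_folders and write_folder_report are made up for illustration, and the output is a CSV file (which Excel can open) rather than a native .xlsx. It only covers the .zip archives; for .7z the member names would have to come from somewhere else, for example from running 7zip through subprocess as mentioned above, and could then be passed to the same first_level_folders helper.

```python
import csv
import zipfile
from pathlib import Path


def first_level_folders(names):
    """Return the sorted, unique top-level folder names from archive member names."""
    folders = set()
    for name in names:
        # "docs/readme.txt" and "docs/" both belong to folder "docs";
        # a name without a slash is a file at the archive root.
        if "/" in name:
            top = name.split("/", 1)[0]
            if top:
                folders.add(top)
    return sorted(folders)


def write_folder_report(zip_dir, csv_path):
    """Write one row per (archive, first-level folder) for every .zip in zip_dir."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["archive", "first_level_folder"])
        for archive in sorted(Path(zip_dir).glob("*.zip")):
            with zipfile.ZipFile(archive) as zf:
                for folder in first_level_folders(zf.namelist()):
                    writer.writerow([archive.name, folder])
```

Deriving the folder from the first path component, instead of looking for names that end in "/", also works for archives that do not store explicit directory entries.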
Related
I'm making a program to back up files and folders to a destination.
The problem I'm currently facing is that if I have a folder inside a folder and so on, with files in between them, I can't sync them at the destination.
e.g.:
The source contains folder 1 and file 2. Folder 1 contains folder 2, folder 2 contains folder 3 and files etc...
The backup only contains folder 1 and file 2.
If the backup doesn't exist I simply use shutil.copytree(path, path_backup), but when I need to sync I can't get the files and folders, or at least I'm not seeing a way to do it. I have walked the directory with for path, dir, files in os.walk(directory) and even used what someone suggested in another post:
def walk_folder(target_path, path_backup):
    for files in os.scandir(target_path):
        if os.path.isfile(files):
            file_name = os.path.abspath(files)
            print(file_name)
            os.makedirs(path_backup)
        elif os.path.isdir(files):
            walk_folder(files, path_backup)
Is there a way to make the directories in the backup folder from the ground up and then add the info alongside, or is the only way to delete the whole folder and use shutil.copytree(path, path_backup)?
With makedirs, all it does is say it can't create the folder because it already exists. This is understandable, as it's trying to write in the Source folder and not in the backup. Is there a way to replace the Source part of the path with the backup path?
If any more code is needed feel free to ask!
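One possible way to swap the Source part of each path for the backup path is os.path.relpath. The following is only a sketch under some assumptions: sync_tree is a made-up name, files that already exist in the backup are left untouched (no comparison of size or timestamps), and nothing is ever deleted from the backup.

```python
import os
import shutil


def sync_tree(source, backup):
    """Copy folders and files missing from backup, mirroring source's layout."""
    for current, dirs, files in os.walk(source):
        # Path of the current folder relative to source, e.g. "folder1/folder2".
        rel = os.path.relpath(current, source)
        target_dir = os.path.normpath(os.path.join(backup, rel))
        # exist_ok=True avoids the "already exists" error from makedirs.
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src_file = os.path.join(current, name)
            dst_file = os.path.join(target_dir, name)
            if not os.path.exists(dst_file):
                shutil.copy2(src_file, dst_file)
```

On Python 3.8 or newer, shutil.copytree(path, path_backup, dirs_exist_ok=True) may also be enough, although it copies every file again rather than only the missing ones.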
I want to extract all the folder names from the zip file so that I can extract them separately. My logic works with one zip but not with another zip that has the same structure.
from zipfile import ZipFile

root_dir = r'C:/Workspace/Neo4j/FileStore/RDAR/data.zip'
archive = ZipFile(root_dir, "r")
folder_paths = []
for file in archive.namelist():
    print(file)
    # Only explicit directory entries end with "/"
    if file.endswith("/"):
        folder_paths.append(file)
The above code works with data10.zip, but it does not list the directories in data.zip.
Folder structure of data.zip (not working)
Folder structure of data10.zip
I am able to extract the list of folders in data10.zip as shown above, but not in data.zip.
Any clue what the reason might be?
Thanks
I am trying to list the directories from a zip. My code works for one zip but not for another with the same structure.
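One likely cause, though it cannot be confirmed without the archives themselves: a zip file is not required to store explicit directory entries, and names ending in "/" only show up in namelist() when the tool that built the archive wrote them. A sketch that derives the folders from the file paths instead (derived_folder_paths is a made-up name):

```python
from pathlib import PurePosixPath


def derived_folder_paths(names):
    """Return every folder implied by the member names, each ending in "/"."""
    folders = set()
    for name in names:
        path = PurePosixPath(name)
        # Keep explicit directory entries when the archive has them.
        if name.endswith("/"):
            folders.add(str(path) + "/")
        # Every parent of a member is a folder, even without its own entry.
        for parent in path.parents:
            if str(parent) != ".":
                folders.add(str(parent) + "/")
    return sorted(folders)
```

It would be called as derived_folder_paths(archive.namelist()) and should give the same kind of list for both archives.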
I want to selectively zip files from some folders but couldn't find a good way.
The source folder structure:
C:/temp/x86/file1.dll
C:/temp/x86/file2.dll
C:/temp/x86/file3.txt
And
C:/temp/x64/file1.dll
C:/temp/x64/file2.dll
C:/temp/x64/file4.dll
My requirement:
Zip file1.dll from C:/temp/x86 and C:/temp/x64 into C:/outputFolder/x86 and C:/outputFolder/x64 separately. The zip file name can be example.zip.
That is to say, after example.zip is unzipped, the output structure is as follows:
outputFolder/x86/file1.dll
And
outputFolder/x64/file1.dll
One solution is to manually copy the files into the destination folder, then zip them.
But I want to avoid the copy because in my actual code there are dozens of files, and they are big.
How can I achieve that? Thanks all very much!
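ZipFile.write() accepts an arcname argument that sets the name a file gets inside the archive, so the copy step can probably be skipped. A minimal sketch, assuming the files to pick are known up front; zip_selected and its sources mapping are made-up names for illustration:

```python
import os
import zipfile


def zip_selected(zip_path, sources, arc_root="outputFolder"):
    """Zip chosen files under arc_root/<subfolder>/ without copying them first.

    sources maps a subfolder name (e.g. "x86") to a list of file paths.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for subfolder, file_paths in sources.items():
            for file_path in file_paths:
                # arcname is the path stored in the archive, not the path on disk.
                arcname = f"{arc_root}/{subfolder}/{os.path.basename(file_path)}"
                zf.write(file_path, arcname=arcname)
```

For the layout above, the call would be something like zip_selected("example.zip", {"x86": ["C:/temp/x86/file1.dll"], "x64": ["C:/temp/x64/file1.dll"]}).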
I am trying to extract zip files using the zipfile module's extractall method.
My code snippet is
import zipfile
file_path = '/something/airway.zip'
dir_path = 'something/'
with zipfile.ZipFile(file_path, "r") as zip_ref:
    zip_ref.extractall(dir_path)
I have two zip files named test (1.1 MB) and airway (520 MB).
For test.zip the target folder contains all the files, but for airway.zip it creates another folder named Airway inside my target folder and extracts all the files there. Even after renaming airway.zip to any garbage name, the result was the same.
Is there some workaround to get only the files extracted into my target folder? It is critical for me, as I'm running this extraction automatically from Django.
Python version: 3.9.6;
Django version: 2.2
I ran your code and it seems to be only a problem of the zipfile itself. If you create a zipfile by selecting only the elements you get the result you got with test.zip. If you create it by selecting a folder holding the elements the folder will be there if you extract it again, no matter what you name your zip file.
I have two articles related to this:
https://www.kite.com/python/docs/zipfile.ZipFile.extractall
https://www.geeksforgeeks.org/working-zip-files-python/
If neither of these articles solves your problem, I think that instead of zipping the files in the folder you zipped the folder itself, so try zipping the files inside the folder.
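If re-creating the archive is not an option, the top-level folder could also be stripped while extracting. This is only a sketch: extract_without_top_folder is a made-up name, it assumes every member sits under a single top-level folder, and unlike extractall() it does not sanitize member names, so it should only be used on archives you trust.

```python
import os
import shutil
import zipfile


def extract_without_top_folder(zip_path, dest_dir):
    """Extract members into dest_dir, dropping the first path component."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = info.filename
            # "Airway/sub/b.txt" -> "sub/b.txt"; root-level names stay as they are.
            relative = name.split("/", 1)[1] if "/" in name else name
            if not relative:
                continue  # the entry for the top-level folder itself
            target = os.path.join(dest_dir, relative)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
```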
I'm creating a program in Python which downloads a set of files and puts them into an archive with the zipfile module.
I already found out how to append to the archive, but there are cases where the files in the archive already exist and should be overwritten.
Currently, if I append an already existing file to the archive I get a duplicate.
Does anyone know how to delete a file in an archive?
From http://docs.python.org/2/library/zipfile
ZipFile.namelist()
Return a list of archive members by name.
So it is trivial to get hold of the members list before appending to the file and performing a check operation against the list of existing members within the archive.
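Such a check could look roughly like this (add_if_missing is a made-up helper name, and the data is assumed to already be in memory as bytes or str):

```python
import zipfile


def add_if_missing(zip_path, arcname, data):
    """Append data as arcname unless that member already exists.

    Returns True if the member was written, False if it was already there.
    """
    # Mode "a" appends to an existing archive, or creates it if it is missing.
    with zipfile.ZipFile(zip_path, "a") as zf:
        if arcname in zf.namelist():
            return False
        zf.writestr(arcname, data)
        return True
```

This only avoids the duplicate; it does not overwrite the member that is already in the archive.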
In addition: removing from a ZIP file is not supported. If needed, you have to write a new archive, copy over the existing files, and omit the file to be removed.
See also
Delete file from zipfile with the ZipFile Module
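A sketch of that rewrite approach, assuming the archive is small enough to read each member into memory; remove_member and the ".tmp" suffix are choices made for this illustration:

```python
import os
import zipfile


def remove_member(zip_path, member_name):
    """Rewrite the archive at zip_path without the member called member_name."""
    tmp_path = zip_path + ".tmp"
    with zipfile.ZipFile(zip_path) as src, \
            zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename != member_name:
                # Passing the ZipInfo keeps the member's name, date and
                # compression type.
                dst.writestr(info, src.read(info))
    # Swap the new archive in only after both files have been closed.
    os.replace(tmp_path, zip_path)
```

To overwrite a file, call remove_member() first and then append the new version in "a" mode as before.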