How to selectively zip files from specified folders - python

I want to selectively zip files from some folders but couldn't find a good way.
The source folder structure:
C:/temp/x86/file1.dll
C:/temp/x86/file2.dll
C:/temp/x86/file3.txt
And
C:/temp/x64/file1.dll
C:/temp/x64/file2.dll
C:/temp/x64/file4.dll
My requirement:
Zip file1.dll from C:/temp/x86 and C:/temp/x64 so that the copies land in outputFolder/x86 and outputFolder/x64 respectively inside the archive. The zip file name can be example.zip.
That is, after example.zip is unzipped, the output structure should be:
outputFolder/x86/file1.dll
And
outputFolder/x64/file1.dll
One solution is to manually copy the files into the destination folder and then zip them.
But I want to avoid the copy, because in my actual code there are dozens of files and they are big.
How can I achieve that? Thanks all very much!
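
One possible approach, sketched here under assumptions rather than offered as a definitive answer: zipfile.ZipFile.write() accepts an arcname argument that sets the path stored inside the archive, so files can be added straight from their source folders without an intermediate copy. The function name zip_selected and its parameters are hypothetical, and the "outputFolder" prefix is taken from the layout described in the question.

```python
import zipfile
from pathlib import Path


def zip_selected(source_root, subfolders, filenames, zip_path, prefix="outputFolder"):
    """Add chosen files from each subfolder directly to one archive (hypothetical helper)."""
    source_root = Path(source_root)
    added = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for sub in subfolders:
            for name in filenames:
                src = source_root / sub / name
                if not src.is_file():
                    continue  # skip files that this folder does not contain
                # arcname controls the path inside the zip, independent of the source path
                arcname = f"{prefix}/{sub}/{name}"
                archive.write(src, arcname=arcname)
                added.append(arcname)
    return added
```

With the folders from the question, a call such as zip_selected("C:/temp", ["x86", "x64"], ["file1.dll"], "example.zip") would read each file once from its original location and store it under outputFolder/x86 or outputFolder/x64 in the archive.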

Related

How do I copy subfolders into another location

I'm making a program to back up files and folders to a destination.
The problem I'm currently facing is if I have a folder inside a folder and so on, with files in between them, I can't Sync them at the destination.
e.g.:
The source contains folder 1 and file 2. Folder 1 contains folder 2, folder 2 contains folder 3 and files etc...
The backup only contains folder 1 and file 2.
If the backup doesn't exist, I simply use shutil.copytree(path, path_backup). But when the backup already exists and I need to sync, I can't get the files and folders across, or at least I'm not seeing a way to do it. I have walked the directory with for path, dir, files in os.walk(directory) and even tried what someone suggested in another post:
def walk_folder(target_path, path_backup):
    for files in os.scandir(target_path):
        if os.path.isfile(files):
            file_name = os.path.abspath(files)
            print(file_name)
            os.makedirs(path_backup)
        elif os.path.isdir(files):
            walk_folder(files, path_backup)
Is there a way to build the directories in the backup folder from the ground up and then add the files alongside, or is the only way to delete the whole folder and use shutil.copytree(path, path_backup) again?
With makedirs, all I get is an error saying it can't create the folder because it already exists. This is understandable, as it's trying to write in the Source folder and not in the backup. Is there a way to replace the Source part of the path with the backup path?
If any more code is needed feel free to ask!
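
A minimal sketch of one way to do this, assuming "sync" means copying files that are missing from the backup or newer in the source: os.path.relpath() gives each folder's path relative to the source root, which can then be joined onto the backup root, and os.makedirs(..., exist_ok=True) avoids the "already exists" error. The name sync_tree is hypothetical and the newer-by-mtime rule is an assumption; deleted files are not handled.

```python
import os
import shutil


def sync_tree(source, backup):
    """Copy missing or newer files from source into backup (hypothetical helper)."""
    copied = []
    for current_dir, _dirs, files in os.walk(source):
        # Re-root the current source folder under the backup folder
        rel = os.path.relpath(current_dir, source)
        target_dir = os.path.normpath(os.path.join(backup, rel))
        os.makedirs(target_dir, exist_ok=True)  # no error if the folder already exists
        for name in files:
            src = os.path.join(current_dir, name)
            dst = os.path.join(target_dir, name)
            # Assumption: a file needs copying if it is absent or older in the backup
            if not os.path.exists(dst) or os.path.getmtime(src) > os.path.getmtime(dst):
                shutil.copy2(src, dst)
                copied.append(dst)
    return copied
```

On Python 3.8 or later, shutil.copytree(path, path_backup, dirs_exist_ok=True) is a shorter alternative when re-copying every file is acceptable.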

namelist() method not listing directories in python

I want to extract all the folder names from the zip file so that I can extract them separately. My logic works with one zip, but it does not work with another zip that has the same structure.
from zipfile import ZipFile
root_dir = r'C:/Workspace/Neo4j/FileStore/RDAR/data.zip'
archive = ZipFile(root_dir, "r")
folder_paths = []
for file in archive.namelist():
    print(file)
    # Directory entries, when present, end with a slash
    if file.endswith("/"):
        folder_paths.append(file)
The above code works with data10.zip, but it does not list the directories in data.zip.
Folder structure of data.zip, which is not working
Folder structure of data10.zip
I am able to extract the list of folders in data10.zip as shown above, but not in data.zip.
Any clue what might be the reason ?
Thanks
I am trying to list the directories from a zip. My code works for one zip but not for another with the same structure.
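
A likely cause, offered as an assumption since the two archives are not available here: a ZIP file is not required to contain explicit directory entries, and some tools store only the files, so namelist() returns no names ending in "/". A sketch that derives the folders from the member paths instead works in both cases. The name folders_in_zip is hypothetical.

```python
import zipfile
from pathlib import PurePosixPath


def folders_in_zip(zip_path):
    """Return every folder implied by the member names (hypothetical helper)."""
    folders = set()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            # Keep explicit directory entries when the archive has them
            if name.endswith("/"):
                folders.add(str(path) + "/")
            # Every parent of a member is a folder, even without its own entry
            for parent in path.parents:
                if str(parent) != ".":
                    folders.add(str(parent) + "/")
    return sorted(folders)
```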

Merging files within a subfolder for a large batch of subfolders

I am analyzing some data in a bioinformatics pipeline (qiime). I am trying to use a cat command to merge two files within a subfolder - I need to do this for 330 files, but am having trouble with the command string.
My current string:
cat AdapterRemoval/*.fastq/output_paired.collapsed AdapterRemoval/*.fastq/output_paired.collapsed.truncated > AdapterRemoval/*.fastq/mergedfile.fastq
This is the code I am using, with the * meant to indicate that the command should look in all .fastq folders for the files output_paired.collapsed and output_paired.collapsed.truncated, merge those files into one mergedfile.fastq, and place it in the same folder where the original files are found.
For instance:
cat AdapterRemoval/C1.fastq/output_paired.collapsed AdapterRemoval/C1.fastq/output_paired.collapsed.truncated > AdapterRemoval/C1.fastq/mergedfile.fastq
So that those two files found within the AdapterRemoval/C1.fastq subfolder would be merged and the merged file placed in that same subfolder.
In fact, when I type it out like this using the single filepath with a specific folder ID, it works. But when I put the * in place of the subfolder I get an error saying there is no such directory or file as AdapterRemoval/*.fastq/mergedfile.fastq
Does anyone know what I might be doing wrong? Any help would be much appreciated!
Thank you,
Sarah
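
The error is consistent with how the shell treats the redirection: a single cat command writes to one output file, and since no mergedfile.fastq exists yet, the pattern after > matches nothing and is used literally as a path. The merge has to run once per folder. A minimal sketch of that per-folder loop, written in Python as an alternative to a shell loop, is below. The name merge_per_folder is hypothetical; the file names are the ones from the command above.

```python
import shutil
from pathlib import Path


def merge_per_folder(root="AdapterRemoval"):
    """Concatenate the two collapsed files inside each *.fastq folder (hypothetical helper)."""
    merged = []
    for folder in sorted(Path(root).glob("*.fastq")):
        if not folder.is_dir():
            continue
        parts = [
            folder / "output_paired.collapsed",
            folder / "output_paired.collapsed.truncated",
        ]
        if not all(part.is_file() for part in parts):
            continue  # skip folders missing either input
        target = folder / "mergedfile.fastq"
        with open(target, "wb") as out:
            for part in parts:
                with open(part, "rb") as src:
                    shutil.copyfileobj(src, out)  # stream, so large files are not loaded whole
        merged.append(target)
    return merged
```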

python get list of folder names from zip folder

I have 80 zipped files. In each of them, there are about 20 folders (that I will call first level folders). What is the Python code to get a list of all of the first level folder names from each of the zipped files?
I need to have an excel spread sheet listing the names of the first level folders from all 80 zipped files.
Tricky part: There are 2 types of zipped files amongst those 80. Some have .zip extension while others have .7z extension.
The Python zipfile module documentation answers your question well.
ZipFile.namelist()
Return a list of archive members by name.
For 7zip, it may be necessary to use the subprocess module and run 7zip; not all 7zip files can be opened by the zipfile module.
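
As a sketch of how namelist() could be used for the .zip files only (the .7z files are not covered here), the first path component of each member gives its first level folder, and the csv module can write a file that Excel opens. The function names, the CSV column names, and the choice of CSV rather than a native Excel format are all assumptions.

```python
import csv
import zipfile
from pathlib import Path


def first_level_folders(zip_path):
    """Return the sorted first-level folder names of one zip (hypothetical helper)."""
    names = set()
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            head, sep, _rest = member.partition("/")
            if sep and head:  # members without "/" are top-level files, not folders
                names.add(head)
    return sorted(names)


def write_report(zip_dir, csv_path):
    """Write one CSV row per (archive, first-level folder) pair."""
    with open(csv_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["archive", "first_level_folder"])
        for zip_path in sorted(Path(zip_dir).glob("*.zip")):
            for folder in first_level_folders(zip_path):
                writer.writerow([zip_path.name, folder])
```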

Get big TAR(gz)-file contents by dir levels

I use python tarfile module.
I have a system backup in tar.gz file.
I need to get first level dirs and files list without getting ALL the list of files in the archive because it's TOO LONG.
For example: I need to get ['bin/', 'etc/', ... 'var/'] and that's all.
How can I do it? Maybe not even with a tar file? If so, how?
You can't scan the contents of a tar without scanning the entire file; it has no central index. You need something like a ZIP.
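
Within that constraint, a minimal sketch with the tarfile module can still return just the first level names: it walks the members one at a time and keeps only the first path component of each. The whole archive is still read, so this shortens the result, not the scan. The name first_level_entries is hypothetical.

```python
import tarfile


def first_level_entries(tar_path):
    """Collect only the first-level names of a tar archive (hypothetical helper)."""
    entries = set()
    with tarfile.open(tar_path, "r:*") as archive:  # "r:*" detects the compression
        for member in archive:  # members are yielded one at a time
            name = member.name
            if name.startswith("./"):
                name = name[2:]
            if name in ("", "."):
                continue
            top, sep, _rest = name.partition("/")
            # Anything with a deeper path, or marked as a directory, is a folder
            if sep or member.isdir():
                entries.add(top + "/")
            else:
                entries.add(top)
    return sorted(entries)
```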
