How to create zip of a specific folder using shutil - python

I'm trying to zip all the folders of a particular directory separately with their respective folder names as the zip file name just the way how winzip does.My code below:
folder_list = os.walk('.').next()[1] # bingo
print folder_list
for each_folder in folder_list:
shutil.make_archive(each_folder, 'zip', os.getcwd())
But what it is doing is creating a zip of one folder and dumping all other files and folders of that directory into the zip file.LIke that it is doing for all the folders inside the current directory.
Any help on this !!!

With a little more research in shutil, I'm now able to make my code work. Below is my code:
import os, shutil
#Get the list of all folders present within the particular directory
folder_list = os.walk('.').next()[1]
#Start zipping the folders
for each_folder in folder_list:
shutil.make_archive(each_folder, 'zip', os.getcwd() + "\\" + each_folder)

I don;t think that is possible. You could look at the source.
In particular, at line 683 you can see that it explicitly passes compression=zipfile.ZIP_DEFLATED if your Python has the zipfile module, while the fallback code at line 635 doesn't pass any arguments besides -r and -q to the zip command-line tool.
You could try this one:
args = ['tar', '-zcf', dest_file_name, '-C', me_directory, '.']
res = subprocess.call(args)
If you want to get list of directories use some well written libs:
for (root, directories, _) in os.walk(my_dir):
for dir_name in directories:
path_to_dir = os.path.join(root, dir_name)// don't make concat, like a+'//'+b, thats not enviroment saint

Related

Zip each folder (directory) recursively

I am trying to zip each folder on its own in Python. However, the first folder is being zipped and includes all folders within it. Could someone please explain what is going on? Should I not be using shutil for this?
#%% Set path variable
path = r"G:\Folder"
os.chdir(path)
os.getcwd()
#%% Zip all folders
def retrieve_file_paths(dirName):
# File paths variable
filePaths = []
# Read all directory, subdirectories and file lists
for root, directories, files in os.walk(dirName):
for filename in directories:
# Createthe full filepath by using os module
filePath = os.path.join(root, filename)
filePaths.append(filePath)
# return all paths
return filePaths
filepaths = retrieve_file_paths(path)
#%% Print folders and start zipping individually
for x in filepaths:
print(x)
shutil.make_archive(x, 'zip', path)
shutil.make_archive will make an archive of all files and subfolders - since this is what most people want. If you need more choice of what files are included, you must use zipfile directly.
You can do this right within the walk loop (that is what it's for).
import os
import zipfile
dirName = 'C:\...'
# Read all directory, subdirectories and file lists
for root, directories, files in os.walk(dirName):
zf = zipfile.ZipFile(os.path.join(root, "thisdir.zip"), "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9)
for name in files:
if name == 'thisdir.zip': continue
filePath = os.path.join(root, name)
zf.write(filePath, arcname=name)
zf.close()
This will create a file "thisdir.zip" in each subdirectory, containing only the files within this directory.
(edit: tested & corrected code example)
Following Torben's answer to my question, I modified the code to zip each directory recursively. I realised what had happened was that I was not specifying sub directories. Code below:
#Set path variable
path = r"insert root directory here"
os.chdir(path)
# Declare the functionto return all file paths in selected directory
def retrieve_file_paths(dirName):
for root, dirs, files in os.walk(dirName):
for dir in dirs:
zf = zipfile.ZipFile(os.path.join(root+dir, root+dir+'.zip'), "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9)
files = os.listdir(root+dir)
print(files)
filePaths.append(files)
for f in files:
filepath = root + dir +'/'+ f
zf.write(filepath, arcname=f)
zf.close()
retrieve_file_paths(path)
it's a relativly simple answer once you got a look onto the Docs.
You can see the following under shutil.make_archive:
Note This function is not thread-safe.
The way threading in computing works on a high level basis:
On a machine there are cores, which can process data. (e.g. AMD Ryzen 5 5800x with 8cores)
Within a process, there are threads (e.g. 16 Threads on the Ryzen 5800X).
However, in multiprocessing there is no data shared between the processes.
In multithreading within one process you can access data from the same variable.
Because this function is not thread-safe, you will share the variable "x" and access the same item. Which means there can only be one output.
Have a look into multithreading and works with locks in order to sequelize threads.
Cheers

Python: Unzip selected files in directory tree

I have the following directory, in the parent dir there are several folders lets say ABCD and within each folder many zips with names as displayed and the letter of the parent folder included in the name along with other info:
-parent--A-xxxAxxxx_timestamp.zip
-xxxAxxxx_timestamp.zip
-xxxAxxxx_timestamp.zip
--B-xxxBxxxx_timestamp.zip
-xxxBxxxx_timestamp.zip
-xxxBxxxx_timestamp.zip
--C-xxxCxxxx_timestamp.zip
-xxxCxxxx_timestamp.zip
-xxxCxxxx_timestamp.zip
--D-xxxDxxxx_timestamp.zip
-xxxDxxxx_timestamp.zip
-xxxDxxxx_timestamp.zip
I need to unzip only selected zips in this tree and place them in the same directory with the same name without the .zip extension.
Output:
-parent--A-xxxAxxxx_timestamp
-xxxAxxxx_timestamp
-xxxAxxxx_timestamp
--B-xxxBxxxx_timestamp
-xxxBxxxx_timestamp
-xxxBxxxx_timestamp
--C-xxxCxxxx_timestamp
-xxxCxxxx_timestamp
-xxxCxxxx_timestamp
--D-xxxDxxxx_timestamp
-xxxDxxxx_timestamp
-xxxDxxxx_timestamp
My effort:
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
zipfile=os.path.basename(path) #save the zipfile path
zip_ref=zipfile.ZipFile(path, 'r')
zip_ref=extractall(zipfile.replace(r'.zip', '')) #unzip to a folder without the .zip extension
The problem is that i dont know how to save the A,B,C,D etc to include them in the path where the files will be unzipped. Thus, the unzipped folders are created in the parent directory. Any ideas?
The code that you have seems to be working fine, you just to make sure that you are not overriding variable names and using the correct ones. The following code works perfectly for me
import os
import zipfile
import glob
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
zf = os.path.basename(path) #save the zipfile path
zip_ref = zipfile.ZipFile(path, 'r')
zip_ref.extractall(path.replace(r'.zip', '')) #unzip to a folder without the .zip extension
Instead of trying to do it in a single statement , it would be much easier and more readable to do it by first getting list of all folders and then get list of files inside each folder. Example -
import os.path
for folder in glob.glob("./*"):
#Using *.zip to only get zip files
for path in glob.glob(os.path.join(".",folder,"*.zip")):
filename = os.path.split(path)[1]
if folder in filename:
#Do your logic

How to run script for all files in a folder/directry

I am new to python. I have successful written a script to search for something within a file using :
open(r"C:\file.txt) and re.search function and all works fine.
Is there a way to do the search function with all files within a folder? Because currently, I have to manually change the file name of my script by open(r"C:\file.txt),open(r"C:\file1.txt),open(r"C:\file2.txt)`, etc.
Thanks.
You can use os.walk to check all the files, as the following:
import os
for root, _, files in os.walk(path):
for filename in files:
with open(os.path.join(root, filename), 'r') as f:
#your code goes here
Explanation:
os.walk returns tuple of (root path, dir names, file names) in the folder, so you can iterate through filenames and open each file by using os.path.join(root, filename) which basically joins the root path with the file name so you can open the file.
Since you're a beginner, I'll give you a simple solution and walk through it.
Import the os module, and use the os.listdir function to create a list of everything in the directory. Then, iterate through the files using a for loop.
Example:
# Importing the os module
import os
# Give the directory you wish to iterate through
my_dir = <your directory - i.e. "C:\Users\bleh\Desktop\files">
# Using os.listdir to create a list of all of the files in dir
dir_list = os.listdir(my_dir)
# Use the for loop to iterate through the list you just created, and open the files
for f in dir_list:
# Whatever you want to do to all of the files
If you need help on the concepts, refer to the following:
for looops in p3: http://www.python-course.eu/python3_for_loop.php
os function Library (this has some cool stuff in it): https://docs.python.org/2/library/os.html
Good luck!
You can use the os.listdir(path) function:
import os
path = '/Users/ricardomartinez/repos/Salary-API'
# List for all files in a given PATH
file_list = os.listdir(path)
# If you want to filter by file type
file_list = [file for file in os.listdir(path) if os.path.splitext(file)[1] == '.py']
# Both cases yo can iterate over the list and apply the operations
# that you have
for file in file_list:
print(file)
#Operations that you want to do over files

Directory is not being recognized in Python

I'm uploading a zipped folder that contains a folder of text files, but it's not detecting that the folder that is zipped up is a directory. I think it might have something to do with requiring an absolute path in the os.path.isdir call, but can't seem to figure out how to implement that.
zipped = zipfile.ZipFile(request.FILES['content'])
for libitem in zipped.namelist():
if libitem.startswith('__MACOSX/'):
continue
# If it's a directory, open it
if os.path.isdir(libitem):
print "You have hit a directory in the zip folder -- we must open it before continuing"
for item in os.listdir(libitem):
The file you've uploaded is a single zip file which is simply a container for other files and directories. All of the Python os.path functions operate on files on your local file system which means you must first extract the contents of your zip before you can use os.path or os.listdir.
Unfortunately it's not possible to determine from the ZipFile object whether an entry is for a file or directory.
A rewrite or your code which does an extract first may look something like this:
import tempfile
# Create a temporary directory into which we can extract zip contents.
tmpdir = tempfile.mkdtemp()
try:
zipped = zipfile.ZipFile(request.FILES['content'])
zipped.extractall(tmpdir)
# Walk through the extracted directory structure doing what you
# want with each file.
for (dirpath, dirnames, filenames) in os.walk(tmpdir):
# Look into subdirectories?
for dirname in dirnames:
full_dir_path = os.path.join(dirpath, dirname)
# Do stuff in this directory
for filename in filenames:
full_file_path = os.path.join(dirpath, filename)
# Do stuff with this file.
finally:
# ... Clean up temporary diretory recursively here.
Usually to make things handle relative paths etc when running scripts you'd want to use os.path.
It seems to me that you're reading from a Zipfile the items you've not actually unzipped it so why would you expect the file/dirs to exist?
Usually I'd print os.getcwd() to find out where I am and also use os.path.join to join with the root of the data directory, whether that is the same as the directory containing the script I can't tell. Using something like scriptdir = os.path.dirname(os.path.abspath(__file__)).
I'd expect you would have to do something like
libitempath = os.path.join(scriptdir, libitem)
if os.path.isdir(libitempath):
....
But I'm guessing at what you're doing as it's a little unclear for me.

Create path and filename from string in Python

Given the following strings:
dir/dir2/dir3/dir3/file.txt
dir/dir2/dir3/file.txt
example/directory/path/file.txt
I am looking to create the correct directories and blank files within those directories.
I imported the os module and I saw that there is a mkdir function, but I am not sure what to do to create the whole path and blank files. Any help would be appreciated. Thank you.
Here is the answer on all your questions (directory creation and blank file creation)
import os
fileList = ["dir/dir2/dir3/dir3/file.txt",
"dir/dir2/dir3/file.txt",
"example/directory/path/file.txt"]
for file in fileList:
dir = os.path.dirname(file)
# create directory if it does not exist
if not os.path.exists(dir):
os.makedirs(dir)
# Create blank file if it does not exist
with open(file, "w"):
pass
First of all, given that try to create directory under a directory that doesn't exist, os.mkdir will raise an error. As such, you need to walk through the paths and check whether each of the subdirectories has or has not been created (and use mkdir as required). Alternative, you can use os.makedirs to handle this iteration for you.
A full path can be split into directory name and filename with os.path.split.
Example:
import os
(dirname, filename) = os.path.split('dir/dir2/dir3/dir3/file.txt')
os.makedirs(dirname)
Given we have a set of dirs we want to create, simply use a for loop to iterate through them. Schematically:
dirlist = ['dir1...', 'dir2...', 'dir3...']
for dir in dirlist:
os.makedirs( ... )

Categories