how can i copy only the structure (excluding the files) of a path with all subfolders?
For example
path1\subfolder1\subsubfolder1
\subsubfolder2
\subfolder2\subsubfolder1
\subsubfolder2
copy to
path2\subfolder1\subsubfolder1
\subsubfolder2
\subfolder2\subsubfolder1
\subsubfolder2
I need this because i am loading pickles from path1 and calculating some things. The save need to be to a new path so i can compare the changes.
Already tried it with path1.split('/') but this only works if path1 is not changing so i need to change this everytime for new calculations if I want to change the pickles.
You can use os.walk to recursively search through all sub folders for the given path.
Then for each subfolder found you can replace the input path with the output path and use the os.makedirs to create the folders if they do not exist yet.
import os
def copy_folder_structure(input_dir, output_dir):
all_subfolders = [x[0] for x in os.walk(input_dir)]
for folder in all_subfolders:
output_folder = folder.replace(input_dir, output_dir)
if not os.path.exists(output_folder):
os.makedirs(output_folder)
copy_folder_structure("path1", "path2")
Related
The problem is to get all the file names in a list that are under a particular directory and in a particular condition.
We have a directory named "test_dir".
There, we have sub directory "sub_dir_1", "sub_dir_2", "sub_dir_3"
and inside of each sub dir, we have some files.
sub_dir_1 has files ['test.txt', 'test.wav']
sub_dir_2 has files ['test_2.txt', 'test.wav']
sub_dir_2 has files ['test_3.txt', 'test_3.tsv']
What I want to get at the end of the day is a list of of the "test.wav" that exist under the "directory" ['sub_dir_1/test.wav', 'sub_dir_2/test.wav']. As you can see the condition is to get every path of 'test.wav' under the mother directory.
mother_dir_name = "directory"
get_test_wav(mother_dir_name)
returns --> ['sub_dir_1/test.wav', 'sub_dir_2/test.wav']
EDITED
I have changed the direction of the problem.
We first have this list of file names
["sub_dir_1/test.wav","sub_dir_2/test.wav","abc.csv","abc.json","sub_dir_3/test.json"]
from this list I would like to get a list that does not contain any path that contains "test.wav" like below
["abc.csv","abc.json","sub_dir_3/test.json"]
You can use glob patterns for this. Using pathlib,
from pathlib import Path
mother_dir = Path("directory")
list(mother_dir.glob("sub_dir_*/*.wav"))
Notice that I was fairly specific about which subdirectories to check - anything starting with "sub_dir_". You can change that pattern as needed to fit your environment.
Use os.walk():
import os
def get_test_wav(folder):
found = []
for root, folders, files in os.walk(folder):
for file in files:
if file == "test.wav":
found.append(os.path.join(root, file))
return found
Or a list comprehension approach:
import os
def get_test_wav(folder):
found = [f"{arr[0]}\\test.wav" for arr in os.walk(folder) if "test.wav" in arr[2]]
return found
I think this might help you How can I search sub-folders using glob.glob module?
The main way to make a list of files in a folder (to make it callable later) is:
file_path = os.path.join(motherdirectopry, 'subdirectory')
list_files = glob.glob(file_path + "/*.wav")
just check that link to see how you can join all sub-directories in a folder.
This will also give you all the file in sub directories that only has .wav at the end:
os.chdir(motherdirectory)
glob.glob('**/*.wav', recursive=True)
I have the following directory, in the parent dir there are several folders lets say ABCD and within each folder many zips with names as displayed and the letter of the parent folder included in the name along with other info:
-parent--A-xxxAxxxx_timestamp.zip
-xxxAxxxx_timestamp.zip
-xxxAxxxx_timestamp.zip
--B-xxxBxxxx_timestamp.zip
-xxxBxxxx_timestamp.zip
-xxxBxxxx_timestamp.zip
--C-xxxCxxxx_timestamp.zip
-xxxCxxxx_timestamp.zip
-xxxCxxxx_timestamp.zip
--D-xxxDxxxx_timestamp.zip
-xxxDxxxx_timestamp.zip
-xxxDxxxx_timestamp.zip
I need to unzip only selected zips in this tree and place them in the same directory with the same name without the .zip extension.
Output:
-parent--A-xxxAxxxx_timestamp
-xxxAxxxx_timestamp
-xxxAxxxx_timestamp
--B-xxxBxxxx_timestamp
-xxxBxxxx_timestamp
-xxxBxxxx_timestamp
--C-xxxCxxxx_timestamp
-xxxCxxxx_timestamp
-xxxCxxxx_timestamp
--D-xxxDxxxx_timestamp
-xxxDxxxx_timestamp
-xxxDxxxx_timestamp
My effort:
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
zipfile=os.path.basename(path) #save the zipfile path
zip_ref=zipfile.ZipFile(path, 'r')
zip_ref=extractall(zipfile.replace(r'.zip', '')) #unzip to a folder without the .zip extension
The problem is that i dont know how to save the A,B,C,D etc to include them in the path where the files will be unzipped. Thus, the unzipped folders are created in the parent directory. Any ideas?
The code that you have seems to be working fine, you just to make sure that you are not overriding variable names and using the correct ones. The following code works perfectly for me
import os
import zipfile
import glob
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
zf = os.path.basename(path) #save the zipfile path
zip_ref = zipfile.ZipFile(path, 'r')
zip_ref.extractall(path.replace(r'.zip', '')) #unzip to a folder without the .zip extension
Instead of trying to do it in a single statement , it would be much easier and more readable to do it by first getting list of all folders and then get list of files inside each folder. Example -
import os.path
for folder in glob.glob("./*"):
#Using *.zip to only get zip files
for path in glob.glob(os.path.join(".",folder,"*.zip")):
filename = os.path.split(path)[1]
if folder in filename:
#Do your logic
I'm trying to export all of my maps that are in my subdirectories.
I have the code to export, but I cannot figure out where to add the loop that will make it do this for all subdirectories. As of right now, it is exporting the maps in the directory, but not the subfolders.
import arcpy, os
arcpy.env.workspace = ws = r"C:\Users\162708\Desktop\Burn_Zones"
for subdir, dirs, files in os.walk(ws):
for file in files:
mxd_list = arcpy.ListFiles("*.mxd")
for mxd in mxd_list:
current_mxd = arcpy.mapping.MapDocument(os.path.join(ws, mxd))
pdf_name = mxd[:-4] + ".pdf"
arcpy.mapping.ExportToPDF(current_mxd, pdf_name)
del mxd_list
What am I doing wrong that it isn't able to iterate through the subfolders?
Thank you!
Iterating through os.walk result you should give tuples containing (path, dirs, files) (the first in the tuple is the current path that contains files which is why I tend to name it that way). The current directory does not change automatically so you need to incorporate it into the path you're giving to arcpy.ListFiles like this:
arcpy.ListFiles(os.path.join(path, "*.mxd"))
You should also remove the loop for file in files. It seems like you're exporting the files per directory so why export the whole directory every time for each file?
Also you should change arcpy.mapping.MapDocument(os.path.join(ws, mxd)) to arcpy.mapping.MapDocument(os.path.join(path, mxd)) where path is again the first element from os.walk.
I'm uploading a zipped folder that contains a folder of text files, but it's not detecting that the folder that is zipped up is a directory. I think it might have something to do with requiring an absolute path in the os.path.isdir call, but can't seem to figure out how to implement that.
zipped = zipfile.ZipFile(request.FILES['content'])
for libitem in zipped.namelist():
if libitem.startswith('__MACOSX/'):
continue
# If it's a directory, open it
if os.path.isdir(libitem):
print "You have hit a directory in the zip folder -- we must open it before continuing"
for item in os.listdir(libitem):
The file you've uploaded is a single zip file which is simply a container for other files and directories. All of the Python os.path functions operate on files on your local file system which means you must first extract the contents of your zip before you can use os.path or os.listdir.
Unfortunately it's not possible to determine from the ZipFile object whether an entry is for a file or directory.
A rewrite or your code which does an extract first may look something like this:
import tempfile
# Create a temporary directory into which we can extract zip contents.
tmpdir = tempfile.mkdtemp()
try:
zipped = zipfile.ZipFile(request.FILES['content'])
zipped.extractall(tmpdir)
# Walk through the extracted directory structure doing what you
# want with each file.
for (dirpath, dirnames, filenames) in os.walk(tmpdir):
# Look into subdirectories?
for dirname in dirnames:
full_dir_path = os.path.join(dirpath, dirname)
# Do stuff in this directory
for filename in filenames:
full_file_path = os.path.join(dirpath, filename)
# Do stuff with this file.
finally:
# ... Clean up temporary diretory recursively here.
Usually to make things handle relative paths etc when running scripts you'd want to use os.path.
It seems to me that you're reading from a Zipfile the items you've not actually unzipped it so why would you expect the file/dirs to exist?
Usually I'd print os.getcwd() to find out where I am and also use os.path.join to join with the root of the data directory, whether that is the same as the directory containing the script I can't tell. Using something like scriptdir = os.path.dirname(os.path.abspath(__file__)).
I'd expect you would have to do something like
libitempath = os.path.join(scriptdir, libitem)
if os.path.isdir(libitempath):
....
But I'm guessing at what you're doing as it's a little unclear for me.
Given the following strings:
dir/dir2/dir3/dir3/file.txt
dir/dir2/dir3/file.txt
example/directory/path/file.txt
I am looking to create the correct directories and blank files within those directories.
I imported the os module and I saw that there is a mkdir function, but I am not sure what to do to create the whole path and blank files. Any help would be appreciated. Thank you.
Here is the answer on all your questions (directory creation and blank file creation)
import os
fileList = ["dir/dir2/dir3/dir3/file.txt",
"dir/dir2/dir3/file.txt",
"example/directory/path/file.txt"]
for file in fileList:
dir = os.path.dirname(file)
# create directory if it does not exist
if not os.path.exists(dir):
os.makedirs(dir)
# Create blank file if it does not exist
with open(file, "w"):
pass
First of all, given that try to create directory under a directory that doesn't exist, os.mkdir will raise an error. As such, you need to walk through the paths and check whether each of the subdirectories has or has not been created (and use mkdir as required). Alternative, you can use os.makedirs to handle this iteration for you.
A full path can be split into directory name and filename with os.path.split.
Example:
import os
(dirname, filename) = os.path.split('dir/dir2/dir3/dir3/file.txt')
os.makedirs(dirname)
Given we have a set of dirs we want to create, simply use a for loop to iterate through them. Schematically:
dirlist = ['dir1...', 'dir2...', 'dir3...']
for dir in dirlist:
os.makedirs( ... )