Directory walk and remove files/directories

I copied a (presumably large) number of files onto an existing directory, and I need to reverse the action. The target directory also contains other files that I need to keep, so I can't simply remove everything in it. I was able to do it with Python. Here's the script:
import os, sys, shutil
source = "/tmp/test/source"
target = "/tmp/test/target"
for root, dirs, files in os.walk(source): # for files and directories in source
    for dir in dirs:
        if dir.startswith("."):
            print(f"Removing Hidden Directory: {dir}")
        else:
            print(f"Removing Directory: {dir}")
        try:
            shutil.rmtree(f"{target}/{dir}") # remove directories and sub-directories
        except FileNotFoundError:
            pass
    for file in files:
        if file.startswith("."): # if filename starts with a dot, it's a hidden file
            print(f"Removing Hidden File: {file}")
        else:
            print(f"Removing File: {file}")
        try:
            os.remove(f"{target}/{file}") # remove files
        except FileNotFoundError:
            pass
print("Done")
The script above walks the original (source) directory and lists its files and directories. It then looks in the directory the files were copied to (target) and removes only the entries whose names appear in the source directory.
How can I do the same thing in Go? I tried filepath.WalkDir(), but as stated in the docs:
WalkDir walks the file tree rooted at root, calling fn for each file
or directory in the tree, including root.
If WalkDir() includes the root, then os.Remove() or os.RemoveAll() will delete the whole thing.

Answered by Cerise Limon. Use os.ReadDir to read the source directory's entries. For each entry, call os.RemoveAll on the corresponding path in the target directory.
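For comparison, here is a minimal sketch of the same top-level-only idea in Python, the language of the original script. In Go, os.ReadDir and os.RemoveAll would play the roles of os.scandir and shutil.rmtree/os.remove here. The function name undo_copy is hypothetical, and the sketch assumes the copy placed every top-level source entry directly under the target.

```python
import os
import shutil


def undo_copy(source, target):
    """Remove from `target` each top-level entry that also exists in `source`.

    Returns the sorted names that were actually removed.
    """
    removed = []
    for entry in os.scandir(source):  # top level only, no recursion
        victim = os.path.join(target, entry.name)
        if os.path.isdir(victim) and not os.path.islink(victim):
            shutil.rmtree(victim)  # real directory: remove it and its contents
        elif os.path.lexists(victim):
            os.remove(victim)  # regular file or symlink
        else:
            continue  # nothing with that name in target
        removed.append(entry.name)
    return sorted(removed)
```

Because only the top level of the source is read, the source root itself is never passed to a remove call. Note that if a copied directory was merged into one that already existed in the target, removing the whole directory also removes the files that were there before the copy.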

Related

Python script to move specific filetypes from the all directories to one folder

I'm trying to write a Python script to move all music files from my whole PC to one specific folder.
They are scattered everywhere and I want to get them all in one place, so I don't want to copy but completely move them.
I was already able to make a list of all the files with this script:
import os
targetfiles = []
extensions = (".mp3", ".wav", ".flac")
for root, dirs, files in os.walk('/'):
    for file in files:
        if file.endswith(extensions):
            targetfiles.append(os.path.join(root, file))
print(targetfiles)
This prints out a nice list of all the files, but I'm stuck on how to move them.
I made many different attempts, and this was one of them:
import os
import shutil
targetfiles = []
extensions = (".mp3", ".wav", ".flac")
for root, dirs, files in os.walk('/'):
    for file in files:
        if file.endswith(extensions):
            targetfiles.append(os.path.join(root, file))
new_path = 'C:/Users/Nicolaas/Music/All' + file
shutil.move(targetfiles, new_path)
But everything I try gives me an error:
TypeError: rename: src should be string, bytes or os.PathLike, not list
I think I've met my limit gathering this all as I'm only starting at Python but I would be very grateful if anyone could point me in the right direction!

You are trying to move a list of files to a new location, but the shutil.move function expects a single file as the first argument. To move all the files in the targetfiles list to the new location, you have to use a loop to move each file individually.
for file in targetfiles:
    shutil.move(file, new_path)
Also, if needed, add a trailing slash to the new path: 'C:/Users/Nicolaas/Music/All/'
On a side note, are you sure that moving all files with those extensions is a good idea? I would suggest copying them or keeping a backup.
Edit:
You can use an if statement to exclude certain folders from being searched.
for root, dirs, files in os.walk('/'):
    if any(folder in root for folder in excluded_folders):
        continue
    for file in files:
        if file.endswith(extensions):
            targetfiles.append(os.path.join(root, file))
Here excluded_folders is a list of the unwanted folders, for example: excluded_folders = ['Program Files', 'Windows']
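Putting the pieces of this answer together, a minimal sketch might look like the following. The function name move_matching and its parameters are hypothetical. It also makes two choices that the answer does not: it prunes excluded folders by exact name so that os.walk never descends into them, and it skips a file when one with the same name already sits in the destination.

```python
import os
import shutil


def move_matching(search_root, dest_dir, extensions, excluded_folders=()):
    """Move files with the given extensions from search_root into dest_dir."""
    extensions = tuple(extensions)  # str.endswith needs a tuple, not a list
    os.makedirs(dest_dir, exist_ok=True)
    dest_abs = os.path.abspath(dest_dir)
    moved = []
    for root, dirs, files in os.walk(search_root):
        # Prune in place: skip excluded folders and the destination itself.
        dirs[:] = [
            d for d in dirs
            if d not in excluded_folders
            and os.path.abspath(os.path.join(root, d)) != dest_abs
        ]
        for name in files:
            if not name.lower().endswith(extensions):
                continue
            dst = os.path.join(dest_dir, name)
            if os.path.exists(dst):
                continue  # do not overwrite a same-named file
            shutil.move(os.path.join(root, name), dst)  # one file per call
            moved.append(dst)
    return moved
```

Building the destination with os.path.join avoids the missing-slash problem mentioned above. It is worth trying this on a small test folder before pointing it at a whole drive.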

I would suggest using glob for matching:
import glob
def match(extension, root_dir):
    # the root_dir parameter of glob.glob() requires Python 3.10 or later
    return glob.glob(f'**\\*.{extension}', root_dir=root_dir, recursive=True)
root_dirs = ['C:\\Path\\to\\Albums', 'C:\\Path\\to\\dir\\with\\music\\files']
excluded_folders = ['Bieber', 'Eminem']
extensions = ("mp3", "wav", "flac")
targetfiles = [
    f'{root_dir}\\{file_name}'
    for root_dir in root_dirs
    for extension in extensions
    for file_name in match(extension, root_dir)
    if not any(excluded_folder in file_name for excluded_folder in excluded_folders)
]
Then you can move these files to new_path.

How to move a directory in Python?

I need to move a directory from one location to another location on the same filesystem. I'm aware of solutions like shutil.move(), but the filesystem in question is an SD card (and therefore extremely slow), and there are a lot of files to move, so simply copying them and then deleting the originals is not acceptable. The Unix mv command can move a directory from one filesystem to the same filesystem without copying any files -- is there a way to do that in Python?

It turns out that the answer is yes. As you probably know, you can move a file (assuming it doesn't already exist in the destination) using os.rename(r'D:\path1\myfile.txt', r'D:\path2\myfile.txt'). You can do the same for directories:
os.rename(r'D:\long\path\to\mydir', r'D:\mydir')
but, of course, that only works if D:\mydir doesn't already exist. If it does exist, and you want to merge the files that are already there with the files that you're moving, you'll need to get a little bit more clever. Here's a snippet that'll do what you want:
import os
def movedir(src, dst):
    try:
        os.rename(src, dst)
        return
    except FileExistsError:
        pass
    for root, dirs, files in os.walk(src):
        dest_root = os.path.join(dst, os.path.relpath(root, src))
        done = []
        for dir_ in dirs:
            try:
                os.rename(os.path.join(root, dir_), os.path.join(dest_root, dir_))
                done.append(dir_)
            except FileExistsError:
                pass
        for dir_ in done:
            dirs.remove(dir_)
        for file in files:
            os.replace(os.path.join(root, file), os.path.join(dest_root, file))
    for root, dirs, files in os.walk(src, topdown=False):
        os.rmdir(root)
Here's a version with comments explaining what everything does:
def movedir(src, dst):
    # if a directory of the same name does not exist in the destination, we can simply rename the directory
    # to a different path, and it will be moved -- it will disappear from the source path and appear in the destination
    # path instantaneously, without any files being copied.
    try:
        os.rename(src, dst)
        return
    except FileExistsError:
        # if a directory of the same name already exists, we must merge them. This is what the algorithm below does.
        # (FileExistsError is the Windows behaviour. On Unix, renaming onto a non-empty directory raises a plain
        # OSError instead, so this code would need adjusting there; see the os.rename() documentation.)
        pass
    for root, dirs, files in os.walk(src):
        dest_root = os.path.join(dst, os.path.relpath(root, src))
        done = []
        for dir_ in dirs:
            try:
                os.rename(os.path.join(root, dir_), os.path.join(dest_root, dir_))
                done.append(dir_)
            except FileExistsError:
                pass
        # tell os.walk() not to recurse into subdirectories we've already moved. see the documentation on os.walk()
        # for why this works: https://docs.python.org/3/library/os.html#os.walk
        # lists can't be modified during iteration, so we have to put all the items we want to remove from the list
        # into a second list, and then remove them after the loop.
        for dir_ in done:
            dirs.remove(dir_)
        # move files. os.replace() is a bit like os.rename() but if there's an existing file in the destination with
        # the same name, it will be deleted and replaced with the source file without prompting the user. It doesn't
        # work on directories, so we only use it for files.
        # You may want to change this to os.rename() and surround it with a try/except FileExistsError if you
        # want to prompt the user to overwrite files.
        for file in files:
            os.replace(os.path.join(root, file), os.path.join(dest_root, file))
    # clean up after ourselves.
    # Directories we were able to successfully move just by renaming them (directories that didn't exist in the
    # destination already) have already disappeared from the source. Directories we had to merge are still there in
    # the source, but their contents were moved. os.rmdir() will fail unless the directory is already empty.
    for root, dirs, files in os.walk(src, topdown=False):
        os.rmdir(root)
movedir(r'D:\long\path\to\mydir', r'D:\mydir')
Please note that using os.rename() in this manner only works if the source path and the destination path are on the same filesystem (on Windows, this is true if they have the same drive letter). If they're on different drives (e.g. one is on C: and the other is on D:), or if one of the paths contains a reparse point (if you don't know what that is, don't worry about it, you'll probably never encounter one), you will need to use shutil.move(), which copies the files and then deletes them from the source -- this is what Windows does when you move files between drives, and it takes about as long to finish.
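One way to check the same-filesystem condition before attempting a rename is to compare device IDs. This is only a sketch: the helper name same_filesystem is hypothetical, it assumes the platform's os.stat() reports a meaningful st_dev, and it inspects the directory that would contain the destination because the destination itself may not exist yet.

```python
import os


def same_filesystem(src, dst):
    """Return True if src and dst's parent directory report the same device."""
    # dst may not exist yet, so look at the directory that would contain it.
    dst_parent = os.path.dirname(os.path.abspath(dst))
    return os.stat(src).st_dev == os.stat(dst_parent).st_dev
```

If this returns False, os.rename() should not be expected to work, and a copy-based move such as shutil.move() is the fallback.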

Search a folder and sub folders for files starting with criteria

I have a folder "c:\test" that contains many subfolders and files (.xml, .wav). I need to search the test folder and all of its subfolders for files whose names start with the number 4 and are 7 characters long, and copy those files to another folder called 'c:\test.copy' using Python. Any other files need to be ignored.
So far I can copy the files starting with a 4, but not the structure, to the new folder using the following:
from glob import glob
import os, shutil
root_src_dir = r'C:/test' #Path of the source directory
root_dst_dir = 'c:/test.copy' #Path to the destination directory
for file in glob('c:/test/**/4*.*'):
    shutil.copy(file, root_dst_dir)
Any help would be most welcome.

You can use os.walk:
import os
import shutil
root_src_dir = r'C:/test' #Path of the source directory
root_dst_dir = 'c:/test.copy' #Path to the destination directory
for root, _, files in os.walk(root_src_dir):
    for file in files:
        if file.startswith("4") and len(file) == 7:
            shutil.copy(os.path.join(root, file), root_dst_dir)
If, by 7 characters, you mean 7 characters without the file extension, then replace len(file) == 7 with len(os.path.splitext(file)[0]) == 7.
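If "structure" in the question means the subfolder layout, the loop above can be extended to recreate each file's relative folder under the destination. The following is a sketch under that assumption. The function name copy_matching is hypothetical, and it counts the 7 characters without the file extension.

```python
import os
import shutil


def copy_matching(src_root, dst_root):
    """Copy files named 4xxxxxx.* from src_root to dst_root, keeping subfolders."""
    copied = []
    for root, _, files in os.walk(src_root):
        for name in files:
            stem = os.path.splitext(name)[0]  # file name without extension
            if name.startswith("4") and len(stem) == 7:
                rel_dir = os.path.relpath(root, src_root)  # "." at the top level
                target_dir = os.path.join(dst_root, rel_dir)
                os.makedirs(target_dir, exist_ok=True)
                shutil.copy2(os.path.join(root, name), target_dir)
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

Only folders that contain at least one matching file are created in the destination, so no empty directories need to be cleaned up afterwards.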

This can be done using the os and shutil modules:
import os
import shutil
First, we need to establish the source and destination paths. source should be the directory you are copying, and destination should be the directory you want to copy into.
source = r"/root/path/to/source"
destination = r"/root/path/to/destination"
Next, we have to check if the destination path exists because shutil.copytree() will raise a FileExistsError if the destination path already exists. If it does already exist, we can remove the tree and duplicate it again. You can think of this block of code as simply refreshing the duplicate directory:
if os.path.exists(destination):
    shutil.rmtree(destination)
shutil.copytree(source, destination)
Then, we can use os.walk to recursively navigate the entire directory, including subdirectories:
for path, _, files in os.walk(destination):
    for file in files:
        # remove the file if it fails either part of the condition ("or", not "and")
        if not file.startswith("4") or len(os.path.splitext(file)[0]) != 7:
            os.remove(os.path.join(path, file))
    if not os.listdir(path):
        os.rmdir(path)
We then can loop through the files in each directory and check if the file does not meet your condition (starts with "4" and has a length of 7). If it does not meet the condition, we simply remove it from the directory using os.remove.
The final if-statement checks if the directory is now empty. If the directory is empty after removing the files, we simply delete that directory using os.rmdir.
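One caveat with removing empty directories during a top-down walk: a parent is visited before its children, so a directory that only becomes empty once its subdirectories are deleted is left behind. A bottom-up pass afterwards handles that case. The sketch below assumes this cleanup is wanted, and the function name prune_empty_dirs is hypothetical.

```python
import os


def prune_empty_dirs(top):
    """Remove every empty directory below top, deepest first; keep top itself."""
    removed = []
    # topdown=False yields children before their parents.
    for path, _, _ in os.walk(top, topdown=False):
        if path != top and not os.listdir(path):
            os.rmdir(path)
            removed.append(os.path.relpath(path, top))
    return sorted(removed)
```

It would be called once on the destination after the file-removal loop has finished.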

Create a zip with only .pdf and .xml files from one directory

I would love to know how I can zip only the PDFs from the main directory without including the subfolders.
I've tried changing the code several times, without any success.
import os
import zipfile
fantasy_zip = zipfile.ZipFile('/home/rob/Desktop/projects/zenjobv2/archivetest.zip', 'w')
for folder, subfolders, files in os.walk('/home/rob/Desktop/projects/zenjobv2/'):
    for file in files:
        if file.endswith('.pdf'):
            fantasy_zip.write(os.path.join(folder, file), os.path.relpath(os.path.join(folder,file), '/home/rob/Desktop/projects/zenjobv2/'), compress_type = zipfile.ZIP_DEFLATED)
        elif file.endswith('.xml'):
            fantasy_zip.write(os.path.join(folder, file), os.path.relpath(os.path.join(folder,file), '/home/rob/Desktop/projects/zenjobv2/'), compress_type = zipfile.ZIP_DEFLATED)
fantasy_zip.close()
I expect that a zip is created only with the .pdfs and .xml files from the zenjobv2 folder/directory without including any other folders/subfolders.

You are looping through the entire directory tree with os.walk(). It sounds like you want to just look at the files in a given directory. For that, consider os.scandir(), which returns an iterator of all files and subdirectories in a given directory. You will just have to filter out elements that are directories:
import os
root = "/home/rob/Desktop/projects/zenjobv2"
for entry in os.scandir(root):
    if entry.is_dir():
        continue # Just in case there are strangely-named directories
    if entry.path.endswith(".pdf") or entry.path.endswith(".xml"):
        # Process the file at entry.path as you see fit
        print(entry.path)  # placeholder
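To connect this back to the question, here is one possible way to write the matching top-level files into the archive. It is a sketch only: the function name zip_top_level and its parameters are hypothetical, and storing each file under its bare name (arcname) is an assumption about the desired archive layout.

```python
import os
import zipfile


def zip_top_level(directory, zip_path, extensions=(".pdf", ".xml")):
    """Zip the files directly inside `directory` that have the given extensions."""
    added = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for entry in os.scandir(directory):  # no recursion into subfolders
            if entry.is_file() and entry.name.lower().endswith(tuple(extensions)):
                archive.write(entry.path, arcname=entry.name)
                added.append(entry.name)
    return sorted(added)
```

Using the archive as a context manager closes it even if writing a file fails, which replaces the explicit close() call in the question's code.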

Directory is not being recognized in Python

I'm uploading a zipped folder that contains a folder of text files, but it's not detecting that the folder that is zipped up is a directory. I think it might have something to do with requiring an absolute path in the os.path.isdir call, but can't seem to figure out how to implement that.
zipped = zipfile.ZipFile(request.FILES['content'])
for libitem in zipped.namelist():
    if libitem.startswith('__MACOSX/'):
        continue
    # If it's a directory, open it
    if os.path.isdir(libitem):
        print("You have hit a directory in the zip folder -- we must open it before continuing")
        for item in os.listdir(libitem):
            ...  # (the snippet in the question ends here)

The file you've uploaded is a single zip file, which is simply a container for other files and directories. All of the Python os.path functions operate on files on your local file system, which means you must first extract the contents of your zip before you can use os.path or os.listdir.
os.path.isdir() cannot tell whether a name inside the archive refers to a directory, because it only looks at the local file system. (When the archive contains explicit directory entries, their names end in "/", and in Python 3.6 and later ZipInfo.is_dir() reports this.)
A rewrite of your code which does an extract first may look something like this:
import os
import shutil
import tempfile
import zipfile
# Create a temporary directory into which we can extract zip contents.
tmpdir = tempfile.mkdtemp()
try:
    zipped = zipfile.ZipFile(request.FILES['content'])
    zipped.extractall(tmpdir)
    # Walk through the extracted directory structure doing what you
    # want with each file.
    for (dirpath, dirnames, filenames) in os.walk(tmpdir):
        # Look into subdirectories?
        for dirname in dirnames:
            full_dir_path = os.path.join(dirpath, dirname)
            # Do stuff in this directory
        for filename in filenames:
            full_file_path = os.path.join(dirpath, filename)
            # Do stuff with this file.
finally:
    # Clean up the temporary directory recursively.
    shutil.rmtree(tmpdir)
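If the goal is only to tell directories from files, extraction may not be needed on Python 3.6 and later, where each ZipInfo entry has an is_dir() method. The sketch below assumes the archive contains explicit directory entries (not every zip tool writes them). The function name split_entries is hypothetical.

```python
import zipfile


def split_entries(zip_source):
    """Return (directory_names, file_names) for a zip path or file-like object."""
    dirs, files = [], []
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.filename.startswith("__MACOSX/"):
                continue  # skip macOS resource-fork entries, as in the question
            # is_dir() is true for entries whose name ends with "/"
            (dirs if info.is_dir() else files).append(info.filename)
    return dirs, files
```

In the question's context, zip_source would be request.FILES['content']. File contents can then be read with the archive's open() or read() methods instead of os.listdir.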

To make scripts handle relative paths reliably, you'd usually use os.path.
It seems to me that you're reading item names from a zip file that you haven't actually unzipped, so why would you expect the files and directories to exist on disk?
I'd usually print os.getcwd() to find out where I am, and use os.path.join to join each name with the root of the data directory. Whether that is the same as the directory containing the script, I can't tell; if it is, you can get it with something like scriptdir = os.path.dirname(os.path.abspath(__file__)).
I'd expect you would have to do something like
libitempath = os.path.join(scriptdir, libitem)
if os.path.isdir(libitempath):
    ...
But I'm guessing at what you're doing, as it's a little unclear to me.
