loop over subfolder inside a folder in python - python

I am trying to move files from one folder to another. I have folders named a to z, and inside each of them (a-z) I have several subfolders. I can move the files from the subfolders of a single folder (for example a) to my target folder, but I want to do it for all of a-z at once.
Folder structure:
a--ab
 --ac
b--bc
 --bd
.. till z
import glob
import os
import shutil
path = "E:\\download\\images\\a\\*"
move_path = "E:\\download\\images\\final\\"
files = glob.glob(path,recursive = True)
for file in files:
    subfile = os.listdir(file)
    for sub in subfile:
        subpath = file + "\\" + sub
        shutil.move(subpath, move_path + "\\" + sub)

Copy this tiny script into E:\download\images and run it from there. Path() with no arguments refers to the current working directory, so the relative glob pattern is resolved from that folder.
The images variable will contain a generator that yields every path matching the glob: anything with a 3-character extension, at any depth below a first-level folder whose name is a single character.
Calling rename with a target under final/ moves the file out of its subfolder. The final folder must already exist.
Keep in mind that the glob will pick every file or folder name having a 3-letter extension. You'll need to do additional checks if you have other files or folders that match this nomenclature.
from pathlib import Path
images = Path().glob('?/**/*.???')
for img in images:
    img.rename('final/' + img.name)
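If you would rather not rely on the extension pattern, the same move can be sketched by walking the two folder levels explicitly. This is only an illustration: the function name move_all_files and the final_name parameter are made up for this example, and it assumes the layout from the question (single-character folders, each holding subfolders with files). Files whose name already exists in the destination are skipped rather than overwritten.

```python
import shutil
from pathlib import Path


def move_all_files(root, final_name="final"):
    """Move every file in root/<letter>/<subfolder>/ into root/<final_name>/."""
    root = Path(root)
    final_dir = root / final_name
    final_dir.mkdir(exist_ok=True)
    moved = []
    for letter_dir in sorted(root.iterdir()):
        # Only single-character folders (a-z); this also skips the destination.
        if not letter_dir.is_dir() or len(letter_dir.name) != 1:
            continue
        for sub_dir in sorted(letter_dir.iterdir()):
            if not sub_dir.is_dir():
                continue
            for item in sorted(sub_dir.iterdir()):
                target = final_dir / item.name
                # Skip folders and names that already exist in the destination.
                if item.is_file() and not target.exists():
                    shutil.move(str(item), str(target))
                    moved.append(target.name)
    return moved
```

Calling move_all_files(r"E:\download\images") would then process a through z in one run.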

Related

Moving Files from directory to their own folders

I am struggling with the paths and directories to solve this problem. Basically, I have a long list of .lammps files in one directory. My goal is to copy each file and move it into its own folder (which is one directory back) where its folder name is the file name minus the .lammps. All of the folders are already made, I just can't seem to figure out moving them. The entire list of files is in the Files directory. The individual folders are in the ROTATED FILES directory. Here is what I have. Any tips greatly appreciated.
Here is a file example
n-optimized.new.10_10-90-10_10.Ni00Nj01.lammps
The folder for this file is then named
n-optimized.new.10_10-90-10_10.Ni00Nj01
import os
file_directory = os.chdir("C:\Py Practice\ROTATED FILES\Files")
files = os.listdir()
for file in files:
    # get the file -.lammps string
    name1 = file.split('.')[0:4]
    name2 = ".".join(name1)
    # get the path for the files new respective folder (back a directory and paste folder name)
    file_folder = "C:\Py Practice\ROTATED FILES/" + name2
    # Move
    combined_path = os.path.join(file, file_folder)
I've tried shutil and figured path join might be easier.
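First of all, the paths in your code are fragile: in a normal string literal you should either escape the backslashes or use a raw string. (Sequences such as \P and \R happen not to be recognized escapes, so Python leaves them alone, although recent versions warn about them, and a path containing \t or \n would silently break.) Secondly, rather than using os for file system operations, it's much better to learn pathlib (also part of the Python standard library), which provides a more modern, object-oriented approach to file operations.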
Using pathlib and shutil you can do something like
from pathlib import Path
from shutil import copyfile
file_directory = Path(r"C:\Py Practice\ROTATED FILES\Files")
# get the list of source files
source_files = list(file_directory.glob('*.lammps'))
# create target file paths
target_files = [file_directory.parent / f.stem / f.name for f in source_files]
for source, target in zip(source_files, target_files):
    copyfile(str(source), str(target))
Here we're accessing different parts of file path using a convenient OOP structure. For example, if your file f is located in 'c:/foo/bar/boo.txt' then f.name is just the name of file: boo.txt, f.stem is the stem part of the file name (excluding the extension) boo, f.parent is its parent directory 'c:/foo/bar/' etc.
There's a really handy graphic of pathlib Path objects here.
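The only inconvenience is that older Python versions did not accept Path objects in every standard-library function, so the example passes the string representation to copyfile by calling str on each object. On current Python versions copyfile accepts Path objects directly, and the str calls are harmless.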
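The path attributes described above can be checked quickly in an interpreter. This small sketch uses PureWindowsPath so that it behaves the same on any operating system; the file c:/foo/bar/boo.txt is the made-up example from the text, and the target variable only illustrates how the copy destination is assembled.

```python
from pathlib import PureWindowsPath

f = PureWindowsPath("c:/foo/bar/boo.txt")

print(f.name)               # file name with extension
print(f.stem)               # file name without extension
print(f.suffix)             # extension, including the dot
print(f.parent.as_posix())  # parent directory

# One directory back, then a folder named after the stem, then the file name.
target = f.parent.parent / f.stem / f.name
print(target.as_posix())
```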
And you don't even need to have the target folders created beforehand; it's very easy to create the necessary folder structure as you go along:
from pathlib import Path
from shutil import copyfile
file_directory = Path(r"C:\Py Practice\ROTATED FILES\Files")
# get the list of source files
source_files = list(file_directory.glob('*.lammps'))
# create target file paths
target_files = [file_directory.parent / f.stem / f.name for f in source_files]
for source, target in zip(source_files, target_files):
    # check that target directory exists
    # and create a folder if not
    if not target.parent.is_dir():
        target.parent.mkdir()
    copyfile(str(source), str(target))

Apply script to each folder in a drive

I am working on a data cleanup in a network drive. The drive has 1000+ folders, and those folders have several subfolders. The script that I got from G4G (seen below) prompts me to select a folder. I can click on one of my 1000+ folders, and the data is cleaned up properly (duplicates are deleted). However, I'd like to loop the command through the whole drive to avoid clicking on folders for hours. I cannot select the drive as my folder because duplicate file names between the first folders in the drive should not be considered duplicates.
EDIT:
I'll give an example to clarify.
Z:/Folder1 and Z:/Folder2 both have several files named "text.txt," immediately inside of the folders and within the subdirectories of the folders. Folder1 and Folder2, amongst all "text.txt" files immediately inside and within its subdirectories, should each be left with one "text.txt." If the current script is applied to Folder1 and Folder2 individually, then the desired result of one "text.txt" file existing in Folder1 and one existing in Folder2 is accomplished. If the script is applied to the Z drive, then between Folder1 and Folder2, there would only be one "text.txt," and one of the folders would be without a file named "text.txt."
How can I apply this script to each first folder in the drive without having to manually click on each folder?
from tkinter.filedialog import askdirectory
# Importing required libraries.
from tkinter import Tk
import os
import hashlib
from pathlib import Path
# We don't want the GUI window of
# tkinter to be appearing on our screen
Tk().withdraw()
# Dialog box for selecting a folder.
file_path = askdirectory(title="Select a folder")
# Listing out all the files
# inside our root folder.
list_of_files = os.walk(file_path)
# In order to detect the duplicate
# files we are going to define an empty dictionary.
unique_files = dict()
for root, folders, files in list_of_files:
    # Running a for loop on all the files
    for file in files:
        # Finding complete file path
        file_path = Path(os.path.join(root, file))
        # Converting all the content of
        # our file into md5 hash.
        Hash_file = hashlib.md5(open(file_path, 'rb').read()).hexdigest()
        # If file hash has already
        # been added we'll simply delete that file
        if Hash_file not in unique_files:
            unique_files[Hash_file] = file_path
        else:
            if file.endswith((".txt",".bmp")):
                os.remove(file_path)
                print(f"{file_path} has been deleted")
Maybe you should run it on the whole drive and use if/else to skip the first folder (the drive root itself):
list_of_files = os.walk("your drive")
for root, folders, files in list_of_files:
    if root != "your drive":
        for file in files:
            # ... code ...
This way you can also skip other (sub)folders.
Alternatively, you can use next() to skip the first element of os.walk(), because os.walk() does not return a list of all elements but a generator:
list_of_files = os.walk("your drive")
next(list_of_files)  # skip first item
for root, folders, files in list_of_files:
    for file in files:
        # ... code ...
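Note that skipping the root alone still shares one dictionary across the whole drive. To keep duplicates scoped to each first-level folder, as the question asks, one option is to run the walk once per folder with a fresh dictionary. The following is only a sketch under that assumption: the function names remove_duplicates and clean_each_top_folder are invented for this example, and it keeps the question's MD5 comparison and its .txt/.bmp filter.

```python
import hashlib
import os
from pathlib import Path


def remove_duplicates(folder, extensions=(".txt", ".bmp")):
    """Delete files under folder whose content hash was already seen."""
    unique_files = {}  # fresh for every call, so folders do not affect each other
    deleted = []
    for root, _, files in os.walk(folder):
        for name in sorted(files):
            file_path = Path(root) / name
            digest = hashlib.md5(file_path.read_bytes()).hexdigest()
            if digest not in unique_files:
                unique_files[digest] = file_path
            elif name.endswith(extensions):
                file_path.unlink()
                deleted.append(file_path)
    return deleted


def clean_each_top_folder(drive):
    """Apply remove_duplicates separately to every first-level folder."""
    deleted = []
    for entry in sorted(Path(drive).iterdir()):
        if entry.is_dir():
            deleted.extend(remove_duplicates(entry))
    return deleted
```

With this structure, clean_each_top_folder("Z:/") replaces the folder dialog, and files sitting directly in the drive root are left untouched.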

Search a folder and sub folders for files starting with criteria

I have a folder "c:\test" that contains many subfolders and files (.xml, .wav). I need to search the test folder and all of its subfolders for files whose names start with the number 4 and are 7 characters long, and copy these files to another folder called 'c:\test.copy' using Python. Any other files need to be ignored.
So far I can copy the files starting with a 4, but not the structure, to the new folder using the following:
from glob import glob
import os, shutil
root_src_dir = r'C:/test' #Path of the source directory
root_dst_dir = 'c:/test.copy' #Path to the destination directory
for file in glob('c:/test/**/4*.*'):
    shutil.copy(file, root_dst_dir)
Any help would be most welcome.
You can use os.walk:
import os
import shutil
root_src_dir = r'C:/test' #Path of the source directory
root_dst_dir = 'c:/test.copy' #Path to the destination directory
for root, _, files in os.walk(root_src_dir):
    for file in files:
        if file.startswith("4") and len(file) == 7:
            shutil.copy(os.path.join(root, file), root_dst_dir)
If, by 7 characters, you mean 7 characters without the file extension, then replace len(file) == 7 with len(os.path.splitext(file)[0]) == 7.
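Both readings of "7 characters" can be covered by one small helper. This is a sketch rather than a drop-in solution: copy_matching and its stem_only flag are names invented for this example, it uses pathlib's rglob instead of os.walk, and it copies everything into one flat destination folder, which it creates if needed.

```python
import shutil
from pathlib import Path


def copy_matching(src, dst, stem_only=False):
    """Copy files under src whose name starts with '4' and is 7 characters long."""
    dst = Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src).rglob("4*")):
        if not path.is_file():
            continue
        # stem_only=True measures the name without its extension.
        name = path.stem if stem_only else path.name
        if len(name) == 7:
            shutil.copy(path, dst / path.name)
            copied.append(path.name)
    return copied
```

For the question's folders this would be called as copy_matching(r'C:/test', 'c:/test.copy'), adding stem_only=True if the extension should not count.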
This can be done using the os and shutil modules:
import os
import shutil
Firstly, we need to establish the source and destination paths. source should be the directory you are copying and destination should be the directory you want to copy into.
source = r"/root/path/to/source"
destination = r"/root/path/to/destination"
Next, we have to check if the destination path exists because shutil.copytree() will raise a FileExistsError if the destination path already exists. If it does already exist, we can remove the tree and duplicate it again. You can think of this block of code as simply refreshing the duplicate directory:
if os.path.exists(destination):
    shutil.rmtree(destination)
shutil.copytree(source, destination)
Then, we can use os.walk to recursively navigate the entire directory, including subdirectories:
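# topdown=False visits subdirectories first, so a parent that becomes
# empty after its children are removed can be removed as well
for path, _, files in os.walk(destination, topdown=False):
    for file in files:
        # remove every file that does not satisfy both conditions
        if not (file.startswith("4") and len(os.path.splitext(file)[0]) == 7):
            os.remove(os.path.join(path, file))
    if not os.listdir(path):
        os.rmdir(path)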
We then can loop through the files in each directory and check if the file does not meet your condition (starts with "4" and has a length of 7). If it does not meet the condition, we simply remove it from the directory using os.remove.
The final if-statement checks if the directory is now empty. If the directory is empty after removing the files, we simply delete that directory using os.rmdir.

Python: Unzip selected files in directory tree

I have the following directory tree. The parent directory contains several folders, say A, B, C and D, and each folder holds many zips whose names include the letter of the parent folder along with other info:
-parent--A-xxxAxxxx_timestamp.zip
          -xxxAxxxx_timestamp.zip
          -xxxAxxxx_timestamp.zip
       --B-xxxBxxxx_timestamp.zip
          -xxxBxxxx_timestamp.zip
          -xxxBxxxx_timestamp.zip
       --C-xxxCxxxx_timestamp.zip
          -xxxCxxxx_timestamp.zip
          -xxxCxxxx_timestamp.zip
       --D-xxxDxxxx_timestamp.zip
          -xxxDxxxx_timestamp.zip
          -xxxDxxxx_timestamp.zip
I need to unzip only selected zips in this tree and place them in the same directory with the same name without the .zip extension.
Output:
-parent--A-xxxAxxxx_timestamp
          -xxxAxxxx_timestamp
          -xxxAxxxx_timestamp
       --B-xxxBxxxx_timestamp
          -xxxBxxxx_timestamp
          -xxxBxxxx_timestamp
       --C-xxxCxxxx_timestamp
          -xxxCxxxx_timestamp
          -xxxCxxxx_timestamp
       --D-xxxDxxxx_timestamp
          -xxxDxxxx_timestamp
          -xxxDxxxx_timestamp
My effort:
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
    zipfile=os.path.basename(path) #save the zipfile path
    zip_ref=zipfile.ZipFile(path, 'r')
    zip_ref=extractall(zipfile.replace(r'.zip', '')) #unzip to a folder without the .zip extension
The problem is that I don't know how to save the A, B, C, D etc. to include them in the path where the files will be unzipped. Thus, the unzipped folders are created in the parent directory. Any ideas?
The code that you have seems to be working fine; you just need to make sure that you are not overwriting names (zipfile is both the module and your variable) and that you call extractall on zip_ref with the full path. The following code works perfectly for me:
import os
import zipfile
import glob
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
    zf = os.path.basename(path) #save the zipfile path
    zip_ref = zipfile.ZipFile(path, 'r')
    zip_ref.extractall(path.replace(r'.zip', '')) #unzip to a folder without the .zip extension
Instead of trying to do it in a single statement, it would be much easier and more readable to first get the list of all folders and then get the list of files inside each folder. Example:
import glob
import os.path
for folder in glob.glob("./*"):
    # the folder's own name, e.g. "A" for "./A"
    folder_name = os.path.basename(folder)
    #Using *.zip to only get zip files
    for path in glob.glob(os.path.join(folder, "*.zip")):
        filename = os.path.split(path)[1]
        if folder_name in filename:
            #Do your logic
            pass
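Putting the two loops and the extraction together, a minimal sketch could look like this. The function name unzip_selected and its pattern parameter are made up for this example, and the check that the folder's name appears in the zip's name is an assumption taken from the naming scheme in the question.

```python
import zipfile
from pathlib import Path


def unzip_selected(parent, pattern="*.zip"):
    """Extract each matching zip in parent/<folder>/ next to itself."""
    extracted = []
    for folder in sorted(Path(parent).iterdir()):
        if not folder.is_dir():
            continue
        for zip_path in sorted(folder.glob(pattern)):
            # Keep only archives whose name contains the parent folder's name.
            if folder.name not in zip_path.name:
                continue
            # Same directory as the zip, same name, without the .zip extension.
            target = zip_path.with_suffix("")
            with zipfile.ZipFile(zip_path) as zip_ref:
                zip_ref.extractall(target)
            extracted.append(target)
    return extracted
```

Because the target is derived from the zip's full path, the extracted folder ends up inside A, B, C or D rather than in the parent directory. A narrower pattern such as "xxx*xxxx*.zip" can be passed to select only some of the archives.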

copy and rename files in a directory in a specific pattern

I would like to copy one file in a directory as many times as there are other files in that directory, and then rename all the copies.
For example, there are 3 files in a directory: filename1.xls, filename2.xls and filename3.xls. I would like to copy Filename1.xls 2 times (as there are 2 files in the directory excluding filename1.xls) and then rename the copies to filename2.xls and filename3.xls. I hope my question is clear. Thanks, AD
Hm... just get the number of files in the directory, copy your file N times and save the copies as
for number in range(amount):
    "filename%d.xls" % number
if I understand what you mean.
To replace content of all files that have names that start with "F" and that are adjacent to a file given at the command-line with its copy:
#!/usr/bin/env python
import os
import shutil
import sys
filename = sys.argv[1] # provide file you want to multiply
dirname, basename = os.path.split(filename)
for name in os.listdir(dirname):
    path = os.path.join(dirname, name)
    #note: os.path.normcase() might be required to compare names
    if name.startswith("F") and name != basename and os.path.isfile(path):
        shutil.copy2(filename, path) #note: some metadata is not copied
Note: if the copy fails, the destination file might be destroyed. In that case you can copy to a temporary file first and then replace the destination with it.
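The temporary-file approach from the note could be sketched as follows. The helper name safe_overwrite is invented for this example. It assumes that creating the temporary file in the destination's own directory is acceptable, which keeps the final os.replace on a single filesystem.

```python
import os
import shutil
import tempfile


def safe_overwrite(source, destination):
    """Copy source over destination via a temporary file in the same directory."""
    dest_dir = os.path.dirname(os.path.abspath(destination))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    os.close(fd)
    try:
        shutil.copy2(source, tmp_path)
        # The destination is only touched once the copy has fully succeeded.
        os.replace(tmp_path, destination)
    except BaseException:
        # Clean up the partial copy and leave the destination as it was.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

In the loop above, shutil.copy2(filename, path) would then become safe_overwrite(filename, path).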
