I've been trying to figure out how I can loop through a root directory with sub directories and search for a file. If a specific file exists then run a script.
File structure example:
Root
--Folder 1
----TEST.txt
--Folder 2
----[no files]
--Folder 3
----TEST.txt
What I am trying to achieve is having the .py file in Root. When run, it will loop over each folder in Root and, if the file TEST.txt exists, do some processing.
Notes:
There will always be folders in Root
Where processing is needed there will be a file called TEST.txt
There will definitely be some folders that do not have TEST.txt
Pseudo code:
From Root open Folder 1. If TEST.txt is there then do some cool stuff and then 'cd ../' and repeat process but look in Folder 2.
Stop looping when all Folders have been checked.
This code should be able to search all folders and sub folders for a file.
import os
thisdir = os.getcwd()
for root, dirs, files in os.walk(thisdir):
    if 'TEST.txt' in files:
        # do some processing
        pass
Joining 'root' and the file name should be able to give you access to that file if you want to execute or analyze it somehow.
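As a minimal sketch of that last point, the walk can be wrapped in a function that joins root with the file name and collects the full paths. The function name find_files and its parameters are illustrative and not part of the original answer.

```python
import os

def find_files(start_dir, target_name="TEST.txt"):
    """Return the full path of every file named target_name under start_dir."""
    matches = []
    for root, dirs, files in os.walk(start_dir):
        if target_name in files:
            # root is the folder currently being visited, so joining it
            # with the file name gives a path that can be opened directly.
            matches.append(os.path.join(root, target_name))
    return matches
```

Calling find_files(os.getcwd()) from a script placed in Root should return one path per folder that contains TEST.txt; each path can then be passed to open() or to whatever processing is needed.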
Related
I am working on a data cleanup in a network drive. The drive has 1000+ folders, and those folders have several subfolders. The script that I got from G4G (seen below) prompts me to select a folder. I can click on one of my 1000+ folders, and the data is cleaned up properly (duplicates are deleted). However, I'd like to loop the command through the whole drive to avoid clicking on folders for hours. I cannot select the drive as my folder because duplicate file names between the first folders in the drive should not be considered duplicates.
EDIT:
I'll give an example to clarify.
Z:/Folder1 and Z:/Folder2 both have several files named "text.txt," immediately inside of the folders and within the subdirectories of the folders. Folder1 and Folder2, amongst all "text.txt" files immediately inside and within its subdirectories, should each be left with one "text.txt." If the current script is applied to Folder1 and Folder2 individually, then the desired result of one "text.txt" file existing in Folder1 and one existing in Folder2 is accomplished. If the script is applied to the Z drive, then between Folder1 and Folder2, there would only be one "text.txt," and one of the folders would be without a file named "text.txt."
How can I apply this script to each first folder in the drive without having to manually click on each folder?
from tkinter.filedialog import askdirectory
# Importing required libraries.
from tkinter import Tk
import os
import hashlib
from pathlib import Path
# We don't want the GUI window of
# tkinter to be appearing on our screen
Tk().withdraw()
# Dialog box for selecting a folder.
file_path = askdirectory(title="Select a folder")
# Listing out all the files
# inside our root folder.
list_of_files = os.walk(file_path)
# In order to detect the duplicate
# files we are going to define an empty dictionary.
unique_files = dict()
for root, folders, files in list_of_files:
    # Running a for loop on all the files
    for file in files:
        # Finding complete file path
        file_path = Path(os.path.join(root, file))
        # Converting all the content of
        # our file into md5 hash.
        Hash_file = hashlib.md5(open(file_path, 'rb').read()).hexdigest()
        # If file hash has already
        # been added we'll simply delete that file
        if Hash_file not in unique_files:
            unique_files[Hash_file] = file_path
        else:
            if file.endswith((".txt",".bmp")):
                os.remove(file_path)
                print(f"{file_path} has been deleted")
Maybe you should run it on your whole drive and use if/else to skip the first folder (the drive root itself):
list_of_files = os.walk("your drive")
for root, folders, files in list_of_files:
    if root != "your drive":
        for file in files:
            # ... code ...
This way you can also skip other (sub)folders.
Or you can use next() to skip the first element from os.walk(), because os.walk() does not return a list of all elements directly but a generator.
list_of_files = os.walk("your drive")
next(list_of_files) # skip first item
for root, folders, files in list_of_files:
    for file in files:
        # ... code ...
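Neither snippet above keeps a separate record of hashes for each first-level folder, which is what the question asks for. One possible approach, sketched here under the assumption that the hashing and deletion rules of the original script stay unchanged, is to loop over the first-level folders with os.scandir() and start a fresh dictionary for each one. The names dedupe_folder and dedupe_each_top_level_folder are hypothetical.

```python
import hashlib
import os

def dedupe_folder(folder, extensions=(".txt", ".bmp")):
    """Delete duplicate files (same MD5 content hash) inside one folder tree."""
    unique_files = {}  # fresh dictionary per call, so folders never share hashes
    deleted = []
    for root, folders, files in os.walk(folder):
        for file in files:
            file_path = os.path.join(root, file)
            with open(file_path, "rb") as handle:
                file_hash = hashlib.md5(handle.read()).hexdigest()
            if file_hash not in unique_files:
                unique_files[file_hash] = file_path
            elif file.endswith(extensions):
                # Same rule as the original script: only .txt/.bmp are removed
                os.remove(file_path)
                deleted.append(file_path)
    return deleted

def dedupe_each_top_level_folder(drive):
    """Run dedupe_folder separately on every first-level folder of drive."""
    deleted = []
    for entry in os.scandir(drive):
        if entry.is_dir():
            deleted.extend(dedupe_folder(entry.path))
    return deleted
```

With this layout, dedupe_each_top_level_folder("Z:/") would treat Folder1 and Folder2 independently, so a file in Folder2 is never deleted because of a matching file in Folder1. Since the function deletes files, it is worth trying on a copy of the data first.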
Full_Path = 'C://Users//Me//Documents//Project//ABC//Report//summary.txt'
File = 'summary.txt'
Start_Path = 'C://Users//ballen//Documents//Project//'
for root, dirs, files in os.walk(Start_Path):
    if File in files and root.split(os.path.sep)[1] == "Report":
        open(File)
        print(File)
So I have this bit of code; my goal is to open and read every summary.txt that resides within a 'Report' directory. I gave the full path at the top; the 'ABC' folder is dynamic, so it will have many different names. I am trying to get this script to go into the root directory 'Start_Path' and then only open and read the summary.txt files whose parent directory is 'Report'. (There are many more summary.txt files in other folders inside those dynamic ABC folders, and I only want the ones within a 'Report' folder.) Any ideas on this?
print(File) spits out all the summary.txt files within the Report folders, there are currently 18. But when I try to open(File) it says summary.txt path does not exist.
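The error most likely comes from open(File) receiving only the bare name summary.txt, which Python resolves against the current working directory rather than the folder being walked. A minimal sketch of one fix, assuming the goal is to match on the name of the immediate parent folder: compare os.path.basename(root) with "Report" and open the joined path. The function name read_report_summaries and its parameters are hypothetical.

```python
import os

def read_report_summaries(start_path, file_name="summary.txt", parent="Report"):
    """Return {full_path: text} for each file_name whose parent folder is `parent`."""
    contents = {}
    for root, dirs, files in os.walk(start_path):
        # basename(root) is the name of the folder currently being visited
        if file_name in files and os.path.basename(root) == parent:
            full_path = os.path.join(root, file_name)
            with open(full_path) as handle:
                contents[full_path] = handle.read()
    return contents
```

Checking the last path component instead of a fixed index such as [1] means the match no longer depends on how deep Start_Path sits in the directory tree.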
I have a folder structure with a few txt files spread across a few folders, and I have to copy every folder in which I find txt files to another folder. My root folder is always different, so to find it I use os.path.dirname(os.path.abspath(__file__)) and search all the folders from there. Unfortunately I'm stuck here, because I have never done any file handling in Python.
From your question I'm assuming you have certain folders at a specific location and the folders contain specific txt files. Now you want to move all the folders containing text files to a specific location.
You can do this:
Walk through all the folders in the master_folder.
Walk through files in the sub-folders, and if any file has a .txt extension, it moves the folder to the target location:
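import os
import shutil
for dirpath, dirs, files in os.walk('path_to_master_folder'):
    for filename in files:
        if filename.endswith('.txt'):
            # Move the whole folder once, then stop checking its other files
            shutil.move(dirpath, 'path_to_destination_folder')
            dirs[:] = []  # the folder is gone, so do not descend into it
            break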
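The question asks for the folders to be copied rather than moved. A sketch of a copying variant follows, assuming each matching folder should be copied with its contents into the destination under its own name. It collects the matches first so that nothing changes on disk while os.walk() is still running. The name copy_folders_with_txt is hypothetical, and the dirs_exist_ok argument requires Python 3.8 or later.

```python
import os
import shutil

def copy_folders_with_txt(master_folder, destination_folder):
    """Copy every folder under master_folder that directly contains a .txt file."""
    matches = []
    for dirpath, dirs, files in os.walk(master_folder):
        if any(name.endswith(".txt") for name in files):
            matches.append(dirpath)
    copied = []
    for dirpath in matches:
        # Each folder keeps its own name inside the destination
        target = os.path.join(destination_folder, os.path.basename(dirpath))
        shutil.copytree(dirpath, target, dirs_exist_ok=True)
        copied.append(target)
    return copied
```

Folders that share the same name would be merged into one target here; if that can happen, the target path would need to include more of the original path.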
I am trying to write a Python 3 script that deletes all .xml files in a folder.
It's a giant folder with many nested folders inside it.
(In dossier1, I have 2 folders and 2 .xml files; in those 2 folders, I have 2 folders and 2 .xml files, and so on.)
Is it possible to say: "in this folder, search for all the .xml files and delete them"?
I tried this:
filelist=os.listdir(path)
for fichier in filelist[:]: # filelist[:] makes a copy of filelist.
    if not(fichier.endswith(".xml")):
        filelist.remove(fichier)
but Python doesn't go into the different folders.
Thanks for your help :)
You can use os.walk. This answer gives good info about it.
You can do something like this
import os
for path, directories, files in os.walk('./'): # change './' to directory path
    for file in files:
        fname = os.path.join(path, file)
        if fname.endswith('.xml'): # If file ends with .xml
            print(fname)
            os.remove(os.path.abspath(fname)) # Use absolute path to remove file
You can use pathlib.Path.glob:
>>> import pathlib
>>> list(pathlib.Path('.').glob("**/*.xml"))
[PosixPath('b.xml'), PosixPath('a.xml'), PosixPath('test/y.xml'), PosixPath('test/x.xml')]
>>> _[0].unlink() # delete some file
>>> for file in pathlib.Path('.').glob("**/*.xml"):
...     print(file)
...     file.unlink() # delete everything
...
a.xml
test/y.xml
test/x.xml
I have some code that I would like to run on multiple files, each contained in its own subdirectory. I'd like to write some additional code that asks the user for a subdirectory name, opens the subdirectory, and runs the code on the file contained within it (there is only one file in each subdirectory). Can anyone help?
With this loop you can process every file in all the subfolders of the root folder:
import os
for root, dirs, files in os.walk('Your path here'):
    for file in files:
        # your code here
        pass
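The loop above visits every file, whereas the question asks for a single subdirectory chosen by the user. A minimal sketch of that follows, assuming the subdirectory sits directly under a known root folder and holds exactly one file. The names single_file_in, process and main are hypothetical, and process only stands in for the existing code.

```python
import os

def single_file_in(root_folder, subdirectory):
    """Return the path of the only file inside root_folder/subdirectory."""
    folder = os.path.join(root_folder, subdirectory)
    files = [entry.path for entry in os.scandir(folder) if entry.is_file()]
    if len(files) != 1:
        raise ValueError(f"expected exactly one file in {folder}, found {len(files)}")
    return files[0]

def process(path):
    """Placeholder for the code that should run on the file."""
    with open(path) as handle:
        return handle.read()

def main():
    # Ask for the subdirectory name, then run the code on its single file
    name = input("Subdirectory name: ")
    print(process(single_file_in(os.getcwd(), name)))
```

Calling main() prompts for a name and looks for that folder under the current working directory; raising an error when the folder does not hold exactly one file keeps a wrong choice from silently processing the wrong data.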