I have a set of folders, and I want to be able to run a function that will find the most recently edited file and tell me the name of the file and the folder it is in.
Folder layout:
root
    Folder A
        File A
        File B
    Folder B
        File C
        File D
    etc...
Any tips to get me started? I've hit a bit of a wall.
You should look at the os.walk function, as well as os.stat, which can let you do something like:
import os

max_mtime = 0
max_dir = None
max_file = None

for dirname, subdirs, files in os.walk("."):
    for fname in files:
        full_path = os.path.join(dirname, fname)
        mtime = os.stat(full_path).st_mtime
        if mtime > max_mtime:
            max_mtime = mtime
            max_dir = dirname
            max_file = fname

print(max_dir, max_file)
It helps to wrap the built-in directory walking in a function that yields only full paths to files. Then you can take that function, which returns all the files, and pick out the one with the highest modification time:
import os

def all_files_under(path):
    """Iterates through all files that are under the given path."""
    for cur_path, dirnames, filenames in os.walk(path):
        for filename in filenames:
            yield os.path.join(cur_path, filename)
latest_file = max(all_files_under('root'), key=os.path.getmtime)
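Since the question also asks for the folder the file is in, one small follow-up (not part of the original answer) is to split that path:

import os

folder, name = os.path.split(latest_file)   # the directory that contains the file, and the bare file name
print(folder, name)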
If anyone is looking for a one-line way to do it (note that os.scandir() does not recurse into sub-folders):
latest_edited_file = max(os.scandir("path\\to\\search"), key=lambda e: e.stat().st_mtime).name
- use os.walk to list the files
- use os.stat to get each file's modified timestamp (st_mtime)
- put the timestamps and filenames in a list and sort it by timestamp; the largest timestamp is the most recently edited file (a sketch of these steps follows below)
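A minimal sketch of those three steps, using the current directory as a placeholder starting point:

import os

entries = []
for dirname, subdirs, files in os.walk("."):                      # step 1: walk the tree
    for fname in files:
        full_path = os.path.join(dirname, fname)
        entries.append((os.stat(full_path).st_mtime, full_path))  # step 2: (timestamp, filename)

entries.sort()             # step 3: sort by timestamp
if entries:
    print(entries[-1])     # largest timestamp = most recently edited file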
For multiple files, if anyone came here for that:
import glob, os
files = glob.glob("/target/directory/path/*/*.mp4")
files.sort(key=os.path.getmtime)
for file in files:
    print(file)
This will print every .mp4 file in any immediate subfolder of /target/directory/path/, with the most recently modified file paths at the bottom.
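If the .mp4 files can also sit deeper than one subfolder, glob's recursive mode could be used instead; a sketch with the same placeholder path:

import glob, os

# "**" with recursive=True matches files at any depth below the target directory
files = glob.glob("/target/directory/path/**/*.mp4", recursive=True)
files.sort(key=os.path.getmtime)
for file in files:
    print(file)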
You can use os.walk. See: http://docs.python.org/library/os.html
Use os.path.walk() (Python 2 only; it was removed in Python 3 in favor of os.walk()) to traverse the directory tree and os.stat().st_mtime to get the mtime of the files.
The function you pass to os.path.walk() (the visit parameter) just needs to keep track of the largest mtime it has seen and where it saw it.
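A rough Python 2-style sketch of that visit callback, with the starting path as a placeholder (in Python 3, where os.path.walk() no longer exists, the os.walk() examples above do the same job):

import os

state = {'mtime': 0, 'path': None}        # largest mtime seen so far, and where it was seen

def visit(arg, dirname, names):
    for name in names:
        full_path = os.path.join(dirname, name)
        if os.path.isfile(full_path):     # names contains both files and subdirectories
            mtime = os.stat(full_path).st_mtime
            if mtime > arg['mtime']:
                arg['mtime'] = mtime
                arg['path'] = full_path

os.path.walk('.', visit, state)           # Python 2 only
print(state['path'])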
I'm using path = r"C:\Users\traveler\Desktop":
import os

def all_files_under(path):
    #"""Iterates through all files that are under the given path."""
    for cur_path, dirnames, filenames in os.walk(path):
        for filename in filenames:
            yield os.path.join(cur_path, filename)

latest_file = max(all_files_under('root'), key=os.path.getmtime)
What am I missing here?
Attempting to write a function that walks a file system and returns the absolute path and filename for use in another function.
Example "/testdir/folderA/222/filename.ext".
Having tried multiple versions of this I cannot seem to get it to work properly.
filesCheck = []

def findFiles(filepath):
    files = []
    for root, dirs, files in os.walk(filepath):
        for file in files:
            currentFile = os.path.realpath(file)
            print(currentFile)
            if os.path.exists(currentFile):
                files.append(currentFile)
    return files

filesCheck = findFiles(/testdir)
This returns "filename.ext" (only one).
If I substitute currentFile = os.path.join(root, file) for os.path.realpath(file), it goes into a loop in the first directory. I tried os.path.join(dir, file) and it fails, as one of my folders is named 222.
I have gone round in circles and got somewhat close, but haven't been able to get it to work.
Running on Linux with Python 3.6
There are several things wrong with your code.
Multiple values are being assigned to the variable name files: your own list and the file list os.walk() yields on each iteration.
You're not joining the root directory onto each filename os.walk() returns, which can be done with os.path.join().
You're not passing a string to the findFiles() function.
If you fix those things, there's no longer a need to call os.path.exists(), because you can be sure each path exists.
Here's a working version:
import os

def findFiles(filepath):
    found = []
    for root, dirs, files in os.walk(filepath):
        for file in files:
            currentFile = os.path.realpath(os.path.join(root, file))
            found.append(currentFile)
    return found
filesCheck = findFiles('/testdir')
print(filesCheck)
Hi, I think this is what you need; perhaps you could give it a try:
from os import walk

path = "C:/Users/SK/Desktop/New folder"
files = []

for (directoryPath, directoryNames, allFiles) in walk(path):
    for file in allFiles:
        files.append([file, f"{directoryPath}/{file}"])

print(files)
Output:
[ ['index.html', 'C:/Users/SK/Desktop/New folder/index.html'], ['test.py', 'C:/Users/SK/Desktop/New folder/test.py'] ]
I have a folder that contains 15 .jpg files and 15 .pdf files. The file names are the same, with just the extensions being different, for example ABC123.jpg and ABC123.pdf.
I have spent the better part of the last few days trying to use shutil to move the oldest .pdf file to a new folder, then find the matching .jpg file and move it to the same folder as the .pdf. I was able to move the oldest file, or move all files of a given type; I just couldn't get the oldest of a specific type. I also tried moving all .pdfs to a new folder1 and all .jpgs to a new folder2 and then moving the oldest from each of those to a common folder, but they don't always match: the oldest .jpg might be different from the oldest .pdf.
I am sure there is a simple solution; I have just been working this in circles so long I can no longer see the forest for the trees.
Use the os.path.getmtime function as the key when picking the oldest of your files.

import os

def oldest_file(dir, type):
    return min(
        [name for name in os.listdir(dir) if name.endswith(type)],
        key=lambda name: os.path.getmtime(os.path.join(dir, name)))

print(oldest_file('/your/folder', '.jpg'))
If you need to search the entire tree, use os.walk instead of os.listdir:
import os
from itertools import chain

def oldest_file(dir, type):
    return min(
        list(chain(*[[os.path.join(root, file) for file in files if file.endswith(type)]
                     for root, _, files in os.walk(dir)])),
        key=lambda file: os.path.getmtime(file))

print(oldest_file('/your/folder', '.jpg'))
I'm sure you can handle the rest of the code that deals with moving files.
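For completeness, here is a minimal sketch of that moving step, reusing the first (os.listdir-based) oldest_file helper above, which returns a bare file name; the folder paths below are placeholders:

import os
import shutil

folder = '/your/folder'          # placeholder: folder holding the .pdf/.jpg pairs
dest = '/path/to/destination'    # placeholder: where the pair should end up

oldest_pdf = oldest_file(folder, '.pdf')   # e.g. 'ABC123.pdf'
base = os.path.splitext(oldest_pdf)[0]     # 'ABC123'

# move the oldest PDF and its matching JPG into the same destination folder
shutil.move(os.path.join(folder, oldest_pdf), os.path.join(dest, oldest_pdf))
shutil.move(os.path.join(folder, base + '.jpg'), os.path.join(dest, base + '.jpg'))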
I found oldest_file_in_tree from this answer.
import os
import shutil

def oldest_file_in_tree(rootfolder, extension=".avi"):
    return min(
        (os.path.join(dirname, filename)
         for dirname, dirnames, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime)

oldest_pdf = oldest_file_in_tree('/var/somedir', '.pdf')
name = os.path.basename(oldest_pdf)[:-4]   # strip the directory and the ".pdf" extension
matching_jpg = '{}.jpg'.format(name)

shutil.move("/var/somedir/{}.pdf".format(name), "path/to/new/destination/{}.pdf".format(name))
shutil.move("/var/somedir/{}.jpg".format(name), "path/to/new/destination/{}.jpg".format(name))
Here is how I was able to make it work:

import os, shutil
import glob

todir = '/var/somedir/'

def oldest_file_in_tree(rootfolder, extension=".pdf"):
    return min(
        (os.path.join(dirname, filename)
         for dirname, dirnames, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime)

oldest_pdf = oldest_file_in_tree('/var/somedir/', '.pdf')
name = oldest_pdf[:-4]                  # full path without the ".pdf" extension
matching_jpg = '{}.jpg'.format(name)    # the matching .jpg lives alongside the .pdf

shutil.move(oldest_pdf, todir)
shutil.move(matching_jpg, todir)
This could be done with Python, but I think I am missing a way to loop over all directories. Here is the code I am using:

import os

def renameInDir(directory):
    for filename in os.listdir(directory):
        if filename.endswith(".pdf"):
            path = os.path.realpath(filename)
            parents = path.split('/')  # make a list of all the dirs in the path; the last element is the original base filename
            newFilename = os.path.dirname(filename) + directory + parents[-1:][0]  # reorganize the data into the format you want
            os.rename(filename, newFilename)  # rename the file
You should go with os.walk(). It walks the directory tree rooted at the given directory argument and generates the file names it finds.
Using os.walk(), you can accomplish the desired result this way:

import os
from os.path import join

for dirpath, dirnames, filenames in os.walk('/path/to/directory'):
    for name in filenames:
        new_name = name[:-3] + 'new_file_extension'   # assumes a three-character extension is being replaced
        os.rename(join(dirpath, name), join(dirpath, new_name))
I would like to find all the files in a directory and all sub-directories.
code used:
import os
import sys

path = "C:\\"
dirs = os.listdir(path)

filename = "C.txt"
FILE = open(filename, "w")
FILE.write(str(dirs))
FILE.close()

print(dirs)
The problem is - this code only lists files in directories, not sub-directories. What do I need to change in order to also list files in subdirectories?
To traverse a directory tree, you want to use os.walk().
Here's an example to get you started:
import os

searchdir = r'C:\root_dir'   # traversal starts in this directory (the root)

for root, dirs, files in os.walk(searchdir):
    for name in files:
        (base, ext) = os.path.splitext(name)   # split base and extension
        print(base, ext)
which would give you access to the file names and the components.
You'll find the functions in the os and os.path module to be of great use for this sort of work.
This function will help you: os.path.walk() (Python 2 only; it was removed in Python 3 in favor of os.walk()): http://docs.python.org/library/os.path.html#os.path.walk
How do I get the absolute paths of all the files in a directory that could have many sub-folders in Python?
I know os.walk() recursively gives me a list of directories and files, but that doesn't seem to get me what I want.
os.path.abspath makes sure a path is absolute. Use the following helper function:
import os

def absoluteFilePaths(directory):
    for dirpath, _, filenames in os.walk(directory):
        for f in filenames:
            yield os.path.abspath(os.path.join(dirpath, f))
If you have Python 3.4 or newer you can use pathlib (or a third-party backport if you have an older Python version):
import pathlib

for filepath in pathlib.Path(directory).glob('**/*'):
    print(filepath.absolute())
If the argument given to os.walk is absolute, then the root dir names yielded during iteration will also be absolute. So, you only need to join them with the filenames:
import os

for root, dirs, files in os.walk(os.path.abspath("../path/to/dir/")):
    for file in files:
        print(os.path.join(root, file))
Try:
import os

for root, dirs, files in os.walk('.'):
    for file in files:
        p = os.path.join(root, file)
        print(p)
        print(os.path.abspath(p))
        print()
You can use os.path.abspath() to turn relative paths into absolute paths:
import os

file_paths = []
for folder, subs, files in os.walk(rootdir):
    for filename in files:
        file_paths.append(os.path.abspath(os.path.join(folder, filename)))
Starting with Python 3.5, the idiomatic solution would be:

import os

def absolute_file_paths(directory):
    path = os.path.abspath(directory)
    return [entry.path for entry in os.scandir(path) if entry.is_file()]

This not only reads nicer but is also faster in many cases.
For more details (like ignoring symlinks), see the original Python docs:
https://docs.python.org/3/library/os.html#os.scandir
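os.scandir() on its own only looks at the top-level directory. If the sub-folders also need to be covered, a recursive variant could look like the sketch below (not part of the original answer; the function name is made up here):

import os

def absolute_file_paths_recursive(directory):
    """Yield absolute paths of all files under directory, descending into sub-folders."""
    for entry in os.scandir(os.path.abspath(directory)):
        if entry.is_dir(follow_symlinks=False):
            yield from absolute_file_paths_recursive(entry.path)   # recurse into the sub-folder
        elif entry.is_file():
            yield entry.path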
All files and folders (top level only, since os.listdir() does not recurse):
x = [os.path.abspath(os.path.join(directory, p)) for p in os.listdir(directory)]
Images (.jpg | .png):
x = [os.path.abspath(os.path.join(directory, p)) for p in os.listdir(directory) if p.endswith(('jpg', 'png'))]
from glob import glob
from os.path import join

def absolute_file_paths(directory):
    # "**" needs recursive=True to descend into sub-folders
    return glob(join(directory, "**"), recursive=True)
Try:
from pathlib import Path

path = 'Desktop'
# '**/*' descends into sub-folders; keep only files, not directories
files = filter(lambda filepath: filepath.is_file(), Path(path).glob('**/*'))

for file in files:
    print(file.absolute())
I wanted to keep the subdirectory details rather than the files, and wanted only the subdirs that contain exactly one xml file. I can do it this way:

import os
import glob

for rootDirectory, subDirectories, files in os.walk(eventDirectory):
    for subDirectory in subDirectories:
        absSubDir = os.path.join(rootDirectory, subDirectory)
        if len(glob.glob(os.path.join(absSubDir, "*.xml"))) == 1:
            print("Parsing information in " + absSubDir)
import os

for root, directories, filenames in os.walk(directory):
    for directory in directories:
        print(os.path.join(root, directory))
    for filename in filenames:
        if filename.endswith(".JPG"):
            print(filename)
            print(os.path.join(root, filename))
Try this:

import os

pth = ''            # root folder that contains one sub-folder per type
train_folder = []   # collected file paths

types = os.listdir(pth)
for type_ in types:
    file_names = os.listdir(f'{pth}/{type_}')
    file_names = list(map(lambda x: f'{pth}/{type_}/{x}', file_names))
    train_folder += file_names