Attempting to write a function that walks a file system and returns the absolute path and filename for use in another function.
Example "/testdir/folderA/222/filename.ext".
Having tried multiple versions of this I cannot seem to get it to work properly.
filesCheck=[]
def findFiles(filepath):
    files=[]
    for root, dirs, files in os.walk(filepath):
        for file in files:
            currentFile = os.path.realpath(file)
            print (currentFile)
            if os.path.exists(currentFile):
                files.append(currentFile)
    return files
filesCheck = findFiles(/testdir)
This returns
"filename.ext" (only one).
Substitute in currentFile = os.path.join(root, file) for os.path.realpath(file) and it goes into a loop in the first directory. Tried os.path.join(dir, file) and it fails as one of my folders is named 222.
I have gone round in circles and get somewhat close but haven't been able to get it to work.
Running on Linux with Python 3.6
There are several things wrong with your code.
Multiple values are being assigned to the variable name files: your result list and the list of filenames that os.walk() yields.
You're not adding the root directory to each filename os.walk() returns, which can be done with os.path.join().
You're not passing a string to the findFiles() function.
If you fix those things there's no longer a need to call os.path.exists(), because you can be sure each file exists.
Here's a working version:
import os
def findFiles(filepath):
    found = []
    for root, dirs, files in os.walk(filepath):
        for file in files:
            currentFile = os.path.realpath(os.path.join(root, file))
            found.append(currentFile)
    return found
filesCheck = findFiles('/testdir')
print(filesCheck)
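As an alternative to os.walk(), here is a minimal sketch using pathlib, which is available in Python 3.6. The name find_files is only illustrative, and this swaps in Path.rglob() rather than reproducing the code above.

```python
from pathlib import Path


def find_files(top):
    """Return the absolute path of every regular file below `top`."""
    # rglob("*") yields files and directories at every depth,
    # so keep only the entries that are regular files.
    return sorted(str(p.resolve()) for p in Path(top).rglob("*") if p.is_file())
```

Calling find_files('/testdir') should give the same kind of list as findFiles('/testdir'), just sorted.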
Hi I think this is what you need. Perhaps you could give it a try :)
from os import walk
path = "C:/Users/SK/Desktop/New folder"
files = []
for (directoryPath, directoryNames, allFiles) in walk(path):
    for file in allFiles:
        files.append([file, f"{directoryPath}/{file}"])
print(files)
Output:
[ ['index.html', 'C:/Users/SK/Desktop/New folder/index.html'], ['test.py', 'C:/Users/SK/Desktop/New folder/test.py'] ]
I am trying to collect all files from all sub-directories and move them to another directory.
Code used
#collects all mp3 files from folders to a new folder
import os
from pathlib import Path
import shutil
#run once
path = os.getcwd()
os.mkdir("empetrishki")
empetrishki = path + "/empetrishki" #destination dir
print(path)
print(empetrishki)
#recursive collection
for root, dirs, files in os.walk(path, topdown=True, onerror=None, followlinks=True):
    for name in files:
        filePath = Path(name)
        if filePath.suffix.lower() == ".mp3":
            print(filePath)
            os.path.join
            filePath.rename(empetrishki.joinpath(filePath))
I have trouble with the last line, which moves the files: neither filePath.rename(), shutil.move, nor joinpath() has worked for me. Maybe that's because I am trying to change an element of the tuple that os.walk returns.
Similar code works with os.scandir, but that collects files only in the current directory.
How can I fix that? Thanks!
If you use pathlib.Path(name), that doesn't mean that something called name exists. Hence, you need to be careful that you have a full path or a correct relative path, and you need to make sure to resolve it. In particular, I note that you don't change your working directory and have a line like this:
filePath = Path(name)
This means that while you are walking down the directory tree, your working directory is not changing, so Path(name) is interpreted relative to the directory you started in. You should build your path from the root and the name; it is also a good idea to resolve it so that the full path is known.
filePath = Path(root).joinpath(name).resolve()
You can also create Path(root) outside the inner loop. Now you have an absolute path all the way to the filename, so you should be able to rename with .rename(), like:
filePath.rename(filePath.parent.joinpath(newname))
#Or to another directory
filePath.rename(other_dir.joinpath(newname))
All together:
import os
from pathlib import Path
path = Path.cwd()  # directory to search, as in the question
empetrishki = Path.cwd().joinpath("empetrishki").resolve()  # destination dir
for root, dirs, files in os.walk(path, topdown=True, onerror=None, followlinks=True):
    root = Path(root).resolve()
    for name in files:
        file = root.joinpath(name)
        if file.suffix.lower() == ".mp3":
            file.rename(empetrishki.joinpath(file.name))
for root, dirs, files in os.walk(path, topdown=True, onerror=None, followlinks=True):
    if root == empetrishki:
        continue # skip the destination dir
    for name in files:
        basename, extension = os.path.splitext(name)
        if extension.lower() == ".mp3":
            oldpath = os.path.join(root, name)
            newpath = os.path.join(empetrishki, name)
            print(oldpath)
            shutil.move(oldpath, newpath)
This is what I suggest. Your code runs in the current directory, but each file is at the path os.path.join(root, name), and that is the path you need to give to your move function.
Besides, I would also suggest using os.path.splitext to extract the file extension; it is more Pythonic. You might also want to skip scanning your target directory.
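One way to skip the target directory is to prune it from dirs while walking top-down. The following is only a sketch under a few assumptions: collect_mp3s is a made-up helper name, the destination folder lives directly inside the folder being searched (as in the question), and files with the same name in different folders are not handled specially.

```python
import os
import shutil


def collect_mp3s(source, dest_name="empetrishki"):
    """Move every .mp3 below `source` into `source/dest_name`; return the new paths."""
    dest = os.path.join(source, dest_name)
    os.makedirs(dest, exist_ok=True)
    moved = []
    for root, dirs, files in os.walk(source, topdown=True):
        # With topdown=True, editing dirs in place stops os.walk
        # from descending into the destination folder.
        dirs[:] = [d for d in dirs if os.path.join(root, d) != dest]
        for name in files:
            if os.path.splitext(name)[1].lower() == ".mp3":
                old_path = os.path.join(root, name)
                # shutil.move returns the destination path.
                moved.append(shutil.move(old_path, os.path.join(dest, name)))
    return moved
```

Pruning dirs means the walk never lists the files that were just moved, so there is no need for a separate check inside the inner loop.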
I've tried to write some code which will rename some files in a folder - essentially, they're listed as xxx_(a).bmp whereas they need to be xxx_a.bmp, where a runs from 1 to 2000.
I've used the built-in os.rename function inside a loop to get the right numbers, but this gives me FileNotFoundError: [WinError 2] The system cannot find the file specified: 'Z:/AAA/BBB/xxx_(1).bmp' -> 'Z:/AAA/BBB/xxx_1.bmp'.
I've included the code I've written below if anyone could point me in the right direction. I've checked that I'm working in the right directory and it gives me the directory I'm expecting so I'm not sure why it can't find the files.
import os
n = 2000
folder = r"Z:/AAA/BBB/"
os.chdir(folder)
saved_path = os.getcwd()
print("CWD is" + saved_path)
for i in range(1,n):
    old_file = os.path.join(folder, "xxx_(" + str(i) + ").bmp")
    new_file = os.path.join(folder, "xxx_" +str(i)+ ".bmp")
    os.rename(old_file, new_file)
print('renamed files')
The problem is os.rename doesn't create a new directory if the new name is a filename in a directory that does not currently exist.
In order to create the directory first, you can do the following in Python3:
os.makedirs(dirname, exist_ok=True)
In this case dirname can contain created or not-yet-created subdirectories.
As an alternative, one may use os.renames, which handles new and intermediate directories.
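A small sketch of the makedirs-then-rename idea described above. The helper name rename_into is hypothetical; note that it only addresses a missing target folder, so os.rename will still raise FileNotFoundError if the source file itself does not exist.

```python
import os


def rename_into(old_path, new_path):
    """Rename old_path to new_path, creating new_path's parent folder first."""
    parent = os.path.dirname(new_path)
    if parent:
        # exist_ok=True makes this a no-op when the folder is already there.
        os.makedirs(parent, exist_ok=True)
    os.rename(old_path, new_path)
```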
Try iterating files inside the directory and processing the files that meet your criteria.
from pathlib import Path
import re
folder = Path("Z:/AAA/BBB/")
for f in folder.iterdir():
    if '(' in f.name:
        new_name = f.stem.replace('(', '').replace(')', '')
        # using regex (raw string so the backslashes reach re unchanged)
        # new_name = re.sub(r'\(([^)]+)\)', r'\1', f.stem)
        extension = f.suffix
        new_path = f.with_name(new_name + extension)
        f.rename(new_path)
I am trying to list all the files in the current folder and also files in the folders of the current folder.
This is what I have been up to:
import os
def sendFnF(dirList):
    for file in dirList:
        if os.path.isdir(file):
            print 'Going in dir:',file
            dirList1= os.listdir('./'+file)
            # print 'files in list', dirList1
            sendFnF(dirList1)
            print 'backToPrevDirectory:'
        else:
            print 'file name is',file
filename= raw_input()
dirList= os.listdir('./'+filename)
sendFnF(dirList)
This code does get me into the folders of the current directory, but when it comes to sub-folders, it treats them as files.
Any idea what I am doing wrong?
Thanks in advance,
Sarge.
Prepending ./ to a path does essentially nothing. Also, calling a function recursively with a directory path doesn't change the current directory, and thus doesn't change the meaning of . in a file path.
Your basic approach is right; to go down a directory, use os.path.join(). It'd be best to restructure your code so you call listdir() at the start of sendFnF():
def sendFnF(directory):
    for fname in os.listdir(directory):
        # Add the current directory to the filename
        fpath = os.path.join(directory, fname)
        # You need to check the full path, not just the filename
        if os.path.isdir(fpath):
            sendFnF(fpath)
        else:
            # ...
# ...
sendFnF(filename)
That said, unless this is an exercise, you can just use os.walk()
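For comparison, a rough sketch of what the os.walk() version could look like. The function name list_tree and its return shape are just choices made for this example.

```python
import os


def list_tree(top):
    """Return (directories, files) found below `top`, as full paths."""
    found_dirs, found_files = [], []
    for root, dirs, files in os.walk(top):
        # root is the folder currently being visited; dirs and files hold
        # bare names, so join them with root to get usable paths.
        found_dirs.extend(os.path.join(root, d) for d in dirs)
        found_files.extend(os.path.join(root, f) for f in files)
    return sorted(found_dirs), sorted(found_files)
```

os.walk() takes care of the recursion, so sub-folders at any depth are reported as directories rather than files.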
I've got a problem with a short script, it'd be great if you could have a look!
import os
import subprocess
root = "/Users/software/fmtomov1.0/remaker_lastplot/source_relocation/observed_arrivals_loc3d"
def loop_loc3d(file_in):
    """Loops loc3d over the source files"""
    return subprocess.call (['loc3d'], shell=True)
def relocation ():
    for subdir, dirs, files in os.walk(root):
        for file in files:
            file_in = open(os.path.join(subdir, file), 'r')
            return loop_loc3d(file_in)
I think the script is quite easy to understand; it's very simple. However, I'm not getting the result I want. In a few words, I just want 'loc3d' to operate on the contents of all the files in the 'observed_arrivals_loc3d' directory, which means that I need to open all the files, and that's what I've actually done. In fact, if I try to 'print files' after:
for subdir, dirs, files in os.walk(root)
I'll get the name of every file. Furthermore, if I try a 'print file_in' after
file_in = open(os.path.join(subdir, file), 'r')
I get something like this line for every file:
<open file '/Users/software/fmtomov1.0/remaker_lastplot/source_relocation/observed_arrivals_loc3d/EVENT2580', mode 'r' at 0x78fe38>
subprocess has been tested alone on only one file and it's working.
Overall I'm getting no errors, just -11, which means absolutely nothing to me. The output from loc3d should be completely different.
So does the code look fine to you? Is there anything I'm missing? Any suggestion?
Thanks for your help!
I assume you would call loc3d filename from the CLI. If so, then:
def loop_loc3d(filename):
    """Loops loc3d over the source files"""
    return subprocess.call (['loc3d',filename])
def relocation():
    for subdir, dirs, files in os.walk(root):
        for file in files:
            filename = os.path.join(subdir, file)
            return loop_loc3d(filename)
In other words, don't open the file yourself, let loc3d do it.
Currently your relocation method will return after the first iteration (for the first file). You shouldn't need to return at all.
def loop_loc3d(filename):
    """Loops loc3d over the source files"""
    return subprocess.call (['loc3d',filename])
def relocation ():
    for subdir, dirs, files in os.walk(root):
        for file in files:
            filename = os.path.join(subdir, file)
            loop_loc3d(filename)
This is only one of the issues. The other is concerning loc3d itself. Try providing the full path for loc3d.
A return code of -11 probably means that the command was killed by signal 11, which is SIGSEGV (segmentation fault) on Linux and macOS.
If so, it is a bug in loc3d. A well-behaved program should not produce a segmentation fault on any user input.
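To make return codes like this easier to read, here is a small sketch that decodes them. The helper name describe_returncode is invented for this example; it relies on the documented subprocess convention that, on POSIX, a negative return code -N means the child was terminated by signal N.

```python
import signal


def describe_returncode(code):
    """Turn a subprocess return code into a short human-readable string."""
    if code < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            return "killed by " + signal.Signals(-code).name
        except ValueError:
            return "killed by unknown signal %d" % -code
    return "exited with status %d" % code
```

For example, describe_returncode(subprocess.call(['loc3d', filename])) would say which signal ended the process instead of showing a bare negative number.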
Feed loc3d only files that it can understand. Print filenames or use subprocess.check_call() to find out which file it doesn't like:
#!/usr/bin/env python
import fnmatch
import os
import subprocess
def loc3d_files(root):
    for dirpath, dirs, files in os.walk(root, topdown=True):
        # skip hidden directories
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        # process only known files
        for file in fnmatch.filter(files, "*some?pattern[0-9][0-9].[ch]"):
            yield os.path.join(dirpath, file)
for path in loc3d_files(root):
    print(path)
    subprocess.check_call(['loc3d', path]) # raise on any error
Just found out that loc3d, as unutbu said, relies on several variables, and in this specific case on a file called 'observed_arrivals' that I have to create in, and delete from, my directory every time. In Python terms it means:
import os
import shutil
import subprocess
def loop_loc3d(file_in):
    """Loops loc3d over the source files"""
    return subprocess.call(["loc3d"], shell=True)
path = "/Users/software/fmtomo/remaker_lastplot/source_relocation"
path2 = "/Users/Programming/working_directory/2test"
new_file_name = 'observed_arrivals'
def define_object_file ():
    for filename in os.listdir("."):
        file_in = os.rename (filename, new_file_name) # get the observed_arrivals file
        file_in = shutil.copy ("/Users/simone/Programming/working_directory/2test/observed_arrivals", "/Users/software/fmtomo/remaker_lastplot/source_relocation")
        os.chdir(path) # goes where loc3d is
        loop_loc3d (file_in)
        os.remove("/Users/software/fmtomo/remaker_lastplot/source_relocation/observed_arrivals")
        os.remove ("/Users/Programming/working_directory/2test/observed_arrivals")
        os.chdir(path2)
Now, this is working very well, so it should answer my question. I guess it's quite easy to understand, it's just copying, changing dir and that kind of stuff.
I have a set of folders, and I want to be able to run a function that will find the most recently edited file and tell me the name of the file and the folder it is in.
Folder layout:
root
    Folder A
        File A
        File B
    Folder B
        File C
        File D
    etc...
Any tips to get me started, as I've hit a bit of a wall?
You should look at the os.walk function, as well as os.stat, which can let you do something like:
import os
max_mtime = 0
for dirname,subdirs,files in os.walk("."):
    for fname in files:
        full_path = os.path.join(dirname, fname)
        mtime = os.stat(full_path).st_mtime
        if mtime > max_mtime:
            max_mtime = mtime
            max_dir = dirname
            max_file = fname
print(max_dir, max_file)
It helps to wrap the built-in directory walking in a function that yields only full paths to files. Then you can just take the function that returns all files and pick out the one that has the highest modification time:
import os
def all_files_under(path):
    """Iterates through all files that are under the given path."""
    for cur_path, dirnames, filenames in os.walk(path):
        for filename in filenames:
            yield os.path.join(cur_path, filename)
latest_file = max(all_files_under('root'), key=os.path.getmtime)
If anyone is looking for a one-line way to do it (note that os.scandir only looks at the entries directly inside the given folder, not in sub-folders):
latest_edited_file = max(os.scandir("path\\to\\search"), key=lambda x: x.stat().st_mtime).name
Use os.walk to list the files.
Use os.stat to get each file's modification timestamp (st_mtime).
Put the timestamps and filenames in a list and sort it by timestamp; the largest timestamp belongs to the most recently edited file.
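The three steps above can be sketched roughly as follows. The function name files_by_mtime is made up for this example, and the (mtime, path) tuple layout is just one convenient choice.

```python
import os


def files_by_mtime(top):
    """Return (mtime, path) pairs for every file below `top`, oldest first."""
    stamped = []
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            stamped.append((os.stat(path).st_mtime, path))
    # Tuples sort by their first element, i.e. by timestamp.
    stamped.sort()
    return stamped
```

The last element of the returned list is then the most recently edited file, and os.path.split() on its path gives the folder and the file name separately.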
For multiple files, if anyone came here for that:
import glob, os
files = glob.glob("/target/directory/path/*/*.mp4")
files.sort(key=os.path.getmtime)
for file in files:
    print(file)
This will print all files with the .mp4 extension in any folder directly inside /target/directory/path/ (the pattern matches exactly one folder level), with the most recently modified file paths at the bottom.
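If the files can sit at any depth, a recursive pattern is one option. This is a sketch only: mp4s_by_mtime is a hypothetical name, and it relies on glob's recursive=True flag, which makes "**" match zero or more folder levels (hidden folders are skipped by default).

```python
import glob
import os


def mp4s_by_mtime(top):
    """Return every .mp4 path below `top`, least recently modified first."""
    # glob.escape protects any special characters in the folder name itself.
    pattern = os.path.join(glob.escape(top), "**", "*.mp4")
    return sorted(glob.glob(pattern, recursive=True), key=os.path.getmtime)
```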
You can use os.walk. See: http://docs.python.org/library/os.html
Use os.path.walk() to traverse the directory tree and os.stat().st_mtime to get the mtime of the files. Note that os.path.walk() exists only in Python 2; it was removed in Python 3, where os.walk() is the replacement.
The function you pass to os.path.walk() (the visit parameter) just needs to keep track of the largest mtime it's seen and where it saw it.
I'm using path = r"C:\Users\traveler\Desktop":
import os
def all_files_under(path):
    #"""Iterates through all files that are under the given path."""
    for cur_path, dirnames, filenames in os.walk(path):
        for filename in filenames:
            yield os.path.join(cur_path, filename)
latest_file = max(all_files_under('root'), key=os.path.getmtime)
What am I missing here?
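One thing that stands out in the snippet above: path is set to the Desktop folder, but the call passes the literal string 'root', which os.walk() treats as a folder named root relative to the current directory. If no such folder exists, os.walk() yields nothing and max() raises a ValueError. That is probably the issue, although the post doesn't show the actual error. A sketch of a version that takes the folder as an argument (latest_file_under is a made-up name):

```python
import os


def all_files_under(path):
    """Iterates through all files that are under the given path."""
    for cur_path, dirnames, filenames in os.walk(path):
        for filename in filenames:
            yield os.path.join(cur_path, filename)


def latest_file_under(path):
    """Return the most recently modified file below `path`, or None if there is none."""
    # default=None avoids the ValueError that max() raises on an empty iterable.
    return max(all_files_under(path), key=os.path.getmtime, default=None)
```

With that, latest_file_under(path) searches the folder stored in path rather than a folder literally called root.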