List only files in a directory? - python

Is there a way to list the files (not directories) in a directory with Python? I know I could use os.listdir and a loop of os.path.isfile()s, but if there's something simpler (like a function os.path.listfilesindir or something), it would probably be better.

This is a simple generator expression:
files = (file for file in os.listdir(path)
         if os.path.isfile(os.path.join(path, file)))
for file in files: # You could shorten this to one line, but it runs on a bit.
    ...
Or you could make a generator function if it suited you better:
def files(path):
    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            yield file
Then simply:
for file in files(path):
    ...

files = next(os.walk('..'))[2]

Using pathlib on Windows as follows:
files = (x for x in Path("your_path") if x.is_file())
raises an error:
TypeError: 'WindowsPath' object is not iterable
Use Path.iterdir() instead:
from pathlib import Path
filePath = Path("your_path")
if filePath.is_dir():
    files = [x for x in filePath.iterdir() if x.is_file()]

Since Python 3.5 you can use glob with the recursive pattern "**". Note that glob returns both files and directories, so filter the result to keep only the files:
import glob, os
files = glob.glob(os.path.join(in_path, "**/*"), recursive=True)
files = [f for f in files if os.path.isfile(f)]

For the special case of working with files in the current directory, you could do it as a simple one-liner list comprehension:
[f for f in os.listdir(os.curdir) if os.path.isfile(f)]
Otherwise, in the more general case, directory paths and filenames have to be joined. Note that os.listdir does not expand "~", so expand it first:
dirpath = os.path.expanduser('~/path_to_dir_of_interest')
files = [f for f in os.listdir(dirpath) if os.path.isfile(os.path.join(dirpath, f))]

You could try pathlib, which has a lot of other useful stuff too.
Pathlib is an object-oriented library for interacting with filesystem paths. To get the files in the current directory, one can do:
from pathlib import Path
files = (x for x in Path(".").iterdir() if x.is_file())
for file in files:
    print(str(file), "is a file!")
This is, in my opinion, more Pythonic than using os.path.
See also: PEP 428.

Using pathlib, the shortest way to list only files is:
[x for x in Path("your_path").iterdir() if x.is_file()]
with depth support if need be.
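The "depth support" mentioned above is not spelled out. One possible reading is a recursive listing that stops after a given number of levels. The following is only a sketch under that assumption; the function name list_files and the max_depth parameter are hypothetical, not part of pathlib.

```python
from pathlib import Path


def list_files(root, max_depth=0):
    """Return files under root, descending at most max_depth levels.

    max_depth=0 lists only the files directly inside root.
    """
    files = []
    for entry in Path(root).iterdir():
        if entry.is_file():
            files.append(entry)
        elif entry.is_dir() and max_depth > 0:
            # Recurse into the subdirectory with one less level allowed.
            files.extend(list_files(entry, max_depth - 1))
    return files
```

Calling list_files("your_path") behaves like the one-liner above, while list_files("your_path", 1) also includes files one level down. Note that is_dir() follows symlinks, so a symlinked directory is descended into as well.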

If you use Python 3, you could use pathlib.
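But be aware of what the is_file() filter does when you use it like this:
from pathlib import Path
# p is the directory path, as a Path object
# files is a list of files as Path objects
files = [x for x in p.iterdir() if x.is_file()]
is_file() is True only for regular files (and symlinks pointing to them), so broken symlinks and special files are skipped. Empty files are not skipped: iterdir() yields every entry regardless of its size.
If you want everything that is not a directory, an alternative is:
from pathlib import Path
# p is the directory path, as a Path object
# list all of the directory's contents
contents = list(p.glob("*"))
# if an element of contents isn't a folder, treat it as a file
# is_dir() also works for empty folders
files = [x for x in contents if not x.is_dir()]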

Related

Iterating through directories in Python [duplicate]

I need to iterate through the subdirectories of a given directory and search for files. If I get a file I have to open it and change the content and replace it with my own lines.
I tried this:
import os
rootdir ='C:/Users/sid/Desktop/test'
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        f=open(file,'r')
        lines=f.readlines()
        f.close()
        f=open(file,'w')
        for line in lines:
            newline = "No you are not"
            f.write(newline)
        f.close()
but I am getting an error. What am I doing wrong?
The actual walk through the directories works as you have coded it. If you replace the contents of the inner loop with a simple print statement you can see that each file is found:
import os
rootdir = 'C:/Users/sid/Desktop/test'
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        print(os.path.join(subdir, file))
If you still get errors when running the above, please provide the error message.
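The error in the question most likely comes from open(file, ...): os.walk yields bare file names, so open() looks for them in the current working directory instead of in subdir. A minimal sketch of the corrected loop follows, assuming all files are text files. The function name overwrite_lines and its replacement parameter are hypothetical, and a trailing newline is added to each written line, which the question's code did not do.

```python
import os


def overwrite_lines(rootdir, replacement="No you are not"):
    """Replace every line of every file below rootdir with `replacement`."""
    for subdir, dirs, files in os.walk(rootdir):
        for name in files:
            # os.walk yields bare names; join them with the directory
            # that is currently being visited.
            path = os.path.join(subdir, name)
            with open(path, "r") as f:
                lines = f.readlines()
            with open(path, "w") as f:
                for _ in lines:
                    f.write(replacement + "\n")
```

Using with blocks closes each file even if reading or writing raises an exception.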
Another way of returning all files in subdirectories is to use the pathlib module, introduced in Python 3.4, which provides an object-oriented approach to handling filesystem paths (pathlib is also available on Python 2.7 via the pathlib2 module on PyPI):
from pathlib import Path
rootdir = Path('C:/Users/sid/Desktop/test')
# Return a list of regular files only, not directories
file_list = [f for f in rootdir.glob('**/*') if f.is_file()]
# For absolute paths instead of paths relative to the current dir
file_list = [f for f in rootdir.resolve().glob('**/*') if f.is_file()]
Since Python 3.5, the glob module also supports recursive file finding:
import os
from glob import iglob
rootdir_glob = 'C:/Users/sid/Desktop/test/**/*' # Note the added asterisks
# This will return absolute paths
file_list = [f for f in iglob(rootdir_glob, recursive=True) if os.path.isfile(f)]
The file_list from either of the above approaches can be iterated over without the need for a nested loop:
for f in file_list:
    print(f) # Replace with desired operations
From Python 3.5 onward, you can use glob.iglob with the ** pattern and recursive=True, which seems the most Pythonic solution, i.e.:
import glob, os
for filename in glob.iglob('/pardadox-music/**', recursive=True):
    if os.path.isfile(filename): # filter dirs
        print(filename)
Output:
/pardadox-music/modules/her1.mod
/pardadox-music/modules/her2.mod
...
Notes:
glob.iglob(pathname, recursive=False)
Return an iterator which yields the same values as glob() without actually storing them all simultaneously.
If recursive is True, the pattern '**' will match any files and zero or more directories and subdirectories.
If the directory contains files starting with . they won’t be matched by default. For example, consider a directory containing card.gif and .card.gif:
>>> import glob
>>> glob.glob('*.gif')
['card.gif']
>>> glob.glob('.c*')
['.card.gif']
You can also use pathlib's Path.rglob(pattern), which is the same as calling Path.glob() with **/ added in front of the given relative pattern.
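As a small illustration of rglob, here is a sketch that collects regular files below a directory. The helper name find_files is hypothetical; only Path.rglob and Path.is_file come from pathlib.

```python
from pathlib import Path


def find_files(root, pattern="*"):
    """Return all regular files below root whose name matches pattern."""
    # Path.rglob(pattern) behaves like Path.glob("**/" + pattern).
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())
```

For the example above, find_files('/pardadox-music', '*.mod') would return Path objects for the .mod files at any depth.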

How to run script for all files in a folder/directory

I am new to Python. I have successfully written a script to search for something within a file using:
open(r"C:\file.txt") and the re.search function, and it all works fine.
Is there a way to run the search on all files within a folder? Currently, I have to manually change the file name in my script: open(r"C:\file.txt"), open(r"C:\file1.txt"), open(r"C:\file2.txt"), etc.
Thanks.
You can use os.walk to check all the files, as follows:
import os
for root, _, files in os.walk(path):
    for filename in files:
        with open(os.path.join(root, filename), 'r') as f:
            pass  # your code goes here
Explanation:
os.walk yields a tuple of (root path, dir names, file names) for each folder it visits, so you can iterate through the file names and open each file using os.path.join(root, filename), which joins the root path with the file name.
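Putting the os.walk loop together with the re.search call from the question, a sketch might look like this. The function name search_files is hypothetical, and the files are assumed to be readable as text; undecodable bytes are ignored.

```python
import os
import re


def search_files(path, pattern):
    """Return paths of files below `path` whose text matches `pattern`."""
    regex = re.compile(pattern)
    matches = []
    for root, _, files in os.walk(path):
        for filename in files:
            full_path = os.path.join(root, filename)
            # errors="ignore" skips undecodable bytes instead of raising.
            with open(full_path, "r", errors="ignore") as f:
                if regex.search(f.read()):
                    matches.append(full_path)
    return matches
```

Each file is read fully into memory, which is fine for small files; for very large files, searching line by line would be a gentler alternative.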
Since you're a beginner, I'll give you a simple solution and walk through it.
Import the os module, and use the os.listdir function to create a list of everything in the directory. Then, iterate through the files using a for loop.
Example:
# Importing the os module
import os
# Give the directory you wish to iterate through
my_dir = r"C:\Users\bleh\Desktop\files"  # replace with your directory
# Using os.listdir to create a list of all of the files in dir
dir_list = os.listdir(my_dir)
# Use the for loop to iterate through the list you just created, and open the files
for f in dir_list:
    file_path = os.path.join(my_dir, f)  # os.listdir returns bare names, so build the full path
    # Whatever you want to do to all of the files
If you need help on the concepts, refer to the following:
for loops in Python 3: http://www.python-course.eu/python3_for_loop.php
os function Library (this has some cool stuff in it): https://docs.python.org/2/library/os.html
Good luck!
You can use the os.listdir(path) function:
import os
path = '/Users/ricardomartinez/repos/Salary-API'
# List for all files in a given PATH
file_list = os.listdir(path)
# If you want to filter by file type
file_list = [file for file in os.listdir(path) if os.path.splitext(file)[1] == '.py']
# In both cases you can iterate over the list and apply
# your operations
for file in file_list:
    print(file)
    # Operations that you want to do over files

List files ONLY in the current directory

In Python, I only want to list all the files in the current directory ONLY. I do not want files listed from any sub directory or parent.
There do seem to be similar solutions out there, but they don't seem to work for me. Here's my code snippet:
import os
for subdir, dirs, files in os.walk('./'):
    for file in files:
        # do some stuff
        print(file)
Let's suppose I have 2 files, holygrail.py and Tim inside my current directory. I have a folder as well and it contains two files - let's call them Arthur and Lancelot - inside it. When I run the script, this is what I get:
holygrail.py
Tim
Arthur
Lancelot
I am happy with holygrail.py and Tim. But the two files, Arthur and Lancelot, I do not want listed.
Just use os.listdir and os.path.isfile instead of os.walk.
Example:
import os
files = [f for f in os.listdir('.') if os.path.isfile(f)]
for f in files:
    pass  # do something
But be careful when applying this to another directory, like
files = [f for f in os.listdir(somedir) if os.path.isfile(f)]
which would not work because f is not a full path: os.path.isfile(f) resolves it relative to the current directory, not somedir.
Therefore, for filtering on another directory, do os.path.isfile(os.path.join(somedir, f))
(Thanks Causality for the hint)
You can use os.listdir for this purpose. If you only want files and not directories, you can filter the results using os.path.isfile.
Example:
files = os.listdir(os.curdir)  # files and directories
or
files = filter(os.path.isfile, os.listdir(os.curdir))  # files only
files = [f for f in os.listdir(os.curdir) if os.path.isfile(f)]  # list comprehension version
import os
destdir = '/var/tmp/testdir'
files = [ f for f in os.listdir(destdir) if os.path.isfile(os.path.join(destdir,f)) ]
You can use os.scandir(), which was added to the standard library in Python 3.5.
import os
for entry in os.scandir('.'):
    if entry.is_file():
        print(entry.name)
This is faster than os.listdir() followed by os.path.isfile() calls. Since Python 3.5, os.walk() is itself implemented with os.scandir().
You can use the pathlib module.
from pathlib import Path
x = Path('./')
print(list(filter(lambda y:y.is_file(), x.iterdir())))
This can be done with os.walk().
Tested with Python 3.5.2:
import os
for root, dirs, files in os.walk('.', topdown=True):
    dirs.clear()  # with topdown=True, this prevents walk from going into subdirectories
    for file in files:
        # do some stuff
        print(file)
Remove the dirs.clear() line and the files in subfolders are included again.
Update with references:
The os.walk documentation describes the 3-tuple that is yielded and the effect of topdown.
list.clear() is documented as emptying a list.
So by clearing the relevant list from os.walk you can affect its result to suit your needs.
import os
for subdir, dirs, files in os.walk('./'):
    for file in files:
        # do some stuff
        print(file)
You can improve this code with del dirs[:], like the following.
import os
for subdir, dirs, files in os.walk('./'):
    del dirs[:]
    for file in files:
        # do some stuff
        print(file)
Or, even better, point os.walk at the current working directory.
import os
cwd = os.getcwd()
for subdir, dirs, files in os.walk(cwd, topdown=True):
    del dirs[:]  # remove the sub directories.
    for file in files:
        # do some stuff
        print(file)
instead of os.walk, just use os.listdir
To list files in a specific folder excluding files in its sub-folders with os.walk use:
_, _, file_list = next(os.walk(data_folder))
Following up on Pygirl and Flimm's use of pathlib (really helpful reference, btw): their solution included the full path in the result, so here is a solution that outputs just the file names:
from pathlib import Path
p = Path(destination_dir) # destination_dir = './' in original post
files = [x.name for x in p.iterdir() if x.is_file()]
print(files)

How to filter files (with known type) from os.walk?

I have a list from os.walk, but I want to exclude some directories and files. I know how to do it with directories:
for root, dirs, files in os.walk('C:/My_files/test'):
    if "Update" in dirs:
        dirs.remove("Update")
But how can I do it with files whose type I know? This doesn't work:
if "*.dat" in files:
    files.remove("*.dat")
files = [ fi for fi in files if not fi.endswith(".dat") ]
Exclude multiple extensions.
files = [ file for file in files if not file.endswith( ('.dat','.tar') ) ]
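To show where these one-liners sit inside the walk, here is a sketch that combines the directory pruning from the question with the extension filter. The function name filtered_walk and its parameters skip_dirs and skip_exts are hypothetical.

```python
import os


def filtered_walk(top, skip_dirs=("Update",), skip_exts=(".dat",)):
    """Yield file paths below top, skipping some directories and extensions."""
    for root, dirs, files in os.walk(top):
        # Assigning to dirs[:] edits the list in place, which is what
        # tells os.walk (in its default top-down mode) not to descend.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for name in files:
            if not name.endswith(tuple(skip_exts)):
                yield os.path.join(root, name)
```

For example, list(filtered_walk('C:/My_files/test')) would give every file outside any "Update" folder that does not end in ".dat".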
And in one more way, because I just wrote this, and then stumbled upon this question:
files = filter(lambda file: not file.endswith('.txt'), files)
Note that in Python 3 filter returns an iterator, not a list, and the list comprehension is "preferred".
A concise way of writing it, if you do this a lot:
def exclude_ext(ext):
    def compare(fn): return os.path.splitext(fn)[1] != ext
    return compare
files = filter(exclude_ext(".dat"), files)
Of course, exclude_ext goes in your appropriate utility package.
files = [file for file in files if os.path.splitext(file)[1] != '.dat']
Should be exactly what you need:
if thisFile.endswith(".txt"):
Try this:
import os
skippingWalk = lambda targetDirectory, excludedExtentions: (
    (root, dirs, [F for F in files if os.path.splitext(F)[1] not in excludedExtentions])
    for (root, dirs, files) in os.walk(targetDirectory)
)
for line in skippingWalk("C:/My_files/test", [".dat"]):
    print(line)
This is a lambda function that returns a generator expression. You pass it a path and some extensions, and it invokes os.walk with the path, filters out the files with extensions in the list of unwanted extensions using a list comprehension, and yields the results.
(edit: removed the .upper() call because there might be an actual difference between extensions of different case. If you want this to be case insensitive, add .upper() after os.path.splitext(F)[1] and pass the extensions in capital letters.)
The easiest way to filter files with a known type with os.walk() is to walk the path and check each file name's extension with an if statement.
for base, dirs, files in os.walk(path):
    for name in files:
        if name.endswith('.type'):
            # Here you will go through all the files with the particular extension '.type'
            ...
Another solution would be to use the functions from the fnmatch module:
import fnmatch
def MatchesExtensions(name, extensions=["*.dat", "*.txt", "*.whatever"]):
    for pattern in extensions:
        if fnmatch.fnmatch(name, pattern):
            return True
    return False
On Windows this avoids the hassle with upper/lower case extensions, because fnmatch.fnmatch() normalizes case with os.path.normcase(): you don't need to convert to lower/upper case to match *.JPEG, *.jpeg, *.JPeg, *.Jpeg. On POSIX systems the comparison stays case-sensitive, so lower-case the name first if you need that behavior there.
All of the above answers work. Just wanted to add, for anyone whose files come from heterogeneous sources, e.g. images downloaded in archives from the Internet: because Unix-like systems are case sensitive, you may end up with extensions like '.PNG' and '.png'. These are treated as different strings by the endswith method, i.e. '.PNG'.endswith('png') returns False. To avoid this problem, use the lower() method.
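A minimal sketch of the lower() approach, assuming the extensions are passed with their leading dot. The helper name has_extension is hypothetical.

```python
import os


def has_extension(filename, extensions):
    """Return True if filename ends with one of extensions, ignoring case."""
    # Lower-case both sides so '.PNG' and '.png' compare equal.
    ext = os.path.splitext(filename)[1].lower()
    return ext in {e.lower() for e in extensions}
```

It can be dropped into any of the list comprehensions above, e.g. [f for f in files if has_extension(f, ['.png'])].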
Here is how to find all files in a directory ending with a specific extension:
import os
path = os.path.expanduser('C:\\Users\\A')
for filename in [item for item in os.listdir(path) if item.endswith(".ipynb")]:
    print(filename)
Here are two ways to select files by file type:
from os import listdir
from os.path import isfile, join
source_path = './data'
excelfiles = [f for f in listdir(source_path) if f.endswith('.xlsx') and isfile(join(source_path, f))]
from os import walk
excelfiles2 = []
for (dirpath, dirnames, filenames) in walk(source_path):
    excelfiles2.extend(filename for filename in filenames if filename.endswith('.xlsx'))
    break

Getting a list of all subdirectories in the current directory

Is there a way to return a list of all the subdirectories in the current directory in Python?
I know you can do this with files, but I need to get the list of directories instead.
Do you mean immediate subdirectories, or every directory right down the tree?
Either way, you could use os.walk to do this:
os.walk(directory)
will yield a tuple for each subdirectory. The first entry in the 3-tuple is a directory name, so
[x[0] for x in os.walk(directory)]
should give you all of the subdirectories, recursively.
Note that the second entry in the tuple is the list of child directories of the entry in the first position, so you could use this instead, but it's not likely to save you much.
However, you could use it just to give you the immediate child directories:
next(os.walk('.'))[1]
Or see the other solutions already posted, using os.listdir and os.path.isdir, including those at "How to get all of the immediate subdirectories in Python".
You could just use glob.glob
from glob import glob
glob("/path/to/directory/*/", recursive = True)
Don't forget the trailing / after the *.
Much nicer than the above, because you don't need several os.path.join() and you will get the full path directly (if you wish), you can do this in Python 3.5 and above.
subfolders = [ f.path for f in os.scandir(folder) if f.is_dir() ]
This will give the complete path to the subdirectory.
If you only want the name of the subdirectory use f.name instead of f.path
https://docs.python.org/3/library/os.html#os.scandir
Slightly OT: In case you need all subfolder recursively and/or all files recursively, have a look at this function, that is faster than os.walk & glob and will return a list of all subfolders as well as all files inside those (sub-)subfolders: https://stackoverflow.com/a/59803793/2441026
In case you want only all subfolders recursively:
def fast_scandir(dirname):
    subfolders= [f.path for f in os.scandir(dirname) if f.is_dir()]
    for dirname in list(subfolders):
        subfolders.extend(fast_scandir(dirname))
    return subfolders
Returns a list of all subfolders with their full paths. This again is faster than os.walk and a lot faster than glob.
An analysis of all functions
tl;dr:
- If you want to get all immediate subdirectories for a folder use os.scandir.
- If you want to get all subdirectories, even nested ones, use os.walk or - slightly faster - the fast_scandir function above.
- Never use os.walk for only top-level subdirectories, as it can be hundreds(!) of times slower than os.scandir.
If you run the code below, make sure to run it once first so that your OS will have accessed the folder, discard those results, and then run the actual test; otherwise the results will be skewed.
You might want to mix up the function calls, but I tested it, and it did not really matter.
All examples will give the full path to the folder. The pathlib example as a (Windows)Path object.
The first element of os.walk will be the base folder. So you will not get only subdirectories. You can use fu.pop(0) to remove it.
None of the results will use natural sorting. This means results will be sorted like this: 1, 10, 2. To get natural sorting (1, 2, 10), please have a look at https://stackoverflow.com/a/48030307/2441026
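For reference, one common way to get natural sorting is a sort key that splits names into digit and non-digit chunks. This is only a sketch and not necessarily what the linked answer does; the name natural_key is hypothetical.

```python
import re


def natural_key(text):
    """Split text into digit and non-digit chunks for natural sorting."""
    # re.split with a capturing group alternates non-digit and digit
    # chunks, so the digit runs always sit at the odd indices.
    parts = re.split(r"(\d+)", text)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]
```

It would be used as sorted(fu, key=natural_key) on any of the result lists from the benchmark functions.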
Results:
os.scandir took 1 ms. Found dirs: 439
os.walk took 463 ms. Found dirs: 441 -> it found the nested one + base folder.
glob.glob took 20 ms. Found dirs: 439
pathlib.iterdir took 18 ms. Found dirs: 439
os.listdir took 18 ms. Found dirs: 439
Tested with W7x64, Python 3.8.1.
# -*- coding: utf-8 -*-
# Python 3
import time
import os
from glob import glob
from pathlib import Path
directory = r"<insert_folder>"
RUNS = 1
def run_os_walk():
    a = time.time_ns()
    for i in range(RUNS):
        fu = [x[0] for x in os.walk(directory)]
    print(f"os.walk\t\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")
def run_glob():
    a = time.time_ns()
    for i in range(RUNS):
        fu = glob(directory + "/*/")
    print(f"glob.glob\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")
def run_pathlib_iterdir():
    a = time.time_ns()
    for i in range(RUNS):
        dirname = Path(directory)
        fu = [f for f in dirname.iterdir() if f.is_dir()]
    print(f"pathlib.iterdir\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")
def run_os_listdir():
    a = time.time_ns()
    for i in range(RUNS):
        dirname = Path(directory)
        fu = [os.path.join(directory, o) for o in os.listdir(directory) if os.path.isdir(os.path.join(directory, o))]
    print(f"os.listdir\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")
def run_os_scandir():
    a = time.time_ns()
    for i in range(RUNS):
        fu = [f.path for f in os.scandir(directory) if f.is_dir()]
    print(f"os.scandir\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms.\tFound dirs: {len(fu)}")
if __name__ == '__main__':
    run_os_scandir()
    run_os_walk()
    run_glob()
    run_pathlib_iterdir()
    run_os_listdir()
import os
d = '.'
[os.path.join(d, o) for o in os.listdir(d)
 if os.path.isdir(os.path.join(d,o))]
Python 3.4 introduced the pathlib module into the standard library, which provides an object oriented approach to handle filesystem paths:
from pathlib import Path
p = Path('./')
# All subdirectories in the current directory, not recursive.
[f for f in p.iterdir() if f.is_dir()]
To recursively list all subdirectories, path globbing can be used with the ** pattern.
# This will also include the current directory '.'
list(p.glob('**'))
Note that a single * as the glob pattern would include both files and directories non-recursively. To get only directories, a trailing / can be appended but this only works when using the glob library directly, not when using glob via pathlib:
import glob
# These three lines return both files and directories
list(p.glob('*'))
list(p.glob('*/'))
glob.glob('*')
# Whereas this returns only directories
glob.glob('*/')
So Path('./').glob('**') matches the same paths as glob.glob('**/', recursive=True).
Pathlib is also available on Python 2.7 via the pathlib2 module on PyPI.
If you need a recursive solution that will find all the subdirectories in the subdirectories, use walk as proposed before.
If you only need the current directory's child directories, combine os.listdir with os.path.isdir
Listing Out only directories
print("\nWe are listing out only the directories in current directory -")
directories_in_curdir = list(filter(os.path.isdir, os.listdir(os.curdir)))
print(directories_in_curdir)
Listing Out only files in current directory
files = list(filter(os.path.isfile, os.listdir(os.curdir)))
print("\nThe following are the list of all files in the current directory -")
print(files)
I prefer using filter (https://docs.python.org/2/library/functions.html#filter), but this is just a matter of taste.
d='.'
filter(lambda x: os.path.isdir(os.path.join(d, x)), os.listdir(d))
Implemented this using python-os-walk. (http://www.pythonforbeginners.com/code-snippets-source-code/python-os-walk/)
import os
print("root prints out directories only from what you specified")
print("dirs prints out sub-directories from root")
print("files prints out all files from root and directories")
print("*" * 20)
for root, dirs, files in os.walk("/var/log"):
    print(root)
    print(dirs)
    print(files)
You can get the list of subdirectories (and files) in Python 2.7 using os.listdir(path)
import os
os.listdir(path) # list of subdirectories and files
Since I stumbled upon this problem using Python 3.4 and Windows UNC paths, here's a variant for this environment:
from pathlib import WindowsPath
def SubDirPath (d):
    return [f for f in d.iterdir() if f.is_dir()]
subdirs = SubDirPath(WindowsPath(r'\\file01.acme.local\home$'))
print(subdirs)
Pathlib is new in Python 3.4 and makes working with paths under different OSes much easier:
https://docs.python.org/3.4/library/pathlib.html
Although this question was answered a long time ago, I want to recommend the pathlib module, since it is a robust way to work on both Windows and Unix.
So to get all paths in a specific directory including subdirectories:
from pathlib import Path
paths = list(Path('myhomefolder', 'folder').glob('**/*.txt'))
# all sorts of operations
file = paths[0]
file.name
file.stem
file.parent
file.suffix
etc.
Copy paste friendly in ipython:
import os
d='.'
folders = list(filter(lambda x: os.path.isdir(os.path.join(d, x)), os.listdir(d)))
Output from print(folders):
['folderA', 'folderB']
Thanks for the tips, guys. I ran into an issue with softlinks (infinite recursion) being returned as dirs. Softlinks? We don't want no stinkin' soft links! So...
This rendered just the dirs, not softlinks:
>>> import os
>>> inf = os.walk('.')
>>> [x[0] for x in inf]
['.', './iamadir']
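If only the immediate subdirectories are needed, symlinks can also be excluded explicitly with os.scandir. This is a sketch, not the approach used above; the function name real_subdirs is hypothetical.

```python
import os


def real_subdirs(path):
    """Return immediate subdirectories of path, excluding symlinks."""
    with os.scandir(path) as entries:
        # follow_symlinks=False makes is_dir() False for links to directories.
        return sorted(e.path for e in entries
                      if e.is_dir(follow_symlinks=False))
```

By contrast, os.path.isdir() and DirEntry.is_dir() with its default arguments follow symlinks, so they report a link to a directory as a directory.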
Here are a couple of simple functions based on #Blair Conrad's example -
import os
def get_subdirs(dir):
    "Get a list of immediate subdirectories"
    return next(os.walk(dir))[1]
def get_subfiles(dir):
    "Get a list of immediate subfiles"
    return next(os.walk(dir))[2]
This is how I do it.
import os
for x in os.listdir(os.getcwd()):
    if os.path.isdir(x):
        print(x)
Building upon Eli Bendersky's solution, use the following example:
import os
test_directory = "<your_directory>"
for child in os.listdir(test_directory):
    test_path = os.path.join(test_directory, child)
    if os.path.isdir(test_path):
        print(test_path)
        # Do stuff to the directory "test_path"
where <your_directory> is the path to the directory you want to traverse.
With full path and accounting for path being ., .., \\, ..\\..\\subfolder, etc:
import os, pprint
pprint.pprint([os.path.join(os.path.abspath(path), x[0])
               for x in os.walk(os.path.abspath(path))])
The easiest way:
from pathlib import Path
from glob import glob
current_dir = Path.cwd()
all_sub_dir_paths = glob(str(current_dir) + '/*/') # returns list of sub directory paths
all_sub_dir_names = [Path(sub_dir).name for sub_dir in all_sub_dir_paths]
This answer didn't seem to exist already.
directories = [ x for x in os.listdir('.') if os.path.isdir(x) ]
I've had a similar question recently, and I found out that the best answer for python 3.6 (as user havlock added) is to use os.scandir. Since it seems there is no solution using it, I'll add my own. First, a non-recursive solution that lists only the subdirectories directly under the root directory.
def get_dirlist(rootdir):
    dirlist = []
    with os.scandir(rootdir) as rit:
        for entry in rit:
            if not entry.name.startswith('.') and entry.is_dir():
                dirlist.append(entry.path)
    dirlist.sort() # Optional, in case you want sorted directory names
    return dirlist
The recursive version would look like this:
def get_dirlist(rootdir):
    dirlist = []
    with os.scandir(rootdir) as rit:
        for entry in rit:
            if not entry.name.startswith('.') and entry.is_dir():
                dirlist.append(entry.path)
                dirlist += get_dirlist(entry.path)
    dirlist.sort() # Optional, in case you want sorted directory names
    return dirlist
Keep in mind that entry.path is rootdir joined with the entry's name, so it is an absolute path only if rootdir is absolute. In case you only need the folder name, you can use entry.name instead. Refer to os.DirEntry for additional details about the entry object.
Using os.walk:
sub_folders = []
for dir, sub_dirs, files in os.walk(test_folder):
    sub_folders.extend(sub_dirs)
This will list all subdirectories right down the file tree.
import pathlib
def list_dir(dir):
    path = pathlib.Path(dir)
    dirs = []
    try:
        for item in path.iterdir():
            if item.is_dir():
                dirs.append(item)
                dirs = dirs + list_dir(item)
        return dirs
    except FileNotFoundError:
        print('Invalid directory')
        return []  # keep the return type a list for callers
pathlib is new in version 3.4
Function to return a List of all subdirectories within a given file path. Will search through the entire file tree.
import os
def get_sub_directory_paths(start_directory, sub_directories):
    """
    This method iterates through all subdirectory paths of a given
    directory to collect all directory paths.
    :param start_directory: The starting directory path.
    :param sub_directories: A List that all subdirectory paths will be
    stored to.
    :return: A List of all sub-directory paths.
    """
    for item in os.listdir(start_directory):
        full_path = os.path.join(start_directory, item)
        if os.path.isdir(full_path):
            sub_directories.append(full_path)
            # Recursive call to search through all subdirectories.
            get_sub_directory_paths(full_path, sub_directories)
    return sub_directories
Use os.path.isdir as a filter function over os.listdir(), something like this:
filter(os.path.isdir,[os.path.join(os.path.abspath('PATH'),p) for p in os.listdir('PATH/')])
This function, given a parent directory, iterates over all its directories recursively and prints all the file names it finds inside. Very useful.
import os
def printDirectoryFiles(directory):
    for filename in os.listdir(directory):
        full_path=os.path.join(directory, filename)
        if not os.path.isdir(full_path):
            print( full_path + "\n")
def checkFolders(directory):
    dir_list = next(os.walk(directory))[1]
    #print(dir_list)
    for dir in dir_list:
        print(dir)
        checkFolders(directory +"/"+ dir)
    printDirectoryFiles(directory)
main_dir="C:/Users/S0082448/Desktop/carpeta1"
checkFolders(main_dir)
input("Press enter to exit ;")
We can get a list of all the folders by using os.walk():
import os
path = os.getcwd()
pathObject = os.walk(path)
pathObject is a generator, and we can turn it into a list with
arr = [x for x in pathObject]
arr has the form [('current directory', [folders in current directory], [files in current directory]), ('subdirectory', [folders in subdirectory], [files in subdirectory]), ...]
We can get a list of all the subdirectories by iterating through arr and printing the middle list:
for i in arr:
    for j in i[1]:
        print(j)
This will print all the subdirectories.
To get all the files:
for i in arr:
    for j in i[2]:
        print(i[0] + "/" + j)
By joining multiple solutions from here, this is what I ended up using:
import os
import glob
def list_dirs(path):
    return [os.path.basename(x) for x in filter(
        os.path.isdir, glob.glob(os.path.join(path, '*')))]
There are lots of nice answers here, but if you came looking for a simple way to get a list of all files or folders at once, you can take advantage of the find command that the OS offers on Linux and macOS, which is much faster than os.walk:
import os
all_files_list = os.popen("find path/to/my_base_folder -type f").read().splitlines()
all_sub_directories_list = os.popen("find path/to/my_base_folder -type d").read().splitlines()
OR
import os
def get_files(path):
    all_files_list = os.popen(f"find {path} -type f").read().splitlines()
    return all_files_list
def get_sub_folders(path):
    all_sub_directories_list = os.popen(f"find {path} -type d").read().splitlines()
    return all_sub_directories_list
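Because os.popen passes the string to a shell, a path containing spaces or quotes can break the command above. A possible variant, sketched here under the assumption that a POSIX find is on the PATH, uses subprocess.run with an argument list instead. The function name find_paths and its kind parameter are hypothetical.

```python
import subprocess


def find_paths(path, kind="f"):
    """Run `find path -type kind` and return the matching paths as a list."""
    # Passing a list avoids the shell, so spaces or quotes in `path`
    # are not interpreted. kind is "f" for files, "d" for directories.
    result = subprocess.run(
        ["find", path, "-type", kind],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

find_paths(path, "f") corresponds to get_files(path) above and find_paths(path, "d") to get_sub_folders(path). With check=True, a failing find raises subprocess.CalledProcessError instead of silently returning an empty list.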
This should work, as it also prints a directory tree:
import pathlib
def tree(directory):
    directory = pathlib.Path(directory)
    print(f'+ {directory}')
    subdirs = [p for p in directory.iterdir() if p.is_dir()]
    print(f"There are {len(subdirs)} folders in this directory;")
    for path in sorted(directory.rglob('*')):
        depth = len(path.relative_to(directory).parts)
        spacer = '    ' * depth
        print(f'{spacer}+ {path.name}')
This lists everything below a folder using the pathlib library. path.relative_to(directory).parts gives the path components relative to the given directory, so their count is the nesting depth.
