In Python, I want to list the files in the current directory ONLY. I do not want files listed from any subdirectory or parent.
There do seem to be similar solutions out there, but they don't seem to work for me. Here's my code snippet:
import os
for subdir, dirs, files in os.walk('./'):
    for file in files:
        # do some stuff
        print(file)
Let's suppose I have 2 files, holygrail.py and Tim inside my current directory. I have a folder as well and it contains two files - let's call them Arthur and Lancelot - inside it. When I run the script, this is what I get:
holygrail.py
Tim
Arthur
Lancelot
I am happy with holygrail.py and Tim. But the two files, Arthur and Lancelot, I do not want listed.
Just use os.listdir and os.path.isfile instead of os.walk.
Example:
import os
files = [f for f in os.listdir('.') if os.path.isfile(f)]
for f in files:
    pass  # do something
But be careful when applying this to another directory, like
files = [f for f in os.listdir(somedir) if os.path.isfile(f)]
which would not work, because f is a bare file name rather than a full path, so os.path.isfile looks for it relative to the current directory.
Therefore, to filter on another directory, use os.path.isfile(os.path.join(somedir, f)).
(Thanks Causality for the hint)
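To make the point about joining paths concrete, here is a minimal sketch. The helper name list_files and the folder name "knights" are made up for illustration; the file names come from the question.

```python
import os
import tempfile

def list_files(directory="."):
    """Return names of regular files directly inside `directory` (no recursion)."""
    return sorted(
        name for name in os.listdir(directory)
        # join with the directory so the check does not depend on the cwd
        if os.path.isfile(os.path.join(directory, name))
    )

# Demo in a throwaway directory that mirrors the question's layout
with tempfile.TemporaryDirectory() as tmp:
    for name in ("holygrail.py", "Tim"):
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "knights"))  # hypothetical folder name
    open(os.path.join(tmp, "knights", "Arthur"), "w").close()
    print(list_files(tmp))  # ['Tim', 'holygrail.py']
```

Because the check is done on the joined path, the same function works for '.' and for any other directory.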
You can use os.listdir for this purpose. If you only want files and not directories, you can filter the results using os.path.isfile.
Example:
files = os.listdir(os.curdir)  # files and directories
or
files = list(filter(os.path.isfile, os.listdir(os.curdir)))  # files only (filter returns an iterator in Python 3)
files = [f for f in os.listdir(os.curdir) if os.path.isfile(f)]  # list comprehension version
import os
destdir = '/var/tmp/testdir'
files = [ f for f in os.listdir(destdir) if os.path.isfile(os.path.join(destdir,f)) ]
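You can use os.scandir(), which was added to the standard library in Python 3.5.
import os
for entry in os.scandir('.'):
    if entry.is_file():
        print(entry.name)
This is usually faster than os.listdir() followed by os.path.isfile(), because each entry caches its file type. Since Python 3.5, os.walk() is itself implemented with os.scandir().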
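A slightly more reusable form of the same idea, as a sketch: the generator name scandir_files is an assumption, and using os.scandir as a context manager needs Python 3.6 or later.

```python
import os

def scandir_files(directory="."):
    """Yield names of regular files directly inside `directory`."""
    # The context manager closes the underlying directory handle promptly
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                yield entry.name

if __name__ == "__main__":
    for name in scandir_files("."):
        print(name)
```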
You can use the pathlib module.
from pathlib import Path
x = Path('./')
print(list(filter(lambda y:y.is_file(), x.iterdir())))
This can be done with os.walk().
Tested with Python 3.5.2:
import os
for root, dirs, files in os.walk('.', topdown=True):
    dirs.clear()  # with topdown=True, this prevents walk from descending into subdirectories
    for file in files:
        # do some stuff
        print(file)
Remove the dirs.clear() line and the files in subfolders are included again.
Update with references:
The os.walk documentation describes the (dirpath, dirnames, filenames) triple it yields and the effect of topdown.
list.clear() is documented as emptying a list in place.
So by clearing the relevant list from os.walk you can tailor its result to your needs.
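The pruning only works when the list is modified in place. A small sketch of the difference (the function name top_level_files is hypothetical):

```python
import os

def top_level_files(directory):
    """Collect file names from `directory` only, by pruning os.walk."""
    found = []
    for root, dirs, files in os.walk(directory, topdown=True):
        dirs[:] = []  # in place: os.walk sees the emptied list and stops descending
        # dirs = []   # rebinding the name would NOT prune; os.walk keeps its own list
        found.extend(files)
    return sorted(found)
```

dirs.clear(), del dirs[:] and dirs[:] = [] are equivalent here, since all three empty the same list object that os.walk consults.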
import os
for subdir, dirs, files in os.walk('./'):
    for file in files:
        # do some stuff
        print(file)
You can improve this code with del dirs[:], like the following.
import os
for subdir, dirs, files in os.walk('./'):
    del dirs[:]
    for file in files:
        # do some stuff
        print(file)
Or, even better, point os.walk at the current working directory.
import os
cwd = os.getcwd()
for subdir, dirs, files in os.walk(cwd, topdown=True):
    del dirs[:]  # remove the subdirectories
    for file in files:
        # do some stuff
        print(file)
Instead of os.walk, just use os.listdir.
To list files in a specific folder excluding files in its sub-folders with os.walk use:
_, _, file_list = next(os.walk(data_folder))
Following up on Pygirl's and Flimm's use of pathlib (a really helpful reference, by the way): their solution included the full path in the result, so here is a solution that outputs just the file names:
from pathlib import Path
p = Path(destination_dir) # destination_dir = './' in original post
files = [x.name for x in p.iterdir() if x.is_file()]
print(files)
Related
I am trying to zip each folder on its own in Python. However, the first folder is being zipped and includes all folders within it. Could someone please explain what is going on? Should I not be using shutil for this?
#%% Set path variable
import os
import shutil
path = r"G:\Folder"
os.chdir(path)
os.getcwd()
#%% Zip all folders
def retrieve_file_paths(dirName):
    # File paths variable
    filePaths = []
    # Read all directory, subdirectories and file lists
    for root, directories, files in os.walk(dirName):
        for filename in directories:
            # Create the full file path by using the os module
            filePath = os.path.join(root, filename)
            filePaths.append(filePath)
    # return all paths
    return filePaths
filepaths = retrieve_file_paths(path)
#%% Print folders and start zipping individually
for x in filepaths:
    print(x)
    shutil.make_archive(x, 'zip', path)
shutil.make_archive archives all files and subfolders beneath the root directory you give it, since this is what most people want. If you need more control over which files are included, you must use zipfile directly.
You can do this right within the walk loop (that is what it's for).
import os
import zipfile
dirName = r'C:\...'
# Read all directory, subdirectories and file lists
for root, directories, files in os.walk(dirName):
    # compresslevel requires Python 3.7+
    with zipfile.ZipFile(os.path.join(root, "thisdir.zip"), "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
        for name in files:
            if name == 'thisdir.zip':
                continue  # skip an archive left over from a previous run
            filePath = os.path.join(root, name)
            zf.write(filePath, arcname=name)
This will create a file "thisdir.zip" in every directory of the tree, containing only the files directly within that directory.
(edit: tested & corrected code example)
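If the goal is simply one archive per top-level folder and shutil is acceptable, the third argument of make_archive (root_dir) is what selects the folder whose contents go into the archive. In the question it was always the parent path, which is why each archive contained everything. A minimal sketch, with the function name zip_each_subfolder made up for illustration:

```python
import os
import shutil

def zip_each_subfolder(parent):
    """Create parent/NAME.zip for every immediate subfolder NAME of `parent`."""
    # Collect the folders first so new archives do not affect the listing
    folders = [entry.path for entry in os.scandir(parent) if entry.is_dir()]
    archives = []
    for folder in folders:
        # base_name is given without extension; root_dir is the folder being archived
        archives.append(shutil.make_archive(folder, "zip", root_dir=folder))
    return archives
```

Each archive still includes that folder's own subfolders, as make_archive always does; use zipfile as shown above if you need to exclude them.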
Following Torben's answer to my question, I modified the code to zip each directory recursively. I realised that I had not been specifying the subdirectories. Code below:
import os
import zipfile
# Set path variable
path = r"insert root directory here"
os.chdir(path)
# Walk the tree and write one archive per directory, placed next to that directory
def retrieve_file_paths(dirName):
    for root, dirs, files in os.walk(dirName):
        for d in dirs:
            dir_path = os.path.join(root, d)
            with zipfile.ZipFile(dir_path + '.zip', "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
                for f in os.listdir(dir_path):
                    file_path = os.path.join(dir_path, f)
                    if os.path.isfile(file_path):  # subdirectories get their own archive
                        zf.write(file_path, arcname=f)
retrieve_file_paths(path)
It's a relatively simple answer once you take a look at the docs.
You can see the following under shutil.make_archive:
Note: This function is not thread-safe.
How threading works in computing, at a high level:
A machine has cores, which process data (e.g. an AMD Ryzen 7 5800X with 8 cores).
Within a process, there are threads (e.g. 16 threads on the Ryzen 7 5800X).
However, in multiprocessing no data is shared between the processes.
In multithreading within one process, you can access data from the same variable.
Because this function is not thread-safe, you will share the variable "x" and access the same item, which means there can only be one output.
Have a look into multithreading and work with locks in order to serialize threads.
Cheers
I need to iterate through the subdirectories of a given directory and search for files. If I get a file I have to open it and change the content and replace it with my own lines.
I tried this:
import os
rootdir = 'C:/Users/sid/Desktop/test'
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        f = open(file, 'r')
        lines = f.readlines()
        f.close()
        f = open(file, 'w')
        for line in lines:
            newline = "No you are not"
            f.write(newline)
        f.close()
but I am getting an error. What am I doing wrong?
The actual walk through the directories works as you have coded it. If you replace the contents of the inner loop with a simple print statement you can see that each file is found:
import os
rootdir = 'C:/Users/sid/Desktop/test'
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        print(os.path.join(subdir, file))
If you still get errors when running the above, please provide the error message.
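The error in the question is most likely caused by open(file, ...): os.walk yields bare file names, so each one has to be joined with subdir before it can be opened. A minimal sketch of the corrected rewrite, assuming the files are text files; the function name overwrite_lines and the trailing "\n" on the replacement line are additions not present in the question.

```python
import os

def overwrite_lines(rootdir, newline="No you are not\n"):
    """Replace every line of every file under `rootdir` with `newline`."""
    for subdir, dirs, files in os.walk(rootdir):
        for name in files:
            path = os.path.join(subdir, name)  # os.walk yields bare names
            with open(path, "r") as f:
                line_count = len(f.readlines())
            with open(path, "w") as f:
                # one replacement line per original line
                f.write(newline * line_count)
```

Using with blocks also guarantees the files are closed if reading or writing fails.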
Another way of returning all files in subdirectories is to use the pathlib module, introduced in Python 3.4, which provides an object-oriented approach to handling filesystem paths (pathlib is also available on Python 2.7 via the pathlib2 module on PyPI):
from pathlib import Path
rootdir = Path('C:/Users/sid/Desktop/test')
# Return a list of regular files only, not directories
file_list = [f for f in rootdir.glob('**/*') if f.is_file()]
# For absolute paths instead of paths relative to the current dir
file_list = [f for f in rootdir.resolve().glob('**/*') if f.is_file()]
Since Python 3.5, the glob module also supports recursive file finding:
import os
from glob import iglob
rootdir_glob = 'C:/Users/sid/Desktop/test/**/*' # Note the added asterisks
# This will return absolute paths
file_list = [f for f in iglob(rootdir_glob, recursive=True) if os.path.isfile(f)]
The file_list from either of the above approaches can be iterated over without the need for a nested loop:
for f in file_list:
    print(f)  # Replace with desired operations
From Python 3.5 onward, you can use ** with glob.iglob(path + '/**', recursive=True), which seems the most Pythonic solution, i.e.:
import glob, os
for filename in glob.iglob('/pardadox-music/**', recursive=True):
    if os.path.isfile(filename):  # filter dirs
        print(filename)
Output:
/pardadox-music/modules/her1.mod
/pardadox-music/modules/her2.mod
...
Notes:
glob.iglob(pathname, recursive=False)
Return an iterator which yields the same values as glob() without actually storing them all simultaneously.
If recursive is True, the pattern '**' will match any files and zero or more directories and subdirectories.
If the directory contains files starting with . they won’t be matched by default. For example, consider a directory containing card.gif and .card.gif:
>>> import glob
>>> glob.glob('*.gif')
['card.gif']
>>> glob.glob('.c*')
['.card.gif']
You can also use pathlib's Path.rglob(pattern), which is the same as calling Path.glob() with **/ added in front of the given relative pattern.
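As a sketch of the rglob variant (the helper name files_recursive is hypothetical, and the "*.mod" pattern just echoes the sample output above):

```python
from pathlib import Path

def files_recursive(rootdir, pattern="*"):
    """Return all regular files below `rootdir` whose name matches `pattern`."""
    # rglob(pattern) behaves like glob("**/" + pattern)
    return sorted(p for p in Path(rootdir).rglob(pattern) if p.is_file())

if __name__ == "__main__":
    for path in files_recursive(".", "*.mod"):
        print(path)
```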
I am new to Python. I have successfully written a script to search for something within a file using open(r"C:\file.txt") and the re.search function, and it all works fine.
Is there a way to run the search on all files within a folder? Currently, I have to manually change the file name in my script: open(r"C:\file.txt"), open(r"C:\file1.txt"), open(r"C:\file2.txt"), etc.
Thanks.
You can use os.walk to check all the files, as follows:
import os
for root, _, files in os.walk(path):
    for filename in files:
        with open(os.path.join(root, filename), 'r') as f:
            pass  # your code goes here
Explanation:
os.walk yields a tuple of (root path, dir names, file names) for each folder in the tree, so you can iterate through the file names and open each file using os.path.join(root, filename), which joins the root path with the file name.
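Combined with the re.search call from the question, this could look like the following sketch. The function name search_files is made up, and errors="ignore" is an assumption so that files with undecodable bytes do not stop the search.

```python
import os
import re

def search_files(folder, pattern):
    """Return paths of files under `folder` whose text matches `pattern`."""
    regex = re.compile(pattern)
    matches = []
    for root, _, files in os.walk(folder):
        for filename in files:
            path = os.path.join(root, filename)
            # errors="ignore" drops undecodable bytes instead of raising
            with open(path, "r", errors="ignore") as f:
                if regex.search(f.read()):
                    matches.append(path)
    return matches
```

For very large files, searching line by line instead of calling f.read() would keep memory use down.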
Since you're a beginner, I'll give you a simple solution and walk through it.
Import the os module, and use the os.listdir function to create a list of everything in the directory. Then, iterate through the files using a for loop.
Example:
# Importing the os module
import os
# Give the directory you wish to iterate through
my_dir = r"C:\Users\bleh\Desktop\files"  # your directory (raw string so the backslashes are kept)
# Using os.listdir to create a list of everything in the directory
dir_list = os.listdir(my_dir)
# Use the for loop to iterate through the list you just created, and open the files
for f in dir_list:
    # f is a bare name; open it via os.path.join(my_dir, f)
    pass  # Whatever you want to do to all of the files
If you need help on the concepts, refer to the following:
for loops in Python 3: http://www.python-course.eu/python3_for_loop.php
os module documentation (this has some cool stuff in it): https://docs.python.org/2/library/os.html
Good luck!
You can use the os.listdir(path) function:
import os
path = '/Users/ricardomartinez/repos/Salary-API'
# List for all files in a given PATH
file_list = os.listdir(path)
# If you want to filter by file type
file_list = [file for file in os.listdir(path) if os.path.splitext(file)[1] == '.py']
# In both cases you can iterate over the list and apply the operations
# that you have
for file in file_list:
    print(file)
    # Operations that you want to do over files
Is there a way to list the files (not directories) in a directory with Python? I know I could use os.listdir and a loop of os.path.isfile()s, but if there's something simpler (like a function os.path.listfilesindir or something), it would probably be better.
This is a simple generator expression:
files = (file for file in os.listdir(path)
         if os.path.isfile(os.path.join(path, file)))
for file in files:  # You could shorten this to one line, but it runs on a bit.
    ...
Or you could make a generator function if it suited you better:
def files(path):
    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            yield file
Then simply:
for file in files(path):
    ...
files = next(os.walk('..'))[2]
Using pathlib on Windows as follows:
files = (x for x in Path("your_path") if x.is_file())
generates the error:
TypeError: 'WindowsPath' object is not iterable
You should use Path.iterdir() instead:
filePath = Path("your_path")
if filePath.is_dir():
    files = list(x for x in filePath.iterdir() if x.is_file())
Since Python 3.5 you can use glob with the recursive pattern "**". Note that glob will give you all files and directories, so you have to keep only the ones that are files:
import glob
import os
files = glob.glob(os.path.join(in_path, "**/*"), recursive=True)
files = [f for f in files if os.path.isfile(f)]
For the special case of working with files in the current directory, you could do it as a simple one-liner list comprehension:
[f for f in os.listdir(os.curdir) if os.path.isfile(f)]
Otherwise, in the more general case, directory paths and file names have to be joined:
dirpath = os.path.expanduser('~/path_to_dir_of_interest')  # os.listdir does not expand '~' itself
files = [f for f in os.listdir(dirpath) if os.path.isfile(os.path.join(dirpath, f))]
You could try pathlib, which has a lot of other useful stuff too.
Pathlib is an object-oriented library for interacting with filesystem paths. To get the files in the current directory, one can do:
from pathlib import Path
files = (x for x in Path(".").iterdir() if x.is_file())
for file in files:
    print(str(file), "is a file!")
This is, in my opinion, more Pythonic than using os.path.
See also: PEP 428.
Using pathlib, the shortest way to list only files is:
[x for x in Path("your_path").iterdir() if x.is_file()]
with depth support if need be.
If you use Python 3, you could use pathlib.
But you have to know that if you use the is_file() method as:
from pathlib import Path
# p is the directory path (a Path object)
# files is a list of Path objects
files = [x for x in p.iterdir() if x.is_file()]
only regular files (and symlinks pointing to them) are kept. Entries such as broken symlinks or special files are skipped, because is_file() returns False for them. Empty files are still listed.
To keep everything that is not a folder:
from pathlib import Path
# p is the directory path (a Path object)
# list all of the directory's contents
contents = list(p.glob("*"))
# if an element in contents isn't a folder, treat it as a file
# is_dir() also works for empty folders
files = [x for x in contents if not x.is_dir()]
I know how to delete single files, however I am lost in my implementation of how to delete all files in a directory of one type.
Say the directory is \myfolder
I want to delete all files that are .config files, but nothing to the other ones.
How would I do this?
Thanks Kindly
Use the glob module:
import os
from glob import glob
for f in glob('myfolder/*.config'):
    os.unlink(f)
I would do something like the following:
import os
folder = "myfolder"
for f in os.listdir(folder):
    path = os.path.join(folder, f)
    if not os.path.isdir(path) and f.endswith(".config"):
        os.remove(path)
It lists the entries in the directory and deletes each one that is not a directory and whose name ends in ".config". You'll either need to run it from the directory that contains myfolder, or give it the full path to the directory. If you need to do this recursively, I would use the os.walk function.
Here ya go:
import os
# Return all files in dir, and all its subdirectories, ending in pattern
def gen_files(dir, pattern):
    for dirname, subdirs, files in os.walk(dir):
        for f in files:
            if f.endswith(pattern):
                yield os.path.join(dirname, f)
# Remove all files in the current dir (and its subdirectories) matching *.config
for f in gen_files('.', '.config'):
    os.remove(f)
Note also that gen_files can easily be rewritten to accept a tuple of patterns, since str.endswith accepts a tuple.
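One possible shape for that multi-pattern variant, as a sketch: the parameter name patterns and the normalisation step are illustrative choices, and ".bak" in the usage comment is only an example suffix.

```python
import os

def gen_files(directory, patterns):
    """Yield paths under `directory` whose names end with any of `patterns`."""
    # str.endswith needs a str or a tuple, so normalise lists and single strings
    suffixes = (patterns,) if isinstance(patterns, str) else tuple(patterns)
    for dirname, subdirs, files in os.walk(directory):
        for f in files:
            if f.endswith(suffixes):
                yield os.path.join(dirname, f)

# Usage (destructive, so left commented out):
# for f in gen_files('.', ('.config', '.bak')):
#     os.remove(f)
```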