Make file, 'Currently being used' in python [duplicate] - python

What is the most elegant way to solve this:
open a file for reading, but only if it is not already opened for writing
open a file for writing, but only if it is not already opened for reading or writing
The built-in functions work like this
>>> path = r"c:\scr.txt"
>>> file1 = open(path, "w")
>>> print file1
<open file 'c:\scr.txt', mode 'w' at 0x019F88D8>
>>> file2 = open(path, "w")
>>> print file2
<open file 'c:\scr.txt', mode 'w' at 0x02332188>
>>> file1.write("111")
>>> file2.write("222")
>>> file1.close()
scr.txt now contains '111'.
>>> file2.close()
scr.txt was overwritten and now contains '222' (on Windows, Python 2.4).
The solution should work inside the same process (like in the example above) as well as when another process has opened the file.
It is preferred, if a crashing program will not keep the lock open.

I don't think there is a fully crossplatform way. On unix, the fcntl module will do this for you. However on windows (which I assume you are by the paths), you'll need to use the win32file module.
Fortunately, there is a portable implementation (portalocker) using the platform appropriate method at the python cookbook.
To use it, open the file, and then call:
portalocker.lock(file, flags)
where flags are portalocker.LOCK_EX for exclusive write access, or LOCK_SH for shared, read access.

The solution should work inside the same process (like in the example above) as well as when another process has opened the file.
If by 'another process' you mean 'whatever process' (i.e. not your program), in Linux there's no way to accomplish this relying only on system calls (fcntl & friends). What you want is mandatory locking, and the Linux way to obtain it is a bit more involved:
Remount the partition that contains your file with the mand option:
# mount -o remount,mand /dev/hdXY
Set the sgid flag for your file:
# chmod g-x,g+s yourfile
In your Python code, obtain an exclusive lock on that file:
fcntl.flock(fd, fcntl.LOCK_EX)
Now even cat will not be able to read the file until you release the lock.

EDIT: I solved it myself! By using directory existence & age as a locking mechanism! Locking by file is safe only on Windows (because Linux silently overwrites), but locking by directory works perfectly both on Linux and Windows. See my GIT where I created an easy to use class 'lockbydir.DLock' for that:
https://github.com/drandreaskrueger/lockbydir
At the bottom of the readme, you find 3 GITplayers where you can see the code examples execute live in your browser! Quite cool, isn't it? :-)
Thanks for your attention
This was my original question:
I would like to answer to parity3 (https://meta.stackoverflow.com/users/1454536/parity3) but I can neither comment directly ('You must have 50 reputation to comment'), nor do I see any way to contact him/her directly. What do you suggest to me, to get through to him?
My question:
I have implemented something similiar to what parity3 suggested here as an answer: https://stackoverflow.com/a/21444311/3693375 ("Assuming your Python interpreter, and the ...")
And it works brilliantly - on Windows. (I am using it to implement a locking mechanism that works across independently started processes. https://github.com/drandreaskrueger/lockbyfile )
But other than parity3 says, it does NOT work the same on Linux:
os.rename(src, dst)
Rename the file or directory src to dst. ... On Unix, if dst exists
and is a file,
it will be replaced silently if the user has permission.
The operation may fail on some Unix flavors if src and dst
are on different filesystems. If successful, the renaming will
be an atomic operation (this is a POSIX requirement).
On Windows, if dst already exists, OSError will be raised
(https://docs.python.org/2/library/os.html#os.rename)
The silent replacing is the problem. On Linux.
The "if dst already exists, OSError will be raised" is great for my purposes. But only on Windows, sadly.
I guess parity3's example still works most of the time, because of his if condition
if not os.path.exists(lock_filename):
try:
os.rename(tmp_filename,lock_filename)
But then the whole thing is not atomic anymore.
Because the if condition might be true in two parallel processes, and then both will rename, but only one will win the renaming race. And no exception raised (in Linux).
Any suggestions? Thanks!
P.S.: I know this is not the proper way, but I am lacking an alternative. PLEASE don't punish me with lowering my reputation. I looked around a lot, to solve this myself. How to PM users in here? And meh why can't I?

Here's a start on the win32 half of a portable implementation, that does not need a seperate locking mechanism.
Requires the Python for Windows Extensions to get down to the win32 api, but that's pretty much mandatory for python on windows already, and can alternatively be done with ctypes. The code could be adapted to expose more functionality if it's needed (such as allowing FILE_SHARE_READ rather than no sharing at all). See also the MSDN documentation for the CreateFile and WriteFile system calls, and the article on Creating and Opening Files.
As has been mentioned, you can use the standard fcntl module to implement the unix half of this, if required.
import winerror, pywintypes, win32file
class LockError(StandardError):
pass
class WriteLockedFile(object):
"""
Using win32 api to achieve something similar to file(path, 'wb')
Could be adapted to handle other modes as well.
"""
def __init__(self, path):
try:
self._handle = win32file.CreateFile(
path,
win32file.GENERIC_WRITE,
0,
None,
win32file.OPEN_ALWAYS,
win32file.FILE_ATTRIBUTE_NORMAL,
None)
except pywintypes.error, e:
if e[0] == winerror.ERROR_SHARING_VIOLATION:
raise LockError(e[2])
raise
def close(self):
self._handle.close()
def write(self, str):
win32file.WriteFile(self._handle, str)
Here's how your example from above behaves:
>>> path = "C:\\scr.txt"
>>> file1 = WriteLockedFile(path)
>>> file2 = WriteLockedFile(path) #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
LockError: ...
>>> file1.write("111")
>>> file1.close()
>>> print file(path).read()
111

I prefer to use filelock, a cross-platform Python library which barely requires any additional code. Here's an example of how to use it:
from filelock import FileLock
lockfile = r"c:\scr.txt"
lock = FileLock(lockfile + ".lock")
with lock:
file = open(path, "w")
file.write("111")
file.close()
Any code within the with lock: block is thread-safe, meaning that it will be finished before another process has access to the file.

Assuming your Python interpreter, and the underlying os and filesystem treat os.rename as an atomic operation and it will error when the destination exists, the following method is free of race conditions. I'm using this in production on a linux machine. Requires no third party libs and is not os dependent, and aside from an extra file create, the performance hit is acceptable for many use cases. You can easily apply python's function decorator pattern or a 'with_statement' contextmanager here to abstract out the mess.
You'll need to make sure that lock_filename does not exist before a new process/task begins.
import os,time
def get_tmp_file():
filename='tmp_%s_%s'%(os.getpid(),time.time())
open(filename).close()
return filename
def do_exclusive_work():
print 'exclusive work being done...'
num_tries=10
wait_time=10
lock_filename='filename.lock'
acquired=False
for try_num in xrange(num_tries):
tmp_filename=get_tmp_file()
if not os.path.exists(lock_filename):
try:
os.rename(tmp_filename,lock_filename)
acquired=True
except (OSError,ValueError,IOError), e:
pass
if acquired:
try:
do_exclusive_work()
finally:
os.remove(lock_filename)
break
os.remove(tmp_filename)
time.sleep(wait_time)
assert acquired, 'maximum tries reached, failed to acquire lock file'
EDIT
It has come to light that os.rename silently overwrites the destination on a non-windows OS. Thanks for pointing this out # akrueger!
Here is a workaround, gathered from here:
Instead of using os.rename you can use:
try:
if os.name != 'nt': # non-windows needs a create-exclusive operation
fd = os.open(lock_filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
os.close(fd)
# non-windows os.rename will overwrite lock_filename silently.
# We leave this call in here just so the tmp file is deleted but it could be refactored so the tmp file is never even generated for a non-windows OS
os.rename(tmp_filename,lock_filename)
acquired=True
except (OSError,ValueError,IOError), e:
if os.name != 'nt' and not 'File exists' in str(e): raise
# akrueger You're probably just fine with your directory based solution, just giving you an alternate method.

To make you safe when opening files within one application, you could try something like this:
import time
class ExclusiveFile(file):
openFiles = {}
fileLocks = []
class FileNotExclusiveException(Exception):
pass
def __init__(self, *args):
sMode = 'r'
sFileName = args[0]
try:
sMode = args[1]
except:
pass
while sFileName in ExclusiveFile.fileLocks:
time.sleep(1)
ExclusiveFile.fileLocks.append(sFileName)
if not sFileName in ExclusiveFile.openFiles.keys() or (ExclusiveFile.openFiles[sFileName] == 'r' and sMode == 'r'):
ExclusiveFile.openFiles[sFileName] = sMode
try:
file.__init__(self, sFileName, sMode)
finally:
ExclusiveFile.fileLocks.remove(sFileName)
else:
ExclusiveFile.fileLocks.remove(sFileName)
raise self.FileNotExclusiveException(sFileName)
def close(self):
del ExclusiveFile.openFiles[self.name]
file.close(self)
That way you subclass the file class. Now just do:
>>> f = ExclusiveFile('/tmp/a.txt', 'r')
>>> f
<open file '/tmp/a.txt', mode 'r' at 0xb7d7cc8c>
>>> f1 = ExclusiveFile('/tmp/a.txt', 'r')
>>> f1
<open file '/tmp/a.txt', mode 'r' at 0xb7d7c814>
>>> f2 = ExclusiveFile('/tmp/a.txt', 'w') # can't open it for writing now
exclfile.FileNotExclusiveException: /tmp/a.txt
If you open it first with 'w' mode, it won't allow anymore opens, even in read mode, just as you wanted...

Related

Python if, else and pass through list of files [duplicate]

How do I check whether a file exists or not, without using the try statement?
If the reason you're checking is so you can do something like if file_exists: open_it(), it's safer to use a try around the attempt to open it. Checking and then opening risks the file being deleted or moved or something between when you check and when you try to open it.
If you're not planning to open the file immediately, you can use os.path.isfile
Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.
import os.path
os.path.isfile(fname)
if you need to be sure it's a file.
Starting with Python 3.4, the pathlib module offers an object-oriented approach (backported to pathlib2 in Python 2.7):
from pathlib import Path
my_file = Path("/path/to/file")
if my_file.is_file():
# file exists
To check a directory, do:
if my_file.is_dir():
# directory exists
To check whether a Path object exists independently of whether is it a file or directory, use exists():
if my_file.exists():
# path exists
You can also use resolve(strict=True) in a try block:
try:
my_abs_path = my_file.resolve(strict=True)
except FileNotFoundError:
# doesn't exist
else:
# exists
Use os.path.exists to check both files and directories:
import os.path
os.path.exists(file_path)
Use os.path.isfile to check only files (note: follows symbolic links):
os.path.isfile(file_path)
Unlike isfile(), exists() will return True for directories. So depending on if you want only plain files or also directories, you'll use isfile() or exists(). Here is some simple REPL output:
>>> os.path.isfile("/etc/password.txt")
True
>>> os.path.isfile("/etc")
False
>>> os.path.isfile("/does/not/exist")
False
>>> os.path.exists("/etc/password.txt")
True
>>> os.path.exists("/etc")
True
>>> os.path.exists("/does/not/exist")
False
import os
if os.path.isfile(filepath):
print("File exists")
Use os.path.isfile() with os.access():
import os
PATH = './file.txt'
if os.path.isfile(PATH) and os.access(PATH, os.R_OK):
print("File exists and is readable")
else:
print("Either the file is missing or not readable")
import os
os.path.exists(path) # Returns whether the path (directory or file) exists or not
os.path.isfile(path) # Returns whether the file exists or not
Although almost every possible way has been listed in (at least one of) the existing answers (e.g. Python 3.4 specific stuff was added), I'll try to group everything together.
Note: every piece of Python standard library code that I'm going to post, belongs to version 3.5.3.
Problem statement:
Check file (arguable: also folder ("special" file) ?) existence
Don't use try / except / else / finally blocks
Possible solutions:
1. [Python.Docs]: os.path.exists(path)
Also check other function family members like os.path.isfile, os.path.isdir, os.path.lexists for slightly different behaviors:
Return True if path refers to an existing path or an open file descriptor. Returns False for broken symbolic links. On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.
All good, but if following the import tree:
os.path - posixpath.py (ntpath.py)
genericpath.py - line ~20+
def exists(path):
"""Test whether a path exists. Returns False for broken symbolic links"""
try:
st = os.stat(path)
except os.error:
return False
return True
it's just a try / except block around [Python.Docs]: os.stat(path, *, dir_fd=None, follow_symlinks=True). So, your code is try / except free, but lower in the framestack there's (at least) one such block. This also applies to other functions (including os.path.isfile).
1.1. [Python.Docs]: pathlib - Path.is_file()
It's a fancier (and more [Wiktionary]: Pythonic) way of handling paths, but
Under the hood, it does exactly the same thing (pathlib.py - line ~1330):
def is_file(self):
"""
Whether this path is a regular file (also True for symlinks pointing
to regular files).
"""
try:
return S_ISREG(self.stat().st_mode)
except OSError as e:
if e.errno not in (ENOENT, ENOTDIR):
raise
# Path doesn't exist or is a broken symlink
# (see https://bitbucket.org/pitrou/pathlib/issue/12/)
return False
2. [Python.Docs]: With Statement Context Managers
Either:
Create one:
class Swallow: # Dummy example
swallowed_exceptions = (FileNotFoundError,)
def __enter__(self):
print("Entering...")
def __exit__(self, exc_type, exc_value, exc_traceback):
print("Exiting:", exc_type, exc_value, exc_traceback)
# Only swallow FileNotFoundError (not e.g. TypeError - if the user passes a wrong argument like None or float or ...)
return exc_type in Swallow.swallowed_exceptions
And its usage - I'll replicate the os.path.isfile behavior (note that this is just for demonstrating purposes, do not attempt to write such code for production):
import os
import stat
def isfile_seaman(path): # Dummy func
result = False
with Swallow():
result = stat.S_ISREG(os.stat(path).st_mode)
return result
Use [Python.Docs]: contextlib.suppress(*exceptions) - which was specifically designed for selectively suppressing exceptions
But, they seem to be wrappers over try / except / else / finally blocks, as [Python.Docs]: Compound statements - The with statement states:
This allows common try...except...finally usage patterns to be encapsulated for convenient reuse.
3. Filesystem traversal functions
Search the results for matching item(s):
[Python.Docs]: os.listdir(path='.') (or [Python.Docs]: os.scandir(path='.') on Python v3.5+, backport: [PyPI]: scandir)
Under the hood, both use:
Nix: [Man7]: OPENDIR(3) / [Man7]: READDIR(3) / [Man7]: CLOSEDIR(3)
Win: [MS.Learn]: FindFirstFileW function (fileapi.h) / [MS.Learn]: FindNextFileW function (fileapi.h) / [MS.Learn]: FindClose function (fileapi.h)
via [GitHub]: python/cpython - (main) cpython/Modules/posixmodule.c
Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix, but only requires one for symbolic links on Windows.
[Python.Docs]: os.walk(top, topdown=True, onerror=None, followlinks=False)
Uses os.listdir (os.scandir when available)
[Python.Docs]: glob.iglob(pathname, *, root_dir=None, dir_fd=None, recursive=False, include_hidden=False) (or its predecessor: glob.glob)
Doesn't seem a traversing function per se (at least in some cases), but it still uses os.listdir
Since these iterate over folders, (in most of the cases) they are inefficient for our problem (there are exceptions, like non wildcarded globbing - as #ShadowRanger pointed out), so I'm not going to insist on them. Not to mention that in some cases, filename processing might be required.
4. [Python.Docs]: os.access(path, mode, *, dir_fd=None, effective_ids=False, follow_symlinks=True)
Its behavior is close to os.path.exists (actually it's wider, mainly because of the 2nd argument).
User permissions might restrict the file "visibility" as the doc states:
... test if the invoking user has the specified access to path. mode should be F_OK to test the existence of path...
Security considerations:
Using access() to check if a user is authorized to e.g. open a file before actually doing so using open() creates a security hole, because the user might exploit the short time interval between checking and opening the file to manipulate it.
os.access("/tmp", os.F_OK)
Since I also work in C, I use this method as well because under the hood, it calls native APIs (again, via "${PYTHON_SRC_DIR}/Modules/posixmodule.c"), but it also opens a gate for possible user errors, and it's not as Pythonic as other variants. So, don't use it unless you know what you're doing:
Nix: [Man7]: ACCESS(2)
Warning: Using these calls to check if a user is authorized to, for example, open a file before actually doing so using open(2) creates a security hole, because the user might exploit the short time interval between checking and opening the file to manipulate it. For this reason, the use of this system call should be avoided.
Win: [MS.Learn]: GetFileAttributesW function (fileapi.h)
As seen, this approach is highly discouraged (especially on Nix).
Note: calling native APIs is also possible via [Python.Docs]: ctypes - A foreign function library for Python, but in most cases it's more complicated. Before working with CTypes, check [SO]: C function called from Python via ctypes returns incorrect value (#CristiFati's answer) out.
(Win specific): since vcruntime###.dll (msvcr###.dll for older VStudio versions - I'm going to refer to it as UCRT) exports a [MS.Learn]: _access, _waccess function family as well, here's an example (note that the recommended [Python.Docs]: msvcrt - Useful routines from the MS VC++ runtime doesn't export them):
Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes as cts, os
>>> cts.CDLL("msvcrt")._waccess(u"C:\\Windows\\Temp", os.F_OK)
0
>>> cts.CDLL("msvcrt")._waccess(u"C:\\Windows\\Temp.notexist", os.F_OK)
-1
Notes:
Although it's not a good practice, I'm using os.F_OK in the call, but that's just for clarity (its value is 0)
I'm using _waccess so that the same code works on Python 3 and Python 2 (in spite of [Wikipedia]: Unicode related differences between them - [SO]: Passing utf-16 string to a Windows function (#CristiFati's answer))
Although this targets a very specific area, it was not mentioned in any of the previous answers
The Linux (Ubuntu ([Wikipedia]: Ubuntu version history) 16 x86_64 (pc064)) counterpart as well:
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes as cts, os
>>> cts.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp", os.F_OK)
0
>>> cts.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp.notexist", os.F_OK)
-1
Notes:
Instead hardcoding libc.so (LibC)'s path ("/lib/x86_64-linux-gnu/libc.so.6") which may (and most likely, will) vary across systems, None (or the empty string) can be passed to CDLL constructor (ctypes.CDLL(None).access(b"/tmp", os.F_OK)). According to [Man7]: DLOPEN(3):
If filename is NULL, then the returned handle is for the main
program. When given to dlsym(3), this handle causes a search for a
symbol in the main program, followed by all shared objects loaded at
program startup, and then all shared objects loaded by dlopen() with
the flag RTLD_GLOBAL.
Main (current) program (python) is linked against LibC, so its symbols (including access) will be loaded
This has to be handled with care, since functions like main, Py_Main and (all the) others are available; calling them could have disastrous effects (on the current program)
This doesn't also apply to Windows (but that's not such a big deal, since UCRT is located in "%SystemRoot%\System32" which is in %PATH% by default). I wanted to take things further and replicate this behavior on Windows (and submit a patch), but as it turns out, [MS.Learn]: GetProcAddress function (libloaderapi.h) only "sees" exported symbols, so unless someone declares the functions in the main executable as __declspec(dllexport) (why on Earth the common person would do that?), the main program is loadable, but it is pretty much unusable
5. 3rd-party modules with filesystem capabilities
Most likely, will rely on one of the ways above (maybe with slight customizations). One example would be (again, Win specific) [GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions, which is a Python wrapper over WinAPIs.
But, since this is more like a workaround, I'm stopping here.
6. SysAdmin approach
I consider this a (lame) workaround (gainarie): use Python as a wrapper to execute shell commands:
Win:
(py35x64_test) [cfati#CFATI-5510-0:e:\Work\Dev\StackOverflow\q000082831]> "e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\Temp\" > nul 2>&1'))"
0
(py35x64_test) [cfati#CFATI-5510-0:e:\Work\Dev\StackOverflow\q000082831]> "e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\Temp.notexist\" > nul 2>&1'))"
1
Nix ([Wikipedia]: Unix-like) - Ubuntu:
[cfati#cfati-5510-0:/mnt/e/Work/Dev/StackOverflow/q000082831]> python3 -c "import os; print(os.system('ls \"/tmp\" > /dev/null 2>&1'))"
0
[cfati#cfati-5510-0:/mnt/e/Work/Dev/StackOverflow/q000082831]> python3 -c "import os; print(os.system('ls \"/tmp.notexist\" > /dev/null 2>&1'))"
512
Bottom line:
Do use try / except / else / finally blocks, because they can prevent you running into a series of nasty problems
A possible counterexample that I can think of, is performance: such blocks are costly, so try not to place them in code that it's supposed to run hundreds of thousands times per second (but since (in most cases) it involves disk access, it won't be the case)
Python 3.4+ has an object-oriented path module: pathlib. Using this new module, you can check whether a file exists like this:
import pathlib
p = pathlib.Path('path/to/file')
if p.is_file(): # or p.is_dir() to see if it is a directory
# do stuff
You can (and usually should) still use a try/except block when opening files:
try:
with p.open() as f:
# do awesome stuff
except OSError:
print('Well darn.')
The pathlib module has lots of cool stuff in it: convenient globbing, checking file's owner, easier path joining, etc. It's worth checking out. If you're on an older Python (version 2.6 or later), you can still install pathlib with pip:
# installs pathlib2 on older Python versions
# the original third-party module, pathlib, is no longer maintained.
pip install pathlib2
Then import it as follows:
# Older Python versions
import pathlib2 as pathlib
This is the simplest way to check if a file exists. Just because the file existed when you checked doesn't guarantee that it will be there when you need to open it.
import os
fname = "foo.txt"
if os.path.isfile(fname):
print("file does exist at this time")
else:
print("no such file exists at this time")
How do I check whether a file exists, using Python, without using a try statement?
Now available since Python 3.4, import and instantiate a Path object with the file name, and check the is_file method (note that this returns True for symlinks pointing to regular files as well):
>>> from pathlib import Path
>>> Path('/').is_file()
False
>>> Path('/initrd.img').is_file()
True
>>> Path('/doesnotexist').is_file()
False
If you're on Python 2, you can backport the pathlib module from pypi, pathlib2, or otherwise check isfile from the os.path module:
>>> import os
>>> os.path.isfile('/')
False
>>> os.path.isfile('/initrd.img')
True
>>> os.path.isfile('/doesnotexist')
False
Now the above is probably the best pragmatic direct answer here, but there's the possibility of a race condition (depending on what you're trying to accomplish), and the fact that the underlying implementation uses a try, but Python uses try everywhere in its implementation.
Because Python uses try everywhere, there's really no reason to avoid an implementation that uses it.
But the rest of this answer attempts to consider these caveats.
Longer, much more pedantic answer
Available since Python 3.4, use the new Path object in pathlib. Note that .exists is not quite right, because directories are not files (except in the unix sense that everything is a file).
>>> from pathlib import Path
>>> root = Path('/')
>>> root.exists()
True
So we need to use is_file:
>>> root.is_file()
False
Here's the help on is_file:
is_file(self)
Whether this path is a regular file (also True for symlinks pointing
to regular files).
So let's get a file that we know is a file:
>>> import tempfile
>>> file = tempfile.NamedTemporaryFile()
>>> filepathobj = Path(file.name)
>>> filepathobj.is_file()
True
>>> filepathobj.exists()
True
By default, NamedTemporaryFile deletes the file when closed (and will automatically close when no more references exist to it).
>>> del file
>>> filepathobj.exists()
False
>>> filepathobj.is_file()
False
If you dig into the implementation, though, you'll see that is_file uses try:
def is_file(self):
"""
Whether this path is a regular file (also True for symlinks pointing
to regular files).
"""
try:
return S_ISREG(self.stat().st_mode)
except OSError as e:
if e.errno not in (ENOENT, ENOTDIR):
raise
# Path doesn't exist or is a broken symlink
# (see https://bitbucket.org/pitrou/pathlib/issue/12/)
return False
Race Conditions: Why we like try
We like try because it avoids race conditions. With try, you simply attempt to read your file, expecting it to be there, and if not, you catch the exception and perform whatever fallback behavior makes sense.
If you want to check that a file exists before you attempt to read it, and you might be deleting it and then you might be using multiple threads or processes, or another program knows about that file and could delete it - you risk the chance of a race condition if you check it exists, because you are then racing to open it before its condition (its existence) changes.
Race conditions are very hard to debug because there's a very small window in which they can cause your program to fail.
But if this is your motivation, you can get the value of a try statement by using the suppress context manager.
Avoiding race conditions without a try statement: suppress
Python 3.4 gives us the suppress context manager (previously the ignore context manager), which does semantically exactly the same thing in fewer lines, while also (at least superficially) meeting the original ask to avoid a try statement:
from contextlib import suppress
from pathlib import Path
Usage:
>>> with suppress(OSError), Path('doesnotexist').open() as f:
... for line in f:
... print(line)
...
>>>
>>> with suppress(OSError):
... Path('doesnotexist').unlink()
...
>>>
For earlier Pythons, you could roll your own suppress, but without a try will be more verbose than with. I do believe this actually is the only answer that doesn't use try at any level in the Python that can be applied to prior to Python 3.4 because it uses a context manager instead:
class suppress(object):
def __init__(self, *exceptions):
self.exceptions = exceptions
def __enter__(self):
return self
def __exit__(self, exc_type, exc_value, traceback):
if exc_type is not None:
return issubclass(exc_type, self.exceptions)
Perhaps easier with a try:
from contextlib import contextmanager
#contextmanager
def suppress(*exceptions):
try:
yield
except exceptions:
pass
Other options that don't meet the ask for "without try":
isfile
import os
os.path.isfile(path)
from the docs:
os.path.isfile(path)
Return True if path is an existing regular file. This follows symbolic
links, so both islink() and isfile() can be true for the same path.
But if you examine the source of this function, you'll see it actually does use a try statement:
# This follows symbolic links, so both islink() and isdir() can be true
# for the same path on systems that support symlinks
def isfile(path):
"""Test whether a path is a regular file"""
try:
st = os.stat(path)
except os.error:
return False
return stat.S_ISREG(st.st_mode)
>>> OSError is os.error
True
All it's doing is using the given path to see if it can get stats on it, catching OSError and then checking if it's a file if it didn't raise the exception.
If you intend to do something with the file, I would suggest directly attempting it with a try-except to avoid a race condition:
try:
with open(path) as f:
f.read()
except OSError:
pass
os.access
Available for Unix and Windows is os.access, but to use you must pass flags, and it does not differentiate between files and directories. This is more used to test if the real invoking user has access in an elevated privilege environment:
import os
os.access(path, os.F_OK)
It also suffers from the same race condition problems as isfile. From the docs:
Note:
Using access() to check if a user is authorized to e.g. open a file
before actually doing so using open() creates a security hole, because
the user might exploit the short time interval between checking and
opening the file to manipulate it. It’s preferable to use EAFP
techniques. For example:
if os.access("myfile", os.R_OK):
with open("myfile") as fp:
return fp.read()
return "some default data"
is better written as:
try:
fp = open("myfile")
except IOError as e:
if e.errno == errno.EACCES:
return "some default data"
# Not a permission error.
raise
else:
with fp:
return fp.read()
Avoid using os.access. It is a low level function that has more opportunities for user error than the higher level objects and functions discussed above.
Criticism of another answer:
Another answer says this about os.access:
Personally, I prefer this one because under the hood, it calls native APIs (via "${PYTHON_SRC_DIR}/Modules/posixmodule.c"), but it also opens a gate for possible user errors, and it's not as Pythonic as other variants:
This answer says it prefers a non-Pythonic, error-prone method, with no justification. It seems to encourage users to use low-level APIs without understanding them.
It also creates a context manager which, by unconditionally returning True, allows all Exceptions (including KeyboardInterrupt and SystemExit!) to pass silently, which is a good way to hide bugs.
This seems to encourage users to adopt poor practices.
Prefer the try statement. It's considered better style and avoids race conditions.
Don't take my word for it. There's plenty of support for this theory. Here's a couple:
Style: Section "Handling unusual conditions" of these course notes for Software Design (2007)
Avoiding Race Conditions
Use:
import os
#Your path here e.g. "C:\Program Files\text.txt"
#For access purposes: "C:\\Program Files\\text.txt"
if os.path.exists("C:\..."):
print "File found!"
else:
print "File not found!"
Importing os makes it easier to navigate and perform standard actions with your operating system.
For reference, also see How do I check whether a file exists without exceptions?.
If you need high-level operations, use shutil.
Testing for files and folders with os.path.isfile(), os.path.isdir() and os.path.exists()
Assuming that the "path" is a valid path, this table shows what is returned by each function for files and folders:
You can also test if a file is a certain type of file using os.path.splitext() to get the extension (if you don't already know it)
>>> import os
>>> path = "path to a word document"
>>> os.path.isfile(path)
True
>>> os.path.splitext(path)[1] == ".docx" # test if the extension is .docx
True
TL;DR
The answer is: use the pathlib module
Pathlib is probably the most modern and convenient way for almost all of the file operations. For the existence of a file or a folder a single line of code is enough. If file is not exists, it will not throw any exception.
from pathlib import Path
if Path("myfile.txt").exists(): # works for both file and folders
# do your cool stuff...
The pathlib module was introduced in Python 3.4, so you need to have Python 3.4+. This library makes your life much easier while working with files and folders, and it is pretty to use. Here is more documentation about it: pathlib — Object-oriented filesystem paths.
BTW, if you are going to reuse the path, then it is better to assign it to a variable.
So it will become:
from pathlib import Path
p = Path("loc/of/myfile.txt")
if p.exists(): # works for both file and folders
# do stuffs...
#reuse 'p' if needed.
In 2016 the best way is still using os.path.isfile:
>>> os.path.isfile('/path/to/some/file.txt')
Or in Python 3 you can use pathlib:
import pathlib
path = pathlib.Path('/path/to/some/file.txt')
if path.is_file():
...
It doesn't seem like there's a meaningful functional difference between try/except and isfile(), so you should use which one makes sense.
If you want to read a file, if it exists, do
try:
f = open(filepath)
except IOError:
print 'Oh dear.'
But if you just wanted to rename a file if it exists, and therefore don't need to open it, do
if os.path.isfile(filepath):
os.rename(filepath, filepath + '.old')
If you want to write to a file, if it doesn't exist, do
# Python 2
if not os.path.isfile(filepath):
f = open(filepath, 'w')
# Python 3: x opens for exclusive creation, failing if the file already exists
try:
f = open(filepath, 'wx')
except IOError:
print 'file already exists'
If you need file locking, that's a different matter.
You could try this (safer):
try:
# http://effbot.org/zone/python-with-statement.htm
# 'with' is safer to open a file
with open('whatever.txt') as fh:
# Do something with 'fh'
except IOError as e:
print("({})".format(e))
The ouput would be:
([Errno 2] No such file or directory:
'whatever.txt')
Then, depending on the result, your program can just keep running from there or you can code to stop it if you want.
Date: 2017-12-04
Every possible solution has been listed in other answers.
An intuitive and arguable way to check if a file exists is the following:
import os
os.path.isfile('~/file.md') # Returns True if exists, else False
# Additionally, check a directory
os.path.isdir('~/folder') # Returns True if the folder exists, else False
# Check either a directory or a file
os.path.exists('~/file')
I made an exhaustive cheat sheet for your reference:
# os.path methods in exhaustive cheat sheet
{'definition': ['dirname',
'basename',
'abspath',
'relpath',
'commonpath',
'normpath',
'realpath'],
'operation': ['split', 'splitdrive', 'splitext',
'join', 'normcase'],
'compare': ['samefile', 'sameopenfile', 'samestat'],
'condition': ['isdir',
'isfile',
'exists',
'lexists'
'islink',
'isabs',
'ismount',],
'expand': ['expanduser',
'expandvars'],
'stat': ['getatime', 'getctime', 'getmtime',
'getsize']}
Although I always recommend using try and except statements, here are a few possibilities for you (my personal favourite is using os.access):
Try opening the file:
Opening the file will always verify the existence of the file. You can make a function just like so:
def File_Existence(filepath):
f = open(filepath)
return True
If it's False, it will stop execution with an unhanded IOError
or OSError in later versions of Python. To catch the exception,
you have to use a try except clause. Of course, you can always
use a try except` statement like so (thanks to hsandt
for making me think):
def File_Existence(filepath):
try:
f = open(filepath)
except IOError, OSError: # Note OSError is for later versions of Python
return False
return True
Use os.path.exists(path):
This will check the existence of what you specify. However, it checks for files and directories so beware about how you use it.
import os.path
>>> os.path.exists("this/is/a/directory")
True
>>> os.path.exists("this/is/a/file.txt")
True
>>> os.path.exists("not/a/directory")
False
Use os.access(path, mode):
This will check whether you have access to the file. It will check for permissions. Based on the os.py documentation, typing in os.F_OK, it will check the existence of the path. However, using this will create a security hole, as someone can attack your file using the time between checking the permissions and opening the file. You should instead go directly to opening the file instead of checking its permissions. (EAFP vs LBYP). If you're not going to open the file afterwards, and only checking its existence, then you can use this.
Anyway, here:
>>> import os
>>> os.access("/is/a/file.txt", os.F_OK)
True
I should also mention that there are two ways that you will not be able to verify the existence of a file. Either the issue will be permission denied or no such file or directory. If you catch an IOError, set the IOError as e (like my first option), and then type in print(e.args) so that you can hopefully determine your issue. I hope it helps! :)
If the file is for opening you could use one of the following techniques:
with open('somefile', 'xt') as f: # Using the x-flag, Python 3.3 and above
f.write('Hello\n')
if not os.path.exists('somefile'):
with open('somefile', 'wt') as f:
f.write("Hello\n")
else:
print('File already exists!')
Note: This finds either a file or a directory with the given name.
Additionally, os.access():
if os.access("myfile", os.R_OK):
with open("myfile") as fp:
return fp.read()
Being R_OK, W_OK, and X_OK the flags to test for permissions (doc).
if os.path.isfile(path_to_file):
try:
open(path_to_file)
pass
except IOError as e:
print "Unable to open file"
Raising exceptions is considered to be an acceptable, and Pythonic,
approach for flow control in your program. Consider handling missing
files with IOErrors. In this situation, an IOError exception will be
raised if the file exists but the user does not have read permissions.
Source: Using Python: How To Check If A File Exists
If you imported NumPy already for other purposes then there is no need to import other libraries like pathlib, os, paths, etc.
import numpy as np
np.DataSource().exists("path/to/your/file")
This will return true or false based on its existence.
Check file or directory exists
You can follow these three ways:
1. Using isfile()
Note 1: The os.path.isfile used only for files
import os.path
os.path.isfile(filename) # True if file exists
os.path.isfile(dirname) # False if directory exists
2. Using exists
Note 2: The os.path.exists is used for both files and directories
import os.path
os.path.exists(filename) # True if file exists
os.path.exists(dirname) # True if directory exists
3. The pathlib.Path method (included in Python 3+, installable with pip for Python 2)
from pathlib import Path
Path(filename).exists()
You can write Brian's suggestion without the try:.
from contextlib import suppress
with suppress(IOError), open('filename'):
process()
suppress is part of Python 3.4. In older releases you can quickly write your own suppress:
from contextlib import contextmanager
#contextmanager
def suppress(*exceptions):
try:
yield
except exceptions:
pass
I'm the author of a package that's been around for about 10 years, and it has a function that addresses this question directly. Basically, if you are on a non-Windows system, it uses Popen to access find. However, if you are on Windows, it replicates find with an efficient filesystem walker.
The code itself does not use a try block… except in determining the operating system and thus steering you to the "Unix"-style find or the hand-buillt find. Timing tests showed that the try was faster in determining the OS, so I did use one there (but nowhere else).
>>> import pox
>>> pox.find('*python*', type='file', root=pox.homedir(), recurse=False)
['/Users/mmckerns/.python']
And the doc…
>>> print pox.find.__doc__
find(patterns[,root,recurse,type]); Get path to a file or directory
patterns: name or partial name string of items to search for
root: path string of top-level directory to search
recurse: if True, recurse down from root directory
type: item filter; one of {None, file, dir, link, socket, block, char}
verbose: if True, be a little verbose about the search
On some OS, recursion can be specified by recursion depth (an integer).
patterns can be specified with basic pattern matching. Additionally,
multiple patterns can be specified by splitting patterns with a ';'
For example:
>>> find('pox*', root='..')
['/Users/foo/pox/pox', '/Users/foo/pox/scripts/pox_launcher.py']
>>> find('*shutils*;*init*')
['/Users/foo/pox/pox/shutils.py', '/Users/foo/pox/pox/__init__.py']
>>>
The implementation, if you care to look, is here:
https://github.com/uqfoundation/pox/blob/89f90fb308f285ca7a62eabe2c38acb87e89dad9/pox/shutils.py#L190
Adding one more slight variation which isn't exactly reflected in the other answers.
This will handle the case of the file_path being None or empty string.
def file_exists(file_path):
if not file_path:
return False
elif not os.path.isfile(file_path):
return False
else:
return True
Adding a variant based on suggestion from Shahbaz
def file_exists(file_path):
if not file_path:
return False
else:
return os.path.isfile(file_path)
Adding a variant based on suggestion from Peter Wood
def file_exists(file_path):
return file_path and os.path.isfile(file_path):
Here's a one-line Python command for the Linux command line environment. I find this very handy since I'm not such a hot Bash guy.
python -c "import os.path; print os.path.isfile('/path_to/file.xxx')"
You can use the "OS" library of Python:
>>> import os
>>> os.path.exists("C:\\Users\\####\\Desktop\\test.txt")
True
>>> os.path.exists("C:\\Users\\####\\Desktop\\test.tx")
False
How do I check whether a file exists, without using the try statement?
In 2016, this is still arguably the easiest way to check if both a file exists and if it is a file:
import os
os.path.isfile('./file.txt') # Returns True if exists, else False
isfile is actually just a helper method that internally uses os.stat and stat.S_ISREG(mode) underneath. This os.stat is a lower-level method that will provide you with detailed information about files, directories, sockets, buffers, and more. More about os.stat here
Note: However, this approach will not lock the file in any way and therefore your code can become vulnerable to "time of check to time of use" (TOCTTOU) bugs.
So raising exceptions is considered to be an acceptable, and Pythonic, approach for flow control in your program. And one should consider handling missing files with IOErrors, rather than if statements (just an advice).

Python: Effective way to find out if another application has a file open

I would like to find via Python script if an application has opened a file. I know there is a lot of proposition out there, but none seems to be doing excactly what I want. I have tried a lot of solutions. But still can't get behavior expected. Want I want is to raise an exception when a python script tries to open a file that is open by pdf reader. For example, if I opened a file with a pdf reader, and try to open/rename it via a Python script, it should raise an exception. I have tried multiple pieces of code:
try:
myfile = open("myfile.csv", "r+") # or "a+", whatever you need
except IOError:
print "Could not open file! Please close Excel!"
with myfile:
do_stuff()
In this question, I tested many scripts and none worked perfectly. When I rename pdf file, even when it is open the rename is accepter and at the end.
This makes me wonder is this is impossible or I am doing wrong.
You're right, there aren't many (any?) elegant solutions to this. Here is a cross-platform method I've found to work (in most cases).
Test if the lock file exists.
On Windows:
You won't see this in Win Exploder, but if you run os.listdir() on the file's directory you should see a ~$myfile.xlsx lock file. The naming convention is standard (as far as I know) so you can test for the existence of this file; or better yet, do a regex search.
On Linux:
Similar to the Win solution, look for the lock or swap file, usually named something like ~lock.myfile.xlsx# or .myfile.txt.swp.
Sys Admins:
Please feel free to edit this post if you feel appropriate. These are just some simple findings I've stumbled across along the way.
Hope this helps ...
One approach is to call lsof or it's equivalents, depending on the operating system:
import subprocess
import sys
def is_file_open_any_process(path: str) -> bool:
if sys.platform in ["linux", "darwin"]:
fnull = open(os.devnull, "w") # to redirect output to devnull
retcode = subprocess.call(["lsof", path], stdout=fnull, stderr=subprocess.STDOUT)
if retcode == 0:
return True
return False
else:
raise NotImplementedError(f"not implemented for sys.platform: {sys.platform}")

Python if statement conditions [duplicate]

How do I check whether a file exists or not, without using the try statement?
If the reason you're checking is so you can do something like if file_exists: open_it(), it's safer to use a try around the attempt to open it. Checking and then opening risks the file being deleted or moved or something between when you check and when you try to open it.
If you're not planning to open the file immediately, you can use os.path.isfile
Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.
import os.path
os.path.isfile(fname)
if you need to be sure it's a file.
Starting with Python 3.4, the pathlib module offers an object-oriented approach (backported to pathlib2 in Python 2.7):
from pathlib import Path
my_file = Path("/path/to/file")
if my_file.is_file():
# file exists
To check a directory, do:
if my_file.is_dir():
# directory exists
To check whether a Path object exists independently of whether is it a file or directory, use exists():
if my_file.exists():
# path exists
You can also use resolve(strict=True) in a try block:
try:
my_abs_path = my_file.resolve(strict=True)
except FileNotFoundError:
# doesn't exist
else:
# exists
Use os.path.exists to check both files and directories:
import os.path
os.path.exists(file_path)
Use os.path.isfile to check only files (note: follows symbolic links):
os.path.isfile(file_path)
Unlike isfile(), exists() will return True for directories. So depending on if you want only plain files or also directories, you'll use isfile() or exists(). Here is some simple REPL output:
>>> os.path.isfile("/etc/password.txt")
True
>>> os.path.isfile("/etc")
False
>>> os.path.isfile("/does/not/exist")
False
>>> os.path.exists("/etc/password.txt")
True
>>> os.path.exists("/etc")
True
>>> os.path.exists("/does/not/exist")
False
import os
if os.path.isfile(filepath):
print("File exists")
Use os.path.isfile() with os.access():
import os
PATH = './file.txt'
if os.path.isfile(PATH) and os.access(PATH, os.R_OK):
print("File exists and is readable")
else:
print("Either the file is missing or not readable")
import os
os.path.exists(path) # Returns whether the path (directory or file) exists or not
os.path.isfile(path) # Returns whether the file exists or not
Although almost every possible way has been listed in (at least one of) the existing answers (e.g. Python 3.4 specific stuff was added), I'll try to group everything together.
Note: every piece of Python standard library code that I'm going to post, belongs to version 3.5.3.
Problem statement:
Check file (arguable: also folder ("special" file) ?) existence
Don't use try / except / else / finally blocks
Possible solutions:
1. [Python.Docs]: os.path.exists(path)
Also check other function family members like os.path.isfile, os.path.isdir, os.path.lexists for slightly different behaviors:
Return True if path refers to an existing path or an open file descriptor. Returns False for broken symbolic links. On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.
All good, but if following the import tree:
os.path - posixpath.py (ntpath.py)
genericpath.py - line ~20+
def exists(path):
"""Test whether a path exists. Returns False for broken symbolic links"""
try:
st = os.stat(path)
except os.error:
return False
return True
it's just a try / except block around [Python.Docs]: os.stat(path, *, dir_fd=None, follow_symlinks=True). So, your code is try / except free, but lower in the framestack there's (at least) one such block. This also applies to other functions (including os.path.isfile).
1.1. [Python.Docs]: pathlib - Path.is_file()
It's a fancier (and more [Wiktionary]: Pythonic) way of handling paths, but
Under the hood, it does exactly the same thing (pathlib.py - line ~1330):
def is_file(self):
"""
Whether this path is a regular file (also True for symlinks pointing
to regular files).
"""
try:
return S_ISREG(self.stat().st_mode)
except OSError as e:
if e.errno not in (ENOENT, ENOTDIR):
raise
# Path doesn't exist or is a broken symlink
# (see https://bitbucket.org/pitrou/pathlib/issue/12/)
return False
2. [Python.Docs]: With Statement Context Managers
Either:
Create one:
class Swallow: # Dummy example
swallowed_exceptions = (FileNotFoundError,)
def __enter__(self):
print("Entering...")
def __exit__(self, exc_type, exc_value, exc_traceback):
print("Exiting:", exc_type, exc_value, exc_traceback)
# Only swallow FileNotFoundError (not e.g. TypeError - if the user passes a wrong argument like None or float or ...)
return exc_type in Swallow.swallowed_exceptions
And its usage - I'll replicate the os.path.isfile behavior (note that this is just for demonstrating purposes, do not attempt to write such code for production):
import os
import stat
def isfile_seaman(path): # Dummy func
result = False
with Swallow():
result = stat.S_ISREG(os.stat(path).st_mode)
return result
Use [Python.Docs]: contextlib.suppress(*exceptions) - which was specifically designed for selectively suppressing exceptions
But, they seem to be wrappers over try / except / else / finally blocks, as [Python.Docs]: Compound statements - The with statement states:
This allows common try...except...finally usage patterns to be encapsulated for convenient reuse.
3. Filesystem traversal functions
Search the results for matching item(s):
[Python.Docs]: os.listdir(path='.') (or [Python.Docs]: os.scandir(path='.') on Python v3.5+, backport: [PyPI]: scandir)
Under the hood, both use:
Nix: [Man7]: OPENDIR(3) / [Man7]: READDIR(3) / [Man7]: CLOSEDIR(3)
Win: [MS.Learn]: FindFirstFileW function (fileapi.h) / [MS.Learn]: FindNextFileW function (fileapi.h) / [MS.Learn]: FindClose function (fileapi.h)
via [GitHub]: python/cpython - (main) cpython/Modules/posixmodule.c
Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix, but only requires one for symbolic links on Windows.
[Python.Docs]: os.walk(top, topdown=True, onerror=None, followlinks=False)
Uses os.listdir (os.scandir when available)
[Python.Docs]: glob.iglob(pathname, *, root_dir=None, dir_fd=None, recursive=False, include_hidden=False) (or its predecessor: glob.glob)
Doesn't seem a traversing function per se (at least in some cases), but it still uses os.listdir
Since these iterate over folders, (in most of the cases) they are inefficient for our problem (there are exceptions, like non wildcarded globbing - as #ShadowRanger pointed out), so I'm not going to insist on them. Not to mention that in some cases, filename processing might be required.
4. [Python.Docs]: os.access(path, mode, *, dir_fd=None, effective_ids=False, follow_symlinks=True)
Its behavior is close to os.path.exists (actually it's wider, mainly because of the 2nd argument).
User permissions might restrict the file "visibility" as the doc states:
... test if the invoking user has the specified access to path. mode should be F_OK to test the existence of path...
Security considerations:
Using access() to check if a user is authorized to e.g. open a file before actually doing so using open() creates a security hole, because the user might exploit the short time interval between checking and opening the file to manipulate it.
os.access("/tmp", os.F_OK)
Since I also work in C, I use this method as well because under the hood, it calls native APIs (again, via "${PYTHON_SRC_DIR}/Modules/posixmodule.c"), but it also opens a gate for possible user errors, and it's not as Pythonic as other variants. So, don't use it unless you know what you're doing:
Nix: [Man7]: ACCESS(2)
Warning: Using these calls to check if a user is authorized to, for example, open a file before actually doing so using open(2) creates a security hole, because the user might exploit the short time interval between checking and opening the file to manipulate it. For this reason, the use of this system call should be avoided.
Win: [MS.Learn]: GetFileAttributesW function (fileapi.h)
As seen, this approach is highly discouraged (especially on Nix).
Note: calling native APIs is also possible via [Python.Docs]: ctypes - A foreign function library for Python, but in most cases it's more complicated. Before working with CTypes, check [SO]: C function called from Python via ctypes returns incorrect value (#CristiFati's answer) out.
(Win specific): since vcruntime###.dll (msvcr###.dll for older VStudio versions - I'm going to refer to it as UCRT) exports a [MS.Learn]: _access, _waccess function family as well, here's an example (note that the recommended [Python.Docs]: msvcrt - Useful routines from the MS VC++ runtime doesn't export them):
Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes as cts, os
>>> cts.CDLL("msvcrt")._waccess(u"C:\\Windows\\Temp", os.F_OK)
0
>>> cts.CDLL("msvcrt")._waccess(u"C:\\Windows\\Temp.notexist", os.F_OK)
-1
Notes:
Although it's not a good practice, I'm using os.F_OK in the call, but that's just for clarity (its value is 0)
I'm using _waccess so that the same code works on Python 3 and Python 2 (in spite of [Wikipedia]: Unicode related differences between them - [SO]: Passing utf-16 string to a Windows function (#CristiFati's answer))
Although this targets a very specific area, it was not mentioned in any of the previous answers
The Linux (Ubuntu ([Wikipedia]: Ubuntu version history) 16 x86_64 (pc064)) counterpart as well:
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes as cts, os
>>> cts.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp", os.F_OK)
0
>>> cts.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp.notexist", os.F_OK)
-1
Notes:
Instead hardcoding libc.so (LibC)'s path ("/lib/x86_64-linux-gnu/libc.so.6") which may (and most likely, will) vary across systems, None (or the empty string) can be passed to CDLL constructor (ctypes.CDLL(None).access(b"/tmp", os.F_OK)). According to [Man7]: DLOPEN(3):
If filename is NULL, then the returned handle is for the main
program. When given to dlsym(3), this handle causes a search for a
symbol in the main program, followed by all shared objects loaded at
program startup, and then all shared objects loaded by dlopen() with
the flag RTLD_GLOBAL.
Main (current) program (python) is linked against LibC, so its symbols (including access) will be loaded
This has to be handled with care, since functions like main, Py_Main and (all the) others are available; calling them could have disastrous effects (on the current program)
This doesn't also apply to Windows (but that's not such a big deal, since UCRT is located in "%SystemRoot%\System32" which is in %PATH% by default). I wanted to take things further and replicate this behavior on Windows (and submit a patch), but as it turns out, [MS.Learn]: GetProcAddress function (libloaderapi.h) only "sees" exported symbols, so unless someone declares the functions in the main executable as __declspec(dllexport) (why on Earth the common person would do that?), the main program is loadable, but it is pretty much unusable
5. 3rd-party modules with filesystem capabilities
Most likely, will rely on one of the ways above (maybe with slight customizations). One example would be (again, Win specific) [GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions, which is a Python wrapper over WinAPIs.
But, since this is more like a workaround, I'm stopping here.
6. SysAdmin approach
I consider this a (lame) workaround (gainarie): use Python as a wrapper to execute shell commands:
Win:
(py35x64_test) [cfati#CFATI-5510-0:e:\Work\Dev\StackOverflow\q000082831]> "e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\Temp\" > nul 2>&1'))"
0
(py35x64_test) [cfati#CFATI-5510-0:e:\Work\Dev\StackOverflow\q000082831]> "e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\Temp.notexist\" > nul 2>&1'))"
1
Nix ([Wikipedia]: Unix-like) - Ubuntu:
[cfati#cfati-5510-0:/mnt/e/Work/Dev/StackOverflow/q000082831]> python3 -c "import os; print(os.system('ls \"/tmp\" > /dev/null 2>&1'))"
0
[cfati#cfati-5510-0:/mnt/e/Work/Dev/StackOverflow/q000082831]> python3 -c "import os; print(os.system('ls \"/tmp.notexist\" > /dev/null 2>&1'))"
512
Bottom line:
Do use try / except / else / finally blocks, because they can prevent you running into a series of nasty problems
A possible counterexample that I can think of, is performance: such blocks are costly, so try not to place them in code that it's supposed to run hundreds of thousands times per second (but since (in most cases) it involves disk access, it won't be the case)
Python 3.4+ has an object-oriented path module: pathlib. Using this new module, you can check whether a file exists like this:
import pathlib
p = pathlib.Path('path/to/file')
if p.is_file(): # or p.is_dir() to see if it is a directory
# do stuff
You can (and usually should) still use a try/except block when opening files:
try:
with p.open() as f:
# do awesome stuff
except OSError:
print('Well darn.')
The pathlib module has lots of cool stuff in it: convenient globbing, checking file's owner, easier path joining, etc. It's worth checking out. If you're on an older Python (version 2.6 or later), you can still install pathlib with pip:
# installs pathlib2 on older Python versions
# the original third-party module, pathlib, is no longer maintained.
pip install pathlib2
Then import it as follows:
# Older Python versions
import pathlib2 as pathlib
This is the simplest way to check if a file exists. Just because the file existed when you checked doesn't guarantee that it will be there when you need to open it.
import os
fname = "foo.txt"
if os.path.isfile(fname):
print("file does exist at this time")
else:
print("no such file exists at this time")
How do I check whether a file exists, using Python, without using a try statement?
Now available since Python 3.4, import and instantiate a Path object with the file name, and check the is_file method (note that this returns True for symlinks pointing to regular files as well):
>>> from pathlib import Path
>>> Path('/').is_file()
False
>>> Path('/initrd.img').is_file()
True
>>> Path('/doesnotexist').is_file()
False
If you're on Python 2, you can backport the pathlib module from pypi, pathlib2, or otherwise check isfile from the os.path module:
>>> import os
>>> os.path.isfile('/')
False
>>> os.path.isfile('/initrd.img')
True
>>> os.path.isfile('/doesnotexist')
False
Now the above is probably the best pragmatic direct answer here, but there's the possibility of a race condition (depending on what you're trying to accomplish), and the fact that the underlying implementation uses a try, but Python uses try everywhere in its implementation.
Because Python uses try everywhere, there's really no reason to avoid an implementation that uses it.
But the rest of this answer attempts to consider these caveats.
Longer, much more pedantic answer
Available since Python 3.4, use the new Path object in pathlib. Note that .exists is not quite right, because directories are not files (except in the unix sense that everything is a file).
>>> from pathlib import Path
>>> root = Path('/')
>>> root.exists()
True
So we need to use is_file:
>>> root.is_file()
False
Here's the help on is_file:
is_file(self)
Whether this path is a regular file (also True for symlinks pointing
to regular files).
So let's get a file that we know is a file:
>>> import tempfile
>>> file = tempfile.NamedTemporaryFile()
>>> filepathobj = Path(file.name)
>>> filepathobj.is_file()
True
>>> filepathobj.exists()
True
By default, NamedTemporaryFile deletes the file when closed (and will automatically close when no more references exist to it).
>>> del file
>>> filepathobj.exists()
False
>>> filepathobj.is_file()
False
If you dig into the implementation, though, you'll see that is_file uses try:
def is_file(self):
"""
Whether this path is a regular file (also True for symlinks pointing
to regular files).
"""
try:
return S_ISREG(self.stat().st_mode)
except OSError as e:
if e.errno not in (ENOENT, ENOTDIR):
raise
# Path doesn't exist or is a broken symlink
# (see https://bitbucket.org/pitrou/pathlib/issue/12/)
return False
Race Conditions: Why we like try
We like try because it avoids race conditions. With try, you simply attempt to read your file, expecting it to be there, and if not, you catch the exception and perform whatever fallback behavior makes sense.
If you want to check that a file exists before you attempt to read it, and you might be deleting it and then you might be using multiple threads or processes, or another program knows about that file and could delete it - you risk the chance of a race condition if you check it exists, because you are then racing to open it before its condition (its existence) changes.
Race conditions are very hard to debug because there's a very small window in which they can cause your program to fail.
But if this is your motivation, you can get the value of a try statement by using the suppress context manager.
Avoiding race conditions without a try statement: suppress
Python 3.4 gives us the suppress context manager (previously the ignore context manager), which does semantically exactly the same thing in fewer lines, while also (at least superficially) meeting the original ask to avoid a try statement:
from contextlib import suppress
from pathlib import Path
Usage:
>>> with suppress(OSError), Path('doesnotexist').open() as f:
... for line in f:
... print(line)
...
>>>
>>> with suppress(OSError):
... Path('doesnotexist').unlink()
...
>>>
For earlier Pythons, you could roll your own suppress, but without a try will be more verbose than with. I do believe this actually is the only answer that doesn't use try at any level in the Python that can be applied to prior to Python 3.4 because it uses a context manager instead:
class suppress(object):
def __init__(self, *exceptions):
self.exceptions = exceptions
def __enter__(self):
return self
def __exit__(self, exc_type, exc_value, traceback):
if exc_type is not None:
return issubclass(exc_type, self.exceptions)
Perhaps easier with a try:
from contextlib import contextmanager
#contextmanager
def suppress(*exceptions):
try:
yield
except exceptions:
pass
Other options that don't meet the ask for "without try":
isfile
import os
os.path.isfile(path)
from the docs:
os.path.isfile(path)
Return True if path is an existing regular file. This follows symbolic
links, so both islink() and isfile() can be true for the same path.
But if you examine the source of this function, you'll see it actually does use a try statement:
# This follows symbolic links, so both islink() and isdir() can be true
# for the same path on systems that support symlinks
def isfile(path):
"""Test whether a path is a regular file"""
try:
st = os.stat(path)
except os.error:
return False
return stat.S_ISREG(st.st_mode)
>>> OSError is os.error
True
All it's doing is using the given path to see if it can get stats on it, catching OSError and then checking if it's a file if it didn't raise the exception.
If you intend to do something with the file, I would suggest directly attempting it with a try-except to avoid a race condition:
try:
with open(path) as f:
f.read()
except OSError:
pass
os.access
Available for Unix and Windows is os.access, but to use you must pass flags, and it does not differentiate between files and directories. This is more used to test if the real invoking user has access in an elevated privilege environment:
import os
os.access(path, os.F_OK)
It also suffers from the same race condition problems as isfile. From the docs:
Note:
Using access() to check if a user is authorized to e.g. open a file
before actually doing so using open() creates a security hole, because
the user might exploit the short time interval between checking and
opening the file to manipulate it. It’s preferable to use EAFP
techniques. For example:
if os.access("myfile", os.R_OK):
with open("myfile") as fp:
return fp.read()
return "some default data"
is better written as:
try:
fp = open("myfile")
except IOError as e:
if e.errno == errno.EACCES:
return "some default data"
# Not a permission error.
raise
else:
with fp:
return fp.read()
Avoid using os.access. It is a low level function that has more opportunities for user error than the higher level objects and functions discussed above.
Criticism of another answer:
Another answer says this about os.access:
Personally, I prefer this one because under the hood, it calls native APIs (via "${PYTHON_SRC_DIR}/Modules/posixmodule.c"), but it also opens a gate for possible user errors, and it's not as Pythonic as other variants:
This answer says it prefers a non-Pythonic, error-prone method, with no justification. It seems to encourage users to use low-level APIs without understanding them.
It also creates a context manager which, by unconditionally returning True, allows all Exceptions (including KeyboardInterrupt and SystemExit!) to pass silently, which is a good way to hide bugs.
This seems to encourage users to adopt poor practices.
Prefer the try statement. It's considered better style and avoids race conditions.
Don't take my word for it. There's plenty of support for this theory. Here's a couple:
Style: Section "Handling unusual conditions" of these course notes for Software Design (2007)
Avoiding Race Conditions
Use:
import os
#Your path here e.g. "C:\Program Files\text.txt"
#For access purposes: "C:\\Program Files\\text.txt"
if os.path.exists("C:\..."):
print "File found!"
else:
print "File not found!"
Importing os makes it easier to navigate and perform standard actions with your operating system.
For reference, also see How do I check whether a file exists without exceptions?.
If you need high-level operations, use shutil.
Testing for files and folders with os.path.isfile(), os.path.isdir() and os.path.exists()
Assuming that the "path" is a valid path, this table shows what is returned by each function for files and folders:
You can also test if a file is a certain type of file using os.path.splitext() to get the extension (if you don't already know it)
>>> import os
>>> path = "path to a word document"
>>> os.path.isfile(path)
True
>>> os.path.splitext(path)[1] == ".docx" # test if the extension is .docx
True
TL;DR
The answer is: use the pathlib module
Pathlib is probably the most modern and convenient way for almost all of the file operations. For the existence of a file or a folder a single line of code is enough. If file is not exists, it will not throw any exception.
from pathlib import Path
if Path("myfile.txt").exists(): # works for both file and folders
# do your cool stuff...
The pathlib module was introduced in Python 3.4, so you need to have Python 3.4+. This library makes your life much easier while working with files and folders, and it is pretty to use. Here is more documentation about it: pathlib — Object-oriented filesystem paths.
BTW, if you are going to reuse the path, then it is better to assign it to a variable.
So it will become:
from pathlib import Path
p = Path("loc/of/myfile.txt")
if p.exists(): # works for both file and folders
# do stuffs...
#reuse 'p' if needed.
In 2016 the best way is still using os.path.isfile:
>>> os.path.isfile('/path/to/some/file.txt')
Or in Python 3 you can use pathlib:
import pathlib
path = pathlib.Path('/path/to/some/file.txt')
if path.is_file():
...
It doesn't seem like there's a meaningful functional difference between try/except and isfile(), so you should use which one makes sense.
If you want to read a file, if it exists, do
try:
f = open(filepath)
except IOError:
print 'Oh dear.'
But if you just wanted to rename a file if it exists, and therefore don't need to open it, do
if os.path.isfile(filepath):
os.rename(filepath, filepath + '.old')
If you want to write to a file, if it doesn't exist, do
# Python 2
if not os.path.isfile(filepath):
f = open(filepath, 'w')
# Python 3: x opens for exclusive creation, failing if the file already exists
try:
f = open(filepath, 'wx')
except IOError:
print 'file already exists'
If you need file locking, that's a different matter.
You could try this (safer):
try:
# http://effbot.org/zone/python-with-statement.htm
# 'with' is safer to open a file
with open('whatever.txt') as fh:
# Do something with 'fh'
except IOError as e:
print("({})".format(e))
The ouput would be:
([Errno 2] No such file or directory:
'whatever.txt')
Then, depending on the result, your program can just keep running from there or you can code to stop it if you want.
Date: 2017-12-04
Every possible solution has been listed in other answers.
An intuitive and arguable way to check if a file exists is the following:
import os
os.path.isfile('~/file.md') # Returns True if exists, else False
# Additionally, check a directory
os.path.isdir('~/folder') # Returns True if the folder exists, else False
# Check either a directory or a file
os.path.exists('~/file')
I made an exhaustive cheat sheet for your reference:
# os.path methods in exhaustive cheat sheet
{'definition': ['dirname',
'basename',
'abspath',
'relpath',
'commonpath',
'normpath',
'realpath'],
'operation': ['split', 'splitdrive', 'splitext',
'join', 'normcase'],
'compare': ['samefile', 'sameopenfile', 'samestat'],
'condition': ['isdir',
'isfile',
'exists',
'lexists'
'islink',
'isabs',
'ismount',],
'expand': ['expanduser',
'expandvars'],
'stat': ['getatime', 'getctime', 'getmtime',
'getsize']}
Although I always recommend using try and except statements, here are a few possibilities for you (my personal favourite is using os.access):
Try opening the file:
Opening the file will always verify the existence of the file. You can make a function just like so:
def File_Existence(filepath):
f = open(filepath)
return True
If it's False, it will stop execution with an unhanded IOError
or OSError in later versions of Python. To catch the exception,
you have to use a try except clause. Of course, you can always
use a try except` statement like so (thanks to hsandt
for making me think):
def File_Existence(filepath):
try:
f = open(filepath)
except IOError, OSError: # Note OSError is for later versions of Python
return False
return True
Use os.path.exists(path):
This will check the existence of what you specify. However, it checks for files and directories so beware about how you use it.
import os.path
>>> os.path.exists("this/is/a/directory")
True
>>> os.path.exists("this/is/a/file.txt")
True
>>> os.path.exists("not/a/directory")
False
Use os.access(path, mode):
This will check whether you have access to the file. It will check for permissions. Based on the os.py documentation, typing in os.F_OK, it will check the existence of the path. However, using this will create a security hole, as someone can attack your file using the time between checking the permissions and opening the file. You should instead go directly to opening the file instead of checking its permissions. (EAFP vs LBYP). If you're not going to open the file afterwards, and only checking its existence, then you can use this.
Anyway, here:
>>> import os
>>> os.access("/is/a/file.txt", os.F_OK)
True
I should also mention that there are two ways that you will not be able to verify the existence of a file. Either the issue will be permission denied or no such file or directory. If you catch an IOError, set the IOError as e (like my first option), and then type in print(e.args) so that you can hopefully determine your issue. I hope it helps! :)
If the file is for opening you could use one of the following techniques:
with open('somefile', 'xt') as f: # Using the x-flag, Python 3.3 and above
f.write('Hello\n')
if not os.path.exists('somefile'):
with open('somefile', 'wt') as f:
f.write("Hello\n")
else:
print('File already exists!')
Note: This finds either a file or a directory with the given name.
Additionally, os.access():
if os.access("myfile", os.R_OK):
with open("myfile") as fp:
return fp.read()
Being R_OK, W_OK, and X_OK the flags to test for permissions (doc).
if os.path.isfile(path_to_file):
try:
open(path_to_file)
pass
except IOError as e:
print "Unable to open file"
Raising exceptions is considered to be an acceptable, and Pythonic,
approach for flow control in your program. Consider handling missing
files with IOErrors. In this situation, an IOError exception will be
raised if the file exists but the user does not have read permissions.
Source: Using Python: How To Check If A File Exists
If you imported NumPy already for other purposes then there is no need to import other libraries like pathlib, os, paths, etc.
import numpy as np
np.DataSource().exists("path/to/your/file")
This will return true or false based on its existence.
Check file or directory exists
You can follow these three ways:
1. Using isfile()
Note 1: The os.path.isfile used only for files
import os.path
os.path.isfile(filename) # True if file exists
os.path.isfile(dirname) # False if directory exists
2. Using exists
Note 2: The os.path.exists is used for both files and directories
import os.path
os.path.exists(filename) # True if file exists
os.path.exists(dirname) # True if directory exists
3. The pathlib.Path method (included in Python 3+, installable with pip for Python 2)
from pathlib import Path
Path(filename).exists()
You can write Brian's suggestion without the try:.
from contextlib import suppress
with suppress(IOError), open('filename'):
process()
suppress is part of Python 3.4. In older releases you can quickly write your own suppress:
from contextlib import contextmanager
#contextmanager
def suppress(*exceptions):
try:
yield
except exceptions:
pass
I'm the author of a package that's been around for about 10 years, and it has a function that addresses this question directly. Basically, if you are on a non-Windows system, it uses Popen to access find. However, if you are on Windows, it replicates find with an efficient filesystem walker.
The code itself does not use a try block… except in determining the operating system and thus steering you to the "Unix"-style find or the hand-buillt find. Timing tests showed that the try was faster in determining the OS, so I did use one there (but nowhere else).
>>> import pox
>>> pox.find('*python*', type='file', root=pox.homedir(), recurse=False)
['/Users/mmckerns/.python']
And the doc…
>>> print pox.find.__doc__
find(patterns[,root,recurse,type]); Get path to a file or directory
patterns: name or partial name string of items to search for
root: path string of top-level directory to search
recurse: if True, recurse down from root directory
type: item filter; one of {None, file, dir, link, socket, block, char}
verbose: if True, be a little verbose about the search
On some OS, recursion can be specified by recursion depth (an integer).
patterns can be specified with basic pattern matching. Additionally,
multiple patterns can be specified by splitting patterns with a ';'
For example:
>>> find('pox*', root='..')
['/Users/foo/pox/pox', '/Users/foo/pox/scripts/pox_launcher.py']
>>> find('*shutils*;*init*')
['/Users/foo/pox/pox/shutils.py', '/Users/foo/pox/pox/__init__.py']
>>>
The implementation, if you care to look, is here:
https://github.com/uqfoundation/pox/blob/89f90fb308f285ca7a62eabe2c38acb87e89dad9/pox/shutils.py#L190
Adding one more slight variation which isn't exactly reflected in the other answers.
This will handle the case of the file_path being None or empty string.
def file_exists(file_path):
if not file_path:
return False
elif not os.path.isfile(file_path):
return False
else:
return True
Adding a variant based on suggestion from Shahbaz
def file_exists(file_path):
if not file_path:
return False
else:
return os.path.isfile(file_path)
Adding a variant based on suggestion from Peter Wood
def file_exists(file_path):
return file_path and os.path.isfile(file_path):
Here's a one-line Python command for the Linux command line environment. I find this very handy since I'm not such a hot Bash guy.
python -c "import os.path; print os.path.isfile('/path_to/file.xxx')"
You can use the "OS" library of Python:
>>> import os
>>> os.path.exists("C:\\Users\\####\\Desktop\\test.txt")
True
>>> os.path.exists("C:\\Users\\####\\Desktop\\test.tx")
False
How do I check whether a file exists, without using the try statement?
In 2016, this is still arguably the easiest way to check if both a file exists and if it is a file:
import os
os.path.isfile('./file.txt') # Returns True if exists, else False
isfile is actually just a helper method that internally uses os.stat and stat.S_ISREG(mode) underneath. This os.stat is a lower-level method that will provide you with detailed information about files, directories, sockets, buffers, and more. More about os.stat here
Note: However, this approach will not lock the file in any way and therefore your code can become vulnerable to "time of check to time of use" (TOCTTOU) bugs.
So raising exceptions is considered to be an acceptable, and Pythonic, approach for flow control in your program. And one should consider handling missing files with IOErrors, rather than if statements (just an advice).

running a script when some file is created

Is it possible to have a script run on a file when it's created if it has a specific extension?
let's call that extension "bar"
if I create the file "foo.bar", then my script will run with that file as an input.
Every time this file is saved, it would also run on the file.
Can I do that? If yes, how? If not, why?
note: If there is some technicality of why this is impossible, but I can do very close, that works too!
If you are using linux use pyinotify described on the website as follows: Pyinotify: monitor filesystem events with Python under Linux.
If you also want it to work using Mac OS X and Windows, you can have a look at this answer or this library.
You could do what Werkzeug does (this code copied directly from the link):
def _reloader_stat_loop(extra_files=None, interval=1):
"""When this function is run from the main thread, it will force other
threads to exit when any modules currently loaded change.
Copyright notice. This function is based on the autoreload.py from
the CherryPy trac which originated from WSGIKit which is now dead.
:param extra_files: a list of additional files it should watch.
"""
from itertools import chain
mtimes = {}
while 1:
for filename in chain(_iter_module_files(), extra_files or ()):
try:
mtime = os.stat(filename).st_mtime
except OSError:
continue
old_time = mtimes.get(filename)
if old_time is None:
mtimes[filename] = mtime
continue
elif mtime > old_time:
_log('info', ' * Detected change in %r, reloading' % filename)
sys.exit(3)
time.sleep(interval)
You'll have to spawn off a separate thread (which is what Werkzeug does), but that should work for you if you if you don't want to add pyinotify.

How to create a temporary file that can be read by a subprocess?

I'm writing a Python script that needs to write some data to a temporary file, then create a subprocess running a C++ program that will read the temporary file. I'm trying to use NamedTemporaryFile for this, but according to the docs,
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later).
And indeed, on Windows if I flush the temporary file after writing, but don't close it until I want it to go away, the subprocess isn't able to open it for reading.
I'm working around this by creating the file with delete=False, closing it before spawning the subprocess, and then manually deleting it once I'm done:
fileTemp = tempfile.NamedTemporaryFile(delete = False)
try:
fileTemp.write(someStuff)
fileTemp.close()
# ...run the subprocess and wait for it to complete...
finally:
os.remove(fileTemp.name)
This seems inelegant. Is there a better way to do this? Perhaps a way to open up the permissions on the temporary file so the subprocess can get at it?
Since nobody else appears to be interested in leaving this information out in the open...
tempfile does expose a function, mkdtemp(), which can trivialize this problem:
try:
temp_dir = mkdtemp()
temp_file = make_a_file_in_a_dir(temp_dir)
do_your_subprocess_stuff(temp_file)
remove_your_temp_file(temp_file)
finally:
os.rmdir(temp_dir)
I leave the implementation of the intermediate functions up to the reader, as one might wish to do things like use mkstemp() to tighten up the security of the temporary file itself, or overwrite the file in-place before removing it. I don't particularly know what security restrictions one might have that are not easily planned for by perusing the source of tempfile.
Anyway, yes, using NamedTemporaryFile on Windows might be inelegant, and my solution here might also be inelegant, but you've already decided that Windows support is more important than elegant code, so you might as well go ahead and do something readable.
According to Richard Oudkerk
(...) the only reason that trying to reopen a NamedTemporaryFile fails on
Windows is because when we reopen we need to use O_TEMPORARY.
and he gives an example of how to do this in Python 3.3+
import os, tempfile
DATA = b"hello bob"
def temp_opener(name, flag, mode=0o777):
return os.open(name, flag | os.O_TEMPORARY, mode)
with tempfile.NamedTemporaryFile() as f:
f.write(DATA)
f.flush()
with open(f.name, "rb", opener=temp_opener) as f:
assert f.read() == DATA
assert not os.path.exists(f.name)
Because there's no opener parameter in the built-in open() in Python 2.x, we have to combine lower level os.open() and os.fdopen() functions to achieve the same effect:
import subprocess
import tempfile
DATA = b"hello bob"
with tempfile.NamedTemporaryFile() as f:
f.write(DATA)
f.flush()
subprocess_code = \
"""import os
f = os.fdopen(os.open(r'{FILENAME}', os.O_RDWR | os.O_BINARY | os.O_TEMPORARY), 'rb')
assert f.read() == b'{DATA}'
""".replace('\n', ';').format(FILENAME=f.name, DATA=DATA)
subprocess.check_output(['python', '-c', subprocess_code]) == DATA
You can always go low-level, though am not sure if it's clean enough for you:
fd, filename = tempfile.mkstemp()
try:
os.write(fd, someStuff)
os.close(fd)
# ...run the subprocess and wait for it to complete...
finally:
os.remove(filename)
At least if you open a temporary file using existing Python libraries, accessing it from multiple processes is not possible in case of Windows. According to MSDN you can specify a 3rd parameter (dwSharedMode) shared mode flag FILE_SHARE_READ to CreateFile() function which:
Enables subsequent open operations on a file or device to request read
access. Otherwise, other processes cannot open the file or device if
they request read access. If this flag is not specified, but the file
or device has been opened for read access, the function fails.
So, you can write a Windows specific C routine to create a custom temporary file opener function, call it from Python and then you can make your sub-process access the file without any error. But I think you should stick with your existing approach as it is the most portable version and will work on any system and thus is the most elegant implementation.
Discussion on Linux and windows file locking can be found here.
EDIT: Turns out it is possible to open & read the temporary file from multiple processes in Windows too. See Piotr Dobrogost's answer.
Using mkstemp() instead with os.fdopen() in a with statement avoids having to call close():
fd, path = tempfile.mkstemp()
try:
with os.fdopen(fd, 'wb') as fileTemp:
fileTemp.write(someStuff)
# ...run the subprocess and wait for it to complete...
finally:
os.remove(path)
I know this is a really old post, but I think it's relevant today given that the API is changing and functions like mktemp and mkstemp are being replaced by functions like TemporaryFile() and TemporaryDirectory(). I just wanted to demonstrate in the following sample how to make sure that a temp directory is still available downstream:
Instead of coding:
tmpdirname = tempfile.TemporaryDirectory()
and using tmpdirname throughout your code, you should trying to use your code in a with statement block to insure that it is available for your code calls... like this:
with tempfile.TemporaryDirectory() as tmpdirname:
[do dependent code nested so it's part of the with statement]
If you reference it outside of the with then it's likely that it won't be visible anymore.

Categories