Understanding the Python with statement and context managers - python

I am trying to understand the with statement. I understand that it is supposed to replace the try/except block.
Now suppose I do something like this:
try:
    name = "rubicon" / 2  # to raise an exception
except Exception as e:
    print("No, not possible.")
finally:
    print("OK, I caught you.")
How do I replace this with a context manager?

with doesn't really replace try/except, but, rather, try/finally. Still, you can make a context manager do something different in exception cases from non-exception ones:
class Mgr(object):
    def __enter__(self): pass
    def __exit__(self, ext, exv, trb):
        # ext, exv, trb are the exception type, value and traceback (all None if no exception)
        if ext is not None: print("no not possible")
        print("OK I caught you")
        return True
with Mgr():
    name = 'rubicon' / 2  # to raise an exception
The return True part is where the context manager decides to suppress the exception (as you do by not re-raising it in your except clause).
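To see the effect of the return value in isolation, here is a minimal sketch. The class name Suppress and its swallow flag are made up for illustration; only the __enter__/__exit__ protocol itself comes from Python.

```python
class Suppress:
    """Hypothetical manager: swallows exceptions only when `swallow` is true."""
    def __init__(self, swallow):
        self.swallow = swallow

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # A true return value tells Python to suppress the exception;
        # a false one (including None) lets it propagate.
        return self.swallow


with Suppress(True):
    1 / 0  # suppressed: execution continues after the with block
reached_after_block = True

try:
    with Suppress(False):
        1 / 0  # not suppressed: the ZeroDivisionError escapes the with block
except ZeroDivisionError:
    propagated = True
else:
    propagated = False
```

With swallow=False the with statement behaves like try/finally: the cleanup still runs, but the caller sees the exception.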

The contextlib.contextmanager function decorator provides a handy way of creating a context manager without writing a full-fledged class of your own with __enter__ and __exit__ methods. You don't have to remember the arguments to the __exit__ method, or that __exit__ must return True in order to suppress the exception. Instead, you write a function with a single yield at the point you want the with block to run, and you trap any exceptions (which effectively come from the yield) as you normally would.
from contextlib import contextmanager
@contextmanager
def handler():
    # Put here what would ordinarily go in the `__enter__` method
    # In this case, there's nothing to do
    try:
        yield  # You can yield a value if you want; it gets picked up by the 'as'
    except Exception as e:
        print("no not possible")
    finally:
        print("Ok I caught you")
with handler():
    name = 'rubicon' / 2  # to raise an exception
Why go to the extra trouble of writing a context manager? Code re-use. You can use the same context manager in multiple places, without having to duplicate the exception handling. If the exception handling is unique to that situation, then don't bother with a context manager. But if the same pattern crops up again and again (or if it might for your users, e.g., closing a file, unlocking a mutex), it's worth the extra trouble. It's also a neat pattern to use if the exception handling is a bit complicated, as it separates the exception handling from the main line of code flow.

The with in Python is intended for wrapping a set of statements where you should set up and destroy or close resources. It is in a way similar to try...finally in that regard as the finally clause will be executed even after an exception.
A context manager is an object that implements two methods: __enter__ and __exit__. Those are called immediately before and after (respectively) the with block.
For instance, take a look at the classic open() example:
with open('temp.txt', 'w') as f:
    f.write("Hi!")
open() returns a file object that implements __enter__ more or less like return self and __exit__ like self.close().
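As a rough illustration of that behaviour, the sketch below uses a made-up TinyFile class that only tracks whether it is open or closed. It is not how the real file object is implemented, just the same shape.

```python
class TinyFile:
    """Hypothetical stand-in for a file object, tracking only open/closed state."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self   # the object bound by `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()  # runs on normal exit and on exceptions
        return False  # do not suppress exceptions


with TinyFile() as f:
    inside_closed = f.closed   # still open inside the block
after_closed = f.closed        # closed once the block has been left
```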

The Components of Context Manager
You should implement an __enter__ method that returns an object
Implement a __exit__ method.
Example
Here is a simple example to show why we need a context manager. During the winter in Xinjiang, China, you should close a door immediately after you open it. If you forget to close it, you will get cold.
class Door:
    def __init__(self):
        self.doorstatus = 'the door was closed when you are not at home'
        print(self.doorstatus)
    def __enter__(self):
        print('I have opened the door')
        return self
    def __exit__(self, *args):
        print('pong!the door has closed')
    def fetchsomethings(self):
        print('I have fetched somethings')
When fetching things at home, you should open the door, fetch the things, and close the door.
with Door() as dr:
    dr.fetchsomethings()
The output is:
the door was closed when you are not at home
I have opened the door
I have fetched somethings
pong!the door has closed
Explanation
When you instantiate the Door class, its __init__ method is called and prints "the door was closed when you are not at home". The with statement then calls the __enter__ method, which prints "I have opened the door" and returns the Door instance, bound to dr. Calling dr.fetchsomethings() inside the with block prints "I have fetched somethings". When the block finishes, the context manager invokes the __exit__ method, which prints "pong!the door has closed". If you do not use the with keyword, __enter__ and __exit__ will not be invoked.

with statements and context managers exist to help manage resources (although they can be used for much more).
Let's say you opened a file for writing:
f = open(path, "w")
You now have an open file handle. While your program holds the handle, other programs may be unable to write to the file (this depends on the operating system). To release it, you must close the file handle:
f.close()
But suppose an error occurred before the file was closed:
f = open(path, "w")
data = 3/0 # Tried dividing by zero. Raised ZeroDivisionError
f.write(data)
f.close()
What will happen now is that the function or entire program will exit, while leaving your file with an open handle. (CPython cleans handles on termination and handles are freed together with a program but you shouldn't count on that)
A with statement ensures that as soon as you leave its indented block, the file handle is closed:
with open(path, "w") as f:
    data = 3/0  # Tried dividing by zero. Raised ZeroDivisionError
    f.write(data)
# In here the file is already closed automatically, no matter what happened.
with statements may be used for many more things. For example: threading.Lock()
import threading
lock = threading.Lock()
with lock:  # Lock is acquired
    ...  # do stuff
# Lock is automatically released.
Almost everything done with a context manager can be done with try: ... finally: ... but context managers are nicer to use, more comfortable, more readable and by implementing __enter__ and __exit__ provide an easy to use interface.
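To make the comparison concrete, here is a small sketch that does the same thing twice with a threading.Lock, once with a with statement and once with try/finally. The events list is only there to record the lock state for illustration.

```python
import threading

lock = threading.Lock()
events = []

# Version 1: the with statement acquires on entry and releases on exit.
with lock:
    events.append(lock.locked())

# Version 2: the hand-written equivalent with try/finally.
lock.acquire()
try:
    events.append(lock.locked())
finally:
    lock.release()

# After both versions the lock is free again.
events.append(lock.locked())
```

Both versions release the lock even if the body raises; the with form simply cannot forget the release.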
Creating context managers is done by implementing __enter__() and __exit__() in a normal class.
__enter__() says what to do when a context manager starts and __exit__() what to do when it exits (the exception is passed to the __exit__() method if one occurred).
A shortcut for creating context managers can be found in contextlib. It wraps a generator as a context manager.
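A minimal sketch of that shortcut, assuming a made-up helper named working_directory that temporarily switches the current directory. The code before the yield plays the role of __enter__, and the finally block plays the role of __exit__.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Hypothetical helper: temporarily switch the current directory."""
    previous = os.getcwd()
    os.chdir(path)          # setup: runs on entering the with block
    try:
        yield path          # value bound by `as`
    finally:
        os.chdir(previous)  # teardown: runs even if the block raises


start = os.getcwd()
target = tempfile.gettempdir()
with working_directory(target):
    inside = os.getcwd()
restored = os.getcwd()
```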

Managing Resources: In any programming language, the usage of resources like file operations or database connections is very common. But these resources are limited in supply. Therefore, the main problem lies in making sure to release these resources after usage. If they are not released, it will lead to resource leakage and may cause the system to either slow down or crash. It would be very helpful if users had a mechanism for the automatic setup and teardown of resources. In Python, this can be achieved with context managers, which facilitate the proper handling of resources. The most common way of performing file operations is by using the with keyword, as shown below:
# Python program showing a use of with keyword
with open("test.txt") as f:
    data = f.read()
When a file is opened, a file descriptor is consumed which is a limited resource. Only a certain number of files can be opened by a process at a time. The following program demonstrates it.
file_descriptors = []
for x in range(100000):
    file_descriptors.append(open('test.txt', 'w'))
This leads to the error: OSError: [Errno 24] Too many open files: 'test.txt'
Python provides an easy way to manage resources: context managers. The with keyword is used. When the expression after it is evaluated, it should result in an object that performs context management. Context managers can be written using classes or functions (with decorators).
Creating a Context Manager: When creating context managers using classes, the user needs to ensure that the class has the methods __enter__() and __exit__(). The __enter__() method returns the resource that needs to be managed, and the __exit__() method does not return anything but performs the cleanup operations. First, let us create a simple class called ContextManager to understand the basic structure of creating context managers using classes, as shown below:
# Python program creating a context manager
class ContextManager():
    def __init__(self):
        print('init method called')
    def __enter__(self):
        print('enter method called')
        return self
    def __exit__(self, exc_type, exc_value, exc_traceback):
        print('exit method called')
with ContextManager() as manager:
    print('with statement block')
Output:
init method called
enter method called
with statement block
exit method called
In this case, a ContextManager object is created. The value returned by its __enter__() method (here, the object itself) is assigned to the variable after the as keyword, i.e. manager. On running the above program, the following are executed in sequence:
__init__()
__enter__()
statement body (code inside the with block)
__exit__() [the parameters in this method are used to manage exceptions]
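To show what those __exit__() parameters actually receive, here is a small sketch with a made-up Recorder class that stores them. It is run once without an exception and once with one.

```python
class Recorder:
    """Hypothetical manager that records what __exit__ receives."""
    def __init__(self):
        self.seen = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        self.seen = (exc_type, exc_value)
        return False  # let any exception propagate


ok = Recorder()
with ok:
    pass  # no exception: __exit__ receives None for every parameter

failed = Recorder()
try:
    with failed:
        raise ValueError("boom")  # __exit__ receives the type and the instance
except ValueError:
    pass
```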

Related

Should a function that returns a context-managed object be decorated by `@contextmanager`?

Say I have a function that returns a context-managed object, here a tempfile.NamedTemporaryFile:
import tempfile
def make_temp_file():
    """Create a temporary file. Best used with a context manager"""
    tmpfile = tempfile.NamedTemporaryFile()
    return tmpfile
Is this safe as is or should this be wrapped with a context manager?
import tempfile
from contextlib import contextmanager
@contextmanager
def make_temp_file():
    """Create a temporary file. Best used with a context manager"""
    tmpfile = tempfile.NamedTemporaryFile()
    return tmpfile
My confusion comes from the linter pylint, which still insists that the first example triggers the consider-using-with rule.
Your function already returns a context object (something with __enter__ and __exit__ methods) and does not need to be rewrapped. That wouldn't change anything.
The idea of with is that the context manager's __exit__ will be called when the with suite finishes, even if there is an error. But algorithms don't always fit conveniently into suites as your code indicates. If you can't use a context manager, you need some other mechanism to ensure that something closes the object.
As an example, suppose your function performed some other task before return. It could hit an exception and terminate before anybody has a chance to close the object. In that case you would do
def make_temp_file():
    tmpfile = tempfile.NamedTemporaryFile()
    try:
        do_other_things()
    except:
        # Close the file before passing the error on, since no caller will get the chance
        tmpfile.close()
        raise
    return tmpfile
If your function doesn't do anything after creating the object, you can skip the intermediate variable and likely get rid of the lint warning while you are at it. From the pylint doc the warning is suppressed when the call result is returned from the enclosing function.
def make_temp_file():
    return tempfile.NamedTemporaryFile()
Note that since you return a context object that needs to be closed, the same issues apply to the thing that calls the function. Either that should be a with or should have some other mechanism to make sure the object is closed when done.
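On the calling side, that could look like the following sketch. The variable names are illustrative, and the existence checks assume the default behaviour of NamedTemporaryFile, which deletes the file when it is closed.

```python
import os
import tempfile


def make_temp_file():
    """Create a temporary file. Best used with a context manager."""
    return tempfile.NamedTemporaryFile()


# The caller owns the returned object, so the caller wraps it in `with`.
with make_temp_file() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    path = tmp.name
    existed_inside = os.path.exists(path)

# Leaving the block closed the file, which also removed it from disk.
exists_after = os.path.exists(path)
```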

How to make a function act as a generator only when used as one

One existing example of this is open which can be used in these two ways:
f = open("File")
print(f.readline())
f.close()
# ...and...
with open("File") as f:
    print(f.readline())
I intend to create a version of the asyncio.Lock class which allows you to not only acquire and release the lock manually but also to use a with block to wrap the code that requires the lock and release it automatically.
What you are looking for isn't a generator but a context manager.
You don't even need to implement one. This works:
import asyncio
lock = asyncio.Lock()
async def example():
    async with lock:
        ...  # Your code here
For other people getting here: although the OP wanted something that already works out of the box:
Whenever one sees a "function" that can work as a generator or a context manager, as is the case here, or be used "stand-alone", it is because it is not a plain "function": it is actually a class (or, as with open, a function that returns an object). What you do when calling open or asyncio.Lock is create an object, which internally has several methods: not only .read or .acquire, which have to be called explicitly, but also specially named methods that allow Python to call them transparently when the object is used in certain language constructs.
For example, if the class implements the __iter__ method, it can automatically be used in for statements. To be used with a with statement, it has to implement both the __enter__ and __exit__ methods.
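As a minimal sketch of such a class, the made-up Resource below can be driven by hand through acquire() and release(), or automatically through a with statement, because __enter__ and __exit__ simply delegate to the same two methods.

```python
class Resource:
    """Hypothetical object usable both manually and in a with statement."""
    def __init__(self):
        self.held = False

    def acquire(self):
        self.held = True

    def release(self):
        self.held = False

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()
        return False  # do not suppress exceptions


# Manual use: the caller is responsible for calling release().
r = Resource()
r.acquire()
manual_state = r.held
r.release()

# with-block use: Python calls __enter__ and __exit__ for us.
with Resource() as r2:
    with_state = r2.held
after_state = r2.held
```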

Strange behavior with contextmanager

Take a look at the following example:
from contextlib import AbstractContextManager, contextmanager
class MyClass(AbstractContextManager):
    _values = {}
    @contextmanager
    def using(self, name, value):
        print(f'Allocating {name} = {value}')
        self._values[name] = value
        try:
            yield
        finally:
            print(f'Releasing {name}')
            del self._values[name]
    def __enter__(self):
        return self.using('FOO', 42).__enter__()
    def __exit__(self, exc_type, exc_val, exc_tb):
        pass
with MyClass():
    print('Doing work...')
I would expect the above code to print the following:
Allocating FOO = 42
Doing work...
Releasing FOO
Instead, this is what is being printed:
Allocating FOO = 42
Releasing FOO
Doing work...
Why is FOO getting released eagerly?
You're creating two context managers here. Only one of those context managers is actually implemented correctly.
Your using context manager is fine, but you've also implemented the context manager protocol on MyClass itself, and the implementation on MyClass is broken. MyClass.__enter__ creates a using context manager, enters it, returns what that context manager's __enter__ returns, and then throws the using context manager away.
You don't exit the using context manager when MyClass() is exited. You never exit it at all! You throw the using context manager away. It gets reclaimed, and when it does, the generator gets close called automatically, as part of normal generator cleanup. That throws a GeneratorExit exception into the generator, triggering the finally block.
Python doesn't promise when this cleanup will happen (or indeed, if it will happen at all), but in practice, CPython's reference counting mechanism triggers the cleanup as soon as the using context manager is no longer reachable.
Aside from that, if _values is supposed to be an instance variable, it should be set as self._values = {} inside an __init__ method. Right now, it's a class variable.
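One possible way to repair the class, sketched under the assumption that MyClass should keep the inner using manager alive until its own __exit__ runs. The _cm attribute and the log list are additions for illustration (log replaces the print calls so the order is easy to inspect).

```python
from contextlib import contextmanager

log = []  # collects events instead of printing


class MyClass:
    def __init__(self):
        self._values = {}  # instance variable, as suggested above
        self._cm = None    # hypothetical slot for the entered inner manager

    @contextmanager
    def using(self, name, value):
        log.append(f'Allocating {name} = {value}')
        self._values[name] = value
        try:
            yield
        finally:
            log.append(f'Releasing {name}')
            del self._values[name]

    def __enter__(self):
        self._cm = self.using('FOO', 42)  # keep a reference to the inner manager
        self._cm.__enter__()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Forward the exit to the inner manager so its finally block runs here.
        return self._cm.__exit__(exc_type, exc_val, exc_tb)


with MyClass():
    log.append('Doing work...')
```

Because the inner manager is now exited explicitly, the release no longer depends on when the garbage collector reclaims it.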

Is returning a value other than `self` in `__enter__` an anti-pattern?

Following this related question, while there are always examples of some library using a language feature in a unique way, I was wondering whether returning a value other than self in an __enter__ method should be considered an anti-pattern.
The main reason why this seems to me like a bad idea is that it makes wrapping context managers problematic. For example, in Java (also possible in C#), one can wrap an AutoCloseable class in another class which will take care of cleaning up after the inner class, like in the following code snippet:
try (BufferedReader reader =
        new BufferedReader(new FileReader("src/main/resources/input.txt"))) {
    return readAllLines(reader);
}
Here, BufferedReader wraps FileReader, and calls FileReader's close() method inside its own close() method. However, if this were Python and FileReader returned an object other than self from its __enter__ method, such an arrangement would become significantly more complicated. The following issues would have to be addressed by the writer of BufferedReader:
When I need to use FileReader for my own methods, do I use FileReader directly or the object returned by its __enter__ method? What methods are even supported by the returned object?
In my __exit__ method, do I need to close only the FileReader object, or the object returned in the __enter__ method?
What happens if __enter__ actually returns a different object on each call? Do I now need to keep a collection of all of the different objects returned by it in case someone calls __enter__ several times on me? How do I know which one to use when I need to use one of these objects?
And the list goes on. One semi-successful solution to all of these problems would be to simply avoid having one context manager class clean up after another context manager class. In my example, that would mean that we would need two nested with blocks - one for the FileReader, and one for the BufferedReader. However, this makes us write more boilerplate code, and seems significantly less elegant.
All in all, these issues lead me to believe that while Python does allow us to return something other than self from the __enter__ method, this behavior should simply be avoided. Are there any official or semi-official remarks about these issues? How should a responsible Python developer write code that addresses them?
TLDR: Returning something other than self from __enter__ is perfectly fine and not bad practice.
PEP 343, which introduced the with statement, and the context manager specification both expressly list this as a desired use case.
An example of a context manager that returns a related object is the
one returned by decimal.localcontext(). These managers set the active
decimal context to a copy of the original decimal context and then
return the copy. This allows changes to be made to the current decimal
context in the body of the with statement without affecting code
outside the with statement.
The standard library has several examples of returning something other than self from __enter__. Notably, much of contextlib matches this pattern.
contextlib.contextmanager produces context managers which cannot return self, because there is no such thing.
contextlib.closing wraps a thing and returns it on __enter__.
contextlib.nullcontext returns a pre-defined constant
threading.Lock returns a boolean
decimal.localcontext returns a copy of its argument
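A short sketch that checks the return values of these standard-library managers. The variable names are illustrative; the behaviour shown is that of CPython's standard library.

```python
import decimal
import io
import threading
from contextlib import closing, nullcontext

original_prec = decimal.getcontext().prec
with decimal.localcontext() as ctx:     # returns a copy of the active context
    ctx.prec = original_prec + 5
    inner_prec = decimal.getcontext().prec
outer_prec = decimal.getcontext().prec  # the change did not leak out

lock = threading.Lock()
with lock as acquired:                  # Lock.__enter__ returns the result of acquire()
    lock_value = acquired

buffer = io.StringIO("text")
with closing(buffer) as same:           # closing() returns the wrapped object
    closing_returns_wrapped = same is buffer

with nullcontext("constant") as value:  # nullcontext returns its enter_result argument
    null_value = value
```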
The context manager protocol makes it clear what is the context manager, and who is responsible for cleanup. Most importantly, the return value of __enter__ is inconsequential for the protocol.
A rough paraphrasing of the protocol is this: When something runs cm.__enter__, it is responsible for running cm.__exit__. Notably, whatever code does that has access to cm (the context manager itself); the result of cm.__enter__ is not needed to call cm.__exit__.
In other words, any code that takes (and runs) a ContextManager must run it completely. Any other code does not have to care whether its value comes from a ContextManager or not.
from typing import Any, ContextManager
# entering a context manager requires closing it…
def managing(cm: ContextManager):
    value = cm.__enter__()  # must clean up `cm` after this point
    try:
        yield from unmanaged(value)
    except BaseException as exc:
        if not cm.__exit__(type(exc), exc, exc.__traceback__):
            raise
    else:
        cm.__exit__(None, None, None)
# …other code does not need to know where its values come from
def unmanaged(smth: Any):
    yield smth
When context managers wrap others, the same rules apply: If the outer context manager calls the inner one's __enter__, it must call its __exit__ as well. If the outer context manager already has the entered inner context manager, it is not responsible for cleanup.
In some cases it is in fact bad practice to return self from __enter__. Returning self from __enter__ should only be done if self is fully initialised beforehand; if __enter__ runs any initialisation code, a separate object should be returned.
from typing import TextIO
class BadContextManager:
    """
    Anti Pattern: Context manager is in inconsistent state before ``__enter__``
    """
    def __init__(self, path):
        self.path = path
        self._file = None  # BAD: initialisation not complete
    def read(self, n: int):
        return self._file.read(n)  # fails before the context is entered!
    def __enter__(self) -> 'BadContextManager':
        self._file = open(self.path)
        return self  # BAD: self was not valid before
    def __exit__(self, exc_type, exc_val, tb):
        self._file.close()
class GoodContext:
    def __init__(self, path):
        self.path = path
        self._file = None  # GOOD: Inconsistent state not visible/used
    def __enter__(self) -> TextIO:
        if self._file is not None:
            raise RuntimeError(f'{self.__class__.__name__} is not re-entrant')
        self._file = open(self.path)
        return self._file  # GOOD: value was not accessible before
    def __exit__(self, exc_type, exc_val, tb):
        self._file.close()
Notably, even though GoodContext returns a different object, it is still responsible for cleanup. Another context manager wrapping GoodContext does not need to close the return value; it just has to call GoodContext.__exit__.
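One way to follow that rule when wrapping is contextlib.ExitStack. In the sketch below, the names inner and Outer are hypothetical: Outer enters the inner manager, so Outer is the one that exits it, and it never needs the value that inner yielded in order to do so.

```python
from contextlib import ExitStack, contextmanager

events = []


@contextmanager
def inner():
    """Hypothetical inner manager that yields something other than itself."""
    events.append("inner enter")
    try:
        yield "payload"
    finally:
        events.append("inner exit")


class Outer:
    """Hypothetical wrapper: it enters inner(), so it must also exit it."""
    def __enter__(self):
        self._stack = ExitStack()
        # enter_context() calls inner().__enter__() and registers its __exit__.
        self.value = self._stack.enter_context(inner())
        events.append("outer enter")
        return self

    def __exit__(self, exc_type, exc_val, tb):
        events.append("outer exit")
        # Unwinding the stack calls the inner manager's __exit__.
        return self._stack.__exit__(exc_type, exc_val, tb)


with Outer() as outer:
    events.append(f"body sees {outer.value}")
```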

How to give with-statement-like functionality to class?

[I apologize for the inept title; I could not come up with anything better. Suggestions for a better title are welcome.]
I want to implement an interface to HDF5 files that supports multiprocess-level concurrency through file-locking. The intended environment for this module is a Linux cluster accessing a shared disk over NFS. The goal is to enable the concurrent access (over NFS) to the same file by multiple parallel processes running on several different hosts.
I would like to be able to implement the locking functionality through a wrapper class for the h5py.File class. (h5py already offers support for thread-level concurrency, but the underlying HDF5 library is not thread-safe.)
It would be great if I could do something in the spirit of this:
class LockedH5File(object):
    def __init__(self, path, ...):
        ...
        with h5py.File(path, 'r+') as h5handle:
            fcntl.flock(fcntl.LOCK_EX)
            yield h5handle
            # (method returns)
I realize that the above code is wrong, but I hope it conveys the main idea: namely, to have the expression LockedH5File('/path/to/file') deliver an open handle to the client code, which can then perform various arbitrary read/write operations on it. When this handle goes out of scope, its destructor closes the handle, thereby releasing the lock.
The goal that motivates this arrangement is two-fold:
decouple the creation of the handle (by the library code) from the operations that are subsequently requested on the handle (by the client code), and
ensure that the handle is closed and the lock released, no matter what happens during the execution of the intervening code (e.g. exceptions, unhandled signals, Python internal errors).
How can I achieve this effect in Python?
Thanks!
Objects that can be used in with statements are called context managers, and they implement a simple interface. They must provide two methods: an __enter__ method, which takes no arguments and may return anything (the return value is assigned to the variable in the as portion), and an __exit__ method, which takes three arguments (filled in with the result of sys.exc_info(), or None for each if no exception occurred) and returns a true value to indicate that the exception was handled. Your example will probably look like:
class LockedH5File(object):
    def __init__(self, path, ...):
        self.path = path
    def __enter__(self):
        self.h5handle = h5handle = h5py.File(self.path, 'r+')
        # NOTE: fcntl.flock() expects a file descriptor as its first argument;
        # which descriptor to lock is left unspecified in this sketch.
        fcntl.flock(fcntl.LOCK_EX)
        return h5handle
    def __exit__(self, exc_type, exc_info, exc_tb):
        self.h5handle.close()
To make this work, your class needs to implement the context manager protocol. Alternatively, write a generator function using the contextlib.contextmanager decorator.
Your class might roughly look like this (the details of h5py usage are probably horribly wrong):
class LockedH5File(object):
    def __init__(self, path, ...):
        self.h5file = h5py.File(path, 'r+')
    def __enter__(self):
        fcntl.flock(fcntl.LOCK_EX)
        return self.h5file
    def __exit__(self, exc_type, exc_val, exc_tb):
        self.h5file.close()
Well, a context manager and with statement. In general, destructors in Python are not guaranteed to run at all, so you should not rely on them as anything other than fail-safe cleanup. Provide __enter__ and __exit__ methods, and use it like
with LockedFile(...) as fp:
    # ...
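For a version that can actually be run, here is a sketch of the same idea using a plain file opened with open() as a stand-in for h5py.File, since the h5py details are not settled in this thread. The helper name locked_file is made up. It passes the file descriptor to fcntl.flock() as the first argument, and it assumes a Unix-like system where the fcntl module is available; whether flock is honoured over NFS depends on the setup and is not addressed here.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_file(path, mode="r+"):
    """Hypothetical helper: open `path` and hold an exclusive flock for the with block."""
    with open(path, mode) as handle:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX)      # blocks until the lock is granted
        try:
            yield handle
        finally:
            fcntl.flock(handle.fileno(), fcntl.LOCK_UN)  # released even on exceptions


# Usage with a throwaway file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with locked_file(demo_path) as fh:
    fh.write("locked write")
closed_after = fh.closed
with open(demo_path) as check:
    content = check.read()
os.remove(demo_path)
```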
