Why aren't destructors guaranteed to be called on interpreter exit? - python

From the Python docs:
It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.
Why not? What problems would occur if this guarantee were made?

I'm not convinced by the previous answers here.
Firstly, note that the example given does not prevent __del__ methods from being called during exit. In fact, current CPython versions do call the given __del__ method: twice in the case of Python 2.7 and once in the case of Python 3.4. So this can't be the "killer example" which shows why the guarantee is not made.
I think the statement in the docs is not motivated by a design principle that calling the destructors would be bad. Not least because it seems that in CPython 3.4 and up they are always called as you would expect and this caveat seems to be moot.
Instead I think the statement simply reflects the fact that the CPython implementation has sometimes not called all destructors on exit (presumably for ease of implementation reasons).
The situation seems to be that CPython 3.4 and 3.5 do always call all destructors on interpreter exit.
CPython 2.7 by contrast does not always do this. Certainly __del__ methods are usually not called on objects which have cyclic references, because those objects cannot be deleted if they have a __del__ method. The garbage collector won't collect them. While the objects do disappear when the interpreter exits (of course) they are not finalized and so their __del__ methods are never called. This is no longer true in Python 3.4 after the implementation of PEP 442.
However, it seems that Python 2.7 also does not finalize objects that have cyclic references, even if they have no destructors, if they only become unreachable during the interpreter exit.
Presumably this behaviour is sufficiently particular and difficult to explain that it is best expressed simply by a generic disclaimer - as the docs do.
Here's an example:
class Foo(object):
    def __init__(self):
        print("Foo init running")

    def __del__(self):
        print("Destructor Foo")

class Bar(object):
    def __init__(self):
        print("Bar1 init running")
        self.bar = self
        self.foo = Foo()

b = Bar()
# del b
With the del b commented out, the destructor in Foo is not called in Python 2.7 though it is in Python 3.4.
With the del b added, then the destructor is called (at interpreter exit) in both cases.

If you did some nasty things, you could find yourself with an undeletable object which Python would try to delete forever:
class Phoenix(object):
    def __del__(self):
        print "Deleting an Oops"
        global a
        a = self

a = Phoenix()
Relying on __del__ isn't great in any event, as Python doesn't guarantee when an object will be deleted (especially for objects with cyclic references). That said, perhaps turning your class into a context manager is a better solution: then you can guarantee that cleanup code is called even in the case of an exception, etc.
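For instance, a minimal sketch of the context-manager approach (the Connection class and its prints are purely illustrative, not taken from the question):

import threading  # not required here, just to show the block stands alone

class Connection(object):
    def __enter__(self):
        print("connected")      # acquire the resource here
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        print("cleaned up")     # runs even if the with-block raises

with Connection() as conn:
    pass                        # use conn; cleanup is guaranteed when this block ends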

One example where the destructor is not called is if you exit inside a method. Have a look at this example:
class Foo(object):
    def __init__(self):
        print("Foo init running")

    def __del__(self):
        print("Destructor Foo")

class Bar(object):
    def __init__(self):
        print("Bar1 init running")
        self.bar = self
        self.foo = Foo()

    def __del__(self):
        print("Destructor Bar")

    def stop(self):
        del self.foo
        del self
        exit(1)

b = Bar()
b.stop()
The output is:
Bar1 init running
Foo init running
Destructor Foo
Because we delete foo explicitly, its destructor is called, but the destructor of bar is not!
And if we do not delete foo explicitly, it is not finalized either:
class Foo(object):
    def __init__(self):
        print("Foo init running")

    def __del__(self):
        print("Destructor Foo")

class Bar(object):
    def __init__(self):
        print("Bar1 init running")
        self.bar = self
        self.foo = Foo()

    def __del__(self):
        print("Destructor Bar")

    def stop(self):
        exit(1)

b = Bar()
b.stop()
Output:
Bar1 init running
Foo init running

I don't think this is because doing the deletions would cause problems. It's more that the Python philosophy is not to encourage developers to rely on the use of object deletion, because the timing of these deletions cannot be predicted - it is up to the garbage collector when it occurs.
If the garbage collector may defer deleting unused objects for an unknown amount of time after they go out of scope, then relying on side effects that happen during the object deletion is not a very robust or deterministic strategy. RAII is not the Python way. Instead Python code handles cleanup using context managers, decorators, and the like.
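As a small illustration of that non-determinism, here is a sketch (assuming CPython 3.4 or later, where objects with __del__ in a cycle are still collectable):

import gc

class Node(object):
    def __del__(self):
        print("Node finalized")

a = Node()
b = Node()
a.other = b          # build a reference cycle
b.other = a
del a, b             # refcounts never reach zero on their own
print("nothing finalized yet")
gc.collect()         # only now does the cyclic collector run both __del__ methods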
Worse, in complicated situations, such as with object cycles, the garbage collector might not ever detect that objects can be deleted. This situation has improved as Python has matured. But because of exceptions to the expected GC behaviour like this, it is unwise for Python developers to rely on object deletion.
I speculate that interpreter exit is another complicated situation where the Python devs, especially for older versions of Python, were not completely strict about making sure the GC delete ran on all objects.

Likely because most programmers would assume that destructors should only be called on dead (already unreachable) objects, whereas on exit we would be invoking them on live objects.
If the developer has not been expecting a destructor call on a live object, some nasty undefined behaviour may result. At the very least, something must be done to force-close the application after a timeout if it hangs, and then some destructors may not be called anyway.
Java's Runtime.runFinalizersOnExit has been deprecated for the same reason.

Related

How to design a python class with a thread member, that gets garbage collected

I have created a class A using the following pattern
import threading

class A:
    def __init__(self):
        self.worker = threading.Thread(target=self.workToDo)
        self.worker.setDaemon(daemonic=True)
        self.worker.start()

    def workToDo(self):
        while True:
            print("Work")
However, instances of this design never get garbage collected. I assume that this is due to a circular dependency between the running thread and its parent.
How can I design a class that starts a periodic thread that does some work, stops this thread on destruction, and gets destroyed as soon as all obvious references to the parent object go out of scope?
I tried to stop the thread in the __del__ method, but this method is never called (I assume due to the circular dependency).
There is no circular dependence, and the garbage collector is doing exactly what it is supposed to do. Look at the method workToDo:
def workToDo(self):
    while True:
        print("Work")
Once you start the thread, this method will run forever. It contains a variable named self: the instance of class A that originally launched the thread. As long as this method continues to run, there is an active reference to the instance of A and therefore it cannot be garbage collected.
This can easily be demonstrated with the following little program:
import threading
import time

def workToDo2():
    while True:
        print("Work2")
        time.sleep(0.5)

class A:
    def __init__(self):
        self.worker = threading.Thread(target=workToDo2, daemon=True)
        self.worker.start()

    def workToDo(self):
        while True:
            print("Work")
            time.sleep(0.5)

    def __del__(self):
        print("del")

A()
time.sleep(5.0)
If you change the function that starts the thread from self.workToDo to workToDo2, the __del__ method fires almost immediately. In that case the thread does not reference the object created by A(), so it can be safely garbage collected.
Your statement of the problem is based on a false assumption about how the garbage collector works. There is no such concept as "obvious reference" - there is either a reference or there isn't.
The threads continue to run whether the object that launched them is garbage collected or not. You really should design Python threads so there is a mechanism to exit from them cleanly, unless they are true daemons and can continue to run without harming anything.
I understand the urge to avoid trusting your users to call some sort of explicit close function. But the Python philosophy is "we're all adults here," so IMO this problem is not a good use of your time.
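For what it's worth, here is a minimal sketch of such a clean-exit mechanism, using a threading.Event as the stop signal and an explicit close() call instead of __del__ (the Worker, _stop_event, and close names are illustrative):

import threading
import time

class Worker:
    def __init__(self):
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # loop until someone asks us to stop
        while not self._stop_event.is_set():
            print("Work")
            time.sleep(0.5)

    def close(self):
        # explicit, deterministic cleanup instead of relying on __del__
        self._stop_event.set()
        self._thread.join()

w = Worker()
time.sleep(2.0)
w.close()   # the thread stops promptly; the object can then be collected normally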
Syntax of destructor declaration:
def __del__(self):
    # body of destructor
Note: The destructor is also called when the last reference to the object goes away or when the program ends.
Example: Here is a simple example of a destructor. By using the del keyword we delete all references to the object obj, so the destructor is invoked automatically.
Python program to illustrate destructor:
class Employee:
    # Initializing
    def __init__(self):
        print('Employee created.')

    # Deleting (Calling destructor)
    def __del__(self):
        print('Destructor called, Employee deleted.')

obj = Employee()
del obj

setattr, object deletion and cyclic garbage collection

I would like to understand how object deletion works in Python. Here is a very simple bunch of code.
class A(object):
    def __init__(self):
        setattr(self, "test", self._test)

    def _test(self):
        print "Hello, World!"

    def __del__(self):
        print "I'm dying!"

class B(object):
    def test(self):
        print "Hello, World!"

    def __del__(self):
        print "I'm dying"

print "----------Test on A"
A().test()
print "----------Test on B"
B().test()
Pythonistas will recognize that I'm running a Python 2.x version. More specifically, this code runs on a Python 2.7.1 setup.
This code outputs the following:
----------Test on A
Hello, World!
----------Test on B
Hello, World!
I'm dying
Surprisingly, the A object is not deleted. I can understand why, since the setattr call in __init__ produces a circular reference. But this one seems easy to resolve.
Finally, this page in the Python documentation (Supporting Cyclic Garbage Collection) shows that it's possible to deal with this kind of circular reference.
I would like to know:
why do I never go through my __del__ method in class A?
if my diagnosis about the circular reference is right, why does my object subclass not support cyclic garbage collection?
finally, how do I deal with this kind of setattr if I really want to go through __del__?
Note: in A, if the setattr points to another method of my module, there's no problem.
Fact 1
Instance methods are normally stored on the class. The interpreter first looks them up in the instance __dict__, which fails, and then looks on the class, which succeeds.
When you dynamically set the instance method of A in __init__, you create a reference to it in the instance dictionary. This reference is circular, so the refcount will never go to zero and reference counting will not clean A up.
>>> class A(object):
...     def _test(self): pass
...     def __init__(self):
...         self.test = self._test
...
>>> a = A()
>>> a.__dict__['test'].im_self
<__main__.A object at 0x...>
Fact 2
The garbage collector is what Python uses to deal with circular references. Unfortunately, it can't handle objects with __del__ methods, since in general it can't determine a safe order to call them. Instead, it just puts all such objects in gc.garbage. You can then go look there to break cycles, so they can be freed. From the docs
gc.garbage
A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. Python doesn’t collect such cycles automatically because, in general, it isn’t possible for Python to guess a safe order in which to run the __del__() methods. If you know a safe order, you can force the issue by examining the garbage list, and explicitly breaking cycles due to your objects within the list. Note that these objects are kept alive even so by virtue of being in the garbage list, so they should be removed from garbage too. For example, after breaking cycles, do del gc.garbage[:] to empty the list. It’s generally better to avoid the issue by not creating cycles containing objects with __del__() methods, and garbage can be examined in that case to verify that no such cycles are being created.
Therefore
Don't make cyclic references on objects with __del__ methods if you want them to be garbage collected.
You should read the documentation on the __del__ method rather carefully - specifically, the part where objects with __del__ methods change the way the collector works.
The gc module provides some hooks where you can clean this up yourself.
I suspect that simply not having a __del__ method here would result in your object being properly cleaned up. You can verify this by looking through gc.garbage and seeing if your instance of A is present.
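A minimal sketch of that gc.garbage check under Python 2.7 (the cycle-breaking loop at the end is added purely for illustration):

import gc

class A(object):
    def __init__(self):
        self.test = self._test   # bound method stored on the instance -> reference cycle

    def _test(self):
        print("Hello, World!")

    def __del__(self):
        print("I'm dying!")

A()
gc.collect()        # force a cyclic collection pass
print(gc.garbage)   # on 2.7 the A instance is stranded here and __del__ has not run

# Breaking the cycle by hand finally lets the instance be freed:
for obj in gc.garbage:
    obj.__dict__.pop('test', None)
del gc.garbage[:]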

Thread-safe object and dict types in Python

I have two questions about creating thread-safe types in Python, and one related question about multiple inheritance.
1) Are there any problematic implications with using the following subclasses in my threaded application as a sort of "lazy" thread-safe type? I realize that whoever sets values which may be altered by other threads bears the responsibility to ensure those values are thread safe as well.
2) Another question I have is whether there exist more prudent alternatives to these types within Python in a typical installation.
Example:
from __future__ import with_statement
from threading import Lock

class safedict(dict):
    def __init__(self, *args, **kwargs):
        self.mylock = Lock();
        super(safedict, self).__init__(*args, **kwargs)

    def __setitem__(self, *args, **kwargs):
        with self.mylock:
            print " DEBUG: Overloaded __setitem__ has the lock now."
            super(safedict, self).__setitem__(*args, **kwargs)

class safeobject(object):
    mylock = Lock();  # a temporary useless lock, until we have a proper instance.

    def __init__(self, *args, **kwargs):
        self.mylock = Lock();
        super(safeobject, self).__init__(*args, **kwargs)

    def __setattr__(self, *args, **kwargs):
        with self.mylock:
            print " DEBUG: Overloaded __setattr__ has the lock now."
            super(safeobject, self).__setattr__(*args, **kwargs)
3) If both of the types defined above could be considered reasonably safe, what negative implications would be faced by using multiple inheritance to create a type that supported a mixture of both of these modifications, and does my example inherit those classes in the optimal order?
Example:
class safedict2(safeobject, dict):
    def __setitem__(self, *args, **kwargs):
        with self.mylock:
            print " DEBUG: Overloaded __setitem__ has the lock now."
            super(safedict2, self).__setitem__(*args, **kwargs)
Edit:
Just another example of a type that inherits both of the former types, tested using IPython.
In [304]: class safedict3(safeobject,safedict):
.....: pass
.....:
In [305]: d3 = safedict3()
DEBUG: Overloaded __setattr__ has the lock now.
DEBUG: Overloaded __setattr__ has the lock now.
In [306]: d3.a=1
DEBUG: Overloaded __setattr__ has the lock now.
In [307]: d3['b']=2
DEBUG: Overloaded __setitem__ has the lock now.
In [308]: d3
Out[308]: {'b': 2}
As to your first and second questions, the dict, list, etc. types are already thread-safe. You do not have to add thread safety to them. However you may find this useful. It's a decorator that basically implements the synchronized keyword from Java, using function scope to define a critical section. Using a similar approach it is possible to author a threading.Condition oriented decorator also.
import threading

def tryfinally(finallyf):
    u"returns a decorator that adds try/finally behavior with given no-argument call in the finally"
    def decorator(callable):
        def execute(*args, **kwargs):
            try: result = callable(*args, **kwargs)
            finally: finallyf()
            return result
        return execute
    return decorator

def usinglock(lock):
    u"returns a decorator whose argument will acquire the given lock while executing"
    def decorator(function):
        body = tryfinally(lock.release)(function)
        def execute(*args, **kwargs):
            lock.acquire()
            return body(*args, **kwargs)
        return execute
    return decorator

def synchronized(function):
    u"decorator; only one thread can enter the decorated function at a time; recursion is OK"
    return usinglock(threading.RLock())(function)
Use it like this (and beware deadlocks if you overuse it):
@synchronized
def foo(*args):
    print 'Only one thread can enter this function at a time'
On the third question, the Python tutorial states that the search order for inherited attributes is depth-first, left-first. So if you inherit (myclass, dict) then the __setitem__ method from myclass should be used. (In older versions of Python, this same section in the tutorial implied that this choice was arbitrary, but nowadays it appears to be quite deliberate.)
I'm guessing from the Freudian slip of a semicolon in the posted source that you are new to Python but experienced in either Java or C#. If so you will need to keep in mind that attribute (method) resolution occurs at run time in Python, and that the classes as well as the instance are first-class objects that can be inspected/explored at run time.
First the instance attribute dictionary is searched, then the class attributes, and then the parent class search algorithm starts. This is done with (conceptually) the equivalent of repeated hasattr(class_or_instance, attribute) calls.
The below confirms that for "new-style" classes (classes that inherit from object, which in the 2.x language specification is optional), this resolution occurs each time the attribute is looked up. It is not done when the class (or subclass) is created, or when instances are created. (This was done in release 2.7.2.)
>>> class Foo(object):
...     def baz(self):
...         print 'Original Foo.baz'
...
>>> class Bar(Foo): pass
...
>>> def newprint(self):
...     print 'New Foo.baz'
...
>>> x = Foo()
>>> y = Bar()
>>> Foo.baz = newprint
>>> a = Foo()
>>> b = Bar()
>>> map(lambda k: k.baz(), (x, y, a, b))
New Foo.baz
New Foo.baz
New Foo.baz
New Foo.baz
[None, None, None, None]
Replacing the method of class Foo changes the behavior of subclasses already defined and of instances already created.

thread Locking/unlocking in constructor/destructor in python

I have a class that is only ever accessed externally through static methods. Those static methods then create an object of the class to use within the method, then they return and the object is presumably destroyed. The class is a getter/setter for a couple config files and now I need to place thread locks on the access to the config files.
Since I have several different static methods that all need read/write access to the config files that all create objects in the scope of the method, I was thinking of having my lock acquires done inside of the object constructor, and then releasing in the destructor.
My coworker expressed concern that it seems like that could potentially leave the class locked forever if something happened. He also mentioned something about how the destructor in Python is called with regard to the garbage collector, but we're both relatively new to Python, so that's an unknown.
Is this a reasonable solution or should I just lock/unlock in each of the methods themselves?
class A():
    rateLock = threading.RLock()
    chargeLock = threading.RLock()

    @staticmethod
    def doZStuff():
        a = A()
        a.doStuff('Z')

    @staticmethod
    def doYStuff():
        a = A()
        a.doStuff('Y')

    @synchronized(lock)
    def doStuff(self, type):
        if type == 'Z':
            otherstuff()
        elif type == 'B':
            evenmorestuff()
Is it even possible to get it to work that way, with the decorator on doStuff() instead of on doZStuff()?
Update
Thanks for the answers everyone. The problem I'm facing is mostly due to the fact that it doesn't really make sense to access my module asynchronously, but this is just part of an API. And the team accessing our stuff through the API was complaining about concurrency issues. So I don't need the perfect solution, I'm just trying to make it so they can't crash our side or get garbage data back
import threading

class A():
    rateLock = threading.RLock()
    chargeLock = threading.RLock()

    def doStuff(self, ratefile, chargefile):
        with A.rateLock:
            with open(ratefile) as f:
                pass  # ... work with the rate file
        with A.chargeLock:
            with open(chargefile) as f:
                pass  # ... work with the charge file
Using the with statement will guarantee that the (R)Lock is acquired and released in pairs. The release will be called even if an exception occurs within the with-block.
You might also want to think about placing your locks around the file access block with open(...) as ... as tightly as you can so that the locks are not held longer than necessary.
Finally, the creation and garbage collection of a = A() will not affect the locks if (as above) the locks are class attributes (as opposed to instance attributes). The class attributes live in A.__dict__, rather than a.__dict__. So the locks will not be garbage collected until A itself is garbage collected.
You are right about the garbage collection, so it is not a good idea.
Look into decorators for writing synchronized functions.
Example: http://code.activestate.com/recipes/465057-basic-synchronization-decorator/
edit
I'm still not 100% sure what you have in mind, so my suggestion may be wrong:
class A():
    lockZ = threading.RLock()
    lockY = threading.RLock()

    @staticmethod
    @synchronized(lockZ)
    def doZStuff():
        a = A()
        a.doStuff('Z')

    @staticmethod
    @synchronized(lockY)
    def doYStuff():
        a = A()
        a.doStuff('Y')

    def doStuff(self, type):
        if type == 'Z':
            otherstuff()
        elif type == 'B':
            evenmorestuff()
However, if you HAVE TO acquire and release locks in constructors and destructors, then you really, really, really should give your design another chance. You should change your basic assumptions.
In any application: a "LOCK" should always be held for a short time only - as short as possible. That means - in probably 90% of all cases, you will acquire the lock in the same method that will also release the lock.
There should hardly ever be a reason to lock/unlock an object in RAII style. That is not what locks were meant for ;)
Let me give you an example: you manage some resources, and those resources can be read from many threads at once, but only one thread can write to them.
In a "naive" implementation you would have one lock per object, and whenever someone wants to write to it, you LOCK it. When multiple threads want to write to it, it is synchronized fairly, all safe and well. BUT: when a thread says "WRITE", we stall until the other threads decide to release the lock.
But please understand that locks, mutexes - all these primitives were created to synchronize only a few lines of your source code. So, instead of making the lock part of your writeable object, you hold a lock only for the very short time when it is really required. You have to invest more time and thought in your interfaces. But LOCKS/MUTEXES were never meant to be "held" for more than a few microseconds.
I don't know which platform you are on, but if you need to lock a file, well, you should probably use flock() if it is available instead of rolling your own locking routines.
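A minimal sketch of that approach on a POSIX system (the file name is a placeholder):

import fcntl

with open('myconfig.conf', 'a') as f:
    fcntl.flock(f.fileno(), fcntl.LOCK_EX)   # exclusive lock; blocks until acquired
    f.write('some data\n')
    fcntl.flock(f.fileno(), fcntl.LOCK_UN)   # explicit unlock (also released on close)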
Since you've mentioned that you are new to Python, I must say that most of the time threads are not the solution in Python. If your activity is CPU-bound, you should consider using multiprocessing - there is no truly concurrent execution of Python code because of the GIL, remember? (This is true for most cases.) If your activity is I/O-bound, which I guess is the case, you should perhaps consider using an event-driven framework like Twisted. That way you won't have to worry about deadlocks at all, I promise :)
Releasing locks in the destructor is risky, as has already been mentioned, because of the garbage collector: when to call an object's __del__() method is decided exclusively by the GC (usually when the refcount reaches zero), and in some cases, if you have circular references, it might never be called, even when the program exits.
If you are handling one specific config file inside a class instance, then you might put a lock object from the threading module inside it.
Some example code of this:
from threading import Lock

class ConfigFile:
    def __init__(self, file):
        self.file = file
        self.lock = Lock()

    def write(self, data):
        self.lock.acquire()
        # <do stuff with file>
        self.lock.release()

# Function that uses ConfigFile object
def staticmethod():
    config = ConfigFile('myconfig.conf')
    config.write('some data')
You can also use locks in a with statement, like:
def write(self, data):
    with self.lock:
        pass  # <do stuff with file>
And Python will acquire and release the lock for you, even in case of errors that happen while doing stuff with the file.

__del__ at program end

Suppose there is a program with a couple of objects living in it at runtime.
Is the __del__ method of each object called when the programs ends?
If yes, I could for example do something like this:
class Client:
    def __del__(self):
        disconnect_from_server()
There are many potential difficulties associated with using __del__.
Usually, it is neither necessary nor the best idea to define it yourself.
Instead, if you want an object that cleans up after itself upon exit or an exception, use a context manager:
per Carl's comment:
class Client:
    def __enter__(self):
        # required by the with statement; hand the client back to the caller
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.disconnect_from_server()

with Client() as c:
    ...
original answer:
import contextlib

class Client:
    ...

@contextlib.contextmanager
def make_client():
    c = Client()
    yield c
    c.disconnect_from_server()

with make_client() as c:
    ...
I second the general idea of using context managers and the with statement instead of relying on __del__ (for much the same reasons one prefers try/finally to finalizer methods in Java, plus one: in Python, the presence of __del__ methods can make cyclic garbage uncollectable).
However, given that the goal is to have "an object that cleans up after itself upon exit or an exception", the implementation by ~unutbu is not correct:
@contextlib.contextmanager
def make_client():
    c = Client()
    yield c
    c.disconnect_from_server()

with make_client() as c:
    ...
If an exception is raised in the ... part, disconnect_from_server does not get called (since the exception propagates through make_client, being uncaught there, and therefore terminates it while it's waiting at the yield).
The fix is simple:
@contextlib.contextmanager
def make_client():
    c = Client()
    try: yield c
    finally: c.disconnect_from_server()
Essentially, the with statement lets you almost forget about the good old try/finally statement... except when you're writing context managers with contextlib, and then it's really important to remember it!-)
Consider using the with statement to make cleanup explicit.
With circular references __del__ is not called:
class Foo:
    def __del__(self):
        self.p = None
        print "deleting foo"

a = Foo()
b = Foo()
a.p = b
b.p = a
prints nothing.
Yes, the Python interpreter tidies up at shutdown, including calling the __del__ method of every object (except objects that are part of a reference cycle).
Although, as others have pointed out, __del__ methods are very fragile and should be used with caution.
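If cleanup really must run at interpreter shutdown, a more robust sketch is to register it explicitly with the standard library's atexit module rather than relying on __del__ (disconnect_from_server here stands in for whatever cleanup the client needs):

import atexit

class Client:
    def __init__(self):
        # run the cleanup at normal interpreter exit, without relying on __del__
        atexit.register(self.disconnect_from_server)

    def disconnect_from_server(self):
        print("disconnected")

c = Client()
# ... the program runs; "disconnected" is printed once at normal interpreter exit

Note that registering the bound method keeps the instance alive until exit, which is usually fine for a long-lived client.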
