I'm trying to create a class that saves all of its instances in a dictionary:
>>> class X:
        def __new__(cls, index):
            if index in cls._instances:
                return cls._instances[index]
            self = object.__new__(cls)
            self.index = index
            cls._instances[index] = self
            return self
        def __del__(self):
            del type(self)._instances[self.index]
        _instances = {}
However, the __del__ doesn't seem to work:
>>> x = X(1)
>>> del x
>>> X._instances
{1: <__main__.X object at 0x00000000035166D8>}
>>>
What am I doing wrong?
Building on Kirk Strauser's answer, I'd like to point out that when you del x, the class's _instances dict still holds another reference to the object, so it can't be garbage collected (and __del__ won't run).
Instead of doing this kind of low-level magic, you probably should be using weakrefs, which were implemented especially for this purpose.
WeakValueDictionary, in particular, suits your needs perfectly, and you can fill it in __init__ instead of fiddling with __new__ and __del__.
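A minimal sketch of that suggestion, assuming the registry only needs to map an index to a live instance. Unlike the original __new__ version, it does not hand back an existing instance when the same index is reused; it only keeps the registry from holding instances alive.

```python
import gc
import weakref


class X:
    # Values are held weakly: an entry vanishes once nothing else
    # references the instance strongly.
    _instances = weakref.WeakValueDictionary()

    def __init__(self, index):
        self.index = index
        type(self)._instances[index] = self


x = X(1)
print(dict(X._instances))   # one entry while x is alive
del x
gc.collect()                # only matters on implementations without reference counting
print(dict(X._instances))   # the entry has been dropped
```

No __del__ method is needed here, because the dictionary cleans up after itself.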
You're not doing anything wrong, but __del__ isn't quite what you think. From the docs on it:
Note: del x doesn't directly call x.__del__(). The former decrements the reference count for x by one, and the latter is only called when x's reference count reaches zero.
Running this from the interpreter is particularly tricky because command history or other mechanisms may hold references to x for an indeterminate amount of time.
By the way, your code looks an awful lot like a defaultdict with X as the factory. It may be more straightforward to use something like that to be more explicit (ergo more Pythonic) about what you're trying to do.
One of the effects of the GC changes that happened in Python 3.4 is that a gc-tracked object will only have its __del__ method called once, even if the first __del__ call resurrects the object:
>>> class Foo(object):
...     def __del__(self):
...         print('__del__')
...         global x
...         x = self
...
>>> x = Foo()
>>> del x
__del__
>>> del x
>>>
(Untracked objects currently behave differently, since they don't have the flag that indicates already-finalized status. You can see this by inserting __slots__ = () in the above class definition. I'm not sure whether this is a bug or a known and accepted behavior difference.)
For debugging purposes, it would be useful to be able to determine if an object has had its __del__ method called. One option would be to insert a line in __del__ that sets an indicator flag, but that requires advance preparation, and it may not be possible for objects with __del__ written in C, such as generators.
Is it possible to determine whether an object has been finalized, without modifying its __del__ method?
Since Python 3.9, this can be tested using gc.is_finalized(obj).
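A short sketch of how that check could look, assuming CPython 3.9 or later; the class name Lazarus and the global resurrected are only illustrative. The object resurrects itself in __del__, so it is still reachable afterwards and can be inspected.

```python
import gc

resurrected = None


class Lazarus:
    def __del__(self):
        # Resurrect the instance by storing it in a global.
        global resurrected
        resurrected = self


obj = Lazarus()
before = gc.is_finalized(obj)         # __del__ has not run yet
del obj                               # CPython runs __del__ here; the object survives
after = gc.is_finalized(resurrected)  # the finalizer has already been called once
print(before, after)
```

This works without touching the __del__ method of the object being inspected.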
As far as I know, in Python 'self' represents the instance of a class. Recently I found some code where, in the constructor (__init__), self is assigned to one of its own attributes, like below:
self.x = self
Can anyone please explain what kind of value is actually assigned to x?
It creates a circular reference. self is bound to the instance on which the method is called, so setting self.x = self just creates a reference to the instance on the instance.
This is generally a silly thing to do, and potentially harmful to the memory usage of your program. If the class also defines the object.__del__() method, then this will prevent the object from being garbage collected, causing a memory leak in all CPython releases before 3.4 (the release that implemented PEP 442):
>>> import gc
>>> class SelfReference(object):
...     def __init__(self):
...         self.x = self
...     def __del__(self):
...         pass
...
>>> s = SelfReference()
>>> s.x is s # the instance references itself
True
>>> del s # deleting the only reference should clear it from memory
>>> gc.collect()
25
>>> gc.garbage # yet that instance is *still here*
[<__main__.SelfReference object at 0x102d0b890>]
The gc.garbage list contains everything the garbage collector cannot clean up due to circular references and __del__ methods.
I suspect that you found one of the very few actual use cases for assigning self to an attribute anyway, which is the use case davidb mentions: setting self.__dict__ to self if self is a mapping object, to 'merge' attribute and subscription access into one namespace.
Even though this kind of assignment generally doesn't look like a good idea, there are cases where it is indeed useful and elegant.
Here is one of those cases:
class Dict(dict):
    '''Dictionary subclass allowing to access an item using its key as an
    attribute.
    '''
    def __init__(self, *args, **kwargs):
        super(Dict, self).__init__(*args, **kwargs)
        self.__dict__ = self
Here is a simple usage example:
>>> d = Dict({'one':1, 'two':2})
>>> d['one']
1
>>> d.one
1
Someone asked a similar question: "Printing all instances of a class".
While I am less concerned about printing them, I'd like to know how many instances are currently "live".
The reason for capturing instances is to set up something like a scheduled job: every hour, check these "live" unprocessed instances and enrich the data. After that, either a flag on the instance is set or the instance is simply deleted.
Torsten Marek's answer to "Printing all instances of a class" uses weakrefs, but needs a call to the base class constructor for every class of this type. Is it possible to automate this? Or can we get all instances some other way?
You can either track it on your own (see the other answers) or ask the garbage collector:
import gc
class Foo(object):
    pass
foo1, foo2 = Foo(), Foo()
foocount = sum(1 for o in gc.get_referrers(Foo) if o.__class__ is Foo)
This can be kinda slow if you have a lot of objects, but it's generally not too bad, and it has the advantage of being something you can easily use with someone else's code.
Note: Used o.__class__ rather than type(o) so it works with old-style classes.
If you only want this to work for CPython, and your definition of "live" can be a little lax, there's another way to do this that may be useful for debugging/introspection purposes:
>>> import gc
>>> class Foo(object): pass
>>> spam, eggs = Foo(), Foo()
>>> foos = [obj for obj in gc.get_objects() if isinstance(obj, Foo)]
>>> foos
[<__main__.Foo at 0x1153f0190>, <__main__.Foo at 0x1153f0210>]
>>> del spam
>>> foos = [obj for obj in gc.get_objects() if isinstance(obj, Foo)]
>>> foos
[<__main__.Foo at 0x1153f0190>, <__main__.Foo at 0x1153f0210>]
>>> del foos
>>> foos = [obj for obj in gc.get_objects() if isinstance(obj, Foo)]
>>> foos
[<__main__.Foo at 0x1153f0190>]
Note that deleting spam didn't actually make it non-live, because we've still got a reference to the same object in foos. And reassigning foos didn't help, because apparently the call to get_objects happened before the old list was released. But eventually it went away once we stopped referring to it.
And the only way around this problem is to use weakrefs.
Of course this will be horribly slow in a large system, with or without weakrefs.
Sure, store the count in a class attribute:
class CountedMixin(object):
    count = 0
    def __init__(self, *args, **kwargs):
        type(self).count += 1
        super().__init__(*args, **kwargs)
    def __del__(self):
        type(self).count -= 1
        try:
            super().__del__()
        except AttributeError:
            pass
You could make this slightly more magical with a decorator or a metaclass than with a base class, or simpler if it can be a bit less general (I've attempted to make this fit in anywhere in any reasonable multiple-inheritance hierarchy, which you usually don't need to worry about…), but basically, this is all there is to it.
If you want to have the instances themselves (or, better, weakrefs to them), rather than just a count of them, just replace count=0 with instances=set(), then do instances.add(self) instead of count += 1, etc. (Again, though, you probably want a weakref to self, rather than self.)
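The weakref variant described above might look roughly like this. It is only a sketch: TrackedMixin and Job are hypothetical names, and a single weakref.WeakSet is shared by every subclass, which may or may not be what you want.

```python
import gc
import weakref


class TrackedMixin:
    # Weak references only: tracking an instance does not keep it alive.
    instances = weakref.WeakSet()

    def __init__(self, *args, **kwargs):
        type(self).instances.add(self)
        super().__init__(*args, **kwargs)


class Job(TrackedMixin):
    pass


a, b = Job(), Job()
count_with_two = len(Job.instances)
del a
gc.collect()                # only matters on implementations without reference counting
count_with_one = len(Job.instances)
print(count_with_two, count_with_one)
```

With a WeakSet there is no need for a __del__ method at all, since dead instances drop out of the set on their own.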
I cannot comment on kindall's answer, so I'm writing my comment as an answer:
The solution with gc.get_referrers(<ClassName>) does not work with inherited classes in Python 3: it does not return any instances of a class that inherits from <ClassName>.
Instead you need to use gc.get_objects(), which is much slower since it returns a full list of objects. But for unit tests, where you simply want to ensure your objects get deleted after the test (no circular references), it should be sufficient and fast enough.
Also do not forget to call gc.collect() before checking the number of instances, to ensure all unreferenced instances are really deleted.
I also saw an issue with weak references, which are also counted this way. The problem is that the referenced object might not exist any more, so isinstance(Instance, Class) might fail with an error about a weakly-referenced object that no longer exists.
Here is a simple code example:
import gc
def getInstances(Class):
    gc.collect()
    Number = 0
    InstanceList = gc.get_objects()
    for Instance in InstanceList:
        if 'weakproxy' not in str(type(Instance)):  # avoid weak references
            if isinstance(Instance, Class):
                Number += 1
    return Number
Is there any way to get the original object from a weakproxy pointing to it? E.g., is there an inverse of weakref.proxy()?
A simplified example (Python 2.7):
import weakref
class C(object):
    def __init__(self, other):
        self.other = weakref.proxy(other)
class Other(object):
    pass
others = [Other() for i in xrange(3)]
my_list = [C(others[i % len(others)]) for i in xrange(10)]
I need to get the list of unique other members from my_list. The way I prefer for such tasks is to use a set:
unique_others = {x.other for x in my_list}
Unfortunately this throws TypeError: unhashable type: 'weakproxy'
I have managed to solve the specific problem in an imperative way (slow and dirty):
unique_others = []
for x in my_list:
    if x.other in unique_others:
        continue
    unique_others.append(x.other)
but the general problem noted in the title is still open.
What if I have only my_list under control, the others are buried in some library where someone may delete them at any time, and I want to prevent the deletion by collecting non-weak references in a list?
Or I may want to get the repr() of the object itself, not <weakproxy at xx to Other at xx>.
I guess there should be something like weakref.unproxy that I'm not aware of.
I know this is an old question but I was looking for an answer recently and came up with something. Like others said, there is no documented way to do it and looking at the implementation of weakproxy type confirms that there is no standard way to achieve this.
My solution uses the fact that all Python objects have a set of standard methods (like __repr__) and that bound method objects contain a reference to the instance (in __self__ attribute).
Therefore, by dereferencing the proxy to get the method object, we can get a strong reference to the proxied object from the method object.
Example:
>>> def func():
...     pass
...
>>> weakfunc = weakref.proxy(func)
>>> f = weakfunc.__repr__.__self__
>>> f is func
True
Another nice thing is that it will work for strong references as well:
>>> func.__repr__.__self__ is func
True
So there's no need for type checks if either a proxy or a strong reference could be expected.
Edit:
I just noticed that this doesn't work for proxies of classes. This is not universal then.
Basically there is something like weakref.unproxy, but it's just named weakref.ref(x)().
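To illustrate the weakref.ref(x)() idea on the example from the question, here is a sketch that stores a weakref.ref instead of a proxy. This assumes you control the class C; it does not help if you are handed a proxy that someone else created.

```python
import weakref


class Other:
    pass


class C:
    def __init__(self, other):
        self.other = weakref.ref(other)   # a ref instead of a proxy


others = [Other() for _ in range(3)]
my_list = [C(others[i % len(others)]) for i in range(10)]

# Calling a ref returns the referent, or None if it has already been collected.
unique_others = {c.other() for c in my_list}
unique_others.discard(None)
print(len(unique_others))   # 3
```

The set now holds strong references, so the objects stay alive for as long as unique_others does.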
The proxy object is only there for delegation and the implementation is rather shaky...
The == function doesn't work as you would expect it:
>>> weakref.proxy(object) == object
False
>>> weakref.proxy(object) == weakref.proxy(object)
True
>>> weakref.proxy(object).__eq__(object)
True
However, I see that you don't want to call weakref.ref objects all the time. A good working proxy with dereference support would be nice.
But at the moment, this is just not possible. If you look into the Python source code, you see that you would need something like PyWeakref_GetObject, but there is no call to it at all (and it raises PyErr_BadInternalCall if the argument is wrong, so it seems to be an internal function). PyWeakref_GET_OBJECT is used much more, but there is no function in weakref.py that exposes it.
So, sorry to disappoint you, but weakref.proxy is just not what most people would want for their use cases. You can however make your own proxy implementation. It isn't too hard: just use weakref.ref internally and override __getattr__, __repr__, etc.
On a little sidenote on how PyCharm is able to produce the normal repr output (Because you mentioned that in a comment):
>>> class A(): pass
>>> a = A()
>>> weakref.proxy(a)
<weakproxy at 0x7fcf7885d470 to A at 0x1410990>
>>> weakref.proxy(a).__repr__()
'<__main__.A object at 0x1410990>'
>>> type( weakref.proxy(a))
<type 'weakproxy'>
As you can see, calling the original __repr__ can really help!
weakref.ref is hashable whereas weakref.proxy is not. The API doesn't say anything about how you can actually get a handle on the object a proxy points to. With weakref.ref it's easy: you can just call it. As such, you can roll your own proxy-like class. Here's a very basic attempt:
import weakref
class C(object):
    def __init__(self, obj):
        self.object = weakref.ref(obj)
    def __getattr__(self, key):
        # Only called when normal lookup fails, so "object" and
        # "__init__" never end up here.
        obj = self.object()  # dereference the weakref
        if obj is None:
            raise ReferenceError("weakly-referenced object no longer exists")
        return getattr(obj, key)
class Other(object):
    pass
others = [Other() for i in range(3)]
my_list = [C(others[i % len(others)]) for i in range(10)]
unique_list = {x.object for x in my_list}
Of course, now unique_list contains refs, not proxies, which is fundamentally different...
I know that this is an old question, but I've been bitten by it (so, there's no real 'unproxy' in the standard library) and wanted to share my solution...
The way I solved it to get the real instance was just creating a property which returned it (although I suggest using weakref.ref instead of a weakref.proxy as code should really check if it's still alive before accessing it instead of having to remember to catch an exception whenever any attribute is accessed).
Anyways, if you still must use a proxy, the code to get the real instance is:
import weakref
class MyClass(object):
    @property
    def real_self(self):
        return self
instance = MyClass()
proxied = weakref.proxy(instance)
assert proxied.real_self is instance
For specific debugging purposes I'd like to wrap the __del__ method of an arbitrary object to perform extra tasks, like writing the last value of the object to a file.
Ideally I want to write
monkey(x)
and it should mean that the final value of x is printed when x is deleted.
Now I figured that __del__ is a method defined on the class. So the following is a start:
class Test:
    def __str__(self):
        return "Test"
def p(self):
    print(str(self))
def monkey(x):
    x.__class__.__del__=p
a=Test()
monkey(a)
del a
However, if I want to monkey-patch specific objects only, I suppose I need to dynamically rewrite their class to a new one?! Moreover, I need to do this anyway, since I cannot access __del__ of built-in types?
Does anyone know how to implement that?
While special 'double underscore' methods like __del__, __str__, __repr__, etc. can be monkey-patched on the instance level, they'll just be ignored, unless they are called directly (e.g., if you take Omnifarious's answer: del a won't print a thing, but a.__del__() would).
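A small sketch of that behaviour on Python 3 (the class names A and B and the calls list are made up for the demonstration): a __del__ stored on the instance is never invoked implicitly, while one defined on the class is.

```python
import gc

calls = []


class A:
    pass


a = A()
# Stored in a.__dict__; implicit finalization looks __del__ up on type(a) instead.
a.__del__ = lambda: calls.append("instance-level")
del a
gc.collect()
instance_calls = list(calls)


class B:
    def __del__(self):
        calls.append("class-level")


b = B()
del b
gc.collect()
class_calls = list(calls)
print(instance_calls, class_calls)
```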
If you still want to monkey patch a single instance a of class A at runtime, the solution is to dynamically create a class A1 which is derived from A, and then change a's class to the newly-created A1. Yes, this is possible, and a will behave as if nothing has changed - except that now it includes your monkey patched method.
Here's a solution based on a generic function I wrote for another question, "Python method resolution mystery":
def override(p, methods):
    oldType = type(p)
    newType = type(oldType.__name__ + "_Override", (oldType,), methods)
    p.__class__ = newType
class Test(object):
    def __str__(self):
        return "Test"
def p(self):
    print(str(self))
def monkey(x):
    override(x, {"__del__": p})
a=Test()
b=Test()
monkey(a)
print("Deleting a:")
del a
print("Deleting b:")
del b
del a deletes the name 'a' from the namespace, but not the object referenced by that name. See this:
>>> x = 7
>>> y = x
>>> del x
>>> print y
7
Also, some_object.__del__ is not guaranteed to be called at all.
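The same point can be shown with an object that has a __del__ method. This is a sketch that relies on CPython's reference counting for its timing; Noisy and log are illustrative names only.

```python
import gc

log = []


class Noisy:
    def __del__(self):
        log.append("finalized")


a = Noisy()
b = a            # a second name for the same object
del a            # removes the name only; the object is still reachable via b
after_first_del = list(log)
del b            # the last reference is gone
gc.collect()     # for implementations that do not free objects immediately
after_second_del = list(log)
print(after_first_del, after_second_del)
```

On CPython the first del leaves the log empty, and only the second one triggers __del__.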
Also, I already answered your question here (in german).
You can also inherit from some base class and override the __del__ method (then the only thing you would need is to substitute the class when constructing an object).
Or you can use the built-in super().
Edit: This won't actually work, and I'm leaving it here largely as a warning to others.
You can monkey patch an individual object. self will not get passed to functions that you monkey patch in this way, but that's easily remedied with functools.partial.
Example:
import functools
def monkey_class(x):
    x.__class__.__del__ = p
def monkey_object(x):
    x.__del__ = functools.partial(p, x)