Python object losing __dict__ after mixin injection

I am trying to inject a mixin into a class with a decorator. When the code runs, the class no longer has a __dict__ attribute, even though dir(instance) says it has one. I'm not sure where the attribute is disappearing. Is there a way that I can get __dict__, or otherwise find the instance's attributes?
def testDecorator(cls):
    return type(cls.__name__, (Mixin,) + cls.__bases__, dict(cls.__dict__))

class Mixin:
    pass

@testDecorator
class dummyClass:
    def __init__(self):
        self.testVar1 = 'test'
        self.testVar2 = 3.14

inst = dummyClass()
print(dir(inst))
print(inst.__dict__)
This code works if the decorator is commented out, yet causes an error when the decorator is present. Running on Python 3.5.1.

It's not "losing __dict__". What's happening is that your original dummyClass has a __dict__ descriptor intended to retrieve the __dict__ attribute of instances of your original dummyClass, but your decorator puts that descriptor into a new dummyClass that doesn't descend from the original.
It's not safe to use the original __dict__ descriptor with instances of the new class, because there's no inheritance relationship, and instances of the new class could have their __dict__ pointer at a different offset in their memory layout. To fix this, have your decorator create a class that descends from the original instead of copying its __dict__ and __bases__:
def testDecorator(cls):
    return type(cls.__name__, (Mixin, cls), {})
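With that change the original example behaves as expected; a quick check (exact dict output may vary between Python versions):

def testDecorator(cls):
    return type(cls.__name__, (Mixin, cls), {})

class Mixin:
    pass

@testDecorator
class dummyClass:
    def __init__(self):
        self.testVar1 = 'test'
        self.testVar2 = 3.14

inst = dummyClass()
print(inst.__dict__)            # {'testVar1': 'test', 'testVar2': 3.14}
print(isinstance(inst, Mixin))  # True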

Related

How to remove classes from __subclasses__?

When inheriting from a class, the child class is accessible on the parent via the .__subclasses__() method.
class BaseClass:
    pass

class SubClass(BaseClass):
    pass

BaseClass.__subclasses__()
# [<class '__main__.SubClass'>]
However, deleting the child class doesn't seem to remove it from the parent.
del SubClass
BaseClass.__subclasses__()
# [<class '__main__.SubClass'>]
Where does __subclasses__ get its information from? And can I manipulate it?
Or
Is there a proper way to remove a class and have its parent lose the reference to it (e.g. BaseClass.remove_subclass(SubClass))?
The subclass contains references to itself internally, so it continues to exist until it is garbage collected. If you force a garbage collection cycle it will disappear from the __subclasses__():
import gc
gc.collect()
and then it has gone.
However, make sure you have deleted all other references to the class before you force the garbage collection. For example, if you do it interactively and the last output was the subclass list, there will still be a reference to the class in _.
class BaseClass:
    pass

class SubClass(BaseClass):
    pass

print(BaseClass.__subclasses__())
# [<class '__main__.SubClass'>]

del SubClass
import gc
gc.collect()

print(BaseClass.__subclasses__())
# []
Output with Python 3.7 is:
[<class '__main__.SubClass'>]
[]
I should probably also add that while garbage collection works for this simple case you probably shouldn't depend on it in real life: it would be far too easy to accidentally keep a reference to the subclass somewhere in your code and then wonder why the class never goes away.
What you are trying to do here is keep a registry of subclasses so that the factory can return an object of the appropriate class. If you want to be able to add and remove classes from the registry, then I think you have to be explicit. You could still use __subclasses__ to find candidate classes, but keep a flag on each class to show whether it is enabled. Then, instead of just deleting the subclass, set the flag to show the class is no longer in use and then (if you want) delete it, as the sketch below shows.
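For illustration, a minimal sketch of such an explicit registry (the Base/Widget/Gadget names and the _enabled flag are made up for this example):

class Base:
    _enabled = True  # registry flag: set to False to retire a subclass

    @classmethod
    def active_subclasses(cls):
        # __subclasses__() still supplies the candidates; the flag
        # decides which ones the factory may actually use.
        return [sub for sub in cls.__subclasses__() if sub._enabled]

class Widget(Base):
    pass

class Gadget(Base):
    pass

Gadget._enabled = False           # "remove" Gadget without relying on gc
print(Base.active_subclasses())   # [<class '__main__.Widget'>]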
Where does __subclasses__ get its information from?
For the CPython implementation of Python, the type object keeps a list of weak references under PyTypeObject.tp_subclasses. This is marked as "Not inherited. Internal use only" in the docs, so can be treated as an implementation detail of CPython. See also: How is __subclasses__ method implemented in CPython?.
And can I manipulate it?
Any class has a .__bases__ descriptor which, if changed, updates the references in PyTypeObject.tp_subclasses.
.__bases__ can only be manipulated when the class doesn't directly inherit from object. So while:
class BaseClass: pass
class OtherClass(BaseClass): pass
and
class BaseClass: pass
class OtherClass: pass
OtherClass.__bases__ = (BaseClass, )
# TypeError: __bases__ assignment: 'BaseClass' deallocator differs from 'object'
should be equivalent, the second form raises an error. See: https://bugs.python.org/issue672115
You also can't use this to change a class to inherit from object.
class BaseClass: pass
class SubClass(BaseClass): pass
SubClass.__bases__ = (object,)
# TypeError: __bases__ assignment: 'type' object layout differs from 'BaseClass'
You can, however, change the bases of a class to be another class.
class BaseClass: pass
class SubClass(BaseClass): pass
class OtherClass: pass
SubClass.__bases__ = (OtherClass, )
# Or, without defining OtherClass beforehand:
SubClass.__bases__ = (type("OtherClass", (object, ), {}), )
This all updates the parent class:
>>> BaseClass.__subclasses__()
[]

Best way to access a class method from an instance method

class Test(object):
    def __init__(self):
        pass

    def testmethod(self):
        # instance method
        self.task(10)  # type-1: access the class method via self
        cls = self.__class__
        cls.task(20)   # type-2: access the class method via the class

    @classmethod
    def task(cls, val):
        print(val)
I have two ways to access the class method from an instance method:
self.task(10)
or
cls = self.__class__
cls.task(20)
My question is: which one is best, and why?
If the two ways are not the same, which one should I use in which situation?
self.task(10) is definitely the best.
First, both will ultimately end in the same operation for class instances:
__class__ is a special attribute that is guaranteed to exist for a class instance object and is the class of the object (Ref: Python reference manual / Data model / The standard type hierarchy):
Class instances ...Special attributes: __dict__ is the attribute dictionary; __class__ is the instance's class
calling a classmethod with a class instance object actually passes the class of the object to the method (Ref: same chapter of ref. manual):
...When an instance method object is created by retrieving a class method object from a class or instance, its __self__ attribute is the class itself
But the first is simpler and does not require usage of a special attribute.
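To see the equivalence concretely, a small sketch (the Demo class is made up for this illustration):

class Demo:
    @classmethod
    def task(cls, val):
        print(cls, val)

d = Demo()
d.task(1)            # cls is Demo
d.__class__.task(2)  # cls is Demo here too

# In both cases the bound method's __self__ is the class itself:
print(d.task.__self__ is Demo)            # True
print(d.__class__.task.__self__ is Demo)  # True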

Python: Why do I get an exception using super() but not with explicit super class name?

I am getting an exception when I try to access a base class's property using super(), but not when I use the base class name explicitly. Here is the derived class:
from CPSA_TransactionLogOutSet import CPSA_TransactionLogOutSet

class CPSA_TransactionFailureSet(CPSA_TransactionLogOutSet):
    def __init__(self, connection, failedTransactionKey):
        super().__init__(connection)
        CPSA_TransactionLogOutSet.C_TRANS_TYP = "TRANS_FAIL"
        super().C_TRANS_TYP = "TRANS_FAIL"
        super().DefaultTableName = 'CPSA_TRANSACTION_LOG_IN'
        super()._keyFields.append('J_TRANS_SEQ')
but trying to create an instance raises an AttributeError exception:
AttributeError: 'super' object has no attribute 'C_TRANS_TYP'
The base class consists of an __init__() method and a set of properties, only one of which is shown here:
class CPSA_TransactionLogOutSet(Recordset):
    def __init__(self, connection):
        super().__init__(connection)
        self.DefaultTableName = 'CPSA_TRANSACTION_LOG_OUT'

    @property
    def C_TRANS_TYP(self):
        return self.GetValue('C_TRANS_TYP')

    @C_TRANS_TYP.setter
    def C_TRANS_TYP(self, value):
        self.SetValue('C_TRANS_TYP', value)
Why can't I use super() to access the C_TRANS_TYP property?
You don't need to use super() at all because there is no override on the current class. The descriptor will be bound to self without super(). The same applies to the other attributes on self:
def __init__(self, connection, failedTransactionKey):
    super().__init__(connection)
    self.C_TRANS_TYP = "TRANS_FAIL"
    self.DefaultTableName = 'CPSA_TRANSACTION_LOG_IN'
    self._keyFields.append('J_TRANS_SEQ')
super() is only needed to access descriptors that would not otherwise be reachable via self. The normal access path (via the instance) suffices here.
super() can't be used to bind data descriptors in an assignment or del obj.attr statement, because super() objects do not implement __set__ or __delete__. In other words, using super().attribute works for reading the attribute only, never for writing or deleting.
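A minimal demonstration of that limitation (the Base/Child names are made up for illustration):

class Base:
    @property
    def attr(self):
        return self._attr

    @attr.setter
    def attr(self, value):
        self._attr = value

class Child(Base):
    def __init__(self):
        self.attr = 1         # a write via self goes through the setter on Base
        print(super().attr)   # reading via super() works: prints 1
        try:
            super().attr = 2  # writing via super() does not
        except AttributeError as exc:
            print(exc)        # 'super' object has no attribute 'attr'

Child()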
Setting CPSA_TransactionLogOutSet.C_TRANS_TYP is also incorrect; that replaces the descriptor object on the class. By executing that line, you removed the descriptor from the class hierarchy altogether, so neither self.C_TRANS_TYP nor super().C_TRANS_TYP would trigger the property you defined before.

What is a DynamicClassAttribute and how do I use it?

As of Python 3.4, there is a descriptor called DynamicClassAttribute. The documentation states:
types.DynamicClassAttribute(fget=None, fset=None, fdel=None, doc=None)
Route attribute access on a class to __getattr__.
This is a descriptor, used to define attributes that act differently when accessed through an instance and through a class. Instance access remains normal, but access to an attribute through a class will be routed to the class’s __getattr__ method; this is done by raising AttributeError.
This allows one to have properties active on an instance, and have virtual attributes on the class with the same name (see Enum for an example).
New in version 3.4.
It is apparently used in the enum module:
# DynamicClassAttribute is used to provide access to the `name` and
# `value` properties of enum members while keeping some measure of
# protection from modification, while still allowing for an enumeration
# to have members named `name` and `value`. This works because enumeration
# members are not set directly on the enum class -- __getattr__ is
# used to look them up.

@DynamicClassAttribute
def name(self):
    """The name of the Enum member."""
    return self._name_

@DynamicClassAttribute
def value(self):
    """The value of the Enum member."""
    return self._value_
I realise that enums are a little special, but I don't understand how this relates to the DynamicClassAttribute. What does it mean that those attributes are dynamic, how is this different from a normal property, and how do I use a DynamicClassAttribute to my advantage?
New Version:
I was a bit disappointed with the previous answer so I decided to rewrite it a bit:
First have a look at the source code of DynamicClassAttribute and you'll probably notice that it looks very much like the normal property, except for the __get__ method:
def __get__(self, instance, ownerclass=None):
    if instance is None:
        # Here is the difference; the normal property just does: return self
        if self.__isabstractmethod__:
            return self
        raise AttributeError()
    elif self.fget is None:
        raise AttributeError("unreadable attribute")
    return self.fget(instance)
So what this means is: if you access a DynamicClassAttribute (that isn't abstract) on the class, it raises an AttributeError instead of returning self. For instances, instance is not None and __get__ behaves identically to property.__get__.
For normal classes that just results in a visible AttributeError when accessing the attribute:
from types import DynamicClassAttribute

class Fun():
    @DynamicClassAttribute
    def has_fun(self):
        return False

Fun.has_fun
# AttributeError - Traceback (most recent call last)
That by itself is not very helpful until you take a look at the "class attribute lookup" procedure used with metaclasses (I found a nice image of this in this blog).
When an attribute lookup raises an AttributeError and the class has a metaclass, Python looks at the metaclass's __getattr__ method and sees if that can resolve the attribute. To illustrate this with a minimal example:
from types import DynamicClassAttribute

# Metaclass
class Funny(type):
    def __getattr__(self, value):
        print('search in meta')
        # Normally you would implement some ifs/elifs here, or a lookup
        # in a dictionary, but I'll just return the attribute.
        return Funny.dynprop

    # Metaclass's dynprop:
    dynprop = 'Meta'

class Fun(metaclass=Funny):
    def __init__(self, value):
        self._dynprop = value

    @DynamicClassAttribute
    def dynprop(self):
        return self._dynprop
And here comes the "dynamic" part. If you access dynprop on the class, it will search in the meta and return the meta's dynprop:
Fun.dynprop
which prints:
search in meta
'Meta'
So we invoked the metaclass.__getattr__ and returned the original attribute (which was defined with the same name as the new property).
While for instances the dynprop of the Fun-instance is returned:
Fun('Not-Meta').dynprop
we get the overridden attribute:
'Not-Meta'
My conclusion from this is that DynamicClassAttribute is important if you want to allow subclasses to have an attribute with the same name as one used in the metaclass. You'll shadow it on instances, but it's still accessible if you access it on the class.
I did go into the behaviour of Enum in the old version so I left it in here:
Old Version
The DynamicClassAttribute is only useful (I'm not really sure on that point) if you suspect there could be naming conflicts between an attribute that is set on a subclass and a property on the base class.
You'll need to know at least some basics about metaclasses, because this will not work without them: attribute lookup is slightly different with metaclasses (a nice explanation of how class attributes are looked up can be found in this blog post).
Suppose you have:
class Funny(type):
    dynprop = 'Very important meta attribute, do not override'

class Fun(metaclass=Funny):
    def __init__(self, value):
        self._stub = value

    @property
    def dynprop(self):
        return 'Haha, overridden it with {}'.format(self._stub)
and then call:
Fun.dynprop
# <property object at 0x1b3d9fd19a8>
and on the instance we get:
Fun(2).dynprop
'Haha, overridden it with 2'
Bad ... it's lost. But wait, we can use the metaclass's special lookup: let's implement a __getattr__ (fallback) and implement dynprop as a DynamicClassAttribute. According to its documentation, that's its purpose -- to fall back to the __getattr__ when it's accessed on the class:
from types import DynamicClassAttribute

class Funny(type):
    def __getattr__(self, value):
        print('search in meta')
        return Funny.dynprop

    dynprop = 'Meta'

class Fun(metaclass=Funny):
    def __init__(self, value):
        self._dynprop = value

    @DynamicClassAttribute
    def dynprop(self):
        return self._dynprop
now we access the class-attribute:
Fun.dynprop
which prints:
search in meta
'Meta'
So we invoked the metaclass.__getattr__ and returned the original attribute (which was defined with the same name as the new property).
And for instances:
Fun('Not-Meta').dynprop
we get the overridden attribute:
'Not-Meta'
Well, that's not too bad considering we can reroute, using metaclasses, to previously defined but overridden attributes without creating an instance. This example is the opposite of what is done with Enum, where you define attributes on the subclass:
from enum import Enum

class Fun(Enum):
    name = 'me'
    age = 28
    hair = 'brown'
and want to access these subsequently defined attributes by default:
Fun.name
# <Fun.name: 'me'>
but you also want to allow accessing the name attribute that was defined as DynamicClassAttribute (which returns which name the variable actually has):
Fun('me').name
# 'name'
because otherwise how could you access the name of 28?
Fun.hair.age
# <Fun.age: 28>
# BUT:
Fun.hair.name
# returns 'hair'
See the difference? Why doesn't the second one return <Fun.name: 'me'>? That's because of this use of DynamicClassAttribute. So you can shadow the original property but "release" it again later. This behaviour is the reverse of the one shown in my example and requires at least the use of __new__ and __prepare__. But for that you need to know exactly how that works; it's explained in a lot of blogs and stackoverflow answers that can do it much better than I can, so I'll skip going into that much depth (and I'm not sure if I could explain it in short order).
Actual use-cases might be sparse, but given time one can probably think of some...
Very nice discussion on the documentation of DynamicClassAttribute: "we added it because we needed it"
What is a DynamicClassAttribute
A DynamicClassAttribute is a descriptor that is similar to property. Dynamic is part of the name because you get different results based on whether you access it via the class or via the instance:
- instance access is identical to property and simply runs whatever method was decorated, returning its result
- class access raises an AttributeError; when this happens, Python then searches every parent class (via the mro) looking for that attribute -- and when it doesn't find it, it calls the class's metaclass's __getattr__ for one last shot at finding the attribute. __getattr__ can, of course, do whatever it wants -- in the case of EnumMeta, __getattr__ looks in the class's _member_map_ to see if the requested attribute is there, and returns it if it is. As a side note: all that searching had a severe performance impact, which is why we ended up putting all members that did not have name conflicts with DynamicClassAttributes in the Enum class's __dict__ after all.
and how do I use it?
You use it just like you would property -- the only difference is that you use it when creating a base class for other Enums. As an example, the Enum from aenum [1] has three reserved names:
- name
- value
- values
values is there to support Enum members with multiple values. That class is effectively:
class Enum(metaclass=EnumMeta):

    @DynamicClassAttribute
    def name(self):
        return self._name_

    @DynamicClassAttribute
    def value(self):
        return self._value_

    @DynamicClassAttribute
    def values(self):
        return self._values_
and now any aenum.Enum can have a values member without messing up Enum.<member>.values.
[1] Disclosure: I am the author of the Python stdlib Enum, the enum34 backport, and the Advanced Enumeration (aenum) library.

Why define both '__new__' and '__init__' in a class?

I think you can define either '__init__' or '__new__' in a class, but why are both defined in django.utils.datastructures.py?
my code:
class a(object):
    def __init__(self):
        print 'aaa'
    def __new__(self):
        print 'sss'

a()  # prints 'sss'

class b:
    def __init__(self):
        print 'aaa'
    def __new__(self):
        print 'sss'

b()  # prints 'aaa'
datastructures.py:
class SortedDict(dict):
    """
    A dictionary that keeps its keys in the order in which they're inserted.
    """
    def __new__(cls, *args, **kwargs):
        instance = super(SortedDict, cls).__new__(cls, *args, **kwargs)
        instance.keyOrder = []
        return instance

    def __init__(self, data=None):
        if data is None:
            data = {}
        super(SortedDict, self).__init__(data)
        if isinstance(data, dict):
            self.keyOrder = data.keys()
        else:
            self.keyOrder = []
            for key, value in data:
                if key not in self.keyOrder:
                    self.keyOrder.append(key)
And under what circumstances will SortedDict.__init__ be called? Thanks.
You can define either or both of __new__ and __init__.
__new__ must return an object -- which can be a new one (typically that task is delegated to type.__new__), an existing one (to implement singletons, "recycle" instances from a pool, and so on), or even one that's not an instance of the class. If __new__ returns an instance of the class (new or existing), __init__ then gets called on it; if __new__ returns an object that's not an instance of the class, then __init__ is not called.
__init__ is passed a class instance as its first argument (in the same state __new__ returned it, i.e., typically "empty") and must alter it as needed to make it ready for use (most often by adding attributes).
In general it's best to use __init__ for all it can do -- and __new__, if something is left that __init__ can't do, for that "extra something".
So you'll typically define both if there's something useful you can do in __init__, but not everything you want to happen when the class gets instantiated.
For example, consider a class that subclasses int but also has a foo slot -- and you want it to be instantiated with an initializer for the int and one for the .foo. As int is immutable, that part has to happen in __new__, so pedantically one could code:
>>> class x(int):
...     def __new__(cls, i, foo):
...         self = int.__new__(cls, i)
...         return self
...     def __init__(self, i, foo):
...         self.foo = foo
...     __slots__ = 'foo',
...
>>> a = x(23, 'bah')
>>> print a
23
>>> print a.foo
bah
>>>
In practice, for a case this simple, nobody would mind if you lost the __init__ and just moved the self.foo = foo to __new__. But if initialization is rich and complex enough to be best placed in __init__, this idea is worth keeping in mind.
__new__ and __init__ do completely different things. The method __init__ initializes a new instance of a class --- it plays the role of the constructor. __new__ is a far more subtle thing --- it can change the arguments and, in fact, the class of the created object. For example, the following code:
class Meters(object):
    def __new__(cls, value):
        return int(value / 3.28083)
If you call Meters(6) you will not actually create an instance of Meters, but an instance of int. You might wonder why this is useful; it is actually crucial to metaclasses, an admittedly obscure (but powerful) feature.
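A quick check of that behaviour (a sketch in Python 3 syntax, while the answer's code is Python 2):

class Meters(object):
    def __new__(cls, value):
        # returns a plain int, not a Meters instance
        return int(value / 3.28083)

m = Meters(6)
print(m, type(m))  # 1 <class 'int'>
# Because __new__ returned a non-Meters object, __init__ would not
# have been called even if Meters defined one.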
You'll note that in Python 2.x, only classes inheriting from object can take advantage of __new__, as your code above shows.
The use of __new__ you showed in django seems to be an attempt to keep a sane method resolution order on SortedDict objects. I will admit, though, that it is often hard to tell why __new__ is necessary. Standard Python style suggests that it not be used unless necessary (as always, better class design is the tool you turn to first).
My only guess is that in this case, they (the author of this class) want the keyOrder list to exist on the instance even before SortedDict.__init__ is called.
Note that SortedDict calls super() in its __init__; this would ordinarily go to dict.__init__, which would probably call __setitem__ and the like to start adding items. SortedDict.__setitem__ expects the .keyOrder property to exist, and therein lies the problem (since .keyOrder isn't normally created until after the call to super()). It's possible this is just an issue with subclassing dict, because my normal gut instinct would be to just initialize .keyOrder before the call to super().
The code in __new__ might also be used to allow SortedDict to be subclassed in a diamond inheritance structure where it is possible SortedDict.__init__ is not called before __setitem__ and the like are first called. Django has to contend with various issues in supporting a wide range of Python versions from 2.3 up; it's possible this code is completely unnecessary in some versions and needed in others.
There is a common use for defining both __new__ and __init__: accessing class attributes that may be eclipsed by their instance versions, without having to use type(self) or self.__class__ (which, in the presence of metaclasses, may not even be the right thing).
For example:
class MyClass(object):
    creation_counter = 0

    def __new__(cls, *args, **kwargs):
        cls.creation_counter += 1
        return super(MyClass, cls).__new__(cls)

    def __init__(self):
        print "I am the %dth myclass to be created!" % self.creation_counter
Finally, __new__ can actually return an instance of a wrapper or a completely different class from what you thought you were instantiating. This is used to provide metaclass-like features without actually needing a metaclass.
In my opinion, there was no need to override __new__ in the example you described.
Creation of an instance and the actual memory allocation happen in __new__; __init__ is called after __new__ and is meant for the initialization of the instance, serving the job of a constructor in classical OOP terms. So, if all you want to do is initialize variables, you should override __init__.
The real role of __new__ comes into play when you are using metaclasses. There, if you want to do something like changing attributes or adding attributes that must happen before the class is created, you should override __new__.
Consider a completely hypothetical case where you want to make some attributes of a class private, even though they are not defined so (I'm not saying one should ever do that).
class PrivateMetaClass(type):
    def __new__(metaclass, classname, bases, attrs):
        private_attributes = ['name', 'age']
        for private_attribute in private_attributes:
            if attrs.get(private_attribute):
                attrs['_' + private_attribute] = attrs[private_attribute]
                attrs.pop(private_attribute)
        return super(PrivateMetaClass, metaclass).__new__(metaclass, classname, bases, attrs)

class Person(object):
    __metaclass__ = PrivateMetaClass

    name = 'Someone'
    age = 19

person = Person()
>>> hasattr(person, 'name')
False
>>> person._name
'Someone'
Again, it's just for instructional purposes; I'm not suggesting one should do anything like this.
