Following is the __init__ method of the Local class from the werkzeug library:
def __init__(self):
    object.__setattr__(self, '__storage__', {})
    object.__setattr__(self, '__ident_func__', get_ident)
I don't understand two things about this code:
Why did they write
object.__setattr__(self, '__storage__', {})
instead of simply
setattr(self, '__storage__', {})
Why did they even use __setattr__ if they could simply write
self.__storage__ = {}
This ensures that the default Python definition of __setattr__ is used. It's generally used if the class has overridden __setattr__ to perform non-standard behaviour, but you still wish to access the original __setattr__ behaviour.
In the case of werkzeug, if you look at the Local class you'll see __setattr__ is defined like this:
def __setattr__(self, name, value):
    ident = self.__ident_func__()
    storage = self.__storage__
    try:
        storage[ident][name] = value
    except KeyError:
        storage[ident] = {name: value}
Instead of setting attributes in the object's __dict__, it sets them in the __storage__ dictionary that was initialized in the constructor. In order to set the __storage__ attribute at all (so that it can be accessed as self.__storage__ later), the original definition of __setattr__ from object must be used, which is why the awkward notation appears in the constructor.
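To see why plain setattr would not help here: setattr(obj, name, value) is just another spelling of obj.name = value, so both go through the overridden method; only object.__setattr__ skips it. A minimal sketch (the Demo class is hypothetical, just for illustration):
class Demo(object):
    def __setattr__(self, name, value):
        print('custom __setattr__ called for %s' % name)

d = Demo()
d.x = 1                        # custom __setattr__ called for x
setattr(d, 'y', 2)             # custom __setattr__ called for y -- same mechanism
object.__setattr__(d, 'z', 3)  # bypasses the override: stored in d.__dict__
print(d.z)                     # 3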
They want to explicitly use the base object.__setattr__ implementation instead of a possibly overridden instance method elsewhere in the inheritance chain. Local implements its own __setattr__, so this avoids calling it.
Because the same class defines __setattr__, and the constructor needs to bypass it: the first line of that method reads self.__ident_func__(), which wouldn't work yet while __ident_func__ is still unset.
Related
As of Python 3.4, there is a descriptor called DynamicClassAttribute. The documentation states:
types.DynamicClassAttribute(fget=None, fset=None, fdel=None, doc=None)
Route attribute access on a class to __getattr__.
This is a descriptor, used to define attributes that act differently when accessed through an instance and through a class. Instance access remains normal, but access to an attribute through a class will be routed to the class’s __getattr__ method; this is done by raising AttributeError.
This allows one to have properties active on an instance, and have virtual attributes on the class with the same name (see Enum for an example).
New in version 3.4.
It is apparently used in the enum module:
# DynamicClassAttribute is used to provide access to the `name` and
# `value` properties of enum members while keeping some measure of
# protection from modification, while still allowing for an enumeration
# to have members named `name` and `value`. This works because enumeration
# members are not set directly on the enum class -- __getattr__ is
# used to look them up.
@DynamicClassAttribute
def name(self):
    """The name of the Enum member."""
    return self._name_

@DynamicClassAttribute
def value(self):
    """The value of the Enum member."""
    return self._value_
I realise that enums are a little special, but I don't understand how this relates to the DynamicClassAttribute. What does it mean that those attributes are dynamic, how is this different from a normal property, and how do I use a DynamicClassAttribute to my advantage?
New Version:
I was a bit disappointed with the previous answer so I decided to rewrite it a bit:
First have a look at the source code of DynamicClassAttribute and you'll probably notice that it looks very much like the normal property, except for the __get__ method:
def __get__(self, instance, ownerclass=None):
    if instance is None:
        # Here is the difference; the normal property just does: return self
        if self.__isabstractmethod__:
            return self
        raise AttributeError()
    elif self.fget is None:
        raise AttributeError("unreadable attribute")
    return self.fget(instance)
So what this means is that if you access a DynamicClassAttribute (that isn't abstract) on the class, it raises an AttributeError instead of returning self. For instances, instance is not None, and the __get__ is identical to property.__get__.
For normal classes that simply results in a visible AttributeError when accessing the attribute:
from types import DynamicClassAttribute

class Fun():
    @DynamicClassAttribute
    def has_fun(self):
        return False

Fun.has_fun
AttributeError - Traceback (most recent call last)
That by itself is not very helpful until you take a look at the "Class attribute lookup" procedure when using metaclasses (I found a nice image of this in this blog).
Because if an attribute access raises an AttributeError and the class has a metaclass, Python looks at the metaclass's __getattr__ method and sees if that can resolve the attribute. To illustrate this with a minimal example:
from types import DynamicClassAttribute

# Metaclass
class Funny(type):
    def __getattr__(self, value):
        print('search in meta')
        # Normally you would implement some ifs/elifs or a dictionary lookup here,
        # but I'll just return the attribute
        return Funny.dynprop

    # Metaclass's dynprop:
    dynprop = 'Meta'

class Fun(metaclass=Funny):
    def __init__(self, value):
        self._dynprop = value

    @DynamicClassAttribute
    def dynprop(self):
        return self._dynprop
And here comes the "dynamic" part. If you call the dynprop on the class it will search in the meta and return the meta's dynprop:
Fun.dynprop
which prints:
search in meta
'Meta'
So we invoked the metaclass.__getattr__ and returned the original attribute (which was defined with the same name as the new property).
While for instances the dynprop of the Fun-instance is returned:
Fun('Not-Meta').dynprop
we get the overridden attribute:
'Not-Meta'
My conclusion from this is that DynamicClassAttribute is important if you want to allow subclasses to have an attribute with the same name as one used in the metaclass. You'll shadow it on instances, but it's still accessible if you access it on the class.
I did go into the behaviour of Enum in the old version so I left it in here:
Old Version
The DynamicClassAttribute is just useful (I'm not really sure on that point) if you suspect there could be naming conflicts between an attribute that is set on a subclass and a property on the base-class.
You'll need to know at least some basics about metaclasses, because this will not work without them (a nice explanation of how class attributes are looked up can be found in this blog post): the attribute lookup is slightly different with metaclasses.
Suppose you have:
class Funny(type):
    dynprop = 'Very important meta attribute, do not override'

class Fun(metaclass=Funny):
    def __init__(self, value):
        self._stub = value

    @property
    def dynprop(self):
        return 'Haha, overridden it with {}'.format(self._stub)
and then call:
Fun.dynprop
<property object at 0x1b3d9fd19a8>
and on the instance we get:
Fun(2).dynprop
'Haha, overridden it with 2'
Too bad ... it's lost. But wait, we can use the metaclass's special lookup: let's implement a __getattr__ (fallback) and implement dynprop as a DynamicClassAttribute, because according to its documentation that's its purpose: to fall back to the __getattr__ if it's accessed on the class:
from types import DynamicClassAttribute

class Funny(type):
    def __getattr__(self, value):
        print('search in meta')
        return Funny.dynprop

    dynprop = 'Meta'

class Fun(metaclass=Funny):
    def __init__(self, value):
        self._dynprop = value

    @DynamicClassAttribute
    def dynprop(self):
        return self._dynprop
now we access the class-attribute:
Fun.dynprop
which prints:
search in meta
'Meta'
So we invoked the metaclass.__getattr__ and returned the original attribute (which was defined with the same name as the new property).
And for instances:
Fun('Not-Meta').dynprop
we get the overridden attribute:
'Not-Meta'
Well, that's not too bad, considering we can use metaclasses to reroute to previously defined but overridden attributes without creating an instance. This example is the opposite of what is done with Enum, where you define attributes on the subclass:
from enum import Enum

class Fun(Enum):
    name = 'me'
    age = 28
    hair = 'brown'
and want to access these subsequently defined attributes by default.
Fun.name
# <Fun.name: 'me'>
but you also want to allow accessing the name attribute that was defined as DynamicClassAttribute (which returns which name the variable actually has):
Fun('me').name
# 'name'
because otherwise how could you access the name of 28?
Fun.hair.age
# <Fun.age: 28>
# BUT:
Fun.hair.name
# returns 'hair'
See the difference? Why doesn't the second one return <Fun.name: 'me'>? That's because of this use of DynamicClassAttribute: you can shadow the original property but "release" it again later. This behaviour is the reverse of that shown in my example and requires at least the use of __new__ and __prepare__. But for that you need to know exactly how that works; it is explained in a lot of blogs and Stack Overflow answers that can explain it much better than I can, so I'll skip going into that much depth (and I'm not sure if I could solve it in short order).
Actual use-cases might be sparse, but given time one can probably think of some...
Very nice discussion on the documentation of DynamicClassAttribute: "we added it because we needed it"
What is a DynamicClassAttribute
A DynamicClassAttribute is a descriptor that is similar to property. Dynamic is part of the name because you get different results based on whether you access it via the class or via the instance:
instance access is identical to property and simply runs whatever method was decorated, returning its result
class access raises an AttributeError; when this happens Python then searches every parent class (via the MRO) looking for that attribute -- when it doesn't find it, it calls the class's metaclass's __getattr__ for one last shot at finding the attribute. __getattr__ can, of course, do whatever it wants -- in the case of EnumMeta, __getattr__ looks in the class's _member_map_ to see if the requested attribute is there, and returns it if it is. As a side note: all that searching had a severe performance impact, which is why we ended up putting all members that did not have name conflicts with DynamicClassAttributes in the Enum class's __dict__ after all.
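A simplified sketch of that last lookup step, as just described (the real EnumMeta.__getattr__ has more guards, so treat this as an approximation rather than the stdlib code):
class EnumMeta(type):
    def __getattr__(cls, name):
        # Only reached after normal class attribute lookup has failed,
        # e.g. because a DynamicClassAttribute raised AttributeError.
        try:
            return cls._member_map_[name]   # the name-to-member dict
        except KeyError:
            raise AttributeError(name)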
and how do I use it?
You use it just like you would property -- the only difference is that you use it when creating a base class for other Enums. As an example, the Enum from aenum1 has three reserved names:
name
value
values
values is there to support Enum members with multiple values. That class is effectively:
class Enum(metaclass=EnumMeta):
    @DynamicClassAttribute
    def name(self):
        return self._name_

    @DynamicClassAttribute
    def value(self):
        return self._value_

    @DynamicClassAttribute
    def values(self):
        return self._values_
and now any aenum.Enum can have a values member without messing up Enum.<member>.values.
1 Disclosure: I am the author of the Python stdlib Enum, the enum34 backport, and the Advanced Enumeration (aenum) library.
I tried to create a dynamic object to validate my config on the fly and present the result as an object. I tried to achieve this by creating the following class:
class SubConfig(object):
    def __init__(self, config, key_types):
        self.__config = config
        self.__values = {}
        self.__key_types = key_types

    def __getattr__(self, item):
        if item in self.__key_types:
            return self.__values[item] or None
        else:
            raise ValueError("No such item to get from config")

    def __setattr__(self, item, value):
        if self.__config._blocked:
            raise ValueError("Can't change values after service has started")
        if item in self.__key_types:
            if type(value) in self.__key_types[item]:
                self.__values[item] = value
            else:
                raise ValueError("Can't assign a value of a different type than declared!")
        else:
            raise ValueError("No such item to set in config")
SubConfig is a wrapper for a section of a config file. The config has a switch that disables changing values after the program has started (you can change values only during initialization).
The problem is that when I set any value, it gets into an infinite loop in __getattr__. As I read it, __getattr__ shouldn't behave like that (it should first look up an existing attribute, and only then call __getattr__). I was comparing my code with available examples but I can't figure it out.
I noticed that all the problems are generated by my constructor.
The problem is that your constructor, while initialising the object, calls __setattr__, which then calls __getattr__ because the private __ members aren't initialised yet.
There are two ways I can think of to work around this:
One option is to call down to object.__setattr__, thereby avoiding your __setattr__, or equivalently to use super(SubConfig, self).__setattr__(...) in __init__. You could also set values in self.__dict__ directly. A problem here is that because you're using double underscores you'd have to mangle the attribute names manually (so '__config' becomes '_SubConfig__config'):
def __init__(self, config, key_types):
    super(SubConfig, self).__setattr__('_SubConfig__config', config)
    super(SubConfig, self).__setattr__('_SubConfig__values', {})
    super(SubConfig, self).__setattr__('_SubConfig__key_types', key_types)
An alternative is to have __setattr__ detect and pass through access to attribute names that begin with _, i.e.
if item.startswith('_'):
    return super(SubConfig, self).__setattr__(item, value)
This is more Pythonic in that if someone has a good reason to access your object's internals, you have no reason to try to stop them.
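Putting the pass-through together with the original logic, a minimal, untested sketch of the fixed class might look like this (note that the name-mangled internals such as _SubConfig__config all start with an underscore, so the check covers them):
class SubConfig(object):
    def __init__(self, config, key_types):
        # These assignments mangle to '_SubConfig__config' etc., which the
        # underscore check in __setattr__ passes straight through.
        self.__config = config
        self.__values = {}
        self.__key_types = key_types

    def __getattr__(self, item):
        if item in self.__key_types:
            return self.__values.get(item)  # .get avoids a KeyError for unset keys
        raise ValueError("No such item to get from config")

    def __setattr__(self, item, value):
        if item.startswith('_'):
            # Internal state: fall back to the default machinery.
            return super(SubConfig, self).__setattr__(item, value)
        if self.__config._blocked:
            raise ValueError("Can't change values after service has started")
        if item not in self.__key_types:
            raise ValueError("No such item to set in config")
        if type(value) not in self.__key_types[item]:
            raise ValueError("Can't assign a value of a different type than declared!")
        self.__values[item] = value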
Cf. ecatmur's answer for the root cause - and remember that __setattr__ is not symmetrical to __getattr__: it is unconditionally called on each and every attempt to bind an object's attribute. Overriding __setattr__ is tricky and should not be done if you don't clearly understand the pros and cons.
Now for a simple practical solution to your use case: rewrite your initializer to avoid triggering setattr calls:
class SubConfig(object):
    def __init__(self, config, key_types):
        self.__dict__.update(
            _SubConfig__config=config,
            _SubConfig__values={},
            _SubConfig__key_types=key_types
        )
Note that I renamed your attributes to emulate the name-mangling that happens when using the double leading underscores naming scheme.
I just spent too long on a bug like the following:
>>> class Odp():
...     def __init__(self):
...         self.foo = "bar"
...
>>> o = Odp()
>>> o.raw_foo = 3  # oops - meant o.foo
I have a class with an attribute. I was trying to set it and wondering why it had no effect. Then I went back to the original class definition and saw that the attribute was named something slightly different. Thus, I was creating/setting a new attribute instead of the one I meant to.
First off, isn't this exactly the type of error that statically-typed languages are supposed to prevent? In this case, what is the advantage of dynamic typing?
Secondly, is there a way I could have forbidden this when defining Odp, and thus saved myself the trouble?
You can implement a __setattr__ method for the purpose -- that's much more robust than __slots__, which is often misused for this (for example, __slots__ is automatically "lost" when the class is inherited from, while __setattr__ survives unless explicitly overridden).
def __setattr__(self, name, value):
    if hasattr(self, name):
        object.__setattr__(self, name, value)
    else:
        raise TypeError('Cannot set name %r on object of type %s' % (
            name, self.__class__.__name__))
You'll have to make sure the hasattr succeeds for the names you do want to be able to set, for example by setting the attributes at a class level or by using object.__setattr__ in your __init__ method rather than direct attribute assignment. (To forbid setting attributes on a class rather than its instances you'll have to define a custom metaclass with a similar special method).
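Wired into the Odp class from the question, that might look like this (a sketch; the class-level default is there so hasattr succeeds for foo):
class Odp(object):
    foo = None  # declared at class level so hasattr(self, 'foo') is True

    def __setattr__(self, name, value):
        if hasattr(self, name):
            object.__setattr__(self, name, value)
        else:
            raise TypeError('Cannot set name %r on object of type %s' % (
                name, self.__class__.__name__))

o = Odp()
o.foo = "bar"   # fine
o.raw_foo = 3   # raises TypeError -- the typo is caught immediately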
I think you can define either '__init__' or '__new__' in a class, but why are both defined in django.utils.datastructures.py?
my code:
class a(object):
    def __init__(self):
        print 'aaa'
    def __new__(self):
        print 'sss'

a()  # prints 'sss'

class b:
    def __init__(self):
        print 'aaa'
    def __new__(self):
        print 'sss'

b()  # prints 'aaa'
datastructures.py:
class SortedDict(dict):
    """
    A dictionary that keeps its keys in the order in which they're inserted.
    """
    def __new__(cls, *args, **kwargs):
        instance = super(SortedDict, cls).__new__(cls, *args, **kwargs)
        instance.keyOrder = []
        return instance

    def __init__(self, data=None):
        if data is None:
            data = {}
        super(SortedDict, self).__init__(data)
        if isinstance(data, dict):
            self.keyOrder = data.keys()
        else:
            self.keyOrder = []
            for key, value in data:
                if key not in self.keyOrder:
                    self.keyOrder.append(key)
And under what circumstances will SortedDict.__init__ be called?
Thanks
You can define either or both of __new__ and __init__.
__new__ must return an object -- which can be a new one (typically that task is delegated to type.__new__), an existing one (to implement singletons, "recycle" instances from a pool, and so on), or even one that's not an instance of the class. If __new__ returns an instance of the class (new or existing), __init__ then gets called on it; if __new__ returns an object that's not an instance of the class, then __init__ is not called.
__init__ is passed a class instance as its first item (in the same state __new__ returned it, i.e., typically "empty") and must alter it as needed to make it ready for use (most often by adding attributes).
In general it's best to use __init__ for all it can do -- and __new__, if something is left that __init__ can't do, for that "extra something".
So you'll typically define both if there's something useful you can do in __init__, but not everything you want to happen when the class gets instantiated.
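For instance, the "existing instance" case mentioned earlier can be sketched as a minimal singleton (a sketch only; the class name is hypothetical):
class Singleton(object):
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            # Delegate the actual allocation to object.__new__.
            cls._instance = super(Singleton, cls).__new__(cls)
        # Note: since this returns an instance of the class, __init__
        # still runs on every call.
        return cls._instance

assert Singleton() is Singleton()  # both calls return the same object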
For example, consider a class that subclasses int but also has a foo slot -- and you want it to be instantiated with an initializer for the int and one for the .foo. As int is immutable, that part has to happen in __new__, so pedantically one could code:
>>> class x(int):
...     def __new__(cls, i, foo):
...         self = int.__new__(cls, i)
...         return self
...     def __init__(self, i, foo):
...         self.foo = foo
...     __slots__ = 'foo',
...
>>> a = x(23, 'bah')
>>> print a
23
>>> print a.foo
bah
>>>
In practice, for a case this simple, nobody would mind if you lost the __init__ and just moved the self.foo = foo to __new__. But if initialization is rich and complex enough to be best placed in __init__, this idea is worth keeping in mind.
__new__ and __init__ do completely different things. The method __init__ initializes a new instance of a class -- it is a constructor. __new__ is a far more subtle thing -- it can change the arguments and, in fact, the class of the initiated object. For example, the following code:
class Meters(object):
    def __new__(cls, value):
        return int(value / 3.28083)
If you call Meters(6) you will not actually create an instance of Meters, but an instance of int. You might wonder why this is useful; it is actually crucial to metaclasses, an admittedly obscure (but powerful) feature.
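Running it under Python 2 (to match the surrounding examples) should make both points visible: the result is a plain int, and because __new__ did not return a Meters instance, __init__ is never called:
>>> m = Meters(6)
>>> m
1
>>> type(m)   # not a Meters instance at all
<type 'int'>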
You'll note that in Python 2.x, only classes inheriting from object can take advantage of __new__, as your code above shows.
The use of __new__ you showed in django seems to be an attempt to keep a sane method resolution order on SortedDict objects. I will admit, though, that it is often hard to tell why __new__ is necessary. Standard Python style suggests that it not be used unless necessary (as always, better class design is the tool you turn to first).
My only guess is that in this case, they (author of this class) want the keyOrder list to exist on the class even before SortedDict.__init__ is called.
Note that SortedDict calls super() in its __init__, this would ordinarily go to dict.__init__, which would probably call __setitem__ and the like to start adding items. SortedDict.__setitem__ expects the .keyOrder property to exist, and therein lies the problem (since .keyOrder isn't normally created until after the call to super().) It's possible this is just an issue with subclassing dict because my normal gut instinct would be to just initialize .keyOrder before the call to super().
The code in __new__ might also be used to allow SortedDict to be subclassed in a diamond inheritance structure where it is possible SortedDict.__init__ is not called before the first __setitem__ and the like are called. Django has to contend with various issues in supporting a wide range of Python versions from 2.3 up; it's possible this code is completely unnecessary in some versions and needed in others.
There is a common use for defining both __new__ and __init__: accessing class properties which may be eclipsed by their instance versions, without having to do type(self) or self.__class__ (which, in the presence of metaclasses, may not even be the right thing).
For example:
class MyClass(object):
    creation_counter = 0

    def __new__(cls, *args, **kwargs):
        cls.creation_counter += 1
        return super(MyClass, cls).__new__(cls)

    def __init__(self):
        print "I am the %dth myclass to be created!" % self.creation_counter
Finally, __new__ can actually return an instance of a wrapper or a completely different class from what you thought you were instantiating. This is used to provide metaclass-like features without actually needing a metaclass.
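A minimal sketch of that last point (class names hypothetical): because __new__ returns something that is not an instance of the class, Python skips __init__ entirely, as noted earlier:
class Wrapper(object):
    def __init__(self, wrapped):
        self.wrapped = wrapped

class Logged(object):
    def __new__(cls, *args, **kwargs):
        instance = super(Logged, cls).__new__(cls)
        return Wrapper(instance)  # not a Logged, so Logged.__init__ is skipped

    def __init__(self):
        print "never reached"

obj = Logged()
print type(obj)  # <class '__main__.Wrapper'>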
In my opinion, there was no need of overriding __new__ in the example you described.
Creation of an instance and actual memory allocation happen in __new__; __init__ is called after __new__ and is meant for initializing the instance, serving the job of the constructor in classical OOP terms. So, if all you want to do is initialize variables, then you should go for overriding __init__.
The real role of __new__ comes into play when you are using metaclasses: there, if you want to do something like changing or adding attributes, and that must happen before the creation of the class, you should go for overriding __new__.
Consider a completely hypothetical case where you want to make some attributes of a class private, even though they are not defined so (I'm not saying one should ever do that).
class PrivateMetaClass(type):
    def __new__(metaclass, classname, bases, attrs):
        private_attributes = ['name', 'age']
        for private_attribute in private_attributes:
            if attrs.get(private_attribute):
                attrs['_' + private_attribute] = attrs[private_attribute]
                attrs.pop(private_attribute)
        return super(PrivateMetaClass, metaclass).__new__(metaclass, classname, bases, attrs)

class Person(object):
    __metaclass__ = PrivateMetaClass

    name = 'Someone'
    age = 19

person = Person()

>>> hasattr(person, 'name')
False
>>> person._name
'Someone'
Again, it's just for instructional purposes; I'm not suggesting one should do anything like this.
This one seems a bit tricky to me. Some time ago I already managed to overwrite an instance's method with something like:
def my_method(self, attr):
    pass

instancemethod = type(self.method_to_overwrite)
self.method_to_overwrite = instancemethod(my_method, self, self.__class__)
which worked very well for me; but now I'm trying to overwrite an instance's __getattribute__() function, which doesn't work for me because the method seems to be of type
<type 'method-wrapper'>
Is it possible to do anything about that? I couldn't find any decent Python documentation on method-wrapper.
You want to override the attribute lookup algorithm on a per-instance basis? Without knowing why you are trying to do this, I would hazard a guess that there is a cleaner, less convoluted way of doing what you need to do. If you really need to, then as Aaron said, you'll need to install a redirecting __getattribute__ handler on the class, because Python looks up special methods only on the class, ignoring anything defined on the instance.
You also have to be extra careful about not getting into infinite recursion:
class FunkyAttributeLookup(object):
    def __getattribute__(self, key):
        try:
            # Look up the per-instance function via object's attribute lookup
            # to avoid infinite recursion.
            getter = object.__getattribute__(self, 'instance_getattribute')
            return getter(key)
        except AttributeError:
            return object.__getattribute__(self, key)

f = FunkyAttributeLookup()
f.instance_getattribute = lambda attr: attr.upper()
print(f.foo)  # FOO
Also, if you are overriding methods on your instance, you don't need to instantiate the method object yourself; you can either use the descriptor protocol on functions (which generates the methods) or just curry the self argument.
# descriptor protocol
self.method_to_overwrite = my_method.__get__(self, type(self))

# or curry
from functools import partial
self.method_to_overwrite = partial(my_method, self)
You can't overwrite special methods at instance level. For new-style classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary.
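A quick illustration of that rule, using __len__ because it is easy to trigger implicitly:
>>> class C(object):
...     pass
...
>>> c = C()
>>> c.__len__ = lambda: 5   # lands in c.__dict__ ...
>>> len(c)                  # ... but the implicit lookup only checks type(c)
Traceback (most recent call last):
  ...
TypeError: object of type 'C' has no len()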
There are a couple of methods which you can't overwrite this way, and __getattribute__() is one of them.
I believe method-wrapper is a wrapper around a method written in C.
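For what it's worth, you can see that type in a CPython 2 session; method-wrapper is CPython's type for bound special methods implemented in C:
>>> o = object()
>>> o.__getattribute__
<method-wrapper '__getattribute__' of object object at 0x...>
>>> type(o.__getattribute__)
<type 'method-wrapper'>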