Composing descriptors in python

Background
In Python, a descriptor is an object that defines any of __get__, __set__ or __delete__, and sometimes also __set_name__. The most common use of descriptors in Python is probably property(getter, setter, deleter, description). The property descriptor calls the given getter, setter, and deleter when the respective descriptor methods are called.
It's interesting to note that functions are also descriptors: they define __get__, which returns a bound method when called with an instance.
Descriptors are used to modify what happens when an object's attributes are accessed. Examples include restricting access, logging attribute access, or even dynamic lookup from a database.
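To make the point about functions concrete, here is a minimal sketch (greet and Greeter are made-up names, not part of the question). Calling __get__ on a plain function by hand should give the same kind of bound method that ordinary attribute access produces:

```python
def greet(self):
    return f"hello from {type(self).__name__}"


class Greeter:
    hello = greet  # stored in the class as a plain function


g = Greeter()
manual = greet.__get__(g, Greeter)  # invoke the descriptor protocol by hand
automatic = g.hello                 # let attribute access do it for us

# Both bound methods wrap the same function and the same instance.
same_target = manual.__self__ is automatic.__self__ is g
message = manual()
```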
Problem
My question is: how do I design descriptors that are composable?
For example:
Say I have a Restricted descriptor (that only allows setting and getting when a condition of some sort is met), and an AccessLog descriptor (that logs every time the property is "set" or "get"). Can I design those so that I can compose their functionality when using them?
Say my example usage looks like this:
class ExampleClient:
    # use them combined, preferably in any order
    # (and there could be a special way to combine them,
    # although functional composition makes the most sense)
    # note: descriptors must be assigned with "=", a bare annotation does not install them
    foo = Restricted(AccessLog())
    bar = AccessLog(Restricted())
    # and also use them separately
    qux = Restricted()
    quo = AccessLog()
I'm looking for a way to make this into a re-usable pattern, so I can make any descriptor composable. Any advice on how to do this in a pythonic manner? I'm going to experiment with a few ideas myself, and see what works, but I was wondering if this has been tried already, and if there is sort of a "best practice" or common method for this sort of thing...

You can probably make that work. The tricky part might be figuring out what the default behavior should be for your descriptors if they don't have a "child" descriptor to delegate to. Maybe you want to default to behaving like a normal instance variable?
class DelegateDescriptor:
    def __init__(self, child=None):
        self.child = child
        self.name = None
    def __set_name__(self, owner, name):
        self.name = name
        if self.child is not None:
            try:
                self.child.__set_name__(owner, name)
            except AttributeError:
                pass
    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.child is not None:
            return self.child.__get__(instance, owner)
        try:
            return instance.__dict__[self.name] # default behavior, lookup the value
        except KeyError:
            raise AttributeError
    def __set__(self, instance, value):
        if self.child is not None:
            self.child.__set__(instance, value)
        else:
            instance.__dict__[self.name] = value # default behavior, store the value
    def __delete__(self, instance):
        if self.child is not None:
            self.child.__delete__(instance)
        else:
            try:
                del instance.__dict__[self.name] # default behavior, remove value
            except KeyError:
                raise AttributeError
Now, this descriptor doesn't actually do anything other than store a value or delegate to another descriptor. However, your actual Restricted and AccessLog descriptors might be able to use this as a base class and add their own logic on top. The error checking is also very basic. Before using this in production, you will probably want to do a better job of raising the right kinds of exceptions, with appropriate error messages, in every use case.
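As a rough sketch of what that subclassing could look like: the block below repeats a trimmed copy of the base class (without __delete__) so it runs on its own. The Restricted and AccessLog bodies are guesses at what the question has in mind. In particular, the unlocked flag on the instance and the class-level events list are invented for illustration only.

```python
class DelegateDescriptor:
    """Trimmed copy of the base class above, so this sketch is self-contained."""

    def __init__(self, child=None):
        self.child = child
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name
        if hasattr(self.child, "__set_name__"):
            self.child.__set_name__(owner, name)

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.child is not None:
            return self.child.__get__(instance, owner)
        try:
            return instance.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, instance, value):
        if self.child is not None:
            self.child.__set__(instance, value)
        else:
            instance.__dict__[self.name] = value


class AccessLog(DelegateDescriptor):
    """Hypothetical logger: appends every get/set to a class-level list."""

    events = []

    def __get__(self, instance, owner=None):
        if instance is not None:
            AccessLog.events.append(("get", self.name))
        return super().__get__(instance, owner)

    def __set__(self, instance, value):
        AccessLog.events.append(("set", self.name))
        super().__set__(instance, value)


class Restricted(DelegateDescriptor):
    """Hypothetical guard: allows access only while instance.unlocked is true."""

    def _check(self, instance):
        if not getattr(instance, "unlocked", False):
            raise PermissionError(f"access to {self.name!r} is restricted")

    def __get__(self, instance, owner=None):
        if instance is not None:
            self._check(instance)
        return super().__get__(instance, owner)

    def __set__(self, instance, value):
        self._check(instance)
        super().__set__(instance, value)


class ExampleClient:
    unlocked = True  # assumed condition checked by Restricted
    foo = Restricted(AccessLog())
    bar = AccessLog(Restricted())


client = ExampleClient()
client.foo = 1
client.bar = 2
result = (client.foo, client.bar)
```

Note that the nesting order matters: with AccessLog on the outside, a denied access is still logged before Restricted rejects it, whereas with Restricted on the outside the log never sees it.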

Related

How does object.__getattribute__ redirect to the __get__ method on my descriptor and to __getattr__?

I have a two-part question regarding the implementation of object.__getattribute__(self, key); both parts center on my confusion about how it works.
I've defined a data descriptor called NonNullStringDescriptor that I intended to attach to attributes.
class NonNullStringDescriptor:
    def __init__(self, value: str = "Default Name"):
        self.value = value
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        if isinstance(value, str) and len(value.strip()) > 0:
            self.value = value
        else:
            raise TypeError("The value provided is not a non-null string.")
I then declare a Person class with an attribute of name.
class Person:
    name = NonNullStringDescriptor()
    def __init__(self):
        self.age = 22
    def __getattribute__(self, key):
        print(f"__getattribute__({key})")
        v = super(Person, self).__getattribute__(key)
        print(f"Returned {v}")
        if hasattr(v, '__get__'):
            print("Invoking __get__")
            return v.__get__(None, self)
        return v
    def __getattr__(self, item):
        print(f"__getattr__ invoked.")
        return "Unknown"
Now, I try accessing variable attributes, some that are descriptors, some normal instance attributes, and others that don't exist:
person = Person()
print("Printing", person.age) # "normal" attribute
print("Printing", person.hobby) # non-existent attribute
print("Printing", person.name) # descriptor attribute
The output that I see is
__getattribute__(age)
Returned 22
Printing 22
__getattribute__(hobby)
__getattr__ invoked.
Printing Unknown
__getattribute__(name)
Returned Default Name
Printing Default Name
I have two main questions, both of which center around super(Person, self).__getattribute__(key):
1. When I attempt to access a non-existent attribute, like hobby, I see that it redirects to __getattr__, which I know is often the "fallback" method in attribute lookup, and it appears that __getattribute__ is what invokes this method. However, the Returned ... console output is never printed, meaning that the rest of __getattribute__ does not complete. So how exactly is __getattribute__ invoking __getattr__ directly, returning this default "Unknown" value without executing the rest of its own function body?
2. I would expect that what is returned from super(Person, self).__getattribute__(key) (v) is the data descriptor instance of NonNullStringDescriptor. However, I see that v is actually the string "Default Name" itself! So how does object.__getattribute__(self, key) just know to use the __get__ method of my descriptor, instead of returning the descriptor instance?
There are references to this behavior in the Descriptor Protocol documentation:
If the looked-up value is an object defining one of the descriptor
methods, then Python may override the default behavior and invoke the
descriptor method instead.
But it's never explicitly defined to me what is actually happening in object.__getattribute__(self, key) that performs the override. I know that ultimately person.name gets converted into a low-level call to type(person).__dict__["name"].__get__(person, type(person)). Is this all happening in object.__getattribute__?
I found this SO post, which describes proper implementation of __getattribute__, but I'm more curious at what is actually happening in object.__getattribute__. However, my IDE (PyCharm) only provides a stub for its implementation:
def __getattribute__(self, *args, **kwargs): # real signature unknown
    """ Return getattr(self, name). """
    pass

__getattribute__ doesn't call __getattr__. The __getattr__ fallback happens in the attribute access machinery, after __getattribute__ raises an AttributeError. If you want to see the implementation, it's in slot_tp_getattr_hook in Objects/typeobject.c.
object.__getattribute__ knows to call __get__ because there's code in object.__getattribute__ that calls __get__. It's pretty straightforward. If you want to see the implementation, object.__getattribute__ is PyObject_GenericGetAttr in the implementation (yes, even though it says GetAttr - the C side of things is a little different from the Python side), and there are two __get__ call sites (one for data descriptors and one for non-data descriptors), here and here.
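For readers who would rather not dig through the C source, the two steps above can be approximated in pure Python. This is only a sketch of the lookup order, not the real implementation: lookup_in_mro, generic_getattribute and lookup_with_fallback are names made up here, and details such as __slots__ and error messages are ignored.

```python
_MISSING = object()


def lookup_in_mro(cls, name):
    """Return the raw class-level entry for name, without invoking descriptors."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def generic_getattribute(obj, name):
    """Rough rendering of the lookup order used by object.__getattribute__."""
    cls = type(obj)
    cls_attr = lookup_in_mro(cls, name)
    attr_type = type(cls_attr)
    descr_get = getattr(attr_type, "__get__", None)
    is_data = hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")

    if descr_get is not None and is_data:
        return descr_get(cls_attr, obj, cls)   # data descriptor wins
    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:
        return instance_dict[name]             # plain instance attribute
    if descr_get is not None:
        return descr_get(cls_attr, obj, cls)   # non-data descriptor
    if cls_attr is not _MISSING:
        return cls_attr                        # plain class attribute
    raise AttributeError(name)


def lookup_with_fallback(obj, name):
    """The __getattr__ fallback lives outside __getattribute__."""
    try:
        return generic_getattribute(obj, name)
    except AttributeError:
        fallback = lookup_in_mro(type(obj), "__getattr__")
        if fallback is _MISSING:
            raise
        return fallback(obj, name)


class FixedName:
    """Small data descriptor used only for this demonstration."""

    def __get__(self, instance, owner=None):
        return "from descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class Person:
    name = FixedName()

    def __init__(self):
        self.age = 22

    def __getattr__(self, item):
        return "Unknown"


person = Person()
results = [lookup_with_fallback(person, key) for key in ("age", "hobby", "name")]
```

In this sketch, hobby reaches __getattr__ only because generic_getattribute raised AttributeError, which mirrors why the "Returned ..." line in the question is never printed for that attribute.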

What is a DynamicClassAttribute and how do I use it?

As of Python 3.4, there is a descriptor called DynamicClassAttribute. The documentation states:
types.DynamicClassAttribute(fget=None, fset=None, fdel=None, doc=None)
Route attribute access on a class to __getattr__.
This is a descriptor, used to define attributes that act differently when accessed through an instance and through a class. Instance access remains normal, but access to an attribute through a class will be routed to the class’s __getattr__ method; this is done by raising AttributeError.
This allows one to have properties active on an instance, and have virtual attributes on the class with the same name (see Enum for an example).
New in version 3.4.
It is apparently used in the enum module:
# DynamicClassAttribute is used to provide access to the `name` and
# `value` properties of enum members while keeping some measure of
# protection from modification, while still allowing for an enumeration
# to have members named `name` and `value`. This works because enumeration
# members are not set directly on the enum class -- __getattr__ is
# used to look them up.
@DynamicClassAttribute
def name(self):
    """The name of the Enum member."""
    return self._name_
@DynamicClassAttribute
def value(self):
    """The value of the Enum member."""
    return self._value_
I realise that enums are a little special, but I don't understand how this relates to the DynamicClassAttribute. What does it mean that those attributes are dynamic, how is this different from a normal property, and how do I use a DynamicClassAttribute to my advantage?

New Version:
I was a bit disappointed with the previous answer so I decided to rewrite it a bit:
First have a look at the source code of DynamicClassAttribute and you'll probably notice, that it looks very much like the normal property. Except for the __get__-method:
def __get__(self, instance, ownerclass=None):
    if instance is None:
        # Here is the difference, the normal property just does: return self
        if self.__isabstractmethod__:
            return self
        raise AttributeError()
    elif self.fget is None:
        raise AttributeError("unreadable attribute")
    return self.fget(instance)
This means that accessing a non-abstract DynamicClassAttribute on the class raises an AttributeError instead of returning self. For instances, instance is not None, so __get__ behaves identically to property.__get__.
For normal classes, that just results in a visible AttributeError when accessing the attribute:
from types import DynamicClassAttribute
class Fun():
    @DynamicClassAttribute
    def has_fun(self):
        return False
Fun.has_fun
AttributeError - Traceback (most recent call last)
That by itself is not very helpful until you look at the "class attribute lookup" procedure when using metaclasses (I found a nice image of this in this blog).
If the attribute lookup raises an AttributeError and the class has a metaclass, Python falls back to the metaclass's __getattr__ method to see whether that can resolve the attribute. To illustrate this with a minimal example:
from types import DynamicClassAttribute
# Metaclass
class Funny(type):
    def __getattr__(self, value):
        print('search in meta')
        # Normally you would implement here some ifs/elifs or a lookup in a dictionary
        # but I'll just return the attribute
        return Funny.dynprop
    # Metaclasses dynprop:
    dynprop = 'Meta'
class Fun(metaclass=Funny):
    def __init__(self, value):
        self._dynprop = value
    @DynamicClassAttribute
    def dynprop(self):
        return self._dynprop
And here comes the "dynamic" part. If you call the dynprop on the class it will search in the meta and return the meta's dynprop:
Fun.dynprop
which prints:
search in meta
'Meta'
So we invoked the metaclass.__getattr__ and returned the original attribute (which was defined with the same name as the new property).
While for instances the dynprop of the Fun-instance is returned:
Fun('Not-Meta').dynprop
we get the overridden attribute:
'Not-Meta'
My conclusion from this is, that DynamicClassAttribute is important if you want to allow subclasses to have an attribute with the same name as used in the metaclass. You'll shadow it on instances but it's still accessible if you call it on the class.
I did go into the behaviour of Enum in the old version so I left it in here:
Old Version
The DynamicClassAttribute is just useful (I'm not really sure on that point) if you suspect there could be naming conflicts between an attribute that is set on a subclass and a property on the base-class.
You'll need to know at least some basics about metaclasses, because this will not work without using metaclasses (a nice explanation on how class attributes are called can be found in this blog post) because the attribute lookup is slightly different with metaclasses.
Suppose you have:
class Funny(type):
    dynprop = 'Very important meta attribute, do not override'
class Fun(metaclass=Funny):
    def __init__(self, value):
        self._stub = value
    @property
    def dynprop(self):
        return 'Haha, overridden it with {}'.format(self._stub)
and then call:
Fun.dynprop
<property at 0x1b3d9fd19a8>
and on the instance we get:
Fun(2).dynprop
'Haha, overridden it with 2'
Bad... it's lost. But wait, we can use the metaclass special lookup: let's implement a __getattr__ (fallback) and implement dynprop as a DynamicClassAttribute, because according to its documentation that's its purpose: to fall back to __getattr__ if it's accessed on the class:
from types import DynamicClassAttribute
class Funny(type):
    def __getattr__(self, value):
        print('search in meta')
        return Funny.dynprop
    dynprop = 'Meta'
class Fun(metaclass=Funny):
    def __init__(self, value):
        self._dynprop = value
    @DynamicClassAttribute
    def dynprop(self):
        return self._dynprop
now we access the class-attribute:
Fun.dynprop
which prints:
search in meta
'Meta'
So we invoked the metaclass.__getattr__ and returned the original attribute (which was defined with the same name as the new property).
And for instances:
Fun('Not-Meta').dynprop
we get the overridden attribute:
'Not-Meta'
Well, that's not too bad considering we can reroute, using metaclasses, to previously defined but overridden attributes without creating an instance. This example is the opposite of what is done with Enum, where you define attributes on the subclass:
from enum import Enum
class Fun(Enum):
    name = 'me'
    age = 28
    hair = 'brown'
and want to access these afterwards defined attributes by default.
Fun.name
# <Fun.name: 'me'>
but you also want to allow accessing the name attribute that was defined as DynamicClassAttribute (which returns which name the variable actually has):
Fun('me').name
# 'name'
because otherwise how could you access the name of 28?
Fun.hair.age
# <Fun.age: 28>
# BUT:
Fun.hair.name
# returns 'hair'
See the difference? Why doesn't the second one return <Fun.name: 'me'>? That's because of this use of DynamicClassAttribute. So you can shadow the original property but "release" it again later. This behaviour is the reverse of that shown in my example and requires at least the usage of __new__ and __prepare__. But for that you need to know exactly how that works, and it is explained in a lot of blogs and Stack Overflow answers that can explain it much better than I can, so I'll skip going into that much depth (and I'm not sure if I could solve it in short order).
Actual use-cases might be sparse but given time one can probably think of some...
Very nice discussion on the documentation of DynamicClassAttribute: "we added it because we needed it"

What is a DynamicClassAttribute
A DynamicClassAttribute is a descriptor that is similar to property. Dynamic is part of the name because you get different results based on whether you access it via the class or via the instance:
- instance access is identical to property and simply runs whatever method was decorated, returning its result
- class access raises an AttributeError; when this happens Python then searches every parent class (via the mro) looking for that attribute -- when it doesn't find it, it calls the class' metaclass's __getattr__ for one last shot at finding the attribute. __getattr__ can, of course, do whatever it wants -- in the case of EnumMeta __getattr__ looks in the class' _member_map_ to see if the requested attribute is there, and returns it if it is. As a side note: all that searching had a severe performance impact, which is why we ended up putting all members that did not have name conflicts with DynamicClassAttributes in the Enum class' __dict__ after all.
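The two access paths can be mimicked with a toy metaclass. This is only an illustration of the mechanism, not how the enum module is actually written: MiniMeta and Mini are invented names, and the member map is filled in by hand instead of by the class-creation machinery.

```python
from types import DynamicClassAttribute


class MiniMeta(type):
    def __getattr__(cls, name):
        # Reached only after normal class-level lookup has failed,
        # e.g. because a DynamicClassAttribute raised AttributeError.
        try:
            return cls._member_map_[name]
        except KeyError:
            raise AttributeError(name) from None


class Mini(metaclass=MiniMeta):
    _member_map_ = {}

    def __init__(self, name, value):
        self._name_ = name
        self._value_ = value

    @DynamicClassAttribute
    def name(self):
        return self._name_

    @DynamicClassAttribute
    def value(self):
        return self._value_


# Register two "members" by hand; one of them is deliberately called "name".
Mini._member_map_["name"] = Mini("name", "me")
Mini._member_map_["hair"] = Mini("hair", "brown")

member_called_name = Mini.name      # class access: routed to MiniMeta.__getattr__
name_of_hair = Mini.hair.name       # instance access: runs the decorated method
```

Here Mini.name is the member stored under "name", while Mini.hair.name is the string 'hair', which is the same split between class access and instance access described in the list above.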
and how do I use it?
You use it just like you would property -- the only difference is that you use it when creating a base class for other Enums. As an example, the Enum from aenum [1] has three reserved names:
name
value
values
values is there to support Enum members with multiple values. That class is effectively:
class Enum(metaclass=EnumMeta):
    @DynamicClassAttribute
    def name(self):
        return self._name_
    @DynamicClassAttribute
    def value(self):
        return self._value_
    @DynamicClassAttribute
    def values(self):
        return self._values_
and now any aenum.Enum can have a values member without messing up Enum.<member>.values.
[1] Disclosure: I am the author of the Python stdlib Enum, the enum34 backport, and the Advanced Enumeration (aenum) library.

Dynamic configuration object based on __setattr__ and __getattr__

I tried to create a dynamic object to validate my config on the fly and present the result as an object. I tried to achieve this by creating the following class:
class SubConfig(object):
    def __init__(self, config, key_types):
        self.__config = config
        self.__values = {}
        self.__key_types = key_types
    def __getattr__(self, item):
        if item in self.__key_types:
            return self.__values[item] or None
        else:
            raise ValueError("No such item to get from config")
    def __setattr__(self, item, value):
        if self.__config._blocked:
            raise ValueError("Can't change values after service has started")
        if item in self.__key_types:
            if type(value) in self.__key_types[item]:
                self.__values[item] = value
            else:
                raise ValueError("Can't assign a value of a different type than declared!")
        else:
            raise ValueError("No such item to set in config")
SubConfig is a wrapper for a section of the config file. Config has a switch that disables changing values after the program has started (values can only be changed during initialization).
The problem is that when I set any value, it ends up in an infinite loop in __getattr__. From what I've read, __getattr__ shouldn't behave like that (existing attributes are looked up first, and only then is __getattr__ called). I compared my code with the available examples but I can't figure it out.
I noticed that all the problems are caused by my constructor.

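The problem is that your constructor, in initialising the object, calls __setattr__, which reads self.__config. Because the __ private members aren't initialised yet, that lookup falls through to __getattr__, which in turn reads self.__key_types, which is also missing, so __getattr__ keeps calling itself.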
There are two ways I can think of to work around this:
One option is to call down to object.__setattr__ thereby avoiding your __setattr__ or equivalently use super(SubConfig, self).__setattr__(...) in __init__. You could also set values in self.__dict__ directly. A problem here is that because you're using double-underscores you'd have to mangle the attribute names manually (so '__config' becomes '_SubConfig__config'):
def __init__(self, config, key_types):
    super(SubConfig, self).__setattr__('_SubConfig__config', config)
    super(SubConfig, self).__setattr__('_SubConfig__values', {})
    super(SubConfig, self).__setattr__('_SubConfig__key_types', key_types)
An alternative is to have __setattr__ detect and pass through access to attribute names that begin with _ i.e.
if item.startswith('_'):
    return super(SubConfig, self).__setattr__(item, value)
This is more Pythonic in that if someone has a good reason to access your object's internals, you have no reason to try to stop them.

See ecatmur's answer for the root cause, and remember that __setattr__ is not symmetrical to __getattr__: it is unconditionally called on each and every attempt to bind an object's attribute. Overriding __setattr__ is tricky and should not be done if you don't clearly understand the pros and cons.
Now for a simple practical solution to your use case: rewrite your initializer to avoid triggering setattr calls:
class SubConfig(object):
    def __init__(self, config, key_types):
        self.__dict__.update(
            _SubConfig__config=config,
            _SubConfig__values={},
            _SubConfig__key_types=key_types
        )
Note that I renamed your attributes to emulate the name-mangling that happens when using the double leading underscores naming scheme.

Why does a Python descriptor __get__ method accept the owner class as an arg?

Why does the __get__ method in a Python descriptor accept the owner class as its third argument? Can you give an example of its use?
The first argument (self) is self-evident, the second (instance) makes sense in the context of the typically shown descriptor pattern (example to follow), but I've never really seen the third (owner) used. Can someone explain what the use case is for it?
Just by way of reference and facilitating answers this is the typical use of descriptors I've seen:
class Container(object):
    class ExampleDescriptor(object):
        def __get__(self, instance, owner):
            return instance._name
        def __set__(self, instance, value):
            instance._name = value
    managed_attr = ExampleDescriptor()
Given that instance.__class__ is available, all I can think of is that explicitly passing the class has something to do with directly accessing the descriptor from the class instead of an instance (e.g. Container.managed_attr). Even so, I'm not clear on what one would do in __get__ in this situation.

owner is used when the attribute is accessed from the class instead of an instance of the class, in which case instance will be None.
In your example attempting something like print(Container.managed_attr) would fail because instance is None so instance._name would raise an AttributeError.
You could improve this behavior by checking to see if instance is None, and it may be useful for logging or raising a more helpful exception to know which class the descriptor belongs to, hence the owner attribute. For example:
def __get__(self, instance, owner):
    if instance is None:
        # special handling for Container.managed_attr,
        # e.g. return the descriptor itself
        return self
    else:
        return instance._name

When the descriptor is accessed from the class, instance will be None. If you have not accounted for that situation (as your example code does not) then an error will occur at that point.
What should you do in that case? Whatever is sensible. ;) If nothing else makes sense you could follow property's example and return the descriptor itself when accessed from the class.
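Putting both suggestions together, a descriptor might look roughly like this. It is a sketch, not the only sensible design: ManagedName, its storage and public attributes, and the wording of the error message are all made up for the example.

```python
class ManagedName:
    def __set_name__(self, owner, name):
        self.public = name
        self.storage = "_" + name  # assumed private slot on the instance

    def __get__(self, instance, owner=None):
        if instance is None:
            # Class access (Container.managed_attr): follow property's example.
            return self
        if owner is None:
            owner = type(instance)
        try:
            return getattr(instance, self.storage)
        except AttributeError:
            # owner lets the message name the class the descriptor lives on.
            raise AttributeError(
                f"{owner.__name__}.{self.public} has not been set"
            ) from None

    def __set__(self, instance, value):
        setattr(instance, self.storage, value)


class Container:
    managed_attr = ManagedName()


on_class = Container.managed_attr   # the descriptor object itself
box = Container()
box.managed_attr = "widget"
on_instance = box.managed_attr      # the stored value
```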

Yes, it's used so that the descriptor can see Container when Container.managed_attr is accessed. You could return some object appropriate to the use case, like an unbound method when descriptors are used to implement methods.

I think the most famous application of the owner parameter of the __get__ method in Python is the classmethod decorator. Here is a pure Python version:
import types
class ClassMethod:
    "Emulate PyClassMethod_Type() in Objects/funcobject.c."
    def __init__(self, f):
        self.f = f
    def __get__(self, instance, owner=None):
        if instance is None and owner is None:
            raise TypeError("__get__(None, None) is invalid")
        if owner is None:
            owner = type(instance)
        if hasattr(self.f, "__get__"):
            return self.f.__get__(owner)
        return types.MethodType(self.f, owner)
Thanks to the owner parameter, classmethod works for attribute lookup not only from an instance but also from a class:
class A:
    @ClassMethod
    def name(cls):
        return cls.__name__
A().name() # returns 'A' so attribute lookup from an instance works
A.name() # returns 'A' so attribute lookup from a class works too

Python: Can a class forbid clients setting new attributes?

I just spent too long on a bug like the following:
>>> class Odp():
...     def __init__(self):
...         self.foo = "bar"
...
>>> o = Odp()
>>> o.raw_foo = 3 # oops - meant o.foo
I have a class with an attribute. I was trying to set it, and wondering why it had no effect. Then, I went back to the original class definition, and saw that the attribute was named something slightly different. Thus, I was creating/setting a new attribute instead of the one meant to.
First off, isn't this exactly the type of error that statically-typed languages are supposed to prevent? In this case, what is the advantage of dynamic typing?
Secondly, is there a way I could have forbidden this when defining Odp, and thus saved myself the trouble?

You can implement a __setattr__ method for the purpose -- that's much more robust than the __slots__ which is often misused for the purpose (for example, __slots__ is automatically "lost" when the class is inherited from, while __setattr__ survives unless explicitly overridden).
def __setattr__(self, name, value):
    if hasattr(self, name):
        object.__setattr__(self, name, value)
    else:
        raise TypeError('Cannot set name %r on object of type %s' % (
            name, self.__class__.__name__))
You'll have to make sure the hasattr succeeds for the names you do want to be able to set, for example by setting the attributes at a class level or by using object.__setattr__ in your __init__ method rather than direct attribute assignment. (To forbid setting attributes on a class rather than its instances you'll have to define a custom metaclass with a similar special method).
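Applied to the Odp class from the question, that could look like the sketch below. It uses the object.__setattr__ route in __init__; setting class-level defaults instead would work just as well.

```python
class Odp:
    def __init__(self):
        # Direct assignment would be rejected, because foo does not exist yet.
        object.__setattr__(self, "foo", "bar")

    def __setattr__(self, name, value):
        if hasattr(self, name):
            object.__setattr__(self, name, value)
        else:
            raise TypeError('Cannot set name %r on object of type %s' % (
                name, self.__class__.__name__))


o = Odp()
o.foo = "baz"          # allowed: foo already exists

try:
    o.raw_foo = 3      # the typo from the question is now caught
except TypeError as exc:
    error = str(exc)
```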
