I am trying to understand what Python's descriptors are and what they are useful for. I understand how they work, but here are my doubts. Consider the following code:
class Celsius(object):
    def __init__(self, value=0.0):
        self.value = float(value)
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = float(value)
class Temperature(object):
    celsius = Celsius()
Why do I need the descriptor class?
What is instance and owner here? (in __get__). What is the purpose of these parameters?
How would I call/use this example?
The descriptor is how Python's property type is implemented. A descriptor simply implements __get__, __set__, etc. and is then added to another class in its definition (as you did above with the Temperature class). For example:
temp=Temperature()
temp.celsius #calls celsius.__get__
Accessing the property you assigned the descriptor to (celsius in the above example) calls the appropriate descriptor method.
instance in __get__ is the instance of the class (so above, __get__ would receive temp), while owner is the class with the descriptor (so it would be Temperature).
You need to use a descriptor class to encapsulate the logic that powers it. That way, if the descriptor is used to cache some expensive operation (for example), it could store the value on itself and not its class.
An article about descriptors can be found here.
EDIT: As jchl pointed out in the comments, if you simply try Temperature.celsius, instance will be None.
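To make the two parameters concrete, here is a minimal sketch whose __get__ simply returns the arguments it received. The names Show and Thing are made up for illustration and do not come from the question:

```python
class Show:
    """Hypothetical descriptor that reports what __get__ received."""

    def __get__(self, instance, owner):
        return (instance, owner)


class Thing:
    attr = Show()


t = Thing()

# Lookup through an instance: instance is t, owner is Thing.
inst_arg, owner_arg = t.attr
print(inst_arg is t, owner_arg is Thing)

# Lookup through the class: instance is None, owner is still Thing.
inst_arg2, owner_arg2 = Thing.attr
print(inst_arg2 is None, owner_arg2 is Thing)
```

Both print calls should show True True.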
Why do I need the descriptor class?
It gives you extra control over how attributes work. If you're used to getters and setters in Java, for example, then it's Python's way of doing that. One advantage is that it looks to users just like an attribute (there's no change in syntax). So you can start with an ordinary attribute and then, when you need to do something fancy, switch to a descriptor.
An attribute is just a mutable value. A descriptor lets you execute arbitrary code when reading or setting (or deleting) a value. So you could imagine using it to map an attribute to a field in a database, for example – a kind of ORM.
Another use might be refusing to accept a new value by throwing an exception in __set__ – effectively making the "attribute" read only.
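As a rough sketch of that read-only idea (ReadOnly and Circle are hypothetical names, and Python 3 is assumed):

```python
class ReadOnly:
    """Hypothetical data descriptor that exposes a fixed value."""

    def __init__(self, value):
        self._value = value

    def __get__(self, instance, owner):
        return self._value

    def __set__(self, instance, value):
        # Refuse every assignment made through an instance.
        raise AttributeError("this attribute is read-only")


class Circle:
    sides = ReadOnly(0)


c = Circle()
print(c.sides)

try:
    c.sides = 4
except AttributeError as exc:
    print("rejected:", exc)
```

Note that this only guards assignment through instances; assigning to Circle.sides on the class itself would still replace the descriptor.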
What is instance and owner here? (in __get__). What is the purpose of these parameters?
This is pretty subtle (and the reason I am writing a new answer here - I found this question while wondering the same thing and didn't find the existing answer that great).
A descriptor is defined on a class, but is typically called from an instance. When it's called from an instance both instance and owner are set (and you can work out owner from instance so it seems kinda pointless). But when called from a class, only owner is set – which is why it's there.
This is only needed for __get__ because it's the only one that can be called on a class. If you set the class value you set the descriptor itself. Similarly for deletion. Which is why the owner isn't needed there.
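A small experiment shows the difference between assigning through an instance and assigning on the class. Loud and Holder are hypothetical names used only for this sketch:

```python
class Loud:
    """Hypothetical descriptor that records assignments made via instances."""

    def __init__(self):
        self.seen = []

    def __get__(self, instance, owner):
        return "from descriptor"

    def __set__(self, instance, value):
        self.seen.append(value)


class Holder:
    attr = Loud()


# Fetch the descriptor object itself without triggering __get__.
descriptor = Holder.__dict__["attr"]

h = Holder()
h.attr = 1                       # goes through Loud.__set__
print(descriptor.seen)           # [1]

Holder.attr = 2                  # rebinds the class attribute; __set__ is not called
print(descriptor.seen)           # still [1]
print(Holder.__dict__["attr"])   # 2: the descriptor is gone from the class
print(h.attr)                    # 2: now an ordinary class attribute
```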
How would I call/use this example?
Well, here's a cool trick using similar classes:
class Celsius:
    def __get__(self, instance, owner):
        return 5 * (instance.fahrenheit - 32) / 9
    def __set__(self, instance, value):
        instance.fahrenheit = 32 + 9 * value / 5
class Temperature:
    celsius = Celsius()
    def __init__(self, initial_f):
        self.fahrenheit = initial_f
t = Temperature(212)
print(t.celsius)
t.celsius = 0
print(t.fahrenheit)
(I'm using Python 3; for python 2 you need to make sure those divisions are / 5.0 and / 9.0). That gives:
100.0
32.0
Now there are other, arguably better ways to achieve the same effect in python (e.g. if celsius were a property, which is the same basic mechanism but places all the source inside the Temperature class), but that shows what can be done...
I am trying to understand what Python's descriptors are and what they can be useful for.
Descriptors are objects in a class namespace that manage instance attributes (like slots, properties, or methods). For example:
class HasDescriptors:
    __slots__ = 'a_slot' # creates a descriptor
    def a_method(self): # creates a descriptor
        "a regular method"
    @staticmethod # creates a descriptor
    def a_static_method():
        "a static method"
    @classmethod # creates a descriptor
    def a_class_method(cls):
        "a class method"
    @property # creates a descriptor
    def a_property(self):
        "a property"
# even a regular function:
def a_function(some_obj_or_self): # creates a descriptor
    "create a function suitable for monkey patching"
HasDescriptors.a_function = a_function # (but we usually don't do this)
Pedantically, descriptors are objects with any of the following special methods, which may be known as "descriptor methods":
__get__: non-data descriptor method, for example on a method/function
__set__: data descriptor method, for example on a property instance or slot
__delete__: data descriptor method, again used by properties or slots
These descriptor objects are attributes in other object class namespaces. That is, they live in the __dict__ of the class object.
Descriptor objects programmatically manage the results of a dotted lookup (e.g. foo.descriptor) in a normal expression, an assignment, or a deletion.
Functions/methods, bound methods, property, classmethod, and staticmethod all use these special methods to control how they are accessed via the dotted lookup.
A data descriptor, like property, can allow for lazy evaluation of attributes based on a simpler state of the object, allowing instances to use less memory than if you precomputed each possible attribute.
Another data descriptor, a member_descriptor created by __slots__, allows memory savings (and faster lookups) by having the class store data in a mutable tuple-like datastructure instead of the more flexible but space-consuming __dict__.
Non-data descriptors, instance and class methods, get their implicit first arguments (usually named self and cls, respectively) from their non-data descriptor method, __get__ - and this is how static methods know not to have an implicit first argument.
Most users of Python need to learn only the high-level usage of descriptors, and have no need to learn or understand the implementation of descriptors further.
But understanding how descriptors work can give one greater confidence in one's mastery of Python.
In Depth: What Are Descriptors?
A descriptor is an object with any of the following methods (__get__, __set__, or __delete__), intended to be used via dotted-lookup as if it were a typical attribute of an instance. For an owner-object, obj_instance, with a descriptor object:
obj_instance.descriptor invokes
descriptor.__get__(self, obj_instance, owner_class) returning a value
This is how all methods and the get on a property work.
obj_instance.descriptor = value invokes
descriptor.__set__(self, obj_instance, value) returning None
This is how the setter on a property works.
del obj_instance.descriptor invokes
descriptor.__delete__(self, obj_instance) returning None
This is how the deleter on a property works.
obj_instance is the instance whose class contains the descriptor object's instance. self is the instance of the descriptor (probably just one for the class of the obj_instance)
To define this with code, an object is a descriptor if the set of its attributes intersects with any of the required attributes:
def has_descriptor_attrs(obj):
    return set(['__get__', '__set__', '__delete__']).intersection(dir(obj))
def is_descriptor(obj):
    """obj can be instance of descriptor or the descriptor class"""
    return bool(has_descriptor_attrs(obj))
A Data Descriptor has a __set__ and/or __delete__.
A Non-Data-Descriptor has neither __set__ nor __delete__.
def has_data_descriptor_attrs(obj):
    return set(['__set__', '__delete__']) & set(dir(obj))
def is_data_descriptor(obj):
    return bool(has_data_descriptor_attrs(obj))
Builtin Descriptor Object Examples:
classmethod
staticmethod
property
functions in general
Non-Data Descriptors
We can see that classmethod and staticmethod are Non-Data-Descriptors:
>>> is_descriptor(classmethod), is_data_descriptor(classmethod)
(True, False)
>>> is_descriptor(staticmethod), is_data_descriptor(staticmethod)
(True, False)
Both only have the __get__ method:
>>> has_descriptor_attrs(classmethod), has_descriptor_attrs(staticmethod)
(set(['__get__']), set(['__get__']))
Note that all functions are also Non-Data-Descriptors:
>>> def foo(): pass
...
>>> is_descriptor(foo), is_data_descriptor(foo)
(True, False)
Data Descriptor, property
However, property is a Data-Descriptor:
>>> is_data_descriptor(property)
True
>>> has_descriptor_attrs(property)
set(['__set__', '__get__', '__delete__'])
Dotted Lookup Order
These are important distinctions, as they affect the lookup order for a dotted lookup.
obj_instance.attribute
First the above looks to see if the attribute is a Data-Descriptor on the class of the instance,
If not, it looks to see if the attribute is in the obj_instance's __dict__, then
it finally falls back to a Non-Data-Descriptor.
The consequence of this lookup order is that Non-Data-Descriptors like functions/methods can be overridden by instances.
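The lookup order can be checked with a short experiment. This is only a sketch; DataDesc, NonDataDesc and Owner are hypothetical names, and the values are written straight into the instance __dict__ to bypass normal assignment:

```python
class DataDesc:
    """Hypothetical data descriptor (defines __get__ and __set__)."""

    def __get__(self, instance, owner):
        return "data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class NonDataDesc:
    """Hypothetical non-data descriptor (defines only __get__)."""

    def __get__(self, instance, owner):
        return "non-data descriptor"


class Owner:
    data = DataDesc()
    non_data = NonDataDesc()


obj = Owner()

# Put same-named entries directly into the instance __dict__.
obj.__dict__["data"] = "instance value"
obj.__dict__["non_data"] = "instance value"

print(obj.data)       # data descriptor (wins over the instance __dict__)
print(obj.non_data)   # instance value (instance __dict__ wins)
```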
Recap and Next Steps
We have learned that descriptors are objects with any of __get__, __set__, or __delete__. These descriptor objects can be used as attributes on other object class definitions. Now we will look at how they are used, using your code as an example.
Analysis of Code from the Question
Here's your code, followed by your questions and answers to each:
class Celsius(object):
    def __init__(self, value=0.0):
        self.value = float(value)
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = float(value)
class Temperature(object):
    celsius = Celsius()
Why do I need the descriptor class?
Your descriptor ensures you always have a float for this class attribute of Temperature, and that you can't use del to delete the attribute:
>>> t1 = Temperature()
>>> del t1.celsius
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: __delete__
Otherwise, your descriptors ignore the owner-class and instances of the owner, instead, storing state in the descriptor. You could just as easily share state across all instances with a simple class attribute (so long as you always set it as a float to the class and never delete it, or are comfortable with users of your code doing so):
class Temperature(object):
    celsius = 0.0
This gets you exactly the same behavior as your example (see the response to question 3 below), but uses a Python builtin (property), and would be considered more idiomatic:
class Temperature(object):
    _celsius = 0.0
    @property
    def celsius(self):
        return type(self)._celsius
    @celsius.setter
    def celsius(self, value):
        type(self)._celsius = float(value)
What is instance and owner here? (in __get__). What is the purpose of these parameters?
instance is the instance of the owner that is calling the descriptor. The owner is the class in which the descriptor object is used to manage access to the data point. See the descriptions of the special methods that define descriptors next to the first paragraph of this answer for more descriptive variable names.
How would I call/use this example?
Here's a demonstration:
>>> t1 = Temperature()
>>> t1.celsius
0.0
>>> t1.celsius = 1
>>>
>>> t1.celsius
1.0
>>> t2 = Temperature()
>>> t2.celsius
1.0
You can't delete the attribute:
>>> del t2.celsius
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: __delete__
And you can't assign a variable that can't be converted to a float:
>>> t1.celsius = '0x02'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in __set__
ValueError: invalid literal for float(): 0x02
Otherwise, what you have here is a global state for all instances, that is managed by assigning to any instance.
The expected way that most experienced Python programmers would accomplish this outcome would be to use the property decorator, which makes use of the same descriptors under the hood, but brings the behavior into the implementation of the owner class (again, as defined above):
class Temperature(object):
    _celsius = 0.0
    @property
    def celsius(self):
        return type(self)._celsius
    @celsius.setter
    def celsius(self, value):
        type(self)._celsius = float(value)
Which has the exact same expected behavior of the original piece of code:
>>> t1 = Temperature()
>>> t2 = Temperature()
>>> t1.celsius
0.0
>>> t1.celsius = 1.0
>>> t2.celsius
1.0
>>> del t1.celsius
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: can't delete attribute
>>> t1.celsius = '0x02'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 8, in celsius
ValueError: invalid literal for float(): 0x02
Conclusion
We've covered the attributes that define descriptors, the difference between data- and non-data-descriptors, builtin objects that use them, and specific questions about use.
So again, how would you use the question's example? I hope you wouldn't. I hope you would start with my first suggestion (a simple class attribute) and move on to the second suggestion (the property decorator) if you feel it is necessary.
Before going into the details of descriptors it may be important to know how attribute lookup in Python works. This assumes that the class has no metaclass and that it uses the default implementation of __getattribute__ (both can be used to "customize" the behavior).
The best illustration of attribute lookup (in Python 3.x or for new-style classes in Python 2.x) in this case is from Understanding Python metaclasses (ionel's codelog). The image uses : as substitute for "non-customizable attribute lookup".
This represents the lookup of an attribute foobar on an instance of Class:
Two conditions are important here:
If the class of instance has an entry for the attribute name and it has __get__ and __set__.
If the instance has no entry for the attribute name but the class has one and it has __get__.
That's where descriptors come into it:
Data descriptors which have both __get__ and __set__.
Non-data descriptors which only have __get__.
In both cases the returned value goes through __get__ called with the instance as first argument and the class as second argument.
The lookup is even more complicated for class attribute lookup (see for example Class attribute lookup (in the above mentioned blog)).
Let's move to your specific questions:
Why do I need the descriptor class?
In most cases you don't need to write descriptor classes! However, you're probably a very regular end user of them. Functions, for example, are descriptors; that's how functions can be used as methods with self implicitly passed as the first argument.
def test_function(self):
    return self
class TestClass(object):
    def test_method(self):
        ...
If you look up test_method on an instance you'll get back a "bound method":
>>> instance = TestClass()
>>> instance.test_method
<bound method TestClass.test_method of <__main__.TestClass object at ...>>
Similarly you could also bind a function by invoking its __get__ method manually (not really recommended, just for illustrative purposes):
>>> test_function.__get__(instance, TestClass)
<bound method test_function of <__main__.TestClass object at ...>>
You can even call this "self-bound method":
>>> test_function.__get__(instance, TestClass)()
<__main__.TestClass at ...>
Note that I did not provide any arguments and the function did return the instance I had bound!
Functions are Non-data descriptors!
A built-in example of a data descriptor is property. Neglecting getter, setter, and deleter, the property descriptor is (from the Descriptor HowTo Guide, "Properties"):
class Property(object):
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)
    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)
    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)
Since it's a data descriptor, it's invoked whenever you look up the "name" of the property, and it simply delegates to the functions decorated with @property, @name.setter, and @name.deleter (if present).
There are several other descriptors in the standard library, for example staticmethod, classmethod.
The point of descriptors is easy (although you rarely need them): Abstract common code for attribute access. property is an abstraction for instance variable access, function provides an abstraction for methods, staticmethod provides an abstraction for methods that don't need instance access and classmethod provides an abstraction for methods that need class access rather than instance access (this is a bit simplified).
Another example would be a class property.
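A class property can be sketched as a small non-data descriptor. This is an illustrative version under my own assumptions, not a standard-library feature; classproperty and Config are hypothetical names:

```python
class classproperty:
    """Hypothetical read-only class-level property (non-data descriptor)."""

    def __init__(self, fget):
        self.fget = fget

    def __get__(self, instance, owner=None):
        if owner is None:
            owner = type(instance)
        # Always call the getter with the class, never the instance.
        return self.fget(owner)


class Config:
    _name = "demo"

    @classproperty
    def name(cls):
        return cls._name


print(Config.name)     # looked up on the class
print(Config().name)   # looked up on an instance
```

Because it only defines __get__, this sketch does not stop anyone from rebinding Config.name on the class; blocking that would need support from a metaclass.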
One fun example (using __set_name__ from Python 3.6) could also be a property that only allows a specific type:
class TypedProperty(object):
    __slots__ = ('_name', '_type')
    def __init__(self, typ):
        self._type = typ
    def __get__(self, instance, klass=None):
        if instance is None:
            return self
        return instance.__dict__[self._name]
    def __set__(self, instance, value):
        if not isinstance(value, self._type):
            raise TypeError(f"Expected class {self._type}, got {type(value)}")
        instance.__dict__[self._name] = value
    def __delete__(self, instance):
        del instance.__dict__[self._name]
    def __set_name__(self, klass, name):
        self._name = name
Then you can use the descriptor in a class:
class Test(object):
    int_prop = TypedProperty(int)
And playing a bit with it:
>>> t = Test()
>>> t.int_prop = 10
>>> t.int_prop
10
>>> t.int_prop = 20.0
TypeError: Expected class <class 'int'>, got <class 'float'>
Or a "lazy property":
class LazyProperty(object):
    __slots__ = ('_fget', '_name')
    def __init__(self, fget):
        self._fget = fget
    def __get__(self, instance, klass=None):
        if instance is None:
            return self
        try:
            return instance.__dict__[self._name]
        except KeyError:
            value = self._fget(instance)
            instance.__dict__[self._name] = value
            return value
    def __set_name__(self, klass, name):
        self._name = name
class Test(object):
    @LazyProperty
    def lazy(self):
        print('calculating')
        return 10
>>> t = Test()
>>> t.lazy
calculating
10
>>> t.lazy
10
These are cases where moving the logic into a common descriptor might make sense, however one could also solve them (but maybe with repeating some code) with other means.
What is instance and owner here? (in __get__). What is the purpose of these parameters?
It depends on how you look up the attribute. If you look up the attribute on an instance then:
the second argument is the instance on which you look up the attribute
the third argument is the class of the instance
In case you look up the attribute on the class (assuming the descriptor is defined on the class):
the second argument is None
the third argument is the class where you look up the attribute
So basically the third argument is necessary if you want to customize the behavior when you do class-level look-up (because the instance is None).
How would I call/use this example?
Your example is basically a property that only allows values that can be converted to float and that is shared between all instances of the class (and on the class - although one can only use "read" access on the class otherwise you would replace the descriptor instance):
>>> t1 = Temperature()
>>> t2 = Temperature()
>>> t1.celsius = 20 # setting it on one instance
>>> t2.celsius # looking it up on another instance
20.0
>>> Temperature.celsius # looking it up on the class
20.0
That's why descriptors generally use the second argument (instance) to store the value to avoid sharing it. However in some cases sharing a value between instances might be desired (although I cannot think of a scenario at this moment). However it makes practically no sense for a celsius property on a temperature class... except maybe as purely academic exercise.
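For comparison, here is a sketch of how the question's Celsius could keep its float conversion while storing the value on each instance instead of on the descriptor. It assumes Python 3.6+ for __set_name__, and the "_" + name storage scheme and the 0.0 default are my own choices:

```python
class Celsius:
    """Sketch of a float-coercing descriptor that stores per-instance state."""

    def __set_name__(self, owner, name):
        # Hypothetical naming scheme: keep the value under "_<name>" on the instance.
        self._storage = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self          # class-level lookup returns the descriptor itself
        return getattr(instance, self._storage, 0.0)

    def __set__(self, instance, value):
        setattr(instance, self._storage, float(value))


class Temperature:
    celsius = Celsius()


t1 = Temperature()
t2 = Temperature()
t1.celsius = 20
print(t1.celsius)   # 20.0
print(t2.celsius)   # 0.0, no longer shared with t1
```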
Why do I need the descriptor class?
Inspired by Fluent Python by Luciano Ramalho.
Imagine you have a class like this:
class LineItem:
    price = 10.9
    weight = 2.1
    def __init__(self, name, price, weight):
        self.name = name
        self.price = price
        self.weight = weight
item = LineItem("apple", 2.9, 2.1)
item.price = -0.9  # its price is negative: you would owe the customer a refund even though you delivered the apple :(
item.weight = -0.8  # negative weight, which doesn't make sense
We should validate weight and price to avoid assigning them a negative number. We can write less code if we use a descriptor as a proxy, like this:
class Quantity(object):
    __index = 0
    def __init__(self):
        self.__index = self.__class__.__index
        self._storage_name = "quantity#{}".format(self.__index)
        self.__class__.__index += 1
    def __set__(self, instance, value):
        if value > 0:
            setattr(instance, self._storage_name, value)
        else:
            raise ValueError('value should >0')
    def __get__(self, instance, owner):
        return getattr(instance, self._storage_name)
then define class LineItem like this:
class LineItem(object):
    weight = Quantity()
    price = Quantity()
    def __init__(self, name, weight, price):
        self.name = name
        self.weight = weight
        self.price = price
We can also extend the Quantity class to cover other common validations.
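One possible direction for that extension, sketched under my own assumptions: a base descriptor handles storage and subclasses only supply the check. Validated and NonBlank are hypothetical names, and __set_name__ (Python 3.6+) replaces the counter-based storage name used above:

```python
class Validated:
    """Hypothetical base descriptor: subclasses supply validate()."""

    def __set_name__(self, owner, name):
        self._storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self._storage_name)

    def __set__(self, instance, value):
        # Store whatever validate() returns, so subclasses may also normalise.
        setattr(instance, self._storage_name, self.validate(value))

    def validate(self, value):
        raise NotImplementedError


class Quantity(Validated):
    def validate(self, value):
        if value <= 0:
            raise ValueError("value should be > 0")
        return value


class NonBlank(Validated):
    def validate(self, value):
        value = value.strip()
        if not value:
            raise ValueError("value cannot be blank")
        return value


class LineItem:
    name = NonBlank()
    weight = Quantity()
    price = Quantity()

    def __init__(self, name, weight, price):
        self.name = name
        self.weight = weight
        self.price = price


item = LineItem("  apple ", 2.1, 2.9)
print(item.name, item.weight, item.price)
```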
See https://docs.python.org/3/howto/descriptor.html#properties
class Property(object):
    "Emulate PyProperty_Type() in Objects/descrobject.c"
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)
    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)
    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)
    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel, self.__doc__)
    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)
    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)
Easy to digest (with example) Explanation for __get__ & __set__ & __call__ in classes, what is Owner, Instance?
Some points to mug up before diving in:
__get__ and __set__ are the descriptor methods of a class; they are used to read/save its internal attributes, namely __name__ (the name of the class/owner class), the variables in __dict__, etc. I will explain what an owner is later.
Descriptors are used most commonly in design patterns, for example with decorators (to abstract things out). You can think of them as a tool used in software architecture design to make things less redundant and more readable (which may seem ironic), thus abiding by the SOLID and DRY principles.
If you are not designing software that should abide by SOLID and DRY principles, you probably don't need them, but it's always wise to understand them.
1. Consider this code:
class Method:
    def __init__(self, name):
        self.name = name
    def __call__(self, instance, arg1, arg2):
        print(f"{self.name}: {instance} called with {arg1} and {arg2}")
class MyClass:
    method = Method("Internal call")
instance = MyClass()
instance.method("first", "second")
# Raises: TypeError: __call__() missing 1 required positional argument: 'arg2'
So, when instance.method("first", "second") is called, the __call__ method of the Method class is called (a __call__ method makes an object callable like a function; whenever the object is called, __call__ is invoked), and the arguments are assigned as follows: instance: "first", arg1: "second", and the last one, arg2, is left out. This produces the error: TypeError: __call__() missing 1 required positional argument: 'arg2'
2. How to solve it?
__call__ takes instance as its first argument (instance, arg1, arg2), but an instance of what?
instance is the instance of the main class (MyClass) that is calling the descriptor class (Method). So instance = MyClass() is the instance, and who is the owner? The class holding the descriptor: MyClass. However, there is no method in our descriptor class (Method) that lets it recognise the instance. That is where we need the __get__ method. Consider the code below:
from types import MethodType
class Method:
    def __init__(self, name):
        self.name = name
    def __call__(self, instance, arg1, arg2):
        print(f"{self.name}: {instance} called with {arg1} and {arg2}")
    def __set__(self, instance, value):
        self.value = value
        instance.__dict__["method"] = value
    def __get__(self, instance, owner):
        if instance is None:
            return self
        print (instance, owner)
        return MethodType(self, instance)
class MyClass:
    method = Method("Internal call")
instance = MyClass()
instance.method("first", "second")
# Prints: Internal call: <__main__.MyClass object at 0x7fb7dd989690> called with first and second
Forget about __set__ for now. According to the docs:
__get__ "Called to get the attribute of the owner class (class attribute access) or of an instance of that class (instance attribute access)."
If you do: instance.method.__get__(instance)
Prints: <__main__.MyClass object at 0x7fb7dd9eab90> <class '__main__.MyClass'>
This means instance is the object of MyClass named instance, and owner is MyClass itself.
3. __set__ explanation:
__set__ is used to set some value in the class __dict__ object (let's say using a command line). The command that triggers __set__ is: instance.descriptor = 'value' # where descriptor is method in this case
(instance.__dict__["method"] = value in the code just updates the __dict__ of the instance; self.value = value stores the value on the descriptor itself)
So do: instance.method = 'value'. Now, to check that value = 'value' was set by the __set__ method, we can look at the __dict__ of the descriptor method.
Do:
instance.method.__dict__ prints: {'name': 'Internal call', 'value': 'value'}
Or you can check the __dict__ value using vars(instance.method)
prints: {'name': 'Internal call', 'value': 'value'}
I hope things are clear now:)
I tried (with minor changes as suggested) the code from Andrew Cooke's answer. (I am running python 2.7).
The code:
#!/usr/bin/env python
class Celsius:
    def __get__(self, instance, owner): return 9 * (instance.fahrenheit + 32) / 5.0
    def __set__(self, instance, value): instance.fahrenheit = 32 + 5 * value / 9.0
class Temperature:
    def __init__(self, initial_f): self.fahrenheit = initial_f
    celsius = Celsius()
if __name__ == "__main__":
    t = Temperature(212)
    print(t.celsius)
    t.celsius = 0
    print(t.fahrenheit)
The result:
C:\Users\gkuhn\Desktop>python test2.py
<__main__.Celsius instance at 0x02E95A80>
212
With Python prior to 3, make sure you subclass from object, which will make the descriptor work correctly, as the __get__ magic does not work for old-style classes.
I'm having trouble assigning the assignment operator.
I have successfully overloaded __setattr__. But after the object is initialized, I want __setattr__ to do something else, so I try assigning it to be another function, __setattr2__.
Code:
class C(object):
    def __init__(self):
        self.x = 0
        self.__setattr__ = self.__setattr2__
    def __setattr__(self, name, value):
        print "first, setting", name
        object.__setattr__(self, name, value)
    def __setattr2__(self, name, value):
        print "second, setting", name
        object.__setattr__(self, name, value)
c = C()
c.x = 1
What I get:
first, setting x
first, setting __setattr__
first, setting x
What I want/expect:
first, setting x
first, setting __setattr__
second, setting x
From the docs:
Special method lookup for new-style classes
For new-style classes, implicit invocations of special methods are
only guaranteed to work correctly if defined on an object’s type, not
in the object’s instance dictionary. That behaviour is the reason why
the following code raises an exception (unlike the equivalent example
with old-style classes):
>>> class C(object):
...     pass
...
>>> c = C()
>>> c.__len__ = lambda: 5
>>> len(c)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: object of type 'C' has no len()
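The same rule explains the __setattr__ result in the question: the implicit call made by c.x = 1 looks for __setattr__ on type(c), not in the instance dictionary. A short Python 3 sketch of the rule, using __len__ as in the docs excerpt:

```python
class C:
    pass


c = C()

# Stored in the instance __dict__: ignored by len().
c.__len__ = lambda: 5
try:
    len(c)
except TypeError as exc:
    print("instance attribute ignored:", exc)

# Stored on the type: found by len().
C.__len__ = lambda self: 5
print(len(c))   # 5
```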
Why not use a flag to indicate that __init__ is still in progress?
class C(object):
    def __init__(self):
        # Use the superclass's __setattr__ because we've overridden our own.
        super(C, self).__setattr__('initialising', True)
        self.x = 0
        # the very last thing we do in __init__ is indicate that it's finished
        super(C, self).__setattr__('initialising', False)
    def __setattr__(self, name, value):
        if self.initialising:
            print "during __init__, setting", name
            # I happen to like super() rather than explicitly naming the superclass
            super(C, self).__setattr__(name, value)
        else:
            print "after __init__, setting", name
            super(C, self).__setattr__(name, value)
It is fairly easy to use the __getattr__ special method on Python classes to handle either missing properties or functions, but seemingly not both at the same time.
Consider this example which handles any property requested which is not defined explicitly elsewhere in the class...
class Props:
    def __getattr__(self, attr):
        return 'some_new_value'
>>> p = Props()
>>> p.prop # Property get handled
'some_new_value'
>>> p.func('an_arg', kw='keyword') # Function call NOT handled
Traceback (most recent call last):
File "<console>", line 1, in <module>
TypeError: 'str' object is not callable
Next, consider this example which handles any function call not defined explicitly elsewhere in the class...
class Funcs:
    def __getattr__(self, attr):
        def fn(*args, **kwargs):
            # Do something with the function name and any passed arguments or keywords
            print attr
            print args
            print kwargs
            return
        return fn
>>> f = Funcs()
>>> f.prop # Property get NOT handled
<function fn at 0x10df23b90>
>>> f.func('an_arg', kw='keyword') # Function call handled
func
('an_arg',)
{'kw': 'keyword'}
The question is how to handle both types of missing attributes in the same __getattr__? How to detect if the attribute requested was in property notation or in method notation with parentheses and return either a value or a function respectively? Essentially I want to handle SOME missing property attributes AND SOME missing function attributes and then resort to default behavior for all the other cases.
Advice?
How to detect if the attribute requested was in property notation or in method notation with parentheses and return either a value or a function respectively?
You can't. You also can't tell whether a requested method is an instance, class, or static method, etc. All you can tell is that someone is trying to retrieve an attribute for read access. Nothing else is passed into the getattribute machinery, so nothing else is available to your code.
So, you need some out-of-band way to know whether to create a function or some other kind of value. This is actually pretty common—you may actually be proxying for some other object that does have a value/function distinction (think of ctypes or PyObjC), or you may have a naming convention, etc.
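As one illustration of the naming-convention route, here is a Python 3 sketch. The "do_" and "val_" prefixes and the Proxy class are invented for this example; any rule that your code and its callers agree on would work the same way:

```python
class Proxy:
    """Sketch: a hypothetical naming convention decides what __getattr__ returns."""

    def __getattr__(self, attr):
        if attr.startswith("do_"):
            # Names starting with "do_" are treated as missing methods.
            def fn(*args, **kwargs):
                return (attr, args, kwargs)
            return fn
        if attr.startswith("val_"):
            # Names starting with "val_" are treated as missing plain values.
            return "some_new_value"
        # Everything else keeps the default behaviour.
        raise AttributeError(attr)


p = Proxy()
print(p.val_prop)
print(p.do_func("an_arg", kw="keyword"))
print(hasattr(p, "something_else"))   # False
```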
However, you could always return an object that can be used either way. For example, if your "default behavior" is to return attributes that are integers, or functions that return an integer, you can return something like this:
class Integerizer(object):
    def __init__(self, value):
        self.value = value
    def __int__(self):
        return self.value
    def __call__(self, *args, **kw):
        return self.value
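To see how such an object behaves when handed out by __getattr__, here is a Python 3 sketch. Integerizer is repeated so the block runs on its own; the Props class and the value 42 are placeholders of mine:

```python
class Integerizer:
    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value

    def __call__(self, *args, **kw):
        return self.value


class Props:
    def __getattr__(self, attr):
        # Hypothetical default: every missing attribute "is" 42.
        return Integerizer(42)


p = Props()
print(int(p.prop))                      # used as a value
print(p.func("an_arg", kw="keyword"))   # used as a function
```

The caller still has to convert with int() in the value case; the wrapper is not itself an integer.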
There is no way to detect how the returned attribute was intended to be used. Everything on Python objects is an attribute, including the methods:
>>> class Foo(object):
...     def bar(self): print 'bar called'
...     spam='eggs'
...
>>> Foo.bar
<unbound method Foo.bar>
>>> Foo.spam
'eggs'
Python first looks up the attribute (bar or spam), and if you meant to call it (added parentheses) then Python invokes the callable after looking up the attribute:
>>> foo = Foo()
>>> fbar = foo.bar
>>> fbar()
'bar called'
In the above code I separated the lookup of bar from calling bar.
Since there is no distinction, you cannot detect in __getattr__ what the returned attribute will be used for.
__getattr__ is called whenever normal attribute access fails; in the following example monty is defined on the class, so __getattr__ is not called; it is only called for bar.eric and bar.john:
>>> class Bar(object):
...     monty = 'python'
...     def __getattr__(self, name):
...         print 'Attribute access for {0}'.format(name)
...         if name == 'eric':
...             return 'idle'
...         raise AttributeError(name)
...
>>> bar = Bar()
>>> bar.monty
'python'
>>> bar.eric
Attribute access for eric
'idle'
>>> bar.john
Attribute access for john
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in __getattr__
AttributeError: john
Note that functions are not the only objects that you can invoke (call); any custom class that implements the __call__ method will do:
>>> class Baz(object):
...     def __call__(self, name):
...         print 'Baz sez: "Hello {0}!"'.format(name)
...
>>> baz = Baz()
>>> baz('John Cleese')
Baz sez: "Hello John Cleese!"
You could use that to return objects from __getattr__ that can both be called and used as a value in different contexts.
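One possible shape for such an object is a str subclass that is also callable. This is only a sketch for Python 3; CallableStr is a name invented here, and the Bar class is a cut-down variant of the one above.

```python
class CallableStr(str):
    """A str that also returns its own value when called."""
    def __call__(self, *args, **kwargs):
        return str(self)


class Bar(object):
    def __getattr__(self, name):
        if name == 'eric':
            return CallableStr('idle')
        raise AttributeError(name)


bar = Bar()
print(bar.eric)          # idle  (used as a value)
print(bar.eric())        # idle  (used as a method)
print(bar.eric.upper())  # IDLE  (still behaves like a str)
```

Because it subclasses str, the returned object compares and formats like the plain value, which keeps the "property notation" case transparent to callers.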
Is there some way to make a class-level read-only property in Python? For instance, if I have a class Foo, I want to say:
x = Foo.CLASS_PROPERTY
but prevent anyone from saying:
Foo.CLASS_PROPERTY = y
EDIT:
I like the simplicity of Alex Martelli's solution, but not the syntax that it requires. Both his and ~unutbu's answers inspired the following solution, which is closer to the spirit of what I was looking for:
class const_value (object):
    def __init__(self, value):
        self.__value = value
    def make_property(self):
        return property(lambda cls: self.__value)
class ROType(type):
    def __new__(mcl,classname,bases,classdict):
        class UniqeROType (mcl):
            pass
        for attr, value in classdict.items():
            if isinstance(value, const_value):
                setattr(UniqeROType, attr, value.make_property())
                classdict[attr] = value.make_property()
        return type.__new__(UniqeROType,classname,bases,classdict)
class Foo(object):
    __metaclass__=ROType
    BAR = const_value(1)
    BAZ = 2
class Bit(object):
    __metaclass__=ROType
    BOO = const_value(3)
    BAN = 4
Now, I get:
Foo.BAR
# 1
Foo.BAZ
# 2
Foo.BAR=2
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# AttributeError: can't set attribute
Foo.BAZ=3
#
I prefer this solution because:
The members get declared inline instead of after the fact, as with type(X).foo = ...
The members' values are set in the actual class's code as opposed to in the metaclass's code.
It's still not ideal because:
I have to set the __metaclass__ in order for const_value objects to be interpreted correctly.
The const_values don't "behave" like the plain values. For example, I couldn't use it as a default value for a parameter to a method in the class.
The existing solutions are a bit complex -- what about just ensuring that each class in a certain group has a unique metaclass, then setting a normal read-only property on the custom metaclass. Namely:
>>> class Meta(type):
...     def __new__(mcl, *a, **k):
...         uniquemcl = type('Uniq', (mcl,), {})
...         return type.__new__(uniquemcl, *a, **k)
...
>>> class X: __metaclass__ = Meta
...
>>> class Y: __metaclass__ = Meta
...
>>> type(X).foo = property(lambda *_: 23)
>>> type(Y).foo = property(lambda *_: 45)
>>> X.foo
23
>>> Y.foo
45
>>>
This is really much simpler, because it is based on nothing more than the fact that when you get an instance's attribute, descriptors are looked up on the class (so of course when you get a class's attribute, descriptors are looked up on the metaclass), and making each class/metaclass pair unique isn't terribly hard.
Oh, and of course:
>>> X.foo = 67
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
just to confirm it IS indeed read-only!
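The session above uses the Python 2 __metaclass__ attribute. For readers on Python 3, the same idea might look like the sketch below; the only intended change is the metaclass= keyword in the class statement, and the explicit parameter names are my own choice.

```python
class Meta(type):
    def __new__(mcl, name, bases, namespace):
        # Give every class its own metaclass subclass, so that a property
        # attached to type(X) does not leak onto other classes.
        uniquemcl = type('Uniq', (mcl,), {})
        return type.__new__(uniquemcl, name, bases, namespace)


class X(metaclass=Meta):
    pass


class Y(metaclass=Meta):
    pass


type(X).foo = property(lambda cls: 23)
type(Y).foo = property(lambda cls: 45)

print(X.foo)  # 23
print(Y.foo)  # 45

try:
    X.foo = 67
except AttributeError as exc:
    # The exact message differs between Python versions.
    print('read-only:', exc)
```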
The ActiveState solution that Pynt references makes instances of ROClass have read-only attributes. Your question seems to ask if the class itself can have read-only attributes.
Here is one way, based on Raymond Hettinger's comment:
#!/usr/bin/env python
def readonly(value):
    return property(lambda self: value)
class ROType(type):
    CLASS_PROPERTY = readonly(1)
class Foo(object):
    __metaclass__=ROType
print(Foo.CLASS_PROPERTY)
# 1
Foo.CLASS_PROPERTY=2
# AttributeError: can't set attribute
The idea is this: Consider first Raymond Hettinger's solution:
class Bar(object):
    CLASS_PROPERTY = property(lambda self: 1)
bar=Bar()
bar.CLASS_PROPERTY=2
It shows a relatively simple way to give bar a read-only property.
Notice that you have to add the CLASS_PROPERTY = property(lambda self: 1) line to the definition of the class of bar, not to bar itself.
So, if you want the class Foo to have a read-only property, then the class of Foo (that is, type(Foo)) has to have CLASS_PROPERTY = property(lambda self: 1) defined.
The class of a class is its metaclass. Hence we define ROType as the metaclass:
class ROType(type):
    CLASS_PROPERTY = readonly(1)
Then we make ROType the metaclass of Foo:
class Foo(object):
    __metaclass__=ROType
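One consequence worth checking is that a property living on the metaclass is visible on the class but not on its instances, because instance lookup searches the class and its bases, never the metaclass. A small sketch in Python 3 syntax (metaclass= replaces __metaclass__; everything else is as above):

```python
def readonly(value):
    return property(lambda self: value)


class ROType(type):
    CLASS_PROPERTY = readonly(1)


class Foo(metaclass=ROType):   # Python 3 spelling of __metaclass__ = ROType
    pass


print(Foo.CLASS_PROPERTY)                 # 1, found on type(Foo), i.e. ROType
print(hasattr(Foo(), 'CLASS_PROPERTY'))   # False, instances do not search the metaclass
```

If instances also need to read the value, it has to be exposed separately on Foo itself.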
Found this on ActiveState:
# simple read only attributes with meta-class programming
# method factory for an attribute get method
def getmethod(attrname):
    def _getmethod(self):
        return self.__readonly__[attrname]
    return _getmethod
class metaClass(type):
    def __new__(cls,classname,bases,classdict):
        readonly = classdict.get('__readonly__',{})
        for name,default in readonly.items():
            classdict[name] = property(getmethod(name))
        return type.__new__(cls,classname,bases,classdict)
class ROClass(object):
    __metaclass__ = metaClass
    __readonly__ = {'a':1,'b':'text'}
if __name__ == '__main__':
    def test1():
        t = ROClass()
        print t.a
        print t.b
    def test2():
        t = ROClass()
        t.a = 2
    test1()
Note that if you try to set a read-only attribute (t.a = 2), Python will raise an AttributeError.
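The recipe is written for Python 2. A possible Python 3 port, as a sketch only, would change the metaclass declaration and the print statements and nothing else of substance:

```python
def getmethod(attrname):
    def _getmethod(self):
        return self.__readonly__[attrname]
    return _getmethod


class metaClass(type):
    def __new__(cls, classname, bases, classdict):
        # Turn each entry of __readonly__ into a getter-only property.
        readonly = classdict.get('__readonly__', {})
        for name in readonly:
            classdict[name] = property(getmethod(name))
        return type.__new__(cls, classname, bases, classdict)


class ROClass(metaclass=metaClass):
    __readonly__ = {'a': 1, 'b': 'text'}


t = ROClass()
print(t.a, t.b)   # 1 text
try:
    t.a = 2
except AttributeError as exc:
    print('read-only:', exc)
```

As in the original, this protects attributes on instances of ROClass, not on the class object itself.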
I want to override access to one variable in a class, but return all others normally. How do I accomplish this with __getattribute__?
I tried the following (which should also illustrate what I'm trying to do) but I get a recursion error:
class D(object):
    def __init__(self):
        self.test=20
        self.test2=21
    def __getattribute__(self,name):
        if name=='test':
            return 0.
        else:
            return self.__dict__[name]
>>> print D().test
0.0
>>> print D().test2
...
RuntimeError: maximum recursion depth exceeded in cmp
You get a recursion error because your attempt to access the self.__dict__ attribute inside __getattribute__ invokes your __getattribute__ again. If you use object's __getattribute__ instead, it works:
class D(object):
    def __init__(self):
        self.test=20
        self.test2=21
    def __getattribute__(self,name):
        if name=='test':
            return 0.
        else:
            return object.__getattribute__(self, name)
This works because object (in this example) is the base class. By calling the base version of __getattribute__ you avoid the recursive hell you were in before.
IPython output with the code in foo.py:
In [1]: from foo import *
In [2]: d = D()
In [3]: d.test
Out[3]: 0.0
In [4]: d.test2
Out[4]: 21
Update:
There's something in the section titled More attribute access for new-style classes in the current documentation, where they recommend doing exactly this to avoid the infinite recursion.
Actually, I believe you want to use the __getattr__ special method instead.
Quote from the Python docs:
__getattr__( self, name)
Called when an attribute lookup has not found the attribute in the usual places (i.e. it is not an instance attribute nor is it found in the class tree for self). name is the attribute name. This method should return the (computed) attribute value or raise an AttributeError exception.
Note that if the attribute is found through the normal mechanism, __getattr__() is not called. (This is an intentional asymmetry between __getattr__() and __setattr__().) This is done both for efficiency reasons and because otherwise __setattr__() would have no way to access other attributes of the instance. Note that at least for instance variables, you can fake total control by not inserting any values in the instance attribute dictionary (but instead inserting them in another object). See the __getattribute__() method below for a way to actually get total control in new-style classes.
Note: for this to work, the instance should not have a test attribute, so the line self.test=20 should be removed.
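A minimal sketch of that __getattr__ variant, with self.test removed from __init__ as the note says (runs on Python 2 and 3):

```python
class D(object):
    def __init__(self):
        # self.test is deliberately not set, so lookups of 'test' fail
        # normally and fall through to __getattr__.
        self.test2 = 21

    def __getattr__(self, name):
        if name == 'test':
            return 0.
        raise AttributeError(name)


d = D()
print(d.test)    # 0.0
print(d.test2)   # 21
```

Keep in mind that a later d.test = 5 would store a real instance attribute, after which __getattr__ is no longer consulted for test.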
Python language reference:
In order to avoid infinite recursion in this method, its implementation should always call the base class method with the same name to access any attributes it needs, for example, object.__getattribute__(self, name).
Meaning:
def __getattribute__(self,name):
    ...
    return self.__dict__[name]
You're asking for an attribute called __dict__. Because it's an attribute, __getattribute__ gets called to look up __dict__, which calls __getattribute__, which calls ... and so on.
    return object.__getattribute__(self, name)
Using the base class's __getattribute__ finds the real attribute without recursing.
How is the __getattribute__ method used?
It is called before the normal dotted lookup. If it raises AttributeError, then we call __getattr__.
Use of this method is rather rare. There are only two definitions in the standard library:
$ grep -Erl "def __getattribute__\(self" cpython/Lib | grep -v "/test/"
cpython/Lib/_threading_local.py
cpython/Lib/importlib/util.py
Best Practice
The proper way to programmatically control access to a single attribute is with property. Class D should be written as follows (optionally with the setter and deleter, to replicate the apparently intended behavior):
class D(object):
    def __init__(self):
        self.test2=21
    @property
    def test(self):
        return 0.
    @test.setter
    def test(self, value):
        '''dummy function to avoid AttributeError on setting property'''
    @test.deleter
    def test(self):
        '''dummy function to avoid AttributeError on deleting property'''
And usage:
>>> o = D()
>>> o.test
0.0
>>> o.test = 'foo'
>>> o.test
0.0
>>> del o.test
>>> o.test
0.0
A property is a data descriptor, thus it is the first thing looked for in the normal dotted lookup algorithm.
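To see that priority in action, the sketch below writes a value directly into the instance dictionary and shows that the property on the class still wins. The direct __dict__ write is just a trick for the demonstration, not something you would normally do.

```python
class D(object):
    @property
    def test(self):
        return 0.


o = D()
# Write straight into the instance dictionary, bypassing the property.
o.__dict__['test'] = 20

print(o.__dict__)   # {'test': 20}
print(o.test)       # 0.0, the data descriptor on the class takes precedence
```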
Options for __getattribute__
You have several options if you absolutely need to implement lookup for every attribute via __getattribute__.
raise AttributeError, causing __getattr__ to be called (if implemented)
return something from it by
using super to call the parent (probably object's) implementation
calling __getattr__
implementing your own dotted lookup algorithm somehow
For example:
class NoisyAttributes(object):
    def __init__(self):
        self.test=20
        self.test2=21
    def __getattribute__(self, name):
        print('getting: ' + name)
        try:
            return super(NoisyAttributes, self).__getattribute__(name)
        except AttributeError:
            print('oh no, AttributeError caught and reraising')
            raise
    def __getattr__(self, name):
        """Called if __getattribute__ raises AttributeError"""
        return 'close but no ' + name
>>> n = NoisyAttributes()
>>> nfoo = n.foo
getting: foo
oh no, AttributeError caught and reraising
>>> nfoo
'close but no foo'
>>> n.test
getting: test
20
What you originally wanted.
And this example shows how you might do what you originally wanted:
class D(object):
    def __init__(self):
        self.test=20
        self.test2=21
    def __getattribute__(self,name):
        if name=='test':
            return 0.
        else:
            return super(D, self).__getattribute__(name)
And will behave like this:
>>> o = D()
>>> o.test = 'foo'
>>> o.test
0.0
>>> del o.test
>>> o.test
0.0
>>> del o.test
Traceback (most recent call last):
File "<pyshell#216>", line 1, in <module>
del o.test
AttributeError: test
Code review
Your code with comments. You have a dotted lookup on self in __getattribute__.
This is why you get a recursion error. You could check if name is "__dict__" and use super to work around it, but that doesn't cover __slots__. I'll leave that as an exercise to the reader.
class D(object):
    def __init__(self):
        self.test=20
        self.test2=21
    def __getattribute__(self,name):
        if name=='test':
            return 0.
        else: # v--- Dotted lookup on self in __getattribute__
            return self.__dict__[name]
>>> print D().test
0.0
>>> print D().test2
...
RuntimeError: maximum recursion depth exceeded in cmp
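For completeness, the "__dict__" check mentioned above could be sketched like this. It is shown only to illustrate where the recursion is broken, not as a recommendation.

```python
class D(object):
    def __init__(self):
        self.test = 20
        self.test2 = 21

    def __getattribute__(self, name):
        if name == 'test':
            return 0.
        if name == '__dict__':
            # Break the recursion: fetch the instance dict from the parent.
            return super(D, self).__getattribute__(name)
        # This dotted lookup re-enters __getattribute__ with '__dict__',
        # which is now handled by the branch above.
        return self.__dict__[name]


d = D()
print(d.test)    # 0.0
print(d.test2)   # 21
```

This still only finds names stored in the instance dictionary: methods, class attributes and __slots__ would raise KeyError, which is why delegating every other name to the parent's __getattribute__ is the more robust choice.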
Are you sure you want to use __getattribute__? What are you actually trying to achieve?
The easiest way to do what you ask is:
class D(object):
    def __init__(self):
        self.test = 20
        self.test2 = 21
    test = 0
or:
class D(object):
    def __init__(self):
        self.test = 20
        self.test2 = 21
    @property
    def test(self):
        return 0
Edit:
Note that an instance of D would have different values of test in each case. In the first case d.test would be 20, in the second it would be 0. I'll leave it to you to work out why.
Edit2:
Greg pointed out that example 2 will fail because the property is read only and the __init__ method tried to set it to 20. A more complete example for that would be:
class D(object):
    def __init__(self):
        self.test = 20
        self.test2 = 21
    _test = 0
    def get_test(self):
        return self._test
    def set_test(self, value):
        self._test = value
    test = property(get_test, set_test)
Obviously, as a class this is almost entirely useless, but it gives you an idea to move on from.
Here is a more reliable version:
class D(object):
    def __init__(self):
        self.test = 20
        self.test2 = 21
    def __getattribute__(self, name):
        if name == 'test':
            return 0.
        else:
            return super(D, self).__getattribute__(name)
It calls the __getattribute__ method of the parent class, eventually falling back to object.__getattribute__ if no other ancestor overrides it.