Delegation in Python [duplicate] - python

I am trying to understand when to define __getattr__ or __getattribute__. The python documentation mentions __getattribute__ applies to new-style classes. What are new-style classes?

A key difference between __getattr__ and __getattribute__ is that __getattr__ is only invoked if the attribute wasn't found the usual ways. It's good for implementing a fallback for missing attributes, and is probably the one of two you want.
__getattribute__ is invoked before looking at the actual attributes on the object, and so can be tricky to implement correctly. You can end up in infinite recursions very easily.
New-style classes derive from object, old-style classes are those in Python 2.x with no explicit base class. But the distinction between old-style and new-style classes is not the important one when choosing between __getattr__ and __getattribute__.
You almost certainly want __getattr__.

Lets see some simple examples of both __getattr__ and __getattribute__ magic methods.
__getattr__
Python will call __getattr__ method whenever you request an attribute that hasn't already been defined. In the following example my class Count has no __getattr__ method. Now in main when I try to access both obj1.mymin and obj1.mymax attributes everything works fine. But when I try to access obj1.mycurrent attribute -- Python gives me AttributeError: 'Count' object has no attribute 'mycurrent'
class Count():
def __init__(self,mymin,mymax):
self.mymin=mymin
self.mymax=mymax
obj1 = Count(1,10)
print(obj1.mymin)
print(obj1.mymax)
print(obj1.mycurrent) --> AttributeError: 'Count' object has no attribute 'mycurrent'
Now my class Count has __getattr__ method. Now when I try to access obj1.mycurrent attribute -- python returns me whatever I have implemented in my __getattr__ method. In my example whenever I try to call an attribute which doesn't exist, python creates that attribute and sets it to integer value 0.
class Count:
def __init__(self,mymin,mymax):
self.mymin=mymin
self.mymax=mymax
def __getattr__(self, item):
self.__dict__[item]=0
return 0
obj1 = Count(1,10)
print(obj1.mymin)
print(obj1.mymax)
print(obj1.mycurrent1)
__getattribute__
Now lets see the __getattribute__ method. If you have __getattribute__ method in your class, python invokes this method for every attribute regardless whether it exists or not. So why do we need __getattribute__ method? One good reason is that you can prevent access to attributes and make them more secure as shown in the following example.
Whenever someone try to access my attributes that starts with substring 'cur' python raises AttributeError exception. Otherwise it returns that attribute.
class Count:
def __init__(self,mymin,mymax):
self.mymin=mymin
self.mymax=mymax
self.current=None
def __getattribute__(self, item):
if item.startswith('cur'):
raise AttributeError
return object.__getattribute__(self,item)
# or you can use ---return super().__getattribute__(item)
obj1 = Count(1,10)
print(obj1.mymin)
print(obj1.mymax)
print(obj1.current)
Important: In order to avoid infinite recursion in __getattribute__ method, its implementation should always call the base class method with the same name to access any attributes it needs. For example: object.__getattribute__(self, name) or super().__getattribute__(item) and not self.__dict__[item]
IMPORTANT
If your class contain both getattr and getattribute magic methods then __getattribute__ is called first. But if __getattribute__ raises
AttributeError exception then the exception will be ignored and __getattr__ method will be invoked. See the following example:
class Count(object):
def __init__(self,mymin,mymax):
self.mymin=mymin
self.mymax=mymax
self.current=None
def __getattr__(self, item):
self.__dict__[item]=0
return 0
def __getattribute__(self, item):
if item.startswith('cur'):
raise AttributeError
return object.__getattribute__(self,item)
# or you can use ---return super().__getattribute__(item)
# note this class subclass object
obj1 = Count(1,10)
print(obj1.mymin)
print(obj1.mymax)
print(obj1.current)

This is just an example based on Ned Batchelder's explanation.
__getattr__ example:
class Foo(object):
def __getattr__(self, attr):
print "looking up", attr
value = 42
self.__dict__[attr] = value
return value
f = Foo()
print f.x
#output >>> looking up x 42
f.x = 3
print f.x
#output >>> 3
print ('__getattr__ sets a default value if undefeined OR __getattr__ to define how to handle attributes that are not found')
And if same example is used with __getattribute__ You would get >>> RuntimeError: maximum recursion depth exceeded while calling a Python object

New-style classes inherit from object, or from another new style class:
class SomeObject(object):
pass
class SubObject(SomeObject):
pass
Old-style classes don't:
class SomeObject:
pass
This only applies to Python 2 - in Python 3 all the above will create new-style classes.
See 9. Classes (Python tutorial), NewClassVsClassicClass and What is the difference between old style and new style classes in Python? for details.

New-style classes are ones that subclass "object" (directly or indirectly). They have a __new__ class method in addition to __init__ and have somewhat more rational low-level behavior.
Usually, you'll want to override __getattr__ (if you're overriding either), otherwise you'll have a hard time supporting "self.foo" syntax within your methods.
Extra info: http://www.devx.com/opensource/Article/31482/0/page/4

getattribute: Is used to retrieve an attribute from an instance. It captures every attempt to access an instance attribute by using dot notation or getattr() built-in function.
getattr: Is executed as the last resource when attribute is not found in an object. You can choose to return a default value or to raise AttributeError.
Going back to the __getattribute__ function; if the default implementation was not overridden; the following checks are done when executing the method:
Check if there is a descriptor with the same name (attribute name) defined in any class in the MRO chain (method object resolution)
Then looks into the instance’s namespace
Then looks into the class namespace
Then into each base’s namespace and so on.
Finally, if not found, the default implementation calls the fallback getattr() method of the instance and it raises an AttributeError exception as default implementation.
This is the actual implementation of the object.__getattribute__ method:
.. c:function:: PyObject* PyObject_GenericGetAttr(PyObject *o,
PyObject *name) Generic attribute getter function that is meant to
be put into a type object's tp_getattro slot. It looks for a
descriptor in the dictionary of classes in the object's MRO as well
as an attribute in the object's :attr:~object.dict (if
present). As outlined in :ref:descriptors, data descriptors take
preference over instance attributes, while non-data descriptors
don't. Otherwise, an :exc:AttributeError is raised.

I find that no one mentions this difference:
__getattribute__ has a default implementation, but __getattr__ does not.
class A:
pass
a = A()
a.__getattr__ # error
a.__getattribute__ # return a method-wrapper
This has a clear meaning: since __getattribute__ has a default implementation, while __getattr__ not, clearly python encourages users to implement __getattr__.

In reading through Beazley & Jones PCB, I have stumbled on an explicit and practical use-case for __getattr__ that helps answer the "when" part of the OP's question. From the book:
"The __getattr__() method is kind of like a catch-all for attribute lookup. It's a method that gets called if code tries to access an attribute that doesn't exist." We know this from the above answers, but in PCB recipe 8.15, this functionality is used to implement the delegation design pattern. If Object A has an attribute Object B that implements many methods that Object A wants to delegate to, rather than redefining all of Object B's methods in Object A just to call Object B's methods, define a __getattr__() method as follows:
def __getattr__(self, name):
return getattr(self._b, name)
where _b is the name of Object A's attribute that is an Object B. When a method defined on Object B is called on Object A, the __getattr__ method will be invoked at the end of the lookup chain. This would make code cleaner as well, since you do not have a list of methods defined just for delegating to another object.

Related

Explicit call to __call__ works and uses __init__

I'm learning overloading in Python 3.X and to better understand the topic, I wrote the following code that works in 3.X but not in 2.X. I expected the below code to fail since I've not defined __call__ for class Test. But to my surprise, it works and prints "constructor called". Demo.
class Test:
def __init__(self):
print("constructor called")
#Test.__getitem__() #error as expected
Test.__call__() #this works in 3.X(but not in 2.X) and prints "constructor called"! WHY THIS DOESN'T GIVE ERROR in 3.x?
So my question is that how/why exactly does this code work in 3.x but not in 2.x. I mean I want to know the mechanics behind what is going on.
More importantly, why __init__ is being used here when I am using __call__?
In 3.x:
About attribute lookup, type and object
Every time an attribute is looked up on an object, Python follows a process like this:
Is it directly a part of the actual data in the object? If so, use that and stop.
Is it directly a part of the object's class? If so, hold onto that for step 4.
Otherwise, check the object's class for __getattr__ and __getattribute__ overrides, look through base classes in the MRO, etc. (This is a massive simplification, of course.)
If something was found in step 2 or 3, check if it has a __get__. If it does, look that up (yes, that means starting over at step 1 for the attribute named __get__ on that object), call it, and use its return value. Otherwise, use what was returned directly.
Functions have a __get__ automatically; it is used to implement method binding. Classes are objects; that's why it's possible to look up attributes in them. That is: the purpose of the class Test: block is to define a data type; the code creates an object named Test which represents the data type that was defined.
But since the Test class is an object, it must be an instance of some class. That class is called type, and has a built-in implementation.
>>> type(Test)
<class 'type'>
Notice that type(Test) is not a function call. Rather, the name type is pre-defined to refer to a class, which every other class created in user code is (by default) an instance of.
In other words, type is the default metaclass: the class of classes.
>>> type
<class 'type'>
One may ask, what class does type belong to? The answer is surprisingly simple - itself:
>>> type(type) is type
True
Since the above examples call type, we conclude that type is callable. To be callable, it must have a __call__ attribute, and it does:
>>> type.__call__
<slot wrapper '__call__' of 'type' objects>
When type is called with a single argument, it looks up the argument's class (roughly equivalent to accessing the __class__ attribute of the argument). When called with three arguments, it creates a new instance of type, i.e., a new class.
How does type work?
Because this is digging right at the core of the language (allocating memory for the object), it's not quite possible to implement this in pure Python, at least for the reference C implementation (and I have no idea what sort of magic is going on in PyPy here). But we can approximately model the type class like so:
def _validate_type(obj, required_type, context):
if not isinstance(obj, required_type):
good_name = required_type.__name__
bad_name = type(obj).__name__
raise TypeError(f'{context} must be {good_name}, not {bad_name}')
class type:
def __new__(cls, name_or_obj, *args):
# __new__ implicitly gets passed an instance of the class, but
# `type` is its own class, so it will be `type` itself.
if len(args) == 0: # 1-argument form: check the type of an existing class.
return obj.__class__
# otherwise, 3-argument form: create a new class.
try:
bases, attrs = args
except ValueError:
raise TypeError('type() takes 1 or 3 arguments')
_validate_type(name, str, 'type.__new__() argument 1')
_validate_type(bases, tuple, 'type.__new__() argument 2')
_validate_type(attrs, dict, 'type.__new__() argument 3')
# This line would not work if we were actually implementing
# a replacement for `type`, as it would route to `object.__new__(type)`,
# which is explicitly disallowed. But let's pretend it does...
result = super().__new__()
# Now, fill in attributes from the parameters.
result.__name__ = name_or_obj
# Assigning to `__bases__` triggers a lot of other internal checks!
result.__bases__ = bases
for name, value in attrs.items():
setattr(result, name, value)
return result
del __new__.__get__ # `__new__`s of builtins don't implement this.
def __call__(self, *args):
return self.__new__(self, *args)
# this, however, does have a `__get__`.
What happens (conceptually) when we call the class (Test())?
Test() uses function-call syntax, but it's not a function. To figure out what should happen, we translate the call into Test.__class__.__call__(Test). (We use __class__ directly here, because translating the function call using type - asking type to categorize itself - would end up in endless recursion.)
Test.__class__ is type, so this becomes type.__call__(Test).
type contains a __call__ directly (type is its own class, remember?), so it's used directly - we don't go through the __get__ descriptor. We call the function, with Test as self, and no other arguments. (We have a function now, so we don't need to translate the function call syntax again. We could - given a function func, func.__class__.__call__.__get__(func) gives us an instance of an unnamed builtin "method wrapper" type, which does the same thing as func when called. Repeating the loop on the method wrapper creates a separate method wrapper that still does the same thing.)
This attempts the call Test.__new__(Test) (since self was bound to Test). Test.__new__ isn't explicitly defined in Test, but since Test is a class, we don't look in Test's class (type), but instead in Test's base (object).
object.__new__(Test) exists, and does magical built-in stuff to allocate memory for a new instance of the Test class, make it possible to assign attributes to that instance (even though Test is a subtype of object, which disallows that), and set its __class__ to Test.
Similarly, when we call type, the same logical chain turns type(Test) into type.__class__.__call__(type, Test) into type.__call__(type, Test), which forwards to type.__new__(type, Test). This time, there is a __new__ attribute directly in type, so this doesn't fall back to looking in object. Instead, with name_or_obj being set to Test, we simply return Test.__class__, i.e., type. And with separate name, bases, attrs arguments, type.__new__ instead creates an instance of type.
Finally: what happens when we call Test.__call__() explicitly?
If there's a __call__ defined in the class, it gets used, since it's found directly. This will fail, however, because there aren't enough arguments: the descriptor protocol isn't used since the attribute was found directly, so self isn't bound, and so that argument is missing.
If there isn't a __call__ method defined, then we look in Test's class, i.e., type. There's a __call__ there, so the rest proceeds like steps 3-5 in the previous section.
In Python 3.x, every class is implicitely a child of the builtin class object. And at least in the CPython implementation, the object class has a __call__ method which is defined in its metaclass type.
That means that Test.__call__() is exactly the same as Test() and will return a new Test object, calling your custom __init__ method.
In Python 2.x classes are by default old-style classes and are not child of object. Because of that __call__ is not defined. You can get the same behaviour in Python 2.x by using new style classes, meaning by making an explicit inheritance on object:
# Python 2 new style class
class Test(object):
...

How to correctly "stub" __objclass__ in a Python class?

I have Python class looking somewhat like this:
class some_class:
def __getattr__(self, name):
# Do something with "name" (by passing it to a server)
Sometimes, I am working with ptpython (an interactive Python shell) for debugging. ptpython inspects instances of the class and tries to access the __objclass__ attribute, which does not exist. In __getattr__, I could simply check if name != "__objclass__" before working with name, but I'd like to know whether there is a better way by either correctly implementing or somehow stubbing __objclass__.
The Python documentation does not say very much about it, or at least I do not understand what I have to do:
The attribute __objclass__ is interpreted by the inspect module as specifying the class where this object was defined (setting this appropriately can assist in runtime introspection of dynamic class attributes). For callables, it may indicate that an instance of the given type (or a subclass) is expected or required as the first positional argument (for example, CPython sets this attribute for unbound methods that are implemented in C).
You want to avoid interfering with this attribute. There is no reason to do any kind of stubbing manually - you want to get out of the way and let it do what it usually does. If it behaves like attributes usually do, everything will work correctly.
The correct implementation is therefore to special-case the __objclass__ attribute in your __getattr__ function and throw an AttributeError.
class some_class:
def __getattr__(self, name):
if name == "__objclass__":
raise AttributeError
# Do something with "name" (by passing it to a server)
This way it will behave the same way as it would in a class that has no __getattr__: The attribute is considered non-existant by default, until it's assigned to. The __getattr__ method won't be called if the attribute already exists, so it can be used without any issues:
>>> obj = some_class()
>>> hasattr(obj, '__objclass__')
False
>>> obj.__objclass__ = some_class
>>> obj.__objclass__
<class '__main__.some_class'>

When does __getattribute__ not get involved in attribute lookup?

Consider the following:
class A(object):
def __init__(self):
print 'Hello!'
def foo(self):
print 'Foo!'
def __getattribute__(self, att):
raise AttributeError()
a = A() # Works, prints "Hello!"
a.foo() # throws AttributeError as expected
The implementation of __getattribute__ obviously fails all lookups. My questions:
Why is it still possible to instantiate an object? I would have expected the lookup of the __init__ method itself to fail as well.
What's the list of attributes that are not subject to __getattribute__?
The implementation of __getattribute__ obviously fails all lookups
Let's say it fails for all vanilla lookups.
So how did __getattribute__ itself get called in the first place since it is also an attribute of the class?
An attribute would refer to any name following a dot. So to get an attribute of a class instance, __getattribute__ is summoned unconditionally when you try to access that attribute (through dot reference).
However magic methods like __init__ are part of the language construct and so are not directly invoked (via dot reference) since they are implemented as part of the language.
Why is it still possible to instantiate an object?
When you do:
a = A()
The __init__ method gets called behind the scenes, but not via a vanilla lookup. The language handles this. Same applies to other methods like __setattr__, __delattr__, __getattribute__ also and others.
But if you directly called __init__:
a.__init__()
It would raise an error. Eh, this does not make any sense since the class is already initialized.
More subtly, if you tried to access __getattribute__ from your class instance via a dot reference:
a.__getattribute__
it would also raise an AttributeError; the language invocation of the same method attempted to lookup on the attribute __getattribute__, but failed with error.
What's the list of attributes that are not subject to
__getattribute__?
Summarily, __getattribute__ comes play when you try to access any attribute via dot reference. As long as you don't try to explicitly call a magic method, __getattribute__ will not be called.

The __getattribute__ method and descriptors

according to this guide on python descriptors
https://docs.python.org/howto/descriptor.html
method objects in new style classes are implemented using descriptors in order to avoid special casing them in attribute lookup.
the way I understand this is that there is a method object type that implements __get__ and returns a bound method object when called with an instance and an unbound method object when called with no instance and only a class. the article also states that this logic is implemented in the object.__getattribute__ method. like so:
def __getattribute__(self, key):
"Emulate type_getattro() in Objects/typeobject.c"
v = object.__getattribute__(self, key)
if hasattr(v, '__get__'):
return v.__get__(None, self)
return v
however object.__getattribute__ is itself a method! so how is it bound to an object (without infinite recursion)? if it is special cased in the attribute lookup does that not defeat the purpose of removing the old style special casing?
Actually, in CPython the default __getattribute__ implementation is not a Python method, but is instead implemented in C. It can access object slots (entries in the C structure representing Python objects) directly, without bothering to go through the pesky attribute access routine.
Just because your Python code has to do this, doesn't mean the C code has to. :-)
If you do implement a Python __getattribute__ method, just use object.__getattribute__(self, attrname), or better still, super().__getattribute__(attrname) to access attributes on self. That way you won't hit recursion either.
In the CPython implementation, the attribute access is actually handled by the tp_getattro slot in the C type object, with a fallback to the tp_getattr slot.
To be exhaustive and to fully expose what the C code does, when you use attribute access on an instance, here is the full set of functions called:
Python translates attribute access to a call to the PyObject_GetAttr() C function. The implementation for that function looks up the tp_getattro or tp_getattr slot for your class.
The object type has filled the tp_getattro slot with the PyObject_GenericGetAttr function, which delegates the call to _PyObject_GenericGetAttrWithDict (with the *dict pointer set to NULL and the suppress argument set to 0). This function is your object.__getattribute__ method (a special table maps between the name and the slots).
This _PyObject_GenericGetAttrWithDict function can access the instance __dict__ object through the tp_dict slot, but for descriptors (including methods), the _PyType_Lookup function is used.
_PyType_Lookup handles caching and delegates to find_name_in_mro on cache misses; the latter looks up attributes on the class (and superclasses). The code uses direct pointers to the tp_dict slot on each class in the MRO to reference class attributes.
If a descriptor is found by _PyType_Lookup it is returned to _PyObject_GenericGetAttrWithDict and it calls the tp_descr_get function on that object (the __get__ hook).
When you access an attribute on the class itself, instead of _PyObject_GenericGetAttrWithDict, the type->tp_getattro slot is instead serviced by the type_getattro() function, which takes metaclasses into account too. This version calls __get__ too, but leaves the instance parameter set to None.
Nowhere does this code have to recursively call __getattribute__ to access the __dict__ attribute, as it can simply reach into the C structures directly.

Which special methods bypasses __getattribute__ in Python?

In addition to bypassing any instance attributes in the interest of correctness, implicit special method lookup generally also bypasses the __getattribute__() method even of the object’s metaclass.
The docs mention special methods such as __hash__, __repr__ and __len__, and I know from experience it also includes __iter__ for Python 2.7.
To quote an answer to a related question:
"Magic __methods__() are treated specially: They are internally assigned to "slots" in the type data structure to speed up their look-up, and they are only looked up in these slots."
In a quest to improve my answer to another question, I need to know: Which methods, specifically, are we talking about?
You can find an answer in the python3 documentation for object.__getattribute__, which states:
Called unconditionally to implement attribute accesses for instances of the class. If the class also defines __getattr__(), the
latter will not be called unless __getattribute__() either calls it
explicitly or raises an AttributeError. This method should return the
(computed) attribute value or raise an AttributeError exception. In
order to avoid infinite recursion in this method, its implementation
should always call the base class method with the same name to access
any attributes it needs, for example, object.__getattribute__(self,
name).
Note
This method may still be bypassed when looking up special methods as the result of implicit invocation via language syntax or built-in
functions. See Special method lookup.
also this page explains exactly how this "machinery" works. Fundamentally __getattribute__ is called only when you access an attribute with the .(dot) operator(and also by hasattr as Zagorulkin pointed out).
Note that the page does not specify which special methods are implicitly looked up, so I deem that this hold for all of them(which you may find here.
Checked in 2.7.9
Couldn't find any way to bypass the call to __getattribute__, with any of the magical methods that are found on object or type:
# Preparation step: did this from the console
# magics = set(dir(object) + dir(type))
# got 38 names, for each of the names, wrote a.<that_name> to a file
# Ended up with this:
a.__module__
a.__base__
#...
Put this at the beginning of that file, which i renamed into a proper python module (asdf.py)
global_counter = 0
class Counter(object):
def __getattribute__(self, name):
# this will count how many times the method was called
global global_counter
global_counter += 1
return super(Counter, self).__getattribute__(name)
a = Counter()
# after this comes the list of 38 attribute accessess
a.__module__
#...
a.__repr__
#...
print global_counter # you're not gonna like it... it printer 38
Then i also tried to get each of those names by getattr and hasattr -> same result. __getattribute__ was called every time.
So if anyone has other ideas... I was too lazy to look inside C code for this, but I'm sure the answer lies somewhere there.
So either there's something that i'm not getting right, or the docs are lying.
super().method will also bypass __getattribute__. This atrocious code will run just fine (Python 3.11).
class Base:
def print(self):
print("whatever")
def __getattribute__(self, item):
raise Exception("Don't access this with a dot!")
class Sub(Base):
def __init__(self):
super().print()
a = Sub()
# prints 'whatever'
a.print()
# Exception Don't access this with a dot!

Categories