Python 3 __getattribute__ vs dot access behaviour


I read a bit on python's object attribute lookup (here: https://blog.ionelmc.ro/2015/02/09/understanding-python-metaclasses/#object-attribute-lookup).
It seemed pretty straightforward, so I tried it out (python3):
class A:
    def __getattr__(self, attr):
        return (1,2,3)
a = A()
a.foobar #returns (1,2,3) as expected
a.__getattribute__('foobar') # raises AttributeError
My question is, aren't the two supposed to be identical?
Why does the second one raise an attribute error?
So apparently the answer is that the logic for a.foobar IS different from the logic for a.__getattribute__("foobar"). According to the data model, a.foobar calls a.__getattribute__("foobar"), and if that raises an AttributeError, it calls a.__getattr__('foobar').
So it seems the article has a mistake in their diagram. Is this correct?
And another question: Where does the real logic for a.foobar sit? I thought it was in __getattribute__ but apparently not entirely.
Edit:
Not a duplicate of
Difference between __getattr__ vs __getattribute__.
I am asking here what the difference is between object.foo and object.__getattribute__("foo"). This is different from __getattr__ vs __getattribute__, which is trivial...

It's easy to get the impression that __getattribute__ is responsible for more than it really is. thing.attr doesn't directly translate to thing.__getattribute__('attr'), and __getattribute__ is not responsible for calling __getattr__.
The fallback to __getattr__ happens in the part of the attribute access machinery that lies outside __getattribute__. The attribute lookup process works like this:
Find the __getattribute__ method through a direct search of the object's type's MRO, bypassing the regular attribute lookup process.
Try __getattribute__.
If __getattribute__ returned something, the attribute lookup process is complete, and that's the attribute value.
If __getattribute__ raised a non-AttributeError, the attribute lookup process is complete, and the exception propagates out of the lookup.
Otherwise, __getattribute__ raised an AttributeError. The lookup continues.
Find the __getattr__ method the same way we found __getattribute__.
If there is no __getattr__, the attribute lookup process is complete, and the AttributeError from __getattribute__ propagates.
Try __getattr__, and return or raise whatever __getattr__ returns or raises.
At least, in terms of the language semantics, it works like that. In terms of the low-level implementation, some of these steps may be optimized out in cases where they're unnecessary, and there are C hooks like tp_getattro that I haven't described. You don't need to worry about that kind of thing unless you want to dive into the CPython interpreter source code.
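The steps above can be modelled in pure Python. This is only a rough sketch of the language semantics, not CPython's actual code; the helper names find_on_type and lookup are made up for illustration.

```python
def find_on_type(obj, name):
    """Search the MRO of type(obj) directly, skipping normal attribute lookup."""
    for klass in type(obj).__mro__:
        if name in vars(klass):
            attr = vars(klass)[name]
            # Bind the function to obj the way the descriptor protocol would.
            binder = getattr(type(attr), '__get__', None)
            return binder(attr, obj, type(obj)) if binder else attr
    return None


def lookup(obj, name):
    """Approximate what `obj.name` does."""
    getattribute = find_on_type(obj, '__getattribute__')
    try:
        return getattribute(name)
    except AttributeError:
        fallback = find_on_type(obj, '__getattr__')
        if fallback is None:
            raise  # no __getattr__: the original AttributeError propagates
        return fallback(name)


class A:
    def __getattr__(self, attr):
        return (1, 2, 3)


a = A()
result = lookup(a, 'foobar')          # falls back to __getattr__
try:
    a.__getattribute__('foobar')      # no fallback when called directly
    direct_call_failed = False
except AttributeError:
    direct_call_failed = True
```

Here lookup mirrors a.foobar, while the direct a.__getattribute__('foobar') call skips the fallback step, which is why the two behave differently in the question.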

Related

Detect if a getattribute call was due to hasattr

I'm re-implementing __getattribute__ for a class.
I want to notice any unexpected failures to provide an attribute (some failures are expected, of course), because the __getattribute__ implementation turned out to be quite complex. To that end, I log a warning just before raising an AttributeError whenever my code was unable to find or provide the attribute.
I'm aware:
__getattribute__ implementations are encouraged to be as small and simple as possible.
It is considered wrong for a __getattribute__ implementation to behave differently based on how/why it was called.
Code accessing the attribute can just as well try/except instead of using hasattr.
TL;DR: Nevertheless, I'd like to detect whether a call to __getattribute__ was made because of hasattr (versus a "genuine" attempt at accessing the attribute).

This is not possible, even through stack inspection. hasattr produces no frame object in the Python call stack, as it is written in C, and trying to inspect the last Python frame to guess whether it's suspended in the middle of a hasattr call is prone to all kinds of false negatives and false positives.
If you're absolutely determined to make your best shot at it anyway, the most reliable (but still fragile) kludge I can think of is to monkey-patch builtins.hasattr with a Python function that does produce a Python stack frame:
import builtins
import inspect
import types
_builtin_hasattr = builtins.hasattr
if not isinstance(_builtin_hasattr, types.BuiltinFunctionType):
    raise Exception('hasattr already patched by someone else!')
def hasattr(obj, name):
    return _builtin_hasattr(obj, name)
builtins.hasattr = hasattr
def probably_called_from_hasattr():
    # Caller's caller's frame.
    frame = inspect.currentframe().f_back.f_back
    return frame.f_code is hasattr.__code__
Calling probably_called_from_hasattr inside __getattribute__ will then test if your __getattribute__ was probably called from hasattr. This avoids any need to assume that the calling code used the name "hasattr", or that use of the name "hasattr" corresponds to this particular __getattribute__ call, or that the hasattr call originated inside Python-level code instead of C.
The primary sources of fragility here are if someone saved a reference to the real hasattr before the monkey-patch went through, or if someone else monkey-patches hasattr (such as if someone copy-pastes this code into another file in the same program). The isinstance check attempts to catch most cases of someone else monkey-patching hasattr before us, but it's not perfect.
Additionally, if hasattr on an object written in C triggers attribute access on your object, that will look like your __getattribute__ was called from hasattr. This is the most likely way to get false positives; everything in the previous paragraph would give false negatives. You can protect against that by checking that the entry for obj in the hasattr frame's f_locals is the object it should be.
Finally, if your __getattribute__ was called from a decorator-created wrapper, subclass __getattribute__, or something similar, that will not count as a call from hasattr, even if the wrapper or override was called from hasattr, even if you want it to count.

You can use sys._getframe to get the caller frame and use inspect.getframeinfo to get the line of code that makes the call, and then use some sort of parsing mechanism such as regex (you can't use ast.parse since the one line of code is often an incomplete statement) to see if hasattr is the caller. It isn't very robust but it should work in most reasonable cases:
import inspect
import sys
import re
class A:
    def __getattribute__(self, item):
        if re.search(r'\bhasattr\b', inspect.getframeinfo(sys._getframe(1)).code_context[0]):
            print('called by hasattr')
        else:
            print('called by something else')
hasattr(A(), 'foo')
getattr(A(), 'foo')
This outputs:
called by hasattr
called by something else

When does getattribute not get involved in attribute lookup?

Consider the following:
class A(object):
    def __init__(self):
        print('Hello!')
    def foo(self):
        print('Foo!')
    def __getattribute__(self, att):
        raise AttributeError()
a = A() # Works, prints "Hello!"
a.foo() # throws AttributeError as expected
The implementation of __getattribute__ obviously fails all lookups. My questions:
Why is it still possible to instantiate an object? I would have expected the lookup of the __init__ method itself to fail as well.
What's the list of attributes that are not subject to __getattribute__?

"The implementation of __getattribute__ obviously fails all lookups"
Let's say it fails for all vanilla lookups.
So how did __getattribute__ itself get called in the first place since it is also an attribute of the class?
An attribute would refer to any name following a dot. So to get an attribute of a class instance, __getattribute__ is summoned unconditionally when you try to access that attribute (through dot reference).
However, when magic methods like __init__ are invoked implicitly by a language construct, they are not looked up via a dot reference; the interpreter finds them on the type directly.
"Why is it still possible to instantiate an object?"
When you do:
a = A()
The __init__ method gets called behind the scenes, but not via a vanilla lookup. The language handles this. The same applies to other methods such as __setattr__, __delattr__, and __getattribute__ itself.
But if you directly called __init__:
a.__init__()
It would raise an AttributeError (not that calling it again makes much sense, since the instance is already initialized).
More subtly, if you tried to access __getattribute__ from your class instance via a dot reference:
a.__getattribute__
it would also raise an AttributeError: the dot access itself goes through your __getattribute__, which fails when asked to look up the name __getattribute__.
"What's the list of attributes that are not subject to __getattribute__?"
In summary, __getattribute__ comes into play whenever you access an attribute via a dot reference. Implicit invocations of magic methods by the language (such as __init__ during instantiation) do not go through it; only explicit access to a magic method by name does.

Why there is no infinite loop while overriding getattr method in python

I am trying to override the __getattr__ method, and as I understand it, the following code snippet should loop forever: by default, object.__getattribute__(self, attr) is invoked, which will invoke the overridden __getattr__ because the attribute 'notpresent' is not present in any namespace, and this process will keep repeating. Can anybody help me figure out why this behavior is not observed here?
Moreover, I am unable to figure out why AttributeError is not raised by the implicit call to __getattribute__ made when accessing the attribute with dot notation, while it is raised the second time, when __getattribute__ is invoked explicitly within the method.
class Test(object):
    # Acts as a fallback and is invoked when __getattribute__ is unable to find the attribute
    def __getattr__(self, attr):
        print("getattr is called")
        return object.__getattribute__(self, attr)  # AttributeError is raised
t = Test()  # Test defines no __init__, so it takes no arguments
b = t.notpresent

You are calling object.__getattribute__ within Test.__getattr__.
There is no loop involved here.
Moreover, as per the docs, __getattribute__ does not implicitly call __getattr__.
Update after your edit
Here is the C-implementation of the __getattribute__ call. Especially the slot_tp_getattr_hook part.
In your case, the attribute lookup failure led to the execution of line 6072, which calls your custom __getattr__ function.
From there on, the AttributeError has been cleared. But your call to object.__getattribute__ will set it back and line 6074 or 6075 won't handle it.
The object.__getattribute__ call is implemented like so and thus (re)raises AttributeError (line 1107).

Because __getattribute__ normally only looks up the attribute in the __dict__ of the object and similar places; it does not implicitly call __getattr__ to retrieve the attribute.
Note that if __getattribute__ did call __getattr__, the __getattr__ method might be called twice whenever __getattribute__ failed to find the attribute (since the lookup is supposed to call __getattr__ when __getattribute__ fails).

Which special methods bypasses getattribute in Python?

In addition to bypassing any instance attributes in the interest of correctness, implicit special method lookup generally also bypasses the __getattribute__() method even of the object’s metaclass.
The docs mention special methods such as __hash__, __repr__ and __len__, and I know from experience it also includes __iter__ for Python 2.7.
To quote an answer to a related question:
"Magic __methods__() are treated specially: They are internally assigned to "slots" in the type data structure to speed up their look-up, and they are only looked up in these slots."
In a quest to improve my answer to another question, I need to know: Which methods, specifically, are we talking about?

You can find an answer in the python3 documentation for object.__getattribute__, which states:
Called unconditionally to implement attribute accesses for instances of the class. If the class also defines __getattr__(), the latter will not be called unless __getattribute__() either calls it explicitly or raises an AttributeError. This method should return the (computed) attribute value or raise an AttributeError exception. In order to avoid infinite recursion in this method, its implementation should always call the base class method with the same name to access any attributes it needs, for example, object.__getattribute__(self, name).
Note
This method may still be bypassed when looking up special methods as the result of implicit invocation via language syntax or built-in functions. See Special method lookup.
Also, this page explains exactly how this "machinery" works. Fundamentally, __getattribute__ is called only when you access an attribute with the . (dot) operator (and also by hasattr, as Zagorulkin pointed out).
Note that the page does not specify which special methods are implicitly looked up, so I assume that this holds for all of them (which you may find here).

Checked in 2.7.9
Couldn't find any way to bypass the call to __getattribute__, with any of the magical methods that are found on object or type:
# Preparation step: did this from the console
# magics = set(dir(object) + dir(type))
# got 38 names, for each of the names, wrote a.<that_name> to a file
# Ended up with this:
a.__module__
a.__base__
#...
Put this at the beginning of that file, which I renamed into a proper python module (asdf.py)
global_counter = 0
class Counter(object):
    def __getattribute__(self, name):
        # this will count how many times the method was called
        global global_counter
        global_counter += 1
        return super(Counter, self).__getattribute__(name)
a = Counter()
# after this comes the list of 38 attribute accesses
a.__module__
#...
a.__repr__
#...
print(global_counter) # you're not gonna like it... it printed 38
Then I also tried to get each of those names by getattr and hasattr -> same result. __getattribute__ was called every time.
So if anyone has other ideas... I was too lazy to look inside C code for this, but I'm sure the answer lies somewhere there.
So either there's something that I'm not getting right, or the docs are lying.
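The experiment above only performs explicit dot accesses, which always go through __getattribute__. The bypass the documentation describes concerns implicit invocation, which can be checked with a variation like the following (a Python 3 sketch; the calls list and the added __len__ are illustrative additions, not part of the original test).

```python
calls = []


class Counter:
    def __getattribute__(self, name):
        calls.append(name)
        return super().__getattribute__(name)

    def __len__(self):
        return 0


c = Counter()

# Explicit access by name: each one is routed through __getattribute__.
c.__repr__
c.__hash__
c.__len__
explicit = list(calls)

# Implicit invocation through built-ins: the type's slots are used instead.
calls.clear()
repr(c)
hash(c)
len(c)
implicit = list(calls)
```

Here explicit holds the three names, while implicit stays empty, which matches the documentation's note rather than contradicting it.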

super().method will also bypass __getattribute__. This atrocious code will run just fine (Python 3.11).
class Base:
    def print(self):
        print("whatever")
    def __getattribute__(self, item):
        raise Exception("Don't access this with a dot!")
class Sub(Base):
    def __init__(self):
        super().print()
a = Sub()
# prints 'whatever'
a.print()
# Exception Don't access this with a dot!

Why can't I iterate over an object which delegates via getattr to an iterable?

An example from the book Core Python Programming on the topic of delegation doesn't seem to be working. Or maybe I didn't understand the topic clearly.
Below is the code, in which the class CapOpen wraps a file object and defines a modified behaviour of file when opened in write mode. It should write all strings in UPPERCASE only.
However when I try to open the file for reading, and iterate over it to print each line, I get the following exception:
Traceback (most recent call last):
  File "D:/_Python Practice/Core Python Programming/chapter_13_Classes/WrappingFileObject.py", line 29, in <module>
    for each_line in f:
TypeError: 'CapOpen' object is not iterable
This is strange, because although I haven't explicitly defined iterator methods, I'd expect the calls to be delegated via __getattr__ to the underlying file object. Here's the code. Have I missed anything?
class CapOpen(object):
    def __init__(self, filename, mode='r', buf=-1):
        self.file = open(filename, mode, buf)
    def __str__(self):
        return str(self.file)
    def __repr__(self):
        return `self.file`
    def write(self, line):
        self.file.write(line.upper())
    def __getattr__(self, attr):
        return getattr(self.file, attr)
f = CapOpen('wrappingfile.txt', 'w')
f.write('delegation example\n')
f.write('faye is good\n')
f.write('at delegating\n')
f.close()
f = CapOpen('wrappingfile.txt', 'r')
for each_line in f: # I am getting Exception Here..
    print each_line,
I am using Python 2.7.

This is a non-intuitive consequence of a Python implementation decision for new-style classes:
In addition to bypassing any instance attributes in the interest of correctness, implicit special method lookup generally also bypasses the __getattribute__() method even of the object’s metaclass...
Bypassing the __getattribute__() machinery in this fashion provides significant scope for speed optimisations within the interpreter, at the cost of some flexibility in the handling of special methods (the special method must be set on the class object itself in order to be consistently invoked by the interpreter).
This is also explicitly pointed out in the documentation for __getattr__/__getattribute__:
Note
This method may still be bypassed when looking up special methods as the result of implicit invocation via language syntax or built-in functions. See Special method lookup for new-style classes.
In other words, you can't rely on __getattr__ to always intercept your method lookups when your attributes are undefined. This is not intuitive, because it is reasonable to expect these implicit lookups to follow the same path as all other clients that access your object. If you call f.__iter__ directly from other code, it will resolve as expected. However, that isn't the case when called directly from the language.
The book you quote is pretty old, so the original example probably used old-style classes. If you remove the inheritance from object, your code will work as intended. That being said, you should avoid writing old-style classes, since they no longer exist in Python 3. If you want to, you can still maintain the delegation style here by implementing __iter__ and immediately delegating to the underlying self.file.__iter__.
Alternatively, inherit from the file object directly and __iter__ will be available by normal lookup, so that will also work.
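The first option could look roughly like this in Python 3. It is a sketch, not the book's code: to stay self-contained it wraps an already-open file-like object (an io.StringIO here) instead of opening a filename, which is a deliberate change from the question's class.

```python
import io


class CapOpen:
    def __init__(self, stream):
        self.file = stream

    def write(self, line):
        self.file.write(line.upper())

    def __iter__(self):
        # Defined on the class, so `for line in f` can find it.
        return iter(self.file)

    def __getattr__(self, attr):
        # Everything else is still delegated to the wrapped object.
        return getattr(self.file, attr)


f = CapOpen(io.StringIO())
f.write('delegation example\n')
f.write('faye is good\n')
f.seek(0)                      # delegated via __getattr__
lines = [line for line in f]   # iteration uses the explicit __iter__
```

Ordinary methods such as seek still reach the wrapped object through __getattr__; only the special method needed an explicit definition.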

For an object to be iterable, its class has to have __iter__ or __getitem__ defined.
__getattr__ is only called when something is being retrieved from the instance, but because there are several ways that iteration is supported, Python is looking first to see if the appropriate methods even exist.
Try this:
class Fake(object):
    def __getattr__(self, name):
        print("Nope, no %s here!" % name)
        raise AttributeError
f = Fake()
for not_here in f:
    print(not_here)
As you can see, the same error is raised: TypeError: 'Fake' object is not iterable.
If you then do this:
print('__getattr__' in Fake.__dict__)
print('__iter__' in Fake.__dict__)
print('__getitem__' in Fake.__dict__)
You can see what Python is seeing: that neither __iter__ nor __getitem__ exists, so Python does not know how to iterate over it. While Python could just try and then catch the exception, I suspect the reason why it does not is that catching exceptions is quite a bit slower.
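The same point can be shown from the other direction. In this Python 3 sketch (the lambdas and variable names are just for illustration), an __iter__ stored on the instance is ignored by iter(), while one defined on the class is picked up:

```python
class Fake:
    pass


f = Fake()
f.__iter__ = lambda: iter([1, 2, 3])    # stored on the instance only

try:
    iter(f)
    instance_attr_worked = True
except TypeError:
    instance_attr_worked = False

Fake.__iter__ = lambda self: iter([4, 5, 6])   # now defined on the class
items = list(f)
```

The instance attribute never makes f iterable; once the class itself has __iter__, iteration works and uses the class-level method.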
See my answer here for the many ways to make an iterator.
