I'm learning overloading in Python 3.X and to better understand the topic, I wrote the following code that works in 3.X but not in 2.X. I expected the below code to fail since I've not defined __call__ for class Test. But to my surprise, it works and prints "constructor called". Demo.
class Test:
def __init__(self):
print("constructor called")
#Test.__getitem__() #error as expected
Test.__call__() #this works in 3.X(but not in 2.X) and prints "constructor called"! WHY THIS DOESN'T GIVE ERROR in 3.x?
So my question is that how/why exactly does this code work in 3.x but not in 2.x. I mean I want to know the mechanics behind what is going on.
More importantly, why __init__ is being used here when I am using __call__?
In 3.x:
About attribute lookup, type and object
Every time an attribute is looked up on an object, Python follows a process like this:
Is it directly a part of the actual data in the object? If so, use that and stop.
Is it directly a part of the object's class? If so, hold onto that for step 4.
Otherwise, check the object's class for __getattr__ and __getattribute__ overrides, look through base classes in the MRO, etc. (This is a massive simplification, of course.)
If something was found in step 2 or 3, check if it has a __get__. If it does, look that up (yes, that means starting over at step 1 for the attribute named __get__ on that object), call it, and use its return value. Otherwise, use what was returned directly.
Functions have a __get__ automatically; it is used to implement method binding. Classes are objects; that's why it's possible to look up attributes in them. That is: the purpose of the class Test: block is to define a data type; the code creates an object named Test which represents the data type that was defined.
But since the Test class is an object, it must be an instance of some class. That class is called type, and has a built-in implementation.
>>> type(Test)
<class 'type'>
Notice that type(Test) is not a function call. Rather, the name type is pre-defined to refer to a class, which every other class created in user code is (by default) an instance of.
In other words, type is the default metaclass: the class of classes.
>>> type
<class 'type'>
One may ask, what class does type belong to? The answer is surprisingly simple - itself:
>>> type(type) is type
True
Since the above examples call type, we conclude that type is callable. To be callable, it must have a __call__ attribute, and it does:
>>> type.__call__
<slot wrapper '__call__' of 'type' objects>
When type is called with a single argument, it looks up the argument's class (roughly equivalent to accessing the __class__ attribute of the argument). When called with three arguments, it creates a new instance of type, i.e., a new class.
How does type work?
Because this is digging right at the core of the language (allocating memory for the object), it's not quite possible to implement this in pure Python, at least for the reference C implementation (and I have no idea what sort of magic is going on in PyPy here). But we can approximately model the type class like so:
def _validate_type(obj, required_type, context):
if not isinstance(obj, required_type):
good_name = required_type.__name__
bad_name = type(obj).__name__
raise TypeError(f'{context} must be {good_name}, not {bad_name}')
class type:
def __new__(cls, name_or_obj, *args):
# __new__ implicitly gets passed an instance of the class, but
# `type` is its own class, so it will be `type` itself.
if len(args) == 0: # 1-argument form: check the type of an existing class.
return obj.__class__
# otherwise, 3-argument form: create a new class.
try:
bases, attrs = args
except ValueError:
raise TypeError('type() takes 1 or 3 arguments')
_validate_type(name, str, 'type.__new__() argument 1')
_validate_type(bases, tuple, 'type.__new__() argument 2')
_validate_type(attrs, dict, 'type.__new__() argument 3')
# This line would not work if we were actually implementing
# a replacement for `type`, as it would route to `object.__new__(type)`,
# which is explicitly disallowed. But let's pretend it does...
result = super().__new__()
# Now, fill in attributes from the parameters.
result.__name__ = name_or_obj
# Assigning to `__bases__` triggers a lot of other internal checks!
result.__bases__ = bases
for name, value in attrs.items():
setattr(result, name, value)
return result
del __new__.__get__ # `__new__`s of builtins don't implement this.
def __call__(self, *args):
return self.__new__(self, *args)
# this, however, does have a `__get__`.
What happens (conceptually) when we call the class (Test())?
Test() uses function-call syntax, but it's not a function. To figure out what should happen, we translate the call into Test.__class__.__call__(Test). (We use __class__ directly here, because translating the function call using type - asking type to categorize itself - would end up in endless recursion.)
Test.__class__ is type, so this becomes type.__call__(Test).
type contains a __call__ directly (type is its own class, remember?), so it's used directly - we don't go through the __get__ descriptor. We call the function, with Test as self, and no other arguments. (We have a function now, so we don't need to translate the function call syntax again. We could - given a function func, func.__class__.__call__.__get__(func) gives us an instance of an unnamed builtin "method wrapper" type, which does the same thing as func when called. Repeating the loop on the method wrapper creates a separate method wrapper that still does the same thing.)
This attempts the call Test.__new__(Test) (since self was bound to Test). Test.__new__ isn't explicitly defined in Test, but since Test is a class, we don't look in Test's class (type), but instead in Test's base (object).
object.__new__(Test) exists, and does magical built-in stuff to allocate memory for a new instance of the Test class, make it possible to assign attributes to that instance (even though Test is a subtype of object, which disallows that), and set its __class__ to Test.
Similarly, when we call type, the same logical chain turns type(Test) into type.__class__.__call__(type, Test) into type.__call__(type, Test), which forwards to type.__new__(type, Test). This time, there is a __new__ attribute directly in type, so this doesn't fall back to looking in object. Instead, with name_or_obj being set to Test, we simply return Test.__class__, i.e., type. And with separate name, bases, attrs arguments, type.__new__ instead creates an instance of type.
Finally: what happens when we call Test.__call__() explicitly?
If there's a __call__ defined in the class, it gets used, since it's found directly. This will fail, however, because there aren't enough arguments: the descriptor protocol isn't used since the attribute was found directly, so self isn't bound, and so that argument is missing.
If there isn't a __call__ method defined, then we look in Test's class, i.e., type. There's a __call__ there, so the rest proceeds like steps 3-5 in the previous section.
In Python 3.x, every class is implicitely a child of the builtin class object. And at least in the CPython implementation, the object class has a __call__ method which is defined in its metaclass type.
That means that Test.__call__() is exactly the same as Test() and will return a new Test object, calling your custom __init__ method.
In Python 2.x classes are by default old-style classes and are not child of object. Because of that __call__ is not defined. You can get the same behaviour in Python 2.x by using new style classes, meaning by making an explicit inheritance on object:
# Python 2 new style class
class Test(object):
...
I have Python class looking somewhat like this:
class some_class:
def __getattr__(self, name):
# Do something with "name" (by passing it to a server)
Sometimes, I am working with ptpython (an interactive Python shell) for debugging. ptpython inspects instances of the class and tries to access the __objclass__ attribute, which does not exist. In __getattr__, I could simply check if name != "__objclass__" before working with name, but I'd like to know whether there is a better way by either correctly implementing or somehow stubbing __objclass__.
The Python documentation does not say very much about it, or at least I do not understand what I have to do:
The attribute __objclass__ is interpreted by the inspect module as specifying the class where this object was defined (setting this appropriately can assist in runtime introspection of dynamic class attributes). For callables, it may indicate that an instance of the given type (or a subclass) is expected or required as the first positional argument (for example, CPython sets this attribute for unbound methods that are implemented in C).
You want to avoid interfering with this attribute. There is no reason to do any kind of stubbing manually - you want to get out of the way and let it do what it usually does. If it behaves like attributes usually do, everything will work correctly.
The correct implementation is therefore to special-case the __objclass__ attribute in your __getattr__ function and throw an AttributeError.
class some_class:
def __getattr__(self, name):
if name == "__objclass__":
raise AttributeError
# Do something with "name" (by passing it to a server)
This way it will behave the same way as it would in a class that has no __getattr__: The attribute is considered non-existant by default, until it's assigned to. The __getattr__ method won't be called if the attribute already exists, so it can be used without any issues:
>>> obj = some_class()
>>> hasattr(obj, '__objclass__')
False
>>> obj.__objclass__ = some_class
>>> obj.__objclass__
<class '__main__.some_class'>
Consider the following:
class A(object):
def __init__(self):
print 'Hello!'
def foo(self):
print 'Foo!'
def __getattribute__(self, att):
raise AttributeError()
a = A() # Works, prints "Hello!"
a.foo() # throws AttributeError as expected
The implementation of __getattribute__ obviously fails all lookups. My questions:
Why is it still possible to instantiate an object? I would have expected the lookup of the __init__ method itself to fail as well.
What's the list of attributes that are not subject to __getattribute__?
The implementation of __getattribute__ obviously fails all lookups
Let's say it fails for all vanilla lookups.
So how did __getattribute__ itself get called in the first place since it is also an attribute of the class?
An attribute would refer to any name following a dot. So to get an attribute of a class instance, __getattribute__ is summoned unconditionally when you try to access that attribute (through dot reference).
However magic methods like __init__ are part of the language construct and so are not directly invoked (via dot reference) since they are implemented as part of the language.
Why is it still possible to instantiate an object?
When you do:
a = A()
The __init__ method gets called behind the scenes, but not via a vanilla lookup. The language handles this. Same applies to other methods like __setattr__, __delattr__, __getattribute__ also and others.
But if you directly called __init__:
a.__init__()
It would raise an error. Eh, this does not make any sense since the class is already initialized.
More subtly, if you tried to access __getattribute__ from your class instance via a dot reference:
a.__getattribute__
it would also raise an AttributeError; the language invocation of the same method attempted to lookup on the attribute __getattribute__, but failed with error.
What's the list of attributes that are not subject to
__getattribute__?
Summarily, __getattribute__ comes play when you try to access any attribute via dot reference. As long as you don't try to explicitly call a magic method, __getattribute__ will not be called.
according to this guide on python descriptors
https://docs.python.org/howto/descriptor.html
method objects in new style classes are implemented using descriptors in order to avoid special casing them in attribute lookup.
the way I understand this is that there is a method object type that implements __get__ and returns a bound method object when called with an instance and an unbound method object when called with no instance and only a class. the article also states that this logic is implemented in the object.__getattribute__ method. like so:
def __getattribute__(self, key):
"Emulate type_getattro() in Objects/typeobject.c"
v = object.__getattribute__(self, key)
if hasattr(v, '__get__'):
return v.__get__(None, self)
return v
however object.__getattribute__ is itself a method! so how is it bound to an object (without infinite recursion)? if it is special cased in the attribute lookup does that not defeat the purpose of removing the old style special casing?
Actually, in CPython the default __getattribute__ implementation is not a Python method, but is instead implemented in C. It can access object slots (entries in the C structure representing Python objects) directly, without bothering to go through the pesky attribute access routine.
Just because your Python code has to do this, doesn't mean the C code has to. :-)
If you do implement a Python __getattribute__ method, just use object.__getattribute__(self, attrname), or better still, super().__getattribute__(attrname) to access attributes on self. That way you won't hit recursion either.
In the CPython implementation, the attribute access is actually handled by the tp_getattro slot in the C type object, with a fallback to the tp_getattr slot.
To be exhaustive and to fully expose what the C code does, when you use attribute access on an instance, here is the full set of functions called:
Python translates attribute access to a call to the PyObject_GetAttr() C function. The implementation for that function looks up the tp_getattro or tp_getattr slot for your class.
The object type has filled the tp_getattro slot with the PyObject_GenericGetAttr function, which delegates the call to _PyObject_GenericGetAttrWithDict (with the *dict pointer set to NULL and the suppress argument set to 0). This function is your object.__getattribute__ method (a special table maps between the name and the slots).
This _PyObject_GenericGetAttrWithDict function can access the instance __dict__ object through the tp_dict slot, but for descriptors (including methods), the _PyType_Lookup function is used.
_PyType_Lookup handles caching and delegates to find_name_in_mro on cache misses; the latter looks up attributes on the class (and superclasses). The code uses direct pointers to the tp_dict slot on each class in the MRO to reference class attributes.
If a descriptor is found by _PyType_Lookup it is returned to _PyObject_GenericGetAttrWithDict and it calls the tp_descr_get function on that object (the __get__ hook).
When you access an attribute on the class itself, instead of _PyObject_GenericGetAttrWithDict, the type->tp_getattro slot is instead serviced by the type_getattro() function, which takes metaclasses into account too. This version calls __get__ too, but leaves the instance parameter set to None.
Nowhere does this code have to recursively call __getattribute__ to access the __dict__ attribute, as it can simply reach into the C structures directly.
In addition to bypassing any instance attributes in the interest of correctness, implicit special method lookup generally also bypasses the __getattribute__() method even of the object’s metaclass.
The docs mention special methods such as __hash__, __repr__ and __len__, and I know from experience it also includes __iter__ for Python 2.7.
To quote an answer to a related question:
"Magic __methods__() are treated specially: They are internally assigned to "slots" in the type data structure to speed up their look-up, and they are only looked up in these slots."
In a quest to improve my answer to another question, I need to know: Which methods, specifically, are we talking about?
You can find an answer in the python3 documentation for object.__getattribute__, which states:
Called unconditionally to implement attribute accesses for instances of the class. If the class also defines __getattr__(), the
latter will not be called unless __getattribute__() either calls it
explicitly or raises an AttributeError. This method should return the
(computed) attribute value or raise an AttributeError exception. In
order to avoid infinite recursion in this method, its implementation
should always call the base class method with the same name to access
any attributes it needs, for example, object.__getattribute__(self,
name).
Note
This method may still be bypassed when looking up special methods as the result of implicit invocation via language syntax or built-in
functions. See Special method lookup.
also this page explains exactly how this "machinery" works. Fundamentally __getattribute__ is called only when you access an attribute with the .(dot) operator(and also by hasattr as Zagorulkin pointed out).
Note that the page does not specify which special methods are implicitly looked up, so I deem that this hold for all of them(which you may find here.
Checked in 2.7.9
Couldn't find any way to bypass the call to __getattribute__, with any of the magical methods that are found on object or type:
# Preparation step: did this from the console
# magics = set(dir(object) + dir(type))
# got 38 names, for each of the names, wrote a.<that_name> to a file
# Ended up with this:
a.__module__
a.__base__
#...
Put this at the beginning of that file, which i renamed into a proper python module (asdf.py)
global_counter = 0
class Counter(object):
def __getattribute__(self, name):
# this will count how many times the method was called
global global_counter
global_counter += 1
return super(Counter, self).__getattribute__(name)
a = Counter()
# after this comes the list of 38 attribute accessess
a.__module__
#...
a.__repr__
#...
print global_counter # you're not gonna like it... it printer 38
Then i also tried to get each of those names by getattr and hasattr -> same result. __getattribute__ was called every time.
So if anyone has other ideas... I was too lazy to look inside C code for this, but I'm sure the answer lies somewhere there.
So either there's something that i'm not getting right, or the docs are lying.
super().method will also bypass __getattribute__. This atrocious code will run just fine (Python 3.11).
class Base:
def print(self):
print("whatever")
def __getattribute__(self, item):
raise Exception("Don't access this with a dot!")
class Sub(Base):
def __init__(self):
super().print()
a = Sub()
# prints 'whatever'
a.print()
# Exception Don't access this with a dot!