Call instance method using __func__ in Python

I am new to Python, and I don't quite understand __func__ in Python 2.7.
I know when I define a class like this:
class Foo:
    def f(self, arg):
        print arg
I can use either Foo().f('a') or Foo.f(Foo(), 'a') to call this method. However, I can't call it as Foo.f(Foo, 'a'). But I accidentally found that I can use Foo.f.__func__(Foo, 'a'), or even Foo.f.__func__(1, 'a'), to get the same result.
I printed out the values of Foo.f, Foo().f and Foo.f.__func__, and they are all different, even though I only wrote one definition. Can someone explain how the code above actually works, especially __func__? I'm really confused now.

When you access Foo.f or Foo().f a method is returned; it's unbound in the first case and bound in the second. A python method is essentially a wrapper around a function that also holds a reference to the class it is a method of. When bound, it also holds a reference to the instance.
When you call a method, it does a type check on the first argument passed in to make sure it is an instance of the referenced class (or of a subclass of that class). When the method is bound, it supplies that first argument itself; on an unbound method, you provide it yourself.
It's this method object that has the __func__ attribute, which is just a reference to the wrapped function. By accessing the underlying function instead of calling the method, you remove the typecheck, and you can pass in anything you want as the first argument. Functions don't care about their argument types, but methods do.
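For instance, a minimal Python 2.7 sketch of that difference, reusing the Foo from the question:
class Foo:
    def f(self, arg):
        print arg

Foo.f(Foo(), 'a')        # OK: the first argument is a Foo instance
Foo.f.__func__(1, 'a')   # also prints 'a': plain function, no type check
# Foo.f(1, 'a')          # TypeError: unbound method f() must be called
                         # with Foo instance as first argument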
Note that in Python 3 this has changed; Foo.f just returns the function, not an unbound method. Foo().f still returns a bound method, but there is no way to create an unbound method any more.
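For comparison, a short Python 3 sketch of the same class (the __func__ detour is no longer needed to skip the check):
class Foo:
    def f(self, arg):
        print(arg)

assert Foo.f is Foo.__dict__['f']   # class access returns the plain function
Foo.f(1, 'a')                       # prints 'a'; any first argument is accepted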
Under the hood, each function object has a __get__ method, this is what returns the method object:
>>> class Foo(object):
...     def f(self): pass
...
>>> Foo.f
<unbound method Foo.f>
>>> Foo().f
<bound method Foo.f of <__main__.Foo object at 0x11046bc10>>
>>> Foo.__dict__['f']
<function f at 0x110450230>
>>> Foo.f.__func__
<function f at 0x110450230>
>>> Foo.f.__func__.__get__(Foo(), Foo)
<bound method Foo.f of <__main__.Foo object at 0x11046bc50>>
>>> Foo.f.__func__.__get__(None, Foo)
<unbound method Foo.f>
This isn't the most efficient code path, so Python 3.7 added a new LOAD_METHOD - CALL_METHOD opcode pair that replaces the current LOAD_ATTR - CALL_FUNCTION opcode pair, precisely to avoid creating a new method object each time. This optimisation transforms the execution path for instance.foo() from type(instance).__dict__['foo'].__get__(instance, type(instance))() to type(instance).__dict__['foo'](instance), so 'manually' passing the instance directly to the function object. This saves about 20% on existing microbenchmarks.
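If you want to see this from Python itself, a rough sketch with the dis module (the exact opcode names depend on the interpreter version; 3.7-3.10 show LOAD_METHOD/CALL_METHOD, later versions use different names):
import dis

# Disassemble a method-call expression; the name 'instance' is only compiled, not executed.
dis.dis("instance.foo()")   # on 3.7-3.10: LOAD_NAME, LOAD_METHOD, CALL_METHOD, ...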

Related

How does python know when to pass in `self`?

Consider a trivial example:
class C:
    @staticmethod
    def my_static_method():
        print("static")

    def my_instance_method(self):
        print("self")
When I call C().my_static_method(), python doesn't pass the instance of C into my_static_method, and the descriptor that my_static_method references doesn't expect an instance of C, either.
This makes sense.
But then when I call C().my_instance_method(), how does python know to pass the instance of C that I'm calling my_instance_method from in as an argument, without me specifying anything?
As the link explains, function objects are descriptors! Just like staticmethod objects.
They have a __get__ method which returns a bound-method object, which essentially just partially applies the instance itself as the first positional argument. Consider:
>>> def foo(self):
...     return self.bar
...
>>> class Baz:
...     bar = 42
...
>>> baz = Baz()
>>> bound_method = foo.__get__(baz, Baz)
>>> bound_method
<bound method foo of <__main__.Baz object at 0x7ffcd001c7f0>>
>>> bound_method()
42
By adding the @staticmethod decorator to my_static_method, you told Python not to pass the calling instance of C into the function. So you can call this function as C.my_static_method().
By calling C() you created an instance of C. Then you called the non-static function my_instance_method(), to which Python happily passed your new instance of C as the first parameter.
What happens when you call C.my_instance_method() ?
Rhetorical: you'll get a "missing 1 required positional argument: 'self'" TypeError, since my_instance_method only works when called from an instance, unless you decorate it as static.
Of course you can still call the static method from an instance, C().my_static_method(), but you don't have a self parameter, so you have no access to the instance.
The key point here is that methods are just functions that happen to be attributes of a class. The actual magic, in Python, happens in the attribute lookup process. The link you give explains earlier just how much happens every time x.y happens in Python. (Remember, everything is an object; that includes functions, classes, modules, type (which is an instance of itself)...)
This process is why descriptors can work at all; why we need explicit self; and why we can do fun things like calling a method with normal function call syntax (as long as we look it up from the class rather than an instance), alias it, mimic the method binding process with functools.partial....
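For example, a small sketch (with a made-up class Greeter) of the plain function-call syntax, aliasing, and faking the binding step with functools.partial:
import functools

class Greeter:
    def greet(self, name):
        return "hello " + name

g = Greeter()
assert Greeter.greet(g, "world") == "hello world"   # normal function-call syntax
alias = g.greet                                     # a bound method, usable later
fake = functools.partial(Greeter.greet, g)          # partial application of self
assert alias("world") == fake("world") == "hello world"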
Suppose we have c = C(). When you do c.my_instance_method (never mind calling it for now), Python looks for my_instance_method in type(c) (i.e., in the C class), checks whether it's a descriptor, and also whether it's specifically a data descriptor. Functions are non-data descriptors; even outside of a class, you can write
>>> def x(spam): return spam
...
>>> x.__get__
<method-wrapper '__get__' of function object at 0x...>
Because of the priority rules, as long as c doesn't directly have an attribute attached with the same name, the function will be found in C and its __get__ will be used. Note that the __get__ in question is not found with the same process as x.__get__ above: the ordinary lookup checks the class because that's one of the places attribute lookup looks, but when c.my_instance_method triggers the descriptor protocol, Python fetches __get__ from the type of the function directly. Attaching a __get__ attribute directly to a function instance wouldn't change anything (which is why staticmethod is implemented as a class instead).
That __get__ implements the actual method binding. Let's pretend we found x as a method in the str class:
>>> x.__get__('spam', str)
<bound method x of 'spam'>
>>> x.__get__('spam', str)()
'spam'
Remember, although __get__ itself takes three arguments, we're calling it as a method of x - so x gets bound in as its first argument in just the same way. Equivalently, and more faithfully to the actual process:
>>> type(x).__get__(x, 'spam', str)
<bound method x of 'spam'>
>>> type(x).__get__(x, 'spam', str)()
'spam'
So what exactly is that "bound method", anyway?
>>> bound = type(x).__get__(x, 'spam', str)
>>> type(bound)
<class 'method'>
>>> bound.__call__
<method-wrapper '__call__' of method object at 0x...>
>>> bound.__func__
<function x at 0x...>
>>> bound.__self__
'spam'
>>> type(bound)(x, 'eggs')
<bound method x of 'eggs'>
Pretty much what you'd expect: it's a callable object that stores and uses the original function and self value, and does the obvious thing in __call__.
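You can also build such an object by hand with types.MethodType, which is the same type that type(bound) named above; a quick sketch:
import types

def x(self):
    return self

m = types.MethodType(x, 'spam')   # bind x to the string 'spam'
assert m() == 'spam'
assert m.__func__ is x
assert m.__self__ == 'spam'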

type(MyClass.myclassmethod) returns "method", not "classmethod"

I just stumbled over this strange behavior when the type of a method changes during subclassing:
class A:
    def f(self, x):
        return x**2

class B(A):
    @classmethod
    def f(cls, x):
        return x**2
If I now ask for the type of B.f, I'll get the (supposedly) wrong answer:
In [37]: type(B.f)
Out[37]: method
Whereas this works as expected:
In [39]: type(B.__dict__["f"])
Out[39]: classmethod
(Seen in Python 3.4 and 3.6.)
Is this just a bug or is there a specific reason for this?
What's the difference between the attribute f and the .__dict__["f"] item? I thought they were the same.
In a testing suite, I was trying to support both types of methods inside a class to be tested. To be able to do that, I need to know the type in order to pass the correct number of arguments. If it's a normal method (i.e. self is the first argument), I'd just pass None explicitly, which by design shouldn't be used inside the method anyway, since it's not instance-dependent.
Maybe there's a better way to do this, like duck typing the call to the method. But there might be cases where this is not so easy to do, like if the method had *args and **kwargs... Therefore I went with the explicit type check, but got stuck at this point.
No, this is not a bug, this is normal behaviour. A classmethod produces a bound method when accessed on a class. That's exactly the point of a classmethod, to bind a function to the class you access it on or the class of an instance you access it on.
Like function and property objects, classmethod is a descriptor object; it implements a __get__ method. Accessing attributes on an instance or a class is delegated to the __getattribute__ method, and the default implementation of that hook will not just return what it found in object.__dict__[attributename]; it will also bind descriptors, by calling the descriptor.__get__() method. This is a hugely important aspect of Python; it is this mechanism that makes methods, properties and loads of other things work.
classmethod objects, when bound by the descriptor protocol, return a method object. Method objects are wrappers that record the object bound to and the function to call when they are called; calling a method really calls the underlying function with the bound object as the first argument:
>>> class Foo:
...     pass
...
>>> def bar(*args): print(args)
...
>>> classmethod(bar).__get__(None, Foo) # decorate with classmethod and bind
<bound method bar of <class '__main__.Foo'>>
>>> method = classmethod(bar).__get__(None, Foo)
>>> method.__self__
<class '__main__.Foo'>
>>> method.__func__
<function bar at 0x1056f0e18>
>>> method()
(<class '__main__.Foo'>,)
>>> method('additional arguments')
(<class '__main__.Foo'>, 'additional arguments')
So the method object returned for a classmethod object references the class (the second argument to __get__, the owner), and the original function. If you use a class method on an instance, the first argument is still ignored:
>>> classmethod(bar).__get__(Foo(), Foo).__self__ # called on an instance
<class '__main__.Foo'>
Functions, on the other hand, want to bind only to instances; so if the first argument to __get__ is set to None, they simply return themselves (the function object):
>>> bar.__get__(None, Foo) # access on a class
<function bar at 0x1056f0e18>
>>> bar.__get__(Foo(), Foo) # access on an instance
<bound method bar of <__main__.Foo object at 0x105833a90>>
>>> bar.__get__(Foo(), Foo).__self__
<__main__.Foo object at 0x105833160>
If accessing ClassObject.classmethod_object would return the classmethod object itself, like a function object would, then you could never actually use the class method on a class. That'd be rather pointless.
So no, object.attribute is not always the same thing as object.__dict__['attribute']. If object.__dict__['attribute'] supports the descriptor protocol, it'll be invoked.
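For the testing-suite use case, that suggests inspecting the class __dict__ (or walking the MRO for inherited names) rather than the already-bound attribute; a minimal sketch:
class B:
    @classmethod
    def f(cls, x):
        return x ** 2

print(type(B.f))                                  # <class 'method'> -- already bound to B
print(type(B.__dict__['f']))                      # <class 'classmethod'>
print(isinstance(B.__dict__['f'], classmethod))   # True
print(isinstance(B.__dict__['f'], staticmethod))  # False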

Are there more than three types of methods in Python?

I understand there are at least 3 kinds of methods in Python having different first arguments:
instance method - instance, i.e. self
class method - class, i.e. cls
static method - nothing
These classic methods are implemented in the Test class below, together with an unusual method:
class Test():
    def __init__(self):
        pass

    def instance_mthd(self):
        print("Instance method.")

    @classmethod
    def class_mthd(cls):
        print("Class method.")

    @staticmethod
    def static_mthd():
        print("Static method.")

    def unknown_mthd():
        # No decoration --> instance method, but
        # no self (or cls) --> static method, so ... (?)
        print("Unknown method.")
In Python 3, the unknown_mthd can be called safely, yet it raises an error in Python 2:
>>> t = Test()
>>> # Python 3
>>> t.instance_mthd()
>>> Test.class_mthd()
>>> t.static_mthd()
>>> Test.unknown_mthd()
Instance method.
Class method.
Static method.
Unknown method.
>>> # Python 2
>>> Test.unknown_mthd()
TypeError: unbound method unknown_mthd() must be called with Test instance as first argument (got nothing instead)
This error suggests such a method was not intended in Python 2. Perhaps its allowance now is due to the elimination of unbound methods in Python 3 (REF 001). Moreover, unknown_mthd does not accept args, and it can be called on the class like a staticmethod, Test.unknown_mthd(). However, it is not an explicit staticmethod (no decorator).
Questions
Was making a method this way (without args while not explicitly decorated as staticmethods) intentional in Python 3's design? UPDATED
Among the classic method types, what type of method is unknown_mthd?
Why can unknown_mthd be called by the class without passing an argument?
Some preliminary inspection yields inconclusive results:
>>> # Types
>>> print("i", type(t.instance_mthd))
>>> print("c", type(Test.class_mthd))
>>> print("s", type(t.static_mthd))
>>> print("u", type(Test.unknown_mthd))
>>> print()
>>> # __dict__ Types, REF 002
>>> print("i", type(t.__class__.__dict__["instance_mthd"]))
>>> print("c", type(t.__class__.__dict__["class_mthd"]))
>>> print("s", type(t.__class__.__dict__["static_mthd"]))
>>> print("u", type(t.__class__.__dict__["unknown_mthd"]))
>>> print()
i <class 'method'>
c <class 'method'>
s <class 'function'>
u <class 'function'>
i <class 'function'>
c <class 'classmethod'>
s <class 'staticmethod'>
u <class 'function'>
The first set of type inspections suggests unknown_mthd is something similar to a staticmethod. The second suggests it resembles an instance method. I'm not sure what this method is or why it should be used over the classic ones. I would appreciate some advice on how to inspect and understand it better. Thanks.
REF 001: What's New in Python 3: “unbound methods” has been removed
REF 002: How to distinguish an instance method, a class method, a static method or a function in Python 3?
REF 003: What's the point of @staticmethod in Python?
Some background: In Python 2, "regular" instance methods could give rise to two kinds of method objects, depending on whether you accessed them via an instance or the class. If you did inst.meth (where inst is an instance of the class), you got a bound method object, which keeps track of which instance it is attached to, and passes it as self. If you did Class.meth (where Class is the class), you got an unbound method object, which had no fixed value of self, but still did a check to make sure a self of the appropriate class was passed when you called it.
In Python 3, unbound methods were removed. Doing Class.meth now just gives you the "plain" function object, with no argument checking at all.
Was making a method this way intentional in Python 3's design?
If you mean, was removal of unbound methods intentional, the answer is yes. You can see discussion from Guido on the mailing list. Basically it was decided that unbound methods add complexity for little gain.
Among the classic method types, what type of method is unknown_mthd?
It is an instance method, but a broken one. When you access it, a bound method object is created, but since it accepts no arguments, it's unable to accept the self argument and can't be successfully called.
Why can unknown_mthd be called by the class without passing an argument?
In Python 3, unbound methods were removed, so Test.unknown_mthd is just a plain function. No wrapping takes place to handle the self argument, so you can call it as a plain function that accepts no arguments. In Python 2, Test.unknown_mthd is an unbound method object, which has a check that enforces passing a self argument of the appropriate class; since, again, the method accepts no arguments, this check fails.
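A short Python 3 sketch of both call paths:
class Test:
    def unknown_mthd():
        print("Unknown method.")

Test.unknown_mthd()          # fine: just a plain zero-argument function
try:
    Test().unknown_mthd()    # the bound method passes the instance as an extra argument
except TypeError as exc:
    print(exc)               # unknown_mthd() takes 0 positional arguments but 1 was given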
@BrenBarn did a great job answering your question. This answer, however, adds a plethora of details:
First of all, this change in bound and unbound methods is version-specific, and it doesn't relate to new-style or classic classes:
2.X classic classes by default
>>> class A:
...     def meth(self): pass
...
>>> A.meth
<unbound method A.meth>
>>> class A(object):
...     def meth(self): pass
...
>>> A.meth
<unbound method A.meth>
3.X new-style classes by default
>>> class A:
...     def meth(self): pass
...
>>> A.meth
<function A.meth at 0x7efd07ea0a60>
You've already mentioned this in your question, it doesn't hurt to mention it twice as a reminder.
>>> # Python 2
>>> Test.unknown_mthd()
TypeError: unbound method unknown_mthd() must be called with Test instance as first argument (got nothing instead)
Moreover, unknown_mthd does not accept args, and it can be bound to a class like a staticmethod, Test.unknown_mthd(). However, it is not an explicit staticmethod (no decorator)
unknown_mthd doesn't accept args simply because you defined it without any parameters. Be careful and cautious: static methods, as well as your unknown_mthd, will not be magically bound to the class when you reference them through the class name (e.g. Test.unknown_mthd). Under Python 3.X, Test.unknown_mthd returns a plain function object, not a method bound to the class.
1 - Was making a method this way (without args while not explicitly decorated as staticmethods) intentional in Python 3's design? UPDATED
I cannot speak for the CPython developers, nor do I claim to be their representative, but from my experience as a Python programmer it seems like they wanted to get rid of a bad restriction, especially given that Python is extremely dynamic, not a language of restrictions; why would you type-check the objects passed to methods and hence restrict them to specific instances of classes? Type testing eliminates polymorphism. It is reasonable to just return a plain function when a method is fetched through the class, which then functionally behaves like a static method. You can think of unknown_mthd as a static method under 3.X; as long as you're careful not to fetch it through an instance of Test, you're good to go.
2- Among the classic method types, what type of method is unknown_mthd?
Under 3.X:
>>> from types import *
>>> class Test:
...     def unknown_mthd(): pass
...
>>> type(Test.unknown_mthd) is FunctionType
True
It's simply a function in 3.X as you could see. Continuing the previous session under 2.X:
>>> type(Test.__dict__['unknown_mthd']) is FunctionType
True
>>> type(Test.unknown_mthd) is MethodType
True
unknown_mthd is a simple function that lives inside Test.__dict__, the namespace dictionary of Test. Then, when does it become an instance of MethodType? It becomes an instance of MethodType when you fetch the attribute either from the class itself, which (in 2.X) returns an unbound method, or from one of its instances, which returns a bound method. In 3.X, Test.unknown_mthd is a simple function (an instance of FunctionType), and Test().unknown_mthd is an instance of MethodType that retains the instance of Test and implicitly passes it as the first argument when called.
3- Why can unknown_mthd be called by the class without passing an argument?
Again, because Test.unknown_mthd is just a plain function under 3.X. In 2.X, by contrast, Test.unknown_mthd is an unbound method and must be passed an instance of Test when called.
Are there more than three types of methods in Python?
Yes. There are the three built-in kinds that you mention (instance method, class method, static method), four if you count @property, and anyone can define new method types.
Once you understand the mechanism for doing this, it's easy to explain why unknown_mthd is callable from the class in Python 3.
A new kind of method
Suppose we wanted to create a new type of method, call it optionalselfmethod so that we could do something like this:
class Test(object):
    def instance_mthd(self, *args):
        print('Instance method:', self, *args)

    @optionalselfmethod
    def optionalself_mthd(self, *args):
        print('Optional-Self Method:', self, *args)
The usage is like this:
In [3]: Test.optionalself_mthd(1, 2)
Optional-Self Method: None 1 2
In [4]: x = Test()
In [5]: x.optionalself_mthd(1, 2)
Optional-Self Method: <test.Test object at 0x7fe80049d748> 1 2
In [6]: Test.instance_mthd(1, 2)
Instance method: 1 2
optionalselfmethod works like a normal instance method when called on an instance, but when called on the class, it always receives None for the first parameter. If it were a normal instance method, you would always have to pass an explicit value for the self parameter in order for it to work.
So how does this work? How can you create a new method type like this?
The Descriptor Protocol
When Python looks up a field of an instance, i.e. when you do x.whatever, it checks several places. It checks the instance's __dict__ of course, but it also checks the __dict__ of the object's class, and of the base classes thereof. In the instance dict, Python is just looking for the value, so if x.__dict__['whatever'] exists, that's the value. However, in the class dict, Python is looking for an object which implements the Descriptor Protocol.
The Descriptor Protocol is how all three built-in kinds of methods work, it's how #property works, and it's how our special optionalselfmethod will work.
Basically, if the class dict has a value with the correct name [1], Python checks whether it has a __get__ method and calls it, roughly as type(x).__dict__['whatever'].__get__(x, type(x)). The value returned from __get__ is then used as the field value.
So for example, a trivial descriptor which always returns 3:
class GetExample:
    def __get__(self, instance, cls):
        print("__get__", instance, cls)
        return 3

class Test:
    get_test = GetExample()
Usage is like this:
In[22]: x = Test()
In[23]: x.get_test
__get__ <__main__.Test object at 0x7fe8003fc470> <class '__main__.Test'>
Out[23]: 3
Notice that the descriptor is called with both the instance and the class type. It can also be used on the class:
In [29]: Test.get_test
__get__ None <class '__main__.Test'>
Out[29]: 3
When a descriptor is used on a class rather than an instance, the __get__ method gets None for self, but still gets the class argument.
This allows a simple implementation of methods: functions simply implement the descriptor protocol. When you call __get__ on a function, it returns a method bound to that instance. If the instance is None, it returns the original function. You can actually call __get__ yourself to see this:
In [30]: x = object()
In [31]: def test(self, *args):
...:     print(f'Not really a method: self<{self}>, args: {args}')
...:
In [32]: test
Out[32]: <function __main__.test>
In [33]: test.__get__(None, object)
Out[33]: <function __main__.test>
In [34]: test.__get__(x, object)
Out[34]: <bound method test of <object object at 0x7fe7ff92d890>>
@classmethod and @staticmethod are similar. These decorators create proxy objects with __get__ methods which provide different binding: classmethod's __get__ binds the method to the class (the owner argument, or the class of the instance), and staticmethod's __get__ doesn't bind to anything at all, even when accessed on an instance.
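As a rough illustration only (the real built-ins are implemented in C), pure-Python stand-ins for the two decorators might look like this; the names my_staticmethod and my_classmethod are made up:
class my_staticmethod:
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        return self.func                                # no binding at all

class my_classmethod:
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if owner is None:
            owner = type(instance)
        return self.func.__get__(owner, type(owner))    # bind to the class, not the instance

class C:
    @my_staticmethod
    def s():
        return "static"

    @my_classmethod
    def c(cls):
        return cls.__name__

assert C.s() == C().s() == "static"
assert C.c() == C().c() == "C"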
The Optional-Self Method Implementation
We can do something similar to create a new method which optionally binds to an instance.
import functools

class optionalselfmethod:
    def __init__(self, function):
        self.function = function
        functools.update_wrapper(self, function)

    def __get__(self, instance, cls):
        return boundoptionalselfmethod(self.function, instance)

class boundoptionalselfmethod:
    def __init__(self, function, instance):
        self.function = function
        self.instance = instance
        functools.update_wrapper(self, function)

    def __call__(self, *args, **kwargs):
        return self.function(self.instance, *args, **kwargs)

    def __repr__(self):
        return f'<bound optionalselfmethod {self.__name__} of {self.instance}>'
When you decorate a function with optionalselfmethod, the function is replaced with our proxy. This proxy saves the original function and supplies a __get__ method which returns a boundoptionalselfmethod. When we create a boundoptionalselfmethod, we tell it both the function to call and the value to pass as self. Finally, calling the boundoptionalselfmethod calls the original function, but with the instance or None inserted as the first argument.
Specific Questions
Was making a method this way (without args while not explicitly
decorated as staticmethods) intentional in Python 3's design? UPDATED
I believe this was intentional; however the intent would have been to eliminate unbound methods. In both Python 2 and Python 3, def always creates a function (you can see this by checking a type's __dict__: even though Test.instance_mthd comes back as <unbound method Test.instance_mthd>, Test.__dict__['instance_mthd'] is still <function instance_mthd at 0x...>).
In Python 2, a function's __get__ method always returns an instancemethod, even when accessed through the class. When accessed through an instance, the method is bound to that instance. When accessed through the class, the method is unbound, and includes a mechanism which checks that the first argument is an instance of the correct class.
In Python 3, function's __get__ method will return the original function unchanged when accessed through the class, and a method when accessed through the instance.
I don't know the exact rationale but I would guess that type-checking of the first argument to a class-level function was deemed unnecessary, maybe even harmful; Python allows duck-typing after all.
Among the classic method types, what type of method is unknown_mthd?
unknown_mthd is a plain function, just like any normal instance method. It only fails when called through an instance, because when the bound method's __call__ forwards the bound instance to unknown_mthd, the function doesn't accept enough parameters to receive the instance argument.
Why can unknown_mthd be called by the class without passing an
argument?
Because it's just a plain function, the same as any other function. It just doesn't take enough arguments to work correctly when used as an instance method.
You may note that both classmethod and staticmethod work the same whether they're called through an instance or a class, whereas unknown_mthd will only work correctly when called through the class, and fail when called through an instance.
[1] If a particular name has both a value in the instance dict and a descriptor in the class dict, which one is used depends on what kind of descriptor it is. If the descriptor only defines __get__, the value in the instance dict is used. If the descriptor also defines __set__, then it's a data descriptor and the descriptor always wins. This is why you can assign over top of a method but not a @property: methods only define __get__, so you can put things in the same-named slot in the instance dict, while properties define __set__, so even if they're read-only you'll never get a value from the instance __dict__, even if you've previously bypassed property lookup and stuck a value in the dict with e.g. x.__dict__['whatever'] = 3.
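A quick sketch of that priority difference:
class C:
    def meth(self):
        return "from the class"

    @property
    def prop(self):
        return "from the property"

c = C()
c.__dict__['meth'] = lambda: "from the instance dict"
print(c.meth())   # "from the instance dict" -- the instance dict beats a __get__-only descriptor

c.__dict__['prop'] = "from the instance dict"
print(c.prop)     # "from the property" -- the data descriptor still wins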

Why does the statement (foo.__init__ is foo.__init__) return False? [duplicate]

When investigating for another question, I found the following:
>>> class A:
...     def m(self): return 42
...
>>> a = A()
This was expected:
>>> A.m == A.m
True
>>> a.m == a.m
True
But this I did not expect:
>>> a.m is a.m
False
And especially not this:
>>> A.m is A.m
False
Python seems to create new objects for each method access. Why am I seeing this behavior? I.e. what is the reason why it can't reuse one object per class and one per instance?
Yes, Python creates new method objects for each access, because it builds a wrapper object to pass in self. This is called a bound method.
Python uses descriptors to do this; function objects have a __get__ method that is called when accessed on a class:
>>> A.__dict__['m'].__get__(A(), A)
<bound method A.m of <__main__.A object at 0x10c29bc10>>
>>> A().m
<bound method A.m of <__main__.A object at 0x10c3af450>>
Note that Python cannot reuse A().m; Python is a highly dynamic language, and the very act of accessing .m could trigger more code, which could alter what A().m would return the next time it is accessed.
The @classmethod and @staticmethod decorators make use of this mechanism to return a method object bound to the class instead, and a plain unbound function, respectively:
>>> class Foo:
...     @classmethod
...     def bar(cls): pass
...     @staticmethod
...     def baz(): pass
...
>>> Foo.__dict__['bar'].__get__(Foo(), Foo)
<bound method type.bar of <class '__main__.Foo'>>
>>> Foo.__dict__['baz'].__get__(Foo(), Foo)
<function Foo.baz at 0x10c2a1f80>
>>> Foo().bar
<bound method type.bar of <class '__main__.Foo'>>
>>> Foo().baz
<function Foo.baz at 0x10c2a1f80>
See the Python descriptor howto for more detail.
However, Python 3.7 adds a new LOAD_METHOD - CALL_METHOD opcode pair that replaces the current LOAD_ATTR - CALL_FUNCTION opcode pair, precisely to avoid creating a new method object each time. This optimisation transforms the execution path for instance.foo() from type(instance).__dict__['foo'].__get__(instance, type(instance))() to type(instance).__dict__['foo'](instance), so 'manually' passing the instance directly to the function object. The optimisation falls back to the normal attribute access path (including binding descriptors) if the attribute found is not a pure-Python function object.
Because that's the most convenient, least magical and most space efficient way of implementing bound methods.
In case you're not aware, bound methods refers to being able to do something like this:
f = obj.m
# ... in another place, at another time
f(args, but, not, self)
Functions are descriptors. Descriptors are general objects which can behave differently when accessed as attribute of a class or object. They are used to implement property, classmethod, staticmethod, and several other things. The specific operation of function descriptors is that they return themselves for class access, and return a fresh bound method object for instance access. (Actually, this is only true for Python 3; Python 2 is more complicated in this regard, it has "unbound methods" which are basically functions but not quite).
The reason a new object is created on each access is one of simplicity and efficiency: creating a bound method up-front for every method of every instance takes time and space. Creating them on demand and never freeing them is a potential memory leak (although CPython does something similar for other built-in types) and slightly slower in some cases. Weakref-based caching schemes for method objects aren't free either, and are significantly more complicated (historically, bound methods predate weakrefs by far).
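In practice, when the per-access cost matters you can simply bind once and reuse the method object yourself, a common idiom sketched here:
items = []
append = items.append        # one bound method object, created once
for i in range(5):
    append(i)
print(items)                 # [0, 1, 2, 3, 4]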
