In the code below, the object a has two functions as attributes: one is a class attribute while one is its own instance attribute.
class A:
    def foo(*args):
        pass

a = A()

def bar(*args):
    pass

a.bar = bar
print(a.foo)
print(a.bar)
What I expected was that both bar() and foo() would be methods of the object a, but, from the output of this code, it turns out that this is not the case—only foo is a method of a.
<bound method A.foo of <__main__.A object at 0x0000016F5F579AF0>>
<function bar at 0x0000016F5F5845E0>
So what exactly is a method in Python? Is it a class attribute that holds a function definition? That seems to be the case.
And why isn't the attribute bar considered a method by Python? What exactly is the idea behind this behaviour of Python?
A method is an instance of the built-in type method, which is what you get back from, among other things, the __get__ method of a function-valued class attribute.
a.bar is an instance attribute, not a class attribute.
When looking for a.foo, Python first looks for A.foo. Having found it, it next checks whether its value has a __get__ method (which, as a function value, it does). Because A.foo is a descriptor (i.e., has a __get__ method), its __get__ method is called: a.foo is the same as A.foo.__get__(a, A). The return value is a method object, whose __call__ method calls the underlying function with the object and its own arguments. That is,
a.foo(x) == A.foo.__get__(a, A)(x)
== A.foo(a, x)
Because a.bar is an instance attribute, the descriptor protocol is not invoked, so a.bar is the exact same object as bar.
(This is an extremely condensed version of the contents of the Descriptor HowTo Guide, in particular the section on methods. I highly recommend reading it.)
The lookup for a.bar proceeds by first looking for bar in A.__dict__. When it isn't found, Python looks in a.__dict__, and finds a function to call: end of story.
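Both lookups can be checked directly. This is a minimal sketch that reuses the question's A, foo and bar; the return values and assertions are added only for illustration:

```python
class A:
    def foo(*args):
        return args

def bar(*args):
    return args

a = A()
a.bar = bar  # stored in a.__dict__, so no descriptor call happens on lookup

# Class attribute: the lookup goes through the function's __get__.
assert a.foo == A.foo.__get__(a, A)   # bound methods compare equal
assert a.foo.__func__ is A.foo        # the wrapped function is the class attribute
assert a.foo() == (a,)                # the instance is passed implicitly

# Instance attribute: the plain function comes back unchanged.
assert a.bar is bar
assert a.bar() == ()                  # nothing is passed implicitly

print(type(a.foo).__name__, type(a.bar).__name__)
```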
This is because you are setting the attribute on an instance of the class, not on the class itself.
Following your example, let's set:
A.bar = bar
You'll see that the output is:
print(A.bar)
# <function bar at 0x0000021821D57280>
c = A()
print(c.bar)
# <bound method bar of <__main__.A object at 0x0000021821EEA400>>
A bound method is a function paired with the instance it was looked up on. This is why we typically have self as the first parameter: the bound method passes the instance to the function when it gets called. When the function is set as an attribute on the instance instead, no binding takes place, and the instance is not passed as the first argument.
We can see this behavior if we set our function bar as the following:
def bar(*args):
    print("Argument types:", *map(type, args))

a.bar = bar
a.bar()
# Argument types:

A.bar = bar
c = A()
c.bar()
# Argument types: <class '__main__.A'>
We see that a bound method passes its instance to the function. This is the difference between setting a function as an attribute on an instance and setting it on the class itself.
a.bar is not a BOUND method. It is a callable that has been set to an attribute. This means, among other things, that it doesn't automatically get the self.
class A:
    def __init__(self):
        self.baz = 'baz'

    def foo(self):
        print(self.baz)

def bar(obj):
    print(obj.baz)

a = A()
a.bar = bar
a.foo()
# prints baz
a.bar()
# raises TypeError: bar() missing 1 required positional argument: 'obj'
a.bar(a)
# prints baz
You can make a function into a bound method with types.MethodType:
import types

a.bar = types.MethodType(bar, a)
a.bar()
# prints baz
This only binds it to that one instance; other instances won't have the attribute:
a2 = A()
a2.bar()
# AttributeError: 'A' object has no attribute 'bar'
Related
Consider a trivial example:
class C:
    @staticmethod
    def my_static_method():
        print("static")

    def my_instance_method(self):
        print("self")
When I call C().my_static_method(), Python doesn't pass the instance of C into my_static_method, and the descriptor that my_static_method references doesn't expect an instance of C, either.
This makes sense.
But then when I call C().my_instance_method(), how does Python know to pass the instance of C that I'm calling my_instance_method on as an argument, without me specifying anything?
As the link explains, function objects are descriptors! Just like staticmethod objects.
They have a __get__ method which returns a bound-method object, which essentially just partially applies the instance itself as the first positional argument. Consider:
>>> def foo(self):
...     return self.bar
...
>>> class Baz:
...     bar = 42
...
>>> baz = Baz()
>>> bound_method = foo.__get__(baz, Baz)
>>> bound_method
<bound method foo of <__main__.Baz object at 0x7ffcd001c7f0>>
>>> bound_method()
42
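For comparison, here is a small sketch of what the staticmethod descriptor does in the same situation, using the question's class C. The return values replace the original print calls so the results can be asserted; that change is an assumption made for this example only:

```python
class C:
    @staticmethod
    def my_static_method():
        return "static"

    def my_instance_method(self):
        return "self"

c = C()
raw = C.__dict__["my_static_method"]   # the staticmethod object itself

assert isinstance(raw, staticmethod)
# Its __get__ ignores the instance and hands back the wrapped function.
assert raw.__get__(c, C) is raw.__func__
assert c.my_static_method is C.my_static_method
assert c.my_static_method() == "static"

# A plain function's __get__ builds a bound method instead.
assert c.my_instance_method.__self__ is c
assert c.my_instance_method() == "self"
```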
By adding the @staticmethod decorator to my_static_method, you told Python not to pass the calling instance of C into the function. So you can call this function as C.my_static_method().
By calling C() you created an instance of C. Then you called the non-static function my_instance_method(), and Python happily passed your new instance of C as the first parameter.
What happens when you call C.my_instance_method()?
(Rhetorical.) You'll get a "missing 1 required positional argument: 'self'" TypeError, since my_instance_method only works when called from an instance unless you decorate it as static.
Of course you can still call the static member from an instance, as in C().my_static_method(), but you don't have a self parameter, so there is no access to the instance.
The key point here is that methods are just functions that happen to be attributes of a class. The actual magic, in Python, happens in the attribute lookup process. The link you give explains earlier just how much happens every time x.y happens in Python. (Remember, everything is an object; that includes functions, classes, modules, type (which is an instance of itself)...)
This process is why descriptors can work at all; why we need explicit self; and why we can do fun things like calling a method with normal function call syntax (as long as we look it up from the class rather than an instance), alias it, mimic the method binding process with functools.partial....
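The three "fun things" mentioned above can be sketched as follows. The class C here is a cut-down stand-in for the question's class, and the tuple return value is an assumption added so the results can be compared:

```python
from functools import partial

class C:
    def my_instance_method(self):
        return ("self", self)

c = C()

# 1. Looked up on the class, it is just a function: pass the instance yourself.
assert C.my_instance_method(c) == ("self", c)

# 2. A bound method is an ordinary object, so it can be aliased and called later.
alias = c.my_instance_method
assert alias() == ("self", c)

# 3. functools.partial imitates the binding step by pre-filling the first argument.
imitation = partial(C.my_instance_method, c)
assert imitation() == ("self", c)
```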
Suppose we have c = C(). When you do c.my_instance_method (never mind calling it for now), Python looks for my_instance_method in type(c) (i.e., in the C class), and also checks if it's a descriptor, and also if it's specifically a data descriptor. Functions are non-data descriptors; even outside of a class, you can write
>>> def x(spam): return spam
...
>>> x.__get__
<method-wrapper '__get__' of function object at 0x...>
Because of the priority rules, as long as c doesn't directly have an attribute attached with the same name, the function will be found in C and its __get__ will be used. Note that the __get__ in question comes from the class - but it isn't using the same process as x.__get__ above. That code looks in the class because that's one of the places checked for an attribute lookup; but when c.my_instance_method redirects to C.my_instance_method.__get__, it's looking there directly - attaching a __get__ attribute directly to the function wouldn't change anything (which is why staticmethod is implemented as a class instead).
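The last point can be tried out. In this sketch (the names method and the "hijacked" marker are made up for illustration, and the behaviour shown is that of CPython), a __get__ attribute is attached directly to a function object, and attribute lookup through an instance still uses the __get__ defined on the function type:

```python
def method(self):
    return self

# Functions accept arbitrary attributes, including one named __get__.
method.__get__ = lambda instance, owner=None: "hijacked"

class C:
    pass

C.method = method
c = C()

# Explicit attribute access on the function sees the override ...
assert method.__get__(c, C) == "hijacked"

# ... but c.method consults type(method).__get__, so binding is unaffected.
assert c.method() is c
assert c.method.__func__ is method
```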
That __get__ implements the actual method binding. Let's pretend we found x as a method in the str class:
>>> x.__get__('spam', str)
<bound method x of 'spam'>
>>> x.__get__('spam', str)()
'spam'
Remember, although the function in question takes three arguments, we're calling __get__, itself, as a method - so x gets bound to it in the same way. Equivalently, and more faithful to the actual process:
>>> type(x).__get__(x, 'spam', str)
<bound method x of 'spam'>
>>> type(x).__get__(x, 'spam', str)()
'spam'
So what exactly is that "bound method", anyway?
>>> bound = type(x).__get__(x, 'spam', str)
>>> type(bound)
<class 'method'>
>>> bound.__call__
<method-wrapper '__call__' of method object at 0x...>
>>> bound.__func__
<function x at 0x...>
>>> bound.__self__
'spam'
>>> type(bound)(x, 'eggs')
<bound method x of 'eggs'>
Pretty much what you'd expect: it's a callable object that stores and uses the original function and self value, and does the obvious thing in __call__.
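As a rough illustration of "the obvious thing", here is a pure-Python stand-in for the built-in method type. BoundMethodSketch is a hypothetical name, and the real type is implemented in C with more features (equality, repr, pickling support and so on):

```python
class BoundMethodSketch:
    """Hypothetical imitation of the built-in bound method object."""

    def __init__(self, func, obj):
        self.__func__ = func   # the original function
        self.__self__ = obj    # the object it is bound to

    def __call__(self, *args, **kwargs):
        # Insert the bound object in front of the caller's arguments.
        return self.__func__(self.__self__, *args, **kwargs)

def x(spam):
    return spam

sketch = BoundMethodSketch(x, "spam")
real = x.__get__("spam", str)

assert sketch() == real() == "spam"
assert sketch.__func__ is real.__func__
assert sketch.__self__ == real.__self__
```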
I just stumbled over this strange behavior when the type of a method changes during subclassing:
class A:
    def f(self, x):
        return x**2

class B(A):
    @classmethod
    def f(cls, x):
        return x**2
If I now ask for the type of B.f, I'll get the (supposedly) wrong answer:
In [37]: type(B.f)
Out[37]: method
Whereas this works as expected:
In [39]: type(B.__dict__["f"])
Out[39]: classmethod
(Seen in Python 3.4 and 3.6.)
Is this just a bug or is there a specific reason for this?
What's the difference between the attribute f and the .__dict__["f"] item? I thought they were the same.
In a testing suite, I was trying to support both types of methods inside a class to be tested. To be able to do that, I need to know the type in order to pass the correct number of arguments. If it's a normal method (i.e. self is the first argument), I'd just pass None explicitly, which by design shouldn't be used inside the method anyway, since it's not instance-dependent.
Maybe there's a better way to do this, like duck typing the call to the method. But there might be cases where this is not so easy to do, like if the method had *args and **kwargs... Therefore I went with the explicit type check, but got stuck at this point.
No, this is not a bug, this is normal behaviour. A classmethod produces a bound method when accessed on a class. That's exactly the point of a classmethod, to bind a function to the class you access it on or the class of an instance you access it on.
Like function and property objects, classmethod is a descriptor object: it implements a __get__ method. Accessing attributes on an instance or a class is delegated to the __getattribute__ method, and the default implementation of that hook will not just return what it found in object.__dict__[attributename]; it will also bind descriptors by calling the descriptor.__get__() method. This is a hugely important aspect of Python; it is this mechanism that makes methods and attributes and loads of other things work.
classmethod objects, when bound by the descriptor protocol, return a method object. Method objects are wrappers that record the object bound to and the function to call when they are called; calling a method really calls the underlying function with the bound object as first argument:
>>> class Foo:
... pass
...
>>> def bar(*args): print(args)
...
>>> classmethod(bar).__get__(None, Foo) # decorate with classmethod and bind
<bound method bar of <class '__main__.Foo'>>
>>> method = classmethod(bar).__get__(None, Foo)
>>> method.__self__
<class '__main__.Foo'>
>>> method.__func__
<function bar at 0x1056f0e18>
>>> method()
(<class '__main__.Foo'>,)
>>> method('additional arguments')
(<class '__main__.Foo'>, 'additional arguments')
So the method object returned for a classmethod object references the class (the second argument to __get__, the owner), and the original function. If you use a class method on an instance, the first argument is still ignored:
>>> classmethod(bar).__get__(Foo(), Foo).__self__ # called on an instance
<class '__main__.Foo'>
Functions, on the other hand, want to bind only to instances; so if the first argument to __get__ is set to None, they simply return themselves:
>>> bar.__get__(None, Foo) # access on a class
<function bar at 0x1056f0e18>
>>> bar.__get__(Foo(), Foo) # access on an instance
<bound method bar of <__main__.Foo object at 0x105833a90>>
>>> bar.__get__(Foo(), Foo).__self__
<__main__.Foo object at 0x105833160>
If accessing ClassObject.classmethod_object would return the classmethod object itself, like a function object would, then you could never actually use the class method on a class. That'd be rather pointless.
So no, object.attribute is not always the same thing as object.__dict__['attribute']. If object.__dict__['attribute'] supports the descriptor protocol, it'll be invoked.
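For the test-suite use case in the question, one possible approach is to inspect the raw attribute without triggering the descriptor protocol. The sketch below uses inspect.getattr_static from the standard library, which also follows the MRO; the helper name is_classmethod and the subclass D are made up for this example:

```python
import inspect

class A:
    def f(self, x):
        return x ** 2

class B(A):
    @classmethod
    def f(cls, x):
        return x ** 2

class D(B):
    pass

def is_classmethod(cls, name):
    """Hypothetical helper: look at the raw attribute without binding it."""
    return isinstance(inspect.getattr_static(cls, name), classmethod)

assert type(B.f).__name__ == "method"            # binding already happened
assert type(B.__dict__["f"]) is classmethod      # raw object in the class dict
assert is_classmethod(B, "f")
assert is_classmethod(D, "f")                    # inherited, so not in D.__dict__
assert not is_classmethod(A, "f")

# Call each kind with the right number of arguments.
assert B.f(3) == 9
assert A.f(None, 3) == 9
```

Plain B.__dict__["f"] would raise KeyError for D, because the attribute is inherited; that is the reason for going through the MRO here.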
The Python 3.x language reference describes two ways to create a method object:
User-defined method objects may be created when getting an attribute of a class (perhaps via an instance of that class), if that attribute is a user-defined function object or a class method object.
When an instance method object is created by retrieving a user-defined function object from a class via one of its instances, its __self__ attribute is the instance, and the method object is said to be bound. The new method’s __func__ attribute is the original function object.
When a user-defined method object is created by retrieving another method object from a class or instance, the behaviour is the same as for a function object, except that the __func__ attribute of the new instance is not the original method object but its __func__ attribute.
When an instance method object is called, the underlying function (__func__) is called, inserting the class instance (__self__) in front of the argument list. For instance, when C is a class which contains a definition for a function f(), and x is an instance of C, calling x.f(1) is equivalent to calling C.f(x, 1).
When an instance method object is derived from a class method object, the “class instance” stored in __self__ will actually be the class itself, so that calling either x.f(1) or C.f(1) is equivalent to calling f(C,1) where f is the underlying function.
The two cases end up with different __func__ and __self__ values, but I don't really understand how the two ways differ. Could someone explain?
Python Language Reference | The Standard Type Hierarchy:
https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy
I'm not 100% sure that I completely understand your question, but maybe looking at an example will be helpful. To start, let's create a class which has a function in the definition:
>>> class Foo(object):
...     def method(self):
...         pass
...
>>> f = Foo()
User-defined method objects may be created when getting an attribute of a class (perhaps via an instance of that class), if that attribute is a user-defined function object or a class method object.
Ok, so we can create a method object by just accessing the attribute on an instance (if the attribute is a function). In our setup, f is an instance of the class Foo:
>>> type(f.method)
<class 'method'>
Compare that with accessing the method attribute on the class:
>>> type(Foo.method)
<class 'function'>
When an instance method object is created by retrieving a user-defined function object from a class via one of its instances, its __self__ attribute is the instance, and the method object is said to be bound. The new method’s __func__ attribute is the original function object.
This is just telling us what attributes exist on instance methods. Let's check it out:
>>> instance_method = f.method
>>> instance_method.__func__ is Foo.method
True
>>> instance_method.__self__ is f
True
So we see that the method object has a __func__ attribute which is just a reference to the actual Foo.method function. It also has a __self__ attribute that is a reference to the instance.
When an instance method object is called, the underlying function (__func__) is called, inserting the class instance (__self__) in front of the argument list. For instance, when C is a class which contains a definition for a function f(), and x is an instance of C, calling x.f(1) is equivalent to calling C.f(x, 1).
Basically, in reference to our example above, this is just saying that if:
instance_method = f.method
Then:
instance_method(arg1, arg2)
executes the following:
instance_method.__func__(instance_method.__self__, arg1, arg2)
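A quick check of that equivalence. The two-argument signature and the values 1 and 2 are assumptions made only for this sketch, since the method in the example above takes no extra arguments:

```python
class Foo:
    def method(self, arg1, arg2):
        return (self, arg1, arg2)

f = Foo()
instance_method = f.method

via_method = instance_method(1, 2)
via_func = instance_method.__func__(instance_method.__self__, 1, 2)

# Both spellings call the same function with the same arguments.
assert via_method == via_func == (f, 1, 2)
```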
For completeness and as an addendum to the excellent answer provided by @mgilson, I wanted to explain the remaining two paragraphs referenced in the original question.
First let's create a class with a classmethod:
>>> class Foo(object):
...     @classmethod
...     def cmethod(cls):
...         pass
...
>>> f = Foo()
Now for the 3rd paragraph:
When a user-defined method object is created by retrieving another method object from a class or instance, the behaviour is the same as for a function object, except that the __func__ attribute of the new instance is not the original method object but its __func__ attribute.
This means:
>>> class_method = f.cmethod
>>> class_method.__func__ is Foo.cmethod.__func__
True
>>> class_method.__self__ is Foo
True
Note that __self__ is a reference to the Foo class. Finally, the last paragraph:
When an instance method object is derived from a class method object, the “class instance” stored in __self__ will actually be the class itself, so that calling either x.f(1) or C.f(1) is equivalent to calling f(C,1) where f is the underlying function.
This just says that all of the following are equivalent:
>>> f.cmethod(arg1, arg2)
>>> Foo.cmethod(arg1, arg2)
>>> f.cmethod.__func__(Foo, arg1, arg2)
>>> Foo.cmethod.__func__(Foo, arg1, arg2)
>>> f.cmethod.__func__(f.cmethod.__self__, arg1, arg2)
>>> Foo.cmethod.__func__(Foo.cmethod.__self__, arg1, arg2)
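The list of equivalent calls can be verified in one go. In this sketch cmethod is given two extra parameters and a return value, which the example above does not have, so that the six spellings can be compared:

```python
class Foo:
    @classmethod
    def cmethod(cls, arg1, arg2):
        return (cls, arg1, arg2)

f = Foo()

results = [
    f.cmethod(1, 2),
    Foo.cmethod(1, 2),
    f.cmethod.__func__(Foo, 1, 2),
    Foo.cmethod.__func__(Foo, 1, 2),
    f.cmethod.__func__(f.cmethod.__self__, 1, 2),
    Foo.cmethod.__func__(Foo.cmethod.__self__, 1, 2),
]

# Every spelling ends up calling the function with the class first.
assert all(result == (Foo, 1, 2) for result in results)
```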
New to Python and having done some reading, I'm making some methods in my custom class class methods rather than instance methods.
So I tested my code but I hadn't changed some of the method calls to call the method in the class rather than the instance, but they still worked:
class myClass:
    @classmethod
    def foo(cls):
        print 'Class method foo called with %s.'%(cls)

    def bar(self):
        print 'Instance method bar called with %s.'%(self)

myClass.foo()
thing = myClass()
thing.foo()
thing.bar()
This produces:
Class method foo called with __main__.myClass.
Class method foo called with __main__.myClass.
Instance method bar called with <__main__.myClass instance at 0x389ba4>.
So what I'm wondering is why I can call a class method (foo) on an instance (thing.foo), (although it's the class that gets passed to the method)? It kind of makes sense, as 'thing' is a 'myClass', but I was expecting Python to give an error saying something along the lines of 'foo is a class method and can't be called on an instance'.
Is this just an expected consequence of inheritance with the 'thing' object inheriting the foo method from its superclass?
If I try to call the instance method via the class:
myClass.bar()
then I get:
TypeError: unbound method bar() must be called with myClass instance...
which makes perfect sense.
You can call it on an instance because @classmethod is a decorator (it takes a function as an argument and returns a new object in its place).
Here is some relevant information from the Python documentation:
It can be called either on the class (such as C.f()) or on an instance
(such as C().f()). The instance is ignored except for its class. If a
class method is called for a derived class, the derived class object
is passed as the implied first argument.
There's also quite a good SO discussion on @classmethod here.
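The documented behaviour, including the derived-class case, can be seen in a few lines. Base, Derived and who are names invented for this sketch:

```python
class Base:
    @classmethod
    def who(cls):
        return cls.__name__

class Derived(Base):
    pass

# Called on a class or on an instance, only the class is passed along.
assert Base.who() == "Base"
assert Base().who() == "Base"

# On a derived class (or its instances), the derived class is passed.
assert Derived.who() == "Derived"
assert Derived().who() == "Derived"
```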
Let's start with a quick overview of the descriptor protocol. If a class defines a __get__ method, an instance of that class is a descriptor. Accessing a descriptor as the attribute of another object produces the return value of the __get__ method, not the descriptor itself.
A function is a descriptor; this is how instance methods are implemented. Given
class myClass:
    @classmethod
    def foo(cls):
        print('Class method foo called with %s.' % (cls,))

    def bar(self):
        print('Instance method bar called with %s.' % (self,))

c = myClass()
the expression c.bar is equivalent to
myClass.__dict__['bar'].__get__(c, myClass)
while the expression myClass.bar is equivalent to the expression
myClass.__dict__['bar'].__get__(None, myClass)
Note the only difference is in the object passed as the first argument to __get__. The former returns a new method object, which when called passes c and its own arguments on to the function bar. This is why c.bar() and myClass.bar(c) are equivalent. The latter simply returns the function bar itself.
classmethod is a type that provides a different implementation of __get__. This means that c.foo() and myClass.foo() call __get__ as before:
# c.foo
myClass.__dict__['foo'].__get__(c, myClass)
# myClass.foo
myClass.__dict__['foo'].__get__(None, myClass)
Now, however, both expressions return equivalent method objects, and such a method object, when called, passes myClass as the first argument to the original function object. That is, c.foo() is equivalent to myClass.foo(), which is equivalent to x(myClass) (where x is the original function defined before the decoration bound the name foo to an instance of classmethod).
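To make that different __get__ concrete, here is a simplified pure-Python imitation of classmethod. ClassMethodSketch is a hypothetical name, and the built-in type is implemented in C and handles more cases than this:

```python
import types

class ClassMethodSketch:
    """Hypothetical, simplified imitation of the built-in classmethod."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if owner is None:
            owner = type(instance)
        # Bind to the class, whether or not an instance was involved.
        return types.MethodType(self.func, owner)

class myClass:
    @ClassMethodSketch
    def foo(cls):
        return cls

c = myClass()

assert c.foo() is myClass
assert myClass.foo() is myClass
assert c.foo == myClass.foo        # equal: same function, same bound class
assert c.foo is not myClass.foo    # but each lookup builds a new method object
```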
I find the following example mildly surprising:
>>> class Foo:
        def blah(self):
            pass

>>> f = Foo()
>>> def bar(self):
        pass

>>> Foo.bar = bar
>>> f.bar
<bound method Foo.bar of <__main__.Foo object at 0x02D18FB0>>
I expected the bound method to be associated with each particular instance, and to be placed in it at construction. It seems logical that the bound method would have to be different for each instance, so that it knows which instance to pass in to the underlying function - and, indeed:
>>> g = Foo()
>>> g.blah is f.blah
False
But my understanding of the process is clearly flawed, since I would not expect assigning a function to a class attribute to put it in instances that had already been created by then.
So, my question is twofold:
Why does assigning a function into a class apply retroactively to instances? What are the actual lookup rules and processes that make this so?
Is this something guaranteed by the language, or just something that happens to happen?
You want to blow your mind, try this:
f.blah is f.blah
That's right, the instance method wrapper is different each time you access it.
In fact, the function stored on the class is a descriptor. In other words, f.blah is actually:
Foo.blah.__get__(f, type(f))
Methods are not actually stored on the instance; they are stored on the class, and a method wrapper is generated on the fly to bind the method to the instance.
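A short sketch of what that means in practice, using the question's Foo. The two lookups are stored in variables so that both wrappers are alive when they are compared:

```python
class Foo:
    def blah(self):
        pass

f = Foo()
first, second = f.blah, f.blah

assert first is not second     # two distinct wrapper objects
assert first == second         # equal: same function, same instance
assert first.__func__ is second.__func__ is Foo.__dict__["blah"]
assert "blah" not in vars(f)   # nothing is stored on the instance
```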
The instances do not "contain" the method. The lookup process happens dynamically at the time you access foo.bar. It checks to see if the instance has an attribute of that name. Since it doesn't, it looks on the class, whereupon it finds whatever attribute the class has at that time. Note that methods are not special in this regard. You'll see the same effect if you set Foo.bar = 2; after that, foo.bar will evaluate to 2.
What is guaranteed by the language is that attribute lookup proceeds in this fashion: first the instance, then the class if the attribute is not found on the instance. (Lookup rules are different for special methods implicitly invoked via operator overloading, etc.)
Only if you directly assign an attribute to the instance will it mask the class attribute.
>>> foo = Foo()
>>> foo.bar
Traceback (most recent call last):
  File "<pyshell#79>", line 1, in <module>
    foo.bar
AttributeError: 'Foo' object has no attribute 'bar'
>>> foo.bar = 2
>>> Foo.bar = 88
>>> foo.bar
2
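"First the instance, then the class" is a simplification with two well-known exceptions, sketched below: data descriptors on the class (such as property) take priority over the instance dictionary, and implicitly invoked special methods are looked up on the type only. The attribute name locked and the values used are made up for this example:

```python
class Foo:
    @property
    def locked(self):
        return "from the class"

    def __len__(self):
        return 1

foo = Foo()

# Write straight into the instance dict, bypassing the read-only property.
foo.__dict__["locked"] = "from the instance"
assert foo.locked == "from the class"   # the data descriptor still wins

# An instance attribute named __len__ is visible to explicit lookup ...
foo.__len__ = lambda: 99
assert foo.__len__() == 99
# ... but len() looks at the type and ignores it.
assert len(foo) == 1
```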
All of the above is a separate matter from bound/unbound methods. The class machinery in Python uses the descriptor protocol so that when you access foo.bar, a new bound method instance is created on the fly. That's why you're seeing different bound method instances on your different objects. But note that underneath, these bound methods rely on the same code object, as defined by the method you wrote in the class:
>>> foo = Foo()
>>> foo2 = Foo()
>>> foo.blah.__func__.__code__ is foo2.blah.__func__.__code__
True