I'm asking this question because of a discussion on the comment thread of this answer. I'm 90% of the way to getting my head round it.
In [1]: class A(object): # class named 'A'
...:     def f1(self): pass
...:
In [2]: a = A() # an instance
f1 exists in three different forms:
In [3]: a.f1 # a bound method
Out[3]: <bound method A.f1 of <__main__.A object at 0x039BE870>>
In [4]: A.f1 # an unbound method
Out[4]: <unbound method A.f1>
In [5]: a.__dict__['f1'] # doesn't exist
KeyError: 'f1'
In [6]: A.__dict__['f1'] # a function
Out[6]: <function __main__.f1>
What is the difference between the bound method, unbound method and function objects, all of which are described by f1? How does one call these three objects? How can they be transformed into each other? The documentation on this stuff is quite hard to understand.
A function is created by the def statement, or by lambda. Under Python 2, when a function appears within the body of a class statement (or is passed to a type class construction call), accessing it through the class gives an unbound method. (Python 3 doesn't have unbound methods; see below.) When a function is accessed on a class instance, it is transformed into a bound method, which automatically supplies the instance to the function as its first parameter, self.
def f1(self):
    pass
Here f1 is a function.
class C(object):
    f1 = f1
Now C.f1 is an unbound method.
>>> C.f1
<unbound method C.f1>
>>> C.f1.im_func is f1
True
We can also use the type class constructor:
>>> C2 = type('C2', (object,), {'f1': f1})
>>> C2.f1
<unbound method C2.f1>
We can convert f1 to an unbound method manually:
>>> import types
>>> types.MethodType(f1, None, C)
<unbound method C.f1>
Unbound methods are bound by access on a class instance:
>>> C().f1
<bound method C.f1 of <__main__.C object at 0x2abeecf87250>>
Access is translated into calling through the descriptor protocol:
>>> C.f1.__get__(C(), C)
<bound method C.f1 of <__main__.C object at 0x2abeecf871d0>>
Combining these:
>>> types.MethodType(f1, None, C).__get__(C(), C)
<bound method C.f1 of <__main__.C object at 0x2abeecf87310>>
Or directly:
>>> types.MethodType(f1, C(), C)
<bound method C.f1 of <__main__.C object at 0x2abeecf871d0>>
The main difference between a function and an unbound method is that the latter knows which class it is bound to; calling or binding an unbound method requires an instance of its class type:
>>> f1(None)
>>> C.f1(None)
TypeError: unbound method f1() must be called with C instance as first argument (got NoneType instance instead)
>>> class D(object): pass
>>> f1.__get__(D(), D)
<bound method D.f1 of <__main__.D object at 0x7f6c98cfe290>>
>>> C.f1.__get__(D(), D)
<unbound method C.f1>
Since the difference between a function and an unbound method is pretty minimal, Python 3 gets rid of the distinction; under Python 3, accessing a function on a class just gives you the function itself:
>>> C.f1
<function f1 at 0x7fdd06c4cd40>
>>> C.f1 is f1
True
In both Python 2 and Python 3, then, these three are equivalent:
f1(C())
C.f1(C())
C().f1()
Binding a function to an instance has the effect of fixing its first parameter (conventionally called self) to the instance. Thus the bound method C().f1 is equivalent to either of:
(lambda *args, **kwargs: f1(C(), *args, **kwargs))
functools.partial(f1, C())
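As a rough check of the equivalences above, here is a small Python 3 sketch. It assumes f1 returns self (instead of pass) so that each call has a visible result; the names c, results, via_lambda and via_partial are only for illustration.

```python
import functools


def f1(self):
    # Return the argument so each call has a visible result.
    return self


class C(object):
    f1 = f1


c = C()

# Three spellings of the same call (Python 3 semantics).
results = [f1(c), C.f1(c), c.f1()]

# Two hand-rolled stand-ins for the bound method c.f1.
via_lambda = lambda *args, **kwargs: f1(c, *args, **kwargs)
via_partial = functools.partial(f1, c)

print(all(r is c for r in results))
print(via_lambda() is c, via_partial() is c)
```

Unlike the snippets above, the sketch reuses one instance c rather than calling C() each time, so the results can be compared by identity.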
"The documentation on this stuff is quite hard to understand."
Well, it is quite a hard topic, and it has to do with descriptors.
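Let's start with the function. Everything is clear here: you just call it, and all supplied arguments are passed to it. (The outputs below appear to assume that f1 returns its argument, i.e. def f1(self): return self, rather than the pass body from the question.)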
>>> f = A.__dict__['f1']
>>> f(1)
1
A regular TypeError is raised if the number of arguments is wrong:
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f1() takes exactly 1 argument (0 given)
Now, methods. Methods are functions with a little extra added, and this is where descriptors come into play. As described in the Data Model, A.f1 and a.f1 are translated into A.__dict__['f1'].__get__(None, A) and type(a).__dict__['f1'].__get__(a, type(a)) respectively. The results of these __get__ calls differ from the raw f1 function: they are wrappers around the original f1 that contain some additional logic.
In the case of an unbound method, this logic includes a check that the first argument is an instance of A:
>>> f = A.f1
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unbound method f1() must be called with A instance as first argument (got nothing instead)
>>> f(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unbound method f1() must be called with A instance as first argument (got int instance instead)
If this check succeeds, it executes the original f1 with that instance as the first argument:
>>> f(A())
<__main__.A object at 0x800f238d0>
Note that the im_self attribute is None:
>>> f.im_self is None
True
In the case of a bound method, this logic immediately supplies the original f1 with the instance of A it was created from (this instance is stored in the im_self attribute):
>>> f = A().f1
>>> f.im_self
<__main__.A object at 0x800f23950>
>>> f()
<__main__.A object at 0x800f23950>
So, bound means that the underlying function is bound to some instance. Unbound means that it is still bound, but only to a class.
A function object is a callable object created by a function definition. Both bound and unbound methods are callable objects created by a descriptor that is invoked by attribute access (the dot operator).
Bound and unbound method objects have three main attributes: im_func is the function object defined in the class, im_class is the class, and im_self is the class instance. For unbound methods, im_self is None.
When a bound method is called, it calls im_func with im_self as the first argument, followed by the arguments of the call. Unbound methods call the underlying function with just the arguments of the call.
Starting with Python 3, there are no unbound methods. Class.method returns a direct reference to the function.
Please refer to the Python 2 and Python 3 documentation for more details.
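For readers on Python 3, the same attributes are spelled __func__ and __self__ (im_class has no direct counterpart). A minimal sketch, assuming a class A whose f1 returns self; the variable name bound is illustrative:

```python
class A(object):
    def f1(self):
        return self


a = A()
bound = a.f1

# Python 3 spellings of im_func and im_self.
print(bound.__func__ is A.__dict__['f1'])   # the underlying function
print(bound.__self__ is a)                  # the instance it is bound to

# Accessing through the class yields the plain function in Python 3,
# which has no __self__ at all.
print(A.f1 is A.__dict__['f1'])
print(hasattr(A.f1, '__self__'))
```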
My interpretation is the following.
Class Function snippets:
Python 3:
class Function(object):
    . . .
    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        if obj is None:
            return self
        return types.MethodType(self, obj)
Python 2:
class Function(object):
    . . .
    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        return types.MethodType(self, obj, objtype)
If a function is accessed without a class or an instance, it is a plain function.
If a function is accessed through a class or an instance, its __get__ is called to retrieve the wrapped function:
a. B.x is the same as B.__dict__['x'].__get__(None, B).
In Python 3, this returns the plain function.
In Python 2, this returns an unbound method.
b. b.x is the same as type(b).__dict__['x'].__get__(b, type(b)). This returns a bound method in both Python 2 and Python 3, which means self will be implicitly passed as the first argument.
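Cases a and b can be checked by hand in Python 3. This is only a sketch: the class B and the method x are placeholders matching the names used above, and x returns self so the binding is visible.

```python
class B(object):
    def x(self):
        return self


b = B()
raw = B.__dict__['x']

# a. Class access: __get__(None, B) hands back the function itself in Python 3.
print(raw.__get__(None, B) is raw)
print(B.x is raw)

# b. Instance access: __get__(b, type(b)) builds a bound method.
manual = raw.__get__(b, type(b))
print(manual == b.x)          # equal: same function, same instance
print(manual() is b)          # self was supplied implicitly
```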
What is the difference between a function, an unbound method and a bound method?
From the fundamental "what is a function" perspective, there is no difference.
Python's object-oriented features are built on top of a function-based environment.
Being bound comes down to this question:
Will the function take the class (cls) or the object instance (self) as its first parameter, or not?
Here is the example:
class C:
    # instance method
    def m1(self, x):
        print(f"Excellent m1 self {self} {x}")

    @classmethod
    def m2(cls, x):
        print(f"Excellent m2 cls {cls} {x}")

    @staticmethod
    def m3(x):
        print(f"Excellent m3 static {x}")

ci = C()
ci.m1(1)
ci.m2(2)
ci.m3(3)
print(ci.m1)
print(ci.m2)
print(ci.m3)
print(C.m1)
print(C.m2)
print(C.m3)
Outputs:
Excellent m1 self <__main__.C object at 0x000001AF40319160> 1
Excellent m2 cls <class '__main__.C'> 2
Excellent m3 static 3
<bound method C.m1 of <__main__.C object at 0x000001AF40319160>>
<bound method C.m2 of <class '__main__.C'>>
<function C.m3 at 0x000001AF4023CBF8>
<function C.m1 at 0x000001AF402FBB70>
<bound method C.m2 of <class '__main__.C'>>
<function C.m3 at 0x000001AF4023CBF8>
The output shows that the static method m3 is never bound.
C.m2 is bound to the class C because it receives the cls parameter, which refers to the class.
ci.m1 and ci.m2 are both bound: ci.m1 because it receives self, which refers to the instance, and ci.m2 because the instance knows its class ;).
To conclude, a method can be bound to a class or to an instance of the class, depending on the first parameter the method takes. A method that is not bound can be called unbound.
Note that method may not be originally part of the class. Check this answer from Alex Martelli for more details.
One interesting thing I saw today is that when I assign a function to a class attribute, it becomes an unbound method (Python 2). For example:
class Test(object):
    @classmethod
    def initialize_class(cls):
        def print_string(self, str):
            print(str)
        # Here if I do print(print_string), I see a function
        cls.print_proc = print_string
        # Here if I do print(cls.print_proc), I see an unbound method; so if I
        # get a Test object o, I can call o.print_proc("Hello")
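Under Python 3 the same experiment should show a plain function on the class and a bound method on the instance. A sketch, with the inner function returning its argument instead of printing (my change, so the result can be inspected) and the parameter renamed to text to avoid shadowing str:

```python
import types


class Test(object):
    @classmethod
    def initialize_class(cls):
        def print_string(self, text):
            return text

        # Stored on the class, so instance access will bind it.
        cls.print_proc = print_string


Test.initialize_class()
o = Test()

print(isinstance(Test.print_proc, types.FunctionType))  # plain function on the class
print(isinstance(o.print_proc, types.MethodType))       # bound method on the instance
print(o.print_proc("Hello"))
```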
I am wondering why a method exists as two copies, one for the instance object and one for the class object. Why is it designed like this?
class Bar():
    def method(self):
        pass

    @classmethod
    def clsmethod(cls):
        pass

b1 = Bar()
b2 = Bar()
print(Bar.method,id(Bar.method))
print(b1.method,id(b1.method))
print(b2.method,id(b2.method))
print(Bar.clsmethod,id(Bar.clsmethod))
print(b1.clsmethod,id(b1.clsmethod))
print(b2.clsmethod,id(b2.clsmethod))
This design is based on descriptors, specifically non-data descriptors. Every function happens to be a non-data descriptor by defining a __get__ method:
>>> def foo():
...     pass
...
>>> foo.__get__
<method-wrapper '__get__' of function object at 0x7fa75be5be50>
When you have an expression x.y in your code, this means the attribute y is being looked up on the object x. The specific rules are explained here, and one of them concerns y being a (non-)data descriptor stored on the class of x (or on one of its base classes). The following is an example:
>>> class Foo:
...     def test(self):
...         pass
...
Here Foo.test looks up the name test on the class Foo. The result is the function as you would define in the global namespace:
>>> Foo.test
<function Foo.test at 0x7fa75be5bf70>
However, as we have seen above, every function is also a descriptor, so if you look up test on an instance of Foo, it will call the descriptor's __get__ method to compute the result:
>>> f = Foo()
>>> f.test
<bound method Foo.test of <__main__.Foo object at 0x7fa75bf56b20>>
We can obtain a similar result by manually invoking Foo.test.__get__:
>>> Foo.test.__get__(f, type(f))
<bound method Foo.test of <__main__.Foo object at 0x7fa75bf56b20>>
This mechanism is what ensures that the instance (typically denoted by self) is passed as the first argument to instance methods. The descriptor returns a bound method (bound to the instance on which the lookup was performed) rather than the original function. This bound method inserts the instance as the very first argument when it is called. Every time you evaluate f.test (or b1.method in the question), a new bound-method object is created, which is why the ids can differ.
The situation with classmethods is similar: for Bar.test, Bar.__dict__['test'].__get__(None, Bar) is called. The only difference is that for instances object.__getattribute__ performs the lookup, while for classes type.__getattribute__ takes precedence.
>>> class Bar:
...     @classmethod
...     def test(cls):
...         pass
...
>>> Bar.test
<bound method Bar.test of <class '__main__.Bar'>>
>>> Bar.__dict__['test'].__get__(None, Bar)
<bound method Bar.test of <class '__main__.Bar'>>
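To make the "two copies" observation from the question concrete, the following sketch compares the objects directly instead of printing ids. It reuses the Bar class from the question; the variable names m1, m2, c1 and c2 are mine. The references are kept alive on purpose, since ids of short-lived objects can be reused.

```python
class Bar:
    def method(self):
        pass

    @classmethod
    def clsmethod(cls):
        pass


b1 = Bar()

# Keep references alive so the comparisons are meaningful.
m1, m2 = b1.method, b1.method
c1, c2 = Bar.clsmethod, b1.clsmethod

print(m1 is m2)                                  # separate bound-method objects
print(m1 == m2)                                  # same function, same instance
print(m1.__func__ is Bar.__dict__['method'])     # both wrap the one stored function
print(c1.__self__ is Bar, c2.__self__ is Bar)    # classmethods bind to the class
print(Bar.method is Bar.method)                  # class access returns the function
```

So there is only one function stored on the class; what looks like a copy is a small wrapper created on each attribute access.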
I have accidentally stumbled on this kind of notation:
>>> m = mock.Mock()
>>> m().my_value = 5
>>>
>>> m
<Mock id='139823798337360'>
>>> m()
<Mock name='mock()' id='139823798364240'>
m is an object of type Mock, and () is a function call. How can you call an object like a function?
I tried calling a normal object and, as expected, I got an exception:
>>> class C(object):
...     pass
...
>>> c = C()
>>> c()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'C' object is not callable
So this must be some kind of Mock magic. What is it and is it used for?
In Python, any object is callable if its class implements the __call__ method.
Any function is callable because the function type implements __call__. All classes are callable too, which is how you create an instance of a class.
If you add __call__ to your class, it will look like this:
class C(object):
    def __call__(self, *args):
        print("instance is called with %s" % (args,))
The Mock class defines __call__ so that you can track calls to a mock object.
>>> from unittest.mock import Mock  # Python 3.3+; the standalone mock package also provides Mock
>>> m = Mock()
>>> m(3, 4)
<Mock name='mock()' id='140219558391696'>
>>> m.mock_calls
[call(3, 4)]
>>>
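To illustrate the idea of tracking calls through __call__, here is a toy recorder. It is a hypothetical stand-in written for this answer, not how Mock is actually implemented; CallRecorder and its calls attribute are invented names.

```python
class CallRecorder(object):
    """Hypothetical stand-in, not how Mock is actually implemented."""

    def __init__(self):
        self.calls = []

    def __call__(self, *args, **kwargs):
        # Remember the arguments of every call.
        self.calls.append((args, kwargs))
        return len(self.calls)


r = CallRecorder()
r(3, 4)
r(key="value")

print(callable(r))
print(r.calls)
```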
I'll answer this myself, as it seems my question wasn't too clear.
The notation
m = mock.Mock()
m().my_value = 5
is used to mock factory functions, that is, when you need to mock a function that returns an object with certain properties. Every call to m() returns a mock object that is different from m itself, but it is the same object every time (whereas a real factory function usually returns a new object on every call). So when a property of this object is set (as in m().my_value = 5), it will be available in any later call to m().
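The behaviour described here can be verified with unittest.mock from the standard library (Python 3.3+). In this sketch build_report is a made-up function standing in for code that receives a factory:

```python
from unittest import mock

m = mock.Mock()
m().my_value = 5

print(m() is m())                 # the same child mock on every call
print(m() is m.return_value)      # it is exposed as return_value
print(m().my_value)


# Hypothetical code under test that expects a factory function.
def build_report(factory):
    return factory().my_value


print(build_report(m))
```

Setting m.return_value.my_value = 5 is an equivalent spelling that avoids recording an extra call on the mock.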
When a function is assigned to an attribute during class definition, this attribute stays an ordinary function with its original signature:
>>> def f(x):
        return x**2
>>> class A:
        ff = f
>>> A.ff
<function f at 0x037D6ED0>
When the class is instantiated, this attribute becomes a bound method and its signature changes:
>>> a = A()
>>> a.ff
<bound method A.f of <__main__.A object at 0x03A726B0>>
I need to define a class that I can later customize by changing some attributes before instantiating it. One of these attributes is a function, and I need it to keep its signature.
Using @staticmethod is obviously not an option, since no function is defined at class definition/customization time, and decorators don't apply to attributes.
Is there any way to keep a function from being transformed into a bound method on instantiation?
"Using @staticmethod is obviously not an option, since no function is defined on class definition/customization, and decorations don't apply to attributes."
No, staticmethod is the option; just call it directly to produce an instance:
class A:
    ff = staticmethod(f)
@decorator syntax is only syntactic sugar that produces the exact same assignment after a function object has been created.
This works fine:
>>> def f(x):
...     return x**2
...
>>> class A:
...     f_unchanged = f
...     f_static = staticmethod(f)
...
>>> A().f_unchanged
<bound method f of <__main__.A object at 0x10cf7b2e8>>
>>> A().f_static
<function f at 0x10cfb6510>
>>> A().f_static(4)
16
It doesn't matter where a function is defined; a def statement produces a function object regardless of where it is used. def name does two things: it creates the function object and assigns that function object to a name. Whether or not this takes place in a class statement or elsewhere doesn't actually matter.
What turns functions into bound methods is accessing them on an instance, as then the descriptor protocol kicks in. For example, accessing A().ff is turned into A.__dict__['ff'].__get__(A()), and it is the __get__ method on a function that produces the bound method. The bound method is only a proxy for the actual function, passing in the instance as a first argument when called.
A staticmethod defines a different __get__, one that just returns the original function, unbound. You can play with those __get__ methods directly:
>>> f.__get__(A()) # bind f to an instance
<bound method f of <__main__.A object at 0x10cf9f630>>
>>> A.__dict__['f_unchanged'] # bypass the protocol
<function f at 0x10cfb6510>
>>> A.__dict__['f_static'] # bypass the protocol
<staticmethod object at 0x10cf60f28>
>>> A.__dict__['f_static'].__get__(A()) # activate the protocol
<function f at 0x10cfb6510>
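For the "customize the class before instantiating" use case from the question, the same wrapper can be applied when the attribute is reassigned later. A sketch under that assumption; square and cube are illustrative functions, not names from the question:

```python
def square(x):
    return x ** 2


def cube(x):
    return x ** 3


class A:
    ff = staticmethod(square)


# Customize the class later, before creating instances.
A.ff = staticmethod(cube)

a = A()
print(a.ff(2))
print(a.ff is cube)   # the original function comes back, signature unchanged
```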
I would like to do the following:
class A(object): pass
a = A()
a.__int__ = lambda self: 3
i = int(a)
Unfortunately, this throws:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string or a number, not 'A'
This only seems to work if I assign the "special" method to the class A instead of an instance of it. Is there any recourse?
One way I thought of was:
def __int__(self):
    # No infinite loop
    if type(self).__int__.im_func != self.__int__.im_func:
        return self.__int__()
    raise NotImplementedError()
But that looks rather ugly.
Thanks.
Python always looks up special methods on the class, not the instance (except in the old, aka "legacy", kind of classes -- they're deprecated and have gone away in Python 3, because of the quirky semantics that mostly come from looking up special methods on the instance, so you really don't want to use them, believe me!-).
To make a special class whose instances can have special methods independent from each other, you need to give each instance its own class -- then you can assign special methods on the instance's (individual) class without affecting other instances, and live happily ever after. If you want to make it look like you're assigning to an attribute of the instance, while actually assigning to an attribute of the individualized per-instance class, you can get that with a special __setattr__ implementation, of course.
Here's the simple case, with explicit "assign to class" syntax:
>>> class Individualist(object):
...     def __init__(self):
...         self.__class__ = type('GottaBeMe', (self.__class__, object), {})
...
>>> a = Individualist()
>>> b = Individualist()
>>> a.__class__.__int__ = lambda self: 23
>>> b.__class__.__int__ = lambda self: 42
>>> int(a)
23
>>> int(b)
42
>>>
and here's the fancy version, where you "make it look like" you're assigning the special method as an instance attribute (while behind the scene it still goes to the class of course):
>>> class Sophisticated(Individualist):
...     def __setattr__(self, n, v):
...         if n[:2]=='__' and n[-2:]=='__' and n!='__class__':
...             setattr(self.__class__, n, v)
...         else:
...             object.__setattr__(self, n, v)
...
>>> c = Sophisticated()
>>> d = Sophisticated()
>>> c.__int__ = lambda self: 54
>>> d.__int__ = lambda self: 88
>>> int(c)
54
>>> int(d)
88
The only recourse that works for new-style classes is to have a method on the class that calls the attribute on the instance (if it exists):
class A(object):
    def __int__(self):
        if '__int__' in self.__dict__:
            return self.__int__()
        raise ValueError

a = A()
a.__int__ = lambda: 3
int(a)
Note that a.__int__ will not be a method (only functions that are attributes of the class will become methods) so self is not passed implicitly.
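The lookup rule itself is easy to observe in Python 3. This sketch starts from a class without __int__ and uses an illustrative flag named ignored to record whether int() skipped the instance attribute:

```python
class A(object):
    pass


a = A()
a.__int__ = lambda: 3        # instance attribute: not consulted by int()

try:
    int(a)
    ignored = False
except TypeError:
    ignored = True

A.__int__ = lambda self: 7   # class attribute: this is what int() uses

print(ignored)
print(int(a))
print(a.__int__())           # explicit attribute access still finds the instance one
```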
I have nothing to add about the specifics of overriding __int__. But I noticed one thing about your sample that bears discussing.
When you manually assign new methods to an object, "self" is not automatically passed in. I've modified your sample code to make my point clearer:
class A(object): pass
a = A()
a.foo = lambda self: 3
a.foo()
If you run this code, it throws an exception because you passed in 0 arguments to "foo" and 1 is required. If you remove the "self" it works fine.
Python only automatically prepends "self" to the arguments if it had to look up the method in the class of the object and the function it found is a "normal" function. (Examples of "abnormal" functions: class methods, callable objects, bound method objects.) If you stick callables into the object itself, they won't automatically get "self".
If you want self there, use a closure.
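A sketch of what the closure could look like, together with an alternative that is not mentioned above: binding the function by hand with types.MethodType. The helper make_foo and the attribute bar are names made up for this example.

```python
import types


class A(object):
    pass


a = A()


# Option 1: a closure that captures the instance explicitly.
def make_foo(obj):
    def foo():
        return obj
    return foo


a.foo = make_foo(a)

# Option 2 (my addition): bind a one-argument function to the instance by hand.
a.bar = types.MethodType(lambda self: self, a)

print(a.foo() is a)
print(a.bar() is a)
```

Either way the callable is stored in the instance's own __dict__, so other instances of A are unaffected.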
In the following, setattr succeeds in the first invocation, but fails in the second, with:
AttributeError: 'method' object has no attribute 'i'
Why is this, and is there a way of setting an attribute on a method such that it will only exist on one instance, not for each instance of the class?
class c:
    def m(self):
        print(type(c.m))
        setattr(c.m, 'i', 0)
        print(type(self.m))
        setattr(self.m, 'i', 0)
Python 3.2.2
The short answer: There is no way of adding custom attributes to bound methods.
The long answer follows.
In Python, there are function objects and method objects. When you define a class, the def statement creates a function object that lives within the class' namespace:
>>> class c:
...     def m(self):
...         pass
...
>>> c.m
<function m at 0x025FAE88>
Function objects have a special __dict__ attribute that can hold user-defined attributes:
>>> c.m.i = 0
>>> c.m.__dict__
{'i': 0}
Method objects are different beasts. They are tiny objects just holding a reference to the corresponding function object (__func__) and one to its host object (__self__):
>>> c().m
<bound method c.m of <__main__.c object at 0x025206D0>>
>>> c().m.__self__
<__main__.c object at 0x02625070>
>>> c().m.__func__
<function m at 0x025FAE88>
>>> c().m.__func__ is c.m
True
Method objects provide a special __getattr__ that forwards attribute access to the function object:
>>> c().m.i
0
This is also true for the __dict__ property:
>>> c().m.__dict__['a'] = 42
>>> c.m.a
42
>>> c().m.__dict__ is c.m.__dict__
True
Setting attributes follows the default rules, though, and since they don't have their own __dict__, there is no way to set arbitrary attributes.
This is similar to user-defined classes that define __slots__ without a __dict__ slot: trying to set a non-existent slot raises an AttributeError (see the docs on __slots__ for more information):
>>> class c:
...     __slots__ = ('a', 'b')
...
>>> x = c()
>>> x.a = 1
>>> x.b = 2
>>> x.c = 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'c' object has no attribute 'c'
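Putting the pieces together for the original class, a short Python 3 sketch of what fails and of the two places where the attribute can be stored instead. The flag set_failed is only there to record the exception:

```python
class c:
    def m(self):
        pass


obj = c()

set_failed = False
try:
    obj.m.i = 0                  # bound methods have no __dict__ of their own
except AttributeError:
    set_failed = True
print(set_failed)

obj.m.__func__.i = 0             # stored on the function: shared by all instances
print(c().m.i)                   # reads are forwarded to the function

obj.i = 1                        # stored on the instance: visible only there
print(obj.i, hasattr(c(), 'i'))
```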
Q: "Is there a way of setting an attribute on a method such that it will only exist on one instance, not for each instance of the class?"
A: Yes:
class c:
    def m(self):
        print(type(c.m))
        setattr(c.m, 'i', 0)
        print(type(self))
        setattr(self, 'i', 0)
The static variable on functions in the post you link to is not useful for methods. It sets an attribute on the function so that this attribute is available the next time the function is called, so you can make a counter or whatnot.
But methods have an object instance associated with them (self). Hence you have no need to set attributes on the method, as you simply can set it on the instance instead. That is in fact exactly what the instance is for.
The post you link to shows how to make a function with a static variable. I would say that in Python doing so would be misguided. Instead look at this answer: What is the Python equivalent of static variables inside a function?
That is the way to do it in Python in a way that is clear and easily understandable. You use a class and make it callable. Setting attributes on functions is possible and there are probably cases where it's a good idea, but in general it will just end up confusing people.
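A minimal sketch of that approach, assuming the goal is a simple call counter; the class name Counter and its count attribute are invented for this example:

```python
class Counter(object):
    """Hypothetical example: state lives on the instance, not on a function."""

    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count


counter = Counter()
other = Counter()

print(counter(), counter(), counter())
print(other())   # each instance keeps its own state
```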