Terminology: A user-defined function object attribute? - python

According to Python 2.7.12 documentation, User-defined methods:
User-defined method objects may be created when getting an attribute
of a class (perhaps via an instance of that class), if that attribute
is a user-defined function object, an unbound user-defined method
object, or a class method object. When the attribute is a
user-defined method object, a new method object is only created if the
class from which it is being retrieved is the same as, or a derived
class of, the class stored in the original method object; otherwise,
the original method object is used as it is.
I know that everything in Python is an object, so a "user-defined method" must be identical to a "user-defined method object". However, I can't understand why there is a "user-defined function object attribute". Say, in the following code:
class Foo(object):
def meth(self):
pass
meth is a function defined inside a class body, and thus a method. So why can we have a "user-defined function object attribute"? Aren't all attributes defined inside a class body?
Bouns question: Provide some examples illustrating how a user-defined method object is created by getting an attribute of a class. Isn't objects defined in their class definition? (I know methods can be assigned to a class instance, but that's monkey patching.)
I'm asking for help because this part of document is really really confusing to me, a programmer who only knows C, since Python is such a magical language that supports both functional programming and object-oriented programmer, which I haven't mastered yet. I've done a lot of search, but still can't figure that out.

When you do
class Foo(object):
def meth(self):
pass
you are defining a class Foo with a method meth. However, when this class definition is executed, no method object is created to represent the method. The def statement creates an ordinary function object.
If you then do
Foo.meth
or
Foo().meth
the attribute lookup finds the function object, but the function object is not used as the value of the attribute. Instead, using the descriptor protocol, Python calls the __get__ method of the function object to construct a method object, and that method object is used as the value of the attribute for that lookup. For Foo.meth, the method object is an unbound method object, which behaves like the function you defined, but with an extra type checking of self. For Foo().meth, the method object is a bound method object, which already knows what self is.
This is why Foo().meth() doesn't complain about a missing self argument; you pass 0 arguments to the method object, which then prepends self to the (empty) argument list and passes the arguments on to the underlying function object. If Foo().meth evaluated to the meth function directly, you would have to pass it self explicitly.
In Python 3, Foo.meth doesn't create an unbound method object; the function's __get__ still gets called, but it returns the function directly, since Guido decided unbound method objects weren't useful. Foo().meth still creates a bound method object, though.

Related

function descriptor on builtin types like dict.fromkeys behaves different to a normal method

Why does the function descriptor for dict.fromkeys function differently then other normal functions.
First of all you can't access __get__ like this: dict.fromkeys.__get__ you have to get it from the __dict__. (dict.__dict__['fromkeys'].__get__)
And then it doesn't work like any other function because it will only let itself get bound to a dict.
This works fine what I expected:
class Thing:
def __init__(self):
self.v = 5
def test(self):
print self.v
class OtherThing:
def __init__(self):
self.v = 6
print Thing.test
Thing.test.__get__(OtherThing())
this however does something unexpected:
#unbound method fromkeys
func = dict.__dict__["fromkeys"]
but it's description differs from a normal unbound function to looks like: <method 'fromkeys' of 'dict' objects> instead of: <unbound method dict.fromkeys> which was what I
this works as expected:
func.__get__({})([1,2,3])
but you can't bind it to something else I understand It wouldn't work but that doesn't usually stop us:
func.__get__([])([1,2,3])
This fails with a type error from the function descriptor...:
descriptor 'fromkeys' for type 'dict' doesn't apply to type 'list'
Why does python differentiate builtin type functions and normal functions like this? And can we do this too? Can we make a function that will only be bound to the type it belongs to?
class_or_type.descriptor always calls descriptor.__get__(None, class_or_type), which is why you can't get __get__ from a built-in unbound method object. The (unbound) method type produced for __get__ on a Python 2 function explicitly implements __get__ because Python methods are much more flexible. So both builtin.descriptor and class.descriptor invoke .__get__(), but the object type that either returns differs, and for Python methods the object type proxies attribute access:
Accessing attributes via __dict__, on the other hand, never calls __get__, so class_or_type.__dict__['fromkeys'] gives you the original descriptor object.
Under the hood, types like dict, including their methods, are implemented in C code, not in Python code, and C is much more picky about types. You can't ever use functions that are designed to work with the internal implementation details of a dict object with a list instead, so why bother?
That's really all there is to it; methods for built-in types are not the same thing as functions / methods implemented in Python, so while they mostly work the same, they can't work with dynamic types, and the implementation of how the descriptors work reflects that.
And because Python functions (wrapped in methods or not), could work with any Python type (if so designed), unbound method objects support __get__ so that you can take any such object and stick it onto another class; by supporting __get__ they support being re-bound to the new class.
If you want to achieve the same thing with Python types, you'll have to wrap the function objects in custom descriptors, then implement your own __get__ behaviour to restrict what is acceptable.
Note that in Python 3, function.__get__(None, classobject) just returns the function object itself, not an unbound method object like Python 2 does. The whole unbound / bound method distinction, where unbound method objects restrict the type for the first argument when called, wasn't found to be all that useful so it was dropped.

Are there more than three types of methods in Python?

I understand there are at least 3 kinds of methods in Python having different first arguments:
instance method - instance, i.e. self
class method - class, i.e. cls
static method - nothing
These classic methods are implemented in the Test class below including an usual method:
class Test():
def __init__(self):
pass
def instance_mthd(self):
print("Instance method.")
#classmethod
def class_mthd(cls):
print("Class method.")
#staticmethod
def static_mthd():
print("Static method.")
def unknown_mthd():
# No decoration --> instance method, but
# No self (or cls) --> static method, so ... (?)
print("Unknown method.")
In Python 3, the unknown_mthd can be called safely, yet it raises an error in Python 2:
>>> t = Test()
>>> # Python 3
>>> t.instance_mthd()
>>> Test.class_mthd()
>>> t.static_mthd()
>>> Test.unknown_mthd()
Instance method.
Class method.
Static method.
Unknown method.
>>> # Python 2
>>> Test.unknown_mthd()
TypeError: unbound method unknown_mthd() must be called with Test instance as first argument (got nothing instead)
This error suggests such a method was not intended in Python 2. Perhaps its allowance now is due to the elimination of unbound methods in Python 3 (REF 001). Moreover, unknown_mthd does not accept args, and it can be bound to called by a class like a staticmethod, Test.unknown_mthd(). However, it is not an explicit staticmethod (no decorator).
Questions
Was making a method this way (without args while not explicitly decorated as staticmethods) intentional in Python 3's design? UPDATED
Among the classic method types, what type of method is unknown_mthd?
Why can unknown_mthd be called by the class without passing an argument?
Some preliminary inspection yields inconclusive results:
>>> # Types
>>> print("i", type(t.instance_mthd))
>>> print("c", type(Test.class_mthd))
>>> print("s", type(t.static_mthd))
>>> print("u", type(Test.unknown_mthd))
>>> print()
>>> # __dict__ Types, REF 002
>>> print("i", type(t.__class__.__dict__["instance_mthd"]))
>>> print("c", type(t.__class__.__dict__["class_mthd"]))
>>> print("s", type(t.__class__.__dict__["static_mthd"]))
>>> print("u", type(t.__class__.__dict__["unknown_mthd"]))
>>> print()
i <class 'method'>
c <class 'method'>
s <class 'function'>
u <class 'function'>
i <class 'function'>
c <class 'classmethod'>
s <class 'staticmethod'>
u <class 'function'>
The first set of type inspections suggests unknown_mthd is something similar to a staticmethod. The second suggests it resembles an instance method. I'm not sure what this method is or why it should be used over the classic ones. I would appreciate some advice on how to inspect and understand it better. Thanks.
REF 001: What's New in Python 3: “unbound methods” has been removed
REF 002: How to distinguish an instance method, a class method, a static method or a function in Python 3?
REF 003: What's the point of #staticmethod in Python?
Some background: In Python 2, "regular" instance methods could give rise to two kinds of method objects, depending on whether you accessed them via an instance or the class. If you did inst.meth (where inst is an instance of the class), you got a bound method object, which keeps track of which instance it is attached to, and passes it as self. If you did Class.meth (where Class is the class), you got an unbound method object, which had no fixed value of self, but still did a check to make sure a self of the appropriate class was passed when you called it.
In Python 3, unbound methods were removed. Doing Class.meth now just gives you the "plain" function object, with no argument checking at all.
Was making a method this way intentional in Python 3's design?
If you mean, was removal of unbound methods intentional, the answer is yes. You can see discussion from Guido on the mailing list. Basically it was decided that unbound methods add complexity for little gain.
Among the classic method types, what type of method is unknown_mthd?
It is an instance method, but a broken one. When you access it, a bound method object is created, but since it accepts no arguments, it's unable to accept the self argument and can't be successfully called.
Why can unknown_mthd be called by the class without passing an argument?
In Python 3, unbound methods were removed, so Test.unkown_mthd is just a plain function. No wrapping takes place to handle the self argument, so you can call it as a plain function that accepts no arguments. In Python 2, Test.unknown_mthd is an unbound method object, which has a check that enforces passing a self argument of the appropriate class; since, again, the method accepts no arguments, this check fails.
#BrenBarn did a great job answering your question. This answer however, adds a plethora of details:
First of all, this change in bound and unbound method is version-specific, and it doesn't relate to new-style or classic classes:
2.X classic classes by default
>>> class A:
... def meth(self): pass
...
>>> A.meth
<unbound method A.meth>
>>> class A(object):
... def meth(self): pass
...
>>> A.meth
<unbound method A.meth>
3.X new-style classes by default
>>> class A:
... def meth(self): pass
...
>>> A.meth
<function A.meth at 0x7efd07ea0a60>
You've already mentioned this in your question, it doesn't hurt to mention it twice as a reminder.
>>> # Python 2
>>> Test.unknown_mthd()
TypeError: unbound method unknown_mthd() must be called with Test instance as first argument (got nothing instead)
Moreover, unknown_mthd does not accept args, and it can be bound to a class like a staticmethod, Test.unknown_mthd(). However, it is not an explicit staticmethod (no decorator)
unknown_meth doesn't accept args, normally because you've defined the function without so that it does not take any parameter. Be careful and cautious, static methods as well as your coded unknown_meth method will not be magically bound to a class when you reference them through the class name (e.g, Test.unknown_meth). Under Python 3.X Test.unknow_meth returns a simple function object in 3.X, not a method bound to a class.
1 - Was making a method this way (without args while not explicitly decorated as staticmethods) intentional in Python 3's design? UPDATED
I cannot speak for CPython developers nor do I claim to be their representative, but from my experience as a Python programmer, it seems like they wanted to get rid of a bad restriction, especially given the fact that Python is extremely dynamic, not a language of restrictions; why would you test the type of objects passed to class methods and hence restrict the method to specific instances of classes? Type testing eliminates polymorphism. It would be decent if you just return a simple function when a method is fetched through the class which functionally behaves like a static method, you can think of unknown_meth to be static method under 3.X so long as you're careful not to fetch it through an instance of Test you're good to go.
2- Among the classic method types, what type of method is unknown_mthd?
Under 3.X:
>>> from types import *
>>> class Test:
... def unknown_mthd(): pass
...
>>> type(Test.unknown_mthd) is FunctionType
True
It's simply a function in 3.X as you could see. Continuing the previous session under 2.X:
>>> type(Test.__dict__['unknown_mthd']) is FunctionType
True
>>> type(Test.unknown_mthd) is MethodType
True
unknown_mthd is a simple function that lives inside Test__dict__, really just a simple function which lives inside the namespace dictionary of Test. Then, when does it become an instance of MethodType? Well, it becomes an instance of MethodType when you fetch the method attribute either from the class itself which returns an unbound method or its instances which returns a bound method. In 3.X, Test.unknown_mthd is a simple function--instance of FunctionType, and Test().unknown_mthd is an instance of MethodType that retains the original instance of class Test and adds it as the first argument implicitly on function calls.
3- Why can unknown_mthd be called by the class without passing an argument?
Again, because Test.unknown_mthd is just a simple function under 3.X. Whereas in 2.X, unknown_mthd not a simple function and must be called be passed an instance of Test when called.
Are there more than three types of methods in Python?
Yes. There are the three built-in kinds that you mention (instance method, class method, static method), four if you count #property, and anyone can define new method types.
Once you understand the mechanism for doing this, it's easy to explain why unknown_mthd is callable from the class in Python 3.
A new kind of method
Suppose we wanted to create a new type of method, call it optionalselfmethod so that we could do something like this:
class Test(object):
#optionalselfmethod
def optionalself_mthd(self, *args):
print('Optional-Self Method:', self, *args)
The usage is like this:
In [3]: Test.optionalself_mthd(1, 2)
Optional-Self Method: None 1 2
In [4]: x = Test()
In [5]: x.optionalself_mthd(1, 2)
Optional-Self Method: <test.Test object at 0x7fe80049d748> 1 2
In [6]: Test.instance_mthd(1, 2)
Instance method: 1 2
optionalselfmethod works like a normal instance method when called on an instance, but when called on the class, it always receives None for the first parameter. If it were a normal instance method, you would always have to pass an explicit value for the self parameter in order for it to work.
So how does this work? How you can you create a new method type like this?
The Descriptor Protocol
When Python looks up a field of an instance, i.e. when you do x.whatever, it check in several places. It checks the instance's __dict__ of course, but it also checks the __dict__ of the object's class, and base classes thereof. In the instance dict, Python is just looking for the value, so if x.__dict__['whatever'] exists, that's the value. However, in the class dict, Python is looking for an object which implements the Descriptor Protocol.
The Descriptor Protocol is how all three built-in kinds of methods work, it's how #property works, and it's how our special optionalselfmethod will work.
Basically, if the class dict has a value with the correct name1, Python checks if it has an __get__ method, and calls it like type(x).whatever.__get__(x, type(x)) Then, the value returned from __get__ is used as the field value.
So for example, a trivial descriptor which always returns 3:
class GetExample:
def __get__(self, instance, cls):
print("__get__", instance, cls)
return 3
class Test:
get_test = GetExample()
Usage is like this:
In[22]: x = Test()
In[23]: x.get_test
__get__ <__main__.Test object at 0x7fe8003fc470> <class '__main__.Test'>
Out[23]: 3
Notice that the descriptor is called with both the instance and the class type. It can also be used on the class:
In [29]: Test.get_test
__get__ None <class '__main__.Test'>
Out[29]: 3
When a descriptor is used on a class rather than an instance, the __get__ method gets None for self, but still gets the class argument.
This allows a simple implementation of methods: functions simply implement the descriptor protocol. When you call __get__ on a function, it returns a bound method of instance. If the instance is None, it returns the original function. You can actually call __get__ yourself to see this:
In [30]: x = object()
In [31]: def test(self, *args):
...: print(f'Not really a method: self<{self}>, args: {args}')
...:
In [32]: test
Out[32]: <function __main__.test>
In [33]: test.__get__(None, object)
Out[33]: <function __main__.test>
In [34]: test.__get__(x, object)
Out[34]: <bound method test of <object object at 0x7fe7ff92d890>>
#classmethod and #staticmethod are similar. These decorators create proxy objects with __get__ methods which provide different binding. Class method's __get__ binds the method to the instance, and static method's __get__ doesn't bind to anything, even when called on an instance.
The Optional-Self Method Implementation
We can do something similar to create a new method which optionally binds to an instance.
import functools
class optionalselfmethod:
def __init__(self, function):
self.function = function
functools.update_wrapper(self, function)
def __get__(self, instance, cls):
return boundoptionalselfmethod(self.function, instance)
class boundoptionalselfmethod:
def __init__(self, function, instance):
self.function = function
self.instance = instance
functools.update_wrapper(self, function)
def __call__(self, *args, **kwargs):
return self.function(self.instance, *args, **kwargs)
def __repr__(self):
return f'<bound optionalselfmethod {self.__name__} of {self.instance}>'
When you decorate a function with optionalselfmethod, the function is replaced with our proxy. This proxy saves the original method and supplies a __get__ method which returns a boudnoptionalselfmethod. When we create a boundoptionalselfmethod, we tell it both the function to call and the value to pass as self. Finally, calling the boundoptionalselfmethod calls the original function, but with the instance or None inserted into the first parameter.
Specific Questions
Was making a method this way (without args while not explicitly
decorated as staticmethods) intentional in Python 3's design? UPDATED
I believe this was intentional; however the intent would have been to eliminate unbound methods. In both Python 2 and Python 3, def always creates a function (you can see this by checking a type's __dict__: even though Test.instance_mthd comes back as <unbound method Test.instance_mthd>, Test.__dict__['instance_mthd'] is still <function instance_mthd at 0x...>).
In Python 2, function's __get__ method always returns a instancemethod, even when accessed through the class. When accessed through an instance, the method is bound to that instance. When accessed through the class, the method is unbound, and includes a mechanism which checks that the first argument is an instance of the correct class.
In Python 3, function's __get__ method will return the original function unchanged when accessed through the class, and a method when accessed through the instance.
I don't know the exact rationale but I would guess that type-checking of the first argument to a class-level function was deemed unnecessary, maybe even harmful; Python allows duck-typing after all.
Among the classic method types, what type of method is unknown_mthd?
unknown_mthd is a plain function, just like any normal instance method. It only fails when called through the instance because when method.__call__ attempts to call the function unknown_mthd with the bound instance, it doesn't accept enough parameters to receive the instance argument.
Why can unknown_mthd be called by the class without passing an
argument?
Because it's just a plain function, same as any other function. I just doesn't take enough arguments to work correctly when used as an instance method.
You may note that both classmethod and staticmethod work the same whether they're called through an instance or a class, whereas unknown_mthd will only work correctly when when called through the class and fail when called through an instance.
1. If a particular name has both a value in the instance dict and a descriptor in the class dict, which one is used depends on what kind of descriptor it is. If the descriptor only defines __get__, the value in the instance dict is used. If the descriptor also defines __set__, then it's a data-descriptor and the descriptor always wins. This is why you can assign over top of a method but not a #property; method only define __get__, so you can put things in the same-named slot in the instance dict, while #properties define __set__, so even if they're read-only, you'll never get a value from the instance __dict__ even if you've previously bypassed property lookup and stuck a value in the dict with e.g. x.__dict__['whatever'] = 3.

if we create any method in a class in python, we always use self as first argument, but while calling that method we dont pass? why?

class ABC:
def xyz(self):
print "in xyz"
obj = ABC()
print obj.xyz()
output : in xyz
here self is not given as parameter while calling xyz with obj.
That's because self is, by default, the instance itself. obj.xyz() is equivalent to ABC.xyz(obj).
The first argument of every class method, including __init__, is always a reference to the current instance of the class.
By convention, this argument is always named self.
In the __init__ method, self refers to the newly created object; in other instance methods, it refers to the instance whose method was called. Although you need to specify self explicitly when defining the method, you do not specify it when calling the method; Python will add it for you automatically.
Learn more about self here and here
Technically, this is what happens:
It tries to evaluate obj.xyz
It funds that the object obj has no attribute named xyz
It then looks in obj's class (ABC) for an attribute named xyz, which it does have
ABC.xyz is a function, so then it "wraps" it into a "instance method". Basically, ABC.xyz is a function with one parameter (self), so the "instance method" is a function with one less parameter from the original function (no parameters in this case), which remembers the object obj. And if you call this "instance method", it passes it onto ABC.xyz, with the first argument being obj, the rest of the arguments being the argument to the bound method.
obj.xyz evaluates to this instance method
You make a call on this instance method, which calls ABC.xyz, with the remembered instance (obj) as the first argument.
Here are the relevant parts from the language reference (scroll from here):
Class instances
A class instance is created by calling a class object (see above). A class instance has a namespace implemented as a dictionary which is
the first place in which attribute references are searched. When an
attribute is not found there, and the instance’s class has an
attribute by that name, the search continues with the class
attributes. If a class attribute is found that is a user-defined
function object, it is transformed into an instance method object
whose __self__ attribute is the instance. [...]
and
Instance methods
[...]
When an instance method object is created by retrieving a user-defined
function object from a class via one of its instances, its __self__
attribute is the instance, and the method object is said to be bound.
The new method’s __func__ attribute is the original function object.
[...]
When an instance method object is called, the underlying function
(__func__) is called, inserting the class instance (__self__) in
front of the argument list. For instance, when C is a class which
contains a definition for a function f(), and x is an instance of
C, calling x.f(1) is equivalent to calling C.f(x, 1).
[...]
In the backend of the code obj.xyz(). The parameter returned is object(obj) itself that means ABC.xyz(obj)

Python Suite, Package, Module, TestCase and TestSuite differences

Best Guess:
method - def(self, maybeSomeVariables); lines of code which achieve some purpose
Function - same as method but returns something
Class - group of methods/functions
Module - a script, OR one or more classes. Basically a .py file.
Package - a folder which has modules in, and also a __init__.py file in there.
Suite - Just a word that gets thrown around a lot, by convention
TestCase - unittest's equivalent of a function
TestSuite - unittest's equivalent of a Class (or Module?)
My question is: Is this completely correct, and did I miss any hierarchical building blocks from that list?
I feel that you're putting in differences that don't actually exist. There isn't really a hierarchy as such. In python everything is an object. This isn't some abstract notion, but quite fundamental to how you should think about constructs you create when using python. An object is just a bunch of other objects. There is a slight subtlety in whether you're using new-style classes or not, but in the absence of a good reason otherwise, just use and assume new-style classes. Everything below is assuming new-style classes.
If an object is callable, you can call it using the calling syntax of a pair of braces, with the arguments inside them: my_callable(arg1, arg2). To be callable, an object needs to implement the __call__ method (or else have the correct field set in its C level type definition).
In python an object has a type associated with it. The type describes how the object was constructed. So, for example, a list object is of type list and a function object is of type function. The types themselves are of type type. You can find the type by using the built-in function type(). A list of all the built-in types can be found in the python documentation. Types are actually callable objects, and are used to create instances of a given type.
Right, now that's established, the nature of a given object is defined by it's type. This describes the objects of which it comprises. Coming back to your questions then:
Firstly, the bunch of objects that make up some object are called the attributes of that object. These attributes can be anything, but they typically consist of methods and some way of storing state (which might be types such as int or list).
A function is an object of type function. Crucially, that means it has the __call__ method as an attribute which makes it a callable (the __call__ method is also an object that itself has the __call__ method. It's __call__ all the way down ;)
A class, in the python world, can be considered as a type, but typically is used to refer to types that are not built-in. These objects are used to create other objects. You can define your own classes with the class keyword, and to create a class which is new-style you must inherit from object (or some other new-style class). When you inherit, you create a type that acquires all the characteristics of the parent type, and then you can overwrite the bits you want to (and you can overwrite any bits you want!). When you instantiate a class (or more generally, a type) by calling it, another object is returned which is created by that class (how the returned object is created can be changed in weird and crazy ways by modifying the class object).
A method is a special type of function that is called using the attribute notation. That is, when it is created, 2 extra attributes are added to the method (remember it's an object!) called im_self and im_func. im_self I will describe in a few sentences. im_func is a function that implements the method. When the method is called, like, for example, foo.my_method(10), this is equivalent to calling foo.my_method.im_func(im_self, 10). This is why, when you define a method, you define it with the extra first argument which you apparently don't seem to use (as self).
When you write a bunch of methods when defining a class, these become unbound methods. When you create an instance of that class, those methods become bound. When you call an bound method, the im_self argument is added for you as the object in which the bound method resides. You can still call the unbound method of the class, but you need to explicitly add the class instance as the first argument:
class Foo(object):
def bar(self):
print self
print self.bar
print self.bar.im_self # prints the same as self
We can show what happens when we call the various manifestations of the bar method:
>>> a = Foo()
>>> a.bar()
<__main__.Foo object at 0x179b610>
<bound method Foo.bar of <__main__.Foo object at 0x179b610>>
<__main__.Foo object at 0x179b610>
>>> Foo.bar()
TypeError: unbound method bar() must be called with Foo instance as first argument (got nothing instead)
>>> Foo.bar(a)
<__main__.Foo object at 0x179b610>
<bound method Foo.bar of <__main__.Foo object at 0x179b610>>
<__main__.Foo object at 0x179b610>
Bringing all the above together, we can define a class as follows:
class MyFoo(object):
a = 10
def bar(self):
print self.a
This generates a class with 2 attributes: a (which is an integer of value 10) and bar, which is an unbound method. We can see that MyFoo.a is just 10.
We can create extra attributes at run time, both within the class methods, and outside. Consider the following:
class MyFoo(object):
a = 10
def __init__(self):
self.b = 20
def bar(self):
print self.a
print self.b
def eep(self):
print self.c
__init__ is just the method that is called immediately after an object has been created from a class.
>>> foo = Foo()
>>> foo.bar()
10
20
>>> foo.eep()
AttributeError: 'MyFoo' object has no attribute 'c'
>>> foo.c = 30
>>> foo.eep()
30
This example shows 2 ways of adding an attribute to a class instance at run time (that is, after the object has been created from it's class).
I hope you can see then, that TestCase and TestSuite are just classes that are used to create test objects. There's nothing special about them except that they happen to have some useful features for writing tests. You can subclass and overwrite them to your heart's content!
Regarding your specific point, both methods and functions can return anything they want.
Your description of module, package and suite seems pretty sound. Note that modules are also objects!

How does a python method automatically receive 'self' as the first argument?

Consider this example of a strategy pattern in Python (adapted from the example here). In this case the alternate strategy is a function.
class StrategyExample(object):
def __init__(self, strategy=None) :
if strategy:
self.execute = strategy
def execute(*args):
# I know that the first argument for a method
# must be 'self'. This is just for the sake of
# demonstration
print locals()
#alternate strategy is a function
def alt_strategy(*args):
print locals()
Here are the results for the default strategy.
>>> s0 = StrategyExample()
>>> print s0
<__main__.StrategyExample object at 0x100460d90>
>>> s0.execute()
{'args': (<__main__.StrategyExample object at 0x100460d90>,)}
In the above example s0.execute is a method (not a plain vanilla function) and hence the first argument in args, as expected, is self.
Here are the results for the alternate strategy.
>>> s1 = StrategyExample(alt_strategy)
>>> s1.execute()
{'args': ()}
In this case s1.execute is a plain vanilla function and as expected, does not receive self. Hence args is empty. Wait a minute! How did this happen?
Both the method and the function were called in the same fashion. How does a method automatically get self as the first argument? And when a method is replaced by a plain vanilla function how does it not get the self as the first argument?
The only difference that I was able to find was when I examined the attributes of default strategy and alternate strategy.
>>> print dir(s0.execute)
['__cmp__', '__func__', '__self__', ...]
>>> print dir(s1.execute)
# does not have __self__ attribute
Does the presence of __self__ attribute on s0.execute (the method), but lack of it on s1.execute (the function) somehow account for this difference in behavior? How does this all work internally?
You can read the full explanation here in the python reference, under "User defined methods". A shorter and easier explanation can be found in the python tutorial's description of method objects:
If you still don’t understand how methods work, a look at the implementation can perhaps clarify matters. When an instance attribute is referenced that isn’t a data attribute, its class is searched. If the name denotes a valid class attribute that is a function object, a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object: this is the method object. When the method object is called with an argument list, a new argument list is constructed from the instance object and the argument list, and the function object is called with this new argument list.
Basically, what happens in your example is this:
a function assigned to a class (as happens when you declare a method inside the class body) is... a method.
When you access that method through the class, eg. StrategyExample.execute you get an "unbound method": it doesn't "know" to which instance it "belongs", so if you want to use that on an instance, you would need to provide the instance as the first argument yourself, eg. StrategyExample.execute(s0)
When you access the method through the instance, eg. self.execute or s0.execute, you get a "bound method": it "knows" which object it "belongs" to, and will get called with the instance as the first argument.
a function that you assign to an instance attribute directly however, as in self.execute = strategy or even s0.execute = strategy is... just a plain function (contrary to a method, it doesn't pass via the class)
To get your example to work the same in both cases:
either you turn the function into a "real" method: you can do this with types.MethodType:
self.execute = types.MethodType(strategy, self, StrategyExample)
(you more or less tell the class that when execute is asked for this particular instance, it should turn strategy into a bound method)
or - if your strategy doesn't really need access to the instance - you go the other way around and turn the original execute method into a static method (making it a normal function again: it won't get called with the instance as the first argument, so s0.execute() will do exactly the same as StrategyExample.execute()):
#staticmethod
def execute(*args):
print locals()
You need to assign an unbound method (i.e. with a self parameter) to the class or a bound method to the object.
Via the descriptor mechanism, you can make your own bound methods, it's also why it works when you assign the (unbound) function to a class:
my_instance = MyClass()
MyClass.my_method = my_method
When calling my_instance.my_method(), the lookup will not find an entry on my_instance, which is why it will at a later point end up doing this: MyClass.my_method.__get__(my_instance, MyClass) - this is the descriptor protocol. This will return a new method that is bound to my_instance, which you then execute using the () operator after the property.
This will share method among all instances of MyClass, no matter when they were created. However, they could have "hidden" the method before you assigned that property.
If you only want specific objects to have that method, just create a bound method manually:
my_instance.my_method = my_method.__get__(my_instance, MyClass)
For more detail about descriptors (a guide), see here.
The method is a wrapper for the function, and calls the function with the instance as the first argument. Yes, it contains a __self__ attribute (also im_self in Python prior to 3.x) that keeps track of which instance it is attached to. However, adding that attribute to a plain function won't make it a method; you need to add the wrapper. Here is how (although you may want to use MethodType from the types module to get the constructor, rather than using type(some_obj.some_method).
The function wrapped, by the way, is accessible through the __func__ (or im_func) attribute of the method.
When you do self.execute = strategy you set the attribute to a plain method:
>>> s = StrategyExample()
>>> s.execute
<bound method StrategyExample.execute of <__main__.StrategyExample object at 0x1dbbb50>>
>>> s2 = StrategyExample(alt_strategy)
>>> s2.execute
<function alt_strategy at 0x1dc1848>
A bound method is a callable object that calls a function passing an instance as the first argument in addition to passing through all arguments it was called with.
See: Python: Bind an Unbound Method?

Categories