Method object implementation in Python | Ambiguous statements from the documentation - python

I have some confusion about the method implementation in Python. Any help in that regard would be appreciated.
I found this closely-related post, which is quite useful, but it does not answer my questions.
My problem:
The following statements from the last paragraph of Section 9.3.4 from Python documentation are not quite clear to me.
When a non-data attribute of an instance is referenced, the instance’s
class is searched. If the name denotes a valid class attribute that is
a function object, a method object is created by packing (pointers to)
the instance object and the function object just found together in an
abstract object: this is the method object. When the method object is
called with an argument list, a new argument list is constructed from
the instance object and the argument list, and the function object is
called with this new argument list.
My initial understanding from the documentation was very similar to that of @vaughn-mcgill-adami, as described in their post:
when a class instance object calls its method object,
a search would be done on the class object for the corresponding function object. If found, then calling that function owned by the class object with the class instance object as the first parameter.
However, after some coding, it seems to me that my understanding may not be correct. The reason is that one could delete the class object. Yet, calling the method object can still be done with no problem, whereas passing the class instance object to the function object from the class object would result in an error, since there would be no class object anymore after its deletion.
The following code snippet aims at better explaining my questions:
class myclass:
    x = 0
    def inc(self):
        self.x += 1
        print(self.x)
myobj = myclass()
type(myobj.inc)   # output: method
type(myclass.inc) # output: function
# 1. calling method object multiple times
# 2. calling function object and passing the instance object multiple times
print(myobj.x)
for i in range(3):
    myobj.inc()
for i in range(3):
    myclass.inc(myobj)
#-------------
# output:
#-------------
# 0
# 1
# 2
# 3
# 4
# 5
# 6
#-------------
# deleting the class object
del myclass
# calling method object multiple times
for i in range(3):
    myobj.inc()
#-------------
# output:
#-------------
# 7
# 8
# 9
#-------------
# calling function object and passing the instance object
for i in range(3):
    myclass.inc(myobj)
#-------------
# output:
#-------------
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-177-86f094e96103> in <module>
1 for i in range(3):
----> 2 myclass.inc(myobj)
NameError: name 'myclass' is not defined
My understanding from the attached code snippet is that after instantiating a class object,
the method objects are attached to the class instance object. The method objects and the class instance object will then be independent of the class object.
Here are some ambiguities I have from the documentation:
What does it mean "...the instance’s class is searched"? I believe it should be "instance class object", but it does not seem to be a typo, since in the following sentence it says "If the name denotes a valid class attribute that is a function object...". Again, here I think the "method object" should replace the "function object".
"...a method object is created...". Is the method object created once it is called, or at the same time that the class instance object is created?
"by packing (pointers to) the instance object and the function object just found". Again, "function object" does not seem correct to me.
Basically, if the class object is deleted, as in my code snippet, then there is no class object to be searched on in the first place. There will be no function object to be pointed to either.
There is a good chance that I am missing something here. I would be grateful if anyone could please advise. Thank you

Basically, if the class object is deleted, as in my code snippet, then
there is no class object to be searched on in the first place. There
will be no function object to be pointed to either.
This is your fundamental misunderstanding.
del does not delete objects.
Indeed, Python the language provides no way to manually delete objects (or manually manage memory, i.e. it is a memory-managed language).
del removes names from namespaces.
When you del myclass, you simply remove that name from the module's namespace. The class object still exists: for one, it is referenced by every instance of the class (via myobj.__class__), and like any other object it will continue to exist as long as references to it exist.
So, consider the following example with del:
>>> x = [1,2,3]
>>> y = x
>>> print(x is y)
True
>>> print(id(x), id(y))
140664200270464 140664200270464
>>> x.append(1337)
>>> x
[1, 2, 3, 1337]
>>> y
[1, 2, 3, 1337]
>>> del x
>>> print(y)
[1, 2, 3, 1337]
So del x did not delete the object, it simply deleted the name x from that namespace. So if I do:
>>> print(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
del can only lead to the deletion of an object indirectly: if the name you remove from the namespace is the last reference to that object, then the object becomes eligible for garbage collection. In CPython, which uses reference counting as its main garbage collection strategy, it will be reclaimed immediately, but this may not be true in other implementations, e.g. Jython, which uses the Java runtime's garbage collector.
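Applied to the example from the question, a minimal sketch (the names MyClass, obj, first and second are illustrative, not from the original post) suggests why the method call keeps working after del: the instance still references its class, and a method object is built from the class's function each time the attribute is looked up.

```python
class MyClass:
    x = 0

    def inc(self):
        self.x += 1
        return self.x


obj = MyClass()
del MyClass  # unbinds the name only; the class object is still alive

# The instance keeps a reference to its class.
cls = type(obj)  # the same object as obj.__class__
print(cls.__name__)  # MyClass

# A method object is built on each attribute lookup, from the function
# found on the class plus the instance it was looked up on.
first = obj.inc
second = obj.inc
print(first is second)                        # False: two separate method objects
print(first.__func__ is cls.__dict__["inc"])  # True: both wrap the class's function
print(first.__self__ is obj)                  # True
print("inc" in vars(obj))                     # False: nothing is stored on the instance

print(first())       # 1
print(cls.inc(obj))  # 2: same function, instance passed explicitly
```

So the class is still searched on every lookup; only the module-level name that used to point at it is gone.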
Some relevant documentation for the del statement is here:
Deletion of a name removes the binding of that name from the local or
global namespace, depending on whether the name occurs in a global
statement in the same code block. If the name is unbound, a NameError
exception will be raised.
Note, the other part,
Deletion of attribute references, subscriptions and slicings is passed
to the primary object involved; deletion of a slicing is in general
equivalent to assignment of an empty slice of the right type (but even
this is determined by the sliced object).
Is basically saying that a statement like
del obj[item]
is delegated to
type(obj).__delitem__
i.e., it is just sugar for a call to a method.
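As a rough illustration of that delegation, here is a small sketch with a hypothetical Recorder class that only records which keys it was asked to delete:

```python
class Recorder:
    """Hypothetical class that records every key passed to __delitem__."""

    def __init__(self):
        self.deleted = []

    def __delitem__(self, key):
        self.deleted.append(key)


r = Recorder()
del r["a"]   # calls type(r).__delitem__(r, "a")
del r[1:3]   # a slicing is passed along as a slice object
print(r.deleted)  # ['a', slice(1, 3, None)]
```

Nothing is removed from memory here; what del does to an item is entirely up to the object's type.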

Related

"implicit uses of special methods always rely on the class-level binding of the special method"

I have difficulty understanding the last part (in bold) from Python in a Nutshell
Per-Instance Methods
An instance can have instance-specific bindings for all attributes,
including callable attributes (methods). For a method, just like for
any other attribute (except those bound to overriding descriptors),
an instance-specific binding hides a class-level binding:
attribute lookup does not consider the class when it finds a
binding directly in the instance. An instance-specific binding for a
callable attribute does not perform any of the transformations
detailed in “Bound and Unbound Methods” on page 110: the attribute
reference returns exactly the same callable object that was earlier
bound directly to the instance attribute.
However, this does not work as you might expect
for per-instance bindings of the special methods that Python calls
implicitly as a result of various operations, as covered in “Special
Methods” on page 123. Such implicit uses of special methods always
rely on the class-level binding of the special method, if any. For
example:
def fake_get_item(idx): return idx
class MyClass(object): pass
n = MyClass()
n.__getitem__ = fake_get_item
print(n[23]) # results in:
# Traceback (most recent call last):
# File "<stdin>", line 1, in ?
# TypeError: unindexable object
What does it mean specifically?
Why does the example raise an error?
Thanks.
Neglecting all the fine details, it basically says that special methods (as defined in Python's data model; generally these are the methods that start and end with two underscores and are rarely, if ever, called directly) will never be used implicitly from the instance, even if defined there:
n[whatever] # will always call type(n).__getitem__(n, whatever)
This differs from attribute look-up which checks the instance first:
def fake_get_item(idx):
    return idx
class MyClass(object):
    pass
n = MyClass()
n.__getitem__ = fake_get_item
print(n.__getitem__(23)) # works because attribute lookup checks the instance first
There is a whole section in the documentation about this (including rationale): "Special method lookup":
3.3.9. Special method lookup
For custom classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary. That behaviour is the reason why the following code raises an exception:
>>> class C:
... pass
...
>>> c = C()
>>> c.__len__ = lambda: 5
>>> len(c)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: object of type 'C' has no len()
The rationale behind this behaviour lies with a number of special methods such as __hash__() and __repr__() that are implemented by all objects, including type objects. If the implicit lookup of these methods used the conventional lookup process, they would fail when invoked on the type object itself:
>>> 1 .__hash__() == hash(1)
True
>>> int.__hash__() == hash(int)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor '__hash__' of 'int' object needs an argument
[...]
Bypassing the __getattribute__() machinery in this fashion provides significant scope for speed optimisations within the interpreter, at the cost of some flexibility in the handling of special methods (the special method must be set on the class object itself in order to be consistently invoked by the interpreter).
To put it even more plainly, it means that you can't redefine the dunder methods on the fly. As a consequence, ==, +, and the rest of the operators always mean the same thing for all objects of type T.
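A minimal sketch of that rule, assuming Python 3 (get_item and indexing_worked are illustrative names). It also suggests that rebinding the special method on the class, rather than on an instance, does take effect:

```python
class MyClass:
    pass


def get_item(self, idx):
    return idx


n = MyClass()

# Instance-level binding: visible to explicit attribute access only.
n.__getitem__ = lambda idx: "from instance"
print(n.__getitem__(23))  # from instance

try:
    n[23]  # implicit use looks at type(n), which has no __getitem__ yet
    indexing_worked = True
except TypeError:
    indexing_worked = False
print(indexing_worked)  # False

# Class-level binding: picked up by the n[...] syntax, even for existing instances.
MyClass.__getitem__ = get_item
print(n[23])  # 23
```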
I'll try to summarize what the extract says and in particular the part in bold.
Generally speaking, when Python tries to find the value of an attribute (including a method), it first checks the instance (i.e. the actual object you created), then the class.
The code below illustrates the generic behavior.
class MyClass(object):
    def a(self):
        print("howdy from the class")
n = MyClass()
# here the method defined on the class is called
n.a()
# 'howdy from the class'
def new_a():
    print("hello from new a")
n.a = new_a
# the new instance binding hides the class binding
n.a()
# 'hello from new a'
What the part in bold states is that this behavior does not apply to "Special Methods" such as __getitem__. In other words, overriding __getitem__ at the instance level (n.__getitem__ = fake_get_item in your example) does nothing: when the method is invoked through the n[...] syntax, an error is raised because the class does not implement the method.
(If the generic behavior also held in this case, the result of print(n[23]) would have been to print 23, i.e. executing the fake_get_item method).
Another example of the same behavior:
class MyClass(object):
    def __getitem__(self, idx):
        return idx
n = MyClass()
fake_get_item = lambda x: "fake"
print(fake_get_item(23))
# 'fake'
n.__getitem__ = fake_get_item
print(n[23])
# 23
In this example, the class method for __getitem__ (which returns the index number) is called instead of the instance binding (which returns 'fake').

Class name as a variable in python

I was just looking at one question here and the OP was using a same name for class, other things and also for variable. When I was trying to answer it, I became confused myself and thus thought of asking.
For example:
class MyClass:
    pass
MyClass=MyClass()
Though, I will never code anything like this. I would like to understand how this will be treated by python interpreter. So my question is, is the variable MyClass I will use will be created first or the other way? Which is, creating an instance of MyClass firstly and assigning it to MyClass variable. I think the latter is correct but if that is the case, how will the following be resolved?
class MyClass:
    pass
MyClass=MyClass()
new_class=MyClass()
The right-hand side of the assignment is processed first, so an instance of MyClass is created. But then you reassign the name MyClass to that instance. When you execute
new_class = MyClass()
you should get an error about MyClass not being callable, since that name now refers to an instance of the original class, not the class itself.
class MyClass:
    pass
MyClass=MyClass()
In simple terms, the above code does three things (in this order):
Defines the class MyClass.
Creates an instance of MyClass.
Assigns that instance to the variable MyClass.
After the last step, the class MyClass is overwritten and can no longer be used. All you have left is an instance of it contained in the variable MyClass.
Moreover, if you try to call this instance as you would a class, you will get an error:
>>> class MyClass:
...     pass
...
>>> MyClass=MyClass()
>>> new_class=MyClass()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'MyClass' object is not callable
The line:
new_class=MyClass()
will raise an error saying that the object is not callable (unless the class defines __call__).
MyClass now refers to an instance of the class that the name MyClass previously held.
You could make a new instance of former MyClass by:
new_class = MyClass.__class__()
MyClass is just a variable that points/refers to a particular object. First it referred to a class; then it was changed to hold an instance of that class.
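The rebinding can be sketched as follows. The names original and another are illustrative; keeping a second reference to the class is just one way to make the before and after visible:

```python
class MyClass:
    pass


original = MyClass    # keep a second reference to the class object
MyClass = MyClass()   # right-hand side runs first, then the name is rebound

print(isinstance(MyClass, original))  # True: the name now holds an instance
print(callable(MyClass))              # False: the instance cannot be called

# The class is still reachable through the instance.
another = type(MyClass)()             # same effect as MyClass.__class__()
print(type(another) is original)      # True
```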
Variables in Python are names that refer to objects. When you write MyClass = MyClass(), Python evaluates the right-hand side first, which creates an instance of the class, and then rebinds the name MyClass to that instance. Nothing is duplicated and no error is raised, so the first piece of code runs fine.
For the second piece of code, the final line fails because the name MyClass no longer refers to the class: it refers to an instance, and that instance is not callable.

How does attribute resolution work in Python?

Consider the following code:
class A(object):
    def do(self):
        print(self.z)
class B(A):
    def __init__(self, y):
        self.z = y
b = B(3)
b.do()
Why does this work? When executing b = B(3), attribute z is set. When b.do() is called, Python's MRO finds the do function in class A. But why is it able to access an attribute defined in a subclass?
Is there a use case for this functionality? I would love an example.
It works in a pretty simple way: when a statement is executed that sets an attribute, it is set. When a statement is executed that reads an attribute, it is read. When you write code that reads an attribute, Python does not try to guess whether the attribute will exist when that code is executed; it just waits until the code actually is executed, and if at that time the attribute doesn't exist, then you'll get an exception.
By default, you can always set any attribute on an instance of a user-defined class; classes don't normally define lists of "allowed" attributes that could be set (although you can make that happen too), they just actually set attributes. Of course, you can only read attributes that exist, but again, what matters is whether they exist when you actually try to read them. So it doesn't matter if an attribute exists when you define a function that tries to read it; it only matters when (or if) you actually call that function.
In your example, it doesn't matter that there are two classes, because there is only one instance. Since you only create one instance and call methods on one instance, the self in both methods is the same object. First __init__ is run and it sets the attribute on self. Then do is run and it reads the attribute from the same self. That's all there is to it. It doesn't matter where the attribute is set; once it is set on the instance, it can be accessed from anywhere: code in a superclass, subclass, other class, or not in any class.
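As for a use case, one common pattern (sketched here with hypothetical Shape and Square classes, not taken from the question) is a base class whose method relies on attributes and methods that each subclass supplies:

```python
class Shape:
    def describe(self):
        # Reads self.name and calls self.area(); neither is defined on Shape.
        # They are resolved on the actual instance when describe() runs.
        return "{} with area {}".format(self.name, self.area())


class Square(Shape):
    def __init__(self, side):
        self.name = "square"
        self.side = side

    def area(self):
        return self.side ** 2


print(Square(3).describe())  # square with area 9
```

Calling describe() on a bare Shape() would raise AttributeError, for the same reason as in the answers: the lookup happens when the code runs, on whatever object self is at that moment.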
Since new attributes can be added to any object at any time, attribute resolution happens at execution time, not compile time. Consider this example which may be a bit more instructive, derived from yours:
class A(object):
    def do(self):
        print(self.z) # references an attribute which we haven't "declared" in an __init__()
# make a new A
aa = A()
# this next line will error, as you would expect, because aa doesn't have a self.z
aa.do()
# but we can make it work now by simply doing
aa.z = -42
aa.do()
The first one will squawk at you, but the second will print -42 as expected.
Python objects are just dictionaries. :)
When retrieving an attribute from an object (print self.attrname) Python follows these steps:
1. If attrname is a special (i.e. Python-provided) attribute for objectname, return it.
2. Check objectname.__class__.__dict__ for attrname. If it exists and is a data-descriptor, return the descriptor result. Search all bases of objectname.__class__ for the same case.
3. Check objectname.__dict__ for attrname, and return if found. If objectname is a class, search its bases too. If it is a class and a descriptor exists in it or its bases, return the descriptor result.
4. Check objectname.__class__.__dict__ for attrname. If it exists and is a non-data descriptor, return the descriptor result. If it exists, and is not a descriptor, just return it. If it exists and is a data descriptor, we shouldn't be here because we would have returned at point 2. Search all bases of objectname.__class__ for same case.
5. Raise AttributeError
Source
Understanding get and set and Python descriptors
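The ordering in the steps above can be sketched with two hypothetical descriptor classes (DataDesc and NonDataDesc are made-up names). Both attributes are also planted directly in the instance dictionary, so the output shows which source wins:

```python
class DataDesc:
    """Defines __get__ and __set__, so it is a data descriptor."""

    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDesc:
    """Defines only __get__, so it is a non-data descriptor."""

    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class C:
    d = DataDesc()
    n = NonDataDesc()


c = C()
# Write to the instance dictionary directly, bypassing DataDesc.__set__.
c.__dict__["d"] = "instance value"
c.__dict__["n"] = "instance value"

print(c.d)  # data descriptor: the class-level data descriptor beats the instance dict
print(c.n)  # instance value: the instance dict beats a non-data descriptor

try:
    c.missing
except AttributeError:
    print("AttributeError")  # nothing found anywhere
```

Plain functions are non-data descriptors, which is why an instance attribute can hide a method of the same name.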
Since you instantiated a B object, B.__init__ was invoked and added an attribute z. This attribute is now present in the object. It's not some weird overloaded magical shared local variable of B methods that somehow becomes inaccessible to code written elsewhere. There's no such thing. Neither does self become a different object when it's passed to a superclass' method (how's polymorphism supposed to work if that happens?).
There's also no such thing as a declaration that A objects have no such attribute (try o = A(); o.z = whatever), and neither is self in do required to be an instance of A [1]. In fact, there are no declarations at all. It's all "go ahead and try it"; that's kind of the definition of a dynamic language (not just dynamic typing).
That object's z attribute is present "everywhere", all the time [2], regardless of the "context" from which it is accessed. It never matters where code is defined for the resolution process, or for several other behaviors [3]. For the same reason, you can access a list's methods despite not writing C code in listobject.c ;-) And no, methods aren't special. They are just objects too (instances of the type function, as it happens) and are involved in exactly the same lookup sequence.
[1] This is a slight lie; in Python 2, A.do would be an "unbound method" object, which in fact raises an error if the first argument doesn't satisfy isinstance(<first arg>, A).
[2] Until it's removed with del or one of its function equivalents (delattr and friends).
[3] Well, there's name mangling, and in theory, code could inspect the stack, and thereby the caller code object, and thereby the location of its source code.

Python Suite, Package, Module, TestCase and TestSuite differences

Best Guess:
method - def(self, maybeSomeVariables); lines of code which achieve some purpose
Function - same as method but returns something
Class - group of methods/functions
Module - a script, OR one or more classes. Basically a .py file.
Package - a folder which has modules in, and also a __init__.py file in there.
Suite - Just a word that gets thrown around a lot, by convention
TestCase - unittest's equivalent of a function
TestSuite - unittest's equivalent of a Class (or Module?)
My question is: Is this completely correct, and did I miss any hierarchical building blocks from that list?
I feel that you're putting in differences that don't actually exist. There isn't really a hierarchy as such. In python everything is an object. This isn't some abstract notion, but quite fundamental to how you should think about constructs you create when using python. An object is just a bunch of other objects. There is a slight subtlety in whether you're using new-style classes or not, but in the absence of a good reason otherwise, just use and assume new-style classes. Everything below is assuming new-style classes.
If an object is callable, you can call it using the calling syntax of a pair of parentheses, with the arguments inside them: my_callable(arg1, arg2). To be callable, an object's type needs to implement the __call__ method (or else have the correct field set in its C level type definition).
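A small sketch of that idea, using two made-up classes (Greeter defines __call__, Plain does not):

```python
class Greeter:
    def __call__(self, name):
        return "hello, " + name


class Plain:
    pass


greet = Greeter()
print(callable(greet))    # True
print(greet("world"))     # hello, world
print(callable(Plain()))  # False: no __call__ on its type
print(callable(Plain))    # True: classes are callable; calling one creates an instance
```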
In python an object has a type associated with it. The type describes how the object was constructed. So, for example, a list object is of type list and a function object is of type function. The types themselves are of type type. You can find the type by using the built-in function type(). A list of all the built-in types can be found in the python documentation. Types are actually callable objects, and are used to create instances of a given type.
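These relationships can be checked directly in a few lines (a sketch; the function f exists only for the demonstration):

```python
def f():
    pass


print(type([1, 2]) is list)  # True
print(type(f).__name__)      # function
print(type(list) is type)    # True: the type of a type is type
print(type(type) is type)    # True
print(list("ab"))            # ['a', 'b']: calling a type creates an instance
```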
Right, now that's established, the nature of a given object is defined by its type. This describes the objects of which it is composed. Coming back to your questions then:
Firstly, the bunch of objects that make up some object are called the attributes of that object. These attributes can be anything, but they typically consist of methods and some way of storing state (which might be types such as int or list).
A function is an object of type function. Crucially, that means it has the __call__ method as an attribute which makes it a callable (the __call__ method is also an object that itself has the __call__ method. It's __call__ all the way down ;)
A class, in the python world, can be considered as a type, but typically is used to refer to types that are not built-in. These objects are used to create other objects. You can define your own classes with the class keyword, and to create a class which is new-style you must inherit from object (or some other new-style class). When you inherit, you create a type that acquires all the characteristics of the parent type, and then you can overwrite the bits you want to (and you can overwrite any bits you want!). When you instantiate a class (or more generally, a type) by calling it, another object is returned which is created by that class (how the returned object is created can be changed in weird and crazy ways by modifying the class object).
A method is a special type of function that is called using the attribute notation. That is, when it is created, 2 extra attributes are added to the method (remember it's an object!) called im_self and im_func (these are the Python 2 names; Python 3 spells them __self__ and __func__). im_self I will describe in a few sentences. im_func is the function that implements the method. When the method is called, like, for example, foo.my_method(10), this is equivalent to calling foo.my_method.im_func(foo.my_method.im_self, 10). This is why, when you define a method, you define it with an extra first argument (conventionally named self) that you never seem to pass yourself.
When you write a bunch of methods when defining a class, these become unbound methods (in Python 2; in Python 3 they are simply plain functions stored on the class). When you create an instance of that class, those methods become bound. When you call a bound method, the im_self argument is added for you as the object in which the bound method resides. You can still call the unbound method of the class, but you need to explicitly add the class instance as the first argument:
class Foo(object):
    def bar(self):
        print self
        print self.bar
        print self.bar.im_self # prints the same as self
We can show what happens when we call the various manifestations of the bar method:
>>> a = Foo()
>>> a.bar()
<__main__.Foo object at 0x179b610>
<bound method Foo.bar of <__main__.Foo object at 0x179b610>>
<__main__.Foo object at 0x179b610>
>>> Foo.bar()
TypeError: unbound method bar() must be called with Foo instance as first argument (got nothing instead)
>>> Foo.bar(a)
<__main__.Foo object at 0x179b610>
<bound method Foo.bar of <__main__.Foo object at 0x179b610>>
<__main__.Foo object at 0x179b610>
Bringing all the above together, we can define a class as follows:
class MyFoo(object):
    a = 10
    def bar(self):
        print self.a
This generates a class with 2 attributes: a (which is an integer of value 10) and bar, which is an unbound method. We can see that MyFoo.a is just 10.
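The bound-method description above uses Python 2 names. As a rough Python 3 counterpart (a sketch with an illustrative Foo class whose bar takes one extra argument), the same relationship can be checked through __self__ and __func__:

```python
class Foo:
    def bar(self, n):
        return (self, n)


a = Foo()
method = a.bar

print(method.__self__ is a)        # True: the instance the method is bound to
print(method.__func__ is Foo.bar)  # True in Python 3: Foo.bar is a plain function

# Calling the bound method prepends the instance to the argument list.
print(method(10) == method.__func__(method.__self__, 10))  # True
print(a.bar(10) == Foo.bar(a, 10))                         # True
```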
We can create extra attributes at run time, both within the class methods, and outside. Consider the following:
class MyFoo(object):
    a = 10
    def __init__(self):
        self.b = 20
    def bar(self):
        print self.a
        print self.b
    def eep(self):
        print self.c
__init__ is just the method that is called immediately after an object has been created from a class.
>>> foo = MyFoo()
>>> foo.bar()
10
20
>>> foo.eep()
AttributeError: 'MyFoo' object has no attribute 'c'
>>> foo.c = 30
>>> foo.eep()
30
This example shows 2 ways of adding an attribute to a class instance at run time (that is, after the object has been created from its class).
I hope you can see then, that TestCase and TestSuite are just classes that are used to create test objects. There's nothing special about them except that they happen to have some useful features for writing tests. You can subclass and overwrite them to your heart's content!
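To make that concrete, here is a minimal sketch using the standard unittest module. ArithmeticTests and its test methods are made up for illustration; the point is only that a TestCase subclass is an ordinary class and a TestSuite is an ordinary object that collects test instances:

```python
import unittest


class ArithmeticTests(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 1, 2)

    def test_mul(self):
        self.assertEqual(2 * 3, 6)


# Each test instance wraps one test method, selected by name.
suite = unittest.TestSuite()
suite.addTest(ArithmeticTests("test_add"))
suite.addTest(ArithmeticTests("test_mul"))

# Run the suite and collect the outcome in a plain TestResult object.
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())  # 2 True
```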
Regarding your specific point, both methods and functions can return anything they want.
Your description of module, package and suite seems pretty sound. Note that modules are also objects!

Difference between accessing an instance attribute and a class attribute

I have a Python class
class pytest:
    i = 34
    def func(self):
        return "hello world"
When I access pytest.i, I get 34. I can also do this another way:
a = pytest()
a.i
This gives 34 as well.
If I try to access the (non-existing) pytest.j, I get
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
pytest.j
AttributeError: class pytest has no attribute 'j'
while when I try a.j, the error is
Traceback (most recent call last):
File "<pyshell#8>", line 1, in <module>
a.j
AttributeError: pytest instance has no attribute 'j'
So my question is: What exactly happens in the two cases and what is the difference?
No, these are two different things.
In Python, everything is an object. Classes are objects, functions are objects and instances are objects. Since everything is an object, everything behaves in a similar way. In your case, you create a class instance (== an object with the type "Class") with the name "pytest". That object has two attributes: i and func. i is an instance of "Integer" or "Number", func is an instance of "Function".
When you use "pytest.j", you tell Python "look up the object pytest and when you have it, look up j". "pytest" is a class instance but that doesn't matter.
When you create an instance of "pytest" (== an object with the type "pytest"), then you have an object which has "defaults". In your case, a is an instance of pytest which means that anything that can't be found in a will be searched in pytest, next.
So a.j means: "Look in a. When it's not there, also look in pytest". But j doesn't exist and Python now has to give you a meaningful error message. It could say "class pytest has no attribute 'j'". This would be correct but meaningless: You would have to figure out yourself that you tried to access j via a. It would be confusing. Guido won't have that.
Therefore, python uses a different error message. Since it does not always have the name of the instance (a), the designers decided to use the type instead, so you get "pytest instance...".
To summarize, there are two types of variables associated with classes and objects: class variables and instance variables. Class variables are associated with classes, but instance variables are associated with objects. Here's an example:
class TestClass:
    classVar = 0
    def __init__(self):
        self.instanceVar = 0
classVar is a class variable associated with the class TestClass. instanceVar is an instance variable associated with objects of the type TestClass.
print(TestClass.classVar) # prints 0
instance1 = TestClass() # creates new instance of TestClass
instance2 = TestClass() # creates another new instance of TestClass
instance1 and instance2 share classVar because they're both objects of the type TestClass.
print(instance1.classVar) # prints 0
TestClass.classVar = 1
print(instance1.classVar) # prints 1
print(instance2.classVar) # prints 1
However, they both have copies of instanceVar because it is an instance variable associated with individual instances, not the class.
print(instance1.instanceVar) # prints 0
print(TestClass.instanceVar) # error! instanceVar is not a class variable
instance1.instanceVar = 1
print(instance1.instanceVar) # prints 1
print(instance2.instanceVar) # prints 0
As Aaron said, if you try to access an instance variable, Python first checks the instance variables of that object, then the class variables of the object's type. Class variables function as default values for instance variables.
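One consequence worth sketching (reusing the TestClass idea with illustrative instance names a and b): assigning through an instance does not change the class variable. It creates an instance attribute that hides it, and deleting that attribute makes the class value visible again.

```python
class TestClass:
    classVar = 0


a = TestClass()
b = TestClass()

a.classVar = 5  # creates an instance attribute on a only
print(a.classVar, b.classVar, TestClass.classVar)  # 5 0 0
print(a.__dict__)  # {'classVar': 5}
print(b.__dict__)  # {}

del a.classVar     # removes the instance attribute
print(a.classVar)  # 0: lookup falls back to the class again
```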
