Difference between accessing an instance attribute and a class attribute - python

I have a Python class:
class pytest:
    i = 34
    def func(self):
        return "hello world"
When I access pytest.i, I get 34. I can also do this another way:
a = pytest()
a.i
This gives 34 as well.
If I try to access the (non-existing) pytest.j, I get
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    pytest.j
AttributeError: class pytest has no attribute 'j'
while when I try a.j, the error is
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    a.j
AttributeError: pytest instance has no attribute 'j'
So my question is: What exactly happens in the two cases and what is the difference?

No, these are two different things.
In Python, everything is an object. Classes are objects, functions are objects and instances are objects. Since everything is an object, everything behaves in a similar way. In your case, the class statement creates a class object (an object whose type is the class type) and binds it to the name "pytest". That object has two attributes: i and func. i is an instance of int, func is a function object.
When you use "pytest.j", you tell Python "look up the object pytest and, when you have it, look up j on it". "pytest" happens to be a class object, but that doesn't matter.
When you create an instance of "pytest" (== an object with the type "pytest"), then you have an object which has "defaults". In your case, a is an instance of pytest which means that anything that can't be found in a will be searched in pytest, next.
So a.j means: "Look in a. When it's not there, also look in pytest". But j doesn't exist and Python now has to give you a meaningful error message. It could say "class pytest has no attribute 'j'". This would be correct but meaningless: You would have to figure out yourself that you tried to access j via a. It would be confusing. Guido won't have that.
Therefore, Python uses a different error message. Since it does not always have the name of the instance (a), the designers decided to use the type instead, so you get "pytest instance...".
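The tracebacks in the question come from an older interpreter. As a minimal sketch, assuming Python 3, the two lookups can be compared side by side. The helper attribute_error_message is a hypothetical name introduced only for this illustration, and the exact wording of the messages may vary between versions:

```python
class pytest:
    i = 34

    def func(self):
        return "hello world"


def attribute_error_message(obj, name):
    """Return the AttributeError text raised by getattr(obj, name), or None."""
    try:
        getattr(obj, name)
    except AttributeError as exc:
        return str(exc)
    return None


a = pytest()
class_message = attribute_error_message(pytest, "j")  # lookup starts on the class
instance_message = attribute_error_message(a, "j")    # lookup starts on the instance
print(class_message)
print(instance_message)
```

Both lookups fail for the same underlying reason (no j anywhere), but the message names the object the lookup started from.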

To summarize, there are two types of variables associated with classes and objects: class variables and instance variables. Class variables are associated with classes, but instance variables are associated with objects. Here's an example:
class TestClass:
    classVar = 0
    def __init__(self):
        self.instanceVar = 0
classVar is a class variable associated with the class TestClass. instanceVar is an instance variable associated with objects of the type TestClass.
print(TestClass.classVar) # prints 0
instance1 = TestClass() # creates new instance of TestClass
instance2 = TestClass() # creates another new instance of TestClass
instance1 and instance2 share classVar because they're both objects of the type TestClass.
print(instance1.classVar) # prints 0
TestClass.classVar = 1
print(instance1.classVar) # prints 1
print(instance2.classVar) # prints 1
However, they both have copies of instanceVar because it is an instance variable associated with individual instances, not the class.
print(instance1.instanceVar) # prints 0
print(TestClass.instanceVar) # error! instanceVar is not a class variable
instance1.instanceVar = 1
print(instance1.instanceVar) # prints 1
print(instance2.instanceVar) # prints 0
As Aaron said, if you try to access an instance variable, Python first checks the instance variables of that object, then the class variables of the object's type. Class variables function as default values for instance variables.
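The "default value" behaviour can be made visible by inspecting the two namespaces directly. This is a small sketch reusing TestClass from above; vars() is the built-in that returns an object's attribute dictionary:

```python
class TestClass:
    classVar = 0  # stored once, in the class's namespace

    def __init__(self):
        self.instanceVar = 0  # stored separately in each instance's namespace


instance1 = TestClass()
instance2 = TestClass()

# Reading falls back to the class when the instance has no entry of its own.
print("classVar" in vars(instance1))  # False
print("classVar" in vars(TestClass))  # True

# Assigning through an instance creates an instance attribute that shadows
# the class attribute for that one instance only.
instance1.classVar = 99
print(instance1.classVar)  # 99
print(instance2.classVar)  # 0
print(TestClass.classVar)  # 0

# Deleting the instance attribute uncovers the class attribute again.
del instance1.classVar
print(instance1.classVar)  # 0
```

Note that assignment never falls back to the class: instance1.classVar = 99 always writes to the instance.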


Method object implementation in Python | Ambiguous statements from the documentation

I have some confusion about the method implementation in Python. Any help in that regard would be appreciated.
I found this closely-related post, which is quite useful, but not answering my questions.
My problem:
The following statements from the last paragraph of Section 9.3.4 from Python documentation are not quite clear to me.
When a non-data attribute of an instance is referenced, the instance’s
class is searched. If the name denotes a valid class attribute that is
a function object, a method object is created by packing (pointers to)
the instance object and the function object just found together in an
abstract object: this is the method object. When the method object is
called with an argument list, a new argument list is constructed from
the instance object and the argument list, and the function object is
called with this new argument list.
My initial understanding from the documentation was very similar to that of #vaughn-mcgill-adami as described in their post:
when a class instance object calls its method object,
a search would be done on the class object for the corresponding function object. If found, then calling that function owned by the class object with the class instance object as the first parameter.
However, after some coding, it seems to me that my understanding may not be correct. The reason is that one could delete the class object. Yet, calling the method object can still be done with no problem, whereas passing the class instance object to the function object from the class object would result in an error, since there would be no class object anymore after its deletion.
The following code snippet aims at better explaining my questions:
class myclass:
    x = 0
    def inc(self):
        self.x += 1
        print(self.x)
myobj = myclass()
type(myobj.inc)  # output: method
type(myclass.inc)  # output: function
# 1. calling method object multiple times
# 2. calling function object and passing the instance object multiple times
print(myobj.x)
for i in range(3):
    myobj.inc()
for i in range(3):
    myclass.inc(myobj)
#-------------
# output:
#-------------
# 0
# 1
# 2
# 3
# 4
# 5
# 6
#-------------
# deleting the class object
del myclass
# calling method object multiple times
for i in range(3):
    myobj.inc()
#-------------
# output:
#-------------
# 7
# 8
# 9
#-------------
# calling function object and passing the instance object
for i in range(3):
    myclass.inc(myobj)
#-------------
# output:
#-------------
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-177-86f094e96103> in <module>
1 for i in range(3):
----> 2 myclass.inc(myobj)
NameError: name 'myclass' is not defined
My understanding from the attached code snippet is that after instantiating a class object,
the method objects are attached to the class instance object. The method objects and the class instance object will then be independent of the class object.
Here are some ambiguities I have from the documentation:
What does it mean "...the instance’s class is searched"? I believe it should be "instance class object", but it does not seem to be a typo, since in the following sentence it says "If the name denotes a valid class attribute that is a function object...". Again, here I think the "method object" should replace the "function object".
"...a method object is created...". Is the method object created once it is called, or at the same time that the class instance object is created?
"by packing (pointers to) the instance object and the function object just found". Again, "function object" does not seem correct to me.
Basically, if the class object is deleted, as in my code snippet, then there is no class object to be searched on in the first place. There will be no function object to be pointed to either.
There is a good chance that I am missing something here. I would be grateful if anyone could please advise. Thank you
Basically, if the class object is deleted, as in my code snippet, then
there is no class object to be searched on in the first place. There
will be no function object to be pointed to either.
This is your fundamental misunderstanding.
del does not delete objects.
Indeed, Python the language provides no way to manually delete objects (or manually manage memory, i.e. it is a memory-managed language).
del removes names from namespaces.
When you del myclass, you simply remove that name from the module's namespace. The class object still exists: for one, it is referenced by every instance of that class (through myobj.__class__), and like any other object it continues to exist as long as references to it exist.
So, consider the following example with del:
>>> x = [1,2,3]
>>> y = x
>>> print(x is y)
True
>>> print(id(x), id(y))
140664200270464 140664200270464
>>> x.append(1337)
>>> x
[1, 2, 3, 1337]
>>> y
[1, 2, 3, 1337]
>>> del x
>>> print(y)
[1, 2, 3, 1337]
So del x did not delete the object, it simply deleted the name x from that namespace. So if I do:
>>> print(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
del can only indirectly lead to the deletion of an object: if the name you remove from the namespace is the last reference to that object, then the object becomes eligible for garbage collection. In CPython, which uses reference counting as its main garbage collection strategy, it is reclaimed immediately, but this may not be true in other implementations, e.g. Jython, which uses the Java runtime's garbage collector.
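Applied to the class from the question, the same reasoning can be sketched as follows (inc is changed to return the value instead of printing it, purely to keep the illustration short). It also hints at the other part of the question: the method object is built when the attribute is accessed, by wrapping the plain function that lives on the class.

```python
class myclass:
    x = 0

    def inc(self):
        self.x += 1
        return self.x


myobj = myclass()
del myclass  # removes only the module-level name

# The instance still references its class, so lookup keeps working.
cls = type(myobj)
print(cls.__name__)    # myclass
print(myobj.inc())     # 1
print(cls.inc(myobj))  # 2

# A bound method is created on each attribute access; it pairs the function
# stored on the class with the instance it was looked up on.
m1 = myobj.inc
m2 = myobj.inc
print(m1 is m2)                            # False
print(m1.__func__ is cls.__dict__["inc"])  # True
print(m1.__self__ is myobj)                # True
print("inc" in vars(myobj))                # False
```

Nothing called inc is ever stored on the instance, so the class is searched on every call, exactly as the documentation describes.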
Some relevant documentation for the del statement is here:
Deletion of a name removes the binding of that name from the local or
global namespace, depending on whether the name occurs in a global
statement in the same code block. If the name is unbound, a NameError
exception will be raised.
Note, the other part,
Deletion of attribute references, subscriptions and slicings is passed
to the primary object involved; deletion of a slicing is in general
equivalent to assignment of an empty slice of the right type (but even
this is determined by the sliced object).
Is basically saying that a statement like
del obj[item]
is delegated to
type(obj).__delitem__
i.e., it is just sugar for a call to a method.
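A minimal sketch of that delegation, using a hypothetical Shelf class that records which keys were deleted:

```python
class Shelf:
    """Hypothetical container that records which keys were deleted."""

    def __init__(self, items):
        self._items = dict(items)
        self.deleted = []

    def __delitem__(self, key):
        self.deleted.append(key)
        del self._items[key]

    def __len__(self):
        return len(self._items)


shelf = Shelf({"a": 1, "b": 2})
del shelf["a"]                       # calls type(shelf).__delitem__(shelf, "a")
type(shelf).__delitem__(shelf, "b")  # the explicit spelling of the same call
print(shelf.deleted)  # ['a', 'b']
print(len(shelf))     # 0
```

In both cases the name shelf itself stays bound; only the container's contents change.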

Why do we need to specify self in the __init__ constructor when the instance has not been created yet?

While executing the following code:
class Test():
    def __init__(self):
        self.hi_there()
        self.a = 5
    def hi_there(self):
        print(self.a)
new_object = Test()
new_object.hi_there()
I have received an error:
Traceback (most recent call last):
  File "/root/a.py", line 241, in <module>
    new_object = Test()
  File "/root/a.py", line 233, in __init__
    self.hello()
  File "/root/a.py", line 238, in hello
    print(self.a)
AttributeError: 'Test' object has no attribute 'a'
Why do we need to specify self inside the function while the object is not initialized yet? Being able to call hi_there() means that the object already exists, but how can that be if the other variables attached to this instance haven't been initialized yet?
What is the self inside the __init__ function if it's not a "full" object yet?
Clearly this part of code works:
class Test():
    def __init__(self):
        #self.hi_there()
        self.a = 5
        self.hi_there()
    def hi_there(self):
        print(self.a)
new_object = Test()
new_object.hi_there()
I come from the C++ world, where you have to declare variables before you assign to them.
I fully understand the use of self. What I don't understand is what self refers to inside __init__() if the object is not fully initialized.
There is no magic. By the time __init__ is called, the object is created and its methods defined, but you have the chance to set all the instance attributes and do all other initialization. If you look at execution in __init__:
def __init__(self):
    self.hi_there()
    self.a = 5
def hi_there(self):
    print(self.a)
the first thing that happens in __init__ is that hi_there is called. The method already exists, so the function call works, and we drop into hi_there(), which does print(self.a). But this is the problem: self.a isn't set yet, since this only happens in the second line of __init__, but we called hi_there from the first line of __init__. Execution hasn't reached the line where you set self.a = 5, so there's no way that the method call self.hi_there() issued before this assignment can use self.a. This is why you get the AttributeError.
Actually, the object has already been created when __init__ is called. That's why you need self as a parameter. And because of the way Python works internally, you don't have access to the object without self. (Bear in mind that it doesn't need to be called self; you can call it anything you want as long as it is a valid name. The instance is always the first parameter of a method, whatever its name is.)
The truth is that __init__ doesn't create the object, it just initializes it. There is a static method called __new__ (special-cased, so you don't have to declare it as such), which is in charge of creating the instance and returning it. That's where the object is created.
Now, when does the object get its a attribute? That happens in __init__, but you do have access to its methods inside __init__. Methods are stored on the class, not on the instance, so they can already be found through the instance by the time you get to that point. That doesn't happen with instance attributes, which are not available until you set them yourself in __init__.
Basically, Python creates the object, whose methods are found through its class, and then gives you the instance so you can initialize its attributes.
EDIT
Another thing I forgot to mention. Just like you define __init__, you can define __new__ yourself. It's not very common, but you do it when you need to modify the actual object's creation. I've only seen it when defining metaclasses (What are metaclasses in Python?). Another method you can define in that case is __call__, giving you even more control.
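The order of events can be observed with a small sketch. The events list is a hypothetical class-level log added only for this illustration; it records the instance's own namespace (vars(self)) at each step:

```python
class Test:
    events = []  # hypothetical log, shared at class level for this demo

    def __new__(cls):
        instance = super().__new__(cls)  # the object is created here
        cls.events.append(("__new__", dict(vars(instance))))
        return instance

    def __init__(self):
        # The object exists and its methods resolve through the class,
        # but its own namespace is still empty.
        type(self).events.append(("__init__ start", dict(vars(self))))
        self.a = 5
        type(self).events.append(("__init__ end", dict(vars(self))))
        self.hi_there()  # safe now: self.a has been set

    def hi_there(self):
        return self.a


new_object = Test()
for step, namespace in Test.events:
    print(step, namespace)
```

The log shows an empty namespace after __new__ and at the start of __init__, and {'a': 5} only after the assignment, which is why calling hi_there() one line earlier fails.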
Not sure what you meant here, but I guess the first code sample should call a hello() function instead of the hi_there() function.
Someone correct me if I'm wrong, but in Python, defining a class or a function is dynamic. By this I mean that defining a class or a function happens at runtime: these are regular statements that are executed just like any others.
This language feature allows powerful things such as decorating a function to enrich it with extra functionality (see decorators).
Therefore, when you create an instance of the Test class, you call the hello() function before you have explicitly set the value of a, so the instance does not yet have its a attribute. The code has to be read sequentially.
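To illustrate the point about definitions being ordinary runtime statements, here is a small sketch in which a class statement is executed twice. make_class and Greeter are hypothetical names chosen for the example:

```python
def make_class(greeting):
    # The class statement below runs every time make_class is called,
    # producing a brand-new class object each time.
    class Greeter:
        def greet(self):
            return greeting

    return Greeter


Hello = make_class("hello")
Howdy = make_class("howdy")
print(Hello is Howdy)                    # False
print(Hello().greet(), Howdy().greet())  # hello howdy
```

The same sequential rule applies inside __init__: an attribute exists only once the statement that assigns it has run.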

"implicit uses of special methods always rely on the class-level binding of the special method"

I have difficulty understanding the last part (in bold) from Python in a Nutshell
Per-Instance Methods
An instance can have instance-specific bindings for all attributes,
including callable attributes (methods). For a method, just like for
any other attribute (except those bound to overriding descriptors),
an instance-specific binding hides a class-level binding:
attribute lookup does not consider the class when it finds a
binding directly in the instance. An instance-specific binding for a
callable attribute does not perform any of the transformations
detailed in “Bound and Unbound Methods” on page 110: the attribute
reference returns exactly the same callable object that was earlier
bound directly to the instance attribute.
However, this does not work as you might expect
for per-instance bindings of the special methods that Python calls
implicitly as a result of various operations, as covered in “Special
Methods” on page 123. Such implicit uses of special methods always
rely on the class-level binding of the special method, if any. For
example:
def fake_get_item(idx): return idx
class MyClass(object): pass
n = MyClass()
n.__getitem__ = fake_get_item
print(n[23]) # results in:
# Traceback (most recent call last):
# File "<stdin>", line 1, in ?
# TypeError: unindexable object
What does it mean specifically?
Why is the error of the example?
Thanks.
Neglecting all the fine details, it basically says that special methods (as defined in Python's data model; generally these are the methods whose names start and end with two underscores, and which are rarely, if ever, called directly) will never be used implicitly from the instance, even if defined there:
n[whatever] # will always call type(n).__getitem__(n, whatever)
This differs from attribute look-up which checks the instance first:
def fake_get_item(idx):
    return idx
class MyClass(object):
    pass
n = MyClass()
n.__getitem__ = fake_get_item
print(n.__getitem__(23))  # works because attribute lookup checks the instance first
There is a whole section in the documentation about this (including rationale): "Special method lookup":
3.3.9. Special method lookup
For custom classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary. That behaviour is the reason why the following code raises an exception:
>>> class C:
...     pass
...
>>> c = C()
>>> c.__len__ = lambda: 5
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'C' has no len()
The rationale behind this behaviour lies with a number of special methods such as __hash__() and __repr__() that are implemented by all objects, including type objects. If the implicit lookup of these methods used the conventional lookup process, they would fail when invoked on the type object itself:
>>> 1 .__hash__() == hash(1)
True
>>> int.__hash__() == hash(int)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor '__hash__' of 'int' object needs an argument
[...]
Bypassing the __getattribute__() machinery in this fashion provides significant scope for speed optimisations within the interpreter, at the cost of some flexibility in the handling of special methods (the special method must be set on the class object itself in order to be consistently invoked by the interpreter).
To put it even more plainly, it means that you can't redefine the dunder methods on the fly for an individual instance. As a consequence, ==, +, and the rest of the operators always mean the same thing for all objects of type T.
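The flip side is that a binding made on the class is picked up, even for instances that already exist. A minimal sketch contrasting the two bindings:

```python
class MyClass:
    pass


n = MyClass()

# Per-instance binding: ignored by the n[...] syntax.
n.__getitem__ = lambda idx: idx
try:
    n[23]
    instance_binding_used = True
except TypeError:
    instance_binding_used = False
print(instance_binding_used)  # False

# Class-level binding: found by implicit lookup, even for existing instances.
MyClass.__getitem__ = lambda self, idx: idx
print(n[23])  # 23
```

Note the different signatures: the function stored on the class receives the instance as its first argument, while the one stored on the instance does not.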
I'll try to summarize what the extract says and in particular the part in bold.
Generally speaking, when Python tries to find the value of an attribute (including a method), it first checks the instance (i.e. the actual object you created), then the class.
The code below illustrates the generic behavior.
class MyClass(object):
    def a(self):
        print("howdy from the class")
n = MyClass()
# here the method defined on the class is called
n.a()
# howdy from the class
def new_a():
    print("hello from new a")
n.a = new_a
# the new instance binding hides the class binding
n.a()
# hello from new a
What the part in bold states is that this behavior does not apply to "Special Methods" such as __getitem__. In other words, overriding __getitem__ at the instance level (n.__getitem__ = fake_get_item in your example) has no effect on the n[] syntax: when the method is invoked that way, an error is raised because the class does not implement the method.
(If the generic behavior also held in this case, print(n[23]) would have printed 23, i.e. it would have executed the fake_get_item function.)
Another example of the same behavior:
class MyClass(object):
    def __getitem__(self, idx):
        return idx
n = MyClass()
fake_get_item = lambda x: "fake"
print(fake_get_item(23))
# fake
n.__getitem__ = fake_get_item
print(n[23])
# 23
In this example, the __getitem__ defined on the class (which returns the index) is called instead of the instance binding (which returns 'fake').
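If per-instance behaviour is genuinely needed, one possible workaround (a sketch, not something the book extract prescribes) is to keep the special method on the class and let it forward to an ordinary instance attribute. Indexable and getter are hypothetical names:

```python
class Indexable:
    """Hypothetical class whose indexing can be customised per instance."""

    def __init__(self, getter=None):
        # An ordinary (non-dunder) instance attribute holding the hook.
        self.getter = getter

    def __getitem__(self, idx):
        # Found on the class by implicit lookup, then forwards to the hook.
        if self.getter is not None:
            return self.getter(idx)
        return idx


plain = Indexable()
custom = Indexable(getter=lambda idx: "fake")
print(plain[23])   # 23
print(custom[23])  # fake
```

This works because getter is a normal attribute, so the usual instance-first lookup applies to it.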

Class name as a variable in python

I was just looking at a question here where the OP was using the same name for a class, for other things and also for a variable. When I tried to answer it, I became confused myself and thus thought of asking.
For example:
class MyClass:
    pass
MyClass=MyClass()
I would never code anything like this, but I would like to understand how the Python interpreter treats it. So my question is: is the variable MyClass created first, or does it happen the other way round, that is, an instance of MyClass is created first and then assigned to the variable MyClass? I think the latter is correct, but if that is the case, how will the following be resolved?
class MyClass:
    pass
MyClass=MyClass()
new_class=MyClass()
The right-hand side of the assignment is processed first, so an instance of MyClass is created. But then you reassign the name MyClass to that instance. When you execute
new_class = MyClass()
you should get an error about MyClass not being callable, since that name now refers to an instance of the original class, not the class itself.
class MyClass:
    pass
MyClass=MyClass()
In simple terms, the above code does three things (in this order):
Defines the class MyClass.
Creates an instance of MyClass.
Assigns that instance to the variable MyClass.
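After the last step, the name MyClass no longer refers to the class, so the class can no longer be reached through that name. All you have left under that name is an instance of it (the class object itself still exists and is reachable through the instance).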
Moreover, if you try to call this instance as you would a class, you will get an error:
>>> class MyClass:
...     pass
...
>>> MyClass=MyClass()
>>> new_class=MyClass()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'MyClass' object is not callable
The line:
new_class=MyClass()
in most cases will raise an error, saying something like instance not callable.
MyClass now refers to an instance of the class that the name MyClass previously held.
You could make a new instance of the former MyClass with:
new_class = MyClass.__class__()
MyClass is just a variable that points/refers to a particular object. First it referred to the class, then it was changed to hold an instance of that class.
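The whole sequence can be checked with a short sketch. original_class is an extra reference kept only so the results can be compared, and type(obj) is used as the built-in equivalent of obj.__class__:

```python
class MyClass:
    pass


original_class = MyClass  # keep a second reference, for comparison only
MyClass = MyClass()       # right-hand side runs first, then the name is rebound

print(isinstance(MyClass, original_class))  # True
print(type(MyClass) is original_class)      # True

try:
    MyClass()  # the name now refers to an instance, which is not callable
    call_succeeded = True
except TypeError:
    call_succeeded = False
print(call_succeeded)  # False

another = type(MyClass)()  # the class is still reachable through the instance
print(type(another) is original_class)  # True
```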
Variables in Python are names that refer to objects. When you write MyClass = MyClass(), Python first evaluates the right-hand side, which creates an instance of the original class, and then rebinds the name MyClass to that instance. Nothing is duplicated and no error is raised, so the first piece of code runs fine.
For the second piece of code, the final line fails: by then the name MyClass refers to the instance rather than to the class, and the instance is not callable.

Python class variable name vs __name__

I'm trying to understand the relationship between the variable a Python class object is assigned to and the __name__ attribute for that class object. For example:
In [1]: class Foo(object):
   ...:     pass
   ...:
In [2]: Foo.__name__ = 'Bar'
In [3]: Foo.__name__
Out[3]: 'Bar'
In [4]: Foo
Out[4]: __main__.Bar
In [5]: Bar
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-5-962d3beb4fd6> in <module>()
----> 1 Bar
NameError: name 'Bar' is not defined
So it seems like I have changed the __name__ attribute of the class but I can't refer to it by that name. I know this is a bit general but could someone explain the relationship between Foo and Foo.__name__?
It's simple. There is no relationship at all.
When you create a class, a variable with the name you used is created in the current scope, pointing at the class so you can use it.
The class also gets an attribute __name__ that contains that name as a string, because that's handy in certain cases, like pickling.
You can set the variable to something else, or change the __name__ attribute, but then things like pickling may stop working, so don't do that.
__name__ is mere self-identification, in order to know what type an instance of it really is.
The other thing is the name through which the class can be accessed, which can change if you re-assign it.
They both are assigned at the time you define the class.
It works the same way with functions: if you def them, they get assigned to the given name and they get the respective __name__ attribute.
OTOH, if you have a lambda function, it gets a __name__ attribute of <lambda>, because it doesn't know the name it gets assigned to.
Short version
class Foo(object): pass creates a class and assigns it to local name Foo.
Foo.__name__ = 'Bar' assigns a new value to attribute __name__. The enclosing scope is not affected.
Long version
The class statement creates a class and assigns to the name provided in the local scope. When creating a class Python tells the class the name it was created with by assigning it to the class's __name__ attribute.
Assigning to a class's attribute does not introduce a name into the local scope. Therefore any changes to attributes (such as __name__) do not affect the enclosing scope.
You need to keep in mind that in python a class is just an object like any other. It wouldn't make sense for an object to contain an attribute that was linked to a variable that refers to the object, because there could be any number of variable names referring to it. Any time you write an assignment (Bar = Foo) or pass the object to a function, you have a new reference. Naturally all objects must be independent of how they are referenced.
__name__ is simply a piece of information attached to the class object, which happens to be the same as the variable name it's initially assigned to.
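A short sketch of that independence, using a hypothetical alias Alias plus a def and a lambda for comparison:

```python
class Foo:
    pass


Alias = Foo           # a second name for the same class object
Foo.__name__ = "Bar"  # changes the attribute, binds no new name

print(Alias is Foo)        # True
print(Alias.__name__)      # Bar
print("Bar" in globals())  # False


def named():
    pass


anonymous = lambda: None
print(named.__name__)      # named
print(anonymous.__name__)  # <lambda>
```

The object carries one __name__ no matter how many names refer to it, and renaming it creates no variable called Bar.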
