Why are the class __dict__ and __weakref__ never re-defined in Python?

Class creation seems to never re-define the __dict__ and __weakref__ class attributes (i.e. if they already exist in the dictionary of a superclass, they are not added to the dictionaries of its subclasses), but to always re-define the __doc__ and __module__ class attributes. Why?
>>> class A: pass
...
>>> class B(A): pass
...
>>> class C(B): __slots__ = ()
...
>>> vars(A)
mappingproxy({'__module__': '__main__',
'__dict__': <attribute '__dict__' of 'A' objects>,
'__weakref__': <attribute '__weakref__' of 'A' objects>,
'__doc__': None})
>>> vars(B)
mappingproxy({'__module__': '__main__', '__doc__': None})
>>> vars(C)
mappingproxy({'__module__': '__main__', '__slots__': (), '__doc__': None})
>>> class A: __slots__ = ()
...
>>> class B(A): pass
...
>>> class C(B): pass
...
>>> vars(A)
mappingproxy({'__module__': '__main__', '__slots__': (), '__doc__': None})
>>> vars(B)
mappingproxy({'__module__': '__main__',
'__dict__': <attribute '__dict__' of 'B' objects>,
'__weakref__': <attribute '__weakref__' of 'B' objects>,
'__doc__': None})
>>> vars(C)
mappingproxy({'__module__': '__main__', '__doc__': None})

The '__dict__' and '__weakref__' entries in a class's __dict__ (when present) are descriptors used for retrieving an instance's dict pointer and weakref pointer from the instance memory layout. They're not the actual class's __dict__ and __weakref__ attributes - those are managed by descriptors on the metaclass.
There's no point adding those descriptors if a class's ancestors already provide one. However, a class does need its own __module__ and __doc__, regardless of whether its parents already have one - it doesn't make sense for a class to inherit its parent's module name or docstring.
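As a rough illustration of the point above (CPython behaviour; the attribute name x is only a placeholder), the descriptor stored in A's dictionary also serves instances of B, which is why B does not need its own copy, while __doc__ is set afresh on every class:

```python
class A:
    """Docstring of A."""


class B(A):
    pass


b = B()
b.x = 1  # 'x' is an arbitrary illustrative attribute

# B defines no '__dict__' descriptor of its own; the one on A is reused.
dict_descriptor = vars(A)["__dict__"]
print("__dict__" in vars(B))          # False
print(dict_descriptor.__get__(b, B))  # {'x': 1}

# __doc__ is always re-defined, so B does not inherit A's docstring.
print(A.__doc__)                      # Docstring of A.
print(B.__doc__)                      # None
```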
You can see the implementation in type_new, the (very long) C implementation of type.__new__. Look for the add_weak and add_dict variables - those are the variables that determine whether type.__new__ should add space for __dict__ and __weakref__ in the class's instance memory layout. If type.__new__ decides it should add space for one of those attributes to the instance memory layout, it also adds getset descriptors to the class (through tp_getset) to retrieve the attributes:
if (add_dict) {
    if (base->tp_itemsize)
        type->tp_dictoffset = -(long)sizeof(PyObject *);
    else
        type->tp_dictoffset = slotoffset;
    slotoffset += sizeof(PyObject *);
}
if (add_weak) {
    assert(!base->tp_itemsize);
    type->tp_weaklistoffset = slotoffset;
    slotoffset += sizeof(PyObject *);
}
type->tp_basicsize = slotoffset;
type->tp_itemsize = base->tp_itemsize;
type->tp_members = PyHeapType_GET_MEMBERS(et);
if (type->tp_weaklistoffset && type->tp_dictoffset)
    type->tp_getset = subtype_getsets_full;
else if (type->tp_weaklistoffset && !type->tp_dictoffset)
    type->tp_getset = subtype_getsets_weakref_only;
else if (!type->tp_weaklistoffset && type->tp_dictoffset)
    type->tp_getset = subtype_getsets_dict_only;
else
    type->tp_getset = NULL;
If add_dict or add_weak are false, no space is reserved and no descriptor is added. One condition for add_dict or add_weak to be false is if one of the parents already reserved space:
add_dict = 0;
add_weak = 0;
may_add_dict = base->tp_dictoffset == 0;
may_add_weak = base->tp_weaklistoffset == 0 && base->tp_itemsize == 0;
This check doesn't actually care about any ancestor descriptors, just whether an ancestor reserved space for an instance dict pointer or weakref pointer, so if a C ancestor reserved space without providing a descriptor, the child won't reserve space or provide a descriptor. For example, set has a nonzero tp_weaklistoffset, but no __weakref__ descriptor, so descendants of set won't provide a __weakref__ descriptor either, even though instances of set (including subclass instances) support weak references.
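The set case described above can be checked from Python. This is a small sketch assuming CPython; the subclass name S is made up for the example:

```python
import weakref


class S(set):  # hypothetical subclass, used only for this check
    pass


s = S()
ref = weakref.ref(s)  # weak references work for set subclasses

# Neither set nor its subclass exposes a '__weakref__' descriptor.
print("__weakref__" in vars(set))
print("__weakref__" in vars(S))
print(ref() is s)
```

If the explanation holds, the first two lines print False and the last prints True.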
You'll also see an && base->tp_itemsize == 0 in the initialization for may_add_weak - you can't add weakref support to a subclass of a class with variable-length instances.

Related

Class variable scope for static vs class methods

I discovered a weird behaviour (at least weird for me) on python class variables.
class Base(object):
    _var = 0
    @classmethod
    def inc_class(cls):
        cls._var += 1
    @staticmethod
    def inc_static():
        Base._var += 1
class A(Base):
    pass
class B(Base):
    pass
a = A()
b = B()
a.inc_class()
b.inc_class()
a.inc_static()
b.inc_static()
print(a._var)
print(b._var)
print(Base._var)
The output is 1 1 2.
This surprises me (I was expecting 4 4 4), and I'm wondering why.

When decorated with @classmethod, the first argument cls to inc_class(cls) is, well, the class: <class '__main__.A'> and <class '__main__.B'> respectively for a and b. So cls._var refers to A's _var, and similarly for B. In inc_static, decorated with @staticmethod, there is no implicit argument; you're explicitly referring to <class '__main__.Base'>, which has a different _var.
Note the '_var': 0 entry in Base's __dict__ and its absence from A's. @classmethod is doing what you'd expect it to do, binding the method to the class it is called on, in this case A and B.
>>> Base.__dict__
mappingproxy({'__module__': '__main__', '_var': 0, 'inc_class': <classmethod
object at 0x7f23037a8b38>, 'inc_static': <staticmethod object at
0x7f23037a8c18>, '__dict__': <attribute '__dict__' of 'Base' objects>,
'__weakref__': <attribute '__weakref__' of 'Base' objects>, '__doc__': None})
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None})
After calling Base.inc_static():
>>> Base.__dict__
mappingproxy({'__module__': '__main__', '_var': 1, 'inc_class':
<classmethod object at 0x7f23037a8b38>, 'inc_static': <staticmethod
object at 0x7f23037a8c18>, '__dict__': <attribute '__dict__' of 'Base'
objects>, '__weakref__': <attribute '__weakref__' of 'Base' objects>,
'__doc__': None})
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None})
After calling A.inc_class():
>>> Base.__dict__
mappingproxy({'__module__': '__main__', '_var': 1, 'inc_class':
<classmethod object at 0x7f23037a8b38>, 'inc_static': <staticmethod
object at 0x7f23037a8c18>, '__dict__': <attribute '__dict__' of 'Base'
objects>, '__weakref__': <attribute '__weakref__' of 'Base' objects>,
'__doc__': None})
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None, '_var': 1})
What's interesting is how A's _var is initialised. Note that you do cls._var += 1 before _var has been defined on A. For an integer, cls._var += 1 behaves like cls._var = cls._var + 1. Because of the way Python does attribute lookup, the read of cls._var fails to find it in A and continues on to find it in Base. The assignment then adds _var to A's __dict__ with the value Base._var + 1, and from then on A has its own _var.
>>> class Base(object):
...     _var = 10
...     @classmethod
...     def inc_class(cls):
...         cls._var += 1
...
>>> class A(Base):
...     pass
...
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None})
>>> A.inc_class()
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None, '_var': 11})

Even though the two classes inherit from the Base class, they are completely different objects. Through the instantiation of a and b, you have two objects that belong to two separate classes. When you call
a.inc_class()
b.inc_class()
you increment the _var attribute of class A once, and then you do the same for class B. Even though they share the same name, they are different objects. If you had a second instance of class A, say a2, and you would call the function again, then both calls would manipulate the same variable. This explains how you get your first two outputs.
The third output refers to the Base class object. Again, even though it is the same name, it is a different object. You increment the 3rd object twice, therefore you get 2 as the answer.
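The explanation above can be checked with a short script. It is a sketch that mirrors the question's classes and adds a second instance, a2, to show that instances of the same class share one class attribute:

```python
class Base:
    _var = 0

    @classmethod
    def inc_class(cls):
        cls._var += 1  # rebinds _var on whichever class the call came through

    @staticmethod
    def inc_static():
        Base._var += 1  # always rebinds Base._var


class A(Base):
    pass


class B(Base):
    pass


a, a2, b = A(), A(), B()
a.inc_class()   # creates A._var = 1
b.inc_class()   # creates B._var = 1
a.inc_static()  # Base._var = 1
b.inc_static()  # Base._var = 2
print(a._var, b._var, Base._var)  # 1 1 2

a2.inc_class()  # a and a2 share A._var
print(a._var)   # 2
```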

Can __setattr__() be defined in a class with __slots__?

Say I have a class which defines __slots__:
class Foo(object):
    __slots__ = ['x']
    def __init__(self, x=1):
        self.x = x
    # will the following work?
    def __setattr__(self, key, value):
        if key == 'x':
            object.__setattr__(self, key, -value)  # Haha - let's set to minus x
Can I define __setattr__() for it?
Since Foo has no __dict__, what will it update?

All your code does, apart from negate the value, is call the parent class __setattr__, which is exactly what would happen without your __setattr__ method. So the short answer is: Sure you can define a __setattr__.
What you cannot do is redefine __setattr__ to use self.__dict__, because instances of a class with slots do not have a __dict__ attribute. But such instances do have a self.x attribute; its contents are just not stored in a dictionary on the instance.
Instead, slot values are stored in the same location a __dict__ instance dictionary would otherwise be stored; on the object heap. Space is reserved for len(__slots__) references, and descriptors on the class access these references on your behalf.
So, in a __setattr__ hook, you can just call those descriptors directly instead:
def __setattr__(self, key, value):
    if key == 'x':
        Foo.__dict__[key].__set__(self, -value)
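Put together as a complete class, the hook might look like the sketch below. The else branch that defers to object.__setattr__ is an addition for completeness and is not part of the original snippet:

```python
class Foo:
    __slots__ = ["x"]

    def __init__(self, x=1):
        self.x = x  # goes through __setattr__ below

    def __setattr__(self, key, value):
        if key == "x":
            # Foo.__dict__["x"] is the descriptor created for the slot.
            Foo.__dict__[key].__set__(self, -value)
        else:
            # Assumed fallback: default behaviour for any other name.
            object.__setattr__(self, key, value)


f = Foo(5)
print(f.x)                     # -5
print(hasattr(f, "__dict__"))  # False
```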
Interesting detour: yes, on classes without a __slots__ attribute, there is a descriptor that would give you access to the __dict__ object of instances:
>>> class Bar(object): pass
...
>>> Bar.__dict__['__dict__']
<attribute '__dict__' of 'Bar' objects>
>>> Bar.__dict__['__dict__'].__get__(Bar(), Bar)
{}
which is how normal instances can look up self.__dict__. Which makes you wonder where the Bar.__dict__ object is found. In Python, it is turtles all the way down, you'd look that object up on the type object of course:
>>> type.__dict__['__dict__']
<attribute '__dict__' of 'type' objects>
>>> type.__dict__['__dict__'].__get__(Bar, type)
dict_proxy({'__dict__': <attribute '__dict__' of 'Bar' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'Bar' objects>, '__doc__': None})

Why is the __get__ method not called for an instance attribute?

There is this code:
class A:
    def __init__(self, x):
        self.x = x
    def __get__(self, obj, type=None):
        print("__get__")
        return self.x
    def __set__(self, obj, value):
        pass
class B:
    a_oc = A(44)
    def __init__(self, y):
        self.a_ob = A(y)
b = B(3)
print(b.a_oc) # class attribute called __get__
print(b.a_ob) # __get__ not called
For class attribute __get__ is called, for instance attribute it is not. Why?

The attribute lookup rule for new-style classes (every class in 3.x, and classes that inherit from object in 2.x) is as follows. Take obj.attr:
1. If the value is generated by Python, such as __hash__, return it.
2. Look up attr in obj.__class__.__dict__. If it exists and is a data descriptor (it defines __get__ together with __set__ or __delete__), return the result of attr.__get__(obj, obj.__class__). If it is not found there, look in the parent classes recursively.
3. Look up attr in obj.__dict__. If obj is an instance and attr exists, return it; otherwise go to the next step. If obj is a class, look in its own __dict__ and its parents' __dict__; if the value found is a descriptor, return attr.__get__(None, obj), otherwise return the value itself.
4. Look up attr in obj.__class__.__dict__ again. If it is a non-data descriptor, return the result of its __get__. Otherwise return the attribute itself if it exists.
5. Raise AttributeError.
Look at your class:
>>> b.__class__
<class 'des.B'>
>>> b.__class__.__dict__
mappingproxy({'__init__': <function B.__init__ at 0x7f2dacb4e290>, '__doc__': None, '__weakref__': <attribute '__weakref__' of 'B' objects>, '__dict__': <attribute '__dict__' of 'B' objects>, 'a_oc': <des.A object at 0x7f2dacb5de50>, '__module__': 'des', '__qualname__': 'B'})
>>>
>>> b.__dict__
{'a_ob': <des.A object at 0x7f2dacb5df10>}
>>>
b.a_oc fits step 2 and b.a_ob fits step 3. (I put your code in a module named des.)
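The lookup order can be checked with a small sketch. The descriptor class is renamed Desc here for clarity, and the 'shadow' entry is a made-up value used to show that a data descriptor on the class wins over the instance dictionary:

```python
class Desc:
    """A data descriptor: it defines both __get__ and __set__."""

    def __init__(self, x):
        self.x = x

    def __get__(self, obj, objtype=None):
        return self.x

    def __set__(self, obj, value):
        pass  # ignore assignments, as in the question


class B:
    a_oc = Desc(44)  # lives in B.__dict__, so the descriptor protocol applies

    def __init__(self, y):
        # __dict__ is written directly so this does not depend on Desc.__set__
        self.__dict__["a_ob"] = Desc(y)  # plain value in the instance dict


b = B(3)
print(b.a_oc)                    # 44, via Desc.__get__
print(isinstance(b.a_ob, Desc))  # True, the object itself is returned

b.__dict__["a_oc"] = "shadow"    # try to shadow the class-level descriptor
print(b.a_oc)                    # still 44
```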

Why 'declare' variables in Python?

I was Googling some Python-related questions earlier, and stumbled upon this page. The author does something like the following:
class TestClass(object):
    first = str()
    def __init__(self):
        self.first = "Hello"
What's the point of "declaring" the variable first like that? I've never seen this done before, and I can't for the life of me think of a situation where it is beneficial to create a variable before assigning it some value.
The above example could just as well have looked like this:
class TestClass(object):
    def __init__(self, first="Hello"):
        self.first = first
...or am I missing something?

The fact that the author uses
first = str()
as opposed to
first = ''
shows, alongside the fact that self.first is set in __init__ anyway, that there is no purpose in doing this.
Maybe the author is confused and thinks Python variables need to be declared first -_- (evident when viewing the link)

That's not a declaration, that's an assignment ... to a variable inside the class, as opposed to a variable inside an instance.
Consider the following output:
>>> class K1(object):
...     def __init__(self):
...         self.attr = 'value'
...
>>> x = K1()
>>> x.__dict__
{'attr': 'value'}
>>> class K2(object):
...     attr = 'value'
...     def __init__(self):
...         self.another = 'value2'
...
>>> y = K2()
>>> y.__dict__
{'another': 'value2'}
Here x is an instance of class K1 and has an attribute named attr, and y is an instance of class K2 and has a different attribute named another. But:
>>> y.attr
'value'
Where did that come from? It came from the class:
>>> y.__class__.__dict__
dict_proxy({'__module__': '__main__', 'attr': 'value',
'__dict__': <attribute '__dict__' of 'K2' objects>,
'__weakref__': <attribute '__weakref__' of 'K2' objects>,
'__doc__': None, '__init__': <function __init__ at 0x80185b9b0>})
That's kind of messy but you can see the attr sitting in there. If you look at x.__class__.__dict__ there's no attr:
>>> x.__class__.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'K1' objects>,
'__module__': '__main__',
'__weakref__': <attribute '__weakref__' of 'K1' objects>,
'__doc__': None, '__init__': <function __init__ at 0x80185b938>})
When you get an attribute on an instance, like x.attr or y.attr, Python first looks for something attached to the instance itself. If nothing is found, though, it "looks upward" to see if something else defines that attribute. For classes with inheritance, that involves going through the "method resolution order" list. In this case there is no inheritance to worry about, but the next step is to look at the class itself. Here, in K2, there's an attribute in the class named attr, so that's what y.attr produces.
You can change the class attribute to change what shows up in y.attr:
>>> K2.attr = 'newvalue'
>>> y.attr
'newvalue'
And in fact, if you make another instance of K2(), it too will pick up the new value:
>>> z = K2()
>>> z.attr
'newvalue'
Note that changing one K1 instance's attr does not affect other instances of K1:
>>> w = K1()
>>> w.attr = 'private to w'
>>> w.attr
'private to w'
>>> x.attr
'value'
That's because w.attr is really w.__dict__['attr'], and x.attr is really x.__dict__['attr']. On the other hand, y.attr and z.attr are both really y.__class__.__dict__['attr'] and z.__class__.__dict__['attr'], and since y.__class__ and z.__class__ are both K2, changing K2.attr changes both.
(I'm not sure the guy who wrote the page referenced in the original question realizes all this, though. Creating a class-level attribute and then creating an instance-level one with the same name is kind of pointless.)
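One more case, sketched here with the K2 class reduced to its class attribute: assigning through an instance does not change the class attribute. It creates an instance attribute that shadows it, and deleting that instance attribute makes the class value visible again.

```python
class K2:
    attr = "value"


y = K2()
z = K2()

y.attr = "only on y"  # stored in y.__dict__; K2.attr is untouched
print(vars(y))        # {'attr': 'only on y'}
print(z.attr)         # value
print(K2.attr)        # value

del y.attr            # removes the instance attribute only
print(y.attr)         # value (found on the class again)
```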

str() is equal to ""
>>> str()
''
I think the author wants to show that instance attributes override class attributes having same name. So on executing
test = testclass()
print test.__dict__
you'll get:
{'second': 'weird', 'third': 'test', 'first': 'Some'}
not
{'second': '', 'third': '', 'first': ''}
but
print testclass.__dict__
will print the class attributes:
{'__module__': '__main__', 'third': '', 'second': '', '__doc__': None, '__init__': <function __init__ at 0xb5fed6bc>, 'first': ''}

There is indeed a little difference between the two examples:
class TestClass(object):
    first = 'foo'
    def __init__(self):
        self.first = "Hello"
print(TestClass.first)
Output:
foo
However with:
class TestClass(object):
    def __init__(self, first="Hello"):
        self.first = "Hello"
print(TestClass.first)
Output:
Traceback (most recent call last):
  File "C:\Users\...\Desktop\test.py", line 5, in <module>
    print(TestClass.first)
AttributeError: type object 'TestClass' has no attribute 'first'
Note: that doesn't mean that the author's code makes sense. I just wanted to point out the difference.

`__dict__` of classes in Python

Code goes first,
#Python 2.7
>>> class A(object):
...     pass
...
>>> a1 = A()
>>> a2 = A()
>>> A.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'A' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'A' objects>, '__doc__': None})
Questions:
1. What is dict_proxy and why use it?
2. A.__dict__ contains an entry '__dict__': <attribute '__dict__' of 'A' objects>. What is this? Is it for a1 and a2? But objects of A have their own __dict__, don't they?

For your first question, I quote Fredrik Lundh (http://www.velocityreviews.com/forums/t359039-dictproxy-what-is-this.html):
a CPython implementation detail, used to protect an internal data structure used
by new-style objects from unexpected modifications.

For your second question:
>>> class A(object):
...     pass
...
>>> a1 = A()
>>> a2 = A()
>>> a1.foo="spam"
>>> a1.__dict__
{'foo': 'spam'}
>>> A.bacon = 'delicious'
>>> a1.bacon
'delicious'
>>> a2.bacon
'delicious'
>>> a2.foo
Traceback (most recent call last):
  File "<pyshell#314>", line 1, in <module>
    a2.foo
AttributeError: 'A' object has no attribute 'foo'
>>> a1.__dict__
{'foo': 'spam'}
>>> A.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'A' objects>, 'bacon': 'delicious', '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'A' objects>, '__doc__': None})
Does this answer your question?
If not, dive deeper: https://stackoverflow.com/a/4877655/1324545

dict_proxy prevents you from creating new attributes on a class object by assigning them to the __dict__. If you want to do that use setattr(A, attribute_name, value).
a1 and a2 are instances of A and not class objects. They don't have the protection A has and you can assign using a1.__dict__['abc'] = 'xyz'
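A short sketch of that difference, written for Python 3, where the read-only view is called mappingproxy rather than dict_proxy:

```python
class A:
    pass


# Writing through the class's __dict__ view is rejected...
try:
    A.__dict__["bacon"] = "delicious"
    write_failed = False
except TypeError:
    write_failed = True
print(write_failed)  # True

# ...so class attributes are set with setattr() or plain assignment.
setattr(A, "bacon", "delicious")
print(A.__dict__["bacon"])  # delicious

# An instance's __dict__ is an ordinary dict and can be written directly.
a1 = A()
a1.__dict__["abc"] = "xyz"
print(a1.abc)  # xyz
```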
