I discovered some weird behaviour (at least weird to me) with Python class variables.
class Base(object):
    _var = 0
    @classmethod
    def inc_class(cls):
        cls._var += 1
    @staticmethod
    def inc_static():
        Base._var += 1
class A(Base):
    pass
class B(Base):
    pass
a = A()
b = B()
a.inc_class()
b.inc_class()
a.inc_static()
b.inc_static()
print(a._var)
print(b._var)
print(Base._var)
The output is 1 1 2.
This surprises me (I was expecting 4 4 4), and I'm wondering why it happens.
When a method is decorated with @classmethod, the first argument cls to inc_class(cls) is, well, the class: <class '__main__.A'> and <class '__main__.B'> for a and b respectively. So cls._var refers to A's _var, and similarly for B. In inc_static, decorated with @staticmethod, there is no such argument; you're explicitly referring to <class '__main__.Base'>, a different _var.
Note the '_var': 0 entry in Base's __dict__ and its absence from A's __dict__. @classmethod is doing what you'd expect it to do: binding the method to the class it is called through, in this case A and B.
>>> Base.__dict__
mappingproxy({'__module__': '__main__', '_var': 0, 'inc_class': <classmethod
object at 0x7f23037a8b38>, 'inc_static': <staticmethod object at
0x7f23037a8c18>, '__dict__': <attribute '__dict__' of 'Base' objects>,
'__weakref__': <attribute '__weakref__' of 'Base' objects>, '__doc__': None})
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None})
After calling Base.inc_static():
>>> Base.__dict__
mappingproxy({'__module__': '__main__', '_var': 1, 'inc_class':
<classmethod object at 0x7f23037a8b38>, 'inc_static': <staticmethod
object at 0x7f23037a8c18>, '__dict__': <attribute '__dict__' of 'Base'
objects>, '__weakref__': <attribute '__weakref__' of 'Base' objects>,
'__doc__': None})
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None})
After calling A.inc_class():
>>> Base.__dict__
mappingproxy({'__module__': '__main__', '_var': 1, 'inc_class':
<classmethod object at 0x7f23037a8b38>, 'inc_static': <staticmethod
object at 0x7f23037a8c18>, '__dict__': <attribute '__dict__' of 'Base'
objects>, '__weakref__': <attribute '__weakref__' of 'Base' objects>,
'__doc__': None})
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None, '_var': 1})
What's interesting is how A's _var is initialised. Note that you do cls._var += 1 before _var has been defined on A. For an int, cls._var += 1 is equivalent to cls._var = cls._var + 1. Because of the way Python does attribute lookup, the read of cls._var fails to find _var in A and goes on to find it in Base. The assignment then adds _var to A's __dict__ with the value Base._var + 1, and from then on A has its own _var.
>>> class Base(object):
...     _var = 10
...     @classmethod
...     def inc_class(cls):
...         cls._var += 1
...
>>> class A(Base):
...     pass
...
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None})
>>> A.inc_class()
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None, '_var': 11})
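To make the read-then-write step visible, here is a minimal sketch that spells out the augmented assignment by hand. The local name current is only there for illustration; the behaviour is the same as cls._var += 1 for an int.

```python
class Base:
    _var = 10

    @classmethod
    def inc_class(cls):
        # Spelled-out form of `cls._var += 1` for an immutable int.
        current = cls._var      # read: falls back to Base if cls has no _var yet
        cls._var = current + 1  # write: always lands in cls's own namespace


class A(Base):
    pass


before = "_var" in vars(A)  # A only inherits _var at this point
A.inc_class()
after = "_var" in vars(A)   # the assignment created A's own _var

print(before, after, A._var, Base._var)  # False True 11 10
```

Base._var stays at 10 because the write never touches Base.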
Even though the two classes inherit from the Base class, they are completely different objects. Through the instantiation of a and b, you have two objects that belong to two separate classes. When you call
a.inc_class()
b.inc_class()
you increment the _var attribute of class A once, and then you do the same for class B. Even though they share the same name, they are different objects. If you had a second instance of class A, say a2, and called a2.inc_class() as well, then both calls would manipulate the same variable, A._var. This explains how you get your first two outputs.
The third output refers to the Base class object. Again, even though it is the same name, it is a different object. You increment the 3rd object twice, therefore you get 2 as the answer.
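If the goal was a single counter shared by the whole hierarchy (the 4 4 4 the question expected), one possible sketch is to always write through Base, so that no subclass ever gets its own copy. This is just one way to do it, not the only one.

```python
class Base:
    _var = 0

    @classmethod
    def inc_class(cls):
        # Write to Base explicitly instead of cls, so A and B never
        # acquire their own _var attribute.
        Base._var += 1

    @staticmethod
    def inc_static():
        Base._var += 1


class A(Base):
    pass


class B(Base):
    pass


a, b = A(), B()
a.inc_class()
b.inc_class()
a.inc_static()
b.inc_static()

print(a._var, b._var, Base._var)  # 4 4 4
```

All three lookups now end up at Base._var, because neither the instances nor A and B define _var themselves.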
Related
I have a python internal class that looks like this:
class TopClass:
    @my_decorator
    class InternalClass:
        """Persistent state."""
        a: int
        b: int
        c: int
In this case my_decorator is a thin wrapper around @dataclass. Running Sphinx, however, writes things like this in the docs:
__dataclass_params__= _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False, frozen=False)
__dict__= mappingproxy({'__module__': 'my_module.try', '__annotations__': {'a': 'int', 'b': 'int', 'c': 'int'}, '__doc__': 'Persistent state.', '__dict__': <attribute '__dict__' of 'InternalClass' objects>, '__weakref__': <attribute '__weakref__' of 'State' objects>, '__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False), '__dataclass_fields__': {'a': Field(name='a',type='int',default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD), 'b': Field(name='b',type='int',default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD), 'c': Field(name='c',type='int',default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)}, '__init__': <function __create_fn__.<locals>.__init__>, '__repr__': <function __create_fn__.<locals>.__repr__>, '__eq__': <function __create_fn__.<locals>.__eq__>, '__hash__': None})
__eq__(other)
Return self==value.
__hash__= None
__init__(a, b, c)
How can I avoid this?
I have the following classes:
class A(object):
    x = 1
class B(A):
    pass
class C(A):
    pass
When I print the value of x from each class I get:
>>> A.x, B.x, C.x
(1, 1, 1)
Then I assign 2 to B.x
>>> B.x = 2
>>> A.x, B.x, C.x
(1, 2, 1)
Everything was normal but when I assigned 3 to A.x I got this :
>>> A.x = 3
>>> A.x, B.x, C.x
(3, 2, 3)
I thought it would return (3,2,1).
This is fundamentally how inheritance works in Python: for class-level variables, it first checks the class's own namespace, then the namespace of every class in the method resolution order. So both B and C inherit x from A:
In [1]: class A(object):
...:     x = 1
...: class B(A):
...:     pass
...: class C(A):
...:     pass
...:
In [2]: vars(A)
Out[2]:
mappingproxy({'__module__': '__main__',
'x': 1,
'__dict__': <attribute '__dict__' of 'A' objects>,
'__weakref__': <attribute '__weakref__' of 'A' objects>,
'__doc__': None})
In [3]: vars(B)
Out[3]: mappingproxy({'__module__': '__main__', '__doc__': None})
In [4]: vars(C)
Out[4]: mappingproxy({'__module__': '__main__', '__doc__': None})
When you ask for B.x or C.x, Python looks in that class's namespace, doesn't find any "x", then tries A's namespace, finds it there, and returns it.
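The lookup described above can be approximated by hand. The helper find_class_attr below is hypothetical and deliberately simplified: the real lookup also involves the metaclass and descriptors, which this sketch ignores.

```python
class A:
    x = 1


class B(A):
    pass


class C(A):
    pass


def find_class_attr(cls, name):
    """Hypothetical helper: return (owner, value) for the first class
    in cls's method resolution order whose namespace defines name."""
    for klass in cls.__mro__:
        namespace = vars(klass)
        if name in namespace:
            return klass, namespace[name]
    raise AttributeError(name)


owner_b, value_b = find_class_attr(B, "x")
owner_c, value_c = find_class_attr(C, "x")

print([k.__name__ for k in B.__mro__])  # ['B', 'A', 'object']
print(owner_b.__name__, value_b)        # A 1
print(owner_c.__name__, value_c)        # A 1
```

Both lookups report A as the owner, which is why a later change to A.x is visible through every subclass that has no x of its own.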
Now, when you assign B.x = 2, that adds x to B's class namespace directly:
In [5]: B.x = 2
...:
In [6]: vars(B)
Out[6]: mappingproxy({'__module__': '__main__', '__doc__': None, 'x': 2})
And similarly, when you assign A.x = 3, it overwrites the old value in A's namespace:
In [7]: A.x=3
...:
In [8]: vars(A)
Out[8]:
mappingproxy({'__module__': '__main__',
'x': 3,
'__dict__': <attribute '__dict__' of 'A' objects>,
'__weakref__': <attribute '__weakref__' of 'A' objects>,
'__doc__': None})
In [9]: vars(B)
Out[9]: mappingproxy({'__module__': '__main__', '__doc__': None, 'x': 2})
In [10]: vars(C)
Out[10]: mappingproxy({'__module__': '__main__', '__doc__': None})
So now, same as before, when you look for C.x, it doesn't find its own, then it looks for x inside A, and finds it.
Note that inheritance works like this with instances too; it just checks the instance namespace first, then the instance's class's namespace, then the namespaces of all the classes in its method resolution order.
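A short sketch of the same lookup order starting from an instance. The class names mirror the question; the values 2 and 5 are arbitrary.

```python
class A:
    x = 1


class B(A):
    pass


b = B()
from_base = b.x          # 1: not on b, not on B, found on A

B.x = 2
from_class = b.x         # 2: B's namespace now shadows A's

b.x = 5                  # stored in the instance namespace only
from_instance = (b.x, B.x, A.x)

del b.x                  # removes the instance attribute, B.x shows through again
after_delete = b.x

print(from_base, from_class, from_instance, after_delete)  # 1 2 (5, 2, 1) 2
```

Each assignment only touches the namespace of the object it is made on; the reads walk outward from the instance until something is found.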
I think it's because you never set x on class C itself.
Thus C gets its value from the superclass (the parent class, A).
If you set x on C explicitly (C.x = 1) before changing A.x, you will get (3, 2, 1).
Class creation seems to never re-define the __dict__ and __weakref__ class attributes (i.e. if they already exist in the dictionary of a superclass, they are not added to the dictionaries of its subclasses), but to always re-define the __doc__ and __module__ class attributes. Why?
>>> class A: pass
...
>>> class B(A): pass
...
>>> class C(B): __slots__ = ()
...
>>> vars(A)
mappingproxy({'__module__': '__main__',
'__dict__': <attribute '__dict__' of 'A' objects>,
'__weakref__': <attribute '__weakref__' of 'A' objects>,
'__doc__': None})
>>> vars(B)
mappingproxy({'__module__': '__main__', '__doc__': None})
>>> vars(C)
mappingproxy({'__module__': '__main__', '__slots__': (), '__doc__': None})
>>> class A: __slots__ = ()
...
>>> class B(A): pass
...
>>> class C(B): pass
...
>>> vars(A)
mappingproxy({'__module__': '__main__', '__slots__': (), '__doc__': None})
>>> vars(B)
mappingproxy({'__module__': '__main__',
'__dict__': <attribute '__dict__' of 'B' objects>,
'__weakref__': <attribute '__weakref__' of 'B' objects>,
'__doc__': None})
>>> vars(C)
mappingproxy({'__module__': '__main__', '__doc__': None})
The '__dict__' and '__weakref__' entries in a class's __dict__ (when present) are descriptors used for retrieving an instance's dict pointer and weakref pointer from the instance memory layout. They're not the actual class's __dict__ and __weakref__ attributes - those are managed by descriptors on the metaclass.
There's no point adding those descriptors if a class's ancestors already provide one. However, a class does need its own __module__ and __doc__, regardless of whether its parents already have one - it doesn't make sense for a class to inherit its parent's module name or docstring.
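A quick way to check which classes carry these descriptors, as a sketch. The class names are made up for illustration, and the __slots__ variant mirrors the second example in the question.

```python
class Plain:
    pass


class PlainChild(Plain):
    pass


class Slotted:
    __slots__ = ()


class SlottedChild(Slotted):
    pass


def has_entries(cls):
    """Report which of the four special names live in cls's own namespace."""
    names = ("__dict__", "__weakref__", "__module__", "__doc__")
    return {name: name in vars(cls) for name in names}


for cls in (Plain, PlainChild, Slotted, SlottedChild):
    print(cls.__name__, has_entries(cls))
```

Plain and SlottedChild are the first classes in their chains to reserve space for an instance dict and weakref pointer, so only they get the '__dict__' and '__weakref__' descriptors; every class gets its own '__module__' and '__doc__'.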
You can see the implementation in type_new, the (very long) C implementation of type.__new__. Look for the add_weak and add_dict variables - those are the variables that determine whether type.__new__ should add space for __dict__ and __weakref__ in the class's instance memory layout. If type.__new__ decides it should add space for one of those attributes to the instance memory layout, it also adds getset descriptors to the class (through tp_getset) to retrieve the attributes:
if (add_dict) {
    if (base->tp_itemsize)
        type->tp_dictoffset = -(long)sizeof(PyObject *);
    else
        type->tp_dictoffset = slotoffset;
    slotoffset += sizeof(PyObject *);
}
if (add_weak) {
    assert(!base->tp_itemsize);
    type->tp_weaklistoffset = slotoffset;
    slotoffset += sizeof(PyObject *);
}
type->tp_basicsize = slotoffset;
type->tp_itemsize = base->tp_itemsize;
type->tp_members = PyHeapType_GET_MEMBERS(et);
if (type->tp_weaklistoffset && type->tp_dictoffset)
    type->tp_getset = subtype_getsets_full;
else if (type->tp_weaklistoffset && !type->tp_dictoffset)
    type->tp_getset = subtype_getsets_weakref_only;
else if (!type->tp_weaklistoffset && type->tp_dictoffset)
    type->tp_getset = subtype_getsets_dict_only;
else
    type->tp_getset = NULL;
If add_dict or add_weak is false, no space is reserved and no descriptor is added for that attribute. One condition for add_dict or add_weak to be false is that one of the parents already reserved space:
add_dict = 0;
add_weak = 0;
may_add_dict = base->tp_dictoffset == 0;
may_add_weak = base->tp_weaklistoffset == 0 && base->tp_itemsize == 0;
This check doesn't actually care about any ancestor descriptors, just whether an ancestor reserved space for an instance dict pointer or weakref pointer, so if a C ancestor reserved space without providing a descriptor, the child won't reserve space or provide a descriptor. For example, set has a nonzero tp_weaklistoffset, but no __weakref__ descriptor, so descendants of set won't provide a __weakref__ descriptor either, even though instances of set (including subclass instances) support weak references.
You'll also see an && base->tp_itemsize == 0 in the initialization for may_add_weak - you can't add weakref support to a subclass of a class with variable-length instances.
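The two cases can be observed from Python, as a rough sketch. MySet and MyTuple are made-up names, and whether '__weakref__' shows up in a subclass namespace is a CPython implementation detail, so that line is printed without a claimed result.

```python
import weakref


class MySet(set):
    pass


class MyTuple(tuple):
    pass


# set reserves a weakref pointer itself, so subclass instances are
# weak-referenceable without the subclass adding anything.
s = MySet()
set_ref = weakref.ref(s)
set_ok = set_ref() is s

# tuple has variable-length instances, so no weakref slot can be added.
try:
    weakref.ref(MyTuple())
    tuple_ok = True
except TypeError:
    tuple_ok = False

print(set_ok, tuple_ok)  # True False

# CPython detail discussed above; inspect it on your own interpreter.
print("__weakref__" in vars(MySet))
```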
Say I have a class which defines __slots__:
class Foo(object):
    __slots__ = ['x']
    def __init__(self, x=1):
        self.x = x
    # will the following work?
    def __setattr__(self, key, value):
        if key == 'x':
            object.__setattr__(self, key, -value) # Haha - let's set to minus x
Can I define __setattr__() for it?
Since Foo has no __dict__, what will it update?
All your code does, apart from negate the value, is call the parent class __setattr__, which is exactly what would happen without your __setattr__ method. So the short answer is: Sure you can define a __setattr__.
What you cannot do is redefine __setattr__ to use self.__dict__, because instances of a class with slots do not have a __dict__ attribute. But such instances do have a self.x attribute; its contents are just not stored in a dictionary on the instance.
Instead, slot values are stored in the same location a __dict__ instance dictionary would otherwise be stored: on the object heap. Space is reserved for len(__slots__) references, and descriptors on the class access these references on your behalf.
So, in a __setattr__ hook, you can just call those descriptors directly instead:
def __setattr__(self, key, value):
    if key == 'x':
        Foo.__dict__[key].__set__(self, -value)
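Put together as a complete class, this might look like the sketch below. The else branch is an addition for illustration (the snippets above silently ignore every other attribute name); it falls back to the default behaviour, which rejects names that have no slot.

```python
class Foo:
    __slots__ = ["x"]

    def __init__(self, x=1):
        self.x = x  # routed through __setattr__ below

    def __setattr__(self, key, value):
        if key == "x":
            # Call the slot's descriptor directly and store the negated value.
            Foo.__dict__[key].__set__(self, -value)
        else:
            # Assumed fallback: default handling, which raises
            # AttributeError because no other slot exists.
            object.__setattr__(self, key, value)


foo = Foo(3)
first = foo.x
foo.x = 10
second = foo.x

try:
    foo.y = 1
    y_allowed = True
except AttributeError:
    y_allowed = False

print(first, second, hasattr(foo, "__dict__"), y_allowed)  # -3 -10 False False
```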
Interesting detour: yes, on classes without a __slots__ attribute, there is a descriptor that would give you access to the __dict__ object of instances:
>>> class Bar(object): pass
...
>>> Bar.__dict__['__dict__']
<attribute '__dict__' of 'Bar' objects>
>>> Bar.__dict__['__dict__'].__get__(Bar(), Bar)
{}
which is how normal instances can look up self.__dict__. Which makes you wonder where the Bar.__dict__ object is found. In Python, it is turtles all the way down: you look that object up on the type object, of course:
>>> type.__dict__['__dict__']
<attribute '__dict__' of 'type' objects>
>>> type.__dict__['__dict__'].__get__(Bar, type)
dict_proxy({'__dict__': <attribute '__dict__' of 'Bar' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'Bar' objects>, '__doc__': None})
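The session above is from Python 2. The same two lookups on Python 3 can be sketched as follows; the only visible difference should be that the class namespace comes back as a mappingproxy rather than a dict_proxy. The attribute name answer is arbitrary.

```python
import types


class Bar:
    pass


bar = Bar()
bar.answer = 42

# Descriptor stored on the class: fetches the instance's dict.
instance_dict = Bar.__dict__["__dict__"].__get__(bar, Bar)

# Descriptor stored on the metaclass (type): fetches the class namespace.
class_namespace = type.__dict__["__dict__"].__get__(Bar, type)

print(instance_dict)                                        # {'answer': 42}
print(isinstance(class_namespace, types.MappingProxyType))  # True
print("__dict__" in class_namespace)                        # True
```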
Code goes first,
# Python 2.7
>>> class A(object):
...     pass
>>> a1 = A()
>>> a2 = A()
>>> A.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'A' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'A' objects>, '__doc__': None})
Questions
1. What is dict_proxy and why use it?
2. A.__dict__ contains an attribute, '__dict__': <attribute '__dict__' of 'A' objects>. What is this? Is it for a1 and a2? But objects of A have their own __dict__, don't they?
For your first question I quote from Fredrik Lundh: http://www.velocityreviews.com/forums/t359039-dictproxy-what-is-this.html:
a CPython implementation detail, used to protect an internal data structure used
by new-style objects from unexpected modifications.
For your second question:
>>> class A(object):
...     pass
>>> a1 = A()
>>> a2 = A()
>>> a1.foo="spam"
>>> a1.__dict__
{'foo': 'spam'}
>>> A.bacon = 'delicious'
>>> a1.bacon
'delicious'
>>> a2.bacon
'delicious'
>>> a2.foo
Traceback (most recent call last):
File "<pyshell#314>", line 1, in <module>
a2.foo
AttributeError: 'A' object has no attribute 'foo'
>>> a1.__dict__
{'foo': 'spam'}
>>> A.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'A' objects>, 'bacon': 'delicious', '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'A' objects>, '__doc__': None})
Does this answer your question?
If not, dive deeper: https://stackoverflow.com/a/4877655/1324545
dict_proxy prevents you from creating new attributes on a class object by assigning them to the __dict__. If you want to do that use setattr(A, attribute_name, value).
a1 and a2 are instances of A and not class objects. They don't have the protection A has and you can assign using a1.__dict__['abc'] = 'xyz'
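The difference between the two kinds of __dict__ can be checked directly. This sketch uses Python 3, where the read-only view is called mappingproxy instead of dict_proxy; the attribute names are taken from the examples above.

```python
class A:
    pass


# Writing through the class's __dict__ view is rejected...
try:
    A.__dict__["bacon"] = "delicious"
    class_dict_writable = True
except TypeError:
    class_dict_writable = False

# ...but setattr (or plain attribute assignment) works.
setattr(A, "bacon", "delicious")

# An instance's __dict__ is an ordinary dict and can be written to.
a1 = A()
a1.__dict__["abc"] = "xyz"

print(class_dict_writable, A.bacon, a1.abc)  # False delicious xyz
```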