Initializing parent class instance attributes - python

I understand how to initialize a parent class to get its instance attributes in a child class, but not exactly what's going on behind the scenes to accomplish this. (Note: I'm intentionally not using super here, just to keep the illustration clear.)
Below we extend class A by adding an extra attribute y to the child class B. If you look at the instance dict after instantiating b = B(), we rightly see both b.x (inherited from class A) and b.y.
I assume at a high level this is accomplished by the call to A.__init__(self, x=10) performing something similar to b.x = 10 (the way a normal instance attribute would be assigned) within the __init__ of class B. It's a bit unclear to me because you are calling the __init__ of class A, not class B, yet class B still gets its instance attributes updated accordingly. How does class A's __init__ know to update b's instance attributes?
This is different from inherited methods, where the b object has no explicit inherited method in its own namespace but looks up the inheritance chain when a call to a missing method is made. With the attribute, the value is actually in b's namespace (its instance dict).
class A:
    def __init__(self, x):
        self.x = x

class B(A):
    def __init__(self):
        A.__init__(self, x=10)
        self.y = 1

b = B()
print(b.__dict__)
>>> {'x': 10, 'y': 1}  # x added to instance dict from parent init
Below we inherit from the built-in list. Here, similar to the above, since we are calling the list's __init__ method within Foolist's __init__, I would expect to see an instance dictionary that contains elems, but it is nowhere to be found. The values 123 are in the object somewhere, as can be seen by printing alist, but not in the instance dict.
class Foolist(list):
    def __init__(self, elems):
        list.__init__(self, elems)

alist = Foolist('123')
So what exactly is going on in the inheriting class when a parent's __init__ is called from a child's __init__? How are values being bound? It seems different from method lookup, as you are not searching the inheritance chain on demand, but actually assigning values to the inheriting class's instance dict.
How does a call to a parent's __init__ fill out its child's instance dict? Why does the Foolist example not do this?

The answer is simple: self.
As a very rough overview, when instantiating a class, an object is created. This is more or less literally just an empty container without affiliation to anything.* This "empty container" is then passed to the __init__ method of the class that is being instantiated, where it becomes... the self argument! You're then setting an attribute on that object. You're then calling a different class's __init__ method, explicitly passing your specific self object to that method; that method then adds another attribute to the object.
This is in fact how every instance method works. Each method implicitly receives the "current object" as its first argument, self. When calling a parent's __init__ method through the class, you're simply passing that object explicitly.
You can approximate that behaviour with this simple example:
from types import SimpleNamespace

def init_a(obj):
    obj.x = 10

def init_b(obj):
    init_a(obj)
    obj.y = 20

o = SimpleNamespace()  # a plain dict would reject attribute assignment
init_b(o)
* The object is not entirely "empty", there are particular attributes set on the object which create an affiliation with a particular class, so the object is "an instance of" a certain class, and Python can locate all the methods it so inherits from the class as needed.
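To make the role of self concrete, and to cover the Foolist half of the question, here is a small sketch. The Unrelated class is a hypothetical name introduced only for illustration. It suggests that A.__init__ is an ordinary function that sets an attribute on whatever object it is handed, and that list.__init__ fills the list's own internal storage (implemented in C in CPython) rather than the instance dict.

```python
class A:
    def __init__(self, x):
        self.x = x

class B(A):
    def __init__(self):
        A.__init__(self, x=10)  # self here is the B instance being initialized
        self.y = 1

class Unrelated:  # hypothetical class, not related to A at all
    pass

b = B()
u = Unrelated()
A.__init__(u, x=99)  # works too: the function just runs u.x = 99

print(b.__dict__)  # {'x': 10, 'y': 1}
print(u.__dict__)  # {'x': 99}

class Foolist(list):
    def __init__(self, elems):
        list.__init__(self, elems)

alist = Foolist('123')
print(alist)           # ['1', '2', '3']
print(alist.__dict__)  # {} : the elements are not stored as attributes
```

Nothing in list.__init__ ever executes something like self.elems = ..., so no attribute named elems appears.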

Related

Getting private attribute in parent class using super(), outside of a method

I have a class with a private constant _BAR = object().
In a child class, outside of a method (no access to self), I want to refer to _BAR.
Here is a contrived example:
class Foo:
    _BAR = object()

    def __init__(self, bar: object = _BAR):
        ...

class DFoo(Foo):
    """Child class where I want to access private class variable from parent."""

    def __init__(self, baz: object = super()._BAR):
        super().__init__(baz)
Unfortunately, this doesn't work. One gets an error: RuntimeError: super(): no arguments
Is there a way to use super outside of a method to get a parent class attribute?
The workaround is to use Foo._BAR, I am wondering though if one can use super to solve this problem.
Inside of DFoo, you cannot refer to Foo._BAR without referring to Foo. Python variables are searched in the local, enclosing, global and built-in scopes (in this order; this is the so-called LEGB rule), and _BAR is not present in any of them.
Let's ignore an explicit Foo._BAR.
Further, it gets inherited: DFoo._BAR will be looked up first in DFoo, and when not found, in Foo.
What other means are there to get the Foo reference? Foo is a base class of DFoo. Can we use this relationship? Yes and no. Yes at execution time and no at definition time.
The problem is that while DFoo is being defined, it does not exist yet. We have no starting point from which to follow the inheritance chain. This rules out an indirect reference (DFoo -> Foo) in a def method(self, ....): line and in a class attribute _DBAR = _BAR.
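The definition-time limitation can be checked with a short sketch. The class name Broken and the variable failure are hypothetical names used only for this illustration.

```python
class Foo:
    _BAR = object()

# Inherited names are not in scope inside the subclass body at definition time.
try:
    class Broken(Foo):
        _DBAR = _BAR  # bare name: class namespace, then globals, then builtins
except NameError as exc:
    failure = exc

# Naming the parent explicitly works, because Foo already exists.
class DFoo(Foo):
    def __init__(self, baz=Foo._BAR):
        self.baz = baz

print(type(failure).__name__)  # NameError
print(DFoo().baz is Foo._BAR)  # True
print(DFoo._BAR is Foo._BAR)   # True: inherited lookup works once DFoo exists
```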
It is possible to work around this limitation using a class decorator. Define the class and then modify it:
def deco(cls):
    cls._BAR = cls.__mro__[1]._BAR * 2  # __mro__[0] is the class itself
    return cls

class Foo:
    _BAR = 10

@deco
class DFoo(Foo):
    pass

print(Foo._BAR, DFoo._BAR)  # 10 20
A similar effect can be achieved with a metaclass.
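As a rough sketch of the metaclass variant, assuming single inheritance as in the decorator example: the name DoublingMeta is hypothetical, and the doubling is only there to mirror the decorator above.

```python
class DoublingMeta(type):
    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        if bases:
            # The class object exists now, so its MRO can be followed.
            cls._BAR = cls.__mro__[1]._BAR * 2
        return cls

class Foo(metaclass=DoublingMeta):
    _BAR = 10

class DFoo(Foo):
    pass

print(Foo._BAR, DFoo._BAR)  # 10 20
```

The metaclass runs for every subclass automatically, whereas the decorator has to be applied to each class by hand.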
The last option to get a reference to Foo is at execution time. We have the object self, its type is DFoo, and its parent type is Foo and there exists the _BAR. The well known super() is a shortcut to get the parent.
I have assumed only one base class for simplicity. If there were several base classes, super() returns only one of them. The example class decorator does the same. To understand how several bases are sorted to a sequence, see how the MRO works (Method Resolution Order).
My final thought is that I could not think up a use-case where such access as in the question would be required.
Short answer: you can't!
I'm not going into much detail about the super class itself here. (I've written a pure Python implementation in this gist if you'd like to read it.)
But now let's see how we can call super:
1- Without arguments:
From PEP 3135:
This PEP proposes syntactic sugar for use of the super type to
automatically construct instances of the super type binding to the
class that a method was defined in, and the instance (or class object
for classmethods) that the method is currently acting upon.
The new syntax:
super()
is equivalent to:
super(__class__, <firstarg>)
...and <firstarg> is the first parameter of the method
So this is not an option because you don't have access to the "instance".
(Body of the function/methods is not executed unless it gets called, so no problem if DFoo doesn't exist yet inside the method definition)
2- super(type, instance)
From documentation:
The zero argument form only works inside a class definition, as the
compiler fills in the necessary details to correctly retrieve the
class being defined, as well as accessing the current instance for
ordinary methods.
What were those necessary details mentioned above? A "type" and an "instance":
We can pass neither the "instance" nor the "type", which is DFoo here. The first is unavailable because we are not inside a method, so we don't have access to the instance (self). The second is DFoo itself. By the time the body of the DFoo class is being executed there is no reference to DFoo; it doesn't exist yet. The body of the class is executed inside a namespace, which is a dictionary. After that, a new instance of type type, here named DFoo, is created using that populated dictionary and added to the global namespace. That's roughly what the class keyword does in its simple form.
3- super(type, type):
If the second argument is a type, issubclass(type2, type) must be
true
The same reason as mentioned above about accessing DFoo applies.
4- super(type):
If the second argument is omitted, the super object returned is
unbound.
If you have an unbound super object you can't do attribute lookup through it (except for the super object's own attributes). Remember that a super object is a descriptor. You can turn an unbound object into a bound object by calling __get__ and passing the instance:
class A:
    a = 1

class B(A):
    pass

class C(B):
    sup = super(B)
    try:
        sup.a
    except AttributeError as e:
        print(e)  # 'super' object has no attribute 'a'

obj = C()
print(obj.sup.a)  # 1
obj.sup automatically calls __get__.
And again, the same reason about accessing the DFoo type mentioned above applies; nothing has changed. This form is included just for the record. These are the ways in which we can call super.
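The claim that DFoo does not exist while its body runs, and that the class statement roughly amounts to a call to type, can be checked with a small sketch. The names name_visible_in_body, namespace and DFoo2 are hypothetical and exist only for this illustration.

```python
class Foo:
    _BAR = object()

class DFoo(Foo):
    # While this body runs, the name DFoo has not been bound yet.
    try:
        DFoo
    except NameError:
        name_visible_in_body = False
    else:
        name_visible_in_body = True

print(DFoo.name_visible_in_body)  # False

# Roughly what the class statement does in the simple case:
namespace = {"extra": 1}
DFoo2 = type("DFoo2", (Foo,), namespace)
print(DFoo2._BAR is Foo._BAR)  # True
print(DFoo2.extra)             # 1
```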

instance methods sharing in python

1- is it true?
all the objects of a particular class have their own data members but share the member functions, for which only one copy in the memory exists?
2- and why the address of init in this code is similar:
class c:
    def __init__(self, color):
        print(f"id of self in __init__ on class is {id(self)}")
    def test(self):
        print("hello")
    print(f"id of __init__ on class is {id(__init__)}")

a = c("red")
print(id(a.__init__))
print(id(a.test))
b = c("green")
b.test()
print(id(b.__init__))
print(id(b.test))
Output:
id of __init__ on class is 1672033309600
id of self in __init__ on class is 1672033251232
1672028411200
1672028411200
id of self in __init__ on class is 1672033249696
hello
1672028411200
1672028411200
Yes, all instances share the same code for a method. When you reference the method through a specific instance, a bound method object is created; it contains a reference to the method and the instance. When this bound method is called, it then calls the method function with the instance inserted as the first argument.
When you reference a method, a new bound method object is created. Unless you save the reference in a variable, the object will be garbage collected immediately. Referring to another method will create another bound method object, and it can use the same address.
Change your code to
init = a.__init__
test = a.test
print(id(init))
print(id(test))
and you'll get different IDs. Assigning the methods to variables keeps the memory from being reused.
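A short sketch of the same point without relying on id values: the class name C and the variables m1 and m2 are hypothetical. It suggests that each attribute access builds a fresh bound method object, while the underlying function is shared by all instances.

```python
class C:
    def test(self):
        return "hello"

a = C()
b = C()

m1 = a.test  # bound method wrapping (C.test, a)
m2 = b.test  # bound method wrapping (C.test, b)

print(m1 is m2)                              # False: two different wrappers
print(a.test is a.test)                      # False: a new wrapper per access
print(m1 == a.test)                          # True: same function and same instance
print(m1.__func__ is m2.__func__ is C.test)  # True: one shared function
print(m1.__self__ is a, m2.__self__ is b)    # True True
```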

Is there a better way than overriding every method of my parent Python class?

I have a Python parent class with dozens of methods. These parent methods return a parent object.
Each of these methods is similar to a math operation on two objects (e.g. add(self, other), multiply(self, other)), and returns the result of the operation as a new object of the same class.
I also have a child class, and its objects use all the parent methods. However, I need them to return the result as a new object of the child class not the parent class.
The child class has additional member variables and it has additional methods.
My first thought is to override each parent method with a child method that a) calls the eponymous parent method (child's add calls super's add) , b) converts the returned parent object into a new child object to set the additional child member variable, and c) returns the new child object.
Apart from the additional property that the child has over the parent, the conversion also allows me to perform type assertions to ensure I have submitted a child object as a function parameter, where required.
Maybe this is all par for the course. But it seems tedious, and cluttery, as I will have to write many such small overriding methods that all do the same thing (call the parent's method verbatim, convert the result).
What I also do not like about this approach is that if the parent is from a library used elsewhere, I'd have to write the overrides for each parent method. To future proof I'd even have to do this for methods I presently don't intend to use.
What are my alternatives? Or is there a better way to set up the classes in the first place, to avoid this?
It has crossed my mind to switch parent and child, but then this new child (formerly parent) will carry around a member variable that means nothing to it, and will have access to methods that make no sense to it.
I assume you have something like
class Parent:
    def __add__(self, other):
        return Parent(...)
when you probably want
class Parent:
    def __add__(self, other):
        return type(self)(...)
This allows the method to return a value whose type depends on its arguments (specifically, its first argument) rather than which class defined it.
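Here is a minimal sketch of that idea with a child that carries an extra attribute. The names Vec, TaggedVec, value and tag are hypothetical, and the sketch assumes the child's constructor can be called with the same arguments as the parent's.

```python
class Vec:
    def __init__(self, value):
        self.value = value

    def add(self, other):
        # Build the result from the runtime class of self, not from Vec.
        return type(self)(self.value + other.value)

class TaggedVec(Vec):
    def __init__(self, value, tag="untagged"):
        super().__init__(value)
        self.tag = tag

result = TaggedVec(1, tag="a").add(TaggedVec(2, tag="b"))
print(type(result).__name__, result.value, result.tag)  # TaggedVec 3 untagged
```

The result has the child type without any override, but the extra attribute falls back to its default. If it must be carried over, the child still needs some hook of its own for that.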
Define the parent class' methods to take the class of self into consideration:
>>> class Parent:
...     def __init__(self): pass
...     def method(self): return self.__class__()  # or type(self)()
...
>>> class Child(Parent): pass
...
>>> Child().method()
<__main__.Child object at 0x00000151EB7E0AF0>
>>> Parent().method()
<__main__.Parent object at 0x00000151EB977640>

What exactly does super() return in Python 3? [duplicate]

From Python3's documentation super() "returns a proxy object that delegates method calls to a parent or sibling class of type." What does that mean?
Suppose I have the following code:
class SuperClass():
    def __init__(self):
        print("__init__ from SuperClass.")
        print("self object id from SuperClass: " + str(id(self)))

class SubClass(SuperClass):
    def __init__(self):
        print("__init__ from SubClass.")
        print("self object id from SubClass: " + str(id(self)))
        super().__init__()

sc = SubClass()
The output I get from this is:
__init__ from SubClass.
self object id from SubClass: 140690611849200
__init__ from SuperClass.
self object id from SuperClass: 140690611849200
This means that in the line super().__init__(), super() is returning the current object which is then implicitly passed to the superclass' __init__() method. Is this accurate or am I missing something here?
To put it simply, I want to understand the following:
When super().__init__() is run,
What exactly is being passed to __init__() and how? We are calling it on super() so whatever this is returning should be getting passed to the __init__() method from what I understand about Python so far.
Why don't we have to pass in self to super().__init__()?
returns a proxy object that delegates method calls to a parent or
sibling class of type.
This proxy is an object that acts as the method-calling portion of the parent class. It is not the class itself; rather, it's just enough information so that you can use it to call the parent class methods.
If you call __init__(), you get your own, local, sub-class __init__ function. When you call super(), you get that proxy object, which will redirect you to the parent-class methods. Thus, when you call super().__init__(), that proxy redirects the call to the parent-class __init__ method.
Similarly, if you were to call super().foo, you would get the foo method from the parent class -- again, re-routed by that proxy.
Is that clear to you?
Responses to OP comments
But that must mean that this proxy object is being passed to
__init__() when running super().__init__() right?
Wrong. The proxy object is like a package name, such as calling math.sqrt(). You're not passing math to sqrt, you're using it to denote which sqrt you're using. If you wanted to pass the proxy to __init__, the call would be __init__(super()). That call would be semantically ridiculous, of course.
When we have to actually pass in self which is the sc object in my example.
No, you are not passing in sc; sc is the result of the object creation call (which uses the internal method __new__ and includes an invocation of __init__). For __init__, the self object is a new object created for you by the Python run-time system. For most methods of a class, that first argument (called self by convention, this in other languages) is the object on which the method was invoked.
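The difference between the proxy and the object that actually reaches __init__ can be inspected directly. In this sketch the attribute names checks and initialized are hypothetical. It suggests that super() is not self, while the method fetched through it is already bound to self.

```python
class SuperClass:
    def __init__(self):
        self.initialized = True

class SubClass(SuperClass):
    def __init__(self):
        proxy = super()          # the proxy object, not self
        bound = proxy.__init__   # lookup through the proxy gives a bound method
        self.checks = {
            "proxy_is_self": proxy is self,
            "bound_self_is_self": bound.__self__ is self,
            "bound_func_is_parent_init": bound.__func__ is SuperClass.__init__,
        }
        bound()                  # same as SuperClass.__init__(self)

sc = SubClass()
print(sc.checks)
print(sc.initialized)  # True
```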
This means that in the line super().__init__(), super() is returning the current object which is then implicitly passed to the superclass' __init__() method. Is this accurate or am I missing something here?
>>> help(super)
super() -> same as super(__class__, <first argument>)
super call returns a proxy/wrapper object which remembers:
The instance invoking super()
The class of the calling object
The class that's invoking super()
This is perfectly sound. super always fetches the attribute from the next class in the hierarchy (really the MRO) that has the attribute you're looking for. So it's not returning the current object; more accurately, it returns an object that remembers enough information to search for attributes higher in the class hierarchy.
What exactly is being passed to __init__() and how? We are calling it on super() so whatever this is returning should be getting passed to the __init__() method from what I understand about Python so far.
You're almost right. But super loves to play tricks on us. The super class defines __getattribute__, the method responsible for attribute search. When you do something like super().y(), super.__getattribute__ gets called to search for y. Once it finds y, it passes the instance that's invoking the super call to y. Also, super has a __get__ method, which makes it a descriptor. I'll omit the details of descriptors here; refer to the documentation to learn more. This answers your second question as well, as to why self isn't passed explicitly.
*Note: super is a little bit different and relies on some magic. Almost for all other classes, the behavior is the same. That is:
a = A() # A is a class
a.y() # same as A.y(a), self is a
But super is different:
class A:
    def y(self):
        return self

class B(A):
    def y(self):
        return super().y()  # equivalent to: A.y(self)

b = B()
b.y() is b  # True: returns b not super(), self is b not super()
I wrote a simple test to investigate what CPython does for super:
class A:
    pass

class B(A):
    def f(self):
        return super()
    @classmethod
    def g(cls):
        return super()
    def h(selfish):
        selfish = B()
        return super()

class C(B):
    pass

c = C()
for method in 'fgh':
    super_object = getattr(c, method)()
    print(super_object, super_object.__self__, super_object.__self_class__, super_object.__thisclass__)  # (These attributes were found using dir.)
The zero-argument super call returns an object that stores three things:
__self__ stores the object whose name matches the first parameter of the method—even if that name has been reassigned.
__self_class__ stores its type, or itself in the case of a class method.
__thisclass__ stores the class in which the method is defined.
(It is unfortunate that __thisclass__ was implemented this way rather than fetching an attribute on the method because it makes it impossible to use the zero-argument form of super with meta-programming.)
The object returned by super implements __getattribute__, which forwards method calls to the type found in the __mro__ of __self_class__ one step after __thisclass__.
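The "one step after __thisclass__ in the __mro__ of __self_class__" rule is easiest to see with a diamond. The class names Base, Left, Right and Bottom are hypothetical. In this sketch, super() inside Left forwards to Right, a sibling rather than a parent, because the instance is a Bottom.

```python
class Base:
    def who(self):
        return ["Base"]

class Left(Base):
    def who(self):
        return ["Left"] + super().who()

class Right(Base):
    def who(self):
        return ["Right"] + super().who()

class Bottom(Left, Right):
    def who(self):
        return ["Bottom"] + super().who()

print([c.__name__ for c in Bottom.__mro__])
# ['Bottom', 'Left', 'Right', 'Base', 'object']
print(Bottom().who())  # ['Bottom', 'Left', 'Right', 'Base']
print(Left().who())    # ['Left', 'Base']
```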

Why should classes with __get__ or __set__ know who uses them?

I just read about descriptors and it felt very unintentional that the behavior of a class can depend on who uses it. The two methods
__get__(self, instance, owner)
__set__(self, instance, value)
do exactly that. They are passed the instance of the class that uses them. What is the reason for this design decision? How is it used?
Update: I think of descriptors as normal types. The class that uses them as a member type can be easily manipulated by side effects of the descriptor. Here is an example of what I mean. Why does Python support that?
class Age(object):
    def __init__(self, value):
        self.value = value
    def __get__(self, instance, owner):
        instance.name = 'You got manipulated'
        return self.value

class Person(object):
    age = Age(42)
    name = 'Peter'

peter = Person()
print(peter.name, 'is', peter.age)
__get__ and __set__ receive no information about who's calling them. The 3 arguments are the descriptor object itself, the object whose attribute is being accessed, and the type of the object.
I think the best way to clear this up is with an example. So, here's one:
class Foo:
    def descriptor(self):
        return

foo_instance = Foo()
method_object = foo_instance.descriptor
Functions are descriptors. When you access an object's method, the method object is created by finding the function that implements the method and calling __get__. Here,
method_object = foo_instance.descriptor
calls descriptor.__get__(foo_instance, Foo) to create the method_object. The __get__ method receives no information about who's calling it, only the information needed to perform its task of attribute access.
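The implicit __get__ call can be reproduced by hand. In this sketch the variable names func, manual and automatic are hypothetical. The function is fetched from the class dict so that no binding happens, and then bound explicitly.

```python
class Foo:
    def descriptor(self):
        return self

foo_instance = Foo()

func = Foo.__dict__["descriptor"]        # the plain function, no binding
manual = func.__get__(foo_instance, Foo) # what attribute access does for us
automatic = foo_instance.descriptor

print(type(func).__name__)       # function
print(type(manual).__name__)     # method
print(manual == automatic)       # True
print(manual() is foo_instance)  # True: the instance arrives as self
```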
Descriptors are used to implement binding behaviour; a descriptor requires a context, the object on which they act.
That object is the instance object passed in.
Note that without a descriptor, attribute access on an object acts directly on the object attributes (the instance __dict__ when setting or deleting, otherwise the class and base classes attributes are searched as well).
A descriptor lets you delegate that access to a separate object entirely, encapsulating getting, setting and deleting. But to be able to do so, that object needs access to the context, the instance. Because getting an attribute also normally searches the class and its bases, the __get__ descriptor method is also passed the class (owner) of the instance.
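As an illustration of why the instance is needed, here is a minimal sketch of a validating descriptor. The names Positive, storage and Person are hypothetical. One descriptor object lives on the class, and the instance argument is what lets it keep a separate value for each object.

```python
class Positive:
    def __set_name__(self, owner, name):
        # Called once at class creation; remember where to store the value.
        self.storage = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:  # accessed on the class itself
            return self
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage, value)

class Person:
    age = Positive()  # a single descriptor shared by all Person objects

    def __init__(self, age):
        self.age = age  # routed through Positive.__set__

p, q = Person(30), Person(40)
print(p.age, q.age)                              # 30 40
print(Person.age is Person.__dict__["age"])      # True
```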
Take functions, for example. A function is a descriptor too, and binding them to an instance produces a method. A class can have any number of instances, but it makes little sense to store bound methods on all those instances when you create the instance, that would be wasteful.
Instead, functions are bound dynamically; you look up the function name on the instance, the function is found on the class instead, and with a call to __get__ the function is bound to the instance, returning a method object. This method object can then pass in the instance to the function when called, producing the self argument.
An example of the descriptor protocol in action is bound methods. When you access an instance method o.foo you can either call it immediately or save it into a variable: a = o.foo. Now, when you call a(x, y, z) the instance o is passed to foo as the first self parameter:
class C(object):
    def foo(self, x, y, z):
        print(self, x, y, z)

o = C()
a = o.foo
a(1, 2, 3)  # prints <__main__.C object at 0x...> 1 2 3
This works because functions implement the descriptor protocol; when you __get__ a function on an object instance it returns a bound method, with the instance bound to the function.
There would be no way for the above to work without the descriptor protocol giving access to the object instance.
