Why am I forced to use threading.Thread.__init__(self) or super(ClassName, self).__init__() when I create a threading.Thread Class?
For example:
import threading

class Threader(threading.Thread):
    def __init__(self, _fp, _q):
        threading.Thread.__init__(self)
        self.path = _fp
        self.queue = _q

    def run(self):
        pass  # Do stuff
or
class Threader(threading.Thread):
    def __init__(self, _fp, _q):
        super(Threader, self).__init__()
        self.path = _fp
        self.queue = _q

    def run(self):
        pass  # Do stuff
Both methods work, and do roughly the same thing. However, if I remove either .__init__() call, thread.start() fails with: thread.__init__() not called.
Shouldn't defining my own def __init__() "replace" the .__init__() method?
I've read this other SO post, and it matched what I thought, but I still get the same error.
Consider this simplified example:
class dog:
    def __init__(self):
        self.legs = 4
        self.sound = 'woof'

class chihuahua(dog):
    def __init__(self):
        self.sound = 'yip'
        # what's missing here?
We've created a subclass of dog, called chihuahua. A user of this class would reasonably expect it to behave like a dog in all default aspects, except the specific one that we have overridden (the sound it makes). But note that, as you have pointed out, the new subclass __init__ replaces the base class __init__. Completely replaces. Unlike C++, the base-class initialization code is not automatically called when a subclass instance is created. Therefore, the line self.legs = 4 never gets run when you create a chihuahua(). As a result, this type of dog is running around without any idea how many legs it has. Hence you could argue it is not a fully-functioning dog, and you shouldn't be surprised if it falls over while trying to perform complex tricks.
As subclass designer you have two options to fix this. The first is to reimplement the self.legs = 4 line explicitly in the subclass. Well, that'll work fine in this example, but it's not a great option in general because it violates the DRY principle even in cases where you do know exactly what code to write and how to maintain it. And in more complex examples (like your Thread subclass), you presumably won't know. Second option: explicitly call the superclass initializer and let it do its thing.
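To make the second option concrete, here is a minimal sketch of the same dog example with and without the superclass call. The class names BrokenChihuahua and Chihuahua are only for illustration.

```python
class Dog:
    def __init__(self):
        self.legs = 4
        self.sound = "woof"


class BrokenChihuahua(Dog):
    def __init__(self):
        # The base initializer is skipped, so self.legs is never set.
        self.sound = "yip"


class Chihuahua(Dog):
    def __init__(self):
        super().__init__()  # let Dog set up legs and the default sound
        self.sound = "yip"  # then override only what differs


broken = BrokenChihuahua()
fixed = Chihuahua()
print(hasattr(broken, "legs"))
print(fixed.legs, fixed.sound)
```

The subclass never has to know which attributes Dog sets up; it only has to give Dog the chance to set them.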
Defining your own __init__ overrides the base class. But what about all the work the base __init__ does to make the thread runnable? All variables and state that it would normally create are missing. Unless you hack all of that in yourself (and why do that?) the thread is of course completely unrunnable.
Not all classes need an __init__ of course, but the vast majority do. Even for the ones that don't, calling __init__ is harmless - it just goes to object.__init__ and future-proofs the child class in the event an implementer decides an __init__ is useful after all.
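A small sketch of what that looks like for threads, assuming CPython's threading module. BrokenWorker, Worker and try_start are made-up names for this example.

```python
import threading


class BrokenWorker(threading.Thread):
    def __init__(self, path):
        # Thread.__init__ is never called, so the thread's internal
        # state is never created.
        self.path = path

    def run(self):
        pass


class Worker(threading.Thread):
    def __init__(self, path):
        super().__init__()  # let Thread set up its own state first
        self.path = path
        self.result = None

    def run(self):
        self.result = self.path.upper()


def try_start(thread):
    """Start the thread and report whether that was possible."""
    try:
        thread.start()
    except RuntimeError as exc:
        return "failed: " + str(exc)
    thread.join()
    return "ok"


broken_outcome = try_start(BrokenWorker("a.txt"))
working = Worker("a.txt")
working_outcome = try_start(working)
print(broken_outcome)
print(working_outcome, working.result)
```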
According to the Python docs, super() "is useful for accessing inherited methods that have been overridden in a class."
I understand that super refers to the parent class and it lets you access parent methods. My question is why do people always use super inside the init method of the child class? I have seen it everywhere. For example:
class Person:
    def __init__(self, name):
        self.name = name

class Employee(Person):
    def __init__(self, **kwargs):
        super().__init__(name=kwargs['name'])  # Here super is being used

    def first_letter(self):
        return self.name[0]

e = Employee(name="John")
print(e.first_letter())
I can accomplish the same without super and without even an init method:
class Person:
    def __init__(self, name):
        self.name = name

class Employee(Person):
    def first_letter(self):
        return self.name[0]

e = Employee(name="John")
print(e.first_letter())
Are there drawbacks with the latter code? It looks so much cleaner to me. I don't even have to use the boilerplate **kwargs and kwargs['argument'] syntax.
I am using Python 3.8.
Edit: Here's another Stack Overflow question with code from different people who use super in the child's init method. I don't understand why. My best guess is there's something new in Python 3.8.
The child might want to do something different or more likely additional to what the super class does - in this case the child must have an __init__.
Calling super’s init means that you don’t have to copy/paste (with all the implications for maintenance) that init in the child’s class, which otherwise would be needed if you wanted some additional code in the child init.
But note there are complications about using super’s init if you use multiple inheritance (e.g. which super gets called) and this needs care. Personally I avoid multiple inheritance and keep inheritance to a minimum anyway - it’s easy to get tempted into creating multiple levels of inheritance/class hierarchy but my experience is that a ‘keep it simple’ approach is usually much better.
The potential drawback to the latter code is that there is no __init__ method within the Employee class. Since there is none, the __init__ method of the parent class is called. However, as soon as an __init__ method is added to the Employee class (maybe there's some Employee-specific attribute that needs to be initialized, like an id_number), the __init__ method of the parent class is overridden and not called (unless super().__init__() is called), and then an Employee will not have a name attribute.
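Here is a rough sketch of that scenario. The id_number parameter and the EmployeeWithoutSuper class are hypothetical, added only to show the failure.

```python
class Person:
    def __init__(self, name):
        self.name = name


class EmployeeWithoutSuper(Person):
    def __init__(self, name, id_number):
        # Person.__init__ is overridden and never called: no self.name.
        self.id_number = id_number


class Employee(Person):
    def __init__(self, name, id_number):
        super().__init__(name)  # Person still sets self.name
        self.id_number = id_number

    def first_letter(self):
        return self.name[0]


incomplete = EmployeeWithoutSuper("John", 7)
complete = Employee("John", 7)
print(hasattr(incomplete, "name"))
print(complete.first_letter(), complete.id_number)
```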
The correct way to use super here is for both methods to use super. You cannot assume that Person is the last (or at least, next-to-last, before object) class in the MRO.
class Person:
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.name = name

class Employee(Person):
    # Optional, since Employee.__init__ does nothing
    # except pass the exact same arguments "upstream"
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def first_letter(self):
        return self.name[0]
Consider a class definition like
class Bar:
    ...

class Foo(Person, Bar):
    ...
The MRO for Foo looks like [Foo, Person, Bar, object]; the call to super().__init__ inside Person.__init__ would call Bar.__init__, not object.__init__, and Person has no way of knowing if values in **kwargs are meant for Bar, so it must pass them on.
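A runnable sketch of that situation, assuming Bar takes a keyword argument of its own (bar_value is a made-up name for this example):

```python
class Person:
    def __init__(self, name, **kwargs):
        # Person cannot tell who the leftover kwargs are for, so it
        # forwards them to the next class in the MRO.
        super().__init__(**kwargs)
        self.name = name


class Bar:
    def __init__(self, bar_value, **kwargs):
        super().__init__(**kwargs)
        self.bar_value = bar_value


class Foo(Person, Bar):
    pass


foo = Foo(name="John", bar_value=42)
mro_names = [cls.__name__ for cls in Foo.__mro__]
print(mro_names)
print(foo.name, foo.bar_value)
```

If Person.__init__ did not forward **kwargs, bar_value would never reach Bar.__init__ and constructing Foo would fail.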
I want to do something like the following (in Python 3.7):
class Animal:
    def __init__(self, name, legs):
        self.legs = legs
        print(name)

    @classmethod
    def with_two_legs(cls, name):
        # extremely long code to generate name_full from name
        name_full = name
        return cls(name_full, 2)

class Human(Animal):
    def __init__(self):
        super().with_two_legs('Human')

john = Human()
Basically, I want to override the __init__ method of a child class with a factory classmethod of the parent. The code as written, however, does not work, and raises:
TypeError: __init__() takes 1 positional argument but 3 were given
I think this means that super().with_two_legs('Human') passes Human as the cls variable.
1) Why doesn't this work as written? I assumed super() would return a proxy instance of the superclass, so cls would be Animal right?
2) Even if this was the case I don't think this code achieves what I want, since the classmethod returns an instance of Animal, but I just want to initialize Human in the same way classmethod does, is there any way to achieve the behaviour I want?
I hope this is not a very obvious question, I found the documentation on super() somewhat confusing.
super().with_two_legs('Human') does in fact call Animal's with_two_legs, but it passes Human as the cls, not Animal. super() makes the proxy object only to assist with method lookup, it doesn't change what gets passed (it's still the same self or cls it originated from). In this case, super() isn't even doing anything useful, because Human doesn't override with_two_legs, so:
super().with_two_legs('Human')
means "call with_two_legs from the first class above Human in the hierarchy which defines it", and:
cls.with_two_legs('Human')
means "call with_two_legs on the first class in the hierarchy starting with cls that defines it". As long as no class below Animal defines it, those do the same thing.
This means your code breaks at return cls(name_full, 2), because cls is still Human, and your Human.__init__ doesn't take any arguments beyond self. Even if you futzed around to make it work (e.g. by adding two optional arguments that you ignore), this would cause an infinite loop, as Human.__init__ called Animal.with_two_legs, which in turn tried to construct a Human, calling Human.__init__ again.
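You can check this yourself with a small sketch. The seen_cls list is only there to record what the classmethod receives; it is not part of the original code.

```python
seen_cls = []  # records which class the classmethod was handed


class Animal:
    def __init__(self, name, legs):
        self.name = name
        self.legs = legs

    @classmethod
    def with_two_legs(cls, name):
        seen_cls.append(cls.__name__)
        return cls(name, 2)  # cls is whatever class the call started from


class Human(Animal):
    def __init__(self):
        # super() only affects where with_two_legs is looked up;
        # cls inside it is still Human.
        super().with_two_legs("Human")


try:
    Human()
    error = None
except TypeError as exc:
    error = exc

print(seen_cls)
print(type(error).__name__)
```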
What you're trying to do is not a great idea; alternate constructors, by their nature, depend on the core constructor/initializer for the class. If you try to make a core constructor/initializer that relies on an alternate constructor, you've created a circular dependency.
In this particular case, I'd recommend avoiding the alternate constructor, in favor of either explicitly providing the legs count always, or using an intermediate TwoLeggedAnimal class that performs the task of your alternate constructor. If you want to reuse code, the second option just means your "extremely long code to generate name_full from name" can go in TwoLeggedAnimal's __init__; in the first option, you'd just write a staticmethod that factors out that code so it can be used by both with_two_legs and other constructors that need to use it.
The class hierarchy would look something like:
class Animal:
    def __init__(self, name, legs):
        self.legs = legs
        print(name)

class TwoLeggedAnimal(Animal):
    def __init__(self, name):
        # extremely long code to generate name_full from name
        name_full = name
        super().__init__(name_full, 2)

class Human(TwoLeggedAnimal):
    def __init__(self):
        super().__init__('Human')
The common code approach would instead be something like:
class Animal:
    def __init__(self, name, legs):
        self.legs = legs
        print(name)

    @staticmethod
    def _make_two_legged_name(basename):
        # extremely long code to generate name_full from basename
        name_full = basename
        return name_full

    @classmethod
    def with_two_legs(cls, name):
        return cls(cls._make_two_legged_name(name), 2)

class Human(Animal):
    def __init__(self):
        super().__init__(self._make_two_legged_name('Human'), 2)
Side-note: What you were trying to do wouldn't work even if you worked around the recursion, because __init__ doesn't make new instances, it initializes existing instances. So even if you call super().with_two_legs('Human') and it somehow works, it's making and returning a completely different instance, but not doing anything to the self received by __init__ which is what's actually being created. The best you'd have been able to do is something like:
def __init__(self):
    self_template = super().with_two_legs('Human')
    # Cheaty way to copy all attributes from self_template to self, assuming no use
    # of __slots__
    vars(self).update(vars(self_template))
There is no way to call an alternate constructor in __init__ and have it change self implicitly. About the only way I can think of to make this work in the way you intended without creating helper methods and preserving your alternate constructor would be to use __new__ instead of __init__ (so you can return an instance created by another constructor), and doing awful things with the alternate constructor to explicitly call the top class's __new__ to avoid circular calling dependencies:
class Animal:
    def __new__(cls, name, legs):  # Use __new__ instead of __init__
        self = super().__new__(cls)  # Constructs base object
        self.legs = legs
        print(name)
        return self  # Returns initialized object

    @classmethod
    def with_two_legs(cls, name):
        # extremely long code to generate name_full from name
        name_full = name
        return Animal.__new__(cls, name_full, 2)  # Explicitly call Animal's __new__ using correct subclass

class Human(Animal):
    def __new__(cls):
        return super().with_two_legs('Human')  # Return result of alternate constructor
The proxy object you get from calling super was only used to locate the with_two_legs method to be called (and since you didn't override it in Human, you could have used self.with_two_legs for the same result).
As wim commented, your alternative constructor with_two_legs doesn't work because the Human class breaks the Liskov substitution principle by having a different constructor signature. Even if you could get the code to call Animal to build your instance, you'd have problems because you'd end up with an Animal instance and not a Human one (so other methods in Human, if you wrote some, would not be available).
Note that this situation is not that uncommon, many Python subclasses have different constructor signatures than their parent classes. But it does mean that you can't use one class freely in place of the other, as happens with a classmethod that tries to construct instances. You need to avoid those situations.
In this case, you are probably best served by using a default value for the legs argument to the Animal constructor. It can default to 2 legs if no alternative number is passed. Then you don't need the classmethod, and you don't run into problems when you override __init__:
class Animal:
    def __init__(self, name, legs=2):  # legs is now optional, defaults to 2
        self.legs = legs
        print(name)

class Human(Animal):
    def __init__(self):
        super().__init__('Human')

john = Human()
Why is super() used?
Is there a difference between using Base.__init__ and super().__init__?
class Base(object):
    def __init__(self):
        print "Base created"

class ChildA(Base):
    def __init__(self):
        Base.__init__(self)

class ChildB(Base):
    def __init__(self):
        super(ChildB, self).__init__()

ChildA()
ChildB()
super() lets you avoid referring to the base class explicitly, which can be nice. But the main advantage comes with multiple inheritance, where all sorts of fun stuff can happen. See the standard docs on super if you haven't already.
Note that the syntax changed in Python 3.0: you can just say super().__init__() instead of super(ChildB, self).__init__() which IMO is quite a bit nicer. The standard docs also refer to a guide to using super() which is quite explanatory.
I'm trying to understand super()
The reason we use super is so that child classes that may be using cooperative multiple inheritance will call the correct next parent class function in the Method Resolution Order (MRO).
In Python 3, we can call it like this:
class ChildB(Base):
    def __init__(self):
        super().__init__()
In Python 2, we were required to call super like this with the defining class's name and self, but we'll avoid this from now on because it's redundant, slower (due to the name lookups), and more verbose (so update your Python if you haven't already!):
super(ChildB, self).__init__()
Without super, you are limited in your ability to use multiple inheritance because you hard-wire the next parent's call:
Base.__init__(self) # Avoid this.
I further explain below.
"What difference is there actually in this code?:"
class ChildA(Base):
    def __init__(self):
        Base.__init__(self)

class ChildB(Base):
    def __init__(self):
        super().__init__()
The primary difference in this code is that in ChildB you get a layer of indirection in the __init__ with super, which uses the class in which it is defined to determine the next class's __init__ to look up in the MRO.
I illustrate this difference in an answer at the canonical question, How to use 'super' in Python?, which demonstrates dependency injection and cooperative multiple inheritance.
If Python didn't have super
Here's code that's actually closely equivalent to super (how it's implemented in C, minus some checking and fallback behavior, and translated to Python):
class ChildB(Base):
    def __init__(self):
        mro = type(self).mro()
        check_next = mro.index(ChildB) + 1  # next after *this* class.
        while check_next < len(mro):
            next_class = mro[check_next]
            if '__init__' in next_class.__dict__:
                next_class.__init__(self)
                break
            check_next += 1
Written a little more like native Python:
class ChildB(Base):
    def __init__(self):
        mro = type(self).mro()
        for next_class in mro[mro.index(ChildB) + 1:]:  # slice to end
            if hasattr(next_class, '__init__'):
                next_class.__init__(self)
                break
If we didn't have the super object, we'd have to write this manual code everywhere (or recreate it!) to ensure that we call the proper next method in the Method Resolution Order!
How does super do this in Python 3 without being told explicitly which class and instance from the method it was called from?
It gets the calling stack frame, and finds the class (implicitly stored as a local free variable, __class__, making the calling function a closure over the class) and the first argument to that function, which should be the instance or class that informs it which Method Resolution Order (MRO) to use.
Since it requires that first argument for the MRO, using super with static methods is impossible as they do not have access to the MRO of the class from which they are called.
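Both points can be observed from Python itself. This is just a sketch against CPython; Demo, Base and Child are throwaway names for the example.

```python
class Demo:
    def uses_super(self):
        return super().__repr__()

    def plain(self):
        return "no super here"


class Base:
    @staticmethod
    def greet():
        return "hello"


class Child(Base):
    @staticmethod
    def greet():
        # A static method has no first argument, so zero-argument
        # super() has nothing to work with.
        return super().greet()


# Methods that mention super() close over the implicit __class__ cell.
freevars = Demo.uses_super.__code__.co_freevars
plain_has_closure = Demo.plain.__closure__ is not None

try:
    Child.greet()
    static_error = None
except RuntimeError as exc:
    static_error = exc

print(freevars)
print(plain_has_closure)
print(type(static_error).__name__)
```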
Criticisms of other answers:
super() lets you avoid referring to the base class explicitly, which can be nice. But the main advantage comes with multiple inheritance, where all sorts of fun stuff can happen. See the standard docs on super if you haven't already.
It's rather hand-wavey and doesn't tell us much, but the point of super is not to avoid writing the parent class. The point is to ensure that the next method in line in the method resolution order (MRO) is called. This becomes important in multiple inheritance.
I'll explain here.
class Base(object):
    def __init__(self):
        print("Base init'ed")

class ChildA(Base):
    def __init__(self):
        print("ChildA init'ed")
        Base.__init__(self)

class ChildB(Base):
    def __init__(self):
        print("ChildB init'ed")
        super().__init__()
And let's create a dependency that we want to be called after the Child:
class UserDependency(Base):
    def __init__(self):
        print("UserDependency init'ed")
        super().__init__()
Now remember, ChildB uses super, ChildA does not:
class UserA(ChildA, UserDependency):
    def __init__(self):
        print("UserA init'ed")
        super().__init__()

class UserB(ChildB, UserDependency):
    def __init__(self):
        print("UserB init'ed")
        super().__init__()
And UserA does not call the UserDependency method:
>>> UserA()
UserA init'ed
ChildA init'ed
Base init'ed
<__main__.UserA object at 0x0000000003403BA8>
But UserB does in fact call UserDependency because ChildB invokes super:
>>> UserB()
UserB init'ed
ChildB init'ed
UserDependency init'ed
Base init'ed
<__main__.UserB object at 0x0000000003403438>
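The order above is simply the MRO of UserB. As a compact sketch, the same classes can record their calls in a list (the calls list is my own addition) so the order can be compared with UserB.__mro__ directly:

```python
calls = []  # records the order in which the initializers run


class Base:
    def __init__(self):
        calls.append("Base")


class ChildB(Base):
    def __init__(self):
        calls.append("ChildB")
        super().__init__()


class UserDependency(Base):
    def __init__(self):
        calls.append("UserDependency")
        super().__init__()


class UserB(ChildB, UserDependency):
    def __init__(self):
        calls.append("UserB")
        super().__init__()


UserB()
mro_names = [cls.__name__ for cls in UserB.__mro__]
print(calls)
print(mro_names)
```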
Criticism for another answer
In no circumstance should you do the following, which another answer suggests, as you'll definitely get errors when you subclass ChildB:
super(self.__class__, self).__init__() # DON'T DO THIS! EVER.
(That answer is not clever or particularly interesting, but in spite of direct criticism in the comments and over 17 downvotes, the answerer persisted in suggesting it until a kind editor fixed his problem.)
Explanation: Using self.__class__ as a substitute for the class name in super() will lead to recursion. super lets us look up the next parent in the MRO (see the first section of this answer) for child classes. If you tell super we're in the child instance's method, it will then lookup the next method in line (probably this one) resulting in recursion, probably causing a logical failure (in the answerer's example, it does) or a RuntimeError when the recursion depth is exceeded.
>>> class Polygon(object):
...     def __init__(self, id):
...         self.id = id
...
>>> class Rectangle(Polygon):
...     def __init__(self, id, width, height):
...         super(self.__class__, self).__init__(id)
...         self.shape = (width, height)
...
>>> class Square(Rectangle):
...     pass
...
>>> Square('a', 10, 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
TypeError: __init__() missing 2 required positional arguments: 'width' and 'height'
Python 3's new super() calling method with no arguments fortunately allows us to sidestep this issue.
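For comparison, a sketch of the broken and the working form side by side. The Broken* class names are mine, used only to keep both versions in one file.

```python
class Polygon:
    def __init__(self, id):
        self.id = id


class BrokenRectangle(Polygon):
    def __init__(self, id, width, height):
        # For a subclass instance, self.__class__ is the subclass, so
        # this looks up the class *after the subclass* in the MRO,
        # which is BrokenRectangle itself.
        super(self.__class__, self).__init__(id)
        self.shape = (width, height)


class BrokenSquare(BrokenRectangle):
    pass


class Rectangle(Polygon):
    def __init__(self, id, width, height):
        super().__init__(id)  # always "the class after Rectangle"
        self.shape = (width, height)


class Square(Rectangle):
    pass


try:
    BrokenSquare("a", 10, 10)
    broken_error = None
except TypeError as exc:
    broken_error = exc

square = Square("a", 10, 10)
print(type(broken_error).__name__)
print(square.id, square.shape)
```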
It's been noted that in Python 3.0+ you can use
super().__init__()
to make your call, which is concise and does not require you to reference the parent OR class names explicitly, which can be handy. I just want to add that for Python 2.7 or under, some people implement a name-insensitive behaviour by writing self.__class__ instead of the class name, i.e.
super(self.__class__, self).__init__() # DON'T DO THIS!
HOWEVER, this breaks calls to super for any classes that inherit from your class, where self.__class__ could return a child class. For example:
class Polygon(object):
    def __init__(self, id):
        self.id = id

class Rectangle(Polygon):
    def __init__(self, id, width, height):
        super(self.__class__, self).__init__(id)
        self.shape = (width, height)

class Square(Rectangle):
    pass
Here I have a class Square, which is a sub-class of Rectangle. Say I don't want to write a separate constructor for Square because the constructor for Rectangle is good enough, but for whatever reason I want to implement a Square so I can reimplement some other method.
When I create a Square using mSquare = Square('a', 10,10), Python calls the constructor for Rectangle because I haven't given Square its own constructor. However, in the constructor for Rectangle, the call super(self.__class__,self) is going to return the superclass of mSquare, so it calls the constructor for Rectangle again. This is how the infinite loop happens, as was mentioned by @S_C. In this case, when I run super(...).__init__() I am calling the constructor for Rectangle but since I give it no arguments, I will get an error.
Super has no side effects
Base = ChildB
Base()
works as expected
Base = ChildA
Base()
gets into infinite recursion.
Just a heads up... with Python 2.7, and I believe ever since super() was introduced in version 2.2, you can only call super() if one of the parents inherits from a class that eventually inherits object (new-style classes).
Personally, as for python 2.7 code, I'm going to continue using BaseClassName.__init__(self, args) until I actually get the advantage of using super().
There isn't, really. super() looks at the next class in the MRO (method resolution order, accessed with cls.__mro__) to call the methods. Just calling the base __init__ calls the base __init__. As it happens, the MRO here has exactly one item after the child: the base. So you're really doing the exact same thing, but in a nicer way with super() (particularly if you get into multiple inheritance later).
The main difference is that ChildA.__init__ will unconditionally call Base.__init__ whereas ChildB.__init__ will call __init__ in whatever class happens to be ChildB's ancestor in self's line of ancestors (which may differ from what you expect).
If you add a ChildC that uses multiple inheritance:
class Mixin(Base):
    def __init__(self):
        print "Mixin stuff"
        super(Mixin, self).__init__()

class ChildC(ChildB, Mixin):  # Mixin is now between ChildB and Base
    pass

ChildC()
help(ChildC)  # shows that the Method Resolution Order is ChildC->ChildB->Mixin->Base
then Base is no longer the parent of ChildB for ChildC instances. Now super(ChildB, self) will point to Mixin if self is a ChildC instance.
You have inserted Mixin in between ChildB and Base. And you can take advantage of it with super()
So if you have designed your classes so that they can be used in a Cooperative Multiple Inheritance scenario, you use super because you don't really know who is going to be the ancestor at runtime.
The "super considered super" post and the accompanying PyCon 2015 video explain this pretty well.
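If you want to see the "next class depends on the instance" effect directly, here is a Python 3 sketch of the same hierarchy with the prints removed. It asks super(ChildB, instance) which __init__ it resolves to.

```python
class Base:
    def __init__(self):
        pass


class ChildB(Base):
    def __init__(self):
        super().__init__()


class Mixin(Base):
    def __init__(self):
        super().__init__()


class ChildC(ChildB, Mixin):  # Mixin sits between ChildB and Base
    pass


# The function that "the class after ChildB" provides, for each kind of instance.
after_childb_plain = super(ChildB, ChildB()).__init__.__func__
after_childb_in_c = super(ChildB, ChildC()).__init__.__func__

print(after_childb_plain is Base.__init__)
print(after_childb_in_c is Mixin.__init__)
```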
I was looking into Python's super method and multiple inheritance. I read along something like when we use super to call a base method which has implementation in all base classes, only one class' method will be called even with variety of arguments. For example,
class Base1(object):
    def __init__(self, a):
        print "In Base 1"

class Base2(object):
    def __init__(self):
        print "In Base 2"

class Child(Base1, Base2):
    def __init__(self):
        super(Child, self).__init__('Intended for base 1')
        super(Child, self).__init__()  # Intended for base 2
This produces TypeError for the first super method. super would call whichever method implementation it first recognizes and gives TypeError instead of checking for other classes down the road. However, this will be much more clear and work fine when we do the following:
class Child(Base1, Base2):
    def __init__(self):
        Base1.__init__(self, 'Intended for base 1')
        Base2.__init__(self)  # Intended for base 2
This leads to two questions:
Is __init__ method a static method or a class method?
Why use super, which implicitly chooses the method on its own, rather than an explicit call to the method like the latter example? It looks a lot cleaner than using super to me. So what is the advantage of using super over the second way (other than writing the base class name with the method call)?
super() in the face of multiple inheritance, especially on methods that are present on object can get a bit tricky. The general rule is that if you use super, then every class in the hierarchy should use super. A good way to handle this for __init__ is to make every method take **kwargs, and always use keyword arguments everywhere. By the time the call to object.__init__ occurs, all arguments should have been popped out!
class Base1(object):
    def __init__(self, a, **kwargs):
        print "In Base 1", a
        # Pass the remaining keyword arguments on to the next class in the MRO
        super(Base1, self).__init__(**kwargs)

class Base2(object):
    def __init__(self, **kwargs):
        print "In Base 2"
        super(Base2, self).__init__(**kwargs)

class Child(Base1, Base2):
    def __init__(self, **kwargs):
        super(Child, self).__init__(a="Something for Base1")
See the linked article for way more explanation of how this works and how to make it work for you!
Edit: At the risk of answering two questions, "Why use super at all?"
We have super() for many of the same reasons we have classes and inheritance, as a tool for modularizing and abstracting our code. When operating on an instance of a class, you don't need to know all of the gritty details of how that class was implemented, you only need to know about its methods and attributes, and how you're meant to use that public interface for the class. In particular, you can be confident that changes in the implementation of a class can't cause you problems as a user of its instances.
The same argument holds when deriving new types from base classes. You don't want or need to worry about how those base classes were implemented. Here's a concrete example of how not using super might go wrong. Suppose you've got:
class Foo(object):
    def frob(self):
        print "frobbing as a foo"

class Bar(object):
    def frob(self):
        print "frobbing as a bar"
and you make a subclass:
class FooBar(Foo, Bar):
    def frob(self):
        Foo.frob(self)
        Bar.frob(self)
Everything's fine, but then you realize that when you get down to it, Foo really is a kind of Bar, so you change it:
class Foo(Bar):
    def frob(self):
        print "frobbing as a foo"
        Bar.frob(self)
Which is all fine, except that in your derived class, FooBar.frob() calls Bar.frob() twice.
This is the exact problem super() solves, it protects you from calling superclass implementations more than once (when used as directed...)
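A Python 3 sketch of the difference, recording calls in a list instead of printing. The calls list and the ExplicitFooBar class are my additions for comparison.

```python
calls = []  # records each frob implementation as it runs


class Bar:
    def frob(self):
        calls.append("bar")


class Foo(Bar):
    def frob(self):
        calls.append("foo")
        super().frob()  # next implementation in the instance's MRO


class FooBar(Foo, Bar):
    def frob(self):
        super().frob()  # one call; the chain reaches Foo, then Bar


class ExplicitFooBar(Foo, Bar):
    def frob(self):
        Foo.frob(self)  # already reaches Bar.frob through Foo
        Bar.frob(self)  # so Bar.frob runs a second time


FooBar().frob()
cooperative_calls = list(calls)

calls.clear()
ExplicitFooBar().frob()
explicit_calls = list(calls)

print(cooperative_calls)
print(explicit_calls)
```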
As for your first question, __init__ is neither a staticmethod nor a classmethod; it is an ordinary instance method. (That is, it receives the instance as its first argument.)
As for your second question, if you want to explicitly call multiple base class implementations, then doing it explicitly as you did is indeed the only way. However, you seem to be misunderstanding how super works. When you call super, it does not "know" if you have already called it. Both of your calls to super(Child, self).__init__ call the Base1 implementation, because that is the "nearest parent" (the most immediate superclass of Child).
You would use super if you want to call just this immediate superclass implementation. You would do this if that superclass was also set up to call its superclass, and so on. The way to use super is to have each class call only the next implementation "up" in the class hierarchy, so that the sequence of super calls overall calls everything that needs to be called, in the right order. This type of setup is often called "cooperative inheritance", and you can find various articles about it online, including here and here.
I'm enhancing an existing class that does some calculations in the __init__ function to determine the instance state. Is it ok to call __init__() from __setstate__() in order to reuse those calculations?
To summarize reactions from Kroltan and jonsrharpe:
Technically it is OK
Technically it will work and if you do it properly, it can be considered OK.
Practically it is tricky, avoid that
If you edit the code in the future and touch __init__, it is easy (even for you) to forget that __setstate__ also uses it, and then you end up in a difficult-to-debug situation (asking yourself where the problem comes from).
class Calculator():
    def __init__(self):
        pass  # some calculation stuff here

    def __setstate__(self, state):
        self.__init__()
The calculation stuff is better to get isolated into another shared method:
class Calculator():
    def __init__(self):
        self._shared_calculation()

    def __setstate__(self, state):
        self._shared_calculation()

    def _shared_calculation(self):
        pass  # some calculation stuff here
This way the shared dependency is explicit, so you will notice it when you change the calculation.
Note: use of "_" as prefix for the shared method is arbitrary, you do not have to do that.
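A slightly fuller sketch of that layout, assuming the state is a plain dict and the calculation is a simple square (both are placeholders). simulate_unpickle imitates what unpickling does: it creates the object without calling __init__, then calls __setstate__.

```python
class Calculator:
    def __init__(self, base):
        self.base = base
        self._shared_calculation()

    def __getstate__(self):
        # Store only the input; the derived value is recomputed on load.
        return {"base": self.base}

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._shared_calculation()

    def _shared_calculation(self):
        # Placeholder for the real calculation.
        self.squared = self.base * self.base


def simulate_unpickle(state):
    """Rebuild a Calculator the way unpickling does: no __init__ call."""
    obj = Calculator.__new__(Calculator)
    obj.__setstate__(state)
    return obj


original = Calculator(4)
restored = simulate_unpickle(original.__getstate__())
print(restored.base, restored.squared)
```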
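It's usually preferable to let the pickling mechanism rebuild the object for you instead. Note that the arguments returned by __getnewargs__ are passed to __new__, and __init__ is normally not called on unpickling; if you want __init__ to run, define __reduce__ so that it returns the class and its constructor arguments.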
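One way to have the object rebuilt through its constructor is __reduce__, which returns a callable plus the arguments to call it with. A minimal sketch, assuming the constructor argument is kept on the instance (the Calculator class and the rebuild helper are illustrative only). Keep in mind that __getnewargs__ on its own feeds __new__, not __init__.

```python
import pickle


class Calculator:
    def __init__(self, base):
        self.base = base
        self.squared = base * base  # derived state computed in __init__

    def __reduce__(self):
        # Ask pickle to rebuild the object by calling Calculator(self.base),
        # which runs __init__ and recomputes the derived state.
        return (type(self), (self.base,))


def rebuild(obj):
    """Apply the recipe from __reduce__ by hand, as pickle would."""
    factory, args = obj.__reduce__()[:2]
    return factory(*args)


if __name__ == "__main__":
    clone = pickle.loads(pickle.dumps(Calculator(3)))
    print(clone.base, clone.squared)
```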
Another approach is to customize the constructor (__init__) in a subclass. Ideally, you keep one constructor in the base class and adapt it to your needs in the subclass:
class Person:
    def __init__(self, name, job=None, pay=0):
        self.name = name
        self.job = job
        self.pay = pay

class Manager(Person):
    def __init__(self, name, pay):
        Person.__init__(self, name, 'title', pay)  # Run constructor with 'title'
Calling superclass constructors this way turns out to be a very common coding pattern in Python. By itself, Python uses inheritance to look for and call only one __init__ method at construction time: the lowest one in the class tree.
If you need higher __init__ methods to be run at construction time, you must call them manually, usually through the superclass name as shown in the code above. This way you can augment the superclass constructor, or replace its logic in the subclass altogether.
As Jan suggested, calling __init__ from another method of the same class is tricky and can lead to hard-to-debug situations.