I'm enhancing an existing class that does some calculations in the __init__ function to determine the instance state. Is it ok to call __init__() from __getstate__() in order to reuse those calculations?
To summarize reactions from Kroltan and jonsrharpe:
Technically it is OK
Technically it will work and if you do it properly, it can be considered OK.
Practically it is tricky, avoid that
If you edit the code in future and touch __init__, then it is easy (even for you) to forget about use in __setstate__ and then you enter into difficult to debug situation (asking yourself, where it comes from).
class Calculator():
def __init__(self):
# some calculation stuff here
def __setstate__(self, state)
self.__init__()
The calculation stuff is better to get isolated into another shared method:
class Calculator():
def __init__(self):
self._shared_calculation()
def __setstate__(self, state)
self._shared_calculation()
def _shared_calculation(self):
#some calculation stuff here
This way you shall notice.
Note: use of "_" as prefix for the shared method is arbitrary, you do not have to do that.
It's usually preferable to write a method called __getnewargs__ instead. That way, the Pickling mechanism will call __init__ for you automatically.
Another approach is to Customize the constructor class __init__ in a subclass. Ideally it is better to have to one Constructor class & change according to your need in Subclass
class Person:
def __init__(self, name, job=None, pay=0):
self.name = name
self.job = job
self.pay = pay
class Manager(Person):
def __init__(self, name, pay):
Person.__init__(self, name, 'title', pay) # Run constructor with 'title'
Calling constructors class this way turns out to be a very common coding pattern in Python. By itself, Python uses inheritance to look for and call only one __init__ method at construction time—the lowest one in the class tree.
If you need higher __init__ methods to be run at construction time, you must call them manually, and usually through the superclass name as in shown in the code above. his way you augment the Superclass constructor & replace the logic in subclass altogether to your liking
As suggested by Jan it is tricky & you will enter difficult debug situation if you call it in same class
Related
According to Python docs super()
is useful for accessing inherited methods that have been overridden in
a class.
I understand that super refers to the parent class and it lets you access parent methods. My question is why do people always use super inside the init method of the child class? I have seen it everywhere. For example:
class Person:
def __init__(self, name):
self.name = name
class Employee(Person):
def __init__(self, **kwargs):
super().__init__(name=kwargs['name']) # Here super is being used
def first_letter(self):
return self.name[0]
e = Employee(name="John")
print(e.first_letter())
I can accomplish the same without super and without even an init method:
class Person:
def __init__(self, name):
self.name = name
class Employee(Person):
def first_letter(self):
return self.name[0]
e = Employee(name="John")
print(e.first_letter())
Are there drawbacks with the latter code? It looks so much cleanr to me. I don't even have to use the boilerplate **kwargs and kwargs['argument'] syntax.
I am using Python 3.8.
Edit: Here's another stackoverflow questions which has code from different people who are using super in the child's init method. I don't understand why. My best guess is there's something new in Python 3.8.
The child might want to do something different or more likely additional to what the super class does - in this case the child must have an __init__.
Calling super’s init means that you don’t have to copy/paste (with all the implications for maintenance) that init in the child’s class, which otherwise would be needed if you wanted some additional code in the child init.
But note there are complications about using super’s init if you use multiple inheritance (e.g. which super gets called) and this needs care. Personally I avoid multiple inheritance and keep inheritance to aminimum anyway - it’s easy to get tempted into creating multiple levels of inheritance/class hierarchy but my experience is that a ‘keep it simple’ approach is usually much better.
The potential drawback to the latter code is that there is no __init__ method within the Employee class. Since there is none, the __init__ method of the parent class is called. However, as soon as an __init__ method is added to the Employee class (maybe there's some Employee-specific attribute that needs to be initialized, like an id_number) then the __init__ method of the parent class is overridden and not called (unless super.__init__() is called) and then an Employee will not have a name attribute.
The correct way to use super here is for both methods to use super. You cannot assume that Person is the last (or at least, next-to-last, before object) class in the MRO.
class Person:
def __init__(self, name, **kwargs):
super().__init__(**kwargs)
self.name = name
class Employee(Person):
# Optional, since Employee.__init__ does nothing
# except pass the exact same arguments "upstream"
def __init__(self, **kwargs):
super().__init__(**kwargs)
def first_letter(self):
return self.name[0]
Consider a class definition like
class Bar:
...
class Foo(Person, Bar):
...
The MRO for Foo looks like [Foo, Person, Bar, object]; the call to super().__init__ inside Person.__init__ would call Bar.__init__, not object.__init__, and Person has no way of knowing if values in **kwargs are meant for Bar, so it must pass them on.
I want to do something like the following (in Python 3.7):
class Animal:
def __init__(self, name, legs):
self.legs = legs
print(name)
#classmethod
def with_two_legs(cls, name):
# extremely long code to generate name_full from name
name_full = name
return cls(name_full, 2)
class Human(Animal):
def __init__(self):
super().with_two_legs('Human')
john = Human()
Basically, I want to override the __init__ method of a child class with a factory classmethod of the parent. The code as written, however, does not work, and raises:
TypeError: __init__() takes 1 positional argument but 3 were given
I think this means that super().with_two_legs('Human') passes Human as the cls variable.
1) Why doesn't this work as written? I assumed super() would return a proxy instance of the superclass, so cls would be Animal right?
2) Even if this was the case I don't think this code achieves what I want, since the classmethod returns an instance of Animal, but I just want to initialize Human in the same way classmethod does, is there any way to achieve the behaviour I want?
I hope this is not a very obvious question, I found the documentation on super() somewhat confusing.
super().with_two_legs('Human') does in fact call Animal's with_two_legs, but it passes Human as the cls, not Animal. super() makes the proxy object only to assist with method lookup, it doesn't change what gets passed (it's still the same self or cls it originated from). In this case, super() isn't even doing anything useful, because Human doesn't override with_two_legs, so:
super().with_two_legs('Human')
means "call with_two_legs from the first class above Human in the hierarchy which defines it", and:
cls.with_two_legs('Human')
means "call with_two_legs on the first class in the hierarchy starting with cls that defines it". As long as no class below Animal defines it, those do the same thing.
This means your code breaks at return cls(name_full, 2), because cls is still Human, and your Human.__init__ doesn't take any arguments beyond self. Even if you futzed around to make it work (e.g. by adding two optional arguments that you ignore), this would cause an infinite loop, as Human.__init__ called Animal.with_two_legs, which in turn tried to construct a Human, calling Human.__init__ again.
What you're trying to do is not a great idea; alternate constructors, by their nature, depend on the core constructor/initializer for the class. If you try to make a core constructor/initializer that relies on an alternate constructor, you've created a circular dependency.
In this particular case, I'd recommend avoiding the alternate constructor, in favor of either explicitly providing the legs count always, or using an intermediate TwoLeggedAnimal class that performs the task of your alternate constructor. If you want to reuse code, the second option just means your "extremely long code to generate name_full from name" can go in TwoLeggedAnimal's __init__; in the first option, you'd just write a staticmethod that factors out that code so it can be used by both with_two_legs and other constructors that need to use it.
The class hierarchy would look something like:
class Animal:
def __init__(self, name, legs):
self.legs = legs
print(name)
class TwoLeggedAnimal(Animal)
def __init__(self, name):
# extremely long code to generate name_full from name
name_full = name
super().__init__(name_full, 2)
class Human(TwoLeggedAnimal):
def __init__(self):
super().__init__('Human')
The common code approach would instead be something like:
class Animal:
def __init__(self, name, legs):
self.legs = legs
print(name)
#staticmethod
def _make_two_legged_name(basename):
# extremely long code to generate name_full from name
return name_full
#classmethod
def with_two_legs(cls, name):
return cls(cls._make_two_legged_name(name), 2)
class Human(Animal):
def __init__(self):
super().__init__(self._make_two_legged_name('Human'), 2)
Side-note: What you were trying to do wouldn't work even if you worked around the recursion, because __init__ doesn't make new instances, it initializes existing instances. So even if you call super().with_two_legs('Human') and it somehow works, it's making and returning a completely different instance, but not doing anything to the self received by __init__ which is what's actually being created. The best you'd have been able to do is something like:
def __init__(self):
self_template = super().with_two_legs('Human')
# Cheaty way to copy all attributes from self_template to self, assuming no use
# of __slots__
vars(self).update(vars(self_template))
There is no way to call an alternate constructor in __init__ and have it change self implicitly. About the only way I can think of to make this work in the way you intended without creating helper methods and preserving your alternate constructor would be to use __new__ instead of __init__ (so you can return an instance created by another constructor), and doing awful things with the alternate constructor to explicitly call the top class's __new__ to avoid circular calling dependencies:
class Animal:
def __new__(cls, name, legs): # Use __new__ instead of __init__
self = super().__new__(cls) # Constructs base object
self.legs = legs
print(name)
return self # Returns initialized object
#classmethod
def with_two_legs(cls, name):
# extremely long code to generate name_full from name
name_full = name
return Animal.__new__(cls, name_full, 2) # Explicitly call Animal's __new__ using correct subclass
class Human(Animal):
def __new__(cls):
return super().with_two_legs('Human') # Return result of alternate constructor
The proxy object you get from calling super was only used to locate the with_two_legs method to be called (and since you didn't override it in Human, you could have used self.with_two_legs for the same result).
As wim commented, your alternative constructor with_two_legs doesn't work because the Human class breaks the Liskov substitution principle by having a different constructor signature. Even if you could get the code to call Animal to build your instance, you'd have problems because you'd end up with an Animal instances and not a Human one (so other methods in Human, if you wrote some, would not be available).
Note that this situation is not that uncommon, many Python subclasses have different constructor signatures than their parent classes. But it does mean that you can't use one class freely in place of the other, as happens with a classmethod that tries to construct instances. You need to avoid those situations.
In this case, you are probably best served by using a default value for the legs argument to the Animal constructor. It can default to 2 legs if no alternative number is passed. Then you don't need the classmethod, and you don't run into problems when you override __init__:
class Animal:
def __init__(self, name, legs=2): # legs is now optional, defaults to 2
self.legs = legs
print(name)
class Human(Animal):
def __init__(self):
super().__init__('Human')
john = Human()
Why am I forced to use threading.Thread.__init__(self) or super(ClassName, self).__init__() when I create a threading.Thread Class?
For example:
class Threader(threading.Thread):
def __init__(self, _fp, _q):
threading.Thread.__init__(self)
self.path = _fp
self.queue = _q
def run(self):
# Do stuff
or
class Threader(threading.Thread):
def __init__(self, _fp, _q):
super(Threader, self).__init__()
self.path = _fp
self.queue = _q
def run(self):
# Do stuff
Both methods work, and do roughly the same thing. However, if I remove either .__init__() methods, I receive in the stack: from thread.start(): thread.__init__() not called.
Shouldn't defining my own def __init__() "replace" the .__init__() method?
I've read this other SO post and that aligned with what I thought, get same stack error though.
Consider this simplified example:
class dog:
def __init__(self):
self.legs = 4
self.sound = 'woof'
class chihuahua(dog):
def __init__(self):
self.sound = 'yip'
# what's missing here?
We've created a subclass of dog, called chihuahua. A user of this class would reasonably expect it to behave like a dog in all default aspects, except the specific one that we have overridden (the sound it makes). But note that, as you have pointed out, the new subclass __init__ replaces the base class __init__. Completely replaces. Unlike C++, the base-class initialization code is not automatically called when a subclass instance is created. Therefore, the line self.legs = 4 never gets run when you create a chihuahua(). As a result, this type of dog is running around without any idea how many legs it has. Hence you could argue it is not a fully-functioning dog, and you shouldn't be surprised if it falls over while trying to perform complex tricks.
As subclass designer you have two options to fix this. The first is to reimplement the self.legs = 4 line explicitly in the subclass. Well, that'll work fine in this example, but it's not a great option in general because it violates the DRY principle even in cases where you do know exactly what code to write and how to maintain it. And in more complex examples (like your Thread subclass), you presumably won't know. Second option: explicitly call the superclass initializer and let it do its thing.
Defining your own __init__ overrides the base class. But what about all the work the base __init__ does to make the thread runnable? All variables and state that it would normally create are missing. Unless you hack all of that in yourself (and why do that?) the thread is of course completely unrunnable.
Not all classes need an __init__ of course, but the vast majority do. Even for the ones that don't, calling __init__ is harmless - it just goes to object.__init__ and future-proofs the child class in the event an implementer decides an __init__ is useful after all.
I am writing a class with multiple constructors using #classmethod. Now I would like both the __init__ constructor as well as the classmethod constructor call some routine of the class to set initial values before doing other stuff.
From __init__ this is usually done with self:
def __init__(self, name="", revision=None):
self._init_attributes()
def _init_attributes(self):
self.test = "hello"
From a classmethod constructor, I would call another classmethod instead, because the instance (i.e. self) is not created until I leave the classmethod with return cls(...). Now, I can call my _init_attributes() method as
#classmethod
def from_file(cls, filename=None)
cls._init_attributes()
# do other stuff like reading from file
return cls()
and this actually works (in the sense that I don't get an error and I can actually see the test attribute after executing c = Class.from_file(). However, if I understand things correctly, then this will set the attributes on the class level, not on the instance level. Hence, if I initialize an attribute with a mutable object (e.g. a list), then all instances of this class would use the same list, rather than their own instance list. Is this correct? If so, is there a way to initialize "instance" attributes in classmethods, or do I have to write the code in such a way that all the attribute initialisation is done in init?
Hmmm. Actually, while writing this: I may even have greater trouble than I thought because init will be called upon return from the classmethod, won't it? So what would be a proper way to deal with this situation?
Note: Article [1] discusses a somewhat similar problem.
Yes, you'r understanding things correctly: cls._init_attributes() will set class attributes, not instance attributes.
Meanwhile, it's up to your alternate constructor to construct and return an instance. In between constructing it and returning it, that's when you can call _init_attributes(). In other words:
#classmethod
def from_file(cls, filename=None)
obj = cls()
obj._init_attributes()
# do other stuff like reading from file
return obj
However, you're right that the only obvious way to construct and return an instance is to just call cls(), which will call __init__.
But this is easy to get around: just have the alternate constructors pass some extra argument to __init__ meaning "skip the usual initialization, I'm going to do it later". For example:
def __init__(self, name="", revision=None, _skip_default_init=False):
# blah blah
#classmethod
def from_file(cls, filename=""):
# blah blah setup
obj = cls(_skip_default_init=True)
# extra initialization work
return obj
If you want to make this less visible, you can always take **kwargs and check it inside the method body… but remember, this is Python; you can't prevent people from doing stupid things, all you can do is make it obvious that they're stupid. And the _skip_default_init should be more than enough to handle that.
If you really want to, you can override __new__ as well. Constructing an object doesn't call __init__ unless __new__ returns an instance of cls or some subclass thereof. So, you can give __new__ a flag that tells it to skip over __init__ by munging obj.__class__, then restore the __class__ yourself. This is really hacky, but could conceivably be useful.
A much cleaner solution—but for some reason even less common in Python—is to borrow the "class cluster" idea from Smalltalk/ObjC: Create a private subclass that has a different __init__ that doesn't super (or intentionally skips over its immediate base and supers from there), and then have your alternate constructor in the base class just return an instance of that subclass.
Alternatively, if the only reason you don't want to call __init__ is so you can do the exact same thing __init__ would have done… why? DRY stands for "don't repeat yourself", not "bend over backward to find ways to force yourself to repeat yourself", right?
For example, I have a
class BaseHandler(object):
def prepare(self):
self.prepped = 1
I do not want everyone that subclasses BaseHandler and also wants to implement prepare to have to remember to call
super(SubBaseHandler, self).prepare()
Is there a way to ensure the superclass method is run even if the subclass also implements prepare?
I have solved this problem using a metaclass.
Using a metaclass allows the implementer of the BaseHandler to be sure that all subclasses will call the superclasses prepare() with no adjustment to any existing code.
The metaclass looks for an implementation of prepare on both classes and then overwrites the subclass prepare with one that calls superclass.prepare followed by subclass.prepare.
class MetaHandler(type):
def __new__(cls, name, bases, attrs):
instance = type.__new__(cls, name, bases, attrs)
super_instance = super(instance, instance)
if hasattr(super_instance, 'prepare') and hasattr(instance, 'prepare'):
super_prepare = getattr(super_instance, 'prepare')
sub_prepare = getattr(instance, 'prepare')
def new_prepare(self):
super_prepare(self)
sub_prepare(self)
setattr(instance, 'prepare', new_prepare)
return instance
class BaseHandler(object):
__metaclass__ = MetaHandler
def prepare(self):
print 'BaseHandler.prepare'
class SubHandler(BaseHandler):
def prepare(self):
print 'SubHandler.prepare'
Using it looks like this:
>>> sh = SubHandler()
>>> sh.prepare()
BaseHandler.prepare
SubHandler.prepare
Tell your developers to define prepare_hook instead of prepare, but
tell the users to call prepare:
class BaseHandler(object):
def prepare(self):
self.prepped = 1
self.prepare_hook()
def prepare_hook(self):
pass
class SubBaseHandler(BaseHandler):
def prepare_hook(self):
pass
foo = SubBaseHandler()
foo.prepare()
If you want more complex chaining of prepare calls from multiple subclasses, then your developers should really use super as that's what it was intended for.
Just accept that you have to tell people subclassing your class to call the base method when overriding it. Every other solution either requires you to explain them to do something else, or involves some un-pythonic hacks which could be circumvented too.
Python’s object inheritance model was designed to be open, and any try to go another way will just overcomplicate the problem which does not really exist anyway. Just tell everybody using your stuff to either follow your “rules”, or the program will mess up.
One explicit solution without too much magic going on would be to maintain a list of prepare call-backs:
class BaseHandler(object):
def __init__(self):
self.prepare_callbacks = []
def register_prepare_callback(self, callback):
self.prepare_callbacks.append(callback)
def prepare(self):
# Do BaseHandler preparation
for callback in self.prepare_callbacks:
callback()
class MyHandler(BaseHandler):
def __init__(self):
BaseHandler.__init__(self)
self.register_prepare_callback(self._prepare)
def _prepare(self):
# whatever
In general you can try using __getattribute__ to achive something like this (until the moment someone overwrites this method too), but it is against the Python ideas. There is a reason to be able to access private object members in Python. The reason is mentioned in import this