Why is Python super used in the child's init method? - python

According to the Python docs, super() is useful for accessing inherited methods that have been overridden in a class.
I understand that super refers to the parent class and it lets you access parent methods. My question is why do people always use super inside the init method of the child class? I have seen it everywhere. For example:
class Person:
    def __init__(self, name):
        self.name = name

class Employee(Person):
    def __init__(self, **kwargs):
        super().__init__(name=kwargs['name'])  # Here super is being used

    def first_letter(self):
        return self.name[0]

e = Employee(name="John")
print(e.first_letter())
I can accomplish the same without super and without even an init method:
class Person:
    def __init__(self, name):
        self.name = name

class Employee(Person):
    def first_letter(self):
        return self.name[0]

e = Employee(name="John")
print(e.first_letter())
Are there drawbacks with the latter code? It looks so much cleaner to me. I don't even have to use the boilerplate **kwargs and kwargs['argument'] syntax.
I am using Python 3.8.
Edit: Here's another Stack Overflow question with code from different people who use super in the child's init method. I don't understand why. My best guess is that there's something new in Python 3.8.

The child might want to do something different from, or more likely in addition to, what the superclass does - in that case the child must have its own __init__.
Calling super's init means that you don't have to copy/paste (with all the maintenance implications) the parent's init into the child's class, which you would otherwise need if the child init added any code of its own.
But note there are complications about using super's init if you use multiple inheritance (e.g. which super gets called), and this needs care. Personally I avoid multiple inheritance and keep inheritance to a minimum anyway - it's easy to get tempted into creating multiple levels of inheritance/class hierarchy, but my experience is that a 'keep it simple' approach is usually much better.

The potential drawback to the latter code is that there is no __init__ method within the Employee class. Since there is none, the __init__ method of the parent class is called. However, as soon as an __init__ method is added to the Employee class (maybe there's some Employee-specific attribute that needs to be initialized, like an id_number), the __init__ method of the parent class is overridden and no longer called (unless super().__init__() is called), and then an Employee will not have a name attribute.
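A minimal sketch of that failure mode (the id_number attribute is hypothetical, purely for illustration):
class Person:
    def __init__(self, name):
        self.name = name

class Employee(Person):
    def __init__(self, name, id_number):
        # overrides Person.__init__ without calling super().__init__(name),
        # so self.name is never set
        self.id_number = id_number

e = Employee("John", 42)
print(e.id_number)  # 42
print(e.name)       # AttributeError: 'Employee' object has no attribute 'name'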

The correct way to use super here is for both methods to use super. You cannot assume that Person is the last (or at least, next-to-last, before object) class in the MRO.
class Person:
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.name = name

class Employee(Person):
    # Optional, since Employee.__init__ does nothing
    # except pass the exact same arguments "upstream"
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def first_letter(self):
        return self.name[0]
Consider a class definition like
class Bar:
    ...

class Foo(Person, Bar):
    ...
The MRO for Foo looks like [Foo, Person, Bar, object]; the call to super().__init__ inside Person.__init__ would call Bar.__init__, not object.__init__, and Person has no way of knowing if values in **kwargs are meant for Bar, so it must pass them on.
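For instance, here is a runnable sketch of that scenario (Bar's keyword argument is made up, just to show how the kwargs are passed along the chain):
class Person:
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)  # may land on Bar here, not object
        self.name = name

class Bar:
    def __init__(self, department=None, **kwargs):
        super().__init__(**kwargs)
        self.department = department

class Foo(Person, Bar):
    pass

print([c.__name__ for c in Foo.__mro__])  # ['Foo', 'Person', 'Bar', 'object']
f = Foo(name="John", department="Sales")
print(f.name, f.department)               # John Sales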

Related

Python - Child Class to call a function from another Child Class

I have a pretty big class that I want to break down into smaller classes that each handle a single part of the whole. So each child takes care of only one aspect of the whole.
Each of these child classes still needs to communicate with the others.
For example, Data Access creates a dictionary that Plotting Controller needs to have access to.
And then Plotting Controller needs to update stuff on Main GUI Controller. But these children have several more inter-communication functions.
How do I achieve this?
I've read Metaclasses, Cooperative Multiple Inheritance and Wonders of Cooperative Multiple Inheritance, but I cannot figure out how to do this.
The closest I've come is the following code:
class A:
    def __init__(self):
        self.myself = 'ClassA'

    def method_ONE_from_class_A(self, caller):
        print(f"I am method ONE from {self.myself} called by {caller}")
        self.method_ONE_from_class_B(self.myself)

    def method_TWO_from_class_A(self, caller):
        print(f"I am method TWO from {self.myself} called by {caller}")
        self.method_TWO_from_class_B(self.myself)

class B:
    def __init__(self):
        self.me = 'ClassB'

    def method_ONE_from_class_B(self, caller):
        print(f"I am method ONE from {self.me} called by {caller}")
        self.method_TWO_from_class_A(self.me)

    def method_TWO_from_class_B(self, caller):
        print(f"I am method TWO from {self.me} called by {caller}")

class C(A, B):
    def __init__(self):
        A.__init__(self)
        B.__init__(self)

    def children_start_talking(self):
        self.method_ONE_from_class_A('Big Poppa')

poppa = C()
poppa.children_start_talking()
which results correctly in:
I am method ONE from ClassA called by Big Poppa
I am method ONE from ClassB called by ClassA
I am method TWO from ClassA called by ClassB
I am method TWO from ClassB called by ClassA
But... even though Class B and Class A correctly call the other child's functions, they don't actually find their declarations. Nor do I "see" them when I'm typing the code, which is both frustrating and worrying, since I might be doing something wrong.
Is there a good way to achieve this? Or is it an actually bad idea?
EDIT: Python 3.7 if it makes any difference.
Inheritance
When breaking up a class hierarchy like this, the individual "partial" classes, which we call "mixins", will "see" only what is declared directly on them and on their base classes. In your example, when writing class A, it does not know anything about class B - you, as the author, can know that methods from class B will be present, because methods from class A will only be called from class C, which inherits both.
Your programming tools, including the IDE, can't know that. (Whether you should know better than your programming aid is a side track.) It would work, if run, but this is a poor design.
If all methods are to be present directly on a single instance of your final class, they all have to be declared in a superclass common to them all - you can even write independent subclasses in different files, and then a single subclass that inherits all of them:
from abc import abstractmethod, ABC

class Base(ABC):
    @abstractmethod
    def method_A_1(self):
        pass

    @abstractmethod
    def method_A_2(self):
        pass

    @abstractmethod
    def method_B_1(self):
        pass

class A(Base):
    def __init__(self, *args, **kwargs):
        # pop consumed named parameters from "kwargs"
        ...
        super().__init__(*args, **kwargs)
        # This call ensures all __init__ in bases are called
        # because Python linearizes the base classes on multiple inheritance

    def method_A_1(self):
        ...

    def method_A_2(self):
        ...

class B(Base):
    def __init__(self, *args, **kwargs):
        # pop consumed named parameters from "kwargs"
        ...
        super().__init__(*args, **kwargs)
        # This call ensures all __init__ in bases are called
        # because Python linearizes the base classes on multiple inheritance

    def method_B_1(self):
        ...

    ...

class C(A, B):
    pass
(The "ABC" and "abstractmethod" are a bit of sugar - they will work, but this design would work without any of that - thought their presence help whoever is looking at your code to figure out what is going on, and will raise an earlier runtime error if you per mistake create an instance of one of the incomplete base classes)
Composite
This works, but if your methods are actually for wildly different domains, instead of multiple inheritance, you should try using the "composite design pattern". There is no need for multiple inheritance if it does not arise naturally.
In this case, you instantiate objects of the classes that drive the different domains in the __init__ of the shell class, and pass its own instance to those children, which will keep a reference to it (in a self.parent attribute, for example). Chances are your IDE still won't know what you are talking about, but you will have a saner design.
class Parent:
    def __init__(self):
        self.a_domain = A(self)
        self.b_domain = B(self)

class A:
    def __init__(self, parent):
        self.parent = parent
        # no need to call any "super...init", this is called
        # as part of the initialization of the parent class

    def method_A_1(self):
        ...

    def method_A_2(self):
        ...

class B:
    def __init__(self, parent):
        self.parent = parent

    def method_B_1(self):
        # need result from 'A' domain:
        a_value = self.parent.a_domain.method_A_1()
        ...
This example uses the basics of the language features, but if you decide to go for it in a complex application, you can make it more sophisticated - there are interface patterns that could allow you to swap the classes used for the different domains in specialized subclasses, and so on. But typically the pattern above is what you would need.
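A tiny self-contained version of that shape, with made-up domain classes and method bodies, just to show the wiring:
class DataAccess:
    def __init__(self, parent):
        self.parent = parent

    def load(self):
        return {"points": [1, 2, 3]}

class Plotting:
    def __init__(self, parent):
        self.parent = parent

    def plot(self):
        data = self.parent.data.load()  # reach a sibling domain through the parent
        return f"plotting {data['points']}"

class App:
    def __init__(self):
        self.data = DataAccess(self)
        self.plotting = Plotting(self)

app = App()
print(app.plotting.plot())  # plotting [1, 2, 3]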

Overriding __init__ with parent classmethod in python

I want to do something like the following (in Python 3.7):
class Animal:
    def __init__(self, name, legs):
        self.legs = legs
        print(name)

    @classmethod
    def with_two_legs(cls, name):
        # extremely long code to generate name_full from name
        name_full = name
        return cls(name_full, 2)

class Human(Animal):
    def __init__(self):
        super().with_two_legs('Human')

john = Human()
Basically, I want to override the __init__ method of a child class with a factory classmethod of the parent. The code as written, however, does not work, and raises:
TypeError: __init__() takes 1 positional argument but 3 were given
I think this means that super().with_two_legs('Human') passes Human as the cls variable.
1) Why doesn't this work as written? I assumed super() would return a proxy instance of the superclass, so cls would be Animal, right?
2) Even if this were the case, I don't think this code achieves what I want, since the classmethod returns an instance of Animal, and I just want to initialize Human in the same way the classmethod does. Is there any way to achieve the behaviour I want?
I hope this is not a very obvious question, I found the documentation on super() somewhat confusing.
super().with_two_legs('Human') does in fact call Animal's with_two_legs, but it passes Human as the cls, not Animal. super() makes the proxy object only to assist with method lookup, it doesn't change what gets passed (it's still the same self or cls it originated from). In this case, super() isn't even doing anything useful, because Human doesn't override with_two_legs, so:
super().with_two_legs('Human')
means "call with_two_legs from the first class above Human in the hierarchy which defines it", and:
cls.with_two_legs('Human')
means "call with_two_legs on the first class in the hierarchy starting with cls that defines it". As long as no class below Animal defines it, those do the same thing.
This means your code breaks at return cls(name_full, 2), because cls is still Human, and your Human.__init__ doesn't take any arguments beyond self. Even if you futzed around to make it work (e.g. by adding two optional arguments that you ignore), this would cause an infinite loop, as Human.__init__ called Animal.with_two_legs, which in turn tried to construct a Human, calling Human.__init__ again.
What you're trying to do is not a great idea; alternate constructors, by their nature, depend on the core constructor/initializer for the class. If you try to make a core constructor/initializer that relies on an alternate constructor, you've created a circular dependency.
In this particular case, I'd recommend avoiding the alternate constructor, in favor of either explicitly providing the legs count always, or using an intermediate TwoLeggedAnimal class that performs the task of your alternate constructor. If you want to reuse code, the second option just means your "extremely long code to generate name_full from name" can go in TwoLeggedAnimal's __init__; in the first option, you'd just write a staticmethod that factors out that code so it can be used by both with_two_legs and other constructors that need to use it.
The class hierarchy would look something like:
class Animal:
    def __init__(self, name, legs):
        self.legs = legs
        print(name)

class TwoLeggedAnimal(Animal):
    def __init__(self, name):
        # extremely long code to generate name_full from name
        name_full = name
        super().__init__(name_full, 2)

class Human(TwoLeggedAnimal):
    def __init__(self):
        super().__init__('Human')
The common code approach would instead be something like:
class Animal:
    def __init__(self, name, legs):
        self.legs = legs
        print(name)

    @staticmethod
    def _make_two_legged_name(basename):
        # extremely long code to generate name_full from basename
        name_full = basename
        return name_full

    @classmethod
    def with_two_legs(cls, name):
        return cls(cls._make_two_legged_name(name), 2)

class Human(Animal):
    def __init__(self):
        super().__init__(self._make_two_legged_name('Human'), 2)
Side-note: What you were trying to do wouldn't work even if you worked around the recursion, because __init__ doesn't make new instances, it initializes existing instances. So even if you call super().with_two_legs('Human') and it somehow works, it's making and returning a completely different instance, but not doing anything to the self received by __init__ which is what's actually being created. The best you'd have been able to do is something like:
def __init__(self):
    self_template = super().with_two_legs('Human')
    # Cheaty way to copy all attributes from self_template to self, assuming no use
    # of __slots__
    vars(self).update(vars(self_template))
There is no way to call an alternate constructor in __init__ and have it change self implicitly. About the only way I can think of to make this work the way you intended, without creating helper methods and while preserving your alternate constructor, would be to use __new__ instead of __init__ (so you can return an instance created by another constructor), and doing awful things with the alternate constructor to explicitly call the top class's __new__ to avoid circular calling dependencies:
class Animal:
    def __new__(cls, name, legs):  # Use __new__ instead of __init__
        self = super().__new__(cls)  # Constructs base object
        self.legs = legs
        print(name)
        return self  # Returns initialized object

    @classmethod
    def with_two_legs(cls, name):
        # extremely long code to generate name_full from name
        name_full = name
        return Animal.__new__(cls, name_full, 2)  # Explicitly call Animal's __new__ using correct subclass

class Human(Animal):
    def __new__(cls):
        return super().with_two_legs('Human')  # Return result of alternate constructor
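If the code above is run as written, a quick check gives:
john = Human()                         # prints "Human" from inside Animal.__new__
print(type(john).__name__, john.legs)  # Human 2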
The proxy object you get from calling super was only used to locate the with_two_legs method to be called (and since you didn't override it in Human, you could have used self.with_two_legs for the same result).
As wim commented, your alternative constructor with_two_legs doesn't work because the Human class breaks the Liskov substitution principle by having a different constructor signature. Even if you could get the code to call Animal to build your instance, you'd have problems because you'd end up with an Animal instance and not a Human one (so other methods in Human, if you wrote some, would not be available).
Note that this situation is not that uncommon, many Python subclasses have different constructor signatures than their parent classes. But it does mean that you can't use one class freely in place of the other, as happens with a classmethod that tries to construct instances. You need to avoid those situations.
In this case, you are probably best served by using a default value for the legs argument to the Animal constructor. It can default to 2 legs if no alternative number is passed. Then you don't need the classmethod, and you don't run into problems when you override __init__:
class Animal:
    def __init__(self, name, legs=2):  # legs is now optional, defaults to 2
        self.legs = legs
        print(name)

class Human(Animal):
    def __init__(self):
        super().__init__('Human')

john = Human()
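Assuming the definitions above, the default behaves as expected (the Spider example is just for illustration):
print(john.legs)                   # 2
spider = Animal("Spider", legs=8)  # prints "Spider"; the default can still be overridden
print(spider.legs)                 # 8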

Reason for calling super class constructor using superclass.__init__() instead of superclass()

I am a beginner in Python and using Lutz's book to understand OOPS in Python. This question might be basic, but I'd appreciate any help. I researched SO and found answers on "how", but not "why."
As I understand from the book, if Sub inherits Super then one need not call superclass' (Super's) __init__() method.
Example:
class Super:
    def __init__(self, name):
        self.name = name
        print("Name is:", name)

class Sub(Super):
    pass

a = Sub("Harry")
a.name
The above code does assign the name attribute to object a. It also prints the name as expected.
However, if I modify the code as:
class Super:
    def __init__(self, name):
        print("Inside Super __init__")
        self.name = name
        print("Name is:", name)

class Sub(Super):
    def __init__(self, name):
        Super(name)  # Call __init__ directly

a = Sub("Harry")
a.name
The above code doesn't work fine. By fine, I mean that although Super.__init__() does get called (as seen from the print statements), there is no attribute attached to a. When I run a.name, I get an error, AttributeError: 'Sub' object has no attribute 'name'
I researched this topic on SO, and found the fix on Chain-calling parent constructors in python and Why aren't superclass __init__ methods automatically invoked?
These two threads talk about how to fix it, but they don't provide a reason for why.
Question: Why do I need to call Super's __init__ using Super.__init__(self, name) OR super(Sub, self).__init__(name) instead of a direct call Super(name)?
In Super.__init__(self, name) and Super(name), we see that Super's __init__() gets called, (as seen from print statements), but only in Super.__init__(self, name) we see that the attribute gets attached to Sub class.
Wouldn't Super(name) automatically pass the self (child) object to Super? Now, you might ask how I know that self is automatically passed. If I modify Super(name) to Super(self, name), I get an error message: TypeError: __init__() takes 2 positional arguments but 3 were given. As I understand from the book, self is automatically passed. So, effectively, we end up passing self twice.
I don't know why Super(name) doesn't attach name attribute to Sub even though Super.__init__() is run. I'd appreciate any help.
For reference, here's the working version of the code based on my research from SO:
class Super:
    def __init__(self, name):
        print("Inside __init__")
        self.name = name
        print("Name is:", name)

class Sub(Super):
    def __init__(self, name):
        #Super.__init__(self, name)  # One way to fix this
        super(Sub, self).__init__(name)  # Another way to fix this

a = Sub("Harry")
a.name
PS: I am using Python-3.6.5 under Anaconda Distribution.
As I understand from the book, if Sub inherits Super then one need not call superclass' (Super's) __init__() method.
This is misleading. It's true that you aren't required to call the superclass's __init__ method, but if you don't, whatever it does in __init__ never happens. And for normal classes, all of that needs to be done. Skipping the call is occasionally useful, usually when a class wasn't designed to be inherited from, like this:
import codecs

class Rot13Reader:
    def __init__(self, filename):
        self.file = open(filename)

    def close(self):
        self.file.close()

    def dostuff(self):
        line = next(self.file)
        return codecs.encode(line, 'rot13')
Imagine that you want all the behavior of this class, but with a string rather than a file. The only way to do that is to skip the open:
import io

class LocalRot13Reader(Rot13Reader):
    def __init__(self, s):
        # don't call super().__init__, because we don't have a filename to open
        # instead, set up self.file with something else
        self.file = io.StringIO(s)
Here, we wanted to avoid the self.file assignment in the superclass. In your case—as with almost all classes you're ever going to write—you don't want to avoid the self.name assignment in the superclass. That's why, even though Python allows you to not call the superclass's __init__, you almost always call it.
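Assuming the sketch above (with codecs and io imported), usage would look like this; the sample string is arbitrary:
r = LocalRot13Reader("uryyb jbeyq\n")
print(r.dostuff())  # "hello world" (rot13 of the supplied line)
r.close()           # closes the StringIO, not a real file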
Notice that there's nothing special about __init__ here. For example, we can override dostuff to call the base class's version and then do extra stuff:
def dostuff(self):
    result = super().dostuff()
    return result.upper()
… or we can override close and intentionally not call the base class:
def close(self):
    # do nothing, including no super, because we borrowed our file
    pass
The only difference is that good reasons to avoid calling the base class tend to be much more common in normal methods than in __init__.
Question: Why do I need to call Super's __init__ using Super.__init__(self, name) OR super(Sub, self).__init__(name) instead of a direct call Super(name)?
Because these do very different things.
Super(name) constructs a new Super instance, calls __init__(name) on it, and returns it to you. And you then ignore that value.
In particular, Super.__init__ does get called one time either way—but the self it gets called with is that new Super instance, that you're just going to throw away, in the Super(name) case, while it's your own self in the super(Sub, self).__init__(name) case.
So, in the first case, it sets the name attribute on some other object that gets thrown away, and nobody ever sets it on your object, which is why self.name later raises an AttributeError.
It might help you understand this if you add something to both class's __init__ methods to show which instance is involved:
class Super:
    def __init__(self, name):
        print(f"Inside Super __init__ for {self}")
        self.name = name
        print("Name is:", name)

class Sub(Super):
    def __init__(self, name):
        print(f"Inside Sub __init__ for {self}")
        # line you want to experiment with goes here.
If that last line is super().__init__(name), super(Sub, self).__init__(name), or Super.__init__(self, name), you will see something like this:
Inside Sub __init__ for <__main__.Sub object at 0x10f7a9e80>
Inside Super __init__ for <__main__.Sub object at 0x10f7a9e80>
Notice that it's the same object, the Sub at address 0x10f7a9e80, in both cases.
… but if that last line is Super(name):
Inside Sub __init__ for <__main__.Sub object at 0x10f7a9ea0>
Inside Super __init__ for <__main__.Super object at 0x10f7a9ec0>
Now we have two different objects, at different addresses 0x10f7a9ea0 and 0x10f7a9ec0, and with different types.
If you're curious about what the magic all looks like under the covers, Super(name) does something like this (oversimplifying a bit and skipping over some steps[1]):
_newobj = Super.__new__(Super)
if isinstance(_newobj, Super):
    Super.__init__(_newobj, name)
… while super(Sub, self).__init__(name) does something like this:
_basecls = magically_find_next_class_in_mro(Sub)
_basecls.__init__(self, name)
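For the curious, that "magic" lookup can be approximated by hand; this is a simplification of what super actually does, using only the instance's MRO (the helper name below is made up):
def find_next_class_in_mro(cls, obj):
    # Roughly what super(cls, obj) resolves to: the entry in the
    # instance's MRO that comes right after cls.
    mro = type(obj).__mro__
    return mro[mro.index(cls) + 1]

# e.g. inside Sub.__init__, this is close to super(Sub, self).__init__(name):
#   find_next_class_in_mro(Sub, self).__init__(self, name)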
As a side note, if a book is telling you to use super(Sub, self).__init__(name) or Super.__init__(self, name), it's probably an obsolete book written for Python 2.
In Python 3, you just do this:
super().__init__(name): Calls the correct next superclass by method resolution order. You almost always want this.
super(Sub, self).__init__(name): Calls the correct next superclass—unless you make a mistake and get Sub wrong there. You only need this if you're writing dual-version code that has to run in 2.7 as well as 3.x.
Super.__init__(self, name): Calls Super, whether it's the correct next superclass or not. You only need this if the method resolution order is wrong and you have to work around it.[2]
If you want to understand more, it's all in the docs, but it can be a bit daunting:
__new__
__init__
super (also see Raymond Hettinger's blog post)
method invocation (also see the HOWTO)
The original introduction to super, __new__, and all the related features was very helpful to me in understanding all of this. I'm not sure if it'll be as helpful to someone who's not coming at this already understanding old-style Python classes, but it's pretty well written, and Guido (obviously) knows what he's talking about, so it might be worth reading.
1. The biggest cheat in this explanation is that super actually returns a proxy object that acts like _basecls bound to self in the same way methods are bound, which can be used to bind methods, like __init__. This is useful/interesting knowledge if you know how methods work, but probably just extra confusion if you don't.
2. … or if you're working with old-style classes, which don't support super (or proper method-resolution order). This never comes up in Python 3, which doesn't have old-style classes. But, unfortunately, you will see it in lots of tkinter examples, because the best tutorial is still Effbot's, which was written for Python 2.3, when Tkinter was all old-style classes, and has never been updated.
Super(name) is not a "direct call" to the superclass __init__. After all, you called Super, not Super.__init__.
Super.__init__ takes an uninitialized Super instance and initializes it. Super creates and initializes a new, completely separate instance from the one you wanted to initialize (and then you immediately throw the new instance away). The instance you wanted to initialize is untouched.
Super(name) instantiates a new instance of Super. Think of this example:
def __init__(self, name):
    x1 = Super(name)
    x2 = Super("some other name")
    assert x1 is not self
    assert x2 is not self
In order to explicitly call Super's constructor on the current instance, you'd have to use the following syntax:
def __init__(self, name):
    Super.__init__(self, name)
Now, maybe you don't want to read further if you are a beginner.
If you do, you will see that there is a good reason to use super(Sub, self).__init__(name) (or super().__init__(name) in Python 3) instead of Super.__init__(self, name).
Super.__init__(self, name) works fine, as long as you are certain that Super is in fact your superclass. But you don't ever know that for sure.
You could have the following code:
class Super:
def __init__(self):
print('Super __init__')
class Sub(Super):
def __init__(self):
print('Sub __init__')
Super.__init__(self)
class Sub2(Super):
def __init__(self):
print('Sub2 __init__')
Super.__init__(self)
class SubSub(Sub, Sub2):
pass
You would now expect that SubSub() ends up calling all of the above constructors, but it does not:
>>> x = SubSub()
Sub __init__
Super __init__
>>>
To correct it, you'd have to do:
class Super:
    def __init__(self):
        print('Super __init__')

class Sub(Super):
    def __init__(self):
        print('Sub __init__')
        super().__init__()

class Sub2(Super):
    def __init__(self):
        print('Sub2 __init__')
        super().__init__()

class SubSub(Sub, Sub2):
    pass
Now it works:
>>> x = SubSub()
Sub __init__
Sub2 __init__
Super __init__
>>>
The reason is that although the superclass of Sub is declared to be Super, in the case of multiple inheritance in class SubSub, Python's MRO establishes the inheritance as: SubSub inherits from Sub, which inherits from Sub2, which inherits from Super, which inherits from object.
You can test that:
>>> SubSub.__mro__
(<class '__main__.SubSub'>, <class '__main__.Sub'>, <class '__main__.Sub2'>, <class '__main__.Super'>, <class 'object'>)
Now, the super() call in constructors of each of the classes finds the next class in the MRO so that the constructor of that class can be called.
See https://www.python.org/download/releases/2.3/mro/

Is it OK to call __init__ from __setstate__

I'm enhancing an existing class that does some calculations in the __init__ function to determine the instance state. Is it OK to call __init__() from __setstate__() in order to reuse those calculations?
To summarize reactions from Kroltan and jonsrharpe:
Technically it is OK
Technically it will work and if you do it properly, it can be considered OK.
Practically it is tricky, avoid that
If you edit the code in the future and touch __init__, then it is easy (even for you) to forget about its use in __setstate__, and then you end up in a difficult-to-debug situation (asking yourself where it comes from).
class Calculator():
    def __init__(self):
        # some calculation stuff here
        ...

    def __setstate__(self, state):
        self.__init__()
The calculation stuff is better isolated into another, shared method:
class Calculator():
    def __init__(self):
        self._shared_calculation()

    def __setstate__(self, state):
        self._shared_calculation()

    def _shared_calculation(self):
        # some calculation stuff here
        ...
This way you are more likely to notice the shared use when editing.
Note: use of "_" as prefix for the shared method is arbitrary, you do not have to do that.
It's usually preferable to write a method called __getnewargs__ instead. That way, the pickling mechanism will pass those arguments to the constructor for you automatically when unpickling.
Another approach is to customize the constructor __init__ in a subclass. Ideally it is better to have one constructor in the base class and adapt it to your needs in the subclass:
class Person:
    def __init__(self, name, job=None, pay=0):
        self.name = name
        self.job = job
        self.pay = pay

class Manager(Person):
    def __init__(self, name, pay):
        Person.__init__(self, name, 'title', pay)  # Run constructor with 'title'
Calling a constructor this way turns out to be a very common coding pattern in Python. By itself, Python uses inheritance to look for and call only one __init__ method at construction time: the lowest one in the class tree.
If you need higher __init__ methods to be run at construction time, you must call them manually, usually through the superclass name, as shown in the code above. This way you augment the superclass constructor and can replace or extend its logic in the subclass to your liking.
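For instance, with the classes above (sample values are made up):
tom = Manager('Tom Jones', 50000)
print(tom.name, tom.job, tom.pay)  # Tom Jones title 50000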
As suggested by Jan, it is tricky, and you will end up in a difficult debugging situation if you call it in the same class.

How to auto register a class when it's defined

I want to have an instance of class registered when the class is defined. Ideally the code below would do the trick.
registry = {}

def register(cls):
    registry[cls.__name__] = cls()  #problem here
    return cls

@register
class MyClass(Base):
    def __init__(self):
        super(MyClass, self).__init__()
Unfortunately, this code generates the error NameError: global name 'MyClass' is not defined.
What's going on is that at the #problem here line I'm trying to instantiate a MyClass, but the decorator hasn't returned yet, so it doesn't exist.
Is there some way around this using metaclasses or something?
Yes, meta classes can do this. A meta class' __new__ method returns the class, so just register that class before returning it.
class MetaClass(type):
    def __new__(cls, clsname, bases, attrs):
        newclass = super(MetaClass, cls).__new__(cls, clsname, bases, attrs)
        register(newclass)  # here is your register function
        return newclass

class MyClass(object):
    __metaclass__ = MetaClass
The previous example works in Python 2.x. In Python 3.x, the definition of MyClass is slightly different (while MetaClass is not shown because it is unchanged - except that super(MetaClass, cls) can become super() if you want):
# Python 3.x
class MyClass(metaclass=MetaClass):
    pass
As of Python 3.6 there is also a new __init_subclass__ method (see PEP 487) that can be used instead of a metaclass (thanks to @matusko for his answer below):
class ParentClass:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        register(cls)

class MyClass(ParentClass):
    pass
[edit: fixed missing cls argument to super().__new__()]
[edit: added Python 3.x example]
[edit: corrected order of args to super(), and improved description of 3.x differences]
[edit: add Python 3.6 __init_subclass__ example]
Since Python 3.6 you don't need metaclasses to solve this.
In Python 3.6, simpler customization of class creation was introduced (PEP 487):
An __init_subclass__ hook that initializes all subclasses of a given class.
The proposal includes the following example of subclass registration:
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)
In this example, PluginBase.subclasses will contain a plain list of
all subclasses in the entire inheritance tree. One should note that
this also works nicely as a mixin class.
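A quick demonstration of that registration (the plugin names are made up):
class PluginA(PluginBase):
    pass

class PluginB(PluginA):  # subclasses of subclasses are registered as well
    pass

print(PluginBase.subclasses)  # [<class 'PluginA'>, <class 'PluginB'>]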
The problem isn't actually caused by the line you've indicated, but by the super call in the __init__ method. The problem remains if you use a metaclass as suggested by dappawit; the reason the example from that answer works is simply that dappawit has simplified your example by omitting the Base class and therefore the super call. In the following example, neither ClassWithMeta nor DecoratedClass works:
registry = {}

def register(cls):
    registry[cls.__name__] = cls()
    return cls

class MetaClass(type):
    def __new__(cls, clsname, bases, attrs):
        newclass = super(cls, MetaClass).__new__(cls, clsname, bases, attrs)
        register(newclass)  # here is your register function
        return newclass

class Base(object):
    pass

class ClassWithMeta(Base):
    __metaclass__ = MetaClass

    def __init__(self):
        super(ClassWithMeta, self).__init__()

@register
class DecoratedClass(Base):
    def __init__(self):
        super(DecoratedClass, self).__init__()
The problem is the same in both cases; the register function is called (either by the metaclass or directly as a decorator) after the class object is created, but before it has been bound to a name. This is where super gets gnarly (in Python 2.x), because it requires you to refer to the class in the super call, which you can only reasonably do by using the global name and trusting that it will have been bound to that name by the time the super call is invoked. In this case, that trust is misplaced.
I think a metaclass is the wrong solution here. Metaclasses are for making a family of classes that have some custom behaviour in common, exactly as classes are for making a family of instances that have some custom behavior in common. All you're doing is calling a function on a class. You wouldn't define a class to call a function on a string, neither should you define a metaclass to call a function on a class.
So, the problem is a fundamental incompatibility between: (1) using hooks in the class creation process to create instances of the class, and (2) using super.
One way to resolve this is to not use super. super solves a hard problem, but it introduces others (this is one of them). If you're using a complex multiple inheritance scheme, super's problems are better than the problems of not using super, and if you're inheriting from third-party classes that use super then you have to use super. If neither of those conditions are true, then just replacing your super calls with direct base class calls may actually be a reasonable solution.
Another way is to not hook register into class creation. Adding register(MyClass) after each of your class definitions is pretty much equivalent to adding @register before them or __metaclass__ = Registered (or whatever you call the metaclass) into them. A line down at the bottom is much less self-documenting than a nice declaration up at the top of the class though, so this doesn't feel great, but again it may actually be a reasonable solution.
Finally, you can turn to hacks that are unpleasant, but will probably work. The problem is that a name is being looked up in a module's global scope just before it's been bound there. So you could cheat, as follows:
def register(cls):
    name = cls.__name__
    force_bound = False
    if '__init__' in cls.__dict__:
        cls.__init__.func_globals[name] = cls
        force_bound = True
    try:
        registry[name] = cls()
    finally:
        if force_bound:
            del cls.__init__.func_globals[name]
    return cls
Here's how this works:
We first check to see whether __init__ is in cls.__dict__ (as opposed to whether it has an __init__ attribute, which will always be true). If it's inherited an __init__ method from another class we're probably fine (because the superclass will already be bound to its name in the usual way), and the magic we're about to do doesn't work on object.__init__ so we want to avoid trying that if the class is using a default __init__.
We look up the __init__ method and grab its func_globals dictionary, which is where global lookups (such as to find the class referred to in a super call) will go. This is normally the global dictionary of the module where the __init__ method was originally defined. Such a dictionary is about to have the cls.__name__ inserted into it as soon as register returns, so we just insert it ourselves early.
We finally create an instance and insert it into the registry. This is in a try/finally block to make sure we remove the binding we created whether or not creating an instance throws an exception; this is very unlikely to be necessary (since 99.999% of the time the name is about to be rebound anyway), but it's best to keep weird magic like this as insulated as possible to minimise the chance that someday some other weird magic interacts badly with it.
This version of register will work whether it's invoked as a decorator or by the metaclass (which I still think is not a good use of a metaclass). There are some obscure cases where it will fail though:
I can imagine a weird class that doesn't have an __init__ method but inherits one that calls self.someMethod, and someMethod is overridden in the class being defined and makes a super call. Probably unlikely.
The __init__ method might have been defined in another module originally and then used in the class by doing __init__ = externally_defined_function in the class block. Its func_globals would then still be that of the other module, though, which means our temporary binding would clobber any definition of this class's name in that module (oops). Again, unlikely.
Probably other weird cases I haven't thought of.
You could try to add more hacks to make it a little more robust in these situations, but the nature of Python is both that these kind of hacks are possible and that it's impossible to make them absolutely bullet proof.
The answers here didn't work for me in Python 3, because __metaclass__ didn't work there.
Here's my code registering all subclasses of a class at their definition time:
registered_models = set()

class RegisteredModel(type):
    def __new__(cls, clsname, superclasses, attributedict):
        newclass = type.__new__(cls, clsname, superclasses, attributedict)
        # condition to prevent base class registration
        if superclasses:
            registered_models.add(newclass)
        return newclass

class CustomDBModel(metaclass=RegisteredModel):
    pass

class BlogpostModel(CustomDBModel):
    pass

class CommentModel(CustomDBModel):
    pass

# prints out {<class '__main__.BlogpostModel'>, <class '__main__.CommentModel'>}
print(registered_models)
Calling the Base class directly should work (instead of using super()):
def __init__(self):
    Base.__init__(self)
It can also be done with something like this (without a separate register function):
_registry = {}

class MetaClass(type):
    def __init__(cls, clsname, bases, methods):
        super().__init__(clsname, bases, methods)
        _registry[cls.__name__] = cls

class MyClass1(metaclass=MetaClass): pass
class MyClass2(metaclass=MetaClass): pass

print(_registry)
# {'MyClass1': <class '__main__.MyClass1'>, 'MyClass2': <class '__main__.MyClass2'>}
Additionally, if we need to use an abstract base class (e.g. a Base() class), we can do it this way (notice that the metaclass inherits from ABCMeta instead of type):
from abc import ABCMeta

_registry = {}

class MetaClass(ABCMeta):
    def __init__(cls, clsname, bases, methods):
        super().__init__(clsname, bases, methods)
        _registry[cls.__name__] = cls

class Base(metaclass=MetaClass): pass
class MyClass1(Base): pass
class MyClass2(Base): pass

print(_registry)
# {'Base': <class '__main__.Base'>, 'MyClass1': <class '__main__.MyClass1'>, 'MyClass2': <class '__main__.MyClass2'>}
