Metaclasses and when/how functions are called - python

I'm trying to learn how metaclasses work in python 3. Things I want to know are: which functions are called, in what order, and their signatures and returns.
As an example, I know __prepare__ gets called when a class with a metaclass is created (i.e. when the class statement runs), with arguments metaclass, name_of_subclass, bases, and returns a mapping that will be used as the namespace of the class body.
I feel like I understand __prepare__'s step in the process well. What I don't, though, are __init__, __new__, and __call__. What are their arguments? What do they return? How do they all call each other, or in general how does the process go? Currently, I'm stuck on understanding when __init__ is called.
Here is some code I've been messing around with to answer my questions:
#!/usr/bin/env python3

class Logged(type):
    @classmethod
    def __prepare__(cls, name, bases):
        print('In meta __prepare__')
        return {}

    def __call__(subclass):
        print('In meta __call__')
        print('Creating {}.'.format(subclass))
        return subclass.__new__(subclass)

    def __new__(subclass, name, superclasses, attributes, **keyword_arguments):
        print('In meta __new__')
        return type.__new__(subclass, name, superclasses, attributes)

    def __init__(subclass, name, superclasses, attributes, **keyword_arguments):
        print('In meta __init__')

class Thing(metaclass=Logged):
    def __new__(this, *arguments, **keyword_arguments):
        print('In sub __new__')
        return super(Thing, this).__new__(this)

    def __init__(self, *arguments, **keyword_arguments):
        print('In sub __init__')

    def hello(self):
        print('hello')

def main():
    thing = Thing()
    thing.hello()

if __name__ == '__main__':
    main()
From this and some googling, I know that __new__ is really a static method that returns an instance of some object (usually the object where __new__ is defined, but not always), and that __init__ is called on an instance when it is made. By that logic, I'm confused as to why Thing.__init__() isn't being called. Could someone illuminate?
The output of this code prints 'hello', so an instance of Thing is being created, which further confuses me about init. Here's the output:
In meta __prepare__
In meta __new__
In meta __init__
In meta __call__
Creating <class '__main__.Thing'>
In sub __new__
hello
Any help understanding metaclasses would be appreciated. I've read quite a few tutorials, but I've missed some of these details.

First of all: __prepare__ is optional; you don't need to supply an implementation if all you are doing is returning a default empty {} dictionary.
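To show what a non-trivial __prepare__ can buy you, here is an illustrative sketch (all names here are made up for the example): a metaclass whose __prepare__ returns a dict subclass that records the order in which names are bound in the class body.

```python
class RecordingDict(dict):
    """Namespace mapping that records each key as it is first assigned."""
    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        if key not in self:
            self.order.append(key)
        super().__setitem__(key, value)

class OrderLogged(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        # The class body will execute with this mapping as its namespace
        return RecordingDict()

    def __new__(mcs, name, bases, namespace, **kwargs):
        cls = super().__new__(mcs, name, bases, dict(namespace))
        cls.definition_order = namespace.order
        return cls

class Demo(metaclass=OrderLogged):
    x = 1
    def method(self):
        pass

# 'x' and 'method' appear in definition order, after the implicit
# '__module__'/'__qualname__' entries Python stores first
print(Demo.definition_order)
```

(Since Python 3.6 the default class namespace is already ordered, so this particular trick is mostly historical, but it shows the mechanism.)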
Metaclasses work exactly like classes, in that when you call them, then they produce an object. Both classes and metaclasses are factories. The difference is that a metaclass produces a class object when called, a class produces an instance when called.
Both classes and metaclasses define a default __call__ implementation, which basically does:
Call self.__new__ to produce a new object.
If that new object is an instance of self (for a class) or of a class with this metaclass, also call __init__ on that object.
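Those two steps can be sketched as a rough Python equivalent (illustrative only; the real implementation of type.__call__ lives in C):

```python
class Meta(type):
    def __call__(cls, *args, **kwargs):
        # Roughly what the default type.__call__ does:
        obj = cls.__new__(cls, *args, **kwargs)
        if isinstance(obj, cls):
            # only initialise objects that are instances of this class
            type(obj).__init__(obj, *args, **kwargs)
        return obj

class Widget(metaclass=Meta):
    def __init__(self, value):
        self.value = value

w = Widget(42)
print(w.value)  # 42
```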
You produced your own __call__ implementation, which doesn't implement that second step, which is why Thing.__init__ is never called.
You may ask: but the __call__ method is defined on the metaclass. That's correct, and it is exactly that method that runs when you call the class with Thing(). Special methods (those starting and ending with __) are always looked up on the type: type(instance) is the class, and type(class) is the metaclass. That's why a __call__ method on the class itself makes instances callable, while for metaclass() calls it is the type object itself that provides the __call__ implementation. That's right: metaclasses are both subclasses and instances of type, at the same time.
When writing a metaclass, you should only implement __call__ if you want to customise what happens when you call the class. Leave it at the default implementation otherwise.
If I remove the __call__ method from your metaclass (and ignore the __prepare__ method), then Thing.__init__ is once again called:
>>> class Logged(type):
...     def __new__(subclass, name, superclasses, attributes, **keyword_arguments):
...         print('In meta __new__')
...         return type.__new__(subclass, name, superclasses, attributes)
...     def __init__(subclass, name, superclasses, attributes, **keyword_arguments):
...         print('In meta __init__')
...
>>> class Thing(metaclass=Logged):
...     def __new__(this, *arguments, **keyword_arguments):
...         print('In sub __new__')
...         return super(Thing, this).__new__(this)
...     def __init__(self, *arguments, **keyword_arguments):
...         print('In sub __init__')
...     def hello(self):
...         print('hello')
...
In meta __new__
In meta __init__
>>> thing = Thing()
In sub __new__
In sub __init__

In the metaclass's __call__ method, you're calling Thing's __new__ only, but not __init__. It seems that the default behaviour of __call__ is to invoke both of them, as seen when we call the metaclass's inherited __call__:
def __call__(subclass, *args, **kwargs):
    print('In meta __call__')
    print('Creating {}.'.format(subclass))
    return super().__call__(*args, **kwargs)
This prints:
In meta __call__
Creating <class '__main__.Thing'>.
In sub __new__
In sub __init__

Related

How is memory set aside when an object is created out of a class?

I read that __init__ doesn't create the object (set aside memory).
I couldn't find out who does the actual object creation.
How does this object creation happen internally?
You're probably looking for __new__.
From a Python point of view, what happens is something like the following, assuming you have a class, A, that inherits directly from object:
The A.__new__ method gets called with the class and the call arguments
object.__new__ gets called, with cls=A
The A.__init__ method gets called on the returned instance, with the same arguments you passed to A()
Example:
class A():
    def __new__(cls, *args, **kwargs):
        print(f'__new__ called with args {args} and kwargs {kwargs}')
        return super().__new__(cls)  # object.__new__

    def __init__(self, *args, **kwargs):
        print(f'__init__ called with args {args} and kwargs {kwargs}')
        # args are discarded
        for key, arg in kwargs.items():
            setattr(self, key, arg)

a_instance = A('arg', kwarg=1)
print(a_instance.kwarg)
Output:
__new__ called with args ('arg',) and kwargs {'kwarg': 1}
__init__ called with args ('arg',) and kwargs {'kwarg': 1}
1
In general, there is no need to do anything with __new__, because Python objects are usually mutable, and so there is no distinction between initialising instance attributes and modifying them.
The main use case of overriding __new__, in my experience, is when you inherit from immutable types, such as tuple. In such cases, you must initialise all instance attributes at creation, and therefore __init__ is too late.
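As an illustration of that point (a hypothetical Point built on tuple, purely for the example): the tuple's contents are fixed at creation, so they must be supplied in __new__; by the time __init__ runs, the tuple has already been built and cannot be changed.

```python
class Point(tuple):
    def __new__(cls, x, y):
        # tuple contents must be set here; __init__ would be too late
        return super().__new__(cls, (x, y))

    @property
    def x(self):
        return self[0]

    @property
    def y(self):
        return self[1]

p = Point(2, 3)
print(p, p.x, p.y)  # (2, 3) 2 3
```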
When you type something like MyClass() in Python, the runtime calls __new__ on MyClass, which constructs an object; this is followed by invoking __init__ on the newly constructed object. Thus __new__ is called the "constructor", and __init__ the "initialiser". This sequence is coded outside Python (in C, in the case of CPython). Visit the Python documentation to read more on __new__ and __init__.
Short answer
As a short answer, the method which creates the object is __new__ and __init__ just initializes the created object.
Long answer
However, keep reading if you want a deeper insight into what happens when an object is created in Python.
In Python 2.x there were two types of classes: old-style and new-style classes.
class C:         # an old-style class (Python 2)
    pass

class C(object): # a new-style class
    pass
In old-style classes, there was no __new__ method so __init__ was the constructor.
However, in Python 3.x only new-style classes remain (whichever definition form you choose, your class inherits from the base class object). In new-style classes both __new__ and __init__ are available: __new__ is the constructor and __init__ the initializer, and you are permitted to override both. Be cautious, though: you generally don't need to override __new__ except in some cases, such as defining metaclasses, so don't touch it unless it's necessary.
Finally, when an object is going to be created, the constructor will be called before initializer.
class A(object):  # -> don't forget the object specified as base
    def __new__(cls):
        print("A.__new__ called")
        return super(A, cls).__new__(cls)

    def __init__(self):
        print("A.__init__ called")

A()
The output will be:
A.__new__ called
A.__init__ called

Unexpected behavior of setattr() inside Metaclass __init__

Here is a basic Metaclass intending to provide a custom __init__:
class D(type):
    def __init__(cls, name, bases, attributes):
        super(D, cls).__init__(name, bases, attributes)
        for hook in R.registered_hooks:
            setattr(cls, hook.__qualname__, hook)
The for loop iterates over a list of functions, calling setattr(cls, hook.__qualname__, hook) for each of them.
Here is a class that uses the above metaclass:
class E(metaclass=D):
    def __init__(self):
        pass
The weird thing:
E().__dict__ prints {}
but E.__dict__ prints {'f1': <function f1 at 0x001>, 'f2': <function f2 at 0x002>}
I was expecting those functions to be added as instance attributes, since __init__ provides custom initialization for a class, but it seems the attributes were added as class attributes.
What is the cause of this and how can I add attributes to the instance, in this scenario? Thanks!
They are added as instance attributes. The instance, though, is the class, as an instance of the metaclass. You aren't defining a __init__ method for E; you are defining the method that type uses to initialize E itself.
If you just want attributes added to the instance of E, you're working at the wrong level; you should define a mixin and use multiple inheritance for that.
For __init__ to be called on the instantiation of an object obj, you need to have defined type(obj).__init__. By using a metaclass, you defined type(type(obj)).__init__. You are working at the wrong level.
In this case, you only need inheritance, not a metaclass.
class D():
    def __init__(self, *args):
        print('D.__init__ was called')

class E(D):
    pass

e = E()  # prints: D.__init__ was called
Note that you will still find that e.__dict__ is empty.
print(e.__dict__) # {}
This is because your instances have no attributes of their own; methods are stored on, and looked up in, the class.
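A quick way to see that distinction (names here are invented for the demo):

```python
class F:
    def greet(self):
        return 'hi'

f = F()
print(f.__dict__)             # {} -- no per-instance storage for methods
print('greet' in F.__dict__)  # True -- the function lives on the class
f.colour = 'red'              # ordinary data attributes do land on the instance
print(f.__dict__)             # {'colour': 'red'}
```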
If you're using a metaclass, that means you're changing the definition of a class, which indirectly affects the attributes of instances of that class. Assuming you actually want to do this:
You're creating a custom __init__ method. As with normal instantiation, __init__ is called on an object that has been already created. Generally when you're doing something meaningful with metaclasses, you'll want to add to the class's dict before instantiation, which is done in the __new__ method to avoid issues that may come up in subclassing. It should look something like this:
class D(type):
    def __new__(cls, name, bases, dct):
        for hook in R.registered_hooks:
            dct[hook.__qualname__] = hook
        return super(D, cls).__new__(cls, name, bases, dct)
This will not modify the __dict__ of instances of that class, but attribute resolution for instances also looks for class attributes. So if you add foo as an attribute for E in the metaclass, E().foo will return the expected value even though it's not visible in E().__dict__.
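To illustrate with a hypothetical hook (here just a plain module-level function injected by the metaclass; R.registered_hooks from the question is replaced by this one function to keep the snippet self-contained):

```python
def f1(self):
    return 'hook result'

class AddHooks(type):
    def __new__(cls, name, bases, dct):
        # inject the hook into the class dict before the class is created
        dct[f1.__qualname__] = f1
        return super().__new__(cls, name, bases, dct)

class E(metaclass=AddHooks):
    pass

e = E()
print(e.__dict__)  # {} -- nothing stored on the instance
print(e.f1())      # 'hook result' -- found via class attribute lookup
```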

Assuming `obj` has type `objtype`, are `super(cls,obj)` and `super(cls,objtype)` the same?

In Python, assuming that obj has type objtype, are super(cls,obj) and super(cls,objtype) the same?
Is it correct that super(cls,obj) converts obj to another object whose class is a superclass of objtype which is after cls in the MRO of objtype?
What does super(cls,objtype) mean then?
For example, given an implementation of the Singleton design pattern:
class Singleton(object):
    _singletons = {}

    def __new__(cls, *args, **kwds):
        if cls not in cls._singletons:
            cls._singletons[cls] = super(Singleton, cls).__new__(cls)
        return cls._singletons[cls]
any subclass of Singleton (that does not further override __new__ ) has exactly one instance.
What does super(Singleton, cls) mean, where cls is a class? What does it return?
Thanks.
According to the docs, super
Return a proxy object that delegates method calls to a parent or sibling class of type.
So super returns an object which knows how to call the methods of other classes in the class hierarchy.
The second argument to super is the object to which super is bound; generally this is an instance of the class, but if super is being called in the context of a method that is a classmethod or staticmethod then we want to call the method on the class object itself rather than an instance.
So calling super(SomeClass, cls).some_method() means call some_method on the classes that SomeClass descends from, rather than on instances of these classes. Otherwise super calls behave just like a super call in an instance method.
The usage looks more natural in less complicated code:
class C(SomeBaseClass):
    def method(self):
        # bind super to 'self'
        super(C, self).method()

    @classmethod
    def cmethod(cls):
        # no 'self' here -- bind to cls
        super(C, cls).cmethod()
Note that super(C, cls) is required in Python 2, but an empty super() is enough in Python 3.
In your singleton example, super(Singleton, cls).__new__(cls) returns the result of calling object.__new__(cls), an instance of Singleton. It's being created this way to avoid recursively calling Singleton.__new__.
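You can check the effect directly. Reusing the Singleton from the question, plus a made-up Config subclass: two calls yield the same object. (Note that since __new__ returns an instance of cls, __init__ would still re-run on every call if Config defined one; that's a common gotcha with this pattern.)

```python
class Singleton(object):
    _singletons = {}

    def __new__(cls, *args, **kwds):
        if cls not in cls._singletons:
            # object.__new__(cls) -- the only place the instance is created
            cls._singletons[cls] = super(Singleton, cls).__new__(cls)
        return cls._singletons[cls]

class Config(Singleton):
    pass

a = Config()
b = Config()
print(a is b)  # True
```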

Why define both '__new__' and '__init__' in a class?

I think you can define either '__init__' or '__new__' in a class, but why are both defined in django.utils.datastructures.py?
my code:
class a(object):
    def __init__(self):
        print 'aaa'
    def __new__(self):
        print 'sss'

a()  # prints 'sss'

class b:
    def __init__(self):
        print 'aaa'
    def __new__(self):
        print 'sss'

b()  # prints 'aaa'
datastructures.py:
class SortedDict(dict):
    """
    A dictionary that keeps its keys in the order in which they're inserted.
    """
    def __new__(cls, *args, **kwargs):
        instance = super(SortedDict, cls).__new__(cls, *args, **kwargs)
        instance.keyOrder = []
        return instance

    def __init__(self, data=None):
        if data is None:
            data = {}
        super(SortedDict, self).__init__(data)
        if isinstance(data, dict):
            self.keyOrder = data.keys()
        else:
            self.keyOrder = []
            for key, value in data:
                if key not in self.keyOrder:
                    self.keyOrder.append(key)
And under what circumstances will SortedDict.__init__ be called?
thanks
You can define either or both of __new__ and __init__.
__new__ must return an object -- which can be a new one (typically that task is delegated to type.__new__), an existing one (to implement singletons, "recycle" instances from a pool, and so on), or even one that's not an instance of the class. If __new__ returns an instance of the class (new or existing), __init__ then gets called on it; if __new__ returns an object that's not an instance of the class, then __init__ is not called.
__init__ is passed a class instance as its first item (in the same state __new__ returned it, i.e., typically "empty") and must alter it as needed to make it ready for use (most often by adding attributes).
In general it's best to use __init__ for all it can do -- and __new__, if something is left that __init__ can't do, for that "extra something".
So you'll typically define both if there's something useful you can do in __init__, but not everything you want to happen when the class gets instantiated.
For example, consider a class that subclasses int but also has a foo slot -- and you want it to be instantiated with an initializer for the int and one for the .foo. As int is immutable, that part has to happen in __new__, so pedantically one could code:
>>> class x(int):
...     def __new__(cls, i, foo):
...         self = int.__new__(cls, i)
...         return self
...     def __init__(self, i, foo):
...         self.foo = foo
...     __slots__ = 'foo',
...
>>> a = x(23, 'bah')
>>> print a
23
>>> print a.foo
bah
>>>
In practice, for a case this simple, nobody would mind if you lost the __init__ and just moved the self.foo = foo to __new__. But if initialization is rich and complex enough to be best placed in __init__, this idea is worth keeping in mind.
__new__ and __init__ do completely different things. The method __init__ initializes a new instance of a class. __new__ is a far subtler thing: it can change the arguments and, in fact, the class of the resulting object. For example, the following code:
class Meters(object):
    def __new__(cls, value):
        return int(value / 3.28083)
If you call Meters(6) you will not actually create an instance of Meters, but an instance of int. You might wonder why this is useful; it is actually crucial to metaclasses, an admittedly obscure (but powerful) feature.
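You can check this directly (repeating the Meters class so the snippet is self-contained). Note the connection to the earlier answers: because __new__ returns an object that is not an instance of Meters, Meters.__init__ is never called.

```python
class Meters(object):
    def __new__(cls, value):
        # returns a plain int, not a Meters instance
        return int(value / 3.28083)

m = Meters(6)
print(type(m))                # <class 'int'>
print(isinstance(m, Meters))  # False
print(m)                      # 1
```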
You'll note that in Python 2.x, only classes inheriting from object can take advantage of __new__, as your code above shows.
The use of __new__ you showed in django seems to be an attempt to keep a sane method resolution order on SortedDict objects. I will admit, though, that it is often hard to tell why __new__ is necessary. Standard Python style suggests that it not be used unless necessary (as always, better class design is the tool you turn to first).
My only guess is that in this case, they (author of this class) want the keyOrder list to exist on the class even before SortedDict.__init__ is called.
Note that SortedDict calls super() in its __init__, this would ordinarily go to dict.__init__, which would probably call __setitem__ and the like to start adding items. SortedDict.__setitem__ expects the .keyOrder property to exist, and therein lies the problem (since .keyOrder isn't normally created until after the call to super().) It's possible this is just an issue with subclassing dict because my normal gut instinct would be to just initialize .keyOrder before the call to super().
The code in __new__ might also be there to allow SortedDict to be subclassed in a diamond inheritance structure, where it is possible SortedDict.__init__ is not called before the first __setitem__ and the like are called. Django has to contend with various issues in supporting a wide range of Python versions from 2.3 up; it's possible this code is completely unnecessary in some versions and needed in others.
There is a common use for defining both __new__ and __init__: accessing class properties which may be eclipsed by their instance versions without having to do type(self) or self.__class__ (which, in the existence of metaclasses, may not even be the right thing).
For example:
class MyClass(object):
    creation_counter = 0

    def __new__(cls, *args, **kwargs):
        cls.creation_counter += 1
        return super(MyClass, cls).__new__(cls)

    def __init__(self):
        print "I am the %dth myclass to be created!" % self.creation_counter
Finally, __new__ can actually return an instance of a wrapper or a completely different class from what you thought you were instantiating. This is used to provide metaclass-like features without actually needing a metaclass.
In my opinion, there was no need of overriding __new__ in the example you described.
Creation of an instance and the actual memory allocation happen in __new__; __init__ is called after __new__ and is meant for initializing the instance, serving the job of a constructor in classical OOP terms. So, if all you want to do is initialize variables, you should override __init__.
The real role of __new__ comes into play when you are using metaclasses. There, if you want to change or add attributes in a way that must happen before the class is created, you should override __new__.
Consider, a completely hypothetical case where you want to make some attributes of class private, even though they are not defined so (I'm not saying one should ever do that).
class PrivateMetaClass(type):
    def __new__(metaclass, classname, bases, attrs):
        private_attributes = ['name', 'age']
        for private_attribute in private_attributes:
            if attrs.get(private_attribute):
                attrs['_' + private_attribute] = attrs[private_attribute]
                attrs.pop(private_attribute)
        return super(PrivateMetaClass, metaclass).__new__(metaclass, classname, bases, attrs)

class Person(object):
    __metaclass__ = PrivateMetaClass
    name = 'Someone'
    age = 19

person = Person()

>>> hasattr(person, 'name')
False
>>> person._name
'Someone'
Again, it's just for instructional purposes; I'm not suggesting one should actually do this.

Python multiple inheritance: which __new__ to call?

I have a class Parent. I want to define a __new__ for Parent so it does some magic upon instantiation (for why, see footnote). I also want children classes to inherit from this and other classes to get Parent's features. The Parent's __new__ would return an instance of a subclass of the child class's bases and the Parent class.
This is how the child class would be defined:
class Child(Parent, list):
    pass
But now I don't know what __new__ to call in Parent's __new__. If I call object.__new__, the above Child example complains that list.__new__ should be called. But how would Parent know that? I made it work so it loops through all the __bases__, and call each __new__ inside a try: block:
class Parent(object):
    def __new__(cls, *args, **kwargs):
        # There is a special wrapper function for instantiating instances of
        # children classes that passes in a 'bases' argument, which is the
        # __bases__ of the Child class.
        bases = kwargs.get('bases')
        if bases:
            cls = type('name', bases + (cls,), kwargs.get('attr', {}))
            for base in cls.__mro__:
                if base not in (cls, MyMainType):
                    try:
                        obj = base.__new__(cls)
                        break
                    except TypeError:
                        pass
            return obj
        return object.__new__(cls)
But this just looks like a hack. Surely, there must be a better way of doing this?
Thanks.
The reason I want to use __new__ is so I can return an object of a subclass that has some dynamic attributes (the magic __int__ attributes, etc) assigned to the class. I could have done this in __init__, but I would not be able to modify self.__class__ in __init__ if the new class has a different internal structure, which is the case here due to multiple inheritance.
I think this will get you what you want:
return super(Parent, cls).__new__(cls, *args, **kwargs)
and you won't need the bases keyword argument. Unless I'm getting you wrong and you're putting that in there on purpose.
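A minimal sketch of how the cooperative call plays out with the Child(Parent, list) layout from the question (argument handling simplified; object.__new__ rejects extra arguments in Python 3, so none are forwarded here):

```python
class Parent(object):
    def __new__(cls, *args, **kwargs):
        # Delegate along the MRO: for Child (MRO: Child, Parent, list, object)
        # this reaches list.__new__, which knows how to allocate the
        # list-based instance -- no manual loop over __bases__ needed.
        return super(Parent, cls).__new__(cls)

class Child(Parent, list):
    pass

c = Child()
c.extend([1, 2, 3])
print(type(c).__mro__)
print(c)  # [1, 2, 3]
```

This is the point of cooperative super() calls: each __new__ hands off to the next class in the MRO, so the right allocator runs without Parent having to know about list at all.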
