Python 3 metaclass: use to validate constructor argument

I was planning to use a metaclass to validate the constructor argument in Python 3, but it seems the __new__ method has no access to the variable val, because the class A() has not been instantiated yet.
So what's the correct way to do it?
class MyMeta(type):
    def __new__(cls, clsname, superclasses, attributedict):
        print("clsname: ", clsname)
        print("superclasses: ", superclasses)
        print("attributedict: ", attributedict)
        return type.__new__(cls, clsname, superclasses, attributedict)

class A(metaclass=MyMeta):
    def __init__(self, val):
        self.val = val

A(123)

... it seems the __new__ method has no access to the variable val, because the class A() has not been instantiated yet.
Exactly.
So what's the correct way to do it?
Not with a metaclass.
Metaclasses are for fiddling with the creation of the class object itself, and what you want to do is related to instances of the class.
Best practice: don't type-check the val at all. Pythonic code is duck-typed. Simply document that you expect a string-like argument, and users who put garbage in get garbage out.
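If you do want a hard check, the conventional place is the class's own __init__; no metaclass is required. A minimal sketch, assuming val should be a string:

```python
class A:
    def __init__(self, val):
        # validate at instance-creation time, inside the class itself
        if not isinstance(val, str):
            raise TypeError('val must be a string')
        self.val = val

print(A('foo').val)  # foo
try:
    A(123)
except TypeError as exc:
    print(exc)  # val must be a string
```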

wim is absolutely correct that this isn't a good use of metaclasses, but it's certainly possible (and easy, too).
Consider how you would create a new instance of your class. You do this:
A(123)
In other words: you create an instance by calling the class. And Python allows us to create custom callable objects by defining a __call__ method. So all we have to do is implement a suitable __call__ method in our metaclass:
class MyMeta(type):
    def __call__(self, val):
        if not isinstance(val, str):
            raise TypeError('val must be a string')
        return super().__call__(val)

class A(metaclass=MyMeta):
    def __init__(self, val):
        self.val = val
And that's it. Simple, right?
>>> A('foo')
<__main__.A object at 0x007886B0>
>>> A(123)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "untitled.py", line 5, in __call__
    raise TypeError('val must be a string')
TypeError: val must be a string

Related

How to troubleshoot `super()` calls finding incorrect type and obj?

I have a decorator in my library which takes a user's class and creates a new version of it with a new metaclass; it is supposed to completely replace the original class. Everything works, except for super() calls:
class NewMeta(type):
    pass

def deco(cls):
    cls_dict = dict(cls.__dict__)
    if "__dict__" in cls_dict:
        del cls_dict["__dict__"]
    if "__weakref__" in cls_dict:
        del cls_dict["__weakref__"]
    return NewMeta(cls.__name__, cls.__bases__, cls_dict)

@deco
class B:
    def x(self):
        print("Hi there")

@deco
class A(B):
    def x(self):
        super().x()
Using this code like so yields an error:
>>> a = A()
>>> a.x()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in x
TypeError: super(type, obj): obj must be an instance or subtype of type
Some terminology:
- The source code class A, as produced by class A(B):.
- The produced class A*, as produced by NewMeta(cls.__name__, cls.__bases__, cls_dict).
A is established by Python to be the type when using super inside of the methods of A*. How can I correct this?
There are some suboptimal solutions, like calling super(type(self), self).x, or passing cls.__mro__ instead of cls.__bases__ into the NewMeta call (so that obj=self always inherits from the incorrect type=A). The first is unsustainable for end users; the second pollutes the inheritance chain and is confusing, as the class seems to inherit from itself.
Python seems to introspect the source code, or perhaps stores some information, to automatically establish the type, and in this case it is failing to do so.
How can I make sure that inside the methods of A*, A* is established as the type argument of argumentless super() calls?
The argument-free super uses the __class__ cell, which is a regular function closure.
Data Model: Creating the class object
__class__ is an implicit closure reference created by the compiler if any methods in a class body refer to either __class__ or super.
>>> class E:
...     def x(self):
...         return __class__  # return the __class__ cell
...
>>> E().x()
<class '__main__.E'>
>>> # The cell is stored as a __closure__
>>> E.x.__closure__[0].cell_contents is E().x() is E
True
Like any other closure, this is a lexical relation: it refers to the class scope in which the method was literally defined. Replacing the class with a decorator still leaves the methods referring to the original class.
The simplest fix is to explicitly refer to the name of the class, which gets rebound to the newly created class by the decorator.
@deco
class A(B):
    def x(self):
        super(A, self).x()
Alternatively, one can change the content of the __class__ cell to point to the new class:
def deco(cls):
    cls_dict = dict(cls.__dict__)
    cls_dict.pop("__dict__", None)
    cls_dict.pop("__weakref__", None)
    new_cls = NewMeta(cls.__name__, cls.__bases__, cls_dict)
    for method in new_cls.__dict__.values():
        if getattr(method, "__closure__", None) and method.__closure__[0].cell_contents is cls:
            method.__closure__[0].cell_contents = new_cls
    return new_cls
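Putting the pieces together as a runnable sketch (cell_contents became writable in Python 3.7, so this needs 3.7+; B and A mirror the question, using return instead of print so the result is easy to check):

```python
class NewMeta(type):
    pass

def deco(cls):
    cls_dict = dict(cls.__dict__)
    cls_dict.pop("__dict__", None)
    cls_dict.pop("__weakref__", None)
    new_cls = NewMeta(cls.__name__, cls.__bases__, cls_dict)
    # repoint each method's __class__ cell from the old class to the new one
    for method in new_cls.__dict__.values():
        closure = getattr(method, "__closure__", None)
        if closure and closure[0].cell_contents is cls:
            closure[0].cell_contents = new_cls
    return new_cls

@deco
class B:
    def x(self):
        return "Hi there"

@deco
class A(B):
    def x(self):
        # argumentless super now resolves through the patched cell
        return super().x()

print(A().x())  # Hi there
```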

What is the correct way to call super() in dynamically added methods?

I defined a metaclass which adds a method named "test" to the created classes:
class FooMeta(type):
    def __new__(mcls, name, bases, attrs):
        def test(self):
            return super().test()
        attrs["test"] = test
        cls = type.__new__(mcls, name, bases, attrs)
        return cls
Then I create two classes using this Metaclass
class A(metaclass=FooMeta):
    pass

class B(A):
    pass
When I run
a = A()
a.test()
a TypeError is raised at super().test():
super(type, obj): obj must be an instance or subtype of type
Which means super() cannot infer the parent class correctly. If I change the super call into
def __new__(mcls, name, bases, attrs):
    def test(self):
        return super(cls, self).test()
    attrs["test"] = test
    cls = type.__new__(mcls, name, bases, attrs)
    return cls
then the raised error becomes:
AttributeError: 'super' object has no attribute 'test'
which is expected as the parent of A does not implement test method.
So my question is what is the correct way to call super() in a dynamically added method? Should I always write super(cls, self) in this case? If so, it is too ugly (for python3)!
Parameterless super() is very special in Python because it triggers some behavior at code-compilation time itself: Python creates an invisible __class__ variable which is a reference to the "physical" class statement body where the super() call is embedded (the same happens if one makes direct use of the __class__ variable inside a class method).
In this case, the "physical" class where super() is called is the metaclass FooMeta itself, not the class it is creating.
The workaround is to use the version of super which takes two positional arguments: the class in which it will search for the immediate superclass, and the instance itself.
In Python 2, and on other occasions when one prefers the parameterized form of super, it is normal to use the class name itself as the first parameter: at runtime this name is available as a global variable in the current module. That is, if class A were statically coded in the source file with a def test(...): method, you would use super(A, self).test(...) inside its body.
Here, however, the class name won't be available as a variable in the module defining the metaclass, yet you still need to pass a reference to the class as the first argument to super. Since the test method receives self as a reference to the instance, its class is given by self.__class__ or type(self).
TL;DR: just change the super call in your dynamic method to read:
class FooMeta(type):
    def __new__(mcls, name, bases, attrs):
        def test(self):
            return super(type(self), self).test()
        attrs["test"] = test
        cls = type.__new__(mcls, name, bases, attrs)
        return cls
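For example (Base and A are hypothetical names), this resolves correctly when some real ancestor defines test. One caveat worth noting: because type(self) is always the instance's concrete class, this form can recurse if a subclass also receives a dynamically added test, so it is best suited to flat hierarchies:

```python
class FooMeta(type):
    def __new__(mcls, name, bases, attrs):
        def test(self):
            # type(self) stands in for the class, which has no name yet here
            return super(type(self), self).test()
        attrs["test"] = test
        return type.__new__(mcls, name, bases, attrs)

class Base:
    def test(self):
        return "base"

class A(Base, metaclass=FooMeta):
    pass

print(A().test())  # base
```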

traceback behaviour for __init__ errors when using __call__ in metaclasses?

Using the following code:
class Meta(type):
    def __new__(mcl, name, bases, nmspc):
        return super(Meta, mcl).__new__(mcl, name, bases, nmspc)

class TestClass(object):
    __metaclass__ = Meta
    def __init__(self):
        pass

t = TestClass(2)  # deliberate error
produces the following:
Traceback (most recent call last):
  File "foo.py", line 12, in <module>
    t = TestClass(2)
TypeError: __init__() takes exactly 1 argument (2 given)
However using __call__ instead of __new__ in the following code:
class Meta(type):
    def __call__(cls, *args, **kwargs):
        instance = super(Meta, cls).__call__(*args, **kwargs)
        # do something
        return instance

class TestClass(object):
    __metaclass__ = Meta
    def __init__(self):
        pass

t = TestClass(2)  # deliberate error
gives me the following traceback:
Traceback (most recent call last):
  File "foo.py", line 14, in <module>
    t = TestClass(2)
  File "foo.py", line 4, in __call__
    instance = super(Meta, cls).__call__(*args, **kwargs)
TypeError: __init__() takes exactly 1 argument (2 given)
Does type also trigger the __init__ of the class from its __call__ or is the behaviour changed when I add the metaclass?
Both __new__ and __call__ are being run by the call to the class constructor. Why is __call__ showing up in the error message but not __new__?
Is there a way of suppressing the lines of the traceback showing the __call__ for the metaclass here? i.e. when the error is in the call to the constructor and not the __call__ code?
Let's see if I can answer your three questions:
Does type also trigger the __init__ of the class from its __call__ or is the behaviour changed when I add the metaclass?
The default behavior of type.__call__ is to create a new object with cls.__new__ (which may be inherited from object.__new__, or call it with super). If the object returned from cls.__new__ is an instance of cls, type.__call__ will then run cls.__init__ on it.
If you define your own __call__ method in a custom metaclass, it can do almost anything. Usually though you'll call type.__call__ at some point (via super) and so the same behavior will happen. This isn't required though. You can return anything from a metaclass's __call__ method.
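Roughly (a sketch of the default behavior, not CPython's exact implementation, shown with Python 3 metaclass syntax and made-up names Meta/Point), type.__call__ does something like this:

```python
class Meta(type):
    def __call__(cls, *args, **kwargs):
        # approximately what type.__call__ does by default
        instance = cls.__new__(cls, *args, **kwargs)
        if isinstance(instance, cls):
            # __init__ only runs if __new__ returned an instance of cls
            cls.__init__(instance, *args, **kwargs)
        return instance

class Point(metaclass=Meta):
    def __init__(self, x, y):
        self.x, self.y = x, y

p = Point(1, 2)
print(p.x, p.y)  # 1 2
```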
Both __new__ and __call__ are being run by the call to the class constructor. Why is __call__ showing up in the error message but not __new__?
You're misunderstanding what Meta.__new__ is for. The __new__ method in a metaclass is not called when you make an instance of the normal class. It is called when you make an instance of the metaclass, which is the class object itself.
Try running this code, to better understand what is going on:
print("Before Meta")

class Meta(type):
    def __new__(meta, name, bases, dct):
        print("In Meta.__new__")
        return super(Meta, meta).__new__(meta, name, bases, dct)

    def __call__(cls, *args, **kwargs):
        print("In Meta.__call__")
        return super(Meta, cls).__call__(*args, **kwargs)

print("After Meta, before Cls")

class Cls(object):
    __metaclass__ = Meta
    def __init__(self):
        print("In Cls.__init__")

print("After Cls, before obj")

obj = Cls()

print("Bottom of file")
The output you'll get is:
Before Meta
After Meta, before Cls
In Meta.__new__
After Cls, before obj
In Meta.__call__
In Cls.__init__
Bottom of file
Note that Meta.__new__ is called where the regular class Cls is defined, not when the instance of Cls is created. The Cls class object is in fact an instance of Meta, so this makes some sense.
The difference in your exception tracebacks comes from this fact. When the exception occurs, the metaclass's __new__ method has long since finished (since if it didn't, there wouldn't have been a regular class to call at all).
Is there a way of suppressing the lines of the traceback showing the __call__ for the metaclass here? i.e. when the error is in the call to the constructor and not the __call__ code?
Yes and no. It's probably possible, but it's almost certainly a bad idea. Python's stack traces will, by default, show you the full call stack (excluding builtin stuff that's implemented in C rather than Python). That's their purpose. The problem causing an exception in your code is not always going to be in the last call, even in less confusing areas than metaclasses.
Consider this trivial example:
def a(*args):
    b(args)  # note, no * here, so a single tuple will be passed on

def b(*args):
    c(*args)

def c():
    print(x)

a()
In this code, there's an error in the a function, but an exception is only raised when b calls c with the wrong number of arguments.
I suppose if you needed to you could pretty things up a bit by editing the data in the stack trace object somewhere, but if you do that automatically it is likely to make things much more confusing if you ever encounter an actual error in the metaclass code.
In fact, what the interpreter is complaining about is that __init__ takes no argument besides self, yet one was passed.
You should either make __init__ accept the argument:
class TestClass(object):
    __metaclass__ = Meta
    def __init__(self, arg):
        pass

t = TestClass('arg')
or:
class TestClass(object):
    __metaclass__ = Meta
    def __init__(self):
        pass

t = TestClass()

Determine if __getattr__ is method or attribute call

Is there any way to determine the difference between a method and an attribute call using __getattr__?
I.e. in:
class Bar(object):
    def __getattr__(self, name):
        if THIS_IS_A_METHOD_CALL:
            # Handle method call
            def method(**kwargs):
                return 'foo'
            return method
        else:
            # Handle attribute call
            return 'bar'

foo = Bar()
print(foo.test_method())   # foo
print(foo.test_attribute)  # bar
The methods are not local so it's not possible to determine it using getattr/callable. I also understand that methods are attributes, and that there might not be a solution. Just hoping there is one.
You cannot tell how an object is going to be used in the __getattr__ hook, at all. You can access methods without calling them, store them in a variable, and later call them, for example.
Return an object with a __call__ method, it'll be invoked when called:
class CallableValue(object):
    def __init__(self, name):
        self.name = name
    def __call__(self, *args, **kwargs):
        print("Lo, {} was called!".format(self.name))

class Bar(object):
    def __getattr__(self, name):
        return CallableValue(name)
but instances of this will not be the same thing as a string or a list at the same time.
Demo:
>>> class CallableValue(object):
...     def __init__(self, name):
...         self.name = name
...     def __call__(self, *args, **kwargs):
...         print("Lo, {} was called!".format(self.name))
...
>>> class Bar(object):
...     def __getattr__(self, name):
...         return CallableValue(name)
...
>>> b = Bar()
>>> something = b.test_method
>>> something
<__main__.CallableValue object at 0x10ac3c290>
>>> something()
Lo, test_method was called!
In short, no, there is no reliable way - the issue is that a method is an attribute in Python - there is no distinction made. It just happens to be an attribute that is a bound method.
You can check if the attribute is a method, but there is no guarantee that means it will be called, e.g.:
class Test:
    def test(self):
        ...

Test().test  # This accesses the method, but doesn't call it!
There is no way, at attribute-access time, to know whether the function will be called once it is returned; that's a future event that hasn't yet been processed.
If you are willing to assume that a method being accessed is a method being called, you can determine that it is a method being accessed with a check like this:
hasattr(value, "__self__") and value.__self__ is self
Where value is the attribute you want to check to see if it is a method or some other attribute, and self is the instance you want to see if it's a method for.
If you need something to happen when it is called, you could use this moment to decorate the function.
A solid code example of this can be found here.
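A quick illustration of that __self__ check (Thing is a made-up class name):

```python
class Thing:
    def method(self):
        return "hi"

t = Thing()
attr = t.method  # accessing, not calling: a bound method object
print(hasattr(attr, "__self__") and attr.__self__ is t)  # True

attr = "plain value"  # an ordinary attribute value
print(hasattr(attr, "__self__"))  # False
```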
It is not possible to know whether an attribute is subsequently called from __getattr__, which Martijn Pieters explains.
Although it does not determine whether an attribute is subsequently called, it is possible to know whether an attribute can be called, using callable. Another way is to use type to keep track of the various objects, or to make a list of attribute names.
class Foo(object):
    bar_attribute = 'callable'

    def __getattr__(self, name):
        instanceOrValue = getattr(self, "bar_%s" % name)
        if callable(instanceOrValue):
            # Handle object that can be called
            def wrap(**kwargs):
                return "is %s" % instanceOrValue(**kwargs)
            return wrap
        # Handle object that can not be called
        return 'not %s' % instanceOrValue

    def bar_method(self, **kwargs):
        return 'callable'

foo = Foo()
print(foo.method())    # is callable
print(foo.attribute)   # not callable
__getattr__ can only keep track of certain things, but it can be the right solution in many situations, because overriding __call__ changes call behavior in all situations, not only when the attribute is reached through the Foo class.
Difference between calling a method and accessing an attribute
What is a "callable"?
Python __call__ special method practical example

Metaclasses in Python: a couple of questions to clarify

After crashing into metaclasses, I delved into the topic of metaprogramming in Python, and I have a couple of questions that are, IMHO, not clearly answered in the available docs.
1. When using both __new__ and __init__ in a metaclass, must their arguments be defined the same?
2. What's the most efficient way to define the class __init__ in a metaclass?
3. Is there any way to refer to the class instance (normally self) in a metaclass?
When using both __new__ and __init__ in a metaclass, must their arguments be defined the same?
I think Alex Martelli explains it most succinctly:
class Name(Base1, Base2):
    __metaclass__ = suitable_metaclass
    <<body>>

means

Name = suitable_metaclass('Name', (Base1, Base2), <<dict-built-by-body>>)
So stop thinking about suitable_metaclass as a metaclass for a moment and just regard it as a class. Whenever you see

suitable_metaclass('Name', (Base1, Base2), <<dict-built-by-body>>)

it tells you that suitable_metaclass's __new__ method must have a signature something like

def __new__(metacls, name, bases, dct)

and an __init__ method like

def __init__(cls, name, bases, dct)

So the signatures are not exactly the same, but they differ only in the first argument.
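So a metaclass defining both typically looks like this minimal sketch (Python 3 syntax; the only real difference is the conventional name and meaning of the first parameter):

```python
class Meta(type):
    def __new__(metacls, name, bases, dct):
        # first argument: the metaclass itself
        return super().__new__(metacls, name, bases, dct)

    def __init__(cls, name, bases, dct):
        # first argument: the class that __new__ just created
        super().__init__(name, bases, dct)

class C(metaclass=Meta):
    pass

print(type(C).__name__)  # Meta
```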
What's the most efficient way to define the class __init__ in a metaclass?
What do you mean by efficient? It is not necessary to define the __init__ unless you want to.
Is there any way to refer to the class instance (normally self) in a metaclass?
No, and you should not need to. Anything that depends on the class instance should be dealt with in the class definition, rather than in the metaclass.
For 1: The __init__ and __new__ of any class have to accept the same arguments, because they will be called with the same arguments. It's common for __new__ to take extra arguments that it ignores (e.g. object.__new__ takes any arguments and ignores them) so that __new__ doesn't have to be overridden during inheritance, but you usually only do that when you have no custom __new__ at all.
This isn't a problem here, because as stated, metaclasses are always called with the same set of arguments, so you can't run into trouble. With the arguments, at least. But if you're modifying the arguments that are passed to the parent class, you need to modify them in both.
For 2: You usually don't define the class __init__ in a metaclass. You can write a wrapper and replace the __init__ of the class in either __new__ or __init__ of the metaclass, or you can redefine the __call__ on the metaclass. The former would act weirdly if you use inheritance.
import functools

class A(type):
    def __call__(cls, *args, **kwargs):
        r = super(A, cls).__call__(*args, **kwargs)
        print "%s was instantiated" % (cls.__name__, )
        print "the new instance is %r" % (r, )
        return r

class B(type):
    def __init__(cls, name, bases, dct):
        super(B, cls).__init__(name, bases, dct)
        if '__init__' not in dct:
            return
        old_init = dct['__init__']
        @functools.wraps(old_init)
        def __init__(self, *args, **kwargs):
            old_init(self, *args, **kwargs)
            print "%s (%s) was instantiated" % (type(self).__name__, cls.__name__)
            print "the new instance is %r" % (self, )
        cls.__init__ = __init__

class T1:
    __metaclass__ = A

class T2:
    __metaclass__ = B
    def __init__(self):
        pass

class T3(T2):
    def __init__(self):
        super(T3, self).__init__()
And the result from calling it:
>>> T1()
T1 was instantiated
the new instance is <__main__.T1 object at 0x7f502c104290>
<__main__.T1 object at 0x7f502c104290>
>>> T2()
T2 (T2) was instantiated
the new instance is <__main__.T2 object at 0x7f502c0f7ed0>
<__main__.T2 object at 0x7f502c0f7ed0>
>>> T3()
T3 (T2) was instantiated
the new instance is <__main__.T3 object at 0x7f502c104290>
T3 (T3) was instantiated
the new instance is <__main__.T3 object at 0x7f502c104290>
<__main__.T3 object at 0x7f502c104290>
For 3: Yes, from __call__ as shown above.
