Mix-in of abstract class and namedtuple - python

I want to define a mix-in of a namedtuple and a base class which defines an abstract method:
import abc
import collections

class A(object):
    __metaclass__ = abc.ABCMeta

    @abc.abstractmethod
    def do(self):
        print("U Can't Touch This")

B = collections.namedtuple('B', 'x, y')

class C(B, A):
    pass

c = C(x=3, y=4)
print(c)
c.do()
From what I understand from the docs and other examples I have seen, c.do() should raise an error, as class C does not implement do(). However, when I run it... it works:
B(x=3, y=4)
U Can't Touch This
I must be overlooking something.

When you take a look at the method resolution order of C, you see that B comes before A in that list. That means that when you instantiate C, the __new__ method of B is called first.
This is the implementation of namedtuple.__new__
def __new__(_cls, {arg_list}):
    'Create new instance of {typename}({arg_list})'
    return _tuple.__new__(_cls, ({arg_list}))
You can see that it does not support cooperative inheritance: it breaks the chain and simply calls tuple's __new__ method. That way the abstract-method check (which normally happens in object.__new__ when the class still has unimplemented abstract methods) is never executed, so the instantiation does not fail.
I thought inverting the base order (class C(A, B)) would solve the problem, but strangely it did not. In hindsight this makes sense: A does not define a __new__ of its own, so the lookup for C.__new__ still resolves to the namedtuple's __new__, and the check is bypassed all the same.
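One way to restore the check (a minimal sketch of mine, not from the original answer; it assumes Python 3 and re-implements the abstract-method check by hand in __new__):

import abc
import collections

class A(abc.ABC):
    @abc.abstractmethod
    def do(self):
        print("U Can't Touch This")

B = collections.namedtuple('B', 'x, y')

class C(B, A):
    def __new__(cls, *args, **kwargs):
        # Re-apply the check that tuple.__new__ bypasses.
        if getattr(cls, '__abstractmethods__', None):
            raise TypeError("Can't instantiate abstract class %s "
                            "with abstract method(s) %s"
                            % (cls.__name__, ', '.join(cls.__abstractmethods__)))
        return super().__new__(cls, *args, **kwargs)

C(x=3, y=4)  # now raises TypeError, since do() is still abstract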


Calling super().method() vs. BaseClass.method(self)

There are two main ways for a derived class to call a base class's methods.
Base.method(self):
class Derived(Base):
    def method(self):
        Base.method(self)
        ...
or super().method():
class Derived(Base):
    def method(self):
        super().method()
        ...
Suppose I now do this:
obj = Derived()
obj.method()
As far as I know, both Base.method(self) and super().method() do the same thing here. Both will call Base.method with a reference to obj. In particular, super() doesn't do the legwork of instantiating an object of type Base. Instead, it creates a new object of type super that holds references to obj and to Derived, and when you ask it for an attribute, it dynamically looks that attribute up in the classes that follow Derived in obj's MRO (here, Base).
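A quick illustration of that proxy object (my addition, not part of the question):

class Base:
    def method(self):
        return "Base.method"

class Derived(Base):
    def method(self):
        proxy = super()
        print(type(proxy))   # <class 'super'>
        return proxy.method()

print(Derived().method())    # Base.method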
The super() method has the advantage of minimizing the work you need to do when you change the base for a derived class. On the other hand, Base.method uses less magic and may be simpler and clearer when a class inherits from multiple base classes.
Most of the discussions I've seen recommend calling super(), but is this an established standard among Python coders? Or are both of these methods widely used in practice? For example, answers to this stackoverflow question go both ways, but generally use the super() method. On the other hand, the Python textbook I am teaching from this semester only shows the Base.method approach.
Using super() implies the idea that whatever follows should be delegated to the base class, no matter what it is. It's about the semantics of the statement. Referring explicitly to Base on the other hand conveys the idea that Base was chosen explicitly for some reason (perhaps unknown to the reader), which might have its applications too.
Apart from that however there is a very practical reason for using super(), namely cooperative multiple inheritance. Suppose you've designed the following class hierarchy:
class Base:
    def test(self):
        print('Base.test')

class Foo(Base):
    def test(self):
        print('Foo.test')
        Base.test(self)

class Bar(Base):
    def test(self):
        print('Bar.test')
        Base.test(self)
Now you can use both Foo and Bar, and everything works as expected. However, these two classes won't work together in a multiple-inheritance scheme:
class Test(Foo, Bar):
    pass

Test().test()
# Output:
# Foo.test
# Base.test
That last call to test skips over Bar's implementation since Foo didn't specify that it wants to delegate to the next class in method resolution order but instead explicitly specified Base. Using super() resolves this issue:
class Base:
    def test(self):
        print('Base.test')

class Foo(Base):
    def test(self):
        print('Foo.test')
        super().test()

class Bar(Base):
    def test(self):
        print('Bar.test')
        super().test()

class Test(Foo, Bar):
    pass

Test().test()
# Output:
# Foo.test
# Bar.test
# Base.test
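For reference (my addition), inspecting the MRO shows why Bar.test now runs between Foo.test and Base.test:

print([cls.__name__ for cls in Test.__mro__])
# ['Test', 'Foo', 'Bar', 'Base', 'object']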

initialization order in python class hierarchies

In C++, given a class hierarchy, the most derived class's ctor calls its base class's ctor, which initializes the base part of the object before the derived part is constructed. In Python I want to understand what goes on in the following case: I have a class Base that takes a callable in its __init__ method and invokes it later. The callable uses some parameters that I pass to the derived class's __init__, which is also where I define the callable function. My idea was to give Derived a __call__ operator and then pass the instance itself to Base:
class Derived(Base):
    def __init__(self, a, b):
        def _process(c, d):
            ...  # do something with a and b
        self.__class__.__call__ = _process
        super(Derived, self).__init__(self)
Is this a pythonic way of dealing with this problem?
What is the exact order of initialization here? Does one needs to call super as a first instruction in the __init__ method or is it ok to do it the way I did?
I am confused about whether it is considered good practice to use super with or without arguments in Python > 3.6.
What is the exact order of initialization here?
Well, quite obviously the one you can see in your code: Base.__init__() is only called when you explicitly ask for it (with the super() call). If Base also has parents and everyone in the chain uses super() calls, the parents' initializers will be invoked according to the MRO.
Basically, Python is a "runtime language" - except for the bytecode compilation phase, everything happens at runtime - so there's very little "black magic" going on (and much of it is actually documented and fully exposed for those who want to look under the hood or do some metaprogramming).
Does one need to call super as the first instruction in the __init__ method, or is it OK to do it the way I did?
You call the parent's method where you see fit for the concrete use case - you just have to be careful not to use instance attributes (directly or - less obvious to spot - indirectly via a method call that depends on those attributes) before they are defined.
I am confused about whether it is considered good practice to use super with or without arguments in Python > 3.6.
If you don't need backward compatibility, use super() without params - unless you want to explicitly skip some class in the MRO, but then chances are there's something debatable with your design (but well - sometimes we can't afford to rewrite a whole code base just to avoid one very special corner case, so that's OK too, as long as you understand what you're doing and why).
Now for your core question:
class Derived(Base):
    def __init__(self, a, b):
        def _process(c, d):
            ...  # do something with a and b
        self.__class__.__call__ = _process
        super(Derived, self).__init__(self)
self.__class__.__call__ is a class attribute and is shared by all instances of the class. This means that you either have to make sure you only ever use a single instance of the class (which doesn't seem to be the goal here) or are ready to get totally random results, since each new instance will overwrite self.__class__.__call__ with its own version.
If what you want is for each instance's __call__ method to invoke its own version of _process(), there's a much simpler solution - just make _process an instance attribute and call it from __call__:
class Derived(Base):
    def __init__(self, a, b):
        def _process(c, d):
            ...  # do something with a and b
        self._process = _process
        super(Derived, self).__init__(self)

    def __call__(self, c, d):
        return self._process(c, d)
Or even simpler:
class Derived(Base):
    def __init__(self, a, b):
        super(Derived, self).__init__(self)
        self._a = a
        self._b = b

    def __call__(self, c, d):
        ...  # do something with self._a and self._b
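Hypothetical usage (my sketch; Base is not shown in the question, so I'm assuming it simply stores the callable and invokes it later through an assumed run() method):

class Base:
    def __init__(self, func):
        self._func = func          # store the callable for later

    def run(self, c, d):           # hypothetical trigger method
        return self._func(c, d)

class Derived(Base):
    def __init__(self, a, b):
        super().__init__(self)     # the instance itself is the callable
        self._a = a
        self._b = b

    def __call__(self, c, d):
        return (self._a + self._b) * (c + d)   # placeholder computation

d = Derived(1, 2)
print(d.run(3, 4))   # 21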
EDIT:
Base requires a callable in its __init__ method.
This would be better if your example snippet was closer to your real use case.
But when I call super().__init__(), the __call__ method of Derived should not have been instantiated yet - or has it?
Now that's a good question... Actually, Python methods are not what you think they are. What you define in a class statement's body using the def statement are plain functions, as you can see for yourself:
>>> class Foo:
...     def bar(self): pass
...
>>> Foo.bar
<function Foo.bar at 0x...>
"Methods" are only instanciated when an attribute lookup resolves to a class attribute that happens to be a function:
>>> Foo().bar
<bound method Foo.bar of <__main__.Foo object at 0x7f3cef4de908>>
>>> Foo().bar
<bound method Foo.bar of <__main__.Foo object at 0x7f3cef4de940>>
(if you wonder how this happens, it's documented in the descriptor protocol documentation)
and they are actually just thin wrappers around a function, an instance and a class (or a function and a class for classmethods), which delegate the call to the underlying function, injecting the instance (or class) as the first argument. In CS terms, a Python method is the partial application of a function to an instance (or class).
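A small demonstration of that binding at work (my addition):

class Foo:
    def bar(self):
        return "hi"

f = Foo()
print(f.bar())     # 'hi' -- the instance is injected automatically
print(Foo.bar(f))  # 'hi' -- same function, instance passed by hand
# The binding is just the descriptor protocol at work:
print(Foo.__dict__['bar'].__get__(f, Foo))  # <bound method Foo.bar of ...>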
Now as I mentioned above, Python is a runtime language, and both def and class are executable statements. So by the time you define your Derived class, the class statement creating the Base class object has already been executed (otherwise Base wouldn't exist at all), with the whole class statement block executed first (to define the functions and other class attributes).
So "when you call super().__init()__", the __call__ function of Base HAS been instanciated (assuming it's defined in the class statement for Base of course, but that's by far the most common case).

Should I use super() every time I call parent function or only inside overridden functions?

If we override a parent class's method, we can use super() to avoid mentioning the parent class's name - that's clear.
But what about the case when we just use, in a subclass, some function defined in the parent class? Which way is preferable: super().parent_method() or self.parent_method()? Or is there no difference?
class A:
    def test1(self):
        pass

    def test_a(self):
        pass

class B(A):
    def test1(self):
        super().test1()  # That's clear.

    def test_b(self):
        # Which option is better, when I want to use parent's func here?
        # 1) super().test_a()
        # 2) self.test_a()
        pass
Usually you will want to use self.test_a() to call an inherited method. However, in some rare situations you might want to use super().test_a() even though it seems to do the same thing. They're not equivalent, even though they have the same behavior in your example.
To explore the differences, let's make two versions of your B class, one with each kind of call, then make two C classes that further extend the B classes and override test_a:
class A(object):
    def test_a(self):
        return "A"

class B1(A):
    def test_b(self):
        return self.test_a() + "B"

class B2(A):
    def test_b(self):
        return super().test_a() + "B"

class C1(B1):
    def test_a(self):
        return "C"

class C2(B2):
    def test_a(self):
        return "C"
When you call the test_b() method on C1 and C2 instances you'll get different results, even though B1 and B2 behave the same:
>>> B1().test_b()
'AB'
>>> B2().test_b()
'AB'
>>> C1().test_b()
'CB'
>>> C2().test_b()
'AB'
This is because the super() call in B2.test_b tells Python that you want to skip the version of test_a in any more derived class and always call an implementation from a parent class. (Actually, I suppose it could be a sibling class in a multiple inheritance situation, but that's getting even more obscure.)
Like I said at the top, you usually want to allow a more-derived class like the Cs to override the behavior of the inherited methods you're calling in your less-derived class. That means that most of the time using self.whatever is the way to go. You only need to use super when you're doing something fancy.
Since B is an A, it has a member test_a. So you call it as
self.test_a()
B does not overwrite A.test_a, so there is no need to use super() to call it.
Since B overwrites A.test1, you must explicitly name the method you want to call.
self.test1()
will call B.test1, while
super().test1()
will call A.test1.
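A tiny demonstration of that difference (my addition):

class A:
    def test1(self):
        print('A.test1')

class B(A):
    def test1(self):
        print('B.test1')

    def demo(self):
        self.test1()     # B.test1 -- dynamic dispatch picks the override
        super().test1()  # A.test1 -- explicitly goes to the parent

B().demo()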
Firstly, both super().test_a() and self.test_a() will result in execution of the method test_a().
Since class B does not override or overwrite test_a(), I think using self.test_a() will be more efficient, as self is a mere reference to the current object, which is already in memory.
As per the documentation, super() results in the creation of a proxy object, which involves extra machinery. For this reason I feel self is the correct approach in your case.
If we override parent class's method, we can use super() to avoid
mention of parent class's name - that's clear.
Actually, super() is not syntactic sugar; its purpose is to invoke the parent implementation of a certain method.
You have to use super() when you want to override a parent method; you don't have to use super() when instead you want to overwrite a method. The difference is that in the first case you want to add extra behavior (i.e. code execution) before or after the original implementation, while in the second you want a completely different implementation.
You can't use self.method_name() inside an overriding method to call the parent implementation - the result will be a recursion error (RuntimeError: maximum recursion depth exceeded)!
Example:
class A:
    def m(self):
        print('A.m implementation')

class B(A):
    def m(self):
        super().m()
        print('B.m implementation')

class C(A):
    def m(self):
        print('C.m implementation')

class D(A):
    def m(self):
        self.m()  # calls itself again -- infinite recursion

a = A()
a.m()  # A.m implementation
b = B()
b.m()  # A.m implementation, then B.m implementation
c = C()
c.m()  # C.m implementation
d = D()
d.m()  # RuntimeError: maximum recursion depth exceeded
Given a base class A, with a method m, B extends A by overriding m, C extends A by overwriting m, and D generates an error!
EDIT:
I just realized that you actually have two different methods (test_a and test_b). My answer is still valid, but regarding your specific scenario:
you should use self.test_a(), unless you override/overwrite that method in your class B and want to execute the original implementation. So we can say that calling super().test_a() or self.test_a() is the same, given that you'll never override/overwrite the original test_a() in your subclasses - however, it makes no sense to use super() if not for an override/overwrite.

Are static fields used to modify super class behaviour thread safe?

If a subclass wants to modify the behaviour of inherited methods through static fields, is it thread safe?
More specifically:
class A (object):
    _m = 0
    def do(self):
        print self._m

class B (A):
    _m = 1
    def test(self):
        self.do()

class C (A):
    _m = 2
    def test(self):
        self.do()
Is there a risk that an instance of class B calling do() would behave as class C is supposed to, or vice-versa, in a multithreading environment? I would say yes, but I was wondering if somebody went through actually testing this pattern already.
Note: This is not a question about the pattern itself, which I think should be avoided, but about its consequences, as I found it in reviewing real life code.
First, remember that classes are objects, and static fields (and for that matter, methods) are attributes of said class objects.
So what happens is that self.do() looks up the do method on self and calls do(self). self is set to whatever object the method was called on, and that object references one of the classes A, B, or C as its class. So the lookup will find the value of _m in the correct class. And since this pattern only ever reads the class attributes - nothing mutates shared state - there is no race for threads to run into.
Of course, that requires a correction to your code:
class A (object):
    _m = 0
    def do(self):
        if self._m == 0: ...
        elif ...
Your original code won't work because Python only looks for _m in two places: defined in the function, or as a global. It won't look in the class scope like C++ does, so you have to prefix it with self. for the right one to get used. If you wanted to force it to use the _m in class A, you would write A._m instead.
P.S. There are times you need this pattern, particularly with metaclasses, which are kinda-sorta Python's analog to C++'s template metaprogramming and functional algorithms.
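To see the per-class lookup in action (my quick check, not part of the original answer):

class A(object):
    _m = 0
    def do(self):
        print(self._m)

class B(A):
    _m = 1

class C(A):
    _m = 2

B().do()  # 1 -- the lookup walks B's MRO and finds B._m
C().do()  # 2 -- each instance resolves _m through its own class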

Is there a way to implement methods like __len__ or __eq__ as classmethods?

It is pretty easy to implement the __len__(self) method in Python so that it handles len(inst) calls like this one:
class A(object):
    def __len__(self):
        return 7

a = A()
len(a)  # gives us 7
And there are plenty of alike methods you can define (__eq__, __str__, __repr__ etc.).
I know that Python classes are objects as well.
My question: can I somehow define, for example, __len__ so that the following works:
len(A) # makes sense and gives some predictable result
What you're looking for is called a "metaclass"... just as a is an instance of class A, A is an instance of a class as well, referred to as a metaclass. By default, Python classes are instances of the type class (the only exception is under Python 2, which has some legacy "old style" classes - those that don't inherit from object). You can check this by doing type(A)... it should return type itself (yes, that object has been overloaded a little bit).
Metaclasses are powerful and brain-twisting enough to deserve more than the quick explanation I was about to write... a good starting point would be this stackoverflow question: What is a Metaclass.
For your particular question, for Python 3, the following creates a metaclass which aliases len(A) to invoke a class method on A:
class LengthMetaclass(type):
    def __len__(self):
        return self.clslength()

class A(object, metaclass=LengthMetaclass):
    @classmethod
    def clslength(cls):
        return 7

print(len(A))
(Note: the example above is for Python 3. The syntax is slightly different for Python 2: you would declare __metaclass__ = LengthMetaclass inside the class body instead of passing it as a parameter.)
The reason LengthMetaclass.__len__ doesn't affect instances of A is that attribute resolution in Python first checks the instance dict, then walks the class hierarchy [A, object], but never consults the metaclass. Accessing A.__len__, by contrast, first consults the instance A itself, then walks its class hierarchy, which consists of [LengthMetaclass, type].
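To make that concrete (my addition):

print(len(A))   # 7 -- len() looks up __len__ on type(A), i.e. LengthMetaclass
try:
    len(A())    # instances never consult the metaclass...
except TypeError as e:
    print(e)    # object of type 'A' has no len()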
Since a class is an instance of a metaclass, one way is to use a custom metaclass:
>>> Meta = type('Meta', (type,), {'__repr__': lambda cls: 'class A'})
>>> A = Meta('A', (object,), {'__repr__': lambda self: 'instance of class A'})
>>> A
class A
>>> A()
instance of class A
I fail to see how the syntax specifically is important, but if you really want a simple way to implement it, just use the normal __len__(self) that handles len(inst), and have it return a class variable that all instances share:
class A:
    my_length = 5
    def __len__(self):
        return self.my_length
and you can later call it like this:

len(A())  # returns 5

Obviously this creates a temporary instance of your class, but length only really makes sense for an instance of a class, not for the concept of a class (a type object).
Editing the metaclass sounds like a very bad idea, and unless you are doing something for school or just to mess around, I really suggest you rethink this idea...
try this:

class Lengthy:
    x = 5

    @classmethod
    def __len__(cls):
        return cls.x
The @classmethod decorator allows you to call __len__ directly on the class (as Lengthy.__len__()), but your __len__ implementation won't be able to depend on any instance variables. Note that len(Lengthy) itself still raises a TypeError, since special methods are looked up on the type of the operand; len() on an instance works fine:
a = Lengthy()
len(a)  # 5
