If a subclass wants to modify the behaviour of inherited methods through static fields, is it thread safe?
More specifically:
class A(object):
    _m = 0
    def do(self):
        print self._m

class B(A):
    _m = 1
    def test(self):
        self.do()

class C(A):
    _m = 2
    def test(self):
        self.do()
Is there a risk that an instance of class B calling do() would behave as class C is supposed to, or vice-versa, in a multithreading environment? I would say yes, but I was wondering if somebody went through actually testing this pattern already.
Note: This is not a question about the pattern itself, which I think should be avoided, but about its consequences, as I found it in reviewing real life code.
First, remember that classes are objects, and static fields (and for that matter, methods) are attributes of said class objects.
So what happens is that self.do() looks up the do method in self and calls do(self). self is set to whatever object is being called, which itself references one of the classes A, B, or C as its class. So the lookup will find the value of _m in the correct class.
Of course, that requires a correction to your code:
class A(object):
    _m = 0
    def do(self):
        if self._m == 0: ...
        elif ...
Your original code won't work because Python only looks for _m in two places: defined in the function, or as a global. It won't look in class scope like C++ does. So you have to prefix with self. so the right one gets used. If you wanted to force it to use the _m in class A, you would use A._m instead.
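As a rough check of the lookup described above, the sketch below runs a B instance and a C instance in two threads and records what each one sees. The `run` helper, the `results` dict and the iteration count are assumptions made for this illustration, and `do` returns the value instead of printing it so it can be compared.

```python
import threading

class A(object):
    _m = 0
    def do(self):
        # self._m is resolved through type(self), so B sees 1 and C sees 2.
        return self._m

class B(A):
    _m = 1

class C(A):
    _m = 2

# Hypothetical collector for this demo: one set of observed values per class.
results = {"B": set(), "C": set()}

def run(label, obj, n=1000):
    # Call do() repeatedly and remember every distinct value it returned.
    for _ in range(n):
        results[label].add(obj.do())

threads = [
    threading.Thread(target=run, args=("B", B())),
    threading.Thread(target=run, args=("C", C())),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Reading the class attribute this way involves no shared mutable state between B and C. The picture changes only if some thread rebinds `_m` on a class at runtime, because that value is then shared by every instance of that class.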
P.S. There are times you need this pattern, particularly with metaclasses, which are kinda-sorta Python's analog to C++'s template metaprogramming and functional algorithms.
In C++, given a class hierarchy, the most derived class's ctor calls its base class ctor, which then initializes the base part of the object before the derived part is instantiated. In Python, I want to understand what's going on in a case where Derived subclasses a given class Base, which takes a callable in its __init__ method and later invokes it. The callable takes some parameters which I pass in the Derived class's __init__, which is where I also define the callable function. My idea then was to pass the Derived class itself to its Base class after having defined the __call__ operator:
class Derived(Base):
    def __init__(self, a, b):
        def _process(c, d):
            do_something with a and b
        self.__class__.__call__ = _process
        super(Derived, self).__init__(self)
Is this a pythonic way of dealing with this problem?
What is the exact order of initialization here? Does one need to call super as the first instruction in the __init__ method, or is it ok to do it the way I did?
I am confused whether it is considered good practice to use super with or without arguments in python > 3.6
What is the exact order of initialization here?
Well, very obviously the one you can see in your code: Base.__init__() is only called when you explicitly ask for it (with the super() call). If Base also has parents and everyone in the chain uses super() calls, the parents' initializers will be invoked according to the MRO.
Basically, Python is a "runtime language": except for the bytecode compilation phase, everything happens at runtime, so there's very little "black magic" going on (and much of it is actually documented and fully exposed for those who want to look under the hood or do some metaprogramming).
Does one need to call super as the first instruction in the __init__ method, or is it ok to do it the way I did?
You call the parent's method where you see fit for the concrete use case - you just have to beware of not using instance attributes (directly or - less obvious to spot - indirectly via a method call that depends on those attributes) before they are defined.
I am confused whether it is considered good practice to use super with or without arguments in python > 3.6
If you don't need backward compatibility, use super() without params - unless you want to explicitly skip some class in the MRO, but then chances are there's something debatable with your design (but well - sometimes we can't afford to rewrite a whole code base just to avoid one very special corner case, so that's ok too as long as you understand what you're doing and why).
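To make the "invoked according to the MRO" point concrete, here is a small Python 3 sketch. The class names `Left`, `Right` and `Child` and the `trace` attribute are made up for the illustration and do not come from the question.

```python
class Base(object):
    def __init__(self):
        # The last initializer in the chain creates the trace list.
        self.trace = ["Base"]

class Left(Base):
    def __init__(self):
        super().__init__()          # next class in the MRO, not necessarily Base
        self.trace.append("Left")

class Right(Base):
    def __init__(self):
        super().__init__()
        self.trace.append("Right")

class Child(Left, Right):
    def __init__(self):
        super().__init__()
        self.trace.append("Child")

mro_names = [klass.__name__ for klass in Child.__mro__]
print(mro_names)      # ['Child', 'Left', 'Right', 'Base', 'object']
print(Child().trace)  # ['Base', 'Right', 'Left', 'Child']
```

Note that `super().__init__()` inside `Left` ends up calling `Right.__init__` when the instance is a `Child`, which is why the zero-argument form plus cooperative calls is usually the least surprising choice.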
Now with your core question:
class Derived(Base):
    def __init__(self, a, b):
        def _process(c, d):
            do_something with a and b
        self.__class__.__call__ = _process
        super(Derived, self).__init__(self)
self.__class__.__call__ is a class attribute and is shared by all instances of the class. This means that you either have to make sure you only ever use one single instance of the class (which doesn't seem to be the goal here) or be ready for unpredictable results, since each new instance will overwrite self.__class__.__call__ with its own version.
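The overwrite is easy to reproduce. In this minimal sketch the class name `Shared` and the one-argument `_process` are assumptions made for the demonstration; `_process` takes an explicit `self` because a function stored on the class is bound like any other method.

```python
class Shared(object):
    def __init__(self, a):
        def _process(self, c):
            # Closes over the `a` of whichever instance was created last.
            return a + c
        # Same assignment as in the question: this rebinds the attribute
        # on the class, so every existing instance is affected.
        self.__class__.__call__ = _process

first = Shared(1)
second = Shared(100)

print(first(0))   # 100, not 1: first now uses second's closure
print(second(0))  # 100
```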
If what you want is for each instance's __call__ method to call its own version of _process(), then there's a much simpler solution - just make _process an instance attribute and call it from __call__:
class Derived(Base):
    def __init__(self, a, b):
        def _process(c, d):
            do_something with a and b
        self._process = _process
        super(Derived, self).__init__(self)

    def __call__(self, c, d):
        return self._process(c, d)
Or even simpler:
class Derived(Base):
    def __init__(self, a, b):
        super(Derived, self).__init__(self)
        self._a = a
        self._b = b

    def __call__(self, c, d):
        do_something_with(self._a, self._b)
EDIT:
Base requires a callable in its __init__ method.
This would be better if your example snippet was closer to your real use case.
But when I call super().__init__() the __call__ method of Derived should not have been instantiated yet or has it?
Now that's a good question... Actually, Python methods are not what you think they are. What you define in a class statement's body using the def statement are still plain functions, as you can see by yourself:
>>> class Foo:
...     def bar(self): pass
...
>>> Foo.bar
<function Foo.bar at 0x...>
"Methods" are only instantiated when an attribute lookup resolves to a class attribute that happens to be a function:
>>> Foo().bar
<bound method Foo.bar of <__main__.Foo object at 0x7f3cef4de908>>
>>> Foo().bar
<bound method Foo.bar of <__main__.Foo object at 0x7f3cef4de940>>
(if you wonder how this happens, it's documented here)
and they actually are just thin wrappers around a function, instance and class (or function and class for classmethods), which delegate the call to the underlying function, injecting the instance (or class) as first argument. In CS terms, a Python method is the partial application of a function to an instance (or class).
Now as I mentioned above, Python is a runtime language, and both def and class are executable statements. So by the time you define your Derived class, the class statement creating the Base class object has already been executed (else Base wouldn't exist at all), with the whole class statement block being executed first (to define the functions and other class attributes).
So "when you call super().__init__()", the __call__ function of Base HAS been instantiated (assuming it's defined in the class statement for Base of course, but that's by far the most common case).
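The "partial application" description can be checked directly in Python 3. This is only a sketch: the method signature `bar(self, x)` and the variable names are chosen for the illustration.

```python
import functools

class Foo:
    def bar(self, x):
        return (self, x)

foo = Foo()

# What the class statement stored is a plain function.
plain_function = Foo.__dict__["bar"]

# Looking the name up on an instance builds a bound method on the fly.
bound = foo.bar
print(bound.__func__ is plain_function)  # True: it wraps the same function
print(bound.__self__ is foo)             # True: and remembers the instance

# Roughly equivalent: partially apply the function to the instance.
manual = functools.partial(plain_function, foo)
print(manual(42) == bound(42))           # True
```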
I want to define a mix-in of a namedtuple and a base class which defines an abstract method:
import abc
import collections

class A(object):
    __metaclass__ = abc.ABCMeta

    @abc.abstractmethod
    def do(self):
        print("U Can't Touch This")

B = collections.namedtuple('B', 'x, y')

class C(B, A):
    pass

c = C(x=3, y=4)
print(c)
c.do()
From what I understand reading the docs and other examples I have seen, c.do() should raise an error, as class C does not implement do(). However, when I run it... it works:
B(x=3, y=4)
U Can't Touch This
I must be overlooking something.
When you take a look at the method resolution order of C you see that B comes before A in that list. That means when you instantiate C the __new__ method of B will be called first.
This is the implementation of namedtuple.__new__:
def __new__(_cls, {arg_list}):
    'Create new instance of {typename}({arg_list})'
    return _tuple.__new__(_cls, ({arg_list}))
You can see that it does not support cooperative inheritance, because it breaks the chain and simply calls tuple's __new__ method. As a result, the check for abstract methods (which in CPython is performed by object.__new__, not by ABCMeta.__new__) is never executed, so the instantiation does not fail.
I thought inverting the MRO would solve that problem, but it strangely did not. I'm gonna investigate a bit more and update this answer.
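The behaviour can be reproduced in Python 3 with `abc.ABC`. This is a sketch of the situation, not a fix; the `Plain` class is added here only as a control case.

```python
import abc
import collections

class A(abc.ABC):
    @abc.abstractmethod
    def do(self):
        return "U Can't Touch This"

B = collections.namedtuple("B", "x y")

class C(B, A):
    pass

class Plain(A):
    # Control case: no tuple in the hierarchy, so object.__new__ is used.
    pass

# The abstract method is still recorded on C ...
print(C.__abstractmethods__)

# ... but instantiation goes through tuple.__new__ and is not blocked.
c = C(x=3, y=4)
print(c.do())

try:
    Plain()
    plain_error = None
except TypeError as exc:
    plain_error = exc
print(plain_error)
```

Swapping the bases to `class C(A, B)` does not help either, since `A` defines no `__new__` of its own and the lookup still lands on the namedtuple's `__new__`.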
I want to figure out the type of the class in which a certain method is defined (in essence, the enclosing static scope of the method), from within the method itself, and without specifying it explicitly, e.g.
class SomeClass:
    def do_it(self):
        cls = enclosing_class()  # <-- I need this.
        print(cls)

class DerivedClass(SomeClass):
    pass

obj = DerivedClass()
# I want this to print 'SomeClass'.
obj.do_it()
Is this possible?
If you need this in Python 3.x, please see my other answer—the closure cell __class__ is all you need.
If you need to do this in CPython 2.6-2.7, RickyA's answer is close, but it doesn't work, because it relies on the method not overriding any other method of the same name. Try adding a Foo.do_it method in his answer, and it will print out Foo, not SomeClass.
The way to solve that is to find the method whose code object is identical to the current frame's code object:
def do_it(self):
    mro = inspect.getmro(self.__class__)
    method_code = inspect.currentframe().f_code
    method_name = method_code.co_name
    for base in reversed(mro):
        try:
            if getattr(base, method_name).func_code is method_code:
                print(base.__name__)
                break
        except AttributeError:
            pass
(Note that the AttributeError could be raised either by base not having something named do_it, or by base having something named do_it that isn't a function, and therefore doesn't have a func_code. But we don't care which; either way, base is not the match we're looking for.)
This may work in other Python 2.6+ implementations. Python does not require frame objects to exist, and if they don't, inspect.currentframe() will return None. And I'm pretty sure it doesn't require code objects to exist either, which means func_code could be None.
Meanwhile, if you want to use this in both 2.7+ and 3.0+, change that func_code to __code__, but that will break compatibility with earlier 2.x.
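For reference, a Python 3 flavoured sketch of the same code-object comparison might look like the following. It returns the name instead of printing it, looks the function up in each class's own `__dict__` rather than with `getattr`, and assumes an implementation where `inspect.currentframe()` is available (as in CPython); the `Foo` base with its own `do_it` is there to show that overriding no longer confuses the search.

```python
import inspect

class Foo:
    def do_it(self):
        return "Foo.do_it"

class SomeClass(Foo):
    def do_it(self):
        # Code object of the function that is executing right now.
        method_code = inspect.currentframe().f_code
        for base in inspect.getmro(type(self)):
            candidate = base.__dict__.get(method_code.co_name)
            # Non-functions (or a missing entry) have no matching __code__.
            if getattr(candidate, "__code__", None) is method_code:
                return base.__name__
        return None

class DerivedClass(SomeClass):
    pass

print(DerivedClass().do_it())  # SomeClass
```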
If you need CPython 2.5 or earlier, you can just replace the inspect calls with the implementation-specific CPython attributes:
def do_it(self):
    mro = self.__class__.mro()
    method_code = sys._getframe().f_code
    method_name = method_code.co_name
    for base in reversed(mro):
        try:
            if getattr(base, method_name).func_code is method_code:
                print(base.__name__)
                break
        except AttributeError:
            pass
Note that this use of mro() will not work on classic classes; if you really want to handle those (which you really shouldn't want to…), you'll have to write your own mro function that just walks the hierarchy old-school… or just copy it from the 2.6 inspect source.
This will only work in Python 2.x implementations that bend over backward to be CPython-compatible… but that includes at least PyPy. inspect should be more portable, but then if an implementation is going to define frame and code objects with the same attributes as CPython's so it can support all of inspect, there's not much good reason not to make them attributes and provide sys._getframe in the first place…
First, this is almost certainly a bad idea, and not the way you want to solve whatever you're trying to solve but refuse to tell us about…
That being said, there is a very easy way to do it, at least in Python 3.0+. (If you need 2.x, see my other answer.)
Notice that Python 3.x's super pretty much has to be able to do this somehow. How else could super() mean super(THISCLASS, self), where that THISCLASS is exactly what you're asking for?*
Now, there are lots of ways that super could be implemented… but PEP 3135 spells out a specification for how to implement it:
Every function will have a cell named __class__ that contains the class object that the function is defined in.
This isn't part of the Python reference docs, so some other Python 3.x implementation could do it a different way… but at least as of 3.2+, they still have to have __class__ on functions, because Creating the class object explicitly says:
This class object is the one that will be referenced by the zero-argument form of super(). __class__ is an implicit closure reference created by the compiler if any methods in a class body refer to either __class__ or super. This allows the zero argument form of super() to correctly identify the class being defined based on lexical scoping, while the class or instance that was used to make the current call is identified based on the first argument passed to the method.
(And, needless to say, this is exactly how at least CPython 3.0-3.5 and PyPy3 2.0-2.1 implement super anyway.)
In [1]: class C:
   ...:     def f(self):
   ...:         print(__class__)
In [2]: class D(C):
   ...:     pass
In [3]: D().f()
<class '__main__.C'>
Of course this gets the actual class object, not the name of the class, which is apparently what you were after. But that's easy; you just need to decide whether you mean __class__.__name__ or __class__.__qualname__ (in this simple case they're identical) and print that.
* In fact, this was one of the arguments against it: that the only plausible way to do this without changing the language syntax was to add a new closure cell to every function, or to require some horrible frame hacks which may not even be doable in other implementations of Python. You can't just use compiler magic, because there's no way the compiler can tell that some arbitrary expression will evaluate to the super function at runtime…
If you can use @abarnert's method, do it.
Otherwise, you can use some hardcore introspection (for python2.7):
import inspect
# getMethodClass is taken from http://stackoverflow.com/a/22898743/2096752

def enclosing_class():
    frame = inspect.currentframe().f_back
    caller_self = frame.f_locals['self']
    caller_method_name = frame.f_code.co_name
    return getMethodClass(caller_self.__class__, caller_method_name)

class SomeClass:
    def do_it(self):
        print(enclosing_class())

class DerivedClass(SomeClass):
    pass

DerivedClass().do_it()  # prints 'SomeClass'
Obviously, this is likely to raise an error if:
called from a regular function / staticmethod / classmethod
the calling function has a different name for self (as aptly pointed out by @abarnert, this can be solved by using frame.f_code.co_varnames[0])
Sorry for writing yet another answer, but here's how to do what you actually want to do, rather than what you asked for:
this is about adding instrumentation to a code base to be able to generate reports of method invocation counts, for the purpose of checking certain approximate runtime invariants (e.g. "the number of times that method ClassA.x() is executed is approximately equal to the number of times that method ClassB.y() is executed in the course of a run of a complicated program").
The way to do that is to make your instrumentation function inject the information statically. After all, it has to know the class and method it's injecting code into.
I will have to instrument many classes by hand, and to prevent mistakes I want to avoid typing the class names everywhere. In essence, it's the same reason why typing super() is preferable to typing super(ClassX, self).
If your instrumentation function is "do it manually", the very first thing you want is to turn it into an actual function instead of doing it manually. Since you obviously only need static injection, using a decorator, either on the class (if you want to instrument every method) or on each method (if you don't), would make this nice and readable. (Or, if you want to instrument every method of every class, you might want to define a metaclass and have your root classes use it, instead of decorating every class.)
For example, here's an easy way to instrument every method of a class:
import collections
import functools
import inspect

_calls = {}

def inject(cls):
    cls._calls = collections.Counter()
    _calls[cls.__name__] = cls._calls

    def make_wrapper(name, method):
        # Bind name and method per wrapper; a closure created directly in
        # the loop would see only the last (name, method) pair.
        @functools.wraps(method)
        def wrapper(*args, **kwargs):
            cls._calls[name] += 1
            return method(*args, **kwargs)
        return wrapper

    # Iterate over a snapshot, since setattr modifies the class namespace.
    for name, method in list(cls.__dict__.items()):
        if inspect.isfunction(method):
            setattr(cls, name, make_wrapper(name, method))
    return cls

@inject
class A(object):
    def f(self):
        print('A.f here')

@inject
class B(A):
    def f(self):
        print('B.f here')

@inject
class C(B):
    pass

@inject
class D(C):
    def f(self):
        print('D.f here')

d = D()
d.f()
B.f(d)
print(_calls)
The output:
{'A': Counter(),
'C': Counter(),
'B': Counter({'f': 1}),
'D': Counter({'f': 1})}
Exactly what you wanted, right?
You can either do what @mgilson suggested or take another approach.
class SomeClass:
    pass

class DerivedClass(SomeClass):
    pass
This makes SomeClass the base class for DerivedClass.
When you normally try to get __class__.__name__, it will refer to the derived class rather than the parent.
When you call do_it(), it's really passing DerivedClass as self, which is why you are most likely getting DerivedClass being printed.
Instead, try this:
class SomeClass:
    pass

class DerivedClass(SomeClass):
    def do_it(self):
        for base in self.__class__.__bases__:
            print base.__name__

obj = DerivedClass()
obj.do_it()  # Prints SomeClass
Edit:
After reading your question a few more times I think I understand what you want.
class SomeClass:
    def do_it(self):
        cls = self.__class__.__bases__[0].__name__
        print cls

class DerivedClass(SomeClass):
    pass

obj = DerivedClass()
obj.do_it()  # prints SomeClass
[Edited]
A somewhat more generic solution:
import inspect

class Foo:
    pass

class SomeClass(Foo):
    def do_it(self):
        mro = inspect.getmro(self.__class__)
        method_name = inspect.currentframe().f_code.co_name
        for base in reversed(mro):
            if hasattr(base, method_name):
                print(base.__name__)
                break

class DerivedClass(SomeClass):
    pass

class DerivedClass2(DerivedClass):
    pass

DerivedClass().do_it()
>> 'SomeClass'
DerivedClass2().do_it()
>> 'SomeClass'
SomeClass().do_it()
>> 'SomeClass'
This fails when some other class in the MRO also has an attribute named "do_it", since finding that name is the signal to stop walking the MRO.
Say I have a class A, B and C.
Class A and B are both mixin classes for Class C.
class A( object ):
    pass

class B( object ):
    pass

class C( object, A, B ):
    pass
This will not work when instantiating class C. I would have to remove object from class C to make it work. (Else you'll get MRO problems).
TypeError: Error when calling the metaclass bases
Cannot create a consistent method resolution
order (MRO) for bases B, object, A
However, my case is a bit more complicated. In my case class C is a server where A and B will be plugins that are loaded on startup. These are residing in their own folder.
I also have a Class named Cfactory. In Cfactory I have a __new__ method that will create a fully functional object C. In the __new__ method I search for plugins, load them using __import__, and then assign them to C.__bases__ += (loadedClassTypeGoesHere, )
So the following is a possibility: (made it quite abstract)
class A( object ):
    def __init__( self ): pass
    def printA( self ): print "A"

class B( object ):
    def __init__( self ): pass
    def printB( self ): print "B"

class C( object ):
    def __init__( self ): pass

class Cfactory( object ):
    def __new__( cls ):
        C.__bases__ += ( A, )
        C.__bases__ += ( B, )
        return C()
This again will not work, and will give the MRO errors again:
TypeError: Cannot create a consistent method resolution
order (MRO) for bases object, A
An easy fix for this is removing the object baseclass from A and B. However this will make them old-style objects which should be avoided when these plugins are being run stand-alone (which should be possible, UnitTest wise)
Another easy fix is removing object from C but this will also make it an old-style class and C.__bases__ will be unavailable thus I can't add extra objects to the base of C
What would be a good architectural solution for this and how would you do something like this? For now I can live with old-style classes for the plugins themselves. But I rather not use them.
Think of it this way -- you want the mixins to override some of the behaviors of object, so they need to be before object in the method resolution order.
So you need to change the order of the bases:
class C(A, B, object):
    pass
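A quick way to see the difference between the two orderings is to build both and look at the result. In this sketch the failing variant is created with `type()` so the error can be caught; the name `Broken` is made up for the illustration.

```python
class A(object):
    pass

class B(object):
    pass

class C(A, B, object):  # mixins first, object last
    pass

mro_names = [klass.__name__ for klass in C.__mro__]
print(mro_names)  # ['C', 'A', 'B', 'object']

# Listing object first asks for an order in which object precedes its own
# subclasses A and B, which cannot be linearized.
try:
    type("Broken", (object, A, B), {})
    mro_error = None
except TypeError as exc:
    mro_error = exc
print(mro_error)
```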
Due to this bug, you need C not to inherit from object directly to be able to correctly assign to __bases__, and the factory really could just be a function:
class FakeBase(object):
    pass

class C(FakeBase):
    pass

def c_factory():
    for base in (A, B):
        if base not in C.__bases__:
            C.__bases__ = (base,) + C.__bases__
    return C()
I don't know the details, so maybe I'm completely off-base here, but it seems like you're using the wrong mechanisms to achieve your design.
First off, why is Cfactory a class, and why does its __new__ method return an instance of something else? That looks like a bizarre way to implement what is quite naturally a function. Cfactory as you've described it (and shown a simplified example) doesn't behave at all like a class; you don't have multiple instances of it that share functionality (in fact it looks like you've made it impossible to construct instances of it in the natural way).
To be honest, C doesn't look very much like a class to me either. It seems like you can't be creating more than one instance of it, otherwise you'd end up with an ever-growing bases list. So that makes C basically a module rather than a class, only with extra boilerplate. I try to avoid the "single-instance class to represent the application or some external system" pattern (though I know it's popular because Java requires that you use it). But the class inheritance mechanism can often be handy for things that aren't really classes, such as your plugin system.
I would've done this with a classmethod on C to find and load plugins, invoked by the module defining C so that it's always in a good state. Alternatively you could use a metaclass to automatically add whatever plugins it finds to the class bases. Mixing the mechanism for configuring the class in with the mechanism for creating an instance of the class seems wrong; it's the opposite of flexible de-coupled design.
If the plugins can't be loaded at the time C is created, then I would go with manually invoking the configurator classmethod at the point when you can search for plugins, before the C instance is created.
Actually, if the class can't be put into a consistent state as soon as it's created I would probably rather go for dynamic class creation than modifying the bases of an existing class. Then the system isn't locked into the class being configured once and instantiated once; you're at least open to the possibility of having multiple instances with different sets of plugins loaded. Something like this:
def Cfactory(*args, **kwargs):
    plugins = find_plugins()
    bases = (C,) + plugins
    cls = type('C_with_plugins', bases, {})
    return cls(*args, **kwargs)
That way, you have a single call that gives you a correctly configured C instance, but it doesn't have strange side effects on any other hypothetical instances of C that might already exist, and its behaviour doesn't depend on whether it's been run before. I know you probably don't need either of those two properties, but it's barely more code than you have in your simplified example, and why break the conceptual model of what classes are if you don't have to?
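A runnable version of that idea could look like the sketch below. Everything beyond `C` and the use of `type()` is an assumption for the sake of the example: `PluginA`, `PluginB`, the lowercase `c_factory` name, and a `find_plugins` that simply returns a fixed tuple instead of scanning a plugin folder.

```python
class C(object):
    def __init__(self, name):
        self.name = name

class PluginA(object):  # hypothetical plugin
    def print_a(self):
        return "A"

class PluginB(object):  # hypothetical plugin
    def print_b(self):
        return "B"

def find_plugins():
    # Stand-in for real plugin discovery (e.g. importing from a folder).
    return (PluginA, PluginB)

def c_factory(*args, **kwargs):
    # Build a fresh subclass per call; C itself is never modified.
    bases = (C,) + tuple(find_plugins())
    cls = type("C_with_plugins", bases, {})
    return cls(*args, **kwargs)

server = c_factory("demo")
print(server.print_a(), server.print_b(), isinstance(server, C))
print(C.__bases__)  # still (object,)
```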
There is a simple workaround: create a helper class with a nice name, like PluginBase, and inherit from that instead of from object.
This makes the code more readable (imho) and it circumvents the bug.
class PluginBase(object): pass
class ServerBase(object): pass
class pluginA(PluginBase): "Now it is clearly a plugin class"
class pluginB(PluginBase): "Another plugin"
class Server1(ServerBase, pluginA, pluginB): "This works"
class Server2(ServerBase): pass
Server2.__bases__ += (pluginA,) # This also works
As a note: you probably don't need the factory; it's needed in C++, but hardly in Python.
It is pretty easy to implement __len__(self) method in Python so that it handles len(inst) calls like this one:
class A(object):
    def __len__(self):
        return 7

a = A()
len(a)  # gives us 7
And there are plenty of similar methods you can define (__eq__, __str__, __repr__ etc.).
I know that Python classes are objects as well.
My question: can I somehow define, for example, __len__ so that the following works:
len(A) # makes sense and gives some predictable result
What you're looking for is called a "metaclass"... just like a is an instance of class A, A is an instance of a class as well, referred to as a metaclass. By default, Python classes are instances of the type class (the only exception is under Python 2, which has some legacy "old style" classes, which are those which don't inherit from object). You can check this by doing type(A)... it should return type itself (yes, that object has been overloaded a little bit).
Metaclasses are powerful and brain-twisting enough to deserve more than the quick explanation I was about to write... a good starting point would be this stackoverflow question: What is a Metaclass.
For your particular question, for Python 3, the following creates a metaclass which aliases len(A) to invoke a class method on A:
class LengthMetaclass(type):
    def __len__(self):
        return self.clslength()

class A(object, metaclass=LengthMetaclass):
    @classmethod
    def clslength(cls):
        return 7

print(len(A))
(Note: Example above is for Python 3. The syntax is slightly different for Python 2: you would use class A(object):\n __metaclass__=LengthMetaclass instead of passing it as a parameter.)
The reason LengthMetaclass.__len__ doesn't affect instances of A is that attribute resolution in Python first checks the instance dict, then walks the class hierarchy [A, object], but it never consults the metaclasses. Whereas accessing A.__len__ first consults the instance A, then walks its class hierarchy, which consists of [LengthMetaclass, type].
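The asymmetry between the class and its instances can be seen in a few lines of Python 3. This is a stripped-down sketch in which the metaclass returns a constant directly rather than delegating to a class method.

```python
class LengthMetaclass(type):
    def __len__(cls):
        return 7

class A(metaclass=LengthMetaclass):
    pass

print(type(A) is LengthMetaclass)  # True: A is an instance of the metaclass
print(len(A))                      # 7, found on type(A)

# Instances of A get no __len__ from the metaclass.
try:
    len(A())
    instance_error = None
except TypeError as exc:
    instance_error = exc
print(instance_error)
```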
Since a class is an instance of a metaclass, one way is to use a custom metaclass:
>>> Meta = type('Meta', (type,), {'__repr__': lambda cls: 'class A'})
>>> A = Meta('A', (object,), {'__repr__': lambda self: 'instance of class A'})
>>> A
class A
>>> A()
instance of class A
I fail to see how the syntax specifically is important, but if you really want a simple way to implement it, just use the normal __len__(self) that handles len(inst), but in your implementation make it return a class variable that all instances share:
class A:
    my_length = 5
    def __len__(self):
        return self.my_length
and you can later call it like that:
len(A()) #returns 5
Obviously this creates a temporary instance of your class, but length only makes sense for an instance of a class and not really for the concept of a class (a type object).
Editing the metaclass sounds like a very bad idea, and unless you are doing something for school or just messing around, I really suggest you rethink this idea.
try this:
class Lengthy:
    x = 5
    @classmethod
    def __len__(cls):
        return cls.x
The @classmethod allows you to call it directly on the class, but your len implementation won't be able to depend on any instance variables:
a = Lengthy()
len(a)
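One caveat worth checking for yourself: "call it directly on the class" here means `Lengthy.__len__()`, not `len(Lengthy)`. The sketch below (Python 3) exercises all three forms; the built-in `len` looks `__len__` up on the type of its argument, and the type of `Lengthy` is plain `type`.

```python
class Lengthy:
    x = 5

    @classmethod
    def __len__(cls):
        return cls.x

print(len(Lengthy()))     # 5: works on an instance
print(Lengthy.__len__())  # 5: explicit call on the class works too

# len() on the class itself is looked up on type(Lengthy), i.e. on type.
try:
    len(Lengthy)
    class_error = None
except TypeError as exc:
    class_error = exc
print(class_error)
```

So this variant does not answer the original `len(A)` question on its own; for that, the `__len__` has to live on the metaclass.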