I have an abstract base class Base that provides an abstract method _run() that needs to be implemented by derived classes, as well as a method run() that will call _run() and do some extra work that is common to all derived classes.
In all derived classes, I am setting the function docstring for the _run() method. As this function is not part of the public API, I want the same docstring (and function signature) to instead show up for the run() method.
Consider the following example:
import inspect
from abc import ABC, abstractmethod

class Base(ABC):
    @abstractmethod
    def _run(self):
        return

    def run(self, *args, **kwargs):
        """old_doc"""
        return self._run(*args, **kwargs)

class Derived(Base):
    def _run(self):
        """new_doc"""
        return
My initial idea was to manipulate the docstring in Base.__init__ or Base.__new__. This works to some extent, but presents a number of problems:
I want to be able to override these two methods (at the very least __init__) in derived classes.
This requires the class to be instantiated before the docstring is available.
By setting the docstring for Base.run when instantiating the derived class, it would in fact set the docstring for all derived classes.
class Base(ABC):
    def __init__(self):
        type(self).run.__doc__ = type(self)._run.__doc__
        type(self).run.__signature__ = inspect.signature(type(self)._run)
    ...
What I am hoping for:
>>> Derived.run.__doc__
'new_doc'
What I get so far:
>>> Derived.run.__doc__
'old_doc'
>>> Derived().run.__doc__
'new_doc'
Are there any solutions to this?
Don't modify the docstring of Base.run; instead, document what it does: it invokes a subclass-defined method.
class Base(ABC):
    @abstractmethod
    def _run(self):
        "Must be replaced with actual code"
        return

    def run(self, *args, **kwargs):
        """Does some setup and runs self._run"""
        return self._run(*args, **kwargs)

class Derived(Base):
    def _run(self):
        """Does some work"""
        return
There is no need to generate a new docstring for Derived.run, because Derived.run and Base.run evaluate to the exact same object: the run method defined by Base. Inheritance doesn't change what Base.run does just because it is invoked from an instance of Derived rather than an instance of Base.
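A quick way to check this claim, as a minimal sketch that reuses the class names from the answer (the `same_object` variable and the `"done"` return value are only there for illustration):

```python
from abc import ABC, abstractmethod


class Base(ABC):
    @abstractmethod
    def _run(self):
        """Must be replaced with actual code"""

    def run(self, *args, **kwargs):
        """Does some setup and runs self._run"""
        return self._run(*args, **kwargs)


class Derived(Base):
    def _run(self):
        """Does some work"""
        return "done"


# Derived does not define run, so attribute lookup finds the function
# stored on Base: both names refer to one and the same function object.
same_object = Derived.run is Base.run
print(same_object)
print("run" in vars(Derived))
```

This is also why assigning to `type(self).run.__doc__` in the question changes the docstring for every derived class at once: there is only one `run` function to modify.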
The best workaround I have come up with is to create a decorator instead:
from abc import ABC, abstractmethod

class Base(ABC):
    @abstractmethod
    def run(self, *args, **kwargs):
        """old_doc"""
        return self._run(*args, **kwargs)

def extra_work(func):
    # Do some extra work and modify func.__doc__
    ...
    return func

class Derived(Base):
    @extra_work
    def run(self):
        """new_doc"""
        return
This way the extra work can still be defined outside the derived class to avoid duplicating it in every class derived from Base, and I am able to automatically update the docstring to reflect the added functionality.
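One possible way to fill in such a decorator is sketched below. The body of `extra_work` is an assumption: the call counter stands in for whatever the shared work really is, and the text appended to the docstring is made up for the example. `functools.wraps` copies the docstring and records the original function so that `inspect.signature` reports the subclass's own signature.

```python
import functools
import inspect
from abc import ABC, abstractmethod


def extra_work(func):
    """Wrap a subclass's run() with the shared extra work (hypothetical)."""

    @functools.wraps(func)  # copies __doc__, __name__ and sets __wrapped__
    def wrapper(self, *args, **kwargs):
        # Placeholder for the common work; here we only count the calls.
        self.calls = getattr(self, "calls", 0) + 1
        return func(self, *args, **kwargs)

    # Assumed convention: append a note about the added behaviour.
    wrapper.__doc__ = (func.__doc__ or "") + " (plus shared extra work)"
    return wrapper


class Base(ABC):
    @abstractmethod
    def run(self, *args, **kwargs):
        """old_doc"""


class Derived(Base):
    @extra_work
    def run(self):
        """new_doc"""
        return "result"


signature_text = str(inspect.signature(Derived.run))
print(Derived.run.__doc__)
print(signature_text)
```

The docstring and signature are available on the class itself, without creating an instance first.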
Related
I have an abstract class with a static function that calls other abstract functions. But when I create a new class and override the abstract function, the original (abstract) function still runs.
I have written an example similar to my problem. Please help.
In the following example, I want to run do_something() from Main not Base.
from abc import ABC, abstractmethod

class Base(ABC):
    @staticmethod
    @abstractmethod
    def do_something():
        print('Base')

    @staticmethod
    def print_something():
        Base.do_something()

class Main(Base):
    @staticmethod
    def do_something():
        print('Main')

Main.print_something()
Output:
Base
Main.print_something doesn't exist, so it resolves to Base.print_something, which explicitly calls Base.do_something, not Main.do_something. You probably want print_something to be a class method instead.
class Base(ABC):
    @staticmethod
    @abstractmethod
    def do_something():
        print('Base')

    @classmethod
    def print_something(cls):
        cls.do_something()

class Main(Base):
    @staticmethod
    def do_something():
        print('Main')

Main.print_something()
Now when Main.print_something resolves to Base.print_something, it will still receive Main (not Base) as its argument, allowing it to invoke Main.do_something as desired.
I read that it is considered bad practice to create a variable in the class namespace and then change its value in the class constructor.
(One of my sources: SoftwareEngineering SE: Is it a good practice to declare instance variables as None in a class in Python.)
Consider the following code:
# lib.py
class mixin:
    def __init_subclass__(cls, **kwargs):
        cls.check_mixin_subclass_validity(cls)
        super().__init_subclass__(**kwargs)

    def check_mixin_subclass_validity(subclass):
        assert hasattr(subclass, 'necessary_var'), \
            'Missing necessary_var'

    def method_used_by_subclass(self):
        return self.necessary_var * 3.14

# app.py
class my_subclass(mixin):
    necessary_var = None

    def __init__(self, some_value):
        self.necessary_var = some_value

    def run(self):
        # DO SOME STUFF
        self.necessary_var = self.method_used_by_subclass()
        # DO OTHER STUFF
To force its subclasses to declare the variable necessary_var, the class mixin validates each subclass in __init_subclass__ (through check_mixin_subclass_validity).
The only way I know to make this work on the app.py side is to initialize necessary_var as a class variable.
Am I missing something, or is this the only way to do so?
Short answer
You should check that attributes and methods exist at instantiation of a class, not before. This is what the abc module does and it has good reasons to work like this.
Long answer
First, I would like to point out that it seems what you want to check is that an instance attribute exists.
Due to Python's dynamic nature, it is not possible to do so before an instance is created, that is, before the call to __init__ has finished. We could define Mixin.__init__, but we would then have to rely on the users of your API to have perfect hygiene and to always call super().__init__.
One option is thus to create a metaclass and add a check in its __call__ method.
class MetaMixin(type):
    def __call__(self, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        assert hasattr(instance, 'necessary_var')
        return instance  # without this, Foo() would evaluate to None

class Mixin(metaclass=MetaMixin):
    pass

class Foo(Mixin):
    def __init__(self):
        self.necessary_var = ...

Foo()  # Works fine

class Bar(Mixin):
    pass

Bar()  # AssertionError
To convince yourself that it is good practice to do this at instantiation, we can look toward the abc module which uses this behaviour.
from abc import abstractmethod, ABC

class AbstractMixin(ABC):
    @abstractmethod
    def foo(self):
        ...

class Foo(AbstractMixin):
    pass

# Right now, everything is still all good
Foo()  # TypeError: Can't instantiate abstract class Foo with abstract methods foo
As you can see, the TypeError was raised at instantiation of Foo() and not at class creation.
But why does it behave like this?
The reason is that not every class will be instantiated. Consider the case where we want to inherit from Mixin to create a new mixin which checks for some more attributes.
class Mixin:
    def __init_subclass__(cls, **kwargs):
        assert hasattr(cls, 'necessary_var')
        super().__init_subclass__(**kwargs)

class MoreMixin(Mixin):
    def __init_subclass__(cls, **kwargs):
        assert hasattr(cls, 'other_necessary_var')
        super().__init_subclass__(**kwargs)

# AssertionError was raised at that point

class Foo(MoreMixin):
    necessary_var = ...
    other_necessary_var = ...
As you see, the AssertionError was raised at the creation of the MoreMixin class. This is clearly not the desired behaviour since the Foo class is actually correctly built and that is what our mixin was supposed to check.
In conclusion, checking for the existence of some attribute or method should be done at instantiation. Otherwise, you are preventing a whole lot of helpful inheritance techniques. This is why the abc module does it like that, and this is why we should too.
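To make the conclusion concrete, here is a minimal sketch of an instantiation-time check that still allows intermediate mixins. The `required` class attribute and the error message are assumptions made for this example, not part of any library.

```python
class MetaMixin(type):
    """Check required attributes when an instance is created."""

    def __call__(cls, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        # Collect the (hypothetical) `required` tuples from every class in the MRO.
        for klass in type(instance).__mro__:
            for name in vars(klass).get("required", ()):
                if not hasattr(instance, name):
                    raise TypeError(
                        f"{type(instance).__name__} is missing {name!r}"
                    )
        return instance


class Mixin(metaclass=MetaMixin):
    required = ("necessary_var",)


class MoreMixin(Mixin):  # creating the intermediate mixin raises nothing
    required = ("other_necessary_var",)


class Foo(MoreMixin):
    def __init__(self):
        self.necessary_var = 1
        self.other_necessary_var = 2


class Incomplete(MoreMixin):
    def __init__(self):
        self.necessary_var = 1


foo = Foo()  # passes both checks
try:
    Incomplete()
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

Defining `MoreMixin` succeeds even though it sets neither attribute itself; only the attempt to instantiate `Incomplete` fails.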
I have a class whose job is to wrap another class (code I don't control), intercept all calls to the wrapped class, perform some logic, and pass along the call to the underlying class. Here's an example:
class GithubRepository(object):
    def get_commit(self, sha):
        return 'Commit {}'.format(sha)

    def get_contributors(self):
        return ['bobbytables']

class LoggingGithubRepositoryWrapper(object):
    def __init__(self, github_repository):
        self._github_repository = github_repository

    def __getattr__(self, name):
        base_func = getattr(self._github_repository, name)

        def log_wrap(*args, **kwargs):
            print("Calling {}".format(name))
            return base_func(*args, **kwargs)
        return log_wrap

if __name__ == '__main__':
    git_client = LoggingGithubRepositoryWrapper(GithubRepository())
    print(git_client.get_commit('abcdef1245'))
    print(git_client.get_contributors())
As you can see, the way that I do this is by implementing __getattr__ on the wrapping class and delegating to the underlying class. The downside to this approach is that users of LoggingGithubRepositoryWrapper don't know which attributes/methods the underlying GithubRepository actually has.
This leads me to my question: is there a way to define or document the calls handled by __getattr__? Ideally, I'd like to be able to autocomplete on git_client. and be provided a list of supported methods. Thanks for your help in advance!
You can do this a few different ways, but they won't involve the use of __getattr__.
What you really need to do is dynamically create your class, or at least dynamically create the wrapped functions on your class. There are a few ways to do this in Python.
You could build the class definition using type() or metaclasses, or build it on class instantiation using the __new__ method.
Every time you call LoggingGithubRepositoryWrapper(), the __new__ method will be called. Here, it looks at all the attributes on the github_repository argument and finds all the non-private methods. It then creates a function on the instantiated LoggingGithubRepositoryWrapper class instance that wraps the repo call in a logging statement.
At the end, it passes back the modified class instance. Then __init__ is called.
from types import MethodType

class LoggingGithubRepositoryWrapper(object):
    def __new__(cls, github_repository):
        self = super(LoggingGithubRepositoryWrapper, cls).__new__(cls)
        for name in dir(github_repository):
            if name.startswith('__'):
                continue
            func = getattr(github_repository, name)
            if isinstance(func, MethodType):
                setattr(self, name, cls.log_wrap(func))
        return self

    @staticmethod
    def log_wrap(func):
        def wrap(*args, **kwargs):
            print('Calling {0}'.format(func.__name__))
            return func(*args, **kwargs)
        return wrap

    def __init__(self, github_repository):
        ... # this is all the same
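For reference, a self-contained Python 3 sketch of the same idea follows. It assumes the `GithubRepository` class from the question; the use of `functools.wraps` and the `public_names` variable are additions for the example. Because the wrapped methods are stored on the instance, they show up in `dir(git_client)`, which is what interactive shells typically use for completion. Static analysis in an editor may still not see them.

```python
import functools
from types import MethodType


class GithubRepository:
    def get_commit(self, sha):
        return "Commit {}".format(sha)

    def get_contributors(self):
        return ["bobbytables"]


class LoggingGithubRepositoryWrapper:
    def __new__(cls, github_repository):
        self = super().__new__(cls)
        for name in dir(github_repository):
            if name.startswith("_"):
                continue
            attr = getattr(github_repository, name)
            if isinstance(attr, MethodType):
                # Store a logging wrapper for each public bound method.
                setattr(self, name, cls.log_wrap(attr))
        return self

    def __init__(self, github_repository):
        self._github_repository = github_repository

    @staticmethod
    def log_wrap(func):
        @functools.wraps(func)  # keep the wrapped method's name and docstring
        def wrap(*args, **kwargs):
            print("Calling {}".format(func.__name__))
            return func(*args, **kwargs)
        return wrap


git_client = LoggingGithubRepositoryWrapper(GithubRepository())
public_names = [n for n in dir(git_client) if not n.startswith("_")]
print(public_names)
print(git_client.get_commit("abcdef1245"))
```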
I have a class called CacheObject, and many classes extend it.
Now I need to add something common to all classes derived from this class, so I wrote this:
class CacheObject(object):
    def __init__(self):
        self.updatedict = dict()
But the child classes didn't obtain the updatedict attribute. I know that calling the super __init__ function is optional in Python, but is there an easy way to force all of them to run this __init__, rather than walking through all the classes and modifying them one by one?
I was in a situation where I wanted classes to always call their base classes' constructor in order before they call their own. The following is Python3 code that should do what you want:
class meta(type):
    def __init__(cls, name, bases, dct):
        def auto__call__init__(self, *a, **kw):
            for base in cls.__bases__:
                base.__init__(self, *a, **kw)
            cls.__init__child_(self, *a, **kw)
        cls.__init__child_ = cls.__init__
        cls.__init__ = auto__call__init__

class A(metaclass=meta):
    def __init__(self):
        print("Parent")

class B(A):
    def __init__(self):
        print("Child")
To illustrate, it will behave as follows:
>>> B()
Parent
Child
<__main__.B object at 0x000001F8EF251F28>
>>> A()
Parent
<__main__.A object at 0x000001F8EF2BB2B0>
I suggest a non-code fix:
Document that super().__init__() should be called by your subclasses before they use any other methods defined in it.
This is not an uncommon restriction. See, for instance, the documentation for threading.Thread in the standard library, which says:
If the subclass overrides the constructor, it must make sure to invoke the base class constructor (Thread.__init__()) before doing anything else to the thread.
There are probably many other examples, I just happened to have that doc page open.
You can override __new__. As long as your subclasses don't override __new__ without calling super().__new__, you'll be fine.
class CacheObject(object):
    def __new__(cls, *args, **kwargs):
        # object.__new__ accepts only cls here, so don't forward the arguments
        instance = super().__new__(cls)
        instance.updatedict = {}
        return instance

class Foo(CacheObject):
    def __init__(self):
        pass
However, as some commenters said, the motivation for this seems a little shady. You should perhaps just add the super calls instead.
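If you do go the __new__ route, here is a small sketch of how it behaves with a subclass whose constructor takes arguments and never calls super().__init__(). The `name` parameter is made up for the example. Note that this version passes only `cls` to `object.__new__`, which raises a TypeError if it receives extra arguments while `__new__` is overridden.

```python
class CacheObject:
    def __new__(cls, *args, **kwargs):
        # Constructor arguments still reach __init__; they are just not
        # forwarded to object.__new__.
        instance = super().__new__(cls)
        instance.updatedict = {}
        return instance


class Foo(CacheObject):
    def __init__(self, name):
        # Deliberately no super().__init__() call.
        self.name = name


foo = Foo("a")
other = Foo("b")
print(foo.updatedict, foo.name)
```

Each instance gets its own dictionary, since `__new__` runs once per object.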
This isn't what you asked for, but how about making updatedict a property, so that it doesn't need to be set in __init__:
class CacheObject(object):
    @property
    def updatedict(self):
        try:
            return self._updatedict
        except AttributeError:
            self._updatedict = dict()
            return self._updatedict
Hopefully this achieves the real goal, that you don't want to have to touch every subclass (other than to make sure none uses an attribute called updatedict for something else, of course).
There are some odd gotchas, though, because it is different from setting updatedict in __init__ as in your question. For example, the content of CacheObject().__dict__ is different. It has no key updatedict because I've put that key in the class, not in each instance.
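The difference in the instance __dict__ can be seen with a short sketch (the `before` and `after` variables are only there to capture the state for inspection):

```python
class CacheObject:
    @property
    def updatedict(self):
        try:
            return self._updatedict
        except AttributeError:
            self._updatedict = dict()
            return self._updatedict


obj = CacheObject()
before = dict(vars(obj))         # nothing is stored on the instance yet
obj.updatedict["key"] = "value"  # first access creates _updatedict
after = dict(vars(obj))
print(before)
print(after)
```

The instance only ever holds `_updatedict`, and only after the first access; the name `updatedict` itself lives on the class as a property.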
Regardless of motivation, another option is to use __init_subclass__() (Python 3.6+) to get this kind of behavior. (For example, I'm using it because I want users not familiar with the intricacies of Python to be able to inherit from a class to create specific engineering models, and I'm trying to keep the structure of the class they have to define very basic.)
In the case of your example,
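from functools import wraps

class CacheObject:
    def __init__(self) -> None:
        self.updatedict = dict()

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        orig_init = cls.__init__

        @wraps(orig_init)
        def __init__(self, *args, **kwargs):
            orig_init(self, *args, **kwargs)
            # Name the base explicitly: super(self.__class__, self) recurses
            # forever as soon as a subclass is itself subclassed.
            CacheObject.__init__(self)

        cls.__init__ = __init__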
What this does is any class that subclasses CacheObject will now, when created, have its __init__ function wrapped by the parent class—we're replacing it with a new function that calls the original, and then calls super() (the parent's) __init__ function. So now, even if the child class overrides the parent __init__, at the instance's creation time, its __init__ is then wrapped by a function that calls it and then calls its parent.
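A usage sketch with two levels of inheritance is shown below. `Child` and `GrandChild` are hypothetical classes, and this version calls `CacheObject.__init__` by name rather than through `super(self.__class__, self)`, which would recurse endlessly for `GrandChild`.

```python
from functools import wraps


class CacheObject:
    def __init__(self):
        self.updatedict = dict()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        orig_init = cls.__init__

        @wraps(orig_init)
        def __init__(self, *args, **kwargs):
            orig_init(self, *args, **kwargs)
            CacheObject.__init__(self)  # always runs, after the child's code

        cls.__init__ = __init__


class Child(CacheObject):
    def __init__(self, name):
        self.name = name  # no super().__init__() here


class GrandChild(Child):
    pass


child = Child("a")
grandchild = GrandChild("b")
print(child.updatedict, grandchild.name)
```

One thing to keep in mind with this ordering: because the base initializer runs last, it would overwrite an updatedict that a child's own __init__ had already filled.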
You can add a decorator to your classes:
def my_decorator(cls):
    old_init = cls.__init__

    def new_init(self, *args, **kwargs):
        self.updatedict = dict()
        old_init(self, *args, **kwargs)

    cls.__init__ = new_init
    return cls

@my_decorator
class SubClass(CacheObject):
    pass
If you want to add the decorator to all the subclasses automatically, use a metaclass:
class myMeta(type):
    def __new__(cls, name, parents, dct):
        return my_decorator(super().__new__(cls, name, parents, dct))

class CacheObject(object, metaclass=myMeta):
    pass
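Putting the two pieces together, a self-contained sketch might look like this. The `SubClass` constructor argument is made up, and `new_init` is written to forward arbitrary arguments, which is an assumption beyond the minimal version above.

```python
def my_decorator(cls):
    old_init = cls.__init__

    def new_init(self, *args, **kwargs):
        self.updatedict = dict()  # set before the class's own __init__ runs
        old_init(self, *args, **kwargs)

    cls.__init__ = new_init
    return cls


class MyMeta(type):
    def __new__(mcs, name, parents, dct):
        # Every class created through this metaclass gets decorated.
        return my_decorator(super().__new__(mcs, name, parents, dct))


class CacheObject(metaclass=MyMeta):
    pass


class SubClass(CacheObject):
    def __init__(self, value):
        self.value = value  # no super().__init__() call


sub = SubClass(42)
print(sub.updatedict, sub.value)
```

Since the metaclass is inherited, `SubClass` is decorated without any change to its own definition.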
I'd like to automatically run some code upon class creation that can call other class methods. I have not found a way of doing so from within the class declaration itself, so I end up creating a @classmethod called __clsinit__ and calling it from the defining scope immediately after the class declaration. Is there a method I can define such that it will get called automatically after the class object is created?
You can do this with a metaclass or a class decorator.
A class decorator (since 2.6) is probably easier to understand:
def call_clsinit(cls):
    cls._clsinit()
    return cls

@call_clsinit
class MyClass:
    @classmethod
    def _clsinit(cls):
        print("MyClass._clsinit()")
Metaclasses are more powerful; they can call code and modify the ingredients of the class before it is created as well as afterwards (also, they can be inherited):
def call_clsinit(*args, **kwargs):
    cls = type(*args, **kwargs)
    cls._clsinit()
    return cls

# Python 2 syntax; in Python 3, write `class MyClass(metaclass=call_clsinit):` instead.
class MyClass(object):
    __metaclass__ = call_clsinit

    @classmethod
    def _clsinit(cls):
        print("MyClass._clsinit()")
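For Python 3, a class-based metaclass is one way to get the inherited behaviour mentioned above. This is a sketch: the `CallClsInit` name and the `initialised` list are invented for the example.

```python
class CallClsInit(type):
    """Metaclass that runs _clsinit() right after the class is created."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._clsinit()


class MyClass(metaclass=CallClsInit):
    initialised = []  # shared list recording which classes were set up

    @classmethod
    def _clsinit(cls):
        cls.initialised.append(cls.__name__)


class Child(MyClass):  # the metaclass is inherited, so this triggers too
    pass


print(MyClass.initialised)
```

No explicit call is needed after either class statement; the metaclass's `__init__` runs as part of class creation.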