How to call a subclass method from a base class method, only if the subclass supports that method? And what's the best way to do that? Illustrative example, I have an animal that protects my house: if someone walks by it will look angry, and it will bark if it can.
Example code:
class Protector(object):
    def protect(self):
        self.lookangry()
        if hasattr(self, 'bark'):
            self.bark()

class GermanShepherd(Protector):
    def lookangry(self):
        print u') _ _ __/°°¬'
    def bark(self):
        print 'wau wau'

class ScaryCat(Protector):
    def lookangry(self):
        print '=^..^='
I can think of lots of alternative implementations for this:
1. Using hasattr as above.
2. try: self.bark() except AttributeError: pass, but that also catches any AttributeErrors raised inside bark.
3. Same as 2, but inspect the error message to make sure it's the right AttributeError.
4. Like 2, but define an abstract bark method that raises NotImplementedError in the abstract class and check for NotImplementedError instead of AttributeError. With this solution Pylint will complain that I forgot to override the abstract method in ScaryCat.
5. Define an empty bark method in the abstract class:
class Protector(object):
    def protect(self):
        self.lookangry()
        self.bark()
    def bark(self):
        pass
I figured that in Python there should usually be one way to do something. In this case it's not clear to me which. Which one of these options is most readable, least likely to introduce a bug when stuff is changed, and most in line with coding standards, especially Pylint? Is there a better way to do it that I've missed?
It seems to me you're thinking about inheritance incorrectly. The base class is supposed to encapsulate everything that is shared across any of the subclasses. If something is not shared by all subclasses, by definition it is not part of the base class.
So your statement "if someone walks by it will look angry, and it will bark if it can" doesn't make sense to me. The "bark if it can" part is not shared across all subclasses, therefore it shouldn't be implemented in the base class.
What should happen is that the subclass that you want to bark adds this functionality to the protect() method. As in:
class Protector():
    def protect(self):
        self.lookangry()

class GermanShepherd(Protector):
    def protect(self):
        super().protect()  # or super(GermanShepherd, self).protect() for Python 2
        self.bark()
This way all subclasses will lookangry(), but the subclasses which implement a bark() method will have it as part of the extended functionality of the superclass's protect() method.
I think option 6 could be that the Protector class makes just the basic shared methods abstract, and thus required, while leaving the extra methods to its heirs. Of course this can be split into more subclasses; see https://repl.it/repls/AridScrawnyCoderesource (written in Python 3.6).
class Protector(object):
    def lookangry(self):
        raise NotImplementedError("If it can't look angry, it can't protect")
    def protect(self):
        self.lookangry()

class Doggo(Protector):
    def bark(self):
        raise NotImplementedError("If a dog can't bark, it can't protect")
    def protect(self):
        super().protect()
        self.bark()

class GermanShepherd(Doggo):
    def lookangry(self):
        print(') _ _ __/°°¬')
    def bark(self):
        print('wau wau')

class Pug(Doggo):
    # We will not consider that screeching as barking so no bark method
    def lookangry(self):
        print('(◉ω◉)')

class ScaryCat(Protector):
    def lookangry(self):
        print('o(≧o≦)o')

class Kitten(Protector):
    pass

doggo = GermanShepherd()
doggo.protect()

try:
    gleam_of_silver = Pug()
    gleam_of_silver.protect()
except NotImplementedError as e:
    print(e)

cheezburger = ScaryCat()
cheezburger.protect()

try:
    ball_of_wool = Kitten()
    ball_of_wool.protect()
except NotImplementedError as e:
    print(e)
You missed one possibility:
Define a bark method that raises NotImplementedError, as in your option 4, but don't make it abstract.
This eliminates PyLint's complaint—and, more importantly, eliminates the legitimate problem it was complaining about.
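As a rough sketch of this "option 4.5" (written for Python 3, and returning strings instead of printing purely so the result is easy to check, which is an assumption of this example rather than part of the question's code), the base class could look something like this:

```python
class Protector:
    def protect(self):
        actions = [self.lookangry()]
        try:
            actions.append(self.bark())
        except NotImplementedError:
            pass  # this protector simply does not bark
        return actions

    def lookangry(self):
        raise NotImplementedError("subclasses must look angry")

    def bark(self):
        # Optional hook: a concrete (non-abstract) method that reports
        # "not supported" by default, so subclasses may ignore it.
        raise NotImplementedError


class GermanShepherd(Protector):
    def lookangry(self):
        return "grr"

    def bark(self):
        return "wau wau"


class ScaryCat(Protector):
    def lookangry(self):
        return "=^..^="
```

Because bark is an ordinary method rather than a declared-abstract one, ScaryCat has nothing it is obliged to override.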
As for your other options:
hasattr is unnecessary LBYL, which is usually not Pythonic.
The except problem can be handled by doing bark = self.bark inside a try block, then doing bark() if it passes. This is sometimes necessary, but the fact that it's a bit clumsy and hasn't been "fixed" should give you an idea of how often it's worth doing.
Inspecting error messages is an anti-pattern. Anything that's not a separate, documented argument value is subject to change across Python versions and implementations. (Plus, what if ManWithSidekick.bark() does self.sidekick.bark()? How would you distinguish the AttributeError there?)
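The "look up first, then call" workaround for option 2 might be sketched as follows (Python 3; BrokenDog and its deliberately buggy bark are hypothetical names added only to show that errors inside bark are no longer swallowed):

```python
class Protector:
    def protect(self):
        self.lookangry()
        try:
            bark = self.bark   # only the attribute lookup is guarded
        except AttributeError:
            return             # no bark method: nothing more to do
        bark()                 # errors raised inside bark() propagate


class ScaryCat(Protector):
    def lookangry(self):
        pass


class BrokenDog(Protector):
    def lookangry(self):
        pass

    def bark(self):
        return self.no_such_attribute  # a genuine bug inside bark
```

Here ScaryCat().protect() finishes quietly, while BrokenDog().protect() still raises the AttributeError coming from inside its bark.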
So, that leaves 2, 4.5, and 5.
I think in most cases, either 4.5 or 5 will be the right thing to do. The difference between them is not pragmatic, but conceptual: if a ScaryCat is an animal that barks silently, use option 5; if not, then barking must be an optional part of protection that not all protectors do, in which case use option 4.5.
For this toy example, I think I'd use option 4.5. And I think that will be the case with most toy examples you come up with.
However, I suspect that most real-life examples will be pretty different:
Most real-life examples won't need this deep hierarchy.
Of those that do, usually bark will either be implemented by all subclasses, or won't be called by the superclass.
Of those that do need this, I think option 5 will usually fit. Sure, barking silently is not something a ScaryCat does, but parse_frame silently is something a ProxyProtocol does.
And there are so few exceptions left after that, that it's hard to speak about them abstractly and generally.
I come from a C# background where the language has some built in "protect the developer" features. I understand that Python takes the "we're all adults here" approach and puts responsibility on the developer to code thoughtfully and carefully.
That said, Python suggests conventions like a leading underscore for private instance variables. My question is, is there a particular convention for marking a class as abstract other than just specifying it in the docstrings? I haven't seen anything in particular in the python style guide that mentions naming conventions for abstract classes.
I can think of 3 options so far but I'm not sure if they're good ideas:
Specify it in the docstring above the class (might be overlooked)
Use a leading underscore in the class name (not sure if this is universally understood)
Create a def __init__(self): method on the abstract class that raises an error (not sure if this negatively impacts inheritance, like if you want to call a base constructor)
Is one of these a good option or is there a better one? I just want to make sure that other developers know that it is abstract and so if they try to instantiate it they should accept responsibility for any strange behavior.
If you're using Python 2.6 or higher, you can use the Abstract Base Class module from the standard library if you want to enforce abstractness. Here's an example:
from abc import ABCMeta, abstractmethod

class SomeAbstractClass(object):
    __metaclass__ = ABCMeta

    @abstractmethod
    def this_method_must_be_overridden(self):
        return "But it can have an implementation (callable via super)."

class ConcreteSubclass(SomeAbstractClass):
    def this_method_must_be_overridden(self):
        s = super(ConcreteSubclass, self).this_method_must_be_overridden()
        return s.replace("can", "does").replace(" (callable via super)", "")
Output:
>>> a = SomeAbstractClass()
Traceback (most recent call last):
File "<pyshell#13>", line 1, in <module>
a = SomeAbstractClass()
TypeError: Can't instantiate abstract class SomeAbstractClass with abstract
methods this_method_must_be_overridden
>>> c = ConcreteSubclass()
>>> c.this_method_must_be_overridden()
'But it does have an implementation.'
Based on your last sentence, I would answer "just document it". Anyone who uses a class in a way that the documentation says not to must accept responsibility for any strange behavior.
There is an abstract base class mechanism in Python, but I don't see any reason to use it if your only goal is to discourage instantiation.
I just name my abstract classes with the prefix 'Abstract'. E.g. AbstractDevice, AbstractPacket, etc.
It's about as easy and to the point as it comes. If others choose to go ahead and instantiate and/or use a class that starts with the word 'Abstract', then they either know what they're doing or there was no hope for them anyway.
Naming it thus, also serves as a reminder to myself not to go nuts with deep abstraction hierarchies, because putting 'Abstract' on the front of a whole lot of classes feels stupid too.
Create your 'abstract' class and raise NotImplementedError() in the abstract methods.
It won't stop people using the class and, in true duck-typing fashion, it will let you know if you neglect to implement the abstract method.
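A minimal sketch of that convention in Python 3 (AbstractDevice borrows the naming style from the earlier answer; read, read_twice and FakeSensor are hypothetical names used only for illustration):

```python
class AbstractDevice:
    """Abstract by convention only: subclasses must implement read()."""

    def read(self):
        raise NotImplementedError("subclasses must implement read()")

    def read_twice(self):
        # Shared behaviour built on top of the "abstract" method.
        return [self.read(), self.read()]


class FakeSensor(AbstractDevice):
    def read(self):
        return 42


device = AbstractDevice()  # allowed: nothing prevents instantiation
```

The failure only shows up when device.read() is actually called, which is the trade-off this approach accepts.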
In Python 3.x, your class can inherit from abc.ABC.
This will make your class non-instantiable and your IDE will warn you if you try to do so.
import abc

class SomeAbstractClass(abc.ABC):
    @abc.abstractmethod
    def some_abstract_method(self):
        raise NotImplementedError

    @property
    @abc.abstractmethod
    def some_abstract_property(self):
        raise NotImplementedError
This was first suggested in PEP 3119.
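To see the enforcement in action, here is a small Python 3 sketch (Concrete is a hypothetical subclass, and the exact wording of the TypeError message varies between Python versions, so it is captured rather than quoted):

```python
import abc


class SomeAbstractClass(abc.ABC):
    @abc.abstractmethod
    def some_abstract_method(self):
        raise NotImplementedError


class Concrete(SomeAbstractClass):
    def some_abstract_method(self):
        return "implemented"


error = None
try:
    SomeAbstractClass()  # refused: an abstract method is still unimplemented
except TypeError as exc:
    error = str(exc)

instance = Concrete()    # fine: every abstract method is overridden
```

Note that the check happens at instantiation time, not when the class is defined.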
Enforcing things is possible, but rather unpythonic. When I came to Python after many years of C++ programming I also tried to do the same; I suppose most people try to do so if they have experience in more classical languages. Metaclasses would do the job, but anyway Python checks very few things at compilation time. Your check will still be performed at runtime. So, is the inability to create a certain class really that useful if discovered only at runtime? In C++ (and in C# as well) you cannot even compile your code if it instantiates an abstract class, and that is the whole point: to discover the problem as early as possible. If you have abstract methods, raising a NotImplementedError exception seems to be quite enough. NB: raising, not returning an error code! In Python errors usually should not be silent unless they are silenced explicitly. Documenting. Naming a class in a way that says it's abstract. That's all.
Quality of Python code is ensured mostly with methods that are quite different from those used in languages with advanced compile-time type checking: unit tests, coverage analysis, etc. Personally I consider that the most serious difference between dynamically typed languages and the others. As a result, the design of code is quite different: everything is done not to enforce things, but to make testing them as easy as possible.
This is a style conventions question.
PEP8 convention for a class definition would be something like
class MyClass(object):
    def __init__(self, attri):
        self.attri = attri
So say I want to write a module-scoped function which takes some data, processes it, and then creates an instance of MyClass.
PEP8 says that my function definitions should have lowercase_underscore style names, like
def get_my_class(arg1, arg2, arg3):
    pass
But my inclination would be to make it clear that I'm talking about MyClass instances like so
def get_MyClass(arg1, arg2, arg3):
    pass
For this case, it looks trivially obvious that my_class and MyClass are related, but there are some cases where it's not so obvious. For example, I'm pulling data from a spreadsheet and have a SpreadsheetColumn class that takes the form of a heading attribute and a data list attribute. Yet, if you didn't know I was talking about an instance of the SpreadsheetColumn class, you might think that I'm talking about a raw column of cells as they might appear in an Excel sheet.
I'm wondering if it's reasonable to violate PEP8 to use get_MyClass. Being new to Python, I don't want to create a habit for a bad naming convention.
I've searched PEP8 and Stack Overflow and didn't see anything that addressed the issue.
Depending on the usage of the function, it might be more appropriate to turn it into a classmethod or staticmethod. Then its association with the class is clear, but you don't violate any naming conventions.
e.g.:
class MyClass(object):
    def __init__(self, arg):
        self.arg = arg

    @classmethod
    def from_sum(cls, *args):
        return cls(sum(args))

inst = MyClass.from_sum(1, 2, 3, 4)
print inst.arg  # 10
Let's take a step back. Usually, you don't want to do this at all, so the naming convention is the least of your worries.
First, normally, you don't care what actual class or type something is. This is what duck typing is all about. You don't want a SpreadsheetColumn instance, you want something that you can use as a spreadsheet column. It may be an instance of SpreadsheetColumn, or of a subclass, or of some proxy class, or of some mock class for testing—whatever it is, you don't care, as long as it looks and works like a column.
Notice that, even in static languages like Java and C#, factory functions (or objects) usually don't create an instance of a specific class, they create an instance of any class that implements a specific interface. In Python, that's usually implicit. (And, when it's not, it's usually because you're using something like PEAK or Twisted, and you should follow their coding style for protocols or interfaces.)
So, your factory function should be called get_column, not get_SpreadsheetColumn.
When the function is more of an "alternate constructor" than a factory, then mgilson's answer is the way to go. See chain() and chain.from_iterable() in itertools for a good standard library example.
But notice that this isn't very common in the standard library, most of the popular modules on PyPI, etc. And there's a good reason. Usually, you can just use a single constructor with default-valued parameters, keyword parameters, or at worst *args and **kwargs. If this makes the API too confusing for human readers, or too ambiguous to code, that's when you need an alternate constructor. Otherwise, you don't.
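To make the distinction concrete, here is a hedged Python 3 sketch using the SpreadsheetColumn class from the question. The heading and data attributes come from the question; the from_cells classmethod is a hypothetical alternate constructor, included only to show when one starts to pay off:

```python
class SpreadsheetColumn:
    def __init__(self, heading, data=None):
        # A single constructor with a default-valued parameter covers
        # both "empty column" and "column with data".
        self.heading = heading
        self.data = list(data) if data is not None else []

    @classmethod
    def from_cells(cls, cells):
        """Hypothetical alternate constructor: the first cell is the
        heading and the remaining cells are the data."""
        heading, *data = cells
        return cls(heading, data)


empty = SpreadsheetColumn("Price")
filled = SpreadsheetColumn.from_cells(["Price", 1, 2])
```

Folding the "raw cells" case into __init__ would make the first argument ambiguous (a heading or a list of cells?), which is the kind of confusion that justifies a separately named constructor.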
Sometimes, you really do need a factory that creates objects of a concrete type, and that concrete type is a part of the interface that the caller needs to know about. As I mentioned above, this is pretty rare even in static languages, and it's even rarer in Python, but it does come up. And then, you really do need an answer to your original question.
In that case, I think I would name the function something ugly and unusual like get_MyClass or get_MyClass_instance. It ought to stick out immediately, because anyone reading my code will probably need to figure out why I'm explicitly getting a MyClass instead of a thing in order to understand the rest of my code.
Say I have a class, which has a number of subclasses.
I can instantiate the class. I can then set its __class__ attribute to one of the subclasses. I have effectively changed the class type to the type of its subclass, on a live object. I can call methods on it which invoke the subclass's version of those methods.
So, how dangerous is doing this? It seems weird, but is it wrong to do such a thing? Despite the ability to change type at run-time, is this a feature of the language that should completely be avoided? Why or why not?
(Depending on responses, I'll post a more-specific question about what I would like to do, and if there are better alternatives).
Here's a list of things I can think of that make this dangerous, in rough order from worst to least bad:
It's likely to be confusing to someone reading or debugging your code.
You won't have gotten the right __init__ method, so you probably won't have all of the instance variables initialized properly (or even at all).
The differences between 2.x and 3.x are significant enough that it may be painful to port.
There are some edge cases with classmethods, hand-coded descriptors, hooks to the method resolution order, etc., and they're different between classic and new-style classes (and, again, between 2.x and 3.x).
If you use __slots__, all of the classes must have identical slots. (And if you have the compatible but different slots, it may appear to work at first but do horrible things…)
Special method definitions in new-style classes may not change. (In fact, this will work in practice with all current Python implementations, but it's not documented to work, so…)
If you use __new__, things will not work the way you naively expected.
If the classes have different metaclasses, things will get even more confusing.
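The missing-__init__ problem from the list above is easy to reproduce. A minimal Python 3 sketch (Base, Derived and the extra attribute are hypothetical names):

```python
class Base:
    def __init__(self):
        self.name = "base"


class Derived(Base):
    def __init__(self):
        super().__init__()
        self.extra = []  # only ever set by Derived.__init__

    def add(self, item):
        self.extra.append(item)


obj = Base()
obj.__class__ = Derived  # Derived.__init__ is never run

failure = None
try:
    obj.add(1)           # relies on self.extra, which does not exist
except AttributeError as exc:
    failure = exc
```

The object now claims to be a Derived, but it only carries the state that Base.__init__ gave it.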
Meanwhile, in many cases where you'd think this is necessary, there are better options:
Use a factory to create an instance of the appropriate class dynamically, instead of creating a base instance and then munging it into a derived one.
Use __new__ or other mechanisms to hook the construction.
Redesign things so you have a single class with some data-driven behavior, instead of abusing inheritance.
As a very common specific case of the last one, just put all of the "variable methods" into classes whose instances are kept as a data member of the "parent", rather than into subclasses. Instead of changing self.__class__ = OtherSubclass, just do self.member = OtherSubclass(self). If you really need methods to magically change, automatic forwarding (e.g., via __getattr__) is a much more common and pythonic idiom than changing classes on the fly.
Assigning the __class__ attribute is useful if you have a long-running application and you need to replace an old version of some object with a newer version of the same class without loss of data, e.g. after some reload(mymodule) and without reloading unchanged modules. Another example is implementing persistence, something similar to pickle.load.
All other usage is discouraged, especially if you can write the complete code before starting the application.
On arbitrary classes, this is extremely unlikely to work, and is very fragile even if it does. It's basically the same thing as pulling the underlying function objects out of the methods of one class, and calling them on objects which are not instances of the original class. Whether or not that will work depends on internal implementation details, and is a form of very tight coupling.
That said, changing the __class__ of objects amongst a set of classes that were particularly designed to be used this way could be perfectly fine. I've been aware that you can do this for a long time, but I've never yet found a use for this technique where a better solution didn't spring to mind at the same time. So if you think you have a use case, go for it. Just be clear in your comments/documentation what is going on. In particular it means that the implementation of all the classes involved have to respect all of their invariants/assumptions/etc, rather than being able to consider each class in isolation, so you'd want to make sure that anyone who works on any of the code involved is aware of this!
Well, not discounting the problems cautioned about at the start. But it can be useful in certain cases.
First of all, the reason I am looking this post up is because I did just this and __slots__ doesn't like it. (yes, my code is a valid use case for slots, this is pure memory optimization) and I was trying to get around a slots issue.
I first saw this in Alex Martelli's Python Cookbook (1st ed). In the 3rd ed, it's recipe 8.19 "Implementing Stateful Objects or State Machine Problems". A fairly knowledgeable source, Python-wise.
Suppose you have an ActiveEnemy object that has different behavior from an InactiveEnemy and you need to switch back and forth quickly between them. Maybe even a DeadEnemy.
If InactiveEnemy was a subclass or a sibling, you could switch class attributes. More exactly, the exact ancestry matters less than the methods and attributes being consistent to code calling it. Think Java interface or, as several people have mentioned, your classes need to be designed with this use in mind.
Now, you still have to manage state transition rules and all sorts of other things. And, yes, if your client code is not expecting this behavior and your instances switch behavior, things will hit the fan.
But I've used this quite successfully on Python 2.x and never had any unusual problems with it. Best done with a common parent and small behavioral differences on subclasses with the same method signatures.
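A rough Python 3 sketch of that pattern, under the assumptions stated above (a common parent, identical method signatures, no __slots__). The Enemy base class and the activate, deactivate and act methods are hypothetical names, not taken from the Cookbook recipe:

```python
class Enemy:
    def __init__(self, name):
        self.name = name

    def activate(self):
        self.__class__ = ActiveEnemy    # switch behaviour, keep state

    def deactivate(self):
        self.__class__ = InactiveEnemy


class InactiveEnemy(Enemy):
    def act(self):
        return f"{self.name} sleeps"


class ActiveEnemy(Enemy):
    def act(self):
        return f"{self.name} attacks"


orc = InactiveEnemy("orc")
first = orc.act()
orc.activate()
second = orc.act()
```

Both subclasses rely only on state set up by the shared Enemy.__init__, which is what keeps the switch safe in this sketch.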
No problems, until my __slots__ issue that's blocking it just now. But slots are a pain in the neck in general.
I would not do this to patch live code. I would also privilege using a factory method to create instances.
But to manage very specific conditions known in advance? Like a state machine that the clients are expected to understand thoroughly? Then it is pretty darn close to magic, with all the risk that comes with it. It's quite elegant.
Python 3 concerns? Test it to see if it works but the Cookbook uses Python 3 print(x) syntax in its example, FWIW.
The other answers have done a good job of discussing the question of why just changing __class__ is likely not an optimal decision.
Below is one example of a way to avoid changing __class__ after instance creation, using __new__. I'm not recommending it, just showing how it could be done, for the sake of completeness. However it is probably best to do this using a boring old factory rather than shoe-horning inheritance into a job for which it was not intended.
class ChildDispatcher:
    _subclasses = dict()

    def __new__(cls, *args, dispatch_arg, **kwargs):
        # dispatch to a registered child class
        subcls = cls.getsubcls(dispatch_arg)
        return super(ChildDispatcher, subcls).__new__(subcls)

    def __init_subclass__(subcls, **kwargs):
        super(ChildDispatcher, subcls).__init_subclass__(**kwargs)
        # add __new__ constructor to child class based on default first dispatch argument
        def __new__(cls, *args, dispatch_arg = subcls.__qualname__, **kwargs):
            return super(ChildDispatcher,cls).__new__(cls, *args, **kwargs)
        subcls.__new__ = __new__
        ChildDispatcher.register_subclass(subcls)

    @classmethod
    def getsubcls(cls, key):
        name = cls.__qualname__
        if cls is not ChildDispatcher:
            raise AttributeError(f"type object {name!r} has no attribute 'getsubcls'")
        try:
            return ChildDispatcher._subclasses[key]
        except KeyError:
            raise KeyError(f"No child class key {key!r} in the "
                           f"{cls.__qualname__} subclasses registry")

    @classmethod
    def register_subclass(cls, subcls):
        name = subcls.__qualname__
        if cls is not ChildDispatcher:
            raise AttributeError(f"type object {name!r} has no attribute "
                                 f"'register_subclass'")
        if name not in ChildDispatcher._subclasses:
            ChildDispatcher._subclasses[name] = subcls
        else:
            raise KeyError(f"{name} subclass already exists")

class Child(ChildDispatcher): pass

c1 = ChildDispatcher(dispatch_arg = "Child")
assert isinstance(c1, Child)
c2 = Child()
assert isinstance(c2, Child)
How "dangerous" it is depends primarily on what the subclass would have done when initializing the object. It's entirely possible that it would not be properly initialized, having only run the base class's __init__(), and something would fail later because of, say, an uninitialized instance attribute.
Even without that, it seems like bad practice for most use cases. Easier to just instantiate the desired class in the first place.
Here's an example of one way you could do the same thing without changing __class__. Quoting @unutbu in the comments to the question:
Suppose you were modeling cellular automata. Suppose each cell could be in one of say 5 Stages. You could define 5 classes Stage1, Stage2, etc. Suppose each Stage class has multiple methods.
class Stage1(object):
    …

class Stage2(object):
    …

…

class Cell(object):
    def __init__(self):
        self.current_stage = Stage1()
    def goToStage2(self):
        self.current_stage = Stage2()
    def __getattr__(self, attr):
        return getattr(self.current_stage, attr)
If you allow changing __class__ you could instantly give a cell all the methods of a new stage (same names, but different behavior).
Same for changing current_stage, but this is a perfectly normal and pythonic thing to do, that won't confuse anyone.
Plus, it allows you to not change certain special methods you don't want changed, just by overriding them in Cell.
Plus, it works for data members, class methods, static methods, etc., in ways every intermediate Python programmer already understands.
If you refuse to change __class__, then you might have to include a stage attribute, and use a lot of if statements, or reassign a lot of attributes pointing to different stage's functions
Yes, I've used a stage attribute, but that's not a downside—it's the obvious visible way to keep track of what the current stage is, better for debugging and for readability.
And there's not a single if statement or any attribute reassignment except for the stage attribute.
And this is just one of multiple different ways of doing this without changing __class__.
In the comments I proposed modeling cellular automata as a possible use case for dynamic __class__s. Let's try to flesh out the idea a bit:
Using dynamic __class__:
class Stage(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Stage1(Stage):
    def step(self):
        if ...:
            self.__class__ = Stage2

class Stage2(Stage):
    def step(self):
        if ...:
            self.__class__ = Stage3

cells = [Stage1(x,y) for x in range(rows) for y in range(cols)]

def step(cells):
    for cell in cells:
        cell.step()
    yield cells
For lack of a better term, I'm going to call this
The traditional way: (mainly abarnert's code)
class Stage1(object):
    def step(self, cell):
        ...
        if ...:
            cell.goToStage2()

class Stage2(object):
    def step(self, cell):
        ...
        if ...:
            cell.goToStage3()

class Cell(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.current_stage = Stage1()
    def goToStage2(self):
        self.current_stage = Stage2()
    def __getattr__(self, attr):
        return getattr(self.current_stage, attr)

cells = [Cell(x,y) for x in range(rows) for y in range(cols)]

def step(cells):
    for cell in cells:
        cell.step(cell)
    yield cells
Comparison:
The traditional way creates a list of Cell instances each with a
current stage attribute.
The dynamic __class__ way creates a list of instances which are
subclasses of Stage. There is no need for a current stage
attribute since __class__ already serves this purpose.
The traditional way uses goToStage2, goToStage3, ... methods to
switch stages.
The dynamic __class__ way requires no such methods. You just
reassign __class__.
The traditional way uses the special method __getattr__ to delegate
some method calls to the appropriate stage instance held in the
self.current_stage attribute.
The dynamic __class__ way does not require any such delegation. The
instances in cells are already the objects you want.
The traditional way needs to pass the cell as an argument to
Stage.step. This is so cell.goToStageN can be called.
The dynamic __class__ way does not need to pass anything. The
object we are dealing with has everything we need.
Conclusion:
Both ways can be made to work. To the extent that I can envision how these two implementations would pan-out, it seems to me the dynamic __class__ implementation will be
simpler (no Cell class),
more elegant (no ugly goToStage2 methods, no brain-teasers like why
you need to write cell.step(cell) instead of cell.step()),
and easier to understand (no __getattr__, no additional level of
indirection)
I don't even know how to explain this, so here is the code I'm trying.
from couchdb.schema import Document, TextField

class Base(Document):
    type = TextField(default=self.__name__)
    # self doesn't work, how do I get a reference to Base?

class User(Base):
    pass
    # User.type should be defined as TextField(default="Test2")
The reason I'm even trying this is I'm working on creating a base class for an orm I'm using. I want to avoid defining the table name for every model I have. Also knowing what the limits of python is will help me avoid wasting time trying impossible things.
The class object does not (yet) exist while the class body is executing, so there is no way for code in the class body to get a reference to it (just as, more generally, there is no way for any code to get a reference to any object that does not exist). Test2.__name__, however, already does what you're specifically looking for, so I don't think you need any workaround (such as metaclasses or class decorators) for your specific use case.
Edit: for the edited question, where you don't just need the name as a string, a class decorator is the simplest way to work around the problem (in Python 2.6 or later):
def maketype(cls):
    cls.type = TextField(default=cls.__name__)
    return cls
and put @maketype in front of each class you want to decorate that way. In Python 2.5 or earlier, you need instead to say maketype(Base) after each relevant class statement.
If you want this functionality to get inherited, then you have to define a custom metaclass that performs the same functionality in its __init__ or __new__ methods. Personally, I would recommend against defining custom metaclasses unless they're really indispensable -- instead, I'd stick with the simpler decorator approach.
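For completeness, a metaclass version might look roughly like the Python 3 sketch below. TextField here is a tiny stand-in defined only so the example runs by itself; it is not the real couchdb class. TypeNamingMeta is a hypothetical name. With the real Document base class, which may define its own metaclass, the custom metaclass would presumably have to derive from that one instead of from type.

```python
class TextField:
    """Stand-in for the ORM's field type (NOT the real couchdb class)."""

    def __init__(self, default=None):
        self.default = default


class TypeNamingMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Runs once for every class created with this metaclass,
        # including subclasses, so each one gets its own field.
        cls.type = TextField(default=name)


class Base(metaclass=TypeNamingMeta):
    pass


class User(Base):
    pass
```

User inherits the metaclass from Base, so it gets TextField(default="User") without any decorator or extra declaration.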
You may want to check out the other question python super class relection
In your case, Test2.__base__ will return the base class Test. If it doesn't work, you may use the new style: class Test(object)