Python abstract classes - how to discourage instantiation? - python

I come from a C# background where the language has some built-in "protect the developer" features. I understand that Python takes the "we're all adults here" approach and puts responsibility on the developer to code thoughtfully and carefully.
That said, Python suggests conventions like a leading underscore for private instance variables. My question is, is there a particular convention for marking a class as abstract other than just specifying it in the docstrings? I haven't seen anything in the Python style guide that mentions naming conventions for abstract classes.
I can think of 3 options so far but I'm not sure if they're good ideas:
Specify it in the docstring above the class (might be overlooked)
Use a leading underscore in the class name (not sure if this is universally understood)
Create a def __init__(self): method on the abstract class that raises an error (not sure if this negatively impacts inheritance, like if you want to call a base constructor)
Is one of these a good option or is there a better one? I just want to make sure that other developers know that it is abstract and so if they try to instantiate it they should accept responsibility for any strange behavior.

If you're using Python 2.6 or higher, you can use the Abstract Base Class module from the standard library if you want to enforce abstractness. Here's an example:
from abc import ABCMeta, abstractmethod
class SomeAbstractClass(object):
    # Python 2 syntax; Python 3 uses: class SomeAbstractClass(metaclass=ABCMeta)
    __metaclass__ = ABCMeta
    @abstractmethod
    def this_method_must_be_overridden(self):
        return "But it can have an implementation (callable via super)."
class ConcreteSubclass(SomeAbstractClass):
    def this_method_must_be_overridden(self):
        s = super(ConcreteSubclass, self).this_method_must_be_overridden()
        return s.replace("can", "does").replace(" (callable via super)", "")
Output:
>>> a = SomeAbstractClass()
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    a = SomeAbstractClass()
TypeError: Can't instantiate abstract class SomeAbstractClass with abstract methods this_method_must_be_overridden
>>> c = ConcreteSubclass()
>>> c.this_method_must_be_overridden()
'But it does have an implementation.'

Based on your last sentence, I would answer "just document it". Anyone who uses a class in a way that the documentation says not to must accept responsibility for any strange behavior.
There is an abstract base class mechanism in Python, but I don't see any reason to use it if your only goal is to discourage instantiation.

I just name my abstract classes with the prefix 'Abstract'. E.g. AbstractDevice, AbstractPacket, etc.
It's about as easy and to the point as it comes. If others choose to go ahead and instantiate and/or use a class that starts with the word 'Abstract', then they either know what they're doing or there was no hope for them anyway.
Naming it thus also serves as a reminder to myself not to go nuts with deep abstraction hierarchies, because putting 'Abstract' on the front of a whole lot of classes feels stupid too.

Create your 'abstract' class and raise NotImplementedError() in the abstract methods.
It won't stop people using the class and, in true duck-typing fashion, it will let you know if you neglect to implement the abstract method.
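A minimal sketch of this approach, assuming a hypothetical AbstractDevice base class and a SerialDevice subclass (neither name comes from a real library). Note that the base class can still be instantiated; the error only shows up when the unimplemented method is called:

```python
class AbstractDevice(object):
    """Abstract by convention only: nothing prevents instantiation."""

    def read(self):
        # Subclasses are expected to override this method.
        raise NotImplementedError("subclasses must implement read()")


class SerialDevice(AbstractDevice):
    # Hypothetical concrete subclass, for illustration only.
    def read(self):
        return "data"


device = AbstractDevice()   # allowed: no error at instantiation time
try:
    device.read()           # the error surfaces only when the method is called
except NotImplementedError as exc:
    message = str(exc)

result = SerialDevice().read()
```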

In Python 3.x, your class can inherit from abc.ABC.
As long as the class declares at least one abstract method, this makes it non-instantiable, and many IDEs will warn you if you try to instantiate it.
import abc
class SomeAbstractClass(abc.ABC):
    @abc.abstractmethod
    def some_abstract_method(self):
        raise NotImplementedError
    @property
    @abc.abstractmethod
    def some_abstract_property(self):
        raise NotImplementedError
This mechanism was first proposed in PEP 3119.
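To illustrate what happens at instantiation time, here is a small sketch for Python 3.4 or later. Shape, Square, and Marker are made-up names for illustration. It also shows that inheriting from abc.ABC without declaring any abstract method does not block instantiation:

```python
import abc


class Shape(abc.ABC):
    @abc.abstractmethod
    def area(self):
        """Return the area of the shape."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Marker(abc.ABC):
    """Inherits from ABC but declares no abstract methods."""


try:
    Shape()                      # raises TypeError: area() is still abstract
    shape_instantiated = True
except TypeError:
    shape_instantiated = False

square_area = Square(3).area()   # fine: Square overrides area()
marker = Marker()                # also fine: nothing abstract to implement
```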

Enforcing things is possible, but rather unpythonic. When I came to Python after many years of C++ programming I also tried to do the same; I suppose most people try to do so if they have experience in more classical languages. Metaclasses would do the job, but Python checks very few things at compilation time anyway. Your check will still be performed at runtime. So, is the inability to create a certain class really that useful if discovered only at runtime? In C++ (and in C# as well) you cannot even compile your code if it creates an instance of an abstract class, and that is the whole point -- to discover the problem as early as possible. If you have abstract methods, raising a NotImplementedError exception seems to be quite enough. NB: raising, not returning an error code! In Python errors usually should not be silent unless they are silenced explicitly. Documenting. Naming a class in a way that says it's abstract. That's all.
The quality of Python code is ensured mostly by methods that are quite different from those used in languages with advanced compile-time type checking: unit tests, coverage analysis, etc. Personally I consider that the most serious difference between dynamically typed languages and the others. As a result, the design of code is quite different: everything is done not to enforce things, but to make testing them as easy as possible.

Related

Typehints for python class

Is it possible to typehint a class's self?
The reason being is basing a class off of an ambiguously dynamic class that has definitions given in hint stubs called BaseClassB and SubClassD.
I would have expected this to be valid python, but it's not. Is there a way to typehint the baseclass argument(s) to creating a class?
I'd also accept any tricks that get PyCharm to autocomplete off self correctly as an answer as this doesn't seem to be supported Python in 3.7.4.
e.g.
class MyClass(BaseClassAmbiguous: Union[BaseClassB, SubClassD])
    def func(self):
        self.self_doesnt_autocomplete_correctly
        self. # Desired functionality is this to autocomplete according to BaseClassB and SubClassD
I suspect the reason why your type checker is choking on your code is because it's not actually valid syntax. The annotation in your base class list is a syntax error.
Probably the best available workaround is to just give BaseClassAmbiguous a fake type, like so:
from typing import Union, TYPE_CHECKING
if TYPE_CHECKING:
    class BaseClassAmbiguous(BaseClassB, SubClassD): pass
else:
    # Create your ambiguous base class however it's actually
    # constructed at runtime
class MyClass(BaseClassAmbiguous):
    def func(self) -> None:
        self.blah
Basically, lie to your type-checker and pretend that BaseClassAmbiguous directly inherits from your two classes. I'm not sure if PyCharm specifically supports this kind of thing, but it's something it in principle ought to support. (E.g. it's possible to do these kinds of shenanigans in Python).
That said, if you're going to use this approach, you're probably better off just having BaseClassAmbiguous actually inherit directly from both classes if at all possible.
To answer your original question, yes, it's legal to annotate the self parameter. That technique is usually reserved for when you want to have your self variable be generic -- see this example in PEP 484, and this mini-tutorial in the mypy docs.
But you could in principle annotate self using any type hint, really, including Unions, as long as it's not fundamentally incompatible with what your class really is -- your annotation for self would need to essentially be the same type as or a supertype of MyClass.
That said, this technique will likely not help you here: what you'd need is an intersection type, not a union type, and annotating self won't help resolve the syntax error in your base class list.
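For reference, the generic-self pattern mentioned above looks roughly like the sketch below. Builder and FancyBuilder are hypothetical names, and the example needs Python 3.6 or later for the variable annotation. Annotating self with a bound TypeVar lets a type checker infer that add() returns the subclass type when called on a subclass instance:

```python
from typing import TypeVar

# T stands for "Builder or any subclass of it".
T = TypeVar("T", bound="Builder")


class Builder:
    def __init__(self) -> None:
        self.parts: list = []

    def add(self: T, part: str) -> T:
        # Returning self typed as T keeps the subclass type for chained calls.
        self.parts.append(part)
        return self


class FancyBuilder(Builder):
    def polish(self) -> "FancyBuilder":
        self.parts.append("polish")
        return self


# A type checker sees add() as returning FancyBuilder here,
# so the chained call to polish() is accepted.
fancy = FancyBuilder().add("frame").polish()
```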
I think the broader problem here is that the PEP 484 ecosystem doesn't necessarily deal well with ambiguous or dynamic base classes. Perhaps this is not possible to do in your case, but if I were in your shoes, I'd concentrate my efforts on making the base class more concrete.

Best practice for Python 3 class creation

In my research I found that in Python 3 these three types of class definition are synonymous:
class MyClass:
    pass
class MyClass():
    pass
class MyClass(object):
    pass
However, I was not able to find out which way is recommended. Which one should I use as a best practice?
I would say: Use the third option:
class MyClass(object):
    pass
It explicitly states that you want to subclass object (doesn't the Zen of Python say "Explicit is better than implicit"?), and you don't run into nasty errors if you (or someone else) ever run the code in Python 2, where these statements are different.
In Python 2, there are two types of classes. To use the new style, you have to inherit explicitly from object. If you don't, the old-style implementation is used.
In Python 3, all classes extend object implicitly, whether you say so yourself or not.
You will probably want to use new-style classes anyway, but if your code is supposed to work with both Python 2 and 3, you'll have to inherit explicitly from object:
class Foo(object):
    pass
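A quick way to check the Python 3 claim yourself is to compare the base classes of the three spellings. The class names below are made up for the check, and the result only holds under Python 3:

```python
class Plain:
    pass


class EmptyParens():
    pass


class ExplicitObject(object):
    pass


# Under Python 3, each class has object as its only direct base.
bases = [cls.__bases__ for cls in (Plain, EmptyParens, ExplicitObject)]
all_same = all(b == (object,) for b in bases)
```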
To build on the other answer: yes, the Zen of Python states that
Explicit is better than implicit.
I think this means we should avoid possible confusion in code, just as we should in language in general; remember, code is communication.
If you only work with Python 3, and your code/project explicitly states that, there is no possible confusion: every class without explicit inheritance automatically inherits from object. If for some obscure reason the base class changes in the future (let's imagine from object to Object), the same code will still work. And the Zen of Python also says that
Simple is better than complex.
(of course complex is quite an overstatement in this example but still...)
So again, if your code only supports Python 3, you should use the simplest form:
class Foo:
    pass
The form with just () is quite useless since it doesn't give any valuable information.

How to reproduce a Java interface behaviour in a pythonic way

Let us define a class called Filter. I want this class to be extended by subclasses and I want all these subclasses to override a method : job.
More specifically
class Filter(object):
    def __init__(self, csvin=None):
        self._input = csvin
    # I want this to be abstract. All the classes that inherit from Filter should
    # implement their own version of job.
    def job(self):
        pass
Said differently, I want to make sure that any subclass of Filter has a method called job.
I heard about the module abc, but I also read about these concepts called duck-typing and EAFP. My understanding is that, if I am a duck, I will just try to run
f = SomeFilter()
f.job()
and see if it works. If it raises an exception, that is my problem: I should have been more careful when I wrote the class SomeFilter.
I am pretty sure I do not fully understand the meaning of duck-typing and EAFP, but if it means that I have to postpone debugging as late as possible (that is, at invocation time), then I disagree with this way of thinking. I do not understand why so many people seem to appreciate this EAFP philosophy, but I wish to be part of them.
Can someone convert me and explain how to achieve this in a safe and predictable manner, that is, by preventing the programmer from making a mistake when extending Filter, in a pythonic way?
You can use raise NotImplementedError, as per the documentation:
class Filter(object):
    def __init__(self, csvin=None):
        self._input = csvin
    # I want this to be abstract. All the classes that inherit from Filter should
    # implement their own version of job.
    def job(self):
        raise NotImplementedError
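If the error should appear earlier than the first call to job(), the abc module mentioned in the question is one option. The sketch below is not part of the answer above, just an illustration in Python 3 syntax; BrokenFilter and UpperFilter are hypothetical subclasses. With it, a subclass that forgets job() fails as soon as it is instantiated (still at runtime, but before any work is done):

```python
import abc


class Filter(abc.ABC):
    def __init__(self, csvin=None):
        self._input = csvin

    @abc.abstractmethod
    def job(self):
        """Process self._input; every concrete filter must override this."""


class BrokenFilter(Filter):
    pass    # forgot to implement job()


class UpperFilter(Filter):
    def job(self):
        return [row.upper() for row in self._input]


try:
    BrokenFilter()          # fails here, not later when job() is called
    error = None
except TypeError as exc:
    error = exc

rows = UpperFilter(["a", "b"]).job()
```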

CapWords conventions: get_MyClass or get_my_class

This is a style conventions question.
PEP8 convention for a class definition would be something like
class MyClass(object):
    def __init__(self, attri):
        self.attri = attri
So say I want to write a module-scoped function which takes some data, processes it, and then creates an instance of MyClass.
PEP8 says that my function definitions should have lowercase_underscore style names, like
def get_my_class(arg1, arg2, arg3):
    pass
But my inclination would be to make it clear that I'm talking about MyClass instances like so
def get_MyClass(arg1, arg2, arg3):
    pass
For this case, it looks trivially obvious that my_class and MyClass are related, but there are some cases where it's not so obvious. For example, I'm pulling data from a spreadsheet and have a SpreadsheetColumn class that takes the form of a heading attribute and a data list attribute. Yet, if you didn't know I was talking about an instance of the SpreadsheetColumn class, you might think that I'm talking about a raw column of cells as they might appear in an Excel sheet.
I'm wondering if it's reasonable to violate PEP8 to use get_MyClass. Being new to Python, I don't want to create a habit for a bad naming convention.
I've searched PEP8 and Stack Overflow and didn't see anything that addressed the issue.
Depending on the usage of the function, it might be more appropriate to turn it into a classmethod or staticmethod. Then its association with the class is clear, but you don't violate any naming conventions.
e.g.:
class MyClass(object):
    def __init__(self, arg):
        self.arg = arg
    @classmethod
    def from_sum(cls, *args):
        return cls(sum(args))
inst = MyClass.from_sum(1, 2, 3, 4)
print(inst.arg)  # 10
Let's take a step back. Usually, you don't want to do this at all, so the naming convention is the least of your worries.
First, normally, you don't care what actual class or type something is. This is what duck typing is all about. You don't want a SpreadsheetColumn instance, you want something that you can use as a spreadsheet column. It may be an instance of SpreadsheetColumn, or of a subclass, or of some proxy class, or of some mock class for testing—whatever it is, you don't care, as long as it looks and works like a column.
Notice that, even in static languages like Java and C#, factory functions (or objects) usually don't create an instance of a specific class, they create an instance of any class that implements a specific interface. In Python, that's usually implicit. (And, when it's not, it's usually because you're using something like PEAK or Twisted, and you should follow their coding style for protocols or interfaces.)
So, your factory function should be called get_column, not get_SpreadsheetColumn.
When the function is more of an "alternate constructor" than a factory, then mgilson's answer is the way to go. See chain() and chain.from_iterable() in itertools for a good standard library example.
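As a small illustration of that standard library pair: chain() takes the iterables as separate arguments, while the alternate constructor chain.from_iterable() takes a single iterable of iterables. Both produce the same flattened sequence here:

```python
from itertools import chain

# Main constructor: each iterable is a separate positional argument.
flat_a = list(chain([1, 2], [3, 4]))

# Alternate constructor: one iterable that yields the iterables.
flat_b = list(chain.from_iterable([[1, 2], [3, 4]]))
```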
But notice that this isn't very common in the standard library, most of the popular modules on PyPI, etc. And there's a good reason. Usually, you can just use a single constructor with default-valued parameters, keyword parameters, or at worst *args and **kwargs. If this makes the API too confusing for human readers, or too ambiguous to code, that's when you need an alternate constructor. Otherwise, you don't.
Sometimes, you really do need a factory that creates objects of a concrete type, and that concrete type is a part of the interface that the caller needs to know about. As I mentioned above, this is pretty rare even in static languages, and it's even rarer in Python, but it does come up. And then, you really do need an answer to your original question.
In that case, I think I would name the function something ugly and unusual like get_MyClass or get_MyClass_instance. It ought to stick out immediately, because anyone reading my code will probably need to figure out why I'm explicitly getting a MyClass instead of a thing in order to understand the rest of my code.

Python grab class in class definition

I don't even know how to explain this, so here is the code I'm trying.
from couchdb.schema import Document, TextField
class Base(Document):
    type = TextField(default=self.__name__)
    # self doesn't work, how do I get a reference to Base?
class User(Base):
    pass
# User.type should be defined as TextField(default="Test2")
The reason I'm even trying this is I'm working on creating a base class for an ORM I'm using. I want to avoid defining the table name for every model I have. Also, knowing what the limits of Python are will help me avoid wasting time trying impossible things.
The class object does not (yet) exist while the class body is executing, so there is no way for code in the class body to get a reference to it (just as, more generally, there is no way for any code to get a reference to any object that does not exist). Test2.__name__, however, already does what you're specifically looking for, so I don't think you need any workaround (such as metaclasses or class decorators) for your specific use case.
Edit: for the edited question, where you don't just need the name as a string, a class decorator is the simplest way to work around the problem (in Python 2.6 or later):
def maketype(cls):
    cls.type = TextField(default=cls.__name__)
    return cls
and put @maketype in front of each class you want to decorate that way. In Python 2.5 or earlier, you need instead to say maketype(Base) after each relevant class statement.
If you want this functionality to get inherited, then you have to define a custom metaclass that performs the same functionality in its __init__ or __new__ methods. Personally, I would recommend against defining custom metaclasses unless they're really indispensable -- instead, I'd stick with the simpler decorator approach.
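For completeness, a rough sketch of the metaclass variant in Python 3 syntax. TextField here is a hypothetical stand-in so the example runs without couchdb, and AutoType is a made-up name. If the real Document class already uses its own metaclass, the custom one would have to subclass that metaclass instead of type:

```python
class TextField(object):
    """Hypothetical stand-in for couchdb.schema.TextField."""

    def __init__(self, default=None):
        self.default = default


class AutoType(type):
    """Metaclass that gives every class its own `type` field."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Runs for Base and for every subclass, so the behavior is inherited.
        cls.type = TextField(default=name)


class Base(metaclass=AutoType):
    pass


class User(Base):
    pass
```

Here Base.type.default is "Base" and User.type.default is "User", without decorating each subclass by hand.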
You may want to check out the other question python super class relection
In your case, Test2.__base__ will return the base class Test. If it doesn't work, you may use the new style: class Test(object)
