In my research I found that in Python 3 these three types of class definition are synonymous:
class MyClass:
pass
class MyClass():
pass
class MyClass(object):
pass
However, I was not able to find out which way is recommended. Which one should I use as a best practice?
I would say: Use the third option:
class MyClass(object):
pass
It explicitly mentions that you want to subclass object (and doesn't the Zen of Python mention: "Explicit is better than implicit.") and you don't run into nasty errors in case you (or someone else) ever run the code in Python 2 where these statements are different.
In Python 2, there's 2 types of classes. To use the new-style, you have to inherit explicitly from object. If not, the old-style implementation is used.
In Python 3, all classes extend object implicitly, whether you say so yourself or not.
You probably will want to use the new-style class anyway but if you code is supposed to work with both python 2 and 3 you'll have to explicitly inherit from object:
class Foo(object):
pass
To jump on the other answer, yes the Zen of Python state that
Explicit is better than implicit.
I think this mean we should avoid possible confusion in code like we should in language in general, remember code is communication.
If you only work with python 3, and your code/project explicitly state that, there is no possible confusion, all class without explicit inheritance automatically inherit from object. If for some obscure reason the base class change in the future (let's imagine from object to Object), the same code will work. And the Zen of Python also says that
Simple is better than complex.
(of course complex is quite an overstatement in this example but still...)
So again if you code only support python3, you should use the simplest form:
class Foo:
pass
The form with just () is quite useless since it doesn't give any valuable information.
Related
Which of the following cases is the best practice way of declaring an instance variable in python. Is there a typical preference, and what are the justifications for this?
Option 1 - Declare within __init__
class MyObject:
def __init__(self, arg):
self.variable_1 = self.method_1(arg)
def method_1(self, arg):
return(arg)
Option 2 - Declare in other methods
class MyObject:
def __init__(self, arg):
self.method_1(arg)
def method_1(self, arg):
self.variable_1 = arg
This is purely to understand if there is a best practice way of doing this that other developers would prefer to see when reviewing and extending code.
This is obviously not exact science, but it generally makes more sense to set all attributes (as possible) in the constructor so that you can follow up on them.
You can, of course, change them later as necessary in other methods.
Setting constructor level variables everywhere in the class makes it very hard to understand where things are coming from.
Option 1 is best practice to declare instance variable in Python.
Instance variables are for data that is actually part of the instance so it would be better if you define in constructor.
Your Option 2 is basically a Setter-/Getter-Paradigm. Python uses properties for these use-cases. There's a nice SO-answer for a similar question.
In general you initialize all your Instance-variables in the __init__-method, that's its reason to exist. If you need a getter-/setter use properties. And use the "least-astonishment" principle. Do not surprise another reader, or your later self with overly clever and/or complicated solutions. (aka KISS principle)
It depends. Defining all the attributes inside __init__ itself generally makes the code more readable, but if the class has a lot of attributes and you can easily divide them into logical groups then it makes sense to initialise each group of attributes in its own initialising method. You may wish to indicate that such methods are private by giving them a name that commences with a single underscore.
Note that if the class is derived from one or more other classes (apart from object) then you will have to call super.__init__ to initialise the attributes inherited from the parent class(es).
The bottom line is that all instance attributes should exist by the time that __init__ finishes executing. If it's not possible to set a proper value for some attribute in __init__ then it should be set to an appropriate default value, eg an empty string, list, etc, None, or a sentinel value like object().
Of course, the above doesn't apply to #property attributes, but even those will generally have an underlying "private" attribute that should be set in __init__.
For more info about properties, please see Raymond Hettinger's excellent Descriptor HowTo Guide in the Python docs.
As juanpa.arrivillaga mentions in the question comments, we don't actually declare variables in Python. That's basically because the Python data model doesn't really have variables like C and many other languages do. For a succinct explanation with nice diagrams please see Other languages have "variables", Python has "names". Also see Facts and myths about Python names and values, which was written by SO veteran Ned Batchelder.
I've been reading lots of previous SO discussions of factory functions, etc. and still don't know what the best (pythonic) approach is to this particular situation. I'll admit up front that i am imposing a somewhat artificial constraint on the problem in that i want my solution to work without modifying the module i am trying to extend: i could make modifications to it, but let's assume that it must remain as-is because i'm trying to understand best practice in this situation.
I'm working with the http://pypi.python.org/pypi/icalendar module, which handles parsing from and serializing to the Icalendar spec (hereafter ical). It parses the text into a hierarchy of dictionary-like "component" objects, where every "component" is an instance of a trivial derived class implementing the different valid ical types (VCALENDAR, VEVENT, etc.) and they are all spit out by a recursive factory from the common parent class:
class Component(...):
#classmethod
def from_ical(cls, ...)
I have created a 'CalendarFile' class that extends the ical 'Calendar' class, including in it generator function of its own:
class CalendarFile(Calendar):
#classmethod
def from_file(cls, ics):
which opens a file (ics) and passes it on:
instance = cls.from_ical(f.read())
It initializes and modifies some other things in instance and then returns it. The problem is that instance ends up being a Calendar object instead of a CalendarFile object, in spite of cls being CalendarFile. Short of going into the factory function of the ical module and fiddling around in there, is there any way to essentially "recast" that object as a 'CalendarFile'?
The alternatives (again without modifying the original module) that I have considered are:make the CalendarFile class a has-a Calendar class (each instance creates its own internal instance of a Calendar object), but that seems methodically stilted.
fiddle with the returned object to give it the methods it needs (i know there's a term for creating a customized object but it escapes me).
make the additional methods into functions and just have them work with instances of Calendar.
or perhaps the answer is that i shouldn't be trying to subclass from a module in the first place, and this type of code belongs in the module itself.
Again i'm trying to understand what the "best" approach is and also learn if i'm missing any alternatives. Thanks.
Normally, I would expect an alternative constructor defined as a classmethod to simply call the class's standard constructor, transforming the arguments that it receives into valid arguments to the standard constructor.
>>> class Toy(object):
... def __init__(self, x):
... self.x = abs(x)
... def __repr__(self):
... return 'Toy({})'.format(self.x)
... #classmethod
... def from_string(cls, s):
... return cls(int(s))
...
>>> Toy.from_string('5')
Toy(5)
In most cases, I would strongly recommend something like this approach; this is the gold standard for alternative constructors.
But this is a special case.
I've now looked over the source, and I think the best way to add a new class is to edit the module directly; otherwise, scrap inheritance and take option one (your "has-a" option). The different classes are all slightly differentiated versions of the same container class -- they shouldn't really even be separate classes. But if you want to add a new class in the idiom of the code as it it is written, you have to add a new class to the module itself. Furthermore, from_iter is deceptively named; it's not really a constructor at all. I think it should be a standalone function. It builds a whole tree of components linked together, and the code that builds the individual components is buried in a chain of calls to various factory functions that also should be standalone functions but aren't. IMO much of that code ought to live in __init__ where it would be useful to you for subclassing, but it doesn't.
Indeed, none of the subclasses of Component even add any methods. By adding methods to your subclass of Calendar, you're completely disregarding the actual idiom of the code. I don't like its idiom very much but by disregarding that idiom, you're making it even worse. If you don't want to modify the original module, then forget about inheritance here and give your object a has-a relationship to Calendar objects. Don't modify __class__; establish your own OO structure that follows standard OO practices.
I come from a C# background where the language has some built in "protect the developer" features. I understand that Python takes the "we're all adults here" approach and puts responsibility on the developer to code thoughtfully and carefully.
That said, Python suggests conventions like a leading underscore for private instance variables. My question is, is there a particular convention for marking a class as abstract other than just specifying it in the docstrings? I haven't seen anything in particular in the python style guide that mentions naming conventions for abstract classes.
I can think of 3 options so far but I'm not sure if they're good ideas:
Specify it in the docstring above the class (might be overlooked)
Use a leading underscore in the class name (not sure if this is universally understood)
Create a def __init__(self): method on the abstract class that raises an error (not sure if this negatively impacts inheritance, like if you want to call a base constructor)
Is one of these a good option or is there a better one? I just want to make sure that other developers know that it is abstract and so if they try to instantiate it they should accept responsibility for any strange behavior.
If you're using Python 2.6 or higher, you can use the Abstract Base Class module from the standard library if you want to enforce abstractness. Here's an example:
from abc import ABCMeta, abstractmethod
class SomeAbstractClass(object):
__metaclass__ = ABCMeta
#abstractmethod
def this_method_must_be_overridden(self):
return "But it can have an implementation (callable via super)."
class ConcreteSubclass(SomeAbstractClass):
def this_method_must_be_overridden(self):
s = super(ConcreteSubclass, self).this_method_must_be_overridden()
return s.replace("can", "does").replace(" (callable via super)", "")
Output:
>>> a = SomeAbstractClass()
Traceback (most recent call last):
File "<pyshell#13>", line 1, in <module>
a = SomeAbstractClass()
TypeError: Can't instantiate abstract class SomeAbstractClass with abstract
methods this_method_must_be_overridden
>>> c = ConcreteSubclass()
>>> c.this_method_must_be_overridden()
'But it does have an implementation.'
Based on your last sentence, I would answer answer "just document it". Anyone who uses a class in a way that the documentation says not to must accept responsibility for any strange behavior.
There is an abstract base class mechanism in Python, but I don't see any reason to use it if your only goal is to discourage instantiation.
I just name my abstract classes with the prefix 'Abstract'. E.g. AbstractDevice, AbstractPacket, etc.
It's about as easy and to the point as it comes. If others choose to go ahead and instantiate and/or use a class that starts with the word 'Abstract', then they either know what they're doing or there was no hope for them anyway.
Naming it thus, also serves as a reminder to myself not to go nuts with deep abstraction hierarchies, because putting 'Abstract' on the front of a whole lot of classes feels stupid too.
Create your 'abstract' class and raise NotImplementedError() in the abstract methods.
It won't stop people using the class and, in true duck-typing fashion, it will let you know if you neglect to implement the abstract method.
In Python 3.x, your class can inherit from abc.ABC.
This will make your class non-instantiable and your IDE will warn you if you try to do so.
import abc
class SomeAbstractClass(abc.ABC):
#abc.abstractmethod
def some_abstract_method(self):
raise NotImplementedError
#property
#abc.abstractmethod
def some_abstract_property(self):
raise NotImplementedError
This has first been suggested in PEP 3119.
To enforce things is possible, but rather unpythonic. When I came to Python after many years of C++ programming I also tried to do the same, I suppose, most of people try doing so if they have an experience in more classical languages. Metaclasses would do the job, but anyway Python checks very few things at compilation time. Your check will still be performed at runtime. So, is the inability to create a certain class really that useful if discovered only at runtime? In C++ (and in C# as well) you can not even compile you code creating an abstract class, and that is the whole point -- to discover the problem as early as possible. If you have abstract methods, raising a NotImplementedError exception seems to be quite enough. NB: raising, not returning an error code! In Python errors usually should not be silent unless thay are silented explicitly. Documenting. Naming a class in a way that says it's abstract. That's all.
Quality of Python code is ensured mostly with methods that are quite different from those used in languages with advanced compile-time type checking. Personally I consider that the most serious difference between dynamically typed lngauges and the others. Unit tests, coverage analysis etc. As a result, the design of code is quite different: everything is done not to enforce things, but to make testing them as easy as possible.
This is a style conventions question.
PEP8 convention for a class definition would be something like
class MyClass(object):
def __init__(self, attri):
self.attri = attri
So say I want to write a module-scoped function which takes some data, processes it, and then creates an instance of MyClass.
PEP8 says that my function definitions should have lowercase_underscore style names, like
def get_my_class(arg1, arg2, arg3):
pass
But my inclination would be to make it clear that I'm talking about MyClass instances like so
def get_MyClass(arg1, arg2, arg3):
pass
For this case, it looks trivially obvious that my_class and MyClass are related, but there are some cases where it's not so obvious. For example, I'm pulling data from a spreadsheet and have a SpreadsheetColumn class that takes the form of a heading attribute and a data list attribute. Yet, if you didn't know I was talking about an instance of the SpreadsheetColumn class, you might think that I'm talking about a raw column of cells as they might appear in an Excel sheet.
I'm wondering if it's reasonable to violate PEP8 to use get_MyClass. Being new to Python, I don't want to create a habit for a bad naming convention.
I've searched PEP8 and Stack Overflow and didn't see anything that addressed the issue.
Depending on the usage of the function, it might be more appropriate to turn it into a classmethod or staticmethod. Then it's association with the class is clear, but you don't violate any naming conventions.
e.g.:
class MyClass(object):
def __init__(self,arg):
self.arg = arg
#classmethod
def from_sum(cls,*args):
return cls(sum(args))
inst = MyClass.from_sum(1,2,3,4)
print inst.arg #10
Let's take a step back. Usually, you don't want to do this at all, so the naming convention is the least of your worries.
First, normally, you don't care what actual class or type something is. This is what duck typing is all about. You don't want a SpreadsheetColumn instance, you want something that you can use as a spreadsheet column. It may be an instance of SpreadsheetColumn, or of a subclass, or of some proxy class, or of some mock class for testing—whatever it is, you don't care, as long as it looks and works like a column.
Notice that, even in static languages like Java and C#, factory functions (or objects) usually don't create an instance of a specific class, they create an instance of any class that implements a specific interface. In Python, that's usually implicit. (And, when it's not, it's usually because you're using something like PEAK or Twisted, and you should follow their coding style for protocols or interfaces.)
So, your factory function should be called get_column, not get_SpreadsheetColumn.
When the function is more of an "alternate constructor" than a factory, then mgilson's answer is the way to go. See chain() and chain.from_iterable() in itertools from a good standard library example.
But notice that this isn't very common in the standard library, most of the popular modules on PyPI, etc. And there's a good reason. Usually, you can just use a single constructor with default-valued parameters, keyword parameters, or at worst *args and **kwargs. If this makes the API too confusing for human readers, or too ambiguous to code, that's when you need an alternate constructor. Otherwise, you don't.
Sometimes, you really do need a factory that creates objects of a concrete type, and that concrete type is a part of the interface that the caller needs to know about. As I mentioned above, this is pretty rare even in static languages, and it's even rarer in Python, but it does come up. And then, you really do need an answer to your original question.
In that case, I think I would name the function something ugly and unusual like get_MyClass or get_MyClass_instance. It ought to stick out immediately, because anyone reading my code will probably need to figure out why I'm explicitly getting a MyClass instead of a thing in order to understand the rest of my code.
I don't even know how to explain this, so here is the code I'm trying.
from couchdb.schema import Document, TextField
class Base(Document):
type = TextField(default=self.__name__)
#self doesn't work, how do I get a reference to Base?
class User(Base):
pass
#User.type be defined as TextField(default="Test2")
The reason I'm even trying this is I'm working on creating a base class for an orm I'm using. I want to avoid defining the table name for every model I have. Also knowing what the limits of python is will help me avoid wasting time trying impossible things.
The class object does not (yet) exist while the class body is executing, so there is no way for code in the class body to get a reference to it (just as, more generally, there is no way for any code to get a reference to any object that does not exist). Test2.__name__, however, already does what you're specifically looking for, so I don't think you need any workaround (such as metaclasses or class decorators) for your specific use case.
Edit: for the edited question, where you don't just need the name as a string, a class decorator is the simplest way to work around the problem (in Python 2.6 or later):
def maketype(cls):
cls.type = TextField(default=cls.__name__)
return cls
and put #maketype in front of each class you want to decorate that way. In Python 2.5 or earlier, you need instead to say maketype(Base) after each relevant class statement.
If you want this functionality to get inherited, then you have to define a custom metaclass that performs the same functionality in its __init__ or __new__ methods. Personally, I would recommend against defining custom metaclasses unless they're really indispensable -- instead, I'd stick with the simpler decorator approach.
You may want to check out the other question python super class relection
In your case, Test2.__base__ will return the base class Test. If it doesn't work, you may use the new style: class Test(object)