This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
How many Python classes should I put in one file?
Coming from a C++ background I've grown accustomed to organizing my classes such that, for the most part, there's a 1:1 ratio between classes and files. By making it so that a single file contains a single class I find the code more navigable. As I introduce myself to Python I'm finding lots of examples where a single file contains multiple classes. Is that the recommended way of doing things in Python? If so, why?
Am I missing this convention in the PEP8?
Here are some possible reasons:
Python is not exclusively class-based - the natural unit of code decomposition in Python is the module. Modules are just as likely to contain functions (which are first-class objects in Python) as classes. In Java, the unit of decomposition is the class. Hence, Python has one module=one file, and Java has one (public) class=one file.
Python is much more expressive than Java, and if you restrict yourself to one class per file (which Python does not prevent you from doing) you will end up with lots of very small files - more to keep track of with very little benefit.
An example of roughly equivalent functionality: Java's log4j => a couple of dozen files, ~8000 SLOC. Python logging => 3 files, ~ 2800 SLOC.
There's a mantra, "flat is better than nested," that generally discourages an overuse of hierarchy. I'm not sure there's any hard and fast rules as to when you want to create a new module -- for the most part, people just use their discretion to group logically related functionality (classes and functions that pertain to a particular problem domain).
Good thread from the Python mailing list, and a quote by Fredrik Lundh:
even more important is that in Python,
you don't use classes for every-
thing; if you need factories,
singletons, multiple ways to create
objects, polymorphic helpers, etc, you
use plain functions, not classes or
static methods.
once you've gotten over the "it's all
classes", use modules to organize
things in a way that makes sense to
the code that uses your components.
make the import statements look good.
the book Expert Python Programming has something related discussion
Chapter 4: Choosing Good Names:"Building the Namespace Tree" and "Splitting the Code"
My line crude summary: collect some related class to one module(source file),and
collect some related module to one package, is helpful for code maintain.
In python, class can also be used for small tasks (just for grouping etc). maintaining a 1:1 relation would result in having too many files with small or little functionality.
There is no specific convention for this - do whatever makes your code the most readable and maintainable.
A good example of not having seperate files for each class might be the models.py file within a django app. Each django app may have a handful of classes that are related to that app, and putting them into individual files just makes more work.
Similarly, having each view in a different file again is likely to be counterproductive.
Related
As I learn more about Python I am starting to get into the realm of classes. I have been reading on how to properly call a class and how to import the module or package.module but I was wondering if it is really needed to do this.
My question is this: Is it required to move your class to a separate module for a functional reason or is it solely for readability? I can perform all the same task using defined functions within my main module so what is the need for the class if any outside of readability?
Modules are structuring tools that provide encapsulation. In other words, modules are structures that combine your logic and data into one compartment, in the module itself. When you code a module, you should be consistent. To make a module consistent you must define its purpose: does my module provide tools? What type of tools? String tools? Numericals tools...?
For example, let's assume you're coding a program that processes numbers. Typically, you would use the builtin math module, and for some specialized purposes you might need to code some functions and classes that process your numbers according to your needs. If you read the documentation of math builtin module, you'll find math defines classes ad functions that relate to math but no classes or functions that process strings for instance, this is cohesion--unifying the purpose of your module. Keep in mind, maximizing cohesion, minimizes coupling. That's, when you keep your module unified, you make it less likely to be dependent on other modules.
Is it required to move your Class to a separate module for a functional reason or is it solely for readability?
If that specific class doesn't relate to your module, then you're probably better off moving that class to another module. Definitely, This is not a valid statement all the time. Suppose you're coding a relatively small program and you don't really need to define a large number of tools that you'll use in your small program, coding your class in your main module doesn't hurt at all. In larger applications where you need to write dozens of tools on the other hand, it's better to split your program to modules with specified purposes, myStringTools, myMath, main and many other modules. Structuring your program with modules and packages enhances maintenance.
If you need to delve deeper read about Modular programming, it'll help you grasp the idea even better.
You can do as you please. If the code for your classes is short, putting them all in your main script is fine. If they're longish, then splitting them out into separate files is a useful organizing technique (that has the added benefit of the code in them no getting recompiled into byte-code everytime the the script they are used in is run.
Putting them in modules also encourages their reuse since they're no longer mixed in with a lot of other unrelated stuff.
Lastly, they may be useful because modules are esstentially singleton objects, meaning that there's only once instance of them in your program which is created the first time it's imported. Later imports in other modules will just reuse the existing instance. This can be a nice way to do initialize that only has to be done once.
In Java, this question is easy (if a little tedious) - every class requires its own file. So the number of .java files in a project is the number of classes (not counting anonymous/nested classes).
In Python, though, I can define multiple classes in the same file, and I'm not quite sure how to find the point at which I split things up. It seems wrong to make a file for every class, but it also feels wrong just to leave everything in the same file by default. How do I know where to break a program up?
Remember that in Python, a file is a module that you will most likely import in order to use the classes contained therein. Also remember one of the basic principles of software development "the unit of packaging is the unit of reuse", which basically means:
If classes are most likely used together, or if using one class leads to using another, they belong in a common package.
As I see it, this is really a question about reuse and abstraction. If you have a problem that you can solve in a very general way, so that the resulting code would be useful in many other programs, put it in its own module.
For example: a while ago I wrote a (bad) mpd client. I wanted to make configuration file and option parsing easy, so I created a class that combined ConfigParser and optparse functionality in a way I thought was sensible. It needed a couple of support classes, so I put them all together in a module. I never use the client, but I've reused the configuration module in other projects.
EDIT: Also, a more cynical answer just occurred to me: if you can only solve a problem in a really ugly way, hide the ugliness in a module. :)
In Java ... every class requires its own file.
On the flipside, sometimes a Java file, also, will include enums or subclasses or interfaces, within the main class because they are "closely related."
not counting anonymous/nested classes
Anonymous classes shouldn't be counted, but I think tasteful use of nested classes is a choice much like the one you're asking about Python.
(Occasionally a Java file will have two classes, not nested, which is allowed, but yuck don't do it.)
Python actually gives you the choice to package your code in the way you see fit.
The analogy between Python and Java is that a file i.e., the .py file in Python is
equivalent to a package in Java as in it can contain many related classes and functions.
For good examples, have a look in the Python built-in modules.
Just download the source and check them out, the rule of thumb I follow is
when you have very tightly coupled classes or functions you keep them in a single file
else you break them up.
My first "serious" language was Java, so I have comprehended object-oriented programming in sense that elemental brick of program is a class.
Now I write on VBA and Python. There are module languages and I am feeling persistent discomfort: I don't know how should I decompose program in a modules/classes.
I understand that one module corresponds to one knowledge domain, one module should ba able to test separately...
Should I apprehend module as namespace(c++) only?
I don't do VBA but in python, modules are fundamental. As you say, the can be viewed as namespaces but they are also objects in their own right. They are not classes however, so you cannot inherit from them (at least not directly).
I find that it's a good rule to keep a module concerned with one domain area. The rule that I use for deciding if something is a module level function or a class method is to ask myself if it could meaningfully be used on any objects that satisfy the 'interface' that it's arguments take. If so, then I free it from a class hierarchy and make it a module level function. If its usefulness truly is restricted to a particular class hierarchy, then I make it a method.
If you need it work on all instances of a class hierarchy and you make it a module level function, just remember that all the the subclasses still need to implement the given interface with the given semantics. This is one of the tradeoffs of stepping away from methods: you can no longer make a slight modification and call super. On the other hand, if subclasses are likely to redefine the interface and its semantics, then maybe that particular class hierarchy isn't a very good abstraction and should be rethought.
It is matter of taste. If you use modules your 'program' will be more procedural oriented. If you choose classes it will be more or less object oriented. I'm working with Excel for couple of months and personally I choose classes whenever I can because it is more comfortable to me. If you stop thinking about objects and think of them as Components you can use them with elegance. The main reason why I prefer classes is that you can have it more that one. You can't have two instances of module. It allows me use encapsulation and better code reuse.
For example let's assume that you like to have some kind of logger, to log actions that were done by your program during execution. You can write a module for that. It can have for example a global variable indicating on which particular sheet logging will be done. But consider the following hypothetical situation: your client wants you to include some fancy report generation functionality in your program. You are smart so you figure out that you can use your logging code to prepare them. But you can't do log and report simultaneously by one module. And you can with two instances of logging Component without any changes in their code.
Idioms of languages are different and thats the reason a problem solved in different languages take different approaches.
"C" is all about procedural decomposition.
Main idiom in Java is about "class or Object" decomposition. Functions are not absent, but they become a part of exhibited behavior of these classes.
"Python" provides support for both Class based problem decomposition as well as procedural based.
All of these uses files, packages or modules as concept for organizing large code pieces together. There is nothing that restricts you to have one module for one knowledge domain.
These are decomposition and organizing techniques and can be applied based on the problem at hand.
If you are comfortable with OO, you should be able to use it very well in Python.
VBA also allows the use of classes. Unfortunately, those classes don't support all the features of a full-fleged object oriented language. Especially inheritance is not supported.
But you can work with interfaces, at least up to a certain degree.
I only used modules like "one module = one singleton". My modules contain "static" or even stateless methods. So in my opinion a VBa module is not namespace. More often a bunch of classes and modules would form a "namespace". I often create a new project (DLL, DVB or something similar) for such a "namespace".
I have a class called Path for which there are defined about 10 methods, in a dedicated module Path.py. Recently I had a need to write 5 more methods for Path, however these new methods are quite obscure and technical and 90% of the time they are irrelevant.
Where would be a good place to put them so their context is clear? Of course I can just put them with the class definition, but I don't like that because I like to keep the important things separate from the obscure things.
Currently I have these methods as functions that are defined in a separate module, just to keep things separate, but it would be better to have them as bound methods. (Currently they take the Path instance as an explicit parameter.)
Does anyone have a suggestion?
If the method is relevant to the Path - no matter how obscure - I think it should reside within the class itself.
If you have multiple places where you have path-related functionality, it leads to problems. For example, if you want to check if some functionality already exists, how will a new programmer know to check the other, less obvious places?
I think a good practice might be to order functions by importance. As you may have heard, some suggest putting public members of classes first, and private/protected ones after. You could consider putting the common methods in your class higher than the obscure ones.
If you're keen to put those methods in a different source file at any cost, AND keen to have them at methods at any cost, you can achieve both goals by using the different source file to define a mixin class and having your Path class import that method and multiply-inherit from that mixin. So, technically, it's quite feasible.
However, I would not recommend this course of action: it's worth using "the big guns" (such as multiple inheritance) only to serve important goals (such as reuse and removing repetition), and separating methods out in this way is not really a particularly crucial goal.
If those "obscure methods" played no role you would not be implementing them, so they must have SOME importance, after all; so I'd just clarify in docstrings and comments what they're for, maybe explicitly mention that they're rarely needed, and leave it at that.
I would just prepend the names with an underscore _, to show that the reader shouldn't bother about them.
It's conventionally the same thing as private members in other languages.
Put them in the Path class, and document that they are "obscure" either with comments or docstrings. Separate them at the end if you like.
Oh wait, I thought of something -- I can just define them in the Path.py module, where every obscure method will be a one-liner that will call the function from the separate module that currently exists. With this compromise, the obscure methods will comprise of maybe 10 lines in the end of the file instead of 50% of its bulk.
I suggest making them accessible from a property of the Path class called something like "Utilties". For example: Path.Utilities.RazzleDazzle. This will help with auto-completion tools and general maintenance.
I'm looking to develop a project in python and all of the python I have done is minor scripting with no regard to classes or structure. I haven't seen much about this, so is this how larger python projects are done?
Also, do things like "namespaces" and "projects" exist in this realm? As well as object oriented principles such as inheriting from other classes?
Yes, you can, and you should! :)
Here is a nice introduction to Python Modules (including packages).
Correction: you probably should not put each single class into a separate file (like Java mandates and many C++ places do). The language is pretty lax about it as you can see in the linked tutorial, keep an open eye on other projects, use common sense, and do whatever makes sense to you (or whatever is being done in your team/project - unless it is very wrong).
You can put code (classes, function defs, etc) into python modules (individual source files), which are then imported with import. Typically like functionality (like in the Python standard library) is contained within a single module.
You can do that, but usually they are arranged a bit different.
You can take a look to the source code of one python application.
Here's one: "JaikuEngine" which powers the website http://www.jaiku.com/
Yes.
You can put python classes into separate files, use namespaces for scoping, then place this into Modules which can be loaded from other scripts.