Splitting code into modules (conventions)


So I've been searching for a bit and couldn't find anything on Google or PEP discussing this.
I am doing a project with tkinter and I had a file, that is part of a project, that was only 200 lines of code (excluding all the commented out code). While the entire file was related to the GUI portion of the project, it felt a bit long and a bit broad to me.
I ended up splitting the file into 4 different files that each has its own portion of the GUI.
Basically, the directory looks like this:
project/
    guiclasses/
        statisticsframe.py
        textframes.py
        windowclass.py
    main_gui.py
    ...
statisticsframe has a class of a frame that shows statistics about stuff.
textframes holds 3 classes of frames holding textareas, one of them inherits Frame, the others inherit the first one.
windowclass basically creates the root of the window and all the general initialization for a tkinter GUI.
main_gui isn't actually the name but it simply combines all the above three and runs the mainloop()
Overall, each file is now 40-60 lines of code.
I am wondering if there are any conventions regarding this. The rule of thumb in most languages is that if you can reuse the functions/ classes elsewhere then you should split, though in Python it is less of a problem since you can import specific classes and functions from modules.
Sorry if it isn't coherent enough, nearly 3AM here and it is simply sitting in the back of my head.

I'm not familiar with tkinter, so my advice would be rather broad.
You can use any split into modules that you feel works best, but:
- as readability counts, try to make names coherent and do not repeat yourself. Take guiclasses: your entire program is about a GUI, and there are obviously classes somewhere, so why repeat that in a name? Imagine typing all of that in an import; make it meaningful to type.
- flat is better than nested: three modules do not have to go into a subpackage.
- the best split is across layers of abstraction (this is probably the hardest part and specific to tkinter).
- anything in a module should be rather self-sufficient and quite isolated from other parts of the program.
- modules should make good entities for unit testing (e.g. share the same fixtures).
- can you write an understandable docstring for a module? Then it's a good one.
- try learning by example. I often seek wisdom for naming and package structure in Barry Warsaw's Mailman; maybe you can find some reputable repo that uses tkinter to follow (e.g. IDLE?).
From a purely syntactic view, I would have named the modules as:
- <package_name>
- baseframe
- textframe
- window
- main

Related

Is there a reason to create classes in separate modules?

As I learn more about Python I am starting to get into the realm of classes. I have been reading on how to properly call a class and how to import the module or package.module but I was wondering if it is really needed to do this.
My question is this: Is it required to move your class to a separate module for a functional reason or is it solely for readability? I can perform all the same task using defined functions within my main module so what is the need for the class if any outside of readability?

Modules are structuring tools that provide encapsulation. In other words, modules combine your logic and data into one compartment: the module itself. When you write a module, you should be consistent. To make a module consistent you must define its purpose: does my module provide tools? What type of tools? String tools? Numerical tools?
For example, let's assume you're writing a program that processes numbers. Typically, you would use the standard library's math module, and for some specialized purposes you might need to write some functions and classes that process your numbers according to your needs. If you read the documentation of the math module, you'll find that it defines functions and constants that relate to math, but nothing that processes strings, for instance. This is cohesion: unifying the purpose of your module. Keep in mind that maximizing cohesion tends to minimize coupling. That is, when you keep your module unified, you make it less likely to depend on other modules.
Is it required to move your Class to a separate module for a functional reason or is it solely for readability?
If that specific class doesn't relate to your module, then you're probably better off moving that class to another module. Of course, this is not always true. Suppose you're writing a relatively small program that doesn't need a large number of tools: defining your class in your main module doesn't hurt at all. In larger applications where you need to write dozens of tools, on the other hand, it's better to split your program into modules with specific purposes: myStringTools, myMath, main and so on. Structuring your program with modules and packages makes it easier to maintain.
If you need to delve deeper read about Modular programming, it'll help you grasp the idea even better.

You can do as you please. If the code for your classes is short, putting them all in your main script is fine. If they're longish, then splitting them out into separate files is a useful organizing technique (that has the added benefit of the code in them not getting recompiled into byte-code every time the script they are used in is run).
Putting them in modules also encourages their reuse since they're no longer mixed in with a lot of other unrelated stuff.
Lastly, they may be useful because modules are essentially singleton objects, meaning that there's only one instance of each in your program, which is created the first time it's imported. Later imports in other modules will just reuse the existing instance. This can be a nice way to do initialization that only has to be done once.
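The singleton behaviour described above can be checked with a small sketch. It writes a throwaway module to a temporary directory and imports it twice; the module name settings_demo and its config dict are made up for the illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "settings_demo"  # hypothetical module name, illustration only


def import_twice():
    """Write a tiny module to disk, import it twice, return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        # The module body runs only on the first import.
        Path(tmp, f"{MODULE_NAME}.py").write_text("config = {'debug': False}\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            first = importlib.import_module(MODULE_NAME)
            # The second import is served from the sys.modules cache.
            second = importlib.import_module(MODULE_NAME)
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    return first, second


first, second = import_twice()
print(first is second)  # both names refer to the same module object
```

Because both names point at the same object, state set through one of them (for example first.config["debug"] = True) is visible through the other.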

Tracking changes in python source files?

I'm learning Python and came into a situation where I need to change the behaviour of a function. I'm initially a Java programmer, so in the Java world a change in a function would make Eclipse show that a lot of Java source files have errors. That way I can know which files need to get modified. But how would one do such a thing in Python considering there are no types?! I'm using TextMate2 for Python coding.
Currently I'm doing it the brute-force way: opening every Python script file, checking where I'm using that function and then modifying it. But I'm sure this is not the way to deal with large projects!!!
Edit: as an example, I define a class called Graph in a Python script file. Graph has two instance variables. I created many objects (each with a different name!!!) of this class in many script files and then decided that I want to change the names of the instance variables! Now I'm going through each file and reading my code again in order to change the names again :(. PLEASE help!
Example: File A has objects x,y,z of class C. File B has objects xx,yy,zz of class C. Class C has two instance variable names that should be changed: Foo to Poo and Foo1 to Poo1. Also consider many files like A and B. What would you do to solve this? Are you seriously going to open each file and search for x,y,z,xx,yy,zz and then change the names individually?!!!

Sounds like you can only code inside an IDE!
Two steps to free yourself from your IDE and become a better programmer.
- Write unit tests for your code.
- Learn how to use grep.
Unit tests will exercise your code and provide reassurance that it is always doing what you wanted it to do. They make refactoring MUCH easier.
grep is a wonderful tool: grep -R 'my_function_name' src will find every reference to your function in files under the directory src.
Also, see this rather wonderful blog post: Unix as an IDE.
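If grep is not at hand, the same kind of search can be roughly approximated in Python itself. This is only a sketch: find_references is a made-up helper, and unlike grep -R it looks only at .py files and matches the name as a whole word.

```python
import re
from pathlib import Path


def find_references(name, root):
    """Return (path, line number, line) for each whole-word match of name."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    hits = []
    # rglob walks the directory tree recursively, like grep -R.
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical usage: search the src directory for one function name.
    for path, lineno, line in find_references("my_function_name", "src"):
        print(f"{path}:{lineno}: {line}")
```

A textual search like this finds definitions, calls and mentions in comments alike, so each hit still needs a human look before it is changed.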

Whoa, slow down. The coding process you described is not scalable.
How exactly did you change the behavior of the function? Give specifics, please.
UPDATE: This all sounds like you're trying to implement a class and its methods by cobbling together a motley patchwork of functions and local variables - like I wrongly did when I first learned OO coding in Python. The code smell is that when the type/class of some class internal changes, it should generally not affect the class methods. If you're refactoring all your code every 10 mins, you're doing something seriously wrong. Step back and think about clean decomposition into objects, methods and data members.
(Please give more specifics if you want a more useful answer.)
If you were only changing input types, there might be no need to change the calling code.
(Unless the new fn does something very different to the old one, in which case what was the argument against calling it a different name?)
If you changed the return type, and you can't find a common ancestor type or container (tuple, sequence etc.) to put the return values in, then yes you need to change its caller code. However...
...however if the function should really be a method of a class, declare that class and the method already. The previous paragraph was a code smell that your function really should have been a method, specifically a polymorphic method.
Read about code smells, anti-patterns and When do you know you're dealing with an anti-pattern?. There e.g. you will find a recommendation for the video "Recovery from Addiction - A taste of the Python programming language's concision and elegance from someone who once suffered an addiction to the Java programming language." - Sean Kelly
Also, it sounds like you want to use Test-Driven Design and add some unit tests.
If you give us the specifics we can critique it better.

You won't get this functionality in a text editor. I use Sublime Text 3, and I love it, but it doesn't have this functionality. It does, however, jump to files and functions via its 'Goto Anything' (Ctrl+P) functionality, and its Multiple Selections / Multi Edit is great for small refactoring tasks.
However, when it comes to IDEs, JetBrains PyCharm has some of the amazing refactoring tools that you might be looking for.
The also free Python Tools for Visual Studio (see the free install options, which can use the free VS shell) has some excellent refactoring capabilities and a superb REPL to boot.
I use all three. I spend most of my time in Sublime Text, I like PyCharm for refactoring, and I find PT4VS excellent for very involved prototyping.
Despite Python being a dynamically typed language, IDEs can still introspect to a reasonable degree. But, of course, they won't approach the level of Java or C# IDEs. Incidentally, if you are coming over from Java, you may have come across JetBrains IntelliJ, which PyCharm will feel almost identical to.
One's programming style is certainly different between a statically typed language like C# and a dynamic language like python. I find myself doing things in smaller, testable modules. The iteration speed is faster. And in a dynamic language one relies less on IDE tools and more on unit tests that cover the key functionality. If you don't have these you will break things when you refactor.

One answer only specific to your edit:
If your old code was working and does not need to be modified, you could just keep the old names as aliases of the new ones, so that your old code does not break. Example:
    import time

    class MyClass(object):
        def __init__(self):
            self.t = time.time()

        # creating new names
        def new_foo(self, arg):
            return 'new_foo', arg

        def new_bar(self, arg):
            return 'new_bar', arg

        # now creating function aliases
        foo = new_foo
        bar = new_bar
If your code needs rework, rewrite your common code, execute everything, and correct any failures. You could also look for any import/instantiation of your class.
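The aliasing above covers methods, while the question's edit is about renamed instance variables. One possible way to apply the same idea to attributes is a property that forwards the old name to the new one. This is a sketch, not part of the original answer: renamed_attribute is a made-up helper, and the lower-case names foo/poo stand in for the question's Foo/Poo.

```python
import warnings


def renamed_attribute(old_name, new_name):
    """Build a property that forwards old_name to new_name with a warning."""
    message = f"{old_name} is deprecated; use {new_name} instead"

    def getter(self):
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        return getattr(self, new_name)

    def setter(self, value):
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        setattr(self, new_name, value)

    return property(getter, setter)


class C:
    # Old names keep working but point at the new attributes.
    foo = renamed_attribute("foo", "poo")
    foo1 = renamed_attribute("foo1", "poo1")

    def __init__(self, poo, poo1):
        self.poo = poo
        self.poo1 = poo1


x = C(1, 2)
x.foo1 = 5     # old name still accepted, emits a DeprecationWarning
print(x.poo1)  # the value landed in the new attribute
```

Running the program or its tests with warnings enabled (for example python -W error::DeprecationWarning) then points at each file that still uses an old name, so call sites can be migrated one at a time.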

One of the tradeoffs between statically and dynamically typed languages is that the latter require less scaffolding in the form of type declarations, but also provide less help with refactoring tools and compile-time error detection. Some Python IDEs do offer a certain level of type inference and help with refactoring, but even the best of them will not be able to match the tools developed for statically typed languages.
Dynamic language programmers typically ensure correctness while refactoring in one or more of the following ways:
Use grep to look for function invocation sites, and fix them. (You would have to do that in languages like Java as well if you wanted to handle reflection.)
Start the application and see what goes wrong.
Write unit tests, if you don't already have them, use a coverage tool to make sure that they cover your whole program, and run the test suite after each change to check that everything still works.
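As a rough illustration of the last point, here is a minimal unittest sketch. Graph is a made-up stand-in for the asker's class, and the attribute names nodes and edges are assumptions. If an attribute were renamed and a call site missed, the test would fail with an AttributeError rather than the bug surfacing later at runtime.

```python
import unittest


class Graph:
    """Hypothetical stand-in for the Graph class from the question."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()

    def add_edge(self, a, b):
        self.nodes.update((a, b))
        self.edges.add((a, b))


class GraphTest(unittest.TestCase):
    def test_add_edge_registers_both_nodes(self):
        graph = Graph()
        graph.add_edge("a", "b")
        self.assertEqual(graph.nodes, {"a", "b"})
        self.assertIn(("a", "b"), graph.edges)


def run_suite():
    """Run GraphTest and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GraphTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```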

When should a Python script be split into multiple files/modules?

In Java, this question is easy (if a little tedious) - every class requires its own file. So the number of .java files in a project is the number of classes (not counting anonymous/nested classes).
In Python, though, I can define multiple classes in the same file, and I'm not quite sure how to find the point at which I split things up. It seems wrong to make a file for every class, but it also feels wrong just to leave everything in the same file by default. How do I know where to break a program up?

Remember that in Python, a file is a module that you will most likely import in order to use the classes contained therein. Also remember one of the basic principles of software development "the unit of packaging is the unit of reuse", which basically means:
If classes are most likely used together, or if using one class leads to using another, they belong in a common package.

As I see it, this is really a question about reuse and abstraction. If you have a problem that you can solve in a very general way, so that the resulting code would be useful in many other programs, put it in its own module.
For example: a while ago I wrote a (bad) mpd client. I wanted to make configuration file and option parsing easy, so I created a class that combined ConfigParser and optparse functionality in a way I thought was sensible. It needed a couple of support classes, so I put them all together in a module. I never use the client, but I've reused the configuration module in other projects.
EDIT: Also, a more cynical answer just occurred to me: if you can only solve a problem in a really ugly way, hide the ugliness in a module. :)

In Java ... every class requires its own file.
On the flipside, sometimes a Java file, also, will include enums or subclasses or interfaces, within the main class because they are "closely related."
not counting anonymous/nested classes
Anonymous classes shouldn't be counted, but I think tasteful use of nested classes is a choice much like the one you're asking about Python.
(Occasionally a Java file will have two classes, not nested, which is allowed, but yuck don't do it.)

Python actually gives you the choice to package your code in the way you see fit.
The analogy between Python and Java is that a file (i.e. the .py file in Python) is equivalent to a package in Java, in that it can contain many related classes and functions.
For good examples, have a look at the Python built-in modules. Just download the source and check them out. The rule of thumb I follow is: when you have very tightly coupled classes or functions, keep them in a single file; otherwise, break them up.

Are multiple classes in a single file recommended? [duplicate]

Possible Duplicate:
How many Python classes should I put in one file?
Coming from a C++ background I've grown accustomed to organizing my classes such that, for the most part, there's a 1:1 ratio between classes and files. By making it so that a single file contains a single class I find the code more navigable. As I introduce myself to Python I'm finding lots of examples where a single file contains multiple classes. Is that the recommended way of doing things in Python? If so, why?
Am I missing this convention in the PEP8?

Here are some possible reasons:
Python is not exclusively class-based - the natural unit of code decomposition in Python is the module. Modules are just as likely to contain functions (which are first-class objects in Python) as classes. In Java, the unit of decomposition is the class. Hence, Python has one module=one file, and Java has one (public) class=one file.
Python is much more expressive than Java, and if you restrict yourself to one class per file (which Python does not prevent you from doing) you will end up with lots of very small files - more to keep track of with very little benefit.
An example of roughly equivalent functionality: Java's log4j => a couple of dozen files, ~8000 SLOC. Python logging => 3 files, ~ 2800 SLOC.

There's a mantra, "flat is better than nested," that generally discourages an overuse of hierarchy. I'm not sure there's any hard and fast rules as to when you want to create a new module -- for the most part, people just use their discretion to group logically related functionality (classes and functions that pertain to a particular problem domain).
Good thread from the Python mailing list, and a quote by Fredrik Lundh:
even more important is that in Python, you don't use classes for everything; if you need factories, singletons, multiple ways to create objects, polymorphic helpers, etc, you use plain functions, not classes or static methods.
once you've gotten over the "it's all classes", use modules to organize things in a way that makes sense to the code that uses your components.
make the import statements look good.
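A minimal sketch of what the quote describes: instead of a factory class or static methods, "multiple ways to create objects" can be plain module-level functions. Point and both helper functions are made up for the illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Hypothetical value class used only for illustration."""
    x: float
    y: float


def point_from_polar(radius, angle):
    """Build a Point from polar coordinates (angle in radians)."""
    return Point(radius * math.cos(angle), radius * math.sin(angle))


def point_from_pair(pair):
    """Build a Point from any two-item iterable such as a tuple or list."""
    x, y = pair
    return Point(float(x), float(y))


print(point_from_pair((1, 2)))
```

If these lived in a module called, say, geometry, callers would write from geometry import Point, point_from_polar, which is the kind of readable import statement the quote asks for.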

The book Expert Python Programming has a related discussion in
Chapter 4: Choosing Good Names, under "Building the Namespace Tree" and "Splitting the Code".
My crude one-line summary: collecting related classes into one module (source file), and
related modules into one package, helps with code maintenance.

In Python, classes can also be used for small tasks (just for grouping etc.). Maintaining a 1:1 relation would result in too many files, each with little functionality.

There is no specific convention for this - do whatever makes your code the most readable and maintainable.

A good example of not having separate files for each class might be the models.py file within a Django app. Each Django app may have a handful of classes that are related to that app, and putting them into individual files just makes more work.
Similarly, having each view in a different file again is likely to be counterproductive.

Possibilities for Python classes organized across files? [closed]

I'm used to the Java model where you can have one public class per file. Python doesn't have this restriction, and I'm wondering what's the best practice for organizing classes.

A Python file is called a "module" and it's one way to organize your software so that it makes "sense". Another is a directory, called a "package".
A module is a distinct thing that may have one or two dozen closely-related classes. The trick is that a module is something you'll import, and you need that import to be perfectly sensible to people who will read, maintain and extend your software.
The rule is this: a module is the unit of reuse.
You can't easily reuse a single class. You should be able to reuse a module without any difficulties. Everything in your library (and everything you download and add) is either a module or a package of modules.
For example, you're working on something that reads spreadsheets, does some calculations and loads the results into a database. What do you want your main program to look like?
    # ssReader, theCalcs and theDB are illustrative module names
    from ssReader import Reader
    from theCalcs import ACalc, AnotherCalc
    from theDB import Loader

    def main(sourceFileName, options, parameters):
        rdr = Reader(sourceFileName)
        c1 = ACalc(options)
        c2 = AnotherCalc(options)
        ldr = Loader(parameters)
        for myObj in rdr.readAll():
            c1.thisOp(myObj)
            c2.thatOp(myObj)
            ldr.load(myObj)
Think of the import as the way to organize your code in concepts or chunks. Exactly how many classes are in each import doesn't matter. What matters is the overall organization that you're portraying with your import statements.

Since there is no artificial limit, it really depends on what's comprehensible. If you have a bunch of fairly short, simple classes that are logically grouped together, toss in a bunch of 'em. If you have big, complex classes or classes that don't make sense as a group, go one file per class. Or pick something in between. Refactor as things change.

I happen to like the Java model for the following reason. Placing each class in an individual file promotes reuse by making classes easier to see when browsing the source code. If you have a bunch of classes grouped into a single file, it may not be obvious to other developers that there are classes there that can be reused simply by browsing the project's directory structure. Thus, if you think that your class can possibly be reused, I would put it in its own file.

It entirely depends on how big the project is, how long the classes are, if they will be used from other files and so on.
For example I quite often use a series of classes for data-abstraction - so I may have 4 or 5 classes that may only be 1 line long (class SomeData: pass).
It would be stupid to split each of these into separate files - but since they may be used from different files, putting all these in a separate data_model.py file would make sense, so I can do from mypackage.data_model import SomeData, SomeSubData
If you have a class with lots of code in it, maybe with some functions only it uses, it would be a good idea to split this class and the helper-functions into a separate file.
You should structure them so you do from mypackage.database.schema import MyModel, not from mypackage.email.errors import MyDatabaseModel - if where you are importing things from make sense, and the files aren't tens of thousands of lines long, you have organised it correctly.
The Python Modules documentation has some useful information on organising packages.

I find myself splitting things up when I get annoyed with the bigness of files and when the desirable structure of relatedness starts to emerge naturally. Often these two stages seem to coincide.
It can be very annoying if you split things up too early, because you start to realise that a totally different ordering of structure is required.
On the other hand, when any .java or .py file is getting to more than about 700 lines I start to get annoyed constantly trying to remember where "that particular bit" is.
With Python/Jython circular dependency of import statements also seems to play a role: if you try to split too many cooperating basic building blocks into separate files this "restriction"/"imperfection" of the language seems to force you to group things, perhaps in rather a sensible way.
As to splitting into packages, I don't really know, but I'd say probably the same rule of annoyance and emergence of happy structure works at all levels of modularity.

I would say to put as many classes as can be logically grouped in that file without making it too big and complex.
