Organizing Python classes in modules and/or packages - python

I like the Java convention of having one public class per file, even if there are sometimes good reasons to put more than one public class into a single file. In my case I have alternative implementations of the same interface. But if I would place them into separate files, I'd have redundant names in the import statements (or misleading module names):
import someConverter.SomeConverter
where someConverter would be the file (and module) name and SomeConverter the class name. This looks pretty inelegant to me. Putting all alternative classes into one file would lead to a more meaningful import statement:
import converters.SomeConverter
But I fear that the files will become pretty large if I put all related classes into a single module file. What is the Python best practice here? Is one class per file unusual?

A lot of it is personal preference. Using a Python package, you have the option to keep each class in a separate file and still allow for from converters import SomeConverter (or import converters followed by converters.SomeConverter).
Your file structure could look something like this:
* converters
- __init__.py
- baseconverter.py
- someconverter.py
- otherconverter.py
and then in your __init__.py file:
from baseconverter import BaseConverter
from someconverter import SomeConverter
from otherconverter import OtherConverter

Zach's solution breaks on Python 3, which no longer supports implicit relative imports. Here is a fixed solution: keep the same file structure, but use absolute imports in your __init__.py file:
from converters.baseconverter import BaseConverter
from converters.someconverter import SomeConverter
from converters.otherconverter import OtherConverter
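To try the layout without touching a real project, here is a minimal sketch that writes the package into a temporary directory and imports from it. The class bodies and the convert method are made up for illustration; only the file names and the __init__.py imports come from the answers above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents for the converters/ layout described above.
FILES = {
    "converters/__init__.py": (
        "from converters.baseconverter import BaseConverter\n"
        "from converters.someconverter import SomeConverter\n"
        "from converters.otherconverter import OtherConverter\n"
    ),
    "converters/baseconverter.py": (
        "class BaseConverter:\n"
        "    def convert(self, value):\n"
        "        raise NotImplementedError\n"
    ),
    "converters/someconverter.py": (
        "from converters.baseconverter import BaseConverter\n"
        "class SomeConverter(BaseConverter):\n"
        "    def convert(self, value):\n"
        "        return str(value)\n"
    ),
    "converters/otherconverter.py": (
        "from converters.baseconverter import BaseConverter\n"
        "class OtherConverter(BaseConverter):\n"
        "    def convert(self, value):\n"
        "        return repr(value)\n"
    ),
}

with tempfile.TemporaryDirectory() as root:
    for rel_path, source in FILES.items():
        target = Path(root, rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        # One class per file, but a short import path for callers.
        from converters import SomeConverter, OtherConverter
    finally:
        sys.path.remove(root)

some_result = SomeConverter().convert(42)
other_result = OtherConverter().convert("x")
defined_in = SomeConverter.__module__  # the class still lives in its own module
```

The class keeps its real module name (converters.someconverter), so tracebacks and IDE navigation still point to the right file.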

The above solutions are good, but one problem with importing modules in __init__.py is that a module which is run directly as a script can end up being executed twice (inefficient): once as __main__ and once as part of the package. Try adding a print statement at the end of otherconverter.py and running otherconverter.py. (Assuming otherconverter.py imports something from the converters package, such as its base class, you'll see that the print statement is executed twice.)
I prefer the following. Use another package with the name "_converters" and define everything there. Your "converters.py" then becomes the interface for accessing all public members:
* _converters
- __init__.py
- baseconverter.py
- someconverter.py
- otherconverter.py
* converters.py
where converters.py is
from _converters.someconverter import SomeConverter
from _converters.otherconverter import OtherConverter
...
...
...
converters = [SomeConverter, OtherConverter, ...]
And as the previous solutions mentioned, it is a personal choice. Another common practice is to define a module "interface.py" within the package and import all public members there. If you have many modules to load, you should choose efficiency over aesthetics.
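The double execution is easy to reproduce. The sketch below is an assumption-laden demo, not part of the answer: it builds a throwaway converters package, puts its parent directory on PYTHONPATH, and runs otherconverter.py as a script in a child interpreter. The print line and class bodies are made up.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical package whose __init__.py imports the module we run as a script.
FILES = {
    "converters/__init__.py": (
        "from converters.otherconverter import OtherConverter\n"
    ),
    "converters/baseconverter.py": "class BaseConverter:\n    pass\n",
    "converters/otherconverter.py": (
        "from converters.baseconverter import BaseConverter\n"
        "class OtherConverter(BaseConverter):\n"
        "    pass\n"
        "print('executed as', __name__)\n"
    ),
}

with tempfile.TemporaryDirectory() as root:
    for rel_path, source in FILES.items():
        target = Path(root, rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)

    # Make the package importable in the child process.
    env = dict(os.environ, PYTHONPATH=root)
    completed = subprocess.run(
        [sys.executable, str(Path(root, "converters", "otherconverter.py"))],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )

# One line per execution of the module body.
executions = completed.stdout.splitlines()
```

The file runs once under the name converters.otherconverter (pulled in by __init__.py) and once as __main__. When the package is only ever imported, never run file by file, each module body executes a single time.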

Related

Why shouldn't I use __init__.py to separate public from private classes?

I am implementing a benchmark. In a subpackage, I have implemented a workload setup. To make the code more accessible, I want to split the code into public and private classes.
benchmark/
- __init__.py
- benchmark.py -> Benchmark
- workload/
- __init__.py
- loader.py -> WorkloadLoader
- validation.py -> WorkloadValidation
- internal_logic1.py -> Internal1
- internal_logic2.py -> Internal2
I see three options:
1. Prefix the internal classes with an underscore (e.g., _Internal1). This has the disadvantage that it adds even more underscores to the code. (For me, underscore prefixes work better with snake case methods than camel case classes.)
2. Move all internal classes to a new subpackage benchmark.workload.internal. This has the disadvantage that it creates deeper folder hierarchies.
3. Only import the classes that are public in the __init__.py file of the workload package. In theory, you would need another step to look up code, but in practice IDEs like IntelliJ will do the work for you.
Of the three options, option 3 is the best choice in my eyes. It has the additional advantage that it shortens the import paths:
# from
from workload.loader import WorkloadLoader
from workload.validation import WorkloadValidation
# to
from workload import WorkloadLoader, WorkloadValidation
This looks like a clean module boundary. Yet, answers from other SO posts seem to disagree: How do I write good/correct package __init__.py files
Question: Why should I not use __init__.py to separate public from private classes?
Example
# workload/__init__.py:
from .loader import WorkloadLoader
from .validation import WorkloadValidation
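For reference, here is a small sketch of what option 3 gives you in practice. The class bodies and the __all__ list are assumptions added for the demo. It suggests that the boundary is a convention rather than an enforcement: internal classes are simply not re-exported, but they stay importable by their full path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical contents for the workload/ package from the question.
FILES = {
    "workload/__init__.py": (
        "from .loader import WorkloadLoader\n"
        "from .validation import WorkloadValidation\n"
        "__all__ = ['WorkloadLoader', 'WorkloadValidation']\n"
    ),
    "workload/loader.py": (
        "from .internal_logic1 import Internal1\n"
        "class WorkloadLoader:\n"
        "    def load(self):\n"
        "        return Internal1().run()\n"
    ),
    "workload/validation.py": "class WorkloadValidation:\n    pass\n",
    "workload/internal_logic1.py": (
        "class Internal1:\n"
        "    def run(self):\n"
        "        return 'loaded'\n"
    ),
}

with tempfile.TemporaryDirectory() as root:
    for rel_path, source in FILES.items():
        target = Path(root, rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        import workload
        # The full path still works; nothing is actually hidden.
        from workload.internal_logic1 import Internal1
    finally:
        sys.path.remove(root)

public_names = sorted(workload.__all__)
internal_exposed = hasattr(workload, "Internal1")  # not re-exported by __init__.py
load_result = workload.WorkloadLoader().load()
```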

Integrate class attributes in client namespace

I want to define a bunch of attributes for use in a module that should also be accessible from other modules, because they're part of my interface contract.
I've put them in a data class in my module like this, but I want to avoid qualifying them every time, similar to how you use import * from a module:
from dataclasses import dataclass

@dataclass
class Schema:
    key1 = 'key1'
    key2 = 'key2'
and in the same module:
<mymodule.py>
print(my_dict[Schema.key1])
I would prefer to be able to do this:
print(my_dict[key1])
I was hoping for an equivalent syntax to:
from Schema import *
This would allow me to do this from other modules too:
<another_module.py>
from mymodule.Schema import *
but that doesn't work.
Is there a way to do this?
Short glossary
module - a Python file that can be imported
package - a collection of modules in a directory that can also be imported; it is technically also a module
name - shorthand for a named value (often just "variable" in other languages); names can be imported from modules
Using import statements allows you to import either packages, modules, or names:
import xml # package
from xml import etree # also a package
from xml.etree import ElementTree # module
from xml.etree.ElementTree import TreeBuilder # name
# --- here is where it ends ---
from xml.etree.ElementTree.TreeBuilder import element_factory # does not work
The dots in such an import chain can only follow module objects. Packages and modules are module objects; names are not. So, while it looks like we are just accessing attributes of objects, we are actually relying on a mechanism that normal objects just don't support, so we can't import from within them.
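A quick way to see the difference between attribute access and importing, using only the standard library names from the example above:

```python
import importlib
import xml.etree.ElementTree

# Attribute access reaches the class without trouble...
tree_builder = xml.etree.ElementTree.TreeBuilder
is_class = isinstance(tree_builder, type)

# ...but the import system refuses to treat the class as a module.
try:
    importlib.import_module("xml.etree.ElementTree.TreeBuilder")
except ImportError as exc:
    import_error = exc
else:
    import_error = None
```

The import attempt fails with a ModuleNotFoundError because xml.etree.ElementTree is a plain module, not a package, so nothing below it can be imported as a submodule.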
In your particular case, a reasonable solution would be to simply turn the object that you wanted to hold the schema into a top-level module in your project:
schema.py
key1 = 'key1'
key2 = 'key2'
...
This will give you the option to import them in the way that you initially proposed. Doing something like this to make common constants easily accessible in your project is not unusual, and the Django framework, for example, uses a settings.py in the same manner.
One thing you should keep in mind is that Python modules are effectively singletons: every importer shares the same module object, so the names in schema.py should be treated as constants that are not changed at runtime[1].
[1] They can be changed, but it's so hacky that it should pretty much always be treated as not possible.
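Here is a rough sketch of the schema-module approach. The module name demo_schema and the dictionary contents are made up for the demo; a wildcard import would work the same way as the explicit one used here. The last lines illustrate why rebinding at runtime is best avoided: names already imported elsewhere keep their old value.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical schema module holding the shared keys as plain names.
SCHEMA_SOURCE = "key1 = 'key1'\nkey2 = 'key2'\n"

with tempfile.TemporaryDirectory() as root:
    Path(root, "demo_schema.py").write_text(SCHEMA_SOURCE)
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        import demo_schema
        from demo_schema import key1, key2
    finally:
        sys.path.remove(root)

my_dict = {"key1": 1, "key2": 2}
value = my_dict[key1]  # no Schema. prefix needed

# Rebinding the name on the module does not update names imported earlier.
demo_schema.key1 = "changed"
local_key1_after_rebind = key1
```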

Best way to import several classes

I have defined several classes in a single Python file. My wish is to create a library with these. I would ideally like to import the library in such a way that I can use the classes without a prefix (like just myclass() as opposed to mylibrary.myclass()), if that's what you can call them; I am not entirely sure, as I am a beginner.
What is the proper way to achieve this, or otherwise the best result? Define all classes in __init__.py? Define them all in a single file as I currently have, like AllMyClasses.py? Or should I have a separate file for every class in the library directory, like FirstClass.py, SecondClass.py, etc.?
I realize this is a question that should be easy enough to google, but since I am still quite new to python and programming in general I haven't quite figured out what the correct keywords are for a problem in this context(such as my uncertainty about "prefix")
More information can be found in the tutorial on modules (single files) or packages (when in a directory with an __init__.py file) on the python site.
The suggested way (according to the style guide) is to spell out each class import specifically.
from my_module import MyClass1, MyClass2
object1 = MyClass1()
object2 = MyClass2()
You can also shorten the module name:
import my_module as mo
obj = mo.MyClass1()
Avoid from my_module import *, as it can be confusing (even though it is the customary way to use some libraries, like tkinter).
If it's for your personal use, you can just put all your classes Class1, Class2, ... in a myFile.py and to use them call import myFile (without the .py extension)
import myFile
myVar1 = myFile.Class1()
myVar2 = myFile.Class2()
from within another script. If you want to be able to use the classes without the file name prefix, import the file like this:
from myFile import *
Note that the file you want to import should be in a directory where Python can find it (the same where the script is running or a directory in PYTHONPATH).
An __init__.py file is needed if you want to create a Python package for distribution. Here are the instructions: Distributing Python Modules
EDIT after checking Python's style guide, PEP 8, on imports:
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
So in this example you should have used
from myFile import Class1, Class2
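To see what "unclear which names are present" means in practice, here is a small sketch. The module demo_lib and its contents are invented for the demo; it compares the names that each import style brings into a namespace.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical library file: two classes, a helper, and an import of its own.
LIB_SOURCE = (
    "import os\n"
    "class Class1:\n"
    "    pass\n"
    "class Class2:\n"
    "    pass\n"
    "def helper():\n"
    "    return os.getcwd()\n"
)

def imported_names(statement):
    """Run an import statement in a fresh namespace and list what it added."""
    namespace = {}
    exec(statement, namespace)
    return sorted(name for name in namespace if not name.startswith("__"))

with tempfile.TemporaryDirectory() as root:
    Path(root, "demo_lib.py").write_text(LIB_SOURCE)
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        wildcard_names = imported_names("from demo_lib import *")
        explicit_names = imported_names("from demo_lib import Class1, Class2")
    finally:
        sys.path.remove(root)
```

Without an __all__ list in the library, the wildcard import also drags in helper and even the os module that the library happened to import, while the explicit form brings in exactly the two classes.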

Is this an okay way of doing Django Models

I am doing the fat models approach, so I have transformed my models.py into a package thus:
+--polls/
| +--models/
| +--__init__.py
| +--__shared_imports.py
| +--Choice.py
| +--Question.py
My main part of the question is the __shared_imports.py: I realized that we have common import statements in various modules in the package and decided to have that file do the imports. Then in my modules I write this:
from __shared_imports import *
Everything works fine, but just want to know if this approach is good. I'll appreciate your thoughts on this.
Use only lower case when you name modules.
Do not use double underscore for module name.
Read PEP 8 and the Google Python style guide.
Use less verbose names for modules. For example: shared_imports.py -> shared.py
Got it.
In this case you need to import everything in __init__.py.
Then you can export all names as __all__ = ['Choice', 'Question']
So, it will be enough just to import models package.
Example: __init__.py
# Assumes each module defines a class with the same name as the module.
from .Choice import Choice
from .Question import Question
__all__ = ['Choice', 'Question']
Avoid import * because it will prevent tools like pyflakes from detecting undefined variables.
Moving all of it into a subdirectory and splitting it into separate files is not a bad idea, albeit mostly not needed. When your models.py file gets big, you should rather be thinking about splitting the project up into smaller apps.

Circular & nested imports in python

I'm having some real headaches right now trying to figure out how to import stuff properly. I had my application structured like so:
main.py
util_functions.py
widgets/
- __init__.py
- chooser.py
- controller.py
I would always run my applications from the root directory, so most of my imports would be something like this
from util_functions import *
from widgets.chooser import *
from widgets.controller import *
# ...
And my widgets/__init__.py was setup like this:
from widgets.chooser import Chooser
from widgets.controller import MainPanel, Switch, Lever
__all__ = [
'Chooser', 'MainPanel', 'Switch', 'Lever',
]
It was all working fine, except that widgets/controller.py was getting kind of lengthy, and I wanted to split it up into multiple files:
main.py
util_functions.py
widgets/
- __init__.py
- chooser.py
- controller/
- __init__.py
- mainpanel.py
- switch.py
- lever.py
One of the issues is that the Switch and Lever classes have static members where each class needs to access the other one. Using imports with the from ___ import ___ syntax created circular imports. So when I tried to run my refactored application, everything broke at the imports.
My question is this: How can I fix my imports so I can have this nice project structure? I cannot remove the static dependencies of Switch and Lever on each other.
This is covered in the official Python FAQ under How can I have modules that mutually import each other.
As the FAQ makes clear, there's no silver bullet that magically fixes the problem. The options described in the FAQ (with a little more detail than is in the FAQ) are:
1. Never put anything at the top level except classes, functions, and variables initialized with constants or builtins, and never use from spam import anything; then the circular import problems usually don't arise. Clean and simple, but there are cases where you can't follow those rules.
2. Refactor the modules to move the imports into the middle of the module, where each module defines the things that need to be exported before importing the other module. This can mean splitting classes into two parts, an "interface" class that can go above the line, and an "implementation" subclass that goes below the line.
3. Refactor the modules in a similar way, but move the "export" code (with the "interface" classes) into a separate module, instead of moving it above the imports. Then each implementation module can import all of the interface modules. This has the same effect as the previous one, with the advantage that your code is idiomatic and more readable by both humans and automated tools that expect imports at the top of a module, but the disadvantage that you have more modules.
As the FAQ notes, "These solutions are not mutually exclusive." In particular, you can try to move as much top-level code as possible into function bodies, replace as many from spam import … statements with import spam as is reasonable… and then, if you still have circular dependencies, resolve them by refactoring into import-free export code above the line or in a separate module.
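To make the first option concrete, here is a minimal sketch with made-up module names (cyc_a/cyc_b and ok_a/ok_b). The first pair uses from ... import ... in both directions and fails at import time. The second pair uses plain import statements and only touches the other module inside functions.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical modules: the first pair breaks, the second pair works.
FILES = {
    "cyc_a.py": "from cyc_b import B\nclass A:\n    pass\n",
    "cyc_b.py": "from cyc_a import A\nclass B:\n    pass\n",
    "ok_a.py": (
        "import ok_b\n"
        "NAME = 'a'\n"
        "def describe():\n"
        "    return 'a sees ' + ok_b.NAME\n"
    ),
    "ok_b.py": (
        "import ok_a\n"
        "NAME = 'b'\n"
        "def describe():\n"
        "    return 'b sees ' + ok_a.NAME\n"
    ),
}

broken_error = None
with tempfile.TemporaryDirectory() as root:
    for rel_path, source in FILES.items():
        Path(root, rel_path).write_text(source)

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        try:
            importlib.import_module("cyc_a")
        except ImportError as exc:
            # cyc_b asks for cyc_a.A before cyc_a has defined it.
            broken_error = str(exc)

        ok_a = importlib.import_module("ok_a")
        ok_b = importlib.import_module("ok_b")
    finally:
        sys.path.remove(root)

# The lookups happen at call time, when both modules are fully loaded.
a_view = ok_a.describe()
b_view = ok_b.describe()
```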
With the generalities out of the way, let's look at your specific problem.
Your switch.Switch and lever.Lever classes have "static members where each class needs to access the other one". I assume by this you mean they have class attributes that are initialized using class attributes or class or static methods from the other class?
Following the first solution, you could change things so that these values are initialized after import time. Let's assume your code looked like this:
class Lever:
    switch_stuff = Switch.do_stuff()
    # ...
You could change that to:
class Lever:
    @classmethod
    def init_class(cls):
        cls.switch_stuff = Switch.do_stuff()
Now, in the __init__.py, right after this:
from .lever import Lever
from .switch import Switch
… you add:
Lever.init_class()
Switch.init_class()
That's the trick: you're resolving the ambiguous initialization order by making the initialization explicit, and picking an explicit order.
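Put together, the init_class approach might look like the sketch below. The module names demo_lever and demo_switch, the do_stuff return values, and the lever_stuff attribute are assumptions for the demo. It also assumes the two modules reach each other through plain import statements, so that nothing from the other module is needed while a module is still loading.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for lever.py and switch.py.
FILES = {
    "demo_lever.py": (
        "import demo_switch\n"
        "class Lever:\n"
        "    switch_stuff = None\n"
        "    @classmethod\n"
        "    def init_class(cls):\n"
        "        cls.switch_stuff = demo_switch.Switch.do_stuff()\n"
        "    @staticmethod\n"
        "    def do_stuff():\n"
        "        return 'lever stuff'\n"
    ),
    "demo_switch.py": (
        "import demo_lever\n"
        "class Switch:\n"
        "    lever_stuff = None\n"
        "    @classmethod\n"
        "    def init_class(cls):\n"
        "        cls.lever_stuff = demo_lever.Lever.do_stuff()\n"
        "    @staticmethod\n"
        "    def do_stuff():\n"
        "        return 'switch stuff'\n"
    ),
}

with tempfile.TemporaryDirectory() as root:
    for rel_path, source in FILES.items():
        Path(root, rel_path).write_text(source)

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        # What the package __init__.py would do: import first...
        from demo_lever import Lever
        from demo_switch import Switch
    finally:
        sys.path.remove(root)

# ...then initialize explicitly, in an order you choose.
Lever.init_class()
Switch.init_class()

lever_view = Lever.switch_stuff
switch_view = Switch.lever_stuff
```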
Alternatively, following the second or third solution, you could split Lever up into Lever and LeverImpl. Then you do this (whether as separate lever.py and leverimpl.py files, or as one file with the imports in the middle):
class Lever:
    @classmethod
    def get_switch_stuff(cls):
        return cls.switch_stuff

from switch import Switch

class LeverImpl(Lever):
    switch_stuff = Switch.do_stuff()
Now you don't need any kind of init_class method. Of course you do need to change the attribute to a method, but if you don't like that, with a bit of work, you can always change it into a "class @property" (either by writing a custom descriptor, or by using @property in a metaclass).
Note that you don't actually need to fix both classes to resolve the circularity, just one. In theory, it's cleaner to fix both, but in practice, if the fixes are ugly, it may be better to just fix the one that's less ugly to fix and leave the dependency in the opposite direction alone.
