I am refactoring a piece of Python 2 software. The code is currently in a single file, which contains 45 or so classes (and 2 lines outside of a class to bootstrap the application).
I'd like to have one class per file, with ideally files from related classes grouped in directories.
I typically like to write my Python imports as such:
from zoo.dog_classes.beagle_class import BeagleClass
from zoo.dog_classes.dalmatian_class import DalmatianClass
so that it is clear which modules are imported in the class, and that their name is as short as possible.
For this software, the logic is quite complex though, with classes referring to one another commonly in class methods, resulting in numerous circular imports, which rules out this approach.
I do not want to import modules in functions, as this is horrible for readability and code gets repeated everywhere.
It seems like my only option is to write imports in such a fashion:
import zoo.dog_classes.beagle_class
and later, when I need the class:
b = zoo.dog_classes.beagle_class.BeagleClass()
This is however extremely verbose and painful to write.
How should I deal with my imports?
import zoo.dog_classes.beagle_class as beagle
b = beagle.BeagleClass()
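A minimal sketch of why plain module imports tolerate cycles, assuming two hypothetical top-level modules (beagle_mod and dalmatian_mod, written to a temporary directory only so the example is self-contained). Each module binds the other module object at import time and looks up the class only when a method runs, by which point both modules are fully loaded:

```python
import os
import sys
import tempfile
import textwrap

# Hypothetical modules; the names are illustrative, not from the question.
SOURCES = {
    "beagle_mod.py": """
        import dalmatian_mod  # binds the module object only

        class Beagle(object):
            def friend(self):
                # Attribute lookup happens at call time, not import time.
                return dalmatian_mod.Dalmatian()
    """,
    "dalmatian_mod.py": """
        import beagle_mod  # circular, but harmless with this style

        class Dalmatian(object):
            def friend(self):
                return beagle_mod.Beagle()
    """,
}

tmp_dir = tempfile.mkdtemp()
for file_name, source in SOURCES.items():
    with open(os.path.join(tmp_dir, file_name), "w") as handle:
        handle.write(textwrap.dedent(source))
sys.path.insert(0, tmp_dir)

import beagle_mod

friend = beagle_mod.Beagle().friend()
friend_name = type(friend).__name__
print(friend_name)
```

The same cycle written with from dalmatian_mod import Dalmatian in both files would fail, because one of the two modules is still only partially initialized when the other asks for the class.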
Related
I am relatively new to programming. So far I have seen two ways that classes
are imported and inherited in python. The first one which is also what I
have been doing while learning Flask is:
from package.module import SuperClass
class SubClass(SuperClass):
The other one which I am seeing quite often in most Django code is:
from package import module
class SubClass(module.SuperClass):
Which one is the right way of doing things? Is there a significant
advantage of using one over the other?
Short answer : they are the same, choose the most explicit / legible one.
Long answer : more details in this question from the Software Engineering StackExchange.
They are the same thing. The only difference is that it is sometimes preferable to import an entire module when you would otherwise have to import too many individual names from it (you wouldn't want to write from ... import module1, module2, module3, 100 times).
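A small sketch of the equivalence, using the standard library's collections module as a stand-in for package.module (the subclass names are made up for illustration). Both import spellings reach the same class object, so the resulting subclasses have identical bases:

```python
import collections
from collections import OrderedDict

# Both spellings refer to the very same class object.
same_object = collections.OrderedDict is OrderedDict


class ViaName(OrderedDict):
    """Subclass declared with the 'from module import Class' style."""


class ViaModule(collections.OrderedDict):
    """Subclass declared with the 'import module' style."""


same_bases = ViaName.__bases__ == ViaModule.__bases__
print("%s %s" % (same_object, same_bases))
```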
I'm trying to import a function from a python module. That function is declared on the module I'm calling import from, but nevertheless I'm using that function on the other file.
Like so:
context.py
from elements import *
class Context:
    def __init__(self):
        pass

    @staticmethod
    def load():
        print "load code here"
elements.py
from context import *
class Item:
    def __init__(self):
        Context.load() # NameError: global name 'load' is not defined
As someone who comes from Java, seems like applying the same nested class accessing logic doesn't work in Python. I'm wondering what could be the best practice here, since it doesn't seem right to put the import statement below the Context class. I searched about this but the material wasn't clear about this practice.
Also, at context.py I'm using instances of classes defined at elements, and vice versa. So my question is really what would be the best importing practice here.
Another related question: is it good practice to encapsulate functions and variables inside Classes in Python or should I use global functions/variables instead?
Ah, in Python this is considered a circular import error -- and can be incredibly frustrating. elements is importing from context and vice-versa. This may be possible in Java with magic compiler tricks but since Python is (mostly) interpreted, this isn't possible*.
Another unstated difference between Java and Python is that a Python class is closer to a hashmap with a special API than a proper Java class. As such, it is perfectly acceptable and preferable to put classes that have a tight interdependence (such as the ones you wrote) in the same Python module. This will remove the circular import error.
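As a rough illustration of the single-module suggestion, here is a sketch with both classes in one file. The make_item method and the loaded attribute are assumptions added for the example. Because each name is looked up only when a method runs, the classes can refer to each other freely regardless of definition order:

```python
class Context(object):
    @staticmethod
    def load():
        return "load code here"

    def make_item(self):
        # Item is defined further down; the name is resolved at call time.
        return Item(self)


class Item(object):
    def __init__(self, context):
        self.context = context
        # Context is in the same module, so no import is needed at all.
        self.loaded = Context.load()


item = Context().make_item()
print(item.loaded)
```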
In general, you want to organize library modules by dependency level -- meaning, the leaves of your lib folder do not import from anything else in your project, and as you progress closer to the root, more imports are drawn upon. To the best of your ability you want your import structure to be a tree, not a spiderweb (if that makes any sense). Without a compiler, it's the only way I've found in a large (multi-million line) Python project to maintain sanity.
The above comments are generally considered best practice, this next suggestion is highly opinionated:
I would recommend structuring executable modules around I/O boundaries. It becomes very tempting to build tightly interconnected fabrics of Python objects with complicated inheritance structures passed by reference. While on a small and medium scale this offers development advantages, on a larger scale you lose the ability to easily integrate concurrency since you've taken away the ability for the code to be transfer-layer agnostic.
Edit: Okay, it IS possible by playing around with import statement ordering, using the __import__ method, etc., to hack the import framework and accomplish this. However, you should NOT do this if you intend to have a large project -- it is very brittle and difficult to explain to a team. It seems like you're more interested in best practices, which is how I directed my answer. Sorry if that was unclear.
In context.py you should add def before __init__; also, static methods do not take self:
class Context:
    def __init__(self):
        pass

    @staticmethod
    def load():
        print "load code here"
then in another file:
from context import Context
class Item:
    def __init__(self):
        Context.load()
As I learn more about Python I am starting to get into the realm of classes. I have been reading on how to properly call a class and how to import the module or package.module but I was wondering if it is really needed to do this.
My question is this: Is it required to move your class to a separate module for a functional reason or is it solely for readability? I can perform all the same task using defined functions within my main module so what is the need for the class if any outside of readability?
Modules are structuring tools that provide encapsulation. In other words, modules are structures that combine your logic and data into one compartment, in the module itself. When you code a module, you should be consistent. To make a module consistent you must define its purpose: does my module provide tools? What type of tools? String tools? Numericals tools...?
For example, let's assume you're coding a program that processes numbers. Typically, you would use the builtin math module, and for some specialized purposes you might need to write your own functions and classes that process your numbers according to your needs. If you read the documentation of the builtin math module, you'll find that everything it defines relates to math, and nothing in it processes strings, for instance. This is cohesion: unifying the purpose of your module. Keep in mind that maximizing cohesion minimizes coupling. That is, when you keep your module unified, you make it less likely to depend on other modules.
Is it required to move your Class to a separate module for a functional reason or is it solely for readability?
If that specific class doesn't relate to your module, then you're probably better off moving that class to another module. This is not always true, though. Suppose you're coding a relatively small program that doesn't need a large number of tools: coding your class in your main module doesn't hurt at all. In larger applications where you need to write dozens of tools, on the other hand, it's better to split your program into modules with specific purposes: myStringTools, myMath, main and many other modules. Structuring your program with modules and packages makes it easier to maintain.
If you need to delve deeper read about Modular programming, it'll help you grasp the idea even better.
You can do as you please. If the code for your classes is short, putting them all in your main script is fine. If they're longish, then splitting them out into separate files is a useful organizing technique (that has the added benefit of the code in them not getting recompiled into byte-code every time the script they are used in is run).
Putting them in modules also encourages their reuse since they're no longer mixed in with a lot of other unrelated stuff.
Lastly, they may be useful because modules are essentially singleton objects, meaning that there's only one instance of each in your program, which is created the first time it's imported. Later imports in other modules will just reuse the existing instance. This can be a nice way to do initialization that only has to be done once.
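A short sketch of the singleton behaviour, assuming a hypothetical module app_state_demo that is written to a temporary directory so the example runs on its own. The module body executes once, and every later import hands back the same object:

```python
import os
import sys
import tempfile

# Hypothetical module; its body runs only on the first import.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "app_state_demo.py"), "w") as handle:
    handle.write("import_events = ['initialized']\n")
sys.path.insert(0, tmp_dir)

import app_state_demo as first
import app_state_demo as second  # served from sys.modules, body not re-run

first.import_events.append("used")

is_same_object = first is second
events_seen = list(second.import_events)
print(is_same_object)
print(events_seen)
```

State changed through one reference is visible through the other, which is what makes a module a convenient home for one-time initialization.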
I created a module named util that provides classes and functions I often use in Python.
Some of them need imported features. What are the pros and the cons of importing needed things inside class/function definition? Is it better than import at the beginning of a module file? Is it a good idea?
It's the most common style to put every import at the top of the file. PEP 8 recommends it, which is a good reason to do it to start with. But that's not a whim, it has advantages (although not critical enough to make everything else a crime). It allows finding all imports at a glance, as opposed to looking through the whole file. It also ensures everything is imported before any other code (which may depend on some imports) is executed. NameErrors are usually easy to resolve, but they can be annoying.
There's no (significant) namespace pollution to be avoided by keeping the module in a smaller scope, since all you add is the actual module (no, import * doesn't count and probably shouldn't be used anyway). Inside functions, you'd import again on every call (not really harmful since everything is imported once, but uncalled for).
PEP8, the Python style guide, states that:
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Of course this is no hard and fast rule, and imports can go anywhere you want them to. But putting them at the top is the best way to go about it. You can of course import within functions or a class.
But note you cannot do this:
def foo():
    from os import *
Because:
SyntaxWarning: import * only allowed at module level
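The warning quoted above is Python 2 behaviour. Python 3 is stricter and rejects the same code with a SyntaxError at compile time. A small sketch that checks this without crashing the script, by compiling the snippet from a string (the variable names are illustrative):

```python
source = "def foo():\n    from os import *\n"

try:
    # Compiling is enough; the check happens before any code runs.
    compile(source, "<demo>", "exec")
    rejected = False
    message = ""
except SyntaxError as error:
    rejected = True
    message = str(error)

# Under Python 3, rejected is True; Python 2 only emits a SyntaxWarning.
print(rejected)
```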
Like flying sheep's answer, I agree that the others are right, but I put imports in other places like in __init__() routines and function calls when I am DEVELOPING code. After my class or function has been tested and proven to work with the import inside of it, I normally give it its own module with the import following PEP8 guidelines. I do this because sometimes I forget to delete imports after refactoring code or removing old code with bad ideas. By keeping the imports inside the class or function under development, I am specifying its dependencies should I want to copy it elsewhere or promote it to its own module...
Only move imports into a local scope, such as inside a function definition, if it’s necessary to solve a problem such as avoiding a circular import or are trying to reduce the initialization time of a module. This technique is especially helpful if many of the imports are unnecessary depending on how the program executes. You may also want to move imports into a function if the modules are only ever used in that function. Note that loading a module the first time may be expensive because of the one time initialization of the module, but loading a module multiple times is virtually free, costing only a couple of dictionary lookups. Even if the module name has gone out of scope, the module is probably available in sys.modules.
https://docs.python.org/3/faq/programming.html#what-are-the-best-practices-for-using-import-in-a-module
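A minimal sketch of a function-level import, with a made-up helper parse_date. The import statement runs on every call, but after the first one it amounts to a lookup in sys.modules and a local name binding:

```python
import sys


def parse_date(text):
    # Local import: cheap after the first call because the module is cached.
    import datetime
    return datetime.datetime.strptime(text, "%Y-%m-%d").date()


first = parse_date("2020-01-31")
cached = "datetime" in sys.modules  # the module stays cached after the call
second = parse_date("2021-02-28")
print(first.year)
print(second.month)
```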
I believe that it's best practice (according to some PEPs) to keep import statements at the beginning of a module. You can also add import statements to a package's __init__.py file; this exposes the imported names on the package itself, but it does not import them into the other modules of the package, which still need their own imports.
So...it's certainly something you can do the way you're doing it, but it's discouraged and actually unnecessary.
While the other answers are mostly right, there is a reason why python allows this.
It is not smart to import redundant stuff which isn’t needed. So, if you want to e.g. parse XML into an element tree, but don’t want to use the slow builtin XML parser if lxml is available, you would need to check this the moment you need to invoke the parser.
And instead of memorizing the availability of lxml at the beginning, I would prefer to try importing and using lxml and, if it's not there, fall back to the builtin xml module.
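One way this fallback might look, as a sketch: the helper name parse_xml and the sample document are assumptions for illustration. lxml.etree and xml.etree.ElementTree share the small part of the API used here, so the rest of the code does not need to know which one was loaded:

```python
def parse_xml(text):
    """Parse with lxml when it is installed, else with the standard library."""
    try:
        from lxml import etree  # optional third-party dependency
    except ImportError:
        import xml.etree.ElementTree as etree
    return etree.fromstring(text)


root = parse_xml("<zoo><dog name='beagle'/><dog name='dalmatian'/></zoo>")
dog_names = [dog.get("name") for dog in root.findall("dog")]
print(dog_names)
```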
I'm writing an application for scientific data analysis and I'm wondering what's the best way to structure the code to avoid (or address) the circular import problem. Currently I'm using a mix of OO and procedural programming.
Other questions address this issue but in a more abstract way. Here I'm looking for a solution that is optimal in a more specific context.
I have a class Container defined in DataLib.py whose data consist in lists and/or arrays. With all methods and supporting functions DataLib.py is quite large (~1000 lines).
I have a second module SelectionLib.py (~400 lines) that contains only functions to "filter" the data in Container according to different criteria. These functions return new Container objects (with filtered data) and thus SelectionLib.py needs to import Container from DataLib.py. Note that, logically, these functions are "methods" for "Container", they are just implemented using python functions.
Now, I want to add some high level methods to Container so that a complex analysis can be performed with a single function or method call. And by "complex analysis" I mean an arbitrary number of Container method calls, local functions (defined in DataLib.py) and filter functions (defined in SelectionLib.py).
So the problem is that DataLib.py needs to import SelectionLib.py to use the filter functions, but SelectionLib.py already imports DataLib.py.
Right now my hackish solution is to run the two files with run -i ... from IPython so it is like having a big single file and I avoid the circular import. But at the same time these scripts are difficult to integrate, for example in a GUI.
How do you suggest to solve this problem:
use pure OO and inheritance and split the object in 3: CoreContainer -> SelectionContainer -> HighLevelContainer
Restructuring the code (everything in one file?)
Some sort of Import trickery (put imports at the end)
Any feedback is appreciated!
If functions in SelectionLib are, as you say, "methods" for Container, it seems reasonable that DataLib imports SelectionLib, not the other way around.
Then the user code would just import DataLib. This would require some refactoring. One possibility to minimize the disruption to the user code would be to rename your existing DataLib and SelectionLib to _DataLib and _SelectionLib, and have a new DataLib to import the necessary bits from either (or both).
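A rough sketch of the facade layout, under stated assumptions: the file names (_data_lib, _selection_lib, data_lib), the filter select_positive and the helper positive_total are all hypothetical, and the files are written to a temporary directory only to keep the example self-contained. Imports flow in one direction, so no cycle arises:

```python
import os
import sys
import tempfile
import textwrap

# Hypothetical layout: two private modules plus one public facade.
SOURCES = {
    "_data_lib.py": """
        class Container(object):
            def __init__(self, values):
                self.values = list(values)
    """,
    "_selection_lib.py": """
        from _data_lib import Container

        def select_positive(container):
            return Container(v for v in container.values if v > 0)
    """,
    "data_lib.py": """
        from _data_lib import Container
        from _selection_lib import select_positive

        def positive_total(container):
            # High-level analysis combining core and filter code.
            return sum(select_positive(container).values)
    """,
}

tmp_dir = tempfile.mkdtemp()
for file_name, source in SOURCES.items():
    with open(os.path.join(tmp_dir, file_name), "w") as handle:
        handle.write(textwrap.dedent(source))
sys.path.insert(0, tmp_dir)

import data_lib  # user code only ever imports the facade

total = data_lib.positive_total(data_lib.Container([-1, 2, 3]))
print(total)
```

In this sketch the high-level analysis is a function in the facade rather than a Container method, which is one possible reading of the suggestion, not the only one.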
As an aside, it's better to follow the PEP-8 conventions and name your modules in lowercase_with_underscores.