I'm writing a python package and am wondering where the best place is to put constants?
I know you can create a file called 'constants.py' in the package and then access them as package.constants.const, but shouldn't there be a way to associate the constant with the whole package? E.g. you can write numpy.pi; how would I do something like that?
Also, where in the module is the best place to put paths to directories outside of the module where I want to read/write files?
Put them where you feel they can most easily be maintained. Usually that means in the module to which the constants logically belong.
You can always import the constants into the __init__.py file of your package to make it easier for someone to find them. If you did decide on a constants module, I'd add an __all__ sequence to state which values are public, then in the __init__.py file do:
from .constants import *
to make the same names available at the package level.
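A minimal sketch of that layout (package and constant names here are illustrative, and the path constant is just one common way to handle the directory question from above):

mypackage/constants.py :
from pathlib import Path

__all__ = ["MAX_RETRIES", "DATA_DIR"]

MAX_RETRIES = 5
# a directory outside the package for reading/writing files; adjust to your layout
DATA_DIR = Path(__file__).resolve().parent.parent / "data"

mypackage/__init__.py :
from .constants import *

Client code can then do import mypackage and refer to mypackage.MAX_RETRIES, much like numpy.pi.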
For years, I've known that the very definition of a Python module is as a separate file. In fact, even the official documentation states that "a module is a file containing Python definitions and statements". Yet, this online tutorial from people who seem pretty knowledgeable states that "a module usually corresponds to a single file". Where does the "usually" come from? Can a Python module consist of multiple files?
Not really.
Don't read too much into the phrasing of one short throwaway sentence, in a much larger blog post that concerns packaging and packages, both of which are by nature multi-file.
Imports do not make modules multifile
By the logic that modules are multifile because of imports, almost any Python module is multifile, unless the imports come from the module's own subtree, which makes no discernible difference to code using the module. That notion of subtree imports, by the way, is really about Python packages.
__module__, the attribute found on classes and functions, also maps to one file, as determined by import path.
The usefulness of expanding the definition of modules that way seems… limited, and risks confusion. Let imports be imports and modules be modules (i.e. files).
But that's like, my personal opinion.
Let's go all language lawyer on it
And refer to the Python tutorial. I figure they will be talking about modules at some point and will be much more careful in their wording than a blog post primarily concerned with another subject.
6. Modules
To support this, Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a module can be imported into other modules or into the main module (the collection of variables that you have access to in a script executed at the top level and in calculator mode).
A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended. Within a module, the module's name (as a string) is available as the value of the global variable __name__.
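(A quick illustration of that last point, assuming a file fibo.py exists somewhere on the import path:)

import fibo

print(fibo.__name__)   # 'fibo': the module name is the file name without .py
print(fibo.__file__)   # the single file backing the module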
p.s. OK, what about calling it a file, instead of a module, then?
That supposes that you store Python code in a file system. But you could have an exotic environment that stores it in a database instead (or embeds it in a larger C/Rust executable?). So "module" seems better understood as a "contiguous chunk of Python code". Usually that's a file, but having a separate term allows for flexibility, without changing anything about the core concepts.
Yup, a python module can include more than one file. Basically what you would do is get a file for the main code of the module you are writing, and in that main file include some other tools you can use.
For example, you can have the file my_splitter_module.py, in which you have, say, a function that takes a list of integers and splits it in half, creating two lists. Now say you want to multiply all the numbers in the first half with each other ([1, 2, 3] -> 1 * 2 * 3), but sum the numbers in the other half ([1, 2, 3] -> 1 + 2 + 3). Now say you don't want to make the code messy, so you decide to write another two functions: one that takes a list and multiplies its items, and another that sums them.
Of course, you could put the two functions in the same my_splitter_module.py file, but in other situations, when you have big files with big classes etc., you would rather make files like multiply_list.py and sum_list.py and then import them into my_splitter_module.py.
In the end, you would import my_splitter_module.py into your main.py file, and while doing this, you would also be importing the multiply_list.py and sum_list.py files.
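A rough sketch of that layout (the function names are my own; the answer only describes them):

multiply_list.py :
def multiply_items(numbers):
    # multiply all items of a list with each other
    result = 1
    for n in numbers:
        result *= n
    return result

sum_list.py :
def sum_items(numbers):
    # sum all items of a list
    return sum(numbers)

my_splitter_module.py :
from multiply_list import multiply_items
from sum_list import sum_items

def split_and_combine(numbers):
    half = len(numbers) // 2
    return multiply_items(numbers[:half]), sum_items(numbers[half:])

main.py :
from my_splitter_module import split_and_combine

print(split_and_combine([1, 2, 3, 4, 5, 6]))  # (6, 15)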
Yes, sure.
If you create a folder named mylib in a directory on your Python path (or in the same directory as your script), it allows you to use import mylib.
Make sure to put an __init__.py in the folder and, in it, import everything from the other files, because when you do import mylib only the names defined or imported in __init__.py are available on the package.
For example:
project -+- lib -+- __init__.py
         |       +- bar.py
         |       +- bar2.py
         |
         +- foo.py
__init__.py :
from .bar import test, random
from .bar2 import sample
foo.py :
import lib
print(lib.test)
lib.sample()
Hope it helps.
I am developing a rather complex application for my company following the object-oriented model in python3. The application contains several packages and sub-packages, each - of course - containing an __init__.py module.
I mostly used those __init__.py modules to declare generic classes for the respective package, which are intended to be used as abstract templates within that package only.
My question is now: Is this a "nice" / "correct" / "pythonic" way to use the __init__.py module(s)? Or shall I rather declare my generic classes somewhere else?
To give an example, let's assume a package mypkg:
mypkg.__init__.py:
class Foo(object):
    __some_attr = None

    def __init__(self, some_attr):
        self.__some_attr = some_attr

    @property
    def some_attr(self):
        return self.__some_attr

    @some_attr.setter
    def some_attr(self, val):
        self.__some_attr = val
mypkg.myfoo.py:
from . import Foo
class MyFoo(Foo):
    def __init__(self):
        super().__init__("this is my implemented value")

    def printme(self):
        print(self.some_attr)
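For what it's worth, a quick usage sketch of that example (assuming mypkg is importable):

from mypkg.myfoo import MyFoo

foo = MyFoo()
foo.printme()               # prints "this is my implemented value"
foo.some_attr = "changed"   # goes through the setter defined on Foo in __init__.py
foo.printme()               # prints "changed"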
It depends on what API you want to provide. For example, the collections module in the standard library defines all the classes in the __init__.py [1].
And the standard library should be pretty "pythonic", whatever that means.
However it provides mostly a "module-like" interface. It's quite rare to see:
import collections.abc
If you already have subpackages, you are probably better off introducing a new subpackage.
If, currently, the use of the package doesn't actually depend on subpackages, you might consider putting the code in the __init__.py. Or put the code in some private module and simply import the names inside __init__.py (this is what I'd prefer).
If you are only concerned with where it's best to put the abstract base classes, then, as shown above (collections.abc contains the abstract base classes of the collections package) and as you can see from the standard library's own abc module, it's common to define an abc.py submodule that contains them.
You may consider exposing them directly from the __init__.py doing:
from .abc import *
from . import abc
# ...
__all__ = list_of_names_to_export + abc.__all__
inside your __init__.py.
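A sketch of how that could look for the mypkg example from the question (the class here is illustrative, not the Foo from the question):

mypkg/abc.py :
from abc import ABC, abstractmethod

__all__ = ["Foo"]

class Foo(ABC):
    @abstractmethod
    def printme(self):
        ...

mypkg/myfoo.py :
from .abc import Foo

Note that a package-local abc.py does not shadow the standard library's abc module, since absolute imports inside the package still resolve to the top-level one (collections.abc coexists with it the same way).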
[1] The actual implementation used is in C, however: _collectionsmodule.c
You can always put everything into one source file. The reason for splitting more complex code into separate modules or packages is to separate the things that are mutually related from things that are unrelated. The separated things should be as independent as possible of the rest. And it applies at any level: data structures, functions, classes, modules, packages, applications.
The same rules should apply inside a module or inside a package. I agree with Bakuriu that __init__.py should be closely related to the package infrastructure, but not necessarily to the functionality of the module. My personal opinion is that the __init__.py should be as simple as possible. The reason is, firstly, that everything should be as simple as possible, but not simpler. Secondly, people reading the code of your package tend to think that way: they may not expect unexpected functionality in __init__.py. It would probably be better to create a generic.py inside the package for the purpose. It is easier to document the purpose of such a module (say, via its top docstring).
The better the separation is from the beginning, the better the independent features can be combined in the future. You get better flexibility, both for the usage of the module inside the package and for future modifications.
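For example, a sketch of that suggestion, reusing the mypkg example from the question:

mypkg/generic.py :
class Foo:
    def __init__(self, some_attr):
        self.some_attr = some_attr

mypkg/myfoo.py :
from .generic import Foo

class MyFoo(Foo):
    def __init__(self):
        super().__init__("this is my implemented value")

The __init__.py can then stay (almost) empty, or simply re-export selected names with from .generic import Foo.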
It is indeed possible to use __init__.py for specific module initialization, but I have never seen someone using it to define new functions. The only "correct" use I know of is what is discussed in this subject ....
Anyway, as you might have a complex application, you can use a separate file where you define all the functions you need instead of defining them directly in the __init__.py module. This will give you more readability, and it is easier to change afterwards.
Don't ever use __init__.py for anything except defining __all__. You will save so many lives if you avoid it.
Reason: it is common for developers to look at packages and modules. But there is a problem you can get stuck with sometimes. If you have a package, you assume that there are modules and code inside it. But you will rarely count __init__.py as one of them, because, let's face it, most of the time it is just a requirement to make the modules in a directory importable.
package
    __init__.py
    mod1.py
    humans.py
    cats.py
    dogs.py
    cars.py
    commons.py
Where should the class Family be located? It is a common class, and it depends on the others, because we can create a family of humans, dogs and cats, even cars!
By my logic, and the logic of my friends, it should be placed in a separate file, but I will try to find it in commons, then in humans, and then... I will be embarrassed, because I don't really know where it is!
Stupid example, huh? But it makes a point.
Let's say I have a couple of Python packages.
/package_name
    __init__.py
    /dohickey
        __init__.py
        stuff.py
        other_stuff.py
        shiny_stuff.py
    /thingamabob
        __init__.py
        cog_master.py
        round_cogs.py
        teethless_cogs.py
    /utilities
        __init__.py
        important.py
        super_critical_top_secret_cog_blueprints.py
What's the best way to utilize the utilities package? Say shiny_stuff.py needs to import important.py, what's the best way to go about that?
Currently I'm thinking
from .utilities import important
But is that the best way? Would it make more sense to add utilities to the path and import it that way?
import os
import sys

sys.path.append(os.path.dirname(os.path.dirname(__file__)))
import utilities.super_critical_top_secret_cog_blueprints
That seems clunky to add to each of my files.
I think the safest way is always to use absolute imports, so in your case:
from package_name.utilities import important
This way you won't have to change your code if you decide to move your shiny_stuff.py to some other package (assuming that package_name will still be in your sys.path).
According to Nick Coghlan (who is a Python core developer):
"“Never add a package directory, or any directory inside a package, directly to the Python path.” (Under the heading "The double import trap")
Adding the package directory to the path gives two separate ways for the module to be referred to. The link above is an excellent blog post about the Python import system. Adding it to the path directly means you can potentially end up with two copies of a single module, which you don't want. A relative import is fine (from inside dohickey it would be from ..utilities import important), and an absolute import import package_name.utilities.important is also fine.
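A rough sketch of the trap itself (hypothetical session, run from the directory that contains package_name):

import sys
sys.path.append("package_name")   # the thing the quote says not to do

import utilities.important                 # found via the appended path
import package_name.utilities.important    # found via the normal package path

# two distinct module objects now back the same file
print(utilities.important is package_name.utilities.important)   # False
print("utilities.important" in sys.modules)                      # True
print("package_name.utilities.important" in sys.modules)         # True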
A "best" out-of-context choice probably doesn't exist, but you can have some criteria choosing which is better for your use cases, and for such a judgment one should know are the different possible approaches and their characteristics. Probably the best source of information is the PEP 328 itself, which contains some rationale about declaring distinct possibilities for that.
A common approach is to use the "absolute import", in your case it would be something like:
from package_name.utilities import important
This way, you can also run this file as a script. It is somewhat independent from other modules and packages, tied mainly to its location. If you have a package structure and need to move one single module from its location, having absolute paths means this single file can be kept unchanged, but every module that imports it would have to change. Of course you can also import the __init__.py files, as:
from package_name import utilities
And these imports have the same characteristics. Be careful: utilities.important tries to find a name important inside __init__.py, not the important.py module, so having a from . import important in the __init__.py would help avoid a mistake due to the distinction between file structure and namespace structure.
Another way to do that is the relative approach, by using:
from ..utilities import important
The first dot (from .stuff import ___ or from . import ___) means "the module in this [sub]package", or the __init__.py itself when there's only the dot. Each further dot refers to a parent package. Generally, starting with dots in any import isn't allowed in a script/executable, but you can read about explicit relative imports (PEP 366) if you care about scripts with relative imports.
A justification for relative import can be found on the PEP 328 itself:
With the shift to absolute imports, the question arose whether relative imports should be allowed at all. Several use cases were presented, the most important of which is being able to rearrange the structure of large packages without having to edit sub-packages. In addition, a module inside a package can't easily import itself without relative imports.
In either case, the modules are tied to the subpackages in the sense that package_name is imported first no matter which module the user tried to import first, unless you use sys.path to search for subpackages as packages (i.e., put the package root inside sys.path)... but that sounds weird; why would one do that?
The __init__.py can pre-import names from its modules; for that, one should care about its namespace contents. For example, say important.py has an object called top_secret, which is a dictionary. To find it from anywhere you would need:
from package_name.utilities.important import top_secret
Perhaps you want to be less specific:
from package_name.utilities import top_secret
That would be done with an __init__.py with the following line inside it:
from .important import top_secret
That's perhaps mixing the relative and absolute import styles, but for an __init__.py you probably know whether the subpackage makes sense as a subpackage, i.e., as an abstraction by itself. If it's just a bunch of files located in the same place, with the need for an explicit module name, the __init__.py would probably be empty (or almost empty). But to avoid explicit module names for the user, the same idea can be applied in the root __init__.py, with:
from .utilities import top_secret
Completely indirect, but the namespace gets flat this way while the files stay nested for some internal organization. For example, the wx package (wxPython) does that: everything can be found with from wx import ___ directly.
You can also use some metaprogramming to find the contents if you want to follow this approach, for example using __all__ to detect all the names a module has, or looking at the file location to know which modules/subpackages are available there to import. However, some simpler code completion utilities might get lost when you do that.
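A sketch of one such metaprogramming approach (my own illustration, not from any of the packages mentioned): an __init__.py that imports every submodule sitting next to it using the standard library's pkgutil:

import importlib
import pkgutil

__all__ = []
for _finder, _name, _is_pkg in pkgutil.iter_modules(__path__):
    # import each sibling module and expose it on the package namespace
    _module = importlib.import_module(f".{_name}", __package__)
    globals()[_name] = _module
    __all__.append(_name)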
For some contexts you might have other kinds of constraints. For example, macropy makes some "magic" with imports and doesn't work on the file you call as a script, so you'll need at least 2 modules just to use this package.
Anyhow, you should always ask whether nesting into subpackages is really needed for your code or API organization. PEP 20 tells us that "flat is better than nested", which isn't a law but a point of view suggesting that you should keep a flat package structure unless nesting is needed for some reason. Likewise, you don't need a module for each class, or anything like that.
Use absolute imports in case you need to move the module to a different location.
How can I create nested modules (packages?) with the python c api?
I would like the client code (python) to be able to do something like this:
import MainModuleName
import MainModuleName.SubModuleName
Instead of currently:
import MainModuleName
import MainModuleNameSubModuleName
Which imo looks ugly and clutters up the namespace.
Is it possible without having to mess around with file system directories?
You do not "mess around" with file system directories. File system directories are how you create submodules, unless you want to be really obscure and have a lot of needless pain.
You want to have a module called MainModuleName.SubModuleName and then MainModuleName should be a directory with an __init__.py file.
A common way of structuring C modules is to put all the C code in modules with names starting with an underscore, in this case _mainmodulename.c, and then import them from Python files. This is done so that you only need to implement in C the things that have to be in C; the rest you can do in Python. You can also have pure-Python fallbacks that way. I suggest you do something similar: create the module structure in Python, and then import the classes and functions from the C modules with underscore names.
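A sketch of the layout being suggested (the names some_c_function and SomeCClass are illustrative, and the compiled C extension _mainmodulename is assumed to be built separately):

MainModuleName/
    __init__.py
    SubModuleName.py
    _mainmodulename.so   # the single, flat C extension module

MainModuleName/__init__.py :
from ._mainmodulename import some_c_function   # re-export the C-level names you need

MainModuleName/SubModuleName.py :
from ._mainmodulename import SomeCClass        # wrap or extend it in Python here

Client code can then do import MainModuleName and import MainModuleName.SubModuleName, as the question asks.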
I'm learning Python and I have been playing around with packages. I wanted to know the best way to define classes in packages. It seems that the only way to define classes in a package is to define them in the __init__.py of that package. Coming from Java, I'd kind of like to define individual files for my classes. Is this a recommended practice?
I'd like to have my directory look somewhat like this:
recursor/
__init__.py
RecursionException.py
RecursionResult.py
Recursor.py
So I could refer to my classes as recursor.Recursor, recursor.RecursionException, and recursor.RecursionResult. Is this doable or recommended in Python?
Go ahead and define your classes in separate modules. Then make __init__.py do something like this:
from .RecursionException import RecursionException
from .RecursionResult import RecursionResult
from .Recursor import Recursor
That will import each class into the package's root namespace, so calling code can refer to recursor.Recursor instead of recursor.Recursor.Recursor.
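For example, assuming the package is on the path, calling code can then simply do:

import recursor

r = recursor.Recursor()               # instead of recursor.Recursor.Recursor()
exc = recursor.RecursionException     # the other classes are reachable from the package root too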
I feel the need to echo some of the other comments here, though: Python is not Java. Rather than creating a new module for every class under the sun, I suggest grouping closely related classes into a single module. It's easier to understand your code that way, and calling code won't need a bazillion imports.
This is perfectly doable. Just create a new module for each of those classes, and create exactly the structure you posted.
You can also make a Recursion.py module or something similar, and include all 3 classes in that file.
(I'm also new to Python from Java, and I haven't yet put anything in my __init__.py files...)
In Python you're not restricted to defining one class per file, and few people do that. You can if you want to, though - it's totally up to you. A package in Python is just a directory with an __init__.py file. You don't have to put anything in that file, but you can, to control what gets imported, etc.