In Python, a namespace package allows you to spread Python code among several projects. This is useful when you want to release related libraries as separate downloads. For example, with the directories Package-1 and Package-2 in PYTHONPATH,
Package-1/namespace/__init__.py
Package-1/namespace/module1/__init__.py
Package-2/namespace/__init__.py
Package-2/namespace/module2/__init__.py
the end-user can import namespace.module1 and import namespace.module2.
What's the best way to define a namespace package so more than one Python product can define modules in that namespace?
TL;DR:
On Python 3.3 you don't have to do anything, just don't put any __init__.py in your namespace package directories and it will just work. On pre-3.3, choose the pkgutil.extend_path() solution over the pkg_resources.declare_namespace() one, because it's future-proof and already compatible with implicit namespace packages.
Python 3.3 introduces implicit namespace packages, see PEP 420.
This means there are now three types of object that can be created by an import foo:
A module represented by a foo.py file
A regular package, represented by a directory foo containing an __init__.py file
A namespace package, represented by one or more directories foo without any __init__.py files
Packages are modules too, but here I mean "non-package module" when I say "module".
First it scans sys.path for a module or regular package. If it succeeds, it stops searching and creates and initalizes the module or package. If it found no module or regular package, but it found at least one directory, it creates and initializes a namespace package.
Modules and regular packages have __file__ set to the .py file they were created from. Regular and namespace packages have __path__set to the directory or directories they were created from.
When you do import foo.bar, the above search happens first for foo, then if a package was found, the search for bar is done with foo.__path__as the search path instead of sys.path. If foo.bar is found, foo and foo.bar are created and initialized.
So how do regular packages and namespace packages mix? Normally they don't, but the old pkgutil explicit namespace package method has been extended to include implicit namespace packages.
If you have an existing regular package that has an __init__.py like this:
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
... the legacy behavior is to add any other regular packages on the searched path to its __path__. But in Python 3.3, it also adds namespace packages.
So you can have the following directory structure:
├── path1
│ └── package
│ ├── __init__.py
│ └── foo.py
├── path2
│ └── package
│ └── bar.py
└── path3
└── package
├── __init__.py
└── baz.py
... and as long as the two __init__.py have the extend_path lines (and path1, path2 and path3 are in your sys.path) import package.foo, import package.bar and import package.baz will all work.
pkg_resources.declare_namespace(__name__) has not been updated to include implicit namespace packages.
There's a standard module, called pkgutil, with which you
can 'append' modules to a given namespace.
With the directory structure you've provided:
Package-1/namespace/__init__.py
Package-1/namespace/module1/__init__.py
Package-2/namespace/__init__.py
Package-2/namespace/module2/__init__.py
You should put those two lines in both Package-1/namespace/__init__.py and Package-2/namespace/__init__.py (*):
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
(* since -unless you state a dependency between them- you don't know which of them will be recognized first - see PEP 420 for more information)
As the documentation says:
This will add to the package's __path__ all subdirectories of directories on sys.path named after the package.
From now on, you should be able to distribute those two packages independently.
This section should be pretty self-explanatory.
In short, put the namespace code in __init__.py, update setup.py to declare a namespace, and you are free to go.
This is an old question, but someone recently commented on my blog that my posting about namespace packages was still relevant, so thought I would link to it here as it provides a practical example of how to make it go:
https://web.archive.org/web/20150425043954/http://cdent.tumblr.com/post/216241761/python-namespace-packages-for-tiddlyweb
That links to this article for the main guts of what's going on:
http://www.siafoo.net/article/77#multiple-distributions-one-virtual-package
The __import__("pkg_resources").declare_namespace(__name__) trick is pretty much drives the management of plugins in TiddlyWeb and thus far seems to be working out.
You have your Python namespace concepts back to front, it is not possible in python to put packages into modules. Packages contain modules not the other way around.
A Python package is simply a folder containing a __init__.py file. A module is any other file in a package (or directly on the PYTHONPATH) that has a .py extension. So in your example you have two packages but no modules defined. If you consider that a package is a file system folder and a module is file then you see why packages contain modules and not the other way around.
So in your example assuming Package-1 and Package-2 are folders on the file system that you have put on the Python path you can have the following:
Package-1/
namespace/
__init__.py
module1.py
Package-2/
namespace/
__init__.py
module2.py
You now have one package namespace with two modules module1 and module2. and unless you have a good reason you should probably put the modules in the folder and have only that on the python path like below:
Package-1/
namespace/
__init__.py
module1.py
module2.py
Related
I have the script I want to run in the following structure
scripts/
project/
main.py
libraries/
a.py
In main.py I need to import things from a.py. How can I import things in subfolders that are two or more folders above main.py?
The proper way to handle this would be putting everything that needs to know about each other under a shared package, then the individual sub-packages and sub-modules can be accessed through that package. But this will also require moving the application's entrypoint to either the package, or a module that's a sibling of the package in the directory and can import it. If moving the entrypoint is an issue, or something quick and dirty is required for prototyping, Python also implements a couple other methods for affecting where imports search for modules which can be found near the end of the answer.
For the package approach, let's say you have this structure and want to import something between the two modules:
.
├── bar_dir
│ └── bar.py
└── foo_dir
└── foo.py
Currently, the two packages do not know about each other because Python only adds the entrypoint file's parent (either bar_dir or foo_dir depending on which file you run) to the import search path, so we have to tell them about each other in some way, this is done through the top level package they'll both share.
.
└── top_level
├── __init__.py
├── bar_dir
│ ├── __init__.py
│ └── bar.py
└── foo_dir
├── __init__.py
└── foo.py
This is the package layout we need, but to be able to use the package in imports, the top packagehas to be initialized.
If you only need to run the one file, you can do for example python -m top_level.bar_dir.bar but a hidden entry point like that could be confusing to work with.
To avoid that, you can define the package as a runnable module by implementing a __main__.py file inside of it, next to __init__.py, which is ran when doing python -m top_level. The new __main__.py entrypoint would then contain the actual code that runs the app (e.g. the main function) while the other modules would only have definitions.
The __init__.py files are used to mark the directories as proper packages and are ran when the package is imported to initialize its namespace.
With this done the packages now can see each other and can be accessed through either absolute or relative imports, an absolute import would being with the top_level package and use the whole dotted path to the module/package we need to import, e.g. from top_level.bar_dir import bar can be used to import bar.
Packages also allow relative imports which are a special form of a from-style import that begins with one or more dots, where each dot means the import goes up one package - from the foo module from . import module would attempt to import module from the foo_dir package, from .. import module would search for it in the top_level package etc.
One thing to note is that importing a package doesn't initialize the modules under it unless it's an explicit import of that module, for example only importing top_level won't make foo_dir and bar_dir available in its namespace unless they're imported directly through import top_level.foo_dir/top_level.bar_dir or the package's __init__.py added them to the package's namespace through its own import.
If this doesn't work in your structure, an another way is to let Python know where to search for your modules by adding to its module search path, this can be done either at runtime by inserting path strings into the sys.path list, or through the PYTHONPATH environment variable.
Continuing with the above example with a scenario and importing bar from foo, an entry for the bar_dir directory (or the directory above it) can be added to the sys.path list or the aforementioned environment variable. After that import bar (or from bar_dir import bar if the parent was added) can be used to import the module, just as if they were next to each other. The inserted path can also be relative, but that is prone to breakage with a changing cwd.
I'll shorten the notation. I have
PYTHONPATH=/path1/dir1:/path2/dir2
Structures:
/path1/dir1/
README
muggle.py
...
utils/
/path2/dir2/
__init__.py
utils/
__init__.py
pkg2/
__init__.py
mod2.py
dir1 has a module utils, but is not, itself a package: no __init__.py
dir2 has a module utils, and does have __init__.py
My boiler-plate code (before dir1 was part of the environment) has imports from dir2 of the form
from utils.pkg2.mod2 import func2
The problem comes in that I'm now adapting this code to call functions that import from utils in dir1; I cannot alter that part of the environment.
What can I do to make my code go for the dir2/utils module? Unfortunately, this also needs to be adaptable to Python 2.6.6 and later.
I have search existing questions on SO and elsewhere; all the answers I've found depend on some package "handle" that I do not have.
This import statement is incorrect:
from utils.pkg2.mod2 import func2
If it has ever worked correctly, that was relying on resolving with the current working directory, implicit relative imports in Python 2.x, or a manually munged PYTHONPATH / sys.path.
This is the type of import for which PEP8 said:
Implicit relative imports should never be used and have been removed in Python 3.
So what to do instead? sys.path should be augmented with top-level directories, not intra-package directories, i.e.:
PYTHONPATH=/path1/dir1:/path2
And change imports like this:
from dir2.utils.pkg2.mod2 import func2
Now the sub-package dir2.utils is namespaced from the top-level package utils.
I have a Python module that normally works as a stand-alone.
file1.py
file2.py
file3.py
However, I also want it to be part of a different project, in which the module is placed in a separate subdirectory.
__init.py__
build.py
└── compiler
└── __init__.py
└── file1.py
└── file2.py
└── file3.py
Since the module scripts use plenty of cross-imports, this is not possible. Once placed in a subdirectory, the imports no longer find the respective files because it looks in the top directory only.
To remedy the problem, I tried various things. I appended the subdirectory as an additional path in the top-most build.py script.
sys.path.append('compiler')
It did not solve the problem. Cross-imports are still not working.
I also tried relative imports but that breaks the stand-alone version of the module. So, I tried exception handling to catch them
try:
from file1 import TestClass
except ImportError:
from .file1 import TestClass
That did not work either and resulted, despite my best efforts in ImportError: attempted relative import with no known parent package errors.
I also tried all sorts of variations of these things, but none of it worked.
I know it has to be possible to do something like this and I am surprised that this is so hard to do. My Internet searches all came back with the same suggestions—the ones I outlined above, none of which worked in my case, particularly because they do not take into account the ability to run code as a stand-alone and as a sub-module.
I can't be the first person trying to write a module that can be used as a stand-alone package or as a sub-module in other projects. What am I missing here?
Relative imports, as the error tells you, require a parent package. Think of .file1 as shorthand for <this_module>.file1. If there is no <this_module>, you can't ask it for file1. In order to properly use relative imports you'll have to make a wrapper project to contain the shared module so it can be properly namespaced.
So your standalone module will instead look like this, matching the consumer:
__init.py__
standalone.py
└── compiler
└── __init__.py
└── file1.py
└── file2.py
└── file3.py
The other option is to make your shared module truly installable with a setup.py or pyproject.toml or whatever your favorite method is. Then you install it in the consuming project instead of directly including it.
I realize there are a slew of posts on SO related to Python and imports, but it seems like a fair number of these posts are asking about import rules/procedures with respect to creating an actual Python package (vs just a project with multiple directories and python files). I am very new to Python and just need some more basic clarification on what is and is not possible with regard to access/importing within the context of multiple py files in a project directory.
Let's say you have the following project directory (to be clear, this is not a package that is somewhere on sys.path, but say, on your Desktop):
myProject/
├── __init__.py
├── scriptA.py
└── subfolder
├── __init__.py
└── scriptB.py
└── subsubfolder
├── __init__.py
└── scriptC.py
└── foo.py
Am I correct in understanding that the only way scriptC.py could import and use methods or classes within scriptB.py if scriptC.py is run directly via $ python scriptC.py and from within the subsubfolder directory is if I add the parent directory and path to scriptB.py to the Python path at runtime via sys.path ?
It is possible, however, for scriptC.py to import foo.py or for scriptB.py to import scriptC.py or foo.py without dealing with sys.path, correct? Adjacent py files and py files in subdirectories are accessible just by using relative import paths, you just can't import python scripts that live in parent or sibling directories (without using sys.path) ?
What's Possible
Anything.
No, really. See the imp module, the the imputil module -- take a look at how the zipimport module is written if you want some inspiration.
If you can get a string with your module's code in a variable, you can get a module into sys.modules using the above, and perhaps hack around with its contents using the ast module on the way.
A custom import hook that looks in parent directories? Well within the range of possibilities.
What's Best Practice
What you're proposing isn't actually good practice. The best-practice approach looks more like the following:
myProject/
├── setup.py
└── src/
├── moduleA.py
└── submodule/
├── __init__.py
├── moduleB.py
└── subsubmodule/
├── __init__.py
└── moduleC.py
Here, the top of your project is always in myProject/src. If you use setup.py to configure moduleA:main, submodule.moduleB:main and submodule.subsubmodule.moduleC:main as entry points (perhaps named scriptA, scriptB and scriptC), then the functions named main in each of those modules would be invoked when the user ran the (automatically generated by setuptools) scripts so named.
With this layout (and appropriate setuptools use), your moduleC.py can absolutely import moduleA, or import submodule.moduleB.
Another approach, which doesn't involve entrypoints, to invoke the code in your moduleC.py (while keeping the module's intended hierarchy intact, and assuming you're in a virtualenv where python setup.py develop has been run) like so:
python -m submodule.subsubmodule.moduleC
I am just starting with python and have troubles understanding the searching path for intra-package module loads. I have a structure like this:
top/ Top-level package
__init__.py Initialize the top package
src/ Subpackage for source files
__init__.py
pkg1/ Source subpackage 1
__init__.py
mod1_1.py
mod1_2.py
...
pkg2/ Source subpackage 2
__init__.py
mod2_1.py
mod2_2.py
...
...
test/ Subpackage for unit testing
__init__.py
pkg1Test/ Tests for subpackage1
__init__.py
testSuite1_1.py
testSuite1_2.py
...
pkg2Test/ Tests for subpackage2
__init__.py
testSuite2_1.py
testSuite2_2.py
...
...
In testSuite1_1 I need to import module mod1_1.py (and so on). What import statement should I use?
Python's official tutorial (at docs.python.org, sec 6.4.2) says:
"If the imported module is not found in the current package (the package of which the current module is a submodule), the import statement looks for a top-level module with the given name."
I took this to mean that I could use (from within testSuite1_1.py):
from src.pkg1 import mod1_1
or
import src.pkg1.mod1_1
neither works. I read several answers to similar questions here, but could not find a solution.
Edit: I changed the module names to follow Python's naming conventions. But I still cannot get this simple example to work.
The module name doesn't include the .py extension. Also, in your example, the top-level module is actually named top. And finally, hyphens aren't legal for names in python, I'd suggest replacing them with underscores. Then try:
from top.src.pkg1 import mod1_1
Problem solved with the help of http://legacy.python.org/doc/essays/packages.html (referred to in a similar question). The key point (perhpas obvious to more experienced python developers) is this:
"In order for a Python program to use a package, the package must be findable by the import statement. In other words, the package must be a subdirectory of a directory that is on sys.path. [...] the easiest way to ensure that a package was on sys.path was to either install it in the standard library or to have users extend sys.path by setting their $PYTHONPATH shell environment variable"
Adding the path to "top" to PYTHONPATH solved the problem.To make the solution portable (this is a personal project, but I need to share it across several machines), I guess having a minimal initialization code in top/setup.py should work.