I have the script I want to run in the following structure
scripts/
├── project/
│   └── main.py
└── libraries/
    └── a.py
In main.py I need to import things from a.py. How can I import things from folders that sit two or more levels above main.py?
The proper way to handle this is to put everything that needs to know about each other under a shared package; the individual sub-packages and sub-modules can then be accessed through that package. This also requires moving the application's entrypoint to either the package itself, or to a module that sits next to the package in the directory and can import it. If moving the entrypoint is an issue, or something quick and dirty is required for prototyping, Python also implements a couple of other methods for affecting where imports search for modules, which can be found near the end of the answer.
For the package approach, let's say you have this structure and want to import something between the two modules:
.
├── bar_dir
│   └── bar.py
└── foo_dir
    └── foo.py
Currently, the two packages do not know about each other, because Python only adds the entrypoint file's parent (either bar_dir or foo_dir, depending on which file you run) to the import search path. We have to tell them about each other in some way; this is done through a top-level package they'll both share.
.
└── top_level
    ├── __init__.py
    ├── bar_dir
    │   ├── __init__.py
    │   └── bar.py
    └── foo_dir
        ├── __init__.py
        └── foo.py
This is the package layout we need, but to be able to use the package in imports, the top package has to be initialized.
If you only need to run the one file, you can do, for example, python -m top_level.bar_dir.bar, but a hidden entry point like that can be confusing to work with.
To avoid that, you can define the package as a runnable module by implementing a __main__.py file inside of it, next to __init__.py, which is run when doing python -m top_level. The new __main__.py entrypoint would then contain the actual code that runs the app (e.g. the main function), while the other modules would only have definitions.
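For illustration, a minimal sketch of what top_level/__main__.py could contain (the main body here is a placeholder, not part of the original example):

# top_level/__main__.py -- executed by `python -m top_level`
from top_level.foo_dir import foo  # definitions live in the sub-modules

def main():
    # Placeholder for the app's real startup code.
    print("running", foo.__name__)

if __name__ == "__main__":
    main()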
The __init__.py files are used to mark the directories as proper packages and are run when the package is imported, to initialize its namespace.
With this done, the packages can now see each other and can be accessed through either absolute or relative imports. An absolute import begins with the top_level package and uses the whole dotted path to the module/package we need to import, e.g. from top_level.bar_dir import bar can be used to import bar.
Packages also allow relative imports, which are a special form of a from-style import that begins with one or more dots, where each dot means the import goes up one package. For example, in the foo module, from . import module would attempt to import module from the foo_dir package, from .. import module would search for it in the top_level package, and so on.
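As a sketch, both styles side by side in the layout above (only the import lines; the modules' contents are left out):

# top_level/foo_dir/foo.py

# Absolute import: the full dotted path from the shared top-level package.
from top_level.bar_dir import bar

# Equivalent relative import: ".." climbs from foo_dir up to top_level.
from ..bar_dir import bar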
One thing to note is that importing a package doesn't initialize the modules under it unless they are imported explicitly. For example, only importing top_level won't make foo_dir and bar_dir available in its namespace, unless they're imported directly through import top_level.foo_dir/top_level.bar_dir, or the package's __init__.py adds them to the package's namespace through its own imports.
If this doesn't work in your structure, another way is to let Python know where to search for your modules by adding to its module search path. This can be done either at runtime, by inserting path strings into the sys.path list, or through the PYTHONPATH environment variable.
Continuing with the above example and importing bar from foo: an entry for the bar_dir directory (or the directory above it) can be added to the sys.path list or to the aforementioned environment variable. After that, import bar (or from bar_dir import bar, if the parent was added) can be used to import the module, just as if the two files were next to each other. The inserted path can also be relative, but that is prone to breakage with a changing cwd.
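A minimal sketch of the runtime approach, using the original bar_dir/foo_dir layout (quick and dirty, for prototyping only):

# foo_dir/foo.py
import os
import sys

# Build an absolute path to the directory containing bar_dir, so the
# entry keeps working even when the working directory changes.
sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir))

from bar_dir import bar  # now importable, as if the files were adjacent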
Related
I have a Python module that normally works as a stand-alone.
file1.py
file2.py
file3.py
However, I also want it to be part of a different project, in which the module is placed in a separate subdirectory.
__init__.py
build.py
└── compiler
    ├── __init__.py
    ├── file1.py
    ├── file2.py
    └── file3.py
Since the module's scripts use plenty of cross-imports, this is not possible. Once placed in a subdirectory, the imports no longer find the respective files, because Python looks in the top directory only.
To remedy the problem, I tried various things. I appended the subdirectory as an additional path in the top-most build.py script.
sys.path.append('compiler')
It did not solve the problem. Cross-imports are still not working.
I also tried relative imports, but that breaks the stand-alone version of the module. So I tried exception handling to catch the failing imports:
try:
    from file1 import TestClass
except ImportError:
    from .file1 import TestClass
That did not work either and, despite my best efforts, resulted in ImportError: attempted relative import with no known parent package errors.
I also tried all sorts of variations of these things, but none of it worked.
I know it has to be possible to do something like this and I am surprised that this is so hard to do. My Internet searches all came back with the same suggestions—the ones I outlined above, none of which worked in my case, particularly because they do not take into account the ability to run code as a stand-alone and as a sub-module.
I can't be the first person trying to write a module that can be used as a stand-alone package or as a sub-module in other projects. What am I missing here?
Relative imports, as the error tells you, require a parent package. Think of .file1 as shorthand for <this_module>.file1. If there is no <this_module>, you can't ask it for file1. In order to properly use relative imports you'll have to make a wrapper project to contain the shared module so it can be properly namespaced.
So your standalone module will instead look like this, matching the consumer:
__init__.py
standalone.py
└── compiler
    ├── __init__.py
    ├── file1.py
    ├── file2.py
    └── file3.py
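With that wrapper in place, the package-internal files use relative imports while the entrypoint imports through the package; a sketch reusing TestClass from the question:

# compiler/file2.py -- inside the package, cross-imports are relative
from .file1 import TestClass

# standalone.py -- outside the package, imports go through the package name
from compiler.file1 import TestClass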
The other option is to make your shared module truly installable with a setup.py or pyproject.toml or whatever your favorite method is. Then you install it in the consuming project instead of directly including it.
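As a minimal sketch of that option (the distribution name and version are placeholders; requires setuptools):

# setup.py
from setuptools import setup

setup(
    name="compiler",   # hypothetical distribution name
    version="0.1.0",
    packages=["compiler"],
)

After installing it into the consuming project (e.g. pip install -e . during development), import compiler works there without any path tweaks.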
I have a simple project structure like this:
➜ (venv:evernote) evernote_bear_project git:(master) ✗ tree | grep -v pyc
.
├── README.md
...
(snip)
...
├── manage.py
├── sample
│   ├── EDAMTest.py   <==== here is an import that won't work
│   └── enlogo.png
└── util
    ├── __init__.py
    ├── files.py      <====== This is being imported
    └── test_files.py
Now I have a relative import in sample/EDAMTest.py:
from ..util.files import *
When I try to run python sample/EDAMTest.py from project root folder in command line, I get an error saying:
ValueError: attempted relative import beyond top-level package
I know this has been asked many times, but I still don't get it.
Since I'm running the script from the project root, in my understanding Python should be able to "know" that when I try to import from ..util.files import *, that it should go up one directory, no?
EDIT
Thanks for all the answers.
So what I understand from the link above is this:
I was running the module sample/EDAMTest.py directly via python sample/EDAMTest.py, and that meant that the __name__ of the module was __main__ and that the path (sample/) became my "root", so to speak.
So now Python searches only this path, and any path below it, for modules/packages. Hence the error message attempted relative import beyond top-level package: it cannot go one more level up, because it is already at the root.
Also, Python cannot look one level "up", since the from ..util.files import * syntax does not go up a level in the directory tree, but in the list of modules/packages Python keeps on the "import search path" (sys.path)?
Is that correct?
sys.path.append() is a tweak for when your directory structure is fixed and there is nothing you can do about it.
Otherwise, you can try rearranging the folders. The easiest is moving util under sample; another option is making both of the folders part of a larger package.
Also, import * is not endorsed.
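A sketch of that tweak at the top of sample/EDAMTest.py, making the project root importable and switching to an absolute import:

# sample/EDAMTest.py
import os
import sys

# Put the project root (one level above sample/) on the search path.
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from util.files import *  # the question's import, made absolute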
The relative import syntax is for importing other modules from the same package, not from the file system.
Depending on what you want, you could...
Fix the package so that the relative import works
Put __init__.py files in the project root and sample directory and run the script from one level up. This doesn't seem like what you want.
Tell python where to find the package
Set the PYTHONPATH environment variable so that python can find the package.
PYTHONPATH=.:$PYTHONPATH python sample/EDAMTest.py
Install util so that python can find it
Add a setup script and use it to install the util package and avoid setting PYTHONPATH.
"The relative import syntax is for importing other modules from the same package, not from the file system.", This is right as stated by George G.
Put __init__.py in your subfolders, which will make them package.
__init__.py can be an empty file, but it is often used to perform setup needed for the package (import things, load things into the path, etc.).
However, you can import File into your __init__.py to make it available at the package level:
# in util/__init__.py
from .files import File

# now import File from the util package
from util import File
or if you want to import some specific method or class, you can do
from util.files import some_function
Another thing to do is to make the util modules available at the package level with the __all__ variable. When the interpreter sees an __all__ variable defined in an __init__.py, it imports the modules listed in the __all__ variable when you do:
from package import *
__all__ is a list containing the names of the modules that you want to be imported with import *, so looking at our above example again, if we wanted to import the submodules in util, the __all__ variable in util/__init__.py would be:
__all__ = ['files', 'test_files']
With the __all__ variable populated like that, when you perform
from util import *
it would import files and test_files.
I realize there are a slew of posts on SO related to Python and imports, but it seems like a fair number of them ask about import rules/procedures with respect to creating an actual Python package (vs. just a project with multiple directories and Python files). I am very new to Python and just need some basic clarification on what is and is not possible with regard to access/importing within the context of multiple .py files in a project directory.
Let's say you have the following project directory (to be clear, this is not a package that is somewhere on sys.path, but say, on your Desktop):
myProject/
├── __init__.py
├── scriptA.py
└── subfolder
    ├── __init__.py
    ├── scriptB.py
    └── subsubfolder
        ├── __init__.py
        ├── scriptC.py
        └── foo.py
Am I correct in understanding that, if scriptC.py is run directly via $ python scriptC.py from within the subsubfolder directory, the only way it can import and use methods or classes from scriptB.py is if I add the parent directory and path to scriptB.py to the Python path at runtime via sys.path?
It is possible, however, for scriptC.py to import foo.py or for scriptB.py to import scriptC.py or foo.py without dealing with sys.path, correct? Adjacent py files and py files in subdirectories are accessible just by using relative import paths, you just can't import python scripts that live in parent or sibling directories (without using sys.path) ?
What's Possible
Anything.
No, really. See the imp module, or the imputil module; take a look at how the zipimport module is written if you want some inspiration.
If you can get a string with your module's code in a variable, you can get a module into sys.modules using the above, and perhaps hack around with its contents using the ast module on the way.
A custom import hook that looks in parent directories? Well within the range of possibilities.
What's Best Practice
What you're proposing isn't actually good practice. The best-practice approach looks more like the following:
myProject/
├── setup.py
└── src/
    ├── moduleA.py
    └── submodule/
        ├── __init__.py
        ├── moduleB.py
        └── subsubmodule/
            ├── __init__.py
            └── moduleC.py
Here, the top of your project is always in myProject/src. If you use setup.py to configure moduleA:main, submodule.moduleB:main and submodule.subsubmodule.moduleC:main as entry points (perhaps named scriptA, scriptB and scriptC), then the functions named main in each of those modules would be invoked when the user ran the (automatically generated by setuptools) scripts so named.
With this layout (and appropriate setuptools use), your moduleC.py can absolutely import moduleA, or import submodule.moduleB.
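A sketch of that setup.py wiring (the names come from the layout above; version and other metadata are placeholders):

# setup.py
from setuptools import setup, find_packages

setup(
    name="myProject",        # placeholder metadata
    version="0.1.0",
    package_dir={"": "src"},
    packages=find_packages("src"),
    py_modules=["moduleA"],  # moduleA.py sits directly under src/
    entry_points={
        "console_scripts": [
            "scriptA = moduleA:main",
            "scriptB = submodule.moduleB:main",
            "scriptC = submodule.subsubmodule.moduleC:main",
        ],
    },
)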
Another approach, which doesn't involve entrypoints, is to invoke the code in your moduleC.py (while keeping the module's intended hierarchy intact, and assuming you're in a virtualenv where python setup.py develop has been run) like so:
python -m submodule.subsubmodule.moduleC
There's something that's bothering me about imports in packages.
Imagine I have the following directory structure:
pack
├── __init__.py
├── sub1
│   ├── __init__.py
│   └── mod1.py
└── sub2
    ├── __init__.py
    └── mod2.py
Inside mod1.py I have the following code to import mod2.py:
# mod1.py
import pack.sub2.mod2
pack.sub2.mod2.helloworld()
I have a main.py file in the directory containing pack that imports pack/sub1/mod1.py
How does mod1.py have access to pack? pack is not in the same directory as mod1.py. Does python automatically add the topmost package to sys.path?
You can investigate this by inspecting sys.path in an interactive interpreter. What you'll find is that the first element of it is the location of the script the interpreter was told to run. This means that when you run your script at the top level (the location of the pack package), that location is added to sys.path automatically. It doesn't have anything to do with the actual package structure, so if you ran mod1.py as a script you would have things break (this is probably why you put your script at the top level!).
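A quick way to see this for yourself:

# main.py -- next to the pack directory
import sys

# When run as `python main.py`, the first entry is main.py's directory,
# which is what makes `import pack` work from here.
print(sys.path[0])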
Note that in Python 2, you also have the issue of implicit relative imports, which doesn't impact the issue you're asking about, but might come up if you had a few more modules involved. If you added mod3.py to sub1, you could import it from mod1 with just import mod3, with the pack.sub1 prefix being figured out implicitly. This implicit behavior is generally considered a bad thing, and in Python 3 such implicit relative imports are not allowed (you can also disable them in Python 2 with from __future__ import absolute_import). To import pack.sub1.mod3 from pack.sub1.mod1 you'd need to either name it in full, or use an explicit relative import: from . import mod3
To tie this relative import business back to your question: if you wanted to avoid relying on pack being part of sys.path (or, more realistically, protect against changes to pack's name), you could modify your import of mod2 from mod1 to be an explicit relative import. Just use from ..sub2 import mod2.
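The relative version of mod1.py would then look like this:

# mod1.py -- the same call as before, via an explicit relative import
from ..sub2 import mod2

mod2.helloworld()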
In Python, a namespace package allows you to spread Python code among several projects. This is useful when you want to release related libraries as separate downloads. For example, with the directories Package-1 and Package-2 in PYTHONPATH,
Package-1/namespace/__init__.py
Package-1/namespace/module1/__init__.py
Package-2/namespace/__init__.py
Package-2/namespace/module2/__init__.py
the end-user can import namespace.module1 and import namespace.module2.
What's the best way to define a namespace package so more than one Python product can define modules in that namespace?
TL;DR:
On Python 3.3 you don't have to do anything, just don't put any __init__.py in your namespace package directories and it will just work. On pre-3.3, choose the pkgutil.extend_path() solution over the pkg_resources.declare_namespace() one, because it's future-proof and already compatible with implicit namespace packages.
Python 3.3 introduces implicit namespace packages, see PEP 420.
This means there are now three types of object that can be created by an import foo:
A module represented by a foo.py file
A regular package, represented by a directory foo containing an __init__.py file
A namespace package, represented by one or more directories foo without any __init__.py files
Packages are modules too, but here I mean "non-package module" when I say "module".
When you do import foo, Python first scans sys.path for a module or regular package. If it succeeds, it stops searching and creates and initializes the module or package. If it found no module or regular package, but it found at least one directory, it creates and initializes a namespace package.
Modules and regular packages have __file__ set to the .py file they were created from. Regular and namespace packages have __path__ set to the directory or directories they were created from.
When you do import foo.bar, the above search happens first for foo; then, if a package was found, the search for bar is done with foo.__path__ as the search path instead of sys.path. If foo.bar is found, foo and foo.bar are created and initialized.
So how do regular packages and namespace packages mix? Normally they don't, but the old pkgutil explicit namespace package method has been extended to include implicit namespace packages.
If you have an existing regular package that has an __init__.py like this:
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
... the legacy behavior is to add any other regular packages on the searched path to its __path__. But in Python 3.3, it also adds namespace packages.
So you can have the following directory structure:
├── path1
│   └── package
│       ├── __init__.py
│       └── foo.py
├── path2
│   └── package
│       └── bar.py
└── path3
    └── package
        ├── __init__.py
        └── baz.py
... and as long as the two __init__.py have the extend_path lines (and path1, path2 and path3 are in your sys.path) import package.foo, import package.bar and import package.baz will all work.
pkg_resources.declare_namespace(__name__) has not been updated to include implicit namespace packages.
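For reference, that legacy declaration is a one-liner in each regular package's __init__.py (it comes up again in a later answer):

# __init__.py -- legacy pkg_resources namespace declaration
__import__("pkg_resources").declare_namespace(__name__)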
There's a standard module, called pkgutil, with which you
can 'append' modules to a given namespace.
With the directory structure you've provided:
Package-1/namespace/__init__.py
Package-1/namespace/module1/__init__.py
Package-2/namespace/__init__.py
Package-2/namespace/module2/__init__.py
You should put those two lines in both Package-1/namespace/__init__.py and Package-2/namespace/__init__.py (*):
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
(* since, unless you state a dependency between them, you don't know which of them will be recognized first; see PEP 420 for more information)
As the documentation says:
This will add to the package's __path__ all subdirectories of directories on sys.path named after the package.
From now on, you should be able to distribute those two packages independently.
The setuptools documentation on namespace packages should be pretty self-explanatory.
In short, put the namespace code in __init__.py, update setup.py to declare a namespace, and you are free to go.
This is an old question, but someone recently commented on my blog that my posting about namespace packages was still relevant, so thought I would link to it here as it provides a practical example of how to make it go:
https://web.archive.org/web/20150425043954/http://cdent.tumblr.com/post/216241761/python-namespace-packages-for-tiddlyweb
That links to this article for the main guts of what's going on:
http://www.siafoo.net/article/77#multiple-distributions-one-virtual-package
The __import__("pkg_resources").declare_namespace(__name__) trick pretty much drives the management of plugins in TiddlyWeb and thus far seems to be working out.
You have your Python namespace concepts back to front: it is not possible in Python to put packages into modules. Packages contain modules, not the other way around.
A Python package is simply a folder containing an __init__.py file. A module is any other file in a package (or directly on the PYTHONPATH) that has a .py extension. So in your example you have two packages but no modules defined. If you consider that a package is a file system folder and a module is a file, then you see why packages contain modules and not the other way around.
So in your example assuming Package-1 and Package-2 are folders on the file system that you have put on the Python path you can have the following:
Package-1/
    namespace/
        __init__.py
        module1.py
Package-2/
    namespace/
        __init__.py
        module2.py
You now have one package, namespace, with two modules, module1 and module2. Unless you have a good reason, you should probably put both modules in one folder and have only that on the Python path, like below:
Package-1/
    namespace/
        __init__.py
        module1.py
        module2.py
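Either way, consuming code reaches both modules through the shared namespace package:

# with the package folders on sys.path / PYTHONPATH
import namespace.module1
import namespace.module2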