Since there are so many questions on relative imports, I will make it as short and sweet as possible. And yes, I've read "Relative imports for the billionth time".
I have a project structure like this:
.
├── Makefile
└── src
    ├── __init__.py
    ├── model
    │   └── train_model.py
    └── preprocessing
        └── process.py
where I want to be able to, as an example, call make preprocessing or make train, which then runs either process.py or train_model.py with
## Make train
train:
	python3 src/model/train_model.py
I.e. modules will always be run from the top project folder, where the Makefile lives.
Now, my problem is that I might have dependencies between different submodules, such as train_model.py and process.py. Specifically, if I try to import process in train_model.py using from src.preprocessing import process, I get an error: ImportError: No module named 'src'. In a similar vein, I've tried from ...preprocessing import process, which gives me another error: SystemError: Parent module '' not loaded, cannot perform relative import.
I use if __name__ == '__main__': at the end of my train_model.py, but I can't seem to figure out how Python uses __name__ to find different modules, and whether this messes something up in the process.
Use PYTHONPATH. I would do it this way:
Makefile:
export PYTHONPATH=$(abspath src)
train:
	python3 src/model/train_model.py
train_model.py:
from preprocessing import process
Now every import will first look under src. It is not conventional to write from src.preprocessing import process - typically imports are understood to be within some base directory (you wouldn't want to set PYTHONPATH to the directory above src, because it may contain things you don't want to import).
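The effect of that export line can be sketched with throwaway files: build the question's src/preprocessing/process.py in a scratch directory and start a child interpreter with PYTHONPATH pointing at src (the scratch paths and the MESSAGE variable are invented for the demo):

```python
import os
import subprocess
import sys
import tempfile

# Recreate the question's layout under a temporary directory.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "src", "preprocessing")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "process.py"), "w") as f:
    f.write("MESSAGE = 'loaded'\n")

# Run a child interpreter with PYTHONPATH=src, as the Makefile would.
env = dict(os.environ, PYTHONPATH=os.path.join(root, "src"))
result = subprocess.run(
    [sys.executable, "-c",
     "from preprocessing import process; print(process.MESSAGE)"],
    env=env, capture_output=True, text=True,
)
print(result.stdout.strip())  # loaded
```

Because PYTHONPATH is read at interpreter startup, the import works in the child process without any sys.path fiddling inside the code itself.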
This is the structure of my project
final
├── common
│   ├── __init__.py
│   ├── batch_data_processing.py
│   ├── processing_utility.py
│   └── post_processing.py
└── Processing_raw_data
    └── batch_process_raw_data.py
So I want to import from common.batch_data_processing in batch_process_raw_data.py,
but when I try it I get ModuleNotFoundError: No module named 'common'.
Is there a way to import this module without the need to install it?
Note: this is intended to be used by "non Python users".
Add the following code above your imports to extend the module search path:
import sys

# "./" is a relative path; it can also be replaced with the absolute
# path of the directory where common is located, e.g.:
# sys.path.append("C:\\Users\\Admin\\Desktop\\Final")
sys.path.append("./")
When all your scripts are in the same folder, imports rarely go wrong. If you need to import scripts from other folders, you can add their path using the method above.
I have the script I want to run in the following structure
scripts/
├── project/
│   └── main.py
└── libraries/
    └── a.py
In main.py I need to import things from a.py. How can I import things from folders that are two or more levels away from main.py?
The proper way to handle this is to put everything that needs to know about each other under a shared package; the individual sub-packages and sub-modules can then be accessed through that package. This also requires moving the application's entrypoint either into the package, or into a module that sits next to the package in the directory and can import it. If moving the entrypoint is an issue, or something quick and dirty is required for prototyping, Python also implements a couple of other methods for affecting where imports search for modules, which can be found near the end of this answer.
For the package approach, let's say you have this structure and want to import something between the two modules:
.
├── bar_dir
│   └── bar.py
└── foo_dir
    └── foo.py
Currently, the two modules do not know about each other, because Python only adds the entrypoint file's parent directory (either bar_dir or foo_dir, depending on which file you run) to the import search path. We have to tell them about each other in some way; this is done through the top-level package they'll both share.
.
└── top_level
    ├── __init__.py
    ├── bar_dir
    │   ├── __init__.py
    │   └── bar.py
    └── foo_dir
        ├── __init__.py
        └── foo.py
This is the package layout we need, but to be able to use the package in imports, the top_level package has to be initialized.
If you only need to run the one file, you can do, for example, python -m top_level.bar_dir.bar, but a hidden entrypoint like that can be confusing to work with.
To avoid that, you can define the package as a runnable module by implementing a __main__.py file inside of it, next to __init__.py, which is run when doing python -m top_level. The new __main__.py entrypoint would then contain the actual code that runs the app (e.g. the main function), while the other modules would only have definitions.
The __init__.py files are used to mark the directories as proper packages and are run when the package is imported, to initialize its namespace.
With this done, the packages can now see each other and can be accessed through either absolute or relative imports. An absolute import begins with the top_level package and uses the whole dotted path to the module/package we need to import, e.g. from top_level.bar_dir import bar can be used to import bar.
Packages also allow relative imports, a special form of from-style import that begins with one or more dots, where each dot means the import goes up one package. In the foo module, from . import module would attempt to import module from the foo_dir package, from .. import module would search for it in the top_level package, and so on.
One thing to note is that importing a package doesn't initialize the modules under it unless they are imported explicitly. For example, importing only top_level won't make foo_dir and bar_dir available in its namespace unless they're imported directly through import top_level.foo_dir / import top_level.bar_dir, or the package's __init__.py adds them to the package's namespace through its own imports.
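The __init__.py variant can be sketched end to end: build the example's top_level layout in a scratch directory, give the top __init__.py eager imports of its sub-packages, and check that they appear on the package (the scratch directory is an assumption; the package names follow the layout above):

```python
import os
import sys
import tempfile

# Build top_level/{foo_dir,bar_dir} in a throwaway directory.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "top_level")
os.makedirs(os.path.join(pkg, "foo_dir"))
os.makedirs(os.path.join(pkg, "bar_dir"))

# The top __init__.py eagerly pulls the sub-packages into its namespace.
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("from . import foo_dir, bar_dir\n")
for sub in ("foo_dir", "bar_dir"):
    open(os.path.join(pkg, sub, "__init__.py"), "w").close()

sys.path.insert(0, root)
import top_level

# Both sub-packages are now reachable without importing them explicitly.
print(hasattr(top_level, "foo_dir"), hasattr(top_level, "bar_dir"))  # True True
```

Without the eager imports in __init__.py, both hasattr checks would be False until the sub-packages were imported directly.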
If this doesn't work for your structure, another way is to let Python know where to search for your modules by adding to its module search path. This can be done either at runtime, by inserting path strings into the sys.path list, or through the PYTHONPATH environment variable.
Continuing with the above example and importing bar from foo: an entry for the bar_dir directory (or the directory above it) can be added to the sys.path list or to the aforementioned environment variable. After that, import bar (or from bar_dir import bar, if the parent was added) can be used to import the module, just as if the two files were next to each other. The inserted path can also be relative, but that is prone to breakage when the current working directory changes.
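A minimal sketch of the runtime variant, using a throwaway directory in place of bar_dir (the module name bar follows the example; the temp path and VALUE constant are invented for the demo):

```python
import os
import sys
import tempfile

# Create a throwaway directory with a bar.py module standing in for bar_dir.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "bar.py"), "w") as f:
    f.write("VALUE = 42\n")

# Prepending the directory to sys.path makes its modules importable,
# just as if bar.py sat next to this script.
sys.path.insert(0, tmp)
import bar

print(bar.VALUE)  # 42
```

Note that sys.path edits only affect the running interpreter; PYTHONPATH is the way to make the same thing happen for every invocation.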
I have a Python module that normally works as a stand-alone.
file1.py
file2.py
file3.py
However, I also want it to be part of a different project, in which the module is placed in a separate subdirectory.
__init__.py
build.py
└── compiler
    ├── __init__.py
    ├── file1.py
    ├── file2.py
    └── file3.py
Since the module scripts use plenty of cross-imports, this is not possible. Once placed in a subdirectory, the imports no longer find the respective files because it looks in the top directory only.
To remedy the problem, I tried various things. I appended the subdirectory as an additional path in the top-most build.py script.
sys.path.append('compiler')
It did not solve the problem. Cross-imports are still not working.
I also tried relative imports but that breaks the stand-alone version of the module. So, I tried exception handling to catch them
try:
    from file1 import TestClass
except ImportError:
    from .file1 import TestClass
That did not work either and, despite my best efforts, resulted in ImportError: attempted relative import with no known parent package errors.
I also tried all sorts of variations of these things, but none of it worked.
I know it has to be possible to do something like this, and I am surprised that it is so hard to do. My Internet searches all came back with the same suggestions: the ones I outlined above, none of which worked in my case, particularly because they do not take into account the ability to run code both as a stand-alone and as a sub-module.
I can't be the first person trying to write a module that can be used as a stand-alone package or as a sub-module in other projects. What am I missing here?
Relative imports, as the error tells you, require a parent package. Think of .file1 as shorthand for <this_module>.file1. If there is no <this_module>, you can't ask it for file1. In order to properly use relative imports you'll have to make a wrapper project to contain the shared module so it can be properly namespaced.
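The failure is easy to reproduce: run a file containing a leading-dot import as a plain script, and the import machinery has no parent package to resolve against, which is also why the try/except fallback in the question never helps (file1 here is the question's hypothetical sibling module):

```python
# When this file is executed directly, __package__ is empty, so the
# leading dot in "from . import file1" has nothing to be relative to
# and Python raises ImportError before it even looks for file1.
failed = False
try:
    from . import file1  # hypothetical sibling from the question
except ImportError as exc:
    failed = True
    print(exc)  # e.g. "attempted relative import with no known parent package"

print(failed)  # True when run as a top-level script
```

The same statement succeeds unchanged once the file is imported as part of a package, which is what the wrapper-project layout below provides.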
So your standalone module will instead look like this, matching the consumer:
__init__.py
standalone.py
└── compiler
    ├── __init__.py
    ├── file1.py
    ├── file2.py
    └── file3.py
The other option is to make your shared module truly installable with a setup.py or pyproject.toml or whatever your favorite method is. Then you install it in the consuming project instead of directly including it.
This question already has answers here:
Relative imports in Python 3
(31 answers)
Closed 6 years ago.
I read a lot of answers related to the question I am asking, but I still do not understand how to make possible this thing I am trying.
So let's go to the point. I will report a simplified version of my application.
Suppose I have a main folder called project and inside it a src main package containing three subpackages:
clustering (containing a file: clustering.py)
parser (containing a file: parser.py)
support_class (containing a file: myClass.py)
In each folder, except for the project one, there is an __init__.py.
Now, the Python scripts contained in the clustering and parser packages should both use the myClass.py contained in support_class.
I tried relative imports, but they do not work, because I would like to run the scripts contained in the clustering and parser packages directly, and I do not want to use the -m option.
E.g. python parser.py [arguments]
Example of relative import I used is:
from ..supportClass import myClass
I tried to add the package path to the sys.path but something is not working because it still tells me that it can't find the module.
E.g.
sys.path.insert(0, "~/project/src")
from support_class import myClass.py
Can anyone suggest the best way to do this in python 2.7?
If I could avoid the sys.path option it would be great because I do not like this solution that much.
Thanks in advance.
Let's start from your project's folder architecture:
MyProject/
└── src
├── clustering
│ ├── __init__.py
│ └── clustering.py
├── parser
│ ├── __init__.py
│ └── parser.py
├── support_class
│ ├── __init__.py
│ └── support.py
└── main.py
If I'm not mistaken, your issue is that you want to import support.py from within parser.py and clustering.py, and be able to run those two independently if needed. Two words for you:
Conditional imports
(And one more, after finding another real solution ;): PYTHONPATH)
With the assumption that your scripts have a if __name__ == "__main__": section to run your tests, you can simply have the following as their imports:
clustering.py & parser.py:
if __name__ == "__main__":
    import sys
    import os

    PACKAGE_PARENT = '..'
    SCRIPT_DIR = os.path.dirname(os.path.realpath(os.path.join(os.getcwd(), os.path.expanduser(__file__))))
    sys.path.append(os.path.normpath(os.path.join(SCRIPT_DIR, PACKAGE_PARENT)))

    from support_class.support import Support
else:
    from support_class.support import Support
main.py:
from support_class.support import Support
Then, python clustering.py and python parser.py to your heart's content!
Which makes this a duplicate of https://stackoverflow.com/a/16985066/3425488
First, you have to create an __init__.py file (two underscores before and after, no spaces) inside the directory where you have your actual package.
Second, you simply call your package from the Python script where you want to import it.
e.g.:
my_script.py          # your script where you want to include/import your package
my_clustering_dir/    # directory containing the files for the package
    my_clustering.py  # file inside my_clustering_dir
    __init__.py       # make sure this file is inside my_clustering_dir only (it can be empty)
Now, you can go to my_script.py. Then, you can add the following:
from my_clustering_dir import my_clustering #no extension (.py) needed
When you call a python script like this
python parser.py
That module is loaded as __main__, not parser.parser. It won't be loaded as part of any package, so it can't do relative imports. The correct way to do this is to create a separate script that is only used as the main script, and not use any of your module scripts as main scripts. For example, create a script like this:
main.py
from parser import parser
parser.main()
Then you can run python /path/to/main.py and it will work.
Also, a note on your package layout. In your example, parser and clustering and support_class aren't subpackages, they are top-level packages. Typically, if you have a package named project and you're using a src directory, your layout would look like this:
/project
    setup.py
    /src
        /project
            __init__.py
            /clustering
                __init__.py
            /parser
            ..
Alternatively, if you're building an actual python package with a setup.py script, you can use the console_scripts entry point in setuptools to generate the script automatically.
setup(
    ...
    entry_points={
        'console_scripts': ['myparser=project.parser:main'],
    },
    ...
)
I am struggling a bit to set up a working structure in one of my projects. The problem is that I have a main package and a subpackage, in a structure like this (I left out all unnecessary files):
code.py
mypackage/__init__.py
mypackage/work.py
mypackage/utils.py
The utils.py has some utility code that is normally only used in the mypackage package.
I normally have some test code in each module file that calls some methods of the current module and prints some things to quickly check if everything is working correctly. This code is placed in an if __name__ == "__main__" block at the end of the file. So I include utils.py directly via import utils. E.g. mypackage/work.py looks like:
import utils

def someMethod():
    pass

if __name__ == "__main__":
    print(someMethod())
But now when I use this module in the parent package (e.g. code.py) and I import it like this
import mypackage.work
I get the following error:
ImportError: No module named 'utils'
After some research I found out that this can be fixed by adding the mypackage/ folder to the PYTHONPATH environment variable, but this feels strange to me. Isn't there any other way to fix this? I have heard about relative imports, but the Python docs about modules mention:
Note that relative imports are based on the name of the current module. Since the name of the main module is always "__main__", modules intended for use as the main module of a Python application must always use absolute imports.
Any suggestions how I can have a if __name__ == "__main__" section in the submodule and also can use this file from the parent package without messing up the imports?
EDIT: If I use a relative import in work.py, as suggested in an answer, to import utils:
from . import utils
I get the following error:
SystemError: Parent module '' not loaded, cannot perform relative import
Unfortunately relative imports and direct running of submodules don't mix.
Add the parent directory of mypackage to your PYTHONPATH or always cd into the parent directory when you want to run a submodule.
Then you have two possibilities:
Use absolute (from mypackage import utils) instead of relative imports (from . import utils) and run them directly as before. The drawback with that solution is that you'll always need to write the fully qualified path, making it more work to rename mypackage later, among other things.
or
Run python3 -m mypackage.utils etc. to run your submodules instead of running python3 mypackage/utils.py.
This may take some time to adapt to, but it's the more correct way (a module in a package isn't the same as a standalone script) and you can continue to use relative imports.
There are more "magical" solutions involving __package__ and sys.path but they all require extra code at the top of every file with relative imports you want to run directly. I wouldn't recommend these.
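The second option can be sketched end to end: build the question's mypackage layout in a scratch directory and run a submodule with -m, so its relative import resolves (the scratch directory and the helper function are invented stand-ins):

```python
import os
import subprocess
import sys
import tempfile

# Recreate mypackage/{__init__,utils,work}.py in a throwaway directory.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "mypackage")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "utils.py"), "w") as f:
    f.write("def helper():\n    return 'from utils'\n")
with open(os.path.join(pkg, "work.py"), "w") as f:
    f.write("from . import utils\nprint(utils.helper())\n")

# The cwd must be the directory *containing* mypackage for -m to find it.
result = subprocess.run(
    [sys.executable, "-m", "mypackage.work"],
    cwd=root, capture_output=True, text=True,
)
print(result.stdout.strip())  # from utils
```

Running python mypackage/work.py from the same place would instead fail with the SystemError/ImportError from the question, because work.py would then have no parent package.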
You should create a structure like this:
flammi88
├── flammi88
│ ├── __init__.py
│ ├── code.py
│ └── mypackage
│ ├── __init__.py
│ ├── utils.py
│ └── work.py
└── setup.py
then put at least this in the setup.py:
import setuptools
from distutils.core import setup

setup(
    name='flammi88',
    packages=['flammi88'],
)
now, from the directory containing setup.py, run
pip install -e .
This will make the flammi88 package available in development mode. Now you can say:
from flammi88.mypackage import utils
everywhere. This is the normal way to develop packages in Python, and it solves all of your relative import problems. That being said, Guido frowns upon standalone scripts in sub-packages. With this structure I would move the tests inside flammi88/tests/... (and run them with e.g. py.test), but some people like to keep the tests next to the code.
Update:
an extended setup.py that describes external requirements and creates executables of the sub-packages you want to run can look like:
import setuptools
from distutils.core import setup

setup(
    name='flammi88',
    packages=['flammi88'],
    install_requires=[
        'markdown',
    ],
    entry_points={
        'console_scripts': [
            'work = flammi88.mypackage.work:someMethod',
        ],
    },
)
now, after pip installing your package, you can just type work at the command line.
Import utils inside work.py as follows:
import mypackage.utils
or if you want to use shorter name:
from mypackage import utils
EDIT: If you need to run work.py outside of the package, then try:
try:
    from mypackage import utils
except ImportError:
    import utils
Use:
from . import utils
as suggested by Peter
In your code.py you should use:
from mypackage import work