How are absolute imports possible from within a subpackage? - python

There's something that's bothering me about imports in packages.
Imagine I have the following directory structure:
pack
├── __init__.py
├── sub1
│   ├── __init__.py
│   └── mod1.py
└── sub2
    ├── __init__.py
    └── mod2.py
Inside mod1.py I have the following code to import mod2.py:
# mod1.py
import pack.sub2.mod2
pack.sub2.mod2.helloworld()
I have a main.py file in the directory containing pack that imports pack/sub1/mod1.py
How does mod1.py have access to pack? pack is not in the same directory as mod1.py. Does python automatically add the topmost package to sys.path?

You can investigate this by printing sys.path from your script. What you'll find is that its first element is the directory containing the script the interpreter was told to run (in an interactive session it is an empty string instead, which stands for the current directory). This means that when you run your script at the top level (the location of the pack package), that location is added to sys.path automatically. It doesn't have anything to do with the actual package structure, so if you ran mod1.py as a script, things would break (this is probably why you put your script at the top level!).
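As a quick check of the claim about the first sys.path entry, here is a minimal sketch. The temporary directory and the file name main.py are stand-ins chosen for illustration; it assumes a default interpreter configuration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "main.py"  # hypothetical top-level script
    # The script prints the first entry of its own sys.path.
    script.write_text("import sys\nprint(sys.path[0])\n")
    result = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True, text=True, check=True,
    )
    # Resolve both paths so symlinked temp directories compare equal.
    first_entry = Path(result.stdout.strip()).resolve()
    script_dir = Path(tmp).resolve()

print(first_entry == script_dir)
```

If the comparison holds, any package sitting next to main.py is importable by its absolute name from anywhere inside that package.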
Note that in Python 2, you also have the issue of implicit relative imports, which doesn't impact the issue you're asking about, but might come up if you had a few more modules involved. If you added mod3.py to sub1, you could import it from mod1 with just import mod3, with the pack.sub1 prefix being figured out implicitly. This implicit behavior is generally considered a bad thing, and in Python 3 such implicit relative imports are not allowed (you can also disable them in Python 2 with from __future__ import absolute_import). To import pack.sub1.mod3 from pack.sub1.mod1 you'd need to either name it in full, or use an explicit relative import: from . import mod3
To tie this relative import business back to your question, if you wanted to avoid relying on pack being part of sys.path (or, more realistically, protect against changes to pack's name), you could modify your import of mod2 from mod1 to be an explicit relative import. Just use from ..sub2 import mod2 and then call mod2.helloworld().
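The following sketch rebuilds the layout from the question in a temporary directory and runs main.py, with mod1.py using the relative form from ..sub2 import mod2. The body of helloworld() is made up for illustration, since the question does not show it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

files = {
    "pack/__init__.py": "",
    "pack/sub1/__init__.py": "",
    "pack/sub2/__init__.py": "",
    # Illustrative body; the original mod2.py is not shown in the question.
    "pack/sub2/mod2.py": "def helloworld():\n    print('hello from mod2')\n",
    # Explicit relative import: '..' climbs from pack.sub1 up to pack.
    "pack/sub1/mod1.py": "from ..sub2 import mod2\nmod2.helloworld()\n",
    "main.py": "import pack.sub1.mod1\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for rel_path, source in files.items():
        target = Path(tmp) / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)
    result = subprocess.run(
        [sys.executable, str(Path(tmp) / "main.py")],
        capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

print(output)
```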

Related

How can I use relative imports in Python to import a function in another directory

I have a directory structure with 2 basic Python files inside separate directories:
├── package
│   ├── subpackage1
│   │   └── module1.py
│   └── subpackage2
│       └── module2.py
module1.py:
def module1():
    print('hello world')
module2.py:
from ..subpackage1.module1 import module1
module1()
When running python3 module2.py I get the error: ImportError: attempted relative import with no known parent package
However when I run it with the imports changed to use sys.path.append() it runs successfully
import sys
sys.path.append('../subpackage1/')
from module1 import module1
module1()
Can anyone help me understand why this is and how to correct my code so that I can do this with relative imports?
To be considered a package, a Python directory has to include an __init__.py file. Since your module2.py file is not below a directory that contains an __init__.py file, it isn't considered to be part of a package. Relative imports only work inside packages.
UPDATE:
I only gave part of the answer you needed. Sorry about that. This business of running a file inside a package as a script is a bit of a can of worms. It's discussed pretty well in this SO question:
Relative imports in Python 3
The main take-away is that you're better off (and you're doing what Guido wants you to) if you don't do this at all, but rather move directly executable code outside of any package. You can usually do this by adding an extra file next to your package root dir that just imports the module you want to run.
Here's how to do that with your setup:
.
├── package
│   ├── __init__.py
│   ├── subpackage1
│   │   └── module1.py
│   └── subpackage2
│       └── module2.py
└── test.py
test.py:
import package.subpackage2.module2
You then run test.py directly. Because the directory containing the executed script is included in sys.path, this will work regardless of what the working directory is when you run the script.
You can also do basically this same thing without changing any code (you don't need test.py) by running the "script" as a module.
python3 -m package.subpackage2.module2
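To make the difference between the two ways of running the file concrete, this sketch builds the package in a temporary directory and runs module2.py both as a plain script and with -m. Adding __init__.py to the subpackages is my assumption here; the file contents mirror the question.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

files = {
    "package/__init__.py": "",
    "package/subpackage1/__init__.py": "",
    "package/subpackage1/module1.py": "def module1():\n    print('hello world')\n",
    "package/subpackage2/__init__.py": "",
    "package/subpackage2/module2.py": (
        "from ..subpackage1.module1 import module1\nmodule1()\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel_path, source in files.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)

    # Plain script run: no parent package is known, so the import fails.
    as_script = subprocess.run(
        [sys.executable, str(root / "package" / "subpackage2" / "module2.py")],
        capture_output=True, text=True,
    )
    # Module run, started from the directory that contains 'package'.
    as_module = subprocess.run(
        [sys.executable, "-m", "package.subpackage2.module2"],
        cwd=root, capture_output=True, text=True,
    )

print("script run failed:", as_script.returncode != 0)
print("module run printed:", as_module.stdout.strip())
```

Note that the -m form has to be started from the directory that contains package (or that directory has to be on sys.path some other way).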
If you have to make what you're trying to do work, I think I'd take this approach:
import os, sys
sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
from subpackage1.module1 import module1
module1()
So you compute in a relative way where the root of the enclosing package is in the filesystem, you add that to the Python path, and then you use an absolute import rather than a relative import.
There are other solutions that involve extra tools and/or installation steps. I can't think why you could possibly prefer those solutions to the last solution I show.
By default, Python treats a directory with code in it as just that, not as a package or subpackage. To make it a package, add an __init__.py file to each subpackage directory as well as to the main package directory.
Adding the __init__.py files is worthwhile, but it won't be enough on its own. You should also create a setup.py file next to your package directory. Your file tree would look like this:
├── setup.py
└── package
    ├── __init__.py
    ├── subpackage1
    │   ├── __init__.py
    │   └── module1.py
    └── subpackage2
        ├── __init__.py
        └── module2.py
This setup.py file could start off like this:
from setuptools import setup

setup(
    name='package',
    packages=['package'],
)
These configurations are enough to get you started. Then, in the root of your project (the folder that contains both package and setup.py), run pip install -e . in your terminal to install your package, named package, in development mode. After that you can navigate to package/subpackage2/ and run python module2.py, or run python package/subpackage2/module2.py from the root, and get the expected result, provided module2.py uses the absolute import from package.subpackage1.module1 import module1. A relative import still fails in a file that is run directly as a script, installed or not.
The thing is, modules and packages don't work the same way they do in other programming languages. Without setup.py, a program in your root directory, named main.py for example, could still import modules from inside the package folder tree. But if you want to execute package\subpackage2\module2.py directly, you need the installation step above.
If you want relative imports without changing your directory structure and without adding a lot of boilerplate you could use my import library: ultraimport
It gives the programmer more control over their imports and lets you do file system based relative or absolute imports.
Your module2.py could then look like this:
import ultraimport
module1 = ultraimport('__dir__/../subpackage1/module1.py')
This will always work, no matter how you run your code or if you have any init files and independent of sys.path.

Importing module from subfolder

This is the structure of my project
final
├── common
│   ├── __init__.py
│   ├── batch_data_processing.py
│   ├── processing_utility.py
│   └── post_processing.py
└── Processing_raw_data
    └── batch_process_raw_data.py
I want to import from common.batch_data_processing in batch_process_raw_data.py,
but when I try it I get ModuleNotFoundError: No module named 'common'.
Is there a way to import this module without needing to install it?
Note: this is intended to be used by "non-Python users".
Add the following code above your import code to indicate the path:
# The following is a relative path,
# it can also be replaced with the absolute path
# of the directory where common is located.
# sys.path.append("C:\\Users\\Admin\\Desktop\\Final")
import sys
sys.path.append("./")
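When all your scripts are in the same folder, imports rarely go wrong. If you need to import scripts from other folders, you can specify the path using the method above. Note that "./" is resolved against the current working directory, so this works only when the script is started from the folder that contains common.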
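A variant that does not depend on the working directory is to derive the path from the script's own location. The sketch below is one way to do that; the helper name add_ancestor_to_sys_path, the run() function and the temporary copy of the layout are all made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def add_ancestor_to_sys_path(script_file, levels_up):
    """Append an ancestor directory of script_file to sys.path.

    levels_up=0 is the script's own directory, 1 is its parent, and so on.
    (Hypothetical helper, not part of any library.)
    """
    root = Path(script_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        sys.path.append(str(root))
    return root


# Rebuild a small copy of the question's layout to try the helper out.
with tempfile.TemporaryDirectory() as tmp:
    final = Path(tmp) / "final"
    (final / "common").mkdir(parents=True)
    (final / "Processing_raw_data").mkdir()
    (final / "common" / "__init__.py").write_text("")
    (final / "common" / "batch_data_processing.py").write_text(
        "def run():\n    return 'processed'\n"  # illustrative content
    )
    script = final / "Processing_raw_data" / "batch_process_raw_data.py"
    script.write_text("")

    # One level above the script's folder is 'final', which contains 'common'.
    root = add_ancestor_to_sys_path(script, levels_up=1)
    module = importlib.import_module("common.batch_data_processing")
    result = module.run()
    sys.path.remove(str(root))

print(result)
```

Inside the real batch_process_raw_data.py, the call would be add_ancestor_to_sys_path(__file__, levels_up=1) placed above the import of common.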

Importing from above path on python for windows

I have the script I want to run in the following structure
scripts/
project/
main.py
libraries/
a.py
In main.py I need to import things from a.py. How can I import things in subfolders that are two or more folders above main.py?
The proper way to handle this would be putting everything that needs to know about each other under a shared package, then the individual sub-packages and sub-modules can be accessed through that package. But this will also require moving the application's entrypoint to either the package, or a module that's a sibling of the package in the directory and can import it. If moving the entrypoint is an issue, or something quick and dirty is required for prototyping, Python also implements a couple other methods for affecting where imports search for modules which can be found near the end of the answer.
For the package approach, let's say you have this structure and want to import something between the two modules:
.
├── bar_dir
│   └── bar.py
└── foo_dir
    └── foo.py
Currently, the two packages do not know about each other because Python only adds the entrypoint file's parent (either bar_dir or foo_dir depending on which file you run) to the import search path. We have to tell them about each other in some way, and this is done through the top level package they'll both share.
.
└── top_level
    ├── __init__.py
    ├── bar_dir
    │   ├── __init__.py
    │   └── bar.py
    └── foo_dir
        ├── __init__.py
        └── foo.py
This is the package layout we need, but to be able to use the package in imports, the top package has to be initialized.
If you only need to run the one file, you can do for example python -m top_level.bar_dir.bar but a hidden entry point like that could be confusing to work with.
To avoid that, you can define the package as a runnable module by implementing a __main__.py file inside of it, next to __init__.py, which is run when doing python -m top_level. The new __main__.py entrypoint would then contain the actual code that runs the app (e.g. the main function) while the other modules would only have definitions.
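A minimal sketch of the __main__.py approach, built in a temporary directory. The greet() function and the contents of __main__.py are illustrative assumptions, not taken from the answer.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

files = {
    "top_level/__init__.py": "",
    "top_level/bar_dir/__init__.py": "",
    # Definitions only; nothing runs on import.
    "top_level/bar_dir/bar.py": "def greet():\n    return 'hello from bar'\n",
    # Entry point that is executed by 'python -m top_level'.
    "top_level/__main__.py": (
        "from top_level.bar_dir.bar import greet\n"
        "\n"
        "def main():\n"
        "    print(greet())\n"
        "\n"
        "if __name__ == '__main__':\n"
        "    main()\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for rel_path, source in files.items():
        target = Path(tmp) / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)
    # Start from the directory that contains the top_level package.
    result = subprocess.run(
        [sys.executable, "-m", "top_level"],
        cwd=tmp, capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

print(output)
```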
The __init__.py files are used to mark the directories as proper packages and are run when the package is imported to initialize its namespace.
With this done, the packages can now see each other and can be accessed through either absolute or relative imports. An absolute import would begin with the top_level package and use the whole dotted path to the module/package we need to import, e.g. from top_level.bar_dir import bar can be used to import bar.
Packages also allow relative imports, which are a special form of a from-style import that begins with one or more dots, where each additional dot means the import goes up one package. From the foo module, from . import module would attempt to import module from the foo_dir package, from .. import module would search for it in the top_level package, etc.
One thing to note is that importing a package doesn't initialize the modules under it unless it's an explicit import of that module, for example only importing top_level won't make foo_dir and bar_dir available in its namespace unless they're imported directly through import top_level.foo_dir/top_level.bar_dir or the package's __init__.py added them to the package's namespace through its own import.
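The behaviour described here can be observed directly. This sketch creates empty top_level, foo_dir and bar_dir packages in a temporary directory (an assumption made so the example is self-contained) and checks the package namespace before and after an explicit submodule import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create empty packages: top_level, top_level.foo_dir, top_level.bar_dir.
    for pkg in ("top_level", "top_level/foo_dir", "top_level/bar_dir"):
        pkg_dir = Path(tmp) / pkg
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")

    sys.path.insert(0, tmp)
    try:
        top_level = importlib.import_module("top_level")
        # Importing only the package does not load its subpackages.
        before = hasattr(top_level, "foo_dir")
        # An explicit import loads foo_dir and binds it on the parent package.
        importlib.import_module("top_level.foo_dir")
        after = hasattr(top_level, "foo_dir")
    finally:
        sys.path.remove(tmp)

print(before, after)
```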
If this doesn't work in your structure, another way is to let Python know where to search for your modules by adding to its module search path. This can be done either at runtime by inserting path strings into the sys.path list, or through the PYTHONPATH environment variable.
Continuing with the example above and importing bar from foo, an entry for the bar_dir directory (or the directory above it) can be added to the sys.path list or the aforementioned environment variable. After that, import bar (or from bar_dir import bar if the parent was added) can be used to import the module, just as if they were next to each other. The inserted path can also be relative, but that is prone to breakage with a changing cwd.
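The PYTHONPATH variant can be sketched as follows. The file contents (a NAME constant in bar.py) are invented for illustration, and the environment variable is set only for the child interpreter.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    bar_dir = Path(tmp) / "bar_dir"
    foo_dir = Path(tmp) / "foo_dir"
    bar_dir.mkdir()
    foo_dir.mkdir()
    (bar_dir / "bar.py").write_text("NAME = 'bar'\n")  # illustrative content
    (foo_dir / "foo.py").write_text("import bar\nprint(bar.NAME)\n")

    # Directories listed in PYTHONPATH are added to the child's sys.path.
    env = dict(os.environ, PYTHONPATH=str(bar_dir))
    result = subprocess.run(
        [sys.executable, str(foo_dir / "foo.py")],
        env=env, capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

print(output)
```

Using an absolute path for the entry, as above, avoids the breakage with a changing working directory mentioned for relative paths.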

How to cross-import modules in subdirectory so they work as sub-module and as a stand-alone?

I have a Python module that normally works as a stand-alone.
file1.py
file2.py
file3.py
However, I also want it to be part of a different project, in which the module is placed in a separate subdirectory.
__init__.py
build.py
└── compiler
    ├── __init__.py
    ├── file1.py
    ├── file2.py
    └── file3.py
Since the module scripts use plenty of cross-imports, this is not possible. Once placed in a subdirectory, the imports no longer find the respective files because it looks in the top directory only.
To remedy the problem, I tried various things. I appended the subdirectory as an additional path in the top-most build.py script.
sys.path.append('compiler')
It did not solve the problem. Cross-imports are still not working.
I also tried relative imports but that breaks the stand-alone version of the module. So, I tried exception handling to catch them
try:
    from file1 import TestClass
except ImportError:
    from .file1 import TestClass
That did not work either and, despite my best efforts, resulted in ImportError: attempted relative import with no known parent package errors.
I also tried all sorts of variations of these things, but none of it worked.
I know it has to be possible to do something like this and I am surprised that this is so hard to do. My Internet searches all came back with the same suggestions—the ones I outlined above, none of which worked in my case, particularly because they do not take into account the ability to run code as a stand-alone and as a sub-module.
I can't be the first person trying to write a module that can be used as a stand-alone package or as a sub-module in other projects. What am I missing here?
Relative imports, as the error tells you, require a parent package. Think of .file1 as shorthand for <this_module>.file1. If there is no <this_module>, you can't ask it for file1. In order to properly use relative imports you'll have to make a wrapper project to contain the shared module so it can be properly namespaced.
So your standalone module will instead look like this, matching the consumer:
__init__.py
standalone.py
└── compiler
    ├── __init__.py
    ├── file1.py
    ├── file2.py
    └── file3.py
The other option is to make your shared module truly installable with a setup.py or pyproject.toml or whatever your favorite method is. Then you install it in the consuming project instead of directly including it.
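To see what "no known parent package" refers to, it can help to print __package__, the name Python uses to resolve the leading dot. The sketch below runs a stand-in file1.py both as a script and as part of the compiler package. The file content is made up, and the exact falsy value printed for the script run is, as far as I know, None on current CPython.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "compiler"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    # Stand-in for file1.py that only reports its parent package.
    (pkg / "file1.py").write_text("print(repr(__package__))\n")

    # Stand-alone run: the file is a top-level script with no parent package.
    as_script = subprocess.run(
        [sys.executable, str(pkg / "file1.py")],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # Run as part of the package, from the directory containing 'compiler'.
    as_module = subprocess.run(
        [sys.executable, "-m", "compiler.file1"],
        cwd=tmp, capture_output=True, text=True, check=True,
    ).stdout.strip()

print("as script:", as_script)
print("as module:", as_module)
```

Only in the second case is there a package name for .file1 to be resolved against, which is why the wrapper layout or an installed package makes the relative imports work.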

Can a Python script in a (sub)module import from upstream in its directory hierarchy?

I realize there are a slew of posts on SO related to Python and imports, but it seems like a fair number of these posts are asking about import rules/procedures with respect to creating an actual Python package (vs just a project with multiple directories and python files). I am very new to Python and just need some more basic clarification on what is and is not possible with regard to access/importing within the context of multiple py files in a project directory.
Let's say you have the following project directory (to be clear, this is not a package that is somewhere on sys.path, but say, on your Desktop):
myProject/
├── __init__.py
├── scriptA.py
└── subfolder
    ├── __init__.py
    ├── scriptB.py
    └── subsubfolder
        ├── __init__.py
        ├── scriptC.py
        └── foo.py
Am I correct in understanding that the only way scriptC.py could import and use methods or classes within scriptB.py if scriptC.py is run directly via $ python scriptC.py and from within the subsubfolder directory is if I add the parent directory and path to scriptB.py to the Python path at runtime via sys.path ?
It is possible, however, for scriptC.py to import foo.py or for scriptB.py to import scriptC.py or foo.py without dealing with sys.path, correct? Adjacent py files and py files in subdirectories are accessible just by using relative import paths, you just can't import python scripts that live in parent or sibling directories (without using sys.path) ?
What's Possible
Anything.
No, really. See the imp module and the imputil module (both are gone from current Python 3, where importlib takes their place), and take a look at how the zipimport module is written if you want some inspiration.
If you can get a string with your module's code in a variable, you can get a module into sys.modules using the above, and perhaps hack around with its contents using the ast module on the way.
A custom import hook that looks in parent directories? Well within the range of possibilities.
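As a small illustration of the "module from a string" idea, here is a sketch that uses types.ModuleType and exec rather than the older imp machinery. The module name string_module and its source are hypothetical.

```python
import sys
import types

# Source code held in a plain string (illustrative content).
source = "def greet():\n    return 'hello from a string'\n"

# Create an empty module object and execute the source in its namespace.
module = types.ModuleType("string_module")  # hypothetical module name
exec(compile(source, "<string_module>", "exec"), module.__dict__)

# Registering it in sys.modules makes a normal import statement find it.
sys.modules["string_module"] = module

import string_module

message = string_module.greet()
print(message)
```

This is meant to show that the import system is open to this kind of manipulation, not to suggest it as a way to organise a project.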
What's Best Practice
What you're proposing isn't actually good practice. The best-practice approach looks more like the following:
myProject/
├── setup.py
└── src/
    ├── moduleA.py
    └── submodule/
        ├── __init__.py
        ├── moduleB.py
        └── subsubmodule/
            ├── __init__.py
            └── moduleC.py
Here, the top of your project is always in myProject/src. If you use setup.py to configure moduleA:main, submodule.moduleB:main and submodule.subsubmodule.moduleC:main as entry points (perhaps named scriptA, scriptB and scriptC), then the functions named main in each of those modules would be invoked when the user ran the (automatically generated by setuptools) scripts so named.
With this layout (and appropriate setuptools use), your moduleC.py can absolutely import moduleA, or import submodule.moduleB.
Another approach, which doesn't involve entry points, is to invoke the code in your moduleC.py (keeping the module's intended hierarchy intact, and assuming you're in a virtualenv where python setup.py develop has been run) like so:
python -m submodule.subsubmodule.moduleC
