Python3 importlib.util.spec_from_file_location with relative path? - python

There are a lot of python questions about how to import relative packages or by explicit location (linked to two popular examples).
In addition there is always the documentation
Having read this, I am still not quite sure what specs are, how they are related to modules, and why one would need to tokenize it.
So for someone who doesn't understand, could you please try to explain how one would do this (programmatically, and what that means under the hood)?
e.g.
if I have
proj-dir
--mod1
--|--__init__.py
--|--class1.py
--mod2
--|--__init__.py
--|--class2.py
how do I import mod2 into mod1?
import sys
sys.path.insert(0, "../mod2")
this technically works, but I fear that it may cause issues in the future if I try to pickle objects and use them elsewhere...
The explicit location suggested
import importlib.util
spec = importlib.util.spec_from_file_location("module.name", "/path/to/file.py")
foo = importlib.util.module_from_spec(spec)
spec.loader.exec_module(foo)
foo.MyClass()
so in this case I just do:
import importlib.util
spec = importlib.util.spec_from_file_location("mod2.class2", "../mod2/class2.py")
foo = importlib.util.module_from_spec(spec)
spec.loader.exec_module(foo)
foo.MyClass()
??
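A minimal sketch of the explicit-location approach from the question, made independent of the working directory. The helper name load_module_from_path and the flat module name "mod2_class2" are hypothetical, and the temporary directory only stands in for the real project tree; in a real project the path would more likely be built from Path(__file__).resolve().parent. Registering the module in sys.modules is what lets pickle find the class by module name within the same process.

```python
import importlib.util
import pickle
import sys
import tempfile
from pathlib import Path


def load_module_from_path(name, path):
    """Load the file at `path` as a module called `name` (hypothetical helper)."""
    # The spec records the module name, the file location and the loader to use.
    spec = importlib.util.spec_from_file_location(name, str(path))
    # Create an empty module object described by the spec.
    module = importlib.util.module_from_spec(spec)
    # Register it so later imports (and pickle) can find it by name.
    sys.modules[name] = module
    # Run the file's code inside the module's namespace.
    spec.loader.exec_module(module)
    return module


# Demo tree; stands in for proj-dir/mod2/class2.py.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "mod2" / "class2.py"
    target.parent.mkdir()
    target.write_text("class MyClass:\n    value = 42\n")

    foo = load_module_from_path("mod2_class2", target)
    loaded_value = foo.MyClass().value

    # Pickle stores the module name and class name, then looks them up on load.
    restored = pickle.loads(pickle.dumps(foo.MyClass()))
    round_trip_ok = isinstance(restored, foo.MyClass)
```

A pickle produced this way can only be loaded elsewhere if a module with the same name is importable (or registered) in that process too.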

First of all, let me clarify the difference between a Python module and a Python package so that we are both on the same page. ✌
A module is a single .py file that is imported under one import and used. ✔
import aModuleName
# Here 'aModuleName' is just a regular .py file.
Whereas a package is a collection of modules in directories that give a package hierarchy. A package contains a distinct __init__.py file. ✔
import aPackageName.aModuleName
# Here 'aPackageName' is a folder with an `__init__.py` file
# and 'aModuleName' is just a regular .py file.
Therefore, the correct version of your proj-dir would be something like this, ⤵
proj-dir
├── __init__.py
├── package1
│   ├── __init__.py
│   └── module1.py
└── package2
    ├── __init__.py
    └── module2.py
🔎 Notice that I've also added an empty __init__.py into the proj-dir itself which makes it a package too.
👍 Now, if you want to import any python object from module2 of package2 into module1 of package1, then the import statement in the file module1.py would be
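# A hyphen is not valid in a module name, so this assumes the directory is named proj_dir rather than proj-dir
from proj_dir.package2.module2 import object2
# if you were to import the entire module2 then,
from proj_dir.package2 import module2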
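The layout described above can be tried out with a small sketch. It assumes the top-level directory is given a valid identifier name such as proj_dir (a hyphen cannot appear in an import statement), builds the tree in a temporary directory, and puts the parent of proj_dir on sys.path so the absolute import resolves. The contents of module2.py are made up for the demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "proj_dir"
    # Each directory becomes a package by receiving an __init__.py file.
    for pkg in (root, root / "package1", root / "package2"):
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")

    (root / "package2" / "module2.py").write_text(
        "object2 = 'hello from module2'\n"
    )
    # module1 imports from its sibling package by absolute name.
    (root / "package1" / "module1.py").write_text(
        "from proj_dir.package2.module2 import object2\n"
    )

    # The directory that CONTAINS proj_dir must be on sys.path.
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        module1 = importlib.import_module("proj_dir.package1.module1")
        result = module1.object2
    finally:
        sys.path.remove(tmp)

print(result)
```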

Related

How can I use relative imports in Python to import a function in another directory

I have a directory structure with 2 basic python files inside separate directories:
├── package
│ ├── subpackage1
│ │ └── module1.py
└── subpackage2
    └── module2.py
module1.py:
def module1():
    print('hello world')
module2.py:
from ..subpackage1.module1 import module1
module1()
When running python3 module2.py I get the error: ImportError: attempted relative import with no known parent package
However when I run it with the imports changed to use sys.path.append() it runs successfully
import sys
sys.path.append('../subpackage1/')
from module1 import module1
module1()
Can anyone help me understand why this is and how to correct my code so that I can do this with relative imports?
To be considered a package, a Python directory has to include an __init__.py file. Since your module2.py file is not below a directory that contains an __init__.py file, it isn't considered to be part of a package. Relative imports only work inside packages.
UPDATE:
I only gave part of the answer you needed. Sorry about that. This business of running a file inside a package as a script is a bit of a can of worms. It's discussed pretty well in this SO question:
Relative imports in Python 3
The main take-away is that you're better off (and you're doing what Guido wants you to) if you don't do this at all, but rather move directly executable code outside of any module. You can usually do this by adding an extra file next to your package root dir that just imports the module you want to run.
Here's how to do that with your setup:
.
├── package
│   ├── __init__.py
│   ├── subpackage1
│   │   └── module1.py
│   └── subpackage2
│       └── module2.py
└── test.py
test.py:
import package.subpackage2.module2
You then run test.py directly. Because the directory containing the executed script is included in sys.path, this will work regardless of what the working directory is when you run the script.
You can also do basically this same thing without changing any code (you don't need test.py) by running the "script" as a module.
python3 -m package.subpackage2.module2
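The difference between running the file as a script and running it as a module can be reproduced with a short sketch. It recreates the package in a temporary directory (with __init__.py files added to every level, which is an assumption beyond the tree shown above) and launches both variants with the current interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "package"
    for directory in (pkg, pkg / "subpackage1", pkg / "subpackage2"):
        directory.mkdir()
        (directory / "__init__.py").write_text("")

    (pkg / "subpackage1" / "module1.py").write_text(
        "def module1():\n    print('hello world')\n"
    )
    (pkg / "subpackage2" / "module2.py").write_text(
        "from ..subpackage1.module1 import module1\nmodule1()\n"
    )

    # Run as a plain script: the file has no parent package, so the
    # relative import fails.
    as_script = subprocess.run(
        [sys.executable, str(pkg / "subpackage2" / "module2.py")],
        capture_output=True, text=True,
    )

    # Run as a module from the directory that contains `package`:
    # the full dotted name is known, so the relative import resolves.
    as_module = subprocess.run(
        [sys.executable, "-m", "package.subpackage2.module2"],
        cwd=tmp, capture_output=True, text=True,
    )

print(as_script.returncode, as_script.stderr.strip().splitlines()[-1])
print(as_module.stdout.strip())
```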
If you have to make what you're trying to do work, I think I'd take this approach:
import os, sys
sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
from subpackage1.module1 import module1
module1()
So you compute in a relative way where the root of the enclosing package is in the filesystem, you add that to the Python path, and then you use an absolute import rather than a relative import.
There are other solutions that involve extra tools and/or installation steps. I can't think why you could possibly prefer those solutions to the last solution I show.
By default, Python just considers a directory with code in it to be a directory with code in it, not a package/subpackage. To make it a package, you'll need to add an __init__.py file to each subdirectory, as well as one to the main package directory.
Adding the __init__.py files alone won't be enough, but you should still add them. You should also create a setup.py file next to your package directory. Your file tree would look like this:
├── setup.py
└── package
    ├── __init__.py
    ├── subpackage1
    │   ├── __init__.py
    │   └── module1.py
    └── subpackage2
        ├── __init__.py
        └── module2.py
This setup.py file could start off like this:
from setuptools import setup
setup(
    name='package',
    packages=['package'],
)
These configurations are enough to get you started. Then, in the root of your directory (the parent folder of package and setup.py), run pip install -e . in your terminal to install your package, named package, in development mode. Then you'll be able to navigate to package/subpackage2/ and execute python module2.py with your expected result, provided module2.py imports by absolute name (for example from package.subpackage1.module1 import module1), since a relative import still fails in a file that is run directly as a script. You could even execute python package/subpackage2/module2.py and it works.
The thing is, modules and packages don't work the same way they do in other programming languages. Without creating setup.py, if you were to create a program in your root directory, named main.py for example, then you could import modules from inside the package folder tree. But if you're looking to execute package\subpackage2\module2.py.
If you want relative imports without changing your directory structure and without adding a lot of boilerplate you could use my import library: ultraimport
It gives the programmer more control over their imports and lets you do file system based relative or absolute imports.
Your module2.py could then look like this:
import ultraimport
module1 = ultraimport('__dir__/../subpackage1/module1.py')
This will always work, no matter how you run your code or if you have any init files and independent of sys.path.

Python import from sibling directories

Disclaimer: after searching through tons of very similar feeds that in the end all turn out to solve a slightly different problem I guess I have to open a new question (although I am sure there exists an answer somewhere --> so point that out if you know it ;)
The problem: I am using Python 2, am building a project with this tree:
project
├── __init__.py
├── foo
│   ├── __init__.py
│   └── bar
│       └── __init__.py
├── notebooks
│   ├── __init__.py
│   └── skript.py
└── test
    ├── __init__.py
    └── foo
        ├── __init__.py
        └── bar
            ├── __init__.py
            └── file.py
Now I want to load test.foo.bar from within project/notebooks/skript.py. To do so, I put this in that script:
import sys
sys.path.append('../')
If I then run
import test.foo.bar # or: import test.foo
python tells me
ImportError: No module named foo.bar
(or ImportError: No module named foo respectively). Funnily, import test does not throw an error, but if I then do test.foo it throws an AttributeError: 'module' object has no attribute 'foo'.
So I wonder, what is going wrong here and how to fix it?
Edit
Also, I tried adding this to skript.py
import sys
import os
MYDIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(MYDIR,'../test'))
sys.path.append(os.path.join(MYDIR,'../test/foo'))
sys.path.append(os.path.join(MYDIR,'../test/foo/bar')) #I am not sure this is entirely needed
as was pointed out below. Still,
import test.foo.bar.file
or
from test.foo.bar import file
just yield
ImportError: No module named foo.bar
Same for
sys.path.append('../test/foo/bar')
import test.foo.bar.file
I still have no clue what's going wrong.
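One possible explanation for import test succeeding while test.foo fails: the standard library ships its own package named test, and sys.path.append puts the project directory at the end of the search path, so the standard-library package can be found first. The following sketch assumes Python 3 (the question uses Python 2, where importlib.util.find_spec is not available) and uses a hypothetical helper, describe_module_origin, to check where a name would be loaded from before importing it.

```python
import importlib.util


def describe_module_origin(name):
    """Return the file a top-level `import name` would load, or None.

    Hypothetical helper for diagnosing name clashes on sys.path.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# If this prints a path inside the Python installation rather than the
# project, the project's own `test` package is being shadowed.
print(describe_module_origin("test"))
print(describe_module_origin("json"))
```

If the name is shadowed, renaming the project's package (or installing it, as suggested in the answers) avoids the clash.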
Messing with sys.path is rarely a good idea.
Since your plan seems to be to use both foo and test from notebooks (that will probably just contain Jupyter notebooks), the cleanest solution would be to install foo and test as packages.
Remove the __init__.py from your top level directory and notebooks, since you will not want to import them. Then add a setup.py to your top level directory. Since your tests are specific to foo, you should either rename them foo_test or move them into foo itself.
A minimal setup.py would look like this
from setuptools import setup
setup(name='foo',
      version='0.1',
      description='description of foo',
      author='you',
      author_email='your#mail',
      packages=['foo', 'test_foo'])
Then you can simply pip install -e . in your top level directory and it will be installed into your current virtualenv. If you are not using virtualenvs, you should pip install --user -e .
It should work with
from test import foo
But you have to add a __init__.py to your project directory.
For Python 3 it would be:
from .test import foo
If you put a dot in front of the folder name, Python searches for it in the same package as the file you are working in.
Sorry for my bad english.
Are you using an IDE? If so add the path to the Python Interpreter inside the project Properties to all the primary packages (foo, test, notebooks). Otherwise try to explicitly add the bar package to the sys path like so
import sys
import os
MYDIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(MYDIR,'test'))
sys.path.append(os.path.join(MYDIR,'test/foo'))
sys.path.append(os.path.join(MYDIR,'test/foo/bar')) #I am not sure this is entirely needed

Intra-package reference of modules in sub-packages using dotted syntax

I have the following package structure:
.
└── package
    ├── __init__.py
    ├── sub_package_1
    │   ├── __init__.py
    │   └── main_module.py
    ├── sub_package_2
    │   ├── __init__.py
    │   └── some_module.py
    └── sub_package_3
        ├── __init__.py
        └── some_module.py
In package/sub_package_1/main_module.py I want to use both package/sub_package_2/some_module.py and package/sub_package_3/some_module.py. For this I want to use intra-package references. I know that I can use from ..sub_package_2 import some_module, but because the module names are identical I want to use dotted syntax such as sub_package_2.some_module.
Using from .. import sub_package_2 I obviously cannot access sub_package_2.some_module because sub_package_2 is a package. However I found out that using
from .. import sub_package_2
from ..sub_package_2 import some_module
I can access sub_package_2.some_module. Apparently the 2nd import adds some_module to sub_package_2 (checking dir(sub_package_2)).
My questions are:
Is there a way to use a single import instead of the two above?
Why does (in general) import package followed by from package import module add module to package? What is Python actually doing here?
1.
In the file __init__.py of sub_package_2 you write
from . import some_module
And in main_module.py you can write
from .. import sub_package_2
And the code sub_package_2.some_module should work now
2.
You can read more about how imports work in Python here: Importing Python Modules
from .. import sub_package_2 creates a reference to sub_package_2 in the current namespace. The package sub_package_2 behaves like a module now and is defined by its __init__.py file. If you wrote nothing in __init__.py, sub_package_2 won't know some_module.
from ..sub_package_2 import some_module creates a reference to the module some_module of the package sub_package_2 under the name some_module. It's something like some_module = sub_package_2.some_module. As a side effect, a reference to some_module is stored in sub_package_2 too, so now sub_package_2 knows the module some_module.
Important: you can use sub_package_2.some_module, but plain some_module works too. They are identical after your two imports.
And if you write in the __init__.py:
from . import some_module
some_module belongs to sub_package_2 automatically
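The attribute binding described here can be observed directly. The sketch below builds a throwaway package (the name demo_sub_package and its contents are made up for the demo) and checks the package object before and after the submodule is imported.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_sub_package"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")  # empty: imports nothing itself
    (pkg_dir / "some_module.py").write_text("ANSWER = 42\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        package = importlib.import_module("demo_sub_package")
        # Importing the package alone does not load its submodules.
        before = hasattr(package, "some_module")

        submodule = importlib.import_module("demo_sub_package.some_module")
        # Loading the submodule binds it as an attribute on the parent package.
        after = hasattr(package, "some_module")
        same_object = package.some_module is submodule
    finally:
        sys.path.remove(tmp)

print(before, after, same_object)
```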
For similar module names you can use as
from ..sub_package_2 import some_module as some_module_2
from ..sub_package_3 import some_module as some_module_3

How are absolute imports possible from within a subpackage?

There's something that's bothering me about imports in packages.
Imagine I have the following directory structure:
pack
├── __init__.py
├── sub1
│   ├── __init__.py
│   └── mod1.py
└── sub2
    ├── __init__.py
    └── mod2.py
Inside mod1.py I have the following code to import mod2.py:
# mod1.py
import pack.sub2.mod2
pack.sub2.mod2.helloworld()
I have a main.py file in the directory containing pack that imports pack/sub1/mod1.py
How does mod1.py have access to pack? pack is not in the same directory as mod1.py. Does python automatically add the topmost package to sys.path?
You can investigate this by inspecting sys.path in an interactive interpreter. What you'll find is that the first element of it is the location of the script the interpreter was told to run. This means that when you run your script at the top level (the location of the pack package), that location is added to sys.path automatically. It doesn't have anything to do with the actual package structure, so if you ran mod1.py as a script you would have things break (this is probably why you put your script at the top level!).
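The claim about the first element of sys.path can be checked with a small sketch. It writes a throwaway script (the file name main.py mirrors the question; everything else is made up for the demo), runs it from a different working directory, and compares the reported entry with the script's own directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "main.py"
    script.write_text("import sys\nprint(sys.path[0])\n")

    # Launch from another working directory to show that the entry
    # follows the script's location, not the current directory.
    completed = subprocess.run(
        [sys.executable, str(script)],
        cwd=tempfile.gettempdir(),
        capture_output=True, text=True,
    )
    reported = Path(completed.stdout.strip())
    same_dir = reported.resolve() == Path(tmp).resolve()

print(same_dir)
```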
Note that in Python 2, you also have the issue of implicit relative imports, which doesn't impact the issue you're asking about, but might come up if you had a few more modules involved. If you added mod3.py to sub1, you could import it from mod1 with just import mod3, with the pack.sub1 prefix being figured out implicitly. This implicit behavior is generally considered a bad thing, and in Python 3 such implicit relative imports are not allowed (you can also disable them in Python 2 with from __future__ import absolute_import). To import pack.sub1.mod3 from pack.sub1.mod1 you'd need to either name it in full, or use an explicit relative import: from . import mod3
To tie this relative import business back to your question, if you wanted to avoid relying on pack being part of sys.path (or, more realistically, protect against changes to pack's name), you could modify your import of mod2 from mod1 to be an explicit relative import. Just use from ..sub2 import mod2 (the form from .. import sub2.mod2 is a syntax error) and then call mod2.helloworld().

Importing a lot of modules from custom package

Say I have this structure.
MyApp
├── main.py
└── package
    ├── __init__.py
    ├── a.py
    ├── b.py
    ├── c.py
    ├── d.py
    ├── e.py
    ├── f.py
    ├── g.py
    ├── h.py
    ├── ...
    └── z.py
And in main.py I need to use all modules, from a.py to z.py
I'd like to know how I can import all those modules with one import statement.
So instead of doing
from package import a
from package import b
...
from package import z
I could just import the package and have all the modules ready.
Things I've tried
import package
a = package.a.A()
# AttributeError: 'module' object has no attribute 'a'
Now I know I could put a code in __init__.py to add all the modules to __all__, but from what I've read, we should avoid 'from package import *'
The reason for this is that the package might have an increasing number of modules, and I would like to avoid adding an import statement to the main code each time a module is created. Ideally I'd like to be able to just drop the module in the package and have it ready for use.
In __init__.py, you can:
import a, b, c, d...
and then the modules will be placed in the package namespace after you do import package. (In Python 3, write from . import a, b, c, d... instead, since implicit relative imports were removed.)
If you really want the names a, b, etc. in main.py's namespace, and have this happen with no effort, you can't really avoid from package import *; any other method of importing them all implicitly is going to be just as bad, since it involves polluting the namespace with names you don't explicitly import.
I would recommend not doing this. If you must, this is the method I've used in the past:
# __init__.py
import os
import re
PACKAGE = 'MyApp.package'
MODULE_RE = r"^.*\.py$"  # escape the dot so only names ending in ".py" match
for filename in os.listdir(os.path.dirname(__file__)):
    if not re.match(MODULE_RE, filename) or filename == "__init__.py":
        continue
    # a non-empty fromlist makes __import__ return the submodule itself
    imported_module = __import__('%s.%s' % (PACKAGE, filename[:-3]),
                                 {}, {},
                                 filename[:-3])
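A different way to get the same effect, sketched here with the standard library's pkgutil.iter_modules and importlib.import_module instead of os.listdir and __import__. This is a swapped-in technique, not the method from the answer above; the helper name import_submodules and the demo package are hypothetical. Inside a real __init__.py the same loop could be run over __path__ with __name__ as the prefix.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def import_submodules(package):
    """Import every direct submodule of `package`; return them by short name."""
    loaded = {}
    for info in pkgutil.iter_modules(package.__path__):
        full_name = f"{package.__name__}.{info.name}"
        loaded[info.name] = importlib.import_module(full_name)
    return loaded


# Demo package with three modules; stands in for MyApp/package.
with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_package"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    for letter in "abc":
        (pkg_dir / f"{letter}.py").write_text(f"NAME = {letter!r}\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        package = importlib.import_module("demo_package")
        modules = import_submodules(package)
        names = sorted(modules)
        # Each imported submodule is now reachable as package.<name>.
        has_attribute = hasattr(package, "a")
    finally:
        sys.path.remove(tmp)

print(names, has_attribute)
```

The caveats raised in this thread still apply: every module in the directory is executed on import, whether or not it is needed.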
What you propose is bad design practice, since you import everything rather than only what is required. In general it slows down program loading, so never do it unless it is really required. Also avoid initializing unused variables in modules, since that wastes time.
Here are two solutions, neither of which follows good design practice if used incorrectly.
Check this answer: Can someone explain __all__ in Python?
You could also use os.listdir(os.path.dirname(__file__)) to list all file names in the directory and __import__ to load them as modules.
BTW, this pattern can lead to serious security holes, since it loads anything found in the directory: file-creation permission alone is enough to break security.
This code is not very beautiful, but I think it is a nice workaround
import os
for i in os.listdir('package'):
    if i.endswith('.py') and not i.startswith('__'):
        exec('from package import ' + i[:-3])
