Real goal: I have a module that is common between two packages (say, bar and bar2). I want to use the exact same test files for both cases, so I want to change the test imports to not name the package explicitly. (Why? This can be useful during the process of extracting modules from a mega-package into separate packages.)
My idea was to add another module that imports a particular package and provides an "alias" for it. It almost worked, but I ran into a problem.
Initially I had:
# test.py:
from bar import some_function
If I do nothing magical, there will be two versions of test.py: one with from bar import some_function and another with from bar2 import some_function. I want to avoid this and have the test code files remain the same.
After I added indirection:
#foo.py:
import bar as baz
#test.py:
from .foo import baz # Works!
from .foo.baz import some_function # ModuleNotFoundError: No module named 'cur_dir.foo.baz'; 'cur_dir.foo' is not a package
I can make foo a package:
#foo/__init__.py:
import bar as baz
#test.py:
from .foo import baz # Works!
from .foo.baz import some_function # ModuleNotFoundError: No module named 'cur_dir.foo.baz'
The error changes a bit, but still remains.
I know that I can work around the problem by writing
# test.py:
from .foo import baz
some_function = baz.some_function
Is there any other way? I want my imports to be "normal".
Is there a way to create an "alias" for a package that can be used with the standard import mechanism?
The import statement only looks at actual modules and their paths, not at aliases inside the loaded modules. An actual module alias in Python's module registry, sys.modules, is required.
import sys
import os
sys.modules["os_alias"] = os # alias `os` to `os_alias`
import os_alias # alias import works now
from os_alias import chdir # even as from ... import ...
Once a module alias has been added to sys.modules, it is available for import in the entire application.
Note that module aliasing can lead to subtle bugs when submodules of aliased modules are loaded. Specifically, if the submodules are not aliased explicitly, separate versions are created that are not identical. This means that any tests based on object identity, including isinstance(original.submodule.someclass(), alias.submodule.someclass), will fail if the versions are mixed.
To avoid this, you must alias all submodules of any aliased package.
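A minimal sketch of that advice, using the standard-library json package as a stand-in. The helper name alias_package and the alias json_alias are made up for illustration. It only aliases submodules that are already loaded, so a submodule first imported through the alias afterwards would still be duplicated.

```python
import sys
import json
import json.decoder  # make sure the submodule is loaded before aliasing


def alias_package(original, alias):
    """Hypothetical helper: alias a package and its already-loaded submodules."""
    prefix = original + "."
    for name, module in list(sys.modules.items()):
        if name == original or name.startswith(prefix):
            # "json.decoder" becomes "json_alias.decoder", and so on
            sys.modules[alias + name[len(original):]] = module


alias_package("json", "json_alias")

from json_alias.decoder import JSONDecoder  # resolved through the alias

# Both names refer to the very same class object, so isinstance checks agree.
same_class = JSONDecoder is json.decoder.JSONDecoder
print(same_class)
```

Calling the helper again after more submodules have been imported would extend the aliasing to them as well.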
Related
In a Python project I would like to gather imports into a single file called common_imports.py in order to reduce the number of import statements in Python files.
Instead of writing
file1.py
import foo
import bar
import baz
[...]
file2.py
import foo
import bar
import baz
[...]
I would like to write
file1.py
from common_imports import *
[...]
file2.py
from common_imports import *
[...]
common_imports.py
import foo
import bar
import baz
However, this gives me a lot of pylint false positives.
I can disable pylint warnings in the common_imports.py file by adding a pylint disable comment. I can also disable the wildcard-import warning. Unfortunately, I can disable the unused-import warning only globally, not specifically for the imports that come from common_imports.py.
Does anybody have an idea how to get pylint on track?
Summarising my comments above into a proper answer:
TL;DR:
While the reusable code motive is commendable, it's not fit for purpose here. Listen to the linter, and save your hard-earned respect among your colleagues. :-)
Pythonic Viewpoint:
Don't
Why? Python convention, in all its organisational glory and documented structure, states that if you use a library in a module, import it in the module. Plain and simple.
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
-- PEP8 - Imports
At a lower level, the sys.modules dict, which tracks imports, will only import a library if it hasn’t been imported already. So from an efficiency point of view, there is no gain.
Maintainer's Viewpoint:
Don't
Why? If (when) the code is changed / optimised in a module, thus alleviating the need for a specific import … "remind me where I look to find where that library is imported? Oh ya, here. But this other module needs that import, but not this new library I’m using to optimise this code. Where should I import that? Ugh!!!"
You've lost the hard-earned respect of following maintainers.
I would like a way to detect if my module was executed directly, as in import module or from module import * rather than by import module.submodule (which also executes module), and have this information accessible in module's __init__.py.
Here is a use case:
In Python, a common idiom is to add import statements to a module's __init__.py file, so as to "flatten" the module's namespace and make its submodules accessible directly. Unfortunately, doing so can make loading a specific submodule very slow, as all other siblings imported in __init__.py will also execute.
For instance:
module/
    __init__.py
    submodule/
        __init__.py
        ...
    sibling/
        __init__.py
        ...
By adding to module/__init__.py:
from .submodule import *
from .sibling import *
It is now possible for users of the module to access definitions in submodules without knowing the details of the package structure (i.e. from module import SomeClass, where SomeClass is defined somewhere in submodule and exposed in its own __init__.py file).
However, if I now run submodule directly (as in import module.submodule, by calling python3 -m module.submodule, or even indirectly via pytest) I will also, unavoidably, execute sibling! If sibling is large, this can slow things down for no reason.
I would instead like to write module/__init__.py something like:
if __???__ == 'module':
    from .submodule import *
    from .sibling import *
Where __???__ gives me the fully qualified name of the import. Any similar mechanism would also work, although I'm mostly interested in the general case (detecting direct executing) rather than this specific example.
What is being asked for would, if it were actually possible, result in unpredictable behavior (in the sense of whether or not the flattened names are importable from module) once we consider how the import system actually works.
Hypothetically, suppose there were some __dunder__ that could disambiguate which import statement was used to import module/__init__.py (e.g. import module and from module import * vs. import module.submodule). In the first case, module may trigger the subsequent (slow) imports to produce a "flattened" version of the desired names, while in the latter case (import module.submodule) it will skip them, and thus module will not contain any of the "flattened" bindings.
To illustrate a bit more: normally one may import SiblingClass, defined in module.sibling, by simply doing from module import SiblingClass, because module/__init__.py executes the from .sibling import * statement that creates that binding. But if executing import module.submodule avoids that flattening import, we get the following scenario:
import module.submodule
# module.submodule gets imported
from module import SiblingClass
# ImportError will occur
Why is that? This is simply due to how Python imports a file: the source file is executed in its entirety once to bind imports, function definitions and class definitions to their names, and the resulting module is registered in sys.modules under its import name. Importing the module again will not execute the file again. Thus, if the from .sibling import * statement was not executed during the initial import (i.e. import module.submodule), it will not be executed during any subsequent import of the same module either, because the module object stored in sys.modules by the initial import is returned as-is. The only exception is a manual reload of the module, in which case its code is executed again.
You may verify this by putting a print statement into a file, importing the corresponding module to see the output produced, and observing that no further output is produced on subsequent imports of that module (related: What happens when a module is imported twice?).
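The check described above can be sketched as follows. The module name noisy_demo and the temporary directory are assumptions made for the demo; the output is captured only so that the number of executions can be counted.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Write a throwaway module (hypothetical name) whose body prints a message.
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "noisy_demo.py").write_text('print("module body executed")\n')
sys.path.insert(0, str(tmp_dir))
importlib.invalidate_caches()

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import noisy_demo             # first import: the body runs and prints
    import noisy_demo             # second import: served from sys.modules, silent
    importlib.reload(noisy_demo)  # explicit reload: the body runs again

# Two executions in total: the first import and the reload.
runs = captured.getvalue().count("module body executed")
print(runs)
```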
Effectively, the desired functionality as described in the question cannot be implemented in Python.
A related thread on this topic: How to only import sub module without exec __init__.py in the package
This is not a complete solution, but standalone py.test (ignore __init__.py files) proposes setting a global flag to detect when in test. This corrects the problem for tests at least, provided the concerned modules don't call each other.
I'm baffled by the importing dynamics in __init__.py.
Say I have this structure:
package
├── __init__.py
└── subpackage
    ├── __init__.py
    └── dostuff.py
I would like to import things in dostuff.py. I could do it like this: from package.subpackage.dostuff import thefunction, but I would like to remove the subpackage level in the import statement, so it would look like this:
from package.dostuff import thefunction
I tried putting this in package/__init__.py:
from .subpackage import dostuff
And what I don't understand is this:
# doing this works:
from package import dostuff
dostuff.thefunction()
# but this doesn't work:
from package.dostuff import thefunction
# ModuleNotFoundError: No module named 'package.dostuff'
Why is that, and how can I make from package.dostuff import thefunction work?
The only way I see to achieve what you intend would be to actually create a package/dostuff.py module and import everything you need into it, as in from .subpackage.dostuff import thefunction.
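As a rough, self-contained sketch of that layout: the snippet below builds the package in a temporary directory (an assumption made purely so the example can run anywhere) and adds the extra package/dostuff.py file that re-exports the function. The body of thefunction is invented for the demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the layout from the question in a scratch directory.
root = Path(tempfile.mkdtemp())
sub = root / "package" / "subpackage"
sub.mkdir(parents=True)
(root / "package" / "__init__.py").write_text("")
(sub / "__init__.py").write_text("")
(sub / "dostuff.py").write_text("def thefunction():\n    return 'done'\n")

# The extra module suggested above: a real file at package/dostuff.py
# that simply re-exports the name from the subpackage.
(root / "package" / "dostuff.py").write_text(
    "from .subpackage.dostuff import thefunction\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from package.dostuff import thefunction
from package.subpackage.dostuff import thefunction as original

# Both import paths hand back the same function object.
same_function = thefunction is original
print(same_function)
```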
The point is that when you use from .subpackage import dostuff in package/__init__.py, you do not rename the original module.
To be more explicit, here is an example of use with both your import and a package/dostuff.py file:
# We import the dostuff link from package
>>> from package import dostuff
>>> dostuff
<module 'package.subpackage.dostuff' from '/tmp/test/package/subpackage/dostuff.py'>
# We use our custom package.dostuff
>>> from package.dostuff import thefunction
>>> import package
>>> package.dostuff
<module 'package.dostuff' from '/tmp/test/package/dostuff.py'>
>>> from package import dostuff
>>> dostuff
<module 'package.dostuff' from '/tmp/test/package/dostuff.py'>
# The loaded function is the same
>>> dostuff.thefunction
<function thefunction at 0x7f95403d2730>
>>> package.dostuff.thefunction
<function thefunction at 0x7f95403d2730>
A clearer way of putting this is:
from X import Y only works when X is an actual module path.
Y on the contrary can be any item imported in this module.
This also applies to packages with anything being declared in their __init__.py. Here you declare the module package.subpackage.dostuff in package, hence you can import it and use it.
But if you try to use the module name as the target of a direct import (the X in from X import Y), it has to exist on the filesystem under that path.
Resources:
Python documentation about module management in the import system:
https://docs.python.org/3/reference/import.html#submodules.
Python import system search behavior:
https://docs.python.org/3/reference/import.html#searching
https://docs.python.org/3/glossary.html#term-qualified-name
https://docs.python.org/2.0/ref/import.html
I hope that makes it clearer
You can in fact fake this quite easily by fiddling with Python's sys.modules dict. The question is whether you do really need this or whether it might be good to spend a second thought on your package structure.
Personally, I would consider this bad style, because it applies magic to the module and package names and people who might use and extend your package will have a hard time figuring out what's going on there.
Following your structure above, add the following code to your package/__init__.py:
import sys
from .subpackage import dostuff
# This will be package.dostuff; just avoiding to hard-code it.
_pkg_name = f"{__name__}.{dostuff.__name__.rsplit('.', 1)[1]}"
if _pkg_name not in sys.modules.keys():
    dostuff.__name__ = _pkg_name # Will have no effect; see below
    sys.modules[_pkg_name] = dostuff
This imports the dostuff module from your subpackage to the scope of package, changes its module path and adds it to the imported modules. Essentially, this just copies the binding of your module to another import path where member memory addresses remain the same. You just duplicate the references:
import package
print(package.dostuff)
print(package.subpackage.dostuff)
print(package.dostuff.something_to_do)
print(package.subpackage.dostuff.something_to_do)
... yields
<module 'package.subpackage.dostuff' from '/path/package/subpackage/dostuff.py'>
<module 'package.subpackage.dostuff' from '/path/package/subpackage/dostuff.py'>
<function something_to_do at 0x1029b8ae8>
<function something_to_do at 0x1029b8ae8>
Note that
The module name package.subpackage.dostuff has not changed even though being updated in package/__init__.py
The function reference is the same: 0x1029b8ae8
Now, you can also go
from package.dostuff import something_to_do
something_to_do()
However, be cautious. Changing the imported modules during the import of a module might have unintended side effects (the order of updating sys.modules and importing other subpackages or submodules from within package might also be relevant). Usually, you buy extra work and extra complexity by applying this kind of "improvement". Better yet, set up a proper package structure and stick to it.
I have the following project structure:
project
|----app.py
|----package
     |---__init__.py
     |---module.py
     |---module2.py
     |---module3.py
     |---....
My __init__.py file currently is empty. In module.py I have a definition of a class:
class UsefulClass:
    ...
And in other modules similar definitions as well. My app.py looks like this:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from package.module import UsefulClass
from package.module2 import UsefulClass2
...
usefulclass = UsefulClass()
usefulclass2 = UsefulClass2()
....
My question is: how can I replace these from package.module... import UsefulClass statements? Even now, with only 4 modules defined, these imports are starting to look ugly. Can I import them in the __init__.py file and then just use import package in app.py? I have tried that and it gives me an error.
I am looking for a clean and elegant solution.
In Python 3:
package/__init__.py:
from .foo import bar
package/foo.py:
bar=0
app1.py:
import package
print(package.bar)
app2.py:
from package import bar
print(bar)
Either way, this prints 0, just as you want.
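For anyone who wants to try this without touching a real project, here is a small sketch that writes the two files above into a temporary directory (an assumption made for the demo) and exercises both import styles.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create package/__init__.py and package/foo.py in a scratch directory.
root = Path(tempfile.mkdtemp())
pkg = root / "package"
pkg.mkdir()
(pkg / "foo.py").write_text("bar = 0\n")
(pkg / "__init__.py").write_text("from .foo import bar\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import package            # style used in app1.py
from package import bar   # style used in app2.py

print(package.bar)
print(bar)
```

Both statements print 0, because the relative import in __init__.py binds bar in the package's own namespace.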
In Python 2, just change from .foo import bar to from foo import bar.
(In fact, the 2.x code often works in Python 3, but it's not correct, and in some edge cases it will fail. For example, if you have a bar.py at the same level as the app, you'll end up with bar being that module, instead of 0.)
In real life, you probably want to specify a __all__ in each package and module that you might ever from foo import … (if for no other reason than to allow testing things at the interactive interpreter with from foo import *).
It sounds like you're saying you already tried this, and got an error. Without knowing exactly what you tried, and what the error was, and which Python version you're using, I have no idea what in particular you might have gotten wrong, but presumably you got something wrong.
The .foo specifies a package-relative import. Saying from .foo import bar means "from the foo module in the same package as me, import bar". If you leave off the dot, you're saying "from the foo module in the standard module path, import bar".
The tutorial section on Intra-package References (and surrounding sections) gives a very brief explanation. The reference docs for import and the import system in general give most of the details, but the original proposal in PEP 328 explains the details, and the rationale behind the design, a lot more simply.
The reason you need to leave off the dot in 2.x is that 2.x didn't have any way to distinguish relative and absolute imports. There's only from foo import bar, which means "from the foo module of the same package as me, or, if there is no such module, the one in the standard module path, import bar".
I was trying out one of the Python standard library modules, let's call it foo.bar.baz.
So I wrote a little script starting with
import foo.bar.baz
and saved it as foo.py.
When I executed the script I got an ImportError. It took me a while (I'm still learning Python), but I finally realized the problem was how I named the script. Once I renamed foo.py to something else, the problem went away.
So I understand that the import foo statement will look for the script foo.py before looking for the standard library foo, but it's not clear to me what it was looking for when I said import foo.bar.baz. Is there some way that foo.py could have the content for that statement to make sense? And if not, why didn't the Python interpreter move on to look for a directory hierarchy like foo/bar with the appropriate __init__.py files?
An import statement like import foo.bar.baz first imports foo, then asks it for bar, and then asks foo.bar for baz. Whether foo will, once imported, be able to satisfy the request for bar or bar.baz is unimportant to the import of foo. It's just a module. There is only one foo module. Both import foo and import foo.bar.baz will find the same module -- just like any other way of importing the foo module.
There is actually a way for foo to be a single module, rather than a package, and still be able to satisfy a statement like import foo.bar.baz: it can add "foo.bar" and "foo.bar.baz" to the sys.modules dict. This is exactly what the os module does with os.path: it imports the right "path" module for the platform (posixpath, ntpath, os2path, etc), and assigns it to the path attribute. Then it does sys.modules["os.path"] = path to make that module importable as os.path, so a statement like import os.path works. There isn't really a reason to do this -- os.path is available without importing it as well -- but it's possible.
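A short sketch of both points. The first part only inspects what os has already registered. The second part uses made-up, empty module objects (created with types.ModuleType, with an invented answer attribute) to stand in for a single-file foo that registers its own "submodules".

```python
import os
import sys
import types

# os registers its platform-specific path module under the name "os.path".
print(sys.modules["os.path"] is os.path)
print(os.path.__name__)  # e.g. 'posixpath' or 'ntpath', depending on platform

# The same trick with hypothetical module objects standing in for foo.
foo = types.ModuleType("foo")
bar = types.ModuleType("foo.bar")
baz = types.ModuleType("foo.bar.baz")
baz.answer = 42          # invented attribute, just to have something to read
foo.bar = bar            # attribute access foo.bar must work as well
bar.baz = baz
sys.modules["foo"] = foo
sys.modules["foo.bar"] = bar
sys.modules["foo.bar.baz"] = baz

import foo.bar.baz       # satisfied entirely from sys.modules

print(foo.bar.baz.answer)
```

Note that the attributes (foo.bar, bar.baz) have to be set by hand here; registering the names in sys.modules alone makes the import statement succeed but does not make foo.bar.baz reachable as an attribute chain.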