I have a number of small utility modules that I organize under an 'msa' namespace so I can use them in a number of different research projects. Currently I have them organized like this:
# folder structure:
packages <-- in my pythonpath
--msa
----msa_utils.py
----msa_geom.py
----msa_pyglet.py
----msa_math.py
----etc
# imported and used like this:
from msa import msa_pyglet
from msa import msa_math
msa_pyglet.draw_rect(msa_math.lerp(...))
However I would like to avoid the 'msa_' in the names and use like this:
# folder structure:
packages <-- in my pythonpath
--msa
----utils.py
----geom.py
----pyglet.py
----math.py
----etc
# imported and used like this:
import msa.pyglet
import msa.math
msa.pyglet.draw_rect(msa.math.lerp(...))
This shouldn't cause name conflicts when importing from outside, but there are name conflicts when the modules themselves import modules with conflicting names. E.g. msa/pyglet.py needs to import pyglet (the external one), but ends up trying to import itself. Likewise, any module which tries to import the standard math library gets only my math module. Which is all understandable. But what is the usual Pythonic way of dealing with this? Do I have to give each module file a globally unique name?
In Python 2, imports in packages without a package qualifier indeed look for package-local modules first.
Thus, import pyglet will find msa.pyglet before the top-level pyglet is considered.
Switch to absolute imports to make the Python 3 behaviour the default, where unqualified names are always top-level names:
from __future__ import absolute_import
Now import pyglet can only ever find the top-level name, never msa.pyglet. To reference other modules within your msa namespace, use from . import pyglet or from msa import pyglet.
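Applied to the files from the question, msa/pyglet.py would then start like this (a minimal sketch; the draw_rect body is elided, and its signature is made up for illustration):

# msa/pyglet.py
from __future__ import absolute_import

import pyglet        # now always the external, top-level pyglet
from . import math   # explicitly our own msa.math

def draw_rect(*args):
    pass  # hypothetical wrapper around the external pyglet API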
See PEP 328 -- Imports: Multi-Line and Absolute/Relative for more details.
Related
I would like a way to detect if my module was imported directly, as in import module or from module import *, rather than via import module.submodule (which also executes module), and to have this information accessible in module's __init__.py.
Here is a use case:
In Python, a common idiom is to add import statements to a package's __init__.py file, to "flatten" the package's namespace and make its submodules' contents accessible directly. Unfortunately, doing so can make loading a specific submodule very slow, as all the other siblings imported in __init__.py will also be executed.
For instance:
module/
__init__.py
submodule/
__init__.py
...
sibling/
__init__.py
...
By adding to module/__init__.py:
from .submodule import *
from .sibling import *
It is now possible for users of the module to access definitions in submodules without knowing the details of the package structure (i.e. from module import SomeClass, where SomeClass is defined somewhere in submodule and exposed in its own __init__.py file).
However, if I now run submodule directly (as in import module.submodule, by calling python3 -m module.submodule, or even indirectly via pytest) I will also, unavoidably, execute sibling! If sibling is large, this can slow things down for no reason.
I would instead like to write module/__init__.py something like:
if __???__ == 'module':
    from .submodule import *
    from .sibling import *
Where __???__ gives me the fully qualified name of the import. Any similar mechanism would also work, although I'm mostly interested in the general case (detecting direct executing) rather than this specific example.
What is being desired would result in undefined behavior (in the sense of whether or not the flattened names would be importable from module), if it were actually possible, given how the import system actually works.
Hypothetically, suppose what you want were possible: some __dunder__ that disambiguates which import statement was used to import module/__init__.py (e.g. import module and from module import *, vs. import module.submodule). In the first case, module may trigger the subsequent (slow) imports to produce the "flattened" version of the desired names, while in the latter case (import module.submodule) it will avoid that, and thus module will not contain any of the "flattened" bindings.
To illustrate a bit more: say one may import SiblingClass from module.sibling by simply doing from module import SiblingClass, because the module/__init__.py file executes the from .sibling import * statement to create that binding. But then, if executing import module.submodule results in that flattening import being skipped, we get the following scenario:
import module.submodule
# module.submodule gets imported
from module import SiblingClass
# ImportError will occur
Why is that? This is simply due to how Python imports a file: the source file is executed in its entirety exactly once, assigning imports, function and class declarations to the designated names, and the module is registered in sys.modules under its import name. Importing the module again will not execute the file again. Thus, if the from .sibling import * statement was not executed during the initial import (i.e. import module.submodule), it will never be executed during subsequent imports of the same module, as the module object produced by the initial import and stored in sys.modules is simply returned (unless the module is reloaded manually, in which case the code for the module is executed again).
You may verify this fact by putting a print statement into a file, importing the corresponding module to see the output produced, and observing that no further output is produced on subsequent imports of that module (related: What happens when a module is imported twice?).
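A minimal demonstration (the module name is made up for illustration):

# sibling/__init__.py
print("sibling imported")

# in the interpreter or another module:
import sibling             # prints "sibling imported"
import sibling             # prints nothing; sys.modules already holds it

import importlib
importlib.reload(sibling)  # only an explicit reload executes the file again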
Effectively, the desired functionality as described in the question cannot be implemented in Python.
A related thread on this topic: How to only import sub module without exec __init__.py in the package
This is not a complete solution, but standalone py.test (ignore __init__.py files) proposes setting a global flag to detect when in test. This corrects the problem for tests at least, provided the concerned modules don't call each other.
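That workaround looks roughly like this (a sketch based on the pytest documentation's suggested sys._called_from_test attribute; treat the exact attribute name as a convention, not an API):

# conftest.py
import sys

def pytest_configure(config):
    sys._called_from_test = True

# module/__init__.py
import sys

if not getattr(sys, "_called_from_test", False):
    from .submodule import *
    from .sibling import *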
My project structure looks like this:
/project
main.py
/a_module
__init__.py
/sub_module
__init__.py
some_file.py
main.py
from a_module import main_api
a_module/__init__.py
from sub_module import sub_api
sub_module/__init__.py
from some_file import detail_api
The import in a_module/__init__.py gives an Unable to import 'sub_module' error.
Why can't I import 'sub_module'?
Changing to a relative import solves the error:
from .sub_module import sub_api
But I don't understand: isn't __init__.py designed to expose the public API of the package? Why isn't sub_module treated as a module instead of a directory? It seems like such a bad design to me...
__init__.py is executed when you import the package that contains it. But it's not your problem. Your problem is that module imports are always absolute unless explicitly relative. That means that they must chain from some directory in sys.path. By default this includes the working directory, so when you run main.py from within project, it can find a_module, and nothing else.
from sub_module import sub_api
In a_module/__init__.py doesn't work though, because imports are always absolute unless explicitly relative. So that import says "starting from some sys.path root, find a top level package named sub_module and import sub_api from it". Since no such module exists you get an error. from .sub_module import sub_api works because you opted into relative imports, so it doesn't start over from sys.path.
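Note that sub_module/__init__.py has exactly the same problem with from some_file import detail_api. With explicit relative imports, both files would read (keeping the names from the question):

# a_module/__init__.py
from .sub_module import sub_api

# sub_module/__init__.py
from .some_file import detail_api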
For an example of why you would do this, I'll give you something that broke in our own code back in the Python 2 days, before absolute imports by default were the law (from __future__ import absolute_import enabled the Py3 behavior, which is how we fixed it; despite what the docs say, it was never enabled by default in Py2, where the only behavior enabled by default was relative imports). Our layout was:
teamnamespace/
module.py
math/
mathrelatedsubmodule.py
othermathsubmodule.py
Now, we innocently thought: hey, we'll put all our packages under a single shared top-level namespace, with subpackages covering broad categories within it, and since we had a lot of additional utilities for basic mathematics, we put them under teamnamespace.math. The problem was, for the non-math modules like teamnamespace.module, when they did:
import math # or
from math import ceil
it defaulted to a relative lookup, and imported teamnamespace.math as math (a thoroughly useless import, since it was a namespace package only; all the functionality was in the sub-modules), not the built-in math module. In fact, without the Python 3 behavior, there was no reasonable way to get the built-in math module from a module under teamnamespace. Whereas with the Python 3 behavior, you can get either one or both (by aliasing one or the other with as), with no ambiguity:
# Gets built-in
import math
# Gets teamnamespace.math
from . import math
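And a module that genuinely needs both can alias them apart (a sketch; the alias names are arbitrary):

import math as stdmath            # the built-in math module
from . import math as teammath    # teamnamespace.math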
I have been creating a few modules organized by purpose, and each module contains a number of functions. I would like to bundle these individual modules into a larger "package" that other users can import from a shared location.
Currently, I have all of my modules in one folder called python_modules, and I have appended this path to sys.path so I can easily import my individual modules as needed.
However, I would like to instead import a single package, that contains all of my modules, so I don't have to import each one individually. I know that I could put all my modules into one file, but that doesn't seem like a good way to organize my processes.
Currently, I have the following files in my python_modules folder:
__init__.py
load_data_functions.py
parse_data_functions.py
network_functions.py
counting_function.py
math_functions.py
...
...
other_functions.py
The __init__.py file is empty and does not have anything inside of it. The other modules all have various functions inside of them, and some are dependent on others. For example, network_functions.py relies on load_data_functions.py and parse_data_functions.py.
As I said, I want to package all of these modules into a larger package that I can share with others, and so we don't have to import each module independently.
A package is just a bunch of modules. You already have an __init__.py file, so python_modules is already a package. You should simply be able to import the individual submodules, e.g. import python_modules.load_data_functions, and then call functions from each one as python_modules.load_data_functions.some_function(), python_modules.parse_data_functions.some_other_function().
assuming "I would like to instead import a single package" means you want to be able to something like:
import python_modules
port = python_modules.network_functions.get_port()
your __init__.py file should look like:
from . import load_data_functions
from . import parse_data_functions
from . import network_functions
...
if you'd like to be able to do:
import python_modules
port = python_modules.get_port()
your __init__.py file should look like:
from .load_data_functions import *
from .parse_data_functions import *
from .network_functions import *
...
As for modules within the package referring to each other, you can use the same idea, e.g. at the top of network_functions.py you'd want to put from . import load_data_functions. You should structure things so as to avoid circular imports.
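For instance, the top of network_functions.py might look like this (a sketch; get_port and the helper functions are assumptions, only the relative imports are the point):

# network_functions.py
from . import load_data_functions
from . import parse_data_functions

def get_port():
    data = load_data_functions.load()        # assumed helper
    return parse_data_functions.parse(data)  # assumed helper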
Here is a simple case: I want to define a module in Python named robot. So, I have a folder named robot with these two files:
__init__.py:
from test import a
test.py:
a = "hello world"
Now, when I import robot in the interpreter, the robot namespace includes test and a. However, I only want it to include a. Why this odd behavior?
EDIT:
Here's a slightly more representative example of what I want to achieve:
Given the following files:
__init__.py:
from spam import a
import ham
spam.py:
a = "hello world"
ham.py:
b = "foo"
Can I have a robot namespace containing a and ham at its top level but not spam?
You have created not just a module but a package. A package contains its submodules in its namespace (once they have been imported, as you imported test here). This is as it should be, since the usual way of using packages is to provide a grouping of several modules. There's not much use to making a package with only one module (i.e., one contentful .py file) inside it.
If you just want a one-file module, create a file called robot.py and put your code in there.
Edit: See this previous question. The answer is that you should in general not worry about excluding module names from your package namespace. The modules are supposed to be in the package namespace. If you want to add functions and stuff from submodules as well, for convenience, that's fine, but there's not really anything to be gained by "covering your tracks" and hiding the modules you imported. However, as described in the answers to that question, there are some hackish ways to approximate what you want.
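For reference, the most common of those hackish approaches is deleting the submodule name after the import, since the import system binds it in the package namespace as a side effect (a sketch; note that any later import robot.spam will re-bind it):

# robot/__init__.py
from .spam import a   # binds 'a', and also 'spam' as a side effect
from . import ham

del spam              # hide the submodule name from the package namespace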
Are you just asking how to import specific modules or functions?
test.py:
from robot.spam import a
import robot.ham
Don't import the entire package.
Hi all.
I'd think that this could be answered easily, but it isn't. For as long as I've been searching for an answer, I keep thinking that I'm overlooking something simple.
I have a python workspace with the following package structure:
MyTestProject
/src
/TestProjectNamespace
__init__.py
Module_A.py
Module_B.py
SecondTestProject
/src
/SecondTestProjectNamespace
__init__.py
Module_1.py
Module_2.py
...
Module_10.py
Note that TestProjectNamespace has a reference to SecondTestProjectNamespace.
In TestProjectNamespace, I need to import everything in SecondTestProjectNamespace. I could import one module at a time with the following statement(s):
from SecondTestProjectNamespace.Module_1 import *
from SecondTestProjectNamespace.Module_2 import *
...but this isn't practical if SecondTestProjectNamespace has 50 modules in it.
Does Python support a way to import everything in a namespace / package? Any help would be appreciated.
Thanks in advance.
Yes, you can roll this using pkgutil.
Here's an example that lists all packages under twisted (except tests), and imports them:
# -*- Mode: Python -*-
# vi:si:et:sw=4:sts=4:ts=4
import pkgutil
import twisted

for importer, modname, ispkg in pkgutil.walk_packages(
        path=twisted.__path__,
        prefix=twisted.__name__ + '.',
        onerror=lambda x: None):
    # skip tests
    if 'test' in modname:
        continue
    print(modname)
    # gloss over import errors
    try:
        __import__(modname)
    except Exception:
        print('Failed importing', modname)

# show that we actually imported all these, by showing one subpackage is imported
print(twisted.python)
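Adapted to the question's layout, the same idea would be (a sketch; it assumes SecondTestProjectNamespace is already importable, i.e. SecondTestProject/src is on sys.path):

import pkgutil
import SecondTestProjectNamespace

for _importer, modname, _ispkg in pkgutil.walk_packages(
        path=SecondTestProjectNamespace.__path__,
        prefix=SecondTestProjectNamespace.__name__ + '.'):
    __import__(modname)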
I have to agree with the other posters that star imports are a bad idea.
No. It is possible to set up SecondTestProjectNamespace to automatically import everything in its submodules, by putting code in __init__.py to do the from ... import * you mention. It's also possible to automate this to some extent using the __import__ function and/or the imp module. But there is no quick and easy way to take a package that isn't set up this way and make it work this way.
It's probably not a good idea anyway. If you have 50 modules, importing everything from all of them into your global namespace is going to cause a proliferation of names, and very likely conflicts among those names.
As others have put it, this might not be a good idea. But there are ways of keeping your namespaces separate, and therefore avoiding naming conflicts, while still having all the modules/sub-packages in a package available to the package user with a single import.
Let's suppose I have a package named "pack", within it a module named "a.py" defining some "b" variable. All I want to do is :
>>> import pack
>>> pack.a.b
1
One way of doing this is to put in pack/__init__.py a line that says
from . import a
Thus in your case you'd need fifty such lines, and you'd have to keep them up to date.
Not that bad.
However, the documentation at http://docs.python.org/tutorial/modules.html#importing-from-a-package says that if you have a list of strings named __all__ in your __init__.py file, all module/sub-package names in that list are imported when one does from pack import *.
That alone would only half-work, as it would require users of your package to use the not-recommended from x import * form.
But you can do the from ... import * inside __init__.py itself, after defining the __all__ variable; then all you have to do is keep __all__ up to date:
With SecondTestProjectNamespace/__init__.py being like this:
__all__ = ["Module_1", "Module_2", ...]
from SecondTestProjectNamespace import *
your users would have
SecondTestProjectNamespace.Module_1 (and the others) available upon import of SecondTestProjectNamespace.
And, of course, you could automate the creation of __all__ (it is just a variable, after all), but I would not recommend that.
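If you did want to automate it anyway, pkgutil can generate the list (a sketch, with the same caveats):

# SecondTestProjectNamespace/__init__.py
import pkgutil

__all__ = [name for _finder, name, _ispkg in pkgutil.iter_modules(__path__)]
from SecondTestProjectNamespace import *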
Does Python support a way to import everything in a namespace / package?
No. A package is not a super-module -- it's a collection of modules grouped together.
At least part of the reason is that it's not trivial to determine what 'everything' means inside a folder: there are problems like network drives, soft links, hard links, ...