I have two versions of the same Python package. From a module in a subpackage of the current version, I need to call a function that lives inside the old version of the package (which saved a copy of itself in the past).
Where I am now:
now/
    package/
        __init__.py
        subpackage/
            __init__.py
            module.py -> "import package.subpackage.... <HERE>"
        subpackage2/
            ...
        ...
The old version:
past/
    package/
        __init__.py
        subpackage/
            __init__.py
            module.py -> "import package.subpackage; from . import module2; .... def f(x) ..."
            module2.py
        subpackage2/
            ...
        ...
I need to import in <HERE> the "old" f and run it.
Ideally:
- the function f should live its life inside the old package without knowing anything about the new version of the package
- the module in the new package should call it, let it live its life, get the results and then forget altogether about the existence of the old package (so calling "import package.subpackage2" after letting f do its thing should run the "new" version)
- doing that should not be terribly complex
The underlying idea is to improve reproducibility by saving the code that I used for some task along with the output data, and then being able to run parts of it.
Sadly, I understood this is not a simple task with Python 3, so I am prepared to accept some sort of compromise. I am prepared to accept, for example that after running the old f(x) the name package in the "new" code will be bound to the old.
EDIT
I tried in two ways using importlib. The idea was to create an object mod and then doing f = getattr(mod, "f"), but it doesn't work
Changing sys.path to ['.../past/package/subpackage'] and then calling importlib.import_module('package.subpackage.module') . The problem is that it will load the one in "now" even with the changed sys.path, probably because the name package is already in sys.modules
spec = importlib.util.spec_from_file_location("module", "path..to..past..module.py")
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
In that case relative imports (from . import module2) won't work, giving the error "attempted relative import with no known parent package"
There is one way this could work quite simply, but you will have to make a few modifications to your old package.
You can simply create a file in now/package/old/__init__.py containing:
__path__ = ['/absolute/path/to/old/package']
In the new package, you can then do:
from package.old.package.subpackage.module import f as old_f
The catch is that if the old package imports its own modules using absolute imports, it will load them from the new package instead. So the old package must either use only relative imports when importing from its own package, or you will have to prepend package.old to every absolute import that the old package makes.
If you are fine with modifying the old packages in this way, then that should be fine. If that limitation would not work for you, then read on.
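The `__path__` trick above can be tried out on a throwaway layout. The following is only a sketch under assumptions: the package name `pkgdemo`, the temporary directories and the `write` helper are hypothetical, and `__path__` is pointed at the directory that contains the old package so that the import path has the same shape as in the answer. The old code uses relative imports only.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()  # throwaway location for both versions


def write(relpath, source=""):
    """Hypothetical helper: create a file (and its parent folders) under root."""
    path = os.path.join(root, *relpath.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write(source)


# Old copy of the package; it only uses relative imports.
write("past/pkgdemo/__init__.py")
write("past/pkgdemo/subpackage/__init__.py")
write("past/pkgdemo/subpackage/module2.py", "VERSION = 'old'\n")
write("past/pkgdemo/subpackage/module.py",
      "from . import module2\n\n"
      "def f(x):\n"
      "    return (module2.VERSION, x)\n")

# Current version of the package.
write("now/pkgdemo/__init__.py")
write("now/pkgdemo/subpackage/__init__.py")
write("now/pkgdemo/subpackage/module2.py", "VERSION = 'new'\n")

# The bridge: pkgdemo.old searches the folder holding the old copy.
write("now/pkgdemo/old/__init__.py",
      "__path__ = [%r]\n" % os.path.join(root, "past"))

sys.path.insert(0, os.path.join(root, "now"))
importlib.invalidate_caches()

from pkgdemo.old.pkgdemo.subpackage.module import f as old_f
from pkgdemo.subpackage import module2 as new_module2

old_result = old_f(1)               # computed with the old module2
new_version = new_module2.VERSION   # the new package is untouched
```

Because the old modules are registered under `pkgdemo.old...` names, they do not replace the entries of the new package in sys.modules.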
If you are really sure that, for some reason, you don't want to modify the old package, then let's do some black magic: replace builtins.__import__ with your own version that returns different modules depending on who is doing the importing. You can figure out who is doing the importing by inspecting the call stack.
For example, this is how you might do it (tested on Python 3.6):
import builtins
import inspect
import package.old
old_package_path = package.old.__path__[0]
OUR_PACKAGE_NAME = 'package'
OUR_PACKAGE_NAME_WITH_DOT = OUR_PACKAGE_NAME + '.'
def import_module(name, globs=None, locs=None, fromlist=(), level=0):
    # only intercept imports for our own package from our old module
    if not name.startswith(OUR_PACKAGE_NAME_WITH_DOT) or \
            not inspect.stack()[1].filename.startswith(old_package_path):
        return real_import(name, globs, locs, fromlist, level)
    new_name = OUR_PACKAGE_NAME + '.old.' + name[len(OUR_PACKAGE_NAME_WITH_DOT):]
    mod = real_import(new_name, globs, locs, fromlist, level)
    return mod.old
# save the original __import__ since we'll need it to do the actual import
real_import = builtins.__import__
builtins.__import__ = import_module
builtins.__import__ gets called on any import statements encountered by the interpreter, and the call is not cached so you can return different things every time it is called even when they use the same name.
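To see that the hook really runs on every import statement, even for a module that is already cached, here is a minimal tracing sketch. The names `tracing_import` and `seen` are hypothetical, and `colorsys` is just an arbitrary standard-library module; the original hook is restored afterwards so nothing else is affected.

```python
import builtins

seen = []                          # names requested through the hook
real_import = builtins.__import__  # keep the original to delegate to it


def tracing_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Record the requested name, then defer to the real import."""
    seen.append(name)
    return real_import(name, globals, locals, fromlist, level)


builtins.__import__ = tracing_import
try:
    import colorsys   # first import: the module is loaded
    import colorsys   # second import: served from sys.modules, hook still runs
finally:
    builtins.__import__ = real_import  # always undo the patch

colorsys_requests = seen.count("colorsys")
```

The hook is global to the interpreter, so a real implementation would want to keep the patched window as short as possible.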
The following is my old answer, kept here for historical purposes only.
I don't quite get what you're trying to do, but this is likely possible to do in Python 3 by using importlib.
You would just create a module loader that loads your module from an explicit filepath.
There's also an invalidate_caches() and reload() function which may be useful, though you may not need them.
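One possible reading of "a module loader that loads your module from an explicit filepath" is to load the old package itself under an alias, so that relative imports inside it still have a parent package. This is a sketch, not the answer's own code: `load_package_as`, the alias `old_package` and the temporary layout are assumptions made for illustration.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def load_package_as(alias, package_dir):
    """Hypothetical helper: import the package in package_dir as `alias`."""
    spec = importlib.util.spec_from_file_location(
        alias,
        os.path.join(package_dir, "__init__.py"),
        submodule_search_locations=[package_dir],
    )
    package = importlib.util.module_from_spec(spec)
    sys.modules[alias] = package      # register before executing the package
    spec.loader.exec_module(package)
    return package


# Build a throwaway "old" copy that uses a relative import.
old_dir = os.path.join(tempfile.mkdtemp(), "package")
sub_dir = os.path.join(old_dir, "subpackage")
os.makedirs(sub_dir)
for directory in (old_dir, sub_dir):
    open(os.path.join(directory, "__init__.py"), "w").close()
with open(os.path.join(sub_dir, "module2.py"), "w") as fh:
    fh.write("TAG = 'old'\n")
with open(os.path.join(sub_dir, "module.py"), "w") as fh:
    fh.write("from . import module2\n\ndef f(x):\n    return module2.TAG, x\n")

load_package_as("old_package", old_dir)
old_module = importlib.import_module("old_package.subpackage.module")
result = old_module.f(42)
```

Absolute imports inside the old code (for example `import package.subpackage`) would still resolve to whatever `package` is already in sys.modules, so this only isolates code that uses relative imports.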
Having already used flat packages, I was not expecting the issue I encountered with nested packages. Here is…
Directory layout
dir
 |
 +-- test.py
 |
 +-- package
      |
      +-- __init__.py
      |
      +-- subpackage
           |
           +-- __init__.py
           |
           +-- module.py
Content of __init__.py
Both package/__init__.py and package/subpackage/__init__.py are empty.
Content of module.py
# file `package/subpackage/module.py`
attribute1 = "value 1"
attribute2 = "value 2"
attribute3 = "value 3"
# and as many more as you want...
Content of test.py (3 versions)
Version 1
# file test.py
from package.subpackage.module import *
print attribute1 # OK
That's the bad and unsafe way of importing things (import all in a bulk), but it works.
Version 2
# file test.py
import package.subpackage.module
from package.subpackage import module # Alternative
from module import attribute1
A safer way to import, item by item, but it fails; Python doesn't accept this and reports "No module named module". However …
# file test.py
import package.subpackage.module
from package.subpackage import module # Alternative
print module # Surprise here
… says <module 'package.subpackage.module' from '...'>. So that's a module, but that's not a module /-P 8-O ... uh
Version 3
# file test.py v3
from package.subpackage.module import attribute1
print attribute1 # OK
This one works. So you are either forced to use the long prefix all the time or to use the unsafe way as in version #1, while Python disallows the safe and handy way? The better way, which is safe and avoids unnecessarily long prefixes, is the only one that Python rejects? Is this because it loves import * or because it loves overlong prefixes (which does not help to enforce this practice)?
Sorry for the harsh words, but I have spent two days trying to work around this stupid-seeming behavior. Unless I got something totally wrong, this leaves me with the feeling that something is really broken in Python's model of packages and sub-packages.
Notes
I don't want to rely on sys.path, to avoid global side effects, nor on *.pth files, which are just another way to play with sys.path, with the same global effects. For the solution to be clean, it has to be local only. Either Python is able to handle subpackages or it is not, but it should not require playing with global configuration to handle local stuff.
I also tried using imports in package/subpackage/__init__.py, but it solved nothing: it does the same thing and complains that subpackage is not a known module, while print subpackage says it's a module (weird behavior, again).
Maybe I'm entirely wrong though (the option I would prefer), but this makes me feel quite disappointed with Python.
Any other known way beside of the three I tried? Something I don't know about?
(sigh)
----- %< ----- edit ----- >% -----
Conclusion so far (after people's comments)
There is nothing like a real sub-package in Python: all package references go to a single global dictionary, which means there is no local dictionary, which in turn implies there is no way to manage local package references.
You have to either use full prefix or short prefix or alias. As in:
Full prefix version
from package.subpackage.module import attribute1
# And repeat it again and again
# But after that, you can simply:
use_of (attribute1)
Short prefix version (but repeated prefix)
from package.subpackage import module
# Short but then you have to do:
use_of (module.attribute1)
# and repeat the prefix at every use place
Or else, a variation of the above.
from package.subpackage import module as m
use_of (m.attribute1)
# `m` is a shorter prefix, but you could as well
# define a more meaningful name after the context
Factorized version
If you don't mind importing multiple entities all at once in a batch, you can:
from package.subpackage.module import attribute1, attribute2
# and etc.
Not my favorite style (I prefer to have one import statement per imported entity), but it may be the one I will personally favor.
Update (2012-09-14):
Finally appears to be OK in practice, except with a comment about the layout. Instead of the above, I used:
from package.subpackage.module import (
    attribute1,
    attribute2,
    attribute3,
    ...)  # and etc.
You seem to be misunderstanding how import searches for modules. When you use an import statement it always searches the actual module path (and/or sys.modules); it doesn't make use of module objects in the local namespace that exist because of previous imports. When you do:
import package.subpackage.module
from package.subpackage import module
from module import attribute1
The second line looks for a package called package.subpackage and imports module from that package. This line has no effect on the third line. The third line just looks for a module called module and doesn't find one. It doesn't "re-use" the object called module that you got from the line above.
In other words from someModule import ... doesn't mean "from the module called someModule that I imported earlier..." it means "from the module named someModule that you find on sys.path...". There is no way to "incrementally" build up a module's path by importing the packages that lead to it. You always have to refer to the entire module name when importing.
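The point can be checked with the standard library alone. In this small sketch, `xml.dom.minidom` merely stands in for `package.subpackage.module`, and it assumes no unrelated top-level module called `minidom` is installed.

```python
import sys
from xml.dom import minidom      # binds the local name `minidom`

try:
    # The import system looks for a top-level module named "minidom";
    # it does not consult the local variable bound on the line above.
    from minidom import parseString
except ImportError as exc:
    failure = exc
else:
    failure = None

# Going through the module object we already hold works fine.
parse_string = minidom.parseString
registered_names = [name for name in sys.modules if name.endswith("minidom")]
```

Only the fully qualified name `xml.dom.minidom` is registered in sys.modules, which is why the short name cannot be imported.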
It's not clear what you're trying to achieve. If you only want to import the particular object attribute1, just do from package.subpackage.module import attribute1 and be done with it. You need never worry about the long package.subpackage.module once you've imported the name you want from it.
If you do want to have access to the module to access other names later, then you can do from package.subpackage import module and, as you've seen you can then do module.attribute1 and so on as much as you like.
If you want both --- that is, if you want attribute1 directly accessible and you want module accessible, just do both of the above:
from package.subpackage import module
from package.subpackage.module import attribute1
attribute1 # works
module.someOtherAttribute # also works
If you don't like typing package.subpackage even twice, you can just manually create a local reference to attribute1:
from package.subpackage import module
attribute1 = module.attribute1
attribute1 # works
module.someOtherAttribute #also works
The reason #2 fails is because sys.modules['module'] does not exist (the import routine has its own scope, and cannot see the module local name), and there's no module module or package on-disk. Note that you can separate multiple imported names by commas.
from package.subpackage.module import attribute1, attribute2, attribute3
Also:
from package.subpackage import module
print module.attribute1
If all you're trying to do is to get attribute1 in your global namespace, version 3 seems just fine. Why is it overkill prefix ?
In version 2, instead of
from module import attribute1
you can do
attribute1 = module.attribute1
I'm trying to do the following in python 2.6.
my_module.py:-
from another_module import another_factory
def my_factory(name):
    pass
another_module.py:-
from my_module import my_factory
def another_factory(name):
    pass
Both modules in the same folder.
It gives me the error:
Error: cannot import name my_factory
As the comments point out, you are attempting a circular import, which cannot work as written.
If module A imports something from module B, and loading module B (to satisfy this dependency) imports something from module A, you are back where you started and you have a circular import: A needs B and B needs A. It is somewhat like saying that A needs A, which is not logical.
For instance:
# moduleA
from moduleB import functionB
...
So the interpreter tries to load the moduleB, which looks like the following:
# moduleB
from moduleA import functionA
...
This sends the interpreter back to moduleA, which has not finished executing yet, so functionA is not defined at that point. Rather than going around in circles, Python raises an ImportError and stops.
Dependencies don't work like this. Define what module needs the other one, and just do a simple import. In your example, it seems that another_module needs my_module, so change my_module and eliminate the dependency on another_module.
If both modules actually need each other, it is a clear sign that they belong to the same logical concept, and should be merged.
PS: in some cases, to avoid huge files, you can split a logical unit in two and, to avoid the circular dependency, write your imports inside the functions (which are not executed at load time), so that there is no cycle. This is, however, generally something to avoid.
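The difference between a load-time cycle and a deferred import can be reproduced with throwaway modules. This is only a sketch: the module names `circ_a`, `circ_b`, `lazy_a`, `lazy_b` and the `write_module` helper are hypothetical, and the files are written to a temporary directory that is added to sys.path for the demonstration.

```python
import importlib
import os
import sys
import tempfile

workdir = tempfile.mkdtemp()


def write_module(filename, source):
    """Hypothetical helper: write one module file into workdir."""
    with open(os.path.join(workdir, filename), "w") as fh:
        fh.write(source)


# Pair 1: both modules import from each other at load time.
write_module("circ_a.py",
             "from circ_b import b_func\n\ndef a_func():\n    return 'a'\n")
write_module("circ_b.py",
             "from circ_a import a_func\n\ndef b_func():\n    return 'b'\n")

# Pair 2: lazy_a defers its import of lazy_b until a_func() is called.
write_module("lazy_a.py",
             "def a_func():\n"
             "    from lazy_b import b_func  # runs at call time\n"
             "    return 'a+' + b_func()\n")
write_module("lazy_b.py",
             "from lazy_a import a_func\n\ndef b_func():\n    return 'b'\n")

sys.path.insert(0, workdir)
importlib.invalidate_caches()

try:
    import circ_a            # circ_b asks for a_func before it exists
except ImportError as exc:
    circular_error = exc
else:
    circular_error = None

import lazy_a                # loads cleanly: nothing is imported yet
deferred_result = lazy_a.a_func()
```

By the time `a_func()` runs, `lazy_a` is fully initialized, so the import inside `lazy_b` finds the name it asks for.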
The real question is... do you consider each file as a module or are they part of a package ?
Trying to import modules outside a package is sometimes painful. You should rather build a package by simply creating an empty __init__.py module in the directory. Suppose the directory contains:
__init__.py
my_module.py
another_module.py
If you have the following function in my_module.py,
def my_factory(x):
    return x * x
You should be able to access the my_factory() function from another_module.py by writing this :
from my_module import my_factory
But if the __init__.py file is missing, the directory is not treated as a package and the import machinery only searches sys.path for other modules. You may then add the following lines (before the import) to the another_module.py file:
import os, sys
sys.path.append(os.path.dirname(os.path.abspath(__file__)))  # directory containing this file
You may also use the various packages available to help importing modules, like imp or import_file (see the documentation). Or you can decide to use load_source (also see the doc : https://docs.python.org/2/library/imp.html)
I am developing a package that has a file structure similar to the following:
test.py
package/
    __init__.py
    foo_module.py
    example_module.py
If I call import package in test.py, I want the package module to appear similar to this:
>>> vars(package)
mapping_proxy({foo: <function foo at 0x…}, {example: <function example at 0x…})
In other words, I want the members of all modules in package to be in package's namespace, and I do not want the modules themselves to be in the namespace. package is not a sub-package.
Let's say my files look like this:
foo_module.py:
def foo(bar):
    return bar
example_module.py:
def example(arg):
    return foo(arg)
test.py:
print(example('derp'))
How do I structure the import statements in test.py, example_module.py, and __init__.py to work from outside the package directory (i.e. test.py) and within the package itself (i.e. foo_module.py and example_module.py)? Everything I try gives Parent module '' not loaded, cannot perform relative import or ImportError: No module named 'module_name'.
Also, as a side-note (as per PEP 8): "Relative imports for intra-package imports are highly discouraged. Always use the absolute package path for all imports. Even now that PEP 328 is fully implemented in Python 2.5, its style of explicit relative imports is actively discouraged; absolute imports are more portable and usually more readable."
I am using Python 3.3.
I want the members of all modules in package to be in package's
namespace, and I do not want the modules themselves to be in the
namespace.
I was able to do that by adapting something I've used in Python 2 to automatically import plug-ins to also work in Python 3.
In a nutshell, here's how it works:
The package's __init__.py file imports all the other Python files in the same package directory except for those whose names start with an '_' (underscore) character.
It then adds any names in the imported module's namespace to that of __init__ module's (which is also the package's namespace). Note I had to make the example_module module explicitly import foo from the .foo_module.
One important aspect of doing things this way is realizing that it's dynamic and doesn't require the package module names to be hardcoded into the __init__.py file. Of course this requires more code to accomplish, but also makes it very generic and able to work with just about any (single-level) package — since it will automatically import new modules when they're added and no longer attempt to import any removed from the directory.
test.py:
from package import *
print(example('derp'))
__init__.py:
def _import_all_modules():
    """ Dynamically imports all modules in this package. """
    import traceback
    import os
    global __all__
    __all__ = []
    globals_, locals_ = globals(), locals()

    # Dynamically import all the package modules in this file's directory
    # (located via __file__ so it does not depend on the current directory).
    for filename in os.listdir(os.path.dirname(os.path.abspath(__file__))):
        # Process all python files in directory that don't start
        # with underscore (which also prevents this module from
        # importing itself).
        if filename[0] != '_' and filename.split('.')[-1] in ('py', 'pyw'):
            modulename = filename.split('.')[0]  # Filename sans extension.
            package_module = '.'.join([__name__, modulename])
            try:
                module = __import__(package_module, globals_, locals_, [modulename])
            except:
                traceback.print_exc()
                raise
            for name in module.__dict__:
                if not name.startswith('_'):
                    globals_[name] = module.__dict__[name]
                    __all__.append(name)

_import_all_modules()
foo_module.py:
def foo(bar):
    return bar
example_module.py:
from .foo_module import foo # added
def example(arg):
    return foo(arg)
I think you can get the values you need without cluttering up your namespace, by using from module import name style imports. I think these imports will work for what you are asking for:
Imports for example_module.py:
from package.foo_module import foo
Imports for __init__.py:
from package.foo_module import foo
from package.example_module import example
__all__ = ['foo', 'example']  # names as strings; not strictly necessary, but makes clear what is public
Imports for test.py:
from package import example
Note that this only works if you're running test.py (or something else at the same level of the package hierarchy). Otherwise you'd need to make sure the folder containing package is in the python module search path (either by installing the package somewhere Python will look for it, or by adding the appropriate folder to sys.path).
In the hierarchy below, is there a convenient and universal way to refer to top_package using a generic term in all the .py files below? I would like to have a consistent way to import other modules, so that even when "top_package" changes name nothing breaks.
I am not in favour of using relative imports like "..level_one_a", as the relative path will be different for each Python file below. I am looking for a way such that:
Each python file can have the same import statement for the same module in the package.
A decoupling reference to "top_package" in any .py file inside the package, so whatever name "top_package" changes to, nothing breaks.
top_package/
    __init__.py
    level_one_a/
        __init__.py
        my_lib.py
        level_two/
            __init__.py
            hello_world.py
    level_one_b/
        __init__.py
        my_lib.py
    main.py
This should do the job:
top_package = __import__(__name__.split('.')[0])
The trick here is that for every module the __name__ variable contains the full path to the module separated by dots such as, for example, top_package.level_one_a.my_lib. Hence, if you want to get the top package name, you just need to get the first component of the path and import it using __import__.
Although the variable used to access the package is still called top_package, you can rename the package and it will still work.
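As a quick illustration of the same idea outside a real package, the sketch below applies it to a dotted name from the standard library. The helper `top_package_of` is hypothetical, and `importlib.import_module` is used in place of `__import__`; inside a package module you would pass `__name__`.

```python
import importlib


def top_package_of(module_name):
    """Import and return the top-level package of a dotted module name."""
    top_name = module_name.split(".")[0]
    return importlib.import_module(top_name)


# Stand-in for calling top_package_of(__name__) inside a package module.
top = top_package_of("xml.dom.minidom")
top_name = top.__name__
```

Note that a module executed directly as a script has `__name__ == "__main__"`, so this approach only helps for modules that are imported as part of the package.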
Put your package and the main script into an outer container directory, like this:
container/
    main.py
    top_package/
        __init__.py
        level_one_a/
            __init__.py
            my_lib.py
            level_two/
                __init__.py
                hello_world.py
        level_one_b/
            __init__.py
            my_lib.py
When main.py is run, its parent directory (container) will be automatically added to the start of sys.path. And since top_package is now in the same directory, it can be imported from anywhere within the package tree.
So hello_world.py could import level_one_b/my_lib.py like this:
from top_package.level_one_b import my_lib
No matter what the name of the container directory is, or where it is located, the imports will always work with this arrangement.
But note that, in your original example, top_package could easily function as the container directory itself. All you would have to do is remove top_package/__init__.py, and you would be left with effectively the same arrangement.
The previous import statement would then change to:
from level_one_b import my_lib
and you would be free to rename top_package however you wished.
You could use a combination of the __import__() function and the __path__ attribute of a package.
For example, suppose you wish to import <whatever>.level_one_a.level_two.hello_world from somewhere else in the package. You could do something like this:
import os
_temp = __import__(__path__[0].split(os.sep)[0] + ".level_one_a.level_two.hello_world")
my_hello_world = _temp.level_one_a.level_two.hello_world
This code is independent of the name of the top level package and can be used anywhere in the package. It's also pretty ugly.
This works from within a library module:
import __main__ as main_package
TOP_PACKAGE = main_package.__package__.split('.')[0]
I believe #2 is impossible without using relative imports or the named package. You have to specify which module to import, either by explicitly using its name or by using a relative import. Otherwise, how would the interpreter know what you want?
If you make your application launcher one level above top_level/ and have it import top_level you can then reference top_level.* from anywhere inside the top_level package.
(I can show you an example from software I'm working on: http://github.com/toddself/beerlog/)
I'm taking a look at how the model system in django works and I noticed something that I don't understand.
I know that you create an empty __init__.py file to specify that the current directory is a package. And that you can set some variable in __init__.py so that import * works properly.
But django adds a bunch of from ... import ... statements and defines a bunch of classes in __init__.py. Why? Doesn't this just make things look messy? Is there a reason that requires this code in __init__.py?
All imports in __init__.py are made available when you import the package (directory) that contains it.
Example:
./dir/__init__.py:
import something
./test.py:
import dir
# can now use dir.something
EDIT: forgot to mention, the code in __init__.py runs the first time you import any module from that directory. So it's normally a good place to put any package-level initialisation code.
EDIT2: dgrant pointed out a possible confusion in my example. In __init__.py, import something can import any module, not necessarily one from the package. For example, we can replace it with import datetime; then in our top level test.py both of these snippets will work:
import dir
print dir.datetime.datetime.now()
and
import dir.some_module_in_dir
print dir.datetime.datetime.now()
The bottom line is: all names assigned in __init__.py, be it imported modules, functions or classes, are automatically available in the package namespace whenever you import the package or a module in the package.
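This behaviour is easy to verify with a throwaway package. The sketch below is written for Python 3; the package name `initdemo`, its `greet` function and the temporary directory are assumptions made for the demonstration.

```python
import importlib
import os
import sys
import tempfile

base = tempfile.mkdtemp()
pkg_dir = os.path.join(base, "initdemo")
os.makedirs(pkg_dir)

# __init__.py imports a module and defines a function of its own.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
    fh.write("import datetime\n\ndef greet():\n    return 'hello'\n")
with open(os.path.join(pkg_dir, "some_module.py"), "w") as fh:
    fh.write("VALUE = 1\n")

sys.path.insert(0, base)
importlib.invalidate_caches()

# Importing a submodule runs the package's __init__.py first.
import initdemo.some_module

package_names = sorted(n for n in vars(initdemo) if not n.startswith("_"))
today = initdemo.datetime.date.today()
greeting = initdemo.greet()
```

Both the imported `datetime` module and the locally defined `greet` end up as attributes of the package, alongside the submodule that was imported.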
It's just personal preference really, and has to do with the layout of your python modules.
Let's say you have a module called erikutils. There are two ways that it can be a module, either you have a file called erikutils.py on your sys.path or you have a directory called erikutils on your sys.path with an empty __init__.py file inside it. Then let's say you have a bunch of modules called fileutils, procutils, parseutils and you want those to be sub-modules under erikutils. So you make some .py files called fileutils.py, procutils.py, and parseutils.py:
erikutils
    __init__.py
    fileutils.py
    procutils.py
    parseutils.py
Maybe you have a few functions that just don't belong in the fileutils, procutils, or parseutils modules. And let's say you don't feel like creating a new module called miscutils. AND, you'd like to be able to call the function like so:
erikutils.foo()
erikutils.bar()
rather than doing
erikutils.miscutils.foo()
erikutils.miscutils.bar()
So because the erikutils module is a directory, not a file, we have to define its functions inside the __init__.py file.
In django, the best example I can think of is django.db.models.fields. ALL the django *Field classes are defined in the __init__.py file in the django/db/models/fields directory. I guess they did this because they didn't want to cram everything into a hypothetical django/db/models/fields.py module, so they split it out into a few submodules (related.py, files.py, for example) and they stuck the main *Field definitions in the fields module itself (hence, __init__.py).
Using the __init__.py file allows you to make the internal package structure invisible from the outside. If the internal structure changes (e.g. because you split one fat module into two) you only have to adjust the __init__.py file, but not the code that depends on the package. You can also make parts of your package invisible, e.g. if they are not ready for general usage.
Note that you can use the del command, so a typical __init__.py may look like this:
from somemodule import some_function1, some_function2, SomeObject
del somemodule
Now if you decide to split somemodule the new __init__.py might be:
from somemodule1 import some_function1, some_function2
from somemodule2 import SomeObject
del somemodule1
del somemodule2
From the outside the package still looks exactly as before.
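The facade idea can be tried out on a throwaway package. This sketch assumes Python 3, where the `__init__.py` has to use explicit relative imports (`from .somemodule1 import ...`); the package name `facadedemo` and the file contents are hypothetical.

```python
import importlib
import os
import sys
import tempfile

base = tempfile.mkdtemp()
pkg_dir = os.path.join(base, "facadedemo")
os.makedirs(pkg_dir)

files = {
    "somemodule1.py": "def some_function1():\n    return 1\n",
    "somemodule2.py": "class SomeObject:\n    pass\n",
    # The facade: re-export the public names, then hide the submodules.
    "__init__.py": (
        "from .somemodule1 import some_function1\n"
        "from .somemodule2 import SomeObject\n"
        "del somemodule1, somemodule2\n"
    ),
}
for filename, source in files.items():
    with open(os.path.join(pkg_dir, filename), "w") as fh:
        fh.write(source)

sys.path.insert(0, base)
importlib.invalidate_caches()

import facadedemo

value = facadedemo.some_function1()
instance = facadedemo.SomeObject()
public_names = sorted(n for n in vars(facadedemo) if not n.startswith("_"))
```

Callers only ever touch `facadedemo.some_function1` and `facadedemo.SomeObject`, so the split into two submodules can change without affecting them. The submodules stay registered in sys.modules; `del` only removes the attribute from the package namespace.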
"We recommend not putting much code in an __init__.py file, though. Programmers do not expect actual logic to happen in this file, and much like with from x import *, it can trip them up if they are looking for the declaration of a particular piece of code and can't find it until they check __init__.py. "
-- Python Object-Oriented Programming Fourth Edition Steven F. Lott Dusty Phillips