I'm trying to make a directory containing all the modules a python script depends on. Rather than manually tracking down all those files, I would like to automatically find these files as python imports them. To do this, I've added a module finder to sys.meta_path:
import sys, imp

class ImportPrint(object):
    def find_module(self, name, path=None):
        toks = name.split(".")
        pre, loc = ".".join(toks[:-1]), toks[-1]
        try:
            module_info = imp.find_module(loc, path)
        except ImportError:
            module_info = imp.find_module(loc)
        if module_info[0]:
            module_info[0].close()
        print "A", name, module_info[1]
        return None

sys.meta_path = [ImportPrint()]
import mymod1, mymod2, etc..
This almost works, but the __init__.py files are not found this way. Is there a better way to find them, or should I just hackily add them whenever the file found is a directory? Will this method miss any other files?
According to the documentation for sys.meta_path, your find_module method will be called with the path argument set to the path of a package if it is one. Why not use os.path.join(path, '__init__.py') when path exists?
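A sketch of the "record __init__.py whenever what was found is a package directory" idea from the question (untested; it relies on imp.find_module reporting PKG_DIRECTORY in the description tuple for packages, as quoted from the docs further down):

import imp, os, sys

class ImportPrint(object):
    def find_module(self, name, path=None):
        loc = name.split(".")[-1]
        try:
            module_info = imp.find_module(loc, path)
        except ImportError:
            module_info = imp.find_module(loc)
        f, pathname, description = module_info
        if f:
            f.close()
        if description[-1] == imp.PKG_DIRECTORY:
            # A package: the file that actually gets executed is its __init__.py
            pathname = os.path.join(pathname, "__init__.py")
        print "A", name, pathname
        return None

sys.meta_path = [ImportPrint()]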
Is there a way to make sure all imports listed in all of my python files are in PYTHONPATH? Basically, a validator for all the imports in my files.
My approach: change all file paths from / to "." (e.g. foo/bar to foo.bar) and then run:
pkgutil.find_loader("foo.bar")
Problem: It does not work if I have foo.bar.zoo.data as a module
export PYTHONPATH=$(pwd):"my_lib_path"
for root, dir, files in os.walk(work_dir):
    for filename in files:
        if not "__init__" in filename and filename.endswith(".py"):
            print "Testing file {}".format(os.path.join(root, filename))
            filepath = os.path.join(root, filename)
            filepath = filepath.replace(work_dir, '')
            filepath = filepath.replace('/', '.')
            filepath = filepath.lstrip(".").rstrip(".py")
            print " testing filepath", filepath
            # tried this also
            '''
            for imp, name, _ in pkgutil.iter_modules(root):
                full_name = "{}.{}".format(root, name)
                module = imp.find_module(full_name)
            '''
            mod = pkgutil.find_loader(filepath)
The best way to do this is to use the importlib (Python 3.1+) or imp (Python 2.x) module to do all the steps an import would do up to, but not including, running the code.
The key functions for each Python version are:
3.4-3.8: importlib.util.find_spec
3.3-3.3: importlib.find_loader
3.1-3.2: importlib.find_module
3.0-3.0: imp.find_module
1.5-2.7: imp.find_module
0.9-1.4: No idea. (There were no packages yet; everything was different…)
Advantages of doing it this way:
It works whether the module is a normal Python module, a .pyc-only module, a C extension module, a builtin, or some funky special type of module that you've installed a custom import hook for.
It works even if the module is inside a .egg, or in the frozen bootstrap collection, or if the whole library is wrapped up in a .zip or even buried inside a .exe.
It automatically takes care of the funky rules, like figuring out what is or isn't a valid namespace package extension directory (including dealing with things like site.py and old-style setuptools path-injection) that would be a huge pain to get right.
In 3.4+, this is literally the same code that import uses, because the entire import system is written in Python. For older versions, that's not true—but for 2.3+, it's guaranteed to get you the same thing you would get in an import hook, which is almost surely close enough.
Quoting the 3.7 docs:
importlib.util.find_spec(name, package=None)
Find the spec for a module, optionally relative to the specified package name. If the module is in sys.modules, then sys.modules[name].__spec__ is returned (unless the spec would be None or is not set, in which case ValueError is raised). Otherwise a search using sys.meta_path is done. None is returned if no spec is found.
If name is for a submodule (contains a dot), the parent module is automatically imported.
name and package work the same as for import_module().
(You don't really have to care what a spec is here; if Python can find the spec for a module, the module is present; if it returns None, the module is not present.)
So, this will just magically handle foo.bar.zoo.data.
If you look at the Examples, there's one that does exactly what you want, "Checking if a module can be imported".
def test_module(name):
    if not importlib.util.find_spec(name):
        raise ImportError(name)
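For example, checking the dotted name from your question with the test_module defined just above is simply (a quick sketch):

try:
    test_module("foo.bar.zoo.data")
except ImportError as e:
    print("missing module:", e)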
From the 2.7 docs:
imp.find_module(name[, path])
Try to find the module name. If path is omitted or None, the list of directory names given by sys.path is searched, but first a few special places are searched: the function tries to find a built-in module with the given name (C_BUILTIN), then a frozen module (PY_FROZEN), and on some systems some other places are looked in as well (on Windows, it looks in the registry which may point to a specific file).
Otherwise, path must be a list of directory names; each directory is searched for files with any of the suffixes returned by get_suffixes() above. Invalid names in the list are silently ignored (but all list items must be strings).
If search is successful, the return value is a 3-element tuple (file, pathname, description):
file is an open file object positioned at the beginning, pathname is the pathname of the file found, and description is a 3-element tuple as contained in the list returned by get_suffixes() describing the kind of module found.
If the module does not live in a file, the returned file is None, pathname is the empty string, and the description tuple contains empty strings for its suffix and mode; the module type is indicated as given in parentheses above. If the search is unsuccessful, ImportError is raised. Other exceptions indicate problems with the arguments or environment.
If the module is a package, file is None, pathname is the package path and the last item in the description tuple is PKG_DIRECTORY.
This function does not handle hierarchical module names (names containing dots). In order to find P.M, that is, submodule M of package P, use find_module() and load_module() to find and load package P, and then use find_module() with the path argument set to P.__path__. When P itself has a dotted name, apply this recipe recursively.
As you can see, it's a little more complicated—it doesn't magically handle foo.bar.zoo.data; you will need to find foo, verify that it's a package, load it, find bar, etc., and finally find (but not load) data.
Something like this (untested):
def test_module(name):
    parts = name.split('.')
    path = None
    for part in parts[:-1]:
        file, pathname, description = imp.find_module(part, path)
        if description[-1] != imp.PKG_DIRECTORY:
            raise ImportError(name)
        pkg = imp.load_module(part, file, pathname, description)
        path = pkg.__path__
    file, pathname, description = imp.find_module(parts[-1], path)
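Hypothetical usage, reusing the test_module just above and the dotted name from the question (again untested):

try:
    test_module('foo.bar.zoo.data')
    print 'foo.bar.zoo.data looks importable'
except ImportError:
    print 'foo.bar.zoo.data (or one of its parent packages) is missing'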
I have a function which is stored in builtins. It is used to load python modules with paths relative to the project's base directory, which is stored under builtins.absolute.
Function below:
import builtins, imp, os, sys

def projectRelativeImport(fileName, projectRelativePath, moduleName = None):
    # if moduleName not set, set it to file name with first letter capitalised
    if moduleName is None:
        moduleName = fileName[:1].capitalize() + fileName[1:]

    # we shouldn't be passing fileName with an extension unless moduleName is
    # set, due to the previous if. So in those cases we add .py
    if len(fileName) >= 3 and fileName[-3:] != '.py':
        fileName = fileName + '.py'

    dir = os.path.join(builtins.absolute, projectRelativePath)
    full = os.path.join(dir, fileName)

    sys.path.append(dir)
    imp.load_source(moduleName, full)
    sys.path.remove(dir)
In one of my other files I use projectRelativeImport('inputSaveHandler', 'app/util', 'SaveHandler') to import SaveHandler from app/util/inputSaveHandler.py. This runs through projectRelativeImport absolutely fine; the correct strings are being passed to imp (I've printed them to check).
But a couple of lines after that call I have the line
handler = SaveHandler.ConfHandler()
Which throws NameError: name 'SaveHandler' is not defined
I realise my project relative import function is a bit odd, especially since I have it globally saved using builtins (there's probably a better way but I only started using python over the last two days). But I'm just a bit confused as to why the name isn't being recognised. Do I need to return something from imp due to scope being rubbish as the project relative import function is in a different file?
I fixed this by returning from projectRelativeImport() what was passed back from imp.load_source as shown below:
sys.path.append(dir)
submodule = imp.load_source(moduleName, full)
sys.path.remove(dir)
return submodule
Then, when I use the import function, the returned value gets assigned to a variable with the same name I gave the module (all very strange):
SaveHandler = projectRelativeImport('inputSaveHandler', 'app/util', 'SaveHandler')
I got to this because it worked without a problem from the file projectRelativeImport was defined in, but not from any others. That made it look like a scope issue, so I figured I'd just try returning whatever imp gave me, and it worked.
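For what it's worth, the underlying reason is that imp.load_source() returns the module object and registers it in sys.modules under the name you give it, but it never binds that name in the caller's namespace; an import statement is what does that binding. So as an alternative to binding the return value, this should also work (a sketch, not tested here):

import sys

projectRelativeImport('inputSaveHandler', 'app/util', 'SaveHandler')
# imp.load_source registered the module under 'SaveHandler' in sys.modules
SaveHandler = sys.modules['SaveHandler']
handler = SaveHandler.ConfHandler()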
I have multiple files with a structure like this file, example.py:
def initialize(context):
    pass

def daj_omacku_teplu(context, data):
    pass

def hmataj_pomaly(context, data):
    pass

def chvatni_paku(context, data):
    pass

def mikaj_laktom(context, data):
    pass
and I need to be able to dynamically import methods from "example.py" in a different python file like:
for fn in os.listdir('.'):
    if os.path.isfile(fn):
        from fn import mikaj_laktom
        mikaj_laktom(example_context, sample_data)
For multiple reasons, I cannot change the structure of example.py, so I need a mechanism to load these methods and evaluate them. I tried to use importlib, but it can only import a class, not a file with only methods defined.
Thanks for the help.
The Python import statement does not support importing by path, so you will need to have the files accessible as modules (see sys.path). Assuming for now that your sources are located in the same folder as the main script, I would use the following (or similar):
import sys

def load_module(module):
    # module_path = "mypackage.%s" % module
    module_path = module

    if module_path in sys.modules:
        return sys.modules[module_path]

    return __import__(module_path, fromlist=[module])

# Main script here... Could be your for loop or anything else.
# `m` is a reference to the imported module that contains the functions.
m = load_module("example")
m.mikaj_laktom(None, [])
The source files can also be part of another package, in which case you will need an __init__.py in the same folder as the .py files (see packages), and you import with the "mypackage.module" notation. (Note that the top-level folder should be on your path; in the above example this is the folder containing "mypackage".)
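For that package case, the layout might look like this (illustrative names; the commented-out module_path line in load_module() above is the part that changes):

# Illustrative layout, with the folder containing "mypackage" on sys.path:
#
#   project/
#       main.py            # the script calling load_module()
#       mypackage/
#           __init__.py    # marks "mypackage" as a package
#           example.py     # the module containing mikaj_laktom() etc.
#
# In load_module(), use:  module_path = "mypackage.%s" % module
m = load_module("example")
m.mikaj_laktom(None, [])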
UPDATE:
As pointed out by @skyking, there are libraries that can help you do the same thing; see this post.
My comment on __init__.py is outdated, since things have changed in py3. See this post for a more detailed explanation.
You were on the right track with importlib. It can be used to load modules by name; however, I do not think you can load them into the global namespace this way (as in from module import function), so you need to load them as module objects and call the required method:
import glob, importlib, os, pathlib, sys

# The directory containing your modules needs to be on the search path.
MODULE_DIR = '/path/to/modules'
sys.path.append(MODULE_DIR)

# Get the stem names (file name, without directory and '.py') of any
# python files in your directory, load each module by name and run
# the required function.
py_files = glob.glob(os.path.join(MODULE_DIR, '*.py'))
for py_file in py_files:
    module_name = pathlib.Path(py_file).stem
    module = importlib.import_module(module_name)
    module.mikaj_laktom()
Also, be careful using '.' as your MODULE_DIR, as this will presumably try to load the current python file as well, which might cause some unexpected behaviour.
Edit: if you are using Python 2, you won't have pathlib in the standard library, so use
module_name = os.path.splitext(os.path.split(py_file)[1])[0]
to get the equivalent of Path.stem.
I have a python 2.6 Django app which has a folder structure like this:
/foo/bar/__init__.py
I have another couple directories on the filesystem full of python modules like this:
/modules/__init__.py
/modules/module1/__init__.py
/other_modules/module2/__init__.py
/other_modules/module2/file.py
Each module __init__ has a class. For example module1Class() and module2Class() respectively. In module2, file.py contains a class called myFileClass().
What I would like to do is put some code in /foo/bar/__init__.py so I can import in my Django project like this:
from foo.bar.module1 import module1Class
from foo.bar.module2 import module2Class
from foo.bar.module2.file import myFileClass
The list of directories which have modules is contained in a tuple in a Django config which looks like this:
module_list = ("/modules", "/other_modules",)
I've tried using __import__ and vars() to dynamically generate variables like this:
import os
import sys

for m in module_list:
    sys.path.insert(0, m)
    for d in os.listdir(m):
        if os.path.isdir(d):
            vars()[d] = getattr(__import__(m.split("/")[-1], fromlist=[d]), d)
But that doesn't seem to work. Is there any way to do this?
Thanks!
I can see at least one problem with your code. The line...
if os.path.isdir(d):
...won't work, because os.listdir() returns relative pathnames. You'll need to convert them to absolute pathnames; otherwise os.path.isdir() will return False because the path doesn't exist relative to the current working directory, rather than raising an exception (which would make more sense, IMO).
The following code works for me...
import sys
import os

# Directories to search for packages
root_path_list = ("/modules", "/other_modules",)

# Make a backup of sys.path
old_sys_path = sys.path[:]

# Add all paths to sys.path first, in case one package imports from another
for root_path in root_path_list:
    sys.path.insert(0, root_path)

# Add new packages to current scope
for root_path in root_path_list:
    filenames = os.listdir(root_path)
    for filename in filenames:
        full_path = os.path.join(root_path, filename)
        if os.path.isdir(full_path):
            locals()[filename] = __import__(filename)

# Restore sys.path
sys.path[:] = old_sys_path

# Clean up locals
del sys, os, root_path_list, old_sys_path, root_path, filenames, filename, full_path
Update
Thinking about it, it might be safer to check for the presence of __init__.py rather than using os.path.isdir(), in case you have subdirectories which don't contain such a file; otherwise the __import__() will fail.
So you could change the lines...
full_path = os.path.join(root_path, filename)
if os.path.isdir(full_path):
    locals()[filename] = __import__(filename)
...to...
full_path = os.path.join(root_path, filename, '__init__.py')
if os.path.exists(full_path):
    locals()[filename] = __import__(filename)
...but it might be unnecessary.
We wound up biting the bullet and changing how we do things. Now the list of directories in which to find modules is passed in the Django config, and each one is added to sys.path (similar to a comment Aya made, and something I did before but wasn't too happy with). Then, for each module inside those directories, we check for an __init__.py and, if it exists, attempt to treat it as a module to use inside the app without the foo.bar piece.
This required some adjustment on how we interact with the modules and how developers code their modules (they now need to use relative imports within their module instead of the full path imports they used before) but I think this will be an easier design for developers to use long-term.
We didn't add these to INSTALLED_APPS because we do some exception handling where if we cannot import a module due to dependency issues or bad code our software will continue running just without that module. If they were in INSTALLED_APPS we wouldn't be able to leverage that flexibility on when/how to deal with those exceptions.
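A minimal sketch of that approach (illustrative only; the real code takes the directory list from our Django settings and does more careful error reporting):

import os
import sys

def load_optional_modules(module_dirs):
    """Best-effort loader: any module that fails to import is simply skipped."""
    loaded = {}
    for module_dir in module_dirs:
        sys.path.insert(0, module_dir)
        for name in os.listdir(module_dir):
            # Only treat subdirectories containing an __init__.py as modules
            if not os.path.exists(os.path.join(module_dir, name, '__init__.py')):
                continue
            try:
                loaded[name] = __import__(name)
            except Exception:
                # Bad code or missing dependencies: keep running without it
                pass
    return loaded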
Thanks for all of the help!
I am writing a minimal replacement for mod_python's publisher.py
The basic premise is that it is loading modules based on a URL scheme:
/foo/bar/a/b/c/d
Whereby /foo/ might be a directory and 'bar' is a method ExposedBar in a publishable class in /foo/index.py. Likewise /foo might map to /foo.py and bar is a method in the exposed class. The semantics of this aren't really important. I have a line:
sys.path.insert(0, path_to_file) # /var/www/html/{bar|foo}
mod_obj = __import__(module_name)
mod_obj.__name__ = req.filename
Then the module is inspected for the appropriate class/functions/methods. When the process gets as far as it can, the remaining URI data, /a/b/c, is passed to that method or function.
This was working fine until I had /var/www/html/foo/index.py and /var/www/html/bar/index.py
When viewing in the browser, it is fairly random which 'index.py' gets selected, even though I set the first search path to '/var/www/html/foo' or '/var/www/html/bar' and then loaded __import__('index'). I have no idea why it finds either one, seemingly at random. This is shown by:
__name__ is "/var/www/html/foo/index.py"
req.filename is "/var/www/html/foo/index.py"
__file__ is "/var/www/html/bar/index.py"
The question, then, is why would __import__ randomly select either index? I would understand this if the path were '/var/www/html', but it isn't. Secondly:
Can I load a module by its absolute path into a module object, without modifying sys.path? I can't find any docs on __import__ or new.module() for this.
Can I load a module by its absolute path into a module object? Without modification of sys.path. I can't find any docs on __import__ or new.module() for this.
import imp
import os

def module_from_path(path):
    filename = os.path.basename(path)
    modulename = os.path.splitext(filename)[0]
    with open(path) as f:
        return imp.load_module(modulename, f, path, ('py', 'U', imp.PY_SOURCE))
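For completeness, on Python 3 (where imp is deprecated) the usual way to load a module from an absolute path is importlib; a minimal sketch, assuming Python 3.5+:

import importlib.util

def module_from_path(path, modulename):
    # Build a spec from the file location, create the module and execute it
    spec = importlib.util.spec_from_file_location(modulename, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module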