Iterate on modules in Python

So I have a nested folder in which I have modules that perform some action.
Note: they are not classes, just scripts.
I would like to iterate on those modules.
What I have now:
from scripts.module_1 import train_module_1
from scripts.module_2 import train_module_2
from scripts.module_3 import train_module_3
from scripts.module_4 import train_module_4
def test_train_module_1():
    try:
        train_module_1.main('test.csv')
    except ValueError as value_error:
        assert False, "test_train_module_1 failed: " + str(value_error)
...
The same for all train modules
This is how my directory looks; my code is written in my_test.py:
tests
    my_test.py
scripts
    module_1
        __init__.py
        train_module_1.py
        module_1_blabla.py
    module_2
        __init__.py
        train_module_2.py
        module_2_blabla.py
    ...
I wonder if I can somehow iterate over those modules, take only the files in each module that start with "train_", and run the main function of each. I roughly know how to do it, but I haven't found a good solution for this kind of iteration.
I need to discover the modules under scripts dynamically, so that if someone adds a module I won't need to change the code here.
Is there something like that:
for i in scripts.children():
    for j in i.children():
        if j.__name__.startswith('train_'):
            try:
                j.main(f'{j.__name__}_test.csv')
            except ValueError as value_error:
                assert False, f'test_{j.__name__} failed: {value_error}'
Thanks in advance

Yes, there are several approaches, depending on your exact needs.
You could, for instance, get a list of the module names in your directory and then import them using the built-in function __import__() like so:
for module_name in module_names:
    mod = __import__(module_name)
    mod.main(module_name + "_test.csv")
If, on the other hand, you have already imported the modules, you can find them by looking at sys.modules (which is a dictionary of all currently imported modules).
import sys
# note: keys in sys.modules are fully qualified, e.g.
# 'scripts.module_1.train_module_1', so test the last dotted component
for name in list(sys.modules):
    short = name.rsplit('.', 1)[-1]
    if short.startswith("train_"):
        mod = sys.modules[name]
        mod.main(short + "_test.csv")
UPDATE: Here is a more complete version that goes through a directory structure, finds all the Python modules that start with train_, imports them, and executes their main function.
import os
for entry in os.scandir('.'):
    if entry.is_dir():
        for file in os.scandir(entry.path):
            if file.name.startswith('train_') and file.name.endswith('.py'):
                name = file.name[:-3]  # without the '.py' at the end
                package = __import__(entry.name + '.' + name)
                mod = getattr(package, name)
                mod.main()
Note that the __import__ function returns the top-level package of the dotted name, so we have to retrieve the submodule we want through getattr() first.
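For completeness, the discovery step can also be done without filesystem scanning at all, using pkgutil plus importlib. This is only a sketch: run_train_modules is a made-up helper name, and the scripts layout is assumed from the question (the demonstration below exercises the stdlib json package instead):

```python
import importlib
import pkgutil

def run_train_modules(package_name, prefix='train_'):
    """Import every module under `package_name` whose final name
    component starts with `prefix`; return the imported modules."""
    pkg = importlib.import_module(package_name)
    found = []
    # walk_packages recurses into sub-packages such as scripts.module_1
    for info in pkgutil.walk_packages(pkg.__path__, prefix=package_name + '.'):
        short_name = info.name.rsplit('.', 1)[-1]
        if short_name.startswith(prefix):
            found.append(importlib.import_module(info.name))
    return found

# With the question's layout this would be something like:
# for mod in run_train_modules('scripts'):
#     mod.main(mod.__name__.rsplit('.', 1)[-1] + '_test.csv')
```

Because walk_packages imports sub-packages as it recurses, any module someone adds later is picked up automatically.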

You need two different things:
import some modules, and only one specific name from each
execute the main function from them
The importlib module can help with the first part:
import importlib

train = {}
for i in '1234':
    # import the train_module submodule directly; getattr on the package
    # would only work if its __init__.py already imported it
    train[i] = importlib.import_module(f'scripts.module_{i}.train_module_{i}')
You can then easily invoke them:
for i, t in train.items():
    try:
        t.main('test.csv')
    except ValueError as value_error:
        assert False, f"test_train_module_{i} failed: {value_error}"


name ' ' is not defined

I have a function which is stored in builtins. It is used to load Python modules with relative paths from the project's base directory. The project's base directory is stored under builtins.absolute.
Function below:
import builtins
import imp
import os
import sys

def projectRelativeImport(fileName, projectRelativePath, moduleName=None):
    # if moduleName is not set, set it to the file name with the first letter capitalised
    if moduleName is None:
        moduleName = fileName[:1].capitalize() + fileName[1:]
    # we shouldn't be passing fileName with an extension unless moduleName is
    # set, due to the previous if; so in those cases we add .py
    if len(fileName) >= 3 and fileName[-3:] != '.py':
        fileName = fileName + '.py'
    dir = os.path.join(builtins.absolute, projectRelativePath)
    full = os.path.join(dir, fileName)
    sys.path.append(dir)
    imp.load_source(moduleName, full)
    sys.path.remove(dir)
In one of my other files I use projectRelativeImport('inputSaveHandler', 'app/util', 'SaveHandler') to import SaveHandler from app/util/inputSaveHandler.py. This runs through projectRelativeImport absolutely fine; the correct strings are being passed to imp, I've printed them to check.
But a couple of lines after that call I have the line
handler = SaveHandler.ConfHandler()
which throws NameError: name 'SaveHandler' is not defined.
I realise my project-relative import function is a bit odd, especially since I have it globally saved using builtins (there's probably a better way, but I only started using Python in the last two days). But I'm just a bit confused as to why the name isn't being recognised. Do I need to return something from imp, since the scope goes wrong because projectRelativeImport is defined in a different file?
I fixed this by returning from projectRelativeImport() what imp.load_source passed back, as shown below:
    sys.path.append(dir)
    submodule = imp.load_source(moduleName, full)
    sys.path.remove(dir)
    return submodule
Then, when I use the import function, the returned value goes into a variable with the same name I gave the module (all very strange):
SaveHandler = projectRelativeImport('inputSaveHandler', 'app/util', 'SaveHandler')
I got to this because it worked fine from the file projectRelativeImport was defined in, but not from any others. So it was clearly a scope issue to me; I figured I'd just try returning whatever imp gave me, and it worked.
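As an aside, imp is deprecated (and removed in Python 3.12), so the same fix can be expressed with importlib.util instead. A sketch, where load_source is my own helper name mirroring imp.load_source, not a library function:

```python
import importlib.util

def load_source(module_name, path):
    # Build a module spec from the file path, create an empty module
    # from it, then execute the file's code inside that module.
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module  # return it, just like the fixed projectRelativeImport
```

No sys.path manipulation is needed, since the loader works from the file path directly.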

Python dynamic import methods from file [duplicate]

I have multiple files, each with a structure like this file example.py:
def initialize(context):
    pass
def daj_omacku_teplu(context, data):
    pass
def hmataj_pomaly(context, data):
    pass
def chvatni_paku(context, data):
    pass
def mikaj_laktom(context, data):
    pass
and I need to be able to dynamically import methods from "example.py" in a different Python file, like:
for fn in os.listdir('.'):
    if os.path.isfile(fn):
        from fn import mikaj_laktom
        mikaj_laktom(example_context, sample_data)
For multiple reasons, I cannot change the structure of example.py, so I need a mechanism to load these methods and evaluate them. I tried to use importlib, but it can only import a class, not a file with only functions defined.
Thanks for the help.
Python import does not support importing using paths, so you will need to have the files accessible as modules (see sys.path). Assuming for now that your sources are located in the same folder as the main script, I would use the following (or similar):
import sys

def load_module(module):
    # module_path = "mypackage.%s" % module
    module_path = module
    if module_path in sys.modules:
        return sys.modules[module_path]
    return __import__(module_path, fromlist=[module])

# Main script here... Could be your for loop or anything else.
# `m` is a reference to the imported module that contains the functions.
m = load_module("example")
m.mikaj_laktom(None, [])
The source files can also be part of another package, in which case you will need an __init__.py in the same folder as the .py files (see packages), and you import with the "mypackage.module" notation. (Note that the top-level folder should be on your path; in the above example this is the folder containing "mypackage".)
UPDATE:
As pointed out by @skyking, there are libraries that can help you do the same thing. See this post.
My comment on __init__.py is outdated, since things have changed in Python 3. See this post for a more detailed explanation.
You were on the right track with importlib. It can be used to load modules by name, however I do not think you can load them into the global namespace in this way (as in from module import function). So you need to load them as module objects and call your required method:
import glob, importlib, os, pathlib, sys

# The directory containing your modules needs to be on the search path.
MODULE_DIR = '/path/to/modules'
sys.path.append(MODULE_DIR)

# Get the stem names (file name, without directory and '.py') of any
# Python files in your directory, load each module by name and run
# the required function.
py_files = glob.glob(os.path.join(MODULE_DIR, '*.py'))
for py_file in py_files:
    module_name = pathlib.Path(py_file).stem
    module = importlib.import_module(module_name)
    module.mikaj_laktom()
Also, be careful using '.' as your MODULE_DIR, as this will presumably try to load the current python file as well, which might cause some unexpected behaviour.
Edit: if you are using Python 2, you won't have pathlib in the standard library, so use
module_name = os.path.splitext(os.path.split(py_file)[1])[0]
to get the equivalent of Path.stem.

Python relative __import__

Suppose I have a module package containing the following files. An empty file C:\codes\package\__init__.py and some non-trivial files:
One located in C:\codes\package\first.py
def f():
    print 'a'
Another located in C:\codes\package\second.py
def f():
    print 'b'
There is also a third file: C:\codes\package\general.py with the following code
def myPrint(module_name):
    module = __import__(module_name)
    module.f()

if __name__ == '__main__':
    myPrint('first')
    myPrint('second')
When I run the latter file, everything goes fine. However, if I try to execute the file C:\codes\test.py containing
if __name__ == '__main__':
    from package import general
    general.myPrint('first')
    general.myPrint('second')
I get the import error ImportError: No module named first. How to resolve this issue?
First, I suspect you forgot to mention you have a (possibly empty) file package\__init__.py, which makes package a package. Otherwise, from package import general wouldn't work.
The second case differs from the first insofar as you are inside a package. From inside a package, you wouldn't do import first, but a relative import such as from . import first. The equivalent for __import__ is described here: you either add level=1 as a parameter, or (though I am not sure about this) you put .first into the string and set level to -1 (if that isn't the default anyway; it's not clear from the documentation).
Additionally, you have to provide at least globals(), so the right line is
module = __import__(module_name, globals(), level=1)
I have found this solution here.
In your case, you should import your module_name from package. Use the fromlist argument:
getattr(__import__("package", fromlist=[module_name]), module_name)
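To make that concrete, here is the same fromlist trick run against a stdlib package (json standing in for the question's package, decoder for first):

```python
# __import__ with fromlist imports the submodule, but still returns the
# top-level package, so getattr() pulls the submodule off it.
submodule = getattr(__import__('json', fromlist=['decoder']), 'decoder')
print(submodule.__name__)  # json.decoder
```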
Assuming you're using Python 3, that's just because this version dropped support for implicit relative imports. With Python 2 it would work just fine.
So either you'd need to use relative imports in C:\codes\package\general.py, which would then break running it directly as a script, or add your package to the path. A little dirty but working hack would be:
import os
import sys

def myPrint(module_name):
    pkg = os.path.dirname(__file__)
    sys.path.insert(0, pkg)
    try:
        module = __import__(module_name)
    finally:
        sys.path.remove(pkg)
    module.f()
Maybe you can achieve a cleaner implementation with the importlib module.
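A sketch of that cleaner importlib version (the package name here is an assumption; importlib.import_module resolves the leading dot itself, so no sys.path juggling is needed):

```python
import importlib

def myPrint(module_name, package='package'):
    # A leading '.' makes import_module resolve module_name relative
    # to `package`, which is what __import__ needed level= for.
    module = importlib.import_module('.' + module_name, package=package)
    module.f()
```

Inside general.py you could pass package=__package__ instead of hard-coding the name.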

Reloading packages (and their submodules) recursively in Python

In Python you can reload a module as follows...
import foobar
import importlib
importlib.reload(foobar)
This works for .py files, but for Python packages it will only reload the package and not any of the nested sub-modules.
With a package:
foobar/__init__.py
foobar/spam.py
foobar/eggs.py
Python Script:
import foobar
# assume `foobar/__init__.py` is importing `.spam`,
# so we don't need an explicit import.
print(foobar.spam)  # ok

import importlib
importlib.reload(foobar)
# foobar.spam WON'T be reloaded.
Not to suggest this is a bug, but there are times it's useful to reload a package and all its submodules (if you want to edit a module while a script runs, for example).
What are some good ways to recursively reload a package in Python?
Notes:
For the purpose of this question, assume the latest Python 3.x (currently using importlib).
Allow that this may require some edits to the modules themselves.
Assume that wildcard imports aren't used (from foobar import *), since they may complicate reload logic.
Here's a function that recursively reloads a package.
I double-checked that the reloaded modules are updated in the modules where they are used, and that issues with infinite recursion are checked for.
One restriction is that it needs to be run on a package (which only makes sense for packages anyway).
import os
import types
import importlib

def reload_package(package):
    assert hasattr(package, "__package__")
    fn = package.__file__
    fn_dir = os.path.dirname(fn) + os.sep
    module_visit = {fn}
    del fn

    def reload_recursive_ex(module):
        importlib.reload(module)
        for module_child in vars(module).values():
            if isinstance(module_child, types.ModuleType):
                fn_child = getattr(module_child, "__file__", None)
                if (fn_child is not None) and fn_child.startswith(fn_dir):
                    if fn_child not in module_visit:
                        # print("reloading:", fn_child, "from", module)
                        module_visit.add(fn_child)
                        reload_recursive_ex(module_child)

    return reload_recursive_ex(package)
# example use
import os
reload_package(os)
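A simpler, name-based sketch of the same idea (my own variant, not the answer's code): scan sys.modules for everything under the package and reload the deepest dotted names first, so parents are reloaded after their children:

```python
import importlib
import sys

def reload_by_name(package_name):
    # Every already-imported module under the package, by dotted name.
    names = [n for n in sys.modules
             if n == package_name or n.startswith(package_name + '.')]
    # Deepest names first, the package itself last.
    for name in sorted(names, key=lambda n: n.count('.'), reverse=True):
        module = sys.modules.get(name)
        if module is not None:  # skip None placeholder entries
            importlib.reload(module)
```

Unlike the recursive walk, this also catches submodules that aren't referenced as attributes of their parent, but it orders by name depth rather than by the actual dependency graph.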
I'll offer another answer for the case in which you want to reload only a specific nested module. I found this useful for situations where I was editing a single nested submodule, and reloading all nested submodules via a solution like ideasman42's approach or deepreload would produce undesired behaviour.
assuming you want to reload a module into the workspace below
my_workspace.ipynb
import importlib
import my_module
import my_other_module_that_I_dont_want_to_reload
print(my_module.test()) #old result
importlib.reload(my_module)
print(my_module.test()) #new result
but my_module.py looks like this:
import my_nested_submodule
def test():
    my_nested_submodule.do_something()
and you just made an edit in my_nested_submodule.py:
def do_something():
    print('look at this cool new functionality!')
You can manually force my_nested_submodule, and only my_nested_submodule, to be reloaded by adjusting my_module.py so it looks like the following:
import my_nested_submodule
import importlib
importlib.reload(my_nested_submodule)

def test():
    my_nested_submodule.do_something()
I've updated the answer from @ideasman42 to always reload modules from the bottom of the dependency tree first. Note that it will raise an error if the dependency graph is not a tree (i.e. contains cycles), as I don't think it is possible to cleanly reload all modules in that case.
import importlib
import os
import types
import pathlib

def get_package_dependencies(package):
    assert hasattr(package, "__package__")
    fn = package.__file__
    fn_dir = os.path.dirname(fn) + os.sep
    node_set = {fn}  # set of module filenames
    node_depth_dict = {fn: 0}  # tracks the greatest depth that we've seen for each node
    node_pkg_dict = {fn: package}  # mapping of module filenames to module objects
    link_set = set()  # tuples of (parent module filename, child module filename)
    del fn

    def dependency_traversal_recursive(module, depth):
        for module_child in vars(module).values():
            # skip anything that isn't a module
            if not isinstance(module_child, types.ModuleType):
                continue
            fn_child = getattr(module_child, "__file__", None)
            # skip anything without a filename or outside the package
            if (fn_child is None) or (not fn_child.startswith(fn_dir)):
                continue
            # have we seen this module before? if not, add it to the database
            if fn_child not in node_set:
                node_set.add(fn_child)
                node_depth_dict[fn_child] = depth
                node_pkg_dict[fn_child] = module_child
            # set the depth to the deepest depth we've encountered the node at
            node_depth_dict[fn_child] = max(depth, node_depth_dict[fn_child])
            # have we visited this child module from this parent module before?
            if (module.__file__, fn_child) not in link_set:
                link_set.add((module.__file__, fn_child))
                dependency_traversal_recursive(module_child, depth + 1)
            else:
                raise ValueError("Cycle detected in dependency graph!")

    dependency_traversal_recursive(package, 1)
    return (node_pkg_dict, node_depth_dict)

# example use
import collections
node_pkg_dict, node_depth_dict = get_package_dependencies(collections)
for (d, v) in sorted([(d, v) for v, d in node_depth_dict.items()], reverse=True):
    print("Reloading %s" % pathlib.Path(v).name)
    importlib.reload(node_pkg_dict[v])

How to load modules dynamically on package import?

Given the following example layout:
test/
    test.py
    formats/
        __init__.py
        format_a.py
        format_b.py
What I am trying to achieve is that whenever I import formats, the __init__.py looks for all available modules in the formats subdir, loads them, and makes them available (right now simply through a variable, supported_formats). If there's a better, more Pythonic or otherwise preferable approach to dynamically loading things at runtime, based on the physically available files, please tell me.
My Approach
I tried something like this (in __init__.py):
supported_formats = [__import__(f[:f.index('.py')]) for f in glob.glob('*.py')]
So far I can only get it to work when I run __init__.py from the command line (from the formats subdir or from other directories). But when I import it from test.py, it bails on me like this:
ImportError: No module named format_a.py
The same happens when I import it from the Python interpreter, if I started the interpreter in a directory other than the formats subdir.
Here's the whole code. It also looks for a specific class and stores one instance of each class in a dict, but loading the modules dynamically is the main part I don't get:
def dload(get_cls=True, get_mod=True, key=None, fstring_mod='*.py', fstring_class=''):
    if p.dirname(__file__):
        path = p.split(p.abspath(__file__))[0]
        fstring_mod = p.join(path, fstring_mod)
    print >> sys.stderr, 'Path-Glob:', fstring_mod
    modules = [p.split(fn)[1][:fn.index('.py')] for fn in glob.glob(fstring_mod)]
    print >> sys.stderr, 'Modules:', ', '.join(modules)
    modules = [__import__(m) for m in modules]
    if get_cls:
        classes = {} if key else []
        for m in modules:
            print >> sys.stderr, "-", m
            for c in [m.__dict__[c]() for c in m.__dict__ if c.startswith(fstring_class)]:
                print >> sys.stderr, " ", c
                if key:
                    classes[getattr(c, key)] = c
                else:
                    classes.append(c)
        if get_mod:
            return (modules, classes)
        else:
            return classes
    elif get_mod:
        return modules

_supported_formats = dload(get_mod=False, key='fid', fstring_mod='format_*.py', fstring_class='Format')
My Idea
The whole messing with filesystem paths and the like is probably misguided anyway. I would like to handle this with module namespaces or something similar, but I'm kind of lost right now on how to start and how to address the modules so they're reachable from anywhere.
There are two fixes you need to make to your code:
You should call __import__(m, globals(), locals()) instead of __import__(m). This is needed for Python to locate the modules within the package.
Your code doesn't remove the .py extension properly since you call index() on the wrong string. If it will always be a .py extension, you can simply use p.split(fn)[1][:-3] instead.
First you must make it so that your code works regardless of the current working directory. For that you use the __file__ variable. You should also use absolute imports.
So something like (untested):
supported_formats = {}
for fn in os.listdir(os.path.dirname(__file__)):
if fn.endswith('.py'):
exec ("from formats import %s" % fn[:-3]) in supported_formats
A module is searched for in sys.path, so you should be able to extend sys.path with the path to your modules. I'm also not really sure whether you can load a module on sys.path with a 'module.py' convention; I would think without '.py' is preferred.
This is obviously not a solution, but may be handy nonetheless.
I thought that if you did something like that, 'formats' would be your package, so when you tell it import formats you should be able to access the rest of the modules inside that package; you would have something like formats.format_a.your_method.
Not sure though, I'm just a n00b.
Here's the code I came up with after the corrections from interjay. Still not sure if this is good style.
import glob
import os

def load_modules(filemask='*.py', ignore_list=('__init__.py', )):
    modules = {}
    dirname = os.path.dirname(__file__)
    if dirname:
        filemask = os.path.join(dirname, filemask)
    for fn in glob.glob(filemask):
        fn = os.path.split(fn)[1]
        if fn in ignore_list:
            continue
        fn = os.path.splitext(fn)[0]
        modules[fn] = __import__(fn, globals(), locals())
    return modules
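For comparison, a shorter modern take on the same idea using pkgutil instead of glob; load_format_modules is a name I made up, and the demonstration below runs it against a stdlib package rather than the formats layout:

```python
import importlib
import pkgutil

def load_format_modules(package):
    # Discover every module sitting in the package's own directory
    # and import it, keyed by its short (unqualified) name.
    return {
        info.name: importlib.import_module(package.__name__ + '.' + info.name)
        for info in pkgutil.iter_modules(package.__path__)
    }
```

In formats/__init__.py this could be invoked as supported_formats = load_format_modules(sys.modules[__name__]), with no filesystem globbing or extension stripping needed.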
