Variables from a function to the current file - Python

1. Summary
I can't figure out how to load variables from YAML files into Python files without duplicating code.
2. Purpose
I use Pelican, a static site generator. It uses .py files for configuration. Problems:
I can't reuse variables from .py files in JavaScript
the import * antipattern, which is used even in the official Pelican blog
When I try to move the configuration to YAML files, I run into the problem described in this question.
3. MCVE
3.1. Files
Live demo on Repl.it
main.py:
"""First Python file."""
# [INFO] Using ruamel.yaml — superset of PyYAML:
# https://stackoverflow.com/a/38922434/5951529
import ruamel.yaml as yaml
SETTINGS_FILES = ["kira.yaml", "kristina.yaml"]
for setting_file in SETTINGS_FILES:
    VARIABLES = yaml.safe_load(open(setting_file))
    # [INFO] Convert Python dictionary to variables:
    # https://stackoverflow.com/a/36059129/5951529
    locals().update(VARIABLES)

# [INFO] View all variables:
# https://stackoverflow.com/a/633134/5951529
print(dir())
publishconf.py:
"""Second Python file."""
import ruamel.yaml as yaml
# [NOTE] Another value in list
SETTINGS_FILES = ["kira.yaml", "katya.yaml"]
for setting_file in SETTINGS_FILES:
    VARIABLES = yaml.load(open(setting_file))
    locals().update(VARIABLES)

print(dir())
kira.yaml:
DECISION: Saint Petersburg
kristina.yaml:
SPAIN: Marbella
katya.yaml:
BURIED: Novoshakhtinsk
3.2. Expected behavior
DECISION and SPAIN variables in main.py:
$ python main.py
['DECISION', 'SETTINGS_FILES', 'SPAIN', 'VARIABLES', '__annotations__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__warningregistry__', 'setting_file', 'yaml']
DECISION and BURIED variables in publishconf.py:
$ python publishconf.py
['BURIED', 'DECISION', 'SETTINGS_FILES', 'VARIABLES', '__annotations__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__warningregistry__', 'setting_file', 'yaml']
3.3. Problem
Duplicate loop in main.py and publishconf.py:
for setting_file in SETTINGS_FILES:
    VARIABLES = yaml.load(open(setting_file))
    locals().update(VARIABLES)
How can I avoid duplicating this loop?
4. Not helped
4.1. Configuration file
Live demo on Repl.it
config.py:
"""Config Python file."""
# [INFO] Using ruamel.yaml — superset of PyYAML:
# https://stackoverflow.com/a/38922434/5951529
import ruamel.yaml as yaml
MAIN_CONFIG = ["kira.yaml", "kristina.yaml"]
PUBLISHCONF_CONFIG = ["kira.yaml", "katya.yaml"]
def kirafunction(pelicanplugins):
    """Function for both Python files."""
    for setting_file in pelicanplugins:
        # [INFO] Convert Python dictionary to variables:
        # https://stackoverflow.com/a/36059129/5951529
        variables = yaml.safe_load(open(setting_file))
        globals().update(variables)


def main_function():
    """For main.py."""
    kirafunction(MAIN_CONFIG)


def publishconf_function():
    """For publishconf.py."""
    kirafunction(PUBLISHCONF_CONFIG)
main.py:
"""First Python file."""
import sys
from config import main_function
sys.path.append(".")
main_function()
# [INFO] View all variables:
# https://stackoverflow.com/a/633134/5951529
print(dir())
publishconf.py:
"""Second Python file."""
import sys
from config import publishconf_function
sys.path.append(".")
publishconf_function()
print(dir())
Variables from main_function and publishconf_function are not shared across files:
$ python main.py
['__annotations__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'main_function', 'signal', 'sys']
4.2. Other attempts
Wrapping the loop in a function, as in this example:
def kirafunction():
    """Docstring."""
    for setting_file in SETTINGS_FILES:
        VARIABLES = yaml.safe_load(open(setting_file))
        locals().update(VARIABLES)


kirafunction()
Using the global keyword (see the sketch after this list)
“I think editing locals() like that is generally a bad idea. If you think globals() is a better alternative, think it twice!”
Search in Stack Overflow questions
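For reference, here is roughly what the global/globals() attempt looks like — a sketch assuming the config.py layout from section 4.1. It still only binds names in config.py's own namespace, so main.py does not see them after import:
# config.py (sketch of the attempt; files and names are the ones from section 4.1)
import ruamel.yaml as yaml

SETTINGS_FILES = ["kira.yaml", "kristina.yaml"]

def kirafunction():
    """Load YAML values into *this* module's global namespace."""
    for setting_file in SETTINGS_FILES:
        for key, value in yaml.safe_load(open(setting_file)).items():
            # Equivalent to a dynamic `global` binding for each key,
            # but it only affects config.py, not the importing module.
            globals()[key] = value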

I would avoid any update to what locals returns, because the documentation explicitly states:
Note
The contents of this dictionary should not be modified; changes may not affect the values of local and free variables used by the interpreter.
The globals function returns a dictionary simply containing the attributes of a module, and the mapping returned by globals is indeed writable.
So if this exists in one Python source:
def kirafunction(map, settings):
    # [NOTE] Another value in list
    for setting_file in settings:
        VARIABLES = yaml.load(open(setting_file))
        map.update(VARIABLES)
This can be used from any other Python source after importing the above function:
kirafunction(globals(), settings)
and it will import the variables into the globals dictionary of the calling module. It will also be highly non-Pythonic...
A slightly more Pythonic way would be to dedicate one Python module to holding both the code loading the yaml files and the new variables:
loader.py:
import ruamel.yaml as yaml
SETTINGS_FILES = ["kira.yaml", "kristina.yaml"]
for setting_file in SETTINGS_FILES:
    VARIABLES = yaml.safe_load(open(setting_file))
    # [INFO] Convert Python dictionary to variables:
    # https://stackoverflow.com/a/36059129/5951529
    globals().update(VARIABLES)
Then from any other Python module you can use:
...
import loader # assuming that it is in sys.path...
...
print(loader.DECISION)
print(dir(loader))
But it is still uncommon and would require comments to explain the rationale for it.
After reading the Pelican config example from your comment, I assume that what you need is a way to import, in different scripts, a bunch of variables declared in yaml files. In that case I would put the code loading the variables in one module, and update the globals() dictionary in the other modules:
loader.py:
import ruamel.yaml as yaml
MAIN_CONFIG = ["kira.yaml", "kristina.yaml"]
PUBLISHCONF_CONFIG = ["kira.yaml", "katya.yaml"]
def kirafunction(pelicanplugins):
    """Function for both Python files."""
    variables = {}
    for setting_file in pelicanplugins:
        # [INFO] Convert Python dictionary to variables:
        # https://stackoverflow.com/a/36059129/5951529
        variables.update(yaml.safe_load(open(setting_file)))
    return variables
Then for example in publishconf.py you would use:
from loader import kirafunction, PUBLISHCONF_CONFIG as pelican_config
# other Python code...
# import variables from the yaml files defined in PUBLISHCONF_CONFIG
# because Pelican expects them as plain Python module variables
globals().update(kirafunction(pelican_config))
Again, updating globals() is probably appropriate in this use case, but is generally frowned upon, hence the comment.
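For symmetry, main.py can use the same loader — a short sketch, assuming the loader.py defined above:
from loader import kirafunction, MAIN_CONFIG as pelican_config

# Pelican reads plain module-level variables, so merge the values loaded
# from the YAML files into this module's globals, as in publishconf.py.
globals().update(kirafunction(pelican_config))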

Related

Load dynamically a file, and run a function inside [duplicate]

How do I load a Python module given its full path?
Note that the file can be anywhere in the filesystem where the user has access rights.
See also: How to import a module given its name as string?
For Python 3.5+ use (docs):
import importlib.util
import sys
spec = importlib.util.spec_from_file_location("module.name", "/path/to/file.py")
foo = importlib.util.module_from_spec(spec)
sys.modules["module.name"] = foo
spec.loader.exec_module(foo)
foo.MyClass()
For Python 3.3 and 3.4 use:
from importlib.machinery import SourceFileLoader
foo = SourceFileLoader("module.name", "/path/to/file.py").load_module()
foo.MyClass()
(Although this has been deprecated in Python 3.4.)
For Python 2 use:
import imp
foo = imp.load_source('module.name', '/path/to/file.py')
foo.MyClass()
There are equivalent convenience functions for compiled Python files and DLLs.
See also http://bugs.python.org/issue21436.
The advantage of adding a path to sys.path (over using imp) is that it simplifies things when importing more than one module from a single package. For example:
import sys
# the mock-0.3.1 dir contains testcase.py, testutils.py & mock.py
sys.path.append('/foo/bar/mock-0.3.1')
from testcase import TestCase
from testutils import RunTests
from mock import Mock, sentinel, patch
To import your module, you need to add its directory to the module search path, either temporarily (via sys.path) or permanently (via the PYTHONPATH environment variable).
Temporarily
import sys
sys.path.append("/path/to/my/modules/")
import my_module
Permanently
Add the following line to your .bashrc (or alternative) file in Linux
and execute source ~/.bashrc (or alternative) in the terminal:
export PYTHONPATH="${PYTHONPATH}:/path/to/my/modules/"
Credit/Source: saarrrr, another Stack Exchange question
If your top-level module is not a file but is packaged as a directory with __init__.py, then the accepted solution almost works, but not quite. In Python 3.5+ the following code is needed (note the added line that begins with 'sys.modules'):
MODULE_PATH = "/path/to/your/module/__init__.py"
MODULE_NAME = "mymodule"
import importlib.util
import sys
spec = importlib.util.spec_from_file_location(MODULE_NAME, MODULE_PATH)
module = importlib.util.module_from_spec(spec)
sys.modules[spec.name] = module
spec.loader.exec_module(module)
Without this line, when exec_module is executed, it tries to bind relative imports in your top level __init__.py to the top level module name -- in this case "mymodule". But "mymodule" isn't loaded yet so you'll get the error "SystemError: Parent module 'mymodule' not loaded, cannot perform relative import". So you need to bind the name before you load it. The reason for this is the fundamental invariant of the relative import system: "The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former" as discussed here.
It sounds like you don't want to specifically import the configuration file (which has a whole lot of side effects and additional complications involved). You just want to run it, and be able to access the resulting namespace. The standard library provides an API specifically for that in the form of runpy.run_path:
from runpy import run_path
settings = run_path("/path/to/file.py")
That interface is available in Python 2.7 and Python 3.2+.
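For instance (a sketch; the path and the DECISION name are hypothetical), every top-level name defined in the executed file becomes a key of the returned dictionary:
from runpy import run_path

settings = run_path("/path/to/file.py")   # hypothetical settings module
print(settings["DECISION"])               # assuming file.py defines DECISION = "..."
# or merge the names into the current module, similar to the globals().update() idea above:
globals().update(settings)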
You can also do something like this and add the directory that the configuration file is sitting in to the Python load path, and then just do a normal import, assuming you know the name of the file in advance, in this case "config".
Messy, but it works.
configfile = '~/config.py'
import os
import sys
sys.path.append(os.path.dirname(os.path.expanduser(configfile)))
import config
I have come up with a slightly modified version of Sebastian Rittau's wonderful answer (for Python > 3.4, I think), which will allow you to load a file with any extension as a module using spec_from_loader instead of spec_from_file_location:
from importlib.util import spec_from_loader, module_from_spec
from importlib.machinery import SourceFileLoader
spec = spec_from_loader("module.name", SourceFileLoader("module.name", "/path/to/file.py"))
mod = module_from_spec(spec)
spec.loader.exec_module(mod)
The advantage of encoding the path in an explicit SourceFileLoader is that the machinery will not try to figure out the type of the file from the extension. This means that you can load something like a .txt file using this method, but you could not do it with spec_from_file_location without specifying the loader because .txt is not in importlib.machinery.SOURCE_SUFFIXES.
I've placed an implementation based on this, and Sam Grondahl's useful modification, into my utility library, haggis. The function is called haggis.load.load_module. It adds a couple of neat tricks, like the ability to inject variables into the module namespace as it is loaded.
You can use the
load_source(module_name, path_to_file)
method from the imp module.
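A minimal sketch of that call (the path and the DECISION attribute are hypothetical; note that imp is deprecated and was removed in Python 3.12):
import imp

config = imp.load_source('config', '/path/to/config.py')
print(config.DECISION)  # any top-level name defined in config.py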
Do you mean load or import?
You can manipulate the sys.path list specify the path to your module, and then import your module. For example, given a module at:
/foo/bar.py
You could do:
import sys
sys.path[0:0] = ['/foo'] # Puts the /foo directory at the start of your path
import bar
Here is some code that works in all Python versions, from 2.7-3.5 and probably even others.
config_file = "/tmp/config.py"
with open(config_file) as f:
    code = compile(f.read(), config_file, 'exec')
    exec(code, globals(), locals())
I tested it. It may be ugly, but so far it is the only one that works in all versions.
You can do this using __import__ and chdir:
def import_file(full_path_to_module):
    try:
        import os
        module_dir, module_file = os.path.split(full_path_to_module)
        module_name, module_ext = os.path.splitext(module_file)
        save_cwd = os.getcwd()
        os.chdir(module_dir)
        module_obj = __import__(module_name)
        module_obj.__file__ = full_path_to_module
        globals()[module_name] = module_obj
        os.chdir(save_cwd)
    except Exception as e:
        raise ImportError(e)
    return module_obj


import_file('/home/somebody/somemodule.py')
If we have scripts in the same project but in different directories, we can solve this problem with the following method.
In this situation utils.py is in src/main/util/.
import sys
sys.path.append('./')
import src.main.util.utils
#or
from src.main.util.utils import json_converter # json_converter is example method
To add to Sebastian Rittau's answer:
At least for CPython, there's pydoc, and, while not officially declared, importing files is what it does:
from pydoc import importfile
module = importfile('/path/to/module.py')
PS. For the sake of completeness, there's a reference to the current implementation at the moment of writing: pydoc.py, and I'm pleased to say that in the vein of xkcd 1987 it uses neither of the implementations mentioned in issue 21436 -- at least, not verbatim.
I believe you can use imp.find_module() and imp.load_module() to load the specified module. You'll need to split the module name off of the path, i.e. if you wanted to load /home/mypath/mymodule.py you'd need to do:
imp.find_module('mymodule', '/home/mypath/')
...but that should get the job done.
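For completeness, a sketch of the full find/load sequence — note that find_module expects the search path as a list of directories and returns an open file object that should be closed afterwards:
import imp

file_obj, pathname, description = imp.find_module('mymodule', ['/home/mypath/'])
try:
    mymodule = imp.load_module('mymodule', file_obj, pathname, description)
finally:
    if file_obj:
        file_obj.close()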
You can use the pkgutil module (specifically the walk_packages method) to get a list of the packages in the current directory. From there it's trivial to use the importlib machinery to import the modules you want:
import pkgutil
import importlib
packages = pkgutil.walk_packages(path=['.'])  # path must be a list of directories
for importer, name, is_package in packages:
    mod = importlib.import_module(name)
    # do whatever you want with the module now, it's been imported!
There's a package that's dedicated to this specifically:
from thesmuggler import smuggle
# À la `import weapons`
weapons = smuggle('weapons.py')
# À la `from contraband import drugs, alcohol`
drugs, alcohol = smuggle('drugs', 'alcohol', source='contraband.py')
# À la `from contraband import drugs as dope, alcohol as booze`
dope, booze = smuggle('drugs', 'alcohol', source='contraband.py')
It's tested across Python versions (Jython and PyPy too), but it might be overkill depending on the size of your project.
Create Python module test.py:
import sys
sys.path.append("<project-path>/lib/")
from tes1 import Client1
from tes2 import Client2
import tes3
Create Python module test_check.py:
from test import Client1
from test import Client2
from test import tes3
This way we can import a module that was itself imported by another module.
This area of Python 3.4 seems to be extremely tortuous to understand! However with a bit of hacking using the code from Chris Calloway as a start I managed to get something working. Here's the basic function.
import importlib.util
import os


def import_module_from_file(full_path_to_module):
    """
    Import a module given the full path/filename of the .py file

    Python 3.4
    """
    module = None
    try:
        # Get module name and path from full path
        module_dir, module_file = os.path.split(full_path_to_module)
        module_name, module_ext = os.path.splitext(module_file)

        # Get module "spec" from filename
        spec = importlib.util.spec_from_file_location(module_name, full_path_to_module)
        module = spec.loader.load_module()

    except Exception as ec:
        # Simple error printing
        # Insert "sophisticated" stuff here
        print(ec)

    finally:
        return module
This appears to use non-deprecated modules from Python 3.4. I don't pretend to understand why, but it seems to work from within a program. I found Chris' solution worked on the command line but not from inside a program.
I made a package that uses imp for you. I call it import_file and this is how it's used:
>>>from import_file import import_file
>>>mylib = import_file('c:\\mylib.py')
>>>another = import_file('relative_subdir/another.py')
You can get it at:
http://pypi.python.org/pypi/import_file
or at
http://code.google.com/p/import-file/
To import a module from a given filename, you can temporarily extend the path, and restore the system path in the finally block:
import os
import sys

filename = "directory/module.py"
directory, module_name = os.path.split(filename)
module_name = os.path.splitext(module_name)[0]

path = list(sys.path)
sys.path.insert(0, directory)
try:
    module = __import__(module_name)
finally:
    sys.path[:] = path  # restore
A simple solution using importlib instead of the imp package (tested for Python 2.7, although it should work for Python 3 too):
import importlib
import os
import sys

dirname, basename = os.path.split(pyfilepath)  # pyfilepath: '/my/path/mymodule.py'
sys.path.append(dirname)  # only directories should be added to PYTHONPATH
module_name = os.path.splitext(basename)[0]  # '/my/path/mymodule.py' --> 'mymodule'
module = importlib.import_module(module_name)  # namespace of the defined module (otherwise we would literally look for "module_name")
Now you can directly use the namespace of the imported module, like this:
a = module.myvar
b = module.myfunc(a)
The advantage of this solution is that we don't even need to know the actual name of the module we would like to import, in order to use it in our code. This is useful, e.g. in case the path of the module is a configurable argument.
I have written my own global and portable import function, based on the importlib module, in order to:
Be able to import modules as submodules and to import the content of a module into a parent module (or into globals if there is no parent module).
Be able to import modules with period characters in the file name.
Be able to import modules with any extension.
Be able to use a standalone name for a submodule instead of the file name without extension (which is the default).
Be able to define the import order based on the previously imported module instead of depending on sys.path or any other search path storage.
The examples directory structure:
<root>
|
+- test.py
|
+- testlib.py
|
+- /std1
| |
| +- testlib.std1.py
|
+- /std2
| |
| +- testlib.std2.py
|
+- /std3
|
+- testlib.std3.py
Inclusion dependency and order:
test.py
-> testlib.py
-> testlib.std1.py
-> testlib.std2.py
-> testlib.std3.py
Implementation:
Latest changes store: https://sourceforge.net/p/tacklelib/tacklelib/HEAD/tree/trunk/python/tacklelib/tacklelib.py
test.py:
import os, sys, inspect, copy
SOURCE_FILE = os.path.abspath(inspect.getsourcefile(lambda:0)).replace('\\','/')
SOURCE_DIR = os.path.dirname(SOURCE_FILE)
print("test::SOURCE_FILE: ", SOURCE_FILE)
# portable import to the global space
sys.path.append(TACKLELIB_ROOT) # TACKLELIB_ROOT - path to the library directory
import tacklelib as tkl
tkl.tkl_init(tkl)
# cleanup
del tkl # must be instead of `tkl = None`, otherwise the variable would be still persist
sys.path.pop()
tkl_import_module(SOURCE_DIR, 'testlib.py')
print(globals().keys())
testlib.base_test()
testlib.testlib_std1.std1_test()
testlib.testlib_std1.testlib_std2.std2_test()
#testlib.testlib.std3.std3_test() # not reachable directly ...
getattr(globals()['testlib'], 'testlib.std3').std3_test() # ... but reachable through `globals` + `getattr`
tkl_import_module(SOURCE_DIR, 'testlib.py', '.')
print(globals().keys())
base_test()
testlib_std1.std1_test()
testlib_std1.testlib_std2.std2_test()
#testlib.std3.std3_test() # not reachable directly ...
globals()['testlib.std3'].std3_test() # ... but reachable through `globals` + `getattr`
testlib.py:
# optional for 3.4.x and higher
#import os, inspect
#
#SOURCE_FILE = os.path.abspath(inspect.getsourcefile(lambda:0)).replace('\\','/')
#SOURCE_DIR = os.path.dirname(SOURCE_FILE)
print("1 testlib::SOURCE_FILE: ", SOURCE_FILE)
tkl_import_module(SOURCE_DIR + '/std1', 'testlib.std1.py', 'testlib_std1')
# SOURCE_DIR is restored here
print("2 testlib::SOURCE_FILE: ", SOURCE_FILE)
tkl_import_module(SOURCE_DIR + '/std3', 'testlib.std3.py')
print("3 testlib::SOURCE_FILE: ", SOURCE_FILE)
def base_test():
    print('base_test')
testlib.std1.py:
# optional for 3.4.x and higher
#import os, inspect
#
#SOURCE_FILE = os.path.abspath(inspect.getsourcefile(lambda:0)).replace('\\','/')
#SOURCE_DIR = os.path.dirname(SOURCE_FILE)
print("testlib.std1::SOURCE_FILE: ", SOURCE_FILE)
tkl_import_module(SOURCE_DIR + '/../std2', 'testlib.std2.py', 'testlib_std2')
def std1_test():
    print('std1_test')
testlib.std2.py:
# optional for 3.4.x and higher
#import os, inspect
#
#SOURCE_FILE = os.path.abspath(inspect.getsourcefile(lambda:0)).replace('\\','/')
#SOURCE_DIR = os.path.dirname(SOURCE_FILE)
print("testlib.std2::SOURCE_FILE: ", SOURCE_FILE)
def std2_test():
    print('std2_test')
testlib.std3.py:
# optional for 3.4.x and higher
#import os, inspect
#
#SOURCE_FILE = os.path.abspath(inspect.getsourcefile(lambda:0)).replace('\\','/')
#SOURCE_DIR = os.path.dirname(SOURCE_FILE)
print("testlib.std3::SOURCE_FILE: ", SOURCE_FILE)
def std3_test():
    print('std3_test')
Output (3.7.4):
test::SOURCE_FILE: <root>/test01/test.py
import : <root>/test01/testlib.py as testlib -> []
1 testlib::SOURCE_FILE: <root>/test01/testlib.py
import : <root>/test01/std1/testlib.std1.py as testlib_std1 -> ['testlib']
import : <root>/test01/std1/../std2/testlib.std2.py as testlib_std2 -> ['testlib', 'testlib_std1']
testlib.std2::SOURCE_FILE: <root>/test01/std1/../std2/testlib.std2.py
2 testlib::SOURCE_FILE: <root>/test01/testlib.py
import : <root>/test01/std3/testlib.std3.py as testlib.std3 -> ['testlib']
testlib.std3::SOURCE_FILE: <root>/test01/std3/testlib.std3.py
3 testlib::SOURCE_FILE: <root>/test01/testlib.py
dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__annotations__', '__builtins__', '__file__', '__cached__', 'os', 'sys', 'inspect', 'copy', 'SOURCE_FILE', 'SOURCE_DIR', 'TackleGlobalImportModuleState', 'tkl_membercopy', 'tkl_merge_module', 'tkl_get_parent_imported_module_state', 'tkl_declare_global', 'tkl_import_module', 'TackleSourceModuleState', 'tkl_source_module', 'TackleLocalImportModuleState', 'testlib'])
base_test
std1_test
std2_test
std3_test
import : <root>/test01/testlib.py as . -> []
1 testlib::SOURCE_FILE: <root>/test01/testlib.py
import : <root>/test01/std1/testlib.std1.py as testlib_std1 -> ['testlib']
import : <root>/test01/std1/../std2/testlib.std2.py as testlib_std2 -> ['testlib', 'testlib_std1']
testlib.std2::SOURCE_FILE: <root>/test01/std1/../std2/testlib.std2.py
2 testlib::SOURCE_FILE: <root>/test01/testlib.py
import : <root>/test01/std3/testlib.std3.py as testlib.std3 -> ['testlib']
testlib.std3::SOURCE_FILE: <root>/test01/std3/testlib.std3.py
3 testlib::SOURCE_FILE: <root>/test01/testlib.py
dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__annotations__', '__builtins__', '__file__', '__cached__', 'os', 'sys', 'inspect', 'copy', 'SOURCE_FILE', 'SOURCE_DIR', 'TackleGlobalImportModuleState', 'tkl_membercopy', 'tkl_merge_module', 'tkl_get_parent_imported_module_state', 'tkl_declare_global', 'tkl_import_module', 'TackleSourceModuleState', 'tkl_source_module', 'TackleLocalImportModuleState', 'testlib', 'testlib_std1', 'testlib.std3', 'base_test'])
base_test
std1_test
std2_test
std3_test
Tested in Python 3.7.4, 3.2.5, 2.7.16
Pros:
Can import a module both as a submodule and can import the content of a module into a parent module (or into globals if there is no parent module).
Can import modules with periods in the file name.
Can import any extension module from any extension module.
Can use a standalone name for a submodule instead of the file name without extension, which is the default (for example, testlib.std.py as testlib, testlib.blabla.py as testlib_blabla and so on).
Does not depend on sys.path or any other search path storage.
Does not require saving/restoring global variables like SOURCE_FILE and SOURCE_DIR between calls to tkl_import_module.
[for 3.4.x and higher] Can mix the module namespaces in nested tkl_import_module calls (ex: named->local->named or local->named->local and so on).
[for 3.4.x and higher] Can auto-export global variables/functions/classes from where they are declared to all child modules imported through tkl_import_module (through the tkl_declare_global function).
Cons:
Does not support complete import:
Ignores enumerations and subclasses.
Ignores builtins because each such type has to be copied exclusively.
Ignores classes that are not trivially copiable.
Avoids copying builtin modules, including all packaged modules.
[for 3.3.x and lower] Requires declaring tkl_import_module in all modules which call tkl_import_module (code duplication)
Update 1, 2 (for 3.4.x and higher only):
In Python 3.4 and higher you can bypass the requirement to declare tkl_import_module in each module by declaring tkl_import_module in a top-level module; the function will then inject itself into all child modules in a single call (it's a kind of self-deploying import).
Update 3:
Added the function tkl_source_module as an analog of bash source, with support for an execution guard upon import (implemented through module merge instead of import).
Update 4:
Added the function tkl_declare_global to auto-export a module global variable to all child modules where the variable is not visible because it is not part of a child module.
Update 5:
All functions have been moved into the tacklelib library; see the link above.
This should work
import glob
import imp
import os

path = os.path.join('./path/to/folder/with/py/files', '*.py')
for infile in glob.glob(path):
    basename = os.path.basename(infile)
    basename_without_extension = basename[:-3]
    # http://docs.python.org/library/imp.html?highlight=imp#module-imp
    imp.load_source(basename_without_extension, infile)
Import package modules at runtime (Python recipe)
http://code.activestate.com/recipes/223972/
###################
## #
## classloader.py #
## #
###################
import sys, types


def _get_mod(modulePath):
    try:
        aMod = sys.modules[modulePath]
        if not isinstance(aMod, types.ModuleType):
            raise KeyError
    except KeyError:
        # The last [''] is very important!
        aMod = __import__(modulePath, globals(), locals(), [''])
        sys.modules[modulePath] = aMod
    return aMod


def _get_func(fullFuncName):
    """Retrieve a function object from a full dotted-package name."""
    # Parse out the path, module, and function
    lastDot = fullFuncName.rfind(u".")
    funcName = fullFuncName[lastDot + 1:]
    modPath = fullFuncName[:lastDot]

    aMod = _get_mod(modPath)
    aFunc = getattr(aMod, funcName)

    # Assert that the function is a *callable* attribute.
    assert callable(aFunc), u"%s is not callable." % fullFuncName

    # Return a reference to the function itself,
    # not the results of the function.
    return aFunc


def _get_class(fullClassName, parentClass=None):
    """Load a module and retrieve a class (NOT an instance).

    If the parentClass is supplied, className must be of parentClass
    or a subclass of parentClass (or None is returned).
    """
    aClass = _get_func(fullClassName)

    # Assert that the class is a subclass of parentClass.
    if parentClass is not None:
        if not issubclass(aClass, parentClass):
            raise TypeError(u"%s is not a subclass of %s" %
                            (fullClassName, parentClass))

    # Return a reference to the class itself, not an instantiated object.
    return aClass


######################
##       Usage      ##
######################

class StorageManager: pass
class StorageManagerMySQL(StorageManager): pass


def storage_object(aFullClassName, allOptions={}):
    aStoreClass = _get_class(aFullClassName, StorageManager)
    return aStoreClass(allOptions)
I'm not saying that it is better, but for the sake of completeness, I wanted to suggest the exec function, available in both Python 2 and Python 3.
exec allows you to execute arbitrary code in either the global scope, or in an internal scope, provided as a dictionary.
For example, if you have a module stored in "/path/to/module" with the function foo(), you could run it by doing the following:
module = dict()
with open("/path/to/module") as f:
    exec(f.read(), module)
module['foo']()
This makes it a bit more explicit that you're loading code dynamically, and grants you some additional power, such as the ability to provide custom builtins.
And if having access through attributes instead of keys is important to you, you can design a custom dict class for the globals that provides such access, e.g.:
class MyModuleClass(dict):
    def __getattr__(self, name):
        return self.__getitem__(name)
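Putting the two ideas together — a small sketch; exec accepts a dict subclass for its globals argument, so the loaded names become reachable as attributes:
module = MyModuleClass()
with open("/path/to/module") as f:
    exec(f.read(), module)
module.foo()  # instead of module['foo']()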
In Linux, adding a symbolic link in the directory your Python script is located works.
I.e.:
ln -s /absolute/path/to/module/module.py /absolute/path/to/script/module.py
The Python interpreter will create /absolute/path/to/script/module.pyc and will update it if you change the contents of /absolute/path/to/module/module.py.
Then include the following in file mypythonscript.py:
from module import *
This will allow imports of compiled (pyd) Python modules in 3.4:
import sys
import importlib.machinery
def load_module(name, filename):
    # If the Loader finds the module name in this list it will use
    # module_name.__file__ instead so we need to delete it here
    if name in sys.modules:
        del sys.modules[name]

    loader = importlib.machinery.ExtensionFileLoader(name, filename)
    module = loader.load_module()

    locals()[name] = module
    globals()[name] = module
load_module('something', r'C:\Path\To\something.pyd')
something.do_something()
A quite simple way: suppose you want to import a file with the relative path ../../MyLibs/pyfunc.py
libPath = '../../MyLibs'
import sys
if not libPath in sys.path: sys.path.append(libPath)
import pyfunc as pf
Without the guard (the check that libPath is not already in sys.path), repeated imports can make sys.path very long.
These are my two utility functions using only pathlib. It infers the module name from the path.
By default, it recursively loads all Python files from folders and replaces __init__.py with the parent folder name. But you can also give a Path and/or a glob to select some specific files.
from pathlib import Path
from importlib.util import spec_from_file_location, module_from_spec
from typing import Optional
def get_module_from_path(path: Path, relative_to: Optional[Path] = None):
    if not relative_to:
        relative_to = Path.cwd()

    abs_path = path.absolute()
    relative_path = abs_path.relative_to(relative_to.absolute())
    if relative_path.name == "__init__.py":
        relative_path = relative_path.parent
    module_name = ".".join(relative_path.with_suffix("").parts)
    mod = module_from_spec(spec_from_file_location(module_name, path))
    return mod


def get_modules_from_folder(folder: Optional[Path] = None, glob_str: str = "*/**/*.py"):
    if not folder:
        folder = Path(".")
    mod_list = []
    for file_path in sorted(folder.glob(glob_str)):
        mod_list.append(get_module_from_path(file_path))
    return mod_list
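A usage sketch (the file path is hypothetical): module_from_spec only creates the module object, so you still need to execute it through its loader before using its contents.
mod = get_module_from_path(Path("src/mypackage/helpers.py"))
mod.__spec__.loader.exec_module(mod)  # actually run the module body
print(mod.__name__, dir(mod))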
This answer is a supplement to Sebastian Rittau's answer responding to the comment: "but what if you don't have the module name?" This is a quick and dirty way of getting the likely Python module name given a filename -- it just goes up the tree until it finds a directory without an __init__.py file and then turns it back into a filename. For Python 3.4+ (uses pathlib), which makes sense since Python 2 people can use "imp" or other ways of doing relative imports:
import pathlib
def likely_python_module(filename):
    '''
    Given a filename or Path, return the "likely" python module name. That is, iterate
    the parent directories until it doesn't contain an __init__.py file.

    :rtype: str
    '''
    p = pathlib.Path(filename).resolve()
    paths = []
    if p.name != '__init__.py':
        paths.append(p.stem)
    while True:
        p = p.parent
        if not p:
            break
        if not p.is_dir():
            break

        inits = [f for f in p.iterdir() if f.name == '__init__.py']
        if not inits:
            break

        paths.append(p.stem)

    return '.'.join(reversed(paths))
There are certainly possibilities for improvement, and the optional __init__.py files might necessitate other changes, but if you have __init__.py in general, this does the trick.
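For example, it can be paired with the importlib machinery from the earlier answers (the path below is hypothetical):
import importlib.util

path = "/home/user/project/pkg/sub/mod.py"
name = likely_python_module(path)  # e.g. "pkg.sub.mod"
spec = importlib.util.spec_from_file_location(name, path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)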

Updating of a config file written in python

I'm writing simulation software which should support reading parameters from a config file or from the command line. Since it is very important to be able to track the configuration of a simulation, I'm committing the config file to a local git repository at the start of each simulation.
Now, parameters given on the command line have higher priority than the ones in the config file, but I also want to commit them. I guess I could save the Python objects of a configured simulation just before it is started, but it would be more elegant if I could just update the config file with the command-line parameters before committing it.
The reason I write the config file in python is that I have to define some python objects in it. I have something like
import SomeSimulationClass
SIMULATOR = SomeSimulationClass
in my config file and the SIMULATOR can then be swapped easily.
If I want to use something like configparser, I believe I can't have objects.
Is there any easy way to update a python config file? All variable names in it are already defined, I just want to change the values. The only thing I can think of is parsing the file, comparing strings between the file and the command line parameters ...
You can write whatever you want into a file, and later ConfigParser can read
values from it. Here is an example of how I used ConfigParser to read environment settings from a config file.
import os
from ConfigParser import SafeConfigParser
conf_filename = os.getenv("CONFIG_FILE")
src_dir = os.getenv("CONFIG_DIR")
conf_file = os.path.join(src_dir,conf_filename)
parser = SafeConfigParser()
parser.read(conf_file)
section = env  # env holds the section name to read, e.g. "PROD" or "STAGE"
server = parser.get(section, 'host')
db_port = parser.get(section, 'db_port')
ws_port = parser.get(section, 'ws_port')
and the config file itself:
[PROD]
host=xxx-yyy-15
db_port=1521
ws_port=8280
ora_server=xxx-xxx-xxx.com
sid=XXXXX
userid=xxxx
passwd=xxxx
[STAGE]
host=xxx-yyy-04
db_port=1521
ws_port=8280
ora_server=yyy-yyy-yyy.com
sid=YYYYYY
userid=yyyy
passwd=yyyy
I found a way to do what I want. Some small modifications were necessary to my python config module to allow it to be rewritten with the following script, but it works for my purposes:
with open('merged_config.py', 'w') as merged_config, \
        open(base_config_module.__file__, 'r') as base_config:
    for line in base_config:
        if 'import' in line:
            # copy imports from base config
            merged_config.write(line)

    for item in dir(base_config_module):
        if item.startswith("__"):
            # ignore __variables like '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__' ...
            continue
        if item == 'SimulationSteps':
            # ignoring my imports
            continue
        item_val = getattr(base_config_module, item)
        # I had to overwrite the __repr__() method of the Enums which I used. Everything else worked fine.
        merged_config.write('%s = %s\n' % (item, repr(item_val)))

Understanding Python Pickle Insecurity

The Python documentation states that pickle is not secure and shouldn't parse untrusted user input. If you research this, almost all examples demonstrate the problem with a system() call via os.system.
What's not clear to me is how os.system is resolved correctly without the os module being imported.
>>> import pickle
>>> pickle.loads("cos\nsystem\n(S'ls /'\ntR.") # This clearly works.
bin boot cgroup dev etc home lib lib64 lost+found media mnt opt proc root run sbin selinux srv sys tmp usr var
0
>>> dir() # no os module
['__builtins__', '__doc__', '__name__', '__package__', 'pickle']
>>> os.system('ls /')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'os' is not defined
>>>
Can someone explain?
The name of the module (os) is part of the opcode, and pickle automatically imports the module:
# pickle.py
def find_class(self, module, name):
    # Subclasses may override this
    __import__(module)
    mod = sys.modules[module]
    klass = getattr(mod, name)
    return klass
Note the __import__(module) line.
The function is called when the GLOBAL 'os system' pickle bytecode instruction is executed.
This mechanism is necessary in order to be able to unpickle instances of classes whose modules haven't been explicitly imported into the caller's namespace.
For altogether too much information on writing malicious Pickles that go much further than the standard os.system() example, see this presentation and its accompanying paper.
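As an aside, the mitigation suggested in the Python 3 docs follows directly from this mechanism: override find_class to whitelist what may be resolved. A minimal sketch of that pattern:
import builtins
import io
import pickle

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only allow a few harmless builtins; anything else (e.g. os.system)
        # raises instead of being imported and called.
        if module == "builtins" and name in {"range", "complex", "set", "frozenset"}:
            return getattr(builtins, name)
        raise pickle.UnpicklingError("global %s.%s is forbidden" % (module, name))

def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()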
If you use pickletools.dis to disassemble the pickle you can see how this is working:
import pickletools
print pickletools.dis("cos\nsystem\n(S'ls ~'\ntR.")
Output:
0: c GLOBAL 'os system'
11: ( MARK
12: S STRING 'ls ~'
20: t TUPLE (MARK at 11)
21: R REDUCE
22: . STOP
Pickle uses a simple stack-based virtual machine that records the instructions used to reconstruct the object. In other words the pickled instructions in your example are:
Push self.find_class(module_name, class_name) i.e. push os.system
Push the string 'ls ~'
Build tuple from topmost stack items
Apply callable to argtuple, both on stack. i.e. os.system(*('ls ~',))
Source
Importing a module only adds it to the local namespace, which is not necessarily the one you're in. Except when it doesn't:
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__']
>>> __import__('os')
<module 'os' from '/usr/lib64/python2.7/os.pyc'>
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__']
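Note that __import__ does return the module object, so you can bind it yourself:
>>> os = __import__('os')
>>> 'os' in dir()
True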

What's the best way to implement "from . import *" in Python?

I'm using Django, and I like to separate my models, views, and tests into subdirectories.
But, this means that I need to maintain an __init__.py in each subdirectory that imports every module in that directory.
I'd rather just put some call in that says:
from some_library import import_everything
import_everything()
That would have the same effect as iterating over the current directory and importing every .py file in that directory.
What's the best/easiest way to implement this?
Here are what my django application directories (essentially) look like:
some_django_app/
models/
__init__.py
create.py
read.py
update.py
delete.py
views/
__init__.py
create.py
read.py
update.py
delete.py
forms/
__init__.py
create.py
update.py
tests/
__init__.py
create.py
read.py
update.py
delete.py
So, you can see that to make a "proper" Django app, all my __init__.py files need to import all the other .py files in each directory. I'd rather just have some simple boilerplate there.
Within your app/models/__init__.py add these lines:
from app.models.create import *
from app.models.read import *
from app.models.update import *
from app.models.delete import *
This'll be your best bet for conciseness and readability. from app.models import * will now load all classes/etc from within each of the other files. Likewise, from app.models import foo will load foo no matter which of these files it's defined in.
Using the information given in synthesizerpatel's answer, you could implement import_everything this way:
import os
import sys
def import_everything(path):
    # Insert near the beginning so path will be the item removed with sys.path.remove(path) below
    # (The case when sys.path[0] == path works fine too).
    # Do not insert at index 0 since sys.path[0] may have a special meaning
    sys.path.insert(1, path)
    for filename in os.listdir(path):
        if filename.endswith('.py'):
            modname = filename.replace('.py', '')
            module = __import__(modname, fromlist=[True])
            attrs = getattr(module, '__all__',
                            (attr for attr in dir(module) if not attr.startswith('_')))
            for attr in attrs:
                # print('Adding {a}'.format(a = attr))
                globals()[attr] = getattr(module, attr)
    sys.path.remove(path)
and could be used like this:
print(globals().keys())
# ['import_everything', '__builtins__', '__file__', '__package__', 'sys', '__name__', 'os', '__doc__']
import_everything(os.path.expanduser('~/test'))
print(globals().keys())
# ['hashlib', 'pythonrc', 'import_everything', '__builtins__', 'get_input', '__file__', '__package__', 'sys', 'mp', 'time', 'home', '__name__', 'main', 'os', '__doc__', 'user']

Is there a way to access parent modules in Python

I need to know if there is a way to access parent modules from submodules. If I import submodule:
from subprocess import types
I have types - is there some Python magic to get access to the subprocess module from types? Something similar to what exists for classes: ().__class__.__bases__[0].__subclasses__().
If you've accessed a module you can typically get to it from the sys.modules dictionary. Python doesn't keep "parent pointers" with names, particularly because the relationship is not one-to-one. For example, using your example:
>>> from subprocess import types
>>> types
<module 'types' from '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/types.pyc'>
>>> import sys
>>> sys.modules['subprocess']
<module 'subprocess' from '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc'>
If you'll note the presence of types in the subprocess module is just an artifact of the import types statement in it. You just import types if you need that module.
In fact, a future version of subprocess may not import types any more, and your code will break. You should only import the names that appear in the __all__ list of a module; consider other names as implementation details.
So, for example:
>>> import subprocess
>>> dir(subprocess)
['CalledProcessError', 'MAXFD', 'PIPE', 'Popen', 'STDOUT', '_PIPE_BUF', '__all__', '__builtins__', '__doc__',
'__file__', '__name__', '__package__', '_active', '_cleanup', '_demo_posix', '_demo_windows', '_eintr_retry_call',
'_has_poll', 'call', 'check_call', 'check_output', 'errno', 'fcntl', 'gc', 'list2cmdline', 'mswindows', 'os',
'pickle', 'select', 'signal', 'sys', 'traceback', 'types']
>>> subprocess.__all__
['Popen', 'PIPE', 'STDOUT', 'call', 'check_call', 'check_output', 'CalledProcessError']
You can see that most of the names visible in subprocess are just other top-level modules that it imports.
For posterity, I ran into this also and came up with the one liner:
import sys
parent_module = sys.modules['.'.join(__name__.split('.')[:-1]) or '__main__']
The or '__main__' part is just in case you load the file directly it will return itself.
full_module_name = module.__name__
parent, _, sub = full_module_name.rpartition('.')
if parent:
    parent = __import__(parent, fromlist='dummy')
Assuming you are not inside the subprocess module already, you could do
import somemodule
children = dir(somemodule)
Then you could inspect the children of subprocess with the inspect module:
http://docs.python.org/library/inspect.html
Maybe the getmodule method would be useful for you?
http://docs.python.org/library/inspect.html#inspect.getmodule
import inspect
parent_module = inspect.getmodule(somefunction)
children = dir(parent_module)
package = parent_module.__package__
On my machine __package__ returns empty for 'types', but can be more useful for my own modules as it does return the parent module as a string
The approach that worked best for us:
Let's say the folder structure is:
src
|demoproject
|
|--> uimodule--> ui.py
|--> backendmodule --> be.py
setup.py
1. Create an installable package out of the project
2. Have __init__.py in every directory (module)
3. Create setup.py (keep it in the top-level folder, here inside src)
Sample
from setuptools import setup, find_packages

setup(
    name="demopackage",
    version="1",
    packages=find_packages(exclude=["tests.*", "tests"]),
    author='',
    author_email='',
    description="",
    url="",
)
4. From the src folder, create the installable package:
pip3 install .
5. This will install a package --> demopackage
6. Now from any of your modules you can access any other module, for example:
7. To access be.py's function calldb() from ui.py, make the import below:
from demopackage.backendmodule.be import calldb
8. And so on: when you add a new folder to your project, just add __init__.py to that folder and it will be accessible, just like above, after you run `pip3 install .` again.
