Suppose you have the following structure in your src folder:
conf.py
drivers/
    mod1.py --> contains mod1Class
    mod2.py --> contains mod2Class
What I'd like to have is a snippet of code in conf.py that automatically instantiates the classes in mod*.py, so that if one day I add mod3.py --> mod3Class, it will be instantiated in conf.py automatically, without adding any line of code.
I tried, without success:
from drivers import *
but I'm not able to import; I receive a NameError. So I'm stuck at the very first step. Even supposing I were able to perform the import successfully, how could I do:
mod1Class_instance = mod1.mod1Class() (in a loop, one instance for every file in drivers)
in an automatic way? As far as I can tell, I can't use a string to make an instance of a class, so getting the names of the files in drivers as strings doesn't help. What's the right way to do this operation?
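(For reference, a string can in fact be turned into a class with getattr, which is what the answers below build on; a minimal sketch, assuming the names from the layout above:)

import importlib

module = importlib.import_module('drivers.mod1')  # module found by its string name
cls = getattr(module, 'mod1Class')                # class looked up by its string name
mod1Class_instance = cls()                        # instantiated like any other class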
Maybe this is what you need:
from types import ModuleType

import drivers

for name in dir(drivers):
    driver_module = getattr(drivers, name)
    if not isinstance(driver_module, ModuleType):
        continue  # not a real driver module
    for attr in dir(driver_module):
        cls = getattr(driver_module, attr)
        # dir() yields names, so fetch each object and check that it is
        # a class deriving from the common driver base class
        if not (isinstance(cls, type) and issubclass(cls, SomeBaseClass)):
            continue  # not a real driver class
        # create a new variable named after the lowercased class name
        globals()[cls.__name__.lower()] = cls()
And also, you should create an __init__.py file in your drivers folder. This is what makes the folder a Python package.
On the other hand, I recommend describing all the imports manually. This simple approach makes your code clearer.
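A minimal drivers/__init__.py along those lines, with the module names taken from the question; note that without these imports a bare import drivers would not load the submodules, and the loop above would find nothing:

# drivers/__init__.py -- make the submodules attributes of the package
from . import mod1
from . import mod2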
This loads classes from modules in a directory drivers, which is in the same directory as the current module, and does not require drivers to be a package:
from collections import defaultdict
import os
import pkgutil

def getclasses(module):
    """Get classes from a driver module."""
    # I'd recommend another way to find classes; for example, in defuz's
    # answer, classes in driver modules would have one base class.
    try:
        yield getattr(module, module.__name__ + "Class")
    except AttributeError:
        pass

instances = defaultdict(list)
drivers_dir = os.path.join(os.path.dirname(__file__), 'drivers')
for module_loader, name, ispkg in pkgutil.iter_modules([drivers_dir]):
    module = module_loader.find_module(name).load_module(name)
    for cls in getclasses(module):
        # You might want to use the name of the module as a key instead of
        # the module object, or, perhaps, to append all instances to the
        # same list.
        instances[module].append(cls())

# I'd recommend not putting instances in the module namespace,
# and just leaving them in that dictionary.
for mod_instances in instances.values():
    for instance in mod_instances:
        locals()[type(instance).__name__ + "_instance"] = instance
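If you are on a Python version where loaders no longer provide find_module/load_module (they were deprecated and removed in Python 3.12), the loading line can be replaced along these lines; a sketch, not part of the original answer:

import importlib.util

for module_finder, name, ispkg in pkgutil.iter_modules([drivers_dir]):
    spec = module_finder.find_spec(name)          # locate the module
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)               # execute it into the module object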
I have the following project structure:
Package1
|--__init__.py
|--__main__.py
|--Module1.py
|--Module2.py
where Module1.py contains something like:
import dill as pickle
import Package1.Module2

# from https://stackoverflow.com/questions/52402783/pickle-class-definition-in-module-with-dill
def mainify(obj):
    import __main__
    import inspect
    import ast

    s = inspect.getsource(obj)
    m = ast.parse(s)
    co = compile(m, "<string>", "exec")
    exec(co, __main__.__dict__)

def Module1():
    """I hope the details of this class are not necessary for this example.
    I can add detail if necessary.
    """

obj_to_pickle = Module1()

def write_session():
    mainify(Module1)
    mainify(Module2)
    with FileHandler.open_file(...) as f:
        pickle.dump(obj_to_pickle, f)
I run the code as a module via python -m Package1 ..., thus __main__.py is the entry point to package execution, though I hope these details aren't relevant (I can improve my example if necessary).
Now, when I try to load the pickled object, I get ModuleNotFoundError: No module named Package1.
How can I tell dill in this situation that Package1 is the package? The mainify function seems to be getting the modules' source code into the pickle, but I believe the import statement in Module1.py, import Package1.Module2, is what causes the ModuleNotFoundError. How can I tell dill to understand the reference to Package1?
NOTE: this reference can be fixed by adding the directory containing Package1 to sys.path via sys.path.append. But the whole point of pickling the package source alongside the instance is to make the pickled instance loadable without needing to do this.
Relevant posts:
Pickle class definition in module with dill
Why dill dumps external classes by reference, no matter what?
@courtyardz: I'm a contributor to dill, and your question is similar to others that have been asked in the past.
First, let me explain that generally dill assumes that all the modules necessary to deserialize an object are importable in the "unpickling" environment. Therefore modules are almost always saved by reference, with the current exception of modules that are not properly installed, like local modules (e.g. located in the working directory) or modules at non-canonical paths added to sys.path. There's also a function that's able to save the complete state of a module, which can be restored afterwards, but not the module itself.
That said, what exactly do you need? Is it to serialize an object alongside its class (including any objects in the module's namespace that it refers to), or is it really the whole module?
If you need to transfer the complete module to an interpreter session where it's not available, like in a different machine, this problem is under active discussion here: https://github.com/uqfoundation/dill/issues/123. There's no complete solution for this currently, but one possibility is to ship the module as a ZIP archive, and load it using the zipimport module (indirectly, by saving the zip file to disk, maybe in a temporary location, and adding its path to sys.path as described in Python's documentation).
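A self-contained sketch of that zip approach (the archive name and module content here are invented for illustration):

import importlib
import os
import sys
import tempfile
import zipfile

# build a toy archive standing in for the shipped module
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, 'shipped.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('mymodule.py', 'VALUE = 42\n')

# zipimport is used indirectly: zip archives on sys.path are handled by it
sys.path.insert(0, zip_path)
mymodule = importlib.import_module('mymodule')
print(mymodule.VALUE)  # 42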
If you just need to serialize an object with its class, note that doing such has the limitation that objects of that class pickled by separate calls to dill.dump() or dill.dumps() will end up having different (although identical) classes when unpickled. This may or may not be a problem. There's also an open discussion about forcing the serialization of a class by value: https://github.com/uqfoundation/dill/issues/424.
The workaround you are trying to use should work because dill pickles classes defined in the __main__ module by value, as well as "orphaned" classes, i.e. classes that can't be found in the module where they were defined. However, for this to work the object must be created by the __main__.Module1 class (I suppose this is a class, even though you used def instead of class in your code example), not the Package1.Module1.Module1 class. If the class references global objects in Module1 in its methods, you may need to use the option recurse=True with dill.dump(s).
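A sketch of that first workaround, reusing mainify and Module1 from the question (the output path is made up):

import __main__
import dill

mainify(Module1)          # re-defines the class in __main__, so dill pickles it by value
obj = __main__.Module1()  # the object must be created from the __main__ copy
with open('session.pkl', 'wb') as f:
    dill.dump(obj, f, recurse=True)  # recurse=True also serializes referenced globals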
A simpler workaround, which may not work for your specific case since it involves multiple modules, is to temporarily change the __module__ attribute of the class. For example, in a module's body:
import dill

class X:
    pass

obj = X()

X.__module__ = None  # temporarily orphan the class
with open('/path/to/file.pkl', 'wb') as file:
    dill.dump(obj, file)  # X is pickled by value because __module__ is None
X.__module__ = __name__  # de-orphan the class
Going back to your example, if you can't create the object with the "mainified" class, you may change the object's class temporarily too:
obj_to_pickle = Module1()

def write_session():
    mainify(Module1)
    mainify(Module2)
    obj_to_pickle.__class__ = __main__.Module1
    with FileHandler.open_file(...) as f:
        pickle.dump(obj_to_pickle, f)
    obj_to_pickle.__class__ = Module1
However, if the object has instance attributes of types defined in Package1, it won't work.
Is there a way (using only Python, i.e. without a bash script or code in another language) to call a specific function in every script inside a folder, without needing to import all of them explicitly?
For example, let's say that this is my structure:
main.py
modules/
    module1.py
    module2.py
    module3.py
    module4.py
and every moduleX.py has this code:
import os

def generic_function(caller):
    print('{} was called by {}'.format(os.path.basename(__file__), caller))

def internal_function():
    print('ERROR: Someone called an internal function')
while main.py has this code:
import modules
import os

for module in modules.some_magic_function():
    module.generic_function(os.path.basename(__file__))
So if I run main.py, I should get this output:
module1.py was called by main.py
module2.py was called by main.py
module3.py was called by main.py
module4.py was called by main.py
*Please note that internal_function() shouldn't be called (unlike in this question). Also, I don't want to declare every module file explicitly, not even in an __init__.py.
By the way, I don't mind using classes for this. In fact it could be even better.
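(For what it's worth, the some_magic_function the question imagines could plausibly be implemented in modules/__init__.py with pkgutil, without declaring any module explicitly; a sketch:)

# modules/__init__.py
import importlib
import pkgutil

def some_magic_function():
    """Import and yield every module found in this package."""
    for _finder, name, _ispkg in pkgutil.iter_modules(__path__):
        yield importlib.import_module('.' + name, __package__)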
You can use exec or eval to do that. Roughly, it would go this way (for exec):
def magic_execute():
    import glob
    import os
    # MYPATH is assumed to be the directory holding the scripts
    for pyfl in glob.glob(os.path.join(MYPATH, '*.py')):
        with open(pyfl, 'rt') as fh:
            pycode = fh.read()
        # append a call, passing the caller's file name as a string literal
        pycode += '\ngeneric_function({!r})'.format(__file__)
        exec(pycode)
The assumption here is that you are not going to import the modules at all.
Please note that there are numerous security issues related to using exec in such an unrestricted manner. You can increase security a bit.
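For example, one small improvement (a sketch, not a sandbox: the executed code can still import anything it likes) is to give exec its own namespace dict, so the code cannot rebind names in the caller's module:

sandbox = {'__name__': '<executed driver>'}  # isolated namespace for the code
exec(pycode, sandbox)                        # the caller's globals stay untouched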
While sophros' approach is quick and sufficient for implicitly importing the modules, you could have issues related to controlling every module, or with complex calls (like having conditions for each call). So I went with another approach:
First I created a class with the function(s) (now methods) declared. With this I can avoid checking whether the method exists, since I can use the default one if I didn't declare it:
# main.py
class BaseModule:
    def __init__(self):
        # Any code
        pass

    def generic_function(self, caller):
        # This could be a print (or a default return value) or an exception
        raise Exception("generic_function wasn't overridden or it was used with super")
Then I created another class that extends BaseModule. Sadly I wasn't able to find a good way of checking inheritance without knowing the name of the child class, so I used the same name for every module:
# modules/moduleX.py
import os

from main import BaseModule

class GenericModule(BaseModule):
    def __init__(self):
        BaseModule.__init__(self)
        # Any code

    def generic_function(self, caller):
        print('{} was called by {}'.format(os.path.basename(__file__), caller))
Finally, in my main.py, I used importlib to import the modules dynamically and save an instance of each one, so I can use them later (for the sake of simplicity I don't save them in the following code, but it's as easy as using a list and appending every instance to it):
# main.py
import importlib
import os

if __name__ == '__main__':
    relPath = 'modules'  # this has to be relative to the working directory
    for pyFile in os.listdir('./' + relPath):
        # just load Python (.py) files, except for __init__.py and the like
        if pyFile.endswith('.py') and not pyFile.startswith('__'):
            # each module has to be loaded with dots instead of slashes in
            # the path and without the extension; the modules folder must
            # also contain an __init__.py file
            module = importlib.import_module('{}.{}'.format(relPath, pyFile[:-3]))
            # we have to test if there is actually a class defined in the
            # module; this was extracted from [1]
            try:
                moduleInstance = module.GenericModule()
                # You can do whatever you want here: save moduleInstance in a
                # list and call the method later, or store its return value.
                moduleInstance.generic_function(os.path.basename(__file__))
            except AttributeError:
                # NOTE: this fires on ANY AttributeError, including ones
                # caused by a typo, so you should print or raise something
                # here for diagnosis
                print('WARN:', pyFile, "doesn't have a GenericModule class or there was a typo in its content")
References:
[1] Check for class existence
[2] Import module dynamically
[3] Method Overriding in Python
I am developing a Python package that does its work by taking in user-defined objects, all of which are instances of a class I wrote. The way I have designed it, the user passes their objects as defined in one or more Python scripts (see example below).
I want to access the objects the user defines in those scripts. How can I do that?
I looked at importing by filename, but to no avail. I even went on to use imp.load_source, but that didn't solve it.
Some typical user-defined objects
Assume for the sake of the problem, all methods are defined in Base. I understand what I am asking for leads to arbitrary code execution, so I am open to suggestions wherein users can pass their instances of the Base class arbitrarily but safely.
foo.py has the following code:
from package import Base
foo = Base('foo')
foo.AddBar('bar', 'bar')
foo.AddCow('moo')
ooo.py:
from package import Base
ooo = Base('ooo')
ooo.AddBar('ooo','ooo')
ooo.AddO(12)
And I run my main program as,
main_program -p foo.py ooo.py
I want to be able to access foo, ooo in the main_program body.
Tried:
I am using Python 2.7 (I know it's an older Python; I will make the move soon).
importlib
Tried importlib.import_module but it throws ImportError: Import by filename is not supported.
__import__
I tried using __import__('/path/to/file.py') but it throws the same ImportError: Import by filename is not supported.
At this point, any solution which lets me use objects defined in user-input scripts works.
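(For reference, on Python 2.7 imp.load_source is the usual way to load a module from an arbitrary file path, so it may be worth another try; a minimal sketch, with a made-up path and module name:)

import imp

# load the user's script by file path, under a module name of our choosing
user_mod = imp.load_source('user_foo', '/path/to/foo.py')
foo = user_mod.foo  # the Base instance created in foo.py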
If you are okay with skipping the .py in the filename, this can be solved by asking the user to pass the module name (basically the file name without the .py extension).
Referring to this answer and this book, here is an example
tester.py
class A:
    def write(self):
        print("hello")

obj = A()
Now we want to dynamically access obj from a file called test.py, so we do
python test.py tester
And what does test.py do? It imports the module by name and accesses its members. Note that this assumes you are not concerned about the order in which the user passes the objects.
test.py
import sys
# Get all parameters (sys.argv[0] is the file name so skipping that)
module_names = sys.argv[1:]
# I believe you can also do this using importlib
modules = list(map(__import__, module_names))
# modules[0] is now "tester"
modules[0].obj.write()
Mapping this to your example, I think this should be
foo.py
from package import Base
foo = Base('foo')
foo.AddBar('bar', 'bar')
foo.AddCow('moo')
ooo.py
from package import Base
ooo = Base('ooo')
ooo.AddBar('ooo','ooo')
ooo.AddO(12)
And run main program as
python main_program.py foo ooo
Have you tried
from (whatever the file name is) import *
Also don’t include .py in the file name.
I'd like to load a module dynamically, given its string name (from an environment variable). I'm using Python 2.7. I know I can do something like:
import os, importlib
my_module = importlib.import_module(os.environ.get('SETTINGS_MODULE'))
This is roughly equivalent to
import my_settings
(where SETTINGS_MODULE = 'my_settings'). The problem is, I need something equivalent to
from my_settings import *
since I'd like to be able to access all methods and variables in the module. I've tried
import os, importlib
my_module = importlib.import_module(os.environ.get('SETTINGS_MODULE'))
from my_module import *
but I get a bunch of errors doing that. Is there a way to import all methods and attributes of a module dynamically in Python 2.7?
If you have your module object, you can mimic the logic import * uses as follows:
module_dict = my_module.__dict__
try:
    to_import = my_module.__all__
except AttributeError:
    to_import = [name for name in module_dict if not name.startswith('_')]
globals().update({name: module_dict[name] for name in to_import})
However, this is almost certainly a really bad idea. You will unceremoniously stomp on any existing variables with the same names. This is bad enough when you do from blah import * normally, but when you do it dynamically there is even more uncertainty about what names might collide. You are better off just importing my_module and then accessing what you need from it using regular attribute access (e.g., my_module.someAttr), or getattr if you need to access its attributes dynamically.
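(A sketch of that safer pattern; the attribute name someAttr is a placeholder:)

import importlib
import os

my_module = importlib.import_module(os.environ['SETTINGS_MODULE'])
print(my_module.someAttr)              # plain attribute access
print(getattr(my_module, 'someAttr'))  # the same, with a dynamic name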
Not answering precisely the question as worded, but if you wish to have a file act as a proxy for a dynamic module, you can use the ability to define __getattr__ at the module level:
import importlib
import os

module_name = os.environ.get('CONFIG_MODULE', 'configs.config_local')
mod = importlib.import_module(module_name)

def __getattr__(name):
    return getattr(mod, name)
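Note that module-level __getattr__ is only honored on Python 3.7 and later (PEP 562), so this approach does not apply to the Python 2.7 mentioned in the question.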
My case was a bit different: I wanted to dynamically import the constants.py names in each gameX.__init__.py module (see below), because statically importing those would leave them in sys.modules forever (see this excerpt from Beazley, which I picked from this related question).
Here is my folder structure:
game/
    __init__.py
    game1/
        __init__.py
        constants.py
        ...
    game2/
        __init__.py
        constants.py
        ...
Each gameX.__init__.py exports an init() method, so initially I had a from .constants import * in all those gameX.__init__.py files, which I tried to move inside the init() method.
My first attempt was along the lines of:
@@ -275,2 +274,6 @@ def init():
     # called instead of 'reload'
+    yak = {}
+    yak.update(locals())
+    from .constants import * # fails here
+    yak = {x: y for x, y in locals() if x not in yak}
+    globals().update(yak)
     brec.ModReader.recHeader = RecordHeader
Failed with the rather cryptic:
SyntaxError: import * is not allowed in function 'init' because it contains a nested function with free variables
I can assure you there are no nested functions in there. Anyway I hacked and slashed and ended up with:
def init():
    # ...
    from .. import dynamic_import_hack
    dynamic_import_hack(__name__)
Where in game.__init__.py:
def dynamic_import_hack(package_name):
    print __name__      # game.init
    print package_name  # game.gameX.init
    import importlib
    constants = importlib.import_module('.constants', package=package_name)
    import sys
    for k in dir(constants):
        if k.startswith('_'):
            continue
        setattr(sys.modules[package_name], k, getattr(constants, k))
(for setattr see "How can I add attributes to a module at run time?", and for getattr "How can I import a python module function dynamically?" - I prefer using those to accessing the __dict__ directly)
This works and it's more general than the approach in the accepted answer because it allows you to have the hack in one place and use it from whatever module. However, I am not really sure it's the best way to implement it - I was going to ask a question, but as it would be a duplicate of this one I am posting it as an answer and hoping for feedback. My questions would be:
why this "SyntaxError: import * is not allowed in function 'init'" while there are no nested functions ?
dir has a lot of warnings in its doc - in particular it attempts to produce the most relevant, rather than complete, information - this complete worries me a bit
is there no builtin way to do an import * ? even in python 3 ?
I'm working on creating a Python module that maps API provided by a different language/framework into Python. Ideally, I would like this to be presented as a single root package that exposes helper methods, and which maps all namespaces in that other framework to Python packages/modules. For the sake of convenience, let's take CLR as an example:
import clr.System.Data
import clr.System.Windows.Forms
Here clr is the magic top-level package which exposes CLR namespaces System.Data and System.Windows.Forms subpackages/submodules (so far as I can see, a package is just a module with child modules/packages; it is still valid to have other kinds of members therein).
I've read PEP-302 and wrote a simple prototype program that achieves a similar effect by installing a custom meta_path hook. The clr module itself is a proper Python module which, when imported, sets __path__ = [] (making it a package, so that import even attempts lookup for submodules at all), and registers the hook. The hook itself intercepts any package load where full name of the package starts with "clr.", dynamically creates the new module using imp.new_module(), registers it in sys.modules, and uses pixie dust and rainbows to fill it with classes and methods from the original API. Here's the code:
clr.py
import sys
import imp

class MyLoader:
    def load_module(self, fullname):
        try:
            return sys.modules[fullname]
        except KeyError:
            pass
        print("--- load ---")
        print(fullname)
        m = imp.new_module(fullname)
        m.__file__ = "clr:" + fullname
        m.__path__ = []
        m.__loader__ = self
        m.speak = lambda: print("I'm " + fullname)
        sys.modules.setdefault(fullname, m)
        return m

class MyFinder:
    def find_module(self, fullname, path=None):
        print("--- find ---")
        print(fullname)
        print(path)
        if fullname.startswith("clr."):
            return MyLoader()
        return None

print("--- init ---")
__path__ = []
sys.meta_path.append(MyFinder())
test.py
import clr.Foo.Bar.Baz
clr.Foo.speak()
clr.Foo.Bar.speak()
clr.Foo.Bar.Baz.speak()
All in all this seems to work fine. Python guarantees that modules in the chain are imported left to right, so clr is always imported first, and it sets up the hook that allows the remainder of the chain to be imported.
However, I'm wondering if what I'm doing here is overkill. I am, after all, installing a global hook, that will be called for any module import, even though I filter out those that I don't care about. Is there, perhaps, some way to install a hook that will only be called for imports from my particular package, and not others? Or is the above the Right Way to do this kind of thing in Python?
In general, I think your approach looks fine. I wouldn't worry about it being "global", since the whole point is to specify which paths should be handled by you. Moving this test inside the import logic would just needlessly complicate it, so it's left to the implementer of the hook to decide.
Just one small concern: maybe you could use sys.path_hooks? It appears to be a bit less "powerful" than sys.meta_path:
sys.path_hooks is a list of callables, which will be checked in sequence to determine if they can handle a given path item. The callable is called with one argument, the path item. The callable must raise ImportError if it is unable to handle the path item, and return an importer object if it can handle the path item.
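To illustrate, here is a rough sketch of a path-hook variant using the same Python 2-era find_module API as the question's code; the sentinel string is made up, and MyLoader is the loader class from the question:

import sys

CLR_SENTINEL = "<clr-namespace>"  # marker path entry, not a real directory

class ClrPathHook:
    def __init__(self, path_entry):
        if path_entry != CLR_SENTINEL:
            raise ImportError  # decline; let other hooks handle this entry
    def find_module(self, fullname, path=None):
        return MyLoader()

sys.path_hooks.append(ClrPathHook)

# in clr.py, advertise the sentinel instead of an empty __path__, so that
# imports of clr.* consult this hook, without touching sys.meta_path
__path__ = [CLR_SENTINEL]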