Interesting usecase today: I need to migrate a module in our codebase following code changes. The old mynamespace.Document will disappear and I want to ensure smooth migration by replacing this package by a code object that will dynamically import the correct path and migrate the corresponding objects.
In short:
# instanciate a dynamic package, but do not load
# statically submodules
mynamespace.Document = SomeObject()
assert 'submodule' not in mynamespace.Document.__dict__
# and later on, when importing it, the submodule
# is built if not already available in __dict__
from namespace.Document.submodule import klass
c = klass()
A few things to note:
I am not talking only of migrating code. A simple huge sed would in a sense be enough to change the code in order to migrate some imports, and I would not need a dynamic module. I am talking of objects. A website, holding some live/stored objects will need migration. Those objects will be loaded assuming that mynamespace.Document.submodule.klass exists, and that's the reason for the dynamic module. I need to provide the site with something to load.
We cannot, or do not want to change the way objects are unpickled/loaded. For simplicity, let's just say that we want to make sure that the idiom from mynamespace.Document.submodule import klass has to work. I cannot use instead from mynamespace import Document as container; klass = getattr(getattr(container, 'submodule'), 'klass')
What I tried:
import sys
from types import ModuleType
class VerboseModule(ModuleType):
def __init__(self, name, doc=None):
super(VerboseModule, self).__init__(name, doc)
sys.modules[name] = self
def __repr__(self):
return "<%s %s>" % (self.__class__.__name__, self.__name__)
def __getattribute__(self, name):
if name not in ('__name__', '__repr__', '__class__'):
print "fetching attribute %s for %s" % (name, self)
return super(VerboseModule, self).__getattribute__(name)
class DynamicModule(VerboseModule):
"""
This module generates a dummy class when asked for a component
"""
def __getattr__(self, name):
class Dummy(object):
pass
Dummy.__name__ = name
Dummy.__module__ = self
setattr(self, name, Dummy)
return Dummy
class DynamicPackage(VerboseModule):
"""
This package should generate dummy modules
"""
def __getattr__(self, name):
mod = DynamicModule("%s.%s" % (self.__name__, name))
setattr(self, name, mod)
return mod
DynamicModule("foobar")
# (the import prints:)
# fetching attribute __path__ for <DynamicModule foobar>
# fetching attribute DynamicModuleWorks for <DynamicModule foobar>
# fetching attribute DynamicModuleWorks for <DynamicModule foobar>
from foobar import DynamicModuleWorks
print DynamicModuleWorks
DynamicPackage('document')
# fetching attribute __path__ for <DynamicPackage document>
from document.submodule import ButDynamicPackageDoesNotWork
# Traceback (most recent call last):
# File "dynamicmodule.py", line 40, in <module>
# from document.submodule import ButDynamicPackageDoesNotWork
#ImportError: No module named submodule
As you can see the Dynamic Package does not work. I do not understand what is happening because document is not even asked for a ButDynamicPackageDoesNotWork attribute.
Can anyone clarify what is happening; and if/how I can fix this?
The problem is that python will bypass the entry in for document in sys.modules and load the file for submodule directly. Of course this doesn't exist.
demonstration:
>>> import multiprocessing
>>> multiprocessing.heap = None
>>> import multiprocessing.heap
>>> multiprocessing.heap
<module 'multiprocessing.heap' from '/usr/lib/python2.6/multiprocessing/heap.pyc'>
We would expect that heap is still None because python can just pull it out of sys.modules but That doesn't happen. The dotted notation essentially maps directly to {something on python path}/document/submodule.py and an attempt is made to load that directly.
Update
The trick is to override pythons importing system. The following code requires your DynamicModule class.
import sys
class DynamicImporter(object):
"""this class works as both a finder and a loader."""
def __init__(self, lazy_packages):
self.packages = lazy_packages
def load_module(self, fullname):
"""this makes the class a loader. It is given name of a module and expected
to return the module object"""
print "loading {0}".format(fullname)
components = fullname.split('.')
components = ['.'.join(components[:i+1])
for i in range(len(components))]
for component in components:
if component not in sys.modules:
DynamicModule(component)
print "{0} created".format(component)
return sys.modules[fullname]
def find_module(self, fullname, path=None):
"""This makes the class a finder. It is given the name of a module as well as
the package that contains it (if applicable). It is expected to return a
loader for that module if it knows of one or None in which case other methods
will be tried"""
if fullname.split('.')[0] in self.packages:
print "found {0}".format(fullname)
return self
else:
return None
# This is a list of finder objects which is empty by defaule
# It is tried before anything else when a request to import a module is encountered.
sys.meta_path=[DynamicImporter('foo')]
from foo.bar import ThisShouldWork
Related
Call me weird if you like, many have before, but I have a large class which I'd like to make extensible with methods loaded from a plugin directory. Essentially, I'm monkey patching the class. What I have almost works but the method loaded doesn't 'see' the globals defined in __main__. Ideally I'd like a way to tell globals() (or whatever mechanism is actually used to locate global variables) to use that existing in __main__. Here is the code I have (trimmed for the sake of brevity):
#!/usr/bin/env python3
import importlib
import os
import types
main_global = "Hi, I'm in main"
class MyClass:
def __init__(self, plugin_dir=None):
if plugin_dir:
self.load_plugins(plugin_dir, ext="plugin")
def load_plugins(self, plugin_dir, ext):
""" Load plugins
Plugins are files in 'plugin_dir' that have the given extension.
The functions defined within are imported as methods of this class.
"""
cls = self.__class__
# First check that we're not importing the same extension twice into
# the same class.
try:
plugins = getattr(cls, "_plugins")
except AttributeError:
plugins = set()
setattr(cls, "_plugins", plugins)
if ext in plugins:
return
plugins.add(ext)
for file in os.listdir(plugin_dir):
if not file.endswith(ext):
continue
filename = os.path.join(plugin_dir, file)
loader = importlib.machinery.SourceFileLoader("bar", filename)
module = types.ModuleType(loader.name)
loader.exec_module(module)
for name in dir(module):
if name.startswith("__"):
continue
obj = getattr(module, name)
if callable(obj):
obj = obj.__get__(self, cls)
setattr(cls, name, obj)
z = MyClass(plugin_dir="plugins")
z.foo("Hello")
And this is 'foo.plugin' from the plugins directory:
#!/usr/bin/env python3
foo_global = "I am global within foo"
def foo(self, value):
print(f"I am foo, called with {self} and {value}")
print(f"foo_global = {foo_global}")
print(f"main_global = {main_global}")
The output is...
I am foo, called with <__main__.MyClass object at 0x7fd4680bfac8> and Hello
foo_global = I am global within foo
Traceback (most recent call last):
File "./plugged", line 55, in <module>
z.foo("Hello")
File "plugins/foo.plugin", line 8, in foo
print(f"main_global = {main_global}")
NameError: name 'main_global' is not defined
I know it all feels a bit 'hacky', but it's become a challenge so please don't flame me on style etc. If there's another way to achieve this aim, I'm all ears.
Thoughts, learned friends?
You can do what you want with a variation of the technique shown in #Martijn Pieters' answer to the the question: How to inject variable into scope with a decorator? tweaked to inject multiple values into a class method.
from functools import wraps
import importlib
import os
from pathlib import Path
import types
main_global = "Hi, I'm in main"
class MyClass:
def __init__(self, plugin_dir=None):
if plugin_dir:
self.load_plugins(plugin_dir, ext="plugin")
def load_plugins(self, plugin_dir, ext):
""" Load plugins
Plugins are files in 'plugin_dir' that have the given extension.
The functions defined within are imported as methods of this class.
"""
cls = self.__class__
# First check that we're not importing the same extension twice into
# the same class.
try:
plugins = getattr(cls, "_plugins")
except AttributeError:
plugins = set()
setattr(cls, "_plugins", plugins)
if ext in plugins:
return
plugins.add(ext)
for file in Path(plugin_dir).glob(f'*.{ext}'):
loader = importlib.machinery.SourceFileLoader("bar", str(file))
module = types.ModuleType(loader.name)
loader.exec_module(module)
namespace = globals()
for name in dir(module):
if name.startswith("__"):
continue
obj = getattr(module, name)
if callable(obj):
obj = inject(obj.__get__(self, cls), namespace)
setattr(cls, name, obj)
def inject(method, namespace):
#wraps(method)
def wrapped(*args, **kwargs):
method_globals = method.__globals__
# Save copies of any of method's global values replaced by the namespace.
replaced = {key: method_globals[key] for key in namespace if key in method_globals}
method_globals.update(namespace)
try:
method(*args[1:], **kwargs)
finally:
method_globals.update(replaced) # Restore any replaced globals.
return wrapped
z = MyClass(plugin_dir="plugins")
z.foo("Hello")
Example output:
I am foo, called with <__main__.MyClass object at 0x0056F670> and Hello
foo_global = I am global within foo
main_global = Hi, I'm in main
You can approach the problem with a factory function and inheritance. Assuming each of your plugins is something like this, defined in a separate importable file:
class MyPlugin:
foo = 'bar'
def extra_method(self):
print(self.foo)
You can use a factory like this:
def MyClassFactory(plugin_dir):
def search_and_import_plugins(plugin_dir):
# Look for all possible plugins and import them
return plugin_list # a list of plugin classes, like [MyPlugin]
plugin_list = search_and_import_plugins(plugin_dir):
class MyClass(*plugin_list):
pass
return MyClass()
z = MyClassFactory('/home/me/plugins')
I have class called 'my_class' placed in 'my_module'. And I need to import this class. I tried to do it like this:
import importlib
result = importlib.import_module('my_module.my_class')
but it says:
ImportError: No module named 'my_module.my_class'; 'my_module' is not a package
So. As I can see it works only for modules, but can't handle classes. How can I import a class from a module?
It is expecting my_module to be a package containing a module named 'my_class'. If you need to import a class, or an attribute in general, dynamically, just use getattr after you import the module:
cls = getattr(import_module('my_module'), 'my_class')
Also, yes, it does only work with modules. Remember importlib.import_module is a wrapper of the internal importlib.__import__ function. It doesn't offer the same amount of functionality as the full import statement which, coupled with from, performs an attribute look-up on the imported module.
import importlib
import logging
logger = logging.getLogger(__name__)
def factory(module_class_string, super_cls: type = None, **kwargs):
"""
:param module_class_string: full name of the class to create an object of
:param super_cls: expected super class for validity, None if bypass
:param kwargs: parameters to pass
:return:
"""
module_name, class_name = module_class_string.rsplit(".", 1)
module = importlib.import_module(module_name)
assert hasattr(module, class_name), "class {} is not in {}".format(class_name, module_name)
logger.debug('reading class {} from module {}'.format(class_name, module_name))
cls = getattr(module, class_name)
if super_cls is not None:
assert issubclass(cls, super_cls), "class {} should inherit from {}".format(class_name, super_cls.__name__)
logger.debug('initialising {} with params {}'.format(class_name, kwargs))
obj = cls(**kwargs)
return obj
I'm currently writing some kind of tiny api to support extending module classes. Users should be able to just write their class name in a config and it gets used in our program. The contract is, that the class' module has a function called create(**kwargs) to return an instance of our base module class, and is placed in a special folder. But the isinstance check Fails as soon as the import is made dynamically.
modules are placed in lib/services/name
module base class (in lib/services/service)
class Service:
def __init__(self, **kwargs):
#some initialization
example module class (in lib/services/ping)
class PingService(Service):
def __init__(self, **kwargs):
Service.__init__(self,**kwargs)
# uninteresting init
def create(kwargs):
return PingService(**kwargs)
importing function
import sys
from lib.services.service import Service
def doimport( clazz, modPart, kw, class_check):
path = "lib/" + modPart
sys.path.append(path)
mod = __import__(clazz)
item = mod.create(kw)
if class_check(item):
print "im happy"
return item
calling code
class_check = lambda service: isinstance(service, Service)
s = doimport("ping", "services", {},class_check)
print s
from lib.services.ping import create
pingService = create({})
if isinstance(pingService, Service):
print "why this?"
what the hell am I doing wrong
here is a small example zipped up, just extract and run test.py without arguments
zip example
The problem was in your ping.py file. I don't know exactly why, but when dinamically importing it was not accepting the line from service import Service, so you just have to change it to the relative path: from lib.services.service import Service. Adding lib/services to the sys.path could not make it work the inheritance, which I found strange...
Also, I am using imp.load_source which seems more robust:
import os, imp
def doimport( clazz, modPart, kw, class_check):
path = os.path.join('lib', modPart, clazz + '.py')
mod = imp.load_source( clazz, path )
item = mod.create(kw)
if class_check(item):
print "im happy"
return item
I have a module that imports fine (i print it at the top of the module that uses it)
from authorize import cim
print cim
Which produces:
<module 'authorize.cim' from '.../dist-packages/authorize/cim.pyc'>
However later in a method call, it has mysteriously turned to None
class MyClass(object):
def download(self):
print cim
which when run show that cim is None. The module isn't ever directly assigned to None anywhere in this module.
Any ideas how this can happen?
As you comment it youself - it is likely some code is attributing None to the "cim" name on your module itself - the way for checking for this is if your large module would be made "read only" for other modules -- I think Python allows for this --
(20 min. hacking ) --
Here -- just put this snippet in a "protect_module.py" file, import it, and call
"ProtectdedModule()" at the end of your module in which the name "cim" is vanishing -
it should give you the culprit:
"""
Protects a Module against naive monkey patching -
may be usefull for debugging large projects where global
variables change without notice.
Just call the "ProtectedModule" class, with no parameters from the end of
the module definition you want to protect, and subsequent assignments to it
should fail.
"""
from types import ModuleType
from inspect import currentframe, getmodule
import sys
class ProtectedModule(ModuleType):
def __init__(self, module=None):
if module is None:
module = getmodule(currentframe(1))
ModuleType.__init__(self, module.__name__, module.__doc__)
self.__dict__.update(module.__dict__)
sys.modules[self.__name__] = self
def __setattr__(self, attr, value):
frame = currentframe(1)
raise ValueError("Attempt to monkey patch module %s from %s, line %d" %
(self.__name__, frame.f_code.co_filename, frame.f_lineno))
if __name__ == "__main__":
from xml.etree import ElementTree as ET
ET = ProtectedModule(ET)
print dir(ET)
ET.bla = 10
print ET.bla
In my case, this was related with threading quirks: https://docs.python.org/2/library/threading.html#importing-in-threaded-code
I have the following interface :
class Interface(object):
__metaclass__ = abc.ABCMeta
#abc.abstractmethod
def run(self):
"""Run the process."""
return
I have a collections of modules that are all in the same directory. Each module contains a single class that implements my interface.
For example Launch.py :
class Launch(Interface):
def run(self):
pass
Let's say I have 20 modules, that implements 20 classes. I would like to be able to launch a module that would check if some of the classes do not implement the Interface.
I know I have to use :
issubclass(Launch, ProcessInterface) to know if a certain class implements my process interface.
introspection to get the class that is in my module.
import modules at runtime
I am just not sure how to do that.
I can manage to use issubclass inside a module.
But I cannot use issubclass if I am outside the module.
I need to :
get the list of all modules in the directory
get the class in each module
do issubclass on each class
I would need a draf of a function that could do that.
You're probably looking for something like this:
from os import listdir
from sys import path
modpath = "/path/to/modules"
for modname in listdir(modpath):
if modname.endswith(".py"):
# look only in the modpath directory when importing
oldpath, path[:] = path[:], [modpath]
try:
module = __import__(modname[:-3])
except ImportError:
print "Couldn't import", modname
continue
finally: # always restore the real path
path[:] = oldpath
for attr in dir(module):
cls = getattr(module, attr)
if isinstance(cls, type) and not issubclass(cls, ProcessInterface):
# do whatever