Package-specific import hooks in Python

I'm working on a Python module that maps an API provided by a different language/framework into Python. Ideally, I would like this to be presented as a single root package that exposes helper methods, and which maps all namespaces in that other framework to Python packages/modules. For the sake of convenience, let's take CLR as an example:
import clr.System.Data
import clr.System.Windows.Forms
Here clr is the magic top-level package which exposes the CLR namespaces System.Data and System.Windows.Forms as subpackages/submodules (so far as I can see, a package is just a module with child modules/packages; it is still valid to have other kinds of members therein).
I've read PEP 302 and wrote a simple prototype program that achieves a similar effect by installing a custom meta_path hook. The clr module itself is a proper Python module which, when imported, sets __path__ = [] (making it a package, so that import attempts a lookup for submodules at all), and registers the hook. The hook itself intercepts any package load where the full name of the package starts with "clr.", dynamically creates the new module using imp.new_module(), registers it in sys.modules, and uses pixie dust and rainbows to fill it with classes and methods from the original API. Here's the code:
clr.py
import sys
import imp
class MyLoader:
    def load_module(self, fullname):
        try:
            return sys.modules[fullname]
        except KeyError:
            pass
        print("--- load ---")
        print(fullname)
        m = imp.new_module(fullname)
        m.__file__ = "clr:" + fullname
        m.__path__ = []
        m.__loader__ = self
        m.speak = lambda: print("I'm " + fullname)
        sys.modules.setdefault(fullname, m)
        return m
class MyFinder:
    def find_module(self, fullname, path = None):
        print("--- find ---")
        print(fullname)
        print(path)
        if fullname.startswith("clr."):
            return MyLoader()
        return None
print("--- init ---")
__path__ = []
sys.meta_path.append(MyFinder())
test.py
import clr.Foo.Bar.Baz
clr.Foo.speak()
clr.Foo.Bar.speak()
clr.Foo.Bar.Baz.speak()
All in all this seems to work fine. Python guarantees that modules in the chain are imported left to right, so clr is always imported first, and it sets up the hook that allows the remainder of the chain to be imported.
However, I'm wondering if what I'm doing here is overkill. I am, after all, installing a global hook, that will be called for any module import, even though I filter out those that I don't care about. Is there, perhaps, some way to install a hook that will only be called for imports from my particular package, and not others? Or is the above the Right Way to do this kind of thing in Python?

In general, I think your approach looks fine. I wouldn't worry about it being "global", since the whole point is to specify which paths should be handled by you. Moving this test inside the import logic would just needlessly complicate it, so it's left to the implementer of the hook to decide.
One small suggestion: maybe you could use sys.path_hooks instead? It appears to be a bit less "powerful" than sys.meta_path. PEP 302 describes it as follows:
sys.path_hooks is a list of callables, which will be checked in
sequence to determine if they can handle a given path item. The
callable is called with one argument, the path item. The callable
must raise ImportError if it is unable to handle the path item, and
return an importer object if it can handle the path item.
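To make the sys.path_hooks suggestion concrete, here is a minimal sketch using the importlib API available since Python 3.4. The package name virtualpkg, the MARKER string, and the class names are all made up for illustration; the idea is that the root package puts a marker that no real finder accepts into its __path__, so the hook is only consulted for imports below that package. This is one possible arrangement, not the only way to do it.

```python
import importlib.abc
import importlib.machinery
import sys
import types

# Hypothetical marker: any string that is neither a directory nor a zip file.
MARKER = "virtualpkg://"


class VirtualLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # let the import system create a normal module object

    def exec_module(self, module):
        name = module.__name__
        # Stand-in for filling the module with the foreign API.
        module.speak = lambda: "I'm " + name


class VirtualFinder:
    """Path entry finder returned only for the marker path entry."""

    def find_spec(self, fullname, target=None):
        spec = importlib.machinery.ModuleSpec(
            fullname, VirtualLoader(), is_package=True
        )
        # Every generated module is itself a package with the same marker,
        # so arbitrarily deep names keep resolving through this finder.
        spec.submodule_search_locations = [MARKER]
        return spec


def path_hook(path_entry):
    if path_entry != MARKER:
        raise ImportError(path_entry)  # not ours: let other hooks try
    return VirtualFinder()


sys.path_hooks.append(path_hook)

# Stand-in for the real root package; a real one would simply set
# __path__ = [MARKER] in its own source file.
root = types.ModuleType("virtualpkg")
root.__path__ = [MARKER]
sys.modules["virtualpkg"] = root

import virtualpkg.Foo.Bar

message = virtualpkg.Foo.Bar.speak()
print(message)  # I'm virtualpkg.Foo.Bar
```

Because the finder is reached through the package's __path__, imports of unrelated modules never call it, which is the package-specific behaviour the question asks about.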

Related

How to customize a module import in Python?

I would like to customize the behavior of my module when it is imported.
For example, let's say I want my module to print an incremented number each time another file uses import my_module. And when from my_module import some_string is used, it should print "some_string".
How could I do that?
I read several questions here and there, but this does not seem to work.
# my_module.py
import sys
class MyImporter:
    def find_module(self, module_name, package_path):
        print(module_name, package_path)
        return self
    def load_module(self, module_name):
        print(module_name)
        return self
sys.meta_path.append(MyImporter())
# file.py
import my_module # Nothing happens

What you're asking for is to have Python work not like Python. Whenever it imports a module it parses and executes the 'opened' code only once so it can pick up the definitions, functions, classes, etc. - every subsequent import of the module just references the cached & parsed first import.
That's why even if you put something like vars()["counter"] = vars().get("counter", 0) + 1 at your module's 'root', the counter will never go above 1, indicating that the module was indeed executed only once. You can force a module reload using reload() on Python 2 (or importlib.reload() on Python 3.4+), but a counter that the module itself initializes with a plain assignment (such as counter = 0) is reset on every reload.
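The caching behaviour is easy to check. The sketch below writes a throwaway module (the name counted_module is made up for the demonstration) into a temporary directory and imports it twice; the second import just returns the cached module object.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A throwaway module whose body bumps a counter each time it is executed.
    with open(os.path.join(tmp, "counted_module.py"), "w") as fh:
        fh.write('counter = vars().get("counter", 0) + 1\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import counted_module as first   # executes the module body
        import counted_module as second  # served from sys.modules
    finally:
        sys.path.remove(tmp)

same_object = first is second
runs = first.counter
print(same_object, runs)  # True 1
```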
Of course, you can have an external counter to be called when your module is imported, but that would have to be a contract with the users of your module at which point the question becomes - can't you just contract your users to call a function to increase your counter whenever they import your module instead of having to reload it for you to capture the count? Reloading a module will also make it have a potentially different state in every context it was reloaded which will make Python behave unexpectedly and should be avoided at any cost.
So, a short answer would be - no, you cannot do that and you should not attempt to do it. If you want something that doesn't work like Python - use something that isn't Python.
However... If you have a really, REALLY good reason to do this (and you don't!) and you don't mind hacking how Python fundamentally behaves (and you should mind) then you might attempt to do this by wrapping the built-in import and checking whenever it gets fired for your module. Something like:
your_module.py:
# HERE BE DRAGONS!!!
import sys
try:
    import builtins  # Python 3
except ImportError:
    import __builtin__ as builtins  # Python 2
__builtin_import__ = builtins.__import__  # store a reference to the built-in import
def __custom_import__(name, *args, **kwargs):
    # execute the built-in import first so that the import fails if badly requested
    ret = __builtin_import__(name, *args, **kwargs)
    if ret is sys.modules[__name__]:  # we're trying to load this module
        # args is (globals, locals, fromlist, level), so fromlist is args[2]
        if len(args) > 2 and args[2]:  # using the `from your_module import whatever` form
            if "some_string" in args[2]:  # if some_string is amongst the requested names
                print("some_string")
        else:  # using the `import your_module` form...
            print_counter()  # increase and print the latest count
    return ret  # return back the actual import result
builtins.__import__ = __custom_import__  # override the built-in import with our method
counter = 0
# a convenience function, you can do all of this through the `__custom_import__` function
def print_counter():
    global counter
    counter += 1
    print(counter)
print_counter()  # call it immediately on the first import to print out the counter
some_string = "I'm holding some string value"  # since we want to import this
# HAVE I FORGOT TO TELL YOU NOT TO DO THIS? WELL, DON'T!!!
Keep in mind that this will not account for the first import (be it in the pure import your_module or in the from your_module import whatever form) as the import override won't exist until your module is loaded - that's why it calls print_counter() immediately, in the hope that the first import of the module was in the form of import your_module and not in the from..import form (if not, it will wrongly print out the count instead of some_string the first time). To solve the first-import issue, you can move this 'override' to the __init__.py in the same folder so that the override loads before your module starts, and then delegate the counter change / some_string print to the module once loaded. Just make sure you do your module name check properly in that case (you need to account for the package as well) and make sure it doesn't automatically execute the counter.
You also, technically, don't need the some_string property at all - by moving the execution of the built-in import around you can do your from..import check first, find the position of some_string in args[2] and pop it before calling the built-in import, then return None in the same position once executed. You can also do your printing and counter incrementing from within the overridden import function.
Again, for the love of all things fluffy and the poor soul who might have to rely on your code one day - please don't do this!
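For readers who only want to see what the import statement actually passes to __import__, here is a small sketch that wraps the built-in temporarily and restores it right away. It uses json purely as a convenient standard-library target, and the names logging_import, module_globals and seen are made up for the example. The same warnings as above apply; this is for inspection, not something to ship.

```python
import builtins

original_import = builtins.__import__
module_globals = globals()
seen = []  # fromlist values recorded for imports of json made by this code


def logging_import(name, caller_globals=None, caller_locals=None,
                   fromlist=(), level=0):
    module = original_import(name, caller_globals, caller_locals,
                             fromlist, level)
    # Only record imports issued from this namespace, not the ones that
    # json performs internally while it is being loaded.
    if name == "json" and caller_globals is module_globals:
        seen.append(tuple(fromlist or ()))
    return module


builtins.__import__ = logging_import
try:
    import json              # plain form: fromlist is empty
    from json import dumps   # from-form: fromlist names the requested attributes
finally:
    builtins.__import__ = original_import  # always restore the real import

print(seen)  # [(), ('dumps',)]
```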

Actually, it does look like it's possible to do what you're looking for in Python 3.5. It's probably a bad idea, and I've deliberately written my code to demonstrate the concept without being polished enough to use as-is, because I'd think carefully before doing something like this in a production project.
If you need to look at a more-or-less production example of this, take a look at the SelfWrapper class in the sh module.
Meanwhile, you can change the class of your own entry in sys.modules to a subclass of types.ModuleType. Then you can override __getattribute__ and detect attribute accesses.
As best I can tell:
Every subsequent import of the module accesses __spec__, so you could probably count accesses to __spec__ to count total imports.
Each from foo import bar accesses bar as an attribute. I don't think you can distinguish between "from foo import bar" and "import foo; foo.bar".
import sys, types
class Wrapper(types.ModuleType):
    def __getattribute__(self, attr):
        print(attr)
        return super().__getattribute__(attr)
test = "test"
sys.modules[__name__].__class__ = Wrapper
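The same idea can be tried without creating a file by building a module object in memory. In this sketch the module name wrapped_demo and the accessed list are made up; which dunder attributes the import machinery touches along the way varies between Python versions, so the example only relies on the requested name showing up.

```python
import sys
import types

accessed = []  # every attribute name looked up on the wrapped module


class Wrapper(types.ModuleType):
    def __getattribute__(self, attr):
        accessed.append(attr)
        return super().__getattribute__(attr)


# Build a module in memory instead of a file on disk.
demo = types.ModuleType("wrapped_demo")
demo.test = "test"
demo.__class__ = Wrapper  # allowed for ModuleType subclasses since Python 3.5
sys.modules["wrapped_demo"] = demo

from wrapped_demo import test

saw_test = "test" in accessed
print(saw_test)  # True
```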

Here is how you can dynamically import modules:
from importlib import import_module
def import_from(module, name):
    # import_module's second argument is the anchor package for relative
    # imports, so the attribute name must not be passed to it
    module = import_module(module)
    return getattr(module, name)
and use it like this:
funcObj = import_from("<file_name>", "<method_name>")
response = funcObj(arg1, arg2)
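As a concrete usage sketch with standard-library modules (math and os.path are picked only because they are always available, and the parameter names are renamed here for clarity):

```python
from importlib import import_module


def import_from(module_name, attr_name):
    """Import module_name and return the attribute attr_name from it."""
    module = import_module(module_name)
    return getattr(module, attr_name)


sqrt = import_from("math", "sqrt")
result = sqrt(9.0)
print(result)  # 3.0

# Dotted module names work the same way.
basename = import_from("os.path", "basename")
print(basename("drivers/mod1.py"))  # mod1.py
```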

Hacking python's import statement

I'm building a Python module for a fairly specific purpose. What I'd like to do with this is get more functionality behind importing things from it.
I'd like to have a setup by which saying from my_module import foo would run a function and pass the string "foo". This function would return the object that should be imported.
For example, maybe I want to make a cloud-based import system. I'd like to store community scripts in the cloud, and then download them when a user tries to import them.
Maybe I use the code from cloud import test_module. This would check a cache to decide whether test_module had been downloaded. If so, it would return that module. If not, it would download the module before importing it.
How can I accomplish something like this in Python, by which a dynamic range of submodules could be seamlessly imported from the cloud?

Full featured support for what you ask probably requires a bunch of complicated code using importlib and hooking into various parts of the import machinery. However, a more limited solution can be implemented with just a single custom class that pretends to be a module.
When you import a module, Python first checks in the sys.modules dictionary to see if the module is a key. If so, it returns the value associated with the key. It does this regardless of what the value is, so you can put any kind of object in sys.modules and Python will treat it like a module. A module's code can even replace its own entry in sys.modules, and the replacement will be used even the first time it is imported!
So, to implement your fancy module that downloads other modules on demand, replace the module itself with an instance of a custom class, and write that class a __getattr__ or __getattribute__ method that does the work you want.
Here's a trivial example module that returns a string for any attribute you look for in it. The string will always be the same as the requested attribute name. In your code, you'd want to do your fancy web-cache lookups and downloading, and then return the fetched module object instead of just returning a string.
class FakeModule(object):
    def __getattribute__(self, name):
        return name
import sys
sys.modules[__name__] = FakeModule()
On my system I've saved that as fakemodule.py. Now if I do from fakemodule import foo, I get foo with the value 'foo' in my local namespace.
Note that this only works for one level deep imports. If you do from fakemodule.subpackage import name it will not work because there's no fakemodule.subpackage entry in sys.modules.
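A slightly fuller sketch of the same trick, closer to the cloud scenario from the question. Everything here is hypothetical: fetch_module stands in for the real cache lookup and download, and the package name cloud is simply registered by hand. It uses __getattr__ on a types.ModuleType subclass rather than __getattribute__ on a plain object, so ordinary attributes still behave normally.

```python
import sys
import types


def fetch_module(name):
    """Hypothetical stand-in for the real cache check and download step."""
    mod = types.ModuleType("cloud." + name)
    mod.origin = "downloaded:" + name
    return mod


class CloudModule(types.ModuleType):
    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("__"):
            # Leave dunder probes by the import machinery (such as
            # __path__) alone instead of "downloading" them.
            raise AttributeError(name)
        mod = fetch_module(name)
        setattr(self, name, mod)  # cache, so later lookups skip __getattr__
        return mod


sys.modules["cloud"] = CloudModule("cloud")

from cloud import test_module

print(test_module.origin)  # downloaded:test_module
```

Like the original, this only handles one level: from cloud.sub import name would still need real import hooks.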

Lazy-loading modules in python

I'm trying to put together a system that will handle lazy-loading of modules that don't explicitly exist. Basically I have an http server with a number of endpoints that I don't know ahead of time that I would like to programmatically offer for import. These modules would all have a uniform method signature, they just wouldn't exist ahead of time.
import lazy.route as test
import lazy.fake as test2
test('Does this exist?') # This sends a post request.
test2("This doesn't exist.") # Also sends a post request
I can handle all the logic I need around these imports with a uniform decorator, I just can't find any way of "decorating" imports in python, or actually interacting with them in any kind of programmatic way.
Does anyone have experience with this? I've been hunting around, and the closest thing I've found is the ast module, which would lead to a really awful kind of hacky implementation under my current understanding (something like finding all import statements and manually overwriting the import function).
Not looking for a handout, just a piece of the python codebase to start looking at, or an example of someone that's done something similar.

I got a little clever in my googling and managed to find a PEP that specifically addresses this issue. It just happens to be relatively unknown, probably because the set of reasonable uses for it is pretty narrow.
I found an excellent piece of example code showing off the sys.meta_path mechanism. I've posted it below to show how to dynamically bootstrap your import statements.
import sys
class VirtualModule(object):
    def hello(self):
        return 'Hello World!'
class CustomImporter(object):
    virtual_name = 'my_virtual_module'
    def find_module(self, fullname, path):
        """This method is called by Python if this class
        is on sys.meta_path. fullname is the fully-qualified
        name of the module to look for, and path is either
        __path__ (for submodules and subpackages) or None (for
        a top-level module/package).
        Note that this method will be called every time an import
        statement is detected (or __import__ is called), before
        Python's built-in package/module-finding code kicks in."""
        if fullname == self.virtual_name:
            # As per PEP #302 (which implemented the sys.meta_path protocol),
            # if fullname is the name of a module/package that we want to
            # report as found, then we need to return a loader object.
            # In this simple example, that will just be self.
            return self
        # If we don't provide the requested module, return None, as per
        # PEP #302.
        return None
    def load_module(self, fullname):
        """This method is called by Python if CustomImporter.find_module
        does not return None. fullname is the fully-qualified name
        of the module/package that was requested."""
        if fullname != self.virtual_name:
            # Raise ImportError as per PEP #302 if the requested module/package
            # couldn't be loaded. This should never be reached in this
            # simple example, but it's included here for completeness. :)
            raise ImportError(fullname)
        # PEP#302 says to return the module if the loader object (i.e.,
        # this class) successfully loaded the module.
        # Note that a regular class works just fine as a module.
        return VirtualModule()
if __name__ == '__main__':
    # Add our import hook to sys.meta_path
    sys.meta_path.append(CustomImporter())
    # Let's use our import hook
    import my_virtual_module
    print(my_virtual_module.hello())
The full blog post is here
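The example above uses the original PEP 302 find_module/load_module protocol. Since Python 3.4 the documented protocol is find_spec plus create_module/exec_module, and as far as I know the legacy methods are no longer consulted from Python 3.12 on. A rough equivalent in the newer style might look like this (the class names are mine, not from the blog post):

```python
import importlib.abc
import importlib.machinery
import sys


class VirtualLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        # Populate the freshly created module instead of returning a class.
        module.hello = lambda: "Hello World!"


class VirtualFinder(importlib.abc.MetaPathFinder):
    virtual_name = "my_virtual_module"

    def find_spec(self, fullname, path, target=None):
        if fullname != self.virtual_name:
            return None  # not ours: let the other finders handle it
        return importlib.machinery.ModuleSpec(fullname, VirtualLoader())


# Appended, so the regular finders are still tried first.
sys.meta_path.append(VirtualFinder())

import my_virtual_module

greeting = my_virtual_module.hello()
print(greeting)  # Hello World!
```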

Automatic instance Python

Suppose you have the following structure in your src folder:
conf.py
./drivers
    mod1.py --> contains mod1Class
    mod2.py --> contains mod2Class
What I'd like to have is a snippet of code in conf.py to automatically instantiate the classes in mod*.py so that if one day I'll add mod3.py --> mod3Class this will be automatically instantiated in conf.py without adding any line of code.
I tried, without success:
from drivers import *
but I'm not able to import, I receive a NameError. So I'm stuck at the very first step. Also suppose I'm able to perform the import successfully, how can I do:
mod1Class_instance = mod1.mod1Class() (in a cycle, one instance for every file in drivers)
in an automatic way? I cannot use strings to make the instance of a class so I cannot get the names of the files in drivers and use strings. What's the right way to do this operation?
Thanks

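Maybe this is what you need:
from types import ModuleType
import drivers
# SomeBaseClass is assumed to be a common base class of all driver classes,
# defined or imported elsewhere.
for module_name in dir(drivers):
    # dir() returns names, so fetch the actual object first; submodules only
    # show up here once they have been imported (e.g. in drivers/__init__.py)
    driver_module = getattr(drivers, module_name)
    if not isinstance(driver_module, ModuleType):
        continue  # not a real driver module
    for attr_name in dir(driver_module):
        cls = getattr(driver_module, attr_name)
        if not (isinstance(cls, type) and issubclass(cls, SomeBaseClass)):
            continue  # not a real driver class
        # create a new variable named after the lower-cased class name
        locals()[cls.__name__.lower()] = cls()
Also, you should create an __init__.py file in your drivers folder. This marks the folder as a Python package.
On the other hand, I recommend writing out all imports manually. This simple approach makes your code clearer.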

This loads classes in modules in a directory drivers, which is in the same directory of the current module, and does not need to make drivers a package:
from collections import defaultdict
import importlib.util
import os
import pkgutil
def getclasses(module):
    """Get classes from a driver module."""
    # I'd recommend another way to find classes; for example, in defuz's
    # answer, classes in driver modules would have one base class.
    try:
        yield getattr(module, module.__name__ + "Class")
    except AttributeError:
        pass
instances = defaultdict(list)
drivers_dir = os.path.join(os.path.dirname(__file__), 'drivers')
for module_loader, name, ispkg in pkgutil.iter_modules([drivers_dir]):
    # find_spec/exec_module replace the legacy find_module/load_module calls
    spec = module_loader.find_spec(name)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    for cls in getclasses(module):
        # You might want to use the name of the module as a key instead of the
        # module object, or, perhaps, to append all instances to a same list.
        instances[module].append(cls())
# I'd recommend not putting instances in the module namespace,
# and just leaving them in that dictionary.
for mod_instances in instances.values():
    for instance in mod_instances:
        locals()[type(instance).__name__ + "_instance"] = instance
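To try this approach end to end without touching a real project, the sketch below builds a throwaway drivers directory in a temporary folder and keeps the instances in a dictionary keyed by module name. The helper name load_driver_instances is made up, and the modN/modNClass naming convention is taken from the question.

```python
import importlib.util
import os
import pkgutil
import tempfile


def load_driver_instances(drivers_dir):
    """Instantiate <name>Class from every module found in drivers_dir."""
    instances = {}
    for finder, name, ispkg in pkgutil.iter_modules([drivers_dir]):
        spec = finder.find_spec(name)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        cls = getattr(module, name + "Class", None)
        if isinstance(cls, type):
            instances[name] = cls()
    return instances


with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for drivers/mod1.py and drivers/mod2.py from the question.
    for name in ("mod1", "mod2"):
        with open(os.path.join(tmp, name + ".py"), "w") as fh:
            fh.write("class %sClass:\n    pass\n" % name)
    instances = load_driver_instances(tmp)

names = sorted(instances)
print(names)  # ['mod1', 'mod2']
```

Dropping a mod3.py with a mod3Class into the directory would be picked up on the next run without changing this code.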

Python: importing through function to main namespace

(Important: See update below.)
I'm trying to write a function, import_something, that will import certain modules. (It doesn't matter which for this question.) The thing is, I would like those modules to be imported at the level from which the function is called. For example:
import_something() # Let's say this imports my_module
my_module.do_stuff() #
Is this possible?
Update:
Sorry, my original phrasing and example were misleading. I'll try to explain my entire problem. What I have is a package, which has inside it some modules and packages. In its __init__.py I want to import all the modules and packages. So somewhere else in the program, I import the entire package, and iterate over the modules/packages it has imported.
(Why? The package is called crunchers, and inside it there are defined all kinds of crunchers, like CruncherThread, CruncherProcess, and in the future perhaps MicroThreadCruncher. I want the crunchers package to automatically have all the crunchers that are placed in it, so later in the program when I use crunchers I know it can tell exactly which crunchers I have defined.)
I know I can solve this if I avoid using functions at all, and do all imports on the main level with for loops and such. But it's ugly and I want to see if I can avoid it.
If anything more is unclear, please ask in comments.

Functions have the ability to return something to where they were called. It's called their return value :p
def import_something():
    # decide what to import
    # ...
    something = "my_module"  # placeholder for the name decided above
    mod = __import__(something)
    return mod
my_module = import_something()
my_module.do_stuff()
Good style, no hassle.
About your update, I think adding something like this to your __init__.py does what you want:
import importlib
import os
# make a list of all .py files in the same dir that don't start with _
__all__ = installed = [
    name
    for (name, ext) in (os.path.splitext(fn) for fn in os.listdir(os.path.dirname(__file__)))
    if ext == '.py' and not name.startswith('_')
]
for name in installed:
    # import them all; the leading dot makes the import relative to this
    # package, which Python 3 requires
    importlib.import_module('.' + name, __name__)
Somewhere else:
import crunchers
crunchers.installed  # all names
crunchers.cruncherA  # actual module object, but you can't use it since you don't know the name when you write the code
# turns out to be pretty much the same as the first solution :p
mycruncher = getattr(crunchers, crunchers.installed[0])

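You can monkey with the parent frame in CPython to install the modules into the locals for that frame (and only that frame). The downsides are that a) this is really quite hackish and b) sys._getframe() is not guaranteed to exist in other Python implementations.
import sys
def importer():
    f = sys._getframe(1)  # Get the parent frame
    # module_name is the name of the module to import, defined elsewhere
    f.f_locals["some_name"] = __import__(module_name, f.f_globals, f.f_locals)
You still have to install the module into f_locals, since __import__ won't actually do that for you - you just supply the parent frame locals and globals for the proper context.
Then in your calling function you can have:
def foo():
    importer()  # Magically makes 'some_name' available to the calling function
    some_name.some_func()
Be aware that CPython resolves a function's local names at compile time, so this is only dependable when the caller is module-level code, where f_locals is the module's globals dictionary. Inside a function such as foo(), some_name is compiled as a global lookup and the write to f_locals is generally not seen.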

Are you looking for something like this?
import sys
def my_import(*names):
    for name in names:
        sys._getframe(1).f_locals[name] = __import__(name)
then you can call it like this:
my_import("os", "re")
or
namelist = ["os", "re"]
my_import(*namelist)
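A caveat on the frame-based answers: in CPython, writing into the f_locals of a function frame does not reliably create a local variable, so they are mainly useful when the caller is module-level code. The sketch below is a variation that writes into the caller's globals instead, which is where an unassigned name is looked up anyway. It is still a CPython-specific hack built on sys._getframe, shown here with json and re only as example modules.

```python
import sys


def my_import(*names):
    # Bind each imported module in the *caller's* global namespace.
    caller_globals = sys._getframe(1).f_globals
    for name in names:
        caller_globals[name] = __import__(name)


my_import("json", "re")

# json and re were never imported with an import statement here.
encoded = json.dumps([1, 2])
print(encoded)  # [1, 2]
```

Returning the module from the function, as in the first answer, remains the cleaner option.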

According to __import__'s help:
__import__(name, globals={}, locals={}, fromlist=[], level=-1) -> module
Import a module. The globals are only used to determine the context;
they are not modified. ...
So you can simply get the globals of your parent frame and use that for the __import__ call.
import sys
def import_something(s):
    return __import__(s, sys._getframe(1).f_globals)
Note: Pre-2.6, __import__'s signature differed in that it simply had optional parameters instead of using kwargs. Since globals is the second argument in both cases, the way it's called above works fine. Just something to be aware of if you decided to use any of the other arguments.
