How to customize a module import in Python? - python

I would like to customize the behavior of my module when it is imported.
For example, let's say I want my module to print an incremented number each time another file uses import my_module. And when from my_module import some_string is used, it should print "some_string".
How could I do that?
I read several questions here and there but this does not seem to work.
# my_module.py
import sys
class MyImporter:
    def find_module(self, module_name, package_path):
        print(module_name, package_path)
        return self
    def load_module(self, module_name):
        print(module_name)
        return self
sys.meta_path.append(MyImporter())
# file.py
import my_module # Nothing happens

What you're asking for is to have Python work not like Python. Whenever it imports a module it parses and executes the 'opened' code only once so it can pick up the definitions, functions, classes, etc. - every subsequent import of the module just references the cached & parsed first import.
That's why even if you put something like vars()["counter"] = vars().get("counter", 0) + 1 at your module's 'root', the counter will never go above 1, indicating that the module was indeed executed only once. You can force a module reload using reload() on Python 2 (or importlib.reload() on Python 3.4+), but then you'd lose any counter that the module body re-initializes (e.g. counter = 0).
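The caching described above can be checked directly. This is a minimal sketch, assuming a throwaway module called counted_demo (a hypothetical name, not from the question) written to a temporary directory:

```python
import os
import sys
import tempfile

# Write a hypothetical one-line module to a temporary directory.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "counted_demo.py"), "w") as fh:
    fh.write('counter = vars().get("counter", 0) + 1\n')
sys.path.insert(0, tmpdir)

import counted_demo            # first import: the module body runs
first_module = counted_demo
import counted_demo            # second import: served from sys.modules

# Both import statements yield the very same cached module object.
same_object = counted_demo is first_module is sys.modules["counted_demo"]
print(counted_demo.counter, same_object)
```

The second import statement only looks the name up in sys.modules, so the module body (and with it the counter line) is not executed again.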
Of course, you can have an external counter that is called when your module is imported, but that would have to be a contract with the users of your module. At that point the question becomes: can't you just ask your users to call a function that increases your counter whenever they import your module, instead of reloading it so that you can capture the count? Reloading a module can also leave it in a different state in every context that reloaded it, which makes Python behave unexpectedly and should be avoided at all costs.
So, a short answer would be - no, you cannot do that and you should not attempt to do it. If you want something that doesn't work like Python - use something that isn't Python.
However... If you have a really, REALLY good reason to do this (and you don't!) and you don't mind hacking how Python fundamentally behaves (and you should mind) then you might attempt to do this by wrapping the built-in import and checking whenever it gets fired for your module. Something like:
your_module.py:
# HERE BE DRAGONS!!!
import sys
try:
    import builtins # Python 3
except ImportError:
    import __builtin__ as builtins # Python 2
__builtin_import__ = builtins.__import__ # store a reference to the built-in import
def __custom_import__(name, *args, **kwargs):
    # execute builtin first so that the import fails if badly requested
    ret = __builtin_import__(name, *args, **kwargs)
    if ret is sys.modules[__name__]: # we're trying to load this module
        # args is (globals, locals, fromlist, level), so the fromlist needs at least 3 entries
        if len(args) > 2 and args[2]: # using the `from your_module import whatever` form
            if "some_string" in args[2]: # if some_string is amongst requested properties
                print("some_string")
        else: # using the `import your_module` form...
            print_counter() # increase and print the latest count
    return ret # return back the actual import result
builtins.__import__ = __custom_import__ # override the built-in import with our method
counter = 0
# a convenience function, you can do all of this through the `__custom_import__` function
def print_counter():
    global counter
    counter += 1
    print(counter)
print_counter() # call it immediately on the first import to print out the counter
some_string = "I'm holding some string value" # since we want to import this
# HAVE I FORGOT TO TELL YOU NOT TO DO THIS? WELL, DON'T!!!
Keep in mind that this will not account for the first import (be it in the pure import your_module or in the from your_module import whatever form), as the import override won't exist until your module is loaded. That's why it calls print_counter() immediately, in the hope that the first import of the module was in the form of import your_module and not in the from..import form (if not, it will wrongly print out the count instead of some_string the first time). To solve the first-import issue, you can move this 'override' to the __init__.py in the same folder so that the override loads before your module starts, and then delegate the counter change / some_string print to the module once loaded. Just make sure you do your module name check properly in that case (you need to account for the package as well) and make sure it doesn't automatically execute the counter.
You also, technically, don't need the some_string property at all - by moving the execution of the built-in import around you can do your from..import check first, find the position of some_string in args[2] and pop it before calling the builtin import, then return None in the same position once executed. You can also do your printing and counter incrementing from within the overridden import function.
Again, for the love of all things fluffy and the poor soul who might have to rely on your code one day - please don't do this!
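If the override is used at all, limiting how long it stays installed contains the damage. The following is a Python 3-only sketch of the same idea, not the code above: the names watch_imports and events are hypothetical, and the standard-library module colorsys is only a convenient stand-in for your_module.

```python
import builtins
import contextlib


@contextlib.contextmanager
def watch_imports(target, events):
    """Temporarily wrap builtins.__import__ and record imports of `target`."""
    original_import = builtins.__import__

    def watching_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Run the real import first so that failures propagate unchanged.
        module = original_import(name, globals, locals, fromlist, level)
        if name == target and level == 0:
            if fromlist:   # `from target import a, b` form
                events.append(("from", tuple(fromlist)))
            else:          # plain `import target` form
                events.append(("import", ()))
        return module

    builtins.__import__ = watching_import
    try:
        yield events
    finally:
        # Always restore the original hook, even if the body raised.
        builtins.__import__ = original_import


events = []
with watch_imports("colorsys", events):
    import colorsys
    import colorsys
    from colorsys import rgb_to_hsv

print(events)
```

Because the wrapper is installed before the first import statement runs, the first-import blind spot described above does not arise here; the price is that the caller, not the module, has to install the hook.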

Actually, it does look like it's possible to do what you're looking for in Python 3.5. It's probably a bad idea, and I've deliberately written my code to demonstrate the concept without polishing it enough to use as-is, because I'd think carefully before doing something like this in a production project.
If you need to look at a more-or-less production example of this, take a look at the SelfWrapper class in the sh module.
Meanwhile, you can change the class of your own entry in sys.modules to a subclass of types.ModuleType. Then you can override __getattribute__ and detect accesses to attributes.
As best I can tell:
Every subsequent import of the module references __spec__, so you could probably count accesses to __spec__ to count total imports
Each from foo import bar accesses bar as an attribute. I don't think you can distinguish between "from foo import bar" and "import foo; foo.bar"
import sys, types
class Wrapper(types.ModuleType):
    def __getattribute__(self, attr):
        print(attr)
        return super().__getattribute__(attr)
test = "test"
sys.modules[__name__].__class__ = Wrapper
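To watch this from the importing side without creating a file, here is a minimal sketch that registers a types.ModuleType subclass instance directly in sys.modules. The module name recording_demo and the accessed list are assumptions made for the demo. Which dunder attributes the import machinery touches can vary between Python versions, so only the access to bar is relied on.

```python
import sys
import types

accessed = []  # attribute names seen by the wrapper, in order


class RecordingModule(types.ModuleType):
    def __getattribute__(self, attr):
        accessed.append(attr)
        return super().__getattribute__(attr)


# Build a hypothetical module in memory and register it for import.
demo = RecordingModule("recording_demo")
demo.bar = "bar value"
sys.modules["recording_demo"] = demo

from recording_demo import bar  # resolves `bar` via attribute access

print(accessed)
```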

Here is how you can dynamically import modules:
from importlib import import_module
def import_from(module, name):
    # the second argument of import_module is the anchor package for relative imports, not the attribute name
    module = import_module(module)
    return getattr(module, name)
and use it like this:
funcObj = import_from("<file_name>", "<method_name>")
response = funcObj(arg1,arg2)

Related

accessing and changing module level variable [duplicate]

I've run into a bit of a wall importing modules in a Python script. I'll do my best to describe the error, why I run into it, and why I'm trying this particular approach to solve my problem (which I will describe in a second):
Let's suppose I have a module in which I've defined some utility functions/classes, which refer to entities defined in the namespace into which this auxiliary module will be imported (let "a" be such an entity):
module1:
def f():
    print a
And then I have the main program, where "a" is defined, into which I want to import those utilities:
import module1
a=3
module1.f()
Executing the program will trigger the following error:
Traceback (most recent call last):
  File "Z:\Python\main.py", line 10, in <module>
    module1.f()
  File "Z:\Python\module1.py", line 3, in f
    print a
NameError: global name 'a' is not defined
Similar questions have been asked in the past (two days ago, d'uh) and several solutions have been suggested, however I don't really think these fit my requirements. Here's my particular context:
I'm trying to make a Python program which connects to a MySQL database server and displays/modifies data with a GUI. For cleanliness sake, I've defined the bunch of auxiliary/utility MySQL-related functions in a separate file. However they all have a common variable, which I had originally defined inside the utilities module, and which is the cursor object from MySQLdb module.
I later realised that the cursor object (which is used to communicate with the db server) should be defined in the main module, so that both the main module and anything that is imported into it can access that object.
End result would be something like this:
utilities_module.py:
def utility_1(args):
    code which references a variable named "cur"
def utility_n(args):
    etcetera
And my main module:
program.py:
import MySQLdb, Tkinter
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
And then, as soon as I try to call any of the utilities functions, it triggers the aforementioned "global name not defined" error.
A particular suggestion was to have a "from program import cur" statement in the utilities file, such as this:
utilities_module.py:
from program import cur
#rest of function definitions
program.py:
import Tkinter, MySQLdb
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
But that's cyclic import or something like that and, bottom line, it crashes too. So my question is:
How in hell can I make the "cur" object, defined in the main module, visible to those auxiliary functions which are imported into it?
Thanks for your time and my deepest apologies if the solution has been posted elsewhere. I just can't find the answer myself and I've got no more tricks in my book.
Globals in Python are global to a module, not across all modules. (Many people are confused by this, because in, say, C, a global is the same across all implementation files unless you explicitly make it static.)
There are different ways to solve this, depending on your actual use case.
Before even going down this path, ask yourself whether this really needs to be global. Maybe you really want a class, with f as an instance method, rather than just a free function? Then you could do something like this:
import module1
thingy1 = module1.Thingy(a=3)
thingy1.f()
If you really do want a global, but it's just there to be used by module1, set it in that module.
import module1
module1.a=3
module1.f()
On the other hand, if a is shared by a whole lot of modules, put it somewhere else, and have everyone import it:
import shared_stuff
import module1
shared_stuff.a = 3
module1.f()
… and, in module1.py:
import shared_stuff
def f():
    print shared_stuff.a
Don't use a from import unless the variable is intended to be a constant. from shared_stuff import a would create a new a variable initialized to whatever shared_stuff.a referred to at the time of the import, and this new a variable would not be affected by assignments to shared_stuff.a.
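The difference between the two import forms can be seen in a few lines. This is a minimal Python 3 sketch; the in-memory module shared_stuff_demo is a hypothetical stand-in for shared_stuff so that the example runs without extra files.

```python
import sys
import types

# Hypothetical stand-in for shared_stuff.py, built in memory.
shared = types.ModuleType("shared_stuff_demo")
shared.a = 1
sys.modules["shared_stuff_demo"] = shared

from shared_stuff_demo import a   # binds a new name `a` to the current value
import shared_stuff_demo          # binds the module object itself

shared_stuff_demo.a = 3           # rebind the attribute on the module

print(a)                    # still the value copied at import time
print(shared_stuff_demo.a)  # the updated module attribute
```

The from form copied a reference at import time, so the later assignment to the module attribute is invisible through the name a.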
Or, in the rare case that you really do need it to be truly global everywhere, like a builtin, add it to the builtin module. The exact details differ between Python 2.x and 3.x. In 3.x, it works like this:
import builtins
import module1
builtins.a = 3
module1.f()
As a workaround, you could consider setting environment variables in the outer layer, like this.
main.py:
import os
os.environ['MYVAL'] = str(myintvariable)
mymodule.py:
import os
myval = None
if 'MYVAL' in os.environ:
myval = os.environ['MYVAL']
As an extra precaution, handle the case when MYVAL is not defined inside the module.
This post is just an observation about Python behaviour I encountered. Maybe the advice above doesn't work for you if you did the same thing I did below.
Namely, I have a module which contains global/shared variables (as suggested above):
#sharedstuff.py
globaltimes_randomnode=[]
globalist_randomnode=[]
Then I had the main module which imports the shared stuff with:
import sharedstuff as shared
and some other modules that actually populated these arrays. These are called by the main module. When exiting these other modules I can clearly see that the arrays are populated. But when reading them back in the main module, they were empty. This was rather strange for me (well, I am new to Python). However, when I change the way I import the sharedstuff.py in the main module to:
from globals import *
it worked (the arrays were populated).
Just sayin'
A function uses the globals of the module it's defined in. Instead of setting a = 3, for example, you should be setting module1.a = 3. So, if you want cur available as a global in utilities_module, set utilities_module.cur.
A better solution: don't use globals. Pass the variables you need into the functions that need it, or create a class to bundle all the data together, and pass it when initializing the instance.
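The rule that a function looks names up in its defining module can be demonstrated as follows. This is a minimal Python 3 sketch in which module1_demo is a hypothetical module built in memory from a source string:

```python
import types

# Hypothetical stand-in for module1.py: f() refers to a global named `a`.
module1 = types.ModuleType("module1_demo")
exec("def f():\n    return a\n", module1.__dict__)

a = 3  # defined in the calling namespace only; f() cannot see it

try:
    module1.f()
    outcome = "found"
except NameError:
    outcome = "NameError"

module1.a = 3          # set the global in the module where f is defined
value = module1.f()

# A function's globals are exactly its defining module's namespace.
same_namespace = module1.f.__globals__ is module1.__dict__
print(outcome, value, same_namespace)
```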
The easiest solution to this particular problem would have been to add another function within the module that would have stored the cursor in a variable global to the module. Then all the other functions could use it as well.
module1:
cursor = None
def setCursor(cur):
    global cursor
    cursor = cur
def method(some, args):
    global cursor
    do_stuff(cursor, some, args)
main program:
import module1
cursor = get_a_cursor()
module1.setCursor(cursor)
module1.method(some, args)
Since globals are module specific, you can add the following function to all imported modules, and then use it to:
Add singular variables (in dictionary format) as globals for those
Transfer your main module globals to it
addglobals = lambda x: globals().update(x)
Then all you need to pass on current globals is:
import module
module.addglobals(globals())
Since I haven't seen it in the answers above, I thought I would add my simple workaround, which is just to add a global_dict argument to the function requiring the calling module's globals, and then pass the dict into the function when calling; e.g:
# external_module
def imported_function(global_dict=None):
    print(global_dict["a"])
# calling_module
a = 12
from external_module import imported_function
imported_function(global_dict=globals())
>>> 12
The OOP way of doing this would be to make your module a class instead of a set of unbound methods. Then you could use __init__ or a setter method to set the variables from the caller for use in the module methods.
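A minimal sketch of that class-based layout follows. It uses the standard-library sqlite3 module with an in-memory database purely so the example is runnable; the question itself uses MySQLdb, and the class name DbUtilities and its methods are hypothetical.

```python
import sqlite3


class DbUtilities:
    """Bundles the utility functions with the cursor they all need."""

    def __init__(self, cur):
        self.cur = cur  # supplied by the main program, no global required

    def add_item(self, name):
        self.cur.execute("INSERT INTO items (name) VALUES (?)", (name,))

    def count_items(self):
        self.cur.execute("SELECT COUNT(*) FROM items")
        return self.cur.fetchone()[0]


# Main program: create the cursor once and hand it to the utilities.
db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE items (name TEXT)")

utils = DbUtilities(cur)
utils.add_item("first")
utils.add_item("second")
print(utils.count_items())
```

The cursor travels with the instance, so neither module needs to import the other's globals and the cyclic import disappears.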
Update
To test the theory, I created a module and put it on pypi. It all worked perfectly.
pip install superglobals
Short answer
This works fine in Python 2 or 3:
import inspect
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals
save as superglobals.py and employ in another module thusly:
from superglobals import *
superglobals()['var'] = value
Extended Answer
You can add some extra functions to make things more attractive.
import inspect
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals
def getglobal(key, default=None):
    """
    getglobal(key[, default]) -> value
    Return the value for key if key is in the global dictionary, else default.
    """
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals.get(key, default)
def setglobal(key, value):
    _globals = superglobals()
    _globals[key] = value
def defaultglobal(key, value):
    """
    defaultglobal(key, value)
    Set the value of global variable `key` if it is not otherwise set
    """
    _globals = superglobals()
    if key not in _globals:
        _globals[key] = value
Then use thusly:
from superglobals import *
setglobal('test', 123)
defaultglobal('test', 456)
assert(getglobal('test') == 123)
Justification
The "python purity league" answers that litter this question are perfectly correct, but in some environments (such as IDAPython), which are basically single-threaded with a large globally instantiated API, it just doesn't matter as much.
It's still bad form and a bad practice to encourage, but sometimes it's just easier. Especially when the code you are writing isn't going to have a very long life.

Making a copy of an entire namespace?

I'd like to make a copy of an entire namespace while replacing some functions with dynamically constructed versions.
In other words, starting with namespace (import tensorflow as tf), I want to make a copy of it, replace some functions with my own versions, and update __globals__ of all the symbols to stay within the new namespace. This needs to be done in topological order of dependency.
I started doing something like it here but now I'm starting to wonder if I'm reinventing the wheel. Care is needed to deal with circular dependencies in system modules, functions/types/objects need to be updated differently, etc.
Can anyone point to existing code that solves a similar task?
To patch a set of functions while importing second instances of a set of modules, you can override the standard Python import hook and apply the patches directly at import time. This makes sure that none of the cloned modules ever sees the unpatched versions, so even if they import functions from another cloned module directly by name, they will only see the patched functions. Here is a proof-of-concept implementation:
import __builtin__
import collections
import contextlib
import sys
@contextlib.contextmanager
def replace_import_hook(new_import_hook):
    original_import = __builtin__.__import__
    __builtin__.__import__ = new_import_hook
    yield original_import
    __builtin__.__import__ = original_import
def clone_modules(patches, additional_module_names=None):
    """Import new instances of a set of modules with some objects replaced.
    Arguments:
      patches - a dictionary mapping `full.module.name.symbol` to the new object.
      additional_module_names - a list of the additional modules you want new instances of, without
        replacing any objects in them.
    Returns:
      A dictionary mapping module names to the new patched module instances.
    """
    def import_hook(module_name, *args):
        result = original_import(module_name, *args)
        if module_name not in old_modules or module_name in new_modules:
            return result
        # The semantics for the return value of __import__() are a bit weird, so we need some logic
        # to determine the actual imported module object.
        if len(args) >= 3 and args[2]:
            module = result
        else:
            module = reduce(getattr, module_name.split('.')[1:], result)
        for symbol, obj in patches_by_module[module_name].items():
            setattr(module, symbol, obj)
        new_modules[module_name] = module
        return result
    # Group patches by module name
    patches_by_module = collections.defaultdict(dict)
    for dotted_name, obj in patches.items():
        module_name, symbol = dotted_name.rsplit('.', 1) # Only allows patching top-level objects
        patches_by_module[module_name][symbol] = obj
    try:
        # Remove the old module instances from sys.modules and store them in old_modules
        all_module_names = list(patches_by_module)
        if additional_module_names is not None:
            all_module_names.extend(additional_module_names)
        old_modules = {}
        for name in all_module_names:
            old_modules[name] = sys.modules.pop(name)
        # Re-import modules to create new patched versions
        with replace_import_hook(import_hook) as original_import:
            new_modules = {}
            for module_name in all_module_names:
                import_hook(module_name)
    finally:
        sys.modules.update(old_modules)
    return new_modules
And here is some test code for this implementation:
from __future__ import print_function
import math
import random
def patched_log(x):
    print('Computing log({:g})'.format(x))
    return math.log(x)
patches = {'math.log': patched_log}
cloned_modules = clone_modules(patches, ['random'])
new_math = cloned_modules['math']
new_random = cloned_modules['random']
print('Original log: ', math.log(2.0))
print('Patched log: ', new_math.log(2.0))
print('Original expovariate: ', random.expovariate(2.0))
print('Patched expovariate: ', new_random.expovariate(2.0))
The test code has this output:
Computing log(4)
Computing log(4.5)
Original log: 0.69314718056
Computing log(2)
Patched log: 0.69314718056
Original expovariate: 0.00638038735379
Computing log(0.887611)
Patched expovariate: 0.0596108277801
The first two lines of output result from these two lines in random, which are executed at import time. This demonstrates that random sees the patched function right away. The rest of the output demonstrates that the original math and random still use the unpatched version of log, while the cloned modules both use the patched version.
A cleaner way of overriding the import hook might be to use a meta import hook as defined in PEP 302, but providing a full implementation of that approach is beyond the scope of StackOverflow.
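For orientation only, here is a rough Python 3 sketch of what such a meta import hook could look like. It is not the implementation above and does not clone modules: it only patches a module while it is first being imported. The class names PatchingFinder and PatchingLoader, and the temporary module patch_demo, are hypothetical.

```python
import importlib.abc
import os
import sys
import tempfile


class PatchingLoader(importlib.abc.Loader):
    """Delegates to the real loader, then overwrites selected attributes."""

    def __init__(self, wrapped, replacements):
        self.wrapped = wrapped
        self.replacements = replacements

    def create_module(self, spec):
        return self.wrapped.create_module(spec)

    def exec_module(self, module):
        self.wrapped.exec_module(module)           # run the real module body
        for symbol, obj in self.replacements.items():
            setattr(module, symbol, obj)           # then apply the patches


class PatchingFinder(importlib.abc.MetaPathFinder):
    """Wraps the loader of every module named in `patches`."""

    def __init__(self, patches):
        self.patches = patches  # {module_name: {symbol: replacement}}

    def find_spec(self, fullname, path, target=None):
        if fullname not in self.patches:
            return None  # not ours: let the normal finders handle it
        for finder in sys.meta_path:
            if finder is self or not hasattr(finder, "find_spec"):
                continue
            spec = finder.find_spec(fullname, path, target)
            if spec is not None and spec.loader is not None:
                spec.loader = PatchingLoader(spec.loader, self.patches[fullname])
                return spec
        return None


# Demo with a throwaway module written to a temporary directory.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "patch_demo.py"), "w") as fh:
    fh.write("def greet():\n    return 'original'\n")
sys.path.insert(0, tmpdir)

finder = PatchingFinder({"patch_demo": {"greet": lambda: "patched"}})
sys.meta_path.insert(0, finder)
try:
    import patch_demo
finally:
    sys.meta_path.remove(finder)

print(patch_demo.greet())
```

A meta path finder is only consulted for modules that are not yet in sys.modules, so a cloning variant would still need the sys.modules juggling shown in clone_modules.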
Instead of trying to make a copy of the contents of a module and patch everything in it to use the correct globals, you could trick Python into importing everything you want to copy a second time. This will give you a newly initialized copy of all modules, so it won't copy any global state the modules might have (not sure whether you would need that).
import importlib
import sys
def new_module_instances(module_names):
    old_modules = {}
    for name in module_names:
        old_modules[name] = sys.modules.pop(name)
    new_modules = {}
    for name in module_names:
        new_modules[name] = importlib.import_module(name)
    sys.modules.update(old_modules)
    return new_modules
Note that we first delete all modules we want to replace from sys.modules, so they all get imported a second time, and the dependencies between these modules are set up correctly automatically. At the end of the function, we restore the original state of sys.modules, so everything else continues to see the original versions of these modules.
Here's an example:
>>> import logging.handlers
>>> new_modules = new_module_instances(['logging', 'logging.handlers'])
>>> logging_clone = new_modules['logging']
>>> logging
<module 'logging' from '/usr/lib/python2.7/logging/__init__.pyc'>
>>> logging_clone
<module 'logging' from '/usr/lib/python2.7/logging/__init__.pyc'>
>>> logging is logging_clone
False
>>> logging is logging.handlers.logging
True
>>> logging_clone is logging_clone.handlers.logging
True
The last three expressions show that the two versions of logging are different modules, and both versions of the handlers module use the correct version of the logging module.
To my mind, you can do this easily:
import imp, string
st = imp.load_module('st', *imp.find_module('string')) # copy the module
def my_upper(a):
    return "a" + a
def my_lower(a):
    return a + "a"
st.upper = my_upper
st.lower = my_lower
print string.upper("hello") # HELLO
print string.lower("hello") # hello
print st.upper("hello") # ahello
print st.lower("hello") # helloa
And when you call st.upper("hello"), it will result in "ahello", while string.upper("hello") still returns "HELLO".
So, you don't really need to mess with globals.
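The imp module used above was deprecated in Python 3.4 and removed in Python 3.12. A rough Python 3 equivalent can be sketched with importlib.util; string.upper no longer exists there, so this sketch patches string.capwords instead, and my_capwords is a hypothetical replacement.

```python
import importlib.util
import string

# Build a second, independent module object from the same source file.
spec = importlib.util.find_spec("string")
st = importlib.util.module_from_spec(spec)   # fresh, empty module object
spec.loader.exec_module(st)                  # run the module source in it


def my_capwords(s, sep=None):
    return "a" + s


st.capwords = my_capwords  # patch the copy only

print(string.capwords("hello world"))  # original module is untouched
print(st.capwords("hello"))            # the copy uses the replacement
```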

Python: How to import all methods and attributes from a module dynamically

I'd like to load a module dynamically, given its string name (from an environment variable). I'm using Python 2.7. I know I can do something like:
import os, importlib
my_module = importlib.import_module(os.environ.get('SETTINGS_MODULE'))
This is roughly equivalent to
import my_settings
(where SETTINGS_MODULE = 'my_settings'). The problem is, I need something equivalent to
from my_settings import *
since I'd like to be able to access all methods and variables in the module. I've tried
import os, importlib
my_module = importlib.import_module(os.environ.get('SETTINGS_MODULE'))
from my_module import *
but I get a bunch of errors doing that. Is there a way to import all methods and attributes of a module dynamically in Python 2.7?
If you have your module object, you can mimic the logic import * uses as follows:
module_dict = my_module.__dict__
try:
    to_import = my_module.__all__
except AttributeError:
    to_import = [name for name in module_dict if not name.startswith('_')]
globals().update({name: module_dict[name] for name in to_import})
However, this is almost certainly a really bad idea. You will unceremoniously stomp on any existing variables with the same names. This is bad enough when you do from blah import * normally, but when you do it dynamically there is even more uncertainty about what names might collide. You are better off just importing my_module and then accessing what you need from it using regular attribute access (e.g., my_module.someAttr), or getattr if you need to access its attributes dynamically.
Not answering precisely the question as worded, but if you wish to have a file act as a proxy to a dynamic module, you can define __getattr__ at the module level (supported since Python 3.7, see PEP 562).
import importlib
import os
module_name = os.environ.get('CONFIG_MODULE', 'configs.config_local')
mod = importlib.import_module(module_name)
def __getattr__(name):
    return getattr(mod, name)
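To see the proxy resolve names, here is a self-contained sketch for Python 3.7 or later. It builds the proxy module in memory instead of in a file; the name config_proxy_demo is hypothetical, and math stands in for whatever module the environment variable would name.

```python
import math
import sys
import types

# Source of the hypothetical proxy module; `math` stands in for the
# module that would normally be chosen via the environment variable.
proxy_source = '''
import importlib

mod = importlib.import_module("math")

def __getattr__(name):
    # Called only for names the proxy module does not define itself.
    return getattr(mod, name)
'''

proxy = types.ModuleType("config_proxy_demo")
exec(proxy_source, proxy.__dict__)
sys.modules["config_proxy_demo"] = proxy

from config_proxy_demo import pi   # resolved through __getattr__
import config_proxy_demo

print(pi == math.pi, config_proxy_demo.sqrt(9))
```

Note that from config_proxy_demo import * would not pick up the forwarded names, because the star form only looks at the proxy's own namespace (or its __all__).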
My case was a bit different - wanted to dynamically import the constants.py names in each gameX.__init__.py module (see below), cause statically importing those would leave them in sys.modules forever (see: this excerpt from Beazley I picked from this related question).
Here is my folder structure:
game/
    __init__.py
    game1/
        __init__.py
        constants.py
        ...
    game2/
        __init__.py
        constants.py
        ...
Each gameX.__init__.py exports an init() method - so I had initially a from .constants import * in all those gameX.__init__.py which I tried to move inside the init() method.
My first attempt in the lines of:
@@ -275,2 +274,6 @@ def init():
     # called instead of 'reload'
+    yak = {}
+    yak.update(locals())
+    from .constants import * # fails here
+    yak = {x: y for x,y in locals() if x not in yak}
+    globals().update(yak)
     brec.ModReader.recHeader = RecordHeader
Failed with the rather cryptic:
SyntaxError: import * is not allowed in function 'init' because it contains a nested function with free variables
I can assure you there are no nested functions in there. Anyway I hacked and slashed and ended up with:
def init():
    # ...
    from .. import dynamic_import_hack
    dynamic_import_hack(__name__)
Where in game.__init__.py:
def dynamic_import_hack(package_name):
    print __name__ # game.init
    print package_name # game.gameX.init
    import importlib
    constants = importlib.import_module('.constants', package=package_name)
    import sys
    for k in dir(constants):
        if k.startswith('_'): continue
        setattr(sys.modules[package_name], k, getattr(constants, k))
(for setattr see How can I add attributes to a module at run time? while for getattr How can I import a python module function dynamically? - I prefer to use those than directly access the __dict__)
This works and it's more general than the approach in the accepted answer cause it allows you to have the hack in one place and use it from whatever module. However I am not really sure it's the best way to implement it - was going to ask a question but as it would be a duplicate of this one I am posting it as an answer and hope to get some feedback. My questions would be:
why this "SyntaxError: import * is not allowed in function 'init'" while there are no nested functions ?
dir has a lot of warnings in its doc - in particular it attempts to produce the most relevant, rather than complete, information - the "rather than complete" part worries me a bit
is there no builtin way to do an import * ? even in python 3 ?

Injecting Locals into Dynamically Loaded Modules Before Execution

I'm trying to build a sort of script system in python that will allow small snippets of code to be selected and executed at runtime inside python.
Essentially I want to be able to load a small python file like
for i in Foo: #not in a function.
    print i
Where somewhere else in the program I assign what Foo will be. As if Foo served as a function argument to the entire loaded python file instead of a single function
So somewhere else
FooToPass = GetAFoo ()
TempModule = __import__ ("TheSnippit",<Somehow put {'Foo' : FooToPass} in the locals>)
It is considered bad style to have code with side effects at module level. If you want your module to do something, put that code in a function, make Foo a parameter of this function and call it with the desired value.
Python's import mechanism does not allow you to preinitialise a module namespace. If you want to do this anyway (which is, in my opinion, confusing and unnecessary), you have to fiddle around with details of the import mechanism. Example implementation (untested):
import imp
import sys
def my_import(module_name, globals):
    if module_name in sys.modules:
        return sys.modules[module_name]
    module = imp.new_module(module_name)
    vars(module).update(globals)
    f, module.__file__, options = imp.find_module(module_name)
    exec f.read() in vars(module)
    f.close()
    sys.modules[module_name] = module
    return module
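On Python 3, where the exec statement is gone and imp has been removed, the same idea can be sketched with importlib.util. The helper name import_with_names and the snippet file written to a temporary directory are assumptions made for the demo. Unlike my_import above, this sketch loads from an explicit path and does not register the module in sys.modules.

```python
import importlib.util
import os
import tempfile


def import_with_names(module_name, path, injected):
    """Load the file at `path` as a module whose namespace is pre-populated."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    vars(module).update(injected)      # inject names before any code runs
    spec.loader.exec_module(module)    # execute the file in that namespace
    return module


# A hypothetical snippet that expects `Foo` to exist at module level.
snippet_path = os.path.join(tempfile.mkdtemp(), "the_snippet.py")
with open(snippet_path, "w") as fh:
    fh.write("total = 0\nfor i in Foo:  # Foo is supplied by the caller\n    total += i\n")

snippet = import_with_names("the_snippet", snippet_path, {"Foo": [1, 2, 3]})
print(snippet.total)
```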

Is it possible to overload from/import in Python?

Is it possible to overload the from/import statement in Python?
For example, assuming jvm_object is an instance of class JVM, is it possible to write this code:
class JVM(object):
    def import_func(self, cls):
        return something...
jvm = JVM()
# would invoke JVM.import_func
from jvm import Foo
This post demonstrates how to use functionality introduced in PEP-302 to import modules over the web. I post it as an example of how to customize the import statement rather than as suggested usage ;)
It's hard to find something which isn't possible in a dynamic language like Python, but do we really need to abuse everything? Anyway, here it is:
from types import ModuleType
import sys
class JVM(ModuleType):
    Foo = 3
sys.modules['JVM'] = JVM
from JVM import Foo
print Foo
But one pattern I've seen in several libraries/projects is some kind of a _make_module() function, which creates a ModuleType dynamically and initializes everything in it. After that, the current Module is replaced by the new module (using the assignment to sys.modules) and the _make_module() function gets deleted. The advantage of that, is that you can loop over the module and even add objects to the module inside that loop, which is quite useful sometimes (but use it with caution!).
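A stripped-down sketch of that pattern in Python 3 follows. The helper signature and the module name jvm_demo are hypothetical, and real projects typically do much more inside the helper.

```python
import sys
import types


def _make_module(name, **attributes):
    """Create a module object on the fly and fill it with attributes."""
    module = types.ModuleType(name)
    for key, value in attributes.items():
        setattr(module, key, value)
    return module


# Register the generated module so that normal import statements find it.
sys.modules["jvm_demo"] = _make_module("jvm_demo", Foo=3)

from jvm_demo import Foo

print(Foo)
```

Registering a real ModuleType instance, rather than a class as in the snippet above, keeps tools that expect module objects in sys.modules working.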
