Pass-through/export whole third party module (using __all__?) - python

I have a module that wraps another module to insert some shim logic in some functions. The wrapped module uses a settings module mod.settings which I want to expose, but I don't want the users to import it from there, in case I would like to shim something there as well in the future. I want them to import wrapmod.settings.
Importing the module and exporting it works, but is a bit verbose on the client side. It results in having to write settings.thing instead of just thing.
I want the users to be able to do from wrapmod.settings import * and get the same results as if they did from mod.settings import *, but right now only from wrapmod import settings is available. How do I work around this?

If I understand the situation correctly, you're writing a module wrapmod that is intended to transform parts of an existing package mod. The specific part you're transforming is the submodule mod.settings. You've imported the settings module and made your changes to it, but even though it is available as wrapmod.settings, you can't use that module name in a from ... import ... statement.
I think the best way to fix that is to insert the modified module into sys.modules under the new dotted name. This makes Python accept that name as valid even though wrapmod isn't really a package.
So wrapmod would look something like:
import sys
from mod import settings
# modify settings here
sys.modules['wrapmod.settings'] = settings # add this line!
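As a minimal sketch of why that one line is enough: the import system consults sys.modules before it searches for anything, so a dotted name registered there is importable even when its parent is a plain module. The mod, mod.settings and wrapmod objects below are stand-ins built with types.ModuleType purely for illustration; DEBUG is a hypothetical setting.

```python
import importlib
import sys
import types

# Stand-ins for the third-party package "mod" and its "mod.settings" submodule.
mod = types.ModuleType("mod")
settings = types.ModuleType("mod.settings")
settings.DEBUG = True            # hypothetical setting
settings.__all__ = ["DEBUG"]
mod.settings = settings
sys.modules["mod"] = mod
sys.modules["mod.settings"] = settings

# Stand-in for wrapmod: a plain module (not a package) that re-exports settings.
wrapmod = types.ModuleType("wrapmod")
wrapmod.settings = settings
sys.modules["wrapmod"] = wrapmod
sys.modules["wrapmod.settings"] = settings   # the line from the answer

# The dotted name now resolves through the sys.modules cache.
aliased = importlib.import_module("wrapmod.settings")

# A star import honours the wrapped module's __all__.
namespace = {}
exec("from wrapmod.settings import *", namespace)
```

Both names refer to the same module object, so anything you shim on settings is visible through either import path.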

I ended up making a code generator for a thin wrapper module instead, since the sys.modules hacking broke all IDE integration.
from ... import mod
# this is just a pass-through wrapper around mod.settings
__all__ = mod.__all__
# generate pass-through wrapper around mod.settings; doesn't break IDE integration, unlike manual sys.modules editing.
if __name__ == "__main__":
    for thing in mod.__all__:
        print(thing + " = mod." + thing)
When run as a script, this outputs code that can then be appended to the end of this file.
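The same generator idea can be sketched as a small self-contained function. The function name passthrough_lines and the alias "mod" are assumptions made for illustration, and the standard-library string module stands in for the wrapped settings module so the example runs anywhere.

```python
import importlib


def passthrough_lines(module_name, alias="mod"):
    """Return 'name = alias.name' lines for every public name in module_name."""
    module = importlib.import_module(module_name)
    names = getattr(module, "__all__", None)
    if names is None:
        # Fall back to non-underscore names when the module defines no __all__.
        names = [n for n in dir(module) if not n.startswith("_")]
    return ["{0} = {1}.{0}".format(name, alias) for name in names]


# "string" is only a stand-in for the wrapped module.
lines = passthrough_lines("string")
print("\n".join(lines))
```

Because the output is ordinary assignments in a real file, IDEs and static analysers can see every exported name.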

Related

Is there a convenient way to translate a "from A import B as C" to a python import using a specific path

I want to import a Python module without adding its containing folder to the Python path. I would like the import to look like
from A import B as C
Due to the specific path that shall be used, the import looks like
import imp
A = imp.load_source('A', 'path')
C = A.B
This is quite unhandy with long paths and module names. Is there an easier way? Is there a way where the module is not added to the local variables (no A)?
If you just don't want A to be visible at a global level, you could stick the import (imp.load_source) inside a function. If you actually don't want a module object at all in the local scope, you can do that too, but I wouldn't recommend it.
If module A is a python source file you could read in the file (or even just the relevant portion that you want) and run an exec on it.
source.py
MY_GLOBAL_VAR = 1
def my_func():
    print 'hello'
Let's say you have some code that wants my_func
path = '/path/to/source.py'
execfile(path)
my_func()
# 'hello'
Be aware that you're also going to get anything else defined in the file (like MY_GLOBAL_VAR). Again, this will work, but I wouldn't recommend it:
Someone looking at your code won't be able to see where my_func came from.
You're essentially doing the same thing as a from A import * import, which is generally frowned upon in Python because you could be importing all sorts of things into your namespace that you didn't want. And even if it works now, if the source code changes, it could import names that shadow your own global symbols.
It's potentially a security hole, since you could be exec'ing an untrusted source file.
It's way more verbose than a regular python import.
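On Python 3, where execfile no longer exists and imp is deprecated (it was removed in 3.12), one possible alternative is importlib.util. This is a sketch, not something the answer above proposes: the helper name load_attr_from_path and the throwaway file are assumptions for illustration. It returns just the requested attribute, so no module object ends up in the caller's namespace.

```python
import importlib.util
import os
import tempfile


def load_attr_from_path(path, attr, module_name="_loaded_from_path"):
    """Load the file at path as a module and return one attribute from it."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    # Runs the file's code; the module is not registered in sys.modules.
    spec.loader.exec_module(module)
    return getattr(module, attr)


# Demo with a throwaway source file standing in for A.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "A.py")
    with open(path, "w") as f:
        f.write("def B():\n    return 'hello'\n")
    # Rough equivalent of: from A import B as C
    C = load_attr_from_path(path, "B")

result = C()
```

Unlike the exec approach, only the name you asked for is bound, so nothing else from the file leaks into your globals.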

python module import syntax

I'm teaching myself Python (I have experience in other languages).
I found a way to import a "module". In PHP, this would just be named an include file. But I guess Python names it a module. I'm looking for a simple, best-practices approach. I can get fancy later. But right now, I'm trying to keep it simple while not developing bad habits. Here is what I did:
I created a blank file named __init__.py, which I stored in Documents (the folder on the Mac)
I created a file named myModuleFile.py, which I stored in Documents
In myModuleFile.py, I created a function:
def myFunction():
    print("hello world")
I created another file: myMainFile.py, which I stored in Documents
In this file, I typed the following:
import myModuleFile.py
myModuleFile.myFunction()
This successfully printed out "hello world" to the console when I ran it on the terminal.
Is this a best-practices way to do this for my simple current workflow?
I'm not sure the dot notation means I'm onto something good or something bad. It throws an error if I try to use myFunction() instead of myModuleFile.myFunction(). I kind of think it would be good. If there were a second imported module, it would know to call myFunction() from myModuleFile rather than the other one. So the dot notation makes everybody know exactly which file you are trying to call the function from.
I think there is some advanced stuff using sys or some sort of exotic configuration stuff. But I'm hoping my simple little way of doing things is ok for now.
Thanks for any clarification on this.
For your import you don't need the ".py" extension
You can use:
import myModuleFile
myModuleFile.myFunction()
Or
from myModuleFile import myFunction
myFunction()
Last syntax is common if you import several functions or globals of your module.
Besides, to use the "main" guard, I'd put this in your module:
from myModuleFile import myFunction
if __name__ == '__main__':
    myFunction()
Otherwise the main code could be executed on import or in other cases.
I'd use just one module for myModuleFile.py and myMainFile.py; the previous pattern lets you know whether your module is being run from the command line or imported.
Lastly, I'd change the name of your files to avoid the CamelCase, that is, I'd replace myModuleFile.py by my_module.py. Python loves the lowercase ;-)
You only need to have __init__.py if you are creating a package (a package in a simple sense is a subdirectory which has one or more modules in it, but I think it may be more complex than you need right now).
If you have just one folder which has myModuleFile.py and myMainFile.py, you don't need the __init__.py.
In MyMainFile.py you can write :
import myModuleFile
and then use
myModuleFile.myFunction()
The reason for including the module name is that you may reuse the same function name in more than one module and you need a way of saying which module your program is using.
Module Aliases
If you want to you can do this :
import myModuleFile as MyM
and then use
MyM.myFunction()
Here you have created MyM as an alias for myModuleFile, and created less typing.
Here Lies Dragons
You will sometimes see one other form of import, which can be dangerous, especially for the beginner.
from myModuleFile import myFunction
If you do this you can use:
myFunction()
but this has a problem if you have used the same function name in MyMainFile, or in any other library you have used, as you now can't get to any other definition of the name myFunction. This is often termed contaminating the namespace, and should really be avoided unless you are absolutely certain it is safe.
There is a final form which I will show for completeness:
from myModuleFile import *
While you will now be able to access every function defined in myModuleFile without using myModuleFile in front of it, you have also now prevented your MyMainFile from using any function in any library which matches any name defined in myModuleFile.
Using this form is generally not considered to be a good idea.
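To make the shadowing problem concrete, here is a small sketch. The two modules first_demo and second_demo are hypothetical and are built in memory with types.ModuleType so the example needs no extra files.

```python
import sys
import types

# Two hypothetical modules that both define a function called greet.
first = types.ModuleType("first_demo")
first.greet = lambda: "greet from first_demo"
second = types.ModuleType("second_demo")
second.greet = lambda: "greet from second_demo"
sys.modules["first_demo"] = first
sys.modules["second_demo"] = second

from first_demo import greet
from second_demo import greet  # silently replaces the first greet

import first_demo
import second_demo

# The bare name now refers only to the most recent import...
shadowed = greet()
# ...while the qualified names still reach both functions.
qualified = (first_demo.greet(), second_demo.greet())
```

No error or warning is raised when the second from-import rebinds greet, which is why the qualified form is the safer default.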
I hope this helps.

Python: force every import to reload

Is there a way to force import x to always reload x in Python (i.e., as if I had called reload(x), or imp.reload(x) for Python 3)? Or in general, is there some way to force some code to be run every time I run import x? I'm OK with monkey patching or hackery.
I've tried moving the code into a separate module and deleting x from sys.modules in that separate file. I dabbled a bit with import hooks, but I didn't try too hard because according to the documentation, they are only called after the sys.modules cache is checked. I also tried monkeypatching sys.modules with a custom dict subclass, but whenever I do that, from module import submodule raises KeyError (I'm guessing sys.modules is not a real dictionary).
Basically, I'm trying to write a debugging tool (which is why some hackery is OK here). My goal is simply that import x is shorter to type than import x;x.y.
If you really want to change the semantics of the import statement, you will have to patch the interpreter. import checks whether the named module already is loaded and if so it does nothing more. You would have to change exactly that, and that is hard-wired in the interpreter.
Maybe you can live with patching the Python sources to use myImport('modulename') instead of import modulename? That would make it possible within Python itself.
Taking a lead from Alfe's answer, I got it to work like this. This goes at the module level.
def custom_logic():
    # Put whatever you want to run when the module is imported here
    pass
# This version is run on the first import
custom_logic()
def __myimport__(name, *args, **kwargs):
    if name == 'x':  # Replace with the name of this module
        # This version is run on all subsequent imports
        custom_logic()
    return __origimport__(name, *args, **kwargs)
# Will only be run on first import
__builtins__['__origimport__'] = __import__
__builtins__['__import__'] = __myimport__
We are monkeypatching __builtins__, which is why __origimport__ is defined when __myimport__ is run.

How to tell if a Python module is being reload()ed from within the module

When writing a Python module, is there a way to tell if the module is being imported or reloaded?
I know I can create a class, and the __init__() will only be called on the first import, but I hadn't planned on creating a class. Though, I will if there isn't an easy way to tell if we are being imported or reloaded.
The documentation for reload() actually gives a code snippet that I think should work for your purposes, at least in the usual case. You'd do something like this:
try:
    reloading
except NameError:
    reloading = False  # means the module is being imported
else:
    reloading = True  # means the module is being reloaded
What this really does is detect whether the module is being imported "cleanly" (e.g. for the first time) or is overwriting a previous instance of the same module. In the normal case, a "clean" import corresponds to the import statement, and a "dirty" import corresponds to reload(), because import only really imports the module once, the first time it's executed (for each given module).
If you somehow manage to force a subsequent execution of the import statement into doing something nontrivial, or if you somehow manage to import your module for the first time using reload(), or if you mess around with the importing mechanism (through the imp module or the like), all bets are off. In other words, don't count on this always working in every possible situation.
P.S. The fact that you're asking this question makes me wonder if you're doing something you probably shouldn't be doing, but I won't ask.
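The behaviour of that snippet can be checked with a short experiment. This is a sketch for Python 3 that uses importlib.reload in place of the built-in reload(); the module name reload_demo and the temporary directory are assumptions made for illustration.

```python
import importlib
import os
import sys
import tempfile

SOURCE = (
    "try:\n"
    "    reloading\n"
    "except NameError:\n"
    "    reloading = False  # first import\n"
    "else:\n"
    "    reloading = True   # name survived from a previous execution\n"
)

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "reload_demo.py"), "w") as f:
        f.write(SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import reload_demo
        on_import = reload_demo.reloading
        # reload re-executes the source in the existing module namespace.
        importlib.reload(reload_demo)
        on_reload = reload_demo.reloading
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo", None)
```

The check works because a reload runs the module body again in the same namespace, so names assigned by the previous run are still defined.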
>>> import os
>>> os.foo = 5
>>> os.foo
5
>>> import os
>>> os.foo
5

How do I override a Python import?

I'm working on pypreprocessor which is a preprocessor that takes c-style directives and I've been able to make it work like a traditional preprocessor (it's self-consuming and executes postprocessed code on-the-fly) except that it breaks library imports.
The problem is: The preprocessor runs through the file, processes it, outputs to a temporary file, and exec()s the temporary file. Libraries that are imported need to be handled a little differently, because they aren't executed, but rather they are loaded and made accessible to the caller module.
What I need to be able to do is: Interrupt the import (since the preprocessor is being run in the middle of the import), load the postprocessed code as a tempModule, and replace the original import with the tempModule to trick the calling script with the import into believing that the tempModule is the original module.
I have searched everywhere and so far have found no solution.
This Stack Overflow question is the closest I've seen so far to providing an answer:
Override namespace in Python
Here's what I have.
# Remove the bytecode file created by the first import
os.remove(moduleName + '.pyc')
# Remove the first import
del sys.modules[moduleName]
# Import the postprocessed module
tmpModule = __import__(tmpModuleName)
# Set first module's reference to point to the preprocessed module
sys.modules[moduleName] = tmpModule
moduleName is the name of the original module, and tmpModuleName is the name of the postprocessed code file.
The strange part is that this solution still runs completely normally, as if the first module had loaded normally; unless you remove the last line, in which case you get a module not found error.
Hopefully someone on Stack Overflow knows a lot more about imports than I do, because this one has me stumped.
Note: I will only award a solution, or, if this is not possible in Python, the best, most detailed explanation of why this is not possible.
Update: For anybody who is interested, here is the working code.
if imp.lock_held() is True:
    del sys.modules[moduleName]
    sys.modules[tmpModuleName] = __import__(tmpModuleName)
    sys.modules[moduleName] = __import__(tmpModuleName)
The 'imp.lock_held' part detects whether the module is being loaded as a library. The following lines do the rest.
Does this answer your question? The second import does the trick.
Mod_1.py
def test_function():
    print "Test Function -- Mod 1"
Mod_2.py
def test_function():
    print "Test Function -- Mod 2"
Test.py
#!/usr/bin/python
import sys
import Mod_1
Mod_1.test_function()
del sys.modules['Mod_1']
sys.modules['Mod_1'] = __import__('Mod_2')
import Mod_1
Mod_1.test_function()
To define a different import behavior or to totally subvert the import process you will need to write import hooks. See PEP 302.
For example,
import sys
class MyImporter(object):
    def find_module(self, module_name, package_path):
        # Return a loader
        return self
    def load_module(self, module_name):
        # Return a module
        return self
sys.meta_path.append(MyImporter())
import now_you_can_import_any_name
print now_you_can_import_any_name
It outputs:
<__main__.MyImporter object at 0x009F85F0>
So basically it returns a new module (which can be any object), in this case itself. You may use it to alter the import behavior by returning processed_xxx on import of xxx.
IMO: Python doesn't need a preprocessor. Whatever you are accomplishing can be accomplished in Python itself due to its very dynamic nature. For example, taking the case of the debug example, what is wrong with having at the top of the file
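The example above uses the original PEP 302 methods (find_module and load_module) and Python 2 print statements. On current Python 3 the equivalent hook is written with find_spec and exec_module. The following is a sketch under assumptions, not the preprocessor's actual hook: the class name PrefixFinder, the "generated_" prefix and the greet attribute are all made up for illustration.

```python
import importlib.abc
import importlib.util
import sys


class PrefixFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve any top-level import whose name starts with PREFIX."""

    PREFIX = "generated_"

    def find_spec(self, fullname, path=None, target=None):
        if fullname.startswith(self.PREFIX):
            # Tell the import system that this object will load the module.
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the remaining finders handle everything else

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        # A real hook would compile and run preprocessed source here.
        module.greet = lambda: "hello from " + module.__name__


# Appended last, so it is consulted only after the normal finders fail.
sys.meta_path.append(PrefixFinder())

import generated_demo

message = generated_demo.greet()
```

Placing the finder at the end of sys.meta_path keeps it from interfering with real modules; inserting it at the front would let it override them instead.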
debug = 1
and later
if debug:
print "wow"
?
In Python 2 there is the imputil module that seems to provide the functionality you are looking for, but it has been removed in Python 3. It's not very well documented but contains an example section that shows how you can replace the standard import functions.
For Python 3 there is the importlib module (introduced in Python 3.1) that contains functions and classes to modify the import functionality in all kinds of ways. It should be suitable to hook your preprocessor into the import system.
