I'd like to dynamically create a module from a dictionary, and I'm wondering if adding an element to sys.modules is really the best way to do this. E.g.:
context = {'a': 1, 'b': 2}
import types
test_context_module = types.ModuleType('TestContext', 'Module created to provide a context for tests')
test_context_module.__dict__.update(context)
import sys
sys.modules['TestContext'] = test_context_module
My immediate goal in this regard is to be able to provide a context for timing test execution:
import timeit
timeit.Timer('a + b', 'from TestContext import *')
It seems that there are other ways to do this, since the Timer constructor takes objects as well as strings. I'm still interested in learning how to do this though, since a) it has other potential applications; and b) I'm not sure exactly how to use objects with the Timer constructor; doing so may prove to be less appropriate than this approach in some circumstances.
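For reference, a minimal sketch of the callable form (my own example, not part of the original setup): since Python 2.6 both stmt and setup may be zero-argument callables, so ordinary variables can carry the context instead of a module:
import timeit

a, b = 1, 2

def stmt():                        # zero-argument callable; refers to the surrounding a and b
    return a + b

print timeit.Timer(stmt).timeit()
Note that calling a Python function adds overhead to every iteration, so this measures something slightly different from the plain-string version.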
EDITS/REVELATIONS/PHOOEYS/EUREKA:
I've realized that the example code relating to running timing tests won't actually work, because import * only works at the module level, and the context in which that statement gets executed is that of a function in the timeit module. In other words, the globals dictionary used when executing that code is that of __main__, since that's where I was when I wrote the code in the interactive shell. So that rationale for figuring this out is a bit botched, but it's still a valid question.
I've discovered that the code run in the first set of examples has the undesirable effect that the namespace in which the newly created module's code executes is that of the module in which it was declared, not its own module. This is like way weird, and could lead to all sorts of unexpected rattlesnakeic sketchiness. So I'm pretty sure that this is not how this sort of thing is meant to be done, if it is in fact something that the Guido doth shine upon.
The similar-but-subtly-different case of dynamically loading a module from a file that is not in python's include path is quite easily accomplished using imp.load_source('NewModuleName', 'path/to/module/module_to_load.py'). This does load the module into sys.modules. However this doesn't really answer my question, because really, what if you're running python on an embedded platform with no filesystem?
I'm battling a considerable case of information overload at the moment, so I could be mistaken, but there doesn't seem to be anything in the imp module that's capable of this.
But the question, essentially, at this point is how to set the global (ie module) context for an object. Maybe I should ask that more specifically? And at a larger scope, how to get Python to do this while shoehorning objects into a given module?
Hmm, well one thing I can tell you is that the timeit function actually executes its code using the module's global variables. So in your example, you could write
import timeit
timeit.a = 1
timeit.b = 2
timeit.Timer('a + b').timeit()
and it would work. But that doesn't address your more general problem of defining a module dynamically.
Regarding the module definition problem, it's definitely possible and I think you've stumbled on to pretty much the best way to do it. For reference, the gist of what goes on when Python imports a module is basically the following:
module = imp.new_module(name)
execfile(file, module.__dict__)
That's kind of the same thing you do, except that you load the contents of the module from an existing dictionary instead of a file. (I don't know of any difference between types.ModuleType and imp.new_module other than the docstring, so you can probably use them interchangeably) What you're doing is somewhat akin to writing your own importer, and when you do that, you can certainly expect to mess with sys.modules.
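To make that concrete, here is a hedged sketch of the dictionary case wrapped up as a helper (module_from_dict is my own name, not a standard API):
import sys, types

def module_from_dict(name, contents, doc=None):
    module = types.ModuleType(name, doc)
    module.__dict__.update(contents)
    sys.modules[name] = module    # registering it here is what makes 'import name' work
    return module

module_from_dict('TestContext', {'a': 1, 'b': 2})

import TestContext                # found in sys.modules, no file needed
print TestContext.a               # 1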
As an aside, even if your import * thing was legal within a function, you might still have problems because oddly enough, the statement you pass to the Timer doesn't seem to recognize its own local variables. I invoked a bit of Python voodoo by the name of extract_context() (it's a function I wrote) to set a and b at the local scope and ran
print timeit.Timer('print locals(); a + b', 'sys.modules["__main__"].extract_context()').timeit()
Sure enough, the printout of locals() included a and b:
{'a': 1, 'b': 2, '_timer': <built-in function time>, '_it': repeat(None, 999999), '_t0': 1277378305.3572791, '_i': None}
but it still complained NameError: global name 'a' is not defined. Weird. (Presumably this is because the statement gets compiled into timeit's template function, where a and b are never assigned, so the compiler treats them as global names; names injected into a frame's locals at run time are invisible to those global lookups.)
If you type this:
import somemodule
help(somemodule)
it will print out a paged package description. I would need to get the same description as a string, but without importing this package into the current namespace. Is this possible? It surely is, because anything is possible in Python, but what is the most elegant/pythonic way of doing so?
Side note: by elegant way I mean without opening a separate process and capturing its stdout... ;)
In other words, is there a way to peek into an unimported but installed package and get its description? Maybe something with importlib.abc.InspectLoader? But I have no idea how to make it work the way I need.
UPDATE: I need not just to avoid polluting the namespace, but also to do this without leaving any traces of the package or its dependent modules in memory, in sys.modules, etc., as if it had never really been imported.
UPDATE: Before anyone asks me why I need it: I want to list all installed Python packages with their descriptions. But afterwards I do not want to have them sitting in sys.modules or occupying excessive memory, because there can be a lot of them.
The reason that you will need to import the module to get a help string is that in many cases, the help strings are actually generated in code. It would be pointlessly difficult to parse the text of such a package to get the string since you would then have to write a small Python interpreter to reconstruct the actual string.
That being said, there are ways of completely deleting temporarily imported modules, based on this answer, which summarizes a thread that appeared on the Python mailing list around 2003: http://web.archive.org/web/20080926094551/http://mail.python.org/pipermail/python-list/2003-December/241654.html. The methods described there will generally only work if the module is not referenced elsewhere. Otherwise the module will be unloaded only in the sense that import will reload it from scratch instead of using the existing sys.modules entry, but the module will still live in memory.
Here is a function that does approximately what you want and even prints a warning if the module does not appear to have been unloaded. Unlike the solutions proposed in the linked answer, this function really handles all the side-effects of loading a module, including the fact that importing one package may import other external packages into sys.modules:
import sys, warnings
def get_help(module_name):
    modules_copy = sys.modules.copy()
    module = __import__(module_name)
    h = help(module)
    for modname in list(sys.modules):
        if modname not in modules_copy:
            del sys.modules[modname]
    if sys.getrefcount(module) > 1:
        warnings.warn('Module {} is likely not to be completely wiped'.format(module_name))
    del module
    return h
The reason that I make a list of the keys in the final loop is that it is inadvisable to modify a dictionary (or any other iterable) as you iterate through it. At least in Python 3, dict.keys() returns an iterable that is backed by the dictionary itself, not a frozen copy. I am not sure if h = ... and return h are even necessary, but in the worst case, h is just None.
Well, if you are only worried about keeping the global namespace tidy, you could always import in a function:
>>> def get_help():
... import math
... help(math)
...
>>> math
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
I would suggest a different approach. If I understand you correctly, you want to read a portion of a package without importing it (not even within a function's local scope). One way to do that would be to go to (python_path)/Lib/site-packages/(package_name)/ and read the contents of the respective files directly, instead of having Python import the module.
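As a rough sketch of that idea (this only recovers the module docstring, which is just the first part of what help() shows; the path is a made-up example):
import ast

def docstring_from_source(path):
    # parse the source without executing or importing anything
    with open(path) as f:
        tree = ast.parse(f.read())
    return ast.get_docstring(tree)

print(docstring_from_source('/usr/lib/python3/site-packages/somepackage/__init__.py'))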
I have a PL/Python function which does some JSON magic. For this it obviously imports the json library.
Is the import called on every call to the function? Are there any performance implications I have to be aware of?
The import is executed on every function call. This is the same behavior you would get if you wrote a normal Python module with the import statement inside a function body as opposed to at the module level.
Yes, this will affect performance.
You can work around this by caching your imports like this:
CREATE FUNCTION test() RETURNS text
LANGUAGE plpythonu
AS $$
if 'json' in SD:
    json = SD['json']
else:
    import json
    SD['json'] = json
return json.dumps(...)
$$;
This is admittedly not very pretty, and better ways to do this are being discussed, but they won't happen before PostgreSQL 9.4.
The declaration in the body of a PL/Python function will eventually become an ordinary Python function and will thus behave as such. When a Python function imports a module for the first time the module is cached in the sys.modules dictionary (https://docs.python.org/3/reference/import.html#the-module-cache). Subsequent imports of the same module will simply bind the import name to the module object found in the dictionary. In a sense, what I'm saying may cast some doubt on the usefulness of the tip given in the accepted answer, since it makes it somewhat redundant, as Python already does a similar caching for you.
To sum things up, I'd say that if you import in the standard way of simply using the import or from [...] import constructs, then you need not worry about repeated imports, in functions or otherwise; Python has got you covered.
On the other hand, Python allows you to bypass its native import semantics and to implement your own (with the __import__() function and importlib module). If this is what you're doing, maybe you should review what's available in the toolbox (https://docs.python.org/3/reference/import.html).
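Going back to the caching point, a small sketch in plain Python (outside PL/Python):
import sys

def uses_json():
    import json                    # the json module is executed only on the very first import
    return json.dumps({'x': 1})

print(uses_json())
print('json' in sys.modules)       # True: cached, so later imports just rebind the name
print(uses_json())                 # no re-execution of the json module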
I am trying to design the package and module system for a programming language (Heron) which can be both compiled and interpreted, and from what I have seen I really like the Python approach. Python has a rich choice of modules, which seems to contribute largely to its success.
What I don't know is what happens in Python if a module is included in two different compiled packages: are there separate copies made of the data or is it shared?
Related to this are a bunch of side-questions:
Am I right in assuming that packages can be compiled in Python?
What are the pros and cons of the two approaches (copying or sharing of module data)?
Are there widely known problems with the Python module system, from the point of view of the Python community? For example is there a PEP under consideration for enhancing modules/packages?
Are there certain aspects of the Python module/package system which wouldn't work well for a compiled language?
Well, you asked a lot of questions. Here are some hints to get a bit further:
a. Python code is lexed and compiled into Python-specific instructions (bytecode), but not compiled to machine-executable code. The ".pyc" file is automatically created whenever you run Python code whose source does not match the existing .pyc's timestamp. This feature can be turned off. You might play with the dis module to see these instructions.
b. When a module is imported, it is executed (top to bottom) in its own namespace, and that namespace is cached globally. When you import from another module, the module is not executed again. Remember that def is just a statement. You may want to put a print('compiling this module') statement in your code to trace it, as in the sketch below.
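A small sketch of that trace (tracer.py is a made-up file name):
tracer.py:
print('executing tracer module')   # runs once, on the first import

def f():
    return 42

Interactive session:
>>> import tracer
executing tracer module
>>> import tracer                  # cached in sys.modules; nothing is printed
>>> tracer.f()
42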
It depends.
There were recent enhancements, mostly around specifying which module should be loaded. Imports can use relative paths, so a huge project might have multiple modules with the same name.
Python itself won't work for a compiled language. Google for "unladen swallow blog" to see the tribulations of trying to speed up a language where "a = sum(b)" can change meanings between executions. Outside of corner cases, the module system forms a nice bridge between source code and a compiled library system. The approach works well, and Python's easy wrapping of C code (swig, etc.) helps.
Modules are the only truly global objects in Python, with all other global data based around the module system (which uses sys.modules as a registry). Packages are simply modules with special semantics for importing submodules. "Compiling" a .py file into a .pyc or .pyo isn't compilation as understood for most languages: it only checks the syntax and creates a code object which, when executed in the interpreter, creates the module object.
example.py:
print "Creating %s module." % __name__
def show_def(f):
    print "Creating function %s.%s." % (__name__, f.__name__)
    return f

@show_def
def a():
    print "called: %s.a" % __name__
Interactive session:
>>> import example
# first sys.modules['example'] is checked
# since it doesn't exist, example.py is found and "compiled" to example.pyc
# (since example.pyc doesn't exist, same would happen if it was outdated, etc.)
Creating example module. # module code is executed
Creating function example.a. # def statement executed
>>> example.a()
called: example.a
>>> import example
# sys.modules['example'] found, local variable example assigned to that object
# no 'Creating ..' output
>>> d = {"__name__": "fake"}
>>> exec open("example.py") in d
# the first import in this session is very similar to this
# in that it creates a module object (which has a __dict__), initializes a few
# variables in it (__builtins__, __name__, and others---packages' __init__
# modules have their own as well---look at some_module.__dict__.keys() or
# dir(some_module))
# and executes the code from example.py in this dict (or the code object stored
# in example.pyc, etc.)
Creating fake module. # module code is executed
Creating function fake.a. # def statement executed
>>> d.keys()
['__builtins__', '__name__', 'a', 'show_def']
>>> d['a']()
called: fake.a
Your questions:
They are compiled, in a sense, but not as you would expect if you're familiar with how C compilers work.
If the data is immutable, copying is feasible, and should be indistinguishable from sharing except for object identity (is operator and id() in Python).
Imports may or may not execute code (they always assign a local variable to an object, but that poses no problems) and may or may not modify sys.modules. You must be careful to not import in threads, and generally it is best to do all imports at the top of every module: this leads to a cascading graph so all the imports are done at once and then __main__ continues and does the Real Work™.
I don't know of any current PEP, but there's already a lot of complex machinery in place, too. For example packages can have a __path__ attribute (really a list of paths) so submodules don't have to be in the same directory, and these paths can even be computed at runtime! (Example mungepath package below.) You can have your own import hooks, use import statements inside functions, directly call __import__, and I wouldn't be surprised to find 2-3 other unique ways to work with packages and modules.
A subset of the import system would work in a traditionally-compiled language, as long as it was similar to something like C's #include. You could run the "first level" of execution (creating the module objects) in the compiler, and compile those results. There are significant drawbacks to this, however, and amounts to separate execution contexts for module-level code and functions executed at runtime (and some functions would have to run in both contexts!). (Remember in Python that every statement is executed at runtime, even def and class statements.)
I believe this is the main reason traditionally-compiled languages restrict "top-level" code to class, function, and object declarations, eliminating this second context. Even then, you have initialization problems for global objects in C/C++ (and others), unless managed carefully.
mungepath/__init__.py:
print __path__
__path__.append(".") # CWD, would be different in non-example code
print __path__
from . import example # this is example.py from above, and is NOT in mungepath/
# note that this is a degenerate case, in that we now have two names for the
# 'same' module: example and mungepath.example, but they're really different
# modules with different functions (use 'is' or 'id()' to verify)
Interactive session:
>>> import example
Creating example module.
Creating function example.a.
>>> example.__dict__.keys()
['a', '__builtins__', '__file__', 'show_def', '__package__',
'__name__', '__doc__']
>>> import mungepath
['mungepath']
['mungepath', '.']
Creating mungepath.example module.
Creating function mungepath.example.a.
>>> mungepath.example.a()
called: mungepath.example.a
>>> example is mungepath.example
False
>>> example.a is mungepath.example.a
False
Global data is scoped at the interpreter level.
"packages" can be compiled as a package is just a collection of modules which themselves can be compiled.
I am not sure I understand given the established scoping of data.
I'm working on my first significant Python project and I'm having trouble with scope issues and executing code in included files. Previously my experience is with PHP.
What I would like to do is have one single file that sets up a number of configuration variables, which would then be used throughout the code. Also, I want to make certain functions and classes available globally. For example, the main file would include a single other file, and that file would load a bunch of commonly used functions (each in its own file) and a configuration file. Within those loaded files, I also want to be able to access the functions and configuration variables. What I don't want to do, is to have to put the entire routine at the beginning of each (included) file to include all of the rest. Also, these included files are in various sub-directories, which is making it much harder to import them (especially if I have to re-import in every single file).
Anyway I'm looking for general advice on the best way to structure the code to achieve what I want.
Thanks!
In python, it is a common practice to have a bunch of modules that implement various functions and then have one single module that is the point-of-access to all the functions. This is basically the facade pattern.
An example: say you're writing a package foo, which includes the bar, baz, and moo modules.
~/project/foo
~/project/foo/__init__.py
~/project/foo/bar.py
~/project/foo/baz.py
~/project/foo/moo.py
~/project/foo/config.py
What you would usually do is write __init__.py like this:
from foo.bar import func1, func2
from foo.baz import func3, constant1
from foo.moo import func1 as moofunc1
from foo.config import *
Now, when you want to use the functions you just do
import foo
foo.func1()
print foo.constant1
# assuming config defines a config1 variable
print foo.config1
If you wanted, you could arrange your code so that you only need to write
import foo
At the top of every module, and then access everything through foo (which you should probably name "globals" or something to that effect). If you don't like namespaces, you could even do
from foo import *
and have everything as global, but this is really not recommended. Remember: namespaces are one honking great idea!
This is a two-step process:
In your module globals.py import the items from wherever.
In all of your other modules, do "from globals import *"
This brings all of those names into the current module's namespace.
Now, having told you how to do this, let me suggest that you don't. First of all, you are loading up the local namespace with a bunch of "magically defined" entities. This violates precept 2 of the Zen of Python, "Explicit is better than implicit." Instead of "from foo import *", try using "import foo" and then saying "foo.some_value". If you want to use the shorter names, use "from foo import mumble, snort". Either of these methods directly exposes the actual use of the module foo.py. Using the globals.py method is just a little too magic. The primary exception to this is in an __init__.py where you are hiding some internal aspects of a package.
Globals are also semi-evil in that it can be very difficult to figure out who is modifying (or corrupting) them. If you have well-defined routines for getting/setting globals, then debugging them can be much simpler.
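A minimal sketch of that idea (config.py is a hypothetical module name):
# config.py
_settings = {}

def set_option(name, value):
    # single chokepoint: an easy place to log or set a breakpoint when
    # hunting down who is modifying what
    _settings[name] = value

def get_option(name, default=None):
    return _settings.get(name, default)
Other modules then do import config and go through config.set_option() / config.get_option(), so every change passes through one well-defined place.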
I know that PHP has this "everything is one, big, happy namespace" concept, but it's really just an artifact of poor language design.
As far as I know, program-wide global variables/functions/classes/etc. do not exist in Python; everything is "confined" to some module (namespace). So if you want some functions or classes to be usable in many parts of your code, one solution is to create a couple of modules, like "globFunCl" (defining, or importing from elsewhere, everything you want to be "global") and "config" (containing configuration variables), and to import those everywhere you need them. If you don't like the idea of using nested namespaces you can use:
from globFunCl import *
This way you'll "hide" namespaces (making names look like "globals").
I'm not sure what you mean by not wanting to "put the entire routine at the beginning of each (included) file to include all of the rest"; I'm afraid you can't really escape from that. Check out Python packages though, they should make it easier for you.
This depends a bit on how you want to package things up. You can either think in terms of files or modules. The latter is "more pythonic", and enables you to decide exactly which items (and they can be anything with a name: classes, functions, variables, etc.) you want to make visible.
The basic rule is that for any file or module you import, anything directly in its namespace can be accessed. So if myfile.py contains definitions def myfun(...): and class myclass(...) as well as myvar = ... then you can access them from another file by
import myfile
y = myfile.myfun(...)
x = myfile.myvar
or
from myfile import myfun, myvar, myclass
Crucially, anything at the top level of myfile is accessible, including imports. So if myfile contains from foo import bar, then myfile.bar is also available.
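For example, using the placeholder names from above:
# myfile.py
from foo import bar      # a top-level import, so bar becomes an attribute of myfile

# some other module
import myfile
print(myfile.bar)        # works: bar is reachable through myfile's namespace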
I'm writing a plugin system for my program and I can't get past one thing:
class ThingLoader(object):
    '''
    Loader class
    '''
    def loadPlugins(self):
        '''
        Get all the plugins from the plugins folder
        '''
        from diones.thingpad.plugin.IntrospectionHelper import loadClasses
        classList = loadClasses('./plugins', IPlugin)  # gets a list of plugin classes
        self.plugins = {}  # dictionary to be filled with tuples of
                           # objects and their states: activated, deactivated
        classList[0](self)  # runs nicely
        foo = classList[1]
        print foo           # prints <class 'TestPlugin.TestPlugin'>
        foo(self)           # raises an exception
The test plugin looks like this:
import diones.thingpad.plugin.IPlugin as plugin
class TestPlugin(plugin.IPlugin):
    '''
    classdocs
    '''
    def __init__(self, loader):
        self.name = 'Test Plugin'
        super(TestPlugin, self).__init__(loader)
Now the IPlugin looks like this:
class IPlugin(object):
    '''
    classdocs
    '''
    name = ''

    def __init__(self, loader):
        self.loader = loader

    def activate(self):
        pass
All the IPlugin subclasses work flawlessly by themselves, but when called by ThingLoader the program gets an exception:
File "./plugins\TestPlugin.py", line 13, in __init__
super(TestPlugin, self).__init__(loader) NameError:
global name 'super' is not defined
I looked all around and I simply don't know what is going on.
‘super’ is a builtin. Unless you went out of your way to delete builtins, you shouldn't ever see “global name 'super' is not defined”.
I'm looking at your user web link where there is a dump of IntrospectionHelper. It's very hard to read without the indentation, but it looks like you may be doing exactly that:
built_in_list = ['__builtins__', '__doc__', '__file__', '__name__']
for i in built_in_list:
    if i in module.__dict__:
        del module.__dict__[i]
That's the original module dict you're changing there, not an informational copy you are about to return! Delete these members from a live module and you can expect much more than ‘super’ to break.
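To see why, here is a minimal reproduction of the effect (Python 2, built independently of the question's code):
import types

victim = types.ModuleType('victim')
code = '''
class Thing(object):
    def __init__(self):
        super(Thing, self).__init__()
'''
exec code in victim.__dict__          # exec fills in __builtins__ for us
del victim.__dict__['__builtins__']   # ...which is roughly what the loader is doing

victim.Thing()   # NameError: global name 'super' is not defined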
It's very hard to keep track of what that module is doing, but my reaction is there is far too much magic in it. The average Python program should never need to be messing around with the import system, sys.path, and monkey-patching __magic__ module members. A little bit of magic can be a neat trick, but this is extremely fragile. Just off the top of my head from browsing it, the code could be broken by things like:
name clashes with top-level modules
any use of new-style classes
modules supplied only as compiled bytecode
zipimporter
From the incredibly round-about functions like getClassDefinitions, extractModuleNames and isFromBase, it looks to me like you still have quite a bit to learn about the basics of how Python works. (Clues: getattr, module.__name__ and issubclass, respectively.)
In this case now is not the time to be diving into import magic! It's hard. Instead, do things The Normal Python Way. It may be a little more typing to say at the bottom of a package's mypackage/__init__.py:
from mypackage import fooplugin, barplugin, bazplugin
plugins= [fooplugin.FooPlugin, barplugin.BarPlugin, bazplugin.BazPlugin]
but it'll work and be understood everywhere without relying on a nest of complex, fragile magic.
Incidentally, unless you are planning on some in-depth multiple inheritance work (and again, now may not be the time for that), you probably don't even need to use super(). The usual “IPlugin.__init__(self, ...)” method of calling a known superclass is the straightforward thing to do; super() is not always “the newer, better way of doing things” and there are things you should understand about it before you go charging into using it.
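A minimal sketch of what that looks like for the plugin above:
import diones.thingpad.plugin.IPlugin as plugin

class TestPlugin(plugin.IPlugin):
    def __init__(self, loader):
        self.name = 'Test Plugin'
        plugin.IPlugin.__init__(self, loader)   # explicit call to the known superclass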
Unless you're running a version of Python earlier than 2.2 (pretty unlikely), super() is definitely a built-in function (available in every scope, and without importing anything).
May be worth checking your version of Python (just start up the interactive prompt by typing python at the command line).