I have a plypython function which does some json magic. For this it obviously imports the json library.
Is the import called on every call to the function? Are there any performance implication I have to be aware of?
The import is executed on every function call. This is the same behavior you would get if you wrote a normal Python module with the import statement inside a function body as oppposed to at the module level.
Yes, this will affect performance.
You can work around this by caching your imports like this:
CREATE FUNCTION test() RETURNS text
LANGUAGE plpythonu
AS $$
if 'json' in SD:
json = SD['json']
else:
import json
SD['json'] = json
return json.dumps(...)
$$;
This is admittedly not very pretty, and better ways to do this are being discussed, but they won't happen before PostgreSQL 9.4.
The declaration in the body of a PL/Python function will eventually become an ordinary Python function and will thus behave as such. When a Python function imports a module for the first time the module is cached in the sys.modules dictionary (https://docs.python.org/3/reference/import.html#the-module-cache). Subsequent imports of the same module will simply bind the import name to the module object found in the dictionary. In a sense, what I'm saying may cast some doubt on the usefulness of the tip given in the accepted answer, since it makes it somewhat redundant, as Python already does a similar caching for you.
To sum things up, I'd say that if you import in the standard way of simply using the import or from [...] import constructs, then you need not worry about repeated imports, in functions or otherwise, Python has got you covered.
On the other hand, Python allows you to bypass its native import semantics and to implement your own (with the __import__() function and importlib module). If this is what you're doing, maybe you should review what's available in the toolbox (https://docs.python.org/3/reference/import.html).
Related
I'd like to dynamically create a module from a dictionary, and I'm wondering if adding an element to sys.modules is really the best way to do this. EG
context = { a: 1, b: 2 }
import types
test_context_module = types.ModuleType('TestContext', 'Module created to provide a context for tests')
test_context_module.__dict__.update(context)
import sys
sys.modules['TestContext'] = test_context_module
My immediate goal in this regard is to be able to provide a context for timing test execution:
import timeit
timeit.Timer('a + b', 'from TestContext import *')
It seems that there are other ways to do this, since the Timer constructor takes objects as well as strings. I'm still interested in learning how to do this though, since a) it has other potential applications; and b) I'm not sure exactly how to use objects with the Timer constructor; doing so may prove to be less appropriate than this approach in some circumstances.
EDITS/REVELATIONS/PHOOEYS/EUREKA:
I've realized that the example code relating to running timing tests won't actually work, because import * only works at the module level, and the context in which that statement is executed is that of a function in the testit module. In other words, the globals dictionary used when executing that code is that of __main__, since that's where I was when I wrote the code in the interactive shell. So that rationale for figuring this out is a bit botched, but it's still a valid question.
I've discovered that the code run in the first set of examples has the undesirable effect that the namespace in which the newly created module's code executes is that of the module in which it was declared, not its own module. This is like way weird, and could lead to all sorts of unexpected rattlesnakeic sketchiness. So I'm pretty sure that this is not how this sort of thing is meant to be done, if it is in fact something that the Guido doth shine upon.
The similar-but-subtly-different case of dynamically loading a module from a file that is not in python's include path is quite easily accomplished using imp.load_source('NewModuleName', 'path/to/module/module_to_load.py'). This does load the module into sys.modules. However this doesn't really answer my question, because really, what if you're running python on an embedded platform with no filesystem?
I'm battling a considerable case of information overload at the moment, so I could be mistaken, but there doesn't seem to be anything in the imp module that's capable of this.
But the question, essentially, at this point is how to set the global (ie module) context for an object. Maybe I should ask that more specifically? And at a larger scope, how to get Python to do this while shoehorning objects into a given module?
Hmm, well one thing I can tell you is that the timeit function actually executes its code using the module's global variables. So in your example, you could write
import timeit
timeit.a = 1
timeit.b = 2
timeit.Timer('a + b').timeit()
and it would work. But that doesn't address your more general problem of defining a module dynamically.
Regarding the module definition problem, it's definitely possible and I think you've stumbled on to pretty much the best way to do it. For reference, the gist of what goes on when Python imports a module is basically the following:
module = imp.new_module(name)
execfile(file, module.__dict__)
That's kind of the same thing you do, except that you load the contents of the module from an existing dictionary instead of a file. (I don't know of any difference between types.ModuleType and imp.new_module other than the docstring, so you can probably use them interchangeably) What you're doing is somewhat akin to writing your own importer, and when you do that, you can certainly expect to mess with sys.modules.
As an aside, even if your import * thing was legal within a function, you might still have problems because oddly enough, the statement you pass to the Timer doesn't seem to recognize its own local variables. I invoked a bit of Python voodoo by the name of extract_context() (it's a function I wrote) to set a and b at the local scope and ran
print timeit.Timer('print locals(); a + b', 'sys.modules["__main__"].extract_context()').timeit()
Sure enough, the printout of locals() included a and b:
{'a': 1, 'b': 2, '_timer': <built-in function time>, '_it': repeat(None, 999999), '_t0': 1277378305.3572791, '_i': None}
but it still complained NameError: global name 'a' is not defined. Weird.
I know that from module import * will import all the functions in current namespace but it is a bad practice. I want to use two functions directly and use module.function when I have to use any other function from the module. What I am doing currently is:
import module
from module import func1, func2
# DO REST OF MY STUFF
Is it a good practice? Does the order of first two statements matter?
Is there a better way using which I can use these two functions directly and use rest of the functions as usual with the module's name prepended to them?
Using just import module results in very long statements with a lot of repetition if I use the same function from the given module five times in a single statement. That's what I want to avoid.
The order doesn't matter and it's not a pythonic way. When you import the module there is no need to import some of its functions separately again. If you are not sure how many of the functions you might need to use just import the module and access to the functions on demand with a simple reference.
# The only import you need
import module
# Use module.funcX when you need any of its functions
After all, if you want to use some of your functions (much) more than the others, as the cost of attribute access is greater than importing the functions separately, you better to import them as you've done.
And still, the order doesn't matter. You can do:
import module
from module import func1, func2
For more info read the documentation https://www.python.org/dev/peps/pep-0008/#imports
It is not good to do (may be opinion based):
import module
from module import func1, func2 # `func1` and `func2` are already part of module
Because you already hold a reference to module.
If I were you, I would import it in the form of import module. Since your issue is that module.func1() becomes too long. I may import the module and use as for creating a alias for the name. For example:
import module as mo
# ^ for illustration purpose. Even the name of
# your actual module wont be `module`.
# Alias should also be self-explanatory
# For example:
import database_manager as db_manager
Now I may access the functions as:
mo.func1()
mo.func2()
Edit: Based on the edit in actual question
If your are calling same function in the same line, there is possibility that your are already doing some thing wrong. It will be great if you can share what your that function does.
For example: Want to the rertun value of those functions to be passed as argument to another function? as:
test_func(mo.func1(x), mo.func1(y). mo.func1(z))
could be done as:
params_list = [x, y, z]
func_list = [mo.func1(param) for param in params_list]
test_func(*func_list)
When using many IDEs that support autocompletion with Python, things like this will show warnings, which I find annoying:
from eventlet.green.httplib import BadStatusLine
When switching to:
from eventlet.green.httplib import *
The warnings go away. What's the benefit to limiting imports to a specific set of types you'll use? Is the parsing faster? Reduces collisions? What other point is there? It seems the state of python IDEs and the nature of the typing system makes it hard for many IDEs to fully get right when a type import works and when it doesn't.
By typing from foo import *, you import all the names defined in foo into the global namespace. This is bad practice because you could have name clashes both with other modules and with built-ins.
For example, consider a module foo
#foo.py
def open(something):
pass
and a module bar:
#bar.py
def open(something_else):
pass
Now, from foo import * hides the built-in function open() which means that any calls to open() now refer to foo.open() rather than the built-in. Worse, if you then have from bar import *, the function open() in bar now hides both the built-in and the function imported from foo.
In the example above, from foo import open is equally shadowing the built-in function, but one glance at the code tells you why you can't open files for IO anymore.
This is why you should import only specific names, ensuring that you know what names are imported. Alternatively, you could use fully qualified names (import foo; foo.open(), which is perfectly safe).
EDIT: Just as a note, this can be horribly compounded if the module you're importing also uses from x import *. In this case, not only do you typically import all the stuff in the module foo, but also all the stuff in the module x into the global namespace. This can very quickly turn into an absolute mess.
It reduces collisions with user-defined types, it reduces coupling and it's self-documenting, since it makes clear from the outset of the module which classes are coming from libraries (so the rest must be user-defined). The parsing is not faster, at least not in CPython: an imported module must be read in its entirety to look for the classes/functions being imported.
(I must admit that I never use an IDE.)
I'm creating a class to extend a package, and prior to class instantiation I don't know which subset of the package's namespace I need. I've been careful about avoiding namespace conflicts in my code, so, does
from package import *
create problems besides name conflicts?
Is it better to examine the class's input and import only the names I need (at runtime) in the __init__ ??
Can python import from a set [] ?
does
for name in [namespace,namespace]:
from package import name
make any sense?
I hope this question doesn't seem like unnecessary hand-ringing, i'm just super new to python and don't want to do the one thing every 'beginnger's guide' says not to do (from pkg import * ) unless I'm sure there's no alternative.
thoughts, advice welcome.
In order:
It does not create other problems - however, name conflicts can be much more of a problem than you'd expect.
Definitely defer your imports if you can. Even though Python variable scoping is simplistic, you also gain the benefit of not having to import the module if the functionality that needs it never gets called.
I don't know what you mean. Square brackets are used to make lists, not sets. You can import multiple names from a module in one line - just use a comma-delimited list:
from awesome_module import spam, ham, eggs, baked_beans
# awesome_module defines lots of other names, but they aren't pulled in.
No, that won't do what you want - name is an identifier, and as such, each time through the loop the code will attempt to import the name name, and not the name that corresponds to the string referred to by the name variable.
However, you can get this kind of "dynamic import" effect, using the __import__ function. Consult the documentation for more information, and make sure you have a real reason for using it first. We get into some pretty advanced uses of the language here pretty quickly, and it usually isn't as necessary as it first appears. Don't get too clever. We hates them tricksy hobbitses.
When importing * you get everything in the module dumped straight into your namespace. This is not always a good thing as you could accentually overwrite something like;
from time import *
sleep = None
This would render the time.sleep function useless...
The other way of taking functions, variables and classes from a module would be saying
from time import sleep
This is a nicer way but often the best way is to just import the module and reference the module directly like
import time
time.sleep(3)
you can import like from PIL import Image, ImageDraw
what is imported by from x import * is limited to the list __all__ in x if it exists
importing at runtime if the module name isn't know or fixed in the code must be done with __import__ but you shouldn't have to do that
This syntax constructions help you to avoid any name collision:
from package import somename as another_name
import package as another_package_name
Is it possible to overload the from/import statement in Python?
For example, assuming jvm_object is an instance of class JVM, is it possible to write this code:
class JVM(object):
def import_func(self, cls):
return something...
jvm = JVM()
# would invoke JVM.import_func
from jvm import Foo
This post demonstrates how to use functionality introduced in PEP-302 to import modules over the web. I post it as an example of how to customize the import statement rather than as suggested usage ;)
It's hard to find something which isn't possible in a dynamic language like Python, but do we really need to abuse everything? Anyway, here it is:
from types import ModuleType
import sys
class JVM(ModuleType):
Foo = 3
sys.modules['JVM'] = JVM
from JVM import Foo
print Foo
But one pattern I've seen in several libraries/projects is some kind of a _make_module() function, which creates a ModuleType dynamically and initializes everything in it. After that, the current Module is replaced by the new module (using the assignment to sys.modules) and the _make_module() function gets deleted. The advantage of that, is that you can loop over the module and even add objects to the module inside that loop, which is quite useful sometimes (but use it with caution!).