I'm new to conditional importing in Python, and am considering two approaches for my module design. I'd appreciate input on why I might want to go with one vs. the other (or if a better alternative exists).
The problem
I have a program that will need to call structurally identical but distinct modules under different conditions. These modules all have the same functions, inputs, outputs, etc.; the only difference is in what they do within their functions. For example,
# module_A.py
def get_the_thing(input):
    # do the thing specific to module A
    return thing
# module_B.py
def get_the_thing(input):
    # do the thing specific to module B
    return thing
Option 1
Based on an input value, I would just conditionally import the appropriate module, in line with this answer.
if val == 'A':
    import module_A
elif val == 'B':
    import module_B
Option 2
I use the input variable to generate the module name as a string, then I call the function from the correct module based on that string using this method. I believe this requires me to import all the modules first.
import module_A
import module_B
in_var = get_input() # Say my input variable is 'A', meaning use Module A
module_nm = 'module_' + in_var
function_nm = 'get_the_thing'
getattr(globals()[module_nm], function_nm)(my_args)
The idea is that this would call module_A.get_the_thing() by generating the module and function names at runtime. This is a frivolous example with only one function call; in my actual case I'd be working with a list of functions, but I wanted to keep things simple.
Any thoughts on whether either design is better, or whether something superior to these two exists? I would appreciate any reasons why. Of course, Option 1 is more concise and probably more intuitive, but I wasn't sure whether that necessarily equates to good design or makes any difference in performance.
I'd go with Option 1. It's significantly neater, and you don't need to fiddle around with strings to do lookups. Dealing with strings will, at the very least, complicate refactoring. If you ever change any of the names involved, you must remember to update the strings as well, especially since even smart IDEs won't be able to help you here with typical shift+F6 renaming. The fewer places you have difficult-to-maintain code like that, the better.
I'd make a minor change to Option 1 though. With how you have it now, each use of the module will still require a qualified name, like module_A.do_thing(). That means whenever you want to call a function, you'll first need to figure out which module was imported, which leads to messier code. I'd import them under a common name:
if val == 'A':
    import module_A as my_module
elif val == 'B':
    import module_B as my_module
. . .
my_module.do_thing()  # The exact function called will depend on which module was imported as my_module
You could also, as suggested in the comments, use a wildcard import to avoid needing to use a name for the module:
if val == 'A':
    from module_A import *
elif val == 'B':
    from module_B import *
. . .
do_thing()
But this is discouraged by PEP8:
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
It also pollutes the namespace that you're importing into, making it easier to accidentally shadow a name from the imported file.
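Another way to keep a single common name, without an if chain and without building names from strings, is a small lookup table that maps the input value to a module object. The sketch below is only an illustration: the standard-library modules json and pickle stand in for module_A and module_B (both happen to expose a function called dumps), and BACKENDS and get_backend are hypothetical names. Note that, like Option 2, it imports every candidate module up front.

```python
import json
import pickle

# Stand-ins for module_A and module_B: two modules that expose the same
# function name (dumps) but implement it differently.
BACKENDS = {"A": json, "B": pickle}


def get_backend(val):
    """Return the module registered for val, or raise a clear error."""
    try:
        return BACKENDS[val]
    except KeyError:
        raise ValueError(f"unknown backend: {val!r}") from None


my_module = get_backend("A")
encoded = my_module.dumps([1, 2, 3])  # calls json.dumps here
```

Because the table holds module objects rather than strings, renaming a module in an IDE also updates the table.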
It is recommended not to use import * in Python.
Can anyone please share the reason for that, so that I can avoid doing it next time?
Because it puts a lot of stuff into your namespace (might shadow some other object from previous import and you won't know about it).
Because you don't know exactly what is imported and can't easily find from which module a certain thing was imported (readability).
Because you can't use cool tools like pyflakes to statically detect errors in your code.
According to the Zen of Python:
Explicit is better than implicit.
... can't argue with that, surely?
You don't pass **locals() to functions, do you?
Since Python lacks an "include" statement, and the self parameter is explicit, and scoping rules are quite simple, it's usually very easy to point a finger at a variable and tell where that object comes from -- without reading other modules and without any kind of IDE (which are limited in the way of introspection anyway, by the fact the language is very dynamic).
The import * breaks all that.
Also, it has a concrete possibility of hiding bugs.
import os, sys, foo, sqlalchemy, mystuff
from bar import *
Now, if the bar module has any of the "os", "mystuff", etc... attributes, they will override the explicitly imported ones, and possibly point to very different things. Defining __all__ in bar is often wise -- this states what will implicitly be imported - but still it's hard to trace where objects come from, without reading and parsing the bar module and following its imports. A network of import * is the first thing I fix when I take ownership of a project.
Don't misunderstand me: if the import * were missing, I would cry to have it. But it has to be used carefully. A good use case is to provide a facade interface over another module.
Likewise, the use of conditional import statements, or imports inside function/class namespaces, requires a bit of discipline.
I think in medium-to-big projects, or small ones with several contributors, a minimum of hygiene is needed in terms of static analysis -- running at least pyflakes or, even better, a properly configured pylint -- to catch several kinds of bugs before they happen.
Of course, since this is Python, feel free to break rules and to explore, but be wary of projects that could grow tenfold: if the source code lacks discipline, it will be a problem.
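The shadowing described above, and the effect of defining __all__, can be reproduced without creating any files. This is a minimal sketch under stated assumptions: bar_demo is a made-up module built in memory, and the two import statements are run with exec in a fresh dictionary that plays the role of the importing module's namespace.

```python
import sys
import types

# Build a throwaway module in memory; "bar_demo" is a made-up name.
bar = types.ModuleType("bar_demo")
bar.os = "not the real os module"
bar.helper = 42
sys.modules["bar_demo"] = bar

# Run the two imports in a fresh namespace, as a module body would.
plain = {}
exec("import os\nfrom bar_demo import *", plain)
shadowed_os = plain["os"]  # the string from bar_demo, not the os module

# With __all__ defined, the star import only takes the listed names.
bar.__all__ = ["helper"]
guarded = {}
exec("import os\nfrom bar_demo import *", guarded)
real_os = guarded["os"]  # still the real os module
```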
That is because you are polluting the namespace. You will import all the functions and classes in your own namespace, which may clash with the functions you define yourself.
Furthermore, I think using a qualified name is more clear for the maintenance task; you see on the code line itself where a function comes from, so you can check out the docs much more easily.
In module foo:
def myFunc():
    print(1)
In your code:
from foo import *
def doThis():
    myFunc()  # Which myFunc is called?
def myFunc():
    print(2)
It is OK to do from ... import * in an interactive session.
Say you have the following code in a module called foo:
import ElementTree as etree
and then in your own module you have:
from lxml import etree
from foo import *
You now have a difficult-to-debug module that looks like it has lxml's etree in it, but really has ElementTree instead.
I understand the valid points people have made here. However, I would argue that a "star import" is not always bad practice:
When I want to structure my code in such a way that all the constants go to a module called const.py:
If I do import const, then for every constant, I have to refer to it as const.SOMETHING, which is probably not the most convenient way.
If I do from const import SOMETHING_A, SOMETHING_B ..., then obviously it's way too verbose and defeats the purpose of the structuring.
Thus I feel in this case, doing a from const import * may be a better choice.
http://docs.python.org/tutorial/modules.html
Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code.
These are all good answers. I'm going to add that when teaching new people to code in Python, dealing with import * is very difficult. Even if you or they didn't write the code, it's still a stumbling block.
I teach children (about 8 years old) to program in Python to manipulate Minecraft. I like to give them a helpful coding environment to work with (Atom Editor) and teach REPL-driven development (via bpython). In Atom I find that the hints/completion works just as effectively as bpython. Luckily, unlike some other static analysis tools, Atom is not fooled by import *.
However, let's take this example... In this wrapper they from local_module import * a bunch of modules, including this list of blocks. Let's ignore the risk of namespace collisions. By doing from mcpi.block import * they make this entire list of obscure types of blocks something that you have to go look at to know what is available. If they had instead used from mcpi import block, then you could type walls = block. and then an autocomplete list would pop up.
It is a very BAD practice for two reasons:
1. Code readability
2. Risk of overriding variables, functions, etc.
For point 1:
Let's see an example of this:
from module1 import *
from module2 import *
from module3 import *
a = b + c - d
Here, nobody reading the code can tell which module b, c and d actually come from.
On the other hand, if you do it like this:
#                   v  v  will know that these are from module1
from module1 import b, c   # way 1
import module2             # way 2
a = b + c - module2.d
#           ^ will know it is from module2
It is much cleaner for you, and a new person joining your team will also have a better idea of what is going on.
For point 2: Let's say both module1 and module2 define a variable named b. When I do:
from module1 import *
from module2 import *
print(b)  # will print the value from module2
Here the value from module1 is lost. It will be hard to debug why the code is not working, even though b is defined in module1 and I wrote the code expecting it to use module1.b.
If different modules define the same variable name and you do not want to import the entire module, you can even do:
from module1 import b as mod1b
from module2 import b as mod2b
As a test, I created a module test.py with 2 functions A and B, which respectively print "A 1" and "B 1". After importing test.py with:
import test
. . . I can run the 2 functions as test.A() and test.B(), and "test" shows up as a module in the namespace, so if I edit test.py I can reload it with:
import importlib
importlib.reload(test)
But if I do the following:
from test import *
there is no reference to "test" in the namespace, so there is no way to reload it after an edit (as far as I can tell), which is a problem in an interactive session. Whereas either of the following:
import test
import test as tt
will add "test" or "tt" (respectively) as module names in the namespace, which will allow re-loading.
If I do:
from test import *
the names "A" and "B" show up in the namespace as functions. If I edit test.py, and repeat the above command, the modified versions of the functions do not get reloaded.
And the following command elicits an error message.
importlib.reload(test) # Error - name 'test' is not defined
If someone knows how to reload a module loaded with "from module import *", please post. Otherwise, this would be another reason to avoid the form:
from module import *
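One possible way around the reload problem: a star import still registers the module in sys.modules, so the module object can be fetched from there, reloaded, and the star import repeated to rebind the names. The sketch below assumes a made-up module reload_demo written to a temporary directory, and uses a dictionary named session (also hypothetical) in place of the interactive namespace.

```python
import importlib
import os
import sys
import tempfile

# Avoid stale bytecode caches when the file is rewritten within the same second.
sys.dont_write_bytecode = True

# Create a small module on disk; "reload_demo" is a made-up module name.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "reload_demo.py")
with open(path, "w") as f:
    f.write("def A():\n    return 'A 1'\n")
sys.path.insert(0, tmpdir)

session = {}  # plays the role of the interactive namespace
exec("from reload_demo import *", session)
before = session["A"]()

# Edit the module on disk.
with open(path, "w") as f:
    f.write("def A():\n    return 'A 1, edited'\n")

# The module object is still in sys.modules even though no name refers to it.
importlib.reload(sys.modules["reload_demo"])
exec("from reload_demo import *", session)  # rebind the names
after = session["A"]()
```

In an interactive session the equivalent would be importlib.reload(sys.modules['test']) followed by repeating from test import *.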
As suggested in the docs, you should (almost) never use import * in production code.
While importing * from a module is bad, importing * from a package is probably even worse.
By default, from package import * imports whatever names are defined by the package's __init__.py, including any submodules of the package that were loaded by previous import statements.
If a package’s __init__.py code defines a list named __all__, it is taken to be the list of submodule names that should be imported when from package import * is encountered.
Now consider this example (assuming there's no __all__ defined in sound/effects/__init__.py):
# anywhere in the code before import *
import sound.effects.echo
import sound.effects.surround
# in your module
from sound.effects import *
The last statement will import the echo and surround modules into the current namespace (possibly overriding previous definitions) because they are defined in the sound.effects package when the import statement is executed.
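The dependence on earlier imports can be observed directly. The following sketch builds a tiny package on disk (sound_demo is a made-up name mirroring the sound.effects example, with empty __init__.py files and no __all__) and runs the star import before and after one submodule has been loaded elsewhere.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package on disk: sound_demo/effects/{__init__,echo,surround}.py
# (made-up names mirroring the sound.effects example; no __all__ is defined).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sound_demo", "effects"))
for rel in ("sound_demo/__init__.py", "sound_demo/effects/__init__.py",
            "sound_demo/effects/echo.py", "sound_demo/effects/surround.py"):
    open(os.path.join(root, *rel.split("/")), "w").close()
sys.path.insert(0, root)

before = {}
exec("from sound_demo.effects import *", before)

# Some other part of the program loads one submodule.
importlib.import_module("sound_demo.effects.echo")

after = {}
exec("from sound_demo.effects import *", after)

echo_before = "echo" in before
echo_after = "echo" in after
surround_after = "surround" in after
```

The same import statement yields different names depending on what ran earlier, which is what makes this form hard to reason about.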
I have several different modules, and I need to import one of them depending on different situations, for example:
if check_situation() == 1:
    import helper_1 as helper
elif check_situation() == 2:
    import helper_2 as helper
elif ...
    ...
else:
    import helper_0 as helper
These helpers contain the same dictionaries (dict01, dict02, dict03, ...) but with different values, to be used in different situations.
But this has some problems:
Import statements are normally written at the top of a file, but the check_situation() function here has prerequisites, so the import ends up far from the top.
More than one file needs this helper module, so repeating this kind of import is awkward and ugly.
So, how should I rearrange these helpers?
Firstly, there is no strict requirement that import statements be at the top of a file; it is more of a style-guide thing.
Now, importlib and a dict can be used to replace your if/elif chain:
import importlib
d = {1: 'helper_1', 2: 'helper_2'}
helper = importlib.import_module(d.get(check_situation(), 'helper_0'))
But it's just syntactic sugar really, I suspect you have bigger fish to fry. It sounds like you need to reconsider your data structures, and redesign code.
Anytime you have variables named like dict01, dict02, dict03 it is a sure sign that you need to gear up a level, and have some container of dicts e.g. a list of them. Same goes for your 'helper' module names ending with digits.
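As a rough illustration of "gearing up a level", the numbered modules and numbered dicts could collapse into one container keyed by situation. Everything in this sketch is hypothetical: the HELPERS table, the dicts_for function and the values are invented stand-ins for whatever helper_0, helper_1 and helper_2 really contain.

```python
# Made-up data standing in for the dict01, dict02, ... of each helper module.
HELPERS = {
    0: [{"unit": "m"}, {"limit": 10}],    # default situation
    1: [{"unit": "km"}, {"limit": 100}],
    2: [{"unit": "mi"}, {"limit": 60}],
}


def dicts_for(situation):
    """Return the list of dicts for a situation, falling back to situation 0."""
    return HELPERS.get(situation, HELPERS[0])


dicts = dicts_for(2)
first, second = dicts  # what used to be dict01 and dict02
```

With the data in one place, any file can import this single module and call dicts_for(check_situation()) once the prerequisites are ready.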
You can use __import__(), it accepts a string and returns that module:
helper=__import__("helper_{0}".format(check_situation()))
example :
In [10]: mod=__import__("{0}math".format(raw_input("enter 'c' or '': ")))
enter 'c' or '': c #imports cmath
In [11]: mod.__file__
Out[11]: '/usr/local/lib/python2.7/lib-dynload/cmath.so'
In [12]: mod=__import__("{0}math".format(raw_input("enter 'c' or '': ")))
enter 'c' or '':
In [13]: mod.__file__
Out[13]: '/usr/local/lib/python2.7/lib-dynload/math.so'
As pointed out by @wim, and from the Python 3.x docs on __import__():
Import a module. Because this function is meant for use by the Python
interpreter and not for general use it is better to use
importlib.import_module() to programmatically import a module.
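One practical difference between the two calls shows up with dotted names. The sketch below uses only standard-library modules; load_math is a hypothetical helper that mirrors the math/cmath session above.

```python
import importlib

# __import__ returns the top-level package for a dotted name...
top = __import__("xml.etree.ElementTree")
# ...while importlib.import_module returns the module that was named.
leaf = importlib.import_module("xml.etree.ElementTree")


def load_math(prefix):
    """Import 'math' or 'cmath' depending on prefix ('' or 'c')."""
    return importlib.import_module(f"{prefix}math")


mod = load_math("c")
```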
Solved it myself, with reference to @Michael Scott Cuthbert:
# re_direct.py
import this_module
import that_module
wanted = None
# caller.py
import re_direct
'''
many prerequisites
'''
def imp_now(case):
    import re_direct
    # case1 / case2 stand for whatever conditions select each module
    if case1:
        re_direct.wanted = re_direct.this_module
    elif case2:
        re_direct.wanted = re_direct.that_module
Then, once imp_now has been called in caller, re_direct.wanted refers to this_module or that_module, whether it is accessed from the caller file or from any other file that uses wanted.
Also, because I import re_direct only inside a function, you will not see this module anywhere else, only wanted.
I agree that the approaches given in the other answers are closer to the main question posed in your title, but if the overhead on importing modules is low (as importing a couple of dictionaries likely is) and there are no side-effects to importing, in this case, you may be better off importing them all and selecting the proper dictionary later in the modules:
import helper_0
import helper_1
...
helperList = [helper_0, helper_1, helper_2...]
...
helper = helperList[check_situation()]
One way is to use import x, without using "from" keyword. So then you refer to things with their namespace everywhere.
Is there any other way? Something like the C++ include-guard idiom (#ifndef __b__ / #define __b__)?
Merge any pair of modules that depend on each other into a single module. Then introduce extra modules to get the old names back.
E.g.,
# a.py
from b import B
class A: whatever
# b.py
from a import A
class B: whatever
becomes
# common.py
class A: whatever
class B: whatever
# a.py
from common import A
# b.py
from common import B
Circular imports are a "code smell," and often (but not always) indicate that some refactoring would be appropriate. E.g., if A.x uses B.y and B.y uses A.z, then you might consider moving A.z into its own module.
If you do think you need circular imports, then I'd generally recommend importing the module and referring to objects with fully qualified names (i.e., import A and use A.x rather than from A import x).
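The difference between the two import styles in a cycle can be checked with a small experiment. This is a sketch under stated assumptions: the four modules (circ_a to circ_d) are made-up names written to a temporary directory. The first pair uses plain import with qualified names; the second pair uses from ... import, which needs the name to exist while the other module is still only partially initialized.

```python
import importlib
import os
import sys
import tempfile

# Four small modules with made-up names, written to a temporary directory.
SOURCES = {
    # Pair 1: plain "import" plus qualified names; the cycle is harmless.
    "circ_a.py": "import circ_b\ndef name(): return 'A'\n"
                 "def greet(): return 'a sees ' + circ_b.name()\n",
    "circ_b.py": "import circ_a\ndef name(): return 'B'\n"
                 "def greet(): return 'b sees ' + circ_a.name()\n",
    # Pair 2: "from ... import name" needs the name to exist at import time.
    "circ_c.py": "from circ_d import name_d\ndef name_c(): return 'C'\n",
    "circ_d.py": "from circ_c import name_c\ndef name_d(): return 'D'\n",
}
tmpdir = tempfile.mkdtemp()
for filename, source in SOURCES.items():
    with open(os.path.join(tmpdir, filename), "w") as f:
        f.write(source)
sys.path.insert(0, tmpdir)

circ_a = importlib.import_module("circ_a")
greetings = (circ_a.greet(), sys.modules["circ_b"].greet())

try:
    importlib.import_module("circ_c")
    from_import_failed = False
except ImportError:
    from_import_failed = True
```

The qualified lookups succeed because circ_b.name is only resolved when greet() is called, by which time both modules have finished loading.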
If you're trying to do from A import *, the answer is very simple: Don't do that. You're usually supposed to do import A and refer to the qualified names.
For quick&dirty scripts, and interactive sessions, that's a perfectly reasonable thing to do—but in such cases, you won't run into circular imports.
There are some cases where it makes sense to do import * in real code. For example, if you want to hide a module structure that's complex, or that you generate dynamically, or that changes frequently between versions, or if you're wrapping up someone else's package that's too deeply nested, import * may make sense from a "wrapper module" or a top-level package module. But in that case, nothing you import will be importing you.
In fact, I'm having a hard time imagining any case where import * is warranted and circular dependencies are even a possibility.
If you're doing from A import foo, there are ways around that (e.g., import A then foo = A.foo). But you probably don't want to do that. Again, consider whether you really need to bring foo into your namespace—qualified names are a feature, not a problem to be worked around.
If you're doing the from A import foo just for convenience in implementing your functions, because A is actually long_package_name.really_long_module_name and your code is unreadable because of all those calls to long_package_name.really_long_module_name.long_class_name.class_method_that_puts_me_over_80_characters, remember that you can always import long_package_name.really_long_module_name as P and then use P for your qualified calls.
(Also, remember that with any from done for implementation convenience, you probably want to make sure to specify a __all__ to make sure the imported names don't appear to be part of your namespace if someone does an import * on you from an interactive session.)
Also, as others have pointed out, most, but not all, cases of circular dependencies, are a symptom of bad design, and refactoring your modules in a sensible way will fix it. And in the rare cases where you really do need to bring the names into your namespace, and a circular set of modules is actually the best design, some artificial refactoring may still be a better choice than foo = A.foo.
I'm creating a class to extend a package, and prior to class instantiation I don't know which subset of the package's namespace I need. I've been careful about avoiding namespace conflicts in my code, so, does
from package import *
create problems besides name conflicts?
Is it better to examine the class's input and import only the names I need (at runtime) in the __init__?
Can Python import from a set []?
Does
for name in [namespace, namespace]:
    from package import name
make any sense?
I hope this question doesn't seem like unnecessary hand-wringing. I'm just super new to Python and don't want to do the one thing every beginner's guide says not to do (from pkg import *) unless I'm sure there's no alternative.
Thoughts and advice welcome.
In order:
It does not create other problems - however, name conflicts can be much more of a problem than you'd expect.
Definitely defer your imports if you can. Even though Python variable scoping is simplistic, you also gain the benefit of not having to import the module if the functionality that needs it never gets called.
I don't know what you mean. Square brackets are used to make lists, not sets. You can import multiple names from a module in one line - just use a comma-delimited list:
from awesome_module import spam, ham, eggs, baked_beans
# awesome_module defines lots of other names, but they aren't pulled in.
No, that won't do what you want - name is an identifier, and as such, each time through the loop the code will attempt to import the name name, and not the name that corresponds to the string referred to by the name variable.
However, you can get this kind of "dynamic import" effect, using the __import__ function. Consult the documentation for more information, and make sure you have a real reason for using it first. We get into some pretty advanced uses of the language here pretty quickly, and it usually isn't as necessary as it first appears. Don't get too clever. We hates them tricksy hobbitses.
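For the case in the question, where the wanted names are only known at runtime, one option is importlib.import_module (which the Python documentation recommends over calling __import__ directly) combined with getattr. A minimal sketch, with import_names as a hypothetical helper and math standing in for the real package:

```python
import importlib


def import_names(module_name, names):
    """Import module_name and return {name: object} for the requested names."""
    module = importlib.import_module(module_name)
    return {name: getattr(module, name) for name in names}


wanted = ["sqrt", "floor"]  # known only at runtime in the real use case
picked = import_names("math", wanted)
root = picked["sqrt"](9)
```

Keeping the results in a dictionary, rather than injecting them into the module's globals, avoids the name-conflict problem from the first point.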
When importing * you get everything in the module dumped straight into your namespace. This is not always a good thing, as you could accidentally overwrite something, like:
from time import *
sleep = None
This would render the time.sleep function useless...
The other way of taking functions, variables and classes from a module would be saying
from time import sleep
This is a nicer way but often the best way is to just import the module and reference the module directly like
import time
time.sleep(3)
You can import like this: from PIL import Image, ImageDraw
What is imported by from x import * is limited to the list __all__ in x, if it exists.
Importing at runtime, when the module name isn't known or fixed in the code, must be done with __import__, but you shouldn't have to do that.
These syntax constructions help you avoid name collisions:
from package import somename as another_name
import package as another_package_name
Suppose you have 3 modules, a.py, b.py, and c.py:
a.py:
v1 = 1
v2 = 2
etc.
b.py:
from a import *
c.py:
from a import *
v1 = 0
Will c.py change v1 in a.py and b.py? If not, is there a way to do it?
All that a statement like:
v1 = 0
can do is bind the name v1 to the object 0. It can't affect a different module.
If I'm using unfamiliar terms there, and I guess I probably am, I strongly recommend you read Fredrik Lundh's excellent article Python Objects: Reset your brain.
The from ... import * form is basically intended for handy interactive use at the interpreter prompt: you'd be well advised to never use it in other situations, as it will give you nothing but problems.
In fact, the in-house style guide at my employer goes further, recommending to always import a module, never contents from within a module (a module from within a package is OK and in fact recommended). As a result, in our codebase, references to imported things are always qualified names (themod.thething) and never barenames (which always refer to builtin, globals of this same module, or locals); this makes the code much clearer and more readable and avoids all kinds of subtle anomalies.
Of course, if a module's name is too long, an as clause in the import, to give it a shorter and handier alias for the purposes of the importing module, is fine. But, with your one-letter module names, that won't be needed;-).
So, if you follow the guideline and always import the module (and not things from inside it), c.v1 will always be referring to the same thing as a.v1 and b.v1, both for getting AND setting: here's one potential subtle anomaly avoided right off the bat!-)
Remember the very last bit of the Zen of Python (do import this at the interpreter prompt to see it all):
Namespaces are one honking great idea -- let's do more of those!
Importing the whole module (not bits and pieces from within it) preserves its integrity as a namespace, as does always referring to things inside the imported module by qualified (dotted) names. It's one honking great idea: do more of that!-)
Yes, you just need to access it correctly (and don't use import *, it's evil).
c.py:
import a
print(a.v1)  # prints 1
a.v1 = 0
print(a.v1)  # prints 0
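Both behaviours discussed in this thread (rebinding a star-imported name versus assigning through the module) can be compared side by side. In this sketch a_demo is a made-up in-memory stand-in for a.py, and the dictionaries b_ns and c_ns play the roles of the namespaces of b.py and c.py.

```python
import sys
import types

# In-memory stand-in for a.py ("a_demo" is a made-up name).
a = types.ModuleType("a_demo")
a.v1, a.v2 = 1, 2
sys.modules["a_demo"] = a

b_ns, c_ns = {}, {}  # play the roles of b.py and c.py
exec("from a_demo import *", b_ns)
exec("from a_demo import *\nv1 = 0", c_ns)
after_rebinding = (a.v1, b_ns["v1"], c_ns["v1"])  # only c's own name changed

exec("import a_demo\na_demo.v1 = 0", c_ns)
after_qualified = (a.v1, b_ns["v1"])  # a changed; b's star-imported copy did not
```

Note that b only sees the new value if it also reads a_demo.v1 through the module, as the answers above recommend.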