Why is "import" implemented this way? - python

>>> import math
>>> math.pi
3.141592653589793
>>> math.pi = 3
>>> math.pi
3
>>> import math
>>> math.pi
3
Initial question: Why can't I get math.pi back?
I thought import would import all the defined variables and functions into the current scope, and if a name already existed in the current scope, it would be replaced.
Yes, it does replace it:
>>> pi = 3
>>> from math import *
>>> pi
3.141592653589793
Then I thought maybe the math.pi = 3 assignment actually changed the property in the math class (or is it the math module?), which import math then imported.
I was right:
>>> import math
>>> math.pi
3.141592653589793
>>> math.pi = 3
>>> from math import *
>>> pi
3
So, it seems that:
If you do import x, it imports x as a single class-like object, and if you change x.property, the change persists in the module, so every time you import it again you get the modified version.
Real question:
Why is import implemented this way? Why not let every import math import a fresh, unmodified copy of math? Why leave the imported math open to change?
Is there any workaround to get math.pi back after doing math.pi = 3 (except math.pi = 3.141592653589793, of course)?
Originally I thought import math was preferred over from math import *. But this behaviour leaves me worrying that someone else might be modifying my imported module if I do it this way... How should I do the import?

Python only creates one copy of any given module. Importing a module repeatedly reuses the original. This is because if modules A and B imported C and D, which imported E and F, etc., C and D would get loaded twice, and E and F would get loaded 4 times, etc. With any but the most trivial of dependency graphs, you'd spend a few minutes loading redundant modules before running out of memory. Also, if A imported B and B imported A, you'd get stuck in a recursive loop and, again, run out of memory without doing anything useful.
The solution: Don't screw with the contents of other modules. If you do, that's an interpreter-wide change. There are occasionally situations where you'd want to do this, so Python lets you, but it's usually a bad idea.

A module may be imported many times. An import statement just loads the reference from sys.modules. If the import statement also reloaded the module from disk, it would be quite slow. Modifying a module like this is very unusual and is only done under rare, documented circumstances, so there’s no need to worry.
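A quick way to see that caching for yourself (a small sketch using only the standard library):
>>> import sys
>>> import math
>>> sys.modules['math'] is math
True
>>> import math as math_again
>>> math_again is math
True
Every importer gets that one shared object, which is why an assignment like math.pi = 3 is visible everywhere.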
How to reload a module:
>>> import importlib
>>> importlib.reload(math)
<module 'math' (built-in)>
>>> math.pi
3.141592653589793

The import behavior is intended to allow modules to have state. For example, a module that runs initialization code may have all sorts of different behaviors based on what happens at init time (a good example is the os module, which transparently loads different versions of the path submodule depending on what OS you're on). The usual behavior exists to allow lots of different code to access the module without re-running the initialization over and over. Moreover, modules function sort of like static classes in other languages: they can maintain state and are often used as an alternative to global variables. For example, you might use the locale module to set local culture variables (currency format, etc.); calling locale.setlocale in one part of your code and locale.getlocale in another is a nice alternative to making a global variable.
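As a rough illustration of the module-as-state idea, here is a minimal sketch; the settings module and its functions are made up for this example, loosely modelled on the locale pattern:
# settings.py -- a hypothetical module used as a shared, stateful "singleton"
_currency = "USD"   # module-level state, initialised once on first import

def set_currency(code):
    # rebinding the module-level name changes it for every importer
    global _currency
    _currency = code

def get_currency():
    return _currency

# elsewhere in the program:
#   import settings
#   settings.set_currency("EUR")
#   any other module that imports settings now sees "EUR" too,
#   because they all share the single cached module object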
Your example, of course, points out the weakness. One of the classic Python principles is
We're all adults here
The language does not provide many of the privacy-management features you'd find in, say, Java or C#, which let the author lock down the contents of a module or class. You can, if you're feeling malicious (or just suicidal), do exactly the sort of thing done in your example: change pi to equal 3, turn a function into a variable, or all sorts of other nasty stuff. The language is not designed to make that hard; it's up to coders to be responsible.
Josh Lee's answer shows how to use reload, which is the correct way of refreshing a module to its on-disk state. The wisdom of using reload depends mostly on how much init code is in the module, and also on the web of other modules that import or are imported by the module in question.

Related

Should I use from tkinter import * or import tkinter as tk? [duplicate]

It is recommended not to use import * in Python.
Can anyone please share the reason for that, so that I can avoid it doing next time?
Because it puts a lot of stuff into your namespace (it might shadow some other object from a previous import and you won't know about it).
Because you don't know exactly what is imported and can't easily find from which module a certain thing was imported (readability).
Because you can't use cool tools like pyflakes to statically detect errors in your code.
According to the Zen of Python:
Explicit is better than implicit.
... can't argue with that, surely?
You don't pass **locals() to functions, do you?
Since Python lacks an "include" statement, and the self parameter is explicit, and scoping rules are quite simple, it's usually very easy to point a finger at a variable and tell where that object comes from -- without reading other modules and without any kind of IDE (which are limited in the way of introspection anyway, by the fact the language is very dynamic).
The import * breaks all that.
Also, it has a concrete possibility of hiding bugs.
import os, sys, foo, sqlalchemy, mystuff
from bar import *
Now, if the bar module has any attributes named "os", "mystuff", etc., they will override the explicitly imported ones, and possibly point to very different things. Defining __all__ in bar is often wise, since it states what will implicitly be imported, but it is still hard to trace where objects come from without reading and parsing the bar module and following its imports. A network of import * is the first thing I fix when I take ownership of a project.
Don't misunderstand me: if the import * were missing, I would cry to have it. But it has to be used carefully. A good use case is to provide a facade interface over another module.
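For what it's worth, here is a minimal sketch of such a facade module; the file name pathtools.py is invented, and the re-exported names come from the standard library:
# pathtools.py -- a tiny facade that re-exports a curated public interface
from os.path import join, dirname, basename, splitext

# __all__ pins down exactly what `from pathtools import *` brings in,
# so the wildcard import stays predictable for callers
__all__ = ["join", "dirname", "basename", "splitext"]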
Likewise, the use of conditional import statements, or imports inside function/class namespaces, requires a bit of discipline.
I think that in medium-to-big projects, or small ones with several contributors, a minimum of hygiene is needed in terms of static analysis -- running at least pyflakes, or better, a properly configured pylint -- to catch several kinds of bugs before they happen.
Of course, since this is Python, feel free to break rules and to explore, but be wary of projects that could grow tenfold: if the source code lacks discipline, it will be a problem.
That is because you are polluting the namespace. You will import all the functions and classes into your own namespace, which may clash with the functions you define yourself.
Furthermore, I think using a qualified name is more clear for the maintenance task; you see on the code line itself where a function comes from, so you can check out the docs much more easily.
In module foo:
def myFunc():
    print(1)
In your code:
from foo import *
def doThis():
    myFunc() # Which myFunc is called?

def myFunc():
    print(2)
It is OK to do from ... import * in an interactive session.
Say you have the following code in a module called foo:
from xml.etree import ElementTree as etree
and then in your own module you have:
from lxml import etree
from foo import *
You now have a difficult-to-debug module that looks like it has lxml's etree in it, but really has ElementTree instead.
I understand the valid points people have made here. However, I do have one argument that, sometimes, a "star import" may not be a bad practice:
When I want to structure my code in such a way that all the constants go to a module called const.py:
If I do import const, then for every constant I have to refer to it as const.SOMETHING, which is probably not the most convenient way.
If I do from const import SOMETHING_A, SOMETHING_B ..., then obviously it's way too verbose and defeats the purpose of the structuring.
Thus I feel in this case, doing a from const import * may be a better choice.
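A small sketch of that layout (the constant names here are invented for illustration):
# const.py
MAX_RETRIES = 5
TIMEOUT_SECONDS = 30
DEFAULT_ENCODING = "utf-8"

# an explicit __all__ keeps `from const import *` limited to the real constants
__all__ = ["MAX_RETRIES", "TIMEOUT_SECONDS", "DEFAULT_ENCODING"]

# client code
# from const import *
# print(MAX_RETRIES, TIMEOUT_SECONDS)
A common middle ground is import const as c, which keeps a qualifier but shortens it to c.MAX_RETRIES.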
http://docs.python.org/tutorial/modules.html
Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code.
These are all good answers. I'm going to add that when teaching new people to code in Python, dealing with import * is very difficult. Even if you or they didn't write the code, it's still a stumbling block.
I teach children (about 8 years old) to program in Python to manipulate Minecraft. I like to give them a helpful coding environment to work with (Atom Editor) and teach REPL-driven development (via bpython). In Atom I find that the hints/completion work just as effectively as bpython. Luckily, unlike some other static analysis tools, Atom is not fooled by import *.
However, let's take this example... In this wrapper they from local_module import * a bunch of modules, including this list of blocks. Let's ignore the risk of namespace collisions. By doing from mcpi.block import * they make this entire list of obscure block types something that you have to go look at to know what is available. If they had instead used from mcpi import block, then you could type walls = block. and an autocomplete list would pop up.
It is a very BAD practice for two reasons:
Code Readability
Risk of overriding variables/functions, etc.
For point 1:
Let's see an example of this:
from module1 import *
from module2 import *
from module3 import *
a = b + c - d
Here, on seeing the code, no one can tell which modules b, c and d actually come from.
On the other hand, if you do it like this:
# v v will know that these are from module1
from module1 import b, c # way 1
import module2 # way 2
a = b + c - module2.d
# ^ will know it is from module2
It is much cleaner for you, and a new person joining your team will also get a better idea of what is going on.
For point 2: Let's say both module1 and module2 have a variable b. When I do:
from module1 import *
from module2 import *
print(b) # will print the value from module2
Here the value from module1 is lost. It will be hard to debug why the code is not working, even though b is declared in module1 and I wrote the code expecting it to use module1.b.
If you have the same variable names in different modules and you do not want to import the entire module, you can even do:
from module1 import b as mod1b
from module2 import b as mod2b
As a test, I created a module test.py with 2 functions A and B, which respectively print "A 1" and "B 1". After importing test.py with:
import test
...I can run the 2 functions as test.A() and test.B(), and "test" shows up as a module in the namespace, so if I edit test.py I can reload it with:
import importlib
importlib.reload(test)
But if I do the following:
from test import *
there is no reference to "test" in the namespace, so there is no way to reload it after an edit (as far as I can tell), which is a problem in an interactive session. Whereas either of the following:
import test
import test as tt
will add "test" or "tt" (respectively) as module names in the namespace, which will allow re-loading.
If I do:
from test import *
the names "A" and "B" show up in the namespace as functions. If I edit test.py, and repeat the above command, the modified versions of the functions do not get reloaded.
And the following command elicits an error message.
importlib.reload(test) # Error - name 'test' is not defined
If someone knows how to reload a module loaded with "from module import *", please post. Otherwise, this would be another reason to avoid the form:
from module import *
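For what it's worth, one possible workaround (a sketch; it relies on the module object still being cached in sys.modules even though no name in the current namespace refers to it):
import importlib, sys

importlib.reload(sys.modules["test"])  # re-executes test.py from disk

# the names A and B in this namespace still point at the old function
# objects, so the wildcard import has to be repeated to rebind them
from test import *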
As suggested in the docs, you should (almost) never use import * in production code.
While importing * from a module is bad, importing * from a package is probably even worse.
By default, from package import * imports whatever names are defined by the package's __init__.py, including any submodules of the package that were loaded by previous import statements.
If a package’s __init__.py code defines a list named __all__, it is taken to be the list of submodule names that should be imported when from package import * is encountered.
Now consider this example (assuming there's no __all__ defined in sound/effects/__init__.py):
# anywhere in the code before import *
import sound.effects.echo
import sound.effects.surround
# in your module
from sound.effects import *
The last statement will import the echo and surround modules into the current namespace (possibly overriding previous definitions) because they are defined in the sound.effects package when the import statement is executed.
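A sketch of the safer variant, using the tutorial's own package layout:
# sound/effects/__init__.py
# with __all__ defined, `from sound.effects import *` imports exactly these
# submodules, regardless of what happened to be imported earlier
__all__ = ["echo", "surround", "reverse"]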

Python: variables shared between modules and namespaces

After reading
Python - Visibility of global variables in imported modules
I was curious about this example:
import shared_stuff
import module1
shared_stuff.a = 3
module1.f()
If there are no other variables named "a" anywhere else, why is the following not equivalent?
from shared_stuff import *
import module1
a = 3
module1.f()
We leave out "explicit is better than implicit": I am asking out of curiosity, as I prefer the first syntax anyway.
I come from C and it looks like I didn't fully grasp Python's namespace's subtleties.
Even a link to docs where this namespace's behaviour is explained is enough.
Importing * copies all the names (references) from the module into the current scope; after that there is no connection to the original module at all. Rebinding a in your module only changes your local name; it does not touch shared_stuff.a, which is what module1.f() presumably looks at.
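To make that concrete, here is a small sketch reusing the shared_stuff name from the question (its contents are invented here):
# shared_stuff.py (hypothetical contents)
a = 0

# main.py
import shared_stuff
from shared_stuff import *   # copies the current value of `a` into this namespace

a = 3                        # rebinds only the local name "a"
print(shared_stuff.a)        # 0: the module attribute is untouched

shared_stuff.a = 3           # this is what module1.f() would actually see
print(shared_stuff.a)        # 3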

In python, what's the difference between math.sin(x) and sin(x)?

Which one is better:
import math
math.sin(x)
or
from math import *
sin(x)
And what's the difference?
Try to avoid using a wildcard from module import *, since you don't know exactly what you have imported into your current namespace. That can lead to confusion if there are conflicts between the names of such items.
You should use from math import sin since it would make it very clear that you only require sin.
Choosing between import math with math.sin and from math import sin is more of a personal choice, unless there is another variable or function with the same name in the namespace. If that is the case, then using math.sin is the better way.
It's mostly about readability, so that you know where sin came from.
However, there is the danger of importing modules with clashing names. If another module has something named 'sin' that you also import with the * wildcard, then you will actually only have one of them.
Using math.sin is extra explicit to avoid those cases. And when you write larger programs where you've imported whole modules indiscriminately, it's easier to miss that you have clashing names.

When is 'from SomeFile import *' okay

This is sort of a best-practice question. If I have a large number of custom classes, but don't want them in my main program, is it acceptable to stick 'from someFile import *' at the top? I know that I won't be doing anything to edit/redefine the classes, functions and variables from that file.
from someFile import *
a = someFunction()
#Other stuff
someFile would just contain various custom classes and functions that I know work and don't need to scroll past every time I'm working in the program. As long as I'm careful, is there any reason not to do this?
If you are using a lot of classes, it is usually safer to avoid this syntax, especially with third-party classes, because some of them may have members with the same names (e.g. sin, cos) and you can get strange behavior.
In my opinion, it is acceptable to use this syntax when you provide an example of how to use your code; importing like that keeps the example focused on the functionality you want to show.
Personally, I try to avoid this syntax and prefer to explicitly name the "right" class. If you don't like writing long class/module names, try loading them under an alias, like:
import LongModuleName as LM
http://www.python.org/dev/peps/pep-0008/#imports
According to PEP 8:
There is one defensible use case for a wildcard import, which is to republish an internal interface as part of a public API (for example, overwriting a pure Python implementation of an interface with the definitions from an optional accelerator module and exactly which definitions will be overwritten isn't known in advance).
In my opinion, probably the most useful case for the from someFile import * syntax is when you are using the Python interpreter interactively. Imagine you want to do some quick math:
$ python
>>> from math import *
>>> sin(4.0)
This is especially useful when using pylab and ipython to turn your ipython session into a MATLAB clone with just the line from pylab import *.
One problem with from module import * is that it makes it difficult to see where different functions came from, compare:
import random
random.shuffle(...)
from random import shuffle
shuffle(...)
from random import *
shuffle(...)
Another is the risk of name collisions: if two modules have classes or functions with the same name, import * will shadow one with the other. With from x import y you can see right at the top that you might need from z import y as y2.
Using from someFile import * is fine as long as you're aware you're importing everything into the namespace of that module. This could lead to clashes if you happen to be reusing names.
Consider only importing the classes and functions that you really use in that module as it's more explicit for others who may be reading the source code. It also makes it easier to figure out where a function is defined: if I see a function and can't locate an explicit import line in the same file then I'm forced to grep / search more broadly to see where the function is defined.

Avoiding circular (cyclic) imports in Python?

One way is to use import x, without the "from" keyword; then you refer to things through their module namespace everywhere.
Is there any other way? Something like C++'s #ifndef include-guard kind of thing?
Merge any pair of modules that depend on each other into a single module. Then introduce extra modules to get the old names back.
E.g.,
# a.py
from b import B
class A: ...
# b.py
from a import A
class B: ...
becomes
# common.py
class A: ...
class B: ...
# a.py
from common import A
# b.py
from common import B
Circular imports are a "code smell," and often (but not always) indicate that some refactoring would be appropriate. E.g., if A.x uses B.y and B.y uses A.z, then you might consider moving A.z into its own module.
If you do think you need circular imports, then I'd generally recommend importing the module and referring to objects with fully qualified names (i.e, import A and use A.x rather than from A import x).
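Here is a minimal sketch of that qualified-name approach (file names invented; the point is that neither module reaches into the other at import time, only at call time):
# a.py
import b            # binds the module object; b does not need to be fully loaded yet

def value():
    return 10

def use_b():
    return b.helper() + 1   # attribute looked up at call time

# b.py
import a

def helper():
    return a.value() * 2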
If you're trying to do from A import *, the answer is very simple: Don't do that. You're usually supposed to do import A and refer to the qualified names.
For quick&dirty scripts, and interactive sessions, that's a perfectly reasonable thing to do—but in such cases, you won't run into circular imports.
There are some cases where it makes sense to do import * in real code. For example, if you want to hide a module structure that's complex, or that you generate dynamically, or that changes frequently between versions, or if you're wrapping up someone else's package that's too deeply nested, import * may make sense from a "wrapper module" or a top-level package module. But in that case, nothing you import will be importing you.
In fact, I'm having a hard time imagining any case where import * is warranted and circular dependencies are even a possibility.
If you're doing from A import foo, there are ways around that (e.g., import A then foo = A.foo). But you probably don't want to do that. Again, consider whether you really need to bring foo into your namespace—qualified names are a feature, not a problem to be worked around.
If you're doing the from A import foo just for convenience in implementing your functions, because A is actually long_package_name.really_long_module_name and your code is unreadable because of all those calls to long_package_name.really_long_module_name.long_class_name.class_method_that_puts_me_over_80_characters, remember that you can always import long_package_name.really_long_module_name as P and then use P for your qualified calls.
(Also, remember that with any from done for implementation convenience, you probably want to make sure to specify a __all__ to make sure the imported names don't appear to be part of your namespace if someone does an import * on you from an interactive session.)
Also, as others have pointed out, most, but not all, cases of circular dependencies are a symptom of bad design, and refactoring your modules in a sensible way will fix it. And in the rare cases where you really do need to bring the names into your namespace, and a circular set of modules is actually the best design, some artificial refactoring may still be a better choice than foo = A.foo.
