Python: variables shared betwen modules and namespaces - python

After reading at
Python - Visibility of global variables in imported modules
I was curious about this example:
import shared_stuff
import module1
shared_stuff.a = 3
module1.f()
If there are no other variables "a" anywhere else, why the following one is not equivalent?
from shared_stuff import *
import module1
a = 3
module1.f()
We leave out "explicit is better than implicit": I am asking out of curiosity, as I prefer the first syntax anyway.
I come from C and it looks like I didn't fully grasp Python's namespace's subtleties.
Even a link to docs where this namespace's behaviour is explained is enough.

Importing * copies all the references from the module into the current scope; there is no connection to the original module at all.

Related

Should I use from tkinter import * or import tkinter as tk? [duplicate]

It is recommended to not to use import * in Python.
Can anyone please share the reason for that, so that I can avoid it doing next time?
Because it puts a lot of stuff into your namespace (might shadow some other object from previous import and you won't know about it).
Because you don't know exactly what is imported and can't easily find from which module a certain thing was imported (readability).
Because you can't use cool tools like pyflakes to statically detect errors in your code.
According to the Zen of Python:
Explicit is better than implicit.
... can't argue with that, surely?
You don't pass **locals() to functions, do you?
Since Python lacks an "include" statement, and the self parameter is explicit, and scoping rules are quite simple, it's usually very easy to point a finger at a variable and tell where that object comes from -- without reading other modules and without any kind of IDE (which are limited in the way of introspection anyway, by the fact the language is very dynamic).
The import * breaks all that.
Also, it has a concrete possibility of hiding bugs.
import os, sys, foo, sqlalchemy, mystuff
from bar import *
Now, if the bar module has any of the "os", "mystuff", etc... attributes, they will override the explicitly imported ones, and possibly point to very different things. Defining __all__ in bar is often wise -- this states what will implicitly be imported - but still it's hard to trace where objects come from, without reading and parsing the bar module and following its imports. A network of import * is the first thing I fix when I take ownership of a project.
Don't misunderstand me: if the import * were missing, I would cry to have it. But it has to be used carefully. A good use case is to provide a facade interface over another module.
Likewise, the use of conditional import statements, or imports inside function/class namespaces, requires a bit of discipline.
I think in medium-to-big projects, or small ones with several contributors, a minimum of hygiene is needed in terms of statical analysis -- running at least pyflakes or even better a properly configured pylint -- to catch several kind of bugs before they happen.
Of course since this is python -- feel free to break rules, and to explore -- but be wary of projects that could grow tenfold, if the source code is missing discipline it will be a problem.
That is because you are polluting the namespace. You will import all the functions and classes in your own namespace, which may clash with the functions you define yourself.
Furthermore, I think using a qualified name is more clear for the maintenance task; you see on the code line itself where a function comes from, so you can check out the docs much more easily.
In module foo:
def myFunc():
print 1
In your code:
from foo import *
def doThis():
myFunc() # Which myFunc is called?
def myFunc():
print 2
It is OK to do from ... import * in an interactive session.
Say you have the following code in a module called foo:
import ElementTree as etree
and then in your own module you have:
from lxml import etree
from foo import *
You now have a difficult-to-debug module that looks like it has lxml's etree in it, but really has ElementTree instead.
Understood the valid points people put here. However, I do have one argument that, sometimes, "star import" may not always be a bad practice:
When I want to structure my code in such a way that all the constants go to a module called const.py:
If I do import const, then for every constant, I have to refer it as const.SOMETHING, which is probably not the most convenient way.
If I do from const import SOMETHING_A, SOMETHING_B ..., then obviously it's way too verbose and defeats the purpose of the structuring.
Thus I feel in this case, doing a from const import * may be a better choice.
http://docs.python.org/tutorial/modules.html
Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code.
These are all good answers. I'm going to add that when teaching new people to code in Python, dealing with import * is very difficult. Even if you or they didn't write the code, it's still a stumbling block.
I teach children (about 8 years old) to program in Python to manipulate Minecraft. I like to give them a helpful coding environment to work with (Atom Editor) and teach REPL-driven development (via bpython). In Atom I find that the hints/completion works just as effectively as bpython. Luckily, unlike some other statistical analysis tools, Atom is not fooled by import *.
However, lets take this example... In this wrapper they from local_module import * a bunch modules including this list of blocks. Let's ignore the risk of namespace collisions. By doing from mcpi.block import * they make this entire list of obscure types of blocks something that you have to go look at to know what is available. If they had instead used from mcpi import block, then you could type walls = block. and then an autocomplete list would pop up.
It is a very BAD practice for two reasons:
Code Readability
Risk of overriding the variables/functions etc
For point 1:
Let's see an example of this:
from module1 import *
from module2 import *
from module3 import *
a = b + c - d
Here, on seeing the code no one will get idea regarding from which module b, c and d actually belongs.
On the other way, if you do it like:
# v v will know that these are from module1
from module1 import b, c # way 1
import module2 # way 2
a = b + c - module2.d
# ^ will know it is from module2
It is much cleaner for you, and also the new person joining your team will have better idea.
For point 2: Let say both module1 and module2 have variable as b. When I do:
from module1 import *
from module2 import *
print b # will print the value from module2
Here the value from module1 is lost. It will be hard to debug why the code is not working even if b is declared in module1 and I have written the code expecting my code to use module1.b
If you have same variables in different modules, and you do not want to import entire module, you may even do:
from module1 import b as mod1b
from module2 import b as mod2b
As a test, I created a module test.py with 2 functions A and B, which respectively print "A 1" and "B 1". After importing test.py with:
import test
. . . I can run the 2 functions as test.A() and test.B(), and "test" shows up as a module in the namespace, so if I edit test.py I can reload it with:
import importlib
importlib.reload(test)
But if I do the following:
from test import *
there is no reference to "test" in the namespace, so there is no way to reload it after an edit (as far as I can tell), which is a problem in an interactive session. Whereas either of the following:
import test
import test as tt
will add "test" or "tt" (respectively) as module names in the namespace, which will allow re-loading.
If I do:
from test import *
the names "A" and "B" show up in the namespace as functions. If I edit test.py, and repeat the above command, the modified versions of the functions do not get reloaded.
And the following command elicits an error message.
importlib.reload(test) # Error - name 'test' is not defined
If someone knows how to reload a module loaded with "from module import *", please post. Otherwise, this would be another reason to avoid the form:
from module import *
As suggested in the docs, you should (almost) never use import * in production code.
While importing * from a module is bad, importing * from a package is probably even worse.
By default, from package import * imports whatever names are defined by the package's __init__.py, including any submodules of the package that were loaded by previous import statements.
If a package’s __init__.py code defines a list named __all__, it is taken to be the list of submodule names that should be imported when from package import * is encountered.
Now consider this example (assuming there's no __all__ defined in sound/effects/__init__.py):
# anywhere in the code before import *
import sound.effects.echo
import sound.effects.surround
# in your module
from sound.effects import *
The last statement will import the echo and surround modules into the current namespace (possibly overriding previous definitions) because they are defined in the sound.effects package when the import statement is executed.

Why is "import" implemented this way?

>>> import math
>>> math.pi
3.141592653589793
>>> math.pi = 3
>>> math.pi
3
>>> import math
>>> math.pi
3
Initial question: Why can't I get math.pi back?
I thought import would import all the defined variables and functions to the current scope. And if a variable name already exists in current scope, then it would replace it.
Yes, it does replace it:
>>> pi = 3
>>> from math import *
>>> pi
3.141592653589793
Then I thought maybe the math.pi = 3 assignment actually changed the property in the math class(or is it math module?), which the import math imported.
I was right:
>>> import math
>>> math.pi
3.141592653589793
>>> math.pi = 3
>>> from math import *
>>> pi
3
So, it seems that:
If you do import x, then it imports x as a class-like thing. And if you make changes to x.property, the change would persist in the module so that every time you import it again, it's a modified version.
Real question:
Why is import implemented this way? Why not let every import math import a fresh, unmodified copy of math? Why leave the imported math open to change?
Is there any workaround to get math.pi back after doing math.pi = 3 (except math.pi = 3.141592653589793, of course)?
Originally I thought import math is preferred over from math import *. But this behaviour leaves me worrying someone else might be modifying my imported module if I do it this way...How should I do the import?
Python only creates one copy of any given module. Importing a module repeatedly reuses the original. This is because if modules A and B imported C and D, which imported E and F, etc., C and D would get loaded twice, and E and F would get loaded 4 times, etc. With any but the most trivial of dependency graphs, you'd spend a few minutes loading redundant modules before running out of memory. Also, if A imported B and B imported A, you'd get stuck in a recursive loop and, again, run out of memory without doing anything useful.
The solution: Don't screw with the contents of other modules. If you do, that's an interpreter-wide change. There are occasionally situations where you'd want to do this, so Python lets you, but it's usually a bad idea.
A module may be imported many times. An import statement just loads the reference from sys.modules. If the import statement also reloaded the module from disk, it would be quite slow. Modifying a module like this is very unusual and is only done under rare, documented circumstances, so there’s no need to worry.
How to reload a module:
>>> import imp
>>> imp.reload(math)
<module 'math' (built-in)>
>>> math.pi
3.141592653589793
The import behavior is intended to allow modules to have state. For example, a module that runs initialization code may have all sorts of different behaviors based on what happens at init time (a good example is the os module, which transparently loads different versions of the path submodule depending on what OS you're on). The usual behavior exists to allow lots of different code to access the module without re-running the initialization over and over. Moreover, modules function sort of like static classes in other languages - they can maintain state and are often used as an alternative to global variables: eg, you might use the locale module to set local culture variables (currency format, etc) -- calling locale.setlocale in one part of your code and local.getlocale in another is a nice alternative to making a global variable.
Your example, of course, points out the weakness. One of the classic python principes is
We're all adults here
The language does not provide much of the privacy management features you'd find in, say, Java or C# which let the author lock down the contents of a module or class. You can, if you're feeling malicious (or just suicidal) do exactly the sort of thing done in your example: change pi to equal 3, or turn a function into a variable, or all sorts of other nasty stuff. The language is not designed to make that hard -- it's up to coders to be responsible.
#Josh Lee's answer shows how to use reload, which is the correct way of refreshing a module to it's disk-based state. The wisdom of using reload depends mostly on how much init code is in the module, and also on the web of other modules which import or are imported by the module in question.

Importing class from another file in python - I know the fix, but why doesn't the original work?

I can make this code work, but I am still confused why it won't work the first way I tried.
I am practicing python because my thesis is going to be coded in it (doing some cool things with Arduino and PC interfaces). I'm trying to import a class from another file into my main program so that I can create objects. Both files are in the same directory. It's probably easier if you have a look at the code at this point.
#from ArduinoBot import *
#from ArduinoBot import ArduinoBot
import ArduinoBot
# Create ArduinoBot object
bot1 = ArduinoBot()
# Call toString inside bot1 object
bot1.toString()
input("Press enter to end.")
Here is the very basic ArduinoBot class
class ArduinoBot:
def toString(self):
print ("ArduinoBot toString")
Either of the first two commented out import statements will make this work, but not the last one, which to me seems the most intuitive and general. There's not a lot of code for stuff to go wrong here, it's a bit frustrating to be hitting these kind of finicky language specific quirks when I had heard some many good things about Python. Anyway I must be doing something wrong, but why doesn't the simple 'import ClassName' or 'import FileName' work?
Thank you for your help.
consider a file (example.py):
class foo(object):
pass
class bar(object):
pass
class example(object):
pass
Now in your main program, if you do:
import example
what should be imported from the file example.py? Just the class example? should the class foo come along too? The meaning would be too ambiguous if import module pulled the whole module's namespace directly into your current namespace.
The idea is that namespaces are wonderful. They let you know where the class/function/data came from. They also let you group related things together (or equivalently, they help you keep unrelated things separate!). A module sets up a namespace and you tell python exactly how you want to bring that namespace into the current context (namespace) by the way you use import.
from ... import * says -- bring everything in that module directly into my namespace.
from ... import ... as ... says, bring only the thing that I specify directly into my namespace, but give it a new name here.
Finally, import ... simply says bring that module into the current namespace, but keep it separate. This is the most common form in production code because of (at least) 2 reasons.
It prevents name clashes. You can have a local class named foo which won't conflict with the foo in example.py -- You get access to that via example.foo
It makes it easy to trace down which module a class came from for debugging.
consider:
from foo import *
from bar import *
a = AClass() #did this come from foo? bar? ... Hmmm...
In this case, to get access to the class example from example.py, you could also do:
import example
example_instance = example.example()
but you can also get foo:
foo_instance = example.foo()
The simple answer is that modules are things in Python. A module has its own status as a container for classes, functions, and other objects. When you do import ArduinoBot, you import the module. If you want things in that module -- classes, functions, etc. -- you have to explicitly say that you want them. You can either import them directly with from ArduinoBot import ..., or access them via the module with import ArduinoBot and then ArduinoBot.ArduinoBot.
Instead of working against this, you should leverage the container-ness of modules to allow you to group related stuff into a module. It may seem annoying when you only have one class in a file, but when you start putting multiple classes and functions in one file, you'll see that you don't actually want all that stuff being automatically imported when you do import module, because then everything from all modules would conflict with other things. The modules serve a useful function in separating different functionality.
For your example, the question you should ask yourself is: if the code is so simple and compact, why didn't you put it all in one file?
Import doesn't work quite the you think it does. It does work the way it is documented to work, so there's a very simple remedy for your problem, but nonetheless:
import ArduinoBot
This looks for a module (or package) on the import path, executes the module's code in a new namespace, and then binds the module object itself to the name ArduinoBot in the current namespace. This means a module global variable named ArduinoBot in the ArduinoBot module would now be accessible in the importing namespace as ArduinoBot.ArduinoBot.
from ArduinoBot import ArduinoBot
This loads and executes the module as above, but does not bind the module object to the name ArduinoBot. Instead, it looks for a module global variable ArduinoBot within the module, and binds whatever object that referred to the name ArduinoBot in the current namespace.
from ArduinoBot import *
Similarly to the above, this loads and executes a module without binding the module object to any name in the current namespace. It then looks for all module global variables, and binds them all to the same name in the current namespace.
This last form is very convenient for interactive work in the python shell, but generally considered bad style in actual development, because it's not clear what names it actually binds. Considering it imports everything global in the imported module, including any names that it imported at global scope, it very quickly becomes extremely difficult to know what names are in scope or where they came from if you use this style pervasively.
The module itself is an object. The last approach does in fact work, if you access your class as a member of the module. Either if the following will work, and either may be appropriate, depending on what else you need from the imported items:
from my_module import MyClass
foo = MyClass()
or
import my_module
foo = my_module.MyClass()
As mentioned in the comments, your module and class usually don't have the same name in python. That's more a Java thing, and can sometimes lead to a little confusion here.

Why bother to limit the types imported from a python package?

When using many IDEs that support autocompletion with Python, things like this will show warnings, which I find annoying:
from eventlet.green.httplib import BadStatusLine
When switching to:
from eventlet.green.httplib import *
The warnings go away. What's the benefit to limiting imports to a specific set of types you'll use? Is the parsing faster? Reduces collisions? What other point is there? It seems the state of python IDEs and the nature of the typing system makes it hard for many IDEs to fully get right when a type import works and when it doesn't.
By typing from foo import *, you import all the names defined in foo into the global namespace. This is bad practice because you could have name clashes both with other modules and with built-ins.
For example, consider a module foo
#foo.py
def open(something):
pass
and a module bar:
#bar.py
def open(something_else):
pass
Now, from foo import * hides the built-in function open() which means that any calls to open() now refer to foo.open() rather than the built-in. Worse, if you then have from bar import *, the function open() in bar now hides both the built-in and the function imported from foo.
In the example above, from foo import open is equally shadowing the built-in function, but one glance at the code tells you why you can't open files for IO anymore.
This is why you should import only specific names, ensuring that you know what names are imported. Alternatively, you could use fully qualified names (import foo; foo.open(), which is perfectly safe).
EDIT: Just as a note, this can be horribly compounded if the module you're importing also uses from x import *. In this case, not only do you typically import all the stuff in the module foo, but also all the stuff in the module x into the global namespace. This can very quickly turn into an absolute mess.
It reduces collisions with user-defined types, it reduces coupling and it's self-documenting, since it makes clear from the outset of the module which classes are coming from libraries (so the rest must be user-defined). The parsing is not faster, at least not in CPython: an imported module must be read in its entirety to look for the classes/functions being imported.
(I must admit that I never use an IDE.)

Python imports: Will changing a variable in "child" change variable in "parent"/other children?

Suppose you have 3 modules, a.py, b.py, and c.py:
a.py:
v1 = 1
v2 = 2
etc.
b.py:
from a import *
c.py:
from a import *
v1 = 0
Will c.py change v1 in a.py and b.py? If not, is there a way to do it?
All that a statement like:
v1 = 0
can do is bind the name v1 to the object 0. It can't affect a different module.
If I'm using unfamiliar terms there, and I guess I probably am, I strongly recommend you read Fredrik Lundh's excellent article Python Objects: Reset your brain.
The from ... import * form is basically intended for handy interactive use at the interpreter prompt: you'd be well advised to never use it in other situations, as it will give you nothing but problems.
In fact, the in-house style guide at my employer goes further, recommending to always import a module, never contents from within a module (a module from within a package is OK and in fact recommended). As a result, in our codebase, references to imported things are always qualified names (themod.thething) and never barenames (which always refer to builtin, globals of this same module, or locals); this makes the code much clearer and more readable and avoids all kinds of subtle anomalies.
Of course, if a module's name is too long, an as clause in the import, to give it a shorter and handier alias for the purposes of the importing module, is fine. But, with your one-letter module names, that won't be needed;-).
So, if you follow the guideline and always import the module (and not things from inside it), c.v1 will always be referring to the same thing as a.v1 and b.v1, both for getting AND setting: here's one potential subtle anomaly avoided right off the bat!-)
Remember the very last bit of the Zen of Python (do import this at the interpreter prompt to see it all):
Namespaces are one honking great idea -- let's do more of those!
Importing the whole module (not bits and pieces from within it) preserves its integrity as a namespace, as does always referring to things inside the imported module by qualified (dotted) names. It's one honking great idea: do more of that!-)
Yes, you just need to access it correctly (and don't use import *, it's evil)
c.py:
import a
print a.v1 # prints 1
a.v1 = 0
print a.v1 # prints 0

Categories