When we import a module in a Python script, does this copy all the required code into the script, or does it just let the script know where to find it?
What happens if we don't use the module then in the code, does it get optimized out somehow, like in C/C++?
None of those things are the case.
An import does two things. First, if the requested module has not previously been loaded, the import loads the module. This mostly boils down to creating a new global scope and executing the module's code in that scope to initialize the module. The new global scope is used as the module's attributes, as well as for global variable lookup for any code in the module.
Second, the import binds whatever names were requested. import whatever binds the whatever name to the whatever module object. import whatever.thing also binds the whatever name to the whatever module object. from whatever import somefunc looks up the somefunc attribute on the whatever module object and binds the somefunc name to whatever the attribute lookup finds.
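A minimal sketch of the name-binding step, using the standard library's os package as a stand-in for whatever. Each import statement is executed in its own empty dictionary (ns_plain, ns_dotted, ns_from and bound_names are names made up for this illustration) so that the names it binds can be inspected:

```python
import sys

# Each dict stands in for the global namespace of a separate script.
ns_plain, ns_dotted, ns_from = {}, {}, {}

exec("import os", ns_plain)
exec("import os.path", ns_dotted)
exec("from os import getcwd", ns_from)


def bound_names(namespace):
    # exec() inserts __builtins__ on its own; ignore it here.
    return sorted(name for name in namespace if name != "__builtins__")


print(bound_names(ns_plain))   # ['os']
print(bound_names(ns_dotted))  # ['os'] (the top-level name, not 'path')
print(bound_names(ns_from))    # ['getcwd']

# Both plain forms bind the very same module object cached in sys.modules.
print(ns_dotted["os"] is sys.modules["os"])  # True
```

Only the from form binds the attribute itself; the other two bind the top-level module and leave the rest to attribute access.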
Unused imports cannot be optimized out, because both the module loading and the name binding have effects that some other code might be relying on.
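As an illustration of why an unused import still matters, here is a small sketch that writes a throwaway module to a temporary directory and imports it twice. The module name side_effect_demo and the environment variable SIDE_EFFECT_DEMO_LOADS are hypothetical, chosen only for this example:

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "side_effect_demo"    # hypothetical module name
COUNTER = "SIDE_EFFECT_DEMO_LOADS"  # hypothetical environment variable

# The module body has a side effect: it bumps a counter every time it runs.
MODULE_SOURCE = (
    "import os\n"
    f"os.environ[{COUNTER!r}] = str(int(os.environ.get({COUNTER!r}, '0')) + 1)\n"
)

os.environ.pop(COUNTER, None)
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, MODULE_NAME + ".py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        first = importlib.import_module(MODULE_NAME)   # loads and runs the body
        second = importlib.import_module(MODULE_NAME)  # served from sys.modules
    finally:
        sys.path.remove(tmp)

load_count = os.environ[COUNTER]
print(load_count)                         # 1 (the body ran exactly once)
print(first is second)                    # True
print(sys.modules[MODULE_NAME] is first)  # True
```

The importing code never touches any attribute of the module, yet the side effect is observable, which is the kind of behaviour other code may rely on.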
I usually don't think too hard about variable scope in python, but I wanted to see if there's a clean explanation for this. Given two files called main.py and utils.py:
utils.py
def run():
    print(L)
main.py
import utils
def run():
    print(L)
if __name__ == '__main__':
    L = [1,2]
    run()
    utils.run()
The first run() call in main.py runs fine despite L not being fed into run(), and the utils.run() call raises a NameError. Is L a global variable available to all functions defined in main.py?
If I imported utils with from utils import * instead of import utils, would that change anything?
It's module-level scope. A global variable defined in a module is available to all functions defined in the same module (if it's not overridden). Functions in another module don't have access to another module's variables unless they import them.
About "If I imported utils with from utils import * instead of import utils, would that change anything?":
No. The scope is determined at parsing time.
See the Python tutorial's section on scopes and namespaces for more information.
Notably:
It is important to realize that scopes are determined textually: the global
scope of a function defined in a module is that module’s namespace, no matter
from where or by what alias the function is called. On the other hand, the
actual search for names is done dynamically, at run time [...]
So each function's global scope is the module it is defined in. main.py later defines the global variable L that its run() uses, while utils.py never does. When each function runs and looks up L, it checks its own module's globals: main.run() finds it, and utils.run() does not.
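A rough sketch of the same situation without separate files: two module objects are built by hand with types.ModuleType, and the same run() source is executed in each. The names main_demo and utils_demo are made up for this illustration and stand in for main.py and utils.py:

```python
import types

FUNCTION_SOURCE = "def run():\n    return L\n"

# Two hand-made modules standing in for main.py and utils.py.
main_mod = types.ModuleType("main_demo")
utils_mod = types.ModuleType("utils_demo")
exec(FUNCTION_SOURCE, main_mod.__dict__)
exec(FUNCTION_SOURCE, utils_mod.__dict__)

# Only the "main" module ever gets a global named L.
main_mod.L = [1, 2]

main_result = main_mod.run()  # finds L in main_demo's globals

utils_error = None
try:
    utils_mod.run()           # looks in utils_demo's globals only
except NameError as exc:
    utils_error = exc

# Roughly what "from utils import *" does: bind the function under a new
# name in another namespace. Its globals are still those of utils_demo.
main_mod.run_from_utils = utils_mod.run
same_globals = main_mod.run_from_utils.__globals__ is utils_mod.__dict__

print(main_result)                  # [1, 2]
print(type(utils_error).__name__)   # NameError
print(same_globals)                 # True
```

Rebinding the function into another namespace changes where the name run can be found, not where run looks for L.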
See Python's FAQ. Their implementation of scope is a compromise between convenience and the dangers of globals.
Variables that a function only reads are treated as globals; to rebind one inside the function body you must declare it explicitly (e.g. global foo). If you edit run() so that it assigns to L without that declaration, L becomes a local variable of run(), and reading it before the assignment raises an UnboundLocalError.
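A short sketch of that rule, meant to be run as a script. The function names are invented for this illustration:

```python
L = [1, 2]


def read_only():
    # No assignment to L in this function, so L is looked up as a global.
    return L


def rebind_without_global():
    # The assignment makes L local to this function, so the read on the
    # right-hand side happens before the local has a value.
    L = L + [3]
    return L


def rebind_with_global():
    global L
    L = L + [3]  # rebinds the module-level name
    return L


first_read = list(read_only())

rebind_error = None
try:
    rebind_without_global()
except UnboundLocalError as exc:
    rebind_error = exc

after_rebind = rebind_with_global()

print(first_read)                   # [1, 2]
print(type(rebind_error).__name__)  # UnboundLocalError
print(after_rebind)                 # [1, 2, 3]
```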
What's happening here is that your Python code imports utils, and then runs run(). This function sees that you're looking for a variable named "L," and checks your global namespace.
My tree looks like
parent/
|--__init__.py
\--a.py
And the content of __init__.py is
import parent.a as _a
a = 'some string'
When I open up a Python at the top level and import parent.a, I would get the string instead of module. For example import parent.a as the_a; type(the_a) == str.
So I think OK probably import is importing the name from the parent namespace, and it's now overridden. So I figure I can go import parent._a as a_module. But this doesn't work as there is "No module named _a".
This is very confusing. A function can override a module with the same name, but a module cannot take on a new name and "reexport".
Is there any explanation I'm not aware of? Or is this documented feature?
Even more confusing, if I remove the import statement in __init__.py, everything is back normal again (import parent.a; type(parent.a) is module). But why is this different? The a name in parent namespace is still a string.
(I ran on Python 3.5.3 and 2.7.13 with the same results)
In an import statement, the module reference never uses attribute lookups. The statements
import parent.a # as ...
and
from parent.a import ... # as ...
will always look for parent.a in the sys.modules namespace before trying to further initiate module loading from disk.
However, for from ... import name statements, Python does look at attributes of the resolved module to find name, before looking for submodules.
Module globals and the attributes on a module object are the same thing. On import, Python adds submodules as attributes (so globals) to the parent module, but you are free to overwrite those attributes, as you did in your code. However, when you then use an import with the parent.a module path, attributes do not come into play.
From the Submodules section of the Python import system reference documentation:
When a submodule is loaded using any mechanism [...] a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
Your import parent.a as _a statement adds two names to the parent namespace; first a is added pointing to the parent.a submodule, and then _a is also set, pointing to the same object.
Your next line replaces the name a with a binding to the 'some string' object.
The Searching section of the same documentation details how Python goes about finding a module when you import:
To begin the search, Python needs the fully qualified name of the module [...] being imported.
[...]
This name will be used in various phases of the import search, and it may be the dotted path to a submodule, e.g. foo.bar.baz. In this case, Python first tries to import foo, then foo.bar, and finally foo.bar.baz. If any of the intermediate imports fail, a ModuleNotFoundError is raised.
then further on
The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths. So if foo.bar.baz was previously imported, sys.modules will contain entries for foo, foo.bar, and foo.bar.baz. Each key will have as its value the corresponding module object.
During import, the module name is looked up in sys.modules and if present, the associated value is the module satisfying the import, and the process completes. [...] If the module name is missing, Python will continue searching for the module.
So when trying to import parent.a all that matters is that sys.modules['parent.a'] exists. sys.modules['parent'].a is not consulted.
Only from module import ... would ever look at attributes. From the import statement documentation:
The from form uses a slightly more complex process:
find the module specified in the from clause, loading and initializing it if necessary;
for each of the identifiers specified in the import clauses:
check if the imported module has an attribute by that name
if not, attempt to import a submodule with that name and then check the imported module again for that attribute
[...]
So from parent import _a would work, as would from parent import a, and you'd get the parent.a submodule and the 'some string' object, respectively.
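To make the difference concrete, here is a sketch that recreates the layout from the question in a temporary directory. The package is called parent_demo here (a made-up name, to avoid clashing with anything real), and a.py contains a single hypothetical marker variable:

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

PKG = "parent_demo"  # hypothetical package name standing in for parent

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, PKG)
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text(
        f"import {PKG}.a as _a\n"
        "a = 'some string'\n"
    )
    (pkg_dir / "a.py").write_text("marker = 'I am the submodule'\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        pkg = importlib.import_module(PKG)
        # The from form looks at attributes of the package first.
        scratch = {}
        exec(f"from {PKG} import a, _a", scratch)
    finally:
        sys.path.remove(tmp)

cached = sys.modules[PKG + ".a"]  # what the import machinery consults
is_module = isinstance(cached, types.ModuleType)

print(is_module)               # True
print(pkg.a)                   # some string
print(scratch["a"])            # some string
print(scratch["_a"] is cached) # True
```

The cache entry and the package attribute have drifted apart: sys.modules still holds the submodule, while the attribute a on the package is the string.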
Note that sys.modules is writable. If you must have import parent._a work, you can always just alter sys.modules directly:
import sys
sys.modules['parent._a'] = sys.modules['parent.a'] # make parent._a an alias for parent.a
import parent._a # works now
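The same trick can be tried without any package on disk. This sketch aliases the standard library's json module under the invented name json_alias_demo:

```python
import json
import sys

# Register an extra key for an already-loaded module.
sys.modules["json_alias_demo"] = sys.modules["json"]

# The import statement finds the key in sys.modules and never touches disk.
import json_alias_demo

alias_is_json = json_alias_demo is json
print(alias_is_json)  # True

# Remove the alias again so the cache is left as it was found.
del sys.modules["json_alias_demo"]
```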
I think I have a coherent understanding of this problem now, just documenting my findings in case others run into this.
What Martijn said above is mostly true. Expanding on that answer: import parent.a as _a is a two-step process. The first step is the module lookup of parent.a, which never goes through attribute lookup; it registers the module in sys.modules and then binds the module to the attribute a of parent. In fact this is all you get if you only use import parent.a. This part is described thoroughly by the previous answer.
The second part as _a does an attribute lookup of parent.a, and binds it onto the name _a. So to answer my original question, now if I go outside and start an interactive Python interpreter, now parent.a has been overwritten to the string in __init__.py, and import parent.a as the_a; the_a would get me the string. In fact, this is the same as import parent.a; parent.a. Both the_a and parent.a are the results of attribute lookup. I could still get the submodule by parent._a or sys.modules["parent.a"].
To answer my follow up question:
Even more confusing, if I remove the import statement in __init__.py, everything is back normal again (import parent.a; type(parent.a) is module). But why is this different? The a name in parent namespace is still a string.
This is because when I import parent.a in the outside interactive Python interpreter, it first evaluates __init__.py, which sets parent.a to a string. But the import hasn't finished yet: it goes on to import the submodule parent.a, and since we are still in the importing part, no attribute lookups happen, so we find the correct submodule. When all this is done, it binds the submodule to a of parent, thus overwriting the string and making it all correct again.
This sounds very confusing, but remember (https://docs.python.org/3/reference/import.html#submodules):
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
An import parent.a first runs all the module set-up code, and then binds the name.
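A sketch of that ordering, assuming a throwaway package named parent_rebind_demo (a made-up name) whose __init__.py only assigns the string and does not import the submodule itself:

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "parent_rebind_demo"  # hypothetical package name

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, PKG)
    pkg_dir.mkdir()
    # No import statement here, only the string assignment.
    (pkg_dir / "__init__.py").write_text("a = 'some string'\n")
    (pkg_dir / "a.py").write_text("marker = 'I am the submodule'\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        pkg = importlib.import_module(PKG)
        before = pkg.a  # __init__.py has run: a is the string
        submodule = importlib.import_module(PKG + ".a")
        after = pkg.a   # loading the submodule rebinds a on the package
    finally:
        sys.path.remove(tmp)

print(before)              # some string
print(after is submodule)  # True
```

The string is bound first, while __init__.py runs, and the submodule binding is placed afterwards, so the submodule wins.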
I have a module with the name my_module.py
Inside of this module there is a function myFunction:
def myFunction():
    print(my_variable)
When this function is called it prints my_variable, which is not defined anywhere yet, so calling myFunction() from inside the module itself raises a NameError.
Now, aside from my_module.py I have another script with the name my_app.py residing in the same folder.
Inside of my_app.py I am importing my_module and defining my_variable in its namespace. After my_variable is defined I call my_module.myFunction(), which picks up my_variable and prints its contents:
import my_module
my_module.my_variable = 'this variable is instantiated inside of another script'
my_module.myFunction()
While this approach works I wonder if it is designed properly. Is there another way to instantiate a variable outside the imported module to be used by this imported module?
import my_module
my_module.my_variable = 'this variable is instantiated inside of another script'
my_module.myFunction()
While this approach works I wonder if it is designed properly.
No, this is not designed properly. One proper way is to pass the value to the function explicitly.
Is there another way to instantiate a variable outside the imported module to be used by this imported module?
Just have another module where you declare the variable(s). For example my_vars.py:
my_variable = 'this variable is instantiated inside of another script'
Then in my_module.py:
import my_vars
def myFunction():
    print(my_vars.my_variable)
I'm not sure what you're trying to achieve, but it's generally best practice not to mutate "global" variables. Every time you want to use myFunction() in your code you have to explicitly set my_variable first, which can trigger side effects if other functions or methods depend on it. The best way would be to rewrite myFunction() so that it accepts my_variable as an argument.
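A minimal sketch of that rewrite. The snake_case name my_function and the returned message format are choices made for this illustration, not something taken from the original module:

```python
def my_function(my_variable):
    """Build the text to display; the value arrives as a parameter."""
    return "my_variable is: {}".format(my_variable)


# The caller decides what value to pass; no module attribute is patched.
message = my_function("this value is passed in by the caller")
print(message)  # my_variable is: this value is passed in by the caller
```

The function now works the same way whether it is called from inside its own module or from another script.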
I am using SimPy, and I try to simulate a network.
This is my main module:
from SimPy.Simulation import *
import node0
import message0
import network0
reload(message0)
reload(node0)
reload(network0)
initialize()
topology=network0.Network()
activate(topology, topology.operate())
node1=node0.Node(1)
node1.interface.send(destination='node1')
simulate(until=25)
I want an object of class Message, which is activated by an object of class Node, to interrupt an object of class Network (topology):
class Message(Process):
    def arrive(self, destination, myEvent=delay):
        self.destination=destination
        self.interrupt(topology)
But I'm getting an error:
NameError: global name 'topology' is not defined
And I don't know how to make an object global. And if I type topology in python shell then it shows me object topology, so why can't message see it?
I'm pretty sure the issue is that your Message class is defined in a different module than where your topology variable is. So called "global" variables in Python are not really global (in the sense that there's just one global namespace), but just at the top of a specific module's namespace. So the global variable topology in your main module's namespace is not accessible as a global variable from a different module.
My suggestion for working around this is to pass the topology value to the Message as a parameter to the __init__ method. If the message is being created by something other than your own code (e.g. by your Node class), you might need to pass it around a bit more, so that it will be available when needed.
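A sketch of that idea with plain Python classes. These are stand-ins written for this illustration only: they do not use SimPy's Process class or its interrupt mechanism, and the method name interrupted_by is hypothetical:

```python
class Network:
    """Stand-in for the question's Network class (not a SimPy Process)."""

    def __init__(self):
        self.interruptions = []

    def interrupted_by(self, message):
        # Hypothetical hook that records who interrupted the network.
        self.interruptions.append(message)


class Message:
    """Stand-in for the question's Message class (not a SimPy Process)."""

    def __init__(self, topology):
        # The network is handed in explicitly instead of read from a global.
        self.topology = topology
        self.destination = None

    def arrive(self, destination):
        self.destination = destination
        self.topology.interrupted_by(self)


topology = Network()
message = Message(topology)
message.arrive("node1")
print(len(topology.interruptions))  # 1
```

Because the Message carries its own reference, it no longer matters which module the topology object was created in.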
If that is not possible, you might be able to put the topology value in the namespace of a module that can be imported by your Message code. This can get messy though, as circular imports can break things if you're not careful.
I thought about this for a while and can't think of a better title, sorry.
I'm newish to Python, and (like many others, it seems) I just can't get my head around import.
I think I understand 'modules' and 'packages', classes and attributes and all that. It's one specific behavior I need clarified.
Say I have a file, foo.py. It has one line in it:
x = 1
If, in another file, I import foo, I can reference x. And, wonderfully, in another file I can import foo and now those two files can share x. Leaving classes out of the discussion for simplicity, I believe this is the pythonic way to share attributes between files.
Here's the question: Is it fair to say, when I import foo, that foo.py itself is (for lack of a better metaphor) secretly instantiated by the interpreter?
I realize if I define a class in a module, it follows traditional rules and only becomes instantiated if I explicitly do so. But the Python interpreter (via the import statement) instantiating an instance of my module in the global namespace is the only way to explain the attribute sharing behavior.
Is this true? Semi-true? Or am I wandering with the Sleestaks in the Land of the Lost?
When you import a module:
if the module has not been previously imported, the file is located and executed to build a module object, which is added to sys.modules under the module's fully qualified (dotted) name
that module object (or some member thereof) is aliased in the importing namespace, the alias and object being referenced being determined by the specific form of import you used
So when you import foo, the interpreter checks sys.modules for something registered with the name foo. If it finds it, it provides a label foo in the local namespace for the foo module. If it doesn't, it searches the entries of sys.path until it finds a foo module, executes it to build a module object, adds that object to sys.modules, and adds a label in the local namespace for that module object.
import foo as foof does the same thing, only the local namespace label created is foof. from foo import x follows the same process up to the point of creating a label and reference in the local namespace, instead providing a label x in the namespace for the attribute x from the foo module. from foo import x as foox just combines the 2 ideas.
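A sketch of these forms side by side. Instead of a real foo.py, a module object named foo_demo (a made-up name) is registered in sys.modules by hand, and three dictionaries stand in for the namespaces of three importing files:

```python
import sys
import types

# Hand-made stand-in for foo.py containing "x = 1".
foo = types.ModuleType("foo_demo")
foo.x = 1
sys.modules["foo_demo"] = foo

# Three dicts standing in for the namespaces of three importing files.
file_a, file_b, file_c = {}, {}, {}
exec("import foo_demo", file_a)
exec("import foo_demo as foof", file_b)
exec("from foo_demo import x as foox", file_c)

# Both module labels point at the one shared module object.
same_object = file_a["foo_demo"] is file_b["foof"]

# A change made through one label is visible through the other.
file_a["foo_demo"].x = 2
seen_by_b = file_b["foof"].x

# The from form bound the value x had at import time to its own label.
copied_binding = file_c["foox"]

print(same_object)     # True
print(seen_by_b)       # 2
print(copied_binding)  # 1

del sys.modules["foo_demo"]  # leave the cache as it was found
```

Sharing works because every importer holds a label for the same module object; a name bound with the from form is a separate label and does not follow later rebinding of foo_demo.x.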
With classes, you can actually poke around this whole system by crawling up and down the tree using the __module__ attribute.
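For instance, a small sketch using the standard library's Fraction class: its __module__ attribute is the key under which the defining module sits in sys.modules.

```python
import sys
from fractions import Fraction

module_name = Fraction.__module__         # name of the defining module
owning_module = sys.modules[module_name]  # the module object itself

print(module_name)                         # fractions
print(owning_module.Fraction is Fraction)  # True
```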
The import creates an instance of a "module" object. It is worth knowing that this is created only the first time the module is imported. The following times it is imported you are getting a reference to the original. You can create your own module objects on the fly with a bit of introspection.
import glob # Import any python module
moduleType = type(glob)
onTheFly = moduleType("OnTheFly", "Docstring for this module")
Although there isn't much benefit to creating these.
Yes, indeed it's true. If you execute import foo, a module object foo is instantiated and the contents of your file, e.g. a class bar, are added as members of that object.