In the Python Guide's chapter on project structure, the term "top-level statement" is brought up a few times. I'm not sure exactly what this refers to. My guess is it's any variable declarations that happen outside of any functions or class methods that fire as soon as a module is loaded. Is this correct? Does it also include a module's import statements?
It's not just variable declarations (and there aren't any variable declarations anyway). It's pretty much anything that starts at indentation level 0.
import sys  # top-level
3 + 4  # top-level
x = 0  # top-level
def f():  # top-level
    import os  # not top-level!
    return 3  # not top-level
if x:  # top-level
    print 3  # not top-level
else:
    print 4  # not top-level, but executes as part of an if statement
             # that is top-level
class TopLevel(object):  # top-level
    x = 3  # not top-level, but executes as part of the class statement
    def foo(self):  # not top-level, but executes as part of the class statement
        print 5  # not top-level
Here's the first mention of "top-level statement":
Once modu.py is found, the Python interpreter will execute the module in an isolated scope. Any top-level statement in modu.py will be executed, including other imports if any. Function and class definitions are stored in the module’s dictionary.
This makes it clear that what they really mean is "things that are interpreted at import time".
While it's not terribly helpful directly, the Python documentation itself also uses the phrase "top-level" (components, which then means "statements" in this context).
Note that this module:
"""a python module, spam.py"""
def spam():
    return "spam"
class Spam(object):
    pass
has two statements in it, the def and the class. These are both executed at import time. These definitions are compound statements (see def and class descriptions). If there are decorators attached to a top-level def, that adds even more top-level things to run. (See also user2357112's answer: running a class statement invokes more internal workings.)
Add an import sys at the top and you've added a third statement, which imports sys. However, if you add this:
def ham(eggs):
    import os
    return os.path.basename(eggs)
you have still only added one statement, the def ham, to the top-level stuff. It's when ham itself is executed (called) that the import os will be run.
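As a rough way to check the counting above, the standard-library ast module can list the statements that sit at indentation level 0. This is only a sketch: SOURCE and top_level_statements are names made up for the illustration, not anything from the guide.

```python
import ast

# Hypothetical module source: one import plus the ham() definition from above.
SOURCE = '''
import sys

def ham(eggs):
    import os
    return os.path.basename(eggs)
'''

def top_level_statements(source):
    """Return the node type names of the statements at indentation level 0."""
    return [type(node).__name__ for node in ast.parse(source).body]

# The nested "import os" is not listed, because it belongs to the body of ham.
print(top_level_statements(SOURCE))  # ['Import', 'FunctionDef']
```

Only two entries come back, which matches the point that the import os inside ham is not part of the top-level code.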
In Python, statements that are not indented are called top-level statements. Python gives the special name __main__ to the environment in which the top-level code of the program being run executes.
I have file main.py which contains the following code:
class A:
    def __init__(self, a):
        self.a = a
    def run(self):
        return self.a+10
a = A(4)
print(a.run())
In file test.py, I tried to monkey patch class A in main.py as follows:
import main
class A:
    def __init__(self, a):
        self.a = a
    def run(self):
        return self.a+5
main.A = A
Unfortunately, when I run import test from a python interpreter, the module still prints out 14 as opposed to my expected output which is 9.
Is there a way to monkey patch a class inside a module before the module body is executed?
The problem here is that when you imported the main.py file, it executed the code a = A(4) using the real implementation of the class A. Then the rest of your test.py was executed and you replaced the A reference, but it was too late.
You can check that by adding the following to your test:
print(__name__) # __main__
print(main.__name__) # so70731368_main
print(A.__module__) # __main__
print(main.A.__module__) # __main__
print(main.a.__class__.__module__) # so70731368_main
Here, __main__ is a bit confusing, but that's what Python calls the first file you run (in your case test.py). The a instance is created in the so70731368_main module, and it used the A class from the same module; you just changed A after the fact to the definition from the test file (__main__).
The fact that you need to patch two definitions (A and a) defined in the same file is very tricky. unittest.mock.patch is not powerful enough to patch inside an import (it patches after the import).
You cannot, in a clean and simple way, prevent a from being instantiated as a (real) main.A and printed. What you can do is patch it afterwards, for later uses, which is what you showed.
To answer your question directly: "patching" means replacing one reference with another, so the reference has to be defined already. In your example, you would need to patch between the class definition and the class instantiation (so that the print does not use the real a), which is not supported.
There is no simple solution to this problem. If you have control over the code of the main.py file, then try to change it so that it does not instantiate a at import time.
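A minimal sketch of that restructuring, assuming you are free to edit main.py: the instantiation moves into a function, so nothing uses A until after the patch. To keep the example in one file, the import is simulated by running the source in a fresh module object; MAIN_SOURCE, main_module and PatchedA are names invented for this illustration.

```python
import types

# Hypothetical rewrite of main.py: nothing is instantiated at import time.
MAIN_SOURCE = '''
class A:
    def __init__(self, a):
        self.a = a

    def run(self):
        return self.a + 10

def main():
    # A is looked up when main() is called, not when the module is imported.
    return A(4).run()

if __name__ == "__main__":
    print(main())
'''

# Simulate "import main" by executing the source in a new module namespace.
main_module = types.ModuleType("main")
exec(MAIN_SOURCE, main_module.__dict__)

class PatchedA:
    def __init__(self, a):
        self.a = a

    def run(self):
        return self.a + 5

before = main_module.main()   # uses the original A
main_module.A = PatchedA      # the monkey patch from test.py
after = main_module.main()    # now uses PatchedA
print(before, after)  # 14 9
```

The patch takes effect here only because main() resolves the name A at call time. With the original main.py the lookup has already happened by the time the import returns.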
I am writing a Python script where some of the core functionalities can be done by another existing library. Unfortunately, while that library has more features, it is also slower, so I'd like if the user could at runtime select whether they want to use that library or my own fast and simple implementation. Unfortunately I'm stuck at a point where I don't understand some of the workings of Python's module system.
Suppose that my main program was main.py, that the (optional) external module is in module_a.py and that my own fast and simple implementation of module_a together with the actual program code that uses either my own implementation or the one of module_a is in the file module_x.py:
main.py:
import module_x
module_x.test(True)
module_x.test(False)
module_a.py:
class myclass():
    def __init__(self):
        print("i'm myclass in module_a")
module_x.py:
class myclass():
    def __init__(self):
        print("i'm myclass in module_x")
def test(enable_a):
    if enable_a:
        try:
            from module_a import myclass
        except ImportError:
            global myclass
            enable_a = False
    else:
        global myclass
    i = myclass()
When I now execute main.py I get:
$ python3 main.py
i'm myclass in module_a
i'm myclass in module_a
But why is this? If False is passed to test() then the import of the module_a implementation should never happen. Instead it should only see myclass from the local file. Why doesn't it? How do I make test() use the local definition of myclass conditionally?
My solution is supposed to run in Python3 but I see the same effect when I use Python2.7.
The effect of an import statement persists for the rest of the program's execution unless it is explicitly undone. Furthermore, once the from ... import statement is executed in this case, it replaces the variable myclass in the global scope (at which point the class it previously referenced, defined in the same file, is no longer referenced and can in theory be garbage collected).
So what is happening here is that when you run test(True) the first time, your myclass in module_x is effectively deleted and replaced with the myclass from module_a. All subsequent calls to test(False) then execute global myclass, which is effectively a no-op since the global myclass now refers to the one imported from the other module (and besides, the global statement is unneeded when not changing the global variable from a local scope, as explained here).
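The rebinding can be reproduced in a single file. This sketch swaps in os.path.basename for module_a so that it runs anywhere; the local basename function and use_stdlib are stand-ins made up for the example.

```python
def basename(path):
    """Stand-in for the local implementation (plays the role of module_x.myclass)."""
    return "local implementation"

def use_stdlib():
    # Because of the global declaration, the import below rebinds the
    # module-level name rather than creating a local one.
    global basename
    from os.path import basename

before = basename("/tmp/x")   # still the local function
use_stdlib()
after = basename("/tmp/x")    # now os.path.basename
print(before, "|", after)  # local implementation | x
```

Once use_stdlib() has run, nothing in the file refers to the original function any more, which is the same situation test(False) finds itself in.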
To work around this, I would strongly suggest encapsulating the desired module-switching behavior in a class that is independent of either module you would like to switch between. You can then charge that class with holding a reference to both modules and providing the rest of your client code with the correct one. E.g.
module_a_wrapper.py:
import module_x
import module_a
class ModuleAWrapper(object):
    _target_module = module_x  # the default
    @classmethod
    def get_module(cls):
        return cls._target_module
def set_module(enable_a):
    if enable_a:
        ModuleAWrapper._target_module = module_a
    else:
        ModuleAWrapper._target_module = module_x
def get_module():
    return ModuleAWrapper.get_module()
main.py:
from module_a_wrapper import set_module, get_module
set_module(True)
get_module().myclass()
set_module(False)
get_module().myclass()
Running:
python main.py
# Outputs:
i'm myclass in module_a
i'm myclass in module_x
You can read more about the guts of the python import system here
The answer by lemonhead properly explains why this effect happens and gives a valid solution.
The general rule seems to be: an import binds a name just like an assignment does, so if that name is declared global in the function, the import replaces the global variable of the same name.
Funnily, when I use the import foo as bar construct, then there must neither be a global variable named foo nor one named bar!
So while lemonhead's solution worked it adds lots of complexity and will lead to my code being much longer because every time I want to get something from either module I have to prefix that call with the getter function.
This solution allows me to solve the problem with a minimal amount of changed code:
module_x.py:
class myclass_fast():
    def __init__(self):
        print("i'm myclass in module_x")
def test(enable_a):
    if enable_a:
        try:
            from module_a import myclass
        except ImportError:
            enable_a = False
            myclass = myclass_fast
    else:
        myclass = myclass_fast
    i = myclass()
So the only thing I changed was to rename the class I had in global scope from myclass to myclass_fast. This way it will not be overwritten anymore by the import of myclass from module_a. Then, on demand, I set the local variable myclass to either the imported class or myclass_fast.
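The same idea can be pulled into a small helper that returns the class to use, so the choice lives in one place. This is only a sketch: MyClassFast and pick_class are hypothetical names, and module_a is assumed to be the optional external module from the question (the fallback is taken if it is not installed).

```python
class MyClassFast:
    """Stand-in for the fast local implementation."""
    def describe(self):
        return "fast local implementation"

def pick_class(enable_a):
    """Return the class to instantiate, falling back to the local one."""
    if enable_a:
        try:
            # Without a global declaration this import only binds a local name.
            from module_a import myclass  # hypothetical optional module
            return myclass
        except ImportError:
            pass
    return MyClassFast

chosen = pick_class(False)
print(chosen().describe())
```

Since no global statement is involved, repeated calls with different arguments cannot overwrite each other's choice.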
I put a method in a file mymodule.py:
def do_something():
    global a
    a=1
If I try
>>> execfile('mymodule.py')
>>> do_something()
>>> print a
I get "1" as I expect. But if I import the module instead,
>>> from mymodule import *
and then run do_something(), then the python session knows nothing about the variable "a".
Can anyone explain the difference to me? Thanks.
Without the globals and locals arguments, execfile executes the file's contents in the current namespace (the same namespace that calls execfile).
import, by contrast, executes the specified module in a separate namespace and binds the resulting names in the importing namespace.
In the second part, where you import mymodule, a doesn't show up because global makes it global to the namespace of mymodule, not to your session.
Try (after import mymodule, so that the name mymodule is bound in your session):
print mymodule.a
This prints:
1
As expected.
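For anyone on Python 3, where execfile no longer exists, the same behaviour can be sketched without a file on disk. The module is simulated with types.ModuleType and exec, so MODULE_SOURCE and the way mymodule is built here are assumptions made for the illustration.

```python
import types

# Source of the hypothetical mymodule.py from the question.
MODULE_SOURCE = '''
def do_something():
    global a
    a = 1
'''

# Simulate the import: run the source in the module's own namespace.
mymodule = types.ModuleType("mymodule")
exec(MODULE_SOURCE, mymodule.__dict__)

# Roughly what "from mymodule import *" does for this one name.
do_something = mymodule.do_something
do_something()

name_in_caller = "a" in globals()   # the calling namespace was not touched
print(mymodule.a, name_in_caller)  # 1 False
```

The function keeps a reference to the namespace it was defined in, so global a always means mymodule's a, no matter where the function is called from.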
As per the Python documentation:
The global statement is a declaration which holds for the entire
current code block. It means that the listed identifiers are to be
interpreted as globals. It would be impossible to assign to a global
variable without global, although free variables may refer to globals
without being declared global.
Names listed in a global statement must not be used in the same code
block textually preceding that global statement.
Names listed in a global statement must not be defined as formal
parameters or in a for loop control target, class definition, function
definition, or import statement.
My problem is that I have a file containing a class, and inside this class there is a bunch of code that gets executed whenever I import that file, without creating an object of the class. Here is the example:
FILE X
class d:
    def __init__(self):
        print 'print this will NOT be printed'
    print "this will be printed"
file B
import x
The output is "this will be printed". So my question is: how do I skip executing it until a new object is created?
You can't do that in Python. Every class is a first-class object, and a class attribute can exist even if there are no instances of that class. If you just want to suppress the output of the print statement, you can redirect the output at import time, or create a context like the one provided in this first answer and call the __import__ function manually.
If all you want to do is suppress the print (or any other executable statements) during import, surround them with a check for top module execution:
if __name__ == '__main__':
    print 'this will be printed'
This will prevent the print during import, but allow it when the module is executed directly as a script.
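To see the guard at work without creating files, the same source can be executed under two different values of __name__. This is a sketch in Python 3 syntax; SOURCE and run_as are names invented for the example.

```python
# Hypothetical module body with a guarded statement.
SOURCE = '''
output = []
if __name__ == "__main__":
    output.append("running as a script")
'''

def run_as(name):
    """Execute SOURCE as if the module's __name__ were the given name."""
    namespace = {"__name__": name}
    exec(SOURCE, namespace)
    return namespace["output"]

print(run_as("__main__"))  # ['running as a script']
print(run_as("x"))         # []
```

When the file is imported, __name__ is the module's own name (x here), so the guarded block is skipped.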
As others have pointed out, the second print statement executes because it is one of the suite of statements making up the class declaration, all of which are executed when the module they're in is imported, because the declaration is part of its top-level code rather than being nested inside a function or method.
The first print statement isn't executed because it's part of a method definition, whose statements don't execute until it's called --- unlike those within a class definition. Typically a class's __init__() method is called indirectly when an instance of the class is created using the class's name, which would be d() for one named d like yours.
So, although it contradicts what's in the text of the strings being displayed, to make that second print statement only execute when instances of the class are created (just like with the first one) you'd need to also make it part of the same method (or called by it). In other words, after doing so, neither of them will execute when the file the class is in is imported, but both will when any instances of the class are created. Here's what I mean:
File x.py:
class d:
    def __init__(self):
        print 'print this will NOT be printed'  # not true
        print "this will be printed when object is created"
File b.py:
import x # no print statements execute
obj = x.d() # both print statements will be executed now
Your question is like: I have a function
def f():
    print(1)
    print(2)
How do I make print(1) execute, but not print(2)? There is really no easy way. What you have to understand is that def __init__(self) is also a statement. Your class consists of that statement and the print statement. There is no easy way to execute one but not the other. Of course, if you can change the source of the class, just put the print inside __init__, where it will be called after instance creation.
(Copied from a comment above in case it is useful to future readers)
Agree with @mgilson and @EmmettJButler - this code is likely best-placed in the __init__. When Python imports a module, it executes the module-level code, including building the class definition with the class methods, etc. Therefore when you import X, and class d's definition gets built (so you can call it from B), it executes the code inside of the class. Usually this means you'll have class-level variables set and unbound methods ready to be attached to instances, but in your case it means that your statement will be printed.
As suggested by the others, refactoring the code is likely your best bet.
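The difference between the class body and a method body can be checked with a few lines. This is a sketch in Python 3 syntax; the events list and the class name D are invented for the illustration, with appends standing in for the print statements.

```python
events = []

class D:
    # Runs once, when the class statement itself is executed.
    events.append("class body executed")

    def __init__(self):
        # Runs only when an instance is created.
        events.append("__init__ executed")

after_definition = list(events)
D()
after_instantiation = list(events)

print(after_definition)     # ['class body executed']
print(after_instantiation)  # ['class body executed', '__init__ executed']
```

Defining the class is enough to trigger the first append, which is what happens to the question's print on import.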
Does it matter where modules are loaded in a code?
Or should they all be declared at the top, since during load time the external modules will have to be loaded regardless of where they are declared in the code...?
Example:
from os import popen
try:
    popen('echo hi')
    doSomethingIllegal;
except:
    import logging  # Module called only when needed?
    logging.exception("Record to logger")
or is this optimized by the compiler the same way as:
from os import popen
import logging  # Module will always loaded regardless
try:
    popen('echo hi')
    doSomethingIllegal;
except:
    logging.exception("Record to logger")
This indicates it may make a difference:
"import statements can be executed just about anywhere. It's often useful to place them inside functions to restrict their visibility and/or reduce initial startup time. Although Python's interpreter is optimized to not import the same module multiple times, repeatedly executing an import statement can seriously affect performance in some circumstances."
These two SO questions, local import statements? and import always at top of module?, discuss this at length.
Finally, if you are curious about your specific case you could profile/benchmark your two alternatives in your environment.
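One way to run such a benchmark is with the standard-library timeit module. This is only a sketch under simple assumptions: the two functions are made-up stand-ins for the two layouts, and the absolute numbers will differ between machines and Python versions.

```python
import logging
import sys
import timeit

def with_local_import():
    # The import statement is re-executed on every call, but after the
    # first time it only looks the module up in the sys.modules cache.
    import logging
    return logging

def with_top_level_import():
    # Uses the name bound once by the import at the top of the file.
    return logging

local_time = timeit.timeit(with_local_import, number=100_000)
top_time = timeit.timeit(with_top_level_import, number=100_000)

print("local import:     %.4f s" % local_time)
print("top-level import: %.4f s" % top_time)
print("cached in sys.modules:", "logging" in sys.modules)
```

Both functions return the same module object. Any difference in the timings is the cost of repeating the import statement, not of reloading the module.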
I prefer to put all of my import statements at the top of the source file, following stylistic conventions and for consistency (it also makes later changes easier, without having to hunt through the source file for import statements scattered throughout).
The general rule of thumb is that imports should be at the top of the file, as that makes code easier to follow, and that makes it easier to figure out what a module will need without having to go through all the code.
The Python style guide covers some basic guidelines for how imports should look: http://www.python.org/dev/peps/pep-0008/#imports
In practice, though, there are times when it makes sense to import from within a particular function. This comes up with imports that would be circular:
# Module 1
from module2 import B
class A(object):
    def do_something(self):
        my_b = B()
        ...
# Module 2
from module1 import A
class B(object):
    def do_something(self):
        my_a = A()
        ...
That won't work as is, but you could get around the circularity by moving the import:
# Module 1
from module2 import B
class A(object):
    def do_something(self):
        my_b = B()
        ...
# Module 2
class B(object):
    def do_something(self):
        from module1 import A
        my_a = A()
        ...
Ideally, you would design the classes such that this would never come up, and maybe even include them in the same module. In that toy example, having each import the other really doesn't make sense. However, in practice, there are some cases where it makes more sense to include an import for one method within the method itself, rather than throwing everything into the same module, or extracting the method in question out to some other object.
But, unless you have good reason to deviate, I say go with the top-of-the-module convention.
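For anyone who wants to try the deferred-import variant end to end, here is a sketch that writes the two modules to a temporary directory and imports them. The names demo_module1 and demo_module2 are made up to avoid clashing with real modules, and the methods return the objects they create so that the result can be inspected.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for Module 1: imports B at the top of the file.
MODULE1 = '''
from demo_module2 import B

class A(object):
    def do_something(self):
        return B()
'''

# Hypothetical stand-in for Module 2: defers its import of A.
MODULE2 = '''
class B(object):
    def do_something(self):
        from demo_module1 import A  # runs only when the method is called
        return A()
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_module1.py").write_text(MODULE1)
    Path(tmp, "demo_module2.py").write_text(MODULE2)
    sys.path.insert(0, tmp)
    try:
        module1 = importlib.import_module("demo_module1")
        module2 = importlib.import_module("demo_module2")
        made_by_a = module1.A().do_something()
        made_by_b = module2.B().do_something()
    finally:
        sys.path.remove(tmp)

print(type(made_by_a).__name__, type(made_by_b).__name__)  # B A
```

By the time B.do_something runs, demo_module1 has been fully imported, so the deferred import simply finds it in sys.modules.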