I am writing a Python script where some of the core functionality can be provided by another existing library. That library has more features but is also slower, so I'd like the user to be able to select at runtime whether to use that library or my own fast and simple implementation. Unfortunately, I'm stuck at a point where I don't understand some of the workings of Python's module system.
Suppose that my main program is main.py, that the (optional) external module is module_a.py, and that module_x.py contains both my own fast and simple implementation and the actual program code that uses either my implementation or the one from module_a:
main.py:
import module_x
module_x.test(True)
module_x.test(False)
module_a.py:
class myclass():
    def __init__(self):
        print("i'm myclass in module_a")
module_x.py:
class myclass():
    def __init__(self):
        print("i'm myclass in module_x")

def test(enable_a):
    if enable_a:
        try:
            from module_a import myclass
        except ImportError:
            global myclass
            enable_a = False
    else:
        global myclass
    i = myclass()
When I now execute main.py I get:
$ python3 main.py
i'm myclass in module_a
i'm myclass in module_a
But why is this? If False is passed to test() then the import of the module_a implementation should never happen. Instead it should only see myclass from the local file. Why doesn't it? How do I make test() use the local definition of myclass conditionally?
My solution is supposed to run in Python3 but I see the same effect when I use Python2.7.
An import binds a name just like an assignment does, and that binding stays in place until something explicitly rebinds it. Because test() declares myclass as global, and a global declaration applies to the whole function body regardless of which branch it appears in, the from ... import statement rebinds myclass in the module's global scope (at which point the class defined in the same file is no longer referenced by that name and can in theory be garbage collected).
So what is happening here is that the first time you run test(True), the name myclass in module_x is rebound to the myclass from module_a. The later call test(False) then executes global myclass, which is effectively a no-op since the global myclass now refers to the class imported from the other module (besides, a global declaration is not needed when a function only reads a global variable and never assigns to it).
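The effect can be reproduced in a single file. This is a minimal sketch that uses the standard library's math.sqrt in place of module_a.myclass; the names sqrt, local_sqrt and pick are made up for illustration and are not part of the original code.

```python
import math


def sqrt(x):
    """Local stand-in implementation (plays the role of myclass in module_x)."""
    return "local sqrt"


local_sqrt = sqrt  # keep a second reference so we can compare later


def pick(use_math):
    global sqrt  # applies to the whole function body
    if use_math:
        from math import sqrt  # rebinds the module-level name sqrt
    return sqrt


first = pick(False)   # module-level sqrt is still the local function
second = pick(True)   # the import rebinds the module-level name
third = pick(False)   # no import runs, but the name was already rebound

if __name__ == "__main__":
    print(first is local_sqrt, second is math.sqrt, third is math.sqrt)
```

After the call with True, every later lookup of the module-level name finds the imported function, which mirrors what happens to myclass in module_x.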
To work around this, I would strongly suggest encapsulating the desired module-switching behavior in a class that is independent of either module you would like to switch between. You can then charge that class with holding a reference to both modules and providing the rest of your client code with the correct one. E.g.
module_a_wrapper.py:
import module_x
import module_a

class ModuleAWrapper(object):
    _target_module = module_x  # the default

    @classmethod
    def get_module(cls):
        return cls._target_module

def set_module(enable_a):
    if enable_a:
        ModuleAWrapper._target_module = module_a
    else:
        ModuleAWrapper._target_module = module_x

def get_module():
    return ModuleAWrapper.get_module()
main.py:
from module_a_wrapper import set_module, get_module
set_module(True)
get_module().myclass()
set_module(False)
get_module().myclass()
Running:
python main.py
# Outputs:
i'm myclass in module_a
i'm myclass in module_x
You can read more about the guts of the python import system here
The answer by lemonhead properly explains why this effect happens and gives a valid solution.
The general rule seems to be: an import statement binds the imported name in the scope where it runs. Because test() declared myclass as global, the import replaced the module-level variable of the same name.
Funnily, when I use the import foo as bar construct, then there must neither be a global variable named foo nor one named bar!
So while lemonhead's solution worked it adds lots of complexity and will lead to my code being much longer because every time I want to get something from either module I have to prefix that call with the getter function.
This solution allows me to solve the problem with a minimal amount of changed code:
module_x.py:
class myclass_fast():
    def __init__(self):
        print("i'm myclass in module_x")

def test(enable_a):
    if enable_a:
        try:
            from module_a import myclass
        except ImportError:
            enable_a = False
            myclass = myclass_fast
    else:
        myclass = myclass_fast
    i = myclass()
So the only thing I changed was to rename the class I had in global scope from myclass to myclass_fast. This way it is no longer overwritten by the import of myclass from module_a. Then, on demand, I set the local variable myclass to either the imported class or myclass_fast.
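The same idea can be moved into a small helper that returns the chosen class instead of assigning a local variable inside test(). This is only a sketch under assumptions: pick_class, its parameters and MyClassFast are hypothetical names, and importlib.import_module is used so the module name can be passed in as a string.

```python
import importlib


class MyClassFast:
    """Stand-in for the fast local implementation (myclass_fast in the post)."""


def pick_class(enable_external, module_name="module_a", attr="myclass"):
    """Return the external class if requested and importable, else the local one."""
    if enable_external:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            pass  # fall back to the local implementation
    return MyClassFast


if __name__ == "__main__":
    chosen = pick_class(True)  # falls back when module_a is not installed
    print(chosen.__name__)
```

Because the helper never rebinds a module-level name, calling it with True and then with False gives independent results.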
I have file main.py which contains the following code:
class A:
    def __init__(self, a):
        self.a = a

    def run(self):
        return self.a+10

a = A(4)
print(a.run())
In file test.py, I tried to monkey patch class A in main.py as follows:
import main
class A:
    def __init__(self, a):
        self.a = a

    def run(self):
        return self.a+5

main.A = A
Unfortunately, when I run import test from a python interpreter, the module still prints out 14 as opposed to my expected output which is 9.
Is there a way to monkey patch a class inside a module before the module body is executed?
The problem here is that when you imported the main.py file, it executed the code a = A(4) using the real implementation of the class A. Then the rest of your test.py was executed and you replaced the A reference, but it was too late.
You can check that by adding the following to your test:
print(__name__) # __main__
print(main.__name__) # so70731368_main
print(A.__module__) # __main__
print(main.A.__module__) # __main__
print(main.a.__class__.__module__) # so70731368_main
Here, __main__ is a bit confusing, but that's what Python calls the first file you run (in your case test.py). The a instance is created in the so70731368_main module, using the A class from that same module; you only replaced A after the fact with the definition from the test file (__main__).
The fact that you need to patch two definitions (A and a) defined in the same file is very tricky. unittest.mock.patch is not powerful enough to patch inside an import (it patches after the import).
You cannot, in a clean and simple way, prevent a from being instantiated as a main.A (real) object and printed. What you can do is patch it afterwards, for later uses, which is what you showed.
To answer your question directly: "patching" means replacing one reference with another, so the reference has to be defined already. In your example, you would need to patch between the class definition and the class instantiation (so that the print does not use the real a), which is not supported.
There is no simple solution to this problem. If you have control over the code of the main.py file, then try to change it so that it does not instantiate a at import time.
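If main.py can be changed, one way to follow that advice is to move the instantiation into a function. The sketch below keeps everything in one file for brevity; main() and PatchedA are hypothetical names, and in the two-file setup the rebinding line would be main.A = PatchedA in test.py.

```python
class A:
    def __init__(self, a):
        self.a = a

    def run(self):
        return self.a + 10


def main():
    # A is looked up when main() is called, not when the module is imported,
    # so a replacement installed before the call takes effect.
    a = A(4)
    return a.run()


class PatchedA:
    """Hypothetical replacement, standing in for the class from test.py."""

    def __init__(self, a):
        self.a = a

    def run(self):
        return self.a + 5


before = main()   # uses the original A
A = PatchedA      # in a separate file this would be: main.A = PatchedA
after = main()    # uses the replacement

if __name__ == "__main__":
    print(before, after)
```

Nothing runs at import time except the definitions, so the caller decides when the class is instantiated and which class that is.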
I have a function that has a decorator. The decorator accepts arguments and the value of the argument is derived from another function call.
example.py
from cachetools import cached
from cachetools import TTLCache
from other import get_value

@cached(cache=TTLCache(maxsize=1, ttl=get_value('cache_ttl')))
def my_func():
    return 'result'
other.py
def get_value(key):
    data = {
        'cache_ttl': 10,
    }
    # Let's assume here we launch a shuttle to the space too.
    return data[key]
I'd like to mock the call to get_value(). I'm using the following in my test:
example_test.py
import mock
import pytest
from example import my_func

@pytest.fixture
def mock_get_value():
    with mock.patch(
        "example.get_value",
        autospec=True,
    ) as _mock:
        yield _mock

def test_my_func(mock_get_value):
    assert my_func() == 'result'
Here I'm injecting mock_get_value to test_my_func. However, since my decorator is called on the first import, get_value() gets called immediately. Any idea if there's a way to mock the call to get_value() before module is imported right away using pytest?
Move the from example import my_func inside your with in your test function. Also patch it where it's really coming from, other.get_value. That may be all it takes.
Python caches modules in sys.modules, so module-level code (like function definitions) only runs on the first import from anywhere. If this isn't the first time, you can force a re-import using either importlib.reload() or by deleting the appropriate key in sys.modules and importing again.
Beware that re-importing a module may have side effects, and you may also want to re-import the module again after running the test to avoid interfering with other tests. If another module was using objects defined in the re-imported module, these don't just disappear, and may not be updated the way it expects. For example, re-importing a module may create a second instance of what was supposed to be a singleton.
A more robust approach would be to save the original imported module object somewhere else, delete it from sys.modules, re-import with the patched version for the duration of the test, and then put the original import back into sys.modules after the test. You could do this with an import inside a patch.dict() context on sys.modules.
import mock
import sys
import pytest

@pytest.fixture
def mock_get_value():
    with mock.patch(
        "other.get_value",
        autospec=True,
    ) as _mock, mock.patch.dict("sys.modules"):
        sys.modules.pop("example", None)
        yield _mock

def test_my_func(mock_get_value):
    from example import my_func
    assert my_func() == 'result'
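The save, re-import and restore cycle described above can be tried outside pytest. This is a rough sketch under assumptions: demo_config and demo_example are throwaway modules written to a temporary directory purely for illustration, and it uses the standard library's unittest.mock rather than the mock package.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from unittest import mock

# Write two throwaway modules (hypothetical names) into a temporary directory.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_config.py").write_text(
    "def get_value(key):\n    return 10\n"
)
Path(tmp_dir, "demo_example.py").write_text(
    "from demo_config import get_value\nTTL = get_value('cache_ttl')\n"
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import demo_example as original  # first import runs the module body
original_ttl = original.TTL

with mock.patch("demo_config.get_value", return_value=99), mock.patch.dict(sys.modules):
    sys.modules.pop("demo_example", None)  # force the next import to re-execute
    import demo_example as patched
    patched_ttl = patched.TTL

# On exit, patch.dict restores sys.modules, so the original module is back.
restored = sys.modules["demo_example"]

if __name__ == "__main__":
    print(original_ttl, patched_ttl, restored is original)
```

The patched module object only exists inside the with block; other code that imports demo_example afterwards gets the original object again.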
Another possibility is to call the decorator yourself in the test, on the original function. If the decorator used functools.wraps()/functools.update_wrapper(), then original function should be available as a __wrapped__ attribute. This may not be available depending on how the decorator was implemented.
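As a sketch of that second option, the example below uses a hypothetical decorator with_ttl built on functools.wraps (it is not the cachetools decorator) to show how __wrapped__ lets a test re-apply the decorator with different arguments.

```python
import functools


def with_ttl(ttl):
    """Hypothetical decorator factory; returns (ttl, result) so the effect is visible."""
    def decorator(func):
        @functools.wraps(func)  # sets wrapper.__wrapped__ = func
        def wrapper(*args, **kwargs):
            return (ttl, func(*args, **kwargs))
        return wrapper
    return decorator


@with_ttl(10)
def my_func():
    return "result"


# In a test: take the undecorated function and decorate it again
# with a value the test controls.
undecorated = my_func.__wrapped__
redecorated = with_ttl(1)(undecorated)

if __name__ == "__main__":
    print(my_func(), redecorated())
```

This only helps when the decorator actually exposes __wrapped__, which is worth checking for the specific library before relying on it.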
I have two Python scripts, one testclass.py:
import numpy
zz = numpy
class Something(object):
    def __init__(self):
        self.xp = zz
and one testscript.py:
from testclass import Something
x = Something()
print(x.xp)
I expected testscript.py to throw an error because I thought that testscript only imports the class Something (with its __init__ method), and not the global variable zz. So, given this behaviour, my question is: when importing from a module, does Python "run" everything in the module file?
Yes. When you execute:
from testclass import Something
It has the same effect as:
import testclass
Something = testclass.Something
More generally, the Python interpreter can't know beforehand what objects your module exposes (unless you explicitly name them in __all__). For an extreme case, consider the following:
a.py:
import random

if random.random() > 0.5:
    class Foo(object):
        pass
else:
    class Bar(object):
        pass
Running from a import Foo has a 50% chance of failing because the a module object may or may not have a Foo attribute.
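A quick way to see that a from-import loads the whole module is to look in sys.modules afterwards. The sketch below uses the standard library's json module purely as a convenient example; the variable names are made up.

```python
import sys

from json import dumps  # binds only the name "dumps" in this namespace

# The full module object was still created and cached by the import.
json_module = sys.modules["json"]
same_object = dumps is json_module.dumps
has_other_names = hasattr(json_module, "loads")

if __name__ == "__main__":
    print(same_object, has_other_names)
```

The imported name is simply an attribute taken from the fully executed module, which is why everything else the module defines exists as well.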
Say we have two scripts, script1 and script2.
script1 is defined as:
class Foo(object):
    def __init__(self, name):
        self.name = name

class bar(object):
    def __init__(self, name):
        self.name = name

def test(givenString):
    return eval(givenString)
and script2 is defined as:
from .script1 import test
x = "Foo('me')"
print test(x)
script2's print statement for test(x) successfully tells me that I have a Foo object, but that doesn't make sense to me because I only imported test from script1, not Foo. I looked at the eval documentation but that didn't clear up much for me. How is it possible that a Foo object is created even when I never imported the class Foo?
eval() uses the globals of the module it is executed in. test 'lives' in the script1 global namespace, so any expression executed by eval() uses the same namespace as that function and thus can resolve Foo, bar and test.
Importing a function does not alter its namespace; the globals for test don't change merely by being called from script2. If it did, any imports in script1 would also need to be imported into script2, for each and every function you ever wanted to use. That would be incredibly impractical.
You can even see the globals for functions you import; print test.func_globals (spelled test.__globals__ in Python 2.6 and later, including Python 3) will show you the exact namespace of script1.
I have a Python library design that I'm trying to clean up, but I noticed that one piece isn't auto-completing in Eclipse/PyDev. I'm familiar enough with the code for this not to be a problem for me, but it's a library that others will end up using, and auto-complete that doesn't work for a chunk of features won't do.
I'll just explain quickly what it's trying to do; I re-created the design below. It all works, but auto-complete isn't useful. Maybe someone can set me straight.
In module test_module2.py
# import Main from test_module.py
from test_module import Main

class Second(object):
    _main = Main

    def set_main_variable(self, var):
        self._main.variable = var
In module test_module.py
import sys

class Main(object):
    variable = 0
    second = None

    def __init__(self):
        for x in sys.modules.keys():
            if x.endswith("test_module2"):
                y = sys.modules[x]
                self.second = y.Second()
                self.second._main = self

    def print_variable(self):
        print self.variable
In application test.py
import test_module
import test_module2

if __name__ == "__main__":
    m = test_module.Main()
    m.second.set_main_variable(10)
    m.print_variable() # prints 10
With this, if the test_module2.py module is imported, you can access it via a separate namespace member variable Main.second. Keeps things separate, and if the module isn't available, then you don't have access to Second's features.
In test_module2.py, Main is imported (you can't use Second without Main anyway) and _main defaults to Main. This allows auto-complete in Second when you're working on the parent _main reference that's set up when Main is constructed and Second was imported. (this is good)
In Main, Second is optional and is never directly called from Main methods. So auto-complete isn't necessary. (this is fine)
In __main__, auto-complete naturally works for Main member variables and methods, but doesn't work with m.second. I can't default m.second to anything because I'd end up having to import test_module2.py from test_module.py. It's optional and is defined by __main__'s import of test_module2.py. e.g. if __main__ imports test_module2.py, then Second is automatically available to the application.
Anyone have a better way to have an optionally imported module/class construct to a member variable, that will work with IDE auto-completion?
Thanks in advance for any ideas.
I'm going to come to the conclusion that auto-completion with PyDev will only see a variable set to a class's members if the parent class inherits from it, or defaults the variable to the class itself. For example:
class Main(object):
    second = Second
Then in __main__ you can auto-complete: main.second...
That or Main has to inherit from Second.
Ok, I need to go back and complain that this library design isn't going to work with IDE auto-completion. I'll see if I can use a wrapper class to inherit if the import of test_module2 is present and clean things up.
My solution:
Here's what I came up with:
In module test_module2.py
import test_module

class Second(object):
    _variable = 0 # overridden by Main's _variable

    def set_main_variable(self, var):
        self._variable = var

class Main(test_module.Main, Second):
    def __init__(self):
        super(Main, self).__init__()
In module test_module.py
class Main(object):
    _variable = 0
    second = None

    def __init__(self):
        super(Main, self).__init__()

    def print_variable(self):
        print self._variable
Now! In test.py if you import test_module or test_module2 (not both), you can construct Main with or without Second's added functionality. Second will have access to everything in Main, and because Main is inheriting Second, auto-complete works.
In application test.py
#import test_module
import test_module2

if __name__ == "__main__":
    m = test_module2.Main()
    m.set_main_variable(10)
    m.print_variable() # prints 10
I don't know if I can easily move Second's methods into a sub-namespace, like Main.second.set_variable(). I would have to explicitly set a Main reference within Second after Main constructs it as a variable (in Main's __init__, self.second = Second()) and not have Second inherited by Main. Then you could call m.second.set_main_variable(10), and keep all of Second's methods accessible from Main under the .second namespace.
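For comparison, the sub-namespace variant could look roughly like the sketch below. It is an illustration only: both classes live in one file here, Second takes the Main reference as a constructor argument (an assumption, not the layout from the post), and this form gives up the optional import.

```python
class Second(object):
    def __init__(self, main):
        self._main = main  # back-reference to the owning Main instance

    def set_main_variable(self, var):
        self._main._variable = var


class Main(object):
    def __init__(self):
        self._variable = 0
        # Assigned in __init__ so Second's methods hang off m.second.
        self.second = Second(self)

    def print_variable(self):
        print(self._variable)


m = Main()
m.second.set_main_variable(10)

if __name__ == "__main__":
    m.print_variable()
```

Whether a given IDE completes m.second in this form depends on its static analysis, so this is worth trying in PyDev before committing to the design.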
First, Simple is better than complex -- your design appears to be complicated for no good reason. Maybe you should provide some details about your actual library.
Second, with Python being a dynamically typed language, it's only natural that there are cases where your IDE fails to auto-complete, because your IDE has a static perspective. And you definitely shouldn't program to suit the capabilities of your IDE.