I sometimes would like to apply small fixes to small issues, just for testing or for internal purposes.
For example I would like to just replace self.builder.current_docname with node.source in sphinx.writers.html5.HTML5Translator.visit_literal_block.
The (not so elegant) way of doing it is to copy-paste the method where the patch should be applied, modify the tiny thing I have to modify, then override the original method with the new one. In this case I would need a lot of imports and I would have to copy/paste the whole method locally (bad for SSOT).
Instead I would like to write something like:
from monkey import ReMonkeyPatcher  # Fake...
from sphinx.writers.html5 import HTML5Translator

# Override the class's method
ReMonkeyPatcher(HTML5Translator, 'self.builder.current_docname', 'node.source')
# Profit...
I've read here that the reflective nature of Python allows for run-time hacks with inspect and ast. So I wrote this:
import ast
import inspect
import textwrap

def monkey_replace(function, search, replace):
    code = inspect.getsource(function)
    code = textwrap.dedent(code)           # strip the method's indentation
    code = code.replace(search, replace)
    ast_tree = ast.parse(code)             # sanity-check the patched source
    compiled = compile(ast_tree, '<string>', mode='exec')
    mod = {}
    exec(compiled, mod)                    # <-- Here is the issue...
    method = mod[function.__name__]
    replace_function(function, method)     # Doesn't know how to do that yet
The main issue is exec(compiled, mod), where the context is missing. To work in most cases, the code has to be executed in the context of the original function's module (imports...).
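The closest workaround I can think of is reusing the function's own module globals, so its imports are visible (a sketch, still not what I'd call elegant):

namespace = dict(function.__globals__)  # globals of the module that defined the function
exec(compiled, namespace)
method = namespace[function.__name__]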
Is there an elegant way of doing it?
In Python monkey-patching is extremely easy. As it says in the post you reference:
Because there are no protected variables, you can simply overwrite the function you would like to patch:
HTML5Translator.visit_literal_block = my_patched_visit_literal_block  # your modified copy of the method
Related
I want to define a bunch of config variables that can be imported in all the modules in my project. The values of those variables will be constant during runtime but are not known before runtime; they depend on the input. Usually I'd define a dict in my top module which would be passed to all functions and classes from other modules; however, I was thinking it may be cleaner to simply create a blank config.py module which would be dynamically filled with config variables by the top module:
# top.py
import config
config.x = x
# config.py
x = None
# other.py
import config
print(config.x)
I like this approach because I don't have to save the parameters as attributes of classes in my other modules, which makes sense to me because parameters do not describe the classes themselves.
This works but is it considered bad practice?
The question as such may be disputed, but I would generally say yes, it's "bad practice", because the scope and impact of a change get really blurred. Note that the use case you're describing is not really about sharing configuration, but about different parts of the program (functions, objects, modules) exchanging data, and as such it's a bit of a variation on (meta) global variables.
Reading common configuration values could be fine, but changing them along the way... you may lose track of what happened where, and in which order, as modules get imported / values get modified. For instance, assume the config.py above and two modules, m1.py:
import config
print(config.x)
config.x=1
and m2.py:
import config
print(config.x)
config.x=2
and a main.py that just does:
import m1
import m2
import config
print(config.x)
or:
import m2
import m1
import config
print(config.x)
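For illustration (my own trace of these imports): the first main.py prints None, then 1, then 2, while the second prints None, then 2, then 1, because each module sees whatever value the previously imported module left behind.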
The state in which you find config in each module (and really in any other, incl. main.py here) depends on the order in which imports have occurred and on who assigned what value when. Even for a program entirely under your control, this may get confusing (and become a source of mistakes) rather quickly.
For runtime data and for passing information between objects and modules (and your example really is that, not configuration that is predefined and shared between modules), I would suggest you look into describing the information in, perhaps, a custom state (config) object and passing it around through an appropriate interface. But really, just a function / method argument may be all that is needed. The exact form depends on what exactly you're trying to achieve and what your overall design is.
In your example, other.py behaves differently when called or imported before top.py, which may still seem obvious and manageable in a minimal example, but really is not very sound design. Anyone reading the code (incl. future you) should be able to follow its logic, and this IMO breaks its flow.
The most trivial (and procedural) example of what you've described (and of which I now hopefully have a better grasp) would be an other.py recreating your current behavior:
def do_stuff(value):
    print(value)  # We did something useful here

if __name__ == "__main__":
    do_stuff(None)  # Could also use config with defaults
And your top.py, presumably being the entry point and orchestrating importing and execution, doing:
import other
x = get_the_value()
other.do_stuff(x)
You can of course introduce an interface to configure do_stuff: perhaps a dict, or a custom class, even with a default implementation in config.py:
# config.py
class Params:
    def __init__(self, x=None):
        self.x = x
and your other.py:
import config

def do_stuff(params=config.Params()):  # note: the default is created once, at import time
    print(params.x)  # We did something useful here
And in your top.py you can use:
import config
import other
params = config.Params(get_the_value())
other.do_stuff(params)
But you could also have any use-case-specific source of value(s):
class TopParams:
    def __init__(self, url):
        self.x = get_value_from_url(url)

params = TopParams("https://example.com/value-source")
other.do_stuff(params)
x could even be a property which you retrieve every time you access it... or lazily when needed and then cached... Again, it really then is a matter of what you need to do.
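For instance, a lazily computed and cached x could look like this (a sketch using functools.cached_property, with get_value_from_url as the hypothetical helper from the example above):

from functools import cached_property

class TopParams:
    def __init__(self, url):
        self._url = url

    @cached_property
    def x(self):
        # computed on first access, then cached on the instance
        return get_value_from_url(self._url)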
"Is it bad practice to modify attributes of one module from another module?"
Yes, it is considered bad practice: a violation of the Law of Demeter, which in fact means "talk to friends, not to strangers".
Objects should expose behaviour and functions, but should HIDE the data.
DataStructures should EXPOSE data, but should not have any (exposed) methods. The Law of Demeter does not apply to such DataStructures. OOP purists might cover such DataStructures with setters and getters, but that really adds no value in Python.
There is a lot of literature about that, for example: https://en.wikipedia.org/wiki/Law_of_Demeter
And of course, a must-read: "Clean Code" by Robert C. Martin (Uncle Bob); check him out on YouTube as well.
For procedural programming it is perfectly normal to keep data in a DataStructure which does not have any (exposed) methods.
The procedures in the program work with that data. Consider using the attrs module (see https://www.attrs.org/en/stable/) for easy creation of such classes.
My preferred method for keeping config is (here without using attrs):
# conf_xy.py
"""
Config is code - so why use damned parsers, text files, XML, YAML, TOML and all that,
if you can just use testable code as config that can deliver the correct types, etc.,
as well as hinting in your favorite IDE?

Here, for demonstration, without using the attrs package - usually I use attrs (read the docs).
"""

class ConfXY(object):
    def __init__(self) -> None:
        self.x: int = 1
        self.z: float = get_z_from_input()
        ...

conf_xy = ConfXY()
# other.py
from conf_xy import conf_xy
...
y = conf_xy.x * 2
...
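For comparison, a minimal sketch of the same config class written with attrs (my sketch; assumes a recent attrs version, with get_z_from_input as above):

# conf_xy.py - attrs variant
from attrs import define, field

@define
class ConfXY:
    x: int = 1
    z: float = field(factory=get_z_from_input)  # computed when the instance is created

conf_xy = ConfXY()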
I found the following code snippet that I can't seem to make work for my scenario (or any scenario at all):
def load(code):
    # Delete all local variables
    globals()['code'] = code
    del locals()['code']

    # Run the code
    exec(globals()['code'])

    # Delete any global variables we've added
    del globals()['load']
    del globals()['code']

    # Copy k so we can use it
    if 'k' in locals():
        globals()['k'] = locals()['k']
        del locals()['k']

    # Copy the rest of the variables
    for k in locals().keys():
        globals()[k] = locals()[k]
I created a file called dynamic_module.py and put this code in it. I then used it to try to execute the following code, which is a placeholder for some dynamically created string I would like to execute:
import random
import datetime

class MyClass(object):
    def main(self, a, b):
        r = random.Random(datetime.datetime.now().microsecond)
        a = r.randint(a, b)
        return a
Then I tried executing the following:
import dynamic_module
dynamic_module.load(code_string)
return_value = dynamic_module.MyClass().main(1,100)
When this runs it should return a random number between 1 and 100. However, I can't seem to get the initial snippet I found to work for even the simplest of code strings. I think part of my confusion is that I may misunderstand how globals and locals work, and therefore how to properly fix the problems I'm encountering. I need the code string to use its own imports and variables and not have access to those of the code it is run from, which is the reason I am going through this somewhat over-complicated method.
You should not be using the code you found. It has several big problems, not least that most of it doesn't actually do anything: locals() is a proxy, so deleting from it has no effect on the actual locals; it puts any code you execute in the same shared globals; etc.
Use the accepted answer in that post instead; recast as a function, it becomes:
import imp  # note: deprecated, and removed in Python 3.12 (see below)

def load_module_from_string(code, name='dynamic_module'):
    module = imp.new_module(name)
    exec(code, module.__dict__)
    return module
then just use that:
dynamic_module = load_module_from_string(code_string)
return_value = dynamic_module.MyClass().main(1, 100)
The function produces a new, clean module object.
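On modern Python, where imp is gone, the same idea works with types.ModuleType; a minimal sketch:

import types

def load_module_from_string(code, name='dynamic_module'):
    module = types.ModuleType(name)
    exec(code, module.__dict__)
    return module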
In general, this is not how you should dynamically import and use external modules. You should be using __import__ within your function to do this. Here's a simple example that worked for me:
import numpy as np

plt = __import__('matplotlib.pyplot', fromlist=['plt'])
plt.plot(np.arange(5), np.arange(5))
plt.show()
I imagine that for your specific application (loading from code string) it would be much easier to save the dynamically generated code string to a file (in a folder containing an __init__.py file) and then to call it using __import__. Then you could access all variables and functions of the code as parts of the imported module.
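A sketch of that approach (my own hypothetical names; assumes the current directory is on sys.path):

import os

pkg = "generated"  # package folder for the generated code
os.makedirs(pkg, exist_ok=True)
open(os.path.join(pkg, "__init__.py"), "a").close()  # make it a package
with open(os.path.join(pkg, "dynmod.py"), "w") as f:
    f.write(code_string)

dynmod = __import__("generated.dynmod", fromlist=["dynmod"])
return_value = dynmod.MyClass().main(1, 100)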
Unless I'm missing something?
Introduction
PyDev is a great Eclipse plugin that lets us write Python code easily.
It can even give autocompletion suggestions when I do this:
from package.module import Random_Class
x = Random_Class()
x. # the autocompletion will be popped up,
# showing every method & attribute inside Random_Class
That is great!!!
The Problem (And My Question)
However, when I don't use an explicit import and use __import__ instead, for example, I don't get the same autocompletion effect.
import sys

import_location = ".".join(('package', 'module'))
__import__(import_location, globals(), locals(), ['*'])
My_Class = getattr(sys.modules[import_location], 'Random_Class')
x = My_Class()
x.  # I expect autocompletion, but nothing happens
Question: is there any way (in PyDev or any IDE) to make the second one also show autocompletion?
Why do I need to do this?
Well, I'm making a simple MVC framework, and I want to provide something like load_model, load_controller, and load_view that still work with autocompletion (or at least can be made to work).
So, instead of letting users do this (although I don't forbid them to do so):
from applications.the_application.models.the_model import The_Model
x = The_Model()
x. # autocompletion works
I want to let users do this:
x = load_model('the_application', 'the_model')()
x. # autocompletion still works
The "applications" part is actually configurable by another script, and I don't want users to change all of their importing model/controller part everytime they change the configuration. Plus, I think load_model, load_controller, and load_view make MVC pattern shown more obvious.
Unexpected Answer
I know some tricks, such as doing this (as people do with web2py):
import sys

import_location = ".".join(('package', 'module'))
__import__(import_location, globals(), locals(), ['*'])
My_Class = getattr(sys.modules[import_location], 'Random_Class')
x = My_Class()

if 0:
    from package.module import Random_Class
    x = Random_Class()

x.  # Now autocompletion is going to work
and I don't want to have to do this, since it only adds unnecessary extra work.
I don't expect any "don't try to be clever" comments. I have enough of them.
I don't expect "dynamic import is evil" comments. I'm not a purist.
I don't expect any "just use Django, or Pylons, or whatever" comments. Such comments are completely unrelated to my question.
I have done this before. This may be slightly different from your intended method, so let me know if it doesn't apply.
I dynamically import different modules that all subclass a master class, using similar code to your example. Because the subclassing module already imports the master, I don't need to import it in the main module.
To get highlighting, the solution was to import the master class into the main module first, even though it wasn't used directly. In my case it was a good fallback if the particular subclass didn't exist, but that's an implementation detail.
This only works if your classes all inherit from one parent.
Not really an answer to my own question. However, I can change the approach: instead of providing load_model(), I can use a relative import. Something like this:
from ..models.my_model import Model_Class as Great_Model
m = Great_Model()
I have a module that I need to test in python.
I'm using the unittest framework but I ran into a problem.
The module has some method definitions, one of which is used when it's imported (readConfiguration) like so:
.
.
.
def readConfiguration(file="default.xml"):
    # do some reading from xml
    ...

readConfiguration()
This is a problem because when I try to import the module, it also runs readConfiguration(), which fails the module and the program (a configuration file does not exist in the test environment).
I'd like to be able to test the module independent of any configuration files.
I didn't write the module and it cannot be re-factored.
I know I can include a dummy configuration file but I'm looking for a "cleaner", more elegant, solution.
As commenters have already pointed out, imports should never have side effects, so try to get the module changed if at all possible.
If you really, absolutely, cannot do this, there might be another way: let readConfiguration() be called, but stub out its dependencies. For instance, if it uses the builtin open() function, you could mock that, as demonstrated in the mock documentation:
>>> from unittest.mock import MagicMock, patch, sentinel
>>> mock = MagicMock(return_value=sentinel.file_handle)
>>> with patch('builtins.open', mock):
...     import the_broken_module
...     # do your testing here
Replace sentinel.file_handle with StringIO("<contents of mock config file>") if you need to supply actual content.
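For example, my own minimal variant of the above with real content:

>>> from io import StringIO
>>> mock = MagicMock(return_value=StringIO("<config><!-- mock values --></config>"))
>>> with patch('builtins.open', mock):
...     import the_broken_module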
It's brittle as it depends on the implementation of readConfiguration(), but if there really is no other way, it might be useful as a last resort.
Imagine you have an IO-heavy function like this:
from hashlib import md5

def getMd5Sum(path):
    with open(path, 'rb') as f:  # binary mode: md5 expects bytes
        return md5(f.read()).hexdigest()
Do you think Python is flexible enough to allow code like this (notice the $):
def someGuiCallback(filebutton):
    ...
    path = filebutton.getPath()
    md5sum = $getMd5Sum(path)
    showNotification("Md5Sum of file: %s" % md5sum)
    ...
To be executed something like this:
def someGuiCallback_1(filebutton):
    ...
    path = filebutton.getPath()
    Thread(target=someGuiCallback_2, args=(path,)).start()

def someGuiCallback_2(path):
    md5sum = getMd5Sum(path)
    glib.idle_add(someGuiCallback_3, md5sum)

def someGuiCallback_3(md5sum):
    showNotification("Md5Sum of file: %s" % md5sum)
    ...
(glib.idle_add just pushes a function onto the queue of the main thread)
I've thought about using decorators, but they don't give me access to the 'content' of the function after the call (the showNotification part).
I guess I could write a 'compiler' to change the code before execution, but it doesn't seem like the optimal solution.
Do you have any ideas, on how to do something like the above?
You can use import hooks to achieve this goal...
PEP 302 - New Import Hooks
PEP 369 - Post Import Hooks
... but I'd personally view it as a little bit nasty.
If you want to go down that route though, essentially what you'd be doing is this:
You add an import hook for an extension (eg ".thpy")
That import hook is then responsible for (essentially) passing some valid code as a result of the import.
That valid code is given arguments that effectively relate to the file you're importing.
That means your precompiler can perform whatever transformations you like to the source on the way in (see the sketch at the end of this answer).
On the downside:
Whilst using import hooks in this way will work, it will surprise the life out of any maintainer of your code. (Bad idea IMO)
The way you do this relies upon imputil, which has been removed in Python 3.0, which means code written this way has a limited lifetime.
Personally I wouldn't go there, but if you do, there's an issue of the Python Magazine where doing this sort of thing is covered in some detail, and I'd advise getting a back issue of that to read up on it. (Written by Paul McGuire, April 2009 issue, probably available as PDF).
Specifically, that uses imputil and pyparsing as its example, but the principle is the same.
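For what it's worth, on Python 3 the same approach would go through importlib instead of imputil; a minimal sketch (transpile and the ".thpy" file lookup are my own hypothetical stand-ins):

import importlib.abc
import importlib.util
import os
import sys

class TranspilingLoader(importlib.abc.SourceLoader):
    def __init__(self, path):
        self._path = path

    def get_filename(self, fullname):
        return self._path

    def get_data(self, path):
        # Run the precompiler over the source before Python compiles it.
        with open(path, encoding="utf-8") as f:
            return transpile(f.read()).encode("utf-8")  # hypothetical precompiler

class ThpyFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        candidate = fullname + ".thpy"  # simplistic lookup relative to the CWD
        if os.path.exists(candidate):
            return importlib.util.spec_from_loader(
                fullname, TranspilingLoader(candidate))
        return None  # let the normal import machinery handle everything else

sys.meta_path.insert(0, ThpyFinder())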
How about something like this:
from threading import Thread

def performAsync(asyncFunc, notifyFunc):
    def threadProc():
        retValue = asyncFunc()
        glib.idle_add(notifyFunc, retValue)
    Thread(target=threadProc).start()

def someGuiCallback(filebutton):
    path = filebutton.getPath()
    performAsync(
        lambda: getMd5Sum(path),
        lambda md5sum: showNotification("Md5Sum of file: %s" % md5sum)
    )
A bit ugly with the lambdas, but it's simple and probably more readable than using precompiler tricks.
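If the lambdas bother you, functools.partial covers the first one, with a small named function for the formatting (my own variation on the above):

from functools import partial

def notifyMd5Sum(md5sum):
    showNotification("Md5Sum of file: %s" % md5sum)

performAsync(partial(getMd5Sum, path), notifyMd5Sum)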
Sure, you can access a function's (already compiled) code from a decorator, disassemble it and hack it. You can even access the source of the module it's defined in and recompile it. But I think this is not necessary. Below is an example using a decorated generator, where the yield statement serves as a delimiter between the synchronous and asynchronous parts:
from threading import Thread
import hashlib

def asynchronous(gen):  # 'async' is a reserved word in Python 3, hence the name
    def func(*args, **kwargs):
        it = gen(*args, **kwargs)
        result = next(it)  # run the synchronous part, up to the first yield
        Thread(target=lambda: list(it)).start()  # drive the rest in a thread
        return result
    return func

@asynchronous
def test(text):
    # synchronous part (empty in this example)
    yield  # Use "yield value" if you need to return a meaningful value
    # asynchronous part[s]
    digest = hashlib.md5(text.encode()).hexdigest()
    print(digest)
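Usage (my own example): the call returns at the first yield, and the digest is printed from the worker thread shortly after.

test("hello world")  # returns None immediately; the md5 hex digest prints asynchronously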