The docs at http://eventlet.net/doc/patching.htm say "If no arguments are specified, everything is patched." and "thread, which patches thread, threading, and Queue".
But with a simple test:
#!/bin/env python
import threading
import eventlet

eventlet.monkey_patch()

if __name__ == '__main__':
    patched = eventlet.patcher.is_monkey_patched(threading)
    print('patched : %s' % patched)
The result is:
patched : False
It seems like threading is not patched at all. Is the doc wrong?
I found that the doc is right. The problem is with is_monkey_patched(): it cannot detect patching in some situations, such as the threading and Queue modules. Take a look at the source of this function and the behaviour is easy to understand.
def _green_thread_modules():
    from eventlet.green import Queue
    from eventlet.green import thread
    from eventlet.green import threading
    if six.PY2:
        return [('Queue', Queue), ('thread', thread), ('threading', threading)]
    if six.PY3:
        return [('queue', Queue), ('_thread', thread), ('threading', threading)]

if on['thread'] and not already_patched.get('thread'):
    modules_to_patch += _green_thread_modules()
    already_patched['thread'] = True
def is_monkey_patched(module):
    """Returns True if the given module is monkeypatched currently, False if
    not. *module* can be either the module itself or its name.

    Based entirely off the name of the module, so if you import a
    module some other way than with the import keyword (including
    import_patched), this might not be correct about that particular
    module."""
    return module in already_patched or \
        getattr(module, '__name__', None) in already_patched
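In other words, is_monkey_patched() only consults the already_patched dictionary, which is keyed by the feature names passed to monkey_patch() (such as 'thread' or 'socket'), not by every module those features patch. A quick interactive check (my own sketch, inferred from the code above) shows the asymmetry:

>>> import eventlet
>>> eventlet.monkey_patch()
>>> eventlet.patcher.is_monkey_patched('thread')
True
>>> eventlet.patcher.is_monkey_patched('threading')
False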
And because the patch operation is implemented like this:
for name, mod in modules_to_patch:
    orig_mod = sys.modules.get(name)
    if orig_mod is None:
        orig_mod = __import__(name)
    for attr_name in mod.__patched__:
        patched_attr = getattr(mod, attr_name, None)
        if patched_attr is not None:
            setattr(orig_mod, attr_name, patched_attr)
We can check whether a module like threading/Queue is patched by using:
>>> import threading
>>> eventlet.monkey_patch()
>>> threading.current_thread.__module__
'eventlet.green.threading'
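If you want something slightly more general than spot-checking one attribute, a small helper along the same lines is easy to write. This is my own sketch, not part of eventlet; it just generalises the __module__ check above:

def looks_green(module, attr_name):
    """Rough check: does the given attribute now come from eventlet.green?"""
    attr = getattr(module, attr_name, None)
    origin = getattr(attr, '__module__', None) or ''
    return origin.startswith('eventlet.green')

import threading
import eventlet
eventlet.monkey_patch()
print(looks_green(threading, 'current_thread'))  # True once threading is patched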
Does the interpreter somehow keep a timestamp of when a module is imported? Or is there an easy way of hooking into the import machinery to do this?
The scenario is a long-running Python process that at various points imports user-provided modules. I would like the process to be able to check "should I restart to load the latest code changes?" by checking the module file's timestamps against the time the module was imported.
Here's a way to automatically have an attribute (named _loadtime in the example code below) added to modules when they're imported. The code is based on Recipe 10.12 titled "Patching Modules on Import" in the book Python Cookbook, by David Beazley and Brian Jones, O'Reilly, 2013, which shows a technique that I adapted to do what you want.
For testing purposes I created this trivial target_module.py file:
print('in target_module')
Here's the example code:
import importlib
import sys
import time

class PostImportFinder:
    def __init__(self):
        self._skip = set()  # To prevent recursion.

    def find_module(self, fullname, path=None):
        if fullname in self._skip:  # Prevent recursion
            return None
        self._skip.add(fullname)
        return PostImportLoader(self)

class PostImportLoader:
    def __init__(self, finder):
        self._finder = finder

    def load_module(self, fullname):
        importlib.import_module(fullname)
        module = sys.modules[fullname]
        # Add a custom attribute to the module object.
        module._loadtime = time.time()
        self._finder._skip.remove(fullname)
        return module

sys.meta_path.insert(0, PostImportFinder())

if __name__ == '__main__':
    import time
    try:
        print('importing target_module')
        import target_module
    except Exception as e:
        print('Import failed:', e)
        raise

    loadtime = time.localtime(target_module._loadtime)
    print('module loadtime: {} ({})'.format(
        target_module._loadtime,
        time.strftime('%Y-%b-%d %H:%M:%S', loadtime)))
Sample output:
importing target_module
in target_module
module loadtime: 1604683023.2491636 (2020-Nov-06 09:17:03)
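With the _loadtime attribute in place, the "should I restart?" check from the question becomes a simple mtime comparison. A minimal sketch, assuming target_module was loaded from a plain .py file whose path is available via __file__:

import os.path

def module_is_stale(module):
    # True if the source file changed after the module was imported.
    return os.path.getmtime(module.__file__) > module._loadtime

if module_is_stale(target_module):
    print('target_module changed on disk; consider restarting')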
I don't think there's any way to get around how hacky this is, but how about something like this every time you import? (I don't know exactly how you're importing):
import time
from types import ModuleType

# create a dictionary to keep track
# filter globals to exclude things that aren't modules and aren't builtins
MODULE_TIMES = {k: None for k, v in globals().items()
                if not k.startswith("__") and not k.endswith("__")
                and type(v) == ModuleType}

for module_name in user_module_list:  # the user-provided module names
    MODULE_TIMES[module_name] = time.time()
    exec("import {0}".format(module_name))  # exec, not eval: import is a statement
And then you can reference this dictionary in a similar way later.
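For completeness, here is one way that later reference could look. This is only a sketch under the same assumptions as above: each tracked name is importable, present in sys.modules, and exposes a usable __file__:

import os.path
import sys

def stale_modules(module_times):
    # Names whose source file is newer than the recorded import time.
    stale = []
    for name, imported_at in module_times.items():
        mod = sys.modules.get(name)
        source = getattr(mod, '__file__', None)
        if imported_at is None or source is None:
            continue
        if os.path.getmtime(source) > imported_at:
            stale.append(name)
    return stale

if stale_modules(MODULE_TIMES):
    print('user code changed on disk; restart to pick up the latest versions')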
Say I have two files:
# spam.py
import library_Python3_only as l3

def spam(x, y):
    return l3.bar(x).baz(y)
and
# beans.py
import library_Python2_only as l2
...
Now suppose I wish to call spam from within beans. It's not directly possible since both files depend on incompatible Python versions. Of course I can Popen a different python process, but how could I pass in the arguments and retrieve the results without too much stream-parsing pain?
Here is a complete example implementation using subprocess and pickle that I actually tested. Note that you need to use protocol version 2 explicitly for pickling on the Python 3 side (at least for the combo Python 3.5.2 and Python 2.7.3).
# py3bridge.py

import sys
import pickle
import importlib
import io
import traceback
import subprocess

class Py3Wrapper(object):
    def __init__(self, mod_name, func_name):
        self.mod_name = mod_name
        self.func_name = func_name

    def __call__(self, *args, **kwargs):
        p = subprocess.Popen(['python3', '-m', 'py3bridge',
                              self.mod_name, self.func_name],
                             stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE)
        stdout, _ = p.communicate(pickle.dumps((args, kwargs)))
        data = pickle.loads(stdout)
        if data['success']:
            return data['result']
        else:
            raise Exception(data['stacktrace'])

def main():
    try:
        target_module = sys.argv[1]
        target_function = sys.argv[2]
        args, kwargs = pickle.load(sys.stdin.buffer)
        mod = importlib.import_module(target_module)
        func = getattr(mod, target_function)
        result = func(*args, **kwargs)
        data = dict(success=True, result=result)
    except Exception:
        st = io.StringIO()
        traceback.print_exc(file=st)
        data = dict(success=False, stacktrace=st.getvalue())
    pickle.dump(data, sys.stdout.buffer, 2)

if __name__ == '__main__':
    main()
The Python 3 module (using the pathlib module as a showcase):
# spam.py
import pathlib

def listdir(p):
    return [str(c) for c in pathlib.Path(p).iterdir()]
The Python 2 module using spam.listdir
# beans.py
import py3bridge
delegate = py3bridge.Py3Wrapper('spam', 'listdir')
py3result = delegate('.')
print py3result
Assuming the caller is Python 3.5+, you have access to a nicer subprocess module. Perhaps you could use subprocess.run, and communicate via pickled Python objects sent through stdin and stdout, respectively. There would be some setup to do, but no parsing on your side, or mucking with strings etc.
Here's an example of Python2 code via subprocess.Popen
p = subprocess.Popen(python3_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
stdout, stderr = p.communicate(pickle.dumps(python3_args))
result = pickle.loads(stdout)  # loads, not load: communicate() returns a bytes object
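For reference, the subprocess.run variant mentioned above might look something like the sketch below. The worker script name and the payload are placeholders; the only real requirement is pickling with protocol 2 so the Python 2 side can read it:

import pickle
import subprocess

args, kwargs = (1, 2), {}  # whatever the remote function expects

proc = subprocess.run(
    ['python2', 'py2_worker.py'],                    # hypothetical Python 2 entry point
    input=pickle.dumps((args, kwargs), protocol=2),  # protocol 2 so Python 2 can unpickle it
    stdout=subprocess.PIPE,
    check=True)
result = pickle.loads(proc.stdout)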
You could create a simple script like this:
import sys
import my_wrapped_module
import json
params = sys.argv
script = params.pop(0)
function = params.pop(0)
print(json.dumps(getattr(my_wrapped_module, function)(*params)))
You'll be able to call it like this:
pythonx.x wrapper.py myfunction param1 param2
This is obviously a security hazard though, so be careful.
Also note that if your params are anything other than strings or integers, you'll have some issues, so consider transmitting the params as a single JSON string and converting them with json.loads() in the wrapper.
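The calling side could then look roughly like this. It is only a sketch: 'python3', wrapper.py and myfunction are placeholders, and it assumes the wrapper decodes its single JSON argument with json.loads() as suggested above:

import json
import subprocess

params = json.dumps([1, 2])  # all parameters packed into one JSON string
out = subprocess.check_output(['python3', 'wrapper.py', 'myfunction', params])
result = json.loads(out.decode())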
It's possible to use the multiprocessing.managers module to achieve what you want. It does require a small amount of hacking though.
Given a module that has functions you want to expose, you need to create a Manager that can create proxies for those functions.
The manager process that serves proxies to the py3 functions:
from multiprocessing.managers import BaseManager
import spam

class SpamManager(BaseManager):
    pass

# Register a way of getting the spam module.
# You can use the exposed arg to control what is exposed.
# By default only "public" functions (without a leading underscore) are exposed,
# but can only ever expose functions or methods.
SpamManager.register("get_spam", callable=(lambda: spam), exposed=["add", "sub"])

# specifying the address as localhost means the manager is only visible to
# processes on this machine
manager = SpamManager(address=('localhost', 50000), authkey=b'abc',
                      serializer='xmlrpclib')
server = manager.get_server()
server.serve_forever()
I've redefined spam to contain two functions called add and sub.
# spam.py
def add(x, y):
    return x + y

def sub(x, y):
    return x - y
The client process that uses the py3 functions exposed by the SpamManager:
from __future__ import print_function
from multiprocessing.managers import BaseManager

class SpamManager(BaseManager):
    pass

SpamManager.register("get_spam")

m = SpamManager(address=('localhost', 50000), authkey=b'abc',
                serializer='xmlrpclib')
m.connect()

spam = m.get_spam()
print("1 + 2 = ", spam.add(1, 2))  # prints 1 + 2 = 3
print("1 - 2 = ", spam.sub(1, 2))  # prints 1 - 2 = -1
spam.__name__  # AttributeError -- spam is a module, but its __name__ attribute
               # is not exposed
Once set up, this form gives an easy way of accessing functions and values. It also allows these functions and values to be used in a similar way to how you might use them if they were not proxies. Finally, it allows you to set a password on the server process so that only authorised processes can access the manager. Because the manager is long running, a new process does not have to be started for each function call you make.
One limitation is that I've used the xmlrpclib module rather than pickle to send data back and forth between the server and the client. This is because python2 and python3 use different protocols for pickle. You could fix this by adding your own client to multiprocessing.managers.listener_client that uses an agreed upon protocol for pickling objects.
I'm writing unit tests to validate my project functionalities. I need to replace some of the functions with mock functions and I thought to use the Python mock library. The implementation I used doesn't seem to work properly though, and I don't understand where I'm going wrong. Here is a simplified scenario:
root/connector.py
from ftp_utils.py import *

def main():
    config = yaml.safe_load("vendor_sftp.yaml")
    downloaded_files = []
    downloaded_files = get_files(config)
    for f in downloaded_files:
        # do something
root/utils/ftp_utils.py
import os
import sys
import pysftp

def get_files(config):
    sftp = pysftp.Connection(config['host'], username=config['username'])
    sftp.chdir(config['remote_dir'])
    down_files = sftp.listdir()
    if down_files is not None:
        for f in down_files:
            sftp.get(f, os.path.join(config['local_dir'], f), preserve_mtime=True)
    return down_files
root/tests/connector_tester.py
import unittest
import mock
import ftp_utils
import connector

def get_mock_files():
    return ['digital_spend.csv', 'tv_spend.csv']

class ConnectorTester(unittest.TestCase):
    @mock.patch('ftp_utils.get_files', side_effect=get_mock_files)
    def test_main_process(self, get_mock_files_function):
        # I want to use a mock version of the get_files function
        connector.main()
When I debug my test I expect that the get_files function called inside the main of connector.py is get_mock_files(), but instead it is ftp_utils.get_files(). What am I doing wrong here? What should I change in my code to properly call the get_mock_files() mock?
Thanks,
Alessio
I think there are several problems with your scenario:
connector.py cannot import from ftp_utils.py that way
nor can connector_tester.py
as a habit, it is better to name your test files test_xxx.py
to use unittest with patching, see this example
In general, try to provide working minimal examples so that it is easier for everyone to run your code.
I modified your example rather heavily to make it work, but basically the problem is that you patch 'ftp_utils.get_files', while that is not the reference actually called inside connector.main(); it is probably 'connector.get_files' instead.
Here is the modified example's directory:
test_connector.py
ftp_utils.py
connector.py
test_connector.py:
import unittest
import sys
import mock
import connector

def get_mock_files(*args, **kwargs):
    return ['digital_spend.csv', 'tv_spend.csv']

class ConnectorTester(unittest.TestCase):
    def setUp(self):
        self.patcher = mock.patch('connector.get_files', side_effect=get_mock_files)
        self.patcher.start()
        self.addCleanup(self.patcher.stop)  # make sure the patch is removed after each test

    def test_main_process(self):
        # I want to use a mock version of the get_files function
        connector.main()

suite = unittest.TestLoader().loadTestsFromTestCase(ConnectorTester)

if __name__ == "__main__":
    unittest.main()
NB: what is called when running connector.main() is 'connector.get_files'
connector.py:
from ftp_utils import *
def main():
config = None
downloaded_files = []
downloaded_files = get_files(config)
for f in downloaded_files:
print(f)
connector/ftp_utils.py unchanged.
I know that if I import a module by name with import(moduleName), then I can reload it with reload(moduleName).
But, I am importing a bunch of modules with a Kleene star:
from proj import *
How can I reload them in this case?
I think there's a way to reload all Python modules. The code for Python 2.7 is listed below; instead of importing the math module with an asterisk, you can import whatever you need.
from math import *
from sys import *

Alfa = modules.keys()
modules.clear()
for elem in Alfa:
    str = 'from ' + elem + ' import *'
    try:
        exec(str)
    except:
        pass
This is a complicated and confusing issue. The method I give will reload the module and refresh the variables in the given context. However, this method will fall over if you have multiple modules using a starred import on the given module, as they will retain their original values instead of updating. In general, even having to reload a module is something you shouldn't be doing, with the exception of working in a REPL. Modules aren't something that should be dynamic. You should consider other ways of providing the updates you need.
import sys

def reload_starred(module_name, context):
    # Accept either a namespace dict or the name of a module whose
    # namespace should be updated.
    if not isinstance(context, dict):
        context = vars(sys.modules[context])

    module = sys.modules[module_name]
    for name in get_public_module_variables(module):
        try:
            del context[name]
        except KeyError:
            pass

    module = reload(module)
    context.update(get_public_module_variables(module))

def get_public_module_variables(module):
    if hasattr(module, "__all__"):
        return dict((k, v) for (k, v) in vars(module).items()
                    if k in module.__all__)
    else:
        return dict((k, v) for (k, v) in vars(module).items()
                    if not k.startswith("_"))
Usage:
reload_starred("my_module", __name__)
reload_starred("my_module", globals())
reload_starred("my_module", "another_module")
def function():
    from my_module import *
    ...
    reload_starred("my_module", locals())
I have a module that imports fine (I print it at the top of the module that uses it):
from authorize import cim
print cim
Which produces:
<module 'authorize.cim' from '.../dist-packages/authorize/cim.pyc'>
However, later in a method call, it has mysteriously turned into None:
class MyClass(object):
    def download(self):
        print cim
which, when run, shows that cim is None. The module isn't ever directly assigned to None anywhere in this module.
Any ideas how this can happen?
As you comment yourself, it is likely that some code is assigning None to the "cim" name in your module itself. The way to check for this would be to make your large module "read only" for other modules -- I think Python allows for this --
(20 min. of hacking) --
Here -- just put this snippet in a "protect_module.py" file, import it, and call
"ProtectedModule()" at the end of the module in which the name "cim" is vanishing -
it should give you the culprit:
"""
Protects a Module against naive monkey patching -
may be usefull for debugging large projects where global
variables change without notice.
Just call the "ProtectedModule" class, with no parameters from the end of
the module definition you want to protect, and subsequent assignments to it
should fail.
"""
from types import ModuleType
from inspect import currentframe, getmodule
import sys
class ProtectedModule(ModuleType):
def __init__(self, module=None):
if module is None:
module = getmodule(currentframe(1))
ModuleType.__init__(self, module.__name__, module.__doc__)
self.__dict__.update(module.__dict__)
sys.modules[self.__name__] = self
def __setattr__(self, attr, value):
frame = currentframe(1)
raise ValueError("Attempt to monkey patch module %s from %s, line %d" %
(self.__name__, frame.f_code.co_filename, frame.f_lineno))
if __name__ == "__main__":
from xml.etree import ElementTree as ET
ET = ProtectedModule(ET)
print dir(ET)
ET.bla = 10
print ET.bla
In my case, this was related to threading quirks: https://docs.python.org/2/library/threading.html#importing-in-threaded-code