How do I pass a decorator's function into a job?
I have a decorator that would run a job using the function.
@job
def queueFunction(passedFunction, *args, **kwargs):
    # Do some stuff
    passedFunction(*args, **kwargs)
def myDecorator(async=True):
    def wrapper(function):
        def wrappedFunc(*args, **kwargs):
            data = DEFAULT_DATA
            if async:
                queueFunction.delay(function, *args, **kwargs)
            else:
                data = queueFunction(function, *args, **kwargs)
            return data
        return wrappedFunc
    return wrapper
I get an error when trying to use it.
Can't pickle <function Model.passedFunction at 0x7f410ad4a048>: it's not the same object as modelInstance.models.Model.passedFunction
Using Python 3.4
What happens is that you are passing the original function (or method) to queueFunction.delay(), but that's not the same function that its qualified name says it is.
In order to run functions in a worker, Python RQ uses the pickle module to serialise both the function and its arguments. But functions (and classes) are serialised as importable names, and when deserialising the pickle module simply imports the recorded name. But it does first check that that will result in the right object. So when pickling, the qualified name is tested to double-check it'll produce the exact same object.
If we use pickle.loads as a sample function, then what roughly happens is this:
>>> import pickle
>>> import sys
>>> sample_function = pickle.loads
>>> module_name = sample_function.__module__
>>> function_name = sample_function.__qualname__
>>> recorded_name = f"{module_name}.{function_name}"
>>> recorded_name
'_pickle.loads'
>>> parent, obj = sys.modules[module_name], None
>>> for name in function_name.split("."):  # traverse a dotted path of names
...     obj = parent = getattr(parent, name)
...
>>> obj is sample_function
True
Note that pickle.loads is really _pickle.loads; that doesn't matter all that much, but what does matter is that _pickle can be accessed and it has an object that can be found by using the qualified name, and it is the same object still. This will work even for methods on classes (modulename.ClassName.method_name).
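The lookup described above can be sketched as a small helper. This is only an illustration of the idea, not pickle's actual implementation; the name resolve_recorded_name is made up for this example, and json is used purely as a convenient standard-library module.

```python
import importlib
import json


def resolve_recorded_name(func):
    """Import func's module, then walk its dotted qualified name."""
    obj = importlib.import_module(func.__module__)
    for name in func.__qualname__.split("."):
        obj = getattr(obj, name)
    return obj


# A plain module-level function resolves back to itself...
same_function = resolve_recorded_name(json.dumps) is json.dumps
# ...and so does a method reached through its class.
same_method = (
    resolve_recorded_name(json.JSONDecoder.decode) is json.JSONDecoder.decode
)
```

If either lookup returned a different object, pickle would refuse to serialise the function.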
But when you decorate a function, you are potentially replacing that function object:
>>> def decorator(f):
...     def wrapper(*args, **kwargs):
...         return f, f(*args, **kwargs)
...     return wrapper
...
>>> @decorator
... def foo(): pass
...
>>> foo.__qualname__
'decorator.<locals>.wrapper'
>>> foo()[0].__qualname__  # original function
'foo'
Note that the decorator result has a very different qualified name from the original! Pickle won't be able to map that back to either the decorator result or to the original function.
You are passing the original, undecorated function to queueFunction.delay(), and its qualified name does not match that of the wrappedFunc() function you replaced it with; when pickle tries to import the fully qualified name found on that function object, it finds the wrappedFunc object instead, and that's not the same object.
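The failure can be reproduced without any job queue. The following is a minimal sketch; the undecorated attribute is a hypothetical name, added only so the demo can get hold of the original function.

```python
import pickle


def decorator(f):
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)
    wrapper.undecorated = f  # hypothetical attribute, kept only for this demo
    return wrapper


@decorator
def foo():
    return 42


error = None
try:
    # The original function still claims to be "<module>.foo",
    # but that name now refers to the wrapper.
    pickle.dumps(foo.undecorated)
except pickle.PicklingError as exc:
    error = exc
```

Here error ends up holding a PicklingError, because the name recorded on the original function no longer leads back to that function.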
There are several ways around this, but the easiest is to store the original function as an attribute on the wrapper, and change its qualified name to match. This makes the original function available to the pickle module under an importable name.
You'll have to use the @functools.wraps() utility decorator here to copy various attributes from the original, decorated function over to your wrapper function. This includes the original name.
Here is a version that alters the original function qualified name:
from functools import wraps
def myDecorator(async_=True):
    def wrapper(function):
        @wraps(function)
        def wrappedFunc(*args, **kwargs):
            data = DEFAULT_DATA
            if async_:
                queueFunction.delay(function, *args, **kwargs)
            else:
                data = queueFunction(function, *args, **kwargs)
            return data
        # make the original available to the pickle module as "<name>.original"
        wrappedFunc.original = function
        wrappedFunc.original.__qualname__ += ".original"
        return wrappedFunc
    return wrapper
The @wraps(function) decorator makes sure that wrappedFunc.__qualname__ is set to that of function, so if function was named foo, so now is the wrappedFunc function object. The wrappedFunc.original.__qualname__ += ".original" statement then sets the qualified name of wrappedFunc.original to foo.original, and that's exactly where pickle can find it again!
Note: I renamed async to async_ to make the above code work on Python 3.7 and above; as of Python 3.7 async is a reserved keyword.
I also see that you are making the decision to run something synchronously or asynchronously at decoration time. In that case I'd rewrite it to not check the async_ boolean flag each time you call the function. Just return different wrappers:
from functools import wraps
def myDecorator(async_=True):
    def decorator(function):
        if async_:
            @wraps(function)
            def wrapper(*args, **kwargs):
                queueFunction.delay(wrapper.original, *args, **kwargs)
                return DEFAULT_DATA
            # make the original available to the pickle module as "<name>.original"
            wrapper.original = function
            wrapper.original.__qualname__ += ".original"
        else:
            @wraps(function)
            def wrapper(*args, **kwargs):
                return queueFunction(function, *args, **kwargs)
        return wrapper
    return decorator
I also renamed the various inner functions; myDecorator is a decorator factory that returns the actual decorator, and the decorator returns the wrapper.
Either way, the result is that now the .original object can be pickled:
>>> import pickle
>>> @myDecorator(True)
... def foo(): pass
...
>>> foo.original
<function foo.original at 0x10195dd90>
>>> pickle.dumps(foo.original, pickle.HIGHEST_PROTOCOL)
b'\x80\x04\x95\x1d\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x0cfoo.original\x94\x93\x94.'
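To see the whole round trip without a running worker, here is a minimal sketch. fake_delay is a hypothetical stand-in for queueFunction.delay() that pickles and immediately unpickles its payload; my_decorator is a simplified, always-asynchronous version of the decorator above, and DEFAULT_DATA is given a placeholder value.

```python
import pickle
from functools import wraps

DEFAULT_DATA = None  # placeholder value, assumed for this sketch


def fake_delay(func, *args, **kwargs):
    """Hypothetical stand-in for queueFunction.delay(): pickle, unpickle, call."""
    payload = pickle.dumps((func, args, kwargs), pickle.HIGHEST_PROTOCOL)
    restored_func, restored_args, restored_kwargs = pickle.loads(payload)
    return restored_func(*restored_args, **restored_kwargs)


def my_decorator(function):
    @wraps(function)
    def wrapper(*args, **kwargs):
        fake_delay(wrapper.original, *args, **kwargs)
        return DEFAULT_DATA
    # make the original available to the pickle module as "<name>.original"
    wrapper.original = function
    wrapper.original.__qualname__ += ".original"
    return wrapper


calls = []


@my_decorator
def record(value):
    calls.append(value)


record("hello")  # the original function runs after surviving the round trip
```

After the call, calls contains "hello", which shows that the unpickled function is the original, undecorated one.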
Consider this small example:
import datetime as dt
class Timed(object):
    def __init__(self, f):
        self.func = f
    def __call__(self, *args, **kwargs):
        start = dt.datetime.now()
        ret = self.func(*args, **kwargs)
        time = dt.datetime.now() - start
        ret["time"] = time
        return ret
class Test(object):
    def __init__(self):
        super(Test, self).__init__()
    @Timed
    def decorated(self, *args, **kwargs):
        print(self)
        print(args)
        print(kwargs)
        return dict()
    def call_deco(self):
        self.decorated("Hello", world="World")
if __name__ == "__main__":
    t = Test()
    ret = t.call_deco()
which prints
Hello
()
{'world': 'World'}
Why is the self parameter (which should be the Test obj instance) not passed as first argument to the decorated function decorated?
If I do it manually, like:
def call_deco(self):
    self.decorated(self, "Hello", world="World")
it works as expected. But if I must know in advance whether a function is decorated or not, it defeats the whole purpose of decorators. What is the pattern to use here, or did I misunderstand something?
tl;dr
You can fix this problem by making the Timed class a descriptor and returning a partially applied function from __get__ which applies the Test object as one of the arguments, like this
class Timed(object):
    def __init__(self, f):
        self.func = f
    def __call__(self, *args, **kwargs):
        print(self)
        start = dt.datetime.now()
        ret = self.func(*args, **kwargs)
        time = dt.datetime.now() - start
        ret["time"] = time
        return ret
    def __get__(self, instance, owner):
        from functools import partial
        return partial(self.__call__, instance)
The actual problem
Quoting Python documentation for decorator,
The decorator syntax is merely syntactic sugar, the following two function definitions are semantically equivalent:
def f(...):
    ...
f = staticmethod(f)
@staticmethod
def f(...):
    ...
So, when you say,
@Timed
def decorated(self, *args, **kwargs):
it is actually
decorated = Timed(decorated)
only the function object is passed to the Timed, the object to which it is actually bound is not passed on along with it. So, when you invoke it like this
ret = self.func(*args, **kwargs)
self.func will refer to the unbound function object and it is invoked with Hello as the first argument. That is why self prints as Hello.
How can I fix this?
Since you have no reference to the Test instance in the Timed, the only way to do this would be to convert Timed into a descriptor class. Quoting the documentation, Invoking descriptors section,
In general, a descriptor is an object attribute with “binding behavior”, one whose attribute access has been overridden by methods in the descriptor protocol: __get__(), __set__(), and __delete__(). If any of those methods are defined for an object, it is said to be a descriptor.
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses.
However, if the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead.
We can make Timed a descriptor, by simply defining a method like this
def __get__(self, instance, owner):
    ...
Here, self refers to the Timed object itself, instance refers to the actual object on which the attribute lookup is happening and owner refers to the class corresponding to the instance.
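To make the roles of instance and owner concrete, here is a tiny sketch. Reporter and Holder are names invented for this illustration; the descriptor simply reports what it was called with.

```python
class Reporter:
    """Descriptor that returns the arguments __get__ received."""

    def __get__(self, instance, owner):
        return (instance, owner)


class Holder:
    attr = Reporter()


h = Holder()
via_instance = h.attr     # instance is h, owner is Holder
via_class = Holder.attr   # instance is None when looked up on the class
```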
Now, whenever the decorated attribute is looked up on a Test instance, the __get__ method will be invoked. We need to pass the Test instance as the first argument (even before Hello), so we create a partially applied function whose first argument will be the Test instance, like this
def __get__(self, instance, owner):
    from functools import partial
    return partial(self.__call__, instance)
Now, self.__call__ is a bound method (bound to Timed instance) and the second parameter to partial is the first argument to the self.__call__ call.
So, all these effectively translate like this
t.call_deco()
self.decorated("Hello", world="World")
Now self.decorated is actually the Timed(decorated) object (this will be referred to as TimedObject from now on). Whenever we access it, the __get__ method defined in it will be invoked and it returns a partial function. You can confirm that like this
def call_deco(self):
    print(self.decorated)
    self.decorated("Hello", world="World")
would print
<functools.partial object at 0x7fecbc59ad60>
...
So,
self.decorated("Hello", world="World")
gets translated to
Timed.__get__(TimedObject, <Test obj>, Test)("Hello", world="World")
Since we return a partial function,
partial(TimedObject.__call__, <Test obj>)("Hello", world="World")
which is actually
TimedObject.__call__(<Test obj>, 'Hello', world="World")
So, <Test obj> also becomes a part of *args, and when self.func is invoked, the first argument will be the <Test obj>.
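Putting the pieces together, a self-contained sketch of the partial-based descriptor might look like this. The decorated method returns its arguments instead of printing them so the result can be inspected, and the instance is None guard for class-level access is an addition not present in the answer above.

```python
import datetime as dt
from functools import partial


class Timed:
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        start = dt.datetime.now()
        ret = self.func(*args, **kwargs)
        ret["time"] = dt.datetime.now() - start
        return ret

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class: nothing to bind
        # Pre-fill the Test instance as the first positional argument.
        return partial(self.__call__, instance)


class Test:
    @Timed
    def decorated(self, *args, **kwargs):
        return {"self": self, "args": args, "kwargs": kwargs}


t = Test()
result = t.decorated("Hello", world="World")
```

result["self"] is now the Test instance, and "Hello" arrives in args as intended.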
You first have to understand how functions become methods and how self is "automagically" injected.
Once you know that, the "problem" is obvious: you are decorating the decorated function with a Timed instance - IOW, Test.decorated is a Timed instance, not a function instance - and your Timed class does not mimic the function type's implementation of the descriptor protocol. What you want looks like this:
import types
class Timed(object):
    def __init__(self, f):
        self.func = f
    def __call__(self, *args, **kwargs):
        start = dt.datetime.now()
        ret = self.func(*args, **kwargs)
        time = dt.datetime.now() - start
        ret["time"] = time
        return ret
    def __get__(self, instance, cls):
        if instance is None:
            # looked up on the class itself: nothing to bind to
            return self
        # Python 3 form; on Python 2 use types.MethodType(self, instance, cls)
        return types.MethodType(self, instance)
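On Python 3, types.MethodType takes two arguments (the callable and the instance). A minimal sketch of that form, with a simplified decorated method chosen for this example, shows that the attribute lookup then yields a real bound method:

```python
import datetime as dt
import types


class Timed:
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        start = dt.datetime.now()
        ret = self.func(*args, **kwargs)
        ret["time"] = dt.datetime.now() - start
        return ret

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # looked up on the class: nothing to bind to
        return types.MethodType(self, instance)


class Test:
    @Timed
    def decorated(self, greeting):
        return {"greeting": greeting, "receiver": self}


t = Test()
bound = t.decorated      # a bound method wrapping the Timed object
result = bound("Hello")  # calls Timed.__call__(timed_object, t, "Hello")
```

Compared with functools.partial, the bound method exposes the usual __self__ and __func__ attributes.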
Disclaimer:
Before reading this post know that I am trying to do something that is unconventional in python. Since "Don't do x" is not an answer to "how do I do x?" let's assume there is a very good reason to do this, even though in most cases it would not be good practice.
The Question
Given I have a class that is dynamically created by applying a decorator to a function, how would I go about pickling an instance of said class?
For example, to set this up it might look like this:
import inspect
from functools import wraps
class BaseClass:
    pass
def _make_method(func):
    """ decorator for adding self as first argument to function """
    @wraps(func)
    def decorator(self, *args, **kwargs):
        return func(*args, **kwargs)
    # set signature to include self
    sig = inspect.signature(decorator)
    par = inspect.Parameter('self', 1)
    new_params = tuple([par] + list(sig.parameters.values()))
    new_sig = sig.replace(parameters=new_params,
                          return_annotation=sig.return_annotation)
    decorator.__signature__ = new_sig
    return decorator
def snake2camel(snake_str):
    """ convert a snake_string to a CamelString """
    return "".join(x.title() for x in snake_str.split('_'))
def make_class(func):
    """ dynamically create a class setting the call method to function """
    name = snake2camel(func.__name__)  # get the name of the new class
    method = _make_method(func)
    cls = type(name, (BaseClass,), {'__call__': method})
    return cls()
@make_class
def something(arg):
    return arg
Now something is an instance of the dynamically created class Something.
type(something) # -> __main__.Something
isinstance(something, BaseClass) # -> True
which works fine, but when I try to pickle it (or use the multiprocessing module which uses pickle under the hood):
import pickle
pickle.dumps(something) # -> raises
it throws this error:
# PicklingError: Can't pickle <class '__main__.Something'>: attribute lookup Something on __main__ failed
So I thought I could redefine BaseClass to use a reduce method like so:
class BaseClass:
    def __reduce__(self):
        return make_class, (self.__call__.__func__,)
but then it throws the dreaded "not the same object" error:
# PicklingError: Can't pickle <function something at 0x7fe124cb2d08>: it's not the same object as __main__.something
How can I make this work without bringing in dependencies? I need to be able to pickle the something object so I can use it with the ProcessPoolExecutor class from the concurrent.futures module in python 3.6, so simply using dill or cloudpickle is probably not an option here.
What I am trying to do is write a wrapper around another module so that I can transform the parameters that are being passed to the methods of the other module. That was fairly confusing, so here is an example:
import somemodule
class Wrapper:
    def __init__(self):
        self.transforms = {}
        self.transforms["t"] = "test"
    # This next function is the one I want to exist
    # Please understand the lines below will not compile and are not real code
    def __intercept__(self, item, *args, **kwargs):
        if "t" in args:
            args[args.index("t")] = self.transforms["t"]
        return somemodule.item(*args, **kwargs)
The goal is to allow users of the wrapper class to make simplified calls to the underlying module without having to rewrite all of the functions in the module. So in this case if somemodule had a function called print_uppercase then the user could do
w = Wrapper()
w.print_uppercase("t")
and get the output
TEST
I believe the answer lies in __getattr__ but I'm not totally sure how to use it for this application.
__getattr__ combined with defining a function on the fly should work:
# somemodule
def print_uppercase(x):
    print(x.upper())
Now:
from functools import wraps
import somemodule
class Wrapper:
    def __init__(self):
        self.transforms = {}
        self.transforms["t"] = "test"
    def __getattr__(self, attr):
        func = getattr(somemodule, attr)
        @wraps(func)
        def _wrapped(*args, **kwargs):
            if "t" in args:
                args = list(args)
                args[args.index("t")] = self.transforms["t"]
            return func(*args, **kwargs)
        return _wrapped
w = Wrapper()
w.print_uppercase('Hello')
w.print_uppercase('t')
Output:
HELLO
TEST
I would approach this by calling the intercept method, and entering the desired method to execute, as a parameter for intercept. Then, in the intercept method, you can search for a method with that name and execute it.
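A minimal sketch of that idea follows. Since somemodule is not available here, types.SimpleNamespace stands in for it, and the uppercase function returns its result instead of printing; both are assumptions made for this example, as is the public intercept name.

```python
import types

# Hypothetical stand-in for somemodule
somemodule = types.SimpleNamespace(uppercase=lambda text: text.upper())


class Wrapper:
    def __init__(self):
        self.transforms = {"t": "test"}

    def intercept(self, name, *args, **kwargs):
        """Look up `name` on the wrapped module and call it with transformed args."""
        func = getattr(somemodule, name)
        new_args = [
            self.transforms.get(arg, arg) if isinstance(arg, str) else arg
            for arg in args
        ]
        return func(*new_args, **kwargs)


w = Wrapper()
result = w.intercept("uppercase", "t")
```

The trade-off compared with __getattr__ is that callers have to write w.intercept("uppercase", "t") rather than w.uppercase("t").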
Since your Wrapper object doesn't have any mutable state, it'd be easier to implement without a class. Example wrapper.py:
def func1(*args, **kwargs):
    # do your transformations
    return somemodule.func1(*args, **kwargs)
Then call it like:
import wrapper as w
print(w.func1('somearg'))
Assume I have to unit test methodA, defined in the following class:
class SomeClass(object):
    def wrapper(fun):
        def _fun(self, *args, **kwargs):
            self.b = 'Original'
            fun(self, *args, **kwargs)
        return _fun
    @wrapper
    def methodA(self):
        pass
My test class is as follows:
from mock import patch
class TestSomeClass(object):
    def testMethodA(self):
        def mockDecorator(f):
            def _f(self, *args, **kwargs):
                self.b = 'Mocked'
                f(self, *args, **kwargs)
            return _f
        with patch('some_class.SomeClass.wrapper', mockDecorator):
            from some_class import SomeClass
            s = SomeClass()
            s.methodA()
            assert s.b == 'Mocked', 's.b is equal to %s' % s.b
If I run the test, I hit the assertion:
File "/home/klinden/workinprogress/mockdecorators/test_some_class.py", line 17, in testMethodA
assert s.b == 'Mocked', 's.b is equal to %s' % s.b
AssertionError: s.b is equal to Original
If I stick a breakpoint in the test, after patching, I can see that wrapper has been mocked out just fine, but that methodA still references the old wrapper:
(Pdb) p s.wrapper
<bound method SomeClass.mockDecorator of <some_class.SomeClass object at 0x7f9ed1bf60d0>>
(Pdb) p s.methodA
<bound method SomeClass._fun of <some_class.SomeClass object at 0x7f9ed1bf60d0>>
Any idea of what the problem is here?
After mulling over, I've found a solution.
Since monkey patching seems not to be effective (and I've also tried a few
other solutions), I dug into the function internals and that proved to be fruitful.
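The reason patching comes too late can be shown in a few lines: a decorator runs once, when the decorated definition is executed, so replacing the decorator afterwards leaves the already-wrapped function untouched. The names below are invented for this sketch.

```python
calls = []


def original_decorator(f):
    def _f(*args, **kwargs):
        calls.append("original")
        return f(*args, **kwargs)
    return _f


def mock_decorator(f):
    def _f(*args, **kwargs):
        calls.append("mocked")
        return f(*args, **kwargs)
    return _f


decorator = original_decorator


@decorator
def method_a():
    return "done"


# Rebinding the decorator name afterwards does not touch method_a,
# which was already wrapped when its definition ran.
decorator = mock_decorator
result = method_a()
```

calls contains only "original", mirroring what the failing test observed.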
Python 3
You're lucky - just use the wraps decorator, which creates a __wrapped__ attribute, which in turn contains the wrapped function. See the linked answers above for more details.
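A minimal sketch of the Python 3 route, assuming the decorator can be changed to use functools.wraps. For brevity the decorator is defined at module level here rather than inside the class.

```python
from functools import wraps


def wrapper(fun):
    @wraps(fun)
    def _fun(self, *args, **kwargs):
        self.b = 'Original'
        return fun(self, *args, **kwargs)
    return _fun


class SomeClass:
    @wrapper
    def methodA(self):
        return 'called'


s = SomeClass()
undecorated = SomeClass.methodA.__wrapped__  # the function before decoration
result = undecorated(s)                      # bypasses the wrapper entirely
```

Because the wrapper never runs, s.b is not set, and a test can apply its own decorator to undecorated instead.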
Python 2
Even if you use @wraps, no fancy attribute is created.
However, you just need to realise that the wrapper method is nothing but a closure, so you'll be able to find your wrapped function in its func_closure attribute.
In the original example, the wrapped function would be at: s.methodA.im_func.func_closure[0].cell_contents
Wrapping up (ha!)
I created a getWrappedFunction helper along these lines, to ease my testing:
@staticmethod
def getWrappedFunction(wrapper):
    return wrapper.im_func.func_closure[0].cell_contents
YMMV, especially if you do fancy stuff and include other objects in the closure.
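For reference, the same closure trick also works on Python 3, where im_func and func_closure are spelled __func__ and __closure__. This is a sketch under the same assumption as above: the wrapped function is the only object captured in the closure. The helper name get_wrapped_function is made up for this example.

```python
def wrapper(fun):
    def _fun(self, *args, **kwargs):
        self.b = "Original"
        return fun(self, *args, **kwargs)
    return _fun


class SomeClass:
    @wrapper
    def methodA(self):
        return "inner result"


def get_wrapped_function(bound_method):
    """Python 3 spelling of the helper: __func__ / __closure__."""
    return bound_method.__func__.__closure__[0].cell_contents


s = SomeClass()
inner = get_wrapped_function(s.methodA)
result = inner(s)  # runs the undecorated method body only
```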
I have a python module with lots of functions and I want to apply a decorator for all of them.
Is there a way to monkey-patch all of them so that the decorator is applied to every function, without copy-pasting the line that applies the decorator?
In other words, I want to replace this:
@logging_decorator(args)
def func_1():
    pass
@logging_decorator(args)
def func_2():
    pass
@logging_decorator(args)
def func_3():
    pass
@logging_decorator(args)
def func_n():
    pass
With this:
def patch_func():
    # get all functions of this module
    # apply @logging_decorator to all (or not all) of them
    pass
def func_1():
    pass
def func_2():
    pass
def func_3():
    pass
def func_n():
    pass
I'm really not certain that this is a good idea. Explicit is better than implicit, after all.
With that said, something like this should work, using inspect to find which members of a module can be decorated and using __dict__ to manipulate the module's contents.
import inspect
def decorate_module(module, decorator):
    for name, member in inspect.getmembers(module):
        if inspect.getmodule(member) == module and callable(member):
            if member == decorate_module or member == decorator:
                continue
            module.__dict__[name] = decorator(member)
Sample usage:
import sys
def simple_logger(f):
    def wrapper(*args, **kwargs):
        print("calling " + f.__name__)
        return f(*args, **kwargs)
    return wrapper
def do_something():
    pass
decorate_module(sys.modules[__name__], simple_logger)
do_something()
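Here is a self-contained variant that can be run in isolation. It is a sketch with a few assumptions: it builds a throwaway module (demo_module, a made-up name) with types.ModuleType instead of patching the current one, it decorates plain functions only, and the logger appends to a list rather than printing.

```python
import inspect
import types
from functools import wraps

log = []


def simple_logger(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        log.append(f.__name__)
        return f(*args, **kwargs)
    return wrapper


def decorate_module(module, decorator):
    for name, member in inspect.getmembers(module, inspect.isfunction):
        # Skip functions that were merely imported into the module.
        if member.__module__ == module.__name__:
            setattr(module, name, decorator(member))


# Build a throwaway module so the sketch does not touch real code.
demo = types.ModuleType("demo_module")
exec("def add(a, b):\n    return a + b\n", demo.__dict__)

decorate_module(demo, simple_logger)
result = demo.add(1, 2)
```

Using functools.wraps keeps each function's __name__ intact, which matters if anything later inspects the decorated module.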
It ain't gonna be pretty ... but you can list all functions using dir() after their definition. Then I can't think of a way to patch them without a wrapper function.
def patched(func):
    @logging_decorator
    def newfunc(*args, **kwargs):
        return func(*args, **kwargs)
    return newfunc
# skip dunder names and the two helpers themselves
funcs = [f for f in dir() if '__' not in f and f not in ('patched', 'logging_decorator')]
for f in funcs:
    exec(f + ' = patched(' + f + ')')