What I am trying to do is write a wrapper around another module so that I can transform the parameters that are being passed to the methods of the other module. That was fairly confusing, so here is an example:
import somemodule

class Wrapper:
    def __init__(self):
        self.transforms = {}
        self.transforms["t"] = "test"

    # This next function is the one I want to exist
    # Please understand the lines below will not compile and are not real code
    def __intercept__(self, item, *args, **kwargs):
        if "t" in args:
            args[args.index("t")] = self.transforms["t"]
        return somemodule.item(*args, **kwargs)
The goal is to allow users of the wrapper class to make simplified calls to the underlying module without having to rewrite all of the functions in the module. So in this case if somemodule had a function called print_uppercase then the user could do
w = Wrapper()
w.print_uppercase("t")
and get the output
TEST
I believe the answer lies in __getattr__ but I'm not totally sure how to use it for this application.
__getattr__ combined with defining a function on the fly should work:
# somemodule
def print_uppercase(x):
    print(x.upper())
Now:
from functools import wraps
import somemodule

class Wrapper:
    def __init__(self):
        self.transforms = {}
        self.transforms["t"] = "test"

    def __getattr__(self, attr):
        # Only called when normal lookup fails, so fetch the name from the module
        func = getattr(somemodule, attr)

        @wraps(func)
        def _wrapped(*args, **kwargs):
            if "t" in args:
                args = list(args)  # args is a tuple, so copy it before mutating
                args[args.index("t")] = self.transforms["t"]
            return func(*args, **kwargs)
        return _wrapped

w = Wrapper()
w.print_uppercase('Hello')
w.print_uppercase('t')
Output:
HELLO
TEST
I would approach this by calling the intercept method directly and passing the name of the method to execute as a parameter. Inside intercept, you can then look up a method with that name and execute it.
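A minimal sketch of that idea, assuming a stand-in object in place of somemodule (the `fake_module` namespace and its `shout` function are hypothetical names used only for illustration):

```python
from types import SimpleNamespace

# Hypothetical stand-in for "somemodule"; any module or object would do.
fake_module = SimpleNamespace(shout=lambda text: text.upper())

class Wrapper:
    def __init__(self, module):
        self.module = module
        self.transforms = {"t": "test"}

    def intercept(self, name, *args, **kwargs):
        # Look the target up by name, then rewrite known shorthand arguments.
        func = getattr(self.module, name)
        new_args = [
            self.transforms.get(arg, arg) if isinstance(arg, str) else arg
            for arg in args
        ]
        return func(*new_args, **kwargs)

w = Wrapper(fake_module)
result_short = w.intercept("shout", "t")
result_plain = w.intercept("shout", "Hello")
```

The trade-off compared with __getattr__ is that callers write w.intercept("shout", "t") instead of w.shout("t").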
Since your Wrapper object doesn't have any mutable state, it'd be easier to implement without a class. Example wrapper.py:
import somemodule

def func1(*args, **kwargs):
    # do your transformations
    return somemodule.func1(*args, **kwargs)
Then call it like:
import wrapper as w
print(w.func1('somearg'))
I'm trying to write a method in one class that uses a decorator defined in another class. The problem is that I need information stored in the class that contains the decorator (ClassWithDecorator.decorator_param). To achieve that I'm using partial, injecting self as the first argument, but when I do that the self from the class that uses the decorator "gets lost" somehow and I end up getting an error. Note that this does not happen if I remove partial() from my_decorator(); in that case "self" is correctly stored inside *args.
See the code sample:
from functools import partial

class ClassWithDecorator:
    def __init__(self):
        self.decorator_param = "PARAM"

    def my_decorator(self, decorated_func):
        def my_callable(ClassWithDecorator_instance, *args, **kwargs):
            # Do something with decorator_param
            print(ClassWithDecorator_instance.decorator_param)
            return decorated_func(*args, **kwargs)
        return partial(my_callable, self)

decorator_instance = ClassWithDecorator()

class WillCallDecorator:
    def __init__(self):
        self.other_param = "WillCallDecorator variable"

    @decorator_instance.my_decorator
    def decorated_method(self):
        pass

WillCallDecorator().decorated_method()
I get
PARAM
Traceback (most recent call last):
  File "****/decorator.py", line 32, in <module>
    WillCallDecorator().decorated_method()
  File "****/decorator.py", line 12, in my_callable
    return decorated_func(*args, **kwargs)
TypeError: decorated_method() missing 1 required positional argument: 'self'
How can I pass the self corresponding to WillCallDecorator() into decorated_method() but at the same time pass information from its own class to my_callable() ?
It seems that you may want to use partialmethod instead of partial:
From the docs:
class functools.partialmethod(func, /, *args, **keywords)
When func is a non-descriptor callable, an appropriate bound method is created dynamically. This behaves like a normal Python function when used as a method: the self argument will be inserted as the first positional argument, even before the args and keywords supplied to the partialmethod constructor.
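A sketch of how that could look for the code in the question. Note the assumption about argument order: because self is inserted first, the inner callable receives the WillCallDecorator instance first and the frozen ClassWithDecorator instance second. The parameter names `instance` and `decorator_owner` are made up for this example, and the methods return values instead of printing so the result is easy to check.

```python
from functools import partialmethod

class ClassWithDecorator:
    def __init__(self):
        self.decorator_param = "PARAM"

    def my_decorator(self, decorated_func):
        def my_callable(instance, decorator_owner, *args, **kwargs):
            # instance: the object the method was called on (inserted on lookup)
            # decorator_owner: the ClassWithDecorator instance frozen in below
            result = decorated_func(instance, *args, **kwargs)
            return decorator_owner.decorator_param, result
        return partialmethod(my_callable, self)

decorator_instance = ClassWithDecorator()

class WillCallDecorator:
    def __init__(self):
        self.other_param = "WillCallDecorator variable"

    @decorator_instance.my_decorator
    def decorated_method(self):
        return self.other_param

outcome = WillCallDecorator().decorated_method()
```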
So much simpler just to use the self variable you already have. There is absolutely no reason to be using partial or partialmethod here at all:
class ClassWithDecorator:
    def __init__(self):
        self.decorator_param = "PARAM"

    def my_decorator(self, decorated_func):
        def my_callable(*args, **kwargs):
            # Do something with decorator_param; self comes from the closure
            print(self.decorator_param)
            return decorated_func(*args, **kwargs)
        return my_callable

decorator_instance = ClassWithDecorator()

class WillCallDecorator:
    def __init__(self):
        self.other_param = "WillCallDecorator variable"

    @decorator_instance.my_decorator
    def decorated_method(self):
        pass

WillCallDecorator().decorated_method()
Also, to answer your question about why your code didn't work: when you access something.decorated_method(), Python checks whether decorated_method is a function and, if so, turns it internally into the call WillCallDecorator.decorated_method(something). But the value returned from partial is a functools.partial object, not a function, so the binding that normally happens on class lookup does not happen here.
In more detail, something.method(arg) is equivalent to SomethingClass.__dict__['method'].__get__(something, SomethingClass)(arg) when something has no instance attribute named method, its type SomethingClass does have that attribute, and the attribute defines __get__. The full set of steps for attribute lookup is considerably more complicated.
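The binding step for an ordinary function can be reproduced by hand. This is only an illustration of the descriptor call described above, using a made-up `Greeter` class:

```python
class Greeter:
    def greet(self, name):
        return "hello " + name

g = Greeter()

# The plain function object as stored on the class, before any binding.
raw = Greeter.__dict__["greet"]

# Functions define __get__; calling it yields a method bound to the instance.
bound = raw.__get__(g, Greeter)

explicit = bound("world")   # binding done by hand
implicit = g.greet("world")  # binding done by attribute lookup
```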
I have a decorator to control time limit, if the function execution exceeds limit, an error is raised.
def timeout(seconds=10):
    def decorator(func):
        # a timeout decorator
        ...
    return decorator
And I want to build a class, using the constructor to pass the time limit into the class.
class myClass:
    def __init__(self, time_limit):
        self.time_limit = time_limit

    @timeout(self.time_limit)
    def do_something(self):
        # do something
        pass
But this does not work.
  File "XX.py", line YY, in myClass
    @timeout(self.tlimit)
NameError: name 'self' is not defined
What's the correct way to implement this?
self.time_limit is only available when a method in an instance of your class is called.
The decorator expression prefixing the method, on the other hand, is evaluated when the class body is executed, that is, at class definition time, before any instance exists.
However, the inner part of your decorator, if it will always be applied to methods, will get self as its first parameter - and there you can simply make use of any instance attribute:
import time

def timeout(**decorator_parms):
    def decorator(func):
        def wrapper(self, *args, **kwargs):
            # self is available here because wrapper replaces the method
            time_limit = self.time_limit
            now = time.time()
            result = func(self, *args, **kwargs)
            # code to check timeout
            ...
            return result
        return wrapper
    return decorator
If your decorator is expected to work with time limits other than self.time_limit, you could pass a string or other constant object and check for it inside the innermost function with a simple if statement. If the timeout is that particular string or object, you use the instance attribute; otherwise you use the passed-in value.
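One way this could be sketched, assuming a sentinel object named `FROM_INSTANCE` and an exception named `TimeLimitExceeded` (both invented for this example). Note that this version only measures elapsed time after the call returns; it does not interrupt a running function.

```python
import time
from functools import wraps

FROM_INSTANCE = object()  # hypothetical sentinel: "read the limit from self"

class TimeLimitExceeded(Exception):
    pass

def timeout(seconds=FROM_INSTANCE):
    def decorator(func):
        @wraps(func)
        def wrapper(self, *args, **kwargs):
            # Decide at call time where the limit comes from.
            limit = self.time_limit if seconds is FROM_INSTANCE else seconds
            start = time.monotonic()
            result = func(self, *args, **kwargs)
            if time.monotonic() - start > limit:
                raise TimeLimitExceeded(func.__name__)
            return result
        return wrapper
    return decorator

class Worker:
    def __init__(self, time_limit):
        self.time_limit = time_limit

    @timeout()            # uses self.time_limit
    def nap(self):
        time.sleep(0.05)
        return "done"

    @timeout(0.01)        # fixed limit, ignores self.time_limit
    def slow(self):
        time.sleep(0.05)
        return "done"
```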
You can also decorate a method in the constructor:
class myClass:
    def __init__(self, time_limit):
        # Wrap the bound method once per instance, with that instance's limit
        self.do_something = timeout(time_limit)(self.do_something)

    def do_something(self):
        # do something
        pass
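A runnable sketch of the constructor approach, assuming a simple `timeout` that checks elapsed time after the call (the real decorator from the question is not shown, so this body is a stand-in). The `seconds` attribute on the wrapper exists only so the per-instance limit can be inspected.

```python
import time
from functools import wraps

def timeout(seconds=10):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            result = func(*args, **kwargs)
            if time.monotonic() - start > seconds:
                raise RuntimeError("time limit exceeded")
            return result
        wrapper.seconds = seconds  # exposed for inspection only
        return wrapper
    return decorator

class MyClass:
    def __init__(self, time_limit):
        # self.do_something is already bound, so the wrapper needs no self
        self.do_something = timeout(time_limit)(self.do_something)

    def do_something(self):
        return "done"

a = MyClass(1)
b = MyClass(2)
```

Each instance ends up with its own wrapped callable stored in the instance dictionary, while the method on the class itself stays undecorated.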
I am trying to wrap the pyspark Pipeline.__init__ constructor and monkey patch in the newly wrapped constructor. However, I am running into an error that seems to have something to do with the way Pipeline.__init__ uses decorators.
Here is the code that actually does the monkey patch:
def monkeyPatchPipeline():
    oldInit = Pipeline.__init__

    def newInit(self, **keywordArgs):
        oldInit(self, stages=keywordArgs["stages"])
    Pipeline.__init__ = newInit
However, when I run a simple program:
import PythonSparkCombinatorLibrary
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import HashingTF, Tokenizer
PythonSparkCombinatorLibrary.TransformWrapper.monkeyPatchPipeline()
tokenizer = Tokenizer(inputCol="text", outputCol="words")
hashingTF = HashingTF(inputCol=tokenizer.getOutputCol(),outputCol="features")
lr = LogisticRegression(maxIter=10, regParam=0.001)
pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
I get this error:
Traceback (most recent call last):
  File "C:\<my path>\PythonApplication1\main.py", line 26, in <module>
    pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
  File "C:\<my path>\PythonApplication1\PythonSparkCombinatorLibrary.py", line 36, in newInit
    oldInit(self, stages=keywordArgs["stages"])
  File "C:\<pyspark_path>\pyspark\__init__.py", line 98, in wrapper
    return func(*args, **kwargs)
  File "C:\<pyspark_path>\pyspark\ml\pipeline.py", line 63, in __init__
    kwargs = self.__init__._input_kwargs
AttributeError: 'function' object has no attribute '_input_kwargs'
Looking into the pyspark interface, I see that Pipeline.__init__ looks like this:
@keyword_only
def __init__(self, stages=None):
    """
    __init__(self, stages=None)
    """
    if stages is None:
        stages = []
    super(Pipeline, self).__init__()
    kwargs = self.__init__._input_kwargs
    self.setParams(**kwargs)
And noting the @keyword_only decorator, I inspected that code as well:
def keyword_only(func):
    """
    A decorator that forces keyword arguments in the wrapped method
    and saves actual input keyword arguments in `_input_kwargs`.
    """
    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 1:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        wrapper._input_kwargs = kwargs
        return func(*args, **kwargs)
    return wrapper
I'm totally confused both about how this code works in the first place, and also why it seems to cause problems with my own wrapper. I see that wrapper is adding an _input_kwargs field to itself, but how is Pipeline.__init__ able to read that field with self.__init__._input_kwargs? And why doesn't the same thing happen when I wrap Pipeline.__init__ again?
Decorator 101. A decorator is a higher-order function which takes a function as its first (and typically only) argument and returns a function. The @ annotation is just syntactic sugar for a simple function call, so the following
@decorator
def decorated(x):
    ...
can be rewritten for example as:
def decorated_(x):
    ...
decorated = decorator(decorated_)
So Pipeline.__init__ is actually a wrapper (decorated with functools.wraps) which captures the defined __init__ (the func argument of keyword_only) as part of its closure. When it is called, it stores the received kwargs as a function attribute of itself. Basically what happens here can be simplified to:
def f(**kwargs):
    f._input_kwargs = kwargs  # f is in the current scope

hasattr(f, "_input_kwargs")
False
f(foo=1, bar="x")
hasattr(f, "_input_kwargs")
True
When you further wrap (decorate) __init__, the external function won't have _input_kwargs attached, hence the error. If you want to make it work you have to apply the same process used by the original __init__ to your own version, for example with the same decorator:
@keyword_only
def newInit(self, **keywordArgs):
    oldInit(self, stages=keywordArgs["stages"])
but, as I mentioned in the comments, you should rather consider subclassing.
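The failure and the fix can be reproduced without pyspark. In the sketch below, `Pipeline` is a hypothetical stand-in class, not the real pyspark one, and `keyword_only` is copied from the snippet quoted in the question. It relies on the fact that a bound method such as self.__init__ forwards attribute reads to the underlying function, so self.__init__._input_kwargs reads whatever function currently sits at Pipeline.__init__.

```python
from functools import wraps

def keyword_only(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 1:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        wrapper._input_kwargs = kwargs
        return func(*args, **kwargs)
    return wrapper

class Pipeline:  # hypothetical stand-in for pyspark.ml.Pipeline
    @keyword_only
    def __init__(self, stages=None):
        # Reads the attribute from whatever is installed as __init__ right now
        kwargs = self.__init__._input_kwargs
        self.stages = kwargs.get("stages", [])

old_init = Pipeline.__init__

def plain_init(self, **keyword_args):
    old_init(self, stages=keyword_args["stages"])

@keyword_only
def patched_init(self, **keyword_args):
    old_init(self, stages=keyword_args["stages"])

# Patch without the decorator: the new __init__ has no _input_kwargs.
Pipeline.__init__ = plain_init
try:
    Pipeline(stages=[1])
    plain_error = None
except AttributeError as exc:
    plain_error = exc

# Patch with the decorator: the attribute is set before the lookup happens.
Pipeline.__init__ = patched_init
pipeline = Pipeline(stages=[1, 2])
```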
I'm writing a Python class to wrap/decorate/enhance another class from a package called petl, a framework for ETL (data movement) workflows. Due to design constraints I can't just subclass it; every method call has to be sent through my own class so I can control what kind of objects are being passed back. So in principle this is a proxy class, but I'm having some trouble using existing answers/recipes out there. This is what my code looks like:
from functools import partial

class PetlTable(object):
    """not really how we construct petl tables, but for illustrative purposes"""
    def hello(name):
        print('Hello, {}!'.format(name))

class DatumTable(object):
    def __init__(self, petl_tbl):
        self.petl_tbl = petl_tbl

    def __getattr__(self, name):
        """this returns a partial referencing the child method"""
        petl_attr = getattr(self.petl_tbl, name, None)
        if petl_attr and callable(petl_attr):
            return partial(self.call_petl_method, func=petl_attr)
        raise NotImplementedError('Not implemented')

    def call_petl_method(self, func, *args, **kwargs):
        func(*args, **kwargs)
Then I try to instantiate a table and call something:
# create a petl table
pt = PetlTable()
# wrap it with our own class
dt = DatumTable(pt)
# try to run the petl method
dt.hello('world')
This gives a TypeError: call_petl_method() got multiple values for argument 'func'.
This only happens with positional arguments; kwargs seem to be fine. I'm pretty sure it has to do with self not being passed in, but I'm not sure what the solution is. Can anyone think of what I'm doing wrong, or a better solution altogether?
This seems to be a common issue with mixing positional and keyword args:
TypeError: got multiple values for argument
To get around it, I took the positional arg func out of call_petl_method and put it in a kwarg that's unlikely to overlap with the kwargs of the child function. A little hacky, but it works.
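The clash can be reproduced in isolation. In this sketch (`call_method` and `hello` are made-up names), freezing `func` by keyword leaves the first positional slot open, so the caller's first positional argument lands in `func` as well. Binding the function positionally is an alternative to the renamed-kwarg workaround described above.

```python
from functools import partial

def call_method(func, *args, **kwargs):
    return func(*args, **kwargs)

def hello(name):
    return "Hello, {}!".format(name)

# func is frozen as a keyword, but "world" is also bound to func positionally.
clashing = partial(call_method, func=hello)
try:
    clashing("world")
    error = None
except TypeError as exc:
    error = str(exc)

# Freezing func positionally leaves later positional arguments for *args.
working = partial(call_method, hello)
greeting = working("world")
```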
I ended up writing a Proxy class to do all this generically:
from functools import partial

class Proxy(object):
    def __init__(self, child):
        self.child = child

    def __getattr__(self, name):
        child_attr = getattr(self.child, name)
        return partial(self.call_child_method, __child_fn__=child_attr)

    @classmethod
    def call_child_method(cls, *args, **kwargs):
        """
        This calls a method on the child object and wraps the response as an
        object of its own class.
        Takes a kwarg `__child_fn__` which points to a method on the child
        object.
        Note: this can't take any positional args or they get clobbered by the
        keyword args we're trying to pass to the child. See:
        https://stackoverflow.com/questions/21764770/typeerror-got-multiple-values-for-argument
        """
        # get child method
        fn = kwargs.pop('__child_fn__')
        # call the child method
        r = fn(*args, **kwargs)
        # wrap the response as an object of the same class
        r_wrapped = cls(r)
        return r_wrapped
This will also solve the problem. It doesn't use partial at all.
class PetlTable(object):
    """not really how we construct petl tables, but for illustrative purposes"""
    def hello(name):
        print('Hello, {}!'.format(name))

class DatumTable(object):
    def __init__(self, petl_tbl):
        self.petl_tbl = petl_tbl

    def __getattr__(self, name):
        """Looks-up named attribute in class of the petl_tbl object."""
        petl_attr = self.petl_tbl.__class__.__dict__.get(name, None)
        if petl_attr and callable(petl_attr):
            return petl_attr
        raise NotImplementedError('Not implemented')

if __name__ == '__main__':
    # create a petl table
    pt = PetlTable()
    # wrap it with our own class
    dt = DatumTable(pt)
    # try to run the petl method
    dt.hello('world')  # -> Hello, world!
How do I pass a decorator's function into a job?
I have a decorator that would run a job using the function.
@job
def queueFunction(passedFunction, *args, **kwargs):
    # Do some stuff
    passedFunction(*args, **kwargs)

def myDecorator(async=True):
    def wrapper(function):
        def wrappedFunc(*args, **kwargs):
            data = DEFAULT_DATA
            if async:
                queueFunction.delay(function, *args, **kwargs)
            else:
                data = queueFunction(function, *args, **kwargs)
            return data
        return wrappedFunc
    return wrapper
I get an error when trying to use it.
Can't pickle <function Model.passedFunction at 0x7f410ad4a048>: it's not the same object as modelInstance.models.Model.passedFunction
Using Python 3.4
What happens is that you are passing the original function (or method) to queueFunction.delay(), but that's not the same function that its qualified name says it is.
In order to run functions in a worker, Python RQ uses the pickle module to serialise both the function and its arguments. But functions (and classes) are serialised as importable names, and when deserialising the pickle module simply imports the recorded name. But it does first check that that will result in the right object. So when pickling, the qualified name is tested to double-check it'll produce the exact same object.
If we use pickle.loads as a sample function, then what roughly happens is this:
>>> import pickle
>>> import sys
>>> sample_function = pickle.loads
>>> module_name = sample_function.__module__
>>> function_name = sample_function.__qualname__
>>> recorded_name = f"{module_name}.{function_name}"
>>> recorded_name
'_pickle.loads'
>>> parent, obj = sys.modules[module_name], None
>>> for name in function_name.split("."):  # traverse a dotted path of names
...     obj = parent = getattr(parent, name)
...
>>> obj is sample_function
True
Note that pickle.loads is really _pickle.loads; that doesn't matter all that much, but what does matter is that _pickle can be accessed and it has an object that can be found by using the qualified name, and it is the same object still. This will work even for methods on classes (modulename.ClassName.method_name).
But when you decorate a function, you are potentially replacing that function object:
>>> def decorator(f):
...     def wrapper(*args, **kwargs):
...         return f, f(*args, **kwargs)
...     return wrapper
...
>>> @decorator
... def foo(): pass
...
>>> foo.__qualname__
'decorator.<locals>.wrapper'
>>> foo()[0].__qualname__  # original function
'foo'
Note that the decorator result has a very different qualified name from the original! Pickle won't be able to map that back to either the decorator result or to the original function.
You are passing the original, undecorated function to queueFunction.delay(), and its qualified name will not match that of the wrappedFunc() function you replaced it with; when pickle tries to import the fully qualified name found on that function object, it'll find the wrappedFunc object, and that's not the same object.
There are several ways around this, but the easiest is to store the original function as an attribute on the wrapper, and change its qualified name to match. This makes the original function available to the pickle module under a name it can look up.
You'll have to use the @functools.wraps() utility decorator here to copy various attributes from the original, decorated function over to your wrapper function. This includes the original name.
Here is a version that alters the original function qualified name:
from functools import wraps

def myDecorator(async_=True):
    def wrapper(function):
        @wraps(function)
        def wrappedFunc(*args, **kwargs):
            data = DEFAULT_DATA
            if async_:
                queueFunction.delay(function, *args, **kwargs)
            else:
                data = queueFunction(function, *args, **kwargs)
            return data

        # make the original available to the pickle module as "<name>.original"
        wrappedFunc.original = function
        wrappedFunc.original.__qualname__ += ".original"
        return wrappedFunc
    return wrapper
The @wraps(function) decorator makes sure that wrappedFunc.__qualname__ is set to that of function, so if function was named foo, so now is the wrappedFunc function object. The wrappedFunc.original.__qualname__ += ".original" statement then sets the qualified name of wrappedFunc.original to foo.original, and that's exactly where pickle can find it again!
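To see the effect of the rename without a job queue, the by-name lookup can be imitated with a small helper. This is a simplified stand-in for the check pickle performs, not pickle's actual code; `resolve`, `plain_decorator` and `exposing_decorator` are names invented for this sketch, and it assumes the code runs at module top level so that qualified names start at a global.

```python
from functools import wraps

def resolve(qualname, namespace):
    """Walk a dotted qualified name, starting from a namespace dict."""
    parts = qualname.split(".")
    obj = namespace[parts[0]]
    for part in parts[1:]:
        obj = getattr(obj, part)
    return obj

def plain_decorator(function):
    @wraps(function)
    def wrapper(*args, **kwargs):
        return function(*args, **kwargs)
    return wrapper

def exposing_decorator(function):
    @wraps(function)
    def wrapper(*args, **kwargs):
        return function(*args, **kwargs)
    # Same trick as above: expose the original and rename it to match.
    wrapper.original = function
    wrapper.original.__qualname__ += ".original"
    return wrapper

@plain_decorator
def foo():
    return "foo"

@exposing_decorator
def bar():
    return "bar"

hidden = foo.__wrapped__  # functools.wraps keeps the original here

# "foo" now resolves to the wrapper, so the original fails the identity check.
plain_ok = resolve(hidden.__qualname__, globals()) is hidden

# "bar.original" resolves to the original function itself.
exposed_ok = resolve(bar.original.__qualname__, globals()) is bar.original
```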
Note: I renamed async to async_ to make the above code work on Python 3.7 and above; as of Python 3.7 async is a reserved keyword.
I also see that you are making the decision to run something synchronously or asynchronously at decoration time. In that case I'd rewrite it to not check the async_ boolean flag each time you call the function. Just return different wrappers:
from functools import wraps

def myDecorator(async_=True):
    def decorator(function):
        if async_:
            @wraps(function)
            def wrapper(*args, **kwargs):
                queueFunction.delay(wrapper.original, *args, **kwargs)
                return DEFAULT_DATA

            # make the original available to the pickle module as "<name>.original"
            wrapper.original = function
            wrapper.original.__qualname__ += ".original"
        else:
            @wraps(function)
            def wrapper(*args, **kwargs):
                return queueFunction(function, *args, **kwargs)
        return wrapper
    return decorator
I also renamed the various inner functions; myDecorator is a decorator factory that returns the actual decorator, and the decorator returns the wrapper.
Either way, the result is that now the .original object can be pickled:
>>> import pickle
>>> @myDecorator(True)
... def foo(): pass
...
>>> foo.original
<function foo.original at 0x10195dd90>
>>> pickle.dumps(foo.original, pickle.HIGHEST_PROTOCOL)
b'\x80\x04\x95\x1d\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x0cfoo.original\x94\x93\x94.'