I currently have the following structure:
Inside a class I need to handle several types of functions with two special variables and an arbitrary number of parameters. To wrap these for the methods I apply them to, I scan the function signatures first (that works very reliably) and decide which arguments are parameters and which are my variables.
I then bind them back with a lambda expression in the following way. Let func(x, *args) be my function, then I'll bind
f = lambda x, t: func(x=x, **func_parameter)
In the case that I get func(x, t, *args) I bind
f = lambda x, t: func(x=x, t=t, **func_parameter)
and similarly if neither variable appears.
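Schematically, the dispatch I do looks roughly like this (illustrative sketch only; bind and func_parameter stand for the signature scan and the fixed parameters it finds):
import inspect

def bind(func, func_parameter):
    names = inspect.signature(func).parameters
    if 'x' in names and 't' in names:
        return lambda x, t: func(x=x, t=t, **func_parameter)
    if 'x' in names:
        return lambda x, t: func(x=x, **func_parameter)
    if 't' in names:
        return lambda x, t: func(t=t, **func_parameter)
    return lambda x, t: func(**func_parameter)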
It is essential that I hand a function of the form f(x,t) to my methods inside that class.
I would like to use functools.partial for this - it is the more Pythonic way to do it, and execution performance is better (the function f is potentially called a couple of million times...). The problem is that I don't know what to do when a basis function is independent of one of the variables x or t; that is why I went with lambda functions in the first place - they simply accept the other variable 'blind'. It is still two function calls either way, and while definitions with lambda and partial take the same time, execution is a lot faster with partial.
Does anyone know how to use partial in that case? Performance is kind of an issue here.
EDIT: A little later I figured out that function evaluation with tuple arguments is faster than with keyword arguments, so that was a plus.
And then, in the end, as a user I would just take some of the guesswork away from Python, i.e. directly define
def func(x):
    return 2*x
instead of
def func(x, a):
    return a*x
And call it directly. That way I can use the function as-is. The second case would be to implement the case where both x and t are present as a partial mapping.
That might be a compromise.
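Roughly, the partial mapping for the case where both x and t are present would look like this (a sketch with made-up function and parameter names):
from functools import partial

def rhs(x, t, a, b):              # made-up example with both variables and two parameters
    return a * x + b * t

f = partial(rhs, a=2.0, b=0.5)    # parameters fixed once; f(x, t) remains
print(f(1.0, 3.0))                # 3.5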
You could write adapter classes that have an f(x,t) call signature. The result is similar to functools.partial but much more flexible. __call__ gives you a consistent call signature and lets you add, drop, and map parameters. Arguments can be fixed when an instance is made. It seems like it should execute as fast as a normal function, but I have no basis for that.
A toy version:
class Adapt:
    '''return a function with call signature f(x,t)'''
    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = kwargs
    def __call__(self, x, t):
        # mapping magic goes here
        return self.func(x, t, **self.kwargs)
        #return self.func(a=x, b=t, **self.kwargs)

def f(a, b, c):
    print(a, b, c)
Usage:
>>> f_xt = Adapt(f, c = 4)
>>> f_xt(3, 4)
3 4 4
>>>
Don't know how you could make that generic for arbitrary parameters and mappings - maybe someone will chime in with an idea or an edit; a rough attempt is sketched at the end of this answer.
So if you end up writing an adapter specific to each function, the function can be embedded in the class instead of an instance parameter.
class AdaptF:
    '''return a function with call signature f(x,t)'''
    def __init__(self, **kwargs):
        self.kwargs = kwargs
    def __call__(self, x, t):
        '''does stuff with x and t'''
        # mapping magic goes here
        return self.func(a=x, b=t, **self.kwargs)
    def func(self, a, b, c):
        print(a, b, c)
>>> f_xt = AdaptF(c = 4)
>>> f_xt(x = 3, t = 4)
3 4 4
>>>
I just kinda made this up from stuff I have read so I don't know if it is viable. I feel like I should give credit to the source I read but for the life of me I can't find it - I probably saw it on a pyvideo.
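For the generic case mentioned above, a rough sketch (just an untested idea; it assumes the wrapped function's relevant arguments are literally named x and t, and uses inspect.signature to find them) could be:
import inspect

class AdaptAny:
    '''call any func as f(x, t), passing along only the variables it declares'''
    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = kwargs
        params = inspect.signature(func).parameters
        self.wants = [name for name in ('x', 't') if name in params]
    def __call__(self, x, t):
        given = {'x': x, 't': t}
        return self.func(**{name: given[name] for name in self.wants}, **self.kwargs)

def g(x, c):                      # no t dependence
    return c * x

g_xt = AdaptAny(g, c=4)
print(g_xt(3, 99))                # 12 -- t is simply ignored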
I have a series of functions, where the result of one is fed into the next, plus other inputs:
result = function1(
    function2(
        function3(
            function4(x, y, z),
            a, b, c),
        d, e, f),
    g, h, i)
This is ugly and hard to understand. In particular it isn't obvious that function1 is actually the last one to be called.
How can this code be written more cleanly, in a Pythonic way?
I could e.g. assign intermediate results to variables:
j = function4(x, y, z)
k = function3(j, a, b, c)
l = function2(k, d, e, f)
result = function1(l, g, h, i)
But this also puts additional variables for things I don't need into the namespace, and may keep a large amount of memory from being freed – unless I add a del j, k, l, which is its own kind of ugly. Plus, I have to come up with names.
Or I could use the name of the final result also for the intermediate results:
result = function4(x, y, z)
result = function3(result, a, b, c)
result = function2(result, d, e, f)
result = function1(result, g, h, i)
The disadvantage here is that the same name is used for possibly wildly different things, which may make reading and debugging harder.
Then maybe _ or __?
__ = function4(x, y, z)
__ = function3(__, a, b, c)
__ = function2(__, d, e, f)
result = function1(__, g, h, i)
A bit better, but again not super clear. And I might have to add a del __ at the end.
Is there a better, more Pythonic way?
I think nesting function calls and assigning results to variables is not a bad solution, especially if you capture the whole in a single function with descriptive name. The name of the function is self-documenting, allows for reuse of the complex structure, and the local variables created in the function only exist for the lifetime of the function execution, so your namespace remains clean.
Having said that, depending on the type of the values being passed around, you could subclass whatever that type is and add the functions as methods on the class. That would allow you to chain the calls, which I'd consider very Pythonic if the superclass lends itself to it.
An example of that is the way the pandas library works and has changed recently. In the past, many pandas.DataFrame methods would take an optional parameter called inplace to allow a user to change this:
import pandas as pd
df = pd.DataFrame(my_data)
df = df.some_operation(arg1, arg2)
df = df.some_other_operation(arg3, arg4)
...
To this:
import pandas as pd
df = pd.DataFrame(my_data)
df.some_operation(arg1, arg2, inplace=True)
df.some_other_operation(arg3, arg4, inplace=True)
With inplace=True, the original DataFrame would be modified (or at least, the end result would suggest that's what happened), and in some cases it might still be returned, in other cases not at all. The idea being that it avoided the creation of additional data; also, many users would use multiple variables (df1, df2, etc.) to keep all the intermediate results around, often for no good reason.
However, as you may know, that pattern is a bad idea in pandas for several reasons.
Instead, modern pandas methods consistently return the DataFrame itself, and inplace is being deprecated where it still exists. But because all functions now return the actual resulting DataFrame, this works:
import pandas as pd
df = (
    pd.DataFrame(my_data)
    .some_operation(arg1, arg2)
    .some_other_operation(arg3, arg4)
)
(Note that this already worked for many cases in pandas originally; consistency is not the point here, it's the pattern that matters.)
Your situation could be similar, depending on what you're passing exactly. From the example you provided, it's unclear what the types might be. But given that you consistently pass the function result as the first parameter, it seems likely that your implementation is something like:
def fun1(my_type_var, a, b, c):
    # perform some operation on my_type_var and return result of same type
    return operations_on(my_type_var, a, b, c)

def fun2(my_type_var, a, b, c):
    # similar
    return other_operations_on(my_type_var, a, b, c)
...
In that case, it would make more sense to:
class MyType:
    def method1(self, a, b, c):
        # perform some operation on self, then return the result, or self
        mt = operations_on(self, a, b, c)
        return mt
...
Because this would allow:
x = MyType().method1(arg1, arg2, arg3).method2(...)
And depending on what your type is exactly, you could choose to modify self and return that, or perform the operation with a new instance as a result like in the example above. What the better idea is depends on what you're doing, how your type works internally, etc.
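As a toy illustration of that choice (all names here are invented), returning a new instance from each method keeps the calls chainable without mutating the original:
class Vector(list):
    '''toy type whose methods return a new Vector, enabling chaining'''
    def scaled(self, k):
        return Vector(k * v for v in self)
    def shifted(self, d):
        return Vector(v + d for v in self)

result = Vector([1, 2, 3]).scaled(2).shifted(1)
print(result)   # [3, 5, 7]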
Fantastic answer by @Grismar
One more option to consider, if the functions are things you are writing yourself rather than transformations of data via, e.g., numpy or pandas, is to structure them so that each function, on the surface, is only one level deep.
This would be in line with how Robert "Bob" Martin advises to write functions in his book, Clean Code.
The idea is that with each function, you descend one level of abstraction.
Lacking the context for what you're doing, I can't demonstrate this convincingly, i.e., with function names that actually make sense. But mechanically, here is how it would look:
result = make_result(a,b,c,i,j,k,x,y,z)

def make_result(a,b,c,i,j,k,x,y,z):
    return function1(make_input_for_1(a,b,c,i,j,k,x,y,z))

def make_input_for_1(...):
    return function2(make_input_for_2(...))
and so on and so forth.
This only makes sense if the nesting of the functions is indeed an ever-more-detailed implementation of the general task.
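To make that a bit more tangible, here is a self-contained toy version (every name is invented) of the same layering:
def make_report(raw):
    return format_counts(make_counts(raw))

def make_counts(raw):
    return count_words(make_clean(raw))

def make_clean(raw):
    return raw.strip().lower()

def count_words(text):
    words = text.split()
    return {w: words.count(w) for w in set(words)}

def format_counts(counts):
    return "\n".join(f"{w}: {n}" for w, n in sorted(counts.items()))

print(make_report("  The cat saw the cat  "))   # cat: 2 / saw: 1 / the: 2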
You can create a class that aids in chaining these functions together. The constructor can apply a higher-order function to each input function to store the output of the function as an instance field and return self, then set each of those wrapped functions as methods of the instance. Something like this:
import functools
class Chainable():
    def __init__(self, funcs):
        self.value = None

        def wrapper(func):
            @functools.wraps(func)
            def run(*args, **kwargs):
                if self.value is None:
                    self.value = func(*args, **kwargs)
                    return self
                else:
                    self.value = func(self.value, *args, **kwargs)
                    return self
            return run

        for func in funcs:
            setattr(self, func.__qualname__, wrapper(func))

def f1(a, b):
    return a + b

def f2(a, b):
    return a * b

res = Chainable([f1, f2]).f1(1, 2).f2(5).value
print(res)  # (1 + 2) * 5 == 15
Suppose I have a function that takes two arguments and performs some calculation on them:
def add(a, b):
    return a + b
I want to call this function through a multiprocessing library which can only handle functions with a single argument. So, I change the function to take its argument as a single tuple instead:
def add2(ab):
    a, b = ab
    return a + b
However, this seems clunky to me. The variables essentially need to be defined (and documented) twice. If I were using a lambda function, I could just write the following and it will accept the tuple properly:
add3 = lambda (a, b): a + b
Unfortunately, my function is not trivial enough to implement as a lambda function. Is there any sort of syntactic sugar feature in python that would allow me to write a named function that accepts a tuple but treats each component of that tuple as a separate named argument? My attempts to search for solutions to this have mostly turned up references to the *args operator, but that does not apply here because I do not control the site where the function is called.
Here is an example of how the function is being called. Note that it is called via the multiprocessing library so I cannot pass more than one argument:
import multiprocessing
pool = multiprocessing.Pool(processes=4)
for result in pool.imap_unordered(add, [(1,2),(3,4)]):
    print(result)
Answers for either python 2.7 or 3.x are welcome.
It's best not to alter the original function interface, making it less Pythonic.
In Python 2, write a wrapper function to use with multiprocessing.
def _add(args):
    return add(*args)
In Python 3, just use Pool.starmap instead:
>>> def add(a, b):
... return a + b
...
>>> p = Pool()
>>> list(p.starmap(add, [(1, 2), ('hello', ' world')]))
[3, 'hello world']
If you are worried about repeating yourself (a and b appear too many times), simply give the incoming tuple a non-descriptive name.
def add(t):
    a, b = t
    return a + b
Or, in your specific case, you can avoid a and b altogether by indexing the tuple:
def add(addends):
    return addends[0] + addends[1]
As an alternative, you could wrap your function so the source code has the familiar argument format, but the function in use has the tuple argument:
def tupleize(func):
    def wrapper(tup):
        return func(*tup)
    return wrapper

@tupleize
def add(a, b):
    return a + b

t = 1, 2
assert add(t) == 3
As I was writing this question, I found the way to do it in Python 2.7:
def add4((a, b)):
    return a + b
However apparently this no longer works in Python 3, so additional answers regarding Python 3 would still be helpful.
You could use a decorator to extend the multiprocessing library function to take multiple arguments, do whatever you want to them, and then call it with a single argument.
For example, a simple decorator that takes any number of arguments, sums them together, then calls the original function with the total as a single argument:
import thirdpartylib

def sum_args(func):
    def inner(*args):
        return func(sum(args))
    return inner

# Replace imported function with decorated version
thirdpartylib = sum_args(thirdpartylib)

# Decorate your own libraries
@sum_args
def my_own_lib(number):
    print("A:", number)

thirdpartylib(1, 2, 3, 4)
my_own_lib(5, 10, 15)
The main advantage is that you can decorate/replace any number of methods with this same decorator function to achieve the same effect.
(disclaimer: not a Python kid, so please be gentle)
I am trying to compose functions using the following:
import functools

def compose(*functions):
    return functools.reduce(lambda acc, f: lambda x: acc(f(x)), functions, lambda x: x)
which works as expected for scalar functions. I'd like to work with functions returning tuples and others taking multiple arguments, eg.
def dummy(name):
    return (name, len(name), name.upper())

def transform(name, size, upper):
    return (upper, -size, name)

# What I want to achieve using composition,
# i.e. f = compose(transform, dummy)
transform(*dummy('Australia'))
# => ('AUSTRALIA', -9, 'Australia')
Since dummy returns a tuple and transform takes three arguments, I need to unpack the value.
How can I achieve this using my compose function above? If I try like this, I get:
f = compose(transform, dummy)
f('Australia')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in <lambda>
File "<stdin>", line 2, in <lambda>
TypeError: transform() takes exactly 3 arguments (1 given)
Is there a way to change compose such that it will unpack where needed?
This one works for your example, but it won't handle just any arbitrary function - it only works with positional arguments, and (of course) the signature of each function must match the return value of the previous (w.r.t. application order) one.
def compose(*functions):
    return functools.reduce(
        lambda f, g: lambda *args: f(*g(*args)),
        functions,
        lambda *args: args
    )
Note that using reduce here, while certainly idiomatic in functional programming, is rather unpythonic. The "obvious" pythonic implementation would use iteration instead:
def itercompose(*functions):
    def composed(*args):
        for func in reversed(functions):
            args = func(*args)
        return args
    return composed
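For instance, applied to the dummy and transform functions from your question, either version gives the result you are after:
f = compose(transform, dummy)     # or itercompose(transform, dummy)
print(f('Australia'))             # ('AUSTRALIA', -9, 'Australia')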
Edit:
You ask "Is there a way to have a compose function which will work in both cases" - "both cases" here meaning whether the functions return an iterable or not (what you call "scalar functions", a concept that has no meaning in Python).
Using the iteration-based implementation, you could just test whether the return value is iterable and wrap it in a tuple if it isn't, i.e.:
import collections.abc

def itercompose(*functions):
    def composed(*args):
        for func in reversed(functions):
            if not isinstance(args, collections.abc.Iterable):
                args = (args,)
            args = func(*args)
        return args
    return composed
but this is not guaranteed to work as expected - actually it is even guaranteed to NOT work as expected for most use cases. There are a lot of builtin iterable types in Python (and even more user-defined ones), and just knowing that an object is iterable doesn't say much about its semantics.
For example a dict or str is iterable but in this case should obviously be considered a "scalar". A list is iterable too, and how it should be interpreted here is actually undecidable without knowing exactly what it contains and what the "next" function in composition order expects - in some cases you will want to treat it as a single argument, in other cases as a list of args.
IOW only the caller of the compose() function can really tell how each function result should be considered - actually you might even have cases where you want a tuple to be considered as a "scalar" value by the next function. So to make a long story short: no, there's no one-size-fits-all generic solution in Python. The best I could think of requires a combination of result inspection and manual wrapping of composed functions so the result is properly interpreted by the "composed" function but at this point manually composing the functions will be both way simpler and much more robust.
FWIW remember that Python is first and foremost a dynamically typed, object-oriented language, so while it does have decent support for functional programming idioms it's obviously not the best tool for real functional programming.
You might consider inserting a "function" (really, a class constructor) in your compose chain to signal the unpacking of the prior/inner function's results. You would then adjust your composer function to check for that class to determine if the prior result should be unpacked. (You actually end up doing the reverse: tuple-wrap all function results except those signaled to be unpacked -- and then have the composer unpack everything.) It adds overhead, it's not at all Pythonic, it's written in a terse lambda style, but it does accomplish the goal of being able to properly signal in a function chain when the composer should unpack a result. Consider the following generic code, which you can then adapt to your specific composition chain:
from functools import reduce
from operator import add

class upk: # class constructor signals composer to unpack prior result
    def __init__(s, r): s.r = r # hold function's return for wrapper function

idt = lambda x: x # identity
wrp = lambda x: x.r if isinstance(x, upk) else (x,) # wrap all but unpackables
com = lambda *fs: ( # unpackable compose, unpacking whenever upk is encountered
    reduce(lambda a, f: lambda *x: a(*wrp(f(*x))), fs, idt))

foo = com(add, upk, divmod) # upk signals divmod's results should be unpacked
print(foo(6, 4))
This circumvents the problem, as called out by prior answers/comments, of requiring your composer to guess which types of iterables should be unpacked. Of course, the cost is that you must explicitly insert upk into the callable chain whenever unpacking is required. In that sense, it is by no means "automatic", but it is still a fairly simple/terse way of achieving the intended result while avoiding unintended wraps/unwraps in many corner cases.
The compose function in the answer contributed by Bruno did the job for functions with multiple arguments, but unfortunately it no longer worked for scalar ones.
Using the fact that Python unpacks tuples into positional arguments, this is how I solved it:
import functools

def compose(*functions):
    def pack(x):
        return x if type(x) is tuple else (x,)
    return functools.reduce(
        lambda acc, f: lambda *y: f(*pack(acc(*pack(y)))),
        reversed(functions),
        lambda *x: x)
which now works just as expected, eg.
#########################
# scalar-valued functions
#########################
def a(x): return x + 1
def b(x): return -x
# explicit
> a(b(b(a(15))))
# => 17

# compose
> compose(a, b, b, a)(15)
# => 17
########################
# tuple-valued functions
########################
def dummy(x):
    return (x.upper(), len(x), x)

def trans(a, b, c):
    return (b, c, a)

# explicit
> trans(*dummy('Australia'))
# => (9, 'Australia', 'AUSTRALIA')

# compose
> compose(trans, dummy)('Australia')
# => (9, 'Australia', 'AUSTRALIA')
And this also works with multiple arguments:
def add(x, y): return x + y

# explicit
> b(a(add(5, 3)))
# => -9

# compose
> compose(b, a, add)(5, 3)
# => -9
I tried to reimplement something like partial (which later will have more behavior). Now, in the following example, lazycall1 seems to work just as well as lazycall2, so I don't understand why the documentation of partial suggests using the longer second version. Any suggestions? Can it get me in trouble?
def lazycall1(func, *args, **kwargs):
    def f():
        func(*args, **kwargs)
    return f

def lazycall2(func, *args, **kwargs):
    def f():
        func(*args, **kwargs)
    f.func = func # why do I need that?
    f.args = args
    f.kwargs = kwargs
    return f
def A(x):
    print("A", x)

def B(x):
    print("B", x)

a1 = lazycall1(A, 1)
b1 = lazycall1(B, 2)
a1()
b1()

a2 = lazycall2(A, 3)
b2 = lazycall2(B, 4)
a2()
b2()
EDIT: Actually the answers given so far aren't quite right. Even with additional call-time arguments it would work. Is there another reason?
def lazycall(func, *args):
    def f(*args2):
        return func(*(args + args2))
    return f

def sum_up(a, b):
    return a + b

plusone = lazycall(sum_up, 1)
plustwo = lazycall(sum_up, 2)
print(plusone(6))  # 7
print(plustwo(9))  # 11
The only extra thing the second form has is some extra attributes. These might be helpful if you start passing around the functions returned by lazycall2, so that the receiving code can make decisions based on those values.
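For instance, with the lazycall2 version from your question:
a2 = lazycall2(A, 3)
print(a2.func is A, a2.args, a2.kwargs)   # True (3,) {}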
functools.partial can accept additional arguments - or overridden arguments - in the inner, returned function. Your inner f() functions don't, so there's no need for what you're doing in lazycall2. However, if you wanted to do something like this:
def sum(a, b):
    return a + b

plusone = lazycall3(sum, 1)
plusone(6)  # 7
You'd need to do what is shown in those docs.
Look closer at the argument names of the inner function newfunc in the Python documentation page you link to: they are different from those given to the outer function, args vs. fargs, keywords vs. fkeywords. Their implementation of partial saves the arguments that the outer function was given and adds them to the arguments given to the inner function.
Since you reuse the exact same argument names in your inner function, the original arguments to the outer function won't be accessible in there.
As for setting the func, args, and kwargs attributes on the returned function: a function is an object in Python, and you can set attributes on it. These attributes give you access to the original function and arguments after you have passed them into your lazycall functions. So a1.func will be A and a1.args will be (1,).
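For reference, the pure-Python equivalent sketched in the functools documentation looks roughly like this (the exact wording varies between Python versions):
def partial(func, *args, **keywords):
    def newfunc(*fargs, **fkeywords):
        newkeywords = {**keywords, **fkeywords}
        return func(*args, *fargs, **newkeywords)
    newfunc.func = func           # the attributes discussed above
    newfunc.args = args
    newfunc.keywords = keywords
    return newfunc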
If you don't need to keep track of the original function and arguments, you should be fine with your lazycall1.
I've been playing around in depth with attempting to write my own version of a memoizing decorator before I go looking at other people's code. It's more of an exercise in fun, honestly. However, in the course of playing around I've found I can't do something I want with decorators.
def addValue(func, val):
    def add(x):
        return func(x) + val
    return add

@addValue(val=4)
def computeSomething(x):
    # function gets defined
If I want to do that I have to do this:
def addTwo(func):
    return addValue(func, 2)

@addTwo
def computeSomething(x):
    # function gets defined
Why can't I use keyword arguments with decorators in this manner? What am I doing wrong and can you show me how I should be doing it?
You need to define a function that returns a decorator:
def addValue(val):
    def decorator(func):
        def add(x):
            return func(x) + val
        return add
    return decorator
When you write @addTwo, the value of addTwo is directly used as a decorator. However, when you write @addValue(4), first addValue(4) is evaluated by calling the addValue function. Then the result is used as a decorator.
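For example, with that definition (the body of computeSomething is just an illustration):
@addValue(4)
def computeSomething(x):
    return x * 10

print(computeSomething(3))   # 10 * 3 + 4 == 34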
You want to partially apply the function addValue - give the val argument, but not func. There are generally two ways to do this:
The first one is called currying and used in interjay's answer: instead of a function with two arguments, f(a,b) -> res, you write a function of the first arg that returns another function that takes the 2nd arg g(a) -> (h(b) -> res)
The other way is a functools.partial object. It stores the arguments you give it when it is created (val in your case) and, when the partial object is called, passes them on together with the arguments supplied at call time (here func, i.e. the function being decorated). You can add extra arguments when creating a partial, and once you call the partial, it uses all the extra arguments given.
from functools import partial

@partial(addValue, val=2)  # you can call this addTwo
def computeSomething(x):
    return x
Partials are usually a much simpler solution for this partial application problem, especially with more than one argument.
Decorators with any kinds of arguments -- named/keyword ones, unnamed/positional ones, or some of each -- essentially, ones you call on the @name line rather than just mention there -- need a double level of nesting (while the decorators you just mention have a single level of nesting). That goes even for argument-less ones if you want to call them in the @ line -- here's the simplest, do-nothing, double-nested decorator:
def double():
    def middling(f):
        return f
    return middling
You'd use this as
@double()
def whatever ...
note the parentheses (empty in this case since there are no arguments needed nor wanted): they mean you're calling double, which returns middling, which decorates whatever.
Once you've seen the difference between "calling" and "just mentioning", adding (e.g. optional) named args is not hard:
def doublet(foo=23):
    def middling(f):
        return f
    return middling
usable either as:
@doublet()
def whatever ...
or as:
@doublet(foo=45)
def whatever ...
or equivalently as:
@doublet(45)
def whatever ...