concurrent.futures.Executor.map takes a variable number of iterables and calls the given function with one item from each. How should I call it if I have a generator that produces tuples that are normally unpacked in place?
The following doesn't work because each of the generated tuples is given as a different argument to map:
args = ((a, b) for (a, b) in c)
for result in executor.map(f, *args):
pass
Without the generator, the desired arguments to map might look like this:
executor.map(
f,
(i[0] for i in args),
(i[1] for i in args),
...,
(i[N] for i in args),
)
One argument that is repeated, one argument in c
from itertools import repeat
for result in executor.map(f, repeat(a), c):
pass
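For instance, a minimal runnable sketch (the function and values here are illustrative):
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

def f(a, b):
    return a + b

with ThreadPoolExecutor() as executor:
    # a=10 is repeated for every element of the second iterable
    for result in executor.map(f, repeat(10), [1, 2, 3]):
        print(result)  # 11, then 12, then 13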
Need to unpack items of c, and can unpack c
# (On Python 2, use itertools.izip to keep the transposition lazy.)
for result in executor.map(f, *zip(*c)):
    pass
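To see why this works, a small sketch with illustrative names; zip(*c) transposes the tuples into one iterable per parameter:
from concurrent.futures import ThreadPoolExecutor

def f(a, b):
    return a * b

c = [(1, 2), (3, 4), (5, 6)]
with ThreadPoolExecutor() as executor:
    # zip(*c) yields (1, 3, 5) and (2, 4, 6), one iterable per parameter
    for result in executor.map(f, *zip(*c)):
        print(result)  # 2, then 12, then 30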
Need to unpack items of c, can't unpack c
Change f to take a single argument and unpack the argument in the function.
If each item in c has a variable number of members, or you're calling f only a few times:
executor.map(lambda args, f=f: f(*args), c)
It defines a new function that unpacks each item from c and calls f. Using a default argument for f in the lambda makes f local inside the lambda and so reduces lookup time.
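One caveat: lambdas cannot be pickled, so this trick only works with ThreadPoolExecutor. For ProcessPoolExecutor, a module-level wrapper does the same job (a sketch, assuming f and c as above):
def unpack_and_call(args):
    return f(*args)  # module-level, so worker processes can pickle it

executor.map(unpack_and_call, c)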
If you've got a fixed number of arguments, and you need to call f a lot of times:
from collections import deque

def itemtee(iterable, n=2):
    def gen(it=iter(iterable), items=deque()):
        popleft = items.popleft
        extend = items.extend
        while True:
            if not items:
                try:
                    extend(next(it))
                except StopIteration:
                    return  # PEP 479: don't let StopIteration escape a generator
            yield popleft()
    # The list holds the *same* generator n times; map() pulls from the
    # n slots in round-robin order, so each tuple is dealt out item by item.
    return [gen()] * n
executor.map(f, *itemtee(c, n))
Where n is the number of arguments to f. This is adapted from itertools.tee.
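A quick usage sketch (assuming a two-argument f):
from concurrent.futures import ThreadPoolExecutor

def f(a, b):
    return a - b

c = [(10, 1), (20, 2), (30, 3)]
with ThreadPoolExecutor() as executor:
    print(list(executor.map(f, *itemtee(c, 2))))  # [9, 18, 27]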
You need to remove the * on the map call:
args = ((a, b) for (a, b) in c)
for result in executor.map(f, args):
pass
This will call f once per item of args, where f should accept a single parameter (the tuple).
If you want f to accept two parameters you can use a lambda call like:
args = ((a, b) for (a, b) in c)
for result in executor.map(lambda p: f(*p), args): # (*p) does the unpacking part
pass
You can use partial application to create a new function via functools.partial:
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def some_func(param1, param2):
    # some code
    ...

# a is the repeated argument; map() supplies the remaining one
func = partial(some_func, a)
with ThreadPoolExecutor() as executor:
    executor.map(func, list_of_args)
    ...
If you need to fix more than one repeated parameter, you can pass them all to partial:
func = partial(some_func, a, b, c)
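As a concrete runnable sketch (the function and values here are illustrative):
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def scale(factor, offset, value):
    return factor * value + offset

# factor and offset are the repeated arguments; map() supplies value
func = partial(scale, 2, 10)
with ThreadPoolExecutor() as executor:
    print(list(executor.map(func, [1, 2, 3])))  # [12, 14, 16]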
So suppose you have a function which takes 3 arguments, where all 3 arguments are dynamic and change on every call. For example:
def multiply(a,b,c):
print(a * b * c)
To call this multiple times using threading, I would first create a list of tuples, where each tuple holds one set of values for a, b, c:
arguments = [(1,2,3), (4,5,6), (7,8,9), ....]
We know that concurrent.futures's map function accepts the target function as its first argument and, as its second, an iterable holding the arguments for each invocation. Therefore, you might make a call like this:
for _ in executor.map(multiply, arguments):  # TypeError
    pass
But this will give you an error saying the function expected 3 arguments but got only 1. To solve this problem, we create a helper function:
def helper(numbers):
multiply(numbers[0], numbers[1], numbers[2])
Now, we can call this function using the executor as follows:
with ThreadPoolExecutor() as executor:
for _ in executor.map(helper, arguments):
pass
That should give you the desired results.
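If you'd rather skip the helper, an equivalent trick (a sketch under the same setup) is to transpose the argument list so map() receives one iterable per parameter:
with ThreadPoolExecutor() as executor:
    # zip(*arguments) turns [(1, 2, 3), (4, 5, 6), ...] into
    # (1, 4, ...), (2, 5, ...) and (3, 6, ...)
    for _ in executor.map(multiply, *zip(*arguments)):
        pass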
Here's a code snippet showing how to send multiple arguments to a function with ThreadPoolExecutor:
import concurrent.futures
def hello(first_name: str, last_name: str) -> None:
"""Prints a friendly hello with first name and last name"""
print('Hello %s %s!' % (first_name, last_name))
def main() -> None:
"""Examples showing how to use ThreadPoolExecutor and executer.map
sending multiple arguments to a function"""
# Example 1: Sending multiple arguments using tuples
# Define tuples with sequential arguments to be passed to hello()
args_names = (
('Bruce', 'Wayne'),
('Clark', 'Kent'),
('Diana', 'Prince'),
('Barry', 'Allen'),
)
with concurrent.futures.ThreadPoolExecutor() as executor:
# Using a lambda that unpacks each tuple f into hello(*f)
executor.map(lambda f: hello(*f), args_names)
print()
# Example 2: Sending multiple arguments using dict with named keys
# Define dicts with arguments as key names to be passed to hello()
kwargs_names = (
{'first_name': 'Bruce', 'last_name': 'Wayne'},
{'first_name': 'Clark', 'last_name': 'Kent'},
{'first_name': 'Diana', 'last_name': 'Prince'},
{'first_name': 'Barry', 'last_name': 'Allen'},
)
with concurrent.futures.ThreadPoolExecutor() as executor:
# Using a lambda that unpacks each dict f into hello(**f)
executor.map(lambda f: hello(**f), kwargs_names)
if __name__ == '__main__':
main()
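If you would rather avoid the lambdas entirely, hello takes two positional parameters, so you can also hand map() one iterable per parameter (a small sketch reusing the names above):
first_names = ('Bruce', 'Clark', 'Diana', 'Barry')
last_names = ('Wayne', 'Kent', 'Prince', 'Allen')
with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(hello, first_names, last_names)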
Let's say you have data in a data frame, as shown below, and you want to pass the first two columns to a function that will read the images, extract the features, then compute and return the difference value.
Note: you can have any scenario per your requirements, and you can define the function accordingly.
The code snippet below takes these two columns as arguments and passes them to the thread-pool mechanism (also showing a progress bar):
import numpy as np
from tqdm import tqdm
from concurrent.futures import ThreadPoolExecutor

def getDifference(image_1_loc, image_2_loc, esp=1e-7):
    """Return the difference of two image feature matrices."""
    arr1 = ...  # read 1st image and extract its features
    arr2 = ...  # read 2nd image and extract its features
    diff = arr1.ravel() - arr2.ravel() + esp
    return diff

# Using ThreadPoolExecutor from concurrent.futures with multiple arguments
with ThreadPoolExecutor() as executor:
    result = np.array(list(tqdm(
        executor.map(lambda x: getDifference(*x),
                     df[['image_1', 'image_2']].values),
        total=len(df),
    )))
For ProcessPoolExecutor.map():
Similar to map(func, *iterables) except:
the iterables are collected immediately rather than lazily;
func is executed asynchronously and several calls to func may be made
concurrently.
Therefore, the usage of ProcessPoolExecutor.map() is the same as that of Python's built-in map(). Here is what its docs say:
Return an iterator that applies function to every item of iterable,
yielding the results. If additional iterable arguments are passed,
function must take that many arguments and is applied to the items
from all iterables in parallel.
Conclusion: pass several iterables to map(), one per parameter.
Try running the following snippet under Python 3, and you will be quite clear:
from concurrent.futures import ProcessPoolExecutor
def f(a, b):
print(a+b)
with ProcessPoolExecutor() as pool:
pool.map(f, (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2))
# 0, 2, 4 (the shorter iterable wins, as with zip())
array = [(i, i) for i in range(3)]
with ProcessPoolExecutor() as pool:
pool.map(f, *zip(*array))
# 0, 2, 4
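Note: on platforms that spawn worker processes (Windows always, and macOS by default since Python 3.8), these ProcessPoolExecutor snippets must live under an if __name__ == '__main__': guard, or they will fail when the workers re-import the module.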
I have seen so many answers here, but none of them is as straightforward as using a lambda expression:
def foo(x, y):
    pass
Want to call the above method 10 times with the same values, i.e. xVal and yVal?
with concurrent.futures.ThreadPoolExecutor() as executor:
for _ in executor.map( lambda _: foo(xVal, yVal), range(0, 10)):
pass
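itertools.repeat gives the same effect without the dummy range argument (a sketch assuming foo, xVal and yVal from above):
from itertools import repeat

with concurrent.futures.ThreadPoolExecutor() as executor:
    for _ in executor.map(foo, repeat(xVal, 10), repeat(yVal, 10)):
        pass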
This works for me:
from concurrent.futures import ThreadPoolExecutor

def concurrent_function(function, args_list):
    with ThreadPoolExecutor() as executor:
        executor.map(function, args_list)

def concurrent_multiply(args={'a': 1, 'b': 2}):
    print(args['a'] * args['b'])

concurrent_function(concurrent_multiply, [{'a': 1, 'b': 1},
                                          {'a': 2, 'b': 2},
                                          {'a': 3, 'b': 3}])
A simple utility that I use all the time is below.
########### Start of Utility Code ###########
import os
import sys
import traceback
from concurrent import futures
from functools import partial
def catch(fn):
def wrap(*args, **kwargs):
result = None
try:
result = fn(*args, **kwargs)
except Exception as err:
type_, value_, traceback_ = sys.exc_info()
return None, (
args,
"".join(traceback.format_exception(type_, value_, traceback_)),
)
else:
return result, (args, None)
return wrap
def top_level_wrap(fn, arg_tuple):
    args, kwargs = arg_tuple
    return fn(*args, **kwargs)
def create_processes(fn, values, handle_error, handle_success):
cores = os.cpu_count()
max_workers = 2 * cores + 1
to_exec = partial(top_level_wrap, fn)
with futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
for result, error in executor.map(to_exec, values):
args, tb = error
if tb is not None:
handle_error(args, tb)
else:
handle_success(result)
########### End of Utility Code ###########
Example usage -
######### Start of example usage ###########
import time
@catch
def fail_when_5(val):
time.sleep(val)
if val == 5:
raise Exception("Error - val was 5")
else:
return f"No error val is {val}"
def handle_error(args, tb):
print("args is", args)
print("TB is", tb)
def top_level(val, val_2, test=None, test2="ok"):
print(val_2, test, test2)
return fail_when_5(val)
handle_success = print
if __name__ == "__main__":
# SHAPE -> ( (args, kwargs), (args, kwargs), ... )
values = tuple(
((x, x + 1), {"test": f"t_{x+2}", "test2": f"t_{x+3}"}) for x in range(10)
)
create_processes(top_level, values, handle_error, handle_success)
######### End of example usage ###########
Related
I want to find a clear and efficient way to change a parameter value set for functools.partial.
Let's see a simple example:
from functools import partial
def fn(a,b,c,d,e):
print(a,b,c,d,e)
fn12 = partial(fn, 1,2)
Later, I want to have something like:
fn12[0] = 7
to replace the value at a specific position without creating a new partial, because there's pretty heavy code around it.
Addendum: I'm asking about the general possibility of changing a partial's stored values.
The naive example would be something like:
def printme(a, b, c, d, e):
    print(a, b, c, d, e)

class my_partial:
    def __init__(self, fn, *args):
        self.__func__ = fn
        self.args = list(args)

    def __call__(self, *next_args):
        call = self.args + list(next_args)
        return self.__func__(*call)

fn12 = my_partial(printme, 1, 2)
fn12(3, 4, 5)
fn12.args[1] = 7
fn12(3, 4, 5)
I need that for example for widgets, where action function is defined like :
rb.config(command = partial(...))
but then I'd like to change some of the parameters given in the partial. I could create a new partial again, but that looks kinda messy.
If it is permissible to look into the implementation of partial, then using __reduce__ and __setstate__ you can replace the args wholesale:
from functools import partial
def fn(a,b,c,d,e):
print(a,b,c,d,e)
fn12 = partial(fn, 1,2)
def replace_args(part, new_args):
    # partial.__reduce__() returns (class, (func,), state), where state
    # is the tuple (func, args, keywords, namespace dict)
    _, _, state = part.__reduce__()
    f, _, k, n = state
    part.__setstate__((f, new_args, k, n))
fn12('c','d','e')
replace_args(fn12, (7,2))
fn12('c','d','e')
Output:
1 2 c d e
7 2 c d e
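If creating a new partial turns out to be acceptable after all, a less intrusive route is to rebuild it from the attributes every partial exposes (func, args and keywords):
fn12 = partial(fn12.func, 7, *fn12.args[1:], **fn12.keywords)
fn12('c', 'd', 'e')  # 7 2 c d e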
You can update partial parameters. For example if you have a function like this:
def f(a, b):
return a*b
func = partial(f, b=2)
func(1) # result: 1*2=2
Now, you can update the partial parameter b like this:
func(1, b=7) # result: 1*7=7
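Note that this only works because b was fixed as a keyword argument. An argument frozen positionally, as in partial(f, 2), cannot be overridden this way; calling func(1, a=7) raises TypeError: f() got multiple values for argument 'a'.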
I wanted to learn about using map in Python, and a google search brought me to http://www.bogotobogo.com/python/python_fncs_map_filter_reduce.php, which I have found helpful.
One of the code samples on that page uses map within a for loop in an interesting way, and the list passed to the map function actually holds 2 functions. Here is the code:
def square(x):
return (x**2)
def cube(x):
return (x**3)
funcs = [square, cube]
for r in range(5):
value = map(lambda x: x(r), funcs)
print value
output:
[0, 0]
[1, 1]
[4, 8]
[9, 27]
[16, 64]
So, at this point in that tutorial, I thought "well, if you can write that code with a function made on the fly (a lambda), then it could be written using a standard function defined with def". So I changed the code to this:
def square(x):
return (x**2)
def cube(x):
return (x**3)
def test(x):
return x(r)
funcs = [square, cube]
for r in range(5):
value = map(test, funcs)
print value
I got the same output as with the first piece of code, but it bothered me that the variable r was taken from the global namespace and that the code is not tight functional programming. And that is where I got tripped up. Here is my code:
def square(x):
return (x**2)
def cube(x):
return (x**3)
def power(x):
return x(r)
def main():
funcs = [square, cube]
for r in range(5):
value = map(power, funcs)
print value
if __name__ == "__main__":
main()
I have played around with this code, but the issue is with passing r into def power(x): once r is local to main(), power() can no longer see it and raises a NameError. I have tried numerous ways of getting it in, whereas the lambda could pick up r from main() while x was automatically bound to each item of the list funcs.
Is there a way to do this by using a standard def function, or is it not possible and only lambda can be used? Since I am learning python and this is my first language, I am trying to understand what's going on here.
You could nest the power() function in the main() function:
def main():
def power(x):
return x(r)
funcs = [square, cube]
for r in range(5):
value = map(power, funcs)
print value
so that r is now taken from the surrounding scope again, but is not a global. Instead, it is a closure variable.
However, using a lambda is just another way to inject r from the surrounding scope and pass it into the power() function:
def power(r, x):
return x(r)
def main():
funcs = [square, cube]
for r in range(5):
value = map(lambda x: power(r, x), funcs)
print value
Here r is still a non-local, taken from the parent scope!
You could create the lambda with r being a default value for a second argument:
def power(r, x):
return x(r)
def main():
funcs = [square, cube]
for r in range(5):
value = map(lambda x, r=r: power(r, x), funcs)
print value
Now r is passed in as a default value instead, so it was taken as a local. But for the purposes of your map() that doesn't actually make a difference here.
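functools.partial expresses the same binding without the lambda (a one-line sketch, given the two-argument power above):
from functools import partial

value = map(partial(power, r), funcs)  # binds r as power's first argument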
Currying is another option. Because a function of two arguments is the same as a function of one argument that returns another function that takes the remaining argument, you can write it like this:
def square(x):
return (x**2)
def cube(x):
return (x**3)
def power(r):
return lambda x: x(r)  # This is where we construct our curried function
def main():
funcs = [square, cube]
for y in range(5):
value = map(power(y), funcs) # Here, we apply the first function
# to get at the second function (which
# was constructed with the lambda above).
print value
if __name__ == "__main__":
main()
To make the relation a little more explicit, a function of the type (a, b) -> c (a function that takes an argument of type a and an argument of type b and returns a value of type c) is equivalent to a function of type a -> (b -> c).
Extra stuff about the equivalence
If you want to get a little deeper into the math behind this equivalence, you can see this relationship using a bit of algebra. Viewing these types as algebraic data types, we can translate any function a -> b to b^a and any pair (a, b) to a * b. Sometimes function types are called "exponentials" and pair types are called "product types" because of this connection. From here, we can see that
c^(a * b) = (c^b)^a
and so,
(a, b) -> c ~= a -> (b -> c)
Why not simply pass the functions as part of the argument to power(), and use itertools.product to create the required (value, func) combinations?
from itertools import product
# ...
def power((value, func)):
return func(value)
for r in range(5):
values = map(power, product([r], funcs))
print values
Or if you don't want / require the results to be grouped by functions, and instead want a flat list, you could simply do:
values = map(power, product(range(5), funcs))
print values
Note: The signature power((value, func)) defines power() to accept a single 2-tuple argument that is automatically unpacked into value and func.
It's equivalent to
def power(arg):
value, func = arg
Is there a function equivalent to the * symbol, for expanding function arguments, in python? That's the entire question, but if you want an explanation of why I need that, continue reading.
In our code, we use tuples in certain places to define nested functions/conditions, to evaluate something like f(a, b, g(c, h(d))) at run time. The syntax is something like (fp = function pointer, c = constant):
nestedFunction = (fp1, c1, (fp2, c2, c3), (fp3,))
At run time, under certain conditions, that would be evaluated as:
fp1(c1, fp2(c2, c3), fp3())
Basically the first argument in each tuple is necessarily a function, the rest of the arguments in a tuple can either be constants or tuples representing other functions. The functions are evaluated from the inside out.
Anyways, you can see how the need for argument expansion, in the form of a function, could arise. And it turns out you cannot define something like:
def expand(myTuple):
return *myTuple
I can work around it by defining my functions carefully, but it would be nice to have argument expansion rather than hacking around the issue. And just FYI, changing this design isn't an option.
You'll need to write your own recursive function that applies arguments to functions in nested tuples:
def recursive_apply(*args):
for e in args:
yield e[0](*recursive_apply(*e[1:])) if isinstance(e, tuple) else e
then use that in your function call:
next(recursive_apply(nestedFunction))
The next() is required because recursive_apply() is a generator; you can wrap the next(recursive_apply(...)) expression in a helper function to ease use; here I bundled the recursive function in the local namespace:
def apply(nested_structure):
def recursive_apply(*args):
for e in args:
yield e[0](*recursive_apply(*e[1:])) if isinstance(e, tuple) else e
return next(recursive_apply(nested_structure))
Demo:
>>> def fp(num):
... def f(*args):
... res = sum(args)
... print 'fp{}{} -> {}'.format(num, args, res)
... return res
... f.__name__ = 'fp{}'.format(num)
... return f
...
>>> for i in range(3):
... f = fp(i + 1)
... globals()[f.__name__] = f
...
>>> c1, c2, c3 = range(1, 4)
>>> nestedFunction = (fp1, c1, (fp2, c2, c3), (fp3,))
>>> apply(nestedFunction)
fp2(2, 3) -> 5
fp3() -> 0
fp1(1, 5, 0) -> 6
6
I am trying to write a parameter search function to loop over one of the parameters and repeatedly call a function with all other parameters the same, other than the one I am searching over. Here is some sample code:
def worker1(a, b, c):
return a + b + c
def worker2(d, e, f):
return d * e * f
def search(model, params):
res = []
# Loop over one of the parameters and repeatedly append to res
if model == 1:
res.append(worker1(**params))
elif model == 2:
res.append(worker2(**params))
return res
params = dict(a=1, b=2, c=3)
print search(1, params)
I have two workers and they are called depending on the value of the model flag I pass to search(). The problem I am trying to solve here is to write a loop (commented in the code) over the if statements to repeatedly call say worker1 by varying only one of the parameters. I want my code to be flexible - sometimes I want to loop through a and keep b and c the same, but sometimes I want to loop through b and keep a and c the same.
I'm open to whatever solution is suggested, but I think I would be specifying the search parameters in the params dictionary. E.g. to loop a over 1,2,3,4, I would say:
`params = dict(a=[1,2,3,4], b=2, c=3)`
Also it would be nice if I don't have to modify the code for worker1 and worker2.
Thank you!
You could perhaps use itertools.product to call your workers with all combinations of params:
http://docs.python.org/2/library/itertools.html#itertools.product
eg
from itertools import product
def worker1(a, b, c):
return a + b + c
def worker2(d, e, f):
return d * e * f
def search(model, *params):
res = []
# Loop over one of the parameters and repeatedly append to res
for current_params in product(*params):
if model == 1:
res.append(worker1(*current_params))
elif model == 2:
res.append(worker2(*current_params))
return res
print search(1, [1,2,3,4], [2], [3])
# more complicated combinations are possible:
print search(1, [1,2,3,4], [2,7,9], [3,13,23,43])
I've avoided using keyword arguments as your worker functions take differently-named args so it wouldn't make much sense.
I'm assuming your worker functions don't actually look like the ones above, as if they did, you could further simplify the code using the built-in sum and reduce functions.
I am not sure if I understood the problem. Check if this is what you want (I omitted the model parameter):
>>> def worker1(a, b, c):
return a + b + c
>>> def search(params):
params = params.values()
var_param = filter(lambda p: type(p) == list, params)[0]
other_params = filter(lambda p: p != var_param, params)
return [worker1(x, *other_params) for x in var_param]
>>> search({'a':2, 'b':[3,4,5], 'c':3})
[8, 9, 10]
Assuming:
- the arguments of worker1() are commutative (order does not matter);
- the variable parameter is a list;
- the other parameters are single values.
In the above sample, b is the variable parameter which you want to loop over.
Update:
In case the order of the arguments of the function worker1 is to be preserved:
def search(params):
params = params.items()
var_param = filter(lambda t: type(t[1]) == list, params)[0]
other_params = filter(lambda t: t != var_param, params)
var_param_key = var_param[0]
var_param_values = var_param[1]
return [worker1(**dict([(var_param_key, x)] + other_params)) for x in var_param_values]
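For a more general variant that keeps the worker signatures untouched and accepts several list-valued parameters at once, a sketch along these lines may help (taking the product over listified values is an assumption about the desired semantics):
from itertools import product

def search(worker, **params):
    # wrap scalar values in single-element lists, then take the product
    keys = list(params)
    value_lists = [v if isinstance(v, list) else [v] for v in params.values()]
    return [worker(**dict(zip(keys, combo))) for combo in product(*value_lists)]

print(search(worker1, a=[1, 2, 3, 4], b=2, c=3))  # [6, 7, 8, 9]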
I have a function which performs an expensive operation and is called often; but, the operation only needs to be performed once - its result could be cached.
I tried making an infinite generator but I didn't get the results I expected:
>>> def g():
... result = "foo"
... while True:
... yield result
...
>>> g()
<generator object g at 0x1093db230> # why didn't it give me "foo"?
Why isn't g a generator?
>>> g
<function g at 0x1093de488>
Edit: it's fine if this approach doesn't work, but I need something which performs exactly like a regular function, like so:
>>> [g() for x in range(3)]
["foo", "foo", "foo"]
g is a generator function. Calling it returns a generator. You then need to use that generator to get your values, by looping over it, for example, or by calling next() on it:
gen = g()
value = next(gen)
Note that calling g() again will calculate the same value again and produce a new generator.
You may just want to use a global to cache the value. Storing it as an attribute on the function could work:
def g():
if not hasattr(g, '_cache'):
g._cache = 'foo'
return g._cache
A better way: @functools.lru_cache(maxsize=None). It has been backported to Python 2.7, or you could just write your own.
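With that decorator, the original zero-argument function becomes (maxsize=None means the cache never evicts):
import functools

@functools.lru_cache(maxsize=None)
def g():
    return "foo"  # the body runs once; later calls hit the cache

assert [g() for x in range(3)] == ["foo", "foo", "foo"]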
I am occasionally guilty of doing:
def foo():
if hasattr(foo, 'cache'):
return foo.cache
# do work
foo.cache = result
return result
Here's a dead-simple caching decorator. It doesn't take into account any variations in parameters; it just returns the same result after the first call. There are fancier ones out there that cache the result for each combination of inputs ("memoization").
import functools

def callonce(func):
    result = []
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not result:
            result.append(func(*args, **kwargs))
        return result[0]
    return wrapper
Usage:
@callonce
def long_running_function(x, y, z):
# do something expensive with x, y, and z, producing result
return result
If you would prefer to write your function as a generator for some reason (perhaps the result is slightly different on each call, but there's still a time-consuming initial setup, or else you just want C-style static variables that allow your function to remember some bit of state from one call to the next), you can use this decorator:
import functools

def gen2func(generator):
    gen = []
    @functools.wraps(generator)
    def wrapper(*args, **kwargs):
        if not gen:
            gen.append(generator(*args, **kwargs))
        return next(gen[0])
    return wrapper
Usage:
@gen2func
def long_running_function_in_generator_form(x, y, z):
# do something expensive with x, y, and z, producing result
while True:
yield result
result += 1 # for example
A Python 2.5 or later version that uses .send() to allow parameters to be passed to each iteration of the generator is as follows (note that **kwargs are not supported):
import functools

def gen2func(generator):
    gen = []
    @functools.wraps(generator)
    def wrapper(*args):
        if not gen:
            gen.append(generator(*args))
            return next(gen[0])
        return gen[0].send(args)
    return wrapper
@gen2func
def function_with_static_vars(a, b, c):
# time-consuming initial setup goes here
# also initialize any "static" vars here
while True:
# do something with a, b, c
a, b, c = yield # get next a, b, c
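Note that, as written, every call to function_with_static_vars returns None, because the bare a, b, c = yield supplies nothing back to .send(); yield an expression instead (e.g. a, b, c = yield result) if the caller needs a return value.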
A better option would be to use memoization. You can create a memoize decorator to wrap any function whose results you want to cache. You can find some good implementations here.
You can also leverage Beaker and its cache.
It also has tons of extensions.