Running a function in parallel in Python

I'm trying to run a function like f in parallel in Python but have two problems:
When using map, function f is not applied to all permuted tuples of arrays a and b.
When trying to use Pool, I get the following error:
TypeError: '<=' not supported between instances of 'tuple' and 'int'
def f(n, m):
    x = n * m
    return x

a = (1, 2, 3, 4, 5)
b = (3, 4, 7, 8, 9)
result = map(f, a, b)
print(list(result))

# now trying parallel computing
from multiprocessing import Pool
pool = Pool(processes=4)
print(*pool.map(f, a, b))

For your #1 issue, I ran your code unchanged and got the expected result from map(). You seem to have an incorrect assumption of how it works, but you didn't provide expected vs. actual results for your example.
For #2 to return the same answers as #1, you need starmap() instead of map() for this use of multiprocessing, and you need to zip() the argument lists so each call receives a tuple of arguments. If you are on an OS that doesn't fork (and for portability even if you aren't), run global code only in the main process, not in a spawned process, by using the documented if __name__ == '__main__': idiom:
from multiprocessing import Pool

def f(n, m):
    x = n * m
    return x

if __name__ == '__main__':
    a = (1, 2, 3, 4, 5)
    b = (3, 4, 7, 8, 9)
    result = map(f, a, b)
    print(list(result))

    # now trying parallel computing
    pool = Pool(processes=4)
    print(*pool.starmap(f, zip(a, b)))
Output:
[3, 8, 21, 32, 45]
3 8 21 32 45
If you actually want permutations as mentioned in #1, use itertools.starmap or pool.starmap with itertools.product(a,b) as parameters instead.
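A minimal sketch of that variant (not part of the original answer, reusing the same f, a, and b): itertools.product(a, b) yields every (n, m) pairing, and pool.starmap unpacks each pair into f.
from itertools import product
from multiprocessing import Pool

def f(n, m):
    return n * m

if __name__ == '__main__':
    a = (1, 2, 3, 4, 5)
    b = (3, 4, 7, 8, 9)
    with Pool(processes=4) as pool:
        # 25 results, one for each (n, m) pair from product(a, b)
        print(pool.starmap(f, product(a, b)))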

Related

How to write a multithreaded function for processing different tasks concurrently?

I would like to define a do_in_parallel function in Python that will take in functions with arguments, make a thread for each, and run them in parallel. The function should work like so:
do_in_parallel(_sleep(3), _sleep(8), _sleep(3))
I am, however, having a hard time defining do_in_parallel to take multiple functions with multiple arguments each; here's my attempt:
from time import sleep
import threading

def do_in_parallel(*kwargs):
    tasks = []
    for func in kwargs.keys():
        t = threading.Thread(target=func, args=(arg for arg in kwargs[func]))
        t.start()
        tasks.append(t)
    for task in tasks:
        task.join()

def _sleep(n):
    sleep(n)
    print('slept', n)
Using it like so, I get the following error:
do_in_parallel(_sleep=3, _sleep=8, _sleep=3)
>> do_in_parallel(sleepX=3, sleepX=8, sleepX=3)
^
>> SyntaxError: keyword argument repeated
Can someone explain what I would need to change in my function so that it can take multiple function parameters like so:
do_in_parallel(_sleep(3), _sleep(8), maybe_do_something(else, and_else))
do_in_parallel(_sleep(3), _sleep(8), maybe_do_something(else, and_else))
This call structure wouldn't work anyway since you are passing the results of your target functions to do_in_parallel (you are already calling _sleep etc.).
What you need to do instead is bundle up tasks and pass them to your processing function. A task here is a tuple containing the target function to be called and an argument tuple: task = (_sleep, (n,)).
I suggest you then use a ThreadPool and the apply_async method to process the separate tasks.
from time import sleep
from multiprocessing.dummy import Pool  # .dummy.Pool is a ThreadPool

def _sleep(n):
    sleep(n)
    result = f'slept {n}'
    print(result)
    return result

def _add(a, b):
    result = a + b
    print(result)
    return result

def do_threaded(tasks):
    with Pool(len(tasks)) as pool:
        results = [pool.apply_async(*t) for t in tasks]
        results = [res.get() for res in results]
    return results

if __name__ == '__main__':
    tasks = [(_sleep, (i,)) for i in [3, 8, 3]]
    # [(<function _sleep at 0x7f035f844ea0>, (3,)),
    #  (<function _sleep at 0x7f035f844ea0>, (8,)),
    #  (<function _sleep at 0x7f035f844ea0>, (3,))]
    tasks += [(_add, (a, b)) for a, b in zip(range(0, 3), range(10, 13))]
    print(do_threaded(tasks))
Output:
10
12
14
slept 3
slept 3
slept 8
['slept 3', 'slept 8', 'slept 3', 10, 12, 14]
Process finished with exit code 0
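For comparison (not part of the original answer), the same bundled tasks could also be driven with concurrent.futures.ThreadPoolExecutor from the standard library; a minimal sketch:
from concurrent.futures import ThreadPoolExecutor

def do_threaded_cf(tasks):
    # tasks is the same list of (function, argument-tuple) pairs as above
    with ThreadPoolExecutor(max_workers=len(tasks)) as executor:
        futures = [executor.submit(func, *args) for func, args in tasks]
        # result() blocks until the corresponding task has finished
        return [future.result() for future in futures]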

Python multiprocessing pool doesn't take an iterable as an argument

I've read many posts here about multiprocessing.pool, but I still don't understand where the problem in my code is.
I want to parallelize a function using a multiprocessing pool in Python. The function takes one argument and returns two values. I want this one argument to be an integer and want to iterate over it. I've tried the examples I've seen here, but they don't work for me (apparently I'm doing something wrong, but what?)
My code:
import multiprocessing
from multiprocessing import Pool

def function(num):
    res1 = num ** 2   # calculate something
    res2 = num + num  # calculate something
    return res1, res2

if __name__ == '__main__':
    num = 10
    pool = multiprocessing.Pool(processes=4)
    # next line works, but with [something, something, ...] as an argument
    result = pool.map(function, [1, 100, 10000])
    # next line doesn't work and I have no idea why!
    result2 = pool.map(function, range(num))
    pool.close()
    pool.join()
    print(result2)
I get TypeError: 'float' object is not subscriptable when I calculate result2.
Would be grateful for help!
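For reference, a minimal self-contained sketch of the pattern described above (pool.map over range(num) with a function returning two values), mirroring the question's names:
from multiprocessing import Pool

def function(num):
    return num ** 2, num + num

if __name__ == '__main__':
    num = 10
    with Pool(processes=4) as pool:
        result2 = pool.map(function, range(num))
    print(result2)  # a list of (res1, res2) tuples, one per input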

How to parallel sum a loop using multiprocessing in Python

I am having difficulty understanding how to use Python's multiprocessing module.
I have a sum from 1 to n where n=10^10, so the range is too large to fit into a list, which seems to be the approach of many examples online that use multiprocessing.
Is there a way to "split up" the range into segments of a certain size and then perform the sum for each segment?
For instance
def sum_nums(low, high):
    result = 0
    for i in range(low, high + 1):
        result += i
    return result
And I want to compute sum_nums(1, 10**10) by breaking it up into many pieces, sum_nums(1, 1000) + sum_nums(1001, 2000) + sum_nums(2001, 3000)... and so on. I know there is a closed form n(n+1)/2, but pretend we don't know that.
Here is what I've tried
import multiprocessing

def sum_nums(low, high):
    result = 0
    for i in range(low, high + 1):
        result += i
    return result

if __name__ == "__main__":
    n = 1000
    procs = 2
    sizeSegment = n / procs

    jobs = []
    for i in range(0, procs):
        process = multiprocessing.Process(target=sum_nums, args=(i * sizeSegment + 1, (i + 1) * sizeSegment))
        jobs.append(process)

    for j in jobs:
        j.start()
    for j in jobs:
        j.join()

    # where is the result?
I find the usage of multiprocessing.Pool and map() much simpler.
Using your code:
from multiprocessing import Pool

def sum_nums(args):
    low = int(args[0])
    high = int(args[1])
    return sum(range(low, high + 1))

if __name__ == "__main__":
    n = 1000
    procs = 2
    sizeSegment = n / procs

    # Create size segments list
    jobs = []
    for i in range(0, procs):
        jobs.append((i * sizeSegment + 1, (i + 1) * sizeSegment))

    pool = Pool(procs).map(sum_nums, jobs)
    result = sum(pool)
>>> print result
>>> 500500
You can do this sum without multiprocessing at all, and it's probably simpler, if not faster, to just use generators.
# prepare a generator of generators each at 1000 point intervals
>>> xr = (xrange(1000*i+1,i*1000+1001) for i in xrange(10000000))
>>> list(xr)[:3]
[xrange(1, 1001), xrange(1001, 2001), xrange(2001, 3001)]
# sum, using two map functions
>>> xr = (xrange(1000*i+1,i*1000+1001) for i in xrange(10000000))
>>> sum(map(sum, map(lambda x:x, xr)))
50000000005000000000L
However, if you want to use multiprocessing, you can also do this too. I'm using a fork of multiprocessing that is better at serialization (but otherwise, not really different).
>>> xr = (xrange(1000*i+1,i*1000+1001) for i in xrange(10000000))
>>> import pathos
>>> mmap = pathos.multiprocessing.ProcessingPool().map
>>> tmap = pathos.multiprocessing.ThreadingPool().map
>>> sum(tmap(sum, mmap(lambda x:x, xr)))
50000000005000000000L
The version w/o multiprocessing is faster and takes about a minute on my laptop. The multiprocessing version takes a few minutes due to the overhead of spawning multiple python processes.
If you are interested, get pathos here: https://github.com/uqfoundation
First, the best way to get around the memory issue is to use an iterator/generator instead of a list:
def sum_nums(low, high):
    result = 0
    for i in xrange(low, high + 1):
        result += i
    return result
(In Python 3, range() is already lazy, so this change is only needed in Python 2.)
Now, multiprocessing comes in when you want to split the work across different processes or CPU cores. If you don't need to control the individual workers, then the easiest method is to use a process pool. This lets you map a function over the pool and collect the output. Alternatively, you can use apply_async to submit jobs to the pool one at a time and get a delayed result object whose value you retrieve with .get():
import multiprocessing
from multiprocessing import Pool
from time import time

def sum_nums(low, high):
    result = 0
    for i in xrange(low, high + 1):
        result += i
    return result

# map requires a function to handle a single argument
def sn((low, high)):
    return sum_nums(low, high)

if __name__ == '__main__':
    #t = time()
    # takes forever
    #print sum_nums(1, 10**10)
    #print '{} s'.format(time() - t)

    p = Pool(4)
    n = int(1e8)
    r = range(0, 10**10 + 1, n)
    results = []

    # using apply_async
    t = time()
    for arg in zip([x + 1 for x in r], r[1:]):
        results.append(p.apply_async(sum_nums, arg))
    # wait for results
    print sum(res.get() for res in results)
    print '{} s'.format(time() - t)

    # using process pool
    t = time()
    print sum(p.map(sn, zip([x + 1 for x in r], r[1:])))
    print '{} s'.format(time() - t)
On my machine, just calling sum_nums with 10**10 takes almost 9 minutes, but using a Pool(8) and n=int(1e8) reduces this to just over a minute.
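For readers on Python 3 (the snippets above use Python 2 idioms such as xrange, print statements, and tuple parameters), here is a minimal sketch of the same chunked Pool.map approach; sum_chunk is an assumed helper name, not from the original answers.
from multiprocessing import Pool

def sum_chunk(bounds):
    # sum the integers in the inclusive interval [low, high]
    low, high = bounds
    return sum(range(low, high + 1))

if __name__ == '__main__':
    n = 10 ** 10
    chunk = 10 ** 8
    edges = range(0, n + 1, chunk)                     # 0, 1e8, 2e8, ..., 1e10
    bounds = zip((lo + 1 for lo in edges), edges[1:])  # (1, 1e8), (1e8+1, 2e8), ...
    with Pool(4) as pool:
        print(sum(pool.map(sum_chunk, bounds)))        # 50000000005000000000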

recursive function calls and queuing in python

I have the sketch of my code as follows:
def func1(c):
    return a, b

def func2(c, x):
    if condition:
        a, b = func1(c)
        x.append(a, b)
        func2(a, x)
        func2(b, x)
    return x

x = []
y = func2(c, x)
The problem, as you might have figured out from the code, is that I would like func2(b) to be computed in parallel with func2(a) whenever condition is true, i.e. before b is replaced by a new b from func2(a). But according to my algorithm this clearly cannot happen, because of the new b's.
I do think such a problem might be perfect for a parallel computing approach, but I have not used it before and my knowledge about it is quite limited. I did try the suggestion from How to do parallel programming in Python, though, but I got the same result as the sketch above.
Caveat: Threading might not be parallel enough for you (see https://docs.python.org/2/library/threading.html note on the Global Interpreter Lock) so you might have to use the multiprocessing library instead (https://docs.python.org/2/library/multiprocessing.html).
...So I've cheated/been-lazy & used a thread/process neutral term "job". You'll need to pick either threading or multiprocessing for everywhere that I use "job".
def func1(c):
    return a, b

def func2(c, x):
    if condition:
        a, b = func1(c)
        x.append(a, b)
        a_job = None
        if (number_active_jobs() >= NUM_CPUS):
            # do a and b sequentially
            func2(a, x)
        else:
            a_job = fork_job(func2, a, x)
        func2(b, x)
        if a_job is not None:
            join(a_job)

x = []
func2(c, x)
# all results are now in x (don't need y)
...that will be best if you need a,b pairs to finish together for some reason.
If you're willing to let the scheduler go nuts, you could "job" them all & then join at the end:
def func1(c):
    return a, b

def func2(c, x):
    if condition:
        a, b = func1(c)
        x.append(a, b)
        if (number_active_jobs() >= NUM_CPUS):
            # do a and b sequentially
            func2(a, x)
        else:
            all_jobs.append(fork_job(func2, a, x))
        # TODO: the same job-or-sequential for func2(b, x)

all_jobs = []
x = []
func2(c, x)
for j in all_jobs:
    join(j)
# all results are now in x (don't need y)
The NUM_CPUS check could be done with threading.activeCount() instead of a full-blown worker thread pool (python - how to get the number of active threads started by specific class?).
But with multiprocessing you'd have more work to do with JoinableQueue and a fixed-size Pool of workers.
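As a concrete illustration (my own sketch, not part of the answer), the "job" placeholders could be filled in with threading like this; number_active_jobs, fork_job, and join keep the answer's names:
import threading

NUM_CPUS = 4  # assumed limit; tune for your machine

def number_active_jobs():
    # count of currently alive threads, including the main thread
    return threading.active_count()

def fork_job(func, *args):
    # start func(*args) in a background thread and return the handle
    job = threading.Thread(target=func, args=args)
    job.start()
    return job

def join(job):
    # wait for a previously forked job to finish
    job.join()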
From your explanation I have a feeling that it is not b that gets updated (it isn't, as DouglasDD explained), but x. To let both recursive calls work on the same x, you need to take some sort of snapshot of x. The simplest way is to pass the index of the newly appended tuple, along the lines of:
def func2(c, x, length):
    ...
    x.append(a, b)
    func2(a, x, length + 1)
    func2(b, x, length + 1)

Python Multiprocessing with a single function

I have a simulation that is currently running, but the ETA is about 40 hours -- I'm trying to speed it up with multi-processing.
It essentially iterates over 3 values of one variable (L), and over 99 values of a second variable (a). Using these values, it runs a complex simulation and returns 9 different standard deviations. Thus (even though I haven't coded it that way yet) it is essentially a function that takes two values as inputs (L, a) and returns 9 values.
Here is the essence of the code I have:
STD_1 = []
STD_2 = []
# etc.

for L in range(0, 6, 2):
    for a in range(1, 100):
        ### simulation code ###
        STD_1.append(value_1)
        STD_2.append(value_2)
        # etc.
Here is what I can modify it to:
master_list = []

def simulate(a, L):
    ### simulation code ###
    return (a, L, STD_1, STD_2, etc.)

for L in range(0, 6, 2):
    for a in range(1, 100):
        master_list.append(simulate(a, L))
Since each of the simulations is independent, this seems like an ideal place to implement some sort of multi-threading/processing.
How exactly would I go about coding this?
EDIT: Also, will everything be returned to the master list in order, or could it possibly be out of order if multiple processes are working?
EDIT 2: This is my code -- but it doesn't run correctly. It asks if I want to kill the program right after I run it.
import multiprocessing

data = []
for L in range(0, 6, 2):
    for a in range(1, 100):
        data.append((L, a))
print(data)

def simulation(arg):
    # unpack the tuple
    a = arg[1]
    L = arg[0]
    STD_1 = a ** 2
    STD_2 = a ** 3
    STD_3 = a ** 4
    # simulation code #
    return (STD_1, STD_2, STD_3)

print("1")
p = multiprocessing.Pool()
print("2")
results = p.map(simulation, data)
EDIT 3: Also, what are the limitations of multiprocessing? I've heard that it doesn't work on OS X. Is this correct?
Wrap the data for each iteration up into a tuple.
Make a list data of those tuples.
Write a function f to process one tuple and return one result.
Create a p = multiprocessing.Pool() object.
Call results = p.map(f, data)
This will run as many instances of f as your machine has cores, in separate processes.
Edit1: Example:
from multiprocessing import Pool
data = [('bla', 1, 3, 7), ('spam', 12, 4, 8), ('eggs', 17, 1, 3)]
def f(t):
name, a, b, c = t
return (name, a + b + c)
p = Pool()
results = p.map(f, data)
print results
Edit2:
Multiprocessing should work fine on UNIX-like platforms such as OSX. Only platforms that lack os.fork (mainly MS Windows) need special attention. But even there it still works. See the multiprocessing documentation.
Here is one way to run it in parallel threads:
import threading

L_a = []
for L in range(0, 6, 2):
    for a in range(1, 100):
        L_a.append((L, a))
        # Add the rest of your objects here

def RunParallelThreads():
    # Create an index list
    indexes = range(0, len(L_a))
    # Create the output list
    output = [None for i in indexes]
    # Create all the parallel threads
    threads = [threading.Thread(target=simulate, args=(output, i)) for i in indexes]
    # Start all the parallel threads
    for thread in threads: thread.start()
    # Wait for all the parallel threads to complete
    for thread in threads: thread.join()
    # Return the output list
    return output

def simulate(list, index):
    (L, a) = L_a[index]
    list[index] = (a, L)  # Add the rest of your objects here

master_list = RunParallelThreads()
Use Pool().imap_unordered if ordering is not important. It will return results in a non-blocking fashion.
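A minimal sketch of the difference, reusing the data/simulation setup from the question: Pool.map returns results in the same order as the inputs, while imap_unordered yields each result as soon as it is ready.
from multiprocessing import Pool

def simulation(arg):
    L, a = arg
    return (a ** 2, a ** 3, a ** 4)

if __name__ == '__main__':
    data = [(L, a) for L in range(0, 6, 2) for a in range(1, 100)]
    with Pool() as p:
        in_order = p.map(simulation, data)                   # matches data's order
        as_ready = list(p.imap_unordered(simulation, data))  # completion order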
