multiprocessing.pool with manager and async methods - python

I am trying to use Manager() to share a dictionary between processes and tried the following code:
from multiprocessing import Manager, Pool
def f(d):
    d['x'] += 2
if __name__ == '__main__':
    manager = Manager()
    d = manager.dict()
    d['x'] = 2
    p = Pool(4)
    for _ in range(2000):
        p.map_async(f, (d,))  # apply_async, map
    p.close()
    p.join()
    print(d)  # expects this result --> {'x': 4002}
Using map_async and apply_async, the result printed is always different (e.g. {'x': 3838}, {'x': 3770}).
However, using map will give the expected result.
Also, I have tried using Process instead of Pool, and the results vary too.
Any insights?
Is it that the manager does not handle race conditions when the calls are non-blocking?

When you call map (rather than map_async), it blocks until the worker processes have finished all the requests you passed, which in your case is just one call to function f. So even though you have a pool size of 4, you are in essence running the 2000 tasks one at a time. To actually parallelize execution, you should have made a single call, p.map(f, [d]*2000), instead of the loop.
But when you call map_async, you do not block; you are returned a result object. A call to get on the result object blocks until the task finishes and returns the result of the function call. So now you are running up to 4 tasks at a time, but the update to the dictionary is not serialized across the processes. I have modified the code to force serialization of d['x'] += 2 by using a multiprocessing lock. You will see that the result is now 4002.
from multiprocessing import Manager, Pool, Lock
def f(d):
    lock.acquire()
    d['x'] += 2
    lock.release()
def init(l):
    global lock
    lock = l
if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()
        d['x'] = 2
        lock = Lock()  # create the multiprocessing lock that is sharable by all the processes
        p = Pool(4, initializer=init, initargs=(lock,))  # hand the lock to every worker
        results = []  # if the function returned a result we wanted
        for _ in range(2000):
            results.append(p.map_async(f, (d,)))  # apply_async, map
        """
        for i in range(2000):  # if the function returned a result we wanted
            results[i].get()  # wait for everything to finish
        """
        p.close()
        p.join()
        print(d)
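As a variation on the fix above, the lock can also come from the same Manager and be passed as an ordinary argument, which avoids the global and the initializer. This is only a minimal sketch, not the answerer's code: the names add_two and run are made up for illustration, and all 2000 tasks are submitted with a single starmap call, in the spirit of the single p.map call suggested earlier.

```python
from multiprocessing import Manager, Pool


def add_two(d, lock):
    """Increment d['x'] by 2 while holding the shared lock."""
    # d['x'] += 2 is a read followed by a write; the lock keeps the pair together.
    with lock:
        d['x'] += 2


def run(n_tasks=2000, workers=4):
    """Hypothetical driver: returns the final value of d['x']."""
    with Manager() as manager:
        d = manager.dict(x=2)
        lock = manager.Lock()  # a proxy object, so it can be sent to the workers
        with Pool(workers) as pool:
            # starmap blocks until every task has finished
            pool.starmap(add_two, [(d, lock)] * n_tasks)
        return d['x']


if __name__ == '__main__':
    print(run())  # 4002 expected: 2 + 2000 * 2
```

Every increment goes through the manager process, so this is slow for frequent updates; it is meant to show the locking pattern rather than a fast counter.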

Related

How to start functions in parallel, check if they are done, and start a new function in python?

I want to write Python code that does the following:
1. At first, it starts, say, 3 processes (or threads, or whatever) in parallel.
2. Then, in a loop, Python waits until any of the processes has finished (and returned some value).
3. Then the Python code starts a new function.
In the end, I want 3 processes always running in parallel until all the functions I need to run have run. Here is some pseudocode:
import time
import random
from multiprocessing import Process
# some random function which can have different execution time
def foo():
    time.sleep(random.randint(1, 10) + 2)
    return 42
# Start 3 functions
p = []
p.append(Process(target=foo))
p.append(Process(target=foo))
p.append(Process(target=foo))
while(True):
    # wait until one of the processes has finished
    ???
    # then add a new process so that always 3 are running in parallel
    p.append(Process(target=foo))
I am pretty sure it is not clear what I want. Please ask.
What you really want is to start three processes and feed a queue with jobs that you want executed. Then there will only ever be three processes and when one is finished, it reads the next item from the queue and executes that:
import time
import random
from multiprocessing import Process, Queue
# some random function which can have different execution time
def foo(a):
    print('foo', a)
    time.sleep(random.randint(1, 10) + 2)
    print(a)
    return 42
def readQueue(q):
    while True:
        item = q.get()
        if item:
            f, *args = item
            f(*args)
        else:
            return
if __name__ == '__main__':
    q = Queue()
    for a in range(4):  # create 4 jobs
        q.put((foo, a))
    for _ in range(3):  # sentinel for 3 processes
        q.put(None)
    # Start 3 processes
    p = []
    p.append(Process(target=readQueue, args=(q,)))
    p.append(Process(target=readQueue, args=(q,)))
    p.append(Process(target=readQueue, args=(q,)))
    for j in p:
        j.start()
    #time.sleep(10)
    for j in p:
        j.join()
You can use the Pool of the multiprocessing module.
from multiprocessing import Pool
def do_something(method):
    method()
if __name__ == '__main__':
    my_foos = [foo, foo, foo, foo]
    with Pool(3) as p:
        p.map(do_something, my_foos)
The number 3 sets the number of parallel jobs.
map passes each element of my_foos as the argument to the function do_something.
In your case, do_something can be a function that calls the functions you want processed, which are passed to map as a list.
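If the return values matter, or you want to react as soon as any job finishes, Pool.imap_unordered yields results in completion order while the pool keeps at most three jobs running. A minimal sketch, assuming each job can be described by a number; job and the shortened sleep times are placeholders, not part of the question.

```python
import random
import time
from multiprocessing import Pool


def job(n):
    """Placeholder task with a variable run time; returns (input, result)."""
    time.sleep(random.uniform(0.01, 0.1))
    return n, n * n


if __name__ == '__main__':
    with Pool(3) as pool:  # never more than 3 jobs run at once
        # Results arrive as each job finishes, not in submission order.
        for n, result in pool.imap_unordered(job, range(10)):
            print(f'job {n} finished with {result}')
```

The loop body is the place to handle "one process has finished"; the pool itself takes care of starting the next job on the freed worker.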

Sharing a counter with multiprocessing.Pool

I'd like to use multiprocessing.Value + multiprocessing.Lock to share a counter between separate processes. For example:
import itertools as it
import multiprocessing
def func(x, val, lock):
    for i in range(x):
        i ** 2
    with lock:
        val.value += 1
        print('counter incremented to:', val.value)
if __name__ == '__main__':
    v = multiprocessing.Value('i', 0)
    lock = multiprocessing.Lock()
    with multiprocessing.Pool() as pool:
        pool.starmap(func, ((i, v, lock) for i in range(25)))
    print(v.value)
This will throw the following exception:
RuntimeError: Synchronized objects should only be shared between processes through inheritance
What I am most confused by is that a related (albeit not completely analogous) pattern works with multiprocessing.Process():
if __name__ == '__main__':
    v = multiprocessing.Value('i', 0)
    lock = multiprocessing.Lock()
    procs = [multiprocessing.Process(target=func, args=(i, v, lock))
             for i in range(25)]
    for p in procs: p.start()
    for p in procs: p.join()
Now, I recognize that these are two markedly different things:
the first example uses a number of worker processes equal to cpu_count(), and splits an iterable range(25) between them
the second example creates 25 worker processes and tasks each with one input
That said: how can I share an instance with pool.starmap() (or pool.map()) in this manner?
I've seen similar questions here, here, and here, but those approaches don't seem to be suited to .map()/.starmap(), regardless of whether Value uses ctypes.c_int.
I realize that this approach technically works:
def func(x):
    for i in range(x):
        i ** 2
    with lock:
        v.value += 1
        print('counter incremented to:', v.value)
v = None
lock = None
def set_global_counter_and_lock():
    """Egh ... """
    global v, lock
    if not any((v, lock)):
        v = multiprocessing.Value('i', 0)
        lock = multiprocessing.Lock()
if __name__ == '__main__':
    # Each worker process will call `initializer()` when it starts.
    with multiprocessing.Pool(initializer=set_global_counter_and_lock) as pool:
        pool.map(func, range(25))
Is this really the best-practices way of going about this?
The RuntimeError you get when using Pool occurs because arguments for pool methods are pickled before being sent over a (pool-internal) queue to the worker processes.
Which pool-method you are trying to use is irrelevant here. This doesn't happen when you just use Process because there is no queue involved. You can reproduce the error just with pickle.dumps(multiprocessing.Value('i', 0)).
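The one-line reproduction mentioned above can be wrapped in a small script. This is just a sketch; the helper name try_pickle_value is made up for illustration.

```python
import pickle
from multiprocessing import Value


def try_pickle_value():
    """Return the error message raised when a Value is pickled directly."""
    counter = Value('i', 0)
    try:
        pickle.dumps(counter)
    except RuntimeError as exc:
        return str(exc)
    return None  # not expected when pickling outside of process creation


if __name__ == '__main__':
    print(try_pickle_value())
```

No pool is involved here, which shows that the error comes from pickling the Value itself and not from any particular pool method.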
Your last code snippet doesn't work how you think it works. You are not sharing a Value, you are recreating independent counters for every child process.
If you were on Unix and used the default start method "fork", you would be done by simply not passing the shared objects as arguments to the pool methods.
Your child processes would inherit the globals through forking. With the start methods "spawn" (the default on Windows, and on macOS with Python 3.8+) or "forkserver", you have to use the initializer during Pool instantiation to let the child processes inherit the shared objects.
Note, you don't need an extra multiprocessing.Lock here, because multiprocessing.Value comes by default with an internal one you can use.
import os
from multiprocessing import Pool, Value #, set_start_method
def func(x):
    for i in range(x):
        assert i == i
        with cnt.get_lock():
            cnt.value += 1
            print(f'{os.getpid()} | counter incremented to: {cnt.value}\n')
def init_globals(counter):
    global cnt
    cnt = counter
if __name__ == '__main__':
    # set_start_method('spawn')
    cnt = Value('i', 0)
    iterable = [10000 for _ in range(10)]
    with Pool(initializer=init_globals, initargs=(cnt,)) as pool:
        pool.map(func, iterable)
    assert cnt.value == 100000
Probably worth noting as well is that you don't need the counter to be shared in all cases.
If you just need to keep track of how often something has happened in total, an option would be to keep separate worker-local counters during computation which you sum up at the end.
This could result in a significant performance improvement for frequent counter updates for which you don't need synchronization during the parallel computation itself.
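A minimal sketch of that worker-local approach, assuming the thing being counted is "even numbers in a chunk"; count_even and the chunk layout are illustrative choices, not taken from the question.

```python
from multiprocessing import Pool


def count_even(chunk):
    """Count matching items locally; no shared state is touched."""
    local_count = 0
    for value in chunk:
        if value % 2 == 0:
            local_count += 1
    return local_count


if __name__ == '__main__':
    # Ten chunks of 10000 consecutive numbers each.
    chunks = [range(i * 10000, (i + 1) * 10000) for i in range(10)]
    with Pool() as pool:
        # Each task returns its own count; the parent adds them up once.
        total = sum(pool.map(count_even, chunks))
    print(total)  # 50000
```

No lock and no Value are needed, because the only communication is the return value of each task.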

Understanding Python Multiprocessing Documentation

Trying to understand Python multiprocessing documents.
I would put this on meta but I'm not sure whether it might be valuable to searchers later.
I need some guidance as to how these examples relate to multiprocessing.
Am I correct in thinking that multiprocessing is using multiple processes (and thus CPUs) in order to break down an iterable task and thus shorten its duration?
from multiprocessing import Process
def f(name):
    print('hello', name)
if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
We are starting one process; but how do I start multiple to complete my task? Do I iterate through Process + start() lines?
Yet there are no examples later in the documentation of for example:
for x in range(5):
    p[x]=Process(target=f, args=('bob',))
    p[x].start()
p.join()
Would that be the 'real life' implementation?
Here is the 'Queue Example':
from multiprocessing import Process, Queue
def f(q):
    q.put([42, None, 'hello'])
if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get()) # prints "[42, None, 'hello']"
    p.join()
But again, how is this multiprocessing? This is just starting a process and having it run objects in a queue?
How do I make multiple processes start and run the objects in the queue?
Finally for pool:
from multiprocessing import Pool
import time
def f(x):
    return x*x
if __name__ == '__main__':
    with Pool(processes=4) as pool: # start 4 worker processes
        result = pool.apply_async(f, (10,)) # evaluate "f(10)" asynchronously in a single process
        print(result.get(timeout=1)) # prints "100" unless your computer is *very* slow
Are four processes doing 10x10 at once and it waits until all four come back or does just the one do this because we only gave the pool one argument?
If the former: Wouldn't that be slower than just having one do it in the first place? What about memory? Do we hold process 1's result until process 4 returns in RAM or does it get printed?
print(pool.map(f, range(10))) # prints "[0, 1, 4,..., 81]"
it = pool.imap(f, range(10))
print(next(it)) # prints "0"
print(next(it)) # prints "1"
print(it.next(timeout=1)) # prints "4" unless your computer is *very* slow
result = pool.apply_async(time.sleep, (10,))
print(result.get(timeout=1)) # raises multiprocessing.TimeoutError

Dictionary multiprocessing

I want to parallelize the processing of a dictionary using the multiprocessing library.
My problem can be reduced to this code:
from multiprocessing import Manager,Pool
def modify_dictionary(dictionary):
    if((3,3) not in dictionary):
        dictionary[(3,3)]=0.
    for i in range(100):
        dictionary[(3,3)] = dictionary[(3,3)]+1
    return 0
if __name__ == "__main__":
    manager = Manager()
    dictionary = manager.dict(lock=True)
    jobargs = [(dictionary) for i in range(5)]
    p = Pool(5)
    t = p.map(modify_dictionary,jobargs)
    p.close()
    p.join()
    print(dictionary[(3,3)])
I create a pool of 5 workers, and each worker should increment dictionary[(3,3)] 100 times. So, if the locking process works correctly, I expect dictionary[(3,3)] to be 500 at the end of the script.
However, something in my code must be wrong, because this is not what I get: the locking does not seem to be "activated", and dictionary[(3,3)] always has a value below 500 at the end of the script.
Could you help me?
The problem is with this line:
dictionary[(3,3)] = dictionary[(3,3)]+1
Three things happen on that line:
1. Read the value of the dictionary key (3,3)
2. Increment the value by 1
3. Write the value back again
But the increment part is happening outside of any locking.
The whole sequence must be atomic, and must be synchronized across all processes. Otherwise the processes will interleave giving you a lower than expected total.
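The interleaving can be replayed by hand in a single process. The sketch below uses a plain dict and spells out the read and write steps of two imaginary workers A and B; it is an illustration of the lost update, not code from the question.

```python
def interleaved_increment():
    """Replay two workers whose read and write steps overlap."""
    shared = {(3, 3): 0}

    # Both workers read before either one writes.
    seen_by_a = shared[(3, 3)]  # A reads 0
    seen_by_b = shared[(3, 3)]  # B reads 0

    shared[(3, 3)] = seen_by_a + 1  # A writes 1
    shared[(3, 3)] = seen_by_b + 1  # B also writes 1, overwriting A's update

    return shared[(3, 3)]


print(interleaved_increment())  # 1, although two increments ran
```

With real processes the order of these steps is not fixed, which is why the final total varies from run to run.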
Holding a lock while incrementing the value ensures that you get the total of 500 you expect:
from multiprocessing import Manager,Pool,Lock
lock = Lock()
def modify_array(dictionary):
    if((3,3) not in dictionary):
        dictionary[(3,3)]=0.
    for i in range(100):
        with lock:
            dictionary[(3,3)] = dictionary[(3,3)]+1
    return 0
if __name__ == "__main__":
    manager = Manager()
    dictionary = manager.dict(lock=True)
    jobargs = [(dictionary) for i in range(5)]
    p = Pool(5)
    t = p.map(modify_array,jobargs)
    p.close()
    p.join()
    print(dictionary[(3,3)])
I have often found the correct solution to a programming difficulty here, so I would like to contribute a little. The code above still does not update the dictionary correctly. To get the right result you have to pass the lock and the correct jobargs to f. In the code above you make a new dictionary in every process. The code I found to work fine:
from multiprocessing import Process, Manager, Pool, Lock
from functools import partial
def f(dictionary, l, k):
    with l:
        for i in range(100):
            dictionary[3] += 1
if __name__ == "__main__":
    manager = Manager()
    dictionary = manager.dict()
    lock = manager.Lock()
    dictionary[3] = 0
    jobargs = list(range(5))
    pool = Pool()
    func = partial(f, dictionary, lock)
    t = pool.map(func, jobargs)
    pool.close()
    pool.join()
    print(dictionary)
The code above holds the lock for the entire loop. In general, you should hold a lock for the shortest time that is still effective. The following code is much more efficient: it acquires the lock only around the statement that has to be atomic.
def f(dictionary, l, k):
    for i in range(100):
        with l:
            dictionary[3] += 1
Note that dictionary[3] += 1 is not atomic, so it must be locked.

Regarding relation of multiprocessing with cores of server

For the following code, assume I have a 32-core machine: will Python decide how many processes to create for me?
from multiprocessing import Process
for i in range(100):
    p = Process(target=run, args=(fileToAnalyse,))
    p.start()
No, it does not decide for you.
To limit the number of subprocesses, you need to use a pool of workers.
Example from the documentation:
from multiprocessing import Pool
def f(x):
    return x*x
if __name__ == '__main__':
    pool = Pool(processes=4) # start 4 worker processes
    result = pool.apply_async(f, [10]) # evaluate "f(10)" asynchronously
    print(result.get(timeout=1)) # prints "100" unless your computer is *very* slow
    print(pool.map(f, range(10))) # prints "[0, 1, 4,..., 81]"
If you omit processes=4, the pool size defaults to the number of CPUs, as reported by multiprocessing.cpu_count().
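Applied to the question's loop of 100 tasks, that could look like the sketch below. The function analyse is a placeholder standing in for the question's run, and the integer task ids stand in for the files to analyse; both are assumptions for illustration.

```python
import multiprocessing


def analyse(task_id):
    """Placeholder for the real per-file work."""
    return task_id * 2


if __name__ == '__main__':
    print('CPUs reported:', multiprocessing.cpu_count())
    # Pool() with no argument sizes itself from the CPU count, so the
    # 100 tasks share a fixed set of workers instead of 100 processes.
    with multiprocessing.Pool() as pool:
        results = pool.map(analyse, range(100))
    print(len(results))  # 100
```

Passing an explicit number, for example Pool(8), caps the worker count below the CPU count if the machine is shared with other work.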
