Consider the case where each process needs to do something with an array.
It seems Pool.apply is the right choice for this job.
import os
from multiprocessing import Pool
from time import sleep
import numpy as np

def sumj(i, arr):
    print(i, os.getpid())
    sleep(0.5)
    return np.sum(arr)

if __name__ == "__main__":
    mat = np.ones((40, 10))
    pool = Pool(processes=10)
    results = [pool.apply(sumj, args=(i, mat[i,:])) for i in range(40)]
0 1220757
1 1220758
2 1220759
3 1220760
4 1220761
5 1220762
6 1220763
Why does this run serially? The PID changes, but I only get one printed row every 0.5 seconds.
From the multiprocessing documentation for Pool.apply:
apply(func[, args[, kwds]]) Call func with arguments args and keyword
arguments kwds. It blocks until the result is ready. Given this
blocks, apply_async() is better suited for performing work in
parallel. Additionally, func is only executed in one of the workers of
the pool.
pool = Pool(processes=10)
results = [pool.apply_async(sumj, args=(i, mat[i,:])) for i in range(40)]
print([i.get() for i in results])
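To see the difference in timing, here is a minimal sketch that uses only the standard library. A plain list stands in for the NumPy rows, and `row_sum` and `timed` are hypothetical helper names, not part of the question. With four workers and eight tasks of 0.2 s each, the blocking version should take roughly eight task durations and the asynchronous one roughly two, though the exact numbers depend on the machine.

```python
import time
from multiprocessing import Pool


def row_sum(i, row):
    """Simulate slow work on one row, then return its sum."""
    time.sleep(0.2)
    return sum(row)


def timed(label, fn):
    """Run fn(), print the elapsed wall-clock time, and return its result."""
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.2f}s")
    return result


if __name__ == "__main__":
    rows = [[1.0] * 10 for _ in range(8)]
    with Pool(processes=4) as pool:
        # apply() blocks on every call, so the tasks run one after another.
        serial = timed(
            "apply",
            lambda: [pool.apply(row_sum, (i, r)) for i, r in enumerate(rows)],
        )
        # apply_async() returns immediately; get() collects the results later.
        parallel = timed(
            "apply_async",
            lambda: [
                res.get()
                for res in [
                    pool.apply_async(row_sum, (i, r)) for i, r in enumerate(rows)
                ]
            ],
        )
    # Both strategies compute the same values; only the scheduling differs.
    assert serial == parallel
```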
I want to write a python code that does the following:
At first, it starts, say, 3 processes (or threads, or whatever) in parallel.
Then in a loop, python waits until any of the processes have finished (and returned some value)
Then, the python code starts a new function
In the end, I want 3 processes always running in parallel, until all functions I need to run are run. Here is some pseudocode:
import time
import random
from multiprocessing import Process

# some random function which can have different execution time
def foo():
    time.sleep(random.randint(1, 10) + 2)
    return 42

# Start 3 functions
p = []
p.append(Process(target=foo))
p.append(Process(target=foo))
p.append(Process(target=foo))
while(True):
    # wait until one of the processes has finished
    ???
    # then add a new process so that always 3 are running in parallel
    p.append(Process(target=foo))
I am pretty sure it is not clear what I want. Please ask.
What you really want is to start three processes and feed a queue with jobs that you want executed. Then there will only ever be three processes and when one is finished, it reads the next item from the queue and executes that:
import time
import random
from multiprocessing import Process, Queue

# some random function which can have different execution time
def foo(a):
    print('foo', a)
    time.sleep(random.randint(1, 10) + 2)
    print(a)
    return 42

def readQueue(q):
    while True:
        item = q.get()
        if item:
            f,*args = item
            f(*args)
        else:
            return

if __name__ == '__main__':
    q = Queue()
    for a in range(4): # create 4 jobs
        q.put((foo, a))
    for _ in range(3): # sentinel for 3 processes
        q.put(None)
    # Start 3 processes
    p = []
    p.append(Process(target=readQueue, args=(q,)))
    p.append(Process(target=readQueue, args=(q,)))
    p.append(Process(target=readQueue, args=(q,)))
    for j in p:
        j.start()
    #time.sleep(10)
    for j in p:
        j.join()
You can use the Pool class of the multiprocessing module.
from multiprocessing import Pool

my_foos = [foo, foo, foo, foo]

def do_something(method):
    return method()

if __name__ == '__main__':
    with Pool(3) as p:
        p.map(do_something, my_foos)
The number 3 is the number of worker processes, so at most three jobs run in parallel.
map passes each element of my_foos as the argument to the function do_something.
In your case do_something can be a function that calls the functions you want to run, which are passed to map as a list.
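If the return values matter (the `42` in the question), `Pool.imap_unordered` hands each result back as soon as its task finishes, while never running more than three tasks at once. A minimal sketch, where `job` is a hypothetical stand-in for `foo`:

```python
import random
import time
from multiprocessing import Pool


def job(n):
    """Stand-in for foo: sleep for a random time, then return a value."""
    time.sleep(random.uniform(0.1, 0.5))
    return n, n * n


if __name__ == "__main__":
    with Pool(3) as pool:
        # Results arrive in completion order, not submission order.
        for n, squared in pool.imap_unordered(job, range(8)):
            print(f"job {n} finished with {squared}")
```

Because the pool owns the three workers, there is no need to track which process finished or to start replacements by hand.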
I am trying to use Manager() to share a dictionary between processes and tried the following code:
from multiprocessing import Manager, Pool

def f(d):
    d['x'] += 2

if __name__ == '__main__':
    manager = Manager()
    d = manager.dict()
    d['x'] = 2
    p = Pool(4)
    for _ in range(2000):
        p.map_async(f, (d,)) #apply_async, map
    p.close()
    p.join()
    print(d) # expects this result --> {'x': 4002}
Using map_async and apply_async, the result printed is always different (e.g. {'x': 3838}, {'x': 3770}).
However, using map will give the expected result.
Also, I have tried using Process instead of Pool, and the results vary there too.
Any insights?
Is it something to do with the non-blocking calls, and does the manager not handle race conditions?
When you call map (rather than map_async), it blocks until the worker processes have finished all the tasks you passed, which in your case is just one call to the function f. So even though you have a pool size of 4, you are in essence running the 2000 calls one at a time. To actually parallelize execution, you should have made a single call, p.map(f, [d]*2000), instead of the loop (which would then expose the same race described next).
But when you call map_async, you do not block; you get back a result object. A call to get on the result object blocks until the task finishes and returns the result of the function call. So now you are running up to 4 tasks at a time. But the update to the dictionary is not serialized across the processes: d['x'] += 2 is a read through the proxy followed by a separate write, so two processes can read the same value and one of the updates is lost. I have modified the code to force serialization of d['x'] += 2 by using a multiprocessing lock. You will see that the result is now 4002.
from multiprocessing import Manager, Pool, Lock

def f(d):
    # Hold the lock for the whole read-modify-write of d['x'].
    with lock:
        d['x'] += 2

def init(l):
    # Runs once in each worker: store the shared lock in a global.
    global lock
    lock = l

if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()
        d['x'] = 2
        lock = Lock() # Create the multiprocessing lock that is sharable by all the processes
        p = Pool(4, initializer=init, initargs=(lock,))
        results = [] # if the function returned a result we wanted
        for _ in range(2000):
            results.append(p.map_async(f, (d,))) #apply_async, map
        """
        for i in range(2000): # if the function returned a result we wanted
            results[i].get() # wait for everything to finish
        """
        p.close()
        p.join()
        print(d)
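When the shared value is only an accumulator, an alternative that sidesteps the lock is to let each task return its contribution and combine the results in the parent. This is a sketch of that different technique, not the approach above; `increment` and `total` are hypothetical names.

```python
from multiprocessing import Pool


def increment(_):
    """Return this task's contribution instead of mutating shared state."""
    return 2


def total(start, contributions):
    """Combine the per-task contributions in a single process."""
    return start + sum(contributions)


if __name__ == "__main__":
    with Pool(4) as pool:
        parts = pool.map(increment, range(2000))
    print({"x": total(2, parts)})  # prints {'x': 4002}
```

Since no state is shared between workers, there is nothing to race on, and the Manager process is not needed at all.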
I want to execute this function without having to rewrite all the code for each process.
def executeNode(node):
node.execution()
This is the code I would rather not repeat n times. I need to use Process, not threads.
a0 = Process(target=executeNode, args=(node1,))
a1 = Process(target=executeNode, args=(node2,))
a2 = Process(target=executeNode, args=(node3,))
...............................
an = Process(target=executeNode, args=(nodeN,))
So I decided to create a list of nodes but I don't know how to execute a process for each item (node) of the list.
sNodes = []
for i in range(0, n):
    # use a different variable name so the node class is not shadowed
    new_node = node("a" + str(i), (4001 + i))
    sNodes.append(new_node)
How can I execute a process for each item (node) of the list (sNodes)?
Thank you all.
You can use a Pool:
from multiprocessing import Pool

if __name__ == '__main__':
    with Pool(n) as p:
        print(p.map(executeNode, sNodes))
Where n is the number of processes you want.
If you want detached processes, or you don't expect a result, it is simpler to use another loop:
processes = []
for node in sNodes:
    p = Process(target=executeNode, args=(node,))
    processes.append(p)
    p.start()
General tip: having many more processes than CPU cores will not speed up your code. The processor spends its time switching between them (and the machine may start swapping), so everything gets slower. Keep this in mind if you are after a speedup rather than a logical architecture.
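One way to act on this tip is to cap the number of workers at the number of CPUs. A minimal sketch, assuming CPU-bound work; `worker_count` and `execute` are hypothetical names, and I/O-bound tasks may well benefit from more workers than cores.

```python
import os
from multiprocessing import Pool


def worker_count(n_tasks):
    """Use at most one worker per CPU, and never more workers than tasks."""
    cpus = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, min(n_tasks, cpus))


def execute(name):
    """Stand-in for executeNode: report which process handled the item."""
    return name, os.getpid()


if __name__ == "__main__":
    names = [f"a{i}" for i in range(20)]
    with Pool(worker_count(len(names))) as pool:
        for name, pid in pool.map(execute, names):
            print(name, pid)
```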
Try something like this:
from multiprocessing import Pool

process_number = 4
nodes = [...]

def execute_node(node):
    print(node)

if __name__ == '__main__':
    pool = Pool(processes=process_number)
    pool.starmap(execute_node, [(node,) for node in nodes])
    pool.close()
    pool.join()
You will find more information here: https://docs.python.org/3/library/multiprocessing.html
The Python docs say that if maxsize is less than or equal to zero, the Queue size is infinite. I've also tried maxsize=-1. However, this doesn't seem to be the case, and the program hangs. As a workaround I created multiple Queues. This is not ideal: I will need to work with even bigger lists, and would then have to create more and more Queue() objects and add code to process their elements.
import multiprocessing
from multiprocessing import Queue

queue = Queue(maxsize=0)
queue2 = Queue(maxsize=0)
queue3 = Queue(maxsize=0)
PROCESS_COUNT = 6

def filter(aBigList):
    list_chunks = list(chunks(aBigList, PROCESS_COUNT))
    pool = multiprocessing.Pool(processes=PROCESS_COUNT)
    for chunk in list_chunks:
        pool.apply_async(func1, (chunk,))
    pool.close()
    pool.join()
    allFiltered = []
    # list of dicts
    while not queue.empty():
        allFiltered.append(queue.get())
    while not queue2.empty():
        allFiltered.append(queue2.get())
    while not queue3.empty():
        allFiltered.append(queue3.get())
    # do work with allFiltered

def func1(subList):
    SUBLIST_SPLIT = 3
    theChunks = list(chunks(subList, SUBLIST_SPLIT))
    for i in theChunks[0]:
        dictQ = updateDict(i)
        queue.put(dictQ)
    for x in theChunks[1]:
        dictQ = updateDict(x)
        queue2.put(dictQ)
    for y in theChunks[2]:
        dictQ = updateDict(y)
        queue3.put(dictQ)
Your issue happens because you do not drain the Queue before the join call.
When you use a multiprocessing.Queue, you should empty it before trying to join the process that feeds it. A process that has put items on a Queue waits, before terminating, until all the buffered items have been flushed to the underlying pipe. I don't know why this happens even for a Queue with a large maxsize, but it is probably because the underlying os.pipe has a limited capacity, so the flush cannot complete until someone reads from the other end.
So putting your get calls before pool.join should solve your problem.
PROCESS_COUNT = 6

def filter(aBigList):
    list_chunks = list(chunks(aBigList, PROCESS_COUNT))
    pool = multiprocessing.Pool(processes=PROCESS_COUNT)
    # A plain multiprocessing.Queue cannot be passed to Pool workers as an
    # argument, so a Manager queue is used here.
    manager = multiprocessing.Manager()
    result_queue = manager.Queue()
    async_result = []
    for chunk in list_chunks:
        async_result.append(pool.apply_async(
            func1, (chunk, result_queue)))
    all_filtered = []
    done = 0
    # Drain the queue until every task has sent its None sentinel.
    while done < len(list_chunks):
        res = result_queue.get()
        if res is None:
            done += 1
        else:
            all_filtered.append(res)
    pool.close()
    pool.join()
    # do work with all_filtered

def func1(sub_list, result_queue):
    # mapping function
    for i in sub_list:
        result_queue.put(updateDict(i))
    result_queue.put(None)  # sentinel: this chunk is finished
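The same ordering rule can be seen with plain `Process` objects. A minimal sketch, assuming each worker ends its output with a `None` sentinel (`producer` and `drain` are hypothetical names): the parent reads until it has seen one sentinel per worker and only then joins.

```python
from multiprocessing import Process, Queue


def producer(worker_id, n_items, out_queue):
    """Put n_items results on the queue, then a None sentinel."""
    for i in range(n_items):
        out_queue.put((worker_id, i))
    out_queue.put(None)


def drain(out_queue, n_workers):
    """Read from the queue until every worker has sent its sentinel."""
    items, finished = [], 0
    while finished < n_workers:
        item = out_queue.get()
        if item is None:
            finished += 1
        else:
            items.append(item)
    return items


if __name__ == "__main__":
    q = Queue()
    workers = [Process(target=producer, args=(w, 10000, q)) for w in range(3)]
    for w in workers:
        w.start()
    results = drain(q, len(workers))  # empty the queue first ...
    for w in workers:
        w.join()  # ... then join; the reverse order can deadlock
    print(len(results))
```

Swapping the `drain` call and the `join` loop is the experiment to try: with enough items per worker, the producers may block while flushing to the pipe and the joins may never return.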
One question is why you need to handle the communication yourself. You could let the Pool manage it for you if you refactor:
PROCESS_COUNT = 6

def filter(aBigList):
    list_chunks = list(chunks(aBigList, PROCESS_COUNT))
    pool = multiprocessing.Pool(processes=PROCESS_COUNT)
    async_result = []
    for chunk in list_chunks:
        async_result.append(pool.apply_async(func1, (chunk,)))
    pool.close()
    pool.join()
    # Reduce the result
    allFiltered = [res.get() for res in async_result]
    # do work with allFiltered

def func1(sub_list):
    # mapping function
    results = []
    for i in sub_list:
        results.append(updateDict(i))
    return results
This avoids this kind of bug.
EDIT
Finally, you can reduce your code even further by using the Pool.map function, which also handles chunking through its chunksize argument.
If your chunks get too big, you might get an error when the results are pickled (as stated in your comment). You can therefore adapt the size of the chunks using map:
PROCESS_COUNT = 6

def filter(aBigList):
    # Pool.map runs updateDict in parallel on chunks of 100 items of
    # aBigList and returns the results.
    # The map function takes care of the chunking, the dispatching, and
    # collecting the items in the right order.
    with multiprocessing.Pool(processes=PROCESS_COUNT) as pool:
        allFiltered = pool.map(updateDict, aBigList, chunksize=100)
    # do work with allFiltered
import multiprocessing as mp

if __name__ == '__main__':
    #pool = mp.Pool(M)
    p1 = mp.Process(target= target1, args= (arg1,))
    p2 = mp.Process(target= target2, args= (arg1,))
    ...
    p9 = mp.Process(target= target9, args= (arg9,))
    p10 = mp.Process(target= target10, args= (arg10,))
    ...
    pN = mp.Process(target= targetN, args= (argN,))
    processList = [p1, p2, .... , p9, p10, ... ,pN]
I have N different target functions, each of which takes a different, non-trivial amount of time to execute.
I am looking for a way to execute them in parallel such that M (1 < M < N) processes run simultaneously. As soon as one process finishes, the next process from the list should start, until all the processes in processList have completed.
As I am not calling the same target function, I could not use Pool.
I considered doing something like this:
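for i in range(0, N, M):
    limit = i + M
    if(limit > N):
        limit = N
    # start one batch of at most M processes ...
    for p in processList[i:limit]:
        p.start()
    # ... and wait for the whole batch before starting the next one
    for p in processList[i:limit]:
        p.join()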
Since my target functions consume unequal time to execute, this method is not really efficient.
Any suggestions? Thanks in advance.
EDIT:
Question title has been changed to 'Execute a list of process without multiprocessing pool map' from 'Execute a list of process without multiprocessing pool'.
You can use a process Pool:
#!/usr/bin/env python
# coding=utf-8
from multiprocessing import Pool
import random
import time

def target_1():
    time.sleep(random.uniform(0.5, 2))
    print('done target 1')

def target_2():
    time.sleep(random.uniform(0.5, 2))
    print('done target 2')

def target_3():
    time.sleep(random.uniform(0.5, 2))
    print('done target 3')

def target_4():
    time.sleep(random.uniform(0.5, 2))
    print('done target 4')

if __name__ == '__main__':
    pool = Pool(2) # maximum two processes at a time.
    pool.apply_async(target_1)
    pool.apply_async(target_2)
    pool.apply_async(target_3)
    pool.apply_async(target_4)
    pool.close()
    pool.join()
Pool is designed for exactly this: executing many tasks in a limited number of processes.
I also suggest you take a look at the concurrent.futures library and its backport to Python 2.7. It has a ProcessPoolExecutor, which has roughly the same capabilities, but its methods return Future objects, which have a nicer API.
Here is a way to do it in Python 3.4, which could be adapted for Python 2.7 :
import concurrent.futures

targets_with_args = [
    (target1, arg1),
    (target2, arg2),
    (target3, arg3),
    ...
]
with concurrent.futures.ProcessPoolExecutor(max_workers=20) as executor:
    futures = [executor.submit(target, arg) for target, arg in targets_with_args]
    results = [future.result() for future in concurrent.futures.as_completed(futures)]
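A runnable version of the same idea, with placeholder targets standing in for the elided ones (`slow_square`, `slow_negate`, and `run_all` are hypothetical). Keeping a dict from each future to an index makes it possible to tell which target produced which result, since `as_completed` yields futures in completion order.

```python
import concurrent.futures
import time


def slow_square(x):
    """Placeholder target: takes a little longer to finish."""
    time.sleep(0.2)
    return x * x


def slow_negate(x):
    """Placeholder target: finishes sooner."""
    time.sleep(0.1)
    return -x


def run_all(targets_with_args, max_workers=2):
    """Run each (target, arg) pair with at most max_workers processes."""
    results = {}
    with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as ex:
        # Map each future back to the index of the pair that created it.
        futures = {
            ex.submit(target, arg): index
            for index, (target, arg) in enumerate(targets_with_args)
        }
        for future in concurrent.futures.as_completed(futures):
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    pairs = [(slow_square, 3), (slow_negate, 4), (slow_square, 5)]
    print(run_all(pairs))
```

Here `max_workers` plays the role of M from the question: a new task starts as soon as one of the M workers becomes free.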
I would use a Queue: add processes to it from processList, and as soon as a process finishes, remove it from the queue and add another one.
The pseudocode looks like this:
from queue import Queue  # Python 2: from Queue import Queue

q = Queue(M)  # at most M processes in flight
# add first process to queue
i = 0
q.put(processList[i])
processList[i].start()
i+=1
while not q.empty():
    p=q.get()
    # check if the process has finished; if not, return it to the queue for later checking
    if p.is_alive():
        q.put(p)
    # add another process if there is space and there are more processes to add
    if not q.full() and i < len(processList):
        q.put(processList[i])
        processList[i].start()
        i+=1
A simple solution would be to wrap the functions target{1,2,...N} into a single function forward_to_target that forwards to the appropriate target{1,2,...N} function according to the argument that is passed in. If you cannot infer the appropriate target function from the arguments you currently use, replace each argument with a tuple (argX, X), then in the forward_to_target function unpack the tuple and forward to the appropriate function indicated by the X.
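The wrapper described above can be sketched as follows. The three `target_*` functions and the `TARGETS` table are hypothetical placeholders for the real target functions; each work item is the tuple `(argX, X)`.

```python
from multiprocessing import Pool


def target_1(arg):
    return f"target_1 got {arg}"


def target_2(arg):
    return f"target_2 got {arg}"


def target_3(arg):
    return f"target_3 got {arg}"


# Index X selects the function; the table lives at module level so that
# every worker process can look it up.
TARGETS = {1: target_1, 2: target_2, 3: target_3}


def forward_to_target(arg_and_index):
    """Unpack (argX, X) and call the matching target function."""
    arg, index = arg_and_index
    return TARGETS[index](arg)


if __name__ == "__main__":
    work = [("a", 1), ("b", 2), ("c", 3), ("d", 1)]
    with Pool(2) as pool:  # M = 2 workers
        print(pool.map(forward_to_target, work))
```

Because only one function is ever passed to the pool, `Pool.map` can be used even though the underlying targets differ.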
You could have two lists of targets and arguments, zip the two together - and send them to a runner function (here it's run_target_on_args):
#!/usr/bin/env python
import multiprocessing as mp

# target functions
targets = [len, str, len, zip]
# arguments for each function
args = [["arg1"], ["arg2"], ["arg3"], [["arg5"], ["arg6"]]]

# applies target function on its arguments
def run_target_on_args(target_args):
    return target_args[0](*target_args[1])

if __name__ == '__main__':
    pool = mp.Pool()
    print(pool.map(run_target_on_args, zip(targets, args)))