Python multiprocessing processes terminating? - python

I've read a number of answers here on Stackoverflow about Python multiprocessing, and I think this one is the most useful for my purposes: python multiprocessing queue implementation.
Here is what I'd like to do: poll the database for new work, put it in the queue and have 4 processes continuously do the work. What I'm unclear on is what happens when an item in the queue is done being processed. In the question above, the process terminates when the queue is empty. However, in my case, I'd just like to keep waiting until there is data in the queue. So do I just sleep and periodically check the queue? So my worker processes will never die? Is that good practice?
def mp_worker(queue):
    while True:
        if (queue.qsize() == 0):
            time.sleep(20)
        else:
            db_record = queue.get()
            process_file(db_record)

def mp_handler():
    num_workers = 4
    processes = [Process(target=mp_worker, args=(queue,)) for _ in range(num_workers)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()

if __name__ == '__main__':
    db_conn = db.create_postgre_connection(DB_CONFIG)
    while True:
        db_records = db.retrieve_received_files(DB_CONN)
        if (len(db_records) > 0):
            for db_record in db_records:
                queue.put(db_record)
            mp_handler()
        else:
            time.sleep(20)
    db_conn.close()
Does it make sense?
Thanks.

Figured it out. Workers have to die, since otherwise they never return. But I start a new set of workers when there is data anyway, so that's not a problem. Updated code:
def mp_worker(queue):
    while queue.qsize() > 0 :
        db_record = queue.get()
        process_file(db_record)

def mp_handler():
    num_workers = 4
    if (queue.qsize() < num_workers):
        num_workers = queue.qsize()
    processes = [Process(target=mp_worker, args=(queue,)) for _ in range(num_workers)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()

if __name__ == '__main__':
    while True:
        db_records = db.retrieve_received_files(DB_CONN)
        print(db_records)
        if (len(db_records) > 0):
            for db_record in db_records:
                queue.put(db_record)
            mp_handler()
        else:
            time.sleep(20)
    DB_CONN.close()
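One possible alternative to sleeping and re-checking qsize(): Queue.get() blocks without busy-waiting until an item arrives, so the workers can stay alive between batches and be shut down explicitly with a sentinel value. The sketch below is only an illustration under assumptions: process_file here is a stand-in for the real work, and SENTINEL and the results queue are hypothetical names that do not appear in the code above.

```python
import multiprocessing

SENTINEL = None  # hypothetical shutdown marker; any value that is never real work


def process_file(db_record):
    """Stand-in for the real per-record work (assumption)."""
    return str(db_record).upper()


def mp_worker(queue, results):
    """Block on queue.get() until work arrives; exit on the sentinel."""
    while True:
        db_record = queue.get()  # blocks without polling or sleeping
        if db_record is SENTINEL:
            break
        results.put(process_file(db_record))


if __name__ == "__main__":
    queue = multiprocessing.Queue()
    results = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=mp_worker, args=(queue, results))
               for _ in range(4)]
    for w in workers:
        w.start()

    records = ["a.csv", "b.csv", "c.csv"]  # would come from the database poll
    for record in records:
        queue.put(record)
    # Collect one result per record before asking the workers to stop.
    done = sorted(results.get() for _ in records)
    print(done)

    for _ in workers:
        queue.put(SENTINEL)  # one sentinel per worker
    for w in workers:
        w.join()
```

With this shape the main loop can keep polling the database and putting records on the queue; the workers are only joined once the sentinels have been sent.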

Related

Stop a process when error occur in multiprocessing

I have created 3 processes in Python. The code is attached below.
Now I want to stop the running p2 and p3 processes when an error occurs in the p1 process. My idea is to call p2.terminate(), but I don't know where to add it in this case. Thanks in advance.
import multiprocessing

def table(a):
    try:
        for i in range(100):
            print(i,'x',a,'=',a*i)
    except:
        print("error")

processes = []
p1= multiprocessing.Process(target = table,args=['s'])
p2= multiprocessing.Process(target = table,args=[5])
p3= multiprocessing.Process(target = table,args=[2])
p1.start()
p2.start()
p3.start()
processes.append(p1)
processes.append(p2)
processes.append(p3)
for process in processes:
    process.join()
To stop the remaining processes once one of them terminates due to an error, first set up your target table() to exit with a nonzero exit code on failure:
import sys

def table(a):
    try:
        for i in range(100):
            print(i,'x', a ,'=', a*i)
    except:
        sys.exit(1)
    sys.exit(0)
Then you can start your processes and poll the processes to see if any one has terminated.
#!/usr/bin/env python3
# coding: utf-8
import multiprocessing
import time
import logging
import sys

logging.basicConfig(level=logging.INFO, format='[%(asctime)-15s] [%(processName)-10s] %(message)s', datefmt='%Y-%m-%d %H:%M:%S')

def table(args):
    try:
        for i in range(5):
            logging.info('{} x {} = {}'.format(i, args, i*args))
            if isinstance(args, str):
                raise ValueError()
            time.sleep(5)
    except Exception:
        logging.error('Done in Error Path: {}'.format(args))
        sys.exit(1)
    logging.info('Done in Success Path: {}'.format(args))
    sys.exit(0)

if __name__ == '__main__':
    p1 = multiprocessing.Process(target=table, args=('s',))
    p2 = multiprocessing.Process(target=table, args=(5,))
    p3 = multiprocessing.Process(target=table, args=(2,))
    processes = [p1, p2, p3]
    for process in processes:
        process.start()
    while True:
        failed = []
        completed = []
        for process in processes:
            if process.exitcode is None:
                continue  # still running
            if process.exitcode != 0:
                failed.append(process)
            else:
                completed.append(process)  # needed so the loop ends when all succeed
        if failed:
            for process in processes:
                if process not in failed:
                    logging.info('Terminating Process: {}'.format(process))
                    process.terminate()
            break
        if len(completed) == len(processes):
            break
        time.sleep(1)
Essentially, you are using terminate() to stop the remaining processes that are still running.
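The polling above relies on how Process.exitcode behaves: it is None until the child has terminated, 0 on success, positive when the child exits with an error status such as sys.exit(1), and negative (-N) when the child was terminated by signal N. A small sketch of that mapping follows; classify_exitcode and fail are hypothetical helper names used only for illustration.

```python
import multiprocessing
import sys


def classify_exitcode(exitcode):
    """Map a multiprocessing.Process.exitcode value to a coarse state."""
    if exitcode is None:
        return "not finished"  # not started yet, or still running
    if exitcode == 0:
        return "succeeded"
    if exitcode < 0:
        return "killed"        # terminated by signal number -exitcode
    return "failed"            # e.g. sys.exit(1) in the child


def fail():
    """Hypothetical target that always exits with an error status."""
    sys.exit(1)


if __name__ == "__main__":
    p = multiprocessing.Process(target=fail)
    print(classify_exitcode(p.exitcode))  # checked before the process is started
    p.start()
    p.join()
    print(classify_exitcode(p.exitcode))  # checked after the child has exited
```

Checking for `exitcode == 1` alone, as some snippets do, would miss children that die from a signal or exit with a different nonzero status.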
To stop all processes when one of them has hit an error, I use this code block:
processes = []
for j in range(0, n_core):
    p = multiprocessing.Process(target=table, args=('some input',))
    processes.append(p)
    time.sleep(0.1)
    p.start()
flag = True
while flag:
    flag = False
    for p in processes:
        if p.exitcode == 1:
            for z in processes:
                z.kill()
            sys.exit(1)
        elif p.is_alive():
            flag = True
for p in processes:
    p.join()
First, I have modified the function table to raise an uncaught exception when the argument passed to it is 's', and otherwise to delay 0.1 seconds before printing. The delay gives the main process a chance to notice that the sub-process threw an exception and to cancel the other processes before they have started printing; otherwise, the other processes will have completed before you can cancel them. Here I am using a process pool, which supports a terminate method that conveniently terminates all submitted, uncompleted tasks without having to cancel each one individually (although that is also an option).
The code creates a multiprocessing pool of size 3, since that is the number of "tasks" being submitted, and then uses the method apply_async to submit the 3 tasks to run in parallel (assuming you have at least 3 processors). apply_async returns an AsyncResult instance whose get method can be called to wait for the completion of the submitted task and to get the return value from the worker function table. That return value is None for the second and third tasks and of no interest. If the worker function had an uncaught exception, get will throw that exception instead, which is the case with the first task submitted:
import multiprocessing
import time

def table(a):
    if a == 's':
        raise Exception('I am "s"')
    time.sleep(.1)
    for i in range(100):
        print(i,'x',a,'=',a*i)

# required for Windows:
if __name__ == '__main__':
    pool = multiprocessing.Pool(3) # create a pool of 3 processes
    result1 = pool.apply_async(table, args=('s',))
    result2 = pool.apply_async(table, args=(5,))
    result3 = pool.apply_async(table, args=(2,))
    try:
        result1.get() # wait for completion of first task
    except Exception as e:
        print(e)
        pool.terminate() # kill all processes in the pool
    else:
        # wait for all submitted tasks to complete:
        pool.close()
        pool.join()
        """
        # or alternatively:
        result2.get() # wait for second task to finish
        result3.get() # wait for third task to finish
        """
Prints:
I am "s"

100 percent load with multiprocessing queues

This only replicates my problem: the main Python script runs at 100% CPU load when its control loop polls a shared queue.
import multiprocessing
import random

def func1(num, q):
    while True:
        num = random.randint(1, 101)
        if q.empty():
            q.put(num)

def func2(num, q):
    while True:
        num = q.get()
        num = num ** 2
        if q.empty():
            q.put(num)

num = 2
q = multiprocessing.Queue()
p1 = multiprocessing.Process(target=func1, args=(num, q))
p2 = multiprocessing.Process(target=func2, args=(num, q))
p1.daemon = True
p2.daemon = True
p1.start()
p2.start()
running = True
while running:
    if not q.empty():
        num = q.get(True, 0.1)
        print(num)
Would there be a better method to control multiple worker processes from a script? Better in the sense of no load!?
I'm not sure I understand your program:
What's with the num parameter of func1() and func2()? It never gets used.
func2 will discard its result if func1 happens to have posted another number after func2 got the last number out of the queue.
Why do you daemonize the workers? Are you quite sure this is what you want?
The if not q.empty(): q.get() construct in the main code will sooner or later raise a queue.Empty exception because it's a race between it and the q.get() in func2.
The uncaught queue.Empty exception will terminate the main process, leaving the two workers orphaned - and running.
General advice:
Use different queues for issuing jobs (request queue) and collecting results (response queue). Include the request in the response if necessary.
Think about how to terminate the workers. Consider a "poison pill", i.e. a value in the request queue that causes workers to die, i.e. exit/terminate.
Be really really sure you understand the race conditions in your code, like the one I mentioned above (empty vs. get).
Here's some sample code I hacked up:
import multiprocessing
import time
import random
import os

def request_generator(requests):
    while True:
        requests.put(random.randint(1, 101))
        time.sleep(0.01)

def worker(requests, responses):
    worker_id = os.getpid()
    while True:
        request = requests.get()
        response = request ** 2
        responses.put((request, response, worker_id))

def main():
    requests = multiprocessing.Queue()
    responses = multiprocessing.Queue()
    gen = multiprocessing.Process(target=request_generator, args=(requests,))
    w1 = multiprocessing.Process(target=worker, args=(requests, responses))
    w2 = multiprocessing.Process(target=worker, args=(requests, responses))
    gen.start()
    w1.start()
    w2.start()
    while True:
        req, resp, worker_id = responses.get()
        print("worker {}: {} => {}".format(worker_id, req, resp))

if __name__ == "__main__":
    main()
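The empty()/get() race mentioned above can be avoided by letting get() itself report an empty queue: a blocking get(timeout=...) sleeps while waiting (so there is no busy loop and no 100% load) and raises queue.Empty when nothing arrives in time. The following is a minimal sketch; drain is a hypothetical helper name, and the 1-second timeout is an arbitrary choice. Note that a timeout only shows that nothing arrived for that long, not that the producers are finished.

```python
import multiprocessing
import queue  # provides the Empty exception raised by Queue.get(timeout=...)


def drain(q, timeout=1.0):
    """Collect items until none arrives within `timeout` seconds."""
    items = []
    while True:
        try:
            items.append(q.get(timeout=timeout))  # blocks; no empty() check needed
        except queue.Empty:
            return items


if __name__ == "__main__":
    q = multiprocessing.Queue()
    for n in (1, 2, 3):
        q.put(n)
    print(drain(q))
```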

Python multiprocessing with Queue (split loads dynamically)

I am trying to use multiprocessing to process a very large number of files.
I tried to put the list of files into a queue and have 3 workers split the load through a common Queue. However, this does not seem to work. I am probably misunderstanding how the queue in the multiprocessing package works.
Below is the example source code:
import multiprocessing
from multiprocessing import Queue

def worker(i, qu):
    """worker function"""
    while ~qu.empty():
        val=qu.get()
        print 'Worker:',i, ' start with file:',val
        j=1
        for k in range(i*10000,(i+1)*10000): # some time consuming process
            for j in range(i*10000,(i+1)*10000):
                j=j+k
        print 'Worker:',i, ' end with file:',val

if __name__ == '__main__':
    jobs = []
    qu=Queue()
    for j in range(100,110): # files numbers are from 100 to 110
        qu.put(j)
    for i in range(3): # 3 multiprocess
        p = multiprocessing.Process(target=worker, args=(i,qu))
        jobs.append(p)
        p.start()
    p.join()
Thanks for the comments.
I have since learned that using Pool is the best solution.
import multiprocessing
import time

def worker(val):
    """worker function"""
    print 'Worker: start with file:',val
    time.sleep(1.1)
    print 'Worker: end with file:',val

if __name__ == '__main__':
    file_list=range(100,110)
    p = multiprocessing.Pool(2)
    p.map(worker, file_list)
Three issues:
1) You are joining only on the 3rd process.
2) Why not use multiprocessing.Pool?
3) There is a race condition on qu.get().
1 & 3)
import multiprocessing
from multiprocessing import Queue
from Queue import Empty # Python 2 module name; on Python 3 use: from queue import Empty

def worker(i, qu):
    """worker function"""
    while 1:
        try:
            val=qu.get(timeout=1) # 1 second is an arbitrary choice
        except Empty:
            break # Yay no race condition
        print 'Worker:',i, ' start with file:',val
        j=1
        for k in range(i*10000,(i+1)*10000): # some time consuming process
            for j in range(i*10000,(i+1)*10000):
                j=j+k
        print 'Worker:',i, ' end with file:',val

if __name__ == '__main__':
    jobs = []
    qu=Queue()
    for j in range(100,110): # files numbers are from 100 to 110
        qu.put(j)
    for i in range(3): # 3 multiprocess
        p = multiprocessing.Process(target=worker, args=(i,qu))
        jobs.append(p)
        p.start()
    for p in jobs: #<--- join on all processes ...
        p.join()
2)
for how to use the Pool, see:
https://docs.python.org/2/library/multiprocessing.html
You are joining only the last of your created processes. That means if the first or the second process is still working when the third has finished, your main process continues past the join while they are still running.
You should join them all in order to wait until they are finished:
for p in jobs:
    p.join()
Another thing is you should consider using qu.get_nowait() in order to get rid of the race condition between qu.empty() and qu.get().
For example:
try:
    while 1:
        message = self.queue.get_nowait()
        """ do something fancy here """
except Queue.Empty: # Queue is the Python 2 standard-library module here (import Queue); on Python 3 it is queue.Empty
    pass
I hope that helps
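For readers on Python 3, the Pool approach can also hand out work dynamically. The sketch below is one way to do it, not the original poster's code: process_file is a hypothetical stand-in for the real per-file work, and the pool size of 3 mirrors the three workers in the question. imap_unordered with chunksize=1 gives the next file to whichever worker becomes free and yields results as they complete.

```python
import multiprocessing
import time


def process_file(file_number):
    """Stand-in for the real per-file work (assumption)."""
    time.sleep(0.01)
    return file_number, file_number * 2


if __name__ == "__main__":
    file_list = range(100, 110)
    # The context manager terminates the pool on exit, so consume results inside it.
    with multiprocessing.Pool(3) as pool:
        # chunksize=1 hands out one file at a time to whichever worker is free.
        for file_number, result in pool.imap_unordered(process_file, file_list, chunksize=1):
            print(file_number, result)
```

Results arrive in completion order rather than input order, which is why each result carries its file number.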

Python multiprocessing module: join processes with timeout

I'm doing an optimization of parameters of a complex simulation. I'm using the multiprocessing module for enhancing the performance of the optimization algorithm. The basics of multiprocessing I learned at http://pymotw.com/2/multiprocessing/basics.html.
The complex simulation takes a different amount of time depending on the parameters given by the optimization algorithm, around 1 to 5 minutes. If the parameters are chosen very badly, the simulation can last 30 minutes or more and the results are not useful. So I was thinking about building a timeout into the multiprocessing code that terminates all simulations lasting longer than a defined time. Here is an abstracted version of the problem:
import numpy as np
import time
import multiprocessing

def worker(num):
    time.sleep(np.random.random()*20)

def main():
    pnum = 10
    procs = []
    for i in range(pnum):
        p = multiprocessing.Process(target=worker, args=(i,), name = ('process_' + str(i+1)))
        procs.append(p)
        p.start()
        print('starting', p.name)
    for p in procs:
        p.join(5)
        print('stopping', p.name)

if __name__ == "__main__":
    main()
The line p.join(5) defines the timeout of 5 seconds. Because of the for-loop for p in procs:, the program waits up to 5 seconds for the first process to finish, then again up to 5 seconds for the second process, and so on. But I want the program to terminate all processes that last more than 5 seconds. Additionally, if none of the processes lasts longer than 5 seconds, the program must not wait the full 5 seconds.
You can do this by creating a loop that will wait for some timeout amount of seconds, frequently checking to see if all processes are finished. If they don't all finish in the allotted amount of time, then terminate all of the processes:
TIMEOUT = 5
start = time.time()
while time.time() - start <= TIMEOUT:
    if not any(p.is_alive() for p in procs):
        # All the processes are done, break now.
        break
    time.sleep(.1) # Just to avoid hogging the CPU
else:
    # We only enter this if we didn't 'break' above.
    print("timed out, killing all processes")
    for p in procs:
        p.terminate()
        p.join()
If you want to kill all the processes, you could use the Pool from multiprocessing.
You'll need to define a general timeout for the whole execution, as opposed to individual timeouts.
import numpy as np
import time
from multiprocessing import Pool

def worker(num):
    xtime = np.random.random()*20
    time.sleep(xtime)
    return xtime

def main():
    pnum = 10
    pool = Pool()
    args = range(pnum)
    pool_result = pool.map_async(worker, args)
    # wait up to 5 minutes for all workers to finish
    pool_result.wait(timeout=300)
    # once the wait has returned we can try to get the results
    if pool_result.ready():
        print(pool_result.get(timeout=1))
    else:
        # timed out: stop the workers that are still running
        pool.terminate()

if __name__ == "__main__":
    main()
This will get you a list with the return values for all your workers in order.
More information here:
https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool
Thanks to the help of dano I found a solution:
import numpy as np
import time
import multiprocessing

def worker(num):
    time.sleep(np.random.random()*20)

def main():
    pnum = 10
    TIMEOUT = 5
    procs = []
    bool_list = [True]*pnum
    for i in range(pnum):
        p = multiprocessing.Process(target=worker, args=(i,), name = ('process_' + str(i+1)))
        procs.append(p)
        p.start()
        print('starting', p.name)
    start = time.time()
    while time.time() - start <= TIMEOUT:
        for i in range(pnum):
            bool_list[i] = procs[i].is_alive()
        print(bool_list)
        if np.any(bool_list):
            time.sleep(.1)
        else:
            break
    else:
        print("timed out, killing all processes")
        for p in procs:
            p.terminate()
    for p in procs:
        print('stopping', p.name,'=', p.is_alive())
        p.join()

if __name__ == "__main__":
    main()
It's not the most elegant way; I'm sure there is a better way than using bool_list. Processes that are still alive after the timeout of 5 seconds will be killed. If you set shorter times in the worker function than the timeout, you will see that the program stops before the timeout of 5 seconds is reached. I'm still open to more elegant solutions if there are any :)
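One possibly simpler variant, offered as a sketch rather than a definitive answer: join every process against a single shared deadline, so the total wait never exceeds the timeout and the loop ends as soon as all processes have finished. join_with_deadline is a hypothetical helper name, and the sleep durations in the demo are arbitrary.

```python
import multiprocessing
import time


def join_with_deadline(procs, timeout):
    """Join every process against one shared deadline; return the stragglers."""
    deadline = time.monotonic() + timeout
    for p in procs:
        remaining = deadline - time.monotonic()
        p.join(max(0.0, remaining))  # returns early if p has already finished
    return [p for p in procs if p.is_alive()]


def worker(seconds):
    time.sleep(seconds)


if __name__ == "__main__":
    procs = [multiprocessing.Process(target=worker, args=(s,)) for s in (0.1, 0.2, 30)]
    for p in procs:
        p.start()
    # Only processes still alive after the shared 1-second deadline are terminated.
    for p in join_with_deadline(procs, timeout=1):
        print("terminating", p.name)
        p.terminate()
        p.join()
```

Because each join() receives only the time remaining until the deadline, no polling loop or bool_list is needed.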

Python Multiprocessing Pipe "Deadlock"

I'm facing problems with the following example code:
from multiprocessing import Lock, Process, Queue, current_process

def worker(work_queue, done_queue):
    for item in iter(work_queue.get, 'STOP'):
        print("adding ", item, "to done queue")
        #this works: done_queue.put(item*10)
        done_queue.put(item*1000) #this doesnt!
    return True

def main():
    workers = 4
    work_queue = Queue()
    done_queue = Queue()
    processes = []
    for x in range(10):
        work_queue.put("hi"+str(x))
    for w in range(workers):
        p = Process(target=worker, args=(work_queue, done_queue))
        p.start()
        processes.append(p)
        work_queue.put('STOP')
    for p in processes:
        p.join()
    done_queue.put('STOP')
    for item in iter(done_queue.get, 'STOP'):
        print(item)

if __name__ == '__main__':
    main()
When the done queue becomes big enough (a limit of about 64k, I think), the whole thing freezes without any further notice.
What is the general approach for such a situation when the queue becomes too big? Is there some way to remove elements on the fly once they are processed? The Python docs recommend removing the p.join(); in a real application, however, I cannot estimate when the processes have finished. Is there a simple solution for this problem besides infinite looping and using .get_nowait()?
This works for me with 3.4.0alpha4, 3.3, 3.2, 3.1 and 2.6. It tracebacks with 2.7 and 3.0. I pylint'd it, BTW.
#!/usr/local/cpython-3.3/bin/python
'''SSCCE for a queue deadlock'''
import sys
import multiprocessing

def worker(workerno, work_queue, done_queue):
    '''Worker function'''
    #reps = 10 # this worked for the OP
    #reps = 1000 # this worked for me
    reps = 10000 # this didn't
    for item in iter(work_queue.get, 'STOP'):
        print("adding", item, "to done queue")
        #this works: done_queue.put(item*10)
        for thing in item * reps:
            #print('workerno: {}, adding thing {}'.format(workerno, thing))
            done_queue.put(thing)
    done_queue.put('STOP')
    print('workerno: {0}, exited loop'.format(workerno))
    return True

def main():
    '''main function'''
    workers = 4
    work_queue = multiprocessing.Queue(maxsize=0)
    done_queue = multiprocessing.Queue(maxsize=0)
    processes = []
    for integer in range(10):
        work_queue.put("hi"+str(integer))
    for workerno in range(workers):
        dummy = workerno
        process = multiprocessing.Process(target=worker, args=(workerno, work_queue, done_queue))
        process.start()
        processes.append(process)
        work_queue.put('STOP')
    itemno = 0
    stops = 0
    while True:
        item = done_queue.get()
        itemno += 1
        sys.stdout.write('itemno {0}\r'.format(itemno))
        if item == 'STOP':
            stops += 1
            if stops == workers:
                break
    print('exited done_queue empty loop')
    for workerno, process in enumerate(processes):
        print('attempting process.join() of workerno {0}'.format(workerno))
        process.join()
    done_queue.put('STOP')

if __name__ == '__main__':
    main()
HTH
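Why the ordering matters, as far as the multiprocessing documentation describes it: a process that has put items on a queue waits before exiting until its buffered items have been flushed to the underlying pipe, so calling join() before the parent has read the results can block forever once the output outgrows the pipe buffer. A condensed sketch of the drain-then-join idea follows; STOP and collect are illustrative names, and the worker body is a simplified stand-in.

```python
import multiprocessing

STOP = "STOP"


def worker(work_queue, done_queue):
    for item in iter(work_queue.get, STOP):
        done_queue.put(item * 1000)
    done_queue.put(STOP)  # tell the parent this worker is finished


def collect(done_queue, n_workers):
    """Read results until every worker has sent its STOP marker."""
    results, stops = [], 0
    while stops < n_workers:
        item = done_queue.get()
        if item == STOP:
            stops += 1
        else:
            results.append(item)
    return results


if __name__ == "__main__":
    n_workers = 4
    work_queue, done_queue = multiprocessing.Queue(), multiprocessing.Queue()
    for x in range(10):
        work_queue.put("hi" + str(x))
    processes = [multiprocessing.Process(target=worker, args=(work_queue, done_queue))
                 for _ in range(n_workers)]
    for p in processes:
        p.start()
    for _ in processes:
        work_queue.put(STOP)  # one STOP per worker
    results = collect(done_queue, n_workers)  # drain first ...
    for p in processes:
        p.join()                              # ... then join
    print(len(results))
```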
