How to list Processes started by multiprocessing Pool? - python

While attempting to store multiprocessing's Process instances in the multiprocessing list variable `poolList`, I get the following exception:
SimpleQueue objects should only be shared between processes through inheritance
The reason I would like to store the Process instances in a variable is to be able to terminate all or just some of them later (if, for example, a process freezes). If storing a Process in a variable is not an option, I would like to know how to get or list all the processes started by a multiprocessing Pool. That would be very similar to what the .current_process() method does, except .current_process() returns only a single process, while I need all the processes that were started or are currently running.
Two questions:
Is it even possible to store an instance of a Process (as returned by mp.current_process())?
Currently I am only able to get a single process from inside the function that the process is running (from inside myFunct(), using the .current_process() method).
Instead, I would like to list all the processes currently running under multiprocessing. How can I achieve that?
import multiprocessing as mp
from multiprocessing import Pool

poolList = mp.Manager().list()
poolDict = mp.Manager().dict()   # managed dict the workers write their progress into

def myFunct(arg):
    print 'myFunct(): current process:', mp.current_process()
    try: poolList.append(mp.current_process())
    except Exception, e: print e
    for i in range(110):
        for n in range(500000):
            pass
        poolDict[arg] = i
    print 'myFunct(): completed', arg, poolDict

myArgsList = ['arg1', 'arg2', 'arg3']
pool = Pool(processes=2)
pool.map_async(myFunct, myArgsList)
pool.close()
pool.join()

To list the processes started by a Pool() instance (which is what you mean, if I understand you correctly), there is the pool._pool list, which contains the worker Process instances.
However, it is not part of the documented interface and hence really should not be used.
BUT... it seems a little unlikely that it would change just like that. I mean, would they stop keeping an internal list of processes in the pool? Or stop calling it _pool?
It also bothers me that there isn't at least a get_processes method or something similar.
And handling a breakage caused by a rename should not be that difficult.
But still, use at your own risk:
from multiprocessing import pool

# Have to run in main
if __name__ == '__main__':
    # Create 3 worker processes
    _my_pool = pool.Pool(3)

    # Loop, terminate, and remove from the process list
    # Use a copy [:] of the list to remove items correctly
    for _curr_process in _my_pool._pool[:]:
        print("Terminating process " + str(_curr_process.pid))
        _curr_process.terminate()
        _my_pool._pool.remove(_curr_process)

    # If you call _repopulate, the pool will again contain 3 worker processes.
    _my_pool._repopulate_pool()
    for _curr_process in _my_pool._pool[:]:
        print("After repopulation " + str(_curr_process.pid))
The example creates a pool and manually terminates all processes.
It is important that you remember to delete the processes you terminate from the pool yourself if you want Pool() to continue working as usual.
_my_pool._repopulate_pool() increases the number of worker processes back to 3. This is not needed to answer the question, but gives a little behind-the-scenes insight.

Yes, you can get all active child processes and act on them based on the process name, e.g.
multiprocessing.Process(target=foo, name="refresh-reports")
and then
for p in multiprocessing.active_children():
    if p.name == "refresh-reports":
        p.terminate()
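A self-contained sketch of that approach might look like this (the worker body and its name are just placeholders):
import multiprocessing
import time

def refresh_reports():
    # placeholder worker: pretend to do some long-running work
    time.sleep(30)

if __name__ == '__main__':
    # start a worker with a recognizable name
    p = multiprocessing.Process(target=refresh_reports, name="refresh-reports")
    p.start()

    # later, find and terminate it by name
    for child in multiprocessing.active_children():
        if child.name == "refresh-reports":
            child.terminate()
            child.join()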

You're creating a managed List object, but then letting the associated Manager object expire.
Process objects are not shareable because they aren't picklable; that is, they aren't simple.
Oddly, the multiprocessing module doesn't have the equivalent of threading.enumerate() -- that is, you can't list all outstanding processes. As a workaround, I just store procs in a list. I never terminate() a process, but do sys.exit(0) in the parent. It's rough, because the workers will leave things in an inconsistent state, but it's okay for smaller programs.
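For example, the "store procs in a list" workaround could look roughly like this (the worker is a placeholder):
import multiprocessing
import time

def work(n):
    # placeholder worker
    time.sleep(n)

if __name__ == '__main__':
    procs = []
    for i in range(4):
        p = multiprocessing.Process(target=work, args=(i,))
        p.start()
        procs.append(p)          # keep the Process objects in the parent only

    # the parent can now inspect, terminate or join every process it started
    for p in procs:
        print(p.pid, p.is_alive())
    for p in procs:
        p.join()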
To kill a frozen worker, I suggest: 1) the worker receives "heartbeat" jobs in a queue every now and then, and 2) if the parent notices that worker A hasn't responded to a heartbeat within a certain amount of time, it calls p.terminate(). Consider restating the problem in another SO question, as it's interesting.
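A rough sketch of that heartbeat idea, with made-up names and timeouts; each worker sends periodic heartbeats and the parent terminates any worker that goes silent:
import multiprocessing
import queue
import time

def worker(heartbeat_q):
    # send a heartbeat once a second, then simulate a frozen worker
    for _ in range(3):
        heartbeat_q.put((multiprocessing.current_process().pid, time.time()))
        time.sleep(1)
    time.sleep(3600)                 # "frozen"

if __name__ == '__main__':
    heartbeat_q = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(heartbeat_q,)) for _ in range(2)]
    for p in procs:
        p.start()

    last_seen = {p.pid: time.time() for p in procs}
    timeout = 5.0                    # seconds of silence before a worker counts as frozen
    while procs:
        try:
            pid, ts = heartbeat_q.get(timeout=1)
            last_seen[pid] = ts
        except queue.Empty:
            pass
        for p in procs[:]:
            if time.time() - last_seen[p.pid] > timeout:
                print("terminating frozen worker", p.pid)
                p.terminate()
                p.join()
                procs.remove(p)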
To be honest the map stuff is much easier than using a Manager.
Here's a Manager example I've used. A worker adds stuff to a shared list. Another worker occasionally wakes up, processes everything on the list, then goes back to sleep. The code also has verbose logs, which are essential for ease in debugging.
# producer adds to fixed-sized list; scanner uses them
import logging, multiprocessing, sys, time

def producer(objlist):
    '''
    add an item to list every sec; ensure fixed size list
    '''
    logger = multiprocessing.get_logger()
    logger.info('start')
    while True:
        try:
            time.sleep(1)
        except KeyboardInterrupt:
            return
        msg = 'ding: {:04d}'.format(int(time.time()) % 10000)
        logger.info('put: %s', msg)
        del objlist[0]
        objlist.append(msg)

def scanner(objlist):
    '''
    every now and then, run calculation on objlist
    '''
    logger = multiprocessing.get_logger()
    logger.info('start')
    while True:
        try:
            time.sleep(5)
        except KeyboardInterrupt:
            return
        logger.info('items: %s', list(objlist))

def main():
    logger = multiprocessing.log_to_stderr(
        level=logging.INFO
    )
    logger.info('setup')

    # create fixed-length list, shared between producer & consumer
    manager = multiprocessing.Manager()
    my_objlist = manager.list(  # pylint: disable=E1101
        [None] * 10
    )

    multiprocessing.Process(
        target=producer,
        args=(my_objlist,),
        name='producer',
    ).start()

    multiprocessing.Process(
        target=scanner,
        args=(my_objlist,),
        name='scanner',
    ).start()

    logger.info('running forever')
    try:
        manager.join()  # wait until both workers die
    except KeyboardInterrupt:
        pass
    logger.info('done')

if __name__ == '__main__':
    main()

Related

Multiprocessing does not work and hangs on join on windows 10 [duplicate]


Python script is hanging AFTER multithreading

I know there are a few questions and answers related to hanging threads in Python, but my situation is slightly different as the script is hanging AFTER all the threads have been completed. The threading script is below, but obviously the first 2 functions are simplified massively.
When I run the script shown, it works. When I use my real functions, the script hangs AFTER THE LAST LINE. So, all the scenarios are processed (and a message printed to confirm), logStudyData() then collates all the results and writes to a csv. "Script Complete" is printed. And THEN it hangs.
The script with threading functionality removed runs fine.
I have tried enclosing the main script in try...except but no exception gets logged. If I use a debugger with a breakpoint on the final print and then step it forward, it hangs.
I know there is not much to go on here, but short of including the whole 1500-line script, I don't know what else to do. Any suggestions welcome!
def runScenario(scenario):
    # Do a bunch of stuff
    with lock:
        # access global variables
        pass
    pass

def logStudyData():
    # Combine results from all scenarios into a df and write to csv
    pass

def worker():
    global q
    while True:
        next_scenario = q.get()
        if next_scenario is None:
            break
        runScenario(next_scenario)
        print(next_scenario, " is complete")
        q.task_done()

import threading
from queue import Queue

global q, lock

q = Queue()
threads = []
scenario_list = ['s1','s2','s3','s4','s5','s6','s7','s8','s9','s10','s11','s12']

num_worker_threads = 6
lock = threading.Lock()

for i in range(num_worker_threads):
    print("Thread number ", i)
    this_thread = threading.Thread(target=worker)
    this_thread.start()
    threads.append(this_thread)

for scenario_name in scenario_list:
    q.put(scenario_name)

q.join()

print("q.join completed")
logStudyData()
print("script complete")
As the docs for Queue.get say:
Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).
In other words, there is no way get can ever return None, except by you calling q.put(None) on the main thread, which you don't do.
Notice that the example directly below those docs does this:
for i in range(num_worker_threads):
    q.put(None)
for t in threads:
    t.join()
The second loop isn't strictly necessary, and you can usually get away with not doing it.
But the first one is absolutely necessary. You need to either do this, or come up with some other mechanism to tell your workers to quit. Without that, your main thread just tries to exit, which means it tries to join every worker, but those workers are all blocked forever on a get that will never return, so your program hangs forever.
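Applied to the code above, that fix might look roughly like this (reusing the question's q, threads, num_worker_threads and scenario_list):
# ... after the workers have been started, as in the script above ...
for scenario_name in scenario_list:
    q.put(scenario_name)

q.join()                          # wait until every scenario has been processed

for i in range(num_worker_threads):
    q.put(None)                   # one sentinel per worker, so every get() returns

for t in threads:
    t.join()                      # each worker has now seen its sentinel and exited

print("q.join completed")
logStudyData()
print("script complete")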
Building a thread pool may not be rocket science (if only because rocket scientists tend to need their calculations to be deterministic and hard real-time…), but it's not trivial, either, and there are plenty of things you can get wrong. You may want to consider using one of the two already-built threadpools in the Python standard library, concurrent.futures.ThreadPoolExecutor or multiprocessing.dummy.Pool. This would reduce your entire program to:
import concurrent.futures

def work(scenario):
    runScenario(scenario)
    print(scenario, " is complete")

scenario_list = ['s1','s2','s3','s4','s5','s6','s7','s8','s9','s10','s11','s12']

with concurrent.futures.ThreadPoolExecutor(max_workers=6) as x:
    results = list(x.map(work, scenario_list))

print("q.join completed")
logStudyData()
print("script complete")
Obviously you'll still need a lock around any mutable variables you change inside runScenario—although if you're only using a mutable variable there because you couldn't figure out how to return values to the main thread, that's trivial with an Executor: just return the values from work, and then you can use them like this:
for result in x.map(work, scenario_list):
    do_something(result)

Multiprocess message queue not receiving messages

I have a script that creates a class and tries to launch an object of that class in a separate process:
import time
import Queue
from multiprocessing import Process

class Task():
    def __init__(self, messageQueue):
        self.messageQueue = messageQueue

    def run(self):
        startTime = time.time()
        while time.time() - startTime < 60:
            try:
                message = self.messageQueue.get_nowait()
                print message
                self.messageQueue.task_done()
            except Queue.Empty:
                print "No messages"
                time.sleep(1)

def test(messageQueue):
    task = Task(messageQueue)
    task.run()

if __name__ == '__main__':
    messageQueue = Queue.Queue()
    p = Process(target=test, args=(messageQueue,))
    p.start()
    time.sleep(5)
    messageQueue.put("hello")
Instead of seeing the message "hello" printed out after 5 seconds, I just get a continuous stream of "No messages". What am I doing wrong?
The problem is that you're using Queue.Queue, which only handles multiple threads within the same process, not multiple processes.
The multiprocessing module comes with its own replacement, multiprocessing.Queue, which provides the same functionality, but works with both threads and processes.
See Pipes and Queues in the multiprocessing doc for more details—but you probably don't need any more details; the multiprocessing.Queue is meant to be as close to a multi-process clone of Queue.Queue as possible.
If you want to understand the under-the-covers difference:
A Queue.Queue is a deque with condition variables wrapped around it. It relies on the fact that code running in the same interpreter can access the same objects to share the deque, and uses the condition variables to protect the deque from races as well as for signaling.
A multiprocessing.Queue is a more complicated thing that pickles objects and passes them over a pipe between the processes. Races aren't a problem, but signaling still is, so it also has the equivalent of condition variables, but obviously not the ones from threading.
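Concretely, the fix is just swapping the queue class. Here is a sketch of the corrected script, keeping the question's Python 2 style (note that multiprocessing.Queue has no task_done(); multiprocessing.JoinableQueue does, so that call is dropped here):
import time
import Queue                                 # stdlib module, only needed for the Empty exception
import multiprocessing

class Task():
    def __init__(self, messageQueue):
        self.messageQueue = messageQueue

    def run(self):
        startTime = time.time()
        while time.time() - startTime < 60:
            try:
                message = self.messageQueue.get_nowait()
                print message
            except Queue.Empty:
                print "No messages"
                time.sleep(1)

def test(messageQueue):
    Task(messageQueue).run()

if __name__ == '__main__':
    messageQueue = multiprocessing.Queue()   # process-aware queue instead of Queue.Queue
    p = multiprocessing.Process(target=test, args=(messageQueue,))
    p.start()
    time.sleep(5)
    messageQueue.put("hello")
    p.join()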

Python 3 Multiprocessing queue deadlock when calling join before the queue is empty

I have a question understanding the queue in the multiprocessing module in python 3
This is what they say in the programming guidelines:
Bear in mind that a process that has put items in a queue will wait before
terminating until all the buffered items are fed by the “feeder” thread to
the underlying pipe. (The child process can call the
Queue.cancel_join_thread
method of the queue to avoid this behaviour.)
This means that whenever you use a queue you need to make sure that all
items which have been put on the queue will eventually be removed before the
process is joined. Otherwise you cannot be sure that processes which have
put items on the queue will terminate. Remember also that non-daemonic
processes will be joined automatically.
An example which will deadlock is the following:
from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    p.join()                    # this deadlocks
    obj = queue.get()
A fix here would be to swap the last two lines (or simply remove the
p.join() line).
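With the last two lines swapped, the queue is drained before the join, so the child's feeder thread can flush its buffer and the process can exit:
from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    obj = queue.get()           # drain the queue first ...
    p.join()                    # ... then the join can complete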
So apparently, queue.get() should not be called after a join().
However there are examples of using queues where get is called after a join like:
import multiprocessing as mp
import random
import string

# define a example function
def rand_string(length, output):
    """ Generates a random string of numbers, lower- and uppercase chars. """
    rand_str = ''.join(random.choice(
                           string.ascii_lowercase
                           + string.ascii_uppercase
                           + string.digits)
                       for i in range(length))
    output.put(rand_str)

if __name__ == "__main__":
    # Define an output queue
    output = mp.Queue()

    # Setup a list of processes that we want to run
    processes = [mp.Process(target=rand_string, args=(5, output))
                 for x in range(2)]

    # Run processes
    for p in processes:
        p.start()

    # Exit the completed processes
    for p in processes:
        p.join()

    # Get process results from the output queue
    results = [output.get() for p in processes]
    print(results)
I've run this program and it works (also posted as a solution to the StackOverFlow question Python 3 - Multiprocessing - Queue.get() does not respond).
Could someone help me understand what the rule for the deadlock is here?
The queue implementation in multiprocessing that allows data to be transferred between processes relies on standard OS pipes.
OS pipes are not infinitely long, so the process which queues data could be blocked in the OS during the put() operation until some other process uses get() to retrieve data from the queue.
For small amounts of data, such as the one in your example, the main process can join() all the spawned subprocesses and then pick up the data. This often works well, but does not scale, and it is not clear when it will break.
But it will certainly break with large amounts of data. The subprocess will be blocked in put() waiting for the main process to remove some data from the queue with get(), but the main process is blocked in join() waiting for the subprocess to finish. This results in a deadlock.
Here is an example where a user had this exact issue. I posted some code in an answer there that helped him solve his problem.
Don't call join() on a process object before you have retrieved all messages from the shared queue.
I used the following workaround to allow the processes to exit before all of their results have been processed:
import queue

results = []
while True:
    try:
        result = resultQueue.get(False, 0.01)
        results.append(result)
    except queue.Empty:
        pass
    allExited = True
    for t in processes:
        if t.exitcode is None:
            allExited = False
            break
    if allExited and resultQueue.empty():
        break
It can be shortened, but I left it longer to be clearer for newcomers.
Here resultQueue is the multiprocessing.Queue that was shared with the multiprocessing.Process objects. After this block of code you will have the results list with all the messages from the queue.
The problem is that the input buffer of the pipe behind the queue can fill up, blocking the writer(s) indefinitely until there is enough space to receive the next message. So you have three ways to avoid blocking (the third is sketched below):
Increase the multiprocessing.connection.BUFFER size (not so good)
Decrease message size or its amount (not so good)
Fetch messages from the queue immediately as they come (good way)
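A sketch of that third option: drain the result queue while the workers are still running and only join them afterwards. The sentinel-based shutdown and all names here are illustrative:
import multiprocessing as mp

def worker(n, result_q):
    for i in range(n):
        result_q.put(i * i)       # results are sent as soon as they are ready
    result_q.put(None)            # sentinel: this worker is done

if __name__ == '__main__':
    result_q = mp.Queue()
    procs = [mp.Process(target=worker, args=(100000, result_q)) for _ in range(2)]
    for p in procs:
        p.start()

    results = []
    done = 0
    while done < len(procs):      # keep draining until every worker has signalled
        item = result_q.get()
        if item is None:
            done += 1
        else:
            results.append(item)

    for p in procs:               # only join once the queue has been emptied
        p.join()
    print(len(results))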

Python multiprocessing with an updating queue and an output queue

How can I write a Python multiprocessing script that uses two queues like these:
one as a working queue that starts with some data and that, depending on conditions of the functions to be parallelized, receives further tasks on the fly,
another that gathers results and is used to write down the result after processing finishes.
I basically need to put some more tasks in the working queue depending on what I found in its initial items. The example I post below is silly (I could transform the item as I like and put it directly in the output Queue), but its mechanics are clear and reflect part of the concept I need to develop.
Here is my attempt:
import multiprocessing as mp

def worker(working_queue, output_queue):
    item = working_queue.get() #I take an item from the working queue
    if item % 2 == 0:
        output_queue.put(item**2) # If I like it, I do something with it and conserve the result.
    else:
        working_queue.put(item+1) # If there is something missing, I do something with it and leave the result in the working queue

if __name__ == '__main__':
    static_input = range(100)
    working_q = mp.Queue()
    output_q = mp.Queue()
    for i in static_input:
        working_q.put(i)
    processes = [mp.Process(target=worker,args=(working_q, output_q)) for i in range(mp.cpu_count())] #I am running as many processes as CPUs my machine has (is this wise?).
    for proc in processes:
        proc.start()
    for proc in processes:
        proc.join()
    for result in iter(output_q.get, None):
        print result #alternatively, I would like to (c)pickle.dump this, but I am not sure if it is possible.
This neither ends nor prints any result.
At the end of the whole process I would like to ensure that the working queue is empty, and that all the parallel functions have finished writing to the output queue before the latter is iterated to take out the results. Do you have suggestions on how to make it work?
The following code achieves the expected results. It follows the suggestions made by @tawmas.
This code makes it possible to use multiple cores in a process that requires that the queue feeding data to the workers can be updated by them during the processing:
import multiprocessing as mp

def worker(working_queue, output_queue):
    while True:
        if working_queue.empty() == True:
            break #this is the so-called 'poison pill'
        else:
            picked = working_queue.get()
            if picked % 2 == 0:
                output_queue.put(picked)
            else:
                working_queue.put(picked+1)
    return

if __name__ == '__main__':
    static_input = xrange(100)
    working_q = mp.Queue()
    output_q = mp.Queue()
    results_bank = []
    for i in static_input:
        working_q.put(i)
    processes = [mp.Process(target=worker,args=(working_q, output_q)) for i in range(mp.cpu_count())]
    for proc in processes:
        proc.start()
    for proc in processes:
        proc.join()
    results_bank = []
    while True:
        if output_q.empty() == True:
            break
        results_bank.append(output_q.get_nowait())
    print len(results_bank) # length of this list should be equal to static_input, which is the range used to populate the input queue. In other words, this tells whether all the items placed for processing were actually processed.
    results_bank.sort()
    print results_bank
You have a typo in the line that creates the processes. It should be mp.Process, not mp.process. This is what is causing the exception you get.
Also, you are not looping in your workers, so they actually only consume a single item each from the queue and then exit. Without knowing more about the required logic, it's not easy to give specific advice, but you will probably want to enclose the body of your worker function inside a while True loop and add a condition in the body to exit when the work is done.
Please note that, if you do not add a condition to explicitly exit from the loop, your workers will simply stall forever when the queue is empty. You might consider using the so-called poison pill technique to signal the workers they may exit. You will find an example and some useful discussion in the PyMOTW article on Communication Between processes.
As for the number of processes to use, you will need to benchmark a bit to find what works for you, but, in general, one process per core is a good starting point when your workload is CPU bound. If your workload is IO bound, you might have better results with a higher number of workers.
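For reference, here is a minimal sketch of the poison-pill technique mentioned above, for the simple case where all the work is known up front. When workers add tasks on the fly, as in the question, the parent needs extra bookkeeping to decide when to send the pills; the names below are illustrative:
import multiprocessing as mp

def worker(working_q, output_q):
    while True:
        item = working_q.get()            # blocks until something is available
        if item is None:                  # poison pill: clean shutdown signal
            break
        output_q.put(item ** 2)

if __name__ == '__main__':
    working_q = mp.Queue()
    output_q = mp.Queue()
    n_workers = mp.cpu_count()

    for i in range(100):
        working_q.put(i)
    for _ in range(n_workers):
        working_q.put(None)               # one pill per worker, after the real work

    procs = [mp.Process(target=worker, args=(working_q, output_q)) for _ in range(n_workers)]
    for p in procs:
        p.start()

    results = [output_q.get() for _ in range(100)]   # drain before joining
    for p in procs:
        p.join()
    print(len(results))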
