I just want to know how to clear a multiprocessing.Queue like a queue.Queue in Python:
>>> import queue
>>> queue.Queue().clear()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Queue' object has no attribute 'clear'
>>> queue.Queue().queue.clear()
>>> import multiprocessing
>>> multiprocessing.Queue().clear()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Queue' object has no attribute 'clear'
>>> multiprocessing.Queue().queue.clear()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Queue' object has no attribute 'queue'
So, I took a look at the Queue class, and you may want to try this code:
while not some_queue.empty():
    some_queue.get()  # as docs say: Remove and return an item from the queue.
Ask for forgiveness rather than permission; just try to empty the queue until you get the Empty exception, then ignore that exception:
from queue import Empty  # Python 3; on Python 2 this was "from Queue import Empty"

def clear(q):
    try:
        while True:
            q.get_nowait()
    except Empty:
        pass
Better yet: is a built-in class missing the method you want? Subclass the built-in class, and add the method you think should be there!
from queue import Queue, Empty  # Python 3; "from Queue import Queue, Empty" on Python 2

class ClearableQueue(Queue):

    def clear(self):
        try:
            while True:
                self.get_nowait()
        except Empty:
            pass
Your ClearableQueue class inherits all the goodness (and behavior) of the built-in Queue class, and has the method you now want.
Simply use q = ClearableQueue() in all places where you used q = Queue(), and call q.clear() when you'd like.
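A minimal usage sketch of the subclass above (note that this is the thread-level queue.Queue; multiprocessing.Queue is covered in the answers below):
q = ClearableQueue()
for i in range(5):
    q.put(i)

print(q.qsize())  # 5
q.clear()         # drain everything currently in the queue
print(q.empty())  # True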
There is no direct way of clearing a multiprocessing.Queue.
I believe the closest you have is close(), but that simply states that no more data will be pushed to that queue, and will close it when all data has been flushed to the pipe.
The pipe(7) Linux manual page specifies that a pipe has a limited capacity (65,536 bytes by default) and that writing to a full pipe blocks until enough data has been read from the pipe to allow the write to complete:
I/O on pipes and FIFOs
[…]
If a process attempts to read from an empty pipe, then read(2) will block until data is available. If a process attempts to write to a full pipe (see below), then write(2) blocks until sufficient data has been read from the pipe to allow the write to complete. Nonblocking I/O is possible by using the fcntl(2) F_SETFL operation to enable the O_NONBLOCK open file status flag.
[…]
Pipe capacity
A pipe has a limited capacity. If the pipe is full, then a write(2) will block or fail, depending on whether the O_NONBLOCK flag is set (see below). Different implementations have different limits for the pipe capacity. Applications should not rely on a particular capacity: an application should be designed so that a reading process consumes data as soon as it is available, so that a writing process does not remain blocked.
In Linux versions before 2.6.11, the capacity of a pipe was the same as the system page size (e.g., 4096 bytes on i386). Since Linux 2.6.11, the pipe capacity is 16 pages (i.e., 65,536 bytes in a system with a page size of 4096 bytes). Since Linux 2.6.35, the default pipe capacity is 16 pages, but the capacity can be queried and set using the fcntl(2) F_GETPIPE_SZ and F_SETPIPE_SZ operations. See fcntl(2) for more information.
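(As a side note, you can check and even change this capacity from Python. Below is a minimal, Linux-only sketch; F_GETPIPE_SZ/F_SETPIPE_SZ are only exposed by the fcntl module in recent Python versions, so the raw Linux constants 1032/1031 are used as a fallback.)
import fcntl
import os

F_SETPIPE_SZ = getattr(fcntl, 'F_SETPIPE_SZ', 1031)  # raw Linux constant as fallback
F_GETPIPE_SZ = getattr(fcntl, 'F_GETPIPE_SZ', 1032)

r, w = os.pipe()
print('default capacity:', fcntl.fcntl(w, F_GETPIPE_SZ))  # typically 65536
fcntl.fcntl(w, F_SETPIPE_SZ, 1 << 20)                     # request roughly 1 MiB
print('new capacity:', fcntl.fcntl(w, F_GETPIPE_SZ))
os.close(r)
os.close(w)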
That is why the multiprocessing Python library documentation recommends having a consumer process empty each Queue object with Queue.get calls before the feeder threads are joined in the producer processes (implicitly via garbage collection or explicitly via Queue.join_thread calls):
Joining processes that use queues
Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the “feeder” thread to the underlying pipe. (The child process can call the Queue.cancel_join_thread method of the queue to avoid this behaviour.)
This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will be joined automatically.
An example which will deadlock is the following:
from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    p.join()          # this deadlocks
    obj = queue.get()
A fix here would be to swap the last two lines (or simply remove the p.join() line).
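For completeness, a minimal sketch of the fixed ordering: the parent drains the queue first, so the child's feeder thread can flush its data and the join no longer deadlocks.
from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    obj = queue.get()  # drain the queue before joining ...
    p.join()           # ... so this no longer deadlocks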
In some applications, a consumer process may not know how many items have been added to a queue by producer processes. In this situation, a reliable way to empty the queue is to make each producer process add a sentinel item when it is done and make the consumer process remove items (regular and sentinel items) until it has removed as many sentinel items as there are producer processes:
import multiprocessing

def f(q, e):
    while True:
        q.put('X' * 1000000)  # block the feeder thread (size > pipe capacity)
        if e.is_set():
            break
    q.put(None)               # add a sentinel item

if __name__ == '__main__':
    start_count = 5
    stop_count = 0
    q = multiprocessing.Queue()
    e = multiprocessing.Event()
    for _ in range(start_count):
        multiprocessing.Process(target=f, args=(q, e)).start()
    e.set()                   # stop producer processes
    while stop_count < start_count:
        if q.get() is None:   # empty the queue
            stop_count += 1   # count the sentinel items removed
This solution uses blocking Queue.get calls to empty the queue. This guarantees that all items have been added to the queue and removed.
@DanH’s solution uses non-blocking Queue.get_nowait calls to empty the queue. The problem with that solution is that producer processes can still add items to the queue after the consumer process has emptied the queue, which will create a deadlock (the consumer process will wait for the producer processes to terminate, each producer process will wait for its feeder thread to terminate, the feeder thread of each producer process will wait for the consumer process to remove the items added to the queue):
import multiprocessing.queues

def f(q):
    q.put('X' * 1000000)  # block the feeder thread (size > pipe capacity)

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=f, args=(q,))
    p.start()
    try:
        while True:
            q.get_nowait()
    except multiprocessing.queues.Empty:
        pass              # reached before the producer process adds the item to the queue
    p.join()              # deadlock
Or newly created producer processes can fail to deserialise the Process object of the consumer process if the synchronisation resources of the queue that comes with it as an attribute have been garbage collected beforehand, raising a FileNotFoundError:
import multiprocessing.queues

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    q = multiprocessing.Queue()
    multiprocessing.Process(target=f, args=(q,)).start()
    try:
        while True:
            q.get_nowait()
    except multiprocessing.queues.Empty:
        pass  # reached before the producer process deserialises the Process
Standard error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/Cellar/python#3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/usr/local/Cellar/python#3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/usr/local/Cellar/python#3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
I am a newbie, so don't be angry with me, but
why not just redefine the Queue() variable?
import multiprocessing as mp

q = mp.Queue()
chunk = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in chunk:
    q.put(i)

print(q.empty())
q = mp.Queue()
print(q.empty())
My output:
>>False
>>True
I'm just self-educating right now, so if I'm wrong, feel free to point it out
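One caveat worth adding (a minimal sketch, with a hypothetical producer function): rebinding q only changes which queue the parent's name points to; a child process that was already handed the original Queue object keeps using that one, so nothing is actually cleared from the child's point of view.
import multiprocessing as mp

def producer(q):
    q.put('still in the old queue')

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=producer, args=(q,))
    p.start()
    p.join()          # small item, so the feeder thread flushes and the child exits cleanly
    q = mp.Queue()    # the parent's name q now points to a brand-new, empty queue
    print(q.empty())  # True, but the original queue (and its item) still exists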
Related
This is the exact code from Python.org. If you comment out the time.sleep(), it crashes with a long exception traceback. I would like to know why.
And I do understand why Python.org included it in their example code. But the artificial "working time" created via time.sleep() shouldn't break the code when it's removed. It seems to me that the time.sleep() is affording some sort of spin-up time. But as I said, I'd like to know from people who might actually know the answer.
A user comment asked me to fill in more details on the environment this was happening in. It was on macOS Big Sur 11.4, using a clean install of Python 3.9.5 from Python.org (no Homebrew, etc.), run from within PyCharm inside a venv. I hope that helps add to understanding the situation.
import time
import random

from multiprocessing import Process, Queue, current_process, freeze_support

#
# Function run by worker processes
#

def worker(input, output):
    for func, args in iter(input.get, 'STOP'):
        result = calculate(func, args)
        output.put(result)

#
# Function used to calculate result
#

def calculate(func, args):
    result = func(*args)
    return '%s says that %s%s = %s' % \
        (current_process().name, func.__name__, args, result)

#
# Functions referenced by tasks
#

def mul(a, b):
    #time.sleep(0.5*random.random())  # <--- time.sleep() commented out
    return a * b

def plus(a, b):
    #time.sleep(0.5*random.random())  # <--- time.sleep() commented out
    return a + b

#
#
#

def test():
    NUMBER_OF_PROCESSES = 4
    TASKS1 = [(mul, (i, 7)) for i in range(20)]
    TASKS2 = [(plus, (i, 8)) for i in range(10)]

    # Create queues
    task_queue = Queue()
    done_queue = Queue()

    # Submit tasks
    for task in TASKS1:
        task_queue.put(task)

    # Start worker processes
    for i in range(NUMBER_OF_PROCESSES):
        Process(target=worker, args=(task_queue, done_queue)).start()

    # Get and print results
    print('Unordered results:')
    for i in range(len(TASKS1)):
        print('\t', done_queue.get())

    # Add more tasks using `put()`
    for task in TASKS2:
        task_queue.put(task)

    # Get and print some more results
    for i in range(len(TASKS2)):
        print('\t', done_queue.get())

    # Tell child processes to stop
    for i in range(NUMBER_OF_PROCESSES):
        task_queue.put('STOP')


if __name__ == '__main__':
    freeze_support()
    test()
This is the traceback if it helps anyone:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
Traceback (most recent call last):
File "<string>", line 1, in <module>
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Here's a technical breakdown.
This is a race condition where the main process finishes, and exits before some of the children have a chance to fully start up. As long as a child fully starts, there are mechanisms in-place to ensure they shut down smoothly, but there's an unsafe in-between time. Race conditions can be very system dependent, as it is up to the OS and the hardware to schedule the different threads, as well as how fast they chew through their work.
Here's what's going on when a process is started... Early in the creation of a child process, it registers itself in the main process so that it will be either joined or terminated when the main process exits, depending on whether it's daemonic (multiprocessing.util._exit_function). This exit function was registered with the atexit module when multiprocessing was imported.
Also during creation of the child process, a pair of Pipes are opened which will be used to pass the Process object to the child interpreter (which includes what function you want to execute and its arguments). This requires 2 file handles to be shared with the child, and these file handles are also registered to be closed using atexit.
The problem arises when the main process exits before the child has a chance to read all the necessary data from the pipe (un-pickling the Process object) during the startup phase. If the main process first closes the pipe, then waits for the child to join, then we have a problem. The child will continue spinning up the new python instance until it gets to the point when it needs to read in the Process object containing your function and arguments it should run. It will try to read from a pipe which has already been closed, which is an error.
If all the children get a chance to fully start up, you won't ever see this, because that pipe is only used for startup. Putting in a delay which in some way guarantees that all the children have time to fully start up is what solves this problem. Manually calling join provides this delay by waiting for the children before any of the atexit handlers are called. Additionally, any amount of processing delay means that q.get in the main thread has to wait a while, which also gives the children time to start up before closing. I was never able to reproduce the problem you encountered, but presumably you saw the output from all the TASKS ("Process-1 says that mul(19, 7) = 133"). Only one or two of the child processes ended up doing all the work, allowing the main process to get all the results and finish up before the other children finished starting up.
EDIT:
The error is unambiguous as to what's happening, but I still can't figure out how it happens... As far as I can tell, the file handles should be closed when calling _run_finalizers() in _exit_function after joining or terminating all active_children, rather than before via _run_finalizers(0).
EDIT2:
_run_finalizers will seemingly never actually call Popen.finalizer to close the pipes, because exitpriority is None. I'm very confused as to what's going on here, and I think I need to sleep on it...
Apparently @user2357112supportsMonica was on the right track. It totally solves the problem if you join the processes before exiting the program. Also @Aaron's answer has the deep knowledge as to why this fixes the issue!
I added the following bits of code as was suggested and it totally fixed the need to have time.sleep() in there.
First I gathered all the processes when they were started:
processes: list[Process] = []

# Start worker processes
for i in range(NUMBER_OF_PROCESSES):
    p = Process(target=worker, args=(task_queue, done_queue))
    p.start()
    processes.append(p)
Then at the end of the program I joined them as follows:
# Join the processes
for p in processes:
    p.join()
Totally solved the issues. Thanks for the advice.
The following program does the following things:
Parent process creates an inter-process shared value of data type SHARED_DTYPE
Parent process creates inter-process queue to pass object from child process to parent process.
Parent process spawns child process (and waits for object to arrive via the inter-process queue).
Child process modifies the value of the inter-process shared value
Child process creates an object of data type TRAVELLER_DTYPE
Child process passes the created object via the inter-process queue.
Parent process receives the object via the inter-process queue.
from multiprocessing import Value, Process, Queue
import ctypes

SHARED_DTYPE = ctypes.c_int
TRAVELLER_DTYPE = ctypes.c_float

shared_value = Value(SHARED_DTYPE, 0)
print('type of shared_value =', type(shared_value))
print('shared_value =', shared_value.value)

def child_proc():
    try:
        shared_value.value = 1
        obj = TRAVELLER_DTYPE(5)
        print('send into queue =', obj)
        q.put(obj)
    except BaseException as e:
        print(e)
    finally:
        print('child_proc process is finished')

if __name__ == "__main__":
    try:
        q = Queue()
        cp = Process(target=child_proc)
        cp.start()
        cp.join()
        print('shared_value =', shared_value.value)
        obj = q.get()
        print('recv from queue =', obj)
    except BaseException as e:
        print(e)
    finally:
        print('__main__ process is finished')
Now, if the above program is run, it works correctly, giving the following output:
type of shared_value = <class 'multiprocessing.sharedctypes.Synchronized'>
shared_value = 0
send into queue = c_float(5.0)
child_proc process is finished
shared_value = 1
recv from queue = c_float(5.0)
__main__ process is finished
But if we change the TRAVELLER_DTYPE to ctypes.c_int at the top of the program, it no longer works correctly.
Sometimes, it gives the following output:
type of shared_value = <class 'multiprocessing.sharedctypes.Synchronized'>
shared_value = 0
send into queue = c_int(5)
child_proc process is finished
shared_value = 1
^C <-- Pressed ctrl-C here, was hung indefinitely.
__main__ process is finished
While other times, it gives this output:
type of shared_value = <class 'multiprocessing.sharedctypes.Synchronized'>
shared_value = 0
send into queue = c_int(5)
child_proc process is finished
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
obj = _ForkingPickler.dumps(obj)
File "/usr/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
File "/usr/lib/python3.8/multiprocessing/sharedctypes.py", line 129, in reduce_ctype
assert_spawning(obj)
File "/usr/lib/python3.8/multiprocessing/context.py", line 359, in assert_spawning
raise RuntimeError(
RuntimeError: c_int objects should only be shared between processes through inheritance
shared_value = 1
^C <-- Pressed ctrl-C here, was hung indefinitely.
__main__ process is finished
Why?
In general, the program works correctly if and only if SHARED_DTYPE != TRAVELLER_DTYPE
Is some explicit locking object required?
The Python multiprocessing doc page does not mention any such issue.
Searching online for the error message does not give any relevant info/leads:
Some SO question
Some SO question: no shared value and queue together, although suggests using a multiprocessing.Manager() and multiprocessing.Manager().Queue()
A python bug report: Is there something relevant in this which can give some hints?
Odd that it works when the two types are not the same, but fails when they are the same. The bug report mentioned looks relevant but old. This does appear to be a bug. A workaround: unlike Value objects, the objects you put on a Queue do not need to be (and perhaps shouldn't be) ctypes types, so you can use plain int and float instead and it works.
I assume you are running on Linux, but Windows uses spawning rather than forking of processes, and with spawning the script is imported into child processes, making global variables different instances between processes. This makes even your "working" scenario fail on Windows. Instead, the queue and shared value should be passed as arguments to the child worker, ensuring they are inherited correctly as the same object (this may be what the error message is referring to).
Below I've also rearranged the code to work with spawning so it will work on Windows as well as Linux:
from multiprocessing import Value, Process, Queue
import ctypes

SHARED_DTYPE = ctypes.c_int
TRAVELLER_DTYPE = int

def child_proc(q, shared_value):
    shared_value.value = 1
    obj = TRAVELLER_DTYPE(5)
    print('send into queue =', obj)
    q.put(obj)
    print('child_proc process is finished')

if __name__ == "__main__":
    shared_value = Value(SHARED_DTYPE, 0)
    print('type of shared_value =', type(shared_value))
    print('shared_value =', shared_value.value)
    q = Queue()
    cp = Process(target=child_proc, args=(q, shared_value))
    cp.start()
    cp.join()
    print('shared_value =', shared_value.value)
    obj = q.get()
    print('recv from queue =', obj)
    print('__main__ process is finished')
type of shared_value = <class 'multiprocessing.sharedctypes.Synchronized'>
shared_value = 0
send into queue = 5
child_proc process is finished
shared_value = 1
recv from queue = 5
__main__ process is finished
I'm trying to use a cluster of computers to run millions of small simulations. To do this I tried to set up two "servers" on my main computer, one to add input variables in a queue to the network and one to take care of the result.
This is the code for putting stuff into the simulation variables queue:
"""This script reads start parameters and calls on run_sim to run the
simulations"""
import time
from multiprocessing import Process, freeze_support, Manager, Value, Queue, current_process
from multiprocessing.managers import BaseManager
class QueueManager(BaseManager):
pass
class MultiComputers(Process):
def __init__(self, sim_name, queue):
self.sim_name = sim_name
self.queue = queue
super(MultiComputers, self).__init__()
def get_sim_obj(self, offset, db):
"""returns a list of lists from a database query"""
def handle_queue(self):
self.sim_nr = 0
sims = self.get_sim_obj()
self.total = len(sims)
while len(sims) > 0:
if self.queue.qsize() > 100:
self.queue.put(sims[0])
self.sim_nr += 1
print(self.sim_nr, round(self.sim_nr/self.total * 100, 2), self.queue.qsize())
del sims[0]
def run(self):
self.handle_queue()
if __name__ == '__main__':
freeze_support()
queue = Queue()
w = MultiComputers('seed_1_hundred', queue)
w.start()
QueueManager.register('get_queue', callable=lambda: queue)
m = QueueManager(address=('', 8001), authkey=b'abracadabra')
s = m.get_server()
s.serve_forever()
And then this queue is run to take care of the results of the simulations:
__author__ = 'axa'

from multiprocessing import Process, freeze_support, Queue
from multiprocessing.managers import BaseManager
import time

class QueueManager(BaseManager):
    pass

class SaveFromMultiComp(Process):

    def __init__(self, sim_name, queue):
        self.sim_name = sim_name
        self.queue = queue
        super(SaveFromMultiComp, self).__init__()

    def run(self):
        res_got = 0
        with open('sim_type1_' + self.sim_name, 'a') as f_1:
            with open('sim_type2_' + self.sim_name, 'a') as f_2:
                while True:
                    if self.queue.qsize() > 0:
                        while self.queue.qsize() > 0:
                            res = self.queue.get()
                            res_got += 1
                            if res[0] == 1:
                                f_1.write(str(res[1]) + '\n')
                            elif res[0] == 2:
                                f_2.write(str(res[1]) + '\n')
                        print(res_got)
                    time.sleep(0.5)

if __name__ == '__main__':
    queue = Queue()
    w = SaveFromMultiComp('seed_1_hundred', queue)
    w.start()
    m = QueueManager(address=('', 8002), authkey=b'abracadabra')
    s = m.get_server()
    s.serve_forever()
These scripts work as expected for the first ~700-800 simulations; after that I get the following error in the terminal running the script that receives the results:
Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Python35\lib\threading.py", line 914, in _bootstrap_inner
self.run()
File "C:\Python35\lib\threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "C:\Python35\lib\multiprocessing\managers.py", line 177, in accepter
t.start()
File "C:\Python35\lib\threading.py", line 844, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
Can anyone give some insight into where and how the threads are spawned? Is a new thread spawned every time I call queue.get(), or how does it work?
And I would be very glad if someone knows what I can do to avoid this failure. (I'm running the script with Python 3.5, 32-bit.)
All signs point to your system being out of the resources it needs to launch a thread (probably memory, but you could be leaking threads or other resources). You could use OS system-monitoring tools (top for Linux, Resource Monitor for Windows) to look at the number of threads and memory usage to track this down, but I would recommend you just use an easier, more efficient programming pattern.
While not a perfect comparison, you are essentially seeing the C10K problem: blocking threads waiting for results do not scale well and are prone to errors like this. The solution is to implement async I/O patterns (one blocking thread that launches other workers), which is pretty straightforward to do in web servers.
A framework like Python's aiohttp should be a good fit for what you want. You just need a handler that can get the ID of the remote code and the result. The framework should hopefully take care of the scaling for you.
So in your case you can keep your launching code, but after it starts the process on the remote machine, kill the thread. Have the remote code then send an HTTP message to your server with 1) its ID and 2) its result. Throw in a little extra code to ask it to try again if it does not get a 200 'OK' status code, and you should be in much better shape.
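As a rough illustration of that last point, a result-collecting endpoint with aiohttp could look something like the sketch below (the /result route and the id/result JSON fields are assumptions, not anything from the question; aiohttp is a third-party package):
from aiohttp import web

results = {}

async def handle_result(request):
    data = await request.json()            # expects e.g. {"id": "...", "result": "..."}
    results[data['id']] = data['result']   # store, or write to disk, as needed
    return web.Response(text='OK')         # a 200 tells the remote worker not to retry

app = web.Application()
app.add_routes([web.post('/result', handle_result)])

if __name__ == '__main__':
    web.run_app(app, port=8002)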
I think you have too many threads running for your system. I would first check your system resources and then rethink the program.
Try limiting your threads and use as few as possible (a sketch with a fixed-size pool is below).
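For example, a minimal sketch of that idea with a fixed-size pool from concurrent.futures (handle_result and work_items are hypothetical placeholders):
from concurrent.futures import ThreadPoolExecutor

def handle_result(res):
    print('got', res)

work_items = range(1000)  # placeholder for the incoming results/tasks

# At most 8 worker threads exist at any time, no matter how many items arrive.
with ThreadPoolExecutor(max_workers=8) as pool:
    for item in work_items:
        pool.submit(handle_result, item)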
I'm using Python 3.7 and following this documentation. I want to have a process which spawns a child process, waits for it to finish a task, and gets some info back. I use the following code:
from multiprocessing import Process, Queue

if __name__ == '__main__':
    q = Queue()
    p = Process(target=some_func, args=(q,))
    p.start()
    print(q.get())
    p.join()
When the child process finishes correctly there is no problem and it works great, but the problem starts when the child process is terminated before it finishes.
In this case, my application hangs on the wait.
Giving a timeout to q.get() and p.join() does not completely solve the issue, because I want to know immediately that the child process died, not wait for the timeout.
Another problem is that a timeout on q.get() yields an exception, which I prefer to avoid.
Can someone suggest a more elegant way to overcome these issues?
Queue & Signal
One possibility would be to register a signal handler and use it to pass a sentinel value.
On Unix you could handle SIGCHLD in the parent, but that's not an option in your case. According to the signal module:
On Windows, signal() can only be called with SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, SIGTERM, or SIGBREAK.
Not sure if killing it through Task-Manager will translate into SIGTERM but you can give it a try.
For handling SIGTERM you would need to register the signal handler in the child.
import os
import sys
import time
import signal
from functools import partial
from multiprocessing import Process, Queue

SENTINEL = None

def _sigterm_handler(signum, frame, queue):
    print("received SIGTERM")
    queue.put(SENTINEL)
    sys.exit()

def register_sigterm(queue):
    global _sigterm_handler
    _sigterm_handler = partial(_sigterm_handler, queue=queue)
    signal.signal(signal.SIGTERM, _sigterm_handler)

def some_func(q):
    register_sigterm(q)
    print(os.getpid())
    for i in range(30):
        time.sleep(1)
        q.put(f'msg_{i}')

if __name__ == '__main__':
    q = Queue()
    p = Process(target=some_func, args=(q,))
    p.start()
    for msg in iter(q.get, SENTINEL):
        print(msg)
    p.join()
Example Output:
12273
msg_0
msg_1
msg_2
msg_3
received SIGTERM
Process finished with exit code 0
Queue & Process.is_alive()
Even if this works with Task-Manager, your use-case sounds like you can't exclude force kills, so I think you're better off with an approach which doesn't rely on signals.
You can check in a loop if your process p.is_alive(), call queue.get() with a timeout specified and handle the Empty exceptions:
import os
import time
from queue import Empty
from multiprocessing import Process, Queue

def some_func(q):
    print(os.getpid())
    for i in range(30):
        time.sleep(1)
        q.put(f'msg_{i}')

if __name__ == '__main__':
    q = Queue()
    p = Process(target=some_func, args=(q,))
    p.start()
    while p.is_alive():
        try:
            msg = q.get(timeout=0.1)
        except Empty:
            pass
        else:
            print(msg)
    p.join()
It would also be possible to avoid the exception, but I wouldn't recommend it, because you don't spend your waiting time "on the queue", which decreases responsiveness:
while p.is_alive():
    if not q.empty():
        msg = q.get_nowait()
        print(msg)
    time.sleep(0.1)
Pipe & Process.is_alive()
If you intend to use one connection per child, it would however be possible to use a pipe instead of a queue. It's more performant than a queue
(which is built on top of a pipe) and you can use multiprocessing.connection.wait (Python 3.3+) to await readiness of multiple objects at once.
multiprocessing.connection.wait(object_list, timeout=None)
Wait till an object in object_list is ready. Returns the list of those objects in object_list which are ready. If timeout is a float then the call blocks for at most that many seconds. If timeout is None then it will block for an unlimited period. A negative timeout is equivalent to a zero timeout.
For both Unix and Windows, an object can appear in object_list if it is a readable Connection object;
a connected and readable socket.socket object; or
the sentinel attribute of a Process object.
A connection or socket object is ready when there is data available to be read from it, or the other end has been closed.
Unix: wait(object_list, timeout) is almost equivalent to select.select(object_list, [], [], timeout). The difference is that, if select.select() is interrupted by a signal, it can raise OSError with an error number of EINTR, whereas wait() will not.
Windows: An item in object_list must either be an integer handle which is waitable (according to the definition used by the documentation of the Win32 function WaitForMultipleObjects()) or it can be an object with a fileno() method which returns a socket handle or pipe handle. (Note that pipe handles and socket handles are not waitable handles.)
You can use this to await the sentinel attribute of the process and the parental end of the pipe concurrently.
import os
import time
from multiprocessing import Process, Pipe
from multiprocessing.connection import wait

def some_func(conn_write):
    print(os.getpid())
    for i in range(30):
        time.sleep(1)
        conn_write.send(f'msg_{i}')

if __name__ == '__main__':
    conn_read, conn_write = Pipe(duplex=False)
    p = Process(target=some_func, args=(conn_write,))
    p.start()
    while p.is_alive():
        wait([p.sentinel, conn_read])  # block-wait until something gets ready
        if conn_read.poll():           # check if something can be received
            print(conn_read.recv())
    p.join()
I use a Pool to run several commands simultaneously. I would like to not print the stack trace when the user interrupts the script.
Here is my script structure:
def worker(some_element):
    try:
        cmd_res = Popen(SOME_COMMAND, stdout=PIPE, stderr=PIPE).communicate()
    except (KeyboardInterrupt, SystemExit):
        pass
    except Exception, e:
        print str(e)
        return
    # deal with cmd_res...

pool = Pool()
try:
    pool.map(worker, some_list, chunksize=1)
except KeyboardInterrupt:
    pool.terminate()
    print 'bye!'
By calling pool.terminate() when KeyboardInterrupt is raised, I expected the stack trace not to be printed, but it doesn't work; I sometimes get something like:
^CProcess PoolWorker-6:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 374, in get
racquire()
KeyboardInterrupt
Process PoolWorker-1:
Process PoolWorker-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Traceback (most recent call last):
...
bye!
Do you know how I could hide this?
Thanks.
In your case you don't even need pool processes or threads. And then it gets easier to silence KeyboardInterrupts with try/except.
Pool processes are useful when your Python code does CPU-consuming calculations that can profit from parallelization.
Threads are useful when your Python code does complex blocking I/O that can run in parallel. Here, you just want to execute multiple programs in parallel and wait for the results. When you use Pool you create processes that do nothing other than starting other processes and waiting for them to terminate.
The simplest solution is to create all of the processes in parallel and then to call .communicate() on each of them:
try:
    processes = []
    # Start all processes at once
    for element in some_list:
        processes.append(Popen(SOME_COMMAND, stdout=PIPE, stderr=PIPE))
    # Fetch their results sequentially
    for process in processes:
        cmd_res = process.communicate()
        # Process your result here
except KeyboardInterrupt:
    for process in processes:
        try:
            process.terminate()
        except OSError:
            pass
This works when the output on STDOUT and STDERR isn't too big. Otherwise, when a process other than the one communicate() is currently being called for produces too much output for the PIPE buffer (usually around 1-8 kB), it will be suspended by the OS until communicate() is called on it. In that case you need a more sophisticated solution:
Asynchronous I/O
Since Python 3.4 you can use the asyncio module for single-thread pseudo-multithreading:
import asyncio
from asyncio.subprocess import PIPE

loop = asyncio.get_event_loop()

@asyncio.coroutine
def worker(some_element):
    process = yield from asyncio.create_subprocess_exec(*SOME_COMMAND, stdout=PIPE)
    try:
        cmd_res = yield from process.communicate()
    except KeyboardInterrupt:
        process.terminate()
        return
    try:
        pass  # Process your result here
    except KeyboardInterrupt:
        return

# Start all workers
workers = []
for element in some_list:
    w = worker(element)
    workers.append(w)
    asyncio.async(w)

# Run until everything complete
loop.run_until_complete(asyncio.wait(workers))
You should be able to limit the number of concurrent processes using e.g. asyncio.Semaphore if you need to.
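For example, a rough sketch of that idea using the newer async/await syntax (Python 3.5+, asyncio.run needs 3.7+; SOME_COMMAND and some_list are the same placeholders as above):
import asyncio
from asyncio.subprocess import PIPE

MAX_CONCURRENT = 4

async def worker(some_element, semaphore):
    # Hold the semaphore for the lifetime of the subprocess, so at most
    # MAX_CONCURRENT commands run at the same time.
    async with semaphore:
        process = await asyncio.create_subprocess_exec(*SOME_COMMAND, stdout=PIPE)
        cmd_res, _ = await process.communicate()
        return cmd_res

async def main():
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    await asyncio.gather(*(worker(e, semaphore) for e in some_list))

asyncio.run(main())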
When you instantiate Pool, it creates cpu_count() (on my machine, 8) Python processes waiting for your worker(). Note that they don't run it yet; they are waiting for the command. When they aren't running your code, they also don't handle KeyboardInterrupt. You can see what they are doing if you specify Pool(processes=2) and send the interruption. You can play with the number of processes to fix it, but I don't think you can handle it in all cases.
Personally, I don't recommend using multiprocessing.Pool for the task of launching other processes. It's overkill to launch several Python processes for that. A much more efficient way is to use threads (see threading.Thread, Queue.Queue). But in this case you need to implement the thread pool yourself, which is not so hard though (a rough sketch follows below).
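A rough sketch of such a hand-rolled thread pool (written with the Python 3 module names; on Python 2 the queue module is spelled Queue):
import threading
import queue
from subprocess import Popen, PIPE

NUM_THREADS = 4
tasks = queue.Queue()

def worker():
    while True:
        cmd = tasks.get()
        if cmd is None:  # sentinel: no more work for this thread
            break
        out, err = Popen(cmd, stdout=PIPE, stderr=PIPE).communicate()
        # deal with out/err here

threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()

for cmd in [['echo', 'hello']] * 10:  # placeholder commands
    tasks.put(cmd)

for _ in threads:
    tasks.put(None)  # one sentinel per thread
for t in threads:
    t.join()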
Your child process will receive both the KeyboardInterrupt exception and the exception from the terminate().
Because the child process receives the KeyboardInterrupt, a simple join() in the parent -- rather than the terminate() -- should suffice.
As suggested by y0prst, I used threading.Thread instead of Pool.
Here is a working example, which rasterises a set of vectors with ImageMagick (I know I can use mogrify for this, it's just an example).
#!/usr/bin/python
from os.path import abspath
from os import listdir
from threading import Thread
from subprocess import Popen, PIPE

RASTERISE_CALL = "magick %s %s"
INPUT_DIR = './tests_in/'

def get_vectors(dir):
    '''Return a list of svg files inside the `dir` directory'''
    return [abspath(dir+f).replace(' ', '\\ ') for f in listdir(dir) if f.endswith('.svg')]

class ImageMagickError(Exception):
    '''Custom error for failed ImageMagick calls'''
    def __init__(self, value): self.value = value
    def __str__(self): return repr(self.value)

class Rasterise(Thread):
    '''Rasterizes a given vector.'''
    def __init__(self, svg):
        self.stdout = None
        self.stderr = None
        Thread.__init__(self)
        self.svg = svg

    def run(self):
        p = Popen((RASTERISE_CALL % (self.svg, self.svg + '.png')).split(), shell=False, stdout=PIPE, stderr=PIPE)
        self.stdout, self.stderr = p.communicate()
        if self.stderr != '':
            raise ImageMagickError, 'can not rasterize ' + self.svg + ': ' + self.stderr

threads = []

def join_threads():
    '''Joins all the threads.'''
    for t in threads:
        try:
            t.join()
        except (KeyboardInterrupt, SystemExit):
            pass

# Rasterize all the vectors in INPUT_DIR.
for f in get_vectors(INPUT_DIR):
    t = Rasterise(f)
    try:
        print 'rasterize ' + f
        t.start()
    except (KeyboardInterrupt, SystemExit):
        join_threads()
    except ImageMagickError:
        print 'Oops, IM can not rasterize ' + f + '.'
        continue
    threads.append(t)

# Wait for all threads to end
join_threads()

print('Finished!')
Please tell me if you think there is a more Pythonic way to do that, or if it can be optimised; I will edit my answer.