I have a Python script which starts multiple commands using subprocess.Popen. I added a signal handler which is called when a child exits, and I want to check which child terminated. I can do this by iterating over all children:
#!/usr/bin/env python
import subprocess
import signal

procs = []

def signal_handler(signum, frame):
    # Iterate over a copy so finished processes can be removed safely
    for proc in procs[:]:
        proc.poll()
        if proc.returncode is not None:
            print("%s returned %s" % (proc.pid, proc.returncode))
            procs.remove(proc)

def main():
    signal.signal(signal.SIGCHLD, signal_handler)
    procs.append(subprocess.Popen(["/bin/sleep", "2"]))
    procs.append(subprocess.Popen(["/bin/sleep", "5"]))
    # Wait so the main process does not terminate immediately
    procs[1].wait()

if __name__ == "__main__":
    main()
I would like to avoid querying all subprocesses. Is there a way to determine in the signal handler which child terminated?
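(For reference, and separate from the answer below: on POSIX systems the kernel can tell you directly which child exited. Below is a minimal sketch, assuming a Unix-like OS, that calls os.waitpid(-1, os.WNOHANG) inside the SIGCHLD handler; note that reaping children this way means Popen.poll()/wait() on those objects may no longer report the status themselves.)

import os
import signal

# Sketch only: assumes a POSIX system; the children are tracked elsewhere.
def sigchld_handler(signum, frame):
    while True:
        try:
            # -1 means "any child"; WNOHANG returns immediately instead of blocking
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left to reap
        if pid == 0:
            break  # children exist, but none have exited yet
        if os.WIFEXITED(status):
            print("child %s exited with code %s" % (pid, os.WEXITSTATUS(status)))

signal.signal(signal.SIGCHLD, sigchld_handler)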
You could achieve a similar result using multiprocessing. If you don't want to spawn the extra processes, you could use the threading package instead; it has pretty much the same interface (see the thread-based sketch after the code below). Basically, each subprocess call happens in a new worker process, which then launches your sleep process.
import subprocess
import multiprocessing

def callback(result):
    # do something with result
    pid, returncode = result
    print(pid, returncode)

def call_process(cmd):
    p = subprocess.Popen(cmd)
    p.wait()
    return p.pid, p.returncode

def main():
    pool = multiprocessing.Pool()
    pool.apply_async(call_process, [["/bin/sleep", "2"]], callback=callback)
    pool.apply_async(call_process, [["/bin/sleep", "5"]], callback=callback)
    pool.close()
    pool.join()

main()
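If you would rather avoid the extra processes entirely, here is a minimal thread-based sketch of the same idea using multiprocessing.dummy, which exposes the same Pool interface backed by threads (the module choice is mine, not the answer's):

import subprocess
from multiprocessing.dummy import Pool  # thread-backed Pool with the same interface

def callback(result):
    pid, returncode = result
    print(pid, returncode)

def call_process(cmd):
    p = subprocess.Popen(cmd)
    p.wait()
    return p.pid, p.returncode

pool = Pool()
pool.apply_async(call_process, [["/bin/sleep", "2"]], callback=callback)
pool.apply_async(call_process, [["/bin/sleep", "5"]], callback=callback)
pool.close()
pool.join()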
The use of multiprocessing.Manager prevents clean termination of a Python child process using subprocess.Popen.terminate() and subprocess.Popen.kill().
This seems to be because Manager creates a child process behind the scenes for communicating, but this process does not know how to clean itself up when the parent is terminated.
What is the easiest way to use multiprocessing.Manager so that it does not prevent a process shutdown by a signal?
A demonstration:
"""Multiprocess manager hang test."""
import multiprocessing
import subprocess
import sys
import time
def launch_and_read_process():
proc = subprocess.Popen(
[
"python",
sys.argv[0],
"run_unkillable"
],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
# Give time for the process to run and print()
time.sleep(3)
status = proc.poll()
print("poll() is", status)
print("Terminating")
assert proc.returncode is None
proc.terminate()
exit_code = proc.wait()
print("Got exit code", exit_code)
stdout, stderr = proc.communicate()
print("Got output", stdout.decode("utf-8"))
def run_unkillable():
# Disable manager creation to make the code run correctly
manager = multiprocessing.Manager()
d = manager.dict()
d["foo"] = "bar"
print("This is an example output", flush=True)
time.sleep(999)
def main():
mode = sys.argv[1]
print("Doing subrouting", mode)
func = globals().get(mode)
func()
if __name__ == "__main__":
main()
Run as python test-script.py launch_and_read_process.
Good output (no multiprocessing.Manager):
Doing subroutine launch_and_read_process
poll() is None
Terminating
Got exit code -15
Got output Doing subroutine run_unkillable
This is an example output
Output when subprocess.Popen.communicate() hangs because of the use of Manager:
Doing subroutine launch_and_read_process
poll() is None
Terminating
Got exit code -15
Like you pointed out, this happens because the manager spawns its own child process. So when you do proc.communicate(), the code hangs because that child process's stderr and stdout are still open. You can easily solve this on Unix by setting your own handlers for SIGTERM and SIGINT, but it becomes a little hairy on Windows, since those two signals are pretty much useless there. Also, keep in mind that signals are only delivered to the main thread. Depending on your OS and the signal, if the main thread is busy (time.sleep(999)), the whole sleep may need to run out before the signal can be intercepted. Anyway, I have provided a solution for both Windows and Unix, with a note at the end:
UNIX
This is pretty straightforward: you define your own handlers for the signals and explicitly call manager.shutdown() there to properly clean up its child process:
import functools
import signal
import sys

def handler(manager, *args):
    """
    Our handler; use functools.partial to fix the manager arg (or you
    can create a factory function too).
    """
    manager.shutdown()
    sys.exit()

def run_unkillable():
    manager = multiprocessing.Manager()
    # Register our handlers so the manager's child process is cleaned up
    h = functools.partial(handler, manager)
    signal.signal(signal.SIGINT, h)
    signal.signal(signal.SIGTERM, h)
    d = manager.dict()
    d["foo"] = "bar"
    print("This is an example output", flush=True)
    time.sleep(999)
Windows
On Windows you will need to explicitly send the signal signal.CTRL_BREAK_EVENT rather than the plain proc.terminate() to ensure that the child process intercepts it (reference). You'll also want to sleep in shorter durations in a loop instead of doing sleep(999), so the signal can interrupt the main thread rather than waiting out the whole sleep (see this question for alternatives).
"""Multiprocess manager hang test."""
import functools
import multiprocessing
import subprocess
import sys
import time
import signal
def launch_and_read_process():
proc = subprocess.Popen(
[
"python",
sys.argv[0],
"run_unkillable"
],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
creationflags=subprocess.CREATE_NEW_PROCESS_GROUP # So that our current process does not get SIGBREAK signal
)
# Give time for the process to run and print()
time.sleep(5)
status = proc.poll()
print("poll() is", status)
print("Terminating")
assert proc.returncode is None
# Send this specific signal instead of doing terminate()
proc.send_signal(signal.CTRL_BREAK_EVENT)
exit_code = proc.wait()
print("Got exit code", exit_code)
stdout, stderr = proc.communicate()
print("Got output", stdout.decode("utf-8"))
def handler(manager, *args):
"""
Our handler, use functools.partial to fix arg manager (or you
can create a factory function too)
"""
manager.shutdown()
sys.exit()
def run_unkillable():
# Disable manager creation to make the code run correctly
manager = multiprocessing.Manager()
# Register our handler,
signal.signal(signal.SIGBREAK, functools.partial(handler, manager))
d = manager.dict()
d["foo"] = "bar"
print("This is an example output", flush=True)
# Sleep in a loop otherwise the signal won't interrupt the main thread
for _ in range(999):
time.sleep(1)
def main():
mode = sys.argv[1]
print("Doing subrouting", mode)
func = globals().get(mode)
func()
if __name__ == "__main__":
main()
Note: Keep in mind that there is a race condition in the above solution, because the signal handler is registered after the manager is created. In theory, the process could be killed right before the handler is registered, and proc.communicate() would then hang because the manager was not cleaned up. So you may want to supply a timeout parameter to .communicate(), with error handling to log these edge cases.
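A minimal sketch of that fallback on the parent side; the 10-second timeout value is only illustrative:

try:
    stdout, stderr = proc.communicate(timeout=10)
    print("Got output", stdout.decode("utf-8"))
except subprocess.TimeoutExpired:
    # The manager's child process was likely not cleaned up and still holds
    # the pipes open; log the edge case instead of hanging forever.
    print("communicate() timed out; manager child probably not shut down")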
Is there a way to make the processes in concurrent.futures.ProcessPoolExecutor terminate if the parent process terminates for any reason?
Some details: I'm using ProcessPoolExecutor in a job that processes a lot of data. Sometimes I need to terminate the parent process with a kill command, but when I do that the processes from ProcessPoolExecutor keep running and I have to manually kill them too. My primary work loop looks like this:
with concurrent.futures.ProcessPoolExecutor(n_workers) as executor:
    result_list = [executor.submit(_do_work, data) for data in data_list]
    for id, future in enumerate(
            concurrent.futures.as_completed(result_list)):
        print(f'{id}: {future.result()}')
Is there anything I can add here or do differently to make the child processes in executor terminate if the parent dies?
You can start a daemon thread in each worker process that terminates the worker when the parent process dies:
import os
import signal
import threading
import time

def start_thread_to_terminate_when_parent_process_dies(ppid):
    pid = os.getpid()

    def f():
        while True:
            try:
                os.kill(ppid, 0)  # signal 0 only checks that the parent is still alive
            except OSError:
                os.kill(pid, signal.SIGTERM)
            time.sleep(1)

    thread = threading.Thread(target=f, daemon=True)
    thread.start()
Usage: pass initializer and initargs to ProcessPoolExecutor
with concurrent.futures.ProcessPoolExecutor(
    n_workers,
    initializer=start_thread_to_terminate_when_parent_process_dies,  # +
    initargs=(os.getpid(),),                                         # +
) as executor:
This works even if the parent process is SIGKILL/kill -9'ed.
I would suggest two changes:
Use a kill -15 command rather than kill -9; kill -15 sends SIGTERM, which the Python program can handle.
Use a multiprocessing pool created with the multiprocessing.pool.Pool class, whose terminate method works quite differently from that of concurrent.futures.ProcessPoolExecutor: it kills all processes in the pool, so any tasks that have been submitted and are currently running are also terminated immediately.
Your equivalent program using the new pool and handling a SIGTERM interrupt would be:
from multiprocessing import Pool
import signal
import sys
import os

...

def handle_sigterm(*args):
    #print('Terminating...', file=sys.stderr, flush=True)
    pool.terminate()
    sys.exit(1)

# The process to be "killed", if necessary:
print(os.getpid(), file=sys.stderr)

pool = Pool(n_workers)
signal.signal(signal.SIGTERM, handle_sigterm)
results = pool.imap_unordered(_do_work, data_list)
for id, result in enumerate(results):
    print(f'{id}: {result}')
You could run the script inside its own cgroup. When you need to kill the whole thing, you can do so via the cgroup's kill switch (cgroup.kill on cgroup v2). Even a cpu cgroup will do the trick, since you can read the group's pids from it.
Check this article on how to use cgexec.
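As a minimal sketch of the kill switch, assuming cgroup v2 (Linux 5.14 or newer) and a cgroup named my-job that you created and started the script in (both the name and the path are illustrative):

from pathlib import Path

# Illustrative path: the script must have been started inside this cgroup,
# e.g. with cgexec or by writing its PID into the group's cgroup.procs file.
cgroup = Path("/sys/fs/cgroup/my-job")

# Writing "1" to cgroup.kill sends SIGKILL to every process in the group
# (pool workers included), so nothing is left running.
(cgroup / "cgroup.kill").write_text("1")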
My goal is to implement a Python 3 method that will support running a system command (using subprocess) following a few requirements:
Running long lasting commands
Live logging of both stdout and stderr
Enforcing a timeout to stop the command if it fails to complete on time
In order to support live logging, I have used two threads that handle the stdout and stderr outputs.
My challenge is to enforce the timeout on the threads and the subprocess process.
My attempt to implement the timeout using a signal handler seems to freeze the interpreter as soon as the handler is called.
What's wrong with my implementation?
Is there any other way to implement my requirements?
Here is my current implementation attempt:
def run_live_output(cmd, timeout=900, **kwargs):
    full_output = StringIO()

    def log_popen_pipe(p, log_errors=False):
        while p.poll() is None:
            output = ''
            if log_errors:
                output = p.stderr.readline()
                log.warning(f"{output}")
            else:
                output = p.stdout.readline()
                log.info(f"{output}")
            full_output.write(output)
        if p.poll():
            log.error(f"{cmd}\n{p.stderr.readline()}")

    class MyTimeout(Exception):
        pass

    def handler(signum, frame):
        log.info(f"Signal handler called with signal {signum}")
        raise MyTimeout

    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        stdin=subprocess.PIPE,
        universal_newlines=True,
        **kwargs
    ) as sp:
        with ThreadPoolExecutor(2) as pool:
            try:
                signal.signal(signal.SIGALRM, handler)
                signal.alarm(timeout)
                r1 = pool.submit(log_popen_pipe, sp)
                r2 = pool.submit(log_popen_pipe, sp, log_errors=True)
                r1.result()
                r2.result()
            except MyTimeout:
                log.info(f"Timed out - Killing the threads and process")
                pool.shutdown(wait=True)
                sp.kill()
            except Exception as e:
                log.info(f"{e}")

    return full_output.getvalue()
Q-1) My attempt to implement the timeout using a signal handler seems to freeze the interpreter as soon as the handler is called. What's wrong with my implementation?
A-1) No, your signal handler is not freezing anything. There is a freeze, but it is not in the signal handler; the handler itself is fine. Your main thread blocks (freezes) when you call pool.shutdown(wait=True), because the subprocess is still running and log_popen_pipe loops on while p.poll() is None:. That is why your main thread will not continue until log_popen_pipe finishes.
To solve this, remove pool.shutdown(wait=True) and call sp.terminate() instead. I suggest sp.terminate() rather than sp.kill(), because sp.kill() sends SIGKILL, which is not preferred unless you really need it. Besides, at the end of the with ThreadPoolExecutor(2) as pool: block, pool.shutdown(wait=True) is called anyway, and it will not block once log_popen_pipe has ended.
In your case log_popen_pipe will finish once the subprocess finishes, which happens when we call sp.terminate().
Q-2) Is there any other way to implement my requirements?
A-2) Yes, there is: you can use the Timer class from the threading library. Timer creates one thread that waits for timeout seconds and then calls sp.terminate.
Here is the code:
from io import StringIO
import signal
import subprocess
from concurrent.futures import ThreadPoolExecutor
import logging as log
from threading import Timer

log.root.setLevel(log.INFO)

def run_live_output(cmd, timeout=900, **kwargs):
    full_output = StringIO()

    def log_popen_pipe(p, log_errors=False):
        while p.poll() is None:
            output = ''
            if log_errors:
                output = p.stderr.readline()
                log.warning(f"{output}")
            else:
                output = p.stdout.readline()
                log.info(f"{output}")
            full_output.write(output)
        if p.poll() is not None:
            log.error(f"subprocess finished, {cmd}\n{p.stdout.readline()}")

    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        stdin=subprocess.PIPE,
        universal_newlines=True,
        **kwargs
    ) as sp:
        # Terminate the subprocess after `timeout` seconds from a separate thread
        Timer(timeout, sp.terminate).start()
        with ThreadPoolExecutor(2) as pool:
            try:
                r1 = pool.submit(log_popen_pipe, sp)
                r2 = pool.submit(log_popen_pipe, sp, log_errors=True)
                r1.result()
                r2.result()
            except Exception as e:
                log.info(f"{e}")

    return full_output.getvalue()

run_live_output(["python3", "...."], timeout=4)
By the way, p.poll() returns the return code of the terminated subprocess. If you only want the output of a successfully terminated subprocess, check if p.poll() == 0, since 0 generally means the subprocess terminated successfully.
Here is an example:
from multiprocessing import Process
import time

def func():
    print('sub process is running')
    time.sleep(5)
    print('sub process finished')

if __name__ == '__main__':
    p = Process(target=func)
    p.start()
    print('done')
What I expect is that the main process will terminate right after it starts the subprocess. But after printing out 'done', the terminal is still waiting... Is there any way to make the main process exit right after printing 'done' instead of waiting for the subprocess? I'm confused here because I'm not calling p.join().
Python will not exit while a non-daemonic child process is still alive.
By setting the daemon attribute before the start() call, you can make the process daemonic.
p = Process(target=func)
p.daemon = True # <-----
p.start()
print('done')
NOTE: There will be no 'sub process finished' message printed, because the main process terminates the sub-process at exit. This may not be what you want.
You should do double-fork:
import os
import time
from multiprocessing import Process

def func():
    if os.fork() != 0:  # <--
        return          # <--
    print('sub process is running')
    time.sleep(5)
    print('sub process finished')

if __name__ == '__main__':
    p = Process(target=func)
    p.start()
    p.join()
    print('done')
Following the excellent answer from @falsetru, I wrote out a quick generalization in the form of a decorator.
import os
from multiprocessing import Process

def detachify(func):
    """Decorate a function so that its calls are async in a detached process.

    Usage
    -----

    .. code::

        import time

        @detachify
        def f(message):
            time.sleep(5)
            print(message)

        f('Async and detached!!!')
    """
    # create a process fork and run the function
    def forkify(*args, **kwargs):
        if os.fork() != 0:
            return
        func(*args, **kwargs)

    # wrapper to run the forkified function
    def wrapper(*args, **kwargs):
        proc = Process(target=lambda: forkify(*args, **kwargs))
        proc.start()
        proc.join()
        return

    return wrapper
Usage (copied from docstring):
import time

@detachify
def f(message):
    time.sleep(5)
    print(message)

f('Async and detached!!!')
Or if you like,
def f(message):
    time.sleep(5)
    print(message)

detachify(f)('Async and detached!!!')
This question concerns multiprocessing in Python. I want to execute some code when I terminate a process, or to be more specific, just before it is terminated. I'm looking for a solution which works the way atexit.register works for a Python program.
I have a worker method which looks like this:
def worker():
    while True:
        print('work')
        time.sleep(2)
    return
I run it by:
proc = multiprocessing.Process(target=worker, args=())
proc.start()
My goal is to execute some extra code just before terminating it, which I do by:
proc.terminate()
Use signal handling and intercept SIGTERM:
import multiprocessing
import time
import sys
from signal import signal, SIGTERM

def before_exit(*args):
    print('Hello')
    sys.exit(0)  # don't forget to exit!

def worker():
    signal(SIGTERM, before_exit)
    time.sleep(10)

proc = multiprocessing.Process(target=worker, args=())
proc.start()
time.sleep(3)
proc.terminate()
This produces the desired output just before subprocess termination.