Popen send_signal signal not received in subprocess - python

I am running multiple processes using subprocess.Popen, and when I detect a change in any of a group of files, I want to send a signal to one of those processes. I have defined a signal handler in the process, but it doesn't seem to receive any signal. Some help would be appreciated. The function that sends the signal and the signal handler are shown below.
def start_up():
    p, i = None, None
    while 1:
        subprocess.call(['clear'])
        logging.info('starting overlay on host %s' % socket.gethostname())
        p = subprocess.Popen([sys.executable, 'sdp_proc.py'])
        i = subprocess.Popen([sys.executable, 'kernel.py', sys.argv[1],
                              sys.argv[2]])
        if file_modified():
            p.terminate()
            i.send_signal(signal.SIGINT)
        time.sleep(1)
The signal handler is shown below:
def signal_handler(signum, frame):
    with open('log.txt', 'w') as f:
        f.write(' so what mate, received signal with signal number %s' % signum)

signal.signal(signal.SIGINT, signal_handler)

I'd guess that the SIGINT is being sent to the subprocess before it even has a chance to load up all of Python, so before it installs the SIGINT handler, meaning it will die right away.
You probably want to watch the subprocess for some successful-load condition to be met (perhaps just sending a byte on a pipe) before sending it any SIGINT signals that you expect to be handled by your own handler code.
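One way to implement that handshake, sketched below: have the child write a single byte to stdout right after installing its handler, and have the parent block on that byte before sending SIGINT. Here child.py is a hypothetical script that installs its SIGINT handler and then does sys.stdout.write('R'); sys.stdout.flush().

import subprocess
import signal
import sys

# Parent side: wait for the readiness byte before signalling.
proc = subprocess.Popen([sys.executable, 'child.py'], stdout=subprocess.PIPE)
proc.stdout.read(1)              # blocks until the child reports it is ready
proc.send_signal(signal.SIGINT)  # the handler is now guaranteed to be installed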

According to the official documentation, on both Unix and Windows (NT) you need to use process groups to deliver signals when shell=True is used.
So, I decided to wrap the built-in Popen function to achieve this behaviour.
import subprocess as sp
import os

def Popen(command, env=None, **kw):
    if os.name == 'nt':
        # On Windows, the specified env must include a valid SystemRoot.
        # Use the current value.
        if env is None:
            env = {}
        env['SystemRoot'] = os.environ['SystemRoot']
        kw['creationflags'] = sp.CREATE_NEW_PROCESS_GROUP
    else:
        kw['preexec_fn'] = os.setpgrp
    return sp.Popen(command, env=env, **kw)
Cross-platform kill function:
if os.name == 'nt':
    # On Windows, the os module can't kill processes by group.
    # Kill all children indiscriminately instead.
    def killpg_by_pid(pid, s):
        # The signal s is ignored due to the lack of support on Windows.
        sp.call(['taskkill', '/T', '/F', '/PID', str(pid)])
else:
    def killpg_by_pid(pid, s):
        os.killpg(os.getpgid(pid), s)
Usage (tested on Linux and Windows):
import signal

# `command` is your shell command string; extra Popen kwargs can be passed via kw
process = Popen(
    command,
    stdout=sp.PIPE,
    stderr=sp.PIPE,
    shell=True,
    **kw,
)

# Kill
killpg_by_pid(process.pid, signal.SIGTERM)
For a complete example please take a look at: bddcli.

Related

multiprocessing.Manager() hangs Popen.communicate() on Python

The use of multiprocessing.Manager prevents clean termination of a Python child process with subprocess.Popen.terminate() and subprocess.Popen.kill().
This seems to be because Manager creates a child process behind the scenes for communicating, but this process does not know how to clean itself up when the parent is terminated.
What is the easiest way to use multiprocessing.Manager so that it does not prevent a process shutdown by a signal?
A demonstration:
"""Multiprocess manager hang test."""
import multiprocessing
import subprocess
import sys
import time
def launch_and_read_process():
proc = subprocess.Popen(
[
"python",
sys.argv[0],
"run_unkillable"
],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
# Give time for the process to run and print()
time.sleep(3)
status = proc.poll()
print("poll() is", status)
print("Terminating")
assert proc.returncode is None
proc.terminate()
exit_code = proc.wait()
print("Got exit code", exit_code)
stdout, stderr = proc.communicate()
print("Got output", stdout.decode("utf-8"))
def run_unkillable():
# Disable manager creation to make the code run correctly
manager = multiprocessing.Manager()
d = manager.dict()
d["foo"] = "bar"
print("This is an example output", flush=True)
time.sleep(999)
def main():
mode = sys.argv[1]
print("Doing subrouting", mode)
func = globals().get(mode)
func()
if __name__ == "__main__":
main()
Run as python test-script.py launch_and_read_process.
Good output (no multiprocessing.Manager):
Doing subroutine launch_and_read_process
poll() is None
Terminating
Got exit code -15
Got output Doing subroutine run_unkillable
This is an example output
Output when subprocess.Popen.communicate() hangs because of the Manager:
Doing subroutine launch_and_read_process
poll() is None
Terminating
Got exit code -15
Like you pointed out, this happens because the manager spawns its own child processes. So when you do proc.communicate() the code hangs because that child process's stderr and stdout are still open. You can easily solve this on Unix by setting your own handlers for SIGTERM and SIGINT, but it becomes a little hairy on Windows since those two signals are pretty much useless. Also, keep in mind that signals are only delivered to the main thread. Depending on your OS and the signal, if the thread is busy (time.sleep(999)) then the whole timer may need to run out before the signal can be intercepted. Anyway, I have provided a solution for both Windows and Unix with a note at the end:
UNIX
This is pretty straightforward: you simply define your own handlers for the signals, which explicitly call manager.shutdown() to properly clean up its child process:
def handler(manager, *args):
    """
    Our handler; use functools.partial to fix the manager arg (or you
    can create a factory function too).
    """
    manager.shutdown()
    sys.exit()

def run_unkillable():
    # Disable the Manager creation below to make the code run correctly
    manager = multiprocessing.Manager()
    # Register our handler
    h = functools.partial(handler, manager)
    signal.signal(signal.SIGINT, h)
    signal.signal(signal.SIGTERM, h)
    d = manager.dict()
    d["foo"] = "bar"
    print("This is an example output", flush=True)
    time.sleep(999)
Windows
On Windows you will need to explicitly send the signal signal.CTRL_BREAK_EVENT rather than the plain proc.terminate() to ensure that the child process intercepts it (reference). Additionally, you'll also want to sleep in shorter durations in a loop instead of doing sleep(999) to make sure the signal interrupts the main thread rather than waiting for the whole duration of sleep (see this question for alternatives).
"""Multiprocess manager hang test."""
import functools
import multiprocessing
import subprocess
import sys
import time
import signal
def launch_and_read_process():
proc = subprocess.Popen(
[
"python",
sys.argv[0],
"run_unkillable"
],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
creationflags=subprocess.CREATE_NEW_PROCESS_GROUP # So that our current process does not get SIGBREAK signal
)
# Give time for the process to run and print()
time.sleep(5)
status = proc.poll()
print("poll() is", status)
print("Terminating")
assert proc.returncode is None
# Send this specific signal instead of doing terminate()
proc.send_signal(signal.CTRL_BREAK_EVENT)
exit_code = proc.wait()
print("Got exit code", exit_code)
stdout, stderr = proc.communicate()
print("Got output", stdout.decode("utf-8"))
def handler(manager, *args):
"""
Our handler, use functools.partial to fix arg manager (or you
can create a factory function too)
"""
manager.shutdown()
sys.exit()
def run_unkillable():
# Disable manager creation to make the code run correctly
manager = multiprocessing.Manager()
# Register our handler,
signal.signal(signal.SIGBREAK, functools.partial(handler, manager))
d = manager.dict()
d["foo"] = "bar"
print("This is an example output", flush=True)
# Sleep in a loop otherwise the signal won't interrupt the main thread
for _ in range(999):
time.sleep(1)
def main():
mode = sys.argv[1]
print("Doing subrouting", mode)
func = globals().get(mode)
func()
if __name__ == "__main__":
main()
Note: Keep in mind that there is a race condition in the above solution because we are registering the signal handler after the creation of a manager. Theoretically, one could kill the process right before the handler is registered and the proc.communicate() will then hang because the manager was not cleaned up. So you may want to supply a timeout parameter to .communicate with error handling to log these edge cases.
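A minimal sketch of that fallback (the timeout value is arbitrary):

try:
    stdout, stderr = proc.communicate(timeout=10)
except subprocess.TimeoutExpired:
    # The manager's child process probably kept the pipes open;
    # log the edge case and fall back to a hard kill.
    proc.kill()
    stdout, stderr = proc.communicate()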

How to implement a subprocess.Popen with both live logging and a timeout option?

My goal is to implement a Python 3 method that will support running a system command (using subprocess) following a few requirements:
Running long lasting commands
Live logging of both stdout and stderr
Enforcing a timeout to stop the command if it fails to complete on time
In order to support live logging, I have used two threads which handle the stdout and stderr outputs.
My challenge is to enforce the timeout on the threads and the subprocess process.
My attempt to implement the timeout using a signal handler seems to freeze the interpreter as soon as the handler is called.
What's wrong with my implementation ?
Is there any other way to implement my requirements?
Here is my current implementation attempt:
def run_live_output(cmd, timeout=900, **kwargs):
    full_output = StringIO()

    def log_popen_pipe(p, log_errors=False):
        while p.poll() is None:
            output = ''
            if log_errors:
                output = p.stderr.readline()
                log.warning(f"{output}")
            else:
                output = p.stdout.readline()
                log.info(f"{output}")
            full_output.write(output)
        if p.poll():
            log.error(f"{cmd}\n{p.stderr.readline()}")

    class MyTimeout(Exception):
        pass

    def handler(signum, frame):
        log.info(f"Signal handler called with signal {signum}")
        raise MyTimeout

    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        stdin=subprocess.PIPE,
        universal_newlines=True,
        **kwargs
    ) as sp:
        with ThreadPoolExecutor(2) as pool:
            try:
                signal.signal(signal.SIGALRM, handler)
                signal.alarm(timeout)
                r1 = pool.submit(log_popen_pipe, sp)
                r2 = pool.submit(log_popen_pipe, sp, log_errors=True)
                r1.result()
                r2.result()
            except MyTimeout:
                log.info("Timed out - Killing the threads and process")
                pool.shutdown(wait=True)
                sp.kill()
            except Exception as e:
                log.info(f"{e}")
    return full_output.getvalue()
Q-1) My attempt to implement the timeout using a signal handler seems to freeze the interpreter as soon as the handler is called. What's wrong with my implementation?
A-1) No, your signal handler is not the thing freezing. There is a freeze, but it is not in the signal handler; the handler is fine. Your main thread blocks when you call pool.shutdown(wait=True), because the subprocess is still running and log_popen_pipe loops on while p.poll() is None:, so the main thread cannot continue until log_popen_pipe finishes.
To solve this, remove pool.shutdown(wait=True) and then call sp.terminate(). I suggest using sp.terminate() instead of sp.kill(), because sp.kill() sends SIGKILL, which is not preferred unless you really need it. Additionally, at the end of the with ThreadPoolExecutor(2) as pool: statement, pool.shutdown(wait=True) is called implicitly, and that will not block once log_popen_pipe has ended.
In your case log_popen_pipe finishes when the subprocess finishes, which is exactly what sp.terminate() causes.
Q-2) Is there any other way to implement my requirements?
A-2) Yes, you can use the Timer class from the threading library. Timer creates one thread that waits timeout seconds and then calls sp.terminate.
Here is the code:
from io import StringIO
import signal
import subprocess
from concurrent.futures import ThreadPoolExecutor
import logging as log
from threading import Timer

log.root.setLevel(log.INFO)

def run_live_output(cmd, timeout=900, **kwargs):
    full_output = StringIO()

    def log_popen_pipe(p, log_errors=False):
        while p.poll() is None:
            output = ''
            if log_errors:
                output = p.stderr.readline()
                log.warning(f"{output}")
            else:
                output = p.stdout.readline()
                log.info(f"{output}")
            full_output.write(output)
        if p.poll() is not None:
            log.error(f"subprocess finished, {cmd}\n{p.stdout.readline()}")

    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        stdin=subprocess.PIPE,
        universal_newlines=True,
        **kwargs
    ) as sp:
        Timer(timeout, sp.terminate).start()
        with ThreadPoolExecutor(2) as pool:
            try:
                r1 = pool.submit(log_popen_pipe, sp)
                r2 = pool.submit(log_popen_pipe, sp, log_errors=True)
                r1.result()
                r2.result()
            except Exception as e:
                log.info(f"{e}")
    return full_output.getvalue()

run_live_output(["python3", "...."], timeout=4)
By the way, p.poll() returns the returncode of the terminated subprocess. If you only want the output of a successfully terminated subprocess, check p.poll() == 0; a returncode of 0 generally means the subprocess terminated successfully.
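For example (p being a finished Popen object):

rc = p.poll()
if rc == 0:
    print('subprocess terminated successfully')
elif rc is not None:
    # on Unix, a negative code means the process was killed by a signal
    print('subprocess exited with code', rc)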

How to run & stop python script from another python script?

I want code like this:
if True:
    run('ABC.PY')
else:
    if ScriptRunning('ABC.PY'):
        stop('ABC.PY')
    run('ABC.PY')
Basically, I want to run a file, let's say abc.py, and based on some conditions, I want to stop it and run it again from another Python script. Is it possible?
I am using Windows.
You can use Python Popen objects for running scripts as child processes:
run('ABC.PY') would be p = subprocess.Popen(['python', 'ABC.PY'])
ScriptRunning('ABC.PY') would be p.poll() is None
stop('ABC.PY') would be p.kill()
This is a very basic example of what you are trying to achieve. Please check out the subprocess.Popen docs to fine-tune your logic for running the script.
import subprocess
import shlex
import time

def run(script):
    scriptArgs = shlex.split(script)
    commandArgs = ["python"]
    commandArgs.extend(scriptArgs)
    procHandle = subprocess.Popen(commandArgs, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return procHandle

def isScriptRunning(procHandle):
    return procHandle.poll() is None

def stopScript(procHandle):
    procHandle.terminate()
    time.sleep(5)
    # Forcefully terminate the script if it is still running
    if isScriptRunning(procHandle):
        procHandle.kill()

def getOutput(procHandle):
    # stderr is redirected to stdout due to the stderr=subprocess.STDOUT argument in the Popen call
    stdout, _ = procHandle.communicate()
    returncode = procHandle.returncode
    return returncode, stdout

def main():
    procHandle = run("main.py --arg 123")
    time.sleep(5)
    isScriptRunning(procHandle)
    stopScript(procHandle)
    print(getOutput(procHandle))

if __name__ == "__main__":
    main()
One thing you should be aware of is stdout=subprocess.PIPE.
If your Python script produces a very large output, the pipe buffer may fill up, blocking the script until .communicate() is called on the handle.
To avoid this, pass a file handle to stdout, like this:
fileHandle = open("main_output.txt", "w")
subprocess.Popen(..., stdout=fileHandle)
In this way, the output of the Python process will be dumped into the file. (You will have to modify the getOutput() function too; see the sketch below.)
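A sketch of how getOutput() might look with the file-based approach (the file name matches the snippet above; waiting first ensures the process has finished writing):

def getOutput(procHandle, outputFile="main_output.txt"):
    procHandle.wait()  # make sure the process has flushed its output and exited
    with open(outputFile) as f:
        stdout = f.read()
    return procHandle.returncode, stdout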
import subprocess

process = None

def run_or_rerun(flag):
    global process
    if flag:
        assert process is None
        process = subprocess.Popen(['python', 'ABC.PY'])
        process.wait()  # must wait or caller will hang
    else:
        if process.poll() is None:  # it is still running
            process.terminate()  # terminate process
        process = subprocess.Popen(['python', 'ABC.PY'])  # rerun
        process.wait()  # must wait or caller will hang

How to kill a python child process created with subprocess.check_output() when the parent dies?

I am running, on a Linux machine, a Python script which creates a child process using subprocess.check_output(), as follows:
subprocess.check_output(["ls", "-l"], stderr=subprocess.STDOUT)
The problem is that even if the parent process dies, the child is still running.
Is there any way I can kill the child process as well when the parent dies?
Yes, you can achieve this by two methods. Both of them require you to use Popen instead of check_output. The first is a simpler method, using try..finally, as follows:
import subprocess
from contextlib import contextmanager

@contextmanager
def run_and_terminate_process(*args, **kwargs):
    try:
        p = subprocess.Popen(*args, **kwargs)
        yield p
    finally:
        p.terminate()  # send sigterm, or ...
        p.kill()       # send sigkill

def main():
    with run_and_terminate_process(args) as running_proc:
        # Your code here, such as running_proc.stdout.readline()
        pass
This will catch sigint (keyboard interrupt) and sigterm, but not sigkill (if you kill your script with -9).
The other method is a bit more complex, and uses ctypes' prctl PR_SET_PDEATHSIG. The system will send a signal to the child once the parent exits for any reason (even sigkill).
import signal
import subprocess
import ctypes

libc = ctypes.CDLL("libc.so.6")

def set_pdeathsig(sig=signal.SIGTERM):
    def callable():
        return libc.prctl(1, sig)  # 1 == PR_SET_PDEATHSIG
    return callable

p = subprocess.Popen(args, preexec_fn=set_pdeathsig(signal.SIGTERM))
Your problem is with using subprocess.check_output - you are correct, you can't get the child PID using that interface. Use Popen instead:
from subprocess import Popen, PIPE

proc = Popen(["ls", "-l"], stdout=PIPE, stderr=PIPE)

# Here you can get the PID
global child_pid
child_pid = proc.pid

# Now we can wait for the child to complete
(output, error) = proc.communicate()
if error:
    print("error:", error)
print("output:", output)
To make sure you kill the child on exit:
import os
import signal
import atexit

def kill_child():
    if child_pid is None:
        pass
    else:
        os.kill(child_pid, signal.SIGTERM)

atexit.register(kill_child)
I don't know the specifics, but the best way is still to catch errors (and perhaps even all errors) with signal and terminate any remaining processes there.
import signal
import sys
import subprocess
import os

def signal_handler(signal, frame):
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)

a = subprocess.check_output(["ls", "-l"], stderr=subprocess.STDOUT)
while 1:
    pass  # Press Ctrl-C; it breaks the application and is caught by signal_handler()
This is just a mockup; you'd need to catch more than just SIGINT, but the idea might get you started, and you'd still need to check for spawned processes somehow.
http://docs.python.org/2/library/os.html#os.kill
http://docs.python.org/2/library/subprocess.html#subprocess.Popen.pid
http://docs.python.org/2/library/subprocess.html#subprocess.Popen.kill
I'd recommend writing a personalized version of check_output, because as I just realized, check_output is really only meant for simple debugging and the like, since you can't interact much with it during execution.
Rewrite of check_output:
from subprocess import Popen, PIPE, STDOUT
from time import sleep, time

def checkOutput(cmd):
    a = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    print(a.pid)
    start = time()
    while a.poll() is None and time() - start <= 30:  # 30 sec grace period
        sleep(0.25)
    if a.poll() is None:
        print('Still running, killing')
        a.kill()
    else:
        print('exit code:', a.poll())
    output = a.stdout.read()
    a.stdout.close()
    a.stdin.close()
    return output
And do whatever you'd like with it; perhaps store the active executions in a temporary variable and kill them upon exit with signal or other means of intercepting errors/shutdowns of the main loop.
In the end, you still need to catch terminations in the main application in order to safely kill any children; the best way to approach this is with try/except or signal.
As of Python 3.2 there is a ridiculously simple way to do this:
from subprocess import Popen

with Popen(["sleep", "60"]) as process:
    print(f"Just launched server with PID {process.pid}")
I think this will be best for most use cases because it's simple and portable, and it avoids any dependence on global state.
If this solution isn't powerful enough, then I would recommend checking out the other answers and discussion on this question or on Python: how to kill child process(es) when parent dies?, as there are a lot of neat ways to approach the problem that provide different trade-offs around portability, resilience, and simplicity. 😊
Manually you could do this:
ps aux | grep <process name>
get the PID (second column), then
kill -9 <PID>
-9 is to force killing it
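The same can be scripted from Python; a rough sketch for Linux (the process name is a placeholder, and pgrep -f matching may catch more processes than you expect):

import os
import signal
import subprocess

# pgrep exits non-zero (raising CalledProcessError) if nothing matches
pids = subprocess.check_output(['pgrep', '-f', 'process_name']).decode().split()
for pid in pids:
    os.kill(int(pid), signal.SIGKILL)  # the kill -9 equivalent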

Popen communicate is not working

I have a script that has been working properly for the past 3 months. The server went down last Monday, and since then my script has stopped working. The script hangs at coords = p.communicate()[0].split().
Here's a part of the script:
class SelectByLatLon(GridSelector):
    def __init__(self, from_lat, to_lat, from_lon, to_lon):
        self.from_lat = from_lat
        self.to_lat = to_lat
        self.from_lon = from_lon
        self.to_lon = to_lon

    def get_selection(self, file):
        p = subprocess.Popen(
            [
                os.path.join(module_root, 'bin/points_from_latlon.tcl'),
                file,
                str(self.from_lat), str(self.to_lat), str(self.from_lon), str(self.to_lon)
            ],
            stdout=subprocess.PIPE
        )
        coords = p.communicate()[0].split()
        return ZGridSelection(int(coords[0]), int(coords[1]), int(coords[2]), int(coords[3]))
When I run the script on another server everything works just fine.
Can I use something else instead of p.communicate()[0].split() ?
You might have previously run your server without daemonization, i.e., you had functional stdin, stdout, and stderr streams. To fix this, you could redirect those streams to DEVNULL for the subprocess:
import os
from subprocess import Popen, PIPE
DEVNULL = os.open(os.devnull, os.O_RDWR)
p = Popen(tcl_cmd, stdin=DEVNULL, stdout=PIPE, stderr=DEVNULL, close_fds=True)
os.close(DEVNULL)
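On Python 3.3+ you can use subprocess.DEVNULL instead, which avoids the manual file-descriptor handling; an equivalent sketch (tcl_cmd as above):

from subprocess import Popen, PIPE, DEVNULL

p = Popen(tcl_cmd, stdin=DEVNULL, stdout=PIPE, stderr=DEVNULL, close_fds=True)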
.communicate() may wait for EOF on stdout even if tcl_cmd already exited: the tcl script might have spawned a child process that inherited the standard streams and outlived its parent.
If you know that you don't need any stdout after the tcl_cmd exits then you could kill the whole process tree when you detect that tcl_cmd is done.
You might need a start_new_session=True analog to be able to kill the whole process tree:
import os
import signal
from threading import Timer

def kill_tree_on_exit(p):
    p.wait()  # wait for tcl_cmd to exit
    os.killpg(p.pid, signal.SIGTERM)

t = Timer(0, kill_tree_on_exit, [p])
t.start()
coords = p.communicate()[0].split()
t.cancel()
See How to terminate a python subprocess launched with shell=True
