Kill or terminate subprocess when timeout? - python

I would like to repeatedly execute a subprocess as fast as possible. However, sometimes the process will take too long, so I want to kill it.
I use signal.signal(...) like below:
ppid=pipeexe.pid
signal.signal(signal.SIGALRM, stop_handler)
signal.alarm(1)
.....
def stop_handler(signal, frame):
print 'Stop test'+testdir+'for time out'
if(pipeexe.poll()==None and hasattr(signal, "SIGKILL")):
os.kill(ppid, signal.SIGKILL)
return False
but sometime this code will try to stop the next round from executing.
Stop test/home/lu/workspace/152/treefit/test2for time out
/bin/sh: /home/lu/workspace/153/squib_driver: not found ---this is the next execution; the program wrongly stops it.
Does anyone know how to solve this? I want to stop in time not execute 1 second the time.sleep(n) often wait n seconds. I do not want that I want it can execute less than 1 second

You could do something like this:
import subprocess as sub
import threading
class RunCmd(threading.Thread):
def __init__(self, cmd, timeout):
threading.Thread.__init__(self)
self.cmd = cmd
self.timeout = timeout
def run(self):
self.p = sub.Popen(self.cmd)
self.p.wait()
def Run(self):
self.start()
self.join(self.timeout)
if self.is_alive():
self.p.terminate() #use self.p.kill() if process needs a kill -9
self.join()
RunCmd(["./someProg", "arg1"], 60).Run()
The idea is that you create a thread that runs the command and to kill it if the timeout exceeds some suitable value, in this case 60 seconds.

Here is something I wrote as a watchdog for subprocess execution. I use it now a lot, but I'm not so experienced so maybe there are some flaws in it:
import subprocess
import time
def subprocess_execute(command, time_out=60):
"""executing the command with a watchdog"""
# launching the command
c = subprocess.Popen(command)
# now waiting for the command to complete
t = 0
while t < time_out and c.poll() is None:
time.sleep(1) # (comment 1)
t += 1
# there are two possibilities for the while to have stopped:
if c.poll() is None:
# in the case the process did not complete, we kill it
c.terminate()
# and fill the return code with some error value
returncode = -1 # (comment 2)
else:
# in the case the process completed normally
returncode = c.poll()
return returncode
Usage:
return = subprocess_execute(['java', '-jar', 'some.jar'])
Comments:
here, the watchdog time out is in seconds; but it's easy to change to whatever needed by changing the time.sleep() value. The time_out will have to be documented accordingly;
according to what is needed, here it maybe more suitable to raise some exception.
Documentation: I struggled a bit with the documentation of subprocess module to understand that subprocess.Popen is not blocking; the process is executed in parallel (maybe I do not use the correct word here, but I think it's understandable).
But as what I wrote is linear in its execution, I really have to wait for the command to complete, with a time out to avoid bugs in the command to pause the nightly execution of the script.

I guess this is a common synchronization problem in event-oriented programming with threads and processes.
If you should always have only one subprocess running, make sure the current subprocess is killed before running the next one. Otherwise the signal handler may get a reference to the last subprocess run and ignore the older.
Suppose subprocess A is running. Before the alarm signal is handled, subprocess B is launched. Just after that, your alarm signal handler attempts to kill a subprocess. As the current PID (or the current subprocess pipe object) was set to B's when launching the subprocess, B gets killed and A keeps running.
Is my guess correct?
To make your code easier to understand, I would include the part that creates a new subprocess just after the part that kills the current subprocess. That would make clear there is only one subprocess running at any time. The signal handler could do both the subprocess killing and launching, as if it was the iteration block that runs in a loop, in this case event-driven with the alarm signal every 1 second.

Here's what I use:
class KillerThread(threading.Thread):
def __init__(self, pid, timeout, event ):
threading.Thread.__init__(self)
self.pid = pid
self.timeout = timeout
self.event = event
self.setDaemon(True)
def run(self):
self.event.wait(self.timeout)
if not self.event.isSet() :
try:
os.kill( self.pid, signal.SIGKILL )
except OSError, e:
#This is raised if the process has already completed
pass
def runTimed(dt, dir, args, kwargs ):
event = threading.Event()
cwd = os.getcwd()
os.chdir(dir)
proc = subprocess.Popen(args, **kwargs )
os.chdir(cwd)
killer = KillerThread(proc.pid, dt, event)
killer.start()
(stdout, stderr) = proc.communicate()
event.set()
return (stdout,stderr, proc.returncode)

A bit more complex, I added an answer to solve a similar problem: Capturing stdout, feeding stdin, and being able to terminate after some time of inactivity and/or after some overall runtime.

Related

How to keep ROS publisher publishing while executing a subprocess callback?

How can I keep the ROS Publisher publishing the messages while calling a sub-process:
import subprocess
import rospy
class Pub():
def __init__(self):
pass
def updateState(self, msg):
cmd = ['python3', planner_path, "--alias", search_options, "--plan-file", plan_path, domain_path, problem_path]
subprocess.run(cmd, shell=False, stdout=subprocess.PIPE)
self.plan_pub.publish(msg)
def myPub(self):
rospy.init_node('problem_formulator', anonymous=True)
self.plan_pub = rospy.Publisher("plan", String, queue_size=10)
rate = rospy.Rate(10) # 10hz
rospy.Subscriber('model', String, updateState)
rospy.sleep(1)
rospy.spin()
if __name__ == "__main__":
p_ = Pub()
p_.myPub()
Since subprocess.call is a blocking call your subscription callback may take a long time.
Run the command described by args. Wait for command to complete, then return the returncode attribute.
ROS itself will not call the callback again while it is executed already. This means you are blocking this and potentially also other callbacks to be called in time.
The most simple solution would be to replace subprocess.call by subprocess.Popen which
Execute a child program in a new process
nonblocking.
But keep in mind that this potentially starts the process multiple times quite fast.
Think about starting the process only conditionally if not already running. This can be achieved by checking the process to be finished in another thread. Simple but effective, use boolean flag. Here is a small prototype:
def updateState(self, msg):
#Start the process if not already running
if not self._process_running:
p = subprocess.Popen(...)
self._process_running = True
def wait_process():
while p.poll() is None:
time.sleep(0.1)
self._process_running = False
threading.Thread(target=wait_process).start()
#Other callback code
self.plan_pub.publish(msg)

How to kill a process reliably while testing?

I have a couple of different scripts that require opening a MongoDB instance that go something like this:
mongod = Popen(
["mongod", "--dbpath", '/path/to/db'],
)
#Do some stuff
mongod.terminate()
And this works great when the code I'm executing works, but while I'm tinkering, errors inevitably arise. Then the Mongod instance remains running, and the next time I attempt to run the script, it detects that and doesn't open a new one.
I can terminate the process from the command line, but this is somewhat tedious. Or I can wrap everything in a try loop, but for some of the scripts, I have to do this a bunch, since every function depends on every other one. Is there a more elegant way to force close the process even in the event of an error somewhere else in the code?
EDIT: Did some testing based on tdelaney's comment, it looks like when I run these scripts in Sublime text and en error is generated, the script doesn't actually finish - it hits the error and then waits with the mongod instance open... i think. Once I kill the process in the terminal, sublime text tells me "finished in X seconds with exit code1"
EDIT2: On Kirby's suggestion, tried:
def testing():
mongod = Popen(
["mongod", "--dbpath", '/Users/KBLaptop/computation/db/'],
)
#Stuff that generates error
mongod.terminate()
def cleanup():
for proc in subprocess._active[:]:
try: proc.terminate()
except: pass
atexit.register(cleanup)
testing()
The error in testing() seems to prevent anything from continuing, so the atexit never registers and the process keeps running. Am I missing something obvious?
If you're running under CPython, you can cheat and take advantage of Python's destructors:
class PopenWrapper(object):
def __del__(self):
if self._child_created:
self.terminate()
This is slightly ucky, though. My preference would be to atexit:
import atexit
mongod = Popen(...)
def cleanup():
for proc in subprocess._active[:]:
try: proc.terminate()
except: pass
atexit.register(cleanup)
Still slightly hack-ish, though.
EDIT: Try this:
from subprocess import Popen
import atexit
started = []
def auto_popen(*args, **kw):
p = Popen(*args, **kw)
started.append(p)
return p
def testing():
mongod = auto_popen(['blah blah'], shell=True)
assert 0
#Stuff that generates error
mongod.terminate()
def cleanup():
for proc in started:
if proc.poll() is None:
try: proc.kill()
except: pass
atexit.register(cleanup)
testing()

Timeouting subprocesses in Python

def adbshell(command, serial=None, adbpath='adb'):
args = [adbpath]
if serial is not None:
args.extend(['-s', serial])
args.extend(['shell', command])
return subprocess.check_output(args)
def pmpath(serial=None, adbpath='adb'):
return adbshell('am instrument -e class............', serial=serial, adbpath=adbpath)
I have to run this test for a specific time period, and then exit if it is not working. How do I provide a timeout?
Depending which Python version you are running.
Python 3.3 onwards:
subprocess.check_output() provides a timeout param. Check the signature here
subprocess.check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False, timeout=None)
Below Python 3.3:
You can use threading module. Something like:
def run(args, timeout):
def target():
print 'Start thread'
subprocess.check_output(args)
print 'End thread'
thread = threading.Thread(target=target)
thread.start() # Start executing the target()
thread.join(timeout) # Join the thread after specified timeout
Note - I haven't tested the code above with threading and check_output(). Normally I use the subprocess.Popen() which offers more flexibility and handles almost all scenarios. Check the doc
The Popen constructure provides more flexiblity, as it can be used to check the exit status of the subprocess call.
The Popen.poll returns None if the process has not terminated yet. Hence call the subrprocess, sleep for the time required time out.
consider a simple test.py which is the subprocess called from the main program.
import time
for i in range(10):
print i
time.sleep(2)
The test.py is called from another program using the subprocess.Popen
from subprocess import Popen, PIPE
import time
cmd = Popen(['python','test.py'],stdout=PIPE)
print cmd.poll()
time.sleep(2)
if cmd.poll()== None:
print "killing"
cmd.terminate()
time.sleep(2)
provides a time out of 2 seconds, so that the program can excecute.
checks the exit status of the process using Popen.poll
if None, the process has not terminated, kills the process.

How to kill a python child process created with subprocess.check_output() when the parent dies?

I am running on a linux machine a python script which creates a child process using subprocess.check_output() as it follows:
subprocess.check_output(["ls", "-l"], stderr=subprocess.STDOUT)
The problem is that even if the parent process dies, the child is still running.
Is there any way I can kill the child process as well when the parent dies?
Yes, you can achieve this by two methods. Both of them require you to use Popen instead of check_output. The first is a simpler method, using try..finally, as follows:
from contextlib import contextmanager
#contextmanager
def run_and_terminate_process(*args, **kwargs):
try:
p = subprocess.Popen(*args, **kwargs)
yield p
finally:
p.terminate() # send sigterm, or ...
p.kill() # send sigkill
def main():
with run_and_terminate_process(args) as running_proc:
# Your code here, such as running_proc.stdout.readline()
This will catch sigint (keyboard interrupt) and sigterm, but not sigkill (if you kill your script with -9).
The other method is a bit more complex, and uses ctypes' prctl PR_SET_PDEATHSIG. The system will send a signal to the child once the parent exits for any reason (even sigkill).
import signal
import ctypes
libc = ctypes.CDLL("libc.so.6")
def set_pdeathsig(sig = signal.SIGTERM):
def callable():
return libc.prctl(1, sig)
return callable
p = subprocess.Popen(args, preexec_fn = set_pdeathsig(signal.SIGTERM))
Your problem is with using subprocess.check_output - you are correct, you can't get the child PID using that interface. Use Popen instead:
proc = subprocess.Popen(["ls", "-l"], stdout=PIPE, stderr=PIPE)
# Here you can get the PID
global child_pid
child_pid = proc.pid
# Now we can wait for the child to complete
(output, error) = proc.communicate()
if error:
print "error:", error
print "output:", output
To make sure you kill the child on exit:
import os
import signal
def kill_child():
if child_pid is None:
pass
else:
os.kill(child_pid, signal.SIGTERM)
import atexit
atexit.register(kill_child)
Don't know the specifics, but the best way is still to catch errors (and perhaps even all errors) with signal and terminate any remaining processes there.
import signal
import sys
import subprocess
import os
def signal_handler(signal, frame):
sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)
a = subprocess.check_output(["ls", "-l"], stderr=subprocess.STDOUT)
while 1:
pass # Press Ctrl-C (breaks the application and is catched by signal_handler()
This is just a mockup, you'd need to catch more than just SIGINT but the idea might get you started and you'd need to check for spawned process somehow still.
http://docs.python.org/2/library/os.html#os.kill
http://docs.python.org/2/library/subprocess.html#subprocess.Popen.pid
http://docs.python.org/2/library/subprocess.html#subprocess.Popen.kill
I'd recommend rewriting a personalized version of check_output cause as i just realized check_output is really just for simple debugging etc since you can't interact so much with it during executing..
Rewrite check_output:
from subprocess import Popen, PIPE, STDOUT
from time import sleep, time
def checkOutput(cmd):
a = Popen('ls -l', shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
print(a.pid)
start = time()
while a.poll() == None or time()-start <= 30: #30 sec grace period
sleep(0.25)
if a.poll() == None:
print('Still running, killing')
a.kill()
else:
print('exit code:',a.poll())
output = a.stdout.read()
a.stdout.close()
a.stdin.close()
return output
And do whatever you'd like with it, perhaps store the active executions in a temporary variable and kill them upon exit with signal or other means of intecepting errors/shutdowns of the main loop.
In the end, you still need to catch terminations in the main application in order to safely kill any childs, the best way to approach this is with try & except or signal.
As of Python 3.2 there is a ridiculously simple way to do this:
from subprocess import Popen
with Popen(["sleep", "60"]) as process:
print(f"Just launched server with PID {process.pid}")
I think this will be best for most use cases because it's simple and portable, and it avoids any dependence on global state.
If this solution isn't powerful enough, then I would recommend checking out the other answers and discussion on this question or on Python: how to kill child process(es) when parent dies?, as there are a lot of neat ways to approach the problem that provide different trade-offs around portability, resilience, and simplicity. 😊
Manually you could do this:
ps aux | grep <process name>
get the PID(second column) and
kill -9 <PID>
-9 is to force killing it

Run a process and kill it if it doesn't end within one hour

I need to do the following in Python. I want to spawn a process (subprocess module?), and:
if the process ends normally, to continue exactly from the moment it terminates;
if, otherwise, the process "gets stuck" and doesn't terminate within (say) one hour, to kill it and continue (possibly giving it another try, in a loop).
What is the most elegant way to accomplish this?
The subprocess module will be your friend. Start the process to get a Popen object, then pass it to a function like this. Note that this only raises exception on timeout. If desired you can catch the exception and call the kill() method on the Popen process. (kill is new in Python 2.6, btw)
import time
def wait_timeout(proc, seconds):
"""Wait for a process to finish, or raise exception after timeout"""
start = time.time()
end = start + seconds
interval = min(seconds / 1000.0, .25)
while True:
result = proc.poll()
if result is not None:
return result
if time.time() >= end:
raise RuntimeError("Process timed out")
time.sleep(interval)
There are at least 2 ways to do this by using psutil as long as you know the process PID.
Assuming the process is created as such:
import subprocess
subp = subprocess.Popen(['progname'])
...you can get its creation time in a busy loop like this:
import psutil, time
TIMEOUT = 60 * 60 # 1 hour
p = psutil.Process(subp.pid)
while 1:
if (time.time() - p.create_time()) > TIMEOUT:
p.kill()
raise RuntimeError('timeout')
time.sleep(5)
...or simply, you can do this:
import psutil
p = psutil.Process(subp.pid)
try:
p.wait(timeout=60*60)
except psutil.TimeoutExpired:
p.kill()
raise
Also, while you're at it, you might be interested in the following extra APIs:
>>> p.status()
'running'
>>> p.is_running()
True
>>>
I had a similar question and found this answer. Just for completeness, I want to add one more way how to terminate a hanging process after a given amount of time: The python signal library
https://docs.python.org/2/library/signal.html
From the documentation:
import signal, os
def handler(signum, frame):
print 'Signal handler called with signal', signum
raise IOError("Couldn't open device!")
# Set the signal handler and a 5-second alarm
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)
# This open() may hang indefinitely
fd = os.open('/dev/ttyS0', os.O_RDWR)
signal.alarm(0) # Disable the alarm
Since you wanted to spawn a new process anyways, this might not be the best soloution for your problem, though.
A nice, passive, way is also by using a threading.Timer and setting up callback function.
from threading import Timer
# execute the command
p = subprocess.Popen(command)
# save the proc object - either if you make this onto class (like the example), or 'p' can be global
self.p == p
# config and init timer
# kill_proc is a callback function which can also be added onto class or simply a global
t = Timer(seconds, self.kill_proc)
# start timer
t.start()
# wait for the test process to return
rcode = p.wait()
t.cancel()
If the process finishes in time, wait() ends and code continues here, cancel() stops the timer. If meanwhile the timer runs out and executes kill_proc in a separate thread, wait() will also continue here and cancel() will do nothing. By the value of rcode you will know if we've timeouted or not. Simplest kill_proc: (you can of course do anything extra there)
def kill_proc(self):
os.kill(self.p, signal.SIGTERM)
Koodos to Peter Shinners for his nice suggestion about subprocess module. I was using exec() before and did not have any control on running time and especially terminating it. My simplest template for this kind of task is the following and I am just using the timeout parameter of subprocess.run() function to monitor the running time. Of course you can get standard out and error as well if needed:
from subprocess import run, TimeoutExpired, CalledProcessError
for file in fls:
try:
run(["python3.7", file], check=True, timeout=7200) # 2 hours timeout
print("scraped :)", file)
except TimeoutExpired:
message = "Timeout :( !!!"
print(message, file)
f.write("{message} {file}\n".format(file=file, message=message))
except CalledProcessError:
message = "SOMETHING HAPPENED :( !!!, CHECK"
print(message, file)
f.write("{message} {file}\n".format(file=file, message=message))

Categories