Here is my code. It launches a subprocess, waits until it ends, and returns its stdout; if the timeout is reached first, it raises an exception. A typical use is print(Run('python --version').execute()).
from subprocess import Popen, PIPE, STDOUT
from threading import Thread

class Run(object):
    def __init__(self, cmd, timeout=2*60*60):
        self.cmd = cmd.split()
        self.timeout = timeout
        self._stdout = b''
        self.dt = 10
        self.p = None

    def execute(self):
        print("Execute command: {}".format(' '.join(self.cmd)))

        def target():
            self.p = Popen(self.cmd, stdout=PIPE, stderr=STDOUT)
            self._stdout = self.p.communicate()[0]

        thread = Thread(target=target)
        thread.start()
        t = 0
        while t < self.timeout:
            thread.join(self.dt)
            if thread.is_alive():
                t += self.dt
                print("Running for: {} seconds".format(t))
            else:
                ret_code = self.p.poll()
                if ret_code:
                    raise AssertionError("{} failed.\nretcode={}\nstdout:\n{}".format(
                        self.cmd, ret_code, self._stdout))
                return self._stdout
        else:  # while loop exhausted without the thread finishing: timeout
            print('Timeout {} reached, kill task, pid={}'.format(self.timeout, self.p.pid))
            self.p.terminate()
            thread.join()
            raise AssertionError("Timeout")
The problem is the following case. The process that I launch spawns more child processes. When the timeout is reached and I kill the main process (the one I started through my class) with self.p.terminate(), the children keep running, and my code hangs on the line self._stdout = self.p.communicate()[0]. Execution only continues once I manually kill all the child processes.
I tried a solution where, instead of self.p.terminate(), I kill the whole process tree.
This also does not work if the main process finished by itself and its children live on on their own; then I have no way to find and kill them, yet they still block self.p.communicate().
Is there a way to solve this effectively?
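On POSIX, one common way to handle both symptoms is to start the child as the leader of a new session, so every descendant stays in the same process group, and to signal the whole group on timeout; killing the group also kills the grandchildren holding the inherited pipe ends, so communicate() can return. A minimal sketch, assuming a Unix-like host and Python 3.3+ (for start_new_session and the communicate timeout):

import os
import signal
from subprocess import Popen, PIPE, STDOUT, TimeoutExpired

# Sketch: the child becomes its own process-group leader, so it and all
# of its descendants can be signalled together.
p = Popen(['python', '--version'], stdout=PIPE, stderr=STDOUT,
          start_new_session=True)
try:
    stdout = p.communicate(timeout=2*60*60)[0]
except TimeoutExpired:
    # Kill the entire group: the child and all of its descendants.
    os.killpg(os.getpgid(p.pid), signal.SIGKILL)
    stdout = p.communicate()[0]  # now returns: the pipe writers are gone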
You could use the ProcessWrapper from the PySys framework - it offers a lot of this functionality as an abstraction in a cross-platform way, i.e.
import sys, os
from pysys.constants import *
from pysys.process.helper import ProcessWrapper
from pysys.exceptions import ProcessTimeout

command = sys.executable
arguments = ['--version']
try:
    process = ProcessWrapper(command, arguments=arguments, environs=os.environ,
                             workingDir=os.getcwd(), stdout='stdout.log',
                             stderr='stderr.log', state=FOREGROUND, timeout=5.0)
    process.start()
except ProcessTimeout:
    print "Process timeout"
    process.stop()
It's at SourceForge (http://sourceforge.net/projects/pysys/files/ and http://pysys.sourceforge.net/) if of interest.
Related
I have a piece of python code that should spawn an interruptible task in a child process:
import os
import signal
import subprocess
import sys

class Script:
    '''
    Class instantiated by the parent process
    '''
    def __init__(self, *args, **kwargs):
        self.process = None
        self.start()

    def __del__(self):
        if self.process:
            if self.process.poll() is None:
                self.stop()

    def start(self):
        popen_kwargs = {
            'executable': sys.executable,
            'creationflags': 0 * subprocess.CREATE_DEFAULT_ERROR_MODE
                             | subprocess.CREATE_NEW_PROCESS_GROUP,
        }
        self.process = subprocess.Popen(['python', os.path.realpath(__file__)],
                                        **popen_kwargs)

    def stop(self):
        if not self.process:
            return
        try:
            self.process.send_signal(signal.CTRL_C_EVENT)
            self.process.wait()
            self.process = None
        except KeyboardInterrupt:
            pass

class ScriptSubprocess:
    def __init__(self):
        self.stop = False

    def run(self):
        try:
            while not self.stop:
                pass  # ... (actual work elided)
        except KeyboardInterrupt:
            # interrupted!
            pass
        finally:
            # make a *clean* exit if interrupted
            self.stop = True

if __name__ == '__main__':
    p = ScriptSubprocess()
    p.run()
    del p
and it works fine in a standalone Python interpreter.
The problem arises when I move this code in the real application, which has an embedded Python interpreter.
In this case, it hangs when trying to stop the child process at the line self.process.wait(). This indicates that the previous line, self.process.send_signal(signal.CTRL_C_EVENT), did not work and the child process is in fact still running; if I manually terminate it via Task Manager, the call to self.process.wait() returns as if it had succeeded in stopping the child process.
I am looking for possible causes (e.g. some process flag of the parent process) that disables CTRL_C_EVENT.
The documentation of subprocess says:
Popen.send_signal(signal)
Sends the signal signal to the child.
Do nothing if the process completed.
Note On Windows, SIGTERM is an alias for terminate(). CTRL_C_EVENT and
CTRL_BREAK_EVENT can be sent to processes started with a creationflags
parameter which includes CREATE_NEW_PROCESS_GROUP.
and also:
subprocess.CREATE_NEW_PROCESS_GROUP
A Popen creationflags parameter to specify that a new process group will be created. This flag is necessary for using os.kill() on the subprocess.
This flag is ignored if CREATE_NEW_CONSOLE is specified.
So I am using creationflags with subprocess.CREATE_NEW_PROCESS_GROUP, but it is still unable to kill the subprocess with CTRL_C_EVENT in the real application (same as without this flag).
Since the real application (i.e. the parent process) also uses SetConsoleCtrlHandler to handle certain signals, I also tried passing creationflags with subprocess.CREATE_DEFAULT_ERROR_MODE to override that error mode in the child process, but I am still unable to kill the child process with CTRL_C_EVENT.
Note: CTRL_BREAK_EVENT works but does not give a clean exit (i.e. the finally: clause is not executed).
My guess is that SetConsoleCtrlHandler is the culprit, but I have no means of avoiding that call in the parent process, or of undoing its effect...
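Since CTRL_BREAK_EVENT does get delivered, one hedged workaround is to have the child translate it into the same clean-exit path: on Windows, CTRL_BREAK_EVENT arrives in a Python child as signal.SIGBREAK, so a handler can reroute it onto KeyboardInterrupt. A sketch, untested in the embedded-interpreter setup:

import signal
import sys

# In the ScriptSubprocess interpreter: turn CTRL_BREAK_EVENT into the
# KeyboardInterrupt that run() already handles, so finally: still executes.
if sys.platform == 'win32':
    def _on_break(signum, frame):
        raise KeyboardInterrupt
    signal.signal(signal.SIGBREAK, _on_break)

The parent would then send self.process.send_signal(signal.CTRL_BREAK_EVENT) in stop() instead of CTRL_C_EVENT.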
My goal is to implement a Python 3 method that will support running a system command (using subprocess) while meeting a few requirements:
Running long lasting commands
Live logging of both stdout and stderr
Enforcing a timeout to stop the command if it fails to complete on time
In order to support live logging, I have used two threads that handle the stdout and stderr outputs respectively.
My challenge is to enforce the timeout on the threads and the subprocess process.
My attempt to implement the timeout using a signal handler seems to freeze the interpreter as soon as the handler is called.
What's wrong with my implementation?
Is there any other way to implement my requirements?
Here is my current implementation attempt:
import logging as log
import signal
import subprocess
from concurrent.futures import ThreadPoolExecutor
from io import StringIO

def run_live_output(cmd, timeout=900, **kwargs):
    full_output = StringIO()

    def log_popen_pipe(p, log_errors=False):
        while p.poll() is None:
            output = ''
            if log_errors:
                output = p.stderr.readline()
                log.warning(f"{output}")
            else:
                output = p.stdout.readline()
                log.info(f"{output}")
            full_output.write(output)
        if p.poll():
            log.error(f"{cmd}\n{p.stderr.readline()}")

    class MyTimeout(Exception):
        pass

    def handler(signum, frame):
        log.info(f"Signal handler called with signal {signum}")
        raise MyTimeout

    with subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            stdin=subprocess.PIPE,
            universal_newlines=True,
            **kwargs
    ) as sp:
        with ThreadPoolExecutor(2) as pool:
            try:
                signal.signal(signal.SIGALRM, handler)
                signal.alarm(timeout)
                r1 = pool.submit(log_popen_pipe, sp)
                r2 = pool.submit(log_popen_pipe, sp, log_errors=True)
                r1.result()
                r2.result()
            except MyTimeout:
                log.info("Timed out - Killing the threads and process")
                pool.shutdown(wait=True)
                sp.kill()
            except Exception as e:
                log.info(f"{e}")

    return full_output.getvalue()
Q-1) My attempt to implement the timeout using a signal handler seems to freeze the interpreter as soon as the handler is called. What's wrong with my implementation?
A-1) No, your signal handler is not freezing anything. There is a freeze, but it is not in the signal handler; the handler itself is fine. Your main thread blocks when you call pool.shutdown(wait=True), because the subprocess is still running and log_popen_pipe loops on while p.poll() is None:, so the main thread cannot continue until log_popen_pipe finishes.
To solve this, remove pool.shutdown(wait=True) and then call sp.terminate(). I suggest sp.terminate() instead of sp.kill(), because sp.kill() sends SIGKILL, which is best avoided unless you really need it. In addition, at the end of the with ThreadPoolExecutor(2) as pool: block, pool.shutdown(wait=True) is called implicitly, and that will not block once log_popen_pipe has ended.
In your case, log_popen_pipe finishes as soon as the subprocess does, which is exactly what sp.terminate() brings about.
Q-2) Is there any other way to implement my requirements?
A-2) Yes, there is: you can use the Timer class from the threading library. Timer creates one thread that waits timeout seconds, and at the end of those timeout seconds the thread calls sp.terminate.
Here is the code:
from io import StringIO
import signal
import subprocess
from concurrent.futures import ThreadPoolExecutor
import logging as log
from threading import Timer

log.root.setLevel(log.INFO)

def run_live_output(cmd, timeout=900, **kwargs):
    full_output = StringIO()

    def log_popen_pipe(p, log_errors=False):
        while p.poll() is None:
            output = ''
            if log_errors:
                output = p.stderr.readline()
                log.warning(f"{output}")
            else:
                output = p.stdout.readline()
                log.info(f"{output}")
            full_output.write(output)
        if p.poll() is not None:
            log.error(f"subprocess finished, {cmd}\n{p.stdout.readline()}")

    with subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            stdin=subprocess.PIPE,
            universal_newlines=True,
            **kwargs
    ) as sp:
        # After `timeout` seconds, a single extra thread calls sp.terminate().
        Timer(timeout, sp.terminate).start()
        with ThreadPoolExecutor(2) as pool:
            try:
                r1 = pool.submit(log_popen_pipe, sp)
                r2 = pool.submit(log_popen_pipe, sp, log_errors=True)
                r1.result()
                r2.result()
            except Exception as e:
                log.info(f"{e}")

    return full_output.getvalue()

run_live_output(["python3", "...."], timeout=4)
By the way, p.poll() returns the return code of the terminated subprocess. If you want to confirm that the subprocess terminated successfully, check p.poll() == 0; a return code of 0 generally means success.
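As a small illustration of that point (names taken from the code above):

rc = sp.poll()
if rc == 0:
    log.info("subprocess terminated successfully")
elif rc is not None:
    log.error(f"subprocess failed with return code {rc}")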
I'm struggling to find the best approach to running multiple OS commands in parallel and capturing their output. The OS command is a semi-long-running proprietary utility written in C (running on Solaris/Linux hosts, using Python 2.4). From a high level, this script pulls jobs from a job queue and instantiates a class for each job; the class then spawns the OS utility with the provided arguments. There is actually going to be a lot more to this class, but focusing on the overall architecture of the script, the omitted code is trivial in this context.
There are actually 2 points where I need the output from this OS command.
When the command is first executed, it returns a jobid, which I need to capture. The command then blocks until complete, after which I need to capture its return code.
What I really want to do (I think) is define a class which spawns a thread and then executes Popen().
Here is what I have now:
#!/usr/bin/python
import sys, subprocess, threading

cmd = "/path/to/utility -x arg1 -y arg2"

class Command(object):
    def __init__(self, cmd):
        self.cmd = cmd
        self.process = None
        self.returncode = None
        self.jobid = None

    def __call__(self):
        print "Starting job..."
        self.process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE,
                                        stderr=subprocess.PIPE, shell=True)
        out, err = self.process.communicate()
        self.jobid = out.split()[10]

    def alive(self):
        if self.process.poll():
            return True
        else:
            return False

    def getJobID(self):
        return self.jobid

job = Command(cmd)
jobt = threading.Thread(target=job, args=[])
jobt.start()

# if job.alive():
#     print "Job is still alive."
#     # do something
# else:
#     print "Job is not alive."
#     # do something else

sys.exit(0)
The problem here is that using p.communicate() causes the entire thread to block, so I can't get the jobid at the point I want to.
Also, if I uncomment the if statement, it complains that there is no method alive().
I've tried various variations of this, like creating the thread inside the __call__ method of the class, but that seemed like I was going down the wrong road.
I also tried specifying the class name as the target argument when spawning the thread:
jobt = threading.Thread(target=Command, args=[cmd])
jobt.start()
With every approach I used, I kept hitting roadblocks.
Thanks for any suggestions.
So after trying dano's idea, I now have this:
class Command(threading.Thread):
    def __init__(self, cmd):
        super(Command, self).__init__()
        self.cmd = cmd
        self.process = None
        self.returncode = None
        self.jobid = None

    def run(self):
        print "Starting job..."
        self.process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE,
                                        stderr=subprocess.PIPE, bufsize=0,
                                        shell=False)
        print "Getting job id..."
        out = self.process.stdout.readline()
        print "out=" + out
        self.returncode = self.process.wait()

    def alive(self):
        if self.process.poll():
            return True
        else:
            return False

    def getJobID(self):
        return self.jobid

job = Command(cmd)
job.start()
Which yields this following output:
Starting job...
Getting job id...
At this point it hangs until the OS command completes.
Here is an example of running this command manually. The first two lines of output return immediately.
$ /path/to/my/command -x arg1 -y arg2
Info: job request 1 (somestring) submitted; job id is 729.
Info: waiting for job completion
# here it hangs until the job is complete
Info: job 729 completed successfully
Thanks again for the help.
I think you could simplify things by having Command inherit from threading.Thread:
import sys
import subprocess
import threading

cmd = "/path/to/utility -x arg1 -y arg2"

class Command(threading.Thread):
    def __init__(self, cmd):
        super(Command, self).__init__()
        self.cmd = cmd
        self.process = None
        self.returncode = None
        self.jobid = None

    def run(self):
        print "Starting job..."
        self.process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE,
                                        stderr=subprocess.PIPE, shell=True)
        out, err = self.process.communicate()
        self.jobid = out.split()[10]

    def alive(self):
        if self.process.poll():
            return True
        else:
            return False

    def getJobID(self):
        return self.jobid

job = Command(cmd)
job.start()

if job.alive():
    print "Job is still alive."
else:
    print "Job is not alive."

sys.exit(0)
You can't use self.process.communicate() to get the job id prior to the command actually exiting, because communicate() will block until the program completes. Instead, you'd need to read directly from the process's stdout:
self.process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, bufsize=0, shell=True)
out = self.process.stdout.readline()
self.jobid = out.split()[10]
Note that bufsize=0 is added to try to keep the subprocess from buffering its output, which could otherwise make readline block.
Then you can call communicate or wait to wait for the process to end:
self.returncode = self.process.wait()
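Putting the pieces together, usage might look like this (a sketch of mine based on the readline variant above; the 0.1-second polling interval is arbitrary):

import time

job = Command(cmd)
job.start()
# The job id is parsed from the first line of output, so poll briefly for it.
while job.getJobID() is None and job.isAlive():
    time.sleep(0.1)
print "jobid =", job.getJobID()
job.join()   # block until the utility completes
print "returncode =", job.returncode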
I would like to repeatedly execute a subprocess as fast as possible. However, sometimes the process will take too long, so I want to kill it.
I use signal.signal(...) like below:
ppid = pipeexe.pid
signal.signal(signal.SIGALRM, stop_handler)
signal.alarm(1)
.....

def stop_handler(signal, frame):
    print 'Stop test' + testdir + 'for time out'
    if pipeexe.poll() == None and hasattr(signal, "SIGKILL"):
        os.kill(ppid, signal.SIGKILL)
        return False
but sometimes this code will stop the next round from executing:
Stop test/home/lu/workspace/152/treefit/test2for time out
/bin/sh: /home/lu/workspace/153/squib_driver: not found ---this is the next execution; the program wrongly stops it.
Does anyone know how to solve this? I also want to stop the process right when the timeout hits, not after a fixed sleep: time.sleep(n) always waits the full n seconds, and I don't want that, because the command may finish in less than 1 second.
You could do something like this:
import subprocess as sub
import threading

class RunCmd(threading.Thread):
    def __init__(self, cmd, timeout):
        threading.Thread.__init__(self)
        self.cmd = cmd
        self.timeout = timeout

    def run(self):
        self.p = sub.Popen(self.cmd)
        self.p.wait()

    def Run(self):
        self.start()
        self.join(self.timeout)

        if self.is_alive():
            self.p.terminate()  # use self.p.kill() if process needs a kill -9
            self.join()

RunCmd(["./someProg", "arg1"], 60).Run()
The idea is that you create a thread that runs the command, and kill the command if the timeout exceeds some suitable value, in this case 60 seconds.
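If you also need the command's output (the version above just lets it go to the parent's stdout), a hedged tweak of my own is to pipe and collect it in run():

class RunCmdCaptured(RunCmd):
    def run(self):
        # Sketch: same pattern, but additionally capture stdout/stderr.
        self.p = sub.Popen(self.cmd, stdout=sub.PIPE, stderr=sub.STDOUT)
        self.out, _ = self.p.communicate()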
Here is something I wrote as a watchdog for subprocess execution. I use it a lot now, but I'm not so experienced, so maybe there are some flaws in it:
import subprocess
import time

def subprocess_execute(command, time_out=60):
    """executing the command with a watchdog"""

    # launching the command
    c = subprocess.Popen(command)

    # now waiting for the command to complete
    t = 0
    while t < time_out and c.poll() is None:
        time.sleep(1)  # (comment 1)
        t += 1

    # there are two possibilities for the while to have stopped:
    if c.poll() is None:
        # in the case the process did not complete, we kill it
        c.terminate()
        # and fill the return code with some error value
        returncode = -1  # (comment 2)
    else:
        # in the case the process completed normally
        returncode = c.poll()

    return returncode
Usage:
returncode = subprocess_execute(['java', '-jar', 'some.jar'])
Comments:
here, the watchdog timeout is in seconds, but it's easy to change it to whatever is needed by changing the time.sleep() value; the meaning of time_out will have to be documented accordingly;
depending on what is needed, it may be more suitable here to raise some exception, as sketched below.
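For instance, the timeout branch could look like this (a sketch; the exception type is a matter of taste):

if c.poll() is None:
    c.terminate()
    raise RuntimeError("command %r timed out after %d seconds" % (command, time_out))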
Documentation: I struggled a bit with the documentation of the subprocess module before understanding that subprocess.Popen is not blocking; the process is executed in parallel (maybe I do not use the correct word here, but I think it's understandable).
But as what I wrote is linear in its execution, I really have to wait for the command to complete, with a timeout so that bugs in the command cannot stall the nightly execution of the script.
I guess this is a common synchronization problem in event-oriented programming with threads and processes.
If you should always have only one subprocess running, make sure the current subprocess is killed before running the next one. Otherwise the signal handler may get a reference to the last subprocess run and ignore the older one.
Suppose subprocess A is running. Before the alarm signal is handled, subprocess B is launched. Just after that, your alarm signal handler attempts to kill a subprocess. Since the current PID (or the current subprocess pipe object) was set to B's when launching that subprocess, B gets killed and A keeps running.
Is my guess correct?
To make your code easier to understand, I would put the part that creates a new subprocess just after the part that kills the current subprocess. That would make it clear there is only one subprocess running at any time. The signal handler could do both the subprocess killing and launching, as if it were the iteration block that runs in a loop, in this case event-driven by the alarm signal every second.
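A sketch of that ordering (launch_next_test is a hypothetical placeholder for whatever starts the next round):

import os
import signal

def stop_handler(signum, frame):
    global pipeexe
    # Kill *and reap* the current subprocess before launching the next one.
    if pipeexe is not None and pipeexe.poll() is None:
        os.kill(pipeexe.pid, signal.SIGKILL)
        pipeexe.wait()
    pipeexe = launch_next_test()  # hypothetical: start the next round here
    signal.alarm(1)               # re-arm the watchdog for the new subprocess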
Here's what I use:
import os
import signal
import subprocess
import threading

class KillerThread(threading.Thread):
    def __init__(self, pid, timeout, event):
        threading.Thread.__init__(self)
        self.pid = pid
        self.timeout = timeout
        self.event = event
        self.setDaemon(True)

    def run(self):
        self.event.wait(self.timeout)
        if not self.event.isSet():
            try:
                os.kill(self.pid, signal.SIGKILL)
            except OSError, e:
                # This is raised if the process has already completed
                pass

def runTimed(dt, dir, args, kwargs):
    event = threading.Event()
    cwd = os.getcwd()
    os.chdir(dir)
    proc = subprocess.Popen(args, **kwargs)
    os.chdir(cwd)

    killer = KillerThread(proc.pid, dt, event)
    killer.start()

    (stdout, stderr) = proc.communicate()
    event.set()

    return (stdout, stderr, proc.returncode)
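Usage might look like this (a sketch of mine; the command and directory are placeholders):

out, err, rc = runTimed(10.0, '/tmp', ['sleep', '60'],
                        {'stdout': subprocess.PIPE, 'stderr': subprocess.PIPE})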
A bit more complex, I added an answer to solve a similar problem: Capturing stdout, feeding stdin, and being able to terminate after some time of inactivity and/or after some overall runtime.
I run a subprocess using:
p = subprocess.Popen("subprocess",
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
stdin=subprocess.PIPE)
This subprocess could either exit immediately with an error on stderr, or keep running. I want to detect either of these conditions - the latter by waiting for several seconds.
I tried this:
SECONDS_TO_WAIT = 10
select.select([],
[p.stdout, p.stderr],
[p.stdout, p.stderr],
SECONDS_TO_WAIT)
but it just returns:
([],[],[])
on either condition. What can I do?
Have you tried using the Popen.poll() method? You could just do this:
p = subprocess.Popen("subprocess",
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdin=subprocess.PIPE)
time.sleep(SECONDS_TO_WAIT)
retcode = p.poll()
if retcode is not None:
    # process has terminated
    pass
This will cause you to always wait 10 seconds, but if the failure case is rare this would be amortized over all the success cases.
Edit:
How about:
t_nought = time.time()
seconds_passed = 0
while p.poll() is None and seconds_passed < 10:
    seconds_passed = time.time() - t_nought
if seconds_passed >= 10:
    # TIMED OUT
    pass
This has the ugliness of being a busy wait, but I think it accomplishes what you want.
Additionally, looking at the select call documentation again, I think you may want to change it as follows:
SECONDS_TO_WAIT = 10
select.select([p.stderr],
              [],
              [p.stdout, p.stderr],
              SECONDS_TO_WAIT)
Since you would typically want to read from stderr, you want to know when it has something available to read (i.e. the failure case).
I hope this helps.
This is what I came up with. It works whether or not you need to time out the process, but with a semi-busy loop.
import os
import signal
import subprocess
import time

def runCmd(cmd, timeout=None):
    '''
    Will execute a command, read the output and return it back.

    @param cmd: command to execute
    @param timeout: process timeout in seconds
    @return: a tuple of three: first stdout, then stderr, then exit code
    @raise OSError: on missing command or if a timeout was reached
    '''
    ph_out = None  # process output
    ph_err = None  # stderr
    ph_ret = None  # return code

    p = subprocess.Popen(cmd, shell=True,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    # if timeout is not set wait for process to complete
    if not timeout:
        ph_ret = p.wait()
    else:
        fin_time = time.time() + timeout
        while p.poll() is None and fin_time > time.time():
            time.sleep(1)

        # if timeout reached, raise an exception
        if fin_time < time.time():
            # starting with 2.6 subprocess has a kill() method which is preferable
            # p.kill()
            os.kill(p.pid, signal.SIGKILL)
            raise OSError("Process timeout has been reached")

        ph_ret = p.returncode

    ph_out, ph_err = p.communicate()

    return (ph_out, ph_err, ph_ret)
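A usage sketch (my addition; the command is a placeholder):

out, err, rc = runCmd("sleep 2 && echo done", timeout=5)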
Here is a nice example:
from threading import Timer
from subprocess import Popen, PIPE
proc = Popen("ping 127.0.0.1", shell=True)
t = Timer(60, proc.kill)
t.start()
proc.wait()
Using select and sleeping doesn't really make much sense. select (or any kernel polling mechanism) is inherently useful for asynchronous programming, but your example is synchronous. So either rewrite your code to use the normal blocking fashion or consider using Twisted:
from twisted.internet.utils import getProcessOutputAndValue
from twisted.internet import reactor

def stop(r):
    reactor.stop()

def eb(reason):
    reason.printTraceback()

def cb(result):
    stdout, stderr, exitcode = result
    # do something
    pass

getProcessOutputAndValue('/bin/someproc', []
    ).addCallback(cb).addErrback(eb).addBoth(stop)

reactor.run()
Incidentally, there is a safer way of doing this with Twisted by writing your own ProcessProtocol:
http://twistedmatrix.com/projects/core/documentation/howto/process.html
Python 3.3
import subprocess as sp

try:
    sp.check_call(["/subprocess"], timeout=10,
                  stdin=sp.DEVNULL, stdout=sp.DEVNULL, stderr=sp.DEVNULL)
except sp.TimeoutExpired:
    # timeout (the subprocess is killed at this point)
    pass
except sp.CalledProcessError:
    # subprocess failed before timeout
    pass
else:
    # subprocess ended successfully before timeout
    pass
See TimeoutExpired docs.
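On Python 3.7+, the same idea with output capture might look like this (a sketch):

import subprocess as sp

try:
    result = sp.run(["/subprocess"], timeout=10,
                    capture_output=True, check=True)
except sp.TimeoutExpired:
    pass  # timed out; the subprocess has been killed
except sp.CalledProcessError as e:
    print(e.returncode, e.stderr)
else:
    print(result.stdout)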
If, as you said in the comments above, you're just tweaking the output each time and re-running the command, would something like the following work?
from threading import Timer
import subprocess

WAIT_TIME = 10.0

def check_cmd(cmd):
    p = subprocess.Popen(cmd,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)

    def _check():
        if p.poll() is None:  # still running after WAIT_TIME seconds
            print cmd + " did not quit within the given time period."

    # check whether the given process has exited WAIT_TIME
    # seconds from now
    Timer(WAIT_TIME, _check).start()

check_cmd('echo')
check_cmd('python')
The code above, when run, outputs:
python did not quit within the given time period.
The only downside of the above code that I can think of is the potentially overlapping processes as you keep running check_cmd.
This is a paraphrase of Evan's answer, but it takes the following into account:
Explicitly canceling the Timer object: if the Timer interval were long and the process exited by its "own will", this could hang your script :(
There is an intrinsic race in the Timer approach (the timer may attempt to kill the process just after the process has died, and on Windows this will raise an exception).
import os
from subprocess import Popen
from threading import Timer

DEVNULL = open(os.devnull, "wb")
process = Popen("c:/myExe.exe", stdout=DEVNULL)  # no need for stdout

def kill_process():
    """ Kill process helper """
    try:
        process.kill()
    except OSError:
        pass  # Swallow the error

timer = Timer(timeout_in_sec, kill_process)
timer.start()
process.wait()
timer.cancel()