Poll subprocess finished while looping stdout - python

I'm running a script that produces an unpredictable amount of output, and I want to know from inside the read loop when the script has finished.
This is the code:
#!/usr/bin/env python3
import subprocess
import shlex
def main():
    cmd = 'bash -c "for i in $(seq 1 15);do echo $i ;sleep 1;done"'
    print(cmd)
    p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE,
                         universal_newlines=True)
    for line in p.stdout:
        print(f"file_name: {line.strip()}")
        print(p.poll())
if __name__ == "__main__":
    main()
The p.poll() is always None even in the last iteration, and it makes sense because after echo it sleeps for 1 second before moving to the next iteration and finishes.
Any way of making it work?

You have already identified the problem: after the subprocess writes its last line it keeps running for one more second, so while the program is in the loop the subprocess will always be seen as still running. Even if you move the call to poll outside the loop, you may have to wait a bit to give the subprocess a chance to terminate after its final output (I have reduced the loop size; life is too short):
#!/usr/bin/env python3
import subprocess
import shlex
import time
def main():
    cmd = 'bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"'
    print(cmd)
    p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE, universal_newlines=True)
    for line in p.stdout:
        print(f"file_name: {line.strip()}", flush=True)
    print(p.poll())
    time.sleep(.1)
    print(p.poll())
if __name__ == "__main__":
    main()
Prints:
bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"
file_name: 1
file_name: 2
file_name: 3
file_name: 4
file_name: 5
None
0
To "get it to work" inside the loop would require special knowledge of what's going on inside the subprocess. Based on the previous piece of code, we would need:
#!/usr/bin/env python3
import subprocess
import shlex
import time
def main():
    cmd = 'bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"'
    print(cmd)
    p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE, universal_newlines=True)
    for line in p.stdout:
        # has to be greater than the sleep time in the subprocess to give the subprocess a chance to terminate
        print(f"file_name: {line.strip()}", flush=True)
        time.sleep(1.1)
        print(p.poll())
if __name__ == "__main__":
    main()
Prints:
bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"
file_name: 1
None
file_name: 2
None
file_name: 3
None
file_name: 4
None
file_name: 5
0
But this is hardly a practical solution. One has to ask what the reason for this polling is: it offers no useful information unless you are willing to add sleep calls after your reads, because there will always be some delay between the subprocess's last write and its termination, and these sleep calls are generally wasteful. You should just read until there is no more output and then call p.wait() to wait for the subprocess to terminate, but it's your choice:
#!/usr/bin/env python3
import subprocess
import shlex
def main():
    cmd = 'bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"'
    print(cmd)
    p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE, universal_newlines=True)
    for line in p.stdout:
        print(f"file_name: {line.strip()}", flush=True)
    p.wait()
if __name__ == "__main__":
    main()
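As a rough sketch of the same idea in reusable form (the helper name run_and_collect and the small Python child used in place of the bash loop are my own, just for illustration): the for loop ends when the child closes its stdout, and the value returned by p.wait() is the point at which you know for certain that the process has finished.

```python
import subprocess
import sys

# Hypothetical child used for illustration: prints three lines,
# then lingers briefly before exiting (like the sleep in the bash loop).
CHILD = "import time\nfor i in range(3):\n    print(i, flush=True)\ntime.sleep(0.2)\n"

def run_and_collect(cmd):
    """Read the child's stdout until EOF, then reap it.

    Returns (list of stripped lines, return code).
    """
    lines = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                          universal_newlines=True) as p:
        for line in p.stdout:        # ends when the child closes its stdout
            lines.append(line.strip())
        returncode = p.wait()        # blocks until the child has really exited
    return lines, returncode

if __name__ == "__main__":
    lines, rc = run_and_collect([sys.executable, "-c", CHILD])
    print(lines, rc)
```

Note that EOF on stdout and process exit are two separate events, which is why poll() inside the loop can still report None on the last line.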

Related

subprocess: start process in background and start another in one call

This program should echo the pid of sleep immediately:
import subprocess
subprocess.check_output("sleep 1 & echo $!", shell=True)
Running this on the shell directly, it immediately prints the pid, but running it in python, the & is ignored and it takes 1 second before echo is executed.
How can I get this to work with only one execution of check_output (or another function of subprocess)?
(This is a simplified example, in reality instead of sleep 1 I'd put my own executable)
check_output waits for the output pipes to close, and the backgrounded sleep holds them open too. You can redirect its output to /dev/null for an immediate return:
subprocess.check_output("sleep 1 >/dev/null 2>&1 & echo $!", shell=True)
UPDATE
It's hard to tell whether sleep 1 really did run in the background, so I wrote a slightly larger test.
test.py - writes time to stdout for 5 seconds
import time
for i in range(5):
    print(time.strftime('%H:%M:%S'), flush=True)
    time.sleep(1)
print('done', flush=True)
runner.py - runs the test redirecting stdout to a file and monitors the file.
import subprocess as subp
import time
import os
# run program in background
pid = int(subp.check_output("python3 test.py >test.out 2>&1 & echo $!",
                            shell=True))
print("pid", pid)
# monitor output file
pos = 0
done = False
while not done:
    time.sleep(.1)
    if os.stat('test.out').st_size > pos:
        with open('test.out', 'rb') as fp:
            fp.seek(pos)
            for line in fp.readlines():
                print(line.strip().decode())
                done = b'done' in line
            pos = fp.tell()
print("test complete")
Running it, I get
td@mintyfresh ~/tmp $ python3 runner.py
pid 24353
09:32:18
09:32:19
09:32:20
09:32:21
09:32:22
done
test complete
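If the shell is not essential, an alternative (not what the question asked for, so take it as a sketch) is to skip check_output and let Popen start the program directly. Popen returns as soon as the child is started, and with the standard streams pointed at DEVNULL there is no pipe to wait on. The helper name start_in_background and the sleeping Python child are placeholders of mine.

```python
import subprocess
import sys
import time

def start_in_background(cmd):
    """Start cmd without waiting for it and return the Popen object."""
    return subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,   # no pipe, so nothing keeps the caller waiting
        stderr=subprocess.DEVNULL,
    )

if __name__ == "__main__":
    start = time.monotonic()
    # Placeholder child: stands in for "sleep 1" or your own executable.
    proc = start_in_background([sys.executable, "-c", "import time; time.sleep(1)"])
    print("pid", proc.pid, "after %.3f s" % (time.monotonic() - start))
    proc.wait()   # reap the child eventually so it does not linger as a zombie
```

The pid is available as proc.pid right away, which replaces the echo $! trick.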

Terminate subprocess running in thread on program exit

Based on the accepted answer to this question: python-subprocess-callback-when-cmd-exits I am running a subprocess in a separate thread and after the completion of the subprocess a callable is executed. All good, but the problem is that even if running the thread as a daemon, the subprocess continues to run even after the program exits normally or it is killed by kill -9, Ctrl + C, etc...
Below is a very simplified example (runs on 2.7):
import threading
import subprocess
import time
import sys
def on_exit(pid):
    print 'Process with pid %s ended' % pid
def popen_with_callback(cmd):
    def run_in_thread(command):
        proc = subprocess.Popen(
            command,
            shell=False
        )
        proc.wait()
        on_exit(proc.pid)
        return
    thread = threading.Thread(target=run_in_thread, args=([cmd]))
    thread.daemon = True
    thread.start()
    return thread
if __name__ == '__main__':
    popen_with_callback(
        [
            "bash",
            "-c",
            "for ((i=0;i<%s;i=i+1)); do echo $i; sleep 1; done" % sys.argv[1]
        ])
    time.sleep(5)
    print 'program ended'
If the main thread lasts longer than the subprocess everything is fine:
(venv)~/Desktop|➤➤ python testing_threads.py 3
> 0
> 1
> 2
> Process with pid 26303 ended
> program ended
If the main thread lasts less than the subprocess, the subprocess continues to run until it eventually hangs:
(venv)~/Desktop|➤➤ python testing_threads.py 8
> 0
> 1
> 2
> 3
> 4
> program ended
(venv)~/Desktop|➤➤ 5
> 6
> 7
# hanging from now on
How to terminate the subprocess if the main program is finished or killed? I tried to use atexit.register(os.kill(proc.pid, signal.SIGTERM)) just before proc.wait but it actually executes when the thread running the subprocess exits, not when the main thread exits.
I was also thinking of polling for the parent pid, but I am not sure how to implement it because of the proc.wait situation.
Ideal outcome would be:
(venv)~/Desktop|➤➤ python testing_threads.py 8
> 0
> 1
> 2
> 3
> 4
> program ended
> Process with pid 1234 ended
Use the Thread.join method, which blocks the main thread until the worker thread exits:
if __name__ == '__main__':
    popen_with_callback(
        [
            "bash",
            "-c",
            "for ((i=0;i<%s;i=i+1)); do echo $i; sleep 1; done" % sys.argv[1]
        ]).join()
    print 'program ended'
Here is an ugly but effective method.
Keep the object returned by subprocess.Popen() in a global variable,
and you can kill the process whenever you like:
my_proc = None
def popen_with_callback(cmd):
    def run_in_thread(command):
        global my_proc
        proc = subprocess.Popen(command, shell=False)
        my_proc = proc
        proc.wait()
        on_exit(proc.pid)
        return
    ...
Then you can kill the process from anywhere in the program.
Just do:
my_proc.kill()

How to validate that subprocess.Popen is non blocking

A couple of answers (first, second) have mentioned that subprocess.Popen is a non blocking call.
What is a simple example that validates this, or that can be used to explain it to a beginner?
I tried the following code. It shows that "Finish" is printed before printing output of ls -lrt but as soon as I add sleep 10 before ls -lrt in command, it waits for command to finish.
import logging
import os
import subprocess
import signal
import time
log = logging.getLogger(__name__)
class Utils(object):
    @staticmethod
    def run_command(cmnd, env=None, cwd=None, timeout=0):
        p = subprocess.Popen(cmnd, shell=True, stdin=None, bufsize=-1, env=env,
                             stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                             close_fds=True, cwd=cwd, preexec_fn=os.setsid)
        #stdout_val = p.communicate()[0]
        stdout_val = p.stdout.read()
        return p.returncode, stdout_val.strip()
if __name__ == '__main__':
    print "Start"
    print "Invoke command"
    status, output = Utils.run_command("ls -lrt") # line - 10
    #status, output = Utils.run_command("sleep 10;ls -lrt") # line - 11
    for i in xrange(10):
        print "Finish"
    print status
    print output
EDIT 1: Replacing call p.communicate() with p.stdout.read() after suggestion.
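One way to see the difference, sketched in Python 3 (this is not from the thread; the child command and the name time_popen_and_read are invented for the demo): time the Popen call and the read separately. Popen returns almost at once, while p.stdout.read() blocks until the child closes its stdout, which is what makes run_command above appear to block.

```python
import subprocess
import sys
import time

# Hypothetical child for the demo: sleeps for a second, then prints a line.
CHILD = "import time; time.sleep(1); print('child done')"

def time_popen_and_read():
    """Return (seconds spent in Popen, seconds spent in stdout.read, output)."""
    t0 = time.monotonic()
    p = subprocess.Popen([sys.executable, "-c", CHILD],
                         stdout=subprocess.PIPE, universal_newlines=True)
    t1 = time.monotonic()       # Popen has returned; the child is still sleeping
    output = p.stdout.read()    # blocks until the child closes its stdout
    t2 = time.monotonic()
    p.stdout.close()
    p.wait()
    return t1 - t0, t2 - t1, output

if __name__ == "__main__":
    popen_s, read_s, out = time_popen_and_read()
    print("Popen took %.3f s, read took %.3f s, output: %r" % (popen_s, read_s, out))
```

The same holds for communicate(): it is the waiting for output, not the Popen constructor, that takes the time.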

cannot kill a Sub process created by Popen when printing process.stdout

I have created a script which should run a command and kill it after 15 seconds
import logging
import subprocess
import time
import os
import sys
import signal
#cmd = "ping 192.168.1.1 -t"
cmd = "C:\\MyAPP\MyExe.exe -t 80 -I C:\MyApp\Temp -M Documents"
proc=subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,shell=True)
for line in proc.stdout:
    print (line.decode("utf-8"), end='')
time.sleep(15)
os.kill(proc.pid, signal.SIGTERM)
#proc.kill() #Tried this too but no luck
This does not terminate my subprocess. However, if I comment out the part that prints its stdout, i.e.
for line in proc.stdout:
    print (line.decode("utf-8"), end='')
the subprocess is killed.
I have tried proc.kill() and CTRL_C_EVENT too, but no luck.
Any help would be highly appreciated. Please bear in mind that I am a novice at Python.
To terminate the subprocess after 15 seconds while printing its output line by line:
#!/usr/bin/env python
from __future__ import print_function
from threading import Timer
from subprocess import Popen, PIPE, STDOUT
# start process
cmd = r"C:\MyAPP\MyExe.exe -t 80 -I C:\MyApp\Temp -M Documents"
process = Popen(cmd, stdout=PIPE, stderr=STDOUT,
                bufsize=1, universal_newlines=True)
# terminate process in 15 seconds
timer = Timer(15, terminate, args=[process])
timer.start()
# print output
for line in iter(process.stdout.readline, ''):
print(line, end='')
process.stdout.close()
process.wait() # wait for the child process to finish
timer.cancel()
Notice, you don't need shell=True here. You could define terminate() as:
def terminate(process):
    if process.poll() is None:
        try:
            process.terminate()
        except EnvironmentError:
            pass # ignore
If you want to kill the whole process tree then define terminate() as:
from subprocess import call
def terminate(process):
    if process.poll() is None:
        call('taskkill /F /T /PID ' + str(process.pid))
Use raw-string literals for Windows paths: r""; otherwise you have to escape every backslash in the string literal
Drop shell=True. It creates an additional process for no reason here
universal_newlines=True enables text mode (on Python 3, bytes are decoded into Unicode text automatically using the locale's preferred encoding)
iter(process.stdout.readline, '') is necessary for compatibility with Python 2 (otherwise the data may be printed with a delay due to the read-ahead buffer bug)
Use process.terminate() instead of process.send_signal(signal.SIGTERM) or os.kill(proc.pid, signal.SIGTERM)
taskkill can kill a whole process tree on Windows
The problem is that reading from stdout blocks. You need to either read the subprocess's output or run the timer on a separate thread.
from subprocess import Popen, PIPE
from threading import Thread
from time import sleep
class ProcKiller(Thread):
    def __init__(self, proc, time_limit):
        super(ProcKiller, self).__init__()
        self.proc = proc
        self.time_limit = time_limit
    def run(self):
        sleep(self.time_limit)
        self.proc.kill()
p = Popen('while true; do echo hi; sleep 1; done', shell=True)
t = ProcKiller(p, 5)
t.start()
p.communicate()
EDITED to reflect suggested changes in comment
from subprocess import Popen, PIPE
from threading import Thread
from time import sleep
from signal import SIGTERM
import os
class ProcKiller(Thread):
    def __init__(self, proc, time_limit):
        super(ProcKiller, self).__init__()
        self.proc = proc
        self.time_limit = time_limit
    def run(self):
        sleep(self.time_limit)
        os.kill(self.proc.pid, SIGTERM)
p = Popen('while true; do echo hi; sleep 1; done', shell=True)
t = ProcKiller(p, 5)
t.start()
p.communicate()
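On Python 3.3+ there is also a route without a helper thread, sketched below under the assumption that you are happy to collect the output and print it afterwards rather than line by line: communicate() accepts a timeout and raises subprocess.TimeoutExpired, after which you kill the child and call communicate() again to pick up what it wrote. The function name run_with_timeout and the looping Python child are illustrative only.

```python
import subprocess
import sys

# Hypothetical long-running child: prints a line every 0.2 s, forever.
CHILD = "import time\nwhile True:\n    print('hi', flush=True)\n    time.sleep(0.2)\n"

def run_with_timeout(cmd, timeout):
    """Run cmd, kill it after `timeout` seconds, return (output, returncode)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    try:
        output, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()                      # the child outlived the time limit
        output, _ = proc.communicate()   # collect what it wrote and reap it
    return output, proc.returncode

if __name__ == "__main__":
    out, rc = run_with_timeout([sys.executable, "-c", CHILD], 1.5)
    print(out, end="")
    print("return code:", rc)
```

As with proc.kill() in general, this stops only the direct child; a process tree started through a shell needs the taskkill or process-group approach.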

python subprocess poll() is not returning None even if Popen is still running

I have a python script that executes linux commands with timeout using a while loop and sleep like below
fout = tempfile.TemporaryFile()
try:
    p = subprocess.Popen(["/bin/bash","-c", options.command], bufsize=-1, shell=False, preexec_fn=os.setsid, stdin=subprocess.PIPE, stdout=fout, stderr=subprocess.PIPE)
except:
    sys.exit(UNEXPECTED_ERROR)
if options.timeout:
    print "options.timeout = %s" % options.timeout
    elapsed = 0
    time.sleep(0.1) # This sleep is for the delay between Popen and poll() functions
    while p.poll() is None:
        time.sleep(1)
        elapsed = elapsed + 1
        print "elapsed = %s" % elapsed
        if elapsed >= options.timeout:
            # TIMEDOUT
            # kill all processes that are in the same child process group
            # which kills the process tree
            pgid = os.getpgid(p.pid)
            os.killpg(pgid, signal.SIGKILL)
            p.wait()
            fout.close()
            sys.exit(TIMEOUT_ERROR)
            break
else:
    p.wait()
fout.seek(0) #rewind to the beginning of the file
print fout.read(),
fout.close()
sys.exit(p.returncode)
$ time myScript -c "cat file2" 2>&1 -t 5
options.timeout = 5
elapsed = 1
real 0m11.811s
user 0m0.046s
sys 0m1.153s
My question: in the above case, even though the timeout is 5 seconds, cat continues until it finishes. Am I missing something here? Please help.
It works as expected on Ubuntu:
$ /usr/bin/ssh root@localhost -t 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ /usr/bin/time python2.4 myscript.py 'cat big_file'
timeout
done
0.01user 0.63system 0:05.16elapsed 12%CPU
$ /usr/bin/ssh root@localhost -t 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ /usr/bin/time cat big_file >/dev/null
0.02user 0.82system 0:09.93elapsed 8%CPU
It also works with a shell command:
$ /usr/bin/time python2.4 myscript.py 'while : ; do sleep 1; done'
timeout
done
0.02user 0.00system 0:05.03elapsed 0%CPU
Assumptions:
you can't use time.time() because the system clock may change
time.clock() doesn't measure children's times on Linux
we can't emulate time.monotonic() from Python 3.3 in pure Python because ctypes is not available on Python 2.4
it is acceptable for the timeout to span hibernation, e.g., with a 5-second timeout, 2 seconds before hibernation plus 3 seconds after the computer wakes up, whenever that happens.
#!/usr/bin/env python2.4
import os
import signal
import sys
import tempfile
import time
from subprocess import Popen
class TimeoutExpired(Exception):
    pass
def wait(process, timeout, _sleep_time=.1):
    for _ in xrange(int(timeout * 1. / _sleep_time + .5)):
        time.sleep(_sleep_time) # NOTE: assume it doesn't wake up earlier
        if process.poll() is not None:
            return process.wait()
    raise TimeoutExpired # NOTE: timeout precision is not very good
f = tempfile.TemporaryFile()
p = Popen(["/bin/bash", "-c", sys.argv[1]], stdout=f, preexec_fn=os.setsid,
          close_fds=True)
try:
    wait(p, timeout=5)
except TimeoutExpired:
    print >>sys.stderr, "timeout"
    os.killpg(os.getpgid(p.pid), signal.SIGKILL)
    p.wait()
else:
    f.seek(0)
    for line in f:
        print line,
f.close() # delete it
print >>sys.stderr, "done"
Besides the following problem I see in your code:
you call Popen() with stdin=subprocess.PIPE and stderr=subprocess.PIPE, but you never handle these pipes. With a command like cat file2 this should be fine, but it can lead to problems.
I can spot a potential misbehaviour: you might have mixed up the indentation (as in the first version of your question). Assume you have the following:
while p.poll() is None:
    time.sleep(1)
    elapsed = elapsed + 1
    print "elapsed = %s" % elapsed
    if elapsed >= options.timeout:
        # TIMEDOUT
        # kill all processes that are in the same child process group
        # which kills the process tree
        pgid = os.getpgid(p.pid)
        os.killpg(pgid, signal.SIGKILL)
    p.wait()
    fout.close()
    sys.exit(TIMEOUT_ERROR)
    break
You don't reach the timeout threshold, and nevertheless p.wait() is called because of the bad indentation. Don't mix tabs and spaces; PEP 8 recommends spaces only, with 4 spaces per indentation level.
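For comparison, on Python 3 the whole loop can probably be reduced to wait(timeout=...), which removes the indentation-sensitive bookkeeping altogether. The sketch below is POSIX only and makes a few assumptions of its own: the helper name run_shell_with_timeout is invented, /bin/sh is used instead of /bin/bash, and all three standard streams go to DEVNULL so that no unread pipe is left open (the original sends stdout to a temporary file).

```python
import os
import signal
import subprocess

def run_shell_with_timeout(command, timeout):
    """Run `command` under sh; SIGKILL its whole process group on timeout.

    Returns the exit status (a negative signal number if it was killed).
    POSIX only: relies on start_new_session and os.killpg.
    """
    p = subprocess.Popen(
        ["/bin/sh", "-c", command],
        stdin=subprocess.DEVNULL,     # no unused pipes left dangling
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,       # child leads its own process group
    )
    try:
        return p.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the shell and everything it started, then reap the shell.
        os.killpg(os.getpgid(p.pid), signal.SIGKILL)
        return p.wait()

if __name__ == "__main__":
    print(run_shell_with_timeout("sleep 30", 1))
    print(run_shell_with_timeout("exit 3", 5))
```

start_new_session=True plays the role of preexec_fn=os.setsid in the older code.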
