I am trying to find a way in Python to run other programs in such a way that:
The stdout and stderr of the program being run can be logged separately.
The stdout and stderr of the program being run can be viewed in near-real time, such that if the child process hangs, the user can see. (i.e. we do not wait for execution to complete before printing the stdout/stderr to the user)
Bonus criterion: The program being run does not know it is being run via Python, and thus will not do unexpected things (like chunk its output instead of printing it in real time, or exit because it demands a terminal to view its output). This small criterion pretty much means we will need to use a pty, I think.
Here is what I've got so far...
Method 1:
def method1(command):
    ## subprocess.communicate() will give us the stdout and stderr separately,
    ## but we will have to wait until the end of command execution to print anything.
    ## This means if the child process hangs, we will never know....
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, executable='/bin/bash')
    stdout, stderr = proc.communicate() # record both, but no way to print stdout/stderr in real-time
    print ' ######### REAL-TIME ######### '
    ######## Not Possible
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print stdout
    print 'STDERR:'
    print stderr
Method 2:
def method2(command):
    ## Using pexpect to run our command in a pty, we can see the child's stdout in real-time,
    ## however we cannot see the stderr from "curl google.com", presumably because it is not connected to a pty?
    ## Furthermore, I do not know how to log it beyond writing out to a file (p.logfile). I need the stdout and stderr
    ## as strings, not files on disk! On the upside, pexpect would give a lot of extra functionality (if it worked!)
    proc = pexpect.spawn('/bin/bash', ['-c', command])
    print ' ######### REAL-TIME ######### '
    proc.interact()
    print ' ########## RESULTS ########## '
    ######## Not Possible
Method 3:
def method3(command):
    ## This method is very much like method1, and would work exactly as desired
    ## if only proc.xxx.read(1) wouldn't block waiting for something. Which it does. So this is useless.
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, executable='/bin/bash')
    print ' ######### REAL-TIME ######### '
    out,err,outbuf,errbuf = '','','',''
    firstToSpeak = None
    while proc.poll() == None:
        stdout = proc.stdout.read(1) # blocks
        stderr = proc.stderr.read(1) # also blocks
        if firstToSpeak == None:
            if stdout != '': firstToSpeak = 'stdout'; outbuf,errbuf = stdout,stderr
            elif stderr != '': firstToSpeak = 'stderr'; outbuf,errbuf = stdout,stderr
        else:
            if (stdout != '') or (stderr != ''): outbuf += stdout; errbuf += stderr
            else:
                out += outbuf; err += errbuf;
                if firstToSpeak == 'stdout': sys.stdout.write(outbuf+errbuf);sys.stdout.flush()
                else: sys.stdout.write(errbuf+outbuf);sys.stdout.flush()
                firstToSpeak = None
    print ''
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print out
    print 'STDERR:'
    print err
To try these methods out, you will need to import sys, subprocess and pexpect.
pexpect is pure Python and can be installed with
sudo pip install pexpect
I think the solution will involve Python's pty module, which is somewhat of a black art that I cannot find anyone who knows how to use. Perhaps SO knows :)
As a heads-up, I recommend you use 'curl www.google.com' as a test command, because it prints its status out on stderr for some reason :D
UPDATE-1:
OK so the pty library is not fit for human consumption. The docs, essentially, are the source code.
Any presented solution that is blocking and not async is not going to work here. The Threads/Queue method by Padraic Cunningham works great, although adding pty support is not possible - and it's 'dirty' (to quote Freenode's #python).
It seems like the only solution fit for production-standard code is using the Twisted framework, which even supports pty as a boolean switch to run processes exactly as if they were invoked from the shell.
But adding Twisted into a project requires a total rewrite of all the code. This is a total bummer :/
UPDATE-2:
Two answers were provided, one of which addresses the first two
criteria and will work well where you just need both the stdout and
stderr using Threads and Queue. The other answer uses select, a
non-blocking method for reading file descriptors, and pty, a method to
"trick" the spawned process into believing it is running in a real
terminal just as if it was run from Bash directly - but may or may not
have side-effects. I wish I could accept both answers, because the
"correct" method really depends on the situation and why you are
subprocessing in the first place, but alas, I could only accept one.
The stdout and stderr of the program being run can be logged separately.
You can't use pexpect because both stdout and stderr go to the same pty and there is no way to separate them after that.
The stdout and stderr of the program being run can be viewed in near-real time, such that if the child process hangs, the user can see. (i.e. we do not wait for execution to complete before printing the stdout/stderr to the user)
If the output of a subprocess is not a tty, then it is likely that it uses block buffering, and therefore if it doesn't produce much output it won't be "real time", e.g., if the buffer is 4K then your parent Python process won't see anything until the child process prints 4K chars and the buffer overflows or is flushed explicitly (inside the subprocess). This buffer is inside the child process and there are no standard ways to manage it from outside. [Figure: the stdio buffers and the pipe buffer for a command1 | command2 shell pipeline]
The program being run does not know it is being run via python, and thus will not do unexpected things (like chunk its output instead of printing it in real-time, or exit because it demands a terminal to view its output).
It seems, you meant the opposite i.e., it is likely that your child process chunks its output instead of flushing each output line as soon as possible if the output is redirected to a pipe (when you use stdout=PIPE in Python). It means that the default threading or asyncio solutions won't work as is in your case.
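The child can tell a pipe from a terminal, and that is what its stdio layer typically uses to choose between line buffering and block buffering. A minimal sketch of that check, assuming a Python child started through sys.executable (the helper name child_sees_tty is made up for this illustration):

```python
import subprocess
import sys

def child_sees_tty():
    """Ask a child process whether its stdout looks like a terminal."""
    # check_output() connects the child's stdout to a pipe (stdout=PIPE).
    out = subprocess.check_output(
        [sys.executable, '-c', 'import sys; print(sys.stdout.isatty())'])
    return out.decode().strip()

if __name__ == '__main__':
    # With stdout on a pipe the child reports that it is not on a tty,
    # so it is free to block-buffer whatever it prints.
    print(child_sees_tty())
```

Any program can make the same check (isatty() in C), which is why output that looks immediate in a shell can arrive in chunks once it is piped.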
There are several options to work around it:
the command may accept a command-line argument such as grep --line-buffered or python -u, to disable block buffering.
stdbuf works for some programs i.e., you could run ['stdbuf', '-oL', '-eL'] + command using the threading or asyncio solution above and you should get stdout, stderr separately and lines should appear in near-real time:
#!/usr/bin/env python3
import os
import sys
from select import select
from subprocess import Popen, PIPE

with Popen(['stdbuf', '-oL', '-e0', 'curl', 'www.google.com'],
           stdout=PIPE, stderr=PIPE) as p:
    readable = {
        p.stdout.fileno(): sys.stdout.buffer, # log separately
        p.stderr.fileno(): sys.stderr.buffer,
    }
    while readable:
        for fd in select(readable, [], [])[0]:
            data = os.read(fd, 1024) # read available
            if not data: # EOF
                del readable[fd]
            else:
                readable[fd].write(data)
                readable[fd].flush()
Finally, you could try a pty + select solution with two ptys:
#!/usr/bin/env python3
import errno
import os
import pty
import sys
from select import select
from subprocess import Popen

masters, slaves = zip(pty.openpty(), pty.openpty())
with Popen([sys.executable, '-c', r'''import sys, time
print('stdout', 1) # no explicit flush
time.sleep(.5)
print('stderr', 2, file=sys.stderr)
time.sleep(.5)
print('stdout', 3)
time.sleep(.5)
print('stderr', 4, file=sys.stderr)
'''],
           stdin=slaves[0], stdout=slaves[0], stderr=slaves[1]):
    for fd in slaves:
        os.close(fd) # no input
    readable = {
        masters[0]: sys.stdout.buffer, # log separately
        masters[1]: sys.stderr.buffer,
    }
    while readable:
        for fd in select(readable, [], [])[0]:
            try:
                data = os.read(fd, 1024) # read available
            except OSError as e:
                if e.errno != errno.EIO:
                    raise #XXX cleanup
                del readable[fd] # EIO means EOF on some systems
            else:
                if not data: # EOF
                    del readable[fd]
                else:
                    readable[fd].write(data)
                    readable[fd].flush()
for fd in masters:
    os.close(fd)
I don't know what the side effects of using different ptys for stdout and stderr are. You could check whether a single pty is enough in your case, e.g., set stderr=PIPE and use p.stderr.fileno() instead of masters[1]. A comment in the sh source suggests that there are issues if stderr is not in {STDOUT, pipe}.
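The single-pty variant mentioned here (a pty for stdout, a plain pipe for stderr) might look roughly like the sketch below. It is untested beyond a trivial child, POSIX-only, and the function name run_pty_stdout_pipe_stderr is invented for the example; it collects the bytes instead of echoing them.

```python
import errno
import os
import pty
import sys
from select import select
from subprocess import Popen, PIPE

def run_pty_stdout_pipe_stderr(args):
    """Run args with stdout on a pty and stderr on a pipe; return (out, err) as bytes."""
    master, slave = pty.openpty()
    with Popen(args, stdin=slave, stdout=slave, stderr=PIPE) as p:
        os.close(slave)                  # the child keeps its own copy
        err_fd = p.stderr.fileno()
        chunks = {master: [], err_fd: []}
        readable = [master, err_fd]
        while readable:
            for fd in select(readable, [], [])[0]:
                try:
                    data = os.read(fd, 1024)
                except OSError as e:
                    if e.errno != errno.EIO:
                        raise
                    data = b''           # EIO means EOF on some systems
                if data:
                    chunks[fd].append(data)
                else:
                    readable.remove(fd)  # EOF on this descriptor
    os.close(master)
    return b''.join(chunks[master]), b''.join(chunks[err_fd])

if __name__ == '__main__':
    child = 'import sys; print("out"); print("err", file=sys.stderr)'
    out, err = run_pty_stdout_pipe_stderr([sys.executable, '-c', child])
    print(repr(out), repr(err))
```

With this arrangement only stdout sees a terminal, so a child that decides its buffering per stream may still block-buffer stderr (many programs leave stderr unbuffered anyway). The pty also tends to translate "\n" into "\r\n" on the stdout side.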
If you want to read from stderr and stdout and get the output separately, you can use a Thread with a Queue, not overly tested but something like the following:
import threading
import queue
from subprocess import Popen, PIPE

def run(fd, q):
    for line in iter(fd.readline, ''):
        q.put(line)
    q.put(None)

def create(fd):
    q = queue.Queue()
    t = threading.Thread(target=run, args=(fd, q))
    t.daemon = True
    t.start()
    return q, t

process = Popen(["curl","www.google.com"], stdout=PIPE, stderr=PIPE,
                universal_newlines=True)
std_q, std_out = create(process.stdout)
err_q, err_read = create(process.stderr)
while std_out.is_alive() or err_read.is_alive():
    for line in iter(std_q.get, None):
        print(line)
    for line in iter(err_q.get, None):
        print(line)
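The loop above drains std_q up to its sentinel before it looks at err_q, so stderr lines are only printed once stdout has closed. One possible variation, sketched here with made-up names (pump, run_tagged), is to let both reader threads feed a single queue and tag each line with its stream, so lines are handled in the order they arrive while still being kept apart:

```python
import queue
import sys
import threading
from subprocess import Popen, PIPE

def pump(stream, name, q):
    """Forward each line of stream to q, tagged with the stream's name."""
    for line in iter(stream.readline, ''):
        q.put((name, line))
    q.put((name, None))  # sentinel: this stream reached EOF

def run_tagged(args):
    """Run args; return a list of (stream_name, line) tuples in arrival order."""
    p = Popen(args, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    q = queue.Queue()
    for stream, name in ((p.stdout, 'stdout'), (p.stderr, 'stderr')):
        threading.Thread(target=pump, args=(stream, name, q), daemon=True).start()
    lines, open_streams = [], 2
    while open_streams:
        name, line = q.get()       # blocks until either stream has something
        if line is None:
            open_streams -= 1
        else:
            lines.append((name, line))
    p.stdout.close()
    p.stderr.close()
    p.wait()
    return lines

if __name__ == '__main__':
    child = 'import sys; print("to stdout"); print("to stderr", file=sys.stderr)'
    for name, line in run_tagged([sys.executable, '-c', child]):
        print(name, line, end='')
```

In a real-time setting you would print or log inside the while loop rather than collect into a list; the list is only there to keep the sketch easy to check. The relative order of stdout and stderr lines still depends on how the child buffers each stream.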
While J.F. Sebastian's answer certainly solves the heart of the problem, I'm running Python 2.7 (which wasn't in the original criteria), so I'm just throwing this out there for any other weary travellers who just want to cut/paste some code.
I haven't tested this thoroughly yet, but on all the commands I have tried it seems to work perfectly :)
You may want to change .decode('ascii') to .decode('utf-8'); I'm still testing that bit out.
#!/usr/bin/env python2.7
import errno
import os
import pty
import sys
from select import select
import subprocess

stdout = ''
stderr = ''
command = 'curl google.com ; sleep 5 ; echo "hey"'

masters, slaves = zip(pty.openpty(), pty.openpty())
p = subprocess.Popen(command, stdin=slaves[0], stdout=slaves[0], stderr=slaves[1], shell=True, executable='/bin/bash')
for fd in slaves: os.close(fd)

readable = { masters[0]: sys.stdout, masters[1]: sys.stderr }
try:
    print ' ######### REAL-TIME ######### '
    while readable:
        for fd in select(readable, [], [])[0]:
            try: data = os.read(fd, 1024)
            except OSError as e:
                if e.errno != errno.EIO: raise
                del readable[fd]
            else:  # only when os.read() succeeded; after EIO there is no fresh data to handle
                if not data: del readable[fd]
                else:
                    if fd == masters[0]: stdout += data.decode('ascii')
                    else: stderr += data.decode('ascii')
                    readable[fd].write(data)
                    readable[fd].flush()
except:
    print "Unexpected error:", sys.exc_info()[0]
    raise
finally:
    p.wait()
    for fd in masters: os.close(fd)
    print ''
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print stdout
    print 'STDERR:'
    print stderr
I am new to Python and Linux. I have a process running in a terminal window that will go indefinitely. The only way to stop it would be for it to crash or for me to hit ctrl+C. This process outputs text to the terminal window that I wish to capture with Python, so I can do some additional processing of that text.
I know I need to do something with getting stdout, but no matter what I try, I can't seem to capture the stdout correctly. Here is what I have so far.
import subprocess

command = 'echo this is a test. Does it come out as a single line?'

def myrun(cmd):
    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    stdout = []
    while True:
        line = p.stdout.read()
        stdout.append(line)
        if line == '' and p.poll() != None:
            break
    return ''.join(stdout)

result = myrun(command)
print('> ' + result),
This will work when my command is a simple "echo blah blah blah". I am guessing this is because the echo process terminates. If I try running the continuous command, the output is never captured. Is this possible to do?
read() will block until it reaches EOF; use read(1024) or readline() instead:
read(size=-1)
Read and return up to size bytes. If the argument is omitted, None, or negative, data is read and returned until EOF is reached.
E.g.:
p = subprocess.Popen('yes', stdout=subprocess.PIPE)
while True:
    line = p.stdout.readline()
    print(line.strip())
See the Python io documentation for more.
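The yes example never ends, so its loop has no exit condition. For a command that can finish, a hedged Python 3 sketch of the same readline() idea could stop at EOF as shown below; iter_lines is a name invented for this example, and the lines come back as bytes because the pipe is opened in binary mode.

```python
import subprocess
import sys

def iter_lines(args):
    """Yield the child's stdout lines (bytes) as soon as each one is read."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE)
    try:
        # readline() returns b'' only at EOF, i.e. once the child closed its stdout.
        for line in iter(p.stdout.readline, b''):
            yield line
    finally:
        p.stdout.close()
        p.wait()

if __name__ == '__main__':
    child = 'for i in range(3): print("line", i, flush=True)'
    for line in iter_lines([sys.executable, '-c', child]):
        print(line.strip())
```

How promptly each line shows up still depends on the child flushing its own output; the reader only avoids adding a wait of its own.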
There are a lot of similar posts, but I didn't find an answer.
On GNU/Linux, with Python and the subprocess module, I use the following code to iterate over the
stdout/stderr of a command launched with subprocess:
import subprocess

class Shell:
    """
    run a command and iterate over the stdout/stderr lines
    """
    def __init__(self):
        pass

    def __call__(self,args,cwd='./'):
        p = subprocess.Popen(args,
                cwd=cwd,
                stdout = subprocess.PIPE,
                stderr = subprocess.STDOUT,
            )
        while True:
            line = p.stdout.readline()
            self.code = p.poll()
            if line == '':
                if self.code != None:
                    break
                else:
                    continue
            yield line

#example of use
args = ["./foo"]
shell = Shell()
for line in shell(args):
    #do something with line
    print line,
This works fine... except if the command executed is python, for example args = ['python','foo.py'], in which case the output is not flushed but printed only when the command is finished.
Is there a solution?
Check out How to flush output of Python print?.
You need to run the python subprocess with the -u option:
-u Force stdin, stdout and stderr to be totally unbuffered. On systems where it matters, also put stdin, stdout and stderr in binary mode. Note that there is internal buffering in xreadlines(), readlines() and file-object iterators ("for line in sys.stdin") which is not influenced by this option. To work around this, you will want to use "sys.stdin.readline()" inside a "while 1:" loop.
Or, if you have control over the python sub-process script you can use sys.stdout.flush() to flush the output every time you print.
import sys
sys.stdout.flush()
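To see what -u changes, one rough experiment is to time how far apart two lines from a child arrive. The sketch below assumes a Python 3 child started through sys.executable; arrival_gap and CHILD are names invented for the example, and the one-second sleep is arbitrary.

```python
import subprocess
import sys
import time

# The child prints one line, sleeps, then prints another, never flushing explicitly.
CHILD = 'import time; print("first"); time.sleep(1); print("second")'

def arrival_gap(extra_flags):
    """Return the seconds between the arrival of the child's first and second line."""
    p = subprocess.Popen([sys.executable] + extra_flags + ['-c', CHILD],
                         stdout=subprocess.PIPE)
    stamps = []
    for line in iter(p.stdout.readline, b''):
        stamps.append(time.monotonic())   # when the parent actually saw the line
    p.stdout.close()
    p.wait()
    return stamps[1] - stamps[0]

if __name__ == '__main__':
    print('with -u:    %.2f s' % arrival_gap(['-u']))
    print('without -u: %.2f s' % arrival_gap([]))
```

With -u the gap should be close to the child's sleep, because "first" is written to the pipe immediately. Without it, both lines usually arrive together when the child exits, unless something like the PYTHONUNBUFFERED environment variable is already set.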
I am trying to get output from a subprocess and then give commands to that process based on the preceding output. I need to do this a variable number of times, when the program needs further input. (I also need to be able to hide the subprocess command prompt if possible).
I figured this would be an easy task given that I have seen this problem being discussed in posts from 2003 and it is nearly 2012 and it appears to be a pretty common need and really seems like it should be a basic part of any programming language. Apparently I was wrong and somehow almost 9 years later there is still no standard way of accomplishing this task in a stable, non-destructive, platform independent way!
I don't really understand much about file i/o and buffering or threading so I would prefer a solution that is as simple as possible. If there is a module that accomplishes this that is compatible with python 3.x, I would be very willing to download it. I realize that there are multiple questions that ask basically the same thing, but I have yet to find an answer that addresses the simple task that I am trying to accomplish.
Here is the code I have so far based on a variety of sources; however I have absolutely no idea what to do next. All my attempts ended in failure and some managed to use 100% of my CPU (to do basically nothing) and would not quit.
import subprocess
from subprocess import Popen, PIPE

p = Popen(r'C:\postgis_testing\shellcomm.bat',stdin=PIPE,stdout=PIPE,stderr=subprocess.STDOUT, shell=True)
stdout,stderr = p.communicate(b'command string')
In case my question is unclear, I am posting the text of a sample batch file that demonstrates a situation in which it is necessary to send multiple commands to the subprocess (if you type an incorrect command string the program loops).
@echo off
:looper
set INPUT=
set /P INPUT=Type the correct command string:
if "%INPUT%" == "command string" (echo you are correct) else (goto looper)
If anyone can help me I would very much appreciate it, and I'm sure many others would as well!
EDIT: here is the functional code using eryksun's code (next post):
import subprocess
import threading
import time
import sys

try:
    import queue
except ImportError:
    import Queue as queue

def read_stdout(stdout, q, p):
    it = iter(lambda: stdout.read(1), b'')
    for c in it:
        q.put(c)
        if stdout.closed:
            break

_encoding = getattr(sys.stdout, 'encoding', 'latin-1')
def get_stdout(q, encoding=_encoding):
    out = []
    while 1:
        try:
            out.append(q.get(timeout=0.2))
        except queue.Empty:
            break
    return b''.join(out).rstrip().decode(encoding)

def printout(q):
    outdata = get_stdout(q)
    if outdata:
        print('Output: %s' % outdata)

if __name__ == '__main__':
    #setup
    p = subprocess.Popen(['shellcomm.bat'], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                         bufsize=0, shell=True) # I put shell=True to hide prompt
    q = queue.Queue()
    encoding = getattr(sys.stdin, 'encoding', 'utf-8')

    #for reading stdout
    t = threading.Thread(target=read_stdout, args=(p.stdout, q, p))
    t.daemon = True
    t.start()

    #command loop
    while p.poll() is None:
        printout(q)
        cmd = input('Input: ')
        cmd = (cmd + '\n').encode(encoding)
        p.stdin.write(cmd)
        time.sleep(0.1) # I added this to give some time to check for closure (otherwise it doesn't work)

    #tear down
    for n in range(4):
        rc = p.poll()
        if rc is not None:
            break
        time.sleep(0.25)
    else:
        p.terminate()
        rc = p.poll()
        if rc is None:
            rc = 1

    printout(q)
    print('Return Code: %d' % rc)
However when the script is run from a command prompt the following happens:
C:\Users\username>python C:\postgis_testing\shellcomm7.py
Input: sth
Traceback (most recent call last):
File "C:\postgis_testing\shellcomm7.py", line 51, in <module>
p.stdin.write(cmd)
IOError: [Errno 22] Invalid argument
It seems that the program closes out when run from the command prompt. Any ideas?
This demo uses a dedicated thread to read from stdout. If you search around, I'm sure you can find a more complete implementation written up in an object oriented interface. At least I can say this is working for me with your provided batch file in both Python 2.7.2 and 3.2.2.
shellcomm.bat:
@echo off
echo Command Loop Test
echo.
:looper
set INPUT=
set /P INPUT=Type the correct command string:
if "%INPUT%" == "command string" (echo you are correct) else (goto looper)
Here's what I get for output based on the sequence of commands "wrong", "still wrong", and "command string":
Output:
Command Loop Test
Type the correct command string:
Input: wrong
Output:
Type the correct command string:
Input: still wrong
Output:
Type the correct command string:
Input: command string
Output:
you are correct
Return Code: 0
For reading the piped output, readline might work sometimes, but set /P INPUT in the batch file naturally isn't writing a line ending. So instead I used lambda: stdout.read(1) to read a byte at a time (not so efficient, but it works). The reading function puts the data on a queue. The main thread gets the output from the queue after it writes a command. Using a timeout on the get call here makes it wait a small amount of time to ensure the program is waiting for input. Instead you could check the output for prompts to know when the program is expecting input.
All that said, you can't expect a setup like this to work universally because the console program you're trying to interact with might buffer its output when piped. In Unix systems there are some utility commands available that you can insert into a pipe to modify the buffering to be non-buffered, line-buffered, or a given size -- such as stdbuf. There are also ways to trick the program into thinking it's connected to a pty (see pexpect). However, I don't know a way around this problem on Windows if you don't have access to the program's source code to explicitly set the buffering using setvbuf.
import subprocess
import threading
import time
import sys

if sys.version_info.major >= 3:
    import queue
else:
    import Queue as queue
    input = raw_input

def read_stdout(stdout, q):
    it = iter(lambda: stdout.read(1), b'')
    for c in it:
        q.put(c)
        if stdout.closed:
            break

_encoding = getattr(sys.stdout, 'encoding', 'latin-1')
def get_stdout(q, encoding=_encoding):
    out = []
    while 1:
        try:
            out.append(q.get(timeout=0.2))
        except queue.Empty:
            break
    return b''.join(out).rstrip().decode(encoding)

def printout(q):
    outdata = get_stdout(q)
    if outdata:
        print('Output:\n%s' % outdata)

if __name__ == '__main__':
    ARGS = ["shellcomm.bat"] ### Modify this

    #setup
    p = subprocess.Popen(ARGS, bufsize=0, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    q = queue.Queue()
    encoding = getattr(sys.stdin, 'encoding', 'utf-8')

    #for reading stdout
    t = threading.Thread(target=read_stdout, args=(p.stdout, q))
    t.daemon = True
    t.start()

    #command loop
    while 1:
        printout(q)
        if p.poll() is not None or p.stdin.closed:
            break
        cmd = input('Input: ')
        cmd = (cmd + '\n').encode(encoding)
        p.stdin.write(cmd)

    #tear down
    for n in range(4):
        rc = p.poll()
        if rc is not None:
            break
        time.sleep(0.25)
    else:
        p.terminate()
        rc = p.poll()
        if rc is None:
            rc = 1

    printout(q)
    print('\nReturn Code: %d' % rc)
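The answer mentions that, instead of relying on a timeout, you could check the output for prompts. A minimal Python 3 sketch of that idea follows. It assumes a small Python child that imitates the batch file (so it can run anywhere), and the names CHILD, read_until and drive are invented for the example.

```python
import subprocess
import sys

# Stand-in for shellcomm.bat: prompt without a newline until the right string is typed.
CHILD = r'''
import sys
while True:
    sys.stdout.write("Type the correct command string: ")
    sys.stdout.flush()
    line = sys.stdin.readline()
    if not line:                       # stdin closed: give up
        break
    if line.strip() == "command string":
        print("you are correct")
        break
'''

def read_until(stream, marker):
    """Read one byte at a time until the data ends with marker (or EOF)."""
    buf = b''
    while not buf.endswith(marker):
        c = stream.read(1)
        if not c:
            break
        buf += c
    return buf

def drive(answers, prompt=b'command string: '):
    """Send each answer only after the prompt has been seen; return (prompts, rest)."""
    p = subprocess.Popen([sys.executable, '-c', CHILD], bufsize=0,
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    seen = []
    for answer in answers:
        seen.append(read_until(p.stdout, prompt))
        p.stdin.write(answer.encode() + b'\n')
    p.stdin.close()
    rest = p.stdout.read()             # whatever the child printed before exiting
    p.stdout.close()
    p.wait()
    return seen, rest

if __name__ == '__main__':
    prompts, rest = drive(['wrong', 'still wrong', 'command string'])
    print(len(prompts), 'prompts seen;', rest.decode().strip())
```

Waiting for the prompt removes the guesswork of the 0.2 second timeout, but it only works when the prompt text is known in advance and the child flushes it, which is the same buffering caveat described above.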
During the runtime of a process I would like to read its stdout and write it to a file. Any attempt of mine however failed because no matter what I tried as soon as I tried reading from the stdout it blocked until the process finished.
Here is a snippet of what I am trying to do. (The first part is simply a python script that writes something to stdout.)
import subprocess

p = subprocess.Popen('python -c \'\
from time import sleep\n\
for i in range(3):\n\
    sleep(1)\n\
    print "Hello", i\
\'', shell = True, stdout = subprocess.PIPE)

while p.poll() == None:
    #read the stdout continuously
    pass

print "Done"
I know that there are multiple questions out there that deal with the same subject. However, none of the ones I found was able to answer my question.
What is happening is buffering on the writer side. Since you are writing such small chunks from the little code snippet the underlying FILE object is buffering the output until the end. The following works as you expect.
#!/usr/bin/python
import sys
import subprocess

p = subprocess.Popen("""python -c '
from time import sleep ; import sys
for i in range(3):
    sleep(1)
    print "Hello", i
    sys.stdout.flush()
'""", shell = True, stdout = subprocess.PIPE)

while True:
    inline = p.stdout.readline()
    if not inline:
        break
    sys.stdout.write(inline)
    sys.stdout.flush()

print "Done"
However, you may not be expecting the right thing. The buffering is there to reduce the number of system calls in order to make the system more efficient. Does it really matter to you that the whole text is buffered until the end before you write it to a file? Don't you still get all the output in the file?
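Since the question is about writing the output to a file while the process runs, here is a hedged Python 3 sketch of the same readline loop with a file as the destination. The name tee_to_file and the log file name are made up for the example.

```python
import os
import subprocess
import sys
import tempfile

def tee_to_file(args, path):
    """Copy the child's stdout to path line by line while it runs; return the exit code."""
    with open(path, 'wb') as log, \
         subprocess.Popen(args, stdout=subprocess.PIPE) as p:
        for line in iter(p.stdout.readline, b''):
            log.write(line)
            log.flush()    # make each line visible in the file right away
    return p.returncode    # set once the Popen context manager has waited

if __name__ == '__main__':
    child = ('from time import sleep\n'
             'for i in range(3):\n'
             '    sleep(1)\n'
             '    print("Hello", i, flush=True)\n')
    path = os.path.join(tempfile.mkdtemp(), 'child.log')
    print(tee_to_file([sys.executable, '-c', child], path), path)
```

The flush on the log file only matters if something else (a person running tail -f, say) reads the file while the child is still running; if nobody does, dropping it is the cheaper choice the answer argues for.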
The following code would print stdout line by line as the subprocess runs until the readline() method returns an empty string:
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, ''):
    print line
p.stdout.close()
print 'Done'
Update relating to your question better:
import subprocess

p = subprocess.Popen(['python'], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
p.stdin.write("""
from time import sleep ; import sys
for i in range(3):
    sleep(1)
    print "Hello", i
    sys.stdout.flush()
""")
p.stdin.close()
for line in iter(p.stdout.readline, ''):
    print line
p.stdout.close()
print 'Done'
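You can use Popen.communicate() to get the output from stdout. Note that communicate() waits for the process to terminate and returns all of its output at once, so nothing is printed while the process is still running. Something like:
while(p.poll() == None):
    #read the stdout (communicate() blocks until the process exits)
    print(p.communicate()[0])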
More info available at: http://docs.python.org/library/subprocess.html
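Because communicate() only returns once the child has exited, a hung child would block it forever. In Python 3.3 and later it accepts a timeout, and the pattern from the subprocess documentation is to kill the child and call communicate() again. A small sketch, with communicate_or_kill as an invented helper name:

```python
import subprocess
import sys

def communicate_or_kill(args, timeout):
    """Return (stdout, stderr, timed_out); kill the child if it exceeds timeout seconds."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, err = p.communicate(timeout=timeout)
        return out, err, False
    except subprocess.TimeoutExpired:
        p.kill()
        out, err = p.communicate()   # collect whatever was written before the kill
        return out, err, True

if __name__ == '__main__':
    hang = 'import time; print("started", flush=True); time.sleep(60)'
    print(communicate_or_kill([sys.executable, '-c', hang], timeout=2))
```

This detects a hang but still shows nothing until the timeout fires, so it complements the line-by-line approaches rather than replacing them.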