read subprocess stdout line by line - python

My python script uses subprocess to call a linux utility that is very noisy. I want to store all of the output to a log file and show some of it to the user. I thought the following would work, but the output doesn't show up in my application until the utility has produced a significant amount of output.
#fake_utility.py, just generates lots of output over time
import time
i = 0
while True:
    print hex(i)*512
    i += 1
    time.sleep(0.5)
#filters output
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
for line in proc.stdout:
    #the real code does filtering here
    print "test:", line.rstrip()
The behavior I really want is for the filter script to print each line as it is received from the subprocess. Sorta like what tee does but with python code.
What am I missing? Is this even possible?
Update:
If a sys.stdout.flush() is added to fake_utility.py, the code has the desired behavior in python 3.1. I'm using python 2.6. You would think that using proc.stdout.xreadlines() would work the same as py3k, but it doesn't.
Update 2:
Here is the minimal working code.
#fake_utility.py, just generates lots of output over time
import sys, time
for i in range(10):
    print i
    sys.stdout.flush()
    time.sleep(0.5)
#display output line by line
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
#works in python 3.0+
#for line in proc.stdout:
for line in iter(proc.stdout.readline,''):
    print line.rstrip()

I think the problem is with the statement for line in proc.stdout, which reads the entire input before iterating over it. The solution is to use readline() instead:
#filters output
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
while True:
    line = proc.stdout.readline()
    if not line:
        break
    #the real code does filtering here
    print "test:", line.rstrip()
Of course you still have to deal with the subprocess' buffering.
Note: according to the documentation the solution with an iterator should be equivalent to using readline(), except for the read-ahead buffer, but (or exactly because of this) the proposed change did produce different results for me (Python 2.5 on Windows XP).
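For Python 3, the tee-like behaviour the question asks for could be sketched roughly as follows. The helper name tee_filter, the log path and the keep predicate are illustrative assumptions rather than part of the answer above, and the inline child program is only a stand-in for the noisy utility (it is started with -u so that its own buffering does not hide the effect).

```python
import os
import subprocess
import sys
import tempfile


def tee_filter(cmd, log_path, keep):
    """Run cmd, log every output line to log_path, and print only lines accepted by keep()."""
    shown = []
    with open(log_path, "w", encoding="utf-8") as log, \
         subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, bufsize=1) as proc:
        for line in proc.stdout:       # in Python 3 this yields lines as they arrive
            log.write(line)            # the complete output goes to the log file
            if keep(line):             # only selected lines are shown to the user
                print("test:", line.rstrip())
                shown.append(line.rstrip())
    # Leaving the with-block waits for the child, so returncode is set here.
    return shown, proc.returncode


# Hypothetical noisy child: prints the numbers 0..5, one per line.
child = [sys.executable, "-u", "-c", "for i in range(6): print(i)"]
log_file = os.path.join(tempfile.mkdtemp(), "utility.log")
shown, rc = tee_filter(child, log_file, keep=lambda line: int(line) % 2 == 0)
```

The text=True argument needs Python 3.7 or later; on older Python 3 versions universal_newlines=True plays the same role.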

Bit late to the party, but was surprised not to see what I think is the simplest solution here:
import io
import subprocess
proc = subprocess.Popen(["prog", "arg"], stdout=subprocess.PIPE)
for line in io.TextIOWrapper(proc.stdout, encoding="utf-8"):  # or another encoding
    # do something with line
    pass
(This requires Python 3.)

Indeed, if you sorted out the iterator then buffering could now be your problem. You could tell the python in the sub-process not to buffer its output.
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
becomes
proc = subprocess.Popen(['python','-u', 'fake_utility.py'],stdout=subprocess.PIPE)
I have needed this when calling python from within python.
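As a small Python 3 sketch of the same idea, assuming the child is itself a Python program: besides the -u switch, setting the PYTHONUNBUFFERED environment variable for the child has the same effect. The inline child_code below is only a stand-in for fake_utility.py.

```python
import os
import subprocess
import sys

# Stand-in for fake_utility.py: prints three lines with a short pause between them.
child_code = "import time\nfor i in range(3):\n    print(i)\n    time.sleep(0.1)\n"

# Option 1: pass -u to the child interpreter.
with subprocess.Popen([sys.executable, "-u", "-c", child_code],
                      stdout=subprocess.PIPE, text=True) as proc:
    lines_u = [line.rstrip() for line in proc.stdout]

# Option 2: set PYTHONUNBUFFERED in the child's environment instead.
env = dict(os.environ, PYTHONUNBUFFERED="1")
with subprocess.Popen([sys.executable, "-c", child_code],
                      stdout=subprocess.PIPE, text=True, env=env) as proc:
    lines_env = [line.rstrip() for line in proc.stdout]
```

The environment variable can be handy when the Python interpreter is launched indirectly (for example through a wrapper script) and the command line cannot easily be changed.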

You want to pass these extra parameters to subprocess.Popen:
bufsize=1, universal_newlines=True
Then you can iterate as in your example. (Tested with Python 3.5)

A function that allows iterating over both stdout and stderr concurrently, in realtime, line by line
In case you need to get the output stream for both stdout and stderr at the same time, you can use the following function.
The function uses Queues to merge both Popen pipes into a single iterator.
Here we create the function read_popen_pipes():
from queue import Queue, Empty
from concurrent.futures import ThreadPoolExecutor
def enqueue_output(file, queue):
    for line in iter(file.readline, ''):
        queue.put(line)
    file.close()
def read_popen_pipes(p):
    with ThreadPoolExecutor(2) as pool:
        q_stdout, q_stderr = Queue(), Queue()
        pool.submit(enqueue_output, p.stdout, q_stdout)
        pool.submit(enqueue_output, p.stderr, q_stderr)
        while True:
            if p.poll() is not None and q_stdout.empty() and q_stderr.empty():
                break
            out_line = err_line = ''
            try:
                out_line = q_stdout.get_nowait()
            except Empty:
                pass
            try:
                err_line = q_stderr.get_nowait()
            except Empty:
                pass
            yield (out_line, err_line)
read_popen_pipes() in use:
import subprocess as sp
with sp.Popen(my_cmd, stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:
    for out_line, err_line in read_popen_pipes(p):
        # Do stuff with each line, e.g.:
        print(out_line, end='')
        print(err_line, end='')
    rc = p.poll()  # status code (use "return p.poll()" when this runs inside a function)
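A possible variation, sketched here under my own assumptions rather than taken from the answer: both pipes feed a single queue and each line is tagged with its source, so the consumer can block on the queue instead of polling. The names _pump and iter_tagged_lines are hypothetical, and the relative order of lines from the two pipes is only the order in which the reader threads happen to see them.

```python
import subprocess
import sys
import threading
from queue import Queue


def _pump(stream, tag, q):
    """Forward each line of stream to q as (tag, line); signal EOF with (tag, None)."""
    for line in stream:
        q.put((tag, line))
    q.put((tag, None))


def iter_tagged_lines(proc):
    """Yield ('stdout' | 'stderr', line) pairs until both pipes are closed."""
    q = Queue()
    threads = [
        threading.Thread(target=_pump, args=(proc.stdout, "stdout", q), daemon=True),
        threading.Thread(target=_pump, args=(proc.stderr, "stderr", q), daemon=True),
    ]
    for t in threads:
        t.start()
    open_streams = 2
    while open_streams:
        tag, line = q.get()        # blocks until a line or an EOF marker arrives
        if line is None:
            open_streams -= 1
        else:
            yield tag, line
    for t in threads:
        t.join()


# Stand-in child that writes one line to each stream.
child = [sys.executable, "-u", "-c",
         "import sys; print('out'); print('err', file=sys.stderr)"]
with subprocess.Popen(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                      text=True) as p:
    received = sorted(iter_tagged_lines(p))
```

Blocking on q.get() avoids yielding empty pairs while nothing is available, at the cost of the caller no longer getting control back between lines.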

You can also read lines w/o loop. Works in python3.6.
import os
import subprocess
process = subprocess.Popen(command, stdout=subprocess.PIPE)
list_of_byte_strings = process.stdout.readlines()

Python 3.5 added the run() function to the subprocess module; it returns a CompletedProcess object (the capture_output and text arguments used below need Python 3.7+). Because run() waits for the command to finish, you can then simply use proc.stdout.splitlines():
proc = subprocess.run( command, shell=True, capture_output=True, text=True, check=True )
for line in proc.stdout.splitlines():
    print("stdout:", line)
See also How to Execute Shell Commands in Python Using the Subprocess Run Method

I tried this with python3 and it worked, source
When you use Popen to spawn the child process, you tell the operating system to pipe the child's stdout so that the parent process can read it; here, the child's stderr is redirected into that same pipe (stderr=subprocess.STDOUT).
In output_reader we read the child's stdout line by line, wrapping readline in an iterator that yields each line as soon as it is ready.
import subprocess
import threading
import time
def output_reader(proc):
    for line in iter(proc.stdout.readline, b''):
        print('got line: {0}'.format(line.decode('utf-8')), end='')
def main():
    proc = subprocess.Popen(['python', 'fake_utility.py'],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    t = threading.Thread(target=output_reader, args=(proc,))
    t.start()
    try:
        time.sleep(0.2)
        i = 0
        while True:  # keeps the parent busy until interrupted (e.g. Ctrl-C)
            print(hex(i)*512)
            i += 1
            time.sleep(0.5)
    finally:
        proc.terminate()
        try:
            proc.wait(timeout=0.2)
            print('== subprocess exited with rc =', proc.returncode)
        except subprocess.TimeoutExpired:
            print('subprocess did not terminate in time')
    t.join()

The following modification of Rômulo's answer works for me on Python 2 and 3 (2.7.12 and 3.6.1):
import os
import subprocess
process = subprocess.Popen(command, stdout=subprocess.PIPE)
while True:
    line = process.stdout.readline()
    if line:  # an empty str (Python 2) or empty bytes (Python 3) means EOF
        os.write(1, line)
    else:
        break

I was having a problem with the arg list of Popen to update servers, the following code resolves this a bit.
import getpass
from subprocess import Popen, PIPE
username = 'user1'
ip = '127.0.0.1'
print ('What is the password?')
password = getpass.getpass()
cmd1 = f"""sshpass -p {password} ssh {username}@{ip}"""
cmd2 = f"""echo {password} | sudo -S apt update"""
cmd3 = " && "
cmd4 = f"""echo {password} | sudo -S apt upgrade -y"""
cmd5 = " && "
cmd6 = "exit"
commands = [cmd1, cmd2, cmd3, cmd4, cmd5, cmd6]
command = " ".join(commands)
cmd = command.split()
with Popen(cmd, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
    for line in p.stdout:
        print(line, end='')
And to run the update on a local computer, the following code example does this.
import getpass
from subprocess import Popen, PIPE
print ('What is the password?')
password = getpass.getpass()
cmd1_local = f"""apt update"""
cmd2_local = f"""apt upgrade -y"""
commands = [cmd1_local, cmd2_local]
with Popen(['echo', password], stdout=PIPE) as auth:
    for cmd in commands:
        cmd = cmd.split()
        with Popen(['sudo','-S'] + cmd, stdin=auth.stdout, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
            for line in p.stdout:
                print(line, end='')

Related

Python subprocess stdout iterator always blocks

I want to capture the output while printing it, but I'm blocking forever without reading even a single line. What's going on? I'm using Python2.
Generator script:
#!/usr/bin/env python2.7
import random
import time
while True:
    print(random.random())
    time.sleep(1)
Sample generator output:
$ ./generator.py
0.334835137212
0.896609571236
0.833267988558
0.55456332113
^CTraceback (most recent call last):
Reader script:
import subprocess
cmd = ['./generator.py']
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
for line in p.stdout:
    print(line)
    print('Looping')
p.wait()
I've also tried:
import subprocess
import sys
cmd = ['./generator.py']
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
while True:
    line = p.stdout.readline()
    print(line)
    print('Looping')
p.wait()
...and:
import sys
import subprocess
import select
import time
cmd = ['./generator.py']
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
s = select.poll()
s.register(p.stdout, select.POLLIN)
while True:
    if s.poll(1):
        line = p.stdout.read()
    else:
        p.poll()
        if p.returncode is not None:
            break
    print('Looping')
    time.sleep(1)
p.wait()
As @dhke mentioned, one of the issues is implicit output buffering in the producer. If you are able and willing to change the producer, and its output is produced by calls to the print function, just add flush=True as an argument to print (the flush keyword needs Python 3.3+). You can also fall back to calling sys.stdout.flush() at key points in the producer.
The second problem appears to be iterating over sys.stdout. This never seems to work for a long-running process. The second and third methods
I'm dealing with a similar problem. This is the workaround I'm currently using to prevent buffering.
proc = subprocess.Popen(['stdbuf', '-o0'] + cmd, stdout=subprocess.PIPE)
The disadvantage of this method is that it relies on an external Linux command (stdbuf) to solve the problem. Have a look at the comments here for a different, native Python approach to get rid of the PIPE buffering. Many thanks to @9000 for suggesting both solutions to me.

Displaying subprocess output to stdout and redirecting it

I'm running a script via Python's subprocess module. Currently I use:
p = subprocess.Popen('/path/to/script', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
result = p.communicate()
I then print the result to the stdout. This is all fine but as the script takes a long time to complete, I wanted real time output from the script to stdout as well. The reason I pipe the output is because I want to parse it.
To save subprocess' stdout to a variable for further processing and to display it while the child process is running as it arrives:
#!/usr/bin/env python3
from io import StringIO
from subprocess import Popen, PIPE
with Popen('/path/to/script', stdout=PIPE, bufsize=1,
           universal_newlines=True) as p, StringIO() as buf:
    for line in p.stdout:
        print(line, end='')
        buf.write(line)
    output = buf.getvalue()
rc = p.returncode
To save both subprocess's stdout and stderr is more complex because you should consume both streams concurrently to avoid a deadlock:
stdout_buf, stderr_buf = StringIO(), StringIO()
rc = teed_call('/path/to/script', stdout=stdout_buf, stderr=stderr_buf,
universal_newlines=True)
output = stdout_buf.getvalue()
...
where teed_call() is defined here.
Update: here's a simpler asyncio version.
Old version:
Here's a single-threaded solution based on child_process.py example from tulip:
import asyncio
import sys
from asyncio.subprocess import PIPE
@asyncio.coroutine
def read_and_display(*cmd):
    """Read cmd's stdout, stderr while displaying them as they arrive."""
    # start process
    process = yield from asyncio.create_subprocess_exec(*cmd,
            stdout=PIPE, stderr=PIPE)
    # read child's stdout/stderr concurrently
    stdout, stderr = [], [] # stderr, stdout buffers
    tasks = {
        asyncio.Task(process.stdout.readline()): (
            stdout, process.stdout, sys.stdout.buffer),
        asyncio.Task(process.stderr.readline()): (
            stderr, process.stderr, sys.stderr.buffer)}
    while tasks:
        done, pending = yield from asyncio.wait(tasks,
                return_when=asyncio.FIRST_COMPLETED)
        assert done
        for future in done:
            buf, stream, display = tasks.pop(future)
            line = future.result()
            if line: # not EOF
                buf.append(line) # save for later
                display.write(line) # display in terminal
                # schedule to read the next line
                tasks[asyncio.Task(stream.readline())] = buf, stream, display
    # wait for the process to exit
    rc = yield from process.wait()
    return rc, b''.join(stdout), b''.join(stderr)
The script runs the '/path/to/script' command and reads both its stdout and stderr line by line, concurrently. The lines are printed to the parent's stdout/stderr correspondingly and saved as bytestrings for future processing. To run the read_and_display() coroutine, we need an event loop:
import os
if os.name == 'nt':
    loop = asyncio.ProactorEventLoop() # for subprocess' pipes on Windows
    asyncio.set_event_loop(loop)
else:
    loop = asyncio.get_event_loop()
try:
    rc, *output = loop.run_until_complete(read_and_display("/path/to/script"))
    if rc:
        sys.exit("child failed with '{}' exit code".format(rc))
finally:
    loop.close()
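On current Python versions the generator-based coroutine style used above (@asyncio.coroutine with yield from) is no longer available, so the same idea has to be written with async/await. The following is only a sketch under that assumption, not the author's linked version; the helper name _read_stream and the inline child command are hypothetical, and lines are decoded before display so that it works with any text stream.

```python
import asyncio
import sys


async def _read_stream(stream, display, chunks):
    """Copy lines from stream to display as they arrive, keeping the raw bytes in chunks."""
    while True:
        line = await stream.readline()
        if not line:                      # empty bytes means EOF
            break
        chunks.append(line)               # save for later
        display.write(line.decode(errors="replace"))  # show in the terminal
        display.flush()


async def read_and_display(*cmd):
    """Run cmd, echo its stdout/stderr while it runs, and return (rc, stdout, stderr)."""
    process = await asyncio.create_subprocess_exec(
        *cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE)
    out, err = [], []
    # Read both pipes concurrently so that neither can fill up and block the child.
    await asyncio.gather(
        _read_stream(process.stdout, sys.stdout, out),
        _read_stream(process.stderr, sys.stderr, err))
    rc = await process.wait()
    return rc, b"".join(out), b"".join(err)


# Stand-in child that writes one line to each stream.
child = [sys.executable, "-c",
         "import sys; print('hello'); print('oops', file=sys.stderr)"]
rc, stdout_bytes, stderr_bytes = asyncio.run(read_and_display(*child))
```

asyncio.run() (Python 3.7+) creates and closes the event loop itself, so the explicit loop handling shown above is not needed in this form.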
p.communicate() waits for the subprocess to complete and then returns its entire output at once.
Have you tried something like this instead, where you read the subprocess output line-by-line?
p = subprocess.Popen('/path/to/script', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for line in p.stdout:
    # do something with this individual line
    print line
The Popen.communicate doc clearly states:
Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate
So if you need realtime output, you need to use something like this:
stream_p = subprocess.Popen('/path/to/script', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for stream_line in stream_p.stdout:
    #Parse it the way you want
    print stream_line
This prints both stdout and stderr to the terminal as well as saving both stdout and stderr into a variable:
from subprocess import Popen, PIPE, STDOUT
with Popen(args, stdout=PIPE, stderr=STDOUT, text=True, bufsize=1) as p:
    output = "".join([print(buf, end="") or buf for buf in p.stdout])
Depending on what exactly you are doing, two consequences may matter: with stderr=STDOUT the two streams are merged, so you can no longer tell stdout from stderr; and because the lines are echoed with print, everything is written to the parent's stdout, regardless of which stream it came from.
For Python < 3.7 you will need to use universal_newlines instead of text.
New in version 3.7: text was added as a more readable alias for universal_newlines.
Source: https://docs.python.org/3/library/subprocess.html#subprocess.Popen

Manipulating python stdout/stderr using subprocess

I am trying to manipulate/strip the output of a 7zip command and inform the user about the progress of the process. The sample code I am trying to use is below:
import subprocess
import sys
proc = subprocess.Popen(['7zip','arg', 'archive'],shell=True, stdin=None, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
    line = proc.stdout.readline()
    if line != '':
        #Do some stripping and update pyqt label
        print "test:", line.rstrip()
        sys.stdout.flush()
    else:
        break
However, the real problem is that the print statement only prints the output after the process has completed. Is there a way to capture the stdout line by line, then manipulate it and print it again?
Update
Updated the script to include sys.stdout.flush()
Yes, the popen family of calls.
You can see the documentation here
(child_stdin,
child_stdout,
child_stderr) = os.popen3("cmd", mode, bufsize)
==>
p = Popen("cmd", shell=True, bufsize=bufsize,
stdin=PIPE, stdout=PIPE, stderr=PIPE, close_fds=True)
(child_stdin,
child_stdout,
child_stderr) = (p.stdin, p.stdout, p.stderr)
they give you filedescriptors to the streams and you can use them to read the output of the called program.

How do I get all of the output from my .exe using subprocess and Popen?

I am trying to run an executable and capture its output using subprocess.Popen; however, I don't seem to be getting all of the output.
import subprocess as s
from subprocess import Popen
import os
ps = Popen(r'C:\Tools\Dvb_pid_3_0.exe', stdin = s.PIPE,stdout = s.PIPE)
print 'pOpen done..'
while True:
    line = ps.stdout.readline()
    print line
It prints two lines fewer than the original exe file does when opened manually.
I tried an alternative approach with the same result:
f = open('myprogram_output.txt','w')
proc = Popen('C:\Tools\Dvb_pid_3_0.exe ', stdout =f)
line = proc.stdout.readline()
print line
f.close()
Can anyone please help me to get the full data of the exe?
As asked by Sebastian:
Original exe file last few lines o/p:
-Gdd : Generic count (1 - 1000)
-Cdd : Cut start at (0 - 99)
-Edd : Cut end at (1 - 100)
Please select the stream file number below:
1 - .\pdsx100-bcm7230-squashfs-sdk0.0.0.38-0.2.6.0-prod.sao.ts
The o/p I get after running:
-P0xYYYY : Pid been interested
-S0xYYYY : Service ID been interested
-T0xYYYY : Transport ID been interested
-N0xYYYY : Network ID been interested
-R0xYYYY : A old Pid been replaced by this PID
-Gdd : Generic count (1 - 1000)
So we can see some lines are missing. I have to write 1 and choose a value after "Please select the stream file number below:" appears.
I tried to use ps.stdin.write('1\n'). It didn't print the value in the exe file
New code:
#!/usr/bin/env python
from subprocess import Popen, PIPE
cmd = r'C:\Tools\Dvb_pid_3_0.exe'
p = Popen(cmd, stdin=PIPE, stdout=None, stderr=None, universal_newlines=True)
stdout_text, stderr_text = p.communicate(input="1\n\n")
print("stdout: %r\nstderr: %r" % (stdout_text, stderr_text))
if p.returncode != 0:
    raise RuntimeError("%r failed, status code %d" % (cmd, p.returncode))
Thanks Sebastien. I am able to see the entire output but not able to feed in any input with the current code.
To get all stdout as a string:
from subprocess import check_output as qx
cmd = r'C:\Tools\Dvb_pid_3_0.exe'
output = qx(cmd)
To get both stdout and stderr as a single string:
from subprocess import STDOUT
output = qx(cmd, stderr=STDOUT)
To get all lines as a list:
lines = output.splitlines()
To get lines as they are being printed by the subprocess:
from subprocess import Popen, PIPE
p = Popen(cmd, stdout=PIPE, bufsize=1)
for line in iter(p.stdout.readline, ''):
    print line,
p.stdout.close()
if p.wait() != 0:
    raise RuntimeError("%r failed, exit status: %d" % (cmd, p.returncode))
Add stderr=STDOUT to the Popen() call to merge stdout/stderr.
Note: if cmd uses block-buffering in the non-interactive mode then lines won't appear until the buffer flushes. winpexpect module might be able to get the output sooner.
To save the output to a file:
import subprocess
with open('output.txt', 'wb') as f:
    subprocess.check_call(cmd, stdout=f)
# to read line by line
with open('output.txt') as f:
    for line in f:
        print line,
If cmd always requires input, even an empty one, set stdin:
import os
with open(os.devnull, 'rb') as DEVNULL:
    output = qx(cmd, stdin=DEVNULL) # use subprocess.DEVNULL on Python 3.3+
You could combine these solutions e.g., to merge stdout/stderr, and to save the output to a file, and to provide an empty input:
import os
from subprocess import STDOUT, check_call as x
with open(os.devnull, 'rb') as DEVNULL, open('output.txt', 'wb') as f:
    x(cmd, stdin=DEVNULL, stdout=f, stderr=STDOUT)
To provide all input as a single string you could use .communicate() method:
#!/usr/bin/env python
from subprocess import Popen, PIPE
cmd = ["python", "test.py"]
p = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True)
stdout_text, stderr_text = p.communicate(input="1\n\n")
print("stdout: %r\nstderr: %r" % (stdout_text, stderr_text))
if p.returncode != 0:
    raise RuntimeError("%r failed, status code %d" % (cmd, p.returncode))
where test.py:
print raw_input('abc')[::-1]
raw_input('press enter to exit')
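For Python 3, roughly the same exchange can be sketched with subprocess.run(), which accepts the input string directly. This is an illustrative equivalent rather than the answer's own code: the inline child_code stands in for test.py (using input() instead of raw_input()), and the input text "hello" is arbitrary.

```python
import subprocess
import sys

# Stand-in for test.py, written for Python 3: echo the first input line reversed,
# then wait for a second (empty) line before exiting.
child_code = "print(input('abc')[::-1]); input('press enter to exit')"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="hello\n\n",        # everything the child will read from stdin
    capture_output=True,      # collect stdout and stderr (Python 3.7+)
    text=True,                # work with str instead of bytes
)
if result.returncode != 0:
    raise RuntimeError("child failed, status code %d" % result.returncode)
```

As with communicate(), all input has to be known up front and the output only becomes available after the child exits, so this does not suit a back-and-forth conversation with the program.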
If your interaction with the program is more like a conversation, then you might need the winpexpect module. Here's an example from the pexpect docs:
# This connects to the openbsd ftp site and
# downloads the recursive directory listing.
from winpexpect import winspawn as spawn
child = spawn ('ftp ftp.openbsd.org')
child.expect ('Name .*: ')
child.sendline ('anonymous')
child.expect ('Password:')
child.sendline ('noah@example.com')
child.expect ('ftp> ')
child.sendline ('cd pub')
child.expect('ftp> ')
child.sendline ('get ls-lR.gz')
child.expect('ftp> ')
child.sendline ('bye')
To send special keys such as F3, F10 on Windows you might need SendKeys module or its pure Python implementation SendKeys-ctypes. Something like:
from SendKeys import SendKeys
SendKeys(r"""
{LWIN}
{PAUSE .25}
r
C:\Tools\Dvb_pid_3_0.exe{ENTER}
{PAUSE 1}
1{ENTER}
{PAUSE 1}
2{ENTER}
{PAUSE 1}
{F3}
{PAUSE 1}
{F10}
""")
It doesn't capture output.
The indentation of your question threw me off a bit, since Python is particular about that. Have you tried something as so:
import subprocess as s
from subprocess import Popen
import os
ps = Popen(r'C:\Tools\Dvb_pid_3_0.exe', stdin = s.PIPE,stdout = s.PIPE)
print 'pOpen done..'
(stdout, stderr) = ps.communicate()
print stdout
I think that stdout will be one single string of whatever you return from your command, so this may not be what you desire, since readline() presumes you want to view output line by line.
Would suggest poking around http://docs.python.org/library/subprocess.html for some uses that match what you are up to.

python, iterate on subprocess.Popen() stdout/stderr

There are a lot of similar posts, but I didn't find answer.
On GNU/Linux, with Python and the subprocess module, I use the following code to iterate over the stdout/stderr of a command launched with subprocess:
class Shell:
    """
    run a command and iterate over the stdout/stderr lines
    """
    def __init__(self):
        pass
    def __call__(self,args,cwd='./'):
        p = subprocess.Popen(args,
                cwd=cwd,
                stdout = subprocess.PIPE,
                stderr = subprocess.STDOUT,
                )
        while True:
            line = p.stdout.readline()
            self.code = p.poll()
            if line == '':
                if self.code != None:
                    break
                else:
                    continue
            yield line
#example of use
args = ["./foo"]
shell = Shell()
for line in shell(args):
    #do something with line
    print line,
This works fine... except if the command executed is python, for example args = ['python','foo.py'], in which case the output is not flushed but printed only when the command is finished.
Is there a solution?
Check out How to flush output of Python print?.
You need to run the python subprocess with the -u option:
-u     Force stdin, stdout and stderr to be totally unbuffered. On systems
       where it matters, also put stdin, stdout and stderr in binary
       mode. Note that there is internal buffering in xreadlines(),
       readlines() and file-object iterators ("for line in sys.stdin")
       which is not influenced by this option. To work around this, you
       will want to use "sys.stdin.readline()" inside a "while 1:" loop.
Or, if you have control over the python sub-process script you can use sys.stdout.flush() to flush the output every time you print.
import sys
sys.stdout.flush()
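To illustrate the effect of flushing in the child, here is a small Python 3 sketch. The inline child_code is a hypothetical stand-in for the sub-process script and uses print(..., flush=True), which has the same effect as calling sys.stdout.flush() after each print; the parent records when each line arrives.

```python
import subprocess
import sys
import time

# Hypothetical child script: flushes after every line and pauses in between.
child_code = (
    "import time\n"
    "print('first', flush=True)\n"
    "time.sleep(1.0)\n"
    "print('second', flush=True)\n"
)

arrivals = []                       # (line, seconds since start) for each line received
start = time.monotonic()
with subprocess.Popen([sys.executable, "-c", child_code],
                      stdout=subprocess.PIPE, text=True) as proc:
    for line in proc.stdout:
        arrivals.append((line.rstrip(), time.monotonic() - start))

for text, seconds in arrivals:
    print("%-6s arrived after %.2fs" % (text, seconds))
```

Without the flushes, the child's output to a pipe is typically block-buffered, so both lines would be expected to arrive together when the child exits.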
