Real-time read from subprocess.stdout on Windows - python

To emphasize: the problem is real-time reading, not non-blocking reading. It has been asked before, e.g. in subprocess.Popen.stdout - reading stdout in real-time (again), but no satisfactory solution has been proposed.
As an example, the following code tries to simulate the Python shell.
import subprocess

p = subprocess.Popen(['python'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
while True:
    line = input('>>> ')
    p.stdin.write(line.encode())
    print('>>> ', p.stdout.read().decode())
However, it blocks when reading from p.stdout. After searching around, I found the following two possible solutions:
using fcntl and O_NONBLOCK
using a thread and a queue
The first solution may work, but only on Linux. The second merely turns a blocking read into a non-blocking read, i.e. I still cannot get real-time output from the subprocess (a sketch of the thread-and-queue approach is shown below). For example, if I input 'print("hello")', I get nothing from p.stdout with the second solution.
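For reference, a minimal sketch of the thread-and-queue approach (some_command is a placeholder; a background thread blocks on readline() and the main thread polls a queue, which makes the read non-blocking but, as noted, does not make the child flush its output any sooner):
import subprocess
from queue import Queue, Empty
from threading import Thread

def enqueue_output(pipe, queue):
    # Background thread: block on readline() and hand each line to the queue.
    for raw in iter(pipe.readline, b''):
        queue.put(raw)
    pipe.close()

proc = subprocess.Popen(['some_command'], stdout=subprocess.PIPE)  # placeholder command
q = Queue()
Thread(target=enqueue_output, args=(proc.stdout, q), daemon=True).start()

try:
    line = q.get(timeout=0.1)  # poll the queue instead of blocking forever
except Empty:
    print('no output yet')
else:
    print(line.decode(), end='')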
Perhaps someone would suggest p.communicate. Unfortunately, it is not suitable in this case, since it closes stdin, as described here.
So, are there any solutions for Windows?
Edit: even if -u is turned on and p.stdout.read is replaced with p.stdout.readline, the problem still exists.
import subprocess

p = subprocess.Popen(['python', '-u'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
while True:
    line = input('>>> ')
    p.stdin.write(line.encode())
    p.stdin.flush()
    print('>>> ', p.stdout.readline().decode())
Solution: The following is the final code based on J.F. Sebastian's answer and comments.
from subprocess import Popen, PIPE, STDOUT

with Popen(
    ['python', '-i', '-q'],
    stdin=PIPE, stdout=PIPE, stderr=STDOUT,
    bufsize=0
) as process:
    while True:
        line = input('>>> ')
        if not line:
            break
        process.stdin.write((line + '\n').encode())
        print(process.stdout.readline().decode(), end='')
Note that the program will hang when a command produces no output.

Here's a complete working example that uses a subprocess interactively:
#!/usr/bin/env python3
import sys
from subprocess import Popen, PIPE, DEVNULL

with Popen([sys.executable, '-i'], stdin=PIPE, stdout=PIPE, stderr=DEVNULL,
           universal_newlines=True) as process:
    for i in range(10):
        print("{}**2".format(i), file=process.stdin, flush=True)
        square = process.stdout.readline()
        print(square, end='')
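A note on stderr=DEVNULL in this example: the interactive interpreter writes its >>> prompts to stderr, so discarding stderr keeps the prompts from getting mixed into the captured output.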
Here's another example: how to run [sys.executable, '-u', 'test.py'] interactively.

Related

Python subprocess.Popen stdout=subprocess.PIPE blocking execution [duplicate]

I'm using Python's subprocess.communicate() to read stdout from a process that runs for about a minute.
How can I print out each line of that process's stdout in a streaming fashion, so that I can see the output as it's generated, but still block on the process terminating before continuing?
subprocess.communicate() appears to give all the output at once.
To get subprocess' output line by line as soon as the subprocess flushes its stdout buffer:
#!/usr/bin/env python2
from subprocess import Popen, PIPE

p = Popen(["cmd", "arg1"], stdout=PIPE, bufsize=1)
with p.stdout:
    for line in iter(p.stdout.readline, b''):
        print line,
p.wait()  # wait for the subprocess to exit
iter() is used to read lines as soon as they are written, to work around the read-ahead bug in Python 2.
If the subprocess' stdout uses block buffering instead of line buffering in non-interactive mode (which delays the output until the child's buffer is full or is flushed explicitly by the child), you could try to force unbuffered output using the pexpect or pty modules, or the unbuffer, stdbuf, or script utilities; see Q: Why not just use a pipe (popen())?
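For example, a hedged sketch with stdbuf (POSIX-only; assumes the stdbuf utility is installed and that cmd is a stdio-based program that does not reset its own buffering):
#!/usr/bin/env python3
from subprocess import Popen, PIPE

# stdbuf -oL asks the child's C stdio to line-buffer stdout even when
# it is connected to a pipe rather than a terminal.
with Popen(['stdbuf', '-oL', 'cmd', 'arg1'], stdout=PIPE, bufsize=1,
           universal_newlines=True) as p:
    for line in p.stdout:
        print(line, end='')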
Here's Python 3 code:
#!/usr/bin/env python3
from subprocess import Popen, PIPE

with Popen(["cmd", "arg1"], stdout=PIPE, bufsize=1,
           universal_newlines=True) as p:
    for line in p.stdout:
        print(line, end='')
Note: unlike Python 2, which outputs the subprocess' bytestrings as is, Python 3 uses text mode (cmd's output is decoded using the locale.getpreferredencoding(False) encoding).
Please note, I think J.F. Sebastian's method (above) is better.
Here is a simple example (with no checking for errors):
import subprocess

proc = subprocess.Popen('ls',
                        shell=True,
                        stdout=subprocess.PIPE,
                        )
while proc.poll() is None:
    output = proc.stdout.readline()
    print output,
If ls ends too fast, then the while loop may end before you've read all the data.
You can catch the remainder in stdout this way:
output = proc.communicate()[0]
print output,
I believe the simplest way to collect output from a process in a streaming fashion is like this:
import sys
from subprocess import *

proc = Popen('ls', shell=True, stdout=PIPE)
while True:
    data = proc.stdout.readline()  # alternatively proc.stdout.read(1024)
    if len(data) == 0:
        break
    sys.stdout.write(data)  # sys.stdout.buffer.write(data) on Python 3.x
The readline() or read() function should only return an empty string on EOF, after the process has terminated; otherwise it blocks when there is nothing to read (readline() includes the newline, so on empty lines it returns "\n"). This avoids the need for an awkward final communicate() call after the loop.
On files with very long lines, read() may be preferable to reduce maximum memory usage; the number passed to it is arbitrary, but excluding it means reading the entire pipe output at once, which is probably not desirable.
If you want a non-blocking approach, don't use process.communicate(). If you set the subprocess.Popen() argument stdout to PIPE, you can read from process.stdout and check if the process still runs using process.poll().
If you're simply trying to pass the output through in real time, it's hard to get simpler than this:
import subprocess

# This will raise a CalledProcessError if the program returns a nonzero code.
# You can use call() instead if you don't care about that case.
subprocess.check_call(['ls', '-l'])
See the docs for subprocess.check_call().
If you need to process the output, sure, loop on it. But if you don't, just keep it simple.
Edit: J.F. Sebastian points out both that the defaults for the stdout and stderr parameters pass through to sys.stdout and sys.stderr, and that this will fail if sys.stdout and sys.stderr have been replaced (say, for capturing output in tests).
myCommand="ls -l"
cmd=myCommand.split()
# "universal newline support" This will cause to interpret \n, \r\n and \r equally, each as a newline.
p = subprocess.Popen(cmd, stderr=subprocess.PIPE, universal_newlines=True)
while True:
print(p.stderr.readline().rstrip('\r\n'))
Adding another Python 3 solution with a few small changes:
Allows you to catch the exit code of the shell process (I have been unable to get the exit code while using the with construct)
Also pipes stderr out in real time
import subprocess
import sys

def subcall_stream(cmd, fail_on_error=True):
    # Run a shell command, streaming output to STDOUT in real time
    # Expects a list-style command, e.g. ["docker", "pull", "ubuntu"]
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                         bufsize=1, universal_newlines=True)
    for line in p.stdout:
        sys.stdout.write(line)
    p.wait()
    exit_code = p.returncode
    if exit_code != 0 and fail_on_error:
        raise RuntimeError(f"Shell command failed with exit code {exit_code}. Command: `{cmd}`")
    return exit_code
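Usage might look like this (with ls -l as a stand-in command):
exit_code = subcall_stream(["ls", "-l"])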

Python Popen _with_ realtime input/output control

I have searched and experimented for over an hour on this and there doesn't seem to be a way to both do a 'here document' and get the output line by line as it occurs:
python = '''var="some character text"
print(var)
print(var)
exit()
'''
from subprocess import Popen, PIPE, STDOUT
import shlex

def run_process(command):
    p = Popen(shlex.split(command), stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    p.stdin.write(python)
    while True:
        output = p.stdout.readline()
        if output == '' and p.poll() is not None:
            break
        if output:
            print output.strip()
    rc = p.poll()
    return rc

run_process("/usr/bin/python")
The above code hangs indefinitely. Yes, it's a snake eating its tail, but it was just to prove the concept.
The problem is my subprocess takes a LONG time to run and I need to be able to see the output without waiting hours to figure out if anything is wrong. Any hints? Thanks.
The Python interpreter behaves differently when run in interactive vs. non-interactive mode. From the python(1) manual page:
In non-interactive mode, the entire input is parsed before it is executed.
Of course, “entire input” is delimited by EOF, and your program never sends an EOF, which is why it hangs.
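To illustrate (a minimal Python 3 sketch, reusing the python heredoc string from the question): closing stdin sends EOF, after which the child parses and runs the whole input, and its output can then be read as it is produced:
import sys
from subprocess import Popen, PIPE, STDOUT

p = Popen([sys.executable], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
p.stdin.write(python.encode())  # `python` is the heredoc string from the question
p.stdin.close()                 # EOF: the non-interactive interpreter now executes the input
for line in p.stdout:
    print(line.decode(), end='')
p.wait()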
Python runs in interactive mode if its stdin is a tty. You can use the Ptyprocess library to spawn a process with a tty as stdin. Or use the Pexpect library (based on Ptyprocess), which even includes ready-made REPL wrappers for Python and other programs.
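For instance, a hedged sketch with pexpect's ready-made REPL wrapper (a third-party package; POSIX-only):
from pexpect import replwrap

py = replwrap.python()                       # spawns `python` on a pseudo-terminal
print(py.run_command('var = "some text"'), end='')
print(py.run_command('print(var)'), end='')  # returns the REPL output as a string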
But if you replace Python with sed — which of course doesn’t have an interactive mode — the program still doesn’t work:
sed = '''this is a foo!\n
another foo!\n
'''
from subprocess import Popen, PIPE, STDOUT
import shlex

def run_process(command):
    p = Popen(shlex.split(command), stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    p.stdin.write(sed)
    while True:
        output = p.stdout.readline()
        if output == '' and p.poll() is not None:
            break
        if output:
            print output.strip()
    rc = p.poll()
    return rc

run_process("/bin/sed -e 's/foo/bar/g'")
This is caused by a different problem: output buffering in sed. Some programs have options to disable buffering. In particular, both sed and Python have a -u option, which solves this problem:
run_process("/bin/sed -ue 's/foo/bar/g'")

Python subprocess module, how do I give input to the first of series of piped commands?

I am trying to use Python's subprocess module. What I require is to send input to the first process whose output becomes the input of the second process.
The situation is basically almost the same as the example given in the documentation here:
http://docs.python.org/library/subprocess.html#replacing-shell-pipeline
except that I need to provide input the first command.
Here is that example copied:
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
If we change the first line to:
p1 = Popen(["cat"], stdout=PIPE, stdin=PIPE)
How do I provide the input string to the process?
If I attempt it by changing the final line to:
output = p2.communicate(input=inputstring)[0]
This doesn't work.
I do have a working version, which just stores the output of the first command in a string and then passes that to the second command. This isn't terrible as there is essentially no concurrency that can be exploited (in my actual use case the first command will exit rather quickly and produce all of its output at the end).
Here is the working version in full:
import subprocess

simple = """Writing some text
with some lines in which the
word line occurs but others
where it does
not
"""

def run():
    catcommand = ["cat"]
    catprocess = subprocess.Popen(catcommand,
                                  stdin=subprocess.PIPE,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE)
    (catout, caterr) = catprocess.communicate(input=simple)
    grepcommand = ["grep", "line"]
    grepprocess = subprocess.Popen(grepcommand,
                                   stdin=subprocess.PIPE,
                                   stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE)
    (grepout, greperr) = grepprocess.communicate(input=catout)
    print "--- output ----"
    print grepout
    print "--- error ----"
    print greperr

if __name__ == "__main__":
    run()
I hope I've been clear enough, thanks for any help.
If you do
from subprocess import Popen, PIPE
p1 = Popen(["cat"], stdout=PIPE, stdin=PIPE)
You should call p1.communicate("your input to p1"), and that will flow through the pipe.
stdin is the process's input, and you should communicate with that only.
The program you have given is absolutely fine; there seems to be no problem with it.
I assume that cat, grep are just example commands otherwise you could use a pure Python solution without subprocesses e.g.:
for line in simple.splitlines():
    if "line" in line:
        print(line)
Or if you want to use grep:
from subprocess import Popen, PIPE
output = Popen(['grep', 'line'], stdin=PIPE, stdout=PIPE).communicate(simple)[0]
print output,
You can pass the output of the first command to the second one without storing it in a string first:
from subprocess import Popen, PIPE
from threading import Thread
# start commands in parallel
first = Popen(first_command, stdin=PIPE, stdout=PIPE)
second = Popen(second_command, stdin=first.stdout, stdout=PIPE)
first.stdout.close() # notify `first` if `second` exits
first.stdout = None # avoid I/O on it in `.communicate()`
# feed input to the first command
Thread(target=first.communicate, args=[simple]).start() # avoid blocking
# get output from the second command at the same time
output = second.communicate()[0]
print output,
If you don't want to store all input/output in memory, you might need threads (to read/write in chunks without blocking) or a select loop (works on POSIX); a sketch of the latter follows.
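A hedged sketch of the select-loop variant (POSIX-only; cat stands in for the first command), interleaving writes to the child's stdin with reads from its stdout so neither side blocks the other:
import os
import select
import sys
from subprocess import Popen, PIPE

proc = Popen(['cat'], stdin=PIPE, stdout=PIPE)  # 'cat' as a placeholder command
to_send = b"some input\nmore input\n"

while True:
    writable = [proc.stdin] if to_send else []
    rlist, wlist, _ = select.select([proc.stdout], writable, [])
    if wlist:
        n = os.write(proc.stdin.fileno(), to_send)  # write as much as the pipe accepts
        to_send = to_send[n:]
        if not to_send:
            proc.stdin.close()  # EOF tells the child there is no more input
    if rlist:
        chunk = os.read(proc.stdout.fileno(), 4096)
        if not chunk:  # EOF: the child closed its stdout
            break
        sys.stdout.buffer.write(chunk)
proc.wait()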
If there are multiple commands, it might be more readable just to use the shell directly as suggested by #Troels Folke or use a library such as plumbum that hides all the gory details of emulating the shell by hand.
Hmm, why not mix in a bit of (ba)sh? :-)
from subprocess import Popen, PIPE
cproc = Popen('cat | grep line', stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=True)
out, err = cproc.communicate("this line has the word line in it")
BEWARE though:
This only works on systems that use a Bourne Shell compatible shell (like most *nix'es)
Using shell=True and putting user input in the command string is a bad idea, unless you escape the user input first. Read the subprocess docs -> "Frequently Used Arguments" for details.
This is ugly, non-portable, non-Pythonic and so on...
EDIT:
There is no need to use cat though, if all you want to do is grep. Just feed the input directly to grep, or better yet, use Python regular expressions.
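For example, a small sketch with the re module (matching the same lines as grep line, on the simple string defined in the question):
import re

# Keep the lines of `simple` that contain the word "line".
matches = [line for line in simple.splitlines() if re.search(r'line', line)]
print('\n'.join(matches))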
I faced a similar problem where I wanted to have several piped Popen.
This worked for me:
from subprocess import Popen, PIPE
p1 = Popen(["cat"], stdin=PIPE, stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdin.write(inputstring)
p1.stdin.close()
output = p2.communicate()[0]
However, the #jfs solution with threads seems more robust.
