Handling stdin and stdout

I'm trying to use subprocess to handle streams. I need to write data to the process's stdin and read from its stdout asynchronously (before the program exits, because mine will take minutes to complete, although it produces output along the way).
As a learning exercise, I've been using the timeout command from Windows 7:
import subprocess
import time
args = ['timeout', '5']
p = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=False)
p.stdin.write('\n') # this is supposed to mimic the Enter key being pressed
while True:
    print p.stdout.read() # expected this to print output interactively; it actually hangs
    time.sleep(1)
Where am I wrong?

This line:
    print p.stdout.read() # expected this to print output interactively; it actually hangs
hangs because read() means "read all data until EOF", and EOF only arrives once the child closes its stdout (normally when it exits). See the documentation. It seems like you may have wanted to read a line at a time:
    print p.stdout.readline()
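A minimal Python 3 sketch of the line-at-a-time approach follows. The child here is a hypothetical stand-in (a short Python one-off that prints three lines and flushes after each one), not the Windows timeout command, and read_lines is a name made up for this example.

```python
import subprocess
import sys

# Hypothetical child: prints three lines, flushing after each one.
CHILD_CODE = (
    "import sys, time\n"
    "for i in range(3):\n"
    "    print('tick', i)\n"
    "    sys.stdout.flush()\n"
    "    time.sleep(0.1)\n"
)

def read_lines(args):
    """Start args and return its output lines, echoing each one as it arrives."""
    lines = []
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    # readline() returns as soon as one full line is available;
    # it returns '' only at EOF, which ends the loop.
    for line in iter(proc.stdout.readline, ""):
        print(line, end="")
        lines.append(line.rstrip("\n"))
    proc.stdout.close()
    proc.wait()
    return lines

if __name__ == "__main__":
    read_lines([sys.executable, "-c", CHILD_CODE])
```

Note that this only shows output promptly if the child flushes its own stdout; a child that block-buffers its output will still appear silent until its buffer fills or it exits.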

Related

Python subprocess timing out?

I have a script that runs another command, waits for it to finish, logs the stdout and stderr, and based on the return code does other stuff. Here is the code:
p = subprocess.Popen(command, stdin=subprocess.PIPE, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
o, e = p.communicate()
if p.returncode:
    # report error
# do other stuff
The problem I'm having is that if the command takes a long time to run, none of the other actions get done. Possible errors aren't reported, and the other stuff that needs to happen when there are no errors doesn't happen either. It essentially doesn't get past p.communicate() if it takes too long. Sometimes this command can take hours (or even longer) to run and sometimes it can take as little as 5 seconds.
Am I missing something or doing something wrong?

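As per the documentation located here, it's safe to say that your code is waiting for the subprocess to finish.
If you need to go do 'other things' while you wait, you could create a loop like the one below. poll() returns None while the subprocess is still running and its return code once it has finished, so the loop has to test for None explicitly (a return code of 0 is falsy too):
while p.poll() is None:
    # 'other things'
    time.sleep(0.2)
Pick a sleep time that's reasonable for how often you want Python to wake up and check the subprocess as well as doing its 'other things'.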
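As a rough Python 3 sketch of such a polling loop: the function name run_while_working and the tick counter are made up for illustration, and the counter only stands in for whatever the 'other things' are.

```python
import subprocess
import sys
import time

def run_while_working(args, interval=0.2):
    """Run args, doing 'other things' until it exits; return (returncode, ticks)."""
    proc = subprocess.Popen(args)
    ticks = 0
    # poll() returns None while the child is running and the exit code
    # (possibly 0) once it has finished, hence the explicit `is None`.
    while proc.poll() is None:
        ticks += 1            # stand-in for the 'other things'
        time.sleep(interval)
    return proc.returncode, ticks

if __name__ == "__main__":
    # Hypothetical child that just sleeps for half a second.
    print(run_while_working([sys.executable, "-c", "import time; time.sleep(0.5)"]))
```

This version does not pipe the child's output at all. If stdout or stderr were set to PIPE and never read, a chatty child could fill the pipe and block, so the loop would never end.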

Popen.communicate waits for the process to finish before anything is returned. Thus it is not ideal for any long-running command, and even less so if the subprocess can hang waiting for input, say prompting for a password.
The stderr=subprocess.PIPE, stdout=subprocess.PIPE arguments are needed only if you want to capture the output of the command into a variable. If you are OK with the output going to your terminal, then you can remove both of them, and even use subprocess.call instead of Popen. Also, if you do not provide input to your subprocess, then do not use stdin=subprocess.PIPE at all, but redirect stdin from the null device instead (in Python 3.3+ you can use stdin=subprocess.DEVNULL; in Python <3.3 use stdin=open(os.devnull, 'rb')).
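A small Python 3.3+ sketch of that simpler setup, assuming you only need the return code: output goes straight to the terminal and stdin comes from the null device. The child command and the name run_without_input are hypothetical.

```python
import subprocess
import sys

# Hypothetical child: exits 0 if its stdin is already at EOF, 1 otherwise.
CHILD_CODE = "import sys; sys.exit(0 if sys.stdin.read() == '' else 1)"

def run_without_input(args):
    """Run args with stdin tied to the null device; output is not captured."""
    # subprocess.call waits for the child and returns its exit code.
    return subprocess.call(args, stdin=subprocess.DEVNULL)

if __name__ == "__main__":
    print(run_without_input([sys.executable, "-c", CHILD_CODE]))
```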
If you need the contents too, then instead of calling p.communicate(), you can read p.stdout and p.stderr yourself in chunks and output them to the terminal. This is a bit complicated, because it is easy to deadlock the program: the naive approach would block reading from the subprocess' stdout while the subprocess is itself blocked trying to write to a full stderr pipe. For this case there are 2 remedies:
1) you could use select.select to poll both stdout and stderr to see whichever becomes ready first, and read from that one;
2) or, if you do not mind stdout and stderr being combined into one, you can use STDOUT to redirect the stderr stream into the stdout stream: stdout=subprocess.PIPE, stderr=subprocess.STDOUT. Now all the output comes to p.stdout, which you can read in a simple loop and output in chunks, without worrying about deadlocks.
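The merged-stream remedy could look roughly like this in Python 3. This is only a sketch: run_merged and the child command are invented for the example, and os.read is used on the pipe's file descriptor so that each chunk is returned as soon as some data is available.

```python
import os
import subprocess
import sys

# Hypothetical child that writes to both streams.
CHILD_CODE = (
    "import sys\n"
    "sys.stdout.write('to stdout\\n')\n"
    "sys.stdout.flush()\n"
    "sys.stderr.write('to stderr\\n')\n"
    "sys.stderr.flush()\n"
)

def run_merged(args, chunk_size=4096):
    """Run args with stderr folded into stdout; return (returncode, output bytes)."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    fd = proc.stdout.fileno()
    chunks = []
    while True:
        # os.read returns as soon as some data is available (at most
        # chunk_size bytes) and returns b'' once the child closes the pipe.
        chunk = os.read(fd, chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
        sys.stdout.write(chunk.decode(errors="replace"))
        sys.stdout.flush()
    proc.stdout.close()
    return proc.wait(), b"".join(chunks)

if __name__ == "__main__":
    run_merged([sys.executable, "-c", CHILD_CODE])
```

Because there is only one pipe to drain, the parent can block on it safely; the price is that you can no longer tell which stream a given chunk came from.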
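If stdout and stderr are going to be huge, you can also spool them to files right there in Popen. Read them back through separate handles: a handle passed to Popen shares its file offset with the child, so reading through that same handle would start after whatever the child has just written. Say,
stdout = open('stdout.txt', 'wb')
stderr = open('stderr.txt', 'wb')
# separate read handles, each with its own file offset
stdout_reader = open('stdout.txt', 'rb')
stderr_reader = open('stderr.txt', 'rb')
p = subprocess.Popen(..., stdout=stdout, stderr=stderr)
while p.poll() is None:
    # reading at the end of the file will return an empty string
    err = stderr_reader.read()
    if err:
        print(err)
    out = stdout_reader.read()
    if out:
        print(out)
    # if we met the end of the file, then we can sleep a bit
    # here to avoid spending excess CPU cycles just to poll;
    # another option would be to use `select`
    if not err and not out: # no input, sleep a bit
        time.sleep(0.01)
# the child may have written more just before exiting
print(stderr_reader.read())
print(stdout_reader.read())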

Understanding Popen.communicate

I have a script named 1st.py which creates a REPL (read-eval-print-loop):
print "Something to print"
while True:
    r = raw_input()
    if r == 'n':
        print "exiting"
        break
    else:
        print "continuing"
I then launched 1st.py with the following code:
p = subprocess.Popen(["python","1st.py"], stdin=PIPE, stdout=PIPE)
And then tried this:
print p.communicate()[0]
It failed, providing this traceback:
Traceback (most recent call last):
  File "1st.py", line 3, in <module>
    r = raw_input()
EOFError: EOF when reading a line
Can you explain what is happening here please? When I use p.stdout.read(), it hangs forever.

.communicate() writes input (there is no input in this case so it just closes subprocess' stdin to indicate to the subprocess that there is no more input), reads all output, and waits for the subprocess to exit.
The exception EOFError is raised in the child process by raw_input() (it expected data but got EOF (no data)).
p.stdout.read() hangs forever because it tries to read all output from the child while the child is itself waiting for input (raw_input()); neither side can make progress, which is a deadlock.
To avoid the deadlock you need to read/write asynchronously (e.g., by using threads or select) or to know exactly when and how much to read/write, for example:
from subprocess import PIPE, Popen
p = Popen(["python", "-u", "1st.py"], stdin=PIPE, stdout=PIPE, bufsize=1)
print p.stdout.readline(), # read the first line
for i in range(10): # repeat several times to show that it works
    print >>p.stdin, i # write input
    p.stdin.flush() # not necessary in this case
    print p.stdout.readline(), # read output
print p.communicate("n\n")[0], # signal the child to exit,
                               # read the rest of the output,
                               # wait for the child to exit
Note: this code is very fragile. If the reads and writes get out of sync, it deadlocks.
Beware of block-buffering issues (here they are avoided by the "-u" flag, which turns off buffering for stdin and stdout in the child).
bufsize=1 makes the pipes line-buffered on the parent side.
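For the asynchronous route mentioned above (threads), here is one possible Python 3 sketch. A background thread pushes each output line onto a queue, so the main thread can wait with a timeout instead of blocking forever. The child is a Python 3 rewrite of 1st.py, and start_reader and talk are hypothetical helper names.

```python
import queue
import subprocess
import sys
import threading

# Python 3 stand-in for 1st.py (assumption: same behaviour as the original).
CHILD_CODE = (
    "print('Something to print', flush=True)\n"
    "while True:\n"
    "    r = input()\n"
    "    if r == 'n':\n"
    "        print('exiting', flush=True)\n"
    "        break\n"
    "    print('continuing', flush=True)\n"
)

def start_reader(stream):
    """Push each line of stream onto a queue from a background thread."""
    lines = queue.Queue()
    def pump():
        for line in iter(stream.readline, ""):
            lines.put(line)
        lines.put(None)  # sentinel: EOF reached
    threading.Thread(target=pump, daemon=True).start()
    return lines

def talk(inputs, timeout=5):
    """Send each input line to the child and collect one reply per line."""
    proc = subprocess.Popen([sys.executable, "-u", "-c", CHILD_CODE],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            universal_newlines=True, bufsize=1)
    lines = start_reader(proc.stdout)
    # get() raises queue.Empty after `timeout` seconds instead of hanging
    replies = [lines.get(timeout=timeout)]      # the greeting
    for text in inputs:
        proc.stdin.write(text + "\n")
        proc.stdin.flush()
        replies.append(lines.get(timeout=timeout))
    proc.stdin.close()
    proc.wait()
    return [r.strip() for r in replies if r is not None]

if __name__ == "__main__":
    print(talk(["a", "b", "n"]))
```

The design choice here is that a missing reply surfaces as a queue.Empty exception after the timeout, rather than as a silent deadlock.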

Do not use communicate(input=""). It writes input to the process, closes its stdin and then reads all output.
Do it like this:
p = subprocess.Popen(["python", "1st.py"], stdin=PIPE, stdout=PIPE)
# get the first output line from the process: "Something to print"
one_line_output = p.stdout.readline()
# write 'a line\n' to the process
p.stdin.write('a line\n')
# get the reply from the process: "continuing"
one_line_output = p.stdout.readline()
# write "n\n" to the process to trigger the `if r == 'n':` branch
p.stdin.write('n\n')
# read the last output line from the process: "exiting"
one_line_output = p.stdout.readline()
What you would do to remove the error:
all_the_process_will_tell_you = p.communicate('all you will ever say to this process\nn\n')[0]
But since communicate closes the stdout and stdin and stderr, you can not read or write after you called communicate.

Your second bit of code starts the first bit of code as a subprocess with piped input and output. It then closes its input and tries to read its output.
The first bit of code tries to read from standard input, but the process that started it closed its standard input, so it immediately reaches an end-of-file, which Python turns into an exception.
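The difference is easy to reproduce with a small Python 3 sketch. The child below is a hypothetical one-prompt script rather than 1st.py itself, and run_with_input is a name invented for this example.

```python
import subprocess
import sys

# Hypothetical child: reads one line and echoes it back.
CHILD_CODE = "r = input()\nprint('got', r)\n"

def run_with_input(data):
    """Run the child via communicate(data); return (returncode, stdout, stderr)."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    # communicate() writes data (if any), closes the child's stdin,
    # reads both output pipes to EOF, and waits for the child to exit.
    out, err = proc.communicate(data)
    return proc.returncode, out, err

if __name__ == "__main__":
    # No input: the child sees EOF on stdin and input() raises EOFError.
    print(run_with_input(None))
    # All input supplied up front: the child runs to completion.
    print(run_with_input("n\n"))
```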

How can I capture output and show it at the same time with Python?

I have a pretty long running job, which runs for several minutes and then gets restarted. The task outputs various information which I capture like this:
output = subprocess.Popen(cmd,stdout=subprocess.PIPE).communicate()
The thing is, I only get the entire output at once. I would like to show the output as the program sends it to stdout, while still collecting it in a buffer (I need to check the output for the presence of some strings). In Ruby I would do it like this:
IO.popen(cmd) do |io|
  io.each_line do |line|
    puts line
    buffer << line
  end
end

You can try something like this:
cmd = ["./my_program.sh"]
p = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE) # launch the process
while p.poll() is None: # check if the process is still alive
    out = p.stdout.readline() # if it is still alive, grab the output
    do_something_with(out) # do what you want with it

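You could read it one line at a time:
from subprocess import Popen, PIPE
# pass the command as a list; a single string would need shell=True on POSIX
p = Popen(['grep', '-ir', 'graph', '.'], stdout=PIPE)
# returncode stays None until poll() sees that the process has exited
while p.returncode is None:
    s = p.stdout.readline()
    print s, # the line already ends with a newline
    p.poll()
In this way, you are only blocking for the time it takes the process to output a single line.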
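A Python 3 sketch that comes close to the Ruby snippet from the question: each line is echoed as it arrives and also kept in a buffer for later inspection. The name run_and_capture and the child command are assumptions made for the example.

```python
import subprocess
import sys

def run_and_capture(args):
    """Echo the child's stdout line by line; return (returncode, all output)."""
    buffer = []
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            universal_newlines=True)
    # Iterating over the pipe yields each line as soon as it is complete
    # and stops at EOF, so trailing output is not lost when the child exits.
    for line in proc.stdout:
        sys.stdout.write(line)
        buffer.append(line)
    proc.stdout.close()
    return proc.wait(), "".join(buffer)

if __name__ == "__main__":
    # Hypothetical child printing two lines.
    rc, text = run_and_capture([sys.executable, "-c", "print('alpha'); print('beta')"])
    print("found beta:", "beta" in text)
```

Reading until EOF rather than until poll() reports an exit avoids dropping lines that were still in the pipe when the process finished.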

You can use the "tee" command. It does exactly what you need from it.
http://www.computerhope.com/unix/utee.htm

redirecting shell output using subprocess

I have a python script which calls a lot of shell functions. The script can be run interactively from a terminal, in which case I'd like to display output right away, or called by crontab, in which case I'd like to email error output.
I wrote a helper function for calling shell functions:
import subprocess
import shlex
import sys
def shell(cmdline, interactive=True):
    args = shlex.split(cmdline.encode("ascii"))
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    val = proc.communicate()
    if interactive is True:
        if proc.returncode:
            print "returncode " + str(proc.returncode)
            print val[1]
            sys.exit(1)
        else:
            print val[0]
    else:
        if proc.returncode:
            print ""
            # send email with val[0] + val[1]
if __name__ == "__main__":
    # example of command that produces non-zero returncode
    shell("ls -z")
The problem I'm having is two-fold.
1) In interactive mode, when the shell command takes a while to finish (e.g. few minutes), I don't see anything until the command is completely done since communicate() buffers output. Is there a way to display output as it comes in, and avoid buffering? I also need a way to check the returncode, which is why I'm using communicate().
2) Some shell commands I call can produce a lot of output (e.g. 2MB). The documentation for communicate() says "do not use this method if the data size is large or unlimited." Does anyone know how large is "large"?

1) When you use communicate, you capture the output of the subprocess so nothing is sent to your standard output. The only reason why you see the output when the subprocess is finished is because you print it yourself.
Since you want to either see it as it runs and not capture it or capture everything and do something with it only at the end, you can change the way it works in interactive mode by leaving stdout and stderr to None. This makes the subprocess use the same streams as your program. You'll also have to replace the call to communicate with a call to wait:
if interactive is True:
    proc = subprocess.Popen(args)
    proc.wait()
    if proc.returncode:
        print "returncode " + str(proc.returncode)
        sys.exit(1)
else:
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    val = proc.communicate()
    if proc.returncode:
        print ""
        # send email with val[0] + val[1]
2) Too large is "too large to store in memory", so it all depends on a lot of factors. If storing temporarily 2MB of data in memory is fine in your situation, then there's nothing to worry about.
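Putting both modes into one helper might look like this in Python 3. It is a sketch only: it returns the captured text instead of sending email, folds stderr into stdout in the non-interactive branch for simplicity, and the return convention (None for output in interactive mode) is an assumption of this example.

```python
import shlex
import subprocess
import sys

def shell(cmdline, interactive=True):
    """Run cmdline; return (returncode, output). output is None when interactive."""
    args = shlex.split(cmdline)
    if interactive:
        # stdout/stderr are left as None, so the child writes directly
        # to our own streams and output appears as it is produced.
        return subprocess.call(args), None
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    output, _ = proc.communicate()   # buffers everything in memory
    return proc.returncode, output

if __name__ == "__main__":
    # Hypothetical command that prints a line and exits with status 3.
    cmd = shlex.quote(sys.executable) + " -c " + shlex.quote(
        "import sys; print('hi'); sys.exit(3)")
    print(shell(cmd, interactive=False))
```

The caller decides what to do with a non-zero return code (exit, or email the captured output), which keeps the helper usable from both a terminal and cron.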

Python monitoring stderr and stdout of a subprocess

I'm trying to start a program (HandBrakeCLI) as a subprocess or thread from within Python 2.7. I have gotten as far as starting it, but I can't figure out how to monitor its stderr and stdout.
The program outputs its status (% done) and info about the encode to stderr and stdout, respectively. I'd like to be able to periodically retrieve the % done from the appropriate stream.
I've tried calling subprocess.Popen with stderr and stdout set to PIPE and using Popen.communicate, but it sits and waits until the process is killed or complete and only then retrieves the output. That doesn't do me much good.
I've got it up and running as a thread, but as far as I can tell I still have to eventually call subprocess.Popen to execute the program, and I run into the same wall.
Am I going about this the right way? What other options do I have, or how do I get this to work as described?

I have accomplished the same with ffmpeg. This is a stripped down version of the relevant portions. bufsize=1 means line buffering and may not be needed.
def Run(command):
    proc = subprocess.Popen(command, bufsize=1,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        universal_newlines=True)
    return proc
def Trace(proc):
    while proc.poll() is None:
        line = proc.stdout.readline()
        if line:
            # Process output here
            print 'Read line', line
proc = Run([ handbrakePath ] + allOptions)
Trace(proc)
Edit 1: I noticed that the subprocess (handbrake in this case) needs to flush after lines to use this (ffmpeg does).
Edit 2: Some quick tests reveal that bufsize=1 may not be actually needed.
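To pull the % done out of the merged stream, something along these lines could work in Python 3. The progress line format ("Encoding: 50.00 %") is invented for this sketch and is not taken from HandBrakeCLI or ffmpeg; watch_progress, PERCENT and the child script are likewise hypothetical.

```python
import re
import subprocess
import sys

# Hypothetical encoder: prints one progress line per step and flushes each time.
CHILD_CODE = (
    "import sys\n"
    "for pct in (0, 50, 100):\n"
    "    print('Encoding: %.2f %%' % pct)\n"
    "    sys.stdout.flush()\n"
)

# Assumed format: a number followed by a percent sign somewhere in the line.
PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*%")

def watch_progress(args, on_progress):
    """Run args and call on_progress(float) for every line that reports a percentage."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    for line in iter(proc.stdout.readline, ""):
        match = PERCENT.search(line)
        if match:
            on_progress(float(match.group(1)))
    proc.stdout.close()
    return proc.wait()

if __name__ == "__main__":
    watch_progress([sys.executable, "-c", CHILD_CODE],
                   lambda pct: print("progress:", pct))
```

As with Edit 1 above, this only reports progress in real time if the program being run flushes its output after each status line.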
