I have a script named 1st.py which creates a REPL (read-eval-print-loop):
print "Something to print"
while True:
r = raw_input()
if r == 'n':
print "exiting"
break
else:
print "continuing"
I then launched 1st.py with the following code:
p = subprocess.Popen(["python","1st.py"], stdin=PIPE, stdout=PIPE)
And then tried this:
print p.communicate()[0]
It failed, providing this traceback:
Traceback (most recent call last):
File "1st.py", line 3, in <module>
r = raw_input()
EOFError: EOF when reading a line
Can you explain what is happening here please? When I use p.stdout.read(), it hangs forever.
.communicate() writes input (there is no input in this case so it just closes subprocess' stdin to indicate to the subprocess that there is no more input), reads all output, and waits for the subprocess to exit.
The exception EOFError is raised in the child process by raw_input() (it expected data but got EOF (no data)).
p.stdout.read() hangs forever because it tries to read all output from the child at the same time as the child waits for input (raw_input()) that causes a deadlock.
To avoid the deadlock you need to read/write asynchronously (e.g., by using threads or select) or to know exactly when and how much to read/write, for example:
from subprocess import PIPE, Popen
p = Popen(["python", "-u", "1st.py"], stdin=PIPE, stdout=PIPE, bufsize=1)
print p.stdout.readline(), # read the first line
for i in range(10): # repeat several times to show that it works
print >>p.stdin, i # write input
p.stdin.flush() # not necessary in this case
print p.stdout.readline(), # read output
print p.communicate("n\n")[0], # signal the child to exit,
# read the rest of the output,
# wait for the child to exit
Note: it is a very fragile code if read/write are not in sync; it deadlocks.
Beware of block-buffering issue (here it is solved by using "-u" flag that turns off buffering for stdin, stdout in the child).
bufsize=1 makes the pipes line-buffered on the parent side.
Do not use communicate(input=""). It writes input to the process, closes its stdin and then reads all output.
Do it like this:
p=subprocess.Popen(["python","1st.py"],stdin=PIPE,stdout=PIPE)
# get output from process "Something to print"
one_line_output = p.stdout.readline()
# write 'a line\n' to the process
p.stdin.write('a line\n')
# get output from process "not time to break"
one_line_output = p.stdout.readline()
# write "n\n" to that process for if r=='n':
p.stdin.write('n\n')
# read the last output from the process "Exiting"
one_line_output = p.stdout.readline()
What you would do to remove the error:
all_the_process_will_tell_you = p.communicate('all you will ever say to this process\nn\n')[0]
But since communicate closes the stdout and stdin and stderr, you can not read or write after you called communicate.
Your second bit of code starts the first bit of code as a subprocess with piped input and output. It then closes its input and tries to read its output.
The first bit of code tries to read from standard input, but the process that started it closed its standard input, so it immediately reaches an end-of-file, which Python turns into an exception.
Related
I have one sh file, I need to install it in target linux box. So I'm in the process of writing automatic installation for the sh file which required lot of input from user. Example, first thing I made ./file.sh it will show a big paragaraph and ask user to press Enter. I'm stuck in this place. How to send key data to the sub process. Here is what I've tried.
import subprocess
def runProcess(exe):
global p
p = subprocess.Popen(exe, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while(True):
retcode = p.poll() #returns None while subprocess is running
line = p.stdout.readline()
yield line
if(retcode is not None):
break
for line in runProcess('./file.sh'.split()):
if '[Enter]' in line:
print line + 'got it'
p.communicate('\r')
Correct me if my understanding is wrong, pardon me if it is duplicate.
If you need to send a bunch of newlines and nothing else, you need to:
Make sure the stdin for the Popen is a pipe
Send the newlines without causing a deadlock
Your current code does neither. Something that might work (assuming they're not using APIs that require direct interaction in a tty, rather than just reading stdin):
import subprocess
import threading
def feednewlines(f):
try:
# Write as many newlines as it will take
while True:
f.write(b'\n') # Write newline, not carriage return
f.flush() # Flush to ensure it's sent as quickly as possible
except OSError:
return # Done when pipe closed/process exited
def runProcess(exe):
global p
# Get stdin as pipe too
p = subprocess.Popen(exe, stdin=subprocess.PIPE,
stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
# Use thread to just feed as many newlines as needed to stdin of subprocess
feeder = threading.Thread(target=feednewlines, args=(p.stdin,))
feeder.daemon = True
feeder.start()
# No need to poll, just read until it closes stdout or exits
for line in p.stdout:
yield line
p.stdin.close() # Stop feeding (causes thread to error and exit)
p.wait() # Cleanup process
# Iterate output, and echo when [Enter] seen
for line in runProcess('./file.sh'.split()):
if '[Enter]' in line:
print line + 'got it'
For the case where you need to customize the responses, you're going to need to add communication between parent and feeder thread, which makes this uglier, and it only works if the child process is properly flushing its output when it prompts you, even when not connected to a terminal. You might do something like this to define a global queue:
import queue # Queue on Python 2
feederqueue = queue.Queue()
then change the feeder function to:
def feednewlines(f):
try:
while True:
f.write(feederqueue.get())
f.flush()
except OSError:
return
and change the global code lower down to:
for line in runProcess('./file.sh'.split()):
if '[Enter]' in line:
print line + 'got it'
feederqueue.put(b'\n')
elif 'THING THAT REQUIRES YOU TO TYPE FOO' in line:
feederqueue.put(b'foo\n')
etc.
Command line programs run differently when they are run in a terminal verses when they are run in the background. If the program is attached to a terminal, they run in an interactive line mode expecting user interaction. If stdin is a file or a pipe, they run in block mode where writes are delayed until a certain block size is buffered. Your program will never see the [Enter] prompt because it uses pipes and the data is still in the subprocesses output buffer.
The python pexpect module solves this problem by emulating a terminal and allowing you to interact with the program with a series of "expect" statements.
Suppose we want to run a test program
#!/usr/bin/env python3
data = input('[Enter]')
print(data)
its pretty boring. It prompts for data, prints it, then exits. We can run it with pexpect
#!/usr/bin/env python3
import pexpect
# run the program
p = pexpect.spawn('./test.py')
# we don't need to see our input to the program echoed back
p.setecho(False)
# read lines until the desired program output is seen
p.expect(r'\[Enter\]')
# send some data to the program
p.sendline('inner data')
# wait for it to exit
p.expect(pexpect.EOF)
# show everything since the previous expect
print(p.before)
print('outer done')
We have created a commodity function used in many projects which uses subprocess to start a command. This function is as follows:
def _popen( command_list ):
p = subprocess.Popen( command_list, stdout=subprocess.PIPE,
stderr=subprocess.PIPE )
out, error_msg = p.communicate()
# Some processes (e.g. system_start) print a number of dots in stderr
# even when no error occurs.
if error_msg.strip('.') == '':
error_msg = ''
return out, error_msg
For most processes this works as intended.
But now I have to use it with a background-process which need to keep running as long as my python-script is running as well and thus now the fun starts ;-).
Note: the script also needs to start other non background-processes using this same _popen-function.
I know that by skipping p.communicate I can make the process start in the background, while my python script continues.
But there are 2 problems with this:
I need to check that the background process started correctly
While the main process is running I need to check the stdout and stderr of the background process from time to time without stopping the process / ending hanging in the background process.
Check background process started correctly
For 1 I currently adapted the _popen version to take an extra parameter 'skip_com' (default False) to skip the p.communicate call. And in that case I return the p-object i.s.o. out and error_msg.
This so I can check if the process is running directly after starting it up and if not call communicate on the p-object to check what the error_msg is.
MY_COMMAND_LIST = [ "<command that should go to background>" ]
def _popen( command_list, skip_com=False ):
p = subprocess.Popen( command_list, stdout=subprocess.PIPE,
stderr=subprocess.PIPE )
if not skip_com:
out, error_msg = p.communicate()
# Some processes (e.g. system_start) print a number of dots in stderr
# even when no error occurs.
if error_msg.strip('.') == '':
error_msg = ''
return out, error_msg
else:
return p
...
p = _popen( MY_COMMAND_LIST, True )
error = _get_command_pid( MY_COMMAND_LIST ) # checks if background command is running using _popen and ps -ef
if error:
_, error_msg = p.communicate()
I do not know if there is a better way to do this.
check stdout / stderr
For 2 I have not found a solution which does not cause the script to wait for the end of the background process.
The only ways I know to communicate is using iter on e.g. p.stdout.readline. But that will hang if the process is still running:
for line in iter( p.stdout.readline, "" ): print line
Any one an idea how to do this?
/edit/ I need to check the data I get from stdout and stderr seperately. Especially stderr is important in this case since if the background process encounters an error it will exit and I need to catch that in my main program to be able to prevent errors caused by that exit.
The stdout output is needed in some situations to check the expected behaviour in the background process and to react on that.
Update
The subprocess will actually exit if it encounters an error
If you don't need to read the output to detect an error then redirect it to DEVNULL and call .poll() to check child process' status from time to time without stopping the process.
assuming you have to read the output:
Do not use stdout=PIPE, stderr=PIPE unless you read from the pipes. Otherwise, the child process may hang as soon as any of the corresponding OS pipe buffers fill up.
If you want to start a process and do something else while it is running then you need a non-blocking way to read its output. A simple portable way is to use a thread:
def process_output(process):
with finishing(process): # close pipes, call .wait()
for line in iter(process.stdout.readline, b''):
if detected_error(line):
communicate_error(process, line)
process = Popen(command, stdout=PIPE, stderr=STDOUT, bufsize=1)
Thread(target=process_output, args=[process]).start()
I need to check the data I get from stdout and stderr seperately.
Use two threads:
def read_stdout(process):
with waiting(process), process.stdout: # close pipe, call .wait()
for line in iter(process.stdout.readline, b''):
do_something_with_stdout(line)
def read_stderr(process):
with process.stderr:
for line in iter(process.stderr.readline, b''):
if detected_error(line):
communicate_error(process, line)
process = Popen(command, stdout=PIPE, stderr=PIPE, bufsize=1)
Thread(target=read_stdout, args=[process]).start()
Thread(target=read_stderr, args=[process]).start()
You could put the code into a custom class (to group do_something_with_stdout(), detected_error(), communicate_error() methods).
It may be better or worse than what you imagine...
Anyway, the correct way of reading a pipe line by line is simply:
for line in p.stdout:
#process line is you want of just
print line
Or if you need to process that inside of a higher level loop
line = next(p.stdout)
But a harder problem could come from the commands started from Python. Many programs use the underlying C standard library, and by default stdout is a buffered stream. The system detects whether the standard output is connected to a terminal, and automatically flushes output on a new line (\n) or on a read on same terminal. But if output is connected to a pipe or a file, everything is buffered until the buffer is full, which on current systems requires several kBytes. In that case nothing can be done at Python level. Above code would get a full line as soon as it would written on the pipe, but cannot guess before callee has actually written something...
I have a script that runs another command, waits for it to finish, logs the stdout and stderr and based the return code does other stuff. Here is the code:
p = subprocess.Popen(command, stdin=subprocess.PIPE, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
o, e = p.communicate()
if p.returncode:
# report error
# do other stuff
The problem I'm having is that if command takes a long time to run none of the other actions get done. The possible errors won't get reported and the other stuff that needs to happen if no errors doesn't get done. It essentially doesn't go past p.communicate() if it takes too long. Some times this command can takes hours (or even longer) to run and some times it can take as little as 5 seconds.
Am I missing something or doing something wrong?
As per the documentation located here, it's safe to say that you're code is waiting for the subprocess to finish.
If you need to go do 'other things' while you wait you could create a loop like:
while p.poll():
# 'other things'
time.sleep(0.2)
Pick a sleep time that's reasonable for how often you want python to wake up and check the subprocess as well as doing its 'other things'.
The Popen.communicate waits for the process to finish, before anything is returned. Thus it is not ideal for any long running command; and even less so if the subprocess can hang waiting for input, say prompting for a password.
The stderr=subprocess.PIPE, stdout=subprocess.PIPE are needed only if you want to capture the output of the command into a variable. If you are OK with the output going to your terminal, then you can remove these both; and even use subprocess.call instead of Popen. Also, if you do not provide input to your subprocess, then do not use stdin=subprocess.PIPE at all, but direct that from the null device instead (in Python 3.3+ you can use stdin=subprocess.DEVNULL; in Python <3.3 use stdin=open(os.devnull, 'rb')
If you need the contents too, then instead of calling p.communicate(), you can read p.stdout and p.stderr yourself in chunks and output to the terminal, but it is a bit complicated, as it is easy to deadlock the program - the dummy approach would try to read from the subprocess' stdout while the subprocess would want to write to stderr. For this case there are 2 remedies:
you could use select.select to poll both stdout and stderr to see whichever becomes ready first and read from it then
or, if you do not care for stdout and stderr being combined into one,
you can use STDOUT to redirect the stderr stream into the stdout stream: stdout=subprocess.PIPE, stderr=subprocess.STDOUT; now all the output comes to p.stdout that you can read easily in loop and output the chunks, without worrying about deadlocks:
If the stdout, stderr are going to be huge, you can also spool them to a file right there in Popen; say,
stdout = open('stdout.txt', 'w+b')
stderr = open('stderr.txt', 'w+b')
p = subprocess.Popen(..., stdout=stdout, stderr=stderr)
while p.poll() is None:
# reading at the end of the file will return an empty string
err = stderr.read()
print(err)
out = stdout.read()
print(out)
# if we met the end of the file, then we can sleep a bit
# here to avoid spending excess CPU cycles just to poll;
# another option would be to use `select`
if not err and not out: # no input, sleep a bit
time.sleep(0.01)
I have noticed two different behaviors with two approaches that should have result in the same outcome.
The goal - to execute an external program using subprocess module, send some data and read the results.
The external program is PLINK, platform is WindowsXP, Python version 3.3.
The main idea-
execution=["C:\\Pr..\\...\\plink.exe", "-l", username, "-pw", "***", IP]
a=subprocess.Popen(execution, bufsize=0, stdout=PIPE, stdin=PIPE, stderr=STDOUT, shell=False)
con=a.stdout.readline()
if (con.decode("utf-8").count("FATAL ERROR: Network error: Connection timed out")==0):
a.stdin.write(b"con rout 1\n")
print(a.stdout.readline().decode("utf-8"))
a.stdin.write(b"infodf\n")
print(a.stdout.readline().decode("utf-8"))
else:
print("ERROR")
a.kill()
So far so good.
Now, I want to be able to do a loop (after each write to the sub process's stdin), that waits untill EOF of the sub process's stdout, print it, then another stdin command, and so on.
So I first tried what previous discussions about the same topic yield (live output from subprocess command, read subprocess stdout line by line, python, subprocess: reading output from subprocess) .
And it didnt work (it hangs forever) because the PLINK process is remaining alive untill I kill it myself, so there is no use of waiting for the stdout of the sub process to reach EOF or to do a loop while stdout is true because it will always be true until I kill it.
So I decided to read from stdout twice every time I am writing to stdin (good enought for me)-
execution=["C:\\Pr..\\...\\plink.exe", "-l", username, "-pw", "***", IP]
a=subprocess.Popen(execution, bufsize=0, stdout=PIPE, stdin=PIPE, stderr=STDOUT, shell=False)
con=a.stdout.readline()
if (con.decode("utf-8").count("FATAL ERROR: Network error: Connection timed out")==0):
a.stdin.write(b"con rout 1\n")
print(a.stdout.readline().decode("utf-8"))
print(a.stdout.readline().decode("utf-8")) //the extra line [1]
a.stdin.write(b"infodf\n")
print(a.stdout.readline().decode("utf-8"))
print(a.stdout.readline().decode("utf-8")) //the extra line [2]
else:
print("ERROR")
a.kill()
But the first extra readline() hangs forever, as far as I understand, for the same reason I mentioned. The first extra readline() waits forever for output, because the only output was already read in the first readline(), and because PLINK is alive, the function just "sit" there and waits for a new output line to get.
So I tried this code, expecting the same hang because PLINK never dies until i kill it-
execution=["C:\\Pr..\\...\\plink.exe", "-l", username, "-pw", "***", IP]
a=subprocess.Popen(execution, bufsize=0, stdout=PIPE, stdin=PIPE, stderr=STDOUT, shell=False)
con=a.stdout.readline()
if (con.decode("utf-8").count("FATAL ERROR: Network error: Connection timed out")==0):
a.stdin.write(b"con rout 1\n")
print(a.stdout.readline().decode("utf-8"))
a.stdin.write(b"infodf\n")
print(a.stdout.readline().decode("utf-8"))
print(a.communicate()[0].decode("utf-8")) //Popen.communicate() function
else:
print("ERROR")
a.kill()
I tried that because according to the documentation of communicate(), the function wait until the process is ended, and then it finishes. Also, it reads from stdout until EOF. (same as writing and reading stdout and stdin)
But communicate() finishes and does not hang, in opposite of the previous code block.
What am I missing here? why when using communicate() the PLINK ends, but when using readline() it does not?
Your program without communicate() deadlocks because both processes are waiting on each other to write something before they write anything more themselves.
communicate() does not deadlock in your example because it closes the stream, like the command a.stdin.close() would. This sends an EOF to your subprocess, letting it know that there is no more input coming, so it can close itself, which in turn closes its output, so a.stdout.read() eventually returns an EOF (empty string).
There is no special signal that your main process will receive from your subprocess to let you know that it is done writing the results from one command, but is ready for another command.
This means that to communicate back and forth with one subprocess like you're trying to, you must read the exact number of lines that the subprocess sends. Like you saw, if you try to read too many lines, you deadlock. You might be able to use what you know, such as the command you sent it, and the output you have seen so far, to figure out exactly how many lines to read.
You can use threads to write and read at the same time, especially if the output only needs to be printed to the user:
from threading import Thread
def print_remaining(stream):
for line in stream:
print(line.decode("utf-8"))
con = a.stdout.readline()
if "FATAL ERROR" not in con.decode("utf-8"):
Thread(target=print_remaining, args=[a.stdout]).start()
for cmd in LIST_OF_COMMANDS_TO_SEND:
a.stdin.write(cmd)
The thing is that when using subprocess.Popen your code continues to be read even before the process terminates. Try appending .wait() to your Popen call (see documentation),
a=subprocess.Popen(execution, bufsize=0, stdout=PIPE, stdin=PIPE, stderr=STDOUT, shell=False).wait()
This will ensure that the execution finishes before going on with anything else.
I'm trying to use subprocess to handle streams. I need to write data to the stream, and be able to read from it asynchronously (before the program dies, because mine's will take minutes to complete, however it products output).
For the learn case, I've been using the timeout command from Windows 7:
import subprocess
import time
args = ['timeout', '5']
p = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=False)
p.stdin.write('\n') # this is supposed to mimic Enter button pressed event.
while True:
print p.stdout.read() # expected this to print output interactively. This actually hungs.
time.sleep(1)
Where am I wrong?
This line:
print p.stdout.read() # expected this to print output interactively. This actually hungs.
hangs because read() means "read all data until EOF". See the documentation. It seems like you may have wanted to read a line at a time:
print p.stdout.readline()