I have searched and experimented for over an hour on this and there doesn't seem to be a way to both do a 'here document' and get the output line by line as it occurs:
python = '''var="some character text"
print(var)
print(var)
exit()
'''
from subprocess import Popen, PIPE, STDOUT
import shlex
def run_process(command):
    p = Popen(shlex.split(command), stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    p.stdin.write(python)
    while True:
        output = p.stdout.readline()
        if output == '' and p.poll() is not None:
            break
        if output:
            print output.strip()
    rc = p.poll()
    return rc
run_process("/usr/bin/python")
The above code hangs indefinitely. Yes, it's a snake eating its tail, but it was just to prove the concept.
The problem is my subprocess takes a LONG time to run and I need to be able to see the output without waiting hours to figure out if anything is wrong. Any hints? Thanks.
The Python interpreter behaves differently when run in interactive vs. non-interactive mode. From the python(1) manual page:
In non-interactive mode, the entire input is parsed before it is executed.
Of course, “entire input” is delimited by EOF, and your program never sends an EOF, which is why it hangs.
Python runs in interactive mode if its stdin is a tty. You can use the Ptyprocess library to spawn a process with a tty as stdin. Or use the Pexpect library (based on Ptyprocess), which even includes ready-made REPL wrappers for Python and other programs.
But if you replace Python with sed — which of course doesn’t have an interactive mode — the program still doesn’t work:
sed = '''this is a foo!\n
another foo!\n
'''
from subprocess import Popen, PIPE, STDOUT
import shlex
def run_process(command):
    p = Popen(shlex.split(command), stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    p.stdin.write(sed)
    while True:
        output = p.stdout.readline()
        if output == '' and p.poll() is not None:
            break
        if output:
            print output.strip()
    rc = p.poll()
    return rc
run_process("/bin/sed -e 's/foo/bar/g'")
This is caused by a different problem: output buffering in sed. Some programs have options to disable buffering. In particular, both sed and Python have a -u option, which solves this problem:
run_process("/bin/sed -ue 's/foo/bar/g'")
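As a minimal Python 3 sketch of both points above (closing stdin so the child sees EOF, and passing -u so the child does not buffer its output), something like the following should work. The helper name run_with_input is hypothetical, and the sketch assumes the input is small enough to fit in the pipe buffer; for large inputs, writing everything before reading anything can deadlock, so a writer thread or communicate() would be safer.

```python
import sys
from subprocess import Popen, PIPE, STDOUT


def run_with_input(argv, text):
    """Feed `text` to the child's stdin, close it (EOF), then yield output lines."""
    with Popen(argv, stdin=PIPE, stdout=PIPE, stderr=STDOUT,
               universal_newlines=True, bufsize=1) as p:
        p.stdin.write(text)
        p.stdin.close()  # this is the EOF the original program never sent
        for line in p.stdout:  # yields each line as soon as the child emits it
            yield line.rstrip("\n")


if __name__ == "__main__":
    script = 'var = "some character text"\nprint(var)\nprint(var)\n'
    for out in run_with_input([sys.executable, "-u"], script):
        print(out)
```

The same helper can be pointed at sed, for example run_with_input(["sed", "-ue", "s/foo/bar/g"], "this is a foo!\n"), assuming a sed that supports -u.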
Related
I'm using Python's subprocess.communicate() to read stdout from a process that runs for about a minute.
How can I print out each line of that process's stdout in a streaming fashion, so that I can see the output as it's generated, but still block on the process terminating before continuing?
subprocess.communicate() appears to give all the output at once.
To get subprocess' output line by line as soon as the subprocess flushes its stdout buffer:
#!/usr/bin/env python2
from subprocess import Popen, PIPE
p = Popen(["cmd", "arg1"], stdout=PIPE, bufsize=1)
with p.stdout:
    for line in iter(p.stdout.readline, b''):
        print line,
p.wait() # wait for the subprocess to exit
iter() is used to read lines as soon as they are written, to work around the read-ahead bug in Python 2.
If the subprocess' stdout uses block buffering instead of line buffering in non-interactive mode (which delays the output until the child's buffer is full or the child flushes it explicitly), you could try to force unbuffered output using the pexpect or pty modules, or the unbuffer, stdbuf, or script utilities; see Q: Why not just use a pipe (popen())?
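To illustrate the pty route mentioned above, here is a rough, Unix-only sketch in Python 3 that gives the child a pseudo-terminal as its stdout, so that a program which checks isatty() behaves as if it were run interactively. The name read_via_pty is hypothetical, and treating OSError on the master side as end of output is an assumption based on how Linux reports a closed slave (EIO).

```python
import os
import pty
import sys
from subprocess import Popen


def read_via_pty(argv):
    """Run argv with a pseudo-terminal as stdout and return everything it wrote."""
    master_fd, slave_fd = pty.openpty()
    p = Popen(argv, stdout=slave_fd)
    os.close(slave_fd)  # the child holds its own copy of the slave end
    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                break  # on Linux, reading after the slave closes raises EIO
            if not data:
                break  # other platforms signal EOF with an empty read
            chunks.append(data)
    finally:
        os.close(master_fd)
    p.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    code = "import sys; print(sys.stdout.isatty())"
    sys.stdout.write(read_via_pty([sys.executable, "-c", code]).decode())
```

Note that the terminal layer usually translates "\n" to "\r\n", so the captured bytes may contain carriage returns.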
Here's Python 3 code:
#!/usr/bin/env python3
from subprocess import Popen, PIPE
with Popen(["cmd", "arg1"], stdout=PIPE, bufsize=1,
           universal_newlines=True) as p:
    for line in p.stdout:
        print(line, end='')
Note: Unlike Python 2, which outputs the subprocess' bytestrings as is, Python 3 uses text mode here (cmd's output is decoded using the locale.getpreferredencoding(False) encoding).
Please note, I think J.F. Sebastian's method (below) is better.
Here is a simple example (with no checking for errors):
import subprocess
proc = subprocess.Popen('ls',
                        shell=True,
                        stdout=subprocess.PIPE,
                        )
while proc.poll() is None:
    output = proc.stdout.readline()
    print output,
If ls ends too fast, then the while loop may end before you've read all the data.
You can catch the remainder in stdout this way:
output = proc.communicate()[0]
print output,
I believe the simplest way to collect output from a process in a streaming fashion is like this:
import sys
from subprocess import *
proc = Popen('ls', shell=True, stdout=PIPE)
while True:
    data = proc.stdout.readline()   # Alternatively proc.stdout.read(1024)
    if len(data) == 0:
        break
    sys.stdout.write(data)   # sys.stdout.buffer.write(data) on Python 3.x
The readline() or read() function should only return an empty string on EOF, after the process has terminated - otherwise it will block if there is nothing to read (readline() includes the newline, so on empty lines, it returns "\n"). This avoids the need for an awkward final communicate() call after the loop.
On files with very long lines read() may be preferable to reduce maximum memory usage - the number passed to it is arbitrary, but excluding it results in reading the entire pipe output at once which is probably not desirable.
If you want a non-blocking approach, don't use process.communicate(). If you set the subprocess.Popen() argument stdout to PIPE, you can read from process.stdout and check if the process still runs using process.poll().
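One way to keep the main thread from blocking on process.stdout is to let a background thread do the blocking reads and hand lines over through a queue. This is a swapped-in technique rather than something the paragraph above spells out, and the names start_reader and collect_output are hypothetical. A minimal Python 3 sketch:

```python
import queue
import sys
import threading
from subprocess import Popen, PIPE


def start_reader(stream):
    """Return a queue that a daemon thread fills with lines; None marks EOF."""
    q = queue.Queue()

    def pump():
        for line in stream:
            q.put(line)
        q.put(None)  # sentinel: the child closed its stdout

    threading.Thread(target=pump, daemon=True).start()
    return q


def collect_output(argv, poll_interval=0.1):
    """Run argv and gather its lines without ever blocking for long."""
    with Popen(argv, stdout=PIPE, universal_newlines=True, bufsize=1) as p:
        q = start_reader(p.stdout)
        lines = []
        while True:
            try:
                line = q.get(timeout=poll_interval)
            except queue.Empty:
                continue  # nothing yet; other work could be done here
            if line is None:
                break
            lines.append(line)
        return lines, p.wait()


if __name__ == "__main__":
    print(collect_output([sys.executable, "-c", "print('a'); print('b')"]))
```

The queue only removes the blocking in the parent; if the child buffers its own output, lines still arrive late.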
If you're simply trying to pass the output through in realtime, it's hard to get simpler than this:
import subprocess
# This will raise a CalledProcessError if the program returns a nonzero code.
# You can use call() instead if you don't care about that case.
subprocess.check_call(['ls', '-l'])
See the docs for subprocess.check_call().
If you need to process the output, sure, loop on it. But if you don't, just keep it simple.
Edit: J.F. Sebastian points out both that the defaults for the stdout and stderr parameters pass through to sys.stdout and sys.stderr, and that this will fail if sys.stdout and sys.stderr have been replaced (say, for capturing output in tests).
import subprocess
myCommand = "ls -l"
cmd = myCommand.split()
# "Universal newline support": \n, \r\n and \r are each interpreted as a newline.
p = subprocess.Popen(cmd, stderr=subprocess.PIPE, universal_newlines=True)
while True:
    line = p.stderr.readline()
    if not line:  # an empty string means EOF; without this check the loop never ends
        break
    print(line.rstrip('\r\n'))
Adding another python3 solution with a few small changes:
Allows you to catch the exit code of the shell process (I have been unable to get the exit code while using the with construct)
Also pipes stderr out in real time
import subprocess
import sys
def subcall_stream(cmd, fail_on_error=True):
    # Run a shell command, streaming output to STDOUT in real time
    # Expects a list style command, e.g. `["docker", "pull", "ubuntu"]`
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, bufsize=1, universal_newlines=True)
    for line in p.stdout:
        sys.stdout.write(line)
    p.wait()
    exit_code = p.returncode
    if exit_code != 0 and fail_on_error:
        raise RuntimeError(f"Shell command failed with exit code {exit_code}. Command: `{cmd}`")
    return exit_code
To emphasize, the problem is real time read instead of non-blocking read. It has been asked before, e.g. subprocess.Popen.stdout - reading stdout in real-time (again). But no satisfactory solution has been proposed.
As an example, the following code tries to simulate the python shell.
import subprocess
p = subprocess.Popen(['python'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
while True:
    line = input('>>> ')
    p.stdin.write(line.encode())
    print('>>> ', p.stdout.read().decode())
However, it blocks when reading from p.stdout. After searching around, I found the following two possible solutions.
using fcntl and O_NONBLOCK
using thread and queue
The 1st solution may work, but only on Linux. The 2nd solution just turns a blocking read into a non-blocking read, i.e. I still cannot get real-time output from the subprocess. For example, if I input 'print("hello")', I get nothing from p.stdout using the 2nd solution.
Perhaps someone would suggest p.communicate. Unfortunately, it is not suitable in this case, since it would close stdin as described here.
So, are there any solutions for Windows?
Edited: Even if -u is turned on and p.stdout.read is replaced with p.stdout.readline, the problem still exists.
import subprocess
p = subprocess.Popen(['python', '-u'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
while True:
    line = input('>>> ')
    p.stdin.write(line.encode())
    p.stdin.flush()
    print('>>> ', p.stdout.readline().decode())
Solution: The following is the final code based on J.F. Sebastian's answer and comments.
from subprocess import Popen, PIPE, STDOUT
with Popen(
    ['python', '-i', '-q'],
    stdin=PIPE, stdout=PIPE, stderr=STDOUT,
    bufsize=0
) as process:
    while True:
        line = input('>>> ')
        if not line:
            break
        process.stdin.write((line+'\n').encode())
        print(process.stdout.readline().decode(), end='')
It should be noted that the program would hang when the command triggers no output.
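One possible way around the hang on commands that print nothing is to follow every command with a print of a unique marker and read until that marker shows up. This is a swapped-in technique, not part of the answer above; the names MARKER, start_python and ask are hypothetical, stderr is discarded here so that prompts do not mix into the replies, and the sketch only handles single-line statements.

```python
import sys
from subprocess import Popen, PIPE, DEVNULL

MARKER = "__END_OF_REPLY__"  # hypothetical sentinel; pick something unlikely in real output


def start_python():
    """Start an interactive, unbuffered Python child with piped stdin/stdout."""
    return Popen([sys.executable, "-i", "-u", "-q"],
                 stdin=PIPE, stdout=PIPE, stderr=DEVNULL,
                 universal_newlines=True)


def ask(process, line):
    """Send one statement and return whatever it printed, possibly nothing."""
    process.stdin.write(line + "\n")
    process.stdin.write("print({!r})\n".format(MARKER))
    process.stdin.flush()
    reply = []
    for out in iter(process.stdout.readline, ""):
        if out.strip() == MARKER:
            break  # the marker arrived, so the statement has finished
        reply.append(out)
    return "".join(reply)


if __name__ == "__main__":
    with start_python() as proc:
        print(repr(ask(proc, "x = 20")))
        print(repr(ask(proc, "print(x + 1)")))
```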
Here's a complete working example that uses a subprocess interactively:
#!/usr/bin/env python3
import sys
from subprocess import Popen, PIPE, DEVNULL
with Popen([sys.executable, '-i'], stdin=PIPE, stdout=PIPE, stderr=DEVNULL,
           universal_newlines=True) as process:
    for i in range(10):
        print("{}**2".format(i), file=process.stdin, flush=True)
        square = process.stdout.readline()
        print(square, end='')
Here's another example: how to run [sys.executable, '-u', 'test.py'] interactively.
I'm running a perl script that accepts a file as input from Python using subprocess.Popen(). I now need the script to take its input from standard input instead of a file. If I run the perl script from the shell like this:
perl thescript.perl --in /dev/stdin --other_args other_values
It works perfectly. However, in python, nothing happens using the following commands:
mytext = "hi there"
args = ["perl", "myscript.perl", "--in", "/dev/stdin", "--other_args", other_values]
pipe = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
result = pipe.communicate(input=mytext.encode("utf8"))[0]
result always returns empty (I've also tried using pipe.stdin.write(mytext) and result=pipe.stdout.read())
Please let me know what I'm doing wrong.
Thanks to the comments by @J.F.Sebastian above, I managed to solve this problem with echo and pipes.
args = ["perl", "myscript.perl", "--in", "/dev/stdin", "other_args", other_vals]
pipe1 = subprocess.Popen(["echo", mytext], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
pipe2 = subprocess.Popen(args, stdin=pipe1.stdout, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
pipe1.stdout.close()
result = pipe2.communicate()[0]
Which returns the expected output. Still not sure why the original (posted in the question) didn't work (using communicate to send the text to the stdin)
/dev/stdin should work (if it works in the shell on your system):
>>> from subprocess import Popen, PIPE
>>> import sys
>>> p = Popen([sys.executable, '-c', 'print(open("/dev/stdin").read()[::-1])'],
... stdin=PIPE, stdout=PIPE)
>>> p.communicate(b'ab')[0]
'ba\n'
stdin=PIPE creates a pipe and connects it to the child process' stdin. Reading from /dev/stdin is equivalent to reading from the standard input (0 fd) and therefore the child reads from the pipe here as shown in the example.
I am developing a robot that accepts commands from network (XMPP) and uses subprocess module in Python to execute them and sends back the output of commands. Essentially it is an SSH-like XMPP-based non-interactive shell.
The robot only executes commands from authenticated trusted sources, so arbitrary shell commands are allowed (shell=True).
However, when I accidentally send some command that needs a tty, the robot is stuck.
For example:
subprocess.check_output(['vim'], shell=False)
subprocess.check_output('vim', shell=True)
If either of the above commands is received, the robot gets stuck, and the terminal from which the robot was run is broken.
Though the robot only receives commands from authenticated trusted sources, humans err. How could I make the robot filter out those commands that will break itself? I know there is os.isatty but how could I utilize it? Is there a way to detect those "bad" commands and refuse to execute them?
TL;DR:
Say, there are two kinds of commands:
Commands like ls: do not need a tty to run.
Commands like vim: need a tty; break subprocess if no tty is given.
How could I tell whether a command is ls-like or vim-like, and refuse to run it if it is vim-like?
What you want is a function that receives a command as input and returns meaningful output by running it.
Since the command is arbitrary, needing a tty is just one of many bad cases that may happen (another is running an infinite loop). Your function should only be concerned with how long the command runs; in other words, whether a command is “bad” should be determined by whether it ends within a limited time. Since subprocess is asynchronous by nature, you can just run the command and handle it at a higher level.
Here is demo code to play with; you can change the cmd value to see how it behaves differently:
#!/usr/bin/env python
# coding: utf-8
import time
import subprocess
from subprocess import PIPE
#cmd = ['ls']
#cmd = ['sleep', '3']
cmd = ['vim', '-u', '/dev/null']
print 'call cmd'
# Pass the argument list directly: with shell=True and a list,
# only cmd[0] would be run and the other items would go to the shell itself.
p = subprocess.Popen(cmd,
                     stdin=PIPE, stderr=PIPE, stdout=PIPE)
print 'called', p
time_limit = 2
timer = 0
time_gap = 0.2
ended = False
while True:
    time.sleep(time_gap)
    returncode = p.poll()
    print 'process status', returncode
    timer += time_gap
    if timer >= time_limit:
        print 'timeout, kill process'
        p.kill()
        p.wait()  # reap the killed child
        break
    if returncode is not None:
        ended = True
        break
if ended:
    print 'process ended by', returncode
    print 'read'
    out, err = p.communicate()
    print 'out', repr(out)
    print 'error', repr(err)
else:
    print 'process failed'
Three points are notable in the above code:
We use Popen instead of check_output to run the command. Unlike check_output, which waits for the process to end, Popen returns immediately, so we can do further things to control the process.
We implement a timer to check the process's status. If it runs for too long, we kill it manually, because we consider a process not meaningful if it cannot end within a limited time. This solves your original problem: vim will never end on its own, so it will definitely be killed as an “unmeaningful” command.
After the timer has helped us filter out bad commands, we can get the stdout and stderr of the command by calling the communicate method of the Popen object. After that, it's your choice what to return to the user.
Conclusion
tty simulation is not needed. We should run the subprocess asynchronously, then control it with a timer to determine whether it should be killed. For commands that end normally, it's safe and easy to get the output.
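On Python 3, roughly the same timer idea can be expressed with the timeout parameter of subprocess.run, which kills the child and raises TimeoutExpired when the limit is reached. The following is only a sketch: the name run_with_timeout is hypothetical, and connecting stdin to DEVNULL is an added assumption meant to keep the child away from the caller's terminal.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Return (returncode, output); returncode is None if the command was killed."""
    try:
        done = subprocess.run(argv,
                              stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT,
                              timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        # run() has already killed the child; output may be partial or missing
        return None, exc.output or b""
    return done.returncode, done.stdout


if __name__ == "__main__":
    print(run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=10))
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.5))
```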
Well, SSH is already a tool that will allow users to remotely execute commands and be authenticated at the same time. The authentication piece is extremely tricky, please be aware that building the software you're describing is a bit risky from a security perspective.
There isn't a way to determine in advance whether a process is going to need a tty or not. And os.isatty won't help here: it only tells you whether a given file descriptor is attached to a tty, not whether a command you are about to run will need one. :)
In general, it would probably be safer from a security perspective and also a solution to this problem if you were to consider a white list of commands. You could choose that white list to avoid things that would need a tty, because I don't think you'll easily get around this.
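A white list check along those lines could look like the sketch below. The set ALLOWED_COMMANDS and the function is_allowed are hypothetical, and the command names in the set are placeholders rather than a recommendation.

```python
import shlex

# Hypothetical white list; fill it with commands known not to need a tty.
ALLOWED_COMMANDS = {"ls", "df", "uptime"}


def is_allowed(command_line):
    """Return True only if the first word of the command is on the white list."""
    try:
        argv = shlex.split(command_line)
    except ValueError:
        return False  # unbalanced quotes and similar parse errors
    return bool(argv) and argv[0] in ALLOWED_COMMANDS


if __name__ == "__main__":
    for candidate in ("ls -l /", "vim", ""):
        print(repr(candidate), is_allowed(candidate))
```

A check like this only makes sense if the command is then run as the parsed argument list without shell=True; with a shell, something like "ls /; vim" would pass the check and still start vim.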
Thanks a lot for @J.F. Sebastian's help (see comments under the question), I've found a solution (workaround?) for my case.
The reason why vim breaks the terminal while ls does not is that vim needs a tty. As Sebastian says, we can feed vim a pty using pty.openpty(). Feeding it a pty guarantees the command will not break the terminal, and we can add a timeout to auto-kill such processes. Here is a (dirty) working example:
#!/usr/bin/env python3
import pty
from subprocess import STDOUT, check_output, TimeoutExpired
master_fd, slave_fd = pty.openpty()
try:
    output1 = check_output(['ls', '/'], stdin=slave_fd, stderr=STDOUT, universal_newlines=True, timeout=3)
    print(output1)
except TimeoutExpired:
    print('Timed out')
try:
    output2 = check_output(['vim'], stdin=slave_fd, stderr=STDOUT, universal_newlines=True, timeout=3)
    print(output2)
except TimeoutExpired:
    print('Timed out')
Note it is stdin that we need to take care of, not stdout or stderr.
You can refer to my answer at https://stackoverflow.com/a/43012138/3555925, which uses a pseudo-terminal to make stdout non-blocking, and uses select to handle stdin/stdout.
I just changed the command variable to 'vim', and the script works fine.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import sys
import select
import termios
import tty
import pty
from subprocess import Popen
command = 'vim'
# save original tty setting then set it to raw mode
old_tty = termios.tcgetattr(sys.stdin)
tty.setraw(sys.stdin.fileno())
# open pseudo-terminal to interact with subprocess
master_fd, slave_fd = pty.openpty()
# use os.setsid() to make the child the leader of a new session, or bash job control will not be enabled
p = Popen(command,
          preexec_fn=os.setsid,
          stdin=slave_fd,
          stdout=slave_fd,
          stderr=slave_fd,
          universal_newlines=True)
while p.poll() is None:
    r, w, e = select.select([sys.stdin, master_fd], [], [])
    if sys.stdin in r:
        d = os.read(sys.stdin.fileno(), 10240)
        os.write(master_fd, d)
    elif master_fd in r:
        o = os.read(master_fd, 10240)
        if o:
            os.write(sys.stdout.fileno(), o)
# restore tty settings back
termios.tcsetattr(sys.stdin, termios.TCSADRAIN, old_tty)
I'm trying to execute a system command with subprocess and reading the output.
But if the command takes more than 10 seconds I want to kill the subprocess.
I've tried doing this in several ways.
My last try was inspired by this post: https://stackoverflow.com/a/3326559/969208
Example:
import os
import signal
from subprocess import Popen, PIPE
class Alarm(Exception):
    pass
def alarm_handler(signum, frame):
    raise Alarm
def pexec(args):
    p = Popen(args, stdout=PIPE, stderr=PIPE)
    signal.signal(signal.SIGALRM, alarm_handler)
    signal.alarm(10)
    stdout = stderr = ''
    try:
        stdout, stderr = p.communicate()
        signal.alarm(0)
    except Alarm:
        try:
            os.kill(p.pid, signal.SIGKILL)
        except:
            pass
    return (stdout, stderr)
The problem is: After the program exits no chars are shown in the cli until I hit return. And hitting return will not give me a new line.
I suppose this has something to do with the stdout and stderr pipe.
I've tried flushing and reading from the pipe (p.stdout.flush())
I've also tried with different Popen args, but might've missed something. Just thought I'd keep it simple here.
I'm running this on a Debian server.
Am I missing something here?
EDIT:
It seems this is only the case when killing an ongoing ffmpeg process. If the ffmpeg process exits normally before 10 seconds, there is no problem at all.
I've tried executing a couple of different commands that take longer than 10 seconds: one that prints output, one that doesn't, and an ffmpeg command to check the integrity of a file.
args = ['sleep', '12s'] # Works fine
args = ['ls', '-R', '/var'] # Works fine, prints lots for a long time
args = ['ffmpeg', '-v', '1', '-i', 'large_file.mov','-f', 'null', '-'] # Breaks cli output
I believe ffmpeg prints using \r and prints everything on the stderr pipe. Can this be the cause? Any ideas how to fix it?
Well, your code works fine on my Ubuntu server
(which is a close cousin or brother of Debian, I suppose).
I added a few more lines so that I could test your code.
import os
import signal
from subprocess import Popen, PIPE
class Alarm(Exception):
    pass
def alarm_handler(signum, frame):
    raise Alarm
def pexec(args):
    p = Popen(args, stdout=PIPE, stderr=PIPE)
    signal.signal(signal.SIGALRM, alarm_handler)
    signal.alarm(1)
    stdout = stderr = ''  # defaults, so the return below also works after a timeout
    try:
        stdout, stderr = p.communicate()
        signal.alarm(0)
    except Alarm:
        print "Done!"
        try:
            os.kill(p.pid, signal.SIGKILL)
        except:
            pass
    return (stdout, stderr)
args = ('find', '/', '-name','*')
stdout = pexec(args)
print "----------------------result--------------------------"
print stdout
print "----------------------result--------------------------"
Works like a charm.
If this code works on your server, I guess the problem actually lies in
the command-line application that you are trying to retrieve data from.
I have the same problem. I can't get a running FFmpeg to terminate gracefully from a python subprocess, so I am using <process>.kill(). However I think this means FFmpeg does not restore the mode of the tty properly (as described here: https://askubuntu.com/a/172747)
You can get your shell back by running reset at the bash prompt, but that clears the screen so you can't see your script's output as you continue to work.
Better is to run stty echo which turns echoing back on for your shell session.
You can even run this in your script after you've nuked FFmpeg. I am doing:
ffmpeg_popen.kill()
ffmpeg_popen.wait()
subprocess.call(["stty", "echo"])
This works for me on Ubuntu with bash as my shell. YMMV, but I hope it helps. It smells hacky but it's the best solution I've found.
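An alternative to calling stty afterwards is to save the terminal attributes before starting the child and restore them once it is gone, using the standard termios module. This is a Unix-only sketch under the assumption that the parent's stdin is the terminal the child would otherwise leave in a bad state; the name run_and_restore_tty is hypothetical.

```python
import os
import subprocess
import sys
import termios


def run_and_restore_tty(argv, timeout):
    """Run argv, kill it on timeout, and put the terminal settings back afterwards."""
    saved = None
    try:
        fd = sys.stdin.fileno()
        if os.isatty(fd):
            saved = (fd, termios.tcgetattr(fd))
    except (OSError, ValueError, AttributeError):
        pass  # stdin is not a real terminal, so there is nothing to restore
    p = subprocess.Popen(argv)
    try:
        return p.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        p.kill()
        p.wait()
        return None  # the child had to be killed
    finally:
        if saved is not None:
            termios.tcsetattr(saved[0], termios.TCSADRAIN, saved[1])


if __name__ == "__main__":
    print(run_and_restore_tty([sys.executable, "-c", "pass"], timeout=10))
```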
I ran into a similar issue with ffmpeg. It seems that if ffmpeg is killed using Popen.kill() it does not properly close and does not reinstate echoing on your terminal.
We can solve this using a pipe to stdin, and writing q to close ffmpeg as we would in a cli session:
p = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
p.stdin.write(b"q")
It's probably preferable to use Popen.communicate in order to avoid a deadlock. The following will also work:
p = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
p.communicate(b'q')
But it seems like even the following works:
p = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
p.kill()
I'm not sure what causes this ffmpeg to close cleanly if it has an input pipe. Perhaps it has something to do with what causes this bug in the first place?