Decorate \ delegate a File object to add functionality - python

I've been writing a small Python script that executes some shell commands using the subprocess module and a helper function:
import datetime
import subprocess as sp
import sys

def run(command, description):
    """Runs a command in a formatted manner. Returns its return code."""
    start = datetime.datetime.now()
    sys.stderr.write('%-65s' % description)
    s = sp.Popen(command, shell=True, stderr=sp.PIPE, stdout=sp.PIPE)
    out, err = s.communicate()
    end = datetime.datetime.now()
    duration = end - start
    status = 'Done' if s.returncode == 0 else 'Failed'
    print('%s (%d seconds)' % (status, duration.seconds))
    return s.returncode
The following lines read the standard output and error:
s=sp.Popen(command, shell=True, stderr=sp.PIPE, stdout=sp.PIPE)
out,err=s.communicate()
As you can see, stdout and stderr are not used. Suppose that I want to write the output and error messages to a log file, in a formatted way, e.g.:
[STDOUT: 2011-01-17 14:53:55] <message>
[STDERR: 2011-01-17 14:53:56] <message>
My question is, what's the most Pythonic way to do it? I thought of three options:
1. Inherit the file object and override the write method.
2. Use a Delegate class which implements write.
3. Connect to the PIPE itself in some way.
UPDATE : reference test script
I'm checking the results with this script, saved as test.py:
#!/usr/bin/python
import sys
sys.stdout.write('OUT\n')
sys.stdout.flush()
sys.stderr.write('ERR\n')
sys.stderr.flush()
Any ideas?

1 and 2 are reasonable solutions, but overriding write() won't be enough.
The problem is that Popen needs file handles to attach to the process, so Python file objects don't work; the handles have to be OS-level. To solve that, you need a Python object that has an OS-level file handle. The only way I can think of to do that is to use pipes, so you have an OS-level file handle to write to. But then you need another thread that sits and polls that pipe for things to read so it can log them. (So this is, strictly speaking, an implementation of 2, as it delegates to logging.)
Said and done:
import io
import logging
import os
import select
import subprocess
import time
import threading

LOG_FILENAME = 'output.log'
logging.basicConfig(filename=LOG_FILENAME, level=logging.DEBUG)

class StreamLogger(io.IOBase):
    def __init__(self, level):
        self.level = level
        self.pipe = os.pipe()
        self.thread = threading.Thread(target=self._flusher)
        self.thread.start()

    def _flusher(self):
        self._run = True
        buf = b''
        while self._run:
            for fh in select.select([self.pipe[0]], [], [], 0)[0]:
                buf += os.read(fh, 1024)
                while b'\n' in buf:
                    data, buf = buf.split(b'\n', 1)
                    self.write(data.decode())
            time.sleep(1)
        self._run = None

    def write(self, data):
        return logging.log(self.level, data)

    def fileno(self):
        return self.pipe[1]

    def close(self):
        if self._run:
            self._run = False
            while self._run is not None:
                time.sleep(1)
            os.close(self.pipe[0])
            os.close(self.pipe[1])
So that class creates an OS-level pipe that Popen can attach the subprocess's stdin/stdout/stderr to. It also starts a thread that polls the other end of that pipe once a second for things to log, which it then logs with the logging module.
Possibly this class should implement more things for completeness, but it works in this case anyway.
Example code:
with StreamLogger(logging.INFO) as out:
    with StreamLogger(logging.ERROR) as err:
        subprocess.Popen("ls", stdout=out, stderr=err, shell=True)
output.log ends up like so:
INFO:root:output.log
INFO:root:streamlogger.py
INFO:root:and
INFO:root:so
INFO:root:on
Tested with Python 2.6, 2.7 and 3.1.
I would think any implementation of 1 and 3 would need to use similar techniques. It is a bit involved, but unless you can make the Popen command log correctly itself, I don't have a better idea.

I would suggest option 3, with the logging standard library package. In this case I'd say the other 2 were overkill.
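This answer gives no code, so here is a minimal sketch of what option 3 might look like. The names run_and_log and make_stream_logger and the log path are hypothetical, not from the question. The parent reads both pipes with communicate() and hands each line to a logger named after its stream. Note that the timestamps record when each line was logged, after the command finished, not when the child wrote it.

```python
import logging
import subprocess


def make_stream_logger(name, handler):
    """Return a logger whose name ('STDOUT'/'STDERR') appears in each record."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    logger.propagate = False  # keep these records out of the root logger
    return logger


def run_and_log(command, log_path):
    """Run command (a list of arguments), log its output, return its exit code."""
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter(
        '[%(name)s: %(asctime)s] %(message)s', datefmt='%Y-%m-%d %H:%M:%S'))
    out_logger = make_stream_logger('STDOUT', handler)
    err_logger = make_stream_logger('STDERR', handler)
    try:
        proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, universal_newlines=True)
        out, err = proc.communicate()  # waits for the command to finish
        for line in out.splitlines():
            out_logger.info(line)
        for line in err.splitlines():
            err_logger.error(line)
    finally:
        # Detach and close the handler so repeated calls do not duplicate lines.
        out_logger.removeHandler(handler)
        err_logger.removeHandler(handler)
        handler.close()
    return proc.returncode
```

With the reference script from the question, a call such as run_and_log([sys.executable, 'test.py'], 'output.log') should produce one [STDOUT: ...] line and one [STDERR: ...] line in the log file.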

1 and 2 won't work. Here's an implementation of the principle:
import subprocess
import time

FileClass = open('tmptmp123123123.tmp', 'w').__class__

class WrappedFile(FileClass):
    TIMETPL = "%Y-%m-%d %H:%M:%S"
    TEMPLATE = "[%s: %s] "

    def __init__(self, name, mode='r', buffering=None, title=None):
        self.title = title or name
        if buffering is None:
            super(WrappedFile, self).__init__(name, mode)
        else:
            super(WrappedFile, self).__init__(name, mode, buffering)

    def write(self, s):
        stamp = time.strftime(self.TIMETPL)
        if not s:
            return
        # Add a line with timestamp per line to be written
        s = s.split('\n')
        spre = self.TEMPLATE % (self.title, stamp)
        s = "\n".join(["%s %s" % (spre, line) for line in s]) + "\n"
        super(WrappedFile, self).write(s)
The reason it doesn't work is that Popen never calls stdout.write. A wrapped file will work fine when we call its write method and will even be written to if passed to Popen, but the write will happen in a lower layer, skipping the write method.
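One way to see this for yourself is the small experiment below. It is only a sketch: CountingFile is a hypothetical wrapper, not part of the code above. It forwards to a real temporary file, counts Python-level write() calls, and exposes fileno(), which is the only thing Popen asks for.

```python
import subprocess
import sys
import tempfile


class CountingFile(object):
    """Hypothetical wrapper: forwards to a real file and counts write() calls."""

    def __init__(self, real_file):
        self.real_file = real_file
        self.write_calls = 0

    def write(self, data):
        self.write_calls += 1
        return self.real_file.write(data)

    def fileno(self):
        # Popen only asks for this descriptor; it never calls write().
        return self.real_file.fileno()


with tempfile.TemporaryFile() as real_file:
    wrapped = CountingFile(real_file)
    subprocess.call([sys.executable, '-c', 'print("hello")'], stdout=wrapped)
    real_file.seek(0)
    captured = real_file.read()       # bytes the child wrote to the descriptor
    calls_seen = wrapped.write_calls  # Python-level write() calls observed
```

The child's output lands in the file, yet the wrapper's write() is never invoked, which is why any formatting done there is skipped.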

This simple solution worked for me:
import sys
import datetime
import tempfile
import subprocess as sp

def run(command, description):
    """Runs a command in a formatted manner. Returns its return code."""
    with tempfile.SpooledTemporaryFile(8*1024) as so:
        print >> sys.stderr, '%-65s' % description
        start = datetime.datetime.now()
        retcode = sp.call(command, shell=True, stderr=sp.STDOUT, stdout=so)
        end = datetime.datetime.now()
        so.seek(0)
        for line in so.readlines():
            print >> sys.stderr, 'logging this:', line.rstrip()
        duration = end - start
        status = 'Done' if retcode == 0 else 'Failed'
        print >> sys.stderr, '%s (%d seconds)' % (status, duration.seconds)

REF_SCRIPT = r"""#!/usr/bin/python
import sys
sys.stdout.write('OUT\n')
sys.stdout.flush()
sys.stderr.write('ERR\n')
sys.stderr.flush()
"""
SCRIPT_NAME = 'refscript.py'

if __name__ == '__main__':
    with open(SCRIPT_NAME, 'w') as script:
        script.write(REF_SCRIPT)
    run('python ' + SCRIPT_NAME, 'Reference script')

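This uses Adam Rosenfield's make_async and read_async. Whereas my original answer used select.epoll and was thus Linux-only, it now uses select.select, which should work on any Unix. (It will not run on Windows as written: the fcntl module is Unix-only, and on Windows select.select accepts only sockets.)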
This logs output from the subprocess to /tmp/test.log as it occurs:
import logging
import subprocess
import shlex
import select
import fcntl
import os
import errno

def make_async(fd):
    # https://stackoverflow.com/a/7730201/190597
    '''add the O_NONBLOCK flag to a file descriptor'''
    fcntl.fcntl(fd, fcntl.F_SETFL, fcntl.fcntl(fd, fcntl.F_GETFL) | os.O_NONBLOCK)

def read_async(fd):
    # https://stackoverflow.com/a/7730201/190597
    '''read some data from a file descriptor, ignoring EAGAIN errors'''
    try:
        return fd.read()
    except IOError as e:
        if e.errno != errno.EAGAIN:
            raise
        else:
            return ''

def log_process(proc, stdout_logger, stderr_logger):
    loggers = {proc.stdout: stdout_logger, proc.stderr: stderr_logger}

    def log_fds(fds):
        for fd in fds:
            out = read_async(fd)
            if out.strip():
                loggers[fd].info(out)

    make_async(proc.stdout)
    make_async(proc.stderr)
    while True:
        # Wait for data to become available
        rlist, wlist, xlist = select.select([proc.stdout, proc.stderr], [], [])
        log_fds(rlist)
        if proc.poll() is not None:
            # Corner case: check if more output was created
            # between the last call to read_async and now
            log_fds([proc.stdout, proc.stderr])
            break

if __name__ == '__main__':
    formatter = logging.Formatter('[%(name)s: %(asctime)s] %(message)s')
    handler = logging.FileHandler('/tmp/test.log', 'w')
    handler.setFormatter(formatter)

    stdout_logger = logging.getLogger('STDOUT')
    stdout_logger.setLevel(logging.DEBUG)
    stdout_logger.addHandler(handler)

    stderr_logger = logging.getLogger('STDERR')
    stderr_logger.setLevel(logging.DEBUG)
    stderr_logger.addHandler(handler)

    proc = subprocess.Popen(shlex.split('ls -laR /tmp'),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    log_process(proc, stdout_logger, stderr_logger)
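Where fcntl and select on pipes are not available (Windows, for instance), one alternative is to give each pipe its own reader thread. This is a different technique from the select loop above, sketched here under the assumption that line-by-line logging is acceptable. The names _pump and log_process_threaded are hypothetical.

```python
import logging
import subprocess
import threading


def _pump(stream, logger):
    """Log each line from stream as it arrives, until the child closes it."""
    for line in iter(stream.readline, ''):
        logger.info(line.rstrip('\n'))
    stream.close()


def log_process_threaded(args, stdout_logger, stderr_logger):
    """Run args, logging stdout and stderr concurrently; return the exit code."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    threads = [
        threading.Thread(target=_pump, args=(proc.stdout, stdout_logger)),
        threading.Thread(target=_pump, args=(proc.stderr, stderr_logger)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # both pipes have reached end-of-file
    return proc.wait()
```

The loggers can be configured exactly as in the __main__ block above. Because each stream is drained by its own thread, neither pipe can fill up and stall the child.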

Related

Capturing print output from shared library called from python with ctypes module

I am working with a shared library that is being called through the ctypes module. I would like to redirect the stdout associated with this module to a variable or a file that I can access in my program. However ctypes uses a separate stdout from sys.stdout.
I'll demonstrate the problem I am having with libc. If anyone is copying and pasting the code they might have to change the filename on line 2.
import ctypes
libc = ctypes.CDLL('libc.so.6')
from cStringIO import StringIO
import sys
oldStdOut = sys.stdout
sys.stdout = myStdOut = StringIO()
print 'This text gets captured by myStdOut'
libc.printf('This text fails to be captured by myStdOut\n')
sys.stdout = oldStdOut
myStdOut.getvalue()
Is there any way I can capture the stdout that is associated with the ctypes loaded shared library?
We can use os.dup2() and os.pipe() to replace the entire stdout file descriptor (fd 1) with a pipe we can read from ourselves. You can do the same thing with stderr (fd 2).
This example uses select.select() to see if the pipe (our fake stdout) has data waiting to be read, so we can read it safely without blocking execution of our script.
As we are completely replacing the stdout file descriptor for this process and any subprocesses, this example can even capture output from child processes.
import os, sys, select

# the pipe would fail for some reason if I didn't write to stdout at some point
# so I write a space, then backspace (will show as empty in a normal terminal)
sys.stdout.write(' \b')
pipe_out, pipe_in = os.pipe()
# save a copy of stdout
stdout = os.dup(1)
# replace stdout with our write pipe
os.dup2(pipe_in, 1)

# check if we have more to read from the pipe
def more_data():
    r, _, _ = select.select([pipe_out], [], [], 0)
    return bool(r)

# read the whole pipe
def read_pipe():
    out = ''
    while more_data():
        out += os.read(pipe_out, 1024)
    return out

# testing print methods
import ctypes
libc = ctypes.CDLL('libc.so.6')
print 'This text gets captured by myStdOut'
libc.printf('This text fails to be captured by myStdOut\n')

# put stdout back in place
os.dup2(stdout, 1)
print 'Contents of our stdout pipe:'
print read_pipe()
Here is the simplest example, since this question ranks near the top on Google.
import os
from ctypes import CDLL
libc = CDLL(None)
stdout = os.dup(1)
silent = os.open(os.devnull, os.O_WRONLY)
os.dup2(silent, 1)
libc.printf(b"Hate this text")
os.dup2(stdout, 1)
If the data the native process writes are large (larger than pipe buffer), the native program would block until you make some space in the pipe by reading it.
The solution from lunixbochs, however, needs the native process to finish before it starts reading the pipe. I improved the solution so that it reads the pipe in parallel from a separate thread. This way you can capture output of any size.
This solution is also inspired by https://stackoverflow.com/a/16571630/1076564 and captures both stdout and stderr:
import os
import select
import threading
from time import sleep

class CtypesStdoutCapture(object):
    def __enter__(self):
        self._pipe_out, self._pipe_in = os.pipe()
        self._err_pipe_out, self._err_pipe_in = os.pipe()
        self._stdout = os.dup(1)
        self._stderr = os.dup(2)
        self.text = ""
        self.err = ""
        # replace stdout with our write pipe
        os.dup2(self._pipe_in, 1)
        os.dup2(self._err_pipe_in, 2)
        self._stop = False
        self._read_thread = threading.Thread(target=self._read, args=["text", self._pipe_out])
        self._read_err_thread = threading.Thread(target=self._read, args=["err", self._err_pipe_out])
        self._read_thread.start()
        self._read_err_thread.start()
        return self

    def __exit__(self, *args):
        self._stop = True
        self._read_thread.join()
        self._read_err_thread.join()
        # put stdout back in place
        os.dup2(self._stdout, 1)
        os.dup2(self._stderr, 2)
        self.text += self.read_pipe(self._pipe_out)
        self.err += self.read_pipe(self._err_pipe_out)

    # check if we have more to read from the pipe
    def more_data(self, pipe):
        r, _, _ = select.select([pipe], [], [], 0)
        return bool(r)

    # read the whole pipe
    def read_pipe(self, pipe):
        out = ''
        while self.more_data(pipe):
            out += os.read(pipe, 1024)
        return out

    def _read(self, type, pipe):
        while not self._stop:
            setattr(self, type, getattr(self, type) + self.read_pipe(pipe))
            sleep(0.001)

    def __str__(self):
        return self.text

# Usage:
with CtypesStdoutCapture() as capture:
    lib.native_fn()
print(capture.text)
print(capture.err)
There is a Python project called Wurlitzer that very elegantly solves this problem. It's a work of art and deserves to be one of the top answers to this question.
https://github.com/minrk/wurlitzer
https://pypi.org/project/wurlitzer/
pip install wurlitzer
from wurlitzer import pipes

with pipes() as (out, err):
    call_some_c_function()

stdout = out.read()

from io import StringIO
from wurlitzer import pipes, STDOUT

out = StringIO()
with pipes(stdout=out, stderr=STDOUT):
    call_some_c_function()

stdout = out.getvalue()

from wurlitzer import sys_pipes

with sys_pipes():
    call_some_c_function()
And the most magical part: it supports Jupyter:
%load_ext wurlitzer

Python logging and subprocess output and error stream

I would like to start off a python process and log subprocess error messages to the logging object of the parent script. I would ideally like to unify the log streams into one file. Can I somehow access the output stream of the logging class? One solution I know of is to use proc log for logging. As described in the answer below, I could read from the proc.stdin and stderr, but I'd have duplicate logging headers. I wonder if there is a way to pass the file descriptor underlying the logging class directly to the subprocess?
logging.basicConfig(filename="test.log",level=logging.DEBUG)
logging.info("Started")
procLog = open(os.path.expanduser("subproc.log"), 'w')
proc = subprocess.Popen(cmdStr, shell=True, stderr=procLog, stdout=procLog)
proc.wait()
procLog.flush()
Based on Adam Rosenfield's code, you could use select.select to block until there is output to be read from proc.stdout or proc.stderr, read and log that output, then repeat until the process is done.
Note that the following writes to /tmp/test.log and runs the command ls -laR /tmp. Change to suit your needs.
(PS: Typically /tmp contains directories which can not be read by normal users, so running ls -laR /tmp produces output to both stdout and stderr. The code below correctly interleaves those two streams as they are produced.)
import logging
import subprocess
import shlex
import select
import fcntl
import os
import errno
import contextlib

logger = logging.getLogger(__name__)

def make_async(fd):
    '''add the O_NONBLOCK flag to a file descriptor'''
    fcntl.fcntl(fd, fcntl.F_SETFL, fcntl.fcntl(fd, fcntl.F_GETFL) | os.O_NONBLOCK)

def read_async(fd):
    '''read some data from a file descriptor, ignoring EAGAIN errors'''
    try:
        return fd.read()
    except IOError as e:
        if e.errno != errno.EAGAIN:
            raise
        else:
            return ''

def log_fds(fds):
    for fd in fds:
        out = read_async(fd)
        if out:
            logger.info(out)

@contextlib.contextmanager
def plain_logger():
    root = logging.getLogger()
    hdlr = root.handlers[0]
    formatter_orig = hdlr.formatter
    hdlr.setFormatter(logging.Formatter('%(message)s'))
    yield
    hdlr.setFormatter(formatter_orig)

def main():
    # fmt = '%(name)-12s: %(levelname)-8s %(message)s'
    logging.basicConfig(filename = '/tmp/test.log', filemode = 'w',
                        level = logging.DEBUG)
    logger.info("Started")
    cmdStr = 'ls -laR /tmp'
    with plain_logger():
        proc = subprocess.Popen(shlex.split(cmdStr),
                                stdout = subprocess.PIPE, stderr = subprocess.PIPE)
        # without `make_async`, `fd.read` in `read_async` blocks.
        make_async(proc.stdout)
        make_async(proc.stderr)
        while True:
            # Wait for data to become available
            rlist, wlist, xlist = select.select([proc.stdout, proc.stderr], [], [])
            log_fds(rlist)
            if proc.poll() is not None:
                # Corner case: check if more output was created
                # between the last call to read_async and now
                log_fds([proc.stdout, proc.stderr])
                break
    logger.info("Done")

if __name__ == '__main__':
    main()
Edit:
You can redirect stdout and stderr to logfile = open('/tmp/test.log', 'a').
A small difficulty with doing so, however, is that any logger handler that is also writing to /tmp/test.log will not be aware of what the subprocess is writing, and so the log file may get garbled.
If you do not make logging calls while the subprocess is doing its business, then the only problem is that the logger handler has the wrong position in the file after the subprocess has finished. That can be fixed by calling
handler.stream.seek(0, 2)
so the handler will resume writing at the end of the file.
import logging
import subprocess
import contextlib
import shlex

logger = logging.getLogger(__name__)

@contextlib.contextmanager
def suspended_logger():
    root = logging.getLogger()
    handler = root.handlers[0]
    yield
    handler.stream.seek(0, 2)

def main():
    logging.basicConfig(filename = '/tmp/test.log', filemode = 'w',
                        level = logging.DEBUG)
    logger.info("Started")
    with suspended_logger():
        cmdStr = 'test2.py 1>>/tmp/test.log 2>&1'
        logfile = open('/tmp/test.log', 'a')
        proc = subprocess.Popen(shlex.split(cmdStr),
                                stdout = logfile,
                                stderr = logfile)
        proc.communicate()
    logger.info("Done")

if __name__ == '__main__':
    main()

Redirecting stdout to "nothing" in python

I have a large project consisting of a large number of modules, each printing something to standard output. As the project has grown, so has the number of print statements, and all that printing to stdout has made the program considerably slower.
So, I now want to decide at runtime whether or not to print anything to stdout. I cannot make changes in the modules as there are plenty of them. (I know I can redirect stdout to a file, but even this is considerably slow.)
So my question is: how do I redirect stdout to nothing, i.e. how do I make the print statement do nothing?
# I want to do something like this.
sys.stdout = None # this obviously will give an error as Nonetype object does not have any write method.
Currently the only idea I have is to make a class which has a write method (which does nothing) and redirect the stdout to an instance of this class.
class DontPrint(object):
    def write(*args): pass

dp = DontPrint()
sys.stdout = dp
Is there an inbuilt mechanism in python for this? Or is there something better than this?
Cross-platform:
import os
import sys
f = open(os.devnull, 'w')
sys.stdout = f
On Windows:
f = open('nul', 'w')
sys.stdout = f
On Linux:
f = open('/dev/null', 'w')
sys.stdout = f
A nice way to do this is to create a small context manager that you wrap your prints in. You then just use it in a with-statement to silence all output.
Python 2:
import os
import sys
from contextlib import contextmanager

@contextmanager
def silence_stdout():
    old_target = sys.stdout
    try:
        with open(os.devnull, "w") as new_target:
            sys.stdout = new_target
            yield new_target
    finally:
        sys.stdout = old_target

with silence_stdout():
    print("will not print")

print("this will print")
Python 3.4+:
Python 3.4 has a context manager like this built in, so you can simply use contextlib like this:
import contextlib

with contextlib.redirect_stdout(None):
    print("will not print")

print("this will print")
If the code you want to suppress writes directly to sys.stdout, using None as the redirect target won't work. Instead you can use:
import contextlib
import sys
import os

with contextlib.redirect_stdout(open(os.devnull, 'w')):
    sys.stdout.write("will not print")

sys.stdout.write("this will print")
If your code writes to stderr instead of stdout, you can use contextlib.redirect_stderr instead of redirect_stdout.
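As a brief sketch of the stderr variant (available from Python 3.5; the io.StringIO target and the variable names are chosen here purely for illustration), the context manager is used the same way:

```python
import contextlib
import io
import sys

buffer = io.StringIO()
with contextlib.redirect_stderr(buffer):
    # Anything written through sys.stderr inside the block lands in the buffer.
    print("goes to the buffer", file=sys.stderr)

captured = buffer.getvalue()
```

To discard the output instead of capturing it, pass a file opened on os.devnull as the target, as in the stdout example above.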
Running this code only prints the second line of output, not the first:
$ python test.py
this will print
This works cross-platform (Windows + Linux + Mac OSX), and is cleaner than the other answers, IMHO.
If you're in python 3.4 or higher, there's a simple and safe solution using the standard library:
import contextlib

with contextlib.redirect_stdout(None):
    print("This won't print!")
(at least on my system) it appears that writing to os.devnull is about 5x faster than writing to a DontPrint class, i.e.
#!/usr/bin/python
import os
import sys
import datetime

ITER = 10000000

def printlots(out, it, st="abcdefghijklmnopqrstuvwxyz1234567890"):
    temp = sys.stdout
    sys.stdout = out
    i = 0
    start_t = datetime.datetime.now()
    while i < it:
        print st
        i = i+1
    end_t = datetime.datetime.now()
    sys.stdout = temp
    print out, "\n took", end_t - start_t, "for", it, "iterations"

class devnull():
    def write(*args):
        pass

printlots(open(os.devnull, 'wb'), ITER)
printlots(devnull(), ITER)
gave the following output:
<open file '/dev/null', mode 'wb' at 0x7f2b747044b0>
took 0:00:02.074853 for 10000000 iterations
<__main__.devnull instance at 0x7f2b746bae18>
took 0:00:09.933056 for 10000000 iterations
If you're in a Unix environment (Linux included), you can redirect output to /dev/null:
python myprogram.py > /dev/null
And for Windows:
python myprogram.py > nul
You can just mock it.
import sys
import mock
sys.stdout = mock.MagicMock()
Your class will work just fine (with the exception of the write() method name -- it needs to be called write(), lowercase). Just make sure you save a copy of sys.stdout in another variable.
If you're on a *NIX, you can do sys.stdout = open('/dev/null', 'w'), but this is less portable than rolling your own class.
How about this:
from contextlib import ExitStack, redirect_stdout
import os

with ExitStack() as stack:
    if should_hide_output():
        null_stream = open(os.devnull, "w")
        stack.enter_context(null_stream)
        stack.enter_context(redirect_stdout(null_stream))
    noisy_function()
This uses the features in the contextlib module to hide the output of whatever command you are trying to run, depending on the result of should_hide_output(), and then restores the output behavior after that function is done running.
If you want to hide standard error output, then import redirect_stderr from contextlib and add a line saying stack.enter_context(redirect_stderr(null_stream)).
The main downside is that this only works in Python 3.4 and later versions.
sys.stdout = None
This is fine for print() calls, but it causes an error if you call any method of sys.stdout directly, e.g. sys.stdout.write().
There is a note in docs:
Under some conditions stdin, stdout and stderr as well as the original
values __stdin__, __stdout__ and __stderr__ can be None. It is usually
the case for Windows GUI apps that aren't connected to a console and
Python apps started with pythonw.
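The difference between the two cases can be seen in a short sketch (Python 3.4+; the variable name error is only illustrative). contextlib.redirect_stdout(None) is used here so that sys.stdout is restored automatically afterwards:

```python
import contextlib
import sys

error = None
with contextlib.redirect_stdout(None):
    print("silently dropped")     # print() returns quietly when sys.stdout is None
    try:
        sys.stdout.write("boom")  # None has no write() method
    except AttributeError as exc:
        error = exc
```

After the block, error holds the AttributeError raised by the direct write() call, while the print() call passed without complaint.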
Supplement to iFreilicht's answer - it works for both python 2 & 3.
import sys

class NonWritable:
    def write(self, *args, **kwargs):
        pass

class StdoutIgnore:
    def __enter__(self):
        self.stdout_saved = sys.stdout
        sys.stdout = NonWritable()
        return self

    def __exit__(self, *args):
        sys.stdout = self.stdout_saved

with StdoutIgnore():
    print("This won't print!")
If you don't want to deal with resource-allocation nor rolling your own class, you may want to use TextIO from Python typing. It has all required methods stubbed for you by default.
import sys
from typing import TextIO
sys.stdout = TextIO()
There are a number of good answers in the flow, but here is my Python 3 answer (when sys.stdout.fileno() isn't supported anymore) :
import os
import sys

oldstdout = os.dup(1)
oldstderr = os.dup(2)
oldsysstdout = sys.stdout
oldsysstderr = sys.stderr

# Cancel all stdout outputs (will be lost) - optionally also cancel stderr
def cancel_stdout(stderr=False):
    sys.stdout.flush()
    devnull = open('/dev/null', 'w')
    os.dup2(devnull.fileno(), 1)
    sys.stdout = devnull
    if stderr:
        os.dup2(devnull.fileno(), 2)
        sys.stderr = devnull

# Redirect all stdout outputs to a file - optionally also redirect stderr
def reroute_stdout(filepath, stderr=False):
    sys.stdout.flush()
    file = open(filepath, 'w')
    os.dup2(file.fileno(), 1)
    sys.stdout = file
    if stderr:
        os.dup2(file.fileno(), 2)
        sys.stderr = file

# Restores stdout to default - and stderr
def restore_stdout():
    sys.stdout.flush()
    sys.stdout.close()
    os.dup2(oldstdout, 1)
    os.dup2(oldstderr, 2)
    sys.stdout = oldsysstdout
    sys.stderr = oldsysstderr
To use it:
Cancel all stdout and stderr outputs with:
cancel_stdout(stderr=True)
Route all stdout (but not stderr) to a file:
reroute_stdout('output.txt')
To restore stdout and stderr:
restore_stdout()
Why don't you try this?
sys.stdout.close()
sys.stderr.close()
I'll add one more example to the numerous answers here:
import argparse
import contextlib

class NonWritable:
    def write(self, *args, **kwargs):
        pass

parser = argparse.ArgumentParser(description='my program')
parser.add_argument("-p", "--param", help="my parameter", type=str, required=True)

#with contextlib.redirect_stdout(None): # No effect as `argparse` will output to `stderr`
#with contextlib.redirect_stderr(None): # AttributeError: 'NoneType' object has no attribute 'write'
with contextlib.redirect_stderr(NonWritable()): # this works!
    args = parser.parse_args()
The normal output would be:
>python TEST.py
usage: TEST.py [-h] -p PARAM
TEST.py: error: the following arguments are required: -p/--param
I use this. Redirect stdout to a string, which you subsequently ignore. I use a context manager to save and restore the original setting for stdout.
from io import StringIO
...
with StringIO() as out:
    with stdout_redirected(out):
        pass  # Do your thing
where stdout_redirected is defined as:
import sys
from contextlib import contextmanager

@contextmanager
def stdout_redirected(new_stdout):
    save_stdout = sys.stdout
    sys.stdout = new_stdout
    try:
        yield None
    finally:
        sys.stdout = save_stdout

Redirect stdout to a file in Python?

How do I redirect stdout to an arbitrary file in Python?
When a long-running Python script (e.g., a web application) is started from within an ssh session and backgrounded, and the ssh session is closed, the application will raise IOError and fail the moment it tries to write to stdout. I needed to find a way to make the application and modules output to a file rather than stdout to prevent failure due to IOError. Currently, I employ nohup to redirect output to a file, and that gets the job done, but I was wondering if there was a way to do it without using nohup, out of curiosity.
I have already tried sys.stdout = open('somefile', 'w'), but this does not seem to prevent some external modules from still outputting to the terminal (or maybe the sys.stdout = ... line did not fire at all). I know it should work from simpler scripts I've tested on, but I haven't had time yet to test it on a web application.
If you want to do the redirection within the Python script, setting sys.stdout to a file object does the trick:
# for python3
import sys

with open('file', 'w') as sys.stdout:
    print('test')
A far more common method is to use shell redirection when executing (same on Windows and Linux):
$ python3 foo.py > file
There is a contextlib.redirect_stdout() function in Python 3.4+:
from contextlib import redirect_stdout

with open('help.txt', 'w') as f:
    with redirect_stdout(f):
        print('it now prints to `help.txt`')
It is similar to:
import sys
from contextlib import contextmanager

@contextmanager
def redirect_stdout(new_target):
    old_target, sys.stdout = sys.stdout, new_target # replace sys.stdout
    try:
        yield new_target # run some code with the replaced stdout
    finally:
        sys.stdout = old_target # restore to the previous value
that can be used on earlier Python versions. The latter version is not reusable: a context manager object it returns can be entered only once. It can be made reusable if desired.
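One way to make it reusable is to write the context manager as a class that keeps a stack of previous targets. This is a minimal sketch, assuming that restoring in last-in, first-out order is all that is needed. The name ReusableRedirect is hypothetical.

```python
import io
import sys


class ReusableRedirect(object):
    """Class-based stdout redirect that can be entered more than once."""

    def __init__(self, new_target):
        self._new_target = new_target
        self._old_targets = []  # stack, so reused or nested entries restore correctly

    def __enter__(self):
        self._old_targets.append(sys.stdout)
        sys.stdout = self._new_target
        return self._new_target

    def __exit__(self, exc_type, exc_value, traceback):
        sys.stdout = self._old_targets.pop()
        return False  # do not swallow exceptions


buffer = io.StringIO()
redirect = ReusableRedirect(buffer)
with redirect:
    print('first use')
with redirect:  # the same object, entered a second time
    print('second use')
captured = buffer.getvalue()
```

Both print() calls end up in the buffer, and sys.stdout is back to its original value after each block.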
redirect_stdout() doesn't redirect stdout at the file descriptor level, e.g.:
import os
import sys
from contextlib import redirect_stdout

stdout_fd = sys.stdout.fileno()
with open('output.txt', 'w') as f, redirect_stdout(f):
    print('redirected to a file')
    os.write(stdout_fd, b'not redirected')
    os.system('echo this also is not redirected')
b'not redirected' and 'echo this also is not redirected' are not redirected to the output.txt file.
To redirect at the file descriptor level, os.dup2() could be used:
import os
import sys
from contextlib import contextmanager

def fileno(file_or_fd):
    fd = getattr(file_or_fd, 'fileno', lambda: file_or_fd)()
    if not isinstance(fd, int):
        raise ValueError("Expected a file (`.fileno()`) or a file descriptor")
    return fd

@contextmanager
def stdout_redirected(to=os.devnull, stdout=None):
    if stdout is None:
        stdout = sys.stdout

    stdout_fd = fileno(stdout)
    # copy stdout_fd before it is overwritten
    #NOTE: `copied` is inheritable on Windows when duplicating a standard stream
    with os.fdopen(os.dup(stdout_fd), 'wb') as copied:
        stdout.flush() # flush library buffers that dup2 knows nothing about
        try:
            os.dup2(fileno(to), stdout_fd) # $ exec >&to
        except ValueError: # filename
            with open(to, 'wb') as to_file:
                os.dup2(to_file.fileno(), stdout_fd) # $ exec > to
        try:
            yield stdout # allow code to be run with the redirected stdout
        finally:
            # restore stdout to its previous value
            #NOTE: dup2 makes stdout_fd inheritable unconditionally
            stdout.flush()
            os.dup2(copied.fileno(), stdout_fd) # $ exec >&copied
The same example works now if stdout_redirected() is used instead of redirect_stdout():
import os
import sys

stdout_fd = sys.stdout.fileno()
with open('output.txt', 'w') as f, stdout_redirected(f):
    print('redirected to a file')
    os.write(stdout_fd, b'it is redirected now\n')
    os.system('echo this is also redirected')
print('this goes back to stdout')
The output that previously was printed on stdout now goes to output.txt as long as stdout_redirected() context manager is active.
Note: stdout.flush() does not flush C stdio buffers on Python 3 where I/O is implemented directly on read()/write() system calls. To flush all open C stdio output streams, you could call libc.fflush(None) explicitly if some C extension uses stdio-based I/O:
try:
    import ctypes
    from ctypes.util import find_library
except ImportError:
    libc = None
else:
    try:
        libc = ctypes.cdll.msvcrt # Windows
    except OSError:
        libc = ctypes.cdll.LoadLibrary(find_library('c'))

def flush(stream):
    try:
        libc.fflush(None)
        stream.flush()
    except (AttributeError, ValueError, IOError):
        pass # unsupported
You could use the stdout parameter to redirect other streams, not only sys.stdout, e.g., to merge sys.stderr and sys.stdout:
def merged_stderr_stdout(): # $ exec 2>&1
    return stdout_redirected(to=sys.stdout, stdout=sys.stderr)
Example:
from __future__ import print_function
import sys

with merged_stderr_stdout():
    print('this is printed on stdout')
    print('this is also printed on stdout', file=sys.stderr)
Note: stdout_redirected() mixes buffered I/O (sys.stdout usually) and unbuffered I/O (operations on file descriptors directly). Beware, there could be buffering issues.
To answer your edit: you could use python-daemon to daemonize your script and use the logging module (as @erikb85 suggested) instead of print statements and merely redirecting stdout for your long-running Python script that you run using nohup now.
You can also try this, which I find much better:
import sys

class Logger(object):
    def __init__(self, filename="Default.log"):
        self.terminal = sys.stdout
        self.log = open(filename, "a")

    def write(self, message):
        self.terminal.write(message)
        self.log.write(message)

sys.stdout = Logger("yourlogfilename.txt")
print "Hello world !" # this should be saved in yourlogfilename.txt
The other answers didn't cover the case where you want forked processes to share your new stdout.
To do that:
from os import open, close, dup, O_WRONLY

old = dup(1)
close(1)
open("file", O_WRONLY) # should open on 1

# ..... do stuff and then restore

close(1)
dup(old) # should dup to 1
close(old) # get rid of left overs
Quoted from PEP 343 -- The "with" Statement (added import statement):
Redirect stdout temporarily:
import sys
from contextlib import contextmanager

@contextmanager
def stdout_redirected(new_stdout):
    save_stdout = sys.stdout
    sys.stdout = new_stdout
    try:
        yield None
    finally:
        sys.stdout = save_stdout
Used as follows:
with open(filename, "w") as f:
    with stdout_redirected(f):
        print "Hello world"
This isn't thread-safe, of course, but neither is doing this same dance manually. In single-threaded programs (for example in scripts) it is a popular way of doing things.
import sys
sys.stdout = open('stdout.txt', 'w')
Here is a variation of Yuda Prawira's answer. It implements flush() and all the file attributes, is written as a context manager, and also captures stderr.
import contextlib, sys

@contextlib.contextmanager
def log_print(file):
    # capture all outputs to a log file while still printing it
    class Logger:
        def __init__(self, file):
            self.terminal = sys.stdout
            self.log = file

        def write(self, message):
            self.terminal.write(message)
            self.log.write(message)

        def __getattr__(self, attr):
            return getattr(self.terminal, attr)

    logger = Logger(file)
    _stdout = sys.stdout
    _stderr = sys.stderr
    sys.stdout = logger
    sys.stderr = logger
    try:
        yield logger.log
    finally:
        sys.stdout = _stdout
        sys.stderr = _stderr

with log_print(open('mylogfile.log', 'w')):
    print('hello world')
    print('hello world on stderr', file=sys.stderr)

# you can capture the output to a string with:
# with log_print(io.StringIO()) as log:
#     ....
#     print('[captured output]', log.getvalue())
You need a terminal multiplexer like either tmux or GNU screen
I'm surprised that a small comment by Ryan Amos on the original question is the only mention of a solution far preferable to all the others on offer, no matter how clever the Python trickery may be and how many upvotes they've received. Further to Ryan's comment, tmux is a nice alternative to GNU screen.
But the principle is the same: if you ever find yourself wanting to leave a terminal job running while you log-out, head to the cafe for a sandwich, pop to the bathroom, go home (etc) and then later, reconnect to your terminal session from anywhere or any computer as though you'd never been away, terminal multiplexers are the answer. Think of them as VNC or remote desktop for terminal sessions. Anything else is a workaround. As a bonus, when the boss and/or partner comes in and you inadvertently ctrl-w / cmd-w your terminal window instead of your browser window with its dodgy content, you won't have lost the last 18 hours-worth of processing!
Based on this answer: https://stackoverflow.com/a/5916874/1060344, here is another way I figured out, which I use in one of my projects. Whatever you replace sys.stderr or sys.stdout with, you have to make sure that the replacement complies with the file interface, especially if you are doing this because stderr/stdout are used in some other library that is not under your control. That library may be using other methods of the file object.
Check out this way, where I still let everything go to stderr/stdout (or any file, for that matter) and also send the message to a log file using Python's logging facility (but you can really do anything with this):
class FileToLogInterface(file):
    '''
    Interface to make sure that every time anything is written to stderr, it is
    also forwarded to a file.
    '''
    def __init__(self, *args, **kwargs):
        if 'cfg' not in kwargs:
            raise TypeError('argument cfg is required.')
        else:
            if not isinstance(kwargs['cfg'], config.Config):
                raise TypeError(
                    'argument cfg should be a valid '
                    'PostSegmentation configuration object i.e. '
                    'postsegmentation.config.Config')
        self._cfg = kwargs['cfg']
        kwargs.pop('cfg')
        self._logger = logging.getLogger('access_log')
        super(FileToLogInterface, self).__init__(*args, **kwargs)
    def write(self, msg):
        super(FileToLogInterface, self).write(msg)
        self._logger.info(msg)
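The class above subclasses the Python 2 built-in file type, which does not exist in Python 3. A rough Python 3 sketch of the same idea follows. The name StreamToLogger and its parameters are hypothetical, and subclassing io.TextIOBase is an assumption made here to inherit default implementations of the rest of the file interface.

```python
import io
import logging


class StreamToLogger(io.TextIOBase):
    """File-like wrapper: passes text to `stream` and also logs it."""

    def __init__(self, stream, logger, level=logging.INFO):
        super().__init__()
        self._stream = stream
        self._logger = logger
        self._level = level

    def writable(self):
        return True

    def write(self, msg):
        self._stream.write(msg)             # keep the normal output
        text = msg.rstrip()
        if text:                            # skip the bare "\n" that print() sends
            self._logger.log(self._level, text)
        return len(msg)

    def flush(self):
        self._stream.flush()

# Usage (hypothetical):
#     sys.stderr = StreamToLogger(sys.stderr, logging.getLogger("access_log"))
# Caution: the logger must not have a handler that writes to the stream being
# replaced, otherwise every write would trigger another write.
```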
Programs written in other languages (e.g. C) have to do special magic (called double-forking) expressly to detach from the terminal (and to prevent zombie processes). So, I think the best solution is to emulate them.
A plus of re-executing your program is, you can choose redirections on the command-line, e.g. /usr/bin/python mycoolscript.py 2>&1 1>/dev/null
See this post for more info: What is the reason for performing a double fork when creating a daemon?
I know this question is answered (using python abc.py > output.log 2>&1 ), but I still have to say:
When writing your program, don't write to stdout. Always use logging to output whatever you want. That would give you a lot of freedom in the future when you want to redirect, filter, rotate the output files.
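As an illustration of that advice, here is a minimal sketch that sends messages to a rotating log file. The function name build_logger, the size limit, and the format string are assumptions chosen for the example, not recommendations from the answer.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(path, name="app"):
    """Return a logger that writes timestamped lines to a rotating file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Rotate at roughly 1 MB and keep three old files (arbitrary example values).
    handler = RotatingFileHandler(path, maxBytes=1_000_000, backupCount=3)
    handler.setFormatter(
        logging.Formatter("[%(levelname)s: %(asctime)s] %(message)s"))
    logger.addHandler(handler)
    return logger

# Usage (hypothetical file name):
#     log = build_logger("output.log")
#     log.info("started")
```

Redirecting, filtering, or rotating the output then becomes a matter of swapping the handler rather than touching every print call.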
As mentioned by @jfs, most solutions will not properly handle some types of stdout output, such as that from C extensions. There is a module on PyPI called wurlitzer that takes care of all this. You just need its sys_pipes context manager. It's as easy as using:
from contextlib import redirect_stdout
import os
from wurlitzer import sys_pipes
log = open("test.log", "a")
with redirect_stdout(log), sys_pipes():
    print("print statement")
    os.system("echo echo call")
Based on previous answers on this post I wrote this class for myself as a more compact and flexible way of redirecting the output of pieces of code - here just to a list - and ensure that the output is normalized afterwards.
class out_to_lt():
    def __init__(self, lt):
        if type(lt) == list:
            self.lt = lt
        else:
            raise Exception("Need to pass a list")
    def __enter__(self):
        import sys
        self._sys = sys
        self._stdout = sys.stdout
        sys.stdout = self
        return self
    def write(self,txt):
        self.lt.append(txt)
    def __exit__(self, type, value, traceback):
        self._sys.stdout = self._stdout
Used as:
lt = []
with out_to_lt(lt) as o:
    print("Test 123\n\n")
    print(help(str))
Update: I just found a scenario where I had to add two extra methods, but it was easy to adapt:
class out_to_lt():
    ...
    def isatty(self):
        return True #True: You're running in a real terminal, False:You're being piped, redirected, cron
    def flush(self):
        pass
There are other versions using context managers, but nothing this simple. I googled to double-check that it would work and was surprised not to see it, so for other people looking for a quick solution that is directed at only the code within the context block, here it is. Note that when the block exits the file is closed but sys.stdout is not restored, so reassign it (for example sys.stdout = sys.__stdout__) before printing again:
import sys
with open('test_file', 'w') as sys.stdout:
    print('Testing 1 2 3')
Tested like so:
$ cat redirect_stdout.py
import sys
with open('test_file', 'w') as sys.stdout:
    print('Testing 1 2 3')
$ python redirect_stdout.py
$ cat test_file
Testing 1 2 3
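With the `with open(...) as sys.stdout` form, sys.stdout stays bound to the closed file once the block ends, so a later print() raises ValueError. A minimal sketch that keeps the brevity but restores sys.stdout, assuming Python 3.4 or later for contextlib.redirect_stdout (the helper name print_to_file is hypothetical):

```python
import sys
from contextlib import redirect_stdout


def print_to_file(path, text):
    """Write `text` to `path` via print(), then put sys.stdout back."""
    with open(path, "w") as f, redirect_stdout(f):
        print(text)
    # On leaving the block the file is closed and sys.stdout is whatever
    # it was before, so printing afterwards still works.

# Usage (hypothetical file name):
#     print_to_file("test_file", "Testing 1 2 3")
#     print("back on the console")
```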

How to replicate tee behavior in Python when using subprocess?

I'm looking for a Python solution that will allow me to save the output of a command in a file without hiding it from the console.
FYI: I'm asking about tee (the Unix command-line utility) and not the function with the same name from Python's itertools module.
Details
Python solution (not calling tee, it is not available under Windows)
I do not need to provide any input to stdin for called process
I have no control over the called program. All I know is that it will output something to stdout and stderr and return with an exit code.
To work when calling external programs (subprocess)
To work for both stderr and stdout
Being able to differentiate between stdout and stderr, because I may want to display only one of them on the console, or I could try to output stderr using a different color. This means that stderr = subprocess.STDOUT will not work.
Live output (progressive) - the process can run for a long time, and I'm not able to wait for it to finish.
Python 3 compatible code (important)
References
Here are some incomplete solutions I found so far:
http://devlishgenius.blogspot.com/2008/10/logging-in-real-time-in-python.html (mkfifo works only on Unix)
http://blog.kagesenshi.org/2008/02/teeing-python-subprocesspopen-output.html (doesn't work at all)
Diagram http://blog.i18n.ro/wp-content/uploads/2010/06/Drawing_tee_py.png
Current code (second try)
#!/usr/bin/python
from __future__ import print_function
import sys, os, time, subprocess, io, threading
cmd = "python -E test_output.py"
from threading import Thread
class StreamThread ( Thread ):
    def __init__(self, buffer):
        Thread.__init__(self)
        self.buffer = buffer
    def run ( self ):
        while 1:
            line = self.buffer.readline()
            print(line,end="")
            sys.stdout.flush()
            if line == '':
                break
proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdoutThread = StreamThread(io.TextIOWrapper(proc.stdout))
stderrThread = StreamThread(io.TextIOWrapper(proc.stderr))
stdoutThread.start()
stderrThread.start()
proc.communicate()
stdoutThread.join()
stderrThread.join()
print("--done--")
#### test_output.py ####
#!/usr/bin/python
from __future__ import print_function
import sys, os, time
for i in range(0, 10):
    if i%2:
        print("stderr %s" % i, file=sys.stderr)
    else:
        print("stdout %s" % i, file=sys.stdout)
    time.sleep(0.1)
Real output
stderr 1
stdout 0
stderr 3
stdout 2
stderr 5
stdout 4
stderr 7
stdout 6
stderr 9
stdout 8
--done--
Expected output was to have the lines ordered. Remark, modifying the Popen to use only one PIPE is not allowed because in the real life I will want to do different things with stderr and stdout.
Also, even in the second case I was not able to obtain real-time-like output; in fact, all the results were received only when the process finished. By default, Popen should use no buffers (bufsize=0).
I see that this is a rather old post but just in case someone is still searching for a way to do this:
import subprocess
proc = subprocess.Popen(["ping", "localhost"],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
with open("logfile.txt", "w") as log_file:
    # alternate between the two pipes until the process exits
    while proc.poll() is None:
        line = proc.stderr.readline()
        if line:
            print "err: " + line.strip()
            log_file.write(line)
        line = proc.stdout.readline()
        if line:
            print "out: " + line.strip()
            log_file.write(line)
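The loop above reads stderr and stdout alternately, so a blocking readline() on one pipe can hold up the other. A minimal Python 3 sketch that gives each pipe its own reader thread follows, assuming Python 3.7 or later for the text=True argument. The names tee_run and _pump are hypothetical. Note that the relative order of stdout and stderr lines cannot be guaranteed across two separate pipes, and that live output also depends on the child not buffering its own output (for a Python child, running it with -u helps).

```python
import subprocess
import sys
import threading


def _pump(pipe, sinks):
    """Copy each line from `pipe` to every sink as soon as it arrives."""
    for line in pipe:
        for sink in sinks:
            sink.write(line)
            sink.flush()
    pipe.close()


def tee_run(cmd, out_log, err_log):
    """Run `cmd`, echoing stdout/stderr to the console and to the two logs."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, text=True, bufsize=1)
    threads = [
        threading.Thread(target=_pump, args=(proc.stdout, [sys.stdout, out_log])),
        threading.Thread(target=_pump, args=(proc.stderr, [sys.stderr, err_log])),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                # readers finish when the pipes reach EOF
    return proc.wait()

# Usage (hypothetical file names):
#     with open("out.log", "w") as o, open("err.log", "w") as e:
#         code = tee_run(["ping", "localhost"], o, e)
```

Keeping the two streams in separate sinks preserves the ability to treat stderr differently, for example by colouring it before it is echoed.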
If requiring python 3.6 isn't an issue there is now a way of doing this using asyncio. This method allows you to capture stdout and stderr separately but still have both stream to the tty without using threads. Here's a rough outline:
import asyncio
import os
import sys
def isWindows():
    # Helper not shown in the original answer; assumed to be a simple platform check.
    return os.name == "nt"
class RunOutput:
    def __init__(self, returncode, stdout, stderr):
        self.returncode = returncode
        self.stdout = stdout
        self.stderr = stderr
async def _read_stream(stream, callback):
    # Pass each line to the callback until the stream reaches EOF.
    while True:
        line = await stream.readline()
        if line:
            callback(line)
        else:
            break
async def _stream_subprocess(cmd, stdin=None, quiet=False, echo=False) -> RunOutput:
    if isWindows():
        platform_settings = {"env": os.environ}
    else:
        platform_settings = {"executable": "/bin/bash"}
    if echo:
        print(cmd)
    p = await asyncio.create_subprocess_shell(
        cmd,
        stdin=stdin,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        **platform_settings
    )
    out = []
    err = []
    def tee(line, sink, pipe, label=""):
        line = line.decode("utf-8").rstrip()
        sink.append(line)
        if not quiet:
            print(label, line, file=pipe)
    # gather() is used because asyncio.wait() no longer accepts bare
    # coroutines as of Python 3.11.
    await asyncio.gather(
        _read_stream(p.stdout, lambda l: tee(l, out, sys.stdout)),
        _read_stream(p.stderr, lambda l: tee(l, err, sys.stderr, label="ERR:")),
    )
    return RunOutput(await p.wait(), out, err)
def run(cmd, stdin=None, quiet=False, echo=False) -> RunOutput:
    loop = asyncio.get_event_loop()
    result = loop.run_until_complete(
        _stream_subprocess(cmd, stdin=stdin, quiet=quiet, echo=echo)
    )
    return result
The code above was based on this blog post: https://kevinmccarthy.org/2016/07/25/streaming-subprocess-stdin-and-stdout-with-asyncio-in-python/
This is a straightforward port of tee(1) to Python.
import sys
sinks = sys.argv[1:]
sinks = [open(sink, "w") for sink in sinks]
sinks.append(sys.stderr)
while True:
    input = sys.stdin.read(1024)
    if input:
        for sink in sinks:
            sink.write(input)
    else:
        break
I'm running on Linux right now but this ought to work on most platforms.
Now for the subprocess part, I don't know how you want to 'wire' the subprocess's stdin, stdout and stderr to your stdin, stdout, stderr and file sinks, but I know you can do this:
import subprocess
callee = subprocess.Popen(
["python", "-i"],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
Now you can access callee.stdin, callee.stdout and callee.stderr like normal files, enabling the above "solution" to work. If you want to get the callee.returncode, you'll need to make an extra call to callee.poll().
Be careful when writing to callee.stdin: if the process has already exited when you do that, an error may be raised (on Linux, I get IOError: [Errno 32] Broken pipe).
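One way to guard against that error is to catch it at the point of writing. The sketch below assumes Python 3, where the failure surfaces as BrokenPipeError (a subclass of OSError). The helper name send_input is hypothetical.

```python
import subprocess


def send_input(proc, data):
    """Try to write bytes to proc.stdin; return False if the pipe is gone."""
    try:
        proc.stdin.write(data)
        proc.stdin.flush()
        return True
    except OSError:
        # Includes BrokenPipeError: the child has already exited or has
        # closed its end of the pipe.
        return False

# Usage (hypothetical):
#     callee = subprocess.Popen(["python", "-i"], stdin=subprocess.PIPE)
#     if not send_input(callee, b"print(1)\n"):
#         print("callee is no longer reading its stdin")
```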
This is how it can be done:
import sys
from subprocess import Popen, PIPE
with open('log.log', 'w') as log:
    proc = Popen(["ping", "google.com"], stdout=PIPE, encoding='utf-8')
    # iterate until EOF so that no trailing output is lost when the process exits
    for text in proc.stdout:
        log.write(text)
        sys.stdout.write(text)
    proc.wait()
If you don't want to interact with the process you can use the subprocess module just fine.
Example:
tester.py
import os
import sys
for file in os.listdir('.'):
    print file
sys.stderr.write("Oh noes, a shrubbery!")
sys.stderr.flush()
sys.stderr.close()
testing.py
import subprocess
p = subprocess.Popen(['python', 'tester.py'], stdout=subprocess.PIPE,
stdin=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p.communicate()
print stdout, stderr
In your situation you can simply write stdout/stderr to a file first. You can send arguments to your process with communicate as well, though I wasn't able to figure out how to continually interact with the subprocess.
On Linux, if you really need something like the tee(2) syscall, you can get it like this:
import os
import ctypes
ld = ctypes.CDLL(None, use_errno=True)
SPLICE_F_NONBLOCK = 0x02
def tee(fd_in, fd_out, length, flags=SPLICE_F_NONBLOCK):
    result = ld.tee(
        ctypes.c_int(fd_in),
        ctypes.c_int(fd_out),
        ctypes.c_size_t(length),
        ctypes.c_uint(flags),
    )
    if result == -1:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno))
    return result
To use this, you probably want to use Python 3.10 and something with os.splice (or use ctypes in the same way to get splice). See the tee(2) man page for an example.
My solution isn't elegant, but it works.
You can use powershell to gain access to "tee" under WinOS.
import subprocess
import sys
cmd = ['powershell', 'ping', 'google.com', '|', 'tee', '-a', 'log.txt']
if 'darwin' in sys.platform:
    cmd.remove('powershell')
p = subprocess.Popen(cmd)
p.wait()
