I have a long-running Python script that I run from the command line. The script writes progress messages and results to standard output. I want to capture everything the script writes to standard output in a file, but also see it on the command line. Alternatively, I want the output to go to the file immediately, so I can use tail to view the progress. I have tried this:
python MyLongRunngingScript.py | tee log.txt
But it does not produce any output (just running the script produces output as expected). Can anyone propose a simple solution? I am using Mac OS X 10.6.4.
Edit: I am using print for output in my script.
You are on the right path, but the problem is that Python is buffering its output.
Fortunately there is a way to tell it not to buffer output:
python -u MyLongRunngingScript.py | tee log.txt
The fact that you don't see anything is probably related to buffering: you only get output every 4 KB of text or so.
Instead, try something like this:
class OutputSplitter(object):
    def __init__(self, real_output, *open_files):
        self.__stdout = real_output
        self.__fds = open_files
        self.encoding = real_output.encoding

    def write(self, string):
        self.__stdout.write(string)  # don't catch exceptions on that one
        self.__stdout.flush()
        for fd in self.__fds:
            try:
                fd.write(string)
                fd.flush()
            except IOError:
                pass  # do what you want here

    def flush(self):
        pass  # already flushed
Then decorate sys.stdout with that class using code like this:
stdout_saved = sys.stdout
logfile = open("log.txt","a") # check exception on that one.
sys.stdout = OutputSplitter(stdout_saved, logfile)
That way, every output (print included) is flushed to standard output and to the specified file. This might require tweaking because I haven't tested that implementation.
Of course, expect a (usually small) performance penalty when printing messages.
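When you are finished, you will probably want to undo the redirection; a small follow-up to the snippet above:

# When the script is done, put the real stdout back and close the log file
# (or wrap the main body of the script in try/finally to guarantee this runs).
sys.stdout = stdout_saved
logfile.close()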
Another simple solution could also be
python script.py > output.log
You could try doing sys.stdout.flush() occasionally in your script, and running with tee again. When stdout is redirected through to tee, it might get buffered for longer than if it's going straight to a terminal.
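For example, a minimal sketch using the Python 2 print statement mentioned in the question (the message text is just illustrative):

import sys

print "progress: step 1 done"
sys.stdout.flush()  # push the line through the pipe to tee right away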
Here is my problem: I have an application that prints some traces to standard output using the logging module. Now, I want to be able to read those traces at the same time, in order to wait for a specific trace I need.
This is for testing purposes. So, for example, if the wanted trace does not occur within about 2 seconds, the test fails.
I know I can read the output of another script by using something like this:
import subprocess

p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while True:
    line = p.stdout.readline()
    print line
    if line == '' and p.poll() is not None:
        break
But, how can I do something similar from the script itself?
Thanks in advance.
EDIT
So, since my problem was expecting certain trace to appear while the Python application is running, and since I couldn't find a simple way to do so from the application itself, I decided to start the application (as suggested in comments) from another script.
The module I found very helpful, and easier to use than the subprocess module, is the pexpect module.
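For example, a minimal, untested sketch of waiting for a trace with pexpect (the script name app.py and the trace text are placeholders, not from the original post):

import pexpect

# Start the application under test; "app.py" is a made-up placeholder name.
child = pexpect.spawn("python app.py")
try:
    # Fail if the trace we are waiting for does not show up within 2 seconds.
    child.expect("expected trace text", timeout=2)
    print("trace seen, test passes")
except pexpect.TIMEOUT:
    print("trace not seen within 2 seconds, test fails")
finally:
    child.close()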
If you want to do some pre-processing of the logger messages you can do something like:
#!/usr/bin/python
import sys
import logging
import time
import types

def debug_wrapper(self, msg):
    if hasattr(self, 'last_time_seen') and 'message' in msg:
        print("INFO: seconds past since last time seen " + str(time.time() - self.last_time_seen))
    self.last_time_seen = time.time()
    self.debug_original(msg)

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logger = logging.getLogger("test")
logger.debug_original = logger.debug
logger.debug = types.MethodType(debug_wrapper, logger)

while True:
    logger.debug("INFO: some message.")
    time.sleep(1)
This works by replacing the original debug function of the logger object with your custom debug_wrapper function, in which you can do whatever processing you want, for example storing the last time you saw a message.
You can store the script output to a file in real time and then read its contents within the script, also in real time (since the contents of the output file are updated dynamically).
To store the script output to a file in real time, you may use unbuffer, which comes with the expect package.
sudo apt-get install expect
Then, while running the script, use:
unbuffer python script.py > output.txt
You just have to print the output in the script; it is written to the output file as it is produced, so you can read that file at any time.
Also, use > to overwrite an old file or create a new one, and >> to append to a previously created output.txt file.
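For example, a minimal sketch of following the growing output.txt from another Python process, tail -f style (the polling interval is arbitrary):

import time

# Follow output.txt as it grows, similar to tail -f.
with open("output.txt") as f:
    while True:
        line = f.readline()
        if not line:
            time.sleep(0.1)  # nothing new yet, wait a bit and retry
            continue
        print("got: " + line.rstrip())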
If you want to record the output from print statements in other Python code, you can redirect sys.stdout to a string-like file object as follows:
import io
import sys

def foo():
    print("hello world, what else ?")

stream = io.StringIO()
sys.stdout = stream
try:
    foo()
finally:
    sys.stdout = sys.__stdout__

print(stream.getvalue())
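On Python 3.4+ you can also let contextlib.redirect_stdout handle the save/replace/restore for you; a minimal sketch reusing the foo() defined above:

import contextlib
import io

stream = io.StringIO()
with contextlib.redirect_stdout(stream):
    foo()  # the foo() from the snippet above
print(stream.getvalue())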
I'm using Python 3 on Komodo, and I want there to be a time delay between the execution of commands. However, using the code below, all of the print output appears at the same time, even though the timestamp printed after all the commands have executed is two seconds later than the one printed before. Is there a way for the first line to be printed, wait a second, the second line to be printed, wait a second, and then have the third and fourth lines be printed?
import time
from time import sleep
t=time.asctime(time.localtime(time.time()));
print(t)
time.sleep(1)
print('Good Night')
time.sleep(1)
print('I"m back')
t=time.asctime(time.localtime(time.time()));
print(t)
By default, print prints to sys.stdout, which is line-buffered when writing to an interactive terminal,[1] but block-buffered when writing to a file.
So, when you run your code with python myscript.py from your Terminal or Command Prompt, you will see each line appear as it's printed, as desired.
But if you run it with, say, python myscript.py >outfile, nothing will get written until the buffer fills up (or until the script exits, if that never happens). Normally, that's fine. But apparently, however you're running your script in Komodo, it looks like a regular file, not an interactive terminal, to Python.
It's possible that you can fix that just by using or configuring Komodo differently.
I don't know much about Komodo, but I do see that there's an addon for embedding a terminal; maybe if you use that instead of sending output to the builtin JavaScript (?) console, things will work better, but I really have no idea.
Alternatively, you can make sure that the output buffer is flushed after each line by doing it manually, e.g., by passing the flush argument to print:
print(t, flush=True)
If you really want to, you can even replace print in your module with a function that always does this:
import builtins
import functools
print = functools.partial(builtins.print, flush=True)
… but you probably don't want to do that.
Alternatively, you can replace sys.stdout with a line-buffered file object over the raw stdout, just by calling open on its underlying raw file or file descriptor:
sys.stdout = open(sys.stdout.fileno(), 'w', buffering=1, closefd=False)
If you search around Stack Overflow or the web, you'll find a lot of suggestions to disable buffering. And you can force Python to use unbuffered output with the -u flag or the PYTHONUNBUFFERED environment variable. But that may not do any good in Python 3.[2]
[1] As the sys.stdout documentation explains, it's just a regular text file, like those returned by open. As explained in the open documentation, this distinction is made by calling isatty.
[2] Python 2's stdout is just a thin wrapper around the C stdio object, so if you open it unbuffered, there's no buffering. Python 3's stdout is a hefty wrapper around the raw file descriptor that does its own buffering and decoding (see the io docs for details), so -u will make sys.stdout.buffer.raw unbuffered, but sys.stdout itself will still be buffered, as explained in the -u docs.
I have a weird problem reading from STDIN in a Python script.
Here is my use case. I have rsyslog configured with an output module so rsyslog can pipe log messages to my Python script.
My Python script is really trivial:
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import sys
fd = open('/tmp/testrsyslogomoutput.txt', 'a')
fd.write("Receiving log message : \n%s\n" % ('-'.join(sys.stdin.readlines())))
fd.close()
If I run echo "foo" | mypythonscript.py I can get "foo" in the target file /tmp/testrsyslogomoutput.txt. However, when I run it within rsyslog, messages seem to be sent only when I stop/restart rsyslog (I believe some buffer is flushed at some point).
I first thought it was a problem with rsyslog, so I replaced my Python program with a shell one, without changing anything in the rsyslog configuration. The shell script works perfectly with rsyslog and, as you can see in the code below, it is really trivial:
#! /bin/sh
cat /dev/stdin >> /tmp/testrsyslogomoutput.txt
Since my shell script works but my Python one does not, I believe I made a mistake somewhere in my Python code but I can not find where. If you could point me to my mistake(s) that would be great.
Thanks in advance :)
readlines will not return until it has finished reading the file. Since the pipe feeding stdin never finishes, readlines never finishes either. Stopping rsyslog closes the pipe and lets it finish.
I'd also suspect the reason is that rsyslog does not terminate. readlines() should not return until it reaches a real EOF. But why would the shell script act differently? Perhaps the use of /dev/stdin is the reason. Try this version and see if it still runs without hanging:
#!/bin/sh
cat >> /tmp/testrsyslogomoutput.txt
If this makes a difference, you'll also have a fix: open and read /dev/stdin from python, instead of sys.stdin.
Edit: So cat somehow reads whatever is waiting at stdin and returns, but python blocks and waits until stdin is exhausted. Strange. You can also try replacing readlines() with a single read() followed by split("\n"), but at this point I doubt that will help.
So, forget the diagnosis and let's try a work-around: Force stdin to do non-blocking i/o. The following is supposed to do that:
import fcntl
import os
import sys

# Add O_NONBLOCK to the stdin descriptor flags
flags = fcntl.fcntl(0, fcntl.F_GETFL)
fcntl.fcntl(0, fcntl.F_SETFL, flags | os.O_NONBLOCK)

message = sys.stdin.read().split("\n")  # Read what's waiting, in one go
fd = open('/tmp/testrsyslogomoutput.txt', 'a')
fd.write("Receiving log message : \n%s\n" % ('-'.join(message)))
fd.close()
You probably want to use that in combination with python -u. Hope it works!
If you use readline() instead, it will return on \n, though this will only write one line and then quit.
If you want to keep writing lines as long as they are there, you can use a simple for loop over stdin, writing each line to the output file:
fd = open('/tmp/testrsyslogomoutput.txt', 'a')
for line in sys.stdin:
    fd.write("Receiving log message : \n%s\n" % line)
fd.close()
I have seen this question answered in reference to Bash, but can't find one for Python. Apologies if this is repeating something.
Is it possible to print to the terminal and an output file with one command? I'm familiar with using print >> and sys.stdout = WritableObject, but I'd like to avoid having to duplicate print commands for each line I want logged.
I'm using Python 2.6, just in case such knowledge is necessary.
More importantly, I want this to run on a Windows-based system using IDLE's command line. So, in essence, I want the python script to report to IDLE's terminal and a given log file.
EDIT: For anyone who finds this and decides to go with the answer I chose, if you need help understanding context managers (like I did), I recommend Doug Hellman's Python Modules of the Week for clarification. This one details the context library. For help with decorators see this Stack Overflow question's answers.
Replace sys.stdout.
class PrintAndLog(object):
    def __init__(self, fileOrPath):  # choose which makes more sense
        self._stdout = sys.stdout    # keep a reference to the real stdout to avoid recursion
        self._file = ...
    def write(self, s):
        self._stdout.write(s)
        self._file.write(s)
    def close(self):
        self._file.close()
    # insert wrappers for .flush, .writelines

_old_stdout = sys.stdout
sys.stdout = PrintAndLog(f)
... # print and stuff
sys.stdout = _old_stdout
This can be put into a context manager (this is at least the third time I've seen something like this on SO alone...):
from contextlib import contextmanager

@contextmanager
def replace_stdout(f):
    old_stdout = sys.stdout
    try:
        sys.stdout = PrintAndLog(f)
        yield
    finally:
        sys.stdout = old_stdout
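Hypothetical usage of that context manager (the file name log.txt is just an example):

with open("log.txt", "a") as f:
    with replace_stdout(f):
        print "this line goes to both the terminal and log.txt"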
Why not just write a function?
def myPrint(anOpenFileObject, message):
    print message
    anOpenFileObject.write(message + "\n")  # print adds a newline, so add one here too
If you're on Unix: at the start of your program, you could mkfifo a named pipe and launch, in the background, a cat from it piped into a tee of the terminal and the desired output file. Then, throughout your program, write to the named pipe. Finally, rm the named pipe just before exiting. A rough sketch of this idea follows.
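A rough, Unix-only sketch of that idea (the pipe path and log file name are made up for illustration):

import os
import subprocess
import sys

fifo_path = "/tmp/myprog.fifo"  # hypothetical location for the named pipe
os.mkfifo(fifo_path)

# Background reader: copies whatever arrives on the pipe to the terminal
# and to log.txt. It blocks until the pipe is opened for writing below.
reader = subprocess.Popen("cat %s | tee log.txt" % fifo_path, shell=True)

pipe = open(fifo_path, "w", 1)  # line-buffered writer on the pipe
sys.stdout = pipe               # print now reaches both destinations

print("hello from the named pipe")

sys.stdout = sys.__stdout__     # restore stdout and clean up
pipe.close()
reader.wait()
os.remove(fifo_path)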
But honestly I would just make a wrapper around print that prints to both dests.
Is it possible to capture Python interpreter's output from a Python script?
Is it possible to capture Windows CMD's output from a Python script?
If so, which librar(y|ies) should I look into?
If you are talking about the Python interpreter or CMD.exe that is the 'parent' of your script, then no, it isn't possible. In every POSIX-like system (you're running Windows, it seems, and that might have some quirk I don't know about, YMMV) each process has three streams: standard input, standard output and standard error. By default (when running in a console) these are directed to the console, but redirection is possible using the pipe notation:
python script_a.py | python script_b.py
This ties the standard output stream of script A to the standard input stream of script B. Standard error still goes to the console in this example. See the Wikipedia article on standard streams.
If you're talking about a child process, you can launch it from Python like so (stdin is also an option if you want two-way communication):
import subprocess
# Of course you can open things other than python here :)
process = subprocess.Popen(["python", "main.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
x = process.stderr.readline()
y = process.stdout.readline()
process.wait()
See the Python subprocess module for information on managing the process. For communication, the process.stdin and process.stdout pipes are considered standard file objects.
For use with pipes, reading from standard input as lassevk suggested, you'd do something like this:
import sys
x = sys.stderr.readline()
y = sys.stdin.readline()
sys.stdin and sys.stdout are standard file objects as noted above, defined in the sys module. You might also want to take a look at the pipes module.
Reading data with readline() as in my example is a pretty naïve way of getting data, though. If the output is not line-oriented or is indeterministic, you probably want to look into polling, which unfortunately does not work on Windows, but I'm sure there's some alternative out there.
I think I can point you to a good answer for the first part of your question.
1. Is it possible to capture Python interpreter's output from a Python script?
The answer is "yes", and personally I like the following lifted from the examples in the PEP 343 -- The "with" Statement document.
from contextlib import contextmanager
import sys

@contextmanager
def stdout_redirected(new_stdout):
    saved_stdout = sys.stdout
    sys.stdout = new_stdout
    try:
        yield None
    finally:
        sys.stdout.close()
        sys.stdout = saved_stdout
And used like this:
with stdout_redirected(open("filename.txt", "w")):
    print "Hello world"
A nice aspect of it is that it can be applied selectively around just a portion of a script's execution, rather than its entire extent, and stays in effect even when unhandled exceptions are raised within its context. If you re-open the file in append-mode after its first use, you can accumulate the results into a single file:
with stdout_redirected(open("filename.txt", "w")):
    print "Hello world"

print "screen only output again"

with stdout_redirected(open("filename.txt", "a")):
    print "Hello world2"
Of course, the above could also be extended to also redirect sys.stderr to the same or another file. Also see this answer to a related question.
Actually, you definitely can, and it's beautiful, ugly, and crazy at the same time!
You can replace sys.stdout and sys.stderr with StringIO objects that collect the output.
Here's an example, save it as evil.py:
import sys
import StringIO
s = StringIO.StringIO()
sys.stdout = s
print "hey, this isn't going to stdout at all!"
print "where is it ?"
sys.stderr.write('It actually went to a StringIO object, I will show you now:\n')
sys.stderr.write(s.getvalue())
When you run this program, you will see that:
nothing went to stdout (where print usually prints to)
the first string that gets written to stderr is the one starting with 'It'
the next two lines are the ones that were collected in the StringIO object
Replacing sys.stdout/err like this is an application of what's called monkeypatching. Opinions may vary whether or not this is 'supported', and it is definitely an ugly hack, but it has saved my bacon when trying to wrap around external stuff once or twice.
Tested on Linux, not on Windows, but it should work just as well. Let me know if it works on Windows!
You want subprocess. Look specifically at Popen in 17.1.1 and communicate in 17.1.2.
In which context are you asking?
Are you trying to capture the output from a program you start on the command line?
If so, then this is how to execute it:
somescript.py | your-capture-program-here
and to read the output, just read from standard input.
If, on the other hand, you're executing that script or cmd.exe or similar from within your program, and you want to wait until the script/program has finished and capture all its output, then you need to look at the library calls you use to start that external program. Most likely there is a way to ask it to give you the output and to wait for completion.
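For example, a minimal sketch with the standard subprocess module (the script name somescript.py is just the placeholder used above):

import subprocess

proc = subprocess.Popen(["python", "somescript.py"],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()  # waits for the child to finish
print(out)   # everything the child wrote to standard output
print(err)   # everything it wrote to standard error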