How to read standard output of a Python script from within it? - python

Here is my problem. I have an application which prints some traces to standard output using the logging module. Now I want to be able to read those traces at the same time, in order to wait for a specific trace I need.
This is for testing purposes. So, for example, if the wanted trace does not occur within about 2 seconds, the test fails.
I know I can read the output of another script by using something like this:
import subprocess

p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while True:
    line = p.stdout.readline()
    print line
    if line == '' and p.poll() != None:
        break
But, how can I do something similar from the script itself?
Thanks in advance.
EDIT
So, since my problem was expecting a certain trace to appear while the Python application is running, and since I couldn't find a simple way to do that from the application itself, I decided to start the application (as suggested in the comments) from another script.
The module I found very helpful, and easier to use than the subprocess module, is the pexpect module.
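For reference, here is a minimal sketch of that approach (the application name app.py and the trace text are placeholders, not my real code): pexpect starts the program and expect() waits for the pattern with a timeout.
import pexpect

child = pexpect.spawn('python app.py')
try:
    # expect() returns as soon as the pattern shows up in the output,
    # or raises pexpect.TIMEOUT after 2 seconds.
    child.expect('wanted trace', timeout=2)
    print('trace found, test passed')
except pexpect.TIMEOUT:
    print('trace not seen within 2 seconds, test failed')
finally:
    child.close()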

If you want to do some pre-processing of the logger messages you can do something like:
#!/usr/bin/python
import sys
import logging
import time
import types
def debug_wrapper(self, msg):
    if hasattr(self, 'last_time_seen') and 'message' in msg:
        print("INFO: seconds past since last time seen " + str(time.time() - self.last_time_seen))
    self.last_time_seen = time.time()
    self.debug_original(msg)

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logger = logging.getLogger("test")
logger.debug_original = logger.debug
logger.debug = types.MethodType(debug_wrapper, logger)

while True:
    logger.debug("INFO: some message.")
    time.sleep(1)
This works by replacing the original debug function of the logger object with your custom debug_wrapper function, in which you can do whatever processing you want, for example storing the last time you saw a message.

You can store the script's output to a file in real time and then read its contents from within the script, also in real time (since the output file is updated continuously).
To store the script output to a file in real time, you can use unbuffer, which comes with the expect package.
sudo apt-get install expect
Then run the script with:
unbuffer python script.py > output.txt
You just have to print the output in the script; it will be written to the output file as it is produced, so you can read that file whenever you need to.
Also, use > to overwrite an old file (or create a new one) and >> to append to a previously created output.txt file.
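A minimal sketch of reading output.txt as it grows (the file name and the marker text are assumptions, adjust as needed):
import time

with open('output.txt') as f:
    while True:
        line = f.readline()
        if not line:          # nothing new yet; wait a bit and retry
            time.sleep(0.1)
            continue
        if 'wanted trace' in line:
            print('found it')
            break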

If you want to record the output from print statements in other Python code, you can redirect sys.stdout to a string-like file object as follows:
import io
import sys

def foo():
    print("hello world, what else ?")

stream = io.StringIO()
sys.stdout = stream
try:
    foo()
finally:
    sys.stdout = sys.__stdout__
print(stream.getvalue())
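On Python 3.4+ you can let contextlib handle the saving and restoring of sys.stdout for you; a sketch of the same idea:
import contextlib
import io

stream = io.StringIO()
with contextlib.redirect_stdout(stream):
    foo()                      # everything foo() prints lands in stream
print(stream.getvalue())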

Related

How to return a dictionary as a function's return value running as a subprocess to its parent process?

I have two scripts, parent.py and child.py. parent.py calls child.py as a subprocess. child.py has a function that collects a certain result in a dictionary, and I wish to return that dictionary back to the parent process. I have tried printing that dictionary from child.py onto its STDOUT so that the parent process can read it, but that doesn't help me because the dictionary's contents are read as strings on separate lines by the parent.
Moreover, as suggested in the comments, I tried serializing the dictionary using JSON while printing it to stdout and reading it back in the parent using JSON. That works fine, but I also print a lot of other information from the child to its stdout, which is eventually also read by the parent and mixes things up.
Another suggestion that came up was writing the result from the child to a file in the directory and having the parent read from that file. That would work too, but I will be running hundreds of instances of this code in Celery, so it would lead to overwrites of the same file by other instances of the child.
My question is: since we have a PIPE connecting the two processes, how can I just write my dictionary directly into the PIPE from child.py and read it from parent.py?
# parent.py
import subprocess

proc = subprocess.Popen(['python3', 'child.py'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE
                        )
proc.communicate()
result = proc.stdout

# child.py
def child_function():
    result = {}
    result[1] = "one"
    result[2] = "two"
    print(result)
    #return result

if __name__ == "__main__":
    child_function()
Have the parent create a FIFO (named pipe) for the child (note that os.mkfifo only creates the pipe on disk; you still open it like a regular file):
import os
import subprocess

pipe_path = 'mypipe'
os.mkfifo(pipe_path)                      # create the named pipe
proc = subprocess.Popen(['python3', 'child.py', pipe_path],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
with open(pipe_path) as pipe:             # blocks until the child opens it for writing
    print(pipe.read())
Now the child can do this:
import sys

pipe_path = sys.argv[1]          # get the pipe path from argv
with open(pipe_path, 'w') as pipe:
    pipe.write(str(result))
This keeps your communication separate from stdin/stdout/stderr.
A subprocess running Python is in no way different from a subprocess running something else. Python doesn't know or care that the other program is also a Python program; they have no access to each other's variables, memory, running state, or other internals. Simply imagine that the subprocess is a monolithic binary. The only ways you can communicate with it are to send and receive bytes (which can be strings, if you agree on a character encoding) and signals (so you can kill your subprocess, or raise some other signal which it can trap and handle -- like a timer; you get exactly one bit of information when the timer expires, and what you do with that bit is up to the receiver of the signal).
To "serialize" information means to encode it in a way which lets the recipient deserialize it. JSON is a good example; you can transfer a (possibly nested) structure of dictionaries and lists as text, and the recipient will know how to map that stream of bytes back into the same structure.
When both sender and receiver are running the same Python version, you could also use pickles; pickle is a native Python format which allows you to transfer a richer structure. But if your needs are modest, I'd simply go with JSON.
parent.py:
import subprocess
import json

# Prefer subprocess.run() over bare-bones Popen()
proc = subprocess.run(['python3', 'child.py'],
                      check=True, capture_output=True, text=True)
result = json.loads(proc.stdout)
child.py:
import json
import logging

def child_function():
    result = {}
    result[1] = "one"
    result[2] = "two"
    logging.info('Some unrelated output which should not go into the JSON')
    print(json.dumps(result))
    #return result

if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    child_function()
To avoid mixing JSON with other output, print the other output to standard error instead of standard output (or figure out a way to embed it into the JSON after all). The logging module is a convenient way to do that, with the added bonus that you can turn it off easily, partially or entirely (the above example demonstrates logging which is turned off via logging.basicConfig because it only selects printing of messages of priority WARNING or higher, which excludes INFO). The parent will get these messages in proc.stderr.
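For completeness, a sketch (not part of the original answer) of how the parent can surface those diagnostic messages while still parsing the JSON result:
import subprocess
import json

proc = subprocess.run(['python3', 'child.py'],
                      check=True, capture_output=True, text=True)
result = json.loads(proc.stdout)        # the dictionary from the child
for line in proc.stderr.splitlines():   # any log messages the child emitted
    print('child log:', line)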
You can get the results via a file.
parent.py:
import tempfile
import os
import subprocess
import json
fd, temp_file_name = tempfile.mkstemp() # create temporary file
os.close(fd) # close the file
proc = subprocess.Popen(['python3', 'child.py', temp_file_name]) # pass file_name
proc.communicate()
with open(temp_file_name) as fp:
    result = json.load(fp) # get dictionary from here
os.unlink(temp_file_name) # no longer need this file
child.py:
import sys
import json
def child_function(temp_file_name):
    result = {}
    result[1] = "one"
    result[2] = "two"
    with open(temp_file_name, 'w') as fp:
        json.dump(result, fp)

if __name__ == "__main__":
    child_function(sys.argv[1]) # pass the file name argument

Controlling a python script from another script

I am trying to learn how to write a script, control.py, that runs another script, test.py, in a loop a certain number of times; in each run it reads test.py's output and halts it if some predefined output is printed (e.g. the text 'stop now'), and the loop then continues with its next iteration (once test.py has finished, either on its own or by force). So, something along the lines of:
for i in range(n):
    os.system('test.py someargument')
    if output == 'stop now': #stop the current test.py process and continue with the next iteration
        #output here is supposed to contain what test.py prints
The problem with the above is that it does not check the output of test.py while it is running; instead, it waits until the test.py process has finished on its own, right?
Basically, I am trying to learn how I can use a Python script to control another one as it is running (e.g. having access to what it prints, and so on).
Finally, is it possible to run test.py in a new terminal (i.e. not in control.py's terminal) and still achieve the above goals?
An attempt:
test.py is this:
from itertools import permutations
import random as random

perms = [''.join(p) for p in permutations('stop')]
for i in range(1000000):
    rand_ind = random.randrange(0, len(perms))
    print perms[rand_ind]
And control.py is this: (following Marc's suggestion)
import subprocess

command = ["python", "test.py"]
n = 10
for i in range(n):
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    while True:
        output = p.stdout.readline().strip()
        print output
        #if output == '' and p.poll() is not None:
        #    break
        if output == 'stop':
            print 'success'
            p.kill()
            break
    #Do whatever you want
    #rc = p.poll() #Exit Code
You can use the subprocess module, or also os.popen:
os.popen(command[, mode[, bufsize]])
Open a pipe to or from command. The return value is an open file object connected to the pipe, which can be read or written depending on whether mode is 'r' (default) or 'w'.
With subprocess I would suggest
subprocess.call(['python.exe', command])
or subprocess.Popen, which is similar to os.popen.
With popen you can read the connected file object and check whether "stop now" is there.
os.system is not deprecated and you can use it as well (but you won't get an object back from it); you can only check its return code at the end of execution.
With subprocess.call you can run it in a new terminal, or, if you only want to call test.py multiple times, you can put your script in a def main() and run main as many times as you want until "stop now" is generated.
Hope this solves your query :-) otherwise comment again.
Looking at what you wrote above, you can also redirect the output to a file directly from the OS call --> os.system('test.py *args >> /tmp/mickey.txt'), and then check the file at each round.
As said, popen gives you a file-like object that you can read.
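A minimal sketch of the popen approach (the command line is an assumption based on the question):
import os

with os.popen('python test.py someargument') as pipe:
    for line in pipe:
        print(line.rstrip())
        if 'stop now' in line:
            print('marker seen, stopping')
            break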
What you are hinting at in your comment to Marc Cabos' answer is Threading
There are several ways Python can use the functionality of other files. If the content of test.py can be encapsulated in a function or class, then you can import the relevant parts into your program, giving you greater access to the runnings of that code.
As described in other answers you can use the stdout of a script, running it in a subprocess. This could give you separate terminal outputs as you require.
However, if you want to run test.py concurrently and access its variables as they change, then you need to consider threading.
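A hedged sketch of that idea, assuming test.py's work can be refactored into a function that pushes its messages onto a queue (run_test and the 'stop now' marker below are illustrative, not part of the original test.py):
import threading
import queue

def run_test(messages):
    # stand-in for the real work done by test.py
    for i in range(5):
        messages.put('message %d' % i)
    messages.put('stop now')

messages = queue.Queue()
worker = threading.Thread(target=run_test, args=(messages,), daemon=True)
worker.start()

while True:
    line = messages.get()      # blocks until the thread produces something
    print(line)
    if line == 'stop now':
        break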
Yes, you can use Python to control another program using stdin/stdout, but when consuming another process's output there is often a problem of buffering; in other words, the other process doesn't really output anything until it's done.
There are even cases in which the output is buffered or not depending on whether the program is started from a terminal or not.
If you are the author of both programs, then it is probably better to use another interprocess channel where the flushing is explicitly controlled by the code, like sockets.
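If you do control both programs, the cheapest fix is often to flush explicitly in the child; a small sketch (Python 3 shown):
import sys

print('stop now', flush=True)      # flush immediately so the parent sees it
# or, equivalently:
sys.stdout.write('stop now\n')
sys.stdout.flush()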
You can use the "subprocess" library for that.
import subprocess

command = ["python", "test.py", "someargument"]
for i in range(n):
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    while True:
        output = p.stdout.readline()
        if output == '' and p.poll() is not None:
            break
        if output == 'stop now':
            pass #Do whatever you want
    rc = p.poll() #Exit Code

Logging from an External Application

I am writing a research tool and I have recently switched from using "print" statements to using the logger functionality built into Python. This, I reasoned, would allow me to give the user the option of dumping the output to a file, besides dumping it to the screen.
So far so good. The part of my code that is in Python uses "logger.info" and "logger.error" to dump to both the screen and a file. "logger" is the module-wide logger. This part works like a charm.
However, at several points, I use "subprocess.call" to run an executable through the shell. So, throughout the code, I have lines like
proc = subprocess.call(command)
The output from this command would print to the screen, as always, but it would not dump to the file that the user specified.
One possible option would be to open up a pipe to the file:
proc = subprocess.call(command, stdout=f, stderr=subprocess.STDOUT)
But that would only dump to the file and not to the screen.
Basically, my question boils down to this: is there a way I can leverage my existing logger, without having to construct another handler for files specifically for subprocess.call? (Perhaps by redirecting output to the logger?) Or is this impossible, given the current setup? If the latter, how can I improve the setup?
(Oh, also, it would be great if the logging were in 'real time', so that messages from the executable are logged as they are received.)
Thanks for any help! :)
Instead of piping stdout to a file, you can pipe it to a PIPE, and then read from that PIPE and write to logger. Something like this:
proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in proc.stdout:
    logging.info(line)
However, there's an even simpler answer: You have to use a file-like object with a file handle, but you can create one on top of pipes that passes each line to logging. You could write this object yourself, but, as @unutbu says, someone's already done it in this question. So:
with StreamLogger(logging.INFO) as out:
    proc = subprocess.call(command, stdout=out, stderr=subprocess.STDOUT)
Of course you can also temporarily wrap stdout to write to the logger and just pass the output through, e.g., using this confusingly identically-named class:
with StreamLogger('stdout'):
    proc = subprocess.call(command, stderr=subprocess.STDOUT)
unutbu's comment is good; you should take a look at Lennart's answer.
What you need is something like the functionality of tee, but the subprocess module works at the level of OS handles, which means that data written by the subprocess can't be seen by your Python code, say by some file-like object you write which logs and prints whatever is written to it.
As well as using Lennart's answer, you can do this sort of thing using a third party library like sarge (disclosure: I'm its maintainer). It works for more than logging. Suppose you have a program that generates output, such as:
# echotest.py
import time

for i in range(10):
    print('Message %d' % (i + 1))
and you want to capture it in your script, log it and print it to screen:
# subptest.py
from sarge import capture_stdout
import logging
import sys

logging.basicConfig(filename='subptest.log', filemode='w',
                    level=logging.INFO)

p = capture_stdout('python echotest.py', async=True)
while True:
    line = p.stdout.readline()
    line = line.strip()
    # depending on how the child process generates output,
    # sometimes you won't see anything for a bit. Hence only print and log
    # if you get something
    if line:
        print(line)
        logging.info(line)
    # Check to see when we can stop - after the child is done.
    # The return code will be set to the value of the child's exit code,
    # so it won't be None any more.
    rc = p.commands[0].process.poll()
    # if no more output and the subprocess is done, break
    if not line and rc is not None:
        break
If you run the above script, you get printed out to the console:
$ python subptest.py
Message 1
Message 2
Message 3
Message 4
Message 5
Message 6
Message 7
Message 8
Message 9
Message 10
And when we check the log file, we see:
$ cat subptest.log
INFO:root:Message 1
INFO:root:Message 2
INFO:root:Message 3
INFO:root:Message 4
INFO:root:Message 5
INFO:root:Message 6
INFO:root:Message 7
INFO:root:Message 8
INFO:root:Message 9
INFO:root:Message 10

How to write script output to file and command-line?

I have a long-running Python script that I run from the command line. The script writes progress messages and results to the standard output. I want to capture everything the script writes to the standard output in a file, but also see it on the command line. Alternatively, I want the output to go to the file immediately, so I can use tail to view the progress. I have tried this:
python MyLongRunngingScript.py | tee log.txt
But it does not produce any output (just running the script produces output as expected). Can anyone propose a simple solution? I am using Mac OS X 10.6.4.
Edit: I am using print for output in my script.
You are on the right path, but the problem is that Python is buffering its output.
Fortunately there is a way to tell it not to buffer output:
python -u MyLongRunngingScript.py | tee log.txt
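If passing -u is inconvenient, two alternatives with the same effect (a sketch, not from the original answer): set PYTHONUNBUFFERED in the environment, or flush explicitly in the script.
PYTHONUNBUFFERED=1 python MyLongRunngingScript.py | tee log.txt
or, inside the script (Python 3):
print("progress message", flush=True)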
The fact that you don't see anything is probably related to buffering: you only get output every 4 KB of text or so.
Instead, try something like this:
class OutputSplitter(object):
    def __init__(self, real_output, *open_files):
        self.__stdout = real_output
        self.__fds = open_files
        self.encoding = real_output.encoding

    def write(self, string):
        self.__stdout.write(string) # don't catch exceptions on that one.
        self.__stdout.flush()
        for fd in self.__fds:
            try:
                fd.write(string)
                fd.flush()
            except IOError:
                pass # do what you want here.

    def flush(self):
        pass # already flushed
Then decorate sys.stdout with that class using some code like this:
stdout_saved = sys.stdout
logfile = open("log.txt","a") # check exception on that one.
sys.stdout = OutputSplitter(stdout_saved, logfile)
That way, every output (print included) is flushed to the standard output and to the specified file. This might require tweaking because I haven't tested the implementation.
Of course, expect to see a (usually small) performance penalty when printing messages.
Another simple solution could also be
python script.py > output.log
You could try doing sys.stdout.flush() occasionally in your script, and running with tee again. When stdout is redirected through to tee, it might get buffered for longer than if it's going straight to a terminal.

How to capture Python interpreter's and/or CMD.EXE's output from a Python script?

Is it possible to capture Python interpreter's output from a Python script?
Is it possible to capture Windows CMD's output from a Python script?
If so, which librar(y|ies) should I look into?
If you are talking about the Python interpreter or CMD.exe that is the 'parent' of your script, then no, it isn't possible. In every POSIX-like system (you're running Windows, it seems, and that might have some quirk I don't know about, YMMV) each process has three streams: standard input, standard output and standard error. By default (when running in a console) these are directed to the console, but redirection is possible using the pipe notation:
python script_a.py | python script_b.py
This ties the standard output stream of script A to the standard input stream of script B. Standard error still goes to the console in this example. See the article on standard streams on Wikipedia.
If you're talking about a child process, you can launch it from python like so (stdin is also an option if you want two way communication):
import subprocess
# Of course you can open things other than python here :)
process = subprocess.Popen(["python", "main.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
x = process.stderr.readline()
y = process.stdout.readline()
process.wait()
See the Python subprocess module for information on managing the process. For communication, the process.stdin and process.stdout pipes are considered standard file objects.
For use with pipes, reading from standard input as lassevk suggested, you'd do something like this:
import sys
x = sys.stderr.readline()
y = sys.stdin.readline()
sys.stdin and sys.stdout are standard file objects as noted above, defined in the sys module. You might also want to take a look at the pipes module.
Reading data with readline() as in my example is a pretty naïve way of getting data, though. If the output is not line-oriented or is nondeterministic, you probably want to look into polling, which unfortunately does not work on Windows, but I'm sure there's some alternative out there.
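On POSIX systems, a minimal sketch of that polling idea using Python 3's selectors module (the child command is an assumption):
import selectors
import subprocess

process = subprocess.Popen(["python", "main.py"], stdout=subprocess.PIPE)
sel = selectors.DefaultSelector()
sel.register(process.stdout, selectors.EVENT_READ)

while process.poll() is None:
    if sel.select(timeout=1.0):               # is there data within 1 second?
        line = process.stdout.readline()
        if line:
            print("child wrote:", line.decode().rstrip())
sel.close()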
I think I can point you to a good answer for the first part of your question.
1. Is it possible to capture Python interpreter's output from a Python
script?
The answer is "yes", and personally I like the following lifted from the examples in the PEP 343 -- The "with" Statement document.
from contextlib import contextmanager
import sys

@contextmanager
def stdout_redirected(new_stdout):
    saved_stdout = sys.stdout
    sys.stdout = new_stdout
    try:
        yield None
    finally:
        sys.stdout.close()
        sys.stdout = saved_stdout
And used like this:
with stdout_redirected(open("filename.txt", "w")):
    print "Hello world"
A nice aspect of it is that it can be applied selectively around just a portion of a script's execution, rather than its entire extent, and stays in effect even when unhandled exceptions are raised within its context. If you re-open the file in append-mode after its first use, you can accumulate the results into a single file:
with stdout_redirected(open("filename.txt", "w")):
    print "Hello world"
print "screen only output again"
with stdout_redirected(open("filename.txt", "a")):
    print "Hello world2"
Of course, the above could also be extended to also redirect sys.stderr to the same or another file. Also see this answer to a related question.
Actually, you definitely can, and it's beautiful, ugly, and crazy at the same time!
You can replace sys.stdout and sys.stderr with StringIO objects that collect the output.
Here's an example, save it as evil.py:
import sys
import StringIO
s = StringIO.StringIO()
sys.stdout = s
print "hey, this isn't going to stdout at all!"
print "where is it ?"
sys.stderr.write('It actually went to a StringIO object, I will show you now:\n')
sys.stderr.write(s.getvalue())
When you run this program, you will see that:
nothing went to stdout (where print usually prints to)
the first string that gets written to stderr is the one starting with 'It'
the next two lines are the ones that were collected in the StringIO object
Replacing sys.stdout/err like this is an application of what's called monkeypatching. Opinions may vary whether or not this is 'supported', and it is definitely an ugly hack, but it has saved my bacon when trying to wrap around external stuff once or twice.
Tested on Linux, not on Windows, but it should work just as well. Let me know if it works on Windows!
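On Python 3 the same trick works with io.StringIO (the StringIO module no longer exists there); a sketch:
import io
import sys

s = io.StringIO()
sys.stdout = s
print("hey, this isn't going to stdout at all!")
sys.stderr.write('It went to a StringIO object instead:\n')
sys.stderr.write(s.getvalue())
sys.stdout = sys.__stdout__     # restore the real stdout afterwards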
You want subprocess. Look specifically at Popen in 17.1.1 and communicate in 17.1.2.
In which context are you asking?
Are you trying to capture the output from a program you start on the command line?
if so, then this is how to execute it:
somescript.py | your-capture-program-here
and to read the output, just read from standard input.
If, on the other hand, you're executing that script or cmd.exe or similar from within your program, and want to wait until the script/program has finished and capture all its output, then you need to look at the library calls you use to start that external program; most likely there is a way to read its output and wait for completion.
