Async read console output with python while process is running


I want to run a long process (calculix simulation) via python.
As mentioned here one can read the console string with communicate().
As far as I understand the string is returned after the process is completed? Is there a possibility to get the console output while the process is running?

You can use subprocess.Popen.poll() to check whether the process has terminated:
while sub_process.poll() is None:
    output_line = sub_process.stdout.readline()
This gives you the output while the process is still running.

This should work:
sp = subprocess.Popen([your args], stdout=subprocess.PIPE)
while sp.poll() is None:  # sp.poll() returns None while the subprocess is running
    output = sp.stdout.readline()  # read the next line of stdout while the process is running
    # Do stuff with output
Note that we don't call communicate() on the subprocess here.
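As a rough Python 3 sketch of the same idea (the snippets in this thread are Python 2), the loop can be wrapped in a generator that yields each line as it arrives. The helper name stream_lines and the small inline child program are made up for illustration; the child is started with -u so that it does not buffer its own output.

```python
import subprocess
import sys


def stream_lines(args):
    """Yield each line of the child's stdout as soon as it becomes available."""
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,                 # decode bytes to str (Python 3.7+)
    )
    with proc.stdout:
        for line in proc.stdout:   # ends when the pipe reaches end-of-file
            yield line.rstrip("\n")
    proc.wait()                    # reap the child once its output is exhausted


# Hypothetical stand-in for the long-running simulation.
child = [sys.executable, "-u", "-c", "for i in range(3): print('step', i)"]
lines = list(stream_lines(child))
```

Iterating until end-of-file, rather than looping on poll(), avoids losing lines that are still in the pipe when the process exits.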

Related

Controlling a python script from another script

I am trying to learn how to write a script control.py that runs another script test.py in a loop a certain number of times. In each run it should read test.py's output and halt it if some predefined output is printed (e.g. the text 'stop now'); the loop then continues with its next iteration once test.py has finished, either on its own or by force. So something along these lines:
for i in range(n):
    os.system('test.py someargument')
    if output == 'stop now':  # stop the current test.py process and continue with the next iteration
        # output here is supposed to contain what test.py prints
The problem with the above is that it does not check the output of test.py as it is running; instead it waits until the test.py process has finished on its own, right?
Basically trying to learn how I can use a python script to control another one, as it is running. (e.g. having access to what it prints and so on).
Finally, is it possible to run test.py in a new terminal (i.e. not in control.py's terminal) and still achieve the above goals?
An attempt:
test.py is this:
from itertools import permutations
import random
perms = [''.join(p) for p in permutations('stop')]
for i in range(1000000):
    rand_ind = random.randrange(0, len(perms))
    print perms[rand_ind]
And control.py is this: (following Marc's suggestion)
import subprocess
command = ["python", "test.py"]
n = 10
for i in range(n):
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    while True:
        output = p.stdout.readline().strip()
        print output
        #if output == '' and p.poll() is not None:
        #    break
        if output == 'stop':
            print 'success'
            p.kill()
            break
            #Do whatever you want
    #rc = p.poll() #Exit Code

You can use the subprocess module or os.popen:
os.popen(command[, mode[, bufsize]])
Open a pipe to or from command. The return value is an open file object connected to the pipe, which can be read or written depending on whether mode is 'r' (default) or 'w'.
With subprocess I would suggest
subprocess.call(['python.exe', command])
or subprocess.Popen, which is similar to os.popen.
With popen you can read the connected file object and check whether "Stop now" is there.
os.system is not deprecated, so you can use it as well, but it does not give you an object to read from; you can only check the return value once execution has finished.
With subprocess.call you can run it in a new terminal. Alternatively, if you only want to call test.py multiple times, you can put your script in a def main() and run main as many times as you want until "Stop now" is generated.
Hope this solves your problem :-) Otherwise, comment again.
Looking at what you wrote above, you can also redirect the output to a file directly from the OS call, e.g. os.system('test.py *args >> /tmp/mickey.txt'), and then check the file on each round.
As said, popen returns a file object that you can access.

What you are hinting at in your comment to Marc Cabos' answer is threading.
There are several ways Python can use the functionality of other files. If the content of test.py can be encapsulated in a function or class, then you can import the relevant parts into your program, giving you greater access to the workings of that code.
As described in other answers, you can read the stdout of a script by running it in a subprocess. This could give you separate terminal outputs, as you require.
However, if you want to run test.py concurrently and access variables as they change, then you need to consider threading.

Yes, you can use Python to control another program through stdin/stdout, but reading another process's output often runs into buffering: the other process may not actually emit anything until it is done.
There are even cases in which the output is buffered or not depending on whether the program is started from a terminal.
If you are the author of both programs, it is probably better to use another interprocess channel where flushing is explicitly controlled by the code, such as sockets.
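When the child is itself a Python script, one way to sidestep the buffering described above is to start it with the interpreter's -u flag (or to have it call print(..., flush=True)). The Python 3 sketch below uses a made-up two-line child and records when each line reaches the parent; with -u the lines should arrive roughly half a second apart instead of together at exit.

```python
import subprocess
import sys
import time

# Hypothetical child: prints, pauses, prints again.
child_code = "import time; print('first'); time.sleep(0.5); print('second')"

# "-u" asks the child interpreter not to buffer its stdout.
proc = subprocess.Popen(
    [sys.executable, "-u", "-c", child_code],
    stdout=subprocess.PIPE,
    text=True,
)

arrivals = []
for line in proc.stdout:
    # Record each line together with the moment it reached the parent.
    arrivals.append((line.strip(), time.monotonic()))
proc.wait()

# Time between the two lines as seen by the parent.
gap = arrivals[1][1] - arrivals[0][1]
```

For programs that are not Python, the equivalent switch (if any) depends on the program itself.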

You can use the "subprocess" library for that.
import subprocess
command = ["python", "test.py", "someargument"]
for i in range(n):
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    while True:
        output = p.stdout.readline()
        if output == '' and p.poll() is not None:
            break
        if output.strip() == 'stop now':  # readline() keeps the trailing newline, so strip it before comparing
            # Do whatever you want, e.g. stop the process
            p.kill()
            break
    rc = p.wait()  # Exit code
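A Python 3 sketch of the same control loop, under a few assumptions: the inline CHILD program is a hypothetical stand-in for test.py, run_until is a made-up helper name, and the child flushes each line so the parent sees it immediately.

```python
import subprocess
import sys

# Hypothetical stand-in for test.py: prints three lines, pausing between them.
CHILD = (
    "import time\n"
    "for word in ['go', 'stop now', 'never reached']:\n"
    "    print(word, flush=True)\n"
    "    time.sleep(0.2)\n"
)


def run_until(marker, args):
    """Run args, collect its output lines, and kill it once marker is printed."""
    proc = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    seen = []
    with proc.stdout:
        for line in proc.stdout:
            seen.append(line.rstrip("\n"))
            if seen[-1] == marker:
                proc.kill()   # halt the child by force
                break
    proc.wait()               # reap the child before the next iteration
    return seen


results = [run_until("stop now", [sys.executable, "-c", CHILD]) for _ in range(3)]
```

Each iteration stops reading as soon as the marker line is seen, so anything the child prints afterwards is ignored.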

Reading Command line output Python

I have a problem where I am issuing a command using python and then taking in the values to create a list of services.
serviceList = subprocess.Popen(command, shell=True, stdout =subprocess.PIPE).stdout.read()
print serviceList
command is a working command that works perfectly when I copy and paste it into cmd, giving me a list of services and their status.
If I run this command it just returns nothing. When I print out serviceList it is blank.
I am using python 2.7

You should use the communicate() method instead of stdout.read() to get the value of serviceList.
Even the Python docs recommend it.
Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.
Try this:
proc = subprocess.Popen(command, shell=True, stdout =subprocess.PIPE)
serviceList = proc.communicate()[0]
print serviceList
communicate() returns a tuple (stdoutdata, stderrdata). Here, I assign the first element of the tuple to serviceList.

If the program simply prints out a bunch of information and then exits, an easier way to read its output (one that also cannot deadlock on a full pipe buffer) is to call:
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)  # only pass shell=True if you *really need it
stdoutdata, stderrdata = process.communicate()  # blocks until the process terminates
*Passing shell=True with external input opens your code to shell injection attacks and should be used with caution (see the subprocess docs).
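For reference, a small Python 3 sketch of the communicate() pattern. The child here is a made-up one-liner that writes to both streams; note that stdout and stderr must both be set to PIPE, otherwise communicate() returns None for that stream.

```python
import subprocess
import sys

# Hypothetical child that writes one line to stdout and one to stderr.
child_code = "import sys; print('out'); print('err', file=sys.stderr)"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)
# Blocks until the child exits, reading both pipes so neither can fill up.
stdout_data, stderr_data = proc.communicate()
```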

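To save the standard output as a list of lines, keep the Popen object itself in serviceList (without calling .stdout.read() on it) and add output = serviceList.stdout.readlines() to your code.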

There's also the subprocess function check_output() which blocks and returns the output of the process as a byte-string. If you want to avoid blocking, you could make a function that calls this and use it as that target for a new Thread() e.g.
import subprocess
import threading
def f():
    print subprocess.check_output([command])
threading.Thread(target=f).start()

Waiting for output from a subprocess which does not terminate

I need to run a subprocess from my script. The subprocess is an interactive (shell-like) application, to which I issue commands through the subprocess' stdin.
After I issue a command, the subprocess outputs the result to stdout and then waits for the next command (but does not terminate).
For example:
from subprocess import Popen, PIPE
p = Popen(args = [...], stdin = PIPE, stdout = PIPE, stderr = PIPE, shell = False)
# Issue a command:
p.stdin.write('command\n')
# *** HERE: get the result from p.stdout ***
# CONTINUE with the rest of the script once there is no more data in p.stdout
# NOTE that the subprocess is still running and waiting for the next command
# through stdin.
My problem is getting the result from p.stdout. The script needs to get the output while there is new data in p.stdout; but once there is no more data, I want to continue with the script.
The subprocess does not terminate, so I cannot use communicate() (which waits for the process to terminate).
I tried reading from p.stdout after issuing the command, like this:
res = p.stdout.read()
But the subprocess is not fast enough, and I just get an empty result.
I thought about polling p.stdout in a loop until I get something, but then how do I know I got everything? And it seems wasteful anyway.
Any suggestions?

Use gevent.subprocess in gevent-1.0 to substitute the standard subprocess module. It could do the concurrency tasks using synchronous logic and won't block the script. Here is a brief tutorial about gevent.subprocess

Use circuits.io.Process in circuits-dev to wrap an asynchronous call to subprocess.
Example: https://bitbucket.org/circuits/circuits-dev/src/tip/examples/ping.py

After investigating several options I reached two solutions:
* Setting the subprocess' stdout stream to be non-blocking by using the fcntl module.
* Using a thread to collect the subprocess' output into a proxy queue, and then reading the queue from the main thread.
I describe both solutions (and the problem and its origin) in this post.
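The thread-and-queue approach could look roughly like the Python 3 sketch below. This is not the code from the linked post; start_reader and the upper-casing CHILD program are hypothetical, and the timeouts are arbitrary.

```python
import queue
import subprocess
import sys
import threading


def start_reader(stream):
    """Pump lines from stream into a queue on a background thread."""
    lines = queue.Queue()

    def pump():
        for line in stream:
            lines.put(line)
        lines.put(None)  # sentinel: the stream was closed

    threading.Thread(target=pump, daemon=True).start()
    return lines


# Hypothetical interactive child: echoes each stdin line in upper case.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(line.strip().upper(), flush=True)\n"
)
proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)
lines = start_reader(proc.stdout)

proc.stdin.write("command\n")
proc.stdin.flush()
reply = lines.get(timeout=5)        # wait (bounded) for the first reply line
try:
    extra = lines.get(timeout=0.2)  # anything else pending?
except queue.Empty:
    extra = None                    # nothing more for now; the child is still running
still_running = proc.poll() is None

proc.stdin.close()                  # let the child see end-of-file and exit
proc.wait()
```

The main thread never blocks indefinitely: it either gets a line from the queue or times out and carries on.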

Detecting the end of the stream on popen.stdout.readline

I have a python program which launches subprocesses using Popen and consumes their output nearly real-time as it is produced. The code of the relevant loop is:
def run(self, output_consumer):
    self.prepare_to_run()
    popen_args = self.get_popen_args()
    logging.debug("Calling popen with arguments %s" % popen_args)
    self.popen = subprocess.Popen(**popen_args)
    while True:
        outdata = self.popen.stdout.readline()
        if not outdata and self.popen.returncode is not None:
            # Terminate when we've read all the output and the returncode is set
            break
        output_consumer.process_output(outdata)
        self.popen.poll()  # updates returncode so we can exit the loop
    output_consumer.finish(self.popen.returncode)
    self.post_run()
def get_popen_args(self):
    return {
        'args': self.command,
        'shell': False,  # Just being explicit for security's sake
        'bufsize': 0,  # More likely to see what's being printed as it happens
                       # Not guaranteed since the process itself might buffer its output
                       # run `python -u` to unbuffer the output of a python process
        'cwd': self.get_cwd(),
        'env': self.get_environment(),
        'stdout': subprocess.PIPE,
        'stderr': subprocess.STDOUT,
        'close_fds': True,  # Doesn't seem to matter
    }
This works great on my production machines, but on my dev machine, the call to .readline() hangs when certain subprocesses complete. That is, it will successfully process all of the output, including the final output line saying "process complete", but then will again poll readline and never return. This method exits properly on the dev machine for most of the sub-processes I call, but consistently fails to exit for one complex bash script that itself calls many sub-processes.
It's worth noting that popen.returncode gets set to a non-None (usually 0) value many lines before the end of the output. So I can't just break out of the loop when that is set or else I lose everything that gets spat out at the end of the process and is still buffered waiting for reading. The problem is that when I'm flushing the buffer at that point, I can't tell when I'm at the end because the last call to readline() hangs. Calling read() also hangs. Calling read(1) gets me every last character out, but also hangs after the final line. popen.stdout.closed is always False. How can I tell when I'm at the end?
All systems are running python 2.7.3 on Ubuntu 12.04LTS. FWIW, stderr is being merged with stdout using stderr=subprocess.STDOUT.
Why the difference? Is it failing to close stdout for some reason? Could the sub-sub-processes do something to keep it open somehow? Could it be because I'm launching the process from a terminal on my dev box, but in production it's launched as a daemon through supervisord? Would that change the way the pipes are processed and if so how do I normalize them?

The main code loop looks right. It could be that the pipe isn't closing because another process is keeping it open. For example, if the script launches a background process that writes to stdout, then the pipe will not close. Are you sure no other child process is still running?
One idea is to change modes once you see that .returncode has been set. Once you know the main process is done, read all its remaining output from the buffer, but don't get stuck waiting. You can use select to read from the pipe with a timeout. Set a timeout of several seconds and you can clear the buffer without getting stuck waiting for a child process.
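A minimal Python 3 sketch of that select-with-timeout idea, assuming a POSIX system (select on pipes does not work on Windows). The drain helper and the sh one-liner are hypothetical: the shell exits at once, while a background sleep keeps the pipe open for a few more seconds.

```python
import os
import select
import subprocess
import time


def drain(proc, idle_timeout=0.5):
    """Read what is left in proc.stdout, giving up after idle_timeout seconds of silence."""
    fd = proc.stdout.fileno()
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            break                    # nothing arrived in time: stop waiting
        data = os.read(fd, 4096)
        if not data:
            break                    # real end-of-file: every writer closed the pipe
        chunks.append(data)
    return b"".join(chunks)


# Hypothetical stand-in for the problematic script: the shell returns immediately,
# but the background sleep inherits stdout and holds the pipe open for 3 seconds.
proc = subprocess.Popen(
    ["sh", "-c", "echo done; sleep 3 &"],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
)
started = time.monotonic()
proc.wait()                          # the main process is finished here
leftover = drain(proc)               # returns without waiting for the sleep
elapsed = time.monotonic() - started
proc.stdout.close()
```

A plain readline() at the same point would block until the background process released the pipe.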

Without knowing the contents of the "one complex bash script" which causes the problem, there are too many possibilities to determine the exact cause.
However, focusing on the fact that you claim it works if you run your Python script under supervisord, then it might be getting stuck if a sub-process is trying to read from stdin, or just behaves differently if stdin is a tty, which (I presume) supervisord will redirect from /dev/null.
This minimal example seems to cope better with cases where my example test.sh runs subprocesses which try to read from stdin...
import os
import subprocess
f = subprocess.Popen(args='./test.sh',
                     shell=False,
                     bufsize=0,
                     stdin=open(os.devnull, 'rb'),
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     close_fds=True)
while 1:
    s = f.stdout.readline()
    if not s and f.returncode is not None:
        break
    print s.strip()
    f.poll()
print "done %d" % f.returncode
Otherwise, you can always fall back to using a non-blocking read, and bail out when you get your final output line saying "process complete", although it's a bit of a hack.

If you use readline() or read(), it should not hang. No need to check returncode or poll(). If it is hanging when you know the process is finished, it is most probably a subprocess keeping your pipe open, as others said before.
There are two things you could do to debug this:
* Try to reproduce with a minimal script instead of the current complex one, or
* Run that complex script with strace -f -e clone,execve,exit_group and see what the script starts and whether any process survives the main script (check when the main script calls exit_group; if strace is still waiting after that, you have a child still alive).

I find that calls to read (or readline) sometimes hang, despite previously calling poll. So I resorted to calling select to find out if there is readable data. However, select without a timeout can hang, too, if the process was closed. So I call select in a semi-busy loop with a tiny timeout for each iteration (see below).
I'm not sure if you can adapt this to readline, as readline might hang if the final \n is missing, or if the process doesn't close its stdout before you close its stdin and/or terminate it. You could wrap this in a generator, and every time you encounter a \n in stdout_collected, yield the current line.
Also note that in my actual code, I'm using pseudoterminals (pty) to wrap the popen handles (to more closely fake user input) but it should work without.
import os
import select
from datetime import datetime
# handle to read from
handle = self.popen.stdout
# how many seconds to wait without data
timeout = 1
begin = datetime.now()
stdout_collected = ""
while self.popen.poll() is None:
    try:
        fds = select.select([handle], [], [], 0.01)[0]
    except select.error, exc:
        print exc
        break
    if len(fds) == 0:
        # select timed out, no new data
        delta = (datetime.now() - begin).total_seconds()
        if delta > timeout:
            return stdout_collected
        # try longer
        continue
    else:
        # have data, timeout counter resets again
        begin = datetime.now()
        for fd in fds:
            if fd == handle:
                # os.read() needs a file descriptor, not the file object
                data = os.read(handle.fileno(), 1024)
                # can handle the bytes as they come in here
                # self._handle_stdout(data)
                stdout_collected += data
# process exited
# if using a pseudoterminal, close the handles here
self.popen.wait()

Why are you setting stderr to STDOUT?
The real benefit of calling communicate() on a subprocess is that you get back a tuple containing the stdout response as well as the stderr message.
Those might be useful if the logic depends on their success or failure.
Also, it would save you from the pain of having to iterate through lines. communicate() gives you everything, and there would be no unresolved questions about whether or not the full message was received.

I wrote a demo with a bash subprocess that can be easily explored.
A closed pipe can be recognized by '' in the output from readline(), while the output from an empty line is '\n'.
from subprocess import Popen, PIPE, STDOUT
p = Popen(['bash'], stdout=PIPE, stderr=STDOUT)
out = []
while True:
    outdata = p.stdout.readline()
    if not outdata:
        break
    #output_consumer.process_output(outdata)
    print "* " + repr(outdata)
    out.append(outdata)
print "* closed", repr(out)
print "* returncode", p.wait()
Example of input/output: Closing the pipe distinctly before terminating the process. That is why wait() should be used instead of poll()
[prompt] $ python myscript.py
echo abc
* 'abc\n'
exec 1>&- # close stdout
exec 2>&- # close stderr
* closed ['abc\n']
exit
* returncode 0
[prompt] $
Your code did output a huge number of empty strings for this case.
Example: Fast terminated process without '\n' on the last line:
echo -n abc
exit
* 'abc'
* closed ['abc']
* returncode 0

Python Popen not behaving like a subprocess

My problem is this: I need to get output from a subprocess, and I am using the following code to call it. (Feel free to ignore the long arguments. The important thing is the stdout=subprocess.PIPE.)
(stdout, stderr) = subprocess.Popen([self.ChapterToolPath, "-x", book.xmlPath , "-a", book.aacPath , "-o", book.outputPath+ "/" + fileName + ".m4b"], stdout= subprocess.PIPE).communicate()
print stdout
Thanks to an answer below, I've been able to get the output of the program, but I still end up waiting for the process to terminate before I get anything. The interesting thing is that in my debugger, there is all sorts of text flying by in the console and it is all ignored. But the moment that anything is written to the console in black (I am using pycharm) the program continues without a problem. Could the main program be waiting for some kind of output in order to move on? This would make sense because I am trying to communicate with it.... Is there a difference between text that I can see in the console and actual text that makes it to the stdout? And how would I collect the text written to the console?
Thanks!

The first line of the documentation for subprocess.call() describes it as such:
Run the command described by args. Wait for command to complete, then return the returncode attribute.
Thus, it necessarily waits for the subprocess to exit.
subprocess.Popen(), by contrast, does not do this; it returns a handle on a process with which one can then communicate().

To get all output from a program:
from subprocess import check_output as qx
output = qx([program, arg1, arg2, ...])
To get output while the program is running:
from subprocess import Popen, PIPE
p = Popen([program, arg1, ...], stdout=PIPE)
for line in iter(p.stdout.readline, ''):
    print line,
There might be a buffering issue on the program's side if it prints line by line when run interactively but buffers its output when run as a subprocess. There are various solutions depending on your OS or the program, e.g. you could run it using the pexpect module.
