I have a script that is part of an automated test suite. It runs very slowly on Windows but not on Linux, and I have found out why. The process we are testing ('frank') creates a child process of its own (so, a grandchild of the script). On Windows, the Python code won't return until that grandchild process also ends (this doesn't happen on Linux). The grandchild will kill itself off after 5 seconds if there is no parent (it hangs around in case another process talks to it).
I've found that I can stop communicate() from hanging this way if I don't capture stdout, but I need stdout. I read somewhere that communicate() waits for all pipe handles to be closed; I know that the stdout handle is duplicated for the grandchild, but I can't change the code I'm testing.
I've been searching for a solution. I tried some creation flags (still in the code below), but that didn't help.
This is the cut-down test:
import os
import sys
import threading
import subprocess

def read_from_pipe(process):
    last_stdout = process.communicate()[0]
    print(last_stdout)

CREATE_NEW_PROCESS_GROUP = 0x00000200
DETACHED_PROCESS = 0x00000008

# start process
command = 'frank my arguments'
cwd = "C:\\dev\\ui_test\\frank_test\\workspace\\report183"
p = subprocess.Popen(command,
                     stdout=subprocess.PIPE,
                     cwd=cwd)

# run thread to read from output
t = threading.Thread(target=read_from_pipe, args=[p])
t.start()
t.join(30)

print('finished')
Any ideas?
Thanks.
Peter.
After tips from @eryksun and a lot of Googling, I have this rather complicated lot of code! At one point I considered cheating and using os.system with redirection to a temp file, but then I realised that our test code allows for a command to time out; os.system would just block forever if the child process doesn't die.
import os
import sys
import threading
import subprocess
import time

if os.name == 'nt':
    import msvcrt
    import ctypes
    # See https://stackoverflow.com/questions/55160319/python-subprocess-waiting-for-grandchild-on-windows-with-stdout-set for details on Windows code
    # Based on https://github.com/it2school/Projects/blob/master/2017/Python/party4kids-2/CarGame/src/pygame/tests/test_utils/async_sub.py
    from ctypes.wintypes import DWORD

    if sys.version_info >= (3,):
        null_byte = '\x00'.encode('ascii')
    else:
        null_byte = '\x00'

    def ReadFile(handle, desired_bytes, ol=None):
        c_read = DWORD()
        buffer = ctypes.create_string_buffer(desired_bytes + 1)
        success = ctypes.windll.kernel32.ReadFile(handle, buffer, desired_bytes, ctypes.byref(c_read), ol)
        buffer[c_read.value] = null_byte
        return ctypes.windll.kernel32.GetLastError(), buffer.value

    def PeekNamedPipe(handle):
        c_avail = DWORD()
        c_message = DWORD()
        success = ctypes.windll.kernel32.PeekNamedPipe(handle, None, 0, None, ctypes.byref(c_avail), ctypes.byref(c_message))
        return "", c_avail.value, c_message.value

    def read_available(handle):
        buffer, bytesToRead, result = PeekNamedPipe(handle)
        if bytesToRead:
            hr, data = ReadFile(handle, bytesToRead, None)
            return data
        return b''

def read_from_pipe(process):
    if os.name == 'posix':
        last_stdout = process.communicate()[0]
    else:
        handle = msvcrt.get_osfhandle(process.stdout.fileno())
        last_stdout = b''
        while process.poll() is None:
            last_stdout += read_available(handle)
            time.sleep(0.1)
        last_stdout += read_available(handle)
    print(last_stdout)

# start process
command = 'frank my arguments'
cwd = "C:\\dev\\ui_test\\frank_test\\workspace\\report183"
p = subprocess.Popen(command,
                     stdout=subprocess.PIPE,
                     cwd=cwd)

# run thread to read from output
t = threading.Thread(target=read_from_pipe, args=[p])
t.start()
t.join(30)

print('finished')
Related
I want code like this:
if True:
    run('ABC.PY')
else:
    if ScriptRunning('ABC.PY'):
        stop('ABC.PY')
    run('ABC.PY')
Basically, I want to run a file, let's say abc.py, and, based on some conditions, stop it and run it again from another Python script. Is that possible?
I am using Windows.
You can use Python's Popen objects to run scripts as child processes.
So run('ABC.PY') would be p = Popen(["python", "ABC.PY"])
if ScriptRunning('ABC.PY') would be if p.poll() is None
stop('ABC.PY') would be p.kill()
This is a very basic example of what you are trying to achieve.
Please check out the subprocess.Popen docs to fine-tune your logic for running the script:
import subprocess
import shlex
import time

def run(script):
    scriptArgs = shlex.split(script)
    commandArgs = ["python"]
    commandArgs.extend(scriptArgs)
    procHandle = subprocess.Popen(commandArgs, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return procHandle

def isScriptRunning(procHandle):
    return procHandle.poll() is None

def stopScript(procHandle):
    procHandle.terminate()
    time.sleep(5)
    # Forcefully terminate the script if it is still running
    if isScriptRunning(procHandle):
        procHandle.kill()

def getOutput(procHandle):
    # stderr is redirected to stdout due to the stderr=subprocess.STDOUT argument in the Popen call
    stdout, _ = procHandle.communicate()
    returncode = procHandle.returncode
    return returncode, stdout

def main():
    procHandle = run("main.py --arg 123")
    time.sleep(5)
    isScriptRunning(procHandle)
    stopScript(procHandle)
    print(getOutput(procHandle))

if __name__ == "__main__":
    main()
One thing you should be aware of is stdout=subprocess.PIPE.
If your script produces a lot of output, the OS pipe buffer may fill up, causing the child process to block until .communicate() is called on the handle.
To avoid this, pass a file handle to stdout, like this:
fileHandle = open("main_output.txt", "w")
subprocess.Popen(..., stdout=fileHandle)
In this way, the output of the Python process will be dumped into the file. (You will have to modify the getOutput() function for this too; see the sketch below.)
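A minimal sketch of that file-based variant (the changed run()/getOutput() signatures are my own additions for illustration, not part of the answer above; "main_output.txt" is just the example name already used):

import subprocess

def run(script, logPath="main_output.txt"):
    # Let the OS write the child's output straight to a file,
    # so a full pipe buffer can never block the child.
    logFile = open(logPath, "w")
    procHandle = subprocess.Popen(["python"] + script.split(),
                                  stdout=logFile, stderr=subprocess.STDOUT)
    return procHandle, logFile

def getOutput(procHandle, logFile):
    procHandle.wait()  # no pipe involved, so a plain wait() cannot deadlock
    logFile.close()
    with open(logFile.name) as f:
        return procHandle.returncode, f.read()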
import subprocess

process = None

def run_or_rerun(flag):
    global process
    if flag:
        assert process is None
        process = subprocess.Popen(['python', 'ABC.PY'])
        process.wait()  # must wait or caller will hang
    else:
        if process.poll() is None:  # it is still running
            process.terminate()  # terminate process
        process = subprocess.Popen(['python', 'ABC.PY'])  # rerun
        process.wait()  # must wait or caller will hang
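A hypothetical call sequence for the function above (my addition, assuming ABC.PY exists next to the caller):

run_or_rerun(True)   # first run: starts ABC.PY and waits for it to finish
run_or_rerun(False)  # rerun: terminates ABC.PY if still running, then starts it again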
I am writing a Python daemon (using Python 3.7) that continuously checks whether data is available on stdin (using select) and does something with it. The data can contain non-unicode characters, so the daemon needs to read from sys.stdin.buffer instead of sys.stdin.
That program actually works. But I am writing a functional test for it, using a second Python script that starts the daemon with subprocess.Popen, sends data to its stdin, and reads from its stdout. For some reason, that part doesn't work: proc.stdout.readline() blocks forever, and some proc.stdin.write() calls never reach the child process.
To boil it down, I have two small scripts that illustrate the problem.
Right now, the issue is that the first readline() call in test.py is blocking and never returning, even though test1.py has already written a full line to stdout.
# test.py
from subprocess import Popen, PIPE, TimeoutExpired
import time
import select

proc = Popen(["python", "test1.py"], stdin=PIPE, stdout=PIPE, bufsize=0)

print("Writing")
proc.stdin.write("blablabla\n".encode('utf-8'))
time.sleep(2)
output = proc.stdout.readline()
print("Result")
print(output)

print("Writing")
proc.stdin.write("zxczxczxczxc\n".encode('utf-8'))
time.sleep(2)
output = proc.stdout.readline()
print("Result")
print(output)

proc.terminate()
print(proc.stdout.read())
# test1.py
import select
import signal
import sys
import logging

logging.basicConfig(level=logging.DEBUG, stream=sys.stdout)
log = logging.getLogger()

STOP = False

def stop(signum, frame):  # signal handlers receive (signum, frame)
    global STOP
    STOP = True

signal.signal(signal.SIGTERM, stop)
signal.signal(signal.SIGINT, stop)

def input_available():
    """Check if data is available on stdin."""
    data_available = select.select([sys.stdin], [], [], 0)
    return sys.stdin in data_available[0]

data = ""
while not STOP:
    if input_available():
        char = sys.stdin.buffer.read(1)
        try:
            char = char.decode('utf-8')
            data += char
        except UnicodeDecodeError:
            char = None
            print("skipping char")
        if char == '\n':
            s = f"Got line {data}"
            print(s)
            log.debug(s)
            data = ""
    sys.stdout.flush()
EDIT
Following the comment from @charles-duffy, I ran it with strace. It seems that the program in the subprocess is blocking on its read call.
This is the last bit of strace output:
write(1, "Writing\n", 8Writing
) = 8
write(4, "blablabla\n", 10) = 10
select(0, NULL, NULL, NULL, {tv_sec=2, tv_usec=0}) = 0 (Timeout)
read(5,
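One likely culprit when mixing select() with sys.stdin.buffer (an observation of mine, not from the original post): select() watches the OS-level file descriptor, but BufferedReader.read(1) pulls every byte currently available into its internal Python-level buffer and returns just one. The leftover bytes are then invisible to subsequent select() calls, so the daemon stalls mid-line even though unread data is sitting in the buffer. A minimal sketch of a way around this is to read from the raw descriptor instead:

import os
import select
import sys

def read_available_bytes():
    # select() only sees the OS-level fd, so read from it directly with
    # os.read(): nothing gets stranded in a Python-level buffer.
    if select.select([sys.stdin], [], [], 0)[0]:
        return os.read(sys.stdin.fileno(), 4096)  # returns whatever is available
    return b''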
I have implemented a variant of the code in this question:
A non-blocking read on a subprocess.PIPE in Python
to try to read the output in real time from this dummy program, test.py:
import time, sys

print "Hello there"
for i in range(100):
    time.sleep(0.1)
    sys.stdout.write("\r%d" % i)
    sys.stdout.flush()
print
print "Go now or I shall taunt you once again!"
The variation on the other question is that the calling program must read character by character, not line by line, because the dummy program test.py writes its progress indication all on one line using \r. So here it is:
import sys, time
from subprocess import PIPE, Popen
from threading import Thread

try:
    from Queue import Queue, Empty
except ImportError:
    from queue import Queue, Empty  # Python 3.x

ON_POSIX = 'posix' in sys.builtin_module_names

def enqueue_output(out, queue):
    while True:
        buffersize = 1
        data = out.read(buffersize)
        if not data:
            break
        queue.put(data)
    out.close()

p = Popen(sys.executable + " test.py", stdout=PIPE, bufsize=1, close_fds=ON_POSIX)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True  # thread dies with the program
t.start()

while True:
    p.poll()
    if p.returncode:
        break
    # Read a character without blocking
    try:
        char = q.get_nowait()
        time.sleep(0.1)
    except Empty:
        pass
    else:  # got a character
        sys.stdout.write(char)
        sys.stdout.flush()

print "left loop"
sys.exit(0)
Two problems with this:
It never exits: p.returncode never gets a value, so the loop is never left. How can I fix it?
It's really slow! Is there a way to make it more efficient without increasing buffersize?
As @Markku K. pointed out, you should use bufsize=0 to read one byte at a time.
Your code doesn't require a non-blocking read. You can simplify it:
import sys
from functools import partial
from subprocess import Popen, PIPE

p = Popen([sys.executable, "test.py"], stdout=PIPE, bufsize=0)
for b in iter(partial(p.stdout.read, 1), b""):
    print b  # it should print as soon as `sys.stdout.flush()` is called in test.py
p.stdout.close()
p.wait()
Note: reading 1 byte at a time is very inefficient.
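One way to cut the per-byte overhead while keeping low latency (my suggestion, not from the original answer; Python 3 shown): with bufsize=0 the pipe is an unbuffered raw stream, so read(n) returns as soon as any bytes are available, up to n:

import sys
from subprocess import Popen, PIPE

p = Popen([sys.executable, "test.py"], stdout=PIPE, bufsize=0)
# With bufsize=0 the pipe is a raw stream: read(512) blocks only until
# *some* bytes are available, then returns them (possibly fewer than 512).
for chunk in iter(lambda: p.stdout.read(512), b""):
    sys.stdout.buffer.write(chunk)  # on Python 2, use sys.stdout.write
    sys.stdout.flush()
p.wait()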
Also, in general, there can be a block-buffering issue, which can sometimes be solved using the pexpect or pty modules, or the unbuffer, stdbuf, or script command-line utilities.
For Python processes, you can use the -u flag to force unbuffered (at the binary layer) stdin, stdout, and stderr streams.
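For example, a one-line change to the snippet above (my addition) disables the child's block buffering, so output arrives without the child needing explicit flush() calls:

import sys
from subprocess import Popen, PIPE

# -u makes the child's stdin/stdout/stderr unbuffered at the binary layer
p = Popen([sys.executable, "-u", "test.py"], stdout=PIPE)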
The task I am trying to accomplish is to stream the output of a Ruby file and print it as it arrives. (NOTE: I don't want to print everything at once.)
main.py
from subprocess import Popen, PIPE, STDOUT
import pty
import os

file_path = '/Users/luciano/Desktop/ruby_sleep.rb'
command = ' '.join(["ruby", file_path])

master, slave = pty.openpty()
proc = Popen(command, bufsize=0, shell=True, stdout=slave, stderr=slave, close_fds=True)
stdout = os.fdopen(master, 'r', 0)

while proc.poll() is None:
    data = stdout.readline()
    if data != "":
        print(data)
    else:
        break

print("This is never reached!")
ruby_sleep.rb
puts "hello"
sleep 2
puts "goodbye!"
Problem
Streaming the file works fine. The hello/goodbye output is printed with the 2-second delay, exactly as the script should work. The problem is that readline() hangs at the end and never returns, so I never reach the last print.
I know there are a lot of questions like this on Stack Overflow, but none of them helped me solve the problem. I'm not that familiar with the whole subprocess thing, so please give me a more hands-on/concrete answer.
Regards
edit: fixed unintended code (nothing to do with the actual error)
I assume you use pty due to reasons outlined in Q: Why not just use a pipe (popen())? (all other answers so far ignore your "NOTE: I don't want to print out everything at once").
pty is Linux-only, as the docs say:
Because pseudo-terminal handling is highly platform dependent, there is code to do it only for Linux. (The Linux code is supposed to work on other platforms, but hasn’t been tested yet.)
It is unclear how well it works on other OSes.
You could try pexpect:
import sys
import pexpect
pexpect.run("ruby ruby_sleep.rb", logfile=sys.stdout)
Or stdbuf to enable line-buffering in non-interactive mode:
from subprocess import Popen, PIPE, STDOUT

proc = Popen(['stdbuf', '-oL', 'ruby', 'ruby_sleep.rb'],
             bufsize=1, stdout=PIPE, stderr=STDOUT, close_fds=True)
for line in iter(proc.stdout.readline, b''):
    print line,
proc.stdout.close()
proc.wait()
Or using pty from the stdlib, based on @Antti Haapala's answer:
#!/usr/bin/env python
import errno
import os
import pty
from subprocess import Popen, STDOUT

master_fd, slave_fd = pty.openpty()  # provide tty to enable
                                     # line-buffering on ruby's side
proc = Popen(['ruby', 'ruby_sleep.rb'],
             stdin=slave_fd, stdout=slave_fd, stderr=STDOUT, close_fds=True)
os.close(slave_fd)
try:
    while 1:
        try:
            data = os.read(master_fd, 512)
        except OSError as e:
            if e.errno != errno.EIO:
                raise
            break  # EIO means EOF on some systems
        else:
            if not data:  # EOF
                break
            print('got ' + repr(data))
finally:
    os.close(master_fd)
    if proc.poll() is None:
        proc.kill()
    proc.wait()

print("This is reached!")
All three code examples print 'hello' immediately (as soon as the first EOL is seen).
I'll leave the old, more complicated code example here because it may be referenced and discussed in other posts on SO.
Or using pty, based on @Antti Haapala's answer:
import os
import pty
import select
from subprocess import Popen, STDOUT

master_fd, slave_fd = pty.openpty()  # provide tty to enable
                                     # line-buffering on ruby's side
proc = Popen(['ruby', 'ruby_sleep.rb'],
             stdout=slave_fd, stderr=STDOUT, close_fds=True)
timeout = .04  # seconds
while 1:
    ready, _, _ = select.select([master_fd], [], [], timeout)
    if ready:
        data = os.read(master_fd, 512)
        if not data:
            break
        print("got " + repr(data))
    elif proc.poll() is not None:  # select timeout
        assert not select.select([master_fd], [], [], 0)[0]  # detect race condition
        break  # proc exited
os.close(slave_fd)  # can't do it sooner: it leads to errno.EIO error
os.close(master_fd)
proc.wait()

print("This is reached!")
Not sure what is wrong with your code, but the following seems to work for me:
#!/usr/bin/python
from subprocess import Popen, PIPE
import threading

p = Popen('ls', stdout=PIPE)

class ReaderThread(threading.Thread):
    def __init__(self, stream):
        threading.Thread.__init__(self)
        self.stream = stream

    def run(self):
        while True:
            line = self.stream.readline()
            if len(line) == 0:
                break
            print line,

reader = ReaderThread(p.stdout)
reader.start()

# Wait until subprocess is done
p.wait()

# Wait until we've processed all output
reader.join()

print "Done!"
Note that I don't have Ruby installed and hence cannot check with your actual problem. Works fine with ls, though.
Basically what you are looking at here is a race condition between your proc.poll() and your readline(). Since the write side feeding the master filehandle is never closed in the parent, the master never sees EOF; if the code attempts a readline() after the ruby process has finished writing, there will never be anything to read, and the call blocks forever. The code only works if the child process happens to exit before your code tries another readline().
Here is the timeline:
readline()
print output
poll()
readline()
print output (last line of real output)
poll() (returns None since the process is not yet done, so the loop continues)
readline() (waits for more output)
(process finishes, but the output pipe is still open and no poll() ever happens for it)
The easy fix is to just use the subprocess module as the docs suggest, not in conjunction with openpty:
http://docs.python.org/library/subprocess.html
Here is a very similar problem for further study:
Using subprocess with select and pty hangs when capturing output
Try this:
proc = Popen(command, bufsize=0, shell=True, stdout=PIPE, close_fds=True)
for line in proc.stdout:
    print line
print("This is most certainly reached!")
As others have noted, readline() will block when reading data, and it can even do so after your child process has died. I am not sure why this does not happen when executing ls as in the other answer; it may be because the pty's slave end stays open in the parent, so the master never sees EOF, whereas a plain pipe closes as soon as the child exits.
I want to run many processes in parallel, with the ability to read their stdout at any time. How should I do it? Do I need to run a thread for each subprocess.Popen() call, or what?
You can do it in a single thread.
Suppose you have a script that prints lines at random times:
#!/usr/bin/env python
# file: child.py
import os
import random
import sys
import time

for i in range(10):
    print("%2d %s %s" % (int(sys.argv[1]), os.getpid(), i))
    sys.stdout.flush()
    time.sleep(random.random())
If you'd like to collect the output as soon as it becomes available, you can use select on POSIX systems, as @zigg suggested:
#!/usr/bin/env python
from __future__ import print_function
from select import select
from subprocess import Popen, PIPE

# start several subprocesses
processes = [Popen(['./child.py', str(i)], stdout=PIPE,
                   bufsize=1, close_fds=True,
                   universal_newlines=True)
             for i in range(5)]

# read output
timeout = 0.1  # seconds
while processes:
    # remove finished processes from the list (O(N**2))
    for p in processes[:]:
        if p.poll() is not None:  # process ended
            print(p.stdout.read(), end='')  # read the rest
            p.stdout.close()
            processes.remove(p)

    # wait until there is something to read
    rlist = select([p.stdout for p in processes], [], [], timeout)[0]

    # read a line from each process that has output ready
    for f in rlist:
        print(f.readline(), end='')  # NOTE: it can block
A more portable solution (that should work on Windows, Linux, OSX) can use reader threads for each process, see Non-blocking read on a subprocess.PIPE in python.
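A minimal sketch of that thread-per-process approach (my own illustration, assuming the same child.py as above; Python 3): one daemon thread per child funnels lines into a shared queue, and the main thread consumes them as they arrive:

import sys
from queue import Queue, Empty
from subprocess import Popen, PIPE
from threading import Thread

def reader(p, q):
    # funnel each line into the shared queue, tagged with the child's pid
    for line in iter(p.stdout.readline, b''):
        q.put((p.pid, line))
    p.stdout.close()

q = Queue()
processes = [Popen([sys.executable, 'child.py', str(i)], stdout=PIPE)
             for i in range(5)]
threads = [Thread(target=reader, args=(p, q), daemon=True) for p in processes]
for t in threads:
    t.start()

# consume output from any child as soon as it arrives
while any(t.is_alive() for t in threads) or not q.empty():
    try:
        pid, line = q.get(timeout=0.1)
        print(pid, line.decode(), end='')
    except Empty:
        pass

for p in processes:
    p.wait()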
Here's an os.pipe()-based solution that works on Unix and Windows:

#!/usr/bin/env python
from __future__ import print_function
import io
import os
import sys
from subprocess import Popen

ON_POSIX = 'posix' in sys.builtin_module_names

# create a pipe to get data
input_fd, output_fd = os.pipe()

# start several subprocesses
processes = [Popen([sys.executable, 'child.py', str(i)], stdout=output_fd,
                   close_fds=ON_POSIX)  # close input_fd in children
             for i in range(5)]
os.close(output_fd)  # close unused end of the pipe

# read output line by line as soon as it is available
with io.open(input_fd, 'r', buffering=1) as file:
    for line in file:
        print(line, end='')

for p in processes:
    p.wait()
You can also collect stdout from multiple subprocesses concurrently using Twisted:
#!/usr/bin/env python
import sys
from twisted.internet import protocol, reactor

class ProcessProtocol(protocol.ProcessProtocol):
    def outReceived(self, data):
        print data,  # received chunk of stdout from child

    def processEnded(self, status):
        global nprocesses
        nprocesses -= 1
        if nprocesses == 0:  # all processes ended
            reactor.stop()

# start subprocesses
nprocesses = 5
for i in xrange(nprocesses):
    reactor.spawnProcess(ProcessProtocol(), sys.executable,
                         args=[sys.executable, 'child.py', str(i)],  # child.py expects an index argument
                         usePTY=True)  # can change how child buffers stdout
reactor.run()
See Using Processes in Twisted.
You don't need to run a thread for each process. You can peek at each process's stdout stream without blocking on it, and only read from it when it has data available.
You do have to be careful not to block on them accidentally, though, if you're not intending to.
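A minimal sketch of that peek-then-read pattern on POSIX (my own illustration; a zero select() timeout makes the check itself non-blocking):

import select
from subprocess import Popen, PIPE

def try_read_line(p):
    # zero timeout: select() returns immediately whether or not data is ready;
    # prefer bufsize=0 pipes so no bytes hide in a Python-level buffer between calls
    if select.select([p.stdout], [], [], 0)[0]:
        # note: readline() can still block if only a partial line has arrived
        return p.stdout.readline()
    return None

p = Popen(['./child.py', '0'], stdout=PIPE, bufsize=0)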
You can poll the process until it finishes and run other stuff concurrently:
import time
import sys
from subprocess import Popen, PIPE

def ex1() -> None:
    command = 'sleep 2.1 && echo "happy friday"'
    proc = Popen(command, shell=True, stderr=PIPE, stdout=PIPE)
    while proc.poll() is None:
        # do stuff here
        print('waiting')
        time.sleep(0.05)
    out, _err = proc.communicate()
    print(out, file=sys.stderr)
    sys.stderr.flush()
    assert proc.poll() == 0

ex1()
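A related note (my addition, not from the answer above): since Python 3.3, communicate() itself accepts a timeout, which can replace the polling loop when you don't need to do other work in between. The kill-then-reap pattern below is the one the subprocess docs recommend:

from subprocess import Popen, PIPE, TimeoutExpired

proc = Popen('sleep 2.1 && echo "happy friday"', shell=True, stdout=PIPE, stderr=PIPE)
try:
    out, _err = proc.communicate(timeout=5)   # waits at most 5 seconds
except TimeoutExpired:
    proc.kill()                     # stop the child...
    out, _err = proc.communicate()  # ...then reap it and collect any partial output
print(out)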