I am struggling with Python's subprocess module on Windows.
This is test code 1 (named test1.py):
import subprocess as sbp
with sbp.Popen('python tests/test2.py', stdout=sbp.PIPE) as proc:
    print('parent process')
    print(proc.stdout.read(1))
    print('end.')
and test code 2 (named test2.py):
import random
import time
def r():
    while True:
        yield random.randint(0, 100)

for i in r():
    print(i)
    time.sleep(1)
In short, test code 2 generates random integers (0~100) and prints them out indefinitely.
I want test code 1 to create a subprocess, launch it, and read its stdout in real time (not waiting for the subprocess to finish).
But when I run the code, the output is:
python.exe test1.py
parent process
It blocks on stdout.read() forever.
I have tried:
Replacing stdout.read with communicate(); as the Python docs warn, it blocks until the subprocess terminates, so it doesn't help.
Using poll() to check the subprocess and then reading n bytes; it still blocks forever on read().
Modifying test2.py to generate only one number and break the loop; the parent process prints it out immediately (I think this is because the child process terminated).
I searched many similar answers and did as they suggested (use stdout instead of communicate()), but it still didn't work.
Could anyone explain why, and how to do this?
This is my platform information:
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:54:40) [MSC v.1900 64 bit (AMD64)] on win32
It has to do with Python's output buffering (for the child process, in your case). Try disabling the buffering and your code should work. You can do it either by running python with the -u flag, or by calling sys.stdout.flush() in the child.
To use the -u flag you need to modify the arguments in the call to Popen; to use the flush() call you need to modify test2.py.
Also, your test1.py would print just a single number, because you read only 1 byte from the pipe instead of reading in a loop.
Solution 1:
test1.py
import subprocess as sbp
with sbp.Popen(["python3", "-u", "./test2.py"], stdout=sbp.PIPE) as proc:
print("parent process")
while proc.poll() is None: # Check the the child process is still running
data = proc.stdout.read(1) # Note: it reads as binary, not text
print(data)
print("end")
This way you don't have to touch the test2.py at all.
Solution 2:
test1.py
import subprocess as sbp
with sbp.Popen("./test2.py", stdout=sbp.PIPE) as proc:
print("parent process")
while proc.poll() is None: # Check the the child process is still running
data = proc.stdout.read(1) # Note: it reads as binary, not text
print(data)
print("end")
test2.py
import random
import time
import sys
def r():
    while True:
        yield random.randint(0, 100)

for i in r():
    print(i)
    sys.stdout.flush()  # Here you force Python to instantly flush the buffer
    time.sleep(1)
This will print each received byte on a new line, e.g.:
parent process
b'9'
b'5'
b'\n'
b'2'
b'6'
b'\n'
You can switch the pipe to text mode by providing an encoding in the arguments, or by passing universal_newlines=True, which makes it use the default encoding. Then you can write directly to sys.stdout of your parent process, which effectively streams the output of the child process to the output of the parent process.
test1.py
import subprocess as sbp
import sys
with sbp.Popen("./test2.py", stdout=sbp.PIPE, universal_newlines=True) as proc:
print("parent process")
while proc.poll() is None: # Check the the child process is still running
data = proc.stdout.read(1) # Note: it reads as binary, not text
sys.stdout.write(data)
print("end")
This will provide the output as if test2.py were executed directly:
parent process
33
94
27
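A variant worth noting: since the pipe is in text mode here, you can also iterate over it line by line instead of reading single characters. A minimal sketch (my variation, not part of the original solutions), assuming the same test2.py:
import subprocess as sbp
import sys

# Variant sketch: read the text-mode pipe line by line; -u keeps the
# child's stdout unbuffered so each line arrives as soon as it is printed.
with sbp.Popen(["python", "-u", "./test2.py"],
               stdout=sbp.PIPE, universal_newlines=True) as proc:
    for line in proc.stdout:  # blocks only until the next newline
        sys.stdout.write(line)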
Related
I'm using Python 3.6.8 and have a situation where one process cannot continue until another one has finished.
p1 is in the main thread and must stay open for a long time doing things.
p2 must run in a separate thread (daemon=True), read stdout/err using communicate(), and finish.
(all pipes are needed; I must not disable them)
As you will see below, when I run the code with Python 3.10.4 I get the output "thread.popen/communicate", but Python 3.6.8 does not print this line.
I think it gets stuck inside communicate().
What I'm asking for is a workaround for 3.6.8 and, optionally, an explanation of what is going on with Python 3.6.8: a bug with locks, or maybe pipes?
Thank you!
import threading
from time import sleep
from subprocess import Popen, PIPE, STDOUT
def run():
    print('thread')
    p2 = Popen('git', stdin=PIPE, stdout=PIPE, stderr=PIPE)
    o, e = p2.communicate()
    print('thread.popen/communicate')

if __name__ == '__main__':
    threading.Thread(target=run, daemon=True).start()
    p1 = Popen('cmd', stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    print('main.popen')
    # p1.wait()
    sleep(2)
F:\MySSDPrograms\cudatext\py\cuda_lsp>python.exe new.py
thread
main.popen
thread.popen/communicate
F:\MySSDPrograms\cudatext\py\cuda_lsp>f:\Python36\python.exe new.py
thread
main.popen
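One possible workaround sketch (this rests on an assumption about the cause, not a confirmed diagnosis): on Windows, Python 3.6 can leak a child's inheritable pipe handles into a sibling process that is being created at the same time, so communicate() never sees EOF; 3.7+ restricts handle inheritance. Serializing the Popen constructors with a lock avoids the overlap:
import threading
from time import sleep
from subprocess import Popen, PIPE, STDOUT

# Hypothetical workaround: make sure no two Popen constructors overlap,
# so one child's inheritable pipe handles cannot leak into the other.
popen_lock = threading.Lock()

def run():
    print('thread')
    with popen_lock:
        p2 = Popen('git', stdin=PIPE, stdout=PIPE, stderr=PIPE)
    o, e = p2.communicate()
    print('thread.popen/communicate')

if __name__ == '__main__':
    threading.Thread(target=run, daemon=True).start()
    with popen_lock:
        p1 = Popen('cmd', stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    print('main.popen')
    sleep(2)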
I have a python script that does this:
p = subprocess.Popen("pythonscript.py", stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=False)
theStdin=request.input.encode('utf-8')
(outputhere,errorshere) = p.communicate(input=theStdin)
It works as expected: it waits for the subprocess to finish via p.communicate(). However, within pythonscript.py I want to "fire and forget" a "grandchild" process. I'm currently doing this by overriding the join method:
from multiprocessing import Process

class EverLastingProcess(Process):
    def join(self, *args, **kwargs):
        pass  # Overrides join so that it doesn't block. Otherwise the parent waits.

    def __del__(self):
        pass
And starting it like this:
p = EverLastingProcess(target=nameOfMyFunction, args=(arg1, etc,), daemon=False)
p.start()
This also works fine when I just run pythonscript.py in a bash terminal or bash script: control returns and a response comes back while the child process started by EverLastingProcess keeps going. However, when I run pythonscript.py via Popen as shown above, the timings suggest that Popen is waiting on the grandchild to finish.
How can I make it so that the Popen only waits on the child process, and not any grandchild processes?
The solution in the other answer here (overriding join, with the shell=True addition) stopped working when we upgraded our Python recently.
There are many references on the internet about the pieces and parts of this, but it took me some doing to come up with a useful solution to the entire problem.
The following solution has been tested in Python 3.9.5 and 3.9.7.
Problem Synopsis
The names of the scripts match those in the code example below.
A top-level program (grandparent.py):
Uses subprocess.run or subprocess.Popen to call a program (parent.py)
Checks return value from parent.py for sanity.
Collects stdout and stderr from the main process 'parent.py'.
Does not want to wait around for the grandchild to complete.
The called program (parent.py)
Might do some stuff first.
Spawns a very long process (the grandchild - "longProcess" in the code below).
Might do a little more work.
Returns its results and exits while the grandchild (longProcess) continues doing what it does.
Solution Synopsis
The important part isn't so much what happens with subprocess. Instead, the method for creating the grandchild/longProcess is the critical part. It is necessary to ensure that the grandchild is truly emancipated from parent.py.
Subprocess only needs to be used in a way that captures output.
The longProcess (grandchild) needs the following to happen:
It should be started using multiprocessing.
It needs multiprocessing's 'daemon' set to False.
It should also be invoked using the double-fork procedure.
In the double-fork, extra work needs to be done to ensure that the process is truly separate from parent.py. Specifically:
Move the execution away from the environment of parent.py.
Use file handling to ensure that the grandchild no longer uses the file handles (stdin, stdout, stderr) inherited from parent.py.
Example Code
grandparent.py - calls parent.py using subprocess.run()
#!/usr/bin/env python3
import subprocess
p = subprocess.run(["/usr/bin/python3", "/path/to/parent.py"], capture_output=True)
## Comment the following if you don't need reassurance
print("The return code is: " + str(p.returncode))
print("The standard out is: ")
print(p.stdout)
print("The standard error is: ")
print(p.stderr)
parent.py - starts the longProcess/grandchild and exits, leaving the grandchild running. After 10 seconds, the grandchild will write timing info to /tmp/timelog.
#!/usr/bin/env python3
import time
def longProcess():
    time.sleep(10)
    fo = open("/tmp/timelog", "w")
    fo.write("I slept! The time now is: " + time.asctime(time.localtime()) + "\n")
    fo.close()
import os, sys

def spawnDaemon(func):
    # do the UNIX double-fork magic, see Stevens' "Advanced
    # Programming in the UNIX Environment" for details (ISBN 0201563177)
    try:
        pid = os.fork()
        if pid > 0:  # parent process
            return
    except OSError as e:
        print("fork #1 failed. See next.")
        print(e)
        sys.exit(1)

    # Decouple from the parent environment.
    os.chdir("/")
    os.setsid()
    os.umask(0)

    # do second fork
    try:
        pid = os.fork()
        if pid > 0:
            # exit from second parent
            sys.exit(0)
    except OSError as e:
        print("fork #2 failed. See next.")
        print(e)
        sys.exit(1)

    # Redirect standard file descriptors.
    # Here, they are reassigned to /dev/null, but they could go elsewhere.
    sys.stdout.flush()
    sys.stderr.flush()
    si = open('/dev/null', 'r')
    so = open('/dev/null', 'a+')
    se = open('/dev/null', 'a+')
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())

    # Run your daemon
    func()

    # Ensure that the daemon exits when complete
    os._exit(os.EX_OK)
import multiprocessing
daemonicGrandchild = multiprocessing.Process(target=spawnDaemon, args=(longProcess,))
daemonicGrandchild.daemon = False
daemonicGrandchild.start()
print("have started the daemon") # This will get captured as stdout by grandparent.py
References
The code above was mainly inspired by the following two resources.
This reference is succinct about the use of the double-fork but does not include the file handling we need in this situation.
This reference contains the needed file handling, but does many other things that we do not need.
Edit: the solution below stopped working after a Python upgrade; see the accepted answer from Lachele above.
Working answer from a colleague: change to shell=True, like this:
p = subprocess.Popen("pythonscript.py", stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=True)
I've tested this, and the grandchild processes stay alive after the child process returns, without waiting for them to finish.
I have two files:
main.py
import subprocess
import shlex
def main():
    command = 'python test_output.py'
    logfile = open('output', 'w')
    proc = subprocess.Popen(shlex.split(command), stdout=logfile)

if __name__ == "__main__":
    main()
and test_output.py
from time import sleep
import os
for i in range(0, 30):
    print("Slept for => ", i+1, "s")
    sleep(1)

os.system("notify-send completed -t 1500")
The output of the process is written to the logfile once the child process has completed. Is there any way to:
Start the child process from main and exit (like it does now).
Keep the child process running in the background.
As the child process produces output, write it immediately to the logfile. (Don't wait for the child process to finish, as happens now.)
There are other questions (like this one) where a solution is given for reading line by line, but they make main.py wait. Is it possible to do everything in the background, without keeping main.py waiting?
Both the buffer of the file handle and that of the subprocess can be set to 'line buffering', where a newline character causes each object's buffer to be flushed. This is done by setting the buffering parameter to 1; see the open() and subprocess documentation.
You also need to make sure that the child process does not buffer by itself. Since the child is a Python script as well, you can handle this in its code, e.g. with flush=True on the print call:
print(this_and_that, flush=True)
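A minimal sketch of the main.py side under those assumptions (file names taken from the question; -u is my addition, in case the child's prints cannot be changed):
import subprocess

# Sketch: line-buffered log file plus an unbuffered child process.
logfile = open('output', 'w', buffering=1)  # 1 = line buffering (text mode)
proc = subprocess.Popen(['python', '-u', 'test_output.py'], stdout=logfile)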
I'm trying to port a shell script to the much more readable python version. The original shell script starts several processes (utilities, monitors, etc.) in the background with "&". How can I achieve the same effect in python? I'd like these processes not to die when the python scripts complete. I am sure it's related to the concept of a daemon somehow, but I couldn't find how to do this easily.
While jkp's solution works, the newer way of doing things (and the way the documentation recommends) is to use the subprocess module. For simple commands it's equivalent, but it offers more options if you want to do something complicated.
Example for your case:
import subprocess
subprocess.Popen(["rm","-r","some.file"])
This will run rm -r some.file in the background. Note that calling .communicate() on the object returned from Popen will block until it completes, so don't do that if you want it to run in the background:
import subprocess
ls_output = subprocess.Popen(["sleep", "30"])
ls_output.communicate() # Will block for 30 seconds
See the documentation here.
Also, a point of clarification: "Background" as you use it here is purely a shell concept; technically, what you mean is that you want to spawn a process without blocking while you wait for it to complete. However, I've used "background" here to refer to shell-background-like behavior.
Note: This answer is less current than it was when posted in 2009. Using the subprocess module shown in other answers is now recommended in the docs:
(Note that the subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using these functions.)
If you want your process to start in the background you can either use system() and call it in the same way your shell script did, or you can spawn it:
import os
os.spawnl(os.P_DETACH, 'some_long_running_command', 'some_long_running_command')
(Note that spawnl needs the command name again as its first argument. Alternatively, you may try the more portable os.P_NOWAIT flag; os.P_DETACH is Windows-only.)
See the documentation here.
You probably want the answer to "How to call an external command in Python".
The simplest approach is to use the os.system function, e.g.:
import os
os.system("some_command &")
Basically, whatever you pass to the system function will be executed the same as if you'd passed it to the shell in a script.
I found this here:
On Windows (Win XP), the parent process will not finish until longtask.py has finished its work. That is not what you want in a CGI script. The problem is not specific to Python; the PHP community has the same problems.
The solution is to pass the DETACHED_PROCESS process creation flag to the underlying CreateProcess function in the Win API. If you happen to have pywin32 installed you can import the flag from the win32process module; otherwise you should define it yourself:
import subprocess
import sys

DETACHED_PROCESS = 0x00000008
pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=DETACHED_PROCESS).pid
Use subprocess.Popen() with the close_fds=True parameter, which will allow the spawned subprocess to be detached from the Python process itself and continue running even after Python exits.
https://gist.github.com/yinjimmy/d6ad0742d03d54518e9f
import os, time, sys, subprocess

if len(sys.argv) == 2:
    time.sleep(5)
    print 'track end'
    if sys.platform == 'darwin':
        subprocess.Popen(['say', 'hello'])
else:
    print 'main begin'
    subprocess.Popen(['python', os.path.realpath(__file__), '0'], close_fds=True)
    print 'main end'
Capture output and run in the background at the same time with threading
As mentioned in this answer, if you capture the output with stdout= and then try to read(), the process blocks.
However, there are cases where you need this. For example, I wanted to launch two processes that talk over a port between them, and save their stdout both to a log file and to stdout.
The threading module allows us to do that.
First, have a look at how to do the output redirection part alone in this question: Python Popen: Write to stdout AND log file simultaneously
Then:
main.py
#!/usr/bin/env python3
import os
import subprocess
import sys
import threading
def output_reader(proc, file):
    while True:
        byte = proc.stdout.read(1)
        if byte:
            sys.stdout.buffer.write(byte)
            sys.stdout.flush()
            file.buffer.write(byte)
        else:
            break
with subprocess.Popen(['./sleep.py', '0'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc1, \
     subprocess.Popen(['./sleep.py', '10'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc2, \
     open('log1.log', 'w') as file1, \
     open('log2.log', 'w') as file2:
    t1 = threading.Thread(target=output_reader, args=(proc1, file1))
    t2 = threading.Thread(target=output_reader, args=(proc2, file2))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
sleep.py
#!/usr/bin/env python3
import sys
import time
for i in range(4):
    print(i + int(sys.argv[1]))
    sys.stdout.flush()
    time.sleep(0.5)
After running:
./main.py
stdout is updated every 0.5 seconds, two lines at a time, to contain:
0
10
1
11
2
12
3
13
and each log file contains the respective log for a given process.
Inspired by: https://eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/
Tested on Ubuntu 18.04, Python 3.6.7.
You probably want to start investigating the os module for forking a separate process (by opening an interactive session and issuing help(os)). The relevant functions are fork and any of the exec ones. To give you an idea of how to start, put something like this in a function that performs the fork (the function needs to take a list or tuple 'args' as an argument, containing the program's name and its parameters; you may also want to define stdin, stdout and stderr for the new process):
try:
    pid = os.fork()
except OSError as e:
    ## some debug output
    sys.exit(1)
if pid == 0:
    ## eventually use os.putenv(..) to set environment variables
    ## os.execv uses args[0] as the program's own name; real arguments start at args[1]
    os.execv(args[0], args)
You can use
import os
pid = os.fork()
if pid == 0:
    # continue with other code in the child ...
This will fork the Python process and let the child continue running in the background.
I haven't tried this yet, but using .pyw files instead of .py files should help. A .pyw file doesn't get a console, so in theory it should not appear and should work like a background process.
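To launch one from another script, a rough sketch (pythonw is the console-less interpreter that .pyw files are associated with on Windows; the script name is a placeholder):
import subprocess

# Sketch: pythonw runs the script without creating a console window.
# 'background_task.pyw' is a placeholder name.
subprocess.Popen(['pythonw', 'background_task.pyw'])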
If I run the following function "run" with, for example, "ls -Rlah /", I get output immediately via the print statement, as expected:
import subprocess32 as subprocess

def run(command):
    process = subprocess.Popen(command,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT)
    try:
        while process.poll() is None:
            print process.stdout.readline()
    finally:
        # Handle the scenario if the parent
        # process has terminated before this subprocess
        if process.poll():
            process.kill()
However, if I use the Python example program below, it seems to be stuck on either process.poll() or process.stdout.readline() until the program has finished. I think it is stdout.readline(), since if I increase the number of strings to output from 10 to 10000 (in the example program), or add a sys.stdout.flush() just after every print, the print in the run function does get executed.
How can I make the output from a subprocess more real-timeish?
Note: I have just discovered that the Python example program does not perform a sys.stdout.flush() when it outputs; is there a way for the caller of subprocess to enforce this somehow?
Example program which outputs 10 strings every 5 seconds.
#!/bin/env python
import time
if __name__ == "__main__":
i = 0
start = time.time()
while True:
if time.time() - start >= 5:
for _ in range(10):
print "hello world" + str(i)
start = time.time()
i += 1
if i >= 3:
break
On most systems, command line programs line buffer or block buffer depending on whether stdout is a terminal or a pipe. On unixy systems, the parent process can create a pseudo-terminal to get terminal-like behavior even though the child isn't really run from a terminal. You can use the pty module to create a pseudo-terminal or use the pexpect module which eases access to interactive programs.
As mentioned in the comments, using poll to read lines can result in lost data. One example is data left in the stdout pipe when the process terminates. Reading a pty is a bit different from reading pipes, and you'll find you need to catch an IOError when the child closes to get it all to work properly, as in the example below.
try:
    import subprocess32 as subprocess
except ImportError:
    import subprocess
import pty
import sys
import os
import time
import errno

print("running %s" % sys.argv[1])

m, s = (os.fdopen(pipe) for pipe in pty.openpty())

process = subprocess.Popen([sys.argv[1]],
                           stdin=s,
                           stdout=s,
                           stderr=subprocess.STDOUT)
s.close()

try:
    graceful = False
    while True:
        line = m.readline()
        print line.rstrip()
except IOError as e:
    if e.errno != errno.EIO:
        raise
    graceful = True
finally:
    # Handle the scenario if the parent
    # process has terminated before this subprocess
    m.close()
    if not graceful:
        process.kill()
    process.wait()
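For comparison, a rough sketch of the pexpect route mentioned above (pexpect is a third-party package; this is an untested sketch, and 'python script.py' is a placeholder command):
import pexpect  # third-party: pip install pexpect

# Sketch: pexpect allocates the pseudo-terminal for us, so the child
# behaves as if attached to a terminal and line-buffers its output.
child = pexpect.spawn('python script.py', encoding='utf-8')
for line in child:
    print(line.rstrip())
child.close()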
You should flush standard output in your script:
print "hello world" + str(i)
sys.stdout.flush()
When standard output is a terminal, stdout is line-buffered. But when it is not, stdout is block buffered and you need to flush it explicitly.
If you can't change the source of your script, you can use the -u option of Python (in the subprocess):
-u Force stdin, stdout and stderr to be totally unbuffered.
Your command should be: ['python', '-u', 'script.py']
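Putting that together, a minimal sketch of the caller side (script.py stands in for the unmodified example program above):
import subprocess

# Sketch: run the child unbuffered via -u and echo its lines as they arrive.
process = subprocess.Popen(['python', '-u', 'script.py'],
                           stdout=subprocess.PIPE,
                           stderr=subprocess.STDOUT)
for line in iter(process.stdout.readline, b''):
    print(line.rstrip().decode())
process.wait()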
In general, this kind of buffering happens in userspace. There are no generic ways to force an application to flush its buffers: some applications support command line options (like Python), others support signals, others do not support anything.
One solution might be to emulate a pseudo terminal, giving "hints" to the programs that they should operate in line-buffered mode. Still, this is not a solution that works in every case.
For things other than python you could try using unbuffer:
unbuffer disables the output buffering that occurs when program output is redirected from non-interactive programs. For example, suppose you are watching the output from a fifo by running it through od and then more.
od -c /tmp/fifo | more
You will not see anything until a full page of output has been produced.
You can disable this automatic buffering as follows:
unbuffer od -c /tmp/fifo | more
Normally, unbuffer does not read from stdin. This simplifies use of unbuffer in some situations. To use unbuffer in a pipeline, use the -p flag. Example:
process1 | unbuffer -p process2 | process3
So in your case:
run(["unbuffer",cmd])
There are some caveats listed in the docs but it is another option.