Reading/writing to a Popen() subprocess - python

I'm trying to talk to a child process using the python subprocess.Popen() call. In my real code, I'm implementing a type of IPC, so I want to write some data, read the response, write some more data, read the response, and so on. Because of this, I cannot use Popen.communicate(), which otherwise works well for the simple case.
This code shows my problem. It never even gets the first response; it hangs at the first "Reading result". Why? How can I make this work as I expect?
import subprocess
p = subprocess.Popen(["sed", 's/a/x/g'],
                     stdout = subprocess.PIPE,
                     stdin = subprocess.PIPE)
p.stdin.write("abc\n")
print "Reading result:"
print p.stdout.readline()
p.stdin.write("cat\n")
print "Reading result:"
print p.stdout.readline()

sed buffers its output and only emits data once enough has accumulated or the input stream is exhausted and closed.
Try this:
import subprocess
p = subprocess.Popen(["sed", 's/a/x/g'],
                     stdout = subprocess.PIPE,
                     stdin = subprocess.PIPE)
p.stdin.write("abc\n")
p.stdin.write("cat\n")
p.stdin.close()
print "Reading result 1:"
print p.stdout.readline()
print "Reading result 2:"
print p.stdout.readline()
Be aware that this cannot be done reliably with huge data, as writing to stdin blocks once the pipe buffer is full. The best way to do this is to use communicate().
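For the batch case, a minimal Python 3 sketch of the communicate() approach might look like this. The child is a small Python one-liner standing in for sed 's/a/x/g' (an assumption, so the example runs on machines without sed), and the name CHILD is illustrative only.

```python
import subprocess
import sys

# Stand-in for `sed 's/a/x/g'` (an assumption so the sketch runs without sed).
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().replace('a', 'x'))"]

p = subprocess.Popen(CHILD,
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     universal_newlines=True)  # text mode: str in, str out

# communicate() sends all the input, closes stdin, and reads stdout to EOF,
# so neither side can deadlock on a full pipe buffer.
out, _ = p.communicate("abc\ncat\n")
results = out.splitlines()
print(results)
```

This only fits when all the input is known up front; it does not allow the write/read/write/read pattern from the question.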

I would use Popen().communicate() if you can, as it does a lot of nice things for you. But if you need to use Popen() exactly as you described, you'll need to tell sed to flush its output after every line. On BSD/macOS sed that is the -l option; GNU sed uses -u (--unbuffered) instead:
p = subprocess.Popen(['sed', '-l', 's/a/x/g'],
                     stdout=subprocess.PIPE,
                     stdin=subprocess.PIPE)
and your code should work fine.
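The round-trip pattern from the question can be sketched in Python 3 as follows. This is an illustration under assumptions, not the original poster's code: the child is a hypothetical Python script that flushes after every line (playing the role of a line-buffered sed), and the helper name ask is made up for this example. The point is that both sides must flush: the parent after writing a request, the child after writing a reply.

```python
import subprocess
import sys

# Hypothetical line-oriented child: replaces 'a' with 'x' and flushes
# after every line, like a line-buffered sed would.
CHILD_SRC = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.replace('a', 'x'))\n"
    "    sys.stdout.flush()\n"
)

def ask(proc, line):
    """Send one line to the child and wait for exactly one line back."""
    proc.stdin.write(line)
    proc.stdin.flush()  # push the request out of the parent's own buffer
    return proc.stdout.readline()

proc = subprocess.Popen([sys.executable, "-c", CHILD_SRC],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        universal_newlines=True)
first = ask(proc, "abc\n")
second = ask(proc, "cat\n")

# Closing stdin signals EOF so the child can exit cleanly.
proc.stdin.close()
proc.stdout.close()
proc.wait()
print(first, second, sep="", end="")
```

This one-request, one-reply protocol works only if the child answers every input line with exactly one output line; otherwise readline() can block forever.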

Related

Python Subprocess.pOpen runs slow when memory is occ

I wrote a python script that uses subprocess.Popen to call the command-line tool look to do a binary search on a file.
For example:
p = subprocess.Popen('look -b "abc" testfile.txt', executable='/bin/bash', stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
out, err = p.communicate()
result = out.decode()
print(result)
This snippet calls the system command look to binary-search the file testfile.txt for the string abc.
It works fine if this snippet is all you run.
However, when the Python process is holding some large files in memory, it becomes significantly slower.
For example, if you do:
a = read_a_large_file() #Like GBs of data
p = subprocess.Popen('look -b "abc" testfile.txt', executable='/bin/bash', stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
out, err = p.communicate()
result = out.decode()
print(result)
a[0]
The subprocess part takes a very long time to execute, even though the look command itself is very fast when run from the shell, since it performs a binary search on sorted files.
Any help will be appreciated! Thanks!

How to get Popen stdout in realtime NOT line by line in Python?

In general this question has a lot of answers, but they are all limited to line-by-line reading. For example, this code:
def execute(cmd):
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    for stdout_line in iter(popen.stdout.readline, ""):
        yield stdout_line
    popen.stdout.close()
    return_code = popen.wait()
    if return_code:
        raise subprocess.CalledProcessError(return_code, cmd)
But some output lines look like this (a dot is added roughly every 10 s):
............................
They show the progress of a running task. I don't want to hold the output back until the line is finished and only then print the whole line of dots.
So I need to yield a string when either:
I can read a block of 1024 characters of output (or just the whole output), or
there is ANY output and more than 1 s has passed (whether or not the line is finished).
But I don't know how to do this.
P.S. This may be a duplicate, but I couldn't find one.
If you're on Linux you can use stdbuf -o0 in front of the command you're executing to make its stdout become unbuffered (i.e. instantaneous).
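On the reading side, one possible approach is to bypass readline() and call os.read() on the pipe's file descriptor: it returns as soon as any bytes are available, up to the requested size, so partial lines come through immediately. The sketch below assumes this fits the use case; the names stream_chunks and DOTS_CMD are made up for illustration, and the child is a hypothetical script that prints dots without a newline. The child must still flush its own output (hence stdbuf, or -u for a Python child).

```python
import os
import subprocess
import sys

def stream_chunks(cmd, chunk_size=1024):
    """Yield raw output chunks as soon as the child produces any bytes."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    try:
        while True:
            # os.read blocks until at least 1 byte is available (or EOF),
            # then returns at most chunk_size bytes; it never waits for '\n'.
            chunk = os.read(fd, chunk_size)
            if not chunk:  # empty bytes means the child closed stdout
                break
            yield chunk
    finally:
        proc.stdout.close()
        return_code = proc.wait()
    if return_code:
        raise subprocess.CalledProcessError(return_code, cmd)

# Hypothetical child: prints three dots, one every 0.1 s, with no newline.
DOTS_CMD = [sys.executable, "-u", "-c",
            "import sys, time\n"
            "for _ in range(3):\n"
            "    sys.stdout.write('.')\n"
            "    sys.stdout.flush()\n"
            "    time.sleep(0.1)\n"]

if __name__ == "__main__":
    for chunk in stream_chunks(DOTS_CMD):
        sys.stdout.write(chunk.decode())
        sys.stdout.flush()
```

The chunks are bytes; decoding them piecewise can split a multi-byte character, so an incremental decoder may be needed for non-ASCII output.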

Python reading output from subprocess

I have a program which opens a subprocess and communicates with it by writing to its stdin and reading from its stdout.
proc = subprocess.Popen(['foo'],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        stdin=subprocess.PIPE)
proc.stdin.write('stuff\n')
proc.stdin.flush()
The problem is that when reading, it always blocks if I call proc.stdout.read(), and when I try to read line by line using the following:
output = str()
while proc.stdout in select.select([proc.stdout], [], [])[0]:
    output += proc.stdout.readline()
it still blocks because select.select returns proc.stdout even after all the output has been read already. What can I do?
Note that I am not using proc.communicate because I would like to communicate with the process multiple times.
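One way this kind of loop is sometimes written is sketched below, under stated assumptions: POSIX only (select on pipes does not work on Windows), raw os.read() instead of the buffered readline(), a timeout on select so the loop can end when the child goes quiet, and an explicit check for the empty read that signals EOF (select reports a pipe as readable at EOF, which is why the original loop never stops). The names read_available and ECHO_SRC are hypothetical.

```python
import os
import select
import subprocess
import sys

def read_available(proc, timeout=0.5, chunk_size=4096):
    """Collect output until the child is quiet for `timeout` seconds or hits EOF."""
    fd = proc.stdout.fileno()
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            break  # nothing arrived within the timeout: treat as "done for now"
        chunk = os.read(fd, chunk_size)
        if not chunk:
            break  # readable but empty means EOF: the child closed stdout
        chunks.append(chunk)
    return b"".join(chunks)

# Hypothetical child that answers each input line and flushes immediately.
ECHO_SRC = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write('got ' + line)\n"
    "    sys.stdout.flush()\n"
)

if __name__ == "__main__":
    proc = subprocess.Popen([sys.executable, "-c", ECHO_SRC],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    proc.stdin.write(b"stuff\n")
    proc.stdin.flush()
    print(read_available(proc, timeout=2.0))
    proc.stdin.close()
    proc.wait()
```

A timeout is only a heuristic for "the reply is complete"; if the child's protocol has an explicit end-of-reply marker, reading until that marker is more reliable.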

Getting output of a process at runtime

I am using a python script to run a process using subprocess.Popen and simultaneously store the output in a text file as well as print it on the console. This is my code:
result = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
for line in result.stdout.readlines(): #read and store result in log file
    openfile.write("%s\n" %line)
    print("%s" %line)
The above code works, but it first waits for the process to complete and stores the whole output in result; only then does the for loop store and print it.
But I want the output at runtime (my process can take hours to complete, and I don't get any output for all those hours).
Is there another function that gives me the output dynamically, so that as soon as the process emits its first line, it gets printed?
The problem here is that .readlines() gets the entire output before returning, as it constructs a full list. Just iterate directly:
for line in result.stdout:
    print(line)
.readlines() returns a list of all the lines the process will return while open, i.e., it doesn't return anything until all output from the subprocess is received. To read line by line in "real time":
import sys
from subprocess import Popen, PIPE
proc = Popen(cmd, shell=True, bufsize=1, stdout=PIPE)
for line in proc.stdout:
    openfile.write(line)
    sys.stdout.buffer.write(line)
    sys.stdout.buffer.flush()
proc.stdout.close()
proc.wait()
Note: if the subprocess uses block buffering when it is run in non-interactive mode, you might need the pexpect or pty modules, or the stdbuf, unbuffer, or script commands.
Note: on Python 2, you might also need to use iter(), to get "real time" output:
for line in iter(proc.stdout.readline, ""):
    openfile.write(line)
    print line,
You can iterate over the lines one by one by using readline on the pipe:
while True:
    line = result.stdout.readline()
    if not line:
        break
    print line.strip()
The lines contain a trailing \n which I stripped for printing.
When the process terminates, readline returns an empty string, so you know when to stop.
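Putting the pieces from these answers together, a Python 3 sketch that logs to a file and echoes to the console line by line might look like this. The function name run_and_log and its log_path parameter are made up for illustration; text mode with bufsize=1 is one reasonable choice, not the only one.

```python
import subprocess
import sys

def run_and_log(cmd, log_path):
    """Run cmd, copying each output line to log_path and the console as it arrives."""
    with open(log_path, "w") as log, subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into the same stream
        bufsize=1,                  # line-buffered on the parent side (text mode)
        universal_newlines=True,
    ) as proc:
        for line in proc.stdout:    # yields each line as soon as it is complete
            log.write(line)
            log.flush()             # keep the log current during long runs
            sys.stdout.write(line)
            sys.stdout.flush()
    # Leaving the with-block closes the pipe and waits for the child.
    return proc.returncode
```

A call such as run_and_log(["ping", "-c", "3", "localhost"], "out.log") would be one way to try it; the lines still arrive only as fast as the child itself flushes them.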

Is there a way to create a new P4 changelist using python subprocess?

This question from years ago does what I need:
How do I check out a file from perforce in python?
but is there a way to do this using the subprocess module? (which I understand is the preferred way)
I've looked through stackoverflow, the python docs, as well as many google searches trying to find a way to use the stdin to send the required input to the p4 process, but I've not been successful. I've been able to find plenty on capturing the output of a subprocess command, but have not been able to grok the input commands.
I'm pretty new to python in general, so I am likely missing something obvious, but I don't know what I don't know in this case.
This is the code I've come up with so far:
descr = "this is a test description"
tempIn = tempfile.TemporaryFile()
tempOut = tempfile.TemporaryFile()
p = subprocess.Popen(["p4","change","-i"],stdout=tempOut, stdin=tempIn)
tempIn.write("change: New\n")
tempIn.write("description: " + descr)
tempIn.close()
(out, err) = p.communicate()
print out
As I mentioned in my comment, use the Perforce Python API.
Regarding your code:
tempfile.TemporaryFile() isn't usually appropriate for creating a file and then passing the contents off to something else. The temporary file is automatically deleted as soon as the file is closed. Often you need to close the file for writing before you can re-open it for reading, creating a catch-22 situation. (You can get around this with tempfile.NamedTemporaryFile(delete=False), but that's still too round-about for this situation.)
To use communicate(), you need to pass subprocess.PIPE:
descr = "this is a test description"
changespec = "change: New\ndescription: " + descr
p = subprocess.Popen(["p4","change","-i"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
(out, err) = p.communicate(changespec)
print out
If the output is bounded in size, use @Jon-Eric's answer; otherwise, replace p.communicate() with rc = p.wait(); tempOut.seek(0); chunk = tempOut.read(chunk_size) ....
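In Python 3 the same stdin-feeding idea can be sketched with subprocess.run() and its input argument. The helper name feed_stdin is hypothetical, and the changespec string in the usage note is taken from the code above as-is, without any claim that Perforce accepts it in that form.

```python
import subprocess
import sys

def feed_stdin(cmd, text):
    """Run cmd, send text on its stdin, and return (returncode, stdout, stderr)."""
    result = subprocess.run(
        cmd,
        input=text,                  # written to the child's stdin, then stdin is closed
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,     # str in, str out
    )
    return result.returncode, result.stdout, result.stderr

if __name__ == "__main__":
    # Demo with a stand-in child that upper-cases its input.
    demo = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(feed_stdin(demo, "change: New\n"))
```

With Perforce installed, the call from the answer would become feed_stdin(["p4", "change", "-i"], changespec), and checking the returned code and stderr shows whether p4 accepted the spec.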
