Very large input and piping using subprocess.Popen

Very large input and piping using subprocess.Popen - python

I have pretty simple problem. I have a large file that goes through three steps, a decoding step using an external program, some processing in python, and then recoding using another external program. I have been using subprocess.Popen() to try to do this in python rather than forming unix pipes. However, all the data are buffered to memory. Is there a pythonic way of doing this task, or am I best dropping back to a simple python script that reads from stdin and writes to stdout with unix pipes on either side?
import os, sys, subprocess
def main(infile,reflist):
print infile,reflist
samtoolsin = subprocess.Popen(["samtools","view",infile],
stdout=subprocess.PIPE,bufsize=1)
samtoolsout = subprocess.Popen(["samtools","import",reflist,"-",
infile+".tmp"],stdin=subprocess.PIPE,bufsize=1)
for line in samtoolsin.stdout.read():
if(line.startswith("#")):
samtoolsout.stdin.write(line)
else:
linesplit = line.split("\t")
if(linesplit[10]=="*"):
linesplit[9]="*"
samtoolsout.stdin.write("\t".join(linesplit))

Popen has a bufsize parameter that will limit the size of the buffer in memory. If you don't want the files in memory at all, you can pass file objects as the stdin and stdout parameters. From the subprocess docs:
bufsize, if given, has the same meaning as the corresponding argument to the built-in open() function: 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size. A negative bufsize means to use the system default, which usually means fully buffered. The default value for bufsize is 0 (unbuffered).

Try to make this small change, see if the efficiency is better.
for line in samtoolsin.stdout:
if(line.startswith("#")):
samtoolsout.stdin.write(line)
else:
linesplit = line.split("\t")
if(linesplit[10]=="*"):
linesplit[9]="*"
samtoolsout.stdin.write("\t".join(linesplit))

However, all the data are buffered to memory ...
Are you using subprocess.Popen.communicate()? By design, this function will wait for the process to finish, all the while accumulating the data in a buffer, and then return it to you. As you've pointed out, this is problematic if dealing with very large files.
If you want to process the data while it is generated, you will need to write a loop using the poll() and .stdout.read() methods, then write that output to another socket/file/etc.
Do be sure to notice the warnings in the documentation against doing this as it is easy to result in a deadlock (the parent process waits for the child process to generate data, who is in turn waiting for the parent process to empty the pipe buffer).

I was using the .read() method on the stdout stream. Instead, I simply needed to read directly from the stream in the for loop above. The corrected code does what I expected.
#!/usr/bin/env python
import os
import sys
import subprocess
def main(infile,reflist):
print infile,reflist
samtoolsin = subprocess.Popen(["samtools","view",infile],
stdout=subprocess.PIPE,bufsize=1)
samtoolsout = subprocess.Popen(["samtools","import",reflist,"-",
infile+".tmp"],stdin=subprocess.PIPE,bufsize=1)
for line in samtoolsin.stdout:
if(line.startswith("#")):
samtoolsout.stdin.write(line)
else:
linesplit = line.split("\t")
if(linesplit[10]=="*"):
linesplit[9]="*"
samtoolsout.stdin.write("\t".join(linesplit))

Trying to do some basic shell piping with very large input in python:
svnadmin load /var/repo < r0-100.dump
I found the simplest way to get this working even with large (2-5GB) files was:
subprocess.check_output('svnadmin load %s < %s' % (repo, fname), shell=True)
I like this method because it's simple and you can do standard shell redirection.
I tried going the Popen route to run a redirect:
cmd = 'svnadmin load %s' % repo
p = Popen(cmd, stdin=PIPE, stdout=PIPE, shell=True)
with open(fname) as inline:
for line in inline:
p.communicate(input=line)
But that broke with large files. Using:
p.stdin.write()
Also broke with very large files.

Related

Reading output from child process using python

The Context
I am using the subprocess module to start a process from python. I want to be able to access the output (stdout, stderr) as soon as it is written/buffered.
The solution must support Windows 7. I require a solution for Unix systems too but I suspect the Windows case is more difficult to solve.
The solution should support Python 2.6. I am currently restricted to Python 2.6 but solutions using later versions of Python are still appreciated.
The solution should not use third party libraries. Ideally I would love a solution using the standard library but I am open to suggestions.
The solution must work for just about any process. Assume there is no control over the process being executed.
The Child Process
For example, imagine I want to run a python file called counter.py via a subprocess. The contents of counter.py is as follows:
import sys
for index in range(10):
# Write data to standard out.
sys.stdout.write(str(index))
# Push buffered data to disk.
sys.stdout.flush()
The Parent Process
The parent process responsible for executing the counter.py example is as follows:
import subprocess
command = ['python', 'counter.py']
process = subprocess.Popen(
cmd,
bufsize=1,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
The Issue
Using the counter.py example I can access the data before the process has completed. This is great! This is exactly what I want. However, removing the sys.stdout.flush() call prevents the data from being accessed at the time I want it. This is bad! This is exactly what I don't want. My understanding is that the flush() call forces the data to be written to disk and before the data is written to disk it exists only in a buffer. Remember I want to be able to run just about any process. I do not expect the process to perform this kind of flushing but I still expect the data to be available in real time (or close to it). Is there a way to achieve this?
A quick note about the parent process. You may notice I am using bufsize=0 for line buffering. I was hoping this would cause a flush to disk for every line but it doesn't seem to work that way. How does this argument work?
You will also notice I am using subprocess.PIPE. This is because it appears to be the only value which produces IO objects between the parent and child processes. I have come to this conclusion by looking at the Popen._get_handles method in the subprocess module (I'm referring to the Windows definition here). There are two important variables, c2pread and c2pwrite which are set based on the stdout value passed to the Popen constructor. For instance, if stdout is not set, the c2pread variable is not set. This is also the case when using file descriptors and file-like objects. I don't really know whether this is significant or not but my gut instinct tells me I would want both read and write IO objects for what I am trying to achieve - this is why I chose subprocess.PIPE. I would be very grateful if someone could explain this in more detail. Likewise, if there is a compelling reason to use something other than subprocess.PIPE I am all ears.
Method For Retrieving Data From The Child Process
import time
import subprocess
import threading
import Queue
class StreamReader(threading.Thread):
"""
Threaded object used for reading process output stream (stdout, stderr).
"""
def __init__(self, stream, queue, *args, **kwargs):
super(StreamReader, self).__init__(*args, **kwargs)
self._stream = stream
self._queue = queue
# Event used to terminate thread. This way we will have a chance to
# tie up loose ends.
self._stop = threading.Event()
def stop(self):
"""
Stop thread. Call this function to terminate the thread.
"""
self._stop.set()
def stopped(self):
"""
Check whether the thread has been terminated.
"""
return self._stop.isSet()
def run(self):
while True:
# Flush buffered data (not sure this actually works?)
self._stream.flush()
# Read available data.
for line in iter(self._stream.readline, b''):
self._queue.put(line)
# Breather.
time.sleep(0.25)
# Check whether thread has been terminated.
if self.stopped():
break
cmd = ['python', 'counter.py']
process = subprocess.Popen(
cmd,
bufsize=1,
stdout=subprocess.PIPE,
)
stdout_queue = Queue.Queue()
stdout_reader = StreamReader(process.stdout, stdout_queue)
stdout_reader.daemon = True
stdout_reader.start()
# Read standard out of the child process whilst it is active.
while True:
# Attempt to read available data.
try:
line = stdout_queue.get(timeout=0.1)
print '%s' % line
# If data was not read within time out period. Continue.
except Queue.Empty:
# No data currently available.
pass
# Check whether child process is still active.
if process.poll() != None:
# Process is no longer active.
break
# Process is no longer active. Nothing more to read. Stop reader thread.
stdout_reader.stop()
Here I am performing the logic which reads standard out from the child process in a thread. This allows for the scenario in which the read is blocking until data is available. Instead of waiting for some potentially long period of time, we check whether there is available data, to be read within a time out period, and continue looping if there is not.
I have also tried another approach using a kind of non-blocking read. This approach uses the ctypes module to access Windows system calls. Please note that I don't fully understand what I am doing here - I have simply tried to make sense of some example code I have seen in other posts. In any case, the following snippet doesn't solve the buffering issue. My understanding is that it's just another way to combat a potentially long read time.
import os
import subprocess
import ctypes
import ctypes.wintypes
import msvcrt
cmd = ['python', 'counter.py']
process = subprocess.Popen(
cmd,
bufsize=1,
stdout=subprocess.PIPE,
)
def read_output_non_blocking(stream):
data = ''
available_bytes = 0
c_read = ctypes.c_ulong()
c_available = ctypes.c_ulong()
c_message = ctypes.c_ulong()
fileno = stream.fileno()
handle = msvcrt.get_osfhandle(fileno)
# Read available data.
buffer_ = None
bytes_ = 0
status = ctypes.windll.kernel32.PeekNamedPipe(
handle,
buffer_,
bytes_,
ctypes.byref(c_read),
ctypes.byref(c_available),
ctypes.byref(c_message),
)
if status:
available_bytes = int(c_available.value)
if available_bytes > 0:
data = os.read(fileno, available_bytes)
print data
return data
while True:
# Read standard out for child process.
stdout = read_output_non_blocking(process.stdout)
print stdout
# Check whether child process is still active.
if process.poll() != None:
# Process is no longer active.
break
Comments are much appreciated.
Cheers

At issue here is buffering by the child process. Your subprocess code already works as well as it could, but if you have a child process that buffers its output then there is nothing that subprocess pipes can do about this.
I cannot stress this enough: the buffering delays you see are the responsibility of the child process, and how it handles buffering has nothing to do with the subprocess module.
You already discovered this; this is why adding sys.stdout.flush() in the child process makes the data show up sooner; the child process uses buffered I/O (a memory cache to collect written data) before sending it down the sys.stdout pipe 1.
Python automatically uses line-buffering when sys.stdout is connected to a terminal; the buffer flushes whenever a newline is written. When using pipes, sys.stdout is not connected to a terminal and a fixed-size buffer is used instead.
Now, the Python child process can be told to handle buffering differently; you can set an environment variable or use a command-line switch to alter how it uses buffering for sys.stdout (and sys.stderr and sys.stdin). From the Python command line documentation:
-u
Force stdin, stdout and stderr to be totally unbuffered. On systems where it matters, also put stdin, stdout and stderr in binary mode.
[...]
PYTHONUNBUFFERED
If this is set to a non-empty string it is equivalent to specifying the -u option.
If you are dealing with child processes that are not Python processes and you experience buffering issues with those, you'll need to look at the documentation of those processes to see if they can be switched to use unbuffered I/O, or be switched to more desirable buffering strategies.
One thing you could try is to use the script -c command to provide a pseudo-terminal to a child process. This is a POSIX tool, however, and is probably not available on Windows.
1. It should be noted that when flushing a pipe, no data is 'written to disk'; all data remains entirely in memory here. I/O buffers are just memory caches to get the best performance out of I/O by handling data in larger chunks. Only if you have a disk-based file object would fileobj.flush() cause it to push any buffers to the OS, which usually means that data is indeed written to disk.

expect has a command called 'unbuffer':
http://expect.sourceforge.net/example/unbuffer.man.html
that will disable buffering for any command

Sending strings between to Python Scripts using subprocess PIPEs

I want to open a Python script using subprocess in my main python program. I want these two programs to be able to chat with one another as they are both running so I can monitor the activity in the slave script, i.e. I need them to send strings between each other.
The main program will have a function similar to this that will communicate with and monitor the slave script:
Script 1
import subprocess
import pickle
import sys
import time
import os
def communicate(clock_speed, channel_number, frequency):
p = subprocess.Popen(['C:\\Python27\\pythonw','test.py'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
data = pickle.dumps([clock_speed, channel_number, frequency]).replace("\n", "\\()")
print data
p.stdin.write("Start\n")
print p.stdout.read()
p.stdin.write(data + "\n")
p.poll()
print p.stdout.readline()
print "return:" + p.stdout.readline()
#p.kill()
if __name__ == '__main__':
print "GO"
communicate(clock_speed = 400, channel_number = 0, frequency = 5*1e6)
The test.py script looks similar to this:
Script 2
import ctypes
import pickle
import time
import sys
start = raw_input("")
sys.stdout.write("Ready For Data")
data = raw_input("")
data = pickle.loads(data.replace("\\()", "\n"))
sys.stdout.write(str(data))
###BUNCH OF OTHER STUFF###
What I want these scripts to do is the following:
Script 1 to open Script 2 using Popen
Script 1 sends the string "Start\n"
Script 2 reads this string and sends the string "Ready For Data"
Script 1 reads this string and sends the pickled data to Script 2
Then whatever...
The main question is how to do parts 2-4. Then the rest of the communication between the two scripts should follow. As of now, I have only been able to read the strings from Script 2 after it has been terminated.
Any help is greatly appreciated.
UPDATE:
Script 1 must be run using 32-bit Python, while Script 2 must be run using 64-bit Python.

The problem with pipes is that if you call a read operation and there is nothing to read, your code is stalled until the other party writes something for you to read. Also if you write too much, your next write operation might block until the other party reads something out of the pipe and frees it.
There are "non-blocking calls" you can make, that will return an error in these cases instead of blocking, but your application will still need to deal with the errors sensibly.
In any case, you need to set up some kind of protocol. Think of HTTP, or any other protocol you know well: there are requests and responses, and while you are reading either of the two the protocol always tells you if there is something else you need to read or not. That way you can always make an informed decision on whether to wait for more data or not.
Here is an example that works. It works because there is the following protocol:
p1 sends a single line, ending with '\n';
p2 does the same;
p1 sends another line;
p2 does the same;
both are happy and exit.
In order to write a line to the pipe (on either side) and make sure it gets onto the pipe, I call write() and then flush().
In order to read a single line from the pipe (on either side) but not a single byte more, thus blocking my code until the line is ready and no longer than that, I use readline().
There are other calls you could make and other protocols, including ready-made ones, but the single-line protocol works well for simple things and for a demo like this.
p1.py:
import subprocess
p = subprocess.Popen(['python', 'p2.py'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p.stdin.write("Hello\n")
p.stdin.flush()
print 'got', p.stdout.readline().strip()
p.stdin.write("How are you?\n")
p.stdin.flush()
print 'got', p.stdout.readline().strip()
p2.py:
import sys
data = sys.stdin.readline()
sys.stdout.write("Hm.\n")
sys.stdout.flush()
data = sys.stdin.readline()
sys.stdout.write("Whatever.\n")
sys.stdout.flush()

I also had a problem similar to this, where there was no way to send general Python objects between different processes without running into the problem of knowing either when the other side hasn't sent an object or is closed. Also trying to use multiprocessing.Queue usually means that the process needs to have been started by the current process which is not always the case when two processes want to collaborate.
To combat this I use the picklepipe module, which defines a generic object serialization pipe interface as well as a pipe that uses the pickle protocol called the PicklePipe (also one that uses the marshal protocol called MarshalPipe). It can send more than just strings, it can send any pickleable object to it's peer.
The pipes are even selectable, meaning they can be used by the selectors module (or selectors2, selectors34) as file objects when a new object is ready to be received. This makes waiting for many different pipes to be ready very efficient.
Supports Python 2.7+ (and probably 2.6) and all major platforms. Can even send objects between two different versions of Python! Check out the project documentation or view the source on Github.
Disclosure: I am the author of picklepipe. I would love to hear your feedback. :)

Best way to fork multiple shell commands/processes in Python?

Most of the examples I've seen with os.fork and the subprocess/multiprocessing modules show how to fork a new instance of the calling python script or a chunk of python code. What would be the best way to spawn a set of arbitrary shell command concurrently?
I suppose, I could just use subprocess.call or one of the Popen commands and pipe the output to a file, which I believe will return immediately, at least to the caller. I know this is not that hard to do, I'm just trying to figure out the simplest, most Pythonic way to do it.
Thanks in advance

All calls to subprocess.Popen return immediately to the caller. It's the calls to wait and communicate which block. So all you need to do is spin up a number of processes using subprocess.Popen (set stdin to /dev/null for safety), and then one by one call communicate until they're all complete.
Naturally I'm assuming you're just trying to start a bunch of unrelated (i.e. not piped together) commands.

I like to use PTYs instead of pipes. For a bunch of processes where I only want to capture error messages I did this.
RNULL = open('/dev/null', 'r')
WNULL = open('/dev/null', 'w')
logfile = open("myprocess.log", "a", 1)
REALSTDERR = sys.stderr
sys.stderr = logfile
This next part was in a loop spawning about 30 processes.
sys.stderr = REALSTDERR
master, slave = pty.openpty()
self.subp = Popen(self.parsed, shell=False, stdin=RNULL, stdout=WNULL, stderr=slave)
sys.stderr = logfile
After this I had a select loop which collected any error messages and sent them to the single log file. Using PTYs meant that I never had to worry about partial lines getting mixed up because the line discipline provides simple framing.

There is no best for all possible circumstances. The best depends on the problem at hand.
Here's how to spawn a process and save its output to a file combining stdout/stderr:
import subprocess
import sys
def spawn(cmd, output_file):
on_posix = 'posix' in sys.builtin_module_names
return subprocess.Popen(cmd, close_fds=on_posix, bufsize=-1,
stdin=open(os.devnull,'rb'),
stdout=output_file,
stderr=subprocess.STDOUT)
To spawn multiple processes that can run in parallel with your script and each other:
processes, files = [], []
try:
for i, cmd in enumerate(commands):
files.append(open('out%d' % i, 'wb'))
processes.append(spawn(cmd, files[-1]))
finally:
for p in processes:
p.wait()
for f in files:
f.close()
Note: cmd is a list everywhere.

I suppose, I could just us subprocess.call or one of the Popen
commands and pipe the output to a file, which I believe will return
immediately, at least to the caller.
That's not a good way to do it if you want to process the data.
In this case, better do
sp = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE)
and then sp.communicate() or read directly from sp.stdout.read().
If the data shall be processed in the calling program at a later time, there are two ways to go:
You can retrieve the data ASAP, maybe via a separate thread, reading them and storing them somewhere where the consumer can get them.
You can have the producing subprocess have block and retrieve the data from it when you need them. The subprocess produces as many data as fit in the pipe buffer (usually 64 kiB) and then blocks on further writes. As soon as you need the data, you read() from the subprocess object's stdout (maybe stderr as well) and use them - or, again, you use sp.communicate() at that later time.
Way 1 would the way to go if producing the data needs much time, so that your wprogram would have to wait.
Way 2 would be to be preferred if the size of the data is quite huge and/or the data is produced so fast that buffering would make no sense.

See an older answer of mine including code snippets to do:
Uses processes not threads for blocking I/O because they can more reliably be p.terminated()
Implements a retriggerable timeout watchdog that restarts counting whenever some output happens
Implements a long-term timeout watchdog to limit overall runtime
Can feed in stdin (although I only need to feed in one-time short strings)
Can capture stdout/stderr in the usual Popen means (Only stdout is coded, and stderr redirected to stdout; but can easily be separated)
It's almost realtime because it only checks every 0.2 seconds for output. But you could decrease this or remove the waiting interval easily
Lots of debugging printouts still enabled to see whats happening when.
For spawning multiple concurrent commands, you would need to alter the class RunCmd to instantiate mutliple read output/write input queues and to spawn mutliple Popen subprocesses.

scrambled output from a child process run from subprocess

I'm using the following code to run another python script. The problem I'm facing is that the output of that script is coming out in an unorderly manner.
While running it from the command line, I get the correct output i.e. :
some output here
Editing xml file and saving changes
Uploading xml file back..
While running the script using subprocess, am getting some of the output in reverse order:
correct output till here
Uploading xml file back..
Editing xml file and saving changes
The script is executing without errors and making the right changes. So I think the culprit might be the code that is calling the child script, but I can't find the problem:
cmd = "child_script.py"
proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
(fout ,ferr) = ( proc.stdout, proc.stderr )
print "Going inside while - loop"
while True:
line = proc.stdout.readline()
print line
fo.write(line)
try :
err = ferr.readline()
fe.write(err)
except Exception, e:
pass
if not line:
pass
break
[EDIT]: fo and fe are file handles to output and error logs. Also the script is being run on Windows.Sorry for missing these details.

There are a few problems with the part of the script you've quoted, I'm afraid:
As mentioned in detly's comment, what are fo and fe? Presumably those are objects to which you're writing the output of the child process? (Update: you indicate that these are both for writing output logs.)
There's an indentation error on line 3. (Update: I've fixed that in the original post.)
You're specifying stderr=subprocess.STDOUT, so: (a) ferr will always be None in your loop and (b) due to buffering, standard output and error may be mixed in an unpredictable way. However, it looks from your code as if you actually want to deal with standard output and standard error separately, so perhaps try stderr=subprocess.PIPE instead.
It would be a good idea to rewrite your loop as jsbueno suggests:
from subprocess import Popen, PIPE
proc = Popen(["child_script.py"], stdout=PIPE, stderr=PIPE)
fout, ferr = proc.stdout, proc.stderr
for line in fout:
print(line.rstrip())
fo.write(line)
for line in ferr:
fe.write(line)
... or to reduce it even further, since it seems that the aim is essentially that you just want to write the standard output and standard error from the child process to fo and fe, just do:
proc = subprocess.Popen(["child_script.py"], stdout=fo, stderr=fe)
If you still see the output lines swapped in the file that fo is writing to, then we can only assume that there is some way in which this can happen in the child script. e.g. is the child script multi-threaded? Is one of the lines printed via a callback from another function?

Most of the times I've seen order of output differ based on execution, some output was sent to the C standard IO streams stdin, and some output was sent to stderr. The buffering characteristics of stdout and stderr vary depending upon if they are connected to a terminal, pipes, files, etc:
NOTES
The stream stderr is unbuffered. The stream stdout is
line-buffered when it points to a terminal. Partial lines
will not appear until fflush(3) or exit(3) is called, or a
newline is printed. This can produce unexpected results,
especially with debugging output. The buffering mode of
the standard streams (or any other stream) can be changed
using the setbuf(3) or setvbuf(3) call. Note that in case
stdin is associated with a terminal, there may also be
input buffering in the terminal driver, entirely unrelated
to stdio buffering. (Indeed, normally terminal input is
line buffered in the kernel.) This kernel input handling
can be modified using calls like tcsetattr(3); see also
stty(1), and termios(3).
So perhaps you should configure both stdout and stderr to go to the same source, so the same buffering will be applied to both streams.
Also, some programs open the terminal directly open("/dev/tty",...) (mostly so they can read passwords), so comparing terminal output with pipe output isn't always going to work.
Further, if your program is mixing direct write(2) calls with standard IO calls, the order of output can be different based on the different buffering choices.
I hope one of these is right :) let me know which, if any.

Using subprocess.Popen for Process with Large Output

I have some Python code that executes an external app which works fine when the app has a small amount of output, but hangs when there is a lot. My code looks like:
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
errcode = p.wait()
retval = p.stdout.read()
errmess = p.stderr.read()
if errcode:
log.error('cmd failed <%s>: %s' % (errcode,errmess))
There are comments in the docs that seem to indicate the potential issue. Under wait, there is:
Warning: This will deadlock if the child process generates enough output to a stdout or stderr pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.
though under communicate, I see:
Note The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
So it is unclear to me that I should use either of these if I have a large amount of data. They don't indicate what method I should use in that case.
I do need the return value from the exec and do parse and use both the stdout and stderr.
So what is an equivalent method in Python to exec an external app that is going to have large output?

You're doing blocking reads to two files; the first needs to complete before the second starts. If the application writes a lot to stderr, and nothing to stdout, then your process will sit waiting for data on stdout that isn't coming, while the program you're running sits there waiting for the stuff it wrote to stderr to be read (which it never will be--since you're waiting for stdout).
There are a few ways you can fix this.
The simplest is to not intercept stderr; leave stderr=None. Errors will be output to stderr directly. You can't intercept them and display them as part of your own message. For commandline tools, this is often OK. For other apps, it can be a problem.
Another simple approach is to redirect stderr to stdout, so you only have one incoming file: set stderr=STDOUT. This means you can't distinguish regular output from error output. This may or may not be acceptable, depending on how the application writes output.
The complete and complicated way of handling this is select (http://docs.python.org/library/select.html). This lets you read in a non-blocking way: you get data whenever data appears on either stdout or stderr. I'd only recommend this if it's really necessary. This probably doesn't work in Windows.

Reading stdout and stderr independently with very large output (ie, lots of megabytes) using select:
import subprocess, select
proc = subprocess.Popen(cmd, bufsize=8192, shell=False, \
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
with open(outpath, "wb") as outf:
dataend = False
while (proc.returncode is None) or (not dataend):
proc.poll()
dataend = False
ready = select.select([proc.stdout, proc.stderr], [], [], 1.0)
if proc.stderr in ready[0]:
data = proc.stderr.read(1024)
if len(data) > 0:
handle_stderr_data(data)
if proc.stdout in ready[0]:
data = proc.stdout.read(1024)
if len(data) == 0: # Read of zero bytes means EOF
dataend = True
else:
outf.write(data)

A lot of output is subjective so it's a little difficult to make a recommendation. If the amount of output is really large then you likely don't want to grab it all with a single read() call anyway. You may want to try writing the output to a file and then pull the data in incrementally like such:
f=file('data.out','w')
p = subprocess.Popen(cmd, shell=True, stdout=f, stderr=subprocess.PIPE)
errcode = p.wait()
f.close()
if errcode:
errmess = p.stderr.read()
log.error('cmd failed <%s>: %s' % (errcode,errmess))
for line in file('data.out'):
#do something

Glenn Maynard is right in his comment about deadlocks. However, the best way of solving this problem is two create two threads, one for stdout and one for stderr, which read those respective streams until exhausted and do whatever you need with the output.
The suggestion of using temporary files may or may not work for you depending on the size of output etc. and whether you need to process the subprocess' output as it is generated.
As Heikki Toivonen has suggested, you should look at the communicate method. However, this buffers the stdout/stderr of the subprocess in memory and you get those returned from the communicate call - this is not ideal for some scenarios. But the source of the communicate method is worth looking at.
Another example is in a package I maintain, python-gnupg, where the gpg executable is spawned via subprocess to do the heavy lifting, and the Python wrapper spawns threads to read gpg's stdout and stderr and consume them as data is produced by gpg. You may be able to get some ideas by looking at the source there, as well. Data produced by gpg to both stdout and stderr can be quite large, in the general case.

I had the same problem. If you have to handle a large output, another good option could be to use a file for stdout and stderr, and pass those files per parameter.
Check the tempfile module in python: https://docs.python.org/2/library/tempfile.html.
Something like this might work
out = tempfile.NamedTemporaryFile(delete=False)
Then you would do:
Popen(... stdout=out,...)
Then you can read the file, and erase it later.

You could try communicate and see if that solves your problem. If not, I'd redirect the output to a temporary file.

Here is simple approach which captures both regular output plus error output, all within Python so limitations in stdout don't apply:
com_str = 'uname -a'
command = subprocess.Popen([com_str], stdout=subprocess.PIPE, shell=True)
(output, error) = command.communicate()
print output
Linux 3.11.0-20-generic SMP Fri May 2 21:32:55 UTC 2014
and
com_str = 'id'
command = subprocess.Popen([com_str], stdout=subprocess.PIPE, shell=True)
(output, error) = command.communicate()
print output
uid=1000(myname) gid=1000(mygrp) groups=1000(cell),0(root)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.