I'm using ProcessPoolExecutor to execute an external command. Is there any way to capture stderr when the process is finished (similar to subprocess)? I'm capturing the executor.submit() future result, but that only returns 0 or 1.
This might not be an answer, but it points in that direction and is far too long for a comment, so here goes.
I would say no to that. You might be able to achieve this by tinkering with stderr file descriptors, redirecting them to a stream of your own and returning that as the worker result, but I wonder whether ProcessPoolExecutor is suitable for your task - not knowing, of course, what that task is.
A subprocess created by a process pool does not finish like a subprocess created by yourself. It stays alive waiting for more work to arrive until you close the pool. If your worker produces stdout or stderr, they go to the same place where your main process directs its stdout and stderr.
Your workers will also process many different tasks. If your pool size is four and you submit ten tasks, how do you then decipher from a plain stderr capture which task created what message?
I have a hunch this needs to be redesigned. You would be able to raise exceptions in your workers and then later capture those from your future objects. Or it might be that your task is something where a pool is just not suitable. If subprocesses do what you want them to do, why not use them instead?
Pools are good for parallelising repetitive tasks that receive and return modest amounts of data (transfers are implemented as queues, which are not miracle performers), with a standard interface and standardised output/error handling. Pools simplify your code by providing the routine part. If your subtasks require different inputs, their outputs and error handling vary greatly, or there is a lot of data to transmit, you may be better off building the parallel-processing part yourself.
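If the external command itself is the unit of work, one pragmatic middle ground is to run it with subprocess inside each pool worker and return the captured output along with the return code, so the future carries everything back. A rough sketch, assuming Python 3.7+ for capture_output/text and an invented example command:

import subprocess
from concurrent.futures import ProcessPoolExecutor

def run_command(cmd):
    # Run the external command inside the worker and hand back everything,
    # not just an exit status.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=4) as executor:
        future = executor.submit(run_command, ['ls', '/nonexistent'])
        returncode, out, err = future.result()
        print(returncode, err)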
Related
I am trying to create a Monitor script that monitors all the threads of a huge Python script which has several loggers and several threads running.
From Monitor.py I could run a subprocess and forward the STDOUT, which might contain the status of my threads, but since several loggers are running I see their logging mixed in with it.
Question: How can I run the main script as a separate process and get custom messages and thread status without interfering with the logging? (By passing a PIPE as an argument?)
Main_Script.py
* Runs several threads
* Each thread has a separate logger
Monitor.py
* Spins up Main_Script.py
* Monitors each of the threads in Main_Script.py (and may obtain other messages from Main_Script.py in the future)
So far, I have tried subprocess and Process from multiprocessing.
subprocess lets me start Main_Script.py and forward its stdout back to the monitor, but I see the logging of the threads coming in through the same STDOUT. I am using the "import logging" library to log the data from each thread to separate files.
I also tried Process from multiprocessing. I had to call the main function of Main_Script.py as a process and send a PIPE argument to it from Monitor.py. Now I can't see Main_Script.py as a separate process when I run the top command.
Normally, you want to change the child process to work like a typical Unix userland tool: the logging and other side-band information goes to stderr (or to a file, or syslog, etc.), and only the actual output goes to stdout.
Then, the problem is easy: just capture stdout to a PIPE that you process, and either capture stderr to a different PIPE, or pass it through to real stderr.
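As a rough sketch of that split, assuming the child is the Main_Script.py from the question and that a hypothetical handle_status() consumes the status lines: capture stdout to a pipe, and leave stderr alone so the logging keeps going wherever it goes today.

import subprocess

# Capture only the child's real output; its logging (on stderr or in files)
# is left untouched.
proc = subprocess.Popen(
    ['python', 'Main_Script.py'],
    stdout=subprocess.PIPE,
    stderr=None,                 # inherit: logging passes straight through
    universal_newlines=True,
)
for line in proc.stdout:
    handle_status(line)          # hypothetical: parse/act on a status message
proc.wait()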
If that's not appropriate for some reason, you need to come up with some other mechanism for IPC: Unix or Windows named pipes, anonymous pipes that you pass by leaking the file descriptor across the fork/exec and then pass the fd as an argument, Unix-domain sockets, TCP or UDP localhost sockets, a higher-level protocol like a web service on top of TCP sockets, mmapped files, anonymous mmaps or pipes that you pass between processes via a Unix-domain socket or Windows API calls, …
As you can see, there are a huge number of options. Without knowing anything about your problem other than that you want "custom messages", it's impossible to tell you which one you want.
While we're at it: If you can rewrite your code around multiprocessing rather than subprocess, there are nice high-level abstractions built in to that module. For example, you can use a Queue that automatically manages synchronization and blocking, and also manages pickling/unpickling so you can just pass any (picklable) object rather than having to worry about serializing to text and parsing the text. Or you can create shared memory holding arrays of int32 objects, or NumPy arrays, or arbitrary structures that you define with ctypes. And so on. Of course you could build the same abstractions yourself, without needing to use multiprocessing, but it's a lot easier when they're there out of the box.
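For instance, a minimal multiprocessing.Queue sketch (the status dictionary is just an invented example payload): the Queue handles locking and pickling, so the child can put any picklable object and the parent simply gets it.

import multiprocessing as mp

def worker(q):
    # Any picklable object can go straight onto the queue.
    q.put({'thread': 'worker-1', 'status': 'alive'})

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=worker, args=(q,))
    p.start()
    message = q.get()        # blocks until the child has put something
    print(message['status'])
    p.join()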
Finally, while your question is tagged ipc and pipe, and titled "Interprocess Communication", your description refers to threads, not processes. If you actually are using a bunch of threads in a single process, you don't need any of this.
You can just stick your results on a queue.Queue, or store them in a list or deque with a Lock around it, or pass in a callback to be called with each new result, or use a higher-level abstraction like concurrent.futures.ThreadPoolExecutor and return a Future object or an iterator of Futures, etc.
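For the all-threads case, a small sketch with concurrent.futures (check_thread_status is a made-up stand-in for whatever each thread actually does):

from concurrent.futures import ThreadPoolExecutor, as_completed

def check_thread_status(name):
    # Stand-in for the real per-thread work.
    return name, 'ok'

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(check_thread_status, n) for n in ('a', 'b', 'c')]
    for future in as_completed(futures):
        name, status = future.result()
        print(name, status)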
Is there a best-practices approach to poll for the stdout/stderr from a subprocess.Popen as well as a zmq socket?
In my case, I have my main program spawning a Popen subprocess. The subprocess publishes messages via zmq which I then want to subscribe to in my main program.
Waiting on multiple zmq sockets is not complicated with the zmq.Poller but when I want to interleave this with the output from my subprocess itself, I am unsure how to do it in the best way without risking waits or having needless loops.
In the end, I would like to use it like so:
process = Popen([prog, '--publish-to', 'tcp://127.0.0.1:89890'],
                stdout=subprocess.PIPE, stderr=subprocess.PIPE, ...)

for (origin, data) in interleave(process, 'tcp://127.0.0.1:89890'):
    if origin == 'STDOUT': pass
    if origin == 'STDERR': pass
    if origin == 'ZMQ': pass
prog --publish-to tcp://127.0.0.1:89890 will then open a zmq.PUB socket and publish data, whereas the interleave function will subscribe to this and also poll for stdout and stderr, yielding whatever data reaches it first.
I know how to define interleave with multiple daemon threads and queues, but I don't know whether this approach has caveats with regard to lazy reading (i.e. stdout might not be processed until the end of the program?) or other things that I have not yet thought about (it also seems like quite a bit of overhead for such a task).
I will be thankful for all ideas or insights.
I aim for at least Python 3.3/3.4 but if this turns out to be much easier with the new async/await tools, I could also use Python 3.5 for the code.
Use zmq.Poller: http://pyzmq.readthedocs.io/en/latest/api/zmq.html#polling. You can register zmq sockets and native file descriptors (e.g. process.stdout.fileno() and process.stderr.fileno()) there, and it will wait until input is available on at least one of the registered sources.
I don't know whether it works on Windows; you should try.
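A rough sketch of such an interleave() on top of zmq.Poller, registering the raw pipe file descriptors alongside the SUB socket (Unix-oriented; the address and Popen object come from the question, and the line-based reads are a simplification):

import zmq

def interleave(process, address):
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    sub.connect(address)
    sub.setsockopt(zmq.SUBSCRIBE, b'')

    poller = zmq.Poller()
    poller.register(sub, zmq.POLLIN)
    poller.register(process.stdout.fileno(), zmq.POLLIN)
    poller.register(process.stderr.fileno(), zmq.POLLIN)

    while process.poll() is None:
        events = dict(poller.poll(timeout=1000))   # milliseconds
        if sub in events:
            yield ('ZMQ', sub.recv())
        if process.stdout.fileno() in events:
            yield ('STDOUT', process.stdout.readline())
        if process.stderr.fileno() in events:
            yield ('STDERR', process.stderr.readline())

The readline() calls can still block on partial lines; for anything robust you would read raw chunks with os.read() instead.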
I am using python's subprocess module to start a new process. I would like to capture the output of the new process in real time so I can do things with it (display it, parse it, etc.). I have seen many examples of how this can be done, some use custom file-like objects, some use threading and some attempt to read the output until the process has completed.
File Like Objects Example (click me)
I would prefer not to use custom file-like objects because I want to allow users to supply their own values for stdin, stdout and stderr.
Threading Example (click me)
I do not really understand why threading is required, so I am reluctant to follow this example. If someone can explain why the threading example makes sense, I would be happy to listen. However, this example also restricts users from supplying their own stdout and stderr values.
Read Output Example (see below)
The example which makes the most sense to me is to read the stdout, stderr until the process has finished. Here is some example code:
import subprocess
import sys

# Start a process which prints the options to the python program.
process = subprocess.Popen(
    ["python", "-h"],
    bufsize=1,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# While the process is running, display the output to the user.
while True:
    # Read standard output data.
    for stdout_line in iter(process.stdout.readline, ""):
        # Display standard output data.
        sys.stdout.write(stdout_line)
    # Read standard error data.
    for stderr_line in iter(process.stderr.readline, ""):
        # Display standard error data.
        sys.stderr.write(stderr_line)
    # If the process is complete - exit loop.
    if process.poll() is not None:
        break
My question is,
Q. Is there a recommended approach for capturing the output of a process using python?
First, your design is a bit silly, since you can do the same thing like this:
process = subprocess.Popen(
    ["python", "-h"],
    bufsize=1,
    stdout=sys.stdout,
    stderr=sys.stderr,
)
… or, even better:
process = subprocess.Popen(
    ["python", "-h"],
    bufsize=1,
)
However, I'll assume that's just a toy example, and you might want to do something more useful.
The main problem with your design is that it won't read anything from stderr until stdout is done.
Imagine you're driving an MP3 player that prints each track name to stdout, and logging info to stderr, and you want to play 10 songs. Do you really want to wait 30 minutes before displaying any of the logging to your users?
If that is acceptable, then you might as well just use communicate, which takes care of all of the headaches for you.
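In other words (a minimal sketch, reusing the question's toy python -h command), roughly:

import subprocess
import sys

process = subprocess.Popen(
    ["python", "-h"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
out, err = process.communicate()   # drains both pipes, then waits for exit
sys.stdout.write(out)
sys.stderr.write(err)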
Plus, even if it's acceptable for your model, are you sure you can queue up that much unread data in the pipe without it blocking the child? On every platform?
Just breaking up the loop to alternate between the two won't help, because you could end up blocking on stdout.readline() for 5 minutes while stderr is piling up.
So that's why you need some way to read from both at once.
How do you read from two pipes at once?
This is the same problem (but smaller) as handling 1000 network clients at once, and it has the same solutions: threading, or multiplexing (and the various hybrids, like doing green threads on top of a multiplexor and event loop, or using a threaded proactor, etc.).
The best sample code for the threaded version is communicate from the 3.2+ source code. It's a little complicated, but if you want to handle all of the edge cases properly on both Windows and Unix there's really no avoiding a bit of complexity.
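A stripped-down sketch of the threaded idea (far less careful about edge cases than the stdlib version): one reader thread per pipe feeds a single queue, so neither pipe can stall the other.

import subprocess
import sys
from queue import Queue
from threading import Thread

def pump(pipe, tag, q):
    # Read one pipe on its own thread and forward each line to the queue.
    for line in iter(pipe.readline, ''):
        q.put((tag, line))
    pipe.close()
    q.put((tag, None))              # sentinel: this pipe has hit EOF

process = subprocess.Popen(
    ["python", "-h"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    universal_newlines=True, bufsize=1,
)
q = Queue()
for pipe, tag in ((process.stdout, 'OUT'), (process.stderr, 'ERR')):
    Thread(target=pump, args=(pipe, tag, q), daemon=True).start()

open_pipes = 2
while open_pipes:
    tag, line = q.get()
    if line is None:
        open_pipes -= 1
    elif tag == 'OUT':
        sys.stdout.write(line)
    else:
        sys.stderr.write(line)
process.wait()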
For multiplexing, you can use the select module, but keep in mind that this only works on Unix (you can't select on pipes on Windows), and it's buggy without 3.2+ (or the subprocess32 backport), and to really get all the edge cases right you need to add a signal handler to your select. Unless you really, really don't want to use threading, this is the harder answer.
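For completeness, a bare-bones select sketch (Unix only, and ignoring the signal-handling caveat above): it watches both pipe file descriptors and drains whichever becomes readable first.

import os
import select
import subprocess
import sys

process = subprocess.Popen(
    ["python", "-h"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE,
)
streams = {process.stdout.fileno(): sys.stdout,
           process.stderr.fileno(): sys.stderr}
while streams:
    readable, _, _ = select.select(list(streams), [], [])
    for fd in readable:
        chunk = os.read(fd, 4096)
        if not chunk:                    # EOF: the child closed this pipe
            del streams[fd]
        else:
            streams[fd].write(chunk.decode())
process.wait()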
But the easy answer is to use someone else's implementation. There are a dozen or more modules on PyPI specifically for async subprocesses. Alternatively, if you already have a good reason to write your app around an event loop, just about every modern event-loop-driven async networking library (including the stdlib's asyncio) includes subprocess support out of the box, that works on both Unix and Windows.
Is there a recommended approach for capturing the output of a process using python?
It depends on who you're asking; a thousand Python developers might have a thousand different answers… or at least half a dozen. If you're asking what the core devs would recommend, I can take a guess:
If you don't need to capture it asynchronously, use communicate (but make sure to upgrade to at least 3.2 for important bug fixes). If you do need to capture it asynchronously, use asyncio (which requires 3.4).
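As a sketch of the asyncio route, written with async/await (so Python 3.5+; on 3.4 you would use @asyncio.coroutine and yield from, and on Windows the proactor event loop), again running python -h as a toy child:

import asyncio
import sys

async def stream(pipe, sink):
    # Forward lines from one of the child's pipes as they arrive.
    while True:
        line = await pipe.readline()
        if not line:
            break
        sink.write(line.decode())

async def main():
    process = await asyncio.create_subprocess_exec(
        sys.executable, '-h',
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    await asyncio.gather(
        stream(process.stdout, sys.stdout),
        stream(process.stderr, sys.stderr),
    )
    await process.wait()

asyncio.get_event_loop().run_until_complete(main())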
I'm using Celery to queue jobs from a CGI application I made. The way I've set it up, Celery makes each job run one- or two-at-a-time by setting CELERYD_CONCURRENCY = 1 or = 2 (so they don't crowd the processor or thrash from memory consumption). The queue works great, thanks to advice I got on StackOverflow.
Each of these jobs takes a fair amount of time (~30 minutes serially), but is embarrassingly parallel. For this reason, I was using Pool.map to split it and do the work in parallel. It worked great from the command line, and I got runtimes of around 5 minutes using a new many-cored chip.
Unfortunately, there is a limitation that does not allow daemonic processes to have children, and when I run the fancy parallelized code within the CGI queue, I get this error:
AssertionError: daemonic processes are not allowed to have children
I noticed other people have had similar questions, but I can't find an answer that wouldn't require abandoning Pool.map altogether, and making more complicated thread code.
What is the appropriate design choice here? I can easily run my serial jobs using my Celery queue. I can also run my much faster parallelized jobs without a queue. How should I approach this, and is it possible to get what I want (both the queue and the per-job parallelization)?
A couple of ideas I've had (some are quite hacky):
The job sent to the Celery queue simply calls the command line program. That program can use Pool as it pleases, and then saves the result figures & data to a file (just as it does now). Downside: I won't be able to check on the status of the job or see if it terminated successfully. Also, system calls from CGI may cause security issues.
Obviously, if the queue is very full of jobs, I can make use of the CPU resources (by setting CELERYD_CONCURRENCY = 6 or so); this will allow many people to be "at the front of the queue" at once. Downside: Each job will spend a lot of time at the front of the queue; if the queue isn't full, there will be no speedup. Also, many partially finished jobs will be stored in memory at the same time, using much more RAM.
Use Celery's @task to parallelize within sub-jobs. Then, instead of setting CELERYD_CONCURRENCY = 1, I would set it to 6 (or however many sub-jobs I'd like to allow in memory at a time). Downside: First of all, I'm not sure whether this will successfully avoid the "task-within-task" problem. But also, the notion of queue position may be lost, and many partially finished jobs may end up in memory at once.
Perhaps there is a way to call Pool.map and specify that the threads are non-daemonic? Or perhaps there is something more lightweight I can use instead of Pool.map? This is similar to an approach taken on another open StackOverflow question. Also, I should note that the parallelization I exploit via Pool.map is similar to linear algebra, and there is no inter-process communication (each just runs independently and returns its result without talking to the others).
Throw away Celery and use multiprocessing.Queue. Then maybe there'd be some way to use the same "thread depth" for every thread I use (i.e. maybe all of the threads could use the same Pool, avoiding nesting)?
Thanks a lot in advance.
What you need is a workflow management system (WFMS) that manages
task concurrency
task dependency
task nesting
among other things.
From a very high level view, a WFMS sits on top of a task pool like celery, and submits the tasks which are ready to execute to the pool. It is also responsible for opening up a nest and submitting the tasks in the nest accordingly.
I've developed a system to do just that. It's called pomsets. Try it out, and feel free to send me any questions.
I am using multiprocess daemons based on Twisted with forking, and Gearman job queries work normally.
Take a look at Gearman.
I want to know the best practices followed to share a queue (resource) between two processes in Python. Here is a what each process is doing:
Process_1: continuously gets data (in json format) from a streaming api
Process_2: is a daemon (similar to Sander Marechal's code) which commits data (one at a time) into a database
So, Process_1 (or Producer) puts a unit of data onto this shared resource, and Process_2 (or Consumer) will poll this shared resource for any new units of data, and store them in a DB, if any.
There are some options which came to my mind:
Using pickle (drawback: extra overhead of pickling and de-pickling)
Passing data via stdout of Process_1 to stdin of Process_2 (drawback: none, but not sure how to implement this with a daemon)
Using the pool object in the multiprocessing library (drawback: not sure how to code this as one process is a daemon)
I would like to know the best practice here, ideally with some code :). Thanks.
multiprocessing.Pool isn't what you want in this case - it is useful for having multiple units of work done 'in the background' (concurrently), not so much for managing a shared resource. Since you appear to have the format of the communications worked out, and they communicate in only one direction, what you want is a multiprocessing.Queue - the documentation has a good example of how to use it. You will want Process_1 putting data into the Queue as needed, and Process_2 calling q.get() in an infinite loop. This causes the Consumer to block when there is nothing to do, rather than busy-waiting as you suggest (which can waste processor cycles).

The issue that this leaves is shutting down the daemon - possibly the best way is to have the Producer put a sentinel value at the end of the queue, to ensure that the Consumer deals with all requests. Other alternatives include things like trying to forcibly kill the process when the child exits, but this is error-prone.
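A bare-bones sketch of that producer/consumer shape with a sentinel (stream_from_api() and save_to_db() are placeholders for the question's streaming source and DB commit):

import multiprocessing as mp

SENTINEL = None                      # tells the Consumer no more data is coming

def consumer(queue):
    while True:
        item = queue.get()           # blocks instead of busy-waiting
        if item is SENTINEL:
            break
        save_to_db(item)             # placeholder: commit one unit to the DB

def producer(queue):
    for item in stream_from_api():   # placeholder: the streaming API
        queue.put(item)
    queue.put(SENTINEL)

if __name__ == '__main__':
    q = mp.Queue()
    c = mp.Process(target=consumer, args=(q,))
    c.start()
    producer(q)                      # Process_1's work happens here
    c.join()                         # the sentinel lets the Consumer exit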
Note that this assumes that the Producer spawns the Consumer (or vice versa) - if the Consumer is a long-running daemon that can deal with multiple relatively short-lived Producers, the situation becomes quite a bit harder. There isn't, to my knowledge, any cross-platform high-level IPC module; the most portable (and generally easiest) way to handle this may be to use the filesystem as a queue - have a spool folder somewhere that the Producers write a text file to for each request, which the Consumer can then process at its leisure. However, this isn't without its own issues: you would need to ensure that the Consumer doesn't try to open a half-written instruction file, that the Producers aren't stepping on each other's toes, and that the Producers and the Consumer agree on the ordering of requests.
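A sketch of that spool-folder approach (the directory name and handle() are invented): writing to a temporary name and renaming afterwards means the Consumer never sees a half-written file, since a rename within one filesystem is atomic, and timestamped names give a rough ordering.

import json
import os
import tempfile
import time

SPOOL = '/var/spool/myapp'           # invented spool directory

def enqueue(item):
    # Producer side: write fully, then rename into place atomically.
    fd, tmp = tempfile.mkstemp(dir=SPOOL, suffix='.tmp')
    with os.fdopen(fd, 'w') as f:
        json.dump(item, f)
    final = os.path.join(SPOOL, '%f-%d.json' % (time.time(), os.getpid()))
    os.rename(tmp, final)

def consume_forever():
    # Consumer side: pick up finished .json files in rough arrival order.
    while True:
        for name in sorted(os.listdir(SPOOL)):
            if not name.endswith('.json'):
                continue
            path = os.path.join(SPOOL, name)
            with open(path) as f:
                handle(json.load(f))  # placeholder: commit to the database
            os.remove(path)
        time.sleep(1)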