Using stdin in a child Process - python

So I have a program in which, from the "main" process, I fire off a new Process object that (as I want it) reads lines from stdin and appends them to a Queue object.
Essentially the basic system setup is that there is a "command getting" process into which the user will enter commands/queries, and I need to get those queries to the other subsystems running in separate processes. My thinking is to share them via a multiprocessing.Queue which the other systems can read from.
What I have (focusing on just getting the commands/queries) is basically:
def sub_proc(q):
    some_str = ""
    while True:
        some_str = raw_input("> ")
        if some_str.lower() == "quit":
            return
        q.put_nowait(some_str)

if __name__ == "__main__":
    q = Queue()
    qproc = Process(target=sub_proc, args=(q,))
    qproc.start()
    qproc.join()
    # now at this point q should contain all the strings entered by the user
The problem is that I get:
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/home/blah/blah/blah/blah.py", line 325, in sub_proc
some_str = raw_input("> ")
File "/randompathhere/eclipse/plugins/org.python.pydev_2.1.0.2011052613/PySrc/pydev_sitecustomize/sitecustomize.py", line 181, in raw_input
ret = original_raw_input(prompt)
EOFError: EOF when reading a line
How do I fix this?

I solved a similar issue by passing the original stdin file descriptor to the child process and re-opening it there.
def sub_proc(q, fileno):
    sys.stdin = os.fdopen(fileno)  # open stdin in this process
    some_str = ""
    while True:
        some_str = raw_input("> ")
        if some_str.lower() == "quit":
            return
        q.put_nowait(some_str)

if __name__ == "__main__":
    q = Queue()
    fn = sys.stdin.fileno()  # get original file descriptor
    qproc = Process(target=sub_proc, args=(q, fn))
    qproc.start()
    qproc.join()
This worked for my relatively simple case. I was even able to use the readline module on the re-opened stream. I don't know how robust it is for more complex systems.

In short, the main process and your second process don't share the same STDIN.
from multiprocessing import Process, Queue
import sys

def sub_proc():
    print sys.stdin.fileno()

if __name__ == "__main__":
    print sys.stdin.fileno()
    qproc = Process(target=sub_proc)
    qproc.start()
    qproc.join()
Run that and you should get two different results for sys.stdin.fileno()
Unfortunately, that doesn't solve your problem. What are you trying to do?

If you don't want to pass stdin to the target process's function, as in @Ashelly's answer, or just need to do it for many different processes, you can do it with multiprocessing.Pool via the initializer argument:
import os, sys, multiprocessing

def square(num=None):
    if not num:
        num = int(raw_input('square what? '))
    return num ** 2

def initialize(fd):
    sys.stdin = os.fdopen(fd)

initargs = [sys.stdin.fileno()]
pool = multiprocessing.Pool(initializer=initialize, initargs=initargs)
print pool.apply(square, [3])
print pool.apply(square)
The above example will print the number 9, followed by a prompt for input and then the square of the input number.
Just be careful not to have multiple child processes reading from the same descriptor at the same time or things may get... confusing.
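If several workers really do need to prompt, one way to keep their reads from interleaving is a shared multiprocessing.Lock passed through initargs. This is only a sketch of that idea, not part of the answer above; the names ask and stdin_lock are made up for illustration:
import os, sys, multiprocessing

def initialize(fd, lock):
    sys.stdin = os.fdopen(fd)
    global stdin_lock          # made-up module-level name for the shared lock
    stdin_lock = lock

def ask(prompt):
    with stdin_lock:           # only one worker talks to the terminal at a time
        return raw_input(prompt)

def square(num=None):
    if num is None:
        num = int(ask('square what? '))
    return num ** 2

if __name__ == '__main__':
    lock = multiprocessing.Lock()
    pool = multiprocessing.Pool(initializer=initialize,
                                initargs=[sys.stdin.fileno(), lock])
    # two tasks that both need to prompt; the lock keeps their reads separate
    results = [pool.apply_async(square) for _ in range(2)]
    print [r.get() for r in results]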

You could use threading and keep it all on the same process:
from multiprocessing import Queue
from Queue import Empty
from threading import Thread

def sub_proc(q):
    some_str = ""
    while True:
        some_str = raw_input("> ")
        if some_str.lower() == "quit":
            return
        q.put_nowait(some_str)

if __name__ == "__main__":
    q = Queue()
    qproc = Thread(target=sub_proc, args=(q,))
    qproc.start()
    qproc.join()
    while True:
        try:
            print q.get(False)
        except Empty:
            break

Related

Python 3: multiprocessing, EOFError: EOF when reading a line

I wish to have a process continually monitoring RPi input, and setting a variable (I have chosen a queue) to True or False to reflect the debounced value. Another process will then capture an image (from a stream). I have written some code just to check that I can get multiprocessing and signalling (the queue) working OK (I'm an amateur coder...).
It all works fine with threading, but multiprocessing gives an odd error, specifically 'multiprocessing, EOFError: EOF when reading a line'. Code output:
this computer has the following number of CPU's 6
OK, started thread on separate processor, now we monitor variable
enter something, True is the key word:
Process Process-1:
Traceback (most recent call last):
File "c:\Python34\lib\multiprocessing\process.py", line 254, in _bootstrap
self.run()
File "c:\Python34\lib\multiprocessing\process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Peter\Documents\NetBeansProjects\test_area\src\test4.py", line 16, in Wait4InputIsTrue
ValueIs = input("enter something, True is the key word: ")
EOFError: EOF when reading a line
This module monitors the 'port' (I am using the keyboard as an input):
#test4.py
from time import sleep
from multiprocessing import Lock

def Wait4InputIsTrue(TheVar, TheLock):
    while True:
        sleep(0.2)
        TheLock.acquire()
        #try:
        ValueIs = input("enter something, True is the key word: ")
        #except:
        #    ValueIs = False
        if ValueIs == "True":
            TheVar.put(True)
            print("changed TheVar to True")
        TheLock.release()
This module monitors the status, and acts on it:
#test5.py
if __name__ == "__main__":
    from multiprocessing import Process, Queue, Lock, cpu_count
    from time import sleep
    from test4 import Wait4InputIsTrue

    print("this computer has the following number of CPU's", cpu_count())
    LockIt = Lock()
    IsItTrue = Queue(maxsize=3)
    Wait4 = Process(target=Wait4InputIsTrue, args=(IsItTrue, LockIt))
    Wait4.start()
    print("OK, started thread on separate processor, now we monitor variable")
    while True:
        if IsItTrue.qsize():
            sleep(0.1)
            print("received input from separate thread:", IsItTrue.get())
Note that I have tried adding a try: around the input statement in test4.py, in which case it keeps printing "enter something, True is the key word: " indefinitely, without a carriage return.
I added the Lock in a wild attempt to fix it; it makes no difference.
Does anyone have any idea why this is happening?
Your problem can be boiled down to a simpler script:
import multiprocessing as mp
import sys

def worker():
    print("Got", repr(sys.stdin.read(1)))

if __name__ == "__main__":
    process = mp.Process(target=worker)
    process.start()
    process.join()
When run, it produces
$ python3 i.py
Got ''
Reading zero bytes means the pipe is closed and input(..) turns that into an EOFError exception.
The multiprocessing module doesn't let you read stdin. That makes sense generally because mixing stdin readers from multiple children is a risky business. In fact, digging into the implementation, multiprocessing/process.py explicitly sets stdin to devnull:
sys.stdin.close()
sys.stdin = open(os.devnull)
If you are just using stdin for test, then the solution is simple: Don't do that! If you really need user input, life is quite a bit more difficult. You can use additional queues plus code in the parent to prompt users and get input.
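For instance, here is a minimal sketch of that idea (the worker, cmd_q and result_q names are made up for illustration): only the parent ever calls input(), and lines are forwarded to the child over a second Queue:
from multiprocessing import Process, Queue

def worker(cmd_q, result_q):
    while True:
        value = cmd_q.get()                # blocks until the parent forwards a line
        if value == "quit":
            return
        result_q.put(value == "True")      # e.g. report whether the keyword was seen

if __name__ == "__main__":
    cmd_q, result_q = Queue(), Queue()
    p = Process(target=worker, args=(cmd_q, result_q))
    p.start()
    while True:
        line = input("enter something, True is the key word: ")  # only the parent reads stdin
        cmd_q.put(line)
        if line == "quit":
            break
        print("worker says:", result_q.get())
    p.join()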

Do not print stack-trace using Pool python

I use a Pool to run several commands simultaneously. I would like not to print the stack trace when the user interrupts the script.
Here is my script structure:
def worker(some_element):
    try:
        cmd_res = Popen(SOME_COMMAND, stdout=PIPE, stderr=PIPE).communicate()
    except (KeyboardInterrupt, SystemExit):
        pass
    except Exception, e:
        print str(e)
        return
    #deal with cmd_res...

pool = Pool()
try:
    pool.map(worker, some_list, chunksize=1)
except KeyboardInterrupt:
    pool.terminate()
    print 'bye!'
By calling pool.terminate() when KeyboardInterrupt is raised, I expected no stack trace to be printed, but it doesn't work; I sometimes get something like:
^CProcess PoolWorker-6:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/usr/lib/python2.7/multiprocessing/queues.py", line 374, in get
racquire()
KeyboardInterrupt
Process PoolWorker-1:
Process PoolWorker-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Traceback (most recent call last):
...
bye!
Do you know how I could hide this?
Thanks.
In your case you don't even need pool processes or threads, and then it becomes easier to silence the KeyboardInterrupt with a try/except.
Pool processes are useful when your Python code does CPU-consuming calculations that can profit from parallelization.
Threads are useful when your Python code does complex blocking I/O that can run in parallel. You just want to execute multiple programs in parallel and wait for the results. When you use Pool you create processes that do nothing other than start other processes and wait for them to terminate.
The simplest solution is to create all of the processes in parallel and then to call .communicate() on each of them:
try:
    processes = []
    # Start all processes at once
    for element in some_list:
        processes.append(Popen(SOME_COMMAND, stdout=PIPE, stderr=PIPE))
    # Fetch their results sequentially
    for process in processes:
        cmd_res = process.communicate()
        # Process your result here
except KeyboardInterrupt:
    for process in processes:
        try:
            process.terminate()
        except OSError:
            pass
This works as long as the output on STDOUT and STDERR isn't too big. Otherwise, if a process other than the one currently being communicate()d with produces more output than fits in the PIPE buffer (usually around 1-8 kB), the OS will suspend it until communicate() is called on it. In that case you need a more sophisticated solution:
Asynchronous I/O
Since Python 3.4 you can use the asyncio module for single-thread pseudo-multithreading:
import asyncio
from asyncio.subprocess import PIPE

loop = asyncio.get_event_loop()

@asyncio.coroutine
def worker(some_element):
    process = yield from asyncio.create_subprocess_exec(*SOME_COMMAND, stdout=PIPE)
    try:
        cmd_res = yield from process.communicate()
    except KeyboardInterrupt:
        process.terminate()
        return
    try:
        pass  # Process your result here
    except KeyboardInterrupt:
        return

# Start all workers
workers = []
for element in some_list:
    w = asyncio.async(worker(element))  # schedule the coroutine as a task
    workers.append(w)

# Run until everything is complete
loop.run_until_complete(asyncio.wait(workers))
You should be able to limit the number of concurrent processes using e.g. asyncio.Semaphore if you need to.
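A sketch of that throttling, assuming the Python 3.4 yield-from style used above (the limit of 4 is arbitrary; SOME_COMMAND and some_element are the names from the question):
import asyncio
from asyncio.subprocess import PIPE

semaphore = asyncio.Semaphore(4)  # arbitrary cap on concurrent subprocesses

@asyncio.coroutine
def worker(some_element):
    # acquire the semaphore before spawning; it is released when the with-block exits
    with (yield from semaphore):
        process = yield from asyncio.create_subprocess_exec(*SOME_COMMAND, stdout=PIPE)
        cmd_res = yield from process.communicate()
        return cmd_res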
When you instantiate Pool, it creates cpu_count() (on my machine, 8) Python processes waiting for your worker(). Note that they don't run it yet; they are waiting for a command. When they aren't executing your code, they also don't handle KeyboardInterrupt. You can see what they are doing if you specify Pool(processes=2) and send the interrupt. You can play with the number of processes, but I don't think you can handle it in all cases.
Personally, I don't recommend using multiprocessing.Pool for the task of launching other processes. It's overkill to launch several Python processes for that. A much more efficient way is to use threads (see threading.Thread, Queue.Queue). But in that case you need to implement the thread pool yourself, which is not so hard.
Your child processes will receive both the KeyboardInterrupt exception and the termination signal from terminate().
Because the child process receives the KeyboardInterrupt, a simple join() in the parent -- rather than the terminate() -- should suffice.
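A sketch of what that suggestion might look like, reusing worker and some_list from the question (pool.close() is added because join() requires it; this is one reading of the suggestion, not the answerer's own code):
from multiprocessing import Pool

pool = Pool()
try:
    pool.map(worker, some_list, chunksize=1)
except KeyboardInterrupt:
    pool.close()  # stop feeding tasks to the workers
    pool.join()   # the workers received the KeyboardInterrupt themselves and exit quietly
    print 'bye!'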
As suggested by y0prst, I used threading.Thread instead of Pool.
Here is a working example, which rasterises a set of vectors with ImageMagick (I know I can use mogrify for this; it's just an example).
#!/usr/bin/python
from os.path import abspath
from os import listdir
from threading import Thread
from subprocess import Popen, PIPE

RASTERISE_CALL = "magick %s %s"
INPUT_DIR = './tests_in/'

def get_vectors(dir):
    '''Return a list of svg files inside the `dir` directory'''
    return [abspath(dir + f).replace(' ', '\\ ') for f in listdir(dir) if f.endswith('.svg')]

class ImageMagickError(Exception):
    '''Custom error for failed ImageMagick calls'''
    def __init__(self, value): self.value = value
    def __str__(self): return repr(self.value)

class Rasterise(Thread):
    '''Rasterizes a given vector.'''
    def __init__(self, svg):
        self.stdout = None
        self.stderr = None
        Thread.__init__(self)
        self.svg = svg

    def run(self):
        p = Popen((RASTERISE_CALL % (self.svg, self.svg + '.png')).split(), shell=False, stdout=PIPE, stderr=PIPE)
        self.stdout, self.stderr = p.communicate()
        if self.stderr != '':  # compare by value, not identity
            raise ImageMagickError, 'can not rasterize ' + self.svg + ': ' + self.stderr

threads = []

def join_threads():
    '''Joins all the threads.'''
    for t in threads:
        try:
            t.join()
        except (KeyboardInterrupt, SystemExit):
            pass

# Rasterize all the vectors in INPUT_DIR.
for f in get_vectors(INPUT_DIR):
    t = Rasterise(f)
    try:
        print 'rasterize ' + f
        t.start()
    except (KeyboardInterrupt, SystemExit):
        join_threads()
    except ImageMagickError:
        print 'Oops, IM can not rasterize ' + f + '.'
        continue
    threads.append(t)

# wait for all threads to end
join_threads()
print('Finished!')
Please tell me if you think there is a more Pythonic way to do this, or if it can be optimised, and I will edit my answer.

Threaded subprocess read does not give any output

I have the following code and am trying to run it in IDLE on Linux.
import sys
from subprocess import PIPE, Popen
from threading import Thread
try:
    from Queue import Queue, Empty
except ImportError:
    from queue import Queue, Empty  # python 3.x

ON_POSIX = 'posix' in sys.builtin_module_names

def enqueue_output(out, queue):
    for line in iter(out.readline, b''):
        queue.put(line)
    out.close()

p = Popen(['youtube-dl', '-l', '-c', 'https://www.youtube.com/watch?v=utV1sdjr4PY'], stdout=PIPE, bufsize=1, close_fds=ON_POSIX)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True  # thread dies with the program
t.start()

# ... do other things here

# read line without blocking
while True:
    try:
        line = q.get_nowait()  # or q.get(timeout=.1)
    except Empty:
        pass
        #print('no output yet')
    else:  # got line
        print line
But it always prints "no output yet".
Edit: I edited the code and it is working. But I have another problem: the percentage of the download is updated on a single line, and the code only reads it after the line is complete.
OK, let's put the comments in an answer.
import sys, os
from subprocess import PIPE, Popen
from time import sleep
import pty

master, slave = pty.openpty()
stdout = os.fdopen(master)
p = Popen(['youtube-dl', '-l', '-c', 'https://www.youtube.com/watch?v=AYlb-7TXMxM'], shell=False, stdout=slave, stderr=slave, close_fds=True)
while True:
    #line = stdout.readline().rstrip() - will strip the new line
    line = stdout.readline()
    if line != b'':
        sys.stdout.write("\r%s" % line)
        sys.stdout.flush()
    sleep(.1)
If you want a thread and a different while loop, I suggest wrapping it in a class and avoiding the queue. The output is "unbuffered" - thanks @FilipMalckzak
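A rough sketch of that wrapping (the class name PtyReader is made up; it keeps the same pty/readline approach as above and relies on the daemon flag so the reader dies with the program):
import os
import pty
import sys
from subprocess import Popen
from threading import Thread

class PtyReader(Thread):
    '''Runs a command behind a pty and echoes its (unbuffered) output lines.'''
    def __init__(self, cmd):
        Thread.__init__(self)
        self.daemon = True
        self.master, self.slave = pty.openpty()
        self.process = Popen(cmd, shell=False, stdout=self.slave,
                             stderr=self.slave, close_fds=True)
        self.stdout = os.fdopen(self.master)

    def run(self):
        while self.process.poll() is None:
            line = self.stdout.readline()
            if line:
                sys.stdout.write("\r%s" % line)
                sys.stdout.flush()

reader = PtyReader(['youtube-dl', '-l', '-c', 'https://www.youtube.com/watch?v=AYlb-7TXMxM'])
reader.start()
# ... the main loop is free to do other work here ...
reader.process.wait()  # wait for the download itself; the daemon thread dies with the program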

Python: How to peek into a pty object to avoid blocking?

I am using pty to read non blocking the stdout of a process like this:
import os
import pty
import subprocess

master, slave = pty.openpty()
p = subprocess.Popen(cmd, stdout=slave)
stdout = os.fdopen(master)
while True:
    if p.poll() != None:
        break
    print stdout.readline()
stdout.close()
Everything works fine except that the while-loop occasionally blocks. This is because the line print stdout.readline() is waiting for something to be read from stdout. But if the program has already terminated, my little script up there will hang forever.
My question is: Is there a way to peek into the stdout object and check if there is data available to be read? If this is not the case it should continue through the while-loop where it will discover that the process actually already terminated and break the loop.
Yes, use the select module's poll:
import select
q = select.poll()
q.register(stdout,select.POLLIN)
and in the while use:
l = q.poll(0)
if not l:
    pass  # no input
else:
    pass  # there is some input
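Putting the pieces together with the question's loop might look like the sketch below. Here cmd stands in for whatever command the question runs; the ping command is used only so the sketch runs as-is:
import os
import pty
import select
import subprocess

cmd = ['ping', '-c', '3', 'localhost']   # placeholder for the real command

master, slave = pty.openpty()
p = subprocess.Popen(cmd, stdout=slave)
stdout = os.fdopen(master)

poller = select.poll()
poller.register(stdout, select.POLLIN)

while True:
    events = poller.poll(0)                        # non-blocking; empty list means nothing readable yet
    if events and events[0][1] & select.POLLIN:
        print stdout.readline()                    # data is waiting, so readline won't block for long
    elif p.poll() is not None:
        break                                      # nothing readable and the process has exited
stdout.close()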
The select.poll() answer is very neat, but doesn't work on Windows. The following solution is an alternative. It doesn't allow you to peek stdout, but provides a non-blocking alternative to readline() and is based on this answer:
from subprocess import Popen, PIPE
from threading import Thread

def process_output(myprocess):  #output-consuming thread
    nextline = None
    buf = ''
    while True:
        #--- extract line using read(1)
        out = myprocess.stdout.read(1)
        if out == '' and myprocess.poll() != None: break
        if out != '':
            buf += out
            if out == '\n':
                nextline = buf
                buf = ''
        if not nextline: continue
        line = nextline
        nextline = None

        #--- do whatever you want with line here
        print 'Line is:', line
    myprocess.stdout.close()

myprocess = Popen('myprogram.exe', stdout=PIPE)  #output-producing process
p1 = Thread(target=process_output, args=(myprocess,))  #output-consuming thread
p1.daemon = True
p1.start()

#--- do whatever here and then kill process and thread if needed
if myprocess.poll() == None:  #kill process; will automatically stop thread
    myprocess.kill()
    myprocess.wait()
if p1 and p1.is_alive():  #wait for thread to finish
    p1.join()
Other solutions for non-blocking read have been proposed here, but did not work for me:
Solutions that require readline (including the Queue based ones) always block. It is difficult (impossible?) to kill the thread that executes readline. It only gets killed when the process that created it finishes, but not when the output-producing process is killed.
Mixing low-level fcntl with high-level readline calls may not work properly as anonnn has pointed out.
Using select.poll() is neat, but doesn't work on Windows according to python docs.
Using third-party libraries seems overkill for this task and adds additional dependencies.

Python Multiprocessing atexit Error "Error in atexit._run_exitfuncs"

I am trying to run a simple multi-process application in Python. The main thread spawns 1 to N processes and waits until they are all done processing. The processes each run an infinite loop, so they can potentially run forever without user interruption, so I put in some code to handle a KeyboardInterrupt:
#!/usr/bin/env python
import sys
import time
from multiprocessing import Process

def main():
    # Set up inputs..
    # Spawn processes
    Proc(1).start()
    Proc(2).start()

class Proc(Process):
    def __init__(self, procNum):
        self.id = procNum
        Process.__init__(self)

    def run(self):
        doneWork = False
        while True:
            try:
                # Do work...
                time.sleep(1)
                sys.stdout.write('.')
                if doneWork:
                    print "PROC#" + str(self.id) + " Done."
                    break
            except KeyboardInterrupt:
                print "User aborted."
                sys.exit()

# Main Entry
if __name__ == "__main__":
    main()
The problem is that when using CTRL-C to exit, I get an additional error even though the processes seem to exit immediately:
......User aborted.
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "C:\Python26\lib\atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "C:\Python26\lib\multiprocessing\util.py", line 281, in _exit_function
p.join()
File "C:\Python26\lib\multiprocessing\process.py", line 119, in join
res = self._popen.wait(timeout)
File "C:\Python26\lib\multiprocessing\forking.py", line 259, in wait
res = _subprocess.WaitForSingleObject(int(self._handle), msecs)
KeyboardInterrupt
Error in sys.exitfunc:
Traceback (most recent call last):
File "C:\Python26\lib\atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "C:\Python26\lib\multiprocessing\util.py", line 281, in _exit_function
p.join()
File "C:\Python26\lib\multiprocessing\process.py", line 119, in join
res = self._popen.wait(timeout)
File "C:\Python26\lib\multiprocessing\forking.py", line 259, in wait
res = _subprocess.WaitForSingleObject(int(self._handle), msecs)
KeyboardInterrupt
I am running Python 2.6 on Windows. If there is a better way to handle multiprocessing in Python, please let me know.
This is a very old question, but it seems like the accepted answer does not actually fix the problem.
The main issue is that you need to handle the keyboard interrupt in the parent process as well. In addition, in the while loop you just need to exit the loop; there's no need to call sys.exit().
I've tried to match the example in the original question as closely as possible. The doneWork code does nothing in the example, so I have removed it for clarity.
import sys
import time
from multiprocessing import Process

def main():
    # Set up inputs..
    # Spawn processes
    try:
        processes = [Proc(1), Proc(2)]
        [p.start() for p in processes]
        [p.join() for p in processes]
    except KeyboardInterrupt:
        pass

class Proc(Process):
    def __init__(self, procNum):
        self.id = procNum
        Process.__init__(self)

    def run(self):
        while True:
            try:
                # Do work...
                time.sleep(1)
                sys.stdout.write('.')
            except KeyboardInterrupt:
                print("User aborted.")
                break

# Main Entry
if __name__ == "__main__":
    main()
Rather than just forcing sys.exit(), you want to send a signal to your threads to tell them to stop. Look into using signal handlers and threads in Python.
You could potentially do this by changing your while True: loop to be while keep_processing:, where keep_processing is some sort of global variable that gets set on the KeyboardInterrupt exception. I don't think this is good practice though.
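A sketch of that flag idea, using a multiprocessing.Event in place of a bare global (my substitution, since a plain global would not be shared across processes on Windows):
import sys
import time
from multiprocessing import Process, Event

class Proc(Process):
    def __init__(self, procNum, stop_event):
        Process.__init__(self)
        self.id = procNum
        self.stop_event = stop_event

    def run(self):
        while not self.stop_event.is_set():
            try:
                time.sleep(1)
                sys.stdout.write('.')
            except KeyboardInterrupt:
                break              # the children see Ctrl-C too; just leave the loop

if __name__ == "__main__":
    stop_event = Event()
    procs = [Proc(1, stop_event), Proc(2, stop_event)]
    for p in procs:
        p.start()
    try:
        for p in procs:
            p.join()
    except KeyboardInterrupt:
        stop_event.set()           # tell the workers to finish their loops
        for p in procs:
            p.join()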
