Python multiprocessing - AssertionError: can only join a child process
I'm taking my first foray into the Python multiprocessing module and I'm running into some problems. I'm very familiar with the threading module, but I need to make sure the processes I'm executing are running in parallel.
Here's an outline of what I'm trying to do. Please ignore things like undeclared variables/functions because I can't paste my code in full.
import multiprocessing
import time

def wrap_func_to_run(host, args, output):
    output.append(do_something(host, args))
    return

def func_to_run(host, args):
    return do_something(host, args)

def do_work(server, client, server_args, client_args):
    server_output = func_to_run(server, server_args)
    client_output = func_to_run(client, client_args)
    #handle this output and return a result
    return result

def run_server_client(server, client, server_args, client_args, server_output, client_output):
    server_process = multiprocessing.Process(target=wrap_func_to_run, args=(server, server_args, server_output))
    server_process.start()
    client_process = multiprocessing.Process(target=wrap_func_to_run, args=(client, client_args, client_output))
    client_process.start()
    server_process.join()
    client_process.join()
    #handle the output and return some result

def run_in_parallel(server, client):
    #set up commands for first process
    server_output = client_output = []
    server_cmd = "cmd"
    client_cmd = "cmd"
    process_one = multiprocessing.Process(target=run_server_client, args=(server, client, server_cmd, client_cmd, server_output, client_output))
    process_one.start()
    #set up second process to run - but this one can run here
    result = do_work(server, client, "some server args", "some client args")
    process_one.join()
    #use outputs above and the result to determine result
    return final_result

def main():
    #grab client
    client = client()
    #grab server
    server = server()
    return run_in_parallel(server, client)

if __name__ == "__main__":
    main()
Here's the error I'm getting:
Error in sys.exitfunc:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/lib64/python2.7/multiprocessing/util.py", line 319, in _exit_function
    p.join()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 143, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
I've tried a lot of different things to fix this, but my feeling is that there's something wrong with the way I'm using this module.
EDIT:
So I created a file that reproduces this by simulating the client/server and the work they do. I also missed an important point, which was that I was running this on Unix. Another important bit of information is that do_work in my actual case involves using os.fork(). I was unable to reproduce the error without also using os.fork(), so I'm assuming the problem is there. In my real-world case, that part of the code was not mine, so I was treating it like a black box (likely a mistake on my part). Anyway, here's the code to reproduce it:
#!/usr/bin/python

import multiprocessing
import time
import os
import signal
import sys

class Host():
    def __init__(self):
        self.name = "host"

    def work(self):
        #override - use to simulate work
        pass

class Server(Host):
    def __init__(self):
        self.name = "server"

    def work(self):
        x = 0
        for i in range(10000):
            x += 1
        print x
        time.sleep(1)

class Client(Host):
    def __init__(self):
        self.name = "client"

    def work(self):
        x = 0
        for i in range(5000):
            x += 1
        print x
        time.sleep(1)

def func_to_run(host, args):
    print host.name + " is working"
    host.work()
    print host.name + ": " + args
    return "done"

def do_work(server, client, server_args, client_args):
    print "in do_work"
    server_output = client_output = ""
    child_pid = os.fork()
    if child_pid == 0:
        server_output = func_to_run(server, server_args)
        sys.exit(server_output)
    time.sleep(1)
    client_output = func_to_run(client, client_args)
    # kill and wait for server to finish
    os.kill(child_pid, signal.SIGTERM)
    (pid, status) = os.waitpid(child_pid, 0)
    return (server_output == "done" and client_output == "done")

def run_server_client(server, client, server_args, client_args):
    server_process = multiprocessing.Process(target=func_to_run, args=(server, server_args))
    print "Starting server process"
    server_process.start()
    client_process = multiprocessing.Process(target=func_to_run, args=(client, client_args))
    print "Starting client process"
    client_process.start()
    print "joining processes"
    server_process.join()
    client_process.join()
    print "processes joined and done"

def run_in_parallel(server, client):
    #set up commands for first process
    server_cmd = "server command for run_server_client"
    client_cmd = "client command for run_server_client"
    process_one = multiprocessing.Process(target=run_server_client, args=(server, client, server_cmd, client_cmd))
    print "Starting process one"
    process_one.start()
    #set up second process to run - but this one can run here
    print "About to do work"
    result = do_work(server, client, "server args from do work", "client args from do work")
    print "Joining process one"
    process_one.join()
    #use outputs above and the result to determine result
    print "Process one has joined"
    return result

def main():
    #grab client
    client = Client()
    #grab server
    server = Server()
    return run_in_parallel(server, client)

if __name__ == "__main__":
    main()
If I remove the use of os.fork() in do_work, I don't get the error and the code behaves as I would have expected (except for the passing of outputs, which I've accepted as my mistake/misunderstanding). I can change the old code to not use os.fork(), but I'd also like to know why it caused this problem and whether there's a workable solution.
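For completeness, here's an even smaller sketch (my assumption, based on the traceback; same Unix/Python 2.7 setup) that seems to trigger the same error without any of the client/server scaffolding: a plain os.fork() child inherits multiprocessing's atexit handler, and that handler tries to join a Process that belongs to the original parent.
import multiprocessing
import os
import sys
import time

def noop():
    time.sleep(1)

p = multiprocessing.Process(target=noop)
p.start()

child = os.fork()
if child == 0:
    # Exiting via sys.exit() runs the atexit handlers the child inherited,
    # including multiprocessing's _exit_function, which calls p.join() even
    # though p was started by the original parent, not by this child.
    sys.exit(0)

os.waitpid(child, 0)
p.join()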
EDIT 2:
I started working on a solution that omits os.fork() before the accepted answer was posted. Here's what I have, with some tweaking of the amount of simulated work that can be done:
#!/usr/bin/python

import multiprocessing
import time
import os
import signal
import sys
from Queue import Empty

class Host():
    def __init__(self):
        self.name = "host"

    def work(self, w):
        #override - use to simulate work
        pass

class Server(Host):
    def __init__(self):
        self.name = "server"

    def work(self, w):
        x = 0
        for i in range(w):
            x += 1
        print x
        time.sleep(1)

class Client(Host):
    def __init__(self):
        self.name = "client"

    def work(self, w):
        x = 0
        for i in range(w):
            x += 1
        print x
        time.sleep(1)

def func_to_run(host, args, w, q):
    print host.name + " is working"
    host.work(w)
    print host.name + ": " + args
    q.put("ZERO")
    return "done"

def handle_queue(queue):
    done = False
    results = []
    return_val = 0
    while not done:
        #try to grab item from Queue
        tr = None
        try:
            tr = queue.get_nowait()
            print "found element in queue"
            print tr
        except Empty:
            done = True
        if tr is not None:
            results.append(tr)
    for el in results:
        if el != "ZERO":
            return_val = 1
    return return_val

def do_work(server, client, server_args, client_args):
    print "in do_work"
    server_output = client_output = ""
    child_pid = os.fork()
    if child_pid == 0:
        server_output = func_to_run(server, server_args)
        sys.exit(server_output)
    time.sleep(1)
    client_output = func_to_run(client, client_args)
    # kill and wait for server to finish
    os.kill(child_pid, signal.SIGTERM)
    (pid, status) = os.waitpid(child_pid, 0)
    return (server_output == "done" and client_output == "done")

def run_server_client(server, client, server_args, client_args, w, mq):
    local_queue = multiprocessing.Queue()
    server_process = multiprocessing.Process(target=func_to_run, args=(server, server_args, w, local_queue))
    print "Starting server process"
    server_process.start()
    client_process = multiprocessing.Process(target=func_to_run, args=(client, client_args, w, local_queue))
    print "Starting client process"
    client_process.start()
    print "joining processes"
    server_process.join()
    client_process.join()
    print "processes joined and done"
    if handle_queue(local_queue) == 0:
        mq.put("ZERO")

def run_in_parallel(server, client):
    #set up commands for first process
    master_queue = multiprocessing.Queue()
    server_cmd = "server command for run_server_client"
    client_cmd = "client command for run_server_client"
    process_one = multiprocessing.Process(target=run_server_client, args=(server, client, server_cmd, client_cmd, 400000000, master_queue))
    print "Starting process one"
    process_one.start()
    #set up second process to run - but this one can run here
    print "About to do work"
    #result = do_work(server, client, "server args from do work", "client args from do work")
    run_server_client(server, client, "server args from do work", "client args from do work", 5000, master_queue)
    print "Joining process one"
    process_one.join()
    #use outputs above and the result to determine result
    print "Process one has joined"
    return_val = handle_queue(master_queue)
    print return_val
    return return_val

def main():
    #grab client
    client = Client()
    #grab server
    server = Server()
    val = run_in_parallel(server, client)
    if val:
        print "failed"
    else:
        print "passed"
    return val

if __name__ == "__main__":
    main()
This code has some tweaked printouts just to see exactly what is happening. I used a multiprocessing.Queue to store and share outputs across the processes and back into my main thread to be handled. I think this solves the Python portion of my problem, but there are still some issues in the code I'm working on. The only other thing I can say is that the equivalent of func_to_run involves sending a command over SSH and grabbing any stderr along with the output. For some reason this works perfectly fine for a command that has a low execution time, but not well for a command with a much larger execution time/output. I tried simulating this with the drastically different work values in my code here, but haven't been able to reproduce similar results.
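As an aside, a multiprocessing.Pool version of the same idea would hand the worker return values straight back and avoid the hand-rolled handle_queue bookkeeping. A rough sketch (func_to_run_noqueue is a hypothetical queue-less variant of func_to_run above, not code I've run against the real library):
import multiprocessing

def func_to_run_noqueue(host, args, w):
    print host.name + " is working"
    host.work(w)
    print host.name + ": " + args
    return "done"

def run_server_client_pool(server, client, server_args, client_args, w):
    pool = multiprocessing.Pool(processes=2)
    server_result = pool.apply_async(func_to_run_noqueue, (server, server_args, w))
    client_result = pool.apply_async(func_to_run_noqueue, (client, client_args, w))
    pool.close()
    pool.join()
    # get() re-raises any exception from the worker, so failures show up here
    return server_result.get() == "done" and client_result.get() == "done"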
EDIT 3
The library code I'm using (again, not mine) uses Popen.wait() for the SSH commands, and I just read this in the documentation:
Popen.wait()
Wait for child process to terminate. Set and return returncode attribute.
Warning: This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.
I adjusted the code to not buffer and just print output as it is received, and now everything works.
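Concretely, the fix boils down to one of these two patterns (a sketch only; the ssh command here is a placeholder for the library's real one):
import subprocess

cmd = ["ssh", "user@host", "some-long-running-command"]  # placeholder command

# Option 1: communicate() drains stdout/stderr while waiting, so the OS pipe
# buffer can never fill up and block the child.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()
print out

# Option 2: stream the output as it arrives instead of buffering it all.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in iter(proc.stdout.readline, ""):
    print line.rstrip()
proc.wait()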
"I can change the old code to not use os.fork() but I'd also like to know why this caused this problem and if there's a workable solution."
The key to understanding the problem is knowing exactly what fork() does. The CPython docs state "Fork a child process.", but this presumes you understand the C library call fork().
Here's what glibc's manpage says about it:
fork() creates a new process by duplicating the calling process. The new process, referred to as the child, is an exact duplicate of the calling process, referred to as the parent, except for the following points: ...
It's basically as if you took your program and made a copy of its program state (heap, stack, instruction pointer, etc.) with small differences and let it execute independently of the original. When this child process exits normally, it will use exit(), and that will trigger the atexit() handlers registered by the multiprocessing module.
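A tiny sketch (Unix only) of that last point: any atexit handler registered before the fork(), which is exactly what importing multiprocessing does, is duplicated into the child and runs when the child exits normally, but not when it calls os._exit():
import atexit
import os
import sys

def handler():
    sys.stdout.write("atexit handler running in pid %d\n" % os.getpid())

atexit.register(handler)

pid = os.fork()
if pid == 0:
    sys.exit(0)    # normal exit: the inherited handler also runs in the child
    # os._exit(0)  # would skip atexit handlers entirely
os.waitpid(pid, 0)
# the parent runs the handler once more when it exits here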
What can you do to avoid it?
omit os.fork(): use multiprocessing instead, like you are exploring now
probably effective: import multiprocessing after executing fork(), only in the child or parent as necessary.
use os._exit() in the child (see the sketch below); the CPython docs state, "Note: The standard way to exit is sys.exit(n). _exit() should normally only be used in the child process after a fork()."
https://docs.python.org/2/library/os.html#os._exit
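For the last option, a minimal sketch of what that could look like in the do_work() from the question (func_to_run is the questioner's function; as in the original, the forked child's result is not passed back to the parent):
import os
import signal
import time

def do_work(server, client, server_args, client_args):
    child_pid = os.fork()
    if child_pid == 0:
        func_to_run(server, server_args)
        os._exit(0)  # bypass atexit handlers (including multiprocessing's) in the forked child
    time.sleep(1)
    client_output = func_to_run(client, client_args)
    # kill and wait for the forked server child to finish
    os.kill(child_pid, signal.SIGTERM)
    os.waitpid(child_pid, 0)
    return client_output == "done"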
In addition to the excellent solution from Cain, if you're facing the same situation as I was, where you can't control how the subprocesses are created, you can try to unregister the atexit function in your subprocesses to get rid of these messages (note that atexit.unregister() is available on Python 3 only):
import atexit
from multiprocessing.util import _exit_function
atexit.unregister(_exit_function)
ATTENTION: This may lead to leakage. For instance, if your subprocesses have their own children, they won't be cleaned up. So clarify your situation and test thoroughly afterwards.
It seems to me that you are adding one layer of processes too many. I would not spawn a process from run_in_parallel, but simply call run_server_client with the proper arguments, because it will spawn its own processes inside.
Related
Tracking application launched with a python file
How to Interrupt/Stop/End a hanging multi-threaded python program
How to "listen" to a multiprocessing queue in Python
Multiprocessing python-server creates too many temp-directories
Call to POpen inside thread that is running along side WSGIREF.simple_server causes deadlock on os.fork