I start a bunch of threads working on a queue, and I want to kill them when I send SIGINT (Ctrl+C). What is the best way to handle this?
import Queue

targets = Queue.Queue()
threads_num = 10
threads = []
for i in range(threads_num):
    t = MyThread()
    t.setDaemon(True)
    threads.append(t)
    t.start()
targets.join()
If you are not interested in letting the other threads shut down gracefully, simply start them in daemon mode and wrap the join of the queue in a terminator thread.
That way, you can make use of the join method of the thread -- which supports a timeout and does not block exceptions like KeyboardInterrupt -- instead of having to wait on the queue's join method.
In other words, do something like this:
from threading import Thread

term = Thread(target=someQueueVar.join)
term.daemon = True
term.start()

while term.isAlive():
    term.join(3600)
Now, Ctrl+C will terminate the MainThread whereupon the Python Interpreter hard-kills all threads marked as "daemons". Do note that this means that you have to set "Thread.daemon" for all the other threads or shut them down gracefully by catching the correct exception (KeyboardInterrupt or SystemExit) and doing whatever needs to be done for them to quit.
Do also note that you absolutely need to pass a number to term.join(), as otherwise it, too, will ignore all exceptions. You can select an arbitrarily high number, though.
Isn't Ctrl+C SIGINT?
Anyway, you can install a handler for the appropriate signal, and in the handler:
set a global flag that instructs the workers to exit, and make sure they check it periodically
or put 10 shutdown tokens on the queue, and have the workers exit when they pop this magic token
or set a flag which instructs the main thread to push those tokens, make sure the main thread checks that flag
etc. Mostly it depends on the structure of the application you're interrupting.
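For example, here is a minimal sketch of the handler-plus-tokens approach (Python 3; the worker count, work_queue, and the print standing in for real work are made up for illustration):

import queue
import signal
import threading

NUM_WORKERS = 10
SENTINEL = object()          # the magic shutdown token
work_queue = queue.Queue()

def worker():
    while True:
        item = work_queue.get()
        if item is SENTINEL:
            return           # exit when the magic token is popped
        print('processing', item)   # real work goes here

def on_sigint(signum, frame):
    # Handlers run in the main thread; just enqueue one token per worker.
    for _ in range(NUM_WORKERS):
        work_queue.put(SENTINEL)

signal.signal(signal.SIGINT, on_sigint)
threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # keep the main thread around so the handler can run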
One way to do it is to install a signal handler for SIGTERM that directly calls os._exit(signal.SIGTERM). However unless you specify the optional timeout argument to Queue.get the signal handler function will not run until after the get method returns. (That's completely undocumented; I discovered that on my own.) So you can specify sys.maxint as the timeout and put your Queue.get call in a retry loop for purity to get around that.
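A sketch of that workaround (Python 3 spellings: queue.Queue and sys.maxsize in place of Queue and sys.maxint):

import os
import queue
import signal
import sys

def hard_exit(signum, frame):
    os._exit(signal.SIGTERM)  # immediate exit; no cleanup, no blocked get()

signal.signal(signal.SIGTERM, hard_exit)

def interruptible_get(q):
    # A huge timeout plus a retry loop stands in for a plain blocking
    # get(), so the signal handler gets a chance to run.
    while True:
        try:
            return q.get(True, sys.maxsize)
        except queue.Empty:
            continue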
Why don't you set timeouts for the operations on the queue? Then your threads can regularly check whether they have to finish by checking whether an Event is set.
This is how I tackled this.
import logging
import queue
import threading

class Worker(threading.Thread):
    def __init__(self, task_queue):
        super().__init__()
        self.task_queue = task_queue
        self.shutdown_flag = threading.Event()

    def run(self):
        logging.info('Worker started')
        while not self.shutdown_flag.is_set():
            try:
                task = self.get_task_from_queue()
            except queue.Empty:
                continue
            self.process_task(task)

    def get_task_from_queue(self) -> Task:
        return self.task_queue.get(block=True, timeout=10)

    def shutdown(self):
        logging.info('Shutdown received')
        self.shutdown_flag.set()
Upon receiving a signal, the main thread sets the shutdown event on the workers. The workers wait on a blocking queue, but check every 10 seconds whether they have received a shutdown signal.
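The main-thread side might look like this (a sketch; the choice of signals, the worker count, and task_queue being the queue the workers consume are assumptions):

import signal

workers = [Worker(task_queue) for _ in range(10)]
for w in workers:
    w.start()

def handle_signal(signum, frame):
    for w in workers:
        w.shutdown()  # sets each worker's shutdown_flag

signal.signal(signal.SIGINT, handle_signal)
signal.signal(signal.SIGTERM, handle_signal)

for w in workers:
    while w.is_alive():
        w.join(timeout=1)  # a timeout keeps the main thread responsive to signals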
I managed to solve the problem by emptying the queue on KeyboardInterrupt and letting the threads gracefully stop themselves.
I don't know if it's the best way to handle this, but it's simple and quite clean.
targets = Queue.Queue()
threads_num = 10
threads = []

for i in range(threads_num):
    t = MyThread()
    t.setDaemon(True)
    threads.append(t)
    t.start()

while True:
    try:
        # If the queue is empty, exit the loop
        if targets.empty():
            break
    # KeyboardInterrupt handler
    except KeyboardInterrupt:
        print "[X] Interrupt! Killing threads..."
        # Substitute the old queue with a new empty one and exit the loop
        targets = Queue.Queue()
        break

# Join every thread on the queue normally
targets.join()
Related
I have the following thread, where the queue q is empty most of the time:
def run(self):
    while True:
        if not self.exit_flag:
            items = self.q.get()
            self.work_it(items)
        else:
            return 0
If the exit flag is set, it's not likely to exit immediately, because it's probably currently blocking at
items = self.q.get()
If I put a timeout on it
items = self.q.get(True, 0.1)
it will often raise the Empty exception, since the queue is empty more often than not, using more resources than I'd like.
If I do busy waiting like
def run(self):
    while True:
        if not self.exit_flag:
            if not self.q.empty():
                items = self.q.get()
                self.work_it(items)
            time.sleep(0.1)
        else:
            return 0
Then I'm busy waiting instead of using the blocking feature of Queue.get(), which seems ugly. It seems like I'm missing the elegant solution to this problem. Is there one, or should I just use the busy-waiting solution?
Instead of using an exit_flag, consider putting a special "shutdown" indicator in the queue. When the worker dequeues the shutdown indicator, have it recognize it and shut down instead of continuing.
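Applied to the run() above, that might look like this (a sketch; SHUTDOWN is a made-up sentinel and work_it stands in for the real processing):

SHUTDOWN = object()  # unique sentinel; no real item can be identical to it

def run(self):
    while True:
        items = self.q.get()   # plain blocking get, no timeout, no busy wait
        if items is SHUTDOWN:
            return 0           # unblocks as soon as the sentinel is enqueued
        self.work_it(items)

# Elsewhere, instead of setting exit_flag:
# self.q.put(SHUTDOWN)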
One approach with no third-party dependencies is to use the consumer/producer pattern, where you have your code run in its own thread so it's not blocking.
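A bare-bones sketch of that pattern (standard library only; the callable job payload is made up for illustration):

import queue
import threading

jobs = queue.Queue()

def consumer():
    while True:
        job = jobs.get()  # blocks in the worker thread, not in your main code
        job()             # run the unit of work
        jobs.task_done()

threading.Thread(target=consumer, daemon=True).start()

# The producer side stays responsive; it just enqueues callables:
jobs.put(lambda: print("processed in the background"))
jobs.join()  # wait for everything queued so far to finish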
Python's Queue has a join() method that will block until task_done() has been called on all the items that have been taken from the queue.
Is there a way to periodically check for this condition, or receive an event when it happens, so that you can continue to do other things in the meantime? You can, of course, check if the queue is empty, but that doesn't tell you if the count of unfinished tasks is actually zero.
The Python Queue itself does not support this, so you could try the following:
from threading import Thread

class QueueChecker(Thread):
    def __init__(self, q):
        Thread.__init__(self)
        self.q = q
    def run(self):
        self.q.join()

q_manager_thread = QueueChecker(my_q)
q_manager_thread.start()

while q_manager_thread.is_alive():
    # do other things here; when the loop exits, the tasks are done,
    # because the thread will have returned from blocking on q.join()
    # and exited its run method
    pass

q_manager_thread.join()  # to clean up the thread
A while loop on the thread.is_alive() bit might not be exactly what you want, but at least you can see how to check the status of q.join asynchronously now.
I have a process which spawns two types of thread classes. One thread class is responsible for consuming a job_queue (100 threads of this class are usually running). The second is a killer thread. I am using a result_done flag which is set from thread 2, and the issue is that my type-1 threads wait for X seconds and only then check whether the result_done flag is set.
def run(self):
    while True:
        try:
            val = self.job_queue.get(True, self.maxtimeout)
        except Queue.Empty:
            pass
        if self.result_done.is_set():
            return
Now, if maxtimeout is set to 500 seconds and I set the result_done flag from another thread, this thread will wait up to 500 seconds before exiting (if there's no data in the queue).
What I want to achieve is that all the threads die gracefully, along with the current process, properly terminating the db, websocket, http connections etc., as soon as the result_done event is set from any of the threads of the process.
I am using python multiprocess library to create the process which spawns these threads.
Update: All threads are daemon=True threads.
To avoid waiting out maxtimeout, you could use the Event.wait() method:
def run(self):
    while not self.result_done.is_set():
        try:
            val = self.job_queue.get_nowait()
        except Empty:
            if self.result_done.wait(1):  # wait a second to avoid a busy loop
                return
Event.wait(timeout) returns without waiting the full timeout if the event is set during the call.
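A quick standalone demonstration of that early return:

import threading
import time

done = threading.Event()
threading.Timer(0.5, done.set).start()  # set the event after half a second

start = time.time()
done.wait(60)                 # asks for up to 60 seconds...
print(time.time() - start)    # ...but prints roughly 0.5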
I have the code below, but it lives on after the queue is empty. Any insights?
def processor():
    while(1>0):
        if queue.empty() == True:
            print "the Queue is empty!"
            break
        source = queue.get()
        page = urllib2.urlopen(source)
        print page

def main():
    for i in range(threads):
        th = Thread(target=processor)
        th.setDaemon(True)
        th.start()
    queue.join()
It prints "the Queue is empty!" as many times as I have threads and then just stands there doing nothing.
You need to call queue.task_done() after printing the page, otherwise join() will block. Each thread, after using get() must call task_done().
See documentation for queue
This part:
while(1>0):
    if queue.empty() == True:
        print "the Queue is empty!"
        break
The above is just plain wrong: queue.get() is blocking, so there is absolutely no reason to have a busy loop. It should be deleted.
Your code should look something like this.
def processor():
    # Loop forever; each daemon thread keeps serving items, and
    # task_done() is what lets queue.join() eventually return.
    while True:
        source = queue.get()
        page = urllib2.urlopen(source)
        print page
        queue.task_done()

def main():
    for i in range(threads):
        th = Thread(target=processor)
        th.setDaemon(True)
        th.start()
    for source in all_sources:
        queue.put(source)
    queue.join()
It's not the cleanest way to exit, but it will work. Since the processor threads are set to be daemons, the whole process will exit as soon as main is done.
As Max said, we'd need a complete example to help diagnose the behavior you're seeing, but from the documentation:
Python’s Thread class supports a subset of the behavior of Java’s Thread class; currently, there are no priorities, no thread groups, and threads cannot be destroyed, stopped, suspended, resumed, or interrupted.
It stops being alive when its run() method terminates – either normally, or by raising an unhandled exception. The is_alive() method tests whether the thread is alive.
http://docs.python.org/library/threading.html
The lower level thread module does allow you to manually call exit(). Without a more complete example, I don't know if that's what you need in this case, but I suspect not as Thread objects should automatically end when run() is complete.
http://docs.python.org/library/thread.html
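For reference, thread.exit() simply raises SystemExit in the calling thread, which ends that thread quietly; a sketch (the module is spelled _thread in Python 3):

import _thread  # "thread" in Python 2
import threading

def worker():
    _thread.exit()  # raises SystemExit; silently terminates this thread
    print("never reached")

t = threading.Thread(target=worker)
t.start()
t.join()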
I have a queue that always needs to be ready to process items when they are added to it. The function that runs on each item in the queue creates and starts a thread to execute the operation in the background, so the program can go do other things.
However, the function I am calling on each item in the queue simply starts the thread and then completes execution, regardless of whether or not the thread it started has completed. Because of this, the loop will move on to the next item in the queue before the program is done processing the last item.
Here is code to better demonstrate what I am trying to do:
queue = Queue.Queue()

def addTask():
    queue.put(SomeObject())

def worker():
    while True:
        try:
            # If an item is put onto the queue, immediately execute it (unless
            # an item on the queue is still being processed, in which case wait
            # for it to complete before moving on to the next item in the queue)
            item = queue.get()
            runTests(item)
            # I want to wait for 'runTests' to complete before moving past this point
        except Queue.Empty, err:
            # If the queue is empty, just keep running the loop until something
            # is put on top of it.
            pass

def runTests(args):
    op_thread = SomeThread(args)
    op_thread.start()
    # My problem is that once this last line 'op_thread.start()' starts the thread,
    # the 'runTests' function completes operation, but the operation executed
    # by that thread is not yet done executing because it is still running in
    # the background. I do not want the 'runTests' function to actually complete
    # execution until the operation in op_thread is done executing.
    """op_thread.join()"""
    # I tried putting this line after 'op_thread.start()', but that did not solve
    # anything. I have commented it out because it is not necessary to demonstrate
    # what I am trying to do, but I just wanted to show that I tried it.

# The worker thread is created after the definitions above so the names it
# references already exist.
t = threading.Thread(target=worker)
t.start()
Some notes:
This is all running in a PyGTK application. Once the 'SomeThread' operation is complete, it sends a callback to the GUI to display the results of the operation.
I do not know how much this affects the issue I am having, but I thought it might be important.
A fundamental issue with Python threads is that you can't just kill them - they have to agree to die.
What you should do is:
Implement the thread as a class
Add a threading.Event member which the join method clears and the thread's main loop occasionally checks. If it sees it's cleared, it returns. For this, override threading.Thread.join to clear the event and then call Thread.join on itself.
To allow (2), make the read from the Queue block with some small timeout. This way your thread's "response time" to the kill request will be the timeout, and on the other hand no CPU choking is done.
Here's some code from a socket client thread I have that has the same issue with blocking on a queue:
class SocketClientThread(threading.Thread):
    """ Implements the threading.Thread interface (start, join, etc.) and
        can be controlled via the cmd_q Queue attribute. Replies are placed in
        the reply_q Queue attribute.
    """
    def __init__(self, cmd_q=Queue.Queue(), reply_q=Queue.Queue()):
        super(SocketClientThread, self).__init__()
        self.cmd_q = cmd_q
        self.reply_q = reply_q
        self.alive = threading.Event()
        self.alive.set()
        self.socket = None

        self.handlers = {
            ClientCommand.CONNECT: self._handle_CONNECT,
            ClientCommand.CLOSE: self._handle_CLOSE,
            ClientCommand.SEND: self._handle_SEND,
            ClientCommand.RECEIVE: self._handle_RECEIVE,
        }

    def run(self):
        while self.alive.isSet():
            try:
                # Queue.get with timeout to allow checking self.alive
                cmd = self.cmd_q.get(True, 0.1)
                self.handlers[cmd.type](cmd)
            except Queue.Empty as e:
                continue

    def join(self, timeout=None):
        self.alive.clear()
        threading.Thread.join(self, timeout)
Note self.alive and the loop in run.