I have a queue that always needs to be ready to process items when they are added to it. The function that runs on each item in the queue creates and starts a thread to execute the operation in the background so the program can go do other things.
However, the function I am calling on each item in the queue simply starts the thread and then completes execution, regardless of whether or not the thread it started completed. Because of this, the loop will move on to the next item in the queue before the program is done processing the last item.
Here is code to better demonstrate what I am trying to do:
queue = Queue.Queue()
t = threading.Thread(target=worker)
t.start()
def addTask():
    queue.put(SomeObject())
def worker():
    while True:
        try:
            # If an item is put onto the queue, immediately execute it (unless
            # an item on the queue is still being processed, in which case wait
            # for it to complete before moving on to the next item in the queue)
            item = queue.get()
            runTests(item)
            # I want to wait for 'runTests' to complete before moving past this point
        except Queue.Empty, err:
            # If the queue is empty, just keep running the loop until something
            # is put on top of it.
            pass
def runTests(args):
    op_thread = SomeThread(args)
    op_thread.start()
    # My problem is once this last line 't.start()' starts the thread,
    # the 'runTests' function completes operation, but the operation executed
    # by some thread is not yet done executing because it is still running in
    # the background. I do not want the 'runTests' function to actually complete
    # execution until the operation in thread t is done executing.
    """t.join()"""
    # I tried putting this line after 't.start()', but that did not solve anything.
    # I have commented it out because it is not necessary to demonstrate what
    # I am trying to do, but I just wanted to show that I tried it.
Some notes:
This is all running in a PyGTK application. Once the 'SomeThread' operation is complete, it sends a callback to the GUI to display the results of the operation.
I do not know how much this affects the issue I am having, but I thought it might be important.
A fundamental issue with Python threads is that you can't just kill them - they have to agree to die.
What you should do is:
1. Implement the thread as a class.
2. Add a threading.Event member which the join method clears and the thread's main loop occasionally checks. If the loop sees that the event is cleared, it returns. To do this, override threading.Thread.join to clear the event and then call Thread.join on itself.
3. To allow (2), make the read from the Queue block with some small timeout. This way your thread's "response time" to the kill request will be the timeout, and on the other hand the CPU is not choked by busy-waiting.
Here's some code from a socket client thread I have that has the same issue with blocking on a queue:
class SocketClientThread(threading.Thread):
    """ Implements the threading.Thread interface (start, join, etc.) and
        can be controlled via the cmd_q Queue attribute. Replies are placed in
        the reply_q Queue attribute.
    """
    def __init__(self, cmd_q=Queue.Queue(), reply_q=Queue.Queue()):
        super(SocketClientThread, self).__init__()
        self.cmd_q = cmd_q
        self.reply_q = reply_q
        self.alive = threading.Event()
        self.alive.set()
        self.socket = None
        self.handlers = {
            ClientCommand.CONNECT: self._handle_CONNECT,
            ClientCommand.CLOSE: self._handle_CLOSE,
            ClientCommand.SEND: self._handle_SEND,
            ClientCommand.RECEIVE: self._handle_RECEIVE,
        }
    def run(self):
        while self.alive.isSet():
            try:
                # Queue.get with timeout to allow checking self.alive
                cmd = self.cmd_q.get(True, 0.1)
                self.handlers[cmd.type](cmd)
            except Queue.Empty as e:
                continue
    def join(self, timeout=None):
        self.alive.clear()
        threading.Thread.join(self, timeout)
Note self.alive and the loop in run.
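For readers who want something they can run as-is, here is a minimal Python 3 sketch of the same pattern. The class name StoppableWorker, the doubling "work", and the results list are placeholders invented for illustration; they are not part of the socket client above.

```python
import queue
import threading


class StoppableWorker(threading.Thread):
    """Hypothetical worker: handles queue items until join() is called."""

    def __init__(self, tasks, results):
        super().__init__()
        self.tasks = tasks
        self.results = results
        self.alive = threading.Event()
        self.alive.set()

    def run(self):
        while self.alive.is_set():
            try:
                # Short timeout so the loop re-checks self.alive regularly.
                item = self.tasks.get(timeout=0.1)
            except queue.Empty:
                continue
            self.results.append(item * 2)  # stand-in for the real work
            self.tasks.task_done()

    def join(self, timeout=None):
        # Ask the loop to stop, then wait for the thread to finish.
        self.alive.clear()
        super().join(timeout)


tasks = queue.Queue()
results = []
worker = StoppableWorker(tasks, results)
worker.start()
for n in (1, 2, 3):
    tasks.put(n)
tasks.join()    # block until every queued item has been processed
worker.join()   # clears the flag; the loop exits within one timeout
```

Because a single worker takes items one at a time, each item is fully processed before the next one is fetched, and the thread stops at most one timeout after join() is called.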
I want to create a thread and control it with an event object. Specifically, I want the thread to run whenever the event object is set and then wait again, repeatedly.
Below is a rough sketch of the logic I have in mind.
import threading
import time
e = threading.Event()
def start_operation():
    e.wait()
    while e.is_set():
        print('STARTING TASK')
        e.clear()
t1 = threading.Thread(target=start_operation)
t1.start()
e.set() # first set
e.set() # second set
I expected t1 to run once the first set has been issued and to stop itself (due to e.clear inside it), and then to run again after the second set has been issued. So, according to what I expected, it should print 'STARTING TASK' twice. But it prints it only once, and I don't understand why. How should I change the code to make it run the while loop again whenever the event object is set?
The first problem is that once you exit a while loop, you've exited it. Changing the predicate back won't change anything. Forget about events for a second and just look at this code:
i = 0
while i == 0:
i = 1
It obviously doesn't matter if you set i = 0 again later, right? You've already left the while loop, and the whole function. And your code is doing exactly the same thing.
You can fix that problem by just adding another while loop around the whole thing:
def start_operation():
    while True:
        e.wait()
        while e.is_set():
            print('STARTING TASK')
            e.clear()
However, that still isn't going to work—except maybe occasionally, by accident.
Event.set doesn't block; it just sets the event immediately, even if it's already set. So, the most likely flow of control here is:
background thread hits e.wait() and blocks.
main thread hits e.set() and sets event.
main thread hits e.set() and sets event again, with no effect.
background thread wakes up, does the loop once, calls e.clear() at the end.
background thread waits forever on e.wait().
(The fact that there's no way to avoid missed signals with events is effectively the reason conditions were invented, and that anything newer than Win32 and Python doesn't bother with events… But a condition isn't sufficient here either.)
If you want the main thread to block until the event is clear, and only then set it again, you can't do that. You need something extra, like a second event, which the main thread can wait on and the background thread can set.
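A minimal sketch of that two-event handshake, under some assumptions: the names start and idle, the runs list, and the fixed number of rounds are made up for illustration, and the rounds are bounded only so the example terminates.

```python
import threading

start = threading.Event()   # main -> worker: "run the task"
idle = threading.Event()    # worker -> main: "ready for the next request"
runs = []                   # records each run so the result can be inspected


def start_operation(rounds):
    for _ in range(rounds):
        start.wait()        # block until main requests a run
        start.clear()       # consume the request before signalling idle
        runs.append('STARTING TASK')
        idle.set()          # tell main it is safe to set `start` again


t1 = threading.Thread(target=start_operation, args=(2,))
idle.set()                  # the worker starts out idle
t1.start()

for _ in range(2):
    idle.wait()             # wait until the previous run has finished
    idle.clear()
    start.set()             # request exactly one run

t1.join()
```

The worker clears start before it sets idle, and main only sets start after idle has been set, so no request can be lost between the two threads.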
But if you want to keep track of multiple set calls, without missing any, you need to use a different sync mechanism. A queue.Queue may be overkill here, but it's dead simple to do in Python, so let's just use that. Of course you don't actually have any values to put on the queue, but that's OK; you can just stick a dummy value there:
import queue
import threading
q = queue.Queue()
def start_operation():
    while True:
        _ = q.get()
        print('STARTING TASK')
t1 = threading.Thread(target=start_operation)
t1.start()
q.put(None)
q.put(None)
And if you later want to add a way to shut down the background thread, just change it to put meaningful values on the queue, with a true value meaning "stop":
import queue
import threading
q = queue.Queue()
def start_operation():
    while True:
        if q.get():
            return
        print('STARTING TASK')
t1 = threading.Thread(target=start_operation)
t1.start()
q.put(False)
q.put(False)
q.put(True)
I have a specific problem.
The main part of the program starts by creating a Process that runs a D-Bus loop, where I listen for signals.
I store the content of the signals in queues. The next part of main is a thread pool.
When a thread takes an item from the queue, it uses a specific function (detection) to handle the request, chosen by the content of the item. (The function reads data from a database and performs operations on it depending on the request.)
Every thread in the thread pool starts one more thread, which should handle signals (current status and interrupt).
For example: I receive a signal that means I have to process some numbers. A thread from the thread pool takes this item from the queue and starts the function that processes the numbers, which can take a long time. Some time later I receive a signal asking for the current status, and I need to send the current status of the detection; that is why I use threads (for shared memory). I can also receive an interrupt signal from D-Bus ("it is taking too long, so stop this detection and be free for another request"). The interrupt is the main problem...
So my main questions are:
Is there any way to raise an exception on the interrupt signal and stop the function (detection)? (I found a solution, but only for catching the exception in main. I need to catch it in the thread that belongs to the thread pool and raise it from the thread that this pool thread started.)
The second question is about the GIL: does my signal-receiving thread receive all signals? I think it doesn't. (Yes, I use threads_init().)
program:
SERVICE = multiprocessing.Process(target=dbus_signal_receiver, args=(...))
SERVICE.daemon = True
SERVICE.start()
class worker(threading.Thread):
    def __init__(self,...):
        threading.Thread.__init__(self)
    def run(self):
        while True:
            #get item from queue
            s = threading.Thread(target=curr_and_interr_signal_handle, args=(ID of item from queue,...))
            s.daemon = True
            s.start()
            #start specific detection based on request
for i in range(number of threads):
    t = worker(...)
    t.daemon = True
    t.start()
I hoped something like this would work (but it doesn't):
...
class worker(threading.Thread):
    def __init__(self,...):
        threading.Thread.__init__(self)
    def run(self):
        while True:
            try:
                #get item from queue
                s = threading.Thread(target=curr_and_interr_signal_handle, args=(ID of item from queue,...))
                s.daemon = True
                s.start()
                #start specific detection based on request
            except raised_interrupt_exception:
                #continue - wait for another request from queue
...
Read about 18.8.1.2. Signals and threads
Python signal handlers are always executed in the main Python thread,
even if the signal was received in another thread.
This means that signals can’t be used as a means of inter-thread communication.
You can use the synchronization primitives from the threading module instead.
Besides, only the main thread is allowed to set a new signal handler.
Read about 17.1.7. Event Objects
This is one of the simplest mechanisms for communication between threads: one thread signals an event and other threads wait for it
It isn't clear why you need a thread inside a thread.
Why can't your worker thread handle the detection itself?
For instance, the following should do it:
def run(self):
    while self.running.is_set():
        #get item from queue
        #start specific detection based on request
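One possible way to make the detection interruptible with an Event is for the detection itself to check the event between small units of work and raise an exception that the worker catches. This is only a sketch under assumptions: DetectionInterrupted, the status dict, and the sleep standing in for real work are all hypothetical, and it only helps if the detection can be split into steps.

```python
import threading
import time


class DetectionInterrupted(Exception):
    """Hypothetical exception raised when a detection is cancelled."""


def detection(item, interrupt, status):
    # Long-running job split into small steps; each step checks the flag.
    for step in range(1000):
        if interrupt.is_set():
            raise DetectionInterrupted(item)
        status['progress'] = step   # shared state a status request could read
        time.sleep(0.01)            # stand-in for one unit of real work


def worker(item, interrupt, status, outcome):
    try:
        detection(item, interrupt, status)
        outcome.append('finished')
    except DetectionInterrupted:
        outcome.append('interrupted')  # free to take the next request


interrupt = threading.Event()
status = {'progress': 0}
outcome = []
t = threading.Thread(target=worker, args=('job-1', interrupt, status, outcome))
t.start()
time.sleep(0.05)    # let the detection make some progress
interrupt.set()     # what the interrupt-signal handler would do
t.join()
```

The exception is raised and caught inside the same worker thread; the other thread only sets the event, which is the part that works reliably across threads.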
Python's Queue has a join() method that will block until task_done() has been called on all the items that have been taken from the queue.
Is there a way to periodically check for this condition, or receive an event when it happens, so that you can continue to do other things in the meantime? You can, of course, check if the queue is empty, but that doesn't tell you if the count of unfinished tasks is actually zero.
The Python Queue itself does not support this, so you could try the following
from threading import Thread
class QueueChecker(Thread):
    def __init__(self, q):
        Thread.__init__(self)
        self.q = q
    def run(self):
        self.q.join()
q_manager_thread = QueueChecker(my_q)
q_manager_thread.start()
while q_manager_thread.is_alive():
    pass  # do other things
#when the loop exits the tasks are done
#because the thread will have returned
#from blocking on the q.join and exited
#its run method
q_manager_thread.join() #to cleanup the thread
A while loop on thread.is_alive() might not be exactly what you want, but at least you can now see how to check the status of q.join asynchronously.
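If an event is preferred over polling is_alive(), a variation is to let the helper thread set a threading.Event once q.join() returns. The sketch below is Python 3 and makes several assumptions for illustration: the names notify_when_done, consumer and done are invented, and the consumer just sleeps instead of doing real work.

```python
import queue
import threading
import time


def notify_when_done(q, done):
    q.join()      # blocks only this helper thread
    done.set()    # signal that every task has been marked done


def consumer(q):
    while True:
        q.get()
        time.sleep(0.01)   # stand-in for real work
        q.task_done()


my_q = queue.Queue()
for n in range(5):
    my_q.put(n)

done = threading.Event()
threading.Thread(target=consumer, args=(my_q,), daemon=True).start()
threading.Thread(target=notify_when_done, args=(my_q, done), daemon=True).start()

polls = 0
while not done.wait(timeout=0.005):
    polls += 1    # do other things between checks
```

done.wait(timeout) returns True as soon as the event is set, so the main thread can either poll with done.is_set() or sleep on the event with a timeout, as shown here.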
I start a bunch of threads working on a queue and I want to kill them when sending the SIGINT (Ctrl+C). What is the best way to handle this?
targets = Queue.Queue()
threads_num = 10
threads = []
for i in range(threads_num):
    t = MyThread()
    t.setDaemon(True)
    threads.append(t)
    t.start()
targets.join()
If you are not interested in letting the other threads shut down gracefully, simply start them in daemon mode and wrap the join of the queue in a terminator thread.
That way, you can make use of the join method of the thread -- which supports a timeout and does not block off exceptions -- instead of having to wait on the queue's join method.
In other words, do something like this:
term = Thread(target=someQueueVar.join)
term.daemon = True
term.start()
while term.is_alive():
    term.join(3600)
Now, Ctrl+C will terminate the MainThread whereupon the Python Interpreter hard-kills all threads marked as "daemons". Do note that this means that you have to set "Thread.daemon" for all the other threads or shut them down gracefully by catching the correct exception (KeyboardInterrupt or SystemExit) and doing whatever needs to be done for them to quit.
Do also note that you absolutely need to pass a number to term.join(), as otherwise it will, too, ignore all exceptions. You can select an arbitrarily high number, though.
Isn't Ctrl+C SIGINT?
Anyway, you can install a handler for the appropriate signal, and in the handler:
set a global flag that instructs the workers to exit, and make sure they check it periodically
or put 10 shutdown tokens on the queue, and have the workers exit when they pop this magic token
or set a flag which instructs the main thread to push those tokens, make sure the main thread checks that flag
etc. Mostly it depends on the structure of the application you're interrupting.
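As an illustration of the third option (the handler only sets a flag, and the main thread pushes the shutdown tokens), here is a Python 3 sketch. The names STOP, shutdown_requested and handle_sigint are hypothetical, and the signal is simulated by calling the handler directly so the example terminates on its own.

```python
import queue
import signal
import threading

STOP = object()              # hypothetical shutdown token
NUM_WORKERS = 3
tasks = queue.Queue()
processed = []
shutdown_requested = False


def worker():
    while True:
        item = tasks.get()
        if item is STOP:
            break            # exit when the magic token is popped
        processed.append(item)


def handle_sigint(signum, frame):
    # Keep the handler minimal: only record that shutdown was requested.
    global shutdown_requested
    shutdown_requested = True


# Signal handlers can only be installed from the main thread.
if threading.current_thread() is threading.main_thread():
    signal.signal(signal.SIGINT, handle_sigint)

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for n in range(4):
    tasks.put(n)

handle_sigint(signal.SIGINT, None)   # simulate Ctrl+C for this sketch
if shutdown_requested:
    for _ in threads:
        tasks.put(STOP)              # one token per worker
for t in threads:
    t.join()
```

In a real program the main thread would check the flag inside its own loop. Because the tokens are queued behind the pending items, the workers finish the queued work before exiting; putting the tokens on a fresh queue or checking the flag in the workers would stop them sooner.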
One way to do it is to install a signal handler for SIGTERM that directly calls os._exit(signal.SIGTERM). However unless you specify the optional timeout argument to Queue.get the signal handler function will not run until after the get method returns. (That's completely undocumented; I discovered that on my own.) So you can specify sys.maxint as the timeout and put your Queue.get call in a retry loop for purity to get around that.
Why don't you set timeouts for every operation on the queue? Then your threads can regularly check whether they have to finish by checking whether an Event is set.
This is how I tackled this.
class Worker(threading.Thread):
    def __init__(self):
        super().__init__()  # required before the thread can be started
        self.shutdown_flag = threading.Event()
        # self.task_queue is assumed to be assigned elsewhere
    def run(self):
        logging.info('Worker started')
        while not self.shutdown_flag.is_set():
            try:
                task = self.get_task_from_queue()
            except queue.Empty:
                continue
            self.process_task(task)
    def get_task_from_queue(self) -> Task:
        return self.task_queue.get(block=True, timeout=10)
    def shutdown(self):
        logging.info('Shutdown received')
        self.shutdown_flag.set()
Upon receiving a signal the main thread sets the shutdown event on workers. The workers wait on a blocking queue, but keep checking every 10 seconds if they have received a shutdown signal.
I managed to solve the problem by emptying the queue on KeyboardInterrupt and letting the threads stop themselves gracefully.
I don't know if it's the best way to handle this, but it is simple and quite clean.
targets = Queue.Queue()
threads_num = 10
threads = []
for i in range(threads_num):
    t = MyThread()
    t.setDaemon(True)
    threads.append(t)
    t.start()
while True:
    try:
        # If the queue is empty exit loop
        if targets.empty():
            break
    # KeyboardInterrupt handler
    except KeyboardInterrupt:
        print "[X] Interrupt! Killing threads..."
        # Substitute the old queue with a new empty one and exit loop
        targets = Queue.Queue()
        break
# Join every thread on the queue normally
targets.join()
I have the below code but it lives on after the queue is empty, any insights:
def processor():
    while(1>0):
        if queue.empty() == True:
            print "the Queue is empty!"
            break
        source=queue.get()
        page = urllib2.urlopen(source)
        print page
def main():
    for i in range(threads):
        th = Thread(target=processor)
        th.setDaemon(True)
        th.start()
    queue.join()
It prints queue empty as many times as I have threads and just stands there doing nothing.
You need to call queue.task_done() after printing the page, otherwise join() will block. Each thread, after using get() must call task_done().
See documentation for queue
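A small Python 3 sketch of the get()/task_done() pairing, assuming a stand-in for the page fetch (upper-casing a string) and made-up source names; wrapping the work in try/finally keeps the count balanced even if processing an item raises.

```python
import queue
import threading

q = queue.Queue()
seen = []


def processor():
    while True:
        source = q.get()              # blocks until an item is available
        try:
            seen.append(source.upper())   # stand-in for fetching the page
        finally:
            q.task_done()             # always balance the get()


for _ in range(2):
    threading.Thread(target=processor, daemon=True).start()
for source in ('a', 'b', 'c'):
    q.put(source)
q.join()   # returns once task_done() has been called for all three items
```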
This part:
while(1>0):
    if queue.empty() == True:
        print "the Queue is empty!"
        break
The check above is just plain wrong. queue.get() blocks until an item is available, so there is absolutely no reason to poll queue.empty() in a busy loop. The check should be deleted.
Your code should look something like this.
def processor():
    while True:
        # get() blocks until an item is available, so no empty() check is needed
        source=queue.get()
        page = urllib2.urlopen(source)
        print page
        queue.task_done()
def main():
    for i in range(threads):
        th = Thread(target=processor)
        th.setDaemon(True)
        th.start()
    for source in all_sources:
        queue.put(source)
    queue.join()
It's not the cleanest way to exit, but it will work. Since the processor threads are set to be daemons, the whole process will exit as soon as main is done.
As Max said, we need a complete example to help with your behavior, but from the documentation:
Python’s Thread class supports a subset of the behavior of Java’s Thread class; currently, there are no priorities, no thread groups, and threads cannot be destroyed, stopped, suspended, resumed, or interrupted.
It stops being alive when its run() method terminates – either normally, or by raising an unhandled exception. The is_alive() method tests whether the thread is alive.
http://docs.python.org/library/threading.html
The lower level thread module does allow you to manually call exit(). Without a more complete example, I don't know if that's what you need in this case, but I suspect not as Thread objects should automatically end when run() is complete.
http://docs.python.org/library/thread.html
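A short Python 3 sketch of the lifecycle described in the quoted documentation. The release event is only there to hold the thread open long enough to observe it; the variable names are made up for illustration.

```python
import threading

release = threading.Event()


def run():
    release.wait()             # stay alive until the main thread allows exit


t = threading.Thread(target=run)
before_start = t.is_alive()    # not started yet
t.start()
while_running = t.is_alive()   # run() is blocked on the event
release.set()                  # let run() return
t.join()
after_return = t.is_alive()    # run() has terminated
```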