If I have a threading.Event and the following two lines of code:
event.set()
event.clear()
and I have some threads that are waiting for that event.
My question is related to what happens when calling the set() method:
Can I be ABSOLUTELY sure that all the waiting thread(s) will be notified? (i.e. Event.set() "notifies" the threads)
Or could it happen that those two lines are executed so quickly after each other, that some threads might still be waiting? (i.e. Event.wait() polls the event's state, which might be already "cleared" again)
Thanks for your answers!
Internally, Python implements an Event with a Condition object.
Calling event.set() acquires the condition's lock (so it cannot be interrupted), sets the internal flag, and calls the condition's notify_all(). Every thread that is already blocked in event.wait() at that moment receives the notification, and the lock is released only after all of them have been notified, so you can be sure those threads will be notified. A thread that has not reached wait() yet is not waiting on the condition and therefore cannot be woken by this call.
Clearing the event right after the notification is not a problem, unless the waiting threads also check the event's value with event.is_set(). You only need that kind of check if you are waiting with a timeout.
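To make the mechanism concrete, here is a minimal Python 3 sketch of an event built on a Condition. The names MiniEvent, waiter_count() and demo() are hypothetical and introduced only for illustration; this is a simplified model of the idea, not the actual threading.Event source. The waiter counter exists only so the demo can be sure that every thread is really blocked before set() is called.

```python
import threading
import time


class MiniEvent:
    """Simplified, hypothetical model of an event built on a Condition."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())
        self._flag = False
        self._waiters = 0  # demo-only bookkeeping

    def set(self):
        with self._cond:
            self._flag = True
            self._cond.notify_all()  # wakes every thread blocked in wait()

    def clear(self):
        with self._cond:
            self._flag = False

    def wait(self, timeout=None):
        with self._cond:
            if self._flag:
                return True
            self._waiters += 1
            try:
                # Releases the lock while blocked; returns True when notified,
                # False if the timeout expired first.
                return self._cond.wait(timeout)
            finally:
                self._waiters -= 1

    def waiter_count(self):
        with self._cond:
            return self._waiters


def demo(n=8):
    ev = MiniEvent()
    results = []

    def runner():
        results.append(ev.wait())

    threads = [threading.Thread(target=runner) for _ in range(n)]
    for t in threads:
        t.start()
    # Wait until all n threads are blocked inside wait().
    while ev.waiter_count() < n:
        time.sleep(0.01)
    ev.set()
    ev.clear()  # cleared immediately, yet every waiter was already notified
    for t in threads:
        t.join(timeout=5)
    return results
```

Calling demo() should return a list of n True values: each waiter reports that it was notified, even though the flag is False again by the time it runs.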
Examples :
pseudocode that works :
#in main thread
event = Event()
thread1(event)
thread2(event)
...
event.set()
event.clear()
#in thread code
...
event.wait()
#do the stuff
pseudocode that may not work :
#in main thread
event = Event()
thread1(event)
thread2(event)
...
event.set()
event.clear()
#in thread code
...
while not event.is_set():
    event.wait(timeout_value)
#do the stuff
Edited : in python >= 2.7 you can still wait for an event with a timeout and be sure of the state of the event :
event_state = event.wait(timeout)
while not event_state:
    event_state = event.wait(timeout)
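As a runnable variant of the loop above, here is a small Python 3 sketch. In Python 3, Event.wait(timeout) returns True if the flag was set before or during the wait and False only when the timeout expires, so the return value can drive the loop. The helper name wait_with_timeout and its max_tries parameter are assumptions added for illustration.

```python
import threading


def wait_with_timeout(event, timeout, max_tries):
    """Wait in slices of `timeout` seconds.

    Returns (signalled, timeouts): whether the event fired, and how many
    slices expired before it did.
    """
    timeouts = 0
    while timeouts < max_tries:
        if event.wait(timeout):  # True means set() was seen, not a timeout
            return True, timeouts
        timeouts += 1
    return False, timeouts
```

The cap on the number of tries is a design choice of this sketch; drop it if the thread really should wait forever.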
It's easy enough to verify that things work as expected (Note: this is Python 2 code, which will need adapting for Python 3):
import threading
e = threading.Event()
threads = []
def runner():
    tname = threading.current_thread().name
    print 'Thread waiting for event: %s' % tname
    e.wait()
    print 'Thread got event: %s' % tname
for t in range(100):
    t = threading.Thread(target=runner)
    threads.append(t)
    t.start()
raw_input('Press enter to set and clear the event:')
e.set()
e.clear()
for t in threads:
    t.join()
print 'All done.'
If you run the above script and it terminates, all should be well :-) Notice that a hundred threads are waiting for the event to be set; it's set and cleared straight away; all threads should see this and should terminate (though not in any definite order; the "Press enter" prompt can appear anywhere among the "Thread waiting" messages, while "All done." is printed only after every thread has been joined).
Python 3+
It's easier to check that it works
import threading
import time
lock = threading.Lock() # just to sync printing
e = threading.Event()
threads = []
def runner():
    tname = threading.current_thread().name
    with lock:
        print('Thread waiting for event ', tname)
    e.wait()
    with lock:
        print('Thread got event: ', tname)
for t in range(8): # Create 8 threads could be 100's
    t = threading.Thread(target=runner)
    threads.append(t)
    t.start()
time.sleep(1) # force wait until set/clear
e.set()
e.clear()
for t in threads:
    t.join()
print('Done')
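Both test scripts depend on every thread already being inside wait() when set() is called (the prompt and the one-second sleep are there to give the threads time to get that far). The following Python 3 sketch shows the other side of that assumption: a thread that starts waiting after the set()/clear() pair has missed it. The function name late_waiter_misses_pulse is hypothetical.

```python
import threading


def late_waiter_misses_pulse(timeout=0.1):
    e = threading.Event()
    e.set()
    e.clear()  # the set/clear "pulse" is over before anyone waits
    # Nobody sets the event again, so this wait can only time out.
    return e.wait(timeout)
```

If late arrivals must not be lost, leave the event set until every consumer has seen it, rather than clearing it immediately.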
Related
I have two threads and am trying to use event.set() and event.wait() to pause the doing_different_stuff thread whilst the monitoring thread is executing.
My code is below. Even though the Rain_Check image is not on screen the doing_different_stuff thread only prints once when it is first run.
Can anyone see what I am doing wrong?
popup_found = threading.Event()
def monitoring():
    while True:
        if pyautogui.locateOnScreen('Rain_Check.png'):
            popup_found.set()
            print("Found pop-up")
            sleep(float(random.uniform(22.21, 44.36)))
            print("Pausing before closing pop-up")
def doing_different_stuff():
    while True:
        popup_found.wait()
        sleep(1)
        print("DDSing...")
rc_thread = threading.Thread(target=monitoring)
rc_thread.start()
dds_thread = threading.Thread(target=doing_different_stuff)
dds_thread.start()
I am trying to make it so that my threads can catch a SIGINT. It looks to me like the kill_received singleton list is in the same namespace for signal_handler() and do_the_uploads(), and that the same memory location is being referenced. But when I press Ctrl-C while it's running, I see False being printed from "print kill_received[0]" when it should be True, since I hit Ctrl-C.
kill_received = [False]
def signal_handler(signal, frame):
    global kill_received
    kill_received[0] = True
    print "\nYou pressed Ctrl+C!"
    print (
        "Your logs and their locations are:"
        "\n{}\n{}\n{}".format(debug, error, info))
    sys.exit(0)
def do_the_uploads(file_list, file_quantity,
        retry_list, authenticate):
    """The uploading engine"""
    value = raw_input(
        "\nPlease enter how many conncurent "
        "uploads you want at one time(example: 200)> ")
    value = int(value)
    logger.info('{} conncurent uploads will be used.'.format(value))
    confirm = raw_input(
        "\nProceed to upload files? Enter [Y/y] for yes: ").upper()
    if confirm == "Y":
        kill_received = False
        sys.stdout.write("\x1b[2J\x1b[H")
        q = Queue.Queue()
        def worker():
            global kill_received
            while True and not kill_received[0]:
                print kill_received[0]
                item = q.get()
                upload_file(item, file_quantity, retry_list, authenticate)
                q.task_done()
        for i in range(value):
            t = Thread(target=worker)
            t.setDaemon(True)
            t.start()
        for item in file_list:
            q.put(item)
        q.join()
        print "Finished. Cleaning up processes...",
        #Allowing the threads to cleanup
        time.sleep(4)
        print "done."
From main script:
from modules.upload_actions import do_the_uploads, retry, signal_handler
if __name__ == '__main__':
    signal.signal(signal.SIGINT, signal_handler)
    retry_list = []
    file_list, authenticate, ticket_number = main()
    file_quantity = FileQuantity(len(file_list))
    do_the_uploads(file_list, file_quantity,
                   retry_list, authenticate)
Update:
Still no success, but I changed the syntax to this as it's cleaner:
def worker():
    global kill_received
    while not kill_received[0]:
        time.sleep(1)
        print kill_received[0]
        item = q.get()
        upload_file(item, file_quantity, retry_list, authenticate)
        q.task_done()
The key to understanding what is going on is the comment you made
No. When the threads are all completed then I do...but not during thread execution for all the files they have to upload.
and this line of code:
q.join()
Contrary to what you are probably expecting, a control-C does NOT cause it to stop waiting for the queue - it doesn't accept the control-C until after this call has returned. So what is happening is that all of your threads have done their jobs and emptied the queue, and then are waiting on the line
item = q.get()
Only after the last thread calls q.task_done does the main thread return and then process the control-C. However, at that point all the threads are stuck waiting for more items from the queue (which they aren't going to get), so they will never exit the loop.
There might be more going on here than this, but to see if this is the problem try a busy wait for the queue to be empty:
while not q.empty():
    time.sleep(0.1)
q.join()
You need the join afterward because the queue being empty means the last upload has been pulled from the queue, not that it has been finished.
One other thing you can add is an item to the queue that signals the thread should finish, such as None. For example,
def worker():
    global kill_received
    while True and not kill_received[0]:
        print kill_received[0]
        item = q.get()
        if item is None:
            q.task_done()
            break
        upload_file(item, file_quantity, retry_list, authenticate)
        q.task_done()
for i in range(value):
    t = Thread(target=worker)
    t.setDaemon(True)
    t.start()
for item in file_list:
    q.put(item)
for i in range(value):
    q.put(None)
Of course, this assumes that None is not a valid value to upload. This won't help with the control-C issue, but it is something you might find helpful to make sure that the threads exit when the program finishes normally.
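For reference, here is a self-contained Python 3 sketch of the same sentinel idea. The function name run_with_sentinels is hypothetical, and doubling each item is only a stand-in for upload_file(), which is not shown in the question. One None is queued per worker, and task_done() sits in a finally block so q.join() cannot be left waiting on an item that was taken but never acknowledged.

```python
import queue
import threading


def run_with_sentinels(items, num_workers=4):
    q = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            try:
                if item is None:  # sentinel: no more work for this thread
                    return
                with results_lock:
                    results.append(item * 2)  # stand-in for the real upload
            finally:
                q.task_done()  # runs for real items and for the sentinel

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        q.put(item)
    for _ in threads:
        q.put(None)  # one sentinel per worker
    q.join()
    for t in threads:
        t.join()  # every worker has returned, so no daemon flag is needed
    return sorted(results)
```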
As a general tip, when you are testing things with threads it can be helpful to have a way to print out stack traces for all threads. This SO question talks about how to do that.
The reason you only see False printed is that the thread never gets a chance to print True: it stops before it ever hits your print kill_received[0] statement.
Think about it. There is probably a small chance that you could hit Ctrl-C between execution of this statement:
while True and not kill_received[0]:
and this statement:
print kill_received[0]
but it's improbable. Interrupting any of the threads at any other time will cause them to stop looping (from your while statement), and never print anything.
EDIT: You have the line: kill_received = False which may be causing you issues. It should probably be kill_received[0] = False
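The difference between the two spellings can be shown with a small Python 3 sketch. The function names rebind_local and mutate_shared are hypothetical; kill_received mirrors the list in the question. Assigning to the bare name inside a function creates a new local variable, while assigning to index 0 changes the shared list in place.

```python
kill_received = [False]


def rebind_local():
    # Plain assignment inside a function (with no `global` statement) binds a
    # new local name; the module-level list is left untouched.
    kill_received = False
    return kill_received


def mutate_shared():
    # Item assignment mutates the module-level list in place, so every
    # function that refers to that list sees the change.
    kill_received[0] = True
```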
Here's some example code from the Python documentation:
def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()
q = Queue()
for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()
for item in source():
    q.put(item)
q.join() # block until all tasks are done
I modified it to fit my use case like this:
import threading
from Queue import Queue
max_threads = 10
q = Queue(maxsize=max_threads + 2)
def worker():
    while True:
        task = q.get(1)
        # do something with the task
        q.task_done()
for i in range(max_threads):
    t = threading.Thread(target=worker)
    t.start()
for task in ['a', 'b', 'c']:
    q.put(task)
q.join()
When I execute it, debugger says that all the jobs were executed, but q.join() seems to wait forever. How can I send a signal to the worker threads that I already sent all the tasks?
This process doesn't finish at .join() because the worker threads continue waiting on new queue data (blocking .get())
Here is a method that uses a simple flag finishUp to tell workers to exit, which we set after .join() is done - meaning all tasks are processed. I added a timeout in the q.get() call to allow it to check on finishUp flag
import threading
import queue
max_threads = 5
q = queue.Queue(maxsize=max_threads + 2)
finishUp = False
def worker():
    while True:
        try:
            task = q.get(block=True, timeout=1)
            # do something with the task
            print ("processing task for:"+str(task))
            q.task_done()
        except queue.Empty: # raised when the queue stays empty for the whole timeout
            if finishUp:
                print ("thread finishing because processing is done")
                return
for i in range(max_threads):
    t = threading.Thread(target=worker)
    t.start()
for task in ['a', 'b', 'c']:
    q.put(task)
print ("waiting on join")
q.join()
finishUp = True # let the workers know that they can exit
print ("finished")
this produces the following output:
waiting on join
processing task for:a
processing task for:b
processing task for:c
finished
thread finishing because processing is done
thread finishing because processing is done
thread finishing because processing is done
thread finishing because processing is done
thread finishing because processing is done
Process finished with exit code 0
q.join() actually returns. You can test that by putting print('done') after the q.join() line.
....
q.join()
print('done')
Then, why does it not end the program?
Because, by default, threads are non-daemon threads.
You can make a thread a daemon thread using <thread_object>.daemon = True
for i in range(max_threads):
    t = threading.Thread(target=worker)
    t.daemon = True # <---
    t.start()
According to threading module documentation:
daemon
A boolean value indicating whether this thread is a daemon thread
(True) or not (False). This must be set before start() is called,
otherwise RuntimeError is raised. Its initial value is inherited from
the creating thread; the main thread is not a daemon thread and
therefore all threads created in the main thread default to daemon =
False.
The entire Python program exits when no alive non-daemon threads are
left.
New in version 2.6.
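As a small Python 3 illustration of the quoted rule, the sketch below sets the flag through the daemon keyword argument of the Thread constructor (available since Python 3.3) and shows that changing it after start() raises RuntimeError. The function names make_daemon_worker and set_daemon_after_start are hypothetical.

```python
import threading


def make_daemon_worker(target):
    # The flag is fixed before start(), here via the constructor.
    return threading.Thread(target=target, daemon=True)


def set_daemon_after_start():
    t = threading.Thread(target=lambda: None)
    t.start()
    try:
        t.daemon = True  # too late: the thread has already been started
    except RuntimeError:
        return "RuntimeError"
    finally:
        t.join()
    return "no error"
```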
I defined a DONE object to signal the end of work:
DONE = object()
and literally put it into the queue when the upper level knows that no more data will come:
q.put_nowait(DONE)
in the worker thread, as soon as the object is received, the thread quits.
But in case there are other threads listening on the very same queue, we have to put the object back on the queue:
item = q.get()
if item is DONE:
    q.put_nowait(DONE)
    return
cheers :)
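Put together, the DONE-marker approach might look like the following Python 3 sketch. The function name consume_until_done and the seen list are assumptions for illustration. A single DONE object shuts down any number of workers because each one puts it back before returning. The queue is unbounded here, so put_nowait() cannot raise queue.Full.

```python
import queue
import threading

DONE = object()  # unique marker that cannot collide with real data


def consume_until_done(items, num_workers=3):
    q = queue.Queue()  # unbounded, so put_nowait() never raises queue.Full
    seen = []
    seen_lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            if item is DONE:
                q.put_nowait(DONE)  # hand the marker on to the next worker
                return
            with seen_lock:
                seen.append(item)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        q.put(item)
    q.put_nowait(DONE)  # queued after all real items (FIFO order)
    for t in threads:
        t.join()
    return sorted(seen)
```

Because the queue is FIFO, the marker is only taken after every real item has been taken, so no work is skipped.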
I have a python program that implements threads like this:
class Mythread(threading.Thread):
    def __init__(self, name, q):
        threading.Thread.__init__(self)
        self.name = name
        self.q = q
    def run(self):
        print "Starting %s..." % (self.name)
        while True:
            ## Get data from queue
            data = self.q.get()
            ## do_some_processing with data ###
            process_data(data)
            ## Mark Queue item as done
            self.q.task_done()
        print "Exiting %s..." % (self.name)
def call_threaded_program():
    ##Setup the threads. Define threads,queue,locks
    threads = []
    q = Queue.Queue()
    thread_count = n #some number
    data_list = [] #some data list containing data
    ##Create Threads
    for thread_id in range(1, thread_count+1):
        thread_name = "Thread-" + str(thread_id)
        thread = Mythread(thread_name,q)
        thread.daemon = True
        thread.start()
    ##Fill data in Queue
    for data_item in data_list:
        q.put(data_item)
    try:
        ##Wait for queue to be exhausted and then exit main program
        q.join()
    except (KeyboardInterrupt, SystemExit) as e:
        print "Interrupt Issued. Exiting Program with error state: %s"%(str(e))
        exit(1)
The call_threaded_program() is called from a different program.
I have the code working under normal circumstances. However if an error/exception occurs in one of the threads, then the program is stuck (as the queue join is infinitely blocking). The only way I am able to quit this program is to close the terminal itself.
What is the best way to terminate this program when a thread bails out? Is there a clean (actually I would take any way) way of doing this? I know this question has been asked numerous times, but I am still unable to find a convincing answer. I would really appreciate any help.
EDIT:
I tried removing the join on the queue and used a global exit flag as suggested in Is there any way to kill a Thread in Python?
However, Now the behavior is so strange, I can't comprehend what is going on.
import threading
import Queue
import time
exit_flag = False
class Mythread (threading.Thread):
    def __init__(self,name,q):
        threading.Thread.__init__(self)
        self.name = name
        self.q = q
    def run(self):
        try:
            # Start Thread
            print "Starting %s...."%(self.name)
            # Do Some Processing
            while not exit_flag:
                data = self.q.get()
                print "%s processing %s"%(self.name,str(data))
                self.q.task_done()
            # Exit thread
            print "Exiting %s..."%(self.name)
        except Exception as e:
            print "Exiting %s due to Error: %s"%(self.name,str(e))
def main():
    global exit_flag
    ##Setup the threads. Define threads,queue,locks
    threads = []
    q = Queue.Queue()
    thread_count = 20
    data_list = range(1,50)
    ##Create Threads
    for thread_id in range(1,thread_count+1):
        thread_name = "Thread-" + str(thread_id)
        thread = Mythread(thread_name,q)
        thread.daemon = True
        threads.append(thread)
        thread.start()
    ##Fill data in Queue
    for data_item in data_list:
        q.put(data_item)
    try:
        ##Wait for queue to be exhausted and then exit main program
        while not q.empty():
            pass
        # Stop the threads
        exit_flag = True
        # Wait for threads to finish
        print "Waiting for threads to finish..."
        while threading.activeCount() > 1:
            print "Active Threads:",threading.activeCount()
            time.sleep(1)
            pass
        print "Finished Successfully"
    except (KeyboardInterrupt, SystemExit) as e:
        print "Interrupt Issued. Exiting Program with error state: %s"%(str(e))
if __name__ == '__main__':
    main()
The program's output is as below:
#Threads get started correctly
#The output also is getting processed but then towards the end, All i see are
Active Threads: 16
Active Threads: 16
Active Threads: 16...
The program then just hangs or keeps on printing the active threads. However since the exit flag is set to True, the thread's run method is not being exercised. So I have no clue as to how these threads are kept up or what is happening.
EDIT:
I found the problem. In the above code, the threads' get() calls were blocking, so the threads were unable to quit. Using get() with a timeout instead did the trick. Below is just the run method that I modified:
def run(self):
    try:
        #Start Thread
        print "Starting %s..."%(self.name)
        #Do Some processing
        while not exit_flag:
            try:
                data = self.q.get(True,self.timeout)
                print "%s processing %s"%(self.name,str(data))
                self.q.task_done()
            except Queue.Empty:
                print "Queue Empty or Timeout Occurred. Try Again for %s"%(self.name)
        # Exit thread
        print "Exiting %s..."%(self.name)
    except Exception as e:
        print "Exiting %s due to Error: %s"%(self.name,str(e))
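For the original problem (q.join() blocking forever after a worker hits an exception), one possible approach is to make sure task_done() is always called and to collect the errors for the main thread. This is a Python 3 sketch under that assumption, not the poster's final code; process_all, handler and the errors list are hypothetical names.

```python
import queue
import threading


def process_all(items, handler, num_workers=4):
    q = queue.Queue()
    errors = []  # (item, exception) pairs; list.append is atomic in CPython

    def worker():
        while True:
            item = q.get()
            try:
                handler(item)
            except Exception as exc:
                errors.append((item, exc))  # record it and keep the worker alive
            finally:
                q.task_done()  # always acknowledged, so q.join() can return

    for _ in range(num_workers):
        # Daemon threads: they stay blocked in q.get() and die with the process.
        threading.Thread(target=worker, daemon=True).start()
    for item in items:
        q.put(item)
    q.join()
    return errors
```

The caller can then inspect the returned list and decide whether to exit with an error state.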
If you want to force all the threads to exit when the process exits, you can set the "daemon" flag of the thread to True before the thread is started.
http://docs.python.org/2/library/threading.html#threading.Thread.daemon
I did it once in C. Basically, I had a main process that started the other ones and kept track of them, i.e. stored the PIDs and waited for the return codes. If a process hits an error, its return code will indicate so, and then you can stop every other process. Hope this helps.
Edit:
Sorry, I may have forgotten in my answer that you were using threads, but I think it still applies. You can either wrap or modify the thread to get a return value, or you can use the multithread pool library.
how to get the return value from a thread in python?
Python thread exit code
If I have a program that uses threading and Queue, how do I get exceptions to stop execution? Here is an example program, which is not possible to stop with ctrl-c (basically ripped from the python docs).
from threading import Thread
from Queue import Queue
from time import sleep
def do_work(item):
    sleep(0.5)
    print "working" , item
def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()
q = Queue()
num_worker_threads = 10
for i in range(num_worker_threads):
    t = Thread(target=worker)
    # t.setDaemon(True)
    t.start()
for item in range(1, 10000):
    q.put(item)
q.join() # block until all tasks are done
The simplest way is to start all the worker threads as daemon threads, then just have your main loop be
while True:
    sleep(1)
Hitting Ctrl+C will throw an exception in your main thread, and all of the daemon threads will exit when the interpreter exits. This assumes you don't want to perform cleanup in all of those threads before they exit.
A more complex way is to have a global stopped Event:
stopped = Event()
def worker():
    while not stopped.is_set():
        try:
            item = q.get_nowait()
            do_work(item)
        except Empty: # import the Empty exception from the Queue module
            stopped.wait(1)
Then your main loop can set the stopped Event when it gets a KeyboardInterrupt:
try:
    while not stopped.is_set():
        stopped.wait(1)
except KeyboardInterrupt:
    stopped.set()
This lets your worker threads finish what they're doing if you want, instead of just having every worker thread be a daemon and exit in the middle of execution. You can also do whatever cleanup you want.
Note that this example doesn't make use of q.join() - this makes things more complex, though you can still use it. If you do then your best bet is to use signal handlers instead of exceptions to detect KeyboardInterrupts. For example:
from signal import signal, SIGINT
def stop(signum, frame):
    stopped.set()
signal(SIGINT, stop)
This lets you define what happens when you hit Ctrl+C without affecting whatever your main loop is in the middle of. So you can keep doing q.join() without worrying about being interrupted by a Ctrl+C. Of course, with my above examples, you don't need to be joining, but you might have some other reason for doing so.
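The pieces above can be combined into one Python 3 sketch. The worker's done list, the 0.1 second poll interval and the install_handler() helper are assumptions added for illustration. Registration is wrapped in a function because signal.signal() may only be called from the main thread.

```python
import queue
import signal
import threading

stopped = threading.Event()


def stop(signum, frame):
    # Signal handler: only flips the flag; the workers notice it themselves.
    stopped.set()


def worker(q, done):
    while not stopped.is_set():
        try:
            item = q.get(timeout=0.1)  # short timeout so the flag is re-checked
        except queue.Empty:
            continue
        done.append(item)  # stand-in for do_work(item)
        q.task_done()


def install_handler():
    # Must be called from the main thread.
    signal.signal(signal.SIGINT, stop)
```

A typical use would be to call install_handler(), start the workers, fill the queue, and then either q.join() or loop on stopped.wait(1) in the main thread, joining the worker threads once stopped is set.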