I've a python program that spawns a number of threads. These threads last anywhere between 2 seconds to 30 seconds. In the main thread I want to track whenever each thread completes and print a message. If I just sequentially .join() all threads and the first thread lasts 30 seconds and others complete much sooner, I wouldn't be able to print a message sooner -- all messages will be printed after 30 seconds.
Basically I want to block until any thread completes. As soon as a thread completes, print a message about it and go back to blocking if any other threads are still alive. If all threads are done then exit program.
One way I could think of is to have a queue that is passed to all the threads and block on queue.get(). Whenever a message is received from the queue, print it, check if any other threads are alive using threading.active_count() and if so, go back to blocking on queue.get(). This would work but here all the threads need to follow the discipline of sending a message to the queue before terminating.
I'm wonder if this is the conventional way of achieving this behavior or are there any other / better ways ?
Here's a variation on #detly's answer that lets you specify the messages from your main thread, instead of printing them from your target functions. This creates a wrapper function which calls your target and then prints a message before terminating. You could modify this to perform any kind of standard cleanup after each thread completes.
#!/usr/bin/python
import threading
import time
def target1():
time.sleep(0.1)
print "target1 running"
time.sleep(4)
def target2():
time.sleep(0.1)
print "target2 running"
time.sleep(2)
def launch_thread_with_message(target, message, args=[], kwargs={}):
def target_with_msg(*args, **kwargs):
target(*args, **kwargs)
print message
thread = threading.Thread(target=target_with_msg, args=args, kwargs=kwargs)
thread.start()
return thread
if __name__ == '__main__':
thread1 = launch_thread_with_message(target1, "finished target1")
thread2 = launch_thread_with_message(target2, "finished target2")
print "main: launched all threads"
thread1.join()
thread2.join()
print "main: finished all threads"
The thread needs to be checked using the Thread.is_alive() call.
Why not just have the threads themselves print a completion message, or call some other completion callback when done?
You can the just join these threads from your main program, so you'll see a bunch of completion messages and your program will terminate when they're all done, as required.
Here's a quick and simple demonstration:
#!/usr/bin/python
import threading
import time
def really_simple_callback(message):
"""
This is a really simple callback. `sys.stdout` already has a lock built-in,
so this is fine to do.
"""
print message
def threaded_target(sleeptime, callback):
"""
Target for the threads: sleep and call back with completion message.
"""
time.sleep(sleeptime)
callback("%s completed!" % threading.current_thread())
if __name__ == '__main__':
# Keep track of the threads we create
threads = []
# callback_when_done is effectively a function
callback_when_done = really_simple_callback
for idx in xrange(0, 10):
threads.append(
threading.Thread(
target=threaded_target,
name="Thread #%d" % idx,
args=(10 - idx, callback_when_done)
)
)
[t.start() for t in threads]
[t.join() for t in threads]
# Note that thread #0 runs for the longest, but we'll see its message first!
What I would suggest is loop like this
while len(threadSet) > 0:
time.sleep(1)
for thread in theadSet:
if not thread.isAlive()
print "Thread "+thread.getName()+" terminated"
threadSet.remove(thread)
There is a 1 second sleep, so there will be a slight delay between the thread termination and the message being printed. If you can live with this delay, then I think this is a simpler solution than the one you proposed in your question.
You can let the threads push their results into a threading.Queue. Have another thread wait on this queue and print the message as soon as a new item appears.
I'm not sure I see the problem with using:
threading.activeCount()
to track the number of threads that are still active?
Even if you don't know how many threads you're going to launch before starting it seems pretty easy to track. I usually generate thread collections via list comprehension then a simple comparison using activeCount to the list size can tell you how many have finished.
See here: http://docs.python.org/library/threading.html
Alternately, once you have your thread objects you can just use the .isAlive method within the thread objects to check.
I just checked by throwing this into a multithread program I have and it looks fine:
for thread in threadlist:
print(thread.isAlive())
Gives me a list of True/False as the threads turn on and off. So you should be able to do that and check for anything False in order to see if any thread is finished.
I use a slightly different technique because of the nature of the threads I used in my application. To illustrate, this is a fragment of a test-strap program I wrote to scaffold a barrier class for my threading class:
while threads:
finished = set(threads) - set(threading.enumerate())
while finished:
ttt = finished.pop()
threads.remove(ttt)
time.sleep(0.5)
Why do I do it this way? In my production code, I have a time limit, so the first line actually reads "while threads and time.time() < cutoff_time". If I reach the cut-off, I then have code to tell the threads to shut down.
Related
I would like to run a process in a thread (which is iterating over a large database table). While the thread is running, I just want the program to wait. If that thread takes longer then 30 seconds, I want to kill the thread and do something else. By killing the thread, I mean that I want it to cease activity and release resources gracefully.
I figured the best way to do this was through a Thread()'s join(delay) and is_alive() functions, and an Event. Using the join(delay) I can have my program wait 30 seconds for the thread to finish, and by using the is_alive() function I can determine if the thread has finished its work. If it hasn't finished its work, the event is set, and the thread knows to stop working at that point.
Is this approach valid, and is this the most pythonic way to go about my problem statement?
Here is some sample code:
import threading
import time
# The worker loops for about 1 minute adding numbers to a set
# unless the event is set, at which point it breaks the loop and terminates
def worker(e):
data = set()
for i in range(60):
data.add(i)
if not e.isSet():
print "foo"
time.sleep(1)
else:
print "bar"
break
e = threading.Event()
t = threading.Thread(target=worker, args=(e,))
t.start()
# wait 30 seconds for the thread to finish its work
t.join(30)
if t.is_alive():
print "thread is not done, setting event to kill thread."
e.set()
else:
print "thread has already finished."
Using an Event in this case is works just fine as the signalling mechanism, and
is actually recommended in the threading module docs.
If you want your threads to stop gracefully, make them non-daemonic and use a
suitable signalling mechanism such as an Event.
When verifying thread termination, timeouts almost always introduce room for
error. Therefore, while using the .join() with a timeout for the initial
decision to trigger the event is fine, final verification should be made using a
.join() without a timeout.
# wait 30 seconds for the thread to finish its work
t.join(30)
if t.is_alive():
print "thread is not done, setting event to kill thread."
e.set()
# The thread can still be running at this point. For example, if the
# thread's call to isSet() returns right before this call to set(), then
# the thread will still perform the full 1 second sleep and the rest of
# the loop before finally stopping.
else:
print "thread has already finished."
# Thread can still be alive at this point. Do another join without a timeout
# to verify thread shutdown.
t.join()
This can be simplified to something like this:
# Wait for at most 30 seconds for the thread to complete.
t.join(30)
# Always signal the event. Whether the thread has already finished or not,
# the result will be the same.
e.set()
# Now join without a timeout knowing that the thread is either already
# finished or will finish "soon."
t.join()
I'm way late to this game, but I've been wrestling with a similar question and the following appears to both resolve the issue perfectly for me AND lets me do some basic thread state checking and cleanup when the daemonized sub-thread exits:
import threading
import time
import atexit
def do_work():
i = 0
#atexit.register
def goodbye():
print ("'CLEANLY' kill sub-thread with value: %s [THREAD: %s]" %
(i, threading.currentThread().ident))
while True:
print i
i += 1
time.sleep(1)
t = threading.Thread(target=do_work)
t.daemon = True
t.start()
def after_timeout():
print "KILL MAIN THREAD: %s" % threading.currentThread().ident
raise SystemExit
threading.Timer(2, after_timeout).start()
Yields:
0
1
KILL MAIN THREAD: 140013208254208
'CLEANLY' kill sub-thread with value: 2 [THREAD: 140013674317568]
I was also struggling to close a thread that was waiting to receive a notification.
Tried solution given here by user5737269 but it didn't really work for me. It was getting stuck in second join statement(without timeout one). Struggled a lot but didn't find any solution to this problem. Got this solution after thinking sometime:
My thread is waiting to receive a message in que. I want to close this thread, if no notification is received for 20 seconds. So, after 20 seconds, I am writing a message to this que so that thread terminates on its own.
Here's code:
q = Queue.Queue()
t.join(20)
if t.is_alive():
print("STOPPING THIS THREAD ....")
q.put("NO NOTIFICATION RECEIVED")
t.join(20)
else:
print("Thread completed successfully!!")
This worked for me.. Hope this idea helps someone!
I am very new to the concept of threading and the concepts are still somewhat fuzzy.
But as of now i have a requirement in which i spin up an arbitrary number of threads from my Python program and then my Python program should indicate to the user running the process which threads have finished executing. Below is my first try:
import threading
from threading import Thread
from time import sleep
def exec_thread(n):
name = threading.current_thread().getName()
filename = name + ".txt"
with open(filename, "w+") as file:
file.write(f"My name is {name} and my main thread is {threading.main_thread()}\n")
sleep(n)
file.write(f"{name} exiting\n")
t1 = Thread(name="First", target=exec_thread, args=(10,))
t2 = Thread(name="Second", target=exec_thread, args=(2,))
t1.start()
t2.start()
while len(threading.enumerate()) > 1:
print(f"Waiting ... !")
sleep(5)
print(f"The threads are done"
So this basically tells me when all the threads are done executing.
But i want to know as soon as any one of my threads have completed execution so that i can tell the user that please check the output file for the thread.
I cannot use thread.join() since that would block my main program and the user would not know anything unless everything is complete which might take hours. The user wants to know as soon as some results are available.
Now i know that we can check individual threads whether they are active or not by doing : thread.isAlive() but i was hoping for a more elegant solution in which if the child threads can somehow communicate with the main thread and say I am done !
Many thanks for any answers in advance.
The simplest and most straightforward way to indicate a single thread is "done" is to put the required notification in the thread's implementation method, as the very last step. For example, you could print out a notification to the user.
Or, you could use events, see: https://docs.python.org/3/library/threading.html#event-objects
This is one of the simplest mechanisms for communication between
threads: one thread signals an event and other threads wait for it.
An event object manages an internal flag that can be set to true with
the set() method and reset to false with the clear() method. The
wait() method blocks until the flag is true.
So, the "final act" in your thread implementation would be to set an event object, and your main thread can wait until it's set.
Or, for an even fancier and more mechanism, use queues: https://docs.python.org/3/library/queue.html
Each thread writes an "I'm done" object to the queue when done, and the main thread can read those notifications from the queue in sequence as each thread completes.
from threading import Thread
class MyClass:
#...
def method2(self):
while True:
try:
hashes = self.target.bssid.replace(':','') + '.pixie'
text = open(hashes).read().splitlines()
except IOError:
time.sleep(5)
continue
# function goes on ...
def method1(self):
new_thread = Thread(target=self.method2())
new_thread.setDaemon(True)
new_thread.start() # Main thread will stop there, wait until method 2
print "Its continues!" # wont show =(
# function goes on ...
Is it possible to do like that?
After new_thread.start() Main thread waits until its done, why is that happening? i didn't provide new_thread.join() anywhere.
Daemon doesn't solve my problem because my problem is that Main thread stops right after new thread start, not because main thread execution is end.
As written, the call to the Thread constructor is invoking self.method2 instead of referring to it. Replace target=self.method2() with target=self.method2 and the threads will run in parallel.
Note that, depending on what your threads do, CPU computations might still be serialized due to the GIL.
IIRC, this is because the program doesn't exit until all non-daemon threads have finished execution. If you use a daemon thread instead, it should fix the issue. This answer gives more details on daemon threads:
Daemon Threads Explanation
from thread import start_new_thread
num_threads = 0
def heron(a):
global num_threads
num_threads += 1
# code has been left out, see above
num_threads -= 1
return new
start_new_thread(heron,(99,))
start_new_thread(heron,(999,))
start_new_thread(heron,(1733,))
start_new_thread(heron,(17334,))
while num_threads > 0:
pass
This is simple code of thread i want to know in last line why do we use while loop
The final while-loop waits for all of the threads to finish before the main thread exits.
It is expensive check (100% CPU for the spin-wait). You can improve it in one of two ways:
while num_threads > 0:
time.sleep(0.1)
or by tracking all the threads in a list and joining them one-by-one:
for worker in worker_threads:
worker.join()
We want to keep process alive until all children finish the work. So we must keep executing something in main thread as long as any child is alive, hence the check for num_threads variable.
If it wasn't for this, all children threads would be killed ASAP main thread finished its work regardless of whether they actually finished their work, so waiting for them is mandatory to ensure everything is done.
To build on Raymond Hettinger's answer: the parent process starts a number of threads, each of which does work. We then wait for each to exit, so that we can collect and process their output. In this case each worker just outputs to the screen, so the parent just has to join() each task to make sure it ran and exited correctly.
Here's an alternate way to code the above. It uses the higher-level library threading (vs thread), and only calls join() on threads besides the current one. We also use threading.enumerate() instead of manually keeping track of worker threads -- easier.
code:
import threading
def heron(a):
print '{}: a={}'.format(threading.current_thread(), a)
threading.Thread(target=heron, args=(99,)).start()
threading.Thread(target=heron, args=(999,)).start()
threading.Thread(target=heron, args=(1733,)).start()
threading.Thread(target=heron, args=(17334,)).start()
print
print '{} threads, joining'.format(threading.active_count())
for thread in threading.enumerate():
print '- {} join'.format(thread)
if thread == threading.current_thread():
continue
thread.join()
print 'done'
Example output:
python ./jointhread.py
<Thread(Thread-1, started 140381408802560)>: a=99
<Thread(Thread-2, started 140381400082176)>: a=999
<Thread(Thread-3, started 140381400082176)>: a=1733
2 threads, joining
- <_MainThread(MainThread, started 140381429581632)> join
- <Thread(Thread-4, started 140381408802560)> join
<Thread(Thread-4, started 140381408802560)>: a=17334
done
I have the below code but it lives on after the queue is empty, any insights:
def processor():
while(1>0):
if queue.empty() == True:
print "the Queue is empty!"
break
source=queue.get()
page = urllib2.urlopen(source)
print page
def main:
for i in range(threads):
th = Thread(target=processor)
th.setDaemon(True)
th.start()
queue.join()
It prints queue empty as many times as I have threads and just stands there doing nothing.
You need to call queue.task_done() after printing the page, otherwise join() will block. Each thread, after using get() must call task_done().
See documentation for queue
This part:
while(1>0):
if queue.empty() == True:
print "the Queue is empty!"
break
Above is just plain wrong. queue.get() is blocking, there is absolutely no reason to have a busy loop. It should be deleted.
Your code should look something like this.
def processor():
source=queue.get()
page = urllib2.urlopen(source)
print page
queue.task_done()
def main:
for i in range(threads):
th = Thread(target=processor)
th.setDaemon(True)
th.start()
for source in all_sources:
queue.put(source)
queue.join()
It's not the cleanest way to exit, but it will work. Since processor threads are set to be daemons, whole process with exit as soon as the main is done.
As Max said, need a complete example to help with your behavior, but from the documentation:
Python’s Thread class supports a subset of the behavior of Java’s Thread class; currently, there are no priorities, no thread groups, and threads cannot be destroyed, stopped, suspended, resumed, or interrupted.
It stops being alive when its run() method terminates – either normally, or by raising an unhandled exception. The is_alive() method tests whether the thread is alive.
http://docs.python.org/library/threading.html
The lower level thread module does allow you to manually call exit(). Without a more complete example, I don't know if that's what you need in this case, but I suspect not as Thread objects should automatically end when run() is complete.
http://docs.python.org/library/thread.html