I have the below code but it lives on after the queue is empty, any insights:
def processor():
while(1>0):
if queue.empty() == True:
print "the Queue is empty!"
break
source=queue.get()
page = urllib2.urlopen(source)
print page
def main:
for i in range(threads):
th = Thread(target=processor)
th.setDaemon(True)
th.start()
queue.join()
It prints queue empty as many times as I have threads and just stands there doing nothing.
You need to call queue.task_done() after printing the page, otherwise join() will block. Each thread, after using get() must call task_done().
See documentation for queue
This part:
while(1>0):
if queue.empty() == True:
print "the Queue is empty!"
break
Above is just plain wrong. queue.get() is blocking, there is absolutely no reason to have a busy loop. It should be deleted.
Your code should look something like this.
def processor():
source=queue.get()
page = urllib2.urlopen(source)
print page
queue.task_done()
def main:
for i in range(threads):
th = Thread(target=processor)
th.setDaemon(True)
th.start()
for source in all_sources:
queue.put(source)
queue.join()
It's not the cleanest way to exit, but it will work. Since processor threads are set to be daemons, whole process with exit as soon as the main is done.
As Max said, need a complete example to help with your behavior, but from the documentation:
Python’s Thread class supports a subset of the behavior of Java’s Thread class; currently, there are no priorities, no thread groups, and threads cannot be destroyed, stopped, suspended, resumed, or interrupted.
It stops being alive when its run() method terminates – either normally, or by raising an unhandled exception. The is_alive() method tests whether the thread is alive.
http://docs.python.org/library/threading.html
The lower level thread module does allow you to manually call exit(). Without a more complete example, I don't know if that's what you need in this case, but I suspect not as Thread objects should automatically end when run() is complete.
http://docs.python.org/library/thread.html
Related
I am very new to the concept of threading and the concepts are still somewhat fuzzy.
But as of now i have a requirement in which i spin up an arbitrary number of threads from my Python program and then my Python program should indicate to the user running the process which threads have finished executing. Below is my first try:
import threading
from threading import Thread
from time import sleep
def exec_thread(n):
name = threading.current_thread().getName()
filename = name + ".txt"
with open(filename, "w+") as file:
file.write(f"My name is {name} and my main thread is {threading.main_thread()}\n")
sleep(n)
file.write(f"{name} exiting\n")
t1 = Thread(name="First", target=exec_thread, args=(10,))
t2 = Thread(name="Second", target=exec_thread, args=(2,))
t1.start()
t2.start()
while len(threading.enumerate()) > 1:
print(f"Waiting ... !")
sleep(5)
print(f"The threads are done"
So this basically tells me when all the threads are done executing.
But i want to know as soon as any one of my threads have completed execution so that i can tell the user that please check the output file for the thread.
I cannot use thread.join() since that would block my main program and the user would not know anything unless everything is complete which might take hours. The user wants to know as soon as some results are available.
Now i know that we can check individual threads whether they are active or not by doing : thread.isAlive() but i was hoping for a more elegant solution in which if the child threads can somehow communicate with the main thread and say I am done !
Many thanks for any answers in advance.
The simplest and most straightforward way to indicate a single thread is "done" is to put the required notification in the thread's implementation method, as the very last step. For example, you could print out a notification to the user.
Or, you could use events, see: https://docs.python.org/3/library/threading.html#event-objects
This is one of the simplest mechanisms for communication between
threads: one thread signals an event and other threads wait for it.
An event object manages an internal flag that can be set to true with
the set() method and reset to false with the clear() method. The
wait() method blocks until the flag is true.
So, the "final act" in your thread implementation would be to set an event object, and your main thread can wait until it's set.
Or, for an even fancier and more mechanism, use queues: https://docs.python.org/3/library/queue.html
Each thread writes an "I'm done" object to the queue when done, and the main thread can read those notifications from the queue in sequence as each thread completes.
from threading import Thread
class MyClass:
#...
def method2(self):
while True:
try:
hashes = self.target.bssid.replace(':','') + '.pixie'
text = open(hashes).read().splitlines()
except IOError:
time.sleep(5)
continue
# function goes on ...
def method1(self):
new_thread = Thread(target=self.method2())
new_thread.setDaemon(True)
new_thread.start() # Main thread will stop there, wait until method 2
print "Its continues!" # wont show =(
# function goes on ...
Is it possible to do like that?
After new_thread.start() Main thread waits until its done, why is that happening? i didn't provide new_thread.join() anywhere.
Daemon doesn't solve my problem because my problem is that Main thread stops right after new thread start, not because main thread execution is end.
As written, the call to the Thread constructor is invoking self.method2 instead of referring to it. Replace target=self.method2() with target=self.method2 and the threads will run in parallel.
Note that, depending on what your threads do, CPU computations might still be serialized due to the GIL.
IIRC, this is because the program doesn't exit until all non-daemon threads have finished execution. If you use a daemon thread instead, it should fix the issue. This answer gives more details on daemon threads:
Daemon Threads Explanation
Python's Queue has a join() method that will block until task_done() has been called on all the items that have been taken from the queue.
Is there a way to periodically check for this condition, or receive an event when it happens, so that you can continue to do other things in the meantime? You can, of course, check if the queue is empty, but that doesn't tell you if the count of unfinished tasks is actually zero.
The Python Queue itself does not support this, so you could try the following
from threading import Thread
class QueueChecker(Thread):
def __init__(self, q):
Thread.__init__(self)
self.q = q
def run(self):
q.join()
q_manager_thread = QueueChecker(my_q)
q_manager_thread.start()
while q_manager_thread.is_alive():
#do other things
#when the loop exits the tasks are done
#because the thread will have returned
#from blocking on the q.join and exited
#its run method
q_manager_thread.join() #to cleanup the thread
a while loop on the thread.is_alive() bit might not be exactly what you want, but at least you can see how to asynchronously check on the status of the q.join now.
I've a python program that spawns a number of threads. These threads last anywhere between 2 seconds to 30 seconds. In the main thread I want to track whenever each thread completes and print a message. If I just sequentially .join() all threads and the first thread lasts 30 seconds and others complete much sooner, I wouldn't be able to print a message sooner -- all messages will be printed after 30 seconds.
Basically I want to block until any thread completes. As soon as a thread completes, print a message about it and go back to blocking if any other threads are still alive. If all threads are done then exit program.
One way I could think of is to have a queue that is passed to all the threads and block on queue.get(). Whenever a message is received from the queue, print it, check if any other threads are alive using threading.active_count() and if so, go back to blocking on queue.get(). This would work but here all the threads need to follow the discipline of sending a message to the queue before terminating.
I'm wonder if this is the conventional way of achieving this behavior or are there any other / better ways ?
Here's a variation on #detly's answer that lets you specify the messages from your main thread, instead of printing them from your target functions. This creates a wrapper function which calls your target and then prints a message before terminating. You could modify this to perform any kind of standard cleanup after each thread completes.
#!/usr/bin/python
import threading
import time
def target1():
time.sleep(0.1)
print "target1 running"
time.sleep(4)
def target2():
time.sleep(0.1)
print "target2 running"
time.sleep(2)
def launch_thread_with_message(target, message, args=[], kwargs={}):
def target_with_msg(*args, **kwargs):
target(*args, **kwargs)
print message
thread = threading.Thread(target=target_with_msg, args=args, kwargs=kwargs)
thread.start()
return thread
if __name__ == '__main__':
thread1 = launch_thread_with_message(target1, "finished target1")
thread2 = launch_thread_with_message(target2, "finished target2")
print "main: launched all threads"
thread1.join()
thread2.join()
print "main: finished all threads"
The thread needs to be checked using the Thread.is_alive() call.
Why not just have the threads themselves print a completion message, or call some other completion callback when done?
You can the just join these threads from your main program, so you'll see a bunch of completion messages and your program will terminate when they're all done, as required.
Here's a quick and simple demonstration:
#!/usr/bin/python
import threading
import time
def really_simple_callback(message):
"""
This is a really simple callback. `sys.stdout` already has a lock built-in,
so this is fine to do.
"""
print message
def threaded_target(sleeptime, callback):
"""
Target for the threads: sleep and call back with completion message.
"""
time.sleep(sleeptime)
callback("%s completed!" % threading.current_thread())
if __name__ == '__main__':
# Keep track of the threads we create
threads = []
# callback_when_done is effectively a function
callback_when_done = really_simple_callback
for idx in xrange(0, 10):
threads.append(
threading.Thread(
target=threaded_target,
name="Thread #%d" % idx,
args=(10 - idx, callback_when_done)
)
)
[t.start() for t in threads]
[t.join() for t in threads]
# Note that thread #0 runs for the longest, but we'll see its message first!
What I would suggest is loop like this
while len(threadSet) > 0:
time.sleep(1)
for thread in theadSet:
if not thread.isAlive()
print "Thread "+thread.getName()+" terminated"
threadSet.remove(thread)
There is a 1 second sleep, so there will be a slight delay between the thread termination and the message being printed. If you can live with this delay, then I think this is a simpler solution than the one you proposed in your question.
You can let the threads push their results into a threading.Queue. Have another thread wait on this queue and print the message as soon as a new item appears.
I'm not sure I see the problem with using:
threading.activeCount()
to track the number of threads that are still active?
Even if you don't know how many threads you're going to launch before starting it seems pretty easy to track. I usually generate thread collections via list comprehension then a simple comparison using activeCount to the list size can tell you how many have finished.
See here: http://docs.python.org/library/threading.html
Alternately, once you have your thread objects you can just use the .isAlive method within the thread objects to check.
I just checked by throwing this into a multithread program I have and it looks fine:
for thread in threadlist:
print(thread.isAlive())
Gives me a list of True/False as the threads turn on and off. So you should be able to do that and check for anything False in order to see if any thread is finished.
I use a slightly different technique because of the nature of the threads I used in my application. To illustrate, this is a fragment of a test-strap program I wrote to scaffold a barrier class for my threading class:
while threads:
finished = set(threads) - set(threading.enumerate())
while finished:
ttt = finished.pop()
threads.remove(ttt)
time.sleep(0.5)
Why do I do it this way? In my production code, I have a time limit, so the first line actually reads "while threads and time.time() < cutoff_time". If I reach the cut-off, I then have code to tell the threads to shut down.
I start a bunch of threads working on a queue and I want to kill them when sending the SIGINT (Ctrl+C). What is the best way to handle this?
targets = Queue.Queue()
threads_num = 10
threads = []
for i in threads_num:
t = MyThread()
t.setDaemon(True)
threads.append(t)
t.start()
targets.join()
If you are not interested in letting the other threads shut down gracefully, simply start them in daemon mode and wrap the join of the queue in a terminator thread.
That way, you can make use of the join method of the thread -- which supports a timeout and does not block off exceptions -- instead of having to wait on the queue's join method.
In other words, do something like this:
term = Thread(target=someQueueVar.join)
term.daemon = True
term.start()
while (term.isAlive()):
term.join(3600)
Now, Ctrl+C will terminate the MainThread whereupon the Python Interpreter hard-kills all threads marked as "daemons". Do note that this means that you have to set "Thread.daemon" for all the other threads or shut them down gracefully by catching the correct exception (KeyboardInterrupt or SystemExit) and doing whatever needs to be done for them to quit.
Do also note that you absolutely need to pass a number to term.join(), as otherwise it will, too, ignore all exceptions. You can select an arbitrarily high number, though.
Isn't Ctrl+C SIGINT?
Anyway, you can install a handler for the appropriate signal, and in the handler:
set a global flag that instructs the workers to exit, and make sure they check it periodically
or put 10 shutdown tokens on the queue, and have the workers exit when they pop this magic token
or set a flag which instructs the main thread to push those tokens, make sure the main thread checks that flag
etc. Mostly it depends on the structure of the application you're interrupting.
One way to do it is to install a signal handler for SIGTERM that directly calls os._exit(signal.SIGTERM). However unless you specify the optional timeout argument to Queue.get the signal handler function will not run until after the get method returns. (That's completely undocumented; I discovered that on my own.) So you can specify sys.maxint as the timeout and put your Queue.get call in a retry loop for purity to get around that.
Why don't you set timeouts for any operation on the queue? Then your threads can regular check if they have to finish by checking if an Event is raised.
This is how I tackled this.
class Worker(threading.Thread):
def __init__(self):
self.shutdown_flag = threading.Event()
def run(self):
logging.info('Worker started')
while not self.shutdown_flag.is_set():
try:
task = self.get_task_from_queue()
except queue.Empty:
continue
self.process_task(task)
def get_task_from_queue(self) -> Task:
return self.task_queue.get(block=True, timeout=10)
def shutdown(self):
logging.info('Shutdown received')
self.shutdown_flag.set()
Upon receiving a signal the main thread sets the shutdown event on workers. The workers wait on a blocking queue, but keep checking every 10 seconds if they have received a shutdown signal.
I managed to solve the problem by emptying the queue on KeyboardInterrupt and letting threads to gracefully stop themselves.
I don't know if it's the best way to handle this but is simple and quite clean.
targets = Queue.Queue()
threads_num = 10
threads = []
for i in threads_num:
t = MyThread()
t.setDaemon(True)
threads.append(t)
t.start()
while True:
try:
# If the queue is empty exit loop
if self.targets.empty() is True:
break
# KeyboardInterrupt handler
except KeyboardInterrupt:
print "[X] Interrupt! Killing threads..."
# Substitute the old queue with a new empty one and exit loop
targets = Queue.Queue()
break
# Join every thread on the queue normally
targets.join()