Self-taught programming student, so I apologize for all the amateur mistakes. I want to learn some deeper subjects, so I'm trying to understand threading, and exception handling.
import threading
import sys
from time import sleep
from random import randint as r
def waiter(n):
    print "Starting thread " + str(n)
    wait_time = r(1,10)
    sleep(wait_time)
    print "Exiting thread " + str(n)

if __name__ == '__main__':
    try:
        for i in range(5):
            t = threading.Thread(target=waiter, args=(i+1,))
            t.daemon = True
            t.start()
            sleep(3)
        print 'All threads complete!'
        sys.exit(1)
    except KeyboardInterrupt:
        print ''
        sys.exit(1)
This script just starts threads that exit after a random time, and it will kill the program if it receives a ^C. I've noticed that it doesn't print when some threads finish:
Starting thread 1
Starting thread 2
Starting thread 3
Exiting thread 3
Exiting thread 2
Starting thread 4
Exiting thread 1
Exiting thread 4
Starting thread 5
All threads complete!
In this example, it never states it exits thread 5. I find I can fix this if I comment out the t.daemon = True statement, but then exception handling waits for any threads to finish up.
Starting thread 1
Starting thread 2
^C
Exiting thread 1
Exiting thread 2
I can understand that when dealing with threads, it's best that they complete what they're handling before exiting, but I'm just curious as to why this is. I'd really appreciate any answers regarding the nature of threading and daemons to guide my understanding.
The whole point of a daemon thread is that if it's not finished by the time the main thread finishes, it gets summarily killed. Quoting the docs:
A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left. The initial value is inherited from the creating thread. The flag can be set through the daemon property or the daemon constructor argument.
Note: Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.
Now, look at your logic. The main thread only sleeps for 3 seconds after starting thread 5. But thread 5 can sleep for anywhere from 1 to 10 seconds. So, about 70% of the time, it's not going to be finished by the time the main thread wakes up, prints "All threads complete!", and exits, while thread 5 is still asleep for several more seconds. In that case thread 5 is killed without ever getting to print "Exiting thread 5".
If this isn't the behavior you want—if you want the main thread to wait for all the threads to finish—then don't use daemon threads.
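For reference, here is a minimal sketch of that non-daemon approach, adapted from the question's code rather than taken from the answer: leave the threads non-daemonic and join() each one before declaring them complete.

import threading
from time import sleep
from random import randint as r

def waiter(n):
    print "Starting thread " + str(n)
    sleep(r(1, 10))
    print "Exiting thread " + str(n)

threads = []
for i in range(5):
    t = threading.Thread(target=waiter, args=(i + 1,))   # non-daemon by default
    threads.append(t)
    t.start()
    sleep(3)

for t in threads:
    t.join()    # blocks until that particular thread has really finished
print 'All threads complete!'

As the question already observed, the trade-off is that Ctrl+C handling now effectively waits for the remaining threads to finish before the program exits.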
Related
I would like to run a process in a thread (which is iterating over a large database table). While the thread is running, I just want the program to wait. If that thread takes longer then 30 seconds, I want to kill the thread and do something else. By killing the thread, I mean that I want it to cease activity and release resources gracefully.
I figured the best way to do this was through a Thread()'s join(delay) and is_alive() functions, and an Event. Using the join(delay) I can have my program wait 30 seconds for the thread to finish, and by using the is_alive() function I can determine if the thread has finished its work. If it hasn't finished its work, the event is set, and the thread knows to stop working at that point.
Is this approach valid, and is this the most pythonic way to go about my problem statement?
Here is some sample code:
import threading
import time
# The worker loops for about 1 minute adding numbers to a set
# unless the event is set, at which point it breaks the loop and terminates
def worker(e):
    data = set()
    for i in range(60):
        data.add(i)
        if not e.isSet():
            print "foo"
            time.sleep(1)
        else:
            print "bar"
            break

e = threading.Event()
t = threading.Thread(target=worker, args=(e,))
t.start()

# wait 30 seconds for the thread to finish its work
t.join(30)
if t.is_alive():
    print "thread is not done, setting event to kill thread."
    e.set()
else:
    print "thread has already finished."
Using an Event in this case works just fine as the signalling mechanism, and is actually recommended in the threading module docs:
If you want your threads to stop gracefully, make them non-daemonic and use a
suitable signalling mechanism such as an Event.
When verifying thread termination, timeouts almost always introduce room for
error. Therefore, while using the .join() with a timeout for the initial
decision to trigger the event is fine, final verification should be made using a
.join() without a timeout.
# wait 30 seconds for the thread to finish its work
t.join(30)
if t.is_alive():
    print "thread is not done, setting event to kill thread."
    e.set()
    # The thread can still be running at this point. For example, if the
    # thread's call to isSet() returns right before this call to set(), then
    # the thread will still perform the full 1 second sleep and the rest of
    # the loop before finally stopping.
else:
    print "thread has already finished."

# Thread can still be alive at this point. Do another join without a timeout
# to verify thread shutdown.
t.join()
This can be simplified to something like this:
# Wait for at most 30 seconds for the thread to complete.
t.join(30)
# Always signal the event. Whether the thread has already finished or not,
# the result will be the same.
e.set()
# Now join without a timeout knowing that the thread is either already
# finished or will finish "soon."
t.join()
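For completeness, here is a self-contained sketch of that simplified pattern, reusing the worker from the question above; this is a sketch, not code from the original answer.

import threading
import time

def worker(e):
    data = set()
    for i in range(60):
        data.add(i)
        if not e.isSet():
            print "foo"
            time.sleep(1)
        else:
            print "bar"
            break

e = threading.Event()
t = threading.Thread(target=worker, args=(e,))
t.start()

t.join(30)   # wait at most 30 seconds for the work to finish
e.set()      # harmless whether or not the worker already finished
t.join()     # the worker is now either done or will stop "soon"
print "worker thread has stopped"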
I'm way late to this game, but I've been wrestling with a similar question, and the following appears to both resolve the issue perfectly for me and let me do some basic thread state checking and cleanup when the daemonized sub-thread exits:
import threading
import time
import atexit
def do_work():
    i = 0

    @atexit.register
    def goodbye():
        print ("'CLEANLY' kill sub-thread with value: %s [THREAD: %s]" %
               (i, threading.currentThread().ident))

    while True:
        print i
        i += 1
        time.sleep(1)

t = threading.Thread(target=do_work)
t.daemon = True
t.start()

def after_timeout():
    print "KILL MAIN THREAD: %s" % threading.currentThread().ident
    raise SystemExit

threading.Timer(2, after_timeout).start()
Yields:
0
1
KILL MAIN THREAD: 140013208254208
'CLEANLY' kill sub-thread with value: 2 [THREAD: 140013674317568]
I was also struggling to close a thread that was waiting to receive a notification.
I tried the solution given here by user5737269, but it didn't really work for me: it was getting stuck in the second join statement (the one without a timeout). I struggled a lot and finally arrived at this solution after some thought:
My thread is waiting to receive a message in a queue. I want to close this thread if no notification is received for 20 seconds. So, after 20 seconds, I write a message to the queue so that the thread terminates on its own.
Here's the code:
q = Queue.Queue()
t.join(20)
if t.is_alive():
    print("STOPPING THIS THREAD ....")
    q.put("NO NOTIFICATION RECEIVED")
    t.join(20)
else:
    print("Thread completed successfully!!")
This worked for me. Hope this idea helps someone!
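The snippet above omits the worker, so here is a self-contained sketch of the same sentinel-message idea; the worker function and the message strings are made up for illustration.

import threading
import Queue

q = Queue.Queue()

# Hypothetical worker: it blocks on q.get() until a message arrives and
# treats the sentinel string as a signal to stop.
def worker(q):
    msg = q.get()          # blocks until something is put on the queue
    if msg == "NO NOTIFICATION RECEIVED":
        print("Worker received the stop sentinel, exiting.")
        return
    print("Worker got a real notification: %s" % msg)

t = threading.Thread(target=worker, args=(q,))
t.start()

# Wait up to 20 seconds for a real notification to end the work; otherwise
# push the sentinel so the blocked q.get() returns and the thread exits.
t.join(20)
if t.is_alive():
    print("STOPPING THIS THREAD ....")
    q.put("NO NOTIFICATION RECEIVED")
    t.join(20)
else:
    print("Thread completed successfully!!")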
I have a question about Python programming. I am writing code that has a thread. This thread is a blocked thread: it waits for an event, and if the event is not set, it must wait until the event is set. My expectation is that the blocked thread waits for the event without any timeout.
After starting the blocked thread, I run a forever loop that increments a counter. The problem is: when I want to terminate my Python program with Ctrl+C, I cannot terminate the blocked thread correctly. The thread is still alive! My code is here:
import threading
import time
def wait_for_event(e):
    while True:
        """Wait for the event to be set before doing anything"""
        e.wait()
        e.clear()
        print "In wait_for_event"

e = threading.Event()
t1 = threading.Thread(name='block',
                      target=wait_for_event,
                      args=(e,))
t1.start()

# Check t1 thread is alive or not
print "Before while True. t1 is alive: %s" % t1.is_alive()

counter = 0
while True:
    try:
        time.sleep(1)
        counter = counter + 1
        print "counter: %d " % counter
    except KeyboardInterrupt:
        print "In KeyboardInterrupt branch"
        break

print "Out of while True"

# Check t1 thread is alive
print "After while True. t1 is alive: %s" % t1.is_alive()
Output:
$ python thread_test1.py
Before while True. t1 is alive: True
counter: 1
counter: 2
counter: 3
^CIn KeyboardInterrupt branch
Out of while True
After while True. t1 is alive: True
Could anyone give me some help? I want to ask two questions.
1. Can I stop a blocked thread with Ctrl+C? If I can, please point me in a feasible direction.
2. If we stop the Python program with Ctrl+\ or reset the hardware (for example, the PC) that is running the Python program, will the blocked thread be terminated or not?
Ctrl+C stops only the main thread. Your threads aren't in daemon mode; that's why they keep running, and that's what keeps the process alive. First, make your threads daemon threads:
t1 = threading.Thread(name='block',
                      target=wait_for_event,
                      args=(e,))
t1.daemon = True
t1.start()
Similarly for your other threads. But there's another problem: once the main thread has started your threads and has nothing else to do, it exits, and the daemon threads are destroyed instantly. So let's keep the main thread alive:
import time
while True:
    time.sleep(1)
Please have a look at this; I hope you will find your other answers there.
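Putting the two pieces together, a minimal sketch of the fixed program might look like this (assuming the same wait_for_event worker as in the question); this is a sketch, not the answer's own code.

import threading
import time

def wait_for_event(e):
    while True:
        e.wait()
        e.clear()
        print "In wait_for_event"

e = threading.Event()
t1 = threading.Thread(name='block', target=wait_for_event, args=(e,))
t1.daemon = True      # daemon thread: it is killed when the main thread exits
t1.start()

try:
    while True:
        time.sleep(1)     # keep the main thread alive until Ctrl+C
except KeyboardInterrupt:
    print "Interrupted; exiting, and the daemon thread dies with the process."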
If you need to kill all running Python processes, you can simply run pkill python from the command line.
This is a little bit extreme, but it would work.
Another solution would be to use locking inside your code; see here:
I'm trying to implement a simple thread pool in Python.
I start a few threads with the following code:
threads = []
for i in range(10):
    t = threading.Thread(target=self.workerFuncSpinner(
        taskOnDeckQueue, taskCompletionQueue, taskErrorQueue, i))
    t.setDaemon(True)
    threads.append(t)
    t.start()

for thread in threads:
    thread.join()
At this point, the worker thread only prints when it starts and exits, and time.sleep()s in between. The problem is that instead of getting output like:
#All output at the same time
thread 1 starting
thread 2 starting
thread n starting
# 5 seconds pass
thread 1 exiting
thread 2 exiting
thread n exiting
I get:
thread 1 starting
# 5 seconds pass
thread 1 exiting
thread 2 starting
# 5 seconds pass
thread 2 exiting
thread n starting
# 5 seconds pass
thread n exiting
And when I do a threading.current_thread(), they all report they are MainThread.
It's like they're not even threads, but running in the main thread's context.
Help?
Thanks
You are calling workerFuncSpinner in the main thread when creating the Thread object. Use a reference to the method instead:
t = threading.Thread(target=self.workerFuncSpinner,
                     args=(taskOnDeckQueue, taskCompletionQueue, taskErrorQueue, i))
Your original code:
t = threading.Thread(target=self.workerFuncSpinner(
    taskOnDeckQueue, taskCompletionQueue, taskErrorQueue, i))
t.start()
could be rewritten as
# call the method in the main thread
spinner = self.workerFuncSpinner(
    taskOnDeckQueue, taskCompletionQueue, taskErrorQueue, i)
# create a thread that will call whatever `self.workerFuncSpinner` returned,
# with no arguments
t = threading.Thread(target=spinner)
# run whatever workerFuncSpinner returned in background thread
t.start()
You were calling the method serially in the main thread, while the created threads ran nothing at all.
I suspect workerFuncSpinner may be your problem. I would verify that it is not actually running the task, but returning a callable object for the thread to run.
https://docs.python.org/2/library/threading.html#threading.Thread
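To make the difference concrete, here is a small self-contained sketch (with a hypothetical worker() standing in for self.workerFuncSpinner, and no queues) showing the wrong call-it-now form next to the correct pass-the-callable form:

import threading
import time

def worker(n):
    print "worker %d starting" % n
    time.sleep(1)
    print "worker %d exiting" % n

# Wrong: worker(i) is called right here, in the main thread, one call at a
# time; each Thread is then given its return value (None) as the target.
for i in range(3):
    t = threading.Thread(target=worker(i))
    t.start()

# Right: pass the function itself plus its arguments, so the call happens
# inside the new thread and all three workers run concurrently.
threads = []
for i in range(3):
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()
for t in threads:
    t.join()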
I have the following script in Python that calls a function every X seconds by creating a new thread:
def function():
    threading.Timer(X, function).start()
    do_something

function()
My question is: what if the function takes 2*X seconds to execute? Since I'm using threading, this should not be a problem, right? I will have more "instances" of the function running at the same time, but once each one finishes, its thread should be destroyed. Thanks
If the function takes 2*X seconds, then you're going to have multiple instances of function running concurrently. It's easy to see with an example:
import threading
import time
X = 2
def function():
    print("Thread {} starting.".format(threading.current_thread()))
    threading.Timer(X, function).start()
    time.sleep(2*X)
    print("Thread {} done.".format(threading.current_thread()))

function()
Output:
Thread <_MainThread(MainThread, started 140115183785728)> starting.
Thread <_Timer(Thread-1, started 140115158210304)> starting.
Thread <_MainThread(MainThread, started 140115183785728)> done.
Thread <_Timer(Thread-2, started 140115149817600)> starting.
Thread <_Timer(Thread-3, started 140115141424896)> starting.
Thread <_Timer(Thread-1, started 140115158210304)> done.
Thread <_Timer(Thread-4, started 140115133032192)> starting.
Thread <_Timer(Thread-2, started 140115149817600)> done.
Thread <_Timer(Thread-3, started 140115141424896)> done.
Thread <_Timer(Thread-5, started 140115158210304)> starting.
Thread <_Timer(Thread-6, started 140115141424896)> starting.
Thread <_Timer(Thread-4, started 140115133032192)> done.
Thread <_Timer(Thread-7, started 140115149817600)> starting.
Thread <_Timer(Thread-5, started 140115158210304)> done.
Thread <_Timer(Thread-8, started 140115133032192)> starting.
Thread <_Timer(Thread-6, started 140115141424896)> done.
Thread <_Timer(Thread-9, started 140115158210304)> starting.
Thread <_Timer(Thread-7, started 140115149817600)> done.
Thread <_Timer(Thread-10, started 140115141424896)> starting.
Thread <_Timer(Thread-8, started 140115133032192)> done.
Thread <_Timer(Thread-11, started 140115149817600)> starting.
<And on and on forever and ever>
As you can see from the output, this is also an infinite loop, so the program will never end.
If it's safe for multiple instances of function to run at the same time, then this is fine. If it's not, then you need to protect the not-thread-safe part of function with a lock:
import threading
import time
X = 2
lock = threading.Lock()
def function():
    with lock:
        print("Thread {} starting.".format(threading.current_thread()))
        threading.Timer(X, function).start()
        time.sleep(2*X)
        print("Thread {} done.".format(threading.current_thread()))

function()
Output:
Thread <_MainThread(MainThread, started 140619426387712)> starting.
Thread <_MainThread(MainThread, started 140619426387712)> done.
Thread <_Timer(Thread-1, started 140619400812288)> starting.
Thread <_Timer(Thread-1, started 140619400812288)> done.
Thread <_Timer(Thread-2, started 140619392419584)> starting.
Thread <_Timer(Thread-2, started 140619392419584)> done.
Thread <_Timer(Thread-3, started 140619381606144)> starting.
Thread <_Timer(Thread-3, started 140619381606144)> done.
Thread <_Timer(Thread-4, started 140619392419584)> starting.
Thread <_Timer(Thread-4, started 140619392419584)> done.
Thread <_Timer(Thread-5, started 140619381606144)> starting.
One final note: because of the Global Interpreter Lock, in CPython only one thread can ever actually execute bytecode at a time. So when you use threads, you're not really improving performance for CPU-bound tasks, because only one thread is ever actually executing at a time. Instead, the OS ends up frequently switching between all the threads, giving each a bit of CPU time. This will generally end up being slower than a single-threaded approach, because of the added overhead of switching between the threads. If you're planning on doing CPU-bound work in each thread, you may want to use multiprocessing instead.
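As a rough illustration of that last point (a sketch, not part of the original answer), CPU-bound work can be farmed out to worker processes with multiprocessing.Pool; the cpu_bound function here is just a placeholder:

import multiprocessing

def cpu_bound(n):
    # placeholder CPU-heavy work
    total = 0
    for i in xrange(n):
        total += i * i
    return total

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    # each call runs in its own process, so the GIL is not a bottleneck
    results = pool.map(cpu_bound, [10 ** 6] * 4)
    pool.close()
    pool.join()
    print(results)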
In theory you could have 3 active threads running at any given time: one that is just about to end, one that's in the middle of a run, and one that's just been spawned.
|-----|
   |-----|
      |-----|
In practice, you might end up with a few more:
import threading
import logging
logger = logging.getLogger(__name__)
import time
def function():
    threading.Timer(X, function).start()
    logger.info('{} active threads'.format(threading.active_count()))
    time.sleep(2*X)

logging.basicConfig(level=logging.DEBUG,
                    format='[%(asctime)s %(threadName)s] %(message)s',
                    datefmt='%H:%M:%S')
X = 3
function()
yields
[16:12:13 MainThread] 2 active threads
[16:12:16 Thread-1] 3 active threads
[16:12:19 Thread-2] 4 active threads
[16:12:22 Thread-3] 4 active threads
[16:12:25 Thread-4] 5 active threads
[16:12:28 Thread-5] 4 active threads
[16:12:31 Thread-6] 4 active threads
[16:12:34 Thread-7] 4 active threads
[16:12:37 Thread-8] 5 active threads
[16:12:40 Thread-9] 4 active threads
[16:12:43 Thread-10] 5 active threads
[16:12:46 Thread-11] 5 active threads
I don't see any inherent problem with this; you just have to be aware of what it's doing.
You may run into a race condition if one instance of the function is writing to a resource while another is trying to read that same resource.
http://en.wikipedia.org/wiki/Multithreading_(computer_architecture)#Disadvantages
Can you set up a test so that you can experiment with the behavior that you are concerned about?
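For example, a sketch of such a test (not from the original answer): two threads hammer a shared counter with and without a lock, which on CPython 2 will typically show lost updates in the unlocked case.

import threading

ITERATIONS = 1000000
counter = 0
lock = threading.Lock()

def unsafe_worker():
    global counter
    for _ in range(ITERATIONS):
        counter += 1          # read-modify-write, not atomic

def safe_worker():
    global counter
    for _ in range(ITERATIONS):
        with lock:
            counter += 1      # the lock serializes the update

def run(worker):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print("without lock: %d (expected %d)" % (run(unsafe_worker), 2 * ITERATIONS))
print("with lock:    %d (expected %d)" % (run(safe_worker), 2 * ITERATIONS))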
from thread import start_new_thread
num_threads = 0
def heron(a):
    global num_threads
    num_threads += 1
    # code has been left out, see above
    num_threads -= 1
    return new

start_new_thread(heron, (99,))
start_new_thread(heron, (999,))
start_new_thread(heron, (1733,))
start_new_thread(heron, (17334,))

while num_threads > 0:
    pass
This is simple thread code. I want to know: on the last line, why do we use the while loop?
The final while-loop waits for all of the threads to finish before the main thread exits.
It is an expensive check (100% CPU for the spin-wait). You can improve it in one of two ways:
while num_threads > 0:
    time.sleep(0.1)
or by tracking all the threads in a list and joining them one-by-one:
for worker in worker_threads:
    worker.join()
We want to keep the process alive until all children finish their work, so we must keep executing something in the main thread as long as any child is alive; hence the check of the num_threads variable.
If it weren't for this, all child threads would be killed as soon as the main thread finished its work, regardless of whether they had actually finished theirs, so waiting for them is mandatory to ensure everything is done.
To build on Raymond Hettinger's answer: the parent process starts a number of threads, each of which does work. We then wait for each to exit, so that we can collect and process their output. In this case each worker just outputs to the screen, so the parent just has to join() each task to make sure it ran and exited correctly.
Here's an alternate way to code the above. It uses the higher-level threading library (vs. thread) and only calls join() on threads other than the current one. We also use threading.enumerate() instead of manually keeping track of worker threads -- easier.
code:
import threading
def heron(a):
    print '{}: a={}'.format(threading.current_thread(), a)

threading.Thread(target=heron, args=(99,)).start()
threading.Thread(target=heron, args=(999,)).start()
threading.Thread(target=heron, args=(1733,)).start()
threading.Thread(target=heron, args=(17334,)).start()

print
print '{} threads, joining'.format(threading.active_count())
for thread in threading.enumerate():
    print '- {} join'.format(thread)
    if thread == threading.current_thread():
        continue
    thread.join()

print 'done'
Example output:
python ./jointhread.py
<Thread(Thread-1, started 140381408802560)>: a=99
<Thread(Thread-2, started 140381400082176)>: a=999
<Thread(Thread-3, started 140381400082176)>: a=1733
2 threads, joining
- <_MainThread(MainThread, started 140381429581632)> join
- <Thread(Thread-4, started 140381408802560)> join
<Thread(Thread-4, started 140381408802560)>: a=17334
done