I am calling a 3rd party function that I have not written, so I cannot edit it.
It has its own logic that may take some time to finish.
I wrapped that 3rd party call in a thread, and I want to cancel it immediately once I receive some event.
import threading
import time

# Assume this is the 3rd party function
def call_3rd_party_function():
    for i in range(50):
        print("Executing task " + str(i))
        time.sleep(0.25)

def task(stop_event: threading.Event):
    print("Starting background thread!")
    while not stop_event.is_set():
        call_3rd_party_function()

stop_event = threading.Event()
t = threading.Thread(target=task, args=(stop_event,))
t.daemon = True  # setDaemon() is deprecated; set the attribute instead
t.start()

time.sleep(5)
stop_event.set()
t.join()
print("Main thread end!")
In the code above, I want to cancel the task after 5 seconds, for example, but the output is not what I want: the 3rd party function still continues its tasks to completion.
Starting background thread!
Executing task 0
Executing task 1
Executing task 2
Executing task 3
.
.
Executing task 49
Main thread end!
Is there a way to do this? I wrapped it in a thread since I want it to run on its own so as not to block the main thread from doing some other things.
Any hints?
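For reference, when the call cannot be interrupted cooperatively (the event is only checked between calls, never during one), a common workaround is to run it in a separate process instead of a thread, since a process can be terminated from the outside. A minimal sketch, where the worker below stands in for the real 3rd party call:

```python
import multiprocessing
import time

# Stand-in for the real, uninterruptible 3rd party function
def call_3rd_party_function():
    for i in range(50):
        print("Executing task " + str(i))
        time.sleep(0.25)

if __name__ == "__main__":
    p = multiprocessing.Process(target=call_3rd_party_function)
    p.start()

    time.sleep(5)          # or: wait for your event instead
    if p.is_alive():
        p.terminate()      # forcibly stops the process, mid-task
        p.join()
    print("Main process end!")
```

Note that terminate() ends the process abruptly and skips any cleanup code, so it is only safe if the third-party work holds no resources that need graceful release.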
Related
I have set up a thread pool executor with 4 threads. I have added 2 items to my queue to be processed. When I submit the tasks and retrieve futures, it appears that the other 2 threads, which are not processing items from the queue, keep running and hang, even though they are not processing anything!
import time
import queue
import concurrent.futures

def _read_queue(queue):
    msg = queue.get()
    time.sleep(2)
    queue.task_done()

n_threads = 4
q = queue.Queue()
q.put('test')
q.put("test2")

with concurrent.futures.ThreadPoolExecutor(max_workers=n_threads) as pool:
    futures = []
    for _ in range(n_threads):
        future = pool.submit(_read_queue, q)
        print(future.running())

print("Why am I running forever?")  # never reached: the pool never shuts down
How can I adjust my code so that threads that are not processing anything from the queue are shutdown so my program can terminate?
Because the queue.get() operation blocks your ThreadPoolExecutor threads.
for _ in range(n_threads):
    future = pool.submit(_read_queue, q)
    print(future.running())
Let's examine future = pool.submit(_read_queue, q) in each iteration of the for loop.
In the first iteration of the for loop, pool.submit(_read_queue, q) puts a job into the ThreadPoolExecutor's internal queue (its name is self._work_queue). When a job is put into this internal queue, the submit method creates a thread; call it thread1 (I say thread1, thread2, ... for easier understanding). This thread executes the _read_queue function (this can happen immediately, or only after the fourth iteration of the for loop; the ordering depends on the operating system scheduler), and queue.get() returns "test". Then this thread sleeps for 2 seconds.
In the second iteration of the for loop, pool.submit(_read_queue, q) puts a job into the ThreadPoolExecutor's internal queue, and then the submit method checks whether any thread is waiting for a job. No, there is no waiting thread; the first thread is sleeping (for 2 seconds). So the submit method performs the steps below:
if "there is a thread which will accept a job immediately":  # Step 1
    return
# Step 2
if number_of_created_threads (now this is 1) < self._max_workers:
    threading.Thread()...  # create a new thread
And then the submit method creates a new thread, thread2, which executes the _read_queue function; queue.get() returns "test2", and this thread sleeps for 2 seconds. At this point the q queue object is empty, so any subsequent get() call will block the calling thread.
In the third iteration of the for loop, the submit method puts a job into the ThreadPoolExecutor's internal queue and then checks whether any thread is waiting for a job. There is no waiting thread: the first thread is sleeping (for 2 seconds) and the second thread is also sleeping. So the submit method creates a new thread, thread3 (it checks both Step 1 and Step 2), and this thread executes the _read_queue function just like the others. When thread3 runs, it executes queue.get(), but this blocks thread3: the q queue object is empty, and calling get(block=True) on an empty queue blocks the calling thread.
In the fourth iteration of the for loop, the same thing happens as in the third, and thread4 blocks on the queue.get() operation.
Assume the 2 seconds have not passed yet; at this point there are 5 threads alive (whether sleeping or not), counting the main thread. After the 2 seconds pass, thread1 and thread2 terminate*1 (because time.sleep(2) returns), but thread3 and thread4 do not, because queue.get() is blocking them. That's why your main thread (the whole program) waits for them and does not terminate.
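The blocking behaviour of get() on an empty queue is easy to demonstrate in isolation; here a timeout is used so the demonstration itself does not hang:

```python
import queue

q = queue.Queue()
try:
    # With a timeout, get() blocks for up to 0.5 s and then raises queue.Empty.
    q.get(timeout=0.5)
except queue.Empty:
    print("get() timed out on the empty queue")
# A plain q.get() (block=True, no timeout) would block here forever,
# which is exactly what happens to thread3 and thread4 above.
```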
What can we do in this situation?
We can put two more elements into the q object: q.get() blocks the calling thread until an item becomes available, and in this code the only way to unblock it is for another thread to call queue.put(something).
Here is one of the solutions:
import time
import queue
from concurrent import futures

def _read_queue(queue):
    msg = queue.get()
    time.sleep(2)
    queue.put(None)  # put an item back so the next blocked get() returns

n_threads = 4
q = queue.Queue()
q.put('test')
q.put("test2")

with futures.ThreadPoolExecutor(max_workers=n_threads) as pool:
    submitted = []  # a name that does not shadow the imported futures module
    for _ in range(n_threads):
        submitted.append(pool.submit(_read_queue, q))
*1: I said the ThreadPoolExecutor threads terminate after their function finishes, but this actually depends on the shutdown() method being called. If we don't call the shutdown() method of the pool object, the threads do not terminate even after their function finishes, because creating and destroying a thread is costly; that is the whole point of the thread-pool concept. (The shutdown() method is called at the end of the with statement.)
If I'm wrong somewhere please correct me.
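To make the shutdown() point above concrete, here is a small sketch (the work function and values are illustrative) where the pool is shut down explicitly instead of via a with statement:

```python
from concurrent import futures
import time

def work(x):
    time.sleep(0.1)
    return x * 2

pool = futures.ThreadPoolExecutor(max_workers=2)
fs = [pool.submit(work, i) for i in range(4)]
results = [f.result() for f in fs]
print(results)            # [0, 2, 4, 6]
pool.shutdown(wait=True)  # what the end of a `with` block does implicitly
```

Until shutdown() is called, the worker threads stay alive waiting for more jobs, even though every submitted function has already returned.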
Let's say I want to run 10 threads at the same time, and after one finishes, start a new one immediately. How can I do that?
I know that with thread.join() I can wait for a thread to finish, but then all 10 threads would need to finish first; I want to start a new one immediately after any single one finishes.
Well, what I understand is that you need at most 10 threads executing at the same time.
I suggest you use threading.BoundedSemaphore().
Sample code using it is given below:
import threading
from typing import List

sema4 = threading.BoundedSemaphore(10)
# 10 is given as the parameter since your requirement stated that you need
# just 10 threads executing in parallel

def do_something():
    try:
        print("I hope this cleared your doubt :)")
    finally:
        sema4.release()  # free a slot so that the next thread can start

threads_list: List[threading.Thread] = []
# The above variable is used to save the threads

for i in range(100):
    sema4.acquire()  # blocks while 10 threads are already running
    thread = threading.Thread(target=do_something)
    threads_list.append(thread)  # saving the thread in order to join it later
    thread.start()  # starting the thread

for thread in threads_list:
    thread.join()  # else, the parent program terminates without waiting for child threads
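For comparison (an alternative to the semaphore approach, not part of the answer above), the standard library's concurrent.futures.ThreadPoolExecutor gives the same "at most 10 running, start the next as soon as one finishes" behaviour without manual bookkeeping; the work function here is illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def do_something(i):
    return i * i

# At most 10 worker threads run at once; as each task finishes,
# the freed thread immediately picks up the next one.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(do_something, range(100)))

print(results[:5])  # [0, 1, 4, 9, 16]
```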
I would like to run a process in a thread (which is iterating over a large database table). While the thread is running, I just want the program to wait. If that thread takes longer then 30 seconds, I want to kill the thread and do something else. By killing the thread, I mean that I want it to cease activity and release resources gracefully.
I figured the best way to do this was through a Thread()'s join(delay) and is_alive() functions, and an Event. Using the join(delay) I can have my program wait 30 seconds for the thread to finish, and by using the is_alive() function I can determine if the thread has finished its work. If it hasn't finished its work, the event is set, and the thread knows to stop working at that point.
Is this approach valid, and is this the most pythonic way to go about my problem statement?
Here is some sample code:
import threading
import time

# The worker loops for about 1 minute adding numbers to a set
# unless the event is set, at which point it breaks the loop and terminates
def worker(e):
    data = set()
    for i in range(60):
        data.add(i)
        if not e.is_set():
            print("foo")
            time.sleep(1)
        else:
            print("bar")
            break

e = threading.Event()
t = threading.Thread(target=worker, args=(e,))
t.start()

# wait 30 seconds for the thread to finish its work
t.join(30)
if t.is_alive():
    print("thread is not done, setting event to kill thread.")
    e.set()
else:
    print("thread has already finished.")
Using an Event in this case works just fine as the signalling mechanism, and is actually recommended in the threading module docs:
If you want your threads to stop gracefully, make them non-daemonic and use a
suitable signalling mechanism such as an Event.
When verifying thread termination, timeouts almost always introduce room for
error. Therefore, while using the .join() with a timeout for the initial
decision to trigger the event is fine, final verification should be made using a
.join() without a timeout.
# wait 30 seconds for the thread to finish its work
t.join(30)
if t.is_alive():
    print("thread is not done, setting event to kill thread.")
    e.set()
    # The thread can still be running at this point. For example, if the
    # thread's call to is_set() returns right before this call to set(), then
    # the thread will still perform the full 1 second sleep and the rest of
    # the loop before finally stopping.
else:
    print("thread has already finished.")

# The thread can still be alive at this point. Do another join without a timeout
# to verify thread shutdown.
t.join()
This can be simplified to something like this:
# Wait for at most 30 seconds for the thread to complete.
t.join(30)
# Always signal the event. Whether the thread has already finished or not,
# the result will be the same.
e.set()
# Now join without a timeout knowing that the thread is either already
# finished or will finish "soon."
t.join()
I'm way late to this game, but I've been wrestling with a similar question and the following appears to both resolve the issue perfectly for me AND lets me do some basic thread state checking and cleanup when the daemonized sub-thread exits:
import threading
import time
import atexit

def do_work():
    i = 0

    @atexit.register
    def goodbye():
        print("'CLEANLY' kill sub-thread with value: %s [THREAD: %s]" %
              (i, threading.current_thread().ident))

    while True:
        print(i)
        i += 1
        time.sleep(1)

t = threading.Thread(target=do_work)
t.daemon = True
t.start()

def after_timeout():
    print("KILL MAIN THREAD: %s" % threading.current_thread().ident)
    raise SystemExit

threading.Timer(2, after_timeout).start()
Yields:
0
1
KILL MAIN THREAD: 140013208254208
'CLEANLY' kill sub-thread with value: 2 [THREAD: 140013674317568]
I was also struggling to close a thread that was waiting to receive a notification.
I tried the solution given here by user5737269, but it didn't really work for me: it was getting stuck in the second join statement (the one without a timeout). I struggled a lot but didn't find any solution to this problem, then came up with this one after thinking for some time:
My thread is waiting to receive a message in a queue. I want to close this thread if no notification is received for 20 seconds. So after 20 seconds, I write a message to this queue myself, so that the thread terminates on its own.
Here's code:
import queue

q = queue.Queue()
# ... t is the worker thread that blocks on q.get() (setup omitted) ...
t.join(20)
if t.is_alive():
    print("STOPPING THIS THREAD ....")
    q.put("NO NOTIFICATION RECEIVED")
    t.join(20)
else:
    print("Thread completed successfully!!")
This worked for me. Hope this idea helps someone!
As the title describes, I create a separate thread to do a long task in Flask.
import schedule
import time

start_time = time.time()

def job():
    print("I'm working..." + str(time.time() - start_time))

def run_schedule():
    while True:
        schedule.run_pending()
        time.sleep(1)
When I press Ctrl+C to terminate the server, the thread still prints. How can I stop the thread when the server exits?
You may want to set your thread as a daemon.
A thread runs until it ends by itself or is explicitly killed.
A daemon thread runs under the same conditions, but only as long as at least one non-daemonic thread is running: this means that if you end your main thread and no other non-daemon threads are running, all daemonic threads end as well.
If you're using the threading module, you can make a thread daemonic by setting its boolean attribute:
import threading
your_thread.daemon = True
If you're constructing the thread yourself, daemon can also be passed as a keyword argument: threading.Thread(target=..., daemon=True).
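A minimal sketch of the daemon behaviour (the worker loop is illustrative):

```python
import threading
import time

def background():
    while True:          # would run forever on its own
        time.sleep(0.1)

t = threading.Thread(target=background, daemon=True)
t.start()

print("main thread exiting; daemon thread is abandoned")
# When the main thread returns here, the interpreter exits even though
# `background` never finished, because only a daemonic thread remains.
```

Without daemon=True, the same program would never terminate.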
I've a python program that spawns a number of threads. These threads last anywhere between 2 seconds to 30 seconds. In the main thread I want to track whenever each thread completes and print a message. If I just sequentially .join() all threads and the first thread lasts 30 seconds and others complete much sooner, I wouldn't be able to print a message sooner -- all messages will be printed after 30 seconds.
Basically I want to block until any thread completes. As soon as a thread completes, print a message about it and go back to blocking if any other threads are still alive. If all threads are done then exit program.
One way I could think of is to have a queue that is passed to all the threads, and to block on queue.get(). Whenever a message is received from the queue, print it, check whether any other threads are alive using threading.active_count(), and if so, go back to blocking on queue.get(). This would work, but all the threads would need to follow the discipline of sending a message to the queue before terminating.
I'm wondering whether this is the conventional way of achieving this behavior, or whether there are other / better ways?
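For concreteness, the queue-based approach described above might be sketched like this (all names and delays are illustrative):

```python
import queue
import threading
import time

done_q = queue.Queue()

def worker(name, delay):
    time.sleep(delay)       # stand-in for the real 2-30 second work
    done_q.put(name)        # discipline: announce completion before exiting

threads = [threading.Thread(target=worker, args=("thread-%d" % i, d))
           for i, d in enumerate([0.3, 0.1, 0.2])]
for t in threads:
    t.start()

for _ in threads:           # one completion message per thread, as each finishes
    print("%s completed" % done_q.get())
```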
Here's a variation on @detly's answer that lets you specify the messages from your main thread, instead of printing them from your target functions. This creates a wrapper function that calls your target and then prints a message before terminating. You could modify this to perform any kind of standard cleanup after each thread completes.
#!/usr/bin/python
import threading
import time

def target1():
    time.sleep(0.1)
    print("target1 running")
    time.sleep(4)

def target2():
    time.sleep(0.1)
    print("target2 running")
    time.sleep(2)

def launch_thread_with_message(target, message, args=(), kwargs=None):
    def target_with_msg(*args, **kwargs):
        target(*args, **kwargs)
        print(message)
    thread = threading.Thread(target=target_with_msg, args=args,
                              kwargs=kwargs or {})
    thread.start()
    return thread

if __name__ == '__main__':
    thread1 = launch_thread_with_message(target1, "finished target1")
    thread2 = launch_thread_with_message(target2, "finished target2")
    print("main: launched all threads")
    thread1.join()
    thread2.join()
    print("main: finished all threads")
The thread needs to be checked using the Thread.is_alive() call.
Why not just have the threads themselves print a completion message, or call some other completion callback when done?
You can then just join these threads from your main program, so you'll see a bunch of completion messages, and your program will terminate when they're all done, as required.
Here's a quick and simple demonstration:
#!/usr/bin/python
import threading
import time

def really_simple_callback(message):
    """
    This is a really simple callback. `sys.stdout` already has a lock built in,
    so this is fine to do.
    """
    print(message)

def threaded_target(sleeptime, callback):
    """
    Target for the threads: sleep and call back with completion message.
    """
    time.sleep(sleeptime)
    callback("%s completed!" % threading.current_thread())

if __name__ == '__main__':
    # Keep track of the threads we create
    threads = []

    # callback_when_done is effectively a function
    callback_when_done = really_simple_callback

    for idx in range(0, 10):
        threads.append(
            threading.Thread(
                target=threaded_target,
                name="Thread #%d" % idx,
                args=(10 - idx, callback_when_done)
            )
        )

    [t.start() for t in threads]
    [t.join() for t in threads]
    # Note that thread #0 runs for the longest, so we'll see its message last!
What I would suggest is a loop like this:
while len(threadSet) > 0:
    time.sleep(1)
    for thread in list(threadSet):  # iterate over a copy: we remove from the set
        if not thread.is_alive():
            print("Thread " + thread.name + " terminated")
            threadSet.remove(thread)
There is a 1 second sleep, so there will be a slight delay between the thread termination and the message being printed. If you can live with this delay, then I think this is a simpler solution than the one you proposed in your question.
You can let the threads push their results into a threading.Queue. Have another thread wait on this queue and print the message as soon as a new item appears.
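A minimal sketch of that pattern (names are illustrative): worker threads push results into the queue, and a dedicated printer thread blocks on get() and prints each result as it arrives.

```python
import queue
import threading
import time

results = queue.Queue()

def worker(n):
    time.sleep(n * 0.1)                    # stand-in for real work
    results.put("worker %d done" % n)

def printer(expected):
    for _ in range(expected):
        print(results.get())               # blocks until the next result appears

workers = [threading.Thread(target=worker, args=(n,)) for n in range(3)]
p = threading.Thread(target=printer, args=(len(workers),))
p.start()
for w in workers:
    w.start()
for w in workers:
    w.join()
p.join()
```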
I'm not sure I see the problem with using:
threading.active_count()
to track the number of threads that are still active.
Even if you don't know how many threads you're going to launch before starting, it seems pretty easy to track. I usually generate thread collections via list comprehension; then a simple comparison of active_count() against the list size can tell you how many have finished.
See here: http://docs.python.org/library/threading.html
Alternately, once you have your thread objects, you can just use their .is_alive() method to check.
I just checked by throwing this into a multithread program I have and it looks fine:
for thread in threadlist:
    print(thread.is_alive())
This gives me a list of True/False values as the threads start and finish, so you should be able to check for any False value to see whether a thread has finished.
I use a slightly different technique because of the nature of the threads I used in my application. To illustrate, this is a fragment of a test-strap program I wrote to scaffold a barrier class for my threading class:
while threads:
    finished = set(threads) - set(threading.enumerate())
    while finished:
        ttt = finished.pop()
        threads.remove(ttt)
    time.sleep(0.5)
Why do I do it this way? In my production code, I have a time limit, so the first line actually reads "while threads and time.time() < cutoff_time". If I reach the cut-off, I then have code to tell the threads to shut down.
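A sketch of that time-limited variant, using a hypothetical stop_event as the "tell the threads to shut down" mechanism (the worker loop is illustrative):

```python
import threading
import time

stop_event = threading.Event()

def worker():
    while not stop_event.is_set():   # cooperative shutdown check
        time.sleep(0.1)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

cutoff_time = time.time() + 1.0
while threads and time.time() < cutoff_time:
    finished = set(threads) - set(threading.enumerate())
    for ttt in finished:
        threads.remove(ttt)
    time.sleep(0.1)

if threads:                          # time limit reached: stop the survivors
    stop_event.set()
    for t in threads:
        t.join()
```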