I have multiple multiprocessing Process workers running and I'd like to join all of them with a timeout parameter. I understand that if no timeout were necessary, I could write:
for thread in threads:
thread.join()
One solution I thought of was to use a master process that joins all the other processes, and then to join that master with a timeout. However, I received the following error:
AssertionError: can only join a child process
The code I have is below.
import multiprocessing

def join_all(threads):
for thread in threads:
thread.join()
if __name__ == '__main__':
for thread in threads:
thread.start()
master = multiprocessing.Process(target=join_all, args=(threads,))
master.start()
master.join(timeout=60)
You could loop over each thread repeatedly, doing non-blocking checks to see if the thread is done:
import time
def timed_join_all(threads, timeout):
start = cur_time = time.time()
while cur_time <= (start + timeout):
for thread in threads:
if not thread.is_alive():
thread.join()
time.sleep(1)
cur_time = time.time()
if __name__ == '__main__':
for thread in threads:
thread.start()
timed_join_all(threads, 60)
This answer is initially based on dano's but has a number of changes.
join_all takes a list of threads and a timeout (in seconds) and attempts to join all of them. It does this by making a non-blocking call to Thread.join (setting the timeout to 0, since join with no arguments will never time out).
Once all the threads have finished (checked with is_alive() on each of them), the loop exits early.
If some threads are still running by the time the timeout occurs, the function raises a RuntimeError with information about the remaining threads.
import time
def join_all(threads, timeout):
"""
Args:
threads: a list of thread objects to join
timeout: the maximum time to wait for the threads to finish
Raises:
RuntimeError: if not all the threads have finished by the timeout
"""
start = cur_time = time.time()
while cur_time <= (start + timeout):
for thread in threads:
if thread.is_alive():
thread.join(timeout=0)
if all(not t.is_alive() for t in threads):
break
time.sleep(0.1)
cur_time = time.time()
else:
still_running = [t for t in threads if t.is_alive()]
num = len(still_running)
names = [t.name for t in still_running]
raise RuntimeError('Timeout on {0} threads: {1}'.format(num, names))
if __name__ == '__main__':
for thread in threads:
thread.start()
join_all(threads, 60)
In my usage of this, it was inside a test suite where the threads were dæmonised versions of ExcThread so that if the threads never finished running, it wouldn't matter.
The following code joins each process, waiting up to the remaining amount of time. If a proc returns fast enough, the remaining budget is reduced accordingly before the next process is joined. If the budget runs out, an error message is shown and the whole program exits to the caller.
source
import multiprocessing, sys, time
# start three procs that run for differing lengths of time
procs = [
multiprocessing.Process(
target=time.sleep, args=[num], name='%d sec'%num,
)
for num in [1,2,5]
]
for p in procs:
p.start()
print p
timeleft = 3.0
print 'Join, timeout after {} seconds'.format(timeleft)
for p in procs:
orig = time.time()
print '{}: join, {:.3f} sec left...'.format(p, timeleft)
p.join(timeleft)
timeleft -= time.time() - orig
if timeleft <= 0.:
sys.exit('timed out!')
example with timeout
We start three procs: one sleeps for 1 second, another for 2, the last for 5. Then we `join` them with an overall budget of 3 seconds -- the join on the last proc times out.
<Process(1 sec, started)>
<Process(2 sec, started)>
<Process(5 sec, started)>
Join, timeout after 3.0 seconds
<Process(1 sec, started)>: join, 3.000 sec left...
<Process(2 sec, started)>: join, 1.982 sec left...
<Process(5 sec, started)>: join, 0.965 sec left...
timed out!
I'm writing this here just to make sure I don't forget it. The principle is the same as in dano's answer, but the code snippet is a bit more Pythonic:
import threading
import time

threads = []
timeout = ...  # seconds
# create and start the threads
for work in ...:
thread = threading.Thread(target=worker)
thread.daemon = True # without this the thread might outlive its parent
thread.start()
threads.append(thread)
# Wait for workers to finish or for timeout
stop_time = time.time() + timeout
while any(t.is_alive() for t in threads) and time.time() < stop_time:
time.sleep(0.1)
For some reason, when the timeout is reached and the except clause is therefore executed, thread 2 is still "working", still expecting to get values from the user, even though the closing_threads function is entered.
Why can't I terminate the thread? Why is it still waiting for keyboard entry?
If I add t2.join() then execution hangs indefinitely.
import queue
import threading

def main():
q2 = queue.Queue()
q1 = queue.Queue()
t1 = threading.Thread(target=nothing, name='t1', args=(q1,))
t2 = threading.Thread(target=get_interrupt_from_user, name='t2', args=(q2,))
t1.start()
t2.start()
try:
q2.get(timeout=4)
except:
...
closing_threads(t1, t2)
def closing_threads(t1, t2):
print('closing the threads')
t1.join()
t2.join()
print(t1.is_alive())
print(t2.is_alive())
def get_interrupt_from_user(q) -> None:
print('############ Thread 2 is starting! ############')
interrupt = False
while not interrupt:
print('use KeyboardInterrupt to stop the execution')
try:
input()
except KeyboardInterrupt:
print('KeyboardInterrupt exception took place')
else:
print('exit by KeyboardInterrupt!!!')
interrupt = True
print(f'interrupt took place = {interrupt}')
q.put(interrupt)
def nothing(q) -> None:
    print('############ Thread 1 is starting! ############')

if __name__ == '__main__':
    main()
The second thread is technically neither working nor terminated; it is in a suspended state managed by the operating system, and it will only be terminated when it returns from that suspended state.
When you call input(), the operating system suspends the thread and waits for input; when input is available, it wakes the thread and hands it the user input. The problem is that during that time the thread cannot handle interrupts, because it is not executing code.
One way you can solve this is to declare the thread as a daemon and not join it, so it will be killed when the other Python threads die.
import queue
import threading
def main():
q2 = queue.Queue()
q1 = queue.Queue()
t1 = threading.Thread(target=nothing, name='t1', args=(q1,))
t2 = threading.Thread(target=get_interrupt_from_user, name='t2', args=(q2,), daemon=True)
t1.start()
t2.start()
try:
q2.get(timeout=4)
except:
...
closing_threads(t1, t2)
def closing_threads(t1, t2):
print('closing the threads')
t1.join()
print(t1.is_alive())
print(t2.is_alive())
def get_interrupt_from_user(q) -> None:
print('############ Thread 2 is starting! ############')
interrupt = False
while not interrupt:
print('use KeyboardInterrupt to stop the execution')
try:
input()
except KeyboardInterrupt:
print('KeyboardInterrupt exception took place')
else:
print('exit by KeyboardInterrupt!!!')
interrupt = True
print(f'interrupt took place = {interrupt}')
q.put(interrupt)
def nothing(q) -> None:
print('############ Thread 1 is starting! ############')
if __name__ == "__main__":
main()
Other ways are somewhat platform dependent and would involve spawning another process to signal the operating system to wake your thread up or terminate it, and it gets messy really quickly.
One last method is to use the select module on your stdin and create your own event loop in the child thread, but this only works on Linux; see the sketch below.
I am trying to run as many threads as possible. However, I have a problem here:
C:\Python27\lib\threading.py
_start_new_thread(self.__bootstrap, ())
thread.error: can't start new thread
When I call this:
import threading
import time
class startSleep(threading.Thread):
def run(self):
current = x  # reads the global counter at the moment this thread runs
# number of threads to start, read from the user
thread = input("Threads: ")
nload = 1
x = 0
# Threads
for x in xrange(thread):
startSleep().start()
time.sleep(0.003)
print bcolors.BLUE + "Thread " + str(x) + " started!"  # bcolors is a colour helper defined elsewhere in the asker's script
I want to run as many threads as possible.
There is a limit to how many threads the system can handle simultaneously. You need to either close the threads from within (by having the threaded function finish or its while loop break), or join the threads by keeping them in a list and joining each item, as in the fuller example after this snippet:
list_of_threads.append(example)
example.start()
for thread in list_of_threads:
thread.join()
Now, assuming you want to add unlimited threads, you just need the threaded functions to finish. This code never ends and keeps starting threads forever -- your unlimited threads:
from threading import Thread
def sleeper(i):
print(i)
i = 0
while True:
t = Thread(target=sleeper, args=(i,))
t.start()
i += 1
I wonder if it is possible to check how long each process takes.
For example, there are four workers and the job should take no more than 10 seconds, but one of the workers takes more than 10 seconds. Is there a way to raise an alert after 10 seconds, before the process has finished the job?
My initial thought was to use a Manager, but it seems I have to wait until the process has finished.
Many thanks.
You can check whether a process is alive after you have tried to join it. Don't forget to set a timeout, otherwise join() will wait until the job is finished.
Here is a simple example for you:
from multiprocessing import Process
import time
def task():
    time.sleep(5)

if __name__ == '__main__':
    procs = []
    for x in range(2):
        proc = Process(target=task)
        procs.append(proc)
        proc.start()

    time.sleep(2)  # give the workers a 2-second head start
    for proc in procs:
        proc.join(timeout=0)  # non-blocking join
        if proc.is_alive():
            print "Job is not finished!"
I found this solution some time ago (somewhere here on StackOverflow) and I am very happy with it.
Basically, it uses signal to raise an exception if a process takes longer than expected (note that SIGALRM, and therefore this recipe, is only available on Unix).
All you need to do is to add this class to your code:
import signal
class Timeout:
def __init__(self, seconds=1, error_message='TimeoutError'):
self.seconds = seconds
self.error_message = error_message
def handle_timeout(self, signum, frame):
raise TimeoutError(self.error_message)
def __enter__(self):
signal.signal(signal.SIGALRM, self.handle_timeout)
signal.alarm(self.seconds)
def __exit__(self, type, value, traceback):
signal.alarm(0)
Here is a general example of how it works:
import time
with Timeout(seconds=3, error_message='JobX took too much time'):
try:
time.sleep(10) #your job
except TimeoutError as e:
print(e)
In your case, I would add the with statement to the job that your worker needs to perform. Then you catch the exception and do whatever you think is best, as in the sketch below.
Alternatively, you can periodically check if a process is alive:
timeout = 3 #seconds
start = time.time()
while time.time() - start < timeout:
    if any(process.is_alive() for process in processes):
        time.sleep(1)
    else:
        print('All processes done')
        break  # without this break, the while-else below would always fire
else:
    print("Timeout!")
    # do something
Use Pipe and messages
from multiprocessing import Process, Pipe
import numpy as np
caller, worker = Pipe()
val1 = ['der', 'die', 'das']
def worker_function(info):
    print(info.recv())  # receive the request data from the caller
    for i in range(10):
        print(val1[np.random.choice(3, 1)[0]])  # print a random word
    info.send(['job finished'])  # tell the caller we are done
    info.close()
def request(data):
    caller.send(data)  # hand the request to the worker end of the pipe
    task = Process(target=worker_function, args=(worker,))
    if not task.is_alive():  # always true here: the process has not been started yet
        print("task is requested")
        task.start()
    if caller.recv() == ['job finished']:  # blocks until the worker reports back
        task.join()
        print("finished")
if __name__ == '__main__':
data = {'input': 'here'}
request(data)
My multi-threading script raises this error:
thread.error : can't start new thread
when it reaches 460 threads:
threading.active_count() = 460
I assume the old threads keep stacking up, since the script doesn't kill them. This is my code:
import threading
import Queue
import time
import os
import csv
def main(worker):
#Do Work
print worker
return
def threader():
while True:
worker = q.get()
main(worker)
q.task_done()
def main_threader(workers):
global q
global city
q = Queue.Queue()
for x in range(20):
t = threading.Thread(target=threader)
t.daemon = True
print "\n\nthreading.active_count() = " + str(threading.active_count()) + "\n\n"
t.start()
for worker in workers:
q.put(worker)
q.join()
How do I kill the old threads when their job is done? (Is return not enough?)
Your threader function never exits, so your threads never die. Since you're just processing one fixed set of work and never adding items after you start working, you could set the threads up to exit when the queue is empty.
See the following altered version of your code and the comments I added:
def threader(q):
# let the thread die when all work is done
while not q.empty():
worker = q.get()
main(worker)
q.task_done()
def main_threader(workers):
# you don't want global variables
#global q
#global city
q = Queue.Queue()
# make sure you fill the queue *before* starting the worker threads
for worker in workers:
q.put(worker)
for x in range(20):
t = threading.Thread(target=threader, args=[q])
t.daemon = True
print "\n\nthreading.active_count() = " + str(threading.active_count()) + "\n\n"
t.start()
q.join()
Notice that I removed global q and instead pass q to the thread function. You don't want threads created by a previous call to end up sharing a q with new threads (and although q.join() prevents this anyway, it's still better to avoid globals).
I'm doing an optimization of parameters of a complex simulation. I'm using the multiprocessing module to enhance the performance of the optimization algorithm. I learned the basics of multiprocessing at http://pymotw.com/2/multiprocessing/basics.html.
The complex simulation lasts a different amount of time depending on the parameters given by the optimization algorithm, around 1 to 5 minutes. If the parameters are chosen very badly, the simulation can last 30 minutes or more and the results are not useful. So I was thinking about building a timeout into the multiprocessing code that terminates all simulations lasting more than a defined time. Here is an abstracted version of the problem:
import numpy as np
import time
import multiprocessing
def worker(num):
time.sleep(np.random.random()*20)
def main():
pnum = 10
procs = []
for i in range(pnum):
p = multiprocessing.Process(target=worker, args=(i,), name = ('process_' + str(i+1)))
procs.append(p)
p.start()
print('starting', p.name)
for p in procs:
p.join(5)
print('stopping', p.name)
if __name__ == "__main__":
main()
The line p.join(5) defines the timeout of 5 seconds. Because of the for loop for p in procs:, the program waits up to 5 seconds for the first process to finish, then up to another 5 seconds for the second, and so on, but I want it to terminate all processes that last more than 5 seconds in total. Additionally, if none of the processes lasts longer than 5 seconds, the program must not wait the full 5 seconds.
You can do this by creating a loop that waits up to some timeout amount of seconds, frequently checking whether all processes are finished. If they don't all finish in the allotted amount of time, terminate all of the processes:
TIMEOUT = 5
start = time.time()
while time.time() - start <= TIMEOUT:
if not any(p.is_alive() for p in procs):
# All the processes are done, break now.
break
time.sleep(.1) # Just to avoid hogging the CPU
else:
# We only enter this if we didn't 'break' above.
print("timed out, killing all processes")
for p in procs:
p.terminate()
p.join()
If you want to kill all the processes, you could use a Pool from multiprocessing. You'll then need to define a general timeout for the whole execution, as opposed to individual timeouts:
import numpy as np
import time
from multiprocessing import Pool
def worker(num):
xtime = np.random.random()*20
time.sleep(xtime)
return xtime
def main():
pnum = 10
pool = Pool()
args = range(pnum)
pool_result = pool.map_async(worker, args)
# wait 5 minutes for every worker to finish
pool_result.wait(timeout=300)
# once the timeout has finished we can try to get the results
if pool_result.ready():
print(pool_result.get(timeout=1))
if __name__ == "__main__":
main()
This will get you a list with the return values for all your workers in order.
More information here:
https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool
Thanks to the help of dano I found a solution:
import numpy as np
import time
import multiprocessing
def worker(num):
time.sleep(np.random.random()*20)
def main():
pnum = 10
TIMEOUT = 5
procs = []
bool_list = [True]*pnum
for i in range(pnum):
p = multiprocessing.Process(target=worker, args=(i,), name = ('process_' + str(i+1)))
procs.append(p)
p.start()
print('starting', p.name)
start = time.time()
while time.time() - start <= TIMEOUT:
for i in range(pnum):
bool_list[i] = procs[i].is_alive()
print(bool_list)
if np.any(bool_list):
time.sleep(.1)
else:
break
else:
print("timed out, killing all processes")
for p in procs:
p.terminate()
for p in procs:
print('stopping', p.name,'=', p.is_alive())
p.join()
if __name__ == "__main__":
main()
It's not the most elegant way; I'm sure there is a better approach than using bool_list. Processes that are still alive after the 5-second timeout will be killed. If you set sleep times in the worker function shorter than the timeout, you will see that the program stops before the 5-second timeout is reached. I'm still open to more elegant solutions if there are any :)