I'm calling a function in a for loop. If a call takes longer than 5 seconds to execute, I want to abandon that iteration and move on to the next one.
I have thought about using the time library and starting a clock, but the end-of-timer check only runs after the function returns, so I can't skip that iteration within 5 seconds.
I am attaching an example below. Hope this might help you:
from threading import Timer

class LoopStopper:

    def __init__(self, seconds):
        self._loop_stop = False
        self._seconds = seconds

    def _stop_loop(self):
        self._loop_stop = True

    def run(self, generator_expression, task):
        """Execute a task a number of times based on the generator_expression"""
        t = Timer(self._seconds, self._stop_loop)
        t.start()
        for i in generator_expression:
            task(i)
            if self._loop_stop:
                break
        t.cancel()  # Cancel the timer if the loop ends ok.

ls = LoopStopper(5)  # 5 second timeout
ls.run(range(1000000), print)  # print numbers from 0 to 999999
Here's some code I've been experimenting with. It has a task() that iterates over its params argument and takes a random amount of time to complete each item.
I start a thread for each task and wait for the thread to complete by monitoring a queue of return values. If the thread fails to complete within 5 seconds, the main loop abandons it and starts the next thread.
The program shows which tasks fail or finish (different every time).
The tasks which finish have their results printed out (the param and the sleep time).
import threading, queue
import random
import time

def task(params, q):
    for p in params:
        s = random.randint(1,4)
        s = s * s
        s = s / 8
        time.sleep(s)
        q.put((p,s), False)
    q.put(None, False) # None is sentinel value

def sampleQueue(q, ret, results):
    while not q.empty():
        item = q.get()
        if item:
            ret.append(item)
        else:
            # Found None sentinel
            results.append(ret)
            return True
    return False

old = []
results = []
for p in [1,2,3,4]:
    q = queue.SimpleQueue()
    t = threading.Thread(target=task, args=([p,p,p,p,p], q))
    t.start()
    end = time.time() + 5
    ret = []
    failed = True
    while time.time() < end:
        time.sleep(0.1)
        if sampleQueue(q, ret, results):
            failed = False
            break
    if failed:
        print(f'Task {p} failed!')
        old.append(t)
    else:
        print(f'Task {p} finished!')
        t.join()
print(results)
print(f'{len(old)} threads failed')
for t in old:
    t.join()
print('Done')
Example output:
Task 1 finished!
Task 2 finished!
Task 3 failed!
Task 4 failed!
[[(1, 1.125), (1, 1.125), (1, 2.0), (1, 0.125), (1, 0.5)], [(2, 0.125), (2, 1.125), (2, 0.5), (2, 2.0), (2, 0.125)]]
2 threads failed
Done
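The same idea, waiting a bounded time for each task and moving on when it doesn't finish, can be sketched more compactly with concurrent.futures. This is only a sketch under assumptions: slow_square and run_with_timeout are hypothetical names, and the delays are made up for illustration. As with the threads above, a task that times out is not killed; its thread keeps running until it finishes.

```python
import concurrent.futures
import time

def slow_square(x, delay):
    """Hypothetical task: sleep for `delay` seconds, then return x squared."""
    time.sleep(delay)
    return x * x

def run_with_timeout(jobs, timeout):
    """Run each (x, delay) job in a worker thread; skip jobs that exceed `timeout`."""
    finished, skipped = [], []
    # One worker per job, so an abandoned task never blocks the next one.
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=len(jobs))
    try:
        for x, delay in jobs:
            future = executor.submit(slow_square, x, delay)
            try:
                finished.append(future.result(timeout=timeout))
            except concurrent.futures.TimeoutError:
                skipped.append(x)  # give up on this iteration and move on
    finally:
        executor.shutdown(wait=True)  # abandoned threads still run to completion
    return finished, skipped

finished, skipped = run_with_timeout([(1, 0.0), (2, 1.0), (3, 0.0)], timeout=0.3)
print(finished, skipped)
```

If the slow task must actually be stopped rather than abandoned, a separate process is needed, because a thread cannot be killed from outside.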
I will post an alternative solution using the subprocess module. You need to create a Python file with your function, run it as a subprocess, and call the wait method with a timeout. If the process doesn't finish in the desired time, wait raises an error (subprocess.TimeoutExpired), so you kill that process and keep going with the iteration.
As an example, this is the function you want to call:
from time import time
import sys

x = eval(sys.argv[1])
t = time()
a = [i for i in range(int(x**5))]
# pipe the computation time to the main process
sys.stdout.write('%s' % (time() - t))
And the main function, where I call the previous function on the func.py file:
import subprocess as sp
from subprocess import Popen, PIPE

for i in range(1, 50, 1):
    # call the process
    process = Popen(['python', 'func.py', '%i' % i],
                    stdout=PIPE, stdin=PIPE)
    try:
        # if it finishes within 1 sec:
        process.wait(1)
        print('Finished in: %s s' % (process.stdout.read().decode()))
    except sp.TimeoutExpired:
        # else kill the process. It is important to kill it,
        # otherwise it will keep running.
        print('Timeout')
        process.kill()
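A shorter variant of this pattern, as a sketch: subprocess.run accepts a timeout argument, kills the child itself when the timeout expires, and then raises subprocess.TimeoutExpired. The helper name run_snippet and the inline code strings are assumptions made for illustration; in practice you would pass the path of your own script instead of "-c".

```python
import subprocess
import sys

def run_snippet(code, timeout):
    """Run `code` in a fresh interpreter; return its stdout, or None on timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed if this expires
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout.strip()

fast = run_snippet("print(sum(range(10)))", timeout=5)
slow = run_snippet("import time; time.sleep(30)", timeout=1)
print(fast, slow)
```

Using sys.executable instead of the literal 'python' makes sure the child runs under the same interpreter as the parent.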
I have written a bit of code to demonstrate a race condition, but it doesn't happen.
from threading import Thread
from time import sleep

class SharedContent:
    def __init__(self, initia_value = 0) -> None:
        self.initial_value = initia_value
    def incerease(self ,delta = 1):
        sleep(1)
        self.initial_value += delta

content = SharedContent(0)
threads: list[Thread] = []
for i in range(250):
    t = Thread(target=content.incerease)
    t.start()
    threads.append(t)
#wait until all threads have finished their job
while True:
    n = 0
    for t in threads:
        if t.is_alive():
            sleep(0.2)
            continue
        n += 1
    if n == len(threads):
        break
print(content.initial_value)
The output is 250 which implies no race condition has happened!
Why is that?
I even tried this with random sleep time but the output was the same.
I changed your program. This version prints a different number every time I run it.
#!/usr/bin/env python3
from threading import Thread

class SharedContent:
    def __init__(self, initia_value = 0) -> None:
        self.initial_value = initia_value
    def incerease(self ,delta = 1):
        for i in range(0, 1000000):
            self.initial_value += delta

content = SharedContent(0)
threads = []
for i in range(2):
    t = Thread(target=content.incerease)
    t.start()
    threads.append(t)
#wait until all threads have finished their job
for t in threads:
    t.join()
print(content.initial_value)
What I changed:
Only two threads instead of 250.
Got rid of sleep() calls.
Each thread increments the variable one million times instead of just one time.
Main program uses join() to wait for the threads to finish.
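For contrast, here is a sketch of the same experiment with the increment protected by a lock, which should make the result deterministic. SafeCounter is a hypothetical class written for this illustration, not part of the code above. Whether the unlocked version actually loses updates on a given run depends on the interpreter version and on timing.

```python
from threading import Lock, Thread

class SafeCounter:
    """Counter whose read-modify-write step is guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = Lock()

    def increase(self, times, delta=1):
        for _ in range(times):
            # Only one thread at a time may execute the increment.
            with self._lock:
                self.value += delta

counter = SafeCounter()
threads = [Thread(target=counter.increase, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 200000
```

The lock makes each increment slower, which is the usual price for correctness on shared state.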
I'm trying to use multiprocessing for a function that can potentially return a segfault (I have no control over this ATM). In cases where the child process hits a segfault, I want only that child to fail, but all other child tasks to continue/return their results.
I've already switched from multiprocessing.Pool to concurrent.futures.ProcessPoolExecutor to avoid the issue of the child process hanging forever (or until an arbitrary timeout), as documented in this bug: https://bugs.python.org/issue22393.
However the issue I face now, is that when the first child task hits a segfault, all in-flight child processes get marked as broken (concurrent.futures.process.BrokenProcessPool).
Is there a way to only mark actually broken child processes as broken?
Code I'm running in Python 3.7.4:
import concurrent.futures
import ctypes
from time import sleep

def do_something(x):
    print(f"{x}; in do_something")
    sleep(x*3)
    if x == 2:
        # raise a segmentation fault internally
        return x, ctypes.string_at(0)
    return x, x-1

nums = [1, 2, 3, 1.5]
executor = concurrent.futures.ProcessPoolExecutor()
result_futures = []
for num in nums:
    # Using submit with a list instead of map lets you get past the first exception
    # Example: https://stackoverflow.com/a/53346191/7619676
    future = executor.submit(do_something, num)
    result_futures.append(future)

# Wait for all results
concurrent.futures.wait(result_futures)

# After a segfault is hit for any child process (i.e. is "terminated abruptly"), the process pool becomes unusable
# and all running/pending child processes' results are set to broken
for future in result_futures:
    try:
        print(future.result())
    except concurrent.futures.process.BrokenProcessPool:
        print("broken")
Result:
(1, 0)
broken
broken
(1.5, 0.5)
Desired result:
(1, 0)
broken
(3, 2)
(1.5, 0.5)
multiprocessing.Pool and concurrent.futures.ProcessPoolExecutor both make assumptions about how to handle the concurrency of the interactions between the workers and the main process that are violated if any one process is killed or segfaults, so they do the safe thing and mark the whole pool as broken. To get around this, you will need to build up your own pool with different assumptions directly using multiprocessing.Process instances.
This might sound intimidating but a list and a multiprocessing.Manager will get you pretty far:
import multiprocessing
import ctypes
import queue
from time import sleep

def do_something(job, result):
    while True:
        x=job.get()
        print(f"{x}; in do_something")
        sleep(x*3)
        if x == 2:
            # raise a segmentation fault internally
            return x, ctypes.string_at(0)
        result.put((x, x-1))

nums = [1, 2, 3, 1.5]

if __name__ == "__main__":
    # you ARE using the spawn context, right?
    ctx = multiprocessing.get_context("spawn")
    manager = ctx.Manager()
    job_queue = manager.Queue(maxsize=-1)
    result_queue = manager.Queue(maxsize=-1)
    pool = [
        ctx.Process(target=do_something, args=(job_queue, result_queue), daemon=True)
        for _ in range(multiprocessing.cpu_count())
    ]
    for proc in pool:
        proc.start()
    for num in nums:
        job_queue.put(num)
    try:
        while True:
            # Timeout is our only signal that no more results coming
            print(result_queue.get(timeout=10))
    except queue.Empty:
        print("Done!")
    print(pool) # will see one dead Process
    for proc in pool:
        proc.kill() # avoid stderr spam
This "Pool" is a little inflexible, and you will probably want to customize it for your application's specific needs. But you can definitely skip right over segfaulting workers.
When I went down this rabbit hole, where I was interested in cancelling specific submissions to a worker pool, I eventually wound up writing a whole library to integrate into Trio async apps: trio-parallel. Hopefully you won't need to go that far!
Based on @Richard Sheridan's answer, I ended up using the code below. This version doesn't require setting a timeout, which is something I couldn't do for my use case.
import ctypes
import multiprocessing
from typing import Dict
from time import sleep

def do_something(x, results_queue):
    print(f"{x} starting")
    sleep(x * 3)
    if x == 2:
        # raise a segmentation fault internally
        y = ctypes.string_at(0)
    y = x
    print(f"{x} done")
    results_queue.put(y)

def wait_for_process_slot(
    processes: Dict,
    concurrency: int = multiprocessing.cpu_count() - 1,
    wait_sec: int = 1,
) -> int:
    """Blocks main process if `concurrency` processes are already running.

    Alternative to `multiprocessing.Semaphore.acquire`
    useful for when child processes might fail and not be able to signal.
    Relies instead on the main's (parent's) tracking of `multiprocessing.Process`es.
    """
    counter = 0
    while True:
        counter = sum([1 for i, p in processes.items() if p.is_alive()])
        if counter < concurrency:
            return counter
        sleep(wait_sec)

if __name__ == "__main__":
    # "spawn" results in an OSError b/c pickling a segfault fails?
    ctx = multiprocessing.get_context()
    manager = ctx.Manager()
    results_queue = manager.Queue(maxsize=-1)
    concurrency = multiprocessing.cpu_count() - 1 # reserve 1 CPU for waiting
    nums = [3, 1, 2, 1.5]
    all_processes = {}
    for idx, num in enumerate(nums):
        num_running_processes = wait_for_process_slot(all_processes, concurrency)
        p = ctx.Process(target=do_something, args=(num, results_queue), daemon=True)
        all_processes.update({idx: p})
        p.start()
    # Wait for the last batch of processes not blocked by wait_for_process_slot to finish
    for p in all_processes.values():
        p.join()
    # Check last batch of processes for bad processes
    # Relies on all processes having finished (the p.joins above)
    bad_nums = [idx for idx, p in all_processes.items() if p.exitcode != 0]
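The underlying idea, one process per task with a non-zero exit code marking a crashed task, can also be sketched with the subprocess module instead of multiprocessing.Process. This swaps in a different technique from the answers above and is only an illustration: CHILD_CODE and run_isolated are hypothetical names, and os.abort() stands in for the segfault so the example behaves the same way on different platforms.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical child program: aborts for x == 2, otherwise prints x - 1.
CHILD_CODE = """
import os, sys
x = float(sys.argv[1])
if x == 2:
    os.abort()  # stand-in for a crash such as a segfault
print(x - 1)
"""

def run_isolated(x):
    """Run one task in its own interpreter; report 'broken' if it crashed."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(x)],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:  # same check as `p.exitcode != 0`
        return x, "broken"
    return x, float(proc.stdout)

nums = [1, 2, 3, 1.5]
# Threads only wait on the children; each task still runs in its own process.
with ThreadPoolExecutor(max_workers=len(nums)) as pool:
    results = list(pool.map(run_isolated, nums))
print(results)
```

Because each task has its own process, a crash in one child leaves the others untouched, at the cost of starting an interpreter per task.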
I have a process that is essentially just an infinite loop and I have a second process that is a timer. How can I kill the loop process once the timer is done?
def action():
    x = 0
    while True:
        if x < 1000000:
            x = x + 1
        else:
            x = 0

def timer(time):
    time.sleep(time)
    exit()

loop_process = multiprocessing.Process(target=action)
loop_process.start()
timer_process = multiprocessing.Process(target=timer, args=(time,))
timer_process.start()
I want the python script to end once the timer is done.
You could do it by sharing state between the processes: create a flag value that all the concurrent processes can access (although this may be somewhat inefficient).
Here's what I'm suggesting:
import multiprocessing as mp
import time

def action(run_flag):
    x = 0
    while run_flag.value:
        if x < 1000000:
            x = x + 1
        else:
            x = 0
    print('action() terminating')

def timer(run_flag, secs):
    time.sleep(secs)
    run_flag.value = False

if __name__ == '__main__':
    run_flag = mp.Value('I', True)
    loop_process = mp.Process(target=action, args=(run_flag,))
    loop_process.start()
    timer_process = mp.Process(target=timer, args=(run_flag, 2.0))
    timer_process.start()
    loop_process.join()
    timer_process.join()
    print('done')
A simple return statement after else in action() would work perfectly. Moreover, you had an error in your timer function: its argument had the same name as the time module, so it shadowed the module inside the function.
import multiprocessing
import time

def action():
    x = 0
    while True:
        if x < 1000000:
            x = x + 1
        else:
            x = 0
            return  # Exit here; otherwise the loop runs forever

def timer(times):
    time.sleep(times)
    exit()

if __name__ == '__main__':
    loop_process = multiprocessing.Process(target=action)
    loop_process.start()
    # Pass the function and its arguments separately; target=timer(10)
    # would call timer() in the parent instead of in the new process.
    timer_process = multiprocessing.Process(target=timer, args=(10,))
    timer_process.start()
Hope this answers your question!!!
I think you don't need to make a second process just for a timer.
Graceful Timeout
In case you need to clean up before exiting in your action process, you can use a Timer thread and let the while loop check whether it is still alive. This allows your worker process to exit gracefully, but you'll pay with reduced performance because the repeated method call takes some time. That doesn't have to be an issue if it's not a tight loop, though.
from multiprocessing import Process
from datetime import datetime
from threading import Timer

def action(runtime, x=0):
    timer = Timer(runtime, lambda: None)  # just returns None on timeout
    timer.start()
    while timer.is_alive():
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0

if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action, args=(RUNTIME,))
    p.start()
    print(f'{datetime.now()} {p.name} started')
    p.join()
    print(f'{datetime.now()} {p.name} ended')
Example Output:
2019-02-28 19:18:54.731207 Process-1 started
2019-02-28 19:18:55.738308 Process-1 ended
Termination on Timeout
If you don't have the need for a clean shut down (you are not using shared queues, working with DBs etc.), you can let the parent process terminate() the worker-process after your specified time.
terminate()
Terminate the process. On Unix this is done using the SIGTERM signal; on Windows TerminateProcess() is used. Note that exit handlers and finally clauses, etc., will not be executed.
Note that descendant processes of the process will not be terminated – they will simply become orphaned.
Warning If this method is used when the associated process is using a pipe or queue then the pipe or queue is liable to become corrupted and may become unusable by other process. Similarly, if the process has acquired a lock or semaphore etc. then terminating it is liable to cause other processes to deadlock. docs
If you don't have anything to do in the parent you can simply .join(timeout) the worker-process and .terminate() afterwards.
from multiprocessing import Process
from datetime import datetime

def action(x=0):
    while True:
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0

if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action)
    p.start()
    print(f'{datetime.now()} {p.name} started')
    p.join(RUNTIME)
    p.terminate()
    print(f'{datetime.now()} {p.name} terminated')
Example Output:
2019-02-28 19:22:43.705596 Process-1 started
2019-02-28 19:22:44.709255 Process-1 terminated
In case you want to use terminate(), but need your parent unblocked you could also use a Timer-thread within the parent for that.
from multiprocessing import Process
from datetime import datetime
from threading import Timer

def action(x=0):
    while True:
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0

def timeout(process, timeout):
    timer = Timer(timeout, process.terminate)
    timer.start()

if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action)
    p.start()
    print(f'{datetime.now()} {p.name} started')
    timeout(p, RUNTIME)
    p.join()
    print(f'{datetime.now()} {p.name} terminated')
Example Output:
2019-02-28 19:23:45.776951 Process-1 started
2019-02-28 19:23:46.778840 Process-1 terminated
In the below example, if you execute the program multiple times, it spawns a new thread each time with a new ID.
1. How do I terminate all the threads on task completion ?
2. How can I assign name/ID to the threads ?
import threading, Queue

THREAD_LIMIT = 3
jobs = Queue.Queue(5) # This sets up the queue object to use 5 slots
singlelock = threading.Lock() # This is a lock so threads don't print through each other
# list
inputlist_Values = [ (5,5),(10,4),(78,5),(87,2),(65,4),(10,10),(65,2),(88,95),(44,55),(33,3) ]

def DoWork(inputlist):
    print "Inputlist received..."
    print inputlist
    # Spawn the threads
    print "Spawning the {0} threads.".format(THREAD_LIMIT)
    for x in xrange(THREAD_LIMIT):
        print "Thread {0} started.".format(x)
        # This is the thread class that we instantiate.
        worker().start()
    # Put stuff in queue
    print "Putting stuff in queue"
    for i in inputlist:
        # Block if queue is full, and wait 5 seconds. After 5s raise Queue Full error.
        try:
            jobs.put(i, block=True, timeout=5)
        except:
            singlelock.acquire()
            print "The queue is full !"
            singlelock.release()
    # Wait for the threads to finish
    singlelock.acquire() # Acquire the lock so we can print
    print "Waiting for threads to finish."
    singlelock.release() # Release the lock
    jobs.join() # This command waits for all threads to finish.

class worker(threading.Thread):
    def run(self):
        # run forever
        while 1:
            # Try and get a job out of the queue
            try:
                job = jobs.get(True,1)
                singlelock.acquire() # Acquire the lock
                print self
                print "Multiplication of {0} with {1} gives {2}".format(job[0],job[1],(job[0]*job[1]))
                singlelock.release() # Release the lock
                # Let the queue know the job is finished.
                jobs.task_done()
            except:
                break # No more jobs in the queue

def main():
    DoWork(inputlist_Values)
How do I terminate all the threads on task completion?
You could put THREAD_LIMIT sentinel values (e.g., None) at the end of the queue and exit a thread's run() method when the thread sees one.
When your main thread exits, all non-daemon threads are joined, so the program keeps running as long as any of them is alive. Daemon threads are terminated when your program exits.
How can I assign name/ID to the threads ?
You can assign a name by passing it to the constructor or by setting .name directly.
The thread identifier .ident is a read-only property that is unique among alive threads. It may be reused if one thread exits and another starts.
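Both points, sentinel values to stop the workers and explicit thread names, can be sketched as follows. This is a minimal Python 3 illustration rather than a rewrite of the question's code; the worker function, the "worker-N" names, and the results queue are assumptions made for the example.

```python
import queue
import threading

THREAD_LIMIT = 3

def worker(jobs, results):
    """Multiply pairs from `jobs` until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this thread
            break
        x, y = job
        results.put((threading.current_thread().name, x * y))

jobs = queue.Queue()
results = queue.Queue()
threads = [
    threading.Thread(target=worker, args=(jobs, results), name=f"worker-{i}")
    for i in range(THREAD_LIMIT)
]
for t in threads:
    t.start()
for pair in [(5, 5), (10, 4), (78, 5)]:
    jobs.put(pair)
for _ in threads:
    jobs.put(None)  # one sentinel per thread
for t in threads:
    t.join()

collected = [results.get() for _ in range(3)]
names = {name for name, _ in collected}
products = sorted(product for _, product in collected)
print(names, products)
```

Each thread consumes exactly one sentinel, so putting THREAD_LIMIT of them after the real jobs guarantees that every worker exits.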
You could rewrite your code using multiprocessing.dummy.Pool, which provides the same interface as multiprocessing.Pool but uses threads instead of processes:
#!/usr/bin/env python
import logging
from multiprocessing.dummy import Pool

debug = logging.getLogger(__name__).debug

def work(x_y):
    try:
        x, y = x_y # do some work here
        debug('got %r', x_y)
        return x / y, None
    except Exception as e:
        logging.getLogger(__name__).exception('work%r failed', x_y)
        return None, e

def main():
    logging.basicConfig(level=logging.DEBUG,
        format="%(levelname)s:%(threadName)s:%(asctime)s %(message)s")
    inputlist = [ (5,5),(10,4),(78,5),(87,2),(65,4),(10,10), (1,0), (0,1) ]
    pool = Pool(3)
    s = 0.
    for result, error in pool.imap_unordered(work, inputlist):
        if error is None:
            s += result
    print("sum=%s" % (s,))
    pool.close()
    pool.join()

if __name__ == "__main__":
    main()
Output
DEBUG:Thread-1:2013-01-14 15:37:37,253 got (5, 5)
DEBUG:Thread-1:2013-01-14 15:37:37,253 got (87, 2)
DEBUG:Thread-1:2013-01-14 15:37:37,253 got (65, 4)
DEBUG:Thread-1:2013-01-14 15:37:37,254 got (10, 10)
DEBUG:Thread-1:2013-01-14 15:37:37,254 got (1, 0)
ERROR:Thread-1:2013-01-14 15:37:37,254 work(1, 0) failed
Traceback (most recent call last):
File "prog.py", line 11, in work
return x / y, None
ZeroDivisionError: integer division or modulo by zero
DEBUG:Thread-1:2013-01-14 15:37:37,254 got (0, 1)
DEBUG:Thread-3:2013-01-14 15:37:37,253 got (10, 4)
DEBUG:Thread-2:2013-01-14 15:37:37,253 got (78, 5)
sum=78.0
Threads don't stop unless you tell them to stop.
My recommendation is that you add a stop variable into your Thread subclass, and check whether this variable is True or not in your run loop (instead of while 1:).
An example:
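import threading

class worker(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        # Not named _stop: some Python 3 versions of Thread use that name internally
        self._stop_requested = False
    def stop(self):
        self._stop_requested = True
    def run(self):
        # run until stopped
        while not self._stop_requested:
            pass  # do work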
Then when your program is quitting (for whatever reason) you have to make sure to call the stop method on all your working threads.
About your second question, doesn't adding a name variable to your Thread subclass work for you?
I've read a lot of posts about using threads, subprocesses, etc.. A lot of it seems over complicated for what I'm trying to do...
All I want to do is stop executing a function after X amount of time has elapsed.
def big_loop(bob):
    x = bob
    start = time.time()
    while True:
        print time.time()-start
This function is an endless loop that never throws any errors or exceptions, period.
I'm not sure of the difference between "commands, shells, subprocesses, threads, etc." and this function, which is why I'm having trouble manipulating subprocesses.
I found this code here and tried it, but as you can see it keeps printing after 10 seconds have elapsed:
import time
import threading
import subprocess as sub

class RunCmd(threading.Thread):
    def __init__(self, cmd, timeout):
        threading.Thread.__init__(self)
        self.cmd = cmd
        self.timeout = timeout

    def run(self):
        self.p = sub.Popen(self.cmd)
        self.p.wait()

    def Run(self):
        self.start()
        self.join(self.timeout)
        if self.is_alive():
            self.p.terminate()
            self.join()

def big_loop(bob):
    x = bob
    start = time.time()
    while True:
        print time.time()-start

RunCmd(big_loop('jimijojo'), 10).Run() #supposed to quit after 10 seconds, but doesn't
x = raw_input('DONEEEEEEEEEEEE')
What's a simple way this function can be killed? As you can see in my attempt above, it doesn't terminate after 10 seconds and just keeps on going...
***OH also, I've read about using signal, but I'm on Windows so I can't use the alarm feature (Python 2.7).
**assume the "infinitely running function" can't be manipulated or changed to be non-infinite, if I could change the function, well I'd just change it to be non infinite wouldn't I?
Here are some similar questions whose code I haven't been able to port over to work with my simple function:
Perhaps you can?
Python: kill or terminate subprocess when timeout
signal.alarm replacement in Windows [Python]
Ok I tried an answer I received, it works.. but how can I use it if I remove the if __name__ == "__main__": statement? When I remove this statement, the loop never ends as it did before..
import multiprocessing
import Queue
import time

def infinite_loop_function(bob):
    var = bob
    start = time.time()
    while True:
        time.sleep(1)
        print time.time()-start
    print 'this statement will never print'

def wrapper(queue, bob):
    result = infinite_loop_function(bob)
    queue.put(result)
    queue.close()

#if __name__ == "__main__":
queue = multiprocessing.Queue(1) # Maximum size is 1
proc = multiprocessing.Process(target=wrapper, args=(queue, 'var'))
proc.start()
# Wait for TIMEOUT seconds
try:
    timeout = 10
    result = queue.get(True, timeout)
except Queue.Empty:
    # Deal with lack of data somehow
    result = None
finally:
    proc.terminate()
print 'running other code, now that that infinite loop has been defeated!'
print 'bla bla bla'
x = raw_input('done')
Use the building blocks in the multiprocessing module:
import multiprocessing
import Queue

TIMEOUT = 5

def big_loop(bob):
    import time
    time.sleep(4)
    return bob*2

def wrapper(queue, bob):
    result = big_loop(bob)
    queue.put(result)
    queue.close()

def run_loop_with_timeout():
    bob = 21 # Whatever sensible value you need
    queue = multiprocessing.Queue(1) # Maximum size is 1
    proc = multiprocessing.Process(target=wrapper, args=(queue, bob))
    proc.start()
    # Wait for TIMEOUT seconds
    try:
        result = queue.get(True, TIMEOUT)
    except Queue.Empty:
        # Deal with lack of data somehow
        result = None
    finally:
        proc.terminate()
    # Process data here, not in try block above, otherwise your process keeps running
    print result

if __name__ == "__main__":
    run_loop_with_timeout()
You could also accomplish this with a Pipe/Connection pair, but I'm not familiar with their API. Change the sleep time or TIMEOUT to check the behaviour for either case.
There is no straightforward way to kill a function after a certain amount of time without running the function in a separate process. A better approach would probably be to rewrite the function so that it returns after a specified time:
import time

def big_loop(bob, timeout):
    x = bob
    start = time.time()
    end = start + timeout
    while time.time() < end:
        print time.time() - start
        # Do more stuff here as needed
Can't you just return from the loop?
import time

def big_loop():
    start = time.time()
    endt = start + 30
    while True:
        now = time.time()
        if now > endt:
            return
        else:
            print now - start
import os,signal,time

cpid = os.fork()
if cpid == 0:
    while True:
        pass  # do stuff
else:
    time.sleep(10)
    os.kill(cpid, signal.SIGKILL)
You can also check in the loop of a thread for an event, which is more portable and flexible as it allows other reactions than brute killing. However, this approach fails if # do stuff can take time (or even wait forever on some event).
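A minimal sketch of that event-based variant, assuming the work can be split into short steps: the loop checks a threading.Event between steps, and a Timer sets the event after the deadline. The names worker, ticks, and stop_event, and the 0.2 second deadline, are illustrative assumptions.

```python
import threading
import time

def worker(stop_event, ticks):
    """Do small units of work until `stop_event` is set."""
    while not stop_event.is_set():
        ticks[0] += 1      # stand-in for one short step of "do stuff"
        time.sleep(0.01)

stop_event = threading.Event()
ticks = [0]
worker_thread = threading.Thread(target=worker, args=(stop_event, ticks))
worker_thread.start()

# Ask the loop to stop after 0.2 seconds instead of killing it.
threading.Timer(0.2, stop_event.set).start()
worker_thread.join(timeout=5)
print(worker_thread.is_alive(), ticks[0])
```

The loop only notices the event between steps, so a single step that blocks for a long time delays the shutdown by that much.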