Catch exception in Pool worker that is not started yet - python

I'm trying to catch exception in sub process. My code (see below) works fine if number of workers == number of tasks. But if workers more than tasks as in an example, 2 process that was not started yet will thow same exception but I can't catch it neither in MainProcess nor in sub processes.
from multiprocessing import Pool, current_process
import time
from exceptions import GracefulExit
import signal
def terminate_handler(signum, frame):
print(dir(frame))
if current_process().name == 'MainProcess':
# we will not raise error if process
# is main because we need to finish all the jobs
return
raise GracefulExit()
def test_func(val):
i = 0
try:
while True:
time.sleep(0.1)
i += 1
if i >= 10:
return i
except GracefulExit:
pass
return i
if __name__ == "__main__":
signal.signal(
signal.SIGINT, terminate_handler)
try:
with Pool(5) as p:
r = p.map(test_func, [1, 2, 3])
except GracefulExit:
pass
print(r)

What can I suggest, pick any variant:
make sure that number of tasks not less the number of workers, just add stubs (e.g. None) to ensure that all workers will start
override run method of Pool's Process:
class CustomProcess(Process):
def run(self):
try:
super().run()
except GracefulExit:
print('interrupted idle worker')
class CustomPool(multiprocessing.pool.Pool):
Process = CustomProcess
....
with CustomPool(5) as p:
r = p.map(test_func, [1, 2, 3])
write your own pool implementation

Related

asyncio.wait_for with non-async function [duplicate]

I'm calling a function in Python which I know may stall and force me to restart the script.
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it and does something else?
You may use the signal package if you are running on UNIX:
In [1]: import signal
# Register an handler for the timeout
In [2]: def handler(signum, frame):
...: print("Forever is over!")
...: raise Exception("end of time")
...:
# This function *may* run for an indetermined time...
In [3]: def loop_forever():
...: import time
...: while 1:
...: print("sec")
...: time.sleep(1)
...:
...:
# Register the signal function handler
In [4]: signal.signal(signal.SIGALRM, handler)
Out[4]: 0
# Define a timeout for your function
In [5]: signal.alarm(10)
Out[5]: 0
In [6]: try:
...: loop_forever()
...: except Exception, exc:
...: print(exc)
....:
sec
sec
sec
sec
sec
sec
sec
sec
Forever is over!
end of time
# Cancel the timer if the function returned before timeout
# (ok, mine won't but yours maybe will :)
In [7]: signal.alarm(0)
Out[7]: 0
10 seconds after the call signal.alarm(10), the handler is called. This raises an exception that you can intercept from the regular Python code.
This module doesn't play well with threads (but then, who does?)
Note that since we raise an exception when timeout happens, it may end up caught and ignored inside the function, for example of one such function:
def loop_forever():
while 1:
print('sec')
try:
time.sleep(10)
except:
continue
You can use multiprocessing.Process to do exactly that.
Code
import multiprocessing
import time
# bar
def bar():
for i in range(100):
print "Tick"
time.sleep(1)
if __name__ == '__main__':
# Start bar as a process
p = multiprocessing.Process(target=bar)
p.start()
# Wait for 10 seconds or until process finishes
p.join(10)
# If thread is still active
if p.is_alive():
print "running... let's kill it..."
# Terminate - may not work if process is stuck for good
p.terminate()
# OR Kill - will work for sure, no chance for process to finish nicely however
# p.kill()
p.join()
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it?
I posted a gist that solves this question/problem with a decorator and a threading.Timer. Here it is with a breakdown.
Imports and setups for compatibility
It was tested with Python 2 and 3. It should also work under Unix/Linux and Windows.
First the imports. These attempt to keep the code consistent regardless of the Python version:
from __future__ import print_function
import sys
import threading
from time import sleep
try:
import thread
except ImportError:
import _thread as thread
Use version independent code:
try:
range, _print = xrange, print
def print(*args, **kwargs):
flush = kwargs.pop('flush', False)
_print(*args, **kwargs)
if flush:
kwargs.get('file', sys.stdout).flush()
except NameError:
pass
Now we have imported our functionality from the standard library.
exit_after decorator
Next we need a function to terminate the main() from the child thread:
def quit_function(fn_name):
# print to stderr, unbuffered in Python 2.
print('{0} took too long'.format(fn_name), file=sys.stderr)
sys.stderr.flush() # Python 3 stderr is likely buffered.
thread.interrupt_main() # raises KeyboardInterrupt
And here is the decorator itself:
def exit_after(s):
'''
use as decorator to exit process if
function takes longer than s seconds
'''
def outer(fn):
def inner(*args, **kwargs):
timer = threading.Timer(s, quit_function, args=[fn.__name__])
timer.start()
try:
result = fn(*args, **kwargs)
finally:
timer.cancel()
return result
return inner
return outer
Usage
And here's the usage that directly answers your question about exiting after 5 seconds!:
#exit_after(5)
def countdown(n):
print('countdown started', flush=True)
for i in range(n, -1, -1):
print(i, end=', ', flush=True)
sleep(1)
print('countdown finished')
Demo:
>>> countdown(3)
countdown started
3, 2, 1, 0, countdown finished
>>> countdown(10)
countdown started
10, 9, 8, 7, 6, countdown took too long
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in inner
File "<stdin>", line 6, in countdown
KeyboardInterrupt
The second function call will not finish, instead the process should exit with a traceback!
KeyboardInterrupt does not always stop a sleeping thread
Note that sleep will not always be interrupted by a keyboard interrupt, on Python 2 on Windows, e.g.:
#exit_after(1)
def sleep10():
sleep(10)
print('slept 10 seconds')
>>> sleep10()
sleep10 took too long # Note that it hangs here about 9 more seconds
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in inner
File "<stdin>", line 3, in sleep10
KeyboardInterrupt
nor is it likely to interrupt code running in extensions unless it explicitly checks for PyErr_CheckSignals(), see Cython, Python and KeyboardInterrupt ignored
I would avoid sleeping a thread more than a second, in any case - that's an eon in processor time.
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it and does something else?
To catch it and do something else, you can catch the KeyboardInterrupt.
>>> try:
... countdown(10)
... except KeyboardInterrupt:
... print('do something else')
...
countdown started
10, 9, 8, 7, 6, countdown took too long
do something else
I have a different proposal which is a pure function (with the same API as the threading suggestion) and seems to work fine (based on suggestions on this thread)
def timeout(func, args=(), kwargs={}, timeout_duration=1, default=None):
import signal
class TimeoutError(Exception):
pass
def handler(signum, frame):
raise TimeoutError()
# set the timeout handler
signal.signal(signal.SIGALRM, handler)
signal.alarm(timeout_duration)
try:
result = func(*args, **kwargs)
except TimeoutError as exc:
result = default
finally:
signal.alarm(0)
return result
I ran across this thread when searching for a timeout call on unit tests. I didn't find anything simple in the answers or 3rd party packages so I wrote the decorator below you can drop right into code:
import multiprocessing.pool
import functools
def timeout(max_timeout):
"""Timeout decorator, parameter in seconds."""
def timeout_decorator(item):
"""Wrap the original function."""
#functools.wraps(item)
def func_wrapper(*args, **kwargs):
"""Closure for function."""
pool = multiprocessing.pool.ThreadPool(processes=1)
async_result = pool.apply_async(item, args, kwargs)
# raises a TimeoutError if execution exceeds max_timeout
return async_result.get(max_timeout)
return func_wrapper
return timeout_decorator
Then it's as simple as this to timeout a test or any function you like:
#timeout(5.0) # if execution takes longer than 5 seconds, raise a TimeoutError
def test_base_regression(self):
...
The stopit package, found on pypi, seems to handle timeouts well.
I like the #stopit.threading_timeoutable decorator, which adds a timeout parameter to the decorated function, which does what you expect, it stops the function.
Check it out on pypi: https://pypi.python.org/pypi/stopit
I am the author of wrapt_timeout_decorator.
Most of the solutions presented here work wunderfully under Linux on the first glance - because we have fork() and signals() - but on windows the things look a bit different.
And when it comes to subthreads on Linux, You cant use Signals anymore.
In order to spawn a process under Windows, it needs to be picklable - and many decorated functions or Class methods are not.
So you need to use a better pickler like dill and multiprocess (not pickle and multiprocessing) - thats why You cant use ProcessPoolExecutor (or only with limited functionality).
For the timeout itself - You need to define what timeout means - because on Windows it will take considerable (and not determinable) time to spawn the process. This can be tricky on short timeouts. Lets assume, spawning the process takes about 0.5 seconds (easily !!!). If You give a timeout of 0.2 seconds what should happen?
Should the function time out after 0.5 + 0.2 seconds (so let the method run for 0.2 seconds)?
Or should the called process time out after 0.2 seconds (in that case, the decorated function will ALWAYS timeout, because in that time it is not even spawned)?
Also nested decorators can be nasty and You cant use Signals in a subthread. If You want to create a truly universal, cross-platform decorator, all this needs to be taken into consideration (and tested).
Other issues are passing exceptions back to the caller, as well as logging issues (if used in the decorated function - logging to files in another process is NOT supported)
I tried to cover all edge cases, You might look into the package wrapt_timeout_decorator, or at least test Your own solutions inspired by the unittests used there.
#Alexis Eggermont - unfortunately I dont have enough points to comment - maybe someone else can notify You - I think I solved Your import issue.
There are a lot of suggestions, but none using concurrent.futures, which I think is the most legible way to handle this.
from concurrent.futures import ProcessPoolExecutor
# Warning: this does not terminate function if timeout
def timeout_five(fnc, *args, **kwargs):
with ProcessPoolExecutor() as p:
f = p.submit(fnc, *args, **kwargs)
return f.result(timeout=5)
Super simple to read and maintain.
We make a pool, submit a single process and then wait up to 5 seconds before raising a TimeoutError that you could catch and handle however you needed.
Native to python 3.2+ and backported to 2.7 (pip install futures).
Switching between threads and processes is as simple as replacing ProcessPoolExecutor with ThreadPoolExecutor.
If you want to terminate the Process on timeout I would suggest looking into Pebble.
Building on and and enhancing the answer by #piro , you can build a contextmanager. This allows for very readable code which will disable the alaram signal after a successful run (sets signal.alarm(0))
from contextlib import contextmanager
import signal
import time
#contextmanager
def timeout(duration):
def timeout_handler(signum, frame):
raise TimeoutError(f'block timedout after {duration} seconds')
signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(duration)
try:
yield
finally:
signal.alarm(0)
def sleeper(duration):
time.sleep(duration)
print('finished')
Example usage:
In [19]: with timeout(2):
...: sleeper(1)
...:
finished
In [20]: with timeout(2):
...: sleeper(3)
...:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-20-66c78858116f> in <module>()
1 with timeout(2):
----> 2 sleeper(3)
3
<ipython-input-7-a75b966bf7ac> in sleeper(t)
1 def sleeper(t):
----> 2 time.sleep(t)
3 print('finished')
4
<ipython-input-18-533b9e684466> in timeout_handler(signum, frame)
2 def timeout(duration):
3 def timeout_handler(signum, frame):
----> 4 raise Exception(f'block timedout after {duration} seconds')
5 signal.signal(signal.SIGALRM, timeout_handler)
6 signal.alarm(duration)
Exception: block timedout after 2 seconds
Great, easy to use and reliable PyPi project timeout-decorator (https://pypi.org/project/timeout-decorator/)
installation:
pip install timeout-decorator
Usage:
import time
import timeout_decorator
#timeout_decorator.timeout(5)
def mytest():
print "Start"
for i in range(1,10):
time.sleep(1)
print "%d seconds have passed" % i
if __name__ == '__main__':
mytest()
timeout-decorator don't work on windows system as , windows didn't support signal well.
If you use timeout-decorator in windows system you will get the following
AttributeError: module 'signal' has no attribute 'SIGALRM'
Some suggested to use use_signals=False but didn't worked for me.
Author #bitranox created the following package:
pip install https://github.com/bitranox/wrapt-timeout-decorator/archive/master.zip
Code Sample:
import time
from wrapt_timeout_decorator import *
#timeout(5)
def mytest(message):
print(message)
for i in range(1,10):
time.sleep(1)
print('{} seconds have passed'.format(i))
def main():
mytest('starting')
if __name__ == '__main__':
main()
Gives the following exception:
TimeoutError: Function mytest timed out after 5 seconds
Highlights
Raises TimeoutError uses exceptions to alert on timeout - can easily be modified
Cross Platform: Windows & Mac OS X
Compatibility: Python 3.6+ (I also tested on python 2.7 and it works with small syntax adjustments)
For full explanation and extension to parallel maps, see here https://flipdazed.github.io/blog/quant%20dev/parallel-functions-with-timeouts
Minimal Example
>>> #killer_call(timeout=4)
... def bar(x):
... import time
... time.sleep(x)
... return x
>>> bar(10)
Traceback (most recent call last):
...
__main__.TimeoutError: function 'bar' timed out after 4s
and as expected
>>> bar(2)
2
Full code
import multiprocessing as mp
import multiprocessing.queues as mpq
import functools
import dill
from typing import Tuple, Callable, Dict, Optional, Iterable, List, Any
class TimeoutError(Exception):
def __init__(self, func: Callable, timeout: int):
self.t = timeout
self.fname = func.__name__
def __str__(self):
return f"function '{self.fname}' timed out after {self.t}s"
def _lemmiwinks(func: Callable, args: Tuple, kwargs: Dict[str, Any], q: mp.Queue):
"""lemmiwinks crawls into the unknown"""
q.put(dill.loads(func)(*args, **kwargs))
def killer_call(func: Callable = None, timeout: int = 10) -> Callable:
"""
Single function call with a timeout
Args:
func: the function
timeout: The timeout in seconds
"""
if not isinstance(timeout, int):
raise ValueError(f'timeout needs to be an int. Got: {timeout}')
if func is None:
return functools.partial(killer_call, timeout=timeout)
#functools.wraps(killer_call)
def _inners(*args, **kwargs) -> Any:
q_worker = mp.Queue()
proc = mp.Process(target=_lemmiwinks, args=(dill.dumps(func), args, kwargs, q_worker))
proc.start()
try:
return q_worker.get(timeout=timeout)
except mpq.Empty:
raise TimeoutError(func, timeout)
finally:
try:
proc.terminate()
except:
pass
return _inners
if __name__ == '__main__':
#killer_call(timeout=4)
def bar(x):
import time
time.sleep(x)
return x
print(bar(2))
bar(10)
Notes
You will need to import inside the function because of the way dill works.
This will also mean these functions may not be not compatible with doctest if there are imports inside your target functions. You will get an issue with __import__ not found.
Just in case it is helpful for anyone, building on the answer by #piro, I've made a function decorator:
import time
import signal
from functools import wraps
def timeout(timeout_secs: int):
def wrapper(func):
#wraps(func)
def time_limited(*args, **kwargs):
# Register an handler for the timeout
def handler(signum, frame):
raise Exception(f"Timeout for function '{func.__name__}'")
# Register the signal function handler
signal.signal(signal.SIGALRM, handler)
# Define a timeout for your function
signal.alarm(timeout_secs)
result = None
try:
result = func(*args, **kwargs)
except Exception as exc:
raise exc
finally:
# disable the signal alarm
signal.alarm(0)
return result
return time_limited
return wrapper
Using the wrapper on a function with a 20 seconds timeout would look something like:
#timeout(20)
def my_slow_or_never_ending_function(name):
while True:
time.sleep(1)
print(f"Yet another second passed {name}...")
try:
results = my_slow_or_never_ending_function("Yooo!")
except Exception as e:
print(f"ERROR: {e}")
We can use signals for the same. I think the below example will be useful for you. It is very simple compared to threads.
import signal
def timeout(signum, frame):
raise myException
#this is an infinite loop, never ending under normal circumstances
def main():
print 'Starting Main ',
while 1:
print 'in main ',
#SIGALRM is only usable on a unix platform
signal.signal(signal.SIGALRM, timeout)
#change 5 to however many seconds you need
signal.alarm(5)
try:
main()
except myException:
print "whoops"
Another solution with asyncio :
If you want to cancel the background task and not just timeout on the running main code, then you need an explicit communication from main thread to ask the code of the task to cancel , like a threading.Event()
import asyncio
import functools
import multiprocessing
from concurrent.futures.thread import ThreadPoolExecutor
class SingletonTimeOut:
pool = None
#classmethod
def run(cls, to_run: functools.partial, timeout: float):
pool = cls.get_pool()
loop = cls.get_loop()
try:
task = loop.run_in_executor(pool, to_run)
return loop.run_until_complete(asyncio.wait_for(task, timeout=timeout))
except asyncio.TimeoutError as e:
error_type = type(e).__name__ #TODO
raise e
#classmethod
def get_pool(cls):
if cls.pool is None:
cls.pool = ThreadPoolExecutor(multiprocessing.cpu_count())
return cls.pool
#classmethod
def get_loop(cls):
try:
return asyncio.get_event_loop()
except RuntimeError:
asyncio.set_event_loop(asyncio.new_event_loop())
# print("NEW LOOP" + str(threading.current_thread().ident))
return asyncio.get_event_loop()
# ---------------
TIME_OUT = float('0.2') # seconds
def toto(input_items,nb_predictions):
return 1
to_run = functools.partial(toto,
input_items=1,
nb_predictions="a")
results = SingletonTimeOut.run(to_run, TIME_OUT)
#!/usr/bin/python2
import sys, subprocess, threading
proc = subprocess.Popen(sys.argv[2:])
timer = threading.Timer(float(sys.argv[1]), proc.terminate)
timer.start()
proc.wait()
timer.cancel()
exit(proc.returncode)
The func_timeout package by Tim Savannah has worked well for me.
Installation:
pip install func_timeout
Usage:
import time
from func_timeout import func_timeout, FunctionTimedOut
def my_func(n):
time.sleep(n)
time_to_sleep = 10
# time out after 2 seconds using kwargs
func_timeout(2, my_func, kwargs={'n' : time_to_sleep})
# time out after 2 seconds using args
func_timeout(2, my_func, args=(time_to_sleep,))
I had a need for nestable timed interrupts (which SIGALARM can't do) that won't get blocked by time.sleep (which the thread-based approach can't do). I ended up copying and lightly modifying code from here: http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/
The code itself:
#!/usr/bin/python
# lightly modified version of http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/
"""alarm.py: Permits multiple SIGALRM events to be queued.
Uses a `heapq` to store the objects to be called when an alarm signal is
raised, so that the next alarm is always at the top of the heap.
"""
import heapq
import signal
from time import time
__version__ = '$Revision: 2539 $'.split()[1]
alarmlist = []
__new_alarm = lambda t, f, a, k: (t + time(), f, a, k)
__next_alarm = lambda: int(round(alarmlist[0][0] - time())) if alarmlist else None
__set_alarm = lambda: signal.alarm(max(__next_alarm(), 1))
class TimeoutError(Exception):
def __init__(self, message, id_=None):
self.message = message
self.id_ = id_
class Timeout:
''' id_ allows for nested timeouts. '''
def __init__(self, id_=None, seconds=1, error_message='Timeout'):
self.seconds = seconds
self.error_message = error_message
self.id_ = id_
def handle_timeout(self):
raise TimeoutError(self.error_message, self.id_)
def __enter__(self):
self.this_alarm = alarm(self.seconds, self.handle_timeout)
def __exit__(self, type, value, traceback):
try:
cancel(self.this_alarm)
except ValueError:
pass
def __clear_alarm():
"""Clear an existing alarm.
If the alarm signal was set to a callable other than our own, queue the
previous alarm settings.
"""
oldsec = signal.alarm(0)
oldfunc = signal.signal(signal.SIGALRM, __alarm_handler)
if oldsec > 0 and oldfunc != __alarm_handler:
heapq.heappush(alarmlist, (__new_alarm(oldsec, oldfunc, [], {})))
def __alarm_handler(*zargs):
"""Handle an alarm by calling any due heap entries and resetting the alarm.
Note that multiple heap entries might get called, especially if calling an
entry takes a lot of time.
"""
try:
nextt = __next_alarm()
while nextt is not None and nextt <= 0:
(tm, func, args, keys) = heapq.heappop(alarmlist)
func(*args, **keys)
nextt = __next_alarm()
finally:
if alarmlist: __set_alarm()
def alarm(sec, func, *args, **keys):
"""Set an alarm.
When the alarm is raised in `sec` seconds, the handler will call `func`,
passing `args` and `keys`. Return the heap entry (which is just a big
tuple), so that it can be cancelled by calling `cancel()`.
"""
__clear_alarm()
try:
newalarm = __new_alarm(sec, func, args, keys)
heapq.heappush(alarmlist, newalarm)
return newalarm
finally:
__set_alarm()
def cancel(alarm):
"""Cancel an alarm by passing the heap entry returned by `alarm()`.
It is an error to try to cancel an alarm which has already occurred.
"""
__clear_alarm()
try:
alarmlist.remove(alarm)
heapq.heapify(alarmlist)
finally:
if alarmlist: __set_alarm()
and a usage example:
import alarm
from time import sleep
try:
with alarm.Timeout(id_='a', seconds=5):
try:
with alarm.Timeout(id_='b', seconds=2):
sleep(3)
except alarm.TimeoutError as e:
print 'raised', e.id_
sleep(30)
except alarm.TimeoutError as e:
print 'raised', e.id_
else:
print 'nope.'
I have face the same problem but my situation is need work on sub thread, signal didn't work for me, so I wrote a python package: timeout-timer to solve this problem, support for use as context or decorator, use signal or sub thread module to trigger a timeout interrupt:
from timeout_timer import timeout, TimeoutInterrupt
class TimeoutInterruptNested(TimeoutInterrupt):
pass
def test_timeout_nested_loop_both_timeout(timer="thread"):
cnt = 0
try:
with timeout(5, timer=timer):
try:
with timeout(2, timer=timer, exception=TimeoutInterruptNested):
sleep(2)
except TimeoutInterruptNested:
cnt += 1
time.sleep(10)
except TimeoutInterrupt:
cnt += 1
assert cnt == 2
see more: https://github.com/dozysun/timeout-timer
Here is a simple example running one method with timeout and also retriev its value if successfull.
import multiprocessing
import time
ret = {"foo": False}
def worker(queue):
"""worker function"""
ret = queue.get()
time.sleep(1)
ret["foo"] = True
queue.put(ret)
if __name__ == "__main__":
queue = multiprocessing.Queue()
queue.put(ret)
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()
p.join(timeout=10)
if p.exitcode is None:
print("The worker timed out.")
else:
print(f"The worker completed and returned: {queue.get()}")
Here is a slight improvement to the given thread-based solution.
The code below supports exceptions:
def runFunctionCatchExceptions(func, *args, **kwargs):
try:
result = func(*args, **kwargs)
except Exception, message:
return ["exception", message]
return ["RESULT", result]
def runFunctionWithTimeout(func, args=(), kwargs={}, timeout_duration=10, default=None):
import threading
class InterruptableThread(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
self.result = default
def run(self):
self.result = runFunctionCatchExceptions(func, *args, **kwargs)
it = InterruptableThread()
it.start()
it.join(timeout_duration)
if it.isAlive():
return default
if it.result[0] == "exception":
raise it.result[1]
return it.result[1]
Invoking it with a 5 second timeout:
result = timeout(remote_calculate, (myarg,), timeout_duration=5)
Here is a POSIX version that combines many of the previous answers to deliver following features:
Subprocesses blocking the execution.
Usage of the timeout function on class member functions.
Strict requirement on time-to-terminate.
Here is the code and some test cases:
import threading
import signal
import os
import time
class TerminateExecution(Exception):
"""
Exception to indicate that execution has exceeded the preset running time.
"""
def quit_function(pid):
# Killing all subprocesses
os.setpgrp()
os.killpg(0, signal.SIGTERM)
# Killing the main thread
os.kill(pid, signal.SIGTERM)
def handle_term(signum, frame):
raise TerminateExecution()
def invoke_with_timeout(timeout, fn, *args, **kwargs):
# Setting a sigterm handler and initiating a timer
old_handler = signal.signal(signal.SIGTERM, handle_term)
timer = threading.Timer(timeout, quit_function, args=[os.getpid()])
terminate = False
# Executing the function
timer.start()
try:
result = fn(*args, **kwargs)
except TerminateExecution:
terminate = True
finally:
# Restoring original handler and cancel timer
signal.signal(signal.SIGTERM, old_handler)
timer.cancel()
if terminate:
raise BaseException("xxx")
return result
### Test cases
def countdown(n):
print('countdown started', flush=True)
for i in range(n, -1, -1):
print(i, end=', ', flush=True)
time.sleep(1)
print('countdown finished')
return 1337
def really_long_function():
time.sleep(10)
def really_long_function2():
os.system("sleep 787")
# Checking that we can run a function as expected.
assert invoke_with_timeout(3, countdown, 1) == 1337
# Testing various scenarios
t1 = time.time()
try:
print(invoke_with_timeout(1, countdown, 3))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
t1 = time.time()
try:
print(invoke_with_timeout(1, really_long_function2))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
t1 = time.time()
try:
print(invoke_with_timeout(1, really_long_function))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
# Checking that classes are referenced and not
# copied (as would be the case with multiprocessing)
class X:
def __init__(self):
self.value = 0
def set(self, v):
self.value = v
x = X()
invoke_with_timeout(2, x.set, 9)
assert x.value == 9
I intend to kill the process if job not done , using thread and process both to achieve this.
from concurrent.futures import ThreadPoolExecutor
from time import sleep
import multiprocessing
# test case 1
def worker_1(a,b,c):
for _ in range(2):
print('very time consuming sleep')
sleep(1)
return a+b+c
# test case 2
def worker_2(in_name):
for _ in range(10):
print('very time consuming sleep')
sleep(1)
return 'hello '+in_name
Actual class as a contextmanager
class FuncTimer():
def __init__(self,fn,args,runtime):
self.fn = fn
self.args = args
self.queue = multiprocessing.Queue()
self.runtime = runtime
self.process = multiprocessing.Process(target=self.thread_caller)
def thread_caller(self):
with ThreadPoolExecutor() as executor:
future = executor.submit(self.fn, *self.args)
self.queue.put(future.result())
def __enter__(self):
return self
def start_run(self):
self.process.start()
self.process.join(timeout=self.runtime)
if self.process.exitcode is None:
self.process.kill()
if self.process.exitcode is None:
out_res = None
print('killed premature')
else:
out_res = self.queue.get()
return out_res
def __exit__(self, exc_type, exc_value, exc_traceback):
self.process.kill()
How to use it
print('testing case 1')
with FuncTimer(fn=worker_1,args=(1,2,3),runtime = 5) as fp:
res = fp.start_run()
print(res)
print('testing case 2')
with FuncTimer(fn=worker_2,args=('ram',),runtime = 5) as fp:
res = fp.start_run()
print(res)

100 percent load with multiprocessing queues

this only replicates my problem to get 100% load for the main python script if it tries to control loop over a shared queue
import multiprocessing
import random
def func1(num, q):
while True:
num = random.randint(1, 101)
if q.empty():
q.put(num)
def func2(num, q):
while True:
num = q.get()
num = num ** 2
if q.empty():
q.put(num)
num = 2
q = multiprocessing.Queue()
p1 = multiprocessing.Process(target=func1, args=(num, q))
p2 = multiprocessing.Process(target=func2, args=(num, q))
p1.daemon = True
p2.daemon = True
p1.start()
p2.start()
running = True
while running:
if not q.empty():
num = q.get(True, 0.1)
print(num)
would there be a better method to control from a script multiple worker processes. Better in sense of no load !?
I'm not sure I understand your program:
What's with the num parameter of func1() and func2()? It never gets used.
func2 will discard its result if func1 happens to have posted another number after func2 got the last number out of the queue.
Why do you daemonize the workers? Are you quite sure this is what you want?
The if not q.empty(): q.get() construct in the main code will sooner or later raise a queue.Empty exception because it's a race between it and the q.get() in func2.
The uncaught queue.Empty exception will terminate the main process, leaving the two workers orphaned - and running.
General advice:
Use different queues for issuing jobs (request queue) and collecting results (response queue). Include the request in the response if necessary.
Think about how to terminate the workers. Consider a "poison pill", i.e. a value in the request queue that causes workers to die, i.e. exit/terminate.
Be really really sure you understand the race conditions in your code, like the one I mentioned above (empty vs. get).
Here's some sample code I hacked up:
import multiprocessing
import time
import random
import os
def request_generator(requests):
while True:
requests.put(random.randint(1, 101))
time.sleep(0.01)
def worker(requests, responses):
worker_id = os.getpid()
while True:
request = requests.get()
response = request ** 2
responses.put((request, response, worker_id))
def main():
requests = multiprocessing.Queue()
responses = multiprocessing.Queue()
gen = multiprocessing.Process(target=request_generator, args=(requests,))
w1 = multiprocessing.Process(target=worker, args=(requests, responses))
w2 = multiprocessing.Process(target=worker, args=(requests, responses))
gen.start()
w1.start()
w2.start()
while True:
req, resp, worker_id = responses.get()
print("worker {}: {} => {}".format(worker_id, req, resp))
if __name__ == "__main__":
main()

Python multiprocessing - check status of each processes

I wonder if it is possible to check how long of each processes take.
for example, there are four workers and the job should take no more than 10 seconds, but one of worker take more than 10 seconds.Is there way to raise a alert after 10 seconds and before process finish the job.
My initial thought is using manager, but it seems I have wait till process finished.
Many thanks.
You can check whether process is alive after you tried to join it. Don't forget to set timeout otherwise it'll wait until job is finished.
Here is simple example for you
from multiprocessing import Process
import time
def task():
import time
time.sleep(5)
procs = []
for x in range(2):
proc = Process(target=task)
procs.append(proc)
proc.start()
time.sleep(2)
for proc in procs:
proc.join(timeout=0)
if proc.is_alive():
print "Job is not finished!"
I have found this solution time ago (somewhere here in StackOverflow) and I am very happy with it.
Basically, it uses signal to raise an exception if a process takes more than expected.
All you need to do is to add this class to your code:
import signal
class Timeout:
def __init__(self, seconds=1, error_message='TimeoutError'):
self.seconds = seconds
self.error_message = error_message
def handle_timeout(self, signum, frame):
raise TimeoutError(self.error_message)
def __enter__(self):
signal.signal(signal.SIGALRM, self.handle_timeout)
signal.alarm(self.seconds)
def __exit__(self, type, value, traceback):
signal.alarm(0)
Here is a general example of how it works:
import time
with Timeout(seconds=3, error_message='JobX took too much time'):
try:
time.sleep(10) #your job
except TimeoutError as e:
print(e)
In your case, I would add the with statement to the job that your worker need to perform. Then you catch the Exception and you do what you think is best.
Alternatively, you can periodically check if a process is alive:
timeout = 3 #seconds
start = time.time()
while time.time() - start < timeout:
if any(proces.is_alive() for proces in processes):
time.sleep(1)
else:
print('All processes done')
else:
print("Timeout!")
# do something
Use Pipe and messages
from multiprocessing import Process, Pipe
import numpy as np
caller, worker = Pipe()
val1 = ['der', 'die', 'das']
def worker_function(info):
print (info.recv())
for i in range(10):
print (val1[np.random.choice(3, 1)[0]])
info.send(['job finished'])
info.close()
def request(data):
caller.send(data)
task = Process(target=worker_function, args=(worker,))
if not task.is_alive():
print ("task is requested")
task.start()
if caller.recv() == ['job finished']:
task.join()
print ("finished")
if __name__ == '__main__':
data = {'input': 'here'}
request(data)

python 3.6, for input() - achieve time out and skip to next line [duplicate]

I'm calling a function in Python which I know may stall and force me to restart the script.
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it and does something else?
You may use the signal package if you are running on UNIX:
In [1]: import signal
# Register an handler for the timeout
In [2]: def handler(signum, frame):
...: print("Forever is over!")
...: raise Exception("end of time")
...:
# This function *may* run for an indetermined time...
In [3]: def loop_forever():
...: import time
...: while 1:
...: print("sec")
...: time.sleep(1)
...:
...:
# Register the signal function handler
In [4]: signal.signal(signal.SIGALRM, handler)
Out[4]: 0
# Define a timeout for your function
In [5]: signal.alarm(10)
Out[5]: 0
In [6]: try:
...: loop_forever()
...: except Exception, exc:
...: print(exc)
....:
sec
sec
sec
sec
sec
sec
sec
sec
Forever is over!
end of time
# Cancel the timer if the function returned before timeout
# (ok, mine won't but yours maybe will :)
In [7]: signal.alarm(0)
Out[7]: 0
10 seconds after the call signal.alarm(10), the handler is called. This raises an exception that you can intercept from the regular Python code.
This module doesn't play well with threads (but then, who does?)
Note that since we raise an exception when timeout happens, it may end up caught and ignored inside the function, for example of one such function:
def loop_forever():
while 1:
print('sec')
try:
time.sleep(10)
except:
continue
You can use multiprocessing.Process to do exactly that.
Code
import multiprocessing
import time
# bar
def bar():
for i in range(100):
print "Tick"
time.sleep(1)
if __name__ == '__main__':
# Start bar as a process
p = multiprocessing.Process(target=bar)
p.start()
# Wait for 10 seconds or until process finishes
p.join(10)
# If thread is still active
if p.is_alive():
print "running... let's kill it..."
# Terminate - may not work if process is stuck for good
p.terminate()
# OR Kill - will work for sure, no chance for process to finish nicely however
# p.kill()
p.join()
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it?
I posted a gist that solves this question/problem with a decorator and a threading.Timer. Here it is with a breakdown.
Imports and setups for compatibility
It was tested with Python 2 and 3. It should also work under Unix/Linux and Windows.
First the imports. These attempt to keep the code consistent regardless of the Python version:
from __future__ import print_function
import sys
import threading
from time import sleep
try:
import thread
except ImportError:
import _thread as thread
Use version independent code:
try:
range, _print = xrange, print
def print(*args, **kwargs):
flush = kwargs.pop('flush', False)
_print(*args, **kwargs)
if flush:
kwargs.get('file', sys.stdout).flush()
except NameError:
pass
Now we have imported our functionality from the standard library.
exit_after decorator
Next we need a function to terminate the main() from the child thread:
def quit_function(fn_name):
# print to stderr, unbuffered in Python 2.
print('{0} took too long'.format(fn_name), file=sys.stderr)
sys.stderr.flush() # Python 3 stderr is likely buffered.
thread.interrupt_main() # raises KeyboardInterrupt
And here is the decorator itself:
def exit_after(s):
'''
use as decorator to exit process if
function takes longer than s seconds
'''
def outer(fn):
def inner(*args, **kwargs):
timer = threading.Timer(s, quit_function, args=[fn.__name__])
timer.start()
try:
result = fn(*args, **kwargs)
finally:
timer.cancel()
return result
return inner
return outer
Usage
And here's the usage that directly answers your question about exiting after 5 seconds!:
#exit_after(5)
def countdown(n):
print('countdown started', flush=True)
for i in range(n, -1, -1):
print(i, end=', ', flush=True)
sleep(1)
print('countdown finished')
Demo:
>>> countdown(3)
countdown started
3, 2, 1, 0, countdown finished
>>> countdown(10)
countdown started
10, 9, 8, 7, 6, countdown took too long
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in inner
File "<stdin>", line 6, in countdown
KeyboardInterrupt
The second function call will not finish, instead the process should exit with a traceback!
KeyboardInterrupt does not always stop a sleeping thread
Note that sleep will not always be interrupted by a keyboard interrupt, on Python 2 on Windows, e.g.:
#exit_after(1)
def sleep10():
sleep(10)
print('slept 10 seconds')
>>> sleep10()
sleep10 took too long # Note that it hangs here about 9 more seconds
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 11, in inner
File "<stdin>", line 3, in sleep10
KeyboardInterrupt
nor is it likely to interrupt code running in extensions unless it explicitly checks for PyErr_CheckSignals(), see Cython, Python and KeyboardInterrupt ignored
I would avoid sleeping a thread more than a second, in any case - that's an eon in processor time.
How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it and does something else?
To catch it and do something else, you can catch the KeyboardInterrupt.
>>> try:
... countdown(10)
... except KeyboardInterrupt:
... print('do something else')
...
countdown started
10, 9, 8, 7, 6, countdown took too long
do something else
I have a different proposal which is a pure function (with the same API as the threading suggestion) and seems to work fine (based on suggestions on this thread)
def timeout(func, args=(), kwargs={}, timeout_duration=1, default=None):
import signal
class TimeoutError(Exception):
pass
def handler(signum, frame):
raise TimeoutError()
# set the timeout handler
signal.signal(signal.SIGALRM, handler)
signal.alarm(timeout_duration)
try:
result = func(*args, **kwargs)
except TimeoutError as exc:
result = default
finally:
signal.alarm(0)
return result
I ran across this thread when searching for a timeout call on unit tests. I didn't find anything simple in the answers or 3rd party packages so I wrote the decorator below you can drop right into code:
import multiprocessing.pool
import functools
def timeout(max_timeout):
"""Timeout decorator, parameter in seconds."""
def timeout_decorator(item):
"""Wrap the original function."""
#functools.wraps(item)
def func_wrapper(*args, **kwargs):
"""Closure for function."""
pool = multiprocessing.pool.ThreadPool(processes=1)
async_result = pool.apply_async(item, args, kwargs)
# raises a TimeoutError if execution exceeds max_timeout
return async_result.get(max_timeout)
return func_wrapper
return timeout_decorator
Then it's as simple as this to timeout a test or any function you like:
#timeout(5.0) # if execution takes longer than 5 seconds, raise a TimeoutError
def test_base_regression(self):
...
The stopit package, found on pypi, seems to handle timeouts well.
I like the #stopit.threading_timeoutable decorator, which adds a timeout parameter to the decorated function, which does what you expect, it stops the function.
Check it out on pypi: https://pypi.python.org/pypi/stopit
I am the author of wrapt_timeout_decorator.
Most of the solutions presented here work wunderfully under Linux on the first glance - because we have fork() and signals() - but on windows the things look a bit different.
And when it comes to subthreads on Linux, You cant use Signals anymore.
In order to spawn a process under Windows, it needs to be picklable - and many decorated functions or Class methods are not.
So you need to use a better pickler like dill and multiprocess (not pickle and multiprocessing) - thats why You cant use ProcessPoolExecutor (or only with limited functionality).
For the timeout itself - You need to define what timeout means - because on Windows it will take considerable (and not determinable) time to spawn the process. This can be tricky on short timeouts. Lets assume, spawning the process takes about 0.5 seconds (easily !!!). If You give a timeout of 0.2 seconds what should happen?
Should the function time out after 0.5 + 0.2 seconds (so let the method run for 0.2 seconds)?
Or should the called process time out after 0.2 seconds (in that case, the decorated function will ALWAYS timeout, because in that time it is not even spawned)?
Also nested decorators can be nasty and You cant use Signals in a subthread. If You want to create a truly universal, cross-platform decorator, all this needs to be taken into consideration (and tested).
Other issues are passing exceptions back to the caller, as well as logging issues (if used in the decorated function - logging to files in another process is NOT supported)
I tried to cover all edge cases, You might look into the package wrapt_timeout_decorator, or at least test Your own solutions inspired by the unittests used there.
#Alexis Eggermont - unfortunately I dont have enough points to comment - maybe someone else can notify You - I think I solved Your import issue.
There are a lot of suggestions, but none using concurrent.futures, which I think is the most legible way to handle this.
from concurrent.futures import ProcessPoolExecutor
# Warning: this does not terminate function if timeout
def timeout_five(fnc, *args, **kwargs):
with ProcessPoolExecutor() as p:
f = p.submit(fnc, *args, **kwargs)
return f.result(timeout=5)
Super simple to read and maintain.
We make a pool, submit a single process and then wait up to 5 seconds before raising a TimeoutError that you could catch and handle however you needed.
Native to python 3.2+ and backported to 2.7 (pip install futures).
Switching between threads and processes is as simple as replacing ProcessPoolExecutor with ThreadPoolExecutor.
If you want to terminate the Process on timeout I would suggest looking into Pebble.
Building on and and enhancing the answer by #piro , you can build a contextmanager. This allows for very readable code which will disable the alaram signal after a successful run (sets signal.alarm(0))
from contextlib import contextmanager
import signal
import time
#contextmanager
def timeout(duration):
def timeout_handler(signum, frame):
raise TimeoutError(f'block timedout after {duration} seconds')
signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(duration)
try:
yield
finally:
signal.alarm(0)
def sleeper(duration):
time.sleep(duration)
print('finished')
Example usage:
In [19]: with timeout(2):
...: sleeper(1)
...:
finished
In [20]: with timeout(2):
...: sleeper(3)
...:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-20-66c78858116f> in <module>()
1 with timeout(2):
----> 2 sleeper(3)
3
<ipython-input-7-a75b966bf7ac> in sleeper(t)
1 def sleeper(t):
----> 2 time.sleep(t)
3 print('finished')
4
<ipython-input-18-533b9e684466> in timeout_handler(signum, frame)
2 def timeout(duration):
3 def timeout_handler(signum, frame):
----> 4 raise Exception(f'block timedout after {duration} seconds')
5 signal.signal(signal.SIGALRM, timeout_handler)
6 signal.alarm(duration)
Exception: block timedout after 2 seconds
Great, easy to use and reliable PyPi project timeout-decorator (https://pypi.org/project/timeout-decorator/)
installation:
pip install timeout-decorator
Usage:
import time
import timeout_decorator
#timeout_decorator.timeout(5)
def mytest():
print "Start"
for i in range(1,10):
time.sleep(1)
print "%d seconds have passed" % i
if __name__ == '__main__':
mytest()
timeout-decorator don't work on windows system as , windows didn't support signal well.
If you use timeout-decorator in windows system you will get the following
AttributeError: module 'signal' has no attribute 'SIGALRM'
Some suggested to use use_signals=False but didn't worked for me.
Author #bitranox created the following package:
pip install https://github.com/bitranox/wrapt-timeout-decorator/archive/master.zip
Code Sample:
import time
from wrapt_timeout_decorator import *
#timeout(5)
def mytest(message):
print(message)
for i in range(1,10):
time.sleep(1)
print('{} seconds have passed'.format(i))
def main():
mytest('starting')
if __name__ == '__main__':
main()
Gives the following exception:
TimeoutError: Function mytest timed out after 5 seconds
Highlights
Raises TimeoutError uses exceptions to alert on timeout - can easily be modified
Cross Platform: Windows & Mac OS X
Compatibility: Python 3.6+ (I also tested on python 2.7 and it works with small syntax adjustments)
For full explanation and extension to parallel maps, see here https://flipdazed.github.io/blog/quant%20dev/parallel-functions-with-timeouts
Minimal Example
>>> #killer_call(timeout=4)
... def bar(x):
... import time
... time.sleep(x)
... return x
>>> bar(10)
Traceback (most recent call last):
...
__main__.TimeoutError: function 'bar' timed out after 4s
and as expected
>>> bar(2)
2
Full code
import multiprocessing as mp
import multiprocessing.queues as mpq
import functools
import dill
from typing import Tuple, Callable, Dict, Optional, Iterable, List, Any
class TimeoutError(Exception):
def __init__(self, func: Callable, timeout: int):
self.t = timeout
self.fname = func.__name__
def __str__(self):
return f"function '{self.fname}' timed out after {self.t}s"
def _lemmiwinks(func: Callable, args: Tuple, kwargs: Dict[str, Any], q: mp.Queue):
"""lemmiwinks crawls into the unknown"""
q.put(dill.loads(func)(*args, **kwargs))
def killer_call(func: Callable = None, timeout: int = 10) -> Callable:
"""
Single function call with a timeout
Args:
func: the function
timeout: The timeout in seconds
"""
if not isinstance(timeout, int):
raise ValueError(f'timeout needs to be an int. Got: {timeout}')
if func is None:
return functools.partial(killer_call, timeout=timeout)
#functools.wraps(killer_call)
def _inners(*args, **kwargs) -> Any:
q_worker = mp.Queue()
proc = mp.Process(target=_lemmiwinks, args=(dill.dumps(func), args, kwargs, q_worker))
proc.start()
try:
return q_worker.get(timeout=timeout)
except mpq.Empty:
raise TimeoutError(func, timeout)
finally:
try:
proc.terminate()
except:
pass
return _inners
if __name__ == '__main__':
#killer_call(timeout=4)
def bar(x):
import time
time.sleep(x)
return x
print(bar(2))
bar(10)
Notes
You will need to import inside the function because of the way dill works.
This will also mean these functions may not be not compatible with doctest if there are imports inside your target functions. You will get an issue with __import__ not found.
Just in case it is helpful for anyone, building on the answer by #piro, I've made a function decorator:
import time
import signal
from functools import wraps
def timeout(timeout_secs: int):
def wrapper(func):
#wraps(func)
def time_limited(*args, **kwargs):
# Register an handler for the timeout
def handler(signum, frame):
raise Exception(f"Timeout for function '{func.__name__}'")
# Register the signal function handler
signal.signal(signal.SIGALRM, handler)
# Define a timeout for your function
signal.alarm(timeout_secs)
result = None
try:
result = func(*args, **kwargs)
except Exception as exc:
raise exc
finally:
# disable the signal alarm
signal.alarm(0)
return result
return time_limited
return wrapper
Using the wrapper on a function with a 20 seconds timeout would look something like:
#timeout(20)
def my_slow_or_never_ending_function(name):
while True:
time.sleep(1)
print(f"Yet another second passed {name}...")
try:
results = my_slow_or_never_ending_function("Yooo!")
except Exception as e:
print(f"ERROR: {e}")
We can use signals for the same. I think the below example will be useful for you. It is very simple compared to threads.
import signal
def timeout(signum, frame):
raise myException
#this is an infinite loop, never ending under normal circumstances
def main():
print 'Starting Main ',
while 1:
print 'in main ',
#SIGALRM is only usable on a unix platform
signal.signal(signal.SIGALRM, timeout)
#change 5 to however many seconds you need
signal.alarm(5)
try:
main()
except myException:
print "whoops"
Another solution with asyncio :
If you want to cancel the background task and not just timeout on the running main code, then you need an explicit communication from main thread to ask the code of the task to cancel , like a threading.Event()
import asyncio
import functools
import multiprocessing
from concurrent.futures.thread import ThreadPoolExecutor
class SingletonTimeOut:
pool = None
#classmethod
def run(cls, to_run: functools.partial, timeout: float):
pool = cls.get_pool()
loop = cls.get_loop()
try:
task = loop.run_in_executor(pool, to_run)
return loop.run_until_complete(asyncio.wait_for(task, timeout=timeout))
except asyncio.TimeoutError as e:
error_type = type(e).__name__ #TODO
raise e
#classmethod
def get_pool(cls):
if cls.pool is None:
cls.pool = ThreadPoolExecutor(multiprocessing.cpu_count())
return cls.pool
#classmethod
def get_loop(cls):
try:
return asyncio.get_event_loop()
except RuntimeError:
asyncio.set_event_loop(asyncio.new_event_loop())
# print("NEW LOOP" + str(threading.current_thread().ident))
return asyncio.get_event_loop()
# ---------------
TIME_OUT = float('0.2') # seconds
def toto(input_items,nb_predictions):
return 1
to_run = functools.partial(toto,
input_items=1,
nb_predictions="a")
results = SingletonTimeOut.run(to_run, TIME_OUT)
#!/usr/bin/python2
import sys, subprocess, threading
proc = subprocess.Popen(sys.argv[2:])
timer = threading.Timer(float(sys.argv[1]), proc.terminate)
timer.start()
proc.wait()
timer.cancel()
exit(proc.returncode)
The func_timeout package by Tim Savannah has worked well for me.
Installation:
pip install func_timeout
Usage:
import time
from func_timeout import func_timeout, FunctionTimedOut
def my_func(n):
time.sleep(n)
time_to_sleep = 10
# time out after 2 seconds using kwargs
func_timeout(2, my_func, kwargs={'n' : time_to_sleep})
# time out after 2 seconds using args
func_timeout(2, my_func, args=(time_to_sleep,))
I had a need for nestable timed interrupts (which SIGALARM can't do) that won't get blocked by time.sleep (which the thread-based approach can't do). I ended up copying and lightly modifying code from here: http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/
The code itself:
#!/usr/bin/python
# lightly modified version of http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/
"""alarm.py: Permits multiple SIGALRM events to be queued.
Uses a `heapq` to store the objects to be called when an alarm signal is
raised, so that the next alarm is always at the top of the heap.
"""
import heapq
import signal
from time import time
__version__ = '$Revision: 2539 $'.split()[1]
alarmlist = []
__new_alarm = lambda t, f, a, k: (t + time(), f, a, k)
__next_alarm = lambda: int(round(alarmlist[0][0] - time())) if alarmlist else None
__set_alarm = lambda: signal.alarm(max(__next_alarm(), 1))
class TimeoutError(Exception):
def __init__(self, message, id_=None):
self.message = message
self.id_ = id_
class Timeout:
''' id_ allows for nested timeouts. '''
def __init__(self, id_=None, seconds=1, error_message='Timeout'):
self.seconds = seconds
self.error_message = error_message
self.id_ = id_
def handle_timeout(self):
raise TimeoutError(self.error_message, self.id_)
def __enter__(self):
self.this_alarm = alarm(self.seconds, self.handle_timeout)
def __exit__(self, type, value, traceback):
try:
cancel(self.this_alarm)
except ValueError:
pass
def __clear_alarm():
"""Clear an existing alarm.
If the alarm signal was set to a callable other than our own, queue the
previous alarm settings.
"""
oldsec = signal.alarm(0)
oldfunc = signal.signal(signal.SIGALRM, __alarm_handler)
if oldsec > 0 and oldfunc != __alarm_handler:
heapq.heappush(alarmlist, (__new_alarm(oldsec, oldfunc, [], {})))
def __alarm_handler(*zargs):
"""Handle an alarm by calling any due heap entries and resetting the alarm.
Note that multiple heap entries might get called, especially if calling an
entry takes a lot of time.
"""
try:
nextt = __next_alarm()
while nextt is not None and nextt <= 0:
(tm, func, args, keys) = heapq.heappop(alarmlist)
func(*args, **keys)
nextt = __next_alarm()
finally:
if alarmlist: __set_alarm()
def alarm(sec, func, *args, **keys):
"""Set an alarm.
When the alarm is raised in `sec` seconds, the handler will call `func`,
passing `args` and `keys`. Return the heap entry (which is just a big
tuple), so that it can be cancelled by calling `cancel()`.
"""
__clear_alarm()
try:
newalarm = __new_alarm(sec, func, args, keys)
heapq.heappush(alarmlist, newalarm)
return newalarm
finally:
__set_alarm()
def cancel(alarm):
"""Cancel an alarm by passing the heap entry returned by `alarm()`.
It is an error to try to cancel an alarm which has already occurred.
"""
__clear_alarm()
try:
alarmlist.remove(alarm)
heapq.heapify(alarmlist)
finally:
if alarmlist: __set_alarm()
and a usage example:
import alarm
from time import sleep
try:
with alarm.Timeout(id_='a', seconds=5):
try:
with alarm.Timeout(id_='b', seconds=2):
sleep(3)
except alarm.TimeoutError as e:
print 'raised', e.id_
sleep(30)
except alarm.TimeoutError as e:
print 'raised', e.id_
else:
print 'nope.'
I have face the same problem but my situation is need work on sub thread, signal didn't work for me, so I wrote a python package: timeout-timer to solve this problem, support for use as context or decorator, use signal or sub thread module to trigger a timeout interrupt:
from timeout_timer import timeout, TimeoutInterrupt
class TimeoutInterruptNested(TimeoutInterrupt):
pass
def test_timeout_nested_loop_both_timeout(timer="thread"):
cnt = 0
try:
with timeout(5, timer=timer):
try:
with timeout(2, timer=timer, exception=TimeoutInterruptNested):
sleep(2)
except TimeoutInterruptNested:
cnt += 1
time.sleep(10)
except TimeoutInterrupt:
cnt += 1
assert cnt == 2
see more: https://github.com/dozysun/timeout-timer
Here is a simple example running one method with timeout and also retriev its value if successfull.
import multiprocessing
import time
ret = {"foo": False}
def worker(queue):
"""worker function"""
ret = queue.get()
time.sleep(1)
ret["foo"] = True
queue.put(ret)
if __name__ == "__main__":
queue = multiprocessing.Queue()
queue.put(ret)
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()
p.join(timeout=10)
if p.exitcode is None:
print("The worker timed out.")
else:
print(f"The worker completed and returned: {queue.get()}")
Here is a slight improvement to the given thread-based solution.
The code below supports exceptions:
def runFunctionCatchExceptions(func, *args, **kwargs):
try:
result = func(*args, **kwargs)
except Exception, message:
return ["exception", message]
return ["RESULT", result]
def runFunctionWithTimeout(func, args=(), kwargs={}, timeout_duration=10, default=None):
import threading
class InterruptableThread(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
self.result = default
def run(self):
self.result = runFunctionCatchExceptions(func, *args, **kwargs)
it = InterruptableThread()
it.start()
it.join(timeout_duration)
if it.isAlive():
return default
if it.result[0] == "exception":
raise it.result[1]
return it.result[1]
Invoking it with a 5 second timeout:
result = timeout(remote_calculate, (myarg,), timeout_duration=5)
Here is a POSIX version that combines many of the previous answers to deliver following features:
Subprocesses blocking the execution.
Usage of the timeout function on class member functions.
Strict requirement on time-to-terminate.
Here is the code and some test cases:
import threading
import signal
import os
import time
class TerminateExecution(Exception):
"""
Exception to indicate that execution has exceeded the preset running time.
"""
def quit_function(pid):
# Killing all subprocesses
os.setpgrp()
os.killpg(0, signal.SIGTERM)
# Killing the main thread
os.kill(pid, signal.SIGTERM)
def handle_term(signum, frame):
raise TerminateExecution()
def invoke_with_timeout(timeout, fn, *args, **kwargs):
# Setting a sigterm handler and initiating a timer
old_handler = signal.signal(signal.SIGTERM, handle_term)
timer = threading.Timer(timeout, quit_function, args=[os.getpid()])
terminate = False
# Executing the function
timer.start()
try:
result = fn(*args, **kwargs)
except TerminateExecution:
terminate = True
finally:
# Restoring original handler and cancel timer
signal.signal(signal.SIGTERM, old_handler)
timer.cancel()
if terminate:
raise BaseException("xxx")
return result
### Test cases
def countdown(n):
print('countdown started', flush=True)
for i in range(n, -1, -1):
print(i, end=', ', flush=True)
time.sleep(1)
print('countdown finished')
return 1337
def really_long_function():
time.sleep(10)
def really_long_function2():
os.system("sleep 787")
# Checking that we can run a function as expected.
assert invoke_with_timeout(3, countdown, 1) == 1337
# Testing various scenarios
t1 = time.time()
try:
print(invoke_with_timeout(1, countdown, 3))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
t1 = time.time()
try:
print(invoke_with_timeout(1, really_long_function2))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
t1 = time.time()
try:
print(invoke_with_timeout(1, really_long_function))
assert(False)
except BaseException:
assert(time.time() - t1 < 1.1)
print("All good", time.time() - t1)
# Checking that classes are referenced and not
# copied (as would be the case with multiprocessing)
class X:
def __init__(self):
self.value = 0
def set(self, v):
self.value = v
x = X()
invoke_with_timeout(2, x.set, 9)
assert x.value == 9
I intend to kill the process if job not done , using thread and process both to achieve this.
from concurrent.futures import ThreadPoolExecutor
from time import sleep
import multiprocessing
# test case 1
def worker_1(a,b,c):
for _ in range(2):
print('very time consuming sleep')
sleep(1)
return a+b+c
# test case 2
def worker_2(in_name):
for _ in range(10):
print('very time consuming sleep')
sleep(1)
return 'hello '+in_name
Actual class as a contextmanager
class FuncTimer():
def __init__(self,fn,args,runtime):
self.fn = fn
self.args = args
self.queue = multiprocessing.Queue()
self.runtime = runtime
self.process = multiprocessing.Process(target=self.thread_caller)
def thread_caller(self):
with ThreadPoolExecutor() as executor:
future = executor.submit(self.fn, *self.args)
self.queue.put(future.result())
def __enter__(self):
return self
def start_run(self):
self.process.start()
self.process.join(timeout=self.runtime)
if self.process.exitcode is None:
self.process.kill()
if self.process.exitcode is None:
out_res = None
print('killed premature')
else:
out_res = self.queue.get()
return out_res
def __exit__(self, exc_type, exc_value, exc_traceback):
self.process.kill()
How to use it
print('testing case 1')
with FuncTimer(fn=worker_1,args=(1,2,3),runtime = 5) as fp:
res = fp.start_run()
print(res)
print('testing case 2')
with FuncTimer(fn=worker_2,args=('ram',),runtime = 5) as fp:
res = fp.start_run()
print(res)

How to use multiprocessing queue in Python?

I'm having much trouble trying to understand just how the multiprocessing queue works on python and how to implement it. Lets say I have two python modules that access data from a shared file, let's call these two modules a writer and a reader. My plan is to have both the reader and writer put requests into two separate multiprocessing queues, and then have a third process pop these requests in a loop and execute as such.
My main problem is that I really don't know how to implement multiprocessing.queue correctly, you cannot really instantiate the object for each process since they will be separate queues, how do you make sure that all processes relate to a shared queue (or in this case, queues)
My main problem is that I really don't know how to implement multiprocessing.queue correctly, you cannot really instantiate the object for each process since they will be separate queues, how do you make sure that all processes relate to a shared queue (or in this case, queues)
This is a simple example of a reader and writer sharing a single queue... The writer sends a bunch of integers to the reader; when the writer runs out of numbers, it sends 'DONE', which lets the reader know to break out of the read loop.
You can spawn as many reader processes as you like...
from multiprocessing import Process, Queue
import time
import sys
def reader_proc(queue):
"""Read from the queue; this spawns as a separate Process"""
while True:
msg = queue.get() # Read from the queue and do nothing
if msg == "DONE":
break
def writer(count, num_of_reader_procs, queue):
"""Write integers into the queue. A reader_proc() will read them from the queue"""
for ii in range(0, count):
queue.put(ii) # Put 'count' numbers into queue
### Tell all readers to stop...
for ii in range(0, num_of_reader_procs):
queue.put("DONE")
def start_reader_procs(qq, num_of_reader_procs):
"""Start the reader processes and return all in a list to the caller"""
all_reader_procs = list()
for ii in range(0, num_of_reader_procs):
### reader_p() reads from qq as a separate process...
### you can spawn as many reader_p() as you like
### however, there is usually a point of diminishing returns
reader_p = Process(target=reader_proc, args=((qq),))
reader_p.daemon = True
reader_p.start() # Launch reader_p() as another proc
all_reader_procs.append(reader_p)
return all_reader_procs
if __name__ == "__main__":
num_of_reader_procs = 2
qq = Queue() # writer() writes to qq from _this_ process
for count in [10**4, 10**5, 10**6]:
assert 0 < num_of_reader_procs < 4
all_reader_procs = start_reader_procs(qq, num_of_reader_procs)
writer(count, len(all_reader_procs), qq) # Queue stuff to all reader_p()
print("All reader processes are pulling numbers from the queue...")
_start = time.time()
for idx, a_reader_proc in enumerate(all_reader_procs):
print(" Waiting for reader_p.join() index %s" % idx)
a_reader_proc.join() # Wait for a_reader_proc() to finish
print(" reader_p() idx:%s is done" % idx)
print(
"Sending {0} integers through Queue() took {1} seconds".format(
count, (time.time() - _start)
)
)
print("")
Here's a dead simple usage of multiprocessing.Queue and multiprocessing.Process that allows callers to send an "event" plus arguments to a separate process that dispatches the event to a "do_" method on the process. (Python 3.4+)
import multiprocessing as mp
import collections
Msg = collections.namedtuple('Msg', ['event', 'args'])
class BaseProcess(mp.Process):
"""A process backed by an internal queue for simple one-way message passing.
"""
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.queue = mp.Queue()
def send(self, event, *args):
"""Puts the event and args as a `Msg` on the queue
"""
msg = Msg(event, args)
self.queue.put(msg)
def dispatch(self, msg):
event, args = msg
handler = getattr(self, "do_%s" % event, None)
if not handler:
raise NotImplementedError("Process has no handler for [%s]" % event)
handler(*args)
def run(self):
while True:
msg = self.queue.get()
self.dispatch(msg)
Usage:
class MyProcess(BaseProcess):
def do_helloworld(self, arg1, arg2):
print(arg1, arg2)
if __name__ == "__main__":
process = MyProcess()
process.start()
process.send('helloworld', 'hello', 'world')
The send happens in the parent process, the do_* happens in the child process.
I left out any exception handling that would obviously interrupt the run loop and exit the child process. You can also customize it by overriding run to control blocking or whatever else.
This is really only useful in situations where you have a single worker process, but I think it's a relevant answer to this question to demonstrate a common scenario with a little more object-orientation.
I had a look at multiple answers across stack overflow and the web while trying to set-up a way of doing multiprocessing using queues for passing around large pandas dataframes. It seemed to me that every answer was re-iterating the same kind of solutions without any consideration of the multitude of edge cases one will definitely come across when setting up calculations like these. The problem is that there is many things at play at the same time. The number of tasks, the number of workers, the duration of each task and possible exceptions during task execution. All of these make synchronization tricky and most answers do not address how you can go about it. So this is my take after fiddling around for a few hours, hopefully this will be generic enough for most people to find it useful.
Some thoughts before any coding examples. Since queue.Empty or queue.qsize() or any other similar method is unreliable for flow control, any code of the like
while True:
try:
task = pending_queue.get_nowait()
except queue.Empty:
break
is bogus. This will kill the worker even if milliseconds later another task turns up in the queue. The worker will not recover and after a while ALL the workers will disappear as they randomly find the queue momentarily empty. The end result will be that the main multiprocessing function (the one with the join() on the processes) will return without all the tasks having completed. Nice. Good luck debugging through that if you have thousands of tasks and a few are missing.
The other issue is the use of sentinel values. Many people have suggested adding a sentinel value in the queue to flag the end of the queue. But to flag it to whom exactly? If there is N workers, assuming N is the number of cores available give or take, then a single sentinel value will only flag the end of the queue to one worker. All the other workers will sit waiting for more work when there is none left. Typical examples I've seen are
while True:
task = pending_queue.get()
if task == SOME_SENTINEL_VALUE:
break
One worker will get the sentinel value while the rest will wait indefinitely. No post I came across mentioned that you need to submit the sentinel value to the queue AT LEAST as many times as you have workers so that ALL of them get it.
The other issue is the handling of exceptions during task execution. Again these should be caught and managed. Moreover, if you have a completed_tasks queue you should independently count in a deterministic way how many items are in the queue before you decide that the job is done. Again relying on queue sizes is bound to fail and returns unexpected results.
In the example below, the par_proc() function will receive a list of tasks including the functions with which these tasks should be executed alongside any named arguments and values.
import multiprocessing as mp
import dill as pickle
import queue
import time
import psutil
SENTINEL = None
def do_work(tasks_pending, tasks_completed):
# Get the current worker's name
worker_name = mp.current_process().name
while True:
try:
task = tasks_pending.get_nowait()
except queue.Empty:
print(worker_name + ' found an empty queue. Sleeping for a while before checking again...')
time.sleep(0.01)
else:
try:
if task == SENTINEL:
print(worker_name + ' no more work left to be done. Exiting...')
break
print(worker_name + ' received some work... ')
time_start = time.perf_counter()
work_func = pickle.loads(task['func'])
result = work_func(**task['task'])
tasks_completed.put({work_func.__name__: result})
time_end = time.perf_counter() - time_start
print(worker_name + ' done in {} seconds'.format(round(time_end, 5)))
except Exception as e:
print(worker_name + ' task failed. ' + str(e))
tasks_completed.put({work_func.__name__: None})
def par_proc(job_list, num_cpus=None):
# Get the number of cores
if not num_cpus:
num_cpus = psutil.cpu_count(logical=False)
print('* Parallel processing')
print('* Running on {} cores'.format(num_cpus))
# Set-up the queues for sending and receiving data to/from the workers
tasks_pending = mp.Queue()
tasks_completed = mp.Queue()
# Gather processes and results here
processes = []
results = []
# Count tasks
num_tasks = 0
# Add the tasks to the queue
for job in job_list:
for task in job['tasks']:
expanded_job = {}
num_tasks = num_tasks + 1
expanded_job.update({'func': pickle.dumps(job['func'])})
expanded_job.update({'task': task})
tasks_pending.put(expanded_job)
# Use as many workers as there are cores (usually chokes the system so better use less)
num_workers = num_cpus
# We need as many sentinels as there are worker processes so that ALL processes exit when there is no more
# work left to be done.
for c in range(num_workers):
tasks_pending.put(SENTINEL)
print('* Number of tasks: {}'.format(num_tasks))
# Set-up and start the workers
for c in range(num_workers):
p = mp.Process(target=do_work, args=(tasks_pending, tasks_completed))
p.name = 'worker' + str(c)
processes.append(p)
p.start()
# Gather the results
completed_tasks_counter = 0
while completed_tasks_counter < num_tasks:
results.append(tasks_completed.get())
completed_tasks_counter = completed_tasks_counter + 1
for p in processes:
p.join()
return results
And here is a test to run the above code against
def test_parallel_processing():
def heavy_duty1(arg1, arg2, arg3):
return arg1 + arg2 + arg3
def heavy_duty2(arg1, arg2, arg3):
return arg1 * arg2 * arg3
task_list = [
{'func': heavy_duty1, 'tasks': [{'arg1': 1, 'arg2': 2, 'arg3': 3}, {'arg1': 1, 'arg2': 3, 'arg3': 5}]},
{'func': heavy_duty2, 'tasks': [{'arg1': 1, 'arg2': 2, 'arg3': 3}, {'arg1': 1, 'arg2': 3, 'arg3': 5}]},
]
results = par_proc(task_list)
job1 = sum([y for x in results if 'heavy_duty1' in x.keys() for y in list(x.values())])
job2 = sum([y for x in results if 'heavy_duty2' in x.keys() for y in list(x.values())])
assert job1 == 15
assert job2 == 21
plus another one with some exceptions
def test_parallel_processing_exceptions():
def heavy_duty1_raises(arg1, arg2, arg3):
raise ValueError('Exception raised')
return arg1 + arg2 + arg3
def heavy_duty2(arg1, arg2, arg3):
return arg1 * arg2 * arg3
task_list = [
{'func': heavy_duty1_raises, 'tasks': [{'arg1': 1, 'arg2': 2, 'arg3': 3}, {'arg1': 1, 'arg2': 3, 'arg3': 5}]},
{'func': heavy_duty2, 'tasks': [{'arg1': 1, 'arg2': 2, 'arg3': 3}, {'arg1': 1, 'arg2': 3, 'arg3': 5}]},
]
results = par_proc(task_list)
job1 = sum([y for x in results if 'heavy_duty1' in x.keys() for y in list(x.values())])
job2 = sum([y for x in results if 'heavy_duty2' in x.keys() for y in list(x.values())])
assert not job1
assert job2 == 21
Hope that is helpful.
in "from queue import Queue" there is no module called queue, instead multiprocessing should be used. Therefore, it should look like "from multiprocessing import Queue"
Just made a simple and general example for demonstrating passing a message over a Queue between 2 standalone programs. It doesn't directly answer the OP's question but should be clear enough indicating the concept.
Server:
multiprocessing-queue-manager-server.py
import asyncio
import concurrent.futures
import multiprocessing
import multiprocessing.managers
import queue
import sys
import threading
from typing import Any, AnyStr, Dict, Union
class QueueManager(multiprocessing.managers.BaseManager):
def get_queue(self, ident: Union[AnyStr, int, type(None)] = None) -> multiprocessing.Queue:
pass
def get_queue(ident: Union[AnyStr, int, type(None)] = None) -> multiprocessing.Queue:
global q
if not ident in q:
q[ident] = multiprocessing.Queue()
return q[ident]
q: Dict[Union[AnyStr, int, type(None)], multiprocessing.Queue] = dict()
delattr(QueueManager, 'get_queue')
def init_queue_manager_server():
if not hasattr(QueueManager, 'get_queue'):
QueueManager.register('get_queue', get_queue)
def serve(no: int, term_ev: threading.Event):
manager: QueueManager
with QueueManager(authkey=QueueManager.__name__.encode()) as manager:
print(f"Server address {no}: {manager.address}")
while not term_ev.is_set():
try:
item: Any = manager.get_queue().get(timeout=0.1)
print(f"Client {no}: {item} from {manager.address}")
except queue.Empty:
continue
async def main(n: int):
init_queue_manager_server()
term_ev: threading.Event = threading.Event()
executor: concurrent.futures.ThreadPoolExecutor = concurrent.futures.ThreadPoolExecutor()
i: int
for i in range(n):
asyncio.ensure_future(asyncio.get_running_loop().run_in_executor(executor, serve, i, term_ev))
# Gracefully shut down
try:
await asyncio.get_running_loop().create_future()
except asyncio.CancelledError:
term_ev.set()
executor.shutdown()
raise
if __name__ == '__main__':
asyncio.run(main(int(sys.argv[1])))
Client:
multiprocessing-queue-manager-client.py
import multiprocessing
import multiprocessing.managers
import os
import sys
from typing import AnyStr, Union
class QueueManager(multiprocessing.managers.BaseManager):
def get_queue(self, ident: Union[AnyStr, int, type(None)] = None) -> multiprocessing.Queue:
pass
delattr(QueueManager, 'get_queue')
def init_queue_manager_client():
if not hasattr(QueueManager, 'get_queue'):
QueueManager.register('get_queue')
def main():
init_queue_manager_client()
manager: QueueManager = QueueManager(sys.argv[1], authkey=QueueManager.__name__.encode())
manager.connect()
message = f"A message from {os.getpid()}"
print(f"Message to send: {message}")
manager.get_queue().put(message)
if __name__ == '__main__':
main()
Usage
Server:
$ python3 multiprocessing-queue-manager-server.py N
N is a integer indicating how many servers should be created. Copy one of the <server-address-N> output by the server and make it the first argument of each multiprocessing-queue-manager-client.py.
Client:
python3 multiprocessing-queue-manager-client.py <server-address-1>
Result
Server:
Client 1: <item> from <server-address-1>
Gist: https://gist.github.com/89062d639e40110c61c2f88018a8b0e5
UPD: Created a package here.
Server:
import ipcq
with ipcq.QueueManagerServer(address=ipcq.Address.AUTO, authkey=ipcq.AuthKey.AUTO) as server:
server.get_queue().get()
Client:
import ipcq
client = ipcq.QueueManagerClient(address=ipcq.Address.AUTO, authkey=ipcq.AuthKey.AUTO)
client.get_queue().put('a message')
We implemented two versions of this, one a simple multi thread pool that can execute many types of callables, making our lives much easier and the second version that uses processes, which is less flexible in terms of callables and requires and extra call to dill.
Setting frozen_pool to true will freeze execution until finish_pool_queue is called in either class.
Thread Version:
'''
Created on Nov 4, 2019
#author: Kevin
'''
from threading import Lock, Thread
from Queue import Queue
import traceback
from helium.loaders.loader_retailers import print_info
from time import sleep
import signal
import os
class ThreadPool(object):
def __init__(self, queue_threads, *args, **kwargs):
self.frozen_pool = kwargs.get('frozen_pool', False)
self.print_queue = kwargs.get('print_queue', True)
self.pool_results = []
self.lock = Lock()
self.queue_threads = queue_threads
self.queue = Queue()
self.threads = []
for i in range(self.queue_threads):
t = Thread(target=self.make_pool_call)
t.daemon = True
t.start()
self.threads.append(t)
def make_pool_call(self):
while True:
if self.frozen_pool:
#print '--> Queue is frozen'
sleep(1)
continue
item = self.queue.get()
if item is None:
break
call = item.get('call', None)
args = item.get('args', [])
kwargs = item.get('kwargs', {})
keep_results = item.get('keep_results', False)
try:
result = call(*args, **kwargs)
if keep_results:
self.lock.acquire()
self.pool_results.append((item, result))
self.lock.release()
except Exception as e:
self.lock.acquire()
print e
traceback.print_exc()
self.lock.release()
os.kill(os.getpid(), signal.SIGUSR1)
self.queue.task_done()
def finish_pool_queue(self):
self.frozen_pool = False
while self.queue.unfinished_tasks > 0:
if self.print_queue:
print_info('--> Thread pool... %s' % self.queue.unfinished_tasks)
sleep(5)
self.queue.join()
for i in range(self.queue_threads):
self.queue.put(None)
for t in self.threads:
t.join()
del self.threads[:]
def get_pool_results(self):
return self.pool_results
def clear_pool_results(self):
del self.pool_results[:]
Process Version:
'''
Created on Nov 4, 2019
#author: Kevin
'''
import traceback
from helium.loaders.loader_retailers import print_info
from time import sleep
import signal
import os
from multiprocessing import Queue, Process, Value, Array, JoinableQueue, Lock,\
RawArray, Manager
from dill import dill
import ctypes
from helium.misc.utils import ignore_exception
from mem_top import mem_top
import gc
class ProcessPool(object):
def __init__(self, queue_processes, *args, **kwargs):
self.frozen_pool = Value(ctypes.c_bool, kwargs.get('frozen_pool', False))
self.print_queue = kwargs.get('print_queue', True)
self.manager = Manager()
self.pool_results = self.manager.list()
self.queue_processes = queue_processes
self.queue = JoinableQueue()
self.processes = []
for i in range(self.queue_processes):
p = Process(target=self.make_pool_call)
p.start()
self.processes.append(p)
print 'Processes', self.queue_processes
def make_pool_call(self):
while True:
if self.frozen_pool.value:
sleep(1)
continue
item_pickled = self.queue.get()
if item_pickled is None:
#print '--> Ending'
self.queue.task_done()
break
item = dill.loads(item_pickled)
call = item.get('call', None)
args = item.get('args', [])
kwargs = item.get('kwargs', {})
keep_results = item.get('keep_results', False)
try:
result = call(*args, **kwargs)
if keep_results:
self.pool_results.append(dill.dumps((item, result)))
else:
del call, args, kwargs, keep_results, item, result
except Exception as e:
print e
traceback.print_exc()
os.kill(os.getpid(), signal.SIGUSR1)
self.queue.task_done()
def finish_pool_queue(self, callable=None):
self.frozen_pool.value = False
while self.queue._unfinished_tasks.get_value() > 0:
if self.print_queue:
print_info('--> Process pool... %s' % (self.queue._unfinished_tasks.get_value()))
if callable:
callable()
sleep(5)
for i in range(self.queue_processes):
self.queue.put(None)
self.queue.join()
self.queue.close()
for p in self.processes:
with ignore_exception: p.join(10)
with ignore_exception: p.terminate()
with ignore_exception: del self.processes[:]
def get_pool_results(self):
return self.pool_results
def clear_pool_results(self):
del self.pool_results[:]
def test(eg):
print 'EG', eg
Call with either:
tp = ThreadPool(queue_threads=2)
tp.queue.put({'call': test, 'args': [random.randint(0, 100)]})
tp.finish_pool_queue()
or
pp = ProcessPool(queue_processes=2)
pp.queue.put(dill.dumps({'call': test, 'args': [random.randint(0, 100)]}))
pp.queue.put(dill.dumps({'call': test, 'args': [random.randint(0, 100)]}))
pp.finish_pool_queue()
A multi-producers and multi-consumers example, verified. It should be easy to modify it to cover other cases, single/multi producers, single/multi consumers.
from multiprocessing import Process, JoinableQueue
import time
import os
q = JoinableQueue()
def producer():
for item in range(30):
time.sleep(2)
q.put(item)
pid = os.getpid()
print(f'producer {pid} done')
def worker():
while True:
item = q.get()
pid = os.getpid()
print(f'pid {pid} Working on {item}')
print(f'pid {pid} Finished {item}')
q.task_done()
for i in range(5):
p = Process(target=worker, daemon=True).start()
# send thirty task requests to the worker
producers = []
for i in range(2):
p = Process(target=producer)
producers.append(p)
p.start()
# make sure producers done
for p in producers:
p.join()
# block until all workers are done
q.join()
print('All work completed')
Explanation:
Two producers and five consumers in this example.
JoinableQueue is used to make sure all elements stored in queue will be processed. 'task_done' is for worker to notify an element is done. 'q.join()' will wait for all elements marked as done.
With #2, there is no need to join wait for every worker.
But it is important to join wait for every producer to store element into queue. Otherwise, program exit immediately.

Categories