Specification of the problem:
I'm searching through a very large number of lines in a log file and sorting those lines into groups according to regular expressions (regexes) I have stored, using the re.match() function. Unfortunately, some of my regexes are too complicated, and Python sometimes ends up in backtracking hell. Because of this, I need to protect the call with some kind of timeout.
Problems:
The re.match function I'm using is part of Python's library, and as I found out somewhere here on StackOverflow (I'm really sorry, I cannot find the link now :-( ), it is very difficult to interrupt a thread that is running Python library code. For this reason threads are out of the game.
Because evaluating re.match takes a relatively short time and I want to analyse a large number of lines with it, I need a timeout mechanism that won't take long to execute (this makes threads even less suitable, since initialising a new thread takes a long time) and that can be set to less than one second.
For those reasons, the answers here - Timeout on a function call
and here - Timeout function if it takes too long to finish with decorator (alarm - 1sec and more) are off the table.
I've spent this morning searching for a solution to this question, but I did not find any satisfactory answer.
Solution:
I've just modified a script posted here: Timeout function if it takes too long to finish.
And here is the code:
from functools import wraps
import errno
import os
import signal

class TimeoutError(Exception):
    pass

def timeout(seconds=10, error_message=os.strerror(errno.ETIME)):
    def decorator(func):
        def _handle_timeout(signum, frame):
            raise TimeoutError(error_message)

        def wrapper(*args, **kwargs):
            signal.signal(signal.SIGALRM, _handle_timeout)
            signal.setitimer(signal.ITIMER_REAL, seconds)  # used timer instead of alarm
            try:
                result = func(*args, **kwargs)
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
            return result

        return wraps(func)(wrapper)

    return decorator
And then you can use it like this:
from timeout import timeout, TimeoutError
import time

@timeout(0.01)
def loop():
    while True:
        pass

try:
    begin = time.time()
    loop()
except TimeoutError as e:
    print("Time elapsed: {:.3f}s".format(time.time() - begin))
Which prints
Time elapsed: 0.010s
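For the log-classification use case from the question, the same setitimer idea can be wrapped directly around re.match. This is only a minimal sketch, assuming a Unix system and code running in the main thread; the names MatchTimeout and classify and the group labels 'timed_out' and 'unmatched' are made up for illustration:

```python
import re
import signal


class MatchTimeout(Exception):
    """Raised by the SIGALRM handler when matching runs too long."""


def _on_alarm(signum, frame):
    raise MatchTimeout()


def classify(lines, patterns, seconds=0.05):
    """Sort lines into groups by the first pattern that matches.

    patterns maps a group name to a compiled regex. A line whose match
    attempts exceed `seconds` goes to the hypothetical 'timed_out' group.
    """
    groups = {"timed_out": [], "unmatched": []}
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    try:
        for line in lines:
            target = "unmatched"
            signal.setitimer(signal.ITIMER_REAL, seconds)  # sub-second timer
            try:
                for name, pattern in patterns.items():
                    if pattern.match(line):
                        target = name
                        break
            except MatchTimeout:
                target = "timed_out"
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
            groups.setdefault(target, []).append(line)
    finally:
        signal.signal(signal.SIGALRM, old_handler)
    return groups
```

The handler is installed once for the whole batch, so the per-line cost is just two setitimer calls. There is still a small window in which the signal can arrive right after the matching has finished, so treat this as a starting point rather than a watertight implementation.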
Related
I looked online and found some SO discussions and ActiveState recipes for running some code with a timeout. It looks like there are some common approaches:
Use a thread that runs the code, and join it with a timeout. If the timeout elapses, kill the thread. This is not directly supported in Python (it uses the private _Thread__stop function), so it is bad practice.
Use signal.SIGALRM - but this approach does not work on Windows!
Use a subprocess with a timeout - but this is too heavy. What if I want to start an interruptible task often? I don't want to fire up a process for each one!
So, what is the right way? I'm not asking about workarounds (e.g. use Twisted and async IO), but about an actual way to solve the actual problem: I have some function and I want to run it only with some timeout. If the timeout elapses, I want control back. And I want it to work on Linux and Windows.
A completely general solution to this really, honestly does not exist. You have to use the right solution for a given domain.
If you want timeouts for code you fully control, you have to write it to cooperate. Such code has to be able to break up into little chunks in some way, as in an event-driven system. You can also do this by threading if you can ensure nothing will hold a lock too long, but handling locks right is actually pretty hard.
If you want timeouts because you're afraid code is out of control (for example, if you're afraid the user will ask your calculator to compute 9**(9**9)), you need to run it in another process. This is the only easy way to sufficiently isolate it. Running it in your event system or even a different thread will not be enough. It is also possible to break things up into little chunks similar to the other solution, but requires very careful handling and usually isn't worth it; in any event, that doesn't allow you to do the same exact thing as just running the Python code.
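As a rough illustration of the "run it in another process" advice, here is a minimal sketch that hands a piece of source code to a child interpreter and gives up after a deadline. It assumes Python 3.7+ (for capture_output); the helper name run_isolated is made up for this example:

```python
import subprocess
import sys


def run_isolated(source, timeout_seconds):
    """Run Python source in a child interpreter.

    Returns the child's stdout as a string, or None if the child had to
    be killed because it exceeded the timeout.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout.strip()
```

Something like run_isolated("print(9**(9**9))", 2) would then come back with None instead of tying up the calling process. Starting an interpreter per call is expensive, which is exactly the "too heavy" objection raised in the question.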
What you might be looking for is the multiprocessing module. If subprocess is too heavy, then this may not suit your needs either.
import time
import multiprocessing

def do_this_other_thing_that_may_take_too_long(duration):
    time.sleep(duration)
    return 'done after sleeping {0} seconds.'.format(duration)

pool = multiprocessing.Pool(1)
print('starting....')
res = pool.apply_async(do_this_other_thing_that_may_take_too_long, [8])
for timeout in range(1, 10):
    try:
        print('{0}: {1}'.format(timeout, res.get(timeout)))
    except multiprocessing.TimeoutError:
        print('{0}: timed out'.format(timeout))
print('end')
If it's network related you could try:
import socket
socket.setdefaulttimeout(number)
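The default timeout applies to sockets created afterwards; a timeout can also be set on one socket at a time with settimeout. A small sketch of the per-socket variant (the helper name recv_with_timeout is made up for illustration):

```python
import socket


def recv_with_timeout(sock, seconds, nbytes=1024):
    """Return received bytes, or None if nothing arrived within `seconds`."""
    sock.settimeout(seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:  # raised when the blocking call exceeds the timeout
        return None
```

This only bounds the time spent waiting on the network; it does nothing for CPU-bound code.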
I found this with eventlet library:
http://eventlet.net/doc/modules/timeout.html
from eventlet.timeout import Timeout

timeout = Timeout(seconds, exception)
try:
    ...  # execution here is limited by timeout
finally:
    timeout.cancel()
For "normal" Python code that doesn't linger for prolonged periods in C extensions or I/O waits, you can achieve your goal by setting a trace function with sys.settrace() that aborts the running code when the timeout is reached.
Whether that is sufficient or not depends on how cooperative or malicious the code you run is. If it's well-behaved, a tracing function is sufficient.
Another way is to use faulthandler:
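A minimal sketch of the tracing approach, assuming the code being limited is pure Python and does not install a trace function of its own. The names TraceTimeout and run_with_trace_timeout are invented for this example:

```python
import sys
import time


class TraceTimeout(Exception):
    """Raised from the trace function once the deadline has passed."""


def run_with_trace_timeout(func, seconds, *args, **kwargs):
    deadline = time.monotonic() + seconds

    def tracer(frame, event, arg):
        # Called for 'call', 'line', 'return' and 'exception' events of
        # Python-level code; raising here aborts the traced code.
        if time.monotonic() > deadline:
            raise TraceTimeout("exceeded %.3f seconds" % seconds)
        return tracer  # keep tracing inside this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # restore whatever was installed before
```

The check runs on every traced line, so the limited code gets noticeably slower, and a single long call into C code (time.sleep, a slow regex, blocking I/O) is not interrupted until it returns.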
import time
import faulthandler

faulthandler.enable()
try:
    faulthandler.dump_traceback_later(3)
    time.sleep(10)
finally:
    faulthandler.cancel_dump_traceback_later()
N.B.: The faulthandler module has been part of the stdlib since Python 3.3.
If you're running code that you expect to die after a set time, then you should write it properly so that there aren't any negative effects on shutdown, no matter if it's a thread or a subprocess. A command pattern with undo would be useful here.
So, it really depends on what the thread is doing when you kill it. If it's just crunching numbers, who cares if you kill it. If it's interacting with the filesystem and you kill it, then maybe you should really rethink your strategy.
What is supported in Python when it comes to threads? Daemon threads and joins. Why does Python let the main thread exit if you've joined a daemon while it's still active? Because it's understood that someone using daemon threads will (hopefully) write the code in a way that it won't matter when that thread dies. Giving a timeout to a join and then letting main die, and thus taking any daemon threads with it, is perfectly acceptable in this context.
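The "daemon thread plus join with a timeout" pattern described above might look roughly like this. It is a sketch only; the helper name run_in_daemon_thread and its (finished, value) return convention are assumptions made for the example:

```python
import threading


def run_in_daemon_thread(func, timeout_seconds, *args):
    """Run func(*args) in a daemon thread and wait at most timeout_seconds.

    Returns (True, value) if func finished in time, else (False, None).
    The thread is NOT killed on timeout; it is simply abandoned and dies
    with the interpreter because it is a daemon.
    """
    result = {}

    def target():
        result["value"] = func(*args)

    worker = threading.Thread(target=target)
    worker.daemon = True
    worker.start()
    worker.join(timeout_seconds)
    if worker.is_alive():
        return False, None
    return True, result.get("value")
```

Because the abandoned thread keeps running in the background, this is only reasonable when, as the answer says, it does not matter when that thread dies.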
I solved it this way.
For me it worked great (on Windows, and it is not heavy at all). I hope it is useful for someone :)
import threading
import time

class LongFunctionInside(object):
    lock_state = threading.Lock()
    working = False

    def long_function(self, timeout):
        self.working = True
        timeout_work = threading.Thread(name="thread_name", target=self.work_time, args=(timeout,))
        timeout_work.daemon = True
        timeout_work.start()
        while True:  # endless/long work
            time.sleep(0.1)  # at this rate the CPU is almost not used
            if not self.working:  # the watcher thread has cleared the flag
                break
        self.set_state(True)

    def work_time(self, sleep_time):
        # thread function that just sleeps for the specified time; on waking up
        # it checks whether the function is still working and, if it is, sets
        # the guarded variable `working` to False
        time.sleep(sleep_time)
        if self.working:
            self.set_state(False)

    def set_state(self, state):  # guarded state change
        while True:
            self.lock_state.acquire()
            try:
                self.working = state
                break
            finally:
                self.lock_state.release()

lw = LongFunctionInside()
lw.long_function(10)
The main idea is to create a thread that just sleeps in parallel to the "long work" and, on waking up (after the timeout), changes the state of the guarded variable; the long function checks that variable during its work.
I'm pretty new to Python programming, so if this solution has fundamental errors, such as resource, timing, or deadlock problems, please respond :)
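The same cooperative idea can be written more compactly with threading.Event and threading.Timer from the standard library. This is a sketch of that variation, not the author's code; the function long_work and its step counter are invented for illustration:

```python
import threading
import time


def long_work(timeout_seconds):
    """Do small chunks of work until the timer sets the stop flag."""
    stop = threading.Event()
    timer = threading.Timer(timeout_seconds, stop.set)  # sets the flag later
    timer.daemon = True
    timer.start()
    steps = 0
    try:
        while not stop.is_set():  # the long function polls the flag
            steps += 1
            time.sleep(0.01)  # stand-in for one small chunk of real work
    finally:
        timer.cancel()  # harmless if the timer has already fired
    return steps
```

As in the original, this only works when the long-running code can be broken into chunks that check the flag between them.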
Solving it with the 'with' construct, merged with the solution from -
Timeout function if it takes too long to finish
- that thread, which works better.
import threading, time

class Exception_TIMEOUT(Exception):
    pass

class linwintimeout:
    def __init__(self, f, seconds=1.0, error_message='Timeout'):
        self.seconds = seconds
        self.thread = threading.Thread(target=f)
        self.thread.daemon = True
        self.error_message = error_message

    def handle_timeout(self):
        raise Exception_TIMEOUT(self.error_message)

    def __enter__(self):
        try:
            self.thread.start()
            self.thread.join(self.seconds)
        except Exception as te:
            raise te

    def __exit__(self, type, value, traceback):
        if self.thread.is_alive():
            return self.handle_timeout()

def function():
    while True:
        print("keep printing ...")
        time.sleep(1)

try:
    with linwintimeout(function, seconds=5.0, error_message='exceeded timeout of %s seconds' % 5.0):
        pass
except Exception_TIMEOUT as e:
    print(" attention !! exceeded timeout, giving up ... %s " % e)
I'm using the code solution mentioned here.
I'm new to decorators, and don't understand why this solution doesn't work if I want to write something like the following:
@timeout(10)
def main_func():
    nested_func()
    while True:
        continue

@timeout(5)
def nested_func():
    print("finished doing nothing")
=> The result is no timeout at all. We are stuck in an endless loop.
However, if I remove the @timeout annotation from nested_func, I get a timeout error.
For some reason we can't use the decorator on a function and on a nested function at the same time. Any idea why, and how can I correct it so that it works? Assume that the containing function's timeout must always be bigger than the nested timeout.
This is a limitation of the signal module's timing functions, which the decorator you linked uses. Here's the relevant piece of the documentation (with emphasis added by me):
signal.alarm(time)
If time is non-zero, this function requests that a SIGALRM signal be sent to the process in time seconds. Any previously scheduled alarm is canceled (only one alarm can be scheduled at any time). The returned value is then the number of seconds before any previously set alarm was to have been delivered. If time is zero, no alarm is scheduled, and any scheduled alarm is canceled. If the return value is zero, no alarm is currently scheduled. (See the Unix man page alarm(2).) Availability: Unix.
So, what you're seeing is that when your nested_func is called, its timer cancels the outer function's timer.
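The replacement behaviour from the quoted documentation can be observed directly. A tiny sketch for Unix only (the helper name replace_alarm is made up):

```python
import signal


def replace_alarm(first_seconds, second_seconds):
    """Schedule one alarm, then another, and report what the second call returns."""
    signal.alarm(first_seconds)
    # Scheduling a second alarm cancels the first one and returns the
    # number of seconds the first one still had left.
    remaining_of_first = signal.alarm(second_seconds)
    signal.alarm(0)  # cancel the second alarm too, so no SIGALRM is delivered
    return remaining_of_first
```

Calling replace_alarm(10, 5) should report roughly 10: the outer 10-second alarm is gone, and only its remaining time is handed back to the caller.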
You can update the decorator to pay attention to the return value of the alarm call (which will be the time before the previous alarm (if any) was due). It's a bit complicated to get the details right, since the inner timer needs to track how long its function ran for, so it can modify the time remaining on the previous timer. Here's an untested version of the decorator that I think gets it mostly right (but I'm not entirely sure it works correctly for all exception cases):
import time
import signal

class TimeoutError(Exception):
    def __init__(self, value="Timed Out"):
        self.value = value

    def __str__(self):
        return repr(self.value)

def timeout(seconds_before_timeout):
    def decorate(f):
        def handler(signum, frame):
            raise TimeoutError()

        def new_f(*args, **kwargs):
            old = signal.signal(signal.SIGALRM, handler)
            old_time_left = signal.alarm(seconds_before_timeout)
            if 0 < old_time_left < seconds_before_timeout:  # never lengthen existing timer
                signal.alarm(old_time_left)
            start_time = time.time()
            try:
                result = f(*args, **kwargs)
            finally:
                if old_time_left > 0:  # deduct f's run time from the saved timer
                    old_time_left -= time.time() - start_time
                    # alarm() needs a whole number of seconds; keep at least 1
                    # so the outer timer is not cancelled by accident
                    old_time_left = max(1, int(round(old_time_left)))
                signal.signal(signal.SIGALRM, old)
                signal.alarm(old_time_left)
            return result

        new_f.__name__ = f.__name__
        return new_f

    return decorate
As Blckknght pointed out, you can't use signals for nested decorators, but you can use multiprocessing to achieve that.
You might use this decorator, which supports nested decorators: https://github.com/bitranox/wrapt_timeout_decorator
And as ABADGER1999 points out in his blog, https://anonbadger.wordpress.com/2018/12/15/python-signal-handlers-and-exceptions/,
using signals and the TimeoutException is probably not the best idea, because the exception can be caught in the decorated function.
Of course you can use your own exception, derived from the base Exception class, but the code might still not work as expected.
See the next example. You may try it out in Jupyter: https://mybinder.org/v2/gh/bitranox/wrapt_timeout_decorator/master?filepath=jupyter_test_wrapt_timeout_decorator.ipynb
import time
from wrapt_timeout_decorator import *

# caveats when using signals - the TimeoutError raised by the signal may be caught
# inside the decorated function.
# So you might use your own exception, derived from the base Exception class.
# In Python-3.7.1 stdlib there are over 300 pieces of code that will catch your timeout
# if you were to base an exception on Exception. If you base your exception on BaseException,
# there are still 231 places that can potentially catch your exception.
# You should use use_signals=False if you want to make sure that the timeout is handled correctly!
# Therefore the default value for use_signals is False on this decorator!

@timeout(5, use_signals=True)
def mytest(message):
    try:
        print(message)
        for i in range(1, 10):
            time.sleep(1)
            print('{} seconds have passed - lets assume we read a big file here'.format(i))
    # TimeoutError is a subclass of OSError - therefore it is caught here!
    except OSError:
        for i in range(1, 10):
            time.sleep(1)
            print('Whats going on here ? - Ooops the Timeout Exception is caught by the OSError ! {}'.format(i))
    except Exception:
        # even worse !
        pass
    except:
        # the worst - and exists more than 300x in actual Python 3.7 stdlib code !
        # so you never really can rely on catching the TimeoutError when using signals !
        pass

if __name__ == '__main__':
    try:
        mytest('starting')
        print('no Timeout Occurred')
    except TimeoutError:
        # this will never be printed because the decorated function implicitly catches the TimeoutError !
        print('Timeout Occurred')
There's a better version of the timeout decorator that's currently on PyPI. It supports both UNIX and non-UNIX operating systems. The part where signals are mentioned applies specifically to UNIX.
Assuming you aren't using UNIX, below is a snippet from the decorator that shows the list of parameters you can use as required.
def timeout(seconds=None, use_signals=True, timeout_exception=TimeoutError, exception_message=None)
For an implementation on a non-UNIX operating system, this is what I would do:
import time
import timeout_decorator

@timeout_decorator.timeout(10, use_signals=False)
def main_func():
    nested_func()
    while True:
        continue

@timeout_decorator.timeout(5, use_signals=False)
def nested_func():
    print("finished doing nothing")
Notice that I'm passing use_signals=False. That's all; you should be good to go.
I have struggled with this question for about a week -- time to ask someone who can bang out an answer in a couple minutes.
I am trying to run a python program once every 10 seconds. There are a lot of questions of this sort : Use sched module to run at a given time, Python threading.timer - repeat function every 'n' seconds, How to execute a function asynchronously every 60 seconds in Python?
Normally the solutions using sched or time.sleep would work, but I am trying to start a scheduled process from within cmd2, which is already running in a while False loop. (When you exit cmd2, it exits this loop).
Because of this, when I start a function to repeat every 10 seconds, I enter another loop nested within cmd2 and I am unable to enter cmd2 commands. I can only get back to cmd2 by exiting the sub-loop that is repeating the function, and thus the function stops repeating.
Evidently threading will solve this problem. I have tried threading.Timer without success. Perhaps the real problem is that I do not understand threads or multiprocessing.
Here is an example of code that is roughly isomorphic to the code I'm using, using sched module, which I got to work:
import cmd2
import repeated

class prompt(cmd2.Cmd):
    """this lets you enter commands"""

    def default(self, line):
        return cmd2.Cmd.default(self, line)

    def do_exit(self, line):
        return True

    def do_repeated(self, line):
        repeated.func_1()
Where repeated.py looks like this:
import sched
import time

def func_2(sc):
    print('doing stuff')
    sc.enter(10, 0, func_2, (sc,))

def func_1():
    s = sched.scheduler(time.time, time.sleep)
    s.enter(0, 0, func_2, (s,))
    s.run()
http://docs.python.org/2/library/queue.html?highlight=queue#Queue
Can you instantiate a Queue object outside of cmd2? There can be one thread that watches the queue and takes jobs from it at periodic intervals, while cmd2 is free to run or not run. The thread that processes the queue and the queue object itself need to be in the outer scope, of course.
To schedule something at a particular time, you can insert a tuple which has the target time in it. Or you can have the thread just check at regular intervals, if that's good enough.
[Edit: if you have a process that is intended to repeat, you can have it requeue itself at the end of its operation.]
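A rough sketch of that arrangement, written for Python 3 (where the module is called queue rather than Queue). The function names run_jobs and start_worker and the (run_at, job) tuple layout are assumptions made for this example:

```python
import queue
import threading
import time


def run_jobs(jobs, stop):
    """Take (run_at, job) tuples off the queue and call each job at its time."""
    while not stop.is_set():
        try:
            run_at, job = jobs.get(timeout=0.05)
        except queue.Empty:
            continue  # nothing queued; check the stop flag again
        delay = run_at - time.monotonic()
        if delay > 0 and stop.wait(delay):
            break  # asked to stop while waiting for the target time
        job()


def start_worker():
    """Create the queue, the stop flag and the background thread."""
    jobs = queue.Queue()
    stop = threading.Event()
    worker = threading.Thread(target=run_jobs, args=(jobs, stop))
    worker.daemon = True
    worker.start()
    return jobs, stop, worker
```

A repeating job simply puts itself back on the queue with a new target time, e.g. jobs.put((time.monotonic() + 10, job)), while the cmd2 loop stays free to accept commands. Note that this simple version handles jobs in the order they were queued, not in order of their target times.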
As soon as I asked the question I was able to figure it out. Don't know why that happens sometimes.
This code
import threading

def f():
    # do something here ...
    # call f() again in 60 seconds
    threading.Timer(60, f).start()

# start calling f now and every 60 sec thereafter
f()
From here: How to execute a function asynchronously every 60 seconds in Python?
Actually works for what I was trying to do. There are evidently some subtleties in how the function is passed as an argument to threading.Timer. Before, when I was including the arguments and even the parentheses after the function, I was getting recursion depth errors, i.e. the function was calling itself constantly without any delay.
So anyone else who has a problem like this, pay attention to how you pass the function in threading.Timer(60, f).start(). If you write threading.Timer(60, f()).start() or something similar, it will probably not work, because f() calls the function immediately instead of handing it to the timer.
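To pass arguments without calling the function yourself, threading.Timer takes them separately through its args parameter. A small sketch of a repeating call that can also be stopped (the helper name repeat and the stop event are additions made for illustration):

```python
import threading


def repeat(interval, func, stop, *args):
    """Call func(*args) now and then every `interval` seconds until stop is set."""
    if stop.is_set():
        return
    func(*args)
    # Pass the callable and its arguments separately; writing
    # Timer(interval, repeat(...)) would call it immediately and recurse.
    timer = threading.Timer(interval, repeat, args=(interval, func, stop) + args)
    timer.daemon = True
    timer.start()
```

Setting the event, e.g. with stop.set() from a cmd2 command, ends the cycle at the next tick.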
Possible Duplicate:
Timeout on a Python function call
How to timeout function in python, timout less than a second
I am running a function within a for loop, such as the following:
for element in my_list:
    my_function(element)
For some reason, some elements may send the function into a very long processing time (maybe even an infinite loop whose origin I cannot really trace). So I want to add some loop control that skips the current element if processing it takes, for example, more than 2 seconds. How can this be done?
I would discourage the most obvious answer - using signal.alarm() and an alarm signal handler that asynchronously raises an exception to jump out of task execution. In theory it should work great, but in practice the CPython interpreter doesn't guarantee that the handler is executed within the time frame that you want. Signal handling can be delayed by some number of bytecode instructions, so the exception could still be raised after you explicitly cancel the alarm (outside the context of the try block).
A problem we ran into regularly was that the alarm handler's exception would get raised after the timeoutable code completed.
Since there isn't much available by way of thread control, I have relied on process control for handling tasks that must be subjected to a timeout. Basically, the gist is to hand off the task to a child process and kill the child process if the task takes too long. multiprocessing.pool isn't quite that sophisticated - so I have a home-rolled pool for that level of control.
Something like this:
import signal
import time

class Timeout(Exception):
    pass

def try_one(func, t):
    def timeout_handler(signum, frame):
        raise Timeout()

    old_handler = signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(t)  # trigger alarm in t seconds
    try:
        t1 = time.time()
        func()
        t2 = time.time()
    except Timeout:
        print('{} timed out after {} seconds'.format(func.__name__, t))
        return None
    finally:
        signal.signal(signal.SIGALRM, old_handler)
        signal.alarm(0)
    return t2 - t1

def troublesome():
    while True:
        pass

try_one(troublesome, 2)
The function troublesome will never return on its own. If you use try_one(troublesome, 2), it successfully times out after 2 seconds.
Sometimes my function hangs. If the function runs longer than 3 seconds,
I want to quit from it. How can I do this?
Thanks
Well, you should probably just figure out why it is hanging. If it is taking a long time to do the work, then perhaps you could be doing something more efficiently. If it needs more than three seconds, then how will breaking in the middle help you? The assumption is that the work actually needs to be done.
What call is hanging? If you don't own the code to the call that is hanging you're SOL unless you run it on another thread and create a timer to watch it.
This sounds to me like it may be a case of trying to find an answer to the wrong question, but perhaps you are waiting on a connection or some operation that should have a timeout period. With the information (or lack thereof) you have given us it is impossible to tell which is true. How about posting some code so that we are able to give a more definitive answer?
I had to do something like that, too. I whipped up a module that provides a decorator to limit the runtime of a function:
import logging
import signal
from functools import wraps

MODULELOG = logging.getLogger(__name__)

class Timeout(Exception):
    """The timeout handler was invoked"""

def _timeouthandler(signum, frame):
    """This function is called by SIGALRM"""
    MODULELOG.info('Invoking the timeout')
    raise Timeout

def timeout(seconds):
    """Decorate a function to set a timeout on its execution"""
    def wrap(func):
        """Wrap a timer around the given function"""
        @wraps(func)
        def inner(*args, **kwargs):
            """Set an execution timer, execute the wrapped function,
            then clear the timer."""
            MODULELOG.debug('setting an alarm for %d seconds on %s' % (seconds, func))
            oldsignal = signal.signal(signal.SIGALRM, _timeouthandler)
            signal.alarm(seconds)
            try:
                return func(*args, **kwargs)
            finally:
                MODULELOG.debug('clearing the timer on %s' % func)
                signal.alarm(0)
                signal.signal(signal.SIGALRM, oldsignal)
        return inner
    return wrap
I use it like:
@limits.timeout(300)
def dosomething(args): ...
In my case, dosomething was calling an external program which would periodically hang and I wanted to be able to cleanly exit whenever that happened.
I was able to set a timeout on input by using the select module.
You should be able to set your input type appropriately.
import select
import sys

i, o, e = select.select([sys.stdin], [], [], 5)
This will listen for input for 5 seconds before moving on.
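Wrapped into a small helper, the select call might be used like this. The sketch assumes a Unix-like system, since on Windows select only accepts sockets; the helper name read_line_with_timeout is made up:

```python
import select
import sys


def read_line_with_timeout(stream, seconds):
    """Return one line from stream, or None if nothing is readable in time."""
    ready, _, _ = select.select([stream], [], [], seconds)
    if not ready:
        return None  # timed out without any input
    return stream.readline()
```

For interactive input this would be called as read_line_with_timeout(sys.stdin, 5).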