Python, timeout of try/except statement after X number of seconds?

I've been searching for this but can't seem to find an exact answer (most answers get into more complicated things like multithreading, etc.). I just want to do something like a try/except statement where, if the process doesn't finish within X seconds, it throws an exception.
EDIT: The reason for this is that I am using website testing software (Selenium) with a configuration that sometimes causes it to hang. It doesn't throw an error, doesn't time out, doesn't do anything, so I have no way of catching it. I am wondering what the best way is to determine that this has occurred so I can move on in my application. I was thinking of something like, "if this hasn't finished within X seconds, move on".

You can't do it without some sort of multithreading or multiprocessing, even if that's hidden under some layers of abstraction, unless that "process" you're running is specifically designed for asynchronicity and calls back to a known function once in a while.
If you describe what that process actually is, it will be easier to provide real solutions. I don't think you appreciate the power of Python when it comes to implementations that are succinct while being complete. This may take just a few lines of code to implement, even using multithreading/multiprocessing.

A generic solution if you are using UNIX:
import time
import signal

def handler(signum, frame):
    raise Exception('Action took too much time')

signal.signal(signal.SIGALRM, handler)
signal.alarm(3)  # set to the number of seconds you want to wait

try:
    # RUN CODE HERE
    for i in range(0, 5):
        time.sleep(1)
except Exception:
    print('Action timed out')

signal.alarm(10)  # resets the alarm to a fresh 10 seconds
signal.alarm(0)   # disables the alarm

The answer (and the question) isn't specific to the try/except statement. If you want an "infinite" loop that actually stops after a while, it probably shouldn't be infinite in the first place. For example, change it to use:
while time <= some_value:
Or add an extra check to the body of the loop, making it break when you want it to stop:
while True:
    ...
    if time > some_value:
        break
If that isn't possible (for example, because you don't control the loop at all), things get significantly harder. On many systems you could use signal.alarm to have a signal delivered after a period of time, and then have a signal handler for signal.SIGALRM that raises your TimeoutError exception. That may or may not work depending on what the infinite loop actually does; if it blocks signals, or catches and ignores exceptions, or in any other way interferes with the signal handler, that won't work.

Another possibility would be to not do the loop in the current process, but in a separate one; you can then terminate the separate process at your whim. It won't be easy to do anything but terminate it, though, so cleanup or recovery of partial work is very hard. (Threads aren't going to work at all, because there's no way to interrupt the separate thread doing the infinite loop.)
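A minimal sketch of that separate-process approach, using the standard multiprocessing module (uncooperative_loop is a stand-in for the loop you don't control, and the 5-second limit is illustrative):
from multiprocessing import Process
import time

def uncooperative_loop():
    # Stand-in for the loop you cannot modify.
    while True:
        time.sleep(1)

if __name__ == '__main__':
    p = Process(target=uncooperative_loop)
    p.start()
    p.join(5)           # wait at most 5 seconds
    if p.is_alive():
        p.terminate()   # no cleanup or partial results, as noted above
        p.join()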

Related

Python - while loop inside Twisted main loop?

I need to know if it's possible to run a while loop inside a Twisted websocket main loop.
The while loop I'm referring to is in the lib you see in this question: shout-python segmentation fault how can I fix this?
All I need it to do is send the new title once it updates, though I can handle that part. It's the while self.running: in the play() function. If you can help I'll surely appreciate it.
For Twisted's single-threaded, cooperative multitasking system to operate at its best, it's important that any particular piece of code running in the reactor thread not run for too long without giving control back to the reactor. As long as any one piece of code is running in that thread, no other code is running in that thread. In a single-threaded, cooperative multitasking system that means other events aren't being serviced.
Depending on your application, it may be fine for a single piece of code to run without giving up control for many milliseconds, many seconds, perhaps even minutes. It's entirely dependent on what events your application is responsible for handling and what level of responsiveness you want to get from it. When writing general purpose library code for a system like this, most people assume that it's only okay to run code for a single task for a handful of milliseconds or so before giving up control - to err on the side of being suitable for use in more applications rather than fewer (although people rarely consider the exact time limit, mostly operations are separated into "pretty quick" and everything else).
What's almost always unacceptable is to run a single piece of code indefinitely without giving control back to the reactor. The loop in the answer you linked to is effectively infinite and so it will hold control for an arbitrarily long period of time (perhaps for most of the runtime of the program). There are few if any applications that can tolerate this since the result is that other events will never be handled. If it's tolerable for your application to be unable to respond to any events while it spends its entire run time working on a single task then you may not need a multitasking system at all (ie, you may not need Twisted, you may be able to just use a while loop).
The loop in that answer is basically a "process some data as quickly as possible" loop. There are a few options for implementations of this kind of work in ways that are more multitasking-friendly.
One way is a literal translation of the loop into a pattern that's friendly to the reactor. You can do this with a generator:
from twisted.internet.task import cooperate

class Audio(object):
    def play(self):
        # ... setup stuff ...
        while self.running:
            # ... one chunk of work ...
            yield

def main():
    ...
    cooperate(Audio().play())
cooperate takes an iterator and iterates over it - but not all at once. It iterates it a few times and then gives up control to the reactor. Then it iterates it a few more times and then gives up control again. This continues until the iterator is exhausted (or the reactor is stopped).
Another slightly less literal translation is based on LoopingCall which takes over responsibility for the looping construct, leaving you only to supply the body of the loop:
from twisted.internet.task import LoopingCall

class Audio(object):
    def play(self):
        # ... setup stuff ...
        LoopingCall(self._play_iteration).start(0)

    def _play_iteration(self):
        # ... one chunk of work ...
This gives you control over the rate at which the loop iterates. The 0 passed to start in this example means "as fast as possible" (wait 0 seconds between iterations) - while remaining cooperative with the rest of the system. If you wanted one iteration per second, you would pass 1, etc.
Another less literal option is to use a data flow abstraction - for example, Twisted's native producer/consumer system or the newer tubes library - to set up multitasking-friendly data processing pipelines that are a further abstraction from the specific "read, process" loop in the linked answer.

Difference between blocking IO and while 1?

While programming I had to make a choice between:
while not cont_flag:
    pass
and using an Event object:
if not cont_flag.is_set():
    cont_flag.wait()
I want to know if there is a performance difference between the two methods.
There is. The first method is called busy waiting and is very different from blocking. In busy waiting, the CPU is being used constantly as the while loop is executed. In blocking, the thread is actually suspended until a wake-up condition is met.
See also this discussion:
What is the difference between busy-wait and polling?
The first one is referred to as busy waiting; it will eat up 100% of the CPU time while waiting. It's much better practice to have some signalling mechanism to communicate events (e.g. that something is done).
CPython only allows a single thread to execute Python bytecode at a time (because of the global interpreter lock), regardless of how many CPUs your system may have. If multiple threads are ready to run, Python will switch among them periodically. If you "busy wait" as in your first example, that while loop will eat up much of the time that your other threads could use for their work. While the second solution is far superior, if you end up using the first one, add a modest sleep to it:
while not cont_flag:
    time.sleep(.1)
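For completeness, a runnable sketch of the Event-based version, with the flag shared between a worker thread and a waiter (the names are illustrative):
import threading
import time

cont_flag = threading.Event()

def worker():
    time.sleep(2)    # simulate some work
    cont_flag.set()  # wake up anything blocked in wait()

threading.Thread(target=worker).start()
cont_flag.wait()     # suspends this thread instead of burning CPU
print('flag set, continuing')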

How to handle timeouts when a process receives SIGSTOP and SIGCONT?

I have some Python code which uses threading.Timer to implement a 60-second timeout for an operation.
The problem is that this code runs in a job-control environment where it may get pre-empted by a higher priority job. In this case it will be sent SIGSTOP, and then some time later, SIGCONT. I need a way to somehow notice that this has happened and reset the timeout: obviously the operation hasn't really timed out if it's been suspended for the whole 60 seconds.
I tried to add a signal handler for SIGCONT but this seems to get executed after the code provided to threading.Timer has been executed.
Is there some way to achieve this?
A fairly simple answer that occurred to me after posting this is to break the timer up into multiple sub-timers, e.g. ten 6-second timers where each one starts the next in a chain. That way, if I get suspended, I only lose one of the timers and still get most of the wait before timing out.
This is of course not foolproof, especially if I get repeatedly suspended and restarted, but it's easy to do and seems like it might be good enough.
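A sketch of that chaining idea (the function names are illustrative; the ten 6-second intervals approximate the original 60-second timeout):
import threading

def on_timeout():
    print('operation timed out')

def chain(remaining, interval, callback):
    # Each timer starts the next one, so a SIGSTOP/SIGCONT cycle
    # delays at most the single timer that was pending at the time.
    if remaining == 0:
        callback()
        return
    threading.Timer(interval, chain,
                    args=(remaining - 1, interval, callback)).start()

chain(10, 6.0, on_timeout)  # roughly a 60-second timeout
Cancelling the chain when the operation finishes normally takes a little extra bookkeeping (e.g. a shared flag the callback checks), since only the currently pending Timer can be .cancel()ed.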
You need to rethink what you're asking for: a timeout reflects elapsed wall-clock time, but what you actually want to measure is the CPU time used by your process.
Fortunately you can measure this with getrusage: http://docs.python.org/library/resource.html
You'll still need to set a timeout; when it returns, measure the increase in user or system time usage since the start of the operation and terminate the operation if it exceeds the limit, else reschedule the timeout appropriately.
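A sketch of that measure-and-reschedule idea (the 60-second budget and the helper names are illustrative, not from the answer above):
import resource
import threading

CPU_BUDGET = 60.0  # seconds of CPU time allowed for the operation

def cpu_seconds():
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_utime + ru.ru_stime  # user + system CPU time

def watch(start_cpu, on_timeout):
    used = cpu_seconds() - start_cpu
    if used >= CPU_BUDGET:
        on_timeout()
    else:
        # Reschedule for the CPU time still owed; wall-clock time spent
        # suspended never counts against the budget.
        threading.Timer(CPU_BUDGET - used, watch,
                        args=(start_cpu, on_timeout)).start()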
If your application is multi-threaded, the docs say that:
only the main thread can set a new signal handler, and the main thread will be the only one to receive signals
Make sure you are handling your signals from the main thread.

Python: getting a Queue.Empty exception from a nonempty multiprocessing.Queue

I'm having the opposite problem of many Python users: my program is using too little CPU. I already got help switching to multiprocessing to utilize all four of my work computer's cores, and I have seen real performance improvement as a result. But the improvement is somewhat unreliable. The CPU usage of my program seems to deteriorate as it continues to run, even with six processes running. After adding some debug messages, I discovered this was because some of the processes I was spawning (which are all supposed to run until completion) were dying prematurely. The main body of the method the processes run is a while True loop, and the only way out is this block:
try:
    f = filequeue.get(False)
except Empty:
    print "Done"
    return
filequeue is populated before the creation of the subprocesses, so it definitely isn't actually empty. All the processes should exit at roughly the same time once it actually is empty. I tried adding a nonzero timeout (0.05) parameter to the Queue.get call, but this didn't fix the problem. Why could I be getting a Queue.Empty exception from a nonempty Queue?
I suggest using filequeue.get(True) instead of filequeue.get(False). This will cause the queue to block until there are more elements.
It will, however, block forever after the final element has been processed.
To work around this, the main process could add a special "sentinel" object at the end of each queue. The workers would terminate upon seeing this special object (instead of relying on the emptiness of the queue).
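A sketch of that sentinel pattern (here None is the sentinel; any unique object the workers can recognize works, and process() stands in for the real per-item work):
from multiprocessing import Process, Queue

SENTINEL = None

def worker(q):
    while True:
        item = q.get(True)  # block until something arrives
        if item is SENTINEL:
            return          # orderly shutdown, no Empty race
        process(item)       # stand-in for the real work

def main(files, n_workers=4):
    q = Queue()
    for f in files:
        q.put(f)
    for _ in range(n_workers):
        q.put(SENTINEL)     # one sentinel per worker
    workers = [Process(target=worker, args=(q,)) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()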
I had a similar problem and found out from experimentation, not reading the docs, that even when the queue is non-empty, get(False) can still spuriously throw Empty. In my use case, the workers have to exit when they run out of work in the Queue, so get(True) is a non-option.
My solution was this: I found that if, in the except Empty: block, I check that the Queue is indeed empty(), it works -- empty() will not return True unless the Queue is really empty.
I was using Python 2.7.
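In other words, something like this (a sketch of that double-check; note that the reliability of empty() here is the author's empirical observation on Python 2.7, not a documented guarantee, and handle() stands in for the real work):
from Queue import Empty  # on Python 3: from queue import Empty

def worker(filequeue):
    while True:
        try:
            f = filequeue.get(False)
        except Empty:
            if filequeue.empty():  # really drained, not a spurious Empty
                return             # done
            continue               # spurious Empty: retry the get
        handle(f)                  # stand-in for the real per-file work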

Python: time a method call and stop it if time is exceeded

I need to dynamically load code (it comes as source), run it, and get the results. The code that I load always includes a run method, which returns the needed results. Everything looks ridiculously easy, as usual in Python, since I can do:
exec(source)  # source includes the run() definition
result = run(params)
# do stuff with result
The only problem is that the run() method in the dynamically generated code can potentially not terminate, so I need to run it for at most X seconds. I could spawn a new thread for this and specify a timeout for the .join() method, but then I cannot easily get the result out of it (or can I?). Performance is also an issue to consider, since all of this is happening inside a long while loop.
Any suggestions on how to proceed?
Edit: to clear things up per dcrosta's request: the loaded code is not untrusted; it is generated automatically on the machine. The purpose of this is genetic programming.
The only "really good" solutions -- imposing essentially no overhead -- are going to be based on SIGALRM, either directly or through a nice abstraction layer; but as already remarked Windows does not support this. Threads are no use, not because it's hard to get results out (that would be trivial, with a Queue!), but because forcibly terminating a runaway thread in a nice cross-platform way is unfeasible.
This leaves high-overhead multiprocessing as the only viable cross-platform solution. You'll want a process pool to reduce process-spawning overhead (since presumably the need to kill a runaway function is only occasional, most of the time you'll be able to reuse an existing process by sending it new functions to execute). Again, Queue (the multiprocessing kind) makes getting results back easy (albeit with a modicum more caution than for the threading case, since in the multiprocessing case deadlocks are possible).
If you don't need to strictly serialize the executions of your functions, but rather can arrange your architecture to try two or more of them in parallel, AND are running on a multi-core machine (or multiple machines on a fast LAN), then suddenly multiprocessing becomes a high-performance solution, easily paying back for the spawning and IPC overhead and more, exactly because you can exploit as many processors (or nodes in a cluster) as you can use.
You could use the multiprocessing library to run the code in a separate process, and call .join() on the process to wait for it to finish, with the timeout parameter set to whatever you want. The library provides several ways of getting data back from another process - using a Value object (seen in the Shared Memory example on that page) is probably sufficient. You can use the terminate() call on the process if you really need to, though it's not recommended.
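A sketch of that approach, assuming a fork-based start method so the child inherits the exec'd run() definition (run and params come from the question; the 5-second limit and the double-typed Value are illustrative):
from multiprocessing import Process, Value

def target(result):
    result.value = run(params)  # run() comes from the exec'd source

result = Value('d', 0.0)        # 'd' = double; adjust to your result type
p = Process(target=target, args=(result,))
p.start()
p.join(5)                       # give run() at most 5 seconds
if p.is_alive():
    p.terminate()               # runaway individual: discard it
    p.join()
else:
    fitness = result.value      # result written by the child process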
You could also use Stackless Python, as it allows for cooperative scheduling of microthreads. Here you can specify a maximum number of instructions to execute before returning. Setting up the routines and getting the return value out is a little more tricky though.
I could spawn a new thread for this, and specify a time for .join() method, but then I cannot easily get the result out of it
If the timeout expires, that means the method didn't finish, so there's no result to get. If you have incremental results, you can store them somewhere and read them out however you like (keeping threadsafety in mind).
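For the getting-the-result-out part specifically, a Queue does make it straightforward, as an earlier answer noted; a sketch (the runaway thread itself still cannot be killed):
import threading
from Queue import Queue, Empty  # on Python 3: from queue import Queue, Empty

def call_with_timeout(func, args, timeout):
    q = Queue()
    t = threading.Thread(target=lambda: q.put(func(*args)))
    t.daemon = True                  # don't let a runaway thread block exit
    t.start()
    try:
        return q.get(True, timeout)  # block for up to `timeout` seconds
    except Empty:
        raise RuntimeError('timed out; the thread may still be running')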
Using SIGALRM-based systems is dicey, because the signal can be delivered asynchronously at any time, even during an except or finally handler where you're not expecting one. (Other languages deal with this better, unfortunately.) For example:
try:
    # code
finally:
    cleanup1()
    cleanup2()
    cleanup3()
An exception raised by the SIGALRM handler might fire during cleanup2(), which would cause cleanup3() to never be executed. Python simply does not have a way to terminate a running thread in a way that's both uncooperative and safe.
You should just have the code check the timeout on its own.
import threading
from datetime import datetime, timedelta

local = threading.local()

class ExecutionTimeout(Exception):
    pass

def start(max_duration=timedelta(seconds=1)):
    local.start_time = datetime.now()
    local.max_duration = max_duration

def check():
    if datetime.now() - local.start_time > local.max_duration:
        raise ExecutionTimeout()

def do_work():
    start()
    while True:
        check()
        # do stuff here
    return 10

try:
    print do_work()
except ExecutionTimeout:
    print "Timed out"
(Of course, this belongs in a module, so the code would actually look like "timeout.start()"; "timeout.check()".)
If you're generating code dynamically, then generate a timeout.check() call at the start of each loop.
Consider using the stopit package, which can be useful when you need timeout control. Its documentation emphasizes its limitations.
https://pypi.python.org/pypi/stopit
A quick Google search for "python timeout" reveals a TimeoutFunction class.
Executing untrusted code is dangerous and should be avoided whenever possible. I think you're right to be worried about the run time of the run() method, but the run() method could do other things as well: delete all your files, open sockets and make network connections, begin cracking your password and email the result back to an attacker, etc.
Perhaps if you can give some more detail on what the dynamically loaded code does, the SO community can help suggest alternatives.
