PyErr_CheckSignals not picking up signal - python

I have a C extension that implements an LRU cache (https://github.com/pbrady/fastcache). I've recently noticed that in an application (SymPy) that makes fairly heavy use of caching, a timeout signal gets lost and the application continues to run. This only happens with my C extension, not with a pure-Python LRU cache such as functools.lru_cache (https://github.com/pbrady/fastcache/issues/26).
I've peppered my routines with calls to PyErr_CheckSignals() and the signal gets lost less frequently but it's still happening. Note that during a call, the cache will call PyObject_Hash, PyDict_Get/Set/DelItem and PyObject_Call (in the case of a miss).
Here's the relevant snippet of the SymPy code (timeout is an integer):
def _timeout(self, function, timeout):
    def callback(x, y):
        signal.alarm(0)
        raise Skipped("Timeout")
    signal.signal(signal.SIGALRM, callback)
    signal.alarm(timeout)  # Set an alarm with a given timeout
    function()
    signal.alarm(0)  # Disable the alarm
Can something be overwriting the signal? If so, how do I work around this?

Turns out there is no real mystery here, but it is a little tricky.
In addition to calls to PyErr_CheckSignals(), the signal can be caught by the interpreter whenever control passes to it via calls like PyObject_Hash or PyDict_Get/Set/DelItem. If the signal is caught by the interpreter in one of these functions, it triggers an exception via the callback function (and the signal goes away, since it was handled). However, I was not checking the return value of all of these calls (i.e. I knew my argument was hashable, so I wasn't checking the return value of PyDict_SetItem).
Thus the exception was ignored and the program continued to execute as if the signal had not happened.
Special thanks to Ondrej for talking through this.


How do I add an errback to deferLater?

Consider the following twisted code, using deferLater:
import random
from twisted.internet.task import deferLater
from twisted.internet import reactor
def random_exception(msg='general'):
if random.random() < 0.5:
raise Exception("Random exception with 50%% likelihood occurred in %s!" % msg)
def dolater():
random_exception('dolater')
print "it's later!"
def whoops(failure):
failure.trap(Exception)
print failure
defer = deferLater(reactor, 10, dolater)
defer.addErrback(whoops)
reactor.run()
An exception is raised during the 10-second sleep (namely a KeyboardInterrupt), yet it seems that the whoops errback is never called. My assumption is that since I add the errback after the deferred kicks off, it's never properly registered. Advice appreciated.
EDIT:
Alright, no one likes my use of the KeyboardInterrupt signal (not the exception) to show an error condition outside of the defer. I thought pretty hard about an actual exception that might occur outside the defer callback, but couldn't think of a particularly good one; most everything would be some kind of signal (or developer error), so signal handling is fine for now. But that wasn't really the heart of the question.
As I understand it, Twisted's callback/errback system handles errors within the callback structure, e.g. if dolater raises an Exception of some kind. To show this, I have added an exception that can occur during dolater, demonstrating that if the exception occurs there, the errback handles it just fine.
My concern was what happens if something goes wrong while the reactor is just reacting normally; the only thing I could get to go wrong was a keyboard interrupt, and I wanted whoops to fire. It appears that if I put other async events into the reactor and raise exceptions from there, the dolater code wouldn't be affected, and I would have to add errbacks to those other async events. There is no master error handling for an entire Twisted program.
So signals it is, until I can find some way to cause the reactor to fail without a signal.
If by KeyboardInterrupt you mean a signal (Ctrl+C, SIGINT, etc.), then what you need to do is set up a signal handler with your whoops function as the callback.
Following two previous answers from @jean-paul-calderone (twisted: catch keyboardinterrupt and shutdown properly, and twisted - interrupt callback via KeyboardInterrupt), I tried the following, and I think it matches your need:
import signal

from twisted.internet import reactor, task

def dolater():
    print "it's later!"

def whoops(signum, stackframe):
    print "I'm here because of signal number " + str(signum)
    reactor.stop()

defer = task.deferLater(reactor, 10, dolater)
signal.signal(signal.SIGINT, whoops)
reactor.run()
That will call whoops on a SIGINT. I put a reactor.stop() in whoops because otherwise the reactor would just keep on running; take that out if you really want it to keep running in the face of a Ctrl+C.
Note: I'm not explicitly showing how to fire an errback from the signal handler because (at least to my understanding) that doesn't really map to how Deferreds should be used. I imagine that if you found a way to get the Deferred into the signal handler you could fire its errback, but I think that's outside the expected use case for Twisted and may have crazy consequences.
The problem is with the actual exception you're trying to catch: KeyboardInterrupt is not a subclass of Exception, so it cannot be caught that way. If you just change the line:
failure.trap(Exception)
into:
failure.trap(KeyboardInterrupt)
it surely would catch it. More on Python's exception hierarchy can be found in the official Python docs: https://docs.python.org/2/library/exceptions.html
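You can check this hierarchy directly in an interpreter (nothing Twisted-specific about it):

```python
# KeyboardInterrupt deliberately sits beside Exception under BaseException,
# so a blanket trap/except on Exception never matches it.
print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(KeyboardInterrupt, BaseException))  # True
```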
Twisted is a library for doing many things concurrently. The things are kept as isolated as possible (given that this is still Python, there's still global state, etc).
If you have a TCP server with two clients connected to it, and one of them sends you some bad data that triggers a bug in your parser, leading to an exception being raised, that exception isn't going to cause the other client to receive any error. Nor would you want it to, I hope (at least not automatically).
Similarly, if you have a client connected to your server and you start a delayed call with deferLater and the client triggers that bug, you wouldn't want the error to be delivered to the errback on the Deferred returned by deferLater.
The idea here is that separate event sources are generally treated separately (until you write some code that glues them together somehow).
For the ten seconds that are passing between when you call deferLater and when Twisted begins to run the function you passed to deferLater, any errors that happen - including you hitting C-c on your keyboard to make Python raise a KeyboardInterrupt - aren't associated with that delayed call and they won't be delivered to the errback you attach to its Deferred.
Only exceptions raised by your dolater function will cause the errback chain of that Deferred to begin execution.

What is the proper way to handle (in python) IOError: [Errno 4] Interrupted system call, raised by multiprocessing.Queue.get

When I use multiprocessing.Queue.get I sometimes get an exception due to EINTR.
I know definitely that sometimes this happens for no good reason (I open another pane in a tmux buffer), and in such a case I would want to continue working and retry the operation.
I can imagine that in some other cases the error would be due to a good reason, and I should stop running or fix some error.
How can I distinguish the two?
Thanks in advance
The EINTR error can be returned from many system calls when the application receives a signal while waiting for other input. Typically these signals can be quite benign and already handled by Python, but the underlying system call still ends up being interrupted. When doing C/C++ coding this is one reason why you can't entirely rely on functions like sleep(). The Python libraries sometimes handle this error code internally, but obviously in this case they're not.
You might be interested to read this thread which discusses this problem.
The general approach to EINTR is to simply handle the error and retry the operation again - this should be a safe thing to do with the get() method on the queue. Something like this could be used, passing the queue as a parameter and replacing the use of the get() method on the queue:
import errno

def my_queue_get(queue, block=True, timeout=None):
    while True:
        try:
            return queue.get(block, timeout)
        except IOError, e:
            if e.errno != errno.EINTR:
                raise

# Now replace instances of queue.get() with my_queue_get(queue), with other
# parameters passed as usual.
Typically you shouldn't need to worry about EINTR in a Python program unless you know you're waiting for a particular signal (for example SIGHUP) and you've installed a signal handler which sets a flag and relies on the main body of the code to pick up the flag. In this case, you might need to break out of your loop and check the signal flag if you receive EINTR.
However, if you're not using any signal handling then you should be able to just ignore EINTR and repeat your operation - if Python itself needs to do something with the signal it should have already dealt with it in the signal handler.
Old question, modern solution: as of Python 3.5, the wonderful PEP 475 - Retry system calls failing with EINTR has been implemented and solves the problem for you. Here is the abstract:
System call wrappers provided in the standard library should be retried automatically when they fail with EINTR, to relieve application code from the burden of doing so.
By system calls, we mean the functions exposed by the standard C library pertaining to I/O or handling of other system resources.
Basically, the system will catch and retry for you a piece of code that failed with EINTR so you don't have to handle it anymore. If you are targeting an older release, the while True loop still is the way to go. Note however that if you are using Python 3.3 or 3.4, you can catch the dedicated exception InterruptedError instead of catching IOError and checking for EINTR.
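On 3.3/3.4 the retry helper therefore simplifies to something like this (a sketch along the lines of the earlier answer):

```python
def my_queue_get(queue, block=True, timeout=None):
    # Python 3.3/3.4: interrupted system calls raise the dedicated
    # InterruptedError, so there is no need to inspect errno by hand.
    # On 3.5+ this loop is unnecessary: the interpreter retries
    # interrupted calls itself (PEP 475).
    while True:
        try:
            return queue.get(block, timeout)
        except InterruptedError:
            continue
```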

Python Idle and KeyboardInterrupts

KeyboardInterrupts in Idle work for me 90% of the time, but I was wondering why they don't always work. In Idle, if I do
import time
time.sleep(10)
and then attempt a KeyboardInterrupt using Ctrl+C, it does not interrupt the process until after sleeping for 10 seconds.
The same code and a KeyboardInterrupt via Ctrl+C works immediately in the shell.
A quick glance at the IDLE source reveals that KeyboardInterrupts have some special case handling: http://svn.python.org/view/python/tags/r267/Lib/idlelib/PyShell.py?annotate=88851
On top of that, code is actually executed in a separate process which the main IDLE gui process communicates with via RPC. You're going to get different behavior under that model - it's best to just test with the canonical interpreter (via command line, interactive, etc.)
============
Digging deeper...
The socket on the RPC server is managed in a secondary thread which is supposed to propagate a KeyboardInterrupt using a call to thread.interrupt_main() ( http://svn.python.org/view/python/tags/r267/Lib/idlelib/run.py?annotate=88851 ). Behavior is not as expected there... This posting hints that for some reason, interrupt_main doesn't provide the level of granularity that you would expect: http://bytes.com/topic/python/answers/38386-thread-interrupt_main-doesnt-seem-work
Async API functions in cPython are a little goofy (from my experience) due to how the interpreter loop is handled, so it doesn't surprise me. interrupt_main() calls PyErr_SetInterrupt() to asynchronously notify the interpreter to handle a SIGINT in the main thread. From http://docs.python.org/c-api/exceptions.html#PyErr_SetInterrupt:
This function simulates the effect of a SIGINT signal arriving — the next time PyErr_CheckSignals() is called, KeyboardInterrupt will be raised
That would require the interpreter to go through whatever number of bytecode instructions before PyErr_CheckSignals() is called again - something that probably doesn't happen during a time.sleep(). I would venture to say that's a wart of simulating a SIGINT rather than actually signaling a SIGINT.
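The difference is easy to demonstrate outside IDLE with interrupt_main (using Python 3's _thread module here; the timings are only illustrative):

```python
import threading
import time
import _thread  # this module is named `thread` on Python 2

def interrupt_soon():
    # After 0.2 s, simulate Ctrl-C in the main thread; no OS signal is sent.
    time.sleep(0.2)
    _thread.interrupt_main()

threading.Thread(target=interrupt_soon).start()

start = time.time()
caught = False
try:
    time.sleep(1.0)
except KeyboardInterrupt:
    caught = True
    # Typically reports ~1.0 s rather than 0.2 s: the simulated SIGINT only
    # sets a flag that PyErr_CheckSignals() notices once the sleep returns.
    print("interrupted after %.1f seconds" % (time.time() - start))
```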
See this article. I quote:
If you try to stop a CPython program using Control-C, the interpreter throws a KeyboardInterrupt exception.
It makes some sense, because the thread is asleep for 10 seconds and so exceptions cannot be thrown until the 10 seconds pass. However, Ctrl+C always works in the shell because there you are stopping a process, not throwing a Python KeyboardInterrupt exception.
Also, see this previously answered question.
I hope this helps!

profiling fuse-python

I am currently writing a FUSE filesystem using fuse-python. It's already doing what it should. However, after it's been mounted for a few weeks, it becomes noticeably slow, so I wanted to profile it. I know about a few points where it could be optimized, but these should not be the culprits.
However, fuse-python hangs in an infinite loop (see lines 733 and 757 of the fuse source). If I run fuse in debug mode (using the -d switch), it runs in the foreground. However, I cannot stop it with SIGINT nor with Ctrl+C (which is the same thing anyway).
I tried to use the signal module to trap the signal in the main thread. But this does not work either. Interestingly, once I shoot the process down with SIGKILL, I see the KeyboardInterrupt on stdout. Also, after a SIGKILL, the signal handler is executed as expected.
This has repercussions on profiling. As the process never terminates normally, cProfile never gets the chance to save the stats file.
Any ideas?
Python installs a handler that raises KeyboardInterrupt on SIGINT. If a non-default signal handler is detected when fuse's main is called, it will not replace the handler with its own, which normally calls fuse_session_exit and cleans up. After you've called fuse's main, the KeyboardInterrupt is swallowed by CFUNCTYPE wrappers and you never see them.
Your options are to:
Send SIGQUIT by pressing Ctrl+\, or any other terminating signal other than SIGINT. However fuse will not exit cleanly.
Install the default signal handler to SIGINT before calling fuse's main, and restore the original when you're done.
from signal import signal, SIGINT, SIG_DFL

old_handler = signal(SIGINT, SIG_DFL)
# call main
signal(SIGINT, old_handler)
I'd also highly recommend switching to an alternative binding; fuse-python is terribly messy and difficult to work with. I've had a lot of luck with fusepy, and have submitted a few patches there.
When you're able to terminate your FUSE instance without using uncaught signals, the Python profiler will be able to save the stats as per normal.

Python, Timeout of Try, Except Statement after X number of seconds?

I've been searching on this but can't seem to find an exact answer (most get into more complicated things like multithreading, etc.). I just want to do something like a try/except statement where, if the process doesn't finish within X seconds, it throws an exception.
EDIT: The reason for this is that I am using website-testing software (Selenium) with a configuration that sometimes causes it to hang. It doesn't throw an error, doesn't time out or do anything, so I have no way of catching it. I am wondering what the best way is to determine that this has occurred so I can move on in my application. So I was thinking of something like: "if this hasn't finished by X seconds... move on".
You can't do it without some sort of multithreading or multiprocessing, even if that's hidden under some layers of abstraction, unless that "process" you're running is specifically designed for asynchronicity and calls back to a known function once in a while.
If you describe what that process actually is, it will be easier to provide real solutions. I don't think you appreciate the power of Python when it comes to implementations that are succinct while being complete. This may take just a few lines of code to implement, even if using multithreading/multiprocessing.
A generic solution if you are using UNIX:
import signal
import time

# Close session
def handler(signum, frame):
    print 1
    raise Exception('Action took too much time')

signal.signal(signal.SIGALRM, handler)
signal.alarm(3)  # Set the parameter to the amount of seconds you want to wait

try:
    # RUN CODE HERE
    for i in range(0, 5):
        time.sleep(1)
except:
    print 2

signal.alarm(10)  # Resets the alarm to 10 new seconds
signal.alarm(0)   # Disables the alarm
The answer (and the question) isn't specific to the try/except statement. If you want to have an infinite loop that isn't infinite (but stops after a while), it probably shouldn't be infinite. For example, change it to use:
while time <= some_value:
Or add an extra check to the body of the loop, making it break when you want it to stop:
while True:
    ...
    if time > some_value:
        break
If that isn't possible (for example, because you don't control the loop at all), things get significantly harder. On many systems you could use signal.alarm to have a signal delivered after a period of time, and then have a signal handler for signal.SIGALRM that raises your TimeoutError exception. That may or may not work depending on what the infinite loop actually does; if it blocks signals, or catches and ignores exceptions, or in any other way interferes with the signal handler, that won't work. Another possibility would be to not do the loop in the current process, but in a separate one; you can then terminate the separate process at your whim. It won't be easy to do anything but terminate it, though, so cleanup or recovery of partial work is very hard. (Threads aren't going to work at all, because there's no way to interrupt the separate thread doing the infinite loop.)
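The separate-process approach from the last paragraph might look something like this (a minimal sketch; run_with_timeout and its behavior on timeout are my own choices, not from the answer):

```python
import multiprocessing

def run_with_timeout(target, seconds, *args):
    # Run target(*args) in a child process, and kill the child if it overruns.
    # As noted above, the child gets no chance to clean up or recover partial
    # work, and a terminated process returns no results to the parent.
    proc = multiprocessing.Process(target=target, args=args)
    proc.start()
    proc.join(seconds)
    if proc.is_alive():
        proc.terminate()
        proc.join()
        raise Exception("timed out after %s seconds" % seconds)
```

So run_with_timeout(some_loop, 5) either returns once the loop finishes or raises when the five seconds are up.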
