I've been using GAE for more than a year now, and one of the most difficult things to deal with is the fact that my otherwise well-written code may occasionally raise an exception because of a GAE hiccup.
I already have a decent procedure for unhandled exceptions: my custom request handler presents a nice error page and administrators get an email. Falling back to it, however, is a bad user experience.
What I want to do is handle exceptions as they occur, so I can immediately take the appropriate action and avoid the generic error page.
My questions are:
What exceptions should I catch?
Where should I catch them?
I realize that a full answer to this is not practical, but I'm looking for some best practices for the most common situations.
For exceptions that I should catch, I sometimes see exceptions that are not on the official list. For example, I've received an UnknownError.
For where to catch exceptions, I wonder if I should catch them in each get/post method. Something like this:
def get(self):
    try:
        # normal get processing
        ...
    except SomeException:
        # redirect to the same page to try again and fix any data if necessary
        ...
I'm surprised there is not more information out there about this as this is an important aspect of any GAE app. There are some good articles here and here, but these don't answer my questions.
What exceptions should I catch?
That depends on what level of error catching you're going for. In my experience, catching the errors in the official list and the linked articles will cover the vast majority of cases. If you need to go above and beyond that, a generic except clause is easier than trying to predict unknown errors.
Where should I catch them?
The most likely place for GAE errors is wherever you interact with the datastore, so putting try-except blocks around those calls, if you haven't already, will give you the best return on effort for handling GAE-specific issues.
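For example, something along these lines around a put() (a sketch only; the exact exception classes worth catching depend on your SDK version, so check them against the official list):

from google.appengine.api import datastore_errors
from google.appengine.runtime import apiproxy_errors

def safe_put(entity):
    # Wrap the datastore call and translate GAE hiccups into a value the
    # handler can act on (retry, show a "try again" page, and so on).
    try:
        entity.put()
        return True
    except (datastore_errors.Timeout,
            datastore_errors.TransactionFailedError,
            apiproxy_errors.CapabilityDisabledError):
        return False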
Besides the advice in the linked articles, you can also think about putting the failed operations into a task queue. Each task will automatically retry 5 times before failing, which can give you some ability to ride out datastore switches or other service interruptions if you don't need immediate feedback from the operation.
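A rough sketch of that approach with the deferred library (apply_update and its arguments are invented for illustration; keep the work function idempotent, since it may run more than once):

from google.appengine.ext import deferred

def apply_update(key, new_name):
    # Invented work function; runs later on a task queue and is retried
    # automatically if it raises.
    entity = key.get()
    entity.name = new_name
    entity.put()

# Enqueue the write instead of doing it inline in the request handler;
# the retries can ride out short datastore interruptions.
deferred.defer(apply_update, some_key, 'new name')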
Related
Step Functions are AWS structures that control the flow of lambdas (or other events). All my lambdas use Python (but Lambdas can use most major languages). Throughout the process my step function sends status updates back to the client (the client triggered it via API). Let's say it progresses through these updates: Started -> In Progress -> Finishing -> Done. For handled errors it will send an 'Error' status back to the client. So the client could see a timeline like this: Started -> In Progress -> Errored. This is ideal - so the user knows the process has stopped.
But when there are unexpected/unhandled errors the client never really knows, and the timeline might sit at 'In Progress' indefinitely - the user doesn't know what happened. So I started looking into the built-in Step Function error handling. I like this option because I can create a 'Catch' function for each lambda or event, where I can communicate back to the client if there is an error. The downside is that it really made the step function template/design messy; see the before/after screenshots below.
BEFORE --------------- [screenshot of the original step function graph]
AFTER --------------- [screenshot of the step function graph with a Catch branch added for each lambda]
The template code that generates these graphs doesn't look much better. So I considered an alternative that seems similarly messy: I could add a single try/except block within each lambda, wrapping the entire handler, to catch any/all errors. For example:
def lambda_handler(event, context):
    try:
        # Execute function tasks
        ...
    except:
        # Communicate back to client that there was an error
        ...
Similar to the step function 'Catch' functions this would ensure that I catch and communicate any error. But this seems like a bad idea just because of what it is (adding blanket/blind try/except).
So right now I'm stuck between messy/repeated code and try/except-ing everything. Am I implementing step function 'Catch' incorrectly? Am I missing a better way to handle unknown Python errors? Is there another approach entirely?
As #stijndepestel pointed out, having a catch-all error check is a good idea.
What I do in my Python Lambda functions is this: I have a custom router class which, besides managing routes, handles all errors. If the error inherits from a base error class that I've created, then it's a custom error that I threw deliberately; those errors are given extra information when I create them, which is automatically formatted when they are converted to strings. The router sends that back to the client if possible.
But if the error is some unknown/unexpected one, then the router prints it with as much detail as possible to CloudWatch Logs, and then returns a generic "500 Internal Server Error" message to the client.
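A stripped-down sketch of the shape of it (AppError and route are simplified stand-ins here, not my real classes):

import logging

class AppError(Exception):
    # Base class for errors raised on purpose; carries a status code and a
    # message that is safe to show to the client.
    def __init__(self, message, status=400):
        super(AppError, self).__init__(message)
        self.status = status

def route(event):
    # Stand-in for the real dispatch code.
    raise AppError("no handler for this event", status=400)

def lambda_handler(event, context):
    try:
        return route(event)
    except AppError as err:
        # A known, deliberately raised error: report it to the client.
        return {"status": err.status, "error": str(err)}
    except Exception:
        # An unknown error: full details go to CloudWatch Logs, and the
        # client only gets a generic message.
        logging.exception("unhandled error")
        return {"status": 500, "error": "Internal Server Error"}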
I'd probably set it up in the future to notify me by email or something like that when such errors occur, so that I can take action quickly.
I don't see why having a try-catch system for the entirety of your lambda is such a bad idea. It just ensures that you're always in control of how errors are communicated to the caller of the lambda function.
Imagine, for example, a lambda that serves as the back end for an HTTP API: it would be better practice to have a try-catch around everything, so you can tell your clients what the problem was, or at least return a generic HTTP 500 error. In this case the functions will be called by AWS Step Functions, which means your error messages don't have to be user friendly, but the case for staying in control of how unexpected exceptions are handled is still the same in my book.
I'm implementing a function that is supposed to return a deferred. Inside that function I decide that an error happened.
I could just raise the error:
raise ValueError("...")
but then the function is not returning a deferred anymore, and I don't want to have to use maybeDeferred every time I call it.
Alternatively, I could return a deferred like this:
return defer.fail(Failure(ValueError(...)))
This works, but the Failure won't include a stacktrace, making it hard to track the error down.
The best thing I found so far is this:
try:
    raise ValueError("...")
except:
    return defer.fail()
I get a deferred back and the Failure contains the stacktrace. But this is rather verbose.
Is there a better way that I'm missing?
I'm a bit surprised that such a common thing doesn't have an elegant solution.
This question surprised me a little, and I had to think for a while about why.
If all you want is for the failure with traceback to bubble up and get logged by the global error handler, you needn't return a defer.fail(). Raising the exception will get that behavior.
The difference would be in a situation like this:
foo().addErrback(fooErrorHandler)
in that case, fooErrorHandler would get called when the result of the deferred was a failure but not when it was an exception raised synchronously. That would require this more cumbersome form:
try:
    foo().addErrback(fooErrorHandler)
except Exception as err:
    fooErrorHandler(Failure(err))
which admittedly looks pretty bad. But the situation we're talking about is now a case where
you define an explicit error handler for this call
the error handler does something with the failure's traceback
the function can fail synchronously or asynchronously
you use the same error handler for both the synchronous and asynchronous failure modes.
which is maybe why it hasn't come up as such a common thing as you might think.
Of course, the other reason it might not have come up as a common thing is that people are often lazy about defining error handlers, and often lazy about documenting the sorts of exceptions their code may raise.
Ah-hah, something else that may have added to my confusion: Failure does know how to store its stack, but it was explicitly changed not to do this unless there's a traceback, for performance reasons. The method described by that commit message for getting a traceback is the same as the four-line try/except example in your post.
I guess that could be an option to Failure() (as captureVars is), or an alternate constructor method, if this does come up enough to warrant it.
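In the meantime, a small helper hides the verbose dance (fail_with is just a name I'm making up):

from twisted.internet import defer

def fail_with(exception):
    # Raise and immediately catch the exception so the resulting Failure
    # carries a real traceback, then wrap it in an already-failed Deferred.
    try:
        raise exception
    except:
        return defer.fail()

# Usage inside a function that must always return a Deferred:
#     return fail_with(ValueError("..."))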
I have a python CGI script that takes several query strings as arguments.
The query strings are generated by another script, so there is little chance of getting illegal arguments, unless some "naughty" user changes them intentionally.
An illegal argument may throw an exception (e.g. the int() function receiving non-numerical input), but does it make sense to write code to catch such rare errors? Is there any security risk or performance penalty if they're not caught?
I know the page may end up ugly if exceptions are not handled nicely, but a naughty user deserves it, right?
Any unhandled exception causes the program to terminate. That means that if your program is doing something when an exception occurs, it will shut down in an unclean fashion without releasing resources.
Anyway, CGI is obsolete; use Django, Flask, web2py, or something similar.
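If you do decide to catch them anyway, a minimal sketch (the "page" parameter is only an example):

import cgi

form = cgi.FieldStorage()
print("Content-Type: text/html")
print("")

try:
    page = int(form.getfirst("page", "1"))
except (TypeError, ValueError):
    # A tampered or malformed value: fail cleanly instead of dumping a traceback.
    print("<p>Invalid request.</p>")
else:
    print("<p>Showing page %d</p>" % page)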
In the app I'm currently working on (2.7 runtime), I'm trying to make sure that exceptions at the API level (i.e. not in my code) are handled correctly within my application. However, it appears that Google/App Engine handles those exceptions internally and doesn't bubble them up. For instance, using Thing, a previously defined ndb.Model:
t = Thing(id=1, name='thingy')
try:
    t.put()
except Exception as e:
    self.log(e)
    self.abort(500)
In the unlikely event that something goes awry with the put(), I have no way to catch/log that event. Or is there?
A similar thing happens when storing data to the blobstore, where exceptions are apparently caught and handled internally, leaving no chance for me to log them.
Perhaps I'm missing a key point? I've looked through the API docs, but the exceptions raised by services, and how to catch them, don't seem to be a priority for the documentation team.
Actually App Engine logs every single request. Just go to the application's dashboard and click on Logs.
If you want to log something on your own, you should use the logging module; you can read more about it in the documentation.
So instead of self.log you should use logging.error.
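For example, applied to the snippet in your question (just a sketch; whether the exception actually reaches this handler is a separate question):

import logging

try:
    t.put()
except Exception as e:
    # logging.error writes to the request log you see in the dashboard
    logging.error('put() failed for %r: %s', t, e)
    self.abort(500)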
I've been doing amateur coding in Python for a while now and feel quite comfortable with it. Recently though I've been writing my first Daemon and am trying to come to terms with how my programs should flow.
With my past programs, exceptions could be handled by simply aborting the program, perhaps after some minor cleaning up. The only consideration I had to give to program structure was the effective handling of non-exception input. In effect, "Garbage In, Nothing Out".
In my Daemon, there is an outside loop that effectively never ends and a sleep statement within it to control the interval at which things happen. Processing of valid input data is easy but I'm struggling to understand the best practice for dealing with exceptions. Sometimes the exception may occur within several levels of nested functions and each needs to return something to its parent, which must, in turn, return something to its parent until control returns to the outer-most loop. Each function must be capable of handling any exception condition, not only for itself but also for all its subordinates.
I apologise for the vagueness of my question but I'm wondering if anyone could offer me some general pointers into how these exceptions should be handled. Should I be looking at spawning sub-processes that can be terminated without impact to the parent? A (remote) possibility is that I'm doing things correctly and actually do need all that nested handling. Another very real possibility is that I haven't got a clue what I'm talking about. :)
Steve
Exceptions are designed for the purpose of (potentially) not being caught immediately; that's how they differ from a function returning a value that means "error". Each exception can be caught at the level where you want to (and can) do something about it.
At a minimum, you could start by catching all exceptions at the main loop and logging a message. This is simple and ensures that your daemon won't die. At the main loop it's probably too late to fix most problems, so you can catch specific exceptions sooner. E.g. if a file has the wrong format, catch the exception in the routine that opens and tries to use the file, not deep in the parsing code where the problem is discovered; perhaps you can try another format. Basically if there's a place where you could recover from a particular error condition, catch it there and do so.
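A bare-bones sketch of that outer layer (do_scheduled_work and INTERVAL stand in for your own code):

import logging
import time

INTERVAL = 60  # placeholder polling interval in seconds

def do_scheduled_work():
    # Placeholder for the daemon's real work.
    pass

def main_loop():
    while True:
        try:
            do_scheduled_work()
        except Exception:
            # Last line of defence: log the full traceback and keep the daemon alive.
            logging.exception("unhandled error in main loop")
        time.sleep(INTERVAL)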
The answer will be "it depends".
If an exception occurs in some low-level function, it may be appropriate to catch it there if enough information is available at that level to let the function complete successfully in spite of the exception. E.g. when reading triangles from an .stl file, the normal vector of a triangle is both given explicitly and implied by the order of the three points that make up the triangle. So if the normal vector is given as (0,0,0), a zero-length vector that should trigger an exception in the constructor of a Normal vector class, the exception can safely be caught in the constructor of a Triangle class, because the normal can still be calculated by other means.
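A sketch of that example (the class and function names are invented for illustration):

class Normal(object):
    def __init__(self, x, y, z):
        if x == y == z == 0:
            raise ValueError("zero-length normal vector")
        self.x, self.y, self.z = x, y, z

def normal_from_points(p1, p2, p3):
    # The cross product of two edge vectors gives the implicit normal.
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    return Normal(uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)

class Triangle(object):
    def __init__(self, p1, p2, p3, stated_normal):
        self.points = (p1, p2, p3)
        try:
            self.normal = Normal(*stated_normal)
        except ValueError:
            # Enough information is available right here to recover:
            # recompute the normal from the three vertices instead.
            self.normal = normal_from_points(p1, p2, p3)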
If there is not enough information available to handle an exception, it should trickle upwards to a level where it can be handled. E.g. if you are writing a module to read and interpret a file format, it should raise an exception if the file it was given doesn't match the file format. In this case it is probably the top level of the program using that module that should handle the exception and communicate with the user. (Or in case of a daemon, log the error and carry on.)