In the app I'm currently working on (Python 2.7 runtime), I'm trying to make sure that exceptions raised at the API level (i.e. not in my code) are handled correctly within my application. However, it appears that Google/App Engine handles those exceptions internally and doesn't bubble them up. For instance, using Thing, a previously defined ndb.Model:
t = Thing(id=1, name='thingy')
try:
    t.put()
except Exception as e:
    self.log(e)
    self.abort(500)
In the unlikely event that something goes awry with the put(), I have no way to catch/log that event -- or is there?
A similar thing happens when storing data to the blobstore, where exceptions are apparently caught and re-raised internally, leaving me no chance to log them.
Perhaps I'm missing a key point? I've looked through the API docs, but the exceptions raised by services, and how to catch them, don't seem to be a priority for the documentation team.
Actually App Engine logs every single request. Just go to the application's dashboard and click on Logs.
If you want to log something on your own, you should use the logging module; you can read more about it in the documentation.
So instead of self.log you should use logging.error.
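Applied to the snippet from the question, that would look something like this (a minimal sketch; ThingHandler is just a placeholder name):
import logging
import webapp2

class ThingHandler(webapp2.RequestHandler):
    def post(self):
        t = Thing(id=1, name='thingy')
        try:
            t.put()
        except Exception:
            # logging.exception logs at ERROR level and includes the
            # traceback, so it shows up in the dashboard's Logs page.
            logging.exception('put() failed')
            self.abort(500)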
Related
Step Functions are AWS structures that control the flow of lambdas (or other events). All my lambdas use Python (but Lambdas can use most major languages). Throughout the process my step function sends status updates back to the client (the client triggered it via API). Let's say it progresses through these updates: Started -> In Progress -> Finishing -> Done. For handled errors it will send an 'Error' status back to the client. So the client could see a timeline like this: Started -> In Progress -> Errored. This is ideal - so the user knows the process has stopped.
But when there are unexpected/unhandled errors, the client never really knows, and the timeline might sit at 'In Progress' indefinitely - the user doesn't know what happened. So I started looking into the built-in Step Function error handling. I like this option because I can create a 'Catch' function for each lambda or event where I can communicate back to the client if there is an error. The downside was that it made the step function template/design really messy; see the before/after screenshots below.
[BEFORE screenshot]
[AFTER screenshot]
The template code that generates these graphs doesn't look much better. So I considered an alternative which seems similarly messy. I could add a single try/except block within each lambda for the entire lambda - to catch any/all errors. For example:
def lambda_handler(event, context):
    try:
        # Execute function tasks
        ...
    except:
        # Communicate back to client that there was an error
        ...
Similar to the step function 'Catch' functions this would ensure that I catch and communicate any error. But this seems like a bad idea just because of what it is (adding blanket/blind try/except).
So right now I'm stuck between messy/repeated code and try/except-ing everything. Am I implementing step function 'Catch' incorrectly? Am I missing a better way to handle unknown Python errors? Is there another approach entirely?
As #stijndepestel pointed out, having a catch-all error check is a good idea.
What I do in my Python Lambda functions is this: I have a custom router class which, besides managing routes, handles all errors. If an error inherits from a base error class that I've created, then it's a custom error that I raised deliberately; those are assigned special info at creation time that is automatically formatted when they're converted to strings. The router sends that back to the client if possible.
But if the error is some unknown/unexpected one, the router prints it with as much detail as possible to CloudWatch Logs and returns a generic "500 Internal Server Error" message to the client.
I'd probably set it up in the future to notify me by email or something like that when such errors occur, so that I can take action quickly.
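A minimal sketch of that pattern (AppError and dispatch are placeholder names, not my actual router):
import logging
import traceback

class AppError(Exception):
    # Base class for errors raised on purpose; carries client-safe info.
    def __init__(self, status, message):
        super(AppError, self).__init__(message)
        self.status = status
        self.message = message

def dispatch(route, event):
    try:
        return route(event)
    except AppError as e:
        # A known error we raised ourselves: its info is safe to return.
        return {'statusCode': e.status, 'body': e.message}
    except Exception:
        # Unknown/unexpected: log full detail (Lambda output goes to
        # CloudWatch Logs) and return a generic message to the client.
        logging.error(traceback.format_exc())
        return {'statusCode': 500, 'body': 'Internal Server Error'}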
I don't see why having a try-catch system for the entirety of your lambda is such a bad idea. It just ensures that you're always in control of how errors are communicated to the caller of the lambda function.
Imagine, for example, a lambda that serves as the back-end for an HTTP API: it would be better practice to have a try-catch around everything, so you can tell your clients what the problem was, or at least return a generic HTTP 500 error. In this case the functions will be called by AWS Step Functions, which means your error messages don't have to be user-friendly, but the fact that you might want to be in control of how unexpected exceptions are handled is still the same in my book.
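For the Step Functions case that could look roughly like this (do_work and send_status are hypothetical placeholders for the task logic and the status callback):
def lambda_handler(event, context):
    try:
        result = do_work(event)       # the lambda's real tasks
        send_status(event, 'Done')
        return result
    except Exception as e:
        # Tell the client the process has stopped...
        send_status(event, 'Errored: %s' % e)
        # ...then re-raise so Step Functions still sees the failure.
        raise
Re-raising keeps the state machine's own failure handling intact while still making sure the client is informed.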
I have a small server script mirroring IoT traffic and handling several kinds of packets.
In there, I have stuff in queues, and pulling from them is arranged as follows:
def pull_from(service, ID):
    with service.LOCK_A:
        if ID not in service.queues:
            service.queues[ID] = queue.Queue(35)
        return service.queues[ID].get(timeout=2.5)
Here the timeout expires after, say, 2.5 seconds and then raises queue.Empty, releasing the lock. The exception is caught downstream.
Previously I have avoided constructs like this. Is it considered "sound design", or is releasing a lock by letting an exception escape the with block a hack that should be avoided?
Yes, it’s fine. with statements always run __exit__ on exceptions, by design, like a try…finally.
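A quick toy check (plain threading, nothing framework specific):
import threading

lock = threading.Lock()

def f():
    with lock:
        raise RuntimeError('boom')  # exception escapes the with block

try:
    f()
except RuntimeError:
    pass

# Prints True: __exit__ released the lock even though f() raised.
print(lock.acquire(False))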
I've been using GAE for more than a year now, and one of the most difficult things for me to deal with is the fact that my otherwise well written code may occasionally raise an exception because of a GAE hiccup.
I already have a decent procedure for unhandled exceptions. My custom request handler presents a nice error page and administrators get an email. This, however, is a bad user experience.
What I want to do is to handle exceptions so I can immediately take the appropriate action and prevent some generic error page.
My questions are:
What exceptions should I catch?
Where should I catch them?
I realize that a full answer to this is not practical, but I'm looking for some best practices for the most common situations.
For exceptions that I should catch, I sometimes see exceptions that are not on the official list. For example, I've received an UnknownError.
For where to catch exceptions, I wonder if I should catch them in each get/post method. Something like this:
def get(self):
    try:
        # normal get processing
        ...
    except SomeException:
        # redirect to the same page to try again and fix any data if necessary
        ...
I'm surprised there is not more information out there about this, as it's an important aspect of any GAE app. There are some good articles here and here, but they don't answer my questions.
What exceptions should I catch?
That depends on what level of error catching you're going for. In my experience, catching the errors in the official list and in the linked articles will give you a very high level of coverage. If you need to go beyond that, putting in a generic except is easier than trying to predict unknown errors.
Where should I catch them?
The most likely place for GAE errors is interaction with the datastore, so if you haven't already, putting try-except blocks around those calls will give you a good return on effort for handling GAE-side errors.
Besides the advice of your linked articles you can also think about putting the failed operations into a task queue. Each task will automatically retry 5 times before failing which can give you some ability to ride out datastore switches or other service interruptions if you don't need immediate feedback from the operation.
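A sketch of that idea using the deferred library (save_thing and its arguments are hypothetical):
from google.appengine.ext import deferred

def save_thing(thing_id, name):
    # Some datastore write that might hit a transient error.
    Thing(id=thing_id, name=name).put()

try:
    save_thing(1, 'thingy')
except Exception:
    # Hand the write to the task queue; failed tasks are retried
    # automatically with backoff, which can ride out a datastore hiccup.
    deferred.defer(save_thing, 1, 'thingy')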
Using Google App Engine, Python 2.7, threadsafe:true, webapp2.
I would like to include all logging.XXX() messages in my API responses, so I need an efficient way to collect up all the log messages that occur during the scope of a request. I also want to operate in threadsafe:true, so I need to be careful to get only the right log messages.
Currently, my strategy is to add a logging.Handler at the start of my webapp2 dispatch method, and then remove it at the end. To collect logs only for my thread, I instantiate the logging.Handler with the name of the current thread; the handler will simply throw out log records that are from a different thread. I am using thread name and not thread ID because I was getting some unexpected results on dev_appserver when using the ID.
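In code, the strategy looks roughly like this (RequestLogHandler is a name I made up; the filtering relies on the threadName attribute that the logging module stamps on every record):
import logging
import threading

class RequestLogHandler(logging.Handler):
    # Buffers records emitted by one thread, identified by its name.
    def __init__(self, thread_name):
        logging.Handler.__init__(self)
        self.thread_name = thread_name
        self.records = []

    def emit(self, record):
        # Drop records that came from a different thread.
        if record.threadName == self.thread_name:
            self.records.append(self.format(record))

# In webapp2's dispatch():
handler = RequestLogHandler(threading.current_thread().name)
logging.getLogger().addHandler(handler)
try:
    pass  # ... normal request handling ...
finally:
    logging.getLogger().removeHandler(handler)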
Questions:
Is it efficient to constantly be adding/removing logging.Handler objects in this fashion? I.e., every request will add, then remove, a Handler. Is this "cheap"?
Is this the best way to get only the logging messages for my request? My big assumption is that each request gets its own thread, and that thread name will actually select the right items.
Am I fundamentally misunderstanding Python logging? Perhaps I should only have a single additional Handler added once at the "module-level" statically, and my dispatch should do something lighter.
Any advice is appreciated. I don't have a good understanding of what Python (and specifically App Engine Python) does under the hood with respect to logging. Obviously, this is eminently possible because the App Engine Log Viewer does exactly the same thing: it displays all the log messages for that request. In fact, if I could piggyback on that somehow, that would be even better. It absolutely needs to be super-cheap though - i.e., an RPC call is not going to cut it.
I can add some code if that will help.
I found lots of goodness here:
from google.appengine.api import logservice

# Parse the log lines buffered for the current request.
entries = logservice.logs_buffer().parse_logs()
Is there a way to create a middleware that will catch every raised exception and print the stack trace both to the log and to stdout (possibly with some additional information) in the Pylons framework?
Standard paste.exceptions.errormiddleware.ErrorMiddleware already does this, and even a little more.
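Wiring it up is a one-liner around the WSGI app (a minimal sketch; in a Pylons project this normally happens in config/middleware.py):
from paste.exceptions.errormiddleware import ErrorMiddleware

def make_app(wsgi_app):
    # debug=True also renders the traceback in the browser; either way
    # the traceback is written to the error log.
    return ErrorMiddleware(wsgi_app, debug=True)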