App Engine: uncaught application failure - python

On all versions of the app I started getting error "uncaught application failure". I'm on python27. Errors began to appear suddenly, last app deployment was ~5 hours ago. From time to time I get the expected result but mostly I see the error. Nothing useful in the logs. Any suggestions?
Response body:
<html><head><title>s~lawinsidercontracts : uncaught application failure</title><body><pre>
<br></pre></body></html>
In logs this is looks like:
2013-12-20 22:22:54.987
This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application.
W 2013-12-20 22:22:54.987
A problem was encountered with the process that handled this request, causing it to exit. This is likely to cause a new process to be used for the next request to your application. (Error code 121)
UPDATE: Now I see this issue only with default version. Any of the "default versions" gets this error. Different source code does not make any difference. Temporary solved by splitting 99% of traffic to non-default version. (it's works!)
Please, note: it does not seem to be a bug in my code, I tried completely different sources, it's seems more like internal error in the instance(s) and I would like to get feedback from GAE team.

3 days without errors. Apparently the problem resolved itself.

Related

Google App Engine keeps deploying a new instance, or none at all (server error)

I built and deployed an app on GAE. Yesterday all seemed to be working fine, sending requests every few seconds to the app would be successful with a response time of about 2.5 seconds. Today GAE keeps deploying a new instance for every request, or fails to create even one, resulting in unacceptably high response times (and much higher charges) or even 500 server errors.
I tried to suspend and restart the app a few times, works again for a couple of requests, then reverts to the same behavior. On the console I can see that a new instance is immediately shut down after serving a request, or in case of server error, that GAE was unable to deploy a new instance.
I checked the quotas on the console, nothing seems to hint that I cannot send multiple requests from the same IP.
Has anyone experienced such issues, and if yes, what could be the cause(s) and remedies? Please note, I am very new to GAE so have no further clue right now on where to start.
EDIT: Just realized the average memory used by an instance (F2 in my case, which gives you 256MB) is very close to the max (250MB). Could it be the issue? I will upgrade to F4 (512MB) and see what happens.
As per the documentation - a new instance may be created based on request rate, response latencies, and other application metrics.
Therefore, it’s expected behaviour for the GAE Standard instances to scale up and down depending on the traffic they receive.
Also, if the maximum memory usage for the instance class is reached, a shutdown process will be triggered as explained here.
As for the failures to create a new instance, it’s hard to tell what may be causing it without the Stackdriver Logging information. At the top of my head, you may receive HTTP 500 errors due to having reached the response limit, but it could indeed happen for any other reason as well.
Finally, taking into account the nature of the issues, I think it’s a good idea testing the GAE app’s behaviour using a better instance class and comparing the results. If you no longer experience this using an F4 instance class, it’s safe to assume that the previous instance class was simply not enough to satisfy the app’s requirements.

How AppEngine instances work on the local server

Newbie on appengine and I really don't know how to phrase the question which sadly results in me not knowing what keywords to google and I hope that i really do get help other than the bashing that a lot of people do.
I'm confused between the behavior of appengine online and the appengine on the local server.
Background info:
Btw this is in Python
Initially i assumed that , when needed or as authored
an instance of the app or module will be created.
And that instance will be the one serving multiple requests from different clients.
In this behavior any initialization code will only be run once.
But in the local development server.
Every time i add something new, specially in the main.py,
the server is able to catch the new changes,
then on browser-refresh be able to run it.
This made me think, wait...
Does it run the entire script over and over again
on every request?
Question:
Does an instance/module run the entire code on every request or is this just an added behavior to the dev server to make development easier?
Both your assumptions - about behaviour in production and development - are wrong.
In production, GAE spins up instances as required. This may be in response to increased load, or the host may simply decide after a certain amount of time to recycle an instance by killing it and starting a new one. Initialization code will always be run whenever a new instance is started.
In development, you only get a single instance. However, the server watches your file system for changes. If it detects a change to the code itself, it will restart itself, and therefore re-run the initialization code. But if you don't make any code changes between requests, the existing process continues indefinitely, and init code will not be re-run.

pymongo: "OperationFailure: database error: error querying server"

We're occasionally getting the following error when doing queries:
OperationFailure: database error: error querying server
There is no specific query causing this, and when repeating the process things work. Has anybody else seen this error?
Our setup is a cluster of Ubuntu VMs on Amazon EC2, we're using Python 2.7.3 and pymongo v2.3. We're also using Mongoengine, however we still get this exception from non-Mongoengine code.
To those discovering this question:
We were never able to fully diagnose the problem with this, our hunch is that the database connection tends to fail every once in a while for whatever reason. From our research into distributed computing, this is a common problem and needs to be handled explicitly.
In the end, we adapted our system to become robust to DB connection failures by catching OperationFailure exceptions along with similar ones and re-establishing the database connection. This resolved the problem along with a number of similar ones we were having.
Seems the query failed on the server - to diagnose you'd need to check the server logs.

ApplicationError2 and ApplicationError5 when communicating with external api from AppEngine

I have built an application on google app engine, in python27 to connect with another services API and in general everything works smoothly. Every now and then I get one of the following two errors
(<class 'google.appengine.api.remote_socket._remote_socket.error'>, error('An error occured while connecting to the server: ApplicationError: 2 ',), <traceback object at 0x11949c10>)
(<class 'httplib.HTTPException'>, HTTPException('ApplicationError: 5 ',), <traceback object at 0x113a5850>)
The first of these errors (ApplicationError: 2) I interpret to be an error occurring on the part of the servers with which I am communicating, however I've not been able to find any detail on this and if there is any way I am responsible / can fix it.
The second of these errors (ApplicationError: 5) I've found some detail on and it suggests that the server took too long to communicate with my application - however I've set the timeout to be 20s and it fails considerably quicker than that.
If anyone could offer links or insight into the errors - specifically what causes the error and what can be done to fix it I'd very much appreciate it.
You get to start using the word "idempotent" in casual conversations and curses :)
The only thing you can do is to try the call again, and accept the fact that your initial call may have gone through, only to time out on the response - i.e. if the call actually did something (create a customer order for example), after the timeout error you might have to check if the first request succeed so you don't end up with multiple copies of the same order.
Hope that makes sense. FWIW we work with some unfriendly API's and for us, about 80% of our code is dealing with exactly this sort of !##$%.

How to Disable Django / mod_WSGI Page Caching

I have Django running in Apache via mod_wsgi. I believe Django is caching my pages server-side, which is causing some of the functionality to not work correctly.
I have a countdown timer that works by getting the current server time, determining the remaining countdown time, and outputting that number to the HTML template. A javascript countdown timer then takes over and runs the countdown for the user.
The problem arises when the user refreshes the page, or navigates to a different page with the countdown timer. The timer appears to jump around to different times sporadically, usually going back to the same time over and over again on each refresh.
Using HTTPFox, the page is not being loaded from my browser cache, so it looks like either Django or Apache is caching the page. Is there any way to disable this functionality? I'm not going to have enough traffic to worry about caching the script output. Or am I completely wrong about why this is happening?
[Edit] From the posts below, it looks like caching is disabled in Django, which means it must be happening elsewhere, perhaps in Apache?
[Edit] I have a more thorough description of what is happening: For the first 7 (or so) requests made to the server, the pages are rendered by the script and returned, although each of those 7 pages seems to be cached as it shows up later. On the 8th request, the server serves up the first page. On the 9th request, it serves up the second page, and so on in a cycle. This lasts until I restart apache, when the process starts over again.
[Edit] I have configured mod_wsgi to run only one process at a time, which causes the timer to reset to the same value in every case. Interestingly though, there's another component on my page that displays a random image on each request, using order('?'), and that does refresh with different images each time, which would indicate the caching is happening in Django and not in Apache.
[Edit] In light of the previous edit, I went back and reviewed the relevant views.py file, finding that the countdown start variable was being set globally in the module, outside of the view functions. Moving that setting inside the view functions resolved the problem. So it turned out not to be a caching issue after all. Thanks everyone for your help on this.
From my experience with mod_wsgi in Apache, it is highly unlikely that they are causing caching. A couple of things to try:
It is possible that you have some proxy server between your computer and the web server that is appropriately or inappropriately caching pages. Sometimes ISPs run proxy servers to reduce bandwidth outside their network. Can you please provide the HTTP headers for a page that is getting cached (Firebug can give these to you). Headers that I would specifically be interested in include Cache-Control, Expires, Last-Modified, and ETag.
Can you post your MIDDLEWARE_CLASSES from your settings.py file. It possible that you have a Middleware that performs caching for you.
Can you grep your code for the following items "load cache", "django.core.cache", and "cache_page". A *grep -R "search" ** will work.
Does the settings.py (or anything it imports like "from localsettings import *") include CACHE_BACKEND?
What happens when you restart apache? (e.g. sudo services apache restart). If a restart clears the issue, then it might be apache doing caching (it is possible that this could also clear out a locmen Django cache backend)
Did you specifically setup Django caching? From the docs it seems you would clearly know if Django was caching as it requires work beforehand to get it working. Specifically, you need to define where the cached files are saved.
http://docs.djangoproject.com/en/dev/topics/cache/
Are you using a multiprocess configuration for Apache/mod_wsgi? If you are, that will account for why different responses can have a different value for the timer as likely that when timer is initialised will be different for each process handling requests. Thus why it can jump around.
Have a read of:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
Work out in what mode or configuration you are running Apache/mod_wsgi and perhaps post what that configuration is. Without knowing, there are too many unknowns.
I just came across this:
Support for Automatic Reloading To help deployment tools you can
activate support for automatic reloading. Whenever something changes
the .wsgi file, mod_wsgi will reload all the daemon processes for us.
For that, just add the following directive to your Directory section:
WSGIScriptReloading On

Categories