DeadlineExceededError with Python Django on App Engine

I have a task running which uses django.test.client to render pages:
from django.test.client import Client
client = Client()
resp = client.get(path, HTTP_HOST=settings.STATIC_SITE_DOMAIN, secure=True)
This successfully loops through pages and renders them to static files. However, as the site has grown and page load times have increased, it now throws an error:
Rendering path /articles/funny-video-of-the-month/
Internal Server Error: /articles/ten-tips-for-december/
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.
I don't know whether this is the App Engine task reaching its 10-minute maximum length or the individual page HTTP request hitting a 60-second limit. The documentation about the error is here:
https://cloud.google.com/appengine/articles/deadlineexceedederrors
There are other potential fixes on Stack Overflow, but they usually concern the urlfetch method, which I don't believe is being used here.
Here are some potential fixes I can think of:
1) Alter the existing timeouts on HTTP requests made via django.test.client:
https://github.com/django/django/blob/master/django/test/client.py
https://github.com/django/django/blob/master/django/core/handlers/wsgi.py
2) Split static generation into separate tasks so that no single task runs longer than 10 minutes (a rough sketch of this is shown below, after the list)
3) Try manual scaling option
https://cloud.google.com/appengine/docs/python/modules/#Python_Instance_scaling_and_class
Any suggestions on how to fix would be greatly appreciated.
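
For option 2, a rough sketch of what splitting the work into smaller task-queue tasks might look like is below; the handler URL, queue name and batch size are hypothetical, not taken from my actual app:

from google.appengine.api import taskqueue

BATCH_SIZE = 50  # hypothetical: tune so one batch renders well within the 10-minute limit

def enqueue_render_tasks(paths):
    # Fan the full list of paths out into one task per batch, so no single
    # task has to render the whole site.
    for offset in range(0, len(paths), BATCH_SIZE):
        batch = paths[offset:offset + BATCH_SIZE]
        taskqueue.add(
            url='/tasks/render_batch',            # hypothetical handler that renders one batch
            params={'paths': ','.join(batch)},
            queue_name='static-render')           # hypothetical queue defined in queue.yaml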

Related

How to prevent the 230 seconds azure gateway timeout using python flask for long running work loads

I have a Python Flask application running as an Azure web app, and one function is a compute-intensive workload that takes more than 5 minutes to process. Is there any hack to prevent the gateway timeout error by keeping the TCP connection active between the client and the API while the function is processing the data? A sample of the current code is below.
from flask import Flask

app = Flask(__name__)

@app.route('/data')
def data():
    mydata = super_long_process_function()  # takes more than 5 minutes to process
    return mydata
Since super_long_process_function takes more than 5 minutes, the request always times out with a 504 Gateway Time-out. One thing I want to mention is that this is an idle timeout at the TCP level: it is hit only if the connection is idle and no data transfer is happening. So is there any hack in Flask that can be used to prevent this timeout while we process the data? Based on my research and on the Microsoft documentation, the 230-second limit cannot be changed for web apps.
In short: the 230 second timeout, as you stated, cannot be changed.
230 seconds is the maximum amount of time that a request can take without sending any data back to the response. It is not configurable.
Source: GitHub issue
The timeout occurs if there's no response; keeping the connection open and sending data will not help.
There are a couple of ways you can go about this. Here are two possible solutions you could use to trigger your long-running tasks without the timeout being an issue.
Only trigger the long-running task with an HTTP call, but don't wait for its completion before returning a response.
Trigger the task using a messaging mechanism like Storage Queues or Service Bus.
For updating the web application with the result of the long-running task, think along the lines of: having the response hold a URL the frontend can poll periodically to check for task completion; having the request carry a callback URL to call when the task has completed; or implementing Azure Web PubSub to push status updates to the client.
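
A minimal sketch of the first approach, assuming Flask and the super_long_process_function from the question (the endpoint names and the in-memory task store are illustrative only; a real app should persist results somewhere durable):

import threading
import uuid

from flask import Flask, jsonify, url_for

app = Flask(__name__)
tasks = {}  # task_id -> {"status": ..., "result": ...}; in-memory for illustration only

def run_task(task_id):
    tasks[task_id]["result"] = super_long_process_function()  # the slow work from the question
    tasks[task_id]["status"] = "done"

@app.route('/data', methods=['POST'])
def start_data():
    task_id = uuid.uuid4().hex
    tasks[task_id] = {"status": "running", "result": None}
    threading.Thread(target=run_task, args=(task_id,), daemon=True).start()
    # Respond well within the 230-second window with a URL the client can poll.
    return jsonify({"status_url": url_for('task_status', task_id=task_id)}), 202

@app.route('/data/<task_id>')
def task_status(task_id):
    return jsonify(tasks.get(task_id, {"status": "unknown"}))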

PYTHON: Google API directions layers gives OVER_QUERY_LIMIT while displaying route between multiple points after 10 requests

I am working on a project to display routes, using the Google Maps Directions API, between points A and B coming from a dataframe. I keep getting the error "[directions layer] You have sent too many directions requests. Wait until your quota replenishes". This happens only when the number of requests exceeds 10, although I have added a 1-second delay after the 10th request, with no luck. I also tried sending the requests in batches of 10 each; that didn't work either. Moreover, I've read the API documentation.
I have billing set up, and although I am on the free trial I apparently still have over $300 of credit.
Here is a simulation of my code, a very simple one, for your kind advice
import gmaps
import time
import pandas as pd

dict_k = {}
dict_k['st_coordinates'] = ['24.760669, 54.704782', '25.168596, 56.355549', '25.004274, 55.063742', '25.000252, 55.060932',
                            '24.872900, 55.137325', '24.537609, 54.383664', '24.050339, 53.470903', '24.211435, 55.424501',
                            '24.196804, 55.855923', '24.308309, 54.675861', '24.988239, 55.104435', '24.985047, 55.071542',
                            '24.306433, 54.490205', '25.000252, 55.060932', '24.536064, 54.383048']
dict_k['en_coordinates'] = ['24.454036, 54.376656', '24.130453, 55.801786', '23.931339, 53.045892', '24.171,54.408',
                            '24.454036, 54.376656', '24.130453, 55.801786', '23.931339, 53.045892', '24.171,54.408',
                            '24.454036, 54.376656', '24.130453, 55.801786', '23.931339, 53.045892', '24.171,54.408',
                            '24.454036, 54.376656', '24.130453, 55.801786', '23.931339, 53.045892']
key = 'AIzA...............'
dataframe = pd.DataFrame.from_dict(dict_k)

def draw_map(dataframe, key):
    gmaps.configure(api_key=key)
    fig = gmaps.figure()
    for i in range(len(dataframe)):
        start_point = eval(dataframe['st_coordinates'].iloc[i])  # 'lat, lng' string -> tuple
        end_point = eval(dataframe['en_coordinates'].iloc[i])
        layer = gmaps.directions_layer(start_point, end_point)
        fig.add_layer(layer)
        if i > 10:
            time.sleep(1)  # 1-second delay after the 10th request (didn't help)
    return fig

df = draw_map(dataframe, key)
df
Here's a link to the API guide. Spells it out quite clearly.
https://developers.google.com/maps/premium/previous-licenses/articles/usage-limits
You can exceed the Google Maps Platform web services usage limits by:
Sending too many requests per day.
Sending requests too fast, i.e. too many requests per second.
Sending requests too fast for too long or otherwise abusing the web service.
Exceeding other usage limits, e.g. points per request in the Elevation API.
Usage limits exceeded
If you exceed the usage limits you will get an OVER_QUERY_LIMIT status code as a response.
This means that the web service will stop providing normal responses and switch to returning only status code OVER_QUERY_LIMIT until more usage is allowed again. This can happen:
Within a few seconds, if the error was received because your application sent too many requests per second.
Within the next 24 hours, if the error was received because your application sent too many requests per day. The daily quotas are reset at midnight, Pacific Time.
Upon receiving a response with status code OVER_QUERY_LIMIT, your application should determine which usage limit has been exceeded. This can be done by pausing for 2 seconds and resending the same request. If status code is still OVER_QUERY_LIMIT, your application is sending too many requests per day. Otherwise, your application is sending too many requests per second.
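
As a rough illustration of that check (an assumption: this calls the Directions web service directly with the requests library, since the gmaps widget used in the question does not expose the response status):

import time
import requests

DIRECTIONS_URL = 'https://maps.googleapis.com/maps/api/directions/json'

def diagnose_over_query_limit(origin, destination, key):
    params = {'origin': origin, 'destination': destination, 'key': key}
    status = requests.get(DIRECTIONS_URL, params=params).json()['status']
    if status != 'OVER_QUERY_LIMIT':
        return status                      # not a quota problem (or a different error)
    time.sleep(2)                          # pause for 2 seconds and resend the same request
    retry = requests.get(DIRECTIONS_URL, params=params).json()['status']
    if retry == 'OVER_QUERY_LIMIT':
        return 'daily limit exceeded'      # still rejected: too many requests per day
    return 'per-second limit exceeded'     # retry succeeded: requests were just too fast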

openshift 3 django - request too large

I migrated a Django app from OpenShift 2 to OpenShift 3 Online. It has an upload feature that allows users to upload audio files, which are usually larger than 50 MB. In OpenShift 3, if I try to upload a file it only works for files up to around 12 MB. Anything larger leads to an error message in Firefox saying "connection canceled". Chromium gives more details:
Request Entity Too Large
The requested resource
/myApp/upload
does not allow request data with POST requests, or the amount of data provided in the request exceeds the capacity limit.
I'm using mod_wsgi-express. From searching for this error message on Google, I can see that I'm probably hitting a limit in the web server configuration. Which limit could that be, and how would I be able to change it?
As per help messages from running mod_wsgi-express start-server --help:
--limit-request-body NUMBER
The maximum number of bytes which are allowed in a
request body. Defaults to 10485760 (10MB).
Change your app.sh to add the option and set it to a larger value.
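
For example, if your app.sh launches the server with mod_wsgi-express start-server, the added option might look like this (the script name, port and 100 MB value are illustrative; keep whatever arguments you already pass):

#!/bin/sh
# Only --limit-request-body is the addition; 104857600 bytes is roughly 100 MB.
exec mod_wsgi-express start-server wsgi.py --port 8080 --limit-request-body 104857600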

max_clients limit reached error on tornado-botocore server

I've developed a Tornado server using the tornado-botocore package for interacting with the Amazon SQS service.
When I load test the server, I get the following log:
[simple_httpclient:137:fetch_impl] max_clients limit reached, request queued. 10 active, 89 queued requests.
I assume it's from the AsyncHTTPClient used by the botocore package.
I've tried to set max_clients to a higher number, but with no success:
def _connect(self, operation):
    sqs_connection = Botocore(
        service='sqs', operation=operation,
        region_name=options.aws_sqs_region_name,
        session=session)
    sqs_connection.http_client.configure(None, defaults=dict(max_clients=5000))
What am I doing wrong?
Thanks.
configure is a class method that must be called before an AsyncHTTPClient is created: tornado.httpclient.AsyncHTTPClient.configure(None, max_clients=100).
The log message does not indicate an error (it's logged at debug level). It's up to you whether it's appropriate for this service to respond to load by using more connections or queuing things up. 5000 connections for a single application process seems like too much to me.
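
A minimal sketch of where that call would go, assuming the same tornado-botocore setup as in the question (the value 100 is only an example):

import tornado.httpclient

# Must run before any AsyncHTTPClient is created, and therefore before the
# Botocore wrapper constructs its client.
tornado.httpclient.AsyncHTTPClient.configure(None, max_clients=100)

# ...then create the connection as in the question:
# sqs_connection = Botocore(service='sqs', operation=operation,
#                           region_name=options.aws_sqs_region_name,
#                           session=session)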

Timeout with Python Requests + Clojure HttpKit Server but not Ring server

I have some Ring routes which I'm running one of two ways.
lein ring server, with the lein-ring plugin
using org.httpkit.server, like (hs/run-server app {:port 3000}))
It's a web app (being consumed by an Angular.js browser client).
I have some API tests written in Python using the Requests library:
my_r = requests.post(MY_ROUTE,
                     data=MY_DATA,
                     headers={"Content-Type": "application/json"},
                     timeout=10)
When I use lein ring server, this request works fine in the JS client and the Python tests.
When I use httpkit, this works fine in the JS client but the Python client times out with
socket.timeout: timed out
I can't figure out why the Python client is timing out. It happens with httpkit but not with lein-ring, so I can only assume that the cause is related to the difference.
I've looked at the traffic in WireShark and both look like they give the correct response. Both have the same Content-Length field (15 bytes).
I've raised the number of threads to 10 (I shouldn't need to), with no change.
Any ideas what's wrong?
I found how to fix this, but no satisfactory explanation.
I was using wrap-json-response Ring middleware to take a HashMap and convert it to JSON. I switched to doing my own conversion in my handler with json/write-str, and this fixes it.
At a guess it might be something to do with the server handling output buffering, but that's speculation.
I've combed through the Wireshark dumps and I can't see any relevant differences between the two. The sent Content-Length fields are identical. The 'bytes in flight' differ, at 518 and 524.
No clue as to why the web browser was happy with this but Python Requests wasn't, or whether this is a bug in Requests, httpkit, ring-middleware-format, or my own code.
