openshift 3 django - request too large - python

I migrated a Django app from OpenShift 2 to OpenShift 3 Online. It has an upload feature that allows users to upload audio files, which are usually larger than 50 MB. In OpenShift 3 the upload only works for files up to around 12 MB. Anything larger leads to an error message in Firefox saying "connection canceled". Chromium gives more details:
Request Entity Too Large
The requested resource
/myApp/upload
does not allow request data with POST requests, or the amount of data provided in the request exceeds the capacity limit.
I'm using mod_wsgi-express. From searching for this error message on Google, I can see that I'm probably hitting some limit in the web server configuration. Which limit could that be, and how can I change it?

As per the help message from running mod_wsgi-express start-server --help:
--limit-request-body NUMBER
The maximum number of bytes which are allowed in a
request body. Defaults to 10485760 (10MB).
Change your app.sh to add the option and set it to a larger value.
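For example, the relevant line in app.sh might look like this; a sketch only, assuming app.sh is what launches mod_wsgi-express and that wsgi.py is your WSGI entry point (104857600 bytes = 100 MB, comfortably above your 50 MB audio files):
exec mod_wsgi-express start-server wsgi.py --limit-request-body 104857600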

Related

how do you tell if arcgis request is processed correctly?

My company has an ArcGIS server, and I've been trying to geocode some addresses using the Python requests package.
However, as long as the input format is correct, the response.status_code is always 200, meaning everything is OK, even if the server didn't process the request properly.
(For example, if the batch size limit is 1000 records and I send a JSON input with 2000 records, it still returns status_code 200, but half of the records get ignored.)
Just wondering if there is a way for me to know whether the server processed the request properly or not?
A good place to start is the server logs. They are located in your ArcGIS Server Manager (https://gisserver.domain.com:6443/arcgis/manager). I would expect some type of warning/info to be logged there if records were ignored, but since that is not technically an error, no error message would be returned anywhere.
I doubt you'd want to do this, but if you want to raise the batch size limit you can follow this technical article on how to do that: https://support.esri.com/en/technical-article/000012383
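Beyond the logs, the only reliable check is to look inside the response body, since the HTTP status stays 200 either way. A minimal sketch with the requests package, assuming a geocodeAddresses-style endpoint and that the response carries a 'locations' array (the exact field names depend on your service, so treat them as assumptions):
import json
import requests

def geocode_batch(geocode_url, records):
    """Send one batch of address records and verify that every record
    actually came back (field names here are assumptions)."""
    payload = {
        'addresses': json.dumps({'records': records}),
        'f': 'json',
    }
    resp = requests.post(geocode_url, data=payload)
    resp.raise_for_status()  # catches transport/HTTP errors only
    result = resp.json()

    # status_code == 200 only means the server answered; the body tells you
    # whether the batch was really processed.
    if 'error' in result:
        raise RuntimeError('Server reported an error: %s' % result['error'])

    locations = result.get('locations', [])
    if len(locations) != len(records):
        print('Only %d of %d records were geocoded; the rest were ignored'
              % (len(locations), len(records)))
    return locations
Keeping each batch under the server's advertised limit (e.g. splitting 2000 records into two calls of 1000) also avoids the silent truncation in the first place.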

AWS Elastic Beanstalk health check issue

My web application uses Django, the web server is Nginx, and I deploy with a Docker image on Elastic Beanstalk.
Normally there is no problem, but when the load balancer scales out EC2 instances, my web server starts returning 502 Bad Gateway.
I checked the Elastic Beanstalk application logs: about 16% of the requests returned 5xx errors, at which point the load balancer scales out EC2, the web server goes into the 502 Bad Gateway state, and the Elastic Beanstalk application goes Degraded.
Is this a common problem when the load balancer performs a health check? If not, how do I turn off the health check?
I am attaching a captured image for reference.
As far as I know, a 502 Bad Gateway error can be mitigated only by manually checking the major links on your website and verifying that they are accessible through a simple GET request.
In the case of my website, I had issues with the login page and an about page (sadly about 33% of my website), which is why I got a 5xx error on the health check after deploying to EC2. I solved the problem by simply making those links work on the server (some functionality only ran on localhost and not on AWS, so I fixed that and got an OK status on the health check).
I don't think there is a point in removing the health check, as it gives vital information about your website and you probably don't want your website to have inaccessible pages.
Keep track of the logs to narrow down the problem.
I hope you find the solution.
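One cheap safeguard is to point the health check at a dedicated URL that does nothing but return 200, so a broken login or about page cannot drag the whole environment to Degraded. A minimal Django sketch, where the health/ path name is an assumption and django.urls.path assumes Django 2.0 or later:
from django.http import HttpResponse
from django.urls import path

def health_check(request):
    # No templates and no database queries: just prove the app process is up.
    return HttpResponse('ok')

urlpatterns = [
    path('health/', health_check),  # point the load balancer health check at this path
]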
While your code is being deployed, you will get 502 errors because the EC2 instance fails the health check call. You need to adjust the load balancer's default health check settings to allow enough time for your deployment to complete. Allow more time if you also restart the server after each deployment.
The AWS load balancer sends a health check request to each registered instance every N seconds using the path you specify. The default interval is 30 seconds. If the health check fails N times (default 2) for any of the instances you have running, the health state changes to Degraded or Severe depending on the percentage of instances that are not responding.
Send a request that should return a 200 response code. The default path is '/index.html'.
Wait N seconds before timing out (default 5 seconds).
Try again after N interval seconds (default 30 seconds).
If N consecutive calls fail, change the health state to warning or severe (default unhealthy threshold is 2).
After N consecutive successful calls, return the health state to OK (default healthy threshold is 10).
With the default settings, if any web server instance is down for more than a minute (2 tries of 30 seconds each), it is considered an outage. It will take 5 minutes (10 tries every 30 seconds) to get back to OK status.
For a detailed explanation and configuration options please check AWS documentation: Configure Health Checks for Elastic Load Balancing
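If you need different thresholds, the values above can also be changed on the load balancer itself; a minimal boto3 sketch, assuming a classic ELB and a placeholder load balancer name (the same settings are available in the EC2/Elastic Beanstalk console):
import boto3

elb = boto3.client('elb')
elb.configure_health_check(
    LoadBalancerName='awseb-my-environment',  # placeholder: the ELB Elastic Beanstalk created
    HealthCheck={
        'Target': 'HTTP:80/health/',  # path that must answer with a 200
        'Interval': 30,               # seconds between health check requests
        'Timeout': 5,                 # seconds to wait for each response
        'UnhealthyThreshold': 2,      # consecutive failures before Degraded/Severe
        'HealthyThreshold': 10,       # consecutive successes before returning to OK
    },
)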

DeadlineExceededError with python django appengine

I have a task running which uses django.test.client to render pages:
from django.conf import settings
from django.test.client import Client

client = Client()
resp = client.get(path, HTTP_HOST=settings.STATIC_SITE_DOMAIN, secure=True)
This successfully loops through pages and renders them to static files. However as the site has increased in size and page load times have increased, it now throws an error:
Rendering path /articles/funny-video-of-the-month/
Internal Server Error: /articles/ten-tips-for-december/
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.
I don't know whether this is the App Engine task reaching its 10 minute maximum length or the actual page HTTP request reaching a 60 second limit. The documentation about the error is here:
https://cloud.google.com/appengine/articles/deadlineexceedederrors
There are other potential fixes on Stack Overflow, but they usually revolve around the urlfetch method, which I don't believe is used here.
Here are some potential fixes I can think of:
1) Alter existing timeouts on http requests made via django.test.client
https://github.com/django/django/blob/master/django/test/client.py
https://github.com/django/django/blob/master/django/core/handlers/wsgi.py
2) Split static generation into different tasks so each task doesn't run longer than 10 minutes (see the sketch after this list)
3) Try manual scaling option
https://cloud.google.com/appengine/docs/python/modules/#Python_Instance_scaling_and_class
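For option 2, this is roughly what I have in mind; a sketch only, assuming the App Engine taskqueue API and a hypothetical /tasks/render handler that renders one batch of paths:
from google.appengine.api import taskqueue

CHUNK_SIZE = 20  # hypothetical batch size, tuned so one batch stays well under 10 minutes

def enqueue_render_tasks(paths):
    """Split the full list of paths into small batches and enqueue one task
    per batch, so no single task approaches the task deadline."""
    for i in range(0, len(paths), CHUNK_SIZE):
        batch = paths[i:i + CHUNK_SIZE]
        taskqueue.add(
            url='/tasks/render',                # hypothetical handler that renders one batch
            params={'paths': ','.join(batch)},
        )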
Any suggestions on how to fix would be greatly appreciated.

how to find out how much of file size served by nginx?

So I want to serve some files with a Django + Nginx setup. The problem is that many of the files being served are huge, and users have a quota on how much they can download.
So how can I find out how much of a file was actually served to a user? What I mean is that a user might close the connection mid-download, so how can I find the real amount that was served?
thanks!
With Nginx you can log the amount of bandwidth used to a log file by using the log module (for example, logging the $body_bytes_sent variable in a custom log_format). This is not exactly what you want, but it can help you achieve it.
So now you will have logs that you can parse to get the file sizes and total bandwidth used, plus a script that updates a database; you can then authorize or refuse future downloads depending on whether the user has reached their limit or is still within some soft-limit range.
Server Fault with similar question
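A minimal sketch of such a parsing script, assuming a custom log_format whose last two fields are the authenticated user and $body_bytes_sent (the log path is also an assumption):
from collections import defaultdict

LOG_PATH = '/var/log/nginx/downloads.log'  # hypothetical location of the custom access log

def bytes_served_per_user(log_path=LOG_PATH):
    """Tally how many bytes nginx actually sent to each user, so aborted
    downloads are only charged for what was really transferred."""
    totals = defaultdict(int)
    with open(log_path) as log:
        for line in log:
            fields = line.split()
            if len(fields) < 2:
                continue
            user, sent = fields[-2], fields[-1]
            try:
                totals[user] += int(sent)
            except ValueError:
                continue  # malformed line, skip it
    return totals
A cron job or Django management command can then write these totals into the quota table that the download view consults.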
Another option (making an assumption here) is to keep the file sizes in a database and keep a tally per request: whenever a user hits a download link, immediately increment their download count; if they are over their limit, invalidate the link, otherwise make the link valid and hand them over to Nginx.
Another option would be to write a custom Nginx module that performs the increment at a much more fine-grained level, but this could be more work than your situation requires.

Fetch a large chunk of data with TaskQueue

I'd like to fetch a large file from a URL, but it always raises a DeadlineExceededError, even though I have tried it from a TaskQueue and passed deadline=600 to the fetch.
The problem comes from the fetch itself, so backends cannot help here: even if I launched a backend from a TaskQueue, I'd have 24 h to return, but there would still be the 10 minute limit on the fetch, right?
Is there a way to fetch from a particular offset of the file to another offset? Then I could split the fetch and put all the parts together afterwards.
Any ideas?
Actually the file to fetch is not really that large: between 15 and 30 MB, but the server is probably overwhelmingly slow and constantly hammered...
If the server supports it, you can supply the HTTP Range header to specify a subset of the file that you want to fetch. If the content is being served statically, the server will probably respect range requests; if it's dynamic, it depends on whether the author of the code that generates the response allowed for them.
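A minimal sketch of chunked fetching with urlfetch, assuming the server honours Range requests; the chunk size is a placeholder:
from google.appengine.api import urlfetch

CHUNK = 1024 * 1024  # bytes per request; a placeholder, tune it to your quotas

def fetch_in_chunks(url):
    """Fetch the file piece by piece with HTTP Range requests so no single
    urlfetch call runs long enough to hit the deadline."""
    parts = []
    offset = 0
    while True:
        headers = {'Range': 'bytes=%d-%d' % (offset, offset + CHUNK - 1)}
        resp = urlfetch.fetch(url, headers=headers, deadline=60)
        if resp.status_code == 200:
            # The server ignored the Range header and sent the whole file.
            return resp.content
        if resp.status_code != 206:
            raise Exception('Unexpected status %d' % resp.status_code)
        parts.append(resp.content)
        if len(resp.content) < CHUNK:
            break  # short read, so this was the last chunk
        offset += CHUNK
    return ''.join(parts)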
