How to adjust timeout for python app served with waitress - python

I am running a flask application which has an option in the UI where the user can click a button and it calls an endpoint to perform some analysis.
The application is served as follows:
from waitress import serve
serve(app, host="0.0.0.0", port=5000)
After around ~1 minute, I am receiving a gateway timeout in the UI:
504 Gateway Time-out
However, the Flask application keeps doing the work in the background, and after about 2 minutes it completes the processing; I can see that it submits the data on the DB side. So the process itself is not timing out.
I already tried passing the channel_timeout argument with a much higher value (the default seems to be 120 seconds), but with no luck. I know it would make more sense to implement this differently so the user doesn't have to wait for these two minutes, but I am asking whether there is such a timeout set by default and whether it can be increased.
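For reference, the serve call with channel_timeout looked roughly like this (a sketch; the value is in seconds and is just an example):
from waitress import serve
# channel_timeout raised well above the 120-second default
serve(app, host="0.0.0.0", port=5000, channel_timeout=600)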
The application is deployed in K8s and the UI is exposed via ingress. Could the timeout come from ingress instead?

The problem was with the ingress controller default timeout.
I managed to get around this issue by changing the implementation and having this job run as a background task instead.
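For anyone who prefers to raise the ingress timeout instead, annotations along these lines should work for the NGINX ingress controller (a sketch; the annotation names assume ingress-nginx and the values are seconds):
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"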

Related

How to prevent the 230-second Azure gateway timeout using Python Flask for long-running workloads

I have a Python Flask application deployed as an Azure web app, and one function is a compute-intensive workload which takes more than 5 minutes to process. Is there any hack to prevent the gateway timeout error by keeping the TCP connection active between the client and the API while the function is processing the data? A sample of the current code is below.
from flask import Flask

app = Flask(__name__)

@app.route('/data')
def data():
    mydata = super_long_process_function()  # takes more than 5 minutes to process
    return mydata
Since super_long_process_function takes more than 5 minutes, it always times out with 504 Gateway Time-out. One thing I want to mention is that this is an idle timeout at the TCP level, which means the timeout is only hit if the connection is idle and no data transfer is happening. So is there any hack in Flask that can be used to prevent this timeout while we process the data? Based on my research and reading the Microsoft documentation, the 230-second limit cannot be changed for web apps.
In short: the 230 second timeout, as you stated, cannot be changed.
230 seconds is the maximum amount of time that a request can take without sending any data back to the response. It is not configurable.
Source: GitHub issue
The timeout occurs if there's no response. Keeping the connection open and sending data will not help.
There are a couple of ways you can go about this. Here are two possible solutions you could use to trigger your long-running tasks without the timeout being an issue.
Trigger the long-running task with an HTTP call, but don't wait for its completion before returning a response.
Trigger the task using a messaging mechanism like Storage Queues or Service Bus.
For updating the web application with the result of the long-running task, think along the lines of: having the response hold a URL the frontend can poll periodically to check for task completion, having your request include a callback URL to call when the task has completed, or implementing Azure Web PubSub to push status updates to the client.
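A minimal sketch of the polling approach in Flask is below. It is illustrative only: super_long_process_function stands in for the existing work, the in-memory results dict and the /status route are made up for the example, and a real deployment would hand the work to a durable queue (Storage Queues or Service Bus, as above) rather than an in-process thread.
import threading
import uuid
from flask import Flask, jsonify

app = Flask(__name__)
results = {}  # in-memory store, for illustration only

def super_long_process_function():
    ...  # placeholder for the existing long-running work

@app.route('/data', methods=['POST'])
def start_data():
    task_id = str(uuid.uuid4())
    results[task_id] = {'status': 'running'}

    def worker():
        results[task_id] = {'status': 'done', 'data': super_long_process_function()}

    threading.Thread(target=worker, daemon=True).start()
    # respond immediately so the 230-second gateway limit is never hit
    return jsonify({'status_url': f'/status/{task_id}'}), 202

@app.route('/status/<task_id>')
def status(task_id):
    return jsonify(results.get(task_id, {'status': 'unknown'}))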

Shutdown Flask server after response

I've been reading a lot about different ways to shut down a Flask app, but I don't get how I could implement something for my use case.
I wrote and am testing a simple Flask app which takes a POST request to create some resources within Google Cloud. This Flask app is deployed into a container and is running on Cloud Run.
My question is: can I shut down the app right after a 200 response, or would there be a way to handle one request per Cloud Run instance?
app = Flask(__name__)

@app.route('/', methods=['POST'])
def main():
    # some validation on the request.json
    try:
        kick_off_terraform()
        return ("Success", 200)
    except Exception as e:
        print(e)
        return ("Error", 500)  # return a response on failure as well
After doing some research I found out I can control the concurrency on the GCP side, and that way I can allow only one request per instance on Cloud Run.
gcloud run deploy $SERVICE_NAME \
  --image gcr.io/$GCP_PROJECT_ID/$SERVICE_NAME:$CI_COMMIT_SHORT_SHA \
  --region=us-east1 \
  --platform managed \
  --concurrency=1
Sadly, hacks like --concurrency=1 or --max-instances=1 are not great, because shutting down the server after a request may cause the request itself to fail (when I did that in the past, requests failed).
Based on your question I am guessing you might not have fully grasped Cloud Run runtime behavior. Please note that:
You don't need to "shut down" a container on Cloud Run. It automatically suspends once all requests finish, and you are not even charged for the idle time that occurs outside of a request.
Operations like kick_off_terraform() can't happen in the background (they have to finish before you return the response), because Cloud Run currently doesn't allocate CPU in the background.
What you need is something like "run to completion containers" and you may need to wait a bit for that to be supported by Cloud Run.

Multiprocessing does not work in Flask when I deploy to a server using flask + uwsgi + nginx

I'm using flask + uwsgi + nginx to deploy a website on a server.
My Flask code is below. Here is what I want to design: every time I click "run model", it should run a model in another process, while the interface redirects to a waiting page immediately.
from multiprocessing import Process  # assumed import for Process

train_dataset.status = status
db.session.commit()
text = 'Start training your models, please wait for a while or check results several hours later'
# would run a model in another process
task = Process(target=start_train,
               args=(app.config['UPLOAD_FOLDER'], current_user.id, p.id, p.task),
               name="training.exe")
task.start()
print(task.pid, task.name)
session[f"{project_name}_train"] = task.pid
# in the meanwhile, link to the waiting interface
return render_template('p_computeview.html', project_name=project_name, text=text,
                       status=status, p=p, email=email, not_identify=not_identify)
And when I test in local development environment
app.run()
it's OK: when I click "run model", the interface immediately goes to the waiting page and I can see the model running logs.
But when I deploy to the server, I chose uwsgi + nginx + flask.
In uwsgi.ini, I have already specified the processes:
processes=2
threads=5
But when I click "run model", the interface stalls and does not go to the waiting page. However, I can see the model running logs, and only once the modeling has finished does the interface go to the waiting page (which seems to prove that the Process call is not working as expected?).
My server has 2 CPUs, so I think it can support multiple processes.
Can someone help me? I guess there is some problem in uwsgi or nginx?
The desire to run separate threads or processes within a request context is a common one. For various reasons, except in very narrow circumstances, it is a desire that leads to frustration. In this case, as soon as task goes out of scope, the Process machinery gets torn down.
If you want to start a long-running task from a request handler, use a framework like Celery or RQ, which arrange to run jobs entirely out of process from the HTTP server.
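As a rough sketch of what that looks like with Celery (the broker URL is a placeholder, and the task arguments simply mirror the ones from the question; RQ is analogous):
from celery import Celery

# broker URL is a placeholder; point it at whatever broker you run (Redis, RabbitMQ, ...)
celery = Celery(__name__, broker='redis://localhost:6379/0')

@celery.task
def start_train(upload_folder, user_id, project_id, task_name):
    ...  # the long-running training runs inside the Celery worker process

# inside the Flask view, instead of Process(...).start():
# start_train.delay(app.config['UPLOAD_FOLDER'], current_user.id, p.id, p.task)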

Google AppEngine - Using queues with max_concurrent_requests set to 1: Process terminated because the request deadline was exceeded

I've set up a TaskQueue for my AppEngine API. The API processes large requests and may take up to three hours of computation to complete, working on one task at a time.
I've set 'max_concurrent_requests' to 1 in the queue.yaml file, so that only one task will be active at a time, and increased the gunicorn timeout as well.
My problem comes because the CloudTask requests seem to have a timeout of 10min, after which they throw up an error:
Process terminated because the request deadline was exceeded. Please ensure that your HTTP server is listening for requests on 0.0.0.0 and on the port defined by the PORT environment variable. (Error code 123)
How can I configure my queue to wait idle for a previous task to finish, instead of simply timing out?
You've most likely set your App Engine service scaling element to "autoscaling" (or didn't define it, autoscaling being the default value) in the app.yaml file.
Instances in autoscaling have a deadline of 10min, as documented here. You'll need to re-deploy your service with an updated app.yaml file setting the scaling element to "manual scaling" or "basic scaling" to allow your tasks to run for up to 24h.
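A rough idea of what that change to app.yaml might look like (the runtime and values here are only illustrative):
runtime: python39        # illustrative runtime
basic_scaling:
  max_instances: 1
  idle_timeout: 10m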

Simultaneous requests with turbogears2

I'm very new to web dev, and I'm trying to build a simple web interface with Ajax calls to refresh data, and TurboGears2 as the backend.
My Ajax calls are working fine and make periodic calls to my TurboGears2 server; however, these calls take time to complete (some requests make the server run remote SSH calls on other machines, which take up to 3-4 seconds to complete).
My problem is that TurboGears waits for each request to complete before handling the next one, so all my concurrent Ajax calls are being queued instead of being all processed in parallel.
Refreshing N values takes 3*N seconds, where it could take just 3 seconds with concurrency.
Any idea how to fix this?
Here is my current server-side code (method get_load is the one called with Ajax):
import subprocess
from tg import TGController, expose

class RootController(TGController):
    @expose()
    def index(self):
        with open("index.html") as data:
            index = data.read()
        return index

    @expose()
    def get_load(self, ip):
        command = "bash get_cpu_load.sh"
        # stdout must be piped for communicate() to return the command output
        request = subprocess.Popen(["ssh", "-o ConnectTimeout=2", ip, command],
                                   stdout=subprocess.PIPE)
        load = str(request.communicate()[0])
        return load
Your problem is probably caused by the fact that you are serving requests with the Gearbox wsgiref server. By default the wsgiref server is single threaded and so can serve a single request at a time. That can be changed by providing the wsgiref.threaded = true configuration option in your development.ini server section (the same one where the IP address and port are specified too). See https://github.com/TurboGears/gearbox#gearbox-http-servers and http://turbogears.readthedocs.io/en/latest/turbogears/gearbox.html#changing-http-server for additional details.
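For example, the server section of development.ini would end up looking roughly like this (host and port values are illustrative):
[server:main]
use = egg:gearbox#wsgiref
host = 0.0.0.0
port = 8080
wsgiref.threaded = true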
Note that wsgiref is the development server for TurboGears and its use in production is usually discouraged. You should consider using something like Waitress, Chaussette or mod_wsgi when deploying your application; see http://turbogears.readthedocs.io/en/latest/cookbook/deploy/index.html?highlight=deploy
