I have an API that performs the operations below. I am using Python, the Django framework, and gunicorn/nginx for deployment. The API is deployed on AWS Lightsail, and a request comes in every 2 seconds.
1. Receives data from the client.
2. Creates a record in the local SQLite database and sends the response.
3. Runs a task asynchronously in a thread; the entire task takes about 1 second on average:
   a. Gets the updated record from step 2 by its ID. (~0 s)
   b. Posts data to another API using requests. (~0.5 s)
   c. Updates the database (AWS RDS). (~0.5 s)
Setup:
I have a ThreadPoolExecutor with max_workers=12.
gunicorn has one worker, as the instance has 1 vCPU. I don't use gunicorn workers with threads, since I have to perform some other tasks within the API.
The reason asyncio is not used is that database updates in Django are not supported with it, so I kept the POST to the other API in the thread as well.
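For reference, a minimal sketch of this setup; the model, view, and database-alias names are illustrative, not my actual code:

import requests
from concurrent.futures import ThreadPoolExecutor
from django.http import JsonResponse
from myapp.models import Record  # hypothetical model

executor = ThreadPoolExecutor(max_workers=12)

def background_task(record_id):
    record = Record.objects.get(pk=record_id)            # step 3a (~0 s): re-read by ID
    requests.post("https://other-api.example/push",      # step 3b (~0.5 s)
                  json={"payload": record.payload})
    record.save(using="rds")                             # step 3c (~0.5 s): assumed multi-DB alias for RDS

def create_record(request):
    record = Record.objects.create(payload=request.body.decode())  # step 2
    executor.submit(background_task, record.pk)                    # step 3
    return JsonResponse({"id": record.pk})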
Each request is unique; no request repeats.
Even if I set max_workers to 1 in the thread pool, CPU usage bursts past the 10% mark on the $5 AWS instance, even though the API only receives a request every 2 seconds.
I have not been able to profile where the CPU usage is coming from.
There are a couple of reasons I can think of:
the gunicorn master constantly checking on the worker;
the OS context switching between the threads.
Any pointers on profiling would be helpful.
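As one possible starting point, the stdlib cProfile can be wrapped around the background task to see where CPU time goes inside the worker thread; a minimal sketch, where run_task stands in for the step-3 task (name assumed):

import cProfile
import io
import pstats

def profiled_task(record_id):
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        run_task(record_id)  # the existing step-3 task (hypothetical name)
    finally:
        profiler.disable()
        out = io.StringIO()
        pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(15)
        print(out.getvalue())  # top 15 entries by cumulative time; log to a file in practice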
Related
I have a Python Flask application running as an Azure web app, and one function is a compute-intensive workload which takes more than 5 minutes to process. Is there any hack to prevent the gateway timeout error by keeping the TCP connection active between the client and the API while the function is processing the data? A sample of the current code is below.
from flask import Flask

app = Flask(__name__)

@app.route('/data')
def data():
    mydata = super_long_process_function()  # takes more than 5 minutes to process
    return mydata
Since super_long_process_function takes more than 5 minutes, the request always times out with a 504 Gateway Time-out. One thing I want to mention is that this is an idle timeout at the TCP level, which means the timeout is hit only if the connection is idle and no data transfer is happening. So is there any hack in Flask that can be used to prevent this timeout while we process the data? Based on my research and my reading of the Microsoft documentation, the 230-second limit cannot be changed for web apps.
In short: the 230 second timeout, as you stated, cannot be changed.
230 seconds is the maximum amount of time that a request can take without sending any data back to the response. It is not configurable.
Source: GitHub issue
The timeout occurs if there's no response; keeping the TCP connection open without sending response data will not help.
There are a couple of ways you can go about this. Here are two possible solutions you could use to trigger your long-running tasks without the timeout being an issue.
Trigger the long-running task with an HTTP call, but don't wait for its completion before returning a response.
Trigger the task using a messaging mechanism like Storage Queues or Service Bus.
For updating the web application with the result of the long-running task, think along the lines of: having the response hold a URL the frontend can call periodically to check for task completion, having your request include a callback URL to be called when the task has completed, or implementing Azure Web PubSub to send status updates to the client.
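A minimal sketch of the first approach in Flask, using the question's super_long_process_function; the in-memory tasks dict and endpoint paths are illustrative (a real deployment would use a durable store or one of the messaging options above):

import threading
import uuid
from flask import Flask, jsonify

app = Flask(__name__)
tasks = {}  # task_id -> {"status", "result"}; illustrative in-memory store

def run_in_background(task_id):
    tasks[task_id]["result"] = super_long_process_function()
    tasks[task_id]["status"] = "done"

@app.route('/data', methods=['POST'])
def start_data():
    task_id = str(uuid.uuid4())
    tasks[task_id] = {"status": "pending", "result": None}
    threading.Thread(target=run_in_background, args=(task_id,), daemon=True).start()
    # respond immediately, well within the 230-second limit
    return jsonify({"status_url": f"/data/status/{task_id}"}), 202

@app.route('/data/status/<task_id>')
def data_status(task_id):
    return jsonify(tasks.get(task_id, {"status": "unknown"}))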
I am trying to design an API with FastAPI. I have clients who limit the response time of their requests; for example, for some clients it may be tens of seconds, for others milliseconds.
It is assumed that the user sends a request (e.g. /v5/bla/info); the API checks who sent the request and determines the allowed response time for it. If the request completes within this time, the answer is returned; if not, then at the end of the allotted time the API returns some kind of request ID, so that the user can then query another endpoint (e.g. /v5/check_request) which reports the execution status (pending, done, error) of the request by that ID.
The question is how to implement task execution and runtime checking while holding the session with the client.
EDIT: I was thinking that the API would write all incoming requests to a database, and then some "executor" would take the data from the database, execute the requests, and update their status. Meanwhile, the API would check the record's status every n seconds and return the result.
How bad/good is this option? The load is approximately 30 million requests per 24 hours.
As mentioned above, I would recommend Celery. With Celery there is the option to schedule long-running tasks on a task queue: when you place a task on the queue, a worker can pick it up and process the computation in a separate process.
Celery can be run in different configurations. You will need to select a broker and a backend, e.g. RabbitMQ as the broker and Redis as the backend. More information is in the Celery documentation.
There is already a cookiecutter template for a bigger setup you could use, but you can also build a simpler setup with your own docker-compose file. Once you have configured Celery, you can create tasks on the queue with:
import os

from celery import Celery

celery = Celery(__name__)
celery.conf.broker_url = os.environ.get("CELERY_BROKER_URL", "redis://localhost:6379")  # or any config you prefer
celery.conf.result_backend = os.environ.get("CELERY_RESULT_BACKEND", "redis://localhost:6379")  # or any config you prefer

@celery.task(name="long_task")
def long_task():
    # long running code
    return True
And then add the task with something like this in the first endpoint:
task = long_task.delay()
# task.id will give you the id
And then the second endpoint can look the task up by its ID and get its status:
from celery.result import AsyncResult

task_result = AsyncResult(task_id)
status = task_result.status  # e.g. PENDING, STARTED, SUCCESS, FAILURE
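Putting the two endpoints together for the original question, a rough FastAPI sketch; the endpoint paths, the tasks module, and the get_client_timeout helper are assumptions, not a definitive implementation:

from celery.exceptions import TimeoutError as CeleryTimeout
from celery.result import AsyncResult
from fastapi import FastAPI

from tasks import celery, long_task  # the Celery app and task defined above (module name assumed)

app = FastAPI()

@app.get("/v5/bla/info")
def info():
    task = long_task.delay()
    try:
        # block for at most this client's time budget (hypothetical helper)
        result = task.get(timeout=get_client_timeout())
        return {"status": "done", "result": result}
    except CeleryTimeout:
        # out of time: hand back an ID the client can poll later
        return {"status": "pending", "request_id": task.id}

@app.get("/v5/check_request/{task_id}")
def check_request(task_id: str):
    res = AsyncResult(task_id, app=celery)
    return {"status": res.status, "result": res.result if res.successful() else None}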
I am working on an application that fetches multiple REST APIs and, in turn, exposes the data from those fetches in a REST API of its own.
The APIs I need to fetch are slow and do not share the same structure. All fetches need to recur daily.
I am thinking of using:
- Django REST Framework to expose the API, using its parsers and serializers to process the received data, and storing the aggregated data to be exposed in PostgreSQL.
- Celery to launch workers and/or child processes in parallel.
- Celery beat to fetch regularly and keep the data up to date.
- RabbitMQ as the message broker for Celery.
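For illustration, a rough sketch of how the Celery beat piece of this stack might look; the broker URL, the source URL, the task module name, and the AggregatedData model are all assumptions:

import requests
from celery import Celery
from celery.schedules import crontab

app = Celery("fetcher", broker="amqp://localhost")  # RabbitMQ as the broker

@app.task(name="tasks.fetch_source")
def fetch_source(url):
    data = requests.get(url, timeout=30).json()
    # store via the Django ORM from inside the worker
    # (hypothetical model; assumes Django is configured in the worker)
    from myapp.models import AggregatedData
    AggregatedData.objects.create(payload=data)

app.conf.beat_schedule = {
    "daily-fetch": {
        "task": "tasks.fetch_source",
        "schedule": crontab(hour=2, minute=0),  # once a day at 02:00
        "args": ("https://example.com/slow-api",),
    },
}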
Can I use Celery to make the fetch calls to the APIs, or do I need to write a special fetch script?
And how do I get the result back from Celery to Django REST Framework to store it? Do I need another queue and Celery worker for the response?
Is it better to have one worker with multiple child processes, or multiple workers with one child process each?
We have a Flask app running behind uWSGI with 4 processes. It's an API which serves data from one of our two Elasticsearch clusters.
On app bootstrap, each process pulls config from an external DB to check which ES cluster is active and connects to it.
Every now and then a POST request comes in (from the AWS SNS service) which informs all the clients to switch ES clusters. That triggers the same function as on bootstrap: pull config from the DB and reconnect to the active ES cluster.
It works well running as a single process, but when we have more than one process running, only one of them gets updated (the one which picks up the POST request), while the other processes stay connected to the inactive cluster.
Pulling config on each request to make sure the ES cluster we use is active would be too slow. I'm thinking of installing Redis locally and storing the active_es_cluster there... any other ideas?
I think there are two routes you could go down.
Have an endpoint "/set_es_cluster" that gets hit by your SNS POST request. This endpoint then sets the key "active_es_cluster" in Redis, which is read on every ES request by your other processes (see the sketch after this list). The downside of this is that each ES request needs to do a Redis lookup first.
Have a separate process that receives the POST request specifically (I assume the clusters are not changing often). The purpose of this process is to receive the POST request and have uWSGI gracefully restart your other Flask processes.
The advantages of the second option:
You don't have to hit Redis on every request.
uWSGI handles the restarts for you (which it does well).
You already set up the config pulling at startup anyway, so it should "just work" with your existing application.
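A minimal sketch of the first option; the cluster names, hosts, and default value are illustrative:

import redis
from elasticsearch import Elasticsearch

r = redis.Redis(host="localhost", port=6379)
clusters = {
    "blue": Elasticsearch("http://es-blue:9200"),
    "green": Elasticsearch("http://es-green:9200"),
}

def active_es():
    # read the flag before each ES request (the Redis lookup mentioned above)
    name = (r.get("active_es_cluster") or b"blue").decode()
    return clusters[name]

def set_es_cluster(name):
    # called from the SNS-triggered /set_es_cluster endpoint
    r.set("active_es_cluster", name)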
I have a Flask application. It works with a database, and I'm using SQLAlchemy for this. I have one question:
Flask handles requests one by one. So, for example, I have two users who are modifying the same record in a database table, say A and B (they are concurrent).
How can I tell user B that user A has changed this record? There must be some message to user B.
With the development server, when you do app.run(), you get a single synchronous process, which means at most one request is processed at a time, so you cannot serve multiple users at the same time.
However, gunicorn is a solid, easy-to-use WSGI server that will let you spawn multiple workers (separate processes), and even comes with asynchronous workers when you need to deploy your application.
To answer your question, though: since the requests run separately, each query uses/returns whatever data exists in the database at the specific time that query is run.
I hope this answers your query.
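One common way to actually surface such a conflict to user B (not described in the answer above) is optimistic locking via SQLAlchemy's version_id_col; a minimal sketch, with the Record model and in-memory engine as illustrative stand-ins:

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base
from sqlalchemy.orm.exc import StaleDataError

Base = declarative_base()

class Record(Base):
    __tablename__ = "records"
    id = Column(Integer, primary_key=True)
    value = Column(String)
    version = Column(Integer, nullable=False)
    __mapper_args__ = {"version_id_col": version}  # bumped automatically on every UPDATE

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as s:
    s.add(Record(id=1, value="original"))
    s.commit()

# users A and B read the same row in separate sessions
a, b = Session(engine), Session(engine)
rec_a, rec_b = a.get(Record, 1), b.get(Record, 1)

rec_a.value = "A's change"
a.commit()  # succeeds, version bumps 1 -> 2

rec_b.value = "B's change"
try:
    b.commit()  # B still holds version 1, so the UPDATE matches no row
except StaleDataError:
    b.rollback()  # here you would tell user B the record was changed by someone else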