Parallelize work within a Flask view with asyncio - python

I am working on a Flask app in which the response to the client depends on replies that I get from a couple of external APIs. The requests to these APIs are logically independent from each other, so a speed gain can be realized by sending these requests in parallel (in the example below response time would be cut almost in half).
It seems to me the simplest and most modern way to achieve this is to use asyncio and process all work in a separate async function that is called from the flask view function using asyncio.run(). I have included a short working example below.
Using celery or any other type of queue with a separate worker process does not really make sense here, because the response has to wait for the API results anyway before sending a reply. As far as I can see this is a variant of this idea where a processing loop is accessed through asyncio. There are certainly applications for this, but I think if we really just want to parallelize IO before answering a request this is unnecessarily complicated.
However, I know that there can be some pitfalls in using various kinds of multithreading from within Flask. Therefore my questions are:
Would the implmentation below be considered safe when used in a production environment? How does that depend on the kind of server that we run Flask on? Particularly, the built-in development server or a typical multi-worker gunicorn setup such as suggested on https://flask.palletsprojects.com/en/1.1.x/deploying/wsgi-standalone/#gunicorn?
Are there any considerations to be made about Flask's app and request contexts in the async function or can I simply use them as I would in any other function? I.e. can I simply import current_app to access my application config or use the g and session objects? When writing to them possible race conditions would clearly have to be considered, but are there any other issues? In my basic tests (not in example) everything seems to work alright.
Are there any other solutions that would improve on this?
Here is my example application. Since the ascynio interface changed a bit over time it is probably worth noting that I tested this on Python 3.7 and 3.8 and I have done my best to avoid deprecated parts of asyncio.
import asyncio
import random
import time
from flask import Flask
app = Flask(__name__)
async def contact_api_a():
print(f'{time.perf_counter()}: Start request 1')
# This sleep simulates querying and having to wait for an external API
await asyncio.sleep(2)
# Here is our simulated API reply
result = random.random()
print(f'{time.perf_counter()}: Finish request 1')
return result
async def contact_api_b():
print(f'{time.perf_counter()}: Start request 2')
await asyncio.sleep(1)
result = random.random()
print(f'{time.perf_counter()}: Finish request 2')
return result
async def contact_apis():
# Create the two tasks
task_a = asyncio.create_task(contact_api_a())
task_b = asyncio.create_task(contact_api_b())
# Wait for both API requests to finish
result_a, result_b = await asyncio.gather(task_a, task_b)
print(f'{time.perf_counter()}: Finish both requests')
return result_a, result_b
#app.route('/')
def hello_world():
start_time = time.perf_counter()
# All async processes are organized in a separate function
result_a, result_b = asyncio.run(contact_apis())
# We implement some final business logic before finishing the request
final_result = result_a + result_b
processing_time = time.perf_counter() - start_time
return f'Result: {final_result:.2f}; Processing time: {processing_time:.2f}'

This will be safe to run in production but asyncio will not work efficiently with the Gunicorn async workers, such as gevent or eventlet. This is because the result_a, result_b = asyncio.run(contact_apis()) will block the gevent/eventlet event-loop until it completes, whereas using the gevent/eventlet spawn equivalents will not. The Flask server shouldn't be used in production. The Gunicorn threaded workers (or multiple Gunicorn processes) will be fine, as asyncio will block the thread/process.
The globals will work fine as they are tied to either the thread (threaded workers) or green-thread (gevent/eventlet) and not to the asyncio task.
I would say Quart is an improvement (I'm the Quart author). Quart is the Flask API re-implemented using asyncio. With Quart the snippet above is,
import asyncio
import random
import time
from quart import Quart
app = Quart(__name__)
async def contact_api_a():
print(f'{time.perf_counter()}: Start request 1')
# This sleep simulates querying and having to wait for an external API
await asyncio.sleep(2)
# Here is our simulated API reply
result = random.random()
print(f'{time.perf_counter()}: Finish request 1')
return result
async def contact_api_b():
print(f'{time.perf_counter()}: Start request 2')
await asyncio.sleep(1)
result = random.random()
print(f'{time.perf_counter()}: Finish request 2')
return result
async def contact_apis():
# Create the two tasks
task_a = asyncio.create_task(contact_api_a())
task_b = asyncio.create_task(contact_api_b())
# Wait for both API requests to finish
result_a, result_b = await asyncio.gather(task_a, task_b)
print(f'{time.perf_counter()}: Finish both requests')
return result_a, result_b
#app.route('/')
async def hello_world():
start_time = time.perf_counter()
# All async processes are organized in a separate function
result_a, result_b = await contact_apis()
# We implement some final business logic before finishing the request
final_result = result_a + result_b
processing_time = time.perf_counter() - start_time
return f'Result: {final_result:.2f}; Processing time: {processing_time:.2f}'
I'd also suggest using an asyncio based request library such as httpx

Related

How to stop execution of FastAPI endpoint after a specified time to reduce CPU resource usage/cost?

Use case
The client micro service, which calls /do_something, has a timeout of 60 seconds in the request/post() call. This timeout is fixed and can't be changed. So if /do_something takes 10 mins, /do_something is wasting CPU resources since the client micro service is NOT waiting after 60 seconds for the response from /do_something, which wastes CPU for 10 mins and this increases the cost. We have limited budget.
The current code looks like this:
import time
from uvicorn import Server, Config
from random import randrange
from fastapi import FastAPI
app = FastAPI()
def some_func(text):
"""
Some computationally heavy function
whose execution time depends on input text size
"""
randinteger = randrange(1,120)
time.sleep(randinteger)# simulate processing of text
return text
#app.get("/do_something")
async def do_something():
response = some_func(text="hello world")
return {"response": response}
# Running
if __name__ == '__main__':
server = Server(Config(app=app, host='0.0.0.0', port=3001))
server.run()
Desired Solution
Here /do_something should stop the processing of the current request to endpoint after 60 seconds and wait for next request to process.
If execution of the end point is force stopped after 60 seconds we should be able to log it with custom message.
This should not kill the service and work with multithreading/multiprocessing.
I tried this. But when timeout happends the server is getting killed.
Any solution to fix this?
import logging
import time
import timeout_decorator
from uvicorn import Server, Config
from random import randrange
from fastapi import FastAPI
app = FastAPI()
#timeout_decorator.timeout(seconds=2, timeout_exception=StopIteration, use_signals=False)
def some_func(text):
"""
Some computationally heavy function
whose execution time depends on input text size
"""
randinteger = randrange(1,30)
time.sleep(randinteger)# simulate processing of text
return text
#app.get("/do_something")
async def do_something():
try:
response = some_func(text="hello world")
except StopIteration:
logging.warning(f'Stopped /do_something > endpoint due to timeout!')
else:
logging.info(f'( Completed < /do_something > endpoint')
return {"response": response}
# Running
if __name__ == '__main__':
server = Server(Config(app=app, host='0.0.0.0', port=3001))
server.run()
This answer is not about improving CPU time—as you mentioned in the comments section—but rather explains what would happen, if you defined an endpoint with normal def or async def, as well as provides solutions when you run blocking operations inside an endpoint.
You are asking how to stop the processing of a request after a while, in order to process further requests. It does not really make that sense to start processing a request, and then (60 seconds later) stop it as if it never happened (wasting server resources all that time and having other requests waiting). You should instead let the handling of requests to FastAPI framework itself. When you define an endpoint with async def, it is run on the main thread (in the event loop), i.e., the server processes the requests sequentially, as long as there is no await call inside the endpoint (just like in your case). The keyword await passes function control back to the event loop. In other words, it suspends the execution of the surrounding coroutine, and tells the event loop to let something else run, until the awaited task completes (and has returned the result data). The await keyword only works within an async function.
Since you perform a heavy CPU-bound operation inside your async def endpoint (by calling your some_func() function), and you never give up control for other requests to run in the event loop (e.g., by awaiting for some coroutine), the server will be blocked and wait for that request to be fully processed and complete, before moving on to the next one(s)—have a look at this answer for more details.
Solutions
One solution would be to define your endpoint with normal def instead of async def. In brief, when you declare an endpoint with normal def instead of async def in FastAPI, it is run in an external threadpool that is then awaited, instead of being called directly (as it would block the server); hence, FastAPI would still work asynchronously.
Another solution, as described in this answer, is to keep the async def definition and run the CPU-bound operation in a separate thread and await it, using Starlette's run_in_threadpool(), thus ensuring that the main thread (event loop), where coroutines are run, does not get blocked. As described by #tiangolo here, "run_in_threadpool is an awaitable function, the first parameter is a normal function, the next parameters are passed to that function directly. It supports sequence arguments and keyword arguments". Example:
from fastapi.concurrency import run_in_threadpool
res = await run_in_threadpool(cpu_bound_task, text='Hello world')
Since this is about a CPU-bound operation, it would be preferable to run it in a separate process, using ProcessPoolExecutor, as described in the link provided above. In this case, this could be integrated with asyncio, in order to await the process to finish its work and return the result(s). Note that, as described in the link above, it is important to protect the main loop of code to avoid recursive spawning of subprocesses, etc—essentially, your code must be under if __name__ == '__main__'. Example:
import concurrent.futures
from functools import partial
import asyncio
loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
About Request Timeout
With regards to the recent update on your question about the client having a fixed 60s request timeout; if you are not behind a proxy such as Nginx that would allow you to set the request timeout, and/or you are not using gunicorn, which would also allow you to adjust the request timeout, you could use a middleware, as suggested here, to set a timeout for all incoming requests. The suggested middleware (example is given below) uses asyncio's .wait_for() function, which waits for an awaitable function/coroutine to complete with a timeout. If a timeout occurs, it cancels the task and raises asyncio.TimeoutError.
Regarding your comment below:
My requirement is not unblocking next request...
Again, please read carefully the first part of this answer to understand that if you define your endpoint with async def and not await for some coroutine inside, but instead perform some CPU-bound task (as you already do), it will block the server until is completed (and even the approach below wont' work as expected). That's like saying that you would like FastAPI to process one request at a time; in that case, there is no reason to use an ASGI framework such as FastAPI, which takes advantage of the async/await syntax (i.e., processing requests asynchronously), in order to provide fast performance. Hence, you either need to drop the async definition from your endpoint (as mentioned earlier above), or, preferably, run your synchronous CPU-bound task using ProcessPoolExecutor, as described earlier.
Also, your comment in some_func():
Some computationally heavy function whose execution time depends on
input text size
indicates that instead of (or along with) setting a request timeout, you could check the length of input text (using a dependency fucntion, for instance) and raise an HTTPException in case the text's length exceeds some pre-defined value, which is known beforehand to require more than 60s to complete the processing. In that way, your system won't waste resources trying to perform a task, which you already know will not be completed.
Working Example
import time
import uvicorn
import asyncio
import concurrent.futures
from functools import partial
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from starlette.status import HTTP_504_GATEWAY_TIMEOUT
from fastapi.concurrency import run_in_threadpool
REQUEST_TIMEOUT = 2 # adjust timeout as desired
app = FastAPI()
#app.middleware('http')
async def timeout_middleware(request: Request, call_next):
try:
return await asyncio.wait_for(call_next(request), timeout=REQUEST_TIMEOUT)
except asyncio.TimeoutError:
return JSONResponse({'detail': f'Request exceeded the time limit for processing'},
status_code=HTTP_504_GATEWAY_TIMEOUT)
def cpu_bound_task(text):
time.sleep(5)
return text
#app.get('/')
async def main():
loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
return {'response': res}
if __name__ == '__main__':
uvicorn.run(app)

FastAPI, add long tasks to buffer and process them one by one, while maintaining server responsiveness

I am trying to set up a FastAPI server that will take as input some biological data, and run some processing on them. Since the processing takes up all the server's resources, queries should be processed sequentially. However, the server should stay responsive and add further requests in a buffer. I've been trying to use the BackgroundTasks module for this, but after sending the second query, the response gets delayed while the task is running. Any help appreciated, and thanks in advance.
import os
import sys
import time
from dataclasses import dataclass
from fastapi import FastAPI, Request, BackgroundTasks
EXPERIMENTS_BASE_DIR = "/experiments/"
QUERY_BUFFER = {}
app = FastAPI()
#dataclass
class Query():
query_name: str
query_sequence: str
experiment_id: str = None
status: str = "pending"
def __post_init__(self):
self.experiment_id = str(time.time())
self.experiment_dir = os.path.join(EXPERIMENTS_BASE_DIR, self.experiment_id)
os.makedirs(self.experiment_dir, exist_ok=False)
def run(self):
self.status = "running"
# perform some long task using the query sequence and get a return code #
self.status = "finished"
return 0 # or another code depending on the final output
#app.post("/")
async def root(request: Request, background_tasks: BackgroundTasks):
query_data = await request.body()
query_data = query_data.decode("utf-8")
query_data = dict(str(x).split("=") for x in query_data.split("&"))
query = Query(**query_data)
QUERY_BUFFER[query.experiment_id] = query
background_tasks.add_task(process, query)
return {"Query created": query, "Query ID": query.experiment_id, "Backlog Length": len(QUERY_BUFFER)}
async def process(query):
""" Process query and generate data"""
ret_code = await query.run()
del QUERY_BUFFER[query.experiment_id]
print(f'Query {query.experiment_id} processing finished with return code {ret_code}.')
#app.get("/backlog/")
def return_backlog():
return {f"Currently {len(QUERY_BUFFER)} jobs in the backlog."}
EDIT:
The original answer was influenced by testing with httpx.AsyncClient (as flagged might be the case in the original caveat). The test client causes background tasks to block that do not block without the test client. As such, there's a simpler solution provided you don't want to test it with httpx.AsyncClient. The new solution uses uvicorn and then I tested this manually with Postman instead.
This solution uses a function as the background task (process) so that it runs outside the main thread. It then schedules a job to run aprocess which will run in the main thread when the event loop gets a chance. The aprocess coroutine is able to then await the run coroutine of your Query as before.
Additionally, I've added a time.sleep(10) to the process function to illustrate that even long running non-IO tasks will not prevent your original HTTP session from sending a response back to the client (although this will only work if it is something that releases the GIL. If it's CPU bound though you might want a separate process altogether by using multiprocessing or a separate service). Finally, I've replaced the prints with logging so that they work along with the uvicorn logging.
import asyncio
import os
import sys
import time
from dataclasses import dataclass
from fastapi import FastAPI, Request, BackgroundTasks
import logging
logging.basicConfig(level=logging.INFO, format="%(levelname)-9s %(asctime)s - %(name)s - %(message)s")
LOGGER = logging.getLogger(__name__)
EXPERIMENTS_BASE_DIR = "/experiments/"
QUERY_BUFFER = {}
app = FastAPI()
loop = asyncio.get_event_loop()
#dataclass
class Query():
query_name: str
query_sequence: str
experiment_id: str = None
status: str = "pending"
def __post_init__(self):
self.experiment_id = str(time.time())
self.experiment_dir = os.path.join(EXPERIMENTS_BASE_DIR, self.experiment_id)
# os.makedirs(self.experiment_dir, exist_ok=False) # Commented out for testing
async def run(self):
self.status = "running"
await asyncio.sleep(5) # simulate long running query
# perform some long task using the query sequence and get a return code #
self.status = "finished"
return 0 # or another code depending on the final output
#app.post("/")
async def root(request: Request, background_tasks: BackgroundTasks):
query_data = await request.body()
query_data = query_data.decode("utf-8")
query_data = dict(str(x).split("=") for x in query_data.split("&"))
query = Query(**query_data)
QUERY_BUFFER[query.experiment_id] = query
background_tasks.add_task(process, query)
LOGGER.info(f'root - added task')
return {"Query created": query, "Query ID": query.experiment_id, "Backlog Length": len(QUERY_BUFFER)}
def process(query):
""" Schedule processing of query, and then run some long running non-IO job without blocking the app"""
asyncio.run_coroutine_threadsafe(aprocess(query), loop)
LOGGER.info(f"process - {query.experiment_id} - Submitted query job. Now run non-IO work for 10 seconds...")
time.sleep(10) # simulate long running non-IO work, does not block app as this is in another thread - provided it is not cpu bound.
LOGGER.info(f'process - {query.experiment_id} - wake up!')
async def aprocess(query):
""" Process query and generate data """
ret_code = await query.run()
del QUERY_BUFFER[query.experiment_id]
LOGGER.info(f'aprocess - Query {query.experiment_id} processing finished with return code {ret_code}.')
#app.get("/backlog/")
def return_backlog():
return {f"return_backlog - Currently {len(QUERY_BUFFER)} jobs in the backlog."}
if __name__ == "__main__":
import uvicorn
uvicorn.run("scratch_26:app", host="127.0.0.1", port=8000)
ORIGINAL ANSWER:
*A caveat on this answer - I've tried testing this with `httpx.AsyncClient`, which might account for different behavior compared to deploying behind guvicorn.*
From what I can tell (and I am very open to correction on this), BackgroundTasks actually need to complete prior to an HTTP response being sent. This is not what the Starlette docs or the FastAPI docs say, but it appears to be the case, at least while using the httpx AsyncClient.
Whether you add a a coroutine (which is executed in the main thread) or a function (which gets executed in it's own side thread) that HTTP response is blocked from being sent until the background task is complete.
If you want to await a long running (asyncio friendly) task, you can get around this problem by using a wrapper function. The wrapper function adds the real task (a coroutine, since it will be using await) to the event loop and then returns. Since this is very fast, the fact that it "blocks" no longer matters (assuming a few milliseconds doesn't matter).
The real task then gets executed in turn (but after the initial HTTP response has been sent), and although it's on the main thread, the asyncio part of the function will not block.
You could try this:
#app.post("/")
async def root(request: Request, background_tasks: BackgroundTasks):
...
background_tasks.add_task(process_wrapper, query)
...
async def process_wrapper(query):
loop = asyncio.get_event_loop()
loop.create_task(process(query))
async def process(query):
""" Process query and generate data"""
ret_code = await query.run()
del QUERY_BUFFER[query.experiment_id]
print(f'Query {query.experiment_id} processing finished with return code {ret_code}.')
Note also that you'll also need to make your run() function a coroutine by adding the async keyword since you're expecting to await it from your process() function.
Here's a full working example that uses httpx.AsyncClient to test it. I've added the fmt_duration helper function to show the lapsed time for illustrative purposes. I've also commented out the code that creates directories, and simulated a 2 second query duration in the run() function.
import asyncio
import os
import sys
import time
from dataclasses import dataclass
from fastapi import FastAPI, Request, BackgroundTasks
from httpx import AsyncClient
EXPERIMENTS_BASE_DIR = "/experiments/"
QUERY_BUFFER = {}
app = FastAPI()
start_ts = time.time()
#dataclass
class Query():
query_name: str
query_sequence: str
experiment_id: str = None
status: str = "pending"
def __post_init__(self):
self.experiment_id = str(time.time())
self.experiment_dir = os.path.join(EXPERIMENTS_BASE_DIR, self.experiment_id)
# os.makedirs(self.experiment_dir, exist_ok=False) # Commented out for testing
async def run(self):
self.status = "running"
await asyncio.sleep(2) # simulate long running query
# perform some long task using the query sequence and get a return code #
self.status = "finished"
return 0 # or another code depending on the final output
#app.post("/")
async def root(request: Request, background_tasks: BackgroundTasks):
query_data = await request.body()
query_data = query_data.decode("utf-8")
query_data = dict(str(x).split("=") for x in query_data.split("&"))
query = Query(**query_data)
QUERY_BUFFER[query.experiment_id] = query
background_tasks.add_task(process_wrapper, query)
print(f'{fmt_duration()} - root - added task')
return {"Query created": query, "Query ID": query.experiment_id, "Backlog Length": len(QUERY_BUFFER)}
async def process_wrapper(query):
loop = asyncio.get_event_loop()
loop.create_task(process(query))
async def process(query):
""" Process query and generate data"""
ret_code = await query.run()
del QUERY_BUFFER[query.experiment_id]
print(f'{fmt_duration()} - process - Query {query.experiment_id} processing finished with return code {ret_code}.')
#app.get("/backlog/")
def return_backlog():
return {f"{fmt_duration()} - return_backlog - Currently {len(QUERY_BUFFER)} jobs in the backlog."}
async def test_me():
async with AsyncClient(app=app, base_url="http://example") as ac:
res = await ac.post("/", content="query_name=foo&query_sequence=42")
print(f"{fmt_duration()} - [{res.status_code}] - {res.content.decode('utf8')}")
res = await ac.post("/", content="query_name=bar&query_sequence=43")
print(f"{fmt_duration()} - [{res.status_code}] - {res.content.decode('utf8')}")
content = ""
while not content.endswith('0 jobs in the backlog."]'):
await asyncio.sleep(1)
backlog_results = await ac.get("/backlog")
content = backlog_results.content.decode("utf8")
print(f"{fmt_duration()} - test_me - content: {content}")
def fmt_duration():
return f"Progress time: {time.time() - start_ts:.3f}s"
loop = asyncio.get_event_loop()
print(f'starting loop...')
loop.run_until_complete(test_me())
duration = time.time() - start_ts
print(f'Finished. Duration: {duration:.3f} seconds.')
in my local environment if I run the above I get this output:
starting loop...
Progress time: 0.005s - root - added task
Progress time: 0.006s - [200] - {"Query created":{"query_name":"foo","query_sequence":"42","experiment_id":"1627489235.9300923","status":"pending","experiment_dir":"/experiments/1627489235.9300923"},"Query ID":"1627489235.9300923","Backlog Length":1}
Progress time: 0.007s - root - added task
Progress time: 0.009s - [200] - {"Query created":{"query_name":"bar","query_sequence":"43","experiment_id":"1627489235.932097","status":"pending","experiment_dir":"/experiments/1627489235.932097"},"Query ID":"1627489235.932097","Backlog Length":2}
Progress time: 1.016s - test_me - content: ["Progress time: 1.015s - return_backlog - Currently 2 jobs in the backlog."]
Progress time: 2.008s - process - Query 1627489235.9300923 processing finished with return code 0.
Progress time: 2.008s - process - Query 1627489235.932097 processing finished with return code 0.
Progress time: 2.041s - test_me - content: ["Progress time: 2.041s - return_backlog - Currently 0 jobs in the backlog."]
Finished. Duration: 2.041 seconds.
I also tried making process_wrapper a function so that Starlette executes it in a new thread. This works the same way, just use run_coroutine_threadsafe instead of create_task i.e.
def process_wrapper(query):
loop = asyncio.get_event_loop()
asyncio.run_coroutine_threadsafe(process(query), loop)
If there is some other way to get a background task to run without blocking the HTTP response I'd love to find out how, but absent that this wrapper solution should work.
I think your issue is in the task you want to run, not in the BackgroundTask itself.
FastAPI (and underlying Starlette, which is responsible for running the background tasks) is created on top of the asyncio and handles all requests asynchronously. That means, if one request is being processed, if there is any IO operation while processing the current request, and that IO operation supports the asynchronous approach, FastAPI will switch to the next request in queue while this IO operation is pending.
Same goes for any background tasks added to the queue. If background task is pending, any requests or other background tasks will be handled only when FastAPI is waiting for any IO operation.
As you may see, this is not ideal when either your view or task doesn't have any IO operations or they cannot be run asynchronously. There is a workaround for that situation:
declare your views or tasks as normal, non asynchronous functions
Starlette will then run those views in a separate thread, outside of the main async loop, so other requests can be handled at the same time
manually run the part of your logic that may block the
processing of other requests using asgiref.sync_to_async
This will also cause this logic to be executed in a separate thread, releasing the main async loop to take care of other requests until the function returns.
If you are not doing any asynchronous IO operations in your long-running task, the first approach will be most suitable for you. Otherwise, you should take any part of your code that is either long-running or performs any non-asynchronous IO operations and wrap it with sync_to_async.

HTTP server kick-off background python script without blocking

I'd like to be able to trigger a long-running python script via a web request, in bare-bones fashion. Also, I'd like to be able to trigger other copies of the script with different parameters while initial copies are still running.
I've looked at flask, aiohttp, and queueing possibilities. Flask and aiohttp seem to have the least overhead to set up. I plan on executing the existing python script via subprocess.run (however, I did consider refactoring the script into libraries that could be used in the web response function).
With aiohttp, I'm trying something like:
ingestion_service.py:
from aiohttp import web
from pprint import pprint
routes = web.RouteTableDef()
#routes.get("/ingest_pipeline")
async def test_ingest_pipeline(request):
'''
Get the job_conf specified from the request and activate the script
'''
#subprocess.run the command with lookup of job conf file
response = web.Response(text=f"Received data ingestion request")
await response.prepare(request)
await response.write_eof()
#eventually this would be subprocess.run call
time.sleep(80)
return response
def init_func(argv):
app = web.Application()
app.add_routes(routes)
return app
But though the initial request returns immediately, subsequent requests block until the initial request is complete. I'm running a server via:
python -m aiohttp.web -H localhost -P 8080 ingestion_service:init_func
I know that multithreading and concurrency may provide better solutions than asyncio. In this case, I'm not looking for a robust solution, just something that will allow me to run multiple scripts at once via http request, ideally with minimal memory costs.
OK, there were a couple of issues with what I was doing. Namely, time.sleep() is blocking, so asyncio.sleep() should be used. However, since I'm interested in spawning a subprocess, I can use asyncio.subprocess to do that in a non-blocking fashion.
nb:
asyncio: run one function threaded with multiple requests from websocket clients
https://docs.python.org/3/library/asyncio-subprocess.html.
Using these help, but there's still an issue with the webhandler terminating the subprocess. Luckily, there's a solution here:
https://docs.aiohttp.org/en/stable/web_advanced.html
aiojobs has a decorator "atomic" that will protect the process until it is complete. So, code along these lines will function:
from aiojobs.aiohttp import setup, atomic
import asyncio
import os
from aiohttp import web
#atomic
async def ingest_pipeline(request):
#be careful what you pass through to shell, lest you
#give away the keys to the kingdom
shell_command = "[your command here]"
response_text = f"running {shell_command}"
response_code = 200
response = web.Response(text=response_text, status=response_code)
await response.prepare(request)
await response.write_eof()
ingestion_process = await asyncio.create_subprocess_shell(shell_command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE)
stdout, stderr = await ingestion_process.communicate()
return response
def init_func(argv):
app = web.Application()
setup(app)
app.router.add_get('/ingest_pipeline', ingest_pipeline)
return app
This is very bare bones, but might help others looking for a quick skeleton for a temporary internal solution.

Make a Python asyncio call from a Flask route

I want to execute an async function every time the Flask route is executed. Why is the abar function never executed?
import asyncio
from flask import Flask
async def abar(a):
print(a)
loop = asyncio.get_event_loop()
app = Flask(__name__)
#app.route("/")
def notify():
asyncio.ensure_future(abar("abar"), loop=loop)
return "OK"
if __name__ == "__main__":
app.run(debug=False, use_reloader=False)
loop.run_forever()
I also tried putting the blocking call in a separate thread. But it is still not calling the abar function.
import asyncio
from threading import Thread
from flask import Flask
async def abar(a):
print(a)
app = Flask(__name__)
def start_worker(loop):
asyncio.set_event_loop(loop)
try:
loop.run_forever()
finally:
loop.close()
worker_loop = asyncio.new_event_loop()
worker = Thread(target=start_worker, args=(worker_loop,))
#app.route("/")
def notify():
asyncio.ensure_future(abar("abar"), loop=worker_loop)
return "OK"
if __name__ == "__main__":
worker.start()
app.run(debug=False, use_reloader=False)
You can incorporate some async functionality into Flask apps without having to completely convert them to asyncio.
import asyncio
from flask import Flask
async def abar(a):
print(a)
loop = asyncio.get_event_loop()
app = Flask(__name__)
#app.route("/")
def notify():
loop.run_until_complete(abar("abar"))
return "OK"
if __name__ == "__main__":
app.run(debug=False, use_reloader=False)
This will block the Flask response until the async function returns, but it still allows you to do some clever things. I've used this pattern to perform many external requests in parallel using aiohttp, and then when they are complete, I'm back into traditional flask for data processing and template rendering.
import aiohttp
import asyncio
import async_timeout
from flask import Flask
loop = asyncio.get_event_loop()
app = Flask(__name__)
async def fetch(url):
async with aiohttp.ClientSession() as session, async_timeout.timeout(10):
async with session.get(url) as response:
return await response.text()
def fight(responses):
return "Why can't we all just get along?"
#app.route("/")
def index():
# perform multiple async requests concurrently
responses = loop.run_until_complete(asyncio.gather(
fetch("https://google.com/"),
fetch("https://bing.com/"),
fetch("https://duckduckgo.com"),
fetch("http://www.dogpile.com"),
))
# do something with the results
return fight(responses)
if __name__ == "__main__":
app.run(debug=False, use_reloader=False)
A simpler solution to your problem (in my biased view) is to switch to Quart from Flask. If so your snippet simplifies to,
import asyncio
from quart import Quart
async def abar(a):
print(a)
app = Quart(__name__)
#app.route("/")
async def notify():
await abar("abar")
return "OK"
if __name__ == "__main__":
app.run(debug=False)
As noted in the other answers the Flask app run is blocking, and does not interact with an asyncio loop. Quart on the other hand is the Flask API built on asyncio, so it should work how you expect.
Also as an update, Flask-Aiohttp is no longer maintained.
Your mistake is to try to run the asyncio event loop after calling app.run(). The latter doesn't return, it instead runs the Flask development server.
In fact, that's how most WSGI setups will work; either the main thread is going to busy dispatching requests, or the Flask server is imported as a module in a WSGI server, and you can't start an event loop here either.
You'll instead have to run your asyncio event loop in a separate thread, then run your coroutines in that separate thread via asyncio.run_coroutine_threadsafe(). See the Coroutines and Multithreading section in the documentation for what this entails.
Here is an implementation of a module that will run such an event loop thread, and gives you the utilities to schedule coroutines to be run in that loop:
import asyncio
import itertools
import threading
__all__ = ["EventLoopThread", "get_event_loop", "stop_event_loop", "run_coroutine"]
class EventLoopThread(threading.Thread):
loop = None
_count = itertools.count(0)
def __init__(self):
self.started = threading.Event()
name = f"{type(self).__name__}-{next(self._count)}"
super().__init__(name=name, daemon=True)
def __repr__(self):
loop, r, c, d = self.loop, False, True, False
if loop is not None:
r, c, d = loop.is_running(), loop.is_closed(), loop.get_debug()
return (
f"<{type(self).__name__} {self.name} id={self.ident} "
f"running={r} closed={c} debug={d}>"
)
def run(self):
self.loop = loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.call_later(0, self.started.set)
try:
loop.run_forever()
finally:
try:
shutdown_asyncgens = loop.shutdown_asyncgens()
except AttributeError:
pass
else:
loop.run_until_complete(shutdown_asyncgens)
try:
shutdown_executor = loop.shutdown_default_executor()
except AttributeError:
pass
else:
loop.run_until_complete(shutdown_executor)
asyncio.set_event_loop(None)
loop.close()
def stop(self):
loop, self.loop = self.loop, None
if loop is None:
return
loop.call_soon_threadsafe(loop.stop)
self.join()
_lock = threading.Lock()
_loop_thread = None
def get_event_loop():
global _loop_thread
if _loop_thread is None:
with _lock:
if _loop_thread is None:
_loop_thread = EventLoopThread()
_loop_thread.start()
# give the thread up to a second to produce a loop
_loop_thread.started.wait(1)
return _loop_thread.loop
def stop_event_loop():
global _loop_thread
with _lock:
if _loop_thread is not None:
_loop_thread.stop()
_loop_thread = None
def run_coroutine(coro):
"""Run the coroutine in the event loop running in a separate thread
Returns a Future, call Future.result() to get the output
"""
return asyncio.run_coroutine_threadsafe(coro, get_event_loop())
You can use the run_coroutine() function defined here to schedule asyncio routines. Use the returned Future instance to control the coroutine:
Get the result with Future.result(). You can give this a timeout; if no result is produced within the timeout, the coroutine is automatically cancelled.
You can query the state of the coroutine with the .cancelled(), .running() and .done() methods.
You can add callbacks to the future, which will be called when the coroutine has completed, or is cancelled or raised an exception (take into account that this is probably going to be called from the event loop thread, not the thread that you called run_coroutine() in).
For your specific example, where abar() doesn't return any result, you can just ignore the returned future, like this:
#app.route("/")
def notify():
run_coroutine(abar("abar"))
return "OK"
Note that before Python 3.8 that you can't use an event loop running on a separate thread to create subprocesses with! See my answer to Python3 Flask asyncio subprocess in route hangs for backport of the Python 3.8 ThreadedChildWatcher class for a work-around for this.
For same reason you won't see this print:
if __name__ == "__main__":
app.run(debug=False, use_reloader=False)
print('Hey!')
loop.run_forever()
loop.run_forever() is never called since as #dirn already noted app.run is also blocking.
Running global blocking event loop - is only way you can run asyncio coroutines and tasks, but it's not compatible with running blocking Flask app (or with any other such thing in general).
If you want to use asynchronous web framework you should choose one created to be asynchronous. For example, probably most popular now is aiohttp:
from aiohttp import web
async def hello(request):
return web.Response(text="Hello, world")
if __name__ == "__main__":
app = web.Application()
app.router.add_get('/', hello)
web.run_app(app) # this runs asyncio event loop inside
Upd:
About your try to run event loop in background thread. I didn't investigate much, but it seems problem somehow related with tread-safety: many asyncio objects are not thread-safe. If you change your code this way, it'll work:
def _create_task():
asyncio.ensure_future(abar("abar"), loop=worker_loop)
#app.route("/")
def notify():
worker_loop.call_soon_threadsafe(_create_task)
return "OK"
But again, this is very bad idea. It's not only very inconvenient, but I guess wouldn't make much sense: if you're going to use thread to start asyncio, why don't just use threads in Flask instead of asyncio? You will have Flask you want and parallelization.
If I still didn't convince you, at least take a look at Flask-aiohttp project. It has close to Flask api and I think still better that what you're trying to do.
The main issue, as already explained in the other answers by #Martijn Pieters and #Mikhail Gerasimov is that app.run is blocking, so the line loop.run_forever() is never called. You will need to manually set up and maintain a run loop on a separate thread.
Fortunately, with Flask 2.0, you don't need to create, run, and manage your own event loop anymore. You can define your route as async def and directly await on coroutines from your route functions.
https://flask.palletsprojects.com/en/2.0.x/async-await/
Using async and await
New in version 2.0.
Routes, error handlers, before request, after request, and teardown
functions can all be coroutine functions if Flask is installed with
the async extra (pip install flask[async]). It requires Python 3.7+
where contextvars.ContextVar is available. This allows views to be
defined with async def and use await.
Flask will take care of creating the event loop on each request. All you have to do is define your coroutines and await on them to finish:
https://flask.palletsprojects.com/en/2.0.x/async-await/#performance
Performance
Async functions require an event loop to run. Flask, as a WSGI
application, uses one worker to handle one request/response cycle.
When a request comes into an async view, Flask will start an event
loop in a thread, run the view function there, then return the result.
Each request still ties up one worker, even for async views. The
upside is that you can run async code within a view, for example to
make multiple concurrent database queries, HTTP requests to an
external API, etc. However, the number of requests your application
can handle at one time will remain the same.
Tweaking the original example from the question:
import asyncio
from flask import Flask, jsonify
async def send_notif(x: int):
print(f"Called coro with {x}")
await asyncio.sleep(1)
return {"x": x}
app = Flask(__name__)
#app.route("/")
async def notify():
futures = [send_notif(x) for x in range(5)]
results = await asyncio.gather(*futures)
response = list(results)
return jsonify(response)
# The recommended way now is to use `flask run`.
# See: https://flask.palletsprojects.com/en/2.0.x/cli/
# if __name__ == "__main__":
# app.run(debug=False, use_reloader=False)
$ time curl -s -XGET 'http://localhost:5000'
[{"x":0},{"x":1},{"x":2},{"x":3},{"x":4}]
real 0m1.016s
user 0m0.005s
sys 0m0.006s
Most common recipes using asyncio can be applied the same way. The one thing to take note of is, as of Flask 2.0.1, we cannot use asyncio.create_task to spawn background tasks:
https://flask.palletsprojects.com/en/2.0.x/async-await/#background-tasks
Async functions will run in an event loop until they complete, at which
stage the event loop will stop. This means any additional spawned
tasks that haven’t completed when the async function completes will be
cancelled. Therefore you cannot spawn background tasks, for example
via asyncio.create_task.
If you wish to use background tasks it is best to use a task queue to
trigger background work, rather than spawn tasks in a view function.
Other than the limitation with create_task, it should work for use-cases where you want to make async database queries or multiple calls to external APIs.

Handling async requests using multiple APIs - Python Flask Redis

I'm working on an application that will have to consult multiple APIs for information and after processing the data, will output the answer to a client. The client uses a browser to connect to a web server to forward the request, afterwards, the web server will look for the information needed from the multiple APIs and after joining the responses from those APIs will then give an answer to the client.
The web server was built using Flask and a module that extracts the needed information for each API was also implemented (Python). Since the consulting process for each API takes time, I would like to give the web server a timeout for responding, therefore, after the requests are sent only those that are below the time buffer will be used.
My proposed solution:
Use a Redis Queue and an RQ worker to enqueue the requests for each API and store the responses on the Queue then wait for the timeout and collect the responses that were able to respond in the allowed time. Afterwards, process the information and give the response to the user.
The flask web server is setup something like this:
#app.route('/result',methods=["POST"])
def show_result():
inputText = request.form["question"]
tweetModule = Twitter()
tweeterResponse = tweetModule.ask(params=inputText)
redditObject = RedditModule()
redditResponse = redditObject.ask(params=inputText)
edmunds = Edmunds()
edmundsJson = edmunds.ask(params=inputText)
# More APIs could be consulted here
# Send each request async and the synchronize the responses from the queue
template = env.get_template('templates/result.html')
return render_template(template,resp=resp)
The worker:
conn = redis.from_url(redis_url)
if __name__ == '__main__':
with Connection(conn):
worker = Worker(map(Queue, listen))
worker.work()
And lets assume each Module handles its own queueing process.
I can see some problems ahead:
What happens to the information stored on the queue that did not make it to the timeout?
How can I make Flask wait and then extract the responses from the Queue?
Is it possible that information could get mixed if two clients ask in the same time-frame?
Is there a better way to handle the async requests and then synchronize the response?
Thanks!
In such cases I prefer a combination of HTTPX and flask[async]
First - HTTPX
HTTPX offers a standard synchronous API by default, but also gives you the option of an async client if you need it.
Async is a concurrency model that is far more efficient than multi-threading, and can provide significant performance benefits and enable the use of long-lived network connections such as WebSockets.
If you're working with an async web framework then you'll also want to use an async client for sending outgoing HTTP requests.
>>> async with httpx.AsyncClient() as client:
... r = await client.get('https://www.example.com/')
...
>>> r
<Response [200 OK]>
Second - Using async and await in a flask
Routes, error handlers, before request, after request, and teardown functions can all be coroutine functions if Flask is installed with the async extra (pip install flask[async]). It requires Python 3.7+ where contextvars.ContextVar is available. This allows views to be defined with async def and use await.
For example, you should do something like this:
import asyncio
import httpx
from flask import Flask, render_template, request
app = Flask(__name__)
#app.route('/async', methods=['GET', 'POST'])
async def async_form():
if request.method == 'POST':
...
async with httpx.AsyncClient() as client:
tweeterResponse, redditResponse, edmundsJson = await asyncio.gather(
client.get(f'https://api.tweeter....../id?id={request.form["tweeter_id"]}', timeout=None),
client.get(f'https://api.redditResponse.....?key={APIKEY}&reddit={request.form["reddit_id"]}'),
client.post(f'https://api.edmundsJson.......', data=inputText)
)
...
resp = {
"tweeter_response" : tweeterResponse,
"reddit_response": redditResponse,
"edmunds_json" : edmundsJson
}
template = env.get_template('templates/result.html')
return render_template(template, resp=resp)

Categories