python tornado async client - python

I created a batch delayed HTTP (async) client that triggers multiple async HTTP requests and, most importantly, lets me delay the start of each request so that, for example, 100 requests are not triggered at once.
But it has an issue. The .fetch() method takes a handleMethod callback which handles the response, but I found that if the delay (sleep) after the fetch isn't long enough, the handle method is never triggered (maybe the request is killed in the meantime).
It is probably related to the .run_sync method. How can I fix that? I want to keep the delays but don't want this issue to happen.
I need to parse the response regardless of how long the request takes and regardless of the following sleep call (that call has another purpose, as I said, and should not be related to response handling at all).
from tornado import gen, httpclient, ioloop

class BatchDelayedHttpClient:
    def __init__(self, requestList):
        # class members
        self.httpClient = httpclient.AsyncHTTPClient()
        self.requestList = requestList
        ioloop.IOLoop.current().run_sync(self.execute)

    @gen.coroutine
    def execute(self):
        print("exec start")
        for request in self.requestList:
            print("requesting " + request["url"])
            self.httpClient.fetch(request["url"], request["handleMethod"], method=request["method"], headers=request["headers"], body=request["body"])
            yield gen.sleep(request["sleep"])
        print("exec end")
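A likely explanation, offered as an assumption since no answer is recorded here: run_sync stops the loop as soon as execute returns, so any fetch still in flight at that moment never reaches its handler. One way around it is to keep a handle to every started request and wait for all of them before returning. The sketch below shows that pattern with plain asyncio; fake_fetch and its latency field are hypothetical stand-ins for the real HTTP call, not Tornado APIs.

```python
import asyncio


async def fake_fetch(url, latency):
    """Hypothetical stand-in for an HTTP request; sleeps instead of using the network."""
    await asyncio.sleep(latency)
    return "response from " + url


async def run_batch(requests):
    tasks = []
    for req in requests:
        # Start the request now and keep the task so it can be awaited later.
        tasks.append(asyncio.create_task(fake_fetch(req["url"], req["latency"])))
        # Stagger the next start; this delay is independent of request duration.
        await asyncio.sleep(req["sleep"])
    # Do not return (and let the loop stop) until every request has finished.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    batch = [
        {"url": "http://example.invalid/a", "latency": 0.2, "sleep": 0.01},
        {"url": "http://example.invalid/b", "latency": 0.1, "sleep": 0.01},
    ]
    print(asyncio.run(run_batch(batch)))
```

With Tornado the same idea should apply: fetch returns a future, so the futures can be collected in the loop and yielded together at the end of execute.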

Related

How to stop execution of FastAPI endpoint after a specified time to reduce CPU resource usage/cost?

Use case
The client micro service, which calls /do_something, has a timeout of 60 seconds in the request/post() call. This timeout is fixed and can't be changed. So if /do_something takes 10 mins, /do_something is wasting CPU resources since the client micro service is NOT waiting after 60 seconds for the response from /do_something, which wastes CPU for 10 mins and this increases the cost. We have limited budget.
The current code looks like this:
import time
from uvicorn import Server, Config
from random import randrange
from fastapi import FastAPI

app = FastAPI()

def some_func(text):
    """
    Some computationally heavy function
    whose execution time depends on input text size
    """
    randinteger = randrange(1, 120)
    time.sleep(randinteger)  # simulate processing of text
    return text

@app.get("/do_something")
async def do_something():
    response = some_func(text="hello world")
    return {"response": response}

# Running
if __name__ == '__main__':
    server = Server(Config(app=app, host='0.0.0.0', port=3001))
    server.run()
Desired Solution
Here /do_something should stop the processing of the current request to endpoint after 60 seconds and wait for next request to process.
If execution of the end point is force stopped after 60 seconds we should be able to log it with custom message.
This should not kill the service, and it should work with multithreading/multiprocessing.
I tried the following, but when the timeout happens the server gets killed.
Is there any way to fix this?
import logging
import time
import timeout_decorator
from uvicorn import Server, Config
from random import randrange
from fastapi import FastAPI

app = FastAPI()

@timeout_decorator.timeout(seconds=2, timeout_exception=StopIteration, use_signals=False)
def some_func(text):
    """
    Some computationally heavy function
    whose execution time depends on input text size
    """
    randinteger = randrange(1, 30)
    time.sleep(randinteger)  # simulate processing of text
    return text

@app.get("/do_something")
async def do_something():
    try:
        response = some_func(text="hello world")
    except StopIteration:
        logging.warning(f'Stopped /do_something > endpoint due to timeout!')
    else:
        logging.info(f'( Completed < /do_something > endpoint')
    return {"response": response}

# Running
if __name__ == '__main__':
    server = Server(Config(app=app, host='0.0.0.0', port=3001))
    server.run()
This answer is not about improving CPU time—as you mentioned in the comments section—but rather explains what would happen, if you defined an endpoint with normal def or async def, as well as provides solutions when you run blocking operations inside an endpoint.
You are asking how to stop the processing of a request after a while, in order to process further requests. It does not really make sense to start processing a request and then (60 seconds later) stop it as if it never happened (wasting server resources all that time and keeping other requests waiting). You should instead leave the handling of requests to the FastAPI framework itself. When you define an endpoint with async def, it is run on the main thread (in the event loop), i.e., the server processes the requests sequentially, as long as there is no await call inside the endpoint (just like in your case). The keyword await passes function control back to the event loop. In other words, it suspends the execution of the surrounding coroutine and tells the event loop to let something else run until the awaited task completes (and has returned the result data). The await keyword only works within an async function.
Since you perform a heavy CPU-bound operation inside your async def endpoint (by calling your some_func() function), and you never give up control for other requests to run in the event loop (e.g., by awaiting some coroutine), the server will be blocked and will wait for that request to be fully processed before moving on to the next one(s); have a look at this answer for more details.
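The blocking behaviour described above can be seen without FastAPI at all. The following is a minimal asyncio-only sketch; blocking_work, ticker and measure are names made up for this illustration. A background coroutine records a tick every 50 ms while a 0.5 s blocking call runs, either directly in the coroutine or handed to a worker thread.

```python
import asyncio
import time


def blocking_work(seconds):
    """Stand-in for a synchronous, long-running function."""
    time.sleep(seconds)
    return "done"


async def ticker(ticks, stop):
    # Records a timestamp every 50 ms for as long as the loop lets it run.
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)


async def measure(offload):
    ticks = []
    stop = asyncio.Event()
    task = asyncio.create_task(ticker(ticks, stop))
    await asyncio.sleep(0)  # give the ticker a chance to start
    loop = asyncio.get_running_loop()
    if offload:
        # Runs in a worker thread; the event loop stays free meanwhile.
        await loop.run_in_executor(None, blocking_work, 0.5)
    else:
        # Runs inside the coroutine; nothing else can be scheduled meanwhile.
        blocking_work(0.5)
    stop.set()
    await task
    return len(ticks)


if __name__ == "__main__":
    print("ticks while blocking the loop:", asyncio.run(measure(offload=False)))
    print("ticks while offloading:", asyncio.run(measure(offload=True)))
```

The direct call should leave the ticker with a single tick, while the offloaded call lets it tick throughout, which is the difference between an endpoint that stalls the server and one that does not.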
Solutions
One solution would be to define your endpoint with normal def instead of async def. In brief, when you declare an endpoint with normal def instead of async def in FastAPI, it is run in an external threadpool that is then awaited, instead of being called directly (as it would block the server); hence, FastAPI would still work asynchronously.
Another solution, as described in this answer, is to keep the async def definition and run the CPU-bound operation in a separate thread and await it, using Starlette's run_in_threadpool(), thus ensuring that the main thread (event loop), where coroutines are run, does not get blocked. As described by @tiangolo here, "run_in_threadpool is an awaitable function, the first parameter is a normal function, the next parameters are passed to that function directly. It supports sequence arguments and keyword arguments". Example:
from fastapi.concurrency import run_in_threadpool
res = await run_in_threadpool(cpu_bound_task, text='Hello world')
Since this is about a CPU-bound operation, it would be preferable to run it in a separate process, using ProcessPoolExecutor, as described in the link provided above. In this case, this could be integrated with asyncio, in order to await the process to finish its work and return the result(s). Note that, as described in the link above, it is important to protect the main loop of code to avoid recursive spawning of subprocesses, etc—essentially, your code must be under if __name__ == '__main__'. Example:
import concurrent.futures
from functools import partial
import asyncio

loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
About Request Timeout
With regards to the recent update on your question about the client having a fixed 60s request timeout; if you are not behind a proxy such as Nginx that would allow you to set the request timeout, and/or you are not using gunicorn, which would also allow you to adjust the request timeout, you could use a middleware, as suggested here, to set a timeout for all incoming requests. The suggested middleware (example is given below) uses asyncio's .wait_for() function, which waits for an awaitable function/coroutine to complete with a timeout. If a timeout occurs, it cancels the task and raises asyncio.TimeoutError.
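To make the wait_for() behaviour concrete, here is a small standalone sketch outside of any web framework. slow_handler and call_with_timeout are hypothetical names, and the coroutine merely sleeps, so this only illustrates the cancel-and-raise mechanics, not a CPU-bound workload.

```python
import asyncio


async def slow_handler(log):
    """Stand-in for a request handler that takes about one second."""
    try:
        await asyncio.sleep(1.0)
        return "finished"
    except asyncio.CancelledError:
        # wait_for() cancels the awaited coroutine when the timeout expires.
        log.append("cancelled")
        raise


async def call_with_timeout(timeout, log):
    try:
        return await asyncio.wait_for(slow_handler(log), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"


if __name__ == "__main__":
    log = []
    print(asyncio.run(call_with_timeout(0.05, log)), log)
    print(asyncio.run(call_with_timeout(2.0, [])))
```

Note that cancellation is delivered at an await point, so a coroutine that never awaits (or work already running in a thread or process) is not interrupted by this mechanism.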
Regarding your comment below:
My requirement is not unblocking next request...
Again, please read the first part of this answer carefully to understand that if you define your endpoint with async def and do not await some coroutine inside it, but instead perform some CPU-bound task (as you already do), it will block the server until it is completed (and even the approach below won't work as expected). That's like saying that you would like FastAPI to process one request at a time; in that case, there is no reason to use an ASGI framework such as FastAPI, which takes advantage of the async/await syntax (i.e., processing requests asynchronously) in order to provide fast performance. Hence, you either need to drop the async definition from your endpoint (as mentioned earlier), or, preferably, run your synchronous CPU-bound task using ProcessPoolExecutor, as described earlier.
Also, your comment in some_func():
Some computationally heavy function whose execution time depends on
input text size
indicates that instead of (or along with) setting a request timeout, you could check the length of the input text (using a dependency function, for instance) and raise an HTTPException if the text's length exceeds some pre-defined value that is known beforehand to require more than 60s to process. That way, your system won't waste resources trying to perform a task which you already know will not be completed.
Working Example
import time
import uvicorn
import asyncio
import concurrent.futures
from functools import partial
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from starlette.status import HTTP_504_GATEWAY_TIMEOUT
from fastapi.concurrency import run_in_threadpool

REQUEST_TIMEOUT = 2  # adjust timeout as desired
app = FastAPI()

@app.middleware('http')
async def timeout_middleware(request: Request, call_next):
    try:
        return await asyncio.wait_for(call_next(request), timeout=REQUEST_TIMEOUT)
    except asyncio.TimeoutError:
        return JSONResponse({'detail': 'Request exceeded the time limit for processing'},
                            status_code=HTTP_504_GATEWAY_TIMEOUT)

def cpu_bound_task(text):
    time.sleep(5)
    return text

@app.get('/')
async def main():
    loop = asyncio.get_running_loop()
    with concurrent.futures.ProcessPoolExecutor() as pool:
        res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
    return {'response': res}

if __name__ == '__main__':
    uvicorn.run(app)

Call to async endpoint gets blocked by another thread

I have a tornado webservice which is going to serve something around 500 requests per minute. All these requests are going to hit 1 specific endpoint. There is a C++ program that I have compiled using Cython and use it inside the tornado service as my processor engine. Each request that goes to /check/ will trigger a function call in the C++ program (I will call it handler) and the return value will get sent to user as response.
This is how I wrap the handler class. One important point is that I do not instantiate the handler in __init__. There is another route in my tornado code (e.g. /reload/), and I want to start loading the DataStructure only after an authorized request hits that route.
executors = ThreadPoolExecutor(max_workers=4)

class CheckerInstance(object):
    def __init__(self, *args, **kwargs):
        self.handler = None
        self.is_loading = False
        self.is_live = False

    def init(self):
        if not self.handler:
            self.handler = pDataStructureHandler()
            self.handler.add_words_from_file(self.data_file_name)
            self.end_loading()
            self.go_live()

    def renew(self):
        self.handler = None
        self.init()

class CheckHandler(tornado.web.RequestHandler):
    async def get(self):
        query = self.get_argument("q", None).encode('utf-8')
        answer = query
        if not checker_instance.is_live:
            self.write(dict(answer=self.get_argument("q", None), confidence=100))
            return
        checker_response = await checker_instance.get_response(query)
        answer = checker_response[0]
        confidence = checker_response[1]
        if self.request.connection.stream.closed():
            return
        self.write(dict(correct=answer, confidence=confidence, is_cache=is_cache))

    def on_connection_close(self):
        self.wait_future.cancel()

class InstanceReloadHandler(BasicAuthMixin, tornado.web.RequestHandler):
    def prepare(self):
        self.get_authenticated_user(check_credentials_func=credentials.get, realm='Protected')

    def new_file_exists(self):
        return True

    def can_reload(self):
        return not checker_instance.is_loading

    def get(self):
        error = False
        message = None
        if not self.can_reload():
            error = True
            message = 'another job is being processed!'
        else:
            if not self.new_file_exists():
                error = True
                message = 'no new file found!'
            else:
                checker_instance.go_fake()
                checker_instance.start_loading()
                tornado.ioloop.IOLoop.current().run_in_executor(executors, checker_instance.renew)
                message = 'job started!'
        if self.request.connection.stream.closed():
            return
        self.write(dict(
            success=not error, message=message
        ))

    def on_connection_close(self):
        self.wait_future.cancel()

def main():
    app = tornado.web.Application(
        [
            (r"/", MainHandler),
            (r"/check", CheckHandler),
            (r"/reload", InstanceReloadHandler),
            (r"/health", HealthHandler),
            (r"/log-event", SubmitLogHandler),
        ],
        debug=options.debug,
    )

checker_instance = CheckerInstance()
I want this service to keep responding after checker_instance.renew starts running in another thread. But this is not what happens. When I hit the /reload/ endpoint and renew function starts working, any request to /check/ halts and waits for the reloading process to finish and then it starts working again. When the DataStructure is being loaded, the service should be in fake mode and respond to people with the same query that they send as input.
I have tested this code in my development environment with an i5 CPU (4 CPU cores) and it works just fine! But in the production environment (3 double-thread CPU cores) the /check/ endpoint halts requests.
It is difficult to fully trace the events being handled because you have clipped out some of the code for brevity. For instance, I don't see a get_response implementation here so I don't know if it is awaiting something itself that could be dependent on the state of checker_instance.
One area I would explore is the thread safety (or seeming absence of it) in passing checker_instance.renew to run_in_executor. This feels questionable to me because you are mutating the state of a single instance of CheckerInstance from a separate thread. While it might not break things explicitly, it does seem like this could be introducing odd race conditions or unanticipated copies of memory that might explain the unexpected behavior you are experiencing.
If possible, I would make whatever load behavior you want to offload to a thread completely self-contained, and when the data is loaded, return it as the function result, which can then be fed back into your checker_instance. If you were to do this with the code as-is, you would want to await the run_in_executor call for its result and then update the checker_instance. This would mean the reload GET request would wait until the data was loaded. Alternatively, in your reload GET request, you could use ioloop.spawn_callback to call a function that triggers the run_in_executor in this manner, allowing the reload request to complete instead of waiting.
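The suggestion above can be sketched roughly as follows. This is an asyncio-only illustration under stated assumptions: load_data, Checker and reload are hypothetical names, a set of words stands in for the real C++ data structure, and asyncio.create_task plays the role that spawn_callback would play in Tornado.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)


def load_data(words):
    """Self-contained loader: builds and returns a new object, touches no shared state."""
    time.sleep(0.2)  # simulate a slow load
    return set(words)


class Checker:
    def __init__(self):
        self.handler = None  # only ever assigned on the event-loop thread
        self.is_loading = False

    def respond(self, query):
        if self.handler is None:
            return query, "fake"  # fake mode: echo the query back
        return query, "live" if query in self.handler else "unknown"


async def reload(checker, words):
    checker.is_loading = True
    checker.handler = None  # go fake while the new data is loading
    loop = asyncio.get_running_loop()
    try:
        # The worker thread only computes a result; the swap happens here.
        checker.handler = await loop.run_in_executor(executor, load_data, words)
    finally:
        checker.is_loading = False


async def demo():
    checker = Checker()
    job = asyncio.create_task(reload(checker, ["hello"]))  # fire and forget
    await asyncio.sleep(0.05)
    during = checker.respond("hello")  # answered while loading is in progress
    await job
    after = checker.respond("hello")
    return during, after


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the design is that the worker thread never mutates checker; all state changes happen on the event-loop thread once the result comes back.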

Can I asynchronously duplicate a webapp2.RequestHandler Request to a different url?

For a percentage of production traffic, I want to duplicate the received request to a different version of my application. This needs to happen asynchronously so I don't double service time to the client.
The reason for doing this is so I can compare the responses generated by the prod version and a production candidate version. If their results are appropriately similar, I can be confident that the new version hasn't broken anything. (If I've made a functional change to the application, I'd filter out the necessary part of the response from this comparison.)
So I'm looking for an equivalent to:
class Foo(webapp2.RequestHandler):
    def post(self):
        handle = make_async_call_to('http://other_service_endpoint.com/', self.request)
        # process the user's request in the usual way
        test_response = handle.get_response()
        # compare the locally-prepared response and the remote one, and log
        # the diffs
        # return the locally-prepared response to the caller
UPDATE
google.appengine.api.urlfetch was suggested as a potential solution to my problem, but in the dev_appserver it is synchronous (the request doesn't go out until get_result() is called, and it blocks), though it behaves the way I wanted in production:
start_time = time.time()
rpcs = []
print 'creating rpcs:'
for _ in xrange(3):
    rpcs.append(urlfetch.create_rpc())
    print time.time() - start_time
print 'making fetch calls:'
for rpc in rpcs:
    urlfetch.make_fetch_call(rpc, 'http://httpbin.org/delay/3')
    print time.time() - start_time
print 'getting results:'
for rpc in rpcs:
    rpc.get_result()
    print time.time() - start_time
creating rpcs:
9.51290130615e-05
0.000154972076416
0.000189065933228
making fetch calls:
0.00029993057251
0.000356912612915
0.000473976135254
getting results:
3.15417003632
6.31326603889
9.46627306938
UPDATE2
So, after playing with some other options, I found a way to make completely non-blocking requests:
start_time = time.time()
rpcs = []
logging.info('creating rpcs:')
for i in xrange(10):
    rpc = urlfetch.create_rpc(deadline=30.0)
    url = 'http://httpbin.org/delay/{}'.format(i)
    urlfetch.make_fetch_call(rpc, url)
    rpc.callback = create_callback(rpc, url)
    rpcs.append(rpc)
    logging.info(time.time() - start_time)
logging.info('getting results:')
while rpcs:
    rpc = apiproxy_stub_map.UserRPC.wait_any(rpcs)
    rpcs.remove(rpc)
    logging.info(time.time() - start_time)
...but the important point to note is that none of the async fetch options in urllib work in the dev_appserver. Having discovered this, I went back to try @DanCornilescu's solution and found that it only works properly in production, not in the dev_appserver.
The URL Fetch service supports asynchronous requests. From Issuing an asynchronous request:
HTTP(S) requests are synchronous by default. To issue an asynchronous request, your application must:
1. Create a new RPC object using urlfetch.create_rpc(). This object represents your asynchronous call in subsequent method calls.
2. Call urlfetch.make_fetch_call() to make the request. This method takes your RPC object and the request target's URL as parameters.
3. Call the RPC object's get_result() method. This method returns the result object if the request is successful, and raises an exception if an error occurred during the request.
The following snippets demonstrate how to make a basic asynchronous request from a Python application. First, import the urlfetch library from the App Engine SDK:
from google.appengine.api import urlfetch
Next, use urlfetch to make the asynchronous request:
rpc = urlfetch.create_rpc()
urlfetch.make_fetch_call(rpc, "http://www.google.com/")
# ... do other things ...
try:
    result = rpc.get_result()
    if result.status_code == 200:
        text = result.content
        self.response.write(text)
    else:
        self.response.status_code = result.status_code
        logging.error("Error making RPC request")
except urlfetch.DownloadError:
    logging.error("Error fetching URL0")
Note: As per Sniggerfardimungus's experiment mentioned in the question's update the async calls might not work as expected on the development server - being serialized instead of concurrent, but they do so when deployed on GAE. Personally I didn't use the async calls yet, so I can't really say.
If the intent is to not block at all waiting for the response from the production candidate app, you could push a copy of the original request and the production-prepared response onto a task queue and then answer the original request with negligible delay (that of enqueueing the task).
The handler for the respective task queue would, outside of the original request's critical path, make the request to the staging app using the copy of the original request (async or not, doesn't really matter from the point of view of impacting the production app's response time), get its response and compare it with the production-prepared response, log the deltas, etc. This can be nicely wrapped in a separate module for minimal changes to the production app and deployed/deleted as needed.

tornado web http request blocks other requests, how to not block other requests

import tornado.web
import Queue

QUEUE = Queue.Queue()

class HandlerA( tornado.web.RequestHandler ):
    def get(self):
        global QUEUE
        self.finish(QUEUE.get_nowait())

class HandlerB( tornado.web.RequestHandler ):
    def get(self):
        global QUEUE
        QUEUE.put('Hello')
        self.finish('In queue.')
Problem: HandlerA blocks HandlerB for 10 seconds.
Browser A handled by HandlerA and waits...
Browser B handled by HandlerB and waits.... till timeout exceptions
Goal
Browser A handled by HandlerA and waits...
Browser B handled by HandlerB and returns
HandlerA returns after dequeuing
Is this an issue with Non-blocking, async, epoll or sockets?
Thanks!
UPDATE:
I updated this code with a new thread to handle the Queue.get_nowait() request, which I fear is a HORRIBLE solution considering I'm going to have thousands of requests at once and would therefore have thousands of threads at once. I'm considering moving to an epoll style in the near future.
class HandlerA( tornado.web.RequestHandler ):
    @tornado.web.asynchronous
    def get(self):
        # start_new_thread requires an argument tuple, even an empty one
        thread.start_new_thread(self.get_next, ())

    def get_next(self):
        global QUEUE
        self.finish(QUEUE.get_nowait())
Now this is not the best way to handle it... but at least it's a start.
SOLUTION
Found here: Running blocking code in Tornado
This is Python, so time.sleep will always block the flow! In order to call an action after 10 seconds with Tornado, you need to use the tornado.ioloop.IOLoop.add_timeout method and pass a callback as a parameter. See the docs for more information.

Executing an asynchronous test in Python

I need to do the following test:
send a GET request to a server (http://remote/...)
wait for the server to send a POST request in response (http://local/...)
parse the POST data and do some assertions
Selenium does not fit this case: it can't listen to connections, and I can send a GET without Selenium as well.
So I wrote a unit test:
class MobiMoneyTestCase(TestCase):
    def test_can_send_response(self):
        resp = requests.post('http://url/api/', data={'callback': 'http://localhost:8000'})

class Handler(SimpleHTTPRequestHandler):
    def do_GET(self):
        assert self.path == '...'

httpd = SocketServer.ThreadingTCPServer(('localhost', 8000), Handler)
The test has to wait 5 seconds for the POST request and then fail if nothing happened. How can I merge these items in the test? If I put sleep(5) in the test_can..., the httpd handler does not reply until the countdown ends.
Basically, you want to time out a process if it takes too long? You should check out the signal module in that case.
There is a neat implementation (with a decorator) here: Timeout function if it takes too long to finish
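A different technique from the signal-based timeout, shown here only as a hedged alternative: run the callback listener in a background thread and wait on a threading.Event with a timeout. The sketch is Python 3 and standard library only; wait_for_callback and fake_remote are hypothetical helpers, and fake_remote merely imitates the remote server posting back.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


def wait_for_callback(trigger, timeout=5.0):
    """Start a local listener, call trigger(callback_url), wait for one POST."""
    received = {}
    arrived = threading.Event()

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            received["path"] = self.path
            received["body"] = self.rfile.read(length)
            self.send_response(200)
            self.end_headers()
            arrived.set()

        def log_message(self, *args):
            pass  # keep test output quiet

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        trigger("http://127.0.0.1:%d/callback" % server.server_port)
        if not arrived.wait(timeout):
            raise AssertionError("no callback within %s seconds" % timeout)
        return received
    finally:
        server.shutdown()
        server.server_close()


def fake_remote(callback_url):
    """Hypothetical stand-in for the remote server: POSTs straight back."""
    parts = urlsplit(callback_url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=5)
    conn.request("POST", parts.path, body=b"status=ok")
    conn.getresponse().read()
    conn.close()


if __name__ == "__main__":
    print(wait_for_callback(fake_remote))
```

In a real test, trigger would send the initial request to the remote service with the callback URL, and the assertions would run on the returned path and body.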
