Can you have an async handler in Lambda Python 3.6?

I've made Lambda functions before, but not in Python. I know that in JavaScript Lambda supports the handler function being asynchronous, but I get an error if I try it in Python.
Here is the code I am trying to test:
async def handler(event, context):
    print(str(event))
    return {
        'message': 'OK'
    }
And this is the error I get:
An error occurred during JSON serialization of response: <coroutine object handler at 0x7f63a2d20308> is not JSON serializable
Traceback (most recent call last):
  File "/var/lang/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/var/lang/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/var/lang/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/var/runtime/awslambda/bootstrap.py", line 149, in decimal_serializer
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <coroutine object handler at 0x7f63a2d20308> is not JSON serializable
/var/runtime/awslambda/bootstrap.py:312: RuntimeWarning: coroutine 'handler' was never awaited
  errortype, result, fatal = report_fault(invokeid, e)
EDIT 2021:
Since this question seems to be gaining traction, I assume people are coming here trying to figure out how to get async to work with AWS Lambda as I was. The bad news is that even now more than a year later, there still isn't any support by AWS to have an asynchronous handler in a Python-based Lambda function. (I have no idea why, as NodeJS-based Lambda functions can handle it perfectly fine.)
The good news is that since Python 3.7, there is a simple workaround in the form of asyncio.run:
import asyncio

def lambda_handler(event, context):
    # Use asyncio.run to synchronously "await" an async function
    result = asyncio.run(async_handler(event, context))
    return {
        'statusCode': 200,
        'body': result
    }

async def async_handler(event, context):
    # Put your asynchronous code here
    await asyncio.sleep(1)
    return 'Success'
Note: The selected answer says that using asyncio.run is not the proper way of starting an asynchronous task in Lambda. In general, they are correct because if some other resource in your Lambda code creates an event loop (a database/HTTP client, etc.), it's wasteful to create another loop and it's better to operate on the existing loop using asyncio.get_event_loop.
However, if an event loop does not yet exist when your code begins running, asyncio.run becomes the only (simple) course of action.

Not at all. Async Python handlers are not supported by AWS Lambda.
If you need to use async/await functionality in your AWS Lambda, you have to define an async function in your code (either in Lambda files or a Lambda Layer) and call asyncio.get_event_loop().run_until_complete(your_async_handler()) inside your sync handler.
Please note that asyncio.run (introduced in Python 3.7) is not a proper way to call an async handler in the AWS Lambda execution environment, since Lambda tries to reuse the execution context for subsequent invocations. The problem is that asyncio.run creates a new event loop and closes the previous one. If you have opened any resources or created coroutines attached to the event loop closed by a previous Lambda invocation, you will get an "Event loop is closed" error. asyncio.get_event_loop().run_until_complete allows you to reuse the same loop. See the related StackOverflow question.
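For illustration, a minimal sketch of that failure mode (the aiohttp client and URL are illustrative assumptions, not part of the original question):
import asyncio
import aiohttp

session = None  # module-level state survives across warm Lambda invocations

async def async_handler(event):
    global session
    if session is None:
        # The session binds to whatever event loop is running right now.
        session = aiohttp.ClientSession()
    async with session.get('https://example.com') as resp:
        return resp.status

def lambda_handler(event, context):
    # Works on a cold start; on a warm re-invocation asyncio.run spins up a
    # fresh loop, while the cached session is still tied to the previous loop,
    # which asyncio.run already closed -> "RuntimeError: Event loop is closed".
    return {'statusCode': asyncio.run(async_handler(event))}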
The AWS Lambda documentation misleads its readers a little by introducing synchronous and asynchronous invocations. Do not mix these up with sync/async Python functions. Synchronous refers to invoking AWS Lambda and then waiting for the result (a blocking operation): the function is called immediately and you get the response as soon as possible. With an asynchronous invocation, you ask Lambda to schedule the function execution and do not wait for the response at all. When the time comes, Lambda will still call the handler function synchronously.
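To make the distinction concrete, here is a hedged sketch using boto3 (the function name and payload are placeholders):
import json
import boto3

client = boto3.client('lambda')

# Synchronous invocation: blocks until the handler finishes and returns its result.
response = client.invoke(
    FunctionName='my-function',
    InvocationType='RequestResponse',
    Payload=json.dumps({'key': 'value'}),
)
print(response['Payload'].read())

# Asynchronous invocation: Lambda queues the event and replies immediately with
# HTTP 202; the handler itself is still called synchronously later.
response = client.invoke(
    FunctionName='my-function',
    InvocationType='Event',
    Payload=json.dumps({'key': 'value'}),
)
print(response['StatusCode'])  # 202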

Don't use the run() method; call run_until_complete() instead:
import json
import asyncio

async def my_async_method():
    await some_async_functionality()

def lambda_handler(event, context):
    loop = asyncio.get_event_loop()
    result = loop.run_until_complete(my_async_method())
    return {
        'statusCode': 200,
        'body': json.dumps('Hello Lambda')
    }

Related

How to stop execution of FastAPI endpoint after a specified time to reduce CPU resource usage/cost?

Use case
The client microservice, which calls /do_something, has a timeout of 60 seconds in its requests.post() call. This timeout is fixed and can't be changed. So if /do_something takes 10 minutes, the client stops waiting after 60 seconds, while /do_something keeps burning CPU for the full 10 minutes, which increases the cost. We have a limited budget.
The current code looks like this:
import time
from uvicorn import Server, Config
from random import randrange
from fastapi import FastAPI

app = FastAPI()

def some_func(text):
    """
    Some computationally heavy function
    whose execution time depends on input text size
    """
    randinteger = randrange(1, 120)
    time.sleep(randinteger)  # simulate processing of text
    return text

@app.get("/do_something")
async def do_something():
    response = some_func(text="hello world")
    return {"response": response}

# Running
if __name__ == '__main__':
    server = Server(Config(app=app, host='0.0.0.0', port=3001))
    server.run()
Desired Solution
Here /do_something should stop the processing of the current request to endpoint after 60 seconds and wait for next request to process.
If execution of the end point is force stopped after 60 seconds we should be able to log it with custom message.
This should not kill the service and work with multithreading/multiprocessing.
I tried this, but when the timeout happens the server gets killed. Any solution to fix this?
import logging
import time
import timeout_decorator
from uvicorn import Server, Config
from random import randrange
from fastapi import FastAPI

app = FastAPI()

@timeout_decorator.timeout(seconds=2, timeout_exception=StopIteration, use_signals=False)
def some_func(text):
    """
    Some computationally heavy function
    whose execution time depends on input text size
    """
    randinteger = randrange(1, 30)
    time.sleep(randinteger)  # simulate processing of text
    return text

@app.get("/do_something")
async def do_something():
    try:
        response = some_func(text="hello world")
    except StopIteration:
        logging.warning('Stopped </do_something> endpoint due to timeout!')
    else:
        logging.info('Completed </do_something> endpoint')
    return {"response": response}

# Running
if __name__ == '__main__':
    server = Server(Config(app=app, host='0.0.0.0', port=3001))
    server.run()
This answer is not about improving CPU time—as you mentioned in the comments section—but rather explains what would happen, if you defined an endpoint with normal def or async def, as well as provides solutions when you run blocking operations inside an endpoint.
You are asking how to stop the processing of a request after a while, in order to process further requests. It does not really make sense to start processing a request and then (60 seconds later) stop it as if it never happened (wasting server resources all that time and keeping other requests waiting). You should instead leave the handling of requests to the FastAPI framework itself. When you define an endpoint with async def, it is run on the main thread (in the event loop); i.e., the server processes such requests sequentially, as long as there is no await call inside the endpoint (just like in your case). The keyword await passes function control back to the event loop. In other words, it suspends the execution of the surrounding coroutine and tells the event loop to let something else run until the awaited task completes (and has returned the result data). The await keyword only works within an async function.
Since you perform a heavy CPU-bound operation inside your async def endpoint (by calling your some_func() function), and you never give up control for other requests to run in the event loop (e.g., by awaiting some coroutine), the server will be blocked and wait for that request to be fully processed and complete before moving on to the next one(s). Have a look at this answer for more details.
Solutions
One solution would be to define your endpoint with normal def instead of async def. In brief, when you declare an endpoint with normal def instead of async def in FastAPI, it is run in an external threadpool that is then awaited, instead of being called directly (as it would block the server); hence, FastAPI would still work asynchronously.
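Applied to the question's code, that simply means dropping the async keyword from the endpoint (a minimal sketch):
@app.get("/do_something")
def do_something():
    # Plain def: FastAPI runs this in its external threadpool, so the
    # blocking some_func() call no longer stalls the event loop.
    response = some_func(text="hello world")
    return {"response": response}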
Another solution, as described in this answer, is to keep the async def definition and run the CPU-bound operation in a separate thread and await it, using Starlette's run_in_threadpool(), thus ensuring that the main thread (event loop), where coroutines are run, does not get blocked. As described by @tiangolo here, "run_in_threadpool is an awaitable function, the first parameter is a normal function, the next parameters are passed to that function directly. It supports sequence arguments and keyword arguments". Example:
from fastapi.concurrency import run_in_threadpool
res = await run_in_threadpool(cpu_bound_task, text='Hello world')
Since this is about a CPU-bound operation, it would be preferable to run it in a separate process, using ProcessPoolExecutor, as described in the link provided above. In this case, this could be integrated with asyncio, in order to await the process to finish its work and return the result(s). Note that, as described in the link above, it is important to protect the main loop of code to avoid recursive spawning of subprocesses, etc.; essentially, your code must be under if __name__ == '__main__'. Example:
import concurrent.futures
from functools import partial
import asyncio

loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
About Request Timeout
With regards to the recent update on your question about the client having a fixed 60s request timeout; if you are not behind a proxy such as Nginx that would allow you to set the request timeout, and/or you are not using gunicorn, which would also allow you to adjust the request timeout, you could use a middleware, as suggested here, to set a timeout for all incoming requests. The suggested middleware (example is given below) uses asyncio's .wait_for() function, which waits for an awaitable function/coroutine to complete with a timeout. If a timeout occurs, it cancels the task and raises asyncio.TimeoutError.
Regarding your comment below:
My requirement is not unblocking next request...
Again, please read carefully the first part of this answer to understand that if you define your endpoint with async def and do not await some coroutine inside it, but instead perform some CPU-bound task (as you already do), it will block the server until it is completed (and even the approach below won't work as expected). That's like saying that you would like FastAPI to process one request at a time; in that case, there is no reason to use an ASGI framework such as FastAPI, which takes advantage of the async/await syntax (i.e., processing requests asynchronously) in order to provide fast performance. Hence, you either need to drop the async definition from your endpoint (as mentioned earlier above), or, preferably, run your synchronous CPU-bound task using ProcessPoolExecutor, as described earlier.
Also, your comment in some_func():
Some computationally heavy function whose execution time depends on
input text size
indicates that instead of (or along with) setting a request timeout, you could check the length of the input text (using a dependency function, for instance) and raise an HTTPException if the text's length exceeds some pre-defined value, which is known beforehand to require more than 60s to complete the processing. In that way, your system won't waste resources trying to perform a task that you already know will not be completed.
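A hedged sketch of that idea (the threshold, status code, and parameter name are assumptions, not taken from the question):
from fastapi import Depends, HTTPException

MAX_TEXT_LENGTH = 10_000  # assumed size known to need more than 60s

def check_text_length(text: str) -> str:
    if len(text) > MAX_TEXT_LENGTH:
        raise HTTPException(status_code=413, detail='Input text too long to process within the time limit')
    return text

@app.get('/do_something')
async def do_something(text: str = Depends(check_text_length)):
    # `text` has already passed the length check here
    ...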
Working Example
import time
import uvicorn
import asyncio
import concurrent.futures
from functools import partial
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from starlette.status import HTTP_504_GATEWAY_TIMEOUT

REQUEST_TIMEOUT = 2  # adjust timeout as desired
app = FastAPI()

@app.middleware('http')
async def timeout_middleware(request: Request, call_next):
    try:
        return await asyncio.wait_for(call_next(request), timeout=REQUEST_TIMEOUT)
    except asyncio.TimeoutError:
        return JSONResponse({'detail': 'Request exceeded the time limit for processing'},
                            status_code=HTTP_504_GATEWAY_TIMEOUT)

def cpu_bound_task(text):
    time.sleep(5)
    return text

@app.get('/')
async def main():
    loop = asyncio.get_running_loop()
    with concurrent.futures.ProcessPoolExecutor() as pool:
        res = await loop.run_in_executor(pool, partial(cpu_bound_task, text='Hello world'))
    return {'response': res}

if __name__ == '__main__':
    uvicorn.run(app)

How to really make this operation async in python?

I have very recently started checking out asyncio for Python. The use case is that I hit an endpoint /refresh_data and the data is refreshed (from an S3 bucket), but this should be a non-blocking operation: other API endpoints should still be able to be serviced.
So in my controller, I have:
def refresh_data():
    myservice.refresh_data()
    return jsonify(dict(ok=True))
and in my service I have:
async def refresh_data():
    try:
        s_result = await self.s3_client.fetch()
    except (FileNotFound, IOError) as e:
        logger.info("problem")
    gather = {i.pop("x"): i async for i in summary_result}
    # ... some other stuff
and in my client:
async def fetch():
    result = pd.read_parquet("bucket", "pyarrow", cols, filts).to_dict(orient="col1")
    return result
And when I run it, I see this error:
TypeError: 'async for' requires an object with __aiter__ method, got coroutine
I don't know how to move past this. Adding async definitely makes it return a coroutine type, but either I have implemented this messily or I haven't fully understood the asyncio package in Python. I have been working from simple examples, but I'm not sure what's wrong with what I've done here.
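A minimal sketch of one way past that error (reusing the question's names; it assumes fetch() returns ordinary data once awaited): await the coroutine a single time, iterate the concrete result with a plain for, and push the blocking pd.read_parquet call onto an executor so the endpoint really is non-blocking:
import asyncio

async def refresh_data():
    try:
        s_result = await s3_client.fetch()  # await once: yields the concrete result
    except (FileNotFound, IOError):
        logger.info("problem")
        return
    # Plain `for`: s_result is ordinary data, not an async iterator.
    gather = {i.pop("x"): i for i in s_result}

async def fetch():
    loop = asyncio.get_event_loop()
    # pd.read_parquet is synchronous; running it in the default executor keeps
    # the event loop free to service other endpoints.
    return await loop.run_in_executor(
        None,
        lambda: pd.read_parquet("bucket", "pyarrow", cols, filts).to_dict(orient="col1"),
    )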

AWS Lambda call url which takes longer than Lambda execution limit

Context
I want to run an AWS Lambda, call an endpoint (fire and forget), then stop the Lambda, all the while the endpoint is whirling away happily on its own.
Attempts
1. Using a timeout, e.g.
try:
    requests.get(url, timeout=0.001)
except requests.exceptions.ReadTimeout:
    ...
2. Using an async call with grequests:
import grequests
...

def handler(event, context):
    try:
        ...
        req = grequests.get(url)
        grequests.send(req, grequests.Pool(1))
        return {
            'statusCode': 200,
            'body': "Lambda done"
        }
    except Exception:
        logger.exception('Error while running lambda')
        raise
These requests don't appear to reach the API, it's almost like the request is being cancelled.
Any ideas why?
Question
How can a Lambda call a URL which takes a long time to complete? Thanks.
To anyone reading this: I fixed my problem by using AWS Batch.
You have to increase the timeout on your function. Depending on how you have defined your function, you can raise the timeout (the default is only 3 seconds) on the function's configuration page or inside your CloudFormation template.
The timeout defined inside your script has no effect on Lambda's behaviour. AWS Lambda will kill your function once it hits the configured limit (300 seconds at most at the time) regardless of any timeout defined inside your script. See: https://docs.aws.amazon.com/lambda/latest/dg/limits.html
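For completeness, a hedged sketch of raising the configured timeout programmatically with boto3 (the function name is a placeholder):
import boto3

client = boto3.client('lambda')
# Raise the function's configured timeout (in seconds); the platform's hard
# limit still caps whatever value is set here.
client.update_function_configuration(
    FunctionName='my-function',  # placeholder
    Timeout=300,
)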

How to use Asynchronous Generators in Twisted?

Twisted supports wrapping native coroutines (coroutines using async/await): using ensureDeferred, we can wrap a coroutine and get its equivalent Deferred. How do we wrap an asynchronous generator (available in Python 3.6) so that we get a Deferred?
Trying to wrap an asynchronous generator returns the following error:
File "/home/yash/Scrapy/vnv/lib/python3.6/site-packages/Twisted-17.9.0-py3.6-linux-x86_64.egg/twisted/internet/defer.py", line 915, in ensureDeferred
raise ValueError("%r is not a coroutine or a Deferred" % (coro,))
ValueError: is not a coroutine or a Deferred
EDIT: I have provided sample code where the required logic is demonstrated, along with the corresponding error:
from twisted.internet import reactor, defer
import asyncio

async def start_request():
    urls = ['https://example.com/page/1',
            'https://example.com/page/2',
            'https://example.com/page/3']
    async for url in urls:
        yield url
        await asyncio.sleep(2)

async def after_request():
    await asyncio.sleep(3)
    print("Completed the work")

deferred = ensureDeferred(start_request())
deferred.addCallback(after_request())
print("reactor has started")
reactor.run()
Error Traceback:
Traceback (most recent call last):
  File "req2.py", line 20, in <module>
    deferred = defer.ensureDeferred(start_request())
  File "/home/yash/Scrapy/vnv/lib/python3.6/site-packages/Twisted-17.9.0-py3.6-linux-x86_64.egg/twisted/internet/defer.py", line 915, in ensureDeferred
    raise ValueError("%r is not a coroutine or a Deferred" % (coro,))
ValueError: <async_generator object start_request at 0x7febb282cf50> is not a coroutine or a Deferred
There are a few problems with the example you provided. The first one is causing the error you're seeing, and the others will cause problems later.
The first problem is that if you use yield inside an async function, it's not a coroutine function anymore; it's an asynchronous generator function (which is exactly what the error's repr shows). So what you're passing to ensureDeferred is not a coroutine, which causes the error you're seeing. Instead of yield url, try just printing the URLs, or collect them in a list and return it at the end.
The second problem is that you can use async/await with Twisted, but not with asyncio directly. Twisted code that wants to call asyncio functions must convert asyncio Futures to Deferreds first. Since asyncio.sleep gives you a coroutine, you would need to convert it to a Future first, with asyncio.ensure_future, then convert the Future to a Deferred, with Deferred.fromFuture. Or you can just use Twisted to simulate a sleep, like this:
from twisted.internet import reactor
from twisted.internet.defer import Deferred

def sleep(secs):
    d = Deferred()
    reactor.callLater(secs, d.callback, None)
    return d
The third problem is that you can't use async for with something that is not an async iterator, so the async for url in urls: should be just for url in urls:. Putting it all together gives the sketch below.
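A hedged sketch of the corrected example with the three fixes applied (the URLs are collected and returned instead of yielded):
from twisted.internet import reactor
from twisted.internet.defer import Deferred, ensureDeferred

def sleep(secs):
    d = Deferred()
    reactor.callLater(secs, d.callback, None)
    return d

async def start_request():
    urls = ['https://example.com/page/1',
            'https://example.com/page/2',
            'https://example.com/page/3']
    collected = []
    for url in urls:            # plain for: urls is an ordinary list
        collected.append(url)   # collect instead of yield, so this stays a coroutine
        await sleep(2)          # Twisted-friendly sleep instead of asyncio.sleep
    return collected

async def after_request(result):
    await sleep(3)
    print("Completed the work:", result)

d = ensureDeferred(start_request())
d.addCallback(lambda result: ensureDeferred(after_request(result)))
print("reactor has started")
reactor.run()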
P.S.: Thanks to @runciter and @jfhbrook from the #twisted channel on Freenode for the help with this =)

How to wait for coroutines to complete synchronously within method if event loop is already running?

I'm trying to create a Python-based CLI that communicates with a web service via websockets. One issue that I'm encountering is that requests made by the CLI to the web service intermittently fail to get processed. Looking at the logs from the web service, I can see that the problem is caused by the fact that frequently these requests are being made at the same time (or even after) the socket has closed:
2016-09-13 13:28:10,930 [22 ] INFO DeviceBridge - Device bridge has opened
2016-09-13 13:28:11,936 [21 ] DEBUG DeviceBridge - Device bridge has received message
2016-09-13 13:28:11,937 [21 ] DEBUG DeviceBridge - Device bridge has received valid message
2016-09-13 13:28:11,937 [21 ] WARN DeviceBridge - Unable to process request: {"value": false, "path": "testcube.pwms[0].enabled", "op": "replace"}
2016-09-13 13:28:11,936 [5 ] DEBUG DeviceBridge - Device bridge has closed
In my CLI I define a class CommunicationService that is responsible for handling all direct communication with the web service. Internally, it uses the websockets package to handle communication, which itself is built on top of asyncio.
CommunicationService contains the following method for sending requests:
def send_request(self, request: str) -> None:
    logger.debug('Sending request: {}'.format(request))
    asyncio.ensure_future(self._ws.send(request))
...where _ws is a websocket connection opened earlier in another method:
self._ws = await websockets.connect(websocket_address)
What I want is to be able to await the future returned by asyncio.ensure_future and, if necessary, sleep for a short while after in order to give the web service time to process the request before the websocket is closed.
However, since send_request is a synchronous method, it can't simply await these futures. Making it asynchronous would be pointless as there would be nothing to await the coroutine object it returned. I also can't use loop.run_until_complete as the loop is already running by the time it is invoked.
I found someone describing a problem very similar to the one I have at mail.python.org. The solution that was posted in that thread was to make the function return the coroutine object in the case the loop was already running:
def aio_map(coro, iterable, loop=None):
    if loop is None:
        loop = asyncio.get_event_loop()
    coroutines = map(coro, iterable)
    coros = asyncio.gather(*coroutines, return_exceptions=True, loop=loop)
    if loop.is_running():
        return coros
    else:
        return loop.run_until_complete(coros)
This is not possible for me, as I'm working with PyRx (Python implementation of the reactive framework) and send_request is only called as a subscriber of an Rx observable, which means the return value gets discarded and is not available to my code:
class AnonymousObserver(ObserverBase):
    ...
    def _on_next_core(self, value):
        self._next(value)
On a side note, I'm not sure if this is some sort of problem with asyncio that's commonly come across or whether I'm just not getting it, but I'm finding it pretty frustrating to use. In C# (for instance), all I would need to do is probably something like the following:
void SendRequest(string request)
{
    this.ws.Send(request).Wait();
    // Task.Delay(500).Wait(); // Uncomment if necessary
}
Meanwhile, asyncio's version of "wait" unhelpfully just returns another coroutine that I'm forced to discard.
Update
I've found a way around this issue that seems to work. I have an asynchronous callback that gets executed after the command has executed and before the CLI terminates, so I just changed it from this...
async def after_command():
    await comms.stop()
...to this:
async def after_command():
    await asyncio.sleep(0.25)  # Allow time for communication
    await comms.stop()
I'd still be happy to receive any answers to this problem for future reference, though. I might not be able to rely on workarounds like this in other situations, and I still think it would be better practice to have the delay executed inside send_request so that clients of CommunicationService do not have to concern themselves with timing issues.
In regards to Vincent's question:
Does your loop run in a different thread, or is send_request called by some callback?
Everything runs in the same thread - it's called by a callback. What happens is that I define all my commands to use asynchronous callbacks, and when executed some of them will try to send a request to the web service. Since they're asynchronous, they don't do this until they're executed via a call to loop.run_until_complete at the top level of the CLI - which means the loop is running by the time they're mid-way through execution and making this request (via an indirect call to send_request).
Update 2
Here's a solution based on Vincent's proposal of adding a "done" callback.
A new boolean field _busy is added to CommunicationService to represent if comms activity is occurring or not.
CommunicationService.send_request is modified to set _busy true before sending the request, and then provides a callback to _ws.send to reset _busy once done:
def send_request(self, request: str) -> None:
    logger.debug('Sending request: {}'.format(request))

    def callback(_):
        self._busy = False

    self._busy = True
    asyncio.ensure_future(self._ws.send(request)).add_done_callback(callback)
CommunicationService.stop is now implemented to wait for this flag to be set false before progressing:
async def stop(self) -> None:
    """
    Terminate communications with TestCube Web Service.
    """
    if self._listen_task is None or self._ws is None:
        return
    # Wait for comms activity to stop.
    while self._busy:
        await asyncio.sleep(0.1)
    # Allow short delay after final request is processed.
    await asyncio.sleep(0.1)
    self._listen_task.cancel()
    await asyncio.wait([self._listen_task, self._ws.close()])
    self._listen_task = None
    self._ws = None
    logger.info('Terminated connection to TestCube Web Service')
This seems to work too, and at least this way all communication timing logic is encapsulated within the CommunicationService class as it should be.
Update 3
Nicer solution based on Vincent's proposal.
Instead of self._busy we have self._send_request_tasks = [].
New send_request implementation:
def send_request(self, request: str) -> None:
    logger.debug('Sending request: {}'.format(request))
    task = asyncio.ensure_future(self._ws.send(request))
    self._send_request_tasks.append(task)
New stop implementation:
async def stop(self) -> None:
    if self._listen_task is None or self._ws is None:
        return
    # Wait for comms activity to stop.
    if self._send_request_tasks:
        await asyncio.wait(self._send_request_tasks)
    ...
You could use a set of tasks:
self._send_request_tasks = set()
Schedule the tasks using ensure_future and clean up using add_done_callback:
def send_request(self, request: str) -> None:
    task = asyncio.ensure_future(self._ws.send(request))
    self._send_request_tasks.add(task)
    task.add_done_callback(self._send_request_tasks.remove)
And wait for the set of tasks to complete:
async def stop(self):
    if self._send_request_tasks:
        await asyncio.wait(self._send_request_tasks)
Given that you're not inside an asynchronous function you can use the yield from keyword to effectively implement await yourself. The following code will block until the future returns:
def send_request(self, request: str) -> None:
    logger.debug('Sending request: {}'.format(request))
    future = asyncio.ensure_future(self._ws.send(request))
    yield from future.__await__()
