I have a rather complex system running an asynchronous task called "automation". While it runs, I would like to inspect where the task is currently waiting, something like a call stack for async/await.
The following example creates such an automation task, which steps into do_something, which in turn calls sleep. While this task is running, its stack is printed. I would like to see something like "automation → do_something → sleep". But print_stack only points to the line await do_something() in the top-level coroutine automation, nothing more.
#!/usr/bin/env python3
import asyncio

async def sleep():
    await asyncio.sleep(0.1)

async def do_something():
    print('...')
    await sleep()

async def automation():
    for _ in range(10):
        await do_something()

async def main():
    task = asyncio.create_task(automation(), name='automation')
    while not task.done():
        task.print_stack()
        await asyncio.sleep(0.1)

asyncio.run(main())
I thought about using _scheduled from asyncio.BaseEventLoop, but this seems to always be [] in my example. And since my production code runs uvloop I looked into https://github.com/MagicStack/uvloop/issues/135,
https://github.com/MagicStack/uvloop/issues/163 and
https://github.com/MagicStack/uvloop/pull/171, all of which have been stale for about four years.
Is there something else I could try?
I found something: When running asyncio with debug=True and using a different interval of 0.11 s (to avoid both tasks being "in sync"), we can access asyncio.get_running_loop()._scheduled. This contains a list of asyncio.TimerHandle objects. In debug mode each TimerHandle has a proper _source_traceback, a list of traceback.FrameSummary entries with information like filename, lineno, name and line.
...
async def main():
    task = asyncio.create_task(automation(), name='automation')
    while not task.done():
        for timer_handle in asyncio.get_running_loop()._scheduled:
            for frame_summary in timer_handle._source_traceback:
                print(f'{frame_summary.filename}:{frame_summary.lineno} {frame_summary.name}')
        await asyncio.sleep(0.11)

asyncio.run(main(), debug=True)
The output looks something like this:
/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py:600 run_forever
/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py:1888 _run_once
/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/events.py:80 _run
/Users/falko/./test.py:17 automation
/Users/falko/./test.py:12 do_something
/Users/falko/./test.py:7 sleep
/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/tasks.py:601 sleep
The main drawback for my application is that uvloop doesn't expose a _scheduled field. That's why I'm still looking for an alternative approach.
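One loop-agnostic idea that might work (a sketch, not tested with uvloop; format_await_stack is a made-up helper) is to walk the chain of awaited coroutines by hand: task.get_coro() returns the task's outermost coroutine, and each coroutine's cr_await attribute references whatever it is currently awaiting:
import asyncio

def format_await_stack(task: asyncio.Task) -> str:
    # Follow the chain of awaited coroutines; stops at the first
    # non-coroutine awaitable (e.g. the Future inside asyncio.sleep).
    names = []
    obj = task.get_coro()
    while obj is not None and hasattr(obj, 'cr_frame'):
        if obj.cr_frame is not None:
            names.append(obj.cr_frame.f_code.co_name)
        obj = obj.cr_await
    return ' → '.join(names)

async def main():
    task = asyncio.create_task(automation(), name='automation')
    while not task.done():
        # Prints e.g. "automation → do_something → sleep → sleep"
        # (the last entry is asyncio's own sleep coroutine, matching
        # the asyncio/tasks.py line in the debug traceback above).
        print(format_await_stack(task))
        await asyncio.sleep(0.1)
This works purely on the coroutine objects, so it should not depend on which event loop implementation is running.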
Related
I have a FastAPI endpoint that calls a Python script, but it has two problems on GCP:
it always gives a success code (because it's not blocking)
the instance is always running, as Cloud Run doesn't know when to turn it off (because it's not blocking)
I think the problem is blocking :-)
Here's a sample of my code:
async def curve_builder(secret: str):
    os.system("python3 scripts/my_script.py")
    return {"success": True, "status": "Batch job completed"}
Is there a way to let the script run and then return a success/fail message once it's done? I'm not sure how to block on it; it seems to just return a success as soon as the command is executed.
I'm not sure if this is specific to fastapi or general python.
Blocking operations can hang up your current worker. When you want to execute blocking code from a coroutine, send its logic to an executor.
Get the event loop
loop = asyncio.get_running_loop()
Any blocking code must go outside your coroutine; that way, your current worker will be able to execute other coroutines:
await loop.run_in_executor(None, func)
For your case, the final result will be:
async def curve_builder(secret: str):
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(None, lambda: os.system("python3 scripts/my_script.py"))
    return {"status": result}
You can read further information in the docs: https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor
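As a side note, if you would rather not tie up an executor thread at all, asyncio's subprocess API can spawn the script and await its exit code directly. A sketch under that assumption, reusing the script path from the question:
import asyncio

async def curve_builder(secret: str):
    # Spawn the script as a child process; this does not block the event loop.
    proc = await asyncio.create_subprocess_exec("python3", "scripts/my_script.py")
    returncode = await proc.wait()  # suspends until the script exits
    return {"success": returncode == 0, "status": "Batch job completed"}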
Assign the os.system() call to a variable. The exit code of your script is assigned to it, so execution waits until the script has finished, despite the async method you are calling from.
The premise appears to be wrong: I tested an example setup and could not reproduce the issue.
Script1:
import os
import asyncio

async def method1():
    print("Start of method1")
    os.system("python /path/to/other/script/script2.py")
    print("End of method1")

print("start script1")
asyncio.run(method1())
print("end script1")
Script2:
import asyncio

async def method2():
    print("Start method2")
    await asyncio.sleep(3)
    print("End method2")

print("start async script2")
asyncio.run(method2())
print("end async script2")
Output:
start script1
Start of method1
start async script2
Start method2
End method2
end async script2
End of method1
end script1
I am trying to do something similar to C#'s ManualResetEvent, but in Python.
I have attempted it in Python, but it doesn't seem to work:
import asyncio

cond = asyncio.Condition()

async def main():
    some_method()
    cond.notify()

async def some_method():
    print("Starting...")
    await cond.acquire()
    await cond.wait()
    cond.release()
    print("Finished...")

main()
I want the some_method to start then wait until signaled to start again.
This code is not complete; first of all, you need to use asyncio.run() to bootstrap the event loop. This is why your code is not running at all.
Secondly, some_method() never actually starts. You need to start some_method() asynchronously using asyncio.create_task(). When you call an "async def function" (the more correct term is coroutine function) it returns a coroutine object; this object needs to be driven by the event loop, either by you awaiting it or by using the aforementioned function.
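To make the distinction concrete, here is a tiny sketch (the names greet and demo are made up):
import asyncio

async def greet():
    return "hi"

async def demo():
    coro = greet()                    # just a coroutine object; nothing runs yet
    task = asyncio.create_task(coro)  # now the event loop will drive it
    print(await task)                 # prints "hi"

asyncio.run(demo())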
Your code should look more like this:
import asyncio

async def main():
    cond = asyncio.Condition()
    t = asyncio.create_task(some_method(cond))
    # The event loop hasn't had any time to start the task
    # until you await again. Sleeping for 0 seconds will let
    # the event loop start the task before continuing.
    await asyncio.sleep(0)
    # notify() must be called while holding the condition's lock,
    # otherwise asyncio raises RuntimeError.
    async with cond:
        cond.notify()
    # You should never really "fire and forget" tasks,
    # the same way you never do with threading. Wait for
    # it to complete before returning:
    await t

async def some_method(cond):
    print("Starting...")
    await cond.acquire()
    await cond.wait()
    cond.release()
    print("Finished...")

asyncio.run(main())
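As an aside, the closest built-in analogue to C#'s ManualResetEvent is arguably asyncio.Event, which avoids the lock handling of Condition entirely; a minimal sketch:
import asyncio

async def some_method(event: asyncio.Event):
    print("Starting...")
    await event.wait()   # suspends until event.set() is called
    print("Finished...")

async def main():
    event = asyncio.Event()
    t = asyncio.create_task(some_method(event))
    await asyncio.sleep(0)  # let the task start and block on wait()
    event.set()             # comparable to ManualResetEvent.Set()
    await t

asyncio.run(main())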
I have a FastAPI app that posts two requests, one of which is longer (if it helps, they're Elasticsearch queries, and I'm using the AsyncElasticsearch module, which already returns coroutines). This is my attempt:
class my_module:
    search_object = AsyncElasticsearch(url, port)

    async def do_things(self):
        resp1 = await search_object.search()  # the longer one
        print(check_resp1)
        resp2 = await search_object.search()  # the shorter one
        print(check_resp2)
        process(resp2)
        process(resp1)
        do_synchronous_things()
        return thing

app = FastAPI()

@app.post("/")
async def service(user_input):
    result = await my_module.do_things()
    return results
What I observed is that instead of awaiting resp1, by the time it got to check_resp1 it was already a full response, as if I hadn't used async at all.
I'm new to Python async. I knew my code wouldn't work, but I don't know how to fix it. As far as I understand, when the interpreter sees await it starts the function and just moves on, which in this case should immediately post the next request. How do I make it do that?
Yes, that's expected behaviour: the coroutine won't proceed past an await until the result is ready. You can use asyncio.gather to run tasks concurrently:
import asyncio

async def task(msg):
    print(f"START {msg}")
    await asyncio.sleep(1)
    print(f"END {msg}")
    return msg

async def main():
    await task("1")
    await task("2")
    results = await asyncio.gather(task("3"), task("4"))
    print(results)

if __name__ == "__main__":
    asyncio.run(main())
Test:
$ python test.py
START 1
END 1
START 2
END 2
START 3
START 4
END 3
END 4
['3', '4']
Alternatively you can use asyncio.as_completed to get the earliest next result:
for coro in asyncio.as_completed((task("5"), task("6"))):
    earliest_result = await coro
    print(earliest_result)
Update Fri 2 Apr 09:25:33 UTC 2021:
asyncio.run has been available since Python 3.7; in previous versions you have to create and run the loop manually:
if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
Explanation
The reason your code runs synchronously is that in the do_things function the code is executed as follows:
1. Schedule search_object.search() to execute
2. Wait till search_object.search() is finished and get the result
3. Schedule search_object.search() to execute
4. Wait till search_object.search() is finished and get the result
5. Execute (synchronously) process(resp2)
6. Execute (synchronously) process(resp1)
7. Execute (synchronously) do_synchronous_things()
What you intended is for steps 1 and 3 to be executed before 2 and 4. You can make that happen easily with the unsync library; here is the documentation.
How you can fix this
from unsync import unsync

class my_module:
    search_object = AsyncElasticsearch(url, port)

    @unsync
    async def search1(self):
        return await self.search_object.search()

    @unsync
    async def search2(self):  # not sure if this is any different to search1
        return await self.search_object.search()

    async def do_things(self):
        task1, task2 = self.search1(), self.search2()  # schedule tasks
        resp1, resp2 = task1.result(), task2.result()  # wait till tasks are executed
        # you might also do a similar trick with the process function
        # to run process(resp2) and process(resp1) concurrently
        process(resp2)
        process(resp1)
        # if this does not rely on resp1 and resp2 it might also be put into a
        # separate task to speed things up; to do this, use the
        # @unsync(cpu_bound=True) decorator
        do_synchronous_things()
        return thing

app = FastAPI()

@app.post("/")
async def service(user_input):
    result = await my_module.do_things()
    return results
More information
If you want to learn more about asyncio and asynchronous programming, I recommend this tutorial. There is also a case similar to yours, with a few possible solutions to make the coroutines run concurrently.
PS. Obviously I could not run this code, so you must debug it on your own.
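For completeness, a plain-asyncio sketch of the same idea without the unsync dependency, reusing the names from the question (equally untested against a real Elasticsearch):
import asyncio

class my_module:
    search_object = AsyncElasticsearch(url, port)

    async def do_things(self):
        # Start both requests before awaiting either, so they run concurrently.
        task1 = asyncio.create_task(self.search_object.search())  # the longer one
        task2 = asyncio.create_task(self.search_object.search())  # the shorter one
        resp1 = await task1
        resp2 = await task2
        process(resp2)
        process(resp1)
        do_synchronous_things()
        return thing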
I'm new to Python and have code similar to the following:
import time
import asyncio

async def my_async_function(i):
    print("My function {}".format(i))

async def start():
    requests = []
    # Create multiple requests
    for i in range(5):
        print("Creating request #{}".format(i))
        requests.append(my_async_function(i))
    # Do some additional work here
    print("Begin sleep")
    time.sleep(10)
    print("End sleep")
    # Wait for all requests to finish
    return await asyncio.gather(*requests)

asyncio.run(start())
No matter how long the "additional work" takes, the requests seem to run only after "End sleep". I'm guessing asyncio.gather is what actually begins executing them. How can I have the requests (aka my_async_function()) start immediately, do the additional work, and then wait for them all to complete at the end?
Edit:
Per Krumelur's comments and my own findings, the following does what I'm looking for:
import time
import asyncio
import random

async def my_async_function(i):
    print("Begin function {}".format(i))
    await asyncio.sleep(int(random.random() * 10))
    print("End function {}".format(i))

async def start():
    requests = []
    # Create multiple requests
    for i in range(10):
        print("Creating request #{}".format(i))
        requests.append(asyncio.create_task(my_async_function(i)))
    # Do some additional work here
    print("Begin sleep")
    await asyncio.sleep(5)
    print("End sleep")
    # Wait for all requests to finish
    return await asyncio.gather(*requests)

asyncio.run(start())
This only works if my_async_function and the "additional work" are both awaitable, so that the event loop can give each of them execution time. You need create_task (if you know it's a coroutine) or ensure_future (if it could be a coroutine or a future) to allow the requests to run immediately; otherwise they still end up running only when you gather.
time.sleep() is a synchronous operation
You'll want to use the asynchronous sleep and await it, e.g.
await asyncio.sleep(10)
Other async code will only run when the current task yields (i.e. typically when awaiting something).
Using async code means you have to keep using async everywhere. Async operations are meant for I/O-bound applications. If the "additional work" is mainly CPU-bound, you are better off using threads (but beware the global interpreter lock!).
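To illustrate the CPU-bound caveat: because of the GIL, threads won't speed up pure-Python computation, but the same executor pattern works with a process pool. A minimal sketch (cpu_bound_work is a made-up placeholder):
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_bound_work(n: int) -> int:
    # Pure-CPU work; a process pool sidesteps the GIL.
    return sum(i * i for i in range(n))

async def start():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # The event loop stays free while the computation runs in another process.
        result = await loop.run_in_executor(pool, cpu_bound_work, 10_000_000)
        print(result)

asyncio.run(start())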
I've read tons of articles and tutorials about Python 3.5's async/await feature. I have to say I'm pretty confused, because some use get_event_loop() and run_until_complete(), some use ensure_future(), some use asyncio.wait(), and some use call_soon().
It seems like I have a lot of choices, but I have no idea if they are completely identical, or if there are cases where you use loops and cases where you use wait().
But the thing is, all examples work with asyncio.sleep() as a simulation of a real slow operation, which returns an awaitable object. Once I try to swap that line for some real code, the whole thing fails. What are the differences between the approaches written above, and how should I run a third-party library which is not ready for async/await? I use the Quandl service to fetch some stock data.
import asyncio
import quandl

async def slow_operation(n):
    # await asyncio.sleep(1)  # Works because it's await ready.
    await quandl.Dataset(n)  # Doesn't work because it's not await ready.

async def main():
    await asyncio.wait([
        slow_operation("SIX/US9884981013EUR4"),
        slow_operation("SIX/US88160R1014EUR4"),
    ])

# You don't have to use any code for 50 requests/day.
quandl.ApiConfig.api_key = "MY_SECRET_CODE"

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
I hope you get the point of how lost I feel, and how simple a thing I would like to have running in parallel.
If a third-party library is not compatible with async/await then obviously you can't use it easily. There are two cases:
Let's say that the function in the library is asynchronous and it gives you a callback, e.g.
def fn(..., clb):
    ...
So you can do:
def on_result(...):
    ...

fn(..., on_result)
In that case you can wrap such functions into the asyncio protocol like this:
from asyncio import Future

def wrapper(...):
    future = Future()

    def my_clb(...):
        future.set_result(xyz)

    fn(..., my_clb)
    return future
(use future.set_exception(exc) on exception)
Then you can simply call that wrapper in some async function with await:
value = await wrapper(...)
Note that await works with any Future object. You don't have to declare wrapper as async.
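As a runnable illustration of this pattern, here fn is a stand-in for the library's callback-based function, simulated with call_later so the callback fires from within the event loop (all names are made up):
import asyncio

def fn(arg, clb):
    # Simulated callback-based API: "finishes" after 1 second,
    # then invokes the callback with a result.
    asyncio.get_running_loop().call_later(1, clb, arg.upper())

def wrapper(arg):
    future = asyncio.get_running_loop().create_future()

    def my_clb(result):
        future.set_result(result)

    fn(arg, my_clb)
    return future

async def main():
    value = await wrapper("hello")
    print(value)  # HELLO

asyncio.run(main())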
If the function in the library is synchronous then you can run it in a separate thread (probably you would use some thread pool for that). The whole code may look like this:
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Initialize 10 threads
THREAD_POOL = ThreadPoolExecutor(10)

def synchronous_handler(param1, ...):
    # Do something synchronous
    time.sleep(2)
    return "foo"

# Somewhere else
async def main():
    loop = asyncio.get_event_loop()
    futures = [
        loop.run_in_executor(THREAD_POOL, synchronous_handler, param1, ...),
        loop.run_in_executor(THREAD_POOL, synchronous_handler, param1, ...),
        loop.run_in_executor(THREAD_POOL, synchronous_handler, param1, ...),
    ]
    await asyncio.wait(futures)
    for future in futures:
        print(future.result())

with THREAD_POOL:
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
If you can't use threads for whatever reason then using such a library simply makes entire asynchronous code pointless.
Note however that using synchronous library with async is probably a bad idea. You won't get much and yet you complicate the code a lot.
You can take a look at the following simple working example from here. By the way it returns a string worth reading :-)
import aiohttp
import asyncio

async def fetch(client):
    async with client.get('https://docs.aiohttp.org/en/stable/client_reference.html') as resp:
        assert resp.status == 200
        return await resp.text()

async def main():
    async with aiohttp.ClientSession() as client:
        html = await fetch(client)
        print(html)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())