Process an asynchronous iterator in the background - python

I'm trying to process an asynchronous iterator in the background.
I can process it in the main function:
target = ...

async def main():
    async for item in target.listen():
        print(item)  # works fine
    print('done')  # never arrives, because the loop iterates forever

asyncio.run(main())
But done this way, the code blocks.
So I tried processing the iterator in another function with create_task, but the loop never iterates.
target = ...

async def process():
    async for item in target.listen():
        print(item)  # never prints

async def main():
    asyncio.create_task(process())
    print('done')  # works

asyncio.run(main())
Is it possible to process an asynchronous iterator in the background? How?

If you call create_task() and then immediately exit -- which is what you're doing in your code -- then of course the task never runs. You can only run something "in the background" if you're doing something in the foreground, and you're not.
Even if your main function wasn't exiting immediately, your background task still wouldn't run. Asyncio provides concurrent execution, but not parallel execution -- other tasks won't execute unless the current task yields control by calling await.
Here's a working example; I've replaced your target variable with a simple list of words just so we have something over which to iterate:
import asyncio

target = "one two three four five six".split()

async def process():
    for item in target:
        print(item)
        await asyncio.sleep(0.3)

async def main():
    task = asyncio.create_task(process())
    while not task.done():
        print("doing something while waiting")
        await asyncio.sleep(1)
    print("done")

asyncio.run(main())
Running the above code produces:
doing something while waiting
one
two
three
four
doing something while waiting
five
six
done
Rather than checking for task.done() as I've done in this example, you would in practice be more likely to use one of the waiting primitives or asyncio.gather to wait for the completion of background tasks.
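For example, here is a sketch of the same program rewritten around asyncio.gather; the foreground_work coroutine is a made-up stand-in, just like the word list:

import asyncio

target = "one two three four five six".split()

async def process():
    for item in target:
        print(item)
        await asyncio.sleep(0.3)

async def foreground_work():
    for _ in range(2):
        print("doing something while waiting")
        await asyncio.sleep(1)

async def main():
    # gather runs both coroutines concurrently and returns once
    # both have finished -- no polling of task.done() required.
    await asyncio.gather(process(), foreground_work())
    print("done")

asyncio.run(main())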

Related

python: how to make an early return from an asyncio generator

I want to return the first element of an async generator and handle the remaining values without returning, fire-and-forget style. How do I make an early return from a coroutine in Python?
After passing the iterator to asyncio.create_task, it doesn't print the remaining values.
import asyncio
import time

async def async_iter(num):
    for i in range(num):
        await asyncio.sleep(0.5)
        yield i

async def handle_remains(it):
    async for i in it:
        print(i)

async def run() -> None:
    it = async_iter(10)
    async for i in it:
        print(i)
        break
    # await handle_remains(it)
    # Want to make this fire-and-forget (no await), expecting it to
    # just print the remaining values.
    asyncio.create_task(handle_remains(it))
    return i

if __name__ == '__main__':
    asyncio.run(run())
    time.sleep(10)
You’re close with the code, but not quite there yet. In short, creating the Task isn’t enough: the Task needs to run:
task = asyncio.create_task(handle_remains(it)) # Creates a Task in `pending` state.
await task # Run the task, i.e. execute the wrapped coroutine.
A Task, along with coroutines and Futures, is an “Awaitable”. In fact:
When a coroutine is wrapped into a Task with functions like asyncio.create_task() the coroutine is automatically scheduled to run soon.
Notice the “scheduled to run soon”: you now have to make sure the task actually runs by calling await, a keyword which…
is used to obtain a result of coroutine execution.
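Applied to the question's code, a minimal sketch of run() (keeping the question's names) would hold on to the task and await it before returning:

async def run():
    it = async_iter(10)
    async for i in it:
        print(i)
        break
    task = asyncio.create_task(handle_remains(it))
    # ... do other foreground work here; the task makes progress
    # whenever this coroutine awaits something ...
    await task  # ensure the remaining values are printed before returning
    return i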

Is there a way to change asyncio.sleep while currently sleeping?

If I have a coroutine currently sleeping to allow other coroutines to run, is it possible to change the sleep time while it is sleeping? Or would I have to cancel and restart the coroutine? I think I may have just answered my own question there. Looking for help from the more experienced.
The "sleep" coroutine is obviously designed to be simple: it pauses for that amount of time, and it is it.
What you seem to need is a way to synchronize your co-routines, and if no signal gets back in an specified amount of time (the time you are passing to sleep), to move on.
Take a look at the synchronization primitives https://docs.python.org/3.6/library/asyncio-sync.html and asyncio.wait_for
So, you can instead of asyncio.sleep, call a co-routine, with wait_for, where it expects an Event, or a Lock release. The Event or lock-release then is used by whatever part of your code would "cancel sleep" anyway.
I created an example to show both sleeping running to the end, and being canceled.
import asyncio

async def interruptable_sleep(time, event):
    try:
        await asyncio.wait_for(event.wait(), timeout=time)
    except asyncio.TimeoutError:
        print("'sleeping' proceeded normally")
    else:
        print("'sleeping' canceled")

async def sleeper(m, n, event):
    await asyncio.sleep(n)
    if n == 3:
        event.set()
    print(f"cycle {m}, step {n}")

async def main():
    event = asyncio.Event()
    tasks = []
    for cycle in range(3):
        event.clear()
        # create a batch of async tasks to run in parallel
        for step in range(6):
            tasks.append(asyncio.create_task(sleeper(cycle, step, event), name=f"{cycle}_{step}"))
        await interruptable_sleep(2, event)
    # 'join' the remaining tasks
    event.set()
    await asyncio.gather(*tasks)

asyncio.run(main())
This pattern sort of "reverses" the idea of a timeout: if a task finishes early, the waiting is canceled (while a timeout means "if a task is too late, cancel it").
But maybe you just need the other pattern here: create a list of all your tasks and call asyncio.gather, rather than calling "sleep" to give the other tasks time to run.
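A minimal sketch of that gather pattern (the sleep is a stand-in for real work):

import asyncio

async def job(n):
    await asyncio.sleep(n)  # stand-in for the actual work
    print(f"job {n} finished")

async def main():
    tasks = [asyncio.create_task(job(n)) for n in range(3)]
    # gather gives every task time to run and waits for all of them
    await asyncio.gather(*tasks)

asyncio.run(main())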

How asyncio understands that a task is complete for non-blocking operations

I'm trying to understand how asyncio works. For I/O operations I understand that when await is called, we register a Future object with the event loop and then call epoll to find the sockets, belonging to those Future objects, that are ready to give us data. After that we run the registered callback and resume the function's execution.
But the thing I can't understand is what happens when we await something that is not an I/O operation. How does the event loop understand that the task is complete? Does it create a socket for it, or use another kind of loop? Does it use epoll? Or is it not added to the loop at all and just driven as a generator?
Here is an example:
import asyncio

async def test():
    return 10

async def my_coro(delay):
    loop = asyncio.get_running_loop()
    end_time = loop.time() + delay
    while True:
        print("Blocking...")
        await test()
        if loop.time() > end_time:
            print("Done.")
            break

async def main():
    await my_coro(3.0)

asyncio.run(main())
await doesn't automatically yield to the event loop; that happens only when an async function (anywhere in the chain of awaits) requests suspension, typically because IO or a timeout is not yet ready.
In your example the event loop is never returned to, which you can easily verify by moving the "Blocking" print before the while loop and changing main to await asyncio.gather(my_coro(3.0), my_coro(3.0)). What you'll observe is that the coroutines are executed in series ("blocking" followed by "done", all repeated twice), not in parallel ("blocking" followed by another "blocking" and then twice "done"). The reason is that there was simply no opportunity for a context switch: my_coro executed in one go, as if it were an ordinary function, because none of its awaits ever chose to suspend.
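Spelled out, that verification looks like this; the only changes to the question's code are the moved print and the gather call:

import asyncio

async def test():
    return 10

async def my_coro(delay):
    loop = asyncio.get_running_loop()
    end_time = loop.time() + delay
    print("Blocking...")  # moved out of the loop
    while True:
        await test()
        if loop.time() > end_time:
            print("Done.")
            break

async def main():
    # Prints "Blocking...", "Done.", "Blocking...", "Done." --
    # series, not parallel -- because my_coro never suspends.
    await asyncio.gather(my_coro(3.0), my_coro(3.0))

asyncio.run(main())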

Share queue in event loop

Is it possible to share an asyncio.Queue over different tasks in one event loop?
The use case:
Two tasks publish data on a queue, and one task grabs the new items from the queue. All tasks run asynchronously.
main.py
import asyncio
import creator

async def pull_message(queue):
    while True:
        # Here I don't get messages; maybe the queue is always
        # occupied by another task?
        msg = await queue.get()
        print(msg)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    queue = asyncio.Queue(loop=loop)
    future = asyncio.ensure_future(pull_message(queue))
    creators = list()
    for i in range(2):
        creators.append(loop.create_task(creator.populate_msg(queue)))
    # add the future to creators for easy handling
    creators.append(future)
    loop.run_until_complete(asyncio.gather(*creators))
creator.py
import asyncio

async def populate_msg(queue):
    while True:
        msg = "Foo"
        await queue.put(msg)
The problem in your code is that populate_msg doesn't yield to the event loop because the queue is unbounded. This is somewhat counter-intuitive because the coroutine clearly contains an await, but that await only suspends the execution of the coroutine if the coroutine would otherwise block. Since put() on an unbounded queue never blocks, populate_msg is the only thing executed by the event loop.
The problem will go away once you change populate_msg to actually do something else (like await a network event). For testing purposes you can add await asyncio.sleep(0) inside the loop, which will force the coroutine to yield control to the event loop at every iteration of the while loop. Note that this will cause the event loop to spend an entire core by continuously spinning the loop.
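For instance, that test workaround applied to creator.py:

import asyncio

async def populate_msg(queue):
    while True:
        msg = "Foo"
        await queue.put(msg)
        # Force a context switch so that pull_message gets to run.
        await asyncio.sleep(0)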

What is the correct way to switch freely between asynchronous tasks?

Suppose I have some tasks running asynchronously. They may be totally independent, but I still want to set points where the tasks will pause so they can run concurrently.
What is the correct way to run the tasks concurrently? I am currently using await asyncio.sleep(0), but I feel this is adding a lot of overhead.
import asyncio

async def do(name, amount):
    for i in range(amount):
        # Do some time-expensive work
        print(f'{name}: has done {i}')
        await asyncio.sleep(0)
    return f'{name}: done'

async def main():
    res = await asyncio.gather(do('Task1', 3), do('Task2', 2))
    print(*res, sep='\n')

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Output
Task1: has done 0
Task2: has done 0
Task1: has done 1
Task2: has done 1
Task1: has done 2
Task1: done
Task2: done
If we were using simple generators, an empty yield would pause the flow of a task without any overhead, but an empty await is not valid.
What is the correct way to set such breakpoints without overhead?
As mentioned in the comments, asyncio coroutines normally suspend automatically on calls that would block or sleep in the equivalent synchronous code. In your case the coroutine is CPU-bound, so awaiting blocking calls is not enough; it needs to occasionally relinquish control to the event loop to allow the rest of the system to run.
Explicit yields are not uncommon in cooperative multitasking, and using await asyncio.sleep(0) for that purpose will work as intended, but it does carry a risk: sleep too often and you slow down the computation with unnecessary switches; sleep too seldom and you hog the event loop by spending too much time in a single coroutine.
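One possible compromise, sketched here with an arbitrary chunk size as a drop-in replacement for do in the script above, is to yield only every N iterations instead of on every one:

async def do(name, amount, chunk=1000):
    for i in range(amount):
        # Do some time-expensive work
        print(f'{name}: has done {i}')
        if i % chunk == 0:
            # Yield to the event loop only occasionally to keep the
            # switching overhead low while still staying responsive.
            await asyncio.sleep(0)
    return f'{name}: done'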
The solution provided by asyncio is to offload CPU-bound code to a thread pool using run_in_executor. Awaiting it will automatically suspend the coroutine until the CPU-intensive task is done, without any intermediate polling. For example:
import asyncio

def do(id, amount):
    for i in range(amount):
        # Do some time-expensive work
        print(f'{id}: has done {i}')
    return f'{id}: done'

async def main():
    loop = asyncio.get_event_loop()
    res = await asyncio.gather(
        loop.run_in_executor(None, do, 'Task1', 5),
        loop.run_in_executor(None, do, 'Task2', 3))
    print(*res, sep='\n')

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
