Are asyncio's EventLoop tasks created with loop.create_task a FIFO - python

I cannot find any documentation on this, but empirically it seems that it is.
In what order will the coroutines 1 and 2 run in the following three examples and is the order always guaranteed?
A
loop.run_until_complete(coro1)
loop.run_until_complete(coro2)
loop.run_forever()
B
loop.create_task(coro1)
loop.create_task(coro2)
loop.run_forever()
C
loop.create_task(coro1)
loop.run_until_complete(coro2)
loop.run_forever()
etc.

In your first example, coro1 will run until it is complete. Then coro2 will run. This is essentially the same as if they were both synchronous functions.
In your second example, coro1 will run until it's told to await. At that point control is yielded to coro2. coro2 will run until it's told to await. At that point the loop will check to see if coro1 is ready to resume. This will repeat until both are finished and then the loop will just wait.
In your final example, coro1 was scheduled first, so it takes the first turn when run_until_complete starts the loop. The two coroutines then alternate as in the previous example, but run_until_complete returns as soon as coro2 is done. run_forever then lets coro1 resume until it finishes, after which the loop will just wait.
A fourth example is
loop.run_until_complete(
    asyncio.gather(
        asyncio.ensure_future(coro1),
        asyncio.ensure_future(coro2),
    )
)
It will behave like the second example except it will stop once both are complete.
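The FIFO behaviour in the second example is easy to observe empirically. A minimal sketch (using the modern asyncio.run wrapper rather than managing the loop by hand; the interleaving shown relies on the current CPython event loop running callbacks in the order they were queued):

```python
import asyncio

order = []

async def coro(name):
    order.append(f"{name} step 1")
    await asyncio.sleep(0)   # yield control back to the event loop
    order.append(f"{name} step 2")

async def main():
    # Tasks start in the order they were created (FIFO).
    await asyncio.gather(asyncio.create_task(coro("coro1")),
                         asyncio.create_task(coro("coro2")))

asyncio.run(main())
print(order)
# ['coro1 step 1', 'coro2 step 1', 'coro1 step 2', 'coro2 step 2']
```

coro1 gets its first turn before coro2, and after both yield, they are resumed in the same order.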

Related

What does asyncio.create_task actually do?

I'm trying to understand how asyncio.create_task actually works. Suppose I have the following code:
import asyncio
import time

async def delayer():
    await asyncio.sleep(1)

async def messenger():
    await asyncio.sleep(1)
    return "A Message"

async def main():
    message = await messenger()
    await delayer()

start_time = time.time()
asyncio.run(main())
end_time = time.time() - start_time
print(end_time)
The code will take about 2 seconds. But if I make some changes to the body of main like this:
import asyncio
import time

async def delayer():
    await asyncio.sleep(1)

async def messenger():
    await asyncio.sleep(1)
    return "A Message"

async def main():
    task1 = asyncio.create_task(delayer())
    task2 = asyncio.create_task(delayer())
    await task1
    await task2

start_time = time.time()
asyncio.run(main())
end_time = time.time() - start_time
print(end_time)
Now the code will take about 1 second.
My understanding from what I read is that await is a blocking process, as we can see from the first snippet: we need to wait 1 second for the messenger function to return, then another second for the delayer function.
Now the real question comes from the second snippet. We just learnt that await makes us wait for its expression to return. So even if we use asyncio.create_task, shouldn't the awaits inside the tasks' bodies still block the process until they finish their jobs, and thus give us 2 seconds for the program to end?
If that wasn't the case, can you help me understand the asyncio.create_task?
What I know:
await is a blocking process
await executes coroutine function and task object
await makes it possible to pause a coroutine (I don't quite understand this either)
create_task creates a task object and then schedules and executes it as soon as possible
What I am expecting:
I hope I can get a simple but effective answer about how asyncio.create_task conducts its work, using my sample code.
Perhaps it will help to think in the following way.
You cannot understand what await does until you understand what an event loop is. This line:
asyncio.run(main())
creates and executes an event loop, which is basically an infinite loop with some methods for allowing an exit - a "semi-infinite" loop, so to speak. Until that loop exits, it will be entirely responsible for executing the program. (Here I am assuming that your program has only a single thread and a single process - I'm not talking about concurrency in any other form.) Each unit of code that can run within an event loop is called a "Task." The idea of the loop is that it can run multiple Tasks by switching from one to another, thus giving the illusion that the CPU is doing more than one thing at a time.
The asyncio.run() call does a second thing: it creates a Task, main(). At that moment, it's the only Task. The event loop begins to run the Task at its first line. Initially it runs just like any other function:
async def main():
    task1 = asyncio.create_task(delayer())
    task2 = asyncio.create_task(delayer())
    await task1
    await task2
It creates two more tasks, task1 and task2. Now there are 3 Tasks but only one of them can be active. That's still main(). Then you come to this line:
await task1
The await keyword is what allows this whole rigmarole to work. It is an instruction to the event loop to suspend the active task right here, at this point, and possibly allow another Task to become the active one. So to address your first bullet point, await is neither "blocking" nor is it a "process". Its purpose is to mark a point at which the event loop gets control back from the active Task.
There is another thing happening here. The object that follows the await is called, unimaginatively, an "awaitable" object. Its crucial property is whether or not it is "done." The event loop keeps track of this object; as the loop cycles through its Tasks it will keep checking this object. If it's not done, main() doesn't resume. (This isn't exactly how it's implemented because that would be inefficient, but it's conceptually what's happening.) If you want to say that the await is "blocking" main() until task1 is finished, that's sort-of true; but "blocking" has a technical meaning so it's not the best word to use. In any case, the event loop is not "blocked" at all - it can keep running other Tasks until the awaitable task1 is done. After task1 becomes "done" and main() gets its turn to be the active task, execution continues to the next line of code.
Your second bullet point, "await executes coroutine function and task object" is not correct. await doesn't execute anything. As I said, it just marks a point where the Task gets suspended and the event loop gets control back. Its awaitable determines when the Task can be resumed.
You say, "await makes [it] possible to pause coroutine process". Not quite right - it ALWAYS suspends the current Task. Whether or not there is a significant delay in the Task's execution depends on whether there are other Tasks that are ready to take over, and also the state of its awaitable.
"create_task creates task object and then schedule and execute it as soon as possible." Correct. But "as soon as possible" means the next time the current Task hits an await expression. Other Tasks may get a turn to run first, before the new Task gets a chance to start. Those details are up to the implementation of the event loop. But eventually the new Task will get a turn.
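That last point can be demonstrated with a small sketch (the names child, main and the events list are made up for illustration): a newly created Task does not start running until the creating Task reaches an await.

```python
import asyncio

events = []

async def child():
    events.append("child started")

async def main():
    t = asyncio.create_task(child())  # scheduled, but not running yet
    events.append("before await")
    await asyncio.sleep(0)            # suspension point: the child gets its turn
    events.append("after await")
    await t

asyncio.run(main())
print(events)  # ['before await', 'child started', 'after await']
```

"before await" comes first because main() keeps the CPU until it hits its first await expression.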
In the comments you ask, "Is it safe if I say that plain await, not being involved in any event loop or any kind of it, works in blocking manner?" It's absolutely not safe to say that. First of all, there is no such thing as a "plain await". Your task must wait FOR something, otherwise how would the event loop know when to resume? An await without an event loop is either a syntax error or a runtime error - it makes no sense, because await is a point where the Task and the event loop interact. The main point is that event loops and await expression are intimately related: an await without an event loop is an error; an event loop without any await expressions is useless.
The closest you can come to a plain await is this expression:
await asyncio.sleep(0)
which has the effect of suspending the current Task momentarily, giving the event loop a chance to run other tasks, resuming this Task as soon as possible.
One other point is that the code:
await task1
is an expression which has a value, in this case the returned value from task1. Since your task1 doesn't return anything this will be None. But if your delayer function looked like this:
async def delayer():
    await asyncio.sleep(1)
    return "Hello"
then in main() you could write:
print(await task1)
and you would see "Hello" on the console.

Is this a good alternative of asyncio.sleep

I decided not to use asyncio.sleep() and tried to create my own coroutine function, as shown below. Since time.sleep is an I/O-bound function, I thought this would print 7 seconds. But it prints 11 seconds.
import time
import asyncio

async def my_sleep(delay):
    time.sleep(delay)

async def main():
    start = time.time()
    await asyncio.gather(my_sleep(4), my_sleep(7))
    print("Took", time.time()-start, "seconds")

asyncio.run(main())
# Expected: Took 7 seconds
# Got: Took 11.011508464813232 seconds
Though if I write similar code with threads, it does take 7 seconds. Do the Task objects created by asyncio.gather not recognize time.sleep as an I/O-bound operation the way threads do? Please explain why this is happening.
time.sleep is a blocking operation from the event loop's point of view. Writing async in the definition of the function makes no difference, because the blocking call never yields control back to the event loop (there is no await).
These two questions might help you understand more:
Python 3.7 - asyncio.sleep() and time.sleep()
Run blocking and unblocking tasks together with asyncio
This would not work for you because time.sleep is a synchronous function.
From the 'perspective' of the event loop my_sleep might as well be doing a heavy computation within an async function, never yielding the execution context while working.
The first telltale sign of this is that you're not using an await when calling time.sleep.
Making a synchronous function behave as an async one is not trivial, but the common approach is moving the function call to worker threads and awaiting the results.
I'd recommend looking at the solution of anyio, they implemented a run_sync function which does exactly that.
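Since Python 3.9 the standard library ships exactly that worker-thread approach as asyncio.to_thread. A minimal sketch of the asker's code using it (delays shortened so it finishes quickly; the point is that the two blocking sleeps now overlap):

```python
import asyncio
import time

async def my_sleep(delay):
    # time.sleep still blocks, but it now blocks a worker thread,
    # not the event loop itself.
    await asyncio.to_thread(time.sleep, delay)

async def main():
    start = time.time()
    await asyncio.gather(my_sleep(0.4), my_sleep(0.7))
    return time.time() - start

elapsed = asyncio.run(main())
print(f"Took {elapsed:.2f} seconds")  # ~0.70, not 1.10
```

With the blocking calls moved off the loop, gather overlaps them and the total is the longer delay, which is the behaviour the asker expected.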

Order of execution in async function. How is it determined?

When I run the following asynchronous code:
from asyncio import create_task, sleep, run

async def print_(delay, x):
    print(f'start: {x}')
    await sleep(delay)
    print(f'end: {x}')

async def main():
    slow_task = create_task(print_(2, 'slow task'))
    fast_task = create_task(print_(1, 'fast task'))
    # The order of execution here is strange:
    print(0)
    await slow_task
    print(1)
    await fast_task

run(main())
I get an unexpected order of execution:
0
start: slow task
start: fast task
end: fast task
end: slow task
1
What exactly is happening?
How can I predict the order of execution?
What I find strange is that print(1) is ignored until all tasks are finished. To my understanding, the code runs as expected until it reaches an await; then the event loop switches among any other awaitables it finds down the line, which it prioritizes. Right?
That's what I find surprising. I'd expect it to run print(1) before any task is complete. Why doesn't it?
Is that the standard behavior or does it vary? If so, what does it vary upon?
If you could go into detail about how the event loop works alongside the rest of the code, that'd be great.
Let's walk through this:
You create asynchronous processes for a fast task and a slow task at the same time; even the "fast" one will take a significant amount of time.
Immediately after creating them, you print 0, so this becomes your first output.
You call await slow_task, passing control to the event loop until slow_task finishes.
Because slow_task was created first, its first step was scheduled first, so it starts first and start: slow task is printed.
Because slow_task contains an await sleep(2), it passes control back to the event loop, which finds fast_task as ready to operate and starts it, so start: fast_task is printed.
Because fast_task's await sleep(1) finishes first, fast_task completes, and end: fast_task is printed. Because we're awaiting slow_task, not fast_task, we remain in the event loop.
Finally, the slow task finishes, so it prints end: slow task. Because this is what we were awaiting, control returns to main().
After the slow task has finished, you print 1, so this becomes your last output.
Finally, you await the fast task; it already finished earlier, while you were awaiting the slow task, so this returns immediately.
Everything is exactly as one would expect.
That said, for the more general case, you can't expect order-of-operations in an async program to be reliably deterministic in real-world cases where you're waiting on I/O operations that can take a variable amount of time.
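If the goal were to see 1 printed before the slow task ends, awaiting the fast task first would achieve it. A sketch varying the asker's code (delays shortened, and output collected in a list for clarity):

```python
from asyncio import create_task, sleep, run

order = []

async def print_(delay, x):
    order.append(f'start: {x}')
    await sleep(delay)
    order.append(f'end: {x}')

async def main():
    slow_task = create_task(print_(0.2, 'slow task'))
    fast_task = create_task(print_(0.1, 'fast task'))
    order.append('0')
    await fast_task       # main resumes as soon as the fast task ends
    order.append('1')     # recorded before the slow task finishes
    await slow_task

run(main())
print(order)
# ['0', 'start: slow task', 'start: fast task', 'end: fast task', '1', 'end: slow task']
```

The Task you await determines when the awaiting coroutine resumes; the other Tasks keep running in the background either way.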

Execution order of tasks scheduled on an event loop (Python 3.10)

I am still learning the asyncio module in Python 3.10. Below is some code I wrote while trying to understand how to run coroutines as tasks on an event loop. The output I am getting is different from what I expected. Please help me understand what I am doing wrong or if there is something I misunderstood.
import asyncio

async def f(x):
    try:
        print(f"Inside f({x})...")
        await asyncio.sleep(x)
        print(f'slept for {x} seconds!!')
        return x
    except asyncio.CancelledError:  # Execute this part only when the task is cancelled.
        print(f'Cancelled({x})')

async def main():
    loop = asyncio.get_running_loop()
    task1 = loop.create_task(f(1), name="task1")
    task2 = loop.create_task(f(2), name="task2")
    task3 = loop.create_task(f(3), name="task3")
    print(task1.done())
    print(task2.done())
    print(task3.done())

asyncio.run(main())
Output:
False
False
False
Inside f(1)...
Inside f(2)...
Inside f(3)...
Cancelled(3)
Cancelled(1)
Cancelled(2)
I expected this code to exit main() immediately after print(task3.done()) and to run only the exception-handling part of the coroutine function f(x), but the event loop runs the body of each of f(1), f(2) and f(3) exactly once before actually exiting.
My doubts are
Why do all the tasks on the event loop run once before getting cancelled?
What is the execution order (if there is any, is it fixed or random?)
Edit:
I first tried task1 = asyncio.create_task(f(1), name="task1"), then I used loop.create_task(). Both give the same output (which is to be expected, I guess).
I think you should change the line "task1 = loop.create_task(f(1), name="task1")" to be "task1 = asyncio.create_task(f(1), name="task1")"
Your program finishes main() before your asynchronous tasks are done, so asyncio.run() cancels them all on shutdown. Each task's first step was queued on the loop when the task was created, and those callbacks run before the loop stops, which is why each coroutine executes up to its first await before being cancelled. The call to done() just prints that they're not done yet.
You need to add await task1, etc., in addition to the previous answer, if you want to give your tasks the chance to finish. Or just
await asyncio.gather(task1, task2, task3) to wait for all of them in one call.
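Putting that together, a sketch of the asker's code with the awaits added (sleep times shortened so it finishes quickly):

```python
import asyncio

async def f(x):
    try:
        print(f"Inside f({x})...")
        await asyncio.sleep(x)
        return x
    except asyncio.CancelledError:
        print(f'Cancelled({x})')

async def main():
    task1 = asyncio.create_task(f(0.1), name="task1")
    task2 = asyncio.create_task(f(0.2), name="task2")
    task3 = asyncio.create_task(f(0.3), name="task3")
    # Awaiting the tasks keeps them from being cancelled at shutdown.
    return await asyncio.gather(task1, task2, task3)

results = asyncio.run(main())
print(results)  # [0.1, 0.2, 0.3]
```

gather returns the results in the order the awaitables were passed, regardless of which task finished first.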

How to await (without using the await keyword) for an asyncio task to complete and retrieve the result?

As we know, inside an async function, code will not continue executing until the awaited coroutine finishes executing:
await coro()
# or
await asyncio.gather(coro_1(), coro_2())
# Code below will run AFTER the coroutines above finish running = desired effect
Outside (or inside) an async function it is possible to add coroutines to an event loop with asyncio.create_task(coro()), which returns a Task object.
In my scenario the tasks are added to an >>> existing running loop <<<, and the code that follows will continue executing without waiting for that task/coroutine to finish. Of course, it is possible to execute a callback function when the task finishes with task_obj.add_done_callback(callback_func).
asyncio.create_task(coro()) is especially useful in Jupyter Notebooks / Jupyter Lab because Jupyter runs an event loop in the background. Calling asyncio.get_event_loop() will retrieve Jupyter's event loop. Inside a Jupyter Notebook we cannot call asyncio.run(coro()) or loop.run_until_complete() because the loop is already running (except if we run the async code in a separate process and create a new event loop inside that process, but this is not the use case I'm looking for).
So my question is how can I await (from outside an async function = without using await keyword) on an asyncio task (or group of tasks) to finish and retrieve results before executing the code that follows?
tsk_obj = asyncio.create_task(coro()) # adds task to a (in my case) running event loop
# QUESTION:
# What can be done to await the result before executing the rest of the code below?
print('Result is:', tsk_obj.result())
# ^-- this of course will execute immediately = NOT the desired effect
# (the coroutine is very likely still running on the event loop and the result is not ready)
Without concurrent execution this is logically impossible. To wait for something outside an async function means to block the current thread. But if the current thread is blocked the task cannot possibly run. You can actually write such code (waiting on a threading.Event object, for example, which is set at the end of your task). But the program will deadlock for the reason I gave.
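The concurrent execution the answer calls for is usually supplied by running the loop in a different thread than the waiting code; asyncio.run_coroutine_threadsafe then returns a concurrent.futures.Future that synchronous code can block on without deadlocking the loop. A sketch (coro here is a stand-in for the asker's coroutine):

```python
import asyncio
import threading

async def coro():
    await asyncio.sleep(0.1)
    return 42

# Run an event loop forever in a background thread.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

# Submit the coroutine from the synchronous main thread, then block
# this thread (not the event loop!) until the result is ready.
future = asyncio.run_coroutine_threadsafe(coro(), loop)
result = future.result()
print('Result is:', result)  # Result is: 42

loop.call_soon_threadsafe(loop.stop)
```

This works precisely because the thread that blocks on future.result() is not the thread running the loop; in a single-threaded program the same wait would deadlock, as the answer explains.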
