I was wondering what exactly happens when we await a coroutine in async Python code, for example:
await send_message(string)
(1) send_message is added to the event loop, and the calling coroutine gives up control to the event loop, or
(2) We jump directly into send_message
Most explanations I read point to (1), as they describe the calling coroutine as exiting. But my own experiments suggest (2) is the case: I tried to have a coroutine run after the caller but before the callee and could not achieve this.
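For example, something like the following sketch (a hypothetical reconstruction of the experiment; the names other and send_message are placeholders) shows the behaviour I observed: the task created before the await never runs between the caller and the callee, only at an actual yield point.

import asyncio

async def send_message(s):
    print("callee:", s)           # contains no suspension point

async def other():
    print("other coroutine ran")

async def caller():
    asyncio.create_task(other())  # scheduled, but cannot run until we suspend
    print("caller: before await")
    await send_message("hi")      # jumps straight into send_message ...
    print("caller: after await")
    await asyncio.sleep(0)        # ... only this yield lets `other` run

asyncio.run(caller())
# caller: before await
# callee: hi
# caller: after await
# other coroutine ran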
Disclaimer: Open to correction (particularly as to details and correct terminology) since I arrived here looking for the answer to this myself. Nevertheless, the research below points to a pretty decisive "main point" conclusion:
Correct answer to the OP: No, await (per se) does not yield to the event loop; yield yields to the event loop. Hence, for the case given: "(2) We jump directly into send_message". In particular, certain yield expressions are the only points, at bottom, where async tasks can actually be switched out (in terms of nailing down the precise spot where Python code execution can be suspended).
To be proven and demonstrated: 1) by theory/documentation, 2) by implementation code, 3) by example.
By theory/documentation
PEP 492: Coroutines with async and await syntax
While the PEP is not tied to any specific Event Loop implementation, it is relevant only to the kind of coroutine that uses yield as a signal to the scheduler, indicating that the coroutine will be waiting until an event (such as IO) is completed. ...
[await] uses the yield from implementation [with an extra step of validating its argument.] ...
Any yield from chain of calls ends with a yield. This is a fundamental mechanism of how Futures are implemented. Since, internally, coroutines are a special kind of generators, every await is suspended by a yield somewhere down the chain of await calls (please refer to PEP 3156 for a detailed explanation). ...
Coroutines are based on generators internally, thus they share the implementation. Similarly to generator objects, coroutines have throw(), send() and close() methods. ...
The vision behind existing generator-based coroutines and this proposal is to make it easy for users to see where the code might be suspended.
In context, "easy for users to see where the code might be suspended" seems to refer to the fact that in synchronous code yield is the place where execution can be "suspended" within a routine allowing other code to run, and that principle now extends perfectly to the async context wherein a yield (if its value is not consumed within the running task but is propagated up to the scheduler) is the "signal to the scheduler" to switch out tasks.
More succinctly: where does a generator yield control? At a yield. Coroutines (including those using async and await syntax) are generators, hence likewise.
And it is not merely an analogy, in implementation (see below) the actual mechanism by which a task gets "into" and "out of" coroutines is not anything new, magical, or unique to the async world, but simply by calling the coro's <generator>.send() method. That was (as I understand the text) part of the "vision" behind PEP 492: async and await would provide no novel mechanism for code suspension but just pour async-sugar on Python's already well-beloved and powerful generators.
And
PEP 3156: The "asyncio" module
The loop.slow_callback_duration attribute controls the maximum execution time allowed between two yield points before a slow callback is reported [emphasis in original].
That is, an uninterrupted segment of code (from the async perspective) is demarcated as that between two successive yield points (whose values reached up to the running Task level (via an await/yield from tunnel) without being consumed within it).
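(As a concrete aside, here is a small sketch, assuming asyncio's debug mode, of that "segment between two yield points" being exactly what slow_callback_duration times, however many awaits happen inside it.)

import asyncio
import time

async def blocker():
    await asyncio.sleep(0)    # suspension point #1
    time.sleep(0.3)           # await-free work: all one "callback" from the loop's view
    await asyncio.sleep(0)    # suspension point #2

async def main():
    asyncio.get_running_loop().slow_callback_duration = 0.1  # report anything over 100 ms
    await blocker()

asyncio.run(main(), debug=True)   # logs: "Executing <Task ...> took 0.3xx seconds"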
And this:
The scheduler has no public interface. You interact with it by using yield from future and yield from task.
Objection: "That says 'yield from', but you're trying to argue that the task can only switch out at a yield itself! yield from and yield are different things, my friend, and yield from itself doesn't suspend code!"
Ans: Not a contradiction. The PEP is saying you interact with the scheduler by using yield from future/task. But as noted above in PEP 492, any chain of yield from (~aka await) ultimately reaches a yield (the "bottom turtle"). In particular (see below), yield from future does in fact yield that same future after some wrapper work, and that yield is the actual "switch out point" where another task takes over. But it is incorrect for your code to directly yield a Future up to the current Task because you would bypass the necessary wrapper.
The objection having been answered, and its practical coding considerations being noted, the point I wish to make from the above quote remains: that a suitable yield in Python async code is ultimately the one thing which, having suspended code execution in the standard way that any other yield would do, now further engages the scheduler to bring about a possible task switch.
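(A small sketch, for illustration only, of the "bypassing the wrapper" mistake just mentioned; the helper bad_wait is hypothetical. The Task notices that the Future was yielded directly rather than via Future.__await__ and responds with a RuntimeError.)

import asyncio
import types

@types.coroutine
def bad_wait(fut):
    return (yield fut)    # yields the Future directly, skipping fut.__await__

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    loop.call_later(0.1, fut.set_result, "done")
    await bad_wait(fut)   # RuntimeError: yield was used instead of yield from ...

asyncio.run(main())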
By implementation code
asyncio/futures.py
class Future:
    ...
    def __await__(self):
        if not self.done():
            self._asyncio_future_blocking = True
            yield self  # This tells Task to wait for completion.
        if not self.done():
            raise RuntimeError("await wasn't used with future")
        return self.result()  # May raise too.

    __iter__ = __await__  # make compatible with 'yield from'.
Paraphrase: The line yield self is what tells the running task to sit out for now and let other tasks run, coming back to this one sometime after self is done.
Almost all of your awaitables in asyncio world are (multiple layers of) wrappers around a Future. The event loop remains utterly blind to all higher level await awaitable expressions until the code execution trickles down to an await future or yield from future and then (as seen here) calls yield self, which yielded self is then "caught" by none other than the Task under which the present coroutine stack is running thereby signaling to the task to take a break.
Possibly the one and only exception to the above "code suspends at yield self within await future" rule, in an asyncio context, is the potential use of a bare yield such as in asyncio.sleep(0). And since the sleep function is a topic of discourse in the comments of this post, let's look at that.
asyncio/tasks.py
@types.coroutine
def __sleep0():
    """Skip one event loop run cycle.

    This is a private helper for 'asyncio.sleep()', used
    when the 'delay' is set to 0.  It uses a bare 'yield'
    expression (which Task.__step knows how to handle)
    instead of creating a Future object.
    """
    yield


async def sleep(delay, result=None, *, loop=None):
    """Coroutine that completes after a given time (in seconds)."""
    if delay <= 0:
        await __sleep0()
        return result

    if loop is None:
        loop = events.get_running_loop()
    else:
        warnings.warn("The loop argument is deprecated since Python 3.8, "
                      "and scheduled for removal in Python 3.10.",
                      DeprecationWarning, stacklevel=2)

    future = loop.create_future()
    h = loop.call_later(delay,
                        futures._set_result_unless_cancelled,
                        future, result)
    try:
        return await future
    finally:
        h.cancel()
Note: We have here the two interesting cases at which control can shift to the scheduler:
(1) The bare yield in __sleep0 (when called via an await).
(2) The yield self immediately within await future.
The crucial line (for our purposes) in asyncio/tasks.py is where Task.__step runs its top-level coroutine via result = self._coro.send(None) and recognizes four(ish) cases:
(1) result = None is generated by the coro (which, again, is a generator): the task "relinquishes control for one event loop iteration".
(2) result = future is generated within the coro, with further magic member field evidence that the future was yielded in a proper manner from out of Future.__iter__ == Future.__await__: the task relinquishes control to the event loop until the future is complete.
(3) A StopIteration is raised by the coro indicating the coroutine completed (i.e. as a generator it exhausted all its yields): the final result of the task (which is itself a Future) is set to the coroutine return value.
(4) Any other Exception occurs: the task's exception is set accordingly (via set_exception).
Modulo details, the main point for our concern is that coroutine segments in an asyncio event loop ultimately run via coro.send(). Initial startup and final termination aside, send() proceeds precisely from the last yield value it generated to the next one.
By example
import asyncio
import types

def task_print(s):
    print(f"{asyncio.current_task().get_name()}: {s}")

async def other_task(s):
    task_print(s)

class AwaitableCls:
    def __await__(self):
        task_print(" 'Jumped straight into' another `await`; the act of `await awaitable` *itself* doesn't 'pause' anything")
        yield
        task_print(" We're back to our awaitable object because that other task completed")
        asyncio.create_task(other_task("The event loop gets control when `yield` points (from an iterable coroutine) propagate up to the `current_task` through a suitable chain of `await` or `yield from` statements"))

async def coro():
    task_print(" 'Jumped straight into' coro; the `await` keyword itself does nothing to 'pause' the current_task")
    await AwaitableCls()
    task_print(" 'Jumped straight back into' coro; we have another pending task, but leaving an `__await__` doesn't 'pause' the task any more than entering the `__await__` does")

@types.coroutine
def iterable_coro(context):
    task_print(f"`{context} iterable_coro`: pre-yield")
    yield None  # None or a Future object are the only legitimate yields to the task in asyncio
    task_print(f"`{context} iterable_coro`: post-yield")

async def original_task():
    asyncio.create_task(other_task("Aha, but a (suitably unconsumed) *`yield`* DOES 'pause' the current_task allowing the event scheduler to `_wakeup` another task"))
    task_print("Original task")
    await coro()
    task_print("'Jumped straight out of' coro. Leaving a coro, as with leaving/entering any awaitable, doesn't give control to the event loop")
    res = await iterable_coro("await")
    assert res is None
    asyncio.create_task(other_task("This doesn't run until the very end because the generated None following the creation of this task is consumed by the `for` loop"))
    for y in iterable_coro("for y in"):
        task_print(f"But 'ordinary' `yield` points (those which are consumed by the `current_task` itself) behave as ordinary without relinquishing control at the async/task-level; `y={y}`")
    task_print("Done with original task")

asyncio.get_event_loop().run_until_complete(original_task())
Run in Python 3.8, this produces:
Task-1: Original task
Task-1: 'Jumped straight into' coro; the await keyword itself does nothing to 'pause' the current_task
Task-1: 'Jumped straight into' another await; the act of await awaitable itself doesn't 'pause' anything
Task-2: Aha, but a (suitably unconsumed) yield DOES 'pause' the current_task allowing the event scheduler to _wakeup another task
Task-1: We're back to our awaitable object because that other task completed
Task-1: 'Jumped straight back into' coro; we have another pending task, but leaving an __await__ doesn't 'pause' the task any more than entering the __await__ does
Task-1: 'Jumped straight out of' coro. Leaving a coro, as with leaving/entering any awaitable, doesn't give control to the event loop
Task-1: await iterable_coro: pre-yield
Task-3: The event loop gets control when yield points (from an iterable coroutine) propagate up to the current_task through a suitable chain of await or yield from statements
Task-1: await iterable_coro: post-yield
Task-1: for y in iterable_coro: pre-yield
Task-1: But 'ordinary' yield points (those which are consumed by the current_task itself) behave as ordinary without relinquishing control at the async/task-level; y=None
Task-1: for y in iterable_coro: post-yield
Task-1: Done with original task
Task-4: This doesn't run until the very end because the generated None following the creation of this task is consumed by the for loop
Indeed, exercises such as the following can help one's mind decouple the functionality of async/await from the notion of "event loops" and such. The former is conducive to nice implementations and usages of the latter, but you can use async and await just as specially syntaxed generator stuff without any "loop" (whether asyncio or otherwise) whatsoever:
import types  # no asyncio, nor any other loop framework

async def f1():
    print(1)
    print(await f2(), '= await f2()')
    return 8

@types.coroutine
def f2():
    print(2)
    print((yield 3), '= yield 3')
    return 7

class F3:
    def __await__(self):
        print(4)
        print((yield 5), '= yield 5')
        print(10)
        return 11

task1 = f1()
task2 = F3().__await__()

""" You could say calls to send() represent our
"manual task management" in this script.
"""
print(task1.send(None), '= task1.send(None)')
print(task2.send(None), '= task2.send(None)')
try:
    print(task1.send(6), 'try task1.send(6)')
except StopIteration as e:
    print(e.value, '= except task1.send(6)')
try:
    print(task2.send(9), 'try task2.send(9)')
except StopIteration as e:
    print(e.value, '= except task2.send(9)')
produces
1
2
3 = task1.send(None)
4
5 = task2.send(None)
6 = yield 3
7 = await f2()
8 = except task1.send(6)
9 = yield 5
10
11 = except task2.send(9)
Yes, await passes control back to the asyncio event loop, and allows it to schedule other async functions.
Another way is
await asyncio.sleep(0)
Related
When can an asyncio Task be canceled? Or, more generally, when can an asyncio loop switch to a different Task? It's been really hard for me to use cancellation in my asyncio programs, because I don't know when a CancelledError can get thrown.
I was working on a bit of code earlier, with a context manager kinda like this:
#! /usr/bin/python3
import asyncio


class MyContextManager:
    def __init__(self):
        self.locked = False

    async def __aenter__(self):
        self.locked = True

    async def __aexit__(self, *_):
        self.locked = False


async def main():
    async with MyContextManager():
        print("Doing something that needs locking")


if __name__ == "__main__":
    asyncio.run(main())
What happens if the task is cancelled during __aenter__? I need to make sure that self.locked is false whenever the async with section exits. (I'm using self.locked as a simplification of a more complex acquire/release algorithm here, which includes some steps that are necessarily async.)
The docs regarding async with say:
The following code:
async with EXPRESSION as TARGET:
    SUITE
is semantically equivalent to:
manager = (EXPRESSION)
aenter = type(manager).__aenter__
aexit = type(manager).__aexit__
value = await aenter(manager)
hit_except = False

try:
    TARGET = value
    SUITE
except:
    hit_except = True
    if not await aexit(manager, *sys.exc_info()):
        raise
finally:
    if not hit_except:
        await aexit(manager, None, None, None)
If I'm reading this right, this means that there's a window between when await aenter is called and when the try:finally: block is set up. If a task is canceled at the time that aenter returns, then aexit will not be called.
Can a task be canceled on exit from an async function? Well, let's look at the docs for asyncio.shield:
The statement:
res = await shield(something())
is equivalent to:
res = await something()
except that if the coroutine containing it is cancelled, the Task running in something() is not cancelled. From the point of view of something(), the cancellation did not happen. Although its caller is still cancelled, so the “await” expression still raises a CancelledError.
This seems to imply that an await expression can raise a CancelledError, even if the task is not canceled during the evaluation of the underlying expression.
As an opposing view, when I looked at the source code for asyncio.shield, it looks like the CancelledError is raised within asyncio.shield, rather than at the time that the await expression returns.
The biggest advantage of coroutines over threads is that it's much easier to reason about parallelism: synchronous operations will complete serially, and it's only when you await that anything can change out from under you. I use this reasoning a lot in my code. But it's not clear exactly when that await expression can change something out from under you.
An asyncio task can only be canceled on the await, as that's the only point at which the event loop can be running instead of your code.
The event loop is given control when one of the tasks you await yields to it (this is usually done by something like awaiting a sleep, or a future). Do mind that this is not necessarily done for every await, so an await doesn't guarantee that the event loop will actually run (the simplest example would be a coro that only returns a constant).
With the python equivalent async context manager code, the only point where something can be cancelled before the try block is set up is within __aenter__ itself, as that's the only point that may actually yield control. If something in the __aenter__ is cancelled and propagates up as an exception, it wouldn't make sense to call the __aexit__ either.
For shield, what the docs are saying is that on cancellation, the shield's task will be cancelled, which is shown to the caller by raising CancelledError on await, but the task it's wrapping will continue executing without ever being aware of the shielded cancellation.
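(A minimal sketch of that shield behaviour, assuming asyncio on Python 3.8+; the names caller and something here are just placeholders.)

import asyncio

async def something():
    await asyncio.sleep(0.2)
    print("something() completed; it never saw the cancellation")

async def caller():
    try:
        await asyncio.shield(something())
    except asyncio.CancelledError:
        print("caller's await raised CancelledError")
        raise                       # let the cancellation propagate normally

async def main():
    task = asyncio.create_task(caller())
    await asyncio.sleep(0.05)
    task.cancel()                   # cancel the caller, not something()
    try:
        await task
    except asyncio.CancelledError:
        pass
    await asyncio.sleep(0.3)        # give the shielded something() time to finish

asyncio.run(main())
# caller's await raised CancelledError
# something() completed; it never saw the cancellation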
I encountered an unusual error when using an async generator, as in the following example.
import asyncio

async def demo():
    async def get_data():
        for i in range(5):  # loop: for or while
            await asyncio.sleep(1)  # some IO code
            yield i

    datas = get_data()

    await asyncio.gather(
        anext(datas),
        anext(datas),
        anext(datas),
        anext(datas),
        anext(datas),
    )

if __name__ == '__main__':
    # asyncio.run(main())
    asyncio.run(demo())
Console output:
2022-05-11 23:55:24,530 DEBUG asyncio 29180 30600 Using proactor: IocpProactor
Traceback (most recent call last):
File "E:\workspace\develop\python\crawlerstack-proxypool\demo.py", line 77, in <module>
asyncio.run(demo())
File "D:\devtools\Python310\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "D:\devtools\Python310\lib\asyncio\base_events.py", line 641, in run_until_complete
return future.result()
File "E:\workspace\develop\python\crawlerstack-proxypool\demo.py", line 66, in demo
await asyncio.gather(
RuntimeError: anext(): asynchronous generator is already running
Situation description: I have loop logic that fetches a batch of data from Redis at a time, and I want to use yield to return each result. But this error occurs when I create concurrent tasks.
Is there a good solution to this situation? I don't want to change the way I'm using it now; rather, I'd like to know whether I can tell that the generator is already running, or use something like a lock, so that I can wait for it to finish before executing anext.
Maybe my current logic is unreasonable, but I would also like to understand the critical details here, so I can appreciate how serious the problem is.
Thank you for your help.
TL;DR: the right way
Async generators are ill-suited to parallel consumption. See my explanations below. As a proper workaround, use asyncio.Queue for communication between producers and consumers:
queue = asyncio.Queue()

async def producer():
    for item in range(5):
        await asyncio.sleep(random.random())  # imitate async fetching
        print('item fetched:', item)
        await queue.put(item)

async def consumer():
    while True:
        item = await queue.get()
        await asyncio.sleep(random.random())  # imitate async processing
        print('item processed:', item)

await asyncio.gather(producer(), consumer(), consumer())
The above code snippet works well for an infinite stream of items: for example, a web server, which runs forever serving requests from clients. But what if we need to process a finite number of items? How should consumers know when to stop?
This deserves another question on Stack Overflow to cover all alternatives, but the simplest option is a sentinel approach, described below.
Sentinel: finite data streams approach
Introduce a sentinel = object(). When all items from the external data source have been fetched and put into the queue, the producer must push as many sentinels into the queue as you have consumers. Once a consumer fetches the sentinel, it knows it should stop: if item is sentinel: break from the loop.
sentinel = object()
consumers_count = 2

async def producer():
    ...  # the same code as above
    if new_item is None:  # if no new data
        for _ in range(consumers_count):
            await queue.put(sentinel)

async def consumer():
    while True:
        ...  # the same code as above
        if item is sentinel:
            break

await asyncio.gather(
    producer(),
    *(consumer() for _ in range(consumers_count)),
)
TL;DR [2]: a dirty workaround
Since you'd rather not change your async generator approach, here is an asyncgen-based alternative. To resolve this issue (in a simple-yet-dirty way), you may wrap the source async generator with a lock:
async def with_lock(agen, lock: asyncio.Lock):
    while True:
        async with lock:  # only one consumer is allowed to read
            try:
                yield await anext(agen)
            except StopAsyncIteration:
                break

lock = asyncio.Lock()  # a common lock for all consumers

await asyncio.gather(
    # every consumer must have its own "wrapped" generator
    anext(with_lock(datas, lock)),
    anext(with_lock(datas, lock)),
    ...
)
This will ensure only one consumer awaits for an item from the generator at a time. While this consumer awaits, other consumers are being executed, so parallelization is not lost.
Roughly equivalent code using async for (it looks a little smarter):
async def with_lock(agen, lock: asyncio.Lock):
    await lock.acquire()
    async for item in agen:
        lock.release()
        yield item
        await lock.acquire()
    lock.release()
However, this code only handles the async generator's anext method, whereas the generator API also includes the aclose and athrow methods. See the explanation below.
Though you could add support for these to the with_lock function too, I would recommend either subclassing the generator and handling the lock support inside, or, better, using the Queue-based approach described above.
See contextlib.aclosing for some inspiration.
Explanation
Both sync and async generators have a special attribute: .gi_running (for regular generators) and .ag_running (for async ones). You may discover them by executing dir on a generator:
>>> dir(i for i in range(0))
[..., 'gi_running', ...]
They are set to True when a generator's .__next__ or .__anext__ method is executed (next(...) and anext(...) are just syntactic sugar for those).
This prevents re-executing next(...) on a generator, when another next(...) call on the same generator is already being executed: if the running flag is True, an exception is raised (for a sync generator it raises ValueError: generator already executing).
So, returning to your example, when you run await anext(datas) (via asyncio.gather), the following happens:
datas.ag_running is set to True.
An execution flow steps into the datas.__anext__ method.
Once an inner await statement is reached inside of the __anext__ method (await asyncio.sleep(1) in your case), asyncio's loop switches to another consumer.
Now, another consumer tries to call await anext(datas) too, but since datas.ag_running flag is still set to True, this results in a RuntimeError.
Why is this flag needed?
A generator's execution can be suspended and resumed. But only at yield statements. Thus, if a generator is paused at an inner await statement, it cannot be "resumed", because its state disallows it.
That's why a parallel next/anext call to a generator raises an exception: it is not ready to be resumed, it is already running.
athrow and aclose
Generators' API (both sync and async) includes not only the send/asend method for iteration, but also:
close/aclose to release generator-allocated resources (e.g. a database connection) on exit or an exception
and throw/athrow to inform generator that it has to handle an exception.
aclose and athrow are async methods too. This means that if two consumers try to close/throw an underlying generator in parallel, you will encounter the same issue, since the generator will be closed (or have an exception thrown into it) while it is already closing (or already handling an exception).
Sync generators example
Though this is a frequent case for async generators, reproducing it for sync generators is not as straightforward, since synchronous next(...) calls are rarely interrupted.
One of the ways to interrupt a sync generator is to run multithreaded code with multiple consumers (running in parallel threads) reading from a single generator. In that case, while the generator's code is interrupted during a next call, all other consumers' parallel attempts to call next will result in an exception.
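(A small sketch of that multithreaded scenario, with hypothetical names; one thread's next() call is still running when the second thread's call arrives.)

import threading
import time

def slow_gen():
    for i in range(3):
        time.sleep(0.5)  # keep __next__ "running" long enough for the calls to collide
        yield i

g = slow_gen()

def consume(name):
    try:
        print(name, next(g))
    except ValueError as e:
        print(name, '->', e)  # ValueError: generator already executing

threads = [threading.Thread(target=consume, args=(f"thread-{n}",)) for n in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()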
Another way to achieve this is demonstrated in the generators-related PEP 255 via a self-consuming generator:
>>> def g():
... i = next(me)
... yield i
...
>>> me = g()
>>> next(me)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in g
ValueError: generator already executing
When outer next(me) is called, it sets me.gi_running to True and then executes the generator function code. A subsequent inner next(me) call leads to a ValueError.
Conclusion
Generators (especially async ones) work best when consumed by a single reader. Supporting multiple consumers is hard, since it requires patching the behaviour of all the generator's methods, and is thus discouraged.
I'd like to know what guarantees Python gives about when an event loop will switch tasks.
As I understand it, async/await is significantly different from threads in that the event loop does not switch tasks based on time slicing, meaning that unless the task yields (awaits), it will carry on indefinitely. This can actually be useful, because it is easier to manage critical sections under asyncio than with threading.
What I'm less clear about is something like the following:
async def caller():
    while True:
        await callee()

async def callee():
    pass
In this example caller repeatedly awaits, so technically it is yielding. But I'm not clear on whether this will allow other tasks on the event loop to execute, because it only yields to callee, and callee never yields.
That is, if I awaited callee inside a "critical section", even though I know it won't block, am I at risk of something else unexpected happening?
You are right to be wary. caller yields from callee, and yields to the event loop. Then the event loop decides which task to resume. Other tasks may (hopefully) be squeezed in between the calls to callee. callee needs to await an actual blocking Awaitable such as asyncio.Future or asyncio.sleep(), not a coroutine, otherwise the control will not be returned to the event loop until caller returns.
For example, the following code will finish the caller2 task before it starts working on the caller1 task, because callee2 is essentially a sync function that never awaits a blocking I/O operation; therefore, no suspension point is created, and caller2 resumes immediately after each call to callee2.
import asyncio
import time

async def caller1():
    for i in range(5):
        await callee1()

async def callee1():
    await asyncio.sleep(1)
    print(f"called at {time.strftime('%X')}")

async def caller2():
    for i in range(5):
        await callee2()

async def callee2():
    time.sleep(1)
    print(f"sync called at {time.strftime('%X')}")

async def main():
    task1 = asyncio.create_task(caller1())
    task2 = asyncio.create_task(caller2())
    await task1
    await task2

asyncio.run(main())
Result:
sync called at 19:23:39
sync called at 19:23:40
sync called at 19:23:41
sync called at 19:23:42
sync called at 19:23:43
called at 19:23:43
called at 19:23:44
called at 19:23:45
called at 19:23:46
called at 19:23:47
But if callee2 awaits as follows, task switching will happen (even if it only awaits asyncio.sleep(0)), and the tasks will run concurrently.
async def callee2():
    await asyncio.sleep(1)
    print(f"sync called at {time.strftime('%X')}")
Result:
called at 19:22:52
sync called at 19:22:52
called at 19:22:53
sync called at 19:22:53
called at 19:22:54
sync called at 19:22:54
called at 19:22:55
sync called at 19:22:55
called at 19:22:56
sync called at 19:22:56
This behavior is not necessarily intuitive, but it makes sense considering that asyncio was made to handle I/O operations and networking concurrently, not the usual synchronous Python code.
Another thing to note: this still works if the callee awaits a coroutine that, in turn, awaits an asyncio.Future, asyncio.sleep(), or another coroutine that awaits one of those things down the chain. Flow control is returned to the event loop when the blocking Awaitable is awaited. So the following also works.
async def callee2():
    await inner_callee()
    print(f"sync called at {time.strftime('%X')}")

async def inner_callee():
    await asyncio.sleep(1)
TLDR: No. Coroutines and their respective keywords (await, async with, async for) only enable suspension. Whether suspension occurs depends on the framework used, if at all.
Third-party async functions / iterators / context managers can act as
checkpoints; if you see await <something> or one of its friends, then
that might be a checkpoint. So to be safe, you should prepare for
scheduling or cancellation happening there.
[Trio documentation]
The await syntax of Python is syntactic sugar around two fundamental mechanisms: yield to temporarily suspend with a value, and return to permanently exit with a value. These are the same mechanisms that, say, a generator-based coroutine can use:
def gencoroutine():
    for i in range(5):
        yield i  # temporarily suspend
    return 5  # permanently exit
Notably, return does not imply a suspension. It is possible for a generator coroutine to never yield at all.
The await keyword (and its sibling yield from) interacts with both the yield and return mechanism:
If its target yields, await "passes on" the suspension to its own caller. This allows to suspend an entire stack of coroutines that all await each other.
If its target returns, await catches the return value and provides it to its own coroutine. This allows a value to be returned directly to a "caller", without suspension.
This means that await does not guarantee that a suspension occurs. It is up to the target of await to trigger a suspension.
By itself, an async def coroutine can only return without suspension, and await to allow suspension. It cannot suspend by itself (yield does not suspend to the event loop).
async def unyielding():
    return 2  # or `pass`
This means that awaiting plain coroutines never suspends by itself. Only specific awaitables are able to suspend.
Suspension is only possible for awaitables with a custom __await__ method. These can yield directly to the event loop.
class YieldToLoop:
    def __await__(self):
        yield    # to event loop
        return   # to awaiter
This means that await, directly or indirectly, of a framework's awaitable will suspend.
The exact semantics of suspending depend on the async framework in use. For example, whether a sleep(0) triggers a suspension or not, or which coroutine to run instead, is up to the framework. This also extends to async iterators and context managers -- for example, many async context managers will suspend either on enter or exit but not both.
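(For instance, a minimal sketch, with a hypothetical class name, of an async context manager that suspends only on exit: nothing else can run between entering the block and the first real suspension point.)

import asyncio

class SuspendOnExitOnly:
    async def __aenter__(self):
        return self             # plain return: no suspension on enter

    async def __aexit__(self, *exc):
        await asyncio.sleep(0)  # yields to the event loop only on exit
        return False

async def main():
    async with SuspendOnExitOnly():
        print("no other task can have run since entering the block")

asyncio.run(main())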
Trio
If you call an async function provided by Trio (await <something in trio>), and it doesn’t raise an exception, then it always acts as a checkpoint. (If it does raise an exception, it might act as a checkpoint or might not.)
Asyncio
sleep() always suspends the current task, allowing other tasks to run.
I'm studying Python's asyncio library and I'm having a hard time understanding how some Python functions work.
For example:
import asyncio

async def coroutine():
    print('Initialize coroutine')
    await asyncio.sleep(1)
    print('Done !')

loop = asyncio.get_event_loop()
# asyncio.ensure_future(coroutine())
# loop.create_task(coroutine())
loop.run_forever()
For example, in this code, if I use ensure_future it automatically starts executing my task, and so does create_task (in the case of create_task I at least understand that it uses the loop on which loop.create_task is called). My question is why using ensure_future already starts executing the task, and whether this is why in some cases I see code like this:
import asyncio

async def coroutine():
    print('Initialize coroutine')
    await asyncio.sleep(1)
    print('Done !')

loop = asyncio.get_event_loop()
task = asyncio.ensure_future(coroutine())
loop.run_until_complete(asyncio.wait([task]))
My question is why using ensure_future already starts executing the task
Yes, ensure_future starts executing the coroutine as soon as the event loop is resumed, even if no one awaits the returned future.
This feature of ensure_future follows from the implementation, but it also makes sense conceptually.
On the implementation level, it is a consequence of ensure_future internally calling create_task when given a coroutine object. In other words, for coroutines there is no difference between loop.create_task and asyncio.ensure_future. Otherwise the difference is that ensure_future is more general: it takes any awaitable object and converts it to a Future. If it receives a coroutine object, create_task performs the appropriate conversion and the new Task (itself a subclass of Future) is returned. If the argument to ensure_future is already a Future or one of its subclasses, it is returned unchanged.
On the conceptual level, ensure_future returns an object that represents the "future" value of a computation, i.e. it encapsulates the result of execution that happens in the background. In the case of a coroutine, running in the background means being scheduled with the event loop. Without ensuring the coroutine's execution, returning a Future would be misleading, because that future would never arrive.
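(A small sketch of those two properties, assuming Python 3.8+ asyncio: both calls schedule the coroutine immediately, and ensure_future passes an existing Task/Future through unchanged.)

import asyncio

async def work():
    await asyncio.sleep(0.1)
    return 42

async def main():
    loop = asyncio.get_running_loop()
    t1 = loop.create_task(work())        # accepts coroutine objects only
    t2 = asyncio.ensure_future(work())   # accepts any awaitable; wraps coroutines in a Task
    t3 = asyncio.ensure_future(t2)       # an existing Future/Task is returned unchanged
    assert t3 is t2
    # Both tasks are already scheduled; they start running at main()'s next
    # suspension point, whether or not we ever await them.
    await asyncio.sleep(0.2)
    print(t1.result(), t2.result())      # 42 42

asyncio.run(main())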
I have a little bit of experience with promises in Javascript. I am quite experienced with Python, but new to its coroutines, and there is a bit that I just fail to understand: where does the asynchronicity kick in?
Let's consider the following minimal example:
async def gen():
    await something
    return 42
As I understand it, await something puts execution of our function aside and lets the main program run other bits. At some point something has a new result and gen will have a result soon after.
If gen and something are coroutines, then by all internet wisdom they are generators. And the only way to know when a generator has a new item available, afaik, is by polling it: x=gen(); next(x). But this is blocking! How does the scheduler "know" when x has a result? The answer can't be "when something has a result" because something must be a generator, too (for it is a coroutine). And this argument applies recursively.
I can't get past this idea that at some point the process will just have to sit and wait synchronously.
The secret sauce here is the asyncio module. Your something object has to be an awaitable object itself, and either depend on more awaitable objects, or must yield from a Future object.
For example, the asyncio.sleep() coroutine yields a Future:
@coroutine
def sleep(delay, result=None, *, loop=None):
    """Coroutine that completes after a given time (in seconds)."""
    if delay == 0:
        yield
        return result

    if loop is None:
        loop = events.get_event_loop()
    future = loop.create_future()
    h = future._loop.call_later(delay,
                                futures._set_result_unless_cancelled,
                                future, result)
    try:
        return (yield from future)
    finally:
        h.cancel()
(The syntax here uses the older generator syntax, to remain backwards compatible with older Python 3 releases).
Note that a future doesn't use await or yield from itself; futures simply yield self until some condition is met. In the above asyncio.sleep() coroutine, that condition is met when a result has been produced (in the asyncio.sleep() code above, via the futures._set_result_unless_cancelled() function called after a delay).
An event loop then keeps pulling in the next 'result' from each pending future it manages (polling them efficiently) until the future signals it is done (by raising a StopIteration exception holding the results; return from a co-routine would do that, for example). At that point the coroutine that yielded the future can be signalled to continue (either by sending the future result, or by throwing an exception if the future raised anything other than StopIteration).
So for your example, the loop will kick off your gen() coroutine, and await something then (directly or indirectly) yields a future. That future is polled until it raises StopIteration (signalling it is done) or raises some other exception. If the future is done, coroutine.send(result) is executed, allowing it to then advance to the return 42 line, triggering a new StopIteration exception with that value, allowing a calling coroutine awaiting on gen() to continue, etc.
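(A minimal sketch, outside asyncio, of that send()/StopIteration handshake; Something here is a hypothetical stand-in for a Future that yields itself.)

class Something:
    def __await__(self):
        yield self                 # suspend: hand ourselves to whoever is driving us
        return "something's result"

async def gen():
    await Something()
    return 42

coro = gen()
pending = coro.send(None)          # runs until the yield inside __await__
print(pending)                     # the Something instance a real loop would now wait on
try:
    coro.send(None)                # pretend the "future" completed; resume the coroutine
except StopIteration as e:
    print(e.value)                 # 42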