I have a little bit of experience with promises in JavaScript. I am quite experienced with Python, but new to its coroutines, and there is one thing that I just fail to understand: where does the asynchronicity kick in?
Let's consider the following minimal example:
async def gen():
    await something
    return 42
As I understand it, await something puts execution of our function aside and lets the main program run other bits. At some point something has a new result and gen will have a result soon after.
If gen and something are coroutines, then by all internet wisdom they are generators. And the only way to know when a generator has a new item available, afaik, is by polling it: x=gen(); next(x). But this is blocking! How does the scheduler "know" when x has a result? The answer can't be "when something has a result" because something must be a generator, too (for it is a coroutine). And this argument applies recursively.
I can't get past this idea that at some point the process will just have to sit and wait synchronously.
The secret sauce here is the asyncio module. Your something object has to be an awaitable object itself, and either depend on more awaitable objects, or must yield from a Future object.
For example, the asyncio.sleep() coroutine yields a Future:
@coroutine
def sleep(delay, result=None, *, loop=None):
    """Coroutine that completes after a given time (in seconds)."""
    if delay == 0:
        yield
        return result

    if loop is None:
        loop = events.get_event_loop()
    future = loop.create_future()
    h = future._loop.call_later(delay,
                                futures._set_result_unless_cancelled,
                                future, result)
    try:
        return (yield from future)
    finally:
        h.cancel()
(The syntax here uses the older generator syntax, to remain backwards compatible with older Python 3 releases).
Note that a future doesn't use await or yield from; it simply uses yield self until some condition is met. In the asyncio.sleep() coroutine above, that condition is met when a result has been produced (via the futures._set_result_unless_cancelled() function, called after a delay).
An event loop then keeps pulling in the next 'result' from each pending future it manages (polling them efficiently) until the future signals it is done (by raising a StopIteration exception holding the results; return from a co-routine would do that, for example). At that point the coroutine that yielded the future can be signalled to continue (either by sending the future result, or by throwing an exception if the future raised anything other than StopIteration).
So for your example, the loop will kick off your gen() coroutine, and await something then (directly or indirectly) yields a future. That future is polled until it raises StopIteration (signalling it is done) or raises some other exception. If the future is done, coroutine.send(result) is executed, allowing it to then advance to the return 42 line, triggering a new StopIteration exception with that value, allowing a calling coroutine awaiting on gen() to continue, etc.
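To make the suspend-and-resume mechanics concrete, here is a minimal sketch that drives the question's gen() by hand, without asyncio. ToyFuture is a hypothetical stand-in for a future (it is not asyncio.Future), and the two send() calls play the role that an event loop would normally play. The point it illustrates is that nothing blocks while the coroutine is suspended: it simply sits idle until someone calls send() again.

```python
class ToyFuture:
    """Hypothetical, heavily simplified stand-in for a future."""

    def __init__(self):
        self._done = False
        self._result = None

    def set_result(self, value):
        self._done = True
        self._result = value

    def __await__(self):
        if not self._done:
            # Suspend the whole coroutine chain and hand ourselves to the driver.
            yield self
        return self._result


log = []


async def gen(something):
    log.append("gen started")
    value = await something
    log.append(f"gen resumed with {value!r}")
    return 42


something = ToyFuture()
coro = gen(something)

# First step: runs gen() up to the `yield self` inside ToyFuture.__await__.
yielded = coro.send(None)

# The driver is now free to do anything else; gen() is merely suspended.
something.set_result("ready")

# Second step: resume gen(); finishing raises StopIteration carrying 42.
try:
    coro.send(None)
except StopIteration as stop:
    final = stop.value

print(yielded is something, final, log)
```

In this sketch the driver decides when to resume the coroutine; a real event loop makes the same decision once the future it received has been given a result.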
Related
As in the following example, I encountered an unusual error when using async Generator.
async def demo():
    async def get_data():
        for i in range(5):  # loop: for or while
            await asyncio.sleep(1)  # some IO code
            yield i

    datas = get_data()
    await asyncio.gather(
        anext(datas),
        anext(datas),
        anext(datas),
        anext(datas),
        anext(datas),
    )

if __name__ == '__main__':
    # asyncio.run(main())
    asyncio.run(demo())
Console output:
2022-05-11 23:55:24,530 DEBUG asyncio 29180 30600 Using proactor: IocpProactor
Traceback (most recent call last):
File "E:\workspace\develop\python\crawlerstack-proxypool\demo.py", line 77, in <module>
asyncio.run(demo())
File "D:\devtools\Python310\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "D:\devtools\Python310\lib\asyncio\base_events.py", line 641, in run_until_complete
return future.result()
File "E:\workspace\develop\python\crawlerstack-proxypool\demo.py", line 66, in demo
await asyncio.gather(
RuntimeError: anext(): asynchronous generator is already running
Situation description: I have a loop logic that fetches a batch of data from Redis at a time, and I want to use yield to return the result. But this error occurs when I create a concurrent task.
Is there a good solution to this situation? I don't mean to change the way I'm using it now, but to see if I can tell if it's running or something like a lock and wait for it to run and then execute anext.
Maybe my logic is not reasonable now, but I also want to understand some critical language, let me realize the seriousness of this.
Thank you for your help.
TL;DR: the right way
Async generators are poorly suited to parallel consumption. See my explanation below. As a proper workaround, use asyncio.Queue for communication between producers and consumers:
import asyncio
import random

queue = asyncio.Queue()

async def producer():
    for item in range(5):
        await asyncio.sleep(random.random())  # imitate async fetching
        print('item fetched:', item)
        await queue.put(item)

async def consumer():
    while True:
        item = await queue.get()
        await asyncio.sleep(random.random())  # imitate async processing
        print('item processed:', item)

# run this inside a coroutine
await asyncio.gather(producer(), consumer(), consumer())
The above code snippet works well for an infinite stream of items: for example, a web server, which runs forever serving requests from clients. But what if we need to process a finite number of items? How should consumers know when to stop?
This deserves another question on Stack Overflow to cover all alternatives, but the simplest option is a sentinel approach, described below.
Sentinel: finite data streams approach
Introduce a sentinel = object(). When all items from the external data source have been fetched and put on the queue, the producer must push as many sentinels onto the queue as there are consumers. Once a consumer fetches the sentinel, it knows it should stop: if item is sentinel: break from the loop.
sentinel = object()
consumers_count = 2

async def producer():
    ...  # the same code as above
    if new_item is None:  # if no new data
        for _ in range(consumers_count):
            await queue.put(sentinel)

async def consumer():
    while True:
        ...  # the same code as above
        if item is sentinel:
            break

await asyncio.gather(
    producer(),
    *(consumer() for _ in range(consumers_count)),
)
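For reference, a self-contained sketch of the sentinel approach that can be run as-is. The item count, the processed list, and the use of asyncio.sleep(0) as a stand-in for real I/O are assumptions made for illustration only.

```python
import asyncio

SENTINEL = object()


async def producer(queue, n_items, n_consumers):
    for item in range(n_items):
        await asyncio.sleep(0)  # stand-in for async fetching
        await queue.put(item)
    # One sentinel per consumer, so that every consumer gets to see one.
    for _ in range(n_consumers):
        await queue.put(SENTINEL)


async def consumer(queue, processed):
    while True:
        item = await queue.get()
        if item is SENTINEL:
            break
        await asyncio.sleep(0)  # stand-in for async processing
        processed.append(item)


async def main():
    queue = asyncio.Queue()
    processed = []
    n_consumers = 2
    await asyncio.gather(
        producer(queue, 5, n_consumers),
        *(consumer(queue, processed) for _ in range(n_consumers)),
    )
    return processed


processed = asyncio.run(main())
print(sorted(processed))
```

The order in which items are processed depends on scheduling, which is why the example sorts the list before printing it.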
TL;DR [2]: a dirty workaround
Since you require that your async generator approach stay unchanged, here is an asyncgen-based alternative. To resolve this issue (in a simple-yet-dirty way), you may wrap the source async generator with a lock:
async def with_lock(agen, lock: asyncio.Lock):
    while True:
        async with lock:  # only one consumer is allowed to read
            try:
                item = await anext(agen)
            except StopAsyncIteration:
                break
        # yield outside the lock, so it is not held while this wrapper is suspended
        yield item

lock = asyncio.Lock()  # a common lock for all consumers

await asyncio.gather(
    # every consumer must have its own "wrapped" generator
    anext(with_lock(datas, lock)),
    anext(with_lock(datas, lock)),
    ...
)
This will ensure only one consumer awaits for an item from the generator at a time. While this consumer awaits, other consumers are being executed, so parallelization is not lost.
A roughly equivalent code with async for (looks a little smarter):
async def with_lock(agen, lock: asyncio.Lock):
    await lock.acquire()
    async for item in agen:
        lock.release()
        yield item
        await lock.acquire()
    lock.release()
However, this code only handles the async generator's anext method, whereas the generator API also includes the aclose and athrow methods. See the explanation below.
Although you may add support for these to the with_lock function too, I would recommend either subclassing a generator and handling the lock support inside, or, better, using the Queue-based approach from above.
See contextlib.aclosing for some inspiration.
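A runnable sketch of the lock-wrapper idea, under a few assumptions: consume is a hypothetical helper that drains its own wrapped generator, agen.__anext__() is used in place of anext() so that the snippet also runs on Python versions older than 3.10, and the wrapper releases the lock before it yields. Like the versions above, it covers only iteration, not aclose or athrow.

```python
import asyncio


async def get_data():
    for i in range(5):
        await asyncio.sleep(0.01)  # stand-in for some I/O
        yield i


async def with_lock(agen, lock):
    while True:
        async with lock:  # only one consumer may advance the source at a time
            try:
                item = await agen.__anext__()
            except StopAsyncIteration:
                break
        # The lock is released here, before this wrapper suspends at `yield`.
        yield item


async def consume(agen, lock):
    # Each consumer iterates over its own wrapper around the shared source.
    return [item async for item in with_lock(agen, lock)]


async def main():
    datas = get_data()
    lock = asyncio.Lock()
    return await asyncio.gather(consume(datas, lock), consume(datas, lock))


results = asyncio.run(main())
print(results)
```

Each item from the source ends up with exactly one of the two consumers; which consumer receives which item depends on scheduling.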
Explanation
Both sync and async generators have a special attribute: .gi_running (for regular generators) and .ag_running (for async ones). You may discover them by executing dir on a generator:
>>> dir((i for i in range(0)))
[..., 'gi_running', ...]
They are set to True when a generator's .__next__ or .__anext__ method is executed (next(...) and anext(...) are just a syntactic sugar for those).
This prevents re-executing next(...) on a generator, when another next(...) call on the same generator is already being executed: if the running flag is True, an exception is raised (for a sync generator it raises ValueError: generator already executing).
So, returning to your example, when you run await anext(datas) (via asyncio.gather), the following happens:
datas.ag_running is set to True.
An execution flow steps into the datas.__anext__ method.
Once an inner await statement is reached inside of the __anext__ method (await asyncio.sleep(1) in your case), asyncio's loop switches to another consumer.
Now, another consumer tries to call await anext(datas) too, but since datas.ag_running flag is still set to True, this results in a RuntimeError.
Why is this flag needed?
A generator's execution can be suspended and resumed. But only at yield statements. Thus, if a generator is paused at an inner await statement, it cannot be "resumed", because its state disallows it.
That's why a parallel next/anext call to a generator raises an exception: it is not ready to be resumed, it is already running.
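The sequence described above can be reproduced in a few lines. This is a sketch under stated assumptions: take_one is a hypothetical helper that wraps a single __anext__() call in a coroutine, and the short sleeps stand in for real I/O.

```python
import asyncio


async def get_data():
    for i in range(2):
        await asyncio.sleep(0.01)  # the generator suspends here, at an await
        yield i


async def take_one(agen):
    return await agen.__anext__()


async def main():
    datas = get_data()
    # The first consumer starts and gets suspended inside the generator.
    first = asyncio.create_task(take_one(datas))
    await asyncio.sleep(0)  # let `first` run up to the inner await

    # A second consumer now tries to advance the same generator.
    try:
        await take_one(datas)
    except RuntimeError as exc:
        error = str(exc)
    else:
        error = None

    value = await first  # the first consumer is unaffected
    return error, value


error, value = asyncio.run(main())
print(error)
print(value)
```

The second call fails immediately rather than waiting, which is why a queue or a lock is needed if several consumers must share one source.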
athrow and aclose
The generator API (both sync and async) includes not only the send/asend method for iteration, but also:
close/aclose, to release generator-allocated resources (e.g. a database connection) on exit or on an exception,
and throw/athrow, to inform the generator that it has to handle an exception.
aclose and athrow are async methods too. This means that if two consumers try to close (or throw into) the underlying generator in parallel, you will encounter the same issue, since the generator will still be closing (or handling an exception) when it is closed (or thrown an exception) again.
Sync generators example
Though this is a frequent case for async generators, reproducing it for sync generators is not that straightforward, since sync next(...) calls are rarely interrupted.
One of the ways to interrupt a sync generator is to run a multithreaded code with multiple consumers (run in parallel threads) reading from a single generator. In that case, when the generator's code is interrupted while executing a next call, all other consumers' parallel attempts to call next will result in an exception.
Another way to achieve this is demonstrated in PEP 255 (the generators PEP) via a self-consuming generator:
>>> def g():
...     i = next(me)
...     yield i
...
>>> me = g()
>>> next(me)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in g
ValueError: generator already executing
When outer next(me) is called, it sets me.gi_running to True and then executes the generator function code. A subsequent inner next(me) call leads to a ValueError.
Conclusion
Generators (especially async ones) work best when consumed by a single reader. Supporting multiple consumers is hard, since it requires patching the behaviour of all the generator's methods, and is thus discouraged.
Let's say I have a C++ function result_type compute(input_type input), which I have made available to Python using Cython. My Python code executes multiple computations like this:
def compute_total_result():
    inputs = ...
    total_result = ...
    for input in inputs:
        result = compute_python_wrapper(input)
        update_total_result(total_result)
    return total_result
Since the computation takes a long time, I have implemented a C++ thread pool (like this) and written a function std::future<result_type> compute_threaded(input_type input), which returns a future that becomes ready as soon as the thread pool is done executing.
What I would like to do is to use this C++ function in python as well. A simple way to do this would be to wrap the std::future<result_type> including its get() function, wait for all results like this:
def compute_total_results_parallel():
    inputs = ...
    total_result = ...
    futures = []
    for input in inputs:
        futures.append(compute_threaded_python_wrapper(input))
    for future in futures:
        update_total_result(future.get())
    return total_result
I suppose this works well enough in this case, but it becomes very complicated very fast, because I have to pass futures around.
However, I think that conceptually, waiting for these C++ results is no different from waiting for file or network I/O.
To facilitate I/O operations, the python devs introduced the async / await keywords. If my compute_threaded_python_wrapper would be part of asyncio, I could simply rewrite it as
async def compute_total_results_async():
    inputs = ...
    total_result = ...
    for input in inputs:
        result = await compute_threaded_python_wrapper(input)
        update_total_result(total_result)
    return total_result
And I could execute the whole code via result = asyncio.run(compute_total_results_async()).
There are a lot of tutorials regarding async programming in python, but most of them deal with using coroutines where the bedrock seem to be some call into the asyncio package, mostly calling asyncio.sleep(delay) as a proxy for I/O.
My question is: (How) Can I implement coroutines in python, enabling python to await the wrapped future object (There is some mention of a __await__ method returning an iterator)?
First, an inaccuracy in the question needs to be corrected:
If my compute_threaded_python_wrapper would be part of asyncio, I could simply rewrite it as [...]
The rewrite is incorrect: await means "wait until the computation finishes", so the loop as written would execute the code sequentially. A rewrite that actually runs the tasks in parallel would be something like:
# a direct translation of the "parallel" version
async def compute_total_results_async():
    inputs = ...
    total_result = ...
    tasks = []
    # first spawn all the tasks
    for input in inputs:
        tasks.append(
            asyncio.create_task(compute_threaded_python_wrapper(input))
        )
    # and then await them
    for task in tasks:
        update_total_result(await task)
    return total_result
This spawn-all-await-all pattern is so ubiquitous that asyncio provides a helper function, asyncio.gather(), which makes it much shorter, especially when combined with a list comprehension:
# a more idiomatic version
async def compute_total_results_async():
    inputs = ...
    total_result = ...
    results = await asyncio.gather(
        *[compute_threaded_python_wrapper(input) for input in inputs]
    )
    for result in results:
        update_total_result(result)
    return total_result
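The difference between awaiting in a loop and using asyncio.gather() can be seen in the order of events. The following is a small sketch; job is a hypothetical stand-in for the wrapped computation, and asyncio.sleep() imitates the time it takes.

```python
import asyncio


async def job(name, log):
    log.append(f"start {name}")
    await asyncio.sleep(0.01)  # stand-in for the real computation
    log.append(f"end {name}")
    return name


async def sequential(log):
    # Each await finishes before the next job is even started.
    return [await job(name, log) for name in "ab"]


async def concurrent(log):
    # Both jobs are started before either of them finishes.
    return await asyncio.gather(*(job(name, log) for name in "ab"))


seq_log, con_log = [], []
seq_results = asyncio.run(sequential(seq_log))
con_results = asyncio.run(concurrent(con_log))
print(seq_log)
print(con_log)
```

In the sequential log every start is immediately followed by its own end, while in the concurrent log both starts appear before any end.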
With that out of the way, we can proceed to the main question:
My question is: (How) Can I implement coroutines in python, enabling python to await the wrapped future object (There is some mention of a __await__ method returning an iterator)?
Yes, awaitable objects are implemented using iterators that yield to indicate suspension. But that is way too low-level a tool for what you actually need. You don't need just any awaitable, but one that works with the asyncio event loop, which has specific expectations of the underlying iterator. You need a mechanism to resume the awaitable when the result is ready, where you again depend on asyncio.
Asyncio already provides awaitable objects that can be externally assigned a value: futures. An asyncio future represents an async value that will become available at some point in the future. They are related to, but not semantically equivalent to, C++ futures, and should not be confused with multi-threaded futures from the concurrent.futures stdlib module.
To create an awaitable object that is activated by something that happens in another thread, you need to create a future, and then start your off-thread task, instructing it to mark the future as completed when it finishes execution. Since asyncio futures are not thread-safe, this must be done using the call_soon_threadsafe event loop method provided by asyncio for such situations. In Python it would be done like this:
def run_async():
    loop = asyncio.get_event_loop()
    future = loop.create_future()
    def on_done(result):
        # when done, notify the future in a thread-safe manner
        loop.call_soon_threadsafe(future.set_result, result)
    # start the worker in a thread owned by the pool
    pool.submit(_worker, on_done)
    # returning a future makes run_async() awaitable, and
    # passable to asyncio.gather() etc.
    return future

def _worker(on_done):
    # this runs in a different thread
    ...  # processing goes here
    result = ...
    on_done(result)
In your case, the worker would be presumably implemented in Cython combined with C++.
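To try the pattern without any C++ code, here is a self-contained sketch in which a concurrent.futures.ThreadPoolExecutor stands in for the C++ thread pool. The names compute_threaded and _worker, and the squaring "computation", are assumptions made purely for illustration.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)  # stand-in for the C++ thread pool


def _worker(value, on_done):
    # Runs in a pool thread; squaring stands in for the real computation.
    on_done(value * value)


def compute_threaded(value):
    loop = asyncio.get_running_loop()
    future = loop.create_future()

    def on_done(result):
        # Called from the worker thread, so hop back to the loop's thread.
        loop.call_soon_threadsafe(future.set_result, result)

    pool.submit(_worker, value, on_done)
    return future  # awaitable, and usable with asyncio.gather()


async def compute_total_results_async(inputs):
    results = await asyncio.gather(*[compute_threaded(x) for x in inputs])
    return sum(results)


total = asyncio.run(compute_total_results_async([1, 2, 3]))
pool.shutdown()
print(total)
```

A real wrapper would also need to forward exceptions (for example via future.set_exception) and decide what cancellation of the asyncio future should mean for the running C++ job; both are left out of this sketch.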
I was wondering what exactly happens when we await a coroutine in async Python code, for example:
await send_message(string)
(1) send_message is added to the event loop, and the calling coroutine gives up control to the event loop, or
(2) We jump directly into send_message
Most explanations I read point to (1), as they describe the calling coroutine as exiting. But my own experiments suggest (2) is the case: I tried to have a coroutine run after the caller but before the callee and could not achieve this.
Disclaimer: Open to correction (particularly as to details and correct terminology) since I arrived here looking for the answer to this myself. Nevertheless, the research below points to a pretty decisive "main point" conclusion:
Correct OP answer: No, await (per se) does not yield to the event loop, yield yields to the event loop, hence for the case given: "(2) We jump directly into send_message". In particular, certain yield expressions are the only points, at bottom, where async tasks can actually be switched out (in terms of nailing down the precise spot where Python code execution can be suspended).
To be proven and demonstrated: 1) by theory/documentation, 2) by implementation code, 3) by example.
By theory/documentation
PEP 492: Coroutines with async and await syntax
While the PEP is not tied to any specific Event Loop implementation, it is relevant only to the kind of coroutine that uses yield as a signal to the scheduler, indicating that the coroutine will be waiting until an event (such as IO) is completed. ...
[await] uses the yield from implementation [with an extra step of validating its argument.] ...
Any yield from chain of calls ends with a yield. This is a fundamental mechanism of how Futures are implemented. Since, internally, coroutines are a special kind of generators, every await is suspended by a yield somewhere down the chain of await calls (please refer to PEP 3156 for a detailed explanation). ...
Coroutines are based on generators internally, thus they share the implementation. Similarly to generator objects, coroutines have throw(), send() and close() methods. ...
The vision behind existing generator-based coroutines and this proposal is to make it easy for users to see where the code might be suspended.
In context, "easy for users to see where the code might be suspended" seems to refer to the fact that in synchronous code yield is the place where execution can be "suspended" within a routine allowing other code to run, and that principle now extends perfectly to the async context wherein a yield (if its value is not consumed within the running task but is propagated up to the scheduler) is the "signal to the scheduler" to switch out tasks.
More succinctly: where does a generator yield control? At a yield. Coroutines (including those using async and await syntax) are generators, hence likewise.
And it is not merely an analogy, in implementation (see below) the actual mechanism by which a task gets "into" and "out of" coroutines is not anything new, magical, or unique to the async world, but simply by calling the coro's <generator>.send() method. That was (as I understand the text) part of the "vision" behind PEP 492: async and await would provide no novel mechanism for code suspension but just pour async-sugar on Python's already well-beloved and powerful generators.
And
PEP 3156: The "asyncio" module
The loop.slow_callback_duration attribute controls the maximum execution time allowed between two yield points before a slow callback is reported [emphasis in original].
That is, an uninterrupted segment of code (from the async perspective) is demarcated as that between two successive yield points (whose values reached up to the running Task level (via an await/yield from tunnel) without being consumed within it).
And this:
The scheduler has no public interface. You interact with it by using yield from future and yield from task.
Objection: "That says 'yield from', but you're trying to argue that the task can only switch out at a yield itself! yield from and yield are different things, my friend, and yield from itself doesn't suspend code!"
Ans: Not a contradiction. The PEP is saying you interact with the scheduler by using yield from future/task. But as noted above in PEP 492, any chain of yield from (~aka await) ultimately reaches a yield (the "bottom turtle"). In particular (see below), yield from future does in fact yield that same future after some wrapper work, and that yield is the actual "switch out point" where another task takes over. But it is incorrect for your code to directly yield a Future up to the current Task because you would bypass the necessary wrapper.
The objection having been answered, and its practical coding considerations being noted, the point I wish to make from the above quote remains: that a suitable yield in Python async code is ultimately the one thing which, having suspended code execution in the standard way that any other yield would do, now further engages the scheduler to bring about a possible task switch.
By implementation code
asyncio/futures.py
class Future:
    ...
    def __await__(self):
        if not self.done():
            self._asyncio_future_blocking = True
            yield self  # This tells Task to wait for completion.
        if not self.done():
            raise RuntimeError("await wasn't used with future")
        return self.result()  # May raise too.

    __iter__ = __await__  # make compatible with 'yield from'.
Paraphrase: The line yield self is what tells the running task to sit out for now and let other tasks run, coming back to this one sometime after self is done.
Almost all of your awaitables in asyncio world are (multiple layers of) wrappers around a Future. The event loop remains utterly blind to all higher level await awaitable expressions until the code execution trickles down to an await future or yield from future and then (as seen here) calls yield self, which yielded self is then "caught" by none other than the Task under which the present coroutine stack is running thereby signaling to the task to take a break.
Possibly the one and only exception to the above "code suspends at yield self within await future" rule, in an asyncio context, is the potential use of a bare yield such as in asyncio.sleep(0). And since the sleep function is a topic of discourse in the comments of this post, let's look at that.
asyncio/tasks.py
@types.coroutine
def __sleep0():
    """Skip one event loop run cycle.

    This is a private helper for 'asyncio.sleep()', used
    when the 'delay' is set to 0. It uses a bare 'yield'
    expression (which Task.__step knows how to handle)
    instead of creating a Future object.
    """
    yield


async def sleep(delay, result=None, *, loop=None):
    """Coroutine that completes after a given time (in seconds)."""
    if delay <= 0:
        await __sleep0()
        return result

    if loop is None:
        loop = events.get_running_loop()
    else:
        warnings.warn("The loop argument is deprecated since Python 3.8, "
                      "and scheduled for removal in Python 3.10.",
                      DeprecationWarning, stacklevel=2)

    future = loop.create_future()
    h = loop.call_later(delay,
                        futures._set_result_unless_cancelled,
                        future, result)
    try:
        return await future
    finally:
        h.cancel()
Note: We have here the two interesting cases at which control can shift to the scheduler:
(1) The bare yield in __sleep0 (when called via an await).
(2) The yield self immediately within await future.
The crucial line (for our purposes) in asyncio/tasks.py is when Task._step runs its top-level coroutine via result = self._coro.send(None) and recognizes fourish cases:
(1) result = None is generated by the coro (which, again, is a generator): the task "relinquishes control for one event loop iteration".
(2) result = future is generated within the coro, with further magic member field evidence that the future was yielded in a proper manner from out of Future.__iter__ == Future.__await__: the task relinquishes control to the event loop until the future is complete.
(3) A StopIteration is raised by the coro indicating the coroutine completed (i.e. as a generator it exhausted all its yields): the final result of the task (which is itself a Future) is set to the coroutine return value.
(4) Any other Exception occurs: the task's set_exception is set accordingly.
Modulo details, the main point for our concern is that coroutine segments in an asyncio event loop ultimately run via coro.send(). Initial startup and final termination aside, send() proceeds precisely from the last yield value it generated to the next one.
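A rough sketch of that send()-driven stepping, assuming a toy round-robin scheduler in place of asyncio's event loop. The names run_all, pause and worker are hypothetical, and only the bare-yield case (1) and the StopIteration case (3) are modelled; futures and error handling are left out.

```python
import types
from collections import deque


@types.coroutine
def pause():
    # A bare yield: "let the other tasks have a turn".
    yield


async def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")
        await pause()
    return name


def run_all(coros):
    ready = deque(coros)
    finished = []
    while ready:
        coro = ready.popleft()
        try:
            coro.send(None)  # run up to the next yield point
        except StopIteration as stop:
            finished.append(stop.value)  # the coroutine returned
        else:
            ready.append(coro)  # it yielded, so reschedule it
    return finished


log = []
finished = run_all([worker("a", 2, log), worker("b", 1, log)])
print(log)
print(finished)
```

Each pass through the while loop runs exactly one segment of one coroutine, from one yield point to the next.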
By example
import asyncio
import types
def task_print(s):
    print(f"{asyncio.current_task().get_name()}: {s}")

async def other_task(s):
    task_print(s)

class AwaitableCls:
    def __await__(self):
        task_print(" 'Jumped straight into' another `await`; the act of `await awaitable` *itself* doesn't 'pause' anything")
        yield
        task_print(" We're back to our awaitable object because that other task completed")
        asyncio.create_task(other_task("The event loop gets control when `yield` points (from an iterable coroutine) propagate up to the `current_task` through a suitable chain of `await` or `yield from` statements"))

async def coro():
    task_print(" 'Jumped straight into' coro; the `await` keyword itself does nothing to 'pause' the current_task")
    await AwaitableCls()
    task_print(" 'Jumped straight back into' coro; we have another pending task, but leaving an `__await__` doesn't 'pause' the task any more than entering the `__await__` does")

@types.coroutine
def iterable_coro(context):
    task_print(f"`{context} iterable_coro`: pre-yield")
    yield None  # None or a Future object are the only legitimate yields to the task in asyncio
    task_print(f"`{context} iterable_coro`: post-yield")

async def original_task():
    asyncio.create_task(other_task("Aha, but a (suitably unconsumed) *`yield`* DOES 'pause' the current_task allowing the event scheduler to `_wakeup` another task"))

    task_print("Original task")
    await coro()
    task_print("'Jumped straight out of' coro. Leaving a coro, as with leaving/entering any awaitable, doesn't give control to the event loop")
    res = await iterable_coro("await")
    assert res is None
    asyncio.create_task(other_task("This doesn't run until the very end because the generated None following the creation of this task is consumed by the `for` loop"))
    for y in iterable_coro("for y in"):
        task_print(f"But 'ordinary' `yield` points (those which are consumed by the `current_task` itself) behave as ordinary without relinquishing control at the async/task-level; `y={y}`")
    task_print("Done with original task")

asyncio.get_event_loop().run_until_complete(original_task())
run in python3.8 produces
Task-1: Original task
Task-1: 'Jumped straight into' coro; the await keyword itself does nothing to 'pause' the current_task
Task-1: 'Jumped straight into' another await; the act of await awaitable itself doesn't 'pause' anything
Task-2: Aha, but a (suitably unconsumed) yield DOES 'pause' the current_task allowing the event scheduler to _wakeup another task
Task-1: We're back to our awaitable object because that other task completed
Task-1: 'Jumped straight back into' coro; we have another pending task, but leaving an __await__ doesn't 'pause' the task any more than entering the __await__ does
Task-1: 'Jumped straight out of' coro. Leaving a coro, as with leaving/entering any awaitable, doesn't give control to the event loop
Task-1: await iterable_coro: pre-yield
Task-3: The event loop gets control when yield points (from an iterable coroutine) propagate up to the current_task through a suitable chain of await or yield from statements
Task-1: await iterable_coro: post-yield
Task-1: for y in iterable_coro: pre-yield
Task-1: But 'ordinary' yield points (those which are consumed by the current_task itself) behave as ordinary without relinquishing control at the async/task-level; y=None
Task-1: for y in iterable_coro: post-yield
Task-1: Done with original task
Task-4: This doesn't run until the very end because the generated None following the creation of this task is consumed by the for loop
Indeed, exercises such as the following can help one's mind to decouple the functionality of async/await from notion of "event loops" and such. The former is conducive to nice implementations and usages of the latter, but you can use async and await just as specially syntaxed generator stuff without any "loop" (whether asyncio or otherwise) whatsoever:
import types # no asyncio, nor any other loop framework
async def f1():
    print(1)
    print(await f2(),'= await f2()')
    return 8

@types.coroutine
def f2():
    print(2)
    print((yield 3),'= yield 3')
    return 7

class F3:
    def __await__(self):
        print(4)
        print((yield 5),'= yield 5')
        print(10)
        return 11

task1 = f1()
task2 = F3().__await__()
""" You could say calls to send() represent our
"manual task management" in this script.
"""
print(task1.send(None), '= task1.send(None)')
print(task2.send(None), '= task2.send(None)')
try:
    print(task1.send(6), 'try task1.send(6)')
except StopIteration as e:
    print(e.value, '= except task1.send(6)')
try:
    print(task2.send(9), 'try task2.send(9)')
except StopIteration as e:
    print(e.value, '= except task2.send(9)')
produces
1
2
3 = task1.send(None)
4
5 = task2.send(None)
6 = yield 3
7 = await f2()
8 = except task1.send(6)
9 = yield 5
10
11 = except task2.send(9)
Yes, await can pass control back to the asyncio event loop and allow it to schedule other coroutines, but only when the awaited object actually suspends (for example, on a future that is not done yet).
Another way is
await asyncio.sleep(0)
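A short sketch of what that call does in practice: a task that has been created but not yet run gets its turn as soon as the current coroutine awaits asyncio.sleep(0). The background coroutine and the log list are hypothetical names used only for this illustration.

```python
import asyncio


async def background(log):
    log.append("background ran")


async def main():
    log = []
    task = asyncio.create_task(background(log))  # scheduled, not yet running
    log.append("before sleep(0)")
    await asyncio.sleep(0)  # give the event loop one iteration
    log.append("after sleep(0)")
    await task
    return log


log = asyncio.run(main())
print(log)
# -> ['before sleep(0)', 'background ran', 'after sleep(0)']
```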
I have a program, roughly like the example below.
A task is gathering a number of values and returning them to a caller.
Sometimes the tasks may get cancelled.
In those cases, I still want to get the results the tasks have gathered so far.
Hence I catch the CancelledError exception, clean up, and return the completed results.
async def f():
    results = []
    for i in range(100):
        try:
            res = await slow_call()
            results.append(res)
        except asyncio.CancelledError:
            results.append('Undecided')
    return results

def on_done(task):
    if task.cancelled():
        print('Incomplete result', task.result())
    else:
        print(task.result())

async def run():
    task = asyncio.create_task(f())
    task.add_done_callback(on_done)
The problem is that the value returned after a task is cancelled doesn't appear to be available in the task.
Calling task.result() simply rethrows CancelledError. Calling task._result is just None.
Is there a way to get the return value of a cancelled task, assuming it has one?
Edit: I realize now that catching the CancelledError results in the task not being cancelled at all.
This leaves me with another conundrum: How do I signal to the task's owner that this result is only a "half" result, and that the task has really been cancelled?
I suppose I could add an extra return value indicating this, but that seems to go against the whole idea of the task cancellation system.
Any suggestions for a good approach here?
I'm a long way away from understanding the use case, but the following does something sensible for me:
import asyncio
async def fn(results):
    for i in range(10):
        # your slow_call
        await asyncio.sleep(0.1)
        results.append(i)

def on_done(task, results):
    if task.cancelled():
        print('incomplete', results)
    else:
        print('complete', results)

async def run():
    results = []
    task = asyncio.create_task(fn(results))
    task.add_done_callback(lambda t: on_done(t, results))
    # give fn some time to finish, reducing this will cause the task to be cancelled
    # you'll see the incomplete message if this is < 1.1
    await asyncio.sleep(1.1)

asyncio.run(run())
It's the use of add_done_callback and sleep in run that feels very awkward and makes me think I don't understand what you're doing. Maybe posting something to https://codereview.stackexchange.com containing more of the calling code would help get ideas for better ways to structure things. Note that there are other libraries, like trio, that provide much nicer interfaces to Python coroutines than the built-in asyncio library (which was standardised prematurely, IMO).
I don't think that is possible because, in my opinion, it collides with the meaning of cancelling a task.
You can implement similar behavior inside your slow_call by triggering the CancelledError, catching it inside your function, and then returning whatever you want.
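One possible way to keep the cancellation semantics intact while still seeing what was gathered, sketched under assumptions: the task writes into a list owned by the caller and does not catch CancelledError, so task.cancelled() stays truthful. The names gather_values and partial, and the timings, are hypothetical.

```python
import asyncio


async def gather_values(partial):
    for i in range(100):
        await asyncio.sleep(0.01)  # stand-in for slow_call()
        partial.append(i)


async def main():
    partial = []
    task = asyncio.create_task(gather_values(partial))
    await asyncio.sleep(0.035)  # let a few iterations complete
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass  # expected: the task really was cancelled
    return task.cancelled(), partial


was_cancelled, partial = asyncio.run(main())
print(was_cancelled, partial)
```

How many values end up in the list depends on timing; the owner learns from task.cancelled() that the list is incomplete.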
I'm studying Python's asyncio library and I'm having a hard time understanding how some Python functions work.
For example:
import asyncio
async def coroutine():
    print('Initialize coroutine')
    await asyncio.sleep(1)
    print('Done !')

loop = asyncio.get_event_loop()
#asyncio.ensure_future(coroutine())
#loop.create_task(coroutine())
loop.run_forever()
For example, if I use ensure_future in this code, it automatically starts executing my task, and so does create_task; in the case of create_task I at least understand that it uses the loop on which loop.create_task was called. My question is whether ensure_future already starts executing the task, and, if it does, why in some cases I see code like this:
import asyncio
async def coroutine():
    print('Initialize coroutine')
    await asyncio.sleep(1)
    print('Done !')

loop = asyncio.get_event_loop()
task = asyncio.ensure_future(coroutine())
loop.run_until_complete(asyncio.wait([task]))
My question is whether ensure_future already starts executing the task, and, if it does [...]
Yes, ensure_future starts executing the coroutine as soon as the event loop is resumed, even if no one awaits the returned future.
This feature of ensure_future follows from the implementation, but it also makes sense conceptually.
On the implementation level, it is a consequence of ensure_future internally calling create_task when given a coroutine object. In other words, for coroutines there is no difference between loop.create_task and asyncio.ensure_future. Otherwise the difference is that ensure_future is more general: it takes any awaitable object, and converts it to a Future. If it receives a coroutine object, create_task performs the appropriate conversion and the new Task (itself a subclass of Future) is returned. If the argument to ensure_future is already a Future or its subclass, it is returned unchanged.
On the conceptual level, ensure_future returns an object that represents the "future" value of a computation, i.e. it encapsulates the result of execution that happens in the background. In the case of a coroutine, running in the background means being scheduled with the event loop. Without ensuring the coroutine's execution, returning a Future would be misleading because that future would never arrive.
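Both points can be checked with a small sketch: the coroutine passed to ensure_future runs even though nothing awaits the returned object, and passing an existing Task back into ensure_future returns the very same object. The log list is a hypothetical helper used to observe execution.

```python
import asyncio


async def coroutine(log):
    log.append("started")
    await asyncio.sleep(0)
    log.append("done")


async def main():
    log = []
    task = asyncio.ensure_future(coroutine(log))
    is_task = isinstance(task, asyncio.Task)  # a coroutine gets wrapped in a Task
    same_object = asyncio.ensure_future(task) is task  # futures pass through
    await asyncio.sleep(0.01)  # `task` itself is never awaited
    return is_task, same_object, log


is_task, same_object, log = asyncio.run(main())
print(is_task, same_object, log)
```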