While reading:
https://docs.python.org/3/library/asyncio-task.html#asyncio.Task.cancel
it seems that catching CancelledError is used for two purposes.
One is potentially preventing your task from getting cancelled.
The other is determining that something has cancelled the task you are awaiting.
How can I tell the difference?
async def cancel_me():
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        raise
    finally:
        print('cancel_me(): after sleep')

async def main():
    task = asyncio.create_task(cancel_me())
    await asyncio.sleep(1)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        # HERE: How do I know if `task` has been cancelled, or I AM being cancelled?
        print("main(): cancel_me is cancelled now")
How can I tell the difference [between ourselves getting cancelled and the task we're awaiting getting cancelled]?
Asyncio doesn't make it easy to tell the difference. When an outer task awaits an inner task, it delegates control to the inner one's coroutine. As a result, cancelling either task injects a CancelledError into the exact same place: the innermost await inside the inner task. This is why you cannot tell which of the two tasks was cancelled originally.
However, it is possible to circumvent the issue by breaking the chain of awaits and connecting the tasks using a completion callback instead. Cancellation of the inner task is then intercepted and detected in the callback:
class ChildCancelled(asyncio.CancelledError):
    pass

async def detect_cancel(task):
    cont = asyncio.get_event_loop().create_future()
    def on_done(_):
        if task.cancelled():
            cont.set_exception(ChildCancelled())
        elif task.exception() is not None:
            cont.set_exception(task.exception())
        else:
            cont.set_result(task.result())
    task.add_done_callback(on_done)
    return await cont
This is functionally equivalent to await task, except it doesn't await the inner task directly; it awaits a dummy future whose result is set after task completes. At this point we can replace the CancelledError (which we know must have come from cancellation of the inner task) with the more specific ChildCancelled. On the other hand, if the outer task is cancelled, that will show up as a regular CancelledError at await cont, and will be propagated as usual.
Here is some test code:
import asyncio, sys

# async def detect_cancel defined as above

async def cancel_me():
    print('cancel_me')
    try:
        await asyncio.sleep(3600)
    finally:
        print('cancel_me(): after sleep')

async def parent(task):
    await asyncio.sleep(.001)
    try:
        await detect_cancel(task)
    except ChildCancelled:
        print("parent(): child is cancelled now")
        raise
    except asyncio.CancelledError:
        print("parent(): I am cancelled")
        raise

async def main():
    loop = asyncio.get_event_loop()
    child_task = loop.create_task(cancel_me())
    parent_task = loop.create_task(parent(child_task))
    await asyncio.sleep(.1)  # give the child a chance to start running
    if sys.argv[1] == 'parent':
        parent_task.cancel()
    else:
        child_task.cancel()
    await asyncio.sleep(.5)

asyncio.get_event_loop().run_until_complete(main())
Note that with this implementation, cancelling the outer task will not automatically cancel the inner one, but that can easily be changed with an explicit call to the child's cancel(), either in parent or in detect_cancel itself, as sketched below.
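For example, here is a minimal sketch of such a propagating wrapper (the name is illustrative), reusing detect_cancel() and ChildCancelled from above:

async def detect_cancel_propagating(task):
    try:
        return await detect_cancel(task)
    except ChildCancelled:
        raise  # the child was cancelled on its own; nothing to propagate
    except asyncio.CancelledError:
        task.cancel()  # we were cancelled, so take the child down with us
        raise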
Asyncio uses a similar approach to implement asyncio.shield().
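For the curious, here is a simplified sketch of that idea (not the actual stdlib code): shield() likewise awaits a separate outer future that a completion callback connects to the inner task, so cancelling the outer await never reaches the inner task:

def shield_sketch(arg):
    inner = asyncio.ensure_future(arg)
    outer = asyncio.get_event_loop().create_future()
    def on_done(_):
        if outer.cancelled():
            return  # the waiter is gone; there is nobody to notify
        if inner.cancelled():
            outer.cancel()
        elif inner.exception() is not None:
            outer.set_exception(inner.exception())
        else:
            outer.set_result(inner.result())
    inner.add_done_callback(on_done)
    return outer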
Context
First, let's consider the wider context:
caller() --> your_coro() --> callee()
You are in control of your coroutine, but not in control of callers and only in partial control of callees.
By default, cancellation is effectively "propagated" both up and down the stack:
(1)
caller1() ------------------+   (2)
                            +--> callee()
caller2() --> your_coro() --+
   (4)           (3)
In this diagram, semantically and very loosely, if caller1() is actively cancelled, then callee() gets cancelled, then your coroutine gets cancelled, and then caller2() gets cancelled. Roughly the same happens if caller2() is actively cancelled.
(callee() is shared, and thus not a plain coroutine but a Task or a Future.)
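Here is a minimal demonstration of that propagation (the names are illustrative); both callers await the shared task directly, so cancelling caller1 first cancels callee(), whose cancellation then surfaces in caller2:

import asyncio

async def callee():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        print('callee() cancelled')  # steps (2) and (3)
        raise

async def caller(name, shared):
    try:
        await shared
    except asyncio.CancelledError:
        print(f'{name} sees the cancellation')
        raise

async def main():
    shared = asyncio.ensure_future(callee())
    c1 = asyncio.ensure_future(caller('caller1', shared))
    c2 = asyncio.ensure_future(caller('caller2', shared))
    await asyncio.sleep(.1)  # let everything start running
    c1.cancel()              # step (1): actively cancel caller1
    await asyncio.gather(c1, c2, return_exceptions=True)

asyncio.get_event_loop().run_until_complete(main())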
What alternate behaviour might you want?
Shield
If you want callee() to continue even if caller2() is cancelled, shield it:
callee_f = asyncio.ensure_future(callee())

async def your_coro():
    # I might die, but I won't take callee down with me
    await asyncio.shield(callee_f)
Reverse shield
If you allow callee() to die, but want your coroutine to continue, convert the exception:
async def reverse_shield(awaitable):
    try:
        return await awaitable
    except asyncio.CancelledError:
        raise Exception("custom")

async def your_coro():
    await reverse_shield(callee_f)
    # handle custom exception
Shield yourself
This one is questionable — normally you should allow your caller to cancel your coroutine.
A notable exception is when your caller is a framework and it's not configurable.
def your_coro():
    async def inner():
        ...
    return asyncio.shield(inner())
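A quick check of that behaviour, as a hedged sketch (demo() and the prints are illustrative): cancelling the shielded await raises CancelledError in the caller, while inner() keeps running:

import asyncio

def your_coro():
    async def inner():
        await asyncio.sleep(.2)
        print('inner() survived the cancellation')
    return asyncio.shield(inner())

async def demo():
    waiter = your_coro()    # a shielded future
    await asyncio.sleep(0)  # let inner() start
    waiter.cancel()         # simulate the framework cancelling us
    try:
        await waiter
    except asyncio.CancelledError:
        print('the outer await was cancelled')
    await asyncio.sleep(.3)  # ...but inner() still runs to completion

asyncio.get_event_loop().run_until_complete(demo())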
Related
I am reading a book and came across this code snippet, which doesn't make sense to me. Can someone clarify it for me?
import asyncio
import time

async def main():
    print(f'{time.ctime()} Hello!')
    await asyncio.sleep(1.0)
    print(f'{time.ctime()} Goodbye!')

loop = asyncio.get_event_loop()
task = loop.create_task(main())

# This line blocks the thread (the MainThread in my case) until every
# coroutine is finished.
loop.run_until_complete(task)

# asyncio.all_tasks() returns a set of not yet finished Task objects run
# by the loop. Based on that definition, pending will always be an empty set.
pending = asyncio.all_tasks(loop=loop)
for task in pending:
    task.cancel()
group = asyncio.gather(*pending, return_exceptions=True)
loop.run_until_complete(group)
loop.close()
I think asyncio.all_tasks() should be used before the loop.run_until_complete() call. Besides, I have found many other places where it is useful, but this example makes absolutely no sense to me. I am really interested in why the author did that. What was the point?
What you are thinking is correct. There is no point in calling .all_tasks() here, as it always returns an empty set. You only have one task, and you pass it to .run_until_complete(), so it blocks until that task is done.
But things change when you have another task that takes longer than your main coroutine:
import asyncio
import time

async def longer_task():
    print("inside longer coroutine")
    await asyncio.sleep(2)

async def main():
    print(f"{time.ctime()} Hello!")
    await asyncio.sleep(1.0)
    print(f"{time.ctime()} Goodbye!")

loop = asyncio.new_event_loop()
task1 = loop.create_task(main())
task2 = loop.create_task(longer_task())
loop.run_until_complete(task1)
pending = asyncio.all_tasks(loop=loop)
print(pending)
for task in pending:
    task.cancel()
group = asyncio.gather(*pending, return_exceptions=True)
loop.run_until_complete(group)
loop.close()
The event loop only cares about finishing task1, so task2 will still be pending.
I think asyncio.all_tasks() should be used before the loop.run_until_complete() call.
As soon as you call create_task(), the task will be included in the set returned by all_tasks(), even if the loop has not started yet.
Just a side note: since you don't have a running event loop, .get_event_loop() will warn you as of version 3.10. Use .new_event_loop() instead.
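A small sketch of the first point: a task appears in all_tasks() as soon as it is created, before it has run a single step (checked inside a running coroutine here, since modern asyncio.all_tasks() requires a running loop):

import asyncio

async def worker():
    await asyncio.sleep(1)

async def main():
    task = asyncio.create_task(worker())
    # the task is registered immediately, before its first step runs
    print(task in asyncio.all_tasks())  # True
    await task

asyncio.run(main())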
I need to wrap a coroutine that returns data. If the data is returned, then it is not available anymore. If the coroutine is cancelled, the data is available next call. I need the wrapping coroutine to have the same behavior, however sometimes it is cancelled while the wrapped coroutine has already finished.
I can reproduce this behavior with the following code.
import asyncio

loop = asyncio.get_event_loop()
fut = asyncio.Future()

async def wait():
    return await fut

task = asyncio.ensure_future(wait())

async def test():
    await asyncio.sleep(0.1)
    fut.set_result('data')
    print('fut', fut)
    print('task', task)
    task.cancel()
    await asyncio.sleep(0.1)
    print('fut', fut)
    print('task', task)

loop.run_until_complete(test())
The output clearly shows that the wrapping coroutine was cancelled after the wrapped coroutine finished, meaning that the data is lost forever. I cannot shield the call either, because if I'm cancelled I have no data to return anyway.
fut <Future finished result='data'>
task <Task pending coro=<wait() running at <ipython-input-8-6d115ded09c6>:7> wait_for=<Future finished result='data'>>
fut <Future finished result='data'>
task <Task cancelled coro=<wait() done, defined at <ipython-input-8-6d115ded09c6>:6>>
In my case, this is due to having two futures, one completing the wrapped coroutine and one cancelling the wrapping coroutine, which sometimes get completed together. I could probably delay the cancel (via asyncio.sleep(0)), but can I be sure it will never happen by accident?
The problem makes more sense with a task:
import asyncio

loop = asyncio.get_event_loop()
data = []
fut_data = asyncio.Future()

async def get_data():
    while not data:
        await asyncio.shield(fut_data)
    return data.pop()

fut_wrapper = asyncio.Future()

async def wrapper_data():
    task = asyncio.ensure_future(get_data())
    return await task

async def test():
    task = asyncio.ensure_future(wrapper_data())
    await asyncio.sleep(0)
    data.append('data')
    fut_data.set_result(None)
    await asyncio.sleep(0)
    print('wrapper_data', task)
    task.cancel()
    await asyncio.sleep(0)
    print('wrapper_data', task)
    print('data', data)

loop.run_until_complete(test())
task <Task cancelled coro=<wrapper_data() done, defined at <ipython-input-2-93645b78e9f7>:16>>
data []
The data has been consumed, but the task has been cancelled, so the data cannot be retrieved. Awaiting get_data() directly would work, but then it could not be cancelled.
I think you need to first shield the awaited future from cancellation, then detect your own cancellation. If the future hasn't completed, propagate the cancellation into it (effectively undoing the shield()) and out. If the future has completed, ignore the cancellation and return the data.
The code would look like this, also changed to avoid global variables and to use asyncio.run() (which you can turn into run_until_complete() if you're on Python 3.6):
import asyncio

async def wait(fut):
    try:
        return await asyncio.shield(fut)
    except asyncio.CancelledError:
        if fut.done():
            # we've been cancelled, but we have the data - ignore the
            # cancel request
            return fut.result()
        # otherwise, propagate the cancellation into the future...
        fut.cancel()
        # ...and to the caller
        raise

async def test():
    loop = asyncio.get_event_loop()
    fut = loop.create_future()
    task = asyncio.create_task(wait(fut))
    await asyncio.sleep(0.1)
    fut.set_result('data')
    print('fut', fut)
    print('task', task)
    task.cancel()
    await asyncio.sleep(0.1)
    print('fut', fut)
    print('task', task)

asyncio.run(test())
Note that ignoring the cancel request can be thought of as an abuse of the cancellation mechanism. But if the task is known to proceed afterwards (ideally finishing immediately), it might be the right thing in your situation. Caution is advised.
I am working on a project that uses the ccxt async library, which requires all resources used by a certain class to be released with an explicit call to the class's .close() coroutine. I want to exit the program with Ctrl+C and await the close coroutine in the exception handler. However, it is never awaited.
The application consists of the modules harvesters, strategies, traders, broker, and main (plus config and such). The broker initiates the strategies specified for an exchange and executes them. The strategy initiates the associated harvester which collects the necessary data. It also analyses the data and spawns a trader when there is a profitable opportunity. The main module creates a broker for each exchange and runs it. I have tried to catch the exception at each of these levels, but the close routine is never awaited. I'd prefer to catch it in the main module in order to close all exchange instances.
Harvester
async def harvest(self):
    if not self.routes:
        self.routes = await self.get_routes()
    for route in self.routes:
        self.logger.info("Harvesting route {}".format(route))
        await asyncio.sleep(self.exchange.rateLimit / 1000)
        yield await self.harvest_route(route)
Strategy
async def execute(self):
    async for route_dct in self.harvester.harvest():
        self.logger.debug("Route dictionary: {}".format(route_dct))
        await self.try_route(route_dct)
Broker
async def run(self):
    for strategy in self.strategies:
        self.strategies[strategy] = getattr(
            strategies, strategy)(self.share, self.exchange, self.currency)
    while True:
        try:
            await self.execute_strategies()
        except KeyboardInterrupt:
            await safe_exit(self.exchange)
Main
async def main():
    await load_exchanges()
    await load_markets()
    brokers = [Broker(
        share,
        exchanges[id]["api"],
        currency,
        exchanges[id]["strategies"]
    ) for id in exchanges]
    futures = [broker.run() for broker in brokers]
    for future in asyncio.as_completed(futures):
        executed = await future
    return executed

if __name__ == "__main__":
    status = asyncio.run(main())
    sys.exit(status)
I had expected the close() coroutine to be awaited, but I still get an error from the library that I must explicitly call it. Where do I catch the exception so that all exchange instances are closed properly?
Somewhere in your code there should be an entry point where the event loop is started.
Usually it is one of the calls below:
loop.run_until_complete(main())
loop.run_forever()
asyncio.run(main())
When Ctrl+C happens, the KeyboardInterrupt can be caught at this line. Once it is caught, you can run the event loop again to execute a finalizing coroutine.
This little example shows the idea:
import asyncio

async def main():
    print('Started, press Ctrl+C')
    await asyncio.sleep(10)

async def close():
    print('Finalizing...')
    await asyncio.sleep(1)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main())
    except KeyboardInterrupt:
        loop.run_until_complete(close())
    finally:
        print('Program finished')
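The same idea works with asyncio.run() on Python 3.7+; here is a hedged variant, relying on the fact that asyncio.run() creates a fresh event loop on every call:

import asyncio

async def main():
    print('Started, press Ctrl+C')
    await asyncio.sleep(10)

async def close():
    print('Finalizing...')
    await asyncio.sleep(1)

if __name__ == '__main__':
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        # the first loop is closed by now; a second asyncio.run()
        # spins up a new loop just for the finalizer
        asyncio.run(close())
    finally:
        print('Program finished')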
I have two tasks in a consumer/producer relationship, separated by an asyncio.Queue. If the producer task fails, I'd like the consumer task to fail as soon as possible too, and not wait indefinitely on the queue. The consumer task can be created (spawned) independently from the producer task.
In general terms, I'd like to implement a dependency between two tasks, such that the failure of one is also the failure of the other, while keeping the two tasks concurrent (i.e., one will not await the other directly).
What kind of solutions (e.g. patterns) could be used here?
Basically, I'm thinking of Erlang's "links".
I think it may be possible to implement something similar using callbacks, i.e. asyncio.Task.add_done_callback
Thanks!
From the comment:
The behavior I'm trying to avoid is the consumer being oblivious to the producer's death and waiting indefinitely on the queue. I want the consumer to be notified of the producer's death and have a chance to react, or just fail, even while it is waiting on the queue.
Other than the answer presented by Yigal, another way is to set up a third task that monitors the two and cancels one when the other one finishes. This can be generalized to any two tasks:
async def cancel_when_done(source, target):
    assert isinstance(source, asyncio.Task)
    assert isinstance(target, asyncio.Task)
    try:
        await source
    except:
        # SOURCE is a task which we expect to be awaited by someone else
        pass
    target.cancel()
Now when setting up the producer and the consumer, you can link them with the above function. For example:
import asyncio
import itertools

# async def cancel_when_done defined as above

async def producer(q):
    for i in itertools.count():
        await q.put(i)
        await asyncio.sleep(.2)
        if i == 7:
            1/0

async def consumer(q):
    while True:
        val = await q.get()
        print('got', val)

async def main():
    loop = asyncio.get_event_loop()
    queue = asyncio.Queue()
    p = loop.create_task(producer(queue))
    c = loop.create_task(consumer(queue))
    loop.create_task(cancel_when_done(p, c))
    await asyncio.gather(p, c)

asyncio.get_event_loop().run_until_complete(main())
One way would be to propagate the exception through the queue, combined with delegation of the work handling:
import asyncio

class ValidWorkLoad:
    async def do_work(self, handler):
        await handler(self)

class HellBrokeLoose:
    def __init__(self, exception):
        self._exception = exception

    async def do_work(self, handler):
        raise self._exception

async def worker(name, queue):
    async def handler(work_load):
        print(f'{name} handled')
    while True:
        next_work = await queue.get()
        try:
            await next_work.do_work(handler)
        except Exception as e:
            print(f'{name} caught exception: {type(e)}: {e}')
            break
        finally:
            queue.task_done()

async def producer(name, queue):
    i = 0
    while True:
        try:
            # Produce some work, or fail while trying
            new_work = ValidWorkLoad()
            i += 1
            if i % 3 == 0:
                raise ValueError(i)
            await queue.put(new_work)
            print(f'{name} produced')
            await asyncio.sleep(0)  # Preempt just for the sake of the example
        except Exception as e:
            print('Exception occurred')
            await queue.put(HellBrokeLoose(e))
            break

loop = asyncio.get_event_loop()
queue = asyncio.Queue(loop=loop)
producer_coro = producer('Producer', queue)
consumer_coro = worker('Consumer', queue)
loop.run_until_complete(asyncio.gather(producer_coro, consumer_coro))
loop.close()
Which outputs:
Producer produced
Consumer handled
Producer produced
Consumer handled
Exception occurred
Consumer caught exception: <class 'ValueError'>: 3
Alternatively you could skip the delegation, and designate an item that signals the worker to stop. When catching an exception in the producer you put that designated item in the queue.
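A minimal sketch of that sentinel variant (the STOP name and the failure condition are illustrative):

import asyncio

STOP = object()  # designated item that tells the worker to stop

async def producer(queue):
    try:
        for i in range(10):
            if i == 3:
                raise ValueError(i)  # simulated failure
            await queue.put(i)
    except Exception as e:
        print(f'producer failed: {e!r}')
        await queue.put(STOP)

async def worker(queue):
    while True:
        item = await queue.get()
        if item is STOP:
            print('worker: producer stopped, shutting down')
            break
        print('worker got', item)

async def main():
    queue = asyncio.Queue()
    await asyncio.gather(producer(queue), worker(queue))

asyncio.run(main())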
Another possible solution:
import asyncio
import functools
from typing import Union

def link_tasks(t1: Union[asyncio.Task, asyncio.Future], t2: Union[asyncio.Task, asyncio.Future]):
    """
    Link the fate of two asyncio tasks,
    such that the failure or cancellation of one
    triggers the cancellation of the other
    """
    def done_callback(other: asyncio.Task, t: asyncio.Task):
        # TODO: log cancellation due to link propagation
        if t.cancelled():
            other.cancel()
        elif t.exception():
            other.cancel()
    t1.add_done_callback(functools.partial(done_callback, t2))
    t2.add_done_callback(functools.partial(done_callback, t1))
This uses asyncio.Task.add_done_callback to register callbacks that will cancel the other task if either one fails or is cancelled.
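A hedged usage sketch, reusing the producer() and consumer() coroutines from the cancel_when_done example above:

async def main():
    queue = asyncio.Queue()
    p = asyncio.ensure_future(producer(queue))
    c = asyncio.ensure_future(consumer(queue))
    link_tasks(p, c)
    # when the producer dies, its done callback cancels the consumer
    await asyncio.gather(p, c, return_exceptions=True)

asyncio.get_event_loop().run_until_complete(main())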
asyncio\queues.py
@coroutine
def put(self, item):
    """Put an item into the queue.

    Put an item into the queue. If the queue is full, wait until a free
    slot is available before adding item.

    This method is a coroutine.
    """
    while self.full():
        putter = futures.Future(loop=self._loop)
        self._putters.append(putter)
        try:
            yield from putter
        except:
            putter.cancel()  # Just in case putter is not done yet.
            if not self.full() and not putter.cancelled():
                # We were woken up by get_nowait(), but can't take
                # the call. Wake up the next in line.
                self._wakeup_next(self._putters)
            raise
    return self.put_nowait(item)
In my view, putter can be completed by cancel(), set_exception(), or set_result(). get_nowait() uses set_result(), and only cancel() and set_exception() make yield from putter raise, so only then can the except: branch run. I think the except: is not needed.
Why does it use an except: block to wake up the next in line?
Update: @Vincent
_wakeup_next() calls set_result(), and set_result() sets self._state = _FINISHED. task1.cancel() then calls self._fut_waiter.cancel(), which returns False, so task1 should not be cancelled.
@Vincent, thanks very much. The key point is that task.cancel() can still cancel the task even though the future the task is waiting on has already been completed with set_result() (self._state = _FINISHED).
If the task waiting for putter is cancelled, yield from putter raises a CancelledError. That can happen after get_nowait() is called, and you want to make sure the other putters are notified that a new slot is available in the queue.
Here's an example:
import asyncio

async def main():
    # Create a full queue
    queue = asyncio.Queue(1)
    await queue.put('A')
    # Schedule two putters as tasks
    task1 = asyncio.ensure_future(queue.put('B'))
    task2 = asyncio.ensure_future(queue.put('C'))
    await asyncio.sleep(0)
    # Make room in the queue, print 'A'
    print(queue.get_nowait())
    # Cancel task 1 before giving the control back to the event loop
    task1.cancel()
    # Thankfully, the putter in task 2 has been notified
    await task2
    # Print 'C'
    print(await queue.get())

asyncio.get_event_loop().run_until_complete(main())
EDIT: More information about what happens internally:

- queue.get_nowait(): putter.set_result(None) is called; the putter state is now FINISHED, and task1 will wake up when control is given back to the event loop.
- task1.cancel(): task1._fut_waiter is already finished, so task1._must_cancel is set to True in order to raise a CancelledError the next time task1 runs.
- await task2: control is given back to the event loop, and task1._step() runs. A CancelledError is thrown inside the coroutine: task1._coro.throw(CancelledError()). queue.put catches the exception. Since the queue is not full and 'B' is not going to be inserted, the next putter in line has to be notified: self._wakeup_next(self._putters). The CancelledError is then re-raised and caught in task1._step(), and task1 finally cancels itself (super().cancel()).