I am a new Python programmer who is trying to write a 'bot' to trade on Betfair for myself. (Ambitious!!!!)
The problem that has arisen is this: I have the basics of an asyncio event loop running, but I have noticed that if one of the coroutines fails in its work (for instance an API call or a MongoDB read fails), the asyncio event loop just carries on running and ignores the one failed coroutine.
My question is how I could either restart that one coroutine automatically or handle the error so as to stop the complete asyncio loop. At the moment everything runs seemingly oblivious to the fact that something is not right and one portion of it has failed. In my case the loop never returned to the 'rungetcompetitionids' function after a database read was unsuccessful, and it never returned to that function again even though it is in a while True loop.
The user GUI is not yet functional and is only there to try out asyncio.
thanks
Clive
import sys
import datetime
from login import sessiontoken as gst
from mongoenginesetups.setupmongo import global_init as initdatabase
from asyncgetcompetitionids import competition_id_pass as gci
from asyncgetcompetitionids import create_comp_id_table_list as ccid
import asyncio
import PySimpleGUI as sg

sg.change_look_and_feel('DarkAmber')

layout = [
    [sg.Text('Password'), sg.InputText(password_char='*', key='password')],
    [sg.Text('', key='status')],
    [sg.Button('Submit'), sg.Button('Cancel')]
]

window = sg.Window('Betfair', layout)

def initialisethedatabase():
    initdatabase('xxxx', 'xxxx', xxxx, 'themongo1', True)

async def runsessiontoken():
    nextlogontime = datetime.datetime.now()
    while True:
        returned_login_time = gst(nextlogontime)
        nextlogontime = returned_login_time
        await asyncio.sleep(15)

async def rungetcompetitionids(compid_from_compid_table_list):
    nextcompidtime = datetime.datetime.now()
    while True:
        returned_time, returned_list = gci(nextcompidtime, compid_from_compid_table_list)
        nextcompidtime = returned_time
        compid_from_compid_table_list = returned_list
        await asyncio.sleep(10)

async def userinterface():
    while True:
        event, value = window.read(timeout=1)
        if event in (None, 'Cancel'):
            sys.exit()
        if event != "__TIMEOUT__":
            print(f"{event} {value}")
        await asyncio.sleep(0.0001)

async def wait_list():
    await asyncio.wait([runsessiontoken(),
                        rungetcompetitionids(compid_from_compid_table_list),
                        userinterface()
                        ])

initialisethedatabase()

compid_from_compid_table_list = ccid()
print(compid_from_compid_table_list)

nextcompidtime = datetime.datetime.now()
print(nextcompidtime)

loop = asyncio.get_event_loop()
loop.run_until_complete(wait_list())
loop.close()
A simple solution would be to use a wrapper function (or "supervisor") that catches Exception and then just blindly retries the function. More elegant solutions would include printing out the exception and stack trace for diagnostic purposes, and querying the application state to see if it makes sense to try to continue. For instance, if Betfair tells you your account is not authorised, then continuing makes no sense, whereas if it is a general network error then retrying immediately might be worthwhile. You might also want to stop retrying if the supervisor notices it has restarted a lot in a short space of time.
eg.
import asyncio
import traceback
import functools
from collections import deque
from time import monotonic

MAX_INTERVAL = 30
RETRY_HISTORY = 3
# That is, stop after the 3rd failure in a 30 second moving window

def supervise(func, name=None, retry_history=RETRY_HISTORY, max_interval=MAX_INTERVAL):
    """Simple wrapper function that automatically tries to name tasks"""
    if name is None:
        if hasattr(func, '__name__'):  # raw func
            name = func.__name__
        elif hasattr(func, 'func'):  # partial
            name = func.func.__name__
    return asyncio.create_task(supervisor(func, retry_history, max_interval), name=name)

async def supervisor(func, retry_history=RETRY_HISTORY, max_interval=MAX_INTERVAL):
    """Takes a no-args function that creates a coroutine, and repeatedly tries
    to run it. It stops if it thinks the coroutine is failing too often or
    too fast.
    """
    start_times = deque([float('-inf')], maxlen=retry_history)
    while True:
        start_times.append(monotonic())
        try:
            return await func()
        except Exception:
            if min(start_times) > monotonic() - max_interval:
                print(
                    f'Failure in task {asyncio.current_task().get_name()!r}.'
                    ' Is it in a restart loop?'
                )
                # we tried our best, this coroutine really isn't working.
                # We should try to shut down gracefully by setting a global flag
                # that other coroutines should periodically check and stop if they
                # see that it is set. However, here we just reraise the exception.
                raise
            else:
                print(func.__name__, 'failed, will retry. Failed because:')
                traceback.print_exc()

async def a():
    await asyncio.sleep(2)
    raise ValueError

async def b(greeting):
    for i in range(15):
        print(greeting, i)
        await asyncio.sleep(0.5)

async def main_async():
    tasks = [
        supervise(a),
        # passing repeated argument to coroutine (or use lambda)
        supervise(functools.partial(b, 'hello'))
    ]
    await asyncio.wait(
        tasks,
        # Only stop when all coroutines have completed
        # -- this allows for a graceful shutdown
        # Alternatively use FIRST_EXCEPTION to stop immediately
        return_when=asyncio.ALL_COMPLETED,
    )
    return tasks

def main():
    # we run outside of the event loop, so we can carry out a post-mortem
    # without needing the event loop to be running.
    done = asyncio.run(main_async())
    for task in done:
        if task.cancelled():
            print(task, 'was cancelled')
        elif task.exception():
            print(task, 'failed with:')
            # we use a try/except here to reconstruct the traceback for logging purposes
            try:
                task.result()
            except:
                # we can use a bare-except as we are not trying to block
                # the exception -- just record all that may have happened.
                traceback.print_exc()

main()
And this will result in output like:
hello 0
hello 1
hello 2
hello 3
a failed, will retry. Failed because:
Traceback (most recent call last):
File "C:\Users\User\Documents\python\src\main.py", line 30, in supervisor
return await func()
File "C:\Users\User\Documents\python\src\main.py", line 49, in a
raise ValueError
ValueError
hello 4
hello 5
hello 6
hello 7
a failed, will retry. Failed because:
Traceback (most recent call last):
File "C:\Users\User\Documents\python\src\main.py", line 30, in supervisor
return await func()
File "C:\Users\User\Documents\python\src\main.py", line 49, in a
raise ValueError
ValueError
hello 8
hello 9
hello 10
hello 11
Failure in task 'a'. Is it in a restart loop?
hello 12
hello 13
hello 14
exception=ValueError()> failed with:
Traceback (most recent call last):
File "C:\Users\User\Documents\python\src\main.py", line 84, in main
task.result()
File "C:\Users\User\Documents\python\src\main.py", line 30, in supervisor
return await func()
File "C:\Users\User\Documents\python\src\main.py", line 49, in a
raise ValueError
ValueError
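Applied to the code in the question, a rough sketch (assuming the supervise helper above is in scope alongside the original functions and the compid_from_compid_table_list global) could look like this:

import asyncio
import functools

async def main_async():
    tasks = [
        supervise(runsessiontoken),
        # the competition-id list is passed in via functools.partial, as above
        supervise(functools.partial(rungetcompetitionids, compid_from_compid_table_list)),
        supervise(userinterface),
    ]
    # wait for everything so a single failing task does not end the program silently
    await asyncio.wait(tasks, return_when=asyncio.ALL_COMPLETED)

asyncio.run(main_async())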
I have a big project which depends some third-party libraries, and sometimes its execution gets interrupted by a CancelledError.
To demonstrate the issue, let's look at a small example:
import asyncio

async def main():
    task = asyncio.create_task(foo())
    # Cancel the task in 1 second.
    loop = asyncio.get_event_loop()
    loop.call_later(1.0, lambda: task.cancel())
    await task

async def foo():
    await asyncio.sleep(999)

if __name__ == '__main__':
    asyncio.run(main())
Traceback:
Traceback (most recent call last):
File "/Users/ss/Library/Application Support/JetBrains/PyCharm2021.2/scratches/async.py", line 19, in <module>
asyncio.run(main())
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
return future.result()
concurrent.futures._base.CancelledError
As you can see, there's no information about the place the CancelledError originates from. How do I find out the exact cause of it?
One approach that I came up with is to place a lot of try/except blocks which would catch the CancelledError and narrow down the place where it comes from. But that's quite tedious.
I've solved it by applying a decorator to every async function in the project. The decorator's job is simple: log a message when a CancelledError is raised from the function. This way we can see which functions (and, more importantly, in which order) get cancelled.
Here's the decorator code:
def log_cancellation(f):
    async def wrapper(*args, **kwargs):
        try:
            return await f(*args, **kwargs)
        except asyncio.CancelledError:
            print(f"Cancelled {f}")
            raise
    return wrapper
In order to add this decorator everywhere I used regex. Find: (.*)(async def). Replace with: $1@log_cancellation\n$1$2.
Also to avoid importing log_cancellation in every file I modified the builtins:
builtins.log_cancellation = log_cancellation
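Putting the two together, a minimal sketch of what the project ends up looking like (fetch_data is just a placeholder coroutine name):

import asyncio
import builtins

def log_cancellation(f):                      # the decorator as defined above
    async def wrapper(*args, **kwargs):
        try:
            return await f(*args, **kwargs)
        except asyncio.CancelledError:
            print(f"Cancelled {f}")
            raise
    return wrapper

builtins.log_cancellation = log_cancellation  # so no per-file import is needed

# what each coroutine looks like after the regex replacement:
@log_cancellation
async def fetch_data(session):
    await asyncio.sleep(1)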
The rich package has helped us to identify the cause of CancelledError, without much code change required.
import asyncio

from rich.console import Console

console = Console()

if __name__ == "__main__":
    try:
        asyncio.run(main())  # replace main() with your entrypoint
    except BaseException as e:
        console.print_exception(show_locals=True)
When I run this code in Python 3.7:
import asyncio

sem = asyncio.Semaphore(2)

async def work():
    async with sem:
        print('working')
        await asyncio.sleep(1)

async def main():
    await asyncio.gather(work(), work(), work())

asyncio.run(main())
It fails with RuntimeError:
$ python3 demo.py
working
working
Traceback (most recent call last):
File "demo.py", line 13, in <module>
asyncio.run(main())
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
return future.result()
File "demo.py", line 11, in main
await asyncio.gather(work(), work(), work())
File "demo.py", line 6, in work
async with sem:
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/locks.py", line 92, in __aenter__
await self.acquire()
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/locks.py", line 474, in acquire
await fut
RuntimeError: Task <Task pending coro=<work() running at demo.py:6> cb=[gather.<locals>._done_callback() at /opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/tasks.py:664]> got Future <Future pending> attached to a different loop
Python 3.10+: This error message should not occur anymore; see the answer from mmdanziger below:
(...) the implementation of Semaphore has been changed and no longer grabs the current loop on init
Python 3.9 and older:
It's because the Semaphore constructor sets its _loop attribute – in asyncio/locks.py:
class Semaphore(_ContextManagerMixin):
    def __init__(self, value=1, *, loop=None):
        if value < 0:
            raise ValueError("Semaphore initial value must be >= 0")
        self._value = value
        self._waiters = collections.deque()
        if loop is not None:
            self._loop = loop
        else:
            self._loop = events.get_event_loop()
But asyncio.run() starts a completely new loop – in asyncio/runners.py; this is also mentioned in the documentation:
def run(main, *, debug=False):
    if events._get_running_loop() is not None:
        raise RuntimeError(
            "asyncio.run() cannot be called from a running event loop")
    if not coroutines.iscoroutine(main):
        raise ValueError("a coroutine was expected, got {!r}".format(main))

    loop = events.new_event_loop()
    ...
A Semaphore instantiated outside of asyncio.run() grabs the asyncio "default" loop and so cannot be used with the event loop created by asyncio.run().
Solution
Instantiate the Semaphore from code called by asyncio.run(). You will have to pass it to the right place; there are several ways to do that (you could, for example, use contextvars), but I will just give the simplest example:
import asyncio

async def work(sem):
    async with sem:
        print('working')
        await asyncio.sleep(1)

async def main():
    sem = asyncio.Semaphore(2)
    await asyncio.gather(work(sem), work(sem), work(sem))

asyncio.run(main())
The same issue (and solution) is probably also with asyncio.Lock, asyncio.Event, and asyncio.Condition.
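For the contextvars variant mentioned above, a minimal sketch (the sem_var name is just illustrative); the ContextVar is created at module level, but the Semaphore itself is only created and set once the loop is running inside main():

import asyncio
import contextvars

sem_var = contextvars.ContextVar('sem')

async def work():
    async with sem_var.get():          # look up the semaphore from the context
        print('working')
        await asyncio.sleep(1)

async def main():
    sem_var.set(asyncio.Semaphore(2))  # created while the loop is running
    await asyncio.gather(work(), work(), work())

asyncio.run(main())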
Update: As of Python 3.10 the OP's code will run as written. This is because the implementation of Semaphore has been changed and no longer grabs the current loop on init. See this answer for more discussion.
Python 3.10 implementation from GitHub
class Semaphore(_ContextManagerMixin, mixins._LoopBoundMixin):
    """A Semaphore implementation.

    A semaphore manages an internal counter which is decremented by each
    acquire() call and incremented by each release() call. The counter
    can never go below zero; when acquire() finds that it is zero, it blocks,
    waiting until some other thread calls release().

    Semaphores also support the context management protocol.

    The optional argument gives the initial value for the internal
    counter; it defaults to 1. If the value given is less than 0,
    ValueError is raised.
    """

    def __init__(self, value=1, *, loop=mixins._marker):
        super().__init__(loop=loop)
        if value < 0:
            raise ValueError("Semaphore initial value must be >= 0")
        self._value = value
        self._waiters = collections.deque()
        self._wakeup_scheduled = False
An alternative solution for Python 3.9 and older is to instantiate the Event, Lock, Semaphore, etc. as a first step inside the main() task, where possible.
I validated this with an Event case tested on Python 3.10 (Windows) vs Python 3.9 (Raspberry Pi).
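A minimal sketch of that approach with an Event (the same idea applies to Lock and Semaphore; the waiter name is just illustrative):

import asyncio

async def waiter(event):
    await event.wait()
    print('event was set')

async def main():
    # created inside the running loop, so on 3.9 and older it is bound to the
    # loop started by asyncio.run()
    event = asyncio.Event()
    asyncio.get_running_loop().call_later(1, event.set)
    await waiter(event)

asyncio.run(main())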
I have a project in Python 3.5 without any usage of asynchronous features. I have to implement the following logic:
def should_return_in_3_sec(some_serious_job, arguments, finished_callback):
    # Start some_serious_job(*arguments) in a task
    # if it finishes within 3 sec:
    #     return result immediately
    # otherwise return None, but do not terminate task.
    # If the task finishes in 1 minute:
    #     call finished_callback(result)
    # else:
    #     call finished_callback(None)
    pass
The function should_return_in_3_sec() should remain synchronous, but it is up to me to write any new asynchronous code (including some_serious_job()).
What is the most elegant and pythonic way to do it?
Fork off a thread doing the serious job, let it write its result into a queue and then terminate. Read from that queue in your main thread with a timeout of three seconds. If the timeout occurs, start another thread and return None. Let the second thread read from the queue with a timeout of one minute; if that also times out, call finished_callback(None); otherwise call finished_callback(result).
I sketched it like this:
import threading, queue

def should_return_in_3_sec(some_serious_job, arguments, finished_callback):
    result_queue = queue.Queue(1)

    def do_serious_job_and_deliver_result():
        result = some_serious_job(arguments)
        result_queue.put(result)

    threading.Thread(target=do_serious_job_and_deliver_result).start()
    try:
        result = result_queue.get(timeout=3)
    except queue.Empty:  # timeout?
        def expect_and_handle_late_result():
            try:
                result = result_queue.get(timeout=60)
            except queue.Empty:
                finished_callback(None)
            else:
                finished_callback(result)
        threading.Thread(target=expect_and_handle_late_result).start()
        return None
    else:
        return result
The threading module has some simple timeout options, see Thread.join(timeout) for example.
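For instance, a small sketch of the Thread.join(timeout) pattern (slow_job is just an illustrative stand-in):

import threading
import time

def slow_job():
    time.sleep(5)

t = threading.Thread(target=slow_job)
t.start()
t.join(timeout=3)                      # block for at most 3 seconds
print('still running' if t.is_alive() else 'finished')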
If you do choose to use asyncio, below is a partial solution that addresses some of your needs:
import asyncio
import time

async def late_response(task, flag, timeout, callback):
    done, pending = await asyncio.wait([task], timeout=timeout)
    callback(done.pop().result() if done else None)  # will raise an exception if some_serious_job failed
    flag[0] = True  # signal some_serious_job to stop
    return await task

async def launch_job(loop, some_serious_job, arguments, finished_callback,
                     timeout_1=3, timeout_2=5):
    flag = [False]
    task = loop.run_in_executor(None, some_serious_job, flag, *arguments)

    done, pending = await asyncio.wait([task], timeout=timeout_1)
    if done:
        return done.pop().result()  # will raise an exception if some_serious_job failed

    asyncio.ensure_future(
        late_response(task, flag, timeout_2, finished_callback))
    return None

def f(flag, n):
    for i in range(n):
        print("serious", i, flag)
        if flag[0]:
            return "CANCELLED"
        time.sleep(1)
    return "OK"

def finished(result):
    print("FINISHED", result)

loop = asyncio.get_event_loop()
result = loop.run_until_complete(launch_job(loop, f, [1], finished))
print("result:", result)
loop.run_forever()
This will run the job in a separate thread (use loop.set_default_executor(ProcessPoolExecutor()) to run a CPU-intensive task in a process instead). Keep in mind it is bad practice to terminate a process/thread - the code above uses a very simple list to signal the thread to stop (see also threading.Event / multiprocessing.Event).
While implementing your solution, you might discover you would want to modify your existing code to use coroutines instead of threads.
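For completeness, a sketch of the same stop signal using threading.Event instead of the flag list (some_serious_job here is just an illustrative stand-in):

import threading
import time

def some_serious_job(stop, n):
    for i in range(n):
        if stop.is_set():              # cooperative cancellation check, like flag[0]
            return "CANCELLED"
        print("serious", i)
        time.sleep(1)
    return "OK"

stop = threading.Event()
worker = threading.Thread(target=some_serious_job, args=(stop, 10))
worker.start()
time.sleep(3)
stop.set()                             # ask the job to stop at its next check
worker.join()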
Want to know which awaitable a Python asyncio task is currently waiting on?
After some research, I did not find anything in the Python 3 asyncio library for finding the current await or yield from directive a Task instance is currently on.
I want to share this with you: feel free to give some advice on how it can be improved or if something better exists from your own recipes or from the standard library itself :)
import inspect
import linecache
import traceback

def get_awaitable_stack_trace(task, file):
    """
    Get the callstack representing the chain of await/yield from directives
    that a Task object is currently on.

    :param task: The Task object.
    :param file: The file-like object on which the callstack is written.
    """
    extracted_list = []
    coro = task._coro
    while True:
        # Get the information on the current coroutine or generator.
        coro_name = coro.__name__
        try:
            frame = coro.cr_frame
        except AttributeError:
            frame = coro.gi_frame
        coro_filename = frame.f_code.co_filename
        await_line_number = frame.f_lineno
        linecache.checkcache(coro_filename)
        line = linecache.getline(coro_filename, await_line_number,
                                 frame.f_globals)

        # Record the stack trace info for this coroutine.
        extracted_list.append(
            (coro_filename, await_line_number, coro_name, line))

        # Get the next awaitable object in the chain.
        try:
            coro = coro.cr_await
        except AttributeError:
            coro = coro.gi_yieldfrom
        if not inspect.isawaitable(coro):
            break

    traceback.print_list(extracted_list, file=file)
Given this code:
async def coro_1():
    try:
        await asyncio.sleep(999)
    except asyncio.CancelledError:
        # Do something.
        pass

async def coro_2():
    await coro_1()
A task that calls coro_2, when passed to get_awaitable_stack_trace, will produce this call stack:
File "D:\Dev\Git\test.py", line 164, in agent
await coro_2()
File "D:\Dev\Git\test.py", line 148, in coro_2
await coro_1()
File "D:\Dev\Git\test.py", line 142, in coro_1
await asyncio.sleep(999)
File "C:\Python35\Lib\asyncio\tasks.py", line 508, in sleep
return (yield from future)
Hope this helps!
Suppose I have two functions that work like this:
@tornado.gen.coroutine
def f():
    for i in range(4):
        print("f", i)
        yield tornado.gen.sleep(0.5)

@tornado.gen.coroutine
def g():
    yield tornado.gen.sleep(1)
    print("Let's raise RuntimeError")
    raise RuntimeError
In general, function f might contain an endless loop and never return (e.g. it could process some queue).
What I want to do is to be able to interrupt it, at any point where it yields.
The most obvious way doesn't work: the exception is only raised after function f exits (and if it's endless, that obviously never happens).
@tornado.gen.coroutine
def main():
    try:
        yield [f(), g()]
    except Exception as e:
        print("Caught", repr(e))

    while True:
        yield tornado.gen.sleep(10)

if __name__ == "__main__":
    tornado.ioloop.IOLoop.instance().run_sync(main)
Output:
f 0
f 1
Let's raise RuntimeError
f 2
f 3
Traceback (most recent call last):
File "/tmp/test/lib/python3.4/site-packages/tornado/gen.py", line 812, in run
yielded = self.gen.send(value)
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
<...>
File "test.py", line 16, in g
raise RuntimeError
RuntimeError
That is, the exception is only raised when both of the coroutines return (both futures resolve).
This is partially solved by tornado.gen.WaitIterator, but it's buggy (unless I'm mistaken). But that's not the point.
It still doesn't solve the problem of interrupting existing coroutines: a coroutine continues to run even though the function that started it has exited.
EDIT: it seems that coroutine cancellation is not really supported in Tornado, unlike in Python's asyncio, where you can easily throw a CancelledError at every yield point.
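For reference, a short sketch of that asyncio behaviour: Task.cancel() makes the next await point inside the task raise CancelledError (the worker name is just illustrative).

import asyncio

async def worker():
    try:
        while True:
            await asyncio.sleep(1)     # cancellation is delivered at this await
    except asyncio.CancelledError:
        print('worker cancelled')
        raise                          # re-raise so the task is marked cancelled

async def main():
    task = asyncio.create_task(worker())
    await asyncio.sleep(2)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass

asyncio.run(main())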
If you use WaitIterator according to the instructions, and use a toro.Event to signal between coroutines, it works as expected:
from datetime import timedelta

import tornado.gen
import tornado.ioloop
import toro

stop = toro.Event()

@tornado.gen.coroutine
def f():
    for i in range(4):
        print("f", i)

        # wait raises Timeout if not set before the deadline.
        try:
            yield stop.wait(timedelta(seconds=0.5))
            print("f done")
            return
        except toro.Timeout:
            print("f continuing")

@tornado.gen.coroutine
def g():
    yield tornado.gen.sleep(1)
    print("Let's raise RuntimeError")
    raise RuntimeError

@tornado.gen.coroutine
def main():
    wait_iterator = tornado.gen.WaitIterator(f(), g())
    while not wait_iterator.done():
        try:
            result = yield wait_iterator.next()
        except Exception as e:
            print("Error {} from {}".format(e, wait_iterator.current_future))
            stop.set()
        else:
            print("Result {} received from {} at {}".format(
                result, wait_iterator.current_future,
                wait_iterator.current_index))

if __name__ == "__main__":
    tornado.ioloop.IOLoop.instance().run_sync(main)
For now, pip install toro to get the Event class. Tornado 4.2 will include Event, see the changelog.
Since version 5, Tornado runs on the asyncio event loop.
On Python 3, the IOLoop is always a wrapper around the asyncio event loop, and asyncio.Future and asyncio.Task are used instead of their Tornado counterparts.
Hence you can use asyncio Task cancellation, i.e. asyncio.Task.cancel.
Your example of a while-true loop reading off a queue might look like this:
import logging
from asyncio import CancelledError

from tornado import ioloop, gen

async def read_off_a_queue():
    while True:
        try:
            await gen.sleep(1)
        except CancelledError:
            logging.debug('Reader cancelled')
            break
        else:
            logging.debug('Pretend a task is consumed')

async def do_some_work():
    await gen.sleep(5)
    logging.debug('do_some_work is raising')
    raise RuntimeError

async def main():
    logging.debug('Starting queue reader in background')
    reader_task = gen.convert_yielded(read_off_a_queue())

    try:
        await do_some_work()
    except RuntimeError:
        logging.debug('do_some_work failed, cancelling reader')
        reader_task.cancel()

        # give the task a chance to clean up, in case it
        # catches CancelledError and awaits something
        try:
            await reader_task
        except CancelledError:
            pass

if __name__ == '__main__':
    logging.basicConfig(level='DEBUG')
    ioloop.IOLoop.instance().run_sync(main)
If you run it, you should see:
DEBUG:asyncio:Using selector: EpollSelector
DEBUG:root:Starting queue reader in background
DEBUG:root:Pretend a task is consumed
DEBUG:root:Pretend a task is consumed
DEBUG:root:Pretend a task is consumed
DEBUG:root:Pretend a task is consumed
DEBUG:root:do_some_work is raising
DEBUG:root:do_some_work failed, cancelling reader
DEBUG:root:Reader cancelled
Warning: This is not a working solution. Look at the commentary. Still, if you're new (as I am), this example can show the logical flow. Thanks @nathaniel-j-smith and @wgh.
What is the difference compared to using something more primitive, like a global variable, for instance?
import asyncio

event = asyncio.Event()
aflag = False

async def short():
    while not aflag:
        print('short repeat')
        await asyncio.sleep(1)
    print('short end')

async def long():
    global aflag
    print('LONG START')
    await asyncio.sleep(3)
    aflag = True
    print('LONG END')

async def main():
    await asyncio.gather(long(), short())

if __name__ == '__main__':
    asyncio.run(main())
It is for asyncio, but I guess the idea stays the same. This is a semi-question (why would Event be better?). Yet the solution yields exactly the result the author needs:
LONG START
short repeat
short repeat
short repeat
LONG END
short end
UPDATE:
These slides may be really helpful in understanding the core of the problem.
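For comparison, a minimal sketch of the same flow using asyncio.Event instead of the global flag (the done name is just illustrative); Event.wait() suspends until set() is called, so the waiting coroutine does not have to poll a flag on a timer:

import asyncio

async def short(done):
    print('short waiting')
    await done.wait()          # suspends here instead of polling a flag
    print('short end')

async def long(done):
    print('LONG START')
    await asyncio.sleep(3)
    done.set()                 # wakes up every coroutine waiting on the event
    print('LONG END')

async def main():
    done = asyncio.Event()     # created inside the running loop
    await asyncio.gather(long(done), short(done))

asyncio.run(main())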