I am using the Docker SDK, and I am trying to race a task that times out after some number of seconds against another task that waits on a Docker container to finish. In effect, I want to know if a given container finishes within the timeout I've set.
I have the following code to do it (adapted from this post):
import asyncio
import contextlib

container = ...  # create container with Docker SDK
timeout = ...    # some int
killed = None

# our tasks
async def __timeout():
    await asyncio.sleep(timeout)
    return True

async def __run():
    container.wait()
    return False

# loop and runner
wait_loop = asyncio.new_event_loop()
done, pending = wait_loop.run_until_complete(
    asyncio.wait({__run(), __timeout()}, return_when=asyncio.FIRST_COMPLETED)
)

# result extraction
for task in done:
    if killed is None:
        killed = task.result()
        # ... do something with result

# clean up
for task in pending:
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        wait_loop.run_until_complete(task)
wait_loop.close()
Unfortunately, I keep getting the following error:
File "/usr/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
return future.result()
File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
raise self._exception
File "/usr/lib/python3.5/asyncio/tasks.py", line 241, in _step
result = coro.throw(exc)
File "/usr/lib/python3.5/asyncio/tasks.py", line 347, in wait
return (yield from _wait(fs, timeout, return_when, loop))
File "/usr/lib/python3.5/asyncio/tasks.py", line 430, in _wait
yield from waiter
File "/usr/lib/python3.5/asyncio/futures.py", line 361, in __iter__
yield self # This tells Task to wait for completion.
RuntimeError: Task <Task pending coro=<wait() running at /usr/lib/python3.5/asyncio/tasks.py:347> cb=[_run_until_complete_cb() at /usr/lib/python3.5/asyncio/base_events.py:164]> got Future <Future> pending> attached to a different loop
It seems I can't race with the wait task because it belongs to a different loop. Is there any way I can get around this error so that I can determine which task finishes first?
The problem is simple: there is one default loop in every thread, which is set by asyncio.set_event_loop(loop). You can then retrieve this loop with loop = asyncio.get_event_loop().
The catch is that some packages use asyncio.get_event_loop() by default to obtain the currently running loop. Take aiohttp as an example:
class aiohttp.ClientSession(*, connector=None, loop=None, cookies=None, headers=None, skip_auto_headers=None, auth=None, json_serialize=json.dumps, version=aiohttp.HttpVersion11, cookie_jar=None, read_timeout=None, conn_timeout=None, timeout=sentinel, raise_for_status=False, connector_owner=True, auto_decompress=True, requote_redirect_url=False, trust_env=False, trace_configs=None)
As you can see, it accepts a loop parameter to specify the running loop, but you can also just leave it unset to use asyncio.get_event_loop() by default.
Your problem is that you are launching coroutines in a newly created loop, but you cannot guarantee that all of their internal operations are also using that new loop. If they call asyncio.get_event_loop(), they will be attached to a different loop: the default loop of the current thread.
In my opinion, you don't really need to create a new loop yourself; leave that choice to your callers. Just like the example above, accept a loop argument and, if it is None, fall back to the default one.
Otherwise, you need to carefully inspect your code to ensure that every coroutine involved uses the loop you create.
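As a minimal sketch of that idea (assuming, as in the question, that container.wait() is a blocking call from the synchronous Docker SDK; the helper name wait_or_timeout is made up for illustration), you can stay on the thread's default loop and push the blocking call into an executor instead of creating a second loop:
import asyncio

async def wait_or_timeout(container, timeout):
    loop = asyncio.get_event_loop()
    # run the blocking Docker SDK call in a worker thread
    wait_future = loop.run_in_executor(None, container.wait)
    try:
        await asyncio.wait_for(wait_future, timeout=timeout)
        return False  # container finished within the timeout
    except asyncio.TimeoutError:
        return True   # timed out; the caller may kill the container

killed = asyncio.get_event_loop().run_until_complete(
    wait_or_timeout(container, timeout)
)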
Related
Context
I am trying to instantiate a legacy data extractor on my dask worker using the actor pattern:
from dask.distributed import Client

client = Client()
connector = Sharepoint(CONF.sources["sharepoint"])
items = connector.enumerate_items()

# extraction
remote_extractor = client.submit(
    SharepointExtractor, CONF.sources["sharepoint"], connector, actor=True
)  # Create Extractor on a worker
extractor = remote_extractor.result()  # Get back a pointer to that object

futures = client.map(
    extractor.job,
    [i for i in items],
    retries=5,
    pure=False,
)
_ = await client.gather(futures)
The first thing SharepointExtractor does is get an HTTP session from its connector:
class SharepointExtractor:
    def __init__(
        self, conf: ConfigTree, connector: Sharepoint, *args, **kwargs
    ) -> None:
        self.conf = conf
        self.session = connector.session_factory()
.session_factory() basically returns an aiohttp.client.ClientSession enriched with an OAuth token (which motivates the choice for an actor).
The problem
At one point, ClientSession's constructor calls asyncio.get_event_loop(), which does not seem to be available in the worker:
...
File "/home/zar3bski/.cache/pypoetry/virtualenvs/poc-dask-iG-N0GH5-py3.10/lib/python3.10/site-packages/eteel/connectors/rest.py", line 96, in session_factory
connector=TCPConnector(limit=30),
File "/home/zar3bski/.cache/pypoetry/virtualenvs/poc-dask-iG-N0GH5-py3.10/lib/python3.10/site-packages/aiohttp/connector.py", line 767, in __init__
super().__init__(
File "/home/zar3bski/.cache/pypoetry/virtualenvs/poc-dask-iG-N0GH5-py3.10/lib/python3.10/site-packages/aiohttp/connector.py", line 234, in __init__
loop = get_running_loop(loop)
File "/home/zar3bski/.cache/pypoetry/virtualenvs/poc-dask-iG-N0GH5-py3.10/lib/python3.10/site-packages/aiohttp/helpers.py", line 287, in get_running_loop
loop = asyncio.get_event_loop()
File "/usr/lib/python3.10/asyncio/events.py", line 656, in get_event_loop
raise RuntimeError('There is no current event loop in thread %r.'
RuntimeError: There is no current event loop in thread 'Dask-Default-Threads-484036-0'.
Since I am in a dev/local context, from what I understand, I end up with a LocalCluster.
Going async
I naively thought that going async would automagically make an event loop available in the workers.
client = await Client(asynchronous=True)
connector = Sharepoint(CONF.sources["sharepoint"])
items = connector.enumerate_items()
# extraction
remote_extractor = await client.submit(
    SharepointExtractor, CONF.sources["sharepoint"], connector, actor=True
)  # Create Extractor on a worker
extractor = await remote_extractor  # Get back a pointer to that object
But the same error occurs.
Setting an event loop explicitly
loop = asyncio.new_event_loop()
client = await Client(
    asynchronous=True, loop=loop
)
This time, the error is slightly more enigmatic:
....
File "/home/zar3bski/.cache/pypoetry/virtualenvs/poc-dask-iG-N0GH5-py3.10/lib/python3.10/site-packages/distributed/client.py", line 923, in __init__
self._loop_runner = LoopRunner(loop=loop, asynchronous=asynchronous)
File "/home/zar3bski/.cache/pypoetry/virtualenvs/poc-dask-iG-N0GH5-py3.10/lib/python3.10/site-packages/distributed/utils.py", line 451, in __init__
if not loop.asyncio_loop.is_running():
AttributeError: '_UnixSelectorEventLoop' object has no attribute 'asyncio_loop'
(I am not sure what this constructor expects loop to be.)
Do you have examples of dask actors involving resources from aiohttp (or any other async lib)? How should I set up dask workers to get an event loop available to my actors?
Edit
Following mdurant's approach (a singleton-style extractor defined in an importable module):
extractor = [None]

def get_extractor(CONF):
    if extractor[0] is None:
        connector = Sharepoint(CONF.sources["sharepoint"])
        extractor[0] = SharepointBis(CONF.sources["sharepoint"], connector)
    return extractor[0]

def workload(CONF, item):
    extractor = get_extractor(CONF)
    return extractor.job(item)
def main():
    client = Client()
    connector = Sharepoint(CONF.sources["sharepoint"])
    items = connector.enumerate_items()
    futures = client.map(
        workload,
        [CONF for _ in range(len(items))],
        [i for i in items],
        retries=5,
        pure=False,
    )
    _ = client.gather(futures)
I still get
2022-12-01 10:05:54,923 - distributed.worker - WARNING - Compute Failed
Key: workload-ffcf0f1a-8aee-41d1-9ad2-f7eea91fa107-41
Function: workload
args: (<eteel.conf.ConfGenerator object at 0x7fae8040d4e0>, 'firex1.sharepoint.com,930e9ef8-6bdf-4484-9883-6aa9965c548f,aed0d0bd-a659-4dbf-bbaa-a56f4efa3b0c')
kwargs: {}
Exception: 'RuntimeError("There is no current event loop in thread \'Dask-Default-Threads-166860-1\'.")'
The same goes with Client(asynchronous=True), which drives me back to my question: how can I have an event loop in a Dask thread? I have a strong intuition that this has something to do with Client(asynchronous=True, loop={this parameter}).
OK, I think there is some confusion going on in this question, so I will do my best to clarify the situation. There are three main points:
some things cannot be serialised between processes easily or at all
some objects are expensive to create per process, and it would be nice to only do it once
the work must happen in an async context
Here is how I would do it. Put this in an importable module.
extractor = [None]

def get_extractor(CONF):
    if extractor[0] is None:
        connector = Sharepoint(CONF.sources["sharepoint"])
        extractor[0] = SharepointExtractor(CONF.sources["sharepoint"], connector)
    return extractor[0]

async def workload(CONF, item):
    extractor = get_extractor(CONF)
    return await extractor.job(item, retries=5)

if __name__ == "__main__":  # or run this elsewhere
    client = ...
    items = ...
    futures = client.map(workload, items)
    output = client.gather(futures)
I do not know from the OP which parts of the workload are coroutines (I am guessing the .job method), but it should be obvious what I am doing. Note that the original code would not have worked in a simple non-dask session either, and it is always best to start from something that works before trying to daskify it.
On async in dask:
client.map/submit supports coroutine functions, and they will be executed on the same event loop as the main worker. That's all you need here. All the distributed components (worker, scheduler, client) are async, server-like implementations with event loops, but execution of worker code does not normally happen in the same thread as the one running that server.
Client(asynchronous=True) implies that the client is to be constructed and operated on only from within coroutines, and that the client's event loop is in the current thread. This is probably not what you want, unless you know what you are doing.
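For completeness, a minimal sketch of the asynchronous-client mode (the function names here are illustrative, not taken from the OP's code): the client is created and awaited entirely inside a coroutine, so its event loop is the one running in the current thread.
import asyncio
from dask.distributed import Client

def inc(x):
    return x + 1

async def main():
    # the client's event loop is the loop running this coroutine
    client = await Client(asynchronous=True)
    futures = client.map(inc, range(4))
    results = await client.gather(futures)
    await client.close()
    return results

print(asyncio.run(main()))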
Is it possible to get a ThreadPoolExecutor to wait for all its futures and their add_done_callback() functions to complete without having to call .shutdown(wait=True)? The following code snippet illustrates the essence of what I'm trying to accomplish, which is to reuse the thread pool between iterations of the outer loop.
from concurrent.futures import ThreadPoolExecutor, wait
import time

def proc_func(n):
    return n + 1

def create_callback_func(fid, sleep_time):
    def callback(future):
        time.sleep(sleep_time)
        fid.write(str(future.result()))
        return
    return callback

num_workers = 4
num_files_write = 3
num_tasks = 8
sleep_time = 1

pool = ThreadPoolExecutor(max_workers=num_workers)
for n in range(num_files_write):
    fid = open(f'test{n}.txt', 'w')
    futs = []
    callback_func = create_callback_func(fid, sleep_time)
    for t in range(num_tasks):
        fut = pool.submit(proc_func, n)
        fut.add_done_callback(callback_func)
        futs.append(fut)
    wait(futs)
    fid.close()
pool.shutdown(wait=True)
Running this code throws a bunch of ValueError: I/O operation on closed file errors, and the three files that get written have the following contents:
test0.txt -> 1111
test1.txt -> 2222
test2.txt -> 3333
Clearly this is wrong and there should be eight of each numeral. If I create and shutdown a separate ThreadPoolExecutor for each file, then the correct result is achieved. So I know that the Executor has the ability to properly wait for all the callbacks to finish, but can I tell it to do so without shutting it down?
I'm afraid that cannot be done and you are "misusing" the callback.
The primary purpose of the callback is to notify that the scheduled work has been done.
The internal future states are PENDING -> RUNNING -> FINISHED (disregarding cancellations for brevity). When the FINISHED state is reached, the callbacks are invoked, but there is no next state when they finish. That's why it is not possible to synchronize with that event.
The core of the execution of a submitted function in one of the available threads is (simplified):
try:
    result = self.fn(*self.args, **self.kwargs)
except BaseException as exc:
    self.future.set_exception(exc)
else:
    self.future.set_result(result)
where both set_exception and set_result look like this (very simplified):
... save the result/exception
self._state = FINISHED
... wakeup all waiters
self._invoke_callbacks() # this is the last statement
The future is in FINISHED, i.e. "done" state when the "done" callback is called. It would not make sense to notify that the work is done before marking it done.
As you noticed already, in your code:
wait(futs)
fid.close()
the wait returns, the file gets closed, but the callback has not finished yet and fails attempting to write to a closed file.
The second question is why shutdown(wait=True) works: simply because it waits for all threads:
if wait:
    for t in self._threads:
        t.join()
Those threads also execute the callbacks (see the code snippets above). That's why all callbacks must have finished once the threads have finished.
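If the goal is simply to reuse the pool across iterations of the outer loop, one workaround (a sketch only, and the helper name proc_and_write is invented for this example) is to move the "callback" work into the submitted function itself, so that wait() also covers the file writes:
from concurrent.futures import ThreadPoolExecutor, wait
import time

def proc_func(n):
    return n + 1

def proc_and_write(fid, sleep_time, n):
    # do the work and the post-processing in the same task,
    # so wait() does not return until the file has been written to
    result = proc_func(n)
    time.sleep(sleep_time)
    fid.write(str(result))
    return result

num_workers = 4
num_files_write = 3
num_tasks = 8
sleep_time = 1

pool = ThreadPoolExecutor(max_workers=num_workers)
for n in range(num_files_write):
    with open(f'test{n}.txt', 'w') as fid:
        futs = [pool.submit(proc_and_write, fid, sleep_time, n)
                for _ in range(num_tasks)]
        wait(futs)  # all writes are done before the file is closed
pool.shutdown(wait=True)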
When I run this code in Python 3.7:
import asyncio

sem = asyncio.Semaphore(2)

async def work():
    async with sem:
        print('working')
        await asyncio.sleep(1)

async def main():
    await asyncio.gather(work(), work(), work())

asyncio.run(main())
It fails with RuntimeError:
$ python3 demo.py
working
working
Traceback (most recent call last):
File "demo.py", line 13, in <module>
asyncio.run(main())
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
return future.result()
File "demo.py", line 11, in main
await asyncio.gather(work(), work(), work())
File "demo.py", line 6, in work
async with sem:
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/locks.py", line 92, in __aenter__
await self.acquire()
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/locks.py", line 474, in acquire
await fut
RuntimeError: Task <Task pending coro=<work() running at demo.py:6> cb=[gather.<locals>._done_callback() at /opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/tasks.py:664]> got Future <Future pending> attached to a different loop
Python 3.10+: This error message should not occur anymore; see the answer from mmdanziger:
(...) the implementation of Semaphore has been changed and no longer grabs the current loop on init
Python 3.9 and older:
It's because the Semaphore constructor sets its _loop attribute – in asyncio/locks.py:
class Semaphore(_ContextManagerMixin):
    def __init__(self, value=1, *, loop=None):
        if value < 0:
            raise ValueError("Semaphore initial value must be >= 0")
        self._value = value
        self._waiters = collections.deque()
        if loop is not None:
            self._loop = loop
        else:
            self._loop = events.get_event_loop()
But asyncio.run() starts a completely new loop – in asyncio/runners.py; it's also mentioned in the documentation:
def run(main, *, debug=False):
    if events._get_running_loop() is not None:
        raise RuntimeError(
            "asyncio.run() cannot be called from a running event loop")
    if not coroutines.iscoroutine(main):
        raise ValueError("a coroutine was expected, got {!r}".format(main))
    loop = events.new_event_loop()
    ...
A Semaphore instantiated outside of asyncio.run() grabs the asyncio "default" loop and so cannot be used with the event loop created by asyncio.run().
Solution
Instantiate the Semaphore from code called by asyncio.run(). You will have to pass it to the right places; there are several ways to do that (you could, for example, use contextvars), but here is the simplest example:
import asyncio

async def work(sem):
    async with sem:
        print('working')
        await asyncio.sleep(1)

async def main():
    sem = asyncio.Semaphore(2)
    await asyncio.gather(work(sem), work(sem), work(sem))

asyncio.run(main())
The same issue (and solution) probably also applies to asyncio.Lock, asyncio.Event, and asyncio.Condition.
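For instance, a minimal sketch with asyncio.Lock under the same rule (create it inside the coroutine that asyncio.run() drives, then pass it to whatever needs it):
import asyncio

async def work(lock, i):
    async with lock:
        print(f'working {i}')
        await asyncio.sleep(1)

async def main():
    lock = asyncio.Lock()  # created while the loop is running
    await asyncio.gather(*(work(lock, i) for i in range(3)))

asyncio.run(main())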
Update: As of Python 3.10 OP's code will run as written. This is because the implementation of Semaphore has been changed and no longer grabs the current loop on init. See this answer for more discussion.
Python 3.10 implementation from GitHub
class Semaphore(_ContextManagerMixin, mixins._LoopBoundMixin):
    """A Semaphore implementation.

    A semaphore manages an internal counter which is decremented by each
    acquire() call and incremented by each release() call. The counter
    can never go below zero; when acquire() finds that it is zero, it blocks,
    waiting until some other thread calls release().

    Semaphores also support the context management protocol.

    The optional argument gives the initial value for the internal
    counter; it defaults to 1. If the value given is less than 0,
    ValueError is raised.
    """

    def __init__(self, value=1, *, loop=mixins._marker):
        super().__init__(loop=loop)
        if value < 0:
            raise ValueError("Semaphore initial value must be >= 0")
        self._value = value
        self._waiters = collections.deque()
        self._wakeup_scheduled = False
An alternative solution for Python 3.9 and older is to instantiate the Event, Lock, Semaphore, etc. as a first step inside the main() task, where possible.
I validated this with an Event case tested on Python 3.10 (Windows) vs Python 3.9 (Raspberry Pi).
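A minimal sketch of that pattern with an Event (illustrative only, not the exact code used in that validation): the Event is created as the first step inside main(), so it is bound to the loop that asyncio.run() creates.
import asyncio

async def waiter(event):
    await event.wait()
    print('event was set')

async def main():
    event = asyncio.Event()  # created inside the running loop
    task = asyncio.create_task(waiter(event))
    await asyncio.sleep(1)
    event.set()
    await task

asyncio.run(main())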
I have two processes; a main process and a subprocess. The main process is running an asyncio event loop, and starts the subprocess. I want to start another asyncio event loop in the subprocess. I'm using the aioprocessing module to launch the subprocess.
The subprocess function is:
def subprocess_code():
    loop = asyncio.get_event_loop()

    @asyncio.coroutine
    def f():
        for i in range(10):
            print(i)
            yield from asyncio.sleep(1)

    loop.run_until_complete(f())
But I get an error:
loop.run_until_complete(f())
File "/usr/lib/python3.4/asyncio/base_events.py", line 271, in run_until_complete
self.run_forever()
File "/usr/lib/python3.4/asyncio/base_events.py", line 239, in run_forever
raise RuntimeError('Event loop is running.')
RuntimeError: Event loop is running.
Is it possible to start a new, or restart the existing, asyncio event loop in the subprocess? If so, how?
Sorry for the disturbance!
I found a solution!
policy = asyncio.get_event_loop_policy()
policy.set_event_loop(policy.new_event_loop())
loop = asyncio.get_event_loop()
Put this code at the start of the subprocess to create a new asyncio event loop inside a subprocess that was started from a process already running an asyncio event loop.
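Applied to the subprocess function from the question, that looks roughly like this (a sketch; the decorator/yield from style matches the Python 3.4-era code above):
import asyncio

def subprocess_code():
    # install a fresh event loop for this subprocess instead of reusing
    # the already-running loop inherited from the parent process
    policy = asyncio.get_event_loop_policy()
    policy.set_event_loop(policy.new_event_loop())
    loop = asyncio.get_event_loop()

    @asyncio.coroutine
    def f():
        for i in range(10):
            print(i)
            yield from asyncio.sleep(1)

    loop.run_until_complete(f())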
Exploring Python 3.4.0's asyncio module, I am attempting to create a class with asyncio.coroutine methods that are called from an event_loop outside of the class.
My working code is below.
import asyncio

class Foo():
    @asyncio.coroutine
    def async_sleep(self):
        print('about to async sleep')
        yield from asyncio.sleep(1)

    @asyncio.coroutine
    def call_as(self):
        print('about to call ass')
        yield from self.async_sleep()

    def run_loop(self):
        loop = asyncio.get_event_loop()
        loop.run_until_complete(self.call_as())
        print('done with loop')
        loop.close()

a = Foo()
a.run_loop()

loop = asyncio.get_event_loop()
loop.run_until_complete(a.call_as())
The call to a.run_loop() provides output as expected:
python3 async_class.py
about to call ass
about to async sleep
done with loop
However, as soon as the event loop attempts to process a.call_as(), I get the following traceback:
Traceback (most recent call last):
File "async_class.py", line 26, in <module>
doop.run_until_complete(asyncio.async(a.call_ass()))
File "/opt/boxen/homebrew/opt/pyenv/versions/3.4.0/lib/python3.4/asyncio/base_events.py", line 203, in run_until_complete
self.run_forever()
File "/opt/boxen/homebrew/opt/pyenv/versions/3.4.0/lib/python3.4/asyncio/base_events.py", line 184, in run_forever
self._run_once()
File "/opt/boxen/homebrew/opt/pyenv/versions/3.4.0/lib/python3.4/asyncio/base_events.py", line 778, in _run_once
event_list = self._selector.select(timeout)
AttributeError: 'NoneType' object has no attribute 'select'
I have attempted wrapping a.call_as() in asyncio.Task() and asyncio.async(), and the failure is the same.
As it turns out, the issue was with the context of the event loop.
asyncio magically creates an event loop for a thread at runtime. This event loop's context is set when .get_event_loop() is called.
In the above example, a.run_loop sets the event loop inside the context of Foo.run_loop.
One kicker of event loops is that there may only be one event loop per thread, and a given event loop can only process events in its context.
With that in mind, note that the loop = asyncio.get_event_loop() just after a.run_loop is asking to assign the thread's event loop to the __main__ context. Unfortunately, the event loop was already set to the context of Foo.run_loop, so a None type is set for the __main__ event loop.
Instead, it is necessary to create a new event loop and then set that event loop's context to __main__, i.e.
new_loop = asyncio.new_event_loop()
asyncio.set_event_loop(new_loop)
Only then will an event loop be properly set in the context of __main__, allowing for the proper execution of our now-modified new_loop.run_until_complete(a.call_as()).
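In the context of the example above, the fixed module-level code would look roughly like this (a sketch based on that suggestion):
a = Foo()
a.run_loop()  # uses and then closes the original event loop

# give __main__ a fresh event loop before calling run_until_complete again
new_loop = asyncio.new_event_loop()
asyncio.set_event_loop(new_loop)
new_loop.run_until_complete(a.call_as())
new_loop.close()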
It is because you close the event loop at the end of Foo.run_loop().
From the BaseEventLoop.close documentation:
Close the event loop. The loop must not be running. Pending callbacks will be lost.
This clears the queues and shuts down the executor, but does not wait for the executor to finish.
This is idempotent and irreversible. No other methods should be called after this one.
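Under this explanation, an alternative fix (a sketch, not the only option) is to remove the loop.close() call from Foo.run_loop() and close the loop only once, when you are done with it for good:
a = Foo()          # Foo defined as in the question, but without
a.run_loop()       # loop.close() at the end of run_loop()

loop = asyncio.get_event_loop()
loop.run_until_complete(a.call_as())
loop.close()       # close the loop only now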