Why is asyncio.Future incompatible with concurrent.futures.Future?

Why is asyncio.Future incompatible with concurrent.futures.Future? - python

The two classes represent excellent abstractions for concurrent programming, so it's a bit disconcerting that they don't support the same API.
Specifically, according to the docs:
asyncio.Future is almost compatible with concurrent.futures.Future.
Differences:
result() and exception() do not take a timeout argument and raise an exception when the future isn’t done yet.
Callbacks registered with add_done_callback() are always called via the event loop's call_soon_threadsafe().
This class is not compatible with the wait() and as_completed() functions in the concurrent.futures package.
The above list is actually incomplete, there are a couple more differences:
running() method is absent
result() and exception() may raise InvalidStateError if called too early
Are any of these due to the inherent nature of an event loop that makes these operations either useless or too troublesome to implement?
And what is the meaning of the difference related to add_done_callback()? Either way, the callback is guaranteed to happen at some unspecified time after the futures is done, so isn't it perfectly consistent between the two classes?

The core reason for the difference is in how threads (and processes) handle blocks vs how coroutines handle events that block. In threading, the current thread is suspended until whatever condition resolves and the thread can go forward. For example in the case of the futures, if you request the result of a future, it's fine to suspend the current thread until that result is available.
However the concurrency model of an event loop is that rather than suspending code, you return to the event loop and get called again when ready. So it is an error to request the result of an asyncio future that doesn't have a result ready.
You might think that the asyncio future could just wait and while that would be inefficient, would it really be all that bad for your coroutine to block? It turns out though that having the coroutine block is very likely to mean that the future never completes. It is very likely that the future's result will be set by some code associated with the event loop running the code that requests the result. If the thread running that event loop blocks, no code associated with the event loop would run. So blocking on the result would deadlock and prevent the result from being produced.
So, yes, the differences in interface are due to this inherent difference. As an example, you wouldn't want to use an asyncio future with the concurrent.futures waiter abstraction because again that would block the event loop thread.
The add_done_callbacks difference guarantees that callbacks will be run in the event loop. That's desirable because they will get the event loop's thread local data. Also, a lot of coroutine code assumes that it will never be run at the same time as other code from the same event loop. That is, coroutines are only thread safe under the assumption that two coroutines from the same event loop do not run at the same time. Running the callbacks in the event loop avoids a lot of thread safety issues and makes it easier to write correct code.

concurrent.futures.Future provides a way to share results between different threads and processes usually when you use Executor.
asyncio.Future solves same task but for coroutines, that are actually some special sort of functions running usually in one process/thread asynchronously. "Asynchronously" in current context means that event loop manages code executing flow of this coroutines: it may suspend execution inside one coroutine, start executing another coroutine and later return to executing first one - everything usually in one thread/process.
These objects (and many other threading/asyncio objects like Lock, Event, Semaphore etc.) look similar because the idea of concurrency in your code with threads/processes and coroutines is similar.
I think the main reason objects are different is historical: asyncio was created much later then threading and concurrent.futures. It's probably impossible to change concurrent.futures.Future to work with asyncio without breaking class API.
Should both classes be one in "ideal world"? This is probably debatable issue, but I see many disadvantages of that: while asyncio and threading look similar at first glance, they're very different in many ways, including internal implementation or way of writing asyncio/non-asyncio code (see async/await keywords).
I think it's probably for the best that classes are different: we clearly split different by nature ways of concurrency (even if their similarity looks strange at first).

Related

replacing asyncio with concurent.futures

asyncio is causing issues on my spyder IDE => would like to replace it with concurent.futures library
how can I replace the below code relying only on concurent.futures library
asyncio.get_event_loop().run_until_complete(api(message))
exact function looks as follows
def async_loop(api, message):
return asyncio.get_event_loop().run_until_complete(api(message))

As written, you're starting up the event loop only until a particular task completes (which may or may not launch or wait on other tasks), and blocking until it completes. The only reason it's a task is because it needs to use async functions, those can only run in an event loop, and while running, they may launch other tasks or wait on other awaitables, and while waiting, the event loop can do other tasks.
In short, if not for the need to be an async task running in a non-async context, this would just be:
def async_loop(api, message):
return api(message)
which calls api and waits for it to complete.
Really, that's it. If the things api does or calls need to run some tasks asynchronously, without blocking on them immediately, you'd have some global executor, e.g.
executor = concurrent.Futures.ThreadPoolExecutor()
which would be used to launch tasks with:
fut = executor.submit(callable, 'arg1', 'arg2', kwarg1='somevalue')
and, when the result of the task is needed, someone would call:
value = fut.result()
on it (which would block if it wasn't done yet, return the result if it completed without an exception, or raise the exception it died with if it died with an exception).
Whenever you no longer need the executor, you just call .shutdown() on it and it will wait for all outstanding tasks to complete. That's it.
As a side-note, the error you're experiencing is part of why they've deprecated get_event_loop() in 3.10 (and discouraged it since 3.7). In all likelihood, the simplest solution to your problem (avoiding a switch to threads, because all that means is you've got new problems) is to use the much simpler high-level API, asyncio.run (introduced in 3.7), which creates an event loop, runs the task in it to completion, does reasonable cleanup, then returns the result:
def async_loop(api, message):
return asyncio.run(api(message))
There's also the asyncio.get_running_loop function (that is the exact replacement for get_event_loop) which you use when an event loop already exists (which you should typically be aware of; event loops don't pop into existence in given thread on their own, so you should know if you launched one; in this case you hadn't, so asyncio.run is the correct one to use).

Correctness of modified consumer/producer

I am creating a Sound class to play notes and would like feedback on the correctness and conciseness of my design. This class differs from the typical consumer/producer in two ways:
The consumer should respond to events, such as to shut down the thread, or otherwise continue forever. The typical consumer/producer exits when the queue is empty. For example, a thread waiting in queue.get cannot handle additional notifications.
Each set of notes submitted by the producer should overwrite any unprocessed notes remaining on the queue.
Originally I had the consumer process one note at a time using the queue module. I found continually acquiring and releasing the lock without any competition to be inefficient, and as previously noted, queue.get prevents waiting on additional events. So instead of building upon that, I rewrote it into:
import threading
queue = []
condition = threading.Condition()
interrupt = threading.Event()
stop = threading.Event()
def producer():
while some_condition:
ns = get_notes() # [(float,float)]
with condition:
queue.clear()
queue.append(ns)
interrupt.set()
condition.notify()
with condition:
stop.set()
condition.notify()
consumer.join()
def consumer():
while not stop.is_set():
with condition:
while not (queue or stop.is_set()):
condition.wait()
if stop.is_set():
break
interrupt.clear()
ns = queue.pop()
ss = gen_samples(ns) # iterator/fast
for b in grouper(ss, size/2):
if interrupt.is_set() or stop.is_set()
break
stream.write(b)
thread = threading.Thread(target=consumer)
thread.start()
producer()
My questions are as follows:
Is this thread-safe? I want to specifically point out my use of is_set without locks or synchronization (in the for-loop).
Can the events be replaced with boolean variables? I believe so as conflicting writes in both threads (data race) are guarded by the condition variable. There is a race condition between setting and checking events but I do not believe it affects program flow.
Is there a more efficient approach/algorithm utilizing different synchronization primitives from the threading module?
edit: Found and fixed a possible deadlock described in Why does Python threading.Condition() notify() require a lock?

Analyzing thread-safety in Python can take into account the Global Interpreter Lock (GIL): no two threads will execute Python code simultaneously. Assignments to variables or object fields are effectively atomic (there are no half-assigned variables) and changes propagate effectively immediately to other threads.
This means that your use of Event.is_set() is already equivalent to using plain booleans. An event is a bool guarded by a Condition. The is_set() method checks the boolean directly. The set() method acquires the Condition, sets the boolean, and notifies all waiting threads. The wait() methods waits until the set() method is invoked. The clear() method acquires the Condition and unsets the boolean. Since you never wait() for any Event, and setting the boolean is atomic, the Condition in the Event is effectively unused.
This might get rid of a couple of locks, but isn't really a huge efficiency win. A Condition is still an abstraction over a lock, but the built-in Queue type uses locks directly. Thus, I would assume that the built-in queue is no less performant than your solution, even for a single consumer.
Your main issue with the built-in queue is that “continually acquiring and releasing the lock without any competition [is] inefficient”. This is wrong on two counts:
Due to Python's GIL, there is little competition in either case.
Acquiring uncontested locks is very efficient.
So while your solution is probably sufficiently correct (I can see no opportunity for deadlock) it is unlikely to be particularly efficient. (There are just some small mistakes, like using stop instead of stop.is_set() and some syntax errors.)
If you are seeing poor performance with Python threads that's probably because of CPython, not because of the Queue type. I already mentioned that only one thread can run at a time due to the GIL. If multiple threads want to run, they must be scheduled by the operating system to do so and acquire the GIL. Each thread will wait for 5ms before asking the running thread to give up the GIL (in a manner quite similar to your interrupt flag). And then the thread can do useful work like acquiring a lock for a critical section that must not be interrupted by other threads.
Possibly, the solution could be to avoid CPython's threads.
If you have multiple CPU-bound tasks, you must use multiple processes. CPython's threads will not run in parallel. However, communication between processes is more expensive.
Consider whether you can combine the producer+consumer directly, possibly using features such as generators.
For an easier time with juggling multiple tasks in the same thread, consider using async/await. Event loops are provided by the asyncio module. This is just as fast as Python's threads, with the caveat that tasks don't pre-empt (interrupt) each other. But this can be advantage: since a task can only be suspended at an await, you don't need most locks and it is easier to reason about correctness of the code. The downside is that async/await might have even higher latency than using threads.
Python has a concept of “executors” that make it easy and efficient to run tasks in separate threads (for I/O-bound tasks) or separate processes (for CPU-bound tasks).
For communicating between multiple processes, use the types from the multiprocessing module (e.g. Queue, Connection, or Value).

Using async-await over Thread + Queue or ThreadPoolExecutor?

I've never used the async-await syntax but I do often need to make HTTP/S requests and parse responses while awaiting future responses. To accomplish this task, I currently use the ThreadPoolExecutor class which execute the calls asynchronously anyways; effectively I'm achieving (I believe) the same result I would get with more lines of code to use async-await.
Operating under the assumption that my current implementations work asynchronously, I am wondering how the async-await implementation would differ from that of my original one which used Threads and a Queue to manage workers; it also used a Semaphore to limit workers.
That implementation was devised under the following conditions:
There may be any number of requests
Total number of active requests may be 4
Only send next request when a response is received
The basic flow of the implementation was as follows:
Generate container of requests
Create a ListeningQueue
For each request create a Thread and pass the URL, ListeningQueue and Semaphore
Each Thread attempts to acquire the Semaphore (limited to 4 Threads)
Main Thread continues in a while checking ListeningQueue
When a Thread receives a response, place in ListeningQueue and release Semaphore
A waiting Thread acquires Semaphore (process repeats)
Main Thread processes responses until count equals number of requests
Because I need to limit the number of active Threads I use a Semaphore, and if I were to try this using async-await I would have to devise some logic in the Main Thread or in the async def that prevents a request from being sent if the limit has been reached. Apart from that constraint, I don't see where using async-await would be any more useful. Is it that it lowers overhead and race condition chances by eliminating Threads? Is that the main benefit? If so, even though using a ThreadPoolExecutor is making asynchronous calls it is using a pool of Threads, thus making async-await a better option?

Operating under the assumption that my current implementations work asynchronously, I am wondering how the async-await implementation would differ from that of my original one which used Threads and a Queue to manage workers
It would not be hard to implement very similar logic using asyncio and async-await, which has its own version of semaphore that is used in much the same way. See answers to this question for examples of limiting the number of parallel requests with a fixed number of tasks or by using a semaphore.
As for advantages of asyncio over equivalent code using threads, there are several:
Everything runs in a single thread regardless of the number of active connections. Your program can scale to a large number of concurrent tasks without swamping the OS with an unreasonable number of threads or the downloads having to wait for a free slot in the thread pool before they even start.
As you pointed out, single-threaded execution is less susceptible to race conditions because the points where a task switch can occur are clearly marked with await, and everything in-between is effectively atomic. The advantage of this is less obvious in small threaded programs where the executor just hands tasks to threads in a fire-and-collect fashion, but as the logic grows more complex and the threads begin to share more state (e.g. due to caching or some synchronization logic), this becomes more pronounced.
async/await allows you to easily create additional independent tasks for things like monitoring, logging and cleanup. When using threads, those do not fit the executor model and require additional threads, always with a design smell that suggests threads are being abused. With asyncio, each task can be as if it were running in its own thread, and use await to wait for something to happen (and yield control to others) - e.g. a timer-based monitoring task would consist of a loop that awaits asyncio.sleep(), but the logic could be arbitrarily complex. Despite the code looking sequential, each task is lightweight and carries no more weight to the OS than that of a small allocated object.
async/await supports reliable cancellation, which threads never did and likely never will. This is often overlooked, but in asyncio it is perfectly possible to cancel a running task, which causes it to wake up from await with an exception that terminates it. Cancellation makes it straightforward to implement timeouts, task groups, and other patterns that are impossible or a huge chore when using threads.
On the flip side, the disadvantage of async/await is that all your code must be async. Among other things, it means that you cannot use libraries like requests, you have to switch to asyncio-aware alternatives like aiohttp.

Does the lock in asyncio.Condition have other purpose besides compatibility with threading.Condition?

I'd like to ask about asyncio.Condition. I'm not familiar with the concept, but I know and understand locks, semaphores, and queues since my student years.
I could not find a good explanation or typical use cases, just this example. I looked at the source. The core fnctionality is achieved with a FIFO of futures. Each waiting coroutine adds a new future and awaits it. Another coroutine may call notify() which sets the result of one or optionally more futures from the FIFO and that wakes up the same number of waiting coroutines. Really simple up to this point.
However, the implementation and the usage is more complicated than this. A waiting coroutine must first acquire a lock associated with the condition in order to be able to wait (and the wait() releases it while waiting). Also the notifier must acquire a lock to be able to notify(). This leads to with statement before each operation:
async with condition:
# condition operation (wait or notify)
or else a RuntimeError occurrs.
I do not understand the point of having this lock. What resource do we need to protect with the lock? In asyncio there could be always only one coroutine executing in the event loop, there are no "critical sections" as known from threading.
Is this lock really needed (why?) or is it for compatibility with threading code only?
My first idea was it is for the compatibility, but in such case why didn't they remove the lock while preserving the usage? i.e. making
async with condition:
basically an optional no-op.

The answer for this is essentially the same as for threading.Condition vs threading.Event; a condition without a lock is an event, not a condition(*).
Conditions are used to signal that a resource is available. Whomever was waiting for the condition, can use that resource until they are done with it. To ensure that no-one else can use the resource, you need to lock the resource:
resource = get_some_resource()
async with resource.condition:
await resource.condition.wait()
# this resource is mine, no-one will touch it
await resource.do_something_async()
# lock released, resource is available again for the next user
Note how the lock is not released after wait() resumes! Until the lock is released, no other co-routine waiting for the same condition can proceed, access to the resource is made exclusive by virtue of the lock. Note that the lock is released while waiting, so other coroutines can add themselves to the queue, but for wait() to finally return the lock must first be re-acquired.
If you don't need to coordinate access to a shared resource, use an event; a condition is basically a lock and event combined into one primitive, avoiding common implementation pitfalls.
Note that multiple conditions can share locks. This would let you signal specific stages, and other coroutines can wait for that specific stage to arrive. The shared lock would coordinate access to a single resource, but different conditions are signalled when each stage is initiated.
For threading, the typical use-case for conditions offered is that of a single producer, and multiple consumers all waiting on items from the producer to process. The work queue is the shared resource, the producer acquires the condition lock to push an item into the queue and then call notify(), at which point the next consumer waiting on the condition is given the lock (as it returns from wait()) and can remove the item from the queue to work on. This doesn't quite translate to a coroutine-based application, as coroutines don't have the sitting-idle-waiting-for-work-to-be-done problems threading systems have, it's much easier to just spin up consumer co-routines as needed (with perhaps a semaphore to impose a ceiling).
Perhaps a better example is the aioimaplib library, which supports IMAP4 transactions in full. These transactions are asynchronous, but you need to have access to the shared connection resource. So the library uses a single Condition object and wait_for() to wait for a specific state to arrive and thus give exclusive connection access to the coroutine waiting for that transaction state.
(*): Events have a different use-case from conditions, and thus behave a little different from a condition without locking. Once set, an event needs to be cleared explicitly, while a condition 'auto-clears' when used, and is never 'set' when no-one is waiting on the condition. But if you want to signal between tasks and don't need to control access to a shared resource, then you probably wanted an event.

Is this multi-threaded function asynchronous

I'm afraid I'm still a bit confused (despite checking other threads) whether:
all asynchronous code is multi-threaded
all multi-threaded functions are asynchronous
My initial guess is no to both and that proper asynchronous code should be able to run in one thread - however it can be improved by adding threads for example like so:
So I constructed this toy example:
from threading import *
from queue import Queue
import time
def do_something_with_io_lag(in_work):
out = in_work
# Imagine we do some work that involves sending
# something over the internet and processing the output
# once it arrives
time.sleep(0.5) # simulate IO lag
print("Hello, bee number: ",
str(current_thread().name).replace("Thread-",""))
class WorkerBee(Thread):
def __init__(self, q):
Thread.__init__(self)
self.q = q
def run(self):
while True:
# Get some work from the queue
work_todo = self.q.get()
# This function will simiulate I/O lag
do_something_with_io_lag(work_todo)
# Remove task from the queue
self.q.task_done()
if __name__ == '__main__':
def time_me(nmbr):
number_of_worker_bees = nmbr
worktodo = ['some input for work'] * 50
# Create a queue
q = Queue()
# Fill with work
[q.put(onework) for onework in worktodo]
# Launch processes
for _ in range(number_of_worker_bees):
t = WorkerBee(q)
t.start()
# Block until queue is empty
q.join()
# Run this code in serial mode (just one worker)
%time time_me(nmbr=1)
# Wall time: 25 s
# Basically 50 requests * 0.5 seconds IO lag
# For me everything gets processed by bee number: 59
# Run this code using multi-tasking (launch 50 workers)
%time time_me(nmbr=50)
# Wall time: 507 ms
# Basically the 0.5 second IO lag + 0.07 seconds it took to launch them
# Now everything gets processed by different bees
Is it asynchronous?
To me this code does not seem asynchronous because it is Figure 3 in my example diagram. The I/O call blocks the thread (although we don't feel it because they are blocked in parallel).
However, if this is the case I am confused why requests-futures is considered asynchronous since it is a wrapper around ThreadPoolExecutor:
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
future_to_url = {executor.submit(load_url, url, 10): url for url in get_urls()}
for future in concurrent.futures.as_completed(future_to_url):
url = future_to_url[future]
try:
data = future.result()
Can this function on just one thread?
Especially when compared to asyncio, which means it can run single-threaded
There are only two ways to have a program on a single processor do
“more than one thing at a time.” Multi-threaded programming is the
simplest and most popular way to do it, but there is another very
different technique, that lets you have nearly all the advantages of
multi-threading, without actually using multiple threads. It’s really
only practical if your program is largely I/O bound. If your program
is processor bound, then pre-emptive scheduled threads are probably
what you really need. Network servers are rarely processor bound,
however.

First of all, one note: concurrent.futures.Future is not the same as asyncio.Future. Basically it's just an abstraction - an object, that allows you to refer to job result (or exception, which is also a result) in your program after you assigned a job, but before it is completed. It's similar to assigning common function's result to some variable.
Multithreading: Regarding your example, when using multiple threads you can say that your code is "asynchronous" as several operations are performed in different threads at the same time without waiting for each other to complete, and you can see it in the timing results. And you're right, your function due to sleep is blocking, it blocks the worker thread for the specified amount of time, but when you use several threads those threads are blocked in parallel. So if you would have one job with sleep and the other one without and run multiple threads, the one without sleep would perform calculations while the other would sleep. When you use single thread, the jobs are performed in in a serial manner one after the other, so when one job sleeps the other jobs wait for it, actually they just don't exist until it's their turn. All this is pretty much proven by your time tests. The thing happened with print has to do with "thread safety", i.e. print uses standard output, which is a single shared resource. So when your multiple threads tried to print at the same time the switching happened inside and you got your strange output. (This also show "asynchronicity" of your multithreaded example.) To prevent such errors there are locking mechanisms, e.g. locks, semaphores, etc.
Asyncio: To better understand the purpose note the "IO" part, it's not 'async computation', but 'async input/output'. When talking about asyncio you usually don't think about threads at first. Asyncio is about event loop and generators (coroutines). The event loop is the arbiter, that governs the execution of coroutines (and their callbacks), that were registered to the loop. Coroutines are implemented as generators, i.e. functions that allow to perform some actions iteratively, saving state at each iteration and 'returning', and on the next call continuing with the saved state. So basically the event loop is while True: loop, that calls all coroutines/generators, assigned to it, one after another, and they provide result or no-result on each such call - this provides possibility for "asynchronicity". (A simplification, as there's scheduling mechanisms, that optimize this behavior.) The event loop in this situation can run in single thread and if coroutines are non-blocking it will give you true "asynchronicity", but if they are blocking then it's basically a linear execution.
You can achieve the same thing with explicit multithreading, but threads are costly - they require memory to be assigned, switching them takes time, etc. On the other hand asyncio API allows you to abstract from actual implementation and just consider your jobs to be performed asynchronously. It's implementation may be different, it includes calling the OS API and the OS decides what to do, e.g. DMA, additional threads, some specific microcontroller use, etc. The thing is it works well for IO due to lower level mechanisms, hardware stuff. On the other hand, performing computation will require explicit breaking of computation algorithm into pieces to use as asyncio coroutine, so a separate thread might be a better decision, as you can launch the whole computation as one there. (I'm not talking about algorithms that are special to parallel computing). But asyncio event loop might be explicitly set to use separate threads for coroutines, so this will be asyncio with multithreading.
Regarding your example, if you'll implement your function with sleep as asyncio coroutine, shedule and run 50 of them single threaded, you'll get time similar to the first time test, i.e. around 25s, as it is blocking. If you will change it to something like yield from [asyncio.sleep][3](0.5) (which is a coroutine itself), shedule and run 50 of them single threaded, it will be called asynchronously. So while one coroutine will sleep the other will be started, and so on. The jobs will complete in time similar to your second multithreaded test, i.e. close to 0.5s. If you will add print here you'll get good output as it will be used by single thread in serial manner, but the output might be in different order then the order of coroutine assignment to the loop, as coroutines could be run in different order. If you will use multiple threads, then the result will obviously be close to the last one anyway.
Simplification: The difference in multythreading and asyncio is in blocking/non-blocking, so basicly blocking multithreading will somewhat come close to non-blocking asyncio, but there're a lot of differences.
Multithreading for computations (i.e. CPU bound code)
Asyncio for input/output (i.e. I/O bound code)
Regarding your original statement:
all asynchronous code is multi-threaded
all multi-threaded functions are asynchronous
I hope that I was able to show, that:
asynchronous code might be both single threaded and multi-threaded
all multi-threaded functions could be called "asynchronous"

I think the main confusion comes from the meaning of asynchronous. From the Free Online Dictionary of Computing, "A process [...] whose execution can proceed independently" is asynchronous. Now, apply that to what your bees do:
Retrieve an item from the queue. Only one at a time can do that, while the order in which they get an item is undefined. I wouldn't call that asynchronous.
Sleep. Each bee does so independently of all others, i.e. the sleep duration runs on all, otherwise the time wouldn't go down with multiple bees. I'd call that asynchronous.
Call print(). While the calls are independent, at some point the data is funneled into the same output target, and at that point a sequence is enforced. I wouldn't call that asynchronous. Note however that the two arguments to print() and also the trailing newline are handled independently, which is why they can be interleaved.
Lastly, the call to q.join(). Here of course the calling thread is blocked until the queue is empty, so some kind of synchronization is enforced and wanted. I don't see why this "seems to break" for you.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.