Why can't an asynchronous sleep task and a CPU-bound task proceed concurrently? - python

I need to send HTTP requests and do some CPU intensive task while waiting for the response. I tried to mock the situation with an asyncio.sleep and a CPU task below:
import asyncio

async def main():
    loop = asyncio.get_event_loop()
    start = loop.time()
    task = asyncio.create_task(asyncio.sleep(1))
    # ------Useless CPU-Bound Task------ #
    for n in range(10 ** 7):
        n **= 7
    # ---------------------------------- #
    print(f"CPU-bound process finished in {loop.time()-start:.2f} seconds.")
    await task
    print(f"Finished in {loop.time()-start:.2f} seconds.")

asyncio.run(main())
Output:
CPU-bound process finished in 2.12 seconds.
Finished in 3.12 seconds.
I expected the sleeping task to proceed during the CPU-bound work, but apparently they ran sequentially. This also makes me worry about the requests I need to send: a CPU-bound stretch might begin and completely block them, so they don't get sent to the server until the processing finishes.
So the question is why does this happen and how to prevent it?
I've also read somewhere that asyncio only switches context upon await calls. Does this have disadvantages in a situation like this, if so, how?
Addendum: will using threading have any advantages over asyncio in this scenario? I know it's a lot of questions, but I'm really confused.

Asyncio tasks give you co-operative concurrency rather than true parallelism.
Your sleeper task won't actually start running until you "yield" control to it, which is usually done with an await call. Since that happens after your main (CPU-intensive) code is finished, there will be an extra second after that before everything is actually done.
An await asyncio.sleep(0) between sleeper task creation and the CPU-intensive work will allow the sleeper task to commence. It will then immediately yield back to the main task and they'll run "concurrently".
Of course, a CPU-bound async task sort of defeats the purpose of asyncio since it won't yield to allow other tasks to run in a timely manner. That doesn't really matter for this sleeper but, if it was a task that had to do thirty things, one per second, that would be a problem.
If you need to do anything like that, it's a good idea to either choose one of the other forty-eight ways of doing concurrency in Python :-), or yield enough in the main task so that other tasks can run. In other words, something like:
yield_cycle = 0.1                              # Cycle time.
then = time.monotonic()                        # Base time.
for n in range(10 ** 7):
    n **= 7
    if time.monotonic() - then > yield_cycle:  # Check cycle time.
        await asyncio.sleep(0)                 # Yield if exceeded.
        then = time.monotonic()                # Prep next cycle.
In fact, we have a helper function in our own code base which does exactly this. I can't give you the actual source code but I think it's (hopefully) simple enough to recite from memory:
async def play_nice(secs: float, base: float) -> float:
    """Yield periodically in intensive task.

    Initial call can use negative base to yield immediately.

    Args:
        secs: Minimum run time before yield will happen.
        base: Base monotonic time to use for calculations.

    Returns:
        New base time to use.
    """
    if base < 0:
        base = time.monotonic() - secs
    if time.monotonic() - base >= secs:
        await asyncio.sleep(0)
        return time.monotonic()
    return base

# Your code is then:

then = await play_nice(secs=0.1, base=-1)        # Initial yield.
for n in range(10 ** 7):
    n **= 7
    then = await play_nice(secs=0.1, base=then)  # Subsequent ones.

The reason is that your CPU-intensive task holds control until it yields it. You can force it to yield using sleep:
sleep() always suspends the current task, allowing other tasks to run.
Setting the delay to 0 provides an optimized path to allow other tasks to run. This can be used by long-running functions to avoid blocking the event loop for the full duration of the function call.
import asyncio

async def test_sleep(n):
    await asyncio.sleep(n)

async def main():
    loop = asyncio.get_event_loop()
    start = loop.time()
    task = asyncio.create_task(asyncio.sleep(1))
    await asyncio.sleep(0)
    # ------Useless CPU-Bound Task------ #
    for n in range(10 ** 7):
        n **= 7
    # ---------------------------------- #
    print(f'CPU-bound process finished in {loop.time()-start:.2f} seconds.')
    await task
    print(f"Finished in {loop.time()-start:.2f} seconds.")

asyncio.run(main())
Will output
CPU-bound process finished in 4.21 seconds.
Finished in 4.21 seconds.

Related

How to avoid starting hundreds of threads when launching (very short) actions at different timings in the future

I use this method to launch a few dozen (fewer than a thousand) calls of do_it at different timings in the future:
import threading

timers = []
while True:
    for i in range(20):
        t = threading.Timer(i * 0.010, do_it, [i])  # I pass the parameter i to function do_it
        t.start()
        timers.append(t)                            # so that they can be cancelled if needed
    wait_for_something_else()                       # this can last from 5 ms to 20 seconds
The runtime of each do_it call is very fast (much less than 0.1 ms) and non-blocking. I would like to avoid spawning hundreds of new threads for such a simple task.
How could I do this with only one additional thread for all do_it calls?
Is there a simple way to do this with Python, without third party library and only standard library?
As I understand it, you want a single worker thread that can process submitted tasks, not in the order they are submitted, but rather in some prioritized order. This seems like a job for the thread-safe queue.PriorityQueue.
from dataclasses import dataclass, field
from threading import Thread
from typing import Any
from queue import PriorityQueue

@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any = field(compare=False)

def thread_worker(q: PriorityQueue[PrioritizedItem]):
    while True:
        do_it(q.get().item)
        q.task_done()

q = PriorityQueue()
t = Thread(target=thread_worker, args=(q,))
t.start()

while True:
    for i in range(20):
        q.put(PrioritizedItem(priority=i * 0.010, item=i))
    wait_for_something_else()
This code assumes you want to run forever. If not, you can add a timeout to the q.get in thread_worker and return when the queue.Empty exception is raised because the timeout expired. That way you'll be able to join the queue/thread after all the jobs have been processed and the timeout has expired.
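For illustration, a minimal sketch of that timeout variant (the 5-second timeout is an arbitrary value chosen for the example):
from queue import Empty, PriorityQueue

def thread_worker(q: PriorityQueue):
    while True:
        try:
            entry = q.get(timeout=5)  # give up if nothing new arrives within 5 seconds
        except Empty:
            return                    # lets the worker thread finish so it can be joined
        do_it(entry.item)
        q.task_done()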
If you want to wait until some specific time in the future to run the tasks, it gets a bit more complicated. Here's an approach that extends the above approach by sleeping in the worker thread until the specified time has arrived, but be aware that time.sleep is only as accurate as your OS allows it to be.
from dataclasses import astuple, dataclass, field
from datetime import datetime, timedelta
from time import sleep
from threading import Thread
from typing import Any
from queue import PriorityQueue

@dataclass(order=True)
class TimedItem:
    when: datetime
    item: Any = field(compare=False)

def thread_worker(q: PriorityQueue[TimedItem]):
    while True:
        when, item = astuple(q.get())
        sleep_time = (when - datetime.now()).total_seconds()
        if sleep_time > 0:
            sleep(sleep_time)
        do_it(item)
        q.task_done()

q = PriorityQueue()
t = Thread(target=thread_worker, args=(q,))
t.start()

while True:
    now = datetime.now()
    for i in range(20):
        q.put(TimedItem(when=now + timedelta(seconds=i * 0.010), item=i))
    wait_for_something_else()
To address this problem using only a single extra thread we have to sleep in that thread, so it's possible that new tasks with higher priority could come in while the worker is sleeping. In that case the worker would process that new high priority task after it's done with the current one. The above code assumes that scenario will not happen, which seems reasonable based on the problem description. If that might happen you can alter the sleep code to repeatedly poll if the task at the front of the priority queue has come due. The disadvantage with a polling approach like that is that it would be more CPU intensive.
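For illustration, here is a hedged sketch of such a polling worker (polling_worker and POLL_INTERVAL are names invented for this example, and the interval is an arbitrary choice). It reuses TimedItem and do_it from the code above, keeps its own heap of pending items so an earlier task submitted mid-sleep is picked up within roughly one poll interval, and omits the task_done/join bookkeeping for brevity:
import heapq
from datetime import datetime
from time import sleep

POLL_INTERVAL = 0.005  # arbitrary illustrative value

def polling_worker(q):
    pending = []                    # local heap of TimedItem, ordered by 'when'
    while True:
        while not q.empty():        # safe: this is the queue's only consumer
            heapq.heappush(pending, q.get())
        while pending and pending[0].when <= datetime.now():
            do_it(heapq.heappop(pending).item)
        sleep(POLL_INTERVAL)        # short nap instead of one long sleep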
Also, if you can guarantee that the relative order of the tasks won't change after they've been submitted to the worker, then you can replace the priority queue with a regular queue.Queue to simplify the code somewhat.
These do_it tasks can be cancelled by removing them from the queue.
The above code was tested with the following mock definitions:
def do_it(x):
    print(x)

def wait_for_something_else():
    sleep(5)
An alternative approach that uses no extra threads would be to use asyncio, as pointed out by smcjones. Here's an approach using asyncio that calls do_it at specific times in the future by using loop.call_later:
import asyncio

def do_it(x):
    print(x)

async def wait_for_something_else():
    await asyncio.sleep(5)

async def main():
    loop = asyncio.get_event_loop()
    while True:
        for i in range(20):
            loop.call_later(i * 0.010, do_it, i)
        await wait_for_something_else()

asyncio.run(main())
These do_it tasks can be cancelled using the handle returned by loop.call_later.
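For example, a minimal sketch of that cancellation (handles is just an illustrative name):
# Inside main() from the example above, keep the TimerHandle objects that
# loop.call_later returns; cancel() is a no-op if the callback has already run.
handles = [loop.call_later(i * 0.010, do_it, i) for i in range(20)]

# ... later, if the pending calls are no longer wanted:
for handle in handles:
    handle.cancel()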
This approach will, however, require either switching over your program to use asyncio throughout, or running the asyncio event loop in a separate thread.
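As a hedged sketch of the second option (running the asyncio event loop in its own thread while the rest of the program stays synchronous), something along these lines should work; all interaction with the loop thread goes through call_soon_threadsafe:
import asyncio
import threading

loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

# From the main (synchronous) thread, hand the timed calls over to the loop
# thread; do_it is the same callback as in the example above.
for i in range(20):
    loop.call_soon_threadsafe(loop.call_later, i * 0.010, do_it, i)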
It sounds like you want something to be non-blocking and asynchronous, but also single-processed and single-threaded (one thread dedicated to do_it).
If this is the case, and especially if any networking is involved, so long as you're not actively doing serious I/O on your main thread, it is probably worthwhile using asyncio instead.
It's designed to handle non-blocking operations, and allows you to make all of your requests without waiting for a response.
Example:
import asyncio

async def main():
    while True:
        tasks = []
        for i in range(20):
            # do_it must be a coroutine here for create_task to accept it
            tasks.append(asyncio.create_task(do_it(i)))
        await wait_for_something_else()
        for task in tasks:
            await task

asyncio.run(main())
Given the time spent on blocking I/O (seconds), you'll probably waste more time managing threads than you'd save by spinning up a separate thread for these other operations.
Since you said that each series of 20 do_it calls starts only after wait_for_something_else has finished, I would recommend calling the join method in each iteration of the while loop:
import threading

timers = []
while True:
    for i in range(20):
        t = threading.Timer(i * 0.010, do_it, [i])  # I pass the parameter i to function do_it
        t.start()
        timers.append(t)                            # so that they can be cancelled if needed
    wait_for_something_else()                       # this can last from 5 ms to 20 seconds
    for t in timers[-20:]:
        t.join()
do_it calls run in order and are cancellable;
all do_it calls run in one thread, sleeping for the specified timing between them (the timing may not be exact with sleep);
a "should_run_it" variable checks whether each do_it should run (which makes it cancellable).
Is it something like this?
import threading
import time

def do_it(i):
    print(f"[{i}] {time.time()}")

should_run_it = {i: True for i in range(20)}

def guard_do_it(i):
    if should_run_it[i]:
        do_it(i)

def run_do_it():
    for i in range(20):
        guard_do_it(i)
        time.sleep(0.010)

if __name__ == "__main__":
    t = threading.Timer(0.010, run_do_it)
    start = time.time()
    print(start)
    t.start()
    # should_run_it[5] = should_run_it[10] = should_run_it[15] = False  # test
    t.join()
    end = time.time()
    print(end)
    print(end - start)
I don't have a ton of experience with threading in Python, so please go easy on me. The concurrent.futures library is part of Python 3 and it's dead simple. I'm providing an example so you can see how straightforward it is.
concurrent.futures with exactly one thread for do_it(), plus concurrency:
import concurrent.futures
import time

def do_it(iteration):
    time.sleep(0.1)
    print('do it counter', iteration)

def wait_for_something_else():
    time.sleep(1)
    print('waiting for something else')

def single_thread():
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        futures = (executor.submit(do_it, i) for i in range(20))
        for future in concurrent.futures.as_completed(futures):
            future.result()

def do_asap():
    wait_for_something_else()

with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(single_thread), executor.submit(do_asap)]
    for future in concurrent.futures.as_completed(futures):
        future.result()
In single_thread(), do_it() is constrained to exactly one thread by creating the ThreadPoolExecutor with max_workers=1.
At the bottom of the snippet, single_thread and do_asap are both submitted to a second thread pool executor, so the two run concurrently while every do_it() call stays on a single, non-blocking thread.
The concurrent.futures documentation describes how to control the number of threads. When max_workers is not specified, the default is max_workers = min(32, os.cpu_count() + 4).

Is there a way to change asyncio.sleep while currently sleeping?

If I have a coroutine currently sleeping to allow other coroutines to run, is it possible to change the sleep time while sleeping? Or would I have to cancel and restart the coroutine? I think I may have just answered myself there. Looking for help from the more experienced.
The "sleep" coroutine is obviously designed to be simple: it pauses for that amount of time, and it is it.
What you seem to need is a way to synchronize your co-routines, and if no signal gets back in an specified amount of time (the time you are passing to sleep), to move on.
Take a look at the synchronization primitives https://docs.python.org/3.6/library/asyncio-sync.html and asyncio.wait_for
So, you can instead of asyncio.sleep, call a co-routine, with wait_for, where it expects an Event, or a Lock release. The Event or lock-release then is used by whatever part of your code would "cancel sleep" anyway.
I created an example to show both sleeping running to the end, and being canceled.
import asyncio

async def interruptable_sleep(time, event):
    try:
        await asyncio.wait_for(event.wait(), timeout=time)
    except asyncio.TimeoutError:
        print("'sleeping' proceeded normally")
    else:
        print("'sleeping' canceled")

async def sleeper(m, n, event):
    await asyncio.sleep(n)
    if n == 3:
        event.set()
    print(f"cycle {m}, step {n}")

async def main():
    event = asyncio.Event()
    tasks = []
    for cycle in range(3):
        event.clear()
        # create batch of async tasks to run in parallel
        for step in range(6):
            tasks.append(asyncio.create_task(sleeper(cycle, step, event), name=f"{cycle}_{step}"))
        await interruptable_sleep(2, event)
    # 'join' remaining tasks
    event.set()
    await asyncio.gather(*tasks)

asyncio.run(main())
This pattern sort of "reverses" the idea of a timeout: if a task finishes early, the waiting is cancelled (while a timeout means "if a task is too late, cancel it").
But maybe you just need the other pattern here: create a list of all your tasks and call asyncio.gather, rather than calling "sleep" to give "time for the other tasks to run".
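A minimal sketch of that simpler pattern, reusing the sleeper coroutine from the example above (gather_main is a made-up name, and the Event is passed only because sleeper expects one; nothing waits on it here):
async def gather_main():
    event = asyncio.Event()
    # gather lets the event loop interleave the coroutines; no manual sleeping needed
    await asyncio.gather(*(sleeper(0, step, event) for step in range(6)))

asyncio.run(gather_main())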

What is the correct way to switch freely between asynchronous tasks?

Suppose I have some tasks running asynchronously. They may be totally independent, but I still want to set points where the tasks will pause so they can run concurrently.
What is the correct way to run the tasks concurrently? I am currently using await asyncio.sleep(0), but I feel this is adding a lot of overhead.
import asyncio

async def do(name, amount):
    for i in range(amount):
        # Do some time-expensive work
        print(f'{name}: has done {i}')
        await asyncio.sleep(0)
    return f'{name}: done'

async def main():
    res = await asyncio.gather(do('Task1', 3), do('Task2', 2))
    print(*res, sep='\n')

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Output
Task1: has done 0
Task2: has done 0
Task1: has done 1
Task2: has done 1
Task1: has done 2
Task1: done
Task2: done
If we were using plain generators, an empty yield would pause the flow of a task without any overhead, but a bare await is not valid.
What is the correct way to set such breakpoints without overhead?
As mentioned in the comments, asyncio coroutines normally suspend automatically on calls that would block or sleep in equivalent synchronous code. In your case the coroutine is CPU-bound, so awaiting blocking calls is not enough; it needs to occasionally relinquish control to the event loop to allow the rest of the system to run.
Explicit yields are not uncommon in cooperative multitasking, and using await asyncio.sleep(0) for that purpose will work as intended, but it does carry a risk: sleep too often, and you're slowing down the computation with unnecessary switches; sleep too seldom, and you're hogging the event loop by spending too much time in a single coroutine.
The solution provided by asyncio is to offload CPU-bound code to a thread pool using run_in_executor. Awaiting it will automatically suspend the coroutine until the CPU-intensive task is done, without any intermediate polling. For example:
import asyncio

def do(id, amount):
    for i in range(amount):
        # Do some time-expensive work
        print(f'{id}: has done {i}')
    return f'{id}: done'

async def main():
    loop = asyncio.get_event_loop()
    res = await asyncio.gather(
        loop.run_in_executor(None, do, 'Task1', 5),
        loop.run_in_executor(None, do, 'Task2', 3))
    print(*res, sep='\n')

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Async function runs in a sync (serial) way, tasks block each other

I do not get any acceleration using asyncio. This snippet still runs in the same fashion as a synchronous job. Most examples use asyncio.sleep() to impose a delay; my question is what happens if part of the code itself imposes the delay, depending on the input parameters.
async def c(n):
    # this loop is supposed to impose delay
    for i in range(1, n * 40000):
        c *= i
    return n

async def f():
    tasks = [c(i) for i in [2, 1, 3]]
    r = []
    completed, pending = await asyncio.wait(tasks)
    for item in completed:
        r.append(item.result())
    return r

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    k = loop.run_until_complete(f())
    loop.close()
I expect to get [1, 2, 3] but I do not (and there is also no time difference compared to running it serially).
asyncio is not about getting acceleration, it's about avoiding "callback hell" when programming in an asynchronous environment, such as (but not limited to) non-blocking IO. Since the code in the question is not asynchronous, there is nothing to gain from using asyncio - but you can look into multiprocessing instead.
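For completeness, a hedged sketch of the multiprocessing route (long_calc_sync is a name made up for this example): a concurrent.futures.ProcessPoolExecutor actually runs CPU-bound calculations in parallel across cores, which asyncio alone cannot do.
from concurrent.futures import ProcessPoolExecutor

def long_calc_sync(n):
    p = 1
    for i in range(1, n * 10000):
        p *= i
    return p

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        # map returns results in input order; each call runs in a separate process
        results = list(pool.map(long_calc_sync, [2, 1, 3]))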
In the code from the question, the function is defined as async, but it runs its entire calculation without awaiting anything. It also references unassigned variables, so let's start with a version that runs:
import math

async def long_calc(n):
    p = 1
    for i in range(1, n * 10000):
        p *= i
    print(math.log(p))
    return p
The print at the end immediately indicates when the calculation is done. Starting several such coroutines "in parallel" is done with asyncio.gather:
async def wait_calcs():
    return await asyncio.gather(*[long_calc(i) for i in [2, 1, 3]])
asyncio.gather will let the calculations run and return once all of them are complete, returning a list of their results in the order in which they appear in the argument list. But the output printed when running loop.run_until_complete(wait_calcs()) shows that the calculations are not really running in parallel:
178065.71824964616
82099.71749644238
279264.3442843094
The results correspond to the [2, 1, 3] order. If the coroutines were running in parallel, the smallest number would appear first because its coroutine has by far the least work to do.
We can force the coroutine to give a chance to other coroutines to run by introducing a no-op sleep in the inner loop:
async def long_calc(n):
    p = 1
    for i in range(1, n * 10000):
        p *= i
        await asyncio.sleep(0)
    print(math.log(p))
    return p
The output now shows that the coroutines were running in parallel:
82099.71749644238
178065.71824964616
279264.3442843094
Note that this version also takes more time to run because it involves more switching between the coroutines and the main loop. The slowdown can be avoided by only sleeping once in a hundred cycles or so.
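For example, a minimal sketch of that throttled yield (the choice of every 100th iteration is arbitrary):
async def long_calc(n):
    p = 1
    for i in range(1, n * 10000):
        p *= i
        if i % 100 == 0:           # yield to the event loop only occasionally
            await asyncio.sleep(0)
    print(math.log(p))
    return p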

await for any future asyncio

I'm trying to use asyncio to handle concurrent network I/O. A very large number of functions are to be scheduled at a single point which vary greatly in time it takes for each to complete. Received data is then processed in a separate process for each output.
The order in which the data is processed is not relevant, so given the potentially very long waiting period for output I'd like to await for whatever future finishes first instead of a predefined order.
def fetch(x):
    sleep()

async def main():
    futures = [loop.run_in_executor(None, fetch, x) for x in range(50)]
    for f in futures:
        await f

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Normally, awaiting futures in the order in which they were queued is fine:
(The colours refer to a timeline diagram in the original post.) Blue represents the time each task spends in the executor's queue, i.e. run_in_executor has been called but the function has not yet executed, as the executor runs only 5 tasks simultaneously; green is the time spent executing the function itself; and red is the time spent waiting for all previous futures to be awaited.
In my case, where the functions vary greatly in runtime, a lot of time is lost waiting for earlier futures in the queue to be awaited, while I could already be processing GET output locally. This leaves my system idle for a while, then overwhelmed when several outputs complete simultaneously, then idle again waiting for more requests to finish.
Is there a way to await whatever future is first completed in the executor?
Looks like you are looking for asyncio.wait with return_when=asyncio.FIRST_COMPLETED.
def fetch(x):
    sleep()

async def main():
    futures = [loop.run_in_executor(None, fetch, x) for x in range(50)]
    while futures:
        # The loop= argument to asyncio.wait was removed in Python 3.10, so it is omitted here.
        done, futures = await asyncio.wait(futures,
                                           return_when=asyncio.FIRST_COMPLETED)
        for f in done:
            await f

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
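As a side note (not part of the original answer), asyncio.as_completed addresses the same need and avoids rebuilding the pending set on every iteration; a minimal sketch:
async def main():
    futures = [loop.run_in_executor(None, fetch, x) for x in range(50)]
    for next_done in asyncio.as_completed(futures):
        result = await next_done   # results arrive in completion order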
