How to use run_in_executor right? - python

I try to use run_in_executor and have some questions. Here is code (basically copypast from docs)
import asyncio
import concurrent.futures
def cpu_bound(val):
# CPU-bound operations will block the event loop:
# in general it is preferable to run them in a
# process pool.
print(f'Start task: {val}')
sum(i * i for i in range(10 ** 7))
print(f'End task: {val}')
async def async_task(val):
print(f'Start async task: {val}')
while True:
print(f'Tick: {val}')
await asyncio.sleep(1)
async def main():
loop = asyncio.get_running_loop()
## Options:
for i in range(5):
loop.create_task(async_task(i))
# 1. Run in the default loop's executor:
# for i in range(10):
# loop.run_in_executor(
# None, cpu_bound, i)
# print('default thread pool')
# 2. Run in a custom thread pool:
# with concurrent.futures.ThreadPoolExecutor(max_workers=10) as pool:
# for i in range(10):
# loop.run_in_executor(
# pool, cpu_bound, i)
# print('custom thread pool')
# 3. Run in a custom process pool:
with concurrent.futures.ProcessPoolExecutor(max_workers = 10) as pool:
for i in range(10):
loop.run_in_executor(
pool, cpu_bound, i)
print('custom process pool')
while True:
await asyncio.sleep(1)
asyncio.run(main())
Case 1: run_in_executor where executor is None:
async_task's execute in the same time as cpu_bound's execute.
In other cases async_task's will execute after cpu_bound's are done.
I thought when we use ProcessPoolExecutor tasks shouldn't block loop. Where am I wrong?

In other cases async_task's will execute after cpu_bound's are done. I thought when we use ProcessPoolExecutor tasks shouldn't block loop. Where am I wrong?
The problem is that with XXXPoolExecutor() shuts down the pool at the end of the with block. Pool shutdown waits for the pending tasks to finish, which blocks the event loop and is incompatible with asyncio. Since your first variant doesn't involve a with statement, it doesn't have this issue.
The solution is simply to remove the with statement and create the pool once (for example at top-level or in main()), and just use it in the function. If you want to, you can explicitly shut down the pool by calling pool.shutdown() after asyncio.run() has completed.
Also note that you are never awaiting the futures returned by loop.run_in_executor. This is an error and asyncio will probably warn you of it; you should probably collect the returned values in a list and await them with something like results = await asyncio.gather(*tasks). This will not only collect the results, but also make sure that the exceptions that occur in the off-thread functions get correctly propagated to your code rather than dropped.

Related

why can't i await readasarray method in gdal module?

I am trying to read several remote images into python and read those image as numpyarray, I try to consider using async to boost my workflow, but I get an error like this:type error: object numpy.ndarray can't be used in 'await' expression',I wonder is it because the method readasarray is not async, so if I have to make it async, I will have to rewrite this method by my own.here are some of my code:
async def taskIO_1():
in_ds = gdal.Open(a[0])
data1 = await in_ds.GetRasterBand(1).ReadAsArray()
async def taskIO_2():
in_ds = gdal.Open(a[1])
data2 = await in_ds.GetRasterBand(1).ReadAsArray()
async def main():
tasks = [taskIO_1(), taskIO_2()]
done,pending = await asyncio.wait(tasks)
for r in done:
print(r.result())
if __name__ == '__main__':
start = time.time()
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(main())
finally:
loop.close()
print(float(time.time()-start))
Your notion is correct: In general, library functions are executed in a synchronized (blocking) fashion, unless the library is explicitly written to support asynchronous execution (e.g. by using non-blocking I/O), such as aiofiles or aiohttp.
To use synchronous calls that you want to be executed asynchronously, you could use loop.run_in_executor. This does nothing else than to offload the computation into a separate thread or process and wrap it so it behaves like a coroutine. An example is shown here:
import asyncio
import concurrent.futures
def blocking_io():
# File operations (such as logging) can block the
# event loop: run them in a thread pool.
with open('/dev/urandom', 'rb') as f:
return f.read(100)
def cpu_bound():
# CPU-bound operations will block the event loop:
# in general it is preferable to run them in a
# process pool.
return sum(i * i for i in range(10 ** 7))
async def main():
loop = asyncio.get_running_loop()
## Options:
# 1. Run in the default loop's executor:
result = await loop.run_in_executor(
None, blocking_io)
print('default thread pool', result)
# 2. Run in a custom thread pool:
with concurrent.futures.ThreadPoolExecutor() as pool:
result = await loop.run_in_executor(
pool, blocking_io)
print('custom thread pool', result)
# 3. Run in a custom process pool:
with concurrent.futures.ProcessPoolExecutor() as pool:
result = await loop.run_in_executor(
pool, cpu_bound)
print('custom process pool', result)
asyncio.run(main())
However, if your application is not using any truly asynchronous features, you are probably better off to just use concurrent.futures pool directly and achieve concurrency that way.

Python 3: How to submit an async function to a threadPool?

I want to use both ThreadPoolExecutor from concurrent.futures and async functions.
My program repeatedly submits a function with different input values to a thread pool. The final sequence of tasks that are executed in that larger function can be in any order, and I don't care about the return value, just that they execute at some point in the future.
So I tried to do this
async def startLoop():
while 1:
for item in clients:
arrayOfFutures.append(await config.threadPool.submit(threadWork, obj))
wait(arrayOfFutures, timeout=None, return_when=ALL_COMPLETED)
where the function submitted is:
async def threadWork(obj):
bool = do_something() # needs to execute before next functions
if bool:
do_a() # can be executed at any time
do_b() # ^
where do_b and do_a are async functions.The problem with this is that I get the error: TypeError: object Future can't be used in 'await' expression and if I remove the await, I get another error saying I need to add await.
I guess I could make everything use threads, but I don't really want to do that.
I recommend a careful readthrough of Python 3's asyncio development guide, particularly the "Concurrency and Multithreading" section.
The main conceptual issue in your example that event loops are single-threaded, so it doesn't make sense to execute an async coroutine in a thread pool. There are a few ways for event loops and threads to interact:
Event loop per thread. For example:
async def threadWorkAsync(obj):
b = do_something()
if b:
# Run a and b as concurrent tasks
task_a = asyncio.create_task(do_a())
task_b = asyncio.create_task(do_b())
await task_a
await task_b
def threadWork(obj):
# Create run loop for this thread and block until completion
asyncio.run(threadWorkAsync())
def startLoop():
while 1:
arrayOfFutures = []
for item in clients:
arrayOfFutures.append(config.threadPool.submit(threadWork, item))
wait(arrayOfFutures, timeout=None, return_when=ALL_COMPLETED)
Execute blocking code in an executor. This allows you to use async futures instead of concurrent futures as above.
async def startLoop():
while 1:
arrayOfFutures = []
for item in clients:
arrayOfFutures.append(asyncio.run_in_executor(
config.threadPool, threadWork, item))
await asyncio.gather(*arrayOfFutures)
Use threadsafe functions to submit tasks to event loops across threads. For example, instead of creating a run loop for each thread you could run all async coroutines in the main thread's run loop:
def threadWork(obj, loop):
b = do_something()
if b:
future_a = asyncio.run_coroutine_threadsafe(do_a(), loop)
future_b = asyncio.run_coroutine_threadsafe(do_b(), loop)
concurrent.futures.wait([future_a, future_b])
async def startLoop():
loop = asyncio.get_running_loop()
while 1:
arrayOfFutures = []
for item in clients:
arrayOfFutures.append(asyncio.run_in_executor(
config.threadPool, threadWork, item, loop))
await asyncio.gather(*arrayOfFutures)
Note: This example should not be used literally as it will result in all coroutines executing in the main thread while the thread pool workers just block. This is just to show an example of the run_coroutine_threadsafe() method.

AsyncIO run in executor using ProcessPoolExecutor

I tried to combine blocking tasks and non-blocking (I/O bound) tasks using ProcessPoolExecutor and found it's behavior pretty unexpected.
class BlockingQueueListener(BaseBlockingListener):
def run(self):
# Continioulsy listening a queue
blocking_listen()
class NonBlockingListener(BaseNonBlocking):
def non_blocking_listen(self):
while True:
await self.get_message()
def run(blocking):
blocking.run()
if __name__ == "__main__":
loop = asyncio.get_event_loop()
executor = ProcessPoolExecutor()
blocking = BlockingQueueListener()
non_blocking = NonBlockingListener()
future = loop.run_in_executor(executor, run(blocking))
loop.run_until_complete(
asyncio.gather(
non_blocking.main(),
future
)
)
I was expecting that both tasks will have control concurrently, but blocking task started in ProcessPoolExecutor blocks and never return control. How could it happen? What the proper way to combine normal coroutines and futures started in multiprocessing executor?
This line:
future = loop.run_in_executor(executor, run(blocking))
Will actually run the blocking function and give its result to the executor.
According to the documentation, you need to pass the function explicitly followed by its arguments.
future = loop.run_in_executor(executor, run, blocking)

How to run a coroutine and wait it result from a sync func when the loop is running?

I have a code like the foolowing:
def render():
loop = asyncio.get_event_loop()
async def test():
await asyncio.sleep(2)
print("hi")
return 200
if loop.is_running():
result = asyncio.ensure_future(test())
else:
result = loop.run_until_complete(test())
When the loop is not running is quite easy, just use loop.run_until_complete and it return the coro result but if the loop is already running (my blocking code running in app which is already running the loop) I cannot use loop.run_until_complete since it will raise an exception; when I call asyncio.ensure_future the task gets scheduled and run, but I want to wait there for the result, does anybody knows how to do this? Docs are not very clear how to do this.
I tried passing a concurrent.futures.Future calling set_result inside the coro and then calling Future.result() on my blocking code, but it doesn't work, it blocks there and do not let anything else to run. ANy help would be appreciated.
To implement runner with the proposed design, you would need a way to single-step the event loop from a callback running inside it. Asyncio explicitly forbids recursive event loops, so this approach is a dead end.
Given that constraint, you have two options:
make render() itself a coroutine;
execute render() (and its callers) in a thread different than the thread that runs the asyncio event loop.
Assuming #1 is out of the question, you can implement the #2 variant of render() like this:
def render():
loop = _event_loop # can't call get_event_loop()
async def test():
await asyncio.sleep(2)
print("hi")
return 200
future = asyncio.run_coroutine_threadsafe(test(), loop)
result = future.result()
Note that you cannot use asyncio.get_event_loop() in render because the event loop is not (and should not be) set for that thread. Instead, the code that spawns the runner thread must call asyncio.get_event_loop() and send it to the thread, or just leave it in a global variable or a shared structure.
Waiting Synchronously for an Asynchronous Coroutine
If an asyncio event loop is already running by calling loop.run_forever, it will block the executing thread until loop.stop is called [see the docs]. Therefore, the only way for a synchronous wait is to run the event loop on a dedicated thread, schedule the asynchronous function on the loop and wait for it synchronously from another thread.
For this I have composed my own minimal solution following the answer by user4815162342. I have also added the parts for cleaning up the loop when all work is finished [see loop.close].
The main function in the code below runs the event loop on a dedicated thread, schedules several tasks on the event loop, plus the task the result of which is to be awaited synchronously. The synchronous wait will block until the desired result is ready. Finally, the loop is closed and cleaned up gracefully along with its thread.
The dedicated thread and the functions stop_loop, run_forever_safe, and await_sync can be encapsulated in a module or a class.
For thread-safery considerations, see section “Concurrency and Multithreading” in asyncio docs.
import asyncio
import threading
#----------------------------------------
def stop_loop(loop):
''' stops an event loop '''
loop.stop()
print (".: LOOP STOPPED:", loop.is_running())
def run_forever_safe(loop):
''' run a loop for ever and clean up after being stopped '''
loop.run_forever()
# NOTE: loop.run_forever returns after calling loop.stop
#-- cancell all tasks and close the loop gracefully
print(".: CLOSING LOOP...")
# source: <https://xinhuang.github.io/posts/2017-07-31-common-mistakes-using-python3-asyncio.html>
loop_tasks_all = asyncio.Task.all_tasks(loop=loop)
for task in loop_tasks_all: task.cancel()
# NOTE: `cancel` does not guarantee that the Task will be cancelled
for task in loop_tasks_all:
if not (task.done() or task.cancelled()):
try:
# wait for task cancellations
loop.run_until_complete(task)
except asyncio.CancelledError: pass
#END for
print(".: ALL TASKS CANCELLED.")
loop.close()
print(".: LOOP CLOSED:", loop.is_closed())
def await_sync(task):
''' synchronously waits for a task '''
while not task.done(): pass
print(".: AWAITED TASK DONE")
return task.result()
#----------------------------------------
async def asyncTask(loop, k):
''' asynchronous task '''
print("--start async task %s" % k)
await asyncio.sleep(3, loop=loop)
print("--end async task %s." % k)
key = "KEY#%s" % k
return key
def main():
loop = asyncio.new_event_loop() # construct a new event loop
#-- closures for running and stopping the event-loop
run_loop_forever = lambda: run_forever_safe(loop)
close_loop_safe = lambda: loop.call_soon_threadsafe(stop_loop, loop)
#-- make dedicated thread for running the event loop
thread = threading.Thread(target=run_loop_forever)
#-- add some tasks along with my particular task
myTask = asyncio.run_coroutine_threadsafe(asyncTask(loop, 100200300), loop=loop)
otherTasks = [asyncio.run_coroutine_threadsafe(asyncTask(loop, i), loop=loop)
for i in range(1, 10)]
#-- begin the thread to run the event-loop
print(".: EVENT-LOOP THREAD START")
thread.start()
#-- _synchronously_ wait for the result of my task
result = await_sync(myTask) # blocks until task is done
print("* final result of my task:", result)
#... do lots of work ...
print("*** ALL WORK DONE ***")
#========================================
# close the loop gracefully when everything is finished
close_loop_safe()
thread.join()
#----------------------------------------
main()
here is my case, my whole programe is async, but call some sync lib, then callback to my async func.
follow the answer by user4815162342.
import asyncio
async def asyncTask(k):
''' asynchronous task '''
print("--start async task %s" % k)
# await asyncio.sleep(3, loop=loop)
await asyncio.sleep(3)
print("--end async task %s." % k)
key = "KEY#%s" % k
return key
def my_callback():
print("here i want to call my async func!")
future = asyncio.run_coroutine_threadsafe(asyncTask(1), LOOP)
return future.result()
def sync_third_lib(cb):
print("here will call back to your code...")
cb()
async def main():
print("main start...")
print("call sync third lib ...")
await asyncio.to_thread(sync_third_lib, my_callback)
# await loop.run_in_executor(None, func=sync_third_lib)
print("another work...keep async...")
await asyncio.sleep(2)
print("done!")
LOOP = asyncio.get_event_loop()
LOOP.run_until_complete(main())

Understanding Python Concurrency with Asyncio

I was wondering how concurrency works in python 3.6 with asyncio. My understanding is that when the interpreter executing await statement, it will leave it there until the awaiting process is complete and then move on to execute the other coroutine task. But what I see here in the code below is not like that. The program runs synchronously, executing task one by one.
What is wrong with my understanding and my impletementation code?
import asyncio
import time
async def myWorker(lock, i):
print("Attempting to attain lock {}".format(i))
# acquire lock
with await lock:
# run critical section of code
print("Currently Locked")
time.sleep(10)
# our worker releases lock at this point
print("Unlocked Critical Section")
async def main():
# instantiate our lock
lock = asyncio.Lock()
# await the execution of 2 myWorker coroutines
# each with our same lock instance passed in
# await asyncio.wait([myWorker(lock), myWorker(lock)])
tasks = []
for i in range(0, 100):
tasks.append(asyncio.ensure_future(myWorker(lock, i)))
await asyncio.wait(tasks)
# Start up a simple loop and run our main function
# until it is complete
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
print("All Tasks Completed")
loop.close()
Invoking a blocking call such as time.sleep in an asyncio coroutine blocks the whole event loop, defeating the purpose of using asyncio.
Change time.sleep(10) to await asyncio.sleep(10), and the code will behave like you expect.
asyncio use a loop to run everything, await would yield back the control to the loop so it can arrange the next coroutine to run.

Categories