I am using the Python 3 asyncio module to create a load-balancing application. I have two heavy I/O tasks:
A SNMP polling module, which determines the best possible server
A "proxy-like" module, which balances the requests to the selected server
Both processes are going to run forever, are independent of each other, and should not be blocked by the other one.
I can't use one event loop because they would block each other, so is there any way to have two event loops, or do I have to use multithreading/multiprocessing?
I tried using asyncio.new_event_loop() but haven't managed to make it work.
The whole point of asyncio is that you can run many thousands of I/O-heavy tasks concurrently, so you don't need threads at all; this is exactly what asyncio is made for. Just run the two coroutines (SNMP and proxy) in the same loop and that's it.
You have to make both of them available to the event loop BEFORE calling loop.run_forever(). Something like this:
import asyncio

async def snmp():
    print("Doing the snmp thing")
    await asyncio.sleep(1)

async def proxy():
    print("Doing the proxy thing")
    await asyncio.sleep(2)

async def main():
    while True:
        await snmp()
        await proxy()

loop = asyncio.get_event_loop()
loop.create_task(main())
loop.run_forever()
I don't know the structure of your code, so the different modules might have their own infinite loops or something; in that case you can run something like this:
import asyncio

async def snmp():
    while True:
        print("Doing the snmp thing")
        await asyncio.sleep(1)

async def proxy():
    while True:
        print("Doing the proxy thing")
        await asyncio.sleep(2)

loop = asyncio.get_event_loop()
loop.create_task(snmp())
loop.create_task(proxy())
loop.run_forever()
Remember, both snmp and proxy need to be coroutines (async def) written in an asyncio-aware manner. asyncio will not make simple blocking Python functions suddenly "async".
In your specific case, I suspect that you are a little bit confused (no offense!): well-written async modules never block each other in the same loop. If your modules are not asyncio-aware (i.e. they are ordinary blocking code), you don't need asyncio at all; just run one of them in a separate thread without dealing with any asyncio stuff, as sketched below.
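For instance, a minimal sketch of that thread-based fallback, where poll_servers and run_proxy are hypothetical stand-ins for your blocking modules:

import threading

def poll_servers():
    # hypothetical blocking SNMP polling loop
    ...

def run_proxy():
    # hypothetical blocking proxy loop
    ...

# Run the poller in a background thread and the proxy in the main thread.
threading.Thread(target=poll_servers, daemon=True).start()
run_proxy()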
Answering my own question to post my solution:
What I ended up doing was creating a thread with a new event loop inside it for the polling module, so now every module runs in its own loop. It is not a perfect solution, but it is the only one that made sense to me (I wanted to avoid threads, but since it is only one...). Example:
import asyncio
import threading

def worker():
    second_loop = asyncio.new_event_loop()
    execute_polling_coroutines_forever(second_loop)
    return

threads = []
t = threading.Thread(target=worker)
threads.append(t)
t.start()

loop = asyncio.get_event_loop()
execute_proxy_coroutines_forever(loop)
asyncio requires that every loop runs its coroutines in the same thread. With this method you have one event loop for each thread, and they are totally independent: every loop executes its coroutines in its own thread, so that is not a problem.
As I said, it's probably not the best solution, but it worked for me.
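For context, here is a minimal sketch of what the placeholder function in the example above might do; poll() is a hypothetical coroutine, not part of the original code:

import asyncio

async def poll():
    # hypothetical SNMP polling coroutine
    while True:
        print("polling")
        await asyncio.sleep(1)

def execute_polling_coroutines_forever(loop):
    # Bind the new loop to this thread and run the polling coroutine on it forever.
    asyncio.set_event_loop(loop)
    loop.run_until_complete(poll())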
In most cases you don't need multiple event loops when using asyncio, but people shouldn't assume their assumptions apply to every case, or just hand you what they think is better without addressing your original question.
Here's a demo of how to create new event loops in threads. Compared to your own answer, set_event_loop saves you from having to pass the loop object around every time you do an asyncio-based operation.
import asyncio
import threading

async def print_env_info_async():
    # As you can see each work thread has its own asyncio event loop.
    print(f"Thread: {threading.get_ident()}, event loop: {id(asyncio.get_running_loop())}")

async def work():
    while True:
        await print_env_info_async()
        await asyncio.sleep(1)

def worker():
    new_loop = asyncio.new_event_loop()
    asyncio.set_event_loop(new_loop)
    new_loop.run_until_complete(work())
    return

number_of_threads = 2
for _ in range(number_of_threads):
    threading.Thread(target=worker).start()
Ideally, you'll want to put heavy work in worker threads and keep the asyncio thread as light as possible. Think of the asyncio thread as the GUI thread of a desktop or mobile app: you don't want to block it. Worker threads are usually very busy, which is one of the reasons you don't want to create separate asyncio event loops in worker threads. Here's an example of how to manage heavy worker threads with a single asyncio event loop, which is the most common practice for this kind of use case:
import asyncio
import concurrent.futures
import threading
import time

def print_env_info(source_thread_id):
    # This will be called in the main thread where the default asyncio event loop lives.
    print(f"Thread: {threading.get_ident()}, event loop: {id(asyncio.get_running_loop())}, source thread: {source_thread_id}")

def work(event_loop):
    while True:
        # The following line will fail because there's no asyncio event loop running in this worker thread.
        # print(f"Thread: {threading.get_ident()}, event loop: {id(asyncio.get_running_loop())}")
        event_loop.call_soon_threadsafe(print_env_info, threading.get_ident())
        time.sleep(1)

async def worker():
    print(f"Thread: {threading.get_ident()}, event loop: {id(asyncio.get_running_loop())}")
    loop = asyncio.get_running_loop()
    number_of_threads = 2
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=number_of_threads)
    for _ in range(number_of_threads):
        asyncio.ensure_future(loop.run_in_executor(executor, work, loop))

loop = asyncio.get_event_loop()
loop.create_task(worker())
loop.run_forever()
I know it's an old thread, but it might still be helpful for someone.
I'm not an asyncio expert, but here is a slightly improved version of @kissgyorgy's answer. Instead of awaiting each coroutine separately, we create a list of tasks and fire them together (Python 3.9):
import asyncio

async def snmp():
    while True:
        print("Doing the snmp thing")
        await asyncio.sleep(0.4)

async def proxy():
    while True:
        print("Doing the proxy thing")
        await asyncio.sleep(2)

async def main():
    tasks = []
    tasks.append(asyncio.create_task(snmp()))
    tasks.append(asyncio.create_task(proxy()))
    await asyncio.gather(*tasks)

asyncio.run(main())
Result:
Doing the snmp thing
Doing the proxy thing
Doing the snmp thing
Doing the snmp thing
Doing the snmp thing
Doing the snmp thing
Doing the proxy thing
The asyncio event loop runs in a single thread and will not run anything in parallel; that is how it is designed. The closest thing I can think of is using asyncio.wait.
import asyncio

async def some_work(x, y):
    print("Going to do some heavy work")
    await asyncio.sleep(1.0)
    print(x + y)

async def some_other_work(x, y):
    print("Going to do some other heavy work")
    await asyncio.sleep(3.0)
    print(x * y)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait([asyncio.ensure_future(some_work(3, 4)),
                                          asyncio.ensure_future(some_other_work(3, 4))]))
    loop.close()
An alternative is to use asyncio.gather(), which returns the aggregated results of the given futures.
tasks = [asyncio.ensure_future(some_work(3, 4)), asyncio.ensure_future(some_other_work(3, 4))]
loop.run_until_complete(asyncio.gather(*tasks))
If the proxy server is running all the time, it cannot switch back and forth. The proxy listens for client requests and makes them asynchronous, but the other task cannot execute, because the proxy is serving forever.
If the proxy is a coroutine and it is starving the SNMP poller (never awaiting), aren't the client requests being starved as well?
Every coroutine will run forever; they will not end.
That should be fine, as long as they await / yield from. The echo server will also run forever; it doesn't mean you can't run several servers (on different ports, though) in the same loop.
Related
I'd like to use asyncio to do a lot of simultaneous non-blocking I/O in Python. However, I want that use of asyncio to be abstracted away from the user: under the hood there are a lot of asynchronous calls going on simultaneously to speed things up, but for the user there's a single, synchronous call.
Basically something like this:
async def _slow_async_fn(address):
    data = await async_load_data(address)
    return data

def synchronous_blocking_io():
    addresses = ...
    tasks = []
    for address in addresses:
        tasks.append(_slow_async_fn(address))
    all_results = some_fn(asyncio.gather(*tasks))
    return all_results
The problem is, how can I achieve this in a way that's agnostic to the user's running environment? If I use a pattern like asyncio.get_event_loop().run_until_complete(), I run into issues when the code is called inside an environment like Jupyter, where there's already an event loop running. Is there a way to robustly gather the results of a set of asynchronous tasks that doesn't require pushing async/await statements all the way up the program?
The restriction on running loops is per thread, so running a new event loop is possible, as long as it is in a new thread.
import asyncio
import concurrent.futures

async def gatherer_of(tasks):
    # It's necessary to wrap asyncio.gather() in a coroutine (reasons beyond scope)
    return await asyncio.gather(*tasks)

def synchronous_blocking_io():
    addresses = ...
    tasks = []
    for address in addresses:
        tasks.append(_slow_async_fn(address))
    loop = asyncio.new_event_loop()
    return loop.run_until_complete(gatherer_of(tasks))

def synchronous_blocking_io_wrapper():
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        fut = executor.submit(synchronous_blocking_io)
        return fut.result()

# Testing
async def async_runner():
    # Simulating execution from a running loop
    return synchronous_blocking_io_wrapper()

# Run from synchronous client
# print(synchronous_blocking_io_wrapper())

# Run from async client
# print(asyncio.run(async_runner()))
The same result can be achieved with a ProcessPoolExecutor, by manually running synchronous_blocking_io in a new thread and joining it, by starting an entirely new process, and so forth. As long as you are not in the same thread, you won't conflict with any running event loop.
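For example, a minimal sketch of the manual-thread variant, assuming the same synchronous_blocking_io as above (the helper name is just for illustration):

import threading

def synchronous_blocking_io_in_thread():
    result = {}

    def runner():
        result["value"] = synchronous_blocking_io()

    t = threading.Thread(target=runner)
    t.start()
    t.join()  # block until the worker thread (and its private event loop) finishes
    return result["value"]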
Is it possible to share an asyncio.Queue over different tasks in one event loop?
The usecase:
Two tasks are publishing data on a queue, and one task is grabbing the new items from the queue. All tasks run asynchronously.
main.py
import asyncio
import creator

async def pull_message(queue):
    while True:
        # Here I don't get messages; maybe the queue is always
        # occupied by another task?
        msg = await queue.get()
        print(msg)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    queue = asyncio.Queue(loop=loop)
    future = asyncio.ensure_future(pull_message(queue))
    creators = list()
    for i in range(2):
        creators.append(loop.create_task(creator.populate_msg(queue)))
    # add future to creators for easy handling
    creators.append(future)
    loop.run_until_complete(asyncio.gather(*creators))
creator.py
import asyncio

async def populate_msg(queue):
    while True:
        msg = "Foo"
        await queue.put(msg)
The problem in your code is that populate_msg doesn't yield to the event loop because the queue is unbounded. This is somewhat counter-intuitive because the coroutine clearly contains an await, but that await only suspends the execution of the coroutine if the coroutine would otherwise block. Since put() on an unbounded queue never blocks, populate_msg is the only thing executed by the event loop.
The problem will go away once you change populate_msg to actually do something else (like await a network event). For testing purposes you can add await asyncio.sleep(0) inside the loop, which will force the coroutine to yield control to the event loop at every iteration of the while loop. Note that this will cause the event loop to spend an entire core by continuously spinning the loop.
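For instance, a minimal sketch of the test-only workaround described above; the sleep(0) is only there to force a context switch, and real code would await an actual I/O event instead:

import asyncio

async def populate_msg(queue):
    while True:
        await queue.put("Foo")
        # Force a switch back to the event loop so pull_message gets a chance to run.
        await asyncio.sleep(0)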
Suppose I have some tasks running asynchronously. They may be totally independent, but I still want to set points where the tasks will pause so they can run concurrently.
What is the correct way to run the tasks concurrently? I am currently using await asyncio.sleep(0), but I feel this is adding a lot of overhead.
import asyncio

async def do(name, amount):
    for i in range(amount):
        # Do some time-expensive work
        print(f'{name}: has done {i}')
        await asyncio.sleep(0)
    return f'{name}: done'

async def main():
    res = await asyncio.gather(do('Task1', 3), do('Task2', 2))
    print(*res, sep='\n')

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Output
Task1: has done 0
Task2: has done 0
Task1: has done 1
Task2: has done 1
Task1: has done 2
Task1: done
Task2: done
If we were using simple generators, an empty yield would pause the flow of a task without any overhead, but an empty await is not valid.
What is the correct way to set such breakpoints without overhead?
As mentioned in the comments, asyncio coroutines normally suspend automatically on calls that would block or sleep in equivalent synchronous code. In your case the coroutine is CPU-bound, so awaiting blocking calls is not enough; it needs to occasionally relinquish control to the event loop to allow the rest of the system to run.
Explicit yields are not uncommon in cooperative multitasking, and while using await asyncio.sleep(0) for that purpose will work as intended, it does carry a risk: sleep too often, and you slow down the computation with unnecessary switches; sleep too seldom, and you hog the event loop by spending too much time in a single coroutine.
The solution provided by asyncio is to offload CPU-bound code to a thread pool using run_in_executor. Awaiting it will automatically suspend the coroutine until the CPU-intensive task is done, without any intermediate polling. For example:
import asyncio

def do(id, amount):
    for i in range(amount):
        # Do some time-expensive work
        print(f'{id}: has done {i}')
    return f'{id}: done'

async def main():
    loop = asyncio.get_event_loop()
    res = await asyncio.gather(
        loop.run_in_executor(None, do, 'Task1', 5),
        loop.run_in_executor(None, do, 'Task2', 3))
    print(*res, sep='\n')

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
I'm not sure what I'm doing wrong here, I'm trying to have a class which contains a queue and uses a coroutine to consume items on that queue. The wrinkle is that the event loop is being run in a separate thread (in that thread I do loop.run_forever() to get it running).
What I'm seeing though is that the coroutine for consuming items is never fired:
import asyncio
from threading import Thread
import functools

# so print always flushes to stdout
print = functools.partial(print, flush=True)

def start_loop(loop):
    def run_forever(loop):
        print("Setting loop to run forever")
        asyncio.set_event_loop(loop)
        loop.run_forever()
        print("Leaving run forever")

    asyncio.set_event_loop(loop)
    print("Spawaning thread")
    thread = Thread(target=run_forever, args=(loop,))
    thread.start()

class Foo:
    def __init__(self, loop):
        print("in foo init")
        self.queue = asyncio.Queue()
        asyncio.run_coroutine_threadsafe(self.consumer(self.queue), loop)

    async def consumer(self, queue):
        print("In consumer")
        while True:
            message = await queue.get()
            print(f"Got message {message}")
            if message == "END OF QUEUE":
                print(f"exiting consumer")
                break
            print(f"Processing {message}...")

def main():
    loop = asyncio.new_event_loop()
    start_loop(loop)
    f = Foo(loop)
    f.queue.put("this is a message")
    f.queue.put("END OF QUEUE")
    loop.call_soon_threadsafe(loop.stop)
    # wait for the stop to propagate and complete
    while loop.is_running():
        pass

if __name__ == "__main__":
    main()
Output:
Spawaning thread
Setting loop to run forever
in foo init
Leaving run forever
There are several issues with this code.
First, check the warnings:
test.py:44: RuntimeWarning: coroutine 'Queue.put' was never awaited
f.queue.put("this is a message")
test.py:45: RuntimeWarning: coroutine 'Queue.put' was never awaited
f.queue.put("END OF QUEUE")
That means queue.put is a coroutine, so it has to be run using run_coroutine_threadsafe:
asyncio.run_coroutine_threadsafe(f.queue.put("this is a message"), loop)
asyncio.run_coroutine_threadsafe(f.queue.put("END OF QUEUE"), loop)
You could also use queue.put_nowait, which is a synchronous method. However, asyncio objects are generally not thread-safe, so every synchronous call has to go through call_soon_threadsafe:
loop.call_soon_threadsafe(f.queue.put_nowait, "this is a message")
loop.call_soon_threadsafe(f.queue.put_nowait, "END OF QUEUE")
Another issue is that the loop gets stopped before the consumer task can start processing items. You could add a join method to the Foo class to wait for the consumer to finish:
class Foo:
    def __init__(self, loop):
        [...]
        self.future = asyncio.run_coroutine_threadsafe(self.consumer(self.queue), loop)

    def join(self):
        self.future.result()
Then make sure to call this method before stopping the loop:
f.join()
loop.call_soon_threadsafe(loop.stop)
This should be enough to get the program to work as you expect. However, this code is still problematic in several respects.
First, the loop should not be set both in the main thread and in the extra thread. asyncio loops are not meant to be shared between threads, so you need to make sure that everything asyncio-related happens in the dedicated thread.
Since Foo is responsible for the communication between those two threads, you'll have to be extra careful to make sure every line of code runs in the right thread. For instance, the instantiation of asyncio.Queue has to happen in the asyncio thread.
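For example, a minimal sketch of one way to create the queue on the loop's own thread; the _make_queue helper is just an illustration, not part of the corrected program referenced below:

import asyncio

class Foo:
    def __init__(self, loop):
        self.loop = loop
        # Create the queue inside the asyncio thread, not in the caller's thread.
        self.queue = asyncio.run_coroutine_threadsafe(self._make_queue(), loop).result()
        self.future = asyncio.run_coroutine_threadsafe(self.consumer(self.queue), loop)

    async def _make_queue(self):
        return asyncio.Queue()

    async def consumer(self, queue):
        while True:
            message = await queue.get()
            if message == "END OF QUEUE":
                break
            print(f"Processing {message}...")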
See this gist for a corrected version of your program.
Also, I'd like to point out that this is not the typical use case for asyncio. You generally want to have an asyncio loop running in the main thread, especially if you need subprocess support:
asyncio supports running subprocesses from different threads, but there are limits:
An event loop must run in the main thread
The child watcher must be instantiated in the main thread, before executing subprocesses from other threads. Call the get_child_watcher() function in the main thread to instantiate the child watcher.
I would suggest designing your application the other way around, i.e. running asyncio in the main thread and using run_in_executor for the synchronous blocking code.
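A minimal sketch of that inverted design, assuming a hypothetical blocking produce() function standing in for the code that currently runs in your main thread:

import asyncio

def produce(loop, queue):
    # Hypothetical blocking producer running in a worker thread.
    for message in ("this is a message", "END OF QUEUE"):
        loop.call_soon_threadsafe(queue.put_nowait, message)

async def consume(queue):
    while True:
        message = await queue.get()
        if message == "END OF QUEUE":
            break
        print(f"Processing {message}...")

async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    producer = loop.run_in_executor(None, produce, loop, queue)
    await asyncio.gather(producer, consume(queue))

asyncio.run(main())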
I was trying to explain an example of async programming in Python, but I failed.
Here is my code.
import asyncio
import time

async def asyncfoo(t):
    time.sleep(t)
    print("asyncFoo")

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncfoo(10))  # I think here is the problem
print("Foo")
loop.close()
My expectation is that I would see:
Foo
asyncFoo
With a wait of 10s before asyncFoo was displayed.
But instead I got nothing for 10s, and then they both displayed.
What am I doing wrong, and how can I explain it?
run_until_complete will block until asyncfoo is done. Instead, you would need two coroutines executed in the loop. Use asyncio.gather to easily start more than one coroutine with run_until_complete.
Here is an example:
import asyncio

async def async_foo():
    print("asyncFoo1")
    await asyncio.sleep(3)
    print("asyncFoo2")

async def async_bar():
    print("asyncBar1")
    await asyncio.sleep(1)
    print("asyncBar2")

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(async_foo(), async_bar()))
loop.close()
Your expectation would work in contexts where you run your coroutine as a Task, independent of the flow of the code. Another situation where it would work is if you are running multiple coroutines side by side, in which case the event loop will juggle code execution from await to await statement.
Within the context of your example, you can achieve your anticipated behaviour by wrapping your coroutine in a Task object, which will continue on in the background without holding up the rest of the code in the block from which it is called.
For example:
import asyncio

async def asyncfoo(t):
    await asyncio.sleep(t)
    print("asyncFoo")

async def my_app(t):
    my_task = asyncio.ensure_future(asyncfoo(t))
    print("Foo")
    await asyncio.wait([my_task])

loop = asyncio.get_event_loop()
loop.run_until_complete(my_app(10))
loop.close()
Note that you should use asyncio.sleep() instead of the time module.
run_until_complete is blocking, so even though asyncfoo finishes in 10 seconds, the loop waits for it; only after it completes does the other print occur.
You should launch your loop.run_until_complete(asyncfoo(10)) in a thread or a subprocess if you want "Foo" to be printed first.
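For instance, a minimal sketch of the thread-based variant, keeping the original blocking time.sleep for illustration:

import asyncio
import threading
import time

async def asyncfoo(t):
    time.sleep(t)  # still blocking, but now it only blocks the background thread
    print("asyncFoo")

def run_asyncfoo(t):
    loop = asyncio.new_event_loop()
    loop.run_until_complete(asyncfoo(t))
    loop.close()

thread = threading.Thread(target=run_asyncfoo, args=(10,))
thread.start()
print("Foo")  # printed immediately; asyncFoo follows about 10 seconds later
thread.join()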