Multiple asyncio loops - python

I am implementing a service in a multi-service Python application. The service waits for requests to be pushed onto a Redis queue and resolves them whenever one is available.
async def foo():
    while True:
        _, request = await self.request_que.brpop(self.name)
        # processing request ...
And I'm adding this foo() function to an asyncio event loop:
loop = asyncio.new_event_loop()
asyncio.ensure_future(self.handle(), loop=loop)
loop.run_forever()
The main question is: is it OK to do this in several (say 3 or 4) different services, resulting in several asyncio loops running simultaneously? Will it harm the OS?

Simple answer: yes, it is fine. I previously implemented a service that runs on one asyncio loop and spawns additional processes, each running its own loop (on the same machine).
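For illustration, a minimal sketch of that process-per-service layout. All names here are hypothetical; `worker` stands in for the real queue-consuming coroutine:

```python
import asyncio
import multiprocessing

async def worker(name: str) -> int:
    # Hypothetical stand-in for the real queue-consuming service loop.
    processed = 0
    for _ in range(3):
        await asyncio.sleep(0.01)  # pretend to wait on brpop
        processed += 1
    return processed

def run_service(name: str) -> None:
    # Each process creates, runs, and owns a completely independent event loop.
    print(name, asyncio.run(worker(name)))

if __name__ == "__main__":
    procs = [multiprocessing.Process(target=run_service, args=(f"svc-{i}",))
             for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Because the loops live in separate processes, they never touch each other's state; the OS just sees a few more processes, which is entirely normal.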

Related

Using asyncio in the middle of Python3.6 script

I have a Python task that reports its progress during execution using a status updater that has one or more handlers associated with it. I want the updates to be dispatched to each handler asynchronously (each handler makes an I/O-bound call with the updates: pushing to a queue, logging to a file, calling an HTTP endpoint, etc.). The status updater has a method like the one below, with each handler.dispatch method being a coroutine. This was working until a handler using aiohttp was added, and now I am getting weird errors from the aiohttp module.
def _dispatch(self, **updates):
    event_loop = asyncio.get_event_loop()
    tasks = (event_loop.create_task(handler.dispatch(**updates)) for handler in self._handlers)
    event_loop.run_until_complete(asyncio.gather(*tasks))
Every example of asyncio I've seen basically has this pattern
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
My question is: is the way I am attempting to use the asyncio module in this case just completely wrong? Does the event loop need to be created once and only once, with everything else going through it?

Non blocking python class method executed with asyncio

I'm trying to initialize a non-blocking task that shares data with its parent object. It is a websocket client; it should not block the main execution, yet still run "forever".
My humble expectation was that this would do it, but sadly, it blocks the main thread.
loop = asyncio.new_event_loop()
task = loop.create_task(self.initWS())
loop.run_forever()
self.initWS() is indeed not blocking the main thread, but loop.run_forever() is.
If you want to execute more tasks concurrently with self.initWS(), you have to add them to the asyncio loop, too.
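One common way to keep the main thread free is to run the loop in a background thread and schedule the coroutine onto it with `run_coroutine_threadsafe`. A sketch, where the `initWS` body is only a placeholder for the real websocket client:

```python
import asyncio
import threading

class WSClient:
    """Sketch: the loop runs in a daemon thread so the caller is never blocked.
    initWS is a placeholder for the real websocket coroutine."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    async def initWS(self):
        await asyncio.sleep(0.01)  # placeholder for connecting/receiving
        return "connected"

    def start(self):
        # Returns a concurrent.futures.Future immediately; the main thread
        # stays free while the coroutine runs on the background loop.
        return asyncio.run_coroutine_threadsafe(self.initWS(), self.loop)

client = WSClient()
future = client.start()          # non-blocking
print(future.result(timeout=5))  # only blocks if you ask for the result
```

Note that any data shared between `initWS()` and the main thread now crosses a thread boundary, so it should be handed over via thread-safe mechanisms (e.g. the returned future, or `loop.call_soon_threadsafe`).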

Multithreaded server using Python Websockets library

I am trying to build a Server and Client using the Python Websockets library. I am following the example given here: Python OCPP documentation. I want to edit this example so that for each new client (Charge Point), the server (Central System) runs on a new thread (or a new loop, as the Websockets library uses Python's asyncio). Basically, I want to receive messages from multiple clients simultaneously. I modified the main method of the server (Central System) from the example by adding a loop parameter in websockets.serve(), hoping that whenever a new client connects, it runs on a different loop:
async def main():
    server = await websockets.serve(
        on_connect,
        '0.0.0.0',
        9000,
        subprotocols=['ocpp2.0'],
        ssl=ssl_context,
        loop=asyncio.new_event_loop(),
        ping_timeout=100000000
    )
    await server.wait_closed()

if __name__ == '__main__':
    asyncio.run(main())
I am getting the following error. Please help:
RuntimeError:
Task <Task pending coro=<main() running at server.py:36>
cb=_run_until_complete_cb() at /usr/lib/python3.7/asyncio/base_events.py:153]>
got Future <_GatheringFuture pending> attached to a different loop - sys:1:
RuntimeWarning: coroutine 'BaseEventLoop._create_server_getaddrinfo' was never awaited
Author of the OCPP package here. The examples in the project are able to handle multiple clients with a single event loop.
The problem in your example is loop=asyncio.new_event_loop(). websockets.serve() is called in the 'main' event loop, but you pass it a reference to a second event loop. This causes the future to be attached to a different event loop than the one it was created in.
You should avoid running multiple event loops.
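To make the fix concrete, here is the same shape using plain asyncio.start_server as a self-contained stand-in for websockets.serve (which takes a handler, host, and port the same way): simply drop the loop= argument and let asyncio.run() supply the one loop, which then serves every client concurrently.

```python
import asyncio

async def on_connect(reader, writer):
    # Each client connection becomes its own task on the one shared loop.
    data = await reader.readline()
    writer.write(b"echo:" + data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    # No loop= argument: asyncio.run() supplies the single event loop.
    server = await asyncio.start_server(on_connect, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def client(msg):
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(msg + b"\n")
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
        return reply

    # Two clients served concurrently by the same loop.
    replies = await asyncio.gather(client(b"a"), client(b"b"))
    server.close()
    await server.wait_closed()
    return replies

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The port and handler name mirror the question for familiarity; the key point is that one event loop multiplexes all connections, so no per-client loop or thread is needed.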

How to efficiently use asyncio when calling a method on a BaseProxy?

I'm working on an application that uses LevelDB and that uses multiple long-lived processes for different tasks.
Since LevelDB does only allow a single process maintaining a database connection, all our database access is funneled through a special database process.
To access the database from another process we use a BaseProxy. But since we are using asyncio our proxy shouldn't block on these APIs that call into the db process which then eventually read from the db. Therefore we implement the APIs on the proxy using an executor.
loop = asyncio.get_event_loop()
return await loop.run_in_executor(
    thread_pool_executor,
    self._callmethod,
    method_name,
    args,
)
And while that works just fine, I wonder if there's a better alternative to wrapping the _callmethod call of the BaseProxy in a ThreadPoolExecutor.
The way I understand it, the BaseProxy calling into the DB process is the textbook example of waiting on I/O, so using a thread for this seems unnecessarily wasteful.
In a perfect world, I'd assume an async _acallmethod to exist on the BaseProxy but unfortunately that API does not exist.
So, my question basically boils down to: When working with BaseProxy is there a more efficient alternative to running these cross process calls in a ThreadPoolExecutor?
Unfortunately, the multiprocessing library is not suited to conversion to asyncio; what you have is the best you can do if you must use BaseProxy to handle your IPC (inter-process communication).
While it is true that the library uses blocking I/O, you can't easily reach in and re-work the blocking parts to use non-blocking primitives instead. If you were to insist on going this route, you'd have to patch or rewrite the internal implementation details of that library; but being internal implementation details, these can differ from Python point release to point release, making any patching fragile and prone to break with minor Python upgrades. The _callmethod method is part of a deep hierarchy of abstractions involving threads, socket or pipe connections, and serializers. See multiprocessing/connection.py and multiprocessing/managers.py.
So your options here are to stick with your current approach (using a thread pool executor to shove BaseProxy._callmethod() to another thread) or to implement your own IPC solution using asyncio primitives. Your central database-access process would act as a server for your other processes to connect to as clients, either over sockets or named pipes, using an agreed-upon serialisation scheme for client requests and server responses. This is what multiprocessing implements for you, but you'd implement your own (simpler) version, using asyncio streams and whatever serialisation scheme best suits your application patterns (e.g. pickle, JSON, protobuf, or something else entirely).
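A minimal sketch of such a hand-rolled IPC channel, using asyncio streams with newline-delimited JSON. All names are hypothetical, and the in-memory dict stands in for the real LevelDB handle held by the database process:

```python
import asyncio
import json

# Hypothetical stand-in for the store owned by the database process.
DB = {"answer": "42"}

async def handle_client(reader, writer):
    # Protocol: one JSON request per line, e.g. {"method": "get", "key": ...}
    async for line in reader:
        request = json.loads(line)
        if request["method"] == "get":
            response = {"value": DB.get(request["key"])}
        else:
            response = {"error": "unknown method"}
        writer.write(json.dumps(response).encode() + b"\n")
        await writer.drain()
    writer.close()

async def db_get(port, key):
    # Client side: fully non-blocking, no thread pool involved.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(json.dumps({"method": "get", "key": key}).encode() + b"\n")
    await writer.drain()
    response = json.loads(await reader.readline())
    writer.close()
    await writer.wait_closed()
    return response["value"]

async def demo():
    # In the real design the server lives in the dedicated database process;
    # here both ends run in one process purely for illustration.
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    value = await db_get(port, "answer")
    server.close()
    await server.wait_closed()
    return value
```

In a production version you would keep the client connection open and multiplex requests over it rather than reconnecting per call, but the server/serialisation split is the same.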
A thread pool is what you want. aioprocessing provides some async functionality for multiprocessing, but it does so using threads, as you have proposed. I suggest filing an issue against Python if there isn't one already for exposing true async multiprocessing.
https://github.com/dano/aioprocessing
In most cases, this library makes blocking calls to multiprocessing methods asynchronous by executing the call in a ThreadPoolExecutor
Assuming you have Python and the database running on the same system (i.e. you are not looking to async any network calls), you have two options.
1. What you are already doing (run in executor). It blocks the db thread, but the main thread remains free to do other stuff. This is not purely non-blocking, but it is quite an acceptable solution for I/O-blocking cases, with the small overhead of maintaining a thread.
2. For a truly non-blocking solution (one that can run in a single thread without blocking) you need (a) native support for an async callback from the DB for each fetch call, and (b) a wrapper for that in a custom event loop implementation. Here you subclass the base loop and override methods to integrate your db callbacks. For example, you could create a loop that implements a pipe server: the db writes to the pipe and Python polls the pipe. See the implementation of the Proactor event loop in the asyncio code base. Note: I have never implemented such a custom event loop.
I am not familiar with LevelDB, but for a key-value store it is not clear whether there would be any significant benefit from such a callback-based, purely non-blocking implementation. If you are doing multiple fetches inside an iterator and that is your main problem, you can make the loop async (with each fetch still blocking) and improve your performance. Below is some dummy code that demonstrates this.
import asyncio
import random
import time

async def talk_to_db(d):
    """
    blocking db iteration. sleep is the fetch function.
    """
    for k, v in d.items():
        time.sleep(1)
        yield (f"{k}:{v}")

async def talk_to_db_async(d):
    """
    real non-blocking db iteration. fetch (sleep) is native async here
    """
    for k, v in d.items():
        await asyncio.sleep(1)
        yield (f"{k}:{v}")

async def talk_to_db_async_loop(d):
    """
    semi-non-blocking db iteration. fetch is blocking, but the
    loop is not.
    """
    for k, v in d.items():
        time.sleep(1)
        yield (f"{k}:{v}")
        await asyncio.sleep(0)

async def db_call_wrapper(db):
    async for row in talk_to_db(db):
        print(row)

async def db_call_wrapper_async(db):
    async for row in talk_to_db_async(db):
        print(row)

async def db_call_wrapper_async_loop(db):
    async for row in talk_to_db_async_loop(db):
        print(row)

async def func(i):
    await asyncio.sleep(5)
    print(f"done with {i}")

database = {i: random.randint(1, 20) for i in range(20)}

async def main():
    db_coro = db_call_wrapper(database)
    coros = [func(i) for i in range(20)]
    coros.append(db_coro)
    await asyncio.gather(*coros)

async def main_async():
    db_coro = db_call_wrapper_async(database)
    coros = [func(i) for i in range(20)]
    coros.append(db_coro)
    await asyncio.gather(*coros)

async def main_async_loop():
    db_coro = db_call_wrapper_async_loop(database)
    coros = [func(i) for i in range(20)]
    coros.append(db_coro)
    await asyncio.gather(*coros)

# run the blocking db iteration
loop = asyncio.get_event_loop()
loop.run_until_complete(main())

# run the non-blocking db iteration
loop = asyncio.get_event_loop()
loop.run_until_complete(main_async())

# run the non-blocking (loop only) db iteration
loop = asyncio.get_event_loop()
loop.run_until_complete(main_async_loop())
This is something you can try. Otherwise, I would say your current method is quite efficient. I do not think BaseProxy can give you an async call API; it does not know how to handle the callback from your db.

Making connections with python3 asyncio

I'm trying to connect to more than one server at the same time. I am currently using loop.create_connection, but it freezes up at the first non-responding server.
gsock = loop.create_connection(lambda: opensock(sid), server, port)
transport, protocol = loop.run_until_complete(gsock)
I tried threading this, but it created problems with the sid value being used, as well as various errors such as RuntimeError: Event loop is running and RuntimeError: Event loop stopped before Future completed. Also, according to my variables (though they were getting mixed up), the protocol's connection_made() method gets executed when transport, protocol = loop.run_until_complete(gsock) throws an exception.
I don't understand much about the asyncio module, so please be as thorough as possible. I don't think I need reader/writer variables, as the reading should be done automatically and trigger the data_received() method.
Thank You.
You can connect to many servers at the same time by scheduling all the coroutines concurrently, rather than using loop.run_until_complete to make each connection individually. One way to do that is to use asyncio.gather to schedule them all and wait for each to finish:
import asyncio

# define opensock somewhere

@asyncio.coroutine
def connect_serv(server, port):
    try:
        transport, protocol = yield from loop.create_connection(lambda: opensock(sid), server, port)
    except Exception:
        print("Connection to {}:{} failed".format(server, port))

loop = asyncio.get_event_loop()
loop.run_until_complete(
    asyncio.gather(
        connect_serv('1.2.3.4', 3333),
        connect_serv('2.3.4.5', 5555),
        connect_serv('google.com', 80),
    ))
loop.run_forever()
This will kick off all three coroutines listed in the call to gather concurrently, so that if one of them hangs, the others won't be affected; they'll be able to carry on with their work while the other connection hangs. Then, once all of them complete, loop.run_forever() gets executed, which allows your program to continue running until you stop the loop or kill the program.
The reader/writer variables you mentioned would only be relevant if you used asyncio.open_connection to connect to the servers, rather than create_connection. It uses the Stream API, which is a higher-level API than the protocol/transport-based API that create_connection uses. It's really up to you to decide which you prefer to use. There are examples of both in the asyncio docs, if you want to see a comparison.
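For comparison, the streams version of the same fan-out (helper names here are hypothetical): open_connection replaces create_connection and the custom protocol class, returning a reader/writer pair instead of a transport/protocol pair, and a timeout keeps one dead server from stalling the rest.

```python
import asyncio

async def connect(host, port, timeout=3.0):
    # Streams version: open_connection returns (reader, writer) instead of
    # (transport, protocol); a failed or hung connect stays contained here.
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout)
    except (OSError, asyncio.TimeoutError):
        return (host, port, False)
    writer.close()
    await writer.wait_closed()
    return (host, port, True)

async def connect_all(targets):
    # gather runs all connection attempts concurrently, so one slow or
    # unreachable server does not block the others.
    return await asyncio.gather(*(connect(h, p) for h, p in targets))
```

With streams, you then read explicitly via `await reader.read(...)` rather than waiting for a `data_received()` callback, which many people find easier to follow.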
