How to implement single-producer multi-consumer with aioredis pub/sub - python

I have a web app. One endpoint pushes some object data to a Redis channel.
Another endpoint handles a websocket connection, where that data is fetched from the channel and sent to the client over the websocket.
When I connect several clients via websocket, only the first connected client gets the messages.
How do I read messages from a Redis channel with multiple clients, without creating a new subscription for each of them?
Websocket handler.
Here I subscribe to the channel and save it on the app (init_tram_channel). Then I run a job that listens to the channel and sends messages (run_tram_listening).
@routes.get('/tram-state-ws/{tram_id}')
async def tram_ws(request: web.Request):
    ws = web.WebSocketResponse()
    await ws.prepare(request)

    tram_id = int(request.match_info['tram_id'])
    channel_name = f'tram_{tram_id}'

    await init_tram_channel(channel_name, request.app)
    tram_job = await run_tram_listening(
        request=request,
        ws=ws,
        channel=request.app['tram_producers'][channel_name]
    )

    request.app['websockets'].add(ws)
    try:
        async for msg in ws:
            if msg.type == aiohttp.WSMsgType.TEXT:
                if msg.data == 'close':
                    await ws.close()
                    break
            if msg.type == aiohttp.WSMsgType.ERROR:
                logging.error(f'ws connection was closed with exception {ws.exception()}')
            else:
                await asyncio.sleep(0.005)
    except asyncio.CancelledError:
        pass
    finally:
        await tram_job.close()
        request.app['websockets'].discard(ws)

    return ws
Subscribing and saving the channel.
Every channel is related to a unique object, and in order not to create many channels for the same object, I save only one per object on the app.
app['tram_producers'] is a dict.
async def init_tram_channel(
        channel_name: str,
        app: web.Application
):
    if channel_name not in app['tram_producers']:
        channel, = await app['redis'].subscribe(channel_name)
        app['tram_producers'][channel_name] = channel
Running the coroutine that listens to the channel.
I run it via aiojobs:
async def run_tram_listening(
        request: web.Request,
        ws: web.WebSocketResponse,
        channel: Channel
):
    """
    :return: aiojobs._job.Job object
    """
    listen_redis_job = await spawn(
        request,
        _read_tram_subscription(
            ws,
            channel
        )
    )
    return listen_redis_job
The coroutine where I listen and send messages:
async def _read_tram_subscription(
        ws: web.WebSocketResponse,
        channel: Channel
):
    try:
        async for msg in channel.iter():
            tram_data = msg.decode()
            await ws.send_json(tram_data)
    except asyncio.CancelledError:
        pass
    except Exception as e:
        logging.error(msg=e, exc_info=e)

The following code was found in an aioredis GitHub issue (I've adapted it to my task).
class TramProducer:
    def __init__(self, channel: aioredis.Channel):
        self._future = None
        self._channel = channel

    def __aiter__(self):
        return self

    def __anext__(self):
        return asyncio.shield(self._get_message())

    async def _get_message(self):
        if self._future:
            return await self._future

        self._future = asyncio.get_event_loop().create_future()
        message = await self._channel.get_json()
        future, self._future = self._future, None
        future.set_result(message)
        return message
So, how does it work? TramProducer wraps the way we get messages.
As @Messa said, a "message is received from one Redis subscription only once".
So only one client of TramProducer actually retrieves messages from Redis, while the other clients wait for a future whose result is set after a message is received from the channel.
If self._future is already initialized, it means that somebody is already waiting for a message from Redis, so we simply wait for the result of self._future.
TramProducer usage (I've taken the example from my question):
async def _read_tram_subscription(
        ws: web.WebSocketResponse,
        tram_producer: TramProducer
):
    try:
        async for msg in tram_producer:
            await ws.send_json(msg)
    except asyncio.CancelledError:
        pass
    except Exception as e:
        logging.error(msg=e, exc_info=e)
TramProducer initialization:
async def init_tram_channel(
        channel_name: str,
        app: web.Application
):
    if channel_name not in app['tram_producers']:
        channel, = await app['redis'].subscribe(channel_name)
        app['tram_producers'][channel_name] = TramProducer(channel)
I think it may be helpful for somebody.
The full project is here: https://gitlab.com/tram-emulator/tram-server

I guess a message is received from one Redis subscription only once, and if there is more than one listener in your app, then only one of them will get it.
So you need to create something like a mini pub/sub inside the application to distribute the messages to all listeners (websocket connections in this case).
Some time ago I made an aiohttp websocket chat example - not with Redis, but at least the cross-websocket distribution is there: https://github.com/messa/aiohttp-nextjs-demo-chat/blob/master/chat_web/views/api.py
The key is to have an application-wide message_subscriptions, where every websocket connection registers itself, or perhaps its own asyncio.Queue (I used an Event in my example, but that's suboptimal), and whenever a message comes from Redis, it is pushed to all relevant queues.
Of course, when a websocket connection ends (client unsubscribes, disconnects, fails...) its queue should be removed (and the Redis subscription possibly cancelled if it was the last connection listening to it).
Asyncio doesn't mean we should forget about queues :) It's also good to get familiar with combining multiple tasks at once (reading from the websocket, reading from the message queue, perhaps reading from some notification queue...). Using queues can also help you handle client reconnects more cleanly (without losing any messages).
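A minimal sketch of that idea, fitted to the aiohttp/aioredis code from the question (the names message_subscriptions, redis_fanout and read_for_websocket are illustrative, not taken from the linked example; app['message_subscriptions'] is assumed to be a plain dict created at application startup):

import asyncio

async def redis_fanout(app, channel, channel_name):
    # One Redis subscription per channel; every message is copied
    # into all queues registered for that channel.
    async for raw in channel.iter():
        msg = raw.decode()
        for queue in app['message_subscriptions'].get(channel_name, set()):
            queue.put_nowait(msg)

async def read_for_websocket(ws, app, channel_name):
    # Each websocket reads from its own asyncio.Queue instead of from Redis.
    queue = asyncio.Queue()
    app['message_subscriptions'].setdefault(channel_name, set()).add(queue)
    try:
        while True:
            await ws.send_json(await queue.get())
    finally:
        # Clean up when the connection ends; if this was the last listener,
        # the Redis subscription could be cancelled here as well.
        app['message_subscriptions'][channel_name].discard(queue)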

Related

Python Async Functions won't Give up the CPU

I have two async functions that both need to run constantly and one of them just hogs all of the CPU.
The first function handles receiving websocket messages from a client.
async def handle_message(self, ws):
    """Handles a message from the websocket."""
    logger.info('awaiting message')
    while True:
        msg = await ws.receive()
        logger.debug('received message: %s', msg)
        jmsg = json.loads(msg['text'])
        logger.info('received message: {}'.format(jmsg))
        param = jmsg['parameter']
        val = jmsg['data']['value']
        logger.info('setting parameter {} to {}'.format(param, val))
        self.camera.change_parameter(param, val)
The second function grabs images from a camera and sends them to the frontend client. This is the one that won't give the other one any time.
async def send_image(self, ws):
    """Sends an image to the websocket."""
    for im in self.camera:
        await asyncio.sleep(1000)
        h, w = im.shape[:2]
        resized = cv2.resize(im, (w // 4, h // 4))
        await ws.send_bytes(image_to_bytes(resized))
I'm executing these coroutines using asyncio.gather(). The decorator is from FastAPI and Backend() is my class that contains the two async coroutines.
@app.websocket('/ws')
async def websocket_endpoint(websocket: WebSocket):
    """Handle a WebSocket connection."""
    backend = Backend()
    logger.info('Started backend.')
    await websocket.accept()

    try:
        aws = [backend.send_image(websocket), backend.handle_message(websocket)]
        done, pending = await asyncio.gather(*aws)
    except WebSocketDisconnect:
        await websocket.close()
Both of these coroutines work separately, but if I try to run them together, send_image() never gives any time to handle_message(), so none of the messages are ever received (or at least that's what I think is going on).
I thought this is what asyncio was trying to solve, but I'm probably using it wrong. I thought about using multiprocessing, but I'm pretty sure FastAPI expects awaitables here. I also read about using the return values from gather(), but I didn't really understand it - something about cancelling the pending tasks and adding them back to the event loop.
Can anyone show me the correct (and preferably modern, Pythonic) way to make these async coroutines run concurrently?
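For reference (not an answer from this thread), the task-combining pattern mentioned above looks roughly like this; note that asyncio.gather() returns a list of results, while the (done, pending) tuple comes from asyncio.wait():

import asyncio

async def run_both(backend, websocket):
    # Sketch only: wrap both coroutines in tasks, wait until either one
    # finishes, then cancel whatever is still pending.
    tasks = [
        asyncio.create_task(backend.send_image(websocket)),
        asyncio.create_task(backend.handle_message(websocket)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()

Even with that pattern, a coroutine only yields control at await points, so if send_image() still starves handle_message(), the blocking camera iteration and cv2.resize() call would have to be pushed off the event loop, for example with loop.run_in_executor().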

Python websockets and asyncio usage

I am trying to create a websocket server and exchange data between the server and a client. My data comes from another process, so I use a queue to accept data from that process, but reading the queue blocks my main websocket function. The end result is that the client cannot reconnect after its connection breaks; I think the code is blocked on the queue.
Here is part of my code:
class RecorderEventHook(object):
    def __init__(self, high_event_mq):
        self.high_event_mq = high_event_mq
        self.msg = None
        self.loop = None

    # @wrap_keep_alive
    async def on_msg_event(self, websocket):
        try:
            # async for message in websocket:
            while True:
                msg = self.high_event_mq.get()
                await websocket.send(json.dumps(msg))
                # msg
        except Exception as error:
            print(error)

    async def event_controller(self):
        await websockets.serve(self.on_msg_event, 'localhost', 8888)

    def start(self):
        loop = asyncio.new_event_loop()
        loop.create_task(self.event_controller())
        loop.run_forever()
I tried to save the connected websocket object and use it in another thread (in the same process), but it failed with a
"xxxx" function was never awaited
message.
I want to be able to receive data from other processes without affecting the normal reconnection of the client.
Any help would be greatly appreciated.
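One common way to consume a blocking inter-process queue without freezing the event loop (a sketch, assuming high_event_mq is a standard multiprocessing.Queue) is to hand the blocking get() off to a thread pool:

import asyncio
import json

async def on_msg_event(self, websocket):
    loop = asyncio.get_running_loop()
    try:
        while True:
            # Queue.get() blocks, so run it in the default thread pool;
            # the event loop stays free to accept reconnecting clients.
            msg = await loop.run_in_executor(None, self.high_event_mq.get)
            await websocket.send(json.dumps(msg))
    except Exception as error:
        print(error)

The "never awaited" message comes from calling a coroutine such as websocket.send() from another thread without scheduling it; if the separate-thread approach is kept, asyncio.run_coroutine_threadsafe(coro, loop) is the thread-safe way to schedule it on the running loop.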

FastAPI websocket connection causes CPU spike to 100% inside the Docker container

I am developing a private chat for two or more users to communicate with each other. I have an endpoint for the websocket connection where only authenticated users are able to complete the handshake between client and server.
The problem occurs when the websocket connection is accepted. The consumer handler runs smoothly in an infinite loop, waiting for messages from the client side to carry out the specific tasks that are requested, but the producer on the other side hangs in an infinite loop, and that causes the CPU to spike up to 100%.
Obviously I need one listener for a specific Redis channel where I get all the messages from the users in real time. The while loop does that, but because of the CPU spike it is clearly not a good solution.
# api.py
async def consumer_handler(service):
    """Messages received on the websocket connection Consumer - (Publisher)"""
    try:
        while True:
            received_data = await service.websocket.receive_json()
            if received_data['event_type'] == "online.users":
                await service.get_online_user_status(received_data['role_id'])
            elif received_data['event_type'] == "message.user":
                await service.send_message(received_data['user_id'], received_data['content'])
            elif received_data['event_type'] == "info":
                await service.get_info()
    except WebSocketDisconnect:
        logger.debug("WebSocketDisconnect - consumer handler disconnected")


async def producer_handler(service):
    """Messages generated at the backend to send to the websocket Producer - (Subscriber)"""
    try:
        while True:
            if service.pubsub.subscribed:
                message = await service.pubsub.get_message(ignore_subscribe_messages=True)
                if message:
                    await service.websocket.send_json(message['data'].decode())
    except (ConnectionClosedOK, aioredis.exceptions.ConnectionError) as e:
        logger.debug(f"{e.__class__}", "producer handler disconnected")


@chat_app.websocket("/")
async def websocket_endpoint(websocket: WebSocket,
                             current_user: User = Depends(is_authenticated_ws)):
    if not current_user:
        return
    async with ConnectionContextManager(user_id=current_user.id, websocket=websocket) as service:
        producer_task = asyncio.ensure_future(producer_handler(service))
        consumer_task = asyncio.ensure_future(consumer_handler(service))
        done, pending = await asyncio.wait(
            [consumer_task, producer_task],
            return_when=asyncio.FIRST_COMPLETED
        )
        for task in pending:
            task.cancel()
This endpoint handles both the producer and consumer logic, as described in the websockets documentation.
# websocket_utils.py
class WebsocketService:
    """
    This acts as a service for the websocket and is returned by the context manager.
    It is used so we do not interact with the consumer directly; instead we interact with it through the manager.
    """
    def __init__(self, *, user_id: UUID4, websocket: WebSocket, pubsub: PubSub):
        self.user_id = user_id
        self.websocket = websocket
        self.pubsub = pubsub

    async def get_online_user_status(self, role_id):
        await consumer.online_user_status_per_role(role_id, self.websocket)

    async def send_message(self, user_id: UUID4, content: str):
        await consumer.send_message_to_user(user_id=user_id,
                                            message=content,
                                            websocket=self.websocket)

    async def get_info(self):
        await consumer.fetch_info(self.websocket)


class ConnectionContextManager:
    """
    This context manager handles the websocket connection.
    On enter, it returns a controller for the websocket events.
    """
    websocket_service: WebsocketService

    def __init__(self, *, user_id: UUID4, websocket: WebSocket):
        self.websocket_service = WebsocketService(user_id=user_id,
                                                  websocket=websocket,
                                                  pubsub=websocket.app.redis.pubsub())

    async def __aenter__(self):
        logger.debug("Context manager enter")
        await consumer.connect(
            user_id=self.websocket_service.user_id,
            websocket=self.websocket_service.websocket,
            pubsub=self.websocket_service.pubsub
        )
        return self.websocket_service

    async def __aexit__(self, exc_type, exc_val, exc_tb) -> None:
        await consumer.disconnect(
            user_id=self.websocket_service.user_id,
            pubsub=self.websocket_service.pubsub,
            websocket=self.websocket_service.websocket,
        )
        logger.debug("Context manager exit")
This context manager ensures that each user has their own pubsub channel, and it creates a controller for the actual consumer so that I do not have to pass user_id and other parameters every time I need a specific resource.
class ConnectionConsumer:
    __redis: aioredis.Redis

    def __init__(self):
        self.__redis = aioredis.from_url(settings.ws_redis_url, encoding='utf-8', decode_responses=True)

    async def __send_json(self, obj: dict, websocket: WebSocket):
        await websocket.send_json(obj)

    async def connect(self, *, user_id: UUID4, websocket: WebSocket, pubsub: PubSub):
        # Accept the connection if authorization is successful, set the user online and subscribe to their channel layer
        await websocket.accept()
        await self.__redis.set(f"status:{user_id}", "1")  # status:UUID4 (means online)
        await pubsub.subscribe(f"channel:{user_id}")  # subscribe to the user's own channel

    async def disconnect(self, *, user_id: UUID4, websocket: WebSocket, pubsub: PubSub):
        # Gracefully disconnect from the websocket and remove the channel layer from pubsub
        await self.__redis.delete(f"status:{user_id}")
        await pubsub.unsubscribe(f"channel:{user_id}")
        await pubsub.close()
        await self.__redis.close()
        await websocket.close()
And here is the actual consumer, which is called from the service that the context manager returns.
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
4ed80g7fb093 s_be 1.77% 76.09MiB / 15.29GiB 0.49% 37.3kB / 21.1kB 0B / 0B 7
This is the docker stats for the container when only consumer is handled
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
4ed80g7fb093 s_be 100.36% 76.08MiB / 15.29GiB 0.49% 42.9kB / 25.7kB 0B / 0B 7
And this is the docker stats for the container when both producer and consumer handlers are running
I have tried to split the connections as well but I have the same issue.
I know this is a pretty old question, but I ran into a similar problem.
The issue is that await pubsub.get_message uses the timeout=0.0 parameter by default, which causes "infinite polling" and high CPU usage.
You can specify the timeout argument (it must be a float, in seconds) so the call waits before returning. You can also pass timeout=None to make get_message wait indefinitely for the next message.
message = await pubsub.get_message(
    ignore_subscribe_messages=True, timeout=None
)  # This will wait for a new message indefinitely
if message:
    ...
SOLUTION
Here is the refactored producer handler. It uses the listen() function instead of get_message(); listen() yields messages as they arrive instead of returning immediately, so the event loop only resumes this task when the generator actually delivers a message. There is no need to check repeatedly whether a value is available, so the problem is solved without a timeout or an await sleep() inside the code.
@logger.catch
async def producer_handler(service):
    """Messages generated at the backend to send to the websocket Producer - (Subscriber)"""
    try:
        while True:
            if service.pubsub.subscribed:
                async for message in service.pubsub.listen():
                    if message['type'] == "subscribe":
                        continue
                    await service.websocket.send_text(message['data'])
    except (ConnectionClosedOK, aioredis.exceptions.ConnectionError) as e:
        sentry_sdk.capture_exception(e)
        logger.debug(f"{e.__class__}", "producer handler disconnected")

Two parallel polling tasks on an event-driven platform

I am currently working on a server platform based on an event-driven architecture. An event should enter the system via a websocket connection, and after some processing the response should leave the system via the same websocket connection.
The logic behind the implementation is: when a connection is made to the server, I put it in a while loop and await data from it until it disconnects. The incoming data is put into a queue, from which a worker thread pulls it out and processes it. On the other side, I have created a task that polls an outgoing event queue, and if there is an event in the queue, it sends it to the corresponding recipient.
Unfortunately my current asyncio logic is flawed in that polling the outgoing event queue blocks the receiving task, and I cannot wrap my head around a way to fix it. Here are some code snippets that illustrate the problem described above.
Starting the websocket server
def run(self, address: str, port: int, ssl_context: ssl.SSLContext = None):
    start_server = websockets.serve(
        self.websocket_connection_handler, address, port, ssl=ssl_context)
    event_loop = asyncio.get_event_loop()
    event_loop.create_task(self.send_heartbeat())
    event_loop.create_task(self.dispatch_outgoing_events())
    print(f'Running on {"wss" if ssl_context else "ws"}://{address}:{port}')
    event_loop.run_until_complete(start_server)
    event_loop.run_forever()
The dispatcher function which infinitely polls data from the outgoing queue
async def dispatch_outgoing_events(self):
    while not self.exit_state.should_exit:
        if len(self.outgoing_event_queue) == 0:
            await asyncio.sleep(0)
        else:
            event = self.outgoing_event_queue.get_event()
            destination = event.destination
            client_id = re.findall(
                r'[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}', destination)[0]
            client = self.client_store.get(client_id)
            await client.websocket.send(serializer.serialize(event))
The connection handler function for the websocket
async def websocket_connection_handler(self, websocket, path):
    client_id = await self.register(websocket)
    try:
        while not self.exit_state.should_exit:
            correlation_id = str(uuid4())
            message = await websocket.recv()
        else:
            try:
                event = serializer.deserialize(
                    message, correlation_id, client_id)
                event.return_address = f'remote://websocket/{client_id}'
                self.incoming_event_queue.add_event(event)
            except Exception as e:
                event = type('evt', (object,), dict(system_entry=str(
                    datetime.datetime.utcnow()), destination=f'remote://websocket/{client_id}'))()
                self.exception_handler.handle_exception(
                    e, event)
    except Exception as exception:
        print(
            f'client {client_id} suddenly disconnected. Reason: {type(exception).__name__} -> {exception}')
        self.client_store.remove(client_id)
        self.topic_factory.remove_client(client_id)
        self.topic_factory.get_topic('server_notifications').publish(ClientDisconnectedNotification(client_id),
                                                                     str(uuid4()))
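One way to fix the dispatcher described above (a sketch, assuming the outgoing event queue can be an asyncio.Queue or be wrapped by one) is to await the queue instead of polling it, so the receiving task is never starved:

import asyncio
import re

async def dispatch_outgoing_events(self):
    # Assumes self.outgoing_event_queue is an asyncio.Queue; get() suspends
    # this task until an event is available instead of spinning with sleep(0).
    while not self.exit_state.should_exit:
        event = await self.outgoing_event_queue.get()
        client_id = re.findall(
            r'[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}',
            event.destination)[0]
        client = self.client_store.get(client_id)
        await client.websocket.send(serializer.serialize(event))

If the worker that produces outgoing events runs in another thread, it can hand them to the loop with loop.call_soon_threadsafe(queue.put_nowait, event), since asyncio.Queue is not thread-safe.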

Multiclient Streaming Websocket endpoint (Python)

Recently I've gotten into the "crypto mania" and have started writing my own wrappers around the APIs of some exchanges.
Binance in particular has a streaming websocket endpoint where you can stream data.
I thought I'd try this out on my own using Sanic.
Here is my websocket route:
@ws_routes.websocket("/hello")
async def hello(request, ws):
    while True:
        await ws.send("hello")
Now I have 2 clients on 2 different machines connecting to it:
async def main():
    async with aiohttp.ClientSession() as session:
        ws = await session.ws_connect("ws://192.168.86.31:8000/hello")
        while True:
            data = await ws.receive()
            print(data)
However, only one of the clients is able to connect and receive the data sent by the server. I'm assuming that because of the while loop it's blocking, which prevents the other connection from being served because the handler never yields?
How do we make it stream to multiple clients without blocking the other connections?
I looked into adding more workers and it seems to do the trick, but that's not a very scalable solution: each client would need its own worker, so with thousands of clients, or even just 10, that would be one worker per client.
So how does Binance do their websocket streaming? Or how does the Twitter stream endpoint work?
How are they able to serve an infinite stream to multiple concurrent clients? Because ultimately that's what I'm trying to do.
The way to solve this would be something like the following.
I am using the Sanic framework.
class Stream:
    def __init__(self):
        self._connected_clients = set()

    async def __call__(self, *args, **kwargs):
        await self.stream(*args, **kwargs)

    async def stream(self, request, ws):
        self._connected_clients.add(ws)
        while True:
            disconnected_clients = []
            for client in self._connected_clients:  # check for disconnected clients
                if client.state == 3:  # append to a list because an error will be raised if removed from the set while iterating over it
                    disconnected_clients.append(client)
            for client in disconnected_clients:  # remove disconnected clients
                self._connected_clients.remove(client)
            await asyncio.wait([client.send("Hello") for client in self._connected_clients])


ws_routes.add_websocket_route(Stream(), "/stream")
1. Keep track of each websocket session.
2. Append it to a list or set.
3. Check for invalid websocket sessions and remove them from your websocket sessions container.
4. Do an await asyncio.wait([ws_session.send() for ws_session in list_of_valid_sessions]), which is basically a broadcast.
5. Profit!
This is basically the pub/sub design pattern.
Something like this maybe?
import aiohttp
import asyncio

loop = asyncio.get_event_loop()


async def main():
    async with aiohttp.ClientSession() as session:
        ws = await session.ws_connect("ws://192.168.86.31:8000/hello")
        while True:
            data = await ws.receive()
            print(data)


multiple_coroutines = [main() for _ in range(10)]
loop.run_until_complete(asyncio.gather(*multiple_coroutines))
