How can I maintain two active asyncio streams without a TimeoutError? - python

I'm trying to use asyncio to manage connections in a p2p networking application. I am trying to maintain a large number (~300) of connections using asyncio streams.
I'm using python3.6 and it hangs and times out on asyncio.open_connection(...) each time.
async def example():
    reader, writer = await asyncio.open_connection(ip, port)
    writer.write(handshake)
    await writer.drain()
    response = await reader.read(RESP_SIZE)
    errcode, results = await worker(reader, writer, workerdata)
    # This is the line it hangs and times out on
    reader2, writer2 = await asyncio.open_connection(ip2, port2)
    # Second, identical handshake sequence here
    writer2.write(handshake)
    await writer2.drain()
    response = await reader2.read(RESP_SIZE)
    errcode, results = await worker(reader2, writer2, workerdata2)
def main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(example())
    loop.close()
A trivial example works for a single connection, but once I try to perform a handshake/open a second connection it hangs and I receive
TimeoutError: [Errno 110] Connect call failed
Is it possible to have multiple connections to different client ip/port pairs at the same time using asyncio streams? Is there a different async library that's more appropriate for this?

It hangs because the worker deadlocks while waiting for a message from the remote peer.
A piece of advice: always use timeouts.
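As a rough illustration of that advice, here is a minimal sketch that wraps both the connect and the read in asyncio.wait_for. The helper names open_with_timeout and read_with_timeout, the silent_peer handler, and the timeout values are assumptions made for this example; they are not part of the question's code.

```python
import asyncio

async def open_with_timeout(host, port, timeout=5.0):
    """Open a stream connection, giving up after `timeout` seconds."""
    return await asyncio.wait_for(asyncio.open_connection(host, port), timeout)

async def read_with_timeout(reader, n, timeout=5.0):
    """Read up to n bytes, raising asyncio.TimeoutError if nothing arrives in time."""
    return await asyncio.wait_for(reader.read(n), timeout)

async def demo():
    async def silent_peer(reader, writer):
        # Hypothetical peer that never replies; it only waits for EOF.
        await reader.read()
        writer.close()

    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(silent_peer, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]
    reader, writer = await open_with_timeout(host, port, timeout=2.0)
    try:
        await read_with_timeout(reader, 100, timeout=0.1)
        outcome = "got data"
    except asyncio.TimeoutError:
        # Instead of hanging forever, the caller can now decide what to do.
        outcome = "timed out"
    finally:
        writer.close()
        await writer.wait_closed()
        server.close()
        await server.wait_closed()
    return outcome
```

Calling asyncio.run(demo()) exercises the read timeout against a peer that stays silent. In a real application the except branch is where you would log the failure and drop or retry the peer.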

Related

Execute a coroutine after asyncio server is started

I am working on a controller application that monitors and controls subprocesses, which are independent Python executables.
Basically, I want controller.py to run asyncio.start_server. Once the server is up and running, controller.py should execute other Python files as clients, which then connect to it. The controller server runs forever, creating new client instances and sending them a shutdown message when necessary.
Unfortunately this does not work. No error is raised; it just hangs.
controller.py:
import asyncio
import subprocess
async def handleClient(reader, writer):
    # handling a connection
    addr = writer.get_extra_info("peername")
    print(f"connection from {addr}")
    data_ = await reader.readline()
    ...
async def startClient(client_py_file, host, port):
    # this executes another py file that will connect to this server
    await asyncio.sleep(0.1)
    subprocess.run(["python.exe", client_py_file, host, str(port)])
async def main():
    server = await asyncio.start_server(handleClient, "127.0.0.1", 4000)
    await asyncio.ensure_future(startClient("client.py", "127.0.0.1", 4000))
    await server.wait_closed()
asyncio.run(main())
It seems that client.py does start and connects to the server without any error.
client.py:
async def async_client(loop):
    reader, writer = await asyncio.open_connection(host, port, loop=loop)
    writer.writelines([json.dumps({"key": idstr, "msg": "this is my message"}).encode(), b"\n"])
    await writer.drain()
    while True:
        data = await reader.readline()
        ...
Now the client hangs, waiting for a response from the server, but on the server the handleClient handler is never triggered. I have no idea what is going wrong. Could you please help me?
Thank you in advance!
The problem is that subprocess.run is a blocking function, which waits for the client to finish. During this wait the event loop is blocked and unable to service the incoming connections.
The simplest fix is to replace subprocess.run(...) with subprocess.Popen(...) which does the same thing, but returning a handle to the subprocess without waiting for it to finish. If you need to communicate with the subprocess, you can also use asyncio.create_subprocess_exec(...) which also returns a handle, but one whose methods like wait() are coroutines.
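A minimal sketch of the asyncio.create_subprocess_exec variant is below. To keep it self-contained, the child is an inline script (CLIENT_CODE, a hypothetical stand-in for client.py) run with sys.executable, and the server listens on port 0 so the OS picks a free port; none of these names come from the question.

```python
import asyncio
import sys

# Hypothetical stand-in for client.py: connects and sends a single line.
CLIENT_CODE = (
    "import socket, sys\n"
    "s = socket.create_connection((sys.argv[1], int(sys.argv[2])))\n"
    "s.sendall(b'hello from client\\n')\n"
    "s.close()\n"
)

async def demo():
    received = asyncio.Queue()

    async def handle_client(reader, writer):
        # Runs while the child is alive, because the event loop is never blocked.
        await received.put(await reader.readline())
        writer.close()

    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]

    # Returns as soon as the child has been started; it does not wait for it to exit.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CLIENT_CODE, host, str(port)
    )
    line = await asyncio.wait_for(received.get(), timeout=10)
    await proc.wait()  # a coroutine, so other tasks keep running while we wait
    server.close()
    await server.wait_closed()
    return line
```

Because starting the child and waiting for it are both awaited, the server can accept the child's connection while the controller is still holding the process handle.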

Ways to optimize simple asyncio program where TCP clients are persistent

Using Python 3.7.4 and the asyncio package I'm trying to write an application that should spawn around 20000 (20k or more) TCP clients which then connect to a single server.
The clients then wait for a command from the server (received_data = await reader.read(4096)) and proceed to executing it (await loop.run_in_executor(...)) then send the response back to the server (writer.write(resp)).
After this cycle is completed, I sleep 100ms (await asyncio.sleep(100e-3)) in order to allow other coroutines to run.
The 20k clients should never disconnect and should process commands from the server indefinitely.
I'm interested in ways I can change the code to improve its performance beyond what it is capable of now (barring the use of uvloop or a direct Protocol implementation, which I saw in uvloop's docs could improve performance).
Let's assume that I cannot modify handle_request.
For example, the await asyncio.sleep(100e-3) especially bothers me, but I had to add it; otherwise it looked as if no coroutine other than the first one ever ran! Why could that be?
Say I remove the sleep (since in theory the other awaits should allow other coroutines to run), what else could I do?
Below is a minimal example of what my application looks like:
import asyncio
from collections import namedtuple
import functools
import logging
import os
import signal
import sys
logger = logging.getLogger(__name__)
should_exit = asyncio.Event()
def exit(signame, loop):
    should_exit.set()
    logger.warning('Exiting soon...')
def handle_request(received_data, entity):
    logger.info('Backend logic here that consumes a bit of time depending on the entity and the received_data')
async def run_entity(entity, args):
    logger.info(f'Running entity {entity}')
    loop = asyncio.get_running_loop()
    try:
        reader, writer = await asyncio.open_connection(args.addr[0], int(args.addr[1]))
        logger.debug(f'{entity} connected to {args.addr[0]}:{args.addr[1]}')
        try:
            while not should_exit.is_set():
                received_data = await reader.read(4096)
                if received_data:
                    logger.debug(f'{entity} received data {received_data}')
                    success, resp = await loop.run_in_executor(None, functools.partial(handle_request, received_data, entity))
                    if success:
                        logger.debug(f'{entity} sending response {resp}')
                        writer.write(resp)
                        await writer.drain()
                await asyncio.sleep(100e-3)
        except ConnectionResetError:
            pass
    except ConnectionRefusedError:
        logger.warning(f'Connection refused by {args.addr[0]}:{args.addr[1]}.')
    except Exception:
        logger.exception('Details of unexpected error:')
    logger.info(f'Stopped entity {entity}')
async def main(entities, args):
    if os.name == 'posix':
        loop = asyncio.get_running_loop()
        loop.add_signal_handler(signal.SIGTERM, functools.partial(exit, signal.SIGTERM, loop))
        loop.add_signal_handler(signal.SIGINT, functools.partial(exit, signal.SIGINT, loop))
    tasks = (run_entity(entity, args) for entity in entities)
    await asyncio.gather(*tasks)
if __name__ == '__main__':
    ArgsReplacement = namedtuple('ArgsReplacement', ['addr'])
    asyncio.run(main(range(20000), ArgsReplacement(addr=['127.0.0.1', '4242'])))

Python sockets server and client in one script

I have a seemingly simple task that I can't quite wrap my brains around.
Here is what I need to do. Using socket module, start a server, use a client to start a connection, stop the server, return connection data - all in one script. I can do it when I run the two from two terminals but I need to put both server and client code in one script for automation. My problem is that socket.accept() is a blocking call and the script hangs before I can invoke the client. Tried playing with socket.setblocking(False) but it still blocks. I intuitively feel that I can accomplish this with asyncio module, but I have no experience with it and the examples I've seen don't seem to fit my task. Thanks much.
I need to put both server and client code in one script for automation. My problem is that socket.accept() is a blocking call and the script hangs before I can invoke the client. [...] I intuitively feel that I can accomplish this with asyncio module
Asyncio indeed makes it easy to start several tasks "in the background" (see asyncio.create_task) or "in parallel" (see asyncio.gather).
In fact, since the start_server API runs the server "in the background" to begin with (sort of how a server forks to daemonize itself, and you don't have to add & when starting it from a shell), you don't even need to do anything special to start the client and the server in parallel - just start the server, await the client coroutine, and stop the server.
As an example, starting with the echo client/server examples from the documentation, I quickly arrived at something like this:
import asyncio
async def connect():
    print('connecting...')
    reader, writer = await asyncio.open_connection('127.0.0.1', 8888)
    writer.write(b'hello world')
    data = await reader.read(100)
    assert data == b'hello world'
    writer.close()
    await writer.wait_closed()
    print('closed connection')
    return data
async def handle_client(reader, writer):
    print('incoming connection')
    while True:
        data = await reader.read(100)
        if data == b'':
            break
        writer.write(data)
        await writer.drain()
    print('incoming connection closed')
async def main():
    server = await asyncio.start_server(handle_client, '127.0.0.1', 8888)
    print('server now set up')
    await connect()
    server.close()
    await server.wait_closed()
asyncio.run(main())
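The same structure extends to several clients at once. The following sketch, which is a variation on the example above rather than part of the original answer, uses asyncio.gather to run three clients "in parallel" against one server. The echo_once helper and the use of port 0 (so the OS picks a free port) are assumptions made for illustration.

```python
import asyncio

async def handle_client(reader, writer):
    # Echo everything back until the client closes its side.
    while True:
        data = await reader.read(100)
        if not data:
            break
        writer.write(data)
        await writer.drain()
    writer.close()

async def echo_once(host, port, message):
    # Hypothetical client helper: send one message and read the echo back.
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(message)
    await writer.drain()
    data = await reader.readexactly(len(message))
    writer.close()
    await writer.wait_closed()
    return data

async def main():
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]
    messages = [b"one", b"two", b"three"]
    # gather runs the three clients concurrently and returns their
    # results in the same order as the awaitables passed in.
    replies = await asyncio.gather(*(echo_once(host, port, m) for m in messages))
    server.close()
    await server.wait_closed()
    return replies
```

Run it with asyncio.run(main()); the server keeps serving in the background for as long as main() is awaiting the clients.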

listen to multiple socket with websockets and asyncio

I am trying to create a Python script that listens to multiple sockets using websockets and asyncio. The problem is that no matter what I do, it only listens to the first socket I call.
I think it's the infinite loop. What are my options to solve this? Using a thread for each socket?
async def start_socket(self, event):
    payload = json.dumps(event)
    loop = asyncio.get_event_loop()
    self.tasks.append(loop.create_task(
        self.subscribe(event)))
    # this should not block the rest of the code
    await asyncio.gather(*tasks)
def test(self):
    # I want to be able to add coroutines at a different time
    self.start_socket(event1)
    # some code
    self.start_socket(event2)
This is what I did eventually; this way it does not block the main thread, and all subscriptions work in parallel.
def subscribe(self, payload):
    ws = websocket.WebSocket(sslopt={"cert_reqs": ssl.CERT_NONE})
    ws.connect(url)
    ws.send(payload)
    while True:
        result = ws.recv()
        print("Received '%s'" % result)
def start_thread(self, loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()
def start_socket(self, **kwargs):
    worker_loop = asyncio.new_event_loop()
    worker = Thread(target=self.start_thread, args=(worker_loop,))
    worker.start()
    worker_loop.call_soon_threadsafe(self.subscribe, payload)
def listen(self):
    self.start_socket(payload1)
    # code
    self.start_socket(payload2)
    # code
    self.start_socket(payload3)
Your code appears incomplete, but what you've shown has two issues. One is that run_until_complete accepts a coroutine object (or other kind of future), not a coroutine function. So it should be:
# note parentheses after your_async_function()
asyncio.get_event_loop().run_until_complete(your_async_function())
the problem is that no matter what I do, it only listens to the first socket I call. I think it's the infinite loop. What are my options to solve this? Using a thread for each socket?
The infinite loop is not the problem; asyncio is designed to support such "infinite loops". The problem is that you are trying to do everything in one coroutine, whereas you should be creating one coroutine per websocket. This is not a problem, as coroutines are very lightweight.
For example (untested):
async def subscribe_all(self, payload):
    loop = asyncio.get_event_loop()
    tasks = []
    # create a task for each URL (url_list is assumed to be defined elsewhere)
    for url in url_list:
        tasks.append(loop.create_task(self.subscribe_one(url, payload)))
    # run all tasks in parallel
    await asyncio.gather(*tasks)
async def subscribe_one(self, url, payload):
    async with websockets.connect(url) as websocket:
        await websocket.send(payload)
        while True:
            msg = await websocket.recv()
            print(msg)
One way to efficiently listen to multiple websocket connections from a websocket server is to keep a set of connected clients and essentially juggle multiple conversations in parallel.
For example, a simple server that sends a random number to each connected client every few seconds:
import os
import asyncio
import websockets
import random
websocket_clients = set()
async def handle_socket_connection(websocket, path):
    """Handles the whole lifecycle of each client's websocket connection."""
    websocket_clients.add(websocket)
    print(f'New connection from: {websocket.remote_address} ({len(websocket_clients)} total)')
    try:
        # This loop will keep listening on the socket until it's closed.
        async for raw_message in websocket:
            print(f'Got: [{raw_message}] from socket [{id(websocket)}]')
    except websockets.exceptions.ConnectionClosedError as cce:
        pass
    finally:
        print(f'Disconnected from socket [{id(websocket)}]...')
        websocket_clients.remove(websocket)
async def broadcast_random_number(loop):
    """Keeps sending a random # to each connected websocket client"""
    while True:
        # Iterate over a copy: clients may connect or disconnect while we await send()
        for c in list(websocket_clients):
            num = str(random.randint(10, 99))
            print(f'Sending [{num}] to socket [{id(c)}]')
            await c.send(num)
        await asyncio.sleep(2)
if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    try:
        socket_server = websockets.serve(handle_socket_connection, 'localhost', 6789)
        print(f'Started socket server: {socket_server} ...')
        loop.run_until_complete(socket_server)
        loop.run_until_complete(broadcast_random_number(loop))
        loop.run_forever()
    finally:
        loop.close()
        print(f"Successfully shutdown [{loop}].")
A simple client that connects to the server and listens for the numbers:
import asyncio
import random
import websockets
async def handle_message():
    uri = "ws://localhost:6789"
    async with websockets.connect(uri) as websocket:
        msg = 'Please send me a number...'
        print(f'Sending [{msg}] to [{websocket}]')
        await websocket.send(msg)
        while True:
            got_back = await websocket.recv()
            print(f"Got: {got_back}")
asyncio.get_event_loop().run_until_complete(handle_message())
Mixing threads and asyncio is more trouble than it's worth, and you still end up with code that blocks on the most wasteful steps, such as network I/O (avoiding that is the essential benefit of using asyncio).
You need to run each coroutine asynchronously in an event loop, await every call that would otherwise block, and declare each function that awaits anything with async def.
See a working example: https://github.com/adnantium/websocket_client_server

Multiclient Streaming Websocket endpoint (Python)

Recently I've gotten into the "crypto mania" and have started writing my own wrappers around the APIs of some exchanges.
Binance in particular has a streaming websocket endpoint, where you can stream data over a websocket.
I thought I'd try this out on my own using sanic.
Here is my websocket route:
@ws_routes.websocket("/hello")
async def hello(request, ws):
    while True:
        await ws.send("hello")
Now I have two clients on two different machines connecting to it:
async def main():
    async with aiohttp.ClientSession() as session:
        ws = await session.ws_connect("ws://192.168.86.31:8000/hello")
        while True:
            data = await ws.receive()
            print(data)
However, only one of the clients is able to connect and receive the data sent by the server. I'm assuming that the while loop blocks and prevents the other connection from being accepted because it never yields?
How do we make it stream to multiple clients without blocking the other connections?
I looked into adding more workers and it seems to do the trick, but that is not a very scalable solution: each client would need its own worker, so thousands of clients, or even just 10, would mean one worker per client.
So how does Binance do its websocket streaming? Or hell, how does the Twitter stream endpoint work?
How is it able to serve an infinite stream to multiple concurrent clients?
Ultimately, that's what I'm trying to do.
The way to solve this would be something like this.
I am using the sanic framework
class Stream:
    def __init__(self):
        self._connected_clients = set()
    async def __call__(self, *args, **kwargs):
        await self.stream(*args, **kwargs)
    async def stream(self, request, ws):
        self._connected_clients.add(ws)
        while True:
            disconnected_clients = []
            for client in self._connected_clients:  # check for disconnected clients
                if client.state == 3:  # collect in a list: removing from a set while iterating over it raises an error
                    disconnected_clients.append(client)
            for client in disconnected_clients:  # remove disconnected clients
                self._connected_clients.remove(client)
            await asyncio.wait([client.send("Hello") for client in self._connected_clients])
ws_routes.add_websocket_route(Stream(), "/stream")
1. Keep track of each websocket session.
2. Add it to a list or set.
3. Check for invalid websocket sessions and remove them from your websocket sessions container.
4. Do an await asyncio.wait([ws_session.send() for ws_session in [list of valid sessions]]), which is basically a broadcast.
5. Profit!
This is basically the pub/sub design pattern.
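The broadcast step can be sketched without any web framework. In the example below, FakeClient is a hypothetical stand-in for a websocket session and Broadcaster is an illustrative name; neither is a sanic API. The sketch uses asyncio.gather rather than asyncio.wait, because since Python 3.11 asyncio.wait no longer accepts bare coroutine objects. It also detects dead sessions by their failed send instead of a state check, which is a swapped-in technique, not the answer's approach.

```python
import asyncio

class FakeClient:
    """Hypothetical stand-in for a websocket session (not a sanic API)."""
    def __init__(self, name, closed=False):
        self.name = name
        self.closed = closed
        self.inbox = []

    async def send(self, message):
        if self.closed:
            raise ConnectionError(f"{self.name} is gone")
        self.inbox.append(message)

class Broadcaster:
    def __init__(self):
        self.clients = set()

    async def broadcast(self, message):
        # Snapshot the set so clients can join or leave while we await.
        clients = list(self.clients)
        results = await asyncio.gather(
            *(client.send(message) for client in clients),
            return_exceptions=True,  # one dead client must not break the others
        )
        # Drop every client whose send failed.
        for client, result in zip(clients, results):
            if isinstance(result, Exception):
                self.clients.discard(client)

async def demo():
    hub = Broadcaster()
    alice, bob = FakeClient("alice"), FakeClient("bob")
    gone = FakeClient("gone", closed=True)
    hub.clients.update({alice, bob, gone})
    await hub.broadcast("Hello")
    return alice.inbox, bob.inbox, gone in hub.clients
```

In a real route handler, each connection would add its session to the set on connect and remove it on disconnect, while a single background task calls broadcast for every new message.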
Something like this maybe?
import aiohttp
import asyncio
loop = asyncio.get_event_loop()
async def main():
    async with aiohttp.ClientSession() as session:
        ws = await session.ws_connect("ws://192.168.86.31:8000/hello")
        while True:
            data = await ws.receive()
            print(data)
multiple_coroutines = [main() for _ in range(10)]
loop.run_until_complete(asyncio.gather(*multiple_coroutines))
