Why does pyzmq subscriber behave differently with asyncio?

Why does pyzmq subscriber behave differently with asyncio? - python

I have an XPUB/XSUB device and a number of mock publishers running in one process. In a separate process, I want to connect a subscriber and print received message to the terminal. Below I will show two variants of a simple function to do just that. I have these functions wrapped as command-line utilities.
My problem is that the asyncio variant never receives messages.
On the other hand, the non-async variant works just fine. I have tested all cases for ipc and tcp transports. The publishing process never changes in my tests, except when I restart it to change transport. The messages are short strings and published roughly once per second, so we're not looking at performance problem.
The subscriber program sits indefinitely at the line msg = await sock.receive_multipart(). In the XPUB/XSUB device I have instrumentation that shows the forwarding of the sock.setsockopt(zmq.SUBSCRIBE, channel.encode()) message, same as when the non-async variant connects.
The asyncio variant (not working, as described)
def subs(url, channel):
import asyncio
import zmq
import zmq.asyncio
ctx = zmq.asyncio.Context.instance()
sock = ctx.socket(zmq.SUB)
sock.connect(url)
sock.setsockopt(zmq.SUBSCRIBE, channel.encode())
async def task():
while True:
msg = await sock.recv_multipart()
print(' | '.join(m.decode() for m in msg))
try:
asyncio.run(task())
finally:
sock.setsockopt(zmq.LINGER, 0)
sock.close()
The regular blocking variant (works fine)
def subs(url, channel):
import zmq
ctx = zmq.Context.instance()
sock = ctx.socket(zmq.SUB)
sock.connect(url)
sock.setsockopt(zmq.SUBSCRIBE, channel.encode())
def task():
while True:
msg = sock.recv_multipart()
print(' | '.join(m.decode() for m in msg))
try:
task()
finally:
sock.setsockopt(zmq.LINGER, 0)
sock.close()
For this particular tool there is no need to use asyncio. However, I am experiencing this problem elsewhere in my code too, where an asynchronous recv never receives. So I'm hoping that by clearing it up in this simple case I'll understand what's going wrong in general.
My versions are
import zmq
zmq.zmq_version() # '4.3.2'
zmq.__version__ # '19.0.2'
I'm on MacOS 10.13.6.
I'm fully out of ideas. Internet, please help!

A working async variant is
def subs(url, channel):
import asyncio
import zmq
import zmq.asyncio
ctx = zmq.asyncio.Context.instance()
async def task():
sock = ctx.socket(zmq.SUB)
sock.connect(url)
sock.setsockopt(zmq.SUBSCRIBE, channel.encode())
try:
while True:
msg = await sock.recv_multipart()
print(' | '.join(m.decode() for m in msg))
finally:
sock.setsockopt(zmq.LINGER, 0)
sock.close()
asyncio.run(task())
I conclude that, when using asyncio zmq, sockets must be created with a call running on the event loop from which the sockets will be awaited. Even though the original form did not do anything fancy with event loops, it appears that the socket has an event loop different from that used by asyncio.run. I'm not sure why, and I didn't open an issue with pyzmq because their docs show usage as in this answer, without comment.
Edit in response to a comment:
asyncio.run always creates a new event loop, so the loop presumably created for the sockets instantiated outside of the co-routine passed to asyncio.run (as in the asyncio variant in the original question) is obviously different.

Related

When is async/await needed in writing to socket in python asnycio.Protocol?

I'm confused as to when and why the async/await syntax would be needed, or not, when using the low level asyncio.Protocol approach in python. Suppose I have a subclass of asyncio.Protocol, say EchoClientProtocol, and that has a method send_message that external code can use to send any message to an echo server. For example:
import asyncio
class EchoClientProtocol(asyncio.Protocol):
def __init__(self, message):
self.transport = None
self.message = message
def send_message(self,txt):
self.message = txt
self.transport.write(self.message.encode())
print('Data sent: {!r}'.format(self.message))
def connection_made(self, transport):
self.transport=transport
self.send_message(self.message)
def data_received(self, data):
print('Data received: {!r}'.format(data.decode()))
def connection_lost(self, exc):
print('The server closed the connection')
self.on_con_lost.set_result(True)
We can create and run the client with code that uses the familiar async/away syntax, with some auto-reconnect logic built in:
async def main():
while True:
try:
message="Initial message here"
transport, protocol = await loop.create_connection(
lambda: EchoClientProtocol(message),
'127.0.0.1',
8888
)
except OSError:
print("Server not up, retrying in 5 seconds...")
await asyncio.sleep(5)
else:
break
loop = asyncio.get_event_loop()
asyncio.run(main())
The above client class EchoClientProtocol does not declare send_message as async, and does not await the self.transport.write() method. I have seen code like this in many examples online. Why would this work if it uses asyncio? Isn't the async/await syntax necessary? Why or why not?
Suppose I have two clients in the same python script and both have to run simultaneously without blocking each other. For example, suppose that one echo client connects to a server at port 8888 and another connect to another client server at port 8899. I can use async.gather to make sure that both run simultaneously, but in that case do I have to declare send_message as async, and do I have to await self.transport.write()? In other words, what would I have to change in the EchoClientProtocol above so that I could run two or more such clients in the same script (connecting to different servers on different ports) to make them both run without blocking each other?

Python asyncio - starting coroutines in an infinite loop

I am making a simple server/client chat program in Python. This program should allow for multiple users to connect at once, and then execute their requests concurrently. For this, I am using the asyncio module and sockets.
async def accept_new_connections(socket):
socket.listen(1)
while True:
connection, client_address = sock.accept()
print("accepted conn")
asyncio.create_task(accept_commands(socket, connection))
async def accept_commands(socket, connection):
print("accept cmd started")
while True:
# get and execute commands
def main():
asyncio.run(accept_new_connections(socket))
main()
What I would hope to do is running accept_commands for each of the connections, which would then execute commands concurrently. However, the current code only starts accept_commands for the first connection, and blocks the while loop (the one in accept_new_connections).
Any idea what I need to change to have accept_command started for each of the connections instead?

It is tough to tell because your example does have the implementation of accept_commands, but based on your issue it is likely you need to use the async socket methods on event loop itself so that your coroutine can yield execution and let something else happen.
The below example shows how to do this. This starts a socket on port 8080 and will send back any data it receives back to the client. You can see this work concurrently by connecting two clients with netcat or telnet and sending data.
import asyncio
import socket
socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
socket.bind(('localhost', 8080))
socket.listen(1)
socket.setblocking(False)
loop = asyncio.new_event_loop()
async def main():
while True:
connection, client_address = await loop.sock_accept(socket)
print('connected')
loop.create_task(accept_commands(connection))
async def accept_commands(connection):
while True:
request = await loop.sock_recv(connection, 16)
await loop.sock_sendall(connection, request)
loop.run_until_complete(main())

How to clean up connections after KeyboardInterrupt in python-trio

My class when is connected to the server should immediately send sign in string, afterwards when the session is over it should send out the sign out string and clean up the sockets. Below is my code.
import trio
class test:
_buffer = 8192
_max_retry = 4
def __init__(self, host='127.0.0.1', port=12345, usr='user', pwd='secret'):
self.host = str(host)
self.port = int(port)
self.usr = str(usr)
self.pwd = str(pwd)
self._nl = b'\r\n'
self._attempt = 0
self._queue = trio.Queue(30)
self._connected = trio.Event()
self._end_session = trio.Event()
#property
def connected(self):
return self._connected.is_set()
async def _sender(self, client_stream, nursery):
print('## sender: started!')
q = self._queue
while True:
cmd = await q.get()
print('## sending to the server:\n{!r}\n'.format(cmd))
if self._end_session.is_set():
nursery.cancel_scope.shield = True
with trio.move_on_after(1):
await client_stream.send_all(cmd)
nursery.cancel_scope.shield = False
await client_stream.send_all(cmd)
async def _receiver(self, client_stream, nursery):
print('## receiver: started!')
buff = self._buffer
while True:
data = await client_stream.receive_some(buff)
if not data:
print('## receiver: connection closed')
self._end_session.set()
break
print('## got data from the server:\n{!r}'.format(data))
async def _watchdog(self, nursery):
await self._end_session.wait()
await self._queue.put(self._logoff)
self._connected.clear()
nursery.cancel_scope.cancel()
#property
def _login(self, *a, **kw):
nl = self._nl
usr, pwd = self.usr, self.pwd
return nl.join(x.encode() for x in ['Login', usr,pwd]) + 2*nl
#property
def _logoff(self, *a, **kw):
nl = self._nl
return nl.join(x.encode() for x in ['Logoff']) + 2*nl
async def _connect(self):
host, port = self.host, self.port
print('## connecting to {}:{}'.format(host, port))
try:
client_stream = await trio.open_tcp_stream(host, port)
except OSError as err:
print('##', err)
else:
async with client_stream:
self._end_session.clear()
self._connected.set()
self._attempt = 0
# Sign in as soon as connected
await self._queue.put(self._login)
async with trio.open_nursery() as nursery:
print("## spawning watchdog...")
nursery.start_soon(self._watchdog, nursery)
print("## spawning sender...")
nursery.start_soon(self._sender, client_stream, nursery)
print("## spawning receiver...")
nursery.start_soon(self._receiver, client_stream, nursery)
def connect(self):
while self._attempt <= self._max_retry:
try:
trio.run(self._connect)
trio.run(trio.sleep, 1)
self._attempt += 1
except KeyboardInterrupt:
self._end_session.set()
print('Bye bye...')
break
tst = test()
tst.connect()
My logic doesn't quite work. Well it works if I kill the netcat listener, so then my session looks like the following:
## connecting to 127.0.0.1:12345
## spawning watchdog...
## spawning sender...
## spawning receiver...
## receiver: started!
## sender: started!
## sending to the server:
b'Login\r\nuser\r\nsecret\r\n\r\n'
## receiver: connection closed
## sending to the server:
b'Logoff\r\n\r\n'
Note that Logoff string has been sent out, although it doesn't make sense in here as connection is already broken by that time.
However my goal is to Logoff when user KeyboardInterrupt. In this case my session looks similar to this:
## connecting to 127.0.0.1:12345
## spawning watchdog...
## spawning sender...
## spawning receiver...
## receiver: started!
## sender: started!
## sending to the server:
b'Login\r\nuser\r\nsecret\r\n\r\n'
Bye bye...
Note that Logoff hasn't been sent off.
Any ideas?

Here your call tree looks something like:
connect
|
+- _connect*
|
+- _watchdog*
|
+- _sender*
|
+- _receiver*
The *s indicate the 4 trio tasks. The _connect task is sitting at the end of the nursery block, waiting for the child tasks to complete. The _watchdog task is blocked in await self._end_session.wait(), the _sender task is blocked in await q.get(), and the _receiver task is blocked in await client_stream.receive_some(...).
When you hit control-C, then the standard Python semantics are that whatever bit of Python code is running suddenly raises KeyboardInterrupt. In this case, you have 4 different tasks running, so one of those blocked operations gets picked at random [1], and raises a KeyboardInterrupt. This means a few different things might happen:
If _watchdog's wait call raises KeyboardInterrupt, then the _watchdog method immediately exits, so it never even tries to send logout. Then as part of unwinding the stack, trio cancels all the other tasks, and once they've exited then the KeyboardInterrupt keeps propagating up until it reaches your finally block in connect. At this point you try to notify the watchdog task using self._end_session.set(), but it's not running anymore, so it doesn't notice.
If _sender's q.get() call raises KeyboardInterrupt, then the _sender method immediately exits, so even if the _watchdog did ask it to send a logoff message, it won't be there to notice. And in any case, trio then proceeds to cancel the watchdog and receiver tasks anyway, and things proceed as above.
If _receiver's receive_all call raises KeyboardInterrupt... same thing happens.
Minor subtlety: _connect can also receive the KeyboardInterrupt, which does the same thing: cancels all the children, and then waits for them to stop before allowing the KeyboardInterrupt to keep propagating.
If you want to reliably catch control-C and then do something with it, then this business of it being raised at some random point is quite a nuisance. The simplest way to do this is to use Trio's support for catching signals to catch the signal.SIGINT signal, which is the thing that Python normally converts into a KeyboardInterrupt. (The "INT" stands for "interrupt".) Something like:
async def _control_c_watcher(self):
# This API is currently a little cumbersome, sorry, see
# https://github.com/python-trio/trio/issues/354
with trio.catch_signals({signal.SIGINT}) as batched_signal_aiter:
async for _ in batched_signal_aiter:
self._end_session.set()
# We exit the loop, restoring the normal behavior of
# control-C. This way hitting control-C once will try to
# do a polite shutdown, but if that gets stuck the user
# can hit control-C again to raise KeyboardInterrupt and
# force things to exit.
break
and then start this running alongside your other tasks.
You also have the problem that in your _watchdog method, it puts the logoff request into the queue – thus scheduling a message to be sent later, by the _sender task – and then immediately cancels all the tasks, so that the _sender task probably won't get a chance to see the message and react to it! In general, I find my code works nicer when I use tasks only when necessary. Instead of having a sender task and then putting messages in a queue when you want to send them, why not have the code that wants to send a message call stream.send_all directly? The one thing you have to watch out for is if you have multiple tasks that might send things simultaneously, you might want to use a trio.Lock() to make sure they don't bump into each other by calling send_all at the same time:
async def send_all(self, data):
async with self.send_lock:
await self.send_stream.send_all(data)
async def do_logoff(self):
# First send the message
await self.send_all(b"Logoff\r\n\r\n")
# And then, *after* the message has been sent, cancel the tasks
self.nursery.cancel()
If you do it this way, you might be able to get rid of the watchdog task and the _end_session event entirely.
A few other notes about your code while I'm here:
Calling trio.run multiple times like this is unusual. The normal style is to call it once at the top of your program, and put all your real code inside it. Once you exit trio.run, all of trio's state is lost, you're definitely not running any concurrent tasks (so there's no way anything could possibly be listening and notice your call to _end_session.set()!). And in general, almost all Trio functions assume that you're already inside a call to trio.run. It turns out that right now you can call trio.Queue() before starting trio without getting an exception, but that's basically just a coincidence.
The use of shielding inside _sender looks odd to me. Shielding is generally an advanced feature that you almost never want to use, and I don't think this is an exception.
Hope that helps! And if you want to talk more about style/design issues like this but are worried they might be too vague for stack overflow ("is this program designed well?"), then feel free to drop by the trio chat channel.
[1] Well, actually trio probably picks the main task for various reasons, but that's not guaranteed and in any case it doesn't make a difference here.

python3.5: asyncio, How to wait for "transport.write(data)" to finish or to return an error?

I'm writing a tcp client in python3.5 using asyncio
After reading How to detect write failure in asyncio? that talk about the high-level streaming api, I've tried to implement using the low level protocol api.
class _ClientProtocol(asyncio.Protocol):
def connection_made(self, transport):
self.transport = transport
class Client:
def __init__(self, loop=None):
self.protocol = _ClientProtocol()
if loop is None:
loop = asyncio.get_event_loop()
self.loop = loop
loop.run_until_complete(self._connect())
async def _connect(self):
await self.loop.create_connection(
lambda: self.protocol,
'127.0.0.1',
8080,
)
# based on https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-asyncawait-world/#bug-3-closing-time
self.protocol.transport.set_write_buffer_limits(0)
def write(self, data):
self.protocol.transport.write(data)
def wait_all_data_have_been_written_or_throw():
pass
client = Client()
client.write(b"some bytes")
client.wait_all_data_have_been_written_or_throw()
As per the python documentation, I know write is non-blocking, and I would like the wait_all_data_have_been_written_or_throw to tell me if all data have been written or if something bad happened in the middle (like a connection lost, but I assume there's way more things that can go bad, and that the underlying socket already return exception about it?)
Does the standard library provide a way to do so ?

The question is mainly related to TCP sockets functionality, not asyncio implementation itself.
Let's look on the following code:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((host, port))
s.send(b'data')
Successful send() call means the data was transferred into kernel space buffer for the socket, nothing more.
Data was not sent via wire, not received by peer and, obviously, not processed by received.
Actual sending is performed asynchronously by Operation System Kernel, user code has no control over it.
What's why wait_all_data_have_been_written_or_throw() make not much sense: writing without an error doesn't assume receiving these data by peer but only successful moving from user-space buffer to kernel-space one.

Python 3 websockets - send message before closing connection

I'm new to Stack Overflow (although have been a long-term "stalker"!) so please be gentle with me!
I'm trying to learn Python, in particular Asyncio using websockets.
Having scoured the web for examples/tutorials I've put together the following tiny chat application, and could use some advice before it gets bulkier (more commands etc) and becomes difficult to refactor.
My main question, is why (when sending the DISCONNECT command) does it need the asyncio.sleep(0) in order to send the disconnection verification message BEFORE closing the connection?
Other than that, am I on the right tracks with the structure here?
I feel that there's too much async/await but I can't quite wrap my head around why.
Staring at tutorials and S/O posts for hours on end doesn't seem to be helping at this point so I thought I'd get some expert advice directly!
Here we go, simple WS server that responds to "nick", "msg", "test" & "disconnect" commands. No prefix required, i.e "nick Rachel".
import asyncio
import websockets
import sys
class ChatServer:
def __init__(self):
print("Chat Server Starting..")
self.Clients = set()
if sys.platform == 'win32':
self.loop = asyncio.ProactorEventLoop()
asyncio.set_event_loop(self.loop)
else:
self.loop = asyncio.get_event_loop()
def run(self):
start_server = websockets.serve(self.listen, '0.0.0.0', 8080)
try:
self.loop.run_until_complete(start_server)
print("Chat Server Running!")
self.loop.run_forever()
except:
print("Chat Server Error!")
async def listen(self, websocket, path):
client = Client(websocket=websocket)
sender_task = asyncio.ensure_future(self.handle_outgoing_queue(client))
self.Clients.add(client)
print("+ connection: " + str(len(self.Clients)))
while True:
try:
msg = await websocket.recv()
if msg is None:
break
await self.handle_message(client, msg)
except websockets.exceptions.ConnectionClosed:
break
self.Clients.remove(client)
print("- connection: " + str(len(self.Clients)))
async def handle_outgoing_queue(self, client):
while client.websocket.open:
msg = await client.outbox.get()
await client.websocket.send(msg)
async def handle_message(self, client, data):
strdata = data.split(" ")
_cmd = strdata[0].lower()
try:
# Check to see if the command exists. Otherwise, AttributeError is thrown.
func = getattr(self, "cmd_" + _cmd)
try:
await func(client, param, strdata)
except IndexError:
await client.send("Not enough parameters!")
except AttributeError:
await client.send("Command '%s' does not exist!" % (_cmd))
# SERVER COMMANDS
async def cmd_nick(self, client, param, strdata):
# This command needs a parameter (with at least one character). If not supplied, IndexError is raised
# Is there a cleaner way of doing this? Otherwise it'll need to reside within all functions that require a param
test = param[1][0]
# If we've reached this point there's definitely a parameter supplied
client.Nick = param[1]
await client.send("Your nickname is now %s" % (client.Nick))
async def cmd_msg(self, client, param, strdata):
# This command needs a parameter (with at least one character). If not supplied, IndexError is raised
# Is there a cleaner way of doing this? Otherwise it'll need to reside within all functions that require a param
test = param[1][0]
# If we've reached this point there's definitely a parameter supplied
message = strdata.split(" ",1)[1]
# Before we proceed, do we have a nickname?
if client.Nick == None:
await client.send("You must choose a nickname before sending messages!")
return
for each in self.Clients:
await each.send("%s says: %s" % (client.Nick, message))
async def cmd_test(self, client, param, strdata):
# This command doesn't need a parameter, so simply let the client know they issued this command successfully.
await client.send("Test command reply!")
async def cmd_disconnect(self, client, param, strdata):
# This command doesn't need a parameter, so simply let the client know they issued this command successfully.
await client.send("DISCONNECTING")
await asyncio.sleep(0) # If this isn't here we don't receive the "disconnecting" message - just an exception in "handle_outgoing_queue" ?
await client.websocket.close()
class Client():
def __init__(self, websocket=None):
self.websocket = websocket
self.IPAddress = websocket.remote_address[0]
self.Port = websocket.remote_address[1]
self.Nick = None
self.outbox = asyncio.Queue()
async def send(self, data):
await self.outbox.put(data)
chat = ChatServer()
chat.run()

Your code uses infinite size Queues, which means .put() calls .put_nowait() and returns immediately. (If you do want to keep these queues in your code, consider using 'None' in the queue as a signal to close a connection and move client.websocket.close() to handle_outgoing_queue()).
Another issue: Consider replacing for x in seq: await co(x) with await asyncio.wait([co(x) for x in seq]). Try it with asyncio.sleep(1) to experience a dramatic difference.
I believe a better option will be dropping all outbox Queues and just relay on the built in asyncio queue and ensure_future. The websockets package already includes Queues in its implementation.

I want to point out that the author of websockets indicated in a post on July 17 of 2017 that websockets used to return None when the connection was closed but that was changed at some point. Instead he suggests that you use a try and deal with the exception. The OP's code shows both a check for None AND a try/except. The None check is needlessly verbose and apparently not even accurate since with the current version, websocket.recv() doesn't return anything when the client closes.
Addressing the "main" question, it looks like a race condition of sorts. Remember that asyncio does it's work by going around and touching all the awaited elements in order to nudge them along. If your 'close connection' command is processed at some point ahead of when your queue is cleared, the client will never get that last message in the queue. Adding the async.sleep adds an extra step to the round robin and probably puts your queue emptying task ahead of your 'close connection'.
Addressing the amount of awaits, it's all about how many asynchronous things you need to have happen to accomplish the goal. If you block at any point you'll stop all the other tasks that you want to keep going.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.