ZeroMQ: HWM on PUSH does not work

I am trying to write a server/client script with a server that vents tasks and multiple workers that execute them.
The problem is that my ventilator has so many tasks that it would fill up memory in a heartbeat.
I tried to set the HWM before the socket binds, but with no success. It just keeps sending messages as soon as a worker connects, completely disregarding the HWM that was set. I also have a sink that keeps a record of the tasks that were done.
server.py
import zmq
def ventilate():
    context = zmq.Context()
    # Socket to send messages on
    sender = context.socket(zmq.PUSH)
    sender.setsockopt(zmq.SNDHWM, 30)  # Big messages, so I don't want to keep too many in the queue
    sender.bind("tcp://*:5557")
    # Socket with direct access to the sink: used to synchronize start of batch
    sink = context.socket(zmq.PUSH)
    sink.connect("tcp://localhost:5558")
    print "Sending tasks to workers..."
    # The first message is "0" and signals start of batch
    sink.send('0')
    print "Sent starting signal"
    while True:
        sender.send("Message")
if __name__ == "__main__":
    ventilate()
worker.py
import zmq
from multiprocessing import Process
def work():
    context = zmq.Context()
    # Socket to receive messages on
    receiver = context.socket(zmq.PULL)
    receiver.connect("tcp://localhost:5557")
    # Socket to send messages to
    sender = context.socket(zmq.PUSH)
    sender.connect("tcp://localhost:5558")
    # Process tasks forever
    while True:
        msg = receiver.recv()  # pyzmq sockets have recv(), not recv_msg()
        print "Doing sth with msg %s" % (msg)
        sender.send("Message %s done" % (msg))
if __name__ == "__main__":
    for worker in range(10):
        Process(target=work).start()
sink.py
import zmq
def sink():
    context = zmq.Context()
    # Socket to receive messages on
    receiver = context.socket(zmq.PULL)
    receiver.bind("tcp://*:5558")
    # Wait for start of batch
    s = receiver.recv()
    print "Received start signal"
    while True:
        msg = receiver.recv()  # pyzmq sockets have recv(), not recv_msg()
        print msg
if __name__ == "__main__":
    sink()

OK, I had a play around. I don't think the issue is with the PUSH HWM, but rather that you can't set a HWM for PULL. If you look at this documentation, it says N/A for the action on HWM.
The PULL sockets seem to be taking hundreds of messages each (and I did try setting a HWM on the PULL socket just in case it did anything. It didn't.). I verified this by changing the ventilator to send messages with an incrementing integer, and changing each worker in the pool to wait 2 seconds between calls to recv(). The workers print out that they are processing messages with vastly different integers. For instance, one worker will be working on message 10, while the next is working on message 400. As time goes on, you see that the worker that was processing message 10 is now processing messages 11, 12, 13, etc., while the other is processing 401, 402, etc.
This indicates to me that the ZMQ_PULL socket is buffering the messages somewhere. So while the ZMQ_PUSH socket does have a HWM, the PULL socket is requesting messages quickly, even though they are not actually being consumed by a call to recv(). The result is that the PUSH HWM is effectively ignored if a PULL socket is connected. As far as I can see, you can't control the length of the PULL socket's buffer (I would expect the RCVHWM socket option to control this, but it doesn't appear to).
This behaviour of course raises the question of what the point of the ZMQ_PUSH HWM option is, since it only makes sense to have if you can also control the receiving socket's HWM.
At this point, I'd start asking the 0MQ people whether you are missing something obvious, or if this is considered a bug.
Sorry I couldn't be more help!

ZeroMQ has buffers on both sending and receiving ends of a socket, hence you need to set high water marks on both the PUSH and the PULL socket in your code (and indeed before a bind() or connect()).
In the Python bindings this is now conveniently done via socket.hwm = 1 which will set both ZMQ_SNDHWM and ZMQ_RCVHWM in one go.
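A minimal sketch of that advice, assuming pyzmq built against libzmq 3 or later. The helper name make_bounded_pair, the inproc://hwm-demo endpoint, and the limit of 1 are illustrative assumptions, not part of the original scripts. It sets the mark on both ends before bind() and connect():

```python
import zmq


def make_bounded_pair(endpoint="inproc://hwm-demo", limit=1):
    """Create a PUSH/PULL pair with the same HWM on both ends (hypothetical helper)."""
    ctx = zmq.Context.instance()

    push = ctx.socket(zmq.PUSH)
    push.hwm = limit          # sets ZMQ_SNDHWM and ZMQ_RCVHWM; must precede bind()
    push.bind(endpoint)

    pull = ctx.socket(zmq.PULL)
    pull.hwm = limit          # the receiving side buffers too, so cap it as well
    pull.connect(endpoint)
    return push, pull


if __name__ == "__main__":
    push, pull = make_bounded_pair()
    print(push.getsockopt(zmq.SNDHWM), pull.getsockopt(zmq.RCVHWM))
    push.close(linger=0)
    pull.close(linger=0)
```

Over TCP, some additional messages may still sit in operating-system socket buffers, so the cap should be treated as approximate rather than exact.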

Related

How to stop a websocket client without stopping reactor

I have an app similar to a chat room, written in Python, that is intended to do the following:
1. Prompt the user to input a websocket server address.
2. Create a websocket client that connects to the server and sends/receives messages. Disable the ability to create another websocket client.
3. After receiving "close" from the server (NOT a close frame), the client should drop the connection and re-enable the app to create a client. Go back to 1.
4. If the user exits the app, it stops the websocket client if one is running.
My approach is to use a main thread to deal with user input. When the user hits enter, a thread is created for the WebSocketClient using Autobahn's Twisted module, and a Queue is passed to it. The thread checks whether the reactor is running and starts it if it's not.
I override the onMessage method to put a closing flag into the Queue when getting "close". The main thread busy-checks the Queue until it receives the flag, then goes back to the start. The code looks like the following.
Main thread.
def main_thread():
    while True:
        text = raw_input("Input server url or exit")
        if text == "exit":
            if myreactor:
                myreactor.stop()
            break
        msgq = Queue.Queue()
        threading.Thread(target=wsthread, args=(text, msgq)).start()
        is_close = False
        while True:
            if msgq.empty() is False:
                msg = msgq.get()
                if msg == "close":
                    is_close = True
                else:
                    print msg
            if is_close:
                break
        print 'Websocket client closed!'
Factory and Protocol.
class MyProtocol(WebSocketClientProtocol):
    def onMessage(self, payload, isBinary):
        msg = payload.decode('utf-8')
        # The factory stores the queue as `queue` (see __init__ below)
        self.factory.queue.put(msg)
        if msg == 'close':
            self.dropConnection(abort=True)
class WebSocketClientFactoryWithQ(WebSocketClientFactory):
    def __init__(self, *args, **kwargs):
        self.queue = kwargs.pop('queue', None)
        WebSocketClientFactory.__init__(self, *args, **kwargs)
Client thread.
def wsthread(url, q):
    global myreactor  # assigned below, so it must be declared global
    factory = WebSocketClientFactoryWithQ(url=url, queue=q)
    factory.protocol = MyProtocol
    connectWS(factory)
    if myreactor is None:
        myreactor = reactor
        reactor.run()
    print 'Done'
Now I have a problem. It seems that my client thread never stops. Even after I receive "close", it still seems to be running, and every time I try to create a new client, it creates a new thread. I understand the first thread won't stop since reactor.run() runs forever, but from the second thread on, it should be non-blocking since I'm not starting the reactor again. How can I change that?
EDIT:
I ended up solving it by:
Adding stopFactory() after disconnect.
Calling protocol functions via reactor.callFromThread().
Starting the reactor in the first thread, putting clients in other threads, and using reactor.callInThread() to create them.

Your main_thread creates new threads running wsthread. wsthread uses Twisted APIs. The first wsthread becomes the reactor thread. All subsequent threads are different and it is undefined what happens if you use a Twisted API from them.
You should almost certainly remove the use of threads from your application. For dealing with console input in a Twisted-based application, take a look at twisted.conch.stdio (not the best documented part of Twisted, alas, but just what you want).

Can't make a Python socket blocking

I'm trying to write a fairly simple client-server Python application using socket and SocketServer. To allow for two-way communication between client and server, the client maintains one connected socket with the server so it can listen for messages in a separate thread, while the main thread creates one-time-use sockets to send messages to the server. I want my "listening" socket to be blocking, as it is running in a separate thread whose only purpose is to wait for data without blocking the main program. Here is the function where I create this socket:
def connect(self, alias, serverIP):
    if not alias or not isinstance(alias, str):
        print "ERROR: Must specify an alias"
        return
    self.serverIP = serverIP
    self.downConnection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    self.downConnection.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    self.downConnection.setblocking(1)
    self.downConnection.connect((self.serverIP, 11100))
    self.downConnection.send("SENDSERVER CONNECT %s" % alias)
Here is the loop where the persistent socket listens for messages from the server (with some debugging code thrown in):
i = 0
while True:
    print "LOOP", i,
    if self.closed:
        break
    try:
        data = self.downConnection.recv(1024)
    except socket.timeout, e:
        print "Timeout"
        pass
    else:
        print "Received %d" % len(data)
        if data:
            self.received(data)
    i += 1
I would expect to see "Received ##" messages only when the server sends data, and maybe periodic "Timeout" messages otherwise. Instead, the output grows very rapidly and looks entirely like this:
LOOP 33858 Received 0
LOOP 33859 Received 0
LOOP 33860 Received 0
LOOP 33861 Received 0
LOOP 33862 Received 0
LOOP 33863 Received 0
LOOP 33864 Received 0
LOOP 33865 Received 0
So it seems that self.downConnection.recv() is immediately returning an empty string each time it is called, rather than blocking until it receives substantive data like it's supposed to. This is puzzling, as I'm explicitly setting the socket to be blocking (which I think is also the default setting). Constantly executing this loop instead of the thread spending most of its time waiting for data is wasting a good deal of CPU time. What am I doing wrong in setting up the blocking socket?
Here is the full server code. The Comms class is also the superclass of the client class, to allow for some basic common functionality.
Something does seem to be wrong with the connection from the server's end. The server can receive data from the client, but trying to send data to the client gives a socket.error: [Errno 9] Bad file descriptor exception.
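One thing worth checking here, offered as a hedged observation rather than a confirmed diagnosis of this particular server: on a blocking stream socket, recv() returning an empty string means the peer has closed its end of the connection, not that no data was available yet, so a loop that keeps calling recv() afterwards will spin. A minimal sketch using only the standard library (the helper name read_until_closed is hypothetical, and socket.socketpair() stands in for the real client/server connection):

```python
import socket


def read_until_closed(conn, bufsize=1024):
    """Collect data from a blocking socket until the peer closes it."""
    chunks = []
    while True:
        data = conn.recv(bufsize)   # blocks until data arrives or the peer closes
        if not data:                # empty result: the connection is closed, stop looping
            break
        chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    server_side, client_side = socket.socketpair()
    server_side.sendall(b"hello")
    server_side.close()             # after this, recv() on the other end returns b""
    print(read_until_closed(client_side))
    client_side.close()
```

If the listening loop in the question breaks out when recv() returns an empty string, the thread stops burning CPU once the server has dropped the connection.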

How do I add timeout, in order to move to next client if current client has not sent any data?

How do I add a timeout in Python, in order to move on to the next client if the current client has not sent any data? I have all my connected clients stored in the conn_clients list.
Here's my code for receive function:
def receive(connection):
    curr_con = connection
    while True:
        message = connection.recv(BUFFER_SIZE)
        if not message:
            print "Closing connection"
            conn_clients.remove(connection)  # removing socket from list
            return
        send_all(curr_con, message)  # sending message to all clients

You did not mention the operating system, and I'm not sure if this is applicable for Windows.
GLib provides timeout functions that execute a callback after a given interval. I found this article, which shows how to connect g_io_add_watch to socket events (much cleaner than waiting for events) and simultaneously start a timer (from the same library). If the socket shows no activity within the set time, the callback can abort the socket process.
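As a standard-library alternative to GLib (a different technique from the one described above), select.select() accepts a timeout and reports only the sockets that have data waiting, so a silent client is simply skipped. The sketch below is a minimal illustration; the helper name readable_clients and the 2-second default are assumptions, and socket.socketpair() stands in for real client connections:

```python
import select
import socket


def readable_clients(clients, timeout=2.0):
    """Return the sockets in `clients` that have data ready within `timeout` seconds."""
    if not clients:                 # select() rejects empty lists on some platforms
        return []
    readable, _, _ = select.select(clients, [], [], timeout)
    return readable


if __name__ == "__main__":
    talker, talker_conn = socket.socketpair()
    silent, silent_conn = socket.socketpair()
    talker.sendall(b"hi")
    for conn in readable_clients([talker_conn, silent_conn], timeout=0.5):
        print(conn.recv(1024))      # only the client that actually sent data is read
    for s in (talker, talker_conn, silent, silent_conn):
        s.close()
```

A receive loop built this way calls recv() only on sockets returned by the helper, so no single idle client can block the others.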

How to avoid high cpu usage?

I created a zmq_forwarder.py that runs separately and passes messages from the app to a SockJS connection, and I'm currently working on how a Flask app could receive a message from SockJS via ZMQ. I'm pasting the contents of my zmq_forwarder.py. I'm new to ZMQ and I don't know why, every time I run it, it uses 100% CPU.
import zmq
# Prepare our context and sockets
context = zmq.Context()
receiver_from_server = context.socket(zmq.PULL)
receiver_from_server.bind("tcp://*:5561")
forwarder_to_server = context.socket(zmq.PUSH)
forwarder_to_server.bind("tcp://*:5562")
receiver_from_websocket = context.socket(zmq.PULL)
receiver_from_websocket.bind("tcp://*:5563")
forwarder_to_websocket = context.socket(zmq.PUSH)
forwarder_to_websocket.bind("tcp://*:5564")
# Process messages from both sockets
# We prioritize traffic from the server
while True:
    # forward messages from the server
    while True:
        try:
            message = receiver_from_server.recv(zmq.DONTWAIT)
        except zmq.Again:
            break
        print "Received from server: ", message
        forwarder_to_websocket.send_string(message)
    # forward messages from the websocket
    while True:
        try:
            message = receiver_from_websocket.recv(zmq.DONTWAIT)
        except zmq.Again:
            break
        print "Received from websocket: ", message
        forwarder_to_server.send_string(message)
As you can see, I've set up 4 sockets. The app connects to port 5561 to push data to ZMQ, and to port 5562 to receive from ZMQ (although I'm still figuring out how to actually set it up to listen for messages sent by ZMQ). On the other side, SockJS receives data from ZMQ on port 5564 and sends data to it on port 5563.
I've read that zmq.DONTWAIT makes receiving messages asynchronous and non-blocking, so I added it.
Is there a way to improve the code so that I don't overload the CPU? The goal is to be able to pass messages between the Flask app and the websocket using ZMQ.

You are polling your two receiver sockets in a tight loop, without any blocking (zmq.DONTWAIT), which will inevitably max out the CPU.
Note that there is some support in ZMQ for polling multiple sockets in a single thread - see this answer. I think you can adjust the timeout in poller.poll(millis) so that your code only uses lots of CPU if there are lots of incoming messages, and idles otherwise.
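A minimal sketch of the polling approach, assuming pyzmq. The helper name forward_ready and the routes list of (receiver, forwarder) pairs are illustrative, not part of the original forwarder. The call to poller.poll() blocks until a registered socket is readable or the timeout expires, so the loop sleeps instead of spinning:

```python
import zmq


def forward_ready(poller, routes, timeout_ms=1000):
    """Wait up to timeout_ms, then forward one message per readable receiver.

    `routes` is an ordered list of (receiver, forwarder) socket pairs; putting
    the server pair first keeps the original "server traffic first" priority.
    Returns the number of messages forwarded.
    """
    events = dict(poller.poll(timeout_ms))   # blocks here instead of busy-looping
    forwarded = 0
    for receiver, forwarder in routes:
        if events.get(receiver, 0) & zmq.POLLIN:
            forwarder.send(receiver.recv())
            forwarded += 1
    return forwarded
```

In the forwarder from the question, both PULL sockets would be registered once with poller.register(sock, zmq.POLLIN), and the outer while True loop would just call forward_ready(poller, routes) repeatedly.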
Your other option is to use the ZMQ event loop to respond to incoming messages asynchronously, using callbacks. See the PyZMQ documentation on this topic, from which the following "echo" example is adapted:
import zmq
from zmq.eventloop import ioloop
from zmq.eventloop.zmqstream import ZMQStream
ctx = zmq.Context()
# set up the socket, and a stream wrapped around the socket
s = ctx.socket(zmq.REP)
s.bind('tcp://127.0.0.1:12345')  # bind to an IP address; binding to the name "localhost" can fail
stream = ZMQStream(s)
# Define a callback to handle incoming messages
def echo(msg):
    # in this case, just echo the message back again
    stream.send_multipart(msg)
# register the callback
stream.on_recv(echo)
# start the ioloop to start waiting for messages
ioloop.IOLoop.instance().start()

How to take action if ZeroMQ doesn't receive messages?

I've got some sort of distributed control system in which I send a heart beat every second. On the receiving end I need to take action if no message has been received for more than 2 seconds. The problem is that when zeroMQ is waiting for an answer I can't do anything else, like checking how much time has passed since the last message was received.
The code I have now is below. Does anybody know how I could take action if no message has been received for more than 2 seconds? All tips are welcome!
[EDIT] Following a tip from Pieter Hintjens, I added polling to the code, but it still doesn't work. Any other ideas?
from datetime import datetime
import zmq
context = zmq.Context()
# Set up subscriber connection to receive message from broker
subscriber = context.socket(zmq.SUB)
subscriber.connect('tcp://localhost:8888')
subscriber.setsockopt(zmq.SUBSCRIBE, 'beat')
# Initialise poll set
poller = zmq.Poller()
poller.register(subscriber, zmq.POLLIN)
while True:
    socks = dict(poller.poll(2000))
    if subscriber in socks and socks[subscriber] == zmq.POLLIN:
        message = subscriber.recv()
        print(message)
    print('do other stuff')

Use poll on the ZMQ socket instead of a blocking recv. There are lots of examples of this in the ZeroMQ Guide.
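A minimal sketch of how polling can be combined with a deadline, assuming pyzmq and Python 3 (for time.monotonic()). The helper name wait_for_beat and its return convention are hypothetical. It polls only for the time left until the deadline, so the caller regains control as soon as either a heartbeat arrives or the 2 seconds run out:

```python
import time
import zmq


def wait_for_beat(subscriber, poller, last_seen, timeout_s=2.0):
    """Wait for the next heartbeat, but no longer than the current deadline.

    Returns (message, new_last_seen). `message` is None when nothing arrived
    in the time remaining, which is the caller's cue to take action.
    """
    deadline = last_seen + timeout_s
    remaining_ms = max(0, int((deadline - time.monotonic()) * 1000))
    events = dict(poller.poll(remaining_ms))
    if events.get(subscriber, 0) & zmq.POLLIN:
        return subscriber.recv(), time.monotonic()
    return None, last_seen
```

A receiving loop would keep last_seen between calls and run its "no heartbeat" handling whenever the returned message is None, resetting last_seen afterwards if it wants to keep waiting.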
