ZeroMQ round robin and workers subscription

I have some clients connecting to a frontend broker and some workers doing jobs.
The ZeroMQ pattern I use:
How can I have round-robin distribution across my workers AND worker selection based on the event name?
I tried the PUB/SUB pattern for subscription filtering, but I don't want my broker to send the same message to several workers.
Here is some code (python3, zmq):
client.py
import random
import time
import zmq
context = zmq.Context()
socket = context.socket(zmq.DEALER)
socket.identity = b'frontend'
socket.connect('tcp://127.0.0.1:4444')
while True:
    # GetIndex is listed three times, so it is emitted more often than CreateUser
    event = random.choice([b'CreateUser', b'GetIndex', b'GetIndex', b'GetIndex'])
    socket.send(event)
    print('Emit %s event' % event)
    time.sleep(1)
broker.py
import zmq
context = zmq.Context()
frontend = context.socket(zmq.ROUTER)
frontend.identity = b'broker'
frontend.bind("tcp://127.0.0.1:4444")
backend = context.socket(zmq.DEALER)
backend.identity = b'broker'
backend.bind("tcp://127.0.0.1:5555")
poller = zmq.Poller()
poller.register(frontend, zmq.POLLIN)
poller.register(backend, zmq.POLLIN)
id = 0
while True:
    id += 1
    sockets = dict(poller.poll())
    if frontend in sockets:
        # ROUTER prepends the sender's identity, so the first frame is the client id
        client_id, event = frontend.recv_multipart()
        print('Event %s from %s' % (event.decode('utf-8'), client_id.decode('utf-8')))
        # DEALER round-robins this message across all connected workers
        backend.send_multipart([event, str(id).encode('utf-8')])
create_user_worker.py
import zmq
context = zmq.Context()
worker = context.socket(zmq.DEALER)
worker.identity = b'create-user-worker'
worker.connect("tcp://127.0.0.1:5555")
while True:
    message, id = worker.recv_multipart()
    # messages for other events are received but silently dropped here
    if message == b'CreateUser':
        print(message, id)
get_index_worker.py
import zmq
context = zmq.Context()
worker = context.socket(zmq.DEALER)
worker.identity = b'get-index-worker'
worker.connect("tcp://127.0.0.1:5555")
while True:
    message, id = worker.recv_multipart()
    # messages for other events are received but silently dropped here
    if message == b'GetIndex':
        print(message, id)
The output of the code above:
get_index_worker.py
b'GetIndex' b'1'
b'GetIndex' b'2'
b'GetIndex' b'4'
b'GetIndex' b'6'
create_user_worker.py
b'CreateUser' b'3'
The task for the event with id 5 is lost.
github repo: https://github.com/guillaumevincent/tornado-zeromq

Status quo: as-is
A ROUTER/DEALER device is agnostic to any logic other than what its internal design dictates: it listens on the client side, dispatches every incoming message round-robin towards the worker side, and keeps internal records so that it can return answers from workers to the respective client. Nothing more.
How to get more?
Try to imagine another possible approach.
Each client can have several sockets and may .connect() to several devices.
Each device will receive just one "specialised" type of message and will handle it with the standard round-robin, primitive load-balancing, merry-go-round behaviour.
This way both of your design objectives are met, (I) distributing messages across a pool of load-balanced handlers while (II) keeping event-specific routing, still using only the most primitive ZeroMQ entities.
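The per-event round robin described above can be pictured with a minimal sketch in plain Python, with no sockets involved. The class name `EventRouter`, its methods, and the worker identities are hypothetical illustrations, not part of ZeroMQ. In a real setup each event type would map to its own device, or the chosen identity would be used as the first frame of a message sent on a ROUTER backend.

```python
from collections import defaultdict, deque


class EventRouter:
    """Round-robin dispatch within the pool of workers registered per event."""

    def __init__(self):
        # event name -> queue of worker identities, rotated on every dispatch
        self._pools = defaultdict(deque)

    def register(self, event, worker_id):
        """Add a worker to the pool that handles the given event."""
        self._pools[event].append(worker_id)

    def next_worker(self, event):
        """Return the next worker for this event, or None if nobody handles it."""
        pool = self._pools.get(event)
        if not pool:
            return None
        worker_id = pool[0]
        pool.rotate(-1)  # move the chosen worker to the back of its pool
        return worker_id


router = EventRouter()
router.register(b"GetIndex", b"get-index-worker-1")
router.register(b"GetIndex", b"get-index-worker-2")
router.register(b"CreateUser", b"create-user-worker-1")

events = [b"GetIndex", b"CreateUser", b"GetIndex", b"GetIndex"]
assignments = [router.next_worker(event) for event in events]
```

With this arrangement no event is handed to a worker that cannot process it, which is what caused the lost task in the question.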

Related

How can I impose server priority on a UDP client receiving from multiple servers on the same port

I have a client application that is set to receive data from a given UDP port, and two servers (let's call them "primary" and "secondary") that are broadcasting data over that port.
I've set up a UDP receiver thread that uses a lossy queue to update my frontend. Lossy is okay here because the data are just status info strings, e.g. 'on'/'off', that I'm picking up periodically.
My desired behavior is as follows:
If the primary server is active and broadcasting, the client will accept data from the primary server only (regardless of data coming in from the secondary server)
If the primary server stops broadcasting, the client will accept data from the secondary server
If the primary server resumes broadcasting, don't cede to the primary unless the secondary server goes down (to prevent bouncing back and forth in the event that the primary server is going in and out of failure)
If neither server is broadcasting, raise a flag
Currently the problem is that if both servers are broadcasting (which they will be most of the time), my client happily receives data from both and bounces back and forth between the two. I understand why this is happening, but I'm unsure how to stop it / work around it.
How can I structure my client to disregard data coming in from the secondary server as long as it's also getting data from the primary server?
NB - I'm using threads and queues here to keep my UDP operations from blocking my GUI
# EXAMPLE CLIENT APP
import queue
import socket as skt
import tkinter as tk
from threading import Event, Thread
from tkinter import ttk
class App(tk.Tk):
    def __init__(self):
        super().__init__()
        self.title('UDP Client Test')
        # set up window close handler
        self.protocol('WM_DELETE_WINDOW', self.on_close)
        # display the received value
        self.data_label_var = tk.StringVar(self, 'No Data')
        self.data_label = ttk.Label(self, textvariable=self.data_label_var)
        self.data_label.pack()
        # server IP addresses (example)
        self.primary = '10.0.42.1'
        self.secondary = '10.0.42.2'
        self.port = 5555
        self.timeout = 2.0
        self.client_socket = self.get_client_socket(self.port, self.timeout)
        self.dq_loop = None  # placeholder for dequeue loop 'after' ID
        self.receiver_queue = queue.Queue(maxsize=1)
        self.stop_event = Event()
        self.receiver_thread = Thread(
            name='status_receiver',
            target=self.receiver_worker,
            args=(
                self.client_socket,
                (self.primary, self.secondary),
                self.receiver_queue,
                self.stop_event,
            ),
        )
        self.receiver_thread.start()  # start receiving in the background
        self.receiver_dequeue()  # start the periodic UI update loop
    def get_client_socket(self, port: int, timeout: float) -> skt.socket:
        """Set up a UDP socket bound to the given port"""
        client_socket = skt.socket(skt.AF_INET, skt.SOCK_DGRAM)
        client_socket.settimeout(timeout)
        client_socket.bind(('', port))  # accept traffic on this port from any IP address
        return client_socket
    @staticmethod
    def receiver_worker(
        sock: skt.socket,
        addresses: tuple[str, str],
        out_queue: queue.Queue,
        stop_event: Event,
    ) -> None:
        """Thread worker that receives data over UDP and puts it in a lossy queue"""
        primary, secondary = addresses  # server IPs
        while not stop_event.is_set():  # loop until application exit...
            try:
                data, server = sock.recvfrom(1024)
                # here's where I'm having trouble - if traffic is coming in from both servers,
                # there's a good chance my frontend will just pick up data from both alternately
                # (and yes, I know these conditions do the same thing...for now)
                if server[0] == primary:
                    out_queue.put_nowait((data, server))
                elif server[0] == secondary:
                    out_queue.put_nowait((data, server))
                else:  # inbound traffic on the correct port, but from some other server
                    print('disregard...')
            except queue.Full:
                print('Queue full...')  # not a problem, just here in case...
            except skt.timeout:
                print('Timeout...')  # TODO
    def receiver_dequeue(self) -> None:
        """Periodically fetch data from the worker queue and update the UI"""
        try:
            data, server = self.receiver_queue.get_nowait()
        except queue.Empty:
            pass  # nothing to do
        else:  # update the label
            self.data_label_var.set(data.decode())
        finally:  # continue updating 10x / second
            self.dq_loop = self.after(100, self.receiver_dequeue)
    def on_close(self) -> None:
        """Perform cleanup tasks on application exit"""
        if self.dq_loop:
            self.after_cancel(self.dq_loop)
        self.stop_event.set()  # stop the receiver thread loop
        self.receiver_thread.join()
        self.client_socket.close()
        self.quit()
if __name__ == '__main__':
    app = App()
    app.mainloop()
My actual application is only slightly more complex than this, but the basic operation is the same: get data from UDP, use data to update UI...rinse and repeat
I suspect the changes need to be made to my receiver_worker method, but I'm not sure where to go from here. Any help is very much welcome and appreciated! And thanks for taking the time to read this long question!
Addendum: FWIW I did some reading about Selectors but I'm not sure how to go about implementing them in my case - if anybody can point me to a relevant example, that would be amazing

The core of the problem is: how do you determine that a given server is really offline as opposed to just temporarily taking a break, e.g. due to a momentary network glitch?
All your client really knows is whether it has received any UDP packets from a given source IP address recently or not, for some well-chosen definition of "recently". So what you can do in your client is update a per-IP-address member-variable to the current timestamp, whenever you receive a UDP packet from a given server. Then you can have a helper method like this (pseudocode):
def HowManyMillisecondsSinceTheLastUDPPacketWasReceivedFromServer(self, packetSourceIP):
    return current_timestamp_milliseconds() - self._lastPacketReceiveTimeStamp[packetSourceIP]
Then e.g. if you know that your servers will be sending out a UDP packet once per second, you can decree that a given server is officially considered "offline" if you haven't received any UDP packets from it within the last 5 seconds. (Choose your own numbers here to suit, of course)
Then after you receive a packet and update the corresponding server-timestamp-member-variable, you can also update a member-variable indicating which server is now the "active server" (i.e. the server you should currently be listening to):
def UpdateActiveServer(self):
    millisSincePrimary = self.HowManyMillisecondsSinceTheLastUDPPacketWasReceivedFromServer(self._primaryServerIP)
    millisSinceSecondary = self.HowManyMillisecondsSinceTheLastUDPPacketWasReceivedFromServer(self._secondaryServerIP)
    serverOfflineMillis = 5 * 1000  # 5 seconds
    primaryIsOffline = (millisSincePrimary >= serverOfflineMillis)
    secondaryIsOffline = (millisSinceSecondary >= serverOfflineMillis)
    # switch only when exactly one server is offline; otherwise keep the current choice
    if primaryIsOffline and not secondaryIsOffline:
        self._usePacketsFromSecondaryServer = True
    if secondaryIsOffline and not primaryIsOffline:
        self._usePacketsFromSecondaryServer = False
.... then the rest of your code can check the current value of self._usePacketsFromSecondaryServer to decide which incoming UDP packets to listen to and which ones to ignore (pseudocode):
def PacketReceived(self, whichServer):
    fromActivePrimary = (whichServer == self._primaryServerIP) and (not self._usePacketsFromSecondaryServer)
    fromActiveSecondary = (whichServer == self._secondaryServerIP) and self._usePacketsFromSecondaryServer
    if fromActivePrimary or fromActiveSecondary:
        pass  # code to parse and use UDP packet goes here
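The timestamp approach described above can be condensed into one small class. This is a minimal sketch under stated assumptions: the name `ServerSelector`, its methods, and the injectable `clock` parameter are hypothetical, and both servers are given a grace period at start-up so that a secondary packet arriving first does not displace the primary.

```python
import time


class ServerSelector:
    """Track which of two UDP servers the client should currently listen to."""

    def __init__(self, primary, secondary, offline_after=5.0, clock=time.monotonic):
        self.primary = primary
        self.secondary = secondary
        self.offline_after = offline_after  # seconds of silence before "offline"
        self._clock = clock
        now = clock()
        # Assumption: treat both servers as just seen, giving a start-up grace period.
        self._last_seen = {primary: now, secondary: now}
        self.active = primary  # prefer the primary until it goes quiet

    def _is_offline(self, server):
        return self._clock() - self._last_seen[server] >= self.offline_after

    def update(self):
        """Switch servers only when the active one is offline and the other is not."""
        other = self.secondary if self.active == self.primary else self.primary
        if self._is_offline(self.active) and not self._is_offline(other):
            self.active = other

    def accept(self, server):
        """Record a packet from `server`; return True if its data should be used."""
        if server not in (self.primary, self.secondary):
            return False
        self._last_seen[server] = self._clock()
        self.update()
        return server == self.active

    def all_offline(self):
        """True when neither server has been heard from recently."""
        return self._is_offline(self.primary) and self._is_offline(self.secondary)
```

In the question's receiver loop, the two address checks could then collapse into a single `if selector.accept(server[0]):` before the `put_nowait` call, and the timeout handler could check `selector.all_offline()` to raise the "no server" flag.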

ZeroMQ: load balance many workers and one master

Suppose I have one master process that divides up data to be processed in parallel. Let's say there are 1000 chunks of data and 100 nodes on which to run the computations.
Is there some way to do REQ/REP to keep all the workers busy? I've tried to use the load balancer pattern in the guide, but with a single client, sock.recv() is going to block until it receives its response from the worker.
Here is the code, slightly modified from the zmq guide for a load balancer. It starts up one client, 10 workers, and a load balancer/broker in the middle. How can I get all those workers working at the same time?
from __future__ import print_function
from multiprocessing import Process
import zmq
import time
import uuid
import random
def client_task():
    """Basic request-reply client using REQ socket."""
    socket = zmq.Context().socket(zmq.REQ)
    socket.identity = str(uuid.uuid4()).encode()  # identities must be bytes
    socket.connect("ipc://frontend.ipc")
    # Send request, get reply
    for i in range(100):
        print("SENDING: ", i)
        socket.send(b'WORK')
        msg = socket.recv()  # blocks until a worker has answered this request
        print(msg)
def worker_task():
    """Worker task, using a REQ socket to do load-balancing."""
    socket = zmq.Context().socket(zmq.REQ)
    socket.identity = str(uuid.uuid4()).encode()
    socket.connect("ipc://backend.ipc")
    # Tell broker we're ready for work
    socket.send(b"READY")
    while True:
        address, empty, request = socket.recv_multipart()
        time.sleep(random.randint(1, 4))
        socket.send_multipart([address, b"", b"OK : " + socket.identity])
def broker():
    context = zmq.Context()
    frontend = context.socket(zmq.ROUTER)
    frontend.bind("ipc://frontend.ipc")
    backend = context.socket(zmq.ROUTER)
    backend.bind("ipc://backend.ipc")
    # Initialize main loop state
    workers = []
    poller = zmq.Poller()
    # Only poll for requests from backend until workers are available
    poller.register(backend, zmq.POLLIN)
    while True:
        sockets = dict(poller.poll())
        if backend in sockets:
            # Handle worker activity on the backend
            request = backend.recv_multipart()
            worker, empty, client = request[:3]
            if not workers:
                # Poll for clients now that a worker is available
                poller.register(frontend, zmq.POLLIN)
            workers.append(worker)
            if client != b"READY" and len(request) > 3:
                # If client reply, send rest back to frontend
                empty, reply = request[3:]
                frontend.send_multipart([client, b"", reply])
        if frontend in sockets:
            # Get next client request, route to the least recently used worker
            client, empty, request = frontend.recv_multipart()
            worker = workers.pop(0)
            backend.send_multipart([worker, b"", client, b"", request])
            if not workers:
                # Don't poll clients if no workers are available
                poller.unregister(frontend)
    # Clean up (never reached, the loop above runs forever)
    backend.close()
    frontend.close()
    context.term()
def main():
    NUM_CLIENTS = 1
    NUM_WORKERS = 10
    # Start background tasks
    def start(task, *args):
        process = Process(target=task, args=args)
        process.start()
    start(broker)
    for i in range(NUM_CLIENTS):
        start(client_task)
    for i in range(NUM_WORKERS):
        start(worker_task)
    # Process(target=broker).start()
if __name__ == "__main__":
    main()

I guess there are different ways to do this:
- You can, for example, use the threading module to launch all your requests from your single client, with something like:
import threading
import zmq
result_list = []  # Add the result to a list for the example
rlock = threading.RLock()
def client_thread(client_url, request, i):
    # one REQ socket per thread, all sharing the process-wide context
    context = zmq.Context.instance()
    socket = context.socket(zmq.REQ)
    socket.setsockopt_string(zmq.IDENTITY, '{}'.format(i))
    socket.connect(client_url)
    socket.send(request.encode())
    reply = socket.recv()
    with rlock:
        result_list.append((i, reply))
    return
def client_task():
    # tasks = list with all your tasks
    url_client = "ipc://frontend.ipc"
    threads = []
    for i in range(len(tasks)):
        thread = threading.Thread(target=client_thread,
                                  args=(url_client, tasks[i], i,))
        thread.start()
        threads.append(thread)
- You can take advantage of an evented library like asyncio (there is a submodule zmq.asyncio and another library, aiozmq; the latter offers a higher level of abstraction). In this case you still send your requests to the workers sequentially, but without blocking on each response (and so without keeping the main loop busy), and you collect the results as they come back to the main loop. This could look like this:
import asyncio
import zmq
import zmq.asyncio
async def client_async(request, context, i, client_url):
    """Basic client sending a request (REQ) to a ROUTER (the broker)"""
    socket = context.socket(zmq.REQ)
    socket.setsockopt_string(zmq.IDENTITY, '{}'.format(i))
    socket.connect(client_url)
    await socket.send(request.encode())
    reply = await socket.recv()
    socket.close()
    return reply
async def run(loop):
    # tasks = list full of tasks
    url_client = "ipc://frontend.ipc"
    asyncio_tasks = []
    ctx = zmq.asyncio.Context()
    for i in range(len(tasks)):
        task = asyncio.ensure_future(client_async(tasks[i], ctx, i, url_client))
        asyncio_tasks.append(task)
    # wait for all replies; results come back in the order the tasks were created
    responses = await asyncio.gather(*asyncio_tasks)
    return responses
zmq.asyncio.install()  # needed on older pyzmq releases; recent ones no longer require it
loop = asyncio.get_event_loop()
results = loop.run_until_complete(run(loop))
I haven't tested these two snippets, but both come (with modifications to fit the question) from code I have that uses zmq in a configuration similar to yours.

Using process instead of thread with zeromq

I'm reading this code http://zguide.zeromq.org/py:mtserver
But when I tried to replace threading.Thread with multiprocessing.Process, I got the error
Assertion failed: ok (mailbox.cpp:84)
Code is
import time
import multiprocessing
import zmq
def worker_routine(worker_url, context=None):
    """Worker routine"""
    context = context or zmq.Context.instance()
    # Socket to talk to dispatcher
    socket = context.socket(zmq.REP)
    socket.connect(worker_url)
    while True:
        string = socket.recv()
        print("Received request: [ %s ]" % (string))
        # do some 'work'
        time.sleep(1)
        # send reply back to client
        socket.send(b"World")
def main():
    """Server routine"""
    url_worker = "inproc://workers"
    url_client = "tcp://*:5555"
    # Prepare our context and sockets
    context = zmq.Context.instance()
    # Socket to talk to clients
    clients = context.socket(zmq.ROUTER)
    clients.bind(url_client)
    # Socket to talk to workers
    workers = context.socket(zmq.DEALER)
    workers.bind(url_worker)
    # Launch pool of worker processes (threads in the original guide example)
    for i in range(5):
        process = multiprocessing.Process(target=worker_routine, args=(url_worker,))
        process.start()
    zmq.device(zmq.QUEUE, clients, workers)
    # We never get here but clean up anyhow
    clients.close()
    workers.close()
    context.term()
if __name__ == "__main__":
    main()

The limitations of each transport are detailed in the API.
inproc is for intra-process communication (i.e. between threads). You should try ipc, which supports inter-process communication, or even just tcp.

ZMQ pair (for signaling) is blocking because of bad connection

I have two threads. One is a Worker Thread, the other a Communication Thread.
The Worker Thread is reading data off a serial port, doing some processing, and then enqueueing the results to be sent to a server.
The Communication Thread is reading the results off the queue and sending them. The challenge is that connectivity is wireless, and although usually present, it can be spotty (dropping in and out of range for a few minutes), and I don't want to block the Worker Thread if I lose connectivity.
The pattern I have chosen for this is as follows:
Worker Thread has an enqueue method which adds the message to a Queue, then sends a signal to inproc://signal using a zmq.PAIR.
Communication Thread uses zmq.DEALER to communicate to the server (a zmq.ROUTER), but polls the inproc://signal pair in order to register whether there is a new message needing sending or not.
The following is a simplified example of the pattern:
# Python 2
import Queue
import zmq
import time
import threading
import simplejson
class ZmqPattern():
    def __init__(self):
        self.q_out = Queue.Queue()
        self.q_in = Queue.Queue()
        self.signal = None
        self.API_KEY = 'SOMETHINGCOMPLEX'
        self.zmq_comm_thr = None
    def start_zmq_signal(self):
        self.context = zmq.Context()
        # signal socket for waking the zmq thread to send messages to the relay
        self.signal = self.context.socket(zmq.PAIR)
        self.signal.bind("inproc://signal")
    def enqueue(self, msg):
        print("> pre-enqueue")
        self.q_out.put(msg)
        print("< post-enqueue")
        print(") send sig")
        self.signal.send(b"")
        print("( sig sent")
    def communication_thread(self, q_out):
        poll = zmq.Poller()
        self.endpoint_url = 'tcp://' + '127.0.0.1' + ':' + '9001'
        wake = self.context.socket(zmq.PAIR)
        wake.connect("inproc://signal")
        poll.register(wake, zmq.POLLIN)
        self.socket = self.context.socket(zmq.DEALER)
        self.socket.setsockopt(zmq.IDENTITY, self.API_KEY)
        self.socket.connect(self.endpoint_url)
        poll.register(self.socket, zmq.POLLIN)
        while True:
            sockets = dict(poll.poll())
            if self.socket in sockets:
                message = self.socket.recv()
                message = simplejson.loads(message)
                # Incoming messages which need to be handled on the worker thread
                self.q_in.put(message)
            if wake in sockets:
                wake.recv()
                while not q_out.empty():
                    print(">> Popping off Queue")
                    message = q_out.get()
                    print(">>> Popped off Queue")
                    message = simplejson.dumps(message)
                    print("<<< About to be sent")
                    self.socket.send(message)
                    print("<< Sent")
    def start(self):
        self.start_zmq_signal()
        # ZMQ Thread
        self.zmq_comm_thr = threading.Thread(target=self.communication_thread, args=([self.q_out]))
        self.zmq_comm_thr.daemon = True
        self.zmq_comm_thr.name = "ZMQ Thread"
        self.zmq_comm_thr.start()
if __name__ == '__main__':
    test = ZmqPattern()
    test.start()
    print '###############################################'
    print '############## Starting comms #################'
    print "###############################################"
    last_debug = time.time()
    test_msg = {}
    for c in xrange(1000):
        key = 'something{}'.format(c)
        val = 'important{}'.format(c)
        test_msg[key] = val
    while True:
        test.enqueue(test_msg)
        if time.time() - last_debug > 1:
            last_debug = time.time()
            print "Still alive..."
If you run this, you'll see the dealer blocks as there is no router on the other end, and shortly after, the pair blocks as the Communication Thread isn't receiving.
How should I best set up the inproc zmq so that it does not block the Worker Thread?
FYI, the most the entire system would need to buffer is in the order of 200k messages, and each message is around 256 bytes.

The dealer socket has a limit on the number of messages it will store, called the high water mark. Right below your dealer socket creation, try:
self.socket = self.context.socket(zmq.DEALER)
self.socket.setsockopt(zmq.SNDHWM, 200000)
And set that number as high as you dare; the limit is your machine's memory.
EDIT:
Some good discussion of high water marks in this question:
Majordomo broker: handling large number of connections
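To put "as high as you dare" into numbers, here is a rough back-of-the-envelope calculation using the figures from the question (up to 200k buffered messages of about 256 bytes each). It counts payload bytes only; any per-message overhead inside the library is not quantified here, so the real footprint would be somewhat larger.

```python
# Figures taken from the question: up to 200k buffered messages of about 256 bytes.
MAX_MESSAGES = 200_000
MESSAGE_BYTES = 256

payload_bytes = MAX_MESSAGES * MESSAGE_BYTES
payload_mib = payload_bytes / (1024 * 1024)
print(f"{payload_bytes} bytes, about {payload_mib:.1f} MiB of payload")
# prints "51200000 bytes, about 48.8 MiB of payload"
```

At this scale the suggested high water mark of 200000 looks comfortably within the memory of an ordinary machine.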

Redis pub/sub adding additional channels mid subscription

Is it possible to add additional subscriptions to a Redis connection? I have a listening thread but it appears not to be influenced by new SUBSCRIBE commands.
If this is the expected behavior, what is the pattern that should be used if users add a stock ticker feed to their interests or join chatroom?
I would like to implement a Python class similar to:
import threading
import redis
class RedisPubSub(object):
    def __init__(self):
        self._redis_pub = redis.Redis(host='localhost', port=6379, db=0)
        self._redis_sub = redis.Redis(host='localhost', port=6379, db=0)
        self._sub_thread = threading.Thread(target=self._listen)
        self._sub_thread.setDaemon(True)
        self._sub_thread.start()
    def publish(self, channel, message):
        self._redis_pub.publish(channel, message)
    def subscribe(self, channel):
        self._redis_sub.subscribe(channel)
    def _listen(self):
        for message in self._redis_sub.listen():
            print(message)

The python-redis Redis and ConnectionPool classes inherit from threading.local, and this is producing the "magical" effects you're seeing.
Summary: your main thread and worker threads' self._redis_sub clients end up using two different connections to the server, but only the main thread's connection has issued the SUBSCRIBE command.
Details: Since the main thread is creating the self._redis_sub, that client ends up being placed into main's thread-local storage. Next I presume the main thread does a client.subscribe(channel) call. Now the main thread's client is subscribed on connection 1. Next you start the self._sub_thread worker thread which ends up having its own self._redis_sub attribute set to a new instance of redis.Client which constructs a new connection pool and establishes a new connection to the redis server.
This new connection has not yet been subscribed to your channel, so listen() returns immediately. So with python-redis you cannot pass an established connection with outstanding subscriptions (or any other stateful commands) between threads.
Depending on how you plan to implement your app you may need to switch to using a different client, or come up with some other way to communicate subscription state to the worker threads, e.g. send subscription commands through a queue.
One other issue is that python-redis uses blocking sockets, which prevents your listening thread from doing other work while waiting for messages, and it cannot signal it wishes to unsubscribe unless it does so immediately after receiving a message.
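The idea of sending subscription commands through a queue could look roughly like the sketch below. The listener thread is the only one that touches the pub/sub connection; other threads just put channel names on a queue. `FakePubSub` and `listen` are hypothetical names, and `FakePubSub` is a stand-in so the sketch runs without a Redis server. With redis-py one would presumably pass the object returned by `redis.Redis(...).pubsub()` instead, which offers `subscribe()` and `get_message(timeout=...)`.

```python
import queue
import threading
import time


class FakePubSub:
    """Hypothetical stand-in for a pub/sub connection; records subscriptions only."""

    def __init__(self):
        self.channels = []

    def subscribe(self, channel):
        self.channels.append(channel)

    def get_message(self, timeout=0.0):
        time.sleep(timeout)  # simulate waiting on the socket
        return None  # a real connection would return a message or None


def listen(pubsub, commands, stop_event, on_message):
    """Run in one thread: apply queued subscribe commands, then poll for messages."""
    while not stop_event.is_set():
        # Drain pending subscription requests made by other threads.
        while True:
            try:
                channel = commands.get_nowait()
            except queue.Empty:
                break
            pubsub.subscribe(channel)
        # A short timeout keeps the loop responsive to new commands and shutdown.
        message = pubsub.get_message(timeout=0.05)
        if message is not None:
            on_message(message)


commands = queue.Queue()
stop_event = threading.Event()
pubsub = FakePubSub()
listener = threading.Thread(
    target=listen, args=(pubsub, commands, stop_event, print), daemon=True
)
listener.start()

# Any thread may request a new subscription while the listener is running.
commands.put("chat")
commands.put("stock:ACME")
```

Polling with a timeout, rather than blocking indefinitely, is what lets the listener notice new channels and the stop event; setting `stop_event` and joining the thread shuts it down.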

Async way:
The Twisted framework and the txredisapi plugin
Example code (subscribe):
import txredisapi as redis
from twisted.application import internet
from twisted.application import service
from twisted.internet import reactor
class myProtocol(redis.SubscriberProtocol):
    def connectionMade(self):
        print("waiting for messages...")
        print("use the redis client to send messages:")
        print("$ redis-cli publish chat test")
        print("$ redis-cli publish foo.bar hello world")
        self.subscribe("chat")
        self.psubscribe("foo.*")
        # subscriptions can be changed later on the same connection
        reactor.callLater(10, self.unsubscribe, "chat")
        reactor.callLater(15, self.punsubscribe, "foo.*")
        # self.continueTrying = False
        # self.transport.loseConnection()
    def messageReceived(self, pattern, channel, message):
        print("pattern=%s, channel=%s message=%s" % (pattern, channel, message))
    def connectionLost(self, reason):
        print("lost connection: %s" % reason)
class myFactory(redis.SubscriberFactory):
    # SubscriberFactory is a wrapper for the ReconnectingClientFactory
    maxDelay = 120
    continueTrying = True
    protocol = myProtocol
application = service.Application("subscriber")
srv = internet.TCPClient("127.0.0.1", 6379, myFactory())
srv.setServiceParent(application)
Only one thread, no headache :)
It depends on what kind of app you are coding, of course. For networking, go with Twisted.
