Redis pub/sub adding additional channels mid subscription - python

Is it possible to add additional subscriptions to a Redis connection? I have a listening thread, but it does not appear to be affected by new SUBSCRIBE commands.
If this is the expected behavior, what pattern should be used when users add a stock ticker feed to their interests or join a chatroom?
I would like to implement a Python class similar to:
import threading
import redis

class RedisPubSub(object):
    def __init__(self):
        self._redis_pub = redis.Redis(host='localhost', port=6379, db=0)
        self._redis_sub = redis.Redis(host='localhost', port=6379, db=0)
        self._sub_thread = threading.Thread(target=self._listen)
        self._sub_thread.setDaemon(True)
        self._sub_thread.start()

    def publish(self, channel, message):
        self._redis_pub.publish(channel, message)

    def subscribe(self, channel):
        self._redis_sub.subscribe(channel)

    def _listen(self):
        for message in self._redis_sub.listen():
            print message

The python-redis Redis and ConnectionPool classes inherit from threading.local, and this is producing the "magical" effects you're seeing.
Summary: your main thread's and worker thread's self._redis_sub clients end up using two different connections to the server, but only the main thread's connection has issued the SUBSCRIBE command.
Details: since the main thread creates self._redis_sub, that client ends up in the main thread's thread-local storage. Next, I presume, the main thread does a client.subscribe(channel) call. Now the main thread's client is subscribed on connection 1. Then you start the self._sub_thread worker thread, which ends up with its own self._redis_sub attribute set to a new instance of redis.Client, which constructs a new connection pool and establishes a new connection to the Redis server.
This new connection has not been subscribed to your channel, so listen() returns immediately. So with python-redis you cannot pass an established connection with outstanding subscriptions (or any other stateful commands) between threads.
Depending on how you plan to implement your app you may need to switch to using a different client, or come up with some other way to communicate subscription state to the worker threads, e.g. send subscription commands through a queue.
One other issue is that python-redis uses blocking sockets, which prevents your listening thread from doing other work while waiting for messages, and it cannot signal it wishes to unsubscribe unless it does so immediately after receiving a message.
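For reference, here is a hedged sketch of the "send subscription commands through a queue" pattern using a recent redis-py, which no longer inherits from threading.local and exposes a PubSub object with get_message(timeout=...). The listener thread owns the subscription; other threads only push channel names onto a queue. The '__keepalive__' channel name is an arbitrary placeholder so the first poll has something to listen on.
import queue
import threading
import redis

class RedisPubSub(object):
    def __init__(self):
        self._redis = redis.Redis(host='localhost', port=6379, db=0)
        self._pubsub = self._redis.pubsub()
        self._commands = queue.Queue()
        self._thread = threading.Thread(target=self._listen, daemon=True)
        self._thread.start()

    def publish(self, channel, message):
        self._redis.publish(channel, message)

    def subscribe(self, channel):
        # Never touch the PubSub object from this thread; just queue the request.
        self._commands.put(channel)

    def _listen(self):
        # Subscribe to a placeholder channel so get_message() has a connection to poll.
        self._pubsub.subscribe('__keepalive__')
        while True:
            # Apply any pending subscription requests from other threads.
            try:
                while True:
                    self._pubsub.subscribe(self._commands.get_nowait())
            except queue.Empty:
                pass
            # Poll with a timeout so new subscriptions are picked up regularly.
            message = self._pubsub.get_message(timeout=1.0)
            if message:
                print(message)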

Async way:
Use the Twisted framework and the txredisapi plugin.
Example code (subscriber):
import txredisapi as redis

from twisted.application import internet
from twisted.application import service
from twisted.internet import reactor

class myProtocol(redis.SubscriberProtocol):
    def connectionMade(self):
        print "waiting for messages..."
        print "use the redis client to send messages:"
        print "$ redis-cli publish chat test"
        print "$ redis-cli publish foo.bar hello world"
        self.subscribe("chat")
        self.psubscribe("foo.*")
        reactor.callLater(10, self.unsubscribe, "chat")
        reactor.callLater(15, self.punsubscribe, "foo.*")
        # self.continueTrying = False
        # self.transport.loseConnection()

    def messageReceived(self, pattern, channel, message):
        print "pattern=%s, channel=%s message=%s" % (pattern, channel, message)

    def connectionLost(self, reason):
        print "lost connection:", reason

class myFactory(redis.SubscriberFactory):
    # SubscriberFactory is a wrapper for the ReconnectingClientFactory
    maxDelay = 120
    continueTrying = True
    protocol = myProtocol

application = service.Application("subscriber")
srv = internet.TCPClient("127.0.0.1", 6379, myFactory())
srv.setServiceParent(application)
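Since the example builds a twisted.application service, one way to try it (the filename is an assumption) is to save it as subscriber.tac and launch it with:
twistd -ny subscriber.tac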
Only one thread, no headache :)
It depends on what kind of app you're coding, of course. For networking, go with Twisted.

Related

connection to two RabbitMQ servers

I'm using Python with pika, and have the following two similar use cases:
Connect to RabbitMQ server A and server B (at different IP addresses, with different credentials), listen on exchange A1 on server A; when a message arrives, process it and send it to an exchange on server B.
Open an HTTP listener and connect to RabbitMQ server B; when a specific HTTP request arrives, process it and send it to an exchange on server B.
Alas, in both of these cases, using my usual techniques, by the time I get to sending to server B the connection throws ConnectionClosed or ChannelClosed.
I assume this is the cause: while waiting on the incoming messages, the connection to server B (its "driver") is starved of CPU cycles and never gets a chance to service its connection socket, so it can't respond to heartbeats from server B, and the server shuts down the connection.
But I can't noodle out the fix. My current workaround is lame: I catch the ConnectionClosed, reopen a connection to server B, and retry sending my message.
But what is the "right" way to do this? I've considered these, but don't really feel I have all the parts to solve it:
Don't just sit forever in server A's basic_consume (my usual pattern), but rather use a timeout, and when I catch the timeout somehow "service" heartbeats on server B's driver before returning to a "consume with timeout"... but how do I do that? How do I "let server B's connection driver service its heartbeats"?
I know the socket library's select() call can wait for messages on several sockets at once, then service the socket that has packets waiting. So maybe this is what pika's SelectConnection is for? a) I'm not sure, this is just a hunch. b) Even if I'm right, while I can find examples of how to create this connection, I can't find examples of how to use it to solve my multi-connection case.
Set up the two server connections in different processes, and use Python interprocess queues to get the processed message from one process to the next. The idea is that two different RabbitMQ connections in two different processes should be able to independently service their heartbeats. Except... I think this has a fatal flaw: the process with "server B" is instead going to be "stuck" waiting on the interprocess queue, and the same "starvation" is going to happen.
I've checked StackOverflow and Googled this for an hour last night: I can't for the life of me find a blog post or sample code for this.
Any input? Thanks a million!
I managed to work it out, basing my solution on the documentation and an answer in the pika-python Google group.
First of all, your assumption is correct: the client process that's connected to server B, responsible for publishing, cannot reply to heartbeats if it's already blocking on something else, like waiting for a message from server A or blocking on an internal communication queue.
The crux of the solution is that the publisher should run as a separate thread and use BlockingConnection.process_data_events to service heartbeats and such. It looks like that method is supposed to be called in a loop that checks if the publisher still needs to run:
def run(self):
    while self.is_running:
        # Block at most 1 second before returning and re-checking
        self.connection.process_data_events(time_limit=1)
Proof of concept
Since proving the full solution requires having two separate RabbitMQ instances running, I have put together a Git repo with an appropriate docker-compose.yml, the application code and comments to test this solution.
https://github.com/karls/rabbitmq-two-connections
Solution outline
Below is a sketch of the solution, minus imports and such. Some notable things:
Publisher runs as a separate thread
The only "work" that the publisher does is servicing heartbeats and such, via Connection.process_data_events
The publisher registers a callback whenever the consumer wants to publish a message, using Connection.add_callback_threadsafe
The consumer takes the publisher as a constructor argument so it can publish the messages it receives, but it can work via any other mechanism as long as you have a reference to an instance of Publisher
The code is taken from the linked Git repo, which is why certain details are hardcoded, e.g. the queue names. It will work with any RabbitMQ setup (direct-to-queue, topic exchange, fanout, etc.).
class Publisher(threading.Thread):
    def __init__(
        self,
        connection_params: ConnectionParameters,
        *args,
        **kwargs,
    ):
        super().__init__(*args, **kwargs)
        self.daemon = True
        self.is_running = True
        self.name = "Publisher"
        self.queue = "downstream_queue"
        self.connection = BlockingConnection(connection_params)
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue=self.queue, auto_delete=True)
        self.channel.confirm_delivery()

    def run(self):
        while self.is_running:
            self.connection.process_data_events(time_limit=1)

    def _publish(self, message):
        logger.info("Calling '_publish'")
        self.channel.basic_publish("", self.queue, body=message.encode())

    def publish(self, message):
        logger.info("Calling 'publish'")
        self.connection.add_callback_threadsafe(lambda: self._publish(message))

    def stop(self):
        logger.info("Stopping...")
        self.is_running = False
        # Call .process_data_events one more time to block
        # and allow the while-loop in .run() to break.
        # Otherwise the connection might be closed too early.
        self.connection.process_data_events(time_limit=1)
        if self.connection.is_open:
            self.connection.close()
            logger.info("Connection closed")
        logger.info("Stopped")


class Consumer:
    def __init__(
        self,
        connection_params: ConnectionParameters,
        publisher: Optional["Publisher"] = None,
    ):
        self.publisher = publisher
        self.queue = "upstream_queue"
        self.connection = BlockingConnection(connection_params)
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue=self.queue, auto_delete=True)
        self.channel.basic_qos(prefetch_count=1)

    def start(self):
        self.channel.basic_consume(
            queue=self.queue, on_message_callback=self.on_message
        )
        try:
            self.channel.start_consuming()
        except KeyboardInterrupt:
            logger.info("Warm shutdown requested...")
        except Exception:
            traceback.print_exception(*sys.exc_info())
        finally:
            self.stop()

    def on_message(self, _channel: Channel, m, _properties, body):
        try:
            message = body.decode()
            logger.info(f"Got: {message!r}")
            if self.publisher:
                self.publisher.publish(message)
            else:
                logger.info(f"No publisher provided, printing message: {message!r}")
            self.channel.basic_ack(delivery_tag=m.delivery_tag)
        except Exception:
            traceback.print_exception(*sys.exc_info())
            self.channel.basic_nack(delivery_tag=m.delivery_tag, requeue=False)

    def stop(self):
        logger.info("Stopping consuming...")
        if self.connection.is_open:
            logger.info("Closing connection...")
            self.connection.close()
        if self.publisher:
            self.publisher.stop()
        logger.info("Stopped")
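For completeness, a hedged sketch of wiring the two classes together; the broker addresses below are placeholders, not taken from the linked repo:
# Consume from server A, publish to server B.
upstream_params = ConnectionParameters(host="server-a.example.com", port=5672)
downstream_params = ConnectionParameters(host="server-b.example.com", port=5672)

publisher = Publisher(downstream_params)
publisher.start()   # the thread now services heartbeats via process_data_events()

consumer = Consumer(upstream_params, publisher=publisher)
consumer.start()    # blocks in start_consuming(); its stop() also stops the publisher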

Make rabbitMQ connections accessible to other modules

RabbitMQ best practices suggest using long-lived connections, ideally separate consume and publish connections, and attaching a channel per thread to the corresponding connection. I am building a distributed system where every part needs to consume and publish messages to other parts of the system. The class RabbitMQ creates those connections, attaches channels to them, and publishes messages. On the other hand, I have around 10 processes, each running in its own thread, that must consume/publish through their "own" channels. On startup each process creates its channel and binds its queues.
My question is how to start a unique instance of class RabbitMQ that makes the two connections "accessible" to the processes, keeping those two connections alive and avoiding opening/closing channels. I tried import messaging in each module, but every import produced a new instantiation of the class and, therefore, two new connections. I also tried adding a singleton to class RabbitMQ to avoid multiple instantiations on imports, but that did not work.
I appreciate your help.
messaging.py
import pika

class RabbitMQ:
    def __init__(self):
        self.consume_connection = None
        self.publish_connection = None
        self.initialize_connection()

    def initialize_connection(self):
        self.consume_connection = pika.BlockingConnection(pika.ConnectionParameters(
            host='localhost', socket_timeout=5,
            client_properties={'connection_name': 'consume_connection'}))
        self.publish_connection = pika.BlockingConnection(pika.ConnectionParameters(
            host='localhost', socket_timeout=5,
            client_properties={'connection_name': 'publish_connection'}))

    def send_message(self, exchange_name, routing_key, message, channel):
        ...

    def create_consume_channel(self):
        ...

    def create_publish_channel(self):
        ...

Messaging = RabbitMQ()
consuming_process.py
...
def connect_messaging(self):
    channel = self.messaging.create_consume_channel()  # <-- messaging would be the instance of class RabbitMQ
    channel.basic_qos(prefetch_count=100)

    exchange_name = 'abc'
    channel.exchange_declare(exchange=exchange_name, exchange_type='direct')
    result = channel.queue_declare(queue='queue_name')
    queue_1 = result.method.queue
    channel.queue_bind(exchange=exchange_name, queue=queue_1, routing_key='some_routing_key')
    ...

    def callback_function(ch, method, properties, body):
        ...
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(callback_function, queue=queue_1, no_ack=False)
    channel.start_consuming()
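For what it's worth, one common way to share a single instance across modules (a hedged sketch, not taken from the question) is to expose it through a lazily-initialized accessor in messaging.py, relying on the fact that Python caches a module after its first import:
# messaging.py (sketch): every importer calls get_messaging() and receives
# the same RabbitMQ object, so the two connections are opened exactly once.
_instance = None

def get_messaging():
    global _instance
    if _instance is None:
        _instance = RabbitMQ()
    return _instance
Callers would then do from messaging import get_messaging and channel = get_messaging().create_consume_channel().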

How to stop a websocket client without stopping reactor

I have an app, similar to a chat room, written in Python, that is intended to do the following things:
1. Prompt the user to input a websocket server address.
2. Create a websocket client that connects to the server and sends/receives messages, and disable the ability to create another websocket client.
3. After receiving "close" from the server (NOT a close frame), the client should drop the connection and re-enable the app to create a client. Go back to 1.
4. If the user exits the app, exit the websocket client if one is running.
My approach is to use a main thread to deal with user input. When the user hits enter, a thread is created for WebSocketClient using Autobahn's twisted module, and a Queue is passed to it. The main thread checks whether the reactor is running and starts it if it's not.
I override the onMessage method to put a closing flag into the Queue on receiving "close". The main thread busily checks the Queue until it receives the flag, then goes back to the start. The code looks like the following.
Main thread.
def main_thread():
    while True:
        text = raw_input("Input server url or exit")
        if text == "exit":
            if myreactor:
                myreactor.stop()
            break
        msgq = Queue.Queue()
        threading.Thread(target=wsthread, args=(text, msgq)).start()
        is_close = False
        while True:
            if msgq.empty() is False:
                msg = msgq.get()
                if msg == "close":
                    is_close = True
                else:
                    print msg
            if is_close:
                break
        print 'Websocket client closed!'
Factory and Protocol.
class MyProtocol(WebSocketClientProtocol):
    def onMessage(self, payload, isBinary):
        msg = payload.decode('utf-8')
        self.factory.queue.put(msg)
        if msg == 'close':
            self.dropConnection(abort=True)

class WebSocketClientFactoryWithQ(WebSocketClientFactory):
    def __init__(self, *args, **kwargs):
        self.queue = kwargs.pop('queue', None)
        WebSocketClientFactory.__init__(self, *args, **kwargs)
Client thread.
def wsthread(url, q):
    global myreactor
    factory = WebSocketClientFactoryWithQ(url=url, queue=q)
    factory.protocol = MyProtocol
    connectWS(factory)
    if myreactor is None:
        myreactor = reactor
        reactor.run()
    print 'Done'
Now I have a problem: my client thread never seems to stop. Even after I receive "close", it appears to still be running, and every time I try to create a new client, a new thread is created. I understand the first thread won't stop, since reactor.run() runs forever, but from the 2nd thread on it should be non-blocking since I'm not starting the reactor anymore. How can I change that?
EDIT:
I ended up solving it as follows (a rough sketch is included below):
Adding stopFactory() after disconnecting.
Making protocol calls with reactor.callFromThread().
Starting the reactor in the first thread, putting clients in other threads, and using reactor.callInThread() to create them.
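A minimal sketch of that arrangement, reusing MyProtocol and WebSocketClientFactoryWithQ from above (hedged: the helper names are mine, and it assumes a recent Autobahn where connectWS lives in autobahn.twisted.websocket):
import threading
from twisted.internet import reactor
from autobahn.twisted.websocket import connectWS

def start_reactor_once():
    # Run the reactor in one dedicated thread, started exactly once for the
    # lifetime of the app (signal handlers only work in the main thread).
    threading.Thread(target=reactor.run,
                     kwargs={'installSignalHandlers': False}).start()

def create_client(url, q):
    # Safe to call from the main (input) thread: the actual connect happens
    # on the reactor thread via callFromThread.
    factory = WebSocketClientFactoryWithQ(url=url, queue=q)
    factory.protocol = MyProtocol
    reactor.callFromThread(connectWS, factory)

def shutdown():
    # Stopping the reactor must also be handed over to the reactor thread.
    reactor.callFromThread(reactor.stop)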
Your main_thread creates new threads running wsthread. wsthread uses Twisted APIs. The first wsthread becomes the reactor thread. All subsequent threads are different and it is undefined what happens if you use a Twisted API from them.
You should almost certainly remove the use of threads from your application. For dealing with console input in a Twisted-based application, take a look at twisted.conch.stdio (not the best documented part of Twisted, alas, but just what you want).

Connected clients list in Python Tornado

I have a Tornado WebSocket server running in a separate process that is launched by a thread. This thread calls the publish method of my TornadoServer when it gets messages to send via websockets.
Running Tornado in a separate process was the only way I found to start the Tornado loop without the thread blocking on that call.
In my thread, I start the Tornado process by calling these methods in the thread's init method:
self.p = tornado_server.TornadoServer()
self.p.daemon = True
self.p.start()
In this thread, I have an infinite loop that tries to get messages from a Queue, and if it gets messages it calls self.p.publish(client, message).
So far, so good.
In the Tornado process, I basically implemented a publish/subscribe system. When a user opens a webpage, the page sends a "subscription" message for a specific "client", let's say. In the on_message callback I append a tuple of the WebSocketHandler instance and the client the user wants to subscribe to, to a global list.
Then, the publish method should search that list for users subscribed to the message's target client and call write_message on the WebSocket stored in the list.
The only thing that isn't working is that my "clients" list seems to have different scopes or something.
This is the code of my tornado_server file:
#!/usr/bin/python2
import tornado.web, tornado.websocket, tornado.ioloop, multiprocessing

clients = []

class TornadoServer(multiprocessing.Process):

    class WebSocketHandler(tornado.websocket.WebSocketHandler):
        def on_message(self, message):
            global clients
            print 'TORNADO - Received message:', str(message)
            channel, subtopic = message.split('/')
            print 'TORNADO - Subscribing:', str(subtopic)
            clients.append((self, subtopic))

        def on_close(self):
            global clients
            for websocket, client in clients:
                if self == websocket:
                    print 'TORNADO - Removed client'
                    to_remove = (self, client)
                    clients.remove(to_remove)

    def __init__(self):
        multiprocessing.Process.__init__(self)
        self.application = tornado.web.Application([(r"/tri-anim", TornadoServer.WebSocketHandler)])
        self.application.listen(1339)

    def run(self):
        tornado.ioloop.IOLoop.current().start()

    def publish(self, client, message):
        global clients
        for websocket, websocketclient in clients:
            if websocketclient == client:
                websocket.write_message(str(message))
No matter what I do, "clients" always seems to have a different scope: when publish is called, it is always empty. Is there any way to get this working?
You're calling publish in the parent process, but the clients list is only updated in the child process. When using multiprocessing each process gets its own copy of all the variables. If you used threads instead the variables would be shared, but even then you'd need to use IOLoop.instance().add_callback to do a thread-safe handoff between the thread calling publish and the write_message function (which must be called on the IOLoop thread).
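For illustration, a hedged sketch of that thread-based variant, using the same Tornado 4-era APIs as the question (application setup omitted; this is not a drop-in replacement for the multiprocessing version):
import threading
import tornado.ioloop

clients = []  # shared state is visible to all threads in the same process

class TornadoServer(object):
    def __init__(self):
        self.io_loop = tornado.ioloop.IOLoop.instance()
        self.thread = threading.Thread(target=self.io_loop.start)
        self.thread.daemon = True

    def start(self):
        # Application setup (tornado.web.Application(...).listen(1339)) would
        # happen before this point, as in the original __init__.
        self.thread.start()

    def publish(self, client, message):
        # May be called from any thread; the write itself is scheduled onto
        # the IOLoop thread, the only thread allowed to touch the handlers.
        self.io_loop.add_callback(self._publish, client, message)

    def _publish(self, client, message):
        for websocket, websocketclient in clients:
            if websocketclient == client:
                websocket.write_message(str(message))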

Attaching ZMQStream with existing tornado ioloop

I have an application where every websocket connection (within Tornado's open callback) creates a zmq.SUB socket to an existing zmq.FORWARDER device. The idea is to receive data from zmq as callbacks, which can then be relayed to frontend clients over the websocket connection.
https://gist.github.com/abhinavsingh/6378134
ws.py
import zmq
from zmq.eventloop import ioloop
from zmq.eventloop.zmqstream import ZMQStream
ioloop.install()

from tornado.websocket import WebSocketHandler
from tornado.web import Application
from tornado.ioloop import IOLoop

ioloop = IOLoop.instance()

class ZMQPubSub(object):

    def __init__(self, callback):
        self.callback = callback

    def connect(self):
        self.context = zmq.Context()
        self.socket = self.context.socket(zmq.SUB)
        self.socket.connect('tcp://127.0.0.1:5560')
        self.stream = ZMQStream(self.socket)
        self.stream.on_recv(self.callback)

    def subscribe(self, channel_id):
        self.socket.setsockopt(zmq.SUBSCRIBE, channel_id)

class MyWebSocket(WebSocketHandler):

    def open(self):
        self.pubsub = ZMQPubSub(self.on_data)
        self.pubsub.connect()
        self.pubsub.subscribe("session_id")
        print 'ws opened'

    def on_message(self, message):
        print message

    def on_close(self):
        print 'ws closed'

    def on_data(self, data):
        print data

def main():
    application = Application([(r'/channel', MyWebSocket)])
    application.listen(10001)
    print 'starting ws on port 10001'
    ioloop.start()

if __name__ == '__main__':
    main()
forwarder.py
import zmq
def main():
try:
context = zmq.Context(1)
frontend = context.socket(zmq.SUB)
frontend.bind('tcp://*:5559')
frontend.setsockopt(zmq.SUBSCRIBE, '')
backend = context.socket(zmq.PUB)
backend.bind('tcp://*:5560')
print 'starting zmq forwarder'
zmq.device(zmq.FORWARDER, frontend, backend)
except KeyboardInterrupt:
pass
except Exception as e:
logger.exception(e)
finally:
frontend.close()
backend.close()
context.term()
if __name__ == '__main__':
main()
publish.py
import zmq

if __name__ == '__main__':
    context = zmq.Context()
    socket = context.socket(zmq.PUB)
    socket.connect('tcp://127.0.0.1:5559')
    socket.send('session_id helloworld')
    print 'sent data for channel session_id'
However, my ZMQPubSub class doesn't seem to be receiving any data at all.
I experimented further and realized that I need to call ioloop.IOLoop.instance().start() after registering the on_recv callback within ZMQPubSub. But that just blocks the execution.
I also tried passing the main.ioloop instance to the ZMQStream constructor, but that doesn't help either.
Is there a way I can bind ZMQStream to the existing main.ioloop instance without blocking the flow within MyWebSocket.open?
In your now-complete example, simply change frontend in your forwarder to a PULL socket and your publisher socket to PUSH, and it should behave as you expect.
The general principles of socket choice that are relevant here:
use PUB/SUB when you want to send a message to everyone who is ready to receive it (which may be no one)
use PUSH/PULL when you want to send a message to exactly one peer, waiting for them to be ready
It may appear initially that you just want PUB-SUB, but once you start looking at each socket pair, you realize that they are very different. The forwarder-to-websocket connection is definitely PUB-SUB: you may have zero-to-many receivers, and you just want to send messages to everyone who happens to be available when a message comes through. But the publisher-to-forwarder side is different: there is only one receiver, and it definitely wants every message from the publishers.
So there you have it: the publisher-facing socket should be PULL and the websocket-facing socket PUB. All your sockets:
PUSH -> [PULL-PUB] -> SUB
publish.py: the socket is PUSH, connected to the forwarder's PULL socket
forwarder.py: the socket the publishers connect to (bound on 5559) is PULL; the socket the websockets connect to (bound on 5560) is PUB
ws.py: the SUB socket connects and subscribes to the forwarder's PUB socket
The relevant behavior that makes PUB/SUB fail on the publisher side in your case is the slow-joiner syndrome, which is described in the Guide. Essentially, subscribers take a finite time to tell publishers about their subscriptions, so if you send a message immediately after opening a PUB socket, the odds are it hasn't been told that it has any subscribers yet, so it just discards the message.
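Concretely, a sketch of the corresponding edits to the files above (only the changed lines; the zmq.device(zmq.FORWARDER, ...) call should be able to stay as it is):
# forwarder.py: the side the publishers connect to becomes PULL
frontend = context.socket(zmq.PULL)
frontend.bind('tcp://*:5559')
# (no zmq.SUBSCRIBE option is needed on a PULL socket)

# publish.py: the publisher becomes PUSH, so the send waits for the forwarder
# to be ready instead of being silently dropped
socket = context.socket(zmq.PUSH)
socket.connect('tcp://127.0.0.1:5559')
socket.send('session_id helloworld')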
ZeroMQ subscribers have to subscribe to the messages they wish to receive; I don't see that in your code. I believe the Python way is this:
self.socket.setsockopt(zmq.SUBSCRIBE, "")
