I have a Tornado WebSocket Server running in a separate process that is launched by a thread. This thread calls the publish method of my TornadoServer when it gets messages to send via websockets.
Running Tornado on a separate process was the only way I found to start the tornado loop without the thread blocking on this call.
In my thread, I start the tornado process by calling these methods on thread init method:
self.p = tornado_server.TornadoServer()
self.p.daemon = True
self.p.start()
In this thread, I have an infinite loop that tries to get messages from a Queue and if it gets messages, it calls the self.p.publish(client, message).
So far, so good.
On the Tornado process, I basically implemented a publish/subscribe system. When a user opens a webpage, the page sends a "subscription" message for a specific "client" let's say. On the "on_message" callback I append a tuple of the WebSocketHandler instance and the client that the user wants to subscribe to a global list.
Then, the publish method should search in the list for subscribed users to the message's target client and it should call the write_message on the WebSocket stored on that list.
The only thing that it isn't working is that my "clients" list have different scopes or something.
This is the code of my tornado_server file:
#!/usr/bin/python2
import tornado.web, tornado.websocket, tornado.ioloop, multiprocessing
clients = []
class TornadoServer(multiprocessing.Process):
class WebSocketHandler(tornado.websocket.WebSocketHandler):
def on_message(self, message):
global clients
print 'TORNADO - Received message:', str(message)
channel, subtopic = message.split('/')
print 'TORNADO - Subscribing:', str(subtopic)
clients.append((self, subtopic))
def on_close(self):
global clients
for websocket, client in clients:
if self == websocket:
print 'TORNADO - Removed client'
to_remove = (self, client)
clients.remove(to_remove)
def __init__(self):
multiprocessing.Process.__init__(self)
self.application = tornado.web.Application([(r"/tri-anim", WebSocketHandler)])
self.application.listen(1339)
def run(self):
tornado.ioloop.IOLoop.current().start()
def publish(self, client, message):
global clients
for websocket, websocketclient in clients:
if websocketclient == client:
websocket.write_message(str(message))
No matter what I do, clients have always different scopes. When publish is called, the "clients" is always empty. Is there any way to get this working?
You're calling publish in the parent process, but the clients list is only updated in the child process. When using multiprocessing each process gets its own copy of all the variables. If you used threads instead the variables would be shared, but even then you'd need to use IOLoop.instance().add_callback to do a thread-safe handoff between the thread calling publish and the write_message function (which must be called on the IOLoop thread).
Related
I'm using python with pika, and have the following two similar use cases:
Connect to RabbitMQ server A and server B (at different IP addrs with different credentials), listen on exchange A1 on server A; when a message arrives, process it and send to an exchange on server B
Open an HTTP listener and connect to RabbitMQ server B; when a specific HTTP request arrives, process it and send to an exchange on server B
Alas, in both these cases using my usual techniques, by the time I get to sending to server B the connection throws ConnectionClosed or ChannelClosed.
I assume this is the cause: while waiting on the incoming messages, the connection to server B (its "driver") is starved of CPU cycles, and it never gets a chance to service is connection socket, thus it can't respond to heartbeats from server B, thus the servers shuts down the connection.
But I can't noodle out the fix. My current work around is lame: I catch the ConnectionClosed, reopen a connection to server B, and retry sending my message.
But what is the "right" way to do this? I've considered these, but don't really feel I have all the parts to solve this:
Don't just sit forever in server A's basic_consume (my usual pattern), but rather, use a timeout, and when I catch the timeout somehow "service" heartbeats on server B's driver, before returning to a "consume with timeout"... but how do I do that? How do I "let service B's connection driver service its heartbeats"?
I know the socket library's select() call can wait for messages on several sockets and once, then service the socket who has packets waiting. So maybe this is what pika's SelectConnection is for? a) I'm not sure, this is just a hunch. b) Even if right, while I can find examples of how to create this connection, I can't find examples of how to use it to solve my multiconnection case.
Set up the the two server connections in different processes... and use Python interprocess queues to get the processed message from one process to the next. The concept is "two different RabbitMQ connections in two different processes should thus then be able to independently service their heartbeats". Except... I think this has a fatal flaw: the process with "server B" is, instead, going to be "stuck" waiting on the interprocess queue, and the same "starvation" is going to happen.
I've checked StackOverflow and Googled this for an hour last night: I can't for the life of me find a blog post or sample code for this.
Any input? Thanks a million!
I managed to work it out, basing my solution on the documentation and an answer in the pika-python Google group.
First of all, your assumption is correct — the client process that's connected to server B, responsible for publishing, cannot reply to heartbeats if it's already blocking on something else, like waiting a message from server A or blocking on an internal communication queue.
The crux of the solution is that the publisher should run as a separate thread and use BlockingConnection.process_data_events to service heartbeats and such. It looks like that method is supposed to be called in a loop that checks if the publisher still needs to run:
def run(self):
while self.is_running:
# Block at most 1 second before returning and re-checking
self.connection.process_data_events(time_limit=1)
Proof of concept
Since proving the full solution requires having two separate RabbitMQ instances running, I have put together a Git repo with an appropriate docker-compose.yml, the application code and comments to test this solution.
https://github.com/karls/rabbitmq-two-connections
Solution outline
Below is a sketch of the solution, minus imports and such. Some notable things:
Publisher runs as a separate thread
The only "work" that the publisher does is servicing heartbeats and such, via Connection.process_data_events
The publisher registers a callback whenever the consumer wants to publish a message, using Connection.add_callback_threadsafe
The consumer takes the publisher as a constructor argument so it can publish the messages it receives, but it can work via any other mechanism as long as you have a reference to an instance of Publisher
The code is taken from the linked Git repo, which is why certain details are hardcoded, e.g the queue name etc. It will work with any RabbitMQ setup needed (direct-to-queue, topic exchange, fanout, etc).
class Publisher(threading.Thread):
def __init__(
self,
connection_params: ConnectionParameters,
*args,
**kwargs,
):
super().__init__(*args, **kwargs)
self.daemon = True
self.is_running = True
self.name = "Publisher"
self.queue = "downstream_queue"
self.connection = BlockingConnection(connection_params)
self.channel = self.connection.channel()
self.channel.queue_declare(queue=self.queue, auto_delete=True)
self.channel.confirm_delivery()
def run(self):
while self.is_running:
self.connection.process_data_events(time_limit=1)
def _publish(self, message):
logger.info("Calling '_publish'")
self.channel.basic_publish("", self.queue, body=message.encode())
def publish(self, message):
logger.info("Calling 'publish'")
self.connection.add_callback_threadsafe(lambda: self._publish(message))
def stop(self):
logger.info("Stopping...")
self.is_running = False
# Call .process_data_events one more time to block
# and allow the while-loop in .run() to break.
# Otherwise the connection might be closed too early.
#
self.connection.process_data_events(time_limit=1)
if self.connection.is_open:
self.connection.close()
logger.info("Connection closed")
logger.info("Stopped")
class Consumer:
def __init__(
self,
connection_params: ConnectionParameters,
publisher: Optional["Publisher"] = None,
):
self.publisher = publisher
self.queue = "upstream_queue"
self.connection = BlockingConnection(connection_params)
self.channel = self.connection.channel()
self.channel.queue_declare(queue=self.queue, auto_delete=True)
self.channel.basic_qos(prefetch_count=1)
def start(self):
self.channel.basic_consume(
queue=self.queue, on_message_callback=self.on_message
)
try:
self.channel.start_consuming()
except KeyboardInterrupt:
logger.info("Warm shutdown requested...")
except Exception:
traceback.print_exception(*sys.exc_info())
finally:
self.stop()
def on_message(self, _channel: Channel, m, _properties, body):
try:
message = body.decode()
logger.info(f"Got: {message!r}")
if self.publisher:
self.publisher.publish(message)
else:
logger.info(f"No publisher provided, printing message: {message!r}")
self.channel.basic_ack(delivery_tag=m.delivery_tag)
except Exception:
traceback.print_exception(*sys.exc_info())
self.channel.basic_nack(delivery_tag=m.delivery_tag, requeue=False)
def stop(self):
logger.info("Stopping consuming...")
if self.connection.is_open:
logger.info("Closing connection...")
self.connection.close()
if self.publisher:
self.publisher.stop()
logger.info("Stopped")
Trying to send and receive data between the processes using MQTT, I am able to achieve a call from one of the process using BaseManager class to main mqtt class which publishes data on MQTT which has been sent from created Process.
The code looks like below
class MqttClient:
def __init__(self):
self.pub = Mqtt.Client()
self.pub.connect("localhost", 1883, 60)
self.pub.loop_start()
def publishM(self, data): # method to be called from processes to publish data
self.pub.publish("Test", data)
class subProcess:
def __init__(self, mqttObj):
self.MqttObject = mqttObj
self.data = "Hello World"
self.MqttObject.publishM(self.data) # calling method in MqttClient using instance of it
if __name__ == "__main__":
BaseManager.register('MqttClient', MqttClient) # registering MqttClient class to baseManager
manager = BaseManager()
manager.start()
mqttInstance = manager.MqttClient() # Instance of MqttClient
p = Process(target=subProcess, args=(mqttInstance,)) # instance of MqttClient to process
p.start()
p.join()
The above code works well when a created process needs to send data over Mqtt. But I also need to send data to the process back using Mqtt. Meaning process should be subscribed to a topic and should have onmessage method to receive data from Mqtt. I know I can create one mqtt client inside process itself to begin with but the number of processes increases number of mqtt client also will increase. I would like to do with single mqtt client.
How to achieve this ? Thanks.
I have my Tornado client continuously listening to my Tornado server in a loop, as it is mentioned here - http://tornadoweb.org/en/stable/websocket.html#client-side-support. It looks like this:
import tornado.websocket
from tornado import gen
#gen.coroutine
def test():
client = yield tornado.websocket.websocket_connect("ws://localhost:9999/ws")
client.write_message("Hello")
while True:
msg = yield client.read_message()
if msg is None:
break
print msg
client.close()
if __name__ == "__main__":
tornado.ioloop.IOLoop.instance().run_sync(test)
I'm not able to get multiple instances of clients to connect to the server. The second client always waits for the first client process to end before it connects to the server. The server is set up as follows, with reference from Websockets with Tornado: Get access from the "outside" to send messages to clients and Tornado - Listen to multiple clients simultaneously over websockets.
class WSHandler(tornado.websocket.WebSocketHandler):
clients = set()
def open(self):
print 'new connection'
WSHandler.clients.add(self)
def on_message(self, message):
print 'message received %s' % message
# process received message
# pass it to a thread which updates a variable
while True:
output = updated_variable
self.write_message(output)
def on_close(self):
print 'connection closed'
WSHandler.clients.remove(self)
application = tornado.web.Application([(r'/ws', WSHandler),])
if __name__ == "__main__":
http_server = tornado.httpserver.HTTPServer(application)
http_server.listen(9999)
tornado.ioloop.IOLoop.instance().start()
But this has not worked - for some reason even after I have made a successful first connection, the second connection just fails to connect i.e. it does not even get added to the clients set.
I initially thought the while True would not block the server from receiving and handling more clients, but it does as without it multiple clients are able to connect. How can I send back continuously updated information from my internal thread without using the while True?
Any help would be greatly appreciated!
To write messages to client in a while loop, you can use the yield None inside the loop. This will pause the while loop and then Tornado's IOLoop will be free to accept new connections.
Here's an example:
#gen.coroutine
def on_message(self):
while True:
self.write_message("Hello")
yield None
Thanks for your answer #xyres! I was able to get it to work by starting a thread in the on_message method that handed processing and the while True to a function outside the WSHandler class. I believe this allowed for the method to run outside of Tornado's IOLoop, unblocking new connections.
This is how my server looks now:
def on_message(self, message):
print 'message received %s' % message
sendThread = threading.Thread(target=send, args=(self, message))
sendThread.start()
def send(client, msg):
# process received msg
# pass it to a thread which updates a variable
while True:
output = updated_variable
client.write_message(output)
Where send is a function defined outside the class which does the required computation for me and writes back to client inside thewhile True.
I have an app similar to a chat-room writing in python that intends to do the following things:
A prompt for user to input websocket server address.
Then create a websocket client that connects to server and send/receive messages. Disable the ability to create a websocket client.
After receiving "close" from server (NOT a close frame), client should drop connecting and re-enable the app to create a client. Go back to 1.
If user exits the app, it exit the websocket client if there is one running.
My approach for this is using a main thread to deal with user input. When user hits enter, a thread is created for WebSocketClient using AutoBahn's twisted module and pass a Queue to it. Check if the reactor is running or not and start it if it's not.
Overwrite on message method to put a closing flag into the Queue when getting "close". The main thread will be busy checking the Queue until receiving the flag and go back to start. The code looks like following.
Main thread.
def main_thread():
while True:
text = raw_input("Input server url or exit")
if text == "exit":
if myreactor:
myreactor.stop()
break
msgq = Queue.Queue()
threading.Thread(target=wsthread, args=(text, msgq)).start()
is_close = False
while True:
if msgq.empty() is False:
msg = msgq.get()
if msg == "close":
is_close = True
else:
print msg
if is_close:
break
print 'Websocket client closed!'
Factory and Protocol.
class MyProtocol(WebSocketClientProtocol):
def onMessage(self, payload, isBinary):
msg = payload.decode('utf-8')
self.Factory.q.put(msg)
if msg == 'close':
self.dropConnection(abort=True)
class WebSocketClientFactoryWithQ(WebSocketClientFactory):
def __init__(self, *args, **kwargs):
self.queue = kwargs.pop('queue', None)
WebSocketClientFactory.__init__(self, *args, **kwargs)
Client thread.
def wsthread(url, q):
factory = WebSocketClientFactoryWithQ(url=url, queue=q)
factory.protocol = MyProtocol
connectWS(Factory)
if myreactor is None:
myreactor = reactor
reactor.run()
print 'Done'
Now I got a problem. It seems that my client thread never stops. Even if I receive "close", it seems still running and every time I try to recreate a new client, it creates a new thread. I understand the first thread won't stop since reactor.run() will run forever, but from the 2nd thread and on, it should be non-blocking since I'm not starting it anymore. How can I change that?
EDIT:
I end up solving it with
Adding stopFactory() after disconnect.
Make protocol functions with reactor.callFromThread().
Start the reactor in the first thread and put clients in other threads and use reactor.callInThread() to create them.
Your main_thread creates new threads running wsthread. wsthread uses Twisted APIs. The first wsthread becomes the reactor thread. All subsequent threads are different and it is undefined what happens if you use a Twisted API from them.
You should almost certainly remove the use of threads from your application. For dealing with console input in a Twisted-based application, take a look at twisted.conch.stdio (not the best documented part of Twisted, alas, but just what you want).
Is it possible to add additional subscriptions to a Redis connection? I have a listening thread but it appears not to be influenced by new SUBSCRIBE commands.
If this is the expected behavior, what is the pattern that should be used if users add a stock ticker feed to their interests or join chatroom?
I would like to implement a Python class similar to:
import threading
import redis
class RedisPubSub(object):
def __init__(self):
self._redis_pub = redis.Redis(host='localhost', port=6379, db=0)
self._redis_sub = redis.Redis(host='localhost', port=6379, db=0)
self._sub_thread = threading.Thread(target=self._listen)
self._sub_thread.setDaemon(True)
self._sub_thread.start()
def publish(self, channel, message):
self._redis_pub.publish(channel, message)
def subscribe(self, channel):
self._redis_sub.subscribe(channel)
def _listen(self):
for message in self._redis_sub.listen():
print message
The python-redis Redis and ConnectionPool classes inherit from threading.local, and this is producing the "magical" effects you're seeing.
Summary: your main thread and worker threads' self._redis_sub clients end up using two different connections to the server, but only the main thread's connection has issued the SUBSCRIBE command.
Details: Since the main thread is creating the self._redis_sub, that client ends up being placed into main's thread-local storage. Next I presume the main thread does a client.subscribe(channel) call. Now the main thread's client is subscribed on connection 1. Next you start the self._sub_thread worker thread which ends up having its own self._redis_sub attribute set to a new instance of redis.Client which constructs a new connection pool and establishes a new connection to the redis server.
This new connection has not yet been subscribed to your channel, so listen() returns immediately. So with python-redis you cannot pass an established connection with outstanding subscriptions (or any other stateful commands) between threads.
Depending on how you plan to implement your app you may need to switch to using a different client, or come up with some other way to communicate subscription state to the worker threads, e.g. send subscription commands through a queue.
One other issue is that python-redis uses blocking sockets, which prevents your listening thread from doing other work while waiting for messages, and it cannot signal it wishes to unsubscribe unless it does so immediately after receiving a message.
Async way:
Twisted framework and the plug txredisapi
Example code (Subscribe:
import txredisapi as redis
from twisted.application import internet
from twisted.application import service
class myProtocol(redis.SubscriberProtocol):
def connectionMade(self):
print "waiting for messages..."
print "use the redis client to send messages:"
print "$ redis-cli publish chat test"
print "$ redis-cli publish foo.bar hello world"
self.subscribe("chat")
self.psubscribe("foo.*")
reactor.callLater(10, self.unsubscribe, "chat")
reactor.callLater(15, self.punsubscribe, "foo.*")
# self.continueTrying = False
# self.transport.loseConnection()
def messageReceived(self, pattern, channel, message):
print "pattern=%s, channel=%s message=%s" % (pattern, channel, message)
def connectionLost(self, reason):
print "lost connection:", reason
class myFactory(redis.SubscriberFactory):
# SubscriberFactory is a wapper for the ReconnectingClientFactory
maxDelay = 120
continueTrying = True
protocol = myProtocol
application = service.Application("subscriber")
srv = internet.TCPClient("127.0.0.1", 6379, myFactory())
srv.setServiceParent(application)
Only one thread, no headache :)
Depends on what kind of app u coding of course. In networking case go twisted.