Two paho.mqtt clients subscribing to the same server locally - python

I'm trying to find out if it is possible to have two paho.mqtt clients (https://eclipse.org/paho/clients/python/docs/) subscribing to the same server. Both clients and the server are running on the same host. My aim is to have two clients subscribe with different credentials to the same server (which in my case is RabbitMQ with the MQTT plugin) so I can sort my payloads by vhost (not by topic, since I don't have control over the topics).
My observation at the moment is that the clients just keep reconnecting, which suggests I'm either doing something wrong or that only one client can be connected to the MQTT server at a time...
So here is the question - were you able to run more than one client subscribed to the same server, with all clients and the server running locally?
Edit:
It seems RabbitMQ with the MQTT plugin allows this. One can configure two users with access to separate vhosts, and just by doing this the payloads get segregated. My scenario was to configure two clients so I could distinguish who had sent which payload; locally, I could spawn mirror clients to consume the payloads of the related users.
Many thanks to @hardillb, who helped with this question and with a related question.

Each client must have a unique client id; the broker will disconnect the oldest client when a new one connects with the same client id. Other than that, you can run as many clients as you want, connecting from anywhere that can reach the broker.
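For illustration, here is a minimal sketch of two clients with distinct client ids and credentials subscribing side by side (usernames, passwords, and the vhost-prefix convention of RabbitMQ's MQTT plugin are assumptions):

import time
import paho.mqtt.client as mqtt

def make_client(client_id, username, password):
    def on_connect(client, userdata, flags, rc):
        client.subscribe("#")  # subscribe once the connection is up

    def on_message(client, userdata, msg):
        print(client_id, msg.topic, msg.payload)

    client = mqtt.Client(client_id=client_id)  # must be unique per client
    # assumption: RabbitMQ's MQTT plugin selects the vhost when the
    # username has the form "vhost:username"
    client.username_pw_set(username, password)
    client.on_connect = on_connect
    client.on_message = on_message
    client.connect("localhost", 1883)
    return client

c1 = make_client("client-1", "vhost1:alice", "alice-pw")
c2 = make_client("client-2", "vhost2:bob", "bob-pw")
c1.loop_start()
c2.loop_start()
time.sleep(60)  # both clients stay connected side by side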


Server to Server Websocket communication

Here is the architecture topology:
An IoT device that counts people and saves the data to its cloud platform. The data can be accessed via an API; more specifically, it requires a webserver endpoint where it can push the data every minute or so. This is a ready-made product, so I cannot change the data transfer method.
A webserver on my side that receives and stores the data.
As I am new to WebSockets, I interpret the above configuration as a WebSocket server installed on my webserver, waiting for data to be received from the IoT server (the client).
So I deployed a Linux server on DigitalOcean and started the websocket server to wait for incoming connections. The code I used for the server is:
import asyncio
import websockets

async def echo(websocket, path):
    async for message in websocket:
        print(message)

start_server = websockets.serve(echo, "MYSERVERIP", 80)
asyncio.get_event_loop().run_until_complete(start_server)
asyncio.get_event_loop().run_forever()
All I need at this stage is to print all JSON packets that are pushed from the IoT server.
When I try to set the endpoint address in the IoT server, it refuses to accept ws://Myserver:80 and only accepts http://Myserver:80. Obviously I don't have any HTTP server running on my server, so I am guessing the connection is refused by my server.
Also, the IoT API requires X-Auth-Token authentication. I am using the websockets Python library, but I didn't set up authentication on my server; I left it null on both the IoT server API and my server.
If I were to add token authentication, what parameters or arguments would the websocket server require? I tried to search the websockets docs, but with no luck.
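A minimal, untested sketch of one way to do this, using the process_request hook of the websockets library (the header name and token value are placeholders):

import asyncio
import http
import websockets

EXPECTED_TOKEN = "secret-token"  # placeholder value

async def echo(websocket, path):
    async for message in websocket:
        print(message)

async def check_token(path, request_headers):
    # returning a (status, headers, body) triple aborts the WebSocket
    # handshake with a plain HTTP response; returning None lets it proceed
    if request_headers.get("X-Auth-Token") != EXPECTED_TOKEN:
        return http.HTTPStatus.UNAUTHORIZED, [], b"invalid token\n"
    return None

start_server = websockets.serve(echo, "MYSERVERIP", 80, process_request=check_token)
asyncio.get_event_loop().run_until_complete(start_server)
asyncio.get_event_loop().run_forever()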
This is not for a production environment!! I am only trying to learn.
Any thoughts are welcome.
So these are the requirements:
An IoT device that counts people and saves the data to its cloud platform. Data can be accessed via an API and, more specifically, it requires a webserver endpoint where it can push the data every minute or so.
A webserver on my side that receives and stores the data.
They need the data to be refreshed every minute or so. In my humble opinion, websockets are necessary only for real-time use.
That said, my proposed solution is to use a message broker instead. I think it's easier to handle than websockets directly, and you do not have to worry about maintaining a live socket connection all the time (which is not energy-efficient in the IoT world).
In other words, use a pub/sub architecture instead. Your IoT devices publish data to the message broker (a common one is RabbitMQ), and then you build a server that subscribes to the broker, consumes its data, and stores it.
Now every device connects to the cloud only when it has data available, which saves energy. The protocol may be MQTT or HTTP; MQTT is often used in the IoT world.
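A minimal device-side sketch of this pattern with paho-mqtt (broker address, topic name, and the sensor-read helper are all assumptions):

import time
import paho.mqtt.client as mqtt

client = mqtt.Client(client_id="people-counter-1")
client.connect("broker.example.com", 1883)  # assumed broker address
client.loop_start()

while True:
    count = read_people_count()  # hypothetical sensor read
    # publish only when there is data; qos=1 asks the broker to confirm receipt
    client.publish("sensors/people-count", str(count), qos=1)
    time.sleep(60)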
Related: Pub-sub messaging benefits

Scaling a decoupled realtime server alongside a standard webserver

Say I have a typical web server that serves standard HTML pages to clients, and a websocket server running alongside it used for realtime updates (chat, notifications, etc.).
My general workflow is when something occurs on the main server that triggers the need for a realtime message, the main server sends that message to the realtime server (via a message queue) and the realtime server distributes it to any related connection.
My concern is, if I want to scale things up a bit, and add another realtime server, it seems my only options are:
Have the main server keep track of which realtime server the client is connected to. When that client receives a notification/chat message, the main server forwards that message along to only the realtime server the client is connected to. The downside here is code complexity, as the main server has to do some extra bookkeeping.
Or instead have the main server simply pass that message along to every realtime server; only the server the client is connected to would actually do anything with it. This would result in a number of wasted messages being passed around.
Am I missing another option here? I'm just trying to make sure I don't go too far down one of these paths and realize I'm doing things totally wrong.
If the scenario is
a) the main web server raises a message upon an action (let's say a record is inserted), and
b) it notifies the appropriate real-time server,
you could decouple these two steps by using an intermediate pub/sub architecture that forwards the messages to the intended recipient.
An implementation would be:
1) You have a Redis pub/sub channel; when a client connects to a real-time socket, you start listening on that channel.
2) When the main app wants to notify a user via the real-time server, it pushes a message to the channel; the real-time server gets it and forwards it to the intended user.
This way, you decouple the realtime notification from the main app and you don't have to keep track of where the user is.
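A rough sketch of that flow with redis-py (the channel name and the connections lookup are assumptions):

import json
import redis

# main app side: publish a notification addressed to a specific user
r = redis.Redis()
r.publish("notifications", json.dumps({"user_id": 42, "text": "hello"}))

# real-time server side: listen on the channel and forward to the right socket
p = redis.Redis().pubsub()
p.subscribe("notifications")
for raw in p.listen():
    if raw["type"] != "message":
        continue  # skip the subscribe confirmation
    event = json.loads(raw["data"])
    ws = connections.get(event["user_id"])  # hypothetical user_id -> socket map
    if ws:
        ws.send(event["text"])  # only the server holding the connection acts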
The problem you are describing is the common "message backplane", used for example in SignalR; it is also related to the "fanout message exchange" in messaging architectures. With a backplane or fanout, every message is forwarded to every message node server, so clients can connect to any server and get the message. This approach is reasonable when you have to support both long polling and websockets. However, as you noticed, it is a waste of traffic and resources.
You need to use a message infrastructure with intelligent routing, like RabbitMQ. Take a look at topic and headers exchanges: https://www.rabbitmq.com/tutorials/amqp-concepts.html
How Topic Exchanges Route Messages
RabbitMQ for Windows: Exchange Types
There are tons of different queuing frameworks. Pick the one you like, but make sure it offers more exchange modes than just direct or fanout ;) In the end, a WebSocket is just an endpoint for connecting to a message infrastructure. So if you want to scale out, it boils down to the backend you have :)
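As a short sketch, topic-based routing with RabbitMQ and pika might look like this (exchange and routing-key names are assumptions):

import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()
ch.exchange_declare(exchange="realtime", exchange_type="topic")

# producer (main server): route by user id, so only interested nodes get it
ch.basic_publish(exchange="realtime", routing_key="user.42.chat", body=b"hi")

# consumer (realtime server): bind a private queue for the users connected here
result = ch.queue_declare(queue="", exclusive=True)
ch.queue_bind(exchange="realtime", queue=result.method.queue,
              routing_key="user.42.*")

def on_message(channel, method, properties, body):
    print(method.routing_key, body)  # forward to the user's websocket here

ch.basic_consume(queue=result.method.queue, on_message_callback=on_message,
                 auto_ack=True)
ch.start_consuming()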
For just a few realtime servers, you could conceivably keep a list of them in the main server and go through them round-robin.
Another approach is to use a load balancer.
Basically, you'll have one dedicated node to receive the requests from the main server, and then have that load-balancer node take care of choosing which websocket/realtime server to forward the request to.
Of course, this just shifts the code complexity from the main server to a new component, but conceptually I think it's better and more decoupled.
Changed the answer because a reply indicated that the "main" and "realtime" servers are already load-balanced clusters and not individual hosts.
The central scalability question seems to be:
My general workflow is when something occurs on the main server that triggers the need for a realtime message, the main server sends that message to the realtime server (via a message queue) and the realtime server distributes it to any related connection.
Emphasis on the word "related". Assume you have 10 "main" servers and 50 "realtime" servers, and an event occurs on main server #5: which of the websockets would be considered related to this event?
Worst case is that any event on any "main" server would need to propagate to all websockets. That's O(N^2) complexity, which counts as a severe scalability impairment.
This O(N^2) complexity can only be prevented if you can group the related connections into groups that don't grow with the cluster size or the total number of connections. Grouping requires state memory to record which group(s) a connection belongs to.
Remember that there are three ways to store state:
global memory (memcached / redis / DB, ...)
sticky routing (load balancer configuration)
client memory (cookies, browser local storage, link/redirect URLs)
Option 3 counts as the most scalable one because it omits a central state storage.
As for passing the messages from the "main" to the "realtime" servers, that traffic should by definition be much smaller than the traffic towards the clients. There are also efficient frameworks for pushing pub/sub traffic.

Group chat application in Python using threads or asyncore

I am developing a group chat application to learn how to use sockets, threads (maybe), and the asyncore module (maybe).
My thought was to have a client-server architecture, so that when a client connects, the server sends it a list of the other connected clients (user name, IP address); a person can then connect to one or more people at a time, and the server would set up a P2P connection between the clients. I have the socket part working, but the server can only handle one client connection at a time.
What would be the best, most common, practical way to go about handling multiple connections?
Do I create a new process/thread whenever a new connection comes into the server and then connect the different client connections together? Or do I use the asyncore module which, from what I understand, makes the server send the same data to multiple sockets (connections) while I just regulate where the data goes?
Any help/thoughts/advice would be appreciated.
For a group chat application, the general approach will be:
Server side (accept process):
    Create the socket, bind it to a well-known port (on the appropriate interface) and listen.
    while app_running:
        client_socket = accept (using server_socket)
        Spawn a new thread and pass this socket to it; that thread handles the client that just connected.
        Continue, so that the server can accept more connections.
Server-side client-management thread:
    while app_running:
        Read the incoming message and store it in a queue or something.
        continue
Server side (group chat processing):
    For all connected clients:
        Check their queues. If any message is present, send it to ALL the connected clients (including the client that sent it -- this serves as a sort of ACK).
Client side:
    Create a socket.
    Connect to the server via IP address and port.
    Do send/receive.
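A minimal runnable sketch of the accept loop above (port number assumed, error handling omitted): one thread per client, every received message broadcast to all connected clients.

import socket
import threading

clients = []
lock = threading.Lock()

def handle(client_sock):
    with client_sock:
        while True:
            data = client_sock.recv(4096)
            if not data:
                break
            with lock:
                for c in clients:  # send to ALL clients, sender included (ACK)
                    c.sendall(data)
    with lock:
        clients.remove(client_sock)

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 9999))
server.listen()
while True:
    sock, addr = server.accept()
    with lock:
        clients.append(sock)
    threading.Thread(target=handle, args=(sock,), daemon=True).start()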
There can be lots of improvements on the above. For example, the server could poll the sockets, or use the "select" operation on a group of sockets. That would be more efficient, since having a separate thread for each connected client is overkill when there are many (think ~1 MB of stack per thread).
PS: I haven't really used the asyncore module, but I am guessing you would notice some performance improvement when you have lots of connected clients and very little processing.
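And a sketch of the polling alternative mentioned above, using the modern selectors module instead of asyncore (one thread handles many sockets; again, error handling omitted):

import selectors
import socket

sel = selectors.DefaultSelector()
server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 9999))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ)

while True:
    for key, _ in sel.select():
        if key.fileobj is server:
            conn, _ = server.accept()  # new client
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = key.fileobj.recv(4096)
            if data:
                for other in list(sel.get_map().values()):
                    if other.fileobj not in (server, key.fileobj):
                        other.fileobj.sendall(data)  # broadcast to the rest
            else:
                sel.unregister(key.fileobj)
                key.fileobj.close()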

Clustering TCP servers, so can send data to all clients

Important note:
I've asked this question already on Server Fault: https://serverfault.com/questions/349065/clustering-tcp-servers-so-can-send-data-to-all-clients, but I'd also like a programmer's perspective on the problem.
I'm developing a real-time mobile app by setting up a TCP connection between the app and server backend. Each user can send messages to all other users.
(I'm writing the TCP server in Python with Twisted, creating my own 'protocol' for communication between the app and the backend, and hosting it on Amazon Web Services.)
Currently I'm trying to make the backend scalable (and reliable). As far as I can tell, the system could cope with more users by upgrading to a bigger server (which could become rather limiting), or by adding new servers in a cluster configuration - i.e. having several servers sitting behind a load balancer, probably with one database they all access.
I have sketched out the rough architecture of this:
However what if the Red user sends a message to all other connected users? Red's server has a TCP connection with Red, but not with Green.
I can think of a one way to deal with this problem:
Each server could have an open TCP (or SSL) connection with every other server. When one server wants to send a message to all users, it simply passes it along its connections to the other servers. A record could be kept in the database of which servers are online (and their IP addresses), and one of the servers could act as a boss - i.e. it decides whether the others are up and running and, if not, removes them from the database. (If a server was up but lost its connection to the boss, it could check the database to see if it had been removed, and restart if so - otherwise it could assume the boss was down.)
Clearly this needs refinement but shows the general principle.
Alternatively, I'm not sure if this is possible (it definitely seems like wishful thinking on my part):
Perhaps users could just connect to a box or router, and all servers could message all users through it?
If you know how to cluster TCP servers effectively, or a design pattern that provides a solution, or have any comments at all, then I would be very grateful. Thank you :-)
You need to decide (or, if you already did, share these decisions with us) the reliability requirements for your system: should all messages be delivered to all users in every case (e.g. when one or more servers crash)? Can you tolerate sending the same message twice to the same user after a server crash? Your system's complexity depends directly on these decisions.
The simplest version is one where a message may not be delivered to all users if a server crashes. All your servers keep TCP connections to each other. One of them receives a message from a user and sends it to all the other users connected to this server and to all the other connected servers; the other servers then send the message to all of their users. To scale the system, you just run an additional server that connects to all the existing servers.
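The forwarding rule boils down to something like this (a sketch; local_clients and peer_servers are hypothetical connection lists):

def on_message(source, message):
    # deliver to every user connected to this server
    for client in local_clients:
        client.send(message)
    # only the server that received it from a user fans it out, so a
    # message arriving from a peer is never forwarded a second time
    if source == "user":
        for peer in peer_servers:
            peer.send(message)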
Have a look at how this is handled by IRC servers. They essentially do this already: everybody can send to everybody else, on all servers, or just to single users (also on another server), or to groups, called "channels". It works best by routing messages amongst the servers.
It's not that hard, if you can make sure the servers know each other and can talk to each other.
On a side note: on 9/11, the most reliable internet news source was the IRC network. All the WWW sites were down because of bandwidth; it took them ages to even get a plain-text web page back up. During this time, IRC networks were able to provide near real-time, moderated news channels across the Atlantic. You might no longer have been able to log into a server on the other side, but at least the servers were able to keep up a server-to-server connection.
An obvious choice is to use the DB as a clearinghouse for messages. You have to store incoming messages somewhere anyway, lest they be lost if a server suddenly crashes. Put incoming messages into the central database and have notification processes on the TCP servers grab the messages and send them to the correct users.
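A rough sketch of such a notification process (the table schema and the local_connections lookup are assumptions; sqlite3 stands in for the central database):

import time
import sqlite3

db = sqlite3.connect("messages.db")
while True:
    rows = db.execute(
        "SELECT id, user_id, body FROM messages WHERE delivered = 0"
    ).fetchall()
    for msg_id, user_id, body in rows:
        conn = local_connections.get(user_id)  # hypothetical user -> socket map
        if conn:
            conn.sendall(body.encode())
            db.execute("UPDATE messages SET delivered = 1 WHERE id = ?",
                       (msg_id,))
    db.commit()
    time.sleep(1)  # poll; a pub/sub notification would avoid this delay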
A TCP server cannot be clustered this way; the diagram you put here is a classic HTTP server example. Since the device opens a raw TCP connection to the server (a pure socket), there will be no way of establishing a load-balancing server.

Twisted: let one Factory fire event for the other on data update

I have seen this answer Sending data received in one Twisted factory to second factory , but my problem is somewhat different.
I have two servers, each listening on its own port for incoming connections. Server 1 (XMLFactory) receives a request from a client, saves the current state into a memcache key, and sends an HTTP request to an external server.
reactor.listenTCP(int(appPort), XMLFactory(), interface=XMLhost)
reactor.listenTCP(int(uptPort), UPTFactory(), interface=UPThost)
The external server returns a reply to Server 2 (UPTFactory). Server 2 updates the memcache key.
The issue is how to let Server 1's respective client connection know that the key was updated, so that it can update its client.
Your help is much appreciated.
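For what it's worth, a common pattern (essentially the one from the linked answer) is to hand one factory a reference to the other, so the second can call back into the first when the external reply arrives. A hedged sketch; the request-key scheme and port numbers are assumptions:

from twisted.internet import reactor
from twisted.internet.protocol import Factory, Protocol

class XMLProtocol(Protocol):
    def connectionMade(self):
        # remember the waiting client connection under some request key
        # (how the key is derived is an assumption here)
        self.factory.waiting["some-key"] = self

class XMLFactory(Factory):
    protocol = XMLProtocol

    def __init__(self):
        self.waiting = {}

    def notify_update(self, key, payload):
        # called by the other factory once the memcache key is updated
        conn = self.waiting.pop(key, None)
        if conn:
            conn.transport.write(payload)

class UPTProtocol(Protocol):
    def dataReceived(self, data):
        # external server replied: update the memcache key here, then
        # tell Server 1's factory so it can answer its waiting client
        self.factory.xml_factory.notify_update("some-key", data)

class UPTFactory(Factory):
    protocol = UPTProtocol

    def __init__(self, xml_factory):
        self.xml_factory = xml_factory

xml_factory = XMLFactory()
reactor.listenTCP(8001, xml_factory)
reactor.listenTCP(8002, UPTFactory(xml_factory))
reactor.run()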
