Limit number of connections to a rabbit queue? - python

I use pika-0.10.0 with rabbitmq-3.6.6 broker on ubuntu-16.04. I designed a Request/Reply service. There is a single Request queue where all clients push their requests. Each client creates a unique Reply queue: the server pushes replies targeting this client to this unique queue. My API can be seen as two messages: init and run.
init messages contain big images, thus init is a big and slow request. run messages are lighter and the server reuses previously sent images. The server can serve multiple clients. Usually client#1 sends init and then run multiple times. If client#2 comes in and sends init, it will replace the images sent by client#1 on the server, and subsequent run requests issued by client#1 would use the wrong images. So I am asking:
is it possible to limit the number of connections to a queue? E.g. the server serves one client at a time.
another option would be: the server binds images to a client, saves them, and reuses them when this client sends run. It requires more work, and will impact performance if two or more clients' requests are closely interleaved.
sending the images in each run request is not an option; it would be too slow.

I think you have a problem in your design. Logically, each run corresponds to a certain init, so they have to be connected. I'd put a correlation id field into the init and run events. When the server receives a run, it checks whether a corresponding init was processed and uses the result of that init.
Speaking of performance:
You can make init a worker queue and have multiple processing servers listen to it. There is an example in the RabbitMQ docs.
Then, when an init request comes in, one of the available servers will pick it up and store your images together with the correlation ID. If you have multiple init requests at the same time - no problem, they will be processed eventually (or simultaneously if servers are free).
Then the server that did the processing sends a reply message to the client queue saying the init work is done, along with the name of the queue where run requests have to be published.
When ready, the client sends its run request to the correct queue.
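A minimal sketch of the bookkeeping this implies on the server side (the function names and the in-memory dict are illustrative assumptions, not pika API; in a real deployment the store would have to be shared between worker processes, e.g. via Redis or disk):

```python
# Toy sketch: the server keeps images keyed by correlation id,
# so a later 'run' can find the images its 'init' stored.
images_by_client = {}  # correlation_id -> images

def handle_init(correlation_id, images):
    images_by_client[correlation_id] = images
    return "init done"

def handle_run(correlation_id, params):
    images = images_by_client.get(correlation_id)
    if images is None:
        raise KeyError("no init seen for %s" % correlation_id)
    return do_work(images, params)

def do_work(images, params):
    # placeholder for the actual image processing
    return (len(images), params)
```

With this in place, a run from client#1 can never pick up client#2's images, because the lookup goes through the correlation id rather than "whatever was loaded last".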
To directly answer the question:
there is a common misconception that you publish to a queue. In RabbitMQ you publish to an exchange, which takes care of routing your messages to a number of queues. So your question really becomes: can I limit the number of publishing connections to an exchange? I'm pretty sure there is no way of doing so on the broker side.
Even if there was a way of limiting number of connections, imagine the situation:
Client1 comes in, pushes its init request.
Client1 holds its connection, waiting to push run.
Client1 fails or a network partition occurs; its connection gets dropped.
Client2 comes in and pushes its init request.
Client2 fails.
Client1 comes back up, pushes its run, and gets Client2's images.
Connection is a transient thing and cannot be relied upon as a transaction mechanism.

Related

Long polling scalable architecture in tornado/cyclone

I want to implement long polling in python using cyclone or tornado, with scalability of the service in mind from the beginning. Clients might connect to this service for hours. My concept:
Client HTTP requests will be processed by multiple tornado/cyclone handler threads behind an NGINX proxy (serving as a load balancer). There will be multiple data queues for requests: one for all unprocessed requests from all clients, and the rest containing responses specific to each connected client, previously generated by worker processes. When requests are delivered to the tornado/cyclone handler threads, the request data will be sent to the worker queue for processing and then processed by workers (which connect to the database etc.). Meanwhile, the tornado/cyclone handler thread will look into the client-specific queue and send the response with data back to the client (if there is one waiting in the queue). Please see the diagram.
Simple diagram: https://i.stack.imgur.com/9ZxcA.png
I am considering queue system because some requests might be pretty heavy on database and some requests might create notifications and messages for other clients. Is this a way to go towards scalable server or is it just overkill?
After doing some research I have decided to go with tornado websockets connected to zeroMQ. Inspired by this answer: Scaling WebSockets with a Message Queue.

What is the proper way to have an asynchronous python script wait for a response from a server?

I currently have a program that does work on a large set of data. At one point in the process it sends the data to a server for more work to be done; then my program periodically checks for the completed data, sleeping if it is not ready and repeating until it fetches the data, then continuing to do work locally.
Instead of polling repeatedly until the external server has finished, it has the ability to send a simple http post to an address I designate once the work has finished.
So I assume I need flask running at an address that can receive the notification, but I'm unsure of the best way to incorporate flask into the original program. I am thinking of just splitting my program into two parts.
part1.py
does work --> send to external server
part1 ends
flask server.py
receives data --> spawns part2.py with received data
The original program uses multiprocessing pools to offset waiting for the server responses, but with using flask, can I just repeatedly spawn new instances of part2 to do work on the data as it is received?
Am I doing this all completely wrong? I've just put this together with some googling and feel out of my depth.
You can use a broker with a message queue, e.g. Celery + Redis or RabbitMQ. Then, when the other server finishes doing whatever it has to do with the data, it can produce an event, and the first server will receive a notification.
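As a toy illustration of that event pattern, here is the shape of it with a `queue.Queue` standing in for the broker (with Celery + Redis or RabbitMQ the mechanics differ, but the idea is the same: the waiting side blocks on a subscription instead of polling):

```python
import queue
import threading

broker = queue.Queue()  # stand-in for Redis/RabbitMQ

def external_server(data):
    # ...heavy work would happen here, then a "done" event is produced
    broker.put({"event": "done", "result": data.upper()})

def part2(result):
    # whatever part2.py would do with the received data
    return "processed " + result

def wait_for_notification():
    # the first program blocks on the broker instead of polling over HTTP
    msg = broker.get(timeout=5)
    return part2(msg["result"])
```

Here `external_server` would really be the remote service publishing to the broker when its work finishes; nothing sleeps and re-checks.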

Synchronizing socket programming python

I have a client-server application consisting of three rounds. In each round the client sends a file to the server; the server computes something and sends it back to the client. Based on the received message, the client prepares the message for the next round, and so on.
The application sometimes works smoothly, sometimes not. I guess the problem is some sort of lack of synchronization between the rounds. For example before the client sends the message for the second round the server already starts its second round, which creates problems.
I do not use any module for networking apart from sockets and ThreadedTCPHandler. How can I make my application wait for the other network entity to send its message before continuing its execution, without creating deadlocks?
Have a look at ZeroMQ and its Python client pyzmq. It provides a somewhat easier way to write client/server or distributed applications.
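If you stay on raw sockets, the usual cause of this kind of intermittent race is that `recv()` can return a partial message, so one side starts its next round on incomplete data. ZeroMQ solves this by giving you whole messages; with plain sockets you can get the same guarantee with length-prefix framing. A minimal sketch (helper names are illustrative):

```python
import socket
import struct

def send_msg(sock, payload):
    # prefix every message with its 4-byte big-endian length
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exactly(sock, n):
    # loop until exactly n bytes have arrived; recv() may return less
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return buf

def recv_msg(sock):
    # blocks until one complete message has arrived
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)
```

Each side calls `recv_msg()` and does not start its next round until it returns a complete message; as long as both sides agree on who sends first in each round, there is no deadlock.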

Scaling a decoupled realtime server alongside a standard webserver

Say I have a typical web server that serves standard HTML pages to clients, and a websocket server running alongside it used for realtime updates (chat, notifications, etc.).
My general workflow is when something occurs on the main server that triggers the need for a realtime message, the main server sends that message to the realtime server (via a message queue) and the realtime server distributes it to any related connection.
My concern is, if I want to scale things up a bit, and add another realtime server, it seems my only options are:
Have the main server keep track of which realtime server the client is connected to. When that client receives a notification/chat message, the main server forwards that message along to only the realtime server the client is connected to. The downside here is code complexity, as the main server has to do some extra bookkeeping.
Or instead have the main server simply pass that message along to every realtime server; only the server the client is connected to would actually do anything with it. This would result in a number of wasted messages being passed around.
Am I missing another option here? I'm just trying to make sure I don't go too far down one of these paths and realize I'm doing things totally wrong.
If the scenario is
a) The main web server raises a message upon an action (let's say a record is inserted)
b) it notifies the appropriate real-time server
you could decouple these two steps by using an intermediate pub/sub architecture that forwards the messages to the intended recipient.
An implementation would be
1) You have a Redis pub/sub channel; when a client connects to a real-time socket, you start listening on that channel.
2) When the main app wants to notify a user via the real-time server, it pushes a message to the channel; the real-time server gets it and forwards it to the intended user.
This way, you decouple the realtime notification from the main app and you don't have to keep track of where the user is.
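A toy in-memory stand-in for that channel (with Redis you would use PUBLISH/SUBSCRIBE, e.g. via redis-py, rather than the dict below, which is purely illustrative):

```python
# Toy pub/sub: each realtime server subscribes for the users connected
# to it; the main app publishes without knowing where the user lives.
subscribers = {}  # user_id -> callback registered by a realtime server

def subscribe(user_id, deliver):
    # called by a realtime server when the user's socket connects
    subscribers[user_id] = deliver

def publish(user_id, message):
    # called by the main app; the channel does the routing
    deliver = subscribers.get(user_id)
    if deliver is not None:
        deliver(message)
```

The main app only ever calls `publish`; which realtime server (if any) receives the message is entirely the channel's concern.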
The problem you are describing is the common "message backplane" used for example in SignalR, also related to the "fanout message exchange" in messaging architectures. When using a backplane or doing fanout, every message is forwarded to every message node server, so clients can connect to any server and get the message. This approach is reasonable when you have to support both long polling and websockets. However, as you noticed, it is a waste of traffic and resources.
You need to use a message infrastructure with intelligent routing, like RabbitMQ. Take a look at topic and header exchanges: https://www.rabbitmq.com/tutorials/amqp-concepts.html
How Topic Exchanges Route Messages
RabbitMQ for Windows: Exchange Types
There are tons of different queuing frameworks. Pick the one you like, but ensure you can have more exchange modes than just direct or fanout ;) In the end, a WebSocket is just an endpoint to connect to a message infrastructure. So if you want to scale out, it boils down to the backend you have :)
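To see why topic routing avoids the fanout waste, here is the matching rule a topic exchange applies, re-implemented in plain Python for illustration (the broker does this for you; this sketch just shows the semantics):

```python
import re

def topic_matches(binding, routing_key):
    """AMQP topic semantics: words are dot-separated, '*' matches exactly
    one word, '#' matches zero or more words (absorbing an adjacent dot)."""
    pattern = re.escape(binding)
    pattern = pattern.replace(r"\#\.", r"(?:[^.]+\.)*")     # leading/inner '#.'
    pattern = pattern.replace(r"\.\#", r"(?:\.[^.]+)*")     # trailing '.#'
    pattern = pattern.replace(r"\#", r"[^.]*(?:\.[^.]+)*")  # bare '#'
    pattern = pattern.replace(r"\*", r"[^.]+")              # '*' = one word
    return re.fullmatch(pattern, routing_key) is not None
```

A realtime server binds only for the rooms/users it actually hosts (e.g. `chat.room42.*`), so a message published with routing key `chat.room42.alice` reaches that server and no other.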
For just a few realtime servers, you could conceivably just keep a list of them in the main server and just go through them round-robin.
Another approach is to use a load balancer.
Basically, you'll have one dedicated node to receive the requests from the main server, and then have that load-balancer node take care of choosing which websocket/realtime server to forward the request to.
Of course, this just shifts the code complexity from the main server to a new component, but conceptually I think it's better and more decoupled.
Changed the answer because a reply indicated that the "main" and "realtime" servers are already load-balanced clusters and not individual hosts.
The central scalability question seems to be:
My general workflow is when something occurs on the main server that triggers the need for a realtime message, the main server sends that message to the realtime server (via a message queue) and the realtime server distributes it to any related connection.
Emphasis on the word "related". Assume you have 10 "main" servers and 50 "realtime" servers, and an event occurs on main server #5: which of the websockets would be considered related to this event?
Worst case is that any event on any "main" server would need to propagate to all websockets. That's O(N^2) complexity, which counts as a severe scalability impairment.
This O(N^2) complexity can only be prevented if you can group the related connections into groups that don't grow with the cluster size or the total number of connections. Grouping requires state memory to store which group(s) a connection belongs to.
Remember that there are 3 ways to store state:
global memory (memcached / redis / DB, ...)
sticky routing (load balancer configuration)
client memory (cookies, browser local storage, link/redirect URLs)
Where option 3 counts as the most scalable one because it omits a central state storage.
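With option 1, the group bookkeeping can be as simple as a shared map from group to the realtime servers hosting related connections, so an event publishes to a handful of servers instead of all of them (a toy in-memory sketch; in production the dict would live in Redis or memcached):

```python
# group -> set of realtime servers holding at least one connection in
# that group; stored centrally so any "main" server can look it up
group_members = {}

def register(group, realtime_server):
    # called when a connection belonging to `group` lands on a server
    group_members.setdefault(group, set()).add(realtime_server)

def servers_for_event(group):
    # an event targets only the servers hosting related connections,
    # not every realtime server in the cluster
    return group_members.get(group, set())
```

Group names like `"room:42"` are hypothetical; the point is that the target set is bounded by group size, not cluster size.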
For passing the messages from the "main" to the "realtime" servers, that traffic should by definition be much smaller than the traffic towards the clients. There are also efficient frameworks to push pub/sub traffic.

Which web servers are compatible with gevent and how do the two relate?

I'm looking to start a web project using Flask and its SocketIO plugin, which depends on gevent (something something greenlets), but I don't understand how gevent relates to the webserver. Does using gevent restrict my server choice at all? How does it relate to the different levels of web servers that we have in python (e.g. Nginx/Apache, Gunicorn)?
Thanks for the insight.
First, let's clarify what we are talking about:
gevent is a library that makes it easy to program with event loops. It is a way to immediately return responses without "blocking" the requester.
socket.io is a JavaScript library for creating clients that can maintain permanent connections to servers, which send events. The library can then react to these events.
greenlet: think of this as a thread; a way to launch multiple workers that do some tasks.
A highly simplified overview of the entire process follows:
Imagine you are creating a chat client.
You need a way to update the users' screens when anyone types a message. For this to happen, you need some way to tell all the users when a new message is there to be displayed. That's what socket.io does. You can think of it like a radio that is tuned to a particular frequency. Whenever someone transmits on this frequency, the code does something. In the case of the chat program, it adds the message to the chat box window.
Of course, if you have a radio tuned to a frequency (your client), then you need a radio station/dj to transmit on this frequency. Here is where your flask code comes in. It will create "rooms" and then transmit messages. The clients listen for these messages.
You can also write the server-side ("radio station") code in socket.io using node, but that is out of scope here.
The problem here is that traditionally - a web server works like this:
A user types an address into a browser, and hits enter (or go).
The browser reads the web address, and then using the DNS system, finds the IP address of the server.
It creates a connection to the server, and then sends a request.
The webserver accepts the request.
It does some work, or launches some process (depending on the type of request).
It prepares (or receives) a response from the process.
It sends the response to the client.
It closes the connection.
Between 3 and 8, the client (the browser) is waiting for a response - it is blocked from doing anything else. So if there is a problem somewhere, like say, some server side script is taking too long to process the request, the browser stays stuck on the white page with the loading icon spinning. It can't do anything until the entire process completes. This is just how the web was designed to work.
This kind of 'blocking' architecture works well for 1-to-1 communication. However, for multiple people to keep updated, this blocking doesn't work.
The event libraries (gevent) help with this because they accept the request without blocking the client; they return a response immediately while the process completes in the background.
Your application, however, still needs to notify the client. However, as the connection is closed - you don't have a way to contact the client back.
In order to notify the client and to make sure the client doesn't need to "refresh", a permanent connection should be open - that's what socket.io does. It opens a permanent connection, and is always listening for messages.
So a work request comes in from one end and is accepted.
The work is executed and a response is generated by something else (it could be the same program or another program).
Then, a notification is sent "hey, I'm done with your request - here is the response".
The person from step 1, listens for this message and then does something.
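The four steps above, sketched as plain callbacks (a toy model; gevent and socket.io replace the direct calls with an event loop and a persistent socket, and the names below are illustrative):

```python
listeners = []  # clients with a "permanent connection" register here

def connect(on_message):
    # step 4's listener: a connected client waiting for events
    listeners.append(on_message)

def handle_request(data):
    # step 1: the request is accepted immediately; nothing blocks
    ack = "accepted"
    # step 2: the work is executed by something else (inline here)
    response = data[::-1]
    # step 3: notify everyone listening that the work is done
    for on_message in listeners:
        on_message({"event": "done", "response": response})
    return ack
```

The client gets its acknowledgement right away, and the actual result arrives later over the open "radio" connection.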
Underneath it all is WebSocket, a full-duplex protocol that enables all this radio/DJ functionality.
Things common between WebSockets and HTTP:
Work on the same port (80)
WebSocket requests start off as HTTP requests for the handshake (an upgrade header), but then shift over to the WebSocket protocol - at which point the connection is handed off to a websocket-compatible server.
All your traditional web server has to do is listen for this handshake request, acknowledge it, and then pass the request on to a websocket-compatible server - just like any other normal proxy request.
For Apache, you can use mod_proxy_wstunnel
nginx versions 1.3+ have websocket support built in.
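For nginx, the proxy configuration boils down to forwarding the Upgrade handshake headers (the `location` path and upstream address below are placeholders; adjust to your setup):

```nginx
location /socket.io/ {
    proxy_pass http://127.0.0.1:5000;
    proxy_http_version 1.1;
    # forward the handshake so the connection can switch protocols
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
}
```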
