I am building a simple star-like client-server topology.
The idea is that clients connect to the server, can send messages, and the server can send messages to them whenever it decides to. There will be a relatively small number of clients, about 30, but enough that it is not sensible to broadcast all outgoing data to all of them. I'm sure I'm just being boneheaded, but this seems to be completely impossible with ZeroMQ.
That last requirement is the reason the existing questions on this topic do not provide an answer.
The catch is this:
I can use a ROUTER socket to receive messages from clients; that also carries the client identification. However, I cannot use the same socket for sending from another thread, since ZeroMQ sockets are not thread-safe, i.e. I can't have one thread waiting for incoming messages and another sending the server's own outgoing messages. Nor am I aware of any way to block on both socket.recv() and, for example, queue.get() at the same time in a single Python thread. Maybe there is a way to do that.
Using two sockets - one incoming, one outgoing - doesn't work either. The identification is not shared between sockets, so the sending socket would still have to be polled at least once to obtain the client-id mapping. We obviously can't use a separate port for each client. There seems to be no way for the server to send a message to a single client of its own volition.
(Subscription topics are a dead end too: message filtering is performed on the client side, so the server would just flood all client networks.)
In the end, plain TCP sockets can handle this sort of asynchronous situation easily, but effective message framing in Python is a nightmare to build. All I'm essentially after is a reliable socket that handles messages and has well-defined failure modes.
I don't know Python, but for C/C++ I would use zmq_poll(). There are several options, depending on your requirements.
Use zmq_poll() to wait for messages from clients. If a message arrives, process it. Also use a time-out. When the time-out expires, check if you need to send messages to clients and send them.
zmq_poll() can also wait on general file descriptors. You can use some type of file descriptor and trigger it (write to it) from another process or thread when you have a message to send to a client. If this file descriptor is triggered, send messages to clients.
Use ZeroMQ sockets internally inside your server. Use zmq_poll() to wait both on messages from clients and internal processes or threads. If the internal sockets are triggered, send messages to clients.
You can use the file descriptor or internal ZeroMQ sockets just for triggering but you can also send the message content through the file descriptor or ZeroMQ socket.
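As a rough Python (pyzmq) sketch of the internal-ZeroMQ-sockets option, the server thread below polls both the client-facing ROUTER socket and an internal inproc PULL socket; any other thread pushes (identity, payload) pairs into the inproc channel instead of touching the ROUTER socket directly. The endpoint names, the port and the handle() callback are made up for the illustration, and the sketch assumes DEALER clients (REQ clients add an extra empty frame).

import threading
import zmq

ctx = zmq.Context()

clients = ctx.socket(zmq.ROUTER)          # receives from, and sends to, the ~30 clients
clients.bind("tcp://*:5555")
internal = ctx.socket(zmq.PULL)           # internal trigger + message channel
internal.bind("inproc://outgoing")

poller = zmq.Poller()
poller.register(clients, zmq.POLLIN)
poller.register(internal, zmq.POLLIN)

def handle(identity, payload):
    pass                                  # hypothetical application callback

def server_loop():
    while True:
        events = dict(poller.poll())
        if clients in events:
            identity, payload = clients.recv_multipart()   # DEALER client: [id, payload]
            handle(identity, payload)
        if internal in events:
            identity, payload = internal.recv_multipart()
            clients.send_multipart([identity, payload])    # server-initiated send to one client

def send_to_client(identity, payload):
    # called from any other thread; a short-lived PUSH keeps the sketch simple,
    # a real sender would keep one PUSH socket per thread
    push = ctx.socket(zmq.PUSH)
    push.connect("inproc://outgoing")
    push.send_multipart([identity, payload])
    push.close()

threading.Thread(target=server_loop, daemon=True).start()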
Q : "ZeroMQ: How to construct simple asynchronous broker?"
The concept builds on a few assumptions that are not supported or do not hold :
a) Python threads never actually execute concurrently: the GIL re-serialises them into a sequence of solo execution blocks, and for any foreseeable future that will remain so (Guido van Rossum has explained that the GIL exists precisely to prevent such collisions; details on the GIL lock and its purpose are plentiful).
b) ZeroMQ thread-safety has nothing to do with using a blocking mode for operations.
c) The ZeroMQ PUB/SUB archetype does perform topic filtering, but on different sides of the connection depending on the version:
Until v3.1, the subscription mechanics (a.k.a. the TOPIC filter) were handled on the SUB side, so this part of the processing was distributed among all SUBs (at the cost of full data traffic across all the transports involved), and the PUB side paid no penalty beyond sourcing that data flow.
Since v3.1, the TOPIC filter is processed on the PUB side, at the cost of that processing overhead and the memory allocations, but saving all the transport capacity previously wasted on messages that the SUB side would only inspect, find not matching the filter, and discard.
Using a .poll()-based & zmq.NOBLOCK-modes of .recv()- & .send()-methods in the code design will never leave one in ambiguous, the less in an unsalvagable deadlock waiting-state and adds the capability to design even a lightweight priority-driven soft-scheduler for doing so with different relative priority levels.
Given your strong exposure to realtime systems, you might like to have a read into this, to review the ZeroMQ framework's properties.
Related
I'm currently working on a benchmark project where I'm trying to stress a server with zmq requests.
I was wondering what would be the best way to approach this. I was thinking of having one context create a socket for each thread, and in each thread send requests and wait for the responses, but I'm not sure this is possible given Python's limitations.
Moreover, would it be the same socket for all threads? That is, if I'm waiting for a response in one thread (with its own socket), would it be possible for another thread to catch that response?
Thanks.
EDIT:
Test flow logic would be like this:
Client socket would use zmq.REQ.
Client sends message.
Client waits for a response.
If there is no response, the client reconnects and retries, up to a limit.
I'd like to scale this up to any number of clients, preferably without dealing with processes unless the performance difference is significant.
How would you do this?
Q : "...can I have one context and use several sockets?"
Oh sure you can.
Moreover, you can have several Context() instances, each managing almost any number of Socket() instances, as long as each Socket() instance's methods are called from one and only one Python thread (a Zen-of-Zero rule: zero sharing).
Due to the GIL's re-serialisation of all thread-based code execution, each thread still has to wait to acquire GIL ownership, which then permits that GIL owner (and nobody else) to execute a bounded amount of Python bytecode before releasing the GIL to some other thread...
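A minimal pyzmq sketch of that layout, assuming a REP (or ROUTER) server is already listening on tcp://localhost:5555 (the endpoint, thread count and payload are made up for the example): one shared Context(), one REQ socket per thread. Each reply comes back only on the socket that sent the matching request, so no other thread can catch it.

import threading
import zmq

ctx = zmq.Context()                       # one process-wide context is enough

def worker(endpoint, n_requests):
    sock = ctx.socket(zmq.REQ)            # created and used by this thread only
    sock.connect(endpoint)
    for _ in range(n_requests):
        sock.send(b"ping")
        reply = sock.recv()               # routed back to *this* socket, nobody else's
    sock.close()

threads = [threading.Thread(target=worker, args=("tcp://localhost:5555", 100))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()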
I am trying to build a GUI for existing Python code using Electron.
The data flow is actually straightforward: the user interacts with the Electron app, which sends a request to the Python API, which processes the request and sends a reply.
So far, so good. I read different threads and blog posts:
1. ZeroRPC solutions:
https://medium.com/@abulka/electron-python-4e8c807bfa5e
https://github.com/fyears/electron-python-example
2. Spawn the Python API as a child process from Node.js and communicate directly:
https://www.ahmedbouchefra.com/connect-python-3-electron-nodejs-build-desktop-apps/
This doesn't seem like the smartest solution to me, since using ZeroRPC or ZeroMQ makes it easier to change the frontend architecture without touching the backend code.
3. Use ZeroMQ sockets (for example, an exclusive pair?)
https://zeromq.org/socket-api/#exclusive-pair-pattern
But with all three solutions, I struggle at the same point: I have to make the requests/replies asynchronous, because processing a request can take some time, and further requests can arrive in the meantime. This looks like a very common pattern to me, but I found nothing on SO; maybe I just don't know what exactly I am looking for.
Frontend                         Backend
     |                              |
REQ1 |----------------------------->| Process REQ1 ---+
     |                              |                 |
REQ2 |----------------------------->| Process REQ2 ---|---+
     |                              |                 |   |
REP1 |<-----------------------------| REPLY1 <--------+   |
     |                              |                     |
REP2 |<-----------------------------| REPLY2 <------------+
     |                              |
The most flexible solution seems to me to be option 3, ZeroMQ, but on the website and in the Python docs I found only minimal working examples, where both send and receive are blocking.
Could anybody give me a hint?
If you're thinking of using ZeroMQ, you are entering into the world of Actor model programming. In actor model programming, sending a message happens independently of receiving that message (the two activities are asynchronous).
What ZeroMQ means by Blocking
When ZeroMQ talks about a send "blocking", it means that the internal buffer ZeroMQ uses to queue up messages prior to transmission is full, so it blocks the sending application until there is space available in that queue. The queue is emptied by the successful transfer of earlier messages to the receiver, which has its own receive buffer that has to be emptied by the receiving application. The thing that actually transfers the messages is the management thread(s) belonging to the ZeroMQ context.
This management thread is the crucial part; it runs independently of your own application threads, and that is what makes the communication between sender and receiver asynchronous.
What you likely want is to use ZeroMQ's reactor, zmq_poll(). Typically in actor model programming you have a loop, and at the top is a call to the reactor (zmq_poll() in this case). zmq_poll() tells you when something has happened, but here you'd primarily be interested in it telling you that a message has arrived. Typically you'd then read that message, process it (which may involve sending out other ZeroMQ messages), and loop back to zmq_poll().
Backend
So your backend would be something like:
while (forever)
{
zmq_poll(list of input sockets) // allows serving more than one socket
zmq_recv(socket that has a message ready to read) // will always succeed immediately because zmq_poll() told us there was a message waiting
decode req message
generate reply message
zmq_send(reply to original requester) // Socket should be in blocking mode to ensure that messages don't get lost if something is unexpectedly running slowly
}
If you don't need to serve more than one front end, it's simpler:
while (forever)
{
zmq_recv(req) // Socket should be in blocking mode
decode req message
generate reply message
zmq_send(reply) // Socket should also be in blocking mode to ensure that messages don't get lost if something is unexpectedly running slow
}
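As a rough pyzmq translation of the single-front-end loop above (the endpoint and the JSON payloads are just placeholders for the illustration; for the multi-socket variant you would register the sockets with a zmq.Poller, as in the first pseudocode block):

import zmq

ctx = zmq.Context()
sock = ctx.socket(zmq.REP)
sock.bind("tcp://*:5555")

while True:
    request = sock.recv_json()                    # blocking receive of the request
    reply = {"status": "ok", "echo": request}     # stand-in for the real processing
    sock.send_json(reply)                         # blocking send of the reply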
Frontend
Your front end will be different. Basically, you'll need the Electron event loop handler to take over the role of zmq_poll(). A build of ZeroMQ for use within Electron will have taken care of that. But basically it will come down to GUI event callbacks sending ZeroMQ messages. You will also have to write a callback for Electron to run when a message arrives on the socket from the backend. There'll be no blocking in the front end between sending and receiving a message.
Timing
This means that the timing diagram you've drawn is wrong. The front end can send out as many requests as it wants, but there's no timing alignment between those requests departing and arriving in the backend (though assuming everything is running smoothly, the first one will arrive pretty much straight away). Having sent a request or requests, the front end simply returns to doing whatever it wants (which, for a User Interface, is often nothing but the event loop manager waiting for an event).
The backend will be in a loop of read/process/reply, read/process/reply, handling the requests one at a time. Again, there is no timing alignment between those replies departing and subsequently arriving in the front end. When a reply does arrive back in the front end, it wakes up and deals with it.
I've got a program which receives information from about 10 other (sensor reading) programs (all controlled by myself). I now want to make them communicate using ZeroMQ.
For most of the queues the important thing is that the central receiving program always has the latest sensor data; older messages are no longer important. If a couple of messages get lost, I don't care. So for all of them I started out with a separate PUB/SUB socket, one for each program. But I'm not sure that is the right way to do it. As far as I understand, I have two options:
Make a separate socket for every program and read them in a loop. That way I know from the socket what information I'm receiving (I'm often just sending an int).
Make one socket to which all the programs connect, and with every message I send a string which tells the receiving end what the message is about.
All connections are on a PUB/SUB basis, so creating one socket would work out fine. I'm just not sure if that is the most efficient way to do it.
All tips are welcome!
PUB/SUB is fine and allows an easy conversion from N-sensors:1-logger into N-sensors:2+-loggers. One might also benefit from the conceptual separation of a socket from an access port, where more than one socket may get connected.
How to always get JUST THE CURRENT ( LAST ) SENSOR READOUT:
Unless you are bound, by system-integration constraints, to some early ZeroMQ API, there is a lovely feature for exactly this, via a .setsockopt( ZMQ_CONFLATE, True ) call:
ZMQ_CONFLATE: Keep only last message
If set, a socket shall keep only one message in its inbound/outbound queue, this message being the last message received/the last message to be sent. Ignores ZMQ_RCVHWM and ZMQ_SNDHWM options. Does not support multi-part messages, in particular, only one part of it is kept in the socket internal queue.
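A minimal pyzmq sketch of a conflating subscriber, assuming a sensor publishes on tcp://sensor-host:6000 (the endpoint is made up for the example); note the option is set before connect():

import zmq

ctx = zmq.Context()
sub = ctx.socket(zmq.SUB)
sub.setsockopt(zmq.CONFLATE, 1)           # keep only the newest message in the queue
sub.connect("tcp://sensor-host:6000")
sub.setsockopt(zmq.SUBSCRIBE, b"")        # no topic filtering, take everything

while True:
    latest = sub.recv()                   # always the most recent readout still queued
    # hand `latest` over to the logging / control logic here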
On the design dilemma:
Unless your real-time control stability imposes some hard-real-time limit, the PUB side freely decides how often a new value gets .send()-ed to the SUB(s). No magic is needed here, especially with the ZMQ_CONFLATE option set on the managed internal outgoing queue.
The SUB-side receiver(s) will also benefit from the ZMQ_CONFLATE option set on the managed internal incoming queue. Given that a set of individual .bind()s instantiates separate landing ports for delivering the different individual sensor readouts, your "last" values will consistently remain the last readouts per sensor. If all readouts went into one common landing pad, your receiving process would lose (have masked out) all readouts except whichever one happened to be the "last" right before .recv() took place, which would not help much, would it?
If some I/O-performance tweaking becomes necessary, the .Context( n_IO_threads ) and ZMQ_AFFINITY-mapping options can increase and prioritise the resources the I/O data pump may harness for better I/O performance.
Unless you're up against a tight real-time requirement, there's not much point in having more sockets than necessary. ZMQ's fair queuing ought to take care of giving each sensor program equal attention (see Figure 6 in the guide).
If your sensor programs are on other devices connected by Ethernet, the ultimate performance of your programs is limited by the bandwidth of the Ethernet NIC in your computer. A single thread program handling a single PULL socket stands a good chance of being able to process the data coming in faster than it can transit the NIC.
If that's so, then you may as well stick to a single socket and enjoy the simpler code. It's not very hard dealing with multiple sockets, but it's far easier to deal with one. For example, with one single socket you don't have to tell each sensor program what network port to connect to - it can be a constant.
PUSH/PULL sounds like a more natural pattern for your situation than PUB/SUB, but that won't make much difference.
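A sketch of that single-socket PUSH/PULL layout (the port number and frame layout are made up for the example): every sensor connects to one well-known PULL endpoint and identifies itself in the first frame, so one socket serves all of them.

import zmq

ctx = zmq.Context()
collector = ctx.socket(zmq.PULL)          # fair-queued across all connected sensors
collector.bind("tcp://*:5556")

while True:
    sensor_id, value = collector.recv_multipart()
    # sensor_id tells you which program sent the reading

# each sensor program (a separate process) would do something like:
#   push = zmq.Context().socket(zmq.PUSH)
#   push.connect("tcp://collector-host:5556")
#   push.send_multipart([b"sensor-7", str(reading).encode()])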
Lastness
Lastness is going to be your (potential) problem. The whole point of things like ZMQ is that they deliver messages in the order they're sent. So when you read a message, it is by definition the "last" message so far as the recipient is concerned; the recipient has no idea whether another message is already on the way, in transit.
This is a feature of Actor model architectures (which is what ZMQ is). Messages get buffered up in the transport, and there's no information about the newness of the message to be learned when it's read. All you know is that it was sent some time beforehand. There is no execution rendezvous with the sender.
Now, you either process it as if it is the last message, or you wait for a period of time to see if another one comes along before processing it. The easiest thing to do is to simply process each message as if it is the last.
Contrast this with a Communicating Sequential Processes architecture. It's basically the same as an Actor model architecture, except that the transport does not buffer messages. Message sends block until the recipient has called message read.
Thus when you read a message, the recipient knows that it is the last one sent by the sender. And the sender knows that the message it has sent has been received at that very instant by the recipient. So the knowledge of lastness is absolute - the message received really is the last one sent.
However, unless you have something fairly heavyweight going on I wouldn't worry about it. You are quite likely to be able to keep up with your sensor data stream even if the messages you're reading aren't the latest in the queue.
You can nearly turn ZMQ into CSP by setting the high-water mark on the sending end's socket to 1. That means at most one message can be buffered. That's not the same as 0; unfortunately, setting the HWM to 0 means "unlimited buffer size".
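In pyzmq that would be something like the following (the socket type and endpoint are illustrative; set the option before bind()/connect() so it applies to the connection):

import zmq

ctx = zmq.Context()
pub = ctx.socket(zmq.PUB)
pub.setsockopt(zmq.SNDHWM, 1)             # buffer at most one outgoing message
pub.bind("tcp://*:6001")                  # 0 would mean "no limit", not "no buffer"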
Consider the following scenario: A process on the server is used to handle data from a network connection. Twisted makes this very easy with spawnProcess and you can easily connect the ProcessTransport with your protocol on the network side.
However, I was unable to determine how Twisted handles a situation where the data from the network arrives faster than the process reads it from its standard input. As far as I can see, Twisted code mostly uses an internal buffer (self._buffer or similar) to store unconsumed data. Doesn't this mean that concurrent requests over a fast connection (e.g. a local gigabit LAN) could fill up main memory and induce heavy swapping, making the situation even worse? How can this be prevented?
Ideally, the internal buffer would have an upper bound. As I understand it, the OS's networking code would automatically stall the connection/start dropping packets if the OS's buffers are full, which would slow down the client. (Yes I know, DoS on the network level is still possible, but this is a different problem). This is also the approach I would take if implementing it myself: just don't read from the socket if the internal buffer is full.
Restricting the maximum request size is also not an option in my case, as the service should be able to process files of arbitrary size.
The solution has two parts.
One part is called producers. Producers are objects that data comes out of. A TCP transport is a producer. Producers have a couple useful methods: pauseProducing and resumeProducing. pauseProducing causes the transport to stop reading data from the network. resumeProducing causes it to start reading again. This gives you a way to avoid building up an unbounded amount of data in memory that you haven't processed yet. When you start to fall behind, just pause the transport. When you catch up, resume it.
The other part is called consumers. Consumers are objects that data goes in to. A TCP transport is also a consumer. More importantly for your case, though, a child process transport is also a consumer. Consumers have a few methods; one in particular is useful to you: registerProducer. This tells the consumer which producer data is coming to it from. The consumer can then call pauseProducing and resumeProducing according to its ability to process the data. When a transport (TCP or process) cannot send data as fast as a producer is asking it to send data, it will pause the producer. When it catches up, it will resume it again.
You can read more about producers and consumers in the Twisted documentation.
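A rough sketch of how the two halves could fit together, assuming the child is spawned per connection ("cat" stands in for the real worker program, and port 8000 is arbitrary); the key line is the registerProducer() call, which lets the process transport pause and resume the TCP transport when the child's stdin falls behind:

from twisted.internet import protocol, reactor

class ChildProcess(protocol.ProcessProtocol):
    """Talks to the spawned process; its transport is a consumer."""
    def __init__(self, net_transport):
        self.net_transport = net_transport

    def connectionMade(self):
        # couple the producer (TCP transport) to the consumer (process transport)
        self.transport.registerProducer(self.net_transport, True)

    def outReceived(self, data):
        self.net_transport.write(data)    # child's stdout goes back to the client

class NetworkProtocol(protocol.Protocol):
    """Talks to the network client; its transport is a producer."""
    def connectionMade(self):
        self.child = ChildProcess(self.transport)
        reactor.spawnProcess(self.child, "cat", ["cat"])

    def dataReceived(self, data):
        self.child.transport.write(data)  # forward network data to the child's stdin

reactor.listenTCP(8000, protocol.Factory.forProtocol(NetworkProtocol))
reactor.run()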
I plan to use Unix named pipes (mkfifo) for simple multi-process messaging.
A message would be just a single line of text.
Would you discourage me from that? What obstacles should I expect?
I have noticed these limitations:
A sender cannot continue until the message is received.
A receiver is blocked until there is some data. Non-blocking I/O would be needed when we need to stop reading, for example because another thread asks for that.
The receiver could obtain many messages in a single read. These have to be processed before quitting.
The maximum length of an atomic message is 4096 bytes; that is the PIPE_BUF limit on Linux (see man 7 pipe).
I will implement the messaging in Python. But the obstacles hold in general.
Lack of portability - they are mainly a Unix thing. Sockets are more portable.
Harder to scale out to multiple systems (another point for sockets).
On the other hand, I believe pipes are faster than sockets for processes on the same machine (less communication overhead).
As to your limitations,
You can "select" on pipes, to do a non-blocking read.
I normally (in perl) print out my messages on pipes seperated by "\n", and read a line from them to get one message at a time.
Do be careful with the atomic length.
I find perlipc to be a good discussion of the various options, though it has Perl-specific code.
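Since you plan to implement this in Python, here is a rough sketch of a select()-based reader that handles both the blocking and the many-messages-per-read issues; the FIFO path is arbitrary, and a real reader would also need a clean shutdown path:

import os
import select

FIFO_PATH = "/tmp/msg_fifo"
if not os.path.exists(FIFO_PATH):
    os.mkfifo(FIFO_PATH)

fd = os.open(FIFO_PATH, os.O_RDONLY | os.O_NONBLOCK)   # don't block waiting for a writer

buffer = b""
while True:
    readable, _, _ = select.select([fd], [], [], 1.0)  # 1 s timeout: a chance to check for stop requests
    if fd in readable:
        chunk = os.read(fd, 4096)                      # PIPE_BUF-sized reads
        if not chunk:                                  # EOF: the writer closed its end
            os.close(fd)
            fd = os.open(FIFO_PATH, os.O_RDONLY | os.O_NONBLOCK)
            continue
        buffer += chunk
        *messages, buffer = buffer.split(b"\n")        # one message per line, keep the partial tail
        for msg in messages:
            print("got:", msg.decode())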
The blocking, both on the sender side and the receiver side, can be worked around via non-blocking I/O.
Further limitations of FIFOs:
Only one client at a time.
After the client closes the FIFO, the server needs to re-open its endpoint.
Unidirectional.
I would use UNIX domain sockets instead, which have none of the above limitations.
As an added benefit, if you want to scale it up to communicate between multiple machines, it's barely any change at all. For example, just take the Python documentation page on socket and replace socket.AF_INET with socket.AF_UNIX and (HOST, PORT) with a filename, and it just works.
SOCK_STREAM will give you stream-like behavior; that is, two sends may be merged into one receive or vice versa. AF_UNIX also supports SOCK_DGRAM: datagrams are guaranteed to be sent and read all as one unit or not at all. (Analogously, AF_INET+SOCK_STREAM=TCP, AF_INET+SOCK_DGRAM=UDP.)
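A minimal sketch of that swap, using SOCK_DGRAM so that each sendto() arrives as one whole message (the socket path is arbitrary, and the two halves run in separate processes):

import os
import socket

ADDRESS = "/tmp/demo.sock"                # a filesystem path replaces (HOST, PORT)

# receiver process
if os.path.exists(ADDRESS):
    os.remove(ADDRESS)
receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
receiver.bind(ADDRESS)
message, _ = receiver.recvfrom(4096)      # one datagram == one complete message
print(message.decode())

# sender process
sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
sender.sendto(b"a single line of text", ADDRESS)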