Python server for hardware control (possibly with Twisted?)


I'm currently programming a server that lets clients interact with a piece of hardware. For interested readers, it is a device that monitors the wavelengths of a set of lasers concurrently (and controls the lasers). The server should be able to broadcast the wavelengths (a list of floats) on a regular basis and let the clients change the settings of the device through DLL calls.
My initial idea was to write a custom protocol to handle the communication, but after thinking about how to handle TCP fragmentation and data encoding I bumped into Twisted, and it looks like most of the work is already done if I use Perspective Broker to share the data and call server methods directly from the clients. This solution might be a bit overkill, but to me it seemed the obvious choice. What do you think?
My main concern arose when I thought about the clients. Basically I need two types of clients: one which just displays the wavelengths (this should be straightforward) and a second which can change the device settings and get feedback when they have changed. My idea was to create a single client capable of both, but thinking about combining it with our previous system got me thinking... The second client should be controlled from an already rather complex Python framework which controls a lot of independent hardware with relatively strict timing requirements, and the settings of the wavelength meter should then be changed from within this sequential code. Now the thing is, how do I mix this with the Twisted client? As I understand it, Twisted is not thread-safe, so I can't simply spawn a new thread running the reactor and then interact with it from my main thread, can I?
Any suggestions for writing this server/client framework through different means than Twisted are very welcome!
Thanks

You can start the reactor in a dedicated thread, and then issue calls to it with blockingCallFromThread from your existing "sequential" code.
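The pattern can be sketched with the standard library's asyncio in place of Twisted, so that the example has no third-party dependency. This is only an analogue, not Twisted code: in Twisted the corresponding steps would be running reactor.run(installSignalHandlers=False) in the dedicated thread and calling twisted.internet.threads.blockingCallFromThread(reactor, f, ...) from the sequential code. The coroutine set_exposure and its arguments are hypothetical stand-ins for a real request to the wavelength meter server.

```python
import asyncio
import threading


def start_loop_in_thread():
    """Run an asyncio event loop in a dedicated daemon thread and return it."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


async def set_exposure(channel, exposure_ms):
    """Hypothetical device request; a real client would talk to the server here."""
    await asyncio.sleep(0.01)  # stands in for a network round trip
    return {"channel": channel, "exposure_ms": exposure_ms, "ok": True}


def blocking_call(loop, coro, timeout=5.0):
    """Schedule coro on the loop thread and block the caller until it finishes."""
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    return future.result(timeout)


# Existing "sequential" code: it never touches the loop directly.
loop = start_loop_in_thread()
result = blocking_call(loop, set_exposure(2, 50))
print(result)
loop.call_soon_threadsafe(loop.stop)  # ask the loop thread to stop
```

The key point in both libraries is the same: the event loop is only ever touched through a thread-safe entry point, and the calling thread simply waits for the result.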
Also, I'd recommend AMP for the protocol rather than PB, since AMP is more amenable to heterogeneous environments (see amp-protocol.net for independent protocol information), and it sounds like you have a substantial amount of other technology you might want to integrate with this system.

Have you tried zeromq?
It's a library that simplifies working with sockets. It can operate over TCP and implements several topologies, such as publisher/subscriber (for broadcasting data, such as your laser readings) and request/response (which you can use for your control scheme).
There are bindings for several languages and the site is full of examples. Also, it's amazingly fast.
Good stuff.

Related

High-performance replacement for multiprocessing.Queue

My distributed application consists of many producers that push tasks into several FIFO queues, and multiple consumers for every one of these queues. All these components live on a single node, so no networking involved.
This pattern is perfectly supported by Python's built-in multiprocessing.Queue; however, as I scale up my application, the queue implementation seems to be a bottleneck. I am not sending large amounts of data, so memory sharing does not solve the problem. What I need is fast, guaranteed delivery of 10^4-10^5 small messages per second. Each message is about 100 bytes.
I am new to the world of fast distributed computing and I am very confused by the sheer amount of options. There is RabbitMQ, Redis, Kafka, etc.
ZeroMQ is a more focused and compact alternative, which also has successors such as nanomsg and nng. Also, implementing something like a many-to-many queue with a guaranteed delivery seems nontrivial without a broker.
I would really appreciate if someone could point me to a "standard" way of doing something like this with one of the faster frameworks.

After trying a few available implementations and frameworks, I still could not find anything that would be suitable for my task. Either too slow or too heavy.
To solve the issue my colleagues and I developed this: https://github.com/alex-petrenko/faster-fifo
faster-fifo is a drop-in replacement for Python's multiprocessing.Queue and is significantly faster. In fact, it is up to 30x faster in the configurations I cared about (many producers, few consumers) because it additionally supports a get_many() method on the consumer side.
It is brokerless, lightweight, supports arbitrary many-to-many configurations, and is implemented for POSIX systems using pthread synchronization primitives.

I think that a lot of it depends partly on what sort of importance you place on individual messages.
If each and every one is vital, and you have to consider what happens to them in the event of some failure somewhere, then frameworks like RabbitMQ can be useful. RabbitMQ has a broker, and it's possible to configure this for some sort of high availability, high reliability mode. With the right queue settings, RabbitMQ will look after your messages up until some part of your system consumes them.
To do all this, RabbitMQ needs a broker. This makes it fairly slow, though at one point there was talk about reimplementing RabbitMQ on top of ZeroMQ's underlying protocol (ZMTP) and doing away with the broker, implementing all the functionality in the endpoints instead.
In contrast, ZeroMQ does far less to guarantee that, in the event of failures, your messages will actually, eventually, get through to the intended destination. If a process dies, or a network connection fails, then there's a high chance that messages will be lost. More recent versions can be set up to actively monitor connections, so that if a network cable breaks or a process dies somewhere, the endpoints at the other end of the sockets can be informed about this pretty quickly. If one then implements a communicating sequential processes framework on top of ZMQ's actor framework (think message acknowledgements and so on, which will slow it down), you can end up with a system whereby endpoints can know for sure that messages have been transferred to their intended destinations.
Being brokerless allows zmq to be pretty fast. And it's efficient across a number of different transports, ranging from inproc to tcp, all of which can be blended together. If you're not worried about processes crashing or network connections failing, ZMQ gives you a guarantee to deliver messages right out of the box.
So, deciding what is important in your application helps you choose which technology you're going to use as part of it: RabbitMQ, ZeroMQ, etc. Once you've decided that, the problem of "how to get the patterns I need" is reduced to "what patterns does that technology support". RabbitMQ is, AFAIK, purely pub/sub (there can be a lot of each), whereas ZeroMQ has many more.

I have tried Redis server queuing in order to replace Python's standard multiprocessing Queue. It is a NO GO for Redis! Python's queue is the best and fastest, and it accepts any kind of data type you throw at it, whereas with Redis and complex data types, such as a dict with lots of numpy arrays, you have to pickle or json dumps/loads, which adds overhead to the process.
Cheers,
Steve

Python IPC - Twisted, RabbitMQ,

I want to create 2 applications in Python which should communicate with each other. One of these applications should behave like a server and the second should be the GUI of a client. They could be run on the same machine or remotely on different devices.
I want to ask which technology I should use: AMQP messaging (like RabbitMQ), a Twisted-like server (or Tornado), or ZeroMQ, with the applications connected through it. In the future I would like to have some kind of authentication etc.
I have read a lot of questions and articles (like this one: Why do we need to use rabbitmq), and a lot of people say "rabbitmq and twisted are different". I know they are. I would really love to know the differences and why one of these solutions would be superior to the other in this case.
EDIT:
I want to use it with following requirements:
There will be more than 1 user connected at a time - I think there will be 1 - 10 users connected to the same program and they would work collaboratively.
The data sent are "messages" telling what the user did - something like remote calls (but don't focus on that, because the GUIs can be written in different languages, so the messages will be something like JSON data).
The system should allow for collaborative work - so it should be as interactive as possible (data will be sent all the time, whenever the user types something or performs some action).
Additionally, I would love to hear why one solution would be better than the other, not only in this particular case.

Twisted is used to solve the C10k networking problem by giving you asynchronous networking through the Reactor Pattern. It's also convenient because it provides a nice concurrency abstraction, as threading/concurrency in Python is not as easy as in, say, Erlang. Consequently some people use Twisted to dispatch work tasks, but that's not what it is designed for.
RabbitMQ is based on the message queue pattern. It's all about reliable message passing and is not about networking. I stress the reliable part as there are many different asynchronous networking frameworks (Vert.x for example) that provide message passing (also known as pub/sub).
More often than not, people combine the two patterns to create a "message bus" that will work with a variety of networking needs without unnecessary network blocking and that integrates and scales well.
The reason a "message queue" goes so well with a networking "reactor loop" is that you should not block on the reactor loop so you have to dispatch blocking work to some other process (thread, lwp, separate machine process, queue, etc..). In practice the cleanest way to do this is distributed message passing.
Based on your requirements it sounds like you should use asynchronous networking if you want the results to show up instantly and scale, but you could probably get away with a simple system that just polls, given that you only have a handful of clients. So the question is how many total users (Twisted)? And how reliably do you want the updates to be delivered (RabbitMQ)? Finally, do you want your architecture to be language and platform agnostic... maybe you want to use Node.js later (focus on the message queue instead of async networking, i.e. RabbitMQ). Personally I would take a look at Vert.x, which allows you to write in Python.

When someone tells you that Twisted and RabbitMQ are different, it is because comparing them is like comparing two things with different purposes.
Twisted is an asynchronous framework, like Tornado. RabbitMQ is a message queue system. You can't compare them directly.
You should turn your question into two new questions. The first one: which protocol should I use for communication between my processes? The answer can be found by searching for terms like AMQP, Protocol Buffers...
And the other one: which framework should I use to write my client and server programs? Here the answer could be Twisted, Tornado, ...

I need to make a "server" that can handle multiple long lasting connections of streaming data

I need to read and plot data in real time from multiple Android phones simultaneously. I'm trying to build a server (in python) that each phone can connect to simultaneously, which will receive the data streams from each phone and plot in real time, using matplotlib. I'm not very experienced in socket programming, although I know the basics (single request servers and such). How should I go about doing this? I looked at asyncore, SocketServer, and other modules, but I'm not sure I grasp how to allow multiple long standing connections.
I was thinking I should create a new thread for each phone (although I'm not sure if it's safe to pass a socket to a new thread), but I also want to be able to plot using subplots (eg, 4 plots side by side), although this is not that important.
I just need a point in the right direction. Small code samples appreciated to illustrate the concept.

Because of the way Python implements threading, using threads might lead to degraded performance, depending on what your threads do.
I'd suggest using a framework for building an asynchronous server. One such framework is Gevent. With an asynchronous event loop you can do calculations while other "threads" (in the case of gevent, greenlets) are waiting for I/O, and thus get better performance. The model is also ideal for long-lasting idle connections.
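As a rough illustration of the event-loop model, here is a minimal sketch that uses the standard library's asyncio rather than gevent (a deliberate substitution to avoid a third-party dependency). It assumes a made-up wire format of newline-terminated "phone_id,value" lines; fake_phone is a hypothetical stand-in for a real Android client, and plotting with matplotlib is left out (a plotting routine would read from the store dictionary).

```python
import asyncio


async def handle_phone(reader, writer, store, finished):
    """Serve one long-lived connection: read samples until the phone disconnects."""
    while True:
        line = await reader.readline()
        if not line:  # empty bytes means the peer closed the connection
            break
        phone_id, value = line.decode().strip().split(",")
        store.setdefault(phone_id, []).append(float(value))
    writer.close()
    finished.put_nowait(None)  # tell the demo this connection is done


async def fake_phone(port, phone_id, values):
    """Hypothetical client that streams a few samples and disconnects."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    for value in values:
        writer.write(f"{phone_id},{value}\n".encode())
        await writer.drain()
        await asyncio.sleep(0.01)  # other connections are served meanwhile
    writer.close()
    await writer.wait_closed()


async def demo():
    store, finished = {}, asyncio.Queue()
    server = await asyncio.start_server(
        lambda r, w: handle_phone(r, w, store, finished), "127.0.0.1", 0
    )
    port = server.sockets[0].getsockname()[1]  # port 0 = let the OS choose
    async with server:
        await asyncio.gather(
            fake_phone(port, "a", [1.0, 2.0]), fake_phone(port, "b", [3.0])
        )
        for _ in range(2):  # wait until both handlers saw the disconnect
            await finished.get()
    return store


samples = asyncio.run(demo())
print(samples)
```

Each connection gets its own coroutine, so the code reads like blocking per-client code while a single thread serves all phones.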

Python - Waiting for input from lots of sockets

I'm working on a simple experiment in Python. I have a "master" process, in charge of all the others, and every single process has a connection to the master process via a Unix socket. I would like the master process to be able to monitor all of the sockets for a response, but there could theoretically be almost a hundred of them. How would threads impact the memory and performance of the application? What would be the best solution? Thanks a lot!

One hundred simultaneous threads might be pushing the reasonable limits of threading. If you find this is the cleanest way to organize your code, I'd say give it a try, but threading really doesn't scale very far.
What works better is to use a technique like select to wait until one of the sockets is readable or writable or has an error to report. This mechanism lets you go to sleep until something interesting happens, handle however many sockets have content to handle, and then go back to sleep again, all in a single thread of execution. Removing the multi-threading can often reduce the chances for errors, and this style of programming should get you into the hundreds of connections with no trouble. (If you want to go beyond about 100, I'd use the poll functionality instead of select, because constantly rebuilding the list of interesting file descriptors takes time that poll does not require.)
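A minimal sketch of the select approach, using socket.socketpair() so that it runs in a single process for demonstration. The "worker" ends stand in for the child processes; the message text and the choice of three workers are made up for the example.

```python
import select
import socket

# One connected pair per worker: the master keeps one end, the worker the other.
pairs = [socket.socketpair() for _ in range(3)]
masters = [master for master, _ in pairs]
worker_index = {master.fileno(): i for i, master in enumerate(masters)}

# In a real setup each worker process would write to its own end.
for i, (_, worker) in enumerate(pairs):
    worker.sendall(f"worker {i} ready".encode())

messages = {}
while len(messages) < len(masters):
    # Sleep until at least one socket is readable (or the timeout expires).
    readable, _, _ = select.select(masters, [], [], 1.0)
    if not readable:
        break  # timed out: nothing more to read
    for master in readable:
        messages[worker_index[master.fileno()]] = master.recv(1024).decode()

for master, worker in pairs:
    master.close()
    worker.close()

print(messages)
```

The whole loop runs in one thread; select.poll (where available) follows the same shape but registers each descriptor once instead of passing the full list on every call.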
Something to consider is the Python Twisted Framework. They've gone to some length to provide a consistent way to hook callbacks onto events for this exact sort of programming. (If you're familiar with node.js, it's a bit like that, but Python.) I must admit a slight aversion to Twisted -- I never got very far in their documentation without being utterly baffled -- but a lot of people made it further in the docs than I did. You might find it a better fit than I have.

The easiest way to conduct comparative tests of threads versus processes for socket handling is to use the SocketServer in Python's standard library. You can easily switch approaches (while keeping everything else the same) by inheriting from either ThreadingMixIn or ForkingMixIn. Here is a simple example to get you started.
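The example originally linked from this answer is not reproduced here, so the following is only a sketch of the idea, written for Python 3, where the module is named socketserver. The handler, the upper-casing behaviour, and the ask helper are made up for illustration.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Reply to every received line with its upper-cased version."""

    def handle(self):
        for line in self.rfile:  # ends when the client disconnects
            self.wfile.write(line.upper())


class ThreadedServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    """One thread per connection."""

    daemon_threads = True
    allow_reuse_address = True


# To compare with processes, swap the mix-in (POSIX only) and keep the rest:
# class ForkedServer(socketserver.ForkingMixIn, socketserver.TCPServer):
#     pass


def ask(address, text):
    """Send one line to the server and return the reply."""
    with socket.create_connection(address) as conn:
        conn.sendall(text.encode() + b"\n")
        with conn.makefile() as stream:
            return stream.readline().strip()


with ThreadedServer(("127.0.0.1", 0), UpperHandler) as server:
    threading.Thread(target=server.serve_forever, daemon=True).start()
    reply = ask(server.server_address, "hello")
    server.shutdown()  # stop serve_forever before leaving the with block

print(reply)
```

Because the handler class is independent of the mix-in, the same request-handling code can be timed under both threading and forking.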
Another alternative is a select/poll approach using non-blocking sockets in a single process and a single thread.
If you're interested in software that is already fully developed and highly evolved, consider these high-performance Python based server packages:
The Twisted framework uses the async single process, single thread style.
The Tornado framework is similar (less evolved, less full featured, but easier to understand)
And Gunicorn which is a high-performance forking server.

Writing a socket-based server in Python, recommended strategies?

I was recently reading this document which lists a number of strategies that could be employed to implement a socket server. Namely, they are:
1. Serve many clients with each thread, and use nonblocking I/O and level-triggered readiness notification
2. Serve many clients with each thread, and use nonblocking I/O and readiness change notification
3. Serve many clients with each server thread, and use asynchronous I/O
4. Serve one client with each server thread, and use blocking I/O
5. Build the server code into the kernel
Now, I would appreciate a hint on which should be used in CPython, which we know has some good points, and some bad points. I am mostly interested in performance under high concurrency, and yes a number of the current implementations are too slow.
So if I may start with the easy one, "5" is out, as I am not going to be hacking anything into the kernel.
"4" also looks like it must be out because of the GIL. Of course, you could use multiprocessing in place of threads here, and that does give a significant boost. Blocking IO also has the advantage of being easier to understand.
And here my knowledge wanes a bit:
"1" is traditional select or poll which could be trivially combined with multiprocessing.
"2" is the readiness-change notification, used by the newer epoll and kqueue
"3" I am not sure there are any kernel implementations for this that have Python wrappers.
So, in Python we have a bag of great tools like Twisted. Perhaps they are a better approach, though I have benchmarked Twisted and found it too slow on a multiple processor machine. Perhaps having 4 twisteds with a load balancer might do it, I don't know. Any advice would be appreciated.

asyncore is basically "1" - It uses select internally, and you just have one thread handling all requests. According to the docs it can also use poll. (EDIT: Removed Twisted reference, I thought it used asyncore, but I was wrong).
"2" might be implemented with python-epoll (Just googled it - never seen it before).
EDIT: (from the comments) In Python 2.6 the select module has epoll, kqueue and kevent built in (on supported platforms). So you don't need any external libraries to do edge-triggered serving.
Don't rule out "4", as the GIL will be dropped when a thread is actually doing or waiting for IO-operations (most of the time probably). It doesn't make sense if you've got huge numbers of connections of course. If you've got lots of processing to do, then python may not make sense with any of these schemes.
For flexibility maybe look at Twisted?
In practice your problem boils down to how much processing you are going to do for requests. If you've got a lot of processing, and need to take advantage of multi-core parallel operation, then you'll probably need multiple processes. On the other hand if you just need to listen on lots of connections, then select or epoll, with a small number of threads should work.

How about "fork"? (I assume that is what the ForkingMixIn does) If the requests are handled in a "shared nothing" (other than DB or file system) architecture, fork() starts pretty quickly on most *nixes, and you don't have to worry about all the silly bugs and complications from threading.
Threads are a design illness forced on us by OSes with too-heavy-weight processes, IMHO. Cloning a page table with copy-on-write attributes seems a small price, especially if you are running an interpreter anyway.
Sorry I can't be more specific, but I'm more of a Perl-transitioning-to-Ruby programmer (when I'm not slaving over masses of Java at work)
Update: I finally did some timings on thread vs fork in my "spare time". Check it out:
http://roboprogs.com/devel/2009.04.html
Expanded:
http://roboprogs.com/devel/2009.12.html

One solution is gevent. Gevent marries libevent-based event polling with lightweight cooperative task switching implemented by greenlet.
What you get is all the performance and scalability of an event system with the elegance and straightforward model of blocking IO programming.
(I don't know what the SO convention about answering really old questions is, but decided I'd still add my 2 cents)

Can I suggest additional links?
cogen is a cross-platform library for network-oriented, coroutine-based programming using the enhanced generators from Python 2.5. On the main page of the cogen project there are links to several projects with a similar purpose.

I like Douglas' answer, but as an aside...
You could use a centralized dispatch thread/process that listens for readiness notifications using select and delegates to a pool of worker threads/processes to help accomplish your parallelism goals.
As Douglas mentioned, however, the GIL won't be held during most lengthy I/O operations (since no Python-API things are happening), so if it's response latency you're concerned about, you can try moving the critical portions of your code to the CPython API.
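The dispatch-plus-worker-pool idea above could look roughly like this. The sketch uses the standard selectors module (which picks epoll, kqueue, or select depending on the platform) and a thread pool; process is a hypothetical placeholder for the real per-request work, and the socket pairs simulate clients inside one process.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def process(payload):
    """Hypothetical per-request work, executed by a pool worker."""
    return payload.decode().upper()


def dispatch(socks, pool):
    """Wait for readiness in one thread and hand each payload to the pool."""
    selector = selectors.DefaultSelector()
    for sock in socks:
        selector.register(sock, selectors.EVENT_READ)
    futures = []
    while selector.get_map():  # loop until every peer has disconnected
        for key, _ in selector.select(timeout=1.0):
            data = key.fileobj.recv(1024)
            if data:
                futures.append(pool.submit(process, data))
            else:
                selector.unregister(key.fileobj)  # empty read: peer closed
    selector.close()
    return [future.result() for future in futures]


# Simulated clients: each sends one job and closes its end.
pairs = [socket.socketpair() for _ in range(3)]
for i, (_, client) in enumerate(pairs):
    client.sendall(f"job {i}".encode())
    client.close()

with ThreadPoolExecutor(max_workers=2) as pool:
    results = dispatch([server_end for server_end, _ in pairs], pool)

for server_end, _ in pairs:
    server_end.close()

print(sorted(results))
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor would be the natural variation when the per-request work is CPU-bound, since separate processes are not limited by the GIL.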

http://docs.python.org/library/socketserver.html#asynchronous-mixins
As for multi-processor (multi-core) machines: with CPython, due to the GIL, you'll need at least one process per core to scale. As you say that you need CPython, you might try to benchmark that with ForkingMixIn. With Linux 2.6 this might give some interesting results.
Another way is to use Stackless Python. That's how EVE solved it. But I understand that it's not always possible.
