Efficient Python chat server [closed]

I am writing a unicast chat server; the flow is as follows:
The sender sends a message to the chat server; in the message, the sender also specifies the recipient ID.
The chat server routes the message to the right client, based on the recipient ID.
I implemented the chat server using the standard-library asyncore module. I found that CPU usage jumps once a client connects to the server (1% vs. 24%). I believe the performance is limited by the looping of the handle_write function.
Is there a better (e.g. more efficient) framework to accomplish my chat server requirement?
Thanks in advance

Of course we'd need actual code to debug the problem. But what you're mainly asking is:
Is there a better (e.g. more efficient) framework to accomplish my chat implementation?
Yes. It's generally accepted that asyncore sucks. It's hard to use as well as being inefficient. It's especially bad on Windows, because select especially sucks on Windows.
So, yes, using a different framework will probably make things better.
Unfortunately, an SO question is not a good place to get recommendations for frameworks, but I can throw out a list of usual suspects: twisted, monocle, gevent, eventlet, tulip.
Alternatively, if you're not worried about scalability to more than a few dozen clients, just about performance at the small scale, using a thread per client (or even two threads, one for reads and one for writes) and blocking I/O is incredibly simple.
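To make that concrete, here is a minimal thread-per-client sketch (the port, the line-based wire format, and the first-line handshake are assumptions for illustration, not something from your question):
import socket
import threading

clients = {}                     # recipient ID -> connected socket
clients_lock = threading.Lock()

def handle_client(conn):
    f = conn.makefile('r')
    client_id = f.readline().strip()   # assumed handshake: first line is the client's own ID
    with clients_lock:
        clients[client_id] = conn
    try:
        for line in f:                  # each line: "<recipient_id> <message>"
            recipient_id, _, text = line.partition(' ')
            with clients_lock:
                target = clients.get(recipient_id)
            if target is not None:
                target.sendall(text.encode())
    finally:
        with clients_lock:
            clients.pop(client_id, None)
        conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('', 12345))
server.listen(5)
while True:
    conn, addr = server.accept()
    threading.Thread(target=handle_client, args=(conn,), daemon=True).start()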
Finally, if you're staying up to date with Python 3.x, there's a good chance that 3.4 will have a new and improved async I/O module that's much more efficient and much easier to use than asyncore (and it will almost certainly be based on tulip). So… the best solution may be to get a time machine and go forward a few months. (Or, if you're a reader searching for this answer in the future, look in the standard library under IPC and guess which module is the new-and-improved async I/O module.)

I just read a page on the web comparing the efficiency of different Python web servers (link).
I think I will use gevent, since it seems very efficient.
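For what it's worth, a minimal gevent version of the routing server might look like the sketch below (the port and the wire format are my assumptions). Each connection runs in a cheap greenlet, so the blocking-looking code is actually multiplexed on a single event loop, and because greenlets switch cooperatively, ordinary dict operations don't need a lock:
from gevent.server import StreamServer

clients = {}  # recipient ID -> client socket

def handle(sock, address):
    f = sock.makefile('r')
    client_id = f.readline().strip()   # assumed handshake: first line is the client's own ID
    clients[client_id] = sock
    try:
        for line in f:                  # each line: "<recipient_id> <message>"
            recipient_id, _, text = line.partition(' ')
            target = clients.get(recipient_id)
            if target is not None:
                target.sendall(text.encode())
    finally:
        clients.pop(client_id, None)

StreamServer(('', 12345), handle).serve_forever()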

Related

Flexible IPC solution for Python on Linux? [closed]

I'm writing a program in Python for which I'm considering a local client-server model, but I am struggling to figure out the best way for the server to communicate with the client(s). A simple, canned solution would be best--I'm not looking to reinvent the wheel. Here are my needs for this program:
Runs on Linux
Server and clients are on the same system, so I don't need to go over a network.
Latency that's not likely to be annoying to an interactive user.
Multiple clients can connect to the same server.
Clients are started independently of the server and can connect/disconnect at any time.
The number of clients is measurable in dozens; I don't need to scale very high.
Clients can come in a few different flavors:
Stream readers - Reads a continuous stream of data (in practice, this is all text).
State readers - Reads some state information that updates every once in a while.
Writers - Sends some data to the server, receives some response each time.
Client type 1 seems simple enough; it's a unidirectional dumb pipe. Client type 2 is a bit more interesting. I want to avoid simply polling the server to check for new data periodically since that would add noticeable latency for the user. The server needs some way to signal to all and only the relevant clients when the state information is updated so that the client can receive the updated state from the server. Client type 3 must be bidirectional; it will send user-supplied data to the server and receive some kind of response after each send.
I've looked at Python's IPC page (http://docs.python.org/2/library/ipc.html), but I don't think any of those solutions are right for my needs. The subprocess module is completely inappropriate, and everything else is a bit more low-level than I'd like.
The similar question Efficient Python to Python IPC isn't quite the same; I don't need to transfer Python objects, I'm not especially worried about CPU efficiency for the number of clients I'll have, I only care about Linux, and none of the answers to that question are especially helpful to me anyway.
Update:
I cannot accept an answer that just points me at a framework/library/module/tool without actually giving an explanation of how it can be used for my three different server-client relationships. If you say, "All of this can be done with named pipes!" I will have to ask "How?" Code snippets would be ideal, but a high-level description of a solution can work too.
Have you already looked into ZeroMQ? It has excellent Python support, and the documented examples already cover your use cases.
It's easy to use in a single-platform, single-machine setup, and it can very easily be extended to a network.
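To sketch how your three client types could map onto ZeroMQ sockets (a hedged illustration; the ipc:// paths and topic names are made up for this example):
import zmq

ctx = zmq.Context()

# Types 1 and 2 (stream readers and state readers): the server PUBlishes,
# clients SUBscribe. A state reader subscribes to a "state" topic and only
# wakes up when the server actually pushes an update, so no polling is needed.
pub = ctx.socket(zmq.PUB)
pub.bind("ipc:///tmp/myapp-pub")

# Type 3 (writers): request/reply over a separate endpoint.
rep = ctx.socket(zmq.REP)
rep.bind("ipc:///tmp/myapp-rpc")

pub.send_string("stream some continuous text")   # type 1 traffic
pub.send_string("state counter=42")              # type 2 traffic, "state" topic prefix

request = rep.recv_string()                      # blocks until a writer sends
rep.send_string("ack: " + request)

# A state-reader client would run in a separate process:
# sub = ctx.socket(zmq.SUB)
# sub.connect("ipc:///tmp/myapp-pub")
# sub.setsockopt_string(zmq.SUBSCRIBE, "state")  # filter: state updates only
# print(sub.recv_string())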

Efficient Python IPC [closed]

I'm making an application in Python 3 which will be divided into a batch part and a GUI part.
The batch part is responsible for the processing logic, and the GUI is responsible for displaying it.
Which inter-process communication (IPC) framework should I use with the following requirements:
The GUI can run on a different device than the batch part (on the same device, on a smartphone or tablet, etc., locally or over a network).
The batch part (the Python 3 IPC library) should work without problems on Linux, Mac, Windows, ...
The IPC should support GUI written in different languages (Python, Javascript, ...)
The performance of IPC is important - it should be as "interactive" as possible, but without losing information.
Several GUIs could be connected to the same batch process.
Additional: would the choice be different if the GUI were guaranteed to be written in Python as well?
Edit:
I have found a lot of IPC libraries, like here: Efficient Python to Python IPC, or ActiveMQ, or RabbitMQ, or ZeroMQ.
The best looking options I have found so far are:
RabbitMQ
ZeroMQ
Pyro
Are they appropriate solutions to this problem? If not, why? And if something is better, please tell me why as well.
The three you mentioned seem a good fit and will uphold your requirements. I think you should go with whichever you feel most comfortable and familiar with.
From my personal experience, I do believe ZeroMQ is the best combination between efficiency, ease of use and inter-operability. I had an easy time integrating zmq 2.2 with Python 2.7, so that would be my personal favorite. However as I said I'm quite sure you can't go wrong with all 3 frameworks.
Half related: requirements tend to change with time, and you may decide to switch frameworks later on, so encapsulating the dependency on the framework would be a good design pattern to use (e.g. having a single conduit module that interacts with the framework, and having its API use your internal data structures and domain language).
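As a sketch of that encapsulation idea (every name here is hypothetical; the point is that only this one module imports the messaging framework, so swapping it out later touches a single file):
import json
import zmq

_ctx = zmq.Context()
_sock = _ctx.socket(zmq.REQ)
_sock.connect("tcp://localhost:5555")

def send_command(command, **params):
    """Send a domain-level command and return the decoded reply."""
    _sock.send_string(json.dumps({"command": command, "params": params}))
    return json.loads(_sock.recv_string())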
I've used the Redis engine for this. Extremely simple, and lightweight.
Server side does:
import redis

r = redis.Redis()           # connects to localhost:6379 by default
p = r.pubsub()              # redis-py exposes pub/sub through a PubSub object
p.subscribe('mychannel')    # subscribe to a "channel"
for message in p.listen():
    print("I got message", message)
Client side does:
import redis

r = redis.Redis()           # connects to localhost:6379 by default
r.publish('mychannel', 'my message')
"messages" are strings (of any size). If you need to pass complex data structures, I like to use json.loads and json.dumps to convert between python dicts/arrays and strings -
"pickle" is perhaps the better way to do this for python-to-python communication, though JSON means "the other side" can be written in anything.
Now there are a billion other things Redis is good for - and they all inherently are just as simple.
You are asking for a lot of things from the framework; network enabled, multi-platform, multi-language, high performance (which ideally should be further specified - what does it mean, bandwidth? latency? what is "good enough"; are we talking kB/s, MB/s, GB/s? 1 ms or 1000 ms round-trip?) Plus there are a lot of things not mentioned which can easily come into play, e.g. do you need authentication or encryption? Some frameworks give you such functionality, others rely on implementing that part of the puzzle yourself.
There probably exists no silver bullet product which is going to give you an ideal solution which optimizes all those requirements at the same time. As for the 'additional' component of your question - yes, if you restrict language requirements to python only, or further distinguish between key vs. nice-to-have requirements, there would be more solutions available.
One technology you might want to have a look at is Versile Python (full disclosure: I am one of the developers). It is multi-platform and supports python v2.6+/v3, and java SE6+. Regarding performance, it depends on what are your requirements. If you have any questions about the technology, just ask on the forum.
The solution is dbus
It is a mature solution and available for a lot of languages (C, Python, ...; just google for dbus plus your favorite language). It is not as fast as shared memory, but it is still fast enough for pretty much everything that doesn't require (hard) real-time properties.
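Just to show the flavor of the API, here is a minimal client-side sketch using the dbus-python package, calling a method on a well-known session-bus service (this assumes a desktop session with the freedesktop notification daemon running):
import dbus

bus = dbus.SessionBus()
obj = bus.get_object('org.freedesktop.Notifications',
                     '/org/freedesktop/Notifications')
iface = dbus.Interface(obj, 'org.freedesktop.Notifications')
# Notify(app_name, replaces_id, app_icon, summary, body, actions, hints, timeout_ms)
iface.Notify('myapp', 0, '', 'Hello', 'Sent over D-Bus', [], {}, 5000)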
I'll take a different tack here and ask: why not use the de facto RPC language of the Internet, i.e. HTTP REST APIs?
With Python Requests on the client side and Flask on the server side, you get these kinds of benefits:
Existing HTTP REST tools like Postman can access and test your server.
Those same tools can document your API.
If you also use JSON, then you get a lot of tooling that works with that too.
You get proven security practices and solutions (session-based security and SSL).
It's a familiar pattern for a lot of different developers.
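A minimal sketch of that pattern (the endpoint and payload are illustrative, not from any particular application):
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/compute', methods=['POST'])
def compute():
    data = request.get_json()
    return jsonify(result=data['x'] * 2)

if __name__ == '__main__':
    app.run(port=5000)

# Client side, with Requests, in a separate process:
# import requests
# reply = requests.post('http://localhost:5000/compute', json={'x': 21})
# print(reply.json())   # {'result': 42}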

Fastest Python concurrency framework [closed]

Could someone suggest a concurrency framework (threads or event-based) in Python that can handle two tasks: one that has to process a LOT of events, and another that processes one or more commands at a slower rate? I am going to prototype with Twisted and see if it meets my needs. I went through http://www.slideshare.net/dabeaz/an-introduction-to-python-concurrency, which is informative, so the other choice I can try seems to be the multiprocessing module.
Background
I am trying to write a program that can interface with a C program on one side and the network on the other. The C program generates events at a high rate (hundreds of thousands, possibly a million messages per second) which need to be consumed without letting it block, and the C program needs to be sent commands arriving from the network.
I think Python with zeromq (http://www.zeromq.org/) will suffice for consuming the events from the C program. But I need to also concurrently process commands from the network in my program. I have used Python with Twisted before to do asynchronous programming, but am not sure if it can handle the zeromq messages concurrently with other tasks fast enough.
I am going to try it out, but I was wondering if anybody has any thoughts on other ways of doing things. I would rather use Python as it would make handling of the commands and keeping state easier than if I had to do it in C.
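For reference, the consuming side with pyzmq could be as small as the sketch below (assuming the C program pushes events over a PUSH socket; the transport and address are illustrative):
import zmq

ctx = zmq.Context()
events = ctx.socket(zmq.PULL)
events.connect("ipc:///tmp/events")   # the C side would bind a PUSH socket here

while True:
    msg = events.recv()               # receive one event
    # ... hand off to the slower command-processing machinery as needed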

C or Python for performance and scalability in creating socket connections? [closed]

I did see this post but it does not answer my question: C/Python Socket Performance?
I have been tasked with creating an application that can create thousands of connections based on sockets. I can do this in Python, but I want to have room for performance improvements. I know it's possible in Python because of my past projects, but I'm curious how much of a performance improvement there would be if I were to do this project in C (not C++).
It really depends on what you're doing with the sockets.
The best generic answer is: Usually Python is good enough that it doesn't matter, but sometimes it's not.
The overhead in the time taken to create and connect sockets is minimal, and reading and writing isn't much worse. But that doesn't matter, because there's pretty much never any significant time spent doing those things anyway.
There are reactors and proactors for Python every bit as good as the general-purpose ones available for C (and half of the C libraries have Python bindings). If you're not doing much significant work beyond the sockets, this is often your main bottleneck. If you've got a very specific use pattern and/or very tightly specified hardware, you might be able to write a custom reactor or proactor that beats out anything general-purpose. In that case, you pretty much have to go with C, not Python.
But usually, you've got significant work to do beyond just manipulating sockets.
If that work is mostly independent and highly parallelizable, C obviously beats Python (because of the GIL), unless the jobs are heavy enough that you can multi-process them (and keep in mind that "heavy enough" can be pretty heavy on Windows platforms). Except, of course, that it's incredibly easy to screw up performance (not to mention stability) writing multi-threaded C code; really, something like Erlang or Haskell is probably a better bet here than either C or Python. (If you're about to say, "But we've got people who are experienced at C but they can't learn Haskell", then those people are probably not good enough programmers to write multi-threaded code.)
If that work is mostly memory copying between socket buffers, and you can deal with a tightly-specified system, you may be able to write C code that takes advantage of zero-copy I/O, and there's no way to do that in Python.
But if it's mostly typical things like waiting on disk or serialized computation, then it scarcely matters how you write the socket-stuff, because it's going to end up waiting on the real code anyway.
So, without any more information, I'd go with Python, because the time you save getting things up and running and debugged vs. C can be spent optimizing or otherwise improving whatever turns out to matter.
If you're using the Windows platform, learn the one-thread-per-core concept behind IOCPs, and stay away from thread pools that entail a more or less one-thread-per-socket usage.

What's so cool about Twisted? [closed]

I'm increasingly hearing that Python's Twisted framework rocks and other frameworks pale in comparison.
Can anybody shed some light on this and possibly compare Twisted with other network programming frameworks?
There are a lot of different aspects of Twisted that you might find cool.
Twisted includes lots and lots of protocol implementations, meaning that more likely than not there will be an API you can use to talk to some remote system (either client or server in most cases) - be it HTTP, FTP, SMTP, POP3, IMAP4, DNS, IRC, MSN, OSCAR, XMPP/Jabber, telnet, SSH, SSL, NNTP, or one of the really obscure protocols like Finger, or ident, or one of the lower level protocol-building-protocols like DJB's netstrings, simple line-oriented protocols, or even one of Twisted's custom protocols like Perspective Broker (PB) or Asynchronous Messaging Protocol (AMP).
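For example, a complete line-oriented echo server takes only a handful of lines (a sketch, not taken from Twisted's own docs):
from twisted.internet import protocol, reactor
from twisted.protocols.basic import LineReceiver

class Echo(LineReceiver):
    def lineReceived(self, line):
        self.sendLine(line)   # echo each received line back to the sender

class EchoFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Echo()

reactor.listenTCP(1234, EchoFactory())
reactor.run()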
Another cool thing about Twisted is that on top of these low-level protocol implementations, you'll often find an abstraction that's somewhat easier to use. For example, when writing an HTTP server, Twisted Web provides a "Resource" abstraction which lets you construct URL hierarchies out of Python objects to define how requests will be responded to.
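A minimal sketch of that Resource abstraction (the port and the response body are illustrative):
from twisted.internet import reactor
from twisted.web.resource import Resource
from twisted.web.server import Site

class Hello(Resource):
    isLeaf = True   # no children; this resource handles the whole path
    def render_GET(self, request):
        return b"Hello from a Twisted Resource"

reactor.listenTCP(8080, Site(Hello()))
reactor.run()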
All of this is tied together with cooperating APIs, mainly due to the fact that none of this functionality is implemented by blocking on the network, so you don't need to start a thread for every operation you want to do. This contributes to the scalability that people often attribute to Twisted (although it is the kind of scalability that only involves a single computer, not the kind of scalability that lets your application grow to use a whole cluster of hosts) because Twisted can handle thousands of connections in a single thread, which tends to work better than having thousands of threads, each for a single connection.
Avoiding threading is also beneficial for testing and debugging (and hence reliability in general). Since there is no pre-emptive context switching in a typical Twisted-based program, you generally don't need to worry about locking. Race conditions that depend on the order of different network events happening can easily be unit tested by simulating those network events (whereas simulating a context switch isn't a feature provided by most (any?) threading libraries).
Twisted is also really, really concerned with quality. So you'll rarely find regressions in a Twisted release, and most of the APIs just work, even if you aren't using them in the common way (because we try to test all the ways you might use them, not just the common way). This is particularly true for all of the code added to Twisted (or modified) in the last 3 or 4 years, since 100% line coverage has been a minimum testing requirement since then.
Another often overlooked strength of Twisted is its ten years of figuring out different platform quirks. There are lots of undocumented socket errors on different platforms and it's really hard to learn that they even exist, let alone handle them. Twisted has gradually covered more and more of these, and it's pretty good about it at this point. Younger projects don't have this experience, so they miss obscure failure modes that will probably only happen to users of any project you release, not to you.
All that said, what I find coolest about Twisted is that it's a pretty boring library that lets me ignore a lot of really boring problems and just focus on the interesting and fun things. :)
Well, it's probably a matter of taste.
Twisted allows you to easily create event-driven network servers and clients, without really worrying about everything that goes into accomplishing this. And thanks to the MIT License, Twisted can be used almost anywhere. I haven't done any benchmarking, so I have no idea how it scales, but I'm guessing it scales quite well.
Another plus would be the Twisted Projects, with which you can quickly see how to implement most of the servers and services that you would want.
Twisted also has some great documentation, when I started with it a couple of weeks ago I was able to quickly get a working prototype.
I'm quite new to the Python scene, so please correct me if I'm wrong.
