I am new to server-side development,
but I have been given the chance to design and implement a server that will handle around 2000~3000 clients.
I am thinking of using Python and WebSockets, though I don't know whether this choice is appropriate.
At this point, I am curious about how to design the server.
I assume there are architectures in common use, depending on the capacity the server has to handle.
Otherwise, could I use a WebSocket server offered by a Python package like Tornado or Django?
I hope I can get some information on this.
Any advice?
I've had good experiences using HAProxy in front of sockjs-tornado. Depending on how complex your server-side logic, routing, and persistence requirements are, you could write all your server endpoints using Tornado and use SQLAlchemy to handle writes to a relational database, or use a NoSQL data store like Redis.
If your main requirement is real-time interactivity it might be worth investigating meteor as well.
One solution could be Pyramid, SockJS, gunicorn, and gevent. Nginx is probably better suited as a frontend than Apache, but of course, if you do not have any lengthy processing on the backend, any decent asynchronous Python server with WebSocket and SockJS support (not sure about socket.io as an alternative) will work for you out of the box.
Lengthy processing should be offloaded to queue workers anyway, so an asynchronous server will fit the bill.
Just check that all the datastore/database adapters you use are compatible with your server solution, be it asynchronous or multi-threaded.
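To make the "offload lengthy processing to queue workers" advice concrete, here is a minimal sketch using only the standard library's asyncio. It is an illustration under assumptions, not the setup described above: `slow_job`, `worker`, and `main` are hypothetical names, and a production system would more likely use an external queue with separate worker processes.

```python
import asyncio
import time


def slow_job(n):
    """Stand-in for lengthy, blocking work (hypothetical)."""
    time.sleep(0.05)
    return n * n


async def worker(queue, results):
    """Pull jobs off the queue and run each in a thread so the event loop stays free."""
    loop = asyncio.get_running_loop()
    while True:
        job_id, n = await queue.get()
        try:
            results[job_id] = await loop.run_in_executor(None, slow_job, n)
        finally:
            queue.task_done()


async def main(jobs, n_workers=3):
    queue = asyncio.Queue()
    results = {}
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(n_workers)]
    for job_id, n in enumerate(jobs):
        # A connection handler would enqueue the job like this and return at once.
        queue.put_nowait((job_id, n))
    await queue.join()  # wait until every queued job has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return [results[i] for i in range(len(jobs))]


def run_demo():
    return asyncio.run(main([1, 2, 3, 4]))
```

The point of the design is that the coroutine serving a client only enqueues work, so slow jobs never block the other connections.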
I have spent quite some time researching server backends, APIs, and frameworks. I need a solution where I can store user content (JSON and binary data).
The obvious choice would be a REST API. The only missing element is a push feature, so that clients are notified instantly when data on the server changes. With more research on this matter I discovered classic approaches (Comet, Push, Server-Sent Events, Bayeux, BOSH, …) as well as the „new“ league, WebSockets. I would definitely prefer WebSockets or using TCP sockets directly. But this post is not about the pros and cons of these two technologies, so please refrain from getting sidetracked in the comments.
At the moment, the following projects are very similar to what I need:
- Simperium (simperium.com), this looks very promising, but core/server is sadly not open source and god knows when, if ever, this step happens
- Realtime.co (framework.realtime.co/storage), hosted service, but same principle
- Some Frameworks for building servers such as Atmosphere (java, no WAMP), Cometd (java, project page looks like stuck in the 90’s), Autobahn (python, WAMP)
My current favorite is the Autobahn framework (autobahn.ws), especially with the WAMP protocol (a WebSocket subprotocol), as it offers exactly what I need. So the idea would be to build a Python backend/server with Autobahn Python (based on the Twisted framework) that manages all socket (WAMP) connections and includes a PostgreSQL database for data storage. WAMP libraries already exist for all the clients I want to support. The server would need to be able to handle the typical REST API features:
- Send, update, delete requested data (JSON/Binary) from/to server/clients
- Synchronize & automatic conflict management
- Offline handling when connection breaks, automatic restart when connection available again
So finally the questions:
- Have I missed an open source project which covers exactly my needs?
- If I would like to develop my own server with Autobahn and a database, could you point me in the right direction? I have a lot of concerns and not enough in-depth understanding. I know Autobahn already gives you a server, but it is not very close to my final needs. How do I build an efficient server that can handle all connected sockets? How do I handle the case where a client needs a server push? Are there schemas, models, or concepts for how such a server should look?
- Twisted is a very powerful Python framework but not regarded as the most convenient for writing apps. But I guess a socket-based storage server with DB access should be possible? If I run Twisted as a web resource and develop server components with another Python framework, would this compromise latency/performance much?
- Is such a server backend with a lot of data storage (JSON fields and also binary data such as documents and images) reasonable to build with sockets by a single developer or small team, or is this something that only bigger companies like Dropbox can do at the moment?
Thank you very much for your help & time!
So finally the questions:
Have I missed an open source project which covers exactly my needs?
No, you've covered the open source projects. Open source only gets you about halfway there, though. Building a global realtime network requires equal parts implementation and operations. You have to think about dropped messages, retries, how to scale your servers if a particular geography gets hot, and so on. I would argue that an open source solution won't achieve what you want unless you're willing to invest significant resources into operations. I would recommend a service like PubNub: http://pubnub.com
If I would like to develop my own server with Autobahn and a database, could you point me in the right direction? I have a lot of concerns and not enough in-depth understanding. I know Autobahn already gives you a server, but it is not very close to my final needs. How do I build an efficient server that can handle all connected sockets? How do I handle the case where a client needs a server push? Are there schemas, models, or concepts for how such a server should look?
A good database to back a realtime framework would be Cassandra because it supports high write volumes and handles time series data well: http://cassandra.apache.org/.
Twisted is a very powerful Python framework but not regarded as the most convenient for writing apps. But I guess a socket-based storage server with DB access should be possible? If I run Twisted as a web resource and develop server components with another Python framework, would this compromise latency/performance much?
I would not use Twisted. I would use gevent: http://www.gevent.org/. It's coroutine-based, so you don't get into callback hell. To support more connections, you just increase the size of the greenlet pool listening on the socket.
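As a rough illustration of the coroutine-per-connection style and of server push, here is a small sketch. Note that it swaps in the standard library's asyncio rather than gevent so that it runs without extra dependencies; the idea of one coroutine per client, written as straight-line code instead of callbacks, is the same. `Hub` and `client_session` are hypothetical names, and the queues stand in for real sockets.

```python
import asyncio


class Hub:
    """Keeps one outgoing queue per connected client (hypothetical helper)."""

    def __init__(self):
        self.clients = {}

    def connect(self, client_id):
        self.clients[client_id] = asyncio.Queue()
        return self.clients[client_id]

    def disconnect(self, client_id):
        self.clients.pop(client_id, None)

    def push(self, message):
        """Server push: hand the message to every connected client's queue."""
        for q in self.clients.values():
            q.put_nowait(message)


async def client_session(hub, client_id, received, n_expected):
    """One coroutine per connection; a real server would write to the socket here."""
    inbox = hub.connect(client_id)
    try:
        for _ in range(n_expected):
            msg = await inbox.get()
            received.append((client_id, msg))
    finally:
        hub.disconnect(client_id)


async def demo():
    hub = Hub()
    received = []
    sessions = [
        asyncio.create_task(client_session(hub, cid, received, 2))
        for cid in ("a", "b")
    ]
    await asyncio.sleep(0)  # let both sessions register with the hub
    hub.push("row 1 changed")
    hub.push("row 2 changed")
    await asyncio.gather(*sessions)
    return sorted(received)
```

In a real deployment the code that writes to the database would call `hub.push(...)` after a successful commit, so every connected client hears about the change.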
Is such a server backend with a lot of data storage (JSON fields and also binary data such as documents and images) reasonable to build with sockets by a single developer or small team, or is this something that only bigger companies like Dropbox can do at the moment?
Again, I would not build this on your own. A service like PubNub (http://pubnub.com), which takes care of all the operational issues for you and has a clean API, would serve your needs with minimal cost. PubNub takes care of the protocol for you, so if you're on a mobile device that doesn't support WebSockets, it will use TCP, HTTP, or whatever the best transport is for the device.
I'm writing a web application using Python's twisted.web on the server side.
On the frontend side I would like to use Ajax for displaying real time updates of events which are happening in the server.
There is a lot of information out there on how this can be done, so I realized I need to pick a JavaScript library that would make my life easier.
socket.io seems to be a good choice since it supports several browsers and transport mechanisms, but by reading their examples it seems it can only work with node.js?
So, does anyone know if it's possible to use socket.io with twisted.web?
If so, any links for a good example/tutorial would be also welcome.
You could try https://github.com/DesertBus/sockjs-twisted or, if you need SocketIO for a specific reason, it wouldn't be difficult to port TornadIO2 to Cyclone. You might find this issue interesting.
You need something on the server side to integrate with the socket.io script on the client side. The servers I know of that are written in Python and do this all use Tornado. You could look at an implementation like Tornadio (https://github.com/MrJoes/tornadio) and see what methods and classes they used to hook Tornadio and Tornado together. This would give you a pretty good idea of how to integrate it with your twisted.web server.
We've just switched away from socket.io to sockJS (which is also compatible with Tornado) and have seen large performance improvements.
I have 2 python servers running independently of one another but sharing the same database. They need to communicate with each other about when certain changes have been made to the database so the other server (if running) can reload cached data.
What would be my best options for communicating between two such programs?
I've thought of using sockets, but it seems like a lot of work. Either one program will be polling to connect whenever the other is off, or they both need to have server/client capabilities. I looked into named pipes but didn't see any easy portable solution (it needs to run on Windows and Unix).
You could have each one implement a simple XMLRPC server. Each one can then execute code in the other, such as telling the other one it needs to update.
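A minimal sketch of that idea with the standard library's `xmlrpc` modules, which run on both Windows and Unix. `CacheOwner`, `reload_cache`, `start_rpc_server`, and `notify_peer` are hypothetical names chosen for illustration; the port and what "reload" actually does are assumptions.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class CacheOwner:
    """Hypothetical stand-in for one of the two servers and its cached data."""

    def __init__(self):
        self.reloads = []

    def reload_cache(self, table_name):
        # A real server would re-read `table_name` from the shared database here.
        self.reloads.append(table_name)
        return True


def start_rpc_server(owner, host="127.0.0.1", port=0):
    """Expose owner.reload_cache over XML-RPC in a background thread."""
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    server.register_function(owner.reload_cache, "reload_cache")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def notify_peer(url, table_name):
    """Tell the other server to reload; return False if it is not running."""
    try:
        with ServerProxy(url) as peer:
            return peer.reload_cache(table_name)
    except OSError:
        return False
```

Each program would call `start_rpc_server` once at startup and `notify_peer` after writing to the database. Because `notify_peer` treats a refused connection as "peer not running", neither side has to poll for the other.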
The easiest way is to use the database itself as a means of communication. Add a table for logging updates. Then, either machine can periodically query to see if the underlying data has been changed.
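As a sketch of the update-log table, the following uses sqlite3 only so that it is self-contained; the same pattern should carry over to whatever database the two servers share. The table name `change_log` and the function names are hypothetical.

```python
import sqlite3


def init_change_log(conn):
    """Create the log table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS change_log ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " table_name TEXT NOT NULL)"
    )


def record_change(conn, table_name):
    """Writer side: call after modifying `table_name`."""
    conn.execute("INSERT INTO change_log (table_name) VALUES (?)", (table_name,))
    conn.commit()


def poll_changes(conn, last_seen_id):
    """Reader side: return (new_last_seen_id, tables changed since the last poll)."""
    rows = conn.execute(
        "SELECT id, table_name FROM change_log WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    if not rows:
        return last_seen_id, set()
    return rows[-1][0], {name for _, name in rows}
```

Each server remembers the last log id it has seen and, on a timer, calls `poll_changes` to learn which cached tables to reload.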
Another easy form of communication is email using the smtplib module. Our buildbots and version control repository use this form of communication.
If you want something with a little more "industrial" strength, consider using RabbitMQ or somesuch for messaging between servers.
I agree that sockets are usually too low-level. If you investigate "RabbitMQ", you should also investigate celery. It can use RabbitMQ as a back-end, but it can also use the database, and it neatly encapsulates the messaging mechanism. It is also integrated with django and gevent.
I'm interested in something based on Jabber but I didn't find a free/opensource one so I'm thinking of writing one.
I've installed a Jabber server and now thinking about the ways in which I can write the client. I'm thinking of one of either these two methods.
1) An ajax call made to a jabber script running on the webserver that takes care of connecting to the server. But then I thought because of the dependencies involved in the jabber client, it might end up consuming too much memory when a few clients connect.
2) The other method is to run a client running as a daemon that takes care of all the heavy lifting. This way I need to have only one instance of the client that sends a spoofed message (sender's name as that of whatever the user entered on the site). A simple script running on the webserver talks to this daemon over some sort of API (XMLRPC or Msgpack maybe?)
I think #2 is better but I'm not sure. Are there other ways I can implement this? I'm considering using Perl or Python for this.
Jabber is usually called XMPP nowadays, and there are dozens of clients and servers, something for every language. If you are using JavaScript (you mention Ajax), you probably want Strophe. Most servers are modular, so you only load the features you need (consider Tigase, ejabberd, or xmpppy). Writing your own is an even worse idea than it sounds.
BOSH
Install Prosody because it is really easy to install and has BOSH support built in. You could skip this, but then you need to find out how to use BOSH via ejabberd.
Use strophe.js to implement this (using BOSH). New browsers support cross-domain requests (CORS -> read the Proxy-less BOSH part). For old browsers, you could use a proxy or use Flash in the middle as a proxy.
Read Professional XMPP Programming with JavaScript and jQuery to learn Strophe. It even has chapters explaining how to create a chat.
Node.js
Or you could consider installing node.js to create your chat system using socket.io.
I have an XMPP server (most likely Python, Twisted, and Wokkel) that I prefer not to restart even in the development version, and I have a Python “worker” module (an interface to a particular Django project) that takes a JID and message text and returns a response (text or XML).
The question is: what would be the best way to connect them, considering that I may want to update the module part very often?
Another consideration is that it might be required to run multiple instances of “worker” for it all to be high-load-capable.
One possible way I see is implementing a thread in the server which checks if the module was changed and reload()s it if necessary.
The other way would be making something similar to fastcgi through sockets, although not HTTP-based.
My suggestion is:
Use RabbitMQ with XMPP adaptor.
Use Python carrot for AMQP since it can be used directly under Django.
I can't say that I understand all of your question, but as for the part where you ask how to connect Django, Twisted, and multiple workers: I'd suggest using AMQP. This gets you reliable message delivery, multiple consumers, and persistence.
There's the txAMQP library for twisted.
https://launchpad.net/txamqp
There is a good primer on AMQP here; it's a good place to start:
http://blogs.digitar.com/jjww/2009/01/rabbits-and-warrens/