I have a Python-driven web interface powered by Apache 2.2 with mod_python and Python 2.4. I need to make an asynchronous process appear synchronous to users of this web interface.
When users access one module on this website:
An external SOAP interface will be contacted with a unique identifier and will respond with a number N
The external interface will respond asynchronously by contacting a SOAP server on my machine between 1 and 10 times (the number N tells us how many responses we will receive)
I need to somehow aggregate these responses and pass them to the original module which will display the information back to the user. The goal is to make the process appear synchronous to the user.
What is the best way to handle this synchronization issue? Is this something Twisted would be well-suited for?
I am not restricting myself to Python for the solution, though it is preferred because everything else on the server is in Python. I prefer a solution that is both scalable and will take a minimal amount of programming time (though I understand that these attributes are somewhat at odds).
Maybe you can use Orbited to get ajax push with long-lived HTTP connections to your web clients. Orbited is based on Twisted, so I think it makes sense to look at if you already know Twisted. Have a look at this tutorial to get started.
Related
I am creating an application that basically has multiple connections to a third party Chat Streaming API(Socket based).
The way it works is - Every user has an account on my app and another account on the third party app. He gives me an access token for the third party chat app and I connect to the third party API to stream his chats. This happens for hundreds of users.
I need to create a socket connection pool for every user and run parallel threads. I am using a python library(for that API) and am able to achieve real time feeds for single users. How do I implement an asynchronous socket connection pool in Python or NodeJS? I have a Linux micro instance on EC2 and I need to run this application for 1000 users.
I am exploring Redis+Tornado to implement this. Are there any better alternatives?
This will be messy and also a couple of things to consider.
If you are going to use multiple threads remember that you can only run so many per CPU as the OS permits, rather go multiprocessing.
If you are going async with long polling processes it will prevent other clients from processing requests.
Solution
When your application absolutely needs to be real-time I would suggest websockets for server-client interaction.
Then from your clients request start a single process that listens\polls on your streaming API using multiprocessing in python. So you will essentially create a separate process for each client.
And now, to make your WebSocketHandler and Background API Streamer interact with each other you can use the Observer Pattern (https://en.wikipedia.org/wiki/Observer_pattern) to notify the WebSocket that you have received data from the API.
Make sure that you assign a unique ID to every client and make sure that you only post the data to the intended client when using websockets.
EDIT:
Web:
Also on your question regarding Tornado. It is a good lightweight framework for running a couple of users maybe 1000. But anything more than that I would suggest looking at Django as it will allow you to be more productive in producing code and also there are lots of tools out there that the community have developed over time.
Database:
Red.is is a good choice if you need a very fast no-sql db, also have a look at mongodb. If you require a multi-region DB I would suggest going with Cassandra or CouchDB due to the partitioned nodes. The image below might help you better decide which DB to use.
I have an interesting project going on at our workplace. The task, that stands before us, is such:
Build a custom server using Python
It has a web server part, serving REST
It has a FTP server part, serving files
It has a SMTP part, which receives mail only
and last but not least, a it has a background worker that manages lowlevel file IO based on requests received from the above mentioned services
Obviously the go to place was Twisted library/framework, which is an excelent networking tool. However, studying the docs further, a few things came up that I'm not sure about.
Having Java background, I would solve the task (at least at the beginning) by spawning a separate thread for each service and going from there. Being in Python however, I cannot do that for any reasonable purpose as Python has GIL. I'm not sure, how Twisted handles this. I would expect, that Twisted has large (if not majority) code written in C, where GIL is not the issue, but that I couldn't find the docs explained to my satisfaction.
So the most oustanding question is: Given that Twisted uses Reactor as it's main design pattern, will it be able to:
Serve all those services needed
Do it in a non-blocking fashion (it should, according to docs, but if someone could elaborate, I'd be grateful)
Be able to serve about few hundreds of clients at once
Serve large file downloads in a reasonable way, meaning that it can serve multiple clients, using multiple services, downloading and uploading large files.
Large files being in the order of hundres of MB, or few GB. The size is not important, it's the time that the client has to stay connected to the server that matters.
Edit: I'm actually inclined to go the way of python multiprocessing, but not sure, whether that's a correct thing to do with Twisted etc.
Serve all those services needed
Yes.
Do it in a non-blocking fashion (it should, according to docs, but if someone could elaborate, I'd be grateful)
Twisted's uses the common reactor model. I/O goes through your choice of poll, select, whatever to determine if data is available. It handles only what is available, and passes the data along to other stages of your app. This is how it is non-blocking.
I don't think it provides non-blocking disk I/O, but I'm not sure. That feature not what most people need when they say non-blocking.
Be able to serve about few hundreds of clients at once
Yes. No. Maybe. What are those clients doing? Is each hitting refresh every second on a browser making 100 requests? Is each one doing a numerical simulation of galaxy collisions? Is each sending the string "hi!" to the server, without expecting a response?
Twisted can easily handle 1000+ requests per second.
Serve large file downloads in a reasonable way, meaning that it can serve multiple clients, using multiple services, downloading and uploading large files.
Sure. For example, the original version of BitTorrent was written in Twisted.
I'm interested in something based on Jabber but I didn't find a free/opensource one so I'm thinking of writing one.
I've installed a Jabber server and now thinking about the ways in which I can write the client. I'm thinking of one of either these two methods.
1) An ajax call made to a jabber script running on the webserver that takes care of connecting to the server. But then I thought because of the dependencies involved in the jabber client, it might end up consuming too much memory when a few clients connect.
2) The other method is to run a client running as a daemon that takes care of all the heavy lifting. This way I need to have only one instance of the client that sends a spoofed message (sender's name as that of whatever the user entered on the site). A simple script running on the webserver talks to this daemon over some sort of API (XMLRPC or Msgpack maybe?)
I think #2 is better but I'm not sure. Are there other ways I can implement this? I'm considering using Perl or Python for this.
Jabber is usually called XMPP nowadays, and there are dozens of clients and servers, something for every language. If you are using Javascript (you mention Ajax), you probably want Strophe. Most servers are modular, so you only load the features you need (consider Tigase, ejabberd, or xmpppy). Writing your own is even worse an idea than it sounds.
BOSH
Install prosody because it is really eaSily installed and has BOSH support built-in. You could skip this but then you need to find out how to use BOSH via ejabberd.
use strophe.js to implement this(using BOSH). New browsers support cross-domain request(CORS -> read Proxy-less BOSH part). The old browsers you could use proxy or use flash in the middle as proxy.
read Professional XMPP Programming with JavaScript and jQuery to learn strophe. It even has chapters explaining how to create chat.
Node.js
Or you could consider installing node.js to create your chat system using socket.io.
I have an XMPP server (likely — python, twisted, wokkel), which I prefer not to restart even in the development version, and I have some python module “worker” (which is interface to particular django project), which gets jid and message text and returns some response (text or XML, either way).
The question is, what would be the best way to connect them, considering that I may prefer to update the module part too often?
Another consideration is that it might be required to run multiple instances of “worker” for it all to be high-load-capable.
One possible way I see is implementing a thread in the server which checks if the module was changed and reload()s it if necessary.
The other way would be making something similar to fastcgi through sockets, although not HTTP-based.
My suggestion is:
Use RabbitMQ with XMPP adaptor.
Use Python carrot for AMQP since it can be used directly under Django.
I can't say that I understand all of your question, but the bit where you're asking how to connect django and twisted and multiple workers: I'd suggest using AMPQ. This gets you reliable message delivery, multiple consumers, persistence.
There's the txAMQP library for twisted.
https://launchpad.net/txamqp
A good primer to AMQP here, it's a good place to start:
http://blogs.digitar.com/jjww/2009/01/rabbits-and-warrens/
Here is where I am at presently. I am designing a card game with the aim of utilizing major components for future work. The part that is hanging me up is creating a layer of abstraction between the server and the client(s). A server is started, and then one or more clients can connect (locally or remotely). I am designing a thick client but my friend is looking at doing a web-based client. I would like to design the server in a manner that allows a variety of different clients to call a common set of server commands.
So, for a start, I would like to create a 'server' which manages the game rules and player interactions, and a 'client' on the local CLI (I'm running Ubuntu Linux for convenience). I'm attempting to flesh out how the two pieces are supposed to interact, without mandating that future clients be CLI-based or on the local machine.
I've found the following two questions which are beneficial, but don't quite answer the above.
Client Server programming in python?
Evaluate my Python server structure
I don't require anything full-featured right away; I just want to establish the basic mechanisms for abstraction so that the resulting mock-up code reflects the relationship appropriately: there are different assumptions at play with a client/server relationship than with an all-in-one application.
Where do I start? What resources do you recommend?
Disclaimers:
I am familiar with code in a variety of languages and general programming/logic concepts, but have little real experience writing substantial amounts of code. This pet project is an attempt at rectifying this.
Also, I know the information is out there already, but I have the strong impression that I am missing the forest for the trees.
Read up on RESTful architectures.
Your fat client can use REST. It will use urllib2 to make RESTful requests of a server. It can exchange data in JSON notation.
A web client can use REST. It can make simple browser HTTP requests or a Javascript component can make more sophisticated REST requests using JSON.
Your server can be built as a simple WSGI application using any simple WSGI components. You have nice ones in the standard library, or you can use Werkzeug. Your server simply accepts REST requests and makes REST responses. Your server can work in HTML (for a browser) or JSON (for a fat client or Javascript client.)
I would consider basing all server / client interactions on HTTP -- probably with JSON payloads. This doesn't directly allow server-initiated interactions ("server push"), but the (newish but already traditional;-) workaround for that is AJAX-y (even though the X makes little sense as I suggest JSON payloads, not XML ones;-) -- the client initiates an async request (via a separate thread or otherwise) to a special URL on the server, and the server responds to those requests to (in practice) do "pushes". From what you say it looks like the limitations of this approach might not be a problem.
The key advantage of specifying the interactions in such terms is that they're entirely independent from the programming language -- so the web-based client in Javascript will be just as doable as your CLI one in Python, etc etc. Of course, the server can live on localhost as a special case, but there is no constraint for that as the HTTP URLs can specify whatever host is running the server; etc, etc.
First of all, regardless of the locality or type of the client, you will be communicating through an established message-based interface. All clients will be operating based on a common set of requests and responses, and the server will handle and reject these based on their validity according to game state. Whether you are dealing with local clients on the same machine or remote clients via HTTP does not matter whatsoever from an abstraction standpoint, as they will all be communicating through the same set of requests/responses.
What this comes down to is your protocol. Your protocol should be a well-defined and technically sound language between client and server that will allow clients to a) participate effectively, and b) participate fairly. This protocol should define what messages ('moves') a client can do, and when, and how the server will react.
Your protocol should be fully fleshed out and documented before you even start on game logic - the two are intrinsically connected and you will save a lot of wasted time and effort by competely defining your protocol first.
You protocol is the abstraction between client and server and it will also serve as the design document and programming guide for both.
Protocol design is all about state, state transitions, and validation. Game servers usually have a set of fairly common, generic states for each game instance e.g. initialization, lobby, gameplay, pause, recap, close game, etc...
Each one of these states has important state data related with it. For example, a 'lobby' state on the server-side might contain the known state of each player...how long since the last message or ping, what the player is doing (selecting an avatar, switching settings, going to the fridge, etc.). Organizing and managing state and substate data in code is important.
Managing these states, and the associated data requirements for each is a process that should be exquisitely planned out as they are directly related to volume of work and project complexity - this is very important and also great practice if you are using this project to step up into larger things.
Also, you must keep in mind that if you have a game, and you let people play, people will cheat. It's a fact of life. In order to minimize this, you must carefully design your protocol and state management to only ever allow valid state transitions. Never trust a single client packet.
For every permutation of client/server state, you must enforce a limited set of valid game messages, and you must be very careful in what you allow players to do, and when you allow them to do it.
Project complexity is generally exponential and not linear - client/server game programming is usually a good/painful way to learn this. Great question. Hope this helps, and good luck!