General guidelines when constructing multiple REST APIs on the same server in Python

I was tasked with building a framework to unify half a dozen Python web apps that share some common functionality.
The current setup consists of an Apache web server serving up a specific CGI script (for each app) that talks to a simple XMLRPC service (again, a different service for each app), which in turn runs db queries, writes files to disk, runs external scripts, etc.
The idea was to do away with redundant code (such as 6 different RPC services creating and managing 6 different connections to the same db, executing very similar queries, etc.), as well as with the whole XMLRPC thing. Also, to keep things in Python land and maintain a double layer of abstraction: the client-side app can't talk directly to the API that runs the queries and writes to the disk; it must talk to an intermediate API (currently the CGI scripts) that in turn talks to the processes that perform all the querying / disk writing.
Since this is my first major Python undertaking, I am somewhat struggling to find the correct, DRY, modular, and Pythonic way to architect this framework.
Ideally, I would like to construct a REST API that the code on the client side (currently implemented in React.js) can talk to directly.
This API should then be able to interface with another, internal REST API (currently, this is handled by the XMLRPC processes) that does all the heavy lifting: the aforementioned db querying (perhaps using an ORM), executing tasks, writing files, and the like.
However, I'm not sure how I should implement the inter-API communication: should it be over HTTP (even if both APIs live on the same server), or some other protocol I'm not aware of?
Also, how should I go about consolidating the common bits of code?
Should there be a separate module / process that keeps a live connection to our db, with the disparate APIs connecting to it?
How can I implement an ORM with such a scheme? I don't want new developers querying the db directly; I would like them to talk to an explicit model instead.
I know this is a long and convoluted way of saying "I'm stuck", and for that I apologize and would happily provide more detail if asked. However, what I'm asking for is more a set of general guidelines:
How should different REST APIs (implemented using Flask + SQLAlchemy, for example) talk to each other, whether they live on the same server or on different ones? (A sketch of the HTTP option I have in mind follows these questions.)
How should I abstract common tasks, like connecting to a db, into a separate process?
Finally, what are some good, common-sense rules for keeping this whole thing modular?
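For illustration, the HTTP option I have in mind for the inter-API hand-off would look something like this (the service names, port, and routes are hypothetical):

    # Hypothetical sketch: the public-facing Flask API never touches the db;
    # it forwards work over HTTP to the internal API living on the same host.
    import requests
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    INTERNAL_API = 'http://127.0.0.1:5001'  # internal API, same server

    @app.route('/api/reports', methods=['POST'])
    def create_report():
        # Delegate the heavy lifting (db queries, file writes) to the internal API.
        resp = requests.post(INTERNAL_API + '/reports', json=request.get_json())
        return jsonify(resp.json()), resp.status_code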

Related

CodeIgniter as the web front end and Python as the web service

Is it possible to use Python as the web service and CodeIgniter as the front end?
For a few reasons:
Database security: when saving data, CodeIgniter would pass the data to a Python BaseHTTPServer, or maybe Flask (though I haven't used Flask before).
Preventing SQL injection.
For example: the CodeIgniter front end has a form that sends data; the Python web service on the back end receives the data and serves as an API, and Python is the one in charge of saving the data to MySQL (via MySQLdb).
In theory I don't see why this would be impossible.
You could easily write a web application using CodeIgniter and have the controllers just pass data along to a Python-based web service. If you're interested in a fully decoupled front end/back end, you could also use a queuing layer (such as RabbitMQ) between the data entry facilities in your CI program and the persistence web services in Python.
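To make the hand-off concrete, here is a minimal sketch of the Python side receiving data from the CI controller (the endpoint, table, and credentials are made up for illustration; this is not a production setup):

    # Hypothetical Flask endpoint that accepts the form data POSTed by the
    # CodeIgniter controller and persists it to MySQL.
    from flask import Flask, request, jsonify
    import MySQLdb

    app = Flask(__name__)

    @app.route('/api/records', methods=['POST'])
    def create_record():
        data = request.get_json()
        db = MySQLdb.connect(host='localhost', user='app', passwd='secret', db='appdb')
        try:
            cur = db.cursor()
            # Parameterized query: the driver escapes the values, which is the
            # same injection protection the frameworks' own layers rely on.
            cur.execute("INSERT INTO records (name, value) VALUES (%s, %s)",
                        (data['name'], data['value']))
            db.commit()
        finally:
            db.close()
        return jsonify(status='ok'), 201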
That said, I'm not clear on why you would want to. CodeIgniter is PHP, and includes some excellent data modelling components that integrate fully into the overall framework. Long story short: if you're using CodeIgniter, just have it connect to MySQL and do the data persistence for you.
Likewise, if you'd prefer to code your persistence in Python, why not just use Django? It's a fully realized Python web framework, and also features an excellent ORM and support for MySQL.
I don't really see how either technology gives you clear benefits for database security, provided they are both used properly. Both have built-in methodologies for "cleansing" user-provided data to prevent SQL injection (notes for Django and notes for CodeIgniter).
There are a great many other posts on Stack Overflow dealing with preventing SQL injection in CodeIgniter and other frameworks. Just using Python, or decoupling your front end and back end, will not provide you any additional security or protection guarantees. The only way to get those is to carefully architect your interactions with databases, using all the tools provided by whatever framework you are using, or creating their equivalents if none are available (or switching to a better toolset).
Edit - expansion
Based on the comments above, I figured it was actually worth writing a little more about the potential advantages and real challenges of a decoupled infrastructure.
In principle, it's easy to decouple a front end from a more isolated back end. In either Django or likely CodeIgniter (although I haven't personally seen it done in CI, just in Django), you could leverage the existing model infrastructure but deal with model objects only in memory on the front end, expanding on the existing ORM functionality so that your backend services actually store and retrieve data from a persistence layer (your database).
Practically, this can become quite a bit of work to do right. To gain the security advantages you desire, your decoupled back end needs to treat the front end as if it were, in principle, "hostile", or at the least untrustworthy. So, be sure that you implement a method for the front end to reliably authenticate itself to the back end. Ensure that all traffic between the front end and the back end uses SSL at a minimum. Consider carefully your services architecture (the SOA layer in front of your backend logic) and make sure your APIs, where possible, are MECE (mutually exclusive, collectively exhaustive).
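For instance, one simple way to let the front end reliably authenticate itself is to sign each request with a shared secret (a sketch; the secret handling and header scheme are made up for illustration):

    # Hypothetical HMAC request-signing scheme between front end and back end.
    # The front end signs each request body; the back end verifies before acting.
    import hmac
    import hashlib

    SHARED_SECRET = b'provision-and-rotate-this-out-of-band'

    def sign(body):
        # The front end sends this digest, e.g. in an X-Signature header.
        return hmac.new(SHARED_SECRET, body, hashlib.sha256).hexdigest()

    def verify(body, signature):
        # Constant-time comparison avoids timing side channels.
        return hmac.compare_digest(sign(body), signature)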
I'm sure I'm missing some basic principles, but having recently participated in the design and build of a system along these lines I can assure you that the complexity can very quickly explode, so careful architectural discipline and adherence to both MECE and MVP (minimum viable product) is critical. A decoupled infrastructure can be an amazing end-product if it fits the need, and in use cases I've worked with it has been extremely effective. It isn't a one-size-fits-all, though, and hopefully some of the extra description here can help you make a more informed choice.
Hopefully this helps round out the topic answer. Basic principles: Design for what you need. Don't conflate complicated with secure. Simple can be secure, as can complex, but complexity breeds room for hard-to-plug security vulnerabilities, and simple gives the illusion of security by seeming easy. No approach guarantees a positive outcome, so don't try to cut corners; spend as much time as you can in research and design to minimize your time building, refactoring, and fixing.

Storage Backend based on WebSockets

I have spent quite some time now researching server backends / APIs / frameworks. I need a solution where I can store user content (JSON & binary data).
The obvious choice would be a REST API. The only missing element is a push feature: when data on the server changes, clients should be notified instantly. With more research into this matter I discovered the classic approaches (Comet, push, server-sent events, Bayeux, BOSH, …) as well as the "new" league, WebSockets. I would definitely prefer WebSockets, or raw TCP sockets directly. But this post is not about the pros/cons of these two technologies, so please refrain from getting sidetracked in the comments.
At the moment, the following projects are very similar to my needs:
- Simperium (simperium.com): this looks very promising, but the core/server is sadly not open source, and who knows when, if ever, that will change
- Realtime.co (framework.realtime.co/storage): a hosted service, but the same principle
- Some frameworks for building such servers, e.g. Atmosphere (Java, no WAMP), CometD (Java, project page looks stuck in the '90s), Autobahn (Python, WAMP)
My actual favorite is the Autobahn framework (autobahn.ws), especially using the WAMP protocol (built on top of WebSocket), as it offers exactly what I need. So the idea would be to build a Python backend/server with Autobahn|Python (based on the Twisted framework) that manages all socket (WAMP) connections, with a PostgreSQL database for data storage. WAMP client libraries already exist for all the clients I need. The server would need to be able to do the typical REST API features (a rough sketch follows the list below):
- Send, update, and delete requested data (JSON/binary) between server and clients
- Synchronization & automatic conflict management
- Offline handling when the connection breaks, with automatic resumption once the connection is available again
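As a rough sketch, the kind of Autobahn|Python component I imagine would look something like this (the procedure/topic URIs and the storage call are hypothetical):

    # Hypothetical WAMP backend component with Autobahn|Python on Twisted:
    # clients call a registered procedure to store data, and every subscriber
    # is notified of the change (the push feature).
    from autobahn.twisted.wamp import ApplicationSession, ApplicationRunner
    from twisted.internet.defer import inlineCallbacks

    class StorageBackend(ApplicationSession):

        @inlineCallbacks
        def onJoin(self, details):
            # Expose an RPC endpoint that clients can call to store an item.
            yield self.register(self.store_item, u'com.example.storage.store')

        def store_item(self, key, payload):
            # ... persist to PostgreSQL here (e.g. via psycopg2/txpostgres) ...
            # Then push a change notification to all subscribed clients.
            self.publish(u'com.example.storage.changed', key)
            return True

    if __name__ == '__main__':
        ApplicationRunner(u'ws://127.0.0.1:8080/ws', u'realm1').run(StorageBackend)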
So finally the questions:
- Have I missed an open source project which covers exactly my needs?
- If I wanted to develop my own server with Autobahn and a database, could you point me in the right direction? I have a lot of concerns and not enough deep understanding. I know Autobahn already gives you a server, but it is not very close to my final needs. How do I build the server efficiently so that it can handle all the connected sockets? How do I handle a client that needs server push? Are there schemas, models, or concepts for how such a server should look?
- Twisted is a very powerful Python framework, but it is not regarded as the most convenient for writing apps. Still, I assume a socket-based storage server with db access should be possible? If I ran Twisted as a web resource and developed server components with another Python framework, would this compromise latency/performance much?
- Is such a server backend, with a lot of data storage (JSON fields and also binary data such as documents and images), reasonable for a single developer or small team to build on sockets, or is this something only bigger companies like Dropbox can do at the moment?
Thank you very much for your help & time!
So finally the questions:
Have I missed an open source project which covers exactly my needs?
No, you've covered the open source projects. Open source only gets you about halfway there, though. Implementing a global realtime network requires equal parts implementation and operations. You have to think about dropped messages, retries, what happens if a particular geography gets hot, how you scale your servers, etc. I would argue that an open source solution won't achieve what you want unless you're willing to invest significant resources into operations. I would recommend a service like PubNub: http://pubnub.com
If I wanted to develop my own server with Autobahn and a database, could you point me in the right direction? I have a lot of concerns and not enough deep understanding. I know Autobahn already gives you a server, but it is not very close to my final needs. How do I build the server efficiently so that it can handle all the connected sockets? How do I handle a client that needs server push? Are there schemas, models, or concepts for how such a server should look?
A good database to back a realtime framework would be Cassandra because it supports high write volumes and handles time series data well: http://cassandra.apache.org/.
Twisted is a very powerful Python framework, but it is not regarded as the most convenient for writing apps. Still, I assume a socket-based storage server with db access should be possible? If I ran Twisted as a web resource and developed server components with another Python framework, would this compromise latency/performance much?
I would not use Twisted. I would use gevent: http://www.gevent.org/. It's coroutine based, so you don't get into callback hell. To support more connections you just increase the pool of greenlets listening on the socket.
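For example, a minimal gevent socket server (the port and echo protocol are placeholders), where each connection gets its own cheap greenlet:

    # Minimal gevent StreamServer sketch: one greenlet per connection, so
    # thousands of mostly idle clients are inexpensive to hold open.
    from gevent.server import StreamServer

    def handle(sock, address):
        # Echo loop standing in for real protocol handling.
        while True:
            data = sock.recv(1024)
            if not data:
                break
            sock.sendall(data)

    server = StreamServer(('0.0.0.0', 9000), handle)
    server.serve_forever()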
Is such a server backend, with a lot of data storage (JSON fields and also binary data such as documents and images), reasonable for a single developer or small team to build on sockets, or is this something only bigger companies like Dropbox can do at the moment?
Again, I would not build this on your own. A service like PubNub (http://pubnub.com), which takes care of all the operational issues for you and has a clean API, would serve your needs at minimal cost. PubNub takes care of the protocol for you, so if you're on a mobile device that doesn't support WebSockets, it will use TCP, HTTP, or whatever the best transport is for that device.

Compound custom service server using Twisted

I have an interesting project going on at our workplace. The task before us is this:
Build a custom server using Python
It has a web server part, serving REST
It has an FTP server part, serving files
It has an SMTP part, which receives mail only
Last but not least, it has a background worker that manages low-level file IO based on requests received from the above-mentioned services
Obviously the go-to place was the Twisted library/framework, which is an excellent networking tool. However, studying the docs further, a few things came up that I'm not sure about.
Having a Java background, I would solve the task (at least at the beginning) by spawning a separate thread for each service and going from there. In Python, however, I cannot reasonably do that because of the GIL. I'm not sure how Twisted handles this. I would expect that a large part (if not the majority) of Twisted's code is written in C, where the GIL is not an issue, but I couldn't find that explained in the docs to my satisfaction.
So the most outstanding question is: given that Twisted uses the reactor as its main design pattern, will it be able to:
Serve all those services needed
Do it in a non-blocking fashion (it should, according to docs, but if someone could elaborate, I'd be grateful)
Be able to serve a few hundred clients at once
Serve large file downloads in a reasonable way, meaning that it can serve multiple clients, using multiple services, downloading and uploading large files.
Large files here means on the order of hundreds of MB, or a few GB. The size is not important; it's the time that the client has to stay connected to the server that matters.
Edit: I'm actually inclined to go the way of Python multiprocessing, but I'm not sure whether that's the correct thing to do with Twisted, etc.
Serve all those services needed
Yes.
Do it in a non-blocking fashion (it should, according to docs, but if someone could elaborate, I'd be grateful)
Twisted uses the common reactor model. I/O goes through your choice of poll, select, or whatever to determine whether data is available. It handles only what is available and passes the data along to other stages of your app. This is how it is non-blocking.
I don't think it provides non-blocking disk I/O, but I'm not sure. That feature is not what most people need when they say non-blocking.
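To make the reactor model concrete, here is a minimal sketch of one process attaching several listeners to a single reactor (the ports and the trivial resource are made up; Twisted ships FTP and SMTP factories that attach the same way):

    # One Twisted reactor serving multiple protocols without threads.
    from twisted.internet import reactor
    from twisted.web.server import Site
    from twisted.web.resource import Resource

    class Hello(Resource):
        isLeaf = True
        def render_GET(self, request):
            return b'hello'

    reactor.listenTCP(8080, Site(Hello()))   # the REST/web part
    # reactor.listenTCP(2121, ftp_factory)   # an FTPFactory would attach the same way
    # reactor.listenTCP(2525, smtp_factory)  # ...as would an SMTP factory
    reactor.run()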
Be able to serve a few hundred clients at once
Yes. No. Maybe. What are those clients doing? Is each one hitting refresh every second in a browser, making 100 requests? Is each one doing a numerical simulation of galaxy collisions? Is each one sending the string "hi!" to the server, without expecting a response?
Twisted can easily handle 1000+ requests per second.
Serve large file downloads in a reasonable way, meaning that it can serve multiple clients, using multiple services, downloading and uploading large files.
Sure. For example, the original version of BitTorrent was written in Twisted.

Server Topology Help - Django and Twisted Possibility?

I am currently working on a complex web interface and backend that will need to address several issues:
Scalability
- multiple deployments of varying load demands
Very structured authorization groups
- different views for different user groups
- admin panel (user/content management)
Large managed database
- current data and long-term stored data (histories)
Data updates
- polling, e.g. search queries, static pages/files, report generation per request
- pushing (likely WebSockets), e.g. real-time notifications
Varying protocols
- e.g. HTTP, SSL, WebSockets
I would like to use Python, because I have grown to really enjoy the language, and I am considering some combo of Django and Twisted.
I have some experience with Django, which I love for its MVT style of application programming, its authorization models, its admin panel, and its database API. However, it is not so strong on some of the data requirements I have, in particular the real-time aspects.
Now, I have not really used Twisted before, but I have seen many interesting things about it, in particular the async aspects and the ability to serve many protocols.
The problems in getting the two to work together are obvious: Django is a blocking server and Twisted is designed to be non-blocking. I have seen some topics stating that using the two together is possible, and that people have had success with it. It also seems possible to run both and proxy them to accept different URLs, but getting authentication to work across the two may become tricky?
Having said all of that, I would like to ask if I am on the right track for implementing this system, as well as for suggestions on how to use the two together, alternatives, or whether I should just kick one out (at this point, I guess it'd have to be Django, because the real-time stuff is necessary). I should mention that I have written some of the preliminary data models and views in Django already.
I am quite experienced on the client side of things (JS, CSS, HTML), but I am not so savvy on the server side. Any input would be helpful, thanks.
You can definitely use Twisted with Django. Several projects have used the two together to good effect. twistd web --wsgi provides a basic way to get it set up, and there's a great example with more bells and whistles, like static content, by Alex Clemesha on GitHub.
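A minimal sketch of the same idea done in code, mounting Django's WSGI handler inside a Twisted web server ('mysite.settings' is a hypothetical Django project):

    # Django's WSGI handler wrapped as a Twisted resource. Django views run
    # in the reactor's thread pool, leaving the event loop free for other
    # protocols (e.g. websockets) in the same process.
    import os
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')  # hypothetical

    from twisted.internet import reactor
    from twisted.web.server import Site
    from twisted.web.wsgi import WSGIResource
    from django.core.handlers.wsgi import WSGIHandler

    resource = WSGIResource(reactor, reactor.getThreadPool(), WSGIHandler())
    reactor.listenTCP(8000, Site(resource))
    reactor.run()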

Abstraction and client/server architecture questions for Python game program

Here is where I am at presently. I am designing a card game with the aim of utilizing major components for future work. The part that is hanging me up is creating a layer of abstraction between the server and the client(s). A server is started, and then one or more clients can connect (locally or remotely). I am designing a thick client but my friend is looking at doing a web-based client. I would like to design the server in a manner that allows a variety of different clients to call a common set of server commands.
So, for a start, I would like to create a 'server' which manages the game rules and player interactions, and a 'client' on the local CLI (I'm running Ubuntu Linux for convenience). I'm attempting to flesh out how the two pieces are supposed to interact, without mandating that future clients be CLI-based or on the local machine.
I've found the following two questions which are beneficial, but don't quite answer the above.
Client Server programming in python?
Evaluate my Python server structure
I don't require anything full-featured right away; I just want to establish the basic mechanisms for abstraction so that the resulting mock-up code reflects the relationship appropriately: there are different assumptions at play with a client/server relationship than with an all-in-one application.
Where do I start? What resources do you recommend?
Disclaimers:
I am familiar with code in a variety of languages and general programming/logic concepts, but have little real experience writing substantial amounts of code. This pet project is an attempt at rectifying this.
Also, I know the information is out there already, but I have the strong impression that I am missing the forest for the trees.
Read up on RESTful architectures.
Your fat client can use REST. It will use urllib2 to make RESTful requests of a server. It can exchange data in JSON notation.
A web client can use REST. It can make simple browser HTTP requests or a Javascript component can make more sophisticated REST requests using JSON.
Your server can be built as a simple WSGI application using any simple WSGI components. There are nice ones in the standard library, or you can use Werkzeug. Your server simply accepts REST requests and makes REST responses. Your server can work in HTML (for a browser) or JSON (for a fat client or JavaScript client).
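A minimal sketch of such a WSGI server, using only the standard library (the route and payload are invented for illustration):

    # Tiny WSGI REST server with wsgiref; a real app would dispatch on
    # PATH_INFO and REQUEST_METHOD instead of this single hypothetical route.
    import json
    from wsgiref.simple_server import make_server

    def application(environ, start_response):
        if environ['PATH_INFO'] == '/game/state':
            status, payload = '200 OK', {'turn': 1, 'players': []}
        else:
            status, payload = '404 Not Found', {'error': 'not found'}
        start_response(status, [('Content-Type', 'application/json')])
        return [json.dumps(payload).encode('utf-8')]

    if __name__ == '__main__':
        make_server('localhost', 8000, application).serve_forever()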
I would consider basing all server/client interactions on HTTP, probably with JSON payloads. This doesn't directly allow server-initiated interactions ("server push"), but the (newish yet already traditional ;-) workaround for that is AJAX-y (even though the X makes little sense here, as I suggest JSON payloads, not XML ones ;-): the client initiates an async request (via a separate thread or otherwise) to a special URL on the server, and the server responds to those requests to (in practice) do "pushes". From what you say, it looks like the limitations of this approach might not be a problem.
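The client side of that workaround can be as simple as a loop that keeps one request pending (the URL and handler are hypothetical; urllib2 matches the Python 2 era of the rest of this discussion):

    # Long-poll loop: the request blocks until the server has an event to
    # "push", then the client immediately re-issues it.
    import json
    import urllib2

    def poll_forever(handle_event):
        while True:
            response = urllib2.urlopen(
                'http://localhost:8000/game/events?wait=30')  # hypothetical URL
            handle_event(json.loads(response.read()))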
The key advantage of specifying the interactions in such terms is that they're entirely independent from the programming language -- so the web-based client in Javascript will be just as doable as your CLI one in Python, etc etc. Of course, the server can live on localhost as a special case, but there is no constraint for that as the HTTP URLs can specify whatever host is running the server; etc, etc.
First of all, regardless of the locality or type of the client, you will be communicating through an established message-based interface. All clients will be operating based on a common set of requests and responses, and the server will handle and reject these based on their validity according to game state. Whether you are dealing with local clients on the same machine or remote clients via HTTP does not matter whatsoever from an abstraction standpoint, as they will all be communicating through the same set of requests/responses.
What this comes down to is your protocol. Your protocol should be a well-defined and technically sound language between client and server that will allow clients to a) participate effectively, and b) participate fairly. This protocol should define what messages ('moves') a client can do, and when, and how the server will react.
Your protocol should be fully fleshed out and documented before you even start on game logic; the two are intrinsically connected, and you will save a lot of wasted time and effort by completely defining your protocol first.
Your protocol is the abstraction between client and server, and it will also serve as the design document and programming guide for both.
Protocol design is all about state, state transitions, and validation. Game servers usually have a set of fairly common, generic states for each game instance, e.g. initialization, lobby, gameplay, pause, recap, close game, etc.
Each one of these states has important state data associated with it. For example, a 'lobby' state on the server side might contain the known state of each player: how long since the last message or ping, what the player is doing (selecting an avatar, switching settings, going to the fridge, etc.). Organizing and managing state and sub-state data in code is important.
Managing these states, and the associated data requirements for each, is a process that should be exquisitely planned out, as they are directly related to the volume of work and project complexity. This is very important, and also great practice if you are using this project to step up to larger things.
Also, you must keep in mind that if you have a game, and you let people play, people will cheat. It's a fact of life. In order to minimize this, you must carefully design your protocol and state management to only ever allow valid state transitions. Never trust a single client packet.
For every permutation of client/server state, you must enforce a limited set of valid game messages, and you must be very careful in what you allow players to do, and when you allow them to do it.
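A sketch of what that enforcement can look like in code (the states and message types are invented examples for a card game):

    # Per-state whitelist of legal messages; anything else is rejected
    # without touching game logic. Never trust a single client packet.
    VALID_MESSAGES = {
        'lobby':    {'join', 'set_avatar', 'start_game'},
        'gameplay': {'play_card', 'draw_card', 'chat'},
        'recap':    {'rematch', 'leave'},
    }

    def handle_message(game, player, message):
        if message.get('type') not in VALID_MESSAGES.get(game.state, set()):
            return {'error': 'illegal message for state ' + game.state}
        # Only now is the message allowed to drive a state transition.
        return game.apply(player, message)  # hypothetical rule dispatch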
Project complexity is generally exponential and not linear - client/server game programming is usually a good/painful way to learn this. Great question. Hope this helps, and good luck!
