Relation between components of tornado.httpserver.HTTPServer - python

Taking the following code into account:
http_server = tornado.httpserver.HTTPServer(app)
http_server.bind(options.port)
http_server.start(5)
What is the relation between the five subprocesses? Does the database connection instance that starts up along with the application get shared among the subprocesses?
What is the best practice for using http_server.start(5)?
Many thanks.

It depends on where you connect to the DB. The simplest method is to use one shared DB connection, held as a class attribute on the Application or on a RequestHandler. In that case, because start(5) forks the child processes after the connection has been created, the single connection instance will be shared among all server processes.
For an example implementation, see the Blog demo app.
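As a rough sketch of that setup (db_connect() here is a hypothetical stand-in for whatever database driver you use):

import tornado.httpserver
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        # self.application.db is the connection created below
        self.write("hello")

app = tornado.web.Application([(r"/", MainHandler)])
app.db = db_connect()  # hypothetical helper; runs BEFORE the fork

http_server = tornado.httpserver.HTTPServer(app)
http_server.bind(8888)
http_server.start(5)   # forks five children; each inherits app.db
# To give each child its own connection instead, connect here, after start():
# app.db = db_connect()
tornado.ioloop.IOLoop.current().start()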

Related

How to organize my flask project properly?

I'm writing my blog project with Flask and SQLAlchemy, but I don't know how to organize it. Here is the file tree:
The admin and main are two blueprints; I create my app in the blog's __init__.py.
But where should I create my models, and how do I use them correctly?
Can I create an engine instance and make it global, so that every time I need to connect to the database I just make a new session and bind it to this engine? Or should I establish a new connection and engine for every request?
I personally prefer the global-instance method, for two reasons. First, it reduces the handshake cost, since the connection is long-lived. Second, it can easily be turned into a connection pool.
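A minimal sketch of that global-instance pattern (the file name and database URL are placeholders):

# blog/db.py
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine('sqlite:///blog.db')  # one long-lived, pooled engine
Session = scoped_session(sessionmaker(bind=engine))

# In a view, open a session bound to the shared engine:
#     session = Session()
#     posts = session.query(Post).all()
# and call Session.remove() from a teardown hook when the request ends.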

What is a good way to organize your models, connections if one wants to use SQLAlchemy to connect several databases to various applications?

Background:
This is the situation I am facing and so far my current solution seems rather clunky. I want to improve on it. Right now:
I set up connections to each database in the main function of the Pyramid application:
def main(global_config, **settings):
    a_engine = engine_from_config(settings, 'A.')
    b_engine = engine_from_config(settings, 'B.')
    ASession.configure(bind=a_engine)
    BSession.configure(bind=b_engine)
"ASession" and "BSession" are simply globally defined scoped_session in /models/init.py.
ASession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
I define the model base class like so. For example:
ABase = declarative_base()

class user(ABase):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)
This somehow already doesn't feel very clean. But now that this model is supposed to be accessed from a different application, I also need to define the engine and connection again in that application. This feels extremely redundant.
Problem Abstracted:
Assume that there are 2 different databases:
A and B
Also assume that you want A and B to be accessible from 2 different applications (e.g.: Pyramid application, Bokeh Server App which uses Tornado) using the same model.
In short, how would one best pattern objects/models/classes/functions to produce clean, non-redundant code in Python 3?
Initial Thought After The Question Was Posted:
Thinking about this a bit more, I think I want each model to be somehow "self-contained". The model should bring with it methods for initiating connections. In other words, the initiation of db connections should be decoupled from the web application itself.
And it should be done in an instance-oriented manner, so that multiple applications can use the same models; each application would have its own session connection to either DB.
How would the community pattern this? Friday afternoons don't lend themselves to finding answers to these kinds of questions, for me at least.
I have done this. My recommendation below is how I like doing it, but it is not the only way. I would ditch scoped sessions and the transaction manager and make explicit session-management objects, with request lifecycle callbacks handling creation, closing, committing, or rolling back your sessions. Basically, scoped sessions are a way to simulate a global by handing the same item to everything in a given thread of execution. The other way to do this in Pyramid is to attach things to the registry and the request, because you have those everywhere: shared components go on the registry (the ZCA), per-request objects on the request.
When you have multiple sessions, I've found it much easier to reason about them and keep track of them if they are handled by components that wrap up everything for that engine. So for a case like the one you describe, I've made two different DB engine components that are created on startup, attached to the registry, and have a method for getting a fresh session. If you create these components properly, they should be usable in any application, whether it's Pyramid, Tornado, or your test script; you just make sure each one has a constructor with some sane way of passing in settings for setting up the engine, whether that's a settings dict or kwargs.

I then make my data model(s) live in their own Python packages, so it's easy to have any app in the family import the model, instantiate the engine components, and go to town. Note that if you like using the ZCA registry (and I love it, it's a fantastic DI system), there's nothing preventing you from using it in non-Pyramid apps; you just set it up manually in your server startup code.
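A rough sketch of such a component (the class and method names are invented for illustration):

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

class DBEngineComponent(object):
    """Wraps one engine and hands out fresh sessions for it."""
    def __init__(self, url, **engine_kwargs):
        self.engine = create_engine(url, **engine_kwargs)
        self._make_session = sessionmaker(bind=self.engine)

    def get_fresh_session(self):
        # the caller owns this session and is responsible for closing it
        return self._make_session()

# At startup, in any app in the family:
#     a_db = DBEngineComponent(settings['A.url'])
#     config.registry.a_db = a_db  # Pyramid; other apps keep their own reference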
In Pyramid specifically, I make a custom Request class and use the reify decorator to let other Pyramid code get the session(s) for that request. The request class has end-of-life callbacks attached to close out the sessions and to do rollbacks or commits. There is a bit more boilerplate, but for me it's cleaner, in that I can very easily trace where and when my session management happens, in both code and time. It's also good for testing.
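As a sketch, the same wiring can also be done without a Request subclass, via add_request_method with reify=True (a_db and get_fresh_session come from the hypothetical component above):

def a_session(request):
    session = request.registry.a_db.get_fresh_session()

    def cleanup(request):
        # end-of-request callback: commit on success, roll back on error
        if request.exception is not None:
            session.rollback()
        else:
            session.commit()
        session.close()

    request.add_finished_callback(cleanup)
    return session

# In your configuration:
#     config.add_request_method(a_session, 'a_session', reify=True)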
That said, there are lots of smart folks in SQLAlchemy/Pyramid land who swear by scoped sessions and the transaction manager, so there are other valid approaches. Hope that helps.

Give each user their own database

I'm constructing my app such that each user has their own database (for easy isolation, and to minimize the need for sharding). This means that each web request, and all of the background scripts, need to connect to a different database based on which user is making the request, and use that connection for all function calls.
I figure I can make some sort of middleware that would pass the right connection to my web requests by attaching it to the request variable, but I don't know how I should ensure that all functions and model methods called by the request use this connection.
Well, how to "ensure that all functions and model methods called by the request use this connection" is easy: you pass the connection into your API, as with any well-designed code that doesn't rely on global variables for such things. So you have a database session object loaded per request, and you pass it down. Model objects can carry that session further without it being passed explicitly, because each managed object knows which session owns it, and you can query the session from the object:
db = request.db
user = db.query(User).get(1)
user.add_group('foo')

class User(Base):
    def add_group(self, name):
        db = sqlalchemy.orm.object_session(self)
        group = Group(name=name)
        db.add(group)
I'm not recommending you use that exact pattern but it serves as an example of how to grab the session from a managed object, avoiding having to pass the session everywhere explicitly.
On to your original question, how to handle multi-tenancy... In your data model! Designing a system where you are splitting things up at that low a level is a big maintenance burden, and it does not scale well. For example, it becomes very difficult to use any type of connection pooling when you have an arbitrary number of independent connections. To get around that, people commonly use the SQL SCHEMA feature supported by some databases: it allows you to use the same connection but have access to a different table structure per session. That's better, but again, managing all of those schemas independently should raise some red flags, since all of that duplication in your data model violates DRY. Any duplication at that level quickly becomes a burden that you need to be ready for.
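As a rough sketch, here is what schema switching can look like with SQLAlchemy on PostgreSQL (set_config is a built-in PostgreSQL function; the URL and schema names are placeholders):

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

engine = create_engine('postgresql:///app')  # one shared, pooled engine
Session = sessionmaker(bind=engine)

def session_for_tenant(tenant_schema):
    # same connection pool for every tenant; only the search path differs
    session = Session()
    session.execute(text("SELECT set_config('search_path', :schema, false)"),
                    {'schema': tenant_schema})
    return session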

How to manage db connections using webpy with SQLObject?

Web.py has its own database API, web.db. It's possible to use SQLObject instead, but I haven't been able to find documentation describing how to do this properly. I'm especially interested in managing database connections: it would be best to establish a connection at the WSGI entry point and reuse it. The web.py cookbook contains an example of how to do this with SQLAlchemy; I'd be interested to see how to properly do a similar thing using SQLObject.
This is how I currently do it:
class MyPage(object):
    def GET(self):
        ConnectToDatabase()
        ....
        return render.MyPage(...)
This is obviously inefficient, because it establishes a new database connection on each query. I'm sure there's a better way.
As far as I understand the SQLAlchemy example given, a processor is used; that is, a session is created for each connection and committed when the handler completes (or rolled back if an error has occurred).
I don't see any simple way to do what you propose, i.e. open a connection at the WSGI entry point. You will probably need a connection pool to serve multiple clients at the same time. (I have no idea what the requirements are for efficiency, code simplicity and so on, though. Please comment.)
Inserting ConnectToDatabase calls into each handler is of course ugly. I suggest that you adapt the cookbook example replacing the SQLAlchemy session with a SQLObject connection.
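For example, a rough sketch of that adaptation (connectionForURI and sqlhub are SQLObject's standard connection APIs; the URI is a placeholder):

import web
from sqlobject import connectionForURI, sqlhub

urls = ('/', 'MyPage')
app = web.application(urls, globals())
render = web.template.render('templates/')

# connect once, at the WSGI entry point; SQLObject models then find the
# connection through sqlhub instead of each handler reconnecting
sqlhub.processConnection = connectionForURI('sqlite:/:memory:')

class MyPage(object):
    def GET(self):
        # no ConnectToDatabase() call needed here any more
        return render.MyPage()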

Does Google App Engine run one instance of an app per one request? or for all requests?

Using Google App Engine:
# more code ahead, not shown
application = webapp.WSGIApplication([('/', Home)],
                                     debug=True)

def main():
    run_wsgi_app(application)

if __name__ == "__main__":
    main()
If two different users request the page from two different machines, will two separate instances of the server be invoked?
Or is just one instance of the server running all the time, handling all the requests?
And what if one user opens the page twice in the same browser?
Edit:
According to the answers below, one instance may handle requests from different users one after another. Then consider the following fragment of code, taken from the example Google gave:
class User(db.Model):
    email = db.EmailProperty()
    nickname = db.StringProperty()
1. Are email and nickname here defined as class variables?
2. Do all the requests handled by the same server instance share the same variables, and could they thus interfere with each other by mistake (say, one user's email appearing on another user's page)?
P.S. I know that I should read the manual and docs more, and I am doing so, but answers from experienced programmers will really help me understand faster and more thoroughly. Thanks.
An instance can handle many requests over its lifetime. In the Python runtime's threading model, each instance can only handle a single request at any given time. If two requests arrive at the same time, they might be handled one after the other by a single instance, or a second instance might be spawned to handle one of them.
EDIT:
In general, variables used by each request will be scoped to a RequestHandler instance's .get() or .post() method, and thus can't "leak" into other requests. You should be careful about using global variables in your scripts, as these will be cached in the instance and would be shared between requests. Don't use globals without knowing exactly why you want to (which is good advice for any application, for that matter), and you'll be fine.
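To make the distinction concrete, a small sketch (webapp as in the question; the module-level counter is deliberately the kind of global to avoid):

from google.appengine.ext import webapp

visits = 0  # module-level global: cached for the life of the instance

class Home(webapp.RequestHandler):
    def get(self):
        global visits
        visits += 1                      # shared by every request this instance serves
        name = self.request.get('name')  # local variable: scoped to this one request
        self.response.out.write('%s is visit %d on this instance' % (name, visits))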
App Engine dynamically builds up and tears down instances based on request volume.
From the docs:
App Engine applications are powered by any number of instances at any given time, depending on the volume of requests received by your application. As requests for your application increase, so do the number of instances powering it.
Each instance has its own queue for incoming requests. App Engine monitors the number of requests waiting in each instance's queue. If App Engine detects that queues for an application are getting too long due to increased load, it automatically creates a new instance of the application to handle that load.
App Engine scales instances in reverse when request volumes decrease. In this way, App Engine ensures that all of your application's current instances are being used to optimal efficiency. This automatic scaling makes running App Engine so cost-effective.
When an application is not being used at all, App Engine turns off its associated instances, but readily reloads them as soon as they are needed.
