I'm using peewee with my Tornado web app. When I read peewee's documentation, I found:
Adding Request Hooks
When building web-applications, it is very important that you manage your database connections correctly. In this section I will describe how to add hooks to your web app to ensure the database connection is handled properly.
These steps will ensure that regardless of whether you’re using a simple SQLite database, or a pool of multiple Postgres connections, peewee will handle the connections correctly.
http://docs.peewee-orm.com/en/latest/peewee/database.html
Inside, it explains how to do this for Flask, Django, Bottle, and others, but there is no solution for Tornado.
Is there an easy way to solve this problem in Tornado, or does it not matter at all?
The idea there is that you want to open a connection when a request begins, and close it when the request is finished (the response is returned).
To do this it looks like you can subclass RequestHandler:
from peewee import SqliteDatabase
from tornado.web import RequestHandler

db = SqliteDatabase('my_db.db')

class PeeweeRequestHandler(RequestHandler):
    def prepare(self):
        # Open the connection as the request begins.
        db.connect()
        return super(PeeweeRequestHandler, self).prepare()

    def on_finish(self):
        # Close the connection once the response has been sent.
        if not db.is_closed():
            db.close()
        return super(PeeweeRequestHandler, self).on_finish()
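Your own handlers then subclass PeeweeRequestHandler instead of RequestHandler and pick up the connection handling for free. A minimal sketch, with a hypothetical MainHandler:

class MainHandler(PeeweeRequestHandler):
    def get(self):
        # db is already connected here, courtesy of prepare().
        self.write("hello")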
Related
My question is about the recommended approach for handling database connections when using Flask in a production environment or other environment where performance is a concern. In Flask, the g object is available for storing things, and open database connections can be placed there to allow the application to reuse them in subsequent database queries during the same request. However, the g object doesn't persist across requests, so it seems that each new request would require a new database connection (and the performance hit that entails).
The most related question I found on this matter is this one: How to preserve database connection in a python web server but the answers only raise the abstract idea of connection pooling (without tying it to how one might use it within Flask and how it would survive across requests) or propose a solution that is only relevant to one particular type of database or particular stack.
So my question is about the general approach that should be taken when productionising apps built on Flask that connect to any kind of database. It seems like something involving connection pooling is in the right direction, especially since that would work for a traditional Python application. But I'm wondering what the recommended approach is when using Flask because of the aforementioned issues of persistence across connections, and also the fact that Flask apps in production are run from WSGI servers, which potentially add further complications.
EDIT: Based on the comments recommending Flask-SQLAlchemy: assuming Flask-SQLAlchemy solves the problem, does it also work for Neo4j, or indeed any arbitrary database that the Flask app uses? Many existing database connectors already support pooling natively, so why introduce an additional dependency whose primary purpose is to provide ORM capabilities rather than connection management? Also, how does SQLAlchemy get around the fundamental problem of persistence across requests?
Turns out there is a straightforward way to achieve what I was after. But as the commenters suggested, if it is at all possible to go the Flask-SQLAlchemy route, you might want to go that way. My approach to solving the problem is to save the connection object in a module-level variable that is then imported as necessary. That way it is available for use from within Flask and by other modules. Here is a simplified version of what I did:
app.py
from flask import Flask
from extensions import neo4j
app = Flask(__name__)
neo4j.init_app(app)
extensions.py
from neo4j_db import Neo4j
neo4j = Neo4j()
neo4j_db.py
from neo4j import GraphDatabase
class Neo4j:
    def __init__(self):
        self.app = None
        self.driver = None

    def init_app(self, app):
        self.app = app
        self.connect()

    def connect(self):
        # The driver is created once and reused across requests.
        self.driver = GraphDatabase.driver('bolt://xxx')
        return self.driver

    def get_db(self):
        # Lazily reconnect if the driver has not been created yet.
        if not self.driver:
            return self.connect()
        return self.driver
example.py
from extensions import neo4j
driver = neo4j.get_db()
And from here, driver will contain the database driver, which will persist across Flask requests.
Hope that helps anyone who has the same issue.
I need to create a simple project in Flask, and I don't want to use SQLAlchemy. In the code snippet below, everyone who connects to the server uses the same connection object, but a new cursor object is created for each request. I am asking because I have never used the Python DB API this way before. Is it correct? Should I create a new connection object for each request, use the same connection and cursor object for every request, or use the method below? Which one is correct?
import mysql.connector
from flask import Flask, request
app = Flask(__name__)
try:
    con = mysql.connector.connect(user='root', password='', host='localhost', database='pywork')
except mysql.connector.Error as err:
    print("Something went wrong")

@app.route('/')
def home():
    cursor = con.cursor()
    cursor.execute("INSERT INTO table_name VALUES(NULL,'test record')")
    con.commit()
    cursor.close()
    return ""
WSGI applications may be served by several worker processes and threads, so you might end up with multiple threads using the same connection. You therefore need to find out whether your library's connection implementation is thread safe: look up the documentation and see whether it claims to provide Level 2 thread safety (in terms of the Python DB API's threadsafety attribute, meaning connections may be shared between threads).
Then you should reflect about whether or not you need transactions during your requests. If you find you need transactions (e.g., requests issue multiple database commands with an inconsistent state in between or possible race conditions), you should use different connections, because transactions are always connection wide. Note that some database systems or configurations don't support transactions or don't isolate separate connections from each other.
So if you share a connection, you should assume that you work with autocommit turned on (or better: actually do that).
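If you do decide you need isolation, a common pattern is to open one connection per request instead. Here is a minimal sketch using flask.g and the same pywork database from the question (the get_db/close_db helper names are just illustrative):

import mysql.connector
from flask import Flask, g

app = Flask(__name__)

def get_db():
    # Lazily open one connection per request and cache it on flask.g.
    if 'db' not in g:
        g.db = mysql.connector.connect(
            user='root', password='', host='localhost', database='pywork')
    return g.db

@app.teardown_appcontext
def close_db(exception=None):
    # Close this request's connection when the app context ends.
    db = g.pop('db', None)
    if db is not None:
        db.close()

@app.route('/')
def home():
    con = get_db()
    cursor = con.cursor()
    cursor.execute("INSERT INTO table_name VALUES(NULL,'test record')")
    con.commit()
    cursor.close()
    return ""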
Most of the blog posts and examples on the web about connecting to MongoDB using Mongoengine in Python/Django suggest adding these lines to the app's settings.py file:
from mongoengine import connect
connect('project1', host='localhost')
It works fine in most cases, except one I faced recently:
When the database is down!
Say the db goes down: the process that supervises the web server (in my case, Supervisord) will stop running the app because of the exception that connect throws. It may retry a few more times, but once its timeout is reached, it will stop trying.
So even the parts of your app that are not tied to the db will break down too.
A quick solution to this is adding a try/exception block to the connect code:
try:
    connect('project1', host='localhost')
except Exception as e:
    print(e)
but I am looking for a better and clean way to handle this.
Unfortunately this is not really possible with mongoengine unless you go with a try/except solution like you did.
You could try connecting with pure pymongo version 3.0+ using MongoClient and registering the connection manually in the mongoengine.connection._connection_settings dictionary (quite hacky, but it should work). From the pymongo documentation:
Changed in version 3.0: MongoClient is now the one and only client class for a standalone server, mongos, or replica set. It includes the functionality that had been split into MongoReplicaSetClient: it can connect to a replica set, discover all its members, and monitor the set for stepdowns, elections, and reconfigs.
The MongoClient constructor no longer blocks while connecting to the server or servers, and it no longer raises ConnectionFailure if they are unavailable, nor ConfigurationError if the user’s credentials are wrong. Instead, the constructor returns immediately and launches the connection process on background threads.
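A rough sketch of that idea follows. The registration step pokes at mongoengine internals (_connection_settings and _connections are private dictionaries, and the shape assumed here may change between versions), so treat it as illustrative only:

from pymongo import MongoClient
import mongoengine.connection

# pymongo 3.0+: the constructor returns immediately and connects on
# background threads, so this does not raise if the database is down.
client = MongoClient('localhost', 27017)

# Hacky: register the client under mongoengine's default alias by hand.
# The dictionary shapes below are assumptions about private internals,
# not a documented API.
mongoengine.connection._connection_settings['default'] = {'name': 'project1'}
mongoengine.connection._connections['default'] = client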
I'm building a small website and I already have all my models in SQLAlchemy. The website is to publish some information from some calculations which are done offline. Only the results will be published to a slimmed down database i.e. it contains the results, not the raw data, but the website needs to query the results.
I'm going to use Flask, as my models are already driven with Python (and some heavy lifting in C++ via SWIG) and I don't want to use Django.
Now, I'm sure this has been asked before, and the usual mantra, without much justification, is "use Flask-SQLAlchemy". The question is: why?
If I write some session handling myself, why do I have to go through the additional layer of redefining my database in Flask-SQLAlchemy? Other than having to write some code like this somewhere in my Flask app:
@app.before_request
def before_request():
    g.db = connect_db()

@app.teardown_request
def teardown_request(exception):
    db = getattr(g, 'db', None)
    if db is not None:
        db.close()
What else do I need to worry about? SQLAlchemy even does connection pooling for me by default.
Actually, you are building a web application with Flask that does its database work through SQLAlchemy. So when dealing with database sessions across the multiple requests your application handles, you have to make sure that you create and close sessions carefully.
If you read the SQLAlchemy docs, they recommend keeping the lifecycle of the session separate and external from functions and objects that access and/or manipulate database data. This greatly helps with achieving a predictable and consistent transactional scope.
A web application is the easiest case because such an application is already constructed around a single, consistent scope - this is the request, which represents an incoming request from a browser, the processing of that request to formulate a response, and finally the delivery of that response back to the client. Integrating web applications with the Session is then the straightforward task of linking the scope of the Session to that of the request. The Session can be established as the request begins, or using a lazy initialization pattern which establishes one as soon as it is needed. The request then proceeds, with some system in place where application logic can access the current Session in a manner associated with how the actual request object is accessed. As the request ends, the Session is torn down as well, usually through the usage of event hooks provided by the web framework. The transaction used by the Session may also be committed at this point, or alternatively the application may opt for an explicit commit pattern, only committing for those requests where one is warranted, but still always tearing down the Session unconditionally at the end.
In layman's terms, I mean to say that:
SQLAlchemy recommends the above because sessions in a web application should be scoped, meaning that each request handler creates and destroys its own session.
This is necessary because web servers can be multi-threaded, so multiple requests might be served at the same time, each working with a different database session.
This means that if you use SQLAlchemy with Flask directly, you have to handle sessions manually: create scoped sessions and carefully remove them on each request (otherwise you may be in deep shit), which adds an extra layer of complexity to your web application.
But here comes Flask-SQLAlchemy (an extension of the SQLAlchemy library for Flask apps), which provides the infrastructure to align the lifespan of a Session with that of each web request. In fact, the SQLAlchemy docs also recommend using it with Flask.
Flask-SQLAlchemy creates a fresh scoped session for each request. If you dig further into it, you will find that it also installs a hook on app.teardown_appcontext (for Flask >= 0.9), app.teardown_request (for Flask 0.7-0.8), or app.after_request (for Flask < 0.7), and that is where it calls db.session.remove().
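For reference, a minimal Flask-SQLAlchemy setup looks like the sketch below (the SQLite URL is a placeholder); the session lifecycle is handled for you:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///results.db'  # placeholder URL
db = SQLAlchemy(app)

# db.session is a scoped session that Flask-SQLAlchemy removes
# automatically at the end of each request; no manual teardown is needed.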
The code you put in the question is not actually valid for SQLAlchemy integration in Flask. I know it is just an example, but I'm saying that just in case.
For SQLAlchemy integration, all you need to do is make sure the current DbSession is cleaned up at the end of each request via something like this:
@app.teardown_appcontext
def shutdown_session(exception=None):
    DbSession.remove()

where DbSession is a scoped session.
Here is the documentation for the case when you don't want to use the Flask-SQLAlchemy package.
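To make the snippet above self-contained, here is a sketch of how DbSession might be created with plain SQLAlchemy (the engine URL is a placeholder):

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

# scoped_session gives each thread (and therefore each request) its own
# session; shutdown_session() above removes it when the request ends.
engine = create_engine('sqlite:///results.db')  # placeholder URL
DbSession = scoped_session(sessionmaker(bind=engine))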
I am using the Django ORM layer outside of Django. The project is a web application using a custom in-house framework.
Now, I had no problem setting up the Django ORM to run standalone, but I am a little worried about connection management. I have read "Using only DB part of Django" here at SO, and it is true that Django does some special connection handling at the beginning and end of each request. From django/db/__init__.py:
# Register an event that closes the database connection
# when a Django request is finished.
def close_connection(**kwargs):
    for conn in connections.all():
        conn.close()
signals.request_finished.connect(close_connection)

# Register an event that resets connection.queries
# when a Django request is started.
def reset_queries(**kwargs):
    for conn in connections.all():
        conn.queries = []
signals.request_started.connect(reset_queries)

# Register an event that rolls back the connections
# when a Django request has an exception.
def _rollback_on_exception(**kwargs):
    from django.db import transaction
    for conn in connections:
        try:
            transaction.rollback_unless_managed(using=conn)
        except DatabaseError:
            pass
signals.got_request_exception.connect(_rollback_on_exception)
What problems can I run into if I skip this connection management? (I have no way to plug in those signals into my framework easily)
It depends on your use case. Each of these functions does something specific, which may or may not affect you.
If this is a long-running process and you have DEBUG on, you'll need to reset the queries or it'll keep all the queries you've run in memory.
If you spawn lots of threads, connect to the DB once early on in each thread, and then leave the threads running, you'll also want to close the connections when you're done using the DB, or you could hit your DB's connection limit.
You almost certainly don't need _rollback_on_exception - I'm assuming you have your intended transaction behavior set up within the relevant code itself.
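If your framework exposes any end-of-request hook, you can replicate the important part (closing connections) by hand. A minimal sketch, where the wiring at the bottom is hypothetical and depends on your framework:

from django.db import connections

def close_db_connections():
    # Mirrors Django's request_finished handler quoted above.
    for conn in connections.all():
        conn.close()

# Hypothetical: register this with whatever per-request teardown hook
# your in-house framework provides, e.g.:
# framework.on_request_finished(close_db_connections)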