Flask-SQLAlchemy "MySQL server has gone away" when using HAproxy - python

I've built a small python REST service using Flask, with Flask-SQLAlchemy used for talking to the MySQL DB.
If I connect directly to the MySQL server everything is good, no problems at all. If I use HAproxy (handles HA/failover, though in this dev environment there is only one DB server) then I constantly get MySQL server has gone away errors if the application doesn't talk to the DB frequently enough.
My HAproxy client timeout is set to 50 seconds, so what I think is happening is it cuts the stream, but the application isn't aware and tries to make use of an invalid connection.
Is there a setting I should be using when using services like HAproxy?
Also it doesn't seem to reconnect automatically, but if I issue a request manually I get Can't reconnect until invalid transaction is rolled back, which is odd since it is just a select() call I'm making, so I don't think it is a commit() I'm missing - or should I be calling commit() after every ORM based query?

Just to tidy up this question with an answer I'll post what I (think I) did to solve the issues.
Problem 1: HAproxy
Either increase the HAproxy client timeout value (globally, or in the frontend definition) to a value longer than what MySQL is set to reset on (see this interesting and related SF question)
Or set SQLALCHEMY_POOL_RECYCLE = 30 (30 in my case was less than HAproxy client timeout) in Flask's app.config so that when the DB is initialised it will pull in those settings and recycle connections before HAproxy cuts them itself. Similar to this issue on SO.
Problem 2: Can't reconnect until invalid transaction is rolled back
I believe I fixed this by tweaking the way the DB is initialised and imported across various modules. I basically now have a module that simply has:
from flask.ext.sqlalchemy import SQLAlchemy
db = SQLAlchemy()
Then in my main application factory I simply:
from common.database import db
db.init_app(app)
Also since I wanted to easily load table structures automatically I initialised the metadata binds within the app context, and I think it was this which cleanly handled the commit() issue/error I was getting, as I believe the database sessions are now being correctly terminated after each request.
with app.app_context():
# Setup DB binding
db.metadata.bind = db.engine

Related

Taking mongoengine.connect out of the setting.py in django

Most of the blog posts and examples on the web, in purpose of connecting to the MongoDB using Mongoengine in Python/Django, have suggested that we should add these lines to the settings.py file of the app:
from mongoengine import connect
connect('project1', host='localhost')
It works fine for most of the cases except one I have faced recently:
When the database is down!
Let say if db goes down, the process that is taking care of the web server (in my case, Supervisord) will stop running the app because of Exception that connect throws. It may try few more times but after its timeout reached, it will stop trying.
So even if your app has some parts that are not tied to db, they will also break down.
A quick solution to this is adding a try/exception block to the connect code:
try:
connect('project1', host='localhost')
except Exception as e:
print(e)
but I am looking for a better and clean way to handle this.
Unfortunately this is not really possible with mongoengine unless you go with the try-except solution like you did.
You could try to connect with pure pymongo version 3.0+ using MongoClient and registering the connection manually in the mongoengine.connection._connection_settings dictionary (quite hacky but should work). From pymongo documentation:
Changed in version 3.0: MongoClient is now the one and only client class for a standalone server, mongos, or replica set. It includes the functionality that had been split into MongoReplicaSetClient: it can connect to a replica set, discover all its members, and monitor the set for stepdowns, elections, and reconfigs.
The MongoClient constructor no longer blocks while connecting to the server or servers, and it no longer raises ConnectionFailure if they are unavailable, nor ConfigurationError if the user’s credentials are wrong. Instead, the constructor returns immediately and launches the connection process on background threads.

Flask-SQLAlchemy and SQLAlchemy

I'm building a small website and I already have all my models in SQLAlchemy. The website is to publish some information from some calculations which are done offline. Only the results will be published to a slimmed down database i.e. it contains the results, not the raw data, but the website needs to query the results.
I'm going to use Flask, as my models are already driven with Python (and some heavy lifting in C++ via SWIG) and I don't want to use Django.
Now this has been asked before I'm sure and the usual mantra without much justification is to 'use Flask-SQLAlchemy'. The question is why?
If I write some session handling myself, why do I have to go through the additional layer of redefining my database in Flask-SQLAlchemy. Other than having to write some code like here in my Flask app somewhere:
#app.before_request
def before_request():
g.db = connect_db()
#app.teardown_request
def teardown_request(exception):
db = getattr(g, 'db', None)
if db is not None:
db.close()
What else do I need to worry about? SQLAlchemy even does connection pooling for me by default.
Actually, you are building a web application using Flask which do database related kinds of stuff by sqlalchemy. So, when you are dealing with database session with multiple requests which your application handles, then you have to ascertain that, you are creating and closing sessions cautiously.
If you read SQLAlchemy docs, they recommend to keep the lifecycle of the session separate and external from functions and objects that access and/or manipulate database data. This will greatly help with achieving a predictable and consistent transactional scope.
A web application is the easiest case because such an application is already constructed around a single, consistent scope - this is the request, which represents an incoming request from a browser, the processing of that request to formulate a response, and finally the delivery of that response back to the client. Integrating web applications with the Session is then the straightforward task of linking the scope of the Session to that of the request. The Session can be established as the request begins, or using a lazy initialization pattern which establishes one as soon as it is needed. The request then proceeds, with some system in place where application logic can access the current Session in a manner associated with how the actual request object is accessed. As the request ends, the Session is torn down as well, usually through the usage of event hooks provided by the web framework. The transaction used by the Session may also be committed at this point, or alternatively the application may opt for an explicit commit pattern, only committing for those requests where one is warranted, but still always tearing down the Session unconditionally at the end.
In laymen terms, i mean to say that
In SQLAlchemy, above action is mentioned because sessions in web application should be scoped, meaning that each request handler creates and destroys its own session.
This is necessary because web servers can be multi-threaded, so multiple requests might be served at the same time, each working with a different database session.
This means that if you are using SqlAlchemy with Flask, you have to manually handle session like creating scoped session and also remove them on each request cautiously otherwise you may be in deep shit, which adds extra layer of complexity to your web application.
But, there comes Flask-SqlAlchemy (an extension of sqlalchemy library for Flask app) which provides infrastructure to assist in the task of aligning the lifespan of a Session with that of each web request. Actually, you can also found that in SqlAlchmey docs, they also recommend to use this with Flask.
Flask-SQLAlchemy creates a fresh/new scoped session for each request. If you dig further it you will find out here, it also installs a hook on app.teardown_appcontext (for Flask >=0.9), app.teardown_request (for Flask 0.7-0.8), app.after_request (for Flask <0.7) and here is where it calls db.session.remove().
The code you put in the question is not actually valid for Sqlalchemy integration is Flask. I know it is just example, but saying that just in case.
For Sqlalchemy integration all you need to do is to make sure current DbSession is cleaned up at the end of request via something like this:
#app.teardown_appcontext
def shutdown_session(exception=None):
DbSession.remove()
where DbSession is scoped session.
Here is documentation for the case when you dont want to use Flask-Sqlalchemy package.

In Django, how can I set db connection timeout?

OK, I know it's not that simple. I have two db connections defined in my settings.py: default and cache. I'm using DatabaseCache backend from django.core.cache. I have database router defined so I can use separate database/schema/table for my models and for cache. Perfect!
Now sometimes my cache DB is not available and there are two cases:
Connection to databse was established already when DB crashed - this is easy - I can use this recipe: http://code.activestate.com/recipes/576780-timeout-for-nearly-any-callable/ and wrap my query like this:
try:
timelimited(TIMEOUT, self._meta.cache.get, cache_key))
expect TimeLimitExprired:
# live without cache
Connection to database wasn't yet established - so I need to wrap in timelimited some portion of code that actually establishes database connection. But I don't know where such code exists and how to wrap it selectively (i.e. wrap only cache connection, leave default connection without timeout)
Do you know how to do point 2?
Please note, this answer https://stackoverflow.com/a/1084571/940208 is not correct:
grep -R "connect_timeout" /usr/local/lib/python2.7/dist-packages/django/db
gives no results and cx_Oracle driver doesn't support this parameter as far as I know.

Database connection management when using only Django ORM

I am using Django ORM layer outside of Django. The project is a web application using a cusotm in-house built framework.
Now, I had no problems to set up Django ORM to run standalone, but I am a little bit worried about connection management. I have read Using only DB part of Django here at SO and it is true that Django does some special connection handling at the beginning and the end of each request. From django/db/__init__.py:
# Register an event that closes the database connection
# when a Django request is finished.
def close_connection(**kwargs):
for conn in connections.all():
conn.close()
signals.request_finished.connect(close_connection)
# Register an event that resets connection.queries
# when a Django request is started.
def reset_queries(**kwargs):
for conn in connections.all():
conn.queries = []
signals.request_started.connect(reset_queries)
# Register an event that rolls back the connections
# when a Django request has an exception.
def _rollback_on_exception(**kwargs):
from django.db import transaction
for conn in connections:
try:
transaction.rollback_unless_managed(using=conn)
except DatabaseError:
pass
signals.got_request_exception.connect(_rollback_on_exception)
What problems can I run into if I skip this connection management? (I have no way to plug in those signals into my framework easily)
It depends on your use case. Each of these functions do something specific, which may or may not affect you.
If this is a long-running process and you have DEBUG on, you'll need to reset the queries or it'll keep all the queries you've run in memory.
If you spawn lots of threads, connect to the DB once early on in each thread, and then leave the threads running, you'll also want to close the connections when you're done using the DB, or you could hit your DB's connection limit.
You almost certainly don't need _rollback_on_exception - I'm assuming you have your intended transaction behavior set up within the relevant code itself.

SqlAlchemy "too many clients already" error

I'm fairly new to python and pyramid framework. Recently I was introduced to SQLSoup to take care of my database (postgres) needs.
dbEngine1 = SqlSoup(settings['sqlalchemy.db1.url'])
users = dbEngine1.users.fetchall()
Everything is working great, however after a short period of using the pyramid app, I'm getting this error message. I have to kill pyramid to release all idling connections in postgres (about 50 idling connection before throwing below exception)
sorry, too many clients already
My question is, how do I close this idling connection, I tried adding a line of code as shown below, but it does not help.
dbEngine1 = SqlSoup(settings['sqlalchemy.db1.url'])
users = dbEngine1.users.fetchall()
dbEngine1.engine.connect().close()
Any pointer from SQLAlchemy gurus?
Seems like you create dbEngine1 on each request to yours Pyramid app.
For proper usage SqlSoup in webapp you must use SA Sessions.
See section "Accessing the Session" on this page.
how do I close this idling connection
SqlSoup such as raw SA use the connection pool, each connection in pool at idle status util query executes. This connection pool must be created once.

Categories