Context: I'm working on a Flask app, running on CherryPy, DB handled using SQLAlchemy ORM.
Problem:
The app runs fine and does everything I want, however, if I have a page which fetches some data from DB and displays, and I press and hold "Ctrl + R" or "F5". That is, continuously refresh a page, making that many DB requests. First few goes fine, and then it breaks.
The following errors are logged:
(OperationalError) (2013, 'Lost connection to MySQL server during query')
Can't reconnect until invalid transaction is rolled back (original cause:
InvalidRequestError: Can't reconnect until invalid transaction is rolled back)
This result object does not return rows. It has been closed automatically.
(ProgrammingError) (2014, "Commands out of sync; you can't run this command now")
There's also another error which bothers me (but not logged this time), it's
dictionary changed size during iteration
This happens when I'm iterating through a query, using values obtained to populate a dictionary. The dictionary is local (scope of the dict) to the function.
More info:
How I am handling sessions:
A new session is created when you enter any page, use that session to perform all the DB transactions, and the session is closed right before rendering the HTML. Technically, that means, the scope of session is the same as the HTTP request.
I do a session.rollback() only when there's an exception raised during updating table or inserting into a table. No rollback() during any query() operations.
I'm pretty sure I've made some silly mistakes or am not doing things the right way.
Unlimited refreshes like that is not really a probably scenario, but can't be overlooked.
Also, I think the behavior would be similar when there a lot of users using it at the same time.
How the SQLAlchemy engine, sessionmaker was handled:
sql_alchemy_engine = create_engine(self.db_string, echo=False, encoding="utf8", convert_unicode=True, pool_recycle=9)
sqla_session = sessionmaker(bind=sql_alchemy_engine)
It's done only ONCE like it's recommended in the SQLA documentation, and a new session is created and returned sqla_session() whenever required.
If you're using Flask, you should be using flask-sqlalchemy, and let the Flask request context manage your session, and not handling your engine and sessions by hand. This is how SQLAlchemy recommends it:
Most web frameworks include infrastructure to establish a single Session, associated with the request, which is correctly constructed and torn down corresponding torn down at the end of a request. Such infrastructure pieces include products such as Flask-SQLAlchemy, for usage in conjunction with the Flask web framework, and Zope-SQLAlchemy, for usage in conjunction with the Pyramid and Zope frameworks. SQLAlchemy strongly recommends that these products be used as available.
http://docs.sqlalchemy.org/en/rel_0_9/orm/session.html?highlight=flask
Then you create your engine simply by:
from flask import Flask
from flask.ext.sqlalchemy import SQLAlchemy
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'your db uri'
db = SQLAlchemy(app)
Or, if you're using app factory:
from flask import Flask
from flask.ext.sqlalchemy import SQLAlchemy
db = SQLAlchemy()
def create_app():
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'your db uri'
db.init_app(app)
With that, the base declarative model you should be using will be at db.Model and the session you should be using will be at db.session.
Related
In short, why am I getting an "sqlalchemy.exc.InvalidRequestError: A transaction is already begun. Use subtransactions=True to allow subtransactions" error?
Following the best practices of separating and keeping the session external, I created foo(input) with a context manager instead of using the try / except / else. If I use foo(user) instead of it I get the above error. My guess is that foo() isn't committing and closing the connection. Howevere the documentation states otherwise.
Flask documentation uses a scoped_session but the SQLAlchemy documentation says "It is however strongly recommended that the integration tools provided with the web framework itself be used, if available, instead of scoped_session." Perhaps the scoped_session is causing errors across threads with the requests?
Here is my main code:
#__init__.py
import os
from flask import Flask, render_template, redirect, request, url_for
def create_app(test_config=None):
# create and configure the app
app = Flask(__name__, instance_relative_config=False)
app.config.from_object('config.DevelopmentConfig')
# set up extensions
# all flask extensions must support factory pattern
# can run these two steps from the cli
from app.database import init_db
init_db()
#app.route('/')
def index():
return render_template('index.html')
from app.auth import RegistrationForm
from app.models import User
from app.database import db_session, foo
#app.route('/register', methods=['GET', 'POST'])
def register():
form = RegistrationForm(request.form)
if request.method == 'POST' and form.validate():
user = User(form.name.data, form.email.data,
form.password.data)
foo(user)
# try:
# db_session.add(user)
# except:
# db_session.rollback()
# raise
# else:
# db_session.commit()
return redirect(url_for('login'))
return render_template('register.html', form=form)
#app.route('/login', methods=['GET'])
def login():
return render_template('login.html')
#app.teardown_appcontext
def shutdown_session(exception=None):
db_session.remove()
return app
Here is my database code:
#database.py
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base
_database_uri = os.environ['DATABASE_URL'] engine = create_engine(_database_uri)
db_session = scoped_session(sessionmaker(autocommit=False,
autoflush=False,
bind=engine))
Base = declarative_base() Base.query = db_session.query_property()
def init_db():
# import all modules here that might define models so that
# they will be registered properly on the metadata. Otherwise
# you will have to import them first before calling init_db()
import app.models
Base.metadata.create_all(bind=engine)
def foo(input):
with db_session.begin() as session:
session.add(input)
I'm not sure whether this will actually answer your question or not but I think it worth mentioning.
Near the last line in your database.py file, I suggest you to not alias db_session.begin() as session because you'll then be confused thinking that session is an object of Session class while it's an object of SessionTransaction class which is:
largely an internal object that in modern use provides a context manager for session transactions. SessionTransaction
You can switch to either:
with db_session() as session, session.begin():
session.add(input)
or shorter version
with db_session.begin():
db_session.add(input)
Also you need to wrap your User object creation with Session.begin() context like below:
def register():
form = RegistrationForm(request.form)
if request.method == 'POST' and form.validate():
with db_session.begin():
user = User(form.name.data, form.email.data,
form.password.data)
foo(user)
Because User model is just a proxy object that will actually execute database query under the hood. Therefore A transaction is already begun. in the creation process. The exception itself will be raised upon the next transactional query calls.
When using a Session, it’s useful to consider the ORM mapped objects that it maintains as proxy objects to database rows, which are local to the transaction being held by the Session. In order to maintain the state on the objects as matching what’s actually in the database, there are a variety of events that will cause objects to re-access the database in order to keep synchronized. It is possible to “detach” objects from a Session, and to continue using them, though this practice has its caveats. It’s intended that usually, you’d re-associate detached objects with another Session when you want to work with them again, so that they can resume their normal task of representing database state.
Session Basic
As additional answer to your last question: Perhaps the scoped_session is causing errors across threads with the requests?
No. SQLAlchemy scoped_session is actually a helper function that act as Registry for the global Session object. It is very useful in multithreading application, helping to ensure the same Session object is being used accross threads while each keeping their own data to local via threading.local api provided by python. Most web framework uses threading strategies to cope with many web requests at once hence most of them provide some integration with/without this helper.
This is how I basically create my app and db:
from flask import Flask
from flask_restful import Api
from flask_sqlalchemy import SQLAlchemy
app = Flask(__name__)
api = Api(app)
app.config['SQLALCHEMY_DATABASE_URI'] = mysql+pymysql://user:pass#heroku-hostname
db = SQLAlchemy(app)
#app.before_first_request
def create_tables():
db.create_all()
I do not explicitly handle any connection object myself, only using this db, like:
def save_to_db(self):
db.session.add(self)
db.session.commit()
def find_by_email(email):
return User.query.filter_by(email=email).first()
Pure SQLAlchemy creates some kind of Engine object with pooling parameters while Flask-SQLAlchemy's configuration documentation says there are keys for pooling options, but they are getting deprecated.
This clearly seems as an abstraction built upon this Engine object. Also, it shows these settings are actually getting default values if not specified.
The last config field is the SQLALCHEMY_ENGINE_OPTIONS, and it states indeed: A dictionary of keyword args to send to create_engine().
So how is this happening from here? Are we just basically using this single config field from now to run the original SQLAlchemy Engine functions?
What kind of dictionary values should I provide; do I still get any default values?
Coming from Flask and learning things in a top-down approach I'm a bit confused how things are working at lower layers.
I've been using Flask-SQLAlchemy for a project for about a year. I like that it abstract away the sessioning for me. But now I need more granular control over my sessioning, namely to make a DB connection in a thread after the user has left my application. Is it possible / are there any dangers to use both Flask-SQLAlchemy and SQLAlchemy at the same time?
Bonus: if I must revert to just SQLAlchemy, what must I know? Is it just session scope?
EDIT trying detached session:
(pdb) db
<SQLAlchemy engine=None>
(Pdb) db.session
<sqlalchemy.orm.scoping.scoped_session object at 0x104b81210>
(Pdb) db.session()
*** RuntimeError: application not registered on db instance and no application bound to current context
You have an app like:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
app = Flask(__name__)
db = SQLAlchemy(app)
Currently you use:
#app.route('/add-some-item', method=['POST'])
def add_some_item():
some_item = SomeItem(foo=request.form.get('foo'))
db.session.add(some_item)
db.session.commit()
return 'The new id is: {}'.format(some_item.id)
But you can also use:
def i_run_in_some_other_thread():
# no need for a Flask request context, just use:
session = db.session() # a bare SQLAlchemy session
...
some_item = SomeItem(foo=bar)
session.add(some_item)
session.commit()
#app.route('/do-a-cron-job')
def do_a_cron_job()
Thread(target=i_run_in_some_other_thread).start()
return 'Thread started.'
By default a session is bound to a thread. For this simple case you don't need to do any changes to your code at all, but if sessions are shared between threads, then you would need to do a few changes: “Session and sessionmaker()”.
Just don't share sessions or objects between threads I'd say, or things will get messy. Share IDs and you're fine.
I have purposefully defined 2 different engines (using the same DB URL) meant for 2 sessions with different configuration, Pyramid's model.py:
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
DBSessionTask = scoped_session(sessionmaker(extension=ZopeTransactionExtension(), expire_on_commit=False))
Configuring sessions (in Pyramid app's main __init__.py):
engine = engine_from_config(settings, 'sqlalchemy.')
DBSession.configure(bind=engine)
Base.metadata.bind = engine
engine_task = engine_from_config(settings, 'sqlalchemy.')
DBSessionTask.configure(bind=engine_task)
The sessions are meant to be used for 2 different categories of objects (DBSessionTask for long-running supervision objects kept in the app-wide settings, DBSession for typical scoped session on "data" objects of a web app).
I'm getting a warning:
sqlalchemy\orm\scoping.py:99: SAWarning: At least one scoped session is already present. configure() can not affect sessions that have already been created.
warn('At least one scoped session is already present. '
Those are 2 different engines, so why SQA is warning me about it? They're using the same DB url of course, but why should that be a problem?
If you want to use multiple sessions in Pyramid+SQLAlchemy, you should manage them explicitly instead of relying on scoped sessions. The scoped session sessionmaker expects to make one session per thread, hence your issues. Many pyramid devs prefer doing this anyway as a general rule as it fits well with the pyramid philosophy of passing everything through the request and context objects. My preference is to make a db engine component that has a method for getting and closing the session, and register this component through the configurator. Then I have a custom request factory that creates the db session at the beginning of the request and commits or rolls it back at the end. You can do the same without a custom request factory too by registering request lifecycle callbacks in your configurator section. Here is an example of doing the above, taken from the cookbook, which you could adapt for multiple engines easily enough:
http://pyramid-cookbook.readthedocs.org/en/latest/database/sqlalchemy.html
# __init__.py
from pyramid.config import Configurator
from sqlalchemy import engine_from_config
from sqlalchemy.orm import sessionmaker
def db(request):
maker = request.registry.dbmaker
session = maker()
def cleanup(request):
if request.exception is not None:
session.rollback()
else:
session.commit()
session.close()
request.add_finished_callback(cleanup)
return session
def main(global_config, **settings):
config = Configurator(settings=settings)
engine = engine_from_config(settings, prefix='sqlalchemy.')
config.registry.dbmaker = sessionmaker(bind=engine)
config.add_request_method(db, reify=True)
You should use one scope session binding your models to different database engines instead.
I have a python module UserManager that takes care for all things user management related - users, groups, rights, authentication. Access to these assets is provided via master class that is passed SQLAlchemy engine parameter at constructor. The engine is needed to make the table-class mappings (using mapper objects), and to emit sessions.
This is how the gobal variables are established in the app module:
class UserManager:
def __init__(self, db):
self.db = db
self._db_session = None
meta = MetaData(db)
user_table = Table(
'USR_User', meta,
Column('field1'),
Column('field3')
)
mapper(User, user_table)
#property
def db_session(self):
if self._db_session is None:
self._db_session = scoped_session(sessionmaker())
self._db_session.configure(bind=self.db)
return self._db_session
class User(object):
def init(self, um):
self.um = um
from flask.ext.sqlalchemy import SQLAlchemy
db = SQLAlchemy(app)
um = UserManager(db.engine)
This module as such is designed to be context-agnostic by purpose, so that it can be used both for locally run and web application.
But here the problems arise: time to time I get the dreaded "Can't reconnect until invalid transaction is rolled back" error, presumably caused by some failed transaction in the UserManager code.
I am now trying to identify the problem source. Maybe it is not right way how to handle the database in the dynamic context of web server? Perhaps I have to pass the db.session to the um object so that I can be sure that the db connections are not mixed up?
In web context you should consider the request for every user isolated. For this you must use the flask.g
To share data that is valid for one request only from one function to
another, a global variable is not good enough because it would break
in threaded environments.Flask provides you with a special object
that ensures it is only valid for the active request and that will
return different values for each request. In a nutshell: it does the
right thing, like it does for request and session.
You can see more about here.