According to SQLAlchemy documentation, engine and sessionmaker instances should be created in the application's global scope:
When do I make a sessionmaker? Just one time, somewhere in your application’s global scope. It should be looked upon as part of your application’s configuration. If your application has three .py files in a package, you could, for example, place the sessionmaker line in your __init__.py file; from that point on your other modules say “from mypackage import Session”. That way, everyone else just uses Session(), and the configuration of that session is controlled by that central point.
Questions:
What is the best practice for cleaning up SQLAlchemy engine and sessionmaker instances? Please refer to my example below: while I could call engine.dispose() in main.py, it does not seem like good practice to clean up a global object from a different module (database.py) in __main__ (main.py). Is there a better way to do it?
Do we need to clean up sessionmaker instances? There seems to be no method for closing a sessionmaker instance (Session.close_all() is deprecated, and close_all_sessions() is a module-level function in sqlalchemy.orm, not a method of sessionmaker).
Example:
I created the engine and sessionmaker object in a module called database.py:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from contextlib import contextmanager
# DB_CONNECTION_STRING is assumed to be defined elsewhere (e.g. application config)
DB_ENGINE = create_engine(DB_CONNECTION_STRING, pool_size=5, max_overflow=10)
DB_SESSION = sessionmaker(bind=DB_ENGINE, autocommit=False, autoflush=True, expire_on_commit=False)
@contextmanager
def db_session(db_session_factory):
    """Provide a transactional scope around a series of operations."""
    session = db_session_factory()
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()
In my main application main.py, I import the module and use the engine and sessionmaker instances as follows. I cleaned up the engine instance at the end of __main__.
from multiprocessing.pool import ThreadPool
from database import DB_ENGINE, DB_SESSION, db_session
def worker_func(data):
    with db_session(DB_SESSION) as session:
        [...database operations using session object...]

if __name__ == '__main__':
    try:
        data = [1, 2, 3, 4, 5]
        with ThreadPool(processes=5) as thread_pool:
            results = thread_pool.map(worker_func, data)
    finally:
        # Cleanup
        DB_ENGINE.dispose()
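One way to keep the cleanup next to the object that owns it is to give database.py its own teardown hook. A minimal sketch, assuming the database.py module above; the dispose_engine helper and the atexit registration are hypothetical additions, and the atexit call is just one possibility:

# database.py (continued) -- sketch of a module-owned cleanup hook
import atexit

def dispose_engine():
    """Release the global engine's pooled connections."""
    DB_ENGINE.dispose()

# Optionally run the cleanup automatically at interpreter shutdown,
# so main.py never has to touch DB_ENGINE directly.
atexit.register(dispose_engine)

main.py could then call database.dispose_engine() in its finally block (or rely on the atexit hook), so the module that creates the engine is also the one that disposes of it.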
I have a function that returns an SQLAlchemy engine object
create_db.py:
from sqlalchemy import create_engine
def init_db():
    return create_engine("sqlite://", echo=True, future=True)
and I have a test that attempts to use pytest's monkeypatch to mock the call to create_engine.
test_db.py:
import sqlalchemy
from create_db import init_db
def test_correct_db_path_selected(monkeypatch):
    def mocked_create_engine():
        return "test_connection_string"

    monkeypatch.setattr(sqlalchemy, "create_engine", mocked_create_engine)
    engine = init_db()
    assert engine == "test_connection_string"
When I run pytest, the test is failing as a real sqlalchemy engine object is getting returned, not the mocked string.
AssertionError: assert Engine(sqlite://) == 'test_connection_string'
I've tried the following calls to setattr but they all fail in the same manner:
monkeypatch.setattr("sqlalchemy.engine.create.create_engine", mocked_create_engine)
monkeypatch.setattr(sqlalchemy.engine.create, "create_engine", mocked_create_engine)
monkeypatch.setattr(sqlalchemy.engine, "create_engine", mocked_create_engine)
I've gotten the basic examples from the pytest docs to work, but they don't cover patching a function imported from a library. Does anyone have any suggestions on what I'm doing wrong?
So I've found a solution for my problem, but I'm still not clear on why the above code doesn't work.
If I change my create_db.py to directly call sqlalchemy.create_engine, the mocking function works.
create_db.py:
import sqlalchemy
def init_db():
    return sqlalchemy.create_engine("sqlite://")
test_db.py:
import sqlalchemy
from create_db import init_db
class MockEngine:
    def __init__(self, path):
        self.path = path

def test_correct_db_path_selected(monkeypatch):
    def mocked_create_engine(path):
        return MockEngine(path=path)

    monkeypatch.setattr(sqlalchemy, "create_engine", mocked_create_engine)
    engine = init_db()
    assert engine.path == "sqlite://"
I don't mind changing my code to make it more testable, but I'd still like to know if it's possible to mock the original create_engine call. I'll leave the question and answer up in case anyone else runs into the same problem.
Edit:
I found a solution that doesn't involve changing the code to be tested. The following call to setattr will mock a function call that isn't on an object:
monkeypatch.setattr(create_db, "create_engine", mocked_create_engine)
This works because create_db.py imports create_engine into its own namespace (from sqlalchemy import create_engine), so the name has to be patched where it is looked up, in the create_db module, rather than on sqlalchemy itself.
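For reference, a minimal sketch of the working test against the original create_db.py. Note that test_db.py must import the create_db module itself so there is an object to patch, and the mock accepts keyword arguments because init_db passes echo and future; the "engine for" string is just an illustrative sentinel:

import create_db
from create_db import init_db

def test_correct_db_path_selected(monkeypatch):
    def mocked_create_engine(path, **kwargs):
        # swallow echo/future so init_db's original call still works
        return "engine for " + path

    # patch the name where it is looked up: create_db's namespace
    monkeypatch.setattr(create_db, "create_engine", mocked_create_engine)
    assert init_db() == "engine for sqlite://"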
I would like to create a number of functions which start by calling one particular function, and end by calling another.
Each function would take a different number of arguments, but they would share the first and last line. Is this possible?
To give you an example, I am trying to use this to create a set of functions which can connect to my database via sqlalchemy, add an entry to it, and exit nicely:
from sqlalchemy import create_engine
from os import path
from common_classes import *
from sqlalchemy.orm import sessionmaker
def loadSession():
    db_path = "sqlite:///" + path.expanduser("~/animal_data.db")
    engine = create_engine(db_path, echo=False)
    Session = sessionmaker(bind=engine)
    session = Session()
    Base.metadata.create_all(engine)
    return session, engine
def add_animal(id_eth, cage_eth, sex, ear_punches, id_uzh="", cage_uzh=""):
    session, engine = loadSession()
    new_animal = Animal(id_eth=id_eth, cage_eth=cage_eth, sex=sex, ear_punches=ear_punches, id_uzh=id_uzh, cage_uzh=cage_uzh)
    session.add(new_animal)
    commit_and_close(session, engine)

def add_genotype(name, zygosity):
    session, engine = loadSession()
    new_genotype = Genotype(name=name, zygosity=zygosity)
    session.add(new_genotype)
    commit_and_close(session, engine)

def commit_and_close(session, engine):
    session.commit()
    session.close()
    engine.dispose()
Again, what I am trying to do is collapse add_animal() and add_genotype() (and prospectively many more functions) into a single constructor.
I have thought maybe I can use a class for this, and while I believe loadSession() could be called from __init__ I have no idea how to call the commit_and_close() function at the end - nor how to manage the variable number of arguments of every subclass...
Instead of having add_X functions for every type X, just create a single add function that adds an object which you create on the “outside” of the function:
So add_animal(params…) becomes add(Animal(params…)), and add_genotype(params…) becomes add(Genotype(params…)).
That way, your add function would just look like this:
def add(obj):
    session, engine = loadSession()
    session.add(obj)
    commit_and_close(session, engine)
Then it’s up to the caller of that function to create the object, which opens up the interface and allows you to get objects from elsewhere too. E.g. something like this would then be possible:
for animal in zoo.getAnimals():
    add(animal)
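If you would rather keep one function per type, a decorator can also factor out the shared first and last lines. A minimal sketch built on the loadSession() and commit_and_close() helpers above; the with_session name is hypothetical:

from functools import wraps

def with_session(func):
    """Run func with a fresh session, then commit and clean up."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        session, engine = loadSession()
        try:
            result = func(session, *args, **kwargs)
        except Exception:
            # skip the commit on failure, but still release resources
            session.close()
            engine.dispose()
            raise
        commit_and_close(session, engine)
        return result
    return wrapper

@with_session
def add_genotype(session, name, zygosity):
    session.add(Genotype(name=name, zygosity=zygosity))

Each decorated function only has to accept the session as its first argument; the rest of its argument list can vary freely.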
I'm using a lot of werkzeug.local.LocalProxy objects in my Flask app. They are supposed to be perfect stand-ins for objects, but they aren't really, since they don't respond to type() or isinstance() correctly.
SQLAlchemy doesn't like them at all. If I make a LocalProxy to a SQLAlchemy record, SQLAlchemy considers it to be None. If I pass it a LocalProxy to a simpler type, it just says it's the wrong type.
Here's an example of Flask-SQLAlchemy having a bad time with LocalProxy.
How do you guys deal with this problem? Just call _get_current_object() a lot? It'd be pretty cool if SQLAlchemy or Flask-SQLAlchemy could automatically handle these LocalProxy objects more gracefully, especially considering Flask-Login uses them, and pretty much everybody uses that, right?
I'm considering adding this function to my project to deal with it, and wrapping any of my localproxies in it before passing them to sqlalchemy:
from werkzeug.local import LocalProxy
def real(obj):
    """Unwrap a LocalProxy into the real object; pass anything else through."""
    if isinstance(obj, LocalProxy):
        return obj._get_current_object()
    return obj
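Usage would then be a matter of wrapping any proxy before it reaches a query. A sketch, assuming current_user from Flask-Login and a hypothetical Post model whose author is a relationship to the user model:

from flask_login import current_user

# current_user is a LocalProxy; unwrap it before SQLAlchemy sees it
posts = session.query(Post).filter(Post.author == real(current_user)).all()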
I patched the drivers used by SQLAlchemy, but I fear that it is not the most generic solution.
from flask_sqlalchemy import SQLAlchemy as FlaskSQLAlchemy
from sqlalchemy.engine import Engine
from werkzeug.local import LocalProxy
class SQLAlchemy(FlaskSQLAlchemy):
    """Implement or override extension methods."""

    def apply_driver_hacks(self, app, info, options):
        """Called before engine creation."""
        # Don't forget to apply hacks defined on the parent object.
        super(SQLAlchemy, self).apply_driver_hacks(app, info, options)
        if info.drivername == 'sqlite':
            from sqlite3 import register_adapter

            def adapt_proxy(proxy):
                """Get the current object and try to adapt it again."""
                return proxy._get_current_object()

            register_adapter(LocalProxy, adapt_proxy)
        elif info.drivername == 'postgresql+psycopg2':  # pragma: no cover
            from psycopg2.extensions import adapt, register_adapter

            def adapt_proxy(proxy):
                """Get the current object and try to adapt it again."""
                return adapt(proxy._get_current_object())

            register_adapter(LocalProxy, adapt_proxy)
        elif info.drivername == 'mysql+pymysql':  # pragma: no cover
            from pymysql import converters

            def escape_local_proxy(val, mapping):
                """Get the current object and try to escape it again."""
                return converters.escape_item(
                    val._get_current_object(),
                    self.engine.dialect.encoding,
                    mapping=mapping,
                )

            converters.encoders[LocalProxy] = escape_local_proxy
Original source can be found here.
I am trying to add an event listener to the before_commit event of an SQLAlchemy Session inside of a Flask application. When doing the following
def before_commit(session):
    for item in session:
        if hasattr(item, 'on_save'):
            item.on_save(session)

event.listen(db.session, 'before_commit', before_commit)
I get
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "app.py", line 60, in <module>
event.listen(db.session, 'before_commit', before_commit)
File "C:\Python27\lib\site-packages\sqlalchemy\event\api.py", line 49, in listen
_event_key(target, identifier, fn).listen(*args, **kw)
File "C:\Python27\lib\site-packages\sqlalchemy\event\api.py", line 22, in _event_key
tgt = evt_cls._accept_with(target)
File "C:\Python27\lib\site-packages\sqlalchemy\orm\events.py", line 1142, in _accept_with
"Session event listen on a scoped_session "
sqlalchemy.exc.ArgumentError: Session event listen on a scoped_session requires that its creation callable is associated with the Session class.
I can't find the correct way to register the event listener. The documentation actually states that event.listen() also accepts a scoped_session, but it seems like it does not?!
http://docs.sqlalchemy.org/en/latest/orm/events.html#sqlalchemy.orm.events.SessionEvents
The listen() function will accept Session objects as well as the return result of sessionmaker() and scoped_session(). Additionally, it accepts the Session class which will apply listeners to all Session instances globally.
It means that the factory you've passed to scoped_session() must be a sessionmaker():
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy import event
# good
ss1 = scoped_session(sessionmaker())

@event.listens_for(ss1, "before_flush")
def evt(*arg, **kw):
    pass

# bad
ss2 = scoped_session(lambda: Session)

@event.listens_for(ss2, "before_flush")
def evt(*arg, **kw):
    pass
To give another example, this codebase won't work:
https://sourceforge.net/p/turbogears1/code/HEAD/tree/branches/1.5/turbogears/database.py
# bad
def create_session():
    """Create a session that uses the engine from thread-local metadata.

    The session by default does not begin a transaction, and requires that
    flush() be called explicitly in order to persist results to the database.
    """
    if not metadata.is_bound():
        bind_metadata()
    return sqlalchemy.orm.create_session()

session = sqlalchemy.orm.scoped_session(create_session)
Instead it needs to be something like the following:
# good
class SessionMakerAndBind(sqlalchemy.orm.sessionmaker):
    def __call__(self, **kw):
        if not metadata.is_bound():
            bind_metadata()
        return super(SessionMakerAndBind, self).__call__(**kw)

sessionmaker = SessionMakerAndBind(autoflush=False,
                                   autocommit=True, expire_on_commit=False)
session = sqlalchemy.orm.scoped_session(sessionmaker)
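If changing the session factory isn't an option, the docs quoted above also allow listening on the Session class itself, which applies the listener to all Session instances globally, including those produced by a scoped_session's factory. A sketch of that fallback, reusing the before_commit handler from the question:

from sqlalchemy import event
from sqlalchemy.orm import Session

def before_commit(session):
    for item in session:
        if hasattr(item, 'on_save'):
            item.on_save(session)

# Class-level registration: fires for every Session instance.
event.listen(Session, 'before_commit', before_commit)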
In my application I'm using SQLAlchemy for storing most persistent data across app restarts. For this I have a db package containing my mapper classes (like Tag, Group, etc.) and a support class creating a single engine instance using create_engine and a single, global Session factory using sessionmaker.
Now my understanding of how to use SQLAlchemy's sessions is that I don't pass them around in my app but rather create instances using the global factory whenever I need database access.
This leads to situations where a record is queried in one session and then passed on to another part of the app, which uses a different session instance. This gives me exceptions like this one:
Traceback (most recent call last):
File "…", line 29, in delete
session.delete(self.record)
File "/usr/lib/python3.3/site-packages/sqlalchemy/orm/session.py", line 1444, in delete
self._attach(state, include_before=True)
File "/usr/lib/python3.3/site-packages/sqlalchemy/orm/session.py", line 1748, in _attach
state.session_id, self.hash_key))
sqlalchemy.exc.InvalidRequestError: Object '<Group at 0x7fb64c7b3f90>' is already attached to session '1' (this is '3')
Now my question is: did I get the usage of Session completely wrong (so I should use only one session at a time and pass that session around to other components together with records from the database), or could this result from an actual code issue?
Some example code demonstrating my exact problem:
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base, declared_attr
Base = declarative_base()
class Record(Base):
    __tablename__ = "record"

    id = Column(Integer, primary_key=True)
    name = Column(String)

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<%s('%s')>" % (type(self).__name__, self.name)
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
s1 = Session()
record = Record("foobar")
s1.add(record)
s1.commit()
# This would be a completely different part of the app
s2 = Session()
record = s2.query(Record).filter(Record.name == "foobar").first()
def delete_record(record):
    session = Session()
    session.delete(record)
    session.commit()
delete_record(record)
For now I switched over to using a single, global session instance. That's neither nice nor clean in my opinion, but including lots and lots of boilerplate code to expunge objects from one session just to add them back to their original session after handing them over to some other application part was not a realistic option either.
I suppose this will completely blow up if I start using multiple threads to access the database via the very same session…
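One middle ground, rather than a single global session, is to look up the session an object is already attached to instead of creating a new one. A minimal sketch against the example above, using sqlalchemy.orm.object_session, which returns the owning Session of a loaded object (or None if it is detached):

from sqlalchemy.orm import object_session

def delete_record(record):
    # Reuse the session that loaded the record, falling back to a new one.
    session = object_session(record) or Session()
    session.delete(record)
    session.commit()

delete_record(record)

This keeps the "create sessions from the global factory" style for new work while avoiding the "already attached to session" error when a record crosses component boundaries.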