How to initialize a PostgreSQL database in a Flask app with SQLAlchemy

The Flask tutorial (and many other tutorials out there) suggests that the engine, the db_session and the Base (the class returned by declarative_base()) are all created at import time.
This creates some problems, one being that the URI of the DB is hardcoded in the code and evaluated only once.
One solution is to wrap these calls in functions that accept the app as a parameter, which is what I've done. Mind you - each call caches the result in app.config:
def get_engine(app):
    """Return the engine connected to the database URI in the config file.

    Store it in the config for later use.
    """
    engine = app.config.setdefault(
        'DB_ENGINE', create_engine(app.config['DATABASE_URI'](), echo=True))
    return engine
def get_session(app):
    """Return the DB session for the database in use.

    Store it in the config for later use.
    """
    engine = get_engine(app)
    db_session = app.config.setdefault(
        'DB_SESSION', scoped_session(sessionmaker(
            autocommit=False, autoflush=False, bind=engine)))
    return db_session
def get_base(app):
    """Return the declarative base to use in DB models.

    Store it in the config for later use.
    """
    Base = app.config.setdefault('DB_BASE', declarative_base())
    Base.query = get_session(app).query_property()
    return Base
In init_db, I call all those functions, but there's still a code smell:
def init_db(app):
    """Initialise the database."""
    create_db(app)
    engine = get_engine(app)
    db_session = get_session(app)
    base = get_base(app)
    if not app.config['TESTING']:
        import flaskr.models
    else:
        if 'flaskr.models' not in sys.modules:
            import flaskr.models
        else:
            import flaskr.models
            importlib.reload(flaskr.models)
    base.metadata.create_all(bind=engine)
The smell is of course the hoops I have to go through to import and create all models.
The reason for the code above is that, when unit testing, init_db is called once for each test (in setup(), as suggested in the same tutorial), but the import will only be performed the first time, and create_all will therefore work only that time.
Not only that: with a session now shared for the whole lifetime of the app, I have problems in parametrized negative unit tests (that is, parametrized unit tests that expect some sort of failure). The first instance of the test triggers the failure (e.g. a login failure, see test_login_validate_input in the tutorial) and exits correctly, while all subsequent instances bail out early because the db_session should have been rolled back first. Clearly something is wrong with the DB initialization.
What is the Right Way(TM) to initialize the database?

I eventually decided to refactor the app so that it uses Flask-SQLAlchemy.
In short, the app now does something like this:
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

def create_app():
    app = Flask(__name__)
    db.init_app(app)
    # ...
With the benefit of hindsight, it's definitely a cleaner approach.
What put me off at the start was this entry from the tutorial (bold is mine):
Because SQLAlchemy is a common database abstraction layer and object relational mapper that requires a little bit of configuration effort, there is a Flask extension that handles that for you. This is recommended if you want to get started quickly.
Which I somehow read as "Using the Flask-SQLAlchemy extension will allow you to cut some corners, which you'll probably end up paying for later".
It's very early stages, but so far no price to pay in terms of flexibility for using said extension.
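For completeness, a minimal sketch of per-test setup with this pattern, assuming the factory accepts a config override mapping (a common convention not shown in the snippet above; the fixture names are illustrative):

import pytest

from flaskr import create_app, db  # the factory module sketched above

@pytest.fixture
def app():
    app = create_app({'TESTING': True,
                      'SQLALCHEMY_DATABASE_URI': 'sqlite://'})
    with app.app_context():
        db.create_all()      # fresh tables for every test
        yield app
        db.session.remove()  # discard the scoped session between tests
        db.drop_all()        # leave nothing behind for the next test

This addresses the per-test init_db problem from the question, since create_all/drop_all run against a clean in-memory database each time.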

Related

Should you always only have one session open at a time?

I created a session factory using python-flask-sqlalchemy and I use it to create sessions for reading and writing; however, I did not use session.close() or a with statement to close the sessions (I thought they would close automatically when there were no more references to them).
There was a bug somewhere and I accidentally gained some insight into how my backend works:
Object <ClassName at memory location> is already attached to session '134' (this is '135')
134 sounds like an awfully huge number of sessions. If I understand this correctly, the previous sessions were not closed and they are consuming extra memory?
edit: the error was due to the fact that I was trying to initiate 2 different sessions attached to the same class (as I want to write to one db and read from a different one). But I accidentally found out that I have 134 sessions going on (if I understand it correctly)!
edit #2: alternatively, should I just stop using a session factory and stick to the officially endorsed multiple binds method?
Here's the code:
import sqlalchemy
from sqlalchemy.orm import Query, Session, scoped_session, sessionmaker

from db import db
from config import SQL_SERVER_STRING

class ReadFactory:
    SQL_SERVER_STRING = SQL_SERVER_STRING

    def __init__(self):
        engine = db.create_engine(self.SQL_SERVER_STRING, {})
        Session = sessionmaker()
        Session.configure(bind=engine)
        self.readSession = Session()

    def read(self):
        return self.readSession
And then I use it inside a model class:

@classmethod
def get_from_db(cls, **kwargs):
    readSession = ReadFactory()
    return readSession.read().query(cls).filter_by(**kwargs).all()
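The usual remedy for a leak like this is to scope each session to a single operation and close it deterministically. A sketch using contextlib.closing, which calls Session.close() even if the query raises (note that ReadFactory also builds a new engine per call, which has its own cost, so a real fix would share one engine as well):

from contextlib import closing

@classmethod
def get_from_db(cls, **kwargs):
    # close() runs on exit, returning the connection to the pool
    # instead of accumulating open sessions
    with closing(ReadFactory().read()) as session:
        return session.query(cls).filter_by(**kwargs).all()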

Python flask-sqlalchemy: Do I have to commit session after a query?

I am writing an app in Python with flask-sqlalchemy and a MySQL DB (https://flask-sqlalchemy.palletsprojects.com/en/2.x/) and I am wondering if I have to call db.session.commit() or db.session.rollback() after a GET call which only queries the DB.
For example:
@app.route('/getOrders')
def getOrders():
    orders = Order.query.all()
    # Do I have to put here "db.session.commit()" or "db.session.rollback()"?
    return {'orders': [order.serialize() for order in orders]}
orders = Order.query.all() is a SELECT query that could be extended with additional filters (WHERE etc.). It doesn't alter the database; it simply reads values from it. You don't need to commit on a read for precisely that reason: what would you store, other than "I just read this data"? Databases address that concern in other ways, i.e. access permissions and logs.
Given the above, rollback doesn't make any sense because there are no changes to actually roll back.
Flask-SQLAlchemy does a bit of magic around sessions amongst other things. It's roughly equivalent to:
from sqlalchemy.orm import scoped_session, sessionmaker
Session = sessionmaker(bind=engine, autocommit=False, autoflush=False)
db_session = scoped_session(Session)
Followed by a method to close sessions:
def init_db(app):
    app.teardown_appcontext(teardown_session)

def teardown_session(exception=None):
    db_session.remove()
Bottom line: no, you don't have to worry about commit or rollback here, even in plain SQL, and the session management (a completely separate concern) is handled by Flask-SQLAlchemy.
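For contrast, anything that modifies rows does need an explicit commit. A minimal sketch, assuming the same Order model (the constructor argument is hypothetical):

@app.route('/orders', methods=['POST'])
def create_order():
    order = Order(status='new')  # 'status' is an illustrative column
    db.session.add(order)
    db.session.commit()          # the INSERT is only persisted here
    return {'id': order.id}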

Dynamically linking to databases using flask-sqlalchemy

I'm building an API that creates users, and each user gets a new database. I'd like to link flask-sqlalchemy to that newly created database. Is there an easy way to do that?
I also want to make it clear that each user gets their own database to meet regulation, so rearranging the schema so that everything lives in one database is not an option.
I'm thinking of going with vanilla sqlalchemy as described here: https://dev.to/nestedsoftware/flask-and-sqlalchemy-without-the-flask-sqlalchemy-extension-3cf8 but I want to make sure I have no other options.
Without posting the entire code, the key concepts I use with Flask-SQLAlchemy to get dynamic databases are:
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()
Have a global database registry
DATABASE_REGISTRY = {}
SESSION_MAKER = db.sessionmaker()
Create the database engine and add it to the registry (I have this in another helper function):
engine = db.create_engine(db_uri)
DATABASE_REGISTRY['username'] = { 'engine': engine}
Then to get the user db dynamically, have a helper function
def get_db_session(username):
    db_session = SESSION_MAKER(bind=DATABASE_REGISTRY.get(username)['engine'])
    return db_session
Then you can use:
db_session = get_db_session('user_bob')
bob_query = db_session.query(ATable).all()
And don't forget to close the session
db_session.close()
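On SQLAlchemy 1.4 and later, Session objects are also context managers, so the close can be made automatic (a sketch):

# close() runs on exit, even if the query raises
with get_db_session('user_bob') as db_session:
    bob_query = db_session.query(ATable).all()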

Using python mocking library on sqlalchemy

I'm using sqlalchemy to query my databases for a project. At the same time, I'm new to unit testing and I'm trying to learn how to write unit tests for my database code. I tried to use the mocking library, but so far it has been very difficult.
So I made a piece of code which creates a Session object. This object is used to connect to the database.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.exc import OperationalError, ArgumentError

test_db_string = 'postgresql+psycopg2://testdb:hello@localhost/' \
                 'test_databasetable'

def get_session(database_connection_string):
    try:
        Base = declarative_base()
        engine = create_engine(database_connection_string)
        Base.metadata.bind = engine
        DBSession = sessionmaker(bind=engine)
        session = DBSession()
        connection = session.connection()
        return session
    except OperationalError:
        return None
    except ArgumentError:
        return None
So I made a unit test case for this function:
import mock
import unittest
from mock import sentinel

from app.Utilities.util import get_session

class TestUtilMock(unittest.TestCase):

    @mock.patch('app.Utilities.util.create_engine')  # mention the whole path
    @mock.patch('app.Utilities.util.sessionmaker')
    @mock.patch('app.Utilities.util.declarative_base')
    def test_get_session1(self, mock_declarative_base, mock_sessionmaker,
                          mock_create_engine):
        mock_create_engine.return_value = sentinel.engine
        get_session('any_path')
        mock_declarative_base.called
        mock_create_engine.assert_called_once_with('any_path')
        mock_sessionmaker.assert_called_once_with(bind=sentinel.engine)
As you can see in my unit test, I cannot test the code in get_session() starting from the line session = DBSession(). So far, after googling, I cannot find out whether the value returned by a mock can also be used to mock further calls - something like mocking the session object and verifying that I called session.connection().
Is the above method of writing unit test case the right way? Is there a better method to do this?
First of all, you can also cover the session = DBSession() line:
self.assertEqual(get_session('any_path'), mock_sessionmaker.return_value.return_value)
Moreover, mock_declarative_base.called is not an assertion and cannot fail. Replace it with:
self.assertTrue(mock_declarative_base.called)
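As for verifying session.connection(): yes, the attributes of a mock's return value are themselves mocks, so the call can be asserted on directly, e.g.:

# mock_sessionmaker.return_value is DBSession; calling it yields the session
mock_session = mock_sessionmaker.return_value.return_value
mock_session.connection.assert_called_once_with()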
General considerations
Writing tests like these can be very time consuming and makes your production code tightly coupled to the test code. It is better to write your own wrapper that presents a comfortable interface to your business code, and test that wrapper with some mocks and real (simple) tests: you should trust the sqlalchemy implementation and just test how your wrapper calls it.
Afterwards you can either replace your wrapper with a fake object that you control, or use mocks to check how your code calls it; either way the wrapper presents just a small business interface, so mocking it should be very simple.
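A sketch of what such a wrapper might look like (all names here are illustrative, not from the question):

class DbGateway:
    """Thin interface the business code depends on instead of raw SQLAlchemy."""

    def __init__(self, session_factory):
        self._session_factory = session_factory

    def find_all(self, model, **filters):
        session = self._session_factory()
        try:
            return session.query(model).filter_by(**filters).all()
        finally:
            session.close()

class FakeDbGateway:
    """Drop-in test double: no database and no mock wiring needed."""

    def __init__(self, rows):
        self._rows = rows

    def find_all(self, model, **filters):
        return list(self._rows)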

Is it OK to execute code when a module imports?

I'm designing a small GUI application to wrap an sqlite DB (simple CRUD operations). I have created three sqlalchemy models (m_person.py, m_card.py, m_loan.py, all in a /models folder) and had previously had the following code at the top of each one:
from sqlalchemy import Table, Column, create_engine
from sqlalchemy import Integer, ForeignKey, String, Unicode
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import backref, relation
engine = create_engine("sqlite:///devdata.db", echo=True)
declarative_base = declarative_base(engine)
metadata = declarative_base.metadata
This felt a bit wrong (DRY), so it was suggested that I move all this stuff out to the package level (into models/__init__.py):
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy import Table, Column, Boolean, Unicode
from settings import setup
engine = create_engine('sqlite:///' + setup.get_db_path(), echo=False)
declarative_base = declarative_base(engine)
metadata = declarative_base.metadata
session = sessionmaker()
session = session()
...and import declarative_base like so:
from sqlalchemy import Table, Column, Unicode

from models import declarative_base

class Person(declarative_base):
    """
    Person model
    """
    __tablename__ = "people"

    id = Column(Unicode(50), primary_key=True)
    fname = Column(Unicode(50))
    sname = Column(Unicode(50))
However, I've had a lot of feedback that executing code at import time like this is bad. I'm looking for a definitive answer on the right way to do it, since by trying to remove code repetition I seem to have introduced some other bad practices. Any feedback would be really useful.
(Below is the get_db_path() method from settings/setup.py for completeness, as it is called in the models/__init__.py code above.)
def get_db_path():
    import sys
    from os import makedirs
    from os.path import join, dirname, exists
    from constants import DB_FILE, DB_FOLDER

    if len(sys.argv) > 1:
        db_path = sys.argv[1]
    else:
        # Check if application is running from exe or .py and adjust db path accordingly
        if getattr(sys, 'frozen', False):
            application_path = dirname(sys.executable)
            db_path = join(application_path, DB_FOLDER, DB_FILE)
        elif __file__:
            application_path = dirname(__file__)
            db_path = join(application_path, '..', DB_FOLDER, DB_FILE)

    # Check db path exists and create it if not
    def ensure_directory(db_path):
        dir = dirname(db_path)
        if not exists(dir):
            makedirs(dir)

    ensure_directory(db_path)
    return db_path
Some popular frameworks (Twisted is one example) perform a good deal of initialization logic at import-time. The benefits of being able to build module contents dynamically do come at a price, one of them being that IDEs cannot always decide what is "in" the module or not.
In your specific case, you may want to refactor so that the particular engine is not supplied at import-time but later. You can create the metadata and your declarative_base class at import-time. Then during start-time, after all the classes are defined, you call create_engine and bind the result to your sqlalchemy.orm.sessionmaker. But if your needs are simple, you may not even need to go that far.
In general, I would say this is not Java or C. There is no reason to fear doing things at the module level other than define functions, classes and constants. Your classes are created when the application starts anyway, one after the other. A little monkey-patching of classes (in the same module!), or creating one or two global lookup tables, is OK in my opinion if it simplifies your implementation.
What I would definitely avoid is any code in your module that causes the order of imports to matter for your users (other than the normal way of simply providing logic to be used), or that modifies the behavior of code outside your module. Then your module becomes black magic, which is accepted in the Perl world (à la use strict;), but I find it is not "pythonic".
For example, if your module modifies properties of sys.stdout when imported, I would argue that behavior should instead be moved into a function that the user can either call or not.
In principle there is nothing wrong with executing Python code when a module is imported, in fact every single Python module works that way. Defining module members is executing code, after all.
However, in your particular use case I would strongly advise against making a singleton database session object in your code base. You'll be losing out on the ability to do many things, for example unit test your model against an in-memory SQLite or other kind of database engine.
Take a look at the documentation for declarative_base and note how the examples are able to create the model with a declarative_base supplied class that's not yet bound to a database engine. Ideally you want to do that and then have some kind of connection function or class that will manage creating a session and then bind it to base.
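A sketch of that pattern, with the engine supplied only once the application is ready (the model and URI are illustrative):

from sqlalchemy import create_engine, Column, Integer, Unicode
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()  # importable at module level; no engine involved yet

class Person(Base):
    __tablename__ = 'people'

    id = Column(Integer, primary_key=True)
    name = Column(Unicode(50))

def connect(db_uri):
    """Bind an engine and return a session factory when the app is ready."""
    engine = create_engine(db_uri)  # e.g. 'sqlite:///:memory:' in unit tests
    Base.metadata.create_all(engine)
    return sessionmaker(bind=engine)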
There is nothing wrong with executing code at import time.
The guideline is executing enough so that your module is usable, but not so much that importing it is unnecessarily slow, and not so much that you are unnecessarily limiting what can be done with it.
Typically this means defining classes, functions, and global names (actually module level -- bit of a misnomer there), as well as importing anything your classes, functions, etc., need to operate.
This does not usually involve making connections to databases, websites, or other external, dynamic resources, but rather supplying a function to establish those connections when the module user is ready to do so.
I ran into this as well, and created a database.py file with a database manager class, of which I then created a single global object. That way the class can read settings from my settings.py file to configure the database, the first class that needs a base object (or session / engine) initializes the global object, and after that everyone just re-uses it. I feel much more comfortable having from myproject.database import DM at the top of each class using the SQLAlchemy ORM, where DM is my global database object, and then calling DM.getBase() to get the base object.
Here is my database.py class:
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Session = scoped_session(sessionmaker(autoflush=True))

class DatabaseManager:
    """
    The top level database manager used by all the SQLAlchemy classes to fetch their session / declarative base.
    """
    engine = None
    base = None

    def ready(self):
        """Determines if the SQLAlchemy engine and base have been set, and if not, initializes them."""
        host = '<database connection details>'
        if self.engine and self.base:
            return True
        try:
            self.engine = create_engine(host, pool_recycle=3600)
            self.base = declarative_base(bind=self.engine)
            return True
        except Exception:
            return False

    def getSession(self):
        """Returns the active SQLAlchemy session."""
        if self.ready():
            # configure() belongs to the scoped_session factory,
            # not to the Session instance it produces
            Session.configure(bind=self.engine)
            return Session()
        return None

    def getBase(self):
        """Returns the active SQLAlchemy base."""
        if self.ready():
            return self.base
        return None

    def getEngine(self):
        """Returns the active SQLAlchemy engine."""
        if self.ready():
            return self.engine
        return None

DM = DatabaseManager()
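Usage then looks like the description above (a sketch; the Person model is illustrative):

from sqlalchemy import Column, Unicode

from myproject.database import DM

Base = DM.getBase()

class Person(Base):
    __tablename__ = 'people'

    id = Column(Unicode(50), primary_key=True)

session = DM.getSession()
people = session.query(Person).all()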
