How to Generate Fixtures from Database with SqlAlchemy

I'm starting to write tests with Flask-SQLAlchemy, and I'd like to add some fixtures for those. I have plenty of good data for that in my development database and a lot of tables so writing data manually would get annoying. I'd really like to just sample data from the dev database into fixtures and then use those. What's a good way to do this?

I would use factory_boy. To create a model factory:
    import factory
    from . import models

    class UserFactory(factory.Factory):
        class Meta:
            model = models.User

        first_name = 'John'
        last_name = 'Doe'
        admin = False
Then, to create instances:
    UserFactory.create()
To override a default, pass the field as a keyword argument to create:
    UserFactory.create(first_name='hank')
To seed many rows, call create in a for loop.

If you need to handle fixtures with SQLAlchemy or another ORM/backend then the Fixture package may be of use: Flask-Fixtures 0.3.3
That is a simple library that allows you to add database fixtures for your unit tests using nothing but JSON or YAML.
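For the original goal of sampling existing rows into such a JSON file, here is a minimal sketch using only the standard library's sqlite3 and json modules. The in-memory database stands in for the development database, and the users table, its columns, the dump_table helper and the output layout (a list of objects with "table" and "records" keys) are all assumptions for illustration. Check the fixture library's documentation for the exact layout it expects.

```python
import json
import sqlite3


def dump_table(conn, table, limit=5):
    """Return up to `limit` rows of `table` as a list of plain dicts."""
    conn.row_factory = sqlite3.Row
    # Table names cannot be bound as parameters, so only pass trusted names.
    rows = conn.execute(f"SELECT * FROM {table} LIMIT ?", (limit,)).fetchall()
    return [dict(row) for row in rows]


# Stand-in for the development database (hypothetical schema and data).
dev_db = sqlite3.connect(":memory:")
dev_db.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, admin INTEGER)"
)
dev_db.executemany(
    "INSERT INTO users (first_name, admin) VALUES (?, ?)",
    [("John", 0), ("Hank", 1), ("Ada", 0)],
)
dev_db.commit()

# Assumed fixture layout: one entry per table, each with its sampled records.
fixture = [{"table": "users", "records": dump_table(dev_db, "users", limit=2)}]
fixture_json = json.dumps(fixture, indent=2)
print(fixture_json)
```

With a real development database you would open that database instead of ":memory:" and write fixture_json to a file inside your test directory.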

While Kyle's answer is correct, we still need to provide the model factory with a database session, otherwise we would never actually commit to the db. Also, factory boy has a dedicated class SQLAlchemyModelFactory for interacting with SQLAlchemy.
https://factoryboy.readthedocs.io/en/stable/orms.html#sqlalchemy
The whole setup could look something like this:
    import os

    import pytest
    from factory.alchemy import SQLAlchemyModelFactory
    from sqlalchemy import create_engine, text
    from sqlalchemy.orm import scoped_session, sessionmaker

    from . import models  # assumed to define Base and User

    engine = create_engine(os.getenv("SQLALCHEMY_DATABASE_URI"))
    SessionLocal = sessionmaker(autoflush=False, bind=engine)

    # this resets our tables in between each test
    def _reset_schema():
        db = SessionLocal()
        for table in models.Base.metadata.sorted_tables:
            # TRUNCATE ... RESTART IDENTITY CASCADE is PostgreSQL syntax
            db.execute(
                text('TRUNCATE {name} RESTART IDENTITY CASCADE;'.format(name=table.name))
            )
        db.commit()
        db.close()

    @pytest.fixture
    def test_db():
        yield engine
        engine.dispose()
        _reset_schema()

    @pytest.fixture
    def session(test_db):
        connection = test_db.connect()
        transaction = connection.begin()
        # bind to the connection so the rollback below covers the session's work
        db = scoped_session(sessionmaker(bind=connection))
        try:
            yield db
        finally:
            db.close()
            transaction.rollback()
            connection.close()
            db.remove()

    class UserFactory(SQLAlchemyModelFactory):
        class Meta:
            model = models.User

        first_name = 'John'
        last_name = 'Doe'
        admin = False

    @pytest.fixture(autouse=True)
    def provide_session_to_factories(session):
        # usually you'd have one factory for each db table; list them all here
        for factory in [UserFactory]:
            factory._meta.sqlalchemy_session = session

Related

SQLAlchemy doesn't correctly create in-memory database

While building an API with FastAPI and SQLAlchemy, I'm seeing strange behaviour when the database (SQLite) is in-memory that doesn't occur when it is stored as a file.
Model:
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import Column, Integer, String

    Base = declarative_base()

    class Thing(Base):
        __tablename__ = "thing"
        id = Column(Integer, primary_key=True, autoincrement=True)
        name = Column(String)
I create two global engine objects, one with the database as a file and the other as an in-memory database:
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    args = dict(echo=True, connect_args={"check_same_thread": False})
    engine1 = create_engine("sqlite:///db.sqlite", **args)
    engine2 = create_engine("sqlite:///:memory:", **args)
    Session1 = sessionmaker(bind=engine1)
    Session2 = sessionmaker(bind=engine2)
I create my FastAPI app and a path to add an object to the database:
    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/{x}")
    def foo(x: int):
        with {1: Session1, 2: Session2}[x]() as session:
            session.add(Thing(name="foo"))
            session.commit()
My main, to simulate requests and check that everything is working:
    from fastapi.testclient import TestClient

    if __name__ == "__main__":
        Base.metadata.create_all(engine1)
        Base.metadata.create_all(engine2)
        client = TestClient(app)
        assert client.get("/1").status_code == 200
        assert client.get("/2").status_code == 200
The thing table is created in engine1 and committed, and the same happens with engine2. On the first request, "foo" is successfully inserted into engine1's database (stored as a file), but the second request raises "sqlite3.OperationalError", claiming "no such table: thing".
Why is there different behaviour between the two? Why does the in-memory database claim the table doesn't exist even though the SQLAlchemy logs show that the create table statement ran successfully and was committed?

The docs explain this in the section "Using a Memory Database in Multiple Threads": https://docs.sqlalchemy.org/en/14/dialects/sqlite.html#using-a-memory-database-in-multiple-threads
"To use a :memory: database in a multithreaded scenario, the same connection object must be shared among threads, since the database exists only within the scope of that connection. The StaticPool implementation will maintain a single connection globally, and the check_same_thread flag can be passed to Pysqlite as False"
It also shows how to get the intended behavior, so in your case:
    from sqlalchemy.pool import StaticPool

    args = dict(echo=True, connect_args={"check_same_thread": False}, poolclass=StaticPool)
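The statement that an in-memory database "exists only within the scope of that connection" can be checked with the standard library's sqlite3 module alone. The sketch below is an illustration of that SQLite behaviour, not of SQLAlchemy's pooling; the table_names helper is made up for the example.

```python
import sqlite3

# Each connection to ":memory:" opens its own private, empty database.
conn_a = sqlite3.connect(":memory:")
conn_b = sqlite3.connect(":memory:")

conn_a.execute("CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT)")
conn_a.execute("INSERT INTO thing (name) VALUES ('foo')")
conn_a.commit()


def table_names(conn):
    """List the tables visible through this connection."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [name for (name,) in rows]


tables_a = table_names(conn_a)  # the connection that created the table
tables_b = table_names(conn_b)  # a second connection sees nothing
print(tables_a, tables_b)

try:
    conn_b.execute("INSERT INTO thing (name) VALUES ('bar')")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)
print(error)
```

A connection pool that hands out a different connection for the request than the one used by create_all runs into the same situation as conn_b here, which is why a single shared connection is needed.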

SQLAlchemy: automap_base in a forking code

I am developing an API server that interacts with a MySQL DB by reflecting its schema, and that is also forked into multiple processes. My code for DB work looks like this:
    from sqlalchemy import MetaData
    from sqlalchemy.ext.automap import automap_base
    from sqlalchemy.orm.session import Session

    my_engine = create_engine_by_info(my_config)
    metadata = MetaData(bind=my_engine)
    Base: type = automap_base(metadata=metadata)

    class User(Base):
        __tablename__ = 'auth_user'
        # Relation descriptions...

    # Other classes...
    Base.prepare(my_engine, reflect=True)

    def find_user(field):
        with Session(my_engine) as session:
            query = session.query(User)
            query = query.filter(User.field == field)
            records = query.all()
            for u in records:
                return u
            return None
It works fine until the process gets forked: after the child process has done its work, the original one loses its connection: Lost connection to MySQL server during query.
I guess I should keep a separate my_engine for each process (e.g. some function with a dict of engines where the key is a PID), but how can I do that if my class definitions require an engine at the beginning? I could probably move the classes into a function too, but that would be hell... So, what is a good solution here?

How use pytest to unit test sqlalchemy orm classes

I want to write some py.test code to test 2 simple sqlalchemy ORM classes that were created based on this Tutorial. The problem is, how do I point py.test at a test database and roll back all changes when the tests are done? Is it possible to mock the database and run tests without actually connecting to the database?
Here is the code for my classes:
    from sqlalchemy import create_engine, ForeignKey
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import Column, Integer, String
    from sqlalchemy.orm import sessionmaker, relationship

    eng = create_engine('mssql+pymssql://user:pass@host/my_database')
    Base = declarative_base(eng)
    Session = sessionmaker(eng)
    intern_session = Session()

    class Author(Base):
        __tablename__ = "Authors"
        AuthorId = Column(Integer, primary_key=True)
        Name = Column(String)
        Books = relationship("Book")

        def add_book(self, title):
            b = Book(Title=title, AuthorId=self.AuthorId)
            intern_session.add(b)
            intern_session.commit()

    class Book(Base):
        __tablename__ = "Books"
        BookId = Column(Integer, primary_key=True)
        Title = Column(String)
        AuthorId = Column(Integer, ForeignKey("Authors.AuthorId"))
        Author = relationship("Author")

I usually do it this way.
I do not instantiate the engine and session alongside the model declarations. Instead, I only declare a Base with no bind:
    Base = declarative_base()
and I create a session only when needed:
    engine = create_engine('<the db url>')
    Session = sessionmaker(bind=engine)
    db_session = Session()
You can do the same by giving your add_book method a session parameter instead of using intern_session:
    def add_book(self, session, title):
        b = Book(Title=title, AuthorId=self.AuthorId)
        session.add(b)
        session.commit()
This makes your code more testable, since you can now pass the session of your choice when you call the method.
You are also no longer stuck with a session bound to a hardcoded database URL.
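Once add_book takes the session as a parameter, it can also be exercised without any database, which touches on the "mock the database" part of the question. The following is a sketch under assumptions: Author and Book are plain dataclass stand-ins for the ORM classes (hypothetical, so the example runs without SQLAlchemy), and the session is a MagicMock from the standard library.

```python
from dataclasses import dataclass
from unittest.mock import MagicMock


@dataclass
class Book:
    """Plain stand-in for the ORM Book class (hypothetical, no database)."""
    Title: str
    AuthorId: int


@dataclass
class Author:
    """Plain stand-in for the ORM Author class (hypothetical, no database)."""
    AuthorId: int
    Name: str

    def add_book(self, session, title):
        b = Book(Title=title, AuthorId=self.AuthorId)
        session.add(b)
        session.commit()


def test_add_book_uses_given_session():
    fake_session = MagicMock()  # records calls instead of touching a database
    Author(AuthorId=1, Name="Ann").add_book(fake_session, "Dune")

    # The method must add exactly the expected Book and commit once.
    fake_session.add.assert_called_once_with(Book(Title="Dune", AuthorId=1))
    fake_session.commit.assert_called_once_with()


test_add_book_uses_given_session()
```

A mock like this only verifies which session methods were called. It says nothing about whether the real mapping or SQL works, so tests against a real test database remain useful.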
I add a custom --dburl option to pytest using its pytest_addoption hook.
Simply add this to your top-level conftest.py:
    def pytest_addoption(parser):
        parser.addoption('--dburl',
                         action='store',
                         default='<if needed, whatever you want>',
                         help='url of the database to use for tests')
Now you can run pytest --dburl <url of the test database>
Then I can retrieve the dburl option from the request fixture.
From a custom fixture:
    @pytest.fixture()
    def db_url(request):
        return request.config.getoption("--dburl")
        # ...
Inside a test:
    def test_something(request):
        db_url = request.config.getoption("--dburl")
        # ...
At this point you are able to:
- get the test db_url in any test or fixture
- use it to create an engine
- create a session bound to the engine
- pass the session to a tested method
Doing this in every test gets messy, so you can put pytest fixtures to good use to ease the process.
Below are some fixtures I use:
    import pytest
    from sqlalchemy import create_engine
    from sqlalchemy.orm import scoped_session, sessionmaker

    @pytest.fixture(scope='session')
    def db_engine(request):
        """yields a SQLAlchemy engine which is disposed of after the test session"""
        db_url = request.config.getoption("--dburl")
        engine_ = create_engine(db_url, echo=True)
        yield engine_
        engine_.dispose()

    @pytest.fixture(scope='session')
    def db_session_factory(db_engine):
        """returns a SQLAlchemy scoped session factory"""
        return scoped_session(sessionmaker(bind=db_engine))

    @pytest.fixture(scope='function')
    def db_session(db_session_factory):
        """yields a SQLAlchemy session which is rolled back after the test"""
        session_ = db_session_factory()
        yield session_
        session_.rollback()
        session_.close()
Using the db_session fixture you get a fresh and clean db_session for each single test.
When the test ends, the db_session is rolled back, keeping the database clean.
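The effect of rolling back after each test can be seen in isolation with the standard library's sqlite3 module. This is only an analogy for what the db_session fixture does: the books table and the run_isolated and fake_test helpers are invented for the example, and no pytest or SQLAlchemy is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT)")
conn.commit()


def run_isolated(conn, test_body):
    """Run test_body(conn), then roll back whatever it wrote."""
    try:
        test_body(conn)
    finally:
        conn.rollback()


def fake_test(conn):
    conn.execute("INSERT INTO books (title) VALUES ('Dune')")
    # Inside the test the uncommitted row is visible on this connection.
    assert conn.execute("SELECT COUNT(*) FROM books").fetchone()[0] == 1


run_isolated(conn, fake_test)
remaining = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
print(remaining)
```

Note that a rollback can only undo work that has not been committed. Code under test that calls commit itself, as add_book does, will leave its rows behind unless the session is set up inside an outer transaction or the tables are cleaned some other way.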

SQLAlchemy not creating tables

I am trying to set up a database just like in a tutorial, but I am getting a programming error saying that a table doesn't exist when I try to add a User.
This is the file that errors (database.py):
    from sqlalchemy import create_engine, MetaData
    from sqlalchemy.orm import scoped_session, sessionmaker
    from sqlalchemy.ext.declarative import declarative_base

    engine = create_engine(
        "mysql+pymysql://testuser:testpassword@localhost/test?charset=utf8",
        connect_args={
            "port": 3306
        },
        echo="debug",
        echo_pool=True
    )
    db_session = scoped_session(
        sessionmaker(
            bind=engine,
            autocommit=False,
            autoflush=False
        )
    )
    Base = declarative_base()

    def init_db():
        import models
        Base.metadata.create_all(bind=engine)
        from models import User
        db_session.add(
            User(username="testuser", password_hash=b"", password_salt=b"", balance=1)
        )
        db_session.commit()
        print("Initialized the db")

    if __name__ == "__main__":
        init_db()
To init the database (create the tables) I just run the file.
It errors when it creates the test user.
Here is models.py:
    from sqlalchemy import Column, Integer, Numeric, Binary, String
    from sqlalchemy.orm import relationship
    from database import Base

    class User(Base):
        __tablename__ = "users"
        id = Column(Integer, primary_key=True)
        username = Column(String(16), unique=True)
        password_hash = Column(Binary(32))
        password_salt = Column(Binary(32))
        balance = Column(Numeric(precision=65, scale=8))

        def __repr__(self):
            return "<User(balance={})>".format(self.balance)
I tried:
- Committing before adding users (after create_all)
- Dropping existing tables from the database (although it seems like the table never gets committed)
- from models import User instead of import models (before create_all)
Sorry if there are many similar questions. I promise I searched for answers, but they always turned out to be silly mistakes that I made sure I didn't make (or at least the ones I saw).
I am using MariaDB.
Sorry for the long post, many thanks in advance.

The Base in database.py isn't the same Base that is imported into models.py.
A simple test is to put a print('creating Base') function call just above the Base = declarative_base() statement, and you'll see it is being created twice.
Python calls the module that is being executed '__main__', which you know as you have the if __name__ == '__main__' conditional at the bottom of your module. So the first Base that is created is __main__.Base. Then, in models.py, from database import Base causes the database module to be parsed again, creating database.Base in the namespace, and that is the Base from which User inherits. Then back in database.py, the Base.metadata.create_all(bind=engine) call is using the metadata from __main__.Base which has no tables in it, and as such creates nothing.
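The double execution described above can be reproduced without SQLAlchemy. The sketch below writes two tiny throwaway modules to a temporary directory and runs one of them as a script. The names demo_database and demo_models are made up for the illustration, and Base is a plain class standing in for the result of declarative_base().

```python
import runpy
import sys
import tempfile
from pathlib import Path

DATABASE_SRC = '''
class Base:          # stand-in for declarative_base()
    pass

def init_db():
    import demo_models
    # Compare "our" Base with the one demo_models imported.
    return Base is demo_models.Base

if __name__ == "__main__":
    same_base = init_db()
'''

MODELS_SRC = "from demo_database import Base\n"

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_database.py").write_text(DATABASE_SRC)
    Path(tmp, "demo_models.py").write_text(MODELS_SRC)
    sys.path.insert(0, tmp)
    try:
        # Runs the file with __name__ set to "__main__", like "python demo_database.py".
        result = runpy.run_path(str(Path(tmp, "demo_database.py")), run_name="__main__")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_database", None)
        sys.modules.pop("demo_models", None)

same_base = result["same_base"]
print(same_base)
```

The script's Base and the Base that demo_models imported are two distinct class objects, because the file was executed once as __main__ and once more as demo_database.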
Don't execute the module that creates the Base instance as a script. Create another module called main.py (or whatever), move your init_db() function there, and import Base, db_session and engine from database.py into main.py. That way, you are always using the same Base instance. Here is an example of main.py:
    from database import Base, db_session, engine
    from models import User

    def init_db():
        Base.metadata.create_all(bind=engine)
        db_session.add(
            User(username="testuser", password_hash=b"", password_salt=b"", balance=1)
        )
        db_session.commit()
        print("Initialized the db")

    if __name__ == "__main__":
        init_db()

Declare the Base class once (for each database) and import it into all modules that define table classes (which inherit from Base).
Base only knows about the table classes that have actually been defined, so every module that defines such classes must be imported in the module where Base.metadata.create_all(engine) is called, before making that call.

You need to import the relevant model in the module where you call "Base.metadata.create_all". The example below creates the user table:
    from ModelBase import Base
    from UserModel import User

    def create_db_schema(engine):
        Base.metadata.create_all(engine, checkfirst=True)

peewee - Define models separately from Database() initialization

I need to use an ORM, like peewee, for handling an SQLite database within my Python application. However, most such libraries offer syntax like this to define models.py:
    import peewee

    db = peewee.SqliteDatabase('hello.sqlite')

    class Person(peewee.Model):
        name = peewee.CharField()

        class Meta:
            database = db
However, in my application I cannot use such syntax, since the database file name is provided by outside code after import, from the module that imports my models.py.
How can I initialize the models from outside their definition, given a dynamic database file name? Ideally, models.py should not mention "database" at all, like a normal ORM.

Maybe you are looking for the Proxy feature (see "Proxy" in the peewee documentation):
    from peewee import CharField, Model, PostgresqlDatabase, Proxy, SqliteDatabase

    database_proxy = Proxy()  # Create a proxy for our db.

    class BaseModel(Model):
        class Meta:
            database = database_proxy  # Use proxy for our DB.

    class User(BaseModel):
        username = CharField()

    # Based on configuration, use a different database.
    # (app is assumed to be your application object, e.g. a Flask app.)
    if app.config['DEBUG']:
        database = SqliteDatabase('local.db')
    elif app.config['TESTING']:
        database = SqliteDatabase(':memory:')
    else:
        database = PostgresqlDatabase('mega_production_db')

    # Configure our proxy to use the db we specified in config.
    database_proxy.initialize(database)
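The idea behind such a proxy is deferred binding: models point at a placeholder at import time, and the real object is attached later. Here is a rough standard-library sketch of that idea. It is not peewee's implementation, and the Proxy and FakeDatabase classes below are hypothetical stand-ins written for this illustration.

```python
class Proxy:
    """Placeholder that forwards attribute access once a target is set."""

    def __init__(self):
        self._target = None

    def initialize(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        if self._target is None:
            raise AttributeError("proxy used before initialize() was called")
        return getattr(self._target, name)


class FakeDatabase:
    """Hypothetical stand-in for a real database object."""

    def __init__(self, filename):
        self.filename = filename


database_proxy = Proxy()  # module-level, created at import time


class Person:
    database = database_proxy  # the model refers to the proxy, not to a file


# Later, the importing code supplies the real file name.
database_proxy.initialize(FakeDatabase("hello.sqlite"))
print(Person.database.filename)
```

The models module never needs to know the file name. Whoever imports it decides which database to attach, and when.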
