Description of the problem
I wanted to use ormar with SQLite for my project, but I ran into the problem that ormar doesn't save changes to the database, even though everything seems to be done according to the documentation. (Additionally, I used Faker to generate unique names to fill the db and loguru for logging.)
My code
import asyncio

from databases import Database
from faker import Faker
from loguru import logger as log
from ormar import ModelMeta, Model, Integer, String
from sqlalchemy import create_engine, MetaData

fake = Faker()

DB_PATH = 'sqlite:///db.sqlite3'
database = Database(DB_PATH)
metadata = MetaData()


class BaseMeta(ModelMeta):
    database = database
    metadata = metadata


class User(Model):
    class Meta(BaseMeta):
        tablename = 'users'

    id: int = Integer(primary_key=True)
    name: str = String(max_length=64)


# Also I tried without the `with_connect` function, but it also doesn't work
async def with_connect(function):
    async with database:
        await function()


async def create():
    return f"User created: {await User(name=fake.name()).save()}"


# Also I tried this: `User.objects.get_or_create(name=fake.name())`
# but it also doesn't work:
async def read():
    return f"All data from db: {await User.objects.all()}"


async def main():
    log.info(await create())
    log.info(await create())
    log.info(await read())


if __name__ == '__main__':
    engine = create_engine(DB_PATH)
    metadata.drop_all(engine)
    metadata.create_all(engine)
    try:
        asyncio.run(with_connect(main))
    finally:
        metadata.drop_all(engine)
Results
As a result, I expected that each run of the code would also print the data created during previous runs; that is, I expected the created objects to be persisted to the file db.sqlite3.
The actual result is that after each run of the code, only the data generated during that run is printed.
Conclusion
Why is the data not saved to the database file? Maybe I misunderstood how ORMs work?
Related
I am using Flask-SQLAlchemy with autocommit set to False and autoflush set to True. It's connecting to a MySQL database.
I have 3 methods like this:
def insert_something():
    insert_statement = <something>
    db.session.execute(insert_statement)
    db.session.commit()

def delete_something():
    delete_statement = <something>
    db.session.execute(delete_statement)
    db.session.commit()

def delete_something_else():
    delete_statement = <something>
    db.session.execute(delete_statement)
    db.session.commit()
Sometimes I want to run these methods individually; no problems there — but sometimes I want to run them together in a nested transaction. I want insert_something to run first, and delete_something to run afterwards, and delete_something_else to run last. If any of those methods fail then I want everything to be rolled back.
I've tried the following:
db.session.begin_nested()
insert_something()
delete_something()
delete_something_else()
db.session.commit()
This doesn't work, though, because insert_something exits the nested transaction (and releases the savepoint). Then, when delete_something runs db.session.commit(), it actually commits the deletion to the database because it is now in the outermost transaction.
That final db.session.commit() in the code block above doesn't do anything; everything is already committed by that point.
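For illustration, here is a minimal sketch of that savepoint behavior, matching the SQLAlchemy 1.x semantics described above (insert_stmt and delete_stmt are hypothetical statements):

db.session.begin_nested()        # SAVEPOINT sp1
db.session.execute(insert_stmt)
db.session.commit()              # RELEASE SAVEPOINT sp1; now back in the outer transaction
db.session.execute(delete_stmt)
db.session.commit()              # no savepoint left, so this commits the outer transaction for real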
Maybe I can do something like this, but it's ugly as hell:
db.session.begin_nested()
db.session.begin_nested()
db.session.begin_nested()
db.session.begin_nested()
insert_something()
delete_something()
delete_something_else()
db.session.commit()
There's gotta be a better way to do it without touching the three methods...
Edit:
Now I'm doing it like this:
with db.session.begin_nested():
    insert_something()
    with db.session.begin_nested():
        delete_something()
        with db.session.begin_nested():
            delete_something_else()
db.session.commit()
Which is better, but still not great.
I'd love to be able to do something like this:
with db.session.begin_nested() as nested:
    insert_something()
    delete_something()
    delete_something_else()
    nested.commit()  # though I feel like you shouldn't need this in a with block
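As an aside, in SQLAlchemy 1.4+ begin_nested() does work that way as a context manager: the savepoint is released on normal exit of the with block and rolled back if an exception propagates. A sketch (do_work_without_committing is hypothetical, and the pattern still breaks if the helpers call db.session.commit() themselves):

with db.session.begin_nested():   # SAVEPOINT
    do_work_without_committing()
# RELEASE SAVEPOINT on success; ROLLBACK TO SAVEPOINT if an exception was raised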
The docs discuss avoiding this pattern in arbitrary-transaction-nesting-as-an-antipattern and session-faq-whentocreate.
But there is an example in the docs that is similar to this, though it is aimed at testing:
https://docs.sqlalchemy.org/en/14/orm/session_transaction.html?highlight=after_transaction_end#joining-a-session-into-an-external-transaction-such-as-for-test-suites
Regardless, here is a gross transaction manager based on that example that "seems" to work, but don't do this; I think there are a lot of gotchas in here.
import contextlib

from sqlalchemy import (
    create_engine,
    Integer,
    String,
)
from sqlalchemy.schema import (
    Column,
    MetaData,
)
from sqlalchemy.orm import declarative_base, Session
from sqlalchemy import event
from sqlalchemy.sql import delete, select

db_uri = 'postgresql+psycopg2://username:password@/database'
engine = create_engine(db_uri, echo=True)
metadata = MetaData()
Base = declarative_base(metadata=metadata)


class Device(Base):
    __tablename__ = "devices"
    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String(50))


def get_devices(session):
    return [d.name for (d,) in session.execute(select(Device)).all()]

def create_device(session, name):
    session.add(Device(name=name))
    session.commit()

def delete_device(session, name):
    session.execute(delete(Device).filter(Device.name == name))
    session.commit()

def almost_create_device(session, name):
    session.add(Device(name=name))
    session.flush()
    session.rollback()


@contextlib.contextmanager
def force_nested_transaction_forever(session, commit_on_complete=True):
    """
    Keep re-entering a nested transaction every time a transaction ends.
    """
    d = {
        'nested': session.begin_nested()
    }

    @event.listens_for(session, "after_transaction_end")
    def end_savepoint(session, transaction):
        # Start another nested transaction if the prior one is no longer active.
        if not d['nested'].is_active:
            d['nested'] = session.begin_nested()

    try:
        yield
    finally:
        # Stop trapping us in perpetual nested transactions.
        # Is this the right place for this?
        event.remove(session, "after_transaction_end", end_savepoint)
        # This seems like it would be error prone.
        if commit_on_complete and d['nested'].is_active:
            d.pop('nested').commit()


if __name__ == '__main__':
    metadata.create_all(engine)

    with Session(engine) as session:
        with session.begin():
            # THIS IS NOT RECOMMENDED
            with force_nested_transaction_forever(session):
                create_device(session, "0")
                create_device(session, "a")
                delete_device(session, "a")
                almost_create_device(session, "a")
                create_device(session, "b")
                assert len(get_devices(session)) == 2
        assert len(get_devices(session)) == 2
I am studying the "Cosmic Python" book and chapter 6 explains how to use the Unit of Work pattern to change the interaction with the database/repository.
Chapter 6 of the book can be accessed here:
https://www.cosmicpython.com/book/chapter_06_uow.html
The code provided by the author is the following:
from __future__ import annotations
import abc

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.session import Session

from allocation import config
from allocation.adapters import repository


class AbstractUnitOfWork(abc.ABC):
    products: repository.AbstractRepository

    def __enter__(self) -> AbstractUnitOfWork:
        return self

    def __exit__(self, *args):
        self.rollback()

    @abc.abstractmethod
    def commit(self):
        raise NotImplementedError

    @abc.abstractmethod
    def rollback(self):
        raise NotImplementedError


DEFAULT_SESSION_FACTORY = sessionmaker(bind=create_engine(
    config.get_postgres_uri(),
    isolation_level="REPEATABLE READ",
))


class SqlAlchemyUnitOfWork(AbstractUnitOfWork):
    def __init__(self, session_factory=DEFAULT_SESSION_FACTORY):
        self.session_factory = session_factory

    def __enter__(self):
        self.session = self.session_factory()  # type: Session
        self.products = repository.SqlAlchemyRepository(self.session)
        return super().__enter__()

    def __exit__(self, *args):
        super().__exit__(*args)
        self.session.close()

    def commit(self):
        self.session.commit()

    def rollback(self):
        self.session.rollback()
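For context, the rollback in __exit__ means any uncommitted work is discarded when the block ends, which is exactly what a test would rely on. Typical usage looks roughly like this (a sketch; products.get and sku follow the book's domain and are placeholders here):

with SqlAlchemyUnitOfWork() as uow:
    product = uow.products.get(sku=sku)
    # ... mutate product ...
    uow.commit()  # without an explicit commit, __exit__ rolls everything back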
I am trying to test my endpoints in Flask, but I could not make it roll back the data inserted after each test.
To solve that I tried to install the package pytest-flask-sqlalchemy, but I ran into the following error:
'SqlAlchemyUnitOfWork' object has no attribute 'engine'
I do not quite understand how pytest-flask-sqlalchemy works, and I have no clue how to make the Unit of Work roll back transactions after a test.
Is it possible to make it work the way the author implemented it?
Edited
It is possible to replicate my situation through the following repository:
https://github.com/Santana94/CosmicPythonRollbackTest
You should see that the test is not rolling back previous actions by cloning it and running make all.
Finally, I managed to get the rollback functionality to happen after every test.
I got it working when I saw that a package called pytest-postgresql implements it internally. I just made my own adjustments so that tests roll back the database data I was working with. For that, I just had to implement this fixture in conftest.py:
@pytest.fixture(scope='function')
def db_session():
    engine = create_engine(config.get_postgres_uri(), echo=False, poolclass=NullPool)
    metadata.create_all(engine)
    pyramid_basemodel.Session = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
    pyramid_basemodel.bind_engine(
        engine, pyramid_basemodel.Session, should_create=True, should_drop=True)
    yield pyramid_basemodel.Session
    transaction.commit()
    metadata.drop_all(engine)
After that, I had to pass db_session as a parameter to a test whenever I wanted it to roll back transactions:
@pytest.mark.usefixtures('postgres_db')
@pytest.mark.usefixtures('restart_api')
def test_happy_path_returns_202_and_batch_is_allocated(db_session):
    orderid = random_orderid()
    sku, othersku = random_sku(), random_sku('other')
    earlybatch = random_batchref(1)
    laterbatch = random_batchref(2)
    otherbatch = random_batchref(3)
    api_client.post_to_add_batch(laterbatch, sku, 100, '2011-01-02')
    api_client.post_to_add_batch(earlybatch, sku, 100, '2011-01-01')
    api_client.post_to_add_batch(otherbatch, othersku, 100, None)

    r = api_client.post_to_allocate(orderid, sku, qty=3)
    assert r.status_code == 202

    r = api_client.get_allocation(orderid)
    assert r.ok
    assert r.json() == [
        {'sku': sku, 'batchref': earlybatch},
    ]
It is possible to check out the requirements for that and other aspects of that implementation on my GitHub repository.
https://github.com/Santana94/CosmicPythonRollbackTest
I have a file called redis_db.py, which has the code to connect to Redis:
import os
import redis
import sys


class Database:
    def __init__(self, zset_name):
        redis_host = os.environ.get('REDIS_HOST', '127.0.0.1')
        redis_port = os.environ.get('REDIS_PORT', 6379)
        self.db = redis.StrictRedis(host=redis_host, port=redis_port)
        self.zset_name = zset_name

    def add(self, key):
        try:
            self.db.zadd(self.zset_name, {key: 0})
        except redis.exceptions.ConnectionError:
            print("Unable to connect to redis host.")
            sys.exit(0)
I have another file called app.py, which looks like this:
from flask import Flask
from redis_db import Database

app = Flask(__name__)
db = Database('zset')


@app.route('/add_word/word=<word>')
def add_word(word):
    db.add(word)
    return "{} added".format(word)


if __name__ == '__main__':
    app.run(host='0.0.0.0', port='8080')
Now I am writing a unit test for the add_word function like this:
import unittest
import sys
import os
from unittest import mock

sys.path.append(os.path.dirname(os.path.realpath(__file__)) + "/../api/")
from api import app  # noqa: E402


class Testing(unittest.TestCase):
    def test_add_word(self):
        with mock.patch('app.Database') as mockdb:
            mockdb.return_value.add.return_value = ""
            result = app.add_word('shivam')
            self.assertEqual(result, 'shivam word added.')
The issue I am facing is that even though I am mocking the db method call, it still calls the actual method in the class instead of returning the mocked value, and during testing I get the error message Unable to connect to redis host.
Can someone please help me figure out how I can mock the Redis database calls?
I am using the unittest module.
The issue is that db is defined at module import time, so mock.patch does not affect the db variable. Either move the instantiation of db into the add_word(word) function, or patch db instead of Database, e.g.:
def test_add_word():
    with mock.patch('api.app.db') as mockdb:
        mockdb.add = mock.MagicMock(return_value="your desired return value")
        result = app.add_word('shivam')
        print(result)
Note that the call to add_word has to be inside the with block; otherwise the unmocked version is used.
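For completeness, the other option mentioned above, moving the instantiation into the function, would look roughly like this (a sketch of app.py; with Database created per request, mock.patch('app.Database') takes effect):

from flask import Flask
from redis_db import Database

app = Flask(__name__)


@app.route('/add_word/word=<word>')
def add_word(word):
    db = Database('zset')  # created at call time, so patching app.Database now works
    db.add(word)
    return "{} added".format(word)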
I have tried unsuccessfully to save fetched API data to an SQLite database in a Flask app. I used requests.get() to extract external API data into a dataframe. The function extract_to_df_race works when I test it in a Jupyter Notebook. I placed try-except statements to print error messages to the console. Since no error messages were logged to the console, I initially presumed that the data had been successfully fetched and saved to the database. However, upon checking the database, none of the records had been saved.
I used a custom Flask command to execute the historical_records function to one-off load the database.
Are there any better methods of debugging that I could try?
app/api/log.py
from app import app
from app.models import Race, db
from app.utils import *
import click


@app.cli.command()
def historical_records():
    seasons = [2015]
    races_round = range(1, 5)
    df_races = extract_to_df_race('results', seasons, races_round)
    save_races_to_db(df_races, db)


def save_races_to_db(df_races, db):
    for idx, row in df_races.iterrows():
        r = Race()
        r.url = df_races.loc[idx, "url"]
        r.season = df_races.loc[idx, "season"]
        r.raceName = df_races.loc[idx, "raceName"]
        db.session.add(r)
    try:
        db.session.commit()
    except Exception as e:
        db.session.rollback()
        eprint(str(e))
To execute the historical_records function from the virtual environment, I ran "export FLASK_APP=app/api/log.py", then "flask historical_records".
app/utils.py
from __future__ import print_function
import requests
import json
import pandas as pd
import datetime
import sys


def eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)


def extract_to_df_race(results_type, seasons, races_round):
    # API_URL and transform_func are defined elsewhere in the project (not shown here)
    df_races = pd.DataFrame()
    if results_type == 'results':
        for s in seasons:
            for r in races_round:
                try:
                    response = requests.get(API_URL)
                    response.raise_for_status()
                    dictionary = response.content
                    dictionary = json.loads(dictionary)
                    races = transform_func(dictionary, s, r)
                    df_races = pd.concat([df_races, races])
                except requests.exceptions.HTTPError as err:
                    eprint(err)
                    sys.exit(1)
    return df_races
Races model
class Race(db.Model, Serializer):
    __tablename__ = 'races'

    raceId = db.Column(db.Integer, primary_key=True)
    url = db.Column(db.String(50), unique=True)
    season = db.Column(db.Integer)
    raceName = db.Column(db.String(50))

    def __init__(self, **kwargs):
        super(Race, self).__init__(**kwargs)
My TaskB requires TaskA; on completion, TaskA writes to a MySQL table, and TaskB is then to take the output in that table as its input.
I cannot seem to figure out how to do this in Luigi. Can someone point me to an example, or give me a quick example here?
The existing MySqlTarget in luigi uses a separate marker table to indicate when the task is complete. Here's the rough approach I would take...but your question is very abstract, so it is likely to be more complicated in reality.
import luigi

from datetime import datetime
from luigi.contrib.mysqldb import MySqlTarget


class TaskA(luigi.Task):
    rundate = luigi.DateParameter(default=datetime.now().date())
    target_table = "table_to_update"
    host = "localhost:3306"
    db = "db_to_use"
    user = "user_to_use"
    pw = "pw_to_use"

    def get_target(self):
        return MySqlTarget(host=self.host, database=self.db, user=self.user, password=self.pw,
                           table=self.target_table, update_id=str(self.rundate))

    def requires(self):
        return []

    def output(self):
        return self.get_target()

    def run(self):
        # update the table here, then mark the task complete in the marker table
        self.get_target().touch()


class TaskB(luigi.Task):
    def requires(self):
        return [TaskA()]

    def run(self):
        # read from target_table here; TaskA is guaranteed to have completed
        pass
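The body of TaskB.run is left open above. The read itself is plain MySQL client work, nothing luigi-specific; a minimal sketch, assuming pymysql and the same hypothetical connection values as TaskA:

import pymysql

def read_table():
    # Plain client-side read; luigi only tracks completion via the marker table.
    conn = pymysql.connect(host="localhost", port=3306, user="user_to_use",
                           password="pw_to_use", database="db_to_use")
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT * FROM table_to_update")
            return cur.fetchall()
    finally:
        conn.close()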