PendingRollbackError when accessing test database in FastAPI async test

I'm trying to mimic Django behavior when running tests on FastAPI: I want to create a test database in the beginning of each test, and destroy it in the end. The problem is the async nature of FastAPI is breaking everything. When I did a sanity check and turned everything synchronous, everything worked beautifully. When I try to run things async though, everything breaks. Here's what I have at the moment:
The fixture:
@pytest.fixture(scope="session")
def event_loop():
    return asyncio.get_event_loop()

@pytest.fixture(scope="session")
async def session():
    sync_test_db = "postgresql://postgres:postgres@postgres:5432/test"
    if not database_exists(sync_test_db):
        create_database(sync_test_db)
    async_test_db = "postgresql+asyncpg://postgres:postgres@postgres:5432/test"
    engine = create_async_engine(url=async_test_db, echo=True, future=True)
    async with engine.begin() as conn:
        await conn.run_sync(SQLModel.metadata.create_all)
    Session = sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)
    async with Session() as session:
        def get_session_override():
            return session
        app.dependency_overrides[get_session] = get_session_override
        yield session
    drop_database(sync_test_db)
The test:
class TestSomething:
    @pytest.mark.asyncio
    async def test_create_something(self, session):
        data = {"some": "data"}
        response = client.post(
            "/", json=data
        )
        assert response.ok
        results = await session.execute(select(Something))  # <- This line fails
        assert len(results.all()) == 1
The error:
E sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: Task <Task pending name='anyio.from_thread.BlockingPortal._call_func' coro=<BlockingPortal._call_func() running at /usr/local/lib/python3.9/site-packages/anyio/from_thread.py:187> cb=[TaskGroup._spawn.<locals>.task_done() at /usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py:629]> got Future <Future pending cb=[Protocol._on_waiter_completed()]> attached to a different loop (Background on this error at: https://sqlalche.me/e/14/7s2a)
/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py:601: PendingRollbackError
Any ideas what I might be doing wrong?

Check whether an earlier database statement in your test cases fails before this error is raised.
For me, the PendingRollbackError was caused by an InsertionError raised by a prior test.
All my tests were (async) unit tests that inserted rows into a Postgres database.
After each test, the database session was supposed to roll back its entries.
The InsertionError came from inserts that violated a unique constraint. Every subsequent test then raised the PendingRollbackError, because the session had not been rolled back after that first failure.
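The "Original exception" quoted in the question's traceback is asyncio's cross-loop error rather than a database error. Below is a minimal standard-library sketch of how that message arises. It assumes nothing about the poster's setup, and the function name and the two loops are purely illustrative: a Future created on one event loop is awaited by a task running on a different loop.

```python
import asyncio


def await_future_from_other_loop() -> str:
    """Await a Future bound to loop_a from a task that runs on loop_b."""
    loop_a = asyncio.new_event_loop()
    loop_b = asyncio.new_event_loop()
    try:
        foreign = loop_a.create_future()  # this Future belongs to loop_a

        async def waiter():
            await foreign  # awaited while loop_b is the running loop

        try:
            loop_b.run_until_complete(waiter())
        except RuntimeError as exc:
            return str(exc)
        return "no error"
    finally:
        loop_a.close()
        loop_b.close()


message = await_future_from_other_loop()
print(message)
```

In the question, the connection and session are created on the fixture's loop while the synchronous test client drives the app on another loop, which is consistent with this kind of failure happening first and the PendingRollbackError appearing on the next use of the session.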

Related

FastAPI + pytest unable to clean Django ORM

I'm creating a FastAPI project that integrates with the Django ORM. When running pytest though, the PostgreSQL database is not rolling back the transactions. Switching to SQLite, the SQLite database is not clearing the transactions, but it is tearing down the db (probably because SQLite uses in-memory db). I believe pytest-django is not calling the rollback method to clear the database.
In my pytest.ini, I have the --reuse-db flag on.
Here's the repo: https://github.com/Andrew-Chen-Wang/fastapi-django-orm, which includes pytest-django and pytest-asyncio. If anyone's done this with Flask, that would help too.
Assuming you have PostgreSQL:
Steps to reproduce:
sh bin/create_db.sh which creates a new database called testorm
pip install -r requirements/local.txt
pytest tests/
The test is calling a view that creates a new record in the database tables and tests whether there is an increment in the number of rows in the table:
# In app/core/api/a_view.py
@router.get("/hello")
async def hello():
    await User.objects.acreate(name="random")
    return {"message": f"Hello World, count: {await User.objects.acount()}"}

# In tests/conftest.py
import pytest
from httpx import AsyncClient
from app.main import fast

@pytest.fixture()
def client() -> AsyncClient:
    return AsyncClient(app=fast, base_url="http://test")

# In tests/test_default.py
async def test_get_hello_view(client):
    """Tests whether the view can use a Django model"""
    old_count = await User.objects.acount()
    assert old_count == 0
    async with client as ac:
        response = await ac.get("/hello")
    assert response.status_code == 200
    new_count = await User.objects.acount()
    assert new_count == 1
    assert response.json() == {"message": "Hello World, count: 1"}

async def test_clears_database_after_test(client):
    """Testing whether Django clears the database"""
    await test_get_hello_view(client)
The first test case passes but the second doesn't. If you re-run pytest, the first test case also starts not passing because the test database is not clearing the transaction from the first run.
I adjusted the test to not include the client call, but it seems like pytest-django is simply not creating a transaction around the Django ORM, because the db is not being cleared for each test:
async def test_get_hello_view(client):
    """Tests whether the view can use a Django model"""
    old_count = await User.objects.acount()
    assert old_count == 0
    await User.objects.acreate(name="test")
    new_count = await User.objects.acount()
    assert new_count == 1

async def test_clears_database_after_test(client):
    """Testing whether Django clears the database"""
    await test_get_hello_view(client)
How should I clear the database for each test case?

It turns out my pytest.mark.django_db marker just needed transaction=True, i.e. pytest.mark.django_db(transaction=True).

How to use sqlalchemy AsyncSession in Celery

In Celery, I want to use an AsyncSession to work with a MySQL database. This is my code:
db.py
async_engine = create_async_engine(
    ASYNC_DATABASE_URL,
    future=True,
    echo=True,
    pool_size=1000,
    pool_pre_ping=True,
    pool_recycle=28800 - 300,
    max_overflow=0,
)
async_session = sessionmaker(
    async_engine, expire_on_commit=False, class_=AsyncSession
)
celery.py
app = Celery(
    'tasks',
    broker=f"{config.REDIS_URL}/{config.CELERY_DB}",
    CELERY_TASK_SERIALIZER='pickle',
    CELERY_RESULT_SERIALIZER='pickle',
    CELERYD_CONCURRENCY=20,
    CELERY_ACCEPT_CONTENT=[
        'pickle',
        'json'],
    include=['celery_task.user_permission', 'celery_task.repair_gpu']
)

@app.task(name='repair_gpu')
def repair_gpu(task_id, repair_map: list):
    # print("test")
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(repair_gpu_task(task_id, repair_map))

async def repair_gpu_task(task_id, repair_map: list):
    status = True
    async with async_session() as session:
        query = select(module.Task).where(module.Task.id == task_id)
        task: module.Task = (await session.execute(query)).scalars().first()
        # ....many codes......
The program works fine when I run only one task, but running multiple tasks at the same time reports the following error:
The garbage collector is trying to clean up connection <AdaptedConnection <asyncmy.connection.Connection object at 0x000001DECDAA37C0>>. This feature is unsupported on unsupported on asyncio dbapis that lack a "terminate" feature, since no IO can be performed at this stage to reset the connection. Please close out all connections when they are no longer used, calling close() or using a context manager to manage their lifetime.
How can I use it? How do I fix this error?
I don't know how to fix this or what to try.
Thank you.

How to return error response when database failed to commit

Introduction:
In our FastAPI app, we implemented a dependency that commits changes to the database (following https://fastapi.tiangolo.com/tutorial/dependencies/dependencies-with-yield/).
Issue:
The issue we are facing (also partially mentioned on the page linked above) is that the router response is 200 even though the commit did not succeed. This is simply because, in our case, commit or rollback is called after the response has been sent to the requester.
Example:
Database dependency:
def __with_db(request: Request):
    db = Session()
    try:
        yield db
        db.commit()
    except Exception as e:
        db.rollback()
        raise e
    finally:
        db.close()
As an endpoint example, we import a CSV file with records, create db model instances from them, and add them to the db session (irrelevant parts removed for simplicity).
from models import Movies
...
@router.post("/import")
async def upload_movies(file: UploadFile, db: DbType = db_dependency):
    df = await read_csv(file)
    new_records = [Movies(**item) for item in df.to_dict("records")]
    db.add_all(new_records)  # this is still a valid operation, no error here
    return "OK"
Nothing within the endpoint raises an error, so the endpoint returns a positive response. However, once the rest of the dependency code is executed, it throws an error (e.g. whenever one of the records has a null value).
Question:
Is there any way to actually get an error response when the database fails to commit the changes?
Of course, the simplest option would be to add db.commit() or even db.flush() to each endpoint, but because we have a lot of endpoints, we want to avoid repeating this in each of them (if that is even possible).
Best regards,
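The ordering described in the question can be reproduced with a plain Python generator. The sketch below is a simulation only, not FastAPI's actual dispatch code: FakeSession, with_db and handle_request are hypothetical names, and the "response" is just a log entry. It shows that the code after the yield runs only when the caller resumes the generator, which in the question's setup is after the response has been finalised.

```python
class FakeSession:
    """Hypothetical stand-in for a database session; commit always fails."""

    def __init__(self, log):
        self.log = log

    def add_all(self, records):
        self.log.append("add_all")

    def commit(self):
        self.log.append("commit")
        raise ValueError("null value in a required column")

    def rollback(self):
        self.log.append("rollback")

    def close(self):
        self.log.append("close")


def with_db(log):
    """Same shape as the question's dependency: commit happens after the yield."""
    db = FakeSession(log)
    try:
        yield db
        db.commit()
    except Exception:
        db.rollback()
        raise
    finally:
        db.close()


def handle_request(log):
    """Simulates a caller that finishes the response before resuming the generator."""
    dependency = with_db(log)
    db = next(dependency)            # run the dependency up to its yield
    db.add_all([{"title": None}])    # the endpoint body: no error here
    log.append("response: 200 OK")   # the response is finalised at this point
    try:
        next(dependency)             # only now does the code after the yield run
    except StopIteration:
        pass
    except ValueError:
        log.append("commit error surfaced too late")


log = []
handle_request(log)
print(log)
```

The commit failure is still rolled back and the session is closed, but by then there is no response left to change, which is why the answers move the commit to a point before the response is produced.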

This is the solution we have implemented for this individual use case.
As a reminder, the main purpose was to catch a database error and react to it by sending proper response to the client. The tricky part was that we wanted to omit the scenario of adding the same line of code to every endpoint as we have plenty of them.
We managed to solve it with middleware.
Updated dependency.py
def __with_db(request: Request):
    db = Session()
    # assign db to the request state so the middleware can access it
    request.state.db = db
    yield db
Added one line to the app.py
# fastAPI version: 0.79.0
from starlette.middleware.base import BaseHTTPMiddleware
from middlewares import unit_of_work_middleware
...
app = FastAPI()
...
app.add_middleware(BaseHTTPMiddleware, dispatch=unit_of_work_middleware) #new line
...
And created main middleware logic in middlewares.py
from fastapi import Request, Response

async def unit_of_work_middleware(request: Request, call_next) -> Response:
    try:
        response = await call_next(request)
        # Committing the DB transaction after the API endpoint has finished successfully
        # So that all the changes made as part of the router are written into the database all together
        # This is an implementation of the Unit of Work pattern https://martinfowler.com/eaaCatalog/unitOfWork.html
        if "db" in request.state._state:
            request.state.db.commit()
        return response
    except:
        # Rolling back the database state to the version before the API endpoint call
        # As the exception happened, all the database changes made as part of the API call
        # should be reverted to keep data consistency
        if "db" in request.state._state:
            request.state.db.rollback()
        raise
    finally:
        if "db" in request.state._state:
            request.state.db.close()
The middleware logic is applied to every endpoint so each request that is coming is going through it.
I think it's a relatively easy way to implement this and get the case resolved.

I don't know your FastAPI version, but as far as I know, since 0.74.0 dependencies with yield can catch HTTPException and custom exceptions before the response is sent (I tested 0.80.0 and it works):
async def get_database():
    with Session() as session:
        try:
            yield session
        except Exception as e:
            # rollback or other operation.
            raise e
        finally:
            session.close()
If an HTTPException is raised, the flow is:
endpoint -> dependency catches exception -> ExceptionMiddleware catches exception -> respond
More info: https://fastapi.tiangolo.com/release-notes/?h=asyncexi#breaking-changes_1
Additionally, about committing in only one place:
Solution 1: use a decorator:
from functools import wraps

def decorator(func):
    @wraps(func)
    async def wrapper(*arg, **kwargs):
        rsp = await func(*arg, **kwargs)
        # commit once the endpoint has finished, before its result is returned
        if 'db' in kwargs:
            kwargs['db'].commit()
        return rsp
    return wrapper

@router.post("/import")
@decorator
async def upload_movies(file: UploadFile, db: DbType = db_dependency):
    df = await read_csv(file)
    new_records = [Movies(**item) for item in df.to_dict("records")]
    db.add_all(new_records)  # this is still a valid operation, no error here
    return "OK"
@Barosh I think a decorator is the easiest way. I also thought about middleware, but it's not possible.
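To see why the decorator approach lets the caller observe a failed commit, here is a minimal standard-library sketch. It assumes the session is passed as a keyword argument named db, as in the decorator above. FakeSession, commit_after and call_endpoint are hypothetical names, and the rollback on failure is an addition that the original decorator does not include.

```python
import asyncio
from functools import wraps


class FakeSession:
    """Hypothetical stand-in for a database session whose commit fails."""

    def __init__(self):
        self.rolled_back = False

    def add_all(self, records):
        pass

    def commit(self):
        raise ValueError("null value violates not-null constraint")

    def rollback(self):
        self.rolled_back = True


def commit_after(func):
    """Commit the 'db' keyword argument after the endpoint coroutine returns."""

    @wraps(func)
    async def wrapper(*args, **kwargs):
        result = await func(*args, **kwargs)
        db = kwargs.get("db")
        if db is not None:
            try:
                db.commit()
            except Exception:
                db.rollback()  # assumption: roll back here before re-raising
                raise
        return result

    return wrapper


@commit_after
async def upload_movies(*, db):
    db.add_all([{"title": None}])
    return "OK"


async def call_endpoint():
    """Plays the role of the framework calling the endpoint."""
    db = FakeSession()
    try:
        body = await upload_movies(db=db)
    except ValueError as exc:
        body = f"error: {exc}"
    return body, db.rolled_back


body, rolled_back = asyncio.run(call_endpoint())
print(body, rolled_back)
```

Because the commit runs inside the wrapped call, the exception reaches the caller instead of the "OK" result, so an exception handler can turn it into an error response.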
Solution 2, just an idea:
Save the session in the request:
async def get_database(request: Request):
    with Session() as session:
        request.state.session = session
        try:
            yield session
        except Exception as e:
            # rollback or other operation.
            raise e
        finally:
            session.close()
custom starlette.request_response
def custom_request_response(func: typing.Callable) -> ASGIApp:
    """
    Takes a function or coroutine `func(request) -> response`,
    and returns an ASGI application.
    """
    is_coroutine = iscoroutinefunction_or_partial(func)

    async def app(scope: Scope, receive: Receive, send: Send) -> None:
        request = Request(scope, receive=receive, send=send)
        if is_coroutine:
            response = await func(request)
        else:
            response = await run_in_threadpool(func, request)
        request.state.session.commit()  # or other operation
        await response(scope, receive, send)

    return app
custom FastAPI.APIRoute.app
class CustomRoute(APIRoute):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.app = custom_request_response(self.get_route_handler())
create router with CustomRoute
router = APIRouter(route_class=CustomRoute)
I think this is a workable idea. You can test it.
Hope this is useful.

Pytest Alembic initialize database with async migrations

The existing posts didn't provide a useful answer to me.
I'm trying to run asynchronous database tests using Pytest (db is Postgres with asyncpg), and I'd like to initialize my database using my Alembic migrations so that I can verify that they work properly in the meantime.
My first attempt was this:
@pytest.fixture(scope="session")
async def tables():
    """Initialize a database before the tests, and then tear it down again"""
    alembic_config: config.Config = config.Config('alembic.ini')
    command.upgrade(alembic_config, "head")
    yield
    command.downgrade(alembic_config, "base")
which didn't actually do anything at all (migrations were never applied to the database, tables not created).
Both Alembic's documentation & Pytest-Alembic's documentation say that async migrations should be run by configuring your env like this:
async def run_migrations_online() -> None:
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.
    """
    connectable = engine
    async with connectable.connect() as connection:
        await connection.run_sync(do_run_migrations)
    await connectable.dispose()

asyncio.run(run_migrations_online())
but this doesn't resolve the issue (however it does work for production migrations outside of pytest).
I stumbled upon a library called pytest-alembic that provides some built-in tests for this.
When running pytest --test-alembic, I get the following exception:
got Future attached to a different loop
A few comments on pytest-asyncio's GitHub repository suggest that the following fixture might fix it:
@pytest.fixture(scope="session")
def event_loop() -> Generator:
    loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()
but it doesn't (same exception remains).
Next I tried to run the upgrade test manually, using:
async def test_migrations(alembic_runner):
    alembic_runner.migrate_up_to("revision_tag_here")
which gives me
alembic_runner.migrate_up_to("revision_tag_here")
venv/lib/python3.9/site-packages/pytest_alembic/runner.py:264: in run_connection_task
return asyncio.run(run(engine))
RuntimeError: asyncio.run() cannot be called from a running event loop
However this is an internal call by pytest-alembic, I'm not calling asyncio.run() myself, so I can't apply any of the online fixes for this (try-catching to check if there is an existing event loop to use, etc.). I'm sure this isn't related to my own asyncio.run() defined in the alembic env, because if I add a breakpoint - or just raise an exception above it - the line is actually never executed.
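The RuntimeError above can be reproduced with the standard library alone. In this sketch, sync_entry_point is a hypothetical stand-in for any library function that calls asyncio.run() internally. Calling it from inside a coroutine fails, while calling it from a worker thread (here via asyncio.to_thread, available from Python 3.9) succeeds, because that thread has no running loop. Whether the thread approach suits pytest-alembic is not something the original post establishes.

```python
import asyncio


async def _work() -> str:
    return "migrated"


def sync_entry_point() -> str:
    """Hypothetical stand-in for a library call that uses asyncio.run() internally."""
    coro = _work()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "never awaited" warning when run() refuses to start
        raise


async def main():
    # Called directly from a coroutine: a loop is already running in this thread.
    try:
        sync_entry_point()
        direct = "no error"
    except RuntimeError as exc:
        direct = str(exc)
    # Called from a worker thread: no running loop there, so asyncio.run() works.
    threaded = await asyncio.to_thread(sync_entry_point)
    return direct, threaded


direct, threaded = asyncio.run(main())
print(direct)
print(threaded)
```

This is the same constraint that makes an async test function a poor place to call a synchronous helper that starts its own event loop.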
Lastly, I've also tried nest-asyncio.apply(), which just hangs forever.
A few more blog posts suggest to use this fixture to initialize database tables for tests:
async with engine.begin() as connection:
    await connection.run_sync(Base.metadata.create_all)
which works for the purpose of creating a database to run tests against, but this doesn't run through the migrations so that doesn't help my case.
I feel like I've tried everything there is & visited every docs page, but I've got no luck so far. Running an async migration test surely can't be this difficult?
If any extra info is required I'm happy to provide it.

I got this up and running pretty easily with the following
env.py - the main idea here is that the migration can be run synchronously
import asyncio
from logging.config import fileConfig

from alembic import context
from sqlalchemy import engine_from_config
from sqlalchemy import pool
from sqlalchemy.ext.asyncio import AsyncEngine

config = context.config
if config.config_file_name is not None:
    fileConfig(config.config_file_name)
target_metadata = mymodel.Base.metadata

def run_migrations_online():
    connectable = context.config.attributes.get("connection", None)
    if connectable is None:
        connectable = AsyncEngine(
            engine_from_config(
                context.config.get_section(context.config.config_ini_section),
                prefix="sqlalchemy.",
                poolclass=pool.NullPool,
                future=True
            )
        )
    if isinstance(connectable, AsyncEngine):
        asyncio.run(run_async_migrations(connectable))
    else:
        do_run_migrations(connectable)

async def run_async_migrations(connectable):
    async with connectable.connect() as connection:
        await connection.run_sync(do_run_migrations)
    await connectable.dispose()

def do_run_migrations(connection):
    context.configure(
        connection=connection,
        target_metadata=target_metadata,
        compare_type=True,
    )
    with context.begin_transaction():
        context.run_migrations()

run_migrations_online()
then I added a simple db init script
init_db.py
from alembic import command
from alembic.config import Config
from sqlalchemy.ext.asyncio import create_async_engine

__config_path__ = "/path/to/alembic.ini"
__migration_path__ = "/path/to/folder/with/env.py"
cfg = Config(__config_path__)
cfg.set_main_option("script_location", __migration_path__)

async def migrate_db(conn_url: str):
    async_engine = create_async_engine(conn_url, echo=True)
    async with async_engine.begin() as conn:
        await conn.run_sync(__execute_upgrade)

def __execute_upgrade(connection):
    cfg.attributes["connection"] = connection
    command.upgrade(cfg, "head")
then your pytest fixture can look like this
conftest.py
...
@pytest_asyncio.fixture(autouse=True)
async def migrate():
    await migrate_db(conn_url)
    yield
...
Note: I don't scope my migrate fixture to the test session, I tend to drop and migrate after each test.

sqlalchemy when does an object become "not persistent"

I have a function with a semi-long-running session that I use for a bunch of database rows, and at a certain point I want to reload or "refresh" one of the rows to make sure none of its state has changed. Most of the time this code works fine, but every now and then I get this error:
sqlalchemy.exc.InvalidRequestError: Instance '<Event at 0x58cb790>' is not persistent within this Session
I've been reading up on state but cannot understand why an object would stop being persistent? I'm still within a session, so I'm not sure why I would stop being persistent.
Can someone explain what could cause my object to be "not persistent" within the session? I'm not doing any writing to the object prior to this point.
db_event below is the object that is becoming "not persistent"
async def event_white_check_mark_handler(
    self: Events, ctx, channel: TextChannel, member: discord.Member, message: Message
):
    """
    This reaction is for completing an event
    """
    session = database_objects.SESSION()
    try:
        message_id = message.id
        db_event = self.get_event(session, message_id)
        if not db_event:
            return
        logger.debug(f"{member.display_name} wants to complete an event {db_event.id}")
        db_guild = await db.get_or_create(
            session, db.Guild, name=channel.guild.name, discord_id=channel.guild.id
        )
        db_member = await db.get_or_create(
            session,
            db.Member,
            name=member.name,
            discord_id=member.id,
            nick=member.display_name,
            guild_id=db_guild.discord_id,
        )
        db_scheduler_config: db.SchedulerConfig = (
            session.query(db.SchedulerConfig)
            .filter(db.SchedulerConfig.guild_id == channel.guild.id)
            .one()
        )
        # reasons to not complete the event
        if len(db_event) == 0:
            await channel.send(
                f"{member.display_name} you cannot complete an event with no one on it!"
            )
        elif (
            db_member.discord_id == db_event.creator_id
            or await db_scheduler_config.check_permission(
                ctx, db_event.event_name, member, db_scheduler_config.MODIFY
            )
        ):
            async with self.EVENT_LOCKS[db_event.id]:
                session.refresh(db_event)  ########### <---- right here is when I get the error thrown
                db_event.status = const.COMPLETED
                session.commit()
                self.DIRTY_EVENTS.add(db_event.id)
            member_list = ",".join(
                filter(
                    lambda x: x not in const.MEMBER_FIELD_DEFAULT,
                    [str(x.mention) for x in db_event.members],
                )
            )
            await channel.send(f"Congrats on completing a event {member_list}!")
            logger.info(f"Congrats on completing a event {member_list}!")
            # await self.stop_tracking_event(db_event)
            del self.REMINDERS_BY_EVENT_ID[db_event.id]
        else:
            await channel.send(
                f"{member.display_name} you did not create this event and do not have permission to delete the event!"
            )
            logger.warning(f"{member.display_name} you did not create this event!")
    except Exception as _e:
        logger.error(format_exc())
        session.rollback()
    finally:
        database_objects.SESSION.remove()

I am fairly certain that the root cause in this case is a race condition. Using a scoped session in its default configuration manages scope based on the thread only. Using coroutines on top can mean that 2 or more end up sharing the same session, and in case of event_white_check_mark_handler they then race to commit/rollback and to remove the session from the scoped session registry, effectively closing it and expunging all remaining instances from the now-defunct session, making the other coroutines unhappy.
A solution is to not use scoped sessions at all in event_white_check_mark_handler, because it fully manages its session's lifetime, and seems to pass the session forward as an argument. If on the other hand there are some paths that use the scoped session database_objects.SESSION instead of receiving the session as an argument, define a suitable scopefunc when creating the registry:
https://docs.sqlalchemy.org/en/13/orm/contextual.html#using-custom-created-scopes
SQLAlchemy+Tornado: How to create a scopefunc for SQLAlchemy's ScopedSession?
Correct usage of sqlalchemy scoped_session with python asyncio
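The effect of the scope function described in this answer can be illustrated without SQLAlchemy. The sketch below uses a toy ScopedRegistry class, a hypothetical stand-in and not SQLAlchemy's scoped_session, that hands out one object per scope key. Keyed by thread id, two concurrent coroutines receive the same object. Keyed by asyncio.current_task, each coroutine gets its own.

```python
import asyncio
import threading


class ScopedRegistry:
    """Toy registry: returns one object per key produced by scopefunc."""

    def __init__(self, factory, scopefunc):
        self.factory = factory
        self.scopefunc = scopefunc
        self._registry = {}

    def __call__(self):
        key = self.scopefunc()
        if key not in self._registry:
            self._registry[key] = self.factory()
        return self._registry[key]

    def remove(self):
        self._registry.pop(self.scopefunc(), None)


async def grab(registry):
    obj = registry()
    await asyncio.sleep(0)  # yield control so the other coroutine interleaves
    return obj


async def main():
    by_thread = ScopedRegistry(object, threading.get_ident)
    by_task = ScopedRegistry(object, asyncio.current_task)
    a, b = await asyncio.gather(grab(by_thread), grab(by_thread))
    c, d = await asyncio.gather(grab(by_task), grab(by_task))
    return a is b, c is d


shared_by_thread, shared_by_task = asyncio.run(main())
print(shared_by_thread, shared_by_task)
```

With the thread-based key, one coroutine calling remove() would discard the object the other coroutine is still using, which matches the race described above.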

I experienced this issue when retrieving a session from a generator and trying to run the exact same query again from different yielded sessions:
SessionLocal = sessionmaker(bind=engine, class_=Session)

def get_session() -> Generator:
    with SessionLocal() as session:
        yield session
The solution was to use session directly (in my case).
Perhaps in your case I would commit the session, before executing a new query.
def get_data():
    with Session(engine) as session:
        statement = select(Company)
        results = session.exec(statement)
