I have a function with a semi-long-running session that I use for a bunch of database rows... and at a certain point I want to reload or "refresh" one of the rows to make sure none of the state has changed. Most of the time this code works fine, but every now and then I get this error:
sqlalchemy.exc.InvalidRequestError: Instance '<Event at 0x58cb790>' is not persistent within this Session
I've been reading up on object states, but I cannot understand why an object would stop being persistent. I'm still within a session, so I'm not sure why it would stop being persistent.
Can someone explain what could cause my object to become "not persistent" within the session? I'm not doing any writing to the object prior to this point.
db_event below is the object that is becoming "not persistent"
async def event_white_check_mark_handler(
self: Events, ctx, channel: TextChannel, member: discord.Member, message: Message
):
"""
This reaction is for completing an event
"""
session = database_objects.SESSION()
try:
message_id = message.id
db_event = self.get_event(session, message_id)
if not db_event:
return
logger.debug(f"{member.display_name} wants to complete an event {db_event.id}")
db_guild = await db.get_or_create(
session, db.Guild, name=channel.guild.name, discord_id=channel.guild.id
)
db_member = await db.get_or_create(
session,
db.Member,
name=member.name,
discord_id=member.id,
nick=member.display_name,
guild_id=db_guild.discord_id,
)
db_scheduler_config: db.SchedulerConfig = (
session.query(db.SchedulerConfig)
.filter(db.SchedulerConfig.guild_id == channel.guild.id)
.one()
)
# reasons to not complete the event
if len(db_event) == 0:
await channel.send(
f"{member.display_name} you cannot complete an event with no one on it!"
)
elif (
db_member.discord_id == db_event.creator_id
or await db_scheduler_config.check_permission(
ctx, db_event.event_name, member, db_scheduler_config.MODIFY
)
):
async with self.EVENT_LOCKS[db_event.id]:
session.refresh(db_event) ########### <---- right here is when I get the error thrown
db_event.status = const.COMPLETED
session.commit()
self.DIRTY_EVENTS.add(db_event.id)
member_list = ",".join(
filter(
lambda x: x not in const.MEMBER_FIELD_DEFAULT,
[str(x.mention) for x in db_event.members],
)
)
await channel.send(f"Congrats on completing a event {member_list}!")
logger.info(f"Congrats on completing a event {member_list}!")
# await self.stop_tracking_event(db_event)
del self.REMINDERS_BY_EVENT_ID[db_event.id]
else:
await channel.send(
f"{member.display_name} you did not create this event and do not have permission to delete the event!"
)
logger.warning(f"{member.display_name} you did not create this event!")
except Exception as _e:
logger.error(format_exc())
session.rollback()
finally:
database_objects.SESSION.remove()
I am fairly certain that the root cause in this case is a race condition. Using a scoped session in its default configuration manages scope based on the thread only. Using coroutines on top can mean that 2 or more end up sharing the same session, and in case of event_white_check_mark_handler they then race to commit/rollback and to remove the session from the scoped session registry, effectively closing it and expunging all remaining instances from the now-defunct session, making the other coroutines unhappy.
A solution is to not use the scoped session at all in event_white_check_mark_handler, since it fully manages its session's lifetime and seems to pass the session forward as an argument. If, on the other hand, some code paths use the scoped session database_objects.SESSION instead of receiving the session as an argument, define a suitable scopefunc when creating the registry (see the sketch after the links below):
https://docs.sqlalchemy.org/en/13/orm/contextual.html#using-custom-created-scopes
SQLAlchemy+Tornado: How to create a scopefunc for SQLAlchemy's ScopedSession?
Correct usage of sqlalchemy scoped_session with python asyncio
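As a minimal sketch of that idea, assuming an asyncio-based application, the scope can be keyed on the current asyncio task so that concurrent coroutines no longer share (and tear down) one another's session. The engine URL and registry name below are placeholders, not taken from the original code:
import asyncio
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine("postgresql://user:password@localhost/mydb")  # placeholder URL

# Each asyncio task gets its own Session from the registry instead of
# sharing one per thread.
SESSION = scoped_session(
    sessionmaker(bind=engine),
    scopefunc=asyncio.current_task,
)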
I experienced this issue when retrieving a session from a generator and trying to run the exact same query again from different yielded sessions:
from typing import Generator
from sqlalchemy.orm import Session, sessionmaker

SessionLocal = sessionmaker(bind=engine, class_=Session)

def get_session() -> Generator:
    with SessionLocal() as session:
        yield session
The solution was to use the session directly (in my case).
Perhaps in your case, I would commit the session before executing a new query.
from sqlmodel import Session, select

def get_data():
    with Session(engine) as session:
        statement = select(Company)
        results = session.exec(statement)  # session.exec(...) is the SQLModel Session API
        return results.all()
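Applied to the original handler, the commit-before-querying suggestion would look roughly like the sketch below, which simply ends the current transaction before re-loading the row (using the question's own names; untested against the original code):
async with self.EVENT_LOCKS[db_event.id]:
    session.commit()           # end the current transaction first
    session.refresh(db_event)  # then re-load the row in a fresh transaction
    db_event.status = const.COMPLETED
    session.commit()
    self.DIRTY_EVENTS.add(db_event.id)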
In Celery, I want to use AsyncSession to operate on a MySQL database. This is my code:
db.py
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
from sqlalchemy.orm import sessionmaker

async_engine = create_async_engine(
    ASYNC_DATABASE_URL,
    future=True,
    echo=True,
    pool_size=1000,
    pool_pre_ping=True,
    pool_recycle=28800 - 300,
    max_overflow=0,
)
async_session = sessionmaker(
    async_engine, expire_on_commit=False, class_=AsyncSession
)
celery.py
app = Celery(
'tasks',
broker=f"{config.REDIS_URL}/{config.CELERY_DB}",
CELERY_TASK_SERIALIZER='pickle',
CELERY_RESULT_SERIALIZER='pickle',
CELERYD_CONCURRENCY=20,
CELERY_ACCEPT_CONTENT=[
'pickle',
'json'],
include=['celery_task.user_permission', 'celery_task.repair_gpu']
)
@app.task(name='repair_gpu')
def repair_gpu(task_id, repair_map: list):
    # print("test")
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    #
    #
    loop.run_until_complete(repair_gpu_task(task_id, repair_map))
async def repair_gpu_task(task_id, repair_map: list):
    status = True
    async with async_session() as session:
        query = select(module.Task).where(module.Task.id == task_id)
        task: module.Task = (await session.execute(query)).scalars().first()
        # ... many more lines ...
The program works fine when I only run one task. Running multiple tasks at the same time reports the following error:
The garbage collector is trying to clean up connection <AdaptedConnection <asyncmy.connection.Connection object at 0x000001DECDAA37C0>>. This feature is unsupported on asyncio dbapis that lack a "terminate" feature, since no IO can be performed at this stage to reset the connection. Please close out all connections when they are no longer used, calling close() or using a context manager to manage their lifetime.
How should I be using this? How do I fix this error? I don't know what else to try.
Thank you.
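The error message itself points at closing out connections before they are garbage collected. One hedged sketch, given that each task spins up and tears down its own event loop, is to dispose of the engine's pool while that loop is still running (this is only one possible direction, not a verified fix for this exact setup):
async def repair_gpu_task(task_id, repair_map: list):
    try:
        async with async_session() as session:
            query = select(module.Task).where(module.Task.id == task_id)
            task: module.Task = (await session.execute(query)).scalars().first()
            # ... many more lines ...
    finally:
        # Return/close all pooled connections while the task's event loop is
        # still alive, so nothing is left over for the garbage collector.
        await async_engine.dispose()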
I'm trying to mimic Django's behavior when running tests on FastAPI: I want to create a test database at the beginning of each test and destroy it at the end. The problem is that the async nature of FastAPI breaks everything. When I did a sanity check and made everything synchronous, it all worked beautifully. When I try to run things async, though, everything breaks. Here's what I have at the moment:
The fixture:
@pytest.fixture(scope="session")
def event_loop():
    return asyncio.get_event_loop()


@pytest.fixture(scope="session")
async def session():
    sync_test_db = "postgresql://postgres:postgres@postgres:5432/test"
    if not database_exists(sync_test_db):
        create_database(sync_test_db)

    async_test_db = "postgresql+asyncpg://postgres:postgres@postgres:5432/test"
    engine = create_async_engine(url=async_test_db, echo=True, future=True)

    async with engine.begin() as conn:
        await conn.run_sync(SQLModel.metadata.create_all)

    Session = sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)

    async with Session() as session:
        def get_session_override():
            return session

        app.dependency_overrides[get_session] = get_session_override
        yield session

    drop_database(sync_test_db)
The test:
class TestSomething:
    @pytest.mark.asyncio
    async def test_create_something(self, session):
        data = {"some": "data"}

        response = client.post("/", json=data)
        assert response.ok

        results = await session.execute(select(Something))  # <- This line fails
        assert len(results.all()) == 1
The error:
E sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: Task <Task pending name='anyio.from_thread.BlockingPortal._call_func' coro=<BlockingPortal._call_func() running at /usr/local/lib/python3.9/site-packages/anyio/from_thread.py:187> cb=[TaskGroup._spawn.<locals>.task_done() at /usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py:629]> got Future <Future pending cb=[Protocol._on_waiter_completed()]> attached to a different loop (Background on this error at: https://sqlalche.me/e/14/7s2a)
/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py:601: PendingRollbackError
Any ideas what I might be doing wrong?
Check whether other statements in your test cases that involve the database fail before this error is raised.
For me, the PendingRollbackError was caused by an insertion error raised in a prior test.
All my tests were (async) unit tests that inserted rows into a Postgres database.
After each test, the database session was supposed to roll back its entries.
The insertion error came from inserts that violated a unique constraint; all subsequent tests then raised the PendingRollbackError.
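A minimal sketch of guarding against that, assuming a shared async session fixture like the one in the question (the wrapper fixture name here is hypothetical): roll the session back after every test, so a failed flush in one test cannot leave the session in a pending-rollback state for the next.
import pytest


@pytest.fixture
async def db_session(session):  # wraps the session fixture from the question
    yield session
    # If a previous statement failed (e.g. a unique-constraint violation),
    # this returns the session to a usable state for the next test.
    await session.rollback()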
My application will be sending hundreds, if not thousands, of messages over redis every second, so we are going to open the connection when the app launches and use that connection for every transaction. I am having trouble keeping that connection open.
Here is some of my redis handler class:
class RedisSingleDatabase(AsyncGetSetDatabase):
async def redis(self):
if not self._redis:
self._redis = await aioredis.create_redis(('redis', REDIS_PORT))
return self._redis
def __init__(self):
self._redis = None
async def get(self, key):
r = await self.redis()
# r = await aioredis.create_redis(('redis', REDIS_PORT))
print(f'CONNECTION IS {"CLOSED" if r.closed else "OPEN"}!')
data = await r.get(key, encoding='utf-8')
if data is not None:
return json.loads(data)
This does not work. By the time we call r.get, the connection is closed (ConnectionClosedError).
If I uncomment the second line of the get method, and connect to the database locally in the method, it works. But then we're no longer using the same connection.
I've considered trying all this with dependency injection as well. This is a Flask app, and in my create_app method I would connect to the database and construct my RedisSingleDatabase with that connection. Aside from the question of whether my current problem would remain, I can't get this approach to work, since we need to await the connection, and I can't do that in my root-level create_app, which can't be async (unless I'm missing something).
I've tried asyncio.run all over the place, and it either doesn't fix the problem or raises its own error that it can't be called from a running event loop.
Please help!!!
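As a hedged sketch of the dependency-injection idea described above, assuming the app is created before any event loop is running, the connection could be established once on a dedicated loop inside create_app. Flask and the RedisSingleDatabase class come from the question; everything else here is an assumption, not a verified fix:
import asyncio
from flask import Flask


def create_app():
    app = Flask(__name__)

    # Create one loop up front and open the single shared connection on it.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)

    redis_db = RedisSingleDatabase()
    loop.run_until_complete(redis_db.redis())  # opens self._redis once

    app.extensions["redis_db"] = redis_db  # hypothetical place to stash it
    return app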
I think the example below is a really common use case:
create a connection to a database once,
pass this connection around to tests which insert data,
pass the connection to a test which verifies the data.
Changing the scope to @pytest.fixture(scope="module") causes ScopeMismatch: You tried to access the 'function' scoped fixture 'event_loop' with a 'module' scoped request object, involved factories.
Also, the test_insert and test_find coroutines do not need the event_loop argument, because the loop is already accessible through the connection.
Any ideas how to fix those two issues?
import pytest


@pytest.fixture(scope="function")  # <-- want this to be scope="module"; run once!
@pytest.mark.asyncio
async def connection(event_loop):
    """Expensive function; want to do in the module scope. Only this function needs `event_loop`!"""
    conn = await make_connection(event_loop)
    return conn


@pytest.mark.dependency()
@pytest.mark.asyncio
async def test_insert(connection, event_loop):  # <-- does not need event_loop arg
    """Test insert into database.
    NB does not need event_loop argument; just the connection.
    """
    _id = 0
    success = await connection.insert(_id, "data")
    assert success == True


@pytest.mark.dependency(depends=['test_insert'])
@pytest.mark.asyncio
async def test_find(connection, event_loop):  # <-- does not need event_loop arg
    """Test database find.
    NB does not need event_loop argument; just the connection.
    """
    _id = 0
    data = await connection.find(_id)
    assert data == "data"
The solution is to redefine the event_loop fixture with the module scope. Include that in the test file.
@pytest.fixture(scope="module")
def event_loop():
    loop = asyncio.get_event_loop()
    yield loop
    loop.close()
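With that in place, the expensive fixture from the question can itself be promoted to module scope, roughly like this (a sketch reusing the question's hypothetical make_connection; the exact fixture decoration depends on the pytest-asyncio mode in use):
import pytest


@pytest.fixture(scope="module")
async def connection(event_loop):
    # Runs once per module now that event_loop is also module-scoped.
    conn = await make_connection(event_loop)
    return conn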
A similar ScopeMismatch issue was raised on GitHub for pytest-asyncio (link). The solution below works for me:
@pytest.yield_fixture(scope='class')
def event_loop(request):
    loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()
I'm using RethinkDB with Tornado with an async approach. Based on my data model in RethinkDB, upon inserting a record into my topic table, I also have to update/insert a new record into the user_to_topic table. Here's the basic setup of my post request handler.
class TopicHandler(tornado.web.RequestHandler):
    def get(self):
        pass

    @gen.coroutine
    def post(self, *args, **kwargs):
        # establish the database connection
        connection = rth.connect(host='localhost', port=00000, db=DATABASE_NAME)
        # Thread the connection
        threaded_conn = yield connection

        title = self.get_body_argument('topic_title')

        # insert into table
        new_topic_record = rth.table('Topic').insert({
            'title': title,
        }, conflict="error",
            durability="hard").run(threaded_conn)

        # {TODO} for now assume the result is always written successfully
        io_loop = ioloop.IOLoop.instance()
        # I want to return my generated_keys here
        io_loop.add_future(new_topic_record, self.return_written_record_id)

    # do stuff with the generated_keys here, how do I get those keys?
    def return_written_record_id(self, f):
        _result = f.result()
        return _result['generated_keys']
After the insert operation finishes, the Future holds the return value of the RethinkDB insert, which I can access through its Future.result() method, and from which I can retrieve the id of the new record via the generated_keys attribute. According to the Tornado documentation, I can work with the result in my callback function, return_written_record_id. Of course I could do all the other database operations inside return_written_record_id, but is it possible to return all the ids to my post function? Or is this just the way it has to be when using coroutines in Tornado?
Any suggestions will be appreciated. Thank you!
Simply do:
result = yield new_topic_record
Whenever you yield a Future inside a coroutine, Tornado pauses the coroutine until the Future is resolved to a value or exception. Then Tornado resumes the coroutine by passing the value in or raising an exception at the "yield" expression.
For more info, see Refactoring Tornado Coroutines.
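Applied to the handler above, that looks roughly like the following sketch, using the question's own names and assuming the RethinkDB driver is already configured for Tornado (as in the question, where the connection is yielded); error handling is omitted:
@gen.coroutine
def post(self, *args, **kwargs):
    threaded_conn = yield rth.connect(host='localhost', port=00000, db=DATABASE_NAME)
    title = self.get_body_argument('topic_title')

    # Yielding pauses the coroutine until RethinkDB finishes the insert,
    # then hands the insert result back right here.
    result = yield rth.table('Topic').insert(
        {'title': title}, conflict="error", durability="hard"
    ).run(threaded_conn)

    generated_keys = result['generated_keys']
    # continue with the user_to_topic insert/update here, using generated_keys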