"This transaction is closed" after rollback - python

I'm trying to do some schema changes inside a transaction manager provided by pyramid, and I'm running into various issues trying to commit after a rollback.
The simplified version is:
def get_version(conn):
    try:
        result = conn.execute('SELECT version FROM versions LIMIT 1')
        return result.scalar()
    except:
        conn.rollback()
        return 0

def m_version_table(conn):
    conn.execute('CREATE TABLE versions (version INT)')
    conn.execute('INSERT INTO versions VALUES (1)')

def handle(conn):
    ver = get_version(conn)
    m_version_table(conn)

# task started with pyramid's transaction manager
with env['request'].tm as tm:
    handle(env['request'].dbsession)
The transactions are started implicitly, which I can see in the logs:
BEGIN (implicit)
SELECT version FROM versions LIMIT 1
()
ROLLBACK
BEGIN (implicit)
CREATE TABLE versions (version INT)
()
INSERT INTO versions VALUES (1)
()
UPDATE versions SET version = %s
(1,)
ROLLBACK
If versions exists (and I run a different ALTER afterwards) everything works fine. But after the rollback, I just get:
Traceback (most recent call last):
File ".venv/bin/schema_refresh", line 11, in <module>
load_entry_point('project', 'console_scripts', 'schema_refresh')()
File ".../schema_refresh.py", line 270, in run
handle(env['request'].dbsession, tm)
File ".../transaction-2.4.0-py3.7.egg/transaction/_manager.py", line 140, in __exit__
self.commit()
File ".../transaction-2.4.0-py3.7.egg/transaction/_manager.py", line 131, in commit
return self.get().commit()
...
sqlalchemy.exc.ResourceClosedError: This transaction is closed
Why can't the next transaction be committed, even if a new transaction has been correctly started after the rollback? (ROLLBACK is followed by BEGIN (implicit))

tl;dr
It very much seems like it's not the new transaction that __exit__ tries to commit in your example.
Calling rollback on the DB session does create a new session transaction, but that's not joined with the transaction that the manager tracks within the context. Your subsequent calls to execute are done in the new session transaction, but commit is called on the original, first transaction, created when entering the context.
Assuming that you used cookiecutter to set up your project, your models.__init__.py will probably be the default from the repo.
That means that env['request'].tm returns a Zope TransactionManager and, when entering its context, the begin() method instantiates a Transaction object and stores it in the _txn attribute.
env['request'].dbsession returns an SQLAlchemy Session, after registering it with the transaction manager.
The TransactionManager's Transaction is now joined with the Session's SessionTransaction and should control its end and outcome.
Rolling back the SessionTransaction while handling the exception raised in the execute() call bypasses the transaction manager. When its commit() or rollback() method is later called by __exit__, it will still try to terminate the SessionTransaction you already rolled back.
Also, there's no mechanism that would join the new transaction with the manager.
You can either use the transaction manager or opt for manual transaction control. Just stick with your choice and don't mix both.

You're also using conn.execute, which is not tracked by the transaction manager (by default it only tracks changes made via the ORM). You can either 1) modify the code that calls zope.sqlalchemy.register(session) to pass initial_state='changed', so that the default outcome is COMMIT instead of ROLLBACK (the default avoids extra commits when nothing is known to have changed, for performance), or 2) mark the specific sessions that do this with zope.sqlalchemy.mark_changed(session).
Finally, get_version should be done by coordinating with the transaction manager so that the entire transaction doesn't go into a bad state (despite your rollback the manager is still marked aborted right now). To do this use tm.savepoint():
def get_version(conn, tm):
    sp = tm.savepoint()
    try:
        result = conn.execute('SELECT version FROM versions LIMIT 1')
        return result.scalar()
    except:
        sp.rollback()
        return 0
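The savepoint-level rollback this fix relies on can also be seen at the SQL layer. Here is a minimal sketch with stdlib sqlite3 (not the transaction package; the names and the probed table are illustrative): rolling back to a savepoint discards only the failed probe, leaving the outer transaction open and committable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None            # manage transactions explicitly
cur = conn.cursor()
cur.execute("CREATE TABLE versions (version INT)")

cur.execute("BEGIN")
cur.execute("INSERT INTO versions VALUES (1)")   # work done before the probe
cur.execute("SAVEPOINT probe")
try:
    cur.execute("SELECT version FROM no_such_table")
except sqlite3.OperationalError:
    # Roll back only to the savepoint: the outer transaction survives.
    cur.execute("ROLLBACK TO SAVEPOINT probe")
cur.execute("COMMIT")                  # still possible -- nothing was closed

print(cur.execute("SELECT COUNT(*) FROM versions").fetchone()[0])  # 1
```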

Related

SQLAlchemy use Transactions for E2E Testing

I am writing E2E tests for our software and would like to figure out why the rollback call does not roll the DB back to the state when the test started.
I use a decorator for my pytest test functions.
The issue I get is that data I write to the DB during the tests persists, even though I call rollback() in the final statement. This indicates that the transaction is not set up, or that SQLAlchemy is doing something else in the background.
I see SQLAlchemy has the SAVEPOINT feature but I am not sure if this is what I really need. I think my request is pretty simple yet the framework obfuscates it. Or simply that I am not too experienced with it...
Note - the functions tested can have multiple commit calls...
def get_postgres_db():
    v2_session = sessionmaker(
        autocommit=False,
        autoflush=False,
        bind=v2_engine
    )
    try:
        yield v2_session
    finally:
        v2_session.close()

def postgres_test_decorator(test_function):
    """
    Decorator to open db connection and roll back regardless of test outcome.
    Can be ported into pytest later.
    """
    def the_decorator(*args, **kwargs):
        try:
            postgres_session = list(get_postgres_db())[0]
            # IN MY SQL MIND I WOULD LIKE TO DO HERE
            # BEGIN TRANSACTION
            test_function(postgres_session)
        finally:
            # THIS SHOULD ROLLBACK TO ORIGINAL STATE
            # ROLLBACK
            postgres_session.rollback()
    return the_decorator

Preserve an aborted transaction when SQLAlchemy raises a ProgrammingError

I have a slightly unusual problem with transaction state and error handling in SQLAlchemy. The short version: is there any way of preserving a transaction when SQLAlchemy raises a ProgrammingError and aborts it?
Background
I'm working on an integration test suite for a legacy codebase. Right now, I'm designing a set of fixtures that will allow us to run all tests inside transactions, inspired by the SQLAlchemy documentation. The general paradigm involves opening a connection, starting a transaction, binding a session to that connection, and then mocking out most database access methods so that they make use of that transaction. (To get a sense of what this looks like, see the code provided in the docs link above, including the note at the end.) The goal is to allow ourselves to run methods from the codebase that perform a lot of database updates in the context of a test, with the assurance that any side effects that happen to alter the test database will get rolled back after the test has completed.
My problem is that the code often relies on handling DBAPI errors to accomplish control flow when running queries, and those errors automatically abort transactions (per the psycopg2 docs). This poses a problem, since I need to preserve the work that has been done in that transaction up to the point that the error is raised, and I need to continue using the transaction after the error handling is done.
Here's a representative method that uses error handling for control flow:
from api.database import engine
from sqlalchemy.exc import ProgrammingError

def entity_count():
    """
    Count the entities in a project.
    """
    get_count = '''
        SELECT COUNT(*) AS entity_count FROM entity_browser
    '''
    with engine.begin() as conn:
        try:
            count = conn.execute(get_count).first().entity_count
        except ProgrammingError:
            count = 0
    return count
In this example, the error handling provides a quick way of determining if the table entity_browser exists: if not, Postgres will throw an error that gets caught at the DBAPI level (psycopg2) and passed up to SQLAlchemy as a ProgrammingError.
In the tests, I mock out engine.begin() so that it always returns the connection with the ongoing transaction that was established in the test setup. Unfortunately, this means that when the code continues execution after SQLAlchemy has raised a ProgrammingError and psycopg2 has aborted the transaction, SQLAlchemy will raise an InternalError the next time a database query runs using the open connection, complaining that the transaction has been aborted.
Here's a sample test exhibiting this behavior:
import pytest
import sqlalchemy as sa

def test_entity_count(session):
    """
    Test the `entity_count` method.

    `session` is a fixture that sets up the transaction and mocks out
    database access, returning a Flask-SQLAlchemy `scoped_session` object
    that we can use for queries.
    """
    # Make a change to a table that we can observe later
    session.execute('''
        UPDATE users
        SET name = 'in a test transaction'
        WHERE id = 1
    ''')

    # Drop `entity_browser` in order to raise a `ProgrammingError` later
    session.execute('''DROP TABLE entity_browser''')

    # Run the `entity_count` method, making sure that it raises an error
    with pytest.raises(sa.exc.ProgrammingError):
        count = entity_count()
        assert count == 0

    # Make sure that the changes we made earlier in the test still exist
    altered_name = session.execute('''
        SELECT name
        FROM users
        WHERE id = 1
    ''')
    assert altered_name == 'in a test transaction'
Here's the type of output I get:
> altered_name = session.execute('''
SELECT name
FROM users
WHERE id = 1
''')
[... traceback history...]
def do_execute(self, cursor, statement, parameters, context=None):
> cursor.execute(statement, parameters)
E sqlalchemy.exc.InternalError: (psycopg2.InternalError) current transaction is
aborted, commands ignored until end of transaction block
Attempted solutions
My first instinct was to try to interrupt the error handling and force a rollback using SQLAlchemy's handle_error event listener. I added a listener into the test fixture that would roll back the raw connection (since SQLAlchemy Connection instances have no rollback API, as far as I understand it):
@sa.event.listens_for(connection, 'handle_error')
def raise_error(context):
    dbapi_conn = context.connection.connection
    dbapi_conn.rollback()
This successfully keeps the transaction open for further use, but ends up rolling back all of the previous changes made in the test. Sample output:
> assert altered_name == 'in a test transaction'
E AssertionError
Clearly, rolling back the raw connection is too aggressive of an approach. Thinking that I might be able to roll back to the last savepoint, I tried rolling back the scoped session, which has an event listener attached to it that automatically opens up a new nested transaction when a previous one ends. (See the note at the end of the SQLAlchemy doc on transactions in tests for a sample of what this looks like.)
Thanks to the mocks set up in the session fixture, I can import the scoped session directly into the event listener and roll it back:
@sa.event.listens_for(connection, 'handle_error')
def raise_error(context):
    from api.database import db
    db.session.rollback()
However, this approach also raises an InternalError on the next query. It seems it doesn't actually roll the transaction back to the satisfaction of the underlying cursor.
Summary question
Is there any way of preserving the transaction after a ProgrammingError gets raised? On a more abstract level, what is happening when psycopg2 "aborts" the transaction, and how can I work around it?
The root of the problem is that you're hiding the exception from the context manager. You catch the ProgrammingError too soon and so the with-statement never sees it. Your entity_count() should be:
def entity_count():
    """
    Count the entities in a project.
    """
    get_count = '''
        SELECT COUNT(*) AS entity_count FROM entity_browser
    '''
    try:
        with engine.begin() as conn:
            count = conn.execute(get_count).first().entity_count
    except ProgrammingError:
        count = 0
    return count
And then if you provide something like
from contextlib import contextmanager

@contextmanager
def fake_begin():
    """ Begin a nested transaction and yield the global connection.
    """
    with connection.begin_nested():
        yield connection
as the mocked engine.begin(), the connection stays usable. But @JL Peyret raises a good point about the logic of your test. Engine.begin() usually[1] provides a new connection, with an armed transaction, from the pool, so your session and entity_count() probably shouldn't even be using the same connection.
[1]: Depends on pool configuration.

What is the difference (in MySQL) between transaction-rollback and not commiting?

I have a question regarding MySQL and transactions. I work with MySQL 5.7.18, python 3 and the Oracle mysql connector v2.1.4
I do not understand the difference between
a) having a transaction and – in case of error – rolling back, and
b) not having a transaction and – in case of error – simply not committing the changes.
Both seem to leave me with exactly the same results (i.e. no entries in table, see code example below). Does this have to do with using InnoDB – would the results differ otherwise?
What is the advantage of using a transaction if
1) I cannot rollback commited changes and
2) I could just as well not commit changes (until I am done with my task or sure that some query didn’t raise any exceptions)?
I have tried to find the answers to those questions in https://downloads.mysql.com/docs/connector-python-en.a4.pdf but failed to find the essential difference.
Somebody asked an almost identical question and received some replies, but I don't think those actually contain an answer: Mysql transaction : commit and rollback. The replies focused on having multiple connections open and on visibility of changes. Is that all there is to it?
import mysql.connector

# Connect to MySQL server
conn = mysql.connector.connect(user='test', password='blub',
                               host='127.0.0.1', db='my_test')
cursor = conn.cursor(buffered=True)

# This is anyway the default in mysql.connector
# cursor.autocommit = False

sql = """CREATE TABLE IF NOT EXISTS `my_test`.`employees` (
    `emp_no` int(11) NOT NULL AUTO_INCREMENT,
    `first_name` varchar(14) NOT NULL,
    PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8"""

try:
    cursor.execute(sql)
    conn.commit()
except:
    print("error")

# Arguments on default values
# conn.start_transaction(consistent_snapshot=False,
#                        isolation_level=None, readonly=False)

sql = """INSERT INTO `my_test`.`employees`
    (`first_name`)
VALUES
    (%s);"""

employees = {}
employees["1"] = ["Peter"]
employees["2"] = ["Bruce"]

for employee, value in employees.items():
    cursor.execute(sql, (value[0],))

print(conn.in_transaction)

# If I do not commit the changes, the table is left empty (whether I write
# start_transaction or not).
# If I roll back the changes (without committing first), the table is left
# empty (whether I write start_transaction or not).
# If I commit and then roll back, the rollback has no effect (i.e. there are
# values in the table, whether I write start_transaction or not).
conn.commit()
conn.rollback()
Thank you very much for your help in advance! I appreciate it.
I think having neither committed nor rolled back leaves the transaction in a running state, in which it may still hold resources such as locks.
Well, it doesn't matter which DB you are using: when you start a transaction, it locks the resources it touches (e.g. a table) until the transaction is completed or rolled back. For example, if I write a transaction that inserts into a table `test`, that table stays locked until the transaction completes, which can lead to deadlocks since others may need it. You can try it yourself: open two instances of your MySQL client, run a transaction without committing in the first, then try to insert into the same table from the second. It will clear your doubt.
Transactions prevent other queries from modifying the data while your query is running. Furthermore, a transaction scope can contain multiple queries - so you can rollback ALL of them in the event of an error, whereas that is not the case if some of them run successfully and only one query results in error, in which case you may end up with partially committed results, like JLH said.
Your decision to have a transaction should take into account the numerous reasons for having one, including having multiple statements, each of which writes to the database.
In your example I don't think it makes a difference, but in more complicated scenarios you need a transaction to ensure ACID.
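The lock-holding point made in the answers above can be demonstrated in miniature with stdlib sqlite3 standing in for MySQL (file name and table are illustrative): a transaction that is neither committed nor rolled back keeps its write lock and blocks a second connection, while an explicit rollback releases it immediately.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
a = sqlite3.connect(path)
a.execute("CREATE TABLE t (x INT)")
a.commit()

# Connection A writes but neither commits nor rolls back:
a.execute("INSERT INTO t VALUES (1)")

# Connection B now cannot write -- A's open transaction still holds the lock.
b = sqlite3.connect(path, timeout=0.1)
try:
    b.execute("INSERT INTO t VALUES (2)")
    print("B wrote")
except sqlite3.OperationalError as e:
    print("B blocked:", e)  # typically "database is locked"

# Rolling back A ends its transaction and releases the lock at once:
a.rollback()
b.execute("INSERT INTO t VALUES (2)")
b.commit()
print("after rollback, B wrote fine")
```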

Python SQLite - How to manually BEGIN and END transactions?

Context
So I am trying to figure out how to properly override the auto-transaction when using SQLite in Python. When I try and run
cursor.execute("BEGIN;")
.....an assortment of insert statements...
cursor.execute("END;")
I get the following error:
OperationalError: cannot commit - no transaction is active
Which I understand is because SQLite in Python automatically opens a transaction on each modifying statement, which in this case is an INSERT.
Question:
I am trying to speed my insertion by doing one transaction per several thousand records.
How can I overcome the automatic opening of transactions?
As @CL. said, you have to set the isolation level to None. Code example:
import sqlite3

s = sqlite3.connect("./data.db")
s.isolation_level = None
try:
    c = s.cursor()
    c.execute("begin")
    ...
    c.execute("commit")
except:
    c.execute("rollback")
The documentation says:
You can control which kind of BEGIN statements sqlite3 implicitly executes (or none at all) via the isolation_level parameter to the connect() call, or via the isolation_level property of connections.
If you want autocommit mode, then set isolation_level to None.
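Putting the answer together, here is a runnable sketch of the batched-insert pattern the question describes (table name and row count are illustrative): with isolation_level set to None the module issues no implicit BEGIN, so the explicit one no longer conflicts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None   # autocommit mode: no implicit BEGIN
cur = conn.cursor()
cur.execute("CREATE TABLE items (n INT)")

cur.execute("BEGIN")          # one transaction for several thousand inserts
try:
    for n in range(10000):
        cur.execute("INSERT INTO items VALUES (?)", (n,))
    cur.execute("COMMIT")
except sqlite3.Error:
    cur.execute("ROLLBACK")
    raise

print(cur.execute("SELECT COUNT(*) FROM items").fetchone()[0])  # 10000
```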

"BusyError: cannot rollback savepoint - SQL statements in progress" when using SQLite with APSW

I'm working with an SQLite database using the Python apsw bindings. The code goes like this:
with apsw.Connection(path) as t:
    c = t.cursor()
    c.execute(...)
    ... more code ...
    if c.execute(...).next()[0]:
        raise Exception
I expect the with statement to put a savepoint and the raise statement to rollback to that savepoint (or, if there's nothing to raise, commit the transaction). It commits just fine, but when there's something to raise it refuses to rollback with:
BusyError: BusyError: cannot rollback savepoint - SQL statements in progress
I'm not sure where to look first. As far as I understand the error means there's another connection that blocks the access, but this doesn't look so from the code, and, if this was so, wouldn't it fail on commits as well?
SQLite 3.7.7.1, matching apsw, Python 2.7.
Well, I found it:
if c.execute(...).next()[0]:
    raise Exception
The problem is that at the moment I get the next row with next(), the underlying cursor stays active, ready to return more rows. It has to be closed explicitly:
if c.execute(...).next()[0]:
    c.close()
    raise Exception
or implicitly, by reading out all the data:
if list(c.execute(...))[0][0]:
    raise Exception
UPDATE. For convenience I wrote a Python class that wraps apsw.Cursor and provides a context manager, so I can write:
with Cursor(connection) as c:
    c.execute(...)
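The wrapper itself isn't shown in the update; a sketch of what such a context manager might look like, written here around stdlib sqlite3 cursors rather than apsw, with CursorCM as a hypothetical name:

```python
import sqlite3

class CursorCM:
    """Context manager that hands out a cursor and always closes it,
    so no half-read statement is left in progress on exit."""
    def __init__(self, connection):
        self._connection = connection

    def __enter__(self):
        self._cursor = self._connection.cursor()
        return self._cursor

    def __exit__(self, exc_type, exc, tb):
        self._cursor.close()   # close even if the body raised
        return False           # propagate any exception

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INT)")
conn.execute("INSERT INTO t VALUES (1)")

with CursorCM(conn) as c:
    print(c.execute("SELECT x FROM t").fetchone()[0])  # 1
```

Because the cursor is closed whether the body returns or raises, the `raise Exception` path from the question can no longer leave a pending statement behind.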
This was an issue in SQLite itself. It was fixed in March 2012, in version 3.7.11. From the changelog:
Pending statements no longer block ROLLBACK. Instead, the pending statement will return SQLITE_ABORT upon next access after the ROLLBACK.
