What is the difference (in MySQL) between transaction-rollback and not committing? - python

I have a question regarding MySQL and transactions. I work with MySQL 5.7.18, python 3 and the Oracle mysql connector v2.1.4
I do not understand the difference between
a) having a transaction and, in case of error, rolling back and
b) not having a transaction and, in case of error, simply not committing the changes.
Both seem to leave me with exactly the same results (i.e. no entries in the table, see the code example below). Does this have to do with using InnoDB? Would the results differ otherwise?
What is the advantage of using a transaction if
1) I cannot roll back committed changes and
2) I could just as well not commit changes (until I am done with my task or sure that no query raised an exception)?
I have tried to find the answers to those questions in https://downloads.mysql.com/docs/connector-python-en.a4.pdf but failed to find the essential difference.
Somebody asked an almost identical question ("Mysql transaction : commit and rollback") and received some replies, but I don't think those actually contain an answer. The replies focused on having multiple connections open and on the visibility of changes. Is that all there is to it?
import mysql.connector

# Connect to MySQL-Server
conn = mysql.connector.connect(user='test', password='blub',
                               host='127.0.0.1', db='my_test')
cursor = conn.cursor(buffered=True)
# This is anyway the default in mysql.connector
# (autocommit is a property of the connection, not of the cursor)
# conn.autocommit = False
sql = """CREATE TABLE IF NOT EXISTS `my_test`.`employees` (
`emp_no` int(11) NOT NULL AUTO_INCREMENT,
`first_name` varchar(14) NOT NULL,
PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8"""
try:
    cursor.execute(sql)
    conn.commit()
except mysql.connector.Error as err:
    print("error", err)
# Arguments on default values
# conn.start_transaction(consistent_snapshot=False,
#                        isolation_level=None, readonly=False)
sql = """INSERT INTO `my_test`.`employees`
(`first_name`)
VALUES
(%s);"""
employees = {}
employees["1"] = ["Peter"]
employees["2"] = ["Bruce"]
for employee, value in employees.items():
    cursor.execute(sql, (value[0],))
    print(conn.in_transaction)
# If I do not commit the changes, table is left empty (whether I write
# start_transaction or not)
# If I rollback the changes (without commit first), table is left empty
# (whether I write start_transaction or not)
# If I commit and then rollback, the rollback had no effect (i.e. there are
# values in the table (whether I write start_transaction or not)
conn.commit()
conn.rollback()
Thank you very much for your help in advance! I appreciate it.

I think that neither committing nor rolling back leaves the transaction in a running state, in which it may still hold resources such as locks.

It doesn't matter which database you are using: a transaction locks the resources it modifies until it is committed or rolled back (with InnoDB these are usually row locks rather than a lock on the whole table). For example, if I write a transaction that inserts something into a table test, the locks it takes are held until the transaction completes. This may lead to blocking or even deadlocks, since other sessions may need the same data. You can try it yourself: open two MySQL sessions, run a transaction without committing in the first, and in the second try to modify the same rows. That should clear up your doubt.

Transactions prevent other queries from modifying the data while your query is running. Furthermore, a transaction scope can contain multiple queries, so you can roll back ALL of them in the event of an error. That is not the case without a transaction: if some of the queries run successfully and only one results in an error, you may end up with partially committed results, like JLH said.

Your decision to have a transaction should take into account the numerous reasons for having one, including having multiple statements, each of which writes to the database.
In your example I don't think it makes a difference, but in more complicated scenarios you need a transaction to ensure ACID.
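To make the multi-statement point concrete, here is a minimal sketch. It assumes the standard-library sqlite3 module as a stand-in for mysql.connector (so it runs without a server), and insert_all is a hypothetical helper name; commit() and rollback() play the same role on a mysql.connector connection.

```python
import sqlite3


def insert_all(conn, names):
    """Insert every name or none: roll back the whole batch on any error."""
    try:
        for name in names:
            conn.execute(
                "INSERT INTO employees (first_name) VALUES (?)", (name,)
            )
        conn.commit()
        return True
    except sqlite3.Error:
        # Discard the rows that were inserted before the failing statement.
        conn.rollback()
        return False


# In-memory database used only for illustration (assumption, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "emp_no INTEGER PRIMARY KEY AUTOINCREMENT, "
    "first_name TEXT NOT NULL)"
)
conn.commit()

# The second value violates NOT NULL, so the first insert is undone as well.
ok = insert_all(conn, ["Peter", None, "Bruce"])
count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
```

If the loop committed after every INSERT instead, "Peter" would stay in the table after the failure, which is the partially committed result described above.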

Related

Can't commit with MySQL/InnoDB after delete?

I'm using sqlservice with sqlalchemy to connect with a MySQL database and I can do everything I need except delete and commit. I've turned SQL_ECHO on and I see the DELETE but the COMMIT is never made (even doing it explicitly).
Example:
db.Table.filter_by(item_id=item_id).delete()
db.commit()
The closest related question I could find was here: SQLAlchemy delete() function flushes, but does not commit, even after calling commit()
I have verified the delete is on the correct id, the query is returning the expected results, and delete is returning the correct number of rows. I have even tried explicitly flushing before/after commit.
What am I doing wrong?
As per my comment: the Quickstart section at https://pypi.org/project/sqlservice/ shows how to destroy a model record:
db.User.destroy(user)
# OR db.User.destroy([user])
# OR db.User.destroy(user.id)
# OR db.User.destroy([user.id])
# OR db.User.destroy(dict(user))
# OR db.User.destroy([dict(user)])

How can Python see changes made to MariaDB by another client?

On my Windows machine, I have a very simple database on MariaDB (10.3.7) to which I connect with the mysql-connector-python-rf (2.2.2).
I also connect to the database with 2 instances of HeidiSQL workbench.
When I add or delete a row in a table using one of the workbenches, I can immediately access the changed data with a SELECT statement in the other workbench. My conclusion: the first workbench has already committed the change to the database.
However, seeing the change in Python seems more complicated. I have to add a commit() before the query to see the changes:
import mysql.connector

config = {'user': 'some_user',
          'password': 'some_password',
          'host': '127.0.0.1',
          'database': 'some_database',
          'raise_on_warnings': True,
          }
db = mysql.connector.connect(**config)
# wait a while to make changes to the database using the HeidiSQL workbenches
db.commit()  # even though Python has not changed anything which needs to be
             # committed, this seems necessary to re-read the db to catch
             # the changes that were committed by the other clients
cursor = db.cursor()
cursor.execute('some_SQL_query')
for result in cursor:
    do_something_with(result)
cursor.close()
So far I thought commit() is used to commit changes that Python wants to make to the database.
Is it correct to say that commit() also reads changes into Python that were done by other clients since the last connect()? Is this a bug/inconvenience or a feature?
Or is something else going on here that I am missing?
The writing thread issues COMMIT after writing. Doing the COMMIT in the reading thread has no effect.
I would not change the "isolation level" unless you need the reader to see unfinished changes while they are happening. This is not normally required.
So, the writer should issue COMMIT as soon as it has finished some unit of work. That might be a single INSERT; it might be a long, complicated combination of operations. A simple example is the classic 'transfer of funds':
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE id = 123; -- my account
UPDATE accounts SET balance = balance - 100 WHERE id = 432; -- your account
COMMIT;
For the integrity of accounts you want either both UPDATEs to happen or neither, even if the system crashes in the middle. And you don't want any other thread to see an inconsistency in balance if it reads the data in the middle.
Another way to phrase it: The writer is responsible for saying "I'm done" (by calling commit).
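The same transfer can be sketched from Python. This is only an illustration under stated assumptions: it uses the standard-library sqlite3 module instead of a MySQL driver so that it runs as-is, the function name transfer is hypothetical, and the CHECK constraint is added here just to provoke a failure in the middle of the unit of work.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both UPDATEs commit together or not at all."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )
        conn.commit()  # the writer says "I'm done"
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # undo the credit that already happened
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(123, 50), (432, 150)])
conn.commit()

first = transfer(conn, 432, 123, 100)   # 432 has enough funds
second = transfer(conn, 432, 123, 100)  # 432 would go negative: rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

In the second call the credit to account 123 has already been applied when the debit fails, so without the rollback the books would no longer balance.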
As @brunodesthuilliers has pointed out, the answer seems to be in the isolation levels. The default isolation level, which comes from InnoDB rather than from Python, is REPEATABLE READ. To always read the latest commits it is necessary to change the transactions' isolation level, e.g. to READ COMMITTED.
import mysql.connector

config = {'user': 'some_user',
          'password': 'some_password',
          'host': '127.0.0.1',
          'database': 'some_database',
          'raise_on_warnings': True,
          }
db = mysql.connector.connect(**config)
cursor = db.cursor()
cursor.execute('SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;')
cursor.close()
# wait a while to make changes to the database using the HeidiSQL workbenches
cursor = db.cursor()
cursor.execute('some_SQL_query')  # will now read the last committed data
for result in cursor:
    do_something_with(result)
cursor.close()

Python SQLite - How to manually BEGIN and END transactions?

Context
So I am trying to figure out how to properly override the auto-transaction when using SQLite in Python. When I try and run
cursor.execute("BEGIN;")
# ... an assortment of insert statements ...
cursor.execute("END;")
I get the following error:
OperationalError: cannot commit - no transaction is active
Which I understand is because SQLite in Python automatically opens a transaction on each modifying statement, which in this case is an INSERT.
Question:
I am trying to speed my insertion by doing one transaction per several thousand records.
How can I overcome the automatic opening of transactions?
As @CL. said, you have to set the isolation level to None. Code example:
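import sqlite3

s = sqlite3.connect("./data.db")
s.isolation_level = None  # sqlite3 no longer issues BEGIN on its own
c = s.cursor()            # created outside try so it exists in the handler
try:
    c.execute("begin")
    ...
    c.execute("commit")
except Exception:
    c.execute("rollback")
    raise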
The documentation says:
You can control which kind of BEGIN statements sqlite3 implicitly executes (or none at all) via the isolation_level parameter to the connect() call, or via the isolation_level property of connections.
If you want autocommit mode, then set isolation_level to None.
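For the "one transaction per several thousand records" goal, a minimal sketch could look like the following. The table items, the helper insert_in_batches and the batch size are assumptions made for illustration; only sqlite3 calls from the standard library are used.

```python
import sqlite3


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows using one explicit transaction per batch_size rows."""
    cur = conn.cursor()
    for start in range(0, len(rows), batch_size):
        cur.execute("BEGIN")
        try:
            cur.executemany(
                "INSERT INTO items (value) VALUES (?)",
                rows[start:start + batch_size],
            )
            cur.execute("COMMIT")
        except sqlite3.Error:
            # Drop the partially inserted batch and let the caller decide.
            cur.execute("ROLLBACK")
            raise


conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # no implicit BEGIN: transactions are ours to manage
conn.execute("CREATE TABLE items (value INTEGER)")

rows = [(i,) for i in range(2500)]
insert_in_batches(conn, rows)  # three transactions: 1000 + 1000 + 500 rows
total = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

Because isolation_level is None, the explicit BEGIN does not collide with a transaction opened behind the scenes, which is what caused the original OperationalError.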

How do I efficiently do a bulk insert-or-update with SQLAlchemy?

I'm using SQLAlchemy with a Postgres backend to do a bulk insert-or-update. To try to improve performance, I'm attempting to commit only once every thousand rows or so:
trans = engine.begin()
for i, rec in enumerate(records):
    if i % 1000 == 0:
        trans.commit()
        trans = engine.begin()
    try:
        inserter.execute(...)
    except sa.exceptions.SQLError:
        my_table.update(...).execute()
trans.commit()
However, this isn't working. It seems that when the INSERT fails, it leaves things in a weird state that prevents the UPDATE from happening. Is it automatically rolling back the transaction? If so, can this be stopped? I don't want my entire transaction rolled back in the event of a problem, which is why I'm trying to catch the exception in the first place.
The error message I'm getting, BTW, is "sqlalchemy.exc.InternalError: (InternalError) current transaction is aborted, commands ignored until end of transaction block", and it happens on the update().execute() call.
You're hitting some weird PostgreSQL-specific behavior: if an error happens in a transaction, it forces the whole transaction to be rolled back. I consider this a Postgres design bug; it takes quite a bit of SQL contortionism to work around in some cases.
One workaround is to do the UPDATE first. Detect if it actually modified a row by looking at cursor.rowcount; if it didn't modify any rows, it didn't exist, so do the INSERT. (This will be faster if you update more frequently than you insert, of course.)
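A rough sketch of the UPDATE-first approach, written against the plain DB-API rather than SQLAlchemy. The table my_table(id, value) and the helper name update_or_insert are hypothetical, and sqlite3 stands in for the Postgres driver so the example runs on its own.

```python
import sqlite3


def update_or_insert(conn, rec_id, value):
    """Try the UPDATE first; INSERT only if no row was modified."""
    cur = conn.cursor()
    cur.execute("UPDATE my_table SET value = ? WHERE id = ?", (value, rec_id))
    if cur.rowcount == 0:
        # Nothing matched, so the row does not exist yet.
        cur.execute(
            "INSERT INTO my_table (id, value) VALUES (?, ?)", (rec_id, value)
        )
        return "inserted"
    return "updated"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, value TEXT)")

outcomes = [
    update_or_insert(conn, 1, "a"),  # no row yet
    update_or_insert(conn, 1, "b"),  # row exists now
]
conn.commit()
stored = conn.execute("SELECT value FROM my_table WHERE id = 1").fetchone()[0]
```

Note that this is not safe against another session inserting the same key between the UPDATE and the INSERT, so some error handling is still needed under concurrency.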
Another workaround is to use savepoints:
SAVEPOINT a;
INSERT INTO ....;
-- on error:
ROLLBACK TO SAVEPOINT a;
UPDATE ...;
-- on success:
RELEASE SAVEPOINT a;
This has a serious problem for production-quality code: you have to detect the error accurately. Presumably you're expecting to hit a unique constraint check, but you may hit something unexpected, and it may be next to impossible to reliably distinguish the expected error from the unexpected one. If this hits the error condition incorrectly, it'll lead to obscure problems where nothing will be updated or inserted and no error will be seen. Be very careful with this. You can narrow down the error case by looking at Postgresql's error code to make sure it's the error type you're expecting, but the potential problem is still there.
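One way to guard against the silent-failure case is to re-raise when the fallback UPDATE touches no row, since then the error cannot have been a duplicate key. The sketch below assumes sqlite3 as a stand-in (it supports the same SAVEPOINT statements), a hypothetical table my_table(id, value) and a hypothetical helper insert_or_update.

```python
import sqlite3


def insert_or_update(conn, rec_id, value):
    """INSERT inside a savepoint; fall back to UPDATE on a constraint error."""
    cur = conn.cursor()
    cur.execute("SAVEPOINT a")
    try:
        cur.execute(
            "INSERT INTO my_table (id, value) VALUES (?, ?)", (rec_id, value)
        )
        outcome = "inserted"
    except sqlite3.IntegrityError:
        cur.execute("ROLLBACK TO SAVEPOINT a")
        cur.execute(
            "UPDATE my_table SET value = ? WHERE id = ?", (value, rec_id)
        )
        if cur.rowcount == 0:
            # No existing row: the INSERT failed for some other reason.
            raise
        outcome = "updated"
    cur.execute("RELEASE SAVEPOINT a")
    return outcome


# isolation_level=None: the outer transaction is controlled explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, value TEXT NOT NULL)")

conn.execute("BEGIN")
outcomes = [insert_or_update(conn, 1, "a"), insert_or_update(conn, 1, "b")]
conn.execute("COMMIT")
stored = conn.execute("SELECT value FROM my_table WHERE id = 1").fetchone()[0]
```

With PostgreSQL one could additionally compare the driver's error code against SQLSTATE 23505 (unique_violation), for example through the pgcode attribute that psycopg2 exceptions carry, before deciding to fall back to the UPDATE.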
Finally, if you really want to do batch-insert-or-update, you actually want to do many of them in a few commands, not one item per command. This requires trickier SQL: SELECT nested inside an INSERT, filtering out the right items to insert and update.
This error is from PostgreSQL. PostgreSQL doesn't allow you to execute further commands in the same transaction once one command has raised an error. To fix this you can use nested transactions (implemented using SQL savepoints) via conn.begin_nested(). Here's something that might work. I made the code use explicit connections, factored out the chunking part, and used the context manager to manage transactions correctly.
from itertools import chain, islice

def chunked(seq, chunksize):
    """Yield successive chunks of up to chunksize items from an iterable."""
    it = iter(seq)
    for first in it:
        # Each chunk must be fully consumed before the next one is requested.
        yield chain([first], islice(it, chunksize - 1))

conn = engine.connect()
for chunk in chunked(records, 1000):
    with conn.begin():
        for rec in chunk:
            try:
                with conn.begin_nested():
                    conn.execute(inserter, ...)
            except sa.exceptions.SQLError:
                conn.execute(my_table.update(...))
This still won't have stellar performance though due to nested transaction overhead. If you want better performance try to detect which rows will create errors beforehand with a select query and use executemany support (execute can take a list of dicts if all inserts use the same columns). If you need to handle concurrent updates, you'll still need to do error handling either via retrying or falling back to one by one inserts.
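The "select first, then executemany" idea might look roughly like this. It is a sketch under assumptions: plain DB-API with sqlite3 as a stand-in for Postgres, a hypothetical table my_table(id, value), a hypothetical helper bulk_upsert, and record ids that are unique within one batch.

```python
import sqlite3


def bulk_upsert(conn, records):
    """records is a list of (id, value); returns (n_inserted, n_updated)."""
    if not records:
        return 0, 0
    ids = [rec_id for rec_id, _ in records]
    placeholders = ",".join("?" * len(ids))
    # One SELECT tells us which ids already exist.
    existing = {
        row[0]
        for row in conn.execute(
            f"SELECT id FROM my_table WHERE id IN ({placeholders})", ids
        )
    }
    to_insert = [(i, v) for i, v in records if i not in existing]
    to_update = [(v, i) for i, v in records if i in existing]
    # Two batched statements instead of one round trip per record.
    conn.executemany("INSERT INTO my_table (id, value) VALUES (?, ?)", to_insert)
    conn.executemany("UPDATE my_table SET value = ? WHERE id = ?", to_update)
    conn.commit()
    return len(to_insert), len(to_update)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO my_table VALUES (1, 'old')")
conn.commit()

counts = bulk_upsert(conn, [(1, "x"), (2, "y")])
rows = dict(conn.execute("SELECT id, value FROM my_table"))
```

Databases limit the number of bound parameters per statement, so the IN list should be built per chunk rather than for the whole data set, and a concurrent writer can still cause an INSERT to fail between the SELECT and the executemany.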

Python's MySQLdb not getting updated row

I have a script that waits until some row in a db is updated:
con = MySQLdb.connect(server, user, pwd, db)
When the script starts the row's value is "running", and it waits for the value to become "finished"
while(True):
    sql = '''select value from table where some_condition'''
    cur = self.getCursor()
    cur.execute(sql)
    r = cur.fetchone()
    cur.close()
    res = r['value']
    if res == 'finished':
        break
    print(res)
    time.sleep(5)
When I run this script it hangs forever. Even though I see the value of the row has changed to "finished" when I query the table, the printout of the script is still "running".
Is there some setting I didn't set?
EDIT: The python script only queries the table. The update to the table is carried out by a tomcat webapp, using JDBC, that is set on autocommit.
This is an InnoDB table, right? InnoDB is a transactional storage engine. Setting autocommit to true will probably fix this behavior for you.
conn.autocommit(True)
Alternatively, you could change the transaction isolation level. You can read more about this here:
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html
The reason for this behavior is that inside a single transaction the reads need to be consistent. All consistent reads within the same transaction read the snapshot established by the first read. Even if your script only reads the table, this is considered a transaction too. This is the default behavior in InnoDB, and you need to change that or run conn.commit() after each read.
This page explains this in more detail: http://dev.mysql.com/doc/refman/5.0/en/innodb-consistent-read.html
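The "commit after each read" variant could be sketched as a small polling helper. The name poll_until, its parameters and the table jobs are hypothetical; the function only uses generic DB-API calls, and the demo runs against sqlite3 purely so that it needs no MySQL server.

```python
import sqlite3
import time


def poll_until(conn, sql, expected, interval=5.0, max_tries=None):
    """Re-run sql until its first column equals expected; False if tries run out."""
    tries = 0
    while max_tries is None or tries < max_tries:
        cur = conn.cursor()
        cur.execute(sql)
        row = cur.fetchone()
        cur.close()
        # End the read transaction so the next SELECT starts a new snapshot.
        conn.commit()
        if row is not None and row[0] == expected:
            return True
        tries += 1
        time.sleep(interval)
    return False


# Stand-in database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (value TEXT)")
conn.execute("INSERT INTO jobs VALUES ('running')")
conn.commit()

query = "SELECT value FROM jobs"
before = poll_until(conn, query, "finished", interval=0, max_tries=2)
conn.execute("UPDATE jobs SET value = 'finished'")
conn.commit()
after = poll_until(conn, query, "finished", interval=0, max_tries=2)
```

With MySQLdb the same loop would be passed the connection from MySQLdb.connect(); enabling autocommit or switching the session to READ COMMITTED makes the explicit commit() unnecessary.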
I worked around this by running
c.execute("""set session transaction isolation level READ COMMITTED""")
early on in my reading session. Updates from other threads do come through now.
In my instance I was keeping connections open for a long time (inside mod_python) and so updates by other processes weren't being seen at all.
