I'm using sqlservice with SQLAlchemy to connect to a MySQL database, and I can do everything I need except delete and commit. I've turned SQL_ECHO on and I see the DELETE, but the COMMIT never happens (even when I issue it explicitly).
Example:
db.Table.filter_by(item_id=item_id).delete()
db.commit()
The closest related question I could find was here: SQLAlchemy delete() function flushes, but does not commit, even after calling commit()
I have verified the delete is on the correct id, the query is returning the expected results, and delete is returning the correct number of rows. I have even tried explicitly flushing before/after commit.
What am I doing wrong?
As per my comment, under the Quickstart section on https://pypi.org/project/sqlservice/:
Destroy the model record:
db.User.destroy(user)
# OR db.User.destroy([user])
# OR db.User.destroy(user.id)
# OR db.User.destroy([user.id])
# OR db.User.destroy(dict(user))
# OR db.User.destroy([dict(user)])
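Applied to the original question, a hedged sketch (this assumes item_id is the primary key of the Table model, which the question doesn't confirm):
db.Table.destroy(item_id)  # delete by primary key, per the quickstart patterns above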
I am struggling to unit-test a function that doesn't return anything and executes a delete operation. The function is as follows:
def removeReportParseData(self, report_id, conn=None):
    table_id = dbFind(
        "select table_id from table_table where report_id=%s", (report_id), conn
    )
    for t in table_id:
        self.removeTableParseData(int(t["table_id"]), conn)
        dbUpdate("delete from table_table where table_id=%s", t["table_id"], conn)
I want to make sure that the commands were executed but don't want to affect the actual db. My current code is:
def test_remove_report_parse_data(self):
    with patch("com.pdfgather.GlobalHelper.dbFind") as mocked_find:
        mocked_find.return_value = [123, 232, 431]
        mocked_find.assert_called_once()
Thank you in advance for any support.
I don't believe it is possible to execute SQL without executing it, so to speak. However, if you are on MySQL you might be able to get fairly close to what you want by wrapping your queries in START TRANSACTION; and ROLLBACK;
i.e. you might replace your queries with:
START TRANSACTION;
YOUR QUERY HERE;
ROLLBACK;
This will prove that your function works without actually changing the contents of the database.
However, if it is sufficient to simply test that these functions would execute ANY query, you could alternatively opt to test them with empty queries, and simply assert that your dbFind and dbUpdate methods were called as many times as you would expect.
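For example, a minimal sketch of that mock-based approach; the patch targets reuse the asker's "com.pdfgather.GlobalHelper" path, and ReportParser is a stand-in for whatever class actually defines removeReportParseData (patch wherever the code under test really looks these names up):
from unittest import TestCase
from unittest.mock import patch

from com.pdfgather.report_parser import ReportParser  # hypothetical module path

class TestRemoveReportParseData(TestCase):
    def test_remove_report_parse_data(self):
        with patch("com.pdfgather.GlobalHelper.dbFind") as mocked_find, \
             patch("com.pdfgather.GlobalHelper.dbUpdate") as mocked_update, \
             patch.object(ReportParser, "removeTableParseData") as mocked_tables:
            # dbFind's result is indexed as t["table_id"], so return dicts
            mocked_find.return_value = [{"table_id": 123}, {"table_id": 232}]

            ReportParser().removeReportParseData(42)

            mocked_find.assert_called_once()
            self.assertEqual(mocked_tables.call_count, 2)
            self.assertEqual(mocked_update.call_count, 2)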
Again, though, as I alluded to in my comment, I would strongly suggest NOT having your test suite interact with even your development database.
While there is certainly some configuration involved in setting up another database for your tests, you should be able to find some boilerplate code to do this quite easily as it is very common practice.
On my Windows machine, I have a very simple database on MariaDB (10.3.7) to which I connect with mysql-connector-python-rf (2.2.2).
I also connect to the database with two instances of the HeidiSQL workbench.
When I add or delete a line in a data table using one of the workbenches, I can immediately access the changed data with a SELECT statement in the other workbench. My conclusion: the first workbench has already committed the change to the database.
However, seeing the change in Python seems more complicated. I have to add a commit() before the query to see the changes:
import mysql.connector

config = {'user'    : 'some_user',
          'password': 'some_password',
          'host'    : '127.0.0.1',
          'database': 'some_database',
          'raise_on_warnings': True,
          }

db = mysql.connector.connect(**config)

# wait some to make changes to the database using the HeidiSQL workbenches

db.commit()  # even though Python has not changed anything which needs to be
             # committed, this seems necessary to re-read the db to catch
             # the changes that were committed by the other clients

cursor = db.cursor()
cursor.execute('some_SQL_query')
for result in cursor:
    do_something_with(result)
cursor.close()
So far I thought commit() is used to commit changes that Python wants to make to the database.
Is it correct to say that commit() also reads changes into Python that were done by other clients since the last connect()? Is this a bug/inconvenience or a feature?
Or is something else going on here that I am missing?
The writing thread issues COMMIT after writing; issuing the COMMIT in the reading thread has no effect.
I would not change the "isolation level" unless you need the reader to see unfinished changes while they are happening. This is not normally required.
So, the writer should issue COMMIT as soon as it has finished some unit of work. That might be a single INSERT; it might be a long, complicated combination of operations. A simple example is the classic 'transfer of funds' scenario:
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE id = 123; -- my account
UPDATE accounts SET balance = balance - 100 WHERE id = 432; -- your account
COMMIT;
For the integrity of the accounts you want both UPDATEs either to happen or not happen at all, even if the system crashes in the middle. And you don't want any other thread to see an inconsistent balance if it reads the data mid-transfer.
Another way to phrase it: The writer is responsible for saying "I'm done" (by calling commit).
As @brunodesthuilliers has pointed out, the answer seems to be in the isolation levels. The default for Python seems to be REPEATABLE READ. To always read the latest commits it is necessary to change the transaction isolation level, e.g. to READ COMMITTED.
import mysql.connector

config = {'user'    : 'some_user',
          'password': 'some_password',
          'host'    : '127.0.0.1',
          'database': 'some_database',
          'raise_on_warnings': True,
          }

db = mysql.connector.connect(**config)

cursor = db.cursor()
cursor.execute('SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;')
cursor.close()

# wait some to make changes to the database using the HeidiSQL workbenches

cursor = db.cursor()
cursor.execute('some_SQL_query')  # will now read the last committed data
for result in cursor:
    do_something_with(result)
cursor.close()
I have a question regarding MySQL and transactions. I work with MySQL 5.7.18, python 3 and the Oracle mysql connector v2.1.4
I do not understand the difference between
a) having a transaction and, in case of error, rolling back, and
b) not having a transaction and, in case of error, simply not committing the changes.
Both seem to leave me with exactly the same results (i.e. no entries in table, see code example below). Does this have to do with using InnoDB – would the results differ otherwise?
What is the advantage of using a transaction if
1) I cannot roll back committed changes and
2) I could just as well not commit changes (until I am done with my task or sure that some query didn’t raise any exceptions)?
I have tried to find the answers to those questions in https://downloads.mysql.com/docs/connector-python-en.a4.pdf but failed to find the essential difference.
Somebody asked an almost identical question and received some replies, but I don't think those actually contain an answer: Mysql transaction : commit and rollback. The replies focused on having multiple connections open and the visibility of changes. Is that all there is to it?
import mysql.connector

# Connect to MySQL-Server
conn = mysql.connector.connect(user='test', password='blub',
                               host='127.0.0.1', db='my_test')
cursor = conn.cursor(buffered=True)

# This is anyway the default in mysql.connector
# cursor.autocommit = False

sql = """CREATE TABLE IF NOT EXISTS `my_test`.`employees` (
             `emp_no` int(11) NOT NULL AUTO_INCREMENT,
             `first_name` varchar(14) NOT NULL,
             PRIMARY KEY (`emp_no`)
         ) ENGINE=InnoDB DEFAULT CHARSET=utf8"""

try:
    cursor.execute(sql)
    conn.commit()
except:
    print("error")

# Arguments on default values
# conn.start_transaction(consistent_snapshot=False,
#                        isolation_level=None, readonly=False)

sql = """INSERT INTO `my_test`.`employees`
             (`first_name`)
         VALUES
             (%s);"""

employees = {}
employees["1"] = ["Peter"]
employees["2"] = ["Bruce"]

for employee, value in employees.items():
    cursor.execute(sql, (value[0],))
    print(conn.in_transaction)

# If I do not commit the changes, the table is left empty (whether I write
# start_transaction or not).
# If I rollback the changes (without committing first), the table is left
# empty (whether I write start_transaction or not).
# If I commit and then rollback, the rollback has no effect (i.e. there are
# values in the table, whether I write start_transaction or not).
conn.commit()
conn.rollback()
Thank you very much for your help in advance! I appreciate it.
I think having neither committed nor rolled back leaves the transaction in a running state, in which it may still hold resources such as locks.
It doesn't matter which database you are using: when you run a transaction, it locks the resources it touches (e.g. the table or rows) until the transaction is committed or rolled back. For example, if I write a transaction that inserts something into a table test, that table stays locked until the transaction completes, which may lead to deadlocks since others may need it. You can try it yourself: open two instances of your MySQL client, run a transaction without committing in the first, and try to insert something into the same table in the second. That should clear your doubt.
Transactions prevent other queries from modifying the data while your query is running. Furthermore, a transaction scope can contain multiple queries, so you can roll back ALL of them in the event of an error. Without a transaction, if some of them run successfully and only one query results in an error, you may end up with partially committed results, like JLH said.
Your decision to have a transaction should take into account the numerous reasons for having one, including having multiple statements, each of which writes to the database.
In your example I don't think it makes a difference, but in more complicated scenarios you need a transaction to ensure ACID.
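To make that concrete, here is a minimal sketch with mysql.connector (the accounts table and the two ids are invented for illustration; conn and cursor are as in the question's code): the point of the explicit transaction is that a failure anywhere undoes both statements together.
try:
    conn.start_transaction()
    cursor.execute("UPDATE accounts SET balance = balance - 100 WHERE id = %s", (1,))
    cursor.execute("UPDATE accounts SET balance = balance + 100 WHERE id = %s", (2,))
    conn.commit()        # both updates become visible together
except mysql.connector.Error:
    conn.rollback()      # neither update is applied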
Context
So I am trying to figure out how to properly override the auto-transaction when using SQLite in Python. When I try to run
cursor.execute("BEGIN;")
.....an assortment of insert statements...
cursor.execute("END;")
I get the following error:
OperationalError: cannot commit - no transaction is active
I understand this is because SQLite in Python automatically opens a transaction on each modifying statement, which in this case is an INSERT.
Question:
I am trying to speed my insertion by doing one transaction per several thousand records.
How can I overcome the automatic opening of transactions?
As @CL. said you have to set isolation_level to None. Code example:
import sqlite3

s = sqlite3.connect("./data.db")
s.isolation_level = None

try:
    c = s.cursor()
    c.execute("begin")
    ...
    c.execute("commit")
except:
    c.execute("rollback")
The documentation says:
You can control which kind of BEGIN statements sqlite3 implicitly executes (or none at all) via the isolation_level parameter to the connect() call, or via the isolation_level property of connections.
If you want autocommit mode, then set isolation_level to None.
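Building on that for the original goal (one transaction per few thousand rows), a rough sketch; the items table and the generated rows are placeholders, not from the question:
import sqlite3

conn = sqlite3.connect("./data.db")
conn.isolation_level = None          # stop sqlite3 from opening implicit transactions
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")   # example table

rows = (("item %d" % i,) for i in range(10000))   # stand-in for your real data
BATCH = 5000

cur.execute("BEGIN")
for i, row in enumerate(rows, 1):
    cur.execute("INSERT INTO items (name) VALUES (?)", row)
    if i % BATCH == 0:               # commit every few thousand records
        cur.execute("COMMIT")
        cur.execute("BEGIN")
cur.execute("COMMIT")
conn.close()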
I'm using SQLAlchemy with a Postgres backend to do a bulk insert-or-update. To try to improve performance, I'm attempting to commit only once every thousand rows or so:
trans = engine.begin()
for i, rec in enumerate(records):
    if i % 1000 == 0:
        trans.commit()
        trans = engine.begin()
    try:
        inserter.execute(...)
    except sa.exceptions.SQLError:
        my_table.update(...).execute()
trans.commit()
However, this isn't working. It seems that when the INSERT fails, it leaves things in a weird state that prevents the UPDATE from happening. Is it automatically rolling back the transaction? If so, can this be stopped? I don't want my entire transaction rolled back in the event of a problem, which is why I'm trying to catch the exception in the first place.
The error message I'm getting, BTW, is "sqlalchemy.exc.InternalError: (InternalError) current transaction is aborted, commands ignored until end of transaction block", and it happens on the update().execute() call.
You're hitting some weird Postgresql-specific behavior: if an error happens in a transaction, it forces the whole transaction to be rolled back. I consider this a Postgres design bug; it takes quite a bit of SQL contortionism to work around in some cases.
One workaround is to do the UPDATE first. Detect if it actually modified a row by looking at cursor.rowcount; if it didn't modify any rows, it didn't exist, so do the INSERT. (This will be faster if you update more frequently than you insert, of course.)
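A rough sketch of that update-first approach (my_table, its id column, and the shape of rec as a dict keyed by column name are assumptions, not from the original post):
result = conn.execute(
    my_table.update().where(my_table.c.id == rec["id"]).values(**rec)
)
if result.rowcount == 0:     # no row matched, so it doesn't exist yet: insert it
    conn.execute(my_table.insert().values(**rec))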
Another workaround is to use savepoints:
SAVEPOINT a;
INSERT INTO ....;
-- on error:
ROLLBACK TO SAVEPOINT a;
UPDATE ...;
-- on success:
RELEASE SAVEPOINT a;
This has a serious problem for production-quality code: you have to detect the error accurately. Presumably you're expecting to hit a unique constraint check, but you may hit something unexpected, and it may be next to impossible to reliably distinguish the expected error from the unexpected one. If this hits the error condition incorrectly, it'll lead to obscure problems where nothing will be updated or inserted and no error will be seen. Be very careful with this. You can narrow down the error case by looking at Postgresql's error code to make sure it's the error type you're expecting, but the potential problem is still there.
Finally, if you really want to do batch-insert-or-update, you actually want to do many of them in a few commands, not one item per command. This requires trickier SQL: SELECT nested inside an INSERT, filtering out the right items to insert and update.
This error is from PostgreSQL. PostgreSQL doesn't allow you to execute further commands in the same transaction once one command has raised an error. To fix this you can use nested transactions (implemented using SQL savepoints) via conn.begin_nested(). Here's something that might work. I made the code use explicit connections, factored out the chunking part, and made the code use the context manager to manage transactions correctly.
from itertools import chain, islice

def chunked(seq, chunksize):
    """Yields items from an iterator in chunks."""
    it = iter(seq)
    while True:
        try:
            first = next(it)        # Python 3 spelling of it.next()
        except StopIteration:
            return                  # no more records
        yield chain([first], islice(it, chunksize - 1))

conn = engine.connect()
for chunk in chunked(records, 1000):
    with conn.begin():
        for rec in chunk:
            try:
                with conn.begin_nested():
                    conn.execute(inserter, ...)
            except sa.exceptions.SQLError:
                conn.execute(my_table.update(...))
This still won't have stellar performance though due to nested transaction overhead. If you want better performance try to detect which rows will create errors beforehand with a select query and use executemany support (execute can take a list of dicts if all inserts use the same columns). If you need to handle concurrent updates, you'll still need to do error handling either via retrying or falling back to one by one inserts.
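A sketch of that last suggestion, detecting the existing keys up front and then using executemany-style inserts (the id column and dict-shaped records are placeholders, and this ignores concurrent writers, as noted above):
wanted_ids = [r["id"] for r in records]
existing = {
    row.id
    for row in conn.execute(
        sa.select([my_table.c.id]).where(my_table.c.id.in_(wanted_ids))
    )
}
to_insert = [r for r in records if r["id"] not in existing]
to_update = [r for r in records if r["id"] in existing]

if to_insert:
    conn.execute(my_table.insert(), to_insert)     # one executemany round-trip
for r in to_update:
    conn.execute(
        my_table.update().where(my_table.c.id == r["id"]).values(**r)
    )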