Testing multiple models in PyQt simultaneously; which one failed? - python

I happened to stumble across Qt Model Testing earlier today, and realized this is exactly what is needed on a project that has grown in a very organic manner.
The idea is simple: implement a command-line flag that can be switched on in the future to run the program with the harmless consistency checks running in the background. Afterwards, start hunting down the problems one by one until the problems literally go away.
At its core, the basics of the first step seem easy enough:
self.mdlAlpha = alphaModel(self)
self.mdlBeta = betaModel(self)
# ...
# TODO: implement argument-switch toggle
from PyQt5.QtTest import QAbstractItemModelTester
QAbstractItemModelTester(self.mdlAlpha, QAbstractItemModelTester.FailureReportingMode.Warning, self)
QAbstractItemModelTester(self.mdlBeta, QAbstractItemModelTester.FailureReportingMode.Warning, self)
# ...
Within seconds, there were hundreds of (duplicate) errors that showed long before the program was in a usable state. Perfect! .... or is it?
It turns out that the reported errors aren't clear enough:
qt.modeltest: FAIL! flags == Qt::ItemIsDropEnabled || flags == 0 () returned FALSE (qabstractitemmodeltester.cpp:323)
qt.modeltest: FAIL! topLeft.isValid() () returned FALSE (qabstractitemmodeltester.cpp:753)
Sure, the failed tests are documented, but I have no clue which model is the buggy one due to the number of models being tested. (Commenting out all but one class at a time kind of defeats the point of having a command line flag to test everything...) I would really like to know which object / class is at fault and log that too, but I have no clue how to accomplish that.
Note that I have implemented a QtMessageHandler to convert Qt log messages into the program's logging messages, and I still want the failed tests to end up in that log file.

Naming your objects helps some. If you were to set up like so:
self.mdlAlpha = alphaModel(self)
self.mdlAlpha.setObjectName("mdlAlpha")
self.mdlBeta = betaModel(self)
self.mdlBeta.setObjectName("mdlBeta")
before connecting the models to the QAbstractItemModelTester, then some of the output would include the object name. Not the lines you quoted, but at least the setup output, where the tester logs the insertion of new data, happened to show the destination model, though almost by accident:
qt.modeltest: rowsAboutToBeInserted start= 1 end= 1 parent=
QModelIndex(-1,-1,0x0,QObject(0x0)) parent data= "" current count of parent= 1
last before insertion= QModelIndex(0,0,0x0,FilesModel(0x55b37e4b88f0,
name = "files.model")) QVariant(QString, "demo")
qt.modeltest: rowsAboutToBeInserted start= 1 end= 1 parent=
QModelIndex(-1,-1,0x0,QObject(0x0)) parent data= "" current count of parent= 1
last before insertion= QModelIndex(0,0,0x55b37e46ab90,
FileFilterProxyModel(0x55b37e473b60, name = "files.sortfilterproxy"))
QVariant(QString, "demo")
qt.modeltest: rowsInserted start= 1 end= 1 parent= QModelIndex(-1,-1,0x0,QObject(0x0))
parent data= "" current count of parent= 2
qt.modeltest: itemWasInserted: 1 QVariant(QString, "test")
qt.modeltest: rowsInserted start= 1 end= 1 parent= QModelIndex(-1,-1,0x0,QObject(0x0))
parent data= "" current count of parent= 2
qt.modeltest: itemWasInserted: 1 QVariant(QString, "test")
(I say that because the model name was only visible thanks to the tester logging the placeholder for the row before my new one. It seems that any log message that includes a valid QModelIndex shows the model name (if set), since it's in the value of index().model(). However, most indexes aren't valid at the time the model tester is logging, so the name still rarely shows. Data that's being created logs with an invalid index, because the tester runs just before the actual insertion of the data into the model.)
Still — YMMV, but I found that between the model names sometimes being logged, and being able to see the actual data involved in some of the other logs, I was able to follow which model/indexes were involved in any of the other messages in between.
Reading the code of QAbstractItemModelTester itself can also be enlightening. The "flags == Qt::ItemIsDropEnabled || flags == 0 () returned FALSE" message includes a source file location, which is the line I've marked with a <--- !!! comment below:
/*
    nonDestructiveBasicTest tries to call a number of the basic functions (not all)
    to make sure the model doesn't outright segfault, testing the functions that makes sense.
*/
void QAbstractItemModelTesterPrivate::nonDestructiveBasicTest()
{
    MODELTESTER_VERIFY(!model->buddy(QModelIndex()).isValid());
    model->canFetchMore(QModelIndex());
    MODELTESTER_VERIFY(model->columnCount(QModelIndex()) >= 0);
    fetchingMore = true;
    model->fetchMore(QModelIndex());
    fetchingMore = false;
    Qt::ItemFlags flags = model->flags(QModelIndex());
    MODELTESTER_VERIFY(flags == Qt::ItemIsDropEnabled || flags == 0); // <--- !!!
    model->hasChildren(QModelIndex());
    const bool hasRow = model->hasIndex(0, 0);
Even though the message doesn't say so, the source makes it clear that it's testing whether the model returns any flags it shouldn't for a new, empty (and therefore invalid) QModelIndex(). Like me, you were probably returning a fixed set of flags from your flags() implementation, without checking whether the index requested was a valid one. Returning Qt.ItemIsEnabled, Qt.ItemIsEditable, etc. for an invalid index is a model error, since there's no item referred to by that index.
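As a minimal sketch of that check, here is a flags() override that returns no flags for an invalid index. The NamesModel class and its contents are hypothetical and exist only for illustration; the example assumes PyQt5 is installed, as in the question.

```python
from PyQt5.QtCore import QAbstractListModel, QModelIndex, Qt


class NamesModel(QAbstractListModel):
    """Hypothetical read-only list model, used only to illustrate flags()."""

    def __init__(self, names, parent=None):
        super().__init__(parent)
        self._names = list(names)

    def rowCount(self, parent=QModelIndex()):
        # A flat list has no children below its top-level rows.
        return 0 if parent.isValid() else len(self._names)

    def data(self, index, role=Qt.DisplayRole):
        if index.isValid() and role == Qt.DisplayRole:
            return self._names[index.row()]
        return None

    def flags(self, index):
        # An invalid index refers to no item, so it gets no item flags.
        if not index.isValid():
            return Qt.NoItemFlags
        return Qt.ItemIsEnabled | Qt.ItemIsSelectable


model = NamesModel(["alpha", "beta"])
root_flags = model.flags(QModelIndex())      # flags for the invalid root index
item_flags = model.flags(model.index(0, 0))  # flags for a real item
```

The tester's check also accepts Qt::ItemIsDropEnabled for the root index, so a model that allows drops onto empty space could return that instead of Qt.NoItemFlags.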

Related

Inconsistent behavior of QSqlTableModel with OnRowSubmit

Premise: this question possibly refers to two distinct problems, but I believe they might be linked. If, after comments and further research we will find out that they are actually unrelated, I will open a separate question.
I'm experiencing some unexpected and odd behavior with some aspects of QSqlTableModel, and with subclassing in at least one case. I'm not an expert on SQL, but at least one of the problems doesn't look like the expected behavior.
I can confirm this only for SQLite as I don't use other database systems.
I can also reproduce these problems with both [Py]Qt 5.15.2 and 6.2.2.
1. New row is "removed" after ignoring editor changes
With the default OnRowChange edit strategy, if a row is added, some data is inserted in a field, and editing of another field on the same row is cancelled using Esc, the whole row is then removed from the view.
The actual database, though, is still updated, and opening the program again shows the row that was previously "hidden", except for the field that has been cancelled.
from PyQt5 import QtCore, QtWidgets, QtSql

class TestModel(QtSql.QSqlTableModel):
    def __init__(self):
        super().__init__()
        QtSql.QSqlQuery().exec(
            'CREATE TABLE IF NOT EXISTS test (name, value, data);')
        self.setTable('test')
        self.select()

app = QtWidgets.QApplication([])
db = QtSql.QSqlDatabase.addDatabase('QSQLITE')
db.setDatabaseName('test.db')
db.open()
win = QtWidgets.QWidget()
layout = QtWidgets.QVBoxLayout(win)
addButton = QtWidgets.QPushButton('Add row')
layout.addWidget(addButton)
table = QtWidgets.QTableView()
layout.addWidget(table)
model = TestModel()
table.setModel(model)
addButton.clicked.connect(lambda: model.insertRow(model.rowCount()))
app.aboutToQuit.connect(model.submitAll)
win.resize(640, 480)
win.show()
app.exec()
These are the steps to reproduce the problem:
1. add a row with the button;
2. edit at least one field, but not all fields;
3. start editing an empty field;
4. press Esc;
5. close and restart the program.
After step 4, you'll see that the added row is removed from the view, which is not completely unexpected: since the strategy is OnRowChange, cancelling reverts all cached changes (including insertRow()); I don't completely agree with the behavior (imagine filling dozens of fields and then hitting Esc by mistake), but that's not the point.
What's unexpected is that the model is actually updated with the new row and all fields that have been submitted before hitting Esc, and restarting the program will show that.
2. Implementing data() reverts to previous data for incomplete records
Editing an index that has empty (NULL) fields for its row brings different results whether data() has been implemented or not in the subclass, even if the override just calls the base implementation.
Add the following to the TestModel class above:
    def data(self, index, role=QtCore.Qt.DisplayRole):
        return super().data(index, role)
And a submit button before app.exec():
submitButton = QtWidgets.QPushButton('Submit')
layout.addWidget(submitButton)
submitButton.clicked.connect(model.submitAll)
To reproduce the problem, follow these steps:
1. open a database with at least one row with an empty field at the bottom, similarly to what was done above (note: by "empty field" I mean an item that has never been edited);
2. edit any field in that row and press Enter;
With the OnRowChange or OnFieldChange strategy, the result is that the whole row is made invalid: the vertical header shows "!" (a hint for an invalid record) and all fields are cleared, including those that had a previous value from the database.
When the edit strategy is set to OnManualSubmit, calling submitAll() will revert to the original values of the database, just as if the changes had been reverted.
The behavior is slightly different if the row with the empty field is not at the bottom; do the first two steps above, then:
3. press the submit button;
4. close and restart the program;
In this case, after step 3 the view seems to have accepted the changes, but restarting the program shows that no modification has been applied.
Depending on the edit strategy and the situation, the behavior changes. Usually, if a record with an empty field is followed by at least a record with all fields set, the view and model behave as expected when cancelling editing of that field.
In at least one case it was even impossible to edit an empty field at all (I've to admit, I did many random/speed tests and when I found out that I wasn't able to edit a field I couldn't remember the steps to reproduce it).
What's also strange is that both setData() and submitAll() return True, and there is no explicit lastError(). Despite that, the shown (and stored) data reverts to the previous database content.
I believe that both issues are potentially caused by a common bug, but, before submitting something to the Qt bug report system I'd like to have some feedback, especially from people being more experienced in SQL and other db drivers, in order to provide a better report (and eventually know if those issues are in fact related or not).
Both issues are caused by bugs in Qt, but they aren't related.
Before explaining these issues, some clarification of the symbols used in the vertical header may be helpful, because they provide some important clues regarding the source of the problems. The symbols are documented thus:
If you insert rows programmatically using
QSqlTableModel::insertRows(), the new rows will be marked with an
asterisk (*) until they are submitted using submitAll() or
automatically when the user moves to another record (assuming the edit
strategy is QSqlTableModel::OnRowChange). Likewise, if you remove rows
using removeRows(), the rows will be marked with an exclamation mark
(!) until the change is submitted.
The first issue is caused by this sequence of events:
After pressing Esc whilst editing a new row (i.e. * is shown in the vertical header), the delegate will emit closeEditor with the RevertModelCache hint. This calls the closeEditor slot of the view, which in turn calls revert() on the table-model - and also, ultimately, the private revertCachedRow function. This function calls beginRemoveRows - but crucially before clearing the cache. Next, rowsAboutToBeRemoved is emitted, which removes the row from the view, causing currentRowChanged to be emitted, which in turn calls the submit() slot of the table-model. Oops! The still uncleared cache data is now inadvertently committed to the database, before endRemoveRows is called after the cache data is finally removed. So, in short, the bug here is that there is no guard to stop submit() being called during the execution of revert().
The second issue is much more subtle. The problem occurs because the SQL table is created without a primary key and the columns do not have an explicit type. This is all perfectly valid, but it exposes a critical bug in a small section of Qt code that builds SQL statements.
This happens in QSqlTableModel::selectRow, which needs to build a where-clause from the QSqlRecord returned by primaryValues. The sqlStatement function of the database driver is used for this, but that needs to know the exact type of the field values in order to quote them correctly. However, the table-model cache does not ensure that a sensible default type is used for columns without an explicit type. This means untyped values will pass through unquoted, allowing arbitrary SQL expressions to be evaluated whilst editing the table. Oops!
It's this that can sometimes make the bug hard to reproduce, because the exact behaviour depends on the precise values that are entered. A value like foo will cause an SQL error, because it's a valid column name that doesn't exist; yet a value like 6 won't raise an error, but will wrongly fail to return any rows, due to a type-mismatch (i.e. INT vs TEXT). If selectRow can't find the relevant row, it may call cache.refresh(), which will clear the values and mark the row for deletion (hence the ! shown in the vertical header). Note also that QSqlQuery is used to execute the problematic statement, so any errors will pass silently and won't be available via the database or driver.
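The effect of the missing quotes can be reproduced outside Qt. The following is a minimal sketch using Python's built-in sqlite3 module rather than QSqlTableModel. The helper rows_where_name_equals is hypothetical, and it splices its argument into the SQL text on purpose, only to mimic an unquoted where-clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same shape as the table in the question: no primary key, untyped columns.
conn.execute("CREATE TABLE test (name, value, data)")
conn.execute("INSERT INTO test (name) VALUES (?)", ("foo",))
conn.execute("INSERT INTO test (name) VALUES (?)", ("6",))


def rows_where_name_equals(literal):
    # Deliberately unsafe: the literal is pasted into the statement as-is,
    # which is what an unquoted value in a generated where-clause amounts to.
    return conn.execute(
        f"SELECT name FROM test WHERE name = {literal}").fetchall()


try:
    rows_where_name_equals("foo")        # parsed as a column name
    unquoted_word_error = None
except sqlite3.OperationalError as exc:
    unquoted_word_error = str(exc)

unquoted_number_rows = rows_where_name_equals("6")    # INTEGER vs stored TEXT
quoted_number_rows = rows_where_name_equals("'6'")    # TEXT vs stored TEXT
quoted_word_rows = rows_where_name_equals("'foo'")
```

Here the unquoted word raises an error, the unquoted number silently matches nothing, and the quoted forms find their rows, which corresponds to the two failure modes described above.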
I have provided a re-write below of the original example with some fixes that can be switched on via the command-line (1 to fix the first issue, 2 to fix the second, and 3 to fix both). These are mainly meant for debugging, but could also be adapted as work-arounds if required. The second fix is rather hackish (because primaryValues can't be reimplemented in PyQt) - but it's only needed if you don't have control over the database schema. If the table has a typed primary key and/or all the columns have an explicit type, the second issue won't occur at all. Hopefully the output from the script should make it clear what is going on.
PyQt5:
import sys
from PyQt5 import QtCore, QtWidgets, QtSql

BUGFIX = int(sys.argv[1]) if len(sys.argv) > 1 else 0

class TestModel(QtSql.QSqlTableModel):
    def __init__(self):
        super().__init__()
        self._select_row = None
        self._reverting = False
        QtSql.QSqlQuery().exec(
            'CREATE TABLE IF NOT EXISTS test (name, value, data);')
        self.setTable('test')
        self.select()

    def selectRow(self, row):
        if BUGFIX & 2:
            self._select_row = row
        result = super().selectRow(row)
        print(f'selectRow: {result}')
        return result

    def select(self):
        return super().select() if self._select_row is None else False

    def selectStatement(self):
        if self._select_row is not None:
            record = self.primaryValues(self._select_row)
            for index in range(record.count()):
                field = record.field(index)
                if (not field.isNull() and
                        field.type() == QtCore.QVariant.Invalid):
                    field.setType(QtCore.QVariant.String)
                    record.replace(index, field)
            where = self.database().driver().sqlStatement(
                QtSql.QSqlDriver.WhereStatement,
                self.tableName(), record, False)
            if where[:6].upper() == 'WHERE ':
                where = where[6:]
            self.setFilter(where)
            self._select_row = None
        statement = super().selectStatement()
        print(f'selectStatement: {statement!r}')
        query = self.database().exec(statement)
        if query.lastError().isValid():
            print(f' query-lastError: {query.lastError().text()!r}')
        else:
            print(f' query-next: {query.next()}')
        return statement

    def revert(self):
        if BUGFIX & 1:
            self._reverting = True
        print('reverting ...')
        super().revert()
        self._reverting = False
        print('reverted')

    def submit(self):
        print('submitting ...')
        result = False if self._reverting else super().submit()
        print(f'submitted: {result}')
        return result

app = QtWidgets.QApplication(['Test'])
db = QtSql.QSqlDatabase.addDatabase('QSQLITE')
db.setDatabaseName('test.db')
db.open()
win = QtWidgets.QWidget()
layout = QtWidgets.QVBoxLayout(win)
addButton = QtWidgets.QPushButton('Add row')
layout.addWidget(addButton)
table = QtWidgets.QTableView()
layout.addWidget(table)
model = TestModel()
table.setModel(model)
submitButton = QtWidgets.QPushButton('Submit')
layout.addWidget(submitButton)
submitButton.clicked.connect(model.submitAll)
addButton.clicked.connect(lambda: model.insertRow(model.rowCount()))
app.aboutToQuit.connect(model.submitAll)
win.setGeometry(1000, 50, 640, 480)
win.show()
app.exec()
PyQt6:
import sys
from PyQt6 import QtCore, QtWidgets, QtSql

BUGFIX = int(sys.argv[1]) if len(sys.argv) > 1 else 0

class TestModel(QtSql.QSqlTableModel):
    def __init__(self):
        super().__init__()
        self._select_row = None
        self._reverting = False
        QtSql.QSqlQuery().exec(
            'CREATE TABLE IF NOT EXISTS test (name, value, data);')
        self.setTable('test')
        self.select()

    def selectRow(self, row):
        if BUGFIX & 2:
            self._select_row = row
        result = super().selectRow(row)
        print(f'selectRow: {result}')
        return result

    def select(self):
        return super().select() if self._select_row is None else False

    def selectStatement(self):
        if self._select_row is not None:
            record = self.primaryValues(self._select_row)
            MetaType = QtCore.QMetaType.Type
            MetaString = QtCore.QMetaType(MetaType.QString.value)
            for index in range(record.count()):
                field = record.field(index)
                if (not field.isNull() and
                        field.metaType().id() == MetaType.UnknownType.value):
                    field.setMetaType(MetaString)
                    record.replace(index, field)
            where = self.database().driver().sqlStatement(
                QtSql.QSqlDriver.StatementType.WhereStatement,
                self.tableName(), record, False)
            if where[:6].upper() == 'WHERE ':
                where = where[6:]
            self.setFilter(where)
            self._select_row = None
        statement = super().selectStatement()
        print(f'selectStatement: {statement!r}')
        query = self.database().exec(statement)
        if query.lastError().isValid():
            print(f' query-lastError: {query.lastError().text()!r}')
        else:
            print(f' query-next: {query.next()}')
        return statement

    def revert(self):
        if BUGFIX & 1:
            self._reverting = True
        print('reverting ...')
        super().revert()
        self._reverting = False
        print('reverted')

    def submit(self):
        print('submitting ...')
        result = False if self._reverting else super().submit()
        print(f'submitted: {result}')
        return result

app = QtWidgets.QApplication(['Test'])
db = QtSql.QSqlDatabase.addDatabase('QSQLITE')
db.setDatabaseName('test.db')
db.open()
win = QtWidgets.QWidget()
layout = QtWidgets.QVBoxLayout(win)
addButton = QtWidgets.QPushButton('Add row')
layout.addWidget(addButton)
table = QtWidgets.QTableView()
layout.addWidget(table)
model = TestModel()
table.setModel(model)
submitButton = QtWidgets.QPushButton('Submit')
layout.addWidget(submitButton)
submitButton.clicked.connect(model.submitAll)
addButton.clicked.connect(lambda: model.insertRow(model.rowCount()))
app.aboutToQuit.connect(model.submitAll)
win.setGeometry(1000, 50, 640, 480)
win.show()
app.exec()

How to rewrite a state machine in a clearer style?

I am interacting with an external device, and I have to issue certain commands in order. Sometimes I have to jump back and redo steps. Pseudocode (the actual code has more steps and jumps):
enter_update_mode() # step 1
success = start_update()
if not success:
retry from step 1
leave_update_mode()
How do I handle this the cleanest way? What I did for now is to define an enum, and write a state machine. This works, but is pretty ugly:
from enum import Enum
class Step(Enum):
    ENTER_UPDATE_MODE = 1
    START_UPDATE = 2
    LEAVE_UPDATE_MODE = 3
    EXIT = 4
def main():
    next_step = Step.ENTER_UPDATE_MODE
    while True:
        if next_step == Step.ENTER_UPDATE_MODE:
            enter_update_mode()
            next_step = Step.START_UPDATE
        elif next_step == Step.START_UPDATE:
            success = start_update()
            if success:
                next_step = Step.LEAVE_UPDATE_MODE
            else:
                next_step = Step.ENTER_UPDATE_MODE
        ....
I can imagine an alternative would be to just call the functions nested. As long as this is only a few levels deep, it should not be a problem:
def enter_update_mode():
    # do stuff ...
    # call next step:
    perform_update()
def perform_update():
    # ...
    # call next step:
    if success:
        leave_update_mode()
    else:
        enter_update_mode()
I have looked into the python-statemachine module, but it seems to be there to model state machines. You can define states and query which state it is in, and you can attach behavior to states. But that is not what I'm looking for. I am looking for a way to write the behavior code in a very straightforward, imperative style, like you would use for pseudocode or instructions to a human.
There is also a module to add goto to Python, but I think it is a joke and would not like to use it in production :-).
Notes:
This code is synchronous, meaning it is a terminal app or a separate thread. Running concurrently with other code would be an added complication. If a solution allows that (e.g. by using yield) that would be a bonus, but not necessary.
I left out a lot of retry logic. A step may only be retried a certain number of times.
Related discussion of explicit state machine vs. imperative style: https://softwareengineering.stackexchange.com/q/147182/62069
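For the small three-step example, the "retry from step 1" jump can be written as an ordinary loop, which reads close to the pseudocode. This is only a sketch under assumptions. The run_update name, the max_attempts parameter and the choice to raise once retries are exhausted are not from the question, and the device functions are passed in so that stand-ins can be used.

```python
def run_update(enter_update_mode, start_update, leave_update_mode, max_attempts=3):
    """Run the steps in order, restarting from step 1 while start_update() fails."""
    for attempt in range(1, max_attempts + 1):
        enter_update_mode()      # step 1
        if start_update():       # step 2; a falsy result means "retry from step 1"
            break
    else:
        # Assumption: giving up raises; the question does not say what should happen.
        raise RuntimeError(f"update did not start after {max_attempts} attempts")
    leave_update_mode()          # step 3
    return attempt


# Usage with stand-in device functions: start_update fails once, then succeeds.
calls = []
results = iter([False, True])
attempts_used = run_update(
    enter_update_mode=lambda: calls.append("enter"),
    start_update=lambda: (calls.append("start"), next(results))[1],
    leave_update_mode=lambda: calls.append("leave"),
)
```

Each backward jump becomes one loop around the steps it restarts, so this stays readable for a handful of jumps. With many overlapping jumps the explicit state machine may still be the clearer option.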

How to create an augmented AFL fuzzer which skips certain seeds?

I am a master's student working on replicating the results of the paper : https://www.microsoft.com/en-us/research/publication/not-all-bytes-are-equal-neural-byte-sieve-for-fuzzing/
I want to create an augmented fuzzer which rejects the modifications to seeds which it finds not useful. Any help in achieving this will be very much helpful.
I have created a simple python function for the augmented fuzzer. To test the implementation, I took the trivial "deadbeef" program and wrote the python function such that whenever the seed is modified to "deadbeef", the function sends a "not useful" return to the 'common_fuzz_stuff()' function of the AFL-fuzz code. It should mean that the fuzzer should not be able to find the crash. But it still is able to find the crash and I'm not able to determine where I have gone wrong.
Here is the python function for AFL:
def check_useful(seed):
    my_string = str.encode('deadbeef')
    # Read the test case as bytes; the context manager closes the file.
    with open(seed, 'rb') as file:
        value = file.read()
    if value == my_string:
        print('[*] Crash Found!')
        return True
    return False
And here is the afl-fuzz.c code snippet:
/* Write a modified test case, run program, process results. Handle
   error conditions, returning 1 if it's time to bail out. This is
   a helper function for fuzz_one(). */
EXP_ST u8 common_fuzz_stuff(char** argv, u8* out_buf, u32 len) {
  if (PyCallable_Check(pFuncCheckModel)) {
    pArgs = PyTuple_New(1);
    PyTuple_SetItem(pArgs, 0, PyUnicode_FromString(queue_cur->fname));
    pFuncReturn = PyObject_CallObject(pFuncCheckModel, pArgs);
    if (PyObject_IsTrue(pFuncReturn)) {
      skip_requested = 1;
      return 1;
    }
  } else {
    PyErr_Print();
  }
How is my program still able to find the crash even if the return value is 1 from the common_fuzz_stuff() function for the seed "deadbeef"?
If your decision whether an input is useful depends only on the input itself (not on the mutation), then as far as I understand you could use the experimental/post_library mechanism. Its documentation is included in the example post_library and contains a note that this is probably not what you want. (That is a rough quote from the documentation, not a comment on your specific need.) :)
On the other hand, this single-function-API description contains the following:
2) If you want to skip this test case altogether and have AFL generate a
new one, return NULL. Use this sparingly - it's faster than running
the target program with patently useless inputs, but still wastes CPU
time.
To answer my own question:
I had to send out_file to the Python function instead of queue_cur->fname.
PyTuple_SetItem(pArgs, 0, PyUnicode_FromString(out_file));
Also skip_requested = 1; in the above code is redundant.
Now the fuzzer runs and no longer finds the crash.

Django attribute is not up to date

I'm using Django / Postgresql, and I want to build some kind of media player behavior : start playing when asked, play until further command, and stop playing when asked.
My Django model is designed as follows :
a boolean attribute state
a function that switches the state value, and if it's true, calls the looping function
a function that loops while the state attribute is True, and stops when it is False.
The code is as follows:
# Excerpt from the model class body; assumes "import time" at module level.
state = models.BooleanField(default=False)
def switchState(self):
    print('switching from %s to %s' % (self.state, not self.state))
    self.state = not self.state
    self.save()
    if self.state:
        self.loop()
def loop(self):
    while self.state:
        print('looping')
        time.sleep(1)
    print('stopped looping')
Behavior :
when I call switchState() on a model instance, state is set to True (I can see it in the database) and the loop() function starts printing lines
when I call switchState() again, state is set to False (again, I can see it in the database) but then, the loop() function does not stop. When I print state, its value is still True...
I can't get an up to date value of that damned state attribute.
I must be missing something, but what ?
Thanks for your help !
OK, I finally found an answer. As Daniel Roseman told me, the problem was due to the loop function, which was looping in its own transaction and so could not see what happened in another transaction.
I first tried to force a refresh on my instance, with the new refresh_from_db() function, but that didn't work at all (I still don't get why, please tell me if you know).
As Daniel suggested, I read again the Django transaction doc, and finally found out a solution that seems to work. The idea is to allow the loop function to update the instance at each loop, using the with transaction.atomic() command.
def loop(self):
    # Re-read the flag from the database instead of trusting self.state.
    vTest = Source.objects.get(id=self.id).state
    while vTest:
        with transaction.atomic():
            vTest = Source.objects.get(id=self.id).state
        print('looping')
        time.sleep(1)
    print('stopped looping')
Nb : you can't use
vTest = self.state
instead of
vTest = Source.objects.get(id=self.id).state
Maybe it's not the right way to do it, feel free to correct me if I'm wrong.

Duplicate entries in High Replication Datastore

We still have a rare case of duplicate entries when this POST method is called.
I had asked for advice previously on Stack Overflow and was given a solution: using the parent/child methodology to get strongly consistent queries.
I have migrated all data into that form and let it run for another 3 months.
However the problem was never solved.
The problem is right here with this conditional if recordsdb.count() == 1:
It should be true in order to update the entry, but instead HRD might not always find the latest entry and creates a new entry instead.
As you can see, we are writing/reading from the Record via Parent/Child methodology as recommended:
new_record = FeelTrackerRecord(parent=user.key,...)
And yet still upon retrieval, the HRD still doesn't always fetch the latest entry:
recordsdb = FeelTrackerRecord.query(ancestor = user.key).filter(FeelTrackerRecord.record_date == ... )
So we are quite stuck on this and don't know how to solve it.
#requires_auth
def post(self, ios_sync_timestamp):
    user = User.query(User.email == request.authorization.username).fetch(1)[0]
    if user:
        json_records = request.json['records']
        for json_record in json_records:
            recordsdb = FeelTrackerRecord.query(ancestor = user.key).filter(FeelTrackerRecord.record_date == date_parser.parse(json_record['record_date']))
            if recordsdb.count() == 1:
                rec = recordsdb.fetch(1)[0]
                if 'timestamp' in json_record:
                    if rec.timestamp < json_record['timestamp']:
                        rec.rating = json_record['rating']
                        rec.notes = json_record['notes']
                        rec.timestamp = json_record['timestamp']
                        rec.is_deleted = json_record['is_deleted']
                        rec.put()
            elif recordsdb.count() == 0:
                new_record = FeelTrackerRecord(parent=user.key,
                                               user=user.key,
                                               record_date = date_parser.parse(json_record['record_date']),
                                               rating = json_record['rating'],
                                               notes = json_record['notes'],
                                               timestamp = json_record['timestamp'])
                new_record.put()
            else:
                raise Exception('Got more than one record for the same record date - among REST post')
        user.last_sync_timestamp = create_timestamp(datetime.datetime.today())
        user.put()
        return '', 201
    else:
        return '', 401
Possible Solution:
The very last idea I have to solve this would be, stepping away from Parent/Child strategy and using the user.key PLUS date-string as part of the key.
Saving:
new_record = FeelTrackerRecord(id=str(user.key) + json_record['record_date'], ...)
new_record.put()
Loading:
key = ndb.Key(FeelTrackerRecord, str(user.key) + json_record['record_date'])
record = key.get();
Now I can check whether record is None: if so, I create a new entry; otherwise I update it. And hopefully HRD has no reason not to find the record anymore.
What do you think, is this a guaranteed solution?
The Possible Solution appears to have the same problem as the original code. Imagine the race condition if two servers execute the same instructions practically simultaneously. With Google's overprovisioning, that is sure to happen once in a while.
A more robust solution should use Transactions and a rollback for when concurrency causes a consistency violation. The User entity should be the parent of its own Entity Group. Increment a records counter field in the User entity within a transaction. Create the new FeelTrackerRecord only if the Transaction completes successfully. Therefore the FeelTrackerRecord entities must have a User as parent.
Edit: In the case of your code the following lines would go before user = User.query(... :
Transaction txn = datastore.beginTransaction();
try {
and the following lines would go after user.put() :
txn.commit();
} finally {
if (txn.isActive()) {
txn.rollback();
}
}
That may overlook some flow-control nesting detail; it is the concept that this answer is trying to describe.
With an active transaction, if multiple processes run at once (for example on multiple servers executing the same POST concurrently because of overprovisioning), the first process will succeed with its put and commit, while the second process will throw the documented ConcurrentModificationException.
Edit 2: The transaction that increments the counter (and may throw an exception) must also create the new record. That way if the exception is thrown, the new record is not created.
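To illustrate why the lookup and the create must happen as one atomic step, here is a minimal sketch in plain Python. The FakeStore class is a hypothetical in-memory stand-in, not the App Engine API, and its lock plays the role of the transaction. The key format and the sample record are made up for the example. In real NDB code, Model.get_or_insert() is documented as doing a get-or-create transactionally and may be worth checking against your SDK version.

```python
import threading


class FakeStore:
    """In-memory stand-in for the datastore (hypothetical, not the App Engine API)."""

    def __init__(self):
        self.rows = {}
        self.created = 0
        self._lock = threading.Lock()  # plays the role of the transaction

    def upsert(self, key, json_record):
        # Lookup and write happen as one atomic step, so two concurrent
        # requests can never both observe "no record yet".
        with self._lock:
            existing = self.rows.get(key)
            if existing is None:
                self.rows[key] = dict(json_record)
                self.created += 1
            elif existing["timestamp"] < json_record["timestamp"]:
                existing.update(json_record)


def record_key(user_id, record_date):
    # One deterministic key per (user, date), as in the "Possible Solution".
    return f"{user_id}:{record_date}"


store = FakeStore()
record = {"record_date": "2014-05-01", "rating": 3, "notes": "", "timestamp": 100}
key = record_key("user-1", record["record_date"])

# Simulate the same POST arriving on several servers at once.
threads = [threading.Thread(target=store.upsert, args=(key, record))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two threads could both see the key as missing and both count a creation. With it, every request after the first takes the update branch.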
