How can I Cause a Deadlock in MySQL for Testing Purposes - python

I want to make my Python library, which talks to MySQL via MySQLdb, able to detect deadlocks and try again. I believe I've coded a good solution, and now I want to test it.
What are the simplest queries I could run using MySQLdb to create a deadlock condition?
system info:
MySQL 5.0.19
Client 5.1.11
Windows XP
Python 2.4 / MySQLdb 1.2.1 p2

Here's some pseudocode for how I do it in PHP:
Script 1:
START TRANSACTION;
INSERT INTO table <anything you want>;
SLEEP(5);
UPDATE table SET field = 'foo';
COMMIT;
Script 2:
START TRANSACTION;
UPDATE table SET field = 'foo';
SLEEP(5);
INSERT INTO table <anything you want>;
COMMIT;
Execute script 1 and then immediately execute script 2 in another terminal. You'll get a deadlock if the database table already has some data in it (In other words, it starts deadlocking after the second time you try this).
Note that if MySQL won't honor the SLEEP() command, use Python's time.sleep() equivalent in the application itself.
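A rough Python sketch of those two scripts along those lines, assuming MySQLdb and an InnoDB table t with a column field (the table, column, and connection parameters are all placeholders):
import sys, time
import MySQLdb

# Hypothetical table: CREATE TABLE t (id INT PRIMARY KEY, field VARCHAR(10)) ENGINE=InnoDB
conn = MySQLdb.connect(host="localhost", user="user", passwd="secret", db="test")
cur = conn.cursor()   # MySQLdb leaves autocommit off, so we are already inside a transaction

if sys.argv[1] == "1":
    # script 1: insert first, then update
    cur.execute("INSERT INTO t (id, field) VALUES (100, 'x')")
    time.sleep(5)     # sleep in the application, as suggested above
    cur.execute("UPDATE t SET field = 'foo'")
else:
    # script 2: update first, then insert
    cur.execute("UPDATE t SET field = 'foo'")
    time.sleep(5)
    cur.execute("INSERT INTO t (id, field) VALUES (200, 'x')")

conn.commit()
Run one copy with argument 1 and, right after, another copy with argument 2 in a second terminal; whichever session loses should get an OperationalError instead of committing.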

You can always run LOCK TABLES tablename WRITE from another session (the mysql CLI, for instance). That might do the trick.
The table will remain locked until you release it (UNLOCK TABLES) or disconnect the session.

I'm not familiar with Python, so excuse my incorrect language if I'm saying this wrong... but open two sessions (in separate windows, from separate Python processes, or even from separate boxes). Then:
1. In Session A:
Begin Transaction
Insert TableA() Values()...
2. Then in Session B:
Begin Transaction
Insert TableB() Values()...
Insert TableA() Values() ...
3. Then go back to Session A:
Insert TableB() Values () ...
You'll get a deadlock...

You want something along the following lines.
parent.py
import subprocess

c1 = subprocess.Popen(["python", "child.py", "1"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
c2 = subprocess.Popen(["python", "child.py", "2"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# Each child grabs its first lock and then blocks on raw_input().
# communicate() can only be called once per process, so write to stdin
# directly to release both children; each then tries to take the other's lock.
c1.stdin.write("to 1: hit it!\n")
c1.stdin.close()
c2.stdin.write("to 2: ready, set, go!\n")
c2.stdin.close()

out1 = c1.stdout.read()
print " 1:", repr(out1)
out2 = c2.stdout.read()
print " 2:", repr(out2)

c1.wait()
c2.wait()
child.py
import sys
import yourDBconnection as dbapi2   # placeholder for your real DB-API module

def child1():
    print "Child 1 start"
    conn = dbapi2.connect( ... )
    c1 = conn.cursor()
    conn.begin()   # turn off autocommit, start a transaction
    ra = c1.execute("UPDATE A SET AC1='Achgd' WHERE AC1='AC1-1'")
    print ra
    print "Child1", raw_input()
    rb = c1.execute("UPDATE B SET BC1='Bchgd' WHERE BC1='BC1-1'")
    print rb
    c1.close()
    print "Child 1 finished"

def child2():
    print "Child 2 start"
    conn = dbapi2.connect( ... )
    c1 = conn.cursor()
    conn.begin()   # turn off autocommit, start a transaction
    rb = c1.execute("UPDATE B SET BC1='Bchgd' WHERE BC1='BC1-1'")
    print rb
    print "Child2", raw_input()
    ra = c1.execute("UPDATE A SET AC1='Achgd' WHERE AC1='AC1-1'")
    print ra
    c1.close()
    print "Child 2 finished"

try:
    if sys.argv[1] == "1":
        child1()
    else:
        child2()
except Exception, e:
    print repr(e)
Note the symmetry. Each child starts out holding one resource. Then they attempt to get someone else's held resource. You can, for fun, have 3 children and 3 resources for a really vicious circle.
Note the difficulty of contriving a situation in which deadlock occurs. If your transactions are short -- and consistent -- deadlock is very difficult to achieve. Deadlock requires (a) transactions which hold locks for a long time AND (b) transactions which acquire locks in an inconsistent order. I have found it easiest to prevent deadlocks by keeping my transactions short and consistent.
Also note the non-determinism. You can't predict which child will die with a deadlock and which will continue after the other died. Only one of the two needs to die to release the resources needed by the other. Some RDBMSs claim that there's a rule based on the number of resources held, blah blah blah, but in general you'll never know how the victim was chosen.
Because of the two writes being in a specific order, you sort of expect child 1 to die first. However, you can't guarantee that. It's not deadlock until child 2 tries to get child 1's resources -- the sequence of who acquired first may not determine who dies.
Also note that these are processes, not threads. Threads -- because of the Python GIL -- might be inadvertently synchronized and would require lots of calls to time.sleep( 0.001 ) to give the other thread a chance to catch up. Processes -- for this -- are slightly simpler because they're fully independent.
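For the original goal of detecting the deadlock and retrying, note that the losing child gets a MySQLdb.OperationalError whose first argument is 1213 (ER_LOCK_DEADLOCK). A rough retry-wrapper sketch, with the connection and the statements left as placeholders:
import MySQLdb

DEADLOCK_ERR = 1213   # "Deadlock found when trying to get lock; try restarting transaction"

def run_with_retry(conn, statements, attempts=3):
    # Re-run the whole transaction if it was chosen as the deadlock victim.
    for attempt in range(attempts):
        try:
            cur = conn.cursor()
            for stmt in statements:
                cur.execute(stmt)
            conn.commit()
            return
        except MySQLdb.OperationalError, e:
            conn.rollback()
            if e.args[0] != DEADLOCK_ERR or attempt == attempts - 1:
                raise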

Not sure if either answer above is correct.
Check out this:
http://www.xaprb.com/blog/2006/08/08/how-to-deliberately-cause-a-deadlock-in-mysql/

Related

why does the connection cursor need to be readable for a psycopg2 notification listening loop

psycopg2 provides some example code for using postgresql notify facilities
import select
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect(DSN)
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

curs = conn.cursor()
curs.execute("LISTEN test;")

print "Waiting for notifications on channel 'test'"
while True:
    if select.select([conn],[],[],5) == ([],[],[]):
        print "Timeout"
    else:
        conn.poll()
        while conn.notifies:
            notify = conn.notifies.pop(0)
            print "Got NOTIFY:", notify.pid, notify.channel, notify.payload
What is the select.select([conn],[],[],5) == ([],[],[]) code doing?
The documentation of select.select suggests that we are trying to make sure that the socket that underlies the conn is "ready for reading".
What does it mean for a psycopg2 connection to be "ready for reading"?
conn.poll() is a nonblocking command. If there are no notifications (or other comms) available, it will still return immediately, just without having done anything. When done in a tight loop, this would be busy waiting and that is bad. It will burn an entire CPU doing nothing, and may even destabilize the whole system.
The select means "wait politely for there to be something on the connection to be read (implying poll will probably have something to do), or for 5 seconds, whichever occurs first." If you instead wanted to wait indefinitely for a message to show up, just remove the 5 (the timeout argument) from the select call. If you wanted to wait for something interesting to happen on the first of several handles, you would put all of them in the select list, not just conn.
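For example, to wait with no timeout while also watching a second, hypothetical socket other_sock, the loop body could look roughly like this:
# Block until either the database connection or other_sock is readable
readable, _, _ = select.select([conn, other_sock], [], [])
if conn in readable:
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        print "Got NOTIFY:", notify.pid, notify.channel, notify.payload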

Basic concurrent SQLite writer in Python

I have created a very basic script that periodically writes some data into a database:
test.py
import sqlite3
import sys
import time

DB_CREATE_TABLE = 'CREATE TABLE IF NOT EXISTS items (item TEXT)'
DB_INSERT = 'INSERT INTO items VALUES (?)'
FILENAME = 'test.db'

def main():
    index = int()
    c = sqlite3.connect(FILENAME)
    c.execute(DB_CREATE_TABLE)
    c.commit()
    while True:
        item = '{name}_{index}'.format(name=sys.argv[1], index=index)
        c.execute(DB_INSERT, (item,))
        c.commit()
        time.sleep(1)
        index += 1
    c.close()

if __name__ == '__main__':
    main()
Now I can achieve a simple concurrency by running the script several times:
python3 test.py foo &
python3 test.py bar &
I have tried to read some articles about scripts writing to the same database file at the same time, but I'm still not sure how my script will handle such an event, and I couldn't figure out a way to test it.
My expectations are that in the unlikely event when the two instances of my script try to write to the database in the same millisecond, the later one will simply silently wait till the earlier finishes its job.
Does my current implementation meet my expectations? If it does not, how does it behave in case of such event and how can I fix it?
TL;DR
This script will meet the expectations.
Explanation
When the unlikely event of two script instances trying to write at the same time happens, the first one locks the database and the second one silently waits for a while until the first one finishes its transaction so that the database is unlocked for writing again.
More precisely, the second script instance waits for 5 seconds (by default) and then raises an OperationalError with the message database is locked. As #roganjosh commented, this behavior is actually specific to Python's SQLite wrapper. The documentation states:
When a database is accessed by multiple connections, and one of the processes modifies the database, the SQLite database is locked until that transaction is committed. The timeout parameter specifies how long the connection should wait for the lock to go away until raising an exception. The default for the timeout parameter is 5.0 (five seconds).
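So if the default 5-second wait could ever be too short for your writers, the wait can be lengthened when opening the connection; a one-line sketch against the script above:
# wait up to 30 seconds for a competing writer before raising "database is locked"
c = sqlite3.connect(FILENAME, timeout=30)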
Tests
To demonstrate the collision event of the two instances I modified the main function:
def main():
    c = sqlite3.connect(FILENAME)
    c.execute(DB_CREATE_TABLE)
    c.commit()

    print('{} {}: {}'.format(time.time(), sys.argv[1], 'trying to insert ...'))
    try:
        c.execute(DB_INSERT, (sys.argv[1],))
    except sqlite3.OperationalError as e:
        print('{} {}: {}'.format(time.time(), sys.argv[1], e))
        return
    time.sleep(int(sys.argv[2]))
    c.commit()
    print('{} {}: {}'.format(time.time(), sys.argv[1], 'done'))
    c.close()
The documentation states that the database is locked until the transaction is committed. So simply sleeping during the transaction should be enough to test it.
Test 1
We run the following command:
python3 test.py first 10 & sleep 1 && python3 test.py second 0
The first instance starts and, after 1s, the second instance starts. The first instance creates a 10s long transaction during which the second one tries to write to the database, waits, and then raises an exception. The log demonstrates that:
1540307088.6203635 first: trying to insert ...
1540307089.6155508 second: trying to insert ...
1540307094.6333485 second: database is locked
1540307098.6353421 first: done
Test 2
We run the following command:
python3 test.py first 3 & sleep 1 && python3 test.py second 0
The first instance starts and, after 1s, the second instance starts. The first instance creates a 3s long transaction during which the second one tries to write to the database and waits. Since it started 1s later, it has to wait 3s - 1s = 2s, which is less than the default 5s, so both transactions finish successfully. The log demonstrates that:
1540307132.2834115 first: trying to insert ...
1540307133.2811155 second: trying to insert ...
1540307135.2912169 first: done
1540307135.3217440 second: done
Conclusion
The time needed for the transaction to finish is significantly smaller (milliseconds) than the lock time limit (5s), so in this scenario the script indeed meets the expectations. But as #HarlyH. commented, the transactions wait in a queue to be committed, so for a heavily used or very large database this is not a good solution, since communication with the database will become slow.

Cassandra CQL UPDATE with IF

Newbie here (and it seems like it might be a newbie question).
Using Ubuntu 14.04 with a fresh install of Cassandra 2.1.1, CQL 3.2.0 (it says).
Writing a back-end database for a CherryPy site, initially as a session database.
I've come up with a scheme for a kind of 'row locking' as a session lock, but it doesn't seem to be hanging together, so I've reduced it to a simple test program running against a local Cassandra instance. To run this test, I open two terminal windows to run two python instances of it at the same time, each with different instance numbers ('1' and '2').
import time, sys, os, cassandra
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

instance = sys.argv[1]

cluster = Cluster(auth_provider=PlainTextAuthProvider(username='cassandra', password='cassandra'))
cdb = cluster.connect()
cdb.execute("CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 1}")
cdb.execute("CREATE TABLE IF NOT EXISTS test.test ( id text primary key, val int, lock text )")
cdb.execute("INSERT INTO test.test (id, val, lock) VALUES ('session_id1', 0, '') ")

raw_input('<Enter> to start ... ')

i = 0
while i < 10000:
    i += 1
    # set lock
    while True:
        r = cdb.execute("UPDATE test.test SET lock = '%s' WHERE id = 'session_id1' IF lock = '' " % instance)
        if r[0].applied == True:
            break
    # check lock and increment val
    s0 = cdb.execute("SELECT val,lock FROM test.test WHERE id = 'session_id1' ")[0]
    if s0.lock != instance:
        print 'error: instance [%s] %s %s' % (instance, s0, r[0])
    cdb.execute("UPDATE test.test SET val = %s WHERE id = 'session_id1'", (s0.val + 1,))
    # clear lock
    cdb.execute("UPDATE test.test SET lock = '' WHERE id = 'session_id1' ")
    time.sleep(.01)
So if I understand correctly, the UPDATE..IF should be 'applied' (and the break taken) only if the existing value of lock is '' (an empty string), so this should give an effective exclusive lock on the row.
The problem is that the 's0.lock != instance' test quite frequently fires, showing that despite the UPDATE being applied, the value of lock afterwards is variously still '' or that of the other instance...
I know that when I roll out to a cluster I'm going to have to manage consistency issues, but this is against a single local Cass instance - surely consistency shouldn't be a problem here?
I can't imagine this CQL form is broken (tm), so it must be me. What am I doing wrong, or what is it I don't understand? TIA.
UPDATE: Ok, I googled a lot on this before I posted here, and now have spent the day since posting doing the same.
In particular, the stackoverflow posting Cassandra Optimistic Locking is addressing a similar issue (for a different reason), and his solution was:
"update table1 set version_num = 5 where name = 'abc' if version_num = 4"
which he says works for him, and which is essentially exactly what I am doing, but it isn't working for me.
So I believe my approach to be sound, but clearly I have a problem.
Are there any environmental issues that could be affecting me? (installation, pythonic, whatever...)
Unsatisfactory work-around found
After trying a lot of variations of the test code (above), I have come to the view that, around 5% of the time, the statement:
"UPDATE test.test SET lock = '%s' WHERE id = 'session_id1' IF lock = '' "
finds that lock is '' (an empty string) but actually fails to write the value to lock, while nevertheless returning 'applied=True'.
By way of further testing, I modified that test code as follows:
# set lock
while True:
    r = cdb.execute("UPDATE test.test SET lock = '%s' WHERE id = 'session_id1' IF lock = '' " % instance)
    if r[0].applied == True:
        s = cdb.execute("SELECT lock FROM test.test WHERE id = 'session_id1' ")
        if s[0].lock == instance:
            break
# check lock and increment val
(etc)
... so this code now confirms that the lock had been applied, and if not, it goes back to try again...
So this is:
1) Horrible
2) Kludgy
3) Inefficient
4) Totally reliable (the only thing that really matters to me)
I've tested this on the 'single local Cassandra instance', and the main point is that the incrementing of the 'val' column that the lock is supposed to be protecting, does reach the proper terminating value (20000 with the code as above).
I've also tested it on a 2-node cluster with a replication factor of 2, with one instance of the test code running on each node, and that works too (although the "UPDATE ... IF" statement, now with a consistency of QUORUM, occasionally returns:
exception - code=1100 [Coordinator node timed out waiting for replica nodes' responses]\
message="Operation timed out - received only 1 responses." \
info={'received_responses': 1, 'required_responses': 2, 'write_type': 5, 'consistency': 8}
... that needs careful handling, as it appears that the lock has always been set, despite not having received all of the replies... and it cannot be retried, as the operation isn't idempotent...)
So I clearly haven't fixed the underlying problem, and although I have fixed the symptom, I would still appreciate a more thorough insight into what is happening...
I'd appreciate any feedback (but at least I can make progress again). TIA
So I've had some communication with Tyler Hobbs (Datastax), and in a nutshell:
"The correct functioning of the mechanism that provides the atomic test-and-set facility (via LightWeight Transactions) depends upon using the same mechanism to clear the lock."
... so I need to use a similar 'IF' construct to clear it, even though I already know the contents...
# clear lock
cdb.execute( "UPDATE test.test SET lock = '' WHERE id = 'session_id1' IF lock = '%s'" % instance)
... and that works.
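Since this conditional clear goes through the same lightweight-transaction path, its result can be checked in the same way as the statement that sets the lock; a small sketch following the r[0].applied pattern from the test code:
# clear lock, and notice if it did not actually apply
r = cdb.execute("UPDATE test.test SET lock = '' WHERE id = 'session_id1' IF lock = '%s'" % instance)
if not r[0].applied:
    print 'warning: instance [%s] could not clear its own lock' % instance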

No data with QProcess.waitForReadyRead(): inconsistent behavior

I experienced this bug in a more complex application I am developing. I execute a Python server program, want to read the first data available, and then close it. I do that to validate some settings from the server.
My problem boils down to:
QProcess.waitForReadyRead() doesn't return True and times out, even though it's supposed to return True very quickly.
It used to work. I rolled back to an older revision to try to find what caused this to break, but the problem is always there now. I really tried everything I could think of, so I want to know whether this is a known problem or something caused by my environment.
This is the test I wrote to show the problem. When I execute it, the first 3 checks return the data immediately, but the last one times out and I get no data.
This is certainly not logical. In the test I used a wait; in my server it's just a select-like function, implemented with standard Python modules.
from PyQt4 import QtCore

# FILE: 1.py
#   print 'TEST'
#
# FILE: 2.py
#   import time
#   print 'TEST'
#   time.sleep(100)
#
# FILE: 1.sh
#   echo 'TEST'
#
# FILE: 2.sh
#   echo 'TEST'
#   sleep 100
proc0= QtCore.QProcess()
proc0.start('sh', ['./1.sh'])
proc0.waitForStarted()
proc0.waitForReadyRead(10000)
output0 = proc0.readAll()
proc1= QtCore.QProcess()
proc1.start('sh', ['./2.sh'])
proc1.waitForStarted()
proc1.waitForReadyRead(10000)
output1 = proc1.readAll()
proc2= QtCore.QProcess()
proc2.start('python', ['./1.py'])
proc2.waitForStarted()
proc2.waitForReadyRead(10000)
output2 = proc2.readAll()
proc3= QtCore.QProcess()
proc3.start('python', ['./2.py'])
proc3.waitForStarted()
proc3.waitForReadyRead(10000)
output3 = proc3.readAll()
print "0"
print output0.size()
print repr(output0.data())
print "1"
print output1.size()
print repr(output1.data())
print "2"
print output2.size()
print repr(output2.data())
print "3"
print output3.size()
print repr(output3.data())
proc0.close()
proc1.close()
proc2.close()
proc3.close()
Is the last test (proc3) supposed to behave like I described? Is there a workaround or a fix that would let me read the data from stdout in my Python server...? What is it?
It's a comment but...
I found the solution: Python's print doesn't flush stdout and waits for a certain amount of data before actually pushing it to stdout. sys.stdout.flush() fixed it.
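Applied to the 2.py helper used in the test above, the fix would look roughly like this:
# 2.py, with an explicit flush so QProcess receives the data before the sleep
import sys, time
print 'TEST'
sys.stdout.flush()
time.sleep(100)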
Hope it helps.

Python's MySqlDB not getting updated row

I have a script that waits until some row in a db is updated:
con = MySQLdb.connect(server, user, pwd, db)
When the script starts the row's value is "running", and it waits for the value to become "finished"
while(True):
    sql = '''select value from table where some_condition'''
    cur = self.getCursor()
    cur.execute(sql)
    r = cur.fetchone()
    cur.close()
    res = r['value']
    if res == 'finished':
        break
    print res
    time.sleep(5)
When I run this script it hangs forever. Even though I see the value of the row has changed to "finished" when I query the table, the printout of the script is still "running".
Is there some setting I didn't set?
EDIT: The python script only queries the table. The update to the table is carried out by a tomcat webapp, using JDBC, that is set on autocommit.
This is an InnoDB table, right? InnoDB is a transactional storage engine. Setting autocommit to true will probably fix this behavior for you.
conn.autocommit(True)
Alternatively, you could change the transaction isolation level. You can read more about this here:
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html
The reason for this behavior is that inside a single transaction the reads need to be consistent. All consistent reads within the same transaction read the snapshot established by the first read. Even if your script only reads the table, this is considered a transaction too. This is the default behavior in InnoDB and you need to change that or run conn.commit() after each read.
This page explains this in more details: http://dev.mysql.com/doc/refman/5.0/en/innodb-consistent-read.html
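A minimal sketch of the polling loop with a commit after each read, using the con connection from the question (r[0] here assumes a plain cursor rather than a dict cursor):
while True:
    cur = con.cursor()
    cur.execute("select value from table where some_condition")
    r = cur.fetchone()
    cur.close()
    con.commit()   # end the read-only transaction so the next iteration gets a fresh snapshot
    if r[0] == 'finished':
        break
    print r[0]
    time.sleep(5)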
I worked around this by running
c.execute("""set session transaction isolation level READ COMMITTED""")
early on in my reading session. Updates from other threads do come through now.
In my instance I was keeping connections open for a long time (inside mod_python) and so updates by other processes weren't being seen at all.
