I'm making a simple Python script that checks a MySQL table every x seconds and prints the result to the console. I use the MySQL Connector driver.
However, running the script only prints the initial values. By that I mean that if I change the values in the database while my script is running, the script doesn't register the change and keeps writing the initial values.
The code which retrieves the values in a while loop is as follows:
    def get_fans():
        global cnx
        query = 'SELECT * FROM config'
        while True:
            cursor = cnx.cursor()
            cursor.execute(query)
            for (config_name, config_value) in cursor:
                print config_name, config_value
            print "\n"
            cursor.close()
            time.sleep(3)
Why is this happening?
Most likely it's an autocommit issue. The MySQL Connector driver documentation states that autocommit is turned off by default. Make sure you commit your implicit transactions when changing your table. Also, because the default isolation level is REPEATABLE READ, you have:
All consistent reads within the same transaction read the snapshot
established by the first read.
So I guess you have to manage transactions even in your polling script, or change the isolation level to READ COMMITTED.
Though, the better way is to revert to the MySQL client default, autocommit-on mode. Even though PEP 249 recommends having it initially disabled, it's merely a proposal and most likely a serious design mistake. Not only does it make novices wonder about uncommitted changes and make even read-only workloads slower, it also complicates data-layer design and breaks the "explicit is better than implicit" Zen of Python. Even sluggish things like Django have rethought it.
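To make the "manage transactions in your polling script" option concrete, here is a minimal sketch in Python 3 syntax (the question uses Python 2), assuming autocommit is off and the session runs at REPEATABLE READ as described above. The names poll_table, rounds, interval and sleep are made up for illustration; cnx can be any DB-API 2.0 connection, such as the one returned by mysql.connector.connect(). The only real change from the loop in the question is the commit() after each read.

```python
import time


def poll_table(cnx, query, rounds, interval=3.0, sleep=time.sleep):
    """Run `query` `rounds` times and return the rows seen in each round.

    Assumption: `cnx` is a DB-API 2.0 connection with autocommit off.
    """
    results = []
    for i in range(rounds):
        cursor = cnx.cursor()
        try:
            cursor.execute(query)
            results.append(cursor.fetchall())
        finally:
            cursor.close()
        # End the implicit transaction. The next SELECT then starts a new
        # transaction and reads a fresh snapshot instead of the old one.
        cnx.commit()
        if i < rounds - 1:
            sleep(interval)
    return results
```

Alternatively, setting cnx.autocommit = True right after connecting (that is how MySQL Connector/Python spells it) should make the explicit commit unnecessary.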
Related
I'm kind of new to Python and its MySQLdb connector.
I'm writing an API to return some data from a database using the RESTful approach. In PHP, I wrapped the Connection management part in a class, acting as an abstraction layer for MySQL queries.
In Python:
I define the connection early on in the script: con = mdb.connect('localhost', 'user', 'passwd', 'dbname')
Then, in all subsequent methods:
    import MySQLdb as mdb

    def insert_func():
        with con:
            cur = con.cursor(mdb.cursors.DictCursor)
            cur.execute("INSERT INTO table (col1, col2, col3) VALUES (%s, %s, %s)", (val1, val2, val3))
            rows = cur.fetchall()
            # do something with the results
            return someval
etc.
I use mdb.cursors.DictCursor because I prefer to be able to access database columns in an associative array manner.
Now the problems start popping up:
in one function, I issue an insert query to create a 'group' with unique 'groupid'.
This 'group' has a creator. Every user in the database holds a JSON array in the 'groups' column of his/her row in the table.
So when I create a new group, I want to assign the groupid to the user that created it.
I update the user's record using a similar function.
I've wrapped the 'insert' and 'update' parts in two separate function defs.
The first time I run the script, everything works fine.
The second time I run the script, the script runs endlessly (I suspect due to some idle connection to the MySQL database).
When I interrupt it using CTRL + C, I get one of the following errors:
"'Cursor' object has no attribute 'connection'"
"commands out of sync; you can't run this command now"
or any other KeyboardInterrupt exception, as would be expected.
It seems to me that these errors are caused by some erroneous way of handling connections and cursors in my code.
I read it was good practice to use with con: so that the connection will automatically close itself after the query. I use 'with' on 'con' in each function, so the connection is closed, but I decided to define the connection globally, for any function to use it. This seems incompatible with the with con: context management. I suspect the cursor needs to be 'context managed' in a similar way, but I do not know how to do this (To my knowledge, PHP doesn't use cursors for MySQL, so I have no experience using them).
I now have the following questions:
Why does it work the first time but not the second? (it will however, work again, once, after the CTRL + C interrupt).
How should I go about using connections and cursors when using multiple functions (that can be called upon in sequence)?
I think there are two main issues here: one is in the Python code, and the other is the structure of how you're interacting with your DB.
First, you're not closing your connection. This depends on your application's needs: you have to decide how long it should stay open. See this SO question
    from contextlib import closing

    with closing(connection.cursor()) as cursor:
        pass  # ... use the cursor ...
    # cursor closed. Guaranteed.
    connection.close()
Right now, you have to interrupt your program with Ctrl+C because there's no reason for your with statement to stop running.
Second, start thinking about your interactions with the DB in terms of 'transactions'. Do something and commit it to the DB; if it didn't work, roll back; once you're done, close the connection. Here's a tutorial.
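The commit / rollback / close sequence can be wrapped once and reused from every function. This is only a sketch in Python 3 syntax: db_transaction is a made-up name, and connect is assumed to be a zero-argument callable you supply that returns a new DB-API connection (for example a small function wrapping mdb.connect(...) with your own credentials).

```python
from contextlib import closing, contextmanager


@contextmanager
def db_transaction(connect):
    """Open a connection, yield a cursor, then commit or roll back and close.

    `connect` is a hypothetical zero-argument factory for a DB-API connection.
    """
    conn = connect()
    try:
        with closing(conn.cursor()) as cursor:
            yield cursor
        # Reached only if the body of the `with` block raised nothing.
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        # Runs on success, on errors and on Ctrl+C alike.
        conn.close()
```

Each function would then do its work inside "with db_transaction(connect) as cur:" and never touch a global connection. Opening a connection per call costs a little, so for a loop you may prefer to open once outside the loop, as suggested below.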
With connections, as with file handles, the rule of thumb is open late, close early.
So I would recommend sharing a connection only among pieces of code that are doing one thing together. If you use multiple processes, each process gets its own connection, again following open late, close early. And if you are doing sequential operations (say in a loop), open and close outside the loop. Having global connections can get messy, mainly because you now have to keep track of which function uses the connection at what time, and what it tries to do with it.
The issue of "cannot run command now", is because your keyboard interrupt kills the active connection.
As to part one of your question - endlessly could be anywhere. Each instance of python will get its own connection. So when you run it the second time it should get its own connection. Open up a mysql client and do
show full processlist
to see what's going on.
The situation is detailed in my previous question:
MySQLdb is caching SELECT results?
In short:
python 2.7 + MySQLdb
the "issue" happens inside a Python script (but not from the mysql client prompt)
when querying a SELECT inside a loop, the first result is repeated for all subsequent iterations of the loop
this happens even though another program updates the DB (and commits).
I can see the changes from mysql clients, but not from my python loop.
SQL_NO_CACHE didn't fix it
recreating a cursor didn't help!
autocommit(True) worked --> each query reflects DB change.
So why does MySQLdb think it's inside a transaction, when that's clearly irrelevant?
I am writing a program in Python which interacts with a MySQL database.
For sql queries I use MySQLdb.
The problem is that fetchone() returns None but with the database browser I can see that that row exists.
This piece of code:
    query = "SELECT * FROM revision WHERE rev_id=%s;"
    cursor.execute(query % revision_id)
    row = cursor.fetchone()
    if row == None:
        raise Exception("there isn't revision with id %s" % revision_id)
I have no idea what is going on here. Any ideas?
EDIT: okay, in some cases it works and in some cases it doesn't, but when it does not work the row still exists in the table. I am passing a cursor object to a function and the code above is in the function. The problem is connected with this cursor object. Could the problem be that I pass the cursor as an argument to the function? How can I test it?
EDIT2: yes, the problem is that the cursor does not work after I use it several times, whether because another program connects to the DB or because I am doing something wrong.
I have a while loop in which I call a function to get info from the DB. After some iterations it stops working. There is another program which writes to the DB while the while loop is running.
Okay, db.autocommit(True) solved my problem.
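A note on spelling, since it differs between drivers: db.autocommit(True) is the MySQLdb form (a method), while MySQL Connector/Python exposes autocommit as a property that you assign to. The helper below is a rough sketch, in Python 3 syntax, that handles both; enable_autocommit is a made-up name, and the property check is a heuristic I'm assuming is good enough for these two drivers, not something either library documents.

```python
def enable_autocommit(conn):
    """Turn autocommit on for a MySQLdb-style or Connector-style connection."""
    attr = getattr(type(conn), "autocommit", None)
    if isinstance(attr, property):
        # MySQL Connector/Python style: autocommit is a settable property.
        conn.autocommit = True
    else:
        # MySQLdb style: autocommit is a method taking a boolean.
        conn.autocommit(True)
```

With autocommit on, every SELECT runs in its own transaction, so each query sees the latest committed data.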
This is related to the transaction isolation level on your MySQL server. With REPEATABLE READ, which is the default level for InnoDB, a snapshot is created at the time of the first read, and subsequent reads in the same transaction are served from this snapshot. Read more about isolation levels here
What we usually need when reusing the same connection to run a query repeatedly is READ COMMITTED. Thankfully, if you cannot change this on your SQL server, you can set the isolation level for your own session.
cur = conn.cursor()
cur.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED")
This makes sure that every query you run reads a fresh snapshot of the latest committed data.
Best practice is to commit after all your queries have executed: db.commit()
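Applied to the revision lookup from the question, the "commit when you're done" advice could look like the sketch below (Python 3 syntax). fetch_revision is a made-up name, and I'm assuming conn is a MySQLdb or MySQL Connector connection, both of which accept %s placeholders with a parameter tuple. It also passes revision_id as a parameter instead of formatting it into the SQL string.

```python
def fetch_revision(conn, revision_id):
    """Return the first row matching `revision_id`, or None if there is none."""
    cur = conn.cursor()
    try:
        # Let the driver quote the value rather than using the % operator.
        cur.execute("SELECT * FROM revision WHERE rev_id = %s", (revision_id,))
        rows = cur.fetchall()
    finally:
        cur.close()
    # End the transaction so the next call reads freshly committed data.
    conn.commit()
    return rows[0] if rows else None
```

Taking the connection rather than a long-lived cursor as the argument makes it harder to end up stuck inside one old transaction by accident.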
I have a small issue (for lack of a better word) with a MySQL DB. I am using Python.
So I have this table in which rows are inserted regularly. As regularly as 1 row /sec.
I run two Python scripts together. One that simulates the insertion at 1 row/sec. I have also turned autocommit off and explicitly commit after some number of rows, say 10.
The other script is a simple "SELECT count(*) ..." query on the table. This query doesn't show me the number of rows the table currently has. It is stubbornly stuck at whatever number of rows the table had initially when the script started running. I have even tried "SELECT SQL_NO_CACHE count(*) ..." to no effect.
Any help would be appreciated.
My guess is you're using InnoDB with the REPEATABLE READ isolation level. Try setting the isolation level to READ COMMITTED:
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED
Another way is starting a new transaction every time you perform a select query. Read more here
If autocommit is turned off in the reader as well, then it will be doing the reads inside a transaction and thus not seeing the writes the other script is doing.
My guess is that either the reader or writer (most likely the writer) is operating inside a transaction which hasn't been committed. Try ensuring that the writer is committing after each write, and try a ROLLBACK from the reader to make sure that it isn't inside a transaction either.
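For the reader script, the "try a ROLLBACK from the reader" suggestion can be sketched like this in Python 3 syntax. fresh_count is a made-up name, conn is assumed to be any DB-API connection with autocommit off, and count_query is whatever "SELECT count(*) ..." statement you are already running.

```python
def fresh_count(conn, count_query):
    """Run a COUNT(*) query outside of any previously opened transaction."""
    # Drop the current transaction, and with it the old snapshot.
    conn.rollback()
    cur = conn.cursor()
    try:
        cur.execute(count_query)
        (count,) = cur.fetchone()
    finally:
        cur.close()
    return count
```

Since the reader never writes anything, rolling back is harmless and has the same effect on the snapshot as a commit would.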
I use this python code to output the number of Things every 5 seconds:
    def my_count():
        while True:
            print "Number of Things: %d" % Thing.objects.count()
            time.sleep(5)

    my_count()
If another process generates a new Thing while my_count() is running, my_count() will keep printing the same number, even though it now has changed in the database. (But if I kill my_count() and restart it, it will display the new Thing count.)
Things are stored in a MySQL InnoDB database, and this code runs on Ubuntu.
Why won't my_count() display the new Thing.objects.count() without being restarted?
Because the Python DB API is in AUTOCOMMIT=OFF mode by default and, at least for MySQLdb, at the REPEATABLE READ isolation level. This means that behind the scenes you have an ongoing database transaction (InnoDB is a transactional engine) in which the first access to a given row (or maybe even table, I'm not sure) fixes the "view" of this resource for the rest of the transaction.
To prevent this behaviour, you have to 'refresh' the current transaction:
    from django.db import transaction

    @transaction.autocommit
    def my_count():
        while True:
            transaction.commit()
            print "Number of Things: %d" % Thing.objects.count()
            time.sleep(5)
Note that the transaction.autocommit decorator is only for entering transaction management mode (this could also be done manually using the transaction.enter_transaction_management/leave_transaction_management functions).
One more thing to be aware of: Django's autocommit is not the same as the autocommit you have in the database; it's completely independent. But this is out of scope for this question.
Edited on 22/01/2012
Here is a "twin answer" to a similar question.