MySQLDB query not returning all rows - python

I am trying to do a simple fetch using MySQLDB in Python.
I have 2 tables(Accounts & Products). I have to look up Accounts table, get acc_id from it & query the Products table using it.
The Products tables has more than 10 rows. But when I run this code it randomly returns between 0 & 6 rows each time I run it.
Here's the code snippet:
# Set up connection
con = mdb.connect('db.xxxxx.com', 'user', 'password', 'mydb')
# Create cursor
cur = con.cursor()
# Execute query
cur.execute("SELECT acc_id FROM Accounts WHERE ext_acc = '%s'" % account_num ) # account_num is alpha-numberic and is got from preceding part of the program
# A tuple is returned, so get the 0th item from it
acc_id = cur.fetchone()[0]
print "account_id = ", acc_id
# Close the cursor - I was not sure if I can reuse it
cur.close()
# Reopen the cursor
cur = con.cursor()
# Second query
cur.execute("SELECT * FROM Products WHERE account_id = %d" % acc_id)
keys = cur.fetchall()
print cur.rowcount # This prints incorrect row count
for key in keys: # Does not print all rows. Tried to directly print keys instead of iterating - same result :(
print key
# Closing the cursor & connection
cur.close()
con.close()
The weird part is, I tried to step through the code using a debugger(PyDev on Eclipse) and it correctly gets all rows(both the value stored in the variable 'keys' as well as console output are correct).
I am sure my DB has correct data since I ran the same SQL on MySQL console & got the correct result.
Just to be sure I was not improperly closing the connection, I tried using with con instead of manually closing the connection and it's the same result.
I did RTM but I couldn't find much in it to help me with this issue.
Where am I going wrong?
Thank you.
EDIT: I noticed another weird thing now. In the line
cur.execute("SELECT * FROM Products WHERE account_id = %d" % acc_id), I hard-coded the acc_id value, i.e made it
cur.execute("SELECT * FROM Products WHERE account_id = %d" % 322) and it returns all rows

This is not actually an answer, just an attempt to gather together all of the information from a chat with RBK that ruled out a bunch of potential problems, but still didn't come up with an explanation or a solution, in hopes that someone else can spot the problem or think of something else to try.
It's clearly something in this line:
cur.execute("SELECT * FROM Products WHERE account_id = %d" % acc_id)
Especially since putting 322 in place of acc_id fixes everything. (As proven below.)
There are actually two problems with that line, which could be getting in the way. You always want to use DB-API binding rather than string formatting (and the equivalent in any other language), to avoid SQL injection attacks, for correctness of escaping/conversion/etc., and for efficiency. Also, both DB-ABI binding and string formatting require a tuple of arguments, not a single argument. (For legacy reasons, a single argument often works, but sometimes it doesn't, and then it's just confusing to debug… better not to do it.) So, this should be:
cur.execute("SELECT * FROM Products WHERE account_id = %d", (acc_id,))
Unfortunately, after discussing this in chat, and having you try a bunch of things, we were unable to find what's actually wrong here. Summarizing what we tried:
So then, we tried:
cur.execute("SELECT COUNT(*) FROM Devices WHERE account_id = %s" , (333,))
print cur.fetchone()[0]
print 'account id =', acc_id
print type(acc_id)
cur.execute("SELECT COUNT(*) FROM Devices WHERE account_id = %s" , (acc_id,))
print cur.fetchone()[0]
The output was:
10
account id = 333
<type 'long'>
2
When run repeatedly, the last number varies from 0-6, while the first is always 10. There's no way using acc_id could be different from using 333, and yet it is. And just in case one query was somehow "infecting" the next one, without the first two lines, the rest works the same way.
So, there's no way using acc_id could possibly be different than using 333. And yet, it is.
At some point during the chat we apparently moved from Products to Devices, and from 322 to 333, but regardless, the tests shown above were definitely done exactly as shown, and returned different results.
Maybe he has a buggy or badly-installed version of MySQLDb. He's going to try looking for a newer version, or one of the other Python MySQL libraries, and see if it makes a difference.
My next best guess at this point is that RBK has inadvertently angered some technologically-sophisticated god of mischief, but I can't even think of one of those off the top of my head.

I sort of figured out the problem. It was silly at the end. It was a race condition!
This is how my actual code was organized :
Code Block 1
{code which calls an API which creates an entry in Accounts table &
Creates corresponding entries in Product table(10 entries)}
......
Code Block2
{The code I had posted in my question}
The problem was that the API(called in Code Block 1) took a few seconds to add 10 entries into the Product table.
When my code(Code Block 2) ran a fetch query, all the 10 rows were not added and hence fetched somewhere between 0 to 6 rows(how much ever was added at that time).
What I did to solve this was made the code sleep for 5 seconds before I did the SQL queries:
Code Block 1
time.sleep(5)
Code Block 2
The reason why it worked when I hard coded the acc_id was that, the acc_id which I hard-coded was from a precious execution(each run returns a new acc_id).
And the reason why it worked while stepping through a debugger was that manually stepping acted like giving it a sleep time.
It is a lesson for me to know a little about the inside working of APIs(even though they are supposed to be like a black box) and think about race conditions like this, the next time I come across similar issues.

Related

How to format postgreSQL queries in a python script for better readability?

I have another question that is related to a project I am working on for school. I have created a PostgreSQL database, with 5 tables and a bunch of rows. I have created a script that allows a user to search for information in the database using a menu, as well as adding and removing content from one of the tables.
When displaying a table in PostgreSQL CLI itself, it looks pretty clean, however, whenever displaying even a simple table with no user input, it looks really messy. While this is an optional component for the project, I would prefer to have something that looks a little cleaner.
I have tried a variety of potential solutions that I have seen online, even a few from stack overflow, but none of them work. Whenever I try to use any of the methods I have seen and somewhat understand, I always get the error:
TypeError: 'int' object is not subscriptable
I added a bunch of print statements in my code, to try and figure out why it refuses to typecast. It is being dumb. Knowing me it is probably a simple typo that I can't see. Not even sure if this solution will work, just one of the examples I saw online.
try:
connection = psycopg2.connect(database='Blockbuster36', user='dbadmin')
cursor = connection.cursor()
except psycopg2.DatabaseError:
print("No connection to database.")
sys.exit(1)
cursor.execute("select * from Customer;")
tuple = cursor.fetchone()
List_Tuple = list(tuple)
print("Customer_ID | First_Name | Last_Name | Postal_Code | Phone_Num | Member_Date")
print(List_Tuple)
print()
for item in List_Tuple:
print(item[0]," "*(11-len(str(item[0]))),"|")
print(item)
print(type(item))
print()
num = str(item[0])
print(num)
print(type(num))
print(str(item[0]))
print(type(str(item[0])))
cursor.close()
connection.close()
I uploaded the difference between the output I get through a basic python script and in the PostgreSQL CLI. I have blocked out names in the tables for privacy reasons. https://temporysite.weebly.com/
It doesn't have to look exactly like PostgreSQL, but anything that looks better than the current mess would be great.
Use string formatting to do that. You can also set it to pad right or left.
As far as the dates use datetime.strftime.
The following would set the padding to 10 places:
print(”{:10}|{:10}".format(item[0], item[1]))

Python psycopg2 cursor.fetchall() returns empty list but cursor.rowcount is > 1

I am getting an issue here:
conn = psycopg2.connect(conn_string)
cursor = conn.cursor()
sql = """
SELECT DISTINCT (tenor_years)
FROM bond_pnl
WHERE country = '%s'
""" % country
cursor.execute(sql)
print(cursor.fetchall())
print(cursor.rowcount)
It gives the following output:
[]
11
which means that cursor.rowcount is 11 but cursor.fetchall() is empty list. I have already tried doing this:
conn.set_session(readonly=True, autocommit=True)
and this solution as well :Click to see
Any help regarding this will be appreciated.
EDIT: Just came across another thing, this code when executed first time, works fine. But executing it again(second, third, ...n execution) gives the above behavior.
I also faced the same issue. I figured out that,
Might be while debugging, we are allowing some fraction of time after connection has been made
#conn = psycopg2.connect(conn_string)
#cursor = conn.cursor()
By the time we hit the execution button for the next line (which contains the query), database is timed out and it is returning empty list.
If anyone has any other logic for why this is happening please do share.
After trying different solution, I have figured out that the problem described in the question arises when I execute it in "Debugging Mode" in pycharm. But on the other hand if I execute the code in "Run Mode" in pycharm , it returns the right expected output (a list with 11 elements):
[a,b,c,d,e,f,g,h,i,j,k]
Not sure about the exact reason for it but somehow the cursor was breaking somewhere when run in "Debugging Mode".
If anyone describes the exact reason, It ll be highly appreciated.

2 Try / Except statement doesn't work in normal order but works when codes are flipped

Good day guys, I hope to get a little advice on this. I can't seem to get this 2 TRY/EXCEPT statement to run in the order I want them to. They however, work great if I put STEP2 first then STEP1.
This current code prints out only.
Transferred: x rows.
If flipped, they print both.
Unfetched: x rows.
Transferred: x rows.
I tried:
Assigning individual cur.close() and db.commit() as per the examples
here, didn't work either. (Side question: Should I be closing/committing
them individually nevertheless? Is that a general good practice or
context-based?)
Using a cur.rowcount method for Step 2 as well as I thought maybe the
problem was on the SQL side but, the problem still persists.
Did a search on SO and couldn't find any similar case.
Running on Python 2.7. Code:
import MySQLdb
import os
#Initiate connection to database.
db = MySQLdb.connect(host="localhost",user="AAA",passwd="LETMEINYO",db="sandbox")
cur = db.cursor()
#Declare variables.
viewvalue = "1"
mainreplace = (
"INSERT INTO datalog "
"SELECT * FROM cachelog WHERE viewcount = %s; "
"DELETE FROM cachelog WHERE viewcount = %s; "
% (viewvalue, viewvalue)
)
balance = (
"SELECT COUNT(*) FROM cachelog "
"WHERE viewcount > 1"
)
#STEP 1: Copy and delete old data then print results.
try:
cur.execute(mainreplace)
transferred = cur.rowcount
print "Transferred: %s rows." %(transferred)
except:
pass
#STEP 2: Check for unfetched data and print results.
try:
cur.execute(balance)
unfetched = cur.fetchone()
print "Unfetched: %s rows." % (unfetched)
except:
pass
#Confirm and close connection.
cur.close()
db.commit()
db.close()
Pardon any of my un-Pythonic ways as I am still very much a beginner. Any advice is much appreciated, thank you!
You have two blaring un-Pythonic bits of code: the use of a bare except: without saying which exception you want to catch, and using pass in that except block so the exception is entirely ignored!
The problem with code like that is that if something goes wrong, you'll never see the error message, so you can't find out what's wrong.
The problem is perhaps that your "mainreplace" query deletes everything from the "cachelog" table, so the "balance" query after it has no rows, so fetchone() fails, throws an exception and the line after it is never executed. Or maybe something completely different, hard to tell from here.
If you didn't have that try/except there, you would have had a nice error message and you wouldn't have had to ask this question.

Python SQlite ORDER BY command doesn't work

i just started out with programmming and wrote a few lines of code in pyscripter using sqlite3.
The table "gather" is created beforehand. i then select certain rows from "gather" to put them into another table. i try to sort this table by a specific column 'date'. But it doesn't seem to work. it doesn't give me an error message or something like that. It's just not sorted. If i try the same command (SELECT * FROM matches ORDER BY date) in sqlitemanager, it works fine on the exact same table! what is the problem here? i googled quite some time, but i don't find a solution. it's proobably something stupid i'm missing..
as i said i'm a total newbie. i guess you all break out in tears looking at the code. so if you have any tips how i can shorten the code or make it faster or whatever, you're very welcome :) (but everything works fine except the above mentioned part.)
import sqlite3
connection = sqlite3.connect("gather.sqlite")
cursor1 = connection.cursor()
cursor1.execute('Drop table IF EXISTS matches')
cursor1.execute('CREATE TABLE matches(date TEXT, team1 TEXT, team2 TEXT)')
cursor1.execute('INSERT INTO matches (date, team1, team2) SELECT * FROM gather WHERE team1=? or team2=?, (a,a,))
cursor1.execute("SELECT * FROM matches ORDER BY date")
connection.commit()
OK, I think I understand your problem. First of all: I'm not sure if that commit call is necessary at all. However, if it is, you'll definitely want that to be before your select statement. 'connection.commit()' is essentially saying, commit the changes I just made to the database.
Your second issue is that you are executing the select query but never actually doing anything with the results of the query.
try this:
import sqlite3
connection = sqlite3.connect("gather.sqlite")
cursor1 = connection.cursor()
cursor1.execute('Drop table IF EXISTS matches')
cursor1.execute('CREATE TABLE matches(date TEXT, team1 TEXT, team2 TEXT)')
cursor1.execute('INSERT INTO matches (date, team1, team2) SELECT * FROM gather WHERE team1=? or team2=?, (a,a,))
connection.commit()
# directly iterate over the results of the query:
for row in cursor1.execute("SELECT * FROM matches ORDER BY date"):
print row
you are executing the query, but never actually retrieving the results. There are two ways to do this with sqlite3: One way is the way I showed you above, where you can just use the execute statement directly as an iteratable object.
The other way is as follows:
import sqlite3
connection = sqlite3.connect("gather.sqlite")
cursor1 = connection.cursor()
cursor1.execute('Drop table IF EXISTS matches')
cursor1.execute('CREATE TABLE matches(date TEXT, team1 TEXT, team2 TEXT)')
cursor1.execute('INSERT INTO matches (date, team1, team2) SELECT * FROM gather WHERE team1=? or team2=?, (a,a,))
connection.commit()
cursor1.execute("SELECT * FROM matches ORDER BY date")
# fetch all means fetch all rows from the last query. here you put the rows
# into their own result object instead of directly iterating over them.
db_result = cursor1.fetchall()
for row in db_result:
print row
Try moving the commit before the SELECT * (I'm not sure 100% that this is an issue) You then just need to fetch the results of the query :-) Add a line like res = cursor1.fetchall() after you've executed the SELECT. If you want to display them like in sqlitemanager, add
for hit in res:
print '|'.join(hit)
at the bottom.
Edit: To address your issue of storing the sort order to the table:
I think what you're looking for is something like a clustered index. (Which doesn't actually sort the values in th table, but comes close; see here).
SQLIte doesn't have such such indexes, but you can simulate them by actually ordering the table. You can only do this once, as you're inserting the data. You would need an SQL command like the following:
INSERT INTO matches (date, team1, team2)
SELECT * FROM gather
WHERE team1=? or team2=?
ORDER BY date;
instead of the one you currently use.
See point 4 here, which is where I got the idea.

pymssql ( python module ) unable to use temporary tables

This isn't a question, so much as a pre-emptive answer. (I have gotten lots of help from this website & wanted to give back.)
I was struggling with a large bit of SQL query that was failing when I tried to run it via python using pymssql, but would run fine when directly through MS SQL. (E.g., in my case, I was using MS SQL Server Management Studio to run it outside of python.)
Then I finally discovered the problem: pymssql cannot handle temporary tables. At least not my version, which is still 1.0.1.
As proof, here is a snippet of my code, slightly altered to protect any IP issues:
conn = pymssql.connect(host=sqlServer, user=sqlID, password=sqlPwd, \
database=sqlDB)
cur = conn.cursor()
cur.execute(testQuery)
The above code FAILS (returns no data, to be specific, and spits the error "pymssql.OperationalError: No data available." if you call cur.fetchone() ) if I call it with testQuery defined as below:
testQuery = """
CREATE TABLE #TEST (
[sample_id] varchar (256)
,[blah] varchar (256) )
INSERT INTO #TEST
SELECT DISTINCT
[sample_id]
,[blah]
FROM [myTableOI]
WHERE [Shipment Type] in ('test')
SELECT * FROM #TEST
"""
However, it works fine if testQuery is defined as below.
testQuery = """
SELECT DISTINCT
[sample_id]
,[blah]
FROM [myTableOI]
WHERE [Shipment Type] in ('test')
"""
I did a Google search as well as a search within Stack Overflow, and couldn't find any information regarding the particular issue. I also looked under the pymssql documentation and FAQ, found at http://code.google.com/p/pymssql/wiki/FAQ, and did not see anything mentioning that temporary tables are not allowed. So I thought I'd add this "question".
Update: July 2016
The previously-accepted answer is no longer valid. The second "will NOT work" example does indeed work with pymssql 2.1.1 under Python 2.7.11 (once conn.autocommit(1) is replaced with conn.autocommit(True) to avoid "TypeError: Cannot convert int to bool").
For those who run across this question and might have similar problems, I thought I'd pass on what I'd learned since the original post. It turns out that you CAN use temporary tables in pymssql, but you have to be very careful in how you handle commits.
I'll first explain by example. The following code WILL work:
testQuery = """
CREATE TABLE #TEST (
[name] varchar(256)
,[age] int )
INSERT INTO #TEST
values ('Mike', 12)
,('someone else', 904)
"""
conn = pymssql.connect(host=sqlServer, user=sqlID, password=sqlPwd, \
database=sqlDB) ## obviously setting up proper variables here...
conn.autocommit(1)
cur = conn.cursor()
cur.execute(testQuery)
cur.execute("SELECT * FROM #TEST")
tmp = cur.fetchone()
tmp
This will then return the first item (a subsequent fetch will return the other):
('Mike', 12)
But the following will NOT work
testQuery = """
CREATE TABLE #TEST (
[name] varchar(256)
,[age] int )
INSERT INTO #TEST
values ('Mike', 12)
,('someone else', 904)
SELECT * FROM #TEST
"""
conn = pymssql.connect(host=sqlServer, user=sqlID, password=sqlPwd, \
database=sqlDB) ## obviously setting up proper variables here...
conn.autocommit(1)
cur = conn.cursor()
cur.execute(testQuery)
tmp = cur.fetchone()
tmp
This will fail saying "pymssql.OperationalError: No data available." The reason, as best I can tell, is that whether you have autocommit on or not, and whether you specifically make a commit yourself or not, all tables must explicitly be created AND COMMITTED before trying to read from them.
In the first case, you'll notice that there are two "cur.execute(...)" calls. The first one creates the temporary table. Upon finishing the "cur.execute()", since autocommit is turned on, the SQL script is committed, the temporary table is made. Then another cur.execute() is called to read from that table. In the second case, I attempt to create & read from the table "simultaneously" (at least in the mind of pymssql... it works fine in MS SQL Server Management Studio). Since the table has not previously been made & committed, I cannot query into it.
Wow... that was a hassle to discover, and it will be a hassle to adjust my code (developed on MS SQL Server Management Studio at first) so that it will work within a script. Oh well...

Categories