MySQL: Get column name or alias from query - python

I'm not asking for the SHOW COLUMNS command.
I want to create an application that works similarly to HeidiSQL, where you can specify an SQL query that, when executed, returns a result set with rows and columns representing your query result. The column names in the result set should match the columns selected in your SQL query.
In my Python program (using MySQLdb) my query returns only the row and column results, but not the column names. In the following example the column names would be ext, totalsize, and filecount. The SQL would eventually come from outside the program.
The only way I can figure to make this work, is to write my own SQL parser logic to extract the selected column names.
Is there an easy way to get the column names for the provided SQL?
Next, I'll also need to know how many columns the query returns.
# Python
import sys
import MySQLdb
#===================================================================
# connect to mysql
#===================================================================
try:
    db = MySQLdb.connect(host="myhost", user="myuser", passwd="mypass", db="mydb")
except MySQLdb.Error, e:
    print "Error %d: %s" % (e.args[0], e.args[1])
    sys.exit(1)
#===================================================================
# query select from table
#===================================================================
cursor = db.cursor()
cursor.execute("""\
    select ext,
           sum(size) as totalsize,
           count(*) as filecount
    from fileindex
    group by ext
    order by totalsize desc;
""")
while 1:
    row = cursor.fetchone()
    if row is None:
        break
    print "%s %s %s\n" % (row[0], row[1], row[2])
cursor.close()
db.close()

cursor.description will give you a tuple of tuples, where element [0] of each inner tuple is the column header.
num_fields = len(cursor.description)
field_names = [i[0] for i in cursor.description]
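A minimal usage sketch (assuming the cursor from the question has already executed the query): print a header row built from those names, then the data rows in the same column order.
print("\t".join(field_names))                   # header row: ext, totalsize, filecount
for row in cursor.fetchall():
    print("\t".join(str(col) for col in row))   # each data row, same column order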

This is the same as thefreeman's answer, but done in a more Pythonic way using list and dictionary comprehensions:
columns = cursor.description
result = [{columns[index][0]:column for index, column in enumerate(value)} for value in cursor.fetchall()]
pprint.pprint(result)

Similar to @James' answer, a more Pythonic way can be:
fields = [field_md[0] for field_md in cursor.description]
result = [dict(zip(fields, row)) for row in cursor.fetchall()]
You can get a single column with a list comprehension over the result:
extensions = [row['ext'] for row in result]
or filter results using an additional if in the list comprehension:
large = [row for row in result if row['totalsize'] > 1024 and row['totalsize'] < 4096]
or accumulate values for filtered rows (reduce is a builtin in Python 2; in Python 3 it lives in functools):
totalTxtSize = reduce(
    lambda acc, row: acc + row['totalsize'],
    filter(lambda row: row['ext'].lower() == 'txt', result),
    0)

I think this should do what you need (it builds on the answer above). I'm sure there's a more Pythonic way to write it, but you should get the general idea.
cursor.execute(query)
columns = cursor.description
result = []
for value in cursor.fetchall():
    tmp = {}
    for (index, column) in enumerate(value):
        tmp[columns[index][0]] = column
    result.append(tmp)
pprint.pprint(result)

You could also use MySQLdb.cursors.DictCursor. This turns your result set into a Python list of Python dictionaries, although it uses a special cursor and is thus technically less portable than the accepted answer. Not sure about speed. Here's the original code, edited to use it.
#!/usr/bin/python -u
import sys
import MySQLdb
import MySQLdb.cursors
#===================================================================
# connect to mysql
#===================================================================
try:
    db = MySQLdb.connect(host='myhost', user='myuser', passwd='mypass', db='mydb',
                         cursorclass=MySQLdb.cursors.DictCursor)
except MySQLdb.Error, e:
    print 'Error %d: %s' % (e.args[0], e.args[1])
    sys.exit(1)
#===================================================================
# query select from table
#===================================================================
cursor = db.cursor()
sql = 'SELECT ext, SUM(size) AS totalsize, COUNT(*) AS filecount FROM fileindex GROUP BY ext ORDER BY totalsize DESC;'
cursor.execute(sql)
all_rows = cursor.fetchall()
print len(all_rows)  # How many rows are returned.
for row in all_rows:  # While loops always make me shudder!
    print '%s %s %s\n' % (row['ext'], row['totalsize'], row['filecount'])
cursor.close()
db.close()
Standard dictionary functions apply, for example len(all_rows[0]) to count the number of columns for the first row, list(all_rows[0]) for a list of column names (for the first row), etc. Hope this helps!
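For instance, a small sketch against the all_rows result above (the order of keys inside each dict isn't guaranteed by the driver):
first_row = all_rows[0]
print(len(first_row))     # number of columns in the result
print(list(first_row))    # the column names, e.g. ['ext', 'totalsize', 'filecount']
print(first_row['ext'])   # a single value looked up by column name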

This is only an add-on to the accepted answer:
def get_results(db_cursor):
    desc = [d[0] for d in db_cursor.description]
    results = [dotdict(dict(zip(desc, res))) for res in db_cursor.fetchall()]
    return results
where dotdict is:
class dotdict(dict):
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
This lets you access the values by column name much more easily.
Suppose you have a user table with columns name and email:
cursor.execute('select * from users')
results = get_results(cursor)
for res in results:
    print(res.name, res.email)

Something similar to the proposed solutions, only the result is JSON with column_header : value pairs for the db query (i.e. the sql).
cur = conn.cursor()
cur.execute(sql)
res = [dict((cur.description[i][0], value) for i, value in enumerate(row)) for row in cur.fetchall()]
Example JSON output:
[
    {
        "FIRST_ROW": "Test 11",
        "SECOND_ROW": "Test 12",
        "THIRD_ROW": "Test 13"
    },
    {
        "FIRST_ROW": "Test 21",
        "SECOND_ROW": "Test 22",
        "THIRD_ROW": "Test 23"
    }
]

Looks like MySQLdb doesn't actually provide a translation for that API call. The relevant C API call is mysql_fetch_fields, and there is no MySQLdb wrapper for it.

Try:
cursor.column_names
This is with MySQL Connector/Python; connector version:
mysql.connector.__version__
'2.2.9'
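For context, a hedged sketch of how that property is typically used with MySQL Connector/Python (the connection parameters are placeholders):
import mysql.connector

cnx = mysql.connector.connect(host="myhost", user="myuser",
                              password="mypass", database="mydb")
cur = cnx.cursor()
cur.execute("SELECT ext, SUM(size) AS totalsize, COUNT(*) AS filecount "
            "FROM fileindex GROUP BY ext ORDER BY totalsize DESC")
print(cur.column_names)   # ('ext', 'totalsize', 'filecount')
cur.close()
cnx.close()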

You can also do this to just get the field titles:
table = cursor.description
check = 0
for fields in table:
    for name in fields:
        if check < 1:
            print(name),
        check += 1
    check = 0

cursor.column_names is a nice and simple one.

column_names = cursor.field_names

Found an easy way of getting columns like in SQL, using pymysql and pandas:
import pymysql
import pandas as pd

db = pymysql.connect(host="myhost", user="myuser", passwd="mypass", db="mydb")
query = """SELECT ext,
                  SUM(size) as totalsize,
                  COUNT(*) as filecount
           FROM fileindex
           GROUP BY ext
           ORDER BY totalsize DESC;
        """
df = pd.read_sql_query(query, db)
The DataFrame will have the column names ext, totalsize, and filecount by default; no extra work is needed.
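A quick sketch of how you could inspect the result from there (assuming the df from the snippet above):
print(df.columns.tolist())              # ['ext', 'totalsize', 'filecount']
print(len(df.columns))                  # how many columns the query returned
records = df.to_dict(orient="records")  # optional: list of {column: value} dicts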

Related

How can I handle errors inside of a for loop inside of a cx_Oracle connection?

Here's a rundown of what I'd like to do: I have a list of table names, and I want to run SQL against an Oracle database to pull back the table name and row count for every table in my table list. However, not every table name in my list is necessarily in the database, which causes my code to throw a database error. What I would like to do, whenever I come to a table name that is not in the database, is create a dataframe that contains the table name and, instead of count(*), some text that says 'table not found' or similar. At the end of the loop I concatenate all of the dataframes into one dataframe. The overall goal here is to validate that certain tables exist and that they have the expected row counts.
query_list = []
df_List = []
connstr = '%s/%s@%s' % (username, password, server)
conn = cx_Oracle.connect(connstr)
with conn:
    query_list = ["SELECT '%s' as tbl, count(*) FROM %s." % (elm, database) + elm for elm in table_list]
    df_List = [pd.read_sql(elm, conn) for elm in query_list]
    df = pd.concat(df_List)
Consider try/except handling to return query output or table not found output:
def get_table_count(sql, conn, elm):
    try:
        return pd.read_sql(sql, conn)
    except:
        return pd.DataFrame({'tbl': elm, 'note': 'table not found'}, index=[0])

with conn:
    sql = "SELECT '{t}' as tbl, count(*) as table_count FROM {d}.{t}"
    df_List = [get_table_count(sql.format(t=elm, d=database), conn, elm)
               for elm in table_list]
    df = pd.concat(df_List, ignore_index=True)
Get a list of all the Table Names which are in the DB, then create a loop to query each Table to get the row count.
Here is a SQL statement to get a list of all Tables in an Oracle DB:
SQL:
SELECT DISTINCT TABLE_NAME FROM ALL_TAB_COLUMNS ORDER BY TABLE_NAME ASC;
Python (to make list of tables you want row counts for and which exist in the DB):
list(set(tables_that_exist_in_DB) - (set(tables_that_exist_in_DB) - set(list_of_tables_you_want)))
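Putting those two pieces together, a hedged sketch that reuses conn, database and table_list from the question:
import pandas as pd

# Tables that actually exist in the DB (Oracle returns the TABLE_NAME column upper-case).
existing = pd.read_sql("SELECT DISTINCT TABLE_NAME FROM ALL_TAB_COLUMNS", conn)
tables_that_exist_in_DB = set(existing['TABLE_NAME'])

# Only count rows for tables from your list that really exist.
tables_to_count = [t for t in table_list if t in tables_that_exist_in_DB]
df_List = [pd.read_sql("SELECT '{t}' AS tbl, COUNT(*) AS table_count FROM {d}.{t}"
                       .format(t=t, d=database), conn)
           for t in tables_to_count]
df = pd.concat(df_List, ignore_index=True)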

Using Python, how to return just one value instead of an entire row in an SQL query

I have a SQL database file which contains a multitude of columns, two of which are 'GEO_ID' and 'MED_INCOME'. I am trying to retrieve just the 'MED_INCOME' column data using the associated 'GEO_ID'. Here is what I thought would work:
import sqlite3 as db

def getIncome(censusID):
    conn = db.connect('census.db')
    c = conn.cursor()
    c.execute("SELECT 'MED_INCOME' FROM censusDbTable WHERE GEO_ID = %s" % (censusID))
    response = c.fetchall()
    c.close()
    conn.close()
    return response

id = 60014001001
incomeValue = getIncome(id)
print("incomeValue: ", incomeValue)
Which results in:
incomeValue: [('MED_INCOME',)]
I thought that I had used this method before when attempting to retrieve the data from just one column, but this method does not appear to work. If I were to instead write:
c.execute("SELECT * FROM censusDbTable WHERE GEO_ID = %s" % (censusID)
I get the full row's data, so I know the ID is in the database file.
Is there something about my syntax that is causing this request to result in an empty set?
Per @Ernxst's comment, I adjusted the request to:
c.execute("SELECT MED_INCOME FROM censusDbTable WHERE GEO_ID = %s" % (censusID))
Removing the quotes around the column name solved the problem.
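As an aside, a hedged sketch of the same query using a parameter placeholder (sqlite3 uses ?) instead of string interpolation, which also returns just the one value:
c.execute("SELECT MED_INCOME FROM censusDbTable WHERE GEO_ID = ?", (censusID,))
row = c.fetchone()
income = row[0] if row is not None else None   # the single value, not the whole row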

insert into mysql database with pymysql failing to insert

I'm trying to insert dummy data into a mysql database.
The database structure looks like:
database name: messaround
database table name: test
table structure:
id (Primary key, auto increment)
path (varchar(254))
UPDATED 2 method below, and error.
I have a method to try to insert via:
def insert_into_db(dbcursor, table, *cols, **vals):
    try:
        query = "INSERT INTO {} ({}) VALUES ('{}')".format(table, ",".join(cols), "'),('".join(vals))
        print(query)
        dbcursor.execute(query)
        dbcursor.commit()
        print("inserted!")
    except pymysql.Error as exc:
        print("error inserting...\n {}".format(exc))

connection = conn_db()
insertstmt = insert_into_db(connection, table='test', cols=['path'], vals=['test.com/test2'])
However, this is failing saying:
INSERT INTO test () VALUES ('vals'),('cols')
error inserting...
(1136, "Column count doesn't match value count at row 1")
Can you please assist?
Thank you.
If you use your code:
def insert_into_db(dbcursor, table, *cols, **vals):
    query = "INSERT INTO {} ({}) VALUES ({})".format(table, ",".join(cols), ",".join(vals))
    print(query)

insert_into_db('cursor_here', 'table_here', 'name', 'city', name_person='diego', city_person='Sao Paulo')
Python returns:
INSERT INTO table_here (name,city) VALUES (name_person,city_person)
Now with this other version:
def new_insert_into_db(dbcursor, table, *cols, **vals):
    vals2 = ''
    for first_part, second_part in vals.items():
        vals2 += '\'' + second_part + '\','
    vals2 = vals2[:-1]
    query = "INSERT INTO {} ({}) VALUES ({})".format(table, ",".join(cols), vals2)
    print(query)

new_insert_into_db('cursor_here', 'table_here', 'name', 'city', name_person='diego', city_person='Sao Paulo')
Python will return the correct SQL:
INSERT INTO table_here (name,city) VALUES ('diego','Sao Paulo')
Generally in Python you pass a parameterized query to the DB driver. See this example in PyMySQL's documentation; it constructs the INSERT query with placeholder characters, then calls cursor.execute() passing the query, and a tuple of the actual values.
Using parameterized queries is also recommended for security purposes, as it defeats many common SQL injection attacks.
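A hedged sketch of what that looks like for the table in this question (conn_db() is the question's own helper; PyMySQL uses %s as the placeholder):
connection = conn_db()
try:
    with connection.cursor() as cursor:
        sql = "INSERT INTO test (path) VALUES (%s)"
        cursor.execute(sql, ("test.com/test2",))   # the driver quotes/escapes the value
    connection.commit()
finally:
    connection.close()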
You should print the SQL statement you've generated; that makes it a lot easier to see what's wrong. But I guess you need quotes ' around string values for your ",".join(vals) (in case there are string values).
So your code is producing
insert into test (path,) values (test.com/test2,);
but it should produce
insert into test (`path`) values ('test.com/test2');
Otherwise try https://github.com/markuman/MariaSQL/ which makes it super easy to insert data to MariaDB/MySQL using pymysql.
Change your query as below
query = "INSERT INTO {} ({}) VALUES ('{}')".format(table, ",".join(cols), "'),('".join(vals))
As you are using join, the variable is expected to be a list, not a string:
table = 'test'
cols = ['path']
vals = ['test.com/test2', 'another.com/anothertest']
print(query)
"INSERT INTO test (path) VALUES ('test.com/test2'),('another.com/anothertest')"
Update:
def insert_into_db(dbconnection=None, table='', cols=None, vals=None):
    mycursor = dbconnection.cursor()
    if not (dbconnection and table and cols and vals):
        print('Must need all values')
        quit()
    try:
        query = "INSERT INTO {} ({}) VALUES ('{}')".format(table, ",".join(cols), "'),('".join(vals))
        mycursor.execute(query)
        dbconnection.commit()
        print("inserted!")
    except pymysql.Error as exc:
        print("error inserting...\n {}".format(exc))

connection = conn_db()
insertstmt = insert_into_db(dbconnection=connection, table='test', cols=['path'], vals=['test.com/test2'])

Python: Set param for columns and values pypyodbc - executemany

I have a situation where I created a method that inserts rows into a database. I provide that method with the columns, the values, and the table name.
COLUMNS = [['NAME','SURNAME','AGE'],['SURNAME','NAME','AGE']]
VALUES = [['John','Doe',56],['Doe','John',56]]
TABLE = 'people'
This is how I would like to pass them, but it doesn't work:
db = DB_CONN.MSSQL() #method for connecting to MS SQL or ORACLE etc.
cursor = db.cursor()
sql = "insert into %s (?) VALUES(?)" % TABLE
cursor.executemany([sql,[COLUMNS[0],VALUES[0]],[COLUMNS[1],VALUES[1]]])
db.commit()
This is how the query does work, but the problem is that I must have predefined column names, and that's not good: what if another list has a different column order? Then the name would end up in surname and the surname in name.
db = DB_CONN.MSSQL() #method for connecting to MS SQL or ORACLE etc.
cursor = db.cursor()
sql = 'insert into %s (NAME,SURNAME,AGE) VALUES (?,?,?)'
cursor.executemany(sql,[['John','Doe',56],['Doe','John',56]])
db.commit()
I hope I explained it clearly enough.
P.S. COLUMNS and VALUES are extracted from a JSON dictionary:
[{'NAME':'John','SURNAME':'Doe','AGE':56...},{'SURNAME':'Doe','NAME':'John','AGE':77...}]
if that helps.
SOLUTION:
class INSERT(object):
    def __init__(self):
        self.BASE_COL = ''

    def call(self):
        GATHER_DATA = [{'NAME': 'John', 'SURNAME': 'Doe', 'AGE': 56}, {'SURNAME': 'Doe', 'NAME': 'John', 'AGE': 77}]
        self.BASE_COL = ''
        TABLE = 'person'
        # check dictionary keys
        for DATA_EVAL in GATHER_DATA:
            if self.BASE_COL == '':
                self.BASE_COL = DATA_EVAL.keys()
            else:
                if self.BASE_COL != DATA_EVAL.keys():
                    print("columns in DATA_EVAL.keys() have different columns")
                    # send mail or insert to log or remove dict from list
                    exit(403)
        # if everything goes well make an insert
        columns = ','.join(self.BASE_COL)
        sql = 'insert into %s (%s) VALUES (?,?,?)' % (TABLE, columns)
        db = DB_CONN.MSSQL()
        cursor = db.cursor()
        cursor.executemany(sql, [DATA_EVAL.values() for DATA_EVAL in GATHER_DATA])
        db.commit()

if __name__ == "__main__":
    ins = INSERT()
    ins.call()
You could take advantage of the fact that Python dictionaries list their key-value pairs in a consistent (non-random) order.
You should check that all items in the json array of records have the same fields, otherwise you'll run into an exception in your query.
columns = ','.join(records[0].keys())
sql = 'insert into %s (%s) VALUES (?,?,?)' % (TABLE, columns)
cursor.executemany(sql,[record.values() for record in records])
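To avoid hard-coding the three ? placeholders, the placeholder list can be derived from the record itself (a small sketch under the same assumptions):
columns = ','.join(records[0].keys())
placeholders = ','.join('?' for _ in records[0])   # one ? per column
sql = 'insert into %s (%s) VALUES (%s)' % (TABLE, columns, placeholders)
cursor.executemany(sql, [list(record.values()) for record in records])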
References:
https://stackoverflow.com/a/835430/5189811

How to get field names when running plain sql query in django

In one of my django views I query database using plain sql (not orm) and return results.
sql = "select * from foo_bar"
cursor = connection.cursor()
cursor.execute(sql)
rows = cursor.fetchall()
I am getting the data fine, but not the column names. How can I get the field names of the result set that is returned?
In the Django docs there's a pretty simple method provided (which does indeed use cursor.description, as Ignacio answered).
def dictfetchall(cursor):
    "Return all rows from a cursor as a dict"
    columns = [col[0] for col in cursor.description]
    return [
        dict(zip(columns, row))
        for row in cursor.fetchall()
    ]
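Usage would look like this (a small sketch reusing the cursor from the question; the column names shown are just illustrative):
cursor.execute("select * from foo_bar")
rows = dictfetchall(cursor)               # e.g. [{'id': 1, 'name': 'x'}, ...]
columns = list(rows[0]) if rows else []   # the field names, if any rows came back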
According to PEP 249, you can try using cursor.description, but this is not entirely reliable.
I have found a nice solution in Doug Hellmann's blog:
http://doughellmann.com/2007/12/30/using-raw-sql-in-django.html
from itertools import *
from django.db import connection

def query_to_dicts(query_string, *query_args):
    """Run a simple query and produce a generator
    that returns the results as a bunch of dictionaries
    with keys for the column values selected.
    """
    cursor = connection.cursor()
    cursor.execute(query_string, query_args)
    col_names = [desc[0] for desc in cursor.description]
    while True:
        row = cursor.fetchone()
        if row is None:
            break
        row_dict = dict(izip(col_names, row))
        yield row_dict
    return
Example usage:
row_dicts = query_to_dicts("""select * from table""")
Try the following code:
def read_data(db_name, tbl_name):
    details = sfconfig_1.dbdetails
    connect_string = ('DRIVER=ODBC Driver 17 for SQL Server;SERVER={server};'
                      'DATABASE={database};UID={username};PWD={password};'
                      'Encrypt=YES;TrustServerCertificate=YES').format(**details)
    connection = pyodbc.connect(connect_string)  # connecting to the server
    print("connected to db")
    # query syntax
    query = 'select top 100 * from ' + '[{}].[dbo].[{}]'.format(db_name, tbl_name) + ' t where t.chargeid =' + "'622102*3'" + ';'
    # print(query, "\n")
    df = pd.read_sql_query(query, con=connection)
    print(df.iloc[0])
    return "connected to db...................."
