I'd like to create a function that allows the user to input a SQL query and have it converted into a Pandas df. So far I've tried the following:
def dataset():
    raw_sql_query = input("Enter your SQL query: ")
    sql_query = """" " + raw_sql_query + " """"
    sql3 =
    sql_query
    df = pd.io.sql.read_sql(sql3, cnxn)
    df.head()
Which yields the error:
File "<ipython-input-18-6b10c2bc776f>", line 4
sql_query = """" " + raw_sql_query + " """"
^
SyntaxError: EOL while scanning string literal
I've also tried a few similar versions of the above code, including:
def dataset():
    raw_sql_query = input("Enter your SQL query: ")
    sql_query = """"" + raw_sql_query + """""
    sql3 =
    sql_query
    df = pd.io.sql.read_sql(sql3, cnxn)
    df.head()
Which led to the following error:
File "<ipython-input-23-e501c9746878>", line 5
sql3 =
^
SyntaxError: invalid syntax
Is a function like this possible? If so, how would I go about creating a working function for this action?
All the documentation I've read about functions only includes examples for stuff like printing "Hello World" or basic addition/subtraction/etc - so not very useful.
EDIT:
Using pandas.read_sql_query like this:
def dataset():
    """This function allows you to input a SQL query and it will be transformed into a Pandas dataframe"""
    raw_sql_query = input("Enter your SQL query: ")
    sql_query = """"" + raw_sql_query + """""
    sql3 = sql_query
    df = pd.io.sql.read_sql(sql3, cnxn)
    df.head()
This doesn't return an error, but also doesn't return the expected results. It returns nothing.
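The function in the edit builds df but never returns it, and df.head() on its own only displays something when it is the last expression in a notebook cell, not inside a function. A minimal sketch of a working version follows. It assumes an in-memory SQLite connection as a stand-in for cnxn, a made-up people table, and that the query is passed as an argument instead of being read with input(), so that the example runs on its own:

```python
import sqlite3

import pandas as pd


def dataset(sql_query, cnxn):
    """Run sql_query on the connection cnxn and return a DataFrame."""
    # The query is already a str, so no extra quoting is needed.
    return pd.read_sql_query(sql_query, cnxn)


# Hypothetical stand-in for the real database connection.
cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
cnxn.executemany("INSERT INTO people VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

df = dataset("SELECT id, name FROM people ORDER BY id", cnxn)
print(df.head())
```

For interactive use, dataset(input("Enter your SQL query: "), cnxn) gives the same prompt as the original attempt.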
I like the flexibility of sqlalchemy combined with pandas.read_sql. This is the code that I use:
import pandas as pd
import sqlalchemy as sa
def bindQuery(query, **params):
    # Replace each :name marker in the query text with the literal value
    for key, value in params.items():
        key = f":{key}"
        if isinstance(value, str):
            value = f"'{value}'"
        query = query.replace(key, str(value))
    query = query.replace("\n", " ").replace("\t", " ")
    return query
def readQuery(query, engine, **params):
    query = bindQuery(query, **params)
    return pd.read_sql(query, engine)
So when I have to run the following query, I call it like this:
QUERY = """
SELECT count(*)
FROM table
where id in :ids
"""
ids = (1, 2, 3)
df = readQuery(query=QUERY,
               engine=my_engine,
               ids=ids)
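To make concrete what the helper above sends to the database, here is a small sketch that repeats the same substitution logic and prints the resulting SQL text. The table t and the name column are made up for illustration. Note that this is plain text substitution, not driver-level parameter binding, so values are not escaped and the approach is only reasonable for trusted input.

```python
def bindQuery(query, **params):
    """Same substitution logic as the helper above, repeated so this runs alone."""
    for key, value in params.items():
        if isinstance(value, str):
            value = f"'{value}'"
        query = query.replace(f":{key}", str(value))
    return query.replace("\n", " ").replace("\t", " ")


# Hypothetical table and columns, used only to show the generated text.
bound = bindQuery("SELECT count(*) FROM t WHERE id IN :ids AND name = :name",
                  ids=(1, 2, 3), name="x")
print(bound)

# A one-element tuple keeps Python's trailing comma in the generated SQL.
single = bindQuery("SELECT count(*) FROM t WHERE id IN :ids", ids=(1,))
print(single)
```

The second call shows a limitation worth knowing about: str((1,)) is "(1,)", which most SQL dialects reject, so a one-element list of ids needs separate handling.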
I'm trying to make a function to pass parameters and return queries in jupyter but it returns the following error "Not all parameters were used in the SQL statement".
For now I made the connection using:
import mysql.connector as connection
import pandas as pd
db = connection.connect (parameters of connection)
My function has to take 3 parameters and return the dataframe for the query that matches these parameters, so I'm doing:
def param (id, id_user, date):
    cursor = db.cursor()
    if (id == '' or None) and (id_user == '' or None) and (date == '' or None):
        query = 'select * from database'
        df = pd.read_sql(query, db)
    elif (id_user == '' or None) and (date == '' or None):
        query = 'select * from database where id = %s'
        df = pd.read_sql(query, db, params = id)
    else:
        print(':(')
    return df
Then the error appears when I execute:
param (18, '', '')
What am I doing wrong?
Try passing a tuple as the value for params:
query = 'select * from database where id = %s'
df = pd.read_sql(query, db, params = (id,)) # change is here
In your current code, you are passing the bare value id, but params expects a sequence (a list or tuple) with one entry per placeholder, so a single value has to be wrapped as the one-element tuple (id,).
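The tuple form can be tried without a MySQL server. The sketch below assumes an in-memory SQLite database and a made-up records table as stand-ins, which also means the placeholder is ? instead of the %s that mysql.connector uses. It also replaces the question's `id == '' or None` test, which Python evaluates as `(id == '') or None`, with a membership check.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for the MySQL database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (id INTEGER, id_user TEXT)")
db.executemany("INSERT INTO records VALUES (?, ?)", [(18, "a"), (19, "b")])


def param(id=None):
    """Return all rows, or only the rows matching id when one is given."""
    if id in ("", None):
        return pd.read_sql("SELECT * FROM records", db)
    # sqlite3 uses ? as its placeholder; mysql.connector uses %s.
    return pd.read_sql("SELECT * FROM records WHERE id = ?", db, params=(id,))


filtered = param(18)
everything = param("")
print(filtered)
```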
I am pretty new to Python development. I have a long Python script that "clones" a database and adds additional stored functions and procedures. Cloning here means copying only the schema of the DB. These steps work fine.
My question is about pymysql insert execution:
I have to copy some table contents into the new DB. I don't get any SQL error. If I debug or print the generated INSERT INTO command, it is correct (I've tested it in an SQL editor). The insert appears to execute correctly because the result contains the exact row count, but all rows are missing from the destination table in the destination DB.
(Of course, the DB_* variables have been defined!)
import datetime
import pymysql
liveDbConn = pymysql.connect(DB_HOST, DB_USER, DB_PWD, LIVE_DB_NAME)
testDbConn = pymysql.connect(DB_HOST, DB_USER, DB_PWD, TEST_DB_NAME)
tablesForCopy = ['role', 'permission']
for table in tablesForCopy:
    with liveDbConn.cursor() as liveCursor:
        # Get name of columns
        liveCursor.execute("DESCRIBE `%s`;" % (table))
        columns = '';
        for column in liveCursor.fetchall():
            columns += '`' + column[0] + '`,'
        columns = columns.strip(',')
        # Get and convert values
        values = ''
        liveCursor.execute("SELECT * FROM `%s`;" % (table))
        for result in liveCursor.fetchall():
            data = []
            for item in result:
                if type(item)==type(None):
                    data.append('NULL')
                elif type(item)==type('str'):
                    data.append("'"+item+"'")
                elif type(item)==type(datetime.datetime.now()):
                    data.append("'"+str(item)+"'")
                else: # for numeric values
                    data.append(str(item))
            v = '(' + ', '.join(data) + ')'
            values += v + ', '
        values = values.strip(', ')
        print("### table: %s" % (table))
        testDbCursor = testDbConn.cursor()
        testDbCursor.execute("INSERT INTO `" + TEST_DB_NAME + "`.`" + table + "` (" + columns + ") VALUES " + values + ";")
        print("Result: {}".format(testDbCursor._result.message))
liveDbConn.close()
testDbConn.close()
Result is:
### table: role
Result: b"'Records: 16 Duplicates: 0 Warnings: 0"
### table: permission
Result: b'(Records: 222 Duplicates: 0 Warnings: 0'
What am I doing wrong? Thanks!
You have 2 main issues here:
You don't use conn.commit() (which here means testDbConn.commit(), since that is the connection you write to). Changes to the database are not persisted until you commit them. Note that every statement that modifies data needs a commit, whereas a plain SELECT does not.
Your query is open to SQL Injection. This is a serious problem.
Table names cannot be parameterized, so there's not much we can do about that, but you'll want to parameterize your values. I've made multiple corrections to the code in relation to type checking as well as parameterization.
for table in tablesForCopy:
    with liveDbConn.cursor() as liveCursor:
        liveCursor.execute("SELECT * FROM `%s`;" % (table))
        name_of_columns = [item[0] for item in liveCursor.description]
        # one %s placeholder per column; the driver quotes the values itself
        placeholders = ', '.join(['%s'] * len(name_of_columns))
        insert_list = []
        for result in liveCursor.fetchall():
            data = []
            for item in result:
                if item is None: # test identity against the None singleton
                    data.append(None) # the driver sends None as SQL NULL
                elif isinstance(item, str): # Use isinstance to check type
                    data.append(item)
                elif isinstance(item, datetime.datetime):
                    data.append(item.strftime('%Y-%m-%d %H:%M:%S'))
                else: # for numeric values
                    data.append(str(item))
            insert_list.append(data)
    testDbCursor = testDbConn.cursor()
    testDbCursor.executemany("INSERT INTO `{}`.`{}` ({}) VALUES ({})".format(
        TEST_DB_NAME,
        table,
        ', '.join('`{}`'.format(name) for name in name_of_columns),
        placeholders),
        insert_list)
    testDbConn.commit()
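From this GitHub thread, I notice that executemany does not work as expected in psycopg2; it instead sends each entry as a single query. This only matters if you are using psycopg2 (the PostgreSQL driver) rather than pymysql. In that case you'll need to use execute_batch, and adapt the identifier quoting to PostgreSQL:
from psycopg2.extras import execute_batch
execute_batch(testDbCursor,
              "INSERT INTO `{}`.`{}` ({}) VALUES ({})".format(
                  TEST_DB_NAME,
                  table,
                  ', '.join('`{}`'.format(name) for name in name_of_columns),
                  placeholders),
              insert_list)
testDbConn.commit()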
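The effect of the missing commit described in the first issue above can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: it uses the standard-library sqlite3 module and a temporary file instead of pymysql and MySQL, with a made-up role table. A second connection plays the part of the SQL editor that looks at the destination table.

```python
import os
import sqlite3
import tempfile

# Temporary database file so that two separate connections can share it.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)   # plays the role of testDbConn
reader = sqlite3.connect(path)   # plays the role of a separate SQL editor

writer.execute("CREATE TABLE role (id INTEGER, name TEXT)")
writer.commit()

# The INSERT succeeds, but it is still inside an open transaction.
writer.executemany("INSERT INTO role VALUES (?, ?)", [(1, "admin"), (2, None)])
before = reader.execute("SELECT COUNT(*) FROM role").fetchall()[0][0]

writer.commit()  # only now do the rows become visible to other connections
after = reader.execute("SELECT COUNT(*) FROM role").fetchall()[0][0]

print("rows seen before commit:", before)
print("rows seen after commit:", after)

writer.close()
reader.close()
```

The row count reported by the inserting cursor is correct in both cases, which matches the symptom in the question: the statement ran, but other connections cannot see its result until the commit.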
How to insert data into a table using Python and pymysql
Find my solution below:
import pymysql
import datetime
# Create a connection object
dbServerName = "127.0.0.1"
port = 8889
dbUser = "root"
dbPassword = ""
dbName = "blog_flask"
# charSet = "utf8mb4"
conn = pymysql.connect(host=dbServerName, user=dbUser, password=dbPassword,
                       db=dbName, port=port)
try:
    # Create a cursor object
    cursor = conn.cursor()
    # Insert rows into the MySQL Table
    now = datetime.datetime.utcnow()
    my_datetime = now.strftime('%Y-%m-%d %H:%M:%S')
    # The values are passed separately so the driver can quote them
    cursor.execute(
        'INSERT INTO posts (post_id, post_title, post_content, '
        'filename, post_time) VALUES (%s, %s, %s, %s, %s)',
        (5, 'title2', 'description2', 'filename2', my_datetime))
    conn.commit()
except Exception as e:
    print("Exception occurred: {}".format(e))
finally:
    conn.close()
I have been trying to loop through a list, using each item as a parameter for a database query, and to export the results to xlsx format using the pyodbc, pandas, and xlsxwriter modules.
However, the message below keeps on appearing despite a process of trial and error:
The first argument to execute must be a string or unicode query.
Could this have something to do with the query itself or the module 'pandas'?
Thank you.
This is for exporting a query result to an excel spreadsheet using pandas and pyodbc, with python 3.7 ver.
import pyodbc
import pandas as pd
#Database Connection
conn = pyodbc.connect(driver='xxxxxx', server='xxxxxxx', database='xxxxxx',
                      user='xxxxxx', password='xxxxxxxx')
cursor = conn.cursor()
depts = ['Human Resources','Accounting','Marketing']
query = """
SELECT *
FROM device ID
WHERE
Department like ?
AND
Status like 'Active'
"""
target = r'O:\\Example'
today = target + os.sep + time.strftime('%Y%m%d')
if not os.path.exists(today):
    os.mkdir(today)
for i in departments:
    cursor.execute(query, i)
    #workbook = Workbook(today + os.sep + i + 'xlsx')
    #worksheet = workbook.add_worksheet()
    data = cursor.fetchall()
    P_data = pd.read_sql(data, conn)
    P_data.to_excel(today + os.sep + i + 'xlsx')
When you read data into a dataframe using pandas.read_sql(), pandas expects the first argument to be a query to execute (in string format), not the results from the query.
Instead of your line:
P_data = pd.read_sql(data, conn)
You'd want to use:
P_data = pd.read_sql(query, conn)
And to filter by department, you'd want to serialize the list into a string in SQL syntax:
depts = ['Human Resources','Accounting','Marketing']
# gives you the string to use in your sql query:
depts_query_string = "('{query_vals}')".format(query_vals="','".join(depts))
To use the new SQL string in your query, use str.format:
query = """
SELECT *
FROM device ID
WHERE
Department in {query_vals}
AND
Status like 'Active'
""".format(query_vals=depts_query_string)
All together now:
import os
import time
import pyodbc
import pandas as pd
#Database Connection
conn = pyodbc.connect(driver='xxxxxx', server='xxxxxxx', database='xxxxxx',
                      user='xxxxxx', password='xxxxxxxx')
cursor = conn.cursor()
depts = ['Human Resources','Accounting','Marketing']
# gives you the string to use in your sql query:
depts_query_string = "('{query_vals}')".format(query_vals="','".join(depts))
query = """
SELECT *
FROM device ID
WHERE
Department in {query_vals}
AND
Status like 'Active'
""".format(query_vals=depts_query_string)
target = r'O:\\Example'
today = target + os.sep + time.strftime('%Y%m%d')
if not os.path.exists(today):
    os.mkdir(today)
# Note: the IN query returns all departments, so each file holds the same rows
for i in depts:
    #workbook = Workbook(today + os.sep + i + 'xlsx')
    #worksheet = workbook.add_worksheet()
    P_data = pd.read_sql(query, conn)
    P_data.to_excel(today + os.sep + i + '.xlsx')
Once you have your query sorted, you can just load directly into a dataframe with the following command.
P_data = pd.read_sql_query(query, conn)
P_data.to_excel('desired_filename.format')
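If the goal is one spreadsheet per department, an alternative to formatting the values into the SQL text is to keep the ? placeholder from the question and let pandas pass the value through params. This is a sketch under assumptions, not the answer's method: it uses an in-memory SQLite database with a made-up device table in place of the pyodbc connection, and it leaves the Excel export commented out because that step needs an Excel writer package.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for the real device table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE device (name TEXT, Department TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO device VALUES (?, ?, ?)",
    [("pc1", "Accounting", "Active"),
     ("pc2", "Marketing", "Active"),
     ("pc3", "Marketing", "Retired")],
)

depts = ["Human Resources", "Accounting", "Marketing"]
query = "SELECT * FROM device WHERE Department = ? AND Status = 'Active'"

# One DataFrame per department; the driver binds the value, so no quoting is needed.
frames = {dept: pd.read_sql_query(query, conn, params=[dept]) for dept in depts}

for dept, frame in frames.items():
    print(dept, len(frame))
    # frame.to_excel(dept + ".xlsx")  # needs an Excel writer such as openpyxl
```

Because the query string is passed to read_sql_query unchanged and only the parameter varies, each file would contain just the rows for its own department.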
windows 7
python 2.7
Django 1.11
I have used Django to develop a website. In the backend I have a SQLite database with 2 tables. One table stores the forms that users submit, and the other is used for comparison.
Once a form A is submitted by the user, it is saved in the table catalog_fw, and catalog_fw.ODM and catalog_fw.project_name are compared with the ones in the table catalog_fw_instance. If a row has exactly the same content for catalog_fw.ODM and catalog_fw.project_name, catalog_fw_instance.level is combined with A and passed to an .exe to generate a txt file.
However, an error occurs on this line: c.execute("catalog_fw_instance.level,......
when I run this Python file:
sqlite3.OperationalError: near "catalog_fw_instance": syntax error
The code that gets the SQLite data, compares it, and passes it to the .exe is here:
def when_call_exe():
    with sqlite3.connect('db.sqlite3') as con:
        c = con.cursor()
        #c.execute("catalog_fw_instance.level, SELECT catalog_fw.ODM_name, catalog_fw.project_name, catalog_fw.UAP, catalog_fw.NAP, catalog_fw.LAP, catalog_fw.num_address FROM catalog_fw INNER JOIN catalog_fw_instance ON catalog_fw.ODM_name=catalog_fw_instance.ODM_name AND catalog_fw.project_name=catalog_fw_instance.project_name")
        sql = ("SELECT catalog_fw.ODM_name, catalog_fw.project_name, catalog_fw.UAP, catalog_fw.NAP, catalog_fw.LAP, " +
               "catalog_fw.num_address, catalog_fw_instance.level " +
               "FROM catalog_fw catalog_fw" +
               "INNER JOIN catalog_fw_instance catalog_fw_instanc" +
               " ON catalog_fw.ODM_name = catalog_fwi.ODM_name AND catalog_fw.project_name = catalog_fw_instance.project_name")
        c.execute(sql)
        print '1:', c.fetchone()
        parameter = c.fetchone()
        print '2', parameter
        #pass to exe
        args = ['.//exe//Test.exe', parameter[0], parameter[1]+parameter[2], parameter[3], parameter[4], parameter[5], parameter[6]]
        output = my_check_output(args)
        if 'SUCCESS' in output:
            filename = output[28:-1]
        else:
            filename = output[8:-1]
        downloadlink = os.path.join('/exe', '%s' % filename)
        #save link to sqlite db
        c.execute('''UPDATE catalog_fw SET download = %s WHERE
        ODM_Name=parameter[1] AND project_Name=parameter[2] ''' % downloadlink)
The 2 tables in the SQLite database were shown here as images (table 1 and table 2).
As far as I know, when calling cursor.execute() in Python, we should be passing a single string containing the query to be run. In the commented-out call, one of the selected columns comes before the SELECT keyword, so the string is not valid SQL. In the string-built version, the aliases do not match and a space is missing before INNER JOIN. Consider the following version:
c = con.cursor()
sql = ("SELECT cf.ODM_name, cf.project_name, cf.UAP, cf.NAP, cf.LAP, " +
       "cf.num_address, cfi.level " +
       "FROM catalog_fw cf " +
       "INNER JOIN catalog_fw_instance cfi " +
       " ON cf.ODM_name = cfi.ODM_name AND cf.project_name = cfi.project_name")
c.execute(sql)
# fetch once and reuse the row; every fetchone() call consumes one row
parameter = c.fetchone()
print(parameter)
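The join can be checked end to end against an in-memory database. The sketch below is an illustration under assumptions: the tables are reduced to the columns needed here, the sample row and the downloadlink value are made up, and the UPDATE from the question is rewritten with ? placeholders so that the values are bound by sqlite3 instead of being formatted into the string.

```python
import sqlite3

con = sqlite3.connect(":memory:")
c = con.cursor()
# Reduced, hypothetical versions of the two tables.
c.execute("CREATE TABLE catalog_fw (ODM_name TEXT, project_name TEXT, download TEXT)")
c.execute("CREATE TABLE catalog_fw_instance (ODM_name TEXT, project_name TEXT, level INTEGER)")
c.execute("INSERT INTO catalog_fw VALUES ('odm1', 'projA', NULL)")
c.execute("INSERT INTO catalog_fw_instance VALUES ('odm1', 'projA', 3)")

c.execute(
    "SELECT cf.ODM_name, cf.project_name, cfi.level "
    "FROM catalog_fw cf "
    "INNER JOIN catalog_fw_instance cfi "
    "ON cf.ODM_name = cfi.ODM_name AND cf.project_name = cfi.project_name"
)
parameter = c.fetchone()  # fetch once and reuse; each call consumes a row

downloadlink = "/exe/example.txt"  # placeholder for the path built from the exe output
c.execute(
    "UPDATE catalog_fw SET download = ? WHERE ODM_name = ? AND project_name = ?",
    (downloadlink, parameter[0], parameter[1]),
)
con.commit()

updated = c.execute("SELECT download FROM catalog_fw").fetchone()[0]
print(parameter, updated)
```

With this column order, ODM_name is parameter[0] and project_name is parameter[1], so the indexes used in the WHERE clause have to follow the SELECT list.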
I am trying to write code that will allow a user to select specific columns from a SQLite database, which will then be transformed into a pandas data frame. I am using a test database titled test_database.db with a table titled test. The table has three columns: id, value_one, and value_two. The function I am showing exists within a class that establishes a connection to the database, and in this function the user only needs to pass the table name and a list of columns that they would like to extract. For instance, in command-line sqlite I might type the command select value_one, value_two from test if I wanted to read in only the columns value_one and value_two from the table test. If I type this command into the command line, it works. However, in this case I use Python to build the text string which is fed into pandas.read_sql_query(), and the method does not work. My code is shown below:
class ReadSQL:
    def __init__(self, database):
        self.database = database
        self.conn = sqlite3.connect(self.database)
        self.cur = self.conn.cursor()
    def query_columns_to_dataframe(table, columns):
        query = 'select '
        for i in range(len(columns)):
            query = query + columns[I] + ', '
        query = query[:-2] + ' from ' + table
        # print(query)
        df = pd.read_sql_query(query, self.conn)
        return
    def close_database()
        self.conn.close
        return
test = ReadSQL(test_database.db)
df = query_columns_to_dataframe('test', ['value_one', 'value_two'])
I am assuming my problem has something to do with the way that query_columns_to_dataframe() pre-processes the information, because if I uncomment the print command in query_columns_to_dataframe() I get a text string that looks identical to what works if I just type it directly into the command line. Any help is appreciated.
I mopped up a few mistakes in your code to produce this, which works. Note that I inadvertently changed the names of the fields in your test db.
import sqlite3
import pandas as pd
class ReadSQL:
    def __init__(self, database):
        self.database = database
        self.conn = sqlite3.connect(self.database)
        self.cur = self.conn.cursor()
    def query_columns_to_dataframe(self, table, columns):
        query = 'select '
        for i in range(len(columns)):
            query = query + columns[i] + ', '
        query = query[:-2] + ' from ' + table
        #~ print(query)
        df = pd.read_sql_query(query, self.conn)
        return df
    def close_database(self):
        self.conn.close()
        return
test = ReadSQL('test_database.db')
df = test.query_columns_to_dataframe('test', ['value_1', 'value_2'])
print (df)
Output:
value_1 value_2
0 2 3
Your code is full of syntax errors and other issues:
The return in query_columns_to_dataframe should be return df. This is the primary reason why your code does not return anything.
self.cur is not used
Missing self parameter when declaring query_columns_to_dataframe
columns[I] should be columns[i]
Missing colon at the end of the line def close_database()
Missing self parameter when declaring close_database
Missing parentheses here: self.conn.close
Missing quotes around the file name here: ReadSQL(test_database.db)
This df = query_columns_to_dataframe should be df = test.query_columns_to_dataframe
Fix these errors and your code should work.
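With the fixes listed above applied, the class can also be shortened a little. The following is a sketch rather than a drop-in replacement: it assumes an in-memory database (":memory:") with one made-up sample row so that it runs without test_database.db, and it builds the column list with str.join instead of trimming a trailing ", ". Table and column names cannot be passed as bound parameters, so they should come from trusted code, not from user input.

```python
import sqlite3

import pandas as pd


class ReadSQL:
    def __init__(self, database):
        self.conn = sqlite3.connect(database)

    def query_columns_to_dataframe(self, table, columns):
        # join() builds the column list without trimming a trailing ", "
        query = "select " + ", ".join(columns) + " from " + table
        return pd.read_sql_query(query, self.conn)

    def close_database(self):
        self.conn.close()


# ":memory:" and the sample row are assumptions so the example needs no file.
test = ReadSQL(":memory:")
test.conn.execute("CREATE TABLE test (id INTEGER, value_one INTEGER, value_two INTEGER)")
test.conn.execute("INSERT INTO test VALUES (1, 2, 3)")

df = test.query_columns_to_dataframe("test", ["value_one", "value_two"])
print(df)
test.close_database()
```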