How to use variables in sql query in sqlalchemy? - python

I am unable to pass data read from Excel as a variable in a SQL query with SQLAlchemy in Python.
Below is my code, which reads a list of company IDs from Excel. When I try to use that list in the SQL query (in an IN condition), it throws an error. How do I pass a variable into a SQL query in SQLAlchemy?
file_loc = path + file
company_id = pd.read_excel(file_loc,sheet_name='Sheet1',index_col=None,usecols="A",header=1)
qry = sqlalchemy.text("SELECT year,name,source,SUM(value) FROM table WHERE id in (:company_id) GROUP BY year,name,source")
qryres = pd.read_sql(qry,conn)

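You can use an f-string, but first convert the DataFrame column into a comma-separated string; interpolating the DataFrame itself would put its printed representation into the query:
file_loc = path + file
company_id = pd.read_excel(file_loc,sheet_name='Sheet1',index_col=None,usecols="A",header=1)
# Build "1,2,3" from the first column; int() assumes the IDs are whole numbers
id_list = ",".join(str(int(i)) for i in company_id.iloc[:, 0])
qry = sqlalchemy.text(f"SELECT year,name,source,SUM(value) FROM table WHERE id in ({id_list}) GROUP BY year,name,source")
qryres = pd.read_sql(qry,conn)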
Docs : https://docs.python.org/3.5/library/string.html#format-string-syntax
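An alternative to string interpolation is to keep the IDs as bound parameters. The sketch below generates one named placeholder per ID; the helper name `expand_in_params`, the table name `facts`, and the sample rows are all hypothetical, and the demo uses the standard-library sqlite3 module rather than the asker's database so that it runs on its own.

```python
import sqlite3

def expand_in_params(name, values):
    """Return (placeholder_sql, params) for an IN (...) clause.

    For name "id" and values [10, 20] this gives
    (":id_0, :id_1", {"id_0": 10, "id_1": 20}).
    """
    values = list(values)
    if not values:
        raise ValueError("IN () needs at least one value")
    params = {f"{name}_{i}": v for i, v in enumerate(values)}
    placeholders = ", ".join(f":{key}" for key in params)
    return placeholders, params

# Stand-in for the company IDs read from Excel (hypothetical values).
company_ids = [1, 3]

placeholders, params = expand_in_params("company_id", company_ids)
sql = (
    "SELECT year, name, source, SUM(value) AS total "
    f"FROM facts WHERE id IN ({placeholders}) "
    "GROUP BY year, name, source ORDER BY name"
)

# Demo against an in-memory SQLite table ("facts" is a made-up name).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (id INTEGER, year INTEGER, name TEXT, source TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?, ?)",
    [(1, 2020, "a", "x", 1.0), (1, 2020, "a", "x", 2.0),
     (2, 2020, "b", "x", 5.0), (3, 2021, "c", "y", 4.0)],
)
# Only the placeholder names are in the SQL text; the values travel separately.
rows = conn.execute(sql, params).fetchall()
conn.close()
```

With SQLAlchemy, the same string and dictionary should work as `pd.read_sql(sqlalchemy.text(sql), conn, params=params)`. SQLAlchemy also offers `bindparam(..., expanding=True)` for IN lists, which may be preferable if the installed version supports it.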

Related

Read a txt/sql file as formatted string - f'{}' in Python

I have a SQL file in which the table name and other values are written as format fields:
query.sql
SELECT * FROM {table_name} WHERE LOAD_DT = '{load_date}'
How can I read the SQL file as an f-string so that I can pass it to pd.read_sql()?
table_name = 'PRODUCTS'
load_date = '15-08-2020'
# n other local variables
with open('query.sql','r') as file:
    sql_str = file.read()
Note: I would prefer not to use .format(table_name, load_date) or .format(**locals()). I have a custom function that reads various SQL files, and I don't want to pass a list of format variables every time: if the list is long, preparing the SQL file with positional arguments is laborious and error-prone.
You can use the string's .format method:
sql_str = "SELECT * FROM {table_name} WHERE LOAD_DT = '{load_date}'"
sql_str.format(table_name="PRODUCTS", load_date="15-08-2020")
You can also pass all local variables into the .format method:
table_name = "PRODUCTS"
load_date = "15-08-2020"
sql_str.format(**locals())
It is also possible to achieve the desired result using eval, although that is quite dangerous:
table_name = "PRODUCTS"
load_date = "15-08-2020"
sql_str = "SELECT * FROM {table_name} WHERE LOAD_DT = '{load_date}'"
sql_str_f = f"f\"{sql_str}\""
result = eval(sql_str_f)
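A middle ground between `.format(**locals())` and `eval` is to inspect the template and pass only the fields it actually uses. The sketch below does this with `string.Formatter().parse` from the standard library; the helper name `render_sql` is hypothetical, and it assumes the template uses only simple `{name}` fields.

```python
import string

def render_sql(template, namespace):
    """Fill {name} fields in template using only the names it references."""
    fields = {
        field_name
        for _, field_name, _, _ in string.Formatter().parse(template)
        if field_name  # skip literal-only segments (None) and empty fields
    }
    missing = fields - set(namespace)
    if missing:
        raise KeyError(f"missing template variables: {sorted(missing)}")
    return template.format(**{name: namespace[name] for name in fields})

# In practice the template would come from file.read(); it is inlined here.
template = "SELECT * FROM {table_name} WHERE LOAD_DT = '{load_date}'"
variables = {"table_name": "PRODUCTS", "load_date": "15-08-2020", "unused": 1}
sql_str = render_sql(template, variables)
```

The caller can pass one large dictionary (or `locals()`) without listing the fields per file, and a missing variable fails with a clear message. The values are still inserted as text, so this is only appropriate for trusted input.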

Python Script to get multi table counts

I'm trying to write a Python script that gets row counts for some tables for monitoring; it looks a bit like the code below. I'd like output such as the following, and have tried Python multi-dimensional arrays without any luck.
Expected Output:
('oltptransactions:', [(12L,)])
('oltpcases:', [(24L,)])
Script:
import psycopg2
# Connection with the DataBase
conn = psycopg2.connect(user = "appuser", database = "onedb", host = "192.168.1.1", port = "5432")
cursor = conn.cursor()
sql = """SELECT COUNT(id) FROM appuser.oltptransactions"""
sql2 = """SELECT count(id) FROM appuser.oltpcases"""
sqls = [sql,sql2]
for i in sqls:
    cursor.execute(i)
    result = cursor.fetchall()
    print('Counts:',result)
conn.close()
Current output:
[root@pgenc python_scripts]# python multi_getrcount.py
('Counts:', [(12L,)])
('Counts:', [(24L,)])
Any help is appreciated.
Thanks!
I am a bit reluctant to show this approach, because best practice is to never build a dynamic SQL string and to always use a constant string with parameters. This is one use case where computing the string is legitimate, though:
a table name cannot be a parameter in SQL
the input comes only from the program itself and is fully under its control
Possible code:
sql = """SELECT count(*) from appuser.{}"""
tables = ['oltptransactions', 'oltpcases']
for t in tables:
    cursor.execute(sql.format(t))
    result = cursor.fetchall()
    # Print a (label, rows) tuple to match the expected output
    print((t + ':', result))
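The reasoning above (table names come only from the program) can be made explicit with an allowlist check before the name is formatted into the SQL. This is a minimal sketch using an in-memory SQLite database instead of PostgreSQL; the function name `count_rows` and the `ALLOWED_TABLES` constant are hypothetical, and the schema prefix is omitted because SQLite has no `appuser` schema.

```python
import sqlite3

ALLOWED_TABLES = ("oltptransactions", "oltpcases")  # fixed inside the program

def count_rows(conn, tables):
    """Return {table_name: row_count} for each allowlisted table."""
    counts = {}
    for table in tables:
        if table not in ALLOWED_TABLES:
            raise ValueError(f"unexpected table name: {table!r}")
        # Safe to format: the name was checked against the allowlist.
        cur = conn.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cur.fetchone()[0]
    return counts

# Demo data: 12 and 24 rows, mirroring the counts in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oltptransactions (id INTEGER)")
conn.execute("CREATE TABLE oltpcases (id INTEGER)")
conn.executemany("INSERT INTO oltptransactions VALUES (?)", [(i,) for i in range(12)])
conn.executemany("INSERT INTO oltpcases VALUES (?)", [(i,) for i in range(24)])

counts = count_rows(conn, ALLOWED_TABLES)
for name, n in counts.items():
    print(f"{name}: {n}")
conn.close()
```

Returning a dictionary keeps each count attached to its table name, which is what the question was trying to achieve with multi-dimensional arrays.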
I believe something like the following should work; I was unable to test the code because of a certificate issue.
sql = """SELECT 'oltptransactions', COUNT(id) FROM appuser.oltptransactions"""
sql2 = """SELECT 'oltpcases', COUNT(id) FROM appuser.oltpcases"""
sqls = [sql,sql2]
for i in sqls:
    cursor.execute(i)
    for name, count in cursor:
        print(name + ':', count)
Or
sql = """SELECT 'oltptransactions :'||COUNT(id) FROM appuser.oltptransactions"""
sql2 = """SELECT 'oltpcases :'||COUNT(id) FROM appuser.oltpcases"""
sqls = [sql,sql2]
for i in sqls:
    cursor.execute(i)
    result = cursor.fetchall()
    print(result)

How can I handle errors inside of a for loop inside of a cx_Oracle connection?

Here's a rundown of what I'd like to do. I have a list of table names, and I want to run SQL against an Oracle database and pull back the table name and row count for every table in the list. However, not every name in the list is necessarily in the database, which causes my code to throw a database error. Whenever I come to a table name that is not in the database, I would like to create a dataframe that contains the table name and, instead of count(*), some text that says 'table not found' or something similar. At the end of the loop I concatenate all of the dataframes into one. The overall goal is to validate that certain tables exist and that they have the expected row counts.
query_list=[]
df_List=[]
connstr= '%s/%s@%s' %(username, password, server)
conn = cx_Oracle.connect(connstr)
with conn:
    query_list = ["SELECT '%s' as tbl, count(*) FROM %s." %(elm, database) +elm for elm in table_list]
    df_List = [pd.read_sql(elm,conn) for elm in query_list]
df = pd.concat(df_List)
Consider try/except handling that returns either the query output or a "table not found" row:
def get_table_count(sql, conn, elm):
    try:
        return pd.read_sql(sql, conn)
    except Exception:
        # Any failure (such as a missing table) becomes a placeholder row
        return pd.DataFrame({'tbl': elm, 'note': 'table not found'}, index = [0])
with conn:
    sql = "SELECT '{t}' as tbl, count(*) as table_count FROM {d}.{t}"
    df_List = [get_table_count(sql.format(t = elm, d = database), conn, elm)
               for elm in table_list]
df = pd.concat(df_List, ignore_index = True)
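The same pattern can catch a narrower exception so that unrelated failures are not reported as missing tables. The sketch below uses the standard-library sqlite3 module in place of cx_Oracle and plain tuples in place of dataframes; the function name `table_count` and the table names are hypothetical.

```python
import sqlite3

def table_count(conn, table):
    """Return the row count, or 'table not found' if the query fails."""
    try:
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    except sqlite3.OperationalError:
        # SQLite raises OperationalError for a missing table; other drivers
        # have their own classes (for example cx_Oracle.DatabaseError).
        return "table not found"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE present (id INTEGER)")
conn.executemany("INSERT INTO present VALUES (?)", [(1,), (2,), (3,)])

table_list = ["present", "absent"]  # made-up names; "absent" is never created
report = [(t, table_count(conn, t)) for t in table_list]
conn.close()
```

The loop keeps going after a failure because the exception is handled inside the helper, one table at a time, rather than around the whole list comprehension.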
Get a list of all the Table Names which are in the DB, then create a loop to query each Table to get the row count.
Here is a SQL statement to get a list of all Tables in an Oracle DB:
SQL:
SELECT DISTINCT TABLE_NAME FROM ALL_TAB_COLUMNS ORDER BY TABLE_NAME ASC;
Python (to keep only the tables you want row counts for that also exist in the DB):
list(set(tables_that_exist_in_DB) & set(list_of_tables_you_want))
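As a rough illustration of the check-first approach, the sketch below reads the table catalog, then splits the wanted list into tables that exist and tables that do not. It uses SQLite's `sqlite_master` catalog as a stand-in for the Oracle dictionary view, and all table names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
conn.execute("CREATE TABLE customers (id INTEGER)")

# SQLite keeps its catalog in sqlite_master; against Oracle the query
# would target a dictionary view such as ALL_TAB_COLUMNS instead.
existing = {
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
}
wanted = ["orders", "invoices", "customers"]

# List comprehensions keep the order of the wanted list, which a set
# intersection would not guarantee.
to_count = [t for t in wanted if t in existing]
not_found = [t for t in wanted if t not in existing]
conn.close()
```

The `not_found` list can feed the "table not found" rows directly, so no query against a missing table is ever issued.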

compose mysql query in python

I want to fetch all rows from MySQL table with
query = "SELECT * FROM %s WHERE last_name=%s"
cursor.execute(query, ("employees","Smith"))
but I'm getting "You have an error in your SQL syntax".
When I try
query = "SELECT * FROM employees WHERE last_name=%s"
cursor.execute(query, ("Smith",))
all is fine.
Documentation says
cursor.execute(operation, params=None, multi=False)
The parameters found in the tuple or dictionary params are bound to the variables in the operation.
The first will generate SQL like this:
SELECT * FROM 'employees' WHERE last_name='Smith'
The parameters are SQL-quoted, including the table name.
If you really need to have a table name as param, you must proceed in 2 steps:
table_name = 'employees'
query_tpl = "SELECT * FROM {} WHERE last_name=%s"
query = query_tpl.format(table_name)
cursor.execute(query, ("Smith",))
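The two-step approach can be tried end to end with the standard-library sqlite3 module. This is only a sketch: sqlite3 uses `?` placeholders where MySQL drivers use `%s`, and the table contents are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", "Smith"), ("Pat", "O'Brien")],
)

# Binding the table name fails: placeholders stand for values only.
try:
    conn.execute("SELECT * FROM ? WHERE last_name = ?", ("employees", "Smith"))
    bind_table_failed = False
except sqlite3.OperationalError:
    bind_table_failed = True

# Step 1: put the (trusted) table name into the SQL text.
table_name = "employees"
query = "SELECT first_name FROM {} WHERE last_name = ?".format(table_name)

# Step 2: bind the value; the driver handles quoting, apostrophe included.
rows = conn.execute(query, ("O'Brien",)).fetchall()
conn.close()
```

Keeping the value as a bound parameter means a name such as O'Brien needs no manual escaping, while the table name is handled separately because it is part of the statement's structure.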
You need to add quote symbols, so that the query looks like
SELECT * FROM employees WHERE last_name='Smith'
Change both of your queries to
query = "SELECT * FROM %s WHERE last_name='%s'"
query = "SELECT * FROM employees WHERE last_name='%s'"
You can't use a parameter for the table name in the execute call.
But you can use Python string interpolation for that, while still binding the value:
query = "SELECT * FROM %s WHERE last_name=%%s" % ("employees",)
cursor.execute(query, ("Smith",))
You can't use a table name as a parameter. Your code generates invalid SQL because it puts quotes around each string, and the table name cannot have quotes around it.
The SQL you are generating:
select * from 'employees' where last_name = 'Smith'
The SQL you want:
select * from employees where last_name = 'Smith'
You would have to format the string first, as in the example below.
query = "SELECT * from {} where last_name ='{}'"
cursor.execute(query.format("employees","Smith"))
Code like this does open up the possibility of SQL injection, so please bear that in mind.
query = "SELECT * FROM %s WHERE name='%s'" % (employees, smith)
cursor.execute(query)
rows = cursor.fetchall()
Try this one. Hopefully it works for you.

How can I use multiple parameters using pandas pd.read_sql_query?

I am trying to pass three variables in a sql query. These are region, feature, newUser. I am using SQL driver SQL Server Native Client 11.0.
Here is my code that works.
query = "SELECT LicenseNo FROM License_Mgmt_Reporting.dbo.MATLAB_NNU_OPTIONS WHERE Region = ?"
data_df = pd.read_sql_query((query),engine,params={region})
Output:
LicenseNo
0 12
1 5
Instead, I want to pass in three variables, and this code does not work.
query = "SELECT LicenseNo FROM License_Mgmt_Reporting.dbo.MATLAB_NNU_OPTIONS WHERE Region = ? and FeatureName = ? and NewUser =?"
nnu_data_df = pd.read_sql_query((query),engine,params={region, feature, newUser})
Output returns an empty data frame.
Empty DataFrame
Columns: [LicenseNo]
Index: []
Try passing the values in a tuple; you can also drop the parentheses around query. For example:
query = "SELECT LicenseNo FROM License_Mgmt_Reporting.dbo.MATLAB_NNU_OPTIONS WHERE Region = ? and FeatureName = ? and NewUser =?"
region = 'US'
feature = 'tall'
newUser = 'john'
data_df = pd.read_sql_query(query, engine, params=(region, feature , newUser))
Operator error by me :( I was using the wrong variable and the database returned no results because it didn't exist!
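Independently of the wrong variable, the order of positional parameters matters: each `?` takes the next value in sequence, so a tuple or list is the safe container. The sketch below shows this with the standard-library sqlite3 module and a made-up `options` table standing in for the SQL Server table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE options (LicenseNo INTEGER, Region TEXT, FeatureName TEXT, NewUser TEXT)"
)
conn.executemany(
    "INSERT INTO options VALUES (?, ?, ?, ?)",
    [(12, "US", "tall", "john"),
     (5, "US", "tall", "mary"),
     (7, "EU", "tall", "john")],
)

query = (
    "SELECT LicenseNo FROM options "
    "WHERE Region = ? AND FeatureName = ? AND NewUser = ?"
)

# A tuple keeps the order region, feature, newUser. A set ({...}) gives no
# ordering guarantee and also collapses duplicate values.
params = ("US", "tall", "john")
rows = conn.execute(query, params).fetchall()
conn.close()
```

The same tuple can be handed to pandas as `pd.read_sql_query(query, engine, params=params)`, as in the answer above.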
