Python - execute multiple SQL queries in an array in Python

I am trying to execute multiple SQL queries in Python, but I think that, since they are in an array, there are some extra characters like [" "] that the read_sql_query function cannot handle, or maybe there is another problem. Does anyone know how I can solve this?
My array:
array([['F_TABLE1'],
       ['F_TABLE2'],
       ['F_TABLE3'],
       ['F_TABLE4'],
       ['F_TABLE5'],
       ['F_TABLE6'],
       ['F_TABLE1'],
       ['F_TABLE8']], dtype=object)
My Python code:
SQL_Query = []
for row in range(len(array)):
    SQL_Query.append('SELECT ' + "'" + array[row] + "'" + ', COUNT(*) FROM ' + array[row])

SQL = []
for row in range(len(SQL_Query)):
    SQL = pd.read_sql_query(SQL_Query[row], conn)
PS: I split the work into two for loops to see what is wrong with my code.
I also printed one of the queries to see what the output looks like.
print(SQL_Query[0])
The output:
["SELECT 'F_CLINICCPARTY_HIDDEN', COUNT(*) FROM F_TABLE1"]
Because of the above output I think the problem is extra characters.
It gives me this error:
Execution failed on sql '["SELECT 'F_TABLE1', COUNT(*) FROM F_TABLE1"]': expecting string or bytes object

Hi, the reason for this kind of issue is that indexing a NumPy array like array[row] returns a NumPy object (here a one-element array), not a plain string, even though it looks like a basic list. Use the item method to overcome the issue.
Based on your array I have written some sample code; in it I refer to the same array object for both the column and the table name.
import numpy as np

array1 = np.array([['F_TABLE1'],
                   ['F_TABLE2'],
                   ['F_TABLE3'],
                   ['F_TABLE4'],
                   ['F_TABLE5'],
                   ['F_TABLE6'],
                   ['F_TABLE1'],
                   ['F_TABLE8']], dtype=object)

SQL_Query = []
for row in range(len(array1)):
    SQL_Query.append("SELECT '{0}', COUNT(*) FROM {1}".format(str(array1.item(row)), str(array1.item(row))))

print(SQL_Query)
Feel free to use two array objects for the SQL columns and the tables, as in the sketch below.
Also, wrapping a name in '' while selecting it in a query is not recommended; I have included it in this answer only because I don't know the destination database type.
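A minimal sketch of that two-array suggestion (the ID column name here is an assumption, not from the question):

import numpy as np

# hedged sketch: one array for the column to count (assumed names) and one for the tables
columns = np.array([['ID']] * 8, dtype=object)
tables = np.array([['F_TABLE1'], ['F_TABLE2'], ['F_TABLE3'], ['F_TABLE4'],
                   ['F_TABLE5'], ['F_TABLE6'], ['F_TABLE1'], ['F_TABLE8']], dtype=object)

SQL_Query = []
for row in range(len(tables)):
    SQL_Query.append("SELECT {0}, COUNT(*) FROM {1}".format(columns.item(row), tables.item(row)))

print(SQL_Query)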

I changed my query to this (putting * in front of SQL_Query[row] unpacks the one-element array into a plain string, which removes the extra characters):
SQL_Query = []
for row in range(len(array)):
    SQL_Query.append('SELECT ' + "'" + array[row] + "'" + ' AS tableName, COUNT(*) AS Count FROM ' + array[row])

SQL = []
for row in range(len(SQL_Query)):
    SQL.append(pd.read_sql_query(*SQL_Query[row], conn))

result = []
for item in SQL:
    result.append(item)

result = pd.concat(result)
result
My output:
tableName  Count
F_TABLE1      20
F_TABLE2      30
F_TABLE3     220
F_TABLE4      50
F_TABLE5      10
F_TABLE6    2130
F_TABLE7     250

Related

Retrieve data from SQL Server database using Python and PYODBC

I have Python code that connects to a SQL Server database using PYODBC and Streamlit to create a web app.
The problem is that when I try to perform a select query with multiple conditions, the result is empty, whereas it should return records.
If I run the SQL query directly on the database it returns results:
SELECT TOP (200) ID, first, last
FROM t1
WHERE (first LIKE '%tes%') AND (last LIKE '%tesn%')
whereas the query from Python returns an empty result:
sql="select * from testDB.dbo.t1 where ID = ? and first LIKE '%' + ? + '%' and last LIKE '%' + ? + '%' "
param0 = vals[0]
param1=f'{vals[1]}'
param2=f'{vals[2]}'
rows = cursor.execute(sql, param0,param1,param2).fetchall()
Code:
import pandas as pd
import streamlit as st

vals = []
expander_advanced_search = st.beta_expander('Advanced Search')

with expander_advanced_search:
    for i, col in enumerate(df.columns):
        val = st_input_update("search for {}".format(col))
        expander_advanced_search.markdown(val, unsafe_allow_html=True)
        vals.append(val)

    if st.form_submit_button("search"):
        if len(vals) > 0:
            sql = 'select * from testDB.dbo.t1 where ID = ? and first LIKE ? and last LIKE ? '
            param0 = vals[0]
            param1 = f'%{vals[1]}%'
            param2 = f'%{vals[2]}%'
            rows = cursor.execute(sql, param0, param1, param2).fetchall()
            df = pd.DataFrame.from_records(rows, columns=[column[0] for column in cursor.description])
            st.dataframe(df)
Based on the suggestion of Dale K, I used the OR operator in the select query:
sql="select * from testDB.dbo.t1 where ID = ? OR first LIKE ? or last LIKE ? "
param0 = vals[0] # empty
param1=f'%{vals[1]}%' # nabi
param2=f'%{vals[2]}%' # empty
rows = cursor.execute(sql, param0,param1,param2).fetchall()
The displayed result:
all the records in the database
The expected result:
id first last
7 nabil jider
I think this is probably in your parameters - your form is only submitting first/last values, but your query says ID=?
You're not providing an ID from the form so there are no results. Or it's putting the value from the 'first' input into vals[0] and the resulting query is looking for an ID = 'tes'.
Also, look into pd.read_sql() to pipe query results directly into a DataFrame.
OR statement might be what you're after if you want each clause treated separately:
where ID = ? or first LIKE ? or last LIKE ?
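Combining both suggestions, a minimal sketch (assuming conn is the open pyodbc connection behind cursor and vals holds the three form inputs):

import pandas as pd
import streamlit as st

# hedged sketch: OR the clauses and let pandas build the DataFrame directly
sql = "select * from testDB.dbo.t1 where ID = ? or first like ? or last like ?"
params = [vals[0], f'%{vals[1]}%', f'%{vals[2]}%']
df = pd.read_sql(sql, conn, params=params)
st.dataframe(df)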

Parameterized Python SQLite3 query is returning the first parameter

I'm trying to make a query to a SQLite database from a python script. However, whenever I use parameterization it just returns the first parameter, which is column2. The desired result is for it to return the value held in column2 on the row where column1 is equal to row1.
conn = sqlite3.connect('path/to/database')
c = conn.cursor()
c.execute('SELECT ? from table WHERE column1 = ? ;', ("column2","row1"))
result = c.fetchone()[0]
print(result)
It prints
>>column2
Whenever I run this using concatenated strings, it works fine.
conn = sqlite3.connect('path/to/database')
c = conn.cursor()
c.execute('SELECT ' + column2 + ' from table WHERE column1 = ' + row1 + ';')
result = c.fetchone()[0]
print(result)
And it prints:
>>desired data
Any idea why this is happening?
This behaves as designed.
The mechanism that parameterized queries provide is meant to pass literal values to the query, not meta information such as column names.
One thing to keep in mind is that the database must be able to parse the parameterized query string without having the parameter at hand: obviously, a column name cannot be used as a parameter under that assumption.
For your use case, the only possible solution is to concatenate the column name into the query string, as shown in your second example. If the parameter comes from outside your code, be sure to properly validate it before that (for example, by checking it against a fixed list of values).
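A minimal sketch of that validate-then-concatenate approach (the table and column names here are placeholders, not from the question):

import sqlite3

# hedged sketch: check the column name against a fixed whitelist before
# interpolating it; the row value stays a bound parameter
ALLOWED_COLUMNS = {"column1", "column2"}

def fetch_value(conn, column, key):
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column name: {}".format(column))
    cur = conn.execute(
        'SELECT {} FROM my_table WHERE column1 = ?;'.format(column), (key,))
    row = cur.fetchone()
    return row[0] if row else None

conn = sqlite3.connect('path/to/database')
print(fetch_value(conn, "column2", "row1"))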

Is there a way to pass a nan value as null from a Pandas dataframe to a table in an Oracle database using SQLAlchemy?

I am attempting to merge values from a pandas dataframe into a table within an Oracle database using SQLAlchemy. The table has 137 columns, and some entries within the columns contain no value (represented by null in Oracle). Is there a way to enter no value (None/null) using sqlalchemy without the values being changed to CLOB within the database? I will be working with many large tables with different column specifications, so it would be good to find a generic way of doing this.
When I assign the numpy nan value, I get the following error:
sqlalchemy.exc.DatabaseError: (cx_Oracle.DatabaseError) ORA-00910: specified length too long for its datatype
The columns have been specified within the database with a variety of different precisions (some are only 1 character). Therefore entering 'nan' into some table cells will not work.
I have a dataframe, df, of dimensions n rows x 137 columns.
I have tried filling in the empty values with:
df.fillna(np.nan,inplace=True)
To specify the datatypes I use:
dtypes1 = {c:types.VARCHAR(df[c].str.len().max()) for c in df.columns[df.dtypes == 'object'].tolist()}
And then the following statement to transfer the table to a temporary table within Oracle (this works fine):
df.to_sql('table1', conn2, if_exists='replace', dtype=dtypes1, index=False)
I then attempt to merge the data from table1 to the original table, table2 by:
creating a connection to the Oracle db using an SQLAlchemy engine:
conn = create_engine('oracle+cx_oracle://.......')
Defining the merge statement:
SQL_statement = 'MERGE INTO ' + table2 + ' USING ' + table1 + ' ON (' + table1 + '.TIME = ' + table2 + '.TIME AND ' + table1 + '.ID = ' + table2 + '.ID) WHEN MATCHED THEN UPDATE SET ' + updateMsg + ' WHEN NOT MATCHED THEN INSERT (' + insertMsgB + ') VALUES (' + insertMsgA + ')'
executing the merge statement
conn.execute(SQL_statement)
I would like no value (null) to be entered within the table when the table changes are merged into the original Oracle table.
At the moment, I think my code is trying to write 'nan' into the cells, but this doesn't conform with the column specifications.
Any help with this would be much appreciated.
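One common workaround, offered here only as a hedged sketch (it is not from the original thread): convert NaN/NaT values to Python None before calling to_sql, so cx_Oracle binds NULL rather than the string 'nan'.

import pandas as pd

# hedged sketch: replace NaN/NaT with None so the driver binds NULL;
# df, conn2 and dtypes1 are the objects described in the question
df_clean = df.astype(object).where(pd.notnull(df), None)
df_clean.to_sql('table1', conn2, if_exists='replace', dtype=dtypes1, index=False)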

How to pass tuple in read_sql 'where in' clause in pandas python

I am passing a tuple converted to a string in a read_sql method as
sql = "select * from table1 where col1 in " + str(tuple1) + " and col2 in " + str(tuple2)
df = pd.read_sql(sql, conn)
This works fine, but when a tuple has only one value the SQL fails with ORA-00936: missing expression, because a single-element tuple has an extra trailing comma.
For example
tuple1 = (4011,)
tuple2 = (23,24)
The SQL formed is:
select * from table1 where col1 in (4011,) and col2 in (23,24)
                                         ^
ORA-00936: missing expression
Is there a better way of doing this, other than removing the comma with string operations?
Is there a better way to parametrize the read_sql function?
The reason you're getting the error is SQL syntax.
When you have a WHERE col in (...) list, a trailing comma will cause a syntax error.
Either way, putting values into SQL statements using string concatenation is frowned upon, and will ultimately lead you to more problems down the line.
Most Python SQL libraries will allow for parameterised queries. Without knowing which library you're using to connect, I can't link exact documentation, but the principle is the same for psycopg2:
http://initd.org/psycopg/docs/usage.html#passing-parameters-to-sql-queries
This functionality is also exposed in pd.read_sql, so to achieve what you want safely, you would do this:
sql = "select * from table1 where col1 in %s and col2 in %s"
df = pd.read_sql(sql, conn, params = [tuple1, tuple2])
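If the driver does not expand a tuple for an IN (...) list (the ORA- error in the question suggests an Oracle connection, where the psycopg2-style expansion above may not apply), a hedged alternative is to generate one bind placeholder per element; this sketch assumes conn is an open cx_Oracle connection:

import pandas as pd

# hedged sketch: one named bind variable per element of each tuple (cx_Oracle style)
tuple1 = (4011,)
tuple2 = (23, 24)

in1 = ", ".join(":a{}".format(i) for i in range(len(tuple1)))
in2 = ", ".join(":b{}".format(i) for i in range(len(tuple2)))
sql = "select * from table1 where col1 in ({}) and col2 in ({})".format(in1, in2)

params = {"a{}".format(i): v for i, v in enumerate(tuple1)}
params.update({"b{}".format(i): v for i, v in enumerate(tuple2)})

df = pd.read_sql(sql, conn, params=params)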
There might be a better way to do it but I would add an if statement around making the query and would use .format() instead of + to parameterise the query.
Possible if statement:
if len(tuple1) < 2:
    tuple1 = tuple1[0]
This will vary based on what your input is. If you have a list of tuples you can do this:
tuples = [(4011,), (23, 24)]
new_t = []
for t in tuples:
    if len(t) == 2:
        new_t.append(t)
    elif len(t) == 1:
        new_t.append(t[0])
Output:
[4011, (23, 24)]
A better way of parameterising queries using .format():
sql = "select * from table1 where col1 in {} and col2 in {}".format(str(tuple1), str(tuple2))
Hope this helps!
select * from table_name where 1=1 and (column_a, column_b) not in ((28,1),(25,1))

MYSQL: how to insert statement without specifying col names or question marks?

I have a list of tuples which I'm inserting into a table.
Each tuple has 50 values. How do I insert without having to specify the column names and how many ? placeholders there are?
col1 is an auto-increment column, so my insert statement starts at col2 and ends at col51.
current code:
l = [(1,2,3,.....),(2,4,6,.....),(4,6,7,.....)...]
for tup in l:
    cur.execute(
        """insert into TABLENAME(col2,col3,col4.........col50,col51) VALUES(?,?,?,.............)
        """)
want:
insert into TABLENAME(col*) VALUES(*)
MySQL's syntax for INSERT is documented here: http://dev.mysql.com/doc/refman/5.7/en/insert.html
There is no wildcard syntax like you show. The closest thing is to omit the column names:
INSERT INTO MyTable VALUES (...);
But I don't recommend doing that. It works only if you are certain you're going to specify a value for every column in the table (even the auto-increment column), and your values are guaranteed to be in the same order as the columns of the table.
You should learn to use code to build the SQL query based on the values in your application. Here's a Python example of the way I do it. Suppose you have a dict of column: value pairs called data_values.
placeholders = ['%s'] * len(data_values)

sql_template = """
INSERT INTO MyTable ({columns}) VALUES ({placeholders})
"""

sql = sql_template.format(
    columns=','.join(data_values.keys()),
    placeholders=','.join(placeholders)
)

cur = db.cursor()
cur.execute(sql, list(data_values.values()))
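For instance, a hedged usage sketch with a hypothetical two-column dict (pymysql, the connection details, and the column names are assumptions, not from the question):

import pymysql  # assumption: any DB-API driver that uses %s placeholders works the same way

# hedged usage sketch of the builder above
data_values = {"col2": "alice", "col3": 42}

placeholders = ['%s'] * len(data_values)
sql = "INSERT INTO MyTable ({columns}) VALUES ({placeholders})".format(
    columns=','.join(data_values.keys()),
    placeholders=','.join(placeholders)
)
# sql is now: INSERT INTO MyTable (col2,col3) VALUES (%s,%s)

db = pymysql.connect(host="localhost", user="user", password="pw", database="test")
cur = db.cursor()
cur.execute(sql, list(data_values.values()))  # binds ("alice", 42) as the parameters
db.commit()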
Example code to put before your loop:
cols = "("
for x in range(2, 52):
    cols = cols + "col" + str(x) + ","
cols = cols[:-1] + ")"
Inside your loop:
for tup in l:
    cur.execute("insert into TABLENAME " + cols + " VALUES {0}".format(tup))
This is off the top of my head, with no error checking.
