Running select query on db for different variables using python - python

I am using Python to open a database connection and read a CSV file. For each line in the CSV I want to run a PostgreSQL query and get the value corresponding to that line.
The DB connection and file reading work fine, and the query also works when I run it with a hardcoded value. But when I run the query for each row of the CSV file using a Python variable, I do not get the correct value.
cursor.execute("select team from users.teamdetails where p_id = '123abc'")
The query above works fine.
But when I try it with the values fetched from the CSV file, I do not get the correct value:
cursor.execute("select team from users.teamdetails where p_id = queryPID")
Complete code for reference:
import psycopg2
import csv
conn = psycopg2.connect(dbname='', user='', password='', host='', port='')
cursor = conn.cursor()
with open('playerid.csv','r') as csv_file:
    csv_reader = csv.reader(csv_file)
    for line in csv_reader:
        queryPID = line[0]
        cursor.execute("select team from users.teamdetails where p_id = queryPID")
        team = cursor.fetchone()
        print (team[0])
conn.close()

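In your code, queryPID inside the string literal is just text: the Python variable is never substituted, so PostgreSQL tries to interpret queryPID as a column name.
DO NOT fix this by concatenating the CSV data into the SQL string. Use a parameterised query.
Use %s inside your string, then pass the value as an additional argument: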
cursor.execute('select team from users.teamdetails where p_id = %s', (queryPID,))
Concatenation of text leaves your application vulnerable to SQL injection.
https://www.psycopg.org/docs/usage.html
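A minimal, runnable sketch of the per-row lookup with a bound parameter. To keep it self-contained it uses the standard-library sqlite3 module as a stand-in for PostgreSQL, so the placeholder is ? rather than psycopg2's %s. The table contents, the sample IDs and the lookup_team helper are made up for illustration.

```python
import sqlite3

# sqlite3 stands in for PostgreSQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teamdetails (p_id TEXT PRIMARY KEY, team TEXT)")
conn.executemany(
    "INSERT INTO teamdetails (p_id, team) VALUES (?, ?)",
    [("123abc", "red"), ("456def", "blue")],  # made-up sample rows
)


def lookup_team(cursor, player_id):
    """Return the team for player_id, or None when no row matches."""
    # The value travels separately from the SQL text and the driver binds it.
    # With psycopg2 the placeholder would be %s instead of ?.
    cursor.execute("SELECT team FROM teamdetails WHERE p_id = ?", (player_id,))
    row = cursor.fetchone()
    return row[0] if row is not None else None


cursor = conn.cursor()
# Hypothetical IDs, as they might come out of the first CSV column.
player_ids = ["123abc", "456def", "missing"]
results = {pid: lookup_team(cursor, pid) for pid in player_ids}
print(results)
conn.close()
```

Checking for None before indexing matters here: fetchone() returns None when an ID from the CSV has no matching row, and team[0] would then raise a TypeError.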

Related

mysql insert into csv file not working in python

I have tried many times and searched all over the internet and this is still not working for me.
I am trying to read from a csv file and insert data into a database with python.
This is my code, and I don't understand why it's not working
import mysql.connector
import csv
import pandas as pd
with open(r'files\files1.csv') as csv_file:
    csvfile = csv.reader(csv_file, delimiter=';')
    allvalues=[]
    for row in csvfile:
        value = (row[0],row[1],row[2])
        allvalues.append(value)
print(allvalues)
db = mysql.connector.connect(
    host = 'ip',
    user = 'user',
    passwd = 'pass',
    database = 'db',
    auth_plugin='mysql_native_password'
)
cursor = db.cursor()
query = "INSERT INTO table1 (col1, col2, col3) VALUES (%s , %s , %s)"
cursor.execute(query, allvalues)
db.commit()
This gives the following error:
result = self._cmysql.convert_to_mysql(*params)
_mysql_connector.MySQLInterfaceError: Python type tuple cannot be converted
I also want to mention that I have tried many other ways to insert into the table, not only the method above, and every time I get a different error.
Can someone please tell me how to do it? I would really appreciate it.
Thank you very much
Use cursor.executemany(query, allvalues).
cursor.execute() expects the parameters for a single row, but allvalues is a list of row tuples. If you have multiple rows saved in a list or tuple, use
cursor.executemany(query, list) or cursor.executemany(query, tuple)
Or you can use a for loop:
for value in allvalues:
    cursor.execute(query, value)
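Both variants can be sketched as follows. This is an illustration rather than the asker's setup: it uses the standard-library sqlite3 module instead of mysql.connector so it runs anywhere, which means the placeholders are ? instead of %s, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT, col3 TEXT)")

# One tuple per CSV row (made-up values).
allvalues = [("a", "b", "c"), ("d", "e", "f")]
query = "INSERT INTO table1 (col1, col2, col3) VALUES (?, ?, ?)"

cursor = conn.cursor()

# Variant 1: hand the whole list of rows to executemany().
cursor.executemany(query, allvalues)
conn.commit()

# Variant 2: call execute() once per row, each time with a single tuple.
for value in allvalues:
    cursor.execute(query, value)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(row_count)  # each variant inserted both rows
conn.close()
```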

Uploading data with psycopg2 and python

With the following commands I am trying to upload a CSV file whose columns are separated by tabs and where a column can sometimes hold a null value.
conn = psycopg2.connect(host="localhost",
                        port="5432",
                        user="postgres",
                        password="somepwd",
                        database="mydb",
                        options="-c search_path=dbo")
...
cur = conn.cursor()
with open(opath, "r") as opath_file:
    next(opath_file) # skip the header row
    cur.copy_from(opath_file, table_name[3:], null='', columns=cols.split(','))
cols is a string with the column names separated by ','.
The table named table_name[3:] belongs to the dbo schema.
This code runs and no error is reported, but no data is uploaded. The owner of the db is postgres.
Any ideas?
Would you believe me if I said the problem was that I needed to run
conn.commit()
after the cur.copy_from command?
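The effect of a missing commit can be reproduced without a PostgreSQL server. The sketch below is an analogy, not the psycopg2 code from the question: it uses the standard-library sqlite3 module, which likewise opens a transaction implicitly before an INSERT, and a throwaway database file. The items table and the count_rows helper are made up for illustration.

```python
import os
import sqlite3
import tempfile


def count_rows(path):
    """Open a fresh connection and count the rows that were actually saved."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "demo.db")

    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE items (name TEXT)")
    conn.commit()

    # Insert, then close without committing: the pending transaction is discarded.
    conn.execute("INSERT INTO items (name) VALUES (?)", ("lost",))
    conn.close()
    rows_without_commit = count_rows(db_path)

    # Same insert, but committed before closing.
    conn = sqlite3.connect(db_path)
    conn.execute("INSERT INTO items (name) VALUES (?)", ("kept",))
    conn.commit()
    conn.close()
    rows_with_commit = count_rows(db_path)

print(rows_without_commit, rows_with_commit)
```

No error is raised in the first case, which matches the symptom described in the question: the statement runs, but nothing is stored.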

Python Running SQL Query With Temp Tables

I am new to the Python-SQL connectivity world. My goal is to retrieve data from SQL into a pandas DataFrame by executing long SQL queries through my Python script.
Most of my SQL queries are long, with multiple interim temp tables before the final SELECT statement from the last temp table. When I run such a monolithic query in Python I get this error:
"pandas.io.sql.DatabaseError: Execution failed on sql"
The same queries run absolutely fine in SQL Server Management Studio.
I suspect this is due to the interim temp tables, because if I split my long query into two pieces (everything before the final SELECT in the first section and the final SELECT in the second) and run the two sections sequentially, they run fine.
Can someone explain why this happens, or alternatively what the best way is to run long queries with temp tables/views and retrieve the results in a pandas DataFrame?
Here is my sample Python code that ideally should take a file name as input and run the SQL to retrieve the results in a DataFrame; however, it fails for a query with temp tables:
import pyodbc as db
import pandas as pd
filename = 'file.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+'; PORT=1433; SERVER='+server+'; PORT=1443; DATABASE='+database+'; UID='+username+'; PWD='+ password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sqlcommand1 = sqlfile
df_table = pd.read_sql(sqlcommand1, conn)
If I break my SQL query into two pieces (one with all the temp tables and the second with the final SELECT), then it runs fine. Below is a modified version that splits the long query at '/**/', and it works fine:
"""
This Function Reads a SQL Script From an External File and Executes The
Script in SQL. If The SQL Script Has a Bunch of Temp Tables/Views
Followed By a Select Statement to Retrieve Data From Those Views Then Input
SQL File Should Have '/**/' Immediately Before the Final
Select Statement. This is to Ensure Final Select Statement is Executed on
the Temporary Views Already Run by Python.
Input is a SQL File Name and Output is a DataFrame
"""
import pyodbc as db
import pandas as pd
filename = 'filename.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+'; PORT=1433; SERVER='+server+'; PORT=1443; DATABASE='+database+'; UID='+username+'; PWD='+ password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sql = sqlfile.split('/**/')
sqlcommand1 = sql[0] #1st Section of Query with temp tables
sqlcommand2 = sql[1] #2nd section of Query with final SELECT statement
conn.execute(sqlcommand1)
df_table = pd.read_sql(sqlcommand2, conn)
Quick and dirty answer: if using T-SQL, put the line SET NOCOUNT ON at the beginning of your query.
As @Parfait mentioned above, the pandas read_sql method can only handle one result set. However, when you populate a temp table in T-SQL you do generate a result of the form "(XX row(s) affected)", which is what causes your original query to fail. By setting NOCOUNT ON you eliminate those early results and only get the rows from your final SELECT statement.
Alternatively, if you use a pyodbc cursor instead of pandas, you can call nextset() to skip the results from the temp table statement(s). More info on pyodbc here.
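Both suggestions can be sketched as small helpers. This is an illustration under assumptions, not code from the answer: with_nocount and rows_from_first_result are hypothetical names, and FakeCursor is a stand-in that only mimics the description, nextset() and fetchall() members of a DB-API cursor such as pyodbc's, so the sketch runs without a SQL Server connection.

```python
def with_nocount(sql_script):
    """Prefix a T-SQL batch with SET NOCOUNT ON unless it already starts with it."""
    if sql_script.lstrip().upper().startswith("SET NOCOUNT ON"):
        return sql_script
    return "SET NOCOUNT ON;\n" + sql_script


def rows_from_first_result(cursor):
    """Skip statements that produced no columns and fetch the first real result set.

    Expects a cursor that has already executed the batch.
    """
    # description is None while the cursor sits on a statement without columns.
    while cursor.description is None:
        if not cursor.nextset():
            return []  # the batch produced no result set at all
    return cursor.fetchall()


class FakeCursor:
    """Stand-in cursor: two row-count-only steps, then one SELECT result."""

    def __init__(self):
        self._sets = [None, None, [("a", 1), ("b", 2)]]
        self._pos = 0

    @property
    def description(self):
        return None if self._sets[self._pos] is None else ("name", "n")

    def nextset(self):
        if self._pos + 1 >= len(self._sets):
            return False
        self._pos += 1
        return True

    def fetchall(self):
        return list(self._sets[self._pos])


script = with_nocount("SELECT 1")
rows = rows_from_first_result(FakeCursor())
print(script)
print(rows)
```

With a real connection, the string returned by with_nocount would be what gets passed to pd.read_sql, while rows_from_first_result would be called on a cursor right after cursor.execute().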

Python PYDOBC Insert Into SQL Server DB with Parameters

I am currently trying to use pyodbc to insert data from a .csv into an Azure SQL Server database. I found a majority of this syntax on Stack Overflow, however for some reason I keep getting one of two different errors.
1) Whenever I use the following code, I get an error that states 'The SQL contains 0 parameter markers, but 7 parameters were supplied'.
import pyodbc
import csv
import time
cnxn = pyodbc.connect('driver', user='username', password='password', database='database')
cnxn.autocommit = True
cursor = cnxn.cursor()
csvfile = open('CSV File')
csv_data = csv.reader(csvfile)
SQL="insert into table([Col1],[Col2],[Col3],[Col4],[Col5],[Col6],[Col7]) values ('?','?','?','?','?','?','?')"
for row in csv_data:
    cursor.execute(SQL, row)
    time.sleep(1)
cnxn.commit()
cnxn.close()
2) To get rid of that error, I tried defining the parameter markers by adding '=?' to each of the columns in the insert statement (see code below). However, this gives the following error: ProgrammingError: ('42000', "[42000] [Microsoft] [ODBC SQL Server Driver][SQL Server] Incorrect syntax near '='").
import pyodbc
import csv
import time
cnxn = pyodbc.connect('driver', user='username', password='password', database='database')
cnxn.autocommit = True
cursor = cnxn.cursor()
csvfile = open('CSV File')
csv_data = csv.reader(csvfile)
SQL="insert into table([Col1]=?,[Col2]=?,[Col3]=?,[Col4]=?,[Col5]=?,[Col6]=?,[Col7]=?) values ('?','?','?','?','?','?','?')"
for row in csv_data:
    cursor.execute(SQL, row)
    time.sleep(1)
cnxn.commit()
cnxn.close()
This is the main error I am having trouble with. I have searched all over Stack Overflow and can't seem to find a solution. I know this error is probably trivial, but I am new to Python and would greatly appreciate any advice or help.
Since SQL Server can import your entire CSV file with a single statement, this is a reinvention of the wheel.
BULK INSERT my_table FROM 'CSV_FILE'
WITH ( FIELDTERMINATOR=',', ROWTERMINATOR='\n');
If you want to persist with using Python, just execute the above query with pyodbc!
If you would still prefer to execute thousands of statements instead of just one:
SQL="insert into table([Col1],[Col2],[Col3],[Col4],[Col5],[Col6],[Col7]) values (?,?,?,?,?,?,?)"
Note that the ' surrounding each ? shouldn't be there. A quoted '?' is a string literal, not a parameter marker, which is why the driver reports 0 parameter markers.
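The difference between a quoted '?' and a bare ? can be seen in a small runnable sketch. It uses the standard-library sqlite3 module as a stand-in, since it shares the ? (qmark) placeholder style with pyodbc; the demo table and the two-column row are invented, and the exact error text differs from the pyodbc message quoted in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (col1 TEXT, col2 TEXT)")
row = ["x", "y"]  # made-up CSV row

# Quoted markers are plain string literals, so the statement has no
# placeholders and supplying parameters is rejected.
quoted_error = None
try:
    conn.execute("INSERT INTO demo (col1, col2) VALUES ('?', '?')", row)
except sqlite3.Error as exc:
    quoted_error = str(exc)

# Bare markers are real placeholders: one value is bound per ?.
conn.execute("INSERT INTO demo (col1, col2) VALUES (?, ?)", row)
conn.commit()

stored = conn.execute("SELECT col1, col2 FROM demo").fetchall()
print(quoted_error)
print(stored)
conn.close()
```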
# creating column list for insertion
colsInsert = "["+"],[".join([str(i) for i in mydata.columns.tolist()]) +']'
# Insert DataFrame records one by one (pyodbc uses ? as the parameter marker).
for i,row in mydata.iterrows():
    sql = "INSERT INTO Test (" +colsInsert + ") VALUES (" + "?,"*(len(row)-1) + "?)"
    cursor.execute(sql, tuple(row))
# the connection (c here) is not autocommitted by default, so we must commit to save our changes
c.commit()

Insert data from file into database

I have a .sql file with multiple insert statements (1000+) and I want to run the statements in this file against my Oracle database.
For now, I'm using Python with ODBC to connect to my database with the following:
import sys
import pyodbc
from ConfigParser import SafeConfigParser
def db_call(cfgFile, sql):
    parser = SafeConfigParser()
    parser.read(cfgFile)
    dsn = parser.get('odbc', 'dsn')
    uid = parser.get('odbc', 'user')
    pwd = parser.get('odbc', 'pass')
    try:
        con = pyodbc.connect('DSN=' + dsn + ';PWD=' + pwd + ';UID=' + uid)
        cur = con.cursor()
        cur.execute(sql)
        con.commit()
    except pyodbc.DatabaseError, e:
        print 'Error %s' % e
        sys.exit(1)
    finally:
        if con and cur:
            cur.close()
            con.close()
with open('theFile.sql','r') as f:
    cfgFile = 'c:\\dbinfo\\connectionInfo.cfg'
    #here goes the code to insert the contents into the database using db_call_many
    statements = f.read()
    db_call(cfgFile,statements)
But when I run it I receive the following error:
pyodbc.Error: ('HY000', '[HY000] [Oracle][ODBC][Ora]ORA-00911: invalid character\n (911) (SQLExecDirectW)')
But the file contains only statements like this:
INSERT INTO table (movie,genre) VALUES ('moviename','horror');
Edit
Adding print '<{}>'.format(statements) before the db_call(cfgFile,statements) call, I get the following result (100+ statements):
<INSERT INTO table (movie,genre) VALUES ('moviename','horror');INSERT INTO table (movie,genre) VALUES ('moviename_b','horror');INSERT INTO table (movie,genre) VALUES ('moviename_c','horror');>
Thanks for taking the time to read this.
Now it's somewhat clarified: you have a lot of separate SQL statements such as INSERT INTO table (movie,genre) VALUES ('moviename','horror');
In that case, what you're effectively after is something like cur.executescript() rather than a single cur.execute() (I have no idea whether pyodbc supports it). Is there any reason you can't just run the script against the database itself?
When you read a file using the read() function, the newline (\n) at the end of the file is read too. I think you should use db_call(cfgFile,statements[:-1]) to strip that trailing newline.
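Another option is to split the script and send the statements one at a time without their trailing ';', since a ';' passed through to Oracle is a common trigger for ORA-00911. The sketch below is an assumption-laden illustration, not a tested pyodbc/Oracle solution: split_statements and run_script are hypothetical helpers, and the split is deliberately naive, so it would break on a ';' inside a quoted string literal.

```python
def split_statements(script):
    """Split a script on ';' and drop empty pieces and surrounding whitespace.

    Naive by design: a ';' inside a quoted string literal would be split too.
    """
    return [part.strip() for part in script.split(";") if part.strip()]


def run_script(cursor, script):
    """Execute each statement separately on a DB-API cursor; return how many ran."""
    statements = split_statements(script)
    for statement in statements:
        cursor.execute(statement)
    return len(statements)


# Two statements in the shape shown in the question, with a trailing newline.
script = (
    "INSERT INTO table (movie,genre) VALUES ('moviename','horror');"
    "INSERT INTO table (movie,genre) VALUES ('moviename_b','horror');\n"
)
statements = split_statements(script)
for statement in statements:
    print(statement)
```

In the question's db_call, run_script(cur, sql) would take the place of cur.execute(sql), followed by a single con.commit().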
