I am trying to achieve the following using Python and the MySQLdb interface:
Read the contents of a table that has a few million rows.
Process and modify the output of every row.
Put the modified rows into another table.
It seems sensible to me to iterate over each row, process on-the-fly and then insert each new row into the new table on-the-fly.
This works:
import MySQLdb
import MySQLdb.cursors
conn = MySQLdb.connect(
    host="somehost", user="someuser",
    passwd="somepassword", db="somedb")
cursor1 = conn.cursor(MySQLdb.cursors.Cursor)
query1 = "SELECT * FROM table1"
cursor1.execute(query1)
cursor2 = conn.cursor(MySQLdb.cursors.Cursor)
for row in cursor1:
    values = some_function(row)
    query2 = "INSERT INTO table2 VALUES (%s, %s, %s)"
    cursor2.execute(query2, values)
cursor2.close()
cursor1.close()
conn.commit()
conn.close()
But this is slow and memory-consuming since it's using a client-side cursor for the SELECT query. If I instead use a server-side cursor for the SELECT query:
cursor1 = conn.cursor(MySQLdb.cursors.SSCursor)
Then I get a 2014 error:
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method SSCursor.__del__ of <MySQLdb.cursors.SSCursor object at 0x925d6ec>> ignored
So it doesn't seem to like starting another cursor while iterating over a server-side cursor, which seems to leave me stuck with a very slow client-side iterator.
Any suggestions?
You need a separate connection to the database. The first connection is busy streaming the result set, so you can't run the insert query on it.
Try this:
import MySQLdb
import MySQLdb.cursors
conn = MySQLdb.connect(
    host="somehost", user="someuser",
    passwd="somepassword", db="somedb")
cursor1 = conn.cursor(MySQLdb.cursors.SSCursor)
query1 = "SELECT * FROM table1"
cursor1.execute(query1)
# second connection, used only for the inserts
insertConn = MySQLdb.connect(
    host="somehost", user="someuser",
    passwd="somepassword", db="somedb")
cursor2 = insertConn.cursor(MySQLdb.cursors.Cursor)
for row in cursor1:
    values = some_function(row)
    query2 = "INSERT INTO table2 VALUES (%s, %s, %s)"
    cursor2.execute(query2, values)
cursor2.close()
cursor1.close()
conn.commit()
conn.close()
insertConn.commit()
insertConn.close()
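If the per-row INSERT round trips turn out to be the next bottleneck, one option is to group the streamed rows into batches and hand each batch to cursor2.executemany(). The sketch below shows only the batching part. The helper name batched_rows and the batch size are assumptions made for illustration and are not part of MySQLdb.

```python
from itertools import islice


def batched_rows(rows, batch_size=1000):
    """Yield lists of at most batch_size items from any iterable of rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical use with the cursors above (not run here):
# for batch in batched_rows(cursor1, 1000):
#     cursor2.executemany(query2, [some_function(row) for row in batch])

# Stand-in data, so the helper can be tried without a database.
demo_batches = list(batched_rows(range(5), batch_size=2))
print(demo_batches)
```

Only one batch is held in memory at a time, so the streaming benefit of the server-side cursor is kept.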
Related
I am trying to write a script that reads a tab delimited text file and first creates a mysql table and then inserts the data into that table.
Problem: I'm stuck on writing the INSERT query because %s placeholder serves a new purpose with the mysql.connector API. Here is my code:
def insertmanyquery(tabletitle, headers, values):
    '''connects to a mysql database and inserts a list of tab delimited rows into a table'''
    cnxn = connect(all the connection parameters)
    cursor = cnxn.cursor()
    numofvalues = r"%s, " * len(headers.split(','))
    numofvalues = numofvalues[:-2]
    query = "INSERT INTO %s (%s) VALUES (%s)" % (tabletitle, headers, numofvalues)
    cursor.executemany(query, values)
    cnxn.commit()
    cursor.close()
    cnxn.close()
This would hopefully allow the insert query to adapt to however many columns are present in the table.
If I call the function as follows:
tabletitle = 'Bikes'
headers = 'BikeBrand, BikeName, Purpose, Price, YearPurchased'
values = ['Norco', 'Range', 'Enduro', 8,000.00, 2018]
insertmanyquery(tabletitle, headers, values)
I get the following error: mysql.connector.errors.ProgrammingError: Not all parameters were used in the SQL statement
If I just print the query instead of executing it, it looks fine:
INSERT INTO Bikes (BikeBrand, BikeName, Purpose, Price, YearPurchased) VALUES (%s, %s, %s, %s, %s)
I believe I am getting this error because the third %s in my INSERT query is being interpreted as a part of the connector INSERT syntax instead of first being interpreted in a pythonic manner and then being interpreted as connector syntax.
I am very new to coding, so maybe I'm approaching this all wrong. Regardless, I'd like to hear potential solutions to this problem or better ways to code this.
Thank you for your time
UPDATE:
I have tried making the query query = "INSERT INTO %s (%s)" % (tabletitle, headers) + " VALUES (" + numofvalues +");" and I still get the same error! so it doesn't have to do with using the placeholder.. >.<
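One likely cause, not confirmed in this thread: executemany() expects a sequence of rows, each row being a sequence itself, whereas values above is one flat list. In addition, Python parses 8,000.00 as two separate list elements, so that list has six items rather than five. The sketch below builds the same query and shapes the data as a list of tuples. build_insert_query is a hypothetical helper name, and because the table and column names are interpolated as plain text, they are assumed to come from a trusted source.

```python
def build_insert_query(tabletitle, headers):
    """Return an INSERT statement with one %s placeholder per column."""
    columns = [name.strip() for name in headers.split(",")]
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO %s (%s) VALUES (%s)" % (
        tabletitle, ", ".join(columns), placeholders)


query = build_insert_query(
    "Bikes", "BikeBrand, BikeName, Purpose, Price, YearPurchased")

# One tuple per row; 8000.00 is written without a thousands separator.
rows = [("Norco", "Range", "Enduro", 8000.00, 2018)]

print(query)
# Hypothetical call with an open mysql.connector cursor (not run here):
# cursor.executemany(query, rows)
```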
I am using PyMySQL in Python 2.7. I have to create a function where, given a table name, the query will find the unique values of all the columns.
Since more than one table is involved, I do not want to hard-code the table name. A simple query looks like:
cursor.execute(" SELECT DISTINCT(`Trend`) AS `Trend` FROM `Table_1` ORDER BY `Trend` DESC ")
I want to do something like:
tab = 'Table_1'
cursor.execute(" SELECT DISTINCT(`Trend`) AS `Trend` FROM tab ORDER BY `Trend` DESC ")
I am getting the following error:
ProgrammingError: (1146, u"Table 'Table_1.tab' doesn't exist")
Can someone please help? TIA
Make sure the database you're using is correct, and use %s to format your SQL statement.
DB_SCHEMA = 'test_db'
table_name = 'table1'
connection = pymysql.connect(host=DB_SERVER_IP,
                             port=3306,
                             db=DB_SCHEMA,
                             charset='UTF8',
                             cursorclass=pymysql.cursors.DictCursor
                             )
try:
    with connection.cursor() as cursor:
        # the table name is inserted with string formatting, not as a bound parameter
        sql = "SELECT DISTINCT(`Trend`) AS `Trend` FROM `%s` ORDER BY `Trend` DESC" % (table_name)
        cursor.execute(sql)
    connection.commit()
except Exception as e:
    print(e)
finally:
    connection.close()
Hope this helps.
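Because a table name cannot be passed as a bound parameter, the string formatting above puts table_name directly into the SQL text. A more cautious variant, sketched here on the assumption that the set of valid tables is known in advance, checks the name against a whitelist first. ALLOWED_TABLES and build_distinct_query are hypothetical names, and the column name is assumed to be trusted.

```python
ALLOWED_TABLES = {"Table_1", "Table_2"}  # assumed list of known tables


def build_distinct_query(table_name, column="Trend"):
    """Build the DISTINCT query only for whitelisted table names."""
    if table_name not in ALLOWED_TABLES:
        raise ValueError("unexpected table name: %r" % (table_name,))
    return ("SELECT DISTINCT(`{col}`) AS `{col}` FROM `{tab}` "
            "ORDER BY `{col}` DESC").format(col=column, tab=table_name)


sql = build_distinct_query("Table_1")
print(sql)
# Hypothetical call with an open PyMySQL cursor (not run here):
# cursor.execute(sql)
```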
I am currently trying to use pyodbc to insert data from a .csv into an Azure SQL Server database. I found a majority of this syntax on Stack Overflow, however for some reason I keep getting one of two different errors.
1) Whenever I use the following code, I get an error that states 'The SQL contains 0 parameter markers, but 7 parameters were supplied'.
import pyodbc
import csv
import time
cnxn = pyodbc.connect('driver', user='username', password='password', database='database')
cnxn.autocommit = True
cursor = cnxn.cursor()
csvfile = open('CSV File')
csv_data = csv.reader(csvfile)
SQL="insert into table([Col1],[Col2],[Col3],[Col4],[Col5],[Col6],[Col7]) values ('?','?','?','?','?','?','?')"
for row in csv_data:
    cursor.execute(SQL, row)
time.sleep(1)
cnxn.commit()
cnxn.close()
2) To get rid of that error, I tried defining the parameter markers by adding '=?' to each of the columns in the insert statement (see code below). However, this then gives the following error: ProgrammingError: ('42000', "[42000] [Microsoft] [ODBC SQL Server Driver][SQL Server] Incorrect syntax near '='").
import pyodbc
import csv
import time
cnxn = pyodbc.connect('driver', user='username', password='password', database='database')
cnxn.autocommit = True
cursor = cnxn.cursor()
csvfile = open('CSV File')
csv_data = csv.reader(csvfile)
SQL="insert into table([Col1]=?,[Col2]=?,[Col3]=?,[Col4]=?,[Col5]=?,[Col6]=?,[Col7]=?) values ('?','?','?','?','?','?','?')"
for row in csv_data:
    cursor.execute(SQL, row)
time.sleep(1)
cnxn.commit()
cnxn.close()
This is the main error I am having trouble with. I have searched all over Stack Overflow and can't seem to find a solution. I know this error is probably very trivial, but I am new to Python and would greatly appreciate any advice or help.
Since SQL Server can import your entire CSV file with a single statement, this is a reinvention of the wheel.
BULK INSERT my_table FROM 'CSV_FILE'
WITH ( FIELDTERMINATOR=',', ROWTERMINATOR='\n');
If you want to persist with using Python, just execute the above query with pyodbc!
If you would still prefer to execute thousands of statements instead of just one:
SQL="insert into table([Col1],[Col2],[Col3],[Col4],[Col5],[Col6],[Col7]) values (?,?,?,?,?,?,?)"
Note that the ' surrounding each ? shouldn't be there.
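The difference between a quoted '?' and a bare ? can be reproduced without a SQL Server instance. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, since it uses the same ? marker style as pyodbc. The table and column names are made up, and the exact error text differs from what pyodbc reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table demo (col1 text, col2 text)")
row = ("a", "b")

# Quoted markers are string literals, so the statement has zero parameters.
try:
    conn.execute("insert into demo (col1, col2) values ('?', '?')", row)
    quoted_failed = False
except sqlite3.Error:
    quoted_failed = True

# Bare markers are real parameters and are bound to the row values.
conn.execute("insert into demo (col1, col2) values (?, ?)", row)
stored = conn.execute("select col1, col2 from demo").fetchall()
conn.close()

print(quoted_failed, stored)
```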
# Build the bracketed column list for the INSERT statement
colsInsert = "[" + "],[".join([str(i) for i in mydata.columns.tolist()]) + "]"
# Insert DataFrame records one by one.
for i, row in mydata.iterrows():
    # parameter markers are bare question marks (no % prefix)
    sql = "INSERT INTO Test (" + colsInsert + ") VALUES (" + "?," * (len(row) - 1) + "?)"
    cursor.execute(sql, tuple(row))
# the connection is not autocommitted by default, so we must commit to save our changes
c.commit()  # c is assumed to be the open connection object
I am trying to insert the data entered into a web form into a database table. I pass the data to a function that performs the insert, but it does not succeed. Below is my code:
def addnew_to_database(tid,pid,usid,address,status,phno,email,ord_date,del_date):
    connection = mysql.connector.connect(user='admin_operations', password='mypassword',host='127.0.0.1',database='tracking_system')
    try:
        print tid,pid,usid,address,status,phno,email,ord_date,del_date
        cursor = connection.cursor()
        cursor.execute("insert into track_table (tid,pid,usid,address,status,phno,email,ord_date,del_date) values(tid,pid,usid,address,status,phno,email,ord_date,del_date)")
        cursor.execute("insert into user_table (tid,usid) values(tid,usid)")
    finally:
        connection.commit()
        connection.close()
You should pass the variables as an argument to .execute instead of putting them in the actual query. E.g.:
cursor.execute("""insert into track_table
(tid,pid,usid,address,status,phno,email,ord_date,del_date)
values (%s,%s,%s,%s,%s,%s,%s,%s,%s)""",
(tid,pid,usid,address,status,phno,email,ord_date,del_date))
cursor.execute("""insert into user_table
(tid,usid)
values (%s,%s)""",(tid,usid))
You should tell us what API you are using and what the error code is.
You should define the values in the call to execute(). Right now, the names inside the SQL string do not reference anything.
Typically, when you use a variable name inside a SQL statement this way, you need to indicate that it is a placeholder you are binding data to. Depending on the driver's paramstyle, this might mean replacing it with positional placeholders such as (:1,:2,:3,...) or (%s,%s,...) that correspond to an ordered list, or with named placeholders (:tid,:pid,...) whose values you then supply as a dictionary in the second argument of execute(). Note that mysql.connector, used in the question, accepts only the %s and %(name)s forms.
Like this:
track_table_data = [tid,pid,usid,address,status,phno,email,ord_date,del_date]
user_table_data = [tid,usid]
cursor.execute("insert into track_table (tid,pid,usid,address,status,phno,email,ord_date,del_date) values(:1,:2,:3,:4,:5,:6,:7,:8,:9)", track_table_data)
cursor.execute("insert into user_table (tid,usid) values(:1,:2)",user_table_data)
or
cursor.execute("insert into track_table (tid,pid,usid,address,status,phno,email,ord_date,del_date) values(:tid,:pid,:usid,:address,:status,:phno,:email,:ord_date,:del_date)", {'tid':tid,'pid':pid,'usid':usid,'address':address,'status':status,'phno':phno,'email':email,'ord_date':ord_date,'del_date':del_date})
cursor.execute("insert into user_table (tid,usid) values(:tid,:usid)",{'tid':tid,'usid':usid})
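Which placeholder form a driver accepts is exposed by its DB-API paramstyle attribute. As a runnable illustration, the sketch below uses the built-in sqlite3 module, which accepts ? and :name. It is a stand-in only, and the table is a cut-down, made-up version of track_table.

```python
import sqlite3

print(sqlite3.paramstyle)  # the module's default marker style

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table track_table (tid text, usid text, status text)")

# Positional markers bound from an ordered sequence.
cur.execute("insert into track_table (tid, usid, status) values (?, ?, ?)",
            ("t1", "u1", "shipped"))

# Named markers bound from a dictionary.
cur.execute(
    "insert into track_table (tid, usid, status) "
    "values (:tid, :usid, :status)",
    {"tid": "t2", "usid": "u2", "status": "pending"})
conn.commit()

rows = cur.execute("select tid, usid, status from track_table "
                   "order by tid").fetchall()
conn.close()
print(rows)
```

With mysql.connector the same two calls would use %s and %(name)s markers instead.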
I am using SQLite for the first time; I used XAMPP before. Here is my situation: each time I run the code below, records are not appended to the end of the table. Instead, the table is created anew, so it behaves just like a console.
Can any one tell me what I am doing wrong here?
import sqlite3
db = sqlite3.connect('test.db')
db.row_factory = sqlite3.Row
db.execute('drop table if exists test')
db.execute('create table test (t1 text,i1 text)')
db.execute('insert into test (t1, i1) values (?, ?)',('xyzs','51'))
cursor = db.execute('select * from test')
for row in cursor:
    print(row['t1'],row['i1'])
First, it is better practice to execute commands on a cursor rather than on the connection itself (sqlite3's connection.execute() is a nonstandard shortcut). Second, you need to commit your transactions:
import sqlite3
db = sqlite3.connect('test.db')
db.row_factory = sqlite3.Row
cur = db.cursor()  # getting a cursor
cur.execute('drop table if exists test')
cur.execute('create table test (t1 text,i1 text)')
db.commit()  # commit the transaction, note commits are done
             # at the connection, not on the cursor
cur.execute('insert into test (t1, i1) values (?, ?)',('xyzs','51'))
db.commit()
cursor = cur.execute('select * from test')
for row in cursor:
    print(row['t1'],row['i1'])
Please have a look at the documentation. This will help you once you start working with other databases in Python because they all follow the same API.
This line drops the old table:
db.execute('drop table if exists test')
And this one creates a new table:
db.execute('create table test (t1 text,i1 text)')
That should explain your problem. Remove these two lines and you'll be fine, but create the table separately first.
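One way to follow this advice without a separate setup script is CREATE TABLE IF NOT EXISTS, which SQLite supports. A minimal sketch follows. The add_record helper is hypothetical, and an in-memory database stands in for test.db so the example leaves no file behind.

```python
import sqlite3


def add_record(db, t1, i1):
    """Create the table only if it is missing, then append one row."""
    db.execute("create table if not exists test (t1 text, i1 text)")
    db.execute("insert into test (t1, i1) values (?, ?)", (t1, i1))
    db.commit()


db = sqlite3.connect(":memory:")  # use 'test.db' to persist between runs
add_record(db, "xyzs", "51")
add_record(db, "abcd", "52")      # a second call appends, nothing is dropped
records = db.execute("select t1, i1 from test order by rowid").fetchall()
db.close()
print(records)
```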