Why pymysql not insert record into table? - python

I am pretty new to Python development. I have a long Python script that "clones" a database and adds additional stored functions and procedures. Clone means copying only the schema of the DB. These steps work fine.
My question is about pymysql insert execution:
I have to copy some table contents into the new DB. I don't get any SQL error. If I debug or print the created INSERT INTO command, it is correct (I've tested it in an SQL editor/handler). The insert execution seems correct because the result contains the exact row count... but all rows are missing from the destination table in the destination DB...
(Of course the DB_* variables have been defined!)
import pymysql

liveDbConn = pymysql.connect(DB_HOST, DB_USER, DB_PWD, LIVE_DB_NAME)
testDbConn = pymysql.connect(DB_HOST, DB_USER, DB_PWD, TEST_DB_NAME)
tablesForCopy = ['role', 'permission']
for table in tablesForCopy:
    with liveDbConn.cursor() as liveCursor:
        # Get name of columns
        liveCursor.execute("DESCRIBE `%s`;" % (table))
        columns = ''
        for column in liveCursor.fetchall():
            columns += '`' + column[0] + '`,'
        columns = columns.strip(',')

        # Get and convert values
        values = ''
        liveCursor.execute("SELECT * FROM `%s`;" % (table))
        for result in liveCursor.fetchall():
            data = []
            for item in result:
                if type(item) == type(None):
                    data.append('NULL')
                elif type(item) == type('str'):
                    data.append("'" + item + "'")
                elif type(item) == type(datetime.datetime.now()):
                    data.append("'" + str(item) + "'")
                else:  # for numeric values
                    data.append(str(item))
            v = '(' + ', '.join(data) + ')'
            values += v + ', '
        values = values.strip(', ')

    print("### table: %s" % (table))
    testDbCursor = testDbConn.cursor()
    testDbCursor.execute("INSERT INTO `" + TEST_DB_NAME + "`.`" + table +
                         "` (" + columns + ") VALUES " + values + ";")
    print("Result: {}".format(testDbCursor._result.message))

liveDbConn.close()
testDbConn.close()
Result is:
### table: role
Result: b"'Records: 16 Duplicates: 0 Warnings: 0"
### table: permission
Result: b'(Records: 222 Duplicates: 0 Warnings: 0'
What am I doing wrong? Thanks!

You have two main issues here:
You don't use conn.commit() (which here would be either liveDbConn.commit() or testDbConn.commit()). Changes to the database will not be reflected without committing them. Note that all data-modifying statements need a commit, but a SELECT, for example, does not.
Your query is open to SQL injection. This is a serious problem.
Table names cannot be parameterized, so there's not much we can do about that, but you'll want to parameterize your values. I've made several corrections to the code, covering both type checking and parameterization.
for table in tablesForCopy:
    with liveDbConn.cursor() as liveCursor:
        liveCursor.execute("SELECT * FROM `%s`;" % (table))
        name_of_columns = [item[0] for item in liveCursor.description]
        insert_list = []
        for result in liveCursor.fetchall():
            data = []
            for item in result:
                if item is None:  # test identity against the None singleton
                    data.append(None)  # the driver renders None as NULL
                elif isinstance(item, str):  # use isinstance to check type
                    data.append(item)
                elif isinstance(item, datetime.datetime):
                    data.append(item.strftime('%Y-%m-%d %H:%M:%S'))
                else:  # for numeric values
                    data.append(str(item))
            insert_list.append(data)
    testDbCursor = testDbConn.cursor()
    placeholders = ', '.join(['%s'] * len(name_of_columns))
    testDbCursor.executemany("INSERT INTO `{}`.`{}` ({}) VALUES ({})".format(
            TEST_DB_NAME,
            table,
            ', '.join('`{}`'.format(c) for c in name_of_columns),
            placeholders),
        insert_list)
    testDbConn.commit()
As an aside, from this GitHub thread I notice that executemany does not work as expected in psycopg2; it instead sends each entry as a separate query. There you would need execute_batch:
from psycopg2.extras import execute_batch

execute_batch(testDbCursor,
              "INSERT INTO {}.{} ({}) VALUES ({})".format(TEST_DB_NAME,
                                                          table,
                                                          ', '.join(name_of_columns),
                                                          placeholders),
              insert_list)
testDbConn.commit()
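The effect of the missing commit() can be demonstrated with the standard-library sqlite3 driver, which follows the same DB-API pattern as pymysql. This is a minimal sketch, not the poster's setup: the file path and the role table are illustrative, and two connections to one on-disk database stand in for the live/test pair.

```python
import sqlite3
import tempfile
import os

# Two connections to the same on-disk database stand in for the two servers.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE role (id INTEGER, name TEXT)")
writer.commit()

writer.execute("INSERT INTO role VALUES (1, 'admin')")
# Before commit, the insert lives only in the writer's open transaction,
# so the other connection cannot see it yet:
before = reader.execute("SELECT COUNT(*) FROM role").fetchone()[0]

writer.commit()
after = reader.execute("SELECT COUNT(*) FROM role").fetchone()[0]
print(before, after)
```

The insert "succeeds" on the writer's side either way (the cursor even reports the row count), which matches the symptom in the question: the rows only become visible to anyone else after commit().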

How to insert data into a table using Python pymysql
Find my solution below
import pymysql
import datetime

# Create a connection object
dbServerName = "127.0.0.1"
port = 8889
dbUser = "root"
dbPassword = ""
dbName = "blog_flask"
# charSet = "utf8mb4"
conn = pymysql.connect(host=dbServerName, user=dbUser, password=dbPassword,
                       db=dbName, port=port)
try:
    # Create a cursor object
    cursor = conn.cursor()
    # Insert rows into the MySQL table
    now = datetime.datetime.utcnow()
    my_datetime = now.strftime('%Y-%m-%d %H:%M:%S')
    cursor.execute('INSERT INTO posts (post_id, post_title, post_content, '
                   'filename, post_time) VALUES (%s,%s,%s,%s,%s)',
                   (5, 'title2', 'description2', 'filename2', my_datetime))
    conn.commit()
except Exception as e:
    print("Exception occurred: {}".format(e))
finally:
    conn.close()

Related

Python - how to issue SQL insert into statement with ' in value

I am moving data from MySQL to MSSQL; however, I have a problem with the INSERT INTO statement when there is a ' in a value.
For the export I have used the code below:
import pymssql
import mysql.connector

conn = pymssql.connect(host='XXX', user='XXX',
                       password='XXX', database='XXX')
sqlcursor = conn.cursor()
cnx = mysql.connector.connect(user='root', password='XXX',
                              database='XXX')
cursor = cnx.cursor()
sql = "SELECT Max(ID) FROM XXX;"
cursor.execute(sql)
row = cursor.fetchall()
maxID = str(row)
maxID = maxID.replace("[(", "")
maxID = maxID.replace(",)]", "")
AMAX = int(maxID)
LC = 1
while LC <= AMAX:
    LCC = str(LC)
    sql = "SELECT * FROM XX where ID ='" + LCC + "'"
    cursor.execute(sql)
    result = cursor.fetchall()
    data = str(result)
    data = data.replace("[(", "")
    data = data.replace(")]", "")
    data = data.replace("None", "NULL")
    #print(row)
    si = "insert into [XXX].[dbo].[XXX] select " + data
    #print(si)
    #sys.exit("stop")
    try:
        sqlcursor.execute(si)
        conn.commit()
    except Exception:
        print("-----------------------")
        print(si)
    LC = LC + 1
print('Import done | total count:', LC)
It is working fine until I have a ' in one of my values:
'N', '0000000000', **"test string'S nice company"**
I would like to avoid splitting the data into columns and then checking whether there is a ' in the data, as my table has about 500 fields.
Is there a smart way of replacing ' with ''?
Answer:
Added SET QUOTED_IDENTIFIER OFF to the insert statement:

si = "SET QUOTED_IDENTIFIER OFF insert into [TechAdv].[dbo].[aem_data_copy] " \
     "select " + data

In MSSQL, you can SET QUOTED_IDENTIFIER OFF; then you can use double quotes to escape a single quote, or use two single quotes to escape one quote.
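An alternative worth noting: parameterized queries sidestep the quoting problem entirely, because the driver escapes values itself. This is a minimal sketch using the standard-library sqlite3 driver (with pymssql the placeholder is %s instead of ?, and the company table and its columns are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE company (flag TEXT, code TEXT, name TEXT)")

# The value contains a single quote, but no manual escaping is needed:
row = ("N", "0000000000", "test string'S nice company")
cur.execute("INSERT INTO company VALUES (?, ?, ?)", row)
conn.commit()

name = cur.execute("SELECT name FROM company").fetchone()[0]
print(name)
```

This also avoids the SQL-injection risk that comes with concatenating raw values into the statement, regardless of how many fields the table has.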

How to dynamically generate mysql ON DUPLICATE UPDATE in python

I am trying to dynamically generate MySQL insert/update queries given a csv file.
I have a csv file hobbies.csv:
id,name,hobby
"1","rick","coding"
"2","mike","programming"
"3","tim","debugging"
I then have two functions: one to generate the queries and one to update the database:
generate_sql.py
from connect_to_database import read_db_config
from config_parser import read_csv_files
from update_db import insert_records
import csv

def generate_mysql_queries():
    csv_file_list, table_list, temp_val, temp_key, temp_table, reader, \
        header, data, data_list = ([] for i in range(9))
    val_param = '%s'
    query = ''
    total_queries = 0
    db = read_db_config(filename='config.ini', section='mysql')
    csv_file_dict = read_csv_files(filename='config.ini', section='data')
    for key, value in csv_file_dict.items():
        temp_val = [value]
        temp_key = [key]
        csv_file_list.append(temp_val)
        table_list.append(temp_key)
    for index, files in enumerate(csv_file_list):
        with open("".join(files), 'r') as f:
            reader = csv.reader(f)
            header.append(next(reader))
            data.append([row for row in reader])
            for d in range(len(data[index])):
                val_param_subs = ','.join((val_param,) * len(data[index][d]))
                total_queries += 1
                query = """INSERT INTO """ + str(db['database']) + """.""" + \
                    """""".join('{0}'.format(t) for t in table_list[index]) + \
                    """(""" + """, """.join('{0}'.format(h) for h in header[index]) + \
                    """) VALUES (%s)""" % val_param_subs + \
                    """ ON DUPLICATE KEY UPDATE """ + \
                    """=%s, """.join(header[index]) + """=%s"""
                data_list.append(data[index][d])
    insert_records(query, data_list)
I then pass the query and data to insert_records() in update_db.py:
from mysql.connector import MySQLConnection, Error
from connect_to_database import read_db_config

def insert_records(query, data):
    query_string = query
    data_tuple = tuple(data)
    try:
        db_config = read_db_config(filename='config.ini', section='mysql')
        conn = MySQLConnection(**db_config)
        cursor = conn.cursor()
        cursor.executemany(query, data_tuple)
        print("\tExecuted!")
        conn.commit()
    except Error as e:
        print('\n\tError:', e)
        print("\n\tNot Executed!")
    finally:
        cursor.close()
        conn.close()
The data passed into cursor.executemany(query, data_tuple) looks like the following (query is a string and data_tuple is a tuple):
query: INSERT INTO test.hobbies(id, name, hobby) VALUES (%s,%s,%s) ON DUPLICATE KEY UPDATE id=%s, name=%s, hobby=%s
data_tuple: (['1', 'rick', 'coding'], ['2', 'mike', 'programming'], ['3', 'tim', 'debugging'])
Given these two parameters, I get the following error:
Error: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '%s, name=%s, hobby=%s' at line 1
I've tried passing in the same string non-dynamically by just sending the full string without the '%s' parameters and it works fine. What am I missing? Any help is much appreciated.
The problem is probably the use of triple double quotes in Python. When you use this

query = """INSERT INTO """ + str(db['database']) + """.""" + \
    """""".join('{0}'.format(t) for t in table_list[index]) + \
    """(""" + """, """.join('{0}'.format(h) for h in header[index]) + \
    """) VALUES (%s)""" % val_param_subs + \
    """ ON DUPLICATE KEY UPDATE """ + \
    """=%s, """.join(header[index]) + """=%s"""

you're telling Python that everything is a string, including the %s.
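Note also that the generated query binds six %s placeholders while each row in data_tuple supplies only three values. One common way around that mismatch, assuming the update should reuse the inserted values, is MySQL's VALUES() function in the update clause, so each row needs exactly one set of parameters. A sketch of the string construction alone, using the hobbies.csv header from the question:

```python
# Build an upsert query whose ON DUPLICATE KEY UPDATE clause reuses the
# inserted values via VALUES(col), so each row binds one placeholder set.
header = ["id", "name", "hobby"]
table = "test.hobbies"

placeholders = ", ".join(["%s"] * len(header))
updates = ", ".join("{0}=VALUES({0})".format(h) for h in header)
query = "INSERT INTO {} ({}) VALUES ({}) ON DUPLICATE KEY UPDATE {}".format(
    table, ", ".join(header), placeholders, updates)
print(query)
```

The resulting statement can then be passed to executemany with the three-element rows as-is, since the update clause no longer needs its own parameters.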

SQLite3 - Executemany not finishing update on large list in Python3

I'm attempting to update around 500k rows in a SQLite database. I can create them rather quickly, but when I'm updating, it seems to hang indefinitely, and I don't get an error message. (An insert of the same size took 35 seconds; this update has been running for over 12 hours.)
The portion of my code that does the updating is:
for line in result:
    if --- blah blah blah ---:
        stuff
    else:
        counter = 1
        print("Starting to append result_list...")
        result_list = []
        for line in result:
            result_list.append((str(line), counter))
            counter += 1
        sql = 'UPDATE BRFSS2015 SET ' + col[1] + \
              ' = ? where row_id = ?'
        print("Executing SQL...")
        c.executemany(sql, result_list)
        print("Committing.")
        conn.commit()
It prints "Executing SQL..." and presumably attempts the executemany, and that's where it's stuck. The variable result is a list of records and is working as far as I can tell, because the insert statement works and is basically the same.
Am I misusing executemany? I see many threads on executemany(), but all of them, as far as I can tell, are about an error message, not about hanging indefinitely.
For reference, my full code is below. Basically I'm trying to convert an ASCII file to a SQLite database. I know I could technically insert all columns at the same time, but the machines I have access to are all limited to 32-bit Python and they run out of memory (this file is quite large, close to 1GB of text).
import pandas as pd
import sqlite3

ascii_file = r'c:\Path\to\file.ASC_'
sqlite_file = r'c:\path\to\sqlite.db'

conn = sqlite3.connect(sqlite_file)
c = conn.cursor()

# Taken from https://www.cdc.gov/brfss/annual_data/2015/llcp_varlayout_15_onecolumn.html
raw_list = [[1, "_STATE", 2],
            [17, "FMONTH", 2],
            ... many other values here
            [2154, "_AIDTST3", 1], ]

col_list = []
for col in raw_list:
    begin = (col[0] - 1)
    col_name = col[1]
    end = (begin + col[2])
    col_list.append([(begin, end,), col_name, ])

for col in col_list:
    print(col)
    col_specification = [col[0]]
    print("Parsing...")
    data = pd.read_fwf(ascii_file, colspecs=col_specification)
    print("Done")
    result = data.iloc[:, [0]]
    result = result.values.flatten()
    sql = '''CREATE table if not exists BRFSS2015
             (row_id integer NOT NULL,
             ''' + col[1] + ' text)'
    print(sql)
    c.execute(sql)
    conn.commit()
    sql = '''ALTER TABLE
             BRFSS2015 ADD COLUMN ''' + col[1] + ' text'
    try:
        c.execute(sql)
        print(sql)
        conn.commit()
    except Exception as e:
        print("Error Happened instead")
        print(e)
    counter = 1
    result_list = []
    for line in result:
        result_list.append((counter, str(line)))
        counter += 1
    if '_STATE' in col:
        counter = 1
        result_list = []
        for line in result:
            result_list.append((counter, str(line)))
            counter += 1
        sql = 'INSERT into BRFSS2015 (row_id,' + col[1] + ')' \
              + 'values (?,?)'
        c.executemany(sql, result_list)
    else:
        counter = 1
        print("Starting to append result_list...")
        result_list = []
        for line in result:
            result_list.append((str(line), counter))
            counter += 1
        sql = 'UPDATE BRFSS2015 SET ' + col[1] + \
              ' = ? where row_id = ?'
        print("Executing SQL...")
        c.executemany(sql, result_list)
    print("Committing.")
    conn.commit()
    print("Committed... moving on to next column...")
For each row to be updated, the database has to search for that row. (This is not necessary when inserting.) If there is no index on the row_id column, then the database has to go through the entire table for each update.
It would be a better idea to insert entire rows at once. If that is not possible, create an index on row_id, or better, declare it as INTEGER PRIMARY KEY.
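The suggestion can be sketched with an in-memory database (column names borrowed from the question, row count scaled down). Declaring row_id as INTEGER PRIMARY KEY makes it an alias for SQLite's internal rowid, so each UPDATE becomes a direct key lookup instead of a full table scan per row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()

# row_id as INTEGER PRIMARY KEY aliases the rowid, giving keyed lookups.
c.execute("CREATE TABLE BRFSS2015 (row_id INTEGER PRIMARY KEY, _STATE text)")
c.executemany("INSERT INTO BRFSS2015 (row_id, _STATE) VALUES (?, ?)",
              [(i, str(i)) for i in range(1, 1001)])

# The UPDATE now locates each row via the primary key index.
c.execute("ALTER TABLE BRFSS2015 ADD COLUMN FMONTH text")
c.executemany("UPDATE BRFSS2015 SET FMONTH = ? WHERE row_id = ?",
              [("m%d" % i, i) for i in range(1, 1001)])
conn.commit()

fmonth = c.execute("SELECT FMONTH FROM BRFSS2015 "
                   "WHERE row_id = 500").fetchone()[0]
print(fmonth)
```

Without the PRIMARY KEY (or a separate CREATE INDEX on row_id), each of the 500k updates in the original code scans the whole table, which explains the quadratic slowdown relative to the insert.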

How to change the cursor to the next row using pyodbc in Python

I am trying to fetch records at a regular interval from a database table that keeps growing. I am using Python and its pyodbc package to carry out the fetching. While fetching, how can I point the cursor at the row after the one that was read/fetched last, so that each fetch returns only the newly inserted records?
To explain more,
My table has 100 records and they are fetched.
After an interval the table has 200 records, and I want to fetch rows 101 to 200. And so on.
Is there a way with pyodbc cursor?
Or any other suggestion would be very helpful.
Below is the code I am trying:
#!/usr/bin/python
import pyodbc
import csv
import time

conn_str = (
    "DRIVER={PostgreSQL Unicode};"
    "DATABASE=postgres;"
    "UID=userid;"
    "PWD=database;"
    "SERVER=localhost;"
    "PORT=5432;"
)
conn = pyodbc.connect(conn_str)
cursor = conn.cursor()

def fetch_table(**kwargs):
    qry = kwargs['qrystr']
    try:
        #cursor = conn.cursor()
        cursor.execute(qry)
        all_rows = cursor.fetchall()
        rowcnt = cursor.rowcount
        rownum = cursor.description
        #return (rowcnt, rownum)
        return all_rows
    except pyodbc.ProgrammingError as e:
        print("Exception occurred as:", type(e), e)

def poll_db():
    for i in [1, 2]:
        stmt = "select * from my_database_table"
        rows = fetch_table(qrystr=stmt)
        print("***** For i = ", i, "******")
        for r in rows:
            print("ROW-> ", r)
        time.sleep(10)

poll_db()
conn.close()
I don't think you can use pyodbc, or any other ODBC package, to find "new" rows directly. But if there is a timestamp column in your table, or if you can add one (some databases can populate it automatically with the insertion time, so you don't have to change the insert queries), then you can change your query to select only the rows whose timestamp is greater than the previous one, updating the prev_timestamp variable on each iteration.
def poll_db():
    prev_timestamp = ""
    for i in [1, 2]:
        if prev_timestamp == "":
            stmt = "select * from my_database_table"
        else:
            # convert your timestamp str to match the database's format
            stmt = ("select * from my_database_table where timestamp > " +
                    str(prev_timestamp))
        rows = fetch_table(qrystr=stmt)
        prev_timestamp = datetime.datetime.now()
        print("***** For i = ", i, "******")
        for r in rows:
            print("ROW-> ", r)
        time.sleep(10)
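The watermark pattern can be sketched end-to-end with the standard-library sqlite3 driver; the events table, ts column, and timestamp strings below are illustrative, and using a parameterized comparison also avoids formatting the timestamp into the SQL string:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts TEXT)")

def insert_event(ts):
    cur.execute("INSERT INTO events (ts) VALUES (?)", (ts,))
    conn.commit()

def fetch_since(prev_ts):
    # Parameterized comparison against the watermark: only rows newer
    # than the previous poll are returned.
    cur.execute("SELECT id, ts FROM events WHERE ts > ?", (prev_ts,))
    return cur.fetchall()

insert_event("2024-01-01 10:00:00")
insert_event("2024-01-01 10:05:00")
rows = fetch_since("")             # first poll: everything
watermark = rows[-1][1]            # remember the newest timestamp seen

insert_event("2024-01-01 10:10:00")
new_rows = fetch_since(watermark)  # second poll: only the new row
print(len(rows), len(new_rows))
```

One design note: taking the watermark from the newest row actually fetched (rather than from datetime.now(), as in the answer's code) avoids missing rows inserted between the query and the clock read.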

How do I read cx_Oracle.LOB data in Python?

I have this code:
dsn = cx_Oracle.makedsn(hostname, port, sid)
orcl = cx_Oracle.connect(username + '/' + password + '@' + dsn)
curs = orcl.cursor()
sql = "select TEMPLATE from my_table where id = '6'"
curs.execute(sql)
rows = curs.fetchall()
print rows
template = rows[0][0]
orcl.close()
print template.read()
When I do print rows, I get this:
[(<cx_Oracle.LOB object at 0x0000000001D49990>,)]
However, when I do print template.read(), I get this error:
cx_Oracle.DatabaseError: Invalid handle!
So how do I get and read this data? Thanks.
I've found out that this happens when the connection to Oracle is closed before the cx_Oracle.LOB.read() method is called.
orcl = cx_Oracle.connect(usrpass + '@' + dbase)
c = orcl.cursor()
c.execute(sq)
dane = c.fetchall()
orcl.close()  # before reading LOB to str
wkt = dane[0][0].read()
And I get: DatabaseError: Invalid handle!
But the following code works:
orcl = cx_Oracle.connect(usrpass + '@' + dbase)
c = orcl.cursor()
c.execute(sq)
dane = c.fetchall()
wkt = dane[0][0].read()
orcl.close()  # after reading LOB to str
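The ordering constraint can be illustrated with a toy stand-in for the LOB handle. This mock is purely illustrative and not the cx_Oracle API: a real cx_Oracle.LOB is a locator tied to its connection, so read() fails once that connection is closed, which the fake classes below imitate.

```python
# A toy stand-in: handles become invalid once their connection closes.
class FakeConnection:
    def __init__(self):
        self.open = True
    def close(self):
        self.open = False

class FakeLOB:
    def __init__(self, conn, data):
        self._conn = conn
        self._data = data
    def read(self):
        if not self._conn.open:
            raise RuntimeError("Invalid handle!")  # mirrors the error text
        return self._data

conn = FakeConnection()
lob = FakeLOB(conn, "template text")

text = lob.read()   # works: read while the connection is open
conn.close()
try:
    lob.read()      # fails: the handle is tied to the closed connection
except RuntimeError as e:
    print(text, "/", e)
```

The practical rule is the same as in the answer above: call read() on every LOB (or otherwise materialize the data) before closing the connection.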
Figured it out. I have to do something like this:
curs.execute(sql)
for row in curs:
    print row[0].read()
You basically have to loop through the fetchall object
dsn = cx_Oracle.makedsn(hostname, port, sid)
orcl = cx_Oracle.connect(username + '/' + password + '@' + dsn)
curs = orcl.cursor()
sql = "select TEMPLATE from my_table where id = '6'"
curs.execute(sql)
rows = curs.fetchall()
for x in rows:
    list_ = list(x)
    print(list_)
There should be an extra comma in the for loop; see the code below, where I have added a comma after x to unpack each one-element row tuple.
dsn = cx_Oracle.makedsn(hostname, port, sid)
orcl = cx_Oracle.connect(username + '/' + password + '@' + dsn)
curs = orcl.cursor()
sql = "select TEMPLATE from my_table where id = '6'"
curs.execute(sql)
rows = curs.fetchall()
for x, in rows:
    print(x)
I had the same problem in a slightly different context. I needed to query a 27000+ row table, and it turns out that cx_Oracle cuts the connection to the DB after a while.
While a connection to the DB is open, you can use the read() method of the cx_Oracle.LOB object to turn it into a string. But if the query brings back a table that is too big, it won't work, because the connection will stop at some point, and when you try to read the results you'll get an error on the cx_Oracle objects.
I tried many things, like setting connection.callTimeout = 0 (according to the documentation, this means it would wait indefinitely), or using fetchall() and then putting the results in a dataframe or numpy array, but I could never read the cx_Oracle.LOB objects.
If I run the query using pandas.read_sql(query, connection), the dataframe ends up containing cx_Oracle.LOB objects whose connection is closed, making them useless. (Again, this only happens if the table is very big.)
In the end I found a way around this by querying and writing a CSV file immediately after, even though I know it's not ideal.
def csv_from_sql(sql: str, path: str = "dataframe.csv") -> bool:
    try:
        with cx_Oracle.connect(config.username, config.password,
                               config.database,
                               encoding=config.encoding) as connection:
            connection.callTimeout = 0
            data = pd.read_sql(sql, con=connection)
            data.to_csv(path)
            print("FILE CREATED")
            # Return True from inside the try block, so the except branch's
            # return False is not overridden by a return in finally.
            return True
    except cx_Oracle.Error as error:
        print(error)
        return False
    finally:
        print("PROCESS ENDED\n")

def make_query(sql: str, path: str = "dataframe.csv") -> pd.DataFrame:
    if csv_from_sql(sql, path):
        dataframe = pd.read_csv(path)
        return dataframe
    return pd.DataFrame()
This took a long time (about 4 to 5 minutes) to bring in my 27000+ row table, but it worked when everything else didn't.
If anyone knows a better way, it would be helpful for me too.
