Slow SQLite update with Python

I have a SQLite database that I built, and rows are both added and updated on a weekly basis. The issue is that the update takes a very long time (roughly 2 hours without the transactions table), and I'm hoping there is a faster way to do this. The script reads the input file (Count.xlsx in the code below) and updates the database row by row in a loop.
An example data entry would be:
JohnDoe123 018238e1f5092c66d896906bfbcf9abf5abe978975a8852eb3a78871e16b4268
The code that I use is:
import pandas as pd

# Updates the sha column in the reported, blocked, and transactions tables
def update_sha(conn, sha, ID, op):
    sql_update_reported = 'UPDATE reported SET sha = ? WHERE ID = ? AND operator = ?'
    sql_update_blocked = 'UPDATE blocked SET sha = ? WHERE ID = ? AND operator = ?'
    sql_update_trans = 'UPDATE transactions SET sha = ? WHERE ID = ? AND operator = ?'
    data = (sha, ID, op)
    cur = conn.cursor()
    cur.execute(sql_update_reported, data)
    cur.execute(sql_update_blocked, data)
    cur.execute(sql_update_trans, data)
    conn.commit()

def Count(conn):
    # Load the Excel sheet into a DataFrame, forcing the ID column to be read as strings
    df = pd.read_excel("Count.xlsx", engine='openpyxl', converters={'ID': str})
    # Run through the DataFrame once, updating all three tables for each row
    for i in df.index:
        ID = df['ID'][i]
        Sha = df['Sha'][i]
        op = df['op'][i]
        print(i)
        with conn:
            update_sha(conn, Sha, ID, op)

if __name__ == '__main__':
    # create_connection and database are defined elsewhere in the script
    conn = create_connection(database)
    print("Updating Now..")
    Count(conn)
    conn.close()
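The loop above commits once per spreadsheet row, and each commit forces SQLite to flush to disk, which is a common cause of this kind of slowness. A minimal sketch of the usual remedy follows, assuming the three tables from the question: run all updates inside a single transaction with executemany. The helper names update_sha_batch and ensure_lookup_indexes are hypothetical, and the index on (ID, operator) is an assumption about the schema that may or may not apply to the real database.

```python
import sqlite3

# Table names taken from the question
TABLES = ("reported", "blocked", "transactions")


def update_sha_batch(conn, rows):
    """Apply every (sha, ID, operator) triple to all three tables in one transaction."""
    with conn:  # commits once on success, rolls back if an exception is raised
        for table in TABLES:
            conn.executemany(
                f"UPDATE {table} SET sha = ? WHERE ID = ? AND operator = ?",
                rows,
            )


def ensure_lookup_indexes(conn):
    """Hypothetical helper: index the WHERE columns so each UPDATE can avoid a full table scan."""
    with conn:
        for table in TABLES:
            conn.execute(
                f"CREATE INDEX IF NOT EXISTS idx_{table}_id_op ON {table} (ID, operator)"
            )
```

With the DataFrame from the question, the rows could be built as list(zip(df['Sha'], df['ID'], df['op'])) and passed to update_sha_batch in one call.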

Related

Incorrect date value when loading xlsx file to table using pymysql and xlrd

(Very) beginner python user here. I'm trying to load an xlsx file into a MySQL table using xlrd and pymysql python libraries and I'm getting an error:
pymysql.err.InternalError: (1292, "Incorrect date value: '43500' for column 'invoice_date' at row 1")
The datatype for invoice_date in my table is DATE. The format for this field in my xlsx file is also Date. Things work fine if I change the table datatype to varchar, but I'd prefer to have the data load into my table as a date instead of converting after the fact. Any ideas as to why I'm getting this error? It appears that xlrd or pymysql is reading '2/4/2019' in my xlsx file as '43500' and MySQL is rejecting it due to a datatype mismatch.
import xlrd
import pymysql as MySQLdb
# Open workbook and define first sheet
book = xlrd.open_workbook("2019_Complete.xlsx")
sheet = book.sheet_by_index(0)
# MySQL connection
database = MySQLdb.connect (host="localhost", user="root",passwd="password", db="vendor")
# Get cursor, which is used to traverse the database, line by line
cursor = database.cursor()
# INSERT INTO SQL query
query = """insert into table values (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)"""
# Create a For loop to iterate through each row in the XLS file, starting at row 2 to skip the headers
for r in range(1, sheet.nrows):
    lp = sheet.cell(r,0).value
    pallet_lp = sheet.cell(r,1).value
    bol = sheet.cell(r,2).value
    invoice_date = sheet.cell(r,3).value
    date_received = sheet.cell(r,4).value
    date_repaired = sheet.cell(r,5).value
    time_in_repair = sheet.cell(r,6).value
    date_shipped = sheet.cell(r,7).value
    serial_number = sheet.cell(r,8).value
    upc = sheet.cell(r,9).value
    product_type = sheet.cell(r,10).value
    product_description = sheet.cell(r,11).value
    repair_code = sheet.cell(r,12).value
    condition = sheet.cell(r,13).value
    repair_cost = sheet.cell(r,14).value
    parts_cost = sheet.cell(r,15).value
    total_cost = sheet.cell(r,16).value
    repair_notes = sheet.cell(r,17).value
    repair_cap = sheet.cell(r,18).value
    complaint = sheet.cell(r,19).value
    delta = sheet.cell(r,20).value
    # Assign values from each row
    values = (lp, pallet_lp, bol, invoice_date, date_received, date_repaired, time_in_repair, date_shipped, serial_number, upc, product_type, product_description, repair_code, condition, repair_cost, parts_cost, total_cost, repair_notes, repair_cap, complaint, delta)
    # Execute sql Query
    cursor.execute(query, values)
# Close the cursor
cursor.close()
# Commit the transaction
database.commit()
# Close the database connection
database.close()
# Print results
print ("")
columns = str(sheet.ncols)
rows = str(sheet.nrows)
print ("I just imported " + columns + " columns and " + rows + " rows to MySQL!")
You can see this answer for a more detailed explanation, but basically Excel stores a date as a serial number of days. Because Excel's 1900 date system counts a 29 February 1900 that never existed, serial numbers for modern dates are effectively relative to 1899-12-30, whose ordinal in Python is 693594. To turn your value into an actual date, convert that number into an ISO format date, which MySQL will accept. You can do that using date.fromordinal and date.isoformat. For example:
from datetime import date
dval = 43500
d = date.fromordinal(dval + 693594)
print(d.isoformat())
Output:
2019-02-04
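The same conversion can be wrapped in a small helper and applied to each date cell before the INSERT. This is a minimal sketch using only the standard library; the function name excel_serial_to_iso is hypothetical, and it assumes the workbook uses Excel's default 1900 date system (not the 1904 system) and that the cell holds a serial of 61 or more.

```python
from datetime import date, timedelta

# In Excel's 1900 date system, serials from 61 upward line up with
# real calendar dates when counted from 1899-12-30.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_iso(serial):
    """Convert an Excel date serial (int or float) to an ISO 'YYYY-MM-DD' string."""
    # xlrd returns date cells as floats such as 43500.0; any time-of-day fraction is dropped
    return (EXCEL_EPOCH + timedelta(days=int(serial))).isoformat()


print(excel_serial_to_iso(43500))  # 2019-02-04
```

In the loop from the question, this would be used as invoice_date = excel_serial_to_iso(sheet.cell(r,3).value), and likewise for the other date columns.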

Why pymysql not insert record into table?

I am pretty new to Python development. I have a long Python script that "clones" a database and adds additional stored functions and procedures. Cloning here means copying only the schema of the DB. These steps work fine.
My question is about executing an INSERT with pymysql:
I have to copy the contents of some tables into the new DB. I don't get any SQL error. If I debug or print the generated INSERT INTO command, it is correct (I've tested it in an SQL editor). The insert seems to execute correctly, because the result contains the exact row count, but all the rows are missing from the destination table in the destination DB.
(Of course, the DB_* variables have been defined!)
import datetime
import pymysql
liveDbConn = pymysql.connect(DB_HOST, DB_USER, DB_PWD, LIVE_DB_NAME)
testDbConn = pymysql.connect(DB_HOST, DB_USER, DB_PWD, TEST_DB_NAME)
tablesForCopy = ['role', 'permission']
for table in tablesForCopy:
    with liveDbConn.cursor() as liveCursor:
        # Get name of columns
        liveCursor.execute("DESCRIBE `%s`;" % (table))
        columns = ''
        for column in liveCursor.fetchall():
            columns += '`' + column[0] + '`,'
        columns = columns.strip(',')
        # Get and convert values
        values = ''
        liveCursor.execute("SELECT * FROM `%s`;" % (table))
        for result in liveCursor.fetchall():
            data = []
            for item in result:
                if type(item)==type(None):
                    data.append('NULL')
                elif type(item)==type('str'):
                    data.append("'"+item+"'")
                elif type(item)==type(datetime.datetime.now()):
                    data.append("'"+str(item)+"'")
                else: # for numeric values
                    data.append(str(item))
            v = '(' + ', '.join(data) + ')'
            values += v + ', '
        values = values.strip(', ')
    print("### table: %s" % (table))
    testDbCursor = testDbConn.cursor()
    testDbCursor.execute("INSERT INTO `" + TEST_DB_NAME + "`.`" + table + "` (" + columns + ") VALUES " + values + ";")
    print("Result: {}".format(testDbCursor._result.message))
liveDbConn.close()
testDbConn.close()
Result is:
### table: role
Result: b"'Records: 16 Duplicates: 0 Warnings: 0"
### table: permission
Result: b'(Records: 222 Duplicates: 0 Warnings: 0'
What am I doing wrong? Thanks!
You have 2 main issues here:
1. You don't call conn.commit() (which would be either liveDbConn.commit() or testDbConn.commit() here; the inserts go through testDbConn). Changes to the database are not persisted until they are committed. Statements that modify data need a commit; a plain SELECT does not.
2. Your query is open to SQL injection. This is a serious problem.
Table names cannot be parameterized, so there's not much we can do about that, but you'll want to parameterize your values. I've made multiple corrections to the code in relation to type checking as well as parameterization.
for table in tablesForCopy:
    with liveDbConn.cursor() as liveCursor:
        liveCursor.execute("SELECT * FROM `%s`;" % (table))
        name_of_columns = [item[0] for item in liveCursor.description]
        insert_list = []
        for result in liveCursor.fetchall():
            data = []
            for item in result:
                if item is None:  # test identity against the None singleton
                    data.append(None)  # the driver sends None as SQL NULL
                elif isinstance(item, str):  # Use isinstance to check type
                    data.append(item)
                elif isinstance(item, datetime.datetime):
                    data.append(item.strftime('%Y-%m-%d %H:%M:%S'))
                else:  # for numeric values
                    data.append(str(item))
            insert_list.append(data)
    testDbCursor = testDbConn.cursor()
    # Backtick-quote each column name; use one bare %s placeholder per column
    columns = ', '.join('`{}`'.format(name) for name in name_of_columns)
    placeholders = ', '.join(['%s'] * len(name_of_columns))
    testDbCursor.executemany(
        "INSERT INTO `{}`.`{}` ({}) VALUES ({})".format(
            TEST_DB_NAME, table, columns, placeholders),
        insert_list)
    testDbConn.commit()
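From this github thread, I notice that executemany does not work as expected in psycopg2; it instead sends each entry as a single query. This only matters if you connect through psycopg2 (PostgreSQL) rather than pymysql. In that case you'll need to use execute_batch, and the backtick quoting shown here, which is MySQL syntax, would have to become double quotes:
from psycopg2.extras import execute_batch
execute_batch(testDbCursor,
              "INSERT INTO `{}`.`{}` ({}) VALUES ({})".format(
                  TEST_DB_NAME,
                  table,
                  ', '.join(name_of_columns),
                  placeholders),
              insert_list)
testDbConn.commit()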
How to insert data into a table using Python and pymysql
Find my solution below:
import pymysql
import datetime
# Create a connection object
dbServerName = "127.0.0.1"
port = 8889
dbUser = "root"
dbPassword = ""
dbName = "blog_flask"
# charSet = "utf8mb4"
conn = pymysql.connect(host=dbServerName, user=dbUser, password=dbPassword, db=dbName, port=port)
try:
    # Create a cursor object
    cursor = conn.cursor()
    # Insert rows into the MySQL Table
    now = datetime.datetime.utcnow()
    my_datetime = now.strftime('%Y-%m-%d %H:%M:%S')
    cursor.execute('INSERT INTO posts (post_id, post_title, post_content, '
                   'filename, post_time) VALUES (%s,%s,%s,%s,%s)',
                   (5, 'title2', 'description2', 'filename2', my_datetime))
    conn.commit()
except Exception as e:
    print("Exception occurred: {}".format(e))
finally:
    conn.close()

How to update table sqlite with python in loop

I am trying to calculate the mode of the values in each row and store it in the juiz (judge) column. However, the script updates only the first record and then leaves the loop.
PS: Analisador is my table and resultado_2 is my database.
import sqlite3
import statistics
conn = sqlite3.connect("resultado_2.db")
cursor = conn.cursor()
data = cursor.execute("SELECT Bow, FastText, Glove, Wordvec, Python, juiz, id FROM Analisador")
for x in data:
    list = [x[0],x[1],x[2],x[3],x[4],x[5],x[6]]
    mode = statistics.mode(list)
    try:
        cursor.execute(f"UPDATE Analisador SET juiz={mode} where id={x[6]}") #x[6] == id
        conn.commit()
    except:
        print("Error")
conn.close()
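You have to fetch your records after the SELECT is executed, before you reuse the cursor:
cursor.execute("SELECT Bow, FastText, Glove, Wordvec, Python, juiz, id FROM Analisador")
data = cursor.fetchall()
Calling cursor.execute again for the UPDATE resets that cursor and discards any rows of the SELECT that have not been read yet, which is why the loop ends after the first record. An UPDATE, unlike a SELECT, returns no rows, so it needs no additional step after it is executed.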
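Putting the pieces together, here is a minimal sketch of the whole loop with the rows fetched up front and the UPDATE parameterized instead of built with an f-string. The function name update_modes is hypothetical, and the sketch assumes the mode should be taken over the five classifier columns only (Bow, FastText, Glove, Wordvec, Python), leaving juiz and id out of the calculation, which differs from the list built in the question.

```python
import sqlite3
import statistics


def update_modes(conn):
    """Set juiz to the mode of the five classifier columns for every row."""
    cur = conn.cursor()
    cur.execute("SELECT Bow, FastText, Glove, Wordvec, Python, id FROM Analisador")
    rows = cur.fetchall()  # materialize the result before issuing any UPDATE
    # One (mode, id) pair per row; row[:5] are the classifier values, row[5] is the id
    updates = [(statistics.mode(row[:5]), row[5]) for row in rows]
    with conn:  # single commit for all updates
        conn.executemany("UPDATE Analisador SET juiz = ? WHERE id = ?", updates)
    return len(updates)
```

On Python versions before 3.8, statistics.mode raises StatisticsError when there is no single most common value, so rows with ties may need separate handling there.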

Data frame does not display column names

I wrote a script which first runs a SQL query to get the data from Redshift (via Databricks). Then, I want to display it in a pandas data frame. The problem is that somehow the names of the columns were removed or are not displayed. Why?
#SQL Query
query = """
SELECT * FROM table1 limit 1;
"""
# Execute the query
try:
    cursor.execute(query)
except OperationalError as msg:
    print ("Command skipped: ")
#Fetch all rows from the result
rows = cursor.fetchall()
# Convert into a Pandas Dataframe
df = pd.DataFrame( [[ij for ij in i] for i in rows] )
df.head()
Output (screenshot not reproduced here):
In the output, the column names turned into numbers. The intent was to display column name 1: Customer_id, column name 2: Purchases, column name 3: Product_id etc.
I appreciate any help. Thanks!
As suggested by @Chris, you can use pd.read_sql in the following way:
import pandas as pd
import psycopg2
query = """SELECT * FROM table1 limit 1;"""
connection = psycopg2.connect(user='your_username',
                              password='password',
                              host='host_ip',
                              port=5432,
                              database='db_name')
data = pd.read_sql(sql=query, con=connection)
Now when you print data, it will show the column names as well.
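If you would rather keep the cursor-based code from the question, the column names are also available from cursor.description, which every DB-API cursor exposes after a SELECT. A minimal sketch follows; the helper name fetch_with_columns is hypothetical, and an in-memory SQLite table stands in for table1 so the example runs without a Redshift connection.

```python
import sqlite3


def fetch_with_columns(cursor, query):
    """Run a SELECT and return (column_names, rows)."""
    cursor.execute(query)
    # Each entry of cursor.description is a sequence whose first item is the column name
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    return columns, rows


# Stand-in table for the demo; the real query would run on the Redshift cursor
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Customer_id INTEGER, Purchases INTEGER, Product_id INTEGER)")
conn.execute("INSERT INTO table1 VALUES (1, 3, 42)")

columns, rows = fetch_with_columns(conn.cursor(), "SELECT * FROM table1 LIMIT 1")
print(columns)  # ['Customer_id', 'Purchases', 'Product_id']
# With pandas installed: df = pd.DataFrame(rows, columns=columns)
```

Passing columns=columns to pd.DataFrame is what replaces the numeric labels with the real names.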

Store MySQL column names in array using Python mysql connector

I'm quite new to MySQL, as in manipulating the database itself. I managed to insert new rows into a table, but my next endeavor will be a little more complex.
I'd like to fetch the column names from an existing MySQL database and save them to an array in Python. I'm using the official MySQL connector.
I think I can achieve this by querying information_schema.columns, but I have no idea how to build the query and store the information in an array. There will be around 100-200 columns, so performance might become an issue, and I don't think it's wise to iterate my way through it column by column.
The base code to insert data into MySQL using the connector is:
from mysql.connector import MySQLConnection, Error

def insert(data):
    query = "INSERT INTO templog(data) " \
            "VALUES(%s,%s,%s,%s,%s)"
    args = (data)
    try:
        # read_db_config is a helper defined elsewhere that returns the connection settings as a dict
        db_config = read_db_config()
        conn = MySQLConnection(**db_config)
        cursor = conn.cursor()
        cursor.execute(query, args)
        #if cursor.lastrowid:
        #    print('last insert id', cursor.lastrowid)
        #else:
        #    print('last insert id not found')
        conn.commit()
        cursor.close()
        conn.close()
    except Error as error:
        print(error)
As mentioned, the code above needs to be modified in order to read data from the SQL server. Thanks in advance!
Thanks for the help!
Got this as working code:
from sqlalchemy import create_engine

def GetNames(web_data, counter):
    # get all names from the database
    connection = create_engine('mysql+pymysql://user:pwd@server:3306/db').connect()
    result = connection.execute('select * from price_usd')
    a = 0
    sql_matrix = [0 for x in range(counter + 1)]
    for v in result:
        # only the first row is needed to read the column names
        while a == 0:
            for column, value in v.items():
                a = a + 1
                if a > 1:
                    sql_matrix[a] = str(('{0}'.format(column)))
    return sql_matrix
This will get all column names from the existing SQL database.
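For the information_schema route mentioned in the question, a single query returns all column names at once, so 100-200 columns should not be a performance concern. The following is a minimal sketch; the function name fetch_column_names is hypothetical, and it assumes a DB-API cursor that uses %s placeholders, as the cursors of mysql.connector and pymysql do.

```python
def fetch_column_names(cursor, schema, table):
    """Return the column names of schema.table in their defined order."""
    cursor.execute(
        "SELECT COLUMN_NAME FROM information_schema.columns "
        "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s "
        "ORDER BY ORDINAL_POSITION",
        (schema, table),  # passed as parameters, not formatted into the SQL
    )
    # Each fetched row is a one-element tuple holding the column name
    return [row[0] for row in cursor.fetchall()]
```

With the connection from the question, this could be called as fetch_column_names(conn.cursor(), 'db', 'price_usd'), where 'db' is a placeholder for the actual schema name.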
