Postgresql Python: ignore duplicate key exception - python

I insert items using psycopg2 in the following way:
cursor = connection.cursor()
for item in items:
    try:
        cursor.execute(
            "INSERT INTO items (name, description) VALUES (%s, %s) RETURNING id",
            (item[0], item[1])
        )
        id = cursor.fetchone()[0]
        if id is not None:
            cursor.execute(
                "INSERT INTO item_tags (item, tag) VALUES (%s, %s) RETURNING id",
                (id, 'some_tag')
            )
    except psycopg2.Error as e:
        connection.rollback()
        print("PostgreSQL Error: " + e.diag.message_primary)
        continue
    print(item[0])
connection.commit()
Obviously, when an item is already in the database, a duplicate key exception is thrown. Is there a way to ignore the exception? Is the whole transaction going to be aborted when the exception is thrown? If so, what is the best way to rewrite the query, maybe using batch inserting?

From Graceful Primary Key Error handling in Python/psycopg2:
You should roll back the transaction on error.
I've added one more try..except..else construction in the code below
to show the exact place where the exception will occur.
try:
    cur = conn.cursor()
    try:
        cur.execute("""INSERT INTO items (name, description)
                       VALUES (%s, %s) RETURNING id""", (item[0], item[1]))
    except psycopg2.IntegrityError:
        # Duplicate key: undo the failed statement so the connection is usable again
        conn.rollback()
    else:
        conn.commit()
    cur.close()
except Exception as e:
    print('ERROR:', e)
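If the goal is only to skip rows that already exist, PostgreSQL 9.5 and later can do that inside the statement itself with ON CONFLICT DO NOTHING, so no exception is raised and the transaction is not aborted. The following is a minimal sketch under that assumption; insert_item_with_tag is a hypothetical helper name, and cursor is expected to be a psycopg2 cursor. When the insert is skipped, RETURNING yields no row, so fetchone() returns None.

```python
INSERT_ITEM = (
    "INSERT INTO items (name, description) VALUES (%s, %s) "
    "ON CONFLICT DO NOTHING RETURNING id"
)
INSERT_TAG = "INSERT INTO item_tags (item, tag) VALUES (%s, %s)"


def insert_item_with_tag(cursor, name, description, tag):
    """Insert an item and tag it; return the new id, or None if it already existed.

    Hypothetical helper: `cursor` is assumed to be a psycopg2 cursor
    connected to PostgreSQL 9.5 or later.
    """
    cursor.execute(INSERT_ITEM, (name, description))
    row = cursor.fetchone()  # None when ON CONFLICT DO NOTHING skipped the row
    if row is None:
        return None
    cursor.execute(INSERT_TAG, (row[0], tag))
    return row[0]
```

In the loop from the question this would be called as insert_item_with_tag(cursor, item[0], item[1], 'some_tag'), with a single connection.commit() after the loop.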

Related

CSV to MSSQL using pymssql

The goal is to continuously look for new records in my CSV and insert them into MSSQL using the pymssql library.
The CSV initially has 244 rows. I'm trying to add 1 row and want only the new row to be inserted whenever the script is run by the scheduler.
I have a script that runs every 15 seconds to insert the values. The first run works, but the second run throws 'Cannot insert duplicate key in object' because my first column, DateID, is set as the PK. The statement terminates at the first record, so the new row is never inserted.
How do I get around this?
Code:
def trial():
    try:
        for row in df.itertuples():
            datevalue = datetime.datetime.strptime(row.OrderDate, format)
            query = "INSERT INTO data (OrderDate, Region, City, Category) VALUES (%s,%s,%s,%s)"
            cursor.execute(query, (datevalue, row.Region, row.City, row.Category))
        print('Values inserted')
        conn.commit()
        conn.close()
    except Exception as e:
        print("Handle error", e)
        pass
schedule.every(15).seconds.do(trial)
Library used: pymssql
SQL: MSSQL server 2019
To avoid duplicate values, consider adjusting the query to use the EXCEPT clause (part of the UNION and INTERSECT family of set operators) so that only rows not already in the table are inserted. Also consider using executemany, passing a nested list of all row/column data built with DataFrame.to_numpy().tolist().
By the way, if the OrderDate column is a datetime type in both the data frame and the database table, you do not need to re-format it as a string value.
def trial():
    try:
        query = (
            "INSERT INTO data (OrderDate, Region, City, Category) "
            "SELECT %s, %s, %s, %s "
            "EXCEPT "
            "SELECT OrderDate, Region, City, Category "
            "FROM data"
        )
        vals = df[["OrderDate", "Region", "City", "Category"]].to_numpy()
        vals = tuple(map(tuple, vals))
        cur.executemany(query, vals)
        print('Values inserted')
        conn.commit()
    except Exception as e:
        print("Handle error", e)
    finally:
        cur.close()
        conn.close()
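The effect of the EXCEPT pattern can be tried without a SQL Server instance. The sketch below uses Python's built-in sqlite3 purely as a stand-in, so the placeholders are ? rather than pymssql's %s, and the table definition (with OrderDate as the primary key) and the sample rows are assumptions made up for illustration. insert_new_rows is a hypothetical helper name.

```python
import sqlite3

# Same shape as the query above; sqlite3 uses ? placeholders instead of %s.
INSERT_NEW_ONLY = (
    "INSERT INTO data (OrderDate, Region, City, Category) "
    "SELECT ?, ?, ?, ? "
    "EXCEPT "
    "SELECT OrderDate, Region, City, Category FROM data"
)


def insert_new_rows(conn, rows):
    """Insert rows, skipping exact duplicates; return how many were added."""
    before = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
    conn.executemany(INSERT_NEW_ONLY, rows)
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data ("
    "OrderDate TEXT PRIMARY KEY, Region TEXT, City TEXT, Category TEXT)"
)
rows = [
    ("2021-01-01", "East", "Boston", "Bars"),
    ("2021-01-02", "West", "Seattle", "Snacks"),
]
first_run = insert_new_rows(conn, rows)
# Second run re-sends the old rows plus one new row; only the new one lands.
second_run = insert_new_rows(conn, rows + [("2021-01-03", "East", "New York", "Cookies")])
print(first_run, second_run)
```

Note that EXCEPT compares whole rows. A row that reuses an existing key but differs in another column is not filtered out and would still raise the duplicate key error.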
For a faster, bulk insert, consider using a staging, temp table:
# CREATE EMPTY TEMP TABLE
query = "SELECT TOP 0 OrderDate, Region, City, Category INTO #pydata FROM data"
cur.execute(query)
# INSERT INTO TEMP TABLE
query= (
"INSERT INTO #pydata (OrderDate, Region, City, Category) "
"VALUES (%s, %s, %s, %s) "
)
vals = df[["OrderDate", "Region", "City", "Category"]].to_numpy()
vals = tuple(map(tuple, vals))
cur.execute("BEGIN TRAN")
cur.executemany(query, vals)
# MIGRATE TO FINAL TABLE
query= (
"INSERT INTO data (OrderDate, Region, City, Category) "
"SELECT OrderDate, Region, City, Category "
"FROM #pydata "
"EXCEPT "
"SELECT OrderDate, Region, City, Category "
"FROM data"
)
cur.execute(query)
conn.commit()
print("Values inserted")

Executing multiple SQL queries with Python Flask

I have a python function which should execute 2 SQL queries. I have found that it is impossible to execute 2 queries in one command at once, so as a workaround I created a list of my queries and try to iterate over it with execute command. However nothing is added to MySQL table. Here is the code:
@app.route('/addComment', methods=['POST'])
def addComment():
    try:
        if session.get('user'):
            _description = request.form['description']
            _user = session.get('user')
            _term_id = request.form['termID']
            _time = datetime.now()
            operation = ['"INSERT INTO comments (description, user, termID, time) VALUES (%s, %s, %s, %s)", (_description, _user, _term_id, _time)', '"INSERT INTO history (user, term, time) VALUES (%s, %s, %s)", (_user, _term_id, _time)']
            conn = mysql.connect()
            cursor = conn.cursor()
            for item in operation:
                cursor.execute()
                conn.commit()
            data = cursor.fetchall()
            if len(data) == 0:
                conn.commit()
                return json.dumps({'status':'OK'})
            else:
                return json.dumps({'status':'ERROR'})
    except Exception as e:
        return json.dumps({'status':'Unauthorized access'})
    finally:
        cursor.close()
        conn.close()
Could you please help me?
The errors in your code lie in the following areas:
A. During iteration, the SQL statement is not passed to execute()
It should be:
for item in operation:
    cursor.execute(item)
    conn.commit()
B. Invalid parameterization
'"INSERT INTO comments (description, user, termID, time) VALUES (%s, %s, %s, %s)", (_description, _user, _term_id, _time)'
This single string bundles the SQL text and the parameter tuple together, so the variables are never applied to the SQL statement. You could format the values into the string yourself and decide, depending on your value types, whether to add ' (apostrophe) or not. It is safer to pass the parameters to the .execute() function as a separate argument. Example below, using the %s placeholders that MySQL drivers expect.
cursor.execute(
    "INSERT INTO comments (description, user, termID, time) VALUES (%s, %s, %s, %s)",
    (_description, _user, _term_id, _time)
)
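Points A and B can be combined by storing each operation as a (statement, parameters) pair rather than as one string. The following is a minimal sketch, assuming a DB-API connection whose driver uses %s placeholders (as PyMySQL and MySQLdb do); build_operations and run_operations are hypothetical helper names, not part of Flask or any MySQL library.

```python
def build_operations(description, user, term_id, when):
    """Return the two inserts as (sql, params) pairs instead of one big string."""
    return [
        (
            "INSERT INTO comments (description, user, termID, time) "
            "VALUES (%s, %s, %s, %s)",
            (description, user, term_id, when),
        ),
        (
            "INSERT INTO history (user, term, time) VALUES (%s, %s, %s)",
            (user, term_id, when),
        ),
    ]


def run_operations(conn, operations):
    """Execute every statement, then commit once so both inserts land together."""
    cursor = conn.cursor()
    try:
        for sql, params in operations:
            cursor.execute(sql, params)
        conn.commit()
    except Exception:
        # Leave nothing half-written if the second insert fails.
        conn.rollback()
        raise
    finally:
        cursor.close()
```

Inside the view this would look like run_operations(conn, build_operations(_description, _user, _term_id, _time)). Committing once after both statements keeps the comment and its history entry in the same transaction.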

How to continue loop and log error when validation error is raised in exception?

I'm pretty new to python but I have created a small script to insert rows in postgres, and a table that stores stock prices.
The table has a unique constraint on date and ticker, to ensure only one price per ticker per day is inserted. I want the script to skip duplicate inserts and continue with the rest, but I cannot figure out how to make the loop continue when the exception block is triggered.
The Python script is as follows:
def createConnection(db="db", user="john", password='doe', host='host', port=5432):
    conn = psycopg2.connect(
        database=db, user=user, password=password, host=host, port=port)
    return conn
def insertNewPrices(df):
    conn = createConnection()
    cur = conn.cursor()
    for row in df.head().itertuples(index=False):
        try:
            print(row)
            cur.execute(
                "INSERT INTO daily_price (price_date, ticker, close_price) VALUES (%s, %s, %s)", row)
        except psycopg2.IntegrityError as e:
            print("something went wrong")
            print(e)
            continue
    conn.commit()
    conn.close()
error raised:
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "daily_price_ticker_price_date_key"
DETAIL: Key (ticker, price_date)=(EQUINOR, 1990-02-28) already exists.
You have the insert statement outside the try statement. Remove the insert from outside the try block and you should be OK.
for row in df.head().itertuples(index=False):
    # cur.execute(
    #     "INSERT INTO daily_price (price_date, ticker, close_price) VALUES (%s, %s, %s)",
    #     row)
    try:
        print(row)
        cur.execute(
            "INSERT INTO daily_price (price_date, ticker, close_price) VALUES (%s, %s, %s)", row)
    except (psycopg2.IntegrityError, psycopg2.errors.UniqueViolation) as e:
        print("something went wrong")
        print(e)
Also, you don't need continue at the end of the except block, as it's the last line.
Instead of checking for specific errors, you can also catch all errors and warnings using the below:
except (psycopg2.Error, psycopg2.Warning) as e:
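One thing to keep in mind with PostgreSQL: once a statement fails inside a transaction, the server rejects every later statement in that transaction until it is rolled back, so catching the exception alone does not let the remaining inserts through. A savepoint per row undoes only the failed insert. Below is a minimal sketch of that idea using the daily_price table from the question; insert_prices and the savepoint name before_row are hypothetical, and the exception class is passed in (psycopg2.IntegrityError in practice) so the function does not hard-code the driver.

```python
INSERT_PRICE = (
    "INSERT INTO daily_price (price_date, ticker, close_price) "
    "VALUES (%s, %s, %s)"
)


def insert_prices(conn, rows, duplicate_error):
    """Insert rows one by one, skipping those that raise `duplicate_error`.

    Returns the skipped rows so the caller can log them.
    """
    skipped = []
    cur = conn.cursor()
    try:
        for row in rows:
            cur.execute("SAVEPOINT before_row")
            try:
                cur.execute(INSERT_PRICE, tuple(row))
            except duplicate_error:
                # Undo only the failed insert; earlier rows stay in the transaction.
                cur.execute("ROLLBACK TO SAVEPOINT before_row")
                skipped.append(row)
            else:
                cur.execute("RELEASE SAVEPOINT before_row")
        conn.commit()
    finally:
        cur.close()
    return skipped
```

With psycopg2 this could be called as skipped = insert_prices(conn, df.itertuples(index=False), psycopg2.IntegrityError). On PostgreSQL 9.5 or later, adding ON CONFLICT DO NOTHING to the INSERT is a simpler alternative when the skipped rows do not need to be reported.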

Python "INSERT INTO" vs. "INSERT INTO...ON DUPLICATE KEY UPDATE"

I am trying to use python to insert a record into a MySQL database and then update that record. To do this I have created 2 functions:
def insert_into_database():
    query = "INSERT INTO pcf_dev_D.users(user_guid,username) VALUES (%s, %s) "
    data = [('1234', 'user1234')]
    parser = ConfigParser()
    parser.read('db/db_config.ini')
    db = {}
    section = 'mysql'
    if parser.has_section(section):
        items = parser.items(section)
        for item in items:
            db[item[0]] = item[1]
    else:
        raise Exception('{0} not found in the {1} file'.format(section, 'db/db_config.ini'))
    try:
        conn = MySQLConnection(**db)
        cursor = conn.cursor()
        cursor.executemany(query, data)
        conn.commit()
    except Error as e:
        print('Error:', e)
    finally:
        # print("done...")
        cursor.close()
        conn.close()
This works fine and inserts 1234, user1234 into the db.
Now I want to update this particular user's username to 'user5678', so I have created another function:
def upsert_into_database():
    query = "INSERT INTO pcf_dev_D.users(user_guid,username) " \
            "VALUES (%s, %s) ON DUPLICATE KEY UPDATE username='%s'"
    data = [('1234', 'user1234', 'user5678')]
    parser = ConfigParser()
    parser.read('db/db_config.ini')
    db = {}
    section = 'mysql'
    if parser.has_section(section):
        items = parser.items(section)
        for item in items:
            db[item[0]] = item[1]
    else:
        raise Exception('{0} not found in the {1} file'.format(section, 'db/db_config.ini'))
    try:
        conn = MySQLConnection(**db)
        cursor = conn.cursor()
        cursor.executemany(query, data)
        conn.commit()
    except Error as e:
        print('Error:', e)
    finally:
        # print("done...")
        cursor.close()
        conn.close()
Which produces the following error:
Error: Not all parameters were used in the SQL statement
What's interesting is if I modify query and data to be:
query = "INSERT INTO pcf_dev_D.users(user_guid,username) " \
"VALUES (%s, %s) ON DUPLICATE KEY UPDATE username='user5678'"
data = [('1234', 'user1234')]
Then python updates the record just fine...what am I missing?
You enclosed the 3rd parameter in single quotes in the update clause, so it is interpreted as part of a string literal, not as a placeholder for a parameter. You must not enclose a parameter placeholder in quotes:
query = "INSERT INTO pcf_dev_D.users(user_guid,username) " \
"VALUES (%s, %s) ON DUPLICATE KEY UPDATE username=%s"
UPDATE
If you want to use the on duplicate key update clause with a bulk insert (e.g. executemany()), then you should not provide any parameters in the update clause because you can only have one update clause in the bulk insert statement. Use the values() function instead:
query = "INSERT INTO pcf_dev_D.users(user_guid,username) " \
"VALUES (%s, %s) ON DUPLICATE KEY UPDATE username=VALUES(username)"
In assignment value expressions in the ON DUPLICATE KEY UPDATE clause, you can use the VALUES(col_name) function to refer to column values from the INSERT portion of the INSERT ... ON DUPLICATE KEY UPDATE statement. In other words, VALUES(col_name) in the ON DUPLICATE KEY UPDATE clause refers to the value of col_name that would be inserted, had no duplicate-key conflict occurred. This function is especially useful in multiple-row inserts. The VALUES() function is meaningful only in the ON DUPLICATE KEY UPDATE clause or INSERT statements and returns NULL otherwise.
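For the bulk case, each data tuple then carries only the two inserted values, since the update clause no longer has a placeholder of its own. The following is a minimal sketch of how that might be wrapped; upsert_users is a hypothetical helper name, and cursor is assumed to be a MySQL driver cursor that accepts %s placeholders.

```python
UPSERT_USER = (
    "INSERT INTO pcf_dev_D.users (user_guid, username) "
    "VALUES (%s, %s) "
    "ON DUPLICATE KEY UPDATE username = VALUES(username)"
)


def upsert_users(cursor, users):
    """Insert or update many (user_guid, username) pairs with one executemany call."""
    # Two values per row: the update clause reuses the inserted username.
    data = [(guid, name) for guid, name in users]
    cursor.executemany(UPSERT_USER, data)
    return len(data)
```

A call such as upsert_users(cursor, [('1234', 'user5678')]) followed by conn.commit() would rename the existing user. Note that MySQL 8.0.20 and later deprecate this use of VALUES() in favor of a row alias, so the clause may need adjusting on newer servers.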

Mysqldb and Python KeyError Handling

I am attempting to add multiple values to MySQL table, here's the code:
try:
    cursor.execute("INSERT INTO companies_and_charges_tmp (etags, company_id, created, delivered, satisfied, status, description, persons_entitled) VALUES ('%s, %s, %s, %s, %s, %s, %s, %s')" % (item['etag'], ch_no, item['created_on'], item['delivered_on'], item['satisfied_on'], item['status'], item['particulars'][0]['description'], item['persons_entitled'][0]['name']))
except KeyError:
    pass
The problem is that this code runs in a loop, and at times one of the values being inserted will be missing, which results in a KeyError that cancels the entire insertion.
How do I get past the KeyError, so when the KeyError relating to one of the items that are being inserted occurs, others are still added to the table and the one that is missing is simply left as NULL?
You can use the dict.get() method, which returns None if a key is not found in the dictionary. The MySQL driver then converts None to NULL during the query parameterization step:
# handling description and name separately
try:
    description = item['particulars'][0]['description']
except KeyError:
    description = None
# TODO: violates DRY - extract into a reusable method?
try:
    name = item['persons_entitled'][0]['name']
except KeyError:
    name = None
cursor.execute("""
    INSERT INTO
        companies_and_charges_tmp
        (etags, company_id, created, delivered, satisfied, status, description, persons_entitled)
    VALUES
        (%s, %s, %s, %s, %s, %s, %s, %s)""",
    (item.get('etag'), ch_no, item.get('created_on'), item.get('delivered_on'), item.get('satisfied_on'), item.get('status'), description, name))
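The repeated try/except for the nested values could be pulled into a small helper along these lines. This is only a sketch: get_nested is a hypothetical name, and the sample item is made up for illustration. It also catches IndexError and TypeError, on the assumption that an empty list or a None in the path should be treated the same as a missing key.

```python
def get_nested(data, *path, default=None):
    """Follow the keys/indexes in `path`; return `default` if any step is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Made-up sample record: 'persons_entitled' is absent on purpose.
item = {"etag": "abc", "particulars": [{"description": "Fixed charge"}]}
description = get_nested(item, "particulars", 0, "description")
name = get_nested(item, "persons_entitled", 0, "name")
print(description, name)  # prints: Fixed charge None
```

Both values can then be passed straight into the parameter tuple of cursor.execute(), with None ending up as NULL in the table.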
