Any way to upsert into a database using PostgreSQL in Python - python

I want to upsert with the least effort. For simplicity I have reduced the number of columns. This does not work:
sql = '''INSERT INTO temp.tickets
(id, created_at, updated_at, emails, status)
VALUES
(%s, %s, %s, %s, %s)
ON CONFLICT (id)
DO UPDATE SET ( emails, status) values (%s,%s)
'''
cursor = cm.cursor()
## cm is a custom module
cursor.execute(sql, (ticket['id'],
ticket['created_at'],
ticket['updated_at'],
ticket['emails'], ticket['status'], ))
This code raises an error:
return super(DictCursor, self).execute(query, vars)
IndexError: tuple index out of range
What do I need to change in cursor.execute() to make it work?
The code below works, but I would like to use %s placeholders instead of typing emails = excluded.emails for each column:
sql = '''INSERT INTO temp.tickets
(id, created_at, updated_at, emails, status)
VALUES
(%s, %s, %s, %s, %s)
ON CONFLICT (id)
DO UPDATE SET emails = excluded.emails, status = excluded.status
'''
cursor = cm.cursor()
# cm is a custom module
cursor.execute(sql, (ticket['id'],
ticket['created_at'],
ticket['updated_at'],
ticket['emails'], ticket['status'], ))
There are two relevant questions: link1, link2

I would try something like this:
sql = '''INSERT INTO temp.tickets
(id, created_at, updated_at, emails, status)
VALUES
(%s, %s, %s, %s, %s)
ON CONFLICT (id)
DO UPDATE SET (emails, status) = (%s, %s)
'''
cursor = cm.cursor()
## cm is a custom module
cursor.execute(sql, (ticket['id'],
ticket['created_at'],
ticket['updated_at'],
ticket['emails'],
ticket['status'],
ticket['emails'],
ticket['status']))
The number of %s markers must match the number of parameters.
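To make that rule concrete, here is a minimal sketch of a check that can run before the query reaches the database. The helper name check_placeholders and the sample ticket values are assumptions made for illustration; they are not part of psycopg2.

```python
def check_placeholders(sql, params):
    """Raise ValueError if the number of %s markers differs from len(params)."""
    # A literal percent sign is written as %% in psycopg2 queries, so remove
    # those first to avoid miscounting a sequence such as "%%s".
    expected = sql.replace("%%", "").count("%s")
    if expected != len(params):
        raise ValueError(
            "query has %d placeholders but %d parameters were given"
            % (expected, len(params))
        )
    return expected


# Hypothetical ticket, used only for illustration.
ticket = {"id": 1, "created_at": "2020-01-01", "updated_at": "2020-01-02",
          "emails": "a@example.com", "status": "open"}

sql = """INSERT INTO temp.tickets (id, created_at, updated_at, emails, status)
VALUES (%s, %s, %s, %s, %s)
ON CONFLICT (id)
DO UPDATE SET (emails, status) = (%s, %s)"""

# Seven markers in the query, so seven values: emails and status appear twice.
params = (ticket["id"], ticket["created_at"], ticket["updated_at"],
          ticket["emails"], ticket["status"],
          ticket["emails"], ticket["status"])

check_placeholders(sql, params)
```

Passing only the first five values to the same query is what produces the IndexError in the question.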

When Postgres encounters a conflict on the ON CONFLICT target, it makes the row you attempted to insert available as a record called EXCLUDED. You can refer to this record in DO UPDATE. Try the following:
INSERT INTO temp.tickets
(id, created_at, updated_at, emails, status)
VALUES
(%s, %s, %s, %s, %s)
ON CONFLICT (id)
DO UPDATE
SET emails = excluded.emails
, status = excluded.status
, updated_at = excluded.updated_at -- my assumption.
...
You will have to format this to fit the requirements of your host language.
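If typing one "col = excluded.col" pair per column is the tedious part, the SET clause can be generated from the column list. This is only a sketch: build_upsert is a hypothetical helper, and it assumes the table and column names come from trusted code, because they are interpolated directly into the SQL text. The row values still go through %s placeholders.

```python
def build_upsert(table, columns, conflict_column):
    """Return an INSERT ... ON CONFLICT statement that updates every
    non-key column from the EXCLUDED row.

    Assumes table and column names are trusted, not user input.
    """
    placeholders = ", ".join(["%s"] * len(columns))
    updates = ", ".join(
        "{0} = EXCLUDED.{0}".format(col)
        for col in columns
        if col != conflict_column
    )
    return (
        "INSERT INTO {table} ({cols}) VALUES ({ph}) "
        "ON CONFLICT ({key}) DO UPDATE SET {updates}"
    ).format(table=table, cols=", ".join(columns), ph=placeholders,
             key=conflict_column, updates=updates)


columns = ["id", "created_at", "updated_at", "emails", "status"]
sql = build_upsert("temp.tickets", columns, "id")
# With a psycopg2 cursor this would run as:
# cursor.execute(sql, [ticket[c] for c in columns])
```

Note that this version also overwrites created_at on conflict; filter the column list if some columns should only be set on insert.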

Related

PostgreSQL INSERT ON CONFLICT and TKinter

I have a problem with the syntax in Python/Tkinter when updating a PostgreSQL table.
The syntax works for the function below, without the ON CONFLICT option:
def myclick_start():
    # Create a database or connect to one
    conn = psycopg2.connect(database="*",  # hidden credentials here
                            host="*",
                            user="*",
                            password="*",
                            port="*")
    # Create cursor
    c = conn.cursor()
    # Insert Into Database Table
    thing1 = o_num.get()
    thing2 = op_id.get()
    thing3 = proc_name_cb.get()
    # this works
    c.execute('''INSERT INTO orders (order_id, op_id, status_id) VALUES (%s, %s, %s)''',
              (thing1, thing2, thing3)
              )
    # Commit Changes
    conn.commit()
    # Close Connection
    conn.close()
But it does not work when I want to update the table on a conflict of the order_id value:
# but this is not working
c.execute('''INSERT INTO orders (order_id, op_id, status_id) VALUES (%s, %s, %s)''',
(thing1, thing2, thing3),
ON CONFLICT (order_id)
DO UPDATE SET op_id = EXCLUDED.op_id, status_id = EXCLUDED.status_id;
)
Resulting error:
File "E:\***.py", line 229
'''c.execute('''INSERT INTO orders (order_id, op_id, status_id) VALUES (%s, %s, %s)''',
^^^^^^
SyntaxError: invalid syntax
I've tried many syntax variants and I'm stuck on this error.
I'd appreciate your help.
If you take a closer look at the syntax highlighting, you will notice that your ON CONFLICT ... clause isn't part of the SQL query (i.e. it's not part of the string that makes up the query).
Moving that part inside the string should solve the problem, like this:
c.execute('''INSERT INTO orders (order_id, op_id, status_id) VALUES (%s, %s, %s)
ON CONFLICT (order_id)
DO UPDATE SET op_id = EXCLUDED.op_id, status_id = EXCLUDED.status_id;''',
(thing1, thing2, thing3)
)

IndexError: tuple index out of range connecting Python to PostgreSQL

I know this question has been asked a number of times, but I am stuck and unable to proceed. I am running a for loop in Python to load data into a fact table.
I am executing the code below:
for index, row in df.iterrows():
    # get songid and artistid from song and artist tables
    cur.execute(song_select, (row.song, row.artist, row.length))
    results = cur.fetchone()
    if results:
        song_id, artist_id = results
    else:
        song_id, artist_id = None, None
    # insert songplay record
    songplay_data = (pd.to_datetime(row.ts, unit='ms'),row.userId,row.level,song_id,artist_id,row.sessionId,row.location,row.userAgent)
    cur.execute(songplay_table_insert, songplay_data)
    conn.commit()
and I get this error:
<ipython-input-22-b8b0e27022de> in <module>()
13
14 songplay_data = (pd.to_datetime(row.ts, unit='ms'),row.userId,row.level,song_id,artist_id,row.sessionId,row.location,row.userAgent)
15 cur.execute(songplay_table_insert, songplay_data)
16 conn.commit()
IndexError: tuple index out of range
The insert statement for my table is:
songplay_table_insert = ("""INSERT INTO songplays (songplay_id, start_time,
user_id, level, song_id, artist_id, session_id, location, user_agent )
VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s)""")
I am really stuck; any help is appreciated.
You have one too many %s markers.
VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s)
has 9 markers, while
songplay_data = (pd.to_datetime(row.ts, unit='ms'),row.userId,row.level,song_id,artist_id,row.sessionId,row.location,row.userAgent)
has 8 elements. When it tries to evaluate the last marker, it looks for the 9th element, i.e. songplay_data[8], and that raises the error.
You will also need to remove songplay_id from the SQL to make the INSERT statement valid. The database should generate the primary key for you if you don't have a value to provide; if it doesn't, we should take a look at your table definition.
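One way to keep the marker count and the column count from drifting apart is to generate both from the same list. The sketch below assumes songplay_id is generated by the database (for example a SERIAL or identity column), which the question does not confirm, and the row values are made up for illustration.

```python
# Columns from the question, without songplay_id (assumed to be generated
# by the database).
columns = ["start_time", "user_id", "level", "song_id", "artist_id",
           "session_id", "location", "user_agent"]

# One %s marker per column, built from the same list.
songplay_table_insert = "INSERT INTO songplays ({}) VALUES ({})".format(
    ", ".join(columns),
    ", ".join(["%s"] * len(columns)),
)

# Hypothetical row values standing in for the DataFrame row.
songplay_data = ("2018-11-01 21:01:46", 8, "free", None, None, 139,
                 "Phoenix-Mesa-Scottsdale, AZ", "Mozilla/5.0")

# Fail early, before touching the database, if the counts differ.
assert songplay_table_insert.count("%s") == len(songplay_data)
```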

psycopg2 - Inserting multiple rows that have multiple columns faster

I'm trying to insert multiple rows into my database, and I currently don't know a way to insert them all at once, or any other method that would save time (sequentially it takes about 30s for around 300 rows).
My 'rows' are tuples in a list of tuples (converted into a tuple of tuples), e.g. [(col0, col1, col2), (col0, col1, col2), (.., .., ..), ..]
def commit(self, tuple):
    cursor = self.conn.cursor()
    for tup in tuple:
        try:
            sql = """insert into "SSENSE_Output" ("productID", "brand", "categoryID", "productName", "price", "sizeInfo", "SKU", "URL", "dateInserted", "dateUpdated")
            values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"""
            cursor.execute(sql, tup)
            self.conn.commit()
        except psycopg2.IntegrityError:
            self.conn.rollback()
            sql = 'insert into "SSENSE_Output" ' \
                  '("productID", "brand", "categoryID", "productName", "price", "sizeInfo", "SKU", "URL", "dateInserted", "dateUpdated")' \
                  'values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s) on conflict ("productID") do update set "dateUpdated" = EXCLUDED."dateUpdated"'
            cursor.execute(sql, tup)
            self.conn.commit()
        except Exception as e:
            print(e)
I have also tried committing after the for loop is done, but it still takes the same amount of time. Is there any way to make this insert significantly faster?
In Postgres you can use a multi-row VALUES list like this:
INSERT INTO films (code, title, did, date_prod, kind) VALUES
('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
('HG120', 'The Dinner Game', 140, DEFAULT, 'Comedy');
Because your exception handling works per record, it is better to resolve the duplicates first, before generating this query, since the whole query fails when an integrity error occurs.
Building one large INSERT statement instead of many small ones will considerably improve the execution time. You should take a look here. It is for MySQL, but I think a similar approach applies to PostgreSQL.
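As a rough sketch of that idea for the table in the question: the hypothetical helper below removes duplicate productID values in Python and then builds a single multi-row statement with the question's ON CONFLICT clause. The de-duplication matters because PostgreSQL rejects an INSERT ... ON CONFLICT DO UPDATE that would touch the same row twice in one statement. It assumes productID is the first item of each tuple, as in the question's column order. psycopg2 also ships psycopg2.extras.execute_values, which can build the multi-row VALUES list for you.

```python
def build_bulk_upsert(rows):
    """Build one multi-row upsert for the SSENSE_Output table.

    rows: iterable of 10-item tuples in the column order below.
    Returns (sql, params), or (None, []) when there is nothing to insert.
    """
    columns = ["productID", "brand", "categoryID", "productName", "price",
               "sizeInfo", "SKU", "URL", "dateInserted", "dateUpdated"]

    # Keep only the last row seen for each productID.
    unique = {}
    for row in rows:
        unique[row[0]] = tuple(row)
    if not unique:
        return None, []

    # One "(%s, ..., %s)" group per remaining row.
    row_template = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values = ", ".join([row_template] * len(unique))
    sql = (
        'INSERT INTO "SSENSE_Output" ({cols}) VALUES {values} '
        'ON CONFLICT ("productID") DO UPDATE '
        'SET "dateUpdated" = EXCLUDED."dateUpdated"'
    ).format(cols=", ".join('"%s"' % c for c in columns), values=values)

    # Flatten the row tuples into one parameter list for cursor.execute().
    params = [value for row in unique.values() for value in row]
    return sql, params
```

The result would be sent with one cursor.execute(sql, params) followed by one commit, instead of one round trip and one commit per row.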

Inserting mysql data from one table to another with python

I'm trying to insert data that's already in one mysql table into another, using python. The column names are the same in each table, and objkey is the distinguishing piece of data I have for the item that I'd like to use to tell mysql which columns to look at.
import MySQLdb
db = MySQLdb.connect(host='', user='', passwd='', db='')
cursor = db.cursor
sql = "INSERT INTO newtable (%s, %s, %s, %s) SELECT %s, %s, %s, %s FROM oldtable
WHERE %s;" % ((name, desig, data, num), name, desig, data, num, obj = repr(objkey))
cursor.execute(sql)
db.commit()
db.close()
It says I have a syntax error, but I'm not sure where since I'm pretty sure there should be parentheses around the field names the first time but not the second one. Anyone know what I'm doing wrong?
I'm not exactly sure what you are trying to do with the obj = repr(objkey) part, but Python treats it as an assignment inside the tuple, which is a syntax error, rather than as SQL (if SQL is indeed what you intended).
sql = "INSERT INTO newtable (%s, %s, %s, %s) SELECT %s, %s, %s, %s FROM oldtable
WHERE %s;" % ((name, desig, data, num), name, desig, data, num, obj = repr(objkey))
should probably be changed to something like:
sql = """INSERT INTO newtable (%s, %s, %s, %s) SELECT %s, %s, %s, %s FROM oldtable
WHERE obj=%s;""" % (name, desig, data, num, name, desig, data, num, repr(objkey))
But even then, you would need objkey defined somewhere as a Python variable.
This answer may be way off, but you need to define what you expect to achieve with obj = repr(objkey) in order to get more accurate answers.
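A safer variation, sketched under assumptions: interpolate only the column names into the SQL text, and leave the key value as a %s placeholder for the driver to fill in. The helper build_copy_query and the key column name obj are hypothetical, and the column names are assumed to be plain strings from trusted code.

```python
def build_copy_query(columns, key_column="obj"):
    """Return an INSERT ... SELECT statement copying `columns` from oldtable
    to newtable, with one %s placeholder left for the key value.

    key_column is a hypothetical name for the column that holds objkey.
    Column names must come from trusted code, since they are interpolated.
    """
    cols = ", ".join(columns)
    return (
        "INSERT INTO newtable ({cols}) "
        "SELECT {cols} FROM oldtable WHERE {key} = %s"
    ).format(cols=cols, key=key_column)


sql = build_copy_query(["name", "desig", "data", "num"])
# With a MySQLdb cursor (created by calling db.cursor()), this would run as:
# cursor.execute(sql, (objkey,))
```

Letting the driver substitute objkey avoids the repr() quoting and the injection risk that comes with formatting values into the string by hand.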

Compound SQL INSERT statement

How would I combine the following two statements to create a valid SQL query?
provider = os.path.basename(file)
cursor.execute("""INSERT into main_app_financialstatements
(statement_id, provider_id***, url, date)
VALUES (%s, %s***, %s, %s)""",
(statement_id, provider***, url, date))
provider_id = SELECT id FROM main_app_provider WHERE provider=provider
In other words, I have the provider, and I need to SELECT the provider_id from another table in order to INSERT it into the main_app_financialstatements.
cursor.execute("""INSERT into main_app_financialstatements
(statement_id, provider_id, url, date)
VALUES (%s, (SELECT id FROM main_app_provider WHERE provider=%s), %s, %s)""",
(statement_id, provider, url, date))
You could use the INSERT ... SELECT ... FROM variant of the INSERT command:
provider = os.path.basename(file)
sql = """
INSERT INTO main_app_financialstatements
(statement_id, provider_id, url, date)
SELECT %s, id, %s, %s
FROM main_app_provider
WHERE provider = %s
"""
args = (statement_id, url, date, provider)
cursor.execute(sql, args)
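The two answers differ when no provider matches: the scalar subquery yields NULL for provider_id, while INSERT ... SELECT inserts no row at all. The sketch below shows this with the standard-library sqlite3 module, used purely as a stand-in so it runs without a server. That is an assumption of this example, and sqlite3 uses ? placeholders where the answers use %s. The table layouts and sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_app_provider (id INTEGER PRIMARY KEY, provider TEXT)")
conn.execute("""CREATE TABLE main_app_financialstatements
                (statement_id INTEGER, provider_id INTEGER, url TEXT, date TEXT)""")
conn.execute("INSERT INTO main_app_provider (id, provider) VALUES (7, 'acme')")

# Form 1: scalar subquery inside VALUES.
subquery_sql = """INSERT INTO main_app_financialstatements
                  (statement_id, provider_id, url, date)
                  VALUES (?, (SELECT id FROM main_app_provider WHERE provider = ?), ?, ?)"""
# Form 2: INSERT ... SELECT.
select_sql = """INSERT INTO main_app_financialstatements
                (statement_id, provider_id, url, date)
                SELECT ?, id, ?, ? FROM main_app_provider WHERE provider = ?"""

# Known provider: both forms insert a row with provider_id 7.
conn.execute(subquery_sql, (1, "acme", "u1", "2020-01-01"))
conn.execute(select_sql, (2, "u2", "2020-01-01", "acme"))

# Unknown provider: form 1 inserts NULL, form 2 inserts nothing.
conn.execute(subquery_sql, (3, "missing", "u3", "2020-01-01"))
conn.execute(select_sql, (4, "u4", "2020-01-01", "missing"))

rows = conn.execute(
    "SELECT statement_id, provider_id FROM main_app_financialstatements "
    "ORDER BY statement_id").fetchall()
print(rows)
```

Which behaviour is preferable depends on whether a statement without a known provider should be stored at all.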
