Hello, how can I insert unique rows without duplicates?
cursor.execute("CREATE TABLE IF NOT EXISTS tab1 (id varchar(36) primary key, cap1 VARCHAR(4), cap2 varchar(55), cap3 int(6), Version VARCHAR(4));")
id = uuid.uuid1()
id = str(id)
cursor.execute("INSERT IGNORE INTO tab1 (id, cap1, cap2, cap3, Version) VALUES (%s, %s, %s, %s, %s )", (vals))
The third row should not be inserted when it is the same as the first row.
I hope that is clear.
Thank you in advance.
The problem is that uuid.uuid1() always returns a new unique identifier, and since id is the primary key, every row gets inserted even when all of its other columns duplicate an existing row. Only the id column differs, so INSERT IGNORE never sees a conflict.
I think this link might answer your question; otherwise, create a unique index on the columns that you want to be unique.
Let me know if it helps!
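The unique-index suggestion can be sketched as follows. This is a minimal illustration, not the asker's setup: it swaps in Python's built-in sqlite3 for MySQL so that it runs without a server, which means the statement is spelled INSERT OR IGNORE and the placeholders are ? instead of %s. The index name uq_tab1_data and the sample rows are made up.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE IF NOT EXISTS tab1 ("
    "id VARCHAR(36) PRIMARY KEY, cap1 VARCHAR(4), cap2 VARCHAR(55), "
    "cap3 INT, Version VARCHAR(4))"
)
# Hypothetical index name: the combination of data columns must be unique.
cur.execute(
    "CREATE UNIQUE INDEX IF NOT EXISTS uq_tab1_data "
    "ON tab1 (cap1, cap2, cap3, Version)"
)

rows = [
    ("a", "first", 1, "v1"),
    ("b", "second", 2, "v1"),
    ("a", "first", 1, "v1"),  # same data as the first row
]
for cap1, cap2, cap3, version in rows:
    # The id is always new, so only the unique index can reject the duplicate.
    cur.execute(
        "INSERT OR IGNORE INTO tab1 (id, cap1, cap2, cap3, Version) "
        "VALUES (?, ?, ?, ?, ?)",
        (str(uuid.uuid1()), cap1, cap2, cap3, version),
    )
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM tab1").fetchone()[0]
print(row_count)
```

With MySQL the same idea would be a UNIQUE index on the four data columns, combined with the INSERT IGNORE already used in the question.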
Related
I have a pandas DataFrame that I want to insert/update (upsert) into a table. The conditions are:
New rows are inserted, and the current timestamp is written to the column INSERT_TIMESTAMP.
In this case UPDATE_TIMESTAMP and PREVIOUS_UPDATE_TIMESTAMP stay empty.
For existing rows (a row counts as existing if its primary key is already in the table), the values in the existing row are updated, except for INSERT_TIMESTAMP.
In this case the current timestamp is written to the column UPDATE_TIMESTAMP.
The value that UPDATE_TIMESTAMP had before the update is also copied into PREVIOUS_UPDATE_TIMESTAMP.
Here is my code, which tries to upsert a DataFrame into a table that already exists.
CODE is the primary key.
To restate: I want to insert the row if its CODE is not in the table, setting INSERT_TIMESTAMP to the current time. If the CODE already exists, I want to update the row without touching INSERT_TIMESTAMP, set UPDATE_TIMESTAMP to the current time, and copy the pre-update UPDATE_TIMESTAMP into PREVIOUS_UPDATE_TIMESTAMP.
This code gives me a tuple index out of range error.
for index, row in dataframe.iterrows():
    print(row)
    cur.execute("""INSERT INTO TABLE_NAME
                   (CODE, NAME, CODE_GROUP,
                   INDICATOR, INSERT_TIMESTAMP,
                   UPDATE_SOURCE, IDD, INSERT_SOURCE)
                   VALUES (%s, %s, %s, %s, NOW(), %s, %s, %s)
                   ON CONFLICT(CODE)
                   DO UPDATE SET
                   NAME = %s,
                   CODE_GROUP = %s,
                   INDICATOR = %s,
                   UPDATE_TIMESTAMP = NOW(),
                   UPDATE_SOURCE = %s,
                   IDD = %s, INSERT_SOURCE = %s,
                   PREV_UPDATE_TIMESTAMP = EXCLUDED.UPDATE_TIMESTAMP""",
                (row["CODE"],
                 row['NAME'],
                 row['CODE_GROUP'],
                 row['INDICATOR'],
                 row['UPDATE_SOURCE'],
                 row['IDD'],
                 row['INSERT_SOURCE']))
conn.commit()
cur.close()
conn.close()
Please help me find where it is going wrong. Should the UPDATE part list all the columns that I mention in the INSERT part? In the insert case, UPDATE_TIMESTAMP and PREVIOUS_UPDATE_TIMESTAMP should be null, and in the update case, INSERT_TIMESTAMP has to stay as it was.
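One likely cause of the error: the statement contains 13 %s placeholders (7 in VALUES, 6 in DO UPDATE SET) but only 7 values are passed. In addition, EXCLUDED refers to the row proposed for insertion, so EXCLUDED.UPDATE_TIMESTAMP is not the timestamp already stored in the table. Below is a minimal sketch of the intended behaviour, under stated assumptions: it uses Python's sqlite3 (whose ON CONFLICT ... DO UPDATE with excluded needs SQLite 3.24 or later) in place of PostgreSQL and psycopg2, a shortened hypothetical table named items, and timestamps passed in as strings instead of NOW() so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE items (
        code TEXT PRIMARY KEY,
        name TEXT,
        insert_timestamp TEXT,
        update_timestamp TEXT,
        prev_update_timestamp TEXT)"""
)

# Each value is bound once by name; the update branch reuses the proposed
# row through "excluded" instead of repeating placeholders.
UPSERT = """
INSERT INTO items (code, name, insert_timestamp)
VALUES (:code, :name, :now)
ON CONFLICT (code) DO UPDATE SET
    name = excluded.name,                           -- value from the new row
    prev_update_timestamp = items.update_timestamp, -- value before this update
    update_timestamp = :now                         -- insert_timestamp untouched
"""


def upsert(code, name, now):
    conn.execute(UPSERT, {"code": code, "name": name, "now": now})
    conn.commit()


upsert("A1", "first name", "2020-01-01 10:00:00")   # plain insert
upsert("A1", "second name", "2020-01-02 10:00:00")  # first update
upsert("A1", "third name", "2020-01-03 10:00:00")   # second update

row = conn.execute("SELECT * FROM items WHERE code = 'A1'").fetchone()
print(row)
```

In PostgreSQL the same shape should apply with %s or %(name)s placeholders and NOW(); the existing row is referenced through the table name and the new row through EXCLUDED.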
I have a CSV file of stock data that is updated every day.
I want to load this data into a table and add only the new data each day.
This is my code:
# -*- coding: utf-8 -*-
import csv
import mysql.connector
from datetime import datetime
cnx = mysql.connector.connect(host='localhost',
                              user='root',
                              passwd='pass',
                              db='stock')
cursor = cnx.cursor()
cursor.execute("""CREATE TABLE IF NOT EXISTS stock(id INT AUTO_INCREMENT ,
    name VARCHAR(50), day DATE UNIQUE, open float, high float, low float,
    close float, vol float, PRIMARY KEY(id))""")
a = 0
with open("file path") as f:
    data = csv.reader(f)
    for row in data:
        if a == 0:
            a=+1
        else:
            cursor.execute('''INSERT INTO stock(name,day,open,high,low,close,vol)
                VALUES("%s","%s","%s","%s","%s","%s","%s")''',
                (row[0], int(row[1]), float(row[2]), float(row[3]), float(row[4]),
                 float(row[5]), float(row[6])))
cnx.commit()
cnx.close()
But I cannot prevent duplicate rows from being inserted.
Assuming that you want to avoid duplicates on (name, day), one approach is to set a unique key on this tuple of columns. You can then use insert ignore or, better yet, the on duplicate key syntax to skip the duplicate rows.
You create the table like this:
create table if not exists stock(
id int auto_increment ,
name varchar(50),
day date,
open float,
high float,
low float,
close float,
vol float,
primary key(id),
unique (name, day) -- unique constraint
);
Then:
insert into stock(name,day,open,high,low,close,vol)
values(%s, %s, %s, %s, %s, %s, %s)
on duplicate key update name = values(name) -- dummy update
Notes:
You should not put double quotes around the %s placeholders; the driver quotes the bound values itself.
Your original create table code had a unique constraint on the column day alone; this does not fit your question as I understood it. In any case, put the unique constraint on the column (or set of columns) on which you want to avoid duplicates.
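The daily load described above can be sketched end to end. This is an illustration under assumptions, not the asker's environment: Python's sqlite3 stands in for MySQL (so the statement is INSERT OR IGNORE with ? placeholders and the column types are SQLite's), and the CSV content is made-up sample data read from a string instead of a file.

```python
import csv
import io
import sqlite3

# Made-up sample data standing in for the daily CSV file.
CSV_TEXT = """name,day,open,high,low,close,vol
ACME,2020-08-07,1.0,2.0,0.5,1.5,100
ACME,2020-08-08,1.5,2.5,1.0,2.0,150
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE IF NOT EXISTS stock (
        id INTEGER PRIMARY KEY,
        name TEXT, day TEXT,
        open REAL, high REAL, low REAL, close REAL, vol REAL,
        UNIQUE (name, day))"""
)


def load(csv_text):
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    for name, day, *prices in reader:
        # Unquoted placeholders; rows that repeat (name, day) are skipped.
        conn.execute(
            "INSERT OR IGNORE INTO stock (name, day, open, high, low, close, vol) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (name, day, *map(float, prices)),
        )
    conn.commit()


load(CSV_TEXT)
load(CSV_TEXT)  # loading the same file again adds nothing
stock_rows = conn.execute("SELECT COUNT(*) FROM stock").fetchone()[0]
print(stock_rows)
```

Calling next(reader) once replaces the counter variable used in the question to skip the header.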
I have 3 tables in my database:
CREATE TABLE IF NOT EXISTS depances (
id SERIAL PRIMARY KEY UNIQUE NOT NULL,
type VARCHAR NOT NULL,
nom VARCHAR,
montant DECIMAL(100,2) NOT NULL,
date DATE,
temp TIME)
CREATE TABLE IF NOT EXISTS transactions (
id SERIAL PRIMARY KEY UNIQUE NOT NULL,
montant DECIMAL(100,2),
medecin VARCHAR,
patient VARCHAR,
acte VARCHAR,
date_d DATE,
time_d TIME,
users_id INTEGER)
CREATE TABLE IF NOT EXISTS total_jr (
id SERIAL PRIMARY KEY UNIQUE NOT NULL,
total_revenu DECIMAL(100,2),
total_depance DECIMAL(100,2),
total_différence DECIMAL(100,2),
date DATE)
My idea is to insert different values into the tables depances and transactions using a GUI.
After that, I add the SUM of depances.montant to total_jr.total_depance
and the SUM of transactions.montant to total_jr.total_revenu, over all rows that have the same date.
That's the easy part, using this code:
self.cur.execute('''SELECT SUM(montant) AS totalsum FROM depances WHERE date = %s''', (date,))
result = self.cur.fetchall()
for i in result:
    o = i[0]
self.cur_t = self.connection.cursor()
self.cur_t.execute('''INSERT INTO total_jr(total_depance)
                      VALUES (%s)''',
                   (o,))
self.connection.commit()
self.cur.execute('''UPDATE total_jr SET total_depance = %s WHERE date = %s''', (o, date))
self.connection.commit()
But every time, it adds a new row to the table total_jr.
How can I store the value of SUM(montant) in the row with the matching date, so that each date has a single row that gets updated instead of a new row being added every time?
The result should look like this:
id|total_revenu|total_depance|total_différence|date
--+------------+-------------+----------------+----
1 sum(montant1) value value 08/07/2020
2 sum(montant2) value value 08/09/2020
3 sum(montant3) value value 08/10/2020
But it only gives me this result:
id|total_revenu|total_depance|total_différence|date
--+------------+-------------+----------------+----
1 1 value value 08/07/2020
2 2 value value 08/07/2020
3 3 value value 08/7/2020
Any idea or hint would be helpful.
You didn't mention which DBMS or SQL module you're using so I'm guessing MySQL.
In your process, run the update first and check how many rows were changed. If zero rows changed, then insert a new row for that date.
self.cur.execute('''SELECT SUM(montant) AS totalsum FROM depances WHERE date = %s''', (date,))
result = self.cur.fetchall()
for i in result:
    o = i[0]
self.cur.execute('''UPDATE total_jr SET total_depance = %s WHERE date = %s''', (o, date))
rowcnt = self.cur.rowcount  # number of rows updated - psycopg2
self.connection.commit()
if rowcnt == 0:  # no rows updated, need to insert new row
    self.cur_t = self.connection.cursor()
    self.cur_t.execute('''INSERT INTO total_jr(total_depance, date)
                          VALUES (%s, %s)''',
                       (o, date))
    self.connection.commit()
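The update-then-insert flow can be tried out in isolation. The sketch below is a simplified stand-in: it uses Python's sqlite3 instead of the asker's driver, reduced table definitions, and a hypothetical helper named refresh_total. It relies only on cursor.rowcount after the UPDATE, as described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE depances (montant REAL, date TEXT)")
conn.execute(
    "CREATE TABLE total_jr (id INTEGER PRIMARY KEY, total_depance REAL, date TEXT)"
)


def refresh_total(date):
    """Recompute the day's total and store it in exactly one row per date."""
    total = conn.execute(
        "SELECT SUM(montant) FROM depances WHERE date = ?", (date,)
    ).fetchone()[0]
    cur = conn.execute(
        "UPDATE total_jr SET total_depance = ? WHERE date = ?", (total, date)
    )
    if cur.rowcount == 0:  # no row for this date yet, so insert one
        conn.execute(
            "INSERT INTO total_jr (total_depance, date) VALUES (?, ?)",
            (total, date),
        )
    conn.commit()


conn.execute("INSERT INTO depances VALUES (10.0, '2020-08-07')")
refresh_total("2020-08-07")
conn.execute("INSERT INTO depances VALUES (5.0, '2020-08-07')")
refresh_total("2020-08-07")

totals = conn.execute("SELECT total_depance, date FROM total_jr").fetchall()
print(totals)
```

Without a unique constraint on date, two processes running this at the same moment could still both insert a row, so this pattern is safest for a single-user application.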
I found a solution, for anyone who needs it in the future. First of all, we need to change the table definition:
create_table_total_jr = ''' CREATE TABLE IF NOT EXISTS total_jr (
id SERIAL PRIMARY KEY UNIQUE NOT NULL,
total_revenu DECIMAL(100,2),
total_depance DECIMAL(100,2),
total_différence DECIMAL(100,2),
date DATE UNIQUE)''' #add unique to the date
After that we use an upsert with ON CONFLICT:
self.cur_t.execute( ''' INSERT INTO total_jr(date) VALUES (%s)
ON CONFLICT (date) DO NOTHING''', (date,))
self.connection.commit()
With this code, inserting a row whose date already exists does nothing.
After that we update the value of the SUM:
self.cur.execute( '''UPDATE total_jr SET total_depance = %s WHERE date = %s''',(o, date))
self.connection.commit()
Special thanks to Mike67 for his help
You do not need 2 database calls for this. As @Mike67 suggested, UPSERT functionality is what you want. However, you need to send both date and total_depance. In SQL that becomes:
insert into total_jr(date,total_depance)
values (date_value, total_value)
on conflict (date)
do update
set total_depance = excluded.total_depance;
Or, if the incoming total_depance is just the value of one transaction while total_depance in the table is an accumulation:
insert into total_jr(date,total_depance)
values (date_value, total_value)
on conflict (date)
do update
set total_depance = total_jr.total_depance + excluded.total_depance;
I believe your code then becomes something like this (assuming the first insert is correct):
self.cur_t.execute( ''' INSERT INTO total_jr(date,total_depance) VALUES (%s,%s)
ON CONFLICT (date) DO UPDATE set total_depance = excluded.total_depance''',(date,total_depance))
self.connection.commit()
But that could be off; you will need to verify.
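A runnable sketch of the accumulating variant follows, with the usual caveats: Python's sqlite3 (SQLite 3.24 or later) is used here in place of PostgreSQL, the table is reduced to the columns involved, and add_expense is a hypothetical helper name. The stored value is referenced through the table name and the incoming value through excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE total_jr ("
    "id INTEGER PRIMARY KEY, total_depance REAL, date TEXT UNIQUE)"
)


def add_expense(date, amount):
    """Single statement: insert the date's row or add to its running total."""
    conn.execute(
        """INSERT INTO total_jr (date, total_depance) VALUES (?, ?)
           ON CONFLICT (date) DO UPDATE
           SET total_depance = total_jr.total_depance + excluded.total_depance""",
        (date, amount),
    )
    conn.commit()


add_expense("2020-08-07", 10.0)
add_expense("2020-08-07", 5.0)
add_expense("2020-08-09", 7.5)

daily = conn.execute(
    "SELECT date, total_depance FROM total_jr ORDER BY date"
).fetchall()
print(daily)
```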
Tip of the day: you should change the column name date to something else. Date is a reserved word in both Postgres and the SQL Standard, with predefined meanings based on its context. While you may get away with using it as a column name, Postgres still has the right to change that at any time without notice; unlikely, but still true. If it does, your code (and most code using those tables) fails, and tracking down why becomes extremely difficult. Basic rule: do not use reserved words as data or database object names; doing so is a bug just waiting to bite.
I have a SQL query, run from Python, that adds data to a database. I am wondering whether, when there is a duplicate key, it can just update a couple of fields instead. The data has around 30 columns, and I am wondering if there is a way to do this.
data = [3, "hello", "this", "is", "random", "data",.......,44] #this being 30 items long
car_placeholder = ",".join(['%s'] * len(data))
qry = f"INSERT INTO car_sales_example VALUES ({car_placeholder}) ON DUPLICATE KEY UPDATE
Price = {data[15]}, IdNum = {data[29]}"
cursor.execute(qry, data)
conn.commit()
I want to add an entry if the key doesn't exist, but if it does, update some of its columns, namely Price and IdNum, which sit at arbitrary positions in the data. Is this even possible?
If it is not, is there a way to update every column without naming each one explicitly? For example:
qry = f"INSERT INTO car_sales_example VALUES ({car_placeholder}) ON DUPLICATE KEY UPDATE
car_sales_example VALUES ({car_placeholder})"
instead of going column by column ->
ON DUPLICATE KEY UPDATE Id = %s, Name = %s, Number = %s, etc... #for 30 columns
In ON DUPLICATE KEY UPDATE you can use the VALUES() function with the name of a column to get the value that would have been inserted into that column.
ON DUPLICATE KEY UPDATE price = VALUES(price), idnum = VALUES(idnum)
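For a 30-column table the statement can be assembled from a list of column names, so the values stay in the parameter list and nothing from data is formatted into the SQL text. The helper below, build_upsert, is a hypothetical name; it only builds the string and does not talk to a database.

```python
def build_upsert(table, n_values, update_columns):
    """Build a MySQL-style upsert that updates only the given columns.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    placeholders = ", ".join(["%s"] * n_values)
    updates = ", ".join(f"{col} = VALUES({col})" for col in update_columns)
    return (
        f"INSERT INTO {table} VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


# Three values here to keep the output short; the question's data has 30.
qry = build_upsert("car_sales_example", 3, ["Price", "IdNum"])
print(qry)
```

The result would then be passed as cursor.execute(qry, data), with data holding exactly n_values items. To update every column on a duplicate key, the same helper can be given the full list of column names.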
I have a python script that executes some simple SQL.
c.execute("CREATE TABLE IF NOT EXISTS simpletable (id integer PRIMARY KEY, post_body text, post_id text, comment_id text, url text);")
command = "INSERT OR IGNORE INTO simpletable VALUES ('%s', '%s', '%s', '%s')" % (comments[-1].post_body, comments[-1].post_id, comments[-1].comment_id,
comments[-1].url)
c.execute(command)
c.commit()
But when I execute it, I get an error
sqlite3.OperationalError: table simpletable has 5 columns but 4 values were supplied
Why is it not automatically filling in the id key?
In Python 3.6 I did it as shown below, and the data was inserted successfully.
I passed None for the auto-incrementing ID, since Python has no Null; sqlite3 stores None as NULL.
conn.execute("INSERT INTO CAMPAIGNS VALUES (?, ?, ?, ?)", (None, campaign_name, campaign_username, campaign_password))
The ID structure is as follows.
ID INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL
If you don't specify the target columns, VALUES is expected to provide a value for every column, which yours did not.
INSERT
OR IGNORE INTO simpletable
(post_body,
post_id,
comment_id,
url)
VALUES ('%s',
'%s',
'%s',
'%s');
Specifying the target columns is advisable in any case. The query won't break if, for any reason, the order of the columns in the table changes.
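A small self-contained check of this advice, using an in-memory SQLite database and made-up comment values. It also uses ? parameter binding instead of building the statement with the % operator, so a quote character inside a value cannot break the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS simpletable ("
    "id INTEGER PRIMARY KEY, post_body TEXT, post_id TEXT, "
    "comment_id TEXT, url TEXT)"
)

# Made-up values; the apostrophe would break a '%s'-formatted statement.
comment = ("It's a body", "p1", "c1", "http://example.com/1")

# id is left out of the column list, so SQLite assigns it.
cur = conn.execute(
    "INSERT OR IGNORE INTO simpletable (post_body, post_id, comment_id, url) "
    "VALUES (?, ?, ?, ?)",
    comment,
)
conn.commit()

new_id = cur.lastrowid
stored = conn.execute("SELECT * FROM simpletable").fetchone()
print(new_id, stored)
```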
Try to specify the column names, so that where each value ends up doesn't depend on column order.
For example:
INTO simpletable
(post_body,
post_id,
comment_id,
url)
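And if you want the id column to be incremented automatically, make sure to add the Identity property, or the equivalent auto-increment feature of your DBMS (in SQLite, an INTEGER PRIMARY KEY column is already filled in automatically when it is left out of the INSERT).
For example:
CREATE TABLE IF NOT EXISTS simpletable (id integer PRIMARY KEY Identity(1,1),
Also remember that your script only creates the table; it does not alter the structure of a table that already exists.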
If your code is correct, delete your database file (name.db) and run your code again; sometimes that solves the problem.
Imagine this is your code:
cursor.execute('''CREATE TABLE IF NOT EXISTS food(name TEXT , price TEXT)''')
cursor.execute('INSERT INTO food VALUES ("burger" , "20")')
connection.commit()
and you see an error like this:
table has 1 column but 2 values were supplied
It happens because, for example, you first created the database file with a one-column table and later changed your code to two columns without changing the file name. Because the table already exists, CREATE TABLE IF NOT EXISTS does not recreate it.
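The effect can be reproduced without touching any file. The sketch below uses an in-memory SQLite database; dropping the table plays the role of deleting the .db file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First run of the program: the table is created with one column.
conn.execute("CREATE TABLE IF NOT EXISTS food (name TEXT)")

# Later run with edited code: the table exists, so this statement does nothing.
conn.execute("CREATE TABLE IF NOT EXISTS food (name TEXT, price TEXT)")

try:
    conn.execute("INSERT INTO food VALUES ('burger', '20')")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)  # complains about 1 column versus 2 values
print(error)

# Dropping the stale table (or deleting the database file) lets the new
# definition take effect.
conn.execute("DROP TABLE food")
conn.execute("CREATE TABLE IF NOT EXISTS food (name TEXT, price TEXT)")
conn.execute("INSERT INTO food VALUES ('burger', '20')")
conn.commit()

food_rows = conn.execute("SELECT * FROM food").fetchall()
print(food_rows)
```

Dropping the table or deleting the file discards any data already stored in it.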