I have code that loops, inserting a row of information on each iteration. However, the rows do not each get a new timestamp; they all have the same one as the very first row, which leads me to believe that the value of current_timestamp is not updating each time. How can I fix this problem? Here is my code:
if __name__ == "__main__":
    main()
    deleteAll()  # Clears current table
    ID = 0
    while ID < 100:
        insert(ID, 'current_date', 'current_timestamp')
        ID += 1
    conn.commit()
My insert function:
def insert(ID, date, timestamp):  # Assumes table name is test1
    cur.execute(
        """INSERT INTO test1 (ID, date,timestamp) VALUES (%s, %s, %s);""", (ID, AsIs(date), AsIs(timestamp)))
This code is in Python, by the way, and it uses PostgreSQL as the database.
The immediate fix is to commit after each insert; otherwise, all of the inserts are done inside a single transaction:
while ID < 100:
    insert(ID, 'current_date', 'current_timestamp')
    ID += 1
    conn.commit()
http://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT
Since these functions return the start time of the current transaction, their values do not change during the transaction. This is considered a feature: the intent is to allow a single transaction to have a consistent notion of the "current" time, so that multiple modifications within the same transaction bear the same time stamp.
Those functions should not be passed as parameters but written directly into the SQL statement:
def insert(ID):  # Assumes table name is test1
    cur.execute("""
        INSERT INTO test1 (ID, date, timestamp)
        VALUES (%s, current_date, current_timestamp);
        """, (ID,)
    )
The best practice is to keep the commit outside of the loop, so that everything runs in a single transaction,
while ID < 100:
    insert(ID)
    ID += 1
conn.commit()
and to use the statement_timestamp function which, as the name implies, returns the start time of the current statement instead of the start time of the transaction:
INSERT INTO test1 (ID, date, timestamp)
values (%s, statement_timestamp()::date, statement_timestamp())
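Putting the pieces of this answer together, a minimal sketch might look like the following. It assumes a psycopg2-style connection object and the test1 table from the question; the helper name insert_rows is made up for illustration.

```python
INSERT_SQL = """
    INSERT INTO test1 (ID, date, timestamp)
    VALUES (%s, statement_timestamp()::date, statement_timestamp());
"""


def insert_rows(conn, count):
    """Insert `count` rows in one transaction (hypothetical helper).

    `conn` is assumed to be an open psycopg2-style connection.
    statement_timestamp() is evaluated once per INSERT, so rows can get
    different timestamps even though there is only one commit.
    """
    cur = conn.cursor()
    try:
        for row_id in range(count):
            cur.execute(INSERT_SQL, (row_id,))
        conn.commit()  # single commit: all rows belong to one transaction
    finally:
        cur.close()
```

Calling insert_rows(conn, 100) would replace the while loop from the question.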
I have a pandas DataFrame that I want to insert/update (upsert) into a table. The conditions are:
New rows are inserted, and the current timestamp is written to the INSERT_TIMESTAMP column at insert time.
In this case, UPDATE_TIMESTAMP and PREVIOUS_UPDATE_TIMESTAMP stay blank.
For existing rows (a row counts as existing if its primary key is already in the table), the values are updated in the existing row, except for INSERT_TIMESTAMP.
In this case, the current timestamp is written to the UPDATE_TIMESTAMP column.
The UPDATE_TIMESTAMP value from before the update should also be copied into PREVIOUS_UPDATE_TIMESTAMP.
Here is my code, where I am trying to upsert a DataFrame into a table that has already been created.
CODE is the primary key.
Again, as mentioned, I want to insert the row if the CODE is not present in the table, setting INSERT_TIMESTAMP to the current time. If the CODE already exists in the table, I want to update the row, leaving INSERT_TIMESTAMP unchanged but setting UPDATE_TIMESTAMP to the current time, and copying the row's previous UPDATE_TIMESTAMP into PREVIOUS_UPDATE_TIMESTAMP.
This code gives me a "tuple index out of range" error.
for index, row in dataframe.iterrows():
    print(row)
    cur.execute("""INSERT INTO TABLE_NAME
                   (CODE, NAME, CODE_GROUP,
                   INDICATOR, INSERT_TIMESTAMP,
                   UPDATE_SOURCE, IDD, INSERT_SOURCE)
                   VALUES (%s, %s, %s, %s, NOW(), %s, %s, %s)
                   ON CONFLICT(CODE)
                   DO UPDATE SET
                   NAME = %s,
                   CODE_GROUP = %s,
                   INDICATOR = %s,
                   UPDATE_TIMESTAMP = NOW(),
                   UPDATE_SOURCE = %s,
                   IDD = %s, INSERT_SOURCE = %s,
                   PREV_UPDATE_TIMESTAMP = EXCLUDED.UPDATE_TIMESTAMP""",
                (row["CODE"],
                 row['NAME'],
                 row['CODE_GROUP'],
                 row['INDICATOR'],
                 row['UPDATE_SOURCE'],
                 row['IDD'],
                 row['INSERT_SOURCE']))
    conn.commit()
cur.close()
conn.close()
Please help me see where it is going wrong. Should I add all the columns from the INSERT statement to the UPDATE part as well? In the insert case, UPDATE_TIMESTAMP and PREVIOUS_UPDATE_TIMESTAMP should be null, and in the update case, INSERT_TIMESTAMP has to stay the same as it was before.
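One likely cause of the error: the statement above contains 13 %s placeholders but only 7 values are passed. A possible way around this is to refer to the incoming values through EXCLUDED in the DO UPDATE part, so each value is passed only once, and to read the old timestamp from the existing row by qualifying it with the table name (EXCLUDED.UPDATE_TIMESTAMP is the proposed row's value, which is NULL here). The sketch below is an assumption-laden illustration: it uses an in-memory SQLite database as a stand-in so it can run anywhere, the column types are guesses, and upsert_rows is a made-up helper. With psycopg2 against PostgreSQL, the ? placeholders would become %s.

```python
import sqlite3

# Table and column names follow the question; the column types are assumptions.
UPSERT_SQL = """
INSERT INTO table_name
    (code, name, code_group, indicator, insert_timestamp,
     update_source, idd, insert_source)
VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP, ?, ?, ?)
ON CONFLICT (code) DO UPDATE SET
    name = excluded.name,
    code_group = excluded.code_group,
    indicator = excluded.indicator,
    update_source = excluded.update_source,
    idd = excluded.idd,
    insert_source = excluded.insert_source,
    -- table_name.col is the value currently stored in the existing row
    prev_update_timestamp = table_name.update_timestamp,
    update_timestamp = CURRENT_TIMESTAMP
"""


def upsert_rows(conn, rows):
    """Upsert 7-tuples (code, name, code_group, indicator,
    update_source, idd, insert_source). Hypothetical helper."""
    conn.executemany(UPSERT_SQL, rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table_name (
        code TEXT PRIMARY KEY, name TEXT, code_group TEXT, indicator TEXT,
        insert_timestamp TEXT, update_timestamp TEXT,
        prev_update_timestamp TEXT,
        update_source TEXT, idd TEXT, insert_source TEXT)
""")
upsert_rows(conn, [("A1", "first", "g", "Y", "src", "1", "src")])   # insert
upsert_rows(conn, [("A1", "second", "g", "Y", "src", "1", "src")])  # update
```

With a DataFrame, the rows could be produced with dataframe[columns].itertuples(index=False, name=None), where columns lists the seven column names in the order shown in the docstring.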
I have a CSV file of stock data that is updated every day.
I want to load this data into a table and add only the new data each day.
This is my code:
# -*- coding: utf-8 -*-
import csv
import mysql.connector
from datetime import datetime
cnx = mysql.connector.connect(host= 'localhost',
                              user= 'root',
                              passwd='pass',
                              db='stock')
cursor = cnx.cursor()
cursor.execute("""CREATE TABLE IF NOT EXISTS stock(id INT AUTO_INCREMENT ,
    name VARCHAR(50), day DATE UNIQUE, open float, high float, low float,
    close float, vol float, PRIMARY KEY(id))""")
a = 0
with open("file path") as f:
    data = csv.reader(f)
    for row in data:
        if a == 0:
            a=+1
        else:
            cursor.execute('''INSERT INTO stock(name,day,open,high,low,close,vol)
                VALUES("%s","%s","%s","%s","%s","%s","%s")''',
                (row[0],int(row[1]),float(row[2]),float(row[3]),float(row[4]),
                 float(row[5]),float(row[6])))
    cnx.commit()
cnx.close()
But I cannot prevent duplicate rows from being inserted.
Assuming that you want to avoid duplicates on (name, day), one approach would be to set a unique key on this tuple of columns. You can then use insert ignore, or better yet on duplicate key syntax to skip the duplicate rows.
You create the table like:
create table if not exists stock(
id int auto_increment ,
name varchar(50),
day date,
open float,
high float,
low float,
close float,
vol float,
primary key(id),
unique (name, day) -- unique constraint
);
Then:
insert into stock(name,day,open,high,low,close,vol)
values(%s, %s, %s, %s, %s, %s, %s)
on duplicate key update name = values(name) -- dummy update
Notes:
you should not have double quotes around the %s placeholders
your original create table code had a unique constraint on column day; this does not fit with your question, as I understood it. In any case, you should put the unique constraint on the column (or set of columns) on which you want to avoid duplicates.
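On the Python side, the notes above could translate into something like the sketch below. It is an illustration under assumptions: the CSV column order is taken from the question, the day column is assumed to already be in a format MySQL accepts and is passed through as a string, and read_rows and load_stock are made-up helper names. cnx is assumed to be an open mysql.connector connection.

```python
import csv
import io

INSERT_SQL = (
    "INSERT INTO stock (name, day, open, high, low, close, vol) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s) "  # no quotes around the placeholders
    "ON DUPLICATE KEY UPDATE name = VALUES(name)"  # dummy update, as above
)


def read_rows(f):
    """Parse CSV lines into parameter tuples, skipping the header row."""
    reader = csv.reader(f)
    next(reader, None)  # skip the header instead of using a counter
    return [
        (row[0], row[1], *(float(value) for value in row[2:7]))
        for row in reader
        if row  # ignore blank lines
    ]


def load_stock(cnx, f):
    """Insert every CSV row; rows that hit the unique key are left as they are."""
    cursor = cnx.cursor()
    for params in read_rows(f):
        cursor.execute(INSERT_SQL, params)
    cnx.commit()
    cursor.close()


# Parsing only; no database is needed for this part.
sample = io.StringIO(
    "name,day,open,high,low,close,vol\nABC,2020-08-07,1,2,0.5,1.5,100\n"
)
print(read_rows(sample))
```

Because the unique key on (name, day) rejects repeats, the whole file can be reloaded every day without first working out which lines are new.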
I have 3 tables in my database:
CREATE TABLE IF NOT EXISTS depances (
id SERIAL PRIMARY KEY UNIQUE NOT NULL,
type VARCHAR NOT NULL,
nom VARCHAR,
montant DECIMAL(100,2) NOT NULL,
date DATE,
temp TIME)
CREATE TABLE IF NOT EXISTS transactions (
id SERIAL PRIMARY KEY UNIQUE NOT NULL,
montant DECIMAL(100,2),
medecin VARCHAR,
patient VARCHAR,
acte VARCHAR,
date_d DATE,
time_d TIME,
users_id INTEGER)
CREATE TABLE IF NOT EXISTS total_jr (
id SERIAL PRIMARY KEY UNIQUE NOT NULL,
total_revenu DECIMAL(100,2),
total_depance DECIMAL(100,2),
total_différence DECIMAL(100,2),
date DATE)
My idea is to insert different values into the depances and transactions tables using a GUI,
and after that to store the SUM of depances.montant in total_jr.total_depance
and the SUM of transactions.montant in total_jr.total_revenu, taken over all rows that have the same date.
That's the easy part, using this code:
self.cur.execute('''SELECT SUM(montant) AS totalsum FROM depances WHERE date = %s''', (date,))
result = self.cur.fetchall()
for i in result:
    o = i[0]
self.cur_t = self.connection.cursor()
self.cur_t.execute('''INSERT INTO total_jr(total_depance)
    VALUES (%s)''', (o,))
self.connection.commit()
self.cur.execute('''UPDATE total_jr SET total_depance = %s WHERE date = %s''', (o, date))
self.connection.commit()
But every time, it adds a new row to the total_jr table.
How can I store the SUM(montant) value in the row with the matching date, so that the sum always goes into a single row per date instead of a new row being added every time?
The result should look like this:
id|total_revenu|total_depance|total_différence|date
--+------------+-------------+----------------+----
1 sum(montant1) value value 08/07/2020
2 sum(montant2) value value 08/09/2020
3 sum(montant3) value value 08/10/2020
but it only gives me this result:
id|total_revenu|total_depance|total_différence|date
--+------------+-------------+----------------+----
1 1 value value 08/07/2020
2 2 value value 08/07/2020
3 3 value value 08/7/2020
Any idea or hint would be helpful.
You didn't mention which DBMS or SQL module you're using so I'm guessing MySQL.
In your process, run the update first and check how many rows were changed. If zero rows changed, then insert a new row for that date.
self.cur.execute('''SELECT SUM(montant) AS totalsum FROM depances WHERE date = %s''', (date,))
result = self.cur.fetchall()
for i in result:
    o = i[0]
self.cur.execute('''UPDATE total_jr SET total_depance = %s WHERE date = %s''', (o, date))
rowcnt = self.cur.rowcount  # number of rows updated - psycopg2
self.connection.commit()
if rowcnt == 0:  # no rows updated, need to insert new row
    self.cur_t = self.connection.cursor()
    self.cur_t.execute('''INSERT INTO total_jr(total_depance, date)
        VALUES (%s, %s)''', (o, date))
    self.connection.commit()
I found a solution, for anyone who needs it in the future. First of all, we need to update the table definition:
create_table_total_jr = ''' CREATE TABLE IF NOT EXISTS total_jr (
id SERIAL PRIMARY KEY UNIQUE NOT NULL,
total_revenu DECIMAL(100,2),
total_depance DECIMAL(100,2),
total_différence DECIMAL(100,2),
date DATE UNIQUE)''' #add unique to the date
After that, we use an upsert with ON CONFLICT:
self.cur_t.execute( ''' INSERT INTO total_jr(date) VALUES (%s)
ON CONFLICT (date) DO NOTHING''', (date,))
self.connection.commit()
With this code, inserting a value whose date already exists does nothing.
After that, we update the value of the SUM:
self.cur.execute( '''UPDATE total_jr SET total_depance = %s WHERE date = %s''',(o, date))
self.connection.commit()
Special thanks to Mike67 for his help
You do not need 2 database calls for this. As @Mike67 suggested, UPSERT functionality is what you want. However, you need to send both date and total_depance. In SQL that becomes:
insert into total_jr(date,total_depance)
values (date_value, total_value)
on conflict (date)
do update
set total_depance = excluded.total_depance;
Or, if the incoming total_depance is just the value of a single transaction while total_depance in the table is an accumulation:
insert into total_jr(date,total_depance)
values (date_value, total_value)
on conflict (date)
do update
set total_depance = total_jr.total_depance + excluded.total_depance;
I believe your code then becomes something like this (assuming the first insert is correct):
self.cur_t.execute(''' INSERT INTO total_jr(date,total_depance) VALUES (%s, %s)
ON CONFLICT (date) DO UPDATE SET total_depance = excluded.total_depance''', (date, total_depance))
self.connection.commit()
But that could be off; you will need to verify it.
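As a further illustration of the single-call idea, the SUM and the upsert can be combined into one statement. The sketch below is hedged: it runs against an in-memory SQLite database purely as a stand-in so that it is self-contained, the tables are trimmed to the columns needed here, and refresh_total_depance is a made-up helper name. With psycopg2 and PostgreSQL the ? placeholders would be %s, and the unique constraint on date is still required.

```python
import sqlite3

REFRESH_SQL = """
INSERT INTO total_jr (date, total_depance)
SELECT ?, COALESCE(SUM(montant), 0) FROM depances WHERE date = ?
ON CONFLICT (date) DO UPDATE SET total_depance = excluded.total_depance
"""


def refresh_total_depance(conn, day):
    """Recompute the expense total for `day` and store it in its single row."""
    conn.execute(REFRESH_SQL, (day, day))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE depances (id INTEGER PRIMARY KEY, montant REAL NOT NULL, date TEXT)"
)
conn.execute(
    "CREATE TABLE total_jr (id INTEGER PRIMARY KEY, total_revenu REAL, "
    "total_depance REAL, date TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO depances (montant, date) VALUES (?, ?)",
    [(10.5, "2020-08-07"), (4.5, "2020-08-07")],
)
refresh_total_depance(conn, "2020-08-07")  # creates the row for that date
conn.execute(
    "INSERT INTO depances (montant, date) VALUES (?, ?)", (5, "2020-08-07")
)
refresh_total_depance(conn, "2020-08-07")  # updates the same row
rows = conn.execute("SELECT date, total_depance FROM total_jr").fetchall()
```

However many times the function is called for a given date, total_jr keeps one row for that date.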
Tip of the day: You should change the column name date to something else. Date is a reserved word in the SQL Standard and the name of a built-in type in Postgres. It has predefined meanings based on its context. While you may get away with using it as a data name, Postgres still has the right to change that at any time without notice; unlikely, but still true. If that happens, your code (and most code using that table or those tables) fails, and tracking down why becomes extremely difficult. Basic rule: do not use reserved words as data names; using reserved words as data or db object names is a bug just waiting to bite.
I'm trying to insert some data into an SQL database, and the problem is that I'm really new to this. My main question is: how can I sort all the items in the table? I have 3 main columns: ID, CARNUM, TIME. But in this insertion I have to type the ID manually. How can I make the system generate a numeric ID automatically?
Here's the insertion code:
postgres_insert_query = """ INSERT INTO Vartotojai (ID, CARNUM, TIME) VALUES (%s,%s,%s)"""
record_to_insert = (id, car_numb, Reg_Tikslus_Laikas)
cursor.execute(postgres_insert_query, record_to_insert)
connection.commit()
count = cursor.rowcount
print (count, "Record inserted successfully into mobile table")
You could change the datatype of ID to serial, which is an auto-incrementing integer, meaning that you don't have to enter an ID manually when inserting into the database.
Read more about datatype serial: source
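Once ID is generated by the database, the insert from the question can simply leave it out. The sketch below is only an illustration: it assumes a psycopg2-style cursor and that ID is defined as serial, and add_car and list_cars are made-up helper names. RETURNING hands back the generated ID, and ORDER BY covers the sorting part of the question.

```python
INSERT_SQL = "INSERT INTO Vartotojai (CARNUM, TIME) VALUES (%s, %s) RETURNING ID"
SELECT_SQL = "SELECT ID, CARNUM, TIME FROM Vartotojai ORDER BY ID"


def add_car(cursor, car_numb, reg_time):
    """Insert one record without an ID and return the ID the database chose."""
    cursor.execute(INSERT_SQL, (car_numb, reg_time))
    return cursor.fetchone()[0]


def list_cars(cursor):
    """Return all records sorted by ID."""
    cursor.execute(SELECT_SQL)
    return cursor.fetchall()
```

In the question's code this would be called as add_car(cursor, car_numb, Reg_Tikslus_Laikas), followed by connection.commit().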
I have a python script that aggregates data from multiple sources to one, for technical reasons.
In this script, I create an employees table, fill it with data and, in a second step, fetch each employee's name/last name from another data source. My code is the following:
Create the table and fill it with data:
def createIdentite(mariaConnector, fmsConnector):
    print('Creating table "Identite"...')
    mariadbCursor = mariaConnector.cursor()
    # verify we have the destination tables we need
    print(' Checking for table Identite...')
    if mariaCheckTableExists(mariaConnector, 'Identite') == False:
        print(' Table doesn\'t exist, creating it...')
        mariadbCursor.execute("""
            CREATE TABLE Identite (
                PK_FP VARCHAR(50) NOT NULL,
                LieuNaissance TEXT,
                PaysNaissance TEXT,
                Name TEXT,
                LastName TEXT,
                Nationalite TEXT,
                PaysResidence TEXT,
                PersonneAPrevenir TEXT,
                Tel1_PAP TEXT,
                Tel2_PAP TEXT,
                CategorieMutuelle TEXT,
                Ep1_MUTUELLE BOOLEAN,
                TypeMutuelle BOOLEAN,
                NiveauMutuelle BOOLEAN,
                NiveauMutuelle2 BOOLEAN,
                NiveauMutuelle3 BOOLEAN,
                PartMutuelleSalarie FLOAT,
                PartMutuelleSalarieOption FLOAT,
                PRIMARY KEY (PK_FP)
            )
        """)
        mariadbCursor.execute("CREATE INDEX IdentitePK_FP ON Identite(PK_FP)")
    else:
        # flush the table
        print(' Table exists, flushing it...')
        mariadbCursor.execute("DELETE FROM Identite")
    # now fill it with fresh data
    print(' Retrieving the data from FMS...')
    fmsCursor = fmsConnector.cursor()
    fmsCursor.execute("""
        SELECT
            PK_FP,
            Lieu_Naiss_Txt,
            Pays_Naiss_Txt,
            Nationalite_Txt,
            Pays_Resid__Txt,
            Pers_URG,
            Tel1_URG,
            Tel2_URG,
            CAT_MUTUELLE,
            CASE WHEN Ep1_MUTUELLE = 'OUI' THEN 1 ELSE 0 END as Ep1_MUTUELLE,
            CASE WHEN TYPE_MUT = 'OUI' THEN 1 ELSE 0 END as TYPE_MUT,
            CASE WHEN Niv_Mutuelle IS NULL THEN 0 ELSE 1 END as Niv_Mutuelle,
            CASE WHEN NIV_MUTUELLE[2] IS NULL THEN 0 ELSE 1 END as Niv_Mutuelle2,
            CASE WHEN NIV_MUTUELLE[3] IS NULL THEN 0 ELSE 1 END as Niv_Mutuelle3,
            PART_MUT_SAL,
            PART_MUT_SAL_Option
        FROM B_EMPLOYE
        WHERE PK_FP IS NOT NULL
    """)
    print(' Transferring...')
    #for row in fmsCursor:
    insert = """INSERT INTO Identite (
        PK_FP,
        LieuNaissance,
        PaysNaissance,
        Nationalite,
        PaysResidence,
        PersonneAPrevenir,
        Tel1_PAP,
        Tel2_PAP,
        CategorieMutuelle,
        Ep1_MUTUELLE,
        TypeMutuelle,
        NiveauMutuelle,
        NiveauMutuelle2,
        NiveauMutuelle3,
        PartMutuelleSalarie,
        PartMutuelleSalarieOption
    ) VALUES (
        %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s
    )"""
    values = fmsCursor.fetchall()
    mariadbCursor.executemany(insert, values)
    mariaConnector.commit()
    print(' Inserted '+str(len(values))+' values')
    return len(values)
And the part where I retrieve first name and last name:
def updateEmployeeNames(mariaConnector, mssqlConnector):
    print("Updating employee names...")
    mariadbCursor = mariaConnector.cursor()
    mssqlCursor = mssqlConnector.cursor()
    mssqlCursor.execute("SELECT Name, LastName, PK_FP FROM F_Person")
    rows = mssqlCursor.fetchall()
    query = """
        UPDATE Identite
        SET Name = %s, LastName = %s
        WHERE PK_FP = %s
    """
    mariadbCursor.executemany(query, rows)
    mariaConnector.commit()
As you might have guessed, the first function takes almost no time to execute (less than 2 seconds), whereas the second one takes almost 20.
Python's not my strong suit, so there might be another way to do this; the aim is to make it much faster.
I already tried adding the values to each of createIdentite's tuples before the executemany, but the MySQL connector won't let me do that.
Thanks a lot for your help.
So the UPDATE to the existing MariaDB table is the bottleneck, in which case it might be faster to do the update on a pandas DataFrame and then push the result to the MariaDB table using the pandas to_sql method. A simplified example would be ...
import pandas as pd
import sqlalchemy

df_main = pd.read_sql_query(fms_query, fms_engine, index_col='PK_FP')
df_mssql = pd.read_sql_query(mssql_query, mssql_engine, index_col='PK_FP')
df_main.update(df_mssql)
df_main.to_sql('Identite', mariadb_engine, if_exists='replace',
               dtype={'PK_FP': sqlalchemy.types.String(50)})
... where fms_query and mssql_query are the queries from your question. fms_engine, mssql_engine, and mariadb_engine would be SQLAlchemy Engine objects.
In all MySQL Python drivers, executemany is rewritten, since bulk operations are not supported in MySQL. They are supported only via the binary protocol in MariaDB since 10.2; full support (including delete and update) was added later and is available in the latest 10.2, 10.3 and 10.4 versions of MariaDB Server.
The Python driver rewrites an INSERT query: it iterates over the rows and transforms the statement to
INSERT INTO t1 VALUES (row1_id, row1_data), (row2_id, row2_data),....(rown_id, rown_data)
This is quite fast, but the SQL syntax doesn't allow this for UPDATE or DELETE. In this case the driver needs to execute the statement n times (n = number of rows), passing the values for each row in a single statement.
The MariaDB binary protocol allows the statement to be prepared and then executed by sending all the data at once (the execute packet also contains the data).
If C is an alternative, take a look at the bulk unit tests in the GitHub repository of MariaDB Connector/C. Otherwise you have to wait; MariaDB will likely release its own Python driver next year.
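Given that a multi-row INSERT is fast while the per-row UPDATE is not, one possible workaround is to merge the names into the rows in Python and insert everything in one executemany, so that no UPDATE is needed. The sketch below is only an illustration under assumptions: merge_names is a made-up helper, the FMS rows are assumed to start with PK_FP (as in the SELECT from the question), and the name rows are assumed to be (Name, LastName, PK_FP) as in the F_Person query. Fetched rows cannot be modified in place, so new tuples are built instead.

```python
# Column order of the question's INSERT, with Name and LastName appended.
COLUMNS = [
    "PK_FP", "LieuNaissance", "PaysNaissance", "Nationalite", "PaysResidence",
    "PersonneAPrevenir", "Tel1_PAP", "Tel2_PAP", "CategorieMutuelle",
    "Ep1_MUTUELLE", "TypeMutuelle", "NiveauMutuelle", "NiveauMutuelle2",
    "NiveauMutuelle3", "PartMutuelleSalarie", "PartMutuelleSalarieOption",
    "Name", "LastName",
]

INSERT_SQL = "INSERT INTO Identite ({}) VALUES ({})".format(
    ", ".join(COLUMNS), ", ".join(["%s"] * len(COLUMNS))
)


def merge_names(identite_rows, name_rows):
    """Append (Name, LastName) to each row, matched on PK_FP.

    Rows without a match get (None, None), which is inserted as NULL.
    """
    names = {pk: (name, last_name) for name, last_name, pk in name_rows}
    return [
        tuple(row) + names.get(row[0], (None, None))
        for row in identite_rows
    ]


# Tiny made-up example with shortened rows, just to show the merge.
merged = merge_names(
    [("p1", "Paris"), ("p2", "Lyon")],
    [("Ann", "Lee", "p1")],
)
```

In createIdentite this would be used as mariadbCursor.executemany(INSERT_SQL, merge_names(fmsCursor.fetchall(), mssqlCursor.fetchall())), after running the F_Person query on mssqlCursor.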
Create the index as you create the temp table.
These combined statements work: CREATE TABLE ... SELECT ...; and INSERT INTO table ... SELECT .... However, they may be difficult to perform from Python.
It is unclear whether you need the temp table at all.
Learn how to use JOIN to get information simultaneously from two tables.