Beginner's question here. I want to populate a table with many rows of data straight from a query I'm running in the same session, ideally using executemany(). Currently I insert each row as a separate tuple, as shown in the script below.
Select query to get the needed data:
This query returns 4 columns: Parking_ID, Snapshot_Date, Snapshot_Time, Parking_Stat.
park_set_stat_query = "SET @row_number = 0;"
park_set_stat_query2 = "SET @row_number2 = 0;"
# one-time load to catch only the changes made in the input table
park_change_stat_query = """SELECT in1.Parking_ID,
       in1.Snapshot_Date AS Snapshot_Date,
       in1.Snapshot_Time AS Snapshot_Time,
       in1.Parking_Stat
FROM (SELECT Parking_ID,
             Snapshot_Date,
             Snapshot_Time,
             Parking_Stat,
             (@row_number := @row_number + 1) AS num1
      FROM Fact_Parking_Stat_Input
      WHERE Parking_Stat <> 0) AS in1
LEFT JOIN (SELECT Parking_ID,
                  Snapshot_Date,
                  Snapshot_Time,
                  Parking_Stat,
                  (@row_number2 := @row_number2 + 1) + 1 AS num2
           FROM Fact_Parking_Stat_Input
           WHERE Parking_Stat <> 0) AS in2
    ON in1.Parking_ID = in2.Parking_ID AND in1.num1 = in2.num2
WHERE (CASE WHEN in1.Parking_Stat <> in2.Parking_Stat THEN 1 ELSE 0 END = 1)
   OR num1 = 1"""
Here is the insert part of the script. As you can see below, I insert each row into the destination table Fact_Parking_Stat_Input_Alter one at a time:
mycursor = connection.cursor()
mycursor2 = connection.cursor()
mycursor.execute(park_set_stat_query)
mycursor.execute(park_set_stat_query2)
mycursor.execute(park_change_stat_query)
# keep only changes in a staging table named Fact_Parking_Stat_Input_Alter
qSQLresults = mycursor.fetchall()
for row in qSQLresults:
    Parking_ID = row[0]
    Snapshot_Date = row[1]
    Snapshot_Time = row[2]
    Parking_Stat = row[3]
    # SQL query to INSERT a record into the table Fact_Parking_Stat_Input_Alter
    mycursor2.execute('''INSERT INTO Fact_Parking_Stat_Input_Alter (Parking_ID, Snapshot_Date, Snapshot_Time, Parking_Stat)
                         VALUES (%s, %s, %s, %s)''',
                      (Parking_ID, Snapshot_Date, Snapshot_Time, Parking_Stat))
# commit your changes in the database
connection.commit()
mycursor.close()
mycursor2.close()
connection.close()
How can I improve the code so it inserts all the data in one insert command?
Thanks
Amir
MySQL has an INSERT INTO ... SELECT command that is probably far more efficient than querying the data in Python, pulling it down and re-inserting it:
https://www.mysqltutorial.org/mysql-insert-into-select/
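Applied to the tables in the question, a sketch could look like this; the whole copy happens server-side in one statement, so nothing is fetched into Python:
# hedged sketch: prepend the INSERT clause to the existing SELECT so the
# copy runs entirely inside MySQL, with no round trip through Python
insert_stmt = ("INSERT INTO Fact_Parking_Stat_Input_Alter "
               "(Parking_ID, Snapshot_Date, Snapshot_Time, Parking_Stat)\n"
               + park_change_stat_query)
mycursor.execute(park_set_stat_query)
mycursor.execute(park_set_stat_query2)
mycursor.execute(insert_stmt)
connection.commit()
If you do need the rows in Python anyway, calling mycursor2.executemany() with the question's parameterized INSERT and qSQLresults as the argument list would at least batch the per-row calls.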
Related
def LiraRateApiCall():
    R = requests.get(url)
    timestamp = R.json()['buy'][-1][0] / 1000
    format_date = '%d/%m/%y'
    date = datetime.fromtimestamp(timestamp)
    buyRate = R.json()['buy'][-1][1]
    print(date.strftime(format_date))
    print(buyRate)
    # ADDING TO SQL SERVER
    conn = odbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                        'Server=LAPTOP-36NUUO53\\SQLEXPRESS;'
                        'Database=test;'
                        'Trusted_connection=yes;')
    cursor = conn.cursor()
    cursor.execute('''
        INSERT INTO Data_table (Time1, Price)
        VALUES
            ('date', 140),
            ('Date2', 142)
    ''')
    conn.commit()
    cursor.execute('SELECT * FROM Data_table')
    for i in cursor:
        print(i)
How do I pass the variables date and buyRate to the table instead of hard-coding values like I did ('date' and 140, for example)? I want to pass variables, not specific values.
You'll need to check the driver version you're using, but what you're looking for is the concept of bind variables. I'd suggest looking into fast_executemany as well; that should help speed things up. I've edited your code to show how bind variables typically work (using the (?, ?) SQL syntax), though other placeholder formats exist.
def LiraRateApiCall():
    R = requests.get(url)
    timestamp = R.json()['buy'][-1][0] / 1000
    format_date = '%d/%m/%y'
    date = datetime.fromtimestamp(timestamp)
    buyRate = R.json()['buy'][-1][1]
    print(date.strftime(format_date))
    print(buyRate)
    # ADDING TO SQL SERVER
    conn = odbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                        'Server=LAPTOP-36NUUO53\\SQLEXPRESS;'
                        'Database=test;'
                        'Trusted_connection=yes;')
    cursor = conn.cursor()
    # set up the data as a list of tuples, one tuple per row
    data = [('date', 140), ('Date2', 142)]
    # use executemany since we have a list
    cursor.executemany('''
        INSERT INTO Data_table (Time1, Price)
        VALUES (?, ?)
    ''', data)
    conn.commit()
    cursor.execute('SELECT * FROM Data_table')
    for i in cursor:
        print(i)
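To pass the actual variables from the question instead of the literal sample rows, build the data list from them. pyodbc (assuming that's the odbc module here) also exposes fast_executemany as a cursor attribute:
    cursor = conn.cursor()
    cursor.fast_executemany = True  # pyodbc sends the parameter sets in one batch
    # one row built from the question's variables instead of hard-coded literals
    data = [(date.strftime(format_date), buyRate)]
    cursor.executemany('INSERT INTO Data_table (Time1, Price) VALUES (?, ?)', data)
    conn.commit()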
I don't fully understand your question.
If you want to pass the variables:
# note: date is a datetime, so convert it to a string and quote it in the SQL;
# building SQL by concatenation also risks SQL injection, so prefer the parameterized form above
insert_sql = ("INSERT INTO Data_table (Time1, Price) VALUES ('"
              + date.strftime(format_date) + "', " + str(buyRate) + ")")
cursor.execute(insert_sql)
If you want to do a dynamic insert:
You can only insert by knowing the values, or by inserting with a SELECT:
INSERT INTO table
SELECT * FROM tableAux
WHERE condition;
Alternatively, you could iterate through the fields of a table, extract them and compare them to your variables to do a dynamic insert.
With this SELECT you can extract the columns:
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = N'table1'
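From Python with the same pyodbc-style connection, a sketch of that lookup (just the column names; the table name is bound as a value, since identifiers themselves can't be parameterized):
# hedged sketch: fetch the column names of a table from SQL Server
cursor.execute("SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = ?",
               ('Data_table',))
columns = [row[0] for row in cursor.fetchall()]
print(columns)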
What's the best / fastest solution for the following task:
Used technology: MySQL database + Python
I'm downloading a data.sql file. Its format:
INSERT INTO `temp_table` VALUES (group_id,city_id,zip_code,post_code,earnings,'group_name',votes,'city_name',person_id,'person_name',networth);
INSERT INTO `temp_table` VALUES (group_id,city_id,zip_code,post_code,earnings,'group_name',votes,'city_name',person_id,'person_name',networth);
...
Values in each row differ.
Tables structures: http://sqlfiddle.com/#!9/8f10d6
A person can have multiple cities.
A person can be in only one group, or in no group at all.
A group can have multiple persons.
And I know which country the .sql data come from.
I need to split these data into 3 tables, updating rows that already exist and inserting new rows where they don't.
So I came up with 2 solutions:
Split the values from the file via Python and then, for each line, perform 3x SELECT + 3x UPDATE/INSERT inside a transaction.
Somehow bulk insert the data into a temporary table and then manipulate the data inside the database: for each row in the temporary table, perform 3 SELECT queries (one against each actual table) and, if the row is found, send an UPDATE; if not, run an INSERT.
I will be running this function multiple times per day with over 10K lines in the .sql file and it will be updating / creating over 30K rows in the database.
//EDIT
My inserting / updating code now:
autocommit = "SET autocommit=0"
with connection.cursor() as cursor:
    cursor.execute(autocommit)
# read the downloaded data.sql file and run its INSERT statements into temp_table
with open('data.sql') as f:
    data = f.read()
lines = data.splitlines()
for line in lines:
    with connection.cursor() as cursor:
        cursor.execute(line)
temp_data = "SELECT * FROM temp_table"
with connection.cursor() as cursor:
    cursor.execute(temp_data)
    temp_data = cursor.fetchall()
for temp_row in temp_data:
    group_id = temp_row[0]
    city_id = temp_row[1]
    zip_code = temp_row[2]
    post_code = temp_row[3]
    earnings = temp_row[4]
    group_name = temp_row[5]
    votes = temp_row[6]
    city_name = temp_row[7]
    person_id = temp_row[8]
    person_name = temp_row[9]
    networth = temp_row[10]
    group_select = "SELECT * FROM perm_group WHERE group_id = %s AND countryid_fk = %s"
    group_values = (group_id, countryid)
    with connection.cursor() as cursor:
        row = cursor.execute(group_select, group_values)  # returns the number of matched rows
    if row == 0 and group_id != 0:  # if the person doesn't have a group, do not create one
        group_insert = "INSERT INTO perm_group (group_id, group_name, countryid_fk) VALUES (%s, %s, %s)"
        group_insert_values = (group_id, group_name, countryid)
        with connection.cursor() as cursor:
            cursor.execute(group_insert, group_insert_values)
            groupid = cursor.lastrowid
    elif row == 1 and group_id != 0:
        group_update = "UPDATE perm_group SET group_name = %s WHERE group_id = %s AND countryid_fk = %s"
        group_update_values = (group_name, group_id, countryid)
        with connection.cursor() as cursor:
            cursor.execute(group_update, group_update_values)
        # select the group id for the current row to assign the correct group to the person
        group_certain_select = "SELECT id FROM perm_group WHERE group_id = %s AND countryid_fk = %s"
        group_certain_select_values = (group_id, countryid)
        with connection.cursor() as cursor:
            cursor.execute(group_certain_select, group_certain_select_values)
            groupid = cursor.fetchone()
    # .
    # .
    # .
    # repeating the same piece of code for person and city
Measured time: 206 seconds, which is not acceptable. So I switched to INSERT ... ON DUPLICATE KEY UPDATE:
group_insert = ("INSERT INTO perm_group (group_id, group_name, countryid_fk) VALUES (%s, %s, %s) "
                "ON DUPLICATE KEY UPDATE group_id = %s, group_name = %s")
group_insert_values = (group_id, group_name, countryid, group_id, group_name)
with connection.cursor() as cursor:
    cursor.execute(group_insert, group_insert_values)
# select the group id for the current row to assign the correct group to the person
group_certain_select = "SELECT id FROM perm_group WHERE group_id = %s AND countryid_fk = %s"
group_certain_select_values = (group_id, countryid)
with connection.cursor() as cursor:
    cursor.execute(group_certain_select, group_certain_select_values)
    groupid = cursor.fetchone()
Measured time: from 30 to 50 seconds. (Still quite long, but it's getting better)
Are there any other better (faster) options on how to do it?
Thanks in advance, popcorn
I would recommend that you load the data into a staging table and do the processing in SQL.
Basically, your ultimate result is a set of SQL tables, so SQL is necessarily going to be part of the solution. You might as well put as much logic into the database as you can, to reduce the number of tools needed.
Loading 10,000 rows should not take much time. However, if you have a choice of data formats, I would recommend a CSV file over INSERT statements; the inserts incur extra overhead, if only because they are larger.
Once the data is in the database, I would not worry much about the processing time for storing the data in three tables.
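A rough sketch of that pipeline, assuming a MySQL connection with local_infile enabled; the file name data.csv and the column lists are illustrative, based on the tables from the question:
with connection.cursor() as cursor:
    # bulk-load the staging table from a CSV dump instead of 10K INSERT statements
    cursor.execute("""LOAD DATA LOCAL INFILE 'data.csv'
                      INTO TABLE temp_table
                      FIELDS TERMINATED BY ','
                      LINES TERMINATED BY '\\n'""")
    # then fold the staging rows into the real table in one set-based statement
    cursor.execute("""INSERT INTO perm_group (group_id, group_name, countryid_fk)
                      SELECT DISTINCT group_id, group_name, %s
                      FROM temp_table
                      WHERE group_id <> 0
                      ON DUPLICATE KEY UPDATE group_name = VALUES(group_name)""",
                   (countryid,))
connection.commit()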
I am using Python 3, Postgres 10 and psycopg2 to query multiple records like so:
import psycopg2

conn = psycopg2.connect(<my connection string>)
with conn:
    with conn.cursor() as cur:
        cur.execute('select id, field1 from table1')
        for id, field1 in cur.fetchall():
            print(id, field1)
            # todo: how to update field1 to be f(field1) where f is an arbitrary python function
My question is: how do I update the rows that I am reading, setting the value of field1 to some arbitrary Python-based calculation?
Edit: the purpose is to update the rows in the table.
You need another cursor, e.g.:
with conn:
    with conn.cursor() as cur:
        cur.execute('select id, field1 from table1')
        for id, field1 in cur.fetchall():
            print(id, field1)
            with conn.cursor() as cur_update:
                cur_update.execute('update table1 set field1 = %s where id = %s', (f(field1), id))
Note however that this involves as many updates as selected rows, which is obviously not efficient. The update can be done in a single query using psycopg2.extras.execute_values():
from psycopg2.extras import execute_values

with conn:
    with conn.cursor() as cur:
        cur.execute('select id, field1 from table1')
        rows = cur.fetchall()
        for id, field1 in rows:
            print(id, field1)

    # convert rows to new values of field1
    values = [(id, f(field1)) for id, field1 in rows]
    sql = '''
        with upd (id, field1) as (values %s)
        update table1 t
        set field1 = upd.field1
        from upd
        where upd.id = t.id
        '''
    with conn.cursor() as cur:
        execute_values(cur, sql, values)
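One design note: execute_values() sends the VALUES list in pages (its page_size argument defaults to 100 in psycopg2), so even a very large update goes over as a handful of statements rather than one per row.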
I want to export specific columns from one database to another using Python, but it's not working:
# display all non-duplicate data
import sqlite3

conn = sqlite3.connect('data.db')
# STEP 2: create a small data file with only three fields: field1, field12 and field14
cursor = conn.execute("SELECT field1, field12, field14 FROM database")
for row in cursor:
    print(row)
print("Operation done successfully")
conn.close()
Create a second connection and insert directly:
conn = sqlite3.connect('data.db')
cursor = conn.execute("SELECT field1, field12, field14 FROM database")
export = sqlite3.connect('exported.db')
# insert each result row; sqlite3 uses ? placeholders, not %s string formatting
for values in cursor.fetchall():
    export.execute('INSERT INTO tablename (field1, field12, field14) VALUES (?, ?, ?)', values)
export.commit()
export.close()
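Since fetchall() already gives a list of tuples, the loop could also collapse into a single executemany() call (same hypothetical table name as above):
rows = cursor.fetchall()
export.executemany('INSERT INTO tablename (field1, field12, field14) VALUES (?, ?, ?)', rows)
export.commit()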
I have made a function which counts the number of rows in a table using cursor.rowcount in Python. Now I want to apply it to tables I choose through %s. The problem is that I don't know how to apply it. Here is the sample I am working with:
def data_input(table):
    cursor.execute("USE database")
    cursor.execute("TRUNCATE TABLE table1")
    cursor.execute("TRUNCATE TABLE table2")
    cursor.execute("LOAD DATA LOCAL INFILE 'table1data' INTO TABLE table1 FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' (field1, field2, field3, field4, field5)")
    cursor.execute("LOAD DATA LOCAL INFILE 'table2data' INTO TABLE table2 FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' (field1, field2, field3)")
    cursor.execute("SELECT * FROM %s", table)
    print(cursor.rowcount)

data_input("table1")
Basically, it already loads all the data into the MySQL tables from text files; now I want the function to also print the number of rows for a particular table. I am getting an error about wrong MySQL syntax, so this code is wrong in the rowcount part.
query = "SELECT COUNT(*) FROM `%s`" % table  # table names can't be bound as parameters, so format the identifier in
cursor.execute(query)  # execute the query separately
res = cursor.fetchone()
total_rows = res[0]  # total rows
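For what it's worth, the reason cursor.execute("SELECT * FROM %s", table) fails is that bound parameters can only stand in for values, never identifiers: the driver quotes the bound string, producing SELECT * FROM 'table1', which is invalid MySQL syntax. Formatting the backticked table name into the query string, as above, works, but only do it with table names you control (for example, checked against a whitelist), since it bypasses parameter escaping.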