I'm trying to insert into a MySQL table from data in this Excel sheet: https://www.dropbox.com/s/w7m282386t08xk3/GA.xlsx?dl=0
The script should start from the second sheet "Daily Metrics" at row 16. The MySQL table already has the fields called date, campaign, users, and sessions.
Using Python 2.7, I've already created the MySQL connection and opened the sheet, but I'm not sure how to loop over those rows and insert into the database.
import MySQLdb as db
from openpyxl import load_workbook
wb = load_workbook('GA.xlsx')
sheetranges = wb['Daily Metrics']
print(sheetranges['A16'].value)
conn = db.connect('serverhost','username','password','database')
cursor = conn.cursor()
cursor.execute('insert into test_table ...')
conn.close()
Thank you for you help!
Try this and see if it does what you are looking for. You will need to update to the correct workbook name and location. Also, udate the range that you want to iterate over in for rw in wb["Daily Metrics"].iter_rows("A16:B20"):
from openpyxl import load_workbook
wb = load_workbook("c:/testing.xlsx")
for rw in wb["Daily Metrics"].iter_rows("A16:B20"):
for cl in rw:
print cl.value
Only basic knowledge of MySQL and Openpyxl is needed, you can solve it by reading tutorials on your own.
Before executing the script, you need to create database and table. Assuming you've done it.
import openpyxl
import MySQLdb
wb = openpyxl.load_workbook('/path/to/GA.xlsx')
ws = wb['Daily Metrics']
# map is a convenient way to construct a list. you can get a 2x2 tuple by slicing
# openpyxl.worksheet.worksheet.Worksheet instance and last row of worksheet
# from openpyxl.worksheet.worksheet.Worksheet.max_row
data = map(lambda x: {'date': x[0].value,
'campaign': x[1].value,
'users': x[2].value,
'sessions': x[3].value},
ws[16: ws.max_row])
# filter is another builtin function. Filter blank cells out if needed
data = filter(lambda x: None not in x.values(), data)
db = MySQLdb.connect('host', 'user', 'password', 'database')
cursor = db.cursor()
for row in data:
# execute raw MySQL syntax by using execute function
cursor.execute('insert into table (date, campaign, users, sessions)'
'values ("{date}", "{campaign}", {users}, {sessions});'
.format(**row)) # construct MySQL syntax through format function
db.commit()
Related
With the next cmds I am trying to upload a csv file where columns are separated by tabs and sometimes null values can be assigned to a column.
conn = psycopg2.connect(host="localhost",
port="5432",
user="postgres",
password="somepwd",
database="mydb",
options="-c search_path=dbo")
...
cur = conn.cursor()
with open(opath, "r") as opath_file:
next(opath_file) # skip the header row
cur.copy_from(opath_file, table_name[3:], null='', columns=cols.split(','))
cols has a string with the column names separated by ','
the table with name table_name[3:] belongs to the dbo schema
This code runs, no error is reported but no data is uploaded. The owner of the db is postgres.
Any ideas?
Would you believe me if the problem was I needed to run
conn.commit()
after the cur.copy_from cmd?
I am using python to establish db connection and reading csv file. For each line in csv i want to run a PostgreSQL query and get value corresponding to each line read.
DB connection and file reading is working fine. Also if i run query for hardcoded value then it works fine. But if i try to run query for each row in csv file using python variable then i am not getting correct value.
cursor.execute("select team from users.teamdetails where p_id = '123abc'")
Above query works fine.
but when i try it for multiple values fetched from csv file then i am not getting correct value.
cursor.execute("select team from users.teamdetails where p_id = queryPID")
Complete code for Reference:
import psycopg2
import csv
conn = psycopg2.connect(dbname='', user='', password='', host='', port='')
cursor = conn.cursor()
with open('playerid.csv','r') as csv_file:
csv_reader = csv.reader(csv_file)
for line in csv_reader:
queryPID = line[0]
cursor.execute("select team from users.teamdetails where p_id = queryPID")
team = cursor.fetchone()
print (team[0])
conn.close()
DO NOT concatenate the csv data. Use a parameterised query.
Use %s inside your string, then pass the additional variable:
cursor.execute('select team from users.teamdetails where p_id = %s', (queryPID,))
Concatenation of text leaves your application vulnerable to SQL injection.
https://www.psycopg.org/docs/usage.html
From Jupiter notebook, I was able to create Database with Psycopg2.
But somehow I was not able to create Table and store element in it.
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
con = psycopg2.connect("user=postgres password='abc'");
con.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT);
cursor = con.cursor();
name_Database = "socialmedia";
sqlCreateDatabase = "create database "+name_Database+";"
cursor.execute(sqlCreateDatabase);
With the above code, I can see database named "socialmedia" from psql (windows command prompt).
But with the below code, I can not see table named "test_table" from psql.
import psycopg2
# Open a DB session
dbSession = psycopg2.connect("dbname='socialmedia' user='postgres' password='abc'");
# Open a database cursor
dbCursor = dbSession.cursor();
# SQL statement to create a table
sqlCreateTable = "CREATE TABLE test_table(id bigint, cityname varchar(128), latitude numeric, longitude numeric);";
# Execute CREATE TABLE command
dbCursor.execute(sqlCreateTable);
# Insert statements
sqlInsertRow1 = "INSERT INTO test_table values(1, 'New York City', 40.73, -73.93)";
sqlInsertRow2 = "INSERT INTO test_table values(2, 'San Francisco', 37.733, -122.446)";
# Insert statement
dbCursor.execute(sqlInsertRow1);
dbCursor.execute(sqlInsertRow2);
# Select statement
sqlSelect = "select * from test_table";
dbCursor.execute(sqlSelect);
rows = dbCursor.fetchall();
# Print rows
for row in rows:
print(row);
I can get elements only from Jupyter notebook, but not from psql.
And it seems elements are stored temporarily.
How can I see table and elements from psql and keep the element permanently?
I don't see any dbCursor.execute('commit') in the second part of your question?
You have provided an example with AUTOCOMMIT which works, and you are asking why results are stored temporarly when you are not using AUTOCOMMIT?
Well, they are not commited!
They are stored only for the current session, that's why you can get it from your Jupyter session
Also:
you don't need to put semicolons in your python code
you don't need to put semicolons in your SQL code (except when you execute multiple statements, which is not the case here)
I am new to the Python-SQL connectivity world. My goal is to retrieve data from SQL in a pandas DataFrame format by executing long SQL queries thru my python script.
Most of my SQL queries are long with multiple interim-temp tables before the final SELECT statement from the last temp table. When I run such a monolithic query in Python I get an error saying -
"pandas.io.sql.DatabaseError: Execution failed on sql"
Though they run absolutely fine in MS SQL Management Studio
I suspect this is due to the interim-temp tables, because if I split my long query into two pieces (with everything before the final SELECT in 1st section and final SELECT in the 2nd section) the two section sequentially, run fine
Can someone guide me why is it so or alternatively what is the best way to run long queries with temp tables/views and retrieve results in a pandas DataFrame?
Here is my sample Python code that ideally should take a fine name as an input and run the SQL to retrieve results in a data frame, however it fails in case of a query with temp tables
import pyodbc as db
import pandas as pd
filename = 'file.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+'; PORT=1433; SERVER='+server+';
PORT=1443; DATABASE='+database+'; UID='+username+'; PWD='+ password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sqlcommand1 = sql
df_table = pd.read_sql(sqlcommand1, conn)
If I break my sql query in two pieces (one with all temp tables and 2nd with final Select), then it runs fine. Below is a modified function that splits the long Query after finding '/**/' and it works fine
"""
This Function Reads a SQL Script From an Extrenal File and Executes The
Script in SQL. If The SQL Script Has Bunch of Tem Tables/Views
Followed By a Select Statement to Retrieve Data From Those Views Then Input
SQL File Should Have '/**/' Immediately Before the Final
Select Statement. This is to Esnure Final Select Statement is Executed on
the Temporary Views Already Run by Python.
Input is a SQL File Name and Output is a DataFrame
"""
import pyodbc as db
import pandas as pd
filename = 'filename.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+'; PORT=1433; SERVER='+server+';
PORT=1443; DATABASE='+database+'; UID='+username+'; PWD='+ password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sql = sqlfile.split('/**/')
sqlcommand1 = sql[0] #1st Section of Query with temp tables
sqlcommand2 = sql[1] #2nd section of Query with final SELECT statement
conn.execute(sqlcommand1)
df_table = pd.read_sql(sqlcommand2, conn)
Quick and dirty answer: if using T-SQL put the line SET NOCOUNT ON at the beginning of your query.
Like #Parfait mentioned above the pandas read_sql method can only support one result set. However, when you generate a temp table in T-sql you do create a result set in the form "(XX row(s) affected)" which is what causes your original query to fail. By setting NOCOUNT you eliminate any early returns and only get the results from your final SELECT statement.
Alternatively, if using pyodbc cursor instead of pandas you can utilize nextset() to skip the result sets from the temp table(s). More info on pyodbc here.
I'm not experienced in Python.
I have the following Python code:
How can i import various values from files outside the file, and use them in a SQL request?
#!/usr/bin/env python
import MySQLdb
import Stamdata
from Stamdata import Varmekurve
K = Varmekurve
print K #this vorks, and the value 1.5 from Varmekurve is printed.
#Open database connection
db = MySQLdb.connect("localhost","root","Codename","MyDvoDb")
#prepare a cursor object using cursor method
cursor = db.cursor()
#Get SetTemp FROM SQL
sql = ("SELECT SetTemp FROM varmekurver WHERE kurvenummer = '1.5' AND TempSensor ='15'")
#Here i would like to import the value from Varmekurve instead of '1.5', and the data from a DS18b20 temp. sensor instead of '15'.
#The DS18B20 sensor are located in '/sys/bus/w1/devices/28-0316007914ff/w1_slave'
cursor.execute(sql)
results = cursor.fetchall()
for row in results:
print row[0]
db.close()
Only the Stamdata file are in the same library.
The Script shall control a motorvalve by calling the SetTemp and open/close a mix-valve if the temp. is to high or low (within 2-3 degrees)
But i haven't come that far yet :0)
to dynamically change the value in the string from a variable do:
SELECT SetTemp FROM varmekurver WHERE kurvenummer = '{}' AND TempSensor ='{}'.format(val1, val2)
If you want to import these values from an external source, like a flat file, you can do it in a number of ways. For example using Pandas.