With the next cmds I am trying to upload a csv file where columns are separated by tabs and sometimes null values can be assigned to a column.
conn = psycopg2.connect(host="localhost",
port="5432",
user="postgres",
password="somepwd",
database="mydb",
options="-c search_path=dbo")
...
cur = conn.cursor()
with open(opath, "r") as opath_file:
next(opath_file) # skip the header row
cur.copy_from(opath_file, table_name[3:], null='', columns=cols.split(','))
cols has a string with the column names separated by ','
the table with name table_name[3:] belongs to the dbo schema
This code runs, no error is reported but no data is uploaded. The owner of the db is postgres.
Any ideas?
Would you believe me if the problem was I needed to run
conn.commit()
after the cur.copy_from cmd?
Related
I am using python to establish db connection and reading csv file. For each line in csv i want to run a PostgreSQL query and get value corresponding to each line read.
DB connection and file reading is working fine. Also if i run query for hardcoded value then it works fine. But if i try to run query for each row in csv file using python variable then i am not getting correct value.
cursor.execute("select team from users.teamdetails where p_id = '123abc'")
Above query works fine.
but when i try it for multiple values fetched from csv file then i am not getting correct value.
cursor.execute("select team from users.teamdetails where p_id = queryPID")
Complete code for Reference:
import psycopg2
import csv
conn = psycopg2.connect(dbname='', user='', password='', host='', port='')
cursor = conn.cursor()
with open('playerid.csv','r') as csv_file:
csv_reader = csv.reader(csv_file)
for line in csv_reader:
queryPID = line[0]
cursor.execute("select team from users.teamdetails where p_id = queryPID")
team = cursor.fetchone()
print (team[0])
conn.close()
DO NOT concatenate the csv data. Use a parameterised query.
Use %s inside your string, then pass the additional variable:
cursor.execute('select team from users.teamdetails where p_id = %s', (queryPID,))
Concatenation of text leaves your application vulnerable to SQL injection.
https://www.psycopg.org/docs/usage.html
import sqlite3
conn = sqlite3.connect('serpin.db')
c = conn.cursor()
c.execute("""CREATE TABLE Gene(Gene_name TEXT, Organism TEXT, link_2_gene_with_ID TEXT, Number_SpliceForm INTEGER,ID_mRNA INTEGER, ID_Prt INTEGER);""")
c.execute(".import practice.csv Gene --csv")
c.execute(".mode column")
c.execute("select * from Gene;")
print(c.fetchall())
conn.commit()
conn.close
I can run all these commands individually on my own on the windows terminal in sqlite3. However I get multiple errors running this code, which is roughly what i used in a bash script where i got no errors. The first error I receive is an error that ssays "table Gene already exists." Now even if i comment out that line, i also get an error in the import command, where it says there is a syntax error with the period right before import. These are all sqlite3.OperationalError. I have tried running these commands on their own directly in sqlite3 and have no issues, so i'm not sure what the problem is.
I have no database in this folder, so I'm not sure how the table is already made.
edit(solution): the output of this is not formatted correctly, but this runs without errors.
import csv,sqlite3
conn = sqlite3.connect('serpin.db')
c = conn.cursor()
try:
c.execute("""CREATE TABLE Gene (Gene_name TEXT, Organism TEXT, link_2_gene_with_ID TEXT, Number_SpliceForm INTEGER,ID_mRNA INTEGER, ID_Prt INTEGER);""")
except:
pass
path = r'C:\Users\User\Desktop\sqlite\practice.csv'
with open(path,'r') as fin: # `with` statement available in 2.5+
# csv.DictReader uses first line in file for column headings by default
dr = csv.DictReader(fin) # comma is default delimiter
to_db = [(i['Gene_name'], i['Organism'],i['link_2_gene_with_ID'],i['Number_SpliceForm'],i['ID_mRNA'],i['ID_Prt'] ) for i in dr]
c.executemany("INSERT INTO Gene (Gene_name,Organism,link_2_gene_with_ID,Number_SpliceForm,ID_mRNA,ID_Prt) VALUES (?,?,?,?,?,?);", to_db)
c.execute("select * from Gene;")
print(c.fetchall())
conn.commit()
conn.close
About the fact that you may already have created the table and that gives you an error:
try:
c.execute("""CREATE TABLE Gene(Gene_name TEXT, Organism TEXT, link_2_gene_with_ID TEXT, Number_SpliceForm INTEGER,ID_mRNA INTEGER, ID_Prt INTEGER);""")
except:
pass
To import the file, I report here from another answer from the user mechanical_meat
Importing a CSV file into a sqlite3 database table using Python:
import csv, sqlite3
con = sqlite3.connect(":memory:") # change to 'sqlite:///your_filename.db'
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);") # use your column names here
with open('data.csv','r') as fin: # `with` statement available in 2.5+
# csv.DictReader uses first line in file for column headings by default
dr = csv.DictReader(fin) # comma is default delimiter
to_db = [(i['col1'], i['col2']) for i in dr]
cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()
Don't know about the .mode command, but as far as I know, operation in SQLite3 in python are all in capital letters, thus also select should be SELECT
print ('Files in Drive:')
!ls drive/AI
Files in Drive:
database.sqlite
Reviews.csv
Untitled0.ipynb
fine_food_reviews.ipynb
Titanic.csv
When I run the above code in Google Colab, clearly my sqlite file is present in my drive. But whenever I run some query on this file, it says
# using the SQLite Table to read data.
con = sqlite3.connect('database.sqlite')
#filtering only positive and negative reviews i.e.
# not taking into consideration those reviews with Score=3
filtered_data = pd.read_sql_query("SELECT * FROM Reviews WHERE Score !=3",con)
DatabaseError: Execution failed on sql 'SELECT * FROM Reviews WHERE
Score != 3 ': no such table: Reviews
Below you will find code that addresses the db setup on the Colab VM, table creation, data insertion and data querying. Execute all code snippets in individual notebook cells.
Note however that this example only shows how to execute the code on a non-persistent Colab VM. If you want to save your database to GDrive you will have to mount your Gdrive first (source):
from google.colab import drive
drive.mount('/content/gdrive')
and navigate to the appropriate file directory after.
Step 1: Create DB
import sqlite3
conn = sqlite3.connect('SQLite_Python.db') # You can create a new database by changing the name within the quotes
c = conn.cursor() # The database will be saved in the location where your 'py' file is saved
# Create table - CLIENTS
c.execute('''CREATE TABLE SqliteDb_developers
([id] INTEGER PRIMARY KEY, [name] text, [email] text, [joining_date] date, [salary] integer)''')
conn.commit()
Test whether the DB was created successfully:
!ls
Output:
sample_data SQLite_Python.db
Step 2: Insert Data Into DB
import sqlite3
try:
sqliteConnection = sqlite3.connect('SQLite_Python.db')
cursor = sqliteConnection.cursor()
print("Successfully Connected to SQLite")
sqlite_insert_query = """INSERT INTO SqliteDb_developers
(id, name, email, joining_date, salary)
VALUES (1,'Python','MakesYou#Fly.com','2020-01-01',1000)"""
count = cursor.execute(sqlite_insert_query)
sqliteConnection.commit()
print("Record inserted successfully into SqliteDb_developers table ", cursor.rowcount)
cursor.close()
except sqlite3.Error as error:
print("Failed to insert data into sqlite table", error)
finally:
if (sqliteConnection):
sqliteConnection.close()
print("The SQLite connection is closed")
Output:
Successfully Connected to SQLite
Record inserted successfully into SqliteDb_developers table 1
The SQLite connection is closed
Step 3: Query DB
import sqlite3
conn = sqlite3.connect("SQLite_Python.db")
cur = conn.cursor()
cur.execute("SELECT * FROM SqliteDb_developers")
rows = cur.fetchall()
for row in rows:
print(row)
conn.close()
Output:
(1, 'Python', 'MakesYou#Fly.com', '2020-01-01', 1000)
Try this instead. See what tables are there.
"SELECT name FROM sqlite_master WHERE type='table'"
give similar sharable id to your database file just like you did with Reviews.csv
database_file=drive.CreateFile({'id':'your_sharable_id for sqlite file'})
database_file.GetContentFile('database.sqlite')
If you are trying to access the files from your google drive, you need to mount the drive first:
from google.colab import drive
drive.mount('/content/drive')
After you do this, right click on the file that you intend to read in colab session and select 'Copy Path'and paste it in the connection string.
con = sqlite3.connect('/content/database.sqlite')
You can now read the file.
con = sqlite3.connect('database.sqlite')
filtered_data = pd.read_sql_query("SELECT * FROM Reviews WHERE Score !=3",con)
If you are executing it twice you will definitely end with this type of error.Execute it exactly once without any fail.
If you get any error then remove
database.sqlite
this file and extract it again.This time execute it again without any fail/error .This worked for me .
I'm trying to insert into a MySQL table from data in this Excel sheet: https://www.dropbox.com/s/w7m282386t08xk3/GA.xlsx?dl=0
The script should start from the second sheet "Daily Metrics" at row 16. The MySQL table already has the fields called date, campaign, users, and sessions.
Using Python 2.7, I've already created the MySQL connection and opened the sheet, but I'm not sure how to loop over those rows and insert into the database.
import MySQLdb as db
from openpyxl import load_workbook
wb = load_workbook('GA.xlsx')
sheetranges = wb['Daily Metrics']
print(sheetranges['A16'].value)
conn = db.connect('serverhost','username','password','database')
cursor = conn.cursor()
cursor.execute('insert into test_table ...')
conn.close()
Thank you for you help!
Try this and see if it does what you are looking for. You will need to update to the correct workbook name and location. Also, udate the range that you want to iterate over in for rw in wb["Daily Metrics"].iter_rows("A16:B20"):
from openpyxl import load_workbook
wb = load_workbook("c:/testing.xlsx")
for rw in wb["Daily Metrics"].iter_rows("A16:B20"):
for cl in rw:
print cl.value
Only basic knowledge of MySQL and Openpyxl is needed, you can solve it by reading tutorials on your own.
Before executing the script, you need to create database and table. Assuming you've done it.
import openpyxl
import MySQLdb
wb = openpyxl.load_workbook('/path/to/GA.xlsx')
ws = wb['Daily Metrics']
# map is a convenient way to construct a list. you can get a 2x2 tuple by slicing
# openpyxl.worksheet.worksheet.Worksheet instance and last row of worksheet
# from openpyxl.worksheet.worksheet.Worksheet.max_row
data = map(lambda x: {'date': x[0].value,
'campaign': x[1].value,
'users': x[2].value,
'sessions': x[3].value},
ws[16: ws.max_row])
# filter is another builtin function. Filter blank cells out if needed
data = filter(lambda x: None not in x.values(), data)
db = MySQLdb.connect('host', 'user', 'password', 'database')
cursor = db.cursor()
for row in data:
# execute raw MySQL syntax by using execute function
cursor.execute('insert into table (date, campaign, users, sessions)'
'values ("{date}", "{campaign}", {users}, {sessions});'
.format(**row)) # construct MySQL syntax through format function
db.commit()
I am working on pushing data from DBF files from a UNC to a sql server DB. There are about 50 DBF files, all of which with different schemas. Now I know I can create a program and list all 50 Tables and all 50 DBF files but this is going to take forever. Is there a way to derive the DBF field names somehow to do the insert rather then going through every DBF and typing out every field name in the DBF? Here's the code I have right now that inserts records from two fields in one DBF file.
import pyodbc
from dbfread import DBF
# SQL Server Connection Test
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=**********;DATABASE=TEST_DBFIMPORT;UID=test;PWD=test')
cursor = cnxn.cursor()
dir = 'E\\Backups\\'
table = DBF('E:\\Backups\\test.dbf', lowernames=True)
for record in table.records:
rec1 = record['field1']
rec2 = record['field2']
cursor.execute ("insert into tblTest (column1,column2) values(?,?)", rec1, rec2)
cnxn.commit()
Some helpful hints using my dbf package:
import dbf
import os
for filename in os.listdir('e:/backups'):
with dbf.Table('e:/backups/'+filename) as table:
fields = dbf.field_names(table)
for record in table:
values = list(record)
# insert fields, values using odbc
If you want to transfer all fields, then you'll need to calculate the table name, the field names, and the values; some examples:
sql_table = os.path.splitext(filename)[0]
fields = ','.join(fields)
place_holders = ','.join(['?'] * len(fields))
values = tuple(record)
sql = "insert into %s (%s) values(%s)" % (sql_table, fields, place_holders)
curser.execute(sql, *values)