Load CSV MySQLdb - Python

I have a CSV that I'm attempting to load into a MySQL database. I'm running the following code:
import MySQLdb

con = MySQLdb.connect(host="myhost",
                      user="me",
                      passwd="mypw",
                      db="mydb")
cur = con.cursor()
sqlscript = r"""
DROP TABLE IF EXISTS MyTable;
CREATE TABLE MyTable
(Col1 VARCHAR(255),
 Col2 VARCHAR(255),
 CONSTRAINT PK_MyTable PRIMARY KEY (Col1));
LOAD DATA LOCAL INFILE 'C:\\Users\\me\\Documents\\Rec\\New Files\\mycsv.csv'
INTO TABLE MyTable
CHARACTER SET UTF8
FIELDS TERMINATED BY ','
ESCAPED BY '!'
ENCLOSED BY '"'
OPTIONALLY ENCLOSED BY '\''
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;"""
cur.execute(sqlscript)
cur.close()
This runs without error, but does not load data from my CSV file. It correctly drops the table and creates it using the script. When I then query the table, it has zero rows. What am I missing?
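No accepted answer is shown for this question, but the related answers further down this page point at two usual suspects: the statement is never committed, and LOCAL INFILE is not enabled on the client connection. A minimal sketch of both fixes; the local_infile flag and the explicit con.commit() are assumptions carried over from those answers, not a confirmed solution to this exact question:

import MySQLdb

# Sketch only: (1) enable LOCAL INFILE on the client connection and
# (2) commit after the LOAD DATA statement.
con = MySQLdb.connect(host="myhost",
                      user="me",
                      passwd="mypw",
                      db="mydb",
                      local_infile=1)  # assumption: client must allow LOCAL INFILE
cur = con.cursor()
cur.execute(r"""LOAD DATA LOCAL INFILE 'C:\\Users\\me\\Documents\\Rec\\New Files\\mycsv.csv'
INTO TABLE MyTable
CHARACTER SET UTF8
FIELDS TERMINATED BY ','
ESCAPED BY '!'
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;""")
con.commit()  # without a commit the loaded rows may never become visible
cur.close()
con.close()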

Related

Uploading data with psycopg2 and python

With the following commands I am trying to upload a CSV file where columns are separated by tabs and where null values can sometimes be assigned to a column.
conn = psycopg2.connect(host="localhost",
                        port="5432",
                        user="postgres",
                        password="somepwd",
                        database="mydb",
                        options="-c search_path=dbo")
...
cur = conn.cursor()
with open(opath, "r") as opath_file:
    next(opath_file)  # skip the header row
    cur.copy_from(opath_file, table_name[3:], null='', columns=cols.split(','))
cols is a string with the column names separated by ','
the table named table_name[3:] belongs to the dbo schema
This code runs and no error is reported, but no data is uploaded. The owner of the db is postgres.
Any ideas?
Would you believe me if I said the problem was that I needed to run
conn.commit()
after the cur.copy_from call?
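For completeness, a minimal sketch of the fixed sequence, using the same variables as in the question:

cur = conn.cursor()
with open(opath, "r") as opath_file:
    next(opath_file)  # skip the header row
    cur.copy_from(opath_file, table_name[3:], null='', columns=cols.split(','))
conn.commit()  # copy_from stages the rows in the open transaction; the commit makes them permanent
cur.close()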

Python Script for sqlite3 can't create table

import sqlite3
conn = sqlite3.connect('serpin.db')
c = conn.cursor()
c.execute("""CREATE TABLE Gene(Gene_name TEXT, Organism TEXT, link_2_gene_with_ID TEXT, Number_SpliceForm INTEGER,ID_mRNA INTEGER, ID_Prt INTEGER);""")
c.execute(".import practice.csv Gene --csv")
c.execute(".mode column")
c.execute("select * from Gene;")
print(c.fetchall())
conn.commit()
conn.close
I can run all these commands individually in the Windows terminal in sqlite3. However, I get multiple errors running this code, which is roughly what I used in a bash script where I got no errors. The first error I receive says "table Gene already exists." Even if I comment out that line, I also get an error on the import command, where it says there is a syntax error with the period right before import. These are all sqlite3.OperationalError. I have tried running these commands on their own directly in sqlite3 and have no issues, so I'm not sure what the problem is.
I have no database in this folder, so I'm not sure how the table is already made.
Edit (solution): the output of this is not formatted correctly, but it runs without errors.
import csv, sqlite3

conn = sqlite3.connect('serpin.db')
c = conn.cursor()
try:
    c.execute("""CREATE TABLE Gene (Gene_name TEXT, Organism TEXT, link_2_gene_with_ID TEXT, Number_SpliceForm INTEGER, ID_mRNA INTEGER, ID_Prt INTEGER);""")
except:
    pass
path = r'C:\Users\User\Desktop\sqlite\practice.csv'
with open(path, 'r') as fin:  # `with` statement available in 2.5+
    # csv.DictReader uses the first line in the file for column headings by default
    dr = csv.DictReader(fin)  # comma is the default delimiter
    to_db = [(i['Gene_name'], i['Organism'], i['link_2_gene_with_ID'], i['Number_SpliceForm'], i['ID_mRNA'], i['ID_Prt']) for i in dr]

c.executemany("INSERT INTO Gene (Gene_name, Organism, link_2_gene_with_ID, Number_SpliceForm, ID_mRNA, ID_Prt) VALUES (?,?,?,?,?,?);", to_db)
c.execute("SELECT * FROM Gene;")
print(c.fetchall())
conn.commit()
conn.close()
About the fact that you may already have created the table, which is what gives you the "table Gene already exists" error:
try:
    c.execute("""CREATE TABLE Gene(Gene_name TEXT, Organism TEXT, link_2_gene_with_ID TEXT, Number_SpliceForm INTEGER, ID_mRNA INTEGER, ID_Prt INTEGER);""")
except:
    pass
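As a side note (my addition, not part of the quoted answer), SQLite also understands IF NOT EXISTS, which avoids the bare try/except:

c.execute("""CREATE TABLE IF NOT EXISTS Gene
             (Gene_name TEXT, Organism TEXT, link_2_gene_with_ID TEXT,
              Number_SpliceForm INTEGER, ID_mRNA INTEGER, ID_Prt INTEGER);""")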
To import the file, I am quoting here from another answer, by the user mechanical_meat, to Importing a CSV file into a sqlite3 database table using Python:
import csv, sqlite3

con = sqlite3.connect(":memory:")  # change to a file name such as 'your_filename.db' to keep the data
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);")  # use your column names here

with open('data.csv', 'r') as fin:  # `with` statement available in 2.5+
    # csv.DictReader uses the first line in the file for column headings by default
    dr = csv.DictReader(fin)  # comma is the default delimiter
    to_db = [(i['col1'], i['col2']) for i in dr]

cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()
About the .mode command: dot-commands such as .mode and .import belong to the sqlite3 command-line shell, not to SQL, so cursor.execute() cannot run them; that is why the period right before import triggers a syntax error. (SQL keywords such as select are case-insensitive in sqlite3, although SELECT in capitals is the usual convention.)
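If you specifically want the shell's .import behaviour from Python, a sketch (my addition, assuming the sqlite3 command-line tool is on PATH and is recent enough to support --csv, as the question's terminal session suggests) is to call the CLI itself:

import subprocess

# Run the same dot-command the question uses in the Windows terminal,
# but through the sqlite3 command-line program rather than the Python module.
subprocess.run(
    ["sqlite3", "serpin.db", ".import --csv practice.csv Gene"],
    check=True,
)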

How to load csv into an empty SQL table, using python?

So, I have this empty table which I created (see code below) and I need to load it with data from a CSV file, using a Python-SQL connection. As I do this, I need to replace the HTML codes, change to the correct datatypes (clean the file), and finally load it into this empty SQL table.
This is the code I wrote, but without any success: when I check the table in SQL it just returns an empty table:
Python code:
import csv

with open('UFOGB_Observations.csv', 'r') as UFO_Obsr:
    ## Write to the csv file, to clean it and change the html codes:
    with open('UFO_Observations.csv', 'w') as UFO_Obsw:
        for line in UFO_Obsr:
            line = line.replace('&#44', ',')
            line = line.replace('&#39', "'")
            line = line.replace('&#33', '!')
            line = line.replace('&amp;', '&')
            UFO_Obsw.write(line)

## To connect Python to SQL:
import pyodbc

print('Connecting...')
conn = pyodbc.connect('Trusted_Connection=yes', driver='{ODBC Driver 13 for SQL Server}',
                      server='.\SQLEXPRESS', database='QA_DATA_ANALYSIS')
print('Connected')
cursor = conn.cursor()
print('cursor established')
cursor.execute('''DROP TABLE IF EXISTS UFO_GB_1;
    CREATE TABLE UFO_GB_1 (Index_No VARCHAR(10) NOT NULL, date_time VARCHAR(15) NULL, city_or_state VARCHAR(50) NULL,
    country_code VARCHAR(50) NULL, shape VARCHAR(200) NULL, duration VARCHAR(50) NULL,
    date_posted VARCHAR(15) NULL, comments VARCHAR(700) NULL);
    ''')
print('Commands successfully completed')

# To insert that csv into the table:
cursor.execute('''BULK INSERT QA_DATA_ANALYSIS.dbo.UFO_GB_1
    FROM 'F:\GSS\QA_DATA_ANALYSIS_LEVEL_4\MODULE_2\Challenge_2\TASK_2\UFO_Observations.csv'
    WITH ( fieldterminator = '', rowterminator = '\n')''')
conn.commit()
conn.close()
I was expecting to see a table with all 1900+ rows when I type SELECT * FROM the table, with correct data types (i.e. the date_time and date_posted columns as timestamps).
(Apologies in advance; I'm new here so not allowed to comment.)
1) Why are you creating the table each time? Is this meant to be a temporary table?
2) What do you get as a response to your query?
3) What happens when you break the task down into parts? Does the code create the table? If the table already exists and you run just the insert-data code, does it work? When you import the csv and then write back to the same file, does that produce the result you are looking for, or crash? What if you wrote to a different file and imported that?
You are writing your queries the way you would in an SQL client, but they have to reach pyodbc as a Python string: build the statement in a string variable first and pass that to execute(), rather than wrapping the raw SQL directly in the execute() call.
This is not tested, but try something like this:
bulk_load_sql = """
BULK INSERT QA_DATA_ANALYSIS.dbo.UFO_GB_1
FROM 'F:\GSS\QA_DATA_ANALYSIS_LEVEL_4\MODULE_2\Challenge_2\TASK_2\UFO_Observations.csv'
WITH ( fieldterminator = '', rowterminator = '\n')
"""
cursor.execute(bulk_load_sql)
This uses a triple-quoted string to put the SQL on multiple lines, but you may want to use a regular string.
Here is an answer that goes over formatting your query for pyodbc:
https://stackoverflow.com/a/43855693/4788717
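If BULK INSERT keeps coming back empty (it runs on the SQL Server machine, so the file path has to be visible to the server, not just to the Python script), a fallback sketch, my addition rather than part of the answers above, is to insert the rows from Python itself; the connection string, table, and column names are the ones from the question, and it assumes the CSV has the same eight columns as UFO_GB_1:

import csv
import pyodbc

conn = pyodbc.connect('Trusted_Connection=yes', driver='{ODBC Driver 13 for SQL Server}',
                      server='.\\SQLEXPRESS', database='QA_DATA_ANALYSIS')
cursor = conn.cursor()

with open('UFO_Observations.csv', newline='') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    rows = [row for row in reader]

# executemany with ? placeholders, one per column of UFO_GB_1
cursor.executemany(
    "INSERT INTO UFO_GB_1 (Index_No, date_time, city_or_state, country_code, "
    "shape, duration, date_posted, comments) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    rows,
)
conn.commit()
conn.close()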

empty set after load data in file mysql

I'm using pymysql to load a large csv file into a database; because of memory limitations I'm using LOAD DATA INFILE rather than INSERT. However, after the code completes, when I query the server for the data in the table it returns an empty set.
import pymysql

conn = pymysql.connect(host='localhost', port=3306, user='root', passwd='', local_infile=True)
cur = conn.cursor()
cur.execute("CREATE SCHEMA IF NOT EXISTS `test` DEFAULT "
            "CHARACTER SET utf8 COLLATE utf8_unicode_ci ;")
cur.execute("CREATE TABLE IF NOT EXISTS "
            "`test`.`scores` ( `date` DATE NOT NULL, "
            "`name` VARCHAR(15) NOT NULL,"
            "`score` DECIMAL(10,3) NOT NULL);")
conn.commit()

def push(fileName='/home/pi/test.csv', tableName='`test`.`scores`'):
    push = """LOAD DATA LOCAL INFILE "%s" INTO TABLE %s
              FIELDS TERMINATED BY ','
              LINES TERMINATED BY '\r\n'
              IGNORE 1 LINES
              (date, name, score);""" % (fileName, tableName)
    cur.execute(push)
    conn.commit()

push()
I get some truncation warnings, but no other errors or warnings to work off of. Any ideas on how to fix this?
I did a few things to fix this. First, I changed the config files for my MySQL server to allow load infile, following MySQL: Enable LOAD DATA LOCAL INFILE. Then the problem was with the line
LINES TERMINATED BY '\r\n'
and the fix was to change it to
LINES TERMINATED BY '\n'
After that the script runs fine and is significantly faster than inserting row by row.
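For reference, a sketch (my addition, assuming the connected user is allowed to change global variables, as root normally is) of flipping the server-side switch from the same connection instead of editing the config file:

# Client side is already handled by local_infile=True in pymysql.connect(...).
# Server side: LOAD DATA LOCAL must also be allowed by the server.
cur.execute("SET GLOBAL local_infile = 1;")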

tab separator in sqlite3

How do you specify a tab separator in a table in sqlite3 for data import? That is:
import sqlite3
con = sqlite3.connect('mydatabase.db')
cur = con.cursor()
cur.execute("CREATE TABLE mytable(c1 INT, c2 REAL);")
at this point I'd need to specify the tab separator, just like you'd type .separator "\t" inside the sqlite3 shell after creating the table. Any thoughts?
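There is no .separator equivalent in the sqlite3 module; dot-commands live in the command-line shell, as noted in the serpin.db question above. A sketch, assuming a headerless tab-separated file named mydata.tsv with the two columns of mytable, that does the import with the csv module instead:

import csv
import sqlite3

con = sqlite3.connect('mydatabase.db')
cur = con.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS mytable(c1 INT, c2 REAL);")

# csv.reader with delimiter='\t' plays the role of `.separator "\t"`
with open('mydata.tsv', newline='') as fin:
    rows = [(int(c1), float(c2)) for c1, c2 in csv.reader(fin, delimiter='\t')]

cur.executemany("INSERT INTO mytable (c1, c2) VALUES (?, ?);", rows)
con.commit()
con.close()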
