copy csv file into sqlite database table using python

There is a post that tells me how to write data from a CSV file into an SQLite database (link). Is there a way to simply copy the whole file into a database table, instead of loading the data and then iterating through the rows, appending them to the table as suggested in the link?
To do this in the sqlite3 shell, it says here to simply type:
sqlite> create table test (id integer, datatype_id integer, level integer, meaning text);
sqlite> .separator ","
sqlite> .import no_yes.csv test
I am new to working with databases and sqlite3, but I thought doing something like this could work
import sqlite3

conn = sqlite3.connect('mydatabase.db')
c = conn.cursor()

def create_table(name):
    c.execute("CREATE TABLE IF NOT EXISTS {} (time REAL, event INTEGER, id INTEGER, size INTEGER, direction INTEGER)".format(name))

def copy_data(file2copy, table_name):
    c.executescript("""
    .separator ","
    .import {} {}""".format(file2copy, table_name))
    conn.commit()

try:
    create_table('abc')
    copy_data(r'Y:\MYPATH\somedata.csv', 'abc')
except Exception as e:
    print e

c.close()
conn.close()
But apparently it doesn't. I get the error
near ".": syntax error
EDIT: Thanks to the suggestion below to use the subprocess module, I came up with the following solution.
import subprocess

sqlshell = r'c:\sqlite-tools\sqlite3'  # sqlite3.exe is a shell that can be obtained from the sqlite website
process = subprocess.Popen([sqlshell], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
cmd = """.open {db}
.separator ","
.import {csv} {tb}""".format(db='C:/Path/to/database.db', csv='C:/Path/to/file.csv', tb='table_name')
stdout, stderr = process.communicate(input=cmd)
if len(stderr):
    print 'something went wrong'
Importantly, you should use '/' instead of '\' in your directory names. Also, you have to be careful with blank spaces in your directory names.

The commands you're using in copy_data that start with . are not part of SQLite itself, but of the interactive shell that ships with it. You can't use them through the sqlite3 module.
You either need to write the insertion step manually or use the subprocess module to run the shell program.
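If you go the manual route, the csv module plus executemany keeps the row iteration out of your own code. A minimal sketch against the table created above (assuming the CSV columns match the five table columns):

import csv
import sqlite3

conn = sqlite3.connect('mydatabase.db')
c = conn.cursor()
with open(r'Y:\MYPATH\somedata.csv') as f:
    reader = csv.reader(f)  # pass delimiter=... if your file isn't comma-separated
    # five placeholders, one per column of the table created above
    c.executemany("INSERT INTO abc VALUES (?, ?, ?, ?, ?)", reader)
conn.commit()
conn.close()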

Related

Python SQLite - fuse_hidden not deleted

I am trying to setup a python script to get some data and store it into a SQLite database. However when I am running the script a .fuse_hidden file is created.
On Windows no .fuse_hidden file is observed, but on Ubuntu one is generated on each call. The .fuse_hidden file seems to contain some form of SQL query with inputs and tables.
I can delete the files without error during runtime but they are not deleted automatically. I make sure to end my connection to the db when I am finished with the query.
lsof gives no information.
I am out of ideas on what to try next to get the files removed automatically. Any suggestions?
Testing
To confirm that the code itself is not at fault, I made a simple script (assume there is an empty error.db):
import sqlite3

conn = sqlite3.connect("error.db")
cur = conn.cursor()

create_query = """
CREATE TABLE Errors (
    name TEXT
);"""

try:
    cur.execute(create_query)
except:
    pass

cur.execute("INSERT INTO Errors (name) VALUES(?)", ["Test2"])
conn.commit()
cur.close()
conn.close()

LOAD DATA LOCAL INFILE with incremental field

I have multiple unstructured txt files in a directory and I want to insert all of them into MySQL; basically, the entire content of each text file should be placed into a row. In MySQL, I have 2 columns: ID (auto increment) and LastName (nvarchar(45)). I used Python to connect to MySQL and LOAD DATA LOCAL INFILE to insert the whole content. But when I run the code, messages appear in the Python console, and when I check MySQL I see nothing but a bunch of empty rows with IDs being automatically generated.
Here is the code:
import MySQLdb
import sys
import os
result = os.listdir("C:\\Users\\msalimi\\Google Drive\\s\\Discharge_Summary")
for x in result:
    db = MySQLdb.connect("localhost", "root", "Pass", "myblog")
    cursor = db.cursor()
    file1 = os.path.join(r'C:\\Discharge_Summary\\' + x)
    cursor.execute("LOAD DATA LOCAL INFILE '%s' INTO TABLE clamp_test" % (file1,))
    db.commit()
    db.close()
Can someone please tell me what is wrong with the code? What is the right way to achieve my goal?
I edited my code with:
cursor.execute("LOAD DATA LOCAL INFILE '%s' INTO TABLE clamp_test LINES TERMINATED BY '\r' (Lastname) SET id = NULL" % (file1,))
and it worked :)
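Put together, the working version looks roughly like this (a sketch: it reuses the path and credentials from the question, connects once instead of per file, and converts backslashes to forward slashes so the SQL string literal stays clean):

import os
import MySQLdb

src_dir = r"C:\Users\msalimi\Google Drive\s\Discharge_Summary"
db = MySQLdb.connect("localhost", "root", "Pass", "myblog")
cursor = db.cursor()
for x in os.listdir(src_dir):
    # forward slashes avoid backslash-escaping issues inside the SQL string
    file1 = os.path.join(src_dir, x).replace("\\", "/")
    cursor.execute(
        "LOAD DATA LOCAL INFILE '%s' INTO TABLE clamp_test "
        "LINES TERMINATED BY '\\r' (Lastname) SET id = NULL" % (file1,))
db.commit()
db.close()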

using python 2.7 to query sqlite3 database and getting "sqlite3 operational error no such table"

My simple test code is listed below. I created the table already and can query it using the SQLite Manager add-on for Firefox, so I know the table and data exist. But when I run the query in Python (and in the Python shell) I get the "no such table" error:
def TroyTest(self, acctno):
    conn = sqlite3.connect('TroyData.db')
    curs = conn.cursor()
    v1 = curs.execute('''
        SELECT acctvalue
        FROM balancedata
        WHERE acctno = ? ''', acctno)
    print v1
    conn.close()
When you pass SQLite a path that doesn't exist, it'll happily create a new, empty database for you instead of telling you the file wasn't there, and you'll then get a "no such table" error.
You are using a relative path to the database, meaning it'll try to open the database in the current working directory, and that is probably not where you think it is.
The remedy is to use an absolute path instead:
conn = sqlite3.connect('/full/path/to/TroyData.db')
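If the .db file is meant to sit next to your script, one way to avoid hard-coding the full path (a sketch, assuming that layout) is:

import os
import sqlite3

# build the database path relative to this script's location,
# not the process's current working directory
db_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'TroyData.db')
conn = sqlite3.connect(db_path)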
You need to loop over the cursor to see results:
curs.execute('''
    SELECT acctvalue
    FROM balancedata
    WHERE acctno = ? ''', (acctno,))  # parameters must be a sequence, hence the one-element tuple
for row in curs:
    print row[0]
or call fetchone():
print curs.fetchone() # prints whole row tuple
The problem is the SQL statement: you must specify the db name followed by the table name...
'''SELECT * FROM db_name.table_name WHERE acctno = ? '''
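For what it's worth, in SQLite a db_name. prefix only resolves for the built-in schemas (main, temp) or a database you have explicitly ATTACHed. A sketch of what that looks like (the paths and the alias otherdb are made up):

import sqlite3

conn = sqlite3.connect('/full/path/to/TroyData.db')
curs = conn.cursor()
# attach a second database file; its tables become visible as otherdb.*
curs.execute("ATTACH DATABASE '/full/path/to/other.db' AS otherdb")
acctno = '12345'  # example value
curs.execute("SELECT acctvalue FROM otherdb.balancedata WHERE acctno = ?", (acctno,))
print(curs.fetchone())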

Search Sqlite Database - All Tables and Columns

Is there a library or open source utility available to search all the tables and columns of an SQLite database? The only input would be the name of the SQLite DB file.
I am trying to write a forensics tool and want to search SQLite files for a specific string.
Just dump the db and search it.
% sqlite3 file_name .dump | grep 'my_search_string'
You could instead pipe through less, and then use / to search:
% sqlite3 file_name .dump | less
You could use "SELECT name FROM sqlite_master WHERE type='table'"
to find out the names of the tables in the database. From there it is easy to SELECT all rows of each table.
For example:
import sqlite3
import os

filename = ...
with sqlite3.connect(filename) as conn:
    conn.row_factory = sqlite3.Row
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
    for tablerow in cursor.fetchall():
        table = tablerow[0]
        cursor.execute("SELECT * FROM {t}".format(t=table))
        for row in cursor:
            for field in row.keys():
                print(table, field, row[field])
I know this is late to the party, but I had a similar issue; since it was inside a Docker image, I had no access to Python, so I solved it like so:
for X in $(sqlite3 database.db .tables) ; do sqlite3 database.db "SELECT * FROM $X;" | grep >/dev/null 'STRING I WANT' && echo $X; done
This will iterate through all tables in a database file and perform a select all operation which I then grep for the string. If it finds the string, it prints the table, and from there I can simply use sqlite3 to find out how it was used.
Figured it might be helpful to others who cannot use Python.
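When Python is available, roughly the same table-by-table search can be sketched like this (the database path and search string are placeholders):

import sqlite3

def search_all_tables(db_path, needle):
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
    for (table,) in cursor.fetchall():
        # table names can't be parameterized, so quote them manually
        for row in conn.execute('SELECT * FROM "%s"' % table):
            if any(needle in str(field) for field in row):
                print(table, row)
    conn.close()

search_all_tables('database.db', 'STRING I WANT')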
@MrWorf's answer didn't work for my sqlite file (an .exb file from Evernote) but this similar method worked:
Open the file with DB Browser for SQLite: sqlitebrowser mynotes.exb
File / Export to SQL file (will create mynotes.exb.sql)
grep 'STRING I WANT' mynotes.exb.sql

execute *.sql file with python MySQLdb

How can I execute an SQL script stored in a *.sql file using the MySQLdb Python driver? I was trying
cursor.execute(file(PATH_TO_FILE).read())
but this doesn't work because cursor.execute can run only one SQL command at once, and my SQL script contains several statements. I also tried
cursor.execute('source %s'%PATH_TO_FILE)
but also with no success.
From python, I start a mysql process to execute the file for me:
from subprocess import Popen, PIPE

process = Popen(['mysql', db, '-u', user, '-p', passwd],
                stdout=PIPE, stdin=PIPE)
output = process.communicate('source ' + filename)[0]
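One thing this snippet doesn't do is tell you whether mysql actually succeeded. A small hedged addition (assuming the same process object) would check the exit status:

# non-zero exit status means the mysql client reported a failure
if process.returncode != 0:
    raise RuntimeError("mysql exited with status %d:\n%s"
                       % (process.returncode, output))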
I also needed to execute a SQL file, but the catch was that there wasn't one statement per line, so the accepted answer didn't work for me.
The SQL file I wanted to execute looked like this:
-- SQL script to bootstrap the DB:
--
CREATE USER 'x'@'%' IDENTIFIED BY 'x';
GRANT ALL PRIVILEGES ON mystore.* TO 'x'@'%';
GRANT ALL ON `%`.* TO 'x'@`%`;
FLUSH PRIVILEGES;
--
--
CREATE DATABASE oozie;
GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'localhost' IDENTIFIED BY 'oozie';
GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
FLUSH PRIVILEGES;
--
USE oozie;
--
CREATE TABLE `BUNDLE_ACTIONS` (
  `bundle_action_id` varchar(255) NOT NULL,
  `bundle_id` varchar(255) DEFAULT NULL,
  `coord_id` varchar(255) DEFAULT NULL,
  `coord_name` varchar(255) DEFAULT NULL,
  `critical` int(11) DEFAULT NULL,
  `last_modified_time` datetime DEFAULT NULL,
  `pending` int(11) DEFAULT NULL,
  `status` varchar(255) DEFAULT NULL,
  `bean_type` varchar(31) DEFAULT NULL,
  PRIMARY KEY (`bundle_action_id`),
  KEY `I_BNDLTNS_DTYPE` (`bean_type`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
--
--
Some statements in the above file lie on a single line and some statements also span multiple lines (like the CREATE TABLE at the end). There are also a few SQL inline comment lines that begin with "--".
As suggested by ThomasK, I had to write some simple rules to join lines into a statement. I ended up with a function to execute a sql file:
import re
from MySQLdb import OperationalError, ProgrammingError

def exec_sql_file(cursor, sql_file):
    print "\n[INFO] Executing SQL script file: '%s'" % (sql_file)
    statement = ""
    for line in open(sql_file):
        if re.match(r'--', line):  # ignore sql comment lines
            continue
        if not re.search(r';$', line):  # keep appending lines that don't end in ';'
            statement = statement + line
        else:  # when you get a line ending in ';' then exec statement and reset for next statement
            statement = statement + line
            #print "\n\n[DEBUG] Executing SQL statement:\n%s" % (statement)
            try:
                cursor.execute(statement)
            except (OperationalError, ProgrammingError) as e:
                print "\n[WARN] MySQLError during execute statement \n\tArgs: '%s'" % (str(e.args))
            statement = ""
I'm sure there's scope for improvement, but for now it's working pretty well for me. Hope someone finds it useful.
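A call site might look like this (a sketch; the connection details and the bootstrap.sql filename are made up for the example):

import MySQLdb

db = MySQLdb.connect(host="localhost", user="root", passwd="secret", db="mydb")
cursor = db.cursor()
exec_sql_file(cursor, "bootstrap.sql")  # hypothetical script name
db.commit()
db.close()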
This worked for me:
with open('schema.sql') as f:
    cursor.execute(f.read().decode('utf-8'), multi=True)
for line in open(PATH_TO_FILE):
    cursor.execute(line)
This assumes you have one SQL statement per line in your file. Otherwise you'll need to write some rules to join lines together.
Another solution that leverages the MySQL interpreter without any parsing is to use os.system to run a MySQL command line directly inside Python:
from os import system
USERNAME = "root"
PASSWORD = "root"
DBNAME = "pablo"
HOST = "localhost"
PORT = 3306
FILE = "file.sql"
command = """mysql -u %s -p"%s" --host %s --port %s %s < %s""" %(USERNAME, PASSWORD, HOST, PORT, DBNAME, FILE)
system(command)
It avoids any parsing error when, for example, a string variable contains a smiley ;-) or, if you check for ; as the last character, when a comment follows it, like SELECT * FROM foo_table; # selecting data
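If you'd rather not interpolate the password into a shell string, a variant of the same idea (a sketch, not part of the original answer) goes through subprocess using the constants defined above:

import subprocess

# same idea, but no shell command line to quote; the file is fed on stdin
with open(FILE, "rb") as f:
    subprocess.check_call(
        ["mysql", "-u", USERNAME, "-p" + PASSWORD,
         "--host", HOST, "--port", str(PORT), DBNAME],
        stdin=f)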
Many of the answers here have serious flaws...
First, don't try to parse an open-ended SQL script yourself! If you think that is easily done, you aren't aware of how robust and complicated SQL can be. Serious SQL scripts certainly involve statements and procedure definitions spanning multiple lines. It is also common to explicitly declare and change delimiters in the middle of your scripts. You can also nest source commands within each other. For so many reasons, you want to run the script through the MySQL client and allow it to handle the heavy lifting. Trying to reinvent that is fraught with peril and a huge waste of time. Maybe if you are the only one writing these scripts, and you are not writing anything sophisticated, you could get away with that, but why limit yourself to such a degree? What about machine-generated scripts, or those written by other developers?
The answer from @jdferreira is on the right track, but also has problems and weaknesses. The most significant is that sending the connection parameters to the process in that manner opens up a security hole.
Here's a solution / example for your copy & paste pleasure. My extended discussion follows:
First, create a separate config file to save your user name and password.
db-creds.cfg
[client]
user = XXXXXXX
password = YYYYYYY
Slap the right file system permissions on that, so the Python process can read from it, but no one who shouldn't be able to can view it.
Then, use this Python (in my example case the creds file is adjacent to the py script):
#!/usr/bin/python
import os
import sys
import MySQLdb
from subprocess import Popen, PIPE, STDOUT

__MYSQL_CLIENT_PATH = "mysql"
__THIS_DIR = os.path.dirname( os.path.realpath( sys.argv[0] ) )
__DB_CONFIG_PATH = os.path.join( __THIS_DIR, "db-creds.cfg" )
__DB_CONFIG_SECTION = "client"
__DB_CONN_HOST = "localhost"
__DB_CONN_PORT = 3306

# ----------------------------------------------------------------
class MySqlScriptError( Exception ):
    def __init__( self, dbName, scriptPath, stdOut, stdErr ):
        Exception.__init__( self )
        self.dbName = dbName
        self.scriptPath = scriptPath
        self.priorOutput = stdOut
        self.errorMsg = stdErr
        errNumParts = stdErr.split("(")
        try :    self.errorNum = long( errNumParts[0].replace("ERROR","").strip() )
        except : self.errorNum = None
        try :    self.sqlState = long( errNumParts[1].split(")")[0].strip() )
        except : self.sqlState = None

    def __str__( self ):
        return ("--- MySqlScriptError ---\n" +
                "Script: %s\n" % (self.scriptPath,) +
                "Database: %s\n" % (self.dbName,) +
                self.errorMsg )

    def __repr__( self ): return self.__str__()

# ----------------------------------------------------------------
def databaseLoginParms() :
    from ConfigParser import RawConfigParser
    parser = RawConfigParser()
    parser.read( __DB_CONFIG_PATH )
    return ( parser.get( __DB_CONFIG_SECTION, "user" ).strip(),
             parser.get( __DB_CONFIG_SECTION, "password" ).strip() )

def databaseConn( username, password, dbName ):
    return MySQLdb.connect( host=__DB_CONN_HOST, port=__DB_CONN_PORT,
                            user=username, passwd=password, db=dbName )

def executeSqlScript( dbName, scriptPath, ignoreErrors=False ) :
    scriptDirPath = os.path.dirname( os.path.realpath( scriptPath ) )
    sourceCmd = "SOURCE %s" % (scriptPath,)
    cmdList = [ __MYSQL_CLIENT_PATH,
                "--defaults-extra-file=%s" % (__DB_CONFIG_PATH,),
                "--database", dbName,
                "--unbuffered" ]
    if ignoreErrors :
        cmdList.append( "--force" )
    else:
        cmdList.extend( ["--execute", sourceCmd ] )
    process = Popen( cmdList
                   , cwd=scriptDirPath
                   , stdout=PIPE
                   , stderr=(STDOUT if ignoreErrors else PIPE)
                   , stdin=(PIPE if ignoreErrors else None) )
    stdOut, stdErr = process.communicate( sourceCmd if ignoreErrors else None )
    if stdErr is not None and len(stdErr) > 0 :
        raise MySqlScriptError( dbName, scriptPath, stdOut, stdErr )
    return stdOut
If you want to test it out, add this:
if __name__ == "__main__":
    ( username, password ) = databaseLoginParms()
    dbName = "ExampleDatabase"

    print "MySQLdb Test"
    print
    conn = databaseConn( username, password, dbName )
    cursor = conn.cursor()
    cursor.execute( "show tables" )
    print cursor.fetchall()
    cursor.close()
    conn.close()
    print
    print "-----------------"
    print "Execute Script with ignore errors"
    print
    scriptPath = "test.sql"
    print executeSqlScript( dbName, scriptPath,
                            ignoreErrors=True )
    print
    print "-----------------"
    print "Execute Script WITHOUT ignore errors"
    print
    try : print executeSqlScript( dbName, scriptPath )
    except MySqlScriptError as e :
        print "dbName: %s" % (e.dbName,)
        print "scriptPath: %s" % (e.scriptPath,)
        print "errorNum: %s" % (str(e.errorNum),)
        print "sqlState: %s" % (str(e.sqlState),)
        print "priorOutput:"
        print e.priorOutput
        print
        print "errorMsg:"
        print e.errorMsg
        print
        print e
        print
And for good measure, here's an example sql script to feed into it:
test.sql
show tables;
blow up;
show tables;
So, now for some discussion.
First, I illustrate how to use MySQLdb along with this external script execution, while storing the creds in one shared file you can use for both.
By using --defaults-extra-file on the command line you can SECURELY pass your connection parameters in.
The combination of either --force with stdin streaming the source command, OR --execute running the command from the outside, lets you dictate how the script will run: either ignoring errors and continuing, or stopping as soon as an error occurs.
The order in which the results come back is also preserved via --unbuffered. Without that, your stdout and stderr streams will be jumbled and undefined in their order, making it very hard to figure out what worked and what did not when comparing the output to the input sql.
Using the Popen cwd=scriptDirPath lets you nest source commands within one another using relative paths. If your scripts will all be in the same directory (or at a known path relative to it), this lets you reference them relative to where the top-level script resides.
Finally, I threw in an exception class which carries all the info you could possibly want about what happened. If you are not using the ignoreErrors option, one of these exceptions will be thrown in your Python when something goes wrong, and the script will have stopped running at that error.
At least MySQLdb 1.2.3 seems to allow this out of the box; you just have to call cursor.nextset() to cycle through the returned result sets.
db = conn.cursor()
db.execute('SELECT 1; SELECT 2;')
more = True
while more:
    print db.fetchall()
    more = db.nextset()
If you want to be absolutely sure the support for this is enabled, and/or disable the support, you can use something like this:
MYSQL_OPTION_MULTI_STATEMENTS_ON = 0
MYSQL_OPTION_MULTI_STATEMENTS_OFF = 1
conn.set_server_option(MYSQL_OPTION_MULTI_STATEMENTS_ON)
# Multiple statement execution here...
conn.set_server_option(MYSQL_OPTION_MULTI_STATEMENTS_OFF)
The accepted answer will run into problems when your SQL script contains empty lines or a query spans multiple lines. Instead, the following approach solves the problem:
f = open(filename, 'r')
query = " ".join(f.readlines())
c.execute(query)
As mentioned in one of the comments, if you are sure that every command ends with a semi-colon, you can do this:
import mysql.connector

connection = mysql.connector.connect(
    host=host,
    user=user,
    password=password
)
cursor = connection.cursor()

with open(script, encoding="utf-8") as f:
    commands = f.read().split(';')

for command in commands:
    if command.strip():  # skip the empty string left after the final ';'
        cursor.execute(command)
        print(command)

connection.close()
Load mysqldump file:
for line in open(PATH_TO_FILE).read().split(';\n'):
    cursor.execute(line)
Are you able to use a different database driver?
If yes: what you want is possible with the MySQL Connector/Python driver by MySQL.
Its cursor.execute method supports executing multiple SQL statements at once by passing multi=True.
Splitting the SQL statements in the file by semicolon is not necessary.
Simple example (mainly copy & paste from the second link, I just added reading the SQL from the file):
import mysql.connector

file = open('test.sql')
sql = file.read()

cnx = mysql.connector.connect(user='uuu', password='ppp', host='hhh', database='ddd')
cursor = cnx.cursor()

for result in cursor.execute(sql, multi=True):
    if result.with_rows:
        print("Rows produced by statement '{}':".format(
            result.statement))
        print(result.fetchall())
    else:
        print("Number of rows affected by statement '{}': {}".format(
            result.statement, result.rowcount))

cnx.close()
I'm using this to import MySQL dumps (created in phpMyAdmin by exporting the whole database to a SQL file) from the *.sql file back into a database.
Here's a code snippet that will import a typical .sql that comes from an export. (I used it with exports from Sequel Pro successfully.) Deals with multi-line queries and comments (#).
Note 1: I used the initial lines from Thomas K's response but added more.
Note 2: For newbies, replace the DB_HOST, DB_PASS etc with your database connection info.
import MySQLdb
from configdb import DB_HOST, DB_PASS, DB_USER, DB_DATABASE_NAME

db = MySQLdb.connect(host=DB_HOST,      # your host, usually localhost
                     user=DB_USER,      # your username
                     passwd=DB_PASS,    # your password
                     db=DB_DATABASE_NAME)  # name of the data base
cur = db.cursor()

PATH_TO_FILE = "db-testcases.sql"

fullLine = ''
for line in open(PATH_TO_FILE):
    tempLine = line.strip()

    # Skip empty lines.
    # However, it seems "strip" doesn't remove every sort of whitespace.
    # So, we also catch the "Query was empty" error below.
    if len(tempLine) == 0:
        continue

    # Skip comments
    if tempLine[0] == '#':
        continue

    fullLine += line
    if not ';' in line:
        continue

    # You can remove this. It's for debugging purposes.
    print "[line] ", fullLine, "[/line]"

    try:
        cur.execute(fullLine)
    except MySQLdb.OperationalError as e:
        if e[1] == 'Query was empty':
            continue
        raise e

    fullLine = ''

db.close()
How about using the pexpect library? The idea is, that you can start a process pexpect.spawn(...), and wait until the output of that process contains a certain pattern process.expect(pattern).
I actually used this to connect to the mysql client and execute some sql scripts.
Connecting:
import pexpect
process = pexpect.spawn("mysql", ["-u", user, "-p"])
process.expect("Enter password")
process.sendline(password)
process.expect("mysql>")
This way the password is not hardcoded into the command line parameter (removes security risk).
Executing even several sql scripts:
error = False
for script in sql_scripts:
    process.sendline("source {};".format(script))
    index = process.expect(["mysql>", "ERROR"])
    # Error occurred, interrupt
    if index == 1:
        error = True
        break

if not error:
    # commit changes of the scripts
    process.sendline("COMMIT;")
    process.expect("mysql>")
    print "Everything fine"
else:
    # don't commit + print error message
    print "Your scripts have errors"
Beware: make sure you always call expect(pattern) and that the pattern actually matches; otherwise you will get a timeout error. I needed this bit of code to execute several SQL scripts and only commit their changes if no error occurred, but it is easily adaptable for use cases with only one script.
You can use something like this:
def write_data(schema_name: str, table_name: str, column_names: str, data: list):
    try:
        data_list_template = ','.join(['%s'] * len(data))  # one %s per element of data
        insert_query = f"insert into {schema_name}.{table_name} ({column_names}) values {data_list_template}"
        db.execute(insert_query, data)
        conn_obj.commit()
    except Exception as e:
        db.execute("rollback")
        raise e
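For illustration, a call might look like this; db and conn_obj are assumed to be a module-level cursor and connection, as the snippet implies, and the schema/table/values are made up:

# each element of data should be a row tuple; drivers such as MySQLdb/pymysql
# render a tuple parameter as a parenthesized value list, so the two tuples
# here expand to values (...),(...)
write_data("myschema", "people", "id, name, age",
           [(1, "Alice", 30), (2, "Bob", 25)])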
