How to Import Big JSON file to MYSQL [duplicate] - python

I am having a hard time using the MySQLdb module to insert information into my database. I need to insert 6 variables into the table.
cursor.execute ("""
INSERT INTO Songs (SongName, SongArtist, SongAlbum, SongGenre, SongLength, SongLocation)
VALUES
(var1, var2, var3, var4, var5, var6)
""")
Can someone help me with the syntax here?

Beware of using string interpolation for SQL queries, since it won't escape the input parameters correctly and will leave your application open to SQL injection vulnerabilities. The difference might seem trivial, but in reality it's huge.
Incorrect (with security issues)
c.execute("SELECT * FROM foo WHERE bar = %s AND baz = %s" % (param1, param2))
Correct (with escaping)
c.execute("SELECT * FROM foo WHERE bar = %s AND baz = %s", (param1, param2))
It adds to the confusion that the modifiers used to bind parameters in a SQL statement vary between different DB API implementations, and that the MySQL client library uses printf-style syntax instead of the more commonly accepted '?' marker (used by e.g. python-sqlite).
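The difference between the two calls above can be seen in a small, self-contained sketch. It assumes Python 3 and uses the standard-library sqlite3 module as a stand-in for MySQL, so the placeholder is '?' rather than '%s'; the table contents and the hostile input are made up for illustration.

```python
import sqlite3

# In-memory database standing in for a real MySQL server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE foo (bar TEXT, baz TEXT)")
cur.executemany("INSERT INTO foo VALUES (?, ?)", [("a", "1"), ("b", "2")])

# Hypothetical hostile input supplied by a user.
param = "nothing' OR '1'='1"

# String interpolation: the input becomes part of the SQL text itself,
# so the WHERE clause turns into "bar = 'nothing' OR '1'='1'".
unsafe_sql = "SELECT * FROM foo WHERE bar = '%s'" % param
unsafe_rows = cur.execute(unsafe_sql).fetchall()

# Parameter binding: the input is passed separately and treated as a plain value.
safe_rows = cur.execute("SELECT * FROM foo WHERE bar = ?", (param,)).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The interpolated query matches every row in the table, while the bound query matches none, because no row has that literal string in bar.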

You have a few options available. You'll want to get comfortable with Python's string interpolation, which is a term you might have more success searching for in the future when you want to know things like this.
Better for queries:
some_dictionary_with_the_data = {
    'name': 'awesome song',
    'artist': 'some band',
    # etc... (album, genre, length, location)
}
cursor.execute ("""
INSERT INTO Songs (SongName, SongArtist, SongAlbum, SongGenre, SongLength, SongLocation)
VALUES
(%(name)s, %(artist)s, %(album)s, %(genre)s, %(length)s, %(location)s)
""", some_dictionary_with_the_data)
Considering you probably have all of your data in an object or dictionary already, the second format will suit you better. Also it sucks to have to count "%s" appearances in a string when you have to come back and update this method in a year :)

The linked docs give the following example:
cursor.execute ("""
UPDATE animal SET name = %s
WHERE name = %s
""", ("snake", "turtle"))
print "Number of rows updated: %d" % cursor.rowcount
So you just need to adapt this to your own code - example:
cursor.execute ("""
INSERT INTO Songs (SongName, SongArtist, SongAlbum, SongGenre, SongLength, SongLocation)
VALUES
(%s, %s, %s, %s, %s, %s)
""", (var1, var2, var3, var4, var5, var6))
(If SongLength is numeric, you may need to use %d instead of %s).

Actually, even if your variable (SongLength) is numeric, you will still have to format it with %s in order to bind the parameter correctly. If you try to use %d, you will get an error. Here's a small excerpt from this link http://mysql-python.sourceforge.net/MySQLdb.html:
To perform a query, you first need a cursor, and then you can execute queries on it:
c=db.cursor()
max_price=5
c.execute("""SELECT spam, eggs, sausage FROM breakfast
WHERE price < %s""", (max_price,))
In this example, max_price=5. Why, then, use %s in the string? Because MySQLdb will convert it to a SQL literal value, which is the string '5'. When it's finished, the query will actually say, "...WHERE price < 5".
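To make the reasoning concrete, here is a toy model of that substitution step, written for Python 3. It is not MySQLdb's actual code: to_sql_literal and render are hypothetical helpers that only mimic the idea that every parameter is first turned into SQL literal text and then merged into the query with the % operator.

```python
def to_sql_literal(value):
    """Toy conversion of a Python value to SQL literal text (illustration only)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Naive escaping for the sketch; real drivers do this properly.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def render(query, params):
    """Merge already-converted literals into the query, printf style."""
    return query % tuple(to_sql_literal(p) for p in params)


# %s works for a number because the literal is already text.
rendered = render("SELECT spam FROM breakfast WHERE price < %s", (5,))

# %d fails because it receives the text '5', not an int.
try:
    render("SELECT spam FROM breakfast WHERE price < %d", (5,))
    percent_d_failed = False
except TypeError:
    percent_d_failed = True

print(rendered, percent_d_failed)
```

Under this model, %d can never work: by the time the format string is applied, every argument is a string.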

As an alternative to the chosen answer, and with the same safe semantics of Marcel's, here is a compact way of using a Python dictionary to specify the values. It has the benefit of being easy to modify as you add or remove columns to insert:
meta_cols = ('SongName','SongArtist','SongAlbum','SongGenre')
insert = 'insert into Songs ({0}) values ({1})'.format(
','.join(meta_cols), ','.join( ['%s']*len(meta_cols)))
args = [ meta[i] for i in meta_cols ]
cursor = db.cursor()
cursor.execute(insert,args)
db.commit()
Where meta is the dictionary holding the values to insert. Update can be done in the same way:
meta_cols = ('SongName','SongArtist','SongAlbum','SongGenre')
update = 'update Songs set {0} where id=%s'.format(
    ','.join(['{0}=%s'.format(c) for c in meta_cols]))
args = [ meta[i] for i in meta_cols ]
args.append(songid)
cursor=db.cursor()
cursor.execute(update,args)
db.commit()
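The same idea can be wrapped in two small helpers that only build the statement text and the argument list, which makes them easy to check without a database. This is a sketch for Python 3; build_insert, build_update, the sample meta dictionary and the id value 42 are all hypothetical. Column and table names are formatted into the SQL directly, so they should come from your own code, never from user input.

```python
def build_insert(table, columns):
    """Return an INSERT statement with one %s placeholder per column."""
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO {0} ({1}) VALUES ({2})".format(
        table, ", ".join(columns), placeholders)


def build_update(table, columns, key_column):
    """Return an UPDATE statement that sets every column and filters on key_column."""
    assignments = ", ".join("{0}=%s".format(c) for c in columns)
    return "UPDATE {0} SET {1} WHERE {2}=%s".format(table, assignments, key_column)


# Hypothetical sample data.
meta = {
    "SongName": "awesome song",
    "SongArtist": "some band",
    "SongAlbum": "some album",
    "SongGenre": "rock",
}
meta_cols = ("SongName", "SongArtist", "SongAlbum", "SongGenre")

insert_sql = build_insert("Songs", meta_cols)
insert_args = [meta[c] for c in meta_cols]

update_sql = build_update("Songs", meta_cols, "id")
update_args = insert_args + [42]  # 42 is a made-up song id

print(insert_sql)
print(update_sql)
```

The resulting strings and lists would then be passed as cursor.execute(insert_sql, insert_args) and cursor.execute(update_sql, update_args).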

The first solution works well. I want to add one small detail here: make sure the variable you are substituting is of type str. My MySQL column type is decimal, but I had to pass the parameter as a str to be able to execute the query.
temp = "100"
myCursor.execute("UPDATE testDB.UPS SET netAmount = %s WHERE auditSysNum = '42452'",(temp,))
myCursor.execute(var)

Here is another way to do it. It's documented on the MySQL official website.
https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html
In spirit, it uses the same mechanism as @Trey Stout's answer. However, I find this one prettier and more readable.
insert_stmt = (
"INSERT INTO employees (emp_no, first_name, last_name, hire_date) "
"VALUES (%s, %s, %s, %s)"
)
data = (2, 'Jane', 'Doe', datetime.date(2012, 3, 23))
cursor.execute(insert_stmt, data)
And to better illustrate any need for variables:
NB: note the escape being done.
employee_id = 2
first_name = "Jane"
last_name = "Doe"
insert_stmt = (
"INSERT INTO employees (emp_no, first_name, last_name, hire_date) "
"VALUES (%s, %s, %s, %s)"
)
data = (employee_id, conn.escape_string(first_name), conn.escape_string(last_name), datetime.date(2012, 3, 23))
cursor.execute(insert_stmt, data)

Related

IndexError out of range on Python

I have a python code to insert data into my database. Here is the line:
query = 'insert into 1_recipe values({0})'. I used {0} to pass all the data from my CSV file. It worked perfectly before I used sys.argv in my code. Here is the new code:
import sys
nomor = sys.argv[1]
.....
query = "insert into {idnumber}_recipe values ({0})".format(idnumber = nomor)
query = query.format(','.join(['%s'] * len(data)))
.....
When I run this code, it always fails with this error:
'query = "insert into {idnumber}_recipe values ({0})".format(idnumber = nomor)
IndexError: Replacement index 0 out of range for positional args tuple'
How to fix it? Thanks.
Update:
I already found the answer. Thank you
You are only passing one argument to the format() function:
.format(idnumber = nomor)
The format function doesn't have a value to give to the ({0}) part of the formatted string.
Either give another value or change it so it will use idnumber as well
You can look at query development for formatting here.
e.g.:
insert_stmt = (
"INSERT INTO employees (emp_no, first_name, last_name, hire_date) "
"VALUES (%s, %s, %s, %s)"
)
data = (2, 'Jane', 'Doe', datetime.date(2012, 3, 23))
cursor.execute(insert_stmt, data)
Your code can be rewritten something like this using better formatting:
import sys
nomor = sys.argv[1]
data_str_value = ','.join(['%s'] * len(data))
.....
query = "insert into {idnumber}_recipe values ({values})".format(idnumber = nomor, values = data_str_value)
.....
Note: This code is showing only better formatting as per the example given. This query may or may not run as expected due to incorrect syntax.
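The failure and the fix can be reproduced without a database, since only str.format is involved. A minimal sketch for Python 3, with a made-up data list and a hard-coded "1" in place of sys.argv[1]:

```python
template = "insert into {idnumber}_recipe values ({0})"

# Supplying only the keyword argument leaves {0} without a value.
try:
    template.format(idnumber="1")
    raised = False
except IndexError:
    raised = True

# Hypothetical row read from the CSV file.
data = ["flour", "sugar", "eggs"]
placeholders = ",".join(["%s"] * len(data))

# Naming both fields lets a single format() call fill them.
query = "insert into {idnumber}_recipe values ({values})".format(
    idnumber="1", values=placeholders)

print(raised, query)
```

The %s markers that end up in the query are then filled by the driver, for example with cursor.execute(query, data).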

WHERE IN Clause in python list [duplicate]

I need to pass a batch of parameters to mysql in python. Here is my code:
sql = """ SELECT * from my_table WHERE name IN (%s) AND id=%(Id)s AND puid=%(Puid)s"""
params = {'Id':id,'Puid' : pid}
in_p=', '.join(list(map(lambda x: '%s', names)))
sql = sql %in_p
cursor.execute(sql, names) #todo: add params to sql clause
The problem is I want to pass the name list to sql IN clause, meanwhile I also want to pass the id and puid as parameters to the sql query clause. How do I implement these in python?
Think about the arguments to cursor.execute that you want. You want to ultimately execute
cursor.execute("SELECT * FROM my_table WHERE name IN (%s, %s, %s) AND id = %s AND puid = %s;", ["name1", "name2", "name3", id, pid])
How do you get there? The tricky part is getting the variable number of %ss right in the IN clause. The solution, as you probably saw from this answer is to dynamically build it and %-format it into the string.
in_p = ', '.join(list(map(lambda x: '%s', names)))
sql = "SELECT * FROM my_table WHERE name IN (%s) AND id = %s AND puid = %s;" % in_p
But this doesn't work. You get:
TypeError: not enough arguments for format string
It looks like Python is confused about the second two %ss, which you don't want to replace. The solution is to tell Python to treat those %ss differently by escaping the %:
sql = "SELECT * FROM my_table WHERE name IN (%s) AND id = %%s AND puid = %%s;" % in_p
Finally, to build the arguments and execute the query:
args = names + [id, pid]
cursor.execute(sql, args)
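One way to avoid the %% escaping altogether is to insert the placeholder list with str.format, which leaves the remaining %s markers untouched. The helper below is a sketch for Python 3; build_in_query is a hypothetical name and the id values are made up.

```python
def build_in_query(names, id_value, puid_value):
    """Return (sql, args) for a query with a variable-length IN clause."""
    if not names:
        # "IN ()" is not valid SQL, so refuse an empty list up front.
        raise ValueError("names must not be empty")
    placeholders = ", ".join(["%s"] * len(names))
    sql = ("SELECT * FROM my_table WHERE name IN ({0}) "
           "AND id = %s AND puid = %s").format(placeholders)
    # Argument order must match placeholder order: names first, then id, puid.
    args = list(names) + [id_value, puid_value]
    return sql, args


sql, args = build_in_query(["name1", "name2", "name3"], 7, 9)
print(sql)
print(args)
```

The pair is then passed straight to cursor.execute(sql, args).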
sql = """ SELECT * from my_table WHERE name IN (%s) AND id=%(Id)s AND puid=%(Puid)s""".replace("%s", "%(Clause)s")
print sql%{'Id':"x", 'Puid': "x", 'Clause': "x"}
This can help you.

Python to MySQLdb will not pass variables I think I have tried everything

I am trying to store some TV information in a MySQLdb. I have tried about everything and I cannot get the variables to post. There is information in the variables as I am able to print the information.
My Code:
import pytvmaze
import MySQLdb
AddShow = pytvmaze.get_show(show_name='dexter')
MazeID = AddShow.maze_id
ShowName = "Show" + str(MazeID)
show = pytvmaze.get_show(MazeID, embed='episodes')
db = MySQLdb.connect("localhost","root","XXXXXXX","TVshows" )
cursor = db.cursor()
for episode in show.episodes:
Show = show.name
ShowStatus = show.status
ShowSummary = show.summary
Updated = show.updated
Season = episode.season_number
Episode = episode.episode_number
Title = episode.title
AirDate = episode.airdate
ShowUpdate = show.updated
EpisodeSummary = episode.summary
try:
sql = "INSERT INTO " + ShowName + " VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)""" (Show,ShowStatus,ShowSummary,Updated,Season,Episode,Title,AirDate,ShowUpdate,EpisodeSummary)
cursor.execute(sql)
db.commit()
except:
db.rollback()
db.close()
Any thoughts? Thanks in advance.
EDIT - WORKING CODE
import pytvmaze
import MySQLdb
AddShow = pytvmaze.get_show(show_name='dexter')
MazeID = AddShow.maze_id
ShowNameandID = "Show" + str(MazeID)
show = pytvmaze.get_show(MazeID, embed='episodes')
db = MySQLdb.connect("localhost","root","letmein","TVshows" )
cursor = db.cursor()
for episode in show.episodes:
ShowName = show.name
ShowStatus = show.status
ShowSummary = show.summary
Updated = show.updated
Season = episode.season_number
Episode = episode.episode_number
Title = episode.title
AirDate = episode.airdate
ShowUpdate = show.updated
EpisodeSummary = episode.summary
sql = "INSERT INTO " + ShowNameandID + """ VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"""
cursor.execute(sql, (ShowName, ShowStatus, ShowSummary, Updated, Season, Episode, Title, AirDate, ShowUpdate, EpisodeSummary))
db.commit()
print sql ##Great for debugging
db.close()
First of all, you've actually made things more difficult for yourself by catching all the exceptions via a bare try/except and then silently rolling back. Temporarily remove the try/except and see what the real error is, or log the exception in the except block. I bet the error would be related to a syntax error in the query, since you would miss the quotes around the column value(s).
Anyway, arguably the biggest problem you have is how you pass the variables into the query. Currently, you are using string formatting, which is strongly discouraged because of the danger of SQL injection attacks and problems with type conversions. Parameterize your query:
sql = """
    INSERT INTO
        {show}
    VALUES
        (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
""".format(show=ShowName)
cursor.execute(sql, (Show, ShowStatus, ShowSummary, Updated, Season, Episode, Title, AirDate, ShowUpdate, EpisodeSummary))
Note that it is not possible to parameterize the table name (ShowName in your case), so we are using string formatting for it. Make sure you either trust your source, escape it manually via MySQLdb.escape_string(), or validate it with separate custom code.
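As an illustration of the "separate custom code" option, the table name can be checked against a strict pattern before it is formatted into the statement. This is only a sketch for Python 3: insert_statement is a hypothetical helper, and the rule used here (ASCII letters, digits and underscores, not starting with a digit) is an assumption that is stricter than what MySQL itself allows.

```python
import re

# Assumed rule for acceptable table names in this sketch.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def insert_statement(table, n_values):
    """Build an INSERT with n_values placeholders, rejecting odd table names."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError("unexpected table name: {0!r}".format(table))
    placeholders = ", ".join(["%s"] * n_values)
    return "INSERT INTO `{0}` VALUES ({1})".format(table, placeholders)


sql = insert_statement("Show161", 10)  # "Show161" is a made-up table name
print(sql)

try:
    insert_statement("Show161; DROP TABLE users", 10)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

The values themselves still go through cursor.execute(sql, values) so that the driver escapes them.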

Python/MySQLdb import from CSV with data encapsulation

I have been using WAMP to ingest some csv logs, and wanted to move to a more automated process by scripting some of the routine actions I need to take.
I was using the direct CSV import function in PHPmyadmin to handle the dialect and specifics of the CSV.
I have written an uploader in Python, using MySQLdb that parses the log file, however as the logs contain some unhelpful chars, I am finding that I need to do lots of running around sanitizing inputs where I probably don't want to be...
Example, the log is some data from a directory scanner, and I have no control over the folder naming conventions folks use. I have this folder:-
"C:\user\NZ Business Roundtable_Download_13Feb2013, 400 Access"
and the , char is being read as a new field marker (it is CSV after all). What I actually want is for it to ignore all text inside the quote marks:- "......"
I see a similar issue with ' chars, and I'm sure there will be more.
I found this:- http://www.tech-recipes.com/rx/2345/import_csv_file_directly_into_mysql/ that shows how I could script the Python to function like the PHPmyadmin load routine. Mainly using this snippet:
load data local infile 'uniq.csv' into table tblUniq fields terminated by ','
enclosed by '"'
lines terminated by '\n'
(uniqName, uniqCity, uniqComments)
However there is some in depth processing and changes to the table that I would like to protect that I have already scripted, so wondered if there was a way to "tell" MySQL that I want to use "" as text encapsulation. The main processing I want to protect is that I give it a specific table name when creating the new table, and use that throughout the rest of the processing.
Example of my table maker script:-
def make_table(self):
query ="DROP TABLE IF EXISTS `atl`.`{}`".format(self.table)
self.cur.execute(query)
query = "CREATE TABLE IF NOT EXISTS `atl`.`{}` (`PK` INT NOT NULL AUTO_INCREMENT PRIMARY KEY, `ID` varchar(10), `PARENT_ID` varchar(10), `URI` varchar(284), \
`FILE_PATH` varchar(230), `NAME` varchar(125), `METHOD` varchar(9), `STATUS` varchar(4), `SIZE` varchar(9), \
`TYPE` varchar(9), `EXT` varchar(11), `LAST_MODIFIED` varchar(19), `EXTENSION_MISMATCH` varchar(20), `MD5_HASH` varchar(32), \
`FORMAT_COUNT` varchar(2), `PUID` varchar(9), `MIME_TYPE` varchar(71), `FORMAT_NAME` varchar(59), `FORMAT_VERSION` varchar(7), \
`delete_flag` tinyint, `delete_reason` VARCHAR(80), `move_flag` TINYINT, `move_reason` VARCHAR(80), \
`ext_change_flag` TINYINT, `ext_change_reason` VARCHAR(80), `ext_change_value` VARCHAR(4), `fname_change_flag` TINYINT, `fname_change_reason` VARCHAR(80),\
`fname_change_value` VARCHAR(80))".format(self.table)
self.cur.execute(query)
self.mydb.commit()
Example of my ingest script:-
def ingest_row(self, row):
query = "insert"
# Prepare SQL query to INSERT a record into the database.
query = "INSERT INTO `atl`.`{0}` (`ID`, `PARENT_ID`, `URI`, `FILE_PATH`, `NAME`, `METHOD`, `STATUS`, `SIZE`, `TYPE`, `EXT`, \
`EXTENSION_MISMATCH`, `LAST_MODIFIED`, `MD5_HASH`, `FORMAT_COUNT`, `PUID`, `MIME_TYPE`, `FORMAT_NAME`, `FORMAT_VERSION`) \
VALUES ('{1}','{2}','{3}','{4}','{5}','{6}','{7}','{8}','{9}','{10}','{11}','{12}','{13}','{14}','{15}','{16}','{17}','{18}')".format(self.table, row[0], row[1], row[2], row[3], row[4], \
row[5], row[6], row[7], row[8], row[9], row[10], row[11], row[12], row[13], row[14], row[15], row[16], row[17])
try:
self.cur.execute(query)
self.mydb.commit()
except:
print query
quit()
Example of log:-
"ID","PARENT_ID","URI","FILE_PATH","NAME","METHOD","STATUS","SIZE","TYPE","EXT","LAST_MODIFIED","EXTENSION_MISMATCH","MD5_HASH","FORMAT_COUNT","PUID","MIME_TYPE","FORMAT_NAME","FORMAT_VERSION"
"1","","file:/C:/jay/NZ%20Business%20Roundtable_Download_13Feb2013,%20400%20Access/","C:\jay\NZ Business Roundtable_Download_13Feb2013, 400 Access","NZ Business Roundtable_Download_13Feb2013, 400 Access",,"Done","","Folder",,"2013-06-28T11:31:36","false",,"",,"","",""
"2","1","file:/C:/jay/NZ%20Business%20Roundtable_Download_13Feb2013,%20400%20Access/1993/","C:\jay\NZ Business Roundtable_Download_13Feb2013, 400 Access\1993","1993",,"Done","","Folder",,"2013-06-28T11:31:36","false",,"",,"","",""
You should use SQL prepared statements. Mixing data and SQL code with format opens the door to SQL injection (which is almost always near the top of lists of the most dangerous software flaws).
For example, here is your data:
>>> log = """\
... "ID","PARENT_ID","URI","FILE_PATH","NAME","METHOD","STATUS","SIZE","TYPE","EXT","LAST_MODIFIED","EXTENSION_MISMATCH","MD5_HASH","FORMAT_COUNT","PUID","MIME_TYPE","FORMAT_NAME","FORMAT_VERSION"
... "1","","file:/C:/jay/NZ%20Business%20Roundtable_Download_13Feb2013,%20400%20Access/","C:\jay\NZ Business Roundtable_Download_13Feb2013, 400 Access","NZ Business Roundtable_Download_13Feb2013, 400 Access",,"Done","","Folder",,"2013-06-28T11:31:36","false",,"",,"","",""
... "2","1","file:/C:/jay/NZ%20Business%20Roundtable_Download_13Feb2013,%20400%20Access/1993/","C:\jay\NZ Business Roundtable_Download_13Feb2013, 400 Access\1993","1993",,"Done","","Folder",,"2013-06-28T11:31:36","false",,"",,"","",""
... """
I don't have the file, so let's pretend I do:
>>> import StringIO
>>> logfile = StringIO.StringIO(log)
then let's build the query:
>>> import csv
>>> csvreader = csv.reader(logfile)
>>> fields = csvreader.next()
>>>
>>> table = 'mytable'
>>>
>>> fields_fmt = ', '.join([ '`%s`' % f for f in fields ])
>>> values_fmt = ', '.join(['%s'] * len(fields))
>>> query = "INSERT INTO `atl`.`{0}` ({1}) VALUES ({2})".format(
... # self.table, fields_fmt, values_fmt)
... table, fields_fmt, values_fmt)
>>> query
'INSERT INTO `atl`.`mytable` (`ID`, `PARENT_ID`, `URI`, `FILE_PATH`, `NAME`, `METHOD`, `STATUS`, `SIZE`, `TYPE`, `EXT`, `LAST_MODIFIED`, `EXTENSION_MISMATCH`, `MD5_HASH`, `FORMAT_COUNT`, `PUID`, `MIME_TYPE`, `FORMAT_NAME`, `FORMAT_VERSION`) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)'
then if you massage ingest_row:
def ingest_row(self, row):
try:
self.cur.execute(query, row)
self.mydb.commit()
except:
print query
quit()
you can then import the data with:
for row in csvreader:
ingest_row(row)
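The reason csv.reader solves the original comma problem is that it honours the double quotes around each field. The sketch below, written for Python 3 (io.StringIO instead of the Python 2 StringIO module), uses a shortened three-column version of the log; the shortening is only for readability.

```python
import csv
import io

# Shortened sample: three of the original columns, same quoting style.
log = (
    '"ID","FILE_PATH","NAME"\n'
    r'"1","C:\jay\NZ Business Roundtable_Download_13Feb2013, 400 Access",'
    r'"NZ Business Roundtable_Download_13Feb2013, 400 Access"' "\n"
)

# Naive splitting treats the commas inside the quotes as separators.
naive_fields = log.splitlines()[1].split(",")

# csv.reader keeps each quoted field intact and strips the quotes.
rows = list(csv.reader(io.StringIO(log)))
header, first = rows[0], rows[1]

print(len(naive_fields), len(first))
print(first[1])
```

Each parsed row can then be handed to cursor.execute together with the parameterized INSERT statement, without any manual sanitizing.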
Never use string formatting, concatenation etc. to build a sql query!
The DB-API requires all drivers to support parameterized queries; parameters should be supplied to the execute method of the cursor. For MySQLdb, which supports format-style parameterisation, it would look like:
cursor.execute('insert into sometable values (%s, %s)', ('spam', 'eggs'))
The supplied parameters are correctly escaped by the library, so it won't matter if your strings contain characters that must be escaped.
The only exception in your special case would be the table name, as escaping that would produce illegal sql.

Inserting Variables MySQL Using Python, Not Working

I want to insert the variable bob, and dummyVar into my table, logger. Now from what I can tell all I should need to do is, well what I have below, however this doesn't insert anything into my table at all. If I hard-code what should be written (using 'example' then it writes example to the table, so my connection and syntax for inserting is correct to this point). Any help would be more than appreciated!
conn = mysql.connector.connect(user='username', password='password!',
host='Host', database='database')
cursor = conn.cursor()
bob = "THIS IS AN EXAMPLE"
dummyVar = "Variable Test"
loggit = ("""
INSERT INTO logger (logged_info, dummy)
VALUES
(%s, %s)
""", (bob, dummyVar))
cursor.execute(loggit)
conn.commit()
I have also tried this:
loggit = ("""
INSERT INTO logger (logged_info, dummy)
VALUES
(%(bob)s,(Hello))
""", (bob))
and:
bob = "THIS IS AN EXAMPLE"
dummyVar = "Variable Test"
loggit = ("""
INSERT INTO logger (logged_info, dummy)
VALUES
(%s, %s)
""", (bob, dummyVar))
cursor.execute(loggit, (bob, dummyVar))
conn.commit()
cursor.execute(loggit, (bob, dummyVar))
conn.commit()
You need to pass the SQL statement and the parameters as separate arguments:
cursor.execute(loggit[0], loggit[1])
or use the variable argument syntax (a splat, *):
cursor.execute(*loggit)
Your version tries to pass in a tuple containing the SQL statement and bind parameters as the only argument, where the .execute() function expects to find just the SQL statement string.
It's more usual to keep the two separate and perhaps store just the SQL statement in a variable:
loggit = """
INSERT INTO logger (logged_info, dummy)
VALUES
(%s, %s)
"""
cursor.execute(loggit, (bob, dummyVar))
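The tuple-versus-separate-arguments distinction can be tried out without a MySQL server. The sketch below assumes Python 3 and uses the standard-library sqlite3 module as a stand-in, so the placeholders are '?' instead of '%s'; the exact exception raised for the tuple form differs between drivers, which is why several types are caught.

```python
import sqlite3

# In-memory database standing in for the real logger table (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE logger (logged_info TEXT, dummy TEXT)")

bob = "THIS IS AN EXAMPLE"
dummyVar = "Variable Test"

# Statement and parameters bundled in one tuple, as in the question.
loggit = ("INSERT INTO logger (logged_info, dummy) VALUES (?, ?)", (bob, dummyVar))

# Passing the tuple as the only argument is rejected: execute wants a string first.
try:
    cur.execute(loggit)
    tuple_accepted = True
except (TypeError, ValueError, sqlite3.Error):
    tuple_accepted = False

# Unpacking the tuple supplies the statement and the parameters separately.
cur.execute(*loggit)
conn.commit()

rows = cur.execute("SELECT logged_info, dummy FROM logger").fetchall()
print(tuple_accepted, rows)
```

Only the unpacked call inserts a row, which matches the behaviour described above.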
