I am trying to write a simple Python script to bulk add movie titles into a local database, using the MySQLdb (mysqlclient) package. I am reading the titles from a TSV file. But when I go to sanitize the inputs using MySQLdb.escape_string(), I get the character b before my string. I believe this means that SQL is interpreting it as a bit value, and when I go to execute my query I get the following error:
You have an error in your SQL syntax; check the manual that
corresponds to your MariaDB server version for the right syntax to use
near 'b'Bowery to Bagdad',1955)' at line 1"
The insert statement in question:
INSERT INTO movies (imdb_id, title, release_year) VALUES ('tt0044388',b'Bowery to Bagdad',1955)
import csv
import MySQLdb

def TSV_to_SQL(file_to_open):
    from MySQLdb import _mysql
    db=_mysql.connect(host='localhost', user='root', passwd='', db='tutorialdb', charset='utf8')
    q = """SELECT * FROM user_id"""
    # MySQLdb.escape_string()
    # db.query(q)
    # results = db.use_result()
    # print(results.fetch_row(maxrows=0, how=1))
    print("starting?")
    with open(file_to_open, encoding="utf8") as file:
        tsv = csv.reader(file, delimiter="\t")
        count = 0
        for line in tsv:
            if count == 10:
                break
            # print(MySQLdb.escape_string(line[1]))
            statement = "INSERT INTO movies (imdb_id, title, release_year) VALUES ('{imdb_id}',{title},{year})\n".format(
                imdb_id=line[0], title=MySQLdb.escape_string(line[1]), year=line[2])
            # db.query(statement)
            print(statement)
            count = count + 1
I know a simple solution would be to just remove the character b from the start of the string, but I was wondering if there was a more proper way, or if I missed something in documentation.
The 'b' in front of the string means that the value is a bytes object (binary data) rather than a text string (str).
If you use .decode() you will be able to get what you want.
How to convert 'binary string' to normal string in Python3?
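A small sketch of where the b comes from, using only built-in types. The bytes literal below is an assumed stand-in for what escape_string() returns under Python 3, and UTF-8 is assumed as the encoding:

```python
# Assumed stand-in for the value escape_string() returns under Python 3.
escaped = b"Bowery to Bagdad"

# Formatting bytes into a str embeds the repr, b prefix and quotes included.
as_repr = "VALUES ({title})".format(title=escaped)

# Decoding gives a str; the surrounding quotes must then be added by hand.
as_text = "VALUES ('{title}')".format(title=escaped.decode("utf-8"))

print(as_repr)
print(as_text)
```

The first line printed contains b'Bowery to Bagdad', which is what the server rejected; the second contains a plain quoted string.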
It's more common to let the connector perform the escaping automatically, by inserting placeholders in the SQL statement and passing a sequence (conventionally a tuple) of values as the second argument to cursor.execute.
conn = MySQLdb.connect(host='localhost', user='root', passwd='', db='tutorialdb', charset='utf8')
cursor = conn.cursor()
statement = """INSERT INTO movies (imdb_id, title, release_year) VALUES (%s, %s, %s)"""
cursor.execute(statement, (line[0], line[1], line[2]))
conn.commit()
The resulting code is more portable - apart from the connection it will work with all DB-API connectors*. Dropping down to low-level functions like _mysql.connect and escape_string is unusual in Python code (though you are perfectly free to code like this if you want, of course).
* Some connection packages may use a different placeholder instead of %s, but %s seems to be the favoured placeholder for MySQL connector packages.
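As a rough illustration of the placeholder approach applied to the whole TSV import, here is a runnable sketch. It swaps in the standard-library sqlite3 module for MySQLdb so that it runs without a server, which means the placeholder is ? rather than %s. The in-memory TSV text and the second row are made up for the example:

```python
import csv
import io
import sqlite3

# Hypothetical two-row TSV standing in for the real file.
tsv_text = "tt0044388\tBowery to Bagdad\t1955\ntt0000001\tIt's a Test\t1960\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (imdb_id TEXT, title TEXT, release_year INTEGER)")

# Keep the first three columns of every line as one parameter tuple.
rows = [tuple(line[:3]) for line in csv.reader(io.StringIO(tsv_text), delimiter="\t")]

# sqlite3 uses ? placeholders; MySQLdb would use %s in the same positions.
conn.executemany(
    "INSERT INTO movies (imdb_id, title, release_year) VALUES (?, ?, ?)", rows
)
conn.commit()

titles = [row[0] for row in conn.execute("SELECT title FROM movies ORDER BY imdb_id")]
print(titles)
```

The title containing an apostrophe goes in without any manual escaping, because the values are passed separately from the SQL text.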
I'm trying to transfer a user input from a python code to a table in postgresql
What I want to do is place an input() in this code and make its value go where the comment (#) is in the code.
conn = psycopg2.connect(
    host="localhost",
    database="Twitterzuil",
    user="postgres",
    password="")
cur = conn.cursor()
cur.execute("INSERT INTO Bericht2 (name) VALUES (#THIS IS WHERE I WANT THE INPUT TO GO)");
conn.commit()
I have no idea how. I'm really a beginner in all this, so any help is appreciated.
I believe what you are asking about is called string interpolation. Using f-style format, this might look like
new_name = "'bob'" # need single quotes for SQL strings
sql = f"INSERT INTO Bericht2 (name) VALUES ({new_name})" # => sql == "INSERT INTO Bericht2 (name) VALUES ('bob')"
cur.execute(sql)
Note the f at the start of the string. When you do this, expressions inside {} pairs get replaced with their Python values (tutorial). There are also string formatting approaches involving % substitution and the .format method on strings.
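Interpolating user input directly into SQL breaks as soon as the input contains a quote, so an alternative worth knowing is to pass the value as a query parameter. The sketch below uses the standard-library sqlite3 module as a stand-in so it runs without a PostgreSQL server; with psycopg2 the placeholder would be %s instead of ?. The fixed new_name value stands in for a call to input():

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Bericht2 (name TEXT)")

# Stand-in for new_name = input("Name: "); note the embedded quote.
new_name = "O'Brien"

# The value travels separately from the SQL text, so no manual quoting is needed.
# With psycopg2 the placeholder would be %s instead of ?.
cur.execute("INSERT INTO Bericht2 (name) VALUES (?)", (new_name,))
conn.commit()

stored = cur.execute("SELECT name FROM Bericht2").fetchone()[0]
print(stored)
```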
If you are doing anything beyond the basics you should look into using the SQLAlchemy package; here's the link to their insert api. Using SQLAlchemy will help reduce the risks that can come with manually constructing SQL queries.
Example from "Inserting Rows with SQLAlchemy"
from sqlalchemy import insert
stmt = insert(user_table).values(name='spongebob', fullname="Spongebob Squarepants")
with engine.connect() as conn:
    result = conn.execute(stmt)
    conn.commit()
I'm working on a bit of python code to run a query against a redshift (postgres) SQL database, and I'm running into an issue where I can't strip off the surrounding single quotes from a variable I'm passing to the query. I'm trying to drop a number of tables from a list. This is the basics of my code:
def func(table_list):
    drop_query = 'drop table if exists %s' #loaded from file
    table_name = table_list[0] #table_name = 'my_db.my_table'
    con=psycopg2.connect(dbname=DB, host=HOST, port=PORT, user=USER, password=PASS)
    cur=con.cursor()
    cur.execute(drop_query, (table_name, )) #this line is giving me trouble
    #cleanup statements for the connection

table_list = ['my_db.my_table']
When func() gets called, I get the following error:
syntax error at or near "'my_db.my_table'"
LINE 1: drop table if exists 'my_db.my_table...
^
Is there a way I can remove the surrounding single quotes from my list item?
For the time being, I've done it (what I think is) the wrong way and used string concatenation, but I know this is basically begging for SQL injection.
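This is not how psycopg2 parameters work. The %s in execute() is a placeholder for a value, not a Python string-formatting operator: psycopg2 quotes and escapes whatever you pass as a string literal to avoid SQL injection, which is why the table name arrives wrapped in single quotes. Placeholders cannot stand in for identifiers such as table names.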
You need to modify the query before it gets to the execute statement.
drop_query = 'drop table if exists {}'.format(table_name)
I warn you, however: do not allow these table names to be created by outside sources, or you risk SQL injection.
However, newer versions of psycopg2 (2.7 and later) provide the psycopg2.sql module, which can compose identifiers safely:
http://initd.org/psycopg/docs/sql.html#module-psycopg2.sql
from psycopg2 import sql
cur.execute(
    sql.SQL("insert into {} values (%s, %s)").format(sql.Identifier('my_table')),
    [10, 20]
)
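Where the psycopg2.sql module is not available, one hedged fallback is to validate the table name before formatting it into the statement. The sketch below is plain Python with no database connection; ALLOWED_TABLES and build_drop_statement are hypothetical names invented for this example, and the allowlist contents are assumptions:

```python
import re

# Hypothetical allowlist; in practice this might come from config or the catalog.
ALLOWED_TABLES = {"my_db.my_table", "my_db.other_table"}
_PART = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def build_drop_statement(qualified_name):
    """Return a DROP statement for an allowlisted, well-formed table name."""
    if qualified_name not in ALLOWED_TABLES:
        raise ValueError("table not allowed: %r" % qualified_name)
    parts = qualified_name.split(".")
    if not all(_PART.match(p) for p in parts):
        raise ValueError("malformed identifier: %r" % qualified_name)
    # Quote each part separately so the dot stays a schema separator.
    quoted = ".".join('"%s"' % p for p in parts)
    return "drop table if exists %s" % quoted

statement = build_drop_statement("my_db.my_table")
print(statement)
```

The resulting string can then be passed to cur.execute() with no parameters. Anything not on the allowlist raises ValueError before it reaches the database.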
So here is my problem: I am trying to select a specific value from a table by comparing it with a unicode string. The value is also unicode. I am using mysql.connector. The server settings are all utf8 oriented. When I run the following query, I get an empty list. When I run it without the WHERE Title like '%s' part, I get the full set of values, and they are properly displayed in the output. The same query works in the command line on the server. The value is there for sure. What am I missing?
conn = sql.connect(host='xxxxxxx', user='xxx', password='xxx', database='db', charset="utf8")
cur = conn.cursor()
townQuery = (u"""SELECT * FROM Towns WHERE Title like '%s' """)
tqd = (u"%" +u"Серов"+u"%")
cur.execute(townQuery, tqd)
for i in cur:
    print i
When you use the 2-argument form of cur.execute (thus passing the arguments, tqd, to the parametrized sql, townQuery), the DB adaptor will quote the arguments for you. Therefore, remove the single quotes from around the %s in townQuery:
townQuery = u"""SELECT * FROM Towns WHERE Title like %s"""
tqd = [u"%Серов%"]
cur.execute(townQuery, tqd)
Also note that the second argument, tqd, must be a sequence such as a list or tuple. The square brackets around u"%Серов%" make [u"%Серов%"] a list. Parentheses around u"%Серов%" do NOT make (u"%Серов%") a tuple, because Python evaluates the quantity in parentheses to a unicode string. To make it a tuple, add a comma before the closing parenthesis: (u"%Серов%",).
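Both points can be checked in a few lines. This is a Python 3 sketch (the question's code is Python 2) that uses the standard-library sqlite3 module as a stand-in for mysql.connector, so the placeholder is ? rather than %s; the table contents are made up:

```python
import sqlite3

not_a_tuple = ("%Серов%")    # parentheses only group: still a str
a_tuple = ("%Серов%",)       # trailing comma makes a one-element tuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Towns (Title TEXT)")
conn.executemany("INSERT INTO Towns VALUES (?)", [("Серов",), ("Москва",)])

# No quotes around the placeholder; the driver handles quoting.
# mysql.connector would use %s where sqlite3 uses ?.
matches = conn.execute(
    "SELECT Title FROM Towns WHERE Title LIKE ?", a_tuple
).fetchall()
print(type(not_a_tuple).__name__, type(a_tuple).__name__, matches)
```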
I cannot find a solution.
Can you help me with this question please?
dic={'username':u'\uc774\ud55c\ub098','userid':u'david007', 'nation':u'\ub300\ud55c\ubbfc\uad6d'}
c=MySQLdb.connect(host=ddb['host'],user=ddb['user'],passwd=ddb['passwd'],db=ddb['db'], use_unicode=True, charset="utf8")
s=c.cursor()
sql="INSERT INTO "+db+" "+col+" VALUES "+str(tuple(dic.values()))
s.execute(sql)
print sql
INSERT INTO user_tb (username, userid, nation) VALUES (u'\uc774\ud55c\ub098', u'david007', u'\ub300\ud55c\ubbfc\uad6d')
And the error is:
"You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''\\uc774\\ud55 ... at line 1")
You need to use a parametrised query:
sql = "INSERT INTO " + db + " " + col + " VALUES (%s, %s, %s)"
s.execute(sql, dic.values())
When you simply concatenate the tuple to your query, the u prefix of the unicode strings makes those strings invalid SQL. With parameters, MySQLdb will do the right thing with the parameter replacement (i.e. encoding the unicode strings to a byte representation) and generate valid SQL.
Anyway as a general principle you should always use parameters in your queries to prevent SQL injections.
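One detail to watch when the values come from a dictionary is that they must line up with the column list. A minimal sketch, assuming Python 3 and using the standard-library sqlite3 module in place of MySQLdb (so ? instead of %s), derives both from one explicit list of column names:

```python
import sqlite3

dic = {"username": "\uc774\ud55c\ub098", "userid": "david007",
       "nation": "\ub300\ud55c\ubbfc\uad6d"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_tb (username TEXT, userid TEXT, nation TEXT)")

# Fix the column order once and derive both the column list and the values from it.
# Column names come from our own code here, never from user input.
columns = ["username", "userid", "nation"]
placeholders = ", ".join("?" for _ in columns)   # MySQLdb would use %s
sql = "INSERT INTO user_tb (%s) VALUES (%s)" % (", ".join(columns), placeholders)

conn.execute(sql, tuple(dic[c] for c in columns))
conn.commit()

row = conn.execute("SELECT username, userid, nation FROM user_tb").fetchone()
print(row)
```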
When using prepared statements with named parameters in SQLite (specifically with the python sqlite3 module http://docs.python.org/library/sqlite3.html ), is there any way to include string values without getting quotes put around them?
I've got this :
columnName = '''C1'''
cur = cur.execute('''SELECT DISTINCT(:colName) FROM T1''', {'colName': columnName})
And it seems the SQL I end up with is this :
SELECT DISTINCT('C1') FROM T1
which isn't much use of course, what I really want is :
SELECT DISTINCT(C1) FROM T1 .
Is there any way to prompt the execute method to interpret the supplied arguments in such a way that it doesn't wrap quotes around them ?
I've written a little test program to explore this fully so for what it's worth here it is :
import sys
import sqlite3

def getDatabaseConnection():
    DEFAULTDBPATH = ':memory:'
    conn = sqlite3.connect(DEFAULTDBPATH, detect_types=sqlite3.PARSE_DECLTYPES|sqlite3.PARSE_COLNAMES)
    conn.text_factory = str
    return conn

def initializeDBTables(conn):
    conn.execute('''
    CREATE TABLE T1(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    C1 STRING);''')
    cur = conn.cursor()
    cur.row_factory = sqlite3.Row # fields by name
    for v in ['A','A','A','B','B','C']:
        cur.execute('''INSERT INTO T1 values (NULL, ?)''', v)
    columnName = '''C1'''
    cur = cur.execute('''SELECT DISTINCT(:colName) FROM T1''', {'colName': columnName})
    #Should end up with three output rows, in
    #fact we end up with one
    for row in cur:
        print row

def main():
    conn = getDatabaseConnection()
    initializeDBTables(conn)

if __name__ == '__main__':
    main()
I would be interested to hear of any way of manipulating the execute method to allow this to work.
In SELECT DISTINCT(C1) FROM T1 the C1 is not a string value, it is a piece of SQL code. The parameters (escaped in execute) are used to insert values, not pieces of code.
You are using bindings, and bindings can only be used for values, not for table or column names. You will have to use string interpolation/formatting to get the effect you want, but it does leave you open to SQL injection attacks if the column name comes from an untrusted source. In that case you can sanitize the string (e.g. only allow alphanumerics) and use the authorizer interface to check that no unexpected activity will happen.
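A minimal sketch of that advice, assuming Python 3 and the table from the question. The ALLOWED_COLUMNS set and the distinct_values helper are hypothetical names introduced for illustration; the column name is checked against the allowlist before being formatted into the SQL, while row values still go through bindings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (id INTEGER PRIMARY KEY AUTOINCREMENT, C1 STRING)")
conn.executemany("INSERT INTO T1 VALUES (NULL, ?)", [(v,) for v in "AAABBC"])

ALLOWED_COLUMNS = {"C1"}   # hypothetical allowlist of selectable columns

def distinct_values(conn, column_name):
    # Column names cannot be bound, so check the name before formatting it in.
    if column_name not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column: %r" % column_name)
    query = 'SELECT DISTINCT "%s" FROM T1 ORDER BY 1' % column_name
    return [row[0] for row in conn.execute(query)]

values = distinct_values(conn, "C1")
print(values)
```

With the column name interpolated as an identifier, the query returns the three distinct values the question expected rather than a single row.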