I am using Python and trying to update rows using:
db=sqlite3.connect('db')
cursor=db.execute('select * from infos where is_processed=0')
films=cursor.fetchall()
cursor.close()
db.close()
for film in films:
    inputLayer=np.array([film[1],film[2],film[3]],dtype=float)
    name=film[0]
    #print inputLayer
    NeuralNetwork3.nn(inputLayer,film[4])
    sql="update infos set is_processed=1 where file_name='"+name+"'"
    db = sqlite3.connect('db')
    db.execute(sql)
    db.commit()
    db.close()
I get: sqlite3.OperationalError: near "t": syntax error. What is wrong?
Note that the traceback points at the line "db.execute(sql)" as the source of the error.
Suppose name contains a single quote followed by a t, as in
name = "don't look now"
sql = "update foo set is_processed=1 where bar='"+name+"'"
Then sql would equal
In [156]: sql
Out[156]: "update foo set is_processed=1 where bar='don't look now'"
and sqlite3 will read the condition as where bar='don', followed by the unparseable remainder t look now'. sqlite3 then raises
sqlite3.OperationalError: near "t": syntax error
This is an example of why you should always use parametrized SQL. To avoid this problem (and protect your code from SQL injection attacks), pass the SQL with placeholders as the first argument to cursor.execute and a sequence (or, depending on the paramstyle, a mapping) of values as the second argument:
sql = "update foo set is_processed=1 where bar=?"
cursor.execute(sql, [name])
When you pass arguments (such as [name]) as the second argument to
cursor.execute, sqlite3 binds the value for you, so the single quote never needs to be escaped.
Per the Python Database API, when you pass parameters as the second argument to cursor.execute (my emphasis):
The module will use the __getitem__ method of the parameters object to map
either positions (integers) or names (strings) to parameter values. This
allows for both sequences and mappings to be used as input.
The term bound refers to the process of binding an input value to a database
execution buffer. In practical terms, this means that the input value is
directly used as a value in the operation. The client should not be required
to "escape" the value so that it can be used — the value should be equal to
the actual database value.
Here is a runnable toy example to help see the problem and how it is avoided using parametrized SQL:
import sqlite3
with sqlite3.connect(':memory:') as conn:
    cursor = conn.cursor()
    cursor.execute('''CREATE TABLE foo
                      (id INTEGER PRIMARY KEY AUTOINCREMENT,
                       bar TEXT,
                       is_processed BOOL)''')
    name = "don't look now"
    sql = "update foo set is_processed=1 where bar='"+name+"'"
    print(sql)
    cursor.execute(sql)
    # comment out `cursor.execute(sql)` above and compare with
    # sql = "update foo set is_processed=1 where bar=?"
    # cursor.execute(sql, [name])
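Applied to the loop in the question, the same idea might look like the sketch below. It assumes an infos table with the file_name and is_processed columns used in the question. The in-memory database, the sample rows, and the mark_processed helper are illustrative only, and the NeuralNetwork3.nn call is left out.

```python
import sqlite3


def mark_processed(conn, file_name):
    """Flag one row as processed, binding file_name as a parameter."""
    conn.execute(
        "UPDATE infos SET is_processed = 1 WHERE file_name = ?",
        (file_name,),
    )


# Illustrative in-memory data; the real script would connect to 'db'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE infos (file_name TEXT, is_processed INTEGER)")
conn.executemany(
    "INSERT INTO infos VALUES (?, 0)",
    [("don't look now",), ("alien",)],
)

films = conn.execute(
    "SELECT file_name FROM infos WHERE is_processed = 0"
).fetchall()
for (name,) in films:
    # ... process the film here ...
    mark_processed(conn, name)
conn.commit()

remaining = conn.execute(
    "SELECT COUNT(*) FROM infos WHERE is_processed = 0"
).fetchone()[0]
print(remaining)
conn.close()
```

Keeping one connection open for the whole loop and committing once at the end also avoids reconnecting for every row.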
Related
I am trying to write a simple Python script to bulk add movie titles into a local database, using the MySQLdb (mysqlclient) package. I am reading the titles from a TSV file. But when I go to sanitize the inputs using MySQLdb::escape_string(), I get the character b before my string. I believe this means that SQL is interpreting it as a bit value, and when I go to execute my query I get the following error:
You have an error in your SQL syntax; check the manual that
corresponds to your MariaDB server version for the right syntax to use
near 'b'Bowery to Bagdad',1955)' at line 1"
The statement in question:
INSERT INTO movies (imdb_id, title, release_year) VALUES ('tt0044388',b'Bowery to Bagdad',1955)
import csv
import MySQLdb
def TSV_to_SQL(file_to_open):
    from MySQLdb import _mysql
    db=_mysql.connect(host='localhost', user='root', passwd='', db='tutorialdb', charset='utf8')
    q = """SELECT * FROM user_id"""
    # MySQLdb.escape_string()
    # db.query(q)
    # results = db.use_result()
    # print(results.fetch_row(maxrows=0, how=1))
    print("starting?")
    with open(file_to_open, encoding="utf8") as file:
        tsv = csv.reader(file, delimiter="\t")
        count = 0
        for line in tsv:
            if count == 10:
                break
            # print(MySQLdb.escape_string(line[1]))
            statement = "INSERT INTO movies (imdb_id, title, release_year) VALUES ('{imdb_id}',{title},{year})\n".format(
                imdb_id=line[0], title=MySQLdb.escape_string(line[1]), year=line[2])
            # db.query(statement)
            print(statement)
            count = count + 1
I know a simple solution would be to just remove the character b from the start of the string, but I was wondering if there was a more proper way, or if I missed something in documentation.
The 'b' in front of the string indicates that the value is a bytes object rather than a text string (str).
If you call .decode() on it (for example .decode('utf-8')) you will get a str back.
See: How to convert 'binary string' to normal string in Python3?
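A small sketch of where the b comes from. The bytes literal below is a hand-written stand-in for the value escape_string returns, which is an assumption made so the example runs without a MySQL installation.

```python
escaped = b"Bowery to Bagdad"  # stand-in for the bytes escape_string returns

# Formatting a bytes object into a str embeds its repr, b prefix included.
broken = "VALUES ({title})".format(title=escaped)

# Decoding first yields a plain str; the quotes must then be added by hand.
title = escaped.decode("utf-8")
fixed = "VALUES ('{title}')".format(title=title)

print(broken)
print(fixed)
```

This only explains the symptom. Building the statement by hand still leaves the quoting to you, which is why placeholders are usually the better route.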
It's more common to let the connector perform the escaping automatically, by inserting placeholders in the SQL statement and passing a sequence (conventionally a tuple) of values as the second argument to cursor.execute.
conn = MySQLdb.connect(host='localhost', user='root', passwd='', db='tutorialdb', charset='utf8')
cursor = conn.cursor()
statement = """INSERT INTO movies (imdb_id, title, release_year) VALUES (%s, %s, %s)"""
cursor.execute(statement, (line[0], line[1], line[2]))
conn.commit()
The resulting code is more portable - apart from the connection it will work with all DB-API connectors*. Dropping down to low-level functions like _mysql.connect and escape_string is unusual in Python code (though you are perfectly free to code like this if you want, of course).
* Some connection packages may use a different placeholder instead of %s, but %s seems to be the favoured placeholder for MySQL connector packages.
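For the bulk load in the question, the rows can be handed to executemany in one call. The sketch below uses the standard-library sqlite3 module as a stand-in so that it runs without a MySQL server, which is an assumption on my part; sqlite3 uses ? as its placeholder where MySQLdb would use %s. The sample TSV text (including the second row and its id) and the load_movies helper are made up for illustration.

```python
import csv
import io
import sqlite3

# Made-up sample data; the second row exists only to exercise a single quote.
SAMPLE_TSV = (
    "tt0044388\tBowery to Bagdad\t1955\n"
    "tt0000001\tDon't Look Now\t1973\n"
)


def load_movies(conn, tsv_file, limit=10):
    """Insert up to `limit` rows from a TSV file object using placeholders."""
    reader = csv.reader(tsv_file, delimiter="\t")
    rows = [(r[0], r[1], int(r[2])) for _, r in zip(range(limit), reader)]
    conn.executemany(
        "INSERT INTO movies (imdb_id, title, release_year) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (imdb_id TEXT, title TEXT, release_year INTEGER)"
)
inserted = load_movies(conn, io.StringIO(SAMPLE_TSV))
titles = [t for (t,) in conn.execute("SELECT title FROM movies ORDER BY imdb_id")]
print(inserted, titles)
```

No escaping call appears anywhere: the title with the apostrophe is stored as is.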
I'm working on a bit of Python code to run a query against a Redshift (Postgres) SQL database, and I'm running into an issue where I can't strip off the surrounding single quotes from a variable I'm passing to the query. I'm trying to drop a number of tables from a list. This is the basics of my code:
def func(table_list):
    drop_query = 'drop table if exists %s' #loaded from file
    table_name = table_list[0] #table_name = 'my_db.my_table'
    con=psycopg2.connect(dbname=DB, host=HOST, port=PORT, user=USER, password=PASS)
    cur=con.cursor()
    cur.execute(drop_query, (table_name, )) #this line is giving me trouble
    #cleanup statements for the connection
table_list = ['my_db.my_table']
When func() gets called, I am given the following error:
syntax error at or near "'my_db.my_table'"
LINE 1: drop table if exists 'my_db.my_table...
^
Is there a way I can remove the surrounding single quotes from my list item?
For the time being, I've done it (what I think is) the wrong way and used string concatenation, but I know this is basically begging for SQL injection.
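This is not how psycopg2 parameters work. The %s in a query passed to execute is a placeholder for a value, not a Python string-formatting operator: psycopg2 adapts the argument to a quoted SQL string literal, which protects against SQL injection but also means the table name arrives wrapped in single quotes.
A table name is an identifier, not a value, so you need to put it into the query before it gets to the execute call.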
drop_query = 'drop table if exists {}'.format(table_name)
I warn you, however: do not allow these table names to be created by outside sources, or you risk SQL injection.
However, newer versions of psycopg2 (2.7 and later) provide the psycopg2.sql module, which can compose identifiers into a query safely:
http://initd.org/psycopg/docs/sql.html#module-psycopg2.sql
from psycopg2 import sql
cur.execute(
sql.SQL("insert into {} values (%s, %s)").format(sql.Identifier('my_table')),[10, 20]
)
I am querying an MS Access database from Python using the pyodbc module. I am able to do this if I query all records in a table, but when I add a where clause, I am getting an error.
This is my code:
wpc_ids = ['WPCMOOTEST2', 'WPCMOOTEST1']
conn = pyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=P:\Conservation Programs\Natural Heritage Program\Data Management\ACCESS databases\POND_entry\POND_be.accdb;')
cursor = conn.cursor()
wpc_list = ','.join(str(x) for x in wpc_ids)
cursor.execute('SELECT * FROM pools WHERE wpc_id IN (%s)'%wpc_list)
I am getting the following error:
Error: ('07002', u'[07002] [Microsoft][ODBC Microsoft Access Driver] Too few parameters. Expected 2. (-3010) (SQLExecDirectW)')
I don't get that error without the where clause, so I'm not sure what the second parameter is that I need. Can anyone help with this?
cursor.execute(
'SELECT * FROM pools WHERE wpc_id IN ({})'.format(
','.join('?'*len(wpc_ids))), wpc_ids
)
Explanation:
There is a PEP about databases, PEP249, you can read it here https://www.python.org/dev/peps/pep-0249/
This PEP defines what the API of a database module should look like. pyodbc is the database module you're using, and it follows PEP 249.
One of the things the PEP defines is that each module should have a paramstyle. pyodbc.paramstyle is qmark so that is why you use '?' with pyodbc. More details https://www.python.org/dev/peps/pep-0249/#paramstyle
Now, instead of building a query as a string and sending it to the database, the idea is to use parameter passing, which is a way to send the query and the parameters separately... It uses the paramstyle to put placeholders in the query, then you pass a sequence of parameters as a second parameter to execute. Example:
sql = 'SELECT * FROM foo WHERE id = ? AND text_col = ?'
params = (12, 'testing')
cursor.execute(sql, params)
Note that this is not mixing the params with the string. The code is passing them as two separate arguments to .execute(). That means it will be the database's job to do the interpolation safely.
Since you want to pass multiple values to the query, you must generate a string of comma-separated placeholders, one for each element in the list:
','.join('?'*len(wpc_ids))
# generates one ? per list element, e.g. ?,? for the two ids above
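Here is a runnable sketch of the placeholder-list technique. It uses the standard-library sqlite3 module in place of pyodbc and Access (an assumption made so it runs anywhere); both modules use the qmark paramstyle, so the query text is the same. The pools rows are made up.

```python
import sqlite3

print(sqlite3.paramstyle)  # the DB-API attribute naming the placeholder style

wpc_ids = ["WPCMOOTEST2", "WPCMOOTEST1"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pools (wpc_id TEXT)")
conn.executemany(
    "INSERT INTO pools VALUES (?)",
    [("WPCMOOTEST1",), ("WPCMOOTEST2",), ("OTHER",)],
)

# One "?" per value, joined by commas.
placeholders = ",".join("?" * len(wpc_ids))
sql = "SELECT wpc_id FROM pools WHERE wpc_id IN ({})".format(placeholders)
found = sorted(row[0] for row in conn.execute(sql, wpc_ids))
print(sql)
print(found)
```

Only the placeholders are formatted into the SQL text; the values themselves still travel as the second argument. An empty list would produce IN (), which many databases reject, so that case is worth handling separately.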
For example, when I use cursor.execute() as documented:
>>> from django.db import connection
>>> cur = connection.cursor()
>>> cur.execute("DROP TABLE %s", ["my_table"])
django.db.utils.DatabaseError: near "?": syntax error
When Django's argument substitution is not used, the query works as expected:
>>> cur.execute("DROP TABLE my_table")
django.db.utils.DatabaseError: no such table: my_table
What am I doing wrong? How can I make parameterized queries work?
Notes:
Suffixing the query with ; does not help
As per the documentation, %s should be used, not SQLite's ? (Django translates %s to ?)
You cannot use parameters in SQL statements in place of identifiers (column or table names). You can only use them in place of single values.
Instead, you must use dynamic SQL to construct the entire SQL string and send that, unparameterized, to the database (being extra careful to avoid injection if the table name originates outside your code).
You can't substitute metadata in parameterized queries.
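One way to be "extra careful" with a dynamic table name is to accept only names the database itself reports. The sketch below does this with plain sqlite3 rather than Django's connection (my assumption, to keep it self-contained); the drop_table helper is hypothetical.

```python
import sqlite3


def drop_table(conn, table_name):
    """Drop table_name only if it is a table that exists in this database."""
    existing = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table_name not in existing:
        raise ValueError("unknown table: {!r}".format(table_name))
    # The name matched one the database reported, so it can be interpolated.
    # Embedded double quotes are doubled, per standard SQL identifier quoting.
    quoted = '"' + table_name.replace('"', '""') + '"'
    conn.execute("DROP TABLE {}".format(quoted))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER)")
drop_table(conn, "my_table")

try:
    drop_table(conn, "my_table; DROP TABLE users")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

The table name is still formatted into the SQL string, as the answers say it must be; the check only narrows what can reach that point.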
What's the best way to make psycopg2 pass parameterized queries to PostgreSQL? I don't want to write my own escaping mechanisms or adapters and the psycopg2 source code and examples are difficult to read in a web browser.
If I need to switch to something like PyGreSQL or another python pg adapter, that's fine with me. I just want simple parameterization.
psycopg2 follows the rules for DB-API 2.0 (set down in PEP 249). That means you can call the execute method on your cursor object, use the pyformat binding style, and it will do the escaping for you. For example, the following should be safe (and work):
cursor.execute("SELECT * FROM student WHERE last_name = %(lname)s",
{"lname": "Robert'); DROP TABLE students;--"})
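To see that such a value really is treated as data, here is a sketch that round-trips it through a database. sqlite3's named style (:lname) stands in for psycopg2's %(lname)s so the example runs without a PostgreSQL server; that substitution, and the students table, are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (last_name TEXT)")

hostile = "Robert'); DROP TABLE students;--"
conn.execute(
    "INSERT INTO students (last_name) VALUES (:lname)",
    {"lname": hostile},
)

# The value comes back unchanged, and the table still exists.
rows = conn.execute(
    "SELECT last_name FROM students WHERE last_name = :lname",
    {"lname": hostile},
).fetchall()
print(rows)
```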
From the psycopg documentation
(http://initd.org/psycopg/docs/usage.html)
Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
The correct way to pass variables in a SQL command is using the second argument of the execute() method:
SQL = "INSERT INTO authors (name) VALUES (%s);" # Note: no quotes
data = ("O'Reilly", )
cur.execute(SQL, data) # Note: no % operator
Here are a few examples you might find helpful. Note that the placeholder is always %s (or %(name)s), even for numbers:
cursor.execute('SELECT * from table where id = %(some_id)s', {'some_id': 1234})
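Or you can dynamically build your query based on a dict of field name, value. Column names cannot be passed as parameters, so compose them with psycopg2.sql and pass only the values to execute:
from psycopg2 import sql
query = sql.SQL('INSERT INTO some_table ({}) VALUES ({})').format(
    sql.SQL(', ').join(map(sql.Identifier, my_dict)), sql.SQL(', ').join(sql.Placeholder() * len(my_dict)))
cursor.execute(query, list(my_dict.values()))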
Note: the fields must be defined in your code, not user input, otherwise you will be susceptible to SQL injection.
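A sketch of that note in practice, checking the dict's keys against a set of column names fixed in code before building the statement. It uses sqlite3 and ? placeholders so it runs standalone (an assumption; psycopg2 would use %s), and ALLOWED_COLUMNS, insert_row, and the table layout are hypothetical.

```python
import sqlite3

ALLOWED_COLUMNS = {"name", "note"}  # hypothetical schema, fixed in code


def insert_row(conn, values):
    """Build an INSERT from a dict, accepting only known column names."""
    columns = list(values)
    unknown = set(columns) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError("unexpected columns: {}".format(sorted(unknown)))
    sql = "INSERT INTO some_table ({}) VALUES ({})".format(
        ", ".join(columns),
        ", ".join("?" * len(columns)),
    )
    # Values are bound as parameters; only vetted names were formatted in.
    conn.execute(sql, [values[c] for c in columns])
    return sql


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (name TEXT, note TEXT)")
sql_text = insert_row(conn, {"name": "O'Reilly", "note": "first"})
stored = conn.execute("SELECT name, note FROM some_table").fetchall()
print(sql_text)
print(stored)
```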
I love the official docs about this:
https://www.psycopg.org/psycopg3/docs/basic/params.html