I am trying to build a dynamic MySQL UPDATE statement. The update fails if the string contains certain characters.
import mysql.connector as sql
import MySQLdb
#Values are taken from a wxGrid.
key_id = str("'") + str(self.GetCellValue(event.GetRow(),1)) + str("'") #Cell column with unique ID
target_col = str(self.GetColLabelValue(event.GetCol())) #Column being updated
key_col = str(self.GetColLabelValue(1)) #Unique ID column
nVal = str("'")+self.GetCellValue(event.GetRow(),event.GetCol()) + str("'") #Updated value
sql_update = f"""Update {tbl} set {target_col} = {nVal} where {key_col} = {key_id}"""
self.cursor.execute(sql_update)
My key column always contains email addresses or integers. If key_id = test@email.com, the update succeeds, but if key_id = t'est@email.com, it fails. How do I get around this?
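You can fix this by using query parameters. Stop concatenating strings into your SQL query. Use placeholders and then pass the values in a separate tuple or list argument to execute(). Also drop the quotes you add by hand around nVal and key_id; the driver quotes the values itself, so manual quotes would end up inside the stored data.
sql_update = f"""Update {tbl} set {target_col} = %s where {key_col} = %s"""
self.cursor.execute(sql_update, (nVal, key_id))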
Query parameters only work where you would use a literal value in your query, like a quoted string literal or a numeric literal.
You can't use query parameters for identifiers like the table name or column names. But I hope your identifiers are less likely to contain ' characters!
Likewise you cannot use query parameters for expressions or SQL keywords or lists of values e.g. for an IN() predicate. One query parameter = one scalar value.
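As a minimal sketch of the pattern above, the following uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server; sqlite3 uses ? as its placeholder where the MySQL drivers use %s. The table contacts, its columns, the ALLOWED_COLUMNS set and the helper update_cell are all hypothetical. Identifiers are checked against a fixed set before being formatted into the statement, and the values go through parameters with no hand-added quotes.

```python
import sqlite3

# Hypothetical whitelist of column names this grid is allowed to touch.
ALLOWED_COLUMNS = {"email", "name", "phone"}


def update_cell(conn, target_col, key_col, new_value, key_value):
    """Update one cell; identifiers are whitelisted, values are bound."""
    for col in (target_col, key_col):
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"Unexpected column name: {col!r}")
    # Only the vetted identifiers are formatted into the SQL text.
    sql = f"UPDATE contacts SET {target_col} = ? WHERE {key_col} = ?"
    # The raw values are passed separately, with no quotes added by hand.
    cur = conn.execute(sql, (new_value, key_value))
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (email TEXT PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO contacts VALUES (?, ?, ?)", ("t'est@email.com", "Old", "555"))

# Both the key and the new value contain a single quote.
changed = update_cell(conn, "name", "email", "O'Brien", "t'est@email.com")
row = conn.execute("SELECT name FROM contacts").fetchone()
```

The apostrophes never touch the SQL text, so there is nothing to escape.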
See also:
MySQL parameterized queries
https://dev.mysql.com/doc/connector-python/en/connector-python-example-cursor-transaction.html
Literally any other Python SQL tutorial.
Use execute function instead.
Not recommended: as a workaround for a single quote in the value, you can escape it just before running the query, e.g. key_id = key_id.replace("'", "\\'"). You might have to do that for every special character, such as %, _, and [.
Related
I'm trying to execute a raw sql query and safely pass an order by/asc/desc based on user input. This is the back end for a paginated datagrid. I cannot for the life of me figure out how to do this safely. Parameters get converted to strings so Oracle can't execute the query. I can't find any examples of this anywhere on the internet. What is the best way to safely accomplish this? (I am not using the ORM, must be raw sql).
My workaround is just setting ASC/DESC to a variable that I set. This works fine and is safe. However, how do I bind a column name to the ORDER BY? Is that even possible? I can just whitelist a bunch of columns and do something similar as I do with the ASC/DESC. I was just curious if there's a way to bind it. Thanks.
@default.route('/api/barcodes/<sort_by>/<sort_dir>', methods=['GET'])
@json_enc
def fetch_barcodes(sort_by, sort_dir):
    #time.sleep(5)
    # Can't use sort_dir as a parameter, so assign to variable to sanitize it
    ord_dir = "DESC" if sort_dir.lower() == 'desc' else 'ASC'
    records = []
    stmt = text("SELECT bb_request_id,bb_barcode,bs_status, "
                "TO_CHAR(bb_rec_cre_date, 'MM/DD/YYYY') AS bb_rec_cre_date "
                "FROM bars_barcodes,bars_status "
                "WHERE bs_status_id = bb_status_id "
                "ORDER BY :ord_by :ord_dir ")
    stmt = stmt.bindparams(ord_by=sort_by,ord_dir=ord_dir)
    rs = db.session.execute(stmt)
    records = [dict(zip(rs.keys(), row)) for row in rs]
DatabaseError: (cx_Oracle.DatabaseError) ORA-01036: illegal variable name/number
[SQL: "SELECT bb_request_id,bb_barcode,bs_status, TO_CHAR(bb_rec_cre_date, 'MM/DD/YYYY') AS bb_rec_cre_date FROM bars_barcodes,bars_status WHERE bs_status_id = bb_status_id ORDER BY :ord_by :ord_dir "] [parameters: {'ord_by': u'bb_rec_cre_date', 'ord_dir': 'ASC'}]
UPDATE Solution based on accepted answer:
def fetch_barcodes(sort_by, sort_dir, page, rows_per_page):
    ord_dir_func = desc if sort_dir.lower() == 'desc' else asc
    query_limit = int(rows_per_page)
    query_offset = (int(page) - 1) * query_limit
    stmt = select([column('bb_request_id'),
                   column('bb_barcode'),
                   column('bs_status'),
                   func.to_char(column('bb_rec_cre_date'), 'MM/DD/YYYY').label('bb_rec_cre_date')]).\
        select_from(table('bars_barcodes')).\
        select_from(table('bars_status')).\
        where(column('bs_status_id') == column('bb_status_id')).\
        order_by(ord_dir_func(column(sort_by))).\
        limit(query_limit).offset(query_offset)
    result = db.session.execute(stmt)
    records = [dict(row) for row in result]
    response = json_return()
    response.addRecords(records)
    #response.setTotal(len(records))
    response.setTotal(1001)
    response.setSuccess(True)
    response.addMessage("Records retrieved successfully. Limit: " + str(query_limit) + ", Offset: " + str(query_offset) + " SQL: " + str(stmt))
    return response
You could use Core constructs such as table() and column() for this instead of raw SQL strings. That'd make your life easier in this regard:
from sqlalchemy import select, table, column, asc, desc, func
# Pick the ordering function instead of binding ASC/DESC as a parameter
ord_dir = desc if sort_dir.lower() == 'desc' else asc
stmt = select([column('bb_request_id'),
               column('bb_barcode'),
               column('bs_status'),
               func.to_char(column('bb_rec_cre_date'),
                            'MM/DD/YYYY').label('bb_rec_cre_date')]).\
    select_from(table('bars_barcodes')).\
    select_from(table('bars_status')).\
    where(column('bs_status_id') == column('bb_status_id')).\
    order_by(ord_dir(column(sort_by)))
table() and column() represent the syntactic part of a full blown Table object with Columns and can be used in this fashion for escaping purposes:
The text handled by column() is assumed to be handled like the name of a database column; if the string contains mixed case, special characters, or matches a known reserved word on the target backend, the column expression will render using the quoting behavior determined by the backend.
Still, whitelisting might not be a bad idea.
Note that you don't need to manually zip() the row proxies in order to produce dictionaries. They act as mappings as is, and if you need dict() for serialization reasons or such, just do dict(row).
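If you stay with raw SQL, the whitelisting idea can be sketched roughly as below. The mapping SORTABLE and the helper build_order_by are hypothetical names, not part of SQLAlchemy. Only strings that come from the whitelist ever reach the SQL text, so no bind parameter is needed for the ORDER BY clause.

```python
# Hypothetical whitelist: user-facing sort keys mapped to real column names.
SORTABLE = {
    "bb_request_id": "bb_request_id",
    "bb_barcode": "bb_barcode",
    "bs_status": "bs_status",
    "bb_rec_cre_date": "bb_rec_cre_date",
}


def build_order_by(sort_by, sort_dir):
    """Return an ORDER BY clause built only from whitelisted pieces."""
    try:
        column_name = SORTABLE[sort_by]
    except KeyError:
        raise ValueError(f"Cannot sort by {sort_by!r}") from None
    # Anything other than 'desc' falls back to ascending order.
    direction = "DESC" if sort_dir.lower() == "desc" else "ASC"
    return f" ORDER BY {column_name} {direction}"


clause = build_order_by("bb_rec_cre_date", "desc")
```

The returned clause can be appended to the raw statement before it is wrapped in text(); an unknown sort key raises ValueError instead of reaching the database.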
I'm currently building SQL queries depending on input from the user. An example how this is done can be seen here:
def generate_conditions(table_name,nameValues):
    sql = u""
    for field in nameValues:
        sql += u" AND {0}.{1}='{2}'".format(table_name,field,nameValues[field])
    return sql

search_query = u"SELECT * FROM Enheter e LEFT OUTER JOIN Handelser h ON e.Id == h.Enhet WHERE 1=1"
if "Enhet" in args:
    search_query += generate_conditions("e",args["Enhet"])
c.execute(search_query)
Since the SQL changes every time I cannot insert the values in the execute call which means that I should escape the strings manually. However, when I search everyone points to execute...
I'm also not that satisfied with how I generate the query, so if someone has any idea for another way that would be great also!
You have two options:
Switch to using SQLAlchemy; it'll make generating dynamic SQL a lot more pythonic and ensures proper quoting.
Since you cannot use parameters for table and column names, you'll still have to use string formatting to include these in the query. Your values on the other hand, should always be using SQL parameters, if only so the database can prepare the statement.
It's not advisable to just interpolate table and column names taken straight from user input, it's far too easy to inject arbitrary SQL statements that way. Verify the table and column names against a list of such names you accept instead.
So, to build on your example, I'd go in this direction:
tables = {
    'e': ('unit1', 'unit2', ...),  # tablename: tuple of column names
}

def generate_conditions(table_name, nameValues):
    if table_name not in tables:
        raise ValueError('No such table %r' % table_name)
    sql = u""
    params = []
    for field in nameValues:
        if field not in tables[table_name]:
            raise ValueError('No such column %r' % field)
        sql += u" AND {0}.{1}=?".format(table_name, field)
        params.append(nameValues[field])
    return sql, params

search_query = u"SELECT * FROM Enheter e LEFT OUTER JOIN Handelser h ON e.Id == h.Enhet WHERE 1=1"
search_params = []
if "Enhet" in args:
    sql, params = generate_conditions("e",args["Enhet"])
    search_query += sql
    search_params.extend(params)
c.execute(search_query, search_params)
I want to add another condition to this WHERE clause:
stmt = 'SELECT account_id FROM asmithe.data_hash WHERE percent < {};'.format(threshold)
I have the variable juris which is a list. The value of account_id and juris are related in that when an account_id is created, it contains the substring of a juris.
I want to add to the query the condition that account_id must match any one of the juris elements. Normally I would just add ...AND account_id LIKE '{}%'".format(juris), but this doesn't work because juris is a list.
How do I add all elements of a list to the WHERE clause?
Use Regex with operator ~:
juris = ['2','7','8','3']
'select * from tbl where id ~ \'^({})\''.format('|'.join(juris))
which leads to this query:
select * from tbl where id ~ '^(2|7|8|3)'
This returns the rows whose id starts with any of 2, 7, 8 or 3.
If you want the id to start with 2783, use:
select * from tbl where id ~ '^2783'
and if the id should contain any of 2, 7, 8 or 3:
select * from tbl where id ~ '.*(2|7|8|3).*'
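To see how the three patterns differ, here is a small sketch that uses Python's re module as a stand-in for the PostgreSQL ~ operator. Both perform an unanchored search, but the two regex dialects are not identical, so treat this only as an illustration. The sample ids are made up, and the re.escape call (which makes each element match literally on the Python side) is an addition that is not part of the answer above.

```python
import re

juris = ["2", "7", "8", "3"]
ids = ["2001", "7345", "2783-A", "9128", "5555"]  # made-up sample ids

# Escape each element so it is matched literally, then join as alternatives.
alternatives = "|".join(re.escape(j) for j in juris)

starts_with_any = re.compile(f"^({alternatives})")  # like id ~ '^(2|7|8|3)'
starts_with_2783 = re.compile("^2783")              # like id ~ '^2783'
contains_any = re.compile(f"({alternatives})")      # like id ~ '.*(2|7|8|3).*'

prefix_matches = [i for i in ids if starts_with_any.search(i)]
exact_prefix_matches = [i for i in ids if starts_with_2783.search(i)]
contains_matches = [i for i in ids if contains_any.search(i)]
```

Here "9128" only shows up in contains_matches, because it contains 2 and 8 but starts with 9.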
Stop using string formatting with SQL. Right now. Understand?
OK now. There's a construct, ANY in SQL, that lets you take an operator and apply it to an array. psycopg2 supports passing a Python list as an SQL ARRAY[]. So in this case you can just
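curs.execute('SELECT account_id FROM asmithe.data_hash WHERE account_id LIKE ANY (%s)', (thelist,))
Note here that %s is the psycopg2 query-parameter placeholder. It's not actually a format specifier. The second argument is a tuple, the query parameters. The first (and only) parameter is the list. Since LIKE matches against the whole string, each element of the list needs its own trailing % wildcard (for example '2%') to act as a prefix match.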
There's also ALL, which works like ANY but is true only if all the matches are true, not just if one or more is true.
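With drivers that cannot adapt a Python list to an SQL array, a similar effect can be sketched by generating one placeholder per list element. This example uses the standard-library sqlite3 module (placeholder ?) rather than psycopg2 so that it runs anywhere. The table layout and sample rows are made up, while threshold and juris follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_hash (account_id TEXT, percent REAL)")
conn.executemany(
    "INSERT INTO data_hash VALUES (?, ?)",
    [("2-alpha", 0.1), ("7-beta", 0.9), ("9-gamma", 0.2), ("3-delta", 0.3)],
)

threshold = 0.5
juris = ["2", "7", "8", "3"]

# One "account_id LIKE ?" per list element; only placeholders are generated,
# the values themselves never enter the SQL text.
like_clauses = " OR ".join("account_id LIKE ?" for _ in juris)
sql = (
    "SELECT account_id FROM data_hash "
    f"WHERE percent < ? AND ({like_clauses}) ORDER BY account_id"
)
# Append the % wildcard to each value so it is treated as a prefix.
params = [threshold] + [j + "%" for j in juris]

rows = [r[0] for r in conn.execute(sql, params)]
```

An empty juris list would produce an empty pair of parentheses, which is a syntax error, so that case needs its own branch.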
I am hoping juris is a list of strings? If so, this might help:
myquery = ("SELECT account_id FROM asmithe.data_hash "
           "WHERE percent in (%s)" % ",".join(map(str,juris)))
See these links:
python list in sql query as parameter
How to select item matching Only IN List in sql server
String formatting operations
For some reason I am getting errors when using placeholders in select statements.
def get_id(table_name, id_name):
    db = sqlite3.connect('test_db')
    max_id = db.execute('''SELECT max(?) FROM ?''', (id_name, table_name)).fetchone()[0]
    if not max_id:
        primary_id = 0
    else:
        primary_id = max_id + 1
This functions returns this error:
File "test.py", line 77, in get_id
max_id = db.execute('''SELECT max(?) FROM ?''', (id_name, table_name)).fetchone()[0]
sqlite3.OperationalError: near "?": syntax error
You aren't able to use placeholders for column or table names. The placeholders are for values used to insert or retrieve data from the database. The library properly sanitizes them.
To do what you want, try something like this:
db.execute('''SELECT max({}) FROM {}'''.format(id_name, table_name)).fetchone()[0]
This will use string formatting to build your query. If you need to add a WHERE condition to this, you can still do that using parameters:
db.execute('''SELECT max({}) FROM {} WHERE ID = ?'''.format(id_name, table_name), (id_variable,)).fetchone()[0]
You're seeing this error because placeholders can only be used to substitute values, not column or table names.
In this case, you will have to use Python's string formatting, being very careful that the values don't contain SQL or special characters:
max_id = db.execute(
    'SELECT max(%s) FROM %s where foo > ?' % (id_name, table_name),
    (max_foo_value, ),
)
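One way to be careful is to check the names against the database's own schema before formatting them in. Below is a minimal sketch, assuming SQLite: the helpers quote_identifier and next_id are hypothetical names, and sqlite_master and PRAGMA table_info are SQLite-specific. Unlike the code in the question, it treats an existing maximum of 0 as a real value and returns 1.

```python
import sqlite3


def quote_identifier(name):
    """Quote an identifier: wrap in double quotes, doubling any inside."""
    return '"' + name.replace('"', '""') + '"'


def next_id(conn, table_name, id_name):
    """Return max(id_name) + 1 after verifying both names exist in the schema."""
    tables = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table_name not in tables:
        raise ValueError(f"No such table: {table_name!r}")
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute(
        f"PRAGMA table_info({quote_identifier(table_name)})")}
    if id_name not in columns:
        raise ValueError(f"No such column: {id_name!r}")
    sql = (f"SELECT max({quote_identifier(id_name)}) "
           f"FROM {quote_identifier(table_name)}")
    max_id = conn.execute(sql).fetchone()[0]
    return 0 if max_id is None else max_id + 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, name TEXT)")
first = next_id(conn, "users", "user_id")    # empty table
conn.execute("INSERT INTO users VALUES (?, ?)", (41, "Ann"))
second = next_id(conn, "users", "user_id")   # one row with user_id 41
```

A name that is not in the schema raises ValueError before any SQL containing it is built.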
I have a large SQLite database with a mix of text and lots of other columns, var1 ... var50. Most of these are numeric, though some are text based.
I am trying to extract data from the database, process it in python and write it back - I need to do this for all rows in the db.
So far, the below sort of works:
# get row using select and process
fields = (','.join(keys)) # "var1, var2, var3 ... var50"
results = ','.join([results[key] for key in keys]) # "value for var1, ... value for var50"
cur.execute('INSERT OR REPLACE INTO results (id, %s) VALUES (%s, %s);' %(fields, id, results))
This however, nulls the columns that I don't explicitly add back. I can fix this by re-writing the code, but this feels quite messy, as I would have to surround with quotes using string concatenation and rewrite data that was there to begin with (i.e. the columns I didn't change).
Apparently the way to run updates on rows is something like this:
update table set var1 = 4, var2 = 5, var3="some text" where id = 666;
Presumably the way for me would be to run map and add the = signs somehow (not sure how), but how would I quote all of the results appropriately (since I would have to quote the text fields, and they might contain quotes within them too)?
I'm a bit confused. Any pointers would be very helpful.
Thanks!
As others have stressed, use parametrized arguments. Here is an example of how you might construct the SQL statement when it has a variable number of keys:
sql = ('UPDATE results SET '
       + ', '.join(key + ' = ?' for key in keys)
       + ' WHERE id = ?')  # leading space keeps WHERE apart from the last '?'
args = [results[key] for key in keys] + [id]
cur.execute(sql, args)
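To check that this approach leaves the untouched columns alone (the problem with INSERT OR REPLACE), here is a small self-contained run against an in-memory SQLite database. The four-column table and the sample values are made up, and the column names in keys are assumed to come from trusted code rather than from user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, var1, var2, var3)")
cur.execute("INSERT INTO results VALUES (?, ?, ?, ?)", (666, 1, 2, "keep me"))

# Only the processed columns are listed; var3 is deliberately left out.
results = {"var1": 4, "var2": 'some "quoted" text'}
keys = list(results)
row_id = 666

# Build "var1 = ?, var2 = ?" from the keys, then bind values in the same order.
sql = ("UPDATE results SET "
       + ", ".join(key + " = ?" for key in keys)
       + " WHERE id = ?")
args = [results[key] for key in keys] + [row_id]
cur.execute(sql, args)

row = cur.execute(
    "SELECT var1, var2, var3 FROM results WHERE id = ?", (row_id,)
).fetchone()
```

The text value with embedded quotes is stored unchanged, and var3 keeps its old value because UPDATE only touches the listed columns.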
Use parameter substitution. It's more robust (and safer I think) than string formatting.
So you could do something like this (TABLE stands in for your table name):
query = 'UPDATE TABLE SET ' + ', '.join(str(f) + '=?' for f in fields) + ';'
Or alternatively
query = 'UPDATE TABLE SET %s;' % (', '.join(str(f) + '=?' for f in fields))
Or using new style formatting:
query = 'UPDATE TABLE SET {0};'.format(', '.join(str(f) + '=?' for f in fields))
So the complete program would look something like this (add a WHERE clause if only some rows should change):
vals = {'var1': 'foo', 'var2': 3, 'var24':999}
fields = list(vals.keys())
results = list(vals.values())  # same order as fields
query = 'UPDATE TABLE SET {0};'.format(', '.join(str(f) + '=?' for f in fields))
conn.execute(query, results)
And that should work - and I presume do what you want it to.
You don't have to care about things like quoting, and in fact you shouldn't. If you do it like this, it's not only more convenient but also takes care of the security issue known as SQL injection:
sql = "update table set var1=%s, var2=%s, var3=%s where id=666"
cursor.execute(sql, (4, 5, "some text"))
The key point here is that the SQL and the values in the second statement aren't separated by a "%" but by a ",". This is not string manipulation; you pass two arguments to the execute function: the actual SQL and the values. Each %s is replaced by a value from the value tuple. The database driver then knows how to take care of the individual types of the values. Note that Python's sqlite3 module uses ? as its placeholder rather than %s.
The insert statement can be rewritten the same way for its values. Field names, however, cannot be replaced that way (the first %s in your INSERT statement): placeholders only stand in for values, so the field list still has to be built with string formatting.
So, to come back to your overall problem, you can loop over your values and dynamically add ", var%d=%%s" % i for your i-th variable while adding the actual value to a list at the same time.