Alternating SQL queries - python

I'm looking for a way to implement alternating SQL queries - i.e. a function that allows me to filter entries based on different columns. Take the following example:
import sqlite3

el = [["a", "b", 1], ["a", "b", 3]]

def save_sql(foo):
    with sqlite3.connect("fn.db") as db:
        cur = db.cursor()
        cur.execute("CREATE TABLE IF NOT EXISTS et"
                    "(var1 VARCHAR, var2 VARCHAR, var3 INT)")
        cur.executemany("INSERT INTO et VALUES "
                        "(?,?,?)", foo)
        db.commit()

def load_sql(v1, v2, v3):
    with sqlite3.connect("fn.db") as db:
        cur = db.cursor()
        cur.execute("SELECT * FROM et WHERE var1=? AND var2=? AND var3=?",
                    (v1, v2, v3))
        return cur.fetchall()

save_sql(el)
Now if I were to use load_sql("a","b",1), it would work. But assume I want to only query for the first and third column, i.e. load_sql("a",None,1) (the None is just intended as a placeholder) or only the last column load_sql(None,None,5), this wouldn't work.
This could of course be done with if statements checking which variables were supplied in the function call, but in tables with more columns this might get messy.
Is there a good way to do this?

What if load_sql() accepted an arbitrary number of keyword arguments, where the keyword argument names correspond to column names? Something along these lines:
def load_sql(**values):
    with sqlite3.connect("fn.db") as db:
        cur = db.cursor()
        query = "SELECT * FROM et"
        conditions = [f"{column_name} = :{column_name}" for column_name in values]
        if conditions:
            query = query + " WHERE " + " AND ".join(conditions)
        cur.execute(query, values)
        return cur.fetchall()
Note that here we trust the keyword argument names to be valid, existing column names (and string-format them into the query), which could potentially be exploited as an SQL injection attack vector.
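One way to keep the convenient **values interface while closing that hole is to validate the keyword names against a whitelist before formatting them into the query. A minimal sketch, assuming a hard-coded ALLOWED_COLUMNS set matching the et table (the whitelist itself is hypothetical, not part of the original answer):

```python
import sqlite3

# Hypothetical whitelist of legal column names for the et table;
# only names found here may be interpolated into the SQL text.
ALLOWED_COLUMNS = {"var1", "var2", "var3"}

def load_sql(db, **values):
    unknown = set(values) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError("unknown column(s): %s" % ", ".join(sorted(unknown)))
    query = "SELECT * FROM et"
    if values:
        # Names are known-good; the values still go through named placeholders.
        query += " WHERE " + " AND ".join(
            "{0} = :{0}".format(name) for name in values)
    return db.execute(query, values).fetchall()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE et (var1 VARCHAR, var2 VARCHAR, var3 INT)")
db.executemany("INSERT INTO et VALUES (?,?,?)", [("a", "b", 1), ("a", "b", 3)])
print(load_sql(db, var1="a", var3=3))  # filter on only two of the three columns
```

Keyword arguments must be Python identifiers anyway, which already rules out most injection payloads, but the whitelist also catches plain wrong column names early.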
As a side note, I cannot help but think that this is a reinventing-the-wheel step towards an actual ORM. Look into PonyORM or Peewee, two lightweight abstraction layers between Python and a database.

It will inevitably get messy if you want your SQL statements to remain sanitized/safe, but as long as you control your function signature it can remain reasonably safe, e.g.:
def load_sql(var1, var2, var3):
    # Keep only the arguments that were actually supplied (not None).
    fields = dict(field for field in locals().items() if field[1] is not None)
    query = "SELECT * FROM et"
    if fields:  # if at least one field is not None
        query += " WHERE " + " AND ".join(k + "=?" for k in fields.keys())
    with sqlite3.connect("fn.db") as db:
        cur = db.cursor()
        cur.execute(query, tuple(fields.values()))
        return cur.fetchall()
You can replace the function signature with load_sql(**kwargs) and then use kwargs.items() instead of locals.items() so that you can pass arbitrary column names, but that can be very dangerous and is certainly not recommended.

Related

Sqlite3 / Python -- increment column by 1 using '?' and UPDATE, getting syntax error [duplicate]

I want to dynamically choose what table to use in a SQL query, but I just keep getting error however I am trying to format this. Also tried %s instead of ?.
Any suggestions?
group_food = (group, food)
group_food_new = (group, food, 1)

with con:
    cur = con.cursor()
    tmp = cur.execute("SELECT COUNT(Name) FROM (?) WHERE Name=?", group_food)
    if tmp == 0:
        cur.execute("INSERT INTO ? VALUES(?, ?)", group_food_new)
    else:
        times_before = cur.execute("SELECT Times FROM ? WHERE Name=?", group_food)
        group_food_update = (group, (times_before + 1), food)
        cur.execute("UPDATE ? SET Times=? WHERE Name=?", group_food_update)
You cannot use SQL parameters as placeholders for SQL object names; one of the reasons for using SQL parameters is to escape the value so that the database can never mistake the contents for a database object.
You'll have to interpolate the database object names separately; escape your identifiers by doubling any " double quote characters, and use
cur.execute('SELECT COUNT(Name) FROM "{}" WHERE Name=?'.format(group.replace('"', '""')), (food,))
and
cur.execute('INSERT INTO "{}" VALUES(?, ?)'.format(group.replace('"', '""')), (food, 1))
and
cur.execute('UPDATE "{}" SET Times=? WHERE Name=?'.format(group.replace('"', '""')),
(times_before + 1, food))
The ".." double quotes are there to properly demarcate an identifier, even if that identifier is also a valid keyword; any existing " characters in the name must be doubled; this also helps defuse SQL injection attempts.
However, if your object names are user-sourced, you'll have to do your own (stringent) validation on the object names to prevent SQL injection attacks here. Always validate them against existing objects in that case.
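For SQLite specifically, "validate them against existing objects" can be done by looking the user-supplied name up in the sqlite_master catalog before interpolating it. A minimal sketch, assuming a hypothetical fruit table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fruit (Name TEXT, Times INT)")

def validated_table(con, name):
    # Accept only names of tables that actually exist in this database.
    row = con.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
        (name,)).fetchone()
    if row is None:
        raise ValueError("no such table: %r" % name)
    return row[0]

group = validated_table(con, "fruit")  # user-supplied name, now known-good
con.execute('INSERT INTO "{}" VALUES (?, ?)'.format(group.replace('"', '""')),
            ("apple", 1))
```

The lookup itself uses an ordinary parameter, so the unvalidated name never touches the SQL text.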
You should really consider using a project like SQLAlchemy to generate your SQL instead; it can take care of validating object names and rigorously protect you from SQL injection risks. It can load your table definitions up front so it'll know what names are legal:
from sqlalchemy import create_engine, func, select, MetaData
engine = create_engine('sqlite:////path/to/database')
meta = MetaData()
meta.reflect(bind=engine)
conn = engine.connect()
group_table = meta.tables[group] # can only find existing tables
count_statement = select([func.count(group_table.c.Name)], group_table.c.Name == food)
count, = conn.execute(count_statement).fetchone()
if count:
    # etc.
What are the values of group and food? The guidelines say to write the question so that others can benefit from it; for that to be the case here, we need the values of group and food.
It seems you use Python string formatting instead of SQL parameters for table names, even though https://docs.python.org/3/library/sqlite3.html#module-sqlite3 says using string formatting is unsafe:
# Never do this -- insecure!
symbol = 'RHAT'
c.execute("SELECT * FROM stocks WHERE symbol = '%s'" % symbol)
# Do this instead
t = ('RHAT',)
c.execute('SELECT * FROM stocks WHERE symbol=?', t)
The Python 2 style of using %s as a placeholder and applying the % operator outside the string has largely been replaced in Python 3 by the .format() method, with the variables as arguments.
I've found building the SQL query up as a text string and then passing this string into the c.execute() function works.
querySelect = "SELECT * FROM " + str(your_table_variable)
queryWhere = " WHERE " + str(variableName) + " = " + str(variableValue)
query = querySelect + queryWhere
c.execute(query)
I don't know the security situation around it though (re injection) and I'm sure there are probably better ways of doing this.

How to safely bind Oracle column to ORDER BY to SQLAlchemy in a raw query?

I'm trying to execute a raw sql query and safely pass an order by/asc/desc based on user input. This is the back end for a paginated datagrid. I cannot for the life of me figure out how to do this safely. Parameters get converted to strings so Oracle can't execute the query. I can't find any examples of this anywhere on the internet. What is the best way to safely accomplish this? (I am not using the ORM, must be raw sql).
My workaround is just setting ASC/DESC to a variable that I set. This works fine and is safe. However, how do I bind a column name to the ORDER BY? Is that even possible? I can just whitelist a bunch of columns and do something similar as I do with the ASC/DESC. I was just curious if there's a way to bind it. Thanks.
@default.route('/api/barcodes/<sort_by>/<sort_dir>', methods=['GET'])
@json_enc
def fetch_barcodes(sort_by, sort_dir):
    #time.sleep(5)
    # Can't use sort_dir as a parameter, so assign to a variable to sanitize it
    ord_dir = "DESC" if sort_dir.lower() == 'desc' else 'ASC'
    records = []
    stmt = text("SELECT bb_request_id,bb_barcode,bs_status, "
                "TO_CHAR(bb_rec_cre_date, 'MM/DD/YYYY') AS bb_rec_cre_date "
                "FROM bars_barcodes,bars_status "
                "WHERE bs_status_id = bb_status_id "
                "ORDER BY :ord_by :ord_dir ")
    stmt = stmt.bindparams(ord_by=sort_by, ord_dir=ord_dir)
    rs = db.session.execute(stmt)
    records = [dict(zip(rs.keys(), row)) for row in rs]
DatabaseError: (cx_Oracle.DatabaseError) ORA-01036: illegal variable name/number
[SQL: "SELECT bb_request_id,bb_barcode,bs_status, TO_CHAR(bb_rec_cre_date, 'MM/DD/YYYY') AS bb_rec_cre_date FROM bars_barcodes,bars_status WHERE bs_status_id = bb_status_id ORDER BY :ord_by :ord_dir "] [parameters: {'ord_by': u'bb_rec_cre_date', 'ord_dir': 'ASC'}]
UPDATE Solution based on accepted answer:
def fetch_barcodes(sort_by, sort_dir, page, rows_per_page):
    ord_dir_func = desc if sort_dir.lower() == 'desc' else asc
    query_limit = int(rows_per_page)
    query_offset = (int(page) - 1) * query_limit
    stmt = select([column('bb_request_id'),
                   column('bb_barcode'),
                   column('bs_status'),
                   func.to_char(column('bb_rec_cre_date'),
                                'MM/DD/YYYY').label('bb_rec_cre_date')]).\
        select_from(table('bars_barcodes')).\
        select_from(table('bars_status')).\
        where(column('bs_status_id') == column('bb_status_id')).\
        order_by(ord_dir_func(column(sort_by))).\
        limit(query_limit).offset(query_offset)
    result = db.session.execute(stmt)
    records = [dict(row) for row in result]
    response = json_return()
    response.addRecords(records)
    #response.setTotal(len(records))
    response.setTotal(1001)
    response.setSuccess(True)
    response.addMessage("Records retrieved successfully. Limit: " + str(query_limit) +
                        ", Offset: " + str(query_offset) + " SQL: " + str(stmt))
    return response
You could use Core constructs such as table() and column() for this instead of raw SQL strings. That'd make your life easier in this regard:
from sqlalchemy import select, table, column, asc, desc, func

ord_dir = desc if sort_dir.lower() == 'desc' else asc

stmt = select([column('bb_request_id'),
               column('bb_barcode'),
               column('bs_status'),
               func.to_char(column('bb_rec_cre_date'),
                            'MM/DD/YYYY').label('bb_rec_cre_date')]).\
    select_from(table('bars_barcodes')).\
    select_from(table('bars_status')).\
    where(column('bs_status_id') == column('bb_status_id')).\
    order_by(ord_dir(column(sort_by)))
table() and column() represent the syntactic part of a full blown Table object with Columns and can be used in this fashion for escaping purposes:
The text handled by column() is assumed to be handled like the name of a database column; if the string contains mixed case, special characters, or matches a known reserved word on the target backend, the column expression will render using the quoting behavior determined by the backend.
Still, whitelisting might not be a bad idea.
Note that you don't need to manually zip() the row proxies in order to produce dictionaries. They act as mappings as is, and if you need dict() for serialization reasons or such, just do dict(row).
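If you do stick with raw SQL, the whitelist idea can be as small as the following sketch; the SORTABLE_COLUMNS set is a hypothetical whitelist built from the column names in the question:

```python
# Hypothetical whitelist of columns the grid is allowed to sort on.
SORTABLE_COLUMNS = {"bb_request_id", "bb_barcode", "bs_status", "bb_rec_cre_date"}

def order_by_clause(sort_by, sort_dir):
    if sort_by not in SORTABLE_COLUMNS:
        raise ValueError("cannot sort on %r" % sort_by)
    # Anything other than an explicit "desc" falls back to ascending order.
    direction = "DESC" if sort_dir.lower() == "desc" else "ASC"
    # Both parts are now known-good strings, so formatting them in is safe.
    return "ORDER BY {} {}".format(sort_by, direction)

print(order_by_clause("bb_rec_cre_date", "desc"))  # ORDER BY bb_rec_cre_date DESC
```

The returned fragment can be appended to the raw query text, with everything else still passed as bound parameters.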

Python Sqlite3: INSERT INTO table VALUE(dictionary goes here)

I would like to use a dictionary to insert values into a table, how would I do this?
import sqlite3

db = sqlite3.connect('local.db')
cur = db.cursor()
cur.execute('DROP TABLE IF EXISTS Media')
cur.execute('''CREATE TABLE IF NOT EXISTS Media(
    id INTEGER PRIMARY KEY, title TEXT,
    type TEXT, genre TEXT,
    onchapter INTEGER, chapters INTEGER,
    status TEXT
)''')

values = {'title': 'jack', 'type': None, 'genre': 'Action',
          'onchapter': None, 'chapters': 6, 'status': 'Ongoing'}

#What would I replace x with to allow a
#dictionary to connect to the values?
cur.execute('INSERT INTO Media VALUES (NULL, x)', values)

cur.execute('SELECT * FROM Media')
media = cur.fetchone()
print media
If you're trying to use a dict to specify both the column names and the values, you can't do that, at least not directly.
That's really inherent in SQL. If you don't specify the list of column names, you have to specify them in CREATE TABLE order—which you can't do with a dict, because a dict has no order. If you really wanted to, of course, you could use a collections.OrderedDict, make sure it's in the right order, and then just pass values.values(). But at that point, why not just have a list (or tuple) in the first place? If you're absolutely sure you've got all the values, in the right order, and you want to refer to them by order rather than by name, what you have is a list, not a dict.
And there's no way to bind column names (or table names, etc.) in SQL, just values.
You can, of course, generate the SQL statement dynamically. For example:
columns = ', '.join(values.keys())
placeholders = ', '.join('?' * len(values))
sql = 'INSERT INTO Media ({}) VALUES ({})'.format(columns, placeholders)
values = [int(x) if isinstance(x, bool) else x for x in values.values()]
cur.execute(sql, values)
However, this is almost always a bad idea. This really isn't much better than generating and execing dynamic Python code. And you've just lost all of the benefits of using placeholders in the first place—primarily protection from SQL injection attacks, but also less important things like faster compilation, better caching, etc. within the DB engine.
It's probably better to step back and look at this problem from a higher level. For example, maybe you didn't really want a static list of properties, but rather a name-value MediaProperties table? Or, alternatively, maybe you want some kind of document-based storage (whether that's a high-powered nosql system, or just a bunch of JSON or YAML objects stored in a shelve)?
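As a rough sketch of the name-value table idea (the MediaProperties layout here is hypothetical), each property becomes an ordinary row, so no dynamic SQL is needed at all:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE MediaProperties (media_id INTEGER, name TEXT, value TEXT)")

values = {'title': 'jack', 'genre': 'Action', 'chapters': 6, 'status': 'Ongoing'}
# Each property is one row, inserted with ordinary positional placeholders.
db.executemany("INSERT INTO MediaProperties VALUES (?, ?, ?)",
               [(1, name, str(v)) for name, v in values.items()])

title = db.execute(
    "SELECT value FROM MediaProperties WHERE media_id=? AND name=?",
    (1, 'title')).fetchone()[0]
print(title)
```

The trade-off is that every value is stored as text and typed queries get more awkward, which is why this layout suits sparse, variable attributes rather than a fixed schema.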
An alternative using named placeholders:
columns = ', '.join(my_dict.keys())
placeholders = ':'+', :'.join(my_dict.keys())
query = 'INSERT INTO my_table (%s) VALUES (%s)' % (columns, placeholders)
print query
cur.execute(query, my_dict)
con.commit()
There is a solution for using dictionaries. First, the SQL statement
INSERT INTO Media VALUES (NULL, 'x');
would not work, as it assumes you are referring to all columns, in the order they are defined in the CREATE TABLE statement, as abarnert stated. (See SQLite INSERT.)
Once you have fixed that by specifying the columns, you can use named placeholders to insert the data. The advantage of this is that it safely escapes key characters, so you do not have to worry. From the Python sqlite3 documentation:
values = {
    'title': 'jack', 'type': None, 'genre': 'Action',
    'onchapter': None, 'chapters': 6, 'status': 'Ongoing'
}
cur.execute(
    'INSERT INTO Media (title, type, genre, onchapter, chapters, status) '
    'VALUES (:title, :type, :genre, :onchapter, :chapters, :status);',
    values
)
You could use named parameters:
cur.execute('INSERT INTO Media VALUES (NULL, :title, :type, :genre, :onchapter, :chapters, :status)', values)
This still depends on the column order in the INSERT statement (those : are only used as keys in the values dict) but it at least gets away from having to order the values on the python side, plus you can have other things in values that are ignored here; if you're pulling what's in the dict apart to store it in multiple tables, that can be useful.
If you still want to avoid duplicating the names, you could extract them from an sqlite3.Row result object, or from cur.description, after doing a dummy query; it may be saner to keep them around in python form near wherever you do your CREATE TABLE.
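For illustration, here is a small sketch of pulling the column names out of cur.description after a dummy query (the Media layout is borrowed from the question):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Media (id INTEGER PRIMARY KEY, title TEXT, status TEXT)")

# LIMIT 0 returns no rows but still populates cur.description
# with one 7-tuple per column, whose first item is the column name.
cur = db.execute("SELECT * FROM Media LIMIT 0")
colnames = [d[0] for d in cur.description]
print(colnames)
```

The resulting list follows the declared column order, so it can drive both the column list and the placeholder count of a generated INSERT.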
Here's a more generic way with the benefit of escaping:
# One way. If keys can be corrupted, don't use.
sql = 'INSERT INTO demo ({}) VALUES ({})'.format(
    ','.join(my_dict.keys()),
    ','.join(['?'] * len(my_dict)))

# Another, better way. Hardcoded with your keys.
sql = 'INSERT INTO demo ({}) VALUES ({})'.format(
    ','.join(my_keys),
    ','.join(['?'] * len(my_dict)))

cur.execute(sql, tuple(my_dict.values()))
key_lst = ('status', 'title', 'chapters', 'onchapter', 'genre', 'type')
cur.execute('INSERT INTO Media (status,title,chapters,onchapter,genre,type) '
            'VALUES (?,?,?,?,?,?);', tuple(values[k] for k in key_lst))
Do your escaping right.
You probably also need a commit call in there someplace.
Super late to this, but figured I would add my own answer. Not an expert, but something I found that works.
There are issues with preserving order when using a dictionary, which other users have stated, but you could do the following:
import sqlite3

# We're going to use a list of dictionaries, since that's what I'm having to use in my problem
input_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 14, 'b': '', 'c': 43}]

for i in input_list:
    # I recommend putting this inside a function; that way, if it
    # evaluates to None at the end of the loop, you can exit without doing an insert
    if i:
        input_dict = i
    else:
        input_dict = None
        continue

# I am noting here that in my case, I know all columns will exist.
# If you're not sure, you'll have to get all possible columns first.
keylist = list(input_dict.keys())
vallist = list(input_dict.values())

query = ('INSERT INTO example (' + ','.join('[' + k + ']' for k in keylist) +
         ') VALUES (' + ','.join('?' for _ in vallist) + ')')
# Making sure to preserve insert order.
items_to_insert = list(tuple(x.get(k, '') for k in keylist) for x in input_list)

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.executemany(query, items_to_insert)
conn.commit()
dictionary = {'id':123, 'name': 'Abc', 'address':'xyz'}
query = "insert into table_name " + str(tuple(dictionary.keys())) + " values" + str(tuple(dictionary.values())) + ";"
cursor.execute(query)
query becomes
insert into table_name ('id', 'name', 'address') values(123, 'Abc', 'xyz');
I was having a similar problem, so I created a string first and then passed that string to the execute command. It does take longer to execute, but the mapping was perfect for me. Just a workaround:
create_string = "INSERT INTO datapath_rtg( Sr_no"
for key in record_tab:
    create_string = create_string + " ," + str(key)
create_string = create_string + ") VALUES(" + str(Sr_no)
for key in record_tab:
    create_string = create_string + " ," + str(record_tab[key])
create_string = create_string + ")"
cursor.execute(create_string)
By doing the above I ensured that if my dict (record_tab) doesn't contain a particular field, the script won't throw an error, and proper mapping can be done, which is why I used a dictionary in the first place.
I was having a similar problem and ended up with something not entirely unlike the following (note: this is the OP's code with bits changed so that it works in the way they requested):
import sqlite3

db = sqlite3.connect('local.db')
cur = db.cursor()
cur.execute('DROP TABLE IF EXISTS Media')
cur.execute('''CREATE TABLE IF NOT EXISTS Media(
    id INTEGER PRIMARY KEY, title TEXT,
    type TEXT, genre TEXT,
    onchapter INTEGER, chapters INTEGER,
    status TEXT
)''')

values = {'title': 'jack', 'type': None, 'genre': 'Action',
          'onchapter': None, 'chapters': 6, 'status': 'Ongoing'}

#What would I Replace x with to allow a
#dictionary to connect to the values?
#cur.execute('INSERT INTO Media VALUES (NULL, x)'), values)

# Added code.
cur.execute('SELECT * FROM Media')
colnames = [row[0] for row in cur.description]
new_list = [values[i] for i in colnames if i in values.keys()]

sql = "INSERT INTO Media VALUES ( NULL, "
qmarks = ', '.join('?' * len(values))
sql += qmarks + ")"
cur.execute(sql, new_list)
#db.commit() #<- Might be important.

cur.execute('SELECT * FROM Media')
media = cur.fetchone()
print(media)

Escaping dynamic sqlite query?

I'm currently building SQL queries depending on input from the user. An example how this is done can be seen here:
def generate_conditions(table_name, nameValues):
    sql = u""
    for field in nameValues:
        sql += u" AND {0}.{1}='{2}'".format(table_name, field, nameValues[field])
    return sql

search_query = u"SELECT * FROM Enheter e LEFT OUTER JOIN Handelser h ON e.Id == h.Enhet WHERE 1=1"
if "Enhet" in args:
    search_query += generate_conditions("e", args["Enhet"])
c.execute(search_query)
Since the SQL changes every time I cannot insert the values in the execute call which means that I should escape the strings manually. However, when I search everyone points to execute...
I'm also not that satisfied with how I generate the query, so if someone has any idea for another way that would be great also!
You have two options:
Switch to using SQLAlchemy; it'll make generating dynamic SQL a lot more pythonic and ensures proper quoting.
Since you cannot use parameters for table and column names, you'll still have to use string formatting to include these in the query. Your values on the other hand, should always be using SQL parameters, if only so the database can prepare the statement.
It's not advisable to just interpolate table and column names taken straight from user input, it's far too easy to inject arbitrary SQL statements that way. Verify the table and column names against a list of such names you accept instead.
So, to build on your example, I'd go in this direction:
tables = {
    'e': ('unit1', 'unit2', ...),  # tablename: tuple of column names
}

def generate_conditions(table_name, nameValues):
    if table_name not in tables:
        raise ValueError('No such table %r' % table_name)
    sql = u""
    params = []
    for field in nameValues:
        if field not in tables[table_name]:
            raise ValueError('No such column %r' % field)
        sql += u" AND {0}.{1}=?".format(table_name, field)
        params.append(nameValues[field])
    return sql, params

search_query = u"SELECT * FROM Enheter e LEFT OUTER JOIN Handelser h ON e.Id == h.Enhet WHERE 1=1"
search_params = []
if "Enhet" in args:
    sql, params = generate_conditions("e", args["Enhet"])
    search_query += sql
    search_params.extend(params)
c.execute(search_query, search_params)
