Insert NULL into a nullable row, using $$ in postgres - python

I was working on securing my application:
q = "insert into foo (name, descr, some_fk_int) values ($${}$$, $${}$$, $${}$$);".format(
    name.replace("$$",""),
    description.replace("$$",""),
    str(fkToAnotherTable).replace("$$","") if fkToAnotherTable is not None else 'NULL'
)
cur.execute(q)
I noticed that this tries to insert the string NULL into an integer column. To keep things uniform, I was processing every string this way, and converting every int to a string and processing it the same way (just covering my bases).
It was working when it was just:
.... ($$Tony$$,$$Bar$$, NULL)
but now that it is:
.....($$Tony$$, $$Bar$$, $$NULL$$)
it fails. I was hoping there was a way to handle this uniformly. I was doing this because, when doing lookups on indexes, I noticed that it accepted ints written as strings:
select * from foo where id = $$id$$
So I was applying this everywhere in the app that user input goes into the database. I could remove it from ints, but I thought it would be smart to keep it in case someone tried to pass a string into an int to see "what would happen".
I was thinking that the query would fail if the ID was an int and you did:
select * from foo where id = $$hello$$
but I was thinking of more interesting cases where someone would try SQL injection:
select * from foo where id = $$$$ or 1=1$$$$
where the injected string was: $$ or 1=1$$. I was thinking that I could resolve it by casting to a string and then replacing key characters. But while working on the inserts and selects, I noticed the NULL case and was trying to figure out how to get around it.
Maybe I should customise the query to do something like:
q = "insert into foo (name, description, fk) values ($${}$$, $${}$$, {});".format(
    name.replace("$$",""),
    description.replace("$$",""),
    "$${}$$".format(str(FK).replace("$$","")) if FK is not None else "NULL"
)
instead which only adds $$ if it is NOT None? Then it would work.
TLDR: Is there a proper way to pass 'NULL' into an int field and have it understood as null?
Edit: It seems like people think I should use parameterized queries, which psycopg2 supports through the second argument of the cursor's execute function.
I was not sure if it would solve SQL injection, but I was thinking of doing something as follows:
q = "select * from foo where name in ({}) and user_id = %s".format(
",".join(["%s" for d in my_list])
)
# creates: select * from foo where name in (%s, %s, %s, %s, %s) and user_id = %s;
args = [d for d in my_list] + [user_id]
# creates [1,2,3,4,5,7493]
cursor.execute(q, args)
as the website allows the second argument to be a sequence, list, or dict.
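A minimal sketch of the approach from the edit above. It relies on psycopg2's documented behaviour of adapting a Python None to SQL NULL when it is passed as a bound parameter, which is what makes the $$NULL$$ problem disappear. The helper name build_in_query is hypothetical, and the snippet only builds the query text and argument list; executing it needs a live connection, shown here only in comments.

```python
def build_in_query(names, user_id):
    """Return (query, args) with one %s placeholder per name."""
    if not names:
        raise ValueError("names must not be empty: 'IN ()' is invalid SQL")
    placeholders = ", ".join(["%s"] * len(names))
    query = "select * from foo where name in ({}) and user_id = %s".format(placeholders)
    args = list(names) + [user_id]
    return query, args


query, args = build_in_query(["a", "b", "c"], 7493)

# With a psycopg2 cursor (assumed to exist as `cur`):
#     cur.execute(query, args)
# The NULL case works the same way: pass None, not the string 'NULL'.
#     cur.execute(
#         "insert into foo (name, descr, some_fk_int) values (%s, %s, %s)",
#         (name, description, fk_or_none),
#     )
```

Only the placeholders are formatted into the SQL text; the user-supplied values never are, so no $$ quoting or character stripping is needed.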

Related

how to specify table name using %s [duplicate]

I have the following code, using psycopg2:
sql = 'select %s from %s where utctime > %s and utctime < %s order by utctime asc;'
data = (dataItems, voyage, dateRangeLower, dateRangeUpper)
rows = cur.mogrify(sql, data)
This outputs:
select 'waterTemp, airTemp, utctime' from 'ss2012_t02' where utctime > '2012-05-03T17:01:35+00:00'::timestamptz and utctime < '2012-05-01T17:01:35+00:00'::timestamptz order by utctime asc;
When I execute this, it falls over - this is understandable, as the quotes around the table name are illegal.
Is there a way to legally pass the table name as a parameter, or do I need to do a (explicitly warned against) string concatenation, ie:
voyage = 'ss2012_t02'
sql = 'select %s from ' + voyage + ' where utctime > %s and utctime < %s order by utctime asc;'
Cheers for any insights.
According to the official documentation:
If you need to generate dynamically an SQL query (for instance
choosing dynamically a table name) you can use the facilities
provided by the psycopg2.sql module.
The sql module is new in psycopg2 version 2.7. It has the following syntax:
from psycopg2 import sql
cur.execute(
    sql.SQL("insert into {table} values (%s, %s)")
        .format(table=sql.Identifier('my_table')),
    [10, 20])
More on: https://www.psycopg.org/docs/sql.html#module-usage
[Update 2017-03-24: AsIs should NOT be used to represent table or fields names, the new sql module should be used instead: https://stackoverflow.com/a/42980069/5285608 ]
Also, according to psycopg2 documentation:
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
Per this answer you can do it as so:
import psycopg2
from psycopg2.extensions import AsIs
#Create your connection and cursor...
cursor.execute("SELECT * FROM %(table)s", {"table": AsIs("my_awesome_table")})
The table name cannot be passed as a parameter, but everything else can. Thus, the table name should be hard coded in your app (Don't take inputs or use anything outside of the program as a name). The code you have should work for this.
On the slight chance that you have a legitimate reason to take an outside table name, make sure that you don't allow the user to directly input it. Perhaps an index could be passed to select a table, or the table name could be looked up in some other way. You are right to be wary of doing this, however. This works, because there are relatively few table names around. Find a way to validate the table name, and you should be fine.
It would be possible to do something like this, to see if the table name exists. This is a parameterised version. Just make sure that you do this and verify the output prior to running the SQL code. Part of the idea for this comes from this answer.
SELECT 1 FROM information_schema.tables WHERE table_schema = 'public' and table_name=%s LIMIT 1
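The same validation can also be done in application code when the set of legal table names is known in advance. A rough sketch under that assumption; ALLOWED_TABLES and checked_table are hypothetical names, not part of psycopg2, and the second table name is made up for illustration.

```python
ALLOWED_TABLES = frozenset({"ss2012_t02", "ss2012_t03"})  # hypothetical table names


def checked_table(name):
    """Return name unchanged if it is whitelisted, otherwise raise."""
    if name not in ALLOWED_TABLES:
        raise ValueError("unknown table: {!r}".format(name))
    return name


# Only the validated table name is formatted in; values stay as %s placeholders
# and are passed separately to execute().
sql = "select utctime from {} where utctime > %s and utctime < %s".format(
    checked_table("ss2012_t02")
)
```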
This is a workaround I have used in the past
query = "INSERT INTO %s (col_1, col_2) VALUES (%%s, %%s)" % table_name
cur.execute(query, (col_1_var, col_2_var))
Hope it helps :)
This is a small addition to @Antoine Dusséaux's answer. If you want to pass several (unquoted) identifiers in a SQL query, you can do it as follows:
query = sql.SQL("select {field} from {table} where {pkey} = %s").format(
    field=sql.Identifier('my_name'),
    table=sql.Identifier('some_table'),
    pkey=sql.Identifier('id'))
As per the documentation,
Usually you should express the template of your query as an SQL
instance with {}-style placeholders and use format() to merge the
variable parts into them, all of which must be Composable subclasses.
You can still have %s-style placeholders in your query and pass values
to execute(): such value placeholders will be untouched by format()
Source: https://www.psycopg.org/docs/sql.html#module-usage
Also, please keep this in mind while writing queries.
I have created a little utility for preprocessing of SQL statements with variable table (...) names:
from string import ascii_letters  # Python 3 name for the old string.letters

NAMECHARS = frozenset(set(ascii_letters).union('.'))

def replace_names(sql, **kwargs):
    """
    Preprocess an SQL statement: securely replace table ... names
    before handing the result over to the database adapter,
    which will take care of the values.
    There will be no quoting of names, because this would make them
    case sensitive; instead it is ensured that no dangerous chars
    are contained.
    >>> replace_names('SELECT * FROM %(table)s WHERE val=%(val)s;',
    ...               table='fozzie')
    'SELECT * FROM fozzie WHERE val=%(val)s;'
    """
    for v in kwargs.values():
        check_name(v)
    dic = SmartDict(kwargs)
    return sql % dic

def check_name(tablename):
    """
    Check the given name for being syntactically valid,
    and usable without quoting
    """
    if not isinstance(tablename, str):  # was basestring in Python 2
        raise TypeError('%r is not a string' % (tablename,))
    invalid = set(tablename).difference(NAMECHARS)
    if invalid:
        raise ValueError('Invalid chars: %s' % (tuple(invalid),))
    for s in tablename.split('.'):
        if not s:
            raise ValueError('Empty segment in %r' % tablename)

class SmartDict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            check_name(key)
            return key.join(('%(', ')s'))
The SmartDict object returns %(key)s for every unknown key, preserving them for the value handling. The function could check for the absence of any quote characters, since all quoting now should be taken care of ...
If you want to pass the table name as a parameter, you can use this wrapper:
class Literal(str):
    def __conform__(self, quote):
        return self

    @classmethod
    def mro(cls):
        return (object, )

    def getquoted(self):
        return str(self)
Usage: cursor.execute("CREATE TABLE %s ...", (Literal(name), ))
You can just use str.format for the column and table names and then use the regular parameterization for the execute:
xlist = (column, table)
sql = 'select {0} from {1} where utctime > %s and utctime < %s order by utctime asc;'.format(*xlist)
Keep in mind that if this is exposed to the end user, you will not be protected from SQL injection unless you validate the names yourself.
Surprised no one has mentioned doing this:
sql = 'select {} from {} where utctime > {} and utctime < {} order by utctime asc;'.format(dataItems, voyage, dateRangeLower, dateRangeUpper)
rows = cur.mogrify(sql)
format puts in the string without quotations.

Alternating SQL queries

I'm looking for a way to implement alternating SQL queries - i.e. a function that allows me to filter entries based on different columns. Take the following example:
import sqlite3

el=[["a","b",1],["a","b",3]]

def save_sql(foo):
    with sqlite3.connect("fn.db") as db:
        cur=db.cursor()
        cur.execute("CREATE TABLE IF NOT EXISTS et"
                    "(var1 VARCHAR, var2 VARCHAR, var3 INT)")
        cur.executemany("INSERT INTO et VALUES "
                        "(?,?,?)", foo)
        db.commit()

def load_sql(v1,v2,v3):
    with sqlite3.connect("fn.db") as db:
        cur=db.cursor()
        cur.execute("SELECT * FROM et WHERE var1=? AND var2=? AND var3=?", (v1,v2,v3))
        return cur.fetchall()

save_sql(el)
Now if I were to use load_sql("a","b",1), it would work. But assume I want to only query for the first and third column, i.e. load_sql("a",None,1) (the None is just intended as a placeholder) or only the last column load_sql(None,None,5), this wouldn't work.
This could of course be done with if statements checking which variables were supplied in the function call, but in tables with larger amounts of columns, this might get messy.
Is there a good way to do this?
What if load_sql() accepted an arbitrary number of keyword arguments, where the keyword argument names correspond to column names? Something along these lines:
def load_sql(**values):
    with sqlite3.connect("fn.db") as db:
        cur = db.cursor()
        query = "SELECT * FROM et"
        conditions = [f"{column_name} = :{column_name}" for column_name in values]
        if conditions:
            query = query + " WHERE " + " AND ".join(conditions)
        cur.execute(query, values)
        return cur.fetchall()
Note that here we trust keyword argument names to be valid and existing column names (and string-format them into the query) which may potentially be used as an SQL injection attack vector.
As a side note, I cannot help but think that this feels like a reinventing-the-wheel step towards an actual ORM. Look into the lightweight PonyORM or Peewee abstraction layers between Python and a database.
It will inevitably get messy if you want your SQL statements to remain sanitized/safe, but as long as you control your function signature it can remain reasonably safe, e.g.:
def load_sql(var1, var2, var3):
    fields = dict(field for field in locals().items() if field[1] is not None)
    query = "SELECT * FROM et"
    if fields:  # if at least one field is not None:
        query += " WHERE " + " AND ".join((k + "=?" for k in fields.keys()))
    with sqlite3.connect("fn.db") as db:
        cur = db.cursor()
        # sqlite3 needs a real sequence, so materialise the dict view
        cur.execute(query, list(fields.values()))
        return cur.fetchall()
You can replace the function signature with load_sql(**kwargs) and then use kwargs.items() instead of locals().items() so that you can pass arbitrary column names, but that can be very dangerous and is certainly not recommended.
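One way to keep the keyword-argument convenience without trusting the argument names is to check them against a fixed list of columns first. A minimal sketch, assuming the et table from the question; the COLUMNS tuple and the extra db parameter are additions made here so the example can run against an in-memory database.

```python
import sqlite3

COLUMNS = ("var1", "var2", "var3")  # the only names allowed into the SQL text


def load_sql(db, **filters):
    """Select rows from et, filtering on whichever whitelisted columns are given."""
    unknown = set(filters) - set(COLUMNS)
    if unknown:
        raise ValueError("unknown columns: {}".format(sorted(unknown)))
    query = "SELECT * FROM et"
    if filters:
        query += " WHERE " + " AND ".join("{0} = :{0}".format(c) for c in filters)
    # Values are still bound by sqlite3 through named placeholders.
    return db.execute(query, filters).fetchall()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE et (var1 VARCHAR, var2 VARCHAR, var3 INT)")
db.executemany("INSERT INTO et VALUES (?,?,?)", [("a", "b", 1), ("a", "b", 3)])
rows = load_sql(db, var1="a", var3=1)
```

Columns that are not passed are simply left out of the WHERE clause, so no None placeholders are needed.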

MySQLdb insert into database using lists for field names and values python

I have these two lists:
list1=['a','b','c']
list2=['1','2','3']
I am trying to insert these into a database with field names like so:
a | b | c | d | e
I am currently turning these lists into strings and then simply passing them to execute, e.g. cur.execute(insert,(strList1,strList2)), where strList1 and strList2 are just strings of list1 and list2 formed using:
strList1=''
for thing in list1:
    strList1+=thing+','
strList1=strList1[:-1]
My current SQL statement is:
insert="""insert into tbl_name(%s) values(%s)"""
cur.execute(insert,(strList1,strList2))
I also have a follow-up question: if column a is a primary key, how could I make a duplicate entry update the other fields if they were blank?
Do not build queries by inserting values with Python's % operator, as this is a security risk: the value is pasted straight into the string, so it can smuggle in a whole separate query.
Instead, put a placeholder where you want the value to be ("?" in sqlite3; MySQLdb uses %s as its placeholder) and pass the values as a second argument to execute in the form of a tuple, like so:
curs.execute("SELECT foo FROM bar WHERE foobar = ?",(some_value,))
Or in a slightly longer example
curs.execute("UPDATE foo SET bar = ? WHERE foobar = ?",(first_value,second_value))
Edit:
Hopefully I understood what you want correctly this time. Sadly, you cannot use "?" for table names, so you are stuck with %s string formatting there. I made a quick little test script.
import sqlite3
list1=['foo','bar','foobar'] #List of tables
list2=['First_value','second_value','Third_value'] #List of values
db_conn = sqlite3.connect("test.db") #I used sqlite to test it quickly
db_curs = db_conn.cursor()
for table in list1: #Create all the tables in the db
    query = "CREATE TABLE IF NOT EXISTS %s(foo text, bar text,foobar text)" % table
    db_curs.execute(query)
    db_conn.commit()
for table in list1: #Insert all the values into all the tables
    query = "INSERT INTO %s VALUES (?,?,?)" % table
    db_curs.execute(query,tuple(list2))
    db_conn.commit()
for table in list1: #Print all the values out to see if it worked
    db_curs.execute("SELECT * FROM %s" % table)
    fetchall = db_curs.fetchall()
    for entry in fetchall:
        print(entry[0], entry[1], entry[2])
One thing you could do on those lists to make things easier...
list1=['a','b','c']
print(",".join(list1))
#a,b,c
Your insert looks good. Seems like a batch insert would be the only other option.
This is the way prepared statements work (simplified):
* The statement is sent to the database with a list of parameters;
* The statement is retrieved from the statement cache or if not present, prepared and added to the statement cache;
* Parameters are applied;
* Statement is executed.
Statements have to be complete except that the parameters are replaced by %s (or ? or :parm depending on the language used). Parameters are your final numerical/string/date/etc values only. So labels or other parts can not be replaced.
In your case that means:
insert="""insert into tbl_name(%s) values(%s)"""
Should become something like:
insert="""insert into tbl_name(a,b,c) values(%s,%s,%s)"""
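With Python's % string formatting, you must provide a "%s" (or %d, whatever) for each item. Note that this interpolates the column names and values as text rather than binding them as parameters. You can use two tuples as follows: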
insert="""insert into tbl_name (%s,%s,%s) values (%s, %s, %s);"""
strList1=('a','b','c')
strList2=(1,2,3)
curs.execute(insert % (strList1 + strList2))
*I'm using python3, (strList1,StrList2) doesn't work for me, but you might have slight differences.
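Putting the points from these answers together: the column names have to be part of the statement text, while the values can be bound with %s placeholders, which is the placeholder style MySQLdb uses. A rough sketch; build_insert and ALLOWED_COLUMNS are hypothetical names, and the cursor call is shown only as a comment because it needs a live MySQL connection.

```python
ALLOWED_COLUMNS = ("a", "b", "c", "d", "e")  # the table's real column names


def build_insert(columns):
    """Build an INSERT for tbl_name with one %s placeholder per column."""
    bad = [c for c in columns if c not in ALLOWED_COLUMNS]
    if bad:
        raise ValueError("unexpected column names: {}".format(bad))
    return "insert into tbl_name ({}) values ({})".format(
        ", ".join(columns), ", ".join(["%s"] * len(columns))
    )


list1 = ["a", "b", "c"]
list2 = ["1", "2", "3"]
insert = build_insert(list1)

# With a MySQLdb cursor (assumed to exist as `cur`), the values are bound,
# not interpolated:
#     cur.execute(insert, list2)
```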

Does a SELECT query have a return type in MySQL Python?

I want to check whether a specific row exists in a database using a SELECT query; if it doesn't exist, I want to INSERT INTO the database with another query, and if it does exist, I don't want anything to happen.
I am thinking of using an if statement in Python to handle the scenario above, but I can't seem to find whether a SELECT query will return True, False, etc. if the SELECT doesn't find the row.
PS. I'm using MySQLdb
def _query (self, query):
    cursor = self.db.cursor()
    cursor.execute (query)
    return cursor

for k,v in sorted(config.items()):
    #checking for duplicate values
    check_items_table = "SELECT * FROM items WHERE text = ('%s') AND name = ('%s')" % \
        (str (k), str (v))
    self._query (check_items_table)
    if check_items_table == <True or False or Something>:
        #do nothing
    else:
        #insert into items table
        items_table = "INSERT INTO items(item_id, \
            name, text) \
            VALUES (null,'%s', '%s')" % \
            (str (k), str (v))
        self._query (items_table)
You're using MySQLdb, which is DB-API 2.0-compliant.
You're calling execute, then returning the Cursor object.
Cursors have an attribute rowcount, but in some database modules, that's only set for modification queries like INSERT or UPDATE, not for SELECT, and in others, it's only set after you do a fetch* operation. (I believe MySQLdb is one of the latter, but I'm not positive.)
So, what you want to do is call the fetchone method. If it returns a row, the row exists; if it returns None, the row doesn't exist.
Alternatively, instead of selecting the row itself, you could SELECT COUNT(*) FROM …. Then fetchone is guaranteed to return one row with one value: 0 if no matching row exists, otherwise the number of matches.
However, I'm not sure you want to do things this way in the first place. Usually, it's better to create a unique constraint on the column (or columns), so the database won't allow two rows to exist with the same values for that column. Then, you can just INSERT the row, and you'll get an error if it already exists; no need for a SELECT at all. (Or, if you don't need to know whether it already existed, you can use INSERT IGNORE, which is MySQL-specific.)
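To illustrate the unique-constraint approach, here is a small sketch. It uses the standard-library sqlite3 module as a stand-in so that it runs without a server, which is an assumption made here and not what the question uses; SQLite spells the statement INSERT OR IGNORE and uses ? placeholders, whereas MySQL uses INSERT IGNORE and MySQLdb uses %s.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE items ("
    "item_id INTEGER PRIMARY KEY, name TEXT, text TEXT, "
    "UNIQUE (name, text))"
)

for name, text in [("k1", "v1"), ("k1", "v1"), ("k2", "v2")]:
    # The second ("k1", "v1") violates the UNIQUE constraint and is skipped
    # by the database itself; no SELECT is needed beforehand.
    db.execute("INSERT OR IGNORE INTO items (name, text) VALUES (?, ?)", (name, text))

count = db.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```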
Meanwhile, to expand on my side note:
It's always a bad idea to build SQL strings dynamically, like this:
items_table = "INSERT INTO items(item_id, \
name, text) \
VALUES (null,'%s', '%s')" % \
(str (k), str (v))
This leaves you open to SQL injection, forces you to deal with fiddly quoting/escaping/type-conversion problems, and prevents the database from realizing that you're running the same query repeatedly.
Use parameterized statements instead. Like this:
items_table = "INSERT INTO items(item_id, \
name, text) \
VALUES (null, %s, %s)"
Notice that I didn't quote the %s's, and I didn't use a % operator. The values get sent to the execute statement:
cursor.execute(items_table, (k, v))
So, you just need to update your _query wrapper like this:
def _query(self, query, parameters=()):
    cursor = self.db.cursor()
    cursor.execute(query, parameters)
    return cursor
… and you can call it like this:
self._query(items_table, (k, v))
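For completeness, the original SELECT-then-INSERT idea looks roughly like this once fetchone and parameters are used. This is only a sketch: it runs against sqlite3 (with ? placeholders) so it needs no server, and insert_if_missing is a hypothetical helper name. With MySQLdb the placeholders would be %s.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (item_id INTEGER PRIMARY KEY, name TEXT, text TEXT)")


def insert_if_missing(db, name, text):
    """Insert (name, text) unless an identical row exists; return True if inserted."""
    cur = db.execute(
        "SELECT 1 FROM items WHERE name = ? AND text = ? LIMIT 1", (name, text)
    )
    if cur.fetchone() is not None:  # fetchone() returns None when nothing matched
        return False
    db.execute("INSERT INTO items (name, text) VALUES (?, ?)", (name, text))
    return True


first = insert_if_missing(db, "k", "v")
second = insert_if_missing(db, "k", "v")
```

Unlike a unique constraint, this check is not safe against two clients inserting at the same time, which is one reason the constraint is usually preferred.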

Python Sqlite3: INSERT INTO table VALUE(dictionary goes here)

I would like to use a dictionary to insert values into a table, how would I do this?
import sqlite3
db = sqlite3.connect('local.db')
cur = db.cursor()
cur.execute('DROP TABLE IF EXISTS Media')
cur.execute('''CREATE TABLE IF NOT EXISTS Media(
    id INTEGER PRIMARY KEY, title TEXT,
    type TEXT, genre TEXT,
    onchapter INTEGER, chapters INTEGER,
    status TEXT
    )''')
values = {'title':'jack', 'type':None, 'genre':'Action', 'onchapter':None,'chapters':6,'status':'Ongoing'}
#What would I Replace x with to allow a
#dictionary to connect to the values?
cur.execute('INSERT INTO Media VALUES (NULL, x)', values)
cur.execute('SELECT * FROM Media')
media = cur.fetchone()
print(media)
If you're trying to use a dict to specify both the column names and the values, you can't do that, at least not directly.
That's really inherent in SQL. If you don't specify the list of column names, you have to specify the values in CREATE TABLE order, which you can't do reliably with a dict, because a dict has no guaranteed order (before Python 3.7). If you really wanted to, of course, you could use a collections.OrderedDict, make sure it's in the right order, and then just pass values.values(). But at that point, why not just have a list (or tuple) in the first place? If you're absolutely sure you've got all the values, in the right order, and you want to refer to them by order rather than by name, what you have is a list, not a dict.
And there's no way to bind column names (or table names, etc.) in SQL, just values.
You can, of course, generate the SQL statement dynamically. For example:
columns = ', '.join(values.keys())
placeholders = ', '.join('?' * len(values))
sql = 'INSERT INTO Media ({}) VALUES ({})'.format(columns, placeholders)
values = [int(x) if isinstance(x, bool) else x for x in values.values()]
cur.execute(sql, values)
However, this is almost always a bad idea. This really isn't much better than generating and execing dynamic Python code. And you've just lost all of the benefits of using placeholders in the first place—primarily protection from SQL injection attacks, but also less important things like faster compilation, better caching, etc. within the DB engine.
It's probably better to step back and look at this problem from a higher level. For example, maybe you didn't really want a static list of properties, but rather a name-value MediaProperties table? Or, alternatively, maybe you want some kind of document-based storage (whether that's a high-powered nosql system, or just a bunch of JSON or YAML objects stored in a shelve)?
An alternative using named placeholders:
columns = ', '.join(my_dict.keys())
placeholders = ':'+', :'.join(my_dict.keys())
query = 'INSERT INTO my_table (%s) VALUES (%s)' % (columns, placeholders)
print(query)
cur.execute(query, my_dict)
con.commit()
There is a solution for using dictionaries. First, the SQL statement
INSERT INTO Media VALUES (NULL, 'x');
would not work, as it assumes you are referring to all columns, in the order they are defined in the CREATE TABLE statement, as abarnert stated. (See SQLite INSERT.)
Once you have fixed it by specifying the columns, you can use named placeholders to insert data. The advantage of this is that it safely escapes key characters, so you do not have to worry. Following the Python sqlite3 documentation:
values = {
    'title':'jack', 'type':None, 'genre':'Action',
    'onchapter':None,'chapters':6,'status':'Ongoing'
}
cur.execute(
    '''INSERT INTO Media (title, type, genre, onchapter, chapters, status)
    VALUES (:title, :type, :genre, :onchapter, :chapters, :status);''',
    values
)
You could use named parameters:
cur.execute('INSERT INTO Media VALUES (NULL, :title, :type, :genre, :onchapter, :chapters, :status)', values)
This still depends on the column order in the INSERT statement (those : are only used as keys in the values dict) but it at least gets away from having to order the values on the python side, plus you can have other things in values that are ignored here; if you're pulling what's in the dict apart to store it in multiple tables, that can be useful.
If you still want to avoid duplicating the names, you could extract them from an sqlite3.Row result object, or from cur.description, after doing a dummy query; it may be saner to keep them around in python form near wherever you do your CREATE TABLE.
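As a sketch of the cur.description idea: the column names are read back from the table after a dummy query, and only those that also appear in the dict are used, so the names in the SQL text never come from the dict itself. The LIMIT 0 dummy query and the in-memory database are choices made here for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE Media (id INTEGER PRIMARY KEY, title TEXT, type TEXT, "
    "genre TEXT, onchapter INTEGER, chapters INTEGER, status TEXT)"
)
values = {"title": "jack", "type": None, "genre": "Action",
          "onchapter": None, "chapters": 6, "status": "Ongoing"}

# Column names come from the table itself (dummy query), not from the dict.
cur = db.execute("SELECT * FROM Media LIMIT 0")
columns = [d[0] for d in cur.description if d[0] in values]

# Named placeholders let sqlite3 pick each value out of the dict by key.
sql = "INSERT INTO Media ({}) VALUES ({})".format(
    ", ".join(columns), ", ".join(":" + c for c in columns)
)
db.execute(sql, values)
row = db.execute("SELECT * FROM Media").fetchone()
```

Because id is not in the dict, it is left out of the column list and SQLite assigns it automatically.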
Here's a more generic way with the benefit of escaping:
# One way. If keys can be corrupted don't use.
sql = 'INSERT INTO demo ({}) VALUES ({})'.format(
    ','.join(my_dict.keys()),
    ','.join(['?']*len(my_dict)))
# Another, better way. Hardcoded w/ your keys.
sql = 'INSERT INTO demo ({}) VALUES ({})'.format(
    ','.join(my_keys),
    ','.join(['?']*len(my_dict)))
cur.execute(sql, tuple(my_dict.values()))
key_lst = ('status', 'title', 'chapters', 'onchapter', 'genre', 'type')
cur.execute('INSERT INTO Media (status,title,chapters,onchapter,genre,type) VALUES ' +
            '(?,?,?,?,?,?);', tuple(values[k] for k in key_lst))
Do your escaping right.
You probably also need a commit call in there someplace.
Super late to this, but figured I would add my own answer. Not an expert, but something I found that works.
There are issues with preserving order when using a dictionary, which other users have stated, but you could do the following:
import sqlite3
# We're going to use a list of dictionaries, since that's what I'm having to use in my problem
input_list = [{'a' : 1 , 'b' : 2 , 'c' : 3} , {'a' : 14 , 'b' : '' , 'c' : 43}]
for i in input_list:
    # I recommend putting this inside a function, this way if this
    # Evaluates to None at the end of the loop, you can exit without doing an insert
    if i :
        input_dict = i
    else:
        input_dict = None
        continue
# I am noting here that in my case, I know all columns will exist.
# If you're not sure, you'll have to get all possible columns first.
keylist = list(input_dict.keys())
vallist = list(input_dict.values())
query = 'INSERT INTO example (' +','.join( ['[' + i + ']' for i in keylist]) + ') VALUES (' + ','.join(['?' for i in vallist]) + ')'
items_to_insert = list(tuple(x.get(i , '') for i in keylist) for x in input_list)
# Making sure to preserve insert order.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.executemany(query , items_to_insert)
conn.commit()
dictionary = {'id':123, 'name': 'Abc', 'address':'xyz'}
query = "insert into table_name " + str(tuple(dictionary.keys())) + " values" + str(tuple(dictionary.values())) + ";"
cursor.execute(query)
query becomes
insert into table_name ('id', 'name', 'address') values(123, 'Abc', 'xyz');
I was having a similar problem, so I created a string first and then passed that string to the execute command. It does take longer to execute, but the mapping was perfect for me. Just a workaround:
create_string = "INSERT INTO datapath_rtg( Sr_no"
for key in record_tab:
    create_string = create_string+ " ," + str(key)
create_string = create_string+ ") VALUES("+ str(Sr_no)
for key in record_tab:
    create_string = create_string+ " ," + str(record_tab[key])
create_string = create_string + ")"
cursor.execute(create_string)
By doing the above, I ensured that if my dict (record_tab) doesn't contain a particular field, the script won't throw an error and proper mapping can be done, which is why I used a dictionary in the first place.
I was having a similar problem and ended up with something not entirely unlike the following (Note - this is the OP's code with bits changed so that it works in the way they requested)-
import sqlite3
db = sqlite3.connect('local.db')
cur = db.cursor()
cur.execute('DROP TABLE IF EXISTS Media')
cur.execute('''CREATE TABLE IF NOT EXISTS Media(
id INTEGER PRIMARY KEY, title TEXT,
type TEXT, genre TEXT,
onchapter INTEGER, chapters INTEGER,
status TEXT
)''')
values = {'title':'jack', 'type':None, 'genre':'Action', 'onchapter':None,'chapters':6,'status':'Ongoing'}
#What would I Replace x with to allow a
#dictionary to connect to the values?
#cur.execute('INSERT INTO Media VALUES (NULL, x)'), values)
# Added code.
cur.execute('SELECT * FROM Media')
col_names = [col[0] for col in cur.description]  # column names in table order
new_list = [values[c] for c in col_names if c in values]
sql = "INSERT INTO Media VALUES ( NULL, "
qmarks = ', '.join('?' * len(values))
sql += qmarks + ")"
cur.execute(sql, new_list)
#db.commit() #<-Might be important.
cur.execute('SELECT * FROM Media')
media = cur.fetchone()
print (media)
