Detect SQL injections in the source code - python

Consider the following code snippet:
import MySQLdb

def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
    return cursor.fetchall()

print(get_data(1))
There is a major problem in the code: it is vulnerable to SQL injection attacks, since the query is not parameterized through the DB API but is constructed via string formatting. If you call the function this way:
get_data("'; DROP TABLE TEST -- ")
the following query would be executed:
SELECT * FROM TEST WHERE ID = ''; DROP TABLE TEST -- '
Now, my goal is to analyze the code in the project and detect all places potentially vulnerable to SQL injection, in other words, every place where a query is constructed via string formatting instead of passing the query parameters as a separate argument.
Is this something that can be solved statically, with the help of pylint, pyflakes or any other static code analysis package?
I'm aware of the popular penetration testing tool sqlmap, but, as far as I understand, it works against a web resource, testing it as a black box through HTTP requests.

There is a tool that tries to solve exactly what the question is about, py-find-injection:
"py_find_injection uses various heuristics to look for SQL injection vulnerabilities in python source code."
It uses the ast module, looks for session.execute() and cursor.execute() calls, and checks whether the query inside is formed via string interpolation, concatenation or format().
Here is what it outputs while checking the snippet in the question:
$ py-find-injection test.py
test.py:6 string interpolation of SQL query
1 total errors
The project is not actively maintained, but it could be used as a starting point. A good idea would be to make a pylint or pyflakes plugin out of it.
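To give a feel for the approach, here is a minimal sketch of an ast-based check. It is not the code of py-find-injection; the function names (is_literal_only, describe_risk, find_suspect_calls) and the SOURCE sample are made up for illustration, and the heuristics are deliberately crude (any method named execute is inspected, and only its first positional argument).

```python
import ast

# Hypothetical sample to scan; line 1 of this string is blank.
SOURCE = '''
def get_data(cursor, id):
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
    cursor.execute("SELECT * FROM TEST WHERE ID = %s", (id,))
    cursor.execute("SELECT * FROM TEST WHERE ID = {}".format(id))
    cursor.execute("SELECT * FROM TEST " + "WHERE ID = 1")
'''


def is_literal_only(node):
    """True if the expression is built purely from constants joined with +."""
    if isinstance(node, ast.Constant):
        return True
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return is_literal_only(node.left) and is_literal_only(node.right)
    return False


def describe_risk(node):
    """Return a short reason if the query expression looks unsafe, else None."""
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mod):
        return "string interpolation"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return None if is_literal_only(node) else "string concatenation"
    if isinstance(node, ast.JoinedStr):
        return "f-string"
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)
            and node.func.attr == "format"):
        return "str.format()"
    return None


def find_suspect_calls(source):
    """Yield (line number, reason) for each suspicious .execute() call."""
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "execute"
                and node.args):
            reason = describe_risk(node.args[0])
            if reason:
                yield node.lineno, reason


findings = sorted(find_suspect_calls(SOURCE))
for lineno, reason in findings:
    print("line %d: %s of SQL query" % (lineno, reason))
```

A query that is first assigned to a variable and only then passed to execute() is not caught by this sketch; handling that would need some form of data-flow tracking.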

Not sure how this will compare with the other packages, but to a certain extent you need to parse the arguments being passed to cursor.execute. This bit of pyparsing code looks for:
arguments using string interpolation
arguments using string concatenation with variable names
arguments that are just variable names
But sometimes arguments use string concatenation just to break up a long string into pieces. If all the strings in the expression are literals being added together, there is no risk of SQL injection.
This pyparsing snippet will look for calls to cursor.execute, and then look for the at-risk argument forms:
from pyparsing import *
import re
identifier = Word(alphas, alphanums+'_')
integer = Word(nums)
LPAR,RPAR,PLUS,PERCENT = map(Literal, '()+%')
stringInterpRE = re.compile(r"%-?\d*\*?\.?\d*\*?s")
def containsStringInterpolation(s,l,tokens):
    if not stringInterpRE.search(tokens[0]):
        raise ParseException(s,l,"No string interpolation")
tupleContents = identifier | integer
tupleExpr = LPAR + delimitedList(tupleContents) + RPAR
stringInterpArg = identifier | tupleExpr
interpolatedString = originalTextFor(quotedString.copy().setParseAction(containsStringInterpolation) +
                                     PERCENT + stringInterpArg)
stringTerm = interpolatedString | OneOrMore(quotedString.copy()) | identifier
stringTerm.setName("stringTerm")
unsafeStringExpr = (stringTerm + OneOrMore(PLUS + stringTerm)) | identifier | interpolatedString
def unsafeExpr(s,l,tokens):
    if not any(term == interpolatedString or term == identifier
               for term in tokens):
        raise ParseException(s,l,"No unsafe string terms")
unsafeStringExpr.setParseAction(unsafeExpr)
unsafeStringExpr.setName("unsafeExpr")
func = Literal("cursor.execute")
statement = func + LPAR + unsafeStringExpr + RPAR
statement.setName("execute stmt")
#statement.ignore(pythonComment)
# `sample` is the source text to scan (defined below)
for tokens in statement.searchString(sample):
    print(' '.join(tokens.asList()))
This will scan through the following sample:
sample = """
import MySQLdb
def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id)
    cursor.execute("SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE")
    cursor.execute(sqlVar + " -- UNSAFE")
    cursor.execute("SELECT * FROM TEST WHERE ID = 'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST WHERE ID = " +
                   "'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST "
                   "WHERE ID = "
                   "'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST "
                   "WHERE ID = " +
                   "'%s' -- UNSAFE" % name)
    return cursor.fetchall()
print(get_data(1))"""
and report these unsafe statements:
cursor.execute ( "SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id )
cursor.execute ( "SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE" )
cursor.execute ( sqlVar + " -- UNSAFE" )
cursor.execute ( "SELECT * FROM TEST " "WHERE ID = " + "'%s' -- UNSAFE" % name )
You can also have pyparsing report the location of the found lines, using scanString instead of searchString.

About the best I can think of would be grepping through your codebase, looking for cursor.execute() calls that are passed a string built with Python string interpolation, as in your example:
cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
which of course should have been written as a parameterized query to avoid the vulnerability:
cursor.execute("SELECT * FROM TEST WHERE ID = %s", (id,))
That's not going to be perfect -- for instance, you might have a hard time catching code like this:
query = "SELECT * FROM TEST WHERE ID = '%s'" % id
# some stuff
cursor.execute(query)
But it might be about the best you can easily do.
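As a rough sketch of that grep-style check in Python, the regular expression below flags lines where .execute( is followed by a quoted string and then a % operator. The pattern and the names INTERPOLATED_EXECUTE and scan_lines are illustrative assumptions, not an established tool, and the last two example lines show the blind spot described above.

```python
import re

# Heuristic: .execute( followed by a quoted string and then a % operator.
INTERPOLATED_EXECUTE = re.compile(r"""\.execute\(\s*(["']).*?\1\s*%""")


def scan_lines(lines):
    """Return the 1-based numbers of lines that match the heuristic."""
    return [number for number, line in enumerate(lines, start=1)
            if INTERPOLATED_EXECUTE.search(line)]


example = [
    '''cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)''',
    '''cursor.execute("SELECT * FROM TEST WHERE ID = %s", (id,))''',
    '''query = "SELECT * FROM TEST WHERE ID = '%s'" % id''',
    '''cursor.execute(query)''',
]
flagged = scan_lines(example)
print(flagged)
```

Only the first line is flagged; the query built on the third line and executed on the fourth slips through, which is exactly the limitation of a line-based search.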

It's a good thing that you're already aware of the problem and trying to resolve it.
As you may already know, the best practice for executing SQL against any database is to use prepared statements, or stored procedures if these are available.
In this particular case, you can implement this by "preparing" the statement with a placeholder and then executing it with the value passed separately.
e.g.:
cursor = db.cursor()
query = "SELECT * FROM TEST WHERE ID = %s"
cursor.execute(query, ("2",))
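To see what passing the value separately buys you, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for MySQLdb. That substitution is an assumption made so the example runs without a server; sqlite3 uses ? as its placeholder where MySQLdb uses %s. The table schema is made up for illustration.

```python
import sqlite3

# In-memory stand-in for the TEST table from the question (assumed schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE TEST (ID TEXT, NAME TEXT)")
db.execute("INSERT INTO TEST VALUES ('1', 'first row')")


def get_data(conn, id):
    """Look up rows by ID, passing the value separately from the SQL text."""
    cursor = conn.cursor()
    # sqlite3 uses ? as its placeholder; MySQLdb would use %s here.
    cursor.execute("SELECT * FROM TEST WHERE ID = ?", (id,))
    return cursor.fetchall()


normal_rows = get_data(db, "1")
attack_rows = get_data(db, "'; DROP TABLE TEST -- ")
remaining = db.execute("SELECT COUNT(*) FROM TEST").fetchone()[0]
print(normal_rows, attack_rows, remaining)
```

The malicious input is treated as an ordinary string value, so it simply matches no rows and the table is left intact.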

Related

sqlite3.OperationalError: near "........": syntax error

windows 7
python 2.7
Django 1.11
I have used Django to develop a website. In the backend I have an SQLite database with 2 tables. One table stores the form the user submitted, and the other is used for comparison.
Once a form A is submitted by the user, it is saved in table catalog_fw, and catalog_fw.ODM and catalog_fw.project_name are compared with the ones in table catalog_fw_instance. If a row has exactly the same content for catalog_fw.ODM and catalog_fw.project, catalog_fw_instance.level is combined with A and passed to an .exe to generate a txt file.
However, an error occurs in this line: c.execute("catalog_fw_instance.level,......
when I run this Python file:
sqlite3.OperationalError: near "catalog_fw_instance": syntax error
The code to get sqlite data, compare and pass to the .exe is here:
def when_call_exe():
    with sqlite3.connect('db.sqlite3') as con:
        c = con.cursor()
        #c.execute("catalog_fw_instance.level, SELECT catalog_fw.ODM_name, catalog_fw.project_name, catalog_fw.UAP, catalog_fw.NAP, catalog_fw.LAP, catalog_fw.num_address FROM catalog_fw INNER JOIN catalog_fw_instance ON catalog_fw.ODM_name=catalog_fw_instance.ODM_name AND catalog_fw.project_name=catalog_fw_instance.project_name")
        sql = ("SELECT catalog_fw.ODM_name, catalog_fw.project_name, catalog_fw.UAP, catalog_fw.NAP, catalog_fw.LAP, " +
               "catalog_fw.num_address, catalog_fw_instance.level " +
               "FROM catalog_fw catalog_fw" +
               "INNER JOIN catalog_fw_instance catalog_fw_instanc" +
               " ON catalog_fw.ODM_name = catalog_fwi.ODM_name AND catalog_fw.project_name = catalog_fw_instance.project_name")
        c.execute(sql)
        print '1:', c.fetchone()
        parameter = c.fetchone()
        print '2', parameter
        #pass to exe
        args = ['.//exe//Test.exe', parameter[0], parameter[1]+parameter[2], parameter[3], parameter[4], parameter[5], parameter[6]]
        output = my_check_output(args)
        if 'SUCCESS' in output:
            filename = output[28:-1]
        else:
            filename = output[8:-1]
        downloadlink = os.path.join('/exe', '%s' % filename)
        #save link to sqlite db
        c.execute('''UPDATE catalog_fw SET download = %s WHERE
        ODM_Name=parameter[1] AND project_Name=parameter[2] ''' % downloadlink)
Here are the 2 tables in the SQLite database: [table 1] [table 2]
As far as I know, when calling cursor.execute() in Python, we should pass a single string containing the query to be run. It looks like you are passing one of the selected columns, followed by the query, all together as a single string. Consider the following version, which moves that column into the SELECT list and gives each table a short alias:
c = con.cursor()
sql = ("SELECT cf.ODM_name, cf.project_name, cf.UAP, cf.NAP, cf.LAP, " +
"cf.num_address, cfi.level " +
"FROM catalog_fw cf " +
"INNER JOIN catalog_fw_instance cfi " +
" ON cf.ODM_name = cfi.ODM_name AND cf.project_name = cfi.project_name")
c.execute(sql)
parameter = c.fetchone()
print(parameter)
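The same idea can be tried end to end against an in-memory database. This is only a sketch: the two tables are trimmed-down, hypothetical versions of the real ones, and downloadlink is a placeholder value. It also shows the final UPDATE written with ? placeholders instead of string formatting, so that parameter[0] and parameter[1] are passed as values rather than appearing literally in the SQL text.

```python
import sqlite3

con = sqlite3.connect(":memory:")
c = con.cursor()
# Hypothetical, trimmed-down versions of the two tables.
c.execute("CREATE TABLE catalog_fw (ODM_name TEXT, project_name TEXT, download TEXT)")
c.execute("CREATE TABLE catalog_fw_instance (ODM_name TEXT, project_name TEXT, level TEXT)")
c.execute("INSERT INTO catalog_fw VALUES ('odm1', 'projA', NULL)")
c.execute("INSERT INTO catalog_fw_instance VALUES ('odm1', 'projA', 'L2')")

# Each fragment ends with a space so the concatenated keywords do not run together.
sql = ("SELECT cf.ODM_name, cf.project_name, cfi.level "
       "FROM catalog_fw cf "
       "INNER JOIN catalog_fw_instance cfi "
       "ON cf.ODM_name = cfi.ODM_name AND cf.project_name = cfi.project_name")
c.execute(sql)
parameter = c.fetchone()  # fetch once and reuse; another fetchone() would move to the next row
print(parameter)

# Bind values with ? placeholders instead of formatting them into the string.
downloadlink = "/exe/output.txt"  # placeholder value for illustration
c.execute("UPDATE catalog_fw SET download = ? WHERE ODM_name = ? AND project_name = ?",
          (downloadlink, parameter[0], parameter[1]))
con.commit()
updated = c.execute("SELECT download FROM catalog_fw").fetchone()[0]
print(updated)
```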

nested if vs one if in loop Python

I am trying to build a valid SQL statement on the fly, including boundary constraints. Is there an easy way to build a valid SQL statement using a for loop like the one below?
sql = 'SELECT * from table '
firststatment = True
r = dict(request.query)
for k,v in r.items():
    if firststatment:
        sql = sql + ' where {} = {}'.format(k,v)
        firststatment = False
    else:
        sql = sql + ' and {} = {}'.format(k,v)
Or, in such a case, is it better to use a structure like this?
if bondarydate1 and bondarydate2:
    sql = sql + ' where year(date) between ({} and {})'.format(bondarydate1, bondarydate2)
    marker = True
elif bondarydate1:
    sql = sql + ' where year(date) = {}'.format(bondarydate1)
    marker = True
elif bondarydate2:
    sql = sql + ' where year(date) = {}'.format(bondarydate2)
    marker =True
if marker and boundaryparam2:
    sql = sql + ' and boundaryparam2 = {}'.format(boundaryparam2)
elif boundaryparam2:
    sql = sql + ' where boundaryparam2 = {}'.format(boundaryparam2)
    marker = True
if marker and boundaryparam3:
    sql = sql+' and boundaryparam3 = {}'.format(boundaryparam3)
elif boundaryparam3:
    sql = sql + ' where boundaryparam3 = {}'.format(boundaryparam3)
    marker = True
if marker and boundaryparam4:
    sql = sql + ' and boundaryparam4 = {}'.format(boundaryparam4)
elif boundaryparam4:
    sql = sql + ' where boundaryparam4 = {}'.format(boundaryparam4)
    marker = True
if marker and boundaryparam5:
    sql = sql + ' and boundaryparam5 = {}'.format(boundaryparam5)
elif boundaryparam5:
    sql = sql +' where boundaryparam5 = {}'.format(boundaryparam5)
Basically, I am trying to arrive at a valid SQL statement like the one below:
SELECT * from x where date between bondarydate1 and boundarydate2 and column1 = dhkjf and column2 = 343
P.S.:
Is there a short way to build SQL that contains only constraints for hardcoded keys like r['predefined'], so that with r = {'predefined':'someValue', 'spamKey':'spamvalue'} the spam value won't be inserted into the query?
Yes, but it is not a good idea to build queries like that: it easily allows SQL injection.
You can do it with the following code:
constraints = ' AND '.join('{} = {}'.format(*t) for t in r.items())
sql = 'SELECT * FROM table WHERE {}'.format(constraints)
But this is not safe: if a key or a value contains, for instance, backquotes, an attacker could inject SQL of their own, for example to export your database.
I would advise you to at least escape the values (usually, and hopefully, a user only has control over the values). You can do that with placeholders like %s, letting a library like MySQLdb escape the values. For instance:
constraints = ' AND '.join('{} = %s'.format(k) for k in r.keys())
sql = 'SELECT * FROM table WHERE {}'.format(constraints)
cursor.execute(sql, list(r.values()))
Furthermore, there are many Python libraries (like SQLAlchemy) with which one can safely construct SQL queries (provided, of course, that the libraries themselves are safe). Using one will usually have a positive impact on the security of your application.
I thus strongly advise you to use some sort of library to talk to the database. It usually provides an additional level of abstraction, which is nice, and it will usually protect you against most security vulnerabilities.
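For the P.S. about restricting the query to hardcoded keys, one possible sketch is to filter the request dictionary against a fixed whitelist and emit one placeholder per accepted key. The names ALLOWED_COLUMNS and build_query are hypothetical, the table x and its columns are taken from the example statement in the question, and sqlite3 (with its ? placeholder) is used only so the example runs on its own; with MySQLdb the placeholder would be %s.

```python
import sqlite3

ALLOWED_COLUMNS = ("column1", "column2")  # hypothetical whitelist of filterable columns


def build_query(table_sql, params):
    """Build a SELECT with one placeholder per whitelisted key present in params."""
    keys = [k for k in ALLOWED_COLUMNS if k in params]
    sql = table_sql
    if keys:
        sql += " WHERE " + " AND ".join("{} = ?".format(k) for k in keys)
    return sql, [params[k] for k in keys]


r = {"column1": "dhkjf", "spamKey": "spamvalue"}
sql, values = build_query("SELECT * FROM x", r)
print(sql, values)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE x (column1 TEXT, column2 INTEGER)")
con.execute("INSERT INTO x VALUES ('dhkjf', 343)")
rows = con.execute(sql, values).fetchall()
print(rows)
```

Because column names only ever come from the whitelist and values only ever travel as bound parameters, the unexpected key is ignored and nothing from the request is formatted into the SQL text.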

Using % wildcard with pg8000

I have a query similar to below:
def connection():
pcon = pg8000.connect(host='host', port=1234, user='user', password='password', database = 'database')
return pcon, pcon.cursor()
pcon, pcur = connection()
query = """ SELECT * FROM db WHERE (db.foo LIKE 'string-%' OR db.foo LIKE 'bar-%')"""
db = pd.read_sql_query(query, pcon)
However when I try to run the code I get:
DatabaseError: '%'' not supported in a quoted string within the query string
I have tried escaping the symbol with \ and an additional % with no luck. How can I get pg8000 to treat this as a wildcard properly?
"In Python, % usually refers to a variable that follows the string. If you want a literal percent sign, then you need to double it. %%"
-- Source
LIKE 'string-%%'
Otherwise, if that doesn't work, PostgreSQL also supports underscores for pattern matching.
'abc' LIKE 'abc' true
'abc' LIKE 'a%' true
'abc' LIKE '_b_' true
But, as mentioned in the comments,
An underscore (_) in pattern stands for (matches) any single character; a percent sign (%) matches any sequence of zero or more characters
According to the source code, though, it would appear the problem is the single quote following the % in your LIKE statement.
if next_c == "%":
    in_param_escape = True
else:
    raise InterfaceError(
        "'%" + next_c + "' not supported in a quoted "
        "string within the query string")
So if next_c == "'" instead of next_c == "%", then you would get your error:
'%'' not supported in a quoted string within the query string
With a recent version of pg8000 you shouldn't have any problems with a % in a LIKE. For example:
>>> import pg8000.dbapi
>>>
>>> con = pg8000.dbapi.connect(user="postgres", password="cpsnow")
>>> cur = con.cursor()
>>> cur.execute("CREATE TEMPORARY TABLE book (id SERIAL, title TEXT)")
>>> for title in ("Ender's Game", "The Magus"):
... cur.execute("INSERT INTO book (title) VALUES (%s)", [title])
>>>
>>> cur.execute("SELECT * from book WHERE title LIKE 'The %'")
>>> cur.fetchall()
([2, 'The Magus'],)

Query doesn't update table when it is run from Python code

When I run a query from the SQLite browser the table gets updated, but when I run the same query from Python the database is not updated:
def updateDB (number, varCheck=True):
    conn = sqlite3.connect(db)
    c = conn.cursor()
    i = 1
    for each_test in number:
        c.execute("UPDATE table1 SET val='%s' WHERE amount='%s' AND rank='%s'" % (each_test , str(i), 'rank2'))
        i += 1
    conn.commit()
    conn.close()
    return True
How can I fix the issue? I run python code as sudo.
In the past, I had similar issues while creating SQL queries. I suspect your SQL query is not being formatted correctly. The % string interpolation method can be a problem. Try using .format() on the SQL query string instead. PEP 3101 describes using .format() in place of the % operator for string formatting.
val='"' + each_test + '"'
amount = '"' + str(i) + '"'
rank= '"' + "rank2" + '"'
sql_query = "UPDATE table1 SET val={val} WHERE amount={amount} AND rank={rank}".format(val=val,amount=amount,rank=rank)
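Another option, which avoids hand-quoting the values altogether, is to let sqlite3 bind them through ? placeholders. The following is a sketch under assumptions: the table layout is guessed from the UPDATE statement in the question, the function name update_db is made up, and an in-memory database is used so the example is self-contained.

```python
import sqlite3


def update_db(conn, number):
    """Set val for amounts 1..n in rank 'rank2', then commit once."""
    c = conn.cursor()
    for i, each_test in enumerate(number, start=1):
        # Values are bound by sqlite3, so no manual quoting is needed.
        c.execute("UPDATE table1 SET val = ? WHERE amount = ? AND rank = ?",
                  (each_test, str(i), "rank2"))
    conn.commit()  # uncommitted changes are lost when the connection is closed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (val TEXT, amount TEXT, rank TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                 [("old", "1", "rank2"), ("old", "2", "rank2")])
update_db(conn, ["a", "b"])
result = conn.execute("SELECT val FROM table1 ORDER BY amount").fetchall()
print(result)
```

If the update still appears to do nothing in the real project, it may be worth checking that the WHERE clause matches existing rows and that the script opens the same database file as the browser.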

How to string format SQL IN clause with Python

I am trying to create a statement as follows:
SELECT * FROM table WHERE provider IN ('provider1', 'provider2', ...)
However, I'm having some trouble with the string formatting of it from the Django API. Here's what I have so far:
profile = request.user.get_profile()
providers = profile.provider.values_list('provider', flat=True) # [u'provider1', u'provider2']
providers = tuple(str(item) for item in providers) # ('provider1', 'provider2')
SQL = "SELECT * FROM table WHERE provider IN %s"
args = (providers,)
cursor.execute(SQL, args)
DatabaseError
(1241, 'Operand should contain 1 column(s)')
MySQLdb has a method to help with this:
Doc
string_literal(...)
string_literal(obj) -- converts object obj into a SQL string literal.
This means, any special SQL characters are escaped, and it is enclosed
within single quotes. In other words, it performs:
"'%s'" % escape_string(str(obj))
Use connection.string_literal(obj), if you use it at all.
_mysql.string_literal(obj) cannot handle character sets.
Usage
# connection: <_mysql.connection open to 'localhost' at 1008b2420>
str_value = connection.string_literal(tuple(providers))
# '(\'provider1\', \'provider2\')'
SQL = "SELECT * FROM table WHERE provider IN %s"
args = (str_value,)
cursor.execute(SQL, args)
Another answer that I don't like particularly, but will work for your apparent use-case:
providers = tuple(str(item) for item in providers) # ('provider1', 'provider2')
# rest of stuff...
SQL = 'SELECT * FROM table WHERE provider IN {}'.format(repr(providers))
cursor.execute(SQL)
You should probably do the string replacement before passing it to the cursor object to execute:
sql = "SELECT * FROM table WHERE provider IN (%s)" % \
(','.join(str(x) for x in providers))
cursor.execute(sql)
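The string-building answers above put the values themselves into the SQL text. A hedged alternative is to format only the placeholders into the string, one per value, and pass the values separately. In this sketch sqlite3 stands in for MySQLdb so that it runs on its own (sqlite3 uses ?, MySQLdb would use %s), and the table name t is a made-up substitute for the question's table.

```python
import sqlite3

providers = ["provider1", "provider2"]

# One placeholder per value; only the placeholders are formatted into the SQL.
placeholders = ", ".join("?" for _ in providers)
sql = "SELECT * FROM t WHERE provider IN ({})".format(placeholders)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (provider TEXT)")
con.executemany("INSERT INTO t VALUES (?)",
                [("provider1",), ("provider2",), ("provider3",)])
matches = con.execute(sql + " ORDER BY provider", providers).fetchall()
print(sql)
print(matches)
```

An empty providers list would produce IN (), which is not valid in every database, so that case is best handled before building the query.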
If the required IDs arrive as a string, you can split it and filter with the Django ORM instead:
some_vals = '1 3 5 76 5 4 2 5 7 8'.split() # convert to suitable type if required
SomeModel.objects.filter(provider__in=some_vals)
"SELECT * FROM table WHERE provider IN ({0},{1},{2})".format(*args) #where args is list or tuple of arguments.
Try this; it should work:
SQL = "SELECT * FROM table WHERE provider IN %s"%(providers)
exec 'cursor.execute("%s")'%(SQL)
