I am trying to build a valid SQL statement on the fly, including boundary constraints. My question is: is there an easy way to build a valid SQL statement using a for loop like the one below?
sql = 'SELECT * from table '
first_statement = True
r = dict(request.query)
for k, v in r.items():
    if first_statement:
        sql = sql + ' where {} = {}'.format(k, v)
        first_statement = False
    else:
        sql = sql + ' and {} = {}'.format(k, v)
Or, in such a case, is it better to use a structure like the following?
marker = False
if boundarydate1 and boundarydate2:
    sql = sql + ' where year(date) between {} and {}'.format(boundarydate1, boundarydate2)
    marker = True
elif boundarydate1:
    sql = sql + ' where year(date) = {}'.format(boundarydate1)
    marker = True
elif boundarydate2:
    sql = sql + ' where year(date) = {}'.format(boundarydate2)
    marker = True
if marker and boundaryparam2:
    sql = sql + ' and boundaryparam2 = {}'.format(boundaryparam2)
elif boundaryparam2:
    sql = sql + ' where boundaryparam2 = {}'.format(boundaryparam2)
    marker = True
if marker and boundaryparam3:
    sql = sql + ' and boundaryparam3 = {}'.format(boundaryparam3)
elif boundaryparam3:
    sql = sql + ' where boundaryparam3 = {}'.format(boundaryparam3)
    marker = True
if marker and boundaryparam4:
    sql = sql + ' and boundaryparam4 = {}'.format(boundaryparam4)
elif boundaryparam4:
    sql = sql + ' where boundaryparam4 = {}'.format(boundaryparam4)
    marker = True
if marker and boundaryparam5:
    sql = sql + ' and boundaryparam5 = {}'.format(boundaryparam5)
elif boundaryparam5:
    sql = sql + ' where boundaryparam5 = {}'.format(boundaryparam5)
Basically, I am trying to reach a valid SQL statement like the one below:
SELECT * from x where date between boundarydate1 and boundarydate2 and column1 = dhkjf and column2 = 343
P.S.:
Is there a short way to build a SQL statement that only contains boundaries for hardcoded keys like r['predefined'], so that if r = {'predefined': 'someValue', 'spamKey': 'spamValue'} the spam value won't be inserted into the query?
Yes, but it is not a good idea to build queries like that: it easily allows SQL injection.
You can do it with the following code:
constraints = ' AND '.join('{} = {}'.format(*t) for t in r.items())
sql = 'SELECT * FROM table WHERE {}'.format(constraints)
But this is not safe: if the key or the value contains, for instance, backquotes, an attacker could exploit this to export your database.
I would advise at least escaping the values (usually - hopefully - a user only has control over the values). You can use placeholders like %s and let a library like MySQLdb escape the values for you. For instance:
constraints = ' AND '.join('{} = %s'.format(k) for k in r.keys())
sql = 'SELECT * FROM table WHERE {}'.format(constraints)
cursor.execute(sql, list(r.values()))
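Regarding the P.S.: one way to keep only hardcoded keys is to filter the request dictionary against an allowlist before building the constraints. A minimal sketch, assuming a hardcoded set named allowed_keys (my name, not from your code):

# Hypothetical allowlist; anything not in it (e.g. 'spamKey') is silently dropped.
allowed_keys = {'predefined'}
filtered = {k: v for k, v in r.items() if k in allowed_keys}

constraints = ' AND '.join('{} = %s'.format(k) for k in filtered.keys())
sql = 'SELECT * FROM table WHERE {}'.format(constraints)
cursor.execute(sql, list(filtered.values()))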
Furthermore, there are a lot of Python libraries (like SQLAlchemy) with which you can safely construct SQL queries (assuming, of course, that the libraries themselves are safe). This usually has a positive impact on the security of your application.
I therefore strongly advise you to use some sort of library to talk to the database. It usually provides a nice additional level of abstraction, and it will usually protect you against most security vulnerabilities.
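For illustration, a minimal sketch of the same kind of filtered query using SQLAlchemy Core (the connection URL and the table definition here are assumptions, not taken from your code):

from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String, select, and_

# Hypothetical engine and schema; adjust to your real database.
engine = create_engine('mysql+mysqldb://user:password@localhost/mydb')
metadata = MetaData()
table = Table('table', metadata,
              Column('predefined', String(50)),
              Column('year', Integer))

# Only keys that match real columns are used; values are bound safely by the library.
filters = [table.c[k] == v for k, v in r.items() if k in table.c]
query = select(table).where(and_(*filters))

with engine.connect() as conn:
    rows = conn.execute(query).fetchall()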
Related
So I have a table with all of the products and their language availability. I want to write a function that checks the language availability given a product name and a language.
My code is as follows:
"CREATE TABLE t (Language,French, Italian, German, Spanish, Japanese, Korean, 'Traditional Chinese');")
//insert data to table t
def checkLanguageAvailability(self, product, language):
    query = "SELECT " + language + " FROM t WHERE Language = '" + product + "'"
    cur = self.df.cursor()
    cur.execute(query)
    # print cur.fetchall()
    res = cur.fetchall()
    if res[0][0] == '':
        return False
    elif int(float(res[0][0])) != 0:
        return True
So when I test it, it all works fine with one-word text:
checkLanguageAvailability("productname", 'French') --> True
But with multiple-word text:
checkLanguageAvailability("productname", 'Traditional Chinese')
it raises this error:
cur.execute(query)
sqlite3.OperationalError: no such column: Traditional
It seems that instead of taking the whole string 'Traditional Chinese' as a parameter, it just takes 'Traditional', and there is no column with that name in the table.
I disagree with your table structure and also with your code. Adding a new column for each language is costly and maximally inflexible: this approach requires a major schema change each time you decide to support a new language. In addition, your current concatenated query string is prone to SQL injection. Beyond this, you should generally not treat column names in a query as parameters; when you find yourself doing this, it may indicate bad design or a hack. Instead, I propose the following table:
CREATE TABLE t (language TEXT, product TEXT)
This design represents the presence of a given product and language as a single row. Hence, if we find a record entry for a given product and language then we know it is present.
Try using code something like the following:
def checkLanguageAvailability(self, product, language):
    cur = self.df.cursor()
    cur.execute("SELECT 1 FROM t WHERE product = ? AND language = ?", (product, language))
    res = cur.fetchall()
    cnt = len(res)
    if cnt == 0:
        return False
    else:
        return True
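A quick usage sketch against the proposed schema (the table contents below are made up for illustration):

import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute("CREATE TABLE t (language TEXT, product TEXT)")
# One row per available (product, language) combination.
cur.execute("INSERT INTO t (language, product) VALUES (?, ?)", ('French', 'productname'))
cur.execute("INSERT INTO t (language, product) VALUES (?, ?)", ('Traditional Chinese', 'productname'))
conn.commit()

cur.execute("SELECT 1 FROM t WHERE product = ? AND language = ?", ('productname', 'Traditional Chinese'))
print(bool(cur.fetchall()))  # True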
Use LIKE:
def checkLanguageAvailability(self, product, language):
    query = "SELECT " + language + " FROM t WHERE Language LIKE '%" + product + "%'"
    cur = self.df.cursor()
    cur.execute(query)
    # print cur.fetchall()
    res = cur.fetchall()
    if res[0][0] == '':
        return False
    elif int(float(res[0][0])) != 0:
        return True
And is this query accepting any external input? Because if so, you should use prepared statements.
I am trying to get the MSSQL table column names using pyodbc, and I am getting an error saying:
ProgrammingError: No results. Previous SQL was not a query.
Here is my code:
class get_Fields:
    def GET(self, r):
        web.header('Access-Control-Allow-Origin', '*')
        web.header('Access-Control-Allow-Credentials', 'true')
        fields = []
        datasetname = web.input().datasetName
        tablename = web.input().tableName
        cnxn = pyodbc.connect(connection_string)
        cursor = cnxn.cursor()
        query = "USE" + "[" + datasetname + "]" + "SELECT COLUMN_NAME,* FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = " + "'" + tablename + "'"
        cursor.execute(query)
        DF = DataFrame(cursor.fetchall())
        columns = [column[0] for column in cursor.description]
        return json.dumps(columns)
How can I solve this?
You can avoid this by using some of pyodbc's built-in methods. For example, instead of:
query = "USE" + "[" +datasetname+ "]" + "SELECT COLUMN_NAME,* FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = " + "'"+ tablename + "'"
cursor.execute(query)
DF = DataFrame(cursor.fetchall())
Try:
column_data = cursor.columns(table=tablename, catalog=datasetname, schema='dbo').fetchall()
print(column_data)
That will return the column names (and other column metadata). I believe the column name is the fourth element in each row. This also addresses the very valid concerns about SQL injection. You can then work out how to build your DataFrame from the resulting data.
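For example, a sketch of pulling just the names out of that result (assuming, per the pyodbc documentation, that the column name is the fourth field of each row, also exposed as the column_name attribute):

column_data = cursor.columns(table=tablename, catalog=datasetname, schema='dbo').fetchall()
column_names = [row.column_name for row in column_data]  # equivalently row[3]
print(column_names)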
Good luck!
Your line
query = "USE" + "[" +datasetname+ "]" + "SELECT COLUMN_NAME,*...
Will produce something like
USE[databasename]SELECT ...
In SSMS this would work, but I'd suggest looking at proper spacing and separating the USE statement with a semicolon:
query = "USE " + "[" +datasetname+ "]; " + "SELECT COLUMN_NAME,*...
Set the database context using the Database attribute when building the connection string
Use parameters any time you are passing user input (especially from HTTP requests!) to a WHERE clause.
These changes eliminate the need for dynamic SQL, which can be insecure and difficult to maintain.
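A minimal sketch of both suggestions (the driver, server and credentials in the connection string are placeholders, not values from the question):

import pyodbc

# Set the database context via the connection string instead of a USE statement.
connection_string = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=" + datasetname + ";"
    "UID=myuser;PWD=mypassword"
)
cnxn = pyodbc.connect(connection_string)
cursor = cnxn.cursor()

# Pass the table name as a parameter rather than concatenating it into the SQL.
cursor.execute(
    "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = ?",
    (tablename,)
)
columns = [row[0] for row in cursor.fetchall()]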
When I run a query from the SQLite browser the table gets updated, but when I run the same query from Python the database does not get updated:
def updateDB(number, varCheck=True):
    conn = sqlite3.connect(db)
    c = conn.cursor()
    i = 1
    for each_test in number:
        c.execute("UPDATE table1 SET val='%s' WHERE amount='%s' AND rank='%s'" % (each_test, str(i), 'rank2'))
        i += 1
    conn.commit()
    conn.close()
    return True
How can I fix the issue? I run the Python code as sudo.
In the past I had similar issues while creating SQL queries. I suspect your SQL query is not being correctly formatted. The % string interpolation method can be a problem. Try using .format() on the SQL query string. PEP 3101 explains why .format() is preferred over the % operator for string interpolation.
val = '"' + each_test + '"'
amount = '"' + str(i) + '"'
rank = '"' + "rank2" + '"'
sql_query = "UPDATE table1 SET val={val} WHERE amount={amount} AND rank={rank}".format(val=val, amount=amount, rank=rank)
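For completeness, a sketch of the whole function with that change applied (reusing the names from the question; behaviour is otherwise unchanged):

import sqlite3

def updateDB(number, varCheck=True):
    conn = sqlite3.connect(db)
    c = conn.cursor()
    i = 1
    for each_test in number:
        val = '"' + each_test + '"'
        amount = '"' + str(i) + '"'
        rank = '"' + "rank2" + '"'
        sql_query = "UPDATE table1 SET val={val} WHERE amount={amount} AND rank={rank}".format(
            val=val, amount=amount, rank=rank)
        c.execute(sql_query)
        i += 1
    conn.commit()  # the commit is still required for the changes to persist
    conn.close()
    return True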
I have this Python code:
def get_employees(conditions, fields):
    cursor.execute("SELECT employeeID FROM employees WHERE name=%s, budget=%s, year=%s, ...(some of conditions)")
Is there any way to get employeeIDs if I set only some of the parameters in conditions, e.g. only name and year?
If conditions were a dictionary, you could construct a query string:
def get_employees(conditions):
    query = 'select employeeid from employees'
    if conditions:
        query += ' where ' + ' and '.join(key + ' = %s' for key in conditions.keys())
    cursor.execute(query, list(conditions.values()))
(I should note that here I am assuming that conditions does not have user-supplied keys. If there are user-supplied keys, this is definitely vulnerable to SQL injection.)
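For instance, a usage sketch with only some of the parameters supplied (the values are illustrative):

# Only name and year are constrained; budget is simply omitted.
get_employees({'name': 'Alice', 'year': 2015})
# Executes: select employeeid from employees where name = %s and year = %s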
Usually it is done via dynamic SQL building, like this:
sql = "SELECT employeeID FROM employees WHERE 1=1"
if 'name' in conditions:
    sql += " and name='" + escape(name) + "'"
if 'somefield' in conditions:
    sql += " and somefield='" + escape(somefield) + "'"
# ... and so on for the other fields
cursor.execute(sql)
Consider the following code snippet:
import MySQLdb

def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
    return cursor.fetchall()

print(get_data(1))
There is a major problem in the code - it is vulnerable to SQL injection attacks, since the query is not parameterized through the DB API but constructed via string formatting. If you call the function this way:
get_data("'; DROP TABLE TEST -- ")
the following query would be executed:
SELECT * FROM TEST WHERE ID = ''; DROP TABLE TEST --
Now, my goal is to analyze the code in the project and detect all places potentially vulnerable to SQL injection; in other words, all places where the query is constructed via string formatting as opposed to passing query parameters in a separate argument.
Is this something that can be solved statically, with the help of pylint, pyflakes or any other static code analysis package?
I'm aware of the popular penetration-testing tool sqlmap, but, as far as I understand, it works against a web resource, testing it as a black box through HTTP requests.
There is a tool that tries to solve exactly what the question is about, py-find-injection:
py_find_injection uses various heuristics to look for SQL injection
vulnerabilities in python source code.
It uses the ast module, looks for session.execute() and cursor.execute() calls, and checks whether the query inside is formed via string interpolation, concatenation or format().
Here is what it outputs while checking the snippet in the question:
$ py-find-injection test.py
test.py:6 string interpolation of SQL query
1 total errors
The project, though, is not actively maintained, but could be used as a starting point. A good idea would be to make a pylint or pyflakes plugin out of it.
Not sure how this will compare with the other packages, but to a certain extent you need to parse the arguments being passed to cursor.execute. This bit of pyparsing code looks for:
arguments using string interpolation
arguments using string concatenation with variable names
arguments that are just variable names
But sometimes arguments use string concatenation just to break a long string into multiple lines - if all the strings in the expression are literals being added together, there is no risk of SQL injection.
This pyparsing snippet will look for calls to cursor.execute, and then look for the at-risk argument forms:
from pyparsing import *
import re

identifier = Word(alphas, alphanums+'_')
integer = Word(nums)
LPAR, RPAR, PLUS, PERCENT = map(Literal, '()+%')

# matches %s-style interpolation placeholders inside a quoted string
stringInterpRE = re.compile(r"%-?\d*\*?\.?\d*\*?s")
def containsStringInterpolation(s, l, tokens):
    if not stringInterpRE.search(tokens[0]):
        raise ParseException(s, l, "No string interpolation")

tupleContents = identifier | integer
tupleExpr = LPAR + delimitedList(tupleContents) + RPAR
stringInterpArg = identifier | tupleExpr
interpolatedString = originalTextFor(quotedString.copy().setParseAction(containsStringInterpolation) +
                                     PERCENT + stringInterpArg)

stringTerm = interpolatedString | OneOrMore(quotedString.copy()) | identifier
stringTerm.setName("stringTerm")

unsafeStringExpr = (stringTerm + OneOrMore(PLUS + stringTerm)) | identifier | interpolatedString
def unsafeExpr(s, l, tokens):
    # only flag expressions containing an interpolated string or a bare variable name
    if not any(term == interpolatedString or term == identifier
               for term in tokens):
        raise ParseException(s, l, "No unsafe string terms")
unsafeStringExpr.setParseAction(unsafeExpr)
unsafeStringExpr.setName("unsafeExpr")

func = Literal("cursor.execute")
statement = func + LPAR + unsafeStringExpr + RPAR
statement.setName("execute stmt")
#statement.ignore(pythonComment)

for tokens in statement.searchString(sample):
    print(' '.join(tokens.asList()))
This will scan through the following sample:
sample = """
import MySQLdb

def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id)
    cursor.execute("SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE")
    cursor.execute(sqlVar + " -- UNSAFE")
    cursor.execute("SELECT * FROM TEST WHERE ID = 'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST WHERE ID = " +
                   "'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST "
                   "WHERE ID = "
                   "'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST "
                   "WHERE ID = " +
                   "'%s' -- UNSAFE" % name)
    return cursor.fetchall()

print(get_data(1))"""
and report these unsafe statements:
cursor.execute ( "SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id )
cursor.execute ( "SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE" )
cursor.execute ( sqlVar + " -- UNSAFE" )
cursor.execute ( "SELECT * FROM TEST " "WHERE ID = " + "'%s' -- UNSAFE" % name )
You can also have pyparsing report the location of the found lines, using scanString instead of searchString.
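A sketch of that variant (scanString yields the matched tokens together with start and end offsets, from which a line number can be derived):

for tokens, start, end in statement.scanString(sample):
    line_no = sample.count("\n", 0, start) + 1
    print("line {}: {}".format(line_no, ' '.join(tokens.asList())))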
About the best that I can think you'd get would be grep'ing through your codebase, looking for cursor.execute() statements being passed a string using Python string interpolation, as in your example:
cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
which of course should have been written as a parameterized query to avoid the vulnerability:
cursor.execute("SELECT * FROM TEST WHERE ID = %s", (id,))
That's not going to be perfect -- for instance, you might have a hard time catching code like this:
query = "SELECT * FROM TEST WHERE ID = '%s'" % id
# some stuff
cursor.execute(query)
But it might be about the best you can easily do.
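A rough sketch of that kind of grep-style scan, written in Python with the re module (the pattern is a heuristic of my own and, as noted above, it will miss queries built up in a separate variable):

import re
import sys

# Flag execute() calls whose string argument is combined with %, .format(), or + concatenation.
RISKY_CALL = re.compile(r"\.execute\(\s*[\"'].*[\"']\s*(%|\.format\(|\+)")

for path in sys.argv[1:]:
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            if RISKY_CALL.search(line):
                print("{}:{}: possible SQL built via string formatting".format(path, lineno))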
It's a good thing that you're already aware of the problem and trying to resolve it.
As you may already know, the best practice for executing SQL against any DB is to use prepared statements, or stored procedures if these are available.
In this particular case, you can implement a prepared statement by "preparing" the statement and then executing it.
For example:
cursor = db.cursor()
query = "SELECT * FROM TEST WHERE ID = %s"
cursor.execute(query, ("2",))
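Putting that together with the snippet from the question, the whole function might look like this (a sketch; only the query construction changes):

import MySQLdb

def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    # The value is passed separately, so the driver escapes it; no string formatting involved.
    cursor.execute("SELECT * FROM TEST WHERE ID = %s", (id,))
    return cursor.fetchall()

print(get_data(1))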