Sqlite select column text with multple words - python

So i have a table with all of the products and their language availability. I want to write a function to check the language availability when input the product name and language.
My code is as follow:
"CREATE TABLE t (Language,French, Italian, German, Spanish, Japanese, Korean, 'Traditional Chinese');")
//insert data to table t
def checkLanguageAvailability(self, product, language):
query = "SELECT " + language + " FROM t WHERE Language = '" + product + "'"
cur = self.df.cursor()
cur.execute(query)
# print cur.fetchall()
res = cur.fetchall()
if res[0][0] == '':
return False
elif int(float(res[0][0])) != 0:
return True
so when i test it , it all works fine with one-word text
checkLanguageAvailability("productname",'French')) --> True
But with multiple-word text
checkLanguageAvailability("productname",'Traditional Chinese'))
it raise this error :
cur.execute(query)
sqlite3.OperationalError: no such column: Traditional
It seems that instead of taking the whole string 'Traditional Chinese' as a parameter, it just take 'traditional' and there is no column have this name in table

I disagree with your table structure and also with your code. Adding a new column for each language is costly and maximally inflexible. This approach requires a major schema change each time you decide to support a new language. In addition, your current concatenated query string is prone to SQL injection. Beyond this, you should generally not make the column names in a query as parameters. When you find yourself doing this, it might indicate bad design or a hack. Instead, I propose the following table:
CREATE TABLE t (language TEXT, product TEXT)
This design represents the presence of a given product and language as a single row. Hence, if we find a record entry for a given product and language then we know it is present.
Try using code something like the following:
def checkLanguageAvailability(self, product, language):
cur = self.df.cursor()
cmd = cur.execute("SELECT 1 FROM t WHERE product = ? AND language = ?", (product, language))
res = cur.fetchall()
cnt = len(res)
if cnt == 0
return False
else
return True

Use LIKE:
def checkLanguageAvailability(self, product, language):
query = "SELECT " + language + " FROM t WHERE Language LIKE '%" + product + "%'"
cur = self.df.cursor()
cur.execute(query)
# print cur.fetchall()
res = cur.fetchall()
if res[0][0] == '':
return False
elif int(float(res[0][0])) != 0:
return True
And is this query accepting any external input? Because if so, you should use prepared statements.

Related

How can I (safely) convert a dictionary into an UPDATE statement for sqlite in Python?

Say for example, I have a table of students, and I have a Python dictionary
mydict = {"fname" : "samwise", "lname" : "gamgee", "age" : 13}
How can I safely generate a Python function that can UPDATE this into my student table? (In my use-case I can safely assume that the student already exists in the table, AND I KNOW the id already)
I have created a function that achieves this functionality, but I can't help but think it's a bit crude, and perhaps open to SQL injection attacks
def sqlite_update(conn, table, data, pkeyname, pkeyval):
set_lines = ""
for k,v in data.items():
set_lines += "{} = '{}', ".format(k,v)
set_lines = set_lines[:-2] #remove space and comma from last item
sql = "UPDATE {0} SET {1} WHERE {2} = '{3}'"
statement = sql.format(table, set_lines, pkeyname, pkeyval)
conn.execute(statement)
conn.commit()
And to update I just call
sqlite_update(conn, "student", mydict, "id", 1)
As I assume you are using sqlalchemy. In this case, you can use sqlalchemy.sql.text function which escapes strings if required.
You can try to adjust your function as below.
from sqlalchemy.sql import text
def sqlite_update(conn, table, data, pkeyname, pkeyval):
set_lines = ",".join([f"{k}=:{k}" for k in data.keys()])
statement = text(f"UPDATE {table} SET {set_lines} WHERE {pkeyname} = :pkeyval")
args = dict(data)
args["pkeyval"] = pkeyval
conn.execute(statement, args)
conn.commit()
For more details, refer to sqlalchemy official documentation on text function.
EDIT
As for sqlite3 connection you can do basically the same thing as above, with slight changes.
def sqlite_update(conn, table, data, pkeyname, pkeyval):
set_lines = ",".join([f"{k}=:{k}" for k in data.keys()])
statement = f"UPDATE {table} SET {set_lines} WHERE {pkeyname} = :pkeyval"
args = dict(data)
args["pkeyval"] = pkeyval
conn.execute(statement, args)
conn.commit()
Refer to sqlite3 execute
This is indeed widely opened to SQL injection, because you build the query as a string including its values, instead of using a parameterized query.
Building a parameterized query with Python is easy:
def sqlite_update(conn, table, data, pkeyname, pkeyval):
query = f"UPDATE {table} SET " + ', '.join(
"{}=?".format(k) for k in data.keys()) + f" WHERE {pkeyname}=?"
# uncomment next line for debugging
# print(query, list(data.values()) + [pkeyval])
conn.execute(query, list(data.values()) + [pkeyval])
With your example, the query displayed by the debug print line is:
UPDATE student SET fname=?, lname=?, age=? WHERE id=?
with the following values list: ['samwise', 'gamgee', 13, 1]
But beware, to be fully protected from SQL injection, you should sanitize the table and field names to ensure they contain no dangerous characters like ;

Scan a single column in SQL Server for a data entry using python

Example of a column:
This is what I have tried. I only want to search based on a single column in the table. Lets says the table name is Employees. The input parameter is entered by the user in console.
exists = cursor.execute("SELECT TOP 1 * FROM Employees WHERE ID = ?", (str(input),))
print(exists)
if exists is None:
return False
else:
return True
I think this is what you are looking for:
insert_query = '''SELECT TOP 1 * FROM EmployeeTable WHERE ID = (?);''' # '?' is a placeholder
cursor.execute(insert_query, str(input))

nested if vs one if in loop Python

I am trying to build on the fly valid SQL statement including boundary constrains. My question is is there any easy way to build valid SQL statement using for loop like below?
sql = 'SELECT * from table '
firststatment = True
r = dict(request.query)
for k,v in r.items():
if firststatment:
sql = sql + ' where {} = {}'.format(k,v)
firststatment = False
else:
sql = sql + ' and {} = {}'.format(k,v)
or in such case is better to use a structure like
if bondarydate1 and bondarydate2:
sql = sql + ' where year(date) between ({} and {})'.format(bondarydate1, bondarydate2)
marker = True
elif bondarydate1:
sql = sql + ' where year(date) = {}'.format(bondarydate1)
marker = True
elif bondarydate2:
sql = sql + ' where year(date) = {}'.format(bondarydate2)
marker =True
if marker and boundaryparam2:
sql = sql + ' and boundaryparam2 = {}'.format(boundaryparam2)
elif boundaryparam2:
sql = sql + ' where boundaryparam2 = {}'.format(boundaryparam2)
marker = True
if marker and boundaryparam3:
sql = sql+' and boundaryparam3 = {}'.format(boundaryparam3)
elif boundaryparam3:
sql = sql + ' where boundaryparam3 = {}'.format(boundaryparam3)
marker = True
if marker and boundaryparam4:
sql = sql + ' and boundaryparam4 = {}'.format(boundaryparam4)
elif boundaryparam4:
sql = sql + ' where boundaryparam4 = {}'.format(boundaryparam4)
marker = True
if marker and boundaryparam5:
sql = sql + ' and boundaryparam5 = {}'.format(boundaryparam5)
elif boundaryparam5:
sql = sql +' where boundaryparam5 = {}'.format(boundaryparam5)
basically I am trying somehow to rich valid SQL statement like below
SELECT * from x where date between bondarydate1 and boundarydate2 and column1 = dhkjf and column2 = 343
P.S:
Is there some short way to build SQL which will contain only boundary for hardcoded keys like r['predefined'] and if r = {'predefined':'someValue', 'sapmKey':'spamvalue'} so that spam value won't be inserted in query?
Yes, but it is not a good idea to build queries like that: it easily allows SQL injection.
You can do it with the following code:
constraints = ' AND '.join('{} = {}'.format(*t) for t in r.items())
sql = 'SELECT * FROM table WHERE {}'.format(constraints)
But this is not safe: if the key or the value contains for instance backquotes, a hacker could aim to export your database.
I would advice to at least escape the values (usually - hopefully - a user has only control over the values), and then you can use placeholders like %s, and let a library like MySQLdb escape the values. For instance:
constraints = ' AND '.join('{} = %s'.format(k) for k in r.keys())
sql = 'SELECT * FROM table WHERE {}'.format(constraints)
cursor.execute(sql, r.values())
Furthermore there exist a lot of Python libraries (like SQLAlchemy) where one can safely construct SQL queries (well of course given the libraries themselves are safe). Which will usually have a positive impact on the security of your application.
I thus strongly advice you to use some sort of library to talk to the database. Usually it will provide an additional level of abstraction which is nice, and furthermore it will usually protect you against most security vulnerabilities.

Building the appropriate SQL queries for a web from, when order doesn't matter

I will read the input into the following web form:
via Flask thus:
def query():
if request.method == "POST":
job_title_input = request.form['title']
description_operator = request.form['DescriptionOperator']
description_input = request.form['description']
salary_operator = request.form['SalaryOperator']
salary_input = request.form['salary']
location_input = request.form['location']
The values received will be used to query this table in SQLite:
CREATE TABLE scrapedjobs
(id integer primary key autoincrement not null,
huburl text,
business_name text,
location text,
jobtitle text,
description text,
joburl text,
salary int,
date_first_scraped date)
It's my first time attempting to handle of the permutations in a web form's submission, and after cascading down through just the first two fields, I can see already that it's is getting convoluted:
base_query = """SELECT * FROM scrapedjobs"""
if job_title_input != '':
base_query = base_query + """ WHERE jobtitle IN ('""" + job_title_input + """')"""
job_title_input_flag = """ AND"""
else:
job_title_input_flag = """"""
if description_input != '':
if job_title_input_flag != '':
base_query = base_query + """ AND description """ + description_operator + " " + "('" + description_input + "')"
description_input_flag = """ AND"""
else:
base_query = base_query + """ WHERE description """ + description_operator + " " + "('" + description_input + "')"
description_input_flag = """ AND"""
else:
description_input_flag = """"""
The further down I go, the more nesting of if/else's will be needed, and it's not sustainable if new fields are added to the form. A major flaw also is that I'm setting things up whereby each field depends on whether the previous fields had a value, and so on. The reality of course is that in a straight query like this without grouping or joins or the like, order is irrelevant, once I can take care of when I need AND (in case a field did have a value) and when I need instead WHERE. Worse still, I will have to parameterize all of this when I pass it to cursor.execute() in sqlite3.
I'm not the first person to every read in values from a multiple form and build appropriate SELECTs - what patterns or more efficient techniques can I take advantage of?
I've been there and I know how it will end. Manually constructing the queries like in your case is highly error-prone, not readable and leads to supporting nightmares. Instead, look into abstracting things away with sqlalchemy and flask-sqlalchemy specifically. With SQLAlchemy you can compose queries in pieces and it will build the final query correctly.
As a real world example, study the Flask Mega-Tutorial (sqlalchemy is used in chapter 4 and later).

Detect SQL injections in the source code

Consider the following code snippet:
import MySQLdb
def get_data(id):
db = MySQLdb.connect(db='TEST')
cursor = db.cursor()
cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
return cursor.fetchall()
print(get_data(1))
There is a major problem in the code - it is vulnerable to SQL injections attacks since the query is not parameterized through DB API and is constructed via string formatting. If you call the function this way:
get_data("'; DROP TABLE TEST -- ")
the following query would be executed:
SELECT * FROM TEST WHERE ID = ''; DROP TABLE TEST --
Now, my goal is to analyze the code in the project and detect all places potentially vulnerable to SQL injections. In other words, where the query is constructed via string formatting as opposed to passing query parameters in a separate argument.
Is it something that can be solved statically, with the help of pylint, pyflakes or any other static code analysis packages?
I'm aware of sqlmap popular penetration testing tool, but, as far as I understand, it is working against a web resource, testing it as a black-box through HTTP requests.
There is a tool that tries to solve exactly what the question is about, py-find-injection:
py_find_injection uses various heuristics to look for SQL injection
vulnerabilities in python source code.
It uses ast module, looks for session.execute() and cursor.execute() calls, and checks whether the query inside is formed via string interpolation, concatenation or format().
Here is what it outputs while checking the snippet in the question:
$ py-find-injection test.py
test.py:6 string interpolation of SQL query
1 total errors
The project, though, is not actively maintained, but could be used as a starting point. A good idea would be to make a pylint or pyflakes plugin out of it.
Not sure how this will compare with the other packages, but to a certain extent you need to parse the arguments being passed to cursor.execute. This bit of pyparsing code looks for:
arguments using string interpolation
arguments using string concatenation with variable names
arguments that are just variable names
But sometimes arguments use string concatenation just to break up a long string into - if all the strings in the expression are literals being added together, there is no risk of SQL injection.
This pyparsing snippet will look for calls to cursor.execute, and then look for the at-risk argument forms:
from pyparsing import *
import re
identifier = Word(alphas, alphanums+'_')
integer = Word(nums)
LPAR,RPAR,PLUS,PERCENT = map(Literal, '()+%')
stringInterpRE = re.compile(r"%-?\d*\*?\.?\d*\*?s")
def containsStringInterpolation(s,l,tokens):
if not stringInterpRE.search(tokens[0]):
raise ParseException(s,l,"No string interpolation")
tupleContents = identifier | integer
tupleExpr = LPAR + delimitedList(tupleContents) + RPAR
stringInterpArg = identifier | tupleExpr
interpolatedString = originalTextFor(quotedString.copy().setParseAction(containsStringInterpolation) +
PERCENT + stringInterpArg)
stringTerm = interpolatedString | OneOrMore(quotedString.copy()) | identifier
stringTerm.setName("stringTerm")
unsafeStringExpr = (stringTerm + OneOrMore(PLUS + stringTerm)) | identifier | interpolatedString
def unsafeExpr(s,l,tokens):
if not any(term == interpolatedString or term == identifier
for term in tokens):
raise ParseException(s,l,"No unsafe string terms")
unsafeStringExpr.setParseAction(unsafeExpr)
unsafeStringExpr.setName("unsafeExpr")
func = Literal("cursor.execute")
statement = func + LPAR + unsafeStringExpr + RPAR
statement.setName("execute stmt")
#statement.ignore(pythonComment)
for tokens in statement.searchString(sample):
print ' '.join(tokens.asList())
This will scan through the following sample:
sample = """
import MySQLdb
def get_data(id):
db = MySQLdb.connect(db='TEST')
cursor = db.cursor()
cursor.execute("SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id)
cursor.execute("SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE")
cursor.execute(sqlVar + " -- UNSAFE")
cursor.execute("SELECT * FROM TEST WHERE ID = 'FRED' -- SAFE")
cursor.execute("SELECT * FROM TEST WHERE ID = " +
"'FRED' -- SAFE")
cursor.execute("SELECT * FROM TEST "
"WHERE ID = "
"'FRED' -- SAFE")
cursor.execute("SELECT * FROM TEST "
"WHERE ID = " +
"'%s' -- UNSAFE" % name)
return cursor.fetchall()
print(get_data(1))"""
and report these unsafe statements:
cursor.execute ( "SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id )
cursor.execute ( "SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE" )
cursor.execute ( sqlVar + " -- UNSAFE" )
cursor.execute ( "SELECT * FROM TEST " "WHERE ID = " + "'%s' -- UNSAFE" % name )
You can also have pyparsing report the location of the found lines, using scanString instead of searchString.
About the best that I can think you'd get would be grep'ing through your codebase, looking for cursor.execute() statements being passed a string using Python string interpolation, as in your example:
cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
which of course should have been written as a parameterized query to avoid the vulnerability:
cursor.execute("SELECT * FROM TEST WHERE ID = '%s'", (id,))
That's not going to be perfect -- for instance, you might have a hard time catching code like this:
query = "SELECT * FROM TEST WHERE ID = '%s'" % id
# some stuff
cursor.execute(query)
But it might be about the best you can easily do.
It's a good thing that you're already aware of the problem and trying to resolve it.
As you may already know, the best practices to execute SQL in any DB is to use prepared statements or stored procedures if these are available.
In this particular case, you can implement a prepared statement by "preparing" the statement and then executing.
e.g:
cursor = db.cursor()
query = "SELECT * FROM TEST WHERE ID = %s"
cur.execute(query, "2")

Categories