Using % wildcard with pg8000 - python

I have a query similar to below:
import pandas as pd
import pg8000

def connection():
    pcon = pg8000.connect(host='host', port=1234, user='user', password='password', database='database')
    return pcon, pcon.cursor()

pcon, pcur = connection()
query = """ SELECT * FROM db WHERE (db.foo LIKE 'string-%' OR db.foo LIKE 'bar-%')"""
db = pd.read_sql_query(query, pcon)
However, when I try to run the code, I get:
DatabaseError: '%'' not supported in a quoted string within the query string
I have tried escaping the symbol with \ and an additional % with no luck. How can I get pg8000 to treat this as a wildcard properly?

"In Python, % usually refers to a variable that follows the string. If you want a literal percent sign, then you need to double it. %%"
-- Source
LIKE 'string-%%'
Otherwise, if that doesn't work, PostgreSQL also supports underscores for pattern matching.
'abc' LIKE 'abc' true
'abc' LIKE 'a%' true
'abc' LIKE '_b_' true
But, as mentioned in the comments,
An underscore (_) in pattern stands for (matches) any single character; a percent sign (%) matches any sequence of zero or more characters
According to the source code, though, it would appear the problem is the single quote following the % in your LIKE statement.
if next_c == "%":
    in_param_escape = True
else:
    raise InterfaceError(
        "'%" + next_c + "' not supported in a quoted "
        "string within the query string")
So if next_c == "'" instead of next_c == "%", then you would get your error:
'%'' not supported in a quoted string within the query string
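As a rough illustration of the doubling rule quoted above, here is a minimal sketch in plain Python. The helper name `double_percent` is hypothetical (it is not part of pg8000), and it assumes the query contains no real %s placeholders, since those would be doubled too. Python's own % operator is used only to show that %% collapses back to a single %; whether a given pg8000 version needs the doubling at all is something to check against that version.

```python
def double_percent(sql: str) -> str:
    """Double every percent sign so a format-style driver reads it literally.

    Hypothetical helper for illustration only. It is only safe for queries
    without %s placeholders, because those would be doubled as well.
    """
    return sql.replace("%", "%%")


original = "SELECT * FROM db WHERE (db.foo LIKE 'string-%' OR db.foo LIKE 'bar-%')"
escaped = double_percent(original)
print(escaped)

# Python's % operator follows the same rule: '%%' collapses to '%'.
round_tripped = escaped % ()
print(round_tripped == original)
```

If the query also carries parameters, a less fragile route is usually to keep the % inside the parameter value rather than in the SQL text.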

With a recent version of pg8000 you shouldn't have any problems with a % in a LIKE. For example:
>>> import pg8000.dbapi
>>>
>>> con = pg8000.dbapi.connect(user="postgres", password="cpsnow")
>>> cur = con.cursor()
>>> cur.execute("CREATE TEMPORARY TABLE book (id SERIAL, title TEXT)")
>>> for title in ("Ender's Game", "The Magus"):
... cur.execute("INSERT INTO book (title) VALUES (%s)", [title])
>>>
>>> cur.execute("SELECT * from book WHERE title LIKE 'The %'")
>>> cur.fetchall()
([2, 'The Magus'],)

Related

select name where id = "in the python list"?

Let's say I have a Python list of customer IDs like this:
id = ('12','14','15','11',.......)
The list has 1000 values in it, and I need to insert the customer name into a table based on the IDs from the list above.
My code looks like this:
ids = ",".join(id)
sql = "insert into cust_table(name)values(names)where cust_id IN('ids')"
cursor.execute(sql)
After running the code, nothing is inserted into the table. What is my mistake?
Please help :(
You need to format the string.
ids = ",".join(id)
sql = "insert into cust_table(name)values(names)where cust_id IN('{ids}')"
cursor.execute(sql.format(ids= ids))
Simply writing the name of a variable into a string doesn't magically make its contents appear in the string.
>>> p = 'some part'
>>> s = 'replace p of a string'
>>> s
'replace p of a string'
>>> s = 'replace %s of a string' % p
>>> s
'replace some part of a string'
>>> s = 'replace {} of a string'.format(p)
>>> s
'replace some part of a string'
In your case this would mean:
>>> sql = "insert into cust_table (name) values (names) where cust_id IN ('%s')"
>>> ids = ", ".join(id)
>>> cursor.execute(sql % ids)
although I strongly suspect that you have a similar problem with names.
In order to avoid possible sql injection problems, it would be preferable to use a "parameterized statement". This would look something like:
>>> sql = 'insert into ... where cust_id IN %s'
>>> cursor.execute(sql, (id,))
Some database connectors for python are capable of this, but yours probably isn't.
A workaround might be something like
>>> params = ', '.join(['%s']*len(id))
>>> sql = 'insert into ... where cust_id IN (%s)' % params
>>> cursor.execute(sql, id)
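To make the placeholder workaround concrete, here is a small runnable sketch. It is an illustration under assumptions, not the asker's setup: it uses the standard-library sqlite3 module as a stand-in database, so the placeholder is ? where the answer's driver would use %s, and the sample rows and names are made up.

```python
import sqlite3

# In-memory stand-in database; table and column names follow the question.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE cust_table (cust_id TEXT PRIMARY KEY, name TEXT)")
cur.executemany(
    "INSERT INTO cust_table (cust_id, name) VALUES (?, ?)",
    [("11", "Ann"), ("12", "Bob"), ("13", "Cid"), ("14", "Dee")],  # made-up rows
)

ids = ("12", "14", "15", "11")  # stands in for the 1000-element list

# Build one placeholder per value; only the placeholders are formatted
# into the SQL text, never the values themselves.
placeholders = ", ".join(["?"] * len(ids))
sql = "SELECT name FROM cust_table WHERE cust_id IN ({}) ORDER BY cust_id".format(placeholders)

cur.execute(sql, ids)
names = [row[0] for row in cur.fetchall()]
print(names)
conn.close()
```

Drivers and servers may cap the number of parameters per statement, so a very long list might need to be processed in chunks.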

How to use like pattern matching with PostgreSQL and Python with multiple percentage (%) symbols?

I am trying to pattern match with LIKE LOWER('% %'), but I think the fact that I am using a Python variable with %s is messing it up. I can't find an escape character for the percent sign, and my program gives me no errors. Is this the problem, or is there something else I'm missing? It does work if I just run LIKE %s, but I need a "contains" search rather than an exact match.
# Ask for the database connection, and get the cursor set up
conn = database_connect()
if(conn is None):
    return ERROR_CODE
cur = conn.cursor()
print("search_term: ", search_term)
try:
    # Select the bays that match (or are similar) to the search term
    sql = """SELECT fp.name AS "Name", fp.size AS "Size", COUNT(*) AS "Number of Fish"
             FROM FishPond fp JOIN Fish f ON (fp.pondID = f.livesAt)
             WHERE LOWER(fp.name) LIKE LOWER('%%s%') OR LOWER(fp.size) LIKE LOWER('%%s%')
             GROUP BY fp.name, fp.size"""
    cur.execute(sql, (search_term, ))
    rows = cur.fetchall()
    cur.close() # Close the cursor
    conn.close() # Close the connection to the db
    return rows
except:
    # If there were any errors, return a NULL row printing an error to the debug
    print("Error with Database - Unable to search pond")
    cur.close() # Close the cursor
    conn.close() # Close the connection to the db
    return None
Instead of embedding the percent signs in the query string, you could wrap the search term string in percent signs, and then pass that to cursor.execute():
sql = 'SELECT * from FishPond fp WHERE LOWER(fp.name) LIKE LOWER(%s)'
search_term = 'xyz'
like_pattern = '%{}%'.format(search_term)
cur.execute(sql, (like_pattern,))
The query is simplified for the purpose of example.
This is more flexible because the calling code can pass any valid LIKE pattern to the query.
BTW: In PostgreSQL you can use ILIKE for case-insensitive pattern matching, so the example query could be written like this:
sql = 'SELECT * from FishPond fp WHERE fp.name ILIKE %s'
As noted in the documentation, ILIKE is a PostgreSQL extension, not standard SQL.
You can escape % with another %
>>> test = 'test'
>>> a = 'LIKE %%s%'
>>> a % test
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: incomplete format
>>>
>>> a = 'LIKE %%%s%%'
>>> a % test
'LIKE %test%'
P.S. you also have two placeholders, but you are passing only one argument in execute
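Combining the points above (build the LIKE pattern in Python, and supply one argument per placeholder), a runnable sketch might look like this. It is an assumption-laden illustration: sqlite3 stands in for PostgreSQL, so the placeholder is ? rather than %s, and the pond rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE FishPond (name TEXT, size TEXT)")
cur.executemany(
    "INSERT INTO FishPond (name, size) VALUES (?, ?)",
    [("Koi Corner", "large"), ("Lily Pad", "small"), ("Deep End", "LARGE")],  # invented rows
)

search_term = "LaR"
# The wildcards live in the parameter value, not in the SQL text.
like_pattern = "%{}%".format(search_term.lower())

# Two placeholders, so two arguments: the same pattern is passed twice.
sql = (
    "SELECT name FROM FishPond "
    "WHERE LOWER(name) LIKE ? OR LOWER(size) LIKE ? "
    "ORDER BY name"
)
cur.execute(sql, (like_pattern, like_pattern))
matches = [row[0] for row in cur.fetchall()]
print(matches)
conn.close()
```

Because the SQL text contains no literal percent signs, there is nothing to escape for the driver's own %s handling.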

Detect SQL injections in the source code

Consider the following code snippet:
import MySQLdb

def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
    return cursor.fetchall()

print(get_data(1))
There is a major problem in the code: it is vulnerable to SQL injection attacks, since the query is not parameterized through the DB API and is instead constructed via string formatting. If you call the function this way:
get_data("'; DROP TABLE TEST -- ")
the following query would be executed:
SELECT * FROM TEST WHERE ID = ''; DROP TABLE TEST --
Now, my goal is to analyze the code in the project and detect all places potentially vulnerable to SQL injections. In other words, where the query is constructed via string formatting as opposed to passing query parameters in a separate argument.
Is it something that can be solved statically, with the help of pylint, pyflakes or any other static code analysis packages?
I'm aware of sqlmap popular penetration testing tool, but, as far as I understand, it is working against a web resource, testing it as a black-box through HTTP requests.
There is a tool that tries to solve exactly what the question is about, py-find-injection:
py_find_injection uses various heuristics to look for SQL injection
vulnerabilities in python source code.
It uses ast module, looks for session.execute() and cursor.execute() calls, and checks whether the query inside is formed via string interpolation, concatenation or format().
Here is what it outputs while checking the snippet in the question:
$ py-find-injection test.py
test.py:6 string interpolation of SQL query
1 total errors
The project, though, is not actively maintained, but could be used as a starting point. A good idea would be to make a pylint or pyflakes plugin out of it.
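For a sense of how such an ast-based check can work, here is a minimal sketch. It is not py-find-injection's actual code: the class name `ExecuteCallChecker` and its heuristics are assumptions made for illustration, it only inspects the first argument of any .execute() call, and it assumes Python 3.8+ (where string literals parse as ast.Constant).

```python
import ast


class ExecuteCallChecker(ast.NodeVisitor):
    """Collect (line, reason) pairs for .execute() calls with a dynamically built query."""

    def __init__(self):
        self.findings = []

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Attribute) and func.attr == "execute" and node.args:
            reason = self._risk(node.args[0])
            if reason:
                self.findings.append((node.lineno, reason))
        self.generic_visit(node)  # keep walking nested calls

    def _risk(self, arg):
        if isinstance(arg, ast.BinOp) and isinstance(arg.op, ast.Mod):
            return "string interpolation"
        if isinstance(arg, ast.BinOp) and isinstance(arg.op, ast.Add):
            # Adding only string literals together is harmless.
            return None if self._is_literal(arg) else "string concatenation"
        if (isinstance(arg, ast.Call) and isinstance(arg.func, ast.Attribute)
                and arg.func.attr == "format"):
            return "str.format()"
        if isinstance(arg, ast.JoinedStr) and any(
                isinstance(v, ast.FormattedValue) for v in arg.values):
            return "f-string"
        return None

    def _is_literal(self, node):
        if isinstance(node, ast.Constant):
            return isinstance(node.value, str)
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            return self._is_literal(node.left) and self._is_literal(node.right)
        return False


SOURCE = '''\
def get_data(cursor, id):
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
    cursor.execute("SELECT * FROM TEST WHERE ID = '" + id + "'")
    cursor.execute("SELECT * FROM TEST " + "WHERE ID = 1")
    cursor.execute("SELECT * FROM TEST WHERE ID = %s", (id,))
'''

checker = ExecuteCallChecker()
checker.visit(ast.parse(SOURCE))
for lineno, reason in checker.findings:
    print("line {}: {} in SQL query".format(lineno, reason))
```

Like any purely syntactic check, this sketch misses queries that are built in a variable first and passed to execute() later.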
Not sure how this will compare with the other packages, but to a certain extent you need to parse the arguments being passed to cursor.execute. This bit of pyparsing code looks for:
arguments using string interpolation
arguments using string concatenation with variable names
arguments that are just variable names
But sometimes arguments use string concatenation just to break up a long string across several lines. If all the strings in the expression are literals being added together, there is no risk of SQL injection.
This pyparsing snippet will look for calls to cursor.execute, and then look for the at-risk argument forms:
from pyparsing import *
import re

identifier = Word(alphas, alphanums+'_')
integer = Word(nums)
LPAR,RPAR,PLUS,PERCENT = map(Literal, '()+%')

# matches %s-style conversions such as %s, %-10s, %.5s
stringInterpRE = re.compile(r"%-?\d*\*?\.?\d*\*?s")
def containsStringInterpolation(s,l,tokens):
    if not stringInterpRE.search(tokens[0]):
        raise ParseException(s,l,"No string interpolation")

tupleContents = identifier | integer
tupleExpr = LPAR + delimitedList(tupleContents) + RPAR
stringInterpArg = identifier | tupleExpr
interpolatedString = originalTextFor(quotedString.copy().setParseAction(containsStringInterpolation) +
                                     PERCENT + stringInterpArg)

stringTerm = interpolatedString | OneOrMore(quotedString.copy()) | identifier
stringTerm.setName("stringTerm")

unsafeStringExpr = (stringTerm + OneOrMore(PLUS + stringTerm)) | identifier | interpolatedString
def unsafeExpr(s,l,tokens):
    if not any(term == interpolatedString or term == identifier
               for term in tokens):
        raise ParseException(s,l,"No unsafe string terms")
unsafeStringExpr.setParseAction(unsafeExpr)
unsafeStringExpr.setName("unsafeExpr")

func = Literal("cursor.execute")
statement = func + LPAR + unsafeStringExpr + RPAR
statement.setName("execute stmt")
#statement.ignore(pythonComment)

# `sample` is the source text to scan (defined below)
for tokens in statement.searchString(sample):
    print(' '.join(tokens.asList()))
This will scan through the following sample:
sample = """
import MySQLdb

def get_data(id):
    db = MySQLdb.connect(db='TEST')
    cursor = db.cursor()
    cursor.execute("SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id)
    cursor.execute("SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE")
    cursor.execute(sqlVar + " -- UNSAFE")
    cursor.execute("SELECT * FROM TEST WHERE ID = 'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST WHERE ID = " +
                   "'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST "
                   "WHERE ID = "
                   "'FRED' -- SAFE")
    cursor.execute("SELECT * FROM TEST "
                   "WHERE ID = " +
                   "'%s' -- UNSAFE" % name)
    return cursor.fetchall()

print(get_data(1))"""
and report these unsafe statements:
cursor.execute ( "SELECT * FROM TEST WHERE ID = '%s' -- UNSAFE" % id )
cursor.execute ( "SELECT * FROM TEST WHERE ID = '" + id + "' -- UNSAFE" )
cursor.execute ( sqlVar + " -- UNSAFE" )
cursor.execute ( "SELECT * FROM TEST " "WHERE ID = " + "'%s' -- UNSAFE" % name )
You can also have pyparsing report the location of the found lines, using scanString instead of searchString.
About the best that I can think you'd get would be grep'ing through your codebase, looking for cursor.execute() statements being passed a string using Python string interpolation, as in your example:
cursor.execute("SELECT * FROM TEST WHERE ID = '%s'" % id)
which of course should have been written as a parameterized query, with an unquoted placeholder, to avoid the vulnerability:
cursor.execute("SELECT * FROM TEST WHERE ID = %s", (id,))
That's not going to be perfect -- for instance, you might have a hard time catching code like this:
query = "SELECT * FROM TEST WHERE ID = '%s'" % id
# some stuff
cursor.execute(query)
But it might be about the best you can easily do.
It's a good thing that you're already aware of the problem and trying to resolve it.
As you may already know, the best practice for executing SQL in any DB is to use prepared statements or stored procedures, if these are available.
In this particular case, you can get the same effect by writing the statement with a placeholder and passing the values separately when executing it.
e.g.:
cursor = db.cursor()
query = "SELECT * FROM TEST WHERE ID = %s"
cursor.execute(query, ("2",))
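The difference between the two styles can be seen in a small runnable sketch. This is an illustration under assumptions: sqlite3 stands in for MySQLdb (so the placeholder is ? rather than %s), the rows are invented, and the classic ' OR '1'='1 input is used instead of the DROP TABLE example because sqlite3's execute() only runs a single statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE TEST (ID TEXT, payload TEXT)")
cur.executemany("INSERT INTO TEST VALUES (?, ?)", [("1", "first"), ("2", "second")])

malicious = "' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and changes the query's logic.
unsafe_sql = "SELECT payload FROM TEST WHERE ID = '%s'" % malicious
unsafe_rows = cur.execute(unsafe_sql).fetchall()

# Safe: the input travels separately and is compared as plain data.
safe_rows = cur.execute(
    "SELECT payload FROM TEST WHERE ID = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

The formatted query returns every row in the table, while the parameterized one returns none, because no ID equals the literal input string.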

Python MySQL escape special characters

I am using python to insert a string into MySQL with special characters.
The string to insert looks like so:
macaddress_eth0;00:1E:68:C6:09:A0;macaddress_eth1;00:1E:68:C6:09:A1
Here is the SQL:
UPDATE inventory_server
set server_mac = 'macaddress\_eth0\;00\:1E\:68\:C6\:09\:A0\;macaddress\_eth1\;00\:1E\:68\:C6\:09\:A1'
where server_name = 'myhost.fqdn.com'
When I execute the update, I get this error:
ERROR 1064 (42000):
You have an error in your SQL syntax; check the manual that corresponds to your
MySQL server version for the right syntax to use near 'UPDATE inventory_server
set server_mac = 'macaddress\_eth0\;00\:1E\:68\:C6\:09\' at line 1
The python code:
sql = 'UPDATE inventory_server set server_mac = \'%s\' where server_name = \'%s\'' % (str(mydb.escape_string(macs)),host)
print(sql)
try:
    con = mydb.connect(DBHOST,DBUSER,DBPASS,DB);
    with con:
        cur = con.cursor(mydb.cursors.DictCursor)
        cur.execute(sql)
        con.commit()
except:
    return False
How can I insert this text raw?
This is one of the reasons you're supposed to use parameter binding instead of formatting the parameters in Python.
Just do this:
sql = 'UPDATE inventory_server set server_mac = %s where server_name = %s'
Then:
cur.execute(sql, (macs, host))
That way, you can just deal with the string as a string, and let the MySQL library figure out how to quote and escape it for you.
On top of that, you can get better performance with drivers that use server-side prepared statements (because MySQL can compile and cache one query and reuse it for different parameter values), and you avoid SQL injection attacks (one of the most common ways to get yourself hacked).
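As a quick check that parameter binding stores the text untouched, here is a small sketch using the question's MAC string. It rests on assumptions: sqlite3 stands in for MySQL (so the placeholder is ? rather than %s), and the table is created in memory just for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE inventory_server (server_name TEXT, server_mac TEXT)")
cur.execute("INSERT INTO inventory_server VALUES (?, ?)", ("myhost.fqdn.com", ""))

macs = "macaddress_eth0;00:1E:68:C6:09:A0;macaddress_eth1;00:1E:68:C6:09:A1"
host = "myhost.fqdn.com"

# Both values go in ONE sequence; no manual quoting or escaping is applied.
cur.execute(
    "UPDATE inventory_server SET server_mac = ? WHERE server_name = ?",
    (macs, host),
)
conn.commit()

stored = cur.execute(
    "SELECT server_mac FROM inventory_server WHERE server_name = ?", (host,)
).fetchone()[0]
print(stored == macs)
conn.close()
```

The semicolons and colons need no backslashes at all; the value that comes back is identical to the one that went in.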
Welcome to the world of string encoding formats!
tl;dr - The preferred method for handling quotes and escape characters when storing data in MySQL columns is to use parameterized queries and let the MySQLDatabase driver handle it. Alternatively, you can escape quotes and slashes by doubling them up prior to insertion.
Full example at the bottom.
standard SQL update
# as_json must have escape slashes and quotes doubled
query = """\
UPDATE json_sandbox
SET data = '{}'
WHERE id = 1;
""".format(as_json)
with DBConn(*client.conn_args) as c:
    c.cursor.execute(query)
    c.connection.commit()
parameterized SQL update
# SQL Driver will do the escaping for you
query = """\
UPDATE json_sandbox
SET data = %s
WHERE id = %s;
"""
with DBConn(*client.conn_args) as c:
    c.cursor.execute(query, (as_json, 1))
    c.connection.commit()
Invalid JSON SQL
{
"abc": 123,
"quotes": "ain't it great",
"multiLine1": "hello\nworld",
"multiLine3": "hello\r\nuniverse\r\n"
}
Valid JSON SQL
{
"abc": 123,
"quotes": "ain''t it great",
"multiLine1": "hello\\nworld",
"multiLine3": "hello\\r\\nuniverse\\r\\n"
}
Python transform:
# must escape the escape characters, so each slash is doubled
# Some MySQL Python libraries also have an escape() or escape_string() method.
as_json = json.dumps(payload) \
    .replace("'", "''") \
    .replace('\\', '\\\\')
Full example
import json
import yaml
from DataAccessLayer.mysql_va import get_sql_client, DBConn

client = get_sql_client()

def encode_and_store(payload):
    as_json = json.dumps(payload) \
        .replace("'", "''") \
        .replace('\\', '\\\\')
    query = """\
        UPDATE json_sandbox
        SET data = '{}'
        WHERE id = 1;
        """.format(as_json)
    with DBConn(*client.conn_args) as c:
        c.cursor.execute(query)
        c.connection.commit()
    return

def encode_and_store_2(payload):
    as_json = json.dumps(payload)
    query = """\
        UPDATE json_sandbox
        SET data = %s
        WHERE id = %s;
        """
    with DBConn(*client.conn_args) as c:
        c.cursor.execute(query, (as_json, 1))
        c.connection.commit()
    return

def retrieve_and_decode():
    query = """
        SELECT * FROM json_sandbox
        WHERE id = 1
        """
    with DBConn(*client.conn_args) as cnx:
        cursor = cnx.dict_cursor
        cursor.execute(query)
        rows = cursor.fetchall()
    as_json = rows[0].get('data')
    payload = yaml.safe_load(as_json)
    return payload

if __name__ == '__main__':
    payload = {
        "abc": 123,
        "quotes": "ain't it great",
        "multiLine1": "hello\nworld",
        "multiLine2": """
hello
world
""",
        "multiLine3": "hello\r\nuniverse\r\n"
    }
    encode_and_store(payload)
    output_a = retrieve_and_decode()
    encode_and_store_2(payload)
    output_b = retrieve_and_decode()
    print("original: {}".format(payload))
    print("method_a: {}".format(output_a))
    print("method_b: {}".format(output_b))
    print('')
    print(output_a['multiLine1'])
    print('')
    print(output_b['multiLine2'])
    print('\nAll Equal?: {}'.format(payload == output_a == output_b))
Python example how to insert raw text:
Create a table in MySQL:
create table penguins(id int primary key auto_increment, msg VARCHAR(4000))
Python code:
#!/usr/bin/env python
import sqlalchemy
from sqlalchemy import text

engine = sqlalchemy.create_engine(
    "mysql+mysqlconnector://yourusername:yourpassword@yourhostname.com/your_database")
db = engine.connect()
weird_string = "~!@#$%^&*()_+`1234567890-={}|[]\;':\""
sql = text('INSERT INTO penguins (msg) VALUES (:msg)')
# bound parameters are passed as a dict keyed by the :name placeholders
insert = db.execute(sql, {"msg": weird_string})
db.close()
Run it, examine output:
select * from penguins
1 ~!@#$%^&*()_+`1234567890-={}|[]\;\':"
None of those characters were interpreted on insert.
Although I also think parameter binding should be used, there is also this:
>>> import MySQLdb
>>> example = r"""I don't like "special" chars ¯\_(ツ)_/¯"""
>>> example
'I don\'t like "special" chars \xc2\xaf\\_(\xe3\x83\x84)_/\xc2\xaf'
>>> MySQLdb.escape_string(example)
'I don\\\'t like \\"special\\" chars \xc2\xaf\\\\_(\xe3\x83\x84)_/\xc2\xaf'

How to string format SQL IN clause with Python

I am trying to create a statement as follows:
SELECT * FROM table WHERE provider IN ('provider1', 'provider2', ...)
However, I'm having some trouble with the string formatting of it from the Django API. Here's what I have so far:
profile = request.user.get_profile()
providers = profile.provider.values_list('provider', flat=True) # [u'provider1', u'provider2']
providers = tuple(str(item) for item in providers) # ('provider1', 'provider2')
sql = "SELECT * FROM table WHERE provider IN %s"
args = (providers,)
cursor.execute(sql, args)
DatabaseError
(1241, 'Operand should contain 1 column(s)')
MySQLdb has a method to help with this:
Doc
string_literal(...)
string_literal(obj) -- converts object obj into a SQL string literal.
This means, any special SQL characters are escaped, and it is enclosed
within single quotes. In other words, it performs:
"'%s'" % escape_string(str(obj))
Use connection.string_literal(obj), if you use it at all.
_mysql.string_literal(obj) cannot handle character sets.
Usage
# connection: <_mysql.connection open to 'localhost' at 1008b2420>
str_value = connection.string_literal(tuple(providers))
# '(\'provider1\', \'provider2\')'
sql = "SELECT * FROM table WHERE provider IN %s"
args = (str_value,)
cursor.execute(sql, args)
Another answer that I don't like particularly, but will work for your apparent use-case:
providers = tuple(str(item) for item in providers) # ('provider1', 'provider2')
# rest of stuff...
SQL = 'SELECT * FROM table WHERE provider IN {}'.format(repr(providers))
cursor.execute(SQL)
You should probably do the string replacement before passing it to the cursor object to execute:
sql = "SELECT * FROM table WHERE provider IN (%s)" % \
(','.join(str(x) for x in providers))
cursor.execute(sql)
So, you have string input for ID's required:
some_vals = '1 3 5 76 5 4 2 5 7 8'.split() # convert to suitable type if required
SomeModel.objects.filter(provider__in=some_vals)
"SELECT * FROM table WHERE provider IN ({0},{1},{2})".format(*args) #where args is list or tuple of arguments.
try this.... should work.
SQL = "SELECT * FROM table WHERE provider IN %s"%(providers)
exec 'cursor.execute("%s")'%(SQL)
