Escape SQL "LIKE" value for Postgres with psycopg2 - python

Does psycopg2 have a function for escaping the value of a LIKE operand for Postgres?
For example I may want to match strings that start with the string "20% of all", so I want to write something like this:
sql = '... WHERE ... LIKE %(myvalue)s'
cursor.execute(sql, { 'myvalue': escape_sql_like('20% of all') + '%' })
Is there an existing escape_sql_like function that I could plug in here?
(Similar question to How to quote a string value explicitly (Python DB API/Psycopg2), but I couldn't find an answer there.)

Yeah, this is a real mess. Both MySQL and PostgreSQL use backslash-escapes for this by default. This is a terrible pain if you're also escaping the string again with backslashes instead of using parameterisation, and it's also incorrect according to ANSI SQL:1992, which says there are by default no extra escape characters on top of normal string escaping, and hence no way to include a literal % or _.
I would presume the simple backslash-replace method also goes wrong if you turn off the backslash-escapes (which are themselves non-compliant with ANSI SQL), using the NO_BACKSLASH_ESCAPES sql_mode in MySQL or the standard_conforming_strings setting in PostgreSQL (which the PostgreSQL devs have been threatening to do for a couple of versions now).
The only real solution is to use the little-known LIKE...ESCAPE syntax to specify an explicit escape character for the LIKE-pattern. This gets used instead of the backslash-escape in MySQL and PostgreSQL, making them conform to what everyone else does and giving a guaranteed way to include the out-of-band characters. For example with the = sign as an escape:
# look for term anywhere within description
term = term.replace('=', '==').replace('%', '=%').replace('_', '=_')
sql = "SELECT * FROM things WHERE description LIKE %(like)s ESCAPE '='"
cursor.execute(sql, dict(like='%' + term + '%'))
This works on PostgreSQL, MySQL, and ANSI SQL-compliant databases (modulo the paramstyle of course which changes on different db modules).
There may still be a problem with MS SQL Server/Sybase, which apparently also allows [a-z]-style character groups in LIKE expressions. In this case you would want to also escape the literal [ character with .replace('[', '=['). However according to ANSI SQL escaping a character that doesn't need escaping is invalid! (Argh!) So though it will probably still work across real DBMSs, you'd still not be ANSI-compliant. sigh...
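As a minimal sketch of the LIKE ... ESCAPE approach above, the snippet below wraps the three replace calls in a helper and runs it against an in-memory SQLite database, chosen only because it ships with Python and also honours the ESCAPE clause. The helper name escape_like, the things table, and its rows are made up for illustration; with psycopg2 the placeholder would be %(like)s rather than ?.

```python
import sqlite3

def escape_like(term, escape_char='='):
    """Escape LIKE metacharacters in term (hypothetical helper)."""
    # Escape the escape character first so later replacements are not doubled.
    return (term.replace(escape_char, escape_char * 2)
                .replace('%', escape_char + '%')
                .replace('_', escape_char + '_'))

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE things (description TEXT)")
conn.executemany(
    "INSERT INTO things VALUES (?)",
    [('20% of all sales',), ('20 percent of all sales',), ('2000 of all',)])

# Escape the user-supplied part, then append the real wildcard.
pattern = escape_like('20% of all') + '%'
rows = conn.execute(
    "SELECT description FROM things WHERE description LIKE ? ESCAPE '='",
    (pattern,)).fetchall()
print(rows)
```

Without the escaping step, the pattern '20% of all%' would also match the other two rows, because the % after "20" would act as a wildcard.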

I was able to escape % by using %% in the LIKE operand.
sql_query = "select * from mytable where website like '%%.com'"
cursor.execute(sql_query)
rows = cursor.fetchall()

If you're using a parameterised query, the input will be wrapped in '' to prevent SQL injection. This is great, but it also prevents concatenating the input with wildcards inside the SQL string.
The best and safest way around this would be to pass in the % wildcards as part of the input.
cursor.execute('SELECT * FROM goats WHERE name LIKE %(name)s', { 'name': '%{}%'.format(name)})

You can also look at this problem from a different angle. What do you want? You want a query that for any string argument executes a LIKE by appending a '%' to the argument. A nice way to express that, without resorting to functions and psycopg2 extensions could be:
sql = "... WHERE ... LIKE %(myvalue)s||'%'"
cursor.execute(sql, { 'myvalue': '20% of all'})
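One caveat worth checking with this approach: concatenating the wildcard in SQL does not escape anything inside the argument, so a % or _ in the value still behaves as a wildcard. The sketch below uses the standard-library sqlite3 module (which takes ? placeholders) to show the effect; the notes table, its rows, and the '=' escape character are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE notes (body TEXT)")
conn.executemany("INSERT INTO notes VALUES (?)",
                 [('20% of all users',), ('2000 of all users',)])

# The trailing wildcard is appended in SQL, but the '%' inside the
# argument is still interpreted as a wildcard.
unescaped = conn.execute(
    "SELECT body FROM notes WHERE body LIKE (? || '%') ORDER BY body",
    ('20% of all',)).fetchall()

# Escaping the argument and naming the escape character restricts the
# match to the literal text.
escaped = conn.execute(
    "SELECT body FROM notes WHERE body LIKE (? || '%') ESCAPE '=' ORDER BY body",
    ('20=% of all',)).fetchall()

print(unescaped)
print(escaped)
```

So the concatenation trick is convenient for adding the wildcard, but the value itself may still need escaping if it can contain % or _.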

I found a better hack. Just append '%' to your search query_text.
con, queryset_list = psycopg2.connect(**self.config), None
cur = con.cursor(cursor_factory=RealDictCursor)
query = "SELECT * "
query += " FROM questions WHERE body LIKE %s OR title LIKE %s "
query += " ORDER BY questions.created_at"
cur.execute(query, ('%'+self.q+'%', '%'+self.q+'%'))

I wonder if all of the above is really needed. I am using psycopg2 and was simply able to use:
data_dict['like'] = psycopg2.Binary('%'+ match_string +'%')
cursor.execute("SELECT * FROM some_table WHERE description ILIKE %(like)s;", data_dict)

Instead of escaping the percent character, you could instead make use of PostgreSQL's regex implementation.
For example, the following query against the system catalogs will provide a list of active queries which are not from the autovacuuming sub-system:
SELECT procpid, current_query FROM pg_stat_activity
WHERE (CURRENT_TIMESTAMP - query_start) >= '%s minute'::interval
AND current_query !~ '^autovacuum' ORDER BY (CURRENT_TIMESTAMP - query_start) DESC;
Since this query syntax doesn't utilize the 'LIKE' keyword, you're able to do what you want... and not muddy the waters with respect to python and psycopg2.

Having failed to find a built-in function so far, the one I wrote is pretty simple:
def escape_sql_like(s):
    return s.replace('\\', '\\\\').replace('%', '\\%').replace('_', '\\_')

You can create a Like class subclassing str and register an adapter for it to have it converted in the right like syntax (e.g. using the escape_sql_like() you wrote).
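A rough sketch of the adapter idea follows. To keep it runnable without a PostgreSQL server, it uses the standard-library sqlite3 module's register_adapter as a stand-in. psycopg2 offers psycopg2.extensions.register_adapter for the same purpose, but its adapters follow a different protocol (an object with a getquoted() method), so treat this as an illustration of the pattern rather than psycopg2 code. The Like class, the adapt_like function, and the '=' escape character are assumptions made for the example.

```python
import sqlite3

class Like(str):
    """Marks a string whose LIKE metacharacters should be escaped when bound."""

def adapt_like(value):
    # Escape the escape character first, then the two LIKE wildcards.
    return (str(value).replace('=', '==')
                      .replace('%', '=%')
                      .replace('_', '=_'))

# sqlite3 calls the adapter whenever a Like instance is passed as a parameter.
sqlite3.register_adapter(Like, adapt_like)

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE t (s TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [('a_b',), ('axb',)])

# The '_' in the Like value is escaped by the adapter, so it matches literally.
rows = conn.execute(
    "SELECT s FROM t WHERE s LIKE ? ESCAPE '='", (Like('a_b'),)).fetchall()
print(rows)
```

The query still has to name the escape character with ESCAPE; the adapter only takes care of rewriting the value.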

I made some modifications to the code above to do the following:
def escape_sql_like(SQL):
    return SQL.replace("'%", 'PERCENTLEFT').replace("%'", 'PERCENTRIGHT')
def reescape_sql_like(SQL):
    return SQL.replace('PERCENTLEFT', "'%").replace('PERCENTRIGHT', "%'")
SQL = "SELECT blah LIKE '%OUCH%' FROM blah_tbl ... "
SQL = escape_sql_like(SQL)
tmpData = (LastDate,)
SQL = cur.mogrify(SQL, tmpData)
SQL = reescape_sql_like(SQL)
cur.execute(SQL)

You just need to concatenate a doubled % before and after the parameter. Using "ilike" instead of "like" makes the match case-insensitive.
query = """
select
*
from
table
where
text_field ilike '%%' || %(search_text)s || '%%'
"""

I think it would be simpler and more readable to use f-strings.
query = f'''SELECT * FROM table where column like '%%{my_value}%%' '''
cursor.execute(query)

Related

Receiving Error not all arguments converted during string formatting

I am new to working with Python. I'm not able to understand how I can send the correct input to the query.
list_of_names = []
for country in country_name_list.keys():
    list_of_names.append(getValueMethod(country))
sql_query = f"""SELECT * FROM table1
where name in (%s);"""
db_results = engine.execute(sql_query, list_of_names).fetchone()
This gives the error "not all arguments converted during string formatting".
As implied by John Gordon's comment, the number of placeholders in the SQL statement should match the number of elements in the list. However SQLAlchemy 2.0+ no longer accepts raw SQL statements. A future-proof version of the code would be:
import sqlalchemy as sa
...
# SQL statements should be wrapped with text(), and should use
# the "named" parameter style. An "expanding" bind parameter lets
# a list be passed as the value of an IN clause.
sql_query = sa.text("""SELECT * FROM table1 where name in :names""").bindparams(
    sa.bindparam('names', expanding=True)
)
# Values should be dictionaries or lists of dictionaries.
values = {'names': list_of_names}
# Execute statements using a context manager.
with engine.connect() as conn:
    db_results = conn.execute(sql_query, values).fetchone()
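For drivers that take positional placeholders, the point about matching the number of placeholders to the number of list elements can be sketched as below. This is only an illustration using the standard-library sqlite3 module (which uses ? rather than %s); the table1 contents and the list of names are invented.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE table1 (name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)",
                 [('France',), ('Spain',), ('Peru',)])

list_of_names = ['France', 'Peru']

# Build one placeholder per value. Only the placeholders are interpolated
# into the SQL text; the values themselves are still passed separately.
placeholders = ', '.join('?' for _ in list_of_names)
sql_query = f"SELECT name FROM table1 WHERE name IN ({placeholders}) ORDER BY name"

rows = conn.execute(sql_query, list_of_names).fetchall()
print(rows)
```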
If I understand correctly, there is a simpler solution. If you use curly braces {} instead of parentheses (), and put inside the braces a variable that contains the %s value, it should work. I don't know how SQL handles this, but you should use one " on each side, not three.
Sorry, English is not my first language, so I may not have understood the question correctly and this may not help.

MySql read_sql python query with variable @

I am aware that queries in Python can be parameterized using either ? or %s in the execute call.
However, I have a long query that uses a constant variable defined at the beginning of the query:
Set @my_const = 'xyz';
select @my_const;
-- Query that uses @my_const 40 times
select ... coalesce(field1, @my_const), case(.. then @my_const)...
I would like to modify the MySQL query as little as possible. So instead of changing the query to
pd.read_sql(select ... coalesce(field1, %s), case(.. then %s)... , [my_const, my_const, my_const, ..])
I would like to write something along the lines of the initial query. Upon trying the following, however, I get a TypeError: 'NoneType' object is not iterable
query_str = "Set @null_val = \'\'; "\
" select @null_val"
erpur_df = pd.read_sql(query_str, con = db)
Any idea how to use the original variable defined in the MySQL query?
The reason
query_str = "Set @null_val = \'\'; "\
" select @null_val"
erpur_df = pd.read_sql(query_str, con = db)
throws that exception is that all you are doing is setting @null_val to '' and then selecting that ''. What exactly would you have expected that to give you? EDIT: read_sql only seems to execute one query at a time, and as the first query returns no rows, it results in that exception.
If you split them into two calls to read_sql, then it will in fact return the value of your @null_val in the second call. Because of this behaviour, read_sql is clearly not a good way to do this. I strongly suggest you use one of my suggestions below.
Why do you want to set the variable in the SQL using '@' anyway?
You could try using the .format style of string formatting.
Like so:
query_str = "select ... coalesce(field1, {c}), case(.. then {c})...".format(c=my_const)
pd.read_sql(query_str, con=db)
Just remember that if you do it this way and your my_const is a user input then you will need to sanitize it manually to prevent SQL injection.
Another possibility is using a dict of params like so:
query_str = "select ... coalesce(field1, %(my_const)s), case(.. then %(my_const)s)..."
pd.read_sql(query_str, con=db, params={'my_const': const_value})
However this is dependent on which database driver you use.
From the pandas.read_sql docs:
Check your database driver documentation for which of the five syntax
styles, described in PEP 249's paramstyle, is supported. Eg. for
psycopg2, uses %(name)s so use params={'name' : 'value'}
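To illustrate the named-parameter idea without depending on a particular MySQL driver, here is a sketch using the standard-library sqlite3 module, whose named style is :name rather than %(name)s. The table, its columns, and the 'xyz' value are made up; the point is only that a single dictionary entry can feed every occurrence of the placeholder, however many times the query repeats it.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE t (field1 TEXT, field2 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(None, 'a'), ('b', None)])

# :my_const appears twice in the query but is supplied once in the dict.
query_str = """
    SELECT coalesce(field1, :my_const), coalesce(field2, :my_const)
    FROM t
    ORDER BY rowid
"""
rows = conn.execute(query_str, {'my_const': 'xyz'}).fetchall()
print(rows)
```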

Escaping single and double quotes for mysql in python

I am using mysql.connector python library with python 2.7
I have a unicode string which may or may not contain single and double quotes.
Here are the things I tried for my escape function:
def escape(string):
    #string.MySQL.escape_string()
    #string = string.decode('string_escape')
    #string = string.encode('unicode-escape').replace("'", "''")
    #string = string.encode('unicode-escape').replace('"', '\"')
    #string = string.encode('unicode-escape').replace("'", u"\u2019")
    #string = string.encode('unicode-escape').replace('''"''', u"\u201D")
    #string = string.encode('unicode-escape').replace('''''', u"\u201D")
    return string
Nothing seems to have worked. I tried using this function but still gives mysql syntax error.
I need something within mysql.connector library which escapes the single and double quotes without breaking the unicode as well as mysql query.
Here is an example of SQL query I am using:
"""SELECT * FROM messages WHERE msg_id = '{msg_id}'""".format(**db_dict)
Let me know if any more details needed
EDIT: Example SQL query updated
MySQLdb officially declares to use the format paramstyle, but it also supports the pyformat style*, so if you want to use parameters from a dict, you can use:
db_dict = {'msg_id': "1'2'3", ...}
cursor.execute("SELECT * FROM messages WHERE msg_id = %(msg_id)s", db_dict)
Using string manipulation to create sql queries only leads to sql injection vulnerabilities, so you should never do it.
*... most db connectors that use Python string formatting behind the scenes do the same: they specify one of format or pyformat as paramstyle but actually support both. The dbapi2 doesn't allow specifying two values here, but it doesn't forbid supporting multiple paramstyles either. If you write code that potentially uses an unknown dbapi2 connector, it's enough that you can query a supported paramstyle; being able to know all of them would be nice, but it's not necessary.
cursor.execute('SELECT * FROM messages WHERE msg_id = %s', (db_dict['msg_id'],)) is what you want to run here. Standard string escapes aren't supported by Python's database interface, and, per @bobince's comment, are a security hole to boot.
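As a quick check that parameter binding copes with both kinds of quotes and with non-ASCII text, the sketch below round-trips such a value. It uses the standard-library sqlite3 module (? placeholders) purely so that it runs without a MySQL server; with mysql.connector the placeholder would be %s. The table layout and the message text are invented.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE messages (msg_id TEXT, body TEXT)")

msg_id = "1'2'3"
body = 'She said "it\'s caf\u00e9 time"'

# No manual escaping: the driver receives the values separately from the SQL.
conn.execute("INSERT INTO messages VALUES (?, ?)", (msg_id, body))
row = conn.execute(
    "SELECT body FROM messages WHERE msg_id = ?", (msg_id,)).fetchone()
print(row)
```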

Lightweight DBAL for python

Can somebody please recommend a Python DBAL library that will best suit my requirements? I would like to write my SQL statements directly; most of the logic will be in DB stored procedures (PostgreSQL), so I only need to invoke DB procedures, pass arguments to them, and fetch the results. The library should help me with quoting (preventing SQL injection).
I played with sqlalchemy, but I think that there is no quoting helper when writing an SQL statement directly to the engine.execute method.
Thank you
You should have given sqlalchemy a deeper look; it does a fine job of quoting placeholders:
>>> engine = sqlalchemy.create_engine("sqlite:///:memory:")
>>> engine.execute("select ?", 5).fetchall()
[(5,)]
>>> engine.execute("select ?", "; drop table users; --").fetchall()
[(u'; drop table users; --',)]
psycopg2 (via DB-API) will automatically quote to prevent SQL injection, IF you use it properly. (The usual Python string-formatting approach is wrong here; you have to pass the parameters as arguments to the execute call itself.)
WRONG:
cur.execute('select * from table where last="%s" and first="%s"'
% (last, first))
RIGHT:
cur.execute('select * from table where last=%s and first=%s',
(last, first))
Note: you don't use %, and you don't put quotes around your values.
The syntax is slightly different for MySQLdb and sqlite3. (For example, sqlite uses ? instead of %s.)
Also, for psycopg2, always use %s even if you're dealing with numbers or some other type.
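The difference between the two forms can be seen in a small sketch. It uses the standard-library sqlite3 module, so the placeholder is ? instead of %s, and the people table and the malicious input are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE people (first TEXT, last TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [('Ada', 'Lovelace'), ('Alan', 'Turing')])

last = "nobody' OR '1'='1"

# WRONG: the input is spliced into the SQL text and changes the query's meaning.
unsafe = conn.execute(
    "SELECT first FROM people WHERE last = '%s'" % last).fetchall()

# RIGHT: the input is passed separately and is only ever treated as a value.
safe = conn.execute(
    "SELECT first FROM people WHERE last = ?", (last,)).fetchall()

print(len(unsafe), len(safe))
```

The formatted query returns every row because the injected OR clause is always true, while the parameterised query finds nobody with that literal last name.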

Python: inserting double or single quotes around a string

I'm using Python to access a MySQL database and I'm getting an "unknown column in field" error because there are no quotes around the variable.
code below:
cur = x.cnx.cursor()
cur.execute('insert into tempPDBcode (PDBcode) values (%s);' % (s))
rows = cur.fetchall()
How do I manually insert double or single quotes around the value of s?
I've tried using str() and manually concatenating quotes around s, but it still doesn't work.
The SQL statement itself works fine; I've double- and triple-checked my SQL query.
You shouldn't use Python's string functions to build the SQL statement. You run the risk of leaving an SQL injection vulnerability. You should do this instead:
cur.execute('insert into tempPDBcode (PDBcode) values (%s);', (s,))
Note the comma.
Python will do this for you automatically, if you use the database API:
cur = x.cnx.cursor()
cur.execute('insert into tempPDBcode (PDBcode) values (%s)', (s,))
Using the DB API means that python will figure out whether to use quotes or not, and also means that you don't have to worry about SQL-injection attacks, in case your s variable happens to contain, say,
value'); drop database; '
If this were purely a string-handling question, the answer would be to just put them in the string:
cur.execute('insert into tempPDBcode (PDBcode) values ("%s");' % (s))
That's the classic use case for why Python supports both kinds of quotes.
However as other answers & comments have pointed out, there are SQL-specific concerns that are relevant in this case.
