Handle multiple parameters in a SQL query using MySQLdb - python

I have a scenario where I need to exclude a few thousand email addresses with specific domain names.
My current query is
Select * from users where email NOT LIKE '%abc.com'
AND email NOT LIKE '%efg.com'
AND email NOT LIKE '%xyz.com'
When I moved to Python, I wrote a query like
MySQLcursor.execute("""Select * from users where email NOT LIKE '%abc.com'
AND email NOT LIKE '%efg.com'
AND email NOT LIKE '%xyz.com'""")
Can I make a generic list of domains and exclude them?
What I tried is
list_of_domains = ('%%abc.com','%%xyz.com','%%efg.com')
MySQLcursor.execute("Select * from users where email NOT LIKE %(exclude_domain)s", {"exclude_domain": list_of_domains})
It seems to work only if there is a single value in list_of_domains, because when the tuple is unpacked the one "email NOT LIKE" condition can only be matched against one domain.
How can I make the query generic, so that when I have new domains tomorrow I simply add them to list_of_domains and it keeps working?
I am not sure whether this is possible. Can somebody help?

I don't like constructing queries using string interpolation and concatenation, but, assuming you cannot change the table schema and this is the query you have to do and you trust the source of the domain list, here is something to get you started:
domains = [...]
pattern = "email NOT LIKE '%{0}'"
conditions = " AND ".join([pattern.format(domain) for domain in domains])
query = "SELECT * FROM users WHERE " + conditions
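If the domain list cannot be fully trusted, the same idea can be combined with query parameters. The following is a minimal sketch, not the answer's own method: it generates one %s placeholder per domain and passes the patterns separately. The helper name build_exclusion_query is hypothetical, and the commented-out execute call assumes a MySQLdb cursor.

```python
def build_exclusion_query(domains):
    """Return (sql, params) excluding every e-mail that ends with one of `domains`."""
    if not domains:
        return "SELECT * FROM users", []
    # One placeholder per domain; the driver fills in the values.
    conditions = " AND ".join(["email NOT LIKE %s"] * len(domains))
    # The LIKE wildcard belongs to the value, not to the SQL text.
    params = ["%" + domain for domain in domains]
    return "SELECT * FROM users WHERE " + conditions, params


sql, params = build_exclusion_query(["abc.com", "efg.com", "xyz.com"])
# cursor.execute(sql, params)  # assumed: a MySQLdb cursor, which uses %s placeholders
```

Adding a new domain then only means appending it to the list; the SQL text and the parameter list grow together.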

Related

How to avoid SQL Injection in Python for Upsert Query to SQL Server?

I have a sql query I'm executing that I'm passing variables into. In the current context I'm passing the parameter values in as f strings, but this query is vulnerable to sql injection. I know there is a method to use a stored procedure and restrict permissions on the user executing the query. But is there a way to avoid having to go the stored procedure route and perhaps modify this function to be secure against SQL Injection?
I have the below query created to execute within a python app.
from sqlalchemy import create_engine
def sql_gen(tv, kv, join_kv, col_inst, val_inst, val_upd):
    # Every argument is interpolated straight into the SQL text.
    sqlstmt = f"""
        IF NOT EXISTS (
            SELECT *
            FROM {tv}
            WHERE {kv} = {join_kv}
        )
        INSERT {tv} (
            {col_inst}
        )
        VALUES (
            {val_inst}
        )
        ELSE
        UPDATE {tv}
        SET {val_upd}
        WHERE {kv} = {join_kv};
    """
    engine = create_engine(f"mssql+pymssql://{username}:{password}@{server}/{database}")
    connection = engine.raw_connection()
    cursor = connection.cursor()
    cursor.execute(sqlstmt)
    connection.commit()
    cursor.close()
Fortunately, most database connectors support query parameters: you pass the variable separately instead of building it into the query string yourself, which avoids the risk you mentioned.
You can read more on this here: https://realpython.com/prevent-python-sql-injection/#understanding-python-sql-injection
Example:
# Vulnerable
cursor.execute("SELECT admin FROM users WHERE username = '" + username + "'")
# Safe
cursor.execute("SELECT admin FROM users WHERE username = %s", (username,))
As Amanzer correctly mentions in his reply, Python has mechanisms to pass parameters safely.
However, there are other elements in your query (table names and column names) that cannot be passed as parameters (bind variables), because placeholders can only stand in for values, not for identifiers.
If these come from an untrusted source (or may in the future), you should be sure to validate them. This is good coding practice even if you are sure of the source.
There are some options to do this safely:
You should limit your tables and columns based on positive validation: make sure that the only values allowed are the ones that are authorized.
If that's not possible (because these are user created?):
You should make sure table and column names are limited to a "safe" set of characters (alphanumeric characters, dashes, underscores...).
You should enquote the table and column names by adding double quotes around them. If you do this, you need to be careful to validate that there are no quotes in the name, and either error out or escape the quotes. You also need to be aware that adding quotes will make the name case sensitive.
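The options above could be combined roughly as follows. This is a sketch under stated assumptions, not a drop-in replacement for sql_gen: the ALLOWED mapping, the table and column names, and the helpers quote_identifier and build_upsert are all hypothetical, the %s placeholders assume a driver such as pymssql, and the double-quoted identifiers assume the connection accepts them (SQL Server also accepts square brackets).

```python
import re

# Hypothetical allowlist: the only tables and columns this code may touch.
ALLOWED = {"customers": {"id", "name", "email"}}
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name):
    """Validate a table or column name, then wrap it in double quotes."""
    if not SAFE_NAME.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f'"{name}"'


def build_upsert(table, key, values):
    """Return (sql, params) for an IF NOT EXISTS / INSERT / ELSE UPDATE statement."""
    if table not in ALLOWED or not set(values) <= ALLOWED[table]:
        raise ValueError("table or column is not on the allowlist")
    if key not in values or len(values) < 2:
        raise ValueError("need the key column plus at least one other column")
    cols = list(values)
    others = [c for c in cols if c != key]
    t, k = quote_identifier(table), quote_identifier(key)
    col_list = ", ".join(quote_identifier(c) for c in cols)
    placeholders = ", ".join(["%s"] * len(cols))
    assignments = ", ".join(f"{quote_identifier(c)} = %s" for c in others)
    sql = (
        f"IF NOT EXISTS (SELECT 1 FROM {t} WHERE {k} = %s) "
        f"INSERT INTO {t} ({col_list}) VALUES ({placeholders}) "
        f"ELSE UPDATE {t} SET {assignments} WHERE {k} = %s;"
    )
    # Parameter order follows the order of the placeholders in the statement.
    params = [values[key]] + [values[c] for c in cols] + [values[c] for c in others] + [values[key]]
    return sql, params


sql, params = build_upsert("customers", "id", {"id": 7, "name": "Ann"})
# cursor.execute(sql, tuple(params))  # assumed: a pymssql cursor
```

Only identifiers that pass both the allowlist and the character check ever reach the SQL text; every data value travels as a parameter.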

How can I safely and dynamically set column names in a query without using %s?

I am making a script for a coworker to be able to generate reports from a SQL Server database. I am using pymssql to manage the connection to the database.
My problem is that I want the person using the script to be able to specify which column they want to have returned, as well as the date range they need.
I am using this snippet of code, but the column names don't work (from what I understand they get escaped, and that is part of the problem) compared with typing them out.
if len(columns) == 1:
    query = "SELECT %s FROM dbo.tblPayments WHERE %s < dbo.tblPayments.fldPayDate AND dbo.tblPayments.fldPayDate > %s;"
else:
    query = "SELECT " + ("%s, " * (len(columns) - 1)) + "%s FROM dbo.tblPayments WHERE %s <= dbo.tblPayments.fldPayDate AND dbo.tblPayments.fldPayDate >= %s;"
I have seen people saying to use string concatenation with .format() instead but I don't want to insert security risks if I can avoid them. Is there something I am missing? What can I do to make it secure?
Thank you
You can find all the columns of a certain table by using the sys.columns view. Get the list first and validate the user input in your application using this result. If the column is found you can use it to build the query string.
SELECT c.name
FROM sys.columns c
WHERE OBJECT_SCHEMA_NAME ( c.object_id) ='dbo'
AND OBJECT_NAME(c.object_id) = 'tblPayments'
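On the Python side, the validation step might look like the sketch below. It assumes the names returned by the sys.columns query have already been loaded into a set; the helper build_payments_query and the example column names other than fldPayDate are hypothetical. The dates stay as %s parameters, which is the placeholder style pymssql uses.

```python
def build_payments_query(requested, valid_columns):
    """Return SQL selecting `requested` columns, each checked against `valid_columns`."""
    unknown = [c for c in requested if c not in valid_columns]
    if not requested or unknown:
        raise ValueError(f"unknown or missing columns: {unknown}")
    # Every name is known to exist in the table; brackets guard against odd characters.
    column_list = ", ".join("[" + c.replace("]", "]]") + "]" for c in requested)
    return (
        f"SELECT {column_list} FROM dbo.tblPayments "
        "WHERE fldPayDate >= %s AND fldPayDate <= %s;"
    )


# Hypothetical result of the sys.columns lookup.
valid = {"fldPayDate", "fldAmount", "fldPayer"}
query = build_payments_query(["fldAmount", "fldPayDate"], valid)
# cursor.execute(query, (start_date, end_date))  # assumed: a pymssql cursor
```

Because a column name only reaches the query string after an exact match against the database's own list, user input never becomes SQL text directly.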

Simple query in Django

I'm trying to do a RAW Query like this:
User.objects.raw("SELECT username FROM app_user WHERE id != {0} AND LOWER(username) LIKE LOWER('%{1}%')".format('1','john'))
I get this error:
django.db.utils.ProgrammingError: not enough arguments for format string
The query works perfectly in SQLite but does not work in MySQL.
After you have performed the formatting, Django obtains a query like:
SELECT username FROM app_user WHERE id != 1 AND LOWER(username) LIKE LOWER('%john%')
As you can see, this string contains %j and %'. The % character introduces a placeholder in Python's older string-formatting syntax, which Django uses to inject parameters the proper way. It thus looks for extra parameters, but it cannot find any.
But regardless of what happens, this is not a good idea, since such queries are vulnerable to SQL injection. If 'john' is later replaced with '); DROP TABLE app_user -- (or something similar), then somebody can remove the entire table.
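If you want to perform such a query, pass the values as parameters. Put the wildcards in the parameter value rather than in quotes around the placeholder (the driver adds the quoting itself), and include the primary key, which raw() requires:
User.objects.raw(
    "SELECT id, username FROM app_user WHERE id != %s AND LOWER(username) LIKE LOWER(%s)",
    [1, '%john%']
)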
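When the search term comes from a user, it can also help to escape any % or _ it contains before wrapping it in wildcards. The sketch below illustrates the idea with the standard-library sqlite3 module as a stand-in, so it uses ? placeholders instead of Django's %s; the helper contains_pattern and the sample rows are hypothetical.

```python
import sqlite3


def contains_pattern(term, escape="\\"):
    """Build a LIKE pattern matching `term` anywhere, treating % and _ in it literally."""
    # Escape the escape character first, then the two LIKE wildcards.
    for ch in (escape, "%", "_"):
        term = term.replace(ch, escape + ch)
    return f"%{term}%"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_user (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO app_user (id, username) VALUES (?, ?)",
    [(1, "john"), (2, "Johnny"), (3, "100%john"), (4, "alice")],
)


def search(term, exclude_id):
    """Return usernames containing `term`, skipping the row with `exclude_id`."""
    rows = conn.execute(
        "SELECT username FROM app_user "
        "WHERE id != ? AND LOWER(username) LIKE LOWER(?) ESCAPE '\\' ORDER BY id",
        (exclude_id, contains_pattern(term)),
    ).fetchall()
    return [row[0] for row in rows]


names = search("john", 1)
```

The wildcard characters are added in Python and travel inside the parameter, so the SQL text itself never contains a bare % for the driver to misread.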
Or better: use the Django ORM:
User.objects.exclude(id=1).filter(
    username__icontains='john'
).values_list('username', flat=True)
Or we can encode the full query like:
from django.db.models import Q, Value
from django.db.models.functions import Concat
User.objects.exclude(id=request.user.pk).annotate(
    flname=Concat('first_name', Value(' '), 'last_name')
).filter(
    Q(username__icontains=q) | Q(flname__icontains=q)
).values_list('id', 'username', 'first_name', 'last_name')
If you are after the User objects themselves, and not so much the id, username, etc. columns, then by dropping the .values_list(..) call you get User objects instead of a QuerySet of tuples.

How to find rowid of a variable? sqlite3

I am making a username/password program using sqlite and I want to check if a username is in the database (I've already done this), then I want to find the row id of that username. This will be a username that the user has input. I know how to find the rowid of a fixed word in the database, e.g. 'Word'. How would I make it so I could replace the word with a variable?
def sign_in():
    usernameask = input("What is your username?")
    passwordask = input("What is your password?")
    c.execute("SELECT username FROM stuffToPlot")
    names = {name[0] for name in c.fetchall()}
    if usernameask in names:
        print("Yes")
        c.execute("SELECT password FROM stuffToPlot")
        passs = {name[0] for name in c.fetchall()}
        if passwordask in passs:
            print("yes,pass")
            t = c.execute("SELECT rowid FROM stuffToPlot WHERE username = 'usernameask' ")
            rowid = t.fetchall()
            for r in rowid:
                print(r)
        else:
            print("No,pass")
I am looking at where it says t = c.execute("SELECT rowid FROM stuffToPlot WHERE username = 'usernameask' ")
and want to replace the 'usernameask', which is currently looked up as a literal word in the database, with a variable. How would I do this?
There is no error, it just finds the position of the word "usernameask" which isn't in the database.
You want to use a parameterised query. You put placeholders in your query where your data has to go, and leave it to the database driver to put your data and the query together.
Parameterised queries let you avoid a common security pitfall, the SQL injection attack, where an attacker can 'augment' your database query by putting in more commands than you originally anticipated. Query parameters always make sure your data is only ever handled as data, not as commands.
A parameterised query is usually also faster, as it lets the database avoid having to parse your query every time if you use it more than once, and it can also reuse query plans.
The sqlite3 database library uses ? for positional parameters; put a ? wherever you need to use data from your code, and put the parameter values in a sequence (like a tuple or a list) in the second argument to cursor.execute():
t = c.execute("SELECT rowid FROM stuffToPlot WHERE username = ?", (usernameask,))
Note that (usernameask,) is a tuple with one element. You could also use [usernameask].
This executes your SELECT query using the string value that usernameask references in the WHERE username = filter. The driver takes care of quoting your value properly.
You could also use named parameters. These take the form :parametername (where you can pick your own names), and you then use a dictionary for the second argument to cursor.execute(), mapping names to values:
t = c.execute(
    "SELECT rowid FROM stuffToPlot WHERE username = :username",
    {'username': usernameask})
Here the placeholder is named username, and the dictionary maps that to the usernameask value.
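To see this end to end, here is a small self-contained sketch using an in-memory database. The table layout and the sample rows are assumptions made for the demonstration, and find_rowid is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
# Assumed layout for the demonstration; the real table may have more columns.
c.execute("CREATE TABLE stuffToPlot (username TEXT, password TEXT)")
c.executemany(
    "INSERT INTO stuffToPlot (username, password) VALUES (?, ?)",
    [("alice", "pw1"), ("bob", "pw2")],
)


def find_rowid(cursor, username):
    """Return the rowid stored for `username`, or None if there is no such user."""
    cursor.execute(
        "SELECT rowid FROM stuffToPlot WHERE username = :username",
        {"username": username},
    )
    row = cursor.fetchone()
    return row[0] if row else None


bob_id = find_rowid(c, "bob")
missing = find_rowid(c, "usernameask")
```

The value of the Python variable is looked up, not the literal text of its name, and input such as "bob' OR '1'='1" is compared as plain data rather than executed.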

Django ORM limiting queryset to only return a subset of data

I have the following query in a Django app. The user field is a foreign key. The results may contain 1000 MyModel objects, but only for a handful of users. I'd like to limit it to 5 MyModel objects returned per user in the user__in= portion of the query. I should end up with 5*#users or less MyModel objects.
lfs = MyModel.objects.filter(
    user__in=[some,users,here,],
    active=True,
    follow=True,
)
Either through the ORM or SQL (using Postgres) would be acceptable.
Thanks
EDIT 2
Found a simpler way to get this done, which I've added as an answer below.
EDIT
Some of the links mentioned in the comments had some good information, although none really worked with Postgres or the Django ORM. For anyone else looking for this information in the future, my adaptation of the code in those other questions/answers is here.
To implement this in Postgres 9.1, I had to create a couple of functions using PL/Perl (which also required me to install PL/Perl)
CREATE OR REPLACE FUNCTION set_int_var(name text, val bigint) RETURNS bigint AS $$
    if ($_SHARED{$_[0]} = $_[1]) {
        return $_[1];
    } else {
        return $_[1];
    }
$$ LANGUAGE plperl;
CREATE OR REPLACE FUNCTION get_int_var(name text) RETURNS bigint AS $$
    return $_SHARED{$_[0]};
$$ LANGUAGE plperl;
And my final query looks something like the following
SELECT x.id, x.ranking, x.active, x.follow, x.user_id
FROM (
    SELECT tbl.id, tbl.active, tbl.follow, tbl.user_id,
        CASE WHEN get_int_var('user_id') != tbl.user_id
        THEN
            set_int_var('rownum', 1)
        ELSE
            set_int_var('rownum', get_int_var('rownum') + 1)
        END AS ranking,
        set_int_var('user_id', tbl.user_id)
    FROM my_table AS tbl
    WHERE tbl.active = TRUE AND tbl.follow = TRUE
    ORDER BY tbl.user_id
) AS x
WHERE x.ranking <= 5
ORDER BY x.user_id
LIMIT 50
The only downside to this is that if I try to limit the users that it looks for by using user_id IN (), the whole thing breaks and it just returns every row, rather than just 5 per user.
This is what ended up working, and allowed me to only select a handful of users, or all users (by removing the AND mt.user_id IN () line).
SELECT * FROM mytable
WHERE (id, user_id, follow, active) IN (
    SELECT id, user_id, follow, active FROM mytable mt
    WHERE mt.user_id = mytable.user_id
    AND mt.user_id IN (1, 2)
    ORDER BY user_id LIMIT 5)
ORDER BY likeable
I think this is what you were looking for (I didn't see it in other posts):
https://docs.djangoproject.com/en/dev/topics/db/queries/#limiting-querysets
In other examples, they pass from queryset to list before "slicing". If you make something like this (for example):
lfs = MyModel.objects.filter(
    user__in=[some,users,here,],
    active=True,
    follow=True,
)[:10]
the resulting SQL is a query with LIMIT 10 in its clauses.
So, the query you are looking for would be something like this:
mymodel_ids = []
for user in users:
    mymodel_5ids_for_user = MyModel.objects.filter(
        user=user,
        active=True,
        follow=True,
    ).values_list('id', flat=True)[:5]
    mymodel_ids.extend(mymodel_5ids_for_user)
lfs = MyModel.objects.filter(id__in=mymodel_ids)
lfs then holds the MyModel objects you were looking for (at most 5 entries per user).
This needs at least one query per user, plus one to retrieve all the MyModel objects with that filter.
Be aware of the order in which you want the objects. If you change the ordering of the "mymodel_5ids_for_user" query, its first 5 elements could change.
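Since plain SQL was also acceptable, another option worth considering is a window function, which does the per-user limit in a single query. The sketch below is an illustration rather than something taken from the answers above: it uses the standard-library sqlite3 module as a stand-in (it needs an SQLite build with window function support, 3.25 or later), and the table layout, sample data and helper name top_n_per_user are assumptions. ROW_NUMBER() OVER (PARTITION BY ...) is standard SQL that Postgres also supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "active INTEGER, follow INTEGER)"
)
# Hypothetical data: users 1, 2 and 3 with 8, 3 and 6 qualifying rows.
rows = [(uid, 1, 1) for uid, n in ((1, 8), (2, 3), (3, 6)) for _ in range(n)]
conn.executemany("INSERT INTO my_table (user_id, active, follow) VALUES (?, ?, ?)", rows)


def top_n_per_user(conn, user_ids, n=5):
    """Return up to `n` (id, user_id) rows for each user in `user_ids`."""
    if not user_ids:
        return []
    marks = ", ".join("?" * len(user_ids))  # one placeholder per user id
    sql = f"""
        SELECT id, user_id FROM (
            SELECT id, user_id,
                   ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY id) AS ranking
            FROM my_table
            WHERE active = 1 AND follow = 1 AND user_id IN ({marks})
        ) AS ranked
        WHERE ranking <= ?
        ORDER BY user_id, id
    """
    return conn.execute(sql, [*user_ids, n]).fetchall()


result = top_n_per_user(conn, [1, 2])
```

The ORDER BY inside the OVER clause decides which rows count as the first five for each user, so it plays the same role as the ordering caveat mentioned above.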
