I'm new to Django and have some questions about building queries with the QuerySet API.
For instance, I have a User, the user's Orders, and their Statuses:
from django.db import models

class User(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)
    is_active = models.BooleanField()

class OrderStatus(models.Model):
    name = models.CharField(max_length=100)

class Order(models.Model):
    number = models.CharField(max_length=10)
    amount = models.DecimalField(max_digits=19, decimal_places=2)
    user = models.ForeignKey(User, on_delete=models.PROTECT, related_name="orders")
    order_status = models.ForeignKey(OrderStatus, on_delete=models.PROTECT)
    creation_datetime = models.DateTimeField(auto_now_add=True)
    # Some filtering field
    filtering_field = models.IntegerField()
I have combined all of my questions into this one query:
Get active users with some additional data for each user:
1. 'Amount' of the Orders filtered by 'filtering_field' and aggregated by Min and Max
2. 'Number' and 'Amount' of the first Order filtered by 'filtering_field'
3. Count of the Orders filtered by 'filtering_field', aggregated by Count and grouped by 'Order Status'. This grouping means that data from items #1 and #2 can be duplicated, and that's OK.
I can write this query in T-SQL with 3 separate subqueries, each with its own grouping, filtering, and ordering:
SELECT
u.id,
u.first_name,
u.last_name,
ts.min_amount,
ts.max_amount,
first_order.number as first_order_number,
first_order.amount as first_order_amount,
cnt.order_status_id,
cnt.cnt
FROM
[User] u
-- 1. 'Amount' of the Orders filtered by 'filtering_field' and aggregated by Min and Max
LEFT OUTER JOIN (
SELECT
[user_id],
MIN(amount) min_amount,
MAX(amount) max_amount
FROM
[Order]
WHERE
filtering_field = 1
GROUP BY
[user_id]
) ts ON u.id = ts.[user_id]
-- 2. 'Number' and 'Amount' of the first Order filtered by 'filtering_field'
OUTER APPLY (
SELECT TOP 1
o.number,
o.amount
FROM
[Order] o
WHERE
u.id = o.[user_id] AND
o.filtering_field = 2
ORDER BY
o.creation_datetime
) first_order
-- 3. Count of the Orders filtered by 'filtering_field', aggregated by Count and grouped by 'Order Status'.
LEFT OUTER JOIN (
SELECT
[user_id],
order_status_id,
COUNT(*) cnt
FROM
[Order]
WHERE
filtering_field = 3
GROUP BY
[user_id],
order_status_id
) cnt ON u.id = cnt.[user_id]
WHERE
u.is_active = 1
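To make the expected result concrete, here is a rough plain-Python sketch of what the three subqueries compute for one user. The in-memory `orders` rows, the integer `created` field (standing in for creation_datetime), and the helper name `per_user_report` are assumptions made up for illustration; they are not part of my real schema.

```python
from collections import Counter
from decimal import Decimal

# Hypothetical rows standing in for the [Order] table.
orders = [
    {"user_id": 1, "number": "A1", "amount": Decimal("10.00"), "status_id": 1, "created": 1, "filtering_field": 1},
    {"user_id": 1, "number": "A2", "amount": Decimal("25.00"), "status_id": 1, "created": 2, "filtering_field": 1},
    {"user_id": 1, "number": "B1", "amount": Decimal("7.00"), "status_id": 2, "created": 5, "filtering_field": 2},
    {"user_id": 1, "number": "B2", "amount": Decimal("9.00"), "status_id": 2, "created": 3, "filtering_field": 2},
    {"user_id": 1, "number": "C1", "amount": Decimal("1.00"), "status_id": 1, "created": 6, "filtering_field": 3},
    {"user_id": 1, "number": "C2", "amount": Decimal("1.00"), "status_id": 1, "created": 7, "filtering_field": 3},
    {"user_id": 1, "number": "C3", "amount": Decimal("1.00"), "status_id": 2, "created": 8, "filtering_field": 3},
]


def per_user_report(user_id, rows):
    mine = [o for o in rows if o["user_id"] == user_id]

    # 1. Min/Max of amount where filtering_field = 1
    amounts = [o["amount"] for o in mine if o["filtering_field"] == 1]

    # 2. Earliest order where filtering_field = 2 (the OUTER APPLY ... TOP 1 part)
    candidates = sorted(
        (o for o in mine if o["filtering_field"] == 2), key=lambda o: o["created"]
    )
    first = candidates[0] if candidates else None

    # 3. Count per order status where filtering_field = 3
    counts = Counter(o["status_id"] for o in mine if o["filtering_field"] == 3)

    return {
        "min_amount": min(amounts) if amounts else None,
        "max_amount": max(amounts) if amounts else None,
        "first_order_number": first["number"] if first else None,
        "first_order_amount": first["amount"] if first else None,
        "counts_by_status": dict(counts),
    }
```

In the T-SQL version, part 3 produces one output row per (user, status) pair, which is why the values from parts 1 and 2 get repeated.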
How can I do the same with the QuerySet API?
For query #1, I can use Min and Max in annotate():
data = User.objects.filter(
Q(is_active=True)
).values(
'id',
'first_name',
'last_name',
).annotate(
min_amount=Min(
'orders__amount',
filter=Q(orders__filtering_field=1)
),
max_amount=Max(
'orders__amount',
filter=Q(orders__filtering_field=1)
)
)
But what about queries #2 and #3?
I've considered Subquery(), but it supports only one output value.
I mean, if you want 5 fields from one queryset, SQL Server runs 5 queries. I don't think that's good for performance.
How can I join the first order once and use its fields, and how can I use Count() grouped by the filtered rows of the child model?
I'd like to use .prefetch_related() as a substitute for each T-SQL subquery, like this:
Prefetch(
'orders',
queryset=Order.objects.filter(filtering_field=1)..., # stuff with .values(), annotate(Min(), Max()), etc.
to_attr='pf_query_1'
)
And then use 'pf_query_1' like 'orders__pf_query_1__amount' in User.objects...values()...annotate().
But I can't use .values() in a Prefetch, nor can I use 'pf_query_1' as a model field.
So what is the best practice for making this one query with the QuerySet API?
I'd like to see the whole QuerySet API query, just like the T-SQL query.
Have you considered the Django Subquery as described in the docs?
Regarding your 3rd question, the only approach that comes to mind is creating the annotations dynamically.
Here is a (tested) code sample using your models:
def test_query(self):
    # 1st question
    min_order = Order.objects.filter(user=OuterRef('pk'), filtering_field=1)\
        .order_by().values('user').annotate(min=Min('amount')).values('min')
    max_order = Order.objects.filter(user=OuterRef('pk'), filtering_field=1)\
        .order_by().values('user').annotate(max=Max('amount')).values('max')
    # 2nd question
    first_number = Order.objects.filter(user=OuterRef('pk'), filtering_field=1)\
        .order_by().values('user').annotate(fnumber=F('number')).values('fnumber')
    first_amount = Order.objects.filter(user=OuterRef('pk'), filtering_field=1)\
        .order_by().values('user').annotate(fnumber=F('amount')).values('amount')
    kwargs = {
        'min': Subquery(min_order, output_field=DecimalField()),
        'max': Subquery(max_order, output_field=DecimalField()),
        'first_n': Subquery(first_number, output_field=CharField()),
        'first_a': Subquery(first_amount, output_field=DecimalField())
    }
    # 3rd question
    for o in OrderStatus.objects.all():
        kwargs['%s_count' % o.name] = \
            Subquery(Order.objects.filter(user=OuterRef('pk'), filtering_field=1, order_status=o)\
                .order_by().values('user').annotate(c=Count('pk')).values('c'), output_field=IntegerField())
    # Putting it all together
    qs2 = User.objects.annotate(**kwargs)
    # Testing the results
    for user in qs2:
        v = Order.objects.filter(user=user, filtering_field=1).aggregate(Min('amount'), Max('amount'))
        self.assertEqual(v['amount__min'], user.min)
        self.assertEqual(v['amount__max'], user.max)
        v = Order.objects.filter(user=user, filtering_field=1).first()
        self.assertEqual(v.number, user.first_n)
        self.assertEqual(v.amount, user.first_a)
        for o in OrderStatus.objects.all():
            v = Order.objects.filter(user=user, filtering_field=1, order_status=o).count()
            if v == 0:
                v = None
            k = '%s_count' % o.name
            v1 = getattr(user, k)
            self.assertEqual(v, v1)
    # The sql
    print(qs2.query)
Please note:
The code is part of a TestCase, where I put it to check that it worked as expected.
I know some parts of the query can be generated without Subquery by using the filter argument of the aggregation functions. As this filter argument was only introduced in Django 2.0 and is not supported in the LTS version 1.11, I did not use it.
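One more note on the "first order" part: a scalar subquery may return only one row per user, so it normally needs an explicit ordering and a one-row limit. In the ORM this is usually written by ordering the inner queryset by creation_datetime and slicing it with [:1] before wrapping it in Subquery(). The sketch below shows the resulting SQL shape using Python's built-in sqlite3 module; the table names `users` and `orders` and the sample rows are assumptions for illustration, not the tables Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, is_active INTEGER);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    number TEXT,
    amount REAL,
    creation_datetime TEXT,
    filtering_field INTEGER
);
INSERT INTO users VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cid', 0);
INSERT INTO orders (user_id, number, amount, creation_datetime, filtering_field) VALUES
    (1, 'N-2', 20.0, '2020-01-02', 2),
    (1, 'N-1', 10.0, '2020-01-01', 2),
    (1, 'N-0', 5.0, '2019-12-31', 1);
""")

# One correlated scalar subquery per output column, each ordered and limited
# to a single row so that it yields the *first* matching order.
first_order_rows = conn.execute("""
SELECT u.id,
       (SELECT o.number FROM orders o
         WHERE o.user_id = u.id AND o.filtering_field = 2
         ORDER BY o.creation_datetime LIMIT 1) AS first_order_number,
       (SELECT o.amount FROM orders o
         WHERE o.user_id = u.id AND o.filtering_field = 2
         ORDER BY o.creation_datetime LIMIT 1) AS first_order_amount
FROM users u
WHERE u.is_active = 1
ORDER BY u.id
""").fetchall()
```

Users without a matching order get NULL (None in Python) in both columns, which matches the LEFT JOIN / OUTER APPLY behaviour of the T-SQL in the question.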
EDIT: Here is another approach I came up with starting with a "base queryset" and annotating that one:
def test_query2(self):
    qs = Order.objects.filter(filtering_field=1).values('user', 'order_status').distinct()
    # 1st question
    min_order = Order.objects.filter(user=OuterRef('user'), filtering_field=1)\
        .order_by().values('user').annotate(min=Min('amount')).values('min')
    max_order = Order.objects.filter(user=OuterRef('user'), filtering_field=1)\
        .order_by().values('user').annotate(max=Max('amount')).values('max')
    # 2nd question
    first_number = Order.objects.filter(user=OuterRef('user'), filtering_field=1)\
        .order_by().values('user').annotate(fnumber=F('number')).values('fnumber')
    first_amount = Order.objects.filter(user=OuterRef('user'), filtering_field=1)\
        .order_by().values('user').annotate(fnumber=F('amount')).values('amount')
    # 3rd question
    total_count = Order.objects.filter(user=OuterRef('user'), filtering_field=1, order_status=OuterRef('order_status'))\
        .order_by().values('user').annotate(c=Count('pk')).values('c')
    qs2 = qs.annotate(
        min=Subquery(min_order, output_field=DecimalField()),
        max=Subquery(max_order, output_field=DecimalField()),
        first_n=Subquery(first_number, output_field=CharField()),
        # amount is a DecimalField on the model, so use DecimalField here as well
        first_a=Subquery(first_amount, output_field=DecimalField()),
        c=Subquery(total_count, output_field=IntegerField())
    )
    # Testing the results
    for d in qs2:
        v = Order.objects.filter(user=d['user'], filtering_field=1).aggregate(Min('amount'), Max('amount'))
        self.assertEqual(v['amount__min'], d['min'])
        self.assertEqual(v['amount__max'], d['max'])
        v = Order.objects.filter(user=d['user'], filtering_field=1).first()
        self.assertEqual(v.number, d['first_n'])
        self.assertEqual(v.amount, d['first_a'])
        v = Order.objects.filter(user=d['user'], filtering_field=1, order_status=d['order_status']).count()
        self.assertEqual(v, d['c'])
    print(qs2.query)
In the following code snippet, my goal is to get outstanding_event_total_gross.
To get that, I first look up, for each ticket that belongs to the event, the number of sold_tickets. From that, I can calculate tickets_left. For each ticket I then calculate outstanding_ticket_total_gross, which I add to outstanding_event_total_gross.
A lot of the business logic happens in Python, and I wonder whether there is a more efficient queryset that does this work in the database instead.
tickets = Ticket.objects.filter(event=3)
outstanding_event_total_gross = 0
for ticket in tickets:
    sold_tickets = ticket.attendees.filter(
        canceled=False,
        order__status__in=(
            OrderStatus.PAID,
            OrderStatus.PENDING,
            OrderStatus.PARTIALLY_REFUNDED,
            OrderStatus.FREE,
        ),
    ).count()
    tickets_left = ticket.quantity - sold_tickets
    outstanding_ticket_total_gross = tickets_left * ticket.price_gross
    outstanding_event_total_gross += outstanding_ticket_total_gross
print(outstanding_event_total_gross)
Here is part of the models. I simplified them for better readability.
class Ticket(TimeStampedModel):
    event = models.ForeignKey()
    price_gross = models.PositiveIntegerField()
    quantity = models.PositiveIntegerField()

class Order(AbstractTransaction, LogMixin):
    event = models.ForeignKey()
    status = models.CharField(
        max_length=18, choices=OrderStatus.CHOICES, verbose_name=_("Status")
    )
    total_gross = models.PositiveIntegerField()
Maybe you can try something like this, with the help of conditional aggregation:
from django.db.models import Count, F, Q, Sum

tickets = Ticket.objects.filter(event=3).annotate(
    sold_tickets=Count(
        'attendees',
        filter=Q(
            attendees__canceled=False,
            attendees__order__status__in=(
                OrderStatus.PAID,
                OrderStatus.PENDING,
                OrderStatus.PARTIALLY_REFUNDED,
                OrderStatus.FREE,
            )
        ),
        distinct=True
    )
).annotate(
    tickets_left=F('quantity') - F('sold_tickets')
).annotate(
    outstanding_gross=F('tickets_left') * F('price_gross')
)
outstanding_event_total_gross = tickets.aggregate(total=Sum('outstanding_gross'))['total']
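To see what the database is being asked to compute, here is a small sketch of the same arithmetic in SQL, run through Python's built-in sqlite3 module. The table layout is an assumption made for the example: the order status is stored directly on the `attendee` row, and the status strings are invented, so this is not the SQL Django would emit for the real models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ticket (id INTEGER PRIMARY KEY, event_id INTEGER, price_gross INTEGER, quantity INTEGER);
CREATE TABLE attendee (id INTEGER PRIMARY KEY, ticket_id INTEGER, canceled INTEGER, order_status TEXT);
INSERT INTO ticket VALUES (1, 3, 100, 10), (2, 3, 50, 4), (3, 4, 999, 1);
INSERT INTO attendee (ticket_id, canceled, order_status) VALUES
    (1, 0, 'paid'), (1, 0, 'pending'), (1, 0, 'free'),
    (1, 1, 'paid'),       -- canceled, so not counted as sold
    (1, 0, 'refunded');   -- status not in the "sold" list
""")

# Hypothetical status values standing in for the OrderStatus constants.
SOLD_STATUSES = ("paid", "pending", "partially_refunded", "free")
placeholders = ", ".join("?" for _ in SOLD_STATUSES)

sql = f"""
SELECT SUM((t.quantity - COALESCE(s.sold, 0)) * t.price_gross)
FROM ticket t
LEFT JOIN (
    SELECT ticket_id, COUNT(*) AS sold
    FROM attendee
    WHERE canceled = 0 AND order_status IN ({placeholders})
    GROUP BY ticket_id
) s ON s.ticket_id = t.id
WHERE t.event_id = ?
"""
outstanding_event_total_gross = conn.execute(sql, (*SOLD_STATUSES, 3)).fetchone()[0]
```

Ticket 1 has 3 sold out of 10 at a price of 100, and ticket 2 has none sold out of 4 at a price of 50, so the total is 7 * 100 + 4 * 50.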
I have a table with the following columns:
key
time
value
And I need a query like this:
SELECT
"time",
SUM("value")
FROM (
SELECT
"key",
django_trunc_datetime("time"),
AVG("value")
FROM my_table
GROUP BY "key", django_trunc_datetime("time")
)
GROUP BY "time"
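As a runnable illustration of the two-level aggregation (average per key and time bucket, then sum of those averages per time bucket), here is a sketch using Python's built-in sqlite3 module. It truncates to the hour with SQLite's strftime instead of django_trunc_datetime, and the sample rows are made up; both are assumptions for the example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table ("key" TEXT, "time" TEXT, "value" REAL);
INSERT INTO my_table VALUES
    ('a', '2020-01-01 10:05:00', 1.0),
    ('a', '2020-01-01 10:45:00', 3.0),
    ('b', '2020-01-01 10:30:00', 4.0),
    ('a', '2020-01-01 11:10:00', 10.0);
""")

# Inner query: one averaged value per (key, hour bucket).
# Outer query: sum of those averages per hour bucket.
bucket_sums = conn.execute("""
SELECT bucket, SUM(avg_value)
FROM (
    SELECT "key",
           strftime('%Y-%m-%d %H:00', "time") AS bucket,
           AVG("value") AS avg_value
    FROM my_table
    GROUP BY "key", strftime('%Y-%m-%d %H:00', "time")
) AS per_key
GROUP BY bucket
ORDER BY bucket
""").fetchall()
```

For the 10:00 bucket, key 'a' averages to 2.0 and key 'b' to 4.0, so the outer sum is 6.0; the 11:00 bucket contains only key 'a' with 10.0.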
Is it possible in Django ORM? Maybe with some fake model based on the subquery?
Thanks
UPDATED:
It looks like I have to create five database views (because django_trunc_datetime takes Hour/Day/Week/Month/Year arguments), but that may perform badly because in that case I can't apply the filtering first. :(
I also thought about SQLAlchemy, but it doesn't have a universal datetime truncation function.
SOLUTION
The solution with the Django ORM (not a complete solution, but it illustrates the idea):
class TheApp(models.Model):
    a = models.DateTimeField()
    b = models.IntegerField()

class B(models.Model):
    class Meta:
        managed = False
    c = models.DateTimeField()
    d = models.IntegerField()
TheApp.objects.create(a=datetime.now(), b=4)
TheApp.objects.create(a=datetime.now(), b=5)
TheApp.objects.create(a=datetime.now(), b=7)
q1 = TheApp.objects.annotate(c=F('b'), d=Max('a')).values('c', 'd', 'id').query
q1.group_by = ('c',)
q2 = B.objects.annotate(a=F('c') * 2, b=Max('d')).values('a', 'b', 'id').query
q2.group_by = ('a',)
q3 = str(q2).replace('theapp_b', 'sub').replace('FROM "sub" ', f'FROM ({q1}) AS "sub" ')
print(q3)
print(list(B.objects.raw(q3)))
The solution I have chosen:
Use SQLAlchemy via aldjemy
In Sales -> Reports -> Pipeline I would like to allow to filter by res.partner.category.
In Odoo res.partner has a field category_id
category_id = fields.Many2many('res.partner.category', column1='partner_id',
column2='category_id', string='Tags', default=_default_category)
I tried copying
category_id = fields.Many2many('res.partner.category', column1='partner_id',
column2='category_id', string='Tags', default=_default_category)
to my crm_opportunity_report (which inherits crm.opportunity.report), but I get errors.
I also tried adding this field
category_ids = fields.Many2many(comodel_name='res.partner.category', relation="res_partner_res_partner_category_rel",
column1='category_id', column2='partner_id')
and this failed too.
How can I add the category name as a filter to crm_opportunity_report? What can be done to allow filtering by category?
Here's something of a solution (based on the discussion in the comments on the question). It builds a string ("'Tagname1';'Tagname2';'Tagname3';...") from the tag names to filter on.
SELECT
c.id,
c.name as name,
c.date_deadline,
c.date_open as opening_date,
c.date_closed as date_closed,
c.date_last_stage_update as date_last_stage_update,
c.user_id,
c.probability,
c.stage_id,
stage.name as stage_name,
c.type,
c.company_id,
c.priority,
c.team_id,
(SELECT COUNT(*)
FROM mail_message m
WHERE m.model = 'crm.lead' and m.res_id = c.id) as nbr_activities,
c.active,
c.campaign_id,
c.source_id,
c.medium_id,
c.partner_id,
c.city,
c.country_id,
c.planned_revenue as total_revenue,
c.planned_revenue*(c.probability/100) as expected_revenue,
c.create_date as create_date,
extract('epoch' from (c.date_closed-c.create_date))/(3600*24) as delay_close,
abs(extract('epoch' from (c.date_deadline - c.date_closed))/(3600*24)) as delay_expected,
extract('epoch' from (c.date_open-c.create_date))/(3600*24) as delay_open,
c.lost_reason,
c.date_conversion as date_conversion,
COALESCE(rp.customer, FALSE) as is_customer,
COALESCE(x.Categories, '') AS Categories
FROM
"crm_lead" c
LEFT JOIN "res_partner" rp ON rp.id = c.partner_id
LEFT JOIN "crm_stage" stage ON stage.id = c.stage_id
LEFT JOIN
(
SELECT rp.id AS partner_id, array_to_string(array_agg(''''||rpc.name||'''' ORDER BY rp.id, rpc.name),';') AS Categories
FROM res_partner_res_partner_category_rel rpcl
JOIN res_partner_category rpc ON rpc.id = rpcl.category_id
JOIN res_partner rp ON rp.id = rpcl.partner_id
GROUP BY rp.id
ORDER BY rp.id
) AS x ON x.partner_id = c.partner_id
GROUP BY c.id, stage.name, COALESCE(rp.customer, FALSE), COALESCE(x.Categories, '')
ORDER BY c.partner_id
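The Categories column built by the array_agg/array_to_string expression above is simply the partner's tag names, each wrapped in single quotes, sorted, and joined with ';'. A plain-Python sketch of that string and of a filter against it follows; the function names are hypothetical and only illustrate the idea.

```python
def categories_string(names):
    """Quote each tag name and join with ';', sorted by name like the SQL ORDER BY."""
    return ";".join("'" + name + "'" for name in sorted(names))


def has_category(categories, name):
    """Match on the quoted name so that 'VI' does not match 'VIP'."""
    return ("'" + name + "'") in categories
```

Keeping the quotes in the stored string is what makes a substring filter safe against partial tag names.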
I am writing a Django app that queries a Bugzilla database for reporting. I am trying to build a query that can get all of the bugs that have specific flags set.
The model representing the flags table.
class Bugzilla_flags(models.Model):
    class Meta:
        db_table = 'flags'
    type_id = models.IntegerField()
    status = models.CharField(max_length=50)
    bug_id = models.IntegerField()
    creation_date = models.DateTimeField()
    modification_date = models.DateTimeField()
    setter_id = models.IntegerField()
    requestee_id = models.IntegerField()

    def __unicode__(self):
        return str(self.bug_id)
I have a dictionary that represents the flags I want to look for (type_id : status).
flags = {'36':'?','12':'+'}
I tried using the reduce function, but I don't think it will work because it checks that all of the flags are present in the same row. If I run the query with a dictionary containing just a single k,v pair, it works fine, but not with more than one.
query = reduce(operator.and_, (Q(type_id=flag,status=val) for (flag,val) in flags.items()))
I will then take the results of that query, and use it as the search for the actual bugs database.
inner = Bugzilla_flags.objects.using('bugzilla').filter(query)
bugs = Bugzilla_bugs.objects.using('bugzilla').filter(bug_id__in=inner)
For some history, I am currently using a series of steps to generate some sql which I send as a raw query, but I am trying to see if I can do it in Django. The resulting sql is like this:
select b.bug_id, b.priority, b.bug_severity, b.bug_status, b.resolution, b.cf_verified_in, b.assigned_to, b.qa_contact, b.short_desc, b.cf_customercase,
MAX(CASE WHEN f.type_id = 31 THEN f.status ELSE NULL END) as Unlocksbranch1,
MAX(CASE WHEN f.type_id = 31 THEN f.status ELSE NULL END) as Unlocksbranch2,
MAX(CASE WHEN f.type_id = 33 THEN f.status ELSE NULL END) as Unlocksbranch3,
MAX(CASE WHEN f.type_id = 34 THEN f.status ELSE NULL END) as Unlocksbranch4,
MAX(CASE WHEN f.type_id = 36 THEN f.status ELSE NULL END) as Unlocksbranch5,
MAX(CASE WHEN f.type_id = 41 THEN f.status ELSE NULL END) as Unlocksbranch6,
MAX(CASE WHEN f.type_id = 12 THEN f.status ELSE NULL END) as CodeReviewed
from bugs b
inner join flags f on f.bug_id = b.bug_id
where ( b.bug_status = 'RESOLVED' or b.bug_status = 'VERIFIED' or b.bug_status = 'CLOSED' )
and b.resolution = 'FIXED'
group by b.bug_id
having CodeReviewed = '+' and Unlocksbranch1 = '?' and Unlocksbranch2 = '+'
The result of this gives me a single queryset that has all of the flags I care about as columns, which I can then do my analysis on. The last "having" section is what I am actually querying on, and is what I am trying to get with the above Django queries.
EDIT
Basically what I need to do is like this:
flags1 = {'36':'?'}
flags2 = {'12':'+'}
query1 = reduce(operator.and_, (Q(type_id=flag,status=val) for (flag,val) in flags1.items()))
query2 = reduce(operator.and_, (Q(type_id=flag,status=val) for (flag,val) in flags2.items()))
inner1 = Bugzilla_flags.objects.using('bugzilla').filter(query1)
inner2 = Bugzilla_flags.objects.using('bugzilla').filter(query2)
inner1_bugs = [row.bug_id for row in inner1] # list of just the bug_ids
inner2_bugs = [row.bug_id for row in inner2] # list of just the bug_ids
intersect = set(inner1_bugs) & set(inner2_bugs)
The intersect is a set that has all of the bug_ids that I can then use in the Bugzilla_bugs query to get the actual bug data.
How can I do the 3 operations (query, inner, inner_bugs) and then the intersect using a variable length dictionary input such as:
flags = {'36':'?','12':'+','15':'?',etc}
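In plain Python, the generalisation I am after looks roughly like the sketch below: build one set of bug_ids per (type_id, status) pair and intersect them all. The `flag_rows` tuples stand in for rows of the flags table and are made up for illustration.

```python
# Hypothetical (bug_id, type_id, status) rows standing in for the flags table.
flag_rows = [
    (101, 36, '?'),
    (101, 12, '+'),
    (101, 15, '?'),
    (102, 36, '?'),
    (103, 12, '+'),
    (104, 36, '+'),
    (104, 12, '+'),
]


def bugs_with_all_flags(rows, flags):
    """Return the bug_ids that have every (type_id, status) pair in `flags`."""
    per_flag_sets = [
        {bug_id for bug_id, type_id, status in rows
         if str(type_id) == str(wanted_type) and status == wanted_status}
        for wanted_type, wanted_status in flags.items()
    ]
    if not per_flag_sets:
        return set()
    # Intersect the per-flag sets, however many there are.
    return set.intersection(*per_flag_sets)
```

This works for any number of entries in the dictionary, but it still pulls every matching flag row into Python first.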
Your inner query looks right to me. To find bugs that have all those flags, not just any one, you can either use reduce again to and together a bunch of flag= Q objects, or iterate and build up multiple filter clauses.
inner = Bugzilla_flags.objects.using('bugzilla').filter(query)
flag_filter = reduce(operator.and_, (Q(flag=flag) for flag in inner))
bugs = Bugzilla_bugs.objects.using('bugzilla').filter(flag_filter)
Or:
inner = Bugzilla_flags.objects.using('bugzilla').filter(query)
bugs = Bugzilla_bugs.objects.using('bugzilla').all()
for flag in inner:
bugs = bugs.filter(flag=flag)
Or, for that matter, take advantage of the fact that multiple Q objects are anded together:
inner = Bugzilla_flags.objects.using('bugzilla').filter(query)
flag_filters = [Q(flag=flag) for flag in inner]
bugs = Bugzilla_bugs.objects.using('bugzilla').filter(*flag_filters)
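Another way to express "has all of these flags" is to let the database do the intersection: OR the individual (type_id, status) conditions together, group by bug_id, and keep only the groups that matched as many distinct flag types as were requested. The sketch below shows that SQL shape with Python's built-in sqlite3 module; the simplified `flags` table, the sample rows, and the helper name are assumptions for illustration, not the real Bugzilla schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flags (bug_id INTEGER, type_id INTEGER, status TEXT);
INSERT INTO flags VALUES
    (101, 36, '?'), (101, 12, '+'),
    (102, 36, '?'),
    (103, 12, '+'),
    (104, 36, '+'), (104, 12, '+');
""")


def bug_ids_with_all_flags(connection, wanted):
    """Return bug_ids that carry every (type_id, status) pair in `wanted`."""
    if not wanted:
        return []
    where = " OR ".join("(type_id = ? AND status = ?)" for _ in wanted)
    params = []
    for type_id, status in wanted.items():
        params += [int(type_id), status]
    params.append(len(wanted))
    sql = (
        "SELECT bug_id FROM flags "
        f"WHERE {where} "
        "GROUP BY bug_id "
        "HAVING COUNT(DISTINCT type_id) = ? "
        "ORDER BY bug_id"
    )
    return [row[0] for row in connection.execute(sql, params)]
```

The resulting list of ids can then be passed to a `bug_id__in` filter on the bugs table, as in the question.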
Assuming I have two models:
class Profile(models.Model):
    # some fields here

class Ratings(models.Model):
    profile = models.ForeignKey(Profile)
    category = models.IntegerField()
    points = models.IntegerField()
Assuming the following example of the MySQL table "ratings":
profile | category | points
1 1 10
1 1 4
1 2 10
1 3 0
1 4 10
1 4 10
1 4 10
1 5 0
I have the following values in my POST data, along with values for other fields:
category_1_avg_val = 7
category_2_avg_val = 5
category_3_avg_val = 5
category_4_avg_val = 7
category_5_avg_val = 9
I want to filter profiles whose average rating in each category is higher than or equal to the required value.
Some filters are applied initially as:
q1 = [('associated_with', search_for),
('profile_type__slug__exact', profile_type),
('gender__in', gender),
('rank__in', rank),
('styles__style__in', styles),
('age__gte', age_from),
('age__lte', age_to)]
q1_list = [Q(x) for x in q1 if x[1]]
q2 = [('user__first_name__icontains', search_term),
('user__last_name__icontains', search_term),
('profile_type__name__icontains', search_term),
('styles__style__icontains', search_term),
('rank__icontains', search_term)]
q2_list = [Q(x) for x in q2 if x[1]]
if q1_list:
    objects = Profile.objects.filter(
        reduce(operator.and_, q1_list))
if q2_list:
    if objects:
        objects = objects.filter(
            reduce(operator.or_, q2_list))
    else:
        objects = Profile.objects.filter(
            reduce(operator.or_, q2_list))
if order_by_ranking_level == 'desc':
    objects = objects.order_by('-ranking_level').distinct()
else:
    objects = objects.order_by('ranking_level').distinct()
Now I want to filter profiles whose average points (grouped by category) are >= the average values per category coming in the POST data.
I tried to do this one category at a time:
objects = objects.filter(
ratings__category=1) \
.annotate(avg_points=Avg('ratings__points'))\
.filter(avg_points__gte=category_1_avg_val)
objects = objects.filter(
ratings__category=2) \
.annotate(avg_points=Avg('ratings__points'))\
.filter(avg_points__gte=category_2_avg_val)
But I think this is wrong. Please help me out. It would be great if the result were a queryset.
Edited
Using the answer posted by hynekcer, I came up with a slightly different solution, since I already have a queryset of profiles that needs to be filtered further by rating.
def check_ratings_avg(pr, rtd):
    ok = True
    qr = Ratings.objects.filter(profile__id=pr.id) \
        .values('category')\
        .annotate(points_avg=Avg('points'))
    qr = {i['category']:i['points_avg'] for i in qr}
    for cat in rtd:
        val = rtd[cat]
        if qr[cat] >= val:
            pass
        else:
            ok = False
            break
    return ok

rtd = {1: category_1_avg_val, 2: category_2_avg_val, 3: category_3_avg_val,
       4: category_4_avg_val, 5: category_5_avg_val}
objects = [i for i in objects if check_ratings_avg(i, rtd)]
In principle, your complex query requires a subquery. Possible solutions are:
A subquery written with the 'extra' queryset method or a raw SQL query. It is not DRY, and it was unsupported by some database backends, e.g. by some versions of MySQL; however, Django has used subqueries in a limited way since Django 1.1.
Saving intermediate results into a temporary table in the database. This is not nice in Django.
Emulating the outer query with a loop in Python. This is the best universal solution: a loop in Python over data aggregated by the first query can aggregate and filter the data fast enough.
A) Subquery emulated by Python
from django.db.models import Q, Avg
from itertools import groupby
from myapp.models import Profile, Ratings
def iterator_filtered_by_average(dictionary):
    qr = Ratings.objects.values('profile', 'category', 'points').order_by(
        'profile', 'category').annotate(points_avg=Avg('points'))
    f = Q()
    # .items() works on both Python 2 and Python 3
    for k, v in dictionary.items():
        f |= Q(category=k, points_avg__gte=v)
    for profile, grp in groupby(qr.filter(f).values('profile')):
        if len(list(grp)) == len(dictionary):
            yield profile

# example
FILTER_DATA = {1:category_1_avg_val, 2:category_2_avg_val, 3:category_3_avg_val,
               4:category_4_avg_val, 5:category_5_avg_val}
for row in iterator_filtered_by_average(FILTER_DATA):
    print(row)
This is a simple solution for the original question without later additional requirements.
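The filtering rule itself (a profile passes only if the average in every required category reaches its threshold) can be checked on the sample table from the question with a plain-Python sketch. The rows for profile 2 and the function name are assumptions added for illustration; only the profile 1 rows and the thresholds come from the question.

```python
from collections import defaultdict

# (profile, category, points): profile 1 is the sample table from the question,
# profile 2 is a hypothetical profile that meets every threshold.
ratings = [
    (1, 1, 10), (1, 1, 4), (1, 2, 10), (1, 3, 0),
    (1, 4, 10), (1, 4, 10), (1, 4, 10), (1, 5, 0),
    (2, 1, 8), (2, 2, 6), (2, 3, 5), (2, 4, 7), (2, 5, 9),
]


def profiles_meeting_averages(rows, required):
    """Return sorted profile ids whose per-category average meets every threshold."""
    totals = defaultdict(lambda: [0, 0])  # (profile, category) -> [sum, count]
    for profile, category, points in rows:
        totals[(profile, category)][0] += points
        totals[(profile, category)][1] += 1

    passed = defaultdict(int)  # profile -> number of required categories satisfied
    for (profile, category), (total, count) in totals.items():
        if category in required and total / count >= required[category]:
            passed[profile] += 1

    # Same idea as the len(list(grp)) == len(dictionary) check above.
    return sorted(p for p, n in passed.items() if n == len(required))
```

With the thresholds from the question, profile 1 fails because its averages in categories 3 and 5 are 0.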
B) Solution with subqueries:
It is necessary for the more detailed version of the question, because the initial filters may be based on a field of type ManyToManyField and because the query contains a distinct clause:
# objects: QuerySet that you get from your initial filters. Not yet executed.
if rtd:
# Method `as_nested_sql` removes the `order_by` clause, unlike `as_sql`
subquery3 = objects.values('id').query \
.get_compiler(connection=connection).as_nested_sql()
subquery2 = ("""SELECT profile_id, category, avg(points) AS points_avg
FROM myapp_ratings
WHERE profile_id in
( %s
) GROUP BY profile_id, category
""" % subquery3[0], subquery3[1]
)
where_sql = ' OR '.join(
'category = %d AND points_avg >= %%s' % cat for cat in rtd.keys()
)
subquery = (
"""SELECT profile_id
FROM
( %s
) subquery2
WHERE %s
GROUP BY profile_id
HAVING count(*) = %s
""" % (subquery2[0], where_sql, len(rtd)),
subquery2[1] + tuple(rtd.values())
)
assert order_by_ranking_level in ('asc', 'desc')
mainquery = ("""SELECT myapp_profile.* FROM myapp_profile
INNER JOIN
( %s
) subquery ON subquery.profile_id=myapp_profile.id
ORDER BY ranking_level %s"""
% (subquery[0], order_by_ranking_level), subquery[1]
)
objects = Profile.objects.raw(mainquery[0], params=mainquery[1])
return objects
Please replace every occurrence of the string myapp with the name of your application.
Example of SQL generated by this code
SELECT myapp_profile.* FROM myapp_profile
INNER JOIN
( SELECT profile_id
FROM
( SELECT profile_id, category, avg(points) AS points_avg
FROM myapp_ratings
WHERE profile_id IN
( SELECT U0.`id` FROM `myapp_profile` U0 WHERE U0.`ranking_level` >= 4
) GROUP BY profile_id, category
) subquery2
WHERE category = 1 AND points_avg >= 7 OR category = 2 AND points_avg >= 5
OR category = 3 AND points_avg >= 5 OR category = 4 AND points_avg >= 7
OR category = 5 AND points_avg >= 9
GROUP BY profile_id
HAVING count(*) = 5
) subquery ON subquery.profile_id=myapp_profile.id
ORDER BY ranking_level asc
(For readability, this SQL was formatted by hand with the %s placeholders replaced by parameter values; the database engine, however, receives the parameters separately for security reasons.)
Your problem is due to Django's limited support for generating subqueries. Only some of the more complicated queries shown in the documentation create a subquery (e.g. aggregate after annotate, count after annotate, or aggregate after distinct, but not annotate after distinct or after annotate). Complicated nested aggregations are simplified to one query, which is unexpected.
Any other solution that executes a separate SQL query for every object returned by the first query is discouraged in production, although such solutions can be very useful for testing the results of a better solution.
You could add methods to a manager
# Untested code
class ProfileManager(models.Manager):
    def with_category_average(self, cat, avg):
        # Give each filter a unique annotation key
        key = 'avg_pts_' + str(cat)
        return self.filter(ratings__category=cat) \
            .annotate(**{key: Avg('ratings__points')}) \
            .filter(**{key + '__gte': avg})

    # Expects a dict of `cat: avg` pairs
    def filter_by_averages(self, avg_dict):
        qs = self.get_query_set()
        for key, val in avg_dict.items():
            qs &= self.with_category_average(key, val)
        return qs