How to do an exclude django query on multiple foreign key - python

My model example:
class Thing(models.Model):
    alpha = models.ForeignKey('auth.User', on_delete=models.CASCADE,
                              related_name='alpha_thing')
    beta = models.ForeignKey('auth.User', on_delete=models.CASCADE,
                             related_name='beta_thing')
    assigned_at = models.DateTimeField(
        _('assigned at'),
        null=True,
        help_text=_('Assigned at this date'))
I want to query all users who don't have a Thing with an empty assigned_at date, i.e. they may have Things, but each of those must have a date set.
I've tried:
return User.objects.exclude(
    alpha_thing__assigned_at__isnull=True
).exclude(
    beta_thing__assigned_at__isnull=True
).all()
but the result is empty (the Thing table is empty, so I'm not sure whether that has something to do with the join).

What about this,
from django.db.models import Q
User.objects.filter(Q(alpha_thing__assigned_at__isnull=False) | Q(beta_thing__assigned_at__isnull=False)).distinct()

There is another way, since you want to filter for users whose Things all have an assigned_at date.
You could:
User.objects.filter(
    alpha_thing__assigned_at__isnull=False,
    beta_thing__assigned_at__isnull=False,
)
Simple.
There is no need to use Q objects or | (OR) operations here.
What you want is not
alpha_thing__assigned_at__isnull=False OR
beta_thing__assigned_at__isnull=False
What you're looking for is
alpha_thing__assigned_at__isnull=False AND
beta_thing__assigned_at__isnull=False

For all users who don't have a Thing with an empty date, try:
return User.objects.exclude(
    alpha_thing__assigned_at=None
).exclude(
    beta_thing__assigned_at=None
).all()
By the way, I got the same result whether or not I used .all() at the end, so:
return User.objects.exclude(
    alpha_thing__assigned_at=None
).exclude(
    beta_thing__assigned_at=None
)
returned the same result as the first example.

Have you tried something like this?
from django.db.models import Q
has_null_alpha = Q(alpha_thing__isnull=False, alpha_thing__assigned_at__isnull=True)
has_null_beta = Q(beta_thing__isnull=False, beta_thing__assigned_at__isnull=True)
User.objects.exclude(has_null_alpha | has_null_beta)
Reasoning
I think the reason you're seeing unexpected results may not have anything to do with the fact that there are multiple ForeignKey paths in the queryset. Your statement that "the thing table is empty" might be the key, and the reason users aren't showing up is because they have no alpha_thing or beta_thing relation.
NOTES:
The QuerySet User.objects.exclude(alpha_thing__assigned_at__isnull=True) produces a LEFT OUTER JOIN between the User table and the Thing table, which means that, before any comparison in the WHERE clause is made, assigned_at is NULL in every row where there is no Thing.
One surprising thing here is that filter() causes an INNER JOIN, so User.objects.filter(alpha_thing__assigned_at__isnull=False) only yields the users who actually have alpha_thing related objects with a non-NULL assigned_at (leaving out users with no related alpha_thing).
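To make the join reasoning above concrete, here is a minimal sketch that uses the standard-library sqlite3 module instead of Django. The table names (auth_user, app_thing), the two sample users and the hand-written SQL are assumptions for illustration only; they approximate the outer-join query described above and are not captured Django output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two users, and an empty Thing table as in the question.
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE app_thing (
        id INTEGER PRIMARY KEY,
        alpha_id INTEGER REFERENCES auth_user(id),
        assigned_at TEXT
    );
    INSERT INTO auth_user (id, username) VALUES (1, 'ann'), (2, 'bob');
""")

# Outer-join form: a user without any Thing gets assigned_at = NULL,
# so the negated IS NULL test removes that user as well.
join_rows = conn.execute("""
    SELECT u.username
    FROM auth_user u
    LEFT OUTER JOIN app_thing t ON t.alpha_id = u.id
    WHERE NOT (t.assigned_at IS NULL)
""").fetchall()

# Guarded form: only drop users for whom an undated Thing really exists,
# which is the intent behind the alpha_thing__isnull=False guard.
guarded_rows = conn.execute("""
    SELECT u.username
    FROM auth_user u
    WHERE NOT EXISTS (
        SELECT 1 FROM app_thing t
        WHERE t.alpha_id = u.id AND t.assigned_at IS NULL
    )
    ORDER BY u.username
""").fetchall()

print(join_rows)     # no users survive the outer-join form
print(guarded_rows)  # both users survive the guarded form
```

With an empty Thing table the first query returns nothing and the second returns every user, which matches the behaviour reported in the question.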

Related

Django: Ordering by last item in reverse ForeignKey relation

I'm trying to implement a qs.sort_by operation, but I'm not sure how to go about it.
Given the following models
class Group(models.Model):
    name = models.CharField(max_length=300)
    ( ... )
class Budget(models.Model):
    ( ... )
    amount = models.DecimalField(
        decimal_places=2,
        max_digits=8,
    )
    group = models.ForeignKey(
        Group,
        related_name="budgets",
        on_delete=models.CASCADE,
    )
where each Group has either 0 or more than 5 budgets assigned.
I am trying to sort a Group.objects.all() queryset by the amount of their last (as in, most recently assigned) Budget.
I know that if it was a OneToOne I could do something like the following:
Group.objects.all().order_by('budget__amount')
With ForeignKey relations I was hoping I could do something like
Group.objects.all().order_by('budgets__last__amount')
But alas, that is not a valid operation, and I am not sure how to proceed otherwise.
Does anyone know how to perform this sorting operation (if it is indeed possible)?
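This should do it:
from django.db.models import OuterRef, Subquery
# Most recent budget per group, assuming a higher pk means assigned later
latest = Budget.objects.filter(group=OuterRef('pk')).order_by('-pk')
Group.objects.annotate(
    latest_budget=Subquery(
        latest.values('amount')[:1]
    )
).order_by('latest_budget')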
You can also try using sorted with a custom key:
sorted(Group.objects.all(), key=lambda x: x.last__budget__amount)
Perhaps .last__budget__amount is not a valid method, but you get the idea.
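The Subquery answer above boils down to a correlated subquery that picks one amount per group. The following sketch reproduces that idea with the standard-library sqlite3 module; the table names (app_group, app_budget), the sample rows and the hand-written SQL are assumptions for illustration, not the SQL Django actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Group 'a' and 'b' have two budgets each; group 'empty' has none.
conn.executescript("""
    CREATE TABLE app_group (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_budget (
        id INTEGER PRIMARY KEY,
        amount INTEGER,
        group_id INTEGER REFERENCES app_group(id)
    );
    INSERT INTO app_group (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'empty');
    INSERT INTO app_budget (id, amount, group_id) VALUES
        (1, 50, 1), (2, 10, 2), (3, 5, 1), (4, 20, 2);
""")

# For every group, take the amount of the budget with the highest id
# (standing in for "most recently assigned") and sort by it.
rows = conn.execute("""
    SELECT g.name,
           (SELECT b.amount
            FROM app_budget b
            WHERE b.group_id = g.id
            ORDER BY b.id DESC
            LIMIT 1) AS latest_budget
    FROM app_group g
    ORDER BY latest_budget
""").fetchall()

print(rows)
```

Groups without budgets get NULL for latest_budget. SQLite sorts NULL first in ascending order; other databases may place it last, so the position of such groups is backend-dependent.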

django ORM turns two conditions on related table into two separate JOINs

I have the case that I need to filter on two attributes from a related table.
class Item(models.Model):
    vouchers = models.ManyToManyField('Voucher')
class Voucher(models.Model):
    is_active = models.BooleanField()
    status = models.PositiveIntegerField()
When I query the ORM like this:
Item.objects.exclude(
    vouchers__is_active=False,
    vouchers__status__in=[1, 2])
The created query looks like this:
SELECT *
FROM `item`
WHERE NOT (
    `item`.`id` IN (
        SELECT U1.`item_id`
        FROM `itemvouchers` U1
        INNER JOIN `voucher` U2 ON (U1.`voucher_id` = U2.`id`)
        WHERE U2.`is_active` = FALSE)
    AND
    `item`.`id` IN (
        SELECT U1.`item_id`
        FROM `itemvouchers` U1
        INNER JOIN `voucher` U2 ON (U1.`voucher_id` = U2.`id`)
        WHERE U2.`status` IN (1, 2))
)
I want to exclude vouchers which are both inactive AND have status 1 or 2.
The query creates two separate joins. First, this is unnecessary and bad for performance. Second, it is simply wrong.
Case:
voucher_a = Voucher.objects.create(status=3, is_active=True)
voucher_b = Voucher.objects.create(status=1, is_active=False)
If I have an item related to both voucher_a and voucher_b, it is not found, because it is in JOIN 1 but not in JOIN 2.
It looks like a bug in Django, but I wasn't able to find anything useful on the web about this topic.
We are on django==2.1.1 and tried replacing exclude with filter and using Q expressions. Nothing has worked so far.
Your setup is an m2m relation, and you want to exclude any single object that has at least one m2m relation for which this AND combination of conditions is true.
M2M relationships are special when it comes to filter/exclude querysets, see https://docs.djangoproject.com/en/2.1/topics/db/queries/#spanning-multi-valued-relationships
Also note in that documentation:
The behavior of filter() for queries that span multi-value relationships, as described above, is not implemented equivalently for exclude(). Instead, the conditions in a single exclude() call will not necessarily refer to the same item.
The solution presented in the documentation is the following:
Blog.objects.exclude(
    entry__in=Entry.objects.filter(
        headline__contains='Lennon',
        pub_date__year=2008,
    ),
)
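Applied to the models in the question, the documented pattern would presumably read Item.objects.exclude(vouchers__in=Voucher.objects.filter(is_active=False, status__in=[1, 2])). The sketch below does not run Django; it uses the standard-library sqlite3 module with hand-written SQL and made-up sample rows (all assumptions) to show why the two-subquery form and the single-subquery form give different answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Voucher 1: inactive, status 3. Voucher 2: active, status 1.
# Voucher 3: inactive AND status 1 (the only one matching both conditions).
# Item 1 has vouchers 1 and 2, item 2 has voucher 3, item 3 has none.
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY);
    CREATE TABLE voucher (id INTEGER PRIMARY KEY, is_active INTEGER, status INTEGER);
    CREATE TABLE itemvouchers (item_id INTEGER, voucher_id INTEGER);
    INSERT INTO item (id) VALUES (1), (2), (3);
    INSERT INTO voucher (id, is_active, status) VALUES (1, 0, 3), (2, 1, 1), (3, 0, 1);
    INSERT INTO itemvouchers (item_id, voucher_id) VALUES (1, 1), (1, 2), (2, 3);
""")

# Shape of the query from the question: each condition is checked in its
# own subquery, so two different vouchers can satisfy them.
separate = conn.execute("""
    SELECT id FROM item
    WHERE NOT (
        id IN (SELECT U1.item_id FROM itemvouchers U1
               JOIN voucher U2 ON U1.voucher_id = U2.id
               WHERE U2.is_active = 0)
        AND
        id IN (SELECT U1.item_id FROM itemvouchers U1
               JOIN voucher U2 ON U1.voucher_id = U2.id
               WHERE U2.status IN (1, 2))
    )
    ORDER BY id
""").fetchall()

# Shape of the documented fix: both conditions must hold for the same voucher.
same_voucher = conn.execute("""
    SELECT id FROM item
    WHERE id NOT IN (
        SELECT U1.item_id FROM itemvouchers U1
        JOIN voucher U2 ON U1.voucher_id = U2.id
        WHERE U2.is_active = 0 AND U2.status IN (1, 2)
    )
    ORDER BY id
""").fetchall()

print(separate)
print(same_voucher)
```

Item 1 is dropped by the first query even though none of its vouchers is both inactive and in status 1 or 2; the second query keeps it.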

Django filter by the number of rows matching a certain condition in a ManyToMany

I need to filter for objects where the number of elements in a ManyToMany relationship matches a condition. Here's some simplified models:
class Place(models.Model):
    name = models.CharField(max_length=100)
class Person(models.Model):
    type = models.CharField(max_length=1)
    place = models.ManyToManyField(Place, related_name="people")
I tried to do this:
c = Count(Q(people__type='V'))
p = Place.objects.annotate(v_people=c)
But this just makes the .v_people attribute count the number of People.
Since Django 2.0, you can use the filter=... parameter of the Count(..) function for this:
Place.objects.annotate(
    v_people=Count('people', filter=Q(people__type='V'))
)
So this will assign to v_people the number of people with type='V' for that specific Place object.
An alternative is to .filter(..) the relation first:
Place.objects.filter(
    Q(people__type='V') | Q(people__isnull=True)
).annotate(
    v_people=Count('people')
)
Here we filter the relation such that we allow people that either have type='V', or no people at all (since it is possible that the Place has no people). We then count the related model.
This generates a query like:
SELECT `place`.*, COUNT(`person_place`.`person_id`) AS `v_people`
FROM `place`
LEFT OUTER JOIN `person_place` ON `place`.`id` = `person_place`.`place_id`
LEFT OUTER JOIN `person` ON `person_place`.`person_id` = `person`.`id`
WHERE `person`.`type` = 'V' OR `person_place`.`person_id` IS NULL
GROUP BY `place`.`id`
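The two approaches above do not behave identically for a Place whose people all have a different type. The following sketch compares them with the standard-library sqlite3 module; the table layout mirrors the SQL above, while the sample rows and the CASE-based conditional count are assumptions chosen for illustration (the SQL Django emits for Count(..., filter=...) depends on the backend).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# cafe: two 'V' people and one 'X'; park: only an 'X' person; dock: nobody.
conn.executescript("""
    CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE person_place (person_id INTEGER, place_id INTEGER);
    INSERT INTO place (id, name) VALUES (1, 'cafe'), (2, 'park'), (3, 'dock');
    INSERT INTO person (id, type) VALUES (1, 'V'), (2, 'V'), (3, 'X');
    INSERT INTO person_place (person_id, place_id) VALUES (1, 1), (2, 1), (3, 1), (3, 2);
""")

joins = """
    FROM place
    LEFT OUTER JOIN person_place ON place.id = person_place.place_id
    LEFT OUTER JOIN person ON person_place.person_id = person.id
"""

# Conditional aggregation: the condition lives inside COUNT, so every
# place keeps its row and non-matching people simply are not counted.
conditional = conn.execute(
    "SELECT place.name, "
    "COUNT(CASE WHEN person.type = 'V' THEN person.id END) AS v_people"
    + joins + "GROUP BY place.id ORDER BY place.id"
).fetchall()

# Filter first, then count: rows that fail the WHERE clause are removed
# before grouping, so a place with only non-'V' people disappears.
filtered = conn.execute(
    "SELECT place.name, COUNT(person_place.person_id) AS v_people"
    + joins
    + "WHERE person.type = 'V' OR person_place.person_id IS NULL "
    + "GROUP BY place.id ORDER BY place.id"
).fetchall()

print(conditional)
print(filtered)
```

If places with zero matching people must still appear with a count of 0, the conditional count is the safer of the two.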

How to force Django to use LEFT OUTER JOIN in query?

I have two models: Person and Task.
class Person(models.Model):
    display_name = models.CharField()
    ...
class Task(models.Model):
    person = models.ForeignKey(Person)
    is_deleted = models.BooleanField()
    ...
I want to get a list of ALL people along with their number of tasks (including 0).
Initially, I wrote the query below and it worked pretty well:
Person.objects.values('person_id', 'display_name').annotate(crt_task_amt=Count('task__id')).order_by('-crt_task_amt', 'display_name')
Later, I introduced a filter on is_deleted. Then people with no tasks disappeared:
Person.objects.filter(task__is_deleted=False).values('person_id', 'display_name').annotate(crt_task_amt=Count('task__id')).order_by('-crt_task_amt', 'display_name')
I'm looking for something like:
SELECT p.id, p.display_name, count(t.id) FROM dashboard_person p LEFT OUTER JOIN dashboard_task t ON (p.person_id=t.person_id AND t.is_deleted=0) GROUP BY t.person_id
Is there any way to achieve it without using raw SQL?
Sometimes the Django ORM decides to use INNER JOIN and sometimes LEFT OUTER JOIN. I haven't yet found the logic behind this, but I have tested some cases that gave me an idea.
Starting case (I am using Django 1.8.1):
class Parent(...):
    ...
class Child(...):
    parent = ForeignKey(Parent)
    status = CharField()
    name = CharField()
    ...
qs = Parent.objects.all()
Task 1: for each parent record, count how many child records it has
This should work:
qs = qs.annotate(child_count_all=Count("child"))
Looking into qs.query, you can see that LEFT OUTER JOIN is used, which is correct.
But if I do it with SUM + CASE-WHEN:
qs = qs.annotate(
    child_count=Sum(Case(default=1), output_field=IntegerField())
)
Looking into qs.query, you can see that this time INNER JOIN is used, which filters out all parent records that don't have any child records, producing wrong results.
The workaround for this is something like:
qs = qs.annotate(
    child_count=Sum(
        Case(
            When(child__id=None, then=0),
            default=1,
            output_field=IntegerField())
    ))
This time qs.query showed LEFT OUTER JOIN being used, producing correct results.
Task 2: count how many active child records each parent has
Active child records are those with status<>'INA'. Based on the previous solution I tried the following:
qs = qs.annotate(
    child_count=Sum(
        Case(
            When(child__id=None, then=0),
            When(child__status='INA', then=0),
            default=1,
            output_field=IntegerField())
    ))
but again, qs.query shows that INNER JOIN is being used, thus producing wrong results (for my case).
The workaround/solution is to use two OR-ed Q objects:
qs = qs.annotate(
    child_count=Sum(
        Case(
            When(Q(child__id=None) | Q(child__status="INA"), then=0),
            default=1,
            output_field=IntegerField())
    ))
Again, qs.query used LEFT OUTER JOIN, yielding correct results.
Task 3: same as Task 2, but count only records that have a name filled in
This works:
qs = qs.annotate(
    child_with_name_count=Sum(
        Case(
            When(Q(child__id=None) | Q(child__status="INA"), then=0),
            When(child__name__isnull=False, then=1),
            default=0,
            output_field=IntegerField())
    ))
Conclusion
I cannot tell for sure why an inner join is used in some cases and a left join in others, so my way of dealing with it was to test various combinations, inspecting qs.query until I got the proper result. Another way is to use qs.raw/join/extra and other more native and advanced Django ORM/SQL combinations.
q = Task.objects.filter(is_deleted=False).values('person__id', 'person__display_name').annotate(crt_task_amt=Count('id')).order_by('-crt_task_amt', 'person__display_name')
q[0]['person__id']  # gives person_id
q[0]['person__display_name']  # gives person name
q[0]['crt_task_amt']  # gives count of tasks of the first person
UPDATE:
Hope this works.
Task.objects.filter(is_deleted=False, person__isnull=True).values('person__id').annotate(crt_task_amt=Count('id')).order_by('-crt_task_amt', 'person__display_name')
This can be done easily using joins, but you need to use a little bit of raw SQL for that.
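The raw SQL alluded to above can be sketched as follows. The example uses the standard-library sqlite3 module with made-up rows; the simplified table names (person, task) and the id column as join key are assumptions, not the schema from the question. It contrasts filtering in the WHERE clause with putting the same condition into the ON clause of the outer join. In Django 2.0 or later, Count('task', filter=Q(task__is_deleted=False)) is one ORM-level way to aim for the same result without raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Ann has two live tasks, Bob has one deleted task, Cy has no tasks.
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, display_name TEXT);
    CREATE TABLE task (id INTEGER PRIMARY KEY, person_id INTEGER, is_deleted INTEGER);
    INSERT INTO person (id, display_name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO task (id, person_id, is_deleted) VALUES (1, 1, 0), (2, 1, 0), (3, 2, 1);
""")

# Condition in WHERE: rows without a matching live task are discarded
# after the join, so people with zero live tasks vanish.
where_rows = conn.execute("""
    SELECT p.display_name, COUNT(t.id) AS crt_task_amt
    FROM person p
    LEFT OUTER JOIN task t ON t.person_id = p.id
    WHERE t.is_deleted = 0
    GROUP BY p.id
    ORDER BY crt_task_amt DESC, p.display_name
""").fetchall()

# Condition in ON: the outer join still produces one row per person,
# and COUNT(t.id) ignores the NULLs of people without live tasks.
on_rows = conn.execute("""
    SELECT p.display_name, COUNT(t.id) AS crt_task_amt
    FROM person p
    LEFT OUTER JOIN task t ON (t.person_id = p.id AND t.is_deleted = 0)
    GROUP BY p.id
    ORDER BY crt_task_amt DESC, p.display_name
""").fetchall()

print(where_rows)
print(on_rows)
```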

How to restrict query of a ManyToMany relationship with Q.AND in Django

I want to get all images that have 2 specific tags, 'tag1' AND 'tag2'. My simplified models:
class Image(models.Model):
    title = models.CharField(max_length=100)
class Tag(models.Model):
    name = models.CharField(max_length=64, unique=True)
    images = models.ManyToManyField(Image, null=True, blank=True)
Concatenating filter works:
query = Image.objects.filter(tag__name='tag1').filter(tag__name='tag2')
However, I thought I could do it using the Q object from Django. I'm building a complex query, so using Q would be more straightforward. I'm adding all parameters to a qobj = Q() using qobj.add(Q(tag__name='tag1'), Q.AND). But... the following retrieves nothing:
qobj = Q()
qobj.add(Q(tag__name='tag1'), Q.AND)
qobj.add(Q(tag__name='tag2'), Q.AND)
query = Image.objects.filter(qobj)
Everything works as expected when using OR connector in the code above, returning correctly images that have tag1 OR tag2.
It seems that in the AND case it is looking for a single row in app_tag_images with both tags, which obviously cannot exist, since each row holds only one tag_id per image_id.
Is there a way to build this query with Q?
ps: let me know if more details of the code are needed.
edit:
Here is the SQL of the query built with Q (I removed most SELECT columns for clarity):
SELECT "meta_image"."id", "meta_image"."title"
FROM "meta_image"
INNER JOIN "meta_tag_images" ON ("meta_image"."id" = "meta_tag_images"."image_id")
INNER JOIN "meta_tag" ON ("meta_tag_images"."tag_id" = "meta_tag"."id")
WHERE ("meta_tag"."name" = tag1 AND "meta_tag"."name" = tag2)
The OR query is identical to the one above (with AND replaced by OR).
Just for reference, the working method that chains filter() calls prints this query (also simplified):
SELECT "meta_image"."id", "meta_image"."title"
FROM "meta_image"
INNER JOIN "meta_tag_images" ON ("meta_image"."id" = "meta_tag_images"."image_id")
INNER JOIN "meta_tag" ON ("meta_tag_images"."tag_id" = "meta_tag"."id")
INNER JOIN "meta_tag_images" T4 ON ("meta_image"."id" = T4."image_id")
INNER JOIN "meta_tag" T5 ON (T4."tag_id" = T5."id")
WHERE ("meta_tag"."name" = tag1 AND T5."name" = tag2)
I wasn't even aware of that format!
What's wrong with the way the docs show Q object usage? http://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q-objects
Image.objects.filter(Q(tag__name='tag1') & Q(tag__name='tag2'))
UPDATE:
I tested the qobj.add() method on my model with m2m and it works fine on 1.2.3
It also works fine copy and pasting your simplified model.
Are you sure your query is supposed to return something?
Does the standard Q usage Q(tag__name='tag1') & Q(tag__name='tag2') return results?
Can you print myquery.query as well?
Let's narrow this down.
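The difference between the two SQL statements quoted in the question can be reproduced outside Django. This sketch uses the standard-library sqlite3 module with the table names from the quoted SQL; the sample rows are made up, and the queries are hand-written approximations rather than output captured from any particular Django version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Image 1 carries tag1 and tag2; image 2 carries only tag1.
conn.executescript("""
    CREATE TABLE meta_image (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE meta_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE meta_tag_images (tag_id INTEGER, image_id INTEGER);
    INSERT INTO meta_image (id, title) VALUES (1, 'both'), (2, 'only tag1');
    INSERT INTO meta_tag (id, name) VALUES (1, 'tag1'), (2, 'tag2');
    INSERT INTO meta_tag_images (tag_id, image_id) VALUES (1, 1), (2, 1), (1, 2);
""")

# One join: a single tag row would need two different names at once.
single_join = conn.execute("""
    SELECT i.title
    FROM meta_image i
    INNER JOIN meta_tag_images ti ON i.id = ti.image_id
    INNER JOIN meta_tag t ON ti.tag_id = t.id
    WHERE t.name = ? AND t.name = ?
""", ("tag1", "tag2")).fetchall()

# Two joins: each condition is matched against its own tag row.
double_join = conn.execute("""
    SELECT i.title
    FROM meta_image i
    INNER JOIN meta_tag_images ti ON i.id = ti.image_id
    INNER JOIN meta_tag t ON ti.tag_id = t.id
    INNER JOIN meta_tag_images t4 ON i.id = t4.image_id
    INNER JOIN meta_tag t5 ON t4.tag_id = t5.id
    WHERE t.name = ? AND t5.name = ?
""", ("tag1", "tag2")).fetchall()

print(single_join)
print(double_join)
```

Only the two-join form, which is what chaining filter() calls produced in the question, can find an image that has both tags.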
