Django: sum annotation on relation with a limit clause


I have a situation similar to the following one:
class Player(models.Model):
    pass
class Item(models.Model):
    player = models.ForeignKey(Player,
                               on_delete=models.CASCADE,
                               related_name='item_set')
    power = models.IntegerField()
I would like to annotate Player.objects.all() with Sum('item_set__power'), taking into account only the top N Items when sorted by descending power. Ideally, I would like to do this with a subquery, but I don't know how to write it. How can this be done?

Here is a solution using a raw queryset (for me it was easier to implement than with the ORM, though it might be possible with the ORM using Subquery):
N = 2
# The derived table gets an alias (top_items) because some databases require one.
query = """
    SELECT id,
    (
        SELECT SUM(power)
        FROM (SELECT power FROM myapp_item WHERE myapp_item.player_id = players.id ORDER BY power DESC LIMIT %s) AS top_items
    )
    AS power__sum
    FROM myapp_player AS players GROUP BY players.id
"""
players = Player.objects.raw(query, [N])
Update
Adding annotations is not possible with a RawQuerySet, but you can use a RawSQL expression:
from django.db.models.expressions import RawSQL
N = 2
queryset = Player.objects.all()
# The derived table gets an alias (top_items) because some databases require one.
query2 = """
    SELECT SUM(power)
    FROM (SELECT power FROM myapp_item WHERE myapp_item.player_id = myapp_player.id ORDER BY power DESC LIMIT %s) AS top_items
"""
queryset.annotate(power__sum=RawSQL(query2, (N,)), my_annotation1=..., my_annotation2=...)
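To see what the correlated subquery computes without a Django project, here is a minimal sketch that runs the same SQL shape against an in-memory SQLite database using only the standard library. The table names follow the myapp_ prefix used in the answer; the helper names (build_demo_db, top_n_power_sums) and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up players and items."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE myapp_player (id INTEGER PRIMARY KEY);
        CREATE TABLE myapp_item (
            id INTEGER PRIMARY KEY,
            player_id INTEGER NOT NULL REFERENCES myapp_player(id),
            power INTEGER NOT NULL
        );
        INSERT INTO myapp_player (id) VALUES (1), (2), (3);
        INSERT INTO myapp_item (player_id, power) VALUES
            (1, 5), (1, 9), (1, 3), (2, 7);
    """)
    return conn


def top_n_power_sums(conn, n):
    """Map each player id to the sum of that player's n highest powers."""
    query = """
        SELECT players.id,
               (SELECT SUM(power)
                FROM (SELECT power
                      FROM myapp_item
                      WHERE myapp_item.player_id = players.id
                      ORDER BY power DESC
                      LIMIT ?) AS top_items) AS power__sum
        FROM myapp_player AS players
    """
    return dict(conn.execute(query, (n,)).fetchall())


sums = top_n_power_sums(build_demo_db(), 2)
print(sums)
```

A player with no items gets NULL (None in Python) because SUM over zero rows is NULL; wrapping the subquery in COALESCE(..., 0) would turn that into 0 if preferred.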

Related

How to filter not only by outerref id in a subquery?

I have a problem with filtering by a boolean field in a subquery.
For example, I have two models: Good and Order.
class Good(models.Model):
    objects = GoodQuerySet.as_manager()
class Order(models.Model):
    good = models.ForeignKey(Good, on_delete=models.CASCADE, related_name="orders")
    is_completed = models.BooleanField(default=False)
I want to calculate how many completed orders each good has.
I implemented a method in Good's manager:
class GoodQuerySet(models.QuerySet):
    def completed_orders_count(self):
        subquery = Subquery(
            Order.objects.filter(good_id=OuterRef("id"))
            .order_by()
            .values("good_id")
            .annotate(c=Count("*"))
            .values("c")
        )
        return self.annotate(completed_orders_count=Coalesce(subquery, 0))
This method counts all existing orders for a good, but it works when I call it like this:
Good.objects.completed_orders_count().first().completed_orders_count
To get the correct value of completed orders I tried to add filter is_completed=True. The final version looks like this:
class GoodQuerySet(models.QuerySet):
    def completed_orders_count(self):
        subquery = Subquery(
            Order.objects.filter(good_id=OuterRef("id"), is_completed=True)
            .order_by()
            .values("good_id")
            .annotate(c=Count("*"))
            .values("c")
        )
        return self.annotate(completed_orders_count=Coalesce(subquery, 0))
If I try to call Good.objects.completed_orders_count().first().completed_orders_count, I get an error:
django.core.exceptions.FieldError: Expression contains mixed types. You must set output_field.

How to use an inner join on Subquery() in the Django ORM?

I have two models:
class FirstModel(models.Model):
    # some fields...
class SecondModel(models.Model):
    date = models.DateTimeField()
    value = models.IntegerField()
    first_model = models.ForeignKey(to="FirstModel", on_delete=models.CASCADE)
and I need to do the following query:
select sum(value) from second_model
inner join (
    select max(date) as max_date, id from second_model
    where date < NOW()
    group by id
) as subquery
on date = max_date and id = subquery.id
I think I can do it using Subquery
subquery = Subquery(SecondModel.objects.values("first_model")
                    .annotate(max_date=Max("date"))
                    .filter(date__lt=Func(function="NOW")))
and F() expressions, but F() can only resolve model fields, not a subquery.
Question
Is it possible to implement using Django ORM only?
Also, can I evaluate the sum of values from the second model for all values in the first model by annotating this value? Like
FirstModel.objects.annotate(sum_values=sum_with_inner_join_query).all()
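To make the target query concrete, here is a minimal sketch of one possible reading of it, run against an in-memory SQLite database with the standard library. It assumes the intent is "for each first_model, take the most recent row dated before now, then sum those values", so it groups and joins on first_model_id rather than id. That reading, the :now parameter (standing in for NOW()), and the sample rows are all assumptions, not something the question confirms.

```python
import sqlite3

# Assumed interpretation: latest row per first_model before a cutoff.
LATEST_VALUE_SUM = """
    SELECT SUM(sm.value)
    FROM second_model AS sm
    INNER JOIN (
        SELECT first_model_id, MAX(date) AS max_date
        FROM second_model
        WHERE date < :now
        GROUP BY first_model_id
    ) AS latest
      ON sm.first_model_id = latest.first_model_id
     AND sm.date = latest.max_date
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE second_model (
        id INTEGER PRIMARY KEY,
        first_model_id INTEGER NOT NULL,
        date TEXT NOT NULL,
        value INTEGER NOT NULL
    );
    -- Made-up rows; ISO-8601 strings compare correctly as text.
    INSERT INTO second_model (first_model_id, date, value) VALUES
        (1, '2024-01-01 00:00:00', 10),
        (1, '2024-02-01 00:00:00', 20),
        (2, '2024-01-15 00:00:00', 5),
        (2, '2024-06-01 00:00:00', 99);
""")

# The cutoff is passed in explicitly instead of calling NOW().
(total,) = conn.execute(
    LATEST_VALUE_SUM, {"now": "2024-03-01 00:00:00"}
).fetchone()
print(total)
```

Qualifying every column with a table alias also removes the ambiguity of the bare date and id references in the join condition.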

Django - Select MAX of field of related query set

Say if I have models:
class User(models.Model):
    name = ...
    dob = ...
class Event(models.Model):
    user = models.ForeignKey(User, ...)
    timestamp = models.DateTimeField()
And I want to query all Users and annotate each with both the Count of events and the MAX of Event.timestamp.
I know for count I can do:
User.objects.all().annotate(event_count=models.Count('event'))
But how do I do max of a related queryset field? I want it to be a single query, like:
SELECT Users.*, MAX(Events.timestamp), COUNT(Events)
FROM Users JOIN Events on Users.id = Events.user_id

You can use aggregate expressions to achieve that. The lookup name depends on the foreign key's related name; with the models above, which set no related_name, it is 'event'. It would look something like this:
from django.db.models import Count, Max
# Max is used here instead of Func(..., function='MAX') so that Django treats the expression as an aggregate.
User.objects.all().annotate(
    event_count=Count('event'),
    max_timestamp=Max('event__timestamp')
)

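You can try an aggregate query for this as follows:
from django.db.models import Max
User.objects.all().aggregate(Max('event__timestamp'))
Note that aggregate() returns a single dictionary computed over all users rather than one value per user. See the aggregation topic in the Django documentation for details.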
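For reference, the single SQL statement the question is after can be sketched with the standard library's sqlite3 module. The table and column names mirror the question's SQL; the sample rows are made up, and a LEFT JOIN with GROUP BY is assumed here (rather than the plain JOIN in the question) so that users without events still appear with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        timestamp TEXT NOT NULL
    );
    -- Hypothetical data: one user with two events, one with none.
    INSERT INTO users (id, name) VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO events (user_id, timestamp) VALUES
        (1, '2024-01-01 10:00:00'),
        (1, '2024-03-05 08:30:00');
""")

# One row per user: id, number of events, latest event timestamp.
rows = conn.execute("""
    SELECT users.id, COUNT(events.id), MAX(events.timestamp)
    FROM users
    LEFT JOIN events ON events.user_id = users.id
    GROUP BY users.id
    ORDER BY users.id
""").fetchall()
print(rows)
```

COUNT(events.id) ignores the NULL produced for a user with no events, while MAX returns NULL (None) for that user.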

How to convert an SQL query to a Django queryset?

I have two tables, categories and items. In the models, items has a foreign key to categories. There are 4 categories and 12 items, with 3 items in each category. How do I write a queryset that returns the items belonging to the same category?
I know how to write the SQL query (select * from category where category_id =1;). How do I write it as a Django queryset?

You can achieve it by using Django Queries:
Item.objects.filter(category__id=1)

As mentioned in Kamil's answer, you could use filters, or, if you want to use the SQL query as is, you could use raw queries. An example (taken from the official docs):
class Person(models.Model):
    first_name = models.CharField(...)
    last_name = models.CharField(...)
    birth_date = models.DateField(...)
And querying it would be:
# querying with a raw SQL query
for p in Person.objects.raw('SELECT * FROM myapp_person'):
    print(p)
And in your case:
# assuming the app is named myapp, so that Item lives in the myapp_item table
Item.objects.raw('SELECT * FROM myapp_item WHERE category_id = %s', [1])

Get least recently rented movies in Django

So imagine you have the following two tables:
CREATE TABLE movies (
    id int,
    name varchar(255),
    ...
    PRIMARY KEY (id)
);
CREATE TABLE movieRentals (
    id int,
    movie_id int,
    customer varchar(255),
    dateRented datetime,
    ...
    PRIMARY KEY (id),
    FOREIGN KEY (movie_id) REFERENCES movies(id)
);
With SQL directly, I'd approach this query as:
(
    SELECT movie_id, count(movie_id) AS rent_count
    FROM movieRentals
    WHERE dateRented > [TIME_ARG_HERE]
    GROUP BY movie_id
)
UNION
(
    SELECT id AS movie_id, 0 AS rent_count
    FROM movies
    WHERE id NOT IN
    (
        SELECT movie_id
        FROM movieRentals
        WHERE dateRented > [TIME_ARG_HERE]
        GROUP BY movie_id
    )
)
(Get a count of all movie rentals, by id, since a given date)
Obviously the Django versions of these tables are simple models:
class Movies(models.Model):
    name = models.CharField(max_length=255, unique=True)
class MovieRentals(models.Model):
    customer = models.CharField(max_length=255)
    dateRented = models.DateTimeField()
    movie = models.ForeignKey(Movies)
However, translating this to an equivalent query appears to be difficult:
timeArg = datetime.datetime.now() - datetime.timedelta(7,0)
queryset = models.MovieRentals.objects.all()
queryset = queryset.filter(dateRented__gte=timeArg)
queryset = queryset.annotate(rent_count=Count('movies'))
querysetTwo = models.Movies.objects.all()
querysetTwo = querysetTwo.filter(~Q(id__in=[val["movie_id"] for val in queryset.values("movie_id")]))
# Somehow need to set the 0 count. For now force it with Extra:
querysetTwo.extra(select={"rent_count": "SELECT 0 AS rent_count FROM app_movies LIMIT 1"})
# Now union these - for some reason this doesn't work:
# return querysetOne | querysetTwo
# so instead
set1List = [_getMinimalDict(model) for model in queryset]
# Where getMinimalDict just extracts the values I am interested in.
set2List = [_getMinimalDict(model) for model in querysetTwo]
return sorted(set1List + set2List, key=lambda x: x['rent_count'])
However, while this method seems to work, it is incredibly slow. Is there a better way I am missing?

With straight SQL, this would be much more easily expressed like this:
SELECT movie.id, count(movieRentals.id) as rent_count
FROM movie
LEFT JOIN movieRentals ON (movieRentals.movie_id = movie.id AND dateRented > [TIME_ARG_HERE])
GROUP BY movie.id
The left join will produce a single row for each movie unrented since [TIME_ARG_HERE], but in those rows, the movieRentals.id column will be NULL.
Then, COUNT(movieRentals.id) will count all of the rentals where they exist, and return 0 if there was only the NULL value.
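The behaviour described above can be checked with a minimal sketch using the standard library's sqlite3 module. The schema follows the question's two tables (with the table named movies); the helper name rent_counts_since, the sample rows, and the ORDER BY that lists the least rented movies first are assumptions added for illustration.

```python
import sqlite3


def rent_counts_since(conn, since):
    """Return (movie_id, rent_count) pairs, least rented first."""
    return conn.execute("""
        SELECT movies.id, COUNT(movieRentals.id) AS rent_count
        FROM movies
        LEFT JOIN movieRentals
          ON movieRentals.movie_id = movies.id
         AND movieRentals.dateRented > ?
        GROUP BY movies.id
        ORDER BY rent_count, movies.id
    """, (since,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE movieRentals (
        id INTEGER PRIMARY KEY,
        movie_id INTEGER NOT NULL REFERENCES movies(id),
        customer TEXT,
        dateRented TEXT NOT NULL
    );
    -- Made-up data: movie 2 was only rented before the cutoff.
    INSERT INTO movies (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO movieRentals (movie_id, customer, dateRented) VALUES
        (1, 'x', '2024-05-02 12:00:00'),
        (1, 'y', '2024-05-03 12:00:00'),
        (2, 'x', '2023-01-01 12:00:00'),
        (3, 'z', '2024-05-04 12:00:00');
""")

counts = rent_counts_since(conn, "2024-05-01 00:00:00")
print(counts)
```

Keeping the date condition in the ON clause rather than in WHERE is what preserves the movies with no recent rentals; moving it to WHERE would filter those rows out after the join.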

I must be missing something obvious. Why wouldn't the following work:
queryset = models.MovieRentals.objects.filter(dateRented__gte=timeArg).values('movie').annotate(Count('movie')).aggregate(Min('movie__count'))
Also, clauses can be chained (as shown in the code above), so there is no need to keep assigning the intermediate querysets to a variable.
