I have three Django models:
class Review(models.Model):
    rating = models.FloatField()
    reviewer = models.ForeignKey('Reviewer')
    movie = models.ForeignKey('Movie')

class Movie(models.Model):
    release_date = models.DateTimeField(auto_now_add=True)

class Reviewer(models.Model):
    ...
I would like to write a query that returns the following for each reviewer:
The reviewer's id
Their average rating for the 5 most recently released movies
Their average rating for the 10 most recently released movies
The release date for the most recent movie they rated a 3 (out of 5) or lower
The result would be formatted like this:
<QuerySet [{'id': 1, 'average_5': 4.7, 'average_10': 4.3, 'most_recent_bad_review': '2018-07-27'}, ...]>
I'm familiar with using .annotate(Avg(...)), but I can't figure out how to write a query that averages just a subset of the potential values. Similarly, I'm lost on how to annotate a query with the most recent rating of 3 or lower.
All of these boil down to if statements in Python code, or CASE WHEN expressions in your database (assuming it is SQL-like), so you can use Django's built-in Case and When expressions. In your case you would probably combine them with Avg, and you need a separate annotation for every conditional value, so your queryset would look roughly like:
Model.objects.annotate(
    average_5=Avg(Case(When(then=...), When(then=...))),
    average_10=Avg(Case(When(then=...), When(then=...))),
)
with appropriate conditions inside when and appropriate then values.
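To make the idea concrete, here is a minimal sketch of the SQL shape that a conditional aggregate takes, run against an in-memory SQLite database. The table names, sample rows and the `reviewer_stats` helper are all invented for illustration; in Django the same pattern would be written with `Avg(Case(When(...)))`, and newer Django versions also accept a `filter=Q(...)` argument on aggregates.

```python
import sqlite3

# Hypothetical tables and sample data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, release_date TEXT);
    CREATE TABLE review (reviewer_id INTEGER, movie_id INTEGER, rating REAL);
    INSERT INTO movie VALUES (1, '2018-07-01'), (2, '2018-07-10'), (3, '2018-07-27');
    INSERT INTO review VALUES (1, 1, 2.0), (1, 2, 4.0), (1, 3, 5.0),
                              (2, 1, 3.0), (2, 3, 1.0);
""")


def reviewer_stats(conn, n_recent):
    """Per reviewer: average rating over the n_recent newest movies, plus the
    latest release date among movies the reviewer rated 3 or lower."""
    query = """
        SELECT r.reviewer_id,
               -- CASE without ELSE yields NULL, which AVG ignores.
               AVG(CASE WHEN r.movie_id IN (
                       SELECT id FROM movie ORDER BY release_date DESC LIMIT ?
                   ) THEN r.rating END) AS average_recent,
               MAX(CASE WHEN r.rating <= 3 THEN m.release_date END)
                   AS most_recent_bad_review
        FROM review r
        JOIN movie m ON m.id = r.movie_id
        GROUP BY r.reviewer_id
        ORDER BY r.reviewer_id
    """
    return conn.execute(query, (n_recent,)).fetchall()


stats = reviewer_stats(conn, 2)
print(stats)
```

Calling the helper once with 5 and once with 10 would give the two averages from the question; in a single ORM query each of them becomes its own annotation.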
Related
I have been trying to tackle this problem all week but I just can't seem to find the solution.
Basically I want to group on 2 values (user and assignment), then take the last element based on date and get a sum of these scores. Below a description of the problem.
With Postgres this would be easily solved by using the .distinct("value") but unfortunately I do not use Postgres.
Any help would be much appreciated!!
UserAnswer
- user
- assignment
- date
- answer
- score
So I want to group on all user / assignment combinations. Then I want to get the score of each last element in that group. So basically:
user_1, assignment_1, 2019, score 1
user_1, assignment_1, 2020, score 2 <- Take this one
user_2, assignment_1, 2020, score 1
user_2, assignment_1, 2021, score 2 <- Take this one
My best attempt is using annotation but then I do not have the score value anymore:
(UserAnswer.objects.filter(user=student, assignment__in=assignments)
    .values("user", "assignment")
    .annotate(latest_date=Max('date')))
In the end, I had to use a raw query rather than Django's ORM.
subquery2 = UserAnswer.objects.raw("""
    SELECT id, user_id, assignment_id, score, MAX(date) AS latest_date
    FROM soforms_useranswer
    GROUP BY user_id, assignment_id
""")
# The raw queryset above is very similar to the queryset you get from a
# Django ORM query. The difference is that 'id' and 'score' are now among
# the selected fields, so we can retrieve them later, as below.
sum2 = 0
for obj in subquery2:
    print(obj.score)
    sum2 += obj.score
print('sum2 is')
print(sum2)
Here, I assumed that both user and assignment are foreign keys, something like below:
from django.conf import settings
from django.db import models
from django.utils import timezone

class Assignment(models.Model):
    name = models.CharField(max_length=50)

class UserAnswer(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE, related_name='answers')
    assignment = models.ForeignKey(Assignment, on_delete=models.CASCADE)
    #assignment = models.CharField(max_length=200)
    score = models.IntegerField()
    date = models.DateTimeField(default=timezone.now)
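One caveat on the raw query: selecting `score` next to `MAX(date)` without grouping by it relies on the database returning the score from the row that holds the maximum date. SQLite documents that behaviour, but as far as I know other databases either reject the query or may return an arbitrary row's score. A more portable shape is a correlated subquery, sketched below with the standard-library `sqlite3` module. The `useranswer` table and its rows are invented to match the example in the question; in the ORM the same idea can be expressed with `Subquery` and `OuterRef`.

```python
import sqlite3

# Hypothetical table mirroring the UserAnswer example from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE useranswer (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        assignment_id INTEGER,
        date TEXT,
        score INTEGER
    );
    INSERT INTO useranswer (user_id, assignment_id, date, score) VALUES
        (1, 1, '2019-01-01', 1),
        (1, 1, '2020-01-01', 2),
        (2, 1, '2020-01-01', 1),
        (2, 1, '2021-01-01', 2);
""")

# Keep only the rows whose date equals the latest date of their
# (user, assignment) group.
latest_rows = conn.execute("""
    SELECT ua.user_id, ua.assignment_id, ua.date, ua.score
    FROM useranswer ua
    WHERE ua.date = (
        SELECT MAX(inner_ua.date)
        FROM useranswer inner_ua
        WHERE inner_ua.user_id = ua.user_id
          AND inner_ua.assignment_id = ua.assignment_id
    )
    ORDER BY ua.user_id, ua.assignment_id
""").fetchall()

total_score = sum(row[3] for row in latest_rows)
print(latest_rows, total_score)
```

Note that if two answers in the same group share the exact same latest date, both rows are kept, so a tie-breaker (for example the highest id) would be needed in that case.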
I have a model which I want to get both the most recent values out of, meaning the values in the most recently added item, and an aggregated value over a period of time. I can get the answers in separate QuerySets and then unite them in Python but I feel like there should be a better ORM approach to this. Anybody know how it can be done?
Simplified example:
class Rating(models.Model):
    movie = models.ForeignKey(Movie, related_name="movieRatings")
    rating = models.IntegerField(blank=True, null=True)
    timestamp = models.DateTimeField(auto_now_add=True)
I wish to get the avg rating in the past month and the most recent rating per movie.
Current approach:
recent_rating = Rating.objects.order_by('movie_id','-timestamp').distinct('movie')
monthly_ratings = Rating.objects.filter(timestamp__gte=datetime.datetime.now() - datetime.timedelta(days=30)).values('movie').annotate(month_rating=Avg('rating'))
And then I need to somehow join them on the movie id.
Thank you!
Try this solution based on Subquery expressions:
import datetime

from django.db.models import OuterRef, Subquery, Avg, DecimalField

# Average rating per movie over the last 30 days, correlated with the outer row.
month_rating_subquery = Rating.objects.filter(
    movie=OuterRef('movie'),
    timestamp__gte=datetime.datetime.now() - datetime.timedelta(days=30)
).values('movie').annotate(monthly_avg=Avg('rating'))

# Most recent rating per movie (distinct() on a field requires PostgreSQL).
result = Rating.objects.order_by('movie', '-timestamp').distinct('movie').values(
    'movie', 'rating'
).annotate(
    monthly_rating=Subquery(month_rating_subquery.values('monthly_avg'), output_field=DecimalField())
)
I suggest you add a property method (monthly_rating) to your Rating model using the @property decorator instead of calculating it in your views.py:
@property
def monthly_rating(self):
    return 'calculate your avg rating here'
Prologue:
This is a question arising often in SO:
Django Models Group By
Django equivalent for count and group by
How to query as GROUP BY in django?
How to use the ORM for the equivalent of a SQL count, group and join query?
I have composed an example on SO Documentation but since the Documentation will get shut down on August 8, 2017, I will follow the suggestion of this widely upvoted and discussed meta answer and transform my example to a self-answered post.
Of course, I would be more than happy to see any different approach as well!!
Question:
Assume the model:
class Books(models.Model):
    title = models.CharField()
    author = models.CharField()
    price = models.FloatField()
How can I perform the following queries on that model utilizing Django ORM:
GROUP BY ... COUNT:
SELECT author, COUNT(author) AS count
FROM myapp_books GROUP BY author
GROUP BY ... SUM:
SELECT author, SUM (price) AS total_price
FROM myapp_books GROUP BY author
We can perform the equivalent of a GROUP BY ... COUNT or a GROUP BY ... SUM SQL query with the Django ORM by using annotate(), values(), the Count and Sum aggregates from django.db.models respectively, and optionally the order_by() method:
GROUP BY ... COUNT:
from django.db.models import Count
result = (Books.objects.values('author')
          .order_by('author')
          .annotate(count=Count('author')))
Now result contains a queryset of dictionaries, each with two keys: author and count:
author | count
------------|-------
OneAuthor | 5
OtherAuthor | 2
... | ...
GROUP BY ... SUM:
from django.db.models import Sum
result = (Books.objects.values('author')
          .order_by('author')
          .annotate(total_price=Sum('price')))
Now result contains a queryset of dictionaries, each with two keys: author and total_price:
author | total_price
------------|-------------
OneAuthor | 100.35
OtherAuthor | 50.00
... | ...
UPDATE 13/04/2021
As @dgw points out in the comments, if the model uses a Meta option to order rows (e.g. ordering), the order_by() clause is paramount for the success of the aggregation!
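To illustrate why a default ordering can break the grouping, here is a small sketch using the standard-library `sqlite3` module. It assumes a hypothetical `Meta.ordering = ['title']` on the model; as far as I know, older Django releases added such ordering fields to the GROUP BY clause unless `order_by()` replaced them, which turns one group per author into one group per row. The table contents and the `count_by` helper are invented for the demonstration.

```python
import sqlite3

# Hypothetical data for the Books model from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myapp_books (title TEXT, author TEXT, price REAL);
    INSERT INTO myapp_books VALUES
        ('A', 'OneAuthor', 10.0),
        ('B', 'OneAuthor', 15.5),
        ('C', 'OtherAuthor', 20.0);
""")


def count_by(conn, group_columns):
    """Count books per group; group_columns is a trusted, hard-coded string."""
    query = (
        "SELECT author, COUNT(author) AS count FROM myapp_books "
        f"GROUP BY {group_columns} ORDER BY author"
    )
    return [{"author": a, "count": c} for a, c in conn.execute(query)]


# Grouping on author alone gives one row per author.
intended = count_by(conn, "author")
# An extra ordering column in the GROUP BY splits every author into
# one group per title.
polluted = count_by(conn, "author, title")
print(intended)
print(polluted)
```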
With Sum() inside aggregate() you can get two totals in a single dict, like this:
inv_data_tot_paid = Invoice.objects.aggregate(
    total=Sum('amount', filter=Q(status=True, month=m, created_at__year=y)),
    paid=Sum('amount', filter=Q(status=True, month=m, created_at__year=y, paid=1)),
)
print(inv_data_tot_paid)
# output: {'total': 103456, 'paid': None}
Do not try more than two query filters; otherwise, you will get an error like
............Models............
class Product(models.Model):
    user = models.ForeignKey(User)
    name = models.CharField(max_length=140)
    description = tinymce_models.HTMLField()

class Purchase(models.Model):
    user = models.ForeignKey(User, blank=True, null=True)
    product = models.ForeignKey(Product)
    sale_date = models.DateTimeField(auto_now_add=True)
    price = models.DecimalField(max_digits=6, decimal_places=2)
I am looking to get an output that says the sum of the purchase prices for a product in the past month and the past week, but want to do this for multiple products.
The output would look something like this, so that I could loop through it in my templates:
product1 name-- product1 description -- sum of product1 weekly sales -- sum of product1 monthly sales
product2 name-- product2 description -- sum of product2 weekly sales -- sum of product2 monthly sales
Should I use raw SQL? What would that query look like? Should I try to use SQLAlchemy, or can I do this in the Django ORM?
I suggest doing it in an SQL statement, because SQL is better suited to manipulating, summing and ordering data. Your code then only has to retrieve the data from the RDBMS and display it.
I don't know which RDBMS you use, so I give you an example in SQL Server; the syntax is around 95% the same in other RDBMSs.
Your query should look something like this:
SELECT
    prod.product_id
    ,prod.description
    ,ISNULL(weekSales.priceSum,0) as weekSales
    ,ISNULL(monthSales.priceSum,0) as monthSales
FROM
    Product prod
    left join (
        SELECT
            product_id
            ,sum(price) as priceSum
        FROM Purchase
        WHERE DATEPART(wk,sale_date) = DATEPART(wk,GETDATE())
        GROUP BY product_id) weekSales
    on prod.product_id = weekSales.product_id
    left join (
        SELECT
            product_id
            ,sum(price) as priceSum
        FROM Purchase
        WHERE
            month(sale_date) = MONTH(GETDATE())
            AND year(sale_date) = year(GETDATE())
        GROUP BY product_id) monthSales
    on prod.product_id = monthSales.product_id
Here you can see a demo of this query with SQLFiddle
In SQL Server:
GETDATE() returns the current date
DATEPART(wk,[date]) returns the week part of a date (WEEK([date]) function in MySQL)
MONTH([date]) returns the month part of a date
YEAR([date]) returns the year part of a date
ISNULL([field],[value_if_null]) replaces a null value (IFNULL([field],[value_if_null]) in MySQL)
You should be able to find similar functions in every RDBMS.
If you need some help with SQL, leave a comment and specify your RDBMS ;)
Hope it helps.
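As an alternative to the two joined subqueries, both sums can be computed in one pass with conditional aggregation. The sketch below is not the query from the answer above: it uses the standard-library `sqlite3` module, invented `product` and `purchase` tables, a hypothetical `sales_summary` helper, and it assumes "past week" and "past month" mean the last 7 and 30 days rather than the current calendar week and month. `COALESCE` plays the role of `ISNULL`. In the Django ORM the same pattern corresponds to conditional aggregates built with `Case`/`When`.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical schema and rows, loosely following the Product/Purchase models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
    CREATE TABLE purchase (product_id INTEGER, sale_date TEXT, price REAL);
    INSERT INTO product VALUES
        (1, 'Widget', 'A widget'), (2, 'Gadget', 'A gadget'), (3, 'Gizmo', 'A gizmo');
    INSERT INTO purchase VALUES
        (1, '2024-03-14', 10.0), (1, '2024-03-01', 5.0),
        (1, '2024-01-01', 7.0), (2, '2024-03-10', 3.0);
""")


def sales_summary(conn, today):
    """Return (name, description, week_sales, month_sales) per product."""
    # ISO-formatted date strings compare correctly as plain text.
    week_start = (today - timedelta(days=7)).isoformat()
    month_start = (today - timedelta(days=30)).isoformat()
    query = """
        SELECT p.name, p.description,
               COALESCE(SUM(CASE WHEN pu.sale_date >= ? THEN pu.price END), 0),
               COALESCE(SUM(CASE WHEN pu.sale_date >= ? THEN pu.price END), 0)
        FROM product p
        LEFT JOIN purchase pu ON pu.product_id = p.id
        GROUP BY p.id, p.name, p.description
        ORDER BY p.id
    """
    return conn.execute(query, (week_start, month_start)).fetchall()


summary = sales_summary(conn, date(2024, 3, 15))
for row in summary:
    print(row)
```

The LEFT JOIN keeps products without any purchase in the result, with both sums reported as 0.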
I currently have the following models, where there is a Product class, which has many Ratings. Each Rating has a date_created DateTime field, and a stars field, which is an integer from 1 to 10. Is there a way I can add up the total number of stars given to all products on a certain day, for all days?
For instance, on December 21st, 543 stars were given to all Products in total (ie. 200 on Item A, 10 on Item B, 233 on Item C). On the next day, there might be 0 stars, because there were no ratings for any Products.
I can imagine first getting a list of dates, and then filtering on each date, and aggregating each one, but this seems very intensive. Is there an easier way?
You should be able to do it all in one query, using values:
from datetime import date, timedelta
from django.db.models import Sum

end_date = date.today()
start_date = end_date - timedelta(days=7)
qs = Rating.objects.filter(date_created__gt=start_date, date_created__lt=end_date)
qs = qs.values('date_created').annotate(total=Sum('stars'))
print(qs)
Should output something like:
[{'date_created': '1-21-2013', 'total': 150}, ... ]
The SQL for it looks like this (WHERE clause omitted):
SELECT "myapp_ratings"."date_created", SUM("myapp_ratings"."stars") AS "total" FROM "myapp_ratings" GROUP BY "myapp_ratings"."date_created"
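One thing to watch: because date_created is a DateTime field, grouping on the raw column produces one group per distinct timestamp rather than one per day. The sketch below shows the day-level grouping in plain SQL with the standard-library `sqlite3` module, and fills in days without ratings afterwards. The sample rows and the `fill_missing_days` helper are invented for illustration; in Django, truncating to the day is what `TruncDate` from `django.db.models.functions` is for.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical ratings; the three rows on December 21st add up to 543 stars.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myapp_ratings (date_created TEXT, stars INTEGER);
    INSERT INTO myapp_ratings VALUES
        ('2012-12-21 09:15:00', 200),
        ('2012-12-21 17:40:00', 10),
        ('2012-12-21 23:59:59', 333),
        ('2012-12-23 08:00:00', 7);
""")

# Truncate each timestamp to its calendar day before grouping.
daily_totals = conn.execute("""
    SELECT date(date_created) AS day, SUM(stars) AS total
    FROM myapp_ratings
    GROUP BY day
    ORDER BY day
""").fetchall()


def fill_missing_days(totals, start, end):
    """Return one (day, total) pair per day from start to end, using 0
    for days that have no ratings at all."""
    by_day = dict(totals)
    filled = []
    current = start
    while current <= end:
        key = current.isoformat()
        filled.append((key, by_day.get(key, 0)))
        current += timedelta(days=1)
    return filled


complete = fill_missing_days(daily_totals, date(2012, 12, 21), date(2012, 12, 23))
print(complete)
```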
You'll want to use Django's aggregation functions; specifically, Sum.
>>> from django.db.models import Sum
>>>
>>> date = '2012-12-21'
>>> Rating.objects.filter(date_created=date).aggregate(Sum('stars'))
{'stars__sum': 543}
As a side note, your scenario actually doesn't need to use any submodels at all. Since the date_created field and the stars field are both members of the Rating model, you can just do a query over it directly.
You could always just perform some raw SQL:
from django.db import connection

cursor = connection.cursor()
cursor.execute('SELECT date_created, SUM(stars) FROM yourapp_rating GROUP BY date_created')
result = cursor.fetchall()  # looks like [('date1', 'sum1'), ('date2', 'sum2'), etc]