Django get most recent value AND aggregate values - python

I have a model which I want to get both the most recent values out of, meaning the values in the most recently added item, and an aggregated value over a period of time. I can get the answers in separate QuerySets and then unite them in Python but I feel like there should be a better ORM approach to this. Anybody know how it can be done?
Simplified example:
class Rating(models.Model):
    movie = models.ForeignKey(Movie, related_name="movieRatings")
    rating = models.IntegerField(blank=True, null=True)
    timestamp = models.DateTimeField(auto_now_add=True)
I wish to get the avg rating in the past month and the most recent rating per movie.
Current approach:
recent_rating = Rating.objects.order_by('movie_id','-timestamp').distinct('movie')
monthly_ratings = Rating.objects.filter(timestamp__gte=datetime.datetime.now() - datetime.timedelta(days=30)).values('movie').annotate(month_rating=Avg('rating'))
And then I need to somehow join them on the movie id.
Thank you!

Try this solution based on Subquery expressions:
import datetime
from django.db.models import Avg, DecimalField, OuterRef, Subquery
# Average rating per movie over the last 30 days, correlated with the outer row.
month_rating_subquery = Rating.objects.filter(
    movie=OuterRef('movie'),
    timestamp__gte=datetime.datetime.now() - datetime.timedelta(days=30)
).values('movie').annotate(monthly_avg=Avg('rating'))
# Most recent rating per movie (distinct() on a field requires PostgreSQL).
result = Rating.objects.order_by('movie', '-timestamp').distinct('movie').values(
    'movie', 'rating'
).annotate(
    monthly_rating=Subquery(month_rating_subquery.values('monthly_avg'), output_field=DecimalField())
)
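To check what the combined query should return, here is a minimal pure-Python sketch of the same logic. It is not ORM code: the function name latest_and_monthly_avg and the (movie_id, rating, timestamp) tuple layout are assumptions made for illustration only.

```python
import datetime
from collections import defaultdict


def latest_and_monthly_avg(rows, now, days=30):
    """rows: iterable of (movie_id, rating, timestamp) tuples (hypothetical layout)."""
    cutoff = now - datetime.timedelta(days=days)
    latest = {}                 # movie_id -> (timestamp, rating) of the newest row
    recent = defaultdict(list)  # movie_id -> non-null ratings inside the window
    for movie_id, rating, ts in rows:
        if movie_id not in latest or ts > latest[movie_id][0]:
            latest[movie_id] = (ts, rating)
        if ts >= cutoff and rating is not None:
            recent[movie_id].append(rating)
    return {
        movie_id: {
            "rating": rating,
            # None when the movie has no rating inside the window
            "monthly_rating": (sum(recent[movie_id]) / len(recent[movie_id])
                               if recent[movie_id] else None),
        }
        for movie_id, (_ts, rating) in latest.items()
    }


now = datetime.datetime(2024, 1, 31)
rows = [
    (1, 4, datetime.datetime(2024, 1, 30)),
    (1, 2, datetime.datetime(2024, 1, 10)),
    (1, 5, datetime.datetime(2023, 11, 1)),  # older than 30 days
    (2, 3, datetime.datetime(2023, 10, 5)),  # movie with only an old rating
]
summary = latest_and_monthly_avg(rows, now)
```

A movie without any rating in the window still appears with its latest rating and a monthly value of None, which is the behaviour to expect from a subquery annotation that finds no matching rows.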

I suggest adding a property (monthly_rating) to your Rating model with the @property decorator instead of calculating it in your views.py:
@property
def monthly_rating(self):
    return 'calculate your avg rating here'

Related

How can I change this SQL to Django ORM code?

select *
from sample
join process
on sample.processid = process.id
where (processid) in (
select max(processid) as processid
from main_sample
group by serialnumber
)
ORDER BY sample.create_at desc;
models.py
class Sample(models.Model):
    processid = models.IntegerField(default=0)
    serialnumber = models.CharField(max_length=256)
    create_at = models.DateTimeField(null=True)
class Process(models.Model):
    sample = models.ForeignKey(Sample, blank=False, null=True, on_delete=models.SET_NULL)
Hi, I have two models and need to convert this SQL query to Django ORM code in Python.
I need to retrieve the latest Sample (by processid) for each unique serial number.
for example,
=> after RUN query
How can I convert this SQL query, and the subquery in particular, to ORM code?
Thanks for reading.
EDIT: To also order by a column that is not one of the distinct or retrieved columns, you can fall back on subqueries. To filter by a single row from a subquery, use the syntax described in the docs:
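from django.db.models import OuterRef, Subquery
# For each row, the highest processid that shares its serialnumber.
subquery = Subquery(
    Sample.objects.filter(
        serialnumber=OuterRef('serialnumber')
    ).order_by(
        '-processid'
    ).values(
        'processid'
    )[:1]
)
results = Sample.objects.filter(
    processid=subquery
).order_by(
    '-create_at'  # newest first, matching ORDER BY create_at DESC in the SQL
)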
When using PostgreSQL, you can pass fields to distinct() to get a single result per value of a given column. It returns the first row for each value, so combined with ordering it does what you need:
Sample.objects.order_by('serialnumber', '-processid').distinct('serialnumber')
If you don't use PostgreSQL, use a values() query on the column that should be unique, then annotate the queryset with the aggregate that picks the value within each group, Max in this case:
from django.db.models import Max
Sample.objects.order_by(
    'serialnumber'
).values(
    'serialnumber'
).annotate(
    max_processid=Max('processid')
)
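To have something to compare the ORM results against, the SQL from the question can be run on an in-memory SQLite database. This is only a sketch: the table is named sample throughout, the rows are made up, and create_at is stored as ISO text for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample ("
    "id INTEGER PRIMARY KEY, processid INTEGER, serialnumber TEXT, create_at TEXT)"
)
conn.executemany(
    "INSERT INTO sample (processid, serialnumber, create_at) VALUES (?, ?, ?)",
    [
        (1, "A", "2021-01-01"),
        (3, "A", "2021-01-05"),  # latest process for serial A
        (2, "B", "2021-01-09"),
        (4, "C", "2021-01-02"),
    ],
)
# One row per serial number: the one carrying the highest processid.
latest_rows = conn.execute(
    """
    SELECT serialnumber, processid, create_at
    FROM sample
    WHERE processid IN (
        SELECT MAX(processid) FROM sample GROUP BY serialnumber
    )
    ORDER BY create_at DESC
    """
).fetchall()
```

The made-up processid values are unique across serial numbers. If they were not, the plain IN filter could also match a non-latest row of another serial number, which is one reason to prefer a subquery correlated on serialnumber.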
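I think this is what you need.
If you want multiple related objects:
# QuerySet has no group_by() method; the default reverse accessor for Process.sample is process_set.
samples = Sample.objects.prefetch_related('process_set')
If you want related objects for only one object:
# select_related() does not follow a reverse ForeignKey, so prefetch_related() is used here too.
samples = Sample.objects.filter(id=1).prefetch_related('process_set')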

Django: Average of Subqueries as Multiple Annotations on a Query Result

I have three Django models:
class Review(models.Model):
    rating = models.FloatField()
    reviewer = models.ForeignKey('Reviewer')
    movie = models.ForeignKey('Movie')
class Movie(models.Model):
    release_date = models.DateTimeField(auto_now_add=True)
class Reviewer(models.Model):
    ...
I would like to write a query that returns the following for each reviewer:
The reviewer's id
Their average rating for the 5 most recently released movies
Their average rating for the 10 most recently released movies
The release date for the most recent movie they rated a 3 (out of 5) or lower
The result would be formatted:
<Queryset [{'id': 1, 'average_5': 4.7, 'average_10': 4.3, 'most_recent_bad_review': '2018-07-27'}, ...]>
I'm familiar with using .annotate(Avg(...)), but I can't figure out how to write a query that averages just a subset of the potential values. Similarly, I'm lost on how to annotate a query for the most recent <3 rating.
All of those are basically if statements in Python code, or CASE WHEN expressions in your database, assuming it is SQL-like. So you can use Django's built-in Case and When expressions. In your case you'd probably combine them with Avg, and you'd need a separate annotation for each condition, so your queryset would look roughly like
Model.objects.annotate(
    average_5=Avg(Case(When(then=...), When(then=...))),
    average_10=Avg(Case(When(then=...), When(then=...))),
)
with appropriate conditions inside each When and appropriate then values.
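At the SQL level the same idea is conditional aggregation. Here is a minimal sketch on in-memory SQLite; the release_rank column (1 = most recently released movie) is a hypothetical precomputed value, and windows of 2 and 3 movies stand in for 5 and 10 to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE review (reviewer_id INTEGER, rating REAL, release_rank INTEGER)"
)
conn.executemany(
    "INSERT INTO review VALUES (?, ?, ?)",
    [
        (1, 5.0, 1),
        (1, 3.0, 2),
        (1, 1.0, 3),
        (2, 4.0, 1),
        (2, 2.0, 3),
    ],
)
# CASE without ELSE yields NULL, and AVG skips NULLs, so each average
# only covers the rows that satisfy its own condition.
averages = conn.execute(
    """
    SELECT reviewer_id,
           AVG(CASE WHEN release_rank <= 2 THEN rating END) AS average_2,
           AVG(CASE WHEN release_rank <= 3 THEN rating END) AS average_3
    FROM review
    GROUP BY reviewer_id
    ORDER BY reviewer_id
    """
).fetchall()
```

Each extra average is one more AVG(CASE ...) column, which corresponds to one more Avg(Case(When(...))) annotation on the ORM side.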

How to aggregate the average of a calculation based on two columns?

I want to write a Django query to give me the average across all rows in my table. My model looks like
class StatByDow(models.Model):
    total_score = models.DecimalField(default=0, max_digits=12, decimal_places=2)
    num_articles = models.IntegerField(default=0)
    day_of_week = models.IntegerField(
        null=True,
        validators=[
            MaxValueValidator(6),
            MinValueValidator(0)
        ]
    )
and I attempt to calculate the average like this
everything_avg = StatByDow.objects.all().aggregate(Avg(Func(F('total_score') / F('num_articles'))))
but this results in the error
  File "/Users/davea/Documents/workspace/mainsite_project/venv/lib/python3.7/site-packages/django/db/models/query.py", line 362, in aggregate
    raise TypeError("Complex aggregates require an alias")
TypeError: Complex aggregates require an alias
What's the right way to calculate the average?
You don't need Func for the division, but you need to reconcile the two different field types. Use an ExpressionWrapper around Avg:
from django.db.models import Avg, DecimalField, ExpressionWrapper, F
everything_avg = StatByDow.objects.aggregate(
    avg=ExpressionWrapper(
        Avg(F('total_score') / F('num_articles')),
        output_field=DecimalField()
    )
)
You could also use a Cast from integer to decimal (though not with PostgreSQL, which objects to Django's syntax ::numeric(NONE, NONE)), or an ExpressionWrapper around the division, but a single ExpressionWrapper around the final aggregate is the quickest solution because the conversion happens only once.
You need to pass an alias name to aggregate(), as the error text says. The query should look something like this:
everything_avg = StatByDow.objects.all().aggregate(avg_f=Avg(Func(F('total_score') / F('num_articles'))))
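To make clear what this aggregate computes, here is a small pure-Python sketch with made-up rows. It averages the per-row ratios, which is not the same as dividing the summed scores by the summed article counts, and it skips rows where num_articles is 0 (the model's default), since those cannot be divided.

```python
from decimal import Decimal

# Made-up rows mirroring the two columns used in the query.
rows = [
    {"total_score": Decimal("10.00"), "num_articles": 4},
    {"total_score": Decimal("9.00"), "num_articles": 2},
    {"total_score": Decimal("0.00"), "num_articles": 0},  # would divide by zero
]

# Average of per-row ratios: what averaging total_score / num_articles means.
ratios = [r["total_score"] / r["num_articles"] for r in rows if r["num_articles"]]
average_of_ratios = sum(ratios) / len(ratios)

# A different quantity: overall score per article across all rows.
ratio_of_totals = (
    sum(r["total_score"] for r in rows) / sum(r["num_articles"] for r in rows)
)
```

On the ORM side, filtering with num_articles__gt=0 before aggregating would be one way to leave out the zero rows.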

Django calculate Avg for a computed field

I have a model as below.
class Transaction(models.Model):
    time1 = models.DateTimeField(null=True)
    time2 = models.DateTimeField(null=True)
    @property
    def time_diff(self):
        return (self.time2 - self.time1).total_seconds()
I need to get the average of the (time2 - time1) in seconds for a list of records, if both values are not null.
time_avg = transactions.aggregate(total=Avg('time_diff',field='time_diff'))
This gives an error saying 'time_diff' is not a valid field. I want to keep this column as a derived property. Not a stored column.
You first need to compute the difference and then take the average. You can use Django's F expressions for this:
from django.db.models import Avg, F
result = Transaction.objects.filter(
    time1__isnull=False, time2__isnull=False  # add your own filters here
).annotate(
    time_diff=F('time2') - F('time1')
).aggregate(Avg('time_diff'))
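For comparison, this is a pure-Python sketch of the value the question asks for: the mean of (time2 - time1) in seconds over the records where both timestamps are set. The function name average_diff_seconds and the list of (time1, time2) pairs are assumptions for illustration.

```python
from datetime import datetime


def average_diff_seconds(pairs):
    """Mean of (time2 - time1) in seconds, ignoring pairs with a missing value."""
    diffs = [
        (t2 - t1).total_seconds()
        for t1, t2 in pairs
        if t1 is not None and t2 is not None
    ]
    return sum(diffs) / len(diffs) if diffs else None


pairs = [
    (datetime(2024, 1, 1, 12, 0, 0), datetime(2024, 1, 1, 12, 0, 30)),  # 30 s
    (datetime(2024, 1, 1, 12, 0, 0), datetime(2024, 1, 1, 12, 1, 30)),  # 90 s
    (datetime(2024, 1, 1, 12, 0, 0), None),                             # skipped
]
avg_seconds = average_diff_seconds(pairs)
```

This is fine for checking results on a handful of rows; for large tables the database-side aggregate avoids loading every record into Python.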
Have you tried this?
time_avg = transactions.aggregate(Avg('time_diff')).values()

GROUP BY in Django Queries

Dear StackOverFlow community:
I need your help in executing following SQL query.
select DATE(creation_date), COUNT(creation_date) from blog_article WHERE creation_date BETWEEN SYSDATE() - INTERVAL 30 DAY AND SYSDATE() GROUP BY DATE(creation_date) AND author="scott_tiger";
Here is my Django Model
class Article(models.Model):
    title = models.CharField(...)
    author = models.CharField(...)
    creation_date = models.DateField(...)
How can I write the aforementioned query in Django using the aggregate() and annotate() functions? I came up with something like this:
now = datetime.datetime.now()
date_diff = datetime.datetime.now() + datetime.timedelta(-30)
records = Article.objects.values('creation_date', Count('creation_date')).aggregate(Count('creation_date')).filter(author='scott_tiger', created_at__gt=date_diff, created_at__lte=now)
When I run this query it gives me following error -
'Count' object has no attribute 'split'
Any idea how to use it?
Delete Count('creation_date') from values and add annotate(Count('creation_date')) after filter.
Try
records = Article.objects.filter(
    author='scott_tiger', creation_date__gt=date_diff, creation_date__lte=now
).values('creation_date').annotate(
    ccd=Count('creation_date')
).values('creation_date', 'ccd')
You need to use creation_date__count or a custom name (ccd here) to refer to the count column after annotate(). Note that aggregate() returns a plain dict rather than a queryset, so it cannot be chained with values().
Also, values() before annotate() limits the GROUP BY columns, and the last values() declares the columns to be selected. There is no need to group by COUNT, which is already computed over each group of rows.
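As a sanity check for the grouping, here is a pure-Python sketch that counts one author's articles per creation date over the last 30 days. The helper name articles_per_day and the dict-based article records are assumptions for illustration, not part of the model above.

```python
import datetime
from collections import Counter


def articles_per_day(articles, author, today, days=30):
    """Count an author's articles per creation_date within the last `days` days."""
    start = today - datetime.timedelta(days=days)
    return Counter(
        a["creation_date"]
        for a in articles
        if a["author"] == author and start < a["creation_date"] <= today
    )


today = datetime.date(2024, 3, 31)
articles = [
    {"author": "scott_tiger", "creation_date": datetime.date(2024, 3, 30)},
    {"author": "scott_tiger", "creation_date": datetime.date(2024, 3, 30)},
    {"author": "scott_tiger", "creation_date": datetime.date(2024, 3, 15)},
    {"author": "someone_else", "creation_date": datetime.date(2024, 3, 30)},
    {"author": "scott_tiger", "creation_date": datetime.date(2024, 2, 1)},  # too old
]
counts = articles_per_day(articles, "scott_tiger", today)
```

Each key of counts corresponds to one GROUP BY bucket, and each value to the COUNT for that date.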
