Django: Count() in multiple annotate()


I have a Django project with ForumPost, ForumComment, ForumPostLike, and ForumCommentLike models. A post has multiple comments and multiple forumpostlikes.
A comment has multiple forumcommentlikes. Each user can like a given post or comment only once.
What I want: to order posts by calculated = (num_comments*5 + num_postlikes + num_commentlikes).
Questions: With the following code I seem to get what I want, but Count() behaves strangely. I would also like to know whether there is a better way to achieve my goal.
models.py
class ForumPost(models.Model):
    # author, title .. etc.
class ForumComment(models.Model):
    post = models.ForeignKey(ForumPost, on_delete=models.CASCADE, related_name='comments')
    # other fields
class ForumPostLike(models.Model):
    liker = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE, related_name='liked_forumposts')
    post = models.ForeignKey(ForumPost, on_delete=models.CASCADE, related_name='forumpostlikes')
class ForumCommentLike(models.Model):
    liker = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE, related_name='liked_forumcomments')
    comment = models.ForeignKey(ForumComment, on_delete=models.CASCADE, related_name='forumcommentlikes')
views.py
posts = ForumPost.objects.filter(published_date__lte=timezone.now()).\
    prefetch_related('comments', 'forumpostlikes')
posts_top = posts\
    .annotate(num_commentlikes=Count('comments__forumcommentlikes__liker'))\
    .annotate(num_postlikes=Count('forumpostlikes', distinct=True))\
    .annotate(num_comments=Count('comments', distinct=True))\
    .annotate(calculated=(F('num_comments') * Value(5)) + F('num_postlikes') + F('num_commentlikes') * Value(0.5))\
    .order_by('-calculated')
Weird behavior of Count():
annotate(num_commentlikes=Count('comments__forumcommentlikes__liker'))
With this code I wanted to get the total number of likes across all comments on the current post, but the result is twice the correct number. Once I realized this, I multiplied it by Value(0.5) to get the correct number.
If I use distinct as follows:
annotate(num_commentlikes=Count('comments__forumcommentlikes__liker', distinct=True))
this only gives me the number of likes on the first comment.
Why is this happening? And how can I improve my code? Thanks!
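One likely explanation is that every annotate() across a multi-valued relation adds a JOIN to the same query, so the joined rows multiply. The sketch below reproduces both symptoms in plain SQL through Python's sqlite3 module. The table names and sample data are hypothetical stand-ins for the four models, not the SQL Django actually emits.

```python
import sqlite3

# Hypothetical, simplified tables standing in for the four models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (id INTEGER PRIMARY KEY);
CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER);
CREATE TABLE post_like (id INTEGER PRIMARY KEY, post_id INTEGER, liker_id INTEGER);
CREATE TABLE comment_like (id INTEGER PRIMARY KEY, comment_id INTEGER, liker_id INTEGER);

INSERT INTO post VALUES (1);
INSERT INTO comment VALUES (1, 1), (2, 1);
-- Two users like the post.
INSERT INTO post_like VALUES (1, 1, 10), (2, 1, 11);
-- Comment 1 has three likes, comment 2 has two: five comment likes in total.
INSERT INTO comment_like VALUES
    (1, 1, 10), (2, 1, 11), (3, 1, 12),
    (4, 2, 10), (5, 2, 11);
""")

# Joining post likes next to comment likes repeats every comment-like row
# once per post like (5 rows x 2 post likes = 10 joined rows).
naive, distinct_likers, distinct_likes = conn.execute("""
    SELECT COUNT(cl.liker_id),           -- counts joined rows
           COUNT(DISTINCT cl.liker_id),  -- counts distinct users
           COUNT(DISTINCT cl.id)         -- counts distinct like rows
    FROM post p
    LEFT JOIN comment c ON c.post_id = p.id
    LEFT JOIN comment_like cl ON cl.comment_id = c.id
    LEFT JOIN post_like pl ON pl.post_id = p.id
    WHERE p.id = 1
""").fetchone()

print(naive, distinct_likers, distinct_likes)
```

With this data the plain count is inflated by the number of post likes (it only happens to be a factor of 2 here), and the distinct count over liker collapses users who liked more than one comment. If the same mechanism is at work in the ORM query, counting the like rows themselves, for example Count('comments__forumcommentlikes', distinct=True), should give the intended total without the Value(0.5) correction.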

Related

How to get highest rated comment for another model through Django ORM?

This is a pretty straightforward issue, but I don't know why prefetch_related isn't working for me. My relevant models:
class Topic(models.Model):
    name = models.CharField(max_length=100, unique=True)
class Article(models.Model):
    topic = models.ForeignKey(Topic, on_delete=models.CASCADE)
    title_text = models.CharField(max_length=200, unique=True)
class Comment(models.Model):
    num_likes = models.IntegerField(default=0)
    pub_date = models.DateTimeField('date published')
    word = models.ForeignKey(Article, on_delete=models.CASCADE)
I want to return a list of Articles, and for each article the highest rated comment on that article (max num_likes).
I have a QuerySet[Article] called search_results. I keep trying:
search_results.prefetch_related(
    Prefetch('comment_set', queryset=Definition.objects.order_by('-num_likes').first(), to_attr="top_comment")
)
But it doesn't seem to work. When I try to use the attribute, I get an AttributeError:
for article in search_results:
    print(article.top_comment)
generates:
AttributeError: 'Article' object has no attribute 'top_comment'
I've also tried arbitrary querysets, such as Comments.objects.filter('pub_date'), but nothing seems to work.
I should note that if I change 'comment_set' to something else, such as 'comments', I get an error, so comment_set must be a valid attribute of Article.

First of all, you can't pass a model instance to Prefetch in place of a queryset: queryset=Definition.objects.order_by('-num_likes').first() evaluates to a single object (or None), whereas the queryset argument must be a QuerySet. Also note that prefetch_related() returns a new queryset, so its result has to be assigned; otherwise the prefetch is discarded.
One way to show the top comment for each article is something like:
search_results = search_results.prefetch_related('comment_set')  # prefetch for optimization purposes
and then get the top comment in your for loop, as follows:
for article in search_results:
    # Note: order_by() on the related manager runs a fresh query per article instead of using the prefetched rows.
    top_comment = article.comment_set.order_by('-num_likes').first() or "Does not have comments"
    print(top_comment)

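I would suggest querying from the other side to make your life easier. Since the foreign key from Comment to Article is named word in your models, and distinct() with field names is only supported on PostgreSQL, that would look like:
Comment.objects.order_by('word_id', '-num_likes').distinct('word_id').select_related('word')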
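Once the comments are in memory (for example after a prefetch), picking the highest-rated one per article needs no further queries. The following is a minimal plain-Python sketch of that selection step. CommentRow and top_comment_per_article are hypothetical names, and the dataclass only stands in for a Comment model instance.

```python
from dataclasses import dataclass


@dataclass
class CommentRow:
    # Hypothetical stand-in for a Comment model instance.
    article_id: int
    text: str
    num_likes: int


def top_comment_per_article(comments):
    """Return {article_id: comment with the most likes} in a single pass."""
    best = {}
    for comment in comments:
        current = best.get(comment.article_id)
        # Keep the first comment seen, or replace it with a better-rated one.
        if current is None or comment.num_likes > current.num_likes:
            best[comment.article_id] = comment
    return best


rows = [
    CommentRow(1, "first", 3),
    CommentRow(1, "second", 7),
    CommentRow(2, "only", 0),
]
top = top_comment_per_article(rows)
print({article_id: c.text for article_id, c in top.items()})
```

Articles without comments simply do not appear in the result, so the caller decides what to show for them.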

Reducing number of ORM queries in Django web application

I'm trying to improve the performance of one of my Django applications to make it run a bit smoother, as a first iteration on what I currently have running. While profiling, I noticed that a very high number of SQL queries are executed on a couple of pages.
The dashboard page for instance easily has 250+ SQL queries being executed. Further investigation pointed me to the following piece of code in my views.py:
for project in projects:
    for historicaldata in project.historical_data_for_n_months_ago(i):
        for key in ('hours', 'expenses'):
            history_data[key] = history_data[key] + getattr(historicaldata, key)
Relevant function in models.py file:
def historical_data_for_n_months_ago(self, n=1):
    n_year, n_month = n_months_ago(n)
    try:
        return self.historicaldata_set.filter(year=n_year, month=n_month)
    except HistoricalData.DoesNotExist:
        return []
As you can see, this causes a separate query to be executed for each project in the list. Originally it was set up this way to keep functionality central at the model level and to provide convenience functions across the application.
What would be possible ways to reduce the number of queries executed when loading this page? I was thinking of removing the convenience function and just working with select_related() in the view, but it would still need a lot of queries to filter out the records for a given year and month.
Thanks a lot in advance!
Edit As requested, some more info on the related models.
Project
class Project(models.Model):
    name = models.CharField(max_length=200)
    status = models.IntegerField(choices=PROJECT_STATUS_CHOICES, default=1)
    last_updated = models.DateTimeField(default=datetime.datetime.now)
    total_hours = models.DecimalField(default=0, max_digits=10, decimal_places=2)
    total_expenses = models.DecimalField(default=0, max_digits=10, decimal_places=2)
    def __str__(self):
        return "{i.name}".format(i=self)
    def historical_data_for_n_months_ago(self, n=1):
        n_year, n_month = n_months_ago(n)
        try:
            return self.historicaldata_set.filter(year=n_year, month=n_month)
        except HistoricalData.DoesNotExist:
            return []
HistoricalData
class HistoricalData(models.Model):
    project = models.ForeignKey(Project, on_delete=models.CASCADE)
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
    year = models.IntegerField()
    month = models.IntegerField()
    hours = models.DecimalField(max_digits=10, decimal_places=2, default=0)
    expenses = models.DecimalField(max_digits=10, decimal_places=2, default=0)
    def __str__(self):
        return "Historical data {i.month}/{i.year} for {i.person} ({i.project})".format(i=self)

Issuing a query inside a loop over a queryset is rarely a good idea, so it is worth finding another way. If you can elaborate on your view function and what exactly it is supposed to do, maybe I can help further.
If you want all the historical data entries for a project (a reverse relation), you need to use prefetch_related. Since you want only a specific portion of the historical data associated with each project, you need to combine it with Prefetch.
from django.db.models import Prefetch
Project.objects.prefetch_related(
    Prefetch(
        'historicaldata_set',
        queryset=HistoricalData.objects.filter(year=n_year, month=n_month)
    )
)
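After that, loop over project.historicaldata_set.all() for each project, for example in your Django template (if you are using one). Calling .filter() on the related manager again would bypass the prefetched rows and issue a new query per project. You can also pass the queryset to a DRF serializer, and that would also get your work done :)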
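If only the summed hours and expenses are needed, as in the loop from the question, the per-project queries can also be replaced by a single aggregate query. The sketch below shows that idea in plain SQL through Python's sqlite3 module. The table layout, the sample rows and the monthly_totals helper are assumptions made for illustration.

```python
import sqlite3

# Hypothetical table mirroring the HistoricalData fields used in the loop.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE historicaldata (
    id INTEGER PRIMARY KEY,
    project_id INTEGER, year INTEGER, month INTEGER,
    hours REAL, expenses REAL
);
INSERT INTO historicaldata (project_id, year, month, hours, expenses) VALUES
    (1, 2023, 5, 10.5, 100.0),
    (1, 2023, 5, 4.0, 20.0),
    (2, 2023, 5, 8.0, 0.0),
    (2, 2023, 4, 99.0, 99.0);
""")


def monthly_totals(connection, year, month):
    """Sum hours and expenses over all projects for one month in one query."""
    row = connection.execute(
        "SELECT COALESCE(SUM(hours), 0), COALESCE(SUM(expenses), 0) "
        "FROM historicaldata WHERE year = ? AND month = ?",
        (year, month),
    ).fetchone()
    return {"hours": row[0], "expenses": row[1]}


history_data = monthly_totals(conn, 2023, 5)
print(history_data)
```

In the ORM, the rough counterpart would presumably be HistoricalData.objects.filter(project__in=projects, year=n_year, month=n_month).aggregate(total_hours=Sum('hours'), total_expenses=Sum('expenses')), with Sum imported from django.db.models. This has not been checked against the application in the question.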

How can I correct my ORM statement to show all friends not associated with a user in Django?

In my Django application, I've got two models, Users and Friendships. There is a many-to-many relationship between the two, as users can have many friends, and friends can have many other friends who are users.
How can I return all friends (first and last name) who are NOT friends with the user with first_name='Daniel'?
Models.py:
class Friendships(models.Model):
    user = models.ForeignKey('Users', models.DO_NOTHING, related_name="usersfriend")
    friend = models.ForeignKey('Users', models.DO_NOTHING, related_name="friendsfriend")
    created_at = models.DateTimeField(blank=True, null=True)
    updated_at = models.DateTimeField(blank=True, null=True)
    class Meta:
        managed = False
        db_table = 'friendships'
class Users(models.Model):
    first_name = models.CharField(max_length=45, blank=True, null=True)
    last_name = models.CharField(max_length=45, blank=True, null=True)
    created_at = models.DateTimeField(blank=True, null=True)
    updated_at = models.DateTimeField(blank=True, null=True)
    class Meta:
        managed = False
        db_table = 'users'
So far, here's what I've tried in my controller (views.py). Please note that I understand controllers should be skinny, but I'm still learning, so apologies. What I tried in the snippet below (after many failed attempts at a cleaner method) was to first grab Daniel's friends (collecting their ids into a list and removing any duplicates), and then filter them out by id.
# show first and last name of all friends who daniel is not friends with:
def index(req):
    friends_of_daniel = Friendships.objects.filter(user__first_name='Daniel')
    daniels_friends = []
    for friend_of_daniel in friends_of_daniel:
        daniels_friends.append(friend_of_daniel.friend.id)
    daniels_friends = list(set(daniels_friends))
    not_daniels_friends = Friendships.objects.exclude(id__in=daniels_friends)
    context = {
        'not_daniels_friends': not_daniels_friends,
    }
    return render(req, "friendapp/index.html", context)
However, when I try the following in my view (template) file, I still see individuals who are friends of Daniel. Any idea what I'm doing wrong?
<ul>
{% for not_daniel_friend in not_daniels_friends %}
<li>{{ not_daniel_friend.user.first_name }} {{ not_daniel_friend.user.last_name }}</li>
{% endfor %}
</ul>

I guess something like this will do. Then take the resulting rows and read the first and last names from them.
daniels = Users.objects.filter(first_name="Daniel")  # There may be more than one Daniel
users = Friendships.objects.exclude(friend__in=daniels)
Note that because Friendships.friend is a foreign key to Users, you can pass Users instances (i.e. the daniels queryset) to friend__in to exclude those users.

Try this: keep collecting friend_of_daniel.friend.id as before, but exclude those ids from the Users model instead of from Friendships.
Something like this:
def index(req):
    friends_of_daniel = Friendships.objects.filter(user__first_name='Daniel')
    daniels_friends = []
    for friend_of_daniel in friends_of_daniel:
        daniels_friends.append(friend_of_daniel.friend.id)
    daniels_friends = list(set(daniels_friends))
    not_daniels_friends = Users.objects.exclude(id__in=daniels_friends)
    context = {
        'not_daniels_friends': not_daniels_friends,
    }
    return render(req, "friendapp/index.html", context)
Thanks.
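The two-step logic of this view (collect the ids of Daniel's friends, then exclude them from the users rather than from the friendships) can be sketched in plain Python. The dictionaries and tuples below are made-up sample data standing in for the users and friendships tables.

```python
# Hypothetical sample data: user id -> first name.
users = {1: "Daniel", 2: "Ana", 3: "Ben", 4: "Cleo"}
# (friendship_id, user_id, friend_id), mirroring the Friendships columns.
friendships = [(1, 1, 2), (2, 1, 3), (3, 2, 4)]

# Step 1: ids of every user named Daniel, then the ids of their friends.
daniel_ids = {uid for uid, name in users.items() if name == "Daniel"}
daniels_friends = {friend for _, user, friend in friendships if user in daniel_ids}

# Step 2: exclude those ids from the users, not from the friendship rows.
# Like the view above, this does not remove Daniel himself.
not_daniels_friends = sorted(
    name for uid, name in users.items() if uid not in daniels_friends
)
print(not_daniels_friends)
```

Comparing the collected ids against friendship ids instead of user ids, as in the original view, mixes two unrelated id spaces, which would explain why Daniel's friends kept showing up.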

Firstly, as a general comment: a cleaner way of populating a list of ids is Django's .values_list() method (in older versions of Django this was part of the .values() method). It has a "flat" flag that produces the list you want.
So, instead of:
friends_of_daniel = Friendships.objects.filter(user__first_name='Daniel')
daniels_friends = []
for friend_of_daniel in friends_of_daniel:
    daniels_friends.append(friend_of_daniel.friend.id)
daniels_friends = list(set(daniels_friends))
You could do, in one statement:
daniels_friends = Friendships.objects \
    .filter(user__first_name='Daniel') \
    .values_list('friend', flat=True) \
    .distinct()
distinct() does the same job as your list(set(...)) cast: it makes sure the list has no repeated elements. (The distinct('friend') form with a field name is also possible, but only on PostgreSQL.) values_list with flat=True can target any field, including fields of the related users table: .values_list('friend__id', flat=True), or .values_list('friend__first_name', flat=True) to get a list of the first names of Daniel's friends.
Coming back to your general question: you can do the whole query in one statement using your related_names. As I am not sure what you want (a User instance, a Friendship instance, or just a list of first and last names), I will give you several options.
If you want Friendship instances (what you are trying in your sample code):
friendships_not_friends_with_daniel = Friendships.objects\
    .exclude(friend__first_name="Daniel")
This is equivalent to what @Rafael proposes in his answer:
daniels = Users.objects.filter(first_name="Daniel")  # There may be more than one Daniel
users = Friendships.objects.exclude(friend__in=daniels)
Here I am embedding his first query in the exclude by referencing the field in the related table with a double underscore (a very powerful convention in Django).
If you want User instances:
users_with_no_friendship_with_daniel = Users.objects\
    .exclude(usersfriend__friend__first_name="Daniel")
Here you use the related name of your model to go from the users table to the friendships table, and then check whether the friend of this user is called Daniel. This way of querying is a bit hard to understand at first, but once you get used to it, it becomes really powerful because it reads much like spoken language: you want all users, excluding the ones that have a friendship whose friend's first name is Daniel. Depending on how many friends a user has or how many users are called Daniel, you might need to add some distinct() calls or split the query in two.
Note that these queries treat Daniel as the friend side of the row. If Daniel is stored in the user column, as in the code from the question, exclude on friendsfriend__user__first_name="Daniel" instead.
As a piece of advice, maybe you could improve the related names in your model, because they are what you use when you have a user instance and want to get the related friendships: user_instance.friendships instead of user_instance.usersfriend, and user_instance.friendsfriendships instead of user_instance.friendsfriend. I don't know; I always find it difficult to choose good related names.
If you want a list of tuples of users' first and last names:
names_of_users_with_no_friendship_with_daniel = Users.objects\
    .exclude(usersfriend__friend__first_name="Daniel")\
    .values_list('first_name', 'last_name')
I am sorry if something is not clear; please ask and I will try to explain better. (I am quite new to Stack Overflow.)

Get record from master table if at least one record exists in slave/child tables using django api

To elaborate with an example, I have two models, "Subject" and "Question":
class Subject(models.Model):
    title = models.CharField(max_length=200, unique=True)
    is_active = models.BooleanField(default=True)
    def __str__(self):
        return self.title
class Question(models.Model):
    title = models.CharField(max_length=500)
    is_active = models.BooleanField(default=True)
    subject = models.ForeignKey('Subject')
    def __str__(self):
        return self.title
I want the list of active subjects having at least one active question.
I have done an initial search and also checked the Django queryset API, but did not find an answer.
I am not looking for a raw SQL query option.
I hope this makes the question clear. I have tried the Django API but did not get the expected result. I think this is a very common query, and there should be a simple answer to it.
Thanks in advance for any help.

Did you try this?
Subject.objects.filter(question__id__isnull=False).distinct()
You can probably simplify it to the following, though I haven't tried it:
Subject.objects.filter(question__isnull=False).distinct()
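The queries above only check that a question exists; the question also asks that both the subject and the question be active. A minimal sketch of the intended semantics in plain SQL, run through Python's sqlite3 module, follows. The table names and rows are made up for illustration.

```python
import sqlite3

# Hypothetical tables mirroring the Subject and Question models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subject (id INTEGER PRIMARY KEY, title TEXT, is_active INTEGER);
CREATE TABLE question (
    id INTEGER PRIMARY KEY, title TEXT, is_active INTEGER, subject_id INTEGER
);
INSERT INTO subject VALUES
    (1, 'Algebra', 1), (2, 'Biology', 1), (3, 'Chemistry', 0), (4, 'Drama', 1);
INSERT INTO question VALUES
    (1, 'q1', 1, 1), (2, 'q2', 1, 1),  -- Algebra: two active questions
    (3, 'q3', 0, 2),                   -- Biology: only an inactive question
    (4, 'q4', 1, 3);                   -- Chemistry: inactive subject
""")

# The inner join drops subjects without questions (Drama); DISTINCT collapses
# the duplicate rows that Algebra's two questions would otherwise produce.
titles = [row[0] for row in conn.execute("""
    SELECT DISTINCT s.title
    FROM subject s
    INNER JOIN question q ON q.subject_id = s.id
    WHERE s.is_active = 1 AND q.is_active = 1
    ORDER BY s.title
""")]
print(titles)
```

The ORM counterpart would presumably be Subject.objects.filter(is_active=True, question__is_active=True).distinct(), with both conditions in the same filter() call.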

Challenging django queryset construction

I have a Django app where users log in, set various topics, and then leave comments under the said topics. The following models reflect this rudimentary set up:
class Topic(models.Model):
    topic_text = models.TextField()
    submitted_on = models.DateTimeField(auto_now_add=True)
class Comment(models.Model):
    comment_text = models.TextField()
    which_topic = models.ForeignKey(Topic)
    submitted_by = models.ForeignKey(User)
    submitted_on = models.DateTimeField(auto_now_add=True)
For each user, I am trying to get all topics where at least one of the 5 most recent comments was written by that user. In other words, if a user has not commented among a topic's 5 most recent comments, the topic is excluded from the queryset.
So how do I go about forming this queryset? Btw I was going to show you what I've tried, but it's woefully inadequate and obviously wrong. Can someone please help?

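I haven't tested it, but a subquery should work. Something like this (QuerySet has no limit() method, so the subquery is sliced instead):
Topic.objects.filter(
    comment__submitted_by__in=Comment.objects.values(
        'submitted_by'
    ).order_by(
        '-submitted_on'
    )[:5],
    comment__submitted_by=user
)
As written, the subquery looks at the 5 most recent comments overall rather than per topic, so treat it as a starting point; it would need to be correlated with each topic (for example with OuterRef) to match the question exactly.
(Add .prefetch_related('comment_set') if you plan to access the comments.)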
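To pin down what the queryset has to compute, here is a plain-Python sketch of the rule from the question: a topic qualifies only if the user wrote at least one of that topic's 5 most recent comments. The function name and the tuple layout are hypothetical, and integers stand in for the submitted_on timestamps.

```python
from collections import defaultdict


def topics_where_user_is_recent(comments, user, window=5):
    """comments: iterable of (topic_id, submitted_by, submitted_on) tuples."""
    by_topic = defaultdict(list)
    for topic_id, author, submitted_on in comments:
        by_topic[topic_id].append((submitted_on, author))

    matching = []
    for topic_id, entries in by_topic.items():
        entries.sort(reverse=True)  # newest first
        recent_authors = {author for _, author in entries[:window]}
        if user in recent_authors:
            matching.append(topic_id)
    return sorted(matching)


comments = [
    # Topic 1: alice commented first, then bob five times.
    (1, "alice", 1),
    (1, "bob", 2), (1, "bob", 3), (1, "bob", 4), (1, "bob", 5), (1, "bob", 6),
    # Topic 2: only two comments, one by each user.
    (2, "alice", 3), (2, "bob", 4),
]
print(topics_where_user_is_recent(comments, "alice"))
print(topics_where_user_is_recent(comments, "bob"))
```

Any ORM or SQL version can be checked against this reference on a small data set before relying on it.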
