How to add annotated data in django-rest-framework queryset responses? - python

I am generating aggregates for each item in a QuerySet:
def get_queryset(self):
    from django.db.models import Count
    queryset = Book.objects.annotate(Count('authors'))
    return queryset
But I am not getting the count in the JSON response.
thank you in advance.

The accepted solution hits the database once per returned result: for each result, a separate count query is made.
The question is about adding annotations to the serializer, which is far more efficient than running a count query for each item in the response.
A solution for that:
models.py
class Author(models.Model):
    name = models.CharField(...)
    other_stuff = models...
    ...
class Book(models.Model):
    author = models.ForeignKey(Author)
    title = models.CharField(...)
    publication_year = models...
    ...
serializers.py
class BookSerializer(serializers.ModelSerializer):
    # Filled from the annotation added in the view's queryset
    authors = serializers.IntegerField()
    class Meta:
        model = Book
        fields = ('id', 'title', 'authors')
views.py
from django.db.models import Count
class BookViewSet(viewsets.ModelViewSet):
    queryset = Book.objects.annotate(authors=Count('author'))
    serializer_class = BookSerializer
    ...
This performs the count at the database level, so no extra query is needed to retrieve the author count for each of the returned Book items.
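To make the difference concrete, here is a minimal sketch using only the standard library's sqlite3 module. The table names (book, book_author) and both helper functions are hypothetical, and the SQL only approximates the kind of query the ORM would emit for an annotation; it is not the exact statement Django generates.

```python
import sqlite3

# Hypothetical schema standing in for the Book/Author models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE book_author (book_id INTEGER, author_name TEXT);
    INSERT INTO book VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO book_author VALUES (1, 'x'), (1, 'y'), (2, 'z');
""")


def counts_per_row(conn):
    """N+1 pattern: one query for the books, then one COUNT query per book."""
    queries = 1
    result = {}
    books = conn.execute("SELECT id, title FROM book ORDER BY id").fetchall()
    for book_id, title in books:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM book_author WHERE book_id = ?", (book_id,)
        ).fetchone()
        queries += 1
        result[title] = n
    return result, queries


def counts_annotated(conn):
    """Annotation pattern: a single query using LEFT JOIN and GROUP BY."""
    rows = conn.execute("""
        SELECT b.title, COUNT(ba.book_id)
        FROM book b LEFT JOIN book_author ba ON ba.book_id = b.id
        GROUP BY b.id, b.title
        ORDER BY b.id
    """).fetchall()
    return dict(rows), 1
```

Both functions return the same counts; the first issues one query per book on top of the initial one, while the second issues a single query regardless of how many books there are.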

The queryset returned from get_queryset provides the list of things that will go through the serializer, which controls how the objects will be represented. Try adding an additional field in your Book serializer, like:
author_count = serializers.IntegerField(
    source='author_set.count',
    read_only=True
)
Edit: As others have stated, this is not the most efficient way to add counts for cases where many results are returned, as it will hit the database for each instance. See the answer by @José for a more efficient solution.

Fiver's solution will hit the db for every instance in the queryset, so if you have a large queryset, it will create a lot of queries.
I would override the to_representation method of your Book serializer so that it reuses the result from the annotation. It will look something like:
class BookSerializer(serializers.ModelSerializer):
    def to_representation(self, instance):
        return {'id': instance.pk, 'num_authors': instance.authors__count}
    class Meta:
        model = Book

So, if you make an annotation like
Model.objects.annotate(
    some_new_col=Case(
        When(some_field=some_value, then=Value(something)),
        # etc...
        default=Value(something_default),
        output_field=SomeTypeOfField(),
    )
).filter()  # etc.
and the interpreter throws an error saying that something is not a model field for the related serializer, there is a workaround. It's not nice, but if you add a method named some_new_col, the value from the query above is recognized.
The following will do just fine.
def some_new_col(self):
    pass

Related

How to show only a few Many-to-many relations in DRF?

If for an example I have 2 models and a simple View:
class Recipe(models.Model):
    created_at = models.DateField(auto_now_add=True)
class RecipeBook(models.Model):
    recipes = models.ManyToManyField(Recipe)
    ...
class RecipeBookList(ListAPIView):
    queryset = RecipeBook.objects.all()
    serializer_class = RecipeBookSerializer
    ...
class RecipeBookSerializer(serializers.ModelSerializer):
    recipes = RecipeSerializer(many=True, read_only=True)
    class Meta:
        model = RecipeBook
        fields = "__all__"
What would be the best way, when listing all recipe books with a simple GET request, to show only the first 5 recipes created for each and not all of them?
QuerySet way:
You can pass a custom Prefetch object to prefetch_related in your queryset to limit the prefetched related objects (Prefetch with a sliced queryset is supported from Django 4.2 onward):
queryset.prefetch_related(Prefetch('recipes', queryset=Recipe.objects.all()[:5]))
Docs: https://docs.djangoproject.com/en/3.2/ref/models/querysets/#prefetch-objects
Serializer way:
You can use a SerializerMethodField to return a custom, limited queryset:
class RecipeBookSerializer(serializers.ModelSerializer):
    recipes = serializers.SerializerMethodField()
    class Meta:
        model = RecipeBook
        fields = "__all__"
    def get_recipes(self, obj):
        # Serialize only the first five related recipes
        return RecipeSerializer(obj.recipes.all()[:5], many=True).data
Then use prefetch_related("recipes") to minimize related queries.
Source: django REST framework - limited queryset for nested ModelSerializer?
The problem with the serializer way is that either a related query for recipes is performed per recipe book object or all related recipes are pre-fetched from the beginning.
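The trade-off described above (fetch every related row once, then trim in Python) can be sketched without Django. The function name first_n_per_book and the (book_id, recipe_id, created_at) row shape are assumptions made for illustration; they stand in for the rows a single prefetch query over the many-to-many table might return.

```python
from collections import defaultdict


def first_n_per_book(links, n=5):
    """Keep the n earliest recipes for each book.

    links: iterable of (book_id, recipe_id, created_at) tuples, assumed to
    come from one query covering all recipe books being listed.
    Returns {book_id: [recipe_id, ...]} ordered by created_at.
    """
    grouped = defaultdict(list)
    # Sort by book, then by creation date, so the earliest recipes come first.
    for book_id, recipe_id, _created_at in sorted(links, key=lambda r: (r[0], r[2])):
        if len(grouped[book_id]) < n:
            grouped[book_id].append(recipe_id)
    return dict(grouped)
```

Only one query is needed, but every related row is still transferred from the database before the per-book limit is applied, which is the cost the answer warns about.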

Annotate each query with a function of itself

I need to annotate each query in queryset using a model method:
class myModel(models.Model):
    ...
    def myModel_foo(self):
        ....
        return myModel_foo
I need something like .annotate(myModel_foo=myModel.myModel_foo()). The problem is that myModel_foo() requires self. I tried iterating over the queryset and calling query.annotate(myModel_foo=myModel.myModel_foo(self)) on each object, but I got an "object has no attribute 'annotate'" error. What's the correct approach?
UPDATE
OK, the idea is this: I have two models in a many-to-one relation.
class myModel1(models.Model):
    fk = ForeignKey(myModel2)
    status = ChoiceField
class myModel2(models.Model):
    def get_status(self):
        # query all objects from myModel1, get status of the last one and
        # return it
        return get_status
Then I want to send a queryset with the status included in it via an ajax call.
qs = myModel2.objects.all()
qs_json = serializers.serialize('json', qs)
return HttpResponse(qs_json, content_type='application/json')
How can I access it in the template, for example as qs_json[0].get_status?
Just add a property to the model and don't use annotate at all, like:
class MyModel(models.Model):
    @property
    def foo(self):
        ....
        return self.bar()
But be careful about the N+1 queries problem if bar makes extra queries. You can describe your particular case and I will help you with this.
UPDATE:
It is still unclear whether there are extra queries or not. If not, you can still use the property.
You just need the right serializer, like this:
from rest_framework import serializers
class MyModelSerializer(serializers.ModelSerializer):
    foo = serializers.CharField(read_only=True)
    class Meta:
        model = MyModel
        fields = [
            # other fields
            'foo'
        ]
# in the view:
return Response(MyModelSerializer(MyModel.objects.all(), many=True).data)

Parameterized property in Django

For example, I have two models:
class Page(models.Model):
    # Some fields
    ...
    @property
    def title(self):
        return PageTranslation.objects.get(page=self, language=language).title  # I cannot pass a parameter (language) to the property
class PageTranslation(models.Model):
    page = models.ForeignKey(Page)
    title = models.CharField()
And some DRF view, whose get_queryset method looks like this:
def get_queryset(self):
    return Page.objects.all()
And serializer:
class PageSerializer(serializers.ModelSerializer):
    class Meta:
        model = Page
        fields = (..., 'title',) # title = property
I want to return QuerySet with Page model instances, and use title property in serializer, but I can not pass language (that is set somewhere in the request — headers, query param, etc) there.
What is the correct way to do this?
from django.utils.translation import get_language
from django.conf import settings
class Page(models.Model):
    @property
    def title(self):
        language = get_language() or settings.LANGUAGE_CODE
        return PageTranslation.objects.get(page=self, language=language).title
get_language() gives you the currently active language. If i18n is disabled it gives you None, and for that case we have the settings.LANGUAGE_CODE fallback.
For the serializer part, I think you are supposed to declare explicitly that your property is a field; ModelSerializer only finds the actual database fields for you, nothing else.
class PageSerializer(serializers.ModelSerializer):
    # ReadOnlyField is the DRF 3 field for plain attribute access
    title = serializers.ReadOnlyField()
    class Meta:
        model = Page
        fields = (..., 'title',)
Register django.middleware.locale.LocaleMiddleware in your project settings. This makes the LANGUAGE_CODE attribute available on the request.
In the DRF view, where the request context is available, filter the QuerySet by the language.
def get_queryset(self):
    return Page.objects.filter(language=self.request.LANGUAGE_CODE)
The computed title property declared in the Page model is then unnecessary, and it is also inefficient.
This is why:
One query is executed per record in the original QuerySet (N in total) to serialise a value for the title field. An N+1 problem is very likely to occur when another model has a many-to-many relation with Page.
Also, serialised results can be inconsistent because the title value can be null in cases where the record doesn't exist for the language.

Optimizing database queries in Django REST framework

I have the following models:
class User(models.Model):
    name = models.CharField()
    email = models.EmailField()
class Friendship(models.Model):
    from_friend = models.ForeignKey(User)
    to_friend = models.ForeignKey(User)
And those models are used in the following view and serializer:
class GetAllUsers(generics.ListAPIView):
    authentication_classes = (SessionAuthentication, TokenAuthentication)
    permission_classes = (permissions.IsAuthenticated,)
    serializer_class = GetAllUsersSerializer
    model = User
    def get_queryset(self):
        return User.objects.all()
class GetAllUsersSerializer(serializers.ModelSerializer):
    is_friend_already = serializers.SerializerMethodField('get_is_friend_already')
    class Meta:
        model = User
        fields = ('id', 'name', 'email', 'is_friend_already',)
    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)
        # Runs one query per serialized user
        if request.user != obj and Friendship.objects.filter(from_friend=request.user):
            return True
        else:
            return False
So basically, for each user returned by the GetAllUsers view, I want to print out whether the user is a friend of the requester (actually I should check both from_ and to_friend, but that does not matter for the question at hand).
What I see is that for N users in database, there is 1 query for getting all N users, and then 1xN queries in the serializer's get_is_friend_already
Is there a way to avoid this in the rest-framework way? Maybe something like passing a select_related included query to the serializer that has the relevant Friendship rows?
Django REST Framework cannot automatically optimize queries for you, in the same way that Django itself won't. There are places you can look at for tips, including the Django documentation. It has been suggested that Django REST Framework should do so automatically, though there are some challenges associated with that.
This question is very specific to your case, where you are using a custom SerializerMethodField that makes a query for each object that is returned. Because you are making a new query (using the Friendship.objects manager), it is very difficult to optimize.
You can make the problem better, though, by not creating a new queryset and instead getting the friend count from other places. This will require a backwards relation to be created on the Friendship model, most likely through the related_name parameter on the field, so you can prefetch all of the Friendship objects. But this is only useful if you need the full objects, and not just a count of the objects.
This would result in a view and serializer similar to the following:
class Friendship(models.Model):
    from_friend = models.ForeignKey(User, related_name="friends")
    to_friend = models.ForeignKey(User)
class GetAllUsers(generics.ListAPIView):
    ...
    def get_queryset(self):
        return User.objects.all().prefetch_related("friends")
class GetAllUsersSerializer(serializers.ModelSerializer):
    ...
    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)
        # .all() reuses the prefetched Friendship rows instead of querying again;
        # obj.friends holds the rows where obj is from_friend, so compare to_friend_id
        friends = set(friend.to_friend_id for friend in obj.friends.all())
        if request.user != obj and request.user.id in friends:
            return True
        else:
            return False
If you just need a count of the objects (similar to using queryset.count() or queryset.exists()), you can annotate the rows in the queryset with the counts of reverse relationships. This is done in your get_queryset method by appending .annotate(friends_count=Count("friends")) (if the related_name is friends), which sets the friends_count attribute on each object to the number of friends.
This would result in a view and serializer similar to the following:
class Friendship(models.Model):
    from_friend = models.ForeignKey(User, related_name="friends")
    to_friend = models.ForeignKey(User)
class GetAllUsers(generics.ListAPIView):
    ...
    def get_queryset(self):
        from django.db.models import Count
        return User.objects.all().annotate(friends_count=Count("friends"))
class GetAllUsersSerializer(serializers.ModelSerializer):
    ...
    def get_is_friend_already(self, obj):
        request = self.context.get('request', None)
        if request.user != obj and obj.friends_count > 0:
            return True
        else:
            return False
Both of these solutions will avoid N+1 queries, but the one you pick depends on what you are trying to achieve.
The N+1 problem described here is the number one issue in Django REST Framework performance optimization, and by various accounts it calls for a more solid approach than calling prefetch_related() or select_related() directly in the view's get_queryset() method.
Based on collected information, here's a robust solution that eliminates N+1 (using OP's code as an example). It's based on decorators and slightly less coupled for larger applications.
Serializer:
class GetAllUsersSerializer(serializers.ModelSerializer):
    friends = FriendSerializer(read_only=True, many=True)
    # ...
    @staticmethod
    def setup_eager_loading(queryset):
        queryset = queryset.prefetch_related("friends")
        return queryset
Here we use a static method to build the specific queryset.
Decorator:
def setup_eager_loading(get_queryset):
    def decorator(self):
        queryset = get_queryset(self)
        queryset = self.get_serializer_class().setup_eager_loading(queryset)
        return queryset
    return decorator
This function modifies returned queryset in order to fetch related records for a model as defined in setup_eager_loading serializer method.
View:
class GetAllUsers(generics.ListAPIView):
    serializer_class = GetAllUsersSerializer
    @setup_eager_loading
    def get_queryset(self):
        return User.objects.all()
This pattern may look like overkill, but it's certainly more DRY and has an advantage over direct queryset modification inside views, as it allows more control over related entities and eliminates unnecessary nesting of related objects.
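The wiring of this pattern can be checked in isolation. The sketch below is runnable without Django: FakeQuerySet, UserSerializer and UserListView are hypothetical stand-ins for the real QuerySet, serializer and view, and only mimic the two methods the decorator relies on.

```python
import functools


class FakeQuerySet:
    """Hypothetical stand-in for a Django QuerySet; records prefetch calls."""

    def __init__(self, prefetched=()):
        self.prefetched = tuple(prefetched)

    def prefetch_related(self, *names):
        # Like the real method, return a new object instead of mutating self.
        return FakeQuerySet(self.prefetched + names)


def setup_eager_loading(get_queryset):
    """Decorator: let the serializer class add its eager-loading hints."""

    @functools.wraps(get_queryset)
    def wrapper(self):
        queryset = get_queryset(self)
        return self.get_serializer_class().setup_eager_loading(queryset)

    return wrapper


class UserSerializer:
    @staticmethod
    def setup_eager_loading(queryset):
        return queryset.prefetch_related("friends")


class UserListView:
    serializer_class = UserSerializer

    def get_serializer_class(self):
        return self.serializer_class

    @setup_eager_loading
    def get_queryset(self):
        return FakeQuerySet()
```

Calling UserListView().get_queryset() returns a queryset on which the serializer's prefetch hint has already been applied, so the view itself never needs to know which relations the serializer touches.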
Use this metaclass: DRF optimize ModelViewSet MetaClass
from django.utils import six
@six.add_metaclass(OptimizeRelatedModelViewSetMetaclass)
class MyModelViewSet(viewsets.ModelViewSet):
    queryset = MyModel.objects.all()
    serializer_class = MyModelSerializer
You can split the work into three steps.
First, get only the Users list (without the is_friend_already field). This requires only one query.
Second, get the friends list of request.user.
Third, modify the results depending on whether each user is in request.user's friend list.
class GetAllUsersSerializer(serializers.ModelSerializer):
    ...
class UserListView(ListView):
    def get(self, request):
        friends = request.user.friends
        data = []
        for user in self.get_queryset():
            user_data = GetAllUsersSerializer(user).data
            if user in friends:
                user_data['is_friend_already'] = True
            else:
                user_data['is_friend_already'] = False
            data.append(user_data)
        return Response(status=200, data=data)
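The third step is plain set membership once the first two queries have run. A minimal sketch, assuming the serialized users arrive as dictionaries with an "id" key and the requester's friend ids as any iterable; flag_friends is a hypothetical helper, not part of DRF.

```python
def flag_friends(users, friend_ids):
    """Add an is_friend_already flag to each serialized user.

    users: list of dicts, each with an "id" key (result of the first query).
    friend_ids: ids of the requester's friends (result of the second query).
    """
    friend_ids = set(friend_ids)  # O(1) membership checks
    # Build new dicts so the input data is left untouched.
    return [dict(user, is_friend_already=user["id"] in friend_ids) for user in users]
```

Turning the friend ids into a set first keeps the whole pass linear in the number of users, and no further database access happens inside the loop.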

Django QuerySet: select_related-like functionality over many-to-many given all but one unique_together parameters?

I realize that select_related only works on foreign key and one-to-one relationships, but it seems there should be a simple, select_related-like way to join over many-to-many relations that are unique together provided all but one of the unique_together parameters is given.
class User(models.Model):
    article_access_set = models.ManyToManyField(Article,
        through='UserArticleAccess', related_name='user_access_set')
    # User Information ...
class Article(models.Model):
    # Article Information ...
class UserArticleAccess(models.Model):
    user = models.ForeignKey(User)
    article = models.ForeignKey(Article)
    # UserArticleAccess Information: flags, liked, last_access_time, ...
    class Meta:
        unique_together = ('user', 'article')
I'm looking for a magical method:
qs = Article.objects.all().magical_select_related(select={
    'user_access_set': {'user': request.user}})
print qs[0].user_access_set
# <UserArticleAccess ...>
print qs[1].user_access_set # No Access
# None
Or maybe:
qs = Article.objects.all().magical_select_related(select = {
    'user_access_set': {'user': request.user}},
    as = {'user_access_set': 'user_access'})
print qs[0].user_access
# <UserArticleAccess ...>
print qs[1].user_access # No Access
# None
Is there any way to do this? (Or a reason that this shouldn't be implemented in this way or a similar way?)
With Django 1.10 you can use django.db.models.fields.related.ForeignObject class (it's not a public API).
See the conversation about this issue in https://groups.google.com/forum/#!topic/django-users/pGGHKb4Y8ZY
You can see examples in tests/foreign_object/tests.py of the following commit:
https://github.com/django/django/commit/80dac8c33e7f6f22577e4346f44e4c5ee89b648c
Please look at this post: Django : select_related with ManyToManyField.
prefetch_related() can join the data, but it does so in Python.
