How to generate feed from different models in Django? - python

So, I have two models called Apartment and Job. It's easy to display the contents of both models separately, but what I can't figure out is how to display a mixed feed of both models ordered by date.
jobs = Job.objects.all().order_by('-posted_on')
apartments = Apartment.objects.all().order_by('-date')
The posted date on a job is stored in 'posted_on' and the posted date on an apartment is stored in 'date'. How can I combine both of these and sort them by the date posted? I tried combining the two querysets in a simple way:
new_feed = list(jobs) + list(apartments)
This just creates the list of both of these models, but they are not arranged based on date.

I suggest two ways to achieve that.
With union() (new in Django 1.11):
Uses SQL's UNION operator to combine the results of two or more QuerySets.
You need to make sure that the field you order by has the same name in both models,
for example a date field on Job and also on Apartment:
jobs = Job.objects.all().order_by('-posted_on')
apartments = Apartment.objects.all().order_by('-date')
new_feed = jobs.union(apartments).order_by('-date')
Note that with this option, both models need to use the same field name for ordering.
Or
With chain(), which treats consecutive sequences as a single sequence, and sorted() with a lambda to sort them:
from itertools import chain
# remove the order_by() in each queryset, use it once with sorted
jobs = Job.objects.all()
apartments = Apartment.objects.all()
result_list = sorted(chain(jobs, apartments),
                     key=lambda instance: instance.date,
                     reverse=True)  # newest first
With this option, you don't need to rename or change any of your fields; just add a property. Let's choose the Job model:
class Job(models.Model):
    ''' fields '''
    posted_on = models.DateField(......)
    @property
    def date(self):
        return self.posted_on
So now both of your models have the attribute date, and you can use chain():
result_list = sorted(chain(jobs, apartments),
                     key=lambda instance: instance.date,
                     reverse=True)  # newest first
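If you would rather not touch either model, the key function itself can pick the right attribute. This is only a minimal sketch: the dataclasses below are stand-ins for the real Django models, and feed_date is a hypothetical helper name, not something Django provides.

```python
import datetime as dt
from dataclasses import dataclass
from itertools import chain


@dataclass
class Job:  # stand-in for the Django model
    title: str
    posted_on: dt.date


@dataclass
class Apartment:  # stand-in for the Django model
    address: str
    date: dt.date


def feed_date(instance):
    """Return the posting date regardless of which attribute holds it."""
    if isinstance(instance, Job):
        return instance.posted_on
    return instance.date


jobs = [Job("Developer", dt.date(2019, 5, 1)), Job("Designer", dt.date(2019, 5, 3))]
apartments = [Apartment("12 Main St", dt.date(2019, 5, 2))]

# reverse=True puts the newest entries first, like order_by('-date')
feed = sorted(chain(jobs, apartments), key=feed_date, reverse=True)
print([type(item).__name__ for item in feed])
```

With real querysets you would pass Job.objects.all() and Apartment.objects.all() to chain() in the same way.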

A good way to do that is to use the adapter design pattern. The idea is that we introduce an auxiliary data structure that is used only for sorting these model objects. This method has several benefits over forcing both models to expose an identically named attribute for sorting. The most important is that the change won't affect any other code in your code base.
First, you fetch your objects as you already do, but you don't have to fetch them sorted; you can fetch all of them in arbitrary order. You may also fetch just the top 100 of each in sorted order. Just fetch what fits your requirements here:
jobs = Job.objects.all()
apartments = Apartment.objects.all()
Then, we build an auxiliary list of tuples (attribute used for sorting, object):
auxiliary_list = ([(job.posted_on, job) for job in jobs]
                  + [(apartment.date, apartment) for apartment in apartments])
Now it's time to sort the auxiliary list. By default, Python's sort() method sorts tuples in lexicographical order, which means it compares the first elements of the tuples first, i.e. the posted_on and date attributes. The reverse parameter is set to True to sort in decreasing order, i.e. newest first, as you want in your feed.
auxiliary_list.sort(reverse=True)
Finally, return only the second elements of the sorted tuples:
sorted_feed = [obj for _, obj in auxiliary_list]
Just keep in mind that if you expect your feed to be huge then sorting these elements in memory is not the best way to do this, but I guess this is not your concern here.
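One caveat with plain (date, object) tuples in Python 3: when two entries share the same date, the comparison falls through to the objects themselves, and objects without ordering methods raise a TypeError. A common workaround is to add a tie-breaking counter to each tuple. The sketch below uses plain classes as stand-ins for the models, so treat it as an illustration rather than drop-in code.

```python
import datetime as dt
from itertools import count


class Job:  # stand-in for the Django model; defines no ordering of its own
    def __init__(self, posted_on):
        self.posted_on = posted_on


class Apartment:  # stand-in for the Django model
    def __init__(self, date):
        self.date = date


jobs = [Job(dt.date(2019, 5, 1)), Job(dt.date(2019, 5, 2))]
apartments = [Apartment(dt.date(2019, 5, 2))]

# The counter is a tie-breaker: tuples with equal dates are decided by the
# integer, so Python never has to compare a Job with an Apartment.
counter = count()
auxiliary_list = (
    [(job.posted_on, next(counter), job) for job in jobs]
    + [(apartment.date, next(counter), apartment) for apartment in apartments]
)
auxiliary_list.sort(reverse=True)
sorted_feed = [obj for _, _, obj in auxiliary_list]
```

Here the second job and the apartment share a date, and the sort still succeeds because the integers differ.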

I implemented this in the following way.
I had a Video model and an Article model that had to be curated as a feed. I made another model called Post, and then added a OneToOne key to it from both Video and Article.
# apps.feeds.models.py
from django.utils.functional import cached_property
from model_utils.models import TimeStampedModel
class Post(TimeStampedModel):
    ...
    @cached_property
    def target(self):
        if getattr(self, "video", None) is not None:
            return self.video
        if getattr(self, "article", None) is not None:
            return self.article
        return None
# apps/videos/models.py
class Video(models.Model):
    post = models.OneToOneField(
        "feeds.Post",
        on_delete=models.CASCADE,
    )
    ...
# apps.articles.models.py
class Article(models.Model):
    post = models.OneToOneField(
        "feeds.Post",
        on_delete=models.CASCADE,
    )
    ...
Then for the feed API, I used django-rest-framework to sort on Post queryset's created timestamp. I customized serializer's method and added queryset annotation for customization etc. This way I was able to get either Article's or Video's data as nested dictionary from the related Post instance.
The advantage of this implementation is that you can easily optimize the queries with the annotate, select_related and prefetch_related methods, which work well on the Post queryset.
# apps.feeds.serializers.py
class FeedSerializer(serializers.ModelSerializer):
    type = serializers.SerializerMethodField()
    class Meta:
        model = Post
        fields = ("type",)
    def to_representation(self, instance) -> OrderedDict:
        ret = super().to_representation(instance)
        if isinstance(instance.target, Video):
            ret["data"] = VideoSerializer(
                instance.target, context={"request": self.context["request"]}
            ).data
        else:
            ret["data"] = ArticleSerializer(
                instance.target, context={"request": self.context["request"]}
            ).data
        return ret
    def get_type(self, obj):
        return obj.target._meta.model_name
    @staticmethod
    def setup_eager_loading(qs):
        """
        Inspired by:
        http://ses4j.github.io/2015/11/23/optimizing-slow-django-rest-framework-performance/
        """
        qs = qs.select_related("live", "article")
        # other db optimizations...
        ...
        return qs
# apps.feeds.viewsets.py
class FeedViewSet(viewsets.ModelViewSet):
    queryset = Post.objects.all()
    serializer_class = FeedSerializer
    permission_classes = (IsAuthenticatedOrReadOnly,)
    def get_queryset(self):
        qs = super().get_queryset()
        qs = self.serializer_class().setup_eager_loading(qs)
        return qs
    ...

Related

Aggregation by Foreign Key and other field in Django Admin

I'm working in Django and having an issue displaying something properly in my Admin site. These are the models
class IndexSetSize(models.Model):
    """ A time series of sizes for each index set """
    index_set = models.ForeignKey(IndexSet, on_delete=models.CASCADE)
    byte_size = models.BigIntegerField()
    timestamp = models.DateTimeField()
class IndexSet(models.Model):
    title = models.CharField(max_length=4096)
    # ... some other stuff that isn't really important
    def __str__(self):
        return f"{self.title}"
It is displaying all the appropriate data I need, but I want to display the sum of IndexSetSize, grouped by the index_set key and also grouped by the timestamp (there can be multiple occurrences of an IndexSet for a given timestamp, so I want to add up all the byte_sizes). Currently it is just showing every single record. Additionally, I would prefer the total_size field to be sortable.
Current Admin model looks like:
class IndexSetSizeAdmin(admin.ModelAdmin):
    """ View-only admin for index set sizes """
    fields = ["index_set", "total_size", "byte_size", "timestamp"]
    list_display = ["index_set", "total_size", "timestamp"]
    search_fields = ["index_set"]
    list_filter = ["index_set__title"]
    def total_size(self, obj):
        """ Returns human readable size """
        if obj.total_size:
            return humanize.naturalsize(obj.total_size)
        return "-"
    total_size.admin_order_field = 'total_size'
    def get_queryset(self, request):
        queryset = super().get_queryset(request).select_related()
        queryset = queryset.annotate(
            total_size=Sum('byte_size', filter=Q(index_set__in_graylog=True)))
        return queryset
It seems the proper way to do a group by in Django is to use .values(), although if I use that in get_queryset, an error is thrown saying Cannot call select_related() after .values() or .values_list(). I'm having trouble finding in the documentation if there's a 'correct' way to values/annotate/aggregate that will work correctly with get_queryset. It's a pretty simple sum/group by query I'm trying to do, but I'm not sure what the "Django way" is to accomplish it.
Thanks
I don't think you would be able to return the full queryset and group by index_set in get_queryset, as you can't select all columns but group by an individual column in SQL:
SELECT *, SUM(byte_size) FROM indexsetsize GROUP BY index_set -- doesn't work
You could perform an extra query in the total_size method to get the aggregated value. However, this would perform the query for every row returned and slow your page load down.
def total_size(self, obj):
    """ Returns human readable size """
    return humanize.naturalsize(sum(IndexSetSize.objects.filter(
        index_set=obj.index_set).values_list(
        'byte_size', flat=True)))
total_size.admin_order_field = 'total_size'
It would be better to perform this annotation within the IndexSetAdmin, as the index_set will already be grouped through the reverse foreign key. This means you can perform the annotation in get_queryset. I would also set the related_name on the foreign key on IndexSetSize so you can access the related IndexSetSize objects from IndexSet using that name.
class IndexSetSize(models.Model):
    index_set = models.ForeignKey(IndexSet, on_delete=models.CASCADE, related_name='index_set_sizes')
    ...
class IndexSetAdmin(admin.ModelAdmin):
    ...
    def total_size(self, obj):
        """ Returns human readable size """
        if obj.total_size:
            return humanize.naturalsize(obj.total_size)
        return "-"
    total_size.admin_order_field = 'total_size'  # makes the column sortable
    def get_queryset(self, request):
        queryset = super().get_queryset(request).prefetch_related('index_set_sizes').annotate(
            total_size=Sum('index_set_sizes__byte_size')).order_by('total_size')
        return queryset
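If the per-timestamp grouping from the question is still needed, the underlying SQL is an ordinary GROUP BY on two columns. The sketch below reproduces it with sqlite3; the table name and rows are made up for illustration. In the ORM the rough equivalent would be IndexSetSize.objects.values('index_set', 'timestamp').annotate(total_size=Sum('byte_size')), which returns dictionaries rather than model instances, so it does not plug straight into a ModelAdmin changelist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE index_set_size (
        index_set_id INTEGER, byte_size INTEGER, timestamp TEXT
    );
    INSERT INTO index_set_size VALUES
        (1, 100, '2019-01-01'),
        (1, 250, '2019-01-01'),
        (1, 300, '2019-01-02'),
        (2, 50,  '2019-01-01');
""")

# One output row per (index set, timestamp) pair, with the sizes added up.
rows = conn.execute("""
    SELECT index_set_id, timestamp, SUM(byte_size) AS total_size
    FROM index_set_size
    GROUP BY index_set_id, timestamp
    ORDER BY index_set_id, timestamp
""").fetchall()

for row in rows:
    print(row)
```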

Django rest framework custom filter backend data duplication

I am trying to make my custom filter and ordering backend work with the default search backend in Django REST framework. Filtering and ordering work perfectly with each other, but when search is included in the query and I try to order the query by object name, the data gets duplicated.
I tried printing the queries and their sizes, and they look fine when I log them in the filters, but the response contains a different number of objects (e.g. 79 objects in the filter query, 170 duplicated objects in the final result).
Here is my filterset class
class PhonesFilterSet(rest_filters.FilterSet):
    brands = InListFilter(field_name='brand__id')
    os_ids = InListFilter(field_name='versions__os')
    version_ids = InListFilter(field_name='versions')
    launched_year_gte = rest_filters.NumberFilter(field_name='phone_launched_date__year', lookup_expr='gte')
    ram_gte = rest_filters.NumberFilter(field_name='internal_memories__value', method='get_rams')
    ram_memory_unit = rest_filters.NumberFilter(field_name='internal_memories__units', method='get_ram_units')
    def get_rams(self, queryset, name, value):
        # here is the problem filter
        # that does not work with ordering by name
        q = queryset.filter(Q(internal_memories__memory_type=1) & Q(internal_memories__value__gte=value))
        print('filter_set', len(q))
        print('filter_set_query', q.query)
        return q
    def get_ram_units(self, queryset, name, value):
        return queryset.filter(Q(internal_memories__memory_type=1) & Q(internal_memories__units=value))
    class Meta:
        model = Phone
        fields = ['brands', 'os_ids', 'version_ids', 'status', 'ram_gte']
My ordering class:
class CustomFilterBackend(filters.OrderingFilter):
    allowed_custom_filters = ['ram', 'camera', 'year']
    def get_ordering(self, request, queryset, view):
        params = request.query_params.get(self.ordering_param)
        if params:
            fields = [param.strip() for param in params.split(',')]
            ordering = [f for f in fields if f in self.allowed_custom_filters]
            if ordering:
                return ordering
        # No ordering was included, or all the ordering fields were invalid
        return self.get_default_ordering(view)
    def filter_queryset(self, request, queryset, view):
        ordering = self.get_ordering(request, queryset, view)
        if ordering:
            if 'ram' in ordering:
                max_ram = Max('internal_memories__value', filter=Q(internal_memories__memory_type=1))
                queryset = queryset.annotate(max_ram=max_ram).order_by('-max_ram')
            elif 'camera' in ordering:
                max_camera = Max('camera_pixels__megapixels', filter=Q(camera_pixels__camera_type=0))
                queryset = queryset.annotate(max_camera=max_camera).order_by('-max_camera')
            elif 'year' in ordering:
                queryset = queryset.filter(~Q(phone_released_date=None)).order_by('-phone_released_date__year')
            elif 'name' in ordering:
                # here is the problem ordering
                # that does not work with filters
                # on one-to-many relations
                queryset = queryset.order_by('-brand__name', '-model__name')
        return queryset
Viewset class:
class PhoneViewSet(viewsets.ModelViewSet):
    queryset = Phone.objects.all()
    serializer_class = PhoneSerializer
    filter_backends = (filters.SearchFilter, CustomFilterBackend, django_filters.rest_framework.DjangoFilterBackend)
    search_fields = ('brand__name', 'model__name')
    ordering_fields = ('brand__name', 'model__name')
    filter_class = PhonesFilterSet
As a result, I expect no data duplication when I apply ordering together with filter and search. My question is why the number of objects differs between the filter and the response, and where the data gets duplicated. I have no idea where to start debugging from this point. Thanks in advance.
Using distinct() should fix this:
Returns a new QuerySet that uses SELECT DISTINCT in its SQL query. This eliminates duplicate rows from the query results.
By default, a QuerySet will not eliminate duplicate rows. In practice, this is rarely a problem, because simple queries such as Blog.objects.all() don’t introduce the possibility of duplicate result rows. However, if your query spans multiple tables, it’s possible to get duplicate results when a QuerySet is evaluated. That’s when you’d use distinct().
Note however, that you still might get duplicate results:
Any fields used in an order_by() call are included in the SQL SELECT columns. This can sometimes lead to unexpected results when used in conjunction with distinct(). If you order by fields from a related model, those fields will be added to the selected columns and they may make otherwise duplicate rows appear to be distinct. Since the extra columns don’t appear in the returned results (they are only there to support ordering), it sometimes looks like non-distinct results are being returned.
https://docs.djangoproject.com/en/2.2/ref/models/querysets/#django.db.models.query.QuerySet.distinct
If you are using PostgreSQL, you can specify the names of fields to which the DISTINCT should apply. This might help. (I'm not sure.) For more on this, see the link above.
So, I'd return queryset.distinct() in the methods where you commented that you get issues. I would not apply it always (as I had written in my comment above for debugging) because you don't need it for simple queries.
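To see where the duplicates come from, it can help to reproduce the join outside Django. The sketch below uses sqlite3 with two made-up tables (phone and memory are hypothetical names, not the asker's schema): a phone with two matching memory rows comes back twice from the join, and DISTINCT collapses it to one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phone (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE memory (id INTEGER PRIMARY KEY, phone_id INTEGER, value INTEGER);
    INSERT INTO phone VALUES (1, 'Alpha');
    INSERT INTO memory VALUES (1, 1, 4), (2, 1, 8);
""")

join = """
    SELECT {distinct} phone.id, phone.name
    FROM phone JOIN memory ON memory.phone_id = phone.id
    WHERE memory.value >= 4
"""
plain_rows = conn.execute(join.format(distinct="")).fetchall()
distinct_rows = conn.execute(join.format(distinct="DISTINCT")).fetchall()

print(plain_rows)     # the single phone appears once per matching memory row
print(distinct_rows)  # one row per phone
```

Filtering a queryset across a one-to-many relation produces the same kind of join, which is why queryset.distinct() removes the repeated objects.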

Django-filter with DRF - How to do 'and' when applying multiple values with the same lookup?

This is a slightly simplified example of the filterset I'm using, which I'm using with the DjangoFilterBackend for Django Rest Framework. I'd like to be able to send a request to /api/bookmarks/?title__contains=word1&title__contains=word2 and have results returned that contain both words, but currently it ignores the first parameter and only filters for word2.
Any help would be very appreciated!
class BookmarkFilter(django_filters.FilterSet):
    class Meta:
        model = Bookmark
        fields = {
            'title': ['startswith', 'endswith', 'contains', 'exact', 'istartswith', 'iendswith', 'icontains', 'iexact'],
        }
class BookmarkViewSet(viewsets.ModelViewSet):
    serializer_class = BookmarkSerializer
    permission_classes = (IsAuthenticated,)
    filter_backends = (DjangoFilterBackend,)
    filter_class = BookmarkFilter
    ordering_fields = ('title', 'date', 'modified')
    ordering = '-modified'
    page_size = 10
The main problem is that you need a filter that understands how to operate on multiple values. There are basically two options:
Use MultipleChoiceFilter (not recommended for this instance)
Write a custom filter class
Using MultipleChoiceFilter
class BookmarkFilter(django_filters.FilterSet):
    title__contains = django_filters.MultipleChoiceFilter(
        name='title',
        lookup_expr='contains',
        conjoined=True,  # uses AND instead of OR
        choices=[???],
    )
    class Meta:
        ...
While this retains your desired syntax, the problem is that you have to construct a list of choices. I'm not sure if you can simplify/reduce the possible choices, but off the cuff it seems like you would need to fetch all titles from the database, split the titles into distinct words, then create a set to remove duplicates. This seems like it would be expensive/slow depending on how many records you have.
Custom Filter
Alternatively, you can create a custom filter class - something like the following:
class MultiValueCharFilter(filters.BaseCSVFilter, filters.CharFilter):
    def filter(self, qs, value):
        # value is either a list or an 'empty' value
        values = value or []
        for value in values:
            qs = super(MultiValueCharFilter, self).filter(qs, value)
        return qs
class BookmarkFilter(django_filters.FilterSet):
    title__contains = MultiValueCharFilter(name='title', lookup_expr='contains')
    class Meta:
        ...
Usage (notice that the values are comma-separated):
GET /api/bookmarks/?title__contains=word1,word2
Result:
qs.filter(title__contains='word1').filter(title__contains='word2')
The syntax is changed a bit, but the CSV-based filter doesn't need to construct an unnecessary set of choices.
Note that it isn't really possible to support the ?title__contains=word1&title__contains=word2 syntax as the widget can't render a suitable html input. You would either need to use SelectMultiple (which again, requires choices), or use javascript on the client to add/remove additional text inputs with the same name attribute.
Without going into too much detail, filters and filtersets are just an extension of Django's forms.
A Filter has a form Field, which in turn has a Widget.
A FilterSet is composed of Filters.
A FilterSet generates an inner form based on its filters' fields.
Responsibilities of each filter component:
The widget retrieves the raw value from the data QueryDict.
The field validates the raw value.
The filter constructs the filter() call to the queryset, using the validated value.
In order to apply multiple values for the same filter, you would need a filter, field, and widget that understand how to operate on multiple values.
The custom filter achieves this by mixing in BaseCSVFilter, which in turn mixes in a "comma-separation => list" functionality into the composed field and widget classes.
I'd recommend looking at the source code for the CSV mixins, but in short:
The widget splits the incoming value into a list of values.
The field validates the entire list of values by validating individual values on the 'main' field class (such as CharField or IntegerField). The field also derives the mixed in widget.
The filter simply derives the mixed in field class.
The CSV filter was intended to be used with in and range lookups, which accept a list of values. In this case, contains expects a single value. The filter() method fixes this by iterating over the values and chaining together individual filter calls.
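The split-then-chain behaviour described above can be sketched in plain Python without django-filter. Here a list of titles stands in for the queryset and filter_contains_all is a hypothetical helper, so this only illustrates the logic, not the library's internals.

```python
def filter_contains_all(titles, raw_value):
    """Keep titles that contain every comma-separated word in raw_value."""
    # Mirrors the widget step: split the raw string into a list of values.
    values = [v.strip() for v in raw_value.split(",") if v.strip()]
    result = list(titles)
    # Mirrors the filter step: narrow the result once per value (logical AND).
    for value in values:
        result = [title for title in result if value in title]
    return result


titles = ["django tips", "django rest tips", "flask tips"]
print(filter_contains_all(titles, "django,rest"))
```

Each pass narrows the previous result, which is the same effect as chaining .filter(title__contains=...) calls on a queryset.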
You can create custom list field something like this:
from django.core.exceptions import ValidationError
from django.forms.widgets import SelectMultiple
from django import forms
class ListField(forms.Field):
    widget = SelectMultiple
    def __init__(self, field, *args, **kwargs):
        super(ListField, self).__init__(*args, **kwargs)
        self.field = field
    def validate(self, value):
        super(ListField, self).validate(value)
        for val in value:
            self.field.validate(val)
    def run_validators(self, value):
        for val in value:
            self.field.run_validators(val)
    def to_python(self, value):
        if not value:
            return []
        elif not isinstance(value, (list, tuple)):
            raise ValidationError(self.error_messages['invalid_list'], code='invalid_list')
        return [self.field.to_python(val) for val in value]
and create custom filter using MultipleChoiceFilter:
class ContainsListFilter(django_filters.MultipleChoiceFilter):
    field_class = ListField
    def get_filter_predicate(self, v):
        name = '%s__contains' % self.name
        try:
            return {name: getattr(v, self.field.to_field_name)}
        except (AttributeError, TypeError):
            return {name: v}
After that you can create FilterSet with your custom filter:
from django.forms import CharField
class StorageLocationFilter(django_filters.FilterSet):
    title_contains = ContainsListFilter(field=CharField())
This is working for me. Hope it will be useful for you.
Here is sample code that just works.
It supports product?name=p1,p2,p3 and returns products whose name is p1, p2 or p3:
def resolve_csvfilter(queryset, name, value):
    lookup = {f'{name}__in': value.split(",")}
    queryset = queryset.filter(**lookup)
    return queryset
class ProductFilterSet(FilterSet):
    name = CharFilter(method=resolve_csvfilter)
    class Meta:
        model = Product
        fields = ['name']
Ref: https://django-filter.readthedocs.io/en/master/guide/usage.html#customize-filtering-with-filter-method
https://github.com/carltongibson/django-filter/issues/137

Django Combine a Variable Number of QuerySets

Is there a way to concatenate an unknown number of querysets into a list?
Here are my models:
class Item(models.Model):
    name = models.CharField(max_length=200)
    brand = models.ForeignKey(User, related_name='brand')
    tags = models.ManyToManyField(Tag, blank=True, null=True)
    def __unicode__(self):
        return self.name
    class Meta:
        ordering = ['-id']
class Tag(models.Model):
    name = models.CharField(max_length=64, unique=True)
    def __unicode__(self):
        return self.name
I have two types of queries that I'm working with:
items = Item.objects.filter(brands__in=brands)
items = Item.objects.filter(tags__name='80s').filter(tags__name='comedy')
With regards to the second type of query, users can save searches (for example "80s comedy"), and can save multiple searches at the same time, so I will need to create a query for each search that they have saved.
I originally wanted to try and construct a single query that will handle both cases (see Django Combining AND and OR Queries with ManyToMany Field), but I now think the best way to do this would be to combine all queries into a list.
I like what @akaihola suggests here:
How to combine 2 or more querysets in a Django view? but I can't figure out how to use itertools.chain with a variable number of queries.
Does anyone know the best way to accomplish that?
EDIT: Forgot to mention, what I'm looking for are items that have a certain brand OR have all of the required tags.
Slightly unorthodox, but you could use recursion. So in your example:
def recursive_search(tags, result_queryset):
    if len(tags) > 0:
        # narrow the queryset to items that also carry the next tag
        result_queryset = result_queryset.filter(tags__name=tags[0])
        if result_queryset.exists():
            return recursive_search(tags[1:], result_queryset)
        else:
            return None
    return result_queryset
tags = ["comedy", "80s", "action", "thriller"] # This can be variable
result_queryset = Item.objects.filter(brands__in=brands) # Could be Item.objects.all()
print recursive_search(tags, result_queryset)
So you start off with a list of the tags you are searching for, and a queryset of all of the items that could possibly fit your criteria (in this case we start with the items of a particular brand).
You then recursively go through the list of tags one by one and cut the queryset down. At each level, you re-filter the queryset to only those items which have all the tags seen so far.
So:
the first call/level would be for all the items that have the tag comedy,
the second call/level would be for all the items that have the tags comedy and 80s,
etc.
If the filtered queryset is empty, it means there are no items that have all the required tags, and the function quits and returns None (i.e. it quits at the first possible point of failure). Note that each exists() check runs its own query, so expect roughly one database hit per tag rather than a single one.
I've tested this out and it should work, so give it a shot.
EDIT
To concatenate the queryset returned from the brands (q1) and the queryset created above (q2) using itertools:
import itertools
# avoid naming the result "list", which would shadow the built-in
combined = list(itertools.chain(q1, q2))
EDIT 2
Does this not accomplish what you need in one query?
from django.db.models import Q
list_of_tags = ['comedy', '80s']
# note: tags__name__in matches items that have any of the tags, not all of them
qs = Item.objects.filter(Q(brand__iexact="brand name") | Q(tags__name__in=list_of_tags))
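As for the original question of using itertools.chain with a variable number of querysets: chain.from_iterable (or chain(*querysets)) accepts any number of iterables. A minimal sketch, using lists of strings as stand-ins for querysets; the variable names are made up for illustration.

```python
from itertools import chain

# Lists stand in for querysets; any iterables work the same way.
brand_items = ["item-1", "item-2"]
saved_search_results = [["item-2", "item-3"], ["item-4"]]  # one per saved search

querysets = [brand_items] + saved_search_results

combined = list(chain.from_iterable(querysets))  # same as chain(*querysets)

# dict.fromkeys keeps the first occurrence of each item and preserves order
unique_items = list(dict.fromkeys(combined))
print(unique_items)
```

The de-duplication step relies on the items being hashable, which should hold for saved model instances; drop it if repeats are acceptable.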

How do you order lists in the same way QuerySets are ordered in Django?

I have a model that has an ordering field under its Meta class. When I perform a query and get back a QuerySet for the model it is in the order specified. However if I have instances of this model that are in a list and execute the sort method on the list the order is different from the one I want. Is there a way to sort a list of instances of a model such that the order is equal to that specified in the model definition?
Not automatically, but with a bit of work, yes. You need to define a comparator function (or a __cmp__ method on the model class, which only Python 2 honours) that can compare two model instances according to the relevant attribute. For instance:
class Dated(models.Model):
    ...
    created = models.DateTimeField(default=datetime.now)
    class Meta:
        ordering = ('created',)
    def __cmp__(self, other):
        try:
            return cmp(self.created, other.created)
        except AttributeError:
            return cmp(self.created, other)
The answer to your question is varying degrees of yes, with some manual requirements. If by list you mean a queryset that has been formed by some complicated query, then, sure:
queryset.order_by(*ClassName._meta.ordering)
or
queryset.order_by(*instance._meta.ordering)
or
queryset.order_by("fieldname")  # If you like being manual
If you're not working with a queryset, then of course you can still sort, the same way anyone sorts complex objects in python:
Comparators
Specifying keys
Decorate/Sort/Undecorate
See the python wiki for a detailed explanation of all three.
Building on Carl's answer, you could easily add the ability to use all the ordering fields and even detect the ones that are in reverse order.
class Person(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    birthday = models.DateField()
    class Meta:
        ordering = ['last_name', 'first_name']
    def __cmp__(self, other):
        for order in self._meta.ordering:
            if order.startswith('-'):
                order = order[1:]
                mode = -1
            else:
                mode = 1
            if hasattr(self, order) and hasattr(other, order):
                result = mode * cmp(getattr(self, order), getattr(other, order))
                if result: return result
        return 0
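Python 3 no longer has cmp() or __cmp__, so there the same idea is usually expressed with key functions. One way, sketched below, is to sort once per ordering field from least to most significant and rely on the sort being stable. sort_like_meta is a hypothetical helper and Person is a plain stand-in class; the sketch only handles simple field names with an optional leading '-', not related lookups or expressions.

```python
from operator import attrgetter


def sort_like_meta(instances, ordering):
    """Sort instances by Django-style ordering names such as '-last_name'."""
    result = list(instances)
    # Sort by the least significant field first; list.sort() is stable, so
    # earlier passes survive as tie-breakers for later ones.
    for name in reversed(ordering):
        descending = name.startswith("-")
        result.sort(key=attrgetter(name.lstrip("-")), reverse=descending)
    return result


class Person:  # stand-in for a model instance
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name


people = [Person("Bob", "Smith"), Person("Alice", "Smith"), Person("Carl", "Jones")]
ordered = sort_like_meta(people, ["last_name", "-first_name"])
print([(p.last_name, p.first_name) for p in ordered])
```

With a real model you would pass Person._meta.ordering as the second argument.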
