Django Combine a Variable Number of QuerySets - python

Is there a way to concatenate an unknown number of querysets into a list?
Here are my models:
class Item(models.Model):
    name = models.CharField(max_length=200)
    brand = models.ForeignKey(User, related_name='brand')
    tags = models.ManyToManyField(Tag, blank=True, null=True)
    def __unicode__(self):
        return self.name
    class Meta:
        ordering = ['-id']
class Tag(models.Model):
    name = models.CharField(max_length=64, unique=True)
    def __unicode__(self):
        return self.name
I have two types of queries that I'm working with:
items = Item.objects.filter(brand__in=brands)
items = Item.objects.filter(tags__name='80s').filter(tags__name='comedy')
With regards to the second type of query, users can save searches (for example "80s comedy"), and can save multiple searches at the same time, so I will need to create a query for each search that they have saved.
I originally wanted to try and construct a single query that will handle both cases (see Django Combining AND and OR Queries with ManyToMany Field ), but I now think the best way to do this would be to combine all queries into a list.
I like what @akaihola suggests here:
How to combine 2 or more querysets in a Django view? but I can't figure out how to use itertools.chain with a variable number of queries.
Does anyone know the best way to accomplish that?
EDIT: Forgot to mention, what I'm looking for are items that have a certain brand OR have all of the required tags.

Slightly unorthodox, but you could use recursion. So in your example:
def recursive_search(tags, result_queryset):
    # Narrow the queryset one tag at a time; stop early if nothing is left.
    if len(tags) > 0:
        result_queryset = result_queryset.filter(tags__name=tags[0])
        if result_queryset.exists():
            return recursive_search(tags[1:], result_queryset)
        else:
            return None
    return result_queryset
tags = ["comedy", "80s", "action", "thriller"]  # This can be variable
result_queryset = Item.objects.filter(brand__in=brands)  # Could be Item.objects.all()
print(recursive_search(tags, result_queryset))
So you start off with a list of the tags you are searching for, and a queryset of ALL of your items that could possibly fit your criteria (in this case we start with the items of a particular brand).
You then recursively go through the list of tags one by one and cut the queryset down. At each level, you re-filter the queryset to only those items which have all the tags seen so far.
So:
the first call/level would be for all the items that have the tag comedy,
the second call/level would be for all the items that have the tags comedy and 80s,
etc.
If the filtered queryset is empty, it means there are no items that have all the required tags, and the function quits and returns None (i.e. it quits at the first possible instance of failure). Note that each exists() check runs its own query, so expect roughly one query per tag plus one when the final queryset is evaluated.
I've tested this out and it should work, so give it a shot
EDIT
To concatenate the queryset returned from the brands (q1) and the queryset created above (q2) using itertools:
import itertools
combined = list(itertools.chain(q1, q2))
EDIT 2
Does this not accomplish what you need in one query?
from django.db.models import Q
list_of_tags = ['comedy', '80s']
# Note: tags__name__in matches items carrying ANY of the listed tags
qs = Item.objects.filter(Q(brand__in=brands) | Q(tags__name__in=list_of_tags)).distinct()
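On the original question of using itertools.chain with a variable number of querysets, here is a minimal sketch. It assumes the querysets have already been collected in a list, and that the items are hashable (saved Django model instances are). Plain lists stand in for querysets so the snippet runs without a database; combine_querysets and the sample data are hypothetical names, not part of Django.

```python
from itertools import chain


def combine_querysets(querysets):
    """Concatenate any number of iterables into one list, dropping repeats.

    With Django, `querysets` would be a list of QuerySet objects. An item
    that appears in more than one of them is kept once, in first-seen order.
    """
    seen = set()
    combined = []
    # chain.from_iterable takes a single iterable of iterables, so the
    # number of querysets does not need to be known in advance.
    for item in chain.from_iterable(querysets):
        if item not in seen:
            seen.add(item)
            combined.append(item)
    return combined


# Stand-ins for the brand queryset and two saved-search querysets.
brand_items = ["item-1", "item-2"]
saved_search_a = ["item-2", "item-3"]
saved_search_b = ["item-4"]

result = combine_querysets([brand_items, saved_search_a, saved_search_b])
print(result)
```

If duplicates do not matter, list(chain(*querysets)) would do the same job in one line.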

Related

How to fetch related model in django_tables2 to avoid a lot of queries?

I might be missing something simple here. And I simply lack the knowledge or some how-to.
I have three models: Site, SiteField and, most importantly, SiteFieldValue.
My idea is to create a django table (for Site) that shows the values from SiteFieldValue in a row for a specific site, each under its own header. The problem is that each site can have 50 or more of them. That, times the number of columns specified by def render_ functions, times the number of sites, adds up to a lot of queries, and I want to avoid that.
My question is - is it possible to, for example, prefetch all the values for each site (SiteFieldValue.objects.filter(site=record).first() somewhere in the SiteListTable class), put them into an array and then use them in the def render_ functions by simply checking the value assigned to a key (id of the field).
Models:
class Site(models.Model):
    name = models.CharField(max_length=100)
class SiteField(models.Model):
    name = models.CharField(max_length=100)
    description = models.CharField(max_length=500, null=True, blank=True)
    def __str__(self):
        return self.name
class SiteFieldValue(models.Model):
    site = models.ForeignKey(Site, on_delete=models.CASCADE)
    field = models.ForeignKey(SiteField, on_delete=models.CASCADE)
    value = models.CharField(max_length=500)
Table view
class SiteListTable(tables.Table):
    name = tables.Column()
    importance = tables.Column(verbose_name='Importance', empty_values=())
    vertical = tables.Column(verbose_name='Vertical', empty_values=())
    # ... and many more to come... all values based on siteFieldValue
    def render_importance(self, value, record):
        q = SiteFieldValue.objects.filter(site=record, field=1).first()
        # ^^ I don't want this!! I would want the SiteFieldValue to be prefetched somewhere else for that model and just check the array for field id in here.
        if q:
            return q.value
        else:
            return None
    def render_vertical(self, value, record):
        q = SiteFieldValue.objects.filter(site=record, field=2).first()
        # ^^ I don't want this!! I would want the SiteFieldValue to be prefetched somewhere else for that model and just check the array for field id in here.
        if q:
            return q.value
        else:
            return None
    class Meta:
        model = Site
        attrs = {
            "class": "table table-striped",
            "thead": {"class": "thead-light"},
        }
        template_name = "django_tables2/bootstrap.html"
        fields = ("name", "importance", "vertical",)
This might get you started. I've broken it up into steps but they can be chained quite easily.
# Get all the objects you'll need. You can filter as appropriate (say, by site__name).
qs = SiteFieldValue.objects.select_related('site', 'field')
# Let's keep things simple and only get the values we want
qs_values = qs.values('site__name', 'field__name', 'value')
# qs_values is a queryset. For ease of manipulation, let's make it a list
qs_list = list(qs_values)
# Set up a final dict
final_dict = {}
# Create the keys (sites) and values so they are all grouped
for row in qs_list:
    # Create the sub-dict for the fields if not already created
    if row['site__name'] not in final_dict:
        final_dict[row['site__name']] = {}
        final_dict[row['site__name']]['name'] = row['site__name']
    final_dict[row['site__name']][row['field__name']] = row['value']
# Now let's convert our dict of dicts into a list of dicts
# for use as per the django_tables2 docs
data = []
for site in final_dict:
    data.append(final_dict[site])
Now you have a list of dicts, e.g.
[{'name': site__name, 'col1name': value, ...}, ...] and can pass it to the table as shown in the django_tables2 docs.
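The grouping step can be tried without a database. The sketch below assumes rows shaped like the output of values('site__name', 'field__name', 'value'); the sample rows and the helper name rows_to_table_data are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows, shaped like the output of
# qs.values('site__name', 'field__name', 'value').
rows = [
    {"site__name": "alpha", "field__name": "importance", "value": "high"},
    {"site__name": "alpha", "field__name": "vertical", "value": "retail"},
    {"site__name": "beta", "field__name": "importance", "value": "low"},
]


def rows_to_table_data(rows):
    """Group flat (site, field, value) rows into one dict per site."""
    grouped = defaultdict(dict)
    for row in rows:
        site_row = grouped[row["site__name"]]
        site_row["name"] = row["site__name"]
        site_row[row["field__name"]] = row["value"]
    # Dicts keep insertion order, so sites come out in first-seen order.
    return list(grouped.values())


data = rows_to_table_data(rows)
print(data)
```

A site that has no value for some field simply lacks that key, so the matching column would need to tolerate missing data.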

How to generate feed from different models in Django?

So, I have two models called apartments and jobs. It's easy to display contents of both models separately, but what I can't figure out is how to display the mix feed of both models based on the date.
jobs = Job.objects.all().order_by('-posted_on')
apartments = Apartment.objects.all().order_by('-date')
The posted date on job is represented by 'posted_on' and the posted date on apartment is represented by 'date'. How can I combine both of these and sort them according to the date posted? I tried combining both of these models in a simpler way like:
new_feed = list(jobs) + list(apartments)
This just creates the list of both of these models, but they are not arranged based on date.
I suggest two ways to achieve that.
With union(), new in Django 1.11:
Uses SQL's UNION operator to combine the results of two or more QuerySets.
You need to make sure that the field you order by has the same name in both models, for example a date field on both Job and Apartment.
jobs = Job.objects.all()
apartments = Apartment.objects.all()
# order once, on the combined queryset
new_feed = jobs.union(apartments).order_by('-date')
Note that with this option the ordering is applied once, to the combined queryset; some databases reject ORDER BY inside the individual parts of a UNION.
Or
With chain(), which treats consecutive sequences as a single sequence, plus sorted() with a lambda key to sort them:
from itertools import chain
# remove the order_by() in each queryset, sort once with sorted()
jobs = Job.objects.all()
apartments = Apartment.objects.all()
# reverse=True puts the newest entries first
result_list = sorted(chain(jobs, apartments),
                     key=lambda instance: instance.date, reverse=True)
With this option, you don't need to rename or change any of your fields; just add a property to one model. Let's choose the Job model:
class Job(models.Model):
    ''' fields '''
    posted_on = models.DateField(......)
    @property
    def date(self):
        return self.posted_on
So now both of your models have a date attribute, and you can use chain():
result_list = sorted(chain(jobs, apartments),
                     key=lambda instance: instance.date, reverse=True)
A good way to do that is to use the adapter design pattern. The idea is to introduce an auxiliary data structure that exists only for sorting these model objects. This has several benefits over forcing both models to share an identically named attribute. The most important is that the change won't affect any other code in your code base.
First, you fetch your objects as you do but you don't have to fetch them sorted, you can fetch all of them in arbitrary order. You may also fetch just top 100 of them in the sorted order. Just fetch what fits your requirements here:
jobs = Job.objects.all()
apartments = Apartment.objects.all()
Then, we build an auxiliary list of tuples (attribute used for sorting, object), so:
auxiliary_list = ([(job.posted_on, job) for job in jobs]
+ [(apartment.date, apartment) for apartment in apartments])
Now it's time to sort this auxiliary list. We sort on the first element of each tuple, i.e. the posted_on and date attributes. Passing an explicit key matters: by default Python compares tuples lexicographically, so two entries with equal dates would fall through to comparing the model objects themselves, which raises a TypeError in Python 3. Parameter reverse is set to True for sorting in decreasing order, i.e. as you want them in your feed.
auxiliary_list.sort(key=lambda pair: pair[0], reverse=True)
Now it's time to return only the second elements of the sorted tuples:
sorted_feed = [obj for _, obj in auxiliary_list]
Just keep in mind that if you expect your feed to be huge then sorting these elements in memory is not the best way to do this, but I guess this is not your concern here.
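The whole approach fits in one small function. This is only a sketch: the two dataclasses are stand-ins for the Job and Apartment models so that it runs without Django, and build_feed is a hypothetical helper name.

```python
import datetime
from dataclasses import dataclass


@dataclass
class Job:  # stand-in for the Job model
    title: str
    posted_on: datetime.date


@dataclass
class Apartment:  # stand-in for the Apartment model
    address: str
    date: datetime.date


def build_feed(jobs, apartments):
    """Merge both collections, newest first, without renaming any field."""
    pairs = [(job.posted_on, job) for job in jobs]
    pairs += [(apartment.date, apartment) for apartment in apartments]
    # Sort on the date only; comparing whole tuples would fall back to
    # comparing the objects themselves whenever two dates are equal.
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return [obj for _, obj in pairs]


jobs = [
    Job("dev", datetime.date(2024, 1, 2)),
    Job("ops", datetime.date(2024, 1, 5)),
]
apartments = [
    Apartment("1 Main St", datetime.date(2024, 1, 5)),
    Apartment("2 Oak Ave", datetime.date(2024, 1, 1)),
]

feed = build_feed(jobs, apartments)
for entry in feed:
    print(entry)
```

Because the sort is stable, entries that share a date keep the order in which they were added to the auxiliary list.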
I implemented this in the following way.
I had a Video model and an Article model that had to be curated as a feed. I made another model called Post, and then added a OneToOne key to it from both Video and Article.
# apps/feeds/models.py
from django.utils.functional import cached_property
from model_utils.models import TimeStampedModel
class Post(TimeStampedModel):
    ...
    @cached_property
    def target(self):
        if getattr(self, "video", None) is not None:
            return self.video
        if getattr(self, "article", None) is not None:
            return self.article
        return None
# apps/videos/models.py
class Video(models.Model):
    post = models.OneToOneField(
        "feeds.Post",
        on_delete=models.CASCADE,
    )
    ...
# apps/articles/models.py
class Article(models.Model):
    post = models.OneToOneField(
        "feeds.Post",
        on_delete=models.CASCADE,
    )
    ...
Then for the feed API, I used django-rest-framework and sorted the Post queryset on its created timestamp. I customized the serializer's methods and added queryset annotations where needed. This way I was able to get either the Article's or the Video's data as a nested dictionary from the related Post instance.
The advantage of this implementation is that you can optimize the queries easily with the annotate, select_related and prefetch_related methods, which work well on the Post queryset.
# apps/feeds/serializers.py
class FeedSerializer(serializers.ModelSerializer):
    type = serializers.SerializerMethodField()
    class Meta:
        model = Post
        fields = ("type",)
    def to_representation(self, instance) -> OrderedDict:
        ret = super().to_representation(instance)
        if isinstance(instance.target, Video):
            ret["data"] = VideoSerializer(
                instance.target, context={"request": self.context["request"]}
            ).data
        else:
            ret["data"] = ArticleSerializer(
                instance.target, context={"request": self.context["request"]}
            ).data
        return ret
    def get_type(self, obj):
        return obj.target._meta.model_name
    @staticmethod
    def setup_eager_loading(qs):
        """
        Inspired by:
        http://ses4j.github.io/2015/11/23/optimizing-slow-django-rest-framework-performance/
        """
        qs = qs.select_related("video", "article")
        # other db optimizations...
        ...
        return qs
# apps/feeds/viewsets.py
class FeedViewSet(viewsets.ModelViewSet):
    queryset = Post.objects.all()
    serializer_class = FeedSerializer
    permission_classes = (IsAuthenticatedOrReadOnly,)
    def get_queryset(self):
        qs = super().get_queryset()
        qs = self.serializer_class.setup_eager_loading(qs)
        return qs
    ...

How to return objects by popularity that was calculated with django-hitcount

I have items in my app that I want to be returned by "popularity", meaning the number of views each item has.
I'm using django-hitcount to do this. I saw here how I could get the number of hits of each object. But I don't want to load all my Item objects into memory to accomplish what I want, because it's an unnecessary overhead.
I want to pass the N most popular items to the view, along with the access count of each item.
My Item model is as below:
class Item(models.Model, HitCountMixin):
    nome = models.CharField(max_length=255, unique=True)
    slug = models.SlugField(max_length=255, null=True)
    imagem = models.ImageField(upload_to='itens/item/', null=True, blank=True)
    descricao = RichTextUploadingField(null=True, blank=True)
    categoria = models.ForeignKey(Categoria)
    hit_count_generic = GenericRelation(
        HitCount, object_id_field='object_pk',
        related_query_name='hit_count_generic_relation')
    def __str__(self):
        return '{}'.format(self.nome)
    def get_absolute_url(self):
        from django.urls import reverse
        return reverse('itens:detail_item', args=[str(self.slug)])
At first, in my view I was trying to get the most popular items with this function:
def get_most_popular_itens(amount):
    return Item.objects.order_by('-hit_count.hits')[:amount]
But it didn't work. I couldn't understand how this contenttype/generic relationship works.
So I looked at the database tables and managed to do something functional (see below).
But it has one problem: the queryset returned isn't ordered by the number of views, and I don't have access to this number.
What's more, my solution seems clumsy at best.
So I'd welcome any idea on how I could improve it, maybe by taking advantage of the generic relationship?
def get_most_popular_itens(amount):
    ct = ContentType.objects.get_for_model(Item)
    hit_counts = HitCount.objects.filter(content_type_id=ct.id).order_by('-hits')[:amount]
    items = []
    for hit in hit_counts:
        items.append(hit.object_pk)
    return Item.objects.filter(id__in=items)
This should work:
Item.objects.all().order_by('-hit_count_generic__hits')

How Do I Select From Two Different Tables in Django?

I have these Models and I want to be able to select from the first two.
class Comments(models.Model):
    post_id = models.ForeignKey('Posts')
    content = models.CharField(max_length=480)
    time_created = models.DateTimeField()
class Posts(models.Model):
    user_id = models.ForeignKey('Users')
    content = models.CharField(max_length=480)
    time_created = models.DateTimeField()
class Users(models.Model):
    email = models.EmailField()
    name = models.CharField(max_length=60)
    time_created = models.DateTimeField()
I want to be able to select posts and their comments and have them ordered by datetime so that Posts and Comments will be mixed when displaying them.
I think Twitter does the same thing with their Tweets and Retweets.
You might not be able to do it with a single query. However, you can get the two querysets and use itertools to merge the two iterables.
Example, assuming you want a user's posts and comments:
import itertools
posts = user.posts_set.all()  # or Posts.objects.filter(user_id=user)
comments = Comments.objects.filter(post_id__user_id=user)
qs = itertools.chain.from_iterable([posts, comments])
Note that the | operator only combines querysets of the same model, so qs = posts | comments is not an option for mixing Posts and Comments.
Now, you can order by key:
qs_sorted = sorted(qs, key=lambda x: x.time_created)
You might want to limit the querysets to avoid long loading times, as they are fully evaluated in the sorted() function.
To select a certain group of Posts:
filtered_posts = Posts.objects.filter(however_you_filter_your_queryset)
To get all of the Comments associated with a certain post:
related_comments = p.comments_set.all()
You could create a list of tuples where each holds (data_type, content, time):
tuple_list = []
for p in filtered_posts:
    post_tuple = ('post', p.content, p.time_created)
    tuple_list.append(post_tuple)
    related_comments = p.comments_set.all()
    for c in related_comments:
        comment_tuple = ('comment', c.content, c.time_created)
        tuple_list.append(comment_tuple)
You ultimately end up with a list of tuples containing all of the posts you grabbed along with the comments related to those posts. If you sort your list by the third element of the tuples, you'll be sorting by the datetime field you're after. You can also remove duplicates by making a set().
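A small sketch of that last step, sorting by the third element and dropping duplicates. The sample tuples are invented for illustration and follow the (data_type, content, time) layout described above.

```python
from datetime import datetime
from operator import itemgetter

# Hypothetical (data_type, content, time_created) tuples.
tuple_list = [
    ("post", "First post", datetime(2024, 1, 1, 9, 0)),
    ("comment", "Nice post", datetime(2024, 1, 1, 9, 30)),
    ("post", "Second post", datetime(2024, 1, 1, 9, 15)),
    ("comment", "Nice post", datetime(2024, 1, 1, 9, 30)),  # exact duplicate
]

# set() drops exact duplicates; sorted() then orders by the third element.
timeline = sorted(set(tuple_list), key=itemgetter(2))

for data_type, content, time_created in timeline:
    print(time_created, data_type, content)
```

Pass reverse=True to sorted() to show the newest entries first.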

Querying for Multiple ManyToMany Fields

I have a model, Entry, with a ManyToMany field:
tags = models.ManyToManyField(Tag)
The Tag model is simple:
class Tag(models.Model):
    name = models.CharField(max_length=25)
    def __unicode__(self):
        return self.name
I want to create a function, getEntryByTags that takes a list of tag names and returns any Entry that has those tags. If I knew the number of tags and the names of each, this would be trivial:
Entry.objects.filter(tags__name="tech").filter(tags__name="music").filter(tags__name="other")
But since this is going to be based on user input and the tags number will be variable, I'm not sure how to proceed. How would I iterate over a multiple item list to get an object that contains each of the ManyToMany objects with the names represented in the list?
You can try:
from django.db.models import Q
def getEntryByTags(tag_name_list):
    # tag_name_list: dynamic list of tag names based on user input
    query_list = [Q(tags__name=tag_name) for tag_name in tag_name_list]
    query_set = Entry.objects.all()
    # Each chained filter() adds one more required tag
    for query in query_list:
        query_set = query_set.filter(query)
    return query_set
