Using a queryset in Django (in a view) I only want to get said rows 51-100. i.e. I only want it to return these rows.
is this possible and how within .
objectQuerySet = Recipient.objects.filter(incentiveid=incentive).order_by('fullname')
I don't want to use any paging system etc this is just a one time thing?
Thank you
You can use slicing to execute a LIMIT OFFSET statement:
objectQuerySet = Recipient.objects.filter(incentiveid=incentive).order_by('fullname')[51:100]
Related
In my view i am trying to use prefetch_related to get data from 2 related models. The following line gives me the results i want by returning the most recent entry for all controllers in my database
# allNames is a list containing all the names of controllers i want to get data for
measurements = Microcontrollers.objects.filter(name=allNames[i]).prefetch_related(Prefetch('measurements_basic',queryset=MeasurementsBasic.objects.order_by('-time_taken')))
However when i try to get more entries by adding [:3] at the end it still only returns one for each name in the list. When i try to do so on the prefetch query i get a slice error.
AssertionError at /api/CUTAQ/all/testdata/
Cannot filter a query once a slice has been taken.
My question is how i can make it so i get the amount of entries i want for each name in the list.
Slicing is not used for getting sets of data from a queryset consecutively. If that's what you need, you better go with paginating your queryset instead.
My goal is to optimize the retrieval of the count of objects I have in my Django model.
I have two models:
Users
Prospects
It's a one-to-many relationship. One User can create many Prospects. One Prospect can only be created by one User.
I'm trying to get the Prospects created by the user in the last 24 hours.
Prospects model has roughly 7 millions rows on my PostgreSQL database. Users only 2000.
My current code is taking to much time to get the desired results.
I tried to use filter() and count():
import datetime
# get the date but 24 hours earlier
date_example = datetime.datetime.now() - datetime.timedelta(days = 1)
# Filter Prospects that are created by user_id_example
# and filter Prospects that got a date greater than date_example (so equal or sooner)
today_prospects = Prospect.objects.filter(user_id = 'user_id_example', create_date__gte = date_example)
# get the count of prospects that got created in the past 24 hours by user_id_example
# this is the problematic call that takes too long to process
count_total_today_prospects = today_prospects.count()
I works, but it takes too much time (5 minutes). Because it's checking the entire database instead of just checking, what I though it would: only the prospects that got created in the last 24 hours by the user.
I also tried using annotate but it's equally slow, because it's ultimately doing the same thing than the regular .count():
today_prospects.annotate(Count('id'))
How can I get the count in a more optimized way?
Assuming that you don't have it already, I suggest adding an index that includes both user and date fields (make sure that they are in this order, first the user and then the date, because for the user you are looking for an exact match but for the date you only have a starting point). That should speed up the query.
For example:
class Prospect(models.Model):
...
class Meta:
...
indexes = [
models.Index(fields=['user', 'create_date']),
]
...
This should create a new migration file (run makemigrations and migrate) where it adds the index to the database.
After that, your same code should run a bit faster:
count_total_today_prospects = Prospect.objects\
.filter(user_id='user_id_example', create_date__gte=date_example)\
.count()
Django's documentation:
A count() call performs a SELECT COUNT(*) behind the scenes, so you should always use count() rather than loading all of the record into Python objects and calling len() on the result (unless you need to load the objects into memory anyway, in which case len() will be faster).
Note that if you want the number of items in a QuerySet and are also retrieving model instances from it (for example, by iterating over it), it’s probably more efficient to use len(queryset) which won’t cause an extra database query like count() would.
If the queryset has already been fully retrieved, count() will use that length rather than perform an extra database query.
Take a look at this link: https://docs.djangoproject.com/en/3.2/ref/models/querysets/#count.
Try to use len().
I am using django 1.5. I need to split pages to data. I read docs here. I am not sure about whether it retrieves all data first or not. Since I have a large table, it should be better to using something like 'limit'. Thanks.
EDIT
I am using queryset in ModelManager.
example:
class KeywordManager(models.Manager):
def currentkeyword(self, kw, bd, ed):
wholeres = super(KeywordManager, self).get_query_set() \
.values("sc", "begindate", "enddate") \
.filter(keyword=kw, begindate__gte=bd, enddate__lte=ed) \
.order_by('enddate')
return wholeres
First, a queryset is a lazy object, and django will retrieve the data as soon you request it, but if you dont, django won't hit the DB. If you use over a queryset any list methods as len(), you will evaluate all the queryset and forcing django to retrieve all the data.
If you pass a queryset to the Paginator, it would not retrieve all the data, because, as docs says, if you pass a queryset, it will use .count() methods avoiding converting the queryset into a list and the use of len() method.
If your data is not coming from the database, then yes - Paginator will have to load all the information first in order to determine how to "split" it.
If you're not and you're simply interacting with the database with Django's auto-generated SQL, then the Paginator performs a query to determine the number of items in the database (i.e. an SQL COUNT()) and uses the value you supplied to determine how many pages to generate. Example: count() returns 43, and you want pages of 10 results - the number of pages generated is equivalent to: 43 % 10 + 1 = 5
Consider this query:
query = Novel.objects.< ...some filtering... >.annotate(
latest_chapter_id=Max("volume__chapter__id")
)
Actually what I need is to annotate each Novel with its latest Chapter object, so after this query, I have to execute another query to select actual objects by annotated IDs. IMO this is ugly. Is there a way to combine them into a single query?
Yes, it's possible.
To get a queryset containing all Chapters which are the last in their Novels, simply do:
from django.db.models.expressions import F
from django.db.models.aggregates import Max
Chapters.objects.annotate(last_chapter_pk=Max('novel__chapter__pk')
).filter(pk=F('last_chapter_pk'))
Tested on Django 1.7.
Possible with Django 3.2+
Make use of django.db.models.functions.JSONObject (added in Django 3.2) to combine multiple fields (in this example, I'm fetching the latest object, however it is possible to fetch any arbitrary object provided that you can get LIMIT 1) to yield your object):
MainModel.objects.annotate(
last_object=RelatedModel.objects.filter(mainmodel=OuterRef("pk"))
.order_by("-date_created")
.values(
data=JSONObject(
id="id", body="body", date_created="date_created"
)
)[:1]
)
Yes, using Subqueries, docs: https://docs.djangoproject.com/en/3.0/ref/models/expressions/#subquery-expressions
latest_chapters = Chapter.objects.filter(novel = OuterRef("pk"))\
.order_by("chapter_order")
novels_with_chapter = Novel.objects.annotate(
latest_chapter = Subquery(latest_chapters.values("chapter")[:1]))
Tested on Django 3.0
The subquery creates a select statement inside the select statement for the novels, then adds this as an annotation. This means you only hit the database once.
I also prefer this to Rune's answer as it actually annotates a Novel object.
Hope this helps, anyone who came looking like much later like I did.
No, it's not possible to combine them into a single query.
You can read the following blog post to find two workarounds.
idarr = [1,2,3,4,5]
for i in range(len(idarr)):
upload.objects.filter(idarr[i])
Cant we pass the idarr at one shot to the query
I am assuming that you are trying to filter all instances of Upload whose id is in the list idarr. If that is the case then you can go about it like this:
Upload.objects.filter(id__in = idarr)
Read the documentation for more details.
So much wrong in so few lines...
In Python, never loop through range(len(whatever)). Just do for i in whatever.
Assuming upload is a Django model, you can't just pass a value to filter - you need to say what you're filtering against. Presumably it's the primary key, so you want .filter(pk=i).
If you want to filter against any of the values in a list, use __in: .filter(pk__in=idarr).