"connection.queries" returns nothing in Django - python

from django.db import connection, reset_queries
This prints []:
reset_queries()
p = XModel.objects.filter(id=id) \
    .values('name') \
    .annotate(quantity=Count('p_id')) \
    .order_by('-quantity') \
    .distinct()[:int(count)]
print(connection.queries)
While this prints the executed query:
reset_queries()
tc = ZModel.objects \
    .filter(id=id, stock__gt=0) \
    .aggregate(Sum('price'))
print(connection.queries)
I have changed the field names to keep things simple. (The fields belong to parent tables, i.e. they are __ lookups across multiple levels.)
I was trying to print the MySQL queries that Django makes and came across connection.queries. I was wondering why it prints an empty list for the first query, while for the second it works fine. I do get the result I expect, so the query is probably executed. Also, I am executing only one of them at a time.

As the accepted answer says, you must consume the queryset first since it's lazy (e.g. list(qs)).
Another reason can be that you are not in DEBUG mode (see the FAQ):
connection.queries is only available if Django DEBUG setting is True.

Because QuerySets in Django are lazy: as long as you do not consume the result, the QuerySet is not evaluated. No query is made until you want to obtain non-QuerySet objects such as lists, dictionaries, or Model objects.
We cannot, however, do this for all ORM calls: for example, Model.objects.get(..) returns a Model object, so we cannot postpone that fetch (of course we could wrap it in a function and call it later, but then the "type" is a function, not a Model instance).
The same holds for .aggregate(..), since its result is a dictionary that maps the keys to the corresponding results of the aggregation.
But your first query does not need to be evaluated. By slicing, you have only added a LIMIT clause at the end of the query, and there is no need to evaluate it immediately: the type of the result is still a QuerySet.
If you call list(qs) on a QuerySet (qs), however, the QuerySet has to be evaluated, and Django will make the query.
The laziness of QuerySets also makes this kind of chaining possible. Imagine that you write:
Model.objects.filter(foo=42).filter(bar=1425)
If the QuerySet of Model.objects.filter(foo=42) were evaluated immediately, this could result in a huge number of Model instances. By postponing the evaluation, we filter on bar=1425 as well (we have constructed a new QuerySet that takes both .filter(..)s into account). This can result in a query that can be evaluated more efficiently and, for example, in less data being transferred from the database to the Django server.
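The laziness and chaining described above can be imitated with a small toy class. This is only a sketch and not how Django implements QuerySets: LazyQuery, the item table and the queries list (which plays the role of connection.queries) are all hypothetical names, and the example uses the standard-library sqlite3 module.

```python
import sqlite3


class LazyQuery:
    """Toy stand-in for a lazy queryset (hypothetical, not Django's code)."""

    def __init__(self, conn, log, conditions=()):
        self.conn = conn
        self.log = log  # plays the role of connection.queries
        self.conditions = tuple(conditions)

    def filter(self, condition):
        # Builds a new object; the database is not touched here.
        # Conditions are trusted SQL literals in this toy.
        return LazyQuery(self.conn, self.log, self.conditions + (condition,))

    def __iter__(self):
        # SQL is built and executed only when the object is consumed.
        sql = "SELECT name FROM item"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        self.log.append(sql)
        return iter(self.conn.execute(sql).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT, foo INTEGER, bar INTEGER)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?)",
    [("a", 42, 1425), ("b", 42, 7), ("c", 1, 1425)],
)

queries = []
qs = LazyQuery(conn, queries).filter("foo = 42").filter("bar = 1425")
logged_before = list(queries)  # still empty: nothing has run yet
rows = list(qs)                # consuming the object runs one query
print(logged_before, queries, rows)
```

The two chained filter calls end up in a single statement, and the log stays empty until list() consumes the object.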

The documentation says QuerySets are lazy as shown below:
QuerySets are lazy – the act of creating a QuerySet doesn’t involve
any database activity. You can stack filters together all day long,
and Django won’t actually run the query until the QuerySet is
evaluated. Take a look at this example:
>>> q = Entry.objects.filter(headline__startswith="What")
>>> q = q.filter(pub_date__lte=datetime.date.today())
>>> q = q.exclude(body_text__icontains="food")
>>> print(q)
Though this looks like three database hits, in fact it hits the
database only once, at the last line (print(q)). In general, the
results of a QuerySet aren’t fetched from the database until you “ask”
for them. When you do, the QuerySet is evaluated by accessing the
database. For more details on exactly when evaluation takes place, see
When QuerySets are evaluated.

Related

Why django ORM .filter() two way binding my data?

Let's say I store my query result temporarily to a variable
temp_doc = Document.objects.filter(detail=res)
and then I want to insert some data into said model,
which will be something like this:
p = Document(detail=res)
p.save()
Note that res is an object from another model, used to make an FK relation.
For some reason temp_doc will contain the new data.
Is .filter() supposed to work like that?
Because with .get() the data inside temp_doc doesn't change.
Django QuerySets are lazy; this behavior is well documented:
QuerySets are lazy – the act of creating a QuerySet doesn’t involve
any database activity. You can stack filters together all day long,
and Django won’t actually run the query until the QuerySet is
evaluated.
So basically, until you ask for the data to be evaluated, the database query won't be executed.
In your example
temp_doc = Document.objects.filter(detail=res)
p = Document(detail=res)
p.save()
Evaluating temp_doc now would include the newly created Document, as the database query would return it.
Simply constructing a list evaluates the QuerySet at the start:
# evaluation happens here
temp_doc = list(Document.objects.filter(detail=res))
p = Document(detail=res)
p.save()
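The difference between the two orderings can be seen in a plain-Python sketch. LazyFilter and the documents list are hypothetical stand-ins for a QuerySet and a database table, so this only models the timing and is not Django code.

```python
class LazyFilter:
    """Toy lazy filter over an in-memory 'table' (hypothetical stand-in)."""

    def __init__(self, table, predicate):
        self.table = table
        self.predicate = predicate

    def __iter__(self):
        # The table is read only now, so rows added after construction are seen.
        return (row for row in self.table if self.predicate(row))


def matches_res(row):
    return row["detail"] == "res"


documents = [{"id": 1, "detail": "res"}]

lazy = LazyFilter(documents, matches_res)            # not evaluated yet
snapshot = list(LazyFilter(documents, matches_res))  # evaluated immediately

documents.append({"id": 2, "detail": "res"})         # plays the role of p.save()

lazy_ids = [row["id"] for row in lazy]
snapshot_ids = [row["id"] for row in snapshot]
print(lazy_ids, snapshot_ids)
```

The lazy object sees the row added later, while the list built beforehand does not.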

Pass a queryset as the argument to __in in django?

I have a list of object IDs that I am getting from a query in a model's method, then I'm using that list to delete objects from a different model:
class SomeObject(models.Model):
    # [...]
    def do_stuff(self, some_param):
        # [...]
        ids_to_delete = {item.id for item in self.items.all()}
        # get_or_create() returns an (object, created) tuple
        other_object, _ = OtherObject.objects.get_or_create(some_param=some_param)
        other_object.items.filter(item_id__in=ids_to_delete).delete()
What I don't like is that this takes 2 queries (well, technically 3 for the get_or_create() but in the real code it's actually .filter(some_param=some_param).first() instead of the .get(), so I don't think there's any easy way around that).
How do I pass in an unevaluated queryset as the argument to an __in lookup?
I would like to do something like:
ids_to_delete = self.items.all().values("id")
other_object.items.filter(item_id__in=ids_to_delete).delete()
You can pass a QuerySet to the query:
other_object.items.filter(id__in=self.items.all()).delete()
This will turn it into a subquery. But not all databases, especially MySQL, handle such subqueries well. Furthermore, Django handles .delete() manually: it first makes a query to fetch the primary keys of the items, and then triggers the delete logic (which also removes items that have a CASCADE dependency). So .delete() is not done as one query but as at least two, and often more because of ForeignKeys with an on_delete trigger.
Note, however, that here you remove the Item objects themselves rather than "unlinking" them from other_object. For that, .remove(…) [Django-doc] can be used.
I should've tried the code sample I posted; you can in fact do this. It's given as an example in the documentation, but the documentation also says to "be cautious about using nested queries and understand your database server’s performance characteristics" and suggests that, where nested queries perform poorly, you cast the subquery into a list instead:
values = Blog.objects.filter(
    name__contains='Cheddar').values_list('pk', flat=True)
entries = Entry.objects.filter(blog__in=list(values))
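Roughly, the two approaches correspond to the two SQL shapes below. This is a sketch using the standard-library sqlite3 module with hypothetical blog and entry tables; the SQL that Django actually generates will differ in its details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry (id INTEGER PRIMARY KEY, blog_id INTEGER, headline TEXT);
    INSERT INTO blog VALUES (1, 'Cheddar Talk'), (2, 'Other');
    INSERT INTO entry VALUES (10, 1, 'x'), (11, 2, 'y'), (12, 1, 'z');
""")

# Shape 1: a single statement with a nested SELECT
# (what passing an unevaluated queryset to __in aims for).
nested = conn.execute(
    "SELECT id FROM entry WHERE blog_id IN "
    "(SELECT id FROM blog WHERE name LIKE '%Cheddar%') ORDER BY id"
).fetchall()

# Shape 2: two statements; the inner result is materialized
# into a Python list first (what list(values) does).
blog_ids = [row[0] for row in conn.execute(
    "SELECT id FROM blog WHERE name LIKE '%Cheddar%'")]
placeholders = ", ".join("?" for _ in blog_ids)
two_step = conn.execute(
    f"SELECT id FROM entry WHERE blog_id IN ({placeholders}) ORDER BY id",
    blog_ids,
).fetchall()

print(nested, two_step)
```

Both shapes return the same rows; which one is faster depends on the database server, which is why the documentation advises benchmarking.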

What if I delete rows from Django queryset and then filter again?

Consider the following code:
questions = Question.objects.only('id', 'pqa_id', 'retain')
del_questions = questions.filter(retain=False)
# Some computations on del_questions
del_questions.delete()
add_questions = questions.filter(pqa_id=None)
Will add_questions not contain questions with retain=False? I.e. is questions object re-evaluated when we run delete() on its subset del_questions?
Short answer: you use different QuerySets here, so by creating a copy you make another query. If you used the same QuerySet, Django would clear its cache and so re-evaluate the QuerySet. It is, however, possible for objects to temporarily survive a .delete() call, due to caching in another QuerySet that was evaluated before.
is questions object re-evaluated when we run delete() on its subset del_questions
questions is never evaluated in the first place. A QuerySet is iterable, and iterating over it (or fetching its length, or something similar) results in a query. But if you write Model.objects.all().filter(foo=3), Django will not first "evaluate" the .all() by fetching all Model objects into memory.
A QuerySet is in essence a tool to build a query, by chaining operations and each time constructing a new queryset. Eventually you can evaluate one of the querysets.
Here we apply a .filter(..) in each of the two calls. We have thus constructed two different QuerySets, so evaluating the former does not result in any caching in the latter.
A second important note is that a .delete() does not evaluate the queryset, and thus does not cache the results. If we inspect the .delete() method [GitHub], we see:
def delete(self):
    """Delete the records in the current QuerySet."""
    assert self.query.can_filter(), \
        "Cannot use 'limit' or 'offset' with delete."
    if self._fields is not None:
        raise TypeError("Cannot call delete() after .values() or .values_list()")
    del_query = self._chain()
    # The delete is actually 2 queries - one to find related objects,
    # and one to delete. Make sure that the discovery of related
    # objects is performed on the same database as the deletion.
    del_query._for_write = True
    # Disable non-supported fields.
    del_query.query.select_for_update = False
    del_query.query.select_related = False
    del_query.query.clear_ordering(force_empty=True)
    collector = Collector(using=del_query.db)
    collector.collect(del_query)
    deleted, _rows_count = collector.delete()
    # Clear the result cache, in case this QuerySet gets reused.
    self._result_cache = None
    return deleted, _rows_count
With self._chain(), one creates a copy of the queryset. So even if the deletion changed the state of a QuerySet, it would change the state of the copy, not of this QuerySet.
Another interesting part is self._result_cache = None: here Django resets the cache. So if the queryset was already evaluated before you called .delete() (for example, you materialized it), the cache is removed, and re-evaluating the QuerySet results in another query to fetch the items.
There is however a scenario where data can still get outdated. For example the following:
questions = Question.objects.all() # create a queryset
list(questions) # materialize the result
questions2 = questions.all() # create a copy of this queryset
questions2.delete() # remove the entries
If we now call list(questions), we obtain the elements in the cache of questions. This QuerySet is not invalidated, so the elements "survive" a .delete() from another queryset (here a copy of this one, although that is not necessary: a simple Question.objects.all().delete() would also do the trick).
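The stale-cache scenario can be mimicked in plain Python. CachedQuery below is a hypothetical toy that keeps a _result_cache in the same spirit as the snippet above; it is a sketch of the idea, not Django's implementation.

```python
class CachedQuery:
    """Toy model of a queryset with a result cache (hypothetical names)."""

    def __init__(self, table):
        self.table = table
        self._result_cache = None

    def all(self):
        # A copy shares the table but starts with an empty cache.
        return CachedQuery(self.table)

    def __iter__(self):
        if self._result_cache is None:
            self._result_cache = list(self.table)  # the "query"
        return iter(self._result_cache)

    def delete(self):
        self.table.clear()
        self._result_cache = None  # only this object's cache is reset


table = ["q1", "q2"]
questions = CachedQuery(table)
list(questions)                # fills the cache of `questions`
questions2 = questions.all()   # a copy with its own (empty) cache
questions2.delete()            # empties the table

stale = list(questions)        # served from the old cache
fresh = list(questions.all())  # a new copy re-reads the table
print(stale, fresh)
```

The deleted rows remain visible through the object that was evaluated earlier, while a fresh copy no longer sees them.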

How do querysets work when getting multiple random objects from Django?

I need to get multiple random objects from a Django model.
I know I can get one random object from the model Person by typing:
person = Person.objects.order_by('?')[0]
Then, I saw suggestions in How to get two random records with Django saying I could simply do this by:
people = Person.objects.order_by('?')[0:n]
However, as soon as I add that [0:n], instead of returning the objects, Django returns a QuerySet object. This results in the unfortunate consequences that if I then ask for
print(people[0].first_name, people[0].last_name)
I get the first_name and last_name for 2 different people as QuerySets are evaluated as they are called (right?). How do I get the actual list of people that were returned from the first query?
I am using Python 3.4.0 and Django 1.7.1
Simeon Popov's answer solves the problem, but let me explain where it comes from.
As you probably know querysets are lazy and won't be evaluated until it's necessary. They also have an internal cache that gets filled once the entire queryset is evaluated. If only a single object is taken from a queryset (or a slice with a step specified, i.e. [0:n:2]), Django evaluates it, but the results won't get cached.
Take these two examples:
Example 1
>>> people = Person.objects.order_by('?')[0:n]
>>> print(people[0].first_name, people[0].last_name)
# first and last name of different people
Example 2
>>> people = Person.objects.order_by('?')[0:n]
>>> for person in people:
...     print(person.first_name, person.last_name)
# first and last name are properly matched
In example 1, the queryset is not yet evaluated when you access the first item. It won't get cached, so when you access the first item again it runs another query on the database.
In the second example, the entire queryset is evaluated when you loop over it. Thus, the cache is filled and there won't be any additional database queries that would change the order of the returned items. In that case the names are properly aligned to each other.
Methods for evaluating an entire queryset include, among others, iteration, list(), bool() and len(). There are some subtle differences between these methods. If all you want to do is make sure the queryset is cached, I'd suggest using bool(), i.e.:
>>> people = Person.objects.order_by('?')[0:n]
>>> bool(people)
True
>>> print(people[0].first_name, people[0].last_name)
# matching names
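The caching behaviour described in this answer can be modelled with a small toy class that counts its "queries". RandomOrderQuery and its attributes are hypothetical; this is a sketch of the idea and not Django's implementation.

```python
import random


class RandomOrderQuery:
    """Toy model: every 'query' returns the rows in a fresh random order."""

    def __init__(self, rows, rng):
        self.rows = rows
        self.rng = rng
        self.query_count = 0
        self._result_cache = None

    def _run_query(self):
        self.query_count += 1
        return self.rng.sample(self.rows, len(self.rows))

    def __iter__(self):
        # Full evaluation fills the cache.
        if self._result_cache is None:
            self._result_cache = self._run_query()
        return iter(self._result_cache)

    def __bool__(self):
        return bool(list(self))  # evaluates fully, so the cache is filled

    def __getitem__(self, index):
        if self._result_cache is not None:
            return self._result_cache[index]
        return self._run_query()[index]  # single item: result is not cached


people = RandomOrderQuery(["Ann", "Bob", "Cy"], random.Random(0))

people[0]
people[0]
uncached_queries = people.query_count  # one query per access

bool(people)                           # one more query, fills the cache
first_a, first_b = people[0], people[0]
cached_queries = people.query_count    # no further queries after bool()
print(uncached_queries, cached_queries, first_a == first_b)
```

Before the cache is filled, each index access runs its own query and may return a different row; afterwards, repeated accesses are answered from the same cached result.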
Try this ...
people = []
for person in Person.objects.order_by('?')[0:n]:
    people.append(person)

django - how to do this with kwargs

I am wondering when I touch the db when doing queries. More precisely, when is the query performed?
I have this kwargs dict:
kwargs = {'name__startswith':'somename','color__iexact':'somecolor'}
But only for the name__startswith query do I need distinct(), and not for color__iexact.
I thought I would set distinct() for name__startswith in a loop like this:
for q in kwargs:
    if q == 'name__startswith':
        Thing.objects.filter(name__startswith=somename).distinct('id')
and then query for everything dynamically:
allthings = Thing.objects.filter(**kwargs)
But this is somehow wrong; I seem to be doing two different things here.
How can I do these two queries dynamically?
Django querysets are lazy, so the actual queries aren't evaluated until you use the data.
allthings = Thing.objects.filter(**kwargs)
if 'name__startswith' in kwargs:
    allthings = allthings.distinct('id')
No queries should be performed above until you actually use the data. This is great for filtering queries as you wish to do.
From the docs:
QuerySets are lazy – the act of creating a QuerySet doesn’t involve any database activity. You can stack filters together all day long, and Django won’t actually run the query until the QuerySet is evaluated. Take a look at this example:
>>> q = Entry.objects.filter(headline__startswith="What")
>>> q = q.filter(pub_date__lte=datetime.date.today())
>>> q = q.exclude(body_text__icontains="food")
>>> print(q)
Though this looks like three database hits, in fact it hits the database only once, at the last line (print(q)). In general, the results of a QuerySet aren’t fetched from the database until you “ask” for them. When you do, the QuerySet is evaluated by accessing the database. For more details on exactly when evaluation takes place, see When QuerySets are evaluated.
You can use models.Q to create dynamic queries in django.
query = models.Q(name__startswith=somename)
query &= models.Q(color__iexact='somecolor')
all_things = Thing.objects.filter(query).distinct('name')
Also read
Constructing Django filter queries dynamically with args and kwargs
