Django queryset return DurationField value in seconds - python

I have two models: Post, Comment (Comment has FK relation to Post).
Now I want to return all posts with theirs "response time". I get this response time in timedelta format. Can I receive it in seconds instead? I tried ExtractSecond but it is not what I'm looking for:
base_posts_queryset.annotate(
min_commnet_date=Min("comment_set__date_created"),
response_time=ExpressionWrapper(F("min_commnet_date") - F("date_created"), output_field=DurationField()),
response_time_in_sec=ExtractSecond(F("response_time"))
).filter(response_time__isnull=False).values("response_time", "response_time_in_sec")
This code returns following objects:
{'response_time': datetime.timedelta(days=11, seconds=74024, microseconds=920107), 'response_time_in_sec': 44}
What I want to achieve is basically call .seconds for each item in result queryset. I could do this in python, but mb it could be done on db level?

Sure can, but the exact mechanism may depend upon your database.
In postgres, you can use EXTRACT(epoch FROM <interval>) to get the total number of seconds.
To use this in Django, you can create a Func subclass:
class Epoch(django.db.models.expressions.Func):
template = 'EXTRACT(epoch FROM %(expressions)s)::INTEGER'
output_field = models.IntegerField()
Then you can use it directly:
base_posts.annotate(
response_time_sec=Epoch(F('min_comment_date') - F('date_created'))
)

Nice solution!
One wrinkle is that I think there is a missing 's' needed to get this to work in Django 3
class Epoch(django.db.models.expressions.Func):
template = 'EXTRACT(epoch FROM %(expressions)s)::INTEGER'
output_field = models.IntegerField()

As already answered here, it depends upon your database. As stated in Django documentation of Extract method:
Django usually uses the databases’ extract function, so you may use any lookup_name that your database supports.
So for example with PostgreSQL:
response_time_in_sec=Extract(F("response_time"), "epoch")

Related

Prefetch object not working with order_by queryset

Using Django 11 with PostgreSQL db. I have the models as shown below. I'm trying to prefetch a related queryset, using the Prefetch object and prefetch_related without assigning it to an attribute.
class Person(Model):
name = Charfield()
#property
def latest_photo(self):
return self.photos.order_by('created_at')[-1]
class Photo(Model):
person = ForeignKey(Person, related_name='photos')
created_at = models.DateTimeField(auto_now_add=True)
first_person = Person.objects.prefetch_related(Prefetch('photos', queryset=Photo.objects.order_by('created_at'))).first()
first_person.photos.order_by('created_at') # still hits the database
first_person.latest_photo # still hits the database
In the ideal case, calling person.latest_photo will not hit the database again. This will allow me to use that property safely in a list display.
However, as noted in the comments in the code, the prefetched queryset is not being used when I try to get the latest photo. Why is that?
Note: I've tried using the to_attr argument of Prefetch and that seems to work, however, it's not ideal since it means I would have to edit latest_photo to try to use the prefetched attribute.
The problem is with slicing, it creates a different query.
You can work around it like this:
...
#property
def latest_photo(self):
first_use_the_prefetch = list(self.photos.order_by('created_at'))
then_slice = first_use_the_prefetch[-1]
return then_slice
And in case you want to try, it is not possible to use slicing inside the Prefetch(query=...no slicing here...) (there is a wontfix feature request for this somewhere in Django tracker).

Reorder Django QuerySet by dynamically added field

A have piece of code, which fetches some QuerySet from DB and then appends new calculated field to every object in the Query Set. It's not an option to add this field via annotation (because it's legacy and because this calculation based on another already pre-fetched data).
Like this:
from django.db import models
class Human(models.Model):
name = models.CharField()
surname = models.CharField()
def calculate_new_field(s):
return len(s.name)*42
people = Human.objects.filter(id__in=[1,2,3,4,5])
for s in people:
s.new_column = calculate_new_field(s)
# people.somehow_reorder(new_order_by=new_column)
So now all people in QuerySet have a new column. And I want order these objects by new_column field. order_by() will not work obviously, since it is a database option. I understand thatI can pass them as a sorted list, but there is a lot of templates and other logic, which expect from this object QuerySet-like inteface with it's methods and so on.
So question is: is there some not very bad and dirty way to reorder existing QuerySet by dinamically added field or create new QuerySet-like object with this data? I believe I'm not the only one who faced this problem and it's already solved with django. But I can't find anything (except for adding third-party libs, and this is not an option too).
Conceptually, the QuerySet is not a list of results, but the "instructions to get those results". It's lazily evaluated and also cached. The internal attribute of the QuerySet that keeps the cached results is qs._result_cache
So, the for s in people sentence is forcing the evaluation of the query and caching the results.
You could, after that, sort the results by doing:
people._result_cache.sort(key=attrgetter('new_column'))
But, after evaluating a QuerySet, it makes little sense (in my opinion) to keep the QuerySet interface, as many of the operations will cause a reevaluation of the query. From this point on you should be dealing with a list of Models
Can you try it functions.Length:
from django.db.models.functions import Length
qs = Human.objects.filter(id__in=[1,2,3,4,5])
qs.annotate(reorder=Length('name') * 42).order_by('reorder')

Django Query only one field of a model using .extra() and without using .defer() or .only()

I'm using django ORM's exact() method to query only selected fields from a set of models to save RAM. I can't use defer() or only() due to some constraints on the ORM manager I am using (it's not the default one).
The following code works without an error:
q1 = Model.custom_manager.all().extra(select={'field1':'field1'})
# I only want one field from this model
However, when I jsonify the q1 queryset, I get every single field of the model.. so extra() must not have worked, or am I doing something wrong?
print SafeString(serializers.serialize('json', q1))
>>> '{ everything!!!!!}'
To be more specific, the custom manager I am using is django-sphinx. Model.search.query(...) for example.
Thanks.
So, Im not sure if you can do exactly what you want to do. However, if you only want the values for a particular field or a few fields, you can do it with values
It likely does the full query, but the result will only have the values you want. Using your example:
q1 = Model.custom_manager.values('field1', 'field2').all()
This should return a ValuesQuerySet. Which you will not be able to use with serializers.serialize so you will have to do something like this:
from django.utils import simplejson
data = [value for value in q1]
json_dump = simplejson.dumps(data)
Another probably better solution is to just do your query like originally intended, forgetting extra and values and just use the fields kwarg in the serialize method like this:
print SafeString(serializers.serialize('json', q1, fields=('field1', 'field2')))
The downside is that none of these things actually do the same thing as Defer or Only(all the fields are returned from the database), but you get the output you desire.

How do I test Django QuerySets are equal?

I am trying to test my Django views. This view passes a QuerySet to the template:
def merchant_home(request, slug):
merchant = Merchant.objects.get(slug=slug)
product_list = merchant.products.all()
return render_to_response('merchant_home.html',
{'merchant': merchant,
'product_list': product_list},
context_instance=RequestContext(request))
and test:
def test(self):
"Merchant home view should send merchant and merchant products to the template"
merchant = Merchant.objects.create(name='test merchant')
product = Product.objects.create(name='test product', price=100.00)
merchant.products.add(product)
test_client = Client()
response = test_client.get('/' + merchant.slug)
# self.assertListEqual(response.context['product_list'], merchant.products.all())
self.assertQuerysetEqual(response.context['product_list'], merchant.products.all())
EDIT
I am using self.assertQuerysetEqual instead of self.assertListEqual. Unfortunately this still doesn't work, and the terminal displays this:
['<Product: Product object>'] != [<Product: Product object>]
assertListEqual raises: 'QuerySet' object has no attribute 'difference' and
assertEqual does not work either, although self.assertSetEqual(response.context['product_list'][0], merchant.products.all()[0]) does pass.
I assume this is because the QuerySets are different objects even though they contain the same model instances.
How do I test that two QuerySets contain the same data? I am even testing this correctly? This is my 4th day learning Django so I would like to know best practices, if possible. Thanks.
By default assertQuerysetEqual uses repr() on the first argument. This is why you were having issues with the strings in the queryset comparison.
To work around this you can override the transform argument with a lambda function that doesn't use repr():
self.assertQuerysetEqual(queryset_1, queryset_2, transform=lambda x: x)
Use assertQuerysetEqual, which is built to compare the two querysets for you. You will need to subclass Django's django.test.TestCase for it to be available in your tests.
I just had the same problem. The second argument of assertQuerysetEqual needs to be a list of the expected repr()s as strings. Here is an example from the Django test suite:
self.assertQuerysetEqual(c1.tags.all(), ["<Tag: t1>", "<Tag: t2>"], ordered=False)
I ended up solving this issue using map to repr() each entry in the queryset inside the self.assertQuerysetEqual call, e.g.
self.assertQuerysetEqual(queryset_1, map(repr, queryset_2))
An alternative, but not necessarily better, method might look like this (testing context in a view, for example) when using pytest:
all_the_things = Things.objects.all()
assert set(response.context_data['all_the_things']) == set(all_the_things)
This converts it to a set, which is directly comparable with another set. Be careful with the behaviour of set though, it might not be exactly what you want since it will remove duplicates and ignore the order of objects.
I found that using self.assertCountEqual(queryset1, queryset2) also solves the issue.

Django equivalent for count and group by

I have a model that looks like this:
class Category(models.Model):
name = models.CharField(max_length=60)
class Item(models.Model):
name = models.CharField(max_length=60)
category = models.ForeignKey(Category)
I want select count (just the count) of items for each category, so in SQL it would be as simple as this:
select category_id, count(id) from item group by category_id
Is there an equivalent of doing this "the Django way"? Or is plain SQL the only option? I am familiar with the count( ) method in Django, however I don't see how group by would fit there.
Here, as I just discovered, is how to do this with the Django 1.1 aggregation API:
from django.db.models import Count
theanswer = Item.objects.values('category').annotate(Count('category'))
Since I was a little confused about how grouping in Django 1.1 works I thought I'd elaborate here on how exactly you go about using it. First, to repeat what Michael said:
Here, as I just discovered, is how to do this with the Django 1.1 aggregation API:
from django.db.models import Count
theanswer = Item.objects.values('category').annotate(Count('category'))
Note also that you need to from django.db.models import Count!
This will select only the categories and then add an annotation called category__count. Depending on the default ordering this may be all you need, but if the default ordering uses a field other than category this will not work. The reason for this is that the fields required for ordering are also selected and make each row unique, so you won't get stuff grouped how you want it. One quick way to fix this is to reset the ordering:
Item.objects.values('category').annotate(Count('category')).order_by()
This should produce exactly the results you want. To set the name of the annotation you can use:
...annotate(mycount = Count('category'))...
Then you will have an annotation called mycount in the results.
Everything else about grouping was very straightforward to me. Be sure to check out the Django aggregation API for more detailed info.
(Update: Full ORM aggregation support is now included in Django 1.1. True to the below warning about using private APIs, the method documented here no longer works in post-1.1 versions of Django. I haven't dug in to figure out why; if you're on 1.1 or later you should use the real aggregation API anyway.)
The core aggregation support was already there in 1.0; it's just undocumented, unsupported, and doesn't have a friendly API on top of it yet. But here's how you can use it anyway until 1.1 arrives (at your own risk, and in full knowledge that the query.group_by attribute is not part of a public API and could change):
query_set = Item.objects.extra(select={'count': 'count(1)'},
order_by=['-count']).values('count', 'category')
query_set.query.group_by = ['category_id']
If you then iterate over query_set, each returned value will be a dictionary with a "category" key and a "count" key.
You don't have to order by -count here, that's just included to demonstrate how it's done (it has to be done in the .extra() call, not elsewhere in the queryset construction chain). Also, you could just as well say count(id) instead of count(1), but the latter may be more efficient.
Note also that when setting .query.group_by, the values must be actual DB column names ('category_id') not Django field names ('category'). This is because you're tweaking the query internals at a level where everything's in DB terms, not Django terms.
How's this? (Other than slow.)
counts= [ (c, Item.filter( category=c.id ).count()) for c in Category.objects.all() ]
It has the advantage of being short, even if it does fetch a lot of rows.
Edit.
The one query version. BTW, this is often faster than SELECT COUNT(*) in the database. Try it to see.
counts = defaultdict(int)
for i in Item.objects.all():
counts[i.category] += 1

Categories