I need to annotate a queryset with strings from a dictionary. The dictionary keys come from the model's field called 'field_name'.
I can easily annotate with strings from dictionary using the Value operator:
q = MyModel.objects.annotate(
    new_value=Value(value_dict[key], output_field=CharField()))
And I can get the field value from the model with F expression:
q = MyModel.objects.annotate(new_value=F('field_name'))
Putting them together however fails:
# doesn't work, throws
# KeyError: F(field_name)
q = MyModel.objects.annotate(
    new_value=Value(value_dict[F('field_name')],
                    output_field=CharField()))
I found this question, which as far as I understand tries to do the same thing, but that solution throws another error:
Unsupported lookup 'field_name' for CharField or join on the field not permitted.
I feel like I'm missing something really obvious here but I just can't get it to work. Any help appreciated.
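One way to see why the combined attempt raises KeyError: `value_dict[F('field_name')]` is plain Python, evaluated before `annotate()` ever runs, so the dictionary is asked for a key that is the expression object itself and not a row's value. A minimal sketch, using a hypothetical stand-in class `FakeF` in place of Django's `F` so that it runs without Django:

```python
class FakeF:
    """Hypothetical stand-in for django.db.models.F (an assumption for illustration)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"F({self.name})"


value_dict = {"a": "Alpha", "b": "Beta"}

# The subscript is evaluated immediately, in Python, with the expression
# object as the key. No database row is involved at this point.
try:
    value_dict[FakeF("field_name")]
    lookup_failed = False
except KeyError as exc:
    lookup_failed = True
    missing_key = exc.args[0]

print(lookup_failed)
```

The mapping from field value to string therefore has to be expressed as something the database can evaluate per row.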
Right, just as I thought, a tiny piece was missing. The Case(When(... solution in the linked question worked, I just needed to wrap the dictionary value in Value() operator as follows:
qs = MyModel.objects.annotate(
    new_value=Case(
        *[When(field_name=k, then=Value(v)) for k, v in value_dict.items()],
        output_field=CharField()
    )
)
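For intuition, Case/When corresponds to a SQL CASE expression with one WHEN branch per dictionary entry. The sketch below builds a comparable statement by hand with the standard-library sqlite3 module. The table name, rows, and dictionary contents are made up, and the SQL Django actually generates will differ in detail.

```python
import sqlite3

# Made-up mapping and table, standing in for value_dict and MyModel.
value_dict = {"a": "Alpha", "b": "Beta"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, field_name TEXT)")
conn.executemany(
    "INSERT INTO mymodel (field_name) VALUES (?)",
    [("a",), ("b",), ("c",)],
)

# One "WHEN field_name = ? THEN ?" branch per dictionary entry.
whens = " ".join("WHEN field_name = ? THEN ?" for _ in value_dict)
params = [item for pair in value_dict.items() for item in pair]

sql = f"SELECT field_name, CASE {whens} END AS new_value FROM mymodel ORDER BY id"
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Rows whose field value is not a key in the dictionary get NULL here, which mirrors a Case without a default.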
Suppose we have a model in Django defined as follows:
class Literal(models.Model):
    name = models.CharField(...)
    ...
The name field is not unique, and thus can have duplicate values. I need to accomplish the following task:
Select all rows from the model that have at least one duplicate value of the name field.
I know how to do it using plain SQL (maybe not the best solution):
select * from literal where name IN (
select name from literal group by name having count((name)) > 1
);
So, is it possible to select this using the Django ORM? Or is there a better SQL solution?
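To check what the SQL above returns without a Django project, here is a small sketch using the standard-library sqlite3 module. The table contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO literal (name) VALUES (?)",
    [("foo",), ("bar",), ("foo",), ("baz",), ("bar",)],
)

# Same shape as the query in the question: keep rows whose name
# appears more than once in the table.
rows = conn.execute(
    """
    SELECT id, name FROM literal
    WHERE name IN (
        SELECT name FROM literal GROUP BY name HAVING COUNT(name) > 1
    )
    ORDER BY id
    """
).fetchall()
print(rows)
```

The row with the unique name "baz" is the only one left out.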
Try:
from django.db.models import Count
(Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet with only name and count. However, you can then use this to construct a regular QuerySet by feeding it back into another query:
dupes = (Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
Literal.objects.filter(name__in=[item['name'] for item in dupes])
This was rejected as an edit, so here it is as a better answer:
dups = (
    Literal.objects.values('name')
    .annotate(count=Count('id'))
    .values('name')
    .order_by()
    .filter(count__gt=1)
)
This will return a ValuesQuerySet with all of the duplicate names. However, you can then use this to construct a regular QuerySet by feeding it back into another query. The Django ORM is smart enough to combine these into a single query:
Literal.objects.filter(name__in=dups)
The extra call to .values('name') after the annotate call looks a little strange. Without this, the subquery fails. The extra values tricks the ORM into only selecting the name column for the subquery.
Try using aggregation:
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)
In case you use PostgreSQL, you can do something like this:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value
duplicate_ids = (Literal.objects.values('name')
                 .annotate(ids=ArrayAgg('id'))
                 .annotate(c=Func('ids', Value(1), function='array_length'))
                 .filter(c__gt=1)
                 .annotate(ids=Func('ids', function='unnest'))
                 .values_list('ids', flat=True))
It results in this rather simple SQL query:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
OK, so for some reason none of the above worked for me; it always returned <MultilingualQuerySet []>. I use the following solution, which is much easier to understand but not so elegant:
dupes = []
uniques = []
dupes_query = MyModel.objects.values_list('field', flat=True)
# Iterate over the raw values; wrapping them in set() first would drop the duplicates
for dupe in dupes_query:
    if dupe not in uniques:
        uniques.append(dupe)
    else:
        dupes.append(dupe)
print(set(dupes))
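The same in-Python counting can be written more compactly with collections.Counter. This is only a sketch: the list `values` is a made-up stand-in for what `values_list('field', flat=True)` would yield, and like the loop above it loads every value into memory.

```python
from collections import Counter

# Stand-in for MyModel.objects.values_list('field', flat=True)
values = ["foo", "bar", "foo", "baz", "bar", "foo"]

# Count each value once, then keep those seen more than once.
counts = Counter(values)
dupes = {value for value, n in counts.items() if n > 1}
print(dupes)
```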
If you only want the list of names rather than the objects, you can use the following query:
repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat=True)
Goal
Query for all products, slice them, and return that subset of products with an added key:value pair; in other words, enriched.
Code that works but I can't use
I can't use this code because I use a paginator, the paginator accesses the count of the QuerySet. If I pass the sliced QuerySet then that count is just for that sliced part, not the overall QuerySet, hence why I can't use it.
products_qs = final_qs[paginator.get_offset(request):
                       paginator.get_offset(request) + paginator.get_limit(request)]
for product in products_qs:
    product.raw['super_cool_new_key'] = ms_response.get('results').get(product.id)
This works great, when I print the data I can see that super_cool_new_key enrichment in every product. Awesome. Problem? Well, I have had to slice it and now the count method is no longer true. Of course, I can do something like:
products_qs.count = final_qs.count
and move on with my life, but it feels... hacky, or maybe not?
Code I would like for it to work, but doesn't
for i in range(paginator.get_offset(request),
               paginator.get_offset(request) + paginator.get_limit(request)):
    product = final_qs[i]
    product.raw['super_cool_new_key'] = ms_response.get('results').get(product.id)
When I see the output of the data, the super_cool_new_key is not there. I can't wrap my head around as to why?
Maybe I am having a thick day and I don't understand accessing by reference, so I remove the middlemonkey.
final_qs = final_qs.all()
for i in range(paginator.get_offset(request),
               paginator.get_offset(request) + paginator.get_limit(request)):
    final_qs[i].raw['super_cool_new_key'] = ms_response.get('results').get(final_qs[i].id, '')
Suspicions
It's obvious it's something about the code difference that is the culprit of why one way works and the other way doesn't. My dollar is on the following:
The slice
The iteration
Looking into the Django docs for QuerySet:
Iteration. A QuerySet is iterable, and it executes its database query the first time you iterate over it.
Then about slicing:
Slicing. As explained in Limiting QuerySets, a QuerySet can be sliced, using Python’s array-slicing syntax. Slicing an unevaluated QuerySet usually returns another unevaluated QuerySet, but Django will execute the database query if you use the “step” parameter of slice syntax, and will return a list
It can't be the slicing then, because I don't do a slice with a "step" parameter. Since it returns an unevaluated QuerySet, the code I want to work should in theory work. (Isn't that always the case? ha ha)
Ok so that clears up the fact that when, in the first option of coding, I did an iteration of for x in x_container the QuerySet was executed. Could that be the answer? So I modified the code:
Spoiler Alert: still does not work
final_qs = final_qs.all()
for i in range(paginator.get_offset(request),
               paginator.get_offset(request) + paginator.get_limit(request)):
    product = final_qs[i]
    product.raw['super_cool_new_key'] = ms_response.get('results').get(product.id)
Emmm... help?
A suggested answer that, spoiler alert, did not work
from django.db.models import When, Case, Value, CharField
when = [When(id=k, then=Value(v)) for k, v in ms_response.get('results').items()]
p = final_qs[paginator.get_offset(request)
             :paginator.get_offset(request) + paginator.get_limit(request)]
p = p.annotate(super_cool_new_key=Case(
        *when,
        default=Value(''),
        output_field=CharField()
    )
)
I also tried it without slicing but with .all().annotate() . Still didn't work. It doesn't work, not due to an Exception happening, but because when I see the output, that super_cool_new_key is not there, meaning it didn't enrich the objects, which is the whole point.
It sounds like what you are looking for is similar to the answer here which utilizes When and Case. For your use it will be something along the following:
from django.db.models import When, Case, Value, CharField
ms_response = {5458: 'abc', 9900: 'def'}
whens = [
    When(id=k, then=Value(v)) for k, v in ms_response.items()
]
qs = YourModelName.objects.all().annotate(
    super_cool_key=Case(
        *whens,
        default=Value('xyz'),
        output_field=CharField()
    )
)
When you then call qs.get(id=5458).super_cool_key, it will return 'abc'.
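A possible explanation for the index-based loops in the question, offered as a sketch and not as a confirmed diagnosis: indexing a QuerySet that has not been evaluated runs a fresh query for that one row and builds a new model instance each time, so an attribute set on the object from `final_qs[i]` is gone the next time `final_qs[i]` is read. Iterating over a slice evaluates it once and keeps those instances cached, which would explain why the first version works. The class `LazyRows` below is a hypothetical stand-in that mimics this behaviour without Django.

```python
class LazyRows:
    """Hypothetical stand-in for an unevaluated QuerySet (not Django code)."""

    def __init__(self, ids):
        self._ids = ids

    def __getitem__(self, index):
        # Mimics "one query per index": a new object is built on every access.
        return {"id": self._ids[index], "raw": {}}

    def __len__(self):
        return len(self._ids)


lazy = LazyRows([10, 11, 12])

# Mutating the object returned by an index lookup is lost, because the
# next lookup builds a brand-new object.
lazy[0]["raw"]["super_cool_new_key"] = "abc"
lost = "super_cool_new_key" not in lazy[0]["raw"]

# Materialising the page once keeps the same objects around.
page = [lazy[i] for i in range(len(lazy))]
page[0]["raw"]["super_cool_new_key"] = "abc"
kept = page[0]["raw"]["super_cool_new_key"] == "abc"

print(lost, kept)
```

Under that assumption, evaluating the page once (for example with `list(...)` around the slice) and enriching those objects would keep the added key, while the paginator can still call count() on the unsliced `final_qs`.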
I'm trying to use Django annotation to create a queryset field which is a list of values of some related model attribute.
queryset = ...
qs = queryset.annotate(
list_field=SomeAggregateFunction(
Case(When(related_model__field="abc"), then="related_model__id")
),
list_elements=Count(F('list_field'))
)
I was thinking about concatenating all these ids with some separator, but I don't know the appropriate functions. Another solution is to make list_field a queryset. I know this syntax is wrong. Thank you for any help.
If you are using PostgreSQL and Django >= 1.9, you could use the Postgres-specific aggregate functions, e.g.
ArrayAgg:
Returns a list of values, including nulls, concatenated into an array.
In case you need to concatenate these values using a delimiter, you could also use StringAgg.
I have done something like that:
qs = queryset \
    .annotate(
        field_a=ArrayAgg(Case(When(
            related_model__field="A",
            then="related_model__pk")
        )),
        field_b=ArrayAgg(Case(When(
            related_model__field="B",
            then="related_model__pk")
        )),
        field_c=ArrayAgg(Case(When(
            related_model__field="C",
            then="related_model__pk")
        ))
    )
Now there are lists of None or pk values under each of field_a, field_b and field_c for every object in the queryset. You can also define a different default value for Case instead of None.
I am trying to create a model unit test for a ManyToMany relationship.
The aim is to check, if there is the right category saved in the table Ingredient.
class IngredientModelTest(TestCase):
    def test_db_saves_ingredient_with_category(self):
        category_one = IngredientsCategory.objects.create(name='Food')
        first_Ingredient = Ingredient.objects.create(name='Apple')
        first_Ingredient.categories.add(category_one)
        category_two = IngredientsCategory.objects.create(name='Medicine')
        second_Ingredient = Ingredient.objects.create(name='Antibiotics')
        second_Ingredient.categories.add(category_two)
        first_ = Ingredient.objects.first()
        self.assertEqual('Apple', first_.name)
        self.assertEqual(first_.categories.all(), [category_one])
        self.assertEqual(first_, first_Ingredient)
For self.assertEqual(first_.categories.all(), [category_one]) in the second-to-last line, I get this weird assertion error:
AssertionError: [<IngredientsCategory: Food>] != [<IngredientsCategory: Food>]
I tried many other ways, but none of them worked. Does anyone know how I can get the information from first_.categories.all() to compare it with something else?
That'll be because they're not equal - one is a QuerySet, the other is a list - they just happen to have the same str representations.
You could either cast the QuerySet to a list with list(first_.categories.all()), or a possible solution for this situation may be:
self.assertEqual(first_.categories.get(), category_one)
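The "same string representation, different type" effect can be reproduced without Django. The class `FakeQuerySet` below is a hypothetical stand-in that is iterable and prints like a list, as in the error message above, but is not a list.

```python
class FakeQuerySet:
    """Hypothetical stand-in for a QuerySet: iterable, prints like a list, is not a list."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def __repr__(self):
        return repr(self._items)


qs = FakeQuerySet(["Food"])

same_repr = repr(qs) == repr(["Food"])   # the two sides look identical when printed
equal_as_is = qs == ["Food"]             # but a non-list object never equals a list here
equal_after_list = list(qs) == ["Food"]  # converting first compares element by element

print(same_repr, equal_as_is, equal_after_list)
```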
I am new to django and trying to filter multiple fields that contain text.
columns = ['ticketId', 'checkSum']
q_objects = [Q(fieldname +'__contains', myString) for fieldname in columns]
objects = objects.filter(reduce(operator.or_, q_objects))
I get
Exception Type: ValueError
Exception Value: too many values to unpack
on the last line, the one with "filter". Any ideas?
Try this:
Q(**{fieldname + '__contains': myString})
This is equivalent to providing a keyword argument, as you normally would when instantiating a Q object. For example:
Q(fieldname__contains=myString, another_fieldname__contains=myOtherString)
The Q object essentially needs pairs of values to work. Looking at the code it seems you can also use tuples of length two, like this (I haven't tested, though):
Q(("fieldname__contains", myString), ("another_fieldname__contains", myOtherString))
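To illustrate the dictionary-unpacking pattern together with reduce(operator.or_, ...) without needing Django, the sketch below uses a hypothetical class `Cond` as a stand-in for Q. It only records keyword lookups as text and supports `|`; it is not how Q is implemented.

```python
from functools import reduce
import operator


class Cond:
    """Hypothetical stand-in for django.db.models.Q (keyword lookups plus | only)."""

    def __init__(self, **lookups):
        self.text = " AND ".join(f"{k}={v!r}" for k, v in lookups.items())

    def __or__(self, other):
        combined = Cond()
        combined.text = f"({self.text} OR {other.text})"
        return combined


columns = ["ticketId", "checkSum"]
my_string = "abc"

# **{name: value} turns a computed string into a keyword argument name.
conds = [Cond(**{name + "__contains": my_string}) for name in columns]

# Fold the list into a single OR-ed condition.
combined = reduce(operator.or_, conds)
print(combined.text)
```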
What is the model you are querying? It looks like you left that out.
The last line,
objects = objects.filter(reduce(operator.or_, q_objects))
Should be something like
objects = MyModel.objects.filter(...)