Enrich Django QuerySet different outcomes

Goal
Query for all products, slice them, and return a subset of those products with an added key:value pair; in other words, enriched.
Code that works but I can't use
I can't use this code because I use a paginator, the paginator accesses the count of the QuerySet. If I pass the sliced QuerySet then that count is just for that sliced part, not the overall QuerySet, hence why I can't use it.
products_qs = final_qs[paginator.get_offset(request):
                       paginator.get_offset(request) + paginator.get_limit(request)]
for product in products_qs:
    product.raw['super_cool_new_key'] = ms_response.get('results').get(product.id)
This works great, when I print the data I can see that super_cool_new_key enrichment in every product. Awesome. Problem? Well, I have had to slice it and now the count method is no longer true. Of course, I can do something like:
products_qs.count = final_qs.count
and move on with my life, but it feels... hacky, or maybe not?
Code I would like for it to work, but doesn't
for i in range(paginator.get_offset(request),
               paginator.get_offset(request) + paginator.get_limit(request)):
    product = final_qs[i]
    product.raw['super_cool_new_key'] = ms_response.get('results').get(product.id)
When I see the output of the data, the super_cool_new_key is not there. I can't wrap my head around as to why?
Maybe I am having a thick day and I don't understand accessing by reference, so I remove the middlemonkey.
final_qs = final_qs.all()
for i in range(paginator.get_offset(request),
               paginator.get_offset(request) + paginator.get_limit(request)):
    final_qs[i].raw['super_cool_new_key'] = ms_response.get('results').get(final_qs[i].id, '')
Suspicions
It's obvious it's something about the code difference that is the culprit of why one way works and the other way doesn't. My dollar is on the following:
The slice
The iteration
Looking into Django Docs for QuerySet :
Iteration. A QuerySet is iterable, and it executes its database query the first time you iterate over it.
Then about slicing:
Slicing. As explained in Limiting QuerySets, a QuerySet can be sliced, using Python’s array-slicing syntax. Slicing an unevaluated QuerySet usually returns another unevaluated QuerySet, but Django will execute the database query if you use the “step” parameter of slice syntax, and will return a list
It can't be the slicing then, because I don't slice with a "step" parameter. Since it returns an unevaluated QuerySet, the code I want to work should, in theory, work. (Isn't that always the case? Ha ha.)
Ok so that clears up the fact that when, in the first option of coding, I did an iteration of for x in x_container the QuerySet was executed. Could that be the answer? So I modified the code:
Spoiler Alert: still does not work
final_qs = final_qs.all()
for i in range(paginator.get_offset(request),
               paginator.get_offset(request) + paginator.get_limit(request)):
    product = final_qs[i]
    product.raw['super_cool_new_key'] = ms_response.get('results').get(product.id)
Emmm... help?
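One possible explanation, going by the Django documentation on QuerySet caching: indexing a QuerySet that has not been evaluated yet runs a fresh query each time and builds a new object, so a change made to `final_qs[i]` lands on a throwaway instance. Iterating first fills the result cache, and later indexing returns the cached objects. The sketch below is a toy model of that behaviour in plain Python; `LazyRows` is a hypothetical stand-in, not Django's QuerySet.

```python
class LazyRows:
    """Toy stand-in for a lazily evaluated result set (not Django's QuerySet)."""

    def __init__(self, ids):
        self._ids = list(ids)
        self._cache = None  # filled on first full iteration

    def _fetch(self, i):
        # Simulates a database hit: a brand-new object on every call.
        return {"id": self._ids[i], "raw": {}}

    def __iter__(self):
        if self._cache is None:
            self._cache = [self._fetch(i) for i in range(len(self._ids))]
        return iter(self._cache)

    def __getitem__(self, i):
        if self._cache is not None:
            return self._cache[i]  # evaluated: same object every time
        return self._fetch(i)      # unevaluated: fresh object every time


rows = LazyRows([10, 11, 12])

# Index-based enrichment before evaluation: the change is made on a temporary object.
rows[0]["raw"]["super_cool_new_key"] = "x"
lost = rows[0]["raw"]

# Iteration evaluates and caches, so the enrichment sticks.
for row in rows:
    row["raw"]["super_cool_new_key"] = "x"
kept = rows[0]["raw"]
```

If this is what is happening, evaluating the page once (for example `products = list(final_qs[offset:offset + limit])`, with `offset` and `limit` taken from the paginator) and enriching that list would keep the new key, while the paginator can still take its count from the unsliced `final_qs`.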
A suggested answer that, spoiler alert, did not work
from django.db.models import When, Case, Value, CharField
when = [When(id=k, then=Value(v)) for k, v in ms_response.get('results').items()]
p = final_qs[paginator.get_offset(request):
             paginator.get_offset(request) + paginator.get_limit(request)]
p = p.annotate(
    super_cool_new_key=Case(
        *when,
        default=Value(''),
        output_field=CharField()
    )
)
I also tried it without slicing but with .all().annotate() . Still didn't work. It doesn't work, not due to an Exception happening, but because when I see the output, that super_cool_new_key is not there, meaning it didn't enrich the objects, which is the whole point.

It sounds like what you are looking for is similar to the answer here which utilizes When and Case. For your use it will be something along the following:
from django.db.models import When, Case, Value, CharField
ms_response = {5458: 'abc', 9900: 'def'}
whens = [
    When(id=k, then=Value(v)) for k, v in ms_response.items()
]
qs = YourModelName.objects.all().annotate(
    super_cool_key=Case(
        *whens,
        default=Value('xyz'),
        output_field=CharField()
    )
)
When you then call qs.get(id=5458).super_cool_key, it returns 'abc'.

Related

For making complex queries, Is it better to interact query_set like Sets or use built-in functions?

I want to write a complex query that needs union and intersection. When I checked this QA, I found two approaches. So my need would be accomplished by
needed_keys = [A, B, C]
qs1 = model.objects.filter(entity=needed_keys[0])
for entity in needed_keys:
    qs1 = qs1 | model.objects.filter(entity=entity)
qs2 = qs2 & qs1
or
needed_keys = [A, B, C]
qs1 = model.objects.filter(entity=needed_keys[0])
for entity in needed_keys:
    qs1 = qs1.union(model.objects.filter(entity=entity))
qs2 = qs2.intersection(qs1)
Based on one of the answers, it seems that the first approach evaluates the queries and then computes the AND or OR at the Python level, although I haven't seen anything about this evaluation in Django's docs. What exactly happens?
To sum up, my questions are: which approach is better? Does Django really evaluate 'model.objects.filter(entity=entity)' on each pass through the loop in the first approach?
P.S.
please do not focus on the variable names or structure of the code, the goal was just to illustrate the situation.
the type of 'entity' is textField, so I can't use 'model.objects.filter(entity__in=needed_keys)'
when I checked the output of 'q2.query', the first approach was more readable for me but I want to be sure about its performance too.

In my opinion, both of your variants are wrong.
This statement is wrong too:
the type of 'entity' is textField, so I can't use 'model.objects.filter(entity__in=needed_keys)'
__in works for strings too, so you can write:
from django.db.models import Q
...  # somewhere in the code
needed_filters = Q(entity__in=['Adidas', 'Basidas', 'Cididas'])
...  # somewhere else in the code
# needed_filters &= Q(<another filter that depends on your project>)
...  # at the end, create only one queryset
queryset = model.objects.filter(needed_filters)
# in your case: model.objects.filter(entity__in=['Adidas', 'Basidas', 'Cididas'])
More here:
https://docs.djangoproject.com/en/4.1/ref/models/querysets/#in
Next, you want to filter qs2. Why not do it directly instead of intersecting?
qs2 = qs2.filter(pk__in=qs1.values_list('pk', flat=True))
Next, try to work with the query conditions before the queryset hits the database, not with the data already loaded into querysets.
Otherwise you lose time creating objects that the intersection then throws away:
# Q(entity=A) | Q(entity=B) | ..., one condition per key
query = Q(*[Q(entity=entity) for entity in needed_keys], _connector=Q.OR)
# query_tocreate_qs2 stands for whatever conditions define qs2
qs2 = model.objects.filter(query_tocreate_qs2, query)  # your intersection
Once more: if you intersect objects that have already been created, you waste the time spent creating them.

What's the cleanest way of counting the iteration of a queryset loop?

I need to loop through a queryset and put the objects into a dictionary along with the sequence number, i, in which they appear in the queryset. However a Python for loop doesn't seem to provide a sequence number. Therefore I've had to create my own variable i and increment it with each loop. I've done something similar to this, is this the cleanest solution?
ledgers = Ledger.objects.all()
i = 0
for ledger in ledgers:
    print('Ledger no '+str(i)+' is '+ledger.name)
    i += 1

enumerate(…) [Python-doc] is designed for that:
ledgers = Ledger.objects.all()
for i, ledger in enumerate(ledgers):
    print(f'Ledger no {i} is {ledger.name}')
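As a small illustration of the same idea without a database, the sketch below uses a plain list of hypothetical ledger names in place of the queryset. It also shows the `start` argument of `enumerate`, for numbering from 1, and builds the position-to-object dictionary the question asks for.

```python
# Hypothetical ledger names standing in for Ledger objects.
names = ["cash", "bank", "sales"]

# enumerate() yields (index, item) pairs; start=1 makes the numbering 1-based.
lines = [f"Ledger no {i} is {name}" for i, name in enumerate(names, start=1)]

# Mapping from sequence number (0-based here) to the item.
by_position = {i: name for i, name in enumerate(names)}
```

The same calls work on any iterable, so a queryset can be passed to `enumerate` directly, as in the answer above.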

This is not exactly the answer to your question.
If you need the index on the QuerySet afterwards, you can annotate it like this:
from django.db.models import F, Window
from django.db.models.functions import DenseRank
ledgers_qs = Ledger.objects.annotate(index=Window(expression=DenseRank(), order_by=F('id').desc()))
# You can iterate over the annotated indexes and names
for index, name in ledgers_qs.values_list('index', 'name'):
    print(f'Ledger no {index} is {name}')
# You can apply further QuerySet operations if needed,
# because ledgers_qs is a QuerySet with annotated indexes
# (filtering on a window annotation needs Django 4.2 or later):
ledgers_qs.filter(index__lt=5, name__contains='foo')
With that you get a queryset that you can use for further database operations.
Why did I create this answer?
Because it is often better to keep working with QuerySets where possible, so that existing code can be extended without limitations.

django: order a QuerySet

I have a view like this:
def profile(request):
    articles = Post.thing.all()
    newSet = set()
    def score():
        for thing in articles:
            Val = some calculation...
            ....
            newSet.update([(thing,Val)])
    score()
    context = {
        'articles': articles.order_by('id'),
        'newSet': newSet,
    }
    return render(request, 'my_profile/my_profile.html', context)
and the outcome is a Queryset which looks like this:
set([(<thing: sfd>, 1), (<thing: quality>, 0), (<thing: hello>, -1), (<thing: hey>, 4), (<thing: test>, 0)])
I am now trying to order the set by the given values so that it becomes a list starting with the highest value, but when I call newSet.order_by/filter/split/join
it does not work, since a 'set' object has no attribute join/filter/split.
Can anybody give me a hint on how to sort the QuerySet? I could not find anything helpful on my own.
I need this to work in the view, so it cannot/should not be done in the model. Thanks for any advice.

the outcome is a Queryset which looks like this
Actually this is a set (python builtin type), not a QuerySet (django's orm type).
set is an unordered collection type. To "sort" it you first have to turn it into a list, which is as simple as newlist = list(newSet),
then you can sort the list in place with newlist.sort(). Since you want this list sorted on the second element of each item, you'll need the key argument to tell sort what to sort on:
newlist.sort(key=lambda item: item[1])
Or you can change your score() function to store (score, obj) tuples instead, in which case list.sort() will naturally do the RightThing (it sorts tuples on their first item, though on tied scores Python 3 falls back to comparing the objects, which fails for model instances).
While we're at it: instead of calling newSet.update() with a list of a single item, you can just use newSet.add(), i.e.:
def score():
    for thing in articles:
        val = some calculation...
        ....
        newSet.add((thing, val))
        # or if you don't want to specify a callback
        # for `list.sort()`:
        # newSet.add((val, thing))
And finally: do you really need a set at all here? Why not use a list right from the start?
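Since the question asks for a list that starts with the highest value, here is a minimal sketch of the list-based approach. Plain strings stand in for the `thing` objects, and the scores are copied from the output shown in the question.

```python
# (thing, score) pairs collected in a list instead of a set.
scored = [("sfd", 1), ("quality", 0), ("hello", -1), ("hey", 4), ("test", 0)]

# Sort on the second element of each tuple, highest score first.
# sorted() is stable, so items with equal scores keep their original order.
ranked = sorted(scored, key=lambda item: item[1], reverse=True)
```

Passing `ranked` to the template context instead of the set would give the view an ordered sequence to loop over.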

I think you might be slightly confused here between a set, a list and a QuerySet? A set is unordered, while a list is not. Sets don't expose the methods you listed above (filter, order_by, split, join). A QuerySet is a Django-specific class which has many useful methods.
I think it would be simpler to make newSet a list of tuples, and then to order that list by value using list.sort(key=lambda x: x[1]).
If your calculation of val is eligible for it though, I'd recommend using annotate and then doing away with newDict or newSet, and simply pass back the queryset of articles, which would be much simpler, maybe faster, and orderable by using articles.order_by('value'). If you post the calculation of val, I'll try to tell you if that's feasible.

Django filtering AND loop in a Django on M2M field

I have a list of IDs I need to query and filter (using AND) in Django. I would like to use something along the lines of Example 2 below, but it gives incorrect results (zero matches). The models are simple: many Products can have many Tags. What is wrong with Example 2?
Correct Results
Example 1:
q = Product.objects.all()
for id in _list_of_ids:
    q = q.filter(tags__id=id)
Example 2:
Incorrect results but seems better (edited for brevity) ...
for id in _list_of_ids:
    q = Q(tags__id=id)
    # append q here etc.
# q = (AND: ('tags__id', 1), ('tags__id', 2))
Product.objects.filter(q)

What you are searching for is:
from functools import reduce
products = reduce(lambda qs, p_id: qs.filter(tags=p_id), _list_of_ids, Product.objects.all())
Basically there is a difference between a single .filter call with several Q objects and multiple .filter calls each one with a single Q object.
In the first scenario you get one inner join with all Q filters applied to it.
In the second scenario you get many inner joins, each applying only one Q object.
In your case, when you are searching for a product, having a combination of multiple tags, you need to make an inner join per tag in order to find such a product (this is the second scenario) so you need many .filter calls.
More about that in the docs: Spanning multi-valued relationships
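The difference between the two scenarios can be mimicked in plain Python. This is only a toy model of the join semantics described above, with hypothetical product and tag data, not what Django executes.

```python
# Hypothetical data: product name -> set of tag ids.
product_tags = {"shirt": {1, 2}, "hat": {1}, "sock": {2, 3}}
wanted = [1, 2]

# A join yields one row per (product, tag) pair.
rows = [(product, tag) for product, tags in product_tags.items() for tag in tags]

# Single filter call: ONE joined row must satisfy every condition at once,
# i.e. its tag would have to equal 1 and 2 simultaneously.
single_join = {product for product, tag in rows if all(tag == w for w in wanted)}

# Chained filter calls: each condition gets its own join, so a product only
# needs SOME row per wanted tag.
chained = set(product_tags)
for w in wanted:
    chained &= {product for product, tag in rows if tag == w}
```

Here `single_join` comes out empty, which matches the zero results of Example 2, while `chained` contains only the product carrying both tags.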

What is the full code for Example 2?
Something like this seems like it should work...
import operator
from functools import reduce
q_expression = [Q(tags__id=id) for id in list_of_ids]
queryset = Product.objects.filter(reduce(operator.and_, q_expression))

q = Product.objects.filter(tags__id__in=list_of_ids)

Filter by property

Is it possible to filter a Django queryset by model property?
I have a method in my model:
@property
def myproperty(self):
    [..]
and now I want to filter by this property like:
MyModel.objects.filter(myproperty=[..])
Is this somehow possible?

Nope. Django filters operate at the database level, generating SQL. To filter based on Python properties, you have to load the object into Python to evaluate the property--and at that point, you've already done all the work to load it.

I might be misunderstanding your original question, but there is a filter builtin in Python.
filtered = filter(lambda x: x.myproperty, MyModel.objects.all())
But it's better to use a list comprehension:
filtered = [x for x in MyModel.objects.all() if x.myproperty]
or even better, a generator expression:
filtered = (x for x in MyModel.objects.all() if x.myproperty)
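That a property is ordinary Python, evaluated once per loaded object, can be seen without Django. In the sketch below, `Company` is a hypothetical plain class standing in for a model, with field names borrowed from the other answers on this page.

```python
class Company:
    """Plain-Python stand-in for a model instance (hypothetical fields)."""

    def __init__(self, name, num_employees, num_chairs):
        self.name = name
        self.num_employees = num_employees
        self.num_chairs = num_chairs

    @property
    def chairs_needed(self):
        # Computed in Python, so the database never sees this value.
        return self.num_employees - self.num_chairs


companies = [
    Company("Acme", 10, 8),
    Company("Globex", 50, 20),
    Company("Initech", 5, 5),
]

# Every object has to exist in memory before the property can be checked.
few_needed = [c.name for c in companies if c.chairs_needed < 4]
```

With a real model the list would come from something like `MyModel.objects.all()`, which means loading every row first. That cost is why this only suits small tables.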

Riffing off @TheGrimmScientist's suggested workaround, you can make these "sql properties" by defining them on the Manager or the QuerySet, and reuse/chain/compose them:
With a Manager:
from django.db import models
from django.db.models import F
class CompanyManager(models.Manager):
    def with_chairs_needed(self):
        return self.annotate(chairs_needed=F('num_employees') - F('num_chairs'))
class Company(models.Model):
    # ...
    objects = CompanyManager()
Company.objects.with_chairs_needed().filter(chairs_needed__lt=4)
With a QuerySet:
class CompanyQuerySet(models.QuerySet):
    def many_employees(self, n=50):
        return self.filter(num_employees__gte=n)
    def needs_fewer_chairs_than(self, n=5):
        return self.with_chairs_needed().filter(chairs_needed__lt=n)
    def with_chairs_needed(self):
        return self.annotate(chairs_needed=F('num_employees') - F('num_chairs'))
class Company(models.Model):
    # ...
    objects = CompanyQuerySet.as_manager()
Company.objects.needs_fewer_chairs_than(4).many_employees()
See https://docs.djangoproject.com/en/1.9/topics/db/managers/ for more.
Note that I am going off the documentation and have not tested the above.

Looks like using F() with annotations will be my solution to this.
It's not going to filter by @property, since F talks to the database before objects are brought into Python. But I'm still putting it here as an answer, since my reason for wanting to filter by property was really wanting to filter objects by the result of simple arithmetic on two different fields.
So, something along the lines of:
companies = Company.objects\
    .annotate(chairs_needed=F('num_employees') - F('num_chairs'))\
    .filter(chairs_needed__lt=4)
rather than defining the property to be:
@property
def chairs_needed(self):
    return self.num_employees - self.num_chairs
then doing a list comprehension across all objects.

I had the same problem, and I developed this simple solution:
objects = [
    my_object
    for my_object in MyModel.objects.all()
    if my_object.myProperty == [...]
]
This is not a performant solution; it shouldn't be used on tables that contain a large amount of data. It is fine as a simple solution or for a small personal project.

PLEASE someone correct me, but I guess I have found a solution, at least for my own case.
I want to work on all those elements whose properties are exactly equal to ... whatever.
But I have several models, and this routine should work for all models. And it does:
def selectByProperties(modelType, specify):
    # Caution: values are interpolated into the SQL string, so pass trusted input only.
    clause = "SELECT * from %s" % modelType._meta.db_table
    if len(specify) > 0:
        clause += " WHERE "
        for field, eqvalue in specify.items():
            clause += "%s = '%s' AND " % (field, eqvalue)
        clause = clause[:-5]  # remove last AND
    print(clause)
    return modelType.objects.raw(clause)
With this universal subroutine, I can select all those elements which exactly equal my dictionary of 'specify' (propertyname,propertyvalue) combinations.
The first parameter takes a (models.Model),
the second a dictionary like:
{"property1" : "77" , "property2" : "12"}
And it creates an SQL statement like
SELECT * from appname_modelname WHERE property1 = '77' AND property2 = '12'
and returns a RawQuerySet of those elements.
This is a test function:
from myApp.models import myModel
def testSelectByProperties():
    specify = {"property1": "77", "property2": "12"}
    subset = selectByProperties(myModel, specify)
    nameField = "property0"
    ## checking if that is what I expected:
    for i in subset:
        print(i.__dict__[nameField], end=" ")
        for j in specify.keys():
            print(i.__dict__[j], end=" ")
        print()
And? What do you think?
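One thing worth correcting, as a suggestion rather than a verdict: building the WHERE clause with string interpolation leaves the query open to SQL injection, and the names being matched are database columns, not Python properties. The sketch below shows the same idea with placeholders, using the standard-library `sqlite3` module so it runs on its own. The function name `select_by_fields`, the `item` table and the `allowed_columns` check are assumptions made for illustration.

```python
import sqlite3


def select_by_fields(conn, table, allowed_columns, specify):
    """Select rows where every column in `specify` equals the given value."""
    # Column names cannot be bound as parameters, so check them against a whitelist.
    for column in specify:
        if column not in allowed_columns:
            raise ValueError(f"unknown column: {column}")
    # `table` is trusted here; only the values are bound as parameters.
    clause = f"SELECT * FROM {table}"
    if specify:
        clause += " WHERE " + " AND ".join(f"{column} = ?" for column in specify)
    return conn.execute(clause, list(specify.values())).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (property0 TEXT, property1 TEXT, property2 TEXT)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?)",
    [("a", "77", "12"), ("b", "77", "99"), ("c", "1", "12")],
)

matches = select_by_fields(
    conn,
    "item",
    {"property0", "property1", "property2"},
    {"property1": "77", "property2": "12"},
)
```

In Django itself, `raw()` accepts a `params` argument for the same purpose, and for plain equality on real fields `modelType.objects.filter(**specify)` should give the same selection without hand-written SQL.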

I know it is an old question, but for the sake of those landing here I think it is useful to read the question below and its answer:
How to customize admin filter in Django 1.4

It may also be possible to use queryset annotations that duplicate the property get/set logic, as suggested e.g. by @rattray and @thegrimmscientist, in conjunction with the property. This could yield something that works both on the Python level and on the database level.
Not sure about the drawbacks, however: see this SO question for an example.
