django queryset filter with back reference - python

I'm a C++ developer and new to Python, just following the Django tutorials.
I want to know how to filter a queryset by information from its back references.
Below are my models.
# models.py
import datetime
from django.db import models
from django.utils import timezone

class Question(models.Model):
    pub_date = models.DateTimeField('date published')

class Choice(models.Model):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
Now I want to get a queryset of Question objects whose pub_date is in the past AND which are referenced by at least one Choice. The second condition is what causes my problem.
Below is what I tried.
# First method
question_queryset = Question.objects.filter(pub_date__lte=timezone.now())
for q in question_queryset.iterator():
    if Choice.objects.filter(question=q.pk).count() == 0:
        print(q)
# It works. Only #Question which is not referenced
# by any #Choice is printed.
# But how can I exclude #q in #question_queryset?

# Second method
question_queryset = Question.objects.filter(pub_date__lte=timezone.now()
    & Choice.objects.filter(question=pk).count()>0) # Not works.
# NameError: name 'pk' is not defined
# How can I use #pk as rvalue in #Question.objects.filter context?
Is it because I'm not familiar with Python syntax? Or is the approach itself to data wrong? Do you have any good ideas for solving my problem without changing the model?
=======================================
edit: I just found the way for the first method.
# First method
question_queryset = Question.objects.filter(pub_date__lte=timezone.now())
for q in question_queryset.iterator():
    if Choice.objects.filter(question=q.pk).count() == 0:
        question_queryset = question_queryset.exclude(pk=q.pk)
A new concern arises: if the number of #Question rows is n and the number of #Choice rows is m, the above method takes O(n * m) time, right? Is there any way to improve performance? Could it be that my way of handling data is the problem? Or is the structure of the data a problem?

Here is the documentation on how to follow relationships backwards. The following query yields the same result:
from django.db.models import Count

queryset = (Question.objects
            .filter(pub_date__lte=timezone.now())
            .annotate(num_choices=Count('choice'))
            .filter(num_choices__gt=0))
It is probably better to rely on the Django ORM than writing your own filter. I believe that in the best scenario the time complexity will be the same.
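To make the difference from the per-row loop concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table and column names (question, choice, question_id) are assumptions that only mimic the models above, and the SQL is a rough illustration of a single-query existence check, not the exact statement the ORM emits.

```python
import sqlite3

# In-memory database standing in for the Django tables (names are assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (id INTEGER PRIMARY KEY, pub_date TEXT);
    CREATE TABLE choice (
        id INTEGER PRIMARY KEY,
        question_id INTEGER REFERENCES question(id)
    );
    INSERT INTO question VALUES (1, '2020-01-01'), (2, '2020-06-01'), (3, '2999-01-01');
    INSERT INTO choice VALUES (1, 1), (2, 1), (3, 3);
""")

now = "2024-01-01"  # fixed "now" so the example is deterministic

# One round trip: published questions that have at least one choice.
rows = conn.execute(
    """
    SELECT q.id FROM question AS q
    WHERE q.pub_date <= ?
      AND EXISTS (SELECT 1 FROM choice AS c WHERE c.question_id = q.id)
    ORDER BY q.id
    """,
    (now,),
).fetchall()

question_ids = [row[0] for row in rows]
print(question_ids)
```

The database does the existence check inside one statement, so Python never issues a separate query per Question row. In Django, a filter such as Question.objects.filter(pub_date__lte=timezone.now(), choice__isnull=False).distinct() should express a similar condition.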
Regarding the design, this kind of relationship can lead to duplicates in your database, because different questions sometimes have the same answer. I would probably go with a many-to-many relationship instead.

That's not how querysets are supposed to work. Iterating over the queryset already iterates over every row that the database returns, so you don't need to call iterator().
question_queryset = Question.objects.filter(pub_date__lte=timezone.now())
for q in question_queryset:
    if Choice.objects.filter(question=q.pk).count() == 0:
        print(q)
I didn't test it. But this should work.

Related

Get variable OneToOneField in Django

Let's say I have a Form model:
class Form(models.Model):
    name = models.TextField()
    date = models.DateField()
and various "child" models:
class FormA(models.Model):
    form = models.OneToOneField(Form, on_delete=models.CASCADE)
    property_a = models.TextField()

class FormB(models.Model):
    form = models.OneToOneField(Form, on_delete=models.CASCADE)
    property_b = models.IntegerField()

class FormC(models.Model):
    form = models.OneToOneField(Form, on_delete=models.CASCADE)
    property_c = models.BooleanField()
A Form can be one AND ONLY ONE of 3 types of forms (FormA, FormB, FormC). Given a QuerySet of Form, is there any way I can recover which type of Form (A, B or C) each one is?
I would need to get a better understanding of your actual use case to know whether this is a good option for you or not, but in these situations, I would first suggest using model inheritance instead of a one to one field. The code you have there is basically doing what multi-table inheritance already does.
Take a read through the inheritance docs first and make sure that multi-table inheritance makes sense for you compared to the other options provided by Django. If you do wish to continue with multi-table inheritance, I would suggest taking a look at InheritanceManager from django-model-utils.
At this point (if using InheritanceManager), you would be able to use isinstance.
for form in Form.objects.select_subclasses():
    if isinstance(form, FormA):
        ...  # do stuff
This might sound like a lot of extra effort but IMO it would reduce the moving parts (and custom code) and make things easier to deal with while still handling the functionality you need.
You can check it by name or isinstance.
a = FormA()
print(a.__class__)
print(a.__class__.__name__)
print(isinstance(a, FormA))
outputs:
<class __main__.FormA at 0xsomeaddress>
'FormA'
True
------------------- EDIT -----------------
Ok based on your comment, you just want to know which instance is assigned to your main Form.
So you can do something like this:
if hasattr(form, 'forma'):
    # do something
    pass
elif hasattr(form, 'formb'):
    # do something else
    pass
elif hasattr(form, 'formc'):
    # do something else
    pass
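The hasattr check works because Django's reverse one-to-one accessor raises an exception that subclasses AttributeError when no related row exists, so hasattr returns False. The sketch below imitates that behaviour in plain Python; FakeForm and form_type are hypothetical stand-ins for illustration, not Django classes.

```python
class FakeForm:
    """Stand-in for a Form row; not a real Django model."""

    def __init__(self, children):
        # children maps accessor name -> related object, e.g. {"formb": ...}
        self._children = children

    def __getattr__(self, name):
        # Only called when normal lookup fails, like a missing reverse accessor.
        try:
            return self.__dict__["_children"][name]
        except KeyError:
            raise AttributeError(name)


def form_type(form, accessors=("forma", "formb", "formc")):
    """Return the first accessor name that resolves, or None."""
    for name in accessors:
        if hasattr(form, name):
            return name
    return None


print(form_type(FakeForm({"formb": object()})))
print(form_type(FakeForm({})))
```

Keep in mind that with real models each failed hasattr call can cost a database query unless the related rows were loaded up front, for example with select_related.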
After investigating a bit I came up with this
for form in forms:
    # reduces fields to those of OneToOne types
    one_to_ones = [field for field in form._meta.get_fields() if field.one_to_one]
    for o in one_to_ones:
        if hasattr(form, o.name):
            # do something
            pass
Might have some drawbacks (maybe bad runtime) but serves its purpose for now.
Ideas to improve this are appreciated

How to manage concurrency in queries?

I have this code:
if LikedSlot.objects.filter(restaurant__id=r.id, user__id=u.id).count() == 0:
    l = LikedSlot.objects.create(restaurant=r, user=u)
So the idea is to create a new LikedSlot only if the user hasn't liked the restaurant before, but I have a race condition because two requests can both get True on the first line if they reach it at the same time.
I tried the following but it doesn't seem to fix the issue either:
from django.db import transaction

with transaction.atomic():
    if LikedSlot.objects.filter(restaurant__id=r.id, user__id=u.id).count() == 0:
        l = LikedSlot.objects.create(restaurant=r, user=u)
Do you have an idea how to fix this?
I suggest relying on a database-level uniqueness constraint in such cases. Change your model so that the restaurant + user pair is unique:
class LikedSlot(models.Model):
    ...
    class Meta:
        unique_together = ('restaurant', 'user',)
This way the database will prevent duplicate records from being created.
After making this change, you can also use the built-in get_or_create function instead of checking for duplicates yourself:
liked_slot, created = LikedSlot.objects.get_or_create(restaurant=r, user=u)
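As a rough illustration of why the constraint closes the race, here is a sketch using the standard-library sqlite3 module rather than Django. The liked_slot table and the like() helper are assumptions made for this example. The point is that the database itself rejects the second insert, so the check and the write cannot be separated by another request.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE liked_slot (
        id INTEGER PRIMARY KEY,
        restaurant_id INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        UNIQUE (restaurant_id, user_id)
    )
""")


def like(conn, restaurant_id, user_id):
    """Insert a like; return True if created, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO liked_slot (restaurant_id, user_id) VALUES (?, ?)",
                (restaurant_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique constraint fired: the pair was already stored.
        return False


first = like(conn, 1, 7)
second = like(conn, 1, 7)
count = conn.execute("SELECT COUNT(*) FROM liked_slot").fetchone()[0]
print(first, second, count)
```

This is the same idea get_or_create relies on: try the insert and treat an integrity error as "already exists" instead of trusting an earlier count.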

Django Model inheritance for efficient code

I have a Django app that uses an Abstract Base Class ('Answer') and creates different Answers depending on the answer_type required by the Question objects. (This project started life as the Polls tutorial). Question is now:
class Question(models.Model):
    ANSWER_TYPE_CHOICES = (
        ('CH', 'Choice'),
        ('SA', 'Short Answer'),
        ('LA', 'Long Answer'),
        ('E3', 'Expert Judgement of Probabilities'),
        ('E4', 'Expert Judgment of Values'),
        ('BS', 'Brainstorms'),
        ('FB', 'Feedback'),
    )
    answer_type = models.CharField(max_length=2,
                                   choices=ANSWER_TYPE_CHOICES,
                                   default='SA')
    question_text = models.CharField(max_length=200, default="enter a question here")
And Answer is:
class Answer(models.Model):
    """
    Answer is an abstract base class which ensures that question and user are
    always defined for every answer
    """
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE, default=1)

    class Meta:
        abstract = True
        ordering = ['user']
At the moment, I have a single method in Answer (overwriting get_or_update_answer()) with type-specific instructions to look in the right table and collect or create the right type of object.
    @classmethod
    def get_or_update_answer(self, user, question, submitted_value={}, pk_ans=None):
        """
        this replaces get_or_update_answer with appropriate handling for all
        different Answer types. This allows the views answer and page_view to get
        or create answer objects for every question type calling this function.
        """
        if question.answer_type == 'CH':
            if not submitted_value:
                # by default, select the top of a set of radio buttons
                selected_choice = question.choice_set.first()
                answer, _created = Vote.objects.get_or_create(
                    user=user,
                    question=question,
                    defaults={'choice': selected_choice})
            else:
                selected_choice = question.choice_set.get(pk=submitted_value)
                answer = Vote.objects.get(user=user, question=question)
                answer.choice = selected_choice
        elif question.answer_type == 'SA':
            if not submitted_value:
                submitted_value = ""
                answer, _created = Short_Answer.objects.get_or_create(
                    user=user,
                    question=question,
                    defaults={'short_answer': submitted_value})
            else:
                answer = Short_Answer.objects.get(
                    user=user,
                    question=question)
                answer.short_answer = hashtag_cleaner(submitted_value['short_answer'])
etc... etc... (similar handling for five more types)
By putting all this logic in 'models.py', I can load user answers for a page_view for any number of questions with:
for question in page_question_list:
    answers[question] = Answer.get_or_update_answer(user, question, submitted_value, pk_ans)
I believe there is a more Pythonic way to design this code - something that I haven't learned to use, but I'm not sure what. Something like interfaces, so that each object type can implement its own version of Answer.get_or_update_answer(), and Python will use the version appropriate for the object. This would make extending 'models.py' a lot neater.
I've revisited this problem recently, replaced one or two hundred lines of code with five or ten, and thought it might one day be useful to someone to find what I did here.
There are several elements to the problem I had: first, many answer types to be created, saved and retrieved when required; second, the GET vs POST dichotomy (and my idiosyncratic solution of always creating an answer and sending it to a form); third, some of the types have different logic (the Brainstorm can have multiple answers per user, and the FeedBack does not even need a response: if it is created for a user, it has been presented). These elements probably obscured some opportunities to remove repetition, which makes the visitor pattern quite appropriate.
Solution for elements 1 & 2
A dictionary that maps question.answer_type codes to the relevant Answer sub-class is created in views.py (because it's hard to place it in models.py and resolve dependencies):
# views.py:
ANSWER_CLASS_DICT = {
    'CH': Vote,
    'SA': Short_Answer,
    'LA': Long_Answer,
    'E3': EJ_three_field,
    'E4': EJ_four_field,
    'BS': Brainstorm,
    'FB': FB,
}
Then I can get the class of Answer that I want 'get_or_created' for any question with:
ANSWER_CLASS_DICT[question.answer_type]
I pass it as a parameter to the class method:
# models.py:
def get_or_update_answer(self, user, question, Cls, submitted_value=None, pk_ans=None):
    if not submitted_value:
        answer, _created = Cls.objects.get_or_create(user=user, question=question)
    elif isinstance(submitted_value, dict):
        answer, _created = Cls.objects.get_or_create(user=user, question=question)
        for key, value in submitted_value.items():
            setattr(answer, key, value)
    else:
        pass
So the same six lines of code handle get_or_creating any Answer, whether submitted_value is None (GET) or set (POST).
Solution for element 3
The solution for element three has been to extend the model to separate at least three types of handling for users revisiting the same question:
'S' - single, which allows them to record only one answer, revisit and amend the answer, but never to give two different answers.
'T' - tracked, which allows them to update their answer every time, but makes the history of what their answer was available (e.g. to researchers.)
'M' - multiple, which allows many answers to be submitted to a question.
Still bug-fixing after all these changes, so I won't post code.
Next feature: compound questions and question templates, so people can use the admin screen to make their own answer types.
Based on what you've shown, you're most of the way to reimplementing the Visitor pattern, which is a pretty standard way of handling this sort of situation (you have a bunch of related subclasses, each needing its own handling logic, and want to iterate over instances of them and do something with each).
I'd suggest taking a look at how that pattern works, and perhaps implementing it more explicitly.
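The question asks for a way to let each answer type implement its own handling. A minimal plain-Python sketch of that idea follows. Note that this is ordinary method overriding with a shared classmethod, not a full Visitor implementation. The in-memory _store dict, the apply() hook, and the simplified ShortAnswer and Vote classes are all assumptions that stand in for the real models and the database.

```python
class Answer:
    """Plain-Python stand-in for the abstract Answer base class."""

    _store = {}  # (class name, user, question) -> instance; replaces the DB

    def __init__(self, user, question):
        self.user = user
        self.question = question

    @classmethod
    def get_or_update_answer(cls, user, question, submitted_value=None):
        key = (cls.__name__, user, question)
        answer = cls._store.get(key)
        if answer is None:
            answer = cls._store[key] = cls(user, question)
        if submitted_value is not None:
            answer.apply(submitted_value)  # type-specific hook
        return answer

    def apply(self, submitted_value):
        raise NotImplementedError


class ShortAnswer(Answer):
    short_answer = ""

    def apply(self, submitted_value):
        self.short_answer = submitted_value["short_answer"].strip()


class Vote(Answer):
    choice = None

    def apply(self, submitted_value):
        self.choice = int(submitted_value)


a = ShortAnswer.get_or_update_answer("alice", "q1", {"short_answer": " hi "})
b = ShortAnswer.get_or_update_answer("alice", "q1")
v = Vote.get_or_update_answer("alice", "q2", "3")
print(a is b, b.short_answer, v.choice)
```

The shared get-or-create logic lives once in the base class, and adding a new answer type only means writing one more apply() method.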

Is it possible to do arithmetic operation on OuterRef expression?

I'm building an Experience booking system in django and I've the following models.
class Experience(models.Model):
    name = models.CharField(max_length=20)
    capacity = models.IntegerField()

class Booking(models.Model):
    experience = models.ForeignKey(Experience)
    occupied = models.IntegerField()
Each experience has a limited capacity, and when a user makes a booking, it is added to the Booking table with the occupied number. Now how can I find the experiences that are not completely occupied?
available_experiences = Experience.objects.all().exclude(id__in=Subquery(Booking.objects.filter(occupied__gt=OuterRef('capacity') - request_persons).values_list('experience', flat=True)))
Here, request_persons is the number of vacancies required in an experience. This is not working and shows an error like 'ResolvedOuterRef' object has no attribute 'relabeled_clone'. Is it possible to do arithmetic on an OuterRef() expression, as with F()?
Without the request_persons term, the above code works. Why is it not possible to combine a value with the OuterRef() expression?
NOTE: My actual code is much more complex, and it would be really great to get an answer without modifying the entire structure of the above code.
By doing arithmetic operations in the query referenced by OuterRef() directly you can resolve this issue:
available_experiences = Experience.objects.annotate(
    total=models.F('capacity') - request_persons
).exclude(
    id__in=Subquery(Booking.objects.filter(
        occupied__gt=OuterRef('total')
    ).values_list('experience', flat=True))
)
If you find another way without modifying your structure or using RawSQL() or .extra(), let us know!
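For readers who want to see what the query is meant to express, here is a small sketch in plain SQL through Python's sqlite3 module. The table names, column names, and sample rows are assumptions, and the statement is an illustration of a correlated subquery with arithmetic on the outer column, not the SQL Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE experience (id INTEGER PRIMARY KEY, name TEXT, capacity INTEGER);
    CREATE TABLE booking (
        id INTEGER PRIMARY KEY,
        experience_id INTEGER REFERENCES experience(id),
        occupied INTEGER
    );
    INSERT INTO experience VALUES (1, 'kayak', 10), (2, 'hike', 4);
    INSERT INTO booking VALUES (1, 1, 3), (2, 2, 3);
""")

request_persons = 2

# Exclude experiences that have a booking with occupied > capacity - request_persons.
rows = conn.execute(
    """
    SELECT e.id FROM experience AS e
    WHERE e.id NOT IN (
        SELECT b.experience_id FROM booking AS b
        WHERE b.occupied > e.capacity - ?
    )
    ORDER BY e.id
    """,
    (request_persons,),
).fetchall()

available_ids = [row[0] for row in rows]
print(available_ids)
```

The inner query refers to e.capacity from the outer row, which is the role OuterRef() plays in the ORM version.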
This seems to be fixed in Django 2.0: https://github.com/django/django/pull/9722/files
The fix can be backported to 1.11.x in a similar fashion:
from django.db.models.expressions import ResolvedOuterRef

if not hasattr(ResolvedOuterRef, 'relabeled_clone'):
    ResolvedOuterRef.relabeled_clone = lambda self, relabels: self

Reorder Django QuerySet by dynamically added field

I have a piece of code which fetches a QuerySet from the DB and then appends a new calculated field to every object in the QuerySet. It's not an option to add this field via annotation (because it's legacy and because the calculation is based on other, already pre-fetched data).
Like this:
from django.db import models

class Human(models.Model):
    name = models.CharField()
    surname = models.CharField()

def calculate_new_field(s):
    return len(s.name)*42

people = Human.objects.filter(id__in=[1,2,3,4,5])
for s in people:
    s.new_column = calculate_new_field(s)
# people.somehow_reorder(new_order_by=new_column)
So now all people in the QuerySet have a new column, and I want to order these objects by the new_column field. order_by() will obviously not work, since it is a database operation. I understand that I can pass them as a sorted list, but there are a lot of templates and other logic which expect a QuerySet-like interface, with its methods and so on, from this object.
So the question is: is there some not-too-dirty way to reorder an existing QuerySet by a dynamically added field, or to create a new QuerySet-like object with this data? I believe I'm not the only one who has faced this problem and that it's already solved within Django. But I can't find anything (except for adding third-party libs, and this is not an option either).
Conceptually, the QuerySet is not a list of results, but the "instructions to get those results". It's lazily evaluated and also cached. The internal attribute of the QuerySet that keeps the cached results is qs._result_cache
So, the for s in people sentence is forcing the evaluation of the query and caching the results.
You could, after that, sort the results by doing:
from operator import attrgetter
people._result_cache.sort(key=attrgetter('new_column'))
But, after evaluating a QuerySet, it makes little sense (in my opinion) to keep the QuerySet interface, as many of the operations will cause a reevaluation of the query. From this point on you should be dealing with a list of Models
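A minimal sketch of the plain-list approach, without Django: the SimpleNamespace objects below are stand-ins for already-evaluated model instances, and the names are made up for illustration.

```python
from operator import attrgetter
from types import SimpleNamespace

# Stand-ins for evaluated model instances (not real Django objects).
people = [
    SimpleNamespace(name="Alexander"),
    SimpleNamespace(name="Bo"),
    SimpleNamespace(name="Carla"),
]

# Attach the computed attribute, as in the question.
for person in people:
    person.new_column = len(person.name) * 42

# sorted() returns a new list and leaves the original order untouched.
people_sorted = sorted(people, key=attrgetter("new_column"))
names = [p.name for p in people_sorted]
print(names)
```

With real models, calling list(people) after the loop gives the same kind of list, and templates that only iterate over the objects should not notice the difference.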
You can try functions.Length:
from django.db.models.functions import Length
qs = Human.objects.filter(id__in=[1,2,3,4,5])
qs = qs.annotate(reorder=Length('name') * 42).order_by('reorder')
