I'm trying to get into Django's annotate, but can't quite figure out how it works exactly.
I've got a function where I'd like to annotate a queryset of customers, filter them, and return the number of customers:
def my_func(self):
    received_signatures = self.customer_set.annotate(Count('registrations')).filter().count()
Now for the filter part, that's where I have a problem figuring out how to do that. The thing I'd like to filter for is my received_signatures, which is a method defined in my customer.py:
def received_signatures(self):
    signatures = [reg.brought_signature for reg in self.registrations.all() if reg.status == '1_YES']
    if len(signatures):
        return all(signatures)
    else:
        return None
brought_signature is a DB field.
So how can I annotate the queryset, filter for the received_signatures and then return a number?
Relevant Model Information:
class Customer(models.Model):
    brought_signature = models.BooleanField(u'Brought Signature', default=False)
class Registration(models.Model):
    brought_signature = models.BooleanField(u'Brought Signature', default=False)
    status = models.CharField(u'Status', max_length=10, choices=STATUS_CHOICES, default='4_RECEIVED')
Note: A participant and a registration can both have brought_signature. I have a setting in my program which allows me to either A) mark only brought_signature on my participant (which means he brought the signature for ALL his registrations) or B) mark brought_signature for every registration he has.
For this case option B) is relevant. With my received_signatures I check whether the customer has brought the signature for every registration where the status is "1_YES", and I want to count all the customers who did so and return a number (which I then use in another function for a pygal chart).
If I understand it correctly, you want to check that all the Registrations for a given Customer with status == '1_YES' have the attribute .brought_signature = True, and that there is at least one such Registration. There are several approaches for this.
We can do this by writing it like:
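# requires: from django.db.models import Min
# the lookup name 'registrations' follows the related name used in the question
received_signatures = self.customer_set.filter(
    registrations__status='1_YES'
).annotate(
    minb=Min('registrations__brought_signature')
).filter(
    minb__gt=0
).count()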
So what we do here is first .filter(..) on the registrations that have status 1_YES; next we calculate for every customer a value minb that is the minimum of brought_signature over these Registrations. If one of the related Registrations has brought_signature False (in a database that is usually 0), then Min(..) is False as well. If all brought_signatures are True (in a database that is usually 1), then the result is 1. We can then filter on the fact that minb should be greater than 0.
So Customers with no Registration will not be counted; Customers with no Registration with status 1_YES will not be counted; Customers that have a Registration with status 1_YES but with brought_signature = False will not be counted. Only Customers for which all Registrations that have status 1_YES (not necessarily all Registrations) have brought_signature = True are counted.
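To make the four cases above concrete, here is a small plain-Python model of the same rule. It does not touch the database; the dictionaries and the helper name received_all_signatures are made up for illustration and only mirror the status and brought_signature fields from the question.

```python
def received_all_signatures(registrations):
    """Return True if there is at least one '1_YES' registration and
    every such registration has brought_signature set."""
    flags = [r["brought_signature"] for r in registrations if r["status"] == "1_YES"]
    # An empty list means "no 1_YES registration", which must not be counted.
    return bool(flags) and min(flags) > 0


# Hypothetical customers, one per case described above.
customers = {
    "no_registrations": [],
    "no_yes_status": [{"status": "4_RECEIVED", "brought_signature": True}],
    "one_missing": [
        {"status": "1_YES", "brought_signature": True},
        {"status": "1_YES", "brought_signature": False},
    ],
    "all_brought": [
        {"status": "1_YES", "brought_signature": True},
        {"status": "4_RECEIVED", "brought_signature": False},
    ],
}

counted = [name for name, regs in customers.items() if received_all_signatures(regs)]
```

Only the last customer ends up in counted, which is what the minb > 0 filter is meant to express.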
Hello, I have a question regarding django-filters. My problem is:
I have two classes defined in my models.py that are:
class Volcano(models.Model):
    vd_id = models.AutoField("ID, Volcano Identifier (Index)",
                             primary_key=True)
    [...]
class VolcanoInformation(models.Model):
    # Primary key
    vd_inf_id = models.AutoField("ID, volcano information identifier (index)",
                                 primary_key=True)
    # Other attributes
    vd_inf_numcal = models.IntegerField("Number of calderas")
    [...]
    # Foreign key(s)
    vd_id = models.ForeignKey(Volcano, null=True, related_name='vd_inf_vd_id',
                              on_delete=models.CASCADE)
The two of them are linked through the vd_id attribute.
I want to develop a search tool that allows the user to search a volcano by its number of calderas (vd_inf_numcal).
I am using django-filters and for now here's my filters.py:
from .models import *
import django_filters
class VolcanoFilter(django_filters.FilterSet):
    vd_name = django_filters.ModelChoiceFilter(
        queryset=Volcano.objects.values_list('vd_name', flat=True),
        widget=forms.Select, label='Volcano name',
        to_field_name='vd_name',
    )
    vd_inf_numcal = django_filters.ModelChoiceFilter(
        queryset=VolcanoInformation.objects.values_list('vd_inf_numcal', flat=True),
        widget=forms.Select, label='Number of calderas',
    )
    class Meta:
        model = Volcano
        fields = ['vd_name', 'vd_inf_numcal']
My views.py is:
def search(request):
    feature_list = Volcano.objects.all()
    feature_filter = VolcanoFilter(request.GET, queryset=feature_list)
    return render(request, 'app/search_list.html', {'filter': feature_filter, 'feature_type': feature_type})
In my application, a dropdown list of the possible number of calderas appears but the search returns no result which is normal because there is no relation between VolcanoInformation.vd_inf_numcal, VolcanoInformation.vd_id and Volcano.vd_id.
It even says "Select a valid choice. That choice is not one of the available choices."
My question is: how could I make this link using django_filters?
I guess I should write some method within the class, but I have absolutely no idea how to do it.
If anyone has the answer, I would be more than thankful!
In general, you need to answer two questions:
What field are we querying against & what query/lookup expressions need to be generated.
What kinds of values should we be filtering with.
These answers are essentially the left hand and right hand side of your .filter() call.
In this case, you're filtering across the reverse side of the Volcano-Volcano Information relationship (vd_inf_vd_id), against the number of calderas (vd_inf_numcal) for a Volcano. Additionally, you want an exact match.
For the values, you'll need a set of choices containing integers.
AllValuesFilter will look at the DB column and generate the choices from the column values. However, the downside is that the choices will not include any missing values, which looks odd when rendered (for example 0, 1, 3 with no 2). You could either adapt this filter, or use a plain ChoiceFilter, generating the values yourself.
from django.db.models import Max
def num_calderas_choices():
    # Get the maximum number of calderas (None if the table is empty, hence the "or 0")
    max_count = VolcanoInformation.objects.aggregate(result=Max('vd_inf_numcal'))['result'] or 0
    # Generate a list of two-tuples for the select dropdown, from 0 to max_count inclusive
    # e.g. [(0, 0), (1, 1), (2, 2), ...]
    return [(n, n) for n in range(max_count + 1)]
class VolcanoFilter(django_filters.FilterSet):
    name = ...
    num_calderas = django_filters.ChoiceFilter(
        # related field traversal (note the connecting '__')
        field_name='vd_inf_vd_id__vd_inf_numcal',
        label='Number of calderas',
        choices=num_calderas_choices
    )
    class Meta:
        model = Volcano
        fields = ['name', 'num_calderas']
Note that I haven't tested the above code myself, but it should be close enough to get you started.
Thanks a lot ! That's exactly what I was looking for ! I didn't understand how the .filter() works.
What I did, for other attributes is to generate the choices but in a different way. For instance if I just wanted to display a list of the available locations I would use:
# Location attribute
loc = VolcanoInformation.objects.values_list('vd_inf_loc', flat=True)
vd_inf_loc = django_filters.ChoiceFilter(
field_name='vd_inf_vd_id__vd_inf_loc',
label='Geographic location',
choices=zip(loc, loc),
)
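One thing to watch with this variant: values_list(..., flat=True) returns one entry per row, so the dropdown can end up with repeated or empty options. A minimal sketch of cleaning the values first, in plain Python so it can be tried without a database (make_choices is a hypothetical helper name, not part of django-filter):

```python
def make_choices(values):
    """Build (value, label) pairs for a select widget from raw column values.

    Duplicates and None are dropped, and the result is sorted so the
    dropdown order is stable.
    """
    unique = sorted({v for v in values if v is not None})
    return [(v, v) for v in unique]


# Hypothetical column values, as values_list('vd_inf_loc', flat=True) might return them.
raw_locations = ["Japan", "Chile", "Japan", None, "Italy"]
location_choices = make_choices(raw_locations)
```

In the filter you would then pass choices=make_choices(loc). Returning a list rather than a zip object also means the choices can be iterated more than once.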
In a Django project, I have these simplified models defined:
class People(models.Model):
    name = models.CharField(max_length=96)
class Event(models.Model):
    name = models.CharField(verbose_name='Nom', max_length=96)
    date_start = models.DateField()
    date_end = models.DateField()
    participants = models.ManyToManyField(to='People', through='Participation')
class Participation(models.Model):
    """Represent the participation of 1 people to 1 event, with information about arrival date and departure date"""
    people = models.ForeignKey(to=People, on_delete=models.CASCADE)
    event = models.ForeignKey(to=Event, on_delete=models.CASCADE)
    arrival_d = models.DateField(blank=True, null=True)
    departure_d = models.DateField(blank=True, null=True)
Now, I need to generate a participation graph: for each single event day, I want the corresponding total number of participations.
Currently, I use this awful code:
def daterange(start, end, include_last_day=False):
    """Return a generator for each date between start and end"""
    days = int((end - start).days)
    if include_last_day:
        days += 1
    for n in range(days):
        yield start + timedelta(n)
class ParticipationGraph(DetailView):
    template_name = 'events/participation_graph.html'
    model = Event
    def get_context_data(self, **kwargs):
        labels = []
        data = []
        for d in daterange(self.object.date_start, self.object.date_end):
            labels.append(formats.date_format(d, 'd/m/Y'))
            total_participation = (self.object.participation_set
                                   .filter(arrival_d__lte=d, departure_d__gte=d).count())
            data.append(total_participation)
        kwargs.update({
            'labels': labels,
            'data': data,
        })
        return super(ParticipationGraph, self).get_context_data(**kwargs)
Obviously, I run a new SQL query for each day between Event.date_start and Event.date_end. Is there a way to get the same result with a reduced number of SQL queries (ideally, only one)?
I tried many aggregation tools from the Django ORM (values(), distinct(), etc.) but I always run into the same issue: I don't have a field with a simple date value, I only have start and end dates (in Event) and departure and arrival dates (in Participation), so I can't find a way to group my results by date.
I agree that the current approach is expensive because, for each day, you are re-querying the DB for participants that you already retrieved earlier. I would instead approach this by doing a one-time query to the DB to get the participants and then use that data to populate your result data structure.
One structural change I would make to your solution is that instead of tracking two lists where each index corresponds to a day and the participation, aggregate the data in a dictionary mapping the day to the number of participants. If we aggregate results this way, we can always convert this to the two-lists at the end if needed.
Here is what my general (pseudo-codeish) approach is:
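def formatDate(d):
    return formats.date_format(d, 'd/m/Y')
def get_context_data(self, **kwargs):
    # initialize the results with the dates in question
    result = {}
    for d in daterange(self.object.date_start, self.object.date_end):
        result[formatDate(d)] = 0
    # for each participation, add 1 to each date that the person is there
    for participation in self.object.participation_set.all():
        # arrival_d and departure_d are nullable; skip incomplete rows
        if participation.arrival_d is None or participation.departure_d is None:
            continue
        # include the departure day, matching departure_d__gte in the original query
        for d in daterange(participation.arrival_d, participation.departure_d, include_last_day=True):
            key = formatDate(d)
            # ignore days outside the event's own date range
            if key in result:
                result[key] += 1
    # if needed, convert result to the appropriate two-list format here
    kwargs.update({
        'participation_amounts': result
    })
    return super(ParticipationGraph, self).get_context_data(**kwargs)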
In terms of performance, both approaches do the same number of operations. In your approach, for every day, d, you filter over every participant, p. Thus, the number of operations is O(dp). In my approach, for each participant I go through every day they attended (worst case, every day, d). Thus, it is also O(dp).
The reason to prefer my approach is what you pointed out. It only hits the database once to retrieve the participant list. Thus, it is less dependent on network latency. It does sacrifice some of the perf benefits that you get from SQL queries over python code. However, the python code is not too complex and should be fairly easy to process for events that even have hundreds of thousands of people.
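If the O(dp) loop ever becomes a concern, one possible variation (not what the code above does) is a difference array: record +1 on each arrival day and -1 on the day after each departure, then take a running sum. That is O(d + p). The sketch below is plain Python and assumes the stays have already been fetched as (arrival, departure) date pairs; participants_per_day is a hypothetical name.

```python
from datetime import date, timedelta


def participants_per_day(event_start, event_end, stays):
    """Count participants for each event day (both ends inclusive).

    `stays` is an iterable of (arrival, departure) date pairs.
    """
    n_days = (event_end - event_start).days + 1
    delta = [0] * (n_days + 1)
    for arrival, departure in stays:
        # Clip each stay to the event window; skip stays that miss it entirely.
        first = max((arrival - event_start).days, 0)
        last = min((departure - event_start).days, n_days - 1)
        if first > last:
            continue
        delta[first] += 1
        delta[last + 1] -= 1
    # A running sum over the deltas gives the head count on each day.
    counts, running = [], 0
    for offset in range(n_days):
        running += delta[offset]
        counts.append((event_start + timedelta(days=offset), running))
    return counts
```

The pairs could come from a single participation_set.values_list('arrival_d', 'departure_d') query, with rows containing None filtered out beforehand.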
I saw this question a few days ago and honoured it with an upvote, since it is really well written and the problem is very interesting. Finally I found some time to dedicate to its solution.
Django is a variation of Model-View-Controller called Model-Template-View. My approach thus follows the paradigm "fat models and thin controllers" (or, translated to Django terms, "fat models and thin views").
Here is how I would rewrite the models:
import pandas
from django.db import models
from django.utils.functional import cached_property
class Person(models.Model):
    name = models.CharField(max_length=96)
class Event(models.Model):
    name = models.CharField(verbose_name='Nom', max_length=96)
    date_start = models.DateField()
    date_end = models.DateField()
    participants = models.ManyToManyField(to='Person', through='Participation')
    @cached_property
    def days(self):
        days = pandas.date_range(self.date_start, self.date_end).tolist()
        return [day.date() for day in days]
    @cached_property
    def number_of_participants_per_day(self):
        number_of_participants = []
        participations = self.participation_set.all()
        for day in self.days:
            count = len([par for par in participations if day in par.days])
            number_of_participants.append((day, count))
        return number_of_participants
class Participation(models.Model):
    people = models.ForeignKey(to=Person, on_delete=models.CASCADE)
    event = models.ForeignKey(to=Event, on_delete=models.CASCADE)
    arrival_d = models.DateField(blank=True, null=True)
    departure_d = models.DateField(blank=True, null=True)
    @cached_property
    def days(self):
        days = pandas.date_range(self.arrival_d, self.departure_d).tolist()
        return [day.date() for day in days]
All calculations are placed in the models. Information that depends on the data stored in the database is made available as cached_property.
Let's see an example for Event:
djangocon = Event.objects.create(
name='DjangoCon Europe 2018',
date_start=date(2018,5,23),
date_end=date(2018,5,28)
)
djangocon.days
>>> [datetime.date(2018, 5, 23),
datetime.date(2018, 5, 24),
datetime.date(2018, 5, 25),
datetime.date(2018, 5, 26),
datetime.date(2018, 5, 27),
datetime.date(2018, 5, 28)]
I used pandas for generating the date range, which is probably an overkill for your application, but it has nice syntax and is good for demonstrational purposes. You can generate the date range in your own way.
Getting this result required only one query. The days property is available like any other field.
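If you would rather not pull in pandas just for the date range, a standard-library version could look like the sketch below. The function name date_range is my own; like pandas.date_range with its default settings, it includes both the start and the end date.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Return every date from start to end, both inclusive."""
    return [start + timedelta(days=n) for n in range((end - start).days + 1)]


# Same dates as the DjangoCon example above.
days = date_range(date(2018, 5, 23), date(2018, 5, 28))
```

Inside the days properties you would then return date_range(self.date_start, self.date_end) or date_range(self.arrival_d, self.departure_d) directly, since the items are already datetime.date objects.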
The same thing I made in Participation, here are some examples:
antwane = Person.objects.create(name='Antwane')
rohan = Person.objects.create(name='Rohan Varma')
cezar = Person.objects.create(name='cezar')
They all want to visit DjangoCon Europe in 2018, but not all of them are attending all days:
p1 = Participation.objects.create(
people=antwane,
event=djangocon,
arrival_d=date(2018,5,23),
departure_d=date(2018,5,28)
)
p2 = Participation.objects.create(
people=rohan,
event=djangocon,
arrival_d=date(2018,5,23),
departure_d=date(2018,5,26)
)
p3 = Participation.objects.create(
people=cezar,
event=djangocon,
arrival_d=date(2018,5,25),
departure_d=date(2018,5,28)
)
Now we want to see how many participants there are for every day the event is going on. We track the number of SQL queries too.
from django.db import connection
djangocon = Event.objects.get(pk=1)
djangocon.number_of_participants_per_day
>>> [(datetime.date(2018, 5, 23), 2),
(datetime.date(2018, 5, 24), 2),
(datetime.date(2018, 5, 25), 3),
(datetime.date(2018, 5, 26), 3),
(datetime.date(2018, 5, 27), 2),
(datetime.date(2018, 5, 28), 2)]
connection.queries
>>>[{'time': '0.000', 'sql': 'SELECT "participants_event"."id", "participants_event"."name", "participants_event"."date_start", "participants_event"."date_end" FROM "participants_event" WHERE "participants_event"."id" = 1'},
{'time': '0.000', 'sql': 'SELECT "participants_participation"."id", "participants_participation"."people_id", "participants_participation"."event_id", "participants_participation"."arrival_d", "participants_participation"."departure_d" FROM "participants_participation" WHERE "participants_participation"."event_id" = 1'}]
There are two queries. The first one fetches the Event object and the second fetches the participations for that event; the number of participants per day is then computed in Python.
Now it's up to you to use it in your views as you please. And thanks to the cached properties you won't need to repeat the database query to get the result.
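To illustrate the caching behaviour without a database, here is a tiny sketch using functools.cached_property from the standard library (Python 3.8+), which behaves much like django.utils.functional.cached_property for this purpose. The Report class and its counter are invented for the demonstration.

```python
from functools import cached_property


class Report:
    def __init__(self):
        self.computations = 0

    @cached_property
    def totals(self):
        # Stands in for an expensive database query.
        self.computations += 1
        return [2, 2, 3]


report = Report()
first = report.totals   # computed here
second = report.totals  # served from the instance cache
```

After both accesses, report.computations is still 1. The cache lives on the instance, so a freshly fetched Event object will run its query again.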
You can follow the same principle and add a property that lists all participants for each day of an event. It could look like:
class Event(models.Model):
    # ... snip ...
    @cached_property
    def participants_per_day(self):
        participants = []
        participations = self.participation_set.all().select_related('people')
        for day in self.days:
            people = [par.people for par in participations if day in par.days]
            participants.append((day, people))
        return participants
    # refactor the number of participants per day
    @cached_property
    def number_of_participants_per_day(self):
        return [(day, len(people)) for day, people in self.participants_per_day]
I hope you like this solution.
I am trying to create a custom primary_key within my helpdesk/models.py that I will use to track our help desk tickets. I am in the process of writing a small ticketing system for our office.
Maybe there is a better way? Right now I have:
id = models.AutoField(primary_key=True)
This increments in the database as 1, 2, 3, 4....50...
I want to take this id assignment and then use it within a function to combine it with some additional information like the date, and the name, 'HELPDESK'.
The code I was using is as follows:
id = models.AutoField(primary_key=True)
def build_id(self, id):
    join_dates = str(datetime.now().strftime('%Y%m%d'))
    return (('HELPDESK-' + join_dates) + '-' + str(id))
ticket_id = models.CharField(max_length=15, default=(build_id(None, id)))
The idea is that the entries in the database would be:
HELPDESK-20170813-1
HELPDESK-20170813-2
HELPDESK-20170814-3
...
HELPDESK-20170901-4
...
HELPDESK-20180101-50
...
I want to then use this as the ForeignKey to link the help desk ticket to some other models in the database.
Right now what's coming back is:
HELPDESK-20170813-<django.db.models.fields.AutoField>
This post works - Custom Auto Increment Field Django Curious if there is a better way. If not, this will suffice.
This works for me. It's a slightly modified version from Custom Auto Increment Field Django from above.
models.py
def increment_helpdesk_number():
    last_helpdesk = helpdesk.objects.all().order_by('id').last()
    if not last_helpdesk:
        return 'HEL-' + str(datetime.now().strftime('%Y%m%d-')) + '0000'
    help_id = last_helpdesk.help_num
    help_int = help_id[13:17]
    new_help_int = int(help_int) + 1
    new_help_id = 'HEL-' + str(datetime.now().strftime('%Y%m%d-')) + str(new_help_int).zfill(4)
    return new_help_id
It's called like this:
help_num = models.CharField(max_length=17, unique=True, default=increment_helpdesk_number, editable=False)
It gives you the following:
HEL-20170815-0000
HEL-20170815-0001
HEL-20170815-0002
...
The numbering doesn't start over after each day, which is something I may look at doing. The more I think about it, however, the less sure I am that I even need the date there, as I have a creation date field in the model already. So I may just change it to:
HEL-000000000
HEL-000000001
HEL-000000002
...
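For the daily restart mentioned above, the string handling can be separated from the database lookup. The following is only a sketch under my own assumptions: next_ticket_id is a hypothetical helper, it expects the most recent help_num (or None) to be passed in, and it does nothing about two tickets being created at the same moment (the unique=True constraint would reject the duplicate in that case).

```python
from datetime import date


def next_ticket_id(last_id, today, prefix="HEL"):
    """Return the next ID in the form PREFIX-YYYYMMDD-NNNN.

    The counter restarts at 0000 when `today` differs from the date in `last_id`.
    """
    stamp = today.strftime("%Y%m%d")
    if last_id:
        # Split from the right so only the date and counter parts are parsed.
        _, last_stamp, last_number = last_id.rsplit("-", 2)
        if last_stamp == stamp:
            return f"{prefix}-{stamp}-{int(last_number) + 1:04d}"
    return f"{prefix}-{stamp}-0000"


same_day = next_ticket_id("HEL-20170815-0002", date(2017, 8, 15))
new_day = next_ticket_id("HEL-20170815-0002", date(2017, 8, 16))
```

Here same_day continues the sequence for 15 August, while new_day starts again at 0000 for the 16th.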
I am trying to get objects sorted. This is my code:
ratings = Rate.objects.order_by(sortid)
locations = Location.objects.filter(locations_rate__in=ratings).order_by('locations_rate').distinct('id')
this is my model:
class Rate(models.Model):
von_location= models.ForeignKey(Location,related_name="locations_rate")
price_leistung = models.IntegerField(max_length=5,default=00)
bewertung = models.IntegerField(max_length=3,default=00)
How can I get all Locations in the same order as ratings?
What I have above is not working.
EDIT:
def sort(request):
    sortid = request.GET.get('sortid')
    ratings = Rate.objects.all()
    locations = Location.objects.filter(locations_rate__in=ratings).order_by('locations_rate__%s' % sortid).distinct('id')
    if request.is_ajax():
        template = 'resultpart.html'
        return render_to_response(template,{'locs':locations},context_instance=RequestContext(request))
You must specify a field to use for sorting the Rate objects, for example:
ratings = Rate.objects.all()
locations = Location.objects.filter(
locations_rate__in=ratings
).order_by('locations_rate__%s' % sortid).distinct('id')
You do not need to sort ratings beforehand.
The documentation provides example of use of order_by on related fields.
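Since sortid comes straight from request.GET, it may be missing or may name a field that does not exist on Rate, and order_by() raises an error for unknown fields. A minimal sketch of validating it first, in plain Python; ALLOWED_SORT_FIELDS, build_ordering and the choice of default are my own assumptions based on the Rate model shown in the question.

```python
# Fields of Rate that the client is allowed to sort by (assumed from the model above).
ALLOWED_SORT_FIELDS = {"price_leistung", "bewertung"}


def build_ordering(sortid, default="bewertung"):
    """Return an order_by() argument for Location, rejecting unknown fields."""
    descending = bool(sortid) and sortid.startswith("-")
    field = sortid.lstrip("-") if sortid else default
    if field not in ALLOWED_SORT_FIELDS:
        # Fall back to a known field instead of passing user input through.
        field, descending = default, False
    return ("-" if descending else "") + "locations_rate__" + field


ordering = build_ordering("-price_leistung")
```

The view would then call .order_by(build_ordering(sortid)) instead of interpolating sortid directly.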