How to mix Sum and arithmetic with Django queryset - python

I've got this in my code:
forcasting_order = ProductLine.objects.values('product_name', 'product_id')\
    .filter(delivery_date__date__in=ref_days, order_date__date=Func(F('delivery_date'), function="date"))\
    .annotate(quantity_to_order=Sum('quantity'))\
    .order_by('product_id')
for x in forcasting_order:
    x['quantity_to_order'] = round(x['quantity_to_order'] / Command.avg_on_x_week)
Is there a way to divide Sum('quantity') by a constant integer (Command.avg_on_x_week here) inside the query?

Try something like this:
from django.db.models import Value
.annotate(quantity_to_order=Sum('quantity') / Value(Command.avg_on_x_week))
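One thing worth checking with this approach: if quantity is an integer column and the divisor is a plain integer, some databases (PostgreSQL, for example) perform integer division, which truncates, whereas the Python loop in the question rounds. A small plain-Python sketch of the difference, using made-up totals and an assumed avg_on_x_week of 4:

```python
# Hypothetical summed quantities; not taken from the question's data.
totals = [10, 14, 15]
avg_on_x_week = 4  # assumed constant, for illustration only

# What the Python loop in the question computes: round to nearest
# (Python's round() sends exact halves to the even neighbour).
rounded = [round(t / avg_on_x_week) for t in totals]

# What integer / integer division yields on PostgreSQL: truncation.
# For non-negative values, floor division gives the same result.
truncated = [t // avg_on_x_week for t in totals]

print(rounded, truncated)
```

If the rounded value matters, dividing by a float or Decimal constant and applying a rounding function (Django ships a Round database function) may be needed; which one applies depends on the field types and the database.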

Related

Complex Django query involving an ArrayField & coefficients

On the one hand, let's consider this Django model:
from uuid import uuid4
from django.contrib.postgres.fields import ArrayField
from django.db import models

class Entry(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
    value = models.DecimalField(decimal_places=12, max_digits=22)
    items = ArrayField(base_field=models.UUIDField(null=False, blank=False), default=list)
On the other hand, let's say we have this dictionary:
coefficients = {item1_uuid: item1_coef, item2_uuid: item2_coef, ... }
Entry.value is intended to be distributed among the Entry.items according to coefficients.
Using Django ORM, what would be the most efficient way (in a single SQL query) to get the sum of the values of my Entries for a single Item, given the coefficients?
For instance, for item1 below I want to get 168.5454..., that is to say 100 * 1 + 150 * (0.2 / (0.2 + 0.35)) + 70 * 0.2.
Entry ID | Value | Items
uuid1 | 100 | [item1_uuid]
uuid2 | 150 | [item1_uuid, item2_uuid]
uuid3 | 70 | [item1_uuid, item2_uuid, item3_uuid]
coefficients = { item1_uuid: Decimal("0.2"), item2_uuid: Decimal("0.35"), item3_uuid: Decimal("0.45") }
Bonus question: how could I adapt my models for this query to run faster? I've deliberately chosen to use an ArrayField rather than a ManyToManyField; was that a bad idea? How can I tell where to add db_index[es] for this specific query?
I am using Python 3.10, Django 4.1 and Postgres 14.
I've found a solution to my own question, but I'm sure someone here could come up with a more efficient & cleaner approach.
The idea here is to chain the .alias() methods (cf. Django documentation) and the conditional expressions with Case and When in a for loop.
This results in an overly complex query, which at least does work as expected:
from decimal import Decimal
from django.db.models import Case, F, Q, Sum, Value, When

def get_value_for_item(coefficients, item):
    item_coef = coefficients.get(item.pk, Decimal(0))
    if not item_coef:
        return Decimal(0)
    # Entries shared between several items need their value split
    several = Q(items__len__gt=1)
    queryset = (
        Entry.objects
        .filter(items__contains=[item.pk])
        .alias(total=Case(When(several, then=Value(Decimal(0)))))
    )
    # Accumulate the coefficients of every item present in each entry
    for k, v in coefficients.items():
        has_k = Q(items__contains=[k])
        queryset = queryset.alias(total=Case(
            When(several & has_k, then=Value(v) + F("total")),
            default="total",
        ))
    return (
        queryset.annotate(
            coef_applied=Case(
                When(several, then=Value(item_coef) / F("total") * F("value")),
                default="value",
            )
        ).aggregate(Sum("coef_applied", default=Decimal(0)))
    )["coef_applied__sum"]
With the example I gave in my question and for item1, the output of this function is Decimal(168.5454...) as expected.
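For checking the ORM result, the same distribution rule can be written in plain Python. This is only a reference sketch: short strings stand in for the UUIDs, and value_for_item is a hypothetical helper name, not part of the code above.

```python
from decimal import Decimal

# Data from the question, with short strings standing in for the UUIDs.
coefficients = {
    "item1": Decimal("0.2"),
    "item2": Decimal("0.35"),
    "item3": Decimal("0.45"),
}
entries = [
    (Decimal("100"), ["item1"]),
    (Decimal("150"), ["item1", "item2"]),
    (Decimal("70"), ["item1", "item2", "item3"]),
]


def value_for_item(entries, coefficients, item):
    """Sum each entry's share of value for `item`, weighted by coefficients."""
    item_coef = coefficients.get(item, Decimal(0))
    if not item_coef:
        return Decimal(0)
    total = Decimal(0)
    for value, items in entries:
        if item not in items:
            continue
        if len(items) == 1:
            # A single-item entry goes entirely to that item.
            total += value
        else:
            # Share proportional to the item's coefficient among the entry's items.
            weight_sum = sum(
                (coefficients.get(i, Decimal(0)) for i in items), Decimal(0)
            )
            total += value * item_coef / weight_sum
    return total


result = value_for_item(entries, coefficients, "item1")
print(result)
```

Running this against a handful of known entries gives a value to compare with what the database query returns.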

Find all users x miles away in flask + geoalchemy with ORM

I have created the following SQL query to find all users within a mile and it seems to work fine:
SELECT * FROM user
WHERE ST_DWithin(
    user.location,
    ST_MakePoint(-2.242631, 53.480759)::geography,
    1609
);
How can I convert this into a Flask/SQLAlchemy/GeoAlchemy query?
Try something like this:
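from sqlalchemy import cast, func
from geoalchemy2 import Geography
DISTANCE = 100  # meters (1609 is roughly one mile)
# SRID 4326 (WGS 84) is assumed for the point
db.session.query(User).filter(func.ST_DWithin(User.location, cast(func.ST_SetSRID(func.ST_MakePoint(-2.242631, 53.480759), 4326), Geography), DISTANCE)).all()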

Django: Get nearest object in future or past

How can you achieve this with one query:
upcoming_events = Event.objects.order_by('date').filter(date__gte=today)
try:
    return upcoming_events[0]
except IndexError:
    return Event.objects.all().order_by('-date')[0]
My idea is to do something like this:
Event.objects.filter(Q(date__gte=today) | Q(date is max date))[0]
But I don't know how to implement the max date. Maybe I have to do it with Func, or perhaps When or Case from django.db.models might be helpful.
Here is a solution with one query (thanks to this post) but I'm not sure if it's faster than the implementation in my question:
import datetime
from django.db.models import Max, Value, Q

# Subquery yielding the single latest date across all events
latest = (
    Event.objects
    .all()
    .annotate(common=Value(1))
    .values('common')
    .annotate(latest=Max('date'))
    .values('latest')
)
events = Event.objects.order_by('date').filter(
    Q(date__gte=datetime.date.today()) | Q(date=latest)
)
return events[0]
In my case I finally just took the rows in question (the two latest) and checked for the right event on Python level.
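The "check on the Python level" fallback could look roughly like this. It is only a sketch working on plain dates; nearest_event_date is a hypothetical helper name, and in practice the dates would come from the few Event rows that were fetched.

```python
import datetime


def nearest_event_date(dates, today):
    """Return the earliest date on or after `today`, else the latest past date."""
    upcoming = [d for d in dates if d >= today]
    if upcoming:
        return min(upcoming)
    # No upcoming event: fall back to the most recent past one (None if empty).
    return max(dates, default=None)


sample = [
    datetime.date(2024, 1, 1),
    datetime.date(2024, 3, 1),
    datetime.date(2024, 6, 1),
]
print(nearest_event_date(sample, datetime.date(2024, 2, 1)))
```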

Django ORM calculate number of days between two date attributes

Scenario
I have a table student with the following attributes:
name,
age,
school_passout_date,
college_start_date
I need a report showing the average number of days students have free between leaving school and starting college.
Current approach
Currently I am iterating over the records, computing the number of days for each student, and then averaging them.
Problem
That is highly inefficient as the record set grows.
Question
Is there anything in the Django ORM that gives me the total number of days between the two dates?
Possibility
I am looking for something like this.
Students.objects.filter(school_passed=True, started_college=True).annotate(total_days_between=Count('school_passout_date', 'college_start_date'), Avg_days=Avg('school_passout_date', 'college_start_date'))
You can do this like so:
Model.objects.annotate(age=Cast(ExtractDay(TruncDate(Now()) - TruncDate(F('created'))), IntegerField()))
This lets you work with the integer value, e.g. you could then do something like this:
from django.db.models import Case, CharField, F, IntegerField, Value, When
from django.db.models.functions import Cast, ExtractDay, Now, TruncDate
qs = (
    Model
    .objects
    .annotate(age=Cast(ExtractDay(TruncDate(Now()) - TruncDate(F('created'))), IntegerField()))
    .annotate(age_bucket=Case(
        When(age__lt=30, then=Value('new')),
        When(age__lt=60, then=Value('current')),
        default=Value('aged'),
        output_field=CharField(),
    ))
)
This question is very old but Django ORM is much more advanced now.
It's possible to do this using F() functions.
from django.db.models import Avg, F
college_students = Students.objects.filter(school_passed=True, started_college=True)
# aggregate() returns a dict keyed by 'avg_no_of_days'
duration = college_students.aggregate(avg_no_of_days=Avg(F('college_start_date') - F('school_passout_date')))
Mathematically, the average of the differences equals the difference of the averages, so you can average all the college start dates, average all the pass-out dates, and subtract one from the other.
This gives you a solution like this one:
from django.db.models import Avg
students = Students.objects.filter(school_passed=True, started_college=True)
# Note: averaging date columns directly is not supported by every database backend
averages = students.aggregate(avg_start=Avg('college_start_date'), avg_passout=Avg('school_passout_date'))
avg_free_time = averages['avg_start'] - averages['avg_passout']
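The claim that the difference of the averages equals the average of the differences can be checked in plain Python. The dates below are hypothetical, and day ordinals are used so that dates can be averaged as numbers:

```python
from datetime import date

# Hypothetical (school_passout_date, college_start_date) pairs.
students = [
    (date(2020, 6, 1), date(2020, 9, 1)),
    (date(2020, 6, 15), date(2020, 8, 15)),
    (date(2020, 5, 20), date(2020, 10, 1)),
]

n = len(students)

# Average of the per-student gaps, in days.
avg_of_gaps = sum((start - passout).days for passout, start in students) / n

# Difference of the averages, using day ordinals so dates can be averaged.
avg_passout = sum(passout.toordinal() for passout, _ in students) / n
avg_start = sum(start.toordinal() for _, start in students) / n
gap_of_avgs = avg_start - avg_passout

print(avg_of_gaps, gap_of_avgs)
```

The two numbers agree up to floating-point error, and the identity holds whichever date comes first.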
Django currently only accepts a handful of aggregation functions (Max, Min, Count, Avg), so this is a little tricky to do.
One solution is to use the extra() method, like this:
Students.objects.extra(
    select={'difference': 'college_start_date - school_passout_date'}
).filter(school_passed=True, started_college=True)
But then you still have to compute the average in application code.

django - annotate() instead of distinct()

I am stuck on this issue:
I have two models:
Location and Rate.
Each location has one or more rates.
I want to get locations ordered by their rates, ascending.
Obviously, order_by() and distinct() don't work together:
locations = Location.objects.filter(**s_kwargs).order_by('locations_rate__rate').distinct('id')
Then I read the docs and came across annotate(), but I am not sure whether I have to use a function inside annotate().
If I do this:
locations = Location.objects.filter(**s_kwargs).annotate(rate=Count('locations_rate__rate')).order_by('rate')
it counts the rates and orders by that count. I want to get locations with their rates, ordered by the value of those rates.
My model definitions are:
class Location(models.Model):
    name = models.TextField()
    adres = models.TextField()

class Rate(models.Model):
    location = models.ForeignKey(Location,related_name='locations_rate')
    rate = models.IntegerField(max_length=2)
    price_rate = models.IntegerField(max_length=2) #<--- added now
    datum = models.DateTimeField(auto_now_add=True,blank=True) #<--- added now
Well, the issue is not how to write a Django query for the problem you described. It's that your problem is either incorrectly stated or not properly thought through. Let me explain with an example:
Suppose you have two Location objects, l1 and l2. l1 has two related Rate objects, r1 and r3, with r1.rate = 1 and r3.rate = 3, and l2 has one related Rate object, r2, with r2.rate = 2. Should your query return l1 followed by l2, or l2 followed by l1? One of l1's rates is lower than l2's rate and the other is higher.
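The ambiguity can be seen in a few lines of plain Python. The dictionary below just restates the l1/l2 example; the two sort keys are two possible, equally defensible, readings of "ordered by rate":

```python
# The example above: l1 has rates 1 and 3, l2 has rate 2.
rates = {"l1": [1, 3], "l2": [2]}

# Ordering by each location's lowest rate puts l1 first...
by_min = sorted(rates, key=lambda loc: min(rates[loc]))

# ...while ordering by its highest rate puts l2 first.
by_max = sorted(rates, key=lambda loc: max(rates[loc]))

print(by_min, by_max)
```

In the ORM, the same choice would presumably be expressed by annotating each location with Min or Max of locations_rate__rate before ordering.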
Try this:
from django.db.models import Count, Sum
# if you want to annotate by count of rates
locations = Location.objects.filter(**s_kwargs) \
    .annotate(rate_count=Count('locations_rate')) \
    .order_by('rate_count')
# if you want to annotate on values of rate e.g. Sum
locations = Location.objects.filter(**s_kwargs) \
    .annotate(rate_sum=Sum('locations_rate__rate')) \
    .order_by('rate_sum')
Possibly you want something like this:
locations = (Location.objects.filter(**s_kwargs)
             .values('locations_rate__rate')
             .annotate(Count('locations_rate__rate'))
             .order_by('locations_rate__rate'))
You need the Count() since you actually need a GROUP BY query, and GROUP BY only works with aggregate functions like COUNT or SUM.
Anyway I think your problem can be solved with normal distinct():
locations = (Location.objects.filter(**s_kwargs)
             .order_by('locations_rate__rate')
             .distinct('locations_rate__rate'))
Why would you want to use annotate() instead?
I haven't tested either, but I hope it helps.
annotate(*args, **kwargs): Annotates each object in the QuerySet with the provided list of aggregate values (averages, sums, etc.) that have been computed over the objects that are related to the objects in the QuerySet.
So if you only want to get locations ordered by their rates, ascending, you don't have to use annotate().
You can try this:
loc = Location.objects.all()
rate = Rate.objects.filter(location__in=loc).order_by('rate')
