Django aggregate Count only True values - python

I'm using aggregate to get the count of a column of booleans. I want the number of True values.
DJANGO CODE:
count = Model.objects.filter(id=pk).aggregate(bool_col=Count('my_bool_col'))
This returns the count of all rows.
SQL QUERY SHOULD BE:
SELECT count(CASE WHEN my_bool_col THEN 1 ELSE null END) FROM <table_name>
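To see why the plain Count returns every row while the CASE form does not, the two SQL expressions can be compared directly. This is a minimal sketch using the standard-library sqlite3 module; the table name `stats` and its rows are made up for illustration and are not part of the original question.

```python
import sqlite3

# Hypothetical table standing in for the model's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (my_bool_col BOOLEAN)")
conn.executemany(
    "INSERT INTO stats (my_bool_col) VALUES (?)",
    [(True,), (False,), (True,), (False,), (False,)],
)

# COUNT(column) counts every non-NULL value, so False rows are included.
plain_count = conn.execute(
    "SELECT COUNT(my_bool_col) FROM stats"
).fetchone()[0]

# The CASE expression maps False to NULL, and COUNT skips NULLs.
true_count = conn.execute(
    "SELECT COUNT(CASE WHEN my_bool_col THEN 1 ELSE NULL END) FROM stats"
).fetchone()[0]

print(plain_count, true_count)
```

With two True rows out of five, the first query counts all five rows and the second counts only the two True ones.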
Here is my actual code:
stats = Team.objects.filter(id=team.id).aggregate(
goals=Sum('statistics__goals'),
assists=Sum('statistics__assists'),
min_penalty=Sum('statistics__minutes_of_penalty'),
balance=Sum('statistics__balance'),
gwg=Count('statistics__gwg'),
gk_goals_avg=Sum('statistics__gk_goals_avg'),
gk_shutout=Count('statistics__gk_shutout'),
points=Sum('statistics__points'),
)
Thanks to Peter DeGlopper's suggestion to use django-aggregate-if.
Here is the solution:
from django.db.models import Sum
from django.db.models import Q
from aggregate_if import Count
stats = Team.objects.filter(id=team.id).aggregate(
goals=Sum('statistics__goals'),
assists=Sum('statistics__assists'),
balance=Sum('statistics__balance'),
min_penalty=Sum('statistics__minutes_of_penalty'),
gwg=Count('statistics__gwg', only=Q(statistics__gwg=True)),
gk_goals_avg=Sum('statistics__gk_goals_avg'),
gk_shutout=Count('statistics__gk_shutout', only=Q(statistics__gk_shutout=True)),
points=Sum('statistics__points'),
)

Updated for Django 1.10. You can perform conditional aggregation now:
from django.db.models import Count, Case, When
query_set.aggregate(bool_col=Count(Case(When(my_bool_col=True, then=1))))
More information at:
https://docs.djangoproject.com/en/1.11/ref/models/conditional-expressions/#case

Update:
Since Django 1.10 you can:
from django.db.models import Count, Case, When, Value
query_set.aggregate(
    bool_col=Count(
        Case(When(my_bool_col=True, then=Value(1)))
    )
)
Read about the Conditional Expression classes
Old answer.
It seems what you want to do is some kind of "Conditional aggregation". Right now Aggregation functions do not support lookups like filter or exclude: fieldname__lt, fieldname__gt, ...
So you can try this:
django-aggregate-if
Description taken from the official page.
Conditional aggregates for Django queries, just like the famous SumIf and CountIf in Excel.
You can also first annotate each team with the desired value, that is, count the number of True values in the field you are interested in for each team, and then do all the aggregation you want.

Another solution for counting booleans:
from django.db.models import Sum, IntegerField
from django.db.models.functions import Cast
Model.objects.filter(id=pk).annotate(bool_col=Sum(Cast('my_bool_col', IntegerField())))
Just convert False to 0 and True to 1, and then just Sum
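One difference between the cast-and-sum approach and a conditional count shows up when no rows match. The sketch below checks this at the SQL level with the standard-library sqlite3 module; the `stats` table, the `team_id` column and the helper `true_counts` are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (team_id INTEGER, my_bool_col BOOLEAN)")
conn.executemany(
    "INSERT INTO stats VALUES (?, ?)",
    [(1, True), (1, False), (1, True)],
)


def true_counts(team_id):
    """Return (sum of the cast booleans, conditional count) for one team."""
    return conn.execute(
        "SELECT SUM(CAST(my_bool_col AS INTEGER)), "
        "COUNT(CASE WHEN my_bool_col THEN 1 END) "
        "FROM stats WHERE team_id = ?",
        (team_id,),
    ).fetchone()


with_rows = true_counts(1)  # both expressions agree when rows exist
no_rows = true_counts(2)    # no rows for this team

print(with_rows, no_rows)
```

When the team has no rows, SUM yields NULL (None in Python) while COUNT yields 0, so the Sum-based version may need a fallback value.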

Related

Django merge QuerySet while keeping the order

I'm trying to join together 2 QuerySets. Right now, I'm using the | operator, but doing it this way won't function as an "append".
My current code is:
df = RegForm((querysetA.all() | querysetB.all()).distinct())
I need the elements from querysetA to be before querysetB. Is it even possible to accomplish while keeping them just queries?
This can be solved by using annotate to add a custom field for ordering on the querysets, and use that in a union like this:
from django.db.models import Value
a = querysetA.annotate(custom_order=Value(1))
b = querysetB.annotate(custom_order=Value(2))
a.union(b).order_by('custom_order')
Prior to django-3.2, you need to specify the output_field for Value:
from django.db.models import IntegerField
a = querysetA.annotate(custom_order=Value(1, IntegerField()))
b = querysetB.annotate(custom_order=Value(2, IntegerField()))
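The idea behind the annotated union can be checked at the SQL level. This is a rough sketch with the standard-library sqlite3 module; the tables `a` and `b` and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (name TEXT)")
conn.execute("CREATE TABLE b (name TEXT)")
conn.executemany("INSERT INTO a VALUES (?)", [("zoe",), ("bob",)])
conn.executemany("INSERT INTO b VALUES (?)", [("amy",), ("bob",)])

# Each branch carries a constant column; ordering by it keeps all
# rows from `a` ahead of all rows from `b`.
rows = conn.execute(
    "SELECT name, 1 AS custom_order FROM a "
    "UNION "
    "SELECT name, 2 AS custom_order FROM b "
    "ORDER BY custom_order, name"
).fetchall()

print(rows)
```

Note that "bob" appears in both halves of the result: the constant column makes the two rows differ, so UNION no longer removes an element that exists in both querysets.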

Django get the sum of all columns for a particular user

I have a django model as follows:
class Order(models.Model):
    cash=models.DecimalField(max_digits=11,decimal_places=2,default=0)
    balance=models.DecimalField(max_digits=11,decimal_places=2,default=0)
    current_ac=models.DecimalField(max_digits=11,decimal_places=2,default=0)
    added_by = models.ForeignKey(User)
There can be multiple Orders and multiple users can create orders.
How can I get the sum of all orders for each column for a particular user, something like
ord=Order.objects.filter(added_by.id=1).sum()
an SQL equivalent would be something like
Select sum(cash), sum (balance), sum(current_ac) from Orders where added_by = 1
You can aggregate, for example the sum of the current_ac with:
from decimal import Decimal
from django.db.models import Sum
ord=Order.objects.filter(added_by_id=1).aggregate(
total=Sum('current_ac')
)['total'] or Decimal()
or if you want to sum up the items for cash, balance and current_ac, you can work with:
from decimal import Decimal
from django.db.models import Sum
ord=Order.objects.filter(added_by_id=1).aggregate(
total_cash=Sum('cash'),
total_balance=Sum('balance'),
total_ac=Sum('current_ac')
)
here ord will be a dictionary that contains the corresponding values, for example:
{
'total_cash': Decimal('14.25'),
'total_balance': Decimal('13.02'),
'total_ac': Decimal('17.89')
}
or if you want to count the number of Orders, then we can work with:
ord=Order.objects.filter(added_by_id=1).count()
If you want to do that per User, it is more efficient to work with .annotate(…) [Django-doc].
If I understand you correctly, you want to count the number of records, right? If so, you can use filter and count, as in the example below:
numberOfRecords = Order.objects.filter(added_by=user_id).count()
You can try this:
from django.db.models import Sum
total=Order.objects.filter(added_by_id=1).aggregate(
total=Sum('current_ac')
)['total'] or 0
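The `or Decimal()` and `or 0` fallbacks in the answers above are there because a SUM over zero rows comes back as NULL. A small sketch with the standard-library sqlite3 module shows this; the `orders` table, its float values and the helper `totals` are illustrative assumptions, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "added_by_id INTEGER, cash REAL, balance REAL, current_ac REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 10.5, 3.0, 7.25), (1, 3.75, 10.0, 0.5)],
)


def totals(user_id):
    """Sum the three columns for one user, replacing NULL with 0."""
    row = conn.execute(
        "SELECT SUM(cash), SUM(balance), SUM(current_ac) "
        "FROM orders WHERE added_by_id = ?",
        (user_id,),
    ).fetchone()
    # SUM over zero rows is NULL (None in Python), hence the fallback.
    return tuple(value or 0 for value in row)


print(totals(1))
print(totals(2))  # a user without any orders
```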

Annotate query for calculate sum of 2 table value with Django ORM

I have 2 tables
class Billing(models.Model):
    id=models.AutoField(primary_key=True)
    .....
    #Some more fields
    ....
class BillInfo(models.Model):
    id=models.AutoField(primary_key=True)
    billing=models.ForeignKey(Billing)
    testId=models.ForeignKey(AllTests)
    costOfTest=models.IntegerField(default=0)
    concession=models.IntegerField(default=0)
Here BillInfo is a vertical table, i.e. one Billing has multiple BillInfo rows. I want to calculate Sum(costOfTest - concession) for a single Billing.
Can I achieve this using single query?
Need help, Thanks in advance.
You can write this as:
from django.db.models import F, Sum
Billing.objects.annotate(
the_sum=Sum(F('billinfo__costOfTest') - F('billinfo__concession'))
)
Here every Billing object in this QuerySet will have an extra attribute .the_sum, which is the sum of costOfTest minus concession over all related BillInfo objects.
The SQL query that calculates this will look approximately like:
SELECT billing.*,
       SUM(billinfo.costOfTest - billinfo.concession) AS the_sum
FROM billing
LEFT OUTER JOIN billinfo ON billinfo.billing_id = billing.id
GROUP BY billing.id
So when you "materialize" the query, the query will obtain the sum for all the Billing objects in a single call.
For Billing objects without any related BillInfo, the the_sum attribute will be None. We can avoid that by using the Coalesce [Django-doc] function:
from django.db.models import F, Sum, Value
from django.db.models.functions import Coalesce
Billing.objects.annotate(
the_sum=Coalesce(
Sum(F('billinfo__costOfTest') - F('billinfo__concession')),
Value(0)
)
)
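The effect of Coalesce on a Billing without any BillInfo rows can be seen in plain SQL. This is a sketch with the standard-library sqlite3 module; the table names `billing` and `billinfo` and the sample rows are assumptions for illustration (Django's real table names include the app label).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE billing (id INTEGER PRIMARY KEY);
    CREATE TABLE billinfo (
        id INTEGER PRIMARY KEY,
        billing_id INTEGER REFERENCES billing(id),
        costOfTest INTEGER DEFAULT 0,
        concession INTEGER DEFAULT 0
    );
    INSERT INTO billing (id) VALUES (1), (2);
    INSERT INTO billinfo (billing_id, costOfTest, concession)
    VALUES (1, 100, 10), (1, 50, 5);
""")

# Billing 2 has no billinfo rows, so its SUM is NULL unless coalesced.
rows = conn.execute("""
    SELECT billing.id,
           SUM(billinfo.costOfTest - billinfo.concession) AS the_sum,
           COALESCE(SUM(billinfo.costOfTest - billinfo.concession), 0)
               AS the_sum_or_zero
    FROM billing
    LEFT OUTER JOIN billinfo ON billinfo.billing_id = billing.id
    GROUP BY billing.id
    ORDER BY billing.id
""").fetchall()

print(rows)
```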

Calculate Max of Sum of an annotated field over a grouped by query in Django ORM?

To keep it simple I have four tables(A, B, Category and Relation), Relation table stores the Intensity of A in B and Category stores the type of B.
A <--- Relation ---> B ---> Category
(So the relation between A and B is n to n, when the relation between B and Category is n to 1)
I need an ORM to group Relation records by Category and A, then calculate Sum of Intensity in each (Category, A) (seems simple till here), then I want to annotate Max of calculated Sum in each Category.
My code is something like:
A.objects.values('B_id').annotate(AcSum=Sum(Intensity)).annotate(Max(AcSum))
Which throws the error:
django.core.exceptions.FieldError: Cannot compute Max('AcSum'): 'AcSum' is an aggregate
Django-group-by package with the same error.
For further information please also see this stackoverflow question.
I am using Django 2 and PostgreSQL.
Is there a way to achieve this using ORM, if there is not, what would be the solution using raw SQL expression?
Update
After lots of struggling I found out that what I wrote was indeed an aggregation; however, what I want is to find the maximum AcSum of each A in each category. So I suppose I have to group the result once more after the AcSum calculation. Based on this insight I found a Stack Overflow question which asks about the same concept (the question was asked 1 year, 2 months ago and has no accepted answer).
Chaining another values('id') to the set works neither as a group_by nor as a filter for output attributes; it removes AcSum from the set. Adding AcSum to values() is also not an option, because it changes the grouped result set.
I think what I am trying to do is re grouping the grouped by query based on the fields inside a column (i.e id).
any thoughts?
You can't do an aggregate of an aggregate Max(Sum()), it's not valid in SQL, whether you're using the ORM or not. Instead, you have to join the table to itself to find the maximum. You can do this using a subquery. The below code looks right to me, but keep in mind I don't have something to run this on, so it might not be perfect.
from django.db.models import OuterRef, Q, Subquery, Sum
annotation = {
'AcSum': Sum('intensity')
}
# The basic query is on Relation grouped by A and Category, annotated
# with the Sum of intensity
query = Relation.objects.values('a', 'b__category').annotate(**annotation)
# The subquery is joined to the outerquery on the Category
sub_filter = Q(b__category=OuterRef('b__category'))
# The subquery is grouped by A and Category and annotated with the Sum
# of intensity, which is then ordered descending so that when a LIMIT 1
# is applied, you get the Max.
subquery = Relation.objects.filter(sub_filter).values(
'a', 'b__category').annotate(**annotation).order_by(
'-AcSum').values('AcSum')[:1]
query = query.annotate(max_intensity=Subquery(subquery))
This should generate SQL like:
SELECT a_id, category_id,
(SELECT SUM(U0.intensity) AS AcSum
FROM RELATION U0
JOIN B U1 on U0.b_id = U1.id
WHERE U1.category_id = B.category_id
GROUP BY U0.a_id, U1.category_id
ORDER BY SUM(U0.intensity) DESC
LIMIT 1
) AS max_intensity
FROM Relation
JOIN B on Relation.b_id = B.id
GROUP BY Relation.a_id, B.category_id
It may be more performant to eliminate the join in Subquery by using a backend specific feature like array_agg (Postgres) or GroupConcat (MySQL) to collect the Relation.ids that are grouped together in the outer query. But I don't know what backend you're using.
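For the raw-SQL route the question asks about, one common pattern is to compute the sums in a derived table and take the maximum over that table. This is not taken from the answer above; it is a sketch with the standard-library sqlite3 module, and the table names, column names and rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, category_id INTEGER);
    CREATE TABLE relation (a_id INTEGER, b_id INTEGER, intensity INTEGER);
    INSERT INTO b (id, category_id) VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO relation (a_id, b_id, intensity) VALUES
        (1, 1, 5), (1, 2, 7), (2, 1, 4), (1, 3, 3), (2, 3, 9);
""")

# Inner query: Sum of intensity per (A, Category).
# Outer query: Max of those sums per Category.
rows = conn.execute("""
    SELECT category_id, MAX(ac_sum) AS max_ac_sum
    FROM (
        SELECT relation.a_id, b.category_id,
               SUM(relation.intensity) AS ac_sum
        FROM relation
        JOIN b ON relation.b_id = b.id
        GROUP BY relation.a_id, b.category_id
    ) AS per_a
    GROUP BY category_id
    ORDER BY category_id
""").fetchall()

print(rows)
```

In category 10 the per-A sums are 12 and 4, and in category 20 they are 3 and 9, so the outer query keeps 12 and 9.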
Something like this should work for you. I couldn't test it myself, so please let me know the result:
Relation.objects.annotate(
b_category=F('B__Category')
).values(
'A', 'b_category'
).annotate(
SumIntensityPerCategory=Sum('Intensity')
).values(
'A', MaxIntensitySumPerCategory=Max('SumIntensityPerCategory')
)

Django ORM query GROUP BY multiple columns combined by MAX

I am using Django with MySQL. I have a model similar to the following:
class MM(models.Model):
    a = models.IntegerField()
    b = models.IntegerField()
    c = models.DateTimeField(auto_now_add=True)
I have multiple rows in which a is equal to b, and I want to perform the following SQL query:
SELECT a, b, MAX(c) AS max FROM MM GROUP BY b, a;
How can this be done with Django ORM? I have tried different approaches using annotations, but no luck so far.
Thanks a lot!
I think you can do something like:
MM.objects.all().values('b', 'a').annotate(max=Max('c'))
Note that you need to import something to use Max: from django.db.models import Max
values('b', 'a') will give GROUP BY b, a and annotate(...) will compute the MAX in your query.
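The grouping that values('b', 'a') followed by annotate(...) is meant to produce can be checked against the SQL from the question. This is a small sketch with the standard-library sqlite3 module; the table `mm`, the alias `max_c` and the sample rows are illustrative assumptions, with timestamps stored as ISO strings so that MAX compares them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mm (a INTEGER, b INTEGER, c TEXT)")
conn.executemany(
    "INSERT INTO mm VALUES (?, ?, ?)",
    [
        (1, 1, "2020-01-01 10:00:00"),
        (1, 1, "2020-03-01 10:00:00"),
        (2, 2, "2020-02-01 10:00:00"),
    ],
)

# One output row per distinct (b, a) pair, carrying the latest c.
rows = conn.execute(
    "SELECT a, b, MAX(c) AS max_c FROM mm GROUP BY b, a ORDER BY b, a"
).fetchall()

print(rows)
```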
You can try this also
from django.db.models import Max
mm_list=MM.objects.all().values('b','a').annotate(max=Max('c'))
for mm in mm_list:
    a=mm['a']
    b=mm['b']
    max=mm['max']
Sum a field grouped by two fields:
from django.db.models import Sum
SQL
select caja_tipo_id, tipo_movimiento, sum(monto) from config_caja group by caja_tipo_id, tipo_movimiento
Django
objs = Caja.objects.values('caja_tipo__nombre','tipo_movimiento').order_by().annotate(total=Sum('monto'))
