I have a Flask application that tracks shift information and the orders logged during that shift. The models are set up like this:
class Shift(db.Model):
    # Columns
    orders = db.relationship('Orders', lazy='dynamic')

class Orders(db.Model):
    pay = db.Column(db.Integer)
    dash_id = db.Column(db.Integer, db.ForeignKey('dash.id'))
While the user is in the middle of a shift I want to display the total pay they have made so far; I will also commit it to the Shift table later. To get the total pay of all the related orders, I tried a query like:
current_shift = Shift.query.filter_by(id=session['shiftID']).first()
orders = current_shift.orders
total_pay = func.sum(orders.pay)
But it always fails with 'AppenderBaseQuery' object has no attribute 'pay'.
I know that I can loop through it like this:
total_pay = 0
for order in orders:
    total_pay += order.pay
but that can't be as quick, efficient, or readable as an aggregate function in the query.
My question is this: what is the correct way to sum the Orders.pay column (or perform an aggregate function on any column) of the related orders?
You don't need to go through the shifts table, because you already have all the information that you need in the orders table.
To get the result for a single shift you can do
pay = db_session.query(func.sum(Orders.pay)).filter(Orders.shifts_id == shift_id).one()
or for multiple shifts
pays = (
    db_session.query(Orders.shifts_id, func.sum(Orders.pay))
    .filter(Orders.shifts_id.in_(list_of_shift_ids))
    .group_by(Orders.shifts_id)
    .all()
)
Note that both queries return rows as tuples, for example (50,) and [(25,), (50,)] respectively.
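Since lazy='dynamic' makes the relationship itself a query, it can also be refined into an aggregate directly. Below is a runnable sketch using plain SQLAlchemy rather than the Flask-SQLAlchemy wrapper (the table names and sample data are illustrative); with Flask-SQLAlchemy the same with_entities call works on current_shift.orders:

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Shift(Base):
    __tablename__ = 'shift'
    id = Column(Integer, primary_key=True)
    # lazy='dynamic' makes .orders a query object that can be refined further
    orders = relationship('Orders', lazy='dynamic')

class Orders(Base):
    __tablename__ = 'orders'
    id = Column(Integer, primary_key=True)
    pay = Column(Integer)
    shift_id = Column(Integer, ForeignKey('shift.id'))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Shift(id=1))
    session.add_all([Orders(pay=25, shift_id=1), Orders(pay=50, shift_id=1)])
    session.commit()

    current_shift = session.query(Shift).filter_by(id=1).first()
    # Instead of looping in Python, turn the dynamic relationship into
    # a SUM() query that runs entirely in the database.
    total_pay = current_shift.orders.with_entities(func.sum(Orders.pay)).scalar()
    print(total_pay)  # 75
```

This keeps the aggregation in the database while still starting from the relationship, so there is no need to repeat the shift filter by hand.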
G'day All,
I'm trying to create a running balance of all negative transactions with a date less than or equal to the current transaction object's transaction date. If I use __lte=transaction_date I get multiple rows; while this is correct, I want to sum those multiple rows. How would I do that and annotate the result onto my queryset?
Current Attempt:
# getting transaction sum and negative balances to take away from running balance
totals = queryset.values('transaction_date', 'award_id', 'transaction') \
    .annotate(transaction_amount_sum=Sum("transaction_amount")) \
    .annotate(negative_balance=Coalesce(
        Sum("transaction_amount",
            filter=Q(transaction__in=[6, 11, 7, 8, 9, 10, 12, 13])), 0))

# adding it all to the queryset
queryset = queryset \
    .annotate(transaction_amount_sum=SubquerySum(
        totals.filter(award_id=OuterRef('award_id'),
                      transaction_date=OuterRef('transaction_date'),
                      transaction=OuterRef('transaction'))
        .values('transaction_amount_sum'))) \
    .annotate(negative_balance=SubquerySum(
        totals.filter(award_id=OuterRef('award_id'),
                      transaction_date=OuterRef('transaction_date'),
                      transaction=OuterRef('transaction'))
        .values('negative_balance'))) \
    .annotate(total_awarded=SubquerySum("award__total_awarded")) \
    .annotate(running_balance=F('total_awarded') - F('negative_balance'))
    # This doesn't work correctly: we need transaction_date to be less than
    # or equal, not just equal to the current transaction date.

# filtering on distinct, we only want one of each record, doesn't matter which one :)
distinct_pk = queryset.distinct('transaction_date', 'award_id', 'transaction') \
    .values_list('pk', flat=True)
queryset = queryset.filter(pk__in=distinct_pk)
What needs to be fixed:
.annotate(negative_balance=SubquerySum(
    totals.filter(award_id=OuterRef('award_id'),
                  transaction_date=OuterRef('transaction_date'),
                  transaction=OuterRef('transaction'))
    .values('negative_balance')))
The above should really be:
.annotate(negative_balance=SubquerySum(
    totals.filter(award_id=OuterRef('award_id'),
                  transaction_date__lte=OuterRef('transaction_date'))
    .values('negative_balance')))
Doing this returns multiple rows; what I want is to sum those multiple rows on negative_balance.
Hope the above makes sense.
Any help will be greatly appreciated.
Thanks,
Thomas Lewin
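For what it's worth, in raw SQL the running balance described above is a cumulative window sum: SUM(...) OVER (PARTITION BY ... ORDER BY transaction_date) includes every row whose date is less than or equal to the current row's. A minimal sketch against an in-memory SQLite database (the table and column names here are simplified stand-ins for the real schema):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE transactions (
    award_id INTEGER,
    transaction_date TEXT,
    transaction_amount REAL
);
INSERT INTO transactions VALUES
    (1, '2023-01-01', -10),
    (1, '2023-01-02', -5),
    (1, '2023-01-03', -20);
""")

# The default window frame sums from the start of the partition up to
# (and including) the current row, i.e. the __lte accumulation wanted here.
rows = conn.execute("""
SELECT transaction_date,
       SUM(transaction_amount) OVER (
           PARTITION BY award_id
           ORDER BY transaction_date
       ) AS running_balance
FROM transactions
ORDER BY transaction_date
""").fetchall()
print(rows)  # [('2023-01-01', -10.0), ('2023-01-02', -15.0), ('2023-01-03', -35.0)]
```

In Django this corresponds to annotating with a Window expression (a Sum over an ordered partition), though exact support depends on the Django and database versions in use.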
I have three queries and another table called output_table. This code works, but it needs to be executed as a single REPLACE INTO query (query 1). I know this involves nested queries and subqueries, but I have no idea if it is possible, since my key is the set of DISTINCT target_currency values (the coins) from output_table.
How can I rewrite queries 2 and 3 so that they execute as part of query 1, i.e. inside the REPLACE INTO query, instead of as separate UPDATE statements?
1. conn3.cursor().execute(
    """REPLACE INTO coin_best_returns(coin)
       SELECT DISTINCT target_currency FROM output_table"""
)
2. conn3.cursor().execute(
    """UPDATE coin_best_returns SET
       highest_price = (SELECT MAX(ask_price_usd) FROM output_table
                        WHERE coin_best_returns.coin = output_table.target_currency),
       lowest_price = (SELECT MIN(bid_price_usd) FROM output_table
                       WHERE coin_best_returns.coin = output_table.target_currency)"""
)
3. conn3.cursor().execute(
    """UPDATE coin_best_returns SET
       highest_market = (SELECT exchange FROM output_table
                         WHERE coin_best_returns.highest_price = output_table.ask_price_usd),
       lowest_market = (SELECT exchange FROM output_table
                        WHERE coin_best_returns.lowest_price = output_table.bid_price_usd)"""
)
You can do it with the help of some window functions, a subquery, and an inner join. The version below is pretty lengthy, but it is less complicated than it may appear. It uses window functions in a subquery to compute the needed per-currency statistics, and factors this out into a common table expression to facilitate joining it to itself.
Other than the inline comments, the main reason for the complication is original query number 3. Queries (1) and (2) could easily be combined as a single, simple, aggregate query, but the third query is not as easily addressed. To keep the exchange data associated with the corresponding ask and bid prices, this query uses window functions instead of aggregate queries. This also provides a vehicle different from DISTINCT for obtaining one result per currency.
Here's the bare query:
WITH output_stats AS (
    -- The ask and bid information for every row of output_table, every row
    -- augmented by the needed maximum ask and minimum bid statistics
    SELECT
        target_currency AS tc,
        ask_price_usd AS ask,
        bid_price_usd AS bid,
        exchange AS market,
        MAX(ask_price_usd) OVER (PARTITION BY target_currency) AS high,
        ROW_NUMBER() OVER (
            PARTITION BY target_currency, ask_price_usd ORDER BY exchange DESC)
            AS ask_rank,
        MIN(bid_price_usd) OVER (PARTITION BY target_currency) AS low,
        ROW_NUMBER() OVER (
            PARTITION BY target_currency, bid_price_usd ORDER BY exchange ASC)
            AS bid_rank
    FROM output_table
)
REPLACE INTO coin_best_returns(
    -- you must, of course, include all the columns you want to fill in the
    -- upsert column list
    coin,
    highest_price,
    lowest_price,
    highest_market,
    lowest_market)
SELECT
    -- ... and select a value for each column
    asks.tc,
    asks.ask,
    bids.bid,
    asks.market,
    bids.market
FROM output_stats asks
JOIN output_stats bids
    ON asks.tc = bids.tc
WHERE
    -- These conditions choose exactly one asks row and one bids row
    -- for each currency
    asks.ask = asks.high
    AND asks.ask_rank = 1
    AND bids.bid = bids.low
    AND bids.bid_rank = 1
Note well that, unlike the original query 3, this will consider only exchange values associated with the target currency when setting the highest_market and lowest_market columns in the destination table. I'm assuming that's what you really want; if not, a different strategy will be needed.
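SQLite also supports REPLACE INTO, CTEs, and window functions, so the combined query can be sanity-checked against an in-memory database with made-up sample data:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE output_table (
    target_currency TEXT, ask_price_usd REAL,
    bid_price_usd REAL, exchange TEXT
);
CREATE TABLE coin_best_returns (
    coin TEXT PRIMARY KEY, highest_price REAL, lowest_price REAL,
    highest_market TEXT, lowest_market TEXT
);
INSERT INTO output_table VALUES
    ('BTC', 101, 99, 'ex1'),
    ('BTC', 105, 95, 'ex2'),
    ('ETH', 11, 8, 'ex1'),
    ('ETH', 12, 9, 'ex2');
""")

# The CTE computes per-currency stats; the REPLACE INTO upserts one row
# per currency carrying the max-ask and min-bid prices and their markets.
conn.execute("""
WITH output_stats AS (
    SELECT
        target_currency AS tc,
        ask_price_usd AS ask,
        bid_price_usd AS bid,
        exchange AS market,
        MAX(ask_price_usd) OVER (PARTITION BY target_currency) AS high,
        ROW_NUMBER() OVER (
            PARTITION BY target_currency, ask_price_usd
            ORDER BY exchange DESC) AS ask_rank,
        MIN(bid_price_usd) OVER (PARTITION BY target_currency) AS low,
        ROW_NUMBER() OVER (
            PARTITION BY target_currency, bid_price_usd
            ORDER BY exchange ASC) AS bid_rank
    FROM output_table
)
REPLACE INTO coin_best_returns
    (coin, highest_price, lowest_price, highest_market, lowest_market)
SELECT asks.tc, asks.ask, bids.bid, asks.market, bids.market
FROM output_stats asks
JOIN output_stats bids ON asks.tc = bids.tc
WHERE asks.ask = asks.high AND asks.ask_rank = 1
  AND bids.bid = bids.low AND bids.bid_rank = 1
""")

results = {row[0]: row[1:] for row in conn.execute(
    "SELECT * FROM coin_best_returns ORDER BY coin")}
print(results)
# {'BTC': (105.0, 95.0, 'ex2', 'ex2'), 'ETH': (12.0, 8.0, 'ex2', 'ex1')}
```

MySQL's window-function support (8.0+) and SQLite's (3.25+) are close enough here that the same statement should behave the same way on both, but do verify against your actual server version.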
Given these three classes:
class User(BaseModel):
    name = models.CharField(..)

class Order(BaseModel):
    user = models.ForeignKey(User, ..., related_name='orders')

class OrderItem(BaseModel):
    order = models.ForeignKey(Order, ..., related_name='items')
    quantity = models.IntegerField(default=1)
    price = models.FloatField()
and this is the base class (it is enough to note that it has the created_at field)
class BaseModel(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
Now each User will have multiple Orders and each Order has multiple OrderItems.
I want to annotate the User objects with the total price of the last order.
Take this data for example:
The User objects should be annotated with the sum of the last order. For user john with id=1, we should return the sum of the order items with ids 3 & 4, since they belong to the order with id=2, which is the latest order.
I hope I have made myself clear. I am new to Django; I have gone over the docs and tried many different things, but I keep getting stuck at getting the last order's items.
Sometimes it's unclear how to build such a query with the Django ORM. In your case I'd write the query in raw SQL, something like:
WITH last_order_for_user AS (
    -- the latest order id for each user_id
    SELECT o.id, o.user_id
    FROM orders o
    JOIN (SELECT user_id, MAX(created_at) AS last_created
          FROM orders
          GROUP BY user_id) latest
      ON latest.user_id = o.user_id
     AND latest.last_created = o.created_at
)
SELECT
    lo.user_id, lo.id, SUM(item.price)
FROM
    last_order_for_user lo
LEFT JOIN
    orderitems item ON lo.id = item.order_id
GROUP BY 1, 2  -- sum of items for the last order of each user
And then perform the raw SQL query (see the Django docs on raw queries).
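As a sanity check, a version of that raw query can be run against an in-memory SQLite database seeded with data shaped like the question's example (user 1 with two orders, items 3 & 4 on the latest order; the table names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT);
CREATE TABLE orderitems (id INTEGER PRIMARY KEY, order_id INTEGER, price REAL);
-- user 1 has two orders; order 2 is the latest one
INSERT INTO orders VALUES (1, 1, '2023-01-01'), (2, 1, '2023-02-01');
-- items 1 & 2 belong to the old order, items 3 & 4 to the latest
INSERT INTO orderitems VALUES (1, 1, 5), (2, 1, 7), (3, 2, 10), (4, 2, 20);
""")

rows = conn.execute("""
WITH last_order_for_user AS (
    -- the latest order id for each user_id
    SELECT o.id, o.user_id
    FROM orders o
    JOIN (SELECT user_id, MAX(created_at) AS last_created
          FROM orders GROUP BY user_id) latest
      ON latest.user_id = o.user_id
     AND latest.last_created = o.created_at
)
SELECT lo.user_id, lo.id, SUM(item.price)
FROM last_order_for_user lo
LEFT JOIN orderitems item ON lo.id = item.order_id
GROUP BY 1, 2
""").fetchall()
print(rows)  # [(1, 2, 30.0)]
```

Only items 3 and 4 (prices 10 and 20) contribute, because only order 2 survives the last_order_for_user CTE.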
To keep it simple, I have four tables (A, B, Category and Relation). The Relation table stores the Intensity of A in B, and Category stores the type of B.
A <--- Relation ---> B ---> Category
(So the relation between A and B is n-to-n, while the relation between B and Category is n-to-1.)
I need an ORM query that groups Relation records by Category and A, then calculates the Sum of Intensity for each (Category, A) pair (seems simple till here); then I want to annotate the Max of the calculated Sum in each Category.
My code is something like:
A.objects.values('B_id').annotate(AcSum=Sum('Intensity')).annotate(Max('AcSum'))
Which throws the error:
django.core.exceptions.FieldError: Cannot compute Max('AcSum'): 'AcSum' is an aggregate
The django-group-by package fails with the same error.
For further information please also see this stackoverflow question.
I am using Django 2 and PostgreSQL.
Is there a way to achieve this using ORM, if there is not, what would be the solution using raw SQL expression?
Update
After lots of struggling I found out that what I wrote was indeed an aggregation; however, what I want is to find the maximum AcSum of each A in each Category. So I suppose I have to group the result once more after the AcSum calculation. Based on this insight I found a Stack Overflow question which asks about the same concept (asked 1 year, 2 months ago, without any accepted answer).
Chaining another values('id') onto the queryset functions neither as a group_by nor as a filter for output attributes; it just removes AcSum from the set. Adding AcSum to values() is also not an option, due to the changes it causes in the grouped result set.
I think what I am trying to do is re-group the grouped query based on the fields inside a column (i.e. id).
any thoughts?
You can't do an aggregate of an aggregate, Max(Sum()); it's not valid in SQL, whether you're using the ORM or not. Instead, you have to join the table to itself to find the maximum. You can do this using a subquery. The code below looks right to me, but keep in mind I don't have something to run it on, so it might not be perfect.
from django.db.models import OuterRef, Q, Subquery, Sum

annotation = {
    'AcSum': Sum('intensity')
}
# The basic query is on Relation grouped by A and Category, annotated
# with the Sum of intensity
query = Relation.objects.values('a', 'b__category').annotate(**annotation)
# The subquery is joined to the outer query on the Category
sub_filter = Q(b__category=OuterRef('b__category'))
# The subquery is grouped by A and Category and annotated with the Sum
# of intensity, which is then ordered descending so that when a LIMIT 1
# is applied, you get the Max.
subquery = Relation.objects.filter(sub_filter).values(
    'a', 'b__category').annotate(**annotation).order_by(
    '-AcSum').values('AcSum')[:1]
query = query.annotate(max_intensity=Subquery(subquery))
This should generate SQL like:
SELECT a_id, category_id,
    (SELECT SUM(U0.intensity) AS AcSum
     FROM Relation U0
     JOIN B U1 ON U0.b_id = U1.id
     WHERE U1.category_id = B.category_id
     GROUP BY U0.a_id, U1.category_id
     ORDER BY SUM(U0.intensity) DESC
     LIMIT 1
    ) AS max_intensity
FROM Relation
JOIN B ON Relation.b_id = B.id
GROUP BY Relation.a_id, B.category_id
It may be more performant to eliminate the join inside the subquery by using a backend-specific feature like array_agg (Postgres) or GroupConcat (MySQL) to collect the Relation ids that are grouped together in the outer query, but I don't know which backend you're using.
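The generated SQL above can be sanity-checked against an in-memory SQLite database. In the made-up data below, within category 1, a=1 has a total intensity of 10 and a=2 a total of 3, so 10.0 is the per-category maximum both grouped rows should report:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE B (id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE Relation (a_id INTEGER, b_id INTEGER, intensity REAL);
INSERT INTO B VALUES (1, 1), (2, 1);
-- a=1 has total intensity 10 in category 1, a=2 has total 3
INSERT INTO Relation VALUES (1, 1, 5), (1, 2, 5), (2, 1, 3);
""")

# The correlated subquery re-groups by (a, category) within the current
# row's category, orders the sums descending, and keeps only the largest.
rows = conn.execute("""
SELECT Relation.a_id, B.category_id,
    (SELECT SUM(U0.intensity)
     FROM Relation U0
     JOIN B U1 ON U0.b_id = U1.id
     WHERE U1.category_id = B.category_id
     GROUP BY U0.a_id, U1.category_id
     ORDER BY SUM(U0.intensity) DESC
     LIMIT 1) AS max_intensity
FROM Relation
JOIN B ON Relation.b_id = B.id
GROUP BY Relation.a_id, B.category_id
ORDER BY Relation.a_id
""").fetchall()
print(rows)  # [(1, 1, 10.0), (2, 1, 10.0)]
```

Both rows carry the same max_intensity because they share a category, which is exactly the "Max of the calculated Sum in each Category" the question asks for.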
Something like this should work for you. I couldn't test it myself, so please let me know the result:
Relation.objects.annotate(
    b_category=F('B__Category')
).values(
    'A', 'b_category'
).annotate(
    SumIntensityPerCategory=Sum('Intensity')
).values(
    'A', MaxIntensitySumPerCategory=Max('SumIntensityPerCategory')
)
I have a model like below
class Product(models.Model):
    name = models.CharField(max_length=255)
    price = models.IntegerField()
So suppose we have 4 product records in the database. Is there any way to check whether all 4 product records have the same price?
I don't want to loop through all the products, because there may be thousands of product records in the database, and doing so would become a performance issue.
So I am looking for a way to do this with the built-in Django ORM, something like:
check_whether_all_the_product_records_has_same_price_value = some django ORM operation...

if check_whether_all_the_product_records_has_same_price_value:
    # If all the Product table records (four) have the same price value,
    # return the starting record
    return check_whether_product_has_same_price_value(0)
So can anyone please let me know how we can do this?
You can compare counts using filter:
first_price = Product.objects.first().price
if Product.objects.count() == Product.objects.filter(price=first_price).count():
    pass
or use distinct:
if Product.objects.distinct('price').count() == 1:
    pass
Note that distinct on a specific field, as used here, works only on PostgreSQL.
You can also use annotate to calculate the count:
if Product.objects.values('price').annotate(Count('price')).count() == 1:
    pass
You can use distinct to find unique prices:
products = Product.objects.filter([some condition])
prices = products.values_list('price', flat=True).distinct()
Then check the length of prices.
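A distinct-count check also works on any backend via aggregation, e.g. Product.objects.aggregate(n=Count('price', distinct=True)); the SQL that such queries boil down to is a single COUNT(DISTINCT price). A sketch against an illustrative sqlite3 table:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price INTEGER);
INSERT INTO product (name, price) VALUES
    ('a', 10), ('b', 10), ('c', 10), ('d', 10);
""")

# The database counts distinct prices in one pass; no rows are fetched
# or looped over in Python.
(distinct_prices,) = conn.execute(
    "SELECT COUNT(DISTINCT price) FROM product").fetchone()
all_same_price = distinct_prices == 1
print(all_same_price)  # True
```

This stays a single round trip regardless of how many thousands of products exist, which addresses the performance concern in the question.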