Django: Filtering a queryset then count - python

I'm trying to limit the number of queries I perform on a page. The queryset returns the objects created within the last 24 hours. I then want to filter that queryset to count the objects based upon a field.
Example:
cars = self.get_queryset()
volvos_count = cars.filter(brand="Volvo").count()
mercs_count = cars.filter(brand="Merc").count()
With an increasing number of brands (in this example), the number of queries grows linearly with the number of brands that must be queried.
How can you make a single query for the cars that returns a dict of all of the unique values for brand and the number of instances within the queryset?
Result:
{'volvos': 4, 'mercs': 50, ...}
Thanks!
EDIT:
The comments so far have been close but not quite on the mark. Using values_list('brand', flat=True) returns the brands. From there you can use
from collections import Counter
to compute the totals. It would be great if there were a way to do this in a single query, but maybe it isn't possible.
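A minimal sketch of the Counter idea, assuming the brands have already been fetched. The list `brands` is a hypothetical stand-in for what cars.values_list('brand', flat=True) would return; with a real queryset this is still one query, but the counting happens in Python rather than in the database.

```python
from collections import Counter

# Hypothetical stand-in for list(cars.values_list('brand', flat=True)).
brands = ["Volvo", "Merc", "Volvo", "Merc", "Merc"]

# Counter tallies every distinct value in a single pass over the list.
brand_counts = dict(Counter(brands))
print(brand_counts)
```

This pulls one value per row into memory, so it is probably only reasonable for small result sets.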

To generate a count for each distinct brand, you use values in conjunction with annotate.
from django.db.models import Count
totals = cars.values('brand').annotate(Count('brand'))
This gives you a queryset, each of whose elements is a dictionary with brand and brand__count. You can convert that directly into a dict with the format you want:
{item['brand']: item['brand__count'] for item in totals}

SELECT brand, COUNT(*) as total
FROM cars
GROUP BY brand
ORDER BY total DESC
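The equivalent ORM query, where cars is the queryset from the question (note the '-' prefix for descending order, matching ORDER BY total DESC):
cars.values('brand').annotate(total=Count('brand')).order_by('-total')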
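To see what the grouped query returns, here is a sketch that runs the same SQL against an in-memory SQLite database using only the standard library. The table contents are made up for illustration, and the final dict comprehension mirrors the conversion a Django queryset result would need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, brand TEXT)")
# Made-up sample rows: three Mercs and two Volvos.
conn.executemany(
    "INSERT INTO cars (brand) VALUES (?)",
    [("Volvo",), ("Merc",), ("Merc",), ("Volvo",), ("Merc",)],
)

# One query returns every distinct brand together with its row count.
rows = conn.execute(
    "SELECT brand, COUNT(*) AS total FROM cars GROUP BY brand ORDER BY total DESC"
).fetchall()

# Convert the (brand, total) rows into the dict format asked for.
totals = {brand: total for brand, total in rows}
conn.close()
print(totals)
```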

Related

How to join these 2 tables by date with ORM

I have two querysets -
from django.db.models import Case, Min, Value, When
A = Bids.objects.filter(*args, **kwargs).annotate(
    highest_priority=Case(*[
        When(data_source=data_source, then=Value(i))
        for i, data_source in enumerate(data_source_order_list)
    ])
).order_by("date", "highest_priority")
B = A.values("date").annotate(Min("highest_priority")).order_by("date")
The first query gives me all objects in the selected time range with the proper data sources and values. Through highest_priority I set which item should be selected. All items carry additional data.
The second query gives me grouped information about the items for each date. It does not include important values such as price, so I assume I have to join these two tables and keep only the rows where a.highest_priority = b.highest_priority. That way I would get a queryset of objects with only one item per date.
I have tried using distinct, but it does not work with .first()/.last(). annotate gives me a dict per group, and grouping by date alone drops a lot of important data, yet I have to group by date only...
The tables look like this:
A
B
How can I join them? Once they are joined, I could easily match highest_priority against highest_priority and get my data with only one database hit. I want to use the ORM. I could just apply distinct and build a list, but I do not want to hammer the database by chaining multiple queries per date.
See whether this suggestion works:
SELECT *, (to_char(a.date, 'YYYYMMDD')::integer) * a.highest_priority AS prioritycalc
FROM A AS a
JOIN B AS b ON (to_char(a.date, 'YYYYMMDD')::integer) * a.highest_priority = (to_char(b.date, 'YYYYMMDD')::integer) * b.highest_priority
ORDER BY prioritycalc DESC;

How to Sum a column of a related table in Flask-SQLAlchemy

I have a Flask application that tracks shift information and the orders logged during each shift. The models are set up like this:
class Shift(db.Model):
    # Columns
    orders = db.relationship('Orders', lazy='dynamic')
class Orders(db.Model):
    pay = db.Column(db.Integer)
    dash_id = db.Column(db.Integer, db.ForeignKey('dash.id'))
While the user is in the middle of a shift I want to display the total pay they have made so far, and I also will commit it into the Shift table later as well. To get the total pay of all the related orders I tried to query something like
current_shift = Shift.query.filter_by(id=session['shiftID']).first()
orders = current_shift.orders
total_pay = func.sum(orders.pay)
But it always returns that 'AppenderBaseQuery' object has no attribute 'pay'
I know that I can loop through like this
total_pay = 0
for order in orders:
    total_pay += order.pay
but that can't be as quick, efficient, or readable as an aggregate function in a query.
My question is this, what is the correct way to sum the Orders.pay columns (or perform aggregate functions of any column) of the related orders?
You don't need to go through the shifts table, because you already have all the information that you need in the orders table.
To get the result for a single shift you can do
pay = db_session.query(func.sum(Orders.pay)).filter(Orders.shifts_id == shift_id).one()
or for multiple shifts
pays = (
    db_session.query(Orders.shifts_id, func.sum(Orders.pay))
    .filter(Orders.shifts_id.in_(list_of_shift_ids))
    .group_by(Orders.shifts_id)
    .all()
)
Note that both queries return rows as tuples: the first a single one-element row such as (50,), the second a list of (shifts_id, sum) rows such as [(1, 25), (2, 50)].
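Because the rows come back as tuples, they usually need unpacking before use. The sketch below shows the shape of both results using the standard-library sqlite3 module instead of SQLAlchemy; the `orders` table, its columns, and its contents are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, shifts_id INTEGER, pay INTEGER)"
)
# Made-up sample data: two orders on shift 1, one order on shift 2.
conn.executemany(
    "INSERT INTO orders (shifts_id, pay) VALUES (?, ?)",
    [(1, 10), (1, 15), (2, 50)],
)

# Single shift: the result is a one-element row, so unpack it.
(total_pay,) = conn.execute(
    "SELECT SUM(pay) FROM orders WHERE shifts_id = ?", (1,)
).fetchone()
# SUM yields NULL (None in Python) when no rows match; fall back to 0.
total_pay = total_pay or 0

# Several shifts: each row is (shifts_id, sum), which maps directly to a dict.
pays = dict(
    conn.execute(
        "SELECT shifts_id, SUM(pay) FROM orders "
        "WHERE shifts_id IN (?, ?) GROUP BY shifts_id",
        (1, 2),
    ).fetchall()
)
conn.close()
print(total_pay, pays)
```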

Getting distinct count of column on a SQLAlchemy query object?

Given a sqlalchemy.orm.query.Query object, is it possible to count a distinct column on it? I am asking because .count() returns duplicates due to the join conditions.
For instance:
from sqlalchemy import func, distinct
channels = db.session.query(Channel).join(ChannelUsers).filter(
ChannelUsers.user_id == USER_ID,
Message.channel_id.isnot(None)
).outerjoin(Message)
# this gives us a number with duplicate channels
# and .count() does not take extra parameters to target on column
channels.count()
...
# later on I need to access all these channels via channels.all()
To get a distinct channels count, I can do this by duplicating the filter condition above again and query the distinct column. Something like this
distinct_count = db.session.query(
func.count(distinct(Channel.id))
).join(ChannelUsers).filter(
ChannelUsers.user_id == USER_ID,
Message.channel_id.isnot(None)
).outerjoin(Message)
But that's not ideal as I need to access some or all channels after getting the distinct count.
Found this looking for the answer myself. After some more research, I was able to get the expected result using a combination of load_only and distinct in order to count only distinct values of an ID field. Let's say for simplicity that Channel has a unique field named id.
from sqlalchemy.orm import load_only
distinct_count = channels.options(load_only(Channel.id)).distinct().count()
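The reason .count() over a join overcounts can be seen at the SQL level: each channel appears once per joined message row. A rough sketch with the standard-library sqlite3 module, using hypothetical `channel` and `message` tables rather than the models from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE channel (id INTEGER PRIMARY KEY);
    CREATE TABLE message (id INTEGER PRIMARY KEY, channel_id INTEGER);
    INSERT INTO channel (id) VALUES (1), (2);
    INSERT INTO message (channel_id) VALUES (1), (1), (1), (2);
    """
)

joined = "FROM channel c LEFT JOIN message m ON m.channel_id = c.id"

# Counting joined rows: channel 1 is repeated once per message.
row_count = conn.execute(f"SELECT COUNT(*) {joined}").fetchone()[0]

# Counting distinct channel ids collapses the duplicates.
distinct_count = conn.execute(
    f"SELECT COUNT(DISTINCT c.id) {joined}"
).fetchone()[0]
conn.close()
print(row_count, distinct_count)
```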

Django Aggregation for Goals

I'm saving every Sale in a Store. I want to use aggregation to sum all of the sales in a month for every store, and I want to keep only the stores that reach the goal (100.000$).
I've already come up with a solution using Python and a list, but I wanted to know if there is a better solution using only the ORM.
Sales model
Store    Sale   Date
Store A  5.000  11/01/2014
Store A  3.000  11/01/2014
Store B  1.000  15/01/2014
Store C  8.000  17/01/2014
...
The result should be this:
Month: January
Store  Amount
A      120.000
B      111.000
C      150.000
and discard
D      70.000
Thanks for your help.
Other suggested methods discard a lot of data that takes a fraction of a second to load, and that could be useful later on in your code. Hence this answer.
Instead of querying on the Sales object, you can query on the Store object. The query is roughly the same, except for the relations:
from django.db.models import Sum
stores = (
    Store.objects.filter(sales__date__month=month, sales__date__year=year)
    .annotate(monthly_sales=Sum('sales__amount'))
    .filter(monthly_sales__gte=100000)
    # optionally prefetch all `sales` objects if you know you need them
    .prefetch_related('sales')
)
>>> [s for s in stores]
[
<Store object 1>,
<Store object 2>,
etc.
]
All Store objects have an extra attribute monthly_sales that holds the total amount of sales for that particular month. By filtering on month and year before annotating, the annotation only uses the filtered related objects. Note that the sales attribute on the store still contains all sales for that store.
With this method, all store attributes are easily accessible, unlike when you use .values to group your results.
Without a good look at your models, the best I can do is pseudocode. But I would expect you need something along the lines of
from django.db.models import Sum
results = Sales.objects.filter(date__month=month, date__year=year)
results = results.values('store')
results = results.annotate(total_sales=Sum('sale'))
results = results.filter(total_sales__gte=100000)
Basically, what we're doing is using Django's aggregation capabilities to compute the Sum of sales for each store. Per Django's documentation, we can use the values function to group our results by distinct values in a given field.
In line 2, we filter our sales to only sales from this month.
In line 3, we limit our results to the values for field store.
In line 4, we annotate each result with the Sum of all sales from the original query.
In line 5, we filter on that annotation, limiting the returned results to stores with total_sales of at least 100000, the goal from the question.
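Grouping with values, aggregating with annotate, and then filtering on the annotation corresponds to GROUP BY plus HAVING in SQL. The sketch below shows that shape with the standard-library sqlite3 module; the `sales` table, its column names, and the figures are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, amount INTEGER, date TEXT)")
# Made-up sample data; dates are ISO strings so they compare correctly.
conn.executemany(
    "INSERT INTO sales (store, amount, date) VALUES (?, ?, ?)",
    [
        ("A", 70000, "2014-01-11"),
        ("A", 50000, "2014-01-20"),
        ("A", 999999, "2014-02-01"),  # outside January, must be ignored
        ("B", 111000, "2014-01-15"),
        ("D", 70000, "2014-01-17"),   # below the goal, must be discarded
    ],
)

goal = 100000
# WHERE restricts to the month, GROUP BY sums per store,
# HAVING keeps only the stores that reached the goal.
reached = conn.execute(
    "SELECT store, SUM(amount) AS total FROM sales "
    "WHERE date >= ? AND date < ? "
    "GROUP BY store HAVING SUM(amount) >= ? ORDER BY store",
    ("2014-01-01", "2014-02-01", goal),
).fetchall()
conn.close()
print(reached)
```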
You can use annotate to handle this. Since I do not know your model structure, this is a best guess:
from django.db.models import Sum
Sales.objects.filter(date__month=3, date__year=2014).values('store').annotate(monthly_sale=Sum('sale'))
That will return a QuerySet of dictionaries, one per store, with their monthly sales, like:
>> [
{"store": 1, "monthly_sale": 120.000},
{"store": 2, "monthly_sale": 100.000},
...
]
The query above assumes that:
Your Sales model has a Date or DateTime field named date
Your Sales model has a ForeignKey relation to Store
Your Sales model has a numeric field (Integer, Decimal etc.) named sale
In the resulting QuerySet, store is the id of your store record. But since it is a ForeignKey, you can follow the relation to get its name etc.:
Sales.objects.filter(date__month=3, date__year=2014).values('store__name').annotate(monthly_sale=Sum('sale'))
>> [
{"store__name": "Store A", "monthly_sale": 120.000},
{"store__name": "Store B", "monthly_sale": 100.000},
...
]

How to check whether all the values of a database field are the same using Django queries

I have a model like below
class Product(models.Model):
    name = models.CharField(max_length=255)
    price = models.IntegerField()
Suppose we have 4 product records in the database. Is there any way to check whether all 4 product records have the same price?
I don't want to loop through all the products, because there may be thousands of product records in the database, and doing so would become a performance issue.
So I am looking for a way to do this with the built-in Django ORM:
check_whether_all_the_product_records_has_same_price_value = some django ORM operation......
if check_whether_all_the_product_records_has_same_price_value:
    # If all the Product table records (four) have the same price value
    # return the starting record
    return check_whether_product_has_same_price_value(0)
Can anyone please let me know how to do this?
You can compare row counts using filter:
# price is the value you expect every product to have
if Product.objects.count() == Product.objects.filter(price=price).count():
    pass
or use distinct
if Product.objects.distinct('price').count() == 1:
    pass
Note that this example (distinct with a field name) works on PostgreSQL only.
You can also use annotate to count the groups, I think:
from django.db.models import Count
if Product.objects.values('price').annotate(Count('price')).count() == 1:
    pass
You can use distinct to find unique prices:
products = Product.objects.filter([some condition])
prices = products.values_list('price', flat=True).distinct()
Then check the length of prices: if prices.count() is 1, all the selected products share the same price.
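In SQL terms the distinct-based check boils down to counting distinct prices in one query. A small sketch with the standard-library sqlite3 module; the `product` table and the helper name `all_same_price` are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)"
)
# Made-up sample data: four products that all cost 10.
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [("a", 10), ("b", 10), ("c", 10), ("d", 10)],
)


def all_same_price(connection):
    """Return True when exactly one distinct price exists in the table."""
    (distinct_prices,) = connection.execute(
        "SELECT COUNT(DISTINCT price) FROM product"
    ).fetchone()
    return distinct_prices == 1


same_before = all_same_price(conn)

# Change one price so the table now holds two distinct values.
conn.execute("UPDATE product SET price = 12 WHERE name = 'd'")
same_after = all_same_price(conn)
conn.close()
print(same_before, same_after)
```

The database does the comparison, so no product rows are loaded into Python regardless of table size.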
