Django GROUP_BY through .annotate() - python

I have a table invoices with a customer_id field and some other fields. I need to select the number of purchases made by each user. In SQL it should look like this:
SELECT username, COUNT('customer_id') FROM `invoices`
LEFT JOIN `auth_user` ON `auth_user`.id = `invoices`.customer_id
GROUP BY `customer_id`, username
In Django I tried:
Invoice.objects.annotate(buy_count=Count('customer')).all()
But this code groups by invoices.id instead of invoices.customer_id and returns the wrong result.

I think you should turn it around, something like:
Customer.objects.annotate(buy_count=Count('invoice')).all()
That gives you a list of Customer objects, each with a buy_count attribute holding its number of invoices.
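To see why counting from the customer side gives per-customer totals, here is a minimal sketch that uses Python's standard sqlite3 module instead of Django. The table names, column names and sample rows are illustrative assumptions, and the SQL only approximates what the ORM would generate for such an annotation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO invoice VALUES (1, 1), (2, 1), (3, 2);
""")

# Grouping by the invoice primary key: every group holds exactly one row,
# so each count is 1 (the "wrong result" described in the question).
per_invoice = conn.execute(
    "SELECT id, COUNT(customer_id) FROM invoice GROUP BY id ORDER BY id"
).fetchall()

# Starting from customers and grouping by customer: one row per customer.
# The LEFT JOIN keeps customers without invoices, with a count of 0.
per_customer = conn.execute("""
    SELECT c.username, COUNT(i.id) AS buy_count
    FROM customer AS c
    LEFT JOIN invoice AS i ON i.customer_id = c.id
    GROUP BY c.id, c.username
    ORDER BY c.id
""").fetchall()

print(per_invoice)
print(per_customer)
```

The second query is the shape you want: the grouping key is the customer, not the invoice.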

Related

Django ORM Group By with Foreign Keys

user is a foreign key on tournament.
select u.id, u.display_name, count(t.id)
from tournament t join user u
on t.user_id = u.id
where date(t.start_date)> '2022-07-01'
group by u.display_name, u.id
How can I make the above SQL query work with django's ORM?
In the majority of cases, trying to translate an SQL query literally into Django ORM syntax isn't the way to go.
From what I understand, you want to count the tournaments bound to each user, filtered by date.
Try something like:
UserModel.objects.annotate(tournament_count=Count("tournament", filter=Q(tournament__start_date__gt=my_date)))
The annotate method adds extra columns to the result set, mostly related or calculated ones. Because the query starts from the user model, the date condition has to go through the relation (tournament__start_date). "tournament" is the lowercased name of your Tournament model; if you defined a related_name for the user FK, use that name instead.
If you really want a GROUP BY, take a look at the question "How to query as GROUP BY in django?".
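A conditional count is not quite the same as the original JOIN plus WHERE. The following sketch uses the standard sqlite3 module with made-up table names and rows (all assumptions, not the asker's schema) and a CASE expression to play the role of the filtered count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, display_name TEXT);
    CREATE TABLE tournament (
        id INTEGER PRIMARY KEY, user_id INTEGER, start_date TEXT
    );
    INSERT INTO app_user VALUES (1, 'ann'), (2, 'ben');
    INSERT INTO tournament VALUES
        (1, 1, '2022-06-15'), (2, 1, '2022-08-01'), (3, 1, '2022-09-10'),
        (4, 2, '2022-05-01');
""")
cutoff = "2022-07-01"

# Shape of the original query: the WHERE clause removes rows before grouping,
# so users with no tournament after the cutoff disappear from the result.
inner_join = conn.execute("""
    SELECT u.id, u.display_name, COUNT(t.id)
    FROM tournament AS t JOIN app_user AS u ON t.user_id = u.id
    WHERE date(t.start_date) > ?
    GROUP BY u.display_name, u.id
    ORDER BY u.id
""", (cutoff,)).fetchall()

# Conditional count: every user is kept, and only matching tournaments
# are counted (CASE yields NULL for the others, which COUNT ignores).
conditional = conn.execute("""
    SELECT u.id, u.display_name,
           COUNT(CASE WHEN date(t.start_date) > ? THEN t.id END)
    FROM app_user AS u
    LEFT JOIN tournament AS t ON t.user_id = u.id
    GROUP BY u.id, u.display_name
    ORDER BY u.id
""", (cutoff,)).fetchall()

print(inner_join)
print(conditional)
```

If users with a count of zero should be dropped, as in the original SQL, they have to be filtered out afterwards.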

Group by in Django ORM

I want to get the number of "chats" between 2 users in my app and I have a table called "Message" that contains:
sender_user | reciever_user | contain | date
I want a query that gives me all the messages between 2 different users. I know that in SQL I need to use GROUP BY, but how can I get the list of messages with the Django ORM? Do you have any idea?
(NOTE: sender_user and reciever_user are instances of the User table.)
You don't need to "GROUP BY" in the case described. You need to filter on sender and receiver users - it's equivalent to an SQL "WHERE" clause. And you would also order by the date.
Message.objects.filter(sender_user=sender, receiver_user=receiver).order_by('date')
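If "between two users" means messages in either direction, the condition has to match both sender/receiver orderings. Here is a minimal sketch with the standard sqlite3 module in place of the ORM. The columns follow the table listed in the question (including its spelling reciever_user); the integer user ids, the sample rows and the conversation helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message (
        sender_user INTEGER, reciever_user INTEGER, contain TEXT, date TEXT
    );
    INSERT INTO message VALUES
        (1, 2, 'hi',        '2024-01-01 10:00'),
        (2, 1, 'hello',     '2024-01-01 10:05'),
        (1, 3, 'unrelated', '2024-01-01 10:07'),
        (1, 2, 'bye',       '2024-01-01 10:10');
""")


def conversation(conn, user_a, user_b):
    """Return the message texts exchanged between two users, oldest first."""
    rows = conn.execute(
        """
        SELECT contain FROM message
        WHERE (sender_user = :a AND reciever_user = :b)
           OR (sender_user = :b AND reciever_user = :a)
        ORDER BY date
        """,
        {"a": user_a, "b": user_b},
    )
    return [text for (text,) in rows]


print(conversation(conn, 1, 2))
```

In Django the same either-direction condition can be written by combining two Q objects with the | operator inside filter().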
You can use the annotate method of a queryset. The annotate method is Django's abstraction for GROUP BY clauses.
Your query should probably be something like:
User.objects.filter(message_set__sender=me).annotate(message_count=Count('message'))
Reference: https://docs.djangoproject.com/en/1.7/topics/db/aggregation/#order-of-annotate-and-filter-clauses

How to query data rows with id is specified by another query with Django queryset

I'm trying to make this query:
SELECT * FROM users WHERE ID IN (SELECT user_id FROM some_table WHERE something)
Can anyone provide me the solution for how to do this using Django queryset?
You can use 2 nested querysets, something like:
User.objects.filter(id__in=SomeModel.objects.filter(field=something).values_list('user_id', flat=True))
Hope it helps!

Django model search concatenated string

I am trying to use a Django model to search for a record and then return a field concatenated from different tables joined by foreign keys.
I can do it in SQL like this:
SELECT
location.location_geoname_id as id,
CONCAT_WS(', ', location.location_name, region.region_name, country.country_name) AS 'text'
FROM
geonames_location as location
JOIN
geonames_region as region
ON
location.region_geoname_id = region.region_geoname_id
JOIN
geonames_country as country
ON
region.country_geoname_id = country.country_geoname_id
WHERE
location.location_name like 'location'
ORDER BY
location.location_name, region.region_name, country.country_name
LIMIT 10;
Is there a cleaner way to do this using Django models? Or do I need to just use SQL for this one?
Thank you
Do you really need the SQL to return the concatenated field? Why not query the models in the usual way (with select_related()) and then concatenate in Python? Or if you're worried about querying more columns than you need, use values_list:
locations = Location.objects.values_list(
'location_name', 'region__region_name', 'country__country_name')
location_texts = [', '.join(l) for l in locations]
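One difference to keep in mind when concatenating in Python: SQL's CONCAT_WS skips NULL arguments, while str.join raises a TypeError on None. Here is a small sketch of a helper that mimics the SQL behaviour. The concat_ws function and the sample rows are hypothetical and only stand in for the values_list() result.

```python
def concat_ws(separator, *parts):
    """Join parts with separator, skipping None the way SQL CONCAT_WS does."""
    return separator.join(str(p) for p in parts if p is not None)


# Hypothetical rows shaped like (location_name, region_name, country_name).
rows = [
    ("Springfield", "Illinois", "United States"),
    ("Monaco", None, "Monaco"),
]
location_texts = [concat_ws(", ", *row) for row in rows]
print(location_texts)
```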
You can also write a raw query for this in your code and concatenate later.
Example:
org = Organization.objects.raw('SELECT organization_id, name FROM organization where is_active=1 ORDER BY name')
Keep one thing in mind: a raw query must always fetch the table's primary key; it's mandatory. Here organization_id is the primary key of the contact_organization table.
Which one is more useful and simple (raw query or model query) is up to you.

Query syntax to select exactly one item for each category

class Category(models.Model):
    pass

class Item(models.Model):
    cat = models.ForeignKey(Category, on_delete=models.CASCADE)
I want to select exactly one item for each category. What is the query syntax to do this?
Your question isn't entirely clear: since you didn't say otherwise, I'm going to assume that you don't care which item is selected for each category, just that you need any one. If that isn't the case, please update the question to clarify.
tl;dr version: there is no documented way to explicitly use GROUP BY statements in Django, except by using a raw query. See the bottom for code to do so.
The problem is that doing what you're looking for in SQL itself requires a bit of a hack. You can easily try this example by entering sqlite3 :memory: at the command line:
CREATE TABLE category
(
id INT
);
CREATE TABLE item
(
id INT,
category_id INT
);
INSERT INTO category VALUES (1);
INSERT INTO category VALUES (2);
INSERT INTO category VALUES (3);
INSERT INTO item VALUES (1,1);
INSERT INTO item VALUES (2,2);
INSERT INTO item VALUES (3,3);
INSERT INTO item VALUES (4,1);
INSERT INTO item VALUES (5,2);
SELECT id, category_id, COUNT(category_id) FROM item GROUP BY category_id;
returns
4|1|2
5|2|2
3|3|1
Which is what you're looking for (one item id for each category id), albeit with an extraneous COUNT. The count (or some other aggregate function) is needed in order to apply the GROUP BY.
Note: this will ignore categories that don't contain any items, which seems like sensible behaviour.
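The query above selects a bare id column that is neither aggregated nor part of the GROUP BY. SQLite accepts this, but which item's id comes back for each group is not something to rely on, and stricter databases reject the query. The sketch below replays the same data through Python's standard sqlite3 module and makes the choice explicit with MIN(id). Picking the smallest id is an assumption, since the question does not say which item is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INT);
    CREATE TABLE item (id INT, category_id INT);
    INSERT INTO category VALUES (1), (2), (3);
    INSERT INTO item VALUES (1, 1), (2, 2), (3, 3), (4, 1), (5, 2);
""")

# One row per category: the lowest item id in it, plus the item count.
rows = conn.execute("""
    SELECT MIN(id) AS item_id, category_id, COUNT(*) AS item_count
    FROM item
    GROUP BY category_id
    ORDER BY category_id
""").fetchall()
print(rows)
```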
Now the question becomes, how to do this in Django?
The obvious answer is to use Django's aggregation/annotation support, in particular, combining annotate with values as is recommended elsewhere to group queries in Django.
Reading those posts, it would seem we could accomplish what we're looking for with
Item.objects.values('id').annotate(unneeded_count=Count('category_id'))
However, this doesn't work. What Django does here is not just GROUP BY "category_id", but group by all fields selected (i.e. GROUP BY "id", "category_id") [1]. I don't believe there is a way (in the public API, at least) to change this behaviour.
The solution is to fall back to raw SQL:
qs = Item.objects.raw('SELECT *, COUNT(category_id) FROM myapp_item GROUP BY category_id')
[1]: Note that you can inspect what queries Django is running (this requires DEBUG = True) with:
from django.db import connection
print(connection.queries[-1])
Edit:
There are a number of other possible approaches, but most have (possibly severe) performance problems. Here are a couple:
1. Select an item from each category.
items = []
for c in Category.objects.all():
    item = c.item_set.first()  # None if the category has no items
    if item is not None:
        items.append(item)
This is a more clear and flexible approach, but has the obvious disadvantage of requiring many more database hits.
2. Use select_related
items = Item.objects.select_related()
and then do the grouping/filtering yourself (in Python).
Again, this is perhaps more clear than using raw SQL and only requires one query, but this one query could be very large (it will return all items and their categories) and doing the grouping/filtering yourself is probably less efficient than letting the database do it for you.
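The "grouping yourself" step can be as small as a dictionary keyed by category. Here is a minimal sketch in plain Python, where the (item id, category id) pairs are made-up stand-ins for what you would read off the queryset, and first_item_per_category is a hypothetical helper name.

```python
def first_item_per_category(pairs):
    """Map each category id to the first item id seen for that category."""
    chosen = {}
    for item_id, category_id in pairs:
        # setdefault only stores the value if the key is not present yet.
        chosen.setdefault(category_id, item_id)
    return chosen


pairs = [(1, 1), (2, 2), (3, 3), (4, 1), (5, 2)]
print(first_item_per_category(pairs))
```

This still walks every item once, which is the cost the paragraph above warns about.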
