How can I convert Inner Join and Group By to Django ORM? - python

I want to change this raw SQL to Django ORM but I couldn't manage to convert.
most_read_students = LendedBook.objects.raw('''
    SELECT student_id AS id, name, surname, COUNT(*) AS count
    FROM "Book_lendedbook"
    INNER JOIN User_student ON Book_lendedbook.student_id = User_student.id
    WHERE did_read = 1
    GROUP BY student_id
    ORDER BY count DESC
    LIMIT 5''')
I tried this and got a close result, but it doesn't do what I want, because I also need to join this table with another table.
most_read_students = (LendedBook.objects.values('student_id')
    .filter(did_read=True, return_date__month=datetime.datetime.now().month)
    .annotate(count=Count('student_id')))
When I use select_related with the "User_student" table like this:
most_read_students = (LendedBook.objects.select_related('User_student')
    .values('student_id', 'name', 'surname')
    .filter(did_read=True, return_date__month=datetime.datetime.now().month)
    .annotate(count=Count('student_id')))
It throws an error like Cannot resolve keyword 'name' into field. Choices are: book, book_id, did_read, id, is_returned, lend_date, return_date, student, student_id
But I should be able to get student properties like name and surname when I join "User_student" table.
Thank you for your help!

I solved it!
How to combine select_related() and value()? (2016)
Funny fact: my problem was not really about the ORM, I guess. I didn't know I could reach the student object's properties just by using .values('student__name', 'student__surname') in the last code I shared in this post.
I changed this code:
(LendedBook.objects.select_related('User_student')
    .values('student_id', 'name', 'surname')
    .filter(did_read=True, return_date__month=datetime.datetime.now().month)
    .annotate(count=Count('student_id')))
To this code:
(LendedBook.objects.select_related('User_student')
    .values('student_id', 'student__name', 'student__surname')
    .filter(did_read=True, return_date__month=datetime.datetime.now().month)
    .annotate(count=Count('student_id')))
By the way, deleting .select_related('User_student') doesn't affect the result.
So, using the double underscore (__) to follow the relation solved my problem!

Related

SQLite Query with COUNT and ID string

First, I have a SQLite database.
I have a user table tbl_members
member_id
name
and an order table tbl_orders
order_id
member_ids
name
An order can be edited by more than one member, and these members are stored in tbl_orders.member_ids in this fashion: 1,2,34,23,65,
I need a query that returns:
tbl_members.member_id, tbl_members.name and a COUNT(tbl_orders.order_id) of the orders where tbl_members.member_id is in tbl_orders.member_ids
I can't get it. Can anyone give me a hint?
is this your expected answer?
SELECT tm.member_id, tm.name, COUNT(to.order_id)
FROM tbl_members as tm
LEFT JOIN tbl_orders as to on tm.member_id = to.member_id
GROUP BY tm.member_id, tm.name
I got it!
SELECT tm.member_id, tm.name, COUNT(to.order_id)
FROM tbl_members tm
LEFT JOIN tbl_orders to ON (to.member_ids LIKE '%,'||tm.member_id||'%')
GROUP BY tm.member_id
Works for me
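One caveat with the LIKE pattern above: '%,' || member_id || '%' can match partial ids (',2' is a prefix of ',23') and it misses the first id in the list, which has no leading comma. A minimal sketch of a delimiter-aware match follows, assuming the trailing-comma format from the question (1,2,34,23,65,). The sample rows and the alias o are made up for illustration, and it runs against an in-memory database with Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_members (member_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tbl_orders (order_id INTEGER PRIMARY KEY, member_ids TEXT, name TEXT);
    -- Made-up sample data, using the trailing-comma format from the question
    INSERT INTO tbl_members VALUES (1, 'Ann'), (2, 'Bob'), (23, 'Cy');
    INSERT INTO tbl_orders VALUES
        (10, '1,23,', 'first'),
        (11, '2,', 'second'),
        (12, '23,1,', 'third');
""")

# Prepend a comma to the list so every id is wrapped as ",<id>,";
# the pattern then only matches whole ids, never a prefix of a longer one.
rows = conn.execute("""
    SELECT tm.member_id, tm.name, COUNT(o.order_id) AS order_count
    FROM tbl_members AS tm
    LEFT JOIN tbl_orders AS o
        ON (',' || o.member_ids) LIKE ('%,' || tm.member_id || ',%')
    GROUP BY tm.member_id, tm.name
    ORDER BY tm.member_id
""").fetchall()

print(rows)  # [(1, 'Ann', 2), (2, 'Bob', 1), (23, 'Cy', 2)]
```

If the schema can be changed, a separate order/member link table would avoid string matching altogether.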

How to use variable column name in filter in Django ORM?

I have two tables, BloodBank(id, name, phone, address) and BloodStock(id, a_pos, b_pos, a_neg, b_neg, bloodbank_id). I want to fetch all the columns from both tables, filtering on a column whose name is held in a variable (say bloodgroup, with values like a_pos or a_neg) and whose value should be greater than 0. How can I write this with the ORM?
SQL query is written like this to get the required results.
sql="select * from public.bloodbank_bloodbank as bb, public.bloodbank_bloodstock as bs where bs."+blood+">0 and bb.id=bs.bloodbank_id order by bs."+blood+" desc;"
cursor = connection.cursor()
cursor.execute(sql)
bloodbanks = cursor.fetchall()
You could be more specific in your questions, but I believe you have a variable called blood which contains the string name of the column and that the columns a_pos, b_pos, etc. are numeric.
You can use a dictionary to create keyword arguments from strings:
filter_dict = {'bloodstock__' + blood + '__gt': 0}
bloodbanks = Bloodbank.objects.filter(**filter_dict)
This will get you Bloodbank objects that have a related bloodstock with a greater than zero value in the bloodgroup represented by the blood variable.
Note that the way I have written this, you don't get the bloodstock columns selected, and you may get duplicate bloodbanks. If you want to eliminate duplicate bloodbanks, you can add .distinct() to your query. The bloodstocks are available for each bloodbank instance using .bloodstock_set.all().
The ORM will generate SQL using a join. Alternatively, you can do an EXISTS in the where clause and no join.
from django.db.models import Exists, OuterRef
filter_dict = {blood + '__gt': 0}
exists = Exists(Bloodstock.objects.filter(
    bloodbank_id=OuterRef('id'),
    **filter_dict
))
bloodbanks = Bloodbank.objects.filter(exists)
There will be no need for a .distinct() in this case.
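Since the column name arrives in a variable, it may be worth checking it against the known columns before turning it into a lookup. The following is a small sketch of that keyword-building step in plain Python, assuming the four blood-group columns listed in the question. build_stock_filter and ALLOWED_GROUPS are hypothetical names, and the Django call appears only as a comment:

```python
# Columns taken from the BloodStock table in the question
ALLOWED_GROUPS = {"a_pos", "b_pos", "a_neg", "b_neg"}


def build_stock_filter(blood, prefix=""):
    """Return filter kwargs such as {'bloodstock__a_pos__gt': 0}.

    `prefix` is the relation path to prepend, e.g. 'bloodstock__' when
    filtering from the bank side, or '' when filtering the stock model itself.
    """
    if blood not in ALLOWED_GROUPS:
        raise ValueError(f"unknown blood group column: {blood!r}")
    return {f"{prefix}{blood}__gt": 0}


# Join version from above (not executed here):
#   Bloodbank.objects.filter(**build_stock_filter(blood, "bloodstock__"))
filter_dict = build_stock_filter("a_pos", prefix="bloodstock__")
print(filter_dict)  # {'bloodstock__a_pos__gt': 0}
```

The dictionary is unpacked with ** exactly as in the answer; the whitelist only guards against unexpected column names.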

Dynamically add filter to SQLAlchemy TextClause

Assume I have a SQLAlchemy table which looks like:
class Country:
name = VARCHAR
population = INTEGER
continent = VARCHAR
num_states = INTEGER
My application allows seeing name and population for all countries. So I have a TextClause which looks like
"select name, population from Country"
I allow raw queries in my application, so I don't have the option to change this to a selectable.
At runtime, I want to let my users choose a field name and supply a field value to filter on. For example, a user can say "I only want to see name and population for countries where continent is Asia". So I want to dynamically add the filter
.where(Country.c.continent == 'Asia')
But I can't add .where to a TextClause.
Similarly, my user may choose to see name and population for countries where num_states is greater than 10. So I dynamically want to add the filter
.where(Country.c.num_states > 10)
But again I can't add .where to a TextClause.
What are the options I have to solve this problem?
Could subquery help here in any way?
Please add a filter based on the conditions. filter is used for adding where conditions in sqlalchemy.
Country.query.filter(Country.num_states > 10).all()
You can also do this:
query = Country.query.filter(Country.continent == 'Asia')
if user_input == 'states':
    query = query.filter(Country.num_states > 10)
query = query.all()
This is not doable in a general sense without parsing the query. In relational algebra terms, the user applies projection and selection operations to a table, and you want to apply selection operations to it. Since the user can apply arbitrary projections (e.g. user supplies SELECT id FROM table), you are not guaranteed to be able to always apply your filters on top, so you have to apply your filters before the user does. That means you need to rewrite it to SELECT id FROM (some subquery), which requires parsing the user's query.
However, we can sort of cheat depending on the database that you are using, by having the database engine do the parsing for you. The way to do this is with CTEs, by basically shadowing the table name with a CTE.
Using your example, it looks like the following. User supplies query
SELECT name, population FROM country;
You shadow country with a CTE:
WITH country AS (
SELECT * FROM country
WHERE continent = 'Asia'
) SELECT name, population FROM country;
Unfortunately, because of the way SQLAlchemy's CTE support works, it is tough to get it to generate a CTE for a TextClause. The solution is to basically generate the string yourself, using a custom compilation extension, something like this:
from sqlalchemy import select, text
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import ClauseElement, Executable
class WrappedQuery(Executable, ClauseElement):
    def __init__(self, name, outer, inner):
        self.name = name
        self.outer = outer
        self.inner = inner
@compiles(WrappedQuery)
def compile_wrapped_query(element, compiler, **kwargs):
    # Render "WITH <name> AS (<outer>) <inner>" by hand
    return "WITH {} AS ({}) {}".format(
        element.name,
        compiler.process(element.outer),
        compiler.process(element.inner))
c = Country.__table__
cte = select(["*"]).select_from(c).where(c.c.continent == "Asia")
query = WrappedQuery("country", cte, text("SELECT name, population FROM country"))
session.execute(query)
From my tests, this only works in PostgreSQL. SQLite and SQL Server both treat it as recursive instead of shadowing, and MySQL does not support CTEs.
I couldn't find anything nice for this in the documentation. I ended up resorting to pretty much just string processing, but at least it works!
from sqlalchemy.sql import text
query = "select name, population from Country"
params = {}
if continent is not None:
    # Note the leading space, and bind the value instead of formatting it into the SQL
    query += " WHERE continent = :continent"
    params["continent"] = continent
text_clause = text(query)
with sql_connection() as conn:
    results = conn.execute(text_clause, params)
You could also chain this logic with more clauses, although you'll have to create a boolean flag for the first WHERE clause and then use AND for the subsequent ones.
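Chaining several clauses can be sketched by collecting the conditions in a list and joining them with AND, which avoids tracking a flag for the first WHERE. This is only an illustration under assumptions: build_query, FILTERABLE and OPERATORS are hypothetical names, the rows are made up, and the stdlib sqlite3 module stands in for the real connection. Values are bound as :name parameters, a placeholder style that SQLAlchemy's text() also accepts.

```python
import sqlite3

FILTERABLE = {"continent", "num_states"}  # columns users may filter on
OPERATORS = {"eq": "=", "gt": ">"}        # supported comparisons


def build_query(filters):
    """Turn [(column, op, value), ...] into (sql, params)."""
    sql = "SELECT name, population FROM Country"
    clauses, params = [], {}
    for i, (column, op, value) in enumerate(filters):
        # Column and operator come from whitelists; only values are bound
        if column not in FILTERABLE or op not in OPERATORS:
            raise ValueError(f"unsupported filter: {column!r} {op!r}")
        key = f"p{i}"
        clauses.append(f"{column} {OPERATORS[op]} :{key}")
        params[key] = value
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Country (name TEXT, population INTEGER, "
    "continent TEXT, num_states INTEGER)"
)
# Fictional rows, for illustration only
conn.executemany(
    "INSERT INTO Country VALUES (?, ?, ?, ?)",
    [("Alpha", 100, "Asia", 12), ("Beta", 50, "Asia", 3), ("Gamma", 70, "Europe", 20)],
)

sql, params = build_query([("continent", "eq", "Asia"), ("num_states", "gt", 10)])
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)  # [('Alpha', 100)]
```

With SQLAlchemy, the same sql string could be wrapped in text() and executed with the params dictionary.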

How to count rows with SELECT COUNT(*) with SQLAlchemy?

I'd like to know if it's possible to generate a SELECT COUNT(*) FROM TABLE statement in SQLAlchemy without explicitly asking for it with execute().
If I use:
session.query(table).count()
then it generates something like:
SELECT count(*) AS count_1 FROM
(SELECT table.col1 as col1, table.col2 as col2, ... from table)
which is significantly slower in MySQL with InnoDB. I am looking for a solution that doesn't require the table to have a known primary key, as suggested in Get the number of rows in table using SQLAlchemy.
Query for just a single known column:
session.query(MyTable.col1).count()
I managed to render the following SELECT with SQLAlchemy on both layers.
SELECT count(*) AS count_1
FROM "table"
Usage from the SQL Expression layer
from sqlalchemy import select, func, Integer, Table, Column, MetaData
metadata = MetaData()
table = Table("table", metadata,
    Column('primary_key', Integer),
    Column('other_column', Integer)  # just to illustrate
)
print(select([func.count()]).select_from(table))
Usage from the ORM layer
You just subclass Query (you probably have anyway) and provide a specialized count() method, like this one.
from sqlalchemy.orm import Query
from sqlalchemy.sql.expression import func
class BaseQuery(Query):
    def count_star(self):
        count_query = (self.statement.with_only_columns([func.count()])
                       .order_by(None))
        return self.session.execute(count_query).scalar()
Please note that order_by(None) resets the ordering of the query, which is irrelevant to the counting.
Using this method you can have a count(*) on any ORM Query that will honor all the filter and join conditions already specified.
I needed to do a count of a very complex query with many joins. I was using the joins as filters, so I only wanted to know the count of the actual objects. count() was insufficient, but I found the answer in the docs here:
http://docs.sqlalchemy.org/en/latest/orm/tutorial.html
The code would look something like this (to count user objects):
from sqlalchemy import func
session.query(func.count(User.id)).scalar()
Addition to the Usage from the ORM layer in the accepted answer: count(*) can be done for ORM using the query.with_entities(func.count()), like this:
session.query(MyModel).with_entities(func.count()).scalar()
It can also be used in more complex cases, when we have joins and filters - the important thing here is to place with_entities after joins, otherwise SQLAlchemy could raise the Don't know how to join error.
For example:
we have a User model (id, name) and a Song model (id, title, genre)
we have user-song data in the UserSong model (user_id, song_id, is_liked), where user_id + song_id is the primary key
We want to get a number of user's liked rock songs:
SELECT count(*)
FROM user_song
JOIN song ON user_song.song_id = song.id
WHERE user_song.user_id = %(user_id)s
AND user_song.is_liked IS 1
AND song.genre = 'rock'
This query can be generated in the following way:
user_id = 1
query = session.query(UserSong)
query = query.join(Song, Song.id == UserSong.song_id)
query = query.filter(
    and_(
        UserSong.user_id == user_id,
        UserSong.is_liked.is_(True),
        Song.genre == 'rock'
    )
)
# Note: important to place `with_entities` after the join
query = query.with_entities(func.count())
liked_count = query.scalar()
Complete example is here.
If you are using the SQL Expression Style approach there is another way to construct the count statement if you already have your table object.
Preparations to get the table object. There are also different ways.
import sqlalchemy
database_engine = sqlalchemy.create_engine("connection string")
# Populate existing database via reflection into sqlalchemy objects
database_metadata = sqlalchemy.MetaData()
database_metadata.reflect(bind=database_engine)
table_object = database_metadata.tables.get("table_name") # This is just for illustration how to get the table_object
Issuing the count query on the table_object
query = table_object.count()
# This will produce something like, where id is a primary key column in "table_name" automatically selected by sqlalchemy
# 'SELECT count(table_name.id) AS tbl_row_count FROM table_name'
count_result = database_engine.scalar(query)
I'm not clear on what you mean by "without explicitly asking for it with execute()" So this might be exactly what you are not asking for.
OTOH, this might help others.
You can just run the textual SQL:
your_query="""
SELECT count(*) from table
"""
the_count = session.execute(text(your_query)).scalar()
def test_query(val: str):
    query = f"select count(*) from table where col1='{val}'"
    rtn = database_engine.query(query)
    cnt = rtn.one().count
but you can find the way if you checked debug watch
query = session.query(table.column).filter().with_entities(func.count(table.column.distinct()))
count = query.scalar()
this worked for me.
Gives the query:
SELECT count(DISTINCT table.column) AS count_1
FROM table where ...
Below is the way to find the count of any query.
aliased_query = alias(query)
db.session.query(func.count('*')).select_from(aliased_query).scalar()
Here is the link to the reference document if you want to explore more options or read details.

GROUP BY in Django Queries

Dear StackOverFlow community:
I need your help in executing following SQL query.
select DATE(creation_date), COUNT(creation_date) from blog_article WHERE creation_date BETWEEN SYSDATE() - INTERVAL 30 DAY AND SYSDATE() GROUP BY DATE(creation_date) AND author="scott_tiger";
Here is my Django Model
class Article(models.Model):
title = models.CharField(...)
author = models.CharField(...)
creation_date = models.DateField(...)
How can I form the aforementioned query in Django using the aggregate() and annotate() functions? I created something like this:
now = datetime.datetime.now()
date_diff = datetime.datetime.now() + datetime.timedelta(-30)
records = Article.objects.values('creation_date', Count('creation_date')).aggregate(Count('creation_date')).filter(author='scott_tiger', created_at__gt=date_diff, created_at__lte=now)
When I run this query it gives me following error -
'Count' object has no attribute 'split'
Any idea how to use it?
Delete Count('creation_date') from values and add annotate(Count('creation_date')) after filter.
Try
records = Article.objects.filter(author='scott_tiger', created_at__gt=date_diff,
    created_at__lte=now).values('creation_date').annotate(
    ccd=Count('creation_date')).values('creation_date', 'ccd')
You need to use creation_date__count or a customized name (ccd here) to refer to the count column after annotate(). Note that aggregate() returns a dict rather than a queryset, so it cannot be chained with values().
Also, values() before annotate() limits the GROUP BY columns, and the last values() declares the columns to be selected. There is no need to group by the COUNT, which is already computed per group of rows.
