I need a little help expressing SQL like this in SQLAlchemy:
SELECT
    s.agent_id,
    s.property_id,
    p.address_zip,
    (
        SELECT v.valuation
        FROM property_valuations v
        WHERE v.zip_code = p.address_zip
        ORDER BY ABS(DATEDIFF(v.as_of, s.date_sold))
        LIMIT 1
    ) AS back_valuation
FROM sales s
JOIN properties p ON s.property_id = p.id
The inner subquery is meant to fetch, from the table property_valuations with columns (zip_code INT, valuation DECIMAL, as_of DATE), the valuation closest to the date of sale in the sales table. I know roughly how to rewrite it, but I am completely stuck on the order_by expression - I cannot prepare the subquery so that I can pass the ordering member later.
Currently I have the following queries:
subquery = (
session.query(PropertyValuation)
.filter(PropertyValuation.zip_code == Property.address_zip)
.order_by(func.abs(func.datediff(PropertyValuation.as_of, Sale.date_sold)))
.limit(1)
)
query = session.query(Sale).join(Sale.property_)
How to combine these queries together?
How to combine these queries together?
Use as_scalar(), or label():
subquery = (
session.query(PropertyValuation.valuation)
.filter(PropertyValuation.zip_code == Property.address_zip)
.order_by(func.abs(func.datediff(PropertyValuation.as_of, Sale.date_sold)))
.limit(1)
)
query = session.query(Sale.agent_id,
Sale.property_id,
Property.address_zip,
# `subquery.as_scalar()` or
subquery.label('back_valuation'))\
.join(Property)
Using as_scalar() limits the returned columns and rows to 1, so you cannot fetch a whole model object with it (query(PropertyValuation) is a select of all the attributes of PropertyValuation), but fetching just the valuation attribute works.
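(Side note: in SQLAlchemy 1.4+ Query.as_scalar() has been renamed to Query.scalar_subquery(); older releases keep as_scalar(). A sketch of the same subquery with the newer spelling, against the same model classes:)
back_valuation = (
    session.query(PropertyValuation.valuation)
    .filter(PropertyValuation.zip_code == Property.address_zip)
    .order_by(func.abs(func.datediff(PropertyValuation.as_of, Sale.date_sold)))
    .limit(1)
    .scalar_subquery()  # .as_scalar() on SQLAlchemy < 1.4
    .label('back_valuation')
)
query = session.query(Sale.agent_id, Sale.property_id,
                      Property.address_zip, back_valuation).join(Property)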
but I am completely stuck on the order_by expression - I cannot prepare the subquery to pass the ordering member later.
There's no need to pass it later. Your current way of declaring the subquery is fine as it is, since SQLAlchemy can automatically correlate FROM objects to those of an enclosing query. I tried creating models that somewhat represent what you have, and here's how the query above works out (with added line-breaks and indentation for readability):
In [10]: print(query)
SELECT sale.agent_id AS sale_agent_id,
sale.property_id AS sale_property_id,
property.address_zip AS property_address_zip,
(SELECT property_valuations.valuation
FROM property_valuations
WHERE property_valuations.zip_code = property.address_zip
ORDER BY abs(datediff(property_valuations.as_of, sale.date_sold))
LIMIT ? OFFSET ?) AS back_valuation
FROM sale
JOIN property ON property.id = sale.property_id
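For reference, the throwaway models used above looked roughly like this (a sketch only; the column types are guesses based on the question):
from sqlalchemy import Column, Date, ForeignKey, Integer, Numeric
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Property(Base):
    __tablename__ = 'property'
    id = Column(Integer, primary_key=True)
    address_zip = Column(Integer)

class Sale(Base):
    __tablename__ = 'sale'
    id = Column(Integer, primary_key=True)
    agent_id = Column(Integer)
    property_id = Column(Integer, ForeignKey('property.id'))
    date_sold = Column(Date)
    property_ = relationship(Property)

class PropertyValuation(Base):
    __tablename__ = 'property_valuations'
    id = Column(Integer, primary_key=True)
    zip_code = Column(Integer)
    valuation = Column(Numeric)
    as_of = Column(Date)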
Related
I have 2 tables (say Student and College) and a third table (StudentCollege), which has student_id and college_id foreign keys.
I want to give output from below query:
list = (
db.session.query(StudentCollegeModel.College_id)
.filter(StudentCollegeModel.student_id == student_id)
.all()
)
to below query:
(
db.session.query(CollegeModel)
.filter(CollegeModel.College_id.in_(list))
.all()
)
But it's giving a programming error.
You don't need to execute the first query to use it as a subquery in the second. This saves you from having to construct an in-memory list of all the College_id values before making the in_() query, and means that you only make one round trip to the database.
subquery = (
db.session.query(StudentCollegeModel.College_id)
.filter(StudentCollegeModel.student_id == student_id)
)
result = (
db.session.query(CollegeModel)
.filter(CollegeModel.College_id.in_(subquery))
.all()
)
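An equivalent alternative, if you would rather avoid the IN (...) subquery altogether, is a plain join (a sketch, using the same model and column names as above):
result = (
    db.session.query(CollegeModel)
    .join(StudentCollegeModel,
          StudentCollegeModel.College_id == CollegeModel.College_id)
    .filter(StudentCollegeModel.student_id == student_id)
    .all()
)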
Your first query returns a list of named tuples, so you need to flatten it into a plain list of ids first. For example:
college_tuples = db.session.query(StudentCollegeModel.College_id).filter(StudentCollegeModel.student_id == student_id).all()
college_id_list = [r.College_id for r in college_tuples]
db.session.query(CollegeModel).filter(CollegeModel.College_id.in_(college_id_list)).all()
Here we fetch your college id tuples, flatten them, and use the flattened version to perform your second query.
Similarly:
cs = (StudentCollegeModel.query.filter_by(student_id=student_id)
      .with_entities(StudentCollegeModel.College_id).distinct())
CollegeModel.query.filter(CollegeModel.College_id.in_(cs)).all()
I needed to achieve something like this in the Django ORM:
(SELECT * FROM `stats` WHERE MODE = 1 ORDER BY DATE DESC LIMIT 2)
UNION
(SELECT * FROM `stats` WHERE MODE = 2 ORDER BY DATE DESC LIMIT 2)
UNION
(SELECT * FROM `stats` WHERE MODE = 3 ORDER BY DATE DESC LIMIT 2)
UNION
(SELECT * FROM `stats` WHERE MODE = 6 ORDER BY DATE DESC LIMIT 2)
UNION
(SELECT * FROM `stats` WHERE MODE = 5 AND is_completed != 3 ORDER BY DATE DESC)
# mode 5 can return more than 100 records so NO LIMIT here
for which I wrote this:
query_run_now_job_ids = Stats.objects.filter(mode=5).exclude(is_completed=3).order_by('-date')
list_of_active_job_ids = Stats.objects.filter(mode=1).order_by('-date')[:2].union(
Stats.objects.filter(mode=2).order_by('-date')[:2],
Stats.objects.filter(mode=3).order_by('-date')[:2],
Stats.objects.filter(mode=6).order_by('-date')[:2],
query_run_now_job_ids)
but somehow the returned list_of_active_job_ids is unordered, i.e. list_of_active_job_ids.ordered returns False, and when this queryset is passed to the Paginator class it gives:
UnorderedObjectListWarning:
Pagination may yield inconsistent results with an unordered object_list
I have already set the ordering in class Meta in models.py:
class Meta:
ordering = ['-date']
Without the paginator the query works fine and the page loads, but with the paginator the view never loads - it just keeps loading.
Is there any better alternative for achieving this without using a chain of union calls?
I also tried another alternative to the MySQL query above, but I'm stuck on another problem - writing the condition for mode = 5 in this query:
SELECT
MODE ,
SUBSTRING_INDEX(GROUP_CONCAT( `job_id` SEPARATOR ',' ),',',2) AS job_id_list,
SUBSTRING_INDEX(GROUP_CONCAT( `total_calculations` SEPARATOR ',' ),',',2) AS total_calculations
FROM `stats`
ORDER BY DATE DESC
Even if I were able to write this query, it would lead me to another challenge: converting it to the Django ORM.
So why is my query not ordered, even though I have set the ordering in class Meta?
And if not this query, is there any better alternative for achieving this?
Help would be appreciated!
I'm using Python 2.7 and Django 1.11.
While subqueries may be ordered, the resulting union data is not. You need to explicitly define the ordering.
from django.db import models
def make_query(mode, index):
return (
Stats.objects.filter(mode=mode).
annotate(_sort=models.Value(index, models.IntegerField())).
order_by('-date')
)
list_of_active_job_ids = make_query(1, 1)[:2].union(
make_query(2, 2)[:2],
make_query(3, 3)[:2],
make_query(6, 4)[:2],
make_query(5, 5).exclude(is_completed=3)
).order_by('_sort', '-date')
All I did was add a new literal value field _sort that has a different value for each subquery, and then order by it in the final query. The rest of the code is just to reduce duplication. It would have been even cleaner if it weren't for that mode=6 subquery.
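With the explicit ordering in place, the combined queryset can be handed to the paginator without the warning. A sketch (the page size of 25 is arbitrary):
from django.core.paginator import Paginator

paginator = Paginator(list_of_active_job_ids, 25)  # 25 objects per page, arbitrary
page = paginator.page(1)
jobs_on_page = list(page.object_list)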
I'm struggling to write an aggregating GROUP BY query with SQLAlchemy that returns the joined entity "higher up" (which happens to be the grouping key) together with an aggregate over the table "lower down", instead of returning the aggregated entity itself, e.g.:
qry = session.query(PSU, func.count(PSU.id)).join(PSU).join(StockUnit).join(Part).group_by(Part)
but I want to return (Part, the_count), not (PSU, the_count). Writing session.query(Part, func.count(...)) queries the wrong way round.
Here is the SQL I want to express using SQLAlchemy semantics:
select
psu.package_id,
p.*, -- the joined entity
count(psu.*) -- the aggregate
from packaged_stock_unit psu
inner join stock_unit su
on su.id = psu.stock_unit_id
inner join part p
on p.id = su.part_id
where
psu.some_value = 1
and psu.package_id = 1
group by psu.package_id, p.sku;
Perhaps this is possible with the SQLAlchemy base functions?
Use select_from() to control the "left" side of the join in case you need it:
qry = session.query(Part, func.count(PSU.id)).\
select_from(PSU).\
join(StockUnit).\
join(Part).\
group_by(Part)
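To match the original SQL exactly, the WHERE conditions and the package_id grouping can be layered on top (a sketch; the column names are taken from the SQL in the question):
qry = session.query(Part, func.count(PSU.id)).\
    select_from(PSU).\
    join(StockUnit).\
    join(Part).\
    filter(PSU.some_value == 1,
           PSU.package_id == 1).\
    group_by(PSU.package_id, Part)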
I have the following SQL query that returns what I need:
SELECT sensors_sensorreading.*, MAX(sensors_sensorreading.timestamp) AS "last"
FROM sensors_sensorreading
GROUP BY sensors_sensorreading.chipid
In words: get the last sensor reading entry for each unique chipid.
But I cannot seem to figure out the correct Django ORM statement to produce this query. The best I could come up with is:
SensorReading.objects.values('chipid').annotate(last=Max('timestamp'))
But if I inspect the raw SQL it generates:
>>> print connection.queries[-1:]
[{u'time': u'0.475', u'sql': u'SELECT
"sensors_sensorreading"."chipid",
MAX("sensors_sensorreading"."timestamp") AS "last" FROM
"sensors_sensorreading" GROUP BY "sensors_sensorreading"."chipid"'}]
As you can see, it almost generates the correct SQL, except Django selects only the chipid field and the aggregate "last" (but I need all the table fields returned instead).
Any idea how to return all fields?
Assuming you also have other fields in the table besides chipid and timestamp, then I would guess this is the SQL you actually need:
select * from (
SELECT *, row_number() over (partition by chipid order by timestamp desc) as RN
FROM sensors_sensorreading
) X where RN = 1
This will return the latest rows for each chipid with all the data that is in the row.
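If you want to run that SQL from Django without dropping to a raw cursor, one option is a raw queryset - a sketch, assuming the default table name shown above and a database that supports window functions:
latest_per_chip = SensorReading.objects.raw("""
    SELECT * FROM (
        SELECT *,
               row_number() OVER (PARTITION BY chipid ORDER BY timestamp DESC) AS rn
        FROM sensors_sensorreading
    ) x
    WHERE rn = 1
""")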
I'd like to know if it's possible to generate a SELECT COUNT(*) FROM TABLE statement in SQLAlchemy without explicitly asking for it with execute().
If I use:
session.query(table).count()
then it generates something like:
SELECT count(*) AS count_1
FROM (SELECT table.col1 AS col1, table.col2 AS col2, ... FROM table)
which is significantly slower in MySQL with InnoDB. I am looking for a solution that doesn't require the table to have a known primary key, as suggested in Get the number of rows in table using SQLAlchemy.
Query for just a single known column:
session.query(MyTable.col1).count()
I managed to render the following SELECT with SQLAlchemy on both layers.
SELECT count(*) AS count_1
FROM "table"
Usage from the SQL Expression layer
from sqlalchemy import select, func, Integer, Table, Column, MetaData
metadata = MetaData()
table = Table("table", metadata,
Column('primary_key', Integer),
Column('other_column', Integer) # just to illustrate
)
print(select([func.count()]).select_from(table))
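Executing that expression against an engine looks roughly like this (a sketch; the in-memory SQLite URL is just a placeholder):
from sqlalchemy import create_engine

engine = create_engine("sqlite://")  # placeholder connection string
metadata.create_all(engine)          # only needed if the table does not exist yet

with engine.connect() as conn:
    row_count = conn.execute(select([func.count()]).select_from(table)).scalar()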
Usage from the ORM layer
You just subclass Query (which you probably have anyway) and provide a specialized count() method, like this one.
from sqlalchemy.orm import Query
from sqlalchemy.sql.expression import func

class BaseQuery(Query):
def count_star(self):
count_query = (self.statement.with_only_columns([func.count()])
.order_by(None))
return self.session.execute(count_query).scalar()
Please note that order_by(None) resets the ordering of the query, which is irrelevant to the counting.
Using this method you can have a count(*) on any ORM Query that will honor all the filter and join conditions already specified.
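To wire count_star() in, the custom query class is passed to the session factory as query_cls. A sketch (MyModel and engine are placeholders):
from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine, query_cls=BaseQuery)
session = Session()

# honors the filters/joins of the query, but emits SELECT count(*)
total = session.query(MyModel).filter(MyModel.name == 'foo').count_star()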
I needed to do a count of a very complex query with many joins. I was using the joins as filters, so I only wanted to know the count of the actual objects. count() was insufficient, but I found the answer in the docs here:
http://docs.sqlalchemy.org/en/latest/orm/tutorial.html
The code would look something like this (to count user objects):
from sqlalchemy import func
session.query(func.count(User.id)).scalar()
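The same pattern extends to a query that uses joins purely as filters - a hypothetical sketch (the addresses relationship and Address.city column are assumptions, not part of the question):
from sqlalchemy import func

boston_user_count = (
    session.query(func.count(User.id.distinct()))
    .join(User.addresses)              # assumed relationship
    .filter(Address.city == 'Boston')  # assumed column
    .scalar()
)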
As an addition to the "Usage from the ORM layer" part of the accepted answer: count(*) can be done for the ORM using query.with_entities(func.count()), like this:
session.query(MyModel).with_entities(func.count()).scalar()
It can also be used in more complex cases, when we have joins and filters - the important thing here is to place with_entities after the joins, otherwise SQLAlchemy could raise a "Don't know how to join" error.
For example:
we have User model (id, name) and Song model (id, title, genre)
we have user-song data - the UserSong model (user_id, song_id, is_liked), where user_id + song_id is the primary key
We want to get a number of user's liked rock songs:
SELECT count(*)
FROM user_song
JOIN song ON user_song.song_id = song.id
WHERE user_song.user_id = %(user_id)s
AND user_song.is_liked IS 1
AND song.genre = 'rock'
This query can be generated in a following way:
from sqlalchemy import and_, func

user_id = 1
query = session.query(UserSong)
query = query.join(Song, Song.id == UserSong.song_id)
query = query.filter(
and_(
UserSong.user_id == user_id,
UserSong.is_liked.is_(True),
Song.genre == 'rock'
)
)
# Note: important to place `with_entities` after the join
query = query.with_entities(func.count())
liked_count = query.scalar()
Complete example is here.
If you are using the SQL Expression style, there is another way to construct the count statement when you already have your table object.
Preparation to get the table object (there are also other ways to do this):
import sqlalchemy
database_engine = sqlalchemy.create_engine("connection string")
# Populate existing database via reflection into sqlalchemy objects
database_metadata = sqlalchemy.MetaData()
database_metadata.reflect(bind=database_engine)
table_object = database_metadata.tables.get("table_name") # This is just for illustration how to get the table_object
Issuing the count query on the table_object
query = table_object.count()
# This will produce something like, where id is a primary key column in "table_name" automatically selected by sqlalchemy
# 'SELECT count(table_name.id) AS tbl_row_count FROM table_name'
count_result = database_engine.scalar(query)
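Note that Table.count() is deprecated in later SQLAlchemy releases and removed in 1.4; with newer versions the same count can be issued from the core expression language instead (a sketch against the reflected table_object above):
from sqlalchemy import func, select

count_query = select(func.count()).select_from(table_object)  # SQLAlchemy 1.4+ style
with database_engine.connect() as connection:
    count_result = connection.execute(count_query).scalar()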
I'm not clear on what you mean by "without explicitly asking for it with execute()", so this might be exactly what you are not asking for.
OTOH, this might help others.
You can just run the textual SQL:
from sqlalchemy import text

your_query = """
SELECT count(*) FROM table
"""
the_count = session.execute(text(your_query)).scalar()
from sqlalchemy import text

def test_query(val: str):
    # parameterised raw SQL; avoids string interpolation / SQL injection
    query = text("SELECT count(*) FROM table WHERE col1 = :val")
    return session.execute(query, {"val": val}).scalar()
You can verify the generated statement and the returned count in a debug watch.
query = session.query(table.column).filter().with_entities(func.count(table.column.distinct()))
count = query.scalar()
This worked for me. It gives the query:
SELECT count(DISTINCT table.column) AS count_1
FROM table where ...
Below is the way to find the count of any query.
aliased_query = query.subquery()
db.session.query(func.count()).select_from(aliased_query).scalar()
Here is the link to the reference document if you want to explore more options or read details.