I would like to convert the following query to SQLAlchemy, but the documentation isn't very helpful:
select * from (
select *,
RANK() OVER (PARTITION BY id ORDER BY date desc) AS RNK
from table1
) d
where RNK = 1
Any suggestions?
Use the over() expression:
from sqlalchemy import func
subquery = db.session.query(
table1,
func.rank().over(
order_by=table1.c.date.desc(),
partition_by=table1.c.id
).label('rnk')
).subquery()
query = db.session.query(subquery).filter(
subquery.c.rnk==1
)
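If there can be ties on date within an id, rank() returns every tied row with rnk = 1; if you want exactly one row per id, func.row_number() can be swapped in (a small variation on the snippet above, same assumptions about table1):

from sqlalchemy import func

subquery = db.session.query(
    table1,
    func.row_number().over(        # unlike rank(), never produces ties
        order_by=table1.c.date.desc(),
        partition_by=table1.c.id
    ).label('rnk')
).subquery()

query = db.session.query(subquery).filter(subquery.c.rnk == 1)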
Need help translating this SQL query into SQLAlchemy:
select
COALESCE(DATE_1,DATE_2) as DATE_COMPLETE,
QUESTIONS_CNT,
ANSWERS_CNT
from (
(select DATE as DATE_1,
count(distinct QUESTIONS) as QUESTIONS_CNT
from GUEST_USERS
where LOCATION like '%TEXAS%'
and DATE = '2021-08-08'
group by DATE
) temp1
full join
(select DATE as DATE_2,
count(distinct ANSWERS) as ANSWERS_CNT
from USERS
where LOCATION like '%TEXAS%'
and DATE = '2021-08-08'
group by DATE
) temp2
on temp1.DATE_1=temp2.DATE_2
)
Mainly struggling with the join of the two subqueries. I've tried this (just for the join part of the SQL):
query1 = db.session.query(
GUEST_USERS.DATE_WEEK_START.label("DATE_1"),
func.count(GUEST_USERS.QUESTIONS).label("QUESTIONS_CNT")
).filter(
GUEST_USERS.LOCATION.like("%TEXAS%"),
GUEST_USERS.DATE == "2021-08-08"
).group_by(GUEST_USERS.DATE)
query2 = db_session_stg.query(
USERS.DATE.label("DATE_2"),
func.count(USERS.ANSWERS).label("ANSWERS_CNT")
).filter(
USERS.LOCATION.like("%TEXAS%"),
USERS.DATE == "2021-08-08"
).group_by(USERS.DATE)
sq2 = query2.subquery()
query1_results = query1.join(
sq2,
sq2.c.DATE_2 == GUEST_USERS.DATE
).all()
In the output I receive only the DATE_1 and QUESTIONS_CNT columns. Any idea why the columns selected from the subquery are not being returned in the result?
Not sure if this is the best solution, but this is how I got it to work, using three subqueries essentially.
query1 = db.session.query(
GUEST_USERS.DATE_WEEK_START.label("DATE_1"),
func.count(GUEST_USERS.QUESTIONS).label("QUESTIONS_CNT")
).filter(
GUEST_USERS.LOCATION.like("%TEXAS%"),
GUEST_USERS.DATE == "2021-08-08"
).group_by(GUEST_USERS.DATE)
query2 = db_session_stg.query(
USERS.DATE.label("DATE_2"),
func.count(USERS.ANSWERS).label("ANSWERS_CNT")
).filter(
USERS.LOCATION.like("%TEXAS%"),
USERS.DATE == "2021-08-08"
).group_by(USERS.DATE)
sq1 = query1.subquery()
sq2 = query2.subquery()
query3 = db.session.query(sq1, sq2).join(
sq2,
sq2.c.DATE_2 == sq1.c.DATE_1)
sq3 = query3.subquery()
query4 = db.session.query(
func.coalesce(
sq3.c.DATE_1, sq3.c.DATE_2),
sq3.c.QUESTIONS_CNT,
sq3.c.ANSWERS_CNT
)
results = query4.all()
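One note on the translation: the original SQL uses a full join, while .join() emits an INNER JOIN, so a date present in only one of the two subqueries would be dropped. On a backend that supports FULL OUTER JOIN (MySQL, for one, does not), a sketch of a closer equivalent would be:

query3 = db.session.query(sq1, sq2).join(
    sq2,
    sq2.c.DATE_2 == sq1.c.DATE_1,
    isouter=True,  # LEFT OUTER JOIN ...
    full=True      # ... rendered as FULL OUTER JOIN where the dialect allows it
)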
I have a complex SQL query, shown below, which I use to access a MySQL db from a Python script.
sql_query_vav = """SELECT t1.deviceId, t1.date, t1.vavId, t1.timestamp, t1.nvo_airflow as airflow, t1.nvo_air_damper_position as damper_position , t1.nvo_temperature_sensor_pps as vavTemperature , d.MILO as miloId ,m1.timestamp as miloTimestamp, m1.temperature as miloTemperature
FROM
(SELECT deviceId, date, nvo_airflow, nvo_air_damper_position, nvo_temperature_sensor_pps, vavId, timestamp, counter from vavData where date=%s and floor=%s) t1
INNER JOIN
(SELECT date,max(timestamp) as timestamp,vavId from vavData where date=%s and floor=%s group by vavId) t2
ON (t1.timestamp = t2.timestamp)
INNER JOIN
(SELECT VAV,MILO,floor from VavMiloMapping where floor = %s) d
ON (t1.vavId = d.VAV )
INNER JOIN
(SELECT t1.deviceId,t1.date,t1.timestamp,t1.humidity,t1.temperature,t1.block,t1.floor,t1.location
FROM
(SELECT deviceId,date,timestamp,humidity,temperature,block,floor,location from miloData WHERE date=%s and floor=%s) t1
INNER JOIN
(SELECT deviceId,max(timestamp) as timestamp,location from miloData where date=%s and floor=%s GROUP BY deviceId) t2
ON (t1.timestamp = t2.timestamp)) m1
ON (d.MILO = m1.location) order by t1.vavId"""
I get an error with the above query which says
mysql.connector.errors.ProgrammingError: 1055 (42000): Expression #3 of SELECT list is not in GROUP BY
clause and contains nonaggregated column 'minniedb.miloData.location' which is not functionally dependent
on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
I have tried to change the SQL mode by executing
SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
and tried to restart the MySQL service using
sudo service mysql restart
I think I have done everything required. Why am I still getting the same error?
If you want to find where the problem is, you must at least format the code carefully.
SELECT t1.deviceId,
t1.date,
t1.vavId,
t1.timestamp,
t1.nvo_airflow as airflow,
t1.nvo_air_damper_position as damper_position ,
t1.nvo_temperature_sensor_pps as vavTemperature ,
d.MILO as miloId ,
m1.timestamp as miloTimestamp,
m1.temperature as miloTemperature
FROM ( SELECT deviceId,
date,
nvo_airflow,
nvo_air_damper_position,
nvo_temperature_sensor_pps,
vavId,
timestamp,
counter
from vavData
where date=%s
and floor=%s
) t1
INNER JOIN ( SELECT date,
max(timestamp) as timestamp,
vavId
from vavData
where date=%s
and floor=%s
group by vavId
) t2 ON (t1.timestamp = t2.timestamp)
INNER JOIN ( SELECT VAV,
MILO,
floor
from VavMiloMapping
where floor = %s
) d ON (t1.vavId = d.VAV )
INNER JOIN ( SELECT t1.deviceId,
t1.date,
t1.timestamp,
t1.humidity,
t1.temperature,
t1.block,
t1.floor,
t1.location
FROM ( SELECT deviceId,
date,
timestamp,
humidity,
temperature,
block,
floor,
location
from miloData
WHERE date=%s
and floor=%s
) t1
INNER JOIN ( SELECT deviceId,
max(timestamp) as timestamp,
location
from miloData
where date=%s
and floor=%s
GROUP BY deviceId
) t2 ON (t1.timestamp = t2.timestamp)
) m1 ON (d.MILO = m1.location)
order by t1.vavId
Now it is visible that there are two problem spots. Both problematic subqueries have the alias t2 and look like:
SELECT some_Id,
max(timestamp) as timestamp,
some_another_field
from some_table
where some_conditions
GROUP BY some_Id
The field marked as some_another_field appears in neither the GROUP BY expression nor an aggregate function.
Correct these subqueries, for example by dropping the extra column from the SELECT list, adding it to the GROUP BY, or wrapping it in an aggregate (MySQL's ANY_VALUE, for instance).
I am interested in finding the most efficient manner to query the following:
For a list of table names, return the table name if it contains at least one record that meets the conditions
Essentially, something similar to the following Python code in a single query:
dfs = [pd.read_sql('SELECT name FROM {} WHERE a=1 AND b=2'.format(table), engine) for table in tables]
tables = [table for table, df in zip(tables, dfs) if not df.empty]
Is this possible in MySQL?
Assuming you trust the table names in tables not to contain any surprises leading to SQL injection, you could devise something like:
from sqlalchemy import text
selects = [f'SELECT :table_{i} FROM {table} WHERE a = 1 AND b = 2'
for i, table in enumerate(tables)]
stmt = ' UNION '.join(selects)
stmt = text(stmt)
results = engine.execute(
stmt, {f'table_{i}': table for i, table in enumerate(tables)})
or you could use SQLAlchemy constructs to build the same query safely:
from sqlalchemy import table, column, union, and_, select, Integer, literal
tbls = [table(name,
column('a', Integer),
column('b', Integer)) for name in tables]
stmt = union(*[select([literal(name).label('name')]).
select_from(tbl).
where(and_(tbl.c.a == 1, tbl.c.b == 2))
for tbl, name in zip(tbls, tables)])
results = engine.execute(stmt)
You can use a UNION of queries that search each table.
(SELECT 'table1' AS table_name
FROM table1
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table2' AS table_name
FROM table2
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table3' AS table_name
FROM table3
WHERE a = 1 AND b = 2
LIMIT 1)
...
I am trying to execute a SQL query in Python. The query works well in SQL but throws an error in Python. The query is used to find all the duplicates in the 'Excel_Values' table.
Query:
WITH duplicates AS (
select CaseID, ColumnName, TabType, ColumnValue, count(*) from [Excel_Values]
where Validation_Key= 52
group by CaseID, ColumnName, TabType
having count(distinct ColumnValue) > 1
)
SELECT a.*
FROM [Excel_Values] a
JOIN duplicates b ON (a.CaseID = b.CaseID and a.ColumnName = b.ColumnName and a.TabType = b.TabType and a.Validation_Key = 52);
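For reference, one common way to run a statement like this from Python is to wrap it in SQLAlchemy's text() construct; a minimal sketch (the connection URL below is a placeholder, not taken from the question):

from sqlalchemy import create_engine, text

engine = create_engine("mssql+pyodbc://user:password@my_dsn")  # placeholder URL

duplicates_sql = text("""
WITH duplicates AS (
    select CaseID, ColumnName, TabType, ColumnValue, count(*) from [Excel_Values]
    where Validation_Key = 52
    group by CaseID, ColumnName, TabType
    having count(distinct ColumnValue) > 1
)
SELECT a.*
FROM [Excel_Values] a
JOIN duplicates b ON (a.CaseID = b.CaseID and a.ColumnName = b.ColumnName
                      and a.TabType = b.TabType and a.Validation_Key = 52)
""")

with engine.connect() as conn:
    rows = conn.execute(duplicates_sql).fetchall()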
Each row in my table has a date. The date is not unique. The same date is present more than one time.
I want to get all objects with the youngest date.
My solution works, but I am not sure whether this is an elegant SQLAlchemy way.
query = _session.query(Table._date) \
.order_by(Table._date.desc()) \
.group_by(Table._date)
# this is the youngest date (type is datetime.date)
young = query.first()
query = _session.query(Table).filter(Table._date==young)
result = query.all()
Isn't there a way to put all this in one query object or something like that?
You need a having clause, and you need to import the max function
then your query will be:
from sqlalchemy import func
stmt = _session.query(Table) \
.group_by(Table._date) \
.having(Table._date == func.max(Table._date))
This produces a sql statement like the following.
SELECT my_table.*
FROM my_table
GROUP BY my_table._date
HAVING my_table._date = MAX(my_table._date)
If you construct your sql statement with a select, you can examine the SQL it produces with the following. (I'm not sure whether this works the same way for a Query object.)
str(stmt)
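For a Query object specifically, both of the following should render the SQL as well (a quick check against the stmt built above; not verified here):

print(str(stmt))            # the Query compiles its SELECT when stringified
print(str(stmt.statement))  # or inspect the underlying Select construct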
Two ways of doing this using a sub-query:
# note: we do not need to alias, but do so in order to specify the name
T1 = aliased(MyTable, name="T1")
# version-1:
subquery = (session.query(func.max(T1._date).label("max_date"))
.as_scalar()
)
# version-2:
subquery = (session.query(T1._date.label("max_date"))
.order_by(T1._date.desc())
.limit(1)
.as_scalar()
)
qry = session.query(MyTable).filter(MyTable._date == subquery)
results = qry.all()
The output should be similar to:
# version-1
SELECT my_table.id AS my_table_id, my_table.name AS my_table_name, my_table._date AS my_table__date
FROM my_table
WHERE my_table._date = (
SELECT max("T1"._date) AS max_date
FROM my_table AS "T1")
# version-2
SELECT my_table.id AS my_table_id, my_table.name AS my_table_name, my_table._date AS my_table__date
FROM my_table
WHERE my_table._date = (
SELECT "T1"._date AS max_date
FROM my_table AS "T1"
ORDER BY "T1"._date DESC LIMIT ? OFFSET ?
)
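As a side note, on SQLAlchemy 1.4+ the as_scalar() call is deprecated in favour of scalar_subquery(); version-1 would then look roughly like this (same assumptions about MyTable and session as above):

from sqlalchemy import func

# scalar subquery returning the single max date (1.4+ spelling)
subquery = (session.query(func.max(MyTable._date).label("max_date"))
            .scalar_subquery()
            )

qry = session.query(MyTable).filter(MyTable._date == subquery)
results = qry.all()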