I have the following SQL code:
SELECT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t1.id = t2.sample_id
GROUP BY t1.id
HAVING NOW() > (SELECT end FROM table2 WHERE start = max(t2.start))
             + (t1.time::text || ' minute')::interval
I tried to rewrite this code in SQLAlchemy like this:
(Table1.query
    .join(Table2, Table1.id == Table2.sample_id)
    .group_by(Table1.id)
    .having(
        datetime.datetime.now() > (
            Table2.query.filter_by(
                start=database.session.query(func.max(Table2.start))
            ).first().end
            + datetime.timedelta(
                minutes=Table1.query.filter_by(id=Table1.id).first().frequency
            )
        )
    )
    .all())
This SQL query returns one row of data from the database, but my SQLAlchemy query returns an empty result. Can someone help me fix my recreation of the SQL query?
Related
I have a complex SQL query, shown below, which I am using to access a MySQL db from a Python script.
sql_query_vav = """SELECT t1.deviceId, t1.date, t1.vavId, t1.timestamp, t1.nvo_airflow as airflow, t1.nvo_air_damper_position as damper_position , t1.nvo_temperature_sensor_pps as vavTemperature , d.MILO as miloId ,m1.timestamp as miloTimestamp, m1.temperature as miloTemperature
FROM
(SELECT deviceId, date, nvo_airflow, nvo_air_damper_position, nvo_temperature_sensor_pps, vavId, timestamp, counter from vavData where date=%s and floor=%s) t1
INNER JOIN
(SELECT date,max(timestamp) as timestamp,vavId from vavData where date=%s and floor=%s group by vavId) t2
ON (t1.timestamp = t2.timestamp)
INNER JOIN
(SELECT VAV,MILO,floor from VavMiloMapping where floor = %s) d
ON (t1.vavId = d.VAV )
INNER JOIN
(SELECT t1.deviceId,t1.date,t1.timestamp,t1.humidity,t1.temperature,t1.block,t1.floor,t1.location
FROM
(SELECT deviceId,date,timestamp,humidity,temperature,block,floor,location from miloData WHERE date=%s and floor=%s) t1
INNER JOIN
(SELECT deviceId,max(timestamp) as timestamp,location from miloData where date=%s and floor=%s GROUP BY deviceId) t2
ON (t1.timestamp = t2.timestamp)) m1
ON (d.MILO = m1.location) order by t1.vavId"""
I get an error with the above query which says
mysql.connector.errors.ProgrammingError: 1055 (42000): Expression #3 of SELECT list is not in GROUP BY
clause and contains nonaggregated column 'minniedb.miloData.location' which is not functionally dependent
on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
I have tried to change the SQL mode by executing
SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
and tried to restart the MySQL service using
sudo service mysql restart
I think I have done everything required. Why am I still getting the same error?
If you want to find the source of the incorrectness, you must at least format the code carefully.
SELECT t1.deviceId,
t1.date,
t1.vavId,
t1.timestamp,
t1.nvo_airflow as airflow,
t1.nvo_air_damper_position as damper_position ,
t1.nvo_temperature_sensor_pps as vavTemperature ,
d.MILO as miloId ,
m1.timestamp as miloTimestamp,
m1.temperature as miloTemperature
FROM ( SELECT deviceId,
date,
nvo_airflow,
nvo_air_damper_position,
nvo_temperature_sensor_pps,
vavId,
timestamp,
counter
from vavData
where date=%s
and floor=%s
) t1
INNER JOIN ( SELECT date,
max(timestamp) as timestamp,
vavId
from vavData
where date=%s
and floor=%s
group by vavId
) t2 ON (t1.timestamp = t2.timestamp)
INNER JOIN ( SELECT VAV,
MILO,
floor
from VavMiloMapping
where floor = %s
) d ON (t1.vavId = d.VAV )
INNER JOIN ( SELECT t1.deviceId,
t1.date,
t1.timestamp,
t1.humidity,
t1.temperature,
t1.block,
t1.floor,
t1.location
FROM ( SELECT deviceId,
date,
timestamp,
humidity,
temperature,
block,
floor,
location
from miloData
WHERE date=%s
and floor=%s
) t1
INNER JOIN ( SELECT deviceId,
max(timestamp) as timestamp,
location
from miloData
where date=%s
and floor=%s
GROUP BY deviceId
) t2 ON (t1.timestamp = t2.timestamp)
) m1 ON (d.MILO = m1.location)
order by t1.vavId
Now it is visible that there are two problem spots. Both problematic subqueries are aliased as t2 and look like this:
SELECT some_Id,
max(timestamp) as timestamp,
some_another_field
from some_table
where some_conditions
GROUP BY some_Id
The field marked as some_another_field is included in neither the GROUP BY expression nor an aggregate function.
Correct these subqueries.
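For example, the miloData subquery is joined on the timestamp only, so it does not need location at all; one possible correction (just a sketch of that option) is:
SELECT deviceId,
       max(timestamp) as timestamp
from miloData
where date=%s
  and floor=%s
GROUP BY deviceId
If the extra column really is needed, add it to the GROUP BY clause or wrap it in ANY_VALUE() (available in MySQL 5.7+) instead.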
I am interested in finding the most efficient manner to query the following:
For a list of table names, return the table name if it contains at least one record that meets the conditions.
Essentially, something similar to the following Python code in a single query:
dfs = [pd.read_sql('SELECT name FROM {} WHERE a=1 AND b=2'.format(table), engine) for table in tables]
tables = [table for table, df in zip(tables, dfs) if not df.empty]
Is this possible in MySQL?
Assuming you trust the table names in tables not to contain any surprises leading to SQL injection, you could devise something like:
from sqlalchemy import text
selects = [f'SELECT :table_{i} FROM {table} WHERE a = 1 AND b = 2'
for i, table in enumerate(tables)]
stmt = ' UNION '.join(selects)
stmt = text(stmt)
results = engine.execute(
stmt, {f'table_{i}': table for i, table in enumerate(tables)})
or you could use SQLAlchemy constructs to build the same query safely:
from sqlalchemy import table, column, union, and_, select, Integer, literal
tbls = [table(name,
column('a', Integer),
column('b', Integer)) for name in tables]
stmt = union(*[select([literal(name).label('name')]).
select_from(tbl).
where(and_(tbl.c.a == 1, tbl.c.b == 2))
for tbl, name in zip(tbls, tables)])
results = engine.execute(stmt)
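Either way, the matching table names can then be collected from the result, for example:
matching = [row[0] for row in results]  # tables that had at least one row with a = 1 and b = 2
Each SELECT emits the table name as a constant and UNION removes duplicates, so every matching table appears exactly once.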
You can use a UNION of queries that search each table.
(SELECT 'table1' AS table_name
FROM table1
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table2' AS table_name
FROM table2
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table3' AS table_name
FROM table3
WHERE a = 1 AND b = 2
LIMIT 1)
...
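If the list of tables is only known at run time, this statement can be assembled from Python in the same spirit as the other answer (again assuming the table names are trusted):
union_sql = '\nUNION\n'.join(
    f"(SELECT '{table}' AS table_name FROM {table} WHERE a = 1 AND b = 2 LIMIT 1)"
    for table in tables
)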
I have a Python script which connects to a SQL Server instance. I am running a CTE query to remove duplicates. The query runs successfully, but when I use the fetchall() function it results in an error: the previous query is not a sql query. After checking the db table, the duplicates still exist. This is the same with both pyodbc and SQLAlchemy.
pyodbc code:
import pyodbc
conn = pyodbc.connect(connection_string)
cursor = conn.cursor()
query = ''';with cte as
(
SELECT [ID], [TIME], ROW_NUMBER() OVER
(PARTITION BY [ID] order by [TIME] desc) as rn
from table
)delete from cte WHERE rn > 1'''
cursor.execute(query)
cursor.close()
conn.close()
SQLAlchemy code:
from sqlalchemy import create_engine
from sqlalchemy.sql import text
import urllib
conn = urllib.parse.quote_plus(connection_string)
engine = create_engine('mssql+pyodbc:///?odbc_connect={}'.format(conn))
query = '''with cte as
(
SELECT [ID], [TIME], ROW_NUMBER() OVER (PARTITION BY [ID] order by [TIME] desc) as rn
from table
)
delete from cte WHERE rn > 1'''
connect = engine.connect()
result = connect.execute(query)
if result.returns_rows:
print("Duplicates removed")
else:
print("No row is returned")
when i used the fetchall() function it results in an Error: the previous query is not a sql query
This is the expected behaviour. Although your query includes a SELECT as part of the CTE, the query itself is ultimately a DELETE query which does not return rows. It will return a row count that you can retrieve with Cursor#rowcount, but Cursor#fetchall() will throw an error because there are no rows to retrieve.
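For example, with the pyodbc version of the code (a minimal sketch; note that pyodbc connections have autocommit off by default, so the DELETE also has to be committed, otherwise the duplicates stay in the table):
import pyodbc

conn = pyodbc.connect(connection_string)
cursor = conn.cursor()
cursor.execute(query)     # the CTE DELETE from the question
print(cursor.rowcount)    # number of duplicate rows deleted; no fetchall() here
conn.commit()             # without this, the delete is rolled back when the connection closes
cursor.close()
conn.close()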
I am trying to execute a SQL query in Python. The query works well in SQL but throws an error in Python. The query is used to find all the duplicates in the 'Excel_Values' table.
Query:
WITH duplicates AS (
select CaseID, ColumnName, TabType, ColumnValue, count(*) from [Excel_Values]
where Validation_Key= 52
group by CaseID, ColumnName, TabType
having count(distinct ColumnValue) > 1
)
SELECT a.*
FROM [Excel_Values] a
JOIN duplicates b ON (a.CaseID = b.CaseID and a.ColumnName = b.ColumnName and a.TabType = b.TabType and a.Validation_Key = 52);
Each row in my table has a date. The date is not unique. The same date is present more than one time.
I want to get all objects with the youngest date.
My solution works, but I am not sure if this is an elegant SQLAlchemy way.
query = _session.query(Table._date) \
.order_by(Table._date.desc()) \
.group_by(Table._date)
# this is the youngest date (a datetime.date)
young = query.first()
query = _session.query(Table).filter(Table._date==young)
result = query.all()
Isn't there a way to put all this in one query object or something like that?
You need a having clause, and you need to import the max function; then your query will be:
from sqlalchemy import func
stmt = _session.query(Table) \
    .group_by(Table._date) \
    .having(Table._date == func.max(Table._date))
This produces a sql statement like the following.
SELECT my_table.*
FROM my_table
GROUP BY my_table._date
HAVING my_table._date = MAX(my_table._date)
If you construct your SQL statement with a select, you can examine the SQL produced in your case using the following. (I'm not sure whether the same works for Query objects.)
str(stmt)
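For what it's worth, str() does seem to work directly on Query objects as well, so the stmt built above can be inspected the same way:
print(str(stmt))  # prints the SELECT ... GROUP BY ... HAVING ... that would be emitted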
Two ways of doing this using a sub-query:
# note: aliasing is not required, but it is done here in order to specify `name`
T1 = aliased(MyTable, name="T1")
# version-1:
subquery = (session.query(func.max(T1._date).label("max_date"))
.as_scalar()
)
# version-2:
subquery = (session.query(T1._date.label("max_date"))
.order_by(T1._date.desc())
.limit(1)
.as_scalar()
)
qry = session.query(MyTable).filter(MyTable._date == subquery)
results = qry.all()
The output should be similar to:
# version-1
SELECT my_table.id AS my_table_id, my_table.name AS my_table_name, my_table._date AS my_table__date
FROM my_table
WHERE my_table._date = (
SELECT max("T1"._date) AS max_date
FROM my_table AS "T1")
# version-2
SELECT my_table.id AS my_table_id, my_table.name AS my_table_name, my_table._date AS my_table__date
FROM my_table
WHERE my_table._date = (
SELECT "T1"._date AS max_date
FROM my_table AS "T1"
ORDER BY "T1"._date DESC LIMIT ? OFFSET ?
)