'Having' Clause in SQLite3 Python

I am trying to execute a SQL query in Python. The query works fine when run directly in SQL, but throws an error in Python. The query is used to find all the duplicates in the 'Excel_Values' table.
Query:
WITH duplicates AS (
select CaseID, ColumnName, TabType, ColumnValue, count(*) from [Excel_Values]
where Validation_Key= 52
group by CaseID, ColumnName, TabType
having count(distinct ColumnValue) > 1
)
SELECT a.*
FROM [Excel_Values] a
JOIN duplicates b ON (a.CaseID = b.CaseID and a.ColumnName = b.ColumnName and a.TabType = b.TabType and a.Validation_Key = 52);
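For what it's worth, here is a minimal sketch of running the whole statement from Python with the sqlite3 module (the database file name is hypothetical; the key point is that the WITH ... SELECT is a single statement, so it is passed to cursor.execute() as one string rather than split up):
import sqlite3

query = '''
WITH duplicates AS (
    SELECT CaseID, ColumnName, TabType, ColumnValue, COUNT(*)
    FROM [Excel_Values]
    WHERE Validation_Key = 52
    GROUP BY CaseID, ColumnName, TabType
    HAVING COUNT(DISTINCT ColumnValue) > 1
)
SELECT a.*
FROM [Excel_Values] a
JOIN duplicates b
  ON  a.CaseID = b.CaseID
  AND a.ColumnName = b.ColumnName
  AND a.TabType = b.TabType
  AND a.Validation_Key = 52
'''

con = sqlite3.connect('excel_values.db')   # hypothetical database file
cur = con.cursor()
cur.execute(query)        # one statement, so execute() (not executescript) is enough
rows = cur.fetchall()
con.close()
Note that CTEs require SQLite 3.8.3 or newer, so a very old library linked into the Python build could also be the source of the error; sqlite3.sqlite_version shows the library version in use.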

Related

Error while executing a SQL query using MySQL db and Python client

I have a complex SQL query, shown below, which I am using to access a MySQL database from a Python script.
sql_query_vav = """SELECT t1.deviceId, t1.date, t1.vavId, t1.timestamp, t1.nvo_airflow as airflow, t1.nvo_air_damper_position as damper_position , t1.nvo_temperature_sensor_pps as vavTemperature , d.MILO as miloId ,m1.timestamp as miloTimestamp, m1.temperature as miloTemperature
FROM
(SELECT deviceId, date, nvo_airflow, nvo_air_damper_position, nvo_temperature_sensor_pps, vavId, timestamp, counter from vavData where date=%s and floor=%s) t1
INNER JOIN
(SELECT date,max(timestamp) as timestamp,vavId from vavData where date=%s and floor=%s group by vavId) t2
ON (t1.timestamp = t2.timestamp)
INNER JOIN
(SELECT VAV,MILO,floor from VavMiloMapping where floor = %s) d
ON (t1.vavId = d.VAV )
INNER JOIN
(SELECT t1.deviceId,t1.date,t1.timestamp,t1.humidity,t1.temperature,t1.block,t1.floor,t1.location
FROM
(SELECT deviceId,date,timestamp,humidity,temperature,block,floor,location from miloData WHERE date=%s and floor=%s) t1
INNER JOIN
(SELECT deviceId,max(timestamp) as timestamp,location from miloData where date=%s and floor=%s GROUP BY deviceId) t2
ON (t1.timestamp = t2.timestamp)) m1
ON (d.MILO = m1.location) order by t1.vavId"""
I get an error with the above query which says
mysql.connector.errors.ProgrammingError: 1055 (42000): Expression #3 of SELECT list is not in GROUP BY
clause and contains nonaggregated column 'minniedb.miloData.location' which is not functionally dependent
on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
I have tried to change the SQL mode by executing
SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
and tried to restart the MySQL service using
sudo service mysql restart
I think I have done everything required. Why am I still getting the same error?
If you want to find the source of the error, you should at least format the query carefully first.
SELECT t1.deviceId,
       t1.date,
       t1.vavId,
       t1.timestamp,
       t1.nvo_airflow as airflow,
       t1.nvo_air_damper_position as damper_position,
       t1.nvo_temperature_sensor_pps as vavTemperature,
       d.MILO as miloId,
       m1.timestamp as miloTimestamp,
       m1.temperature as miloTemperature
FROM ( SELECT deviceId,
              date,
              nvo_airflow,
              nvo_air_damper_position,
              nvo_temperature_sensor_pps,
              vavId,
              timestamp,
              counter
       from vavData
       where date=%s
         and floor=%s
     ) t1
INNER JOIN ( SELECT date,
                    max(timestamp) as timestamp,
                    vavId
             from vavData
             where date=%s
               and floor=%s
             group by vavId
           ) t2 ON (t1.timestamp = t2.timestamp)
INNER JOIN ( SELECT VAV,
                    MILO,
                    floor
             from VavMiloMapping
             where floor = %s
           ) d ON (t1.vavId = d.VAV)
INNER JOIN ( SELECT t1.deviceId,
                    t1.date,
                    t1.timestamp,
                    t1.humidity,
                    t1.temperature,
                    t1.block,
                    t1.floor,
                    t1.location
             FROM ( SELECT deviceId,
                           date,
                           timestamp,
                           humidity,
                           temperature,
                           block,
                           floor,
                           location
                    from miloData
                    WHERE date=%s
                      and floor=%s
                  ) t1
             INNER JOIN ( SELECT deviceId,
                                 max(timestamp) as timestamp,
                                 location
                          from miloData
                          where date=%s
                            and floor=%s
                          GROUP BY deviceId
                        ) t2 ON (t1.timestamp = t2.timestamp)
           ) m1 ON (d.MILO = m1.location)
order by t1.vavId
Now it is visible that there are two problem spots. Both problematic subqueries have the alias t2 and look like this:
SELECT some_Id,
       max(timestamp) as timestamp,
       some_another_field
from some_table
where some_conditions
GROUP BY some_Id
The field marked as some_another_field is included in neither the GROUP BY expression nor an aggregate function.
Correct these subqueries.
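For example (only a sketch; it assumes the extra columns are not actually needed further up, since the outer joins use nothing but t2.timestamp), the simplest correction is to drop the stray column from each t2, or to wrap it in an aggregate such as max(...):
-- first t2 (vavData): drop the stray date column
SELECT vavId,
       max(timestamp) as timestamp
from vavData
where date=%s
  and floor=%s
group by vavId

-- second t2 (miloData): drop the stray location column
SELECT deviceId,
       max(timestamp) as timestamp
from miloData
where date=%s
  and floor=%s
GROUP BY deviceId
Adding the stray column to the GROUP BY would also silence only_full_group_by, but it changes the grouping and is usually not what is wanted here.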

MySQL - select the table name if it contains a record, for a list of tables

I am interested in finding the most efficient manner to query the following:
For a list of table names, return the table name if it contains at least one record that meets the conditions.
Essentially, something similar to the following Python code in a single query:
dfs = [pd.read_sql('SELECT name FROM {} WHERE a=1 AND b=2'.format(table), engine) for table in tables]
tables = [table for table, df in zip(tables, dfs) if not df.empty]
Is this possible in MySQL?
Assuming you trust the table names in tables not to contain any surprises leading to SQL injection, you could devise something like:
from sqlalchemy import text
selects = [f'SELECT :table_{i} FROM {table} WHERE a = 1 AND b = 2'
           for i, table in enumerate(tables)]
stmt = ' UNION '.join(selects)
stmt = text(stmt)
results = engine.execute(
    stmt, {f'table_{i}': table for i, table in enumerate(tables)})
or you could use SQLAlchemy constructs to build the same query safely:
from sqlalchemy import table, column, union, and_, select, Integer, literal
tbls = [table(name,
              column('a', Integer),
              column('b', Integer)) for name in tables]
stmt = union(*[select([literal(name).label('name')]).
               select_from(tbl).
               where(and_(tbl.c.a == 1, tbl.c.b == 2))
               for tbl, name in zip(tbls, tables)])
results = engine.execute(stmt)
You can use a UNION of queries that search each table.
(SELECT 'table1' AS table_name
FROM table1
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table2' AS table_name
FROM table2
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table3' AS table_name
FROM table3
WHERE a = 1 AND b = 2
LIMIT 1)
...

fetchall() throws "Previous SQL was not a query." for DELETE query

I have a Python script which connects to a SQL Server instance. I am running a CTE query to remove duplicates. The query runs successfully, but when I use the fetchall() function it results in an error: "Previous SQL was not a query." After checking the table, the duplicates still exist. This is the case with both pyodbc and sqlalchemy.
Code pyodbc:
import pyodbc
conn = pyodbc.connect(connection_string)
cursor = conn.cursor()
query = ''';with cte as
(
SELECT [ID], [TIME], ROW_NUMBER() OVER
(PARTITION BY [ID] order by [TIME] desc) as rn
from table
)delete from cte WHERE rn > 1'''
cursor.execute(query)
cursor.close()
conn.close()
Code for sqlalchemy:
from sqlalchemy import create_engine
from sqlalchemy.sql import text
import urllib
conn = urllib.parse.quote_plus(connection_string)
engine = create_engine('mssql+pyodbc:///?odbc_connect={}'.format(conn))
query = '''with cte as
(
SELECT [ID], [TIME], ROW_NUMBER() OVER (PARTITION BY [ID] order by [TIME] desc) as rn
from table
)
delete from cte WHERE rn > 1'''
connect = engine.connect()
result = connect.execute(query)
if result.returns_rows:
    print("Duplicates removed")
else:
    print("No row is returned")
when I use the fetchall() function it results in an error: "Previous SQL was not a query."
This is the expected behaviour. Although your query includes a SELECT as part of the CTE, the query itself is ultimately a DELETE query which does not return rows. It will return a row count that you can retrieve with Cursor#rowcount, but Cursor#fetchall() will throw an error because there are no rows to retrieve.
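As an aside, here is a sketch of the pyodbc version reporting the outcome via Cursor#rowcount and committing the transaction, which pyodbc does not do automatically and which would explain why the duplicates were still in the table (it reuses connection_string and query as defined in the question):
import pyodbc

conn = pyodbc.connect(connection_string)   # connection_string as in the question
cursor = conn.cursor()
cursor.execute(query)                      # the ;with cte ... delete statement from above
print(f"{cursor.rowcount} duplicate rows deleted")   # a DELETE reports a row count, not rows
conn.commit()                              # required: pyodbc autocommit is off by default
cursor.close()
conn.close()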

Sqlalchemy from sql using subqueries

I have the following SQL code:
SELECT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t1.id = t2.sample_id
GROUP BY t1.id
HAVING NOW() > (SELECT end FROM table2 WHERE start =
max(t2.start)) + (t1.time::text||' minute')::interval
I need to rewrite this code in SQLAlchemy; my attempt looks like this:
(Table1.query
 .join(Table2, Table1.id == Table2.sample_id)
 .group_by(Table1.id)
 .having(
     datetime.datetime.now() > ((Table2.query.filter_by(
         start=(database.session.query(func.max(Table2.start)))
     )).first().end +
     datetime.timedelta(minutes=Table1.query.filter_by(
         id=Table1.id
     ).first().frequency))
 )
 .all())
The SQL query returns one row of data from the database, but my SQLAlchemy query returns an empty dict. Can someone help me recreate my SQL query?
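One likely culprit (offered only as a sketch, assuming Flask-SQLAlchemy models Table1/Table2 with the end, start, sample_id and time columns used in the SQL, and a 1.x SQLAlchemy as in the other snippets on this page): the two .first() calls execute their subqueries immediately and turn them into plain Python values, so the comparison is evaluated once in Python instead of becoming part of the generated HAVING clause. Keeping everything as SQL expressions would look roughly like this:
from sqlalchemy import Interval, Text, cast, func, select
from sqlalchemy.orm import aliased

Inner = aliased(Table2)   # the inner, un-aliased table2 of the scalar subquery

end_for_latest_start = (
    select([Inner.end])
    .where(Inner.start == func.max(Table2.start))   # max(t2.start) from the outer join
    .as_scalar()                                    # stays embedded in the HAVING clause
)

rows = (
    Table1.query
    .join(Table2, Table1.id == Table2.sample_id)
    .group_by(Table1.id)
    .having(
        func.now() > end_for_latest_start
        + cast(cast(Table1.time, Text) + ' minute', Interval)
    )
    .all()
)
Whether the subquery correlates exactly as in the hand-written SQL should be checked against the generated statement, e.g. by printing the query before calling .all().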

Get table name for field in database result in Python (PostgreSQL)

I'm trying to get the table name for a field in a result set that I got from the database (Python, Postgres). There is a function in PHP to get the table name for a field; I used it and it works, so I know it can be done (in PHP). I'm looking for a similar function in Python.
The pg_field_table() function in PHP takes a result and a field number and "returns the name of the table that field belongs to". That is exactly what I need, but in Python.
Simple example - create tables, insert rows, select data:
CREATE TABLE table_a (
id INT,
name VARCHAR(10)
);
CREATE TABLE table_b (
id INT,
name VARCHAR(10)
);
INSERT INTO table_a (id, name) VALUES (1, 'hello');
INSERT INTO table_b (id, name) VALUES (1, 'world');
When using psycopg2 or SQLAlchemy I get the right data and the right field names, but no information about the table names.
import psycopg2
query = '''
SELECT *
FROM table_a A
LEFT JOIN table_b B
ON A.id = B.id
'''
con = psycopg2.connect('dbname=testdb user=postgres password=postgres')
cur = con.cursor()
cur.execute(query)
data = cur.fetchall()
print('fields', [desc[0] for desc in cur.description])
print('data', data)
The example above prints field names. The output is:
fields ['id', 'name', 'id', 'name']
data [(1, 'hello', 1, 'world')]
I know that there is cursor.description, but it does not contain the table name, just the field name.
What I need is some way to retrieve the table names for the fields in a result set when using raw SQL to query data.
EDIT 1: I need to know whether "hello" came from "table_a" or "table_b"; both fields have the same name ("name"). Without the table name you can't tell which table the value is in.
EDIT 2: I know that there are some workarounds like SQL aliases: SELECT table_a.name AS name1, table_b.name AS name2 but I'm really asking how to retrieve the table name from the result set.
EDIT 3: I'm looking for a solution that allows me to write any raw SQL query, sometimes SELECT *, sometimes SELECT A.id, B.id ..., and after executing that query get the field names and the table names for the fields in the result set.
It is necessary to query the pg_attribute catalog to build a list of table-qualified column names:
query = '''
select
string_agg(format(
'%%1$s.%%2$s as "%%1$s.%%2$s"',
attrelid::regclass, attname
) , ', ')
from pg_attribute
where attrelid = any (%s::regclass[]) and attnum > 0 and not attisdropped
'''
cursor.execute(query, ([t for t in ('a','b')],))
select_list = cursor.fetchone()[0]
query = '''
select {}
from a left join b on a.id = b.id
'''.format(select_list)
print(cursor.mogrify(query))
cursor.execute(query)
print([desc[0] for desc in cursor.description])
Output:
select a.id as "a.id", a.name as "a.name", b.id as "b.id", b.name as "b.name"
from a left join b on a.id = b.id
['a.id', 'a.name', 'b.id', 'b.name']
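A possible alternative, in case rewriting the select list is not an option (a sketch, assuming psycopg2 2.8 or newer, whose cursor.description entries expose a table_oid attribute): resolve each column's source table through pg_class after running the original SELECT *:
import psycopg2

con = psycopg2.connect('dbname=testdb user=postgres password=postgres')
cur = con.cursor()
cur.execute('SELECT * FROM table_a A LEFT JOIN table_b B ON A.id = B.id')
data = cur.fetchall()

cols = cur.description              # save before reusing the cursor
qualified = []
for col in cols:
    # psycopg2 >= 2.8: col.table_oid is the OID of the table the column came from
    cur.execute('SELECT relname FROM pg_class WHERE oid = %s', (col.table_oid,))
    qualified.append('{}.{}'.format(cur.fetchone()[0], col.name))

print(qualified)                    # e.g. ['table_a.id', 'table_a.name', 'table_b.id', 'table_b.name']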
