Let's say I have 4 tables: A(id, type, protocol), B(id, A_id, info), C(id, B_id, details) and D(id, C_id, port_info). Table A and Table B are connected via a foreign key: id from Table A and A_id from Table B. Similarly, Table B and Table C are connected via id from Table B and B_id from Table C, and in the same way Table C and Table D are connected.
Now I want to get the port_info from Table D for all the protocols from Table A.
I know one method whose time complexity is O(n^4), which I'm using currently. The method is as follows:
import MySQLdb

db = MySQLdb.connect(host="localhost", user="root", passwd="", db="mydb")
cur = db.cursor()

cur.execute("SELECT * FROM A")
A_results = cur.fetchall()
for A_row in A_results:
    id = A_row[0]
    cur.execute("SELECT * FROM B WHERE A_id = %d" % (id,))
    B_results = cur.fetchall()
    for B_row in B_results:
        id = B_row[0]
        cur.execute("SELECT * FROM C WHERE B_id = %d" % (id,))
        C_results = cur.fetchall()
        for C_row in C_results:
            id = C_row[0]
            cur.execute("SELECT * FROM D WHERE C_id = %d" % (id,))
            D_results = cur.fetchall()
            for D_row in D_results:
                port = D_row[2]  # port_info column of D
                print("Port = " + str(port))
But this method takes O(n^4), so is there a more efficient way, in terms of time complexity, to solve this problem?
Your suggestions are highly appreciated.
Execute it in a single JOIN query and let MySQL do the necessary optimizations while handling large data sets (which, after all, is what the database is best at), providing your application with a single result set. The query looks like this:
SELECT A.protocol, D.port_info
FROM A JOIN B ON A.id = B.A_id
JOIN C ON B.id = C.B_id
JOIN D ON C.id = D.C_id
ORDER BY protocol
...and then use your cursor to go through that single resultset.
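For example, running that query from the same MySQLdb connection used in the question and iterating over the rows could look like this (a sketch; the column order is simply the SELECT list):
cur = db.cursor()
cur.execute("""
    SELECT A.protocol, D.port_info
    FROM A
    JOIN B ON A.id = B.A_id
    JOIN C ON B.id = C.B_id
    JOIN D ON C.id = D.C_id
    ORDER BY A.protocol
""")
for protocol, port_info in cur.fetchall():
    print("Protocol = %s, Port = %s" % (protocol, port_info))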
I am updating a MySQL database table from a Python dictionary where the keys are the database fields. I have an insert statement that works but what I really need is INSERT ON DUPLICATE KEY UPDATE.
Here is the insert statement:
for d in r.json()['mydict'].items():
    d = d[1]  # the dictionary is the 2nd element in the tuple
    placeholders = ', '.join(['%s'] * len(d))
    columns = ', '.join(d.keys())
    sql = "INSERT INTO %s ( %s ) VALUES ( %s )" % ("my_table", columns, placeholders)
    c = create_connection()  # from the function that creates tables
    cur = c[0]  # function returns cursor
    db = c[1]  # function returns db
    cur.execute(sql, list(d.values()))
    db.commit()
    db.close()
As always, any help is appreciated.
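For reference, a rough sketch of how the same dictionary-driven statement could be turned into an upsert with ON DUPLICATE KEY UPDATE. It reuses the create_connection() helper and the my_table name from the code above, so treat it as a sketch rather than a drop-in answer:
for d in r.json()['mydict'].items():
    d = d[1]  # the dictionary is the 2nd element in the tuple
    placeholders = ', '.join(['%s'] * len(d))
    columns = ', '.join(d.keys())
    # update every column to the value that would have been inserted
    updates = ', '.join('{0} = VALUES({0})'.format(k) for k in d.keys())
    sql = "INSERT INTO %s ( %s ) VALUES ( %s ) ON DUPLICATE KEY UPDATE %s" % (
        "my_table", columns, placeholders, updates)
    c = create_connection()
    cur, db = c[0], c[1]
    cur.execute(sql, list(d.values()))
    db.commit()
    db.close()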
I just started coding in Python and I have one problem. I am using the ibm_db_dbi driver.
Here is my SQL example:
FOR v AS cur CURSOR FOR
SELECT S.NAME FROM SYSIBM.SYSTABLES AS S
JOIN (SELECT DISTINCT TBNAME FROM SYSIBM.SYSCOLUMNS WHERE CREATOR = 'SOME_SCHEMA_DB') AS C.TBNAME = S.NAME WHERE S.CREATOR = 'SOME_SCHEMA' AND S.TYPE = 'T'
DO
SET TABLE = SCHEMA1 ||'.'||v.Name
...
SELECT 1 INTO TEMP_TABLE FROM SYSIBM.SYSTABLES WHERE TYPE='T' AND CREATOR = SCHEMA1 AND NAME = v.NAME
END FOR;
So far I have this in Python:
#make a connection to db
conn = ibm_db_dbi.connect("DATABASE=%s;HOSTNAME=%s,....)
#define a cursor
cur = conn.cursor()
sql="""SELECT S.NAME FROM SYSIBM.SYSTABLES AS S
JOIN (SELECT DISTINCT TBNAME FROM SYSIBM.SYSCOLUMNS WHERE CREATOR = 'SOME_SCHEMA_DB') AS C.TBNAME = S.NAME WHERE S.CREATOR = 'SOME_SCHEMA' AND S.TYPE = 'T'"""
resultSet = cur.execute(sql)
I don't know how to iterate through the result of the query and set values from the cursor. How do I set the value for this piece of code:
SET TABLE = SCHEMA1 ||'.'||v.Name
...
SELECT 1 INTO TEMP_TABLE FROM SYSIBM.SYSTABLES WHERE TYPE='T' AND CREATOR = SCHEMA1 AND NAME = v.NAME
Tip: get your SQL working outside of Python first. The query in your question has syntax mistakes or typos.
To iterate through a result set from a query, use one of the documented fetch methods of the DBI interface (for example, fetchmany(), fetchall(), or others).
Here is a fragment that uses fetchmany().
try:
    cur = conn.cursor()
    sql = """SELECT S.NAME FROM SYSIBM.SYSTABLES AS S
             INNER JOIN (SELECT DISTINCT TBNAME
                         FROM SYSIBM.SYSCOLUMNS
                         WHERE TBCREATOR = 'SOME_SCHEMA') AS C
                 ON C.TBNAME = S.NAME
             WHERE S.CREATOR = 'SOME_SCHEMA'
               AND S.TYPE = 'T';"""
    cur.execute(sql)
    result = cur.fetchmany()
    while result:
        for i in result:
            print(i)
        result = cur.fetchmany()
except Exception as e:
    print(e)
    raise
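If you also want to reproduce the SET TABLE = SCHEMA1 ||'.'|| v.Name step and the existence check from the original procedure in Python, a rough sketch could look like this (it assumes ibm_db_dbi's qmark parameter style, and SCHEMA1 is just a placeholder value here):
schema1 = 'SOME_SCHEMA'  # stands in for SCHEMA1 from the original SQL procedure
cur.execute(sql)
for (name,) in cur.fetchall():
    table = schema1 + '.' + name  # equivalent of SET TABLE = SCHEMA1 ||'.'|| v.Name
    check = conn.cursor()
    check.execute(
        "SELECT 1 FROM SYSIBM.SYSTABLES WHERE TYPE = 'T' AND CREATOR = ? AND NAME = ?",
        (schema1, name))
    exists = check.fetchone() is not None
    print(table, exists)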
I am interested in finding the most efficient manner to query the following:
For a list of table names, return the table name if the table contains at least one record that meets the conditions.
Essentially, something similar to the following Python code in a single query:
dfs = [pd.read_sql('SELECT name FROM {} WHERE a=1 AND b=2'.format(table), engine) for table in tables]
tables = [table for table, df in zip(tables, dfs) if not df.empty]
Is this possible in MySQL?
Assuming you trust the table names in tables not to contain any surprises leading to SQL injection, you could devise something like:
from sqlalchemy import text

selects = [f'SELECT :table_{i} FROM {table} WHERE a = 1 AND b = 2'
           for i, table in enumerate(tables)]
stmt = ' UNION '.join(selects)
stmt = text(stmt)
results = engine.execute(
    stmt, {f'table_{i}': table for i, table in enumerate(tables)})
or you could use SQLAlchemy constructs to build the same query safely:
from sqlalchemy import table, column, union, and_, select, Integer, literal

tbls = [table(name,
              column('a', Integer),
              column('b', Integer)) for name in tables]
stmt = union(*[select([literal(name).label('name')]).
               select_from(tbl).
               where(and_(tbl.c.a == 1, tbl.c.b == 2))
               for tbl, name in zip(tbls, tables)])
results = engine.execute(stmt)
You can use a UNION of queries that search each table.
(SELECT 'table1' AS table_name
FROM table1
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table2' AS table_name
FROM table2
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table3' AS table_name
FROM table3
WHERE a = 1 AND b = 2
LIMIT 1)
...
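If you want to build that UNION from the Python list of table names, a short sketch (again assuming, as in the first answer, that the names in tables are trusted) might look like:
import pandas as pd

selects = ["(SELECT '{0}' AS table_name FROM {0} WHERE a = 1 AND b = 2 LIMIT 1)".format(t)
           for t in tables]
sql = ' UNION '.join(selects)
matching_tables = pd.read_sql(sql, engine)['table_name'].tolist()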
SQL query:
Select *
from table_name
where ID in (123)
and date in (Select max(date)
from table_name
where ID in (123))
I want to pass the list values mentioned below, one at a time, into the above SQL query and collect the results for each ID in the list.
Package: cx_Oracle
My try:
import cx_Oracle

List = {123, 234, 345, ....}
List1 = []
query = """Select * from table_name where ID in (%s)
           and date in (Select max(date) from table_name where ID in (%s))"""
for j in List:
    cursor1 = db_ora.cursor()
    tb = cursor1.execute(query, params=List)
    for i in tb:
        List1.append(i)
Thank you in advance, let me know if you need more details from my side
If you want to keep it similar to your original code, you can use string formatting
Python 2
import cx_Oracle

List = [123, 234, 345, ....]
List1 = []
masterQuery = """Select * from table_name where ID in (%s)
                 and date in (Select max(date) from table_name where ID in (%s))"""
for j in List:
    cursor1 = db_ora.cursor()
    newQuery = masterQuery % (j, j)
    tb = cursor1.execute(newQuery)
    for i in tb:
        List1.append(i)
Python 3
import cx_Oracle

List = [123, 234, 345, ....]
List1 = []
masterQuery = """Select * from table_name where ID in ({})
                 and date in (Select max(date) from table_name where ID in ({}))"""
for j in List:
    cursor1 = db_ora.cursor()
    newQuery = masterQuery.format(j, j)
    tb = cursor1.execute(newQuery)
    for i in tb:
        List1.append(i)
As far as I can tell, Oracle won't accept such a list as a valid parameter. Either store that list of values into a separate table and use it as a source for your query, such as
and t.date in (select max(t1.date) from table_name t1
where t1.id in (select st.id from some_table st)
)
or, if possible, split that comma-separated-values string into rows, e.g.
and t.date in (select max(t1.date) from table_name t1
where t1.id in (select regexp_substr(%s, '[^,]+', 1, level)
from dual
connect by level <= regexp_count(%s, ',') + 1
)
)
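A rough sketch of running that second variant from cx_Oracle with a named bind variable (the bind name ids, the sample values, and the db_ora connection are placeholders, not something given in the question):
ids = '123,234,345'  # the IDs as one comma-separated string
cursor1 = db_ora.cursor()
cursor1.execute("""
    select t.*
      from table_name t
     where t.id in (select regexp_substr(:ids, '[^,]+', 1, level)
                      from dual
                   connect by level <= regexp_count(:ids, ',') + 1)
       and t.date in (select max(t1.date)
                        from table_name t1
                       where t1.id in (select regexp_substr(:ids, '[^,]+', 1, level)
                                         from dual
                                      connect by level <= regexp_count(:ids, ',') + 1))""",
    {'ids': ids})
results = cursor1.fetchall()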
Also, I'd suggest you precede column names with table aliases to avoid possible confusion.
I'm trying to get the table name for a field in a result set that I got from the database (Python, Postgres). There is a function in PHP to get the table name for a field; I used it and it works, so I know it can be done (in PHP). I'm looking for a similar function in Python.
The pg_field_table() function in PHP takes a result and a field number and "returns the name of the table that field belongs to". That is exactly what I need, but in Python.
Simple example - create tables, insert rows, select data:
CREATE TABLE table_a (
id INT,
name VARCHAR(10)
);
CREATE TABLE table_b (
id INT,
name VARCHAR(10)
);
INSERT INTO table_a (id, name) VALUES (1, 'hello');
INSERT INTO table_b (id, name) VALUES (1, 'world');
When using psycopg2 or SQLAlchemy I get the right data and the right field names, but no information about the table names.
import psycopg2
query = '''
SELECT *
FROM table_a A
LEFT JOIN table_b B
ON A.id = B.id
'''
con = psycopg2.connect('dbname=testdb user=postgres password=postgres')
cur = con.cursor()
cur.execute(query)
data = cur.fetchall()
print('fields', [desc[0] for desc in cur.description])
print('data', data)
The example above prints field names. The output is:
fields ['id', 'name', 'id', 'name']
data [(1, 'hello', 1, 'world')]
I know that there is cursor.description, but it does not contain the table name, just the field name.
What I need is some way to retrieve the table names for the fields in the result set when using raw SQL to query data.
EDIT 1: I need to know whether "hello" came from "table_a" or "table_b"; both fields have the same name ("name"). Without the table name you can't tell which table the value came from.
EDIT 2: I know that there are workarounds like SQL aliases: SELECT table_a.name AS name1, table_b.name AS name2, but I'm really asking how to retrieve the table name from the result set.
EDIT 3: I'm looking for a solution that allows me to write any raw SQL query (sometimes SELECT *, sometimes SELECT A.id, B.id ...) and, after executing that query, get the field names and the table names for the fields in the result set.
It is necessary to query the pg_attribute catalog for the table-qualified column names:
query = '''
select
string_agg(format(
'%%1$s.%%2$s as "%%1$s.%%2$s"',
attrelid::regclass, attname
) , ', ')
from pg_attribute
where attrelid = any (%s::regclass[]) and attnum > 0 and not attisdropped
'''
cursor.execute(query, ([t for t in ('a','b')],))
select_list = cursor.fetchone()[0]
query = '''
select {}
from a left join b on a.id = b.id
'''.format(select_list)
print(cursor.mogrify(query))
cursor.execute(query)
print([desc[0] for desc in cursor.description])
Output:
select a.id as "a.id", a.name as "a.name", b.id as "b.id", b.name as "b.name"
from a left join b on a.id = b.id
['a.id', 'a.name', 'b.id', 'b.name']
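As a side note, a possible alternative sketch, assuming psycopg2 2.8 or newer (where each cursor.description entry also exposes a table_oid) and the same connection and join query as in the question:
cur.execute(query)  # the SELECT * ... LEFT JOIN query from the question
data = cur.fetchall()

# each description entry carries the OID of the column's source table (None for computed columns)
oids = [d.table_oid for d in cur.description if d.table_oid is not None]
cur.execute("SELECT oid, relname FROM pg_class WHERE oid = ANY(%s::oid[])", (oids,))
table_names = dict(cur.fetchall())

print([(table_names.get(d.table_oid), d.name) for d in cur.description])
# expected along the lines of: [('table_a', 'id'), ('table_a', 'name'), ('table_b', 'id'), ('table_b', 'name')]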