I need to update a field with a value from another table in MySQL, using the Python connector (not that important, though). I need to select a value from one table based on a matching criterion and write the extracted column back into the first table based on the same criterion.
I have the following, which doesn't work, of course.
for match_field in list:
    cursor_importer.execute(
        """UPDATE table1 SET table1_field =
               (SELECT field_new FROM table2 WHERE match_field = %s)
           WHERE match_field = %s LIMIT 1""",
        (match_field, match_field))
You can use UPDATE with a JOIN.
Below is an example in MySQL:
UPDATE table1 a JOIN table2 b ON a.match_field = b.match_field
SET a.table1_field = b.field_new
WHERE a.match_field = 'filter criteria'
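If you still want to drive this from Python with one value at a time, the JOIN form can be parameterized. This is only a minimal sketch, assuming a DB-API cursor that uses the %s placeholder style (as MySQL Connector/Python does). The helper name update_from_table2 is made up for illustration; the table and column names are the ones from the question.

```python
# Parameterized version of the UPDATE ... JOIN statement (MySQL syntax).
UPDATE_SQL = """
UPDATE table1 a
JOIN table2 b ON a.match_field = b.match_field
SET a.table1_field = b.field_new
WHERE a.match_field = %s
"""


def update_from_table2(cursor, match_values):
    """Run the UPDATE once per value in match_values (hypothetical helper).

    `cursor` is assumed to be a DB-API cursor using the %s paramstyle.
    """
    # Each parameter set must be a sequence, hence the 1-tuples.
    params = [(value,) for value in match_values]
    cursor.executemany(UPDATE_SQL, params)
    return len(params)
```

If every matching row should be updated, dropping the WHERE clause does it in a single statement and no Python loop is needed.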
I'm trying to select the rows whose id in column A does not exist in column B, using this code.
query = db.session.query().select_from(Spare_Parts, Vendors, Replacement)\
    .filter(Vendors.vendor_code == Spare_Parts.vendor_code,\
            ~ exists().where(Spare_Parts.spare_part_code == Replacement.spare_part_code))
I want to query the rows from Spare_Parts whose id does not appear in Replacement as a foreign key, but I get an error like this.
Select statement 'SELECT *
FROM spare_parts, replacement
WHERE spare_parts.spare_part_code = replacement.spare_part_code' returned no FROM clauses due to auto-correlation; specify correlate(<tables>) to control correlation manually.
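So what is the problem, and how do I fix it?
The error comes from Replacement being listed in select_from as well: the EXISTS subquery auto-correlates both of its tables to the outer query and is left with no FROM clause of its own. Try a subquery like this instead,
to select the spare_part_code values from spare_parts that are not in the replacement table: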
SELECT *
FROM spare_parts
WHERE spare_parts.spare_part_code not in
(select distinct
replacement.spare_part_code
FROM replacement)
Or you can use NOT EXISTS:
SELECT *
FROM spare_parts
WHERE NOT EXISTS
    (SELECT 1
     FROM replacement
     WHERE spare_parts.spare_part_code = replacement.spare_part_code)
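The two forms are not fully interchangeable: if replacement.spare_part_code can be NULL, NOT IN returns no rows at all, while NOT EXISTS still behaves as expected. Here is a small sketch of the difference. It uses an in-memory SQLite database and made-up sample rows purely for illustration, since the question does not say which database is behind the session.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spare_parts (spare_part_code TEXT);
    CREATE TABLE replacement (spare_part_code TEXT);
""")
conn.executemany("INSERT INTO spare_parts VALUES (?)", [("A",), ("B",), ("C",)])
# One replacement row has no spare_part_code (NULL).
conn.executemany("INSERT INTO replacement VALUES (?)", [("B",), (None,)])

not_in_rows = conn.execute("""
    SELECT spare_part_code FROM spare_parts
    WHERE spare_part_code NOT IN (SELECT spare_part_code FROM replacement)
    ORDER BY spare_part_code
""").fetchall()

not_exists_rows = conn.execute("""
    SELECT spare_part_code FROM spare_parts
    WHERE NOT EXISTS (
        SELECT 1 FROM replacement
        WHERE replacement.spare_part_code = spare_parts.spare_part_code)
    ORDER BY spare_part_code
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [('A',), ('C',)]
```

If the column is declared NOT NULL (as a required foreign key usually is), both queries return the same rows.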
When I'm trying to remove all tables with:
base.metadata.drop_all(engine)
I'm getting the following error:
ERROR:libdl.database_operations:Cannot drop table: (psycopg2.errors.DependentObjectsStillExist) cannot drop sequence <schema>.<sequence> because other objects depend on it
DETAIL: default for table <schema>.<table> column id depends on sequence <schema>.<sequence>
HINT: Use DROP ... CASCADE to drop the dependent objects too.
Is there an elegant one-line solution for that?
import psycopg2
from psycopg2 import sql
cnn = psycopg2.connect('...')
cur = cnn.cursor()
cur.execute("""
select s.nspname as s, t.relname as t
from pg_class t join pg_namespace s on s.oid = t.relnamespace
where t.relkind = 'r'
and s.nspname !~ '^pg_' and s.nspname != 'information_schema'
order by 1,2
""")
tables = cur.fetchall() # make sure they are the right ones
for t in tables:
    cur.execute(
        sql.SQL("drop table if exists {}.{} cascade")
        .format(sql.Identifier(t[0]), sql.Identifier(t[1])))
cnn.commit() # goodbye
I am performing an ETL task where I am querying tables in a Data Warehouse to see if it contains IDs in a DataFrame (df) which was created by joining tables from the operational database.
The DataFrame only has ID columns from each joined table in the operational database. I have created a variable for each of these columns, e.g. 'billing_profiles_dim_id' as below:
billing_profiles_dim_id = df['billing_profiles_dim_id']
I am attempting to iterate row by row to see if the ID here is in the 'billing_profiles_dim' table of the Data Warehouse. Where the ID is not present, I want to populate the DWH tables row by row using the matching ID rows in the ODB:
for key in billing_profiles_dim_id:
    sql = "SELECT * FROM billing_profiles_dim WHERE id = '"+str(key)+"'"
    dwh_cursor.execute(sql)
    result = dwh_cursor.fetchone()
    if result == None:
        sqlQuery = "SELECT * from billing_profile where id = '"+str(key)+"'"
        sqlInsert = "INSERT INTO billing_profile_dim VALUES ('"+str(key)+"','"+billing_profile.name"')
        op_cursor = op_connector.execute(sqlInsert)
        billing_profile = op_cursor.fetchone()
So far at least, I am receiving the following error:
SyntaxError: EOL while scanning string literal
The error message points at the closing bracket in:
sqlInsert = "INSERT INTO billing_profile_dim VALUES ('"+str(key)+"','"+billing_profile.name"')
I am currently unable to solve this, and I'm aware that the code may run into another problem or two. Could someone please show me how to fix the current issue and check that I am heading down the correct path?
You are missing a + after billing_profile.name and the closing double quote:
sqlInsert = "INSERT INTO billing_profile_dim VALUES ('"+str(key)+"','"+billing_profile.name+"')"
But you should really switch to parameterized queries and let the driver handle the quoting (note that %s is not wrapped in quotes):
sql = "SELECT * FROM billing_profiles_dim WHERE id = %s"
dwh_cursor.execute(sql, (str(key),))
...
sqlInsert = ('INSERT INTO billing_profile_dim VALUES '
'(%s, %s )')
dwh_cursor.execute(sqlInsert , (str(key), billing_profile.name))
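Putting the pieces together, the whole check-then-insert loop might look roughly like the sketch below. It rests on several assumptions that are not in the question: two in-memory SQLite databases stand in for the operational database and the warehouse (so the placeholder is ? rather than %s), both tables are assumed to have just the columns id and name, the warehouse table is called billing_profiles_dim throughout, and copy_missing_profiles is a made-up helper name.

```python
import sqlite3


def copy_missing_profiles(op_conn, dwh_conn, ids):
    """Copy rows whose id is missing from the warehouse (hypothetical helper)."""
    inserted = []
    for key in ids:
        found = dwh_conn.execute(
            "SELECT 1 FROM billing_profiles_dim WHERE id = ?", (key,)
        ).fetchone()
        if found is not None:
            continue  # already in the warehouse
        # Fetch the source row first, then insert it.
        row = op_conn.execute(
            "SELECT id, name FROM billing_profile WHERE id = ?", (key,)
        ).fetchone()
        if row is None:
            continue  # nothing to copy from the operational database
        dwh_conn.execute(
            "INSERT INTO billing_profiles_dim (id, name) VALUES (?, ?)", row
        )
        inserted.append(key)
    dwh_conn.commit()
    return inserted


# Tiny demo with invented rows.
op_conn = sqlite3.connect(":memory:")
op_conn.execute("CREATE TABLE billing_profile (id INTEGER PRIMARY KEY, name TEXT)")
op_conn.executemany("INSERT INTO billing_profile VALUES (?, ?)",
                    [(1, "basic"), (2, "premium")])

dwh_conn = sqlite3.connect(":memory:")
dwh_conn.execute(
    "CREATE TABLE billing_profiles_dim (id INTEGER PRIMARY KEY, name TEXT)")
dwh_conn.execute("INSERT INTO billing_profiles_dim VALUES (1, 'basic')")

inserted_ids = copy_missing_profiles(op_conn, dwh_conn, [1, 2, 3])
print(inserted_ids)  # [2]
```

Note the order inside the loop: the source row is read before the INSERT is built, whereas the question's code uses billing_profile.name before billing_profile has been fetched.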
This is my query using code found perusing this site:
query="""SELECT Family
FROM Table2
INNER JOIN Table1 ON Table1.idSequence=Table2.idSequence
WHERE (Table1.Chromosome, Table1.hg19_coordinate) IN ({seq})
""".format(seq=','.join(['?']*len(matchIds_list)))
matchIds_list is a list of tuples in (?,?) format.
It works if I ask for just one condition (i.e. only Table1.Chromosome as opposed to both Chromosome and hg19_coordinate) and matchIds_list is a simple list of single values, but I don't know how to get it to work with a composite key, i.e. both columns.
Since you're running SQLite 3.7.17, I'd recommend just using a temporary table.
Create and populate your temporary table.
cursor.executescript("""
CREATE TEMP TABLE control_list (
Chromosome TEXT NOT NULL,
hg19_coordinate TEXT NOT NULL
);
CREATE INDEX control_list_idx ON control_list (Chromosome, hg19_coordinate);
""")
cursor.executemany("""
INSERT INTO control_list (Chromosome, hg19_coordinate)
VALUES (?, ?)
""", matchIds_list)
Just constrain your query to the control list temporary table.
SELECT Family
FROM Table2
INNER JOIN Table1
ON Table1.idSequence = Table2.idSequence
-- Constrain to control_list.
WHERE EXISTS (
SELECT *
FROM control_list
WHERE control_list.Chromosome = Table1.Chromosome
AND control_list.hg19_coordinate = Table1.hg19_coordinate
)
And finally perform your query (there's no need to format this one).
cursor.execute(query)
# Remove the temporary table since we're done with it.
cursor.execute("""
DROP TABLE control_list;
""")
Short Query (requires SQLite 3.15): You actually almost had it. You need to make the IN ({seq}) a subquery expression.
SELECT Family
FROM Table2
INNER JOIN Table1
ON Table1.idSequence = Table2.idSequence
WHERE (Table1.Chromosome, Table1.hg19_coordinate) IN (VALUES {seq});
Long Query (requires SQLite 3.8.3): It looks a little complicated, but it's pretty straightforward. Put your control list into a sub-select, and then constrain the main select by the control list.
SELECT Family
FROM Table2
INNER JOIN Table1
ON Table1.idSequence = Table2.idSequence
-- Constrain to control_list.
WHERE EXISTS (
SELECT *
FROM (
SELECT
-- Name the columns (must match order in tuples).
"" AS Chromosome,
":1" AS hg19_coordinate
FROM (
-- Get control list.
VALUES {seq}
) AS control_values
) AS control_list
-- Constrain Table1 to control_list.
WHERE control_list.Chromosome = Table1.Chromosome
AND control_list.hg19_coordinate = Table1.hg19_coordinate
)
Regardless of which query you use, when formatting the SQL replace {seq} with (?,?) for each composite key instead of just ?.
query = " ... ".format(seq=','.join(['(?,?)']*len(matchIds_list)))
And finally flatten matchIds_list when you execute the query because it is a list of tuples.
import itertools
cursor.execute(query, list(itertools.chain.from_iterable(matchIds_list)))
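For reference, here is the short query run end to end. It is a sketch under a few assumptions: the SQLite library behind Python's sqlite3 module is 3.15 or newer, and the table definitions and sample rows are invented (the question does not show the real schema).

```python
import itertools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (idSequence INTEGER, Chromosome TEXT,
                         hg19_coordinate INTEGER);
    CREATE TABLE Table2 (idSequence INTEGER, Family TEXT);
""")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)",
                 [(1, "chr1", 100), (2, "chr1", 200), (3, "chr2", 100)])
conn.executemany("INSERT INTO Table2 VALUES (?, ?)",
                 [(1, "famA"), (2, "famB"), (3, "famC")])

matchIds_list = [("chr1", 100), ("chr2", 100)]

# One "(?,?)" group per composite key.
seq = ",".join(["(?,?)"] * len(matchIds_list))
query = """
    SELECT Family
    FROM Table2
    INNER JOIN Table1 ON Table1.idSequence = Table2.idSequence
    WHERE (Table1.Chromosome, Table1.hg19_coordinate) IN (VALUES {seq})
    ORDER BY Family
""".format(seq=seq)

# Flatten [(a, b), (c, d)] into [a, b, c, d] to match the placeholders.
params = list(itertools.chain.from_iterable(matchIds_list))
families = [row[0] for row in conn.execute(query, params)]
print(families)  # ['famA', 'famC']
```

SQLite caps the number of bound parameters per statement, so for a very long matchIds_list the temporary table approach is the safer choice.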
I have created this table in Python 2.7. I use it to store unique name/value pairs. In some queries I search by name and in others by value; let's say the SELECT queries are split 50-50. Is there any way to create a table with two indexes (one on names and another on values) so that my program finds the data faster?
Here is the database and table creation:
import sqlite3
#-------------------------db creation ---------------------------------------#
db1 = sqlite3.connect('/my_db.db')
cursor = db1.cursor()
cursor.execute("DROP TABLE IF EXISTS my_table")
sql = '''CREATE TABLE my_table (
name TEXT DEFAULT NULL,
value INT
);'''
cursor.execute(sql)
sql = ("CREATE INDEX index_my_table ON my_table (name);")
cursor.execute(sql)
Or is there any other structure that gives faster lookups by value?
You can create another index...
sql = ("CREATE INDEX index_my_table2 ON my_table (value);")
cursor.execute(sql)
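Another option is to create one index on both fields, e.g.:
sql = ("CREATE INDEX index_my_table3 ON my_table (name, value)")
This is a multi-column (covering) index. Note that SQLite generally uses it only for queries that constrain the leftmost column (name), so lookups by value alone still benefit from the separate index on value.
See the (great) docs here: https://www.sqlite.org/queryplanner.html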
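To check which index a query actually uses, you can ask SQLite for its query plan. The sketch below recreates the table with the two single-column indexes suggested above in an in-memory database; the plan helper is a made-up name, and the exact wording of the plan text varies between SQLite versions, so the output is printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (name TEXT DEFAULT NULL, value INT);
    CREATE INDEX index_my_table ON my_table (name);
    CREATE INDEX index_my_table2 ON my_table (value);
""")


def plan(sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of every plan row.
    return " | ".join(row[-1] for row in rows)


name_plan = plan("SELECT * FROM my_table WHERE name = ?", ("foo",))
value_plan = plan("SELECT * FROM my_table WHERE value = ?", (42,))
print(name_plan)
print(value_plan)
```

Each plan should mention the index on the column being searched; a plan that reports a full scan of my_table means no index was used.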