How do I include a subquery in a SELECT clause? (Python/SQLAlchemy)

I'm trying to convert the following SQL:
select t1.id_
     , t1.field_a
     , (select t2.field_c
          from belgarath.test_2 t2
         where t2.field_a = t1.field_a
           and t2.field_b = 1) test_2_field_b_1
     , (select t2.field_c
          from belgarath.test_2 t2
         where t2.field_a = t1.field_a
           and t2.field_b = 2) test_2_field_b_2
  from belgarath.test_1 t1
I've got as far as this:
sq1 = session.query(Test2.field_C)
sq1 = sq1.filter(Test2.field_A == Test1.field_A, Test2.field_B == 1)
sq1 = sq1.subquery(name="test_2_field_b_1")
sq2 = session.query(Test2.field_C)
sq2 = sq2.filter(Test2.field_A == Test1.field_A, Test2.field_B == 2)
sq2 = sq2.subquery(name="test_2_field_b_2")
session.query(Test1.id_, Test2.field_A, sq1, sq2)
But I'm getting the following SQL:
SELECT belgarath.test_1.id_ AS belgarath_test_1_id_,
       belgarath.test_2.`field_A` AS `belgarath_test_2_field_A`,
       test_2_field_b_1.`field_C` AS `test_2_field_b_1_field_C`,
       test_2_field_b_2.`field_C` AS `test_2_field_b_2_field_C`
FROM belgarath.test_1, belgarath.test_2,
     (SELECT belgarath.test_2.`field_C` AS `field_C`
      FROM belgarath.test_2, belgarath.test_1
      WHERE belgarath.test_2.`field_A` = belgarath.test_1.`field_A`
        AND belgarath.test_2.`field_B` = %(field_B_1)s) AS test_2_field_b_1,
     (SELECT belgarath.test_2.`field_C` AS `field_C`
      FROM belgarath.test_2, belgarath.test_1
      WHERE belgarath.test_2.`field_A` = belgarath.test_1.`field_A`
        AND belgarath.test_2.`field_B` = %(field_B_2)s) AS test_2_field_b_2
It looks like SQLAlchemy is pushing the subqueries into the FROM clause instead of keeping them in the SELECT list.
Here are the table classes if it helps:
class Test1(Base):
    __tablename__ = "test_1"
    field_A = Column(Integer)

class Test2(Base):
    __tablename__ = "test_2"
    field_A = Column(Integer)
    field_B = Column(Integer)
    field_C = Column(String(3))
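No answer is included in this scrape, but a sketch of the usual fix, assuming SQLAlchemy 1.4+: call `.correlate(Test1)` and `.scalar_subquery()` on each inner query, so it stays a scalar expression in the SELECT list rather than an entry in FROM. The `id_` primary keys below are assumptions added so the classes are mappable and the snippet runs:

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Test1(Base):
    __tablename__ = "test_1"
    id_ = sa.Column(sa.Integer, primary_key=True)  # assumed primary key
    field_A = sa.Column(sa.Integer)

class Test2(Base):
    __tablename__ = "test_2"
    id_ = sa.Column(sa.Integer, primary_key=True)  # assumed primary key
    field_A = sa.Column(sa.Integer)
    field_B = sa.Column(sa.Integer)
    field_C = sa.Column(sa.String(3))

engine = sa.create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

def field_c_for(b_value, label):
    # correlate(Test1) keeps test_1 out of the inner FROM list;
    # scalar_subquery() keeps the whole subquery in the SELECT list.
    return (
        session.query(Test2.field_C)
        .filter(Test2.field_A == Test1.field_A, Test2.field_B == b_value)
        .correlate(Test1)
        .scalar_subquery()
        .label(label)
    )

q = session.query(
    Test1.id_,
    Test1.field_A,
    field_c_for(1, "test_2_field_b_1"),
    field_c_for(2, "test_2_field_b_2"),
)
print(q)
```

The compiled statement now has only `test_1` in the outer FROM clause, with both `test_2` lookups rendered as correlated scalar subqueries in the column list.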

Related

Table relationships in MySQL to calculate a new field

I'll explain the structure I have, but first what I need: I have a table of forecasts and one with the data of what actually happened, and I need to calculate a "forecast minus actual" field. Both tables have coordinate fields (lon, lat), a date, and precipitation. Forecast has one extra field: the date the forecast was made.
class Real(Base):
    __tablename__ = 'tbl_real'
    id = Column(Integer, primary_key=True, autoincrement=True)
    lon = Column(Integer, index=True)
    lat = Column(Integer, index=True)
    date = Column(DATE, index=True)
    prec = Column(Integer)

class Forecast(Base):
    __tablename__ = 'tbl_forecast'
    id = Column(Integer, primary_key=True, autoincrement=True)
    real_id = Column(Integer, ForeignKey('tbl_real.id'))
    date_pub = Column(DATE, index=True)
    date_prev = Column(DATE, index=True)
    lon = Column(Integer, index=True)
    lat = Column(Integer, index=True)
    prec = Column(Integer)

class Error(Base):
    __tablename__ = 'tbl_error'
    id = Column(Integer, primary_key=True)
    forecast_id = Column(Integer, ForeignKey('tbl_forecast.id'))
    real_id = Column(Integer, ForeignKey('tbl_real.id'))
    error = Column(Integer)
To insert data on Error I'm using:
def insert_error_by_coord_data(self, real_id, lon, lat, date, prec, session):
    ec_ext = session.query(Forecast.id, Forecast.prec).filter(
        (Forecast.lon == lon) &
        (Forecast.lat == lat) &
        (Forecast.date_prev == date)).all()
    data = list()
    for row in ec_ext:
        id = row[0]
        if session.query(Error).get(id) is None:
            prev = row[1]
            error = prev - prec
            data.append(Error(id=id,
                              forecast_id=id,
                              real_id=real_id,
                              error=error))
    if len(data) > 0:
        session.bulk_save_objects(objects=data)
        session.commit()
    session.close()
Each forecast file has 40 date_prev values and 25,000 coordinates, and each real file has 25,000 coordinates. After roughly 2 hours I only have 80,000 rows in Error. It started at 1.03 s per insert and is now at 3.04 s. I'm using 12 CPUs with multiprocessing; if you think the mistake is there, please point it out and I can show that code, but I don't think it is.
The question is: what should I do differently?
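No answer is included in this scrape, but one likely culprit is the `session.query(Error).get(id)` round trip inside the loop: one SELECT per forecast row. An editorial sketch of a set-based alternative, with the field names normalized to the Error model above: fetch all existing ids in one query, then filter in Python before the bulk save. (The sample data and in-memory SQLite engine are illustrative assumptions.)

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Error(Base):
    __tablename__ = "tbl_error"
    id = sa.Column(sa.Integer, primary_key=True)
    forecast_id = sa.Column(sa.Integer)
    real_id = sa.Column(sa.Integer)
    error = sa.Column(sa.Integer)

engine = sa.create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

# Pretend the row for forecast id 1 was already inserted earlier.
session.add(Error(id=1, forecast_id=1, real_id=10, error=0))
session.commit()

# ec_ext stands in for the (Forecast.id, Forecast.prec) query result.
ec_ext = [(1, 5), (2, 7), (3, 2)]
real_id, prec = 10, 4

# One query for all existing ids, instead of one .get() per row.
existing = {
    row[0]
    for row in session.query(Error.id).filter(
        Error.id.in_([fid for fid, _ in ec_ext]))
}
data = [
    Error(id=fid, forecast_id=fid, real_id=real_id, error=fprec - prec)
    for fid, fprec in ec_ext
    if fid not in existing
]
session.bulk_save_objects(data)
session.commit()
```

This turns N existence checks into a single `IN (...)` query per batch, which is usually where per-row insert times like the ones described come from.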

Converting a query with built-in MySQL functions to flask-sqlalchemy

I have a MySQL query like this:
UPDATE mytable SET is_active=false
WHERE created < DATE_SUB(NOW(), INTERVAL `interval` second)
How can I express it using flask-sqlalchemy's ORM (i.e. via the MyTable model)?
This may not be the most elegant solution, but it seems to be working for me:
Base = declarative_base()

class Account(Base):
    __tablename__ = "so62234199"
    id = sa.Column(sa.Integer, primary_key=True)
    created = sa.Column(sa.DateTime)
    interval = sa.Column(sa.Integer)
    is_active = sa.Column(sa.Boolean)

    def __repr__(self):
        return f"<Account(id={self.id}, created='{self.created}')>"

Session = sessionmaker(bind=engine)
session = Session()

account_table = list(Account.metadata.tables.values())[0]
upd = (
    account_table.update()
    .values(is_active=False)
    .where(
        Account.created
        < sa.func.date_sub(
            sa.func.now(),
            sa.text(
                " ".join(
                    ["INTERVAL", str(Account.interval.compile()), "SECOND"]
                )
            ),
        )
    )
)
with engine.connect() as conn:
    conn.execute(upd)
The SQL statement generated is
INFO sqlalchemy.engine.Engine UPDATE so62234199 SET is_active=%s WHERE so62234199.created < date_sub(now(), INTERVAL so62234199.interval SECOND)
INFO sqlalchemy.engine.Engine (0,)
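A more compact variant (an editorial sketch, not part of the answer above, assuming SQLAlchemy 1.4+): `sa.update()` accepts the mapped class directly, so the `metadata.tables` lookup isn't needed, and the `INTERVAL` fragment can be passed as literal text for MySQL to interpret:

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Account(Base):
    __tablename__ = "so62234199"
    id = sa.Column(sa.Integer, primary_key=True)
    created = sa.Column(sa.DateTime)
    interval = sa.Column(sa.Integer)
    is_active = sa.Column(sa.Boolean)

# sa.update() takes the ORM class; no metadata.tables lookup needed.
upd = (
    sa.update(Account)
    .values(is_active=False)
    .where(
        Account.created
        < sa.func.date_sub(sa.func.now(), sa.text("INTERVAL `interval` SECOND"))
    )
)
print(upd)
```

The compiled statement matches the answer's: `UPDATE so62234199 SET is_active=... WHERE so62234199.created < date_sub(now(), INTERVAL `interval` SECOND)`.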

How to set an alias for a table in SQLAlchemy?

I'm trying to write a MySQL query in SQLAlchemy syntax, but I don't know how to represent aliasForC. Could you help me?
Query with aliasForC alias:
SELECT aliasForC.secondId
FROM A, B, C as aliasForC
WHERE B.firstId = A.firstId
  AND B.status = 'Reprep'
  AND A.secondId = aliasForC.secondId
  AND B.status = ALL (
      SELECT status
      FROM C
      INNER JOIN A ON A.secondId = C.secondId
      INNER JOIN B ON A.firstId = B.firstId
      WHERE code = aliasForC.code
  )
You can do it this way:
aliasForC = aliased(C)
# And then:
join(aliasForC, aliasForC.firstId == A.firstId)
For the ALL comparison, you can use all_().
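Putting those two pieces together, an editorial sketch assuming SQLAlchemy 1.4+. The column sets for A, B, and C are guesses from the SQL above, and the inner bare `status` is read as `B.status`; treat the names as illustrative:

```python
import sqlalchemy as sa
from sqlalchemy.orm import aliased, declarative_base

Base = declarative_base()

class A(Base):
    __tablename__ = "A"
    firstId = sa.Column(sa.Integer, primary_key=True)
    secondId = sa.Column(sa.Integer)

class B(Base):
    __tablename__ = "B"
    firstId = sa.Column(sa.Integer, primary_key=True)
    status = sa.Column(sa.String)

class C(Base):
    __tablename__ = "C"
    secondId = sa.Column(sa.Integer, primary_key=True)
    code = sa.Column(sa.Integer)

aliasForC = aliased(C, name="aliasForC")

# Inner SELECT ... FROM C JOIN A JOIN B, correlated to the outer alias.
inner = (
    sa.select(B.status)
    .join(A, A.firstId == B.firstId)
    .join(C, C.secondId == A.secondId)
    .where(C.code == aliasForC.code)
    .scalar_subquery()
)

q = (
    sa.select(aliasForC.secondId)
    .where(B.firstId == A.firstId)
    .where(B.status == "Reprep")
    .where(A.secondId == aliasForC.secondId)
    .where(B.status == sa.all_(inner))  # renders B.status = ALL (SELECT ...)
)
print(q)
```

`aliased(C, name="aliasForC")` renders `C AS "aliasForC"` in FROM, and `sa.all_()` wraps the scalar subquery in the `= ALL (...)` comparison.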
I think alias is what you're looking for.
http://docs.sqlalchemy.org/en/latest/core/selectable.html
http://docs.sqlalchemy.org/en/latest/core/selectable.html#sqlalchemy.sql.expression.Alias
user_alias = aliased(User, name='user2')
q = sess.query(User, User.id, user_alias)
See: http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.column_descriptions
import sqlparse
import sqlalchemy as sa

meta = sa.MetaData()
a = sa.Table(
    'a', meta,
    sa.Column('id', sa.Integer, primary_key=True),
)
b = sa.Table(
    'b', meta,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('x', sa.Integer, sa.ForeignKey(a.c.id)),
    sa.Column('y', sa.Integer, sa.ForeignKey(a.c.id)),
)
x = b.alias('x')
y = b.alias('y')
query = (
    sa.select(['*']).
    select_from(a.join(x, a.c.id == x.c.x)).
    select_from(a.join(y, a.c.id == y.c.y))
)
print(sqlparse.format(str(query), reindent=True))
# OUTPUT:
#
# SELECT *
# FROM a
# JOIN b AS x ON a.id = x.x,
# a
# JOIN b AS y ON a.id = y.y
Per https://gist.github.com/sirex/04ed17b9c9d61482f98b#file-main-py-L27-L28

sqlalchemy join to a table via two foreign keys to that same table (ambiguous column error)

I am trying to do a join on a table that has two foreign keys to the same table. Namely, SourceToOutputRelation points twice to Entries, as shown in the code. Also, Entries have tags. I am trying to write a join so that I get every SourceToOutputRelation that has all the given tags (via Entries). I am just trying to understand the join; the filtering works, I think. Here is the code I have for the join and filter:
'''
tags is a list of strings that are supposed to match the Tags.tag strings
'''
from sqlalchemy.orm import aliased

q = SourceToOutputRelation.query.\
    join(Entries.source_entries, Entries.output_entries).\
    join(original_tag_registration).\
    join(Tags).\
    filter(Tags.tag == tags[0])
print(q.all())
Here are my model definitions:
class SourceToOutputRelation(alchemyDB.Model):
    __tablename__ = 'sourceToOutputRel'
    id = alchemyDB.Column(alchemyDB.Integer, primary_key=True)
    source_article = alchemyDB.Column(alchemyDB.Integer, alchemyDB.ForeignKey('entries.id'))
    output_article = alchemyDB.Column(alchemyDB.Integer, alchemyDB.ForeignKey('entries.id'))

original_tag_registration = alchemyDB.Table(
    'original_tag_registration',
    alchemyDB.Column('tag_id', alchemyDB.Integer, alchemyDB.ForeignKey('tagTable.id')),
    alchemyDB.Column('entry_id', alchemyDB.Integer, alchemyDB.ForeignKey('entries.id'))
)

class Entries(alchemyDB.Model):
    __tablename__ = 'entries'
    id = alchemyDB.Column(alchemyDB.Integer, primary_key=True)
    tags = alchemyDB.relationship('Tags',
                                  secondary=original_tag_registration,
                                  backref=alchemyDB.backref('relevant_entries', lazy='dynamic'),
                                  lazy='dynamic')
    source_entries = alchemyDB.relationship('SourceToOutputRelation',
                                            primaryjoin="SourceToOutputRelation.output_article==Entries.id",
                                            foreign_keys=[SourceToOutputRelation.output_article],
                                            backref=alchemyDB.backref('output', lazy='joined'),
                                            lazy='dynamic',
                                            cascade='all, delete-orphan')
    output_entries = alchemyDB.relationship('SourceToOutputRelation',
                                            primaryjoin="SourceToOutputRelation.source_article==Entries.id",
                                            foreign_keys=[SourceToOutputRelation.source_article],
                                            backref=alchemyDB.backref('source', lazy='joined'),
                                            lazy='dynamic',
                                            cascade='all, delete-orphan')

class Tags(alchemyDB.Model):
    '''
    a table to hold unique tags
    '''
    __tablename__ = 'tagTable'
    id = alchemyDB.Column(alchemyDB.Integer, primary_key=True)
    tag = alchemyDB.Column(alchemyDB.String(64), unique=True)
    entries_with_this_tag = alchemyDB.relationship('Entries',
                                                   secondary=original_tag_registration,
                                                   backref=alchemyDB.backref('tag', lazy='dynamic'),
                                                   lazy='dynamic')
I get this error:
OperationalError: (OperationalError) ambiguous column name: sourceToOutputRel.id
u'SELECT "sourceToOutputRel".id AS "sourceToOutputRel_id",
       "sourceToOutputRel".source_article AS "sourceToOutputRel_source_article",
       "sourceToOutputRel".output_article AS "sourceToOutputRel_output_article",
       "sourceToOutputRel".needs_processing AS "sourceToOutputRel_needs_processing",
       "sourceToOutputRel".number_of_votes AS "sourceToOutputRel_number_of_votes",
       "sourceToOutputRel".date_related AS "sourceToOutputRel_date_related",
       "sourceToOutputRel".confirmed_relationship_type AS "sourceToOutputRel_confirmed_relationship_type",
       entries_1.id AS entries_1_id, entries_1.title AS entries_1_title,
       entries_1.text AS entries_1_text, entries_1.body_html AS entries_1_body_html,
       entries_1.user_id AS entries_1_user_id, entries_1.date_posted AS entries_1_date_posted,
       entries_2.id AS entries_2_id, entries_2.title AS entries_2_title,
       entries_2.text AS entries_2_text, entries_2.body_html AS entries_2_body_html,
       entries_2.user_id AS entries_2_user_id, entries_2.date_posted AS entries_2_date_posted
FROM entries
JOIN "sourceToOutputRel" ON "sourceToOutputRel".output_article = entries.id
JOIN "sourceToOutputRel" ON "sourceToOutputRel".source_article = entries.id
JOIN original_tag_registration ON entries.id = original_tag_registration.entry_id
JOIN "tagTable" ON "tagTable".id = original_tag_registration.tag_id
LEFT OUTER JOIN entries AS entries_1 ON "sourceToOutputRel".output_article = entries_1.id
LEFT OUTER JOIN entries AS entries_2 ON "sourceToOutputRel".source_article = entries_2.id
WHERE "tagTable".tag = ?' (u'brods',)
Look at the docs, paragraph "Joins to a Target with an ON Clause":
a_alias = aliased(Address)
q = session.query(User).\
    join(User.addresses).\
    join(a_alias, User.addresses).\
    filter(Address.email_address == 'ed@foo.com').\
    filter(a_alias.email_address == 'ed@bar.com')
There are multiple joins on one table, and you already import the aliased function. Try this code:
'''
tags is a list of strings that are supposed to match the Tags.tag strings
'''
from sqlalchemy.orm import aliased

entry_alias = aliased(Entries)
q = SourceToOutputRelation.query.\
    join(Entries.source_entries).\
    join(entry_alias, Entries.output_entries).\
    join(original_tag_registration).\
    join(Tags).\
    filter(Tags.tag == tags[0])
print(q.all())

How can I write an SQLAlchemy Query with a Join and an Aggregate?

I have a table that has 3 columns: type, content and time (an integer). For each 'type', I want to select the entry with the greatest (most recent) 'time' integer and the corresponding data. How can I do this using SQLAlchemy and Python? I could do this using SQL by performing:
select
    c.type,
    c.time,
    b.data
from
    parts as b
inner join
    (select
         a.type,
         max(a.time) as time
     from parts as a
     group by a.type) as c
on
    b.type = c.type and
    b.time = c.time
But how can I accomplish this in SQLAlchemy?
The table mapping:
class Structure(Base):
    __tablename__ = 'structure'
    id = Column(Integer, primary_key=True)
    type = Column(Text)
    content = Column(Text)
    time = Column(Integer)

    def __init__(self, type, content):
        self.type = type
        self.content = content
        self.time = time.time()

    def serialise(self):
        return {"type": self.type,
                "content": self.content}
The attempted query:
max = func.max(Structure.time).alias("time")
c = DBSession.query(max)\
    .add_columns(Structure.type, Structure.time)\
    .group_by(Structure.type)\
    .subquery()
c.alias("c")
b = DBSession.query(Structure.content)\
    .add_columns(c.c.type, c.c.time)\
    .join(c, Structure.type == c.c.type)
Gives me:
sqlalchemy.exc.OperationalError: (OperationalError) near "(": syntax error
u'SELECT structure.content AS structure_content, anon_1.type AS anon_1_type, anon_1.time AS anon_1_time
FROM structure JOIN (SELECT time.max_1 AS max_1, structure.type AS type, structure.time AS time
FROM max(structure.time) AS time, structure GROUP BY structure.type) AS anon_1
ON structure.type = anon_1.type' ()
I'm essentially stabbing in the dark, so any help would be appreciated.
Try the code below, using a subquery:
subq = (session.query(
            Structure.type,
            func.max(Structure.time).label("max_time"))
        .group_by(Structure.type)
        ).subquery()

qry = (session.query(Structure)
       .join(subq, and_(Structure.type == subq.c.type,
                        Structure.time == subq.c.max_time)))

print(qry)
producing SQL:
SELECT structure.id AS structure_id, structure.type AS structure_type, structure.content AS structure_content, structure.time AS structure_time
FROM structure
JOIN (SELECT structure.type AS type, max(structure.time) AS max_time
FROM structure GROUP BY structure.type) AS anon_1
ON structure.type = anon_1.type
AND structure.time = anon_1.max_time
