SQLAlchemy sort by foreign column through hybrid property - python

I'm trying to build a universal method of constructing queries for various models based on input from the frontend. Some queries require filtering or sorting on child table columns.
So I decided to expose such columns as hybrid properties on the master model:
class PurchaseOrder(db.Model, BaseModel):
    ''' Represents purchase order '''
    __tablename__ = 'purchase_orders'

    id = Column(String(23), primary_key=True, nullable=False)
    vendor_po_id = Column(String(12))
    suborder_id = Column(String(20), ForeignKey('suborders.id'), nullable=False)
    suborder = relationship('Suborder', foreign_keys=[suborder_id], lazy='joined')

    @hybrid_property
    def purchase_date(self):
        return self.suborder.buyout_date

    @purchase_date.expression
    def purchase_date(cls):
        return Suborder.buyout_date
So I build the query, and the final clause looks like this:
query = query.order_by(PurchaseOrder.purchase_date)
In this case I expect the final query to be something like:
SELECT purchase_orders.*, suborders.*
FROM purchase_orders JOIN suborders ON purchase_orders.suborder_id = suborders.id
ORDER BY suborders.buyout_date
However, what I actually got is (column lists reduced):
SELECT purchase_orders.*, suborders_1.*
FROM
purchase_orders
LEFT OUTER JOIN suborders AS suborders_1 ON suborders_1.id = purchase_orders.suborder_id
ORDER BY suborders.buyout_date
Naturally I'm getting an error:
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1054, "Unknown column 'suborders.buyout_date' in 'order clause'")
So the question - how do I do it right?
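One pattern that commonly resolves this (a sketch, not from the original post): join Suborder explicitly, so that ORDER BY suborders.buyout_date refers to a real table in the FROM clause rather than the suborders_1 alias generated by lazy='joined':
# Sketch: an explicit join gives the hybrid's expression an un-aliased
# suborders table to resolve against; the eager-load alias is separate.
query = (
    db.session.query(PurchaseOrder)
    .join(Suborder, PurchaseOrder.suborder_id == Suborder.id)
    .order_by(PurchaseOrder.purchase_date)
)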

Related

Get all columns from a SqlAlchemy table to filter on them

I build default equality filters for all columns of the table and then add/override only the specific filters I want on a few columns, like this:
# Table definition
class OrderTable(Base):
    __tablename__ = "orders"

    id = Column(Integer, unique=True, nullable=False, primary_key=True)
    name = Column(String, nullable=False)

# Build all default filters for this table.
# (column=column binds each column at lambda-definition time; a bare
# closure would make every filter use the last column of the loop.)
table_filters = {column.name: (lambda value, column=column: column == value)
                 for column in OrderTable.__table__.columns}

# Add/override specific filters
all_filters = {**table_filters, "name": lambda value: OrderTable.name.like(value)}

# Execute the query
query = ...
query = query.filter(all_filters["id"](123))
query.delete()
But I get this warning when using default filters:
SAWarning: Evaluating non-mapped column expression 'orders.id' onto ORM instances; this is a deprecated use case. Please make use of the actual mapped columns in ORM-evaluated UPDATE / DELETE expressions.
Is there a better way to get all columns to be able to filter on them without getting this warning?
I tried different ways of getting all columns of a table, with OrderTable.__mapper__.attrs and inspect(OrderTable).attrs, but then the filters do not work.
I am not used to posting, so please tell me if I can improve my question and I will edit it.
It appears to be related to fetching columns from the table versus using the attributes directly.
When you use the attribute directly, you get an InstrumentedAttribute; when you get the column from __table__.columns, you get a plain Column.
With the Column you get that warning:
id_filter_col = OrderTable.__table__.c["id"] == 1
query = session.query(OrderTable)
query = query.filter(id_filter_col)
query.delete() # SAWarning: Evaluating non-mapped column expression 'orders.id' onto ORM instances...
But now when you use the InstrumentedAttribute:
id_filter_attr = OrderTable.id == 2
query = session.query(OrderTable)
query = query.filter(id_filter_attr)
query.delete() # OK
You can access the attributes from the mapper via __mapper__.all_orm_descriptors, which should solve your problem.
class OrderTable(Base):
    __tablename__ = "orders"

    id = Column(Integer, unique=True, nullable=False, primary_key=True)
    name = Column(String, nullable=False)

# column=column again binds each descriptor at lambda-definition time.
table_filters = {column.key: (lambda value, column=column: column == value)
                 for column in OrderTable.__mapper__.all_orm_descriptors}
all_filters = {**table_filters, "name": lambda value: OrderTable.name.like(value)}

query = ...
query = query.filter(all_filters["id"](123))
query.delete() # OK
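A variant worth considering (a sketch, not from the original answer): __mapper__.column_attrs limits the loop to mapped column attributes only, since all_orm_descriptors can also include relationships and hybrid properties that you may not want default equality filters for:
# Sketch: default filters built from mapped column attributes only.
table_filters = {
    attr.key: (lambda value, col=getattr(OrderTable, attr.key): col == value)
    for attr in OrderTable.__mapper__.column_attrs
}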

Selecting columns fails with tables created using double quotation marks

I connected a PostgreSQL database to Apache Superset and am playing around with their SQL editor. I'm running into a problem where I cannot do a left join between two tables with an associated id.
SELECT id, profile_name FROM "ProductionRun"
LEFT JOIN "StatsAssociation" ON "ProductionRun".id = "StatsAssociation".production_run_id;
Is my above syntax correct? The tables must be referenced with double quotation marks because they were created case-sensitive. This returns only the id and profile_name columns of the ProductionRun table, without joining with the StatsAssociation table.
I created the tables using sqlalchemy and here are the table schema:
ProductionRun
class ProductionRun(Base):
    __tablename__ = 'ProductionRun'

    id = Column(Integer, primary_key=True, autoincrement=True)
    profile_name = Column(String, nullable=False)
StatsAssociation
class StatsAssociation(Base):
    __tablename__ = 'StatsAssociation'

    production_run_id = Column(Integer, ForeignKey('ProductionRun.id'), primary_key=True)
    stats_package_id = Column(Integer, ForeignKey('StatsPackage.id'), unique=True, nullable=False)

    stats_package = relationship('StatsPackage', back_populates='stats_association', cascade='all,delete')
    production_run = relationship('ProductionRun', back_populates='stats_association')
When I view the tables, they both exist and StatsAssociation has production_run_id column which shares the same ids as ProductionRun.
This was originally posted as a comment.
You're not specifying any column from the "StatsAssociation" table, so it is expected that nothing from it would show up. To get columns in the output of a SELECT query, you need to list them; the only exceptions I can currently think of are using "TableName".* or * in the SELECT.
For example, and just to start you off:
SELECT id, profile_name, production_run_id
FROM ...
where ... is the rest of your query.
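A complete version, reusing the join from the question (a sketch; once both tables appear in the query, qualifying the column names avoids ambiguity):
SELECT "ProductionRun".id,
       "ProductionRun".profile_name,
       "StatsAssociation".production_run_id
FROM "ProductionRun"
LEFT JOIN "StatsAssociation" ON "ProductionRun".id = "StatsAssociation".production_run_id;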

How to count child table items with or without join to parent table using SQLAlchemy?

I used SQLAlchemy to create a SQLite database which stores bibliographic data of documents, and I want to query the number of authors of each document.
I know how to do this in raw SQL, but how can I achieve the same result using SQLAlchemy? Is it possible without using join?
Here are the classes that I have defined:
class WosDocument(Base):
    __tablename__ = 'wos_document'

    document_id = Column(Integer, primary_key=True)
    unique_id = Column(String, unique=True)
    # ...

    authors = relationship('WosAuthor', back_populates='document')

class WosAuthor(Base):
    __tablename__ = 'wos_author'

    author_id = Column(Integer, primary_key=True, autoincrement=True)
    document_unique_id = Column(String, ForeignKey('wos_document.unique_id'))
    document = relationship('WosDocument', back_populates='authors')

    last_name = Column(String)
    first_name = Column(String)
And my goal is to get the same result as this SQL query does:
SELECT a.unique_id, COUNT(*)
FROM wos_document AS a
LEFT JOIN wos_author AS b
ON a.unique_id = b.document_unique_id
GROUP BY a.unique_id
I tried the code below:
session.query(WosDocument.unique_id, len(WosDocument.authors)).all()
session.query(WosDocument.unique_id, func.count(WosDocument.authors)).all()
The first line raises an error; the second doesn't give me the desired result, returning only one row that I don't recognize:
[('000275510800023', 40685268)]
Since the WosDocument object has a one-to-many relationship authors, I supposed that I could query the author count of each document without an explicit join, but I can't find out how to do this with SQLAlchemy.
Can you help me? Thanks!
If you have defined the relationship correctly in your model, then the query would be like:
db.session.query(ParentTable.pk, func.count('*').label("count")).join(ChildTable).group_by(ParentTable).all()
The join() method is documented here:
https://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.join
If you don't join() explicitly, you would need to deal with something like parent.relations as a field.
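Applied to the models from the question, a sketch of the joined version (using outerjoin to mirror the LEFT JOIN in the raw SQL; note that func.count(WosAuthor.author_id) yields 0 for documents without authors, whereas COUNT(*) would yield 1):
from sqlalchemy import func

# Sketch: author count per document, keeping documents with no authors.
author_counts = (
    session.query(WosDocument.unique_id, func.count(WosAuthor.author_id))
    .outerjoin(WosDocument.authors)
    .group_by(WosDocument.unique_id)
    .all()
)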

Delete whole hierarchy with sqlalchemy polymorphic relationship

I want to delete some elements in tables that have a polymorphic relationship in sqlalchemy. Here's the model:
class Employee(Base):
    __tablename__ = 'employee'

    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    type = Column(String(50))

    __mapper_args__ = {
        'polymorphic_identity': 'employee',
        'polymorphic_on': type
    }

class Engineer(Employee):
    __tablename__ = 'engineer'

    id = Column(Integer, ForeignKey('employee.id'), primary_key=True)
    engineer_name = Column(String(30))

    __mapper_args__ = {
        'polymorphic_identity': 'engineer',
    }
And here's how I delete it:
e = Engineer()
e.name = "John"
e.engineer_name = "Doe"
DBSession.add(e)

q = session.query(Engineer).filter(Employee.name == "John")
q.delete(False)
I get the following error; is this a bug, or am I doing it the wrong way?
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: employee.name
[SQL: u'DELETE FROM engineer WHERE employee.name = ?'] [parameters: ('John',)]
I'm expecting SQLAlchemy to delete the entries in both the engineer and employee tables.
First, you should define the ON DELETE behaviour of this relationship:
id = Column(Integer, ForeignKey('employee.id', ondelete='CASCADE'), primary_key=True)
Then, using the ORM, you can delete all engineers with name "John" through a loop:
eng_list = session.query(Engineer).filter(Employee.name == "John").all()
for eng in eng_list:
    session.delete(eng)
session.commit()
This will delete the records from both the Employee and Engineer tables.
Update: a comment on the error message:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: employee.name
[SQL: u'DELETE FROM engineer WHERE employee.name = ?'] [parameters: ('John',)]
Your attempt tries to DELETE FROM engineer with a join to employee (to access the field Employee.name), but this join is missing from the query SQLAlchemy emits to the backend.
I don't think SQLite supports deleting with joins. Perhaps you can try running session.query(Engineer).filter(Employee.name == "John").delete() against a different backend, and SQLAlchemy may be able to emit the proper SQL statement. I haven't tried it, though.
Update 2: On backends that enforce foreign key constraints (and with ondelete set to cascade as above), it should be sufficient to delete the parent row; the linked rows in the child table will be deleted automatically.
I tried this example with both MySQL & Postgresql backends, and the following query deleted the row from both tables (employee & engineer):
session.query(Employee).filter(Employee.name=='John').delete()
For some reason, on SQLite this only deletes the record from employee.
Because the joined DELETE is not supported directly, an easy workaround is to use your normal joined query to select the ids to delete, then pass those ids to a separate DELETE query.
One minor annoyance: since the returned ids are integers (technically a list of one-element tuples), you will likely run into an error, as I did, if you try to pass them directly to the DELETE query. A simple intermediate conversion to strings fixes that.
So all together:
ids_to_delete = session.query(Engineer.id). \
    filter(Employee.name == "John"). \
    all()

# Convert the resulting int tuples to simple strings:
id_strings = [str(id_[0]) for id_ in ids_to_delete]

session.query(Engineer). \
    filter(Engineer.id.in_(id_strings)). \
    delete(synchronize_session=False)
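On recent SQLAlchemy versions, a sketch of the same workaround without the string conversion, unpacking the one-element tuples into plain ints instead:
rows = session.query(Engineer.id).filter(Employee.name == "John").all()
ids_to_delete = [row_id for (row_id,) in rows]  # unpack 1-tuples into ints

session.query(Engineer) \
    .filter(Engineer.id.in_(ids_to_delete)) \
    .delete(synchronize_session=False)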

Get last record of a day using joins in Flask-SQLAlchemy

I have a table of timestamped rows with a couple of values, and I want to get the last value of each day.
In raw SQL I would do it like this:
SELECT strftime('%Y-%m-%d', a1.created_at) AS created_at,
       a1.my_value
FROM my_table a1
JOIN
  (SELECT max(id) AS max_id
   FROM my_table
   GROUP BY strftime('%Y-%m-%d', created_at)) a2 ON a1.id = a2.max_id;
I am working on a Flask Application where I want to use the Flask-SQLAlchemy extension and not sqlalchemy.orm.session.Session or raw SQL. The defined Model looks like this:
class MyModel(SurrogatePK, Model):
    __tablename__ = 'my_table'

    id = Column(db.Integer(), nullable=False, primary_key=True)
    created_at = Column(db.DateTime(), nullable=False, default=dt.datetime.utcnow)
    my_value = Column(db.Integer(), nullable=True)
What I have so far is:
Outer query:
MyModel.query.with_entities(MyModel.created_at, MyModel.my_value)...
Inner query:
MyModel.query.with_entities(func.max(MyModel.id).label('max')).group_by(func.strftime('%Y-%m-%d', MyModel.created_at))
But I cannot find the way to join both together to get the desired result.
This is how I would do it in SQLAlchemy:
from sqlalchemy import func, literal_column

a2 = (session.query(func.max(MyModel.id).label("max_id"))
      .group_by(func.convert(literal_column('DATE'), MyModel.created_at))
      .subquery())

# Join back on the labelled max-id column of the subquery.
session.query(func.convert(literal_column('DATE'), MyModel.created_at).label("created_at"),
              MyModel.my_value) \
    .join(a2, MyModel.id == a2.c.max_id)
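Since the question uses Flask-SQLAlchemy with SQLite, here is a sketch of the same shape in the MyModel.query style, combining the two partial queries from the question (max_id is just an illustrative label; CONVERT(DATE, ...) above is SQL Server syntax, so strftime is used here instead):
from sqlalchemy import func

day = func.strftime('%Y-%m-%d', MyModel.created_at)

# Inner query: the last (max) id of each day.
a2 = (MyModel.query
      .with_entities(func.max(MyModel.id).label('max_id'))
      .group_by(day)
      .subquery())

# Outer query: join back on the id to fetch each day's last value.
result = (MyModel.query
          .with_entities(day.label('created_at'), MyModel.my_value)
          .join(a2, MyModel.id == a2.c.max_id)
          .all())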
