Delete whole hierarchy with sqlalchemy polymorphic relationship - python

I want to delete some elements in tables that have a polymorphic relationship in sqlalchemy. Here's the model:
    class Employee(Base):
        __tablename__ = 'employee'
        id = Column(Integer, primary_key=True)
        name = Column(String(50))
        type = Column(String(50))
        __mapper_args__ = {
            'polymorphic_identity': 'employee',
            'polymorphic_on': type
        }

    class Engineer(Employee):
        __tablename__ = 'engineer'
        id = Column(Integer, ForeignKey('employee.id'), primary_key=True)
        engineer_name = Column(String(30))
        __mapper_args__ = {
            'polymorphic_identity': 'engineer',
        }
And here's how I delete it:
    e = Engineer()
    e.name = "John"
    e.engineer_name = "Doe"
    session.add(e)
    q = session.query(Engineer).filter(Employee.name == "John")
    q.delete(False)
I get the following error. Is that a bug, or am I doing it the wrong way?
    sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: employee.name [SQL: u'DELETE FROM engineer WHERE employee.name = ?'] [parameters: ('John',)]
I'm expecting sqlalchemy to delete the entries in both the engineer and employee tables.

First, define the ON DELETE behaviour of the foreign key on Engineer.id:
    id = Column(Integer, ForeignKey('employee.id', ondelete='CASCADE'), primary_key=True)
Then, using the ORM, you can delete all engineers named "John" in a loop:
    eng_list = session.query(Engineer).filter(Employee.name == "John").all()
    for eng in eng_list:
        session.delete(eng)
    session.commit()
This deletes the records from both the employee and engineer tables.
Update: a comment on the error message:
    sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: employee.name [SQL: u'DELETE FROM engineer WHERE employee.name = ?'] [parameters: ('John',)]
Your attempt tries to delete from engineer with a join to employee (to access the column Employee.name), but that join is missing from the statement sqlalchemy emits to the backend.
I don't think SQLite supports deleting with joins. You could try running session.query(Engineer).filter(Employee.name == "John").delete() against a different backend, where sqlalchemy may be able to emit the proper SQL statement. I haven't tried it, though.
Update 2: On backends that enforce foreign key constraints (and where the ondelete behaviour has been set to CASCADE), it should be sufficient to delete the row in the parent table; the linked rows in the child table are then deleted automatically.
I tried this example with both MySQL and PostgreSQL backends, and the following query deleted the row from both tables (employee and engineer):
    session.query(Employee).filter(Employee.name == 'John').delete()
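On SQLite, however, this only deleted the record from employee. The likely cause is that SQLite does not enforce foreign key constraints, including ON DELETE CASCADE, unless PRAGMA foreign_keys = ON has been issued on the connection.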
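To see how SQLite's foreign key setting affects this cascade, here is a minimal sketch using Python's built-in sqlite3 module rather than SQLAlchemy. The table layout mirrors the models above; the helper name remaining_engineers and the sample row are assumptions made for illustration. The pragma is set explicitly in both cases so the result does not depend on how SQLite was compiled.

```python
import sqlite3


def remaining_engineers(enforce_foreign_keys: bool) -> int:
    """Delete the parent row, then report how many child rows are left."""
    conn = sqlite3.connect(":memory:")
    # The pragma is per connection and must be issued outside a transaction.
    setting = "ON" if enforce_foreign_keys else "OFF"
    conn.execute(f"PRAGMA foreign_keys = {setting}")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, type TEXT)"
    )
    conn.execute(
        "CREATE TABLE engineer ("
        " id INTEGER PRIMARY KEY REFERENCES employee(id) ON DELETE CASCADE,"
        " engineer_name TEXT)"
    )
    conn.execute("INSERT INTO employee VALUES (1, 'John', 'engineer')")
    conn.execute("INSERT INTO engineer VALUES (1, 'Doe')")
    # Only the parent table is named in the DELETE.
    conn.execute("DELETE FROM employee WHERE name = ?", ("John",))
    count = conn.execute("SELECT COUNT(*) FROM engineer").fetchone()[0]
    conn.close()
    return count


print("enforcement off:", remaining_engineers(False))
print("enforcement on: ", remaining_engineers(True))
```

With enforcement off, the engineer row is left orphaned; with it on, the cascade removes it. For SQLAlchemy, the SQLite dialect documentation describes issuing this pragma from a "connect" event listener on the engine, so that every new connection gets it.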

Because the joined DELETE is not supported directly, I found that an easy workaround is to use your normal joined query to select the ids to delete, then pass those ids to a separate DELETE query.
One minor annoyance: the returned ids are integers wrapped in a list of one-element tuples, so passing them directly to the DELETE query is likely to raise an error, as it did for me. A simple intermediate conversion to strings fixes that.
So, all together:
    ids_to_delete = session.query(Engineer.id). \
        filter(Employee.name == "John"). \
        all()
    # Convert the resulting int tuples to simple strings:
    id_strings = [str(id_[0]) for id_ in ids_to_delete]
    session.query(Engineer). \
        filter(Engineer.id.in_(id_strings)). \
        delete(synchronize_session=False)
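The same two-step idea can be sketched at the SQL level with Python's built-in sqlite3 module (not the ORM). The tables mirror the models from the question and the sample rows are invented for illustration. This variant unpacks the one-element tuples into plain integers and binds them as parameters, and it also removes the matching employee rows, which the snippet above leaves in place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE engineer (
        id INTEGER PRIMARY KEY REFERENCES employee(id),
        engineer_name TEXT
    );
    INSERT INTO employee VALUES (1, 'John', 'engineer'), (2, 'Jane', 'engineer');
    INSERT INTO engineer VALUES (1, 'Doe'), (2, 'Roe');
""")

# Step 1: a joined SELECT collects the ids to delete.
rows = conn.execute(
    "SELECT engineer.id FROM engineer "
    "JOIN employee ON employee.id = engineer.id "
    "WHERE employee.name = ?",
    ("John",),
).fetchall()
ids_to_delete = [row[0] for row in rows]  # unpack the one-element tuples

# Step 2: plain single-table DELETEs keyed on those ids, child table first.
if ids_to_delete:
    placeholders = ", ".join("?" for _ in ids_to_delete)
    conn.execute(f"DELETE FROM engineer WHERE id IN ({placeholders})", ids_to_delete)
    conn.execute(f"DELETE FROM employee WHERE id IN ({placeholders})", ids_to_delete)
conn.commit()

remaining_employees = conn.execute("SELECT id FROM employee ORDER BY id").fetchall()
remaining_engineers = conn.execute("SELECT id FROM engineer ORDER BY id").fetchall()
print(remaining_employees, remaining_engineers)
```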

Related

selecting columns with tables created using double quotation fail

I connected a postgresql database to Apache Superset and am playing around with their SQL editor. I'm running into a problem where I cannot do a left join between two tables with an associated id.
    SELECT id, profile_name FROM "ProductionRun"
    LEFT JOIN "StatsAssociation" ON "ProductionRun".id = "StatsAssociation".production_run_id;
Is my above syntax correct? The tables must be referenced with double quotation because they are created case sensitive. This returns only the id and profile_name columns of ProductionRun table without joining with StatsAssociation table.
I created the tables using sqlalchemy, and here are the table schemas:
ProductionRun
    class ProductionRun(Base):
        __tablename__ = 'ProductionRun'
        id = Column(Integer, primary_key=True, autoincrement=True)
        profile_name = Column(String, nullable=False)
StatsAssociation
    class StatsAssociation(Base):
        __tablename__ = 'StatsAssociation'
        production_run_id = Column(Integer, ForeignKey('ProductionRun.id'), primary_key=True)
        stats_package_id = Column(Integer, ForeignKey('StatsPackage.id'), unique=True, nullable=False)
        stats_package = relationship('StatsPackage', back_populates='stats_association', cascade='all,delete')
        production_run = relationship('ProductionRun', back_populates='stats_association')
When I view the tables, they both exist and StatsAssociation has production_run_id column which shares the same ids as ProductionRun.
This was originally posted as a comment.
You're not selecting any column from the "StatsAssociation" table, so it is expected that none of its data shows up. To get columns in the output of a SELECT query, you need to list them; the only exception I can think of is when you use "TableName".* or * in the SELECT list.
For example, and just to start you off:
    SELECT id, profile_name, production_run_id
    FROM ...
where ... is the rest of your query.
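A small sketch of this point, run against an in-memory SQLite database through Python's sqlite3 module instead of PostgreSQL and Superset. The two tables follow the schemas in the question (the foreign key to StatsPackage is omitted), and the sample rows are made up. The join is identical in both queries; only the SELECT list changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
    CREATE TABLE "ProductionRun" (
        id INTEGER PRIMARY KEY,
        profile_name TEXT NOT NULL
    );
    CREATE TABLE "StatsAssociation" (
        production_run_id INTEGER PRIMARY KEY REFERENCES "ProductionRun"(id),
        stats_package_id INTEGER NOT NULL UNIQUE
    );
    INSERT INTO "ProductionRun" VALUES (1, 'fast'), (2, 'slow');
    INSERT INTO "StatsAssociation" VALUES (1, 10);
''')

join_clause = (
    'FROM "ProductionRun" '
    'LEFT JOIN "StatsAssociation" '
    'ON "ProductionRun".id = "StatsAssociation".production_run_id '
    'ORDER BY "ProductionRun".id'
)

# Only ProductionRun columns are listed, so only those come back.
narrow = conn.execute(
    'SELECT "ProductionRun".id, profile_name ' + join_clause
).fetchall()

# Listing StatsAssociation columns makes the joined data visible; runs
# without a match get NULL (None in Python) because of the LEFT JOIN.
wide = conn.execute(
    'SELECT "ProductionRun".id, profile_name, production_run_id, stats_package_id '
    + join_clause
).fetchall()

print(narrow)
print(wide)
```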

SQLAlchemy sort by foreign column through hybrid property

I'm trying to build a universal method of building queries for various models based on input from the frontend. Some queries require filter or sort on child table columns.
So I decided to have such columns as hybrid properties in master model:
    class PurchaseOrder(db.Model, BaseModel):
        ''' Represents purchase order '''
        __tablename__ = 'purchase_orders'
        id = Column(String(23), primary_key=True, nullable=False)
        vendor_po_id = Column(String(12))
        suborder_id = Column(String(20), ForeignKey('suborders.id'), nullable=False)
        suborder = relationship('Suborder', foreign_keys=[suborder_id], lazy='joined')

        @hybrid_property
        def purchase_date(self):
            return self.suborder.buyout_date

        @purchase_date.expression
        def purchase_date(cls):
            return Suborder.buyout_date
So I build a query, and the final clause looks like this:
    query = query.order_by(PurchaseOrder.purchase_date)
In this case I expect the final query to be something like:
    SELECT purchase_orders.*, suborders.*
    FROM purchase_orders JOIN suborders ON purchase_orders.suborder_id = suborders.id
    ORDER BY suborders.buyout_date
However, what I get is (columns are reduced):
    SELECT purchase_orders.*, suborders_1.*
    FROM
        purchase_orders
        LEFT OUTER JOIN suborders AS suborders_1 ON suborders_1.id = purchase_orders.suborder_id
    ORDER BY suborders.buyout_date
Naturally, I'm getting an error:
    sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1054, "Unknown column 'suborders.buyout_date' in 'order clause'")
So the question is: how do I do this correctly?

How to count child table items with or without join to parent table using SQLAlchemy?

I used SQLAlchemy to create a SQLite database that stores bibliographic data for some documents, and I want to query the number of authors of each document.
I know how to do this in raw SQL, but how can I achieve the same result using SQLAlchemy? Is it possible without using a join?
Here are the classes that I have defined:
    class WosDocument(Base):
        __tablename__ = 'wos_document'
        document_id = Column(Integer, primary_key=True)
        unique_id = Column(String, unique=True)
        ......
        authors = relationship('WosAuthor', back_populates='document')

    class WosAuthor(Base):
        __tablename__ = 'wos_author'
        author_id = Column(Integer, primary_key=True, autoincrement=True)
        document_unique_id = Column(String, ForeignKey('wos_document.unique_id'))
        document = relationship('WosDocument', back_populates='authors')
        last_name = Column(String)
        first_name = Column(String)
And my goal is to get the same result as this SQL query does:
    SELECT a.unique_id, COUNT(*)
    FROM wos_document AS a
    LEFT JOIN wos_author AS b
    ON a.unique_id = b.document_unique_id
    GROUP BY a.unique_id
I tried the code below:
    session.query(WosDocument.unique_id, len(WosDocument.authors)).all()
    session.query(WosDocument.unique_id, func.count(WosDocument.authors)).all()
The first line raised an error. The second line doesn't give me the desired result: it returns only one row, and I don't recognize what the number is:
    [('000275510800023', 40685268)]
Since a WosDocument object has a one-to-many relationship authors, I assumed I could query the number of authors of each document without an explicit join, but I can't find out how to do this with SQLAlchemy.
Can you help me? Thanks!
If you have defined the relationship correctly in your model, the query would look like this:
    db.session.query(ParentTable.pk, func.count('*').label("count")).join(ChildTable).group_by(ParentTable.pk).all()
The documentation for join() is at:
https://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.join
If you don't join() explicitly, you would need to deal with something like parent.relations as a field.
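One detail worth checking is how the kind of join and the counted expression interact for documents that have no authors. The sketch below uses Python's sqlite3 module on tables shaped like the question's models; the sample rows and the helper name author_counts are assumptions for illustration. It compares an inner join, a LEFT JOIN with COUNT(*), and a LEFT JOIN that counts a child column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wos_document (
        document_id INTEGER PRIMARY KEY,
        unique_id TEXT UNIQUE
    );
    CREATE TABLE wos_author (
        author_id INTEGER PRIMARY KEY AUTOINCREMENT,
        document_unique_id TEXT REFERENCES wos_document(unique_id),
        last_name TEXT,
        first_name TEXT
    );
    INSERT INTO wos_document (unique_id) VALUES ('doc-a'), ('doc-b');
    INSERT INTO wos_author (document_unique_id, last_name, first_name)
        VALUES ('doc-a', 'Smith', 'Ann'), ('doc-a', 'Jones', 'Bob');
""")


def author_counts(join_kind, counted):
    """Run the grouped count; both arguments are fixed SQL fragments."""
    sql = (
        f"SELECT a.unique_id, COUNT({counted}) "
        "FROM wos_document AS a "
        f"{join_kind} JOIN wos_author AS b "
        "ON a.unique_id = b.document_unique_id "
        "GROUP BY a.unique_id ORDER BY a.unique_id"
    )
    return conn.execute(sql).fetchall()


inner_star = author_counts("INNER", "*")           # doc-b is dropped
left_star = author_counts("LEFT", "*")             # doc-b counts its NULL row
left_column = author_counts("LEFT", "b.author_id") # doc-b counts as 0

print(inner_star, left_star, left_column)
```

In SQLAlchemy terms, join() produces the inner join, while outerjoin() combined with func.count() over a child column such as WosAuthor.author_id should correspond to the last variant.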

SQLAlchemy: insert record if certain record and relationship data does not already exist

I want to know how to query data from multiple tables with multiple conditions.
My example db has the following tables:
    class Location(Base):
        __tablename__ = "location"
        id = Column('id', Integer, primary_key=True)
        location = Column('Location', String)

    class Person(Base):
        __tablename__ = "person"
        id = Column('id', Integer, primary_key=True)
        name = Column('Name', String, unique=True)
        profession = Column('Profession', String)
        location_id = Column(Integer, ForeignKey('location.id'))
        location = relationship(Location)
We have in this database a person with a specific location. My goal is to write a query where I can check conditions of the Location table and the Person table.
A person with the name Eric lives in Houston. Now I want to know if I already have an Eric from Houston in my database.
The following query doesn't work.
    new_location = Location(location='Houston')
    obj = Person(name='Eric', profession='Teacher', location=new_location)
    if session.query(Person).filter(Person.name == obj.name,
                                    Person.profession == obj.profession,
                                    Person.location_id == obj.location.id).first() is None:
        session.add(obj)
        session.commit()
        print("Insert successful")
The problem in my query is the last line where I check the location but I don't know how to solve it. Maybe someone has a working example with the SQLAlchemy method exists()?
You can do something like the following to join Person and Location and filter for any record where the name and location are the same as the new person instance you have created. The query will either return the record or None, so you can use the result in your if (remember that indentation matters - maybe the code example in your question just copied incorrectly).
    new_location = Location(location='Houston')
    new_person = Person(name='Eric', profession='Teacher', location=new_location)
    person_location_exists = session.query(Person).\
        join(Location).\
        filter(Person.name == new_person.name).\
        filter(Location.location == new_location.location).\
        first()
    if not person_location_exists:
        session.add(new_person)
        session.commit()
        print("Insert successful")
You could use exists() to accomplish the same thing, but I think the above is a bit simpler.
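For comparison, the exists() route comes down to a SQL EXISTS subquery. Here is a minimal sketch of that check with Python's sqlite3 module rather than the ORM; the tables follow the question's models, and the helper names person_exists and add_person_if_missing are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (id INTEGER PRIMARY KEY, "Location" TEXT);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        "Name" TEXT UNIQUE,
        "Profession" TEXT,
        location_id INTEGER REFERENCES location(id)
    );
""")


def person_exists(name, city):
    """True if a person with this name is linked to this location."""
    row = conn.execute(
        'SELECT EXISTS ('
        ' SELECT 1 FROM person'
        ' JOIN location ON location.id = person.location_id'
        ' WHERE person."Name" = ? AND location."Location" = ?)',
        (name, city),
    ).fetchone()
    return bool(row[0])


def add_person_if_missing(name, profession, city):
    """Insert the person (and the location, if needed) unless already present."""
    if person_exists(name, city):
        return False
    found = conn.execute(
        'SELECT id FROM location WHERE "Location" = ?', (city,)
    ).fetchone()
    if found:
        location_id = found[0]
    else:
        cursor = conn.execute('INSERT INTO location ("Location") VALUES (?)', (city,))
        location_id = cursor.lastrowid
    conn.execute(
        'INSERT INTO person ("Name", "Profession", location_id) VALUES (?, ?, ?)',
        (name, profession, location_id),
    )
    conn.commit()
    return True


first_attempt = add_person_if_missing("Eric", "Teacher", "Houston")
second_attempt = add_person_if_missing("Eric", "Teacher", "Houston")
print(first_attempt, second_attempt)
```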

Creating ORM mappings over subqueries of a table

I'm trying to use SQLAlchemy in a situation where I have a one-to-many table construct, but I essentially want to create a one-to-one mapping between tables using a subquery.
For example
    class User:
        __tablename__ = 'user'
        userid = Column(Integer)
        username = Column(String)

    class Address:
        __tablename__ = 'address'
        userid = Column(Integer)
        address = Column(String)
        type = Column(String)
In this case the type column of Address includes strings like "Home", "Work" etc. I would like the output to look something like this
I tried using a subquery where I tried
    session.query(Address).filter(Address.type == "Home").subquery("HomeAddress")
and then joining against that but then I lose ORM "entity" mapping.
How can I subquery but retain the ORM attributes in the results object?
You don't need to use a subquery. The join condition is not limited to foreign key against primary key:
    from sqlalchemy import and_
    from sqlalchemy.orm import aliased

    # Pass the alias name by keyword; the second positional argument
    # of aliased() is not the name.
    home_address = aliased(Address, name="home_address")
    work_address = aliased(Address, name="work_address")
    session.query(User) \
        .join(home_address, and_(User.userid == home_address.userid,
                                 home_address.type == "Home")) \
        .join(work_address, and_(User.userid == work_address.userid,
                                 work_address.type == "Work")) \
        .with_entities(User, home_address, work_address)
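At the SQL level, this amounts to joining the address table twice under two aliases, with the type filter folded into each join condition. A minimal sketch of that shape with Python's sqlite3 module follows; the sample rows are invented, and this is the general pattern, not necessarily the exact statement SQLAlchemy emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE address (userid INTEGER, address TEXT, type TEXT);
    INSERT INTO user VALUES (1, 'alice');
    INSERT INTO address VALUES
        (1, '1 Home St', 'Home'),
        (1, '2 Work Ave', 'Work');
""")

# One row per user, with the home and work addresses side by side.
rows = conn.execute("""
    SELECT user.username, home_address.address, work_address.address
    FROM user
    JOIN address AS home_address
      ON user.userid = home_address.userid AND home_address.type = 'Home'
    JOIN address AS work_address
      ON user.userid = work_address.userid AND work_address.type = 'Work'
""").fetchall()

print(rows)
```

Because both joins are inner joins, a user missing either address type drops out of the result; switching to LEFT JOIN would keep such users with NULL in the missing column.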
