Many-to-many relations from table to itself in SQLAlchemy - python

I'm using SQLAlchemy to represent a relationship between authors. I'd like to have authors related to other authors (coauthorshp), with extra data in the relation, such that with an author a I can find their coauthors.
How this is done between two different objects is this:
class Association(Base):
__tablename__ = 'association'
left_id = Column(Integer, ForeignKey('left.id'), primary_key=True)
right_id = Column(Integer, ForeignKey('right.id'), primary_key=True)
extra_data = Column(String(80))
child = relationship('Child', backref='parent_assocs')
class Parent(Base):
__tablename__ = 'left'
id = Column(Integer, primary_key=True)
children = relationship('Association', backref='parent')
class Child(Base):
__tablename__ = 'right'
id = Column(Integer, primary_key=True)
but how would I do this in my case?
The nature of a coauthorship is that it is bidirectional. So, when you insert the tuple (id_left, id_right) into the coauthorship table through a coauthoship object, is there a way to also insert the reverse relation easily? I'm asking because I want to use association proxies.

if you'd like to literally have pairs of rows in association, that is, for every id_left, id_right that's inserted, you also insert an id_right, id_left, you'd use an attribute event to listen for append events on either side, and produce an append in the other direction.
If you just want to be able to navigate between Parent/Child in either direction, just a single row of id_left, id_right is sufficient. The examples in the docs regarding this kind of mapping illustrate the whole thing.

Related

Combining the Results of SQLAlchemy JOINs

When you do something like,
session.query(BaseModel, JoinModel).join(JoinModel, BaseModel.id == JoinModel.id), isouter=True)
The result is is similar to the following,
(<__main__.BaseModel object at 0x000001E32BC81220>, <__main__.JoinModel object at 0x000001E32A15BE50>)
(<__main__.BaseModel object at 0x000001E32BC81220>, <__main__.JoinModel object at 0x000001E32A15BC70>)
(<__main__.BaseModel object at 0x000001E32BC81220>, <__main__.JoinModel object at 0x000001E32A317670>)
When you do .__dict__ on one of these objects (either BaseModel or JoinModel) you can get the values of the attributes.
Is there an efficient way to combine the attributes of both these objects for one result? I mean in the end, ideally it should be just one set of values, right?
You can define relationships and backrefs in your model. This will help you to access the attributes without using join. Here is a sample code which might help you
from sqlalchemy.orm import backref, relationship
class BaseModel():
id = Column(Integer, primary_key=True, autoincrement=True)
join_model = relationship(
"JoinModel",
backref=backref("base_model", cascade="all, delete-orphan", lazy=True))
class JoinModel():
id = Column(Integer, primary_key=True, autoincrement=True)
base_id = Column(Integer, ForeignKey("base.id",ondelete="CASCADE"), nullable=True)
name = Column(String)
You can do like
base_model = BaseModel.query.all()
print(base_model.join_model.name)

Ordering relationship by property of child elements

I use flask-sqlalchemy on a Flask project to model my database.
I need to sort the elements of a many-to-many relationship based on properties of different child elements of one side.
I have "Work" (the parent element), "Tag" (the children), "Type" (a one-to-many relationship on Tag) and "Block" (a one-to-many relationship on Type). Tags and Works are joined with a mapping table "work_tag_mapping".
In essence, each tag has exactly one type, each type belongs to exactly one block, and many tags can be added on many works.
I now want the list of tags on a work be sorted by block first and type second (both have a "position" column for that purpose).
Here are my tables (simplified for the sake of the question):
class Work(db.Model):
__tablename__ = 'work'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String(255, collation='utf8_bin'))
tags = db.relationship('Tag', order_by="Tag.type.block.position, Tag.type.position", secondary=work_tag_mapping)
class Tag(db.Model):
__tablename__ = 'tag'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String(255, collation='utf8_bin'))
type_id = db.Column(db.Integer, db.ForeignKey('type.id'), nullable=False)
type = db.relationship('Type')
work_tag_mapping = db.Table('work_tag_mapping',
db.Column('id', db.Integer, primary_key=True),
db.Column('work_id', db.Integer, db.ForeignKey('work.id'), nullable=False),
db.Column('tag_id', db.Integer, db.ForeignKey('tag.id'), nullable=False)
)
class Type(db.Model):
__tablename__ = 'type'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String(255, collation='utf8_bin'))
position = db.Column(db.Integer)
block_id = db.Column(db.Integer, db.ForeignKey('block.id'), nullable=False)
block = db.relationship('Block')
class Block(db.Model):
__tablename__ = 'block'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String(255, collation='utf8_bin'))
position = db.Column(db.Integer)
Now, it is the "order_by" in the "tags" relationship that doesn't work as I initially hoped.
The error I get is "sqlalchemy.exc.InvalidRequestError: Property 'type' is not an instance of ColumnProperty (i.e. does not correspond directly to a Column)."
I am new to SQLalchemy, Flask and indeed Python, and none of the ressources or questions here mention a case like this.
While this appears not to be possible directly, adding a getter and performing the sorting on retrieval does the trick. Adding lazy='dynamic' ensures the collection behaves as a query, so joins can be performed.
_tags = db.relationship('Tag', lazy='dynamic')
#hybrid_property
def tags(self):
return self._tags.join(Type).join(Block).order_by(Block.position, Type.position)

SQLAlchemy: list of objects, preserve reference if they're deleted

I'm trying to implement a user-facing PreviewList of Articles, which will keep its size even if an Article is deleted. So if the list has four objects [1, 2, 3, 4] and one is deleted, I want it to contain [1, 2, None, 4].
I'm using a relationship with a secondary table. Currently, deleting either Article or PreviewList will delete the row in that table. I've experimented with cascade options, but they seem to affect the related items directly, not the contents of the secondary table.
The snippet below tests for the desired behaviour: deleting an Article should preserve the row in ArticlePreviewListAssociation, but deleting a PreviewList should delete it (and not the Article).
In the code below, deleting the Article will preserve the ArticlePreviewListAssociation, but pl.articles does not treat that as a list entry.
from db import DbSession, Base, init_db
from sqlalchemy import Column, String, Integer, ForeignKey
from sqlalchemy.orm import relationship
session = DbSession()
class Article(Base):
__tablename__ = 'articles'
id = Column(Integer, primary_key=True)
title = Column(String)
class PreviewList(Base):
__tablename__ = 'preview_lists'
id = Column(Integer, primary_key=True)
articles = relationship('Article', secondary='associations')
class ArticlePreviewListAssociation(Base):
__tablename__ = 'associations'
article_id = Column(Integer, ForeignKey('articles.id'), nullable=True)
previewlist_id = Column(Integer, ForeignKey('preview_lists.id'), primary_key=True)
article = relationship('Article')
preview_list = relationship('PreviewList')
init_db()
print(f"Creating test data")
a = Article(title="StackOverflow: 'Foo' not setting 'Bar'?")
pl = PreviewList(articles=[a])
session.add(a)
session.add(pl)
session.commit()
print(f"ArticlePreviewListAssociations: {session.query(ArticlePreviewListAssociation).all()}")
print(f"Deleting PreviewList")
session.delete(pl)
associations = session.query(ArticlePreviewListAssociation).all()
print(f"ArticlePreviewListAssociations: should be empty: {associations}")
if len(associations) > 0:
print("FAIL")
print("Reverting transaction")
session.rollback()
print("Deleting article")
session.delete(a)
articles_in_list = pl.articles
associations = session.query(ArticlePreviewListAssociation).all()
print(f"ArticlePreviewListAssociations: should not be empty: {associations}")
if len(associations) == 0:
print("FAIL")
print(f"Articles in PreviewList: should not be empty: {articles_in_list}")
if len(articles_in_list) == 0:
print("FAIL")
# desired outcome: pl.articles should be [None], not []
print("Reverting transaction")
session.rollback()
This may come down to "How can you make a many-to-many relationship where pk_A == 1 and pk_B == NULL include the None in A's list?"
The given examples would seem to assume that the order of related articles is preserved, even upon deletion. There are multiple approaches to that, for example the Ordering List extension, but it is easier to first solve the problem of preserving associations to deleted articles. This seems like a use case for an association object and proxy.
The Article class gets a new relationship so that deletions cascade in a session. The default ORM-level cascading behavior is to set the foreign key to NULL, but if the related association object is not loaded, we want to let the DB do it, so passive_deletes=True is used:
class Article(Base):
__tablename__ = 'articles'
id = Column(Integer, primary_key=True)
title = Column(String)
previewlist_associations = relationship(
'ArticlePreviewListAssociation', back_populates='article',
passive_deletes=True)
Instead of a many to many relationship PreviewList uses the association object pattern, along with an association proxy that replaces the many to many relationship. This time the cascades are a bit different, since the association object should be deleted, if the parent PreviewList is deleted:
class PreviewList(Base):
__tablename__ = 'preview_lists'
id = Column(Integer, primary_key=True)
article_associations = relationship(
'ArticlePreviewListAssociation', back_populates='preview_list',
cascade='all, delete-orphan', passive_deletes=True)
articles = association_proxy(
'article_associations', 'article',
creator=lambda a: ArticlePreviewListAssociation(article=a))
Originally the association object used previewlist_id as the primary key, but then a PreviewList could contain a single Article only. A surrogate key solves that. The foreign key configurations include the DB level cascades. These are the reason for using passive deletes:
class ArticlePreviewListAssociation(Base):
__tablename__ = 'associations'
id = Column(Integer, primary_key=True)
article_id = Column(
Integer, ForeignKey('articles.id', ondelete='SET NULL'))
previewlist_id = Column(
Integer, ForeignKey('preview_lists.id', ondelete='CASCADE'),
nullable=False)
# Using a unique constraint on a nullable column is a bit ugly, but
# at least this prevents inserting an Article multiple times to a
# PreviewList.
__table_args__ = (UniqueConstraint(article_id, previewlist_id), )
article = relationship(
'Article', back_populates='previewlist_associations')
preview_list = relationship(
'PreviewList', back_populates='article_associations')
With these changes in place no "FAIL" is printed.

Sqlalchemy: Querying on m2m relations for polymorphic classes

I've got two classes that are connected through a many-to-many relationship: Parent and Tag.
Base = declarative_base()
association_table = Table('associations', Base.metadata,
Column('parent_id', Integer, ForeignKey('parent.id')),
Column('tag_id', Integer, ForeignKey('tag.id')),
)
class Tag(Base):
__tablename__ = 'tags'
id = Column(Integer, Sequence('tag_id_seq'), primary_key=True)
name = Column(String)
class Parent(Base):
__tablename__ = 'parents'
id = Column(Integer, Sequence('parent_id_seq'), primary_key=True)
tags = relationship('Tag', secondary=association_table, backref='parents')
If I want to query for all the Tag objects that have one or more relationships to a Parent, I'd do:
session.query(Tag).filter(Tag.parents.any()).all()
However, this Parent class is parent to child classes, Alice and Bob:
class Alice(Parent):
__tablename__ = 'alices'
__mapper_args__ = {'polymorphic_identity': 'alice'}
alice_id = Column(Integer, ForeignKey('parents.id'), primary_key=True)
class Bob(Parent):
__tablename__ = 'bobs'
__mapper_args__ = {'polymorphic_identity': 'bob'}
bob_id = Column(Integer, ForeignKey('parents.id'), primary_key=True)
Now I'd like to be able to retrieve all the Tag objects that have one or more relations to an Alice object. The previous query, session.query(Tag).filter(Tag.parents.any()).all(), would not do as it doesn't discriminate between Alice or Bob objects - it doesn't even know about their existence.
I've messed around with Query's for a while with no success so I assume that if it can be done, it must have something to do with some extra lines of code in the Table classes like those shown above. While the documentation holds info about polymorphic classes and many-to-many relations and Mike Bayer himself offered someone this answer to a seemingly related question which looks interesting but which I'm far from understanding, I'm kind of stuck.
The code samples may disgust the Python interpreter, but does hopefully get my point across. Candy for those who can help me on my way!
While writing a little MWE I actually found a solution, which is actually almost the same as what I had tried already. budder gave me new hope for the approach though, thanks :)
from sqlalchemy import create_engine, ForeignKey, Column, String, Integer, Sequence, Table
from sqlalchemy.orm import sessionmaker, relationship, backref
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
association_table = Table('associations', Base.metadata,
Column('parent_id', Integer, ForeignKey('parents.id')),
Column('tag_id', Integer, ForeignKey('tags.id')),
)
class Tag(Base):
__tablename__ = 'tags'
id = Column(Integer, Sequence('tag_id_seq'), primary_key=True)
name = Column(String)
class Parent(Base):
__tablename__ = 'parents'
id = Column(Integer, Sequence('parent_id_seq'), primary_key=True)
tags = relationship('Tag', secondary=association_table, backref='parents')
class Alice(Parent):
__tablename__ = 'alices'
__mapper_args__ = {'polymorphic_identity': 'alice'}
alice_id = Column(Integer, ForeignKey('parents.id'), primary_key=True)
class Bob(Parent):
__tablename__ = 'bobs'
__mapper_args__ = {'polymorphic_identity': 'bob'}
bob_id = Column(Integer, ForeignKey('parents.id'), primary_key=True)
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()
tag_a = Tag(name='a')
tag_b = Tag(name='b')
tag_c = Tag(name='c')
session.add(tag_a)
session.add(tag_b)
session.add(tag_c)
session.commit()
session.add(Alice(tags=[tag_a]))
session.add(Bob(tags=[tag_b]))
session.commit()
for tag in session.query(Tag).\
filter(Tag.parents.any(Parent.id == Alice.alice_id)).\
all():
print(tag.name)
If there's a good alternative approach, I'd still be interested. I can imagine sqlalchemy offering a more direct and elegant approach so that one could do, for example:
for tag in session.query(Tag).\
filter(Tag.alices.any()).\
all():
print(tag.name)
If you write your object classes a certain way you can use the blunt force method...Just search for them...Kind of like a crude query method:
all_tag_objects = session.query(Tag).all() ## All tag objects in your database
tags = []
for tag in all_tag_objects:
for parent in tag.parents:
if parent.alices != []: ## If there are alice objects in the tag parents alice reltionship instance variable...Then we append the tag because it meets our criteria.
flagged_tags.append(tag)
Sounds like you found a better way though, but I guess the ultimate test would be to actually do a speed test.

SQLAlchemy One-to-Many relationship on single table inheritance - declarative

Basically, I have this model, where I mapped in a single table a "BaseNode" class, and two subclasses. The point is that I need one of the subclasses, to have a one-to-many relationship with the other subclass.
So in sort, it is a relationship with another row of different class (subclass), but in the same table.
How do you think I could write it using declarative syntax?.
Note: Due to other relationships in my model, if it is possible, I really need to stick with single table inheritance.
class BaseNode(DBBase):
__tablename__ = 'base_node'
id = Column(Integer, primary_key=True)
discriminator = Column('type', String(50))
__mapper_args__ = {'polymorphic_on': discriminator}
class NodeTypeA(BaseNode):
__mapper_args__ = {'polymorphic_identity': 'NodeTypeA'}
typeB_children = relationship('NodeTypeB', backref='parent_node')
class NodeTypeB(BaseNode):
__mapper_args__ = {'polymorphic_identity': 'NodeTypeB'}
parent_id = Column(Integer, ForeignKey('base_node.id'))
Using this code will throw:
sqlalchemy.exc.ArgumentError: NodeTypeA.typeB_children and
back-reference NodeTypeB.parent_node are both of the same direction
. Did you mean to set remote_side on the
many-to-one side ?
Any ideas or suggestions?
I was struggling through this myself earlier. I was able to get this self-referential relationship working:
class Employee(Base):
__tablename__ = 'employee'
id = Column(Integer, primary_key=True)
name = Column(String(64), nullable=False)
Employee.manager_id = Column(Integer, ForeignKey(Employee.id))
Employee.manager = relationship(Employee, backref='subordinates',
remote_side=Employee.id)
Note that the manager and manager_id are "monkey-patched" because you cannot make self-references within a class definition.
So in your example, I would guess this:
class NodeTypeA(BaseNode):
__mapper_args__ = {'polymorphic_identity': 'NodeTypeA'}
typeB_children = relationship('NodeTypeB', backref='parent_node',
remote_side='NodeTypeB.parent_id')
EDIT: Basically what your error is telling you is that the relationship and its backref are both identical. So whatever rules that SA is applying to figure out what the table-level relationships are, they don't jive with the information you are providing.
I learned that simply saying mycolumn=relationship(OtherTable) in your declarative class will result in mycolumn being a list, assuming that SA can detect an unambiguous relationship. So if you really want an object to have a link to its parent, rather than its children, you can define parent=relationship(OtherTable, backref='children', remote_side=OtherTable.id) in the child table. That defines both directions of the parent-child relationship.

Categories