SQLAlchemy: how to create various relationships with backrefs? - python

I was reading through the SQLAlchemy documentation on basic relationships and I feel like I'm missing some basic understanding of how to create the relationship declarations. When I run my code, I run into errors such as:
sqlalchemy.exc.NoForeignKeysError: Can't find any foreign key relationships between 'entity' and 'category'.
sqlalchemy.exc.NoForeignKeysError: Could not determine join condition between parent/child tables on relationship Entity.categories - there are no foreign keys linking these tables. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or specify a 'primaryjoin' expression.
I thought that the purpose of the relationship() directive was to minimize the creation of manual keys and ids.
I'm also a little confused about the syntax for one-to-many, many-to-one, and many-to-many, and how the syntax differentiates between these types of relationships.
Here's my example, where I create an Entity and various classes around it to try out the various relationships:
class Entity(Base):
    __tablename__ = 'entity'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # many-to-one - many entities will belong to one manufacturer
    # do I need to define the mfg_id manually?
    manufacturer_id = Column(Integer, ForeignKey('manufacturer.id'))
    manufacturer = relationship("Manufacturer")
    # one-to-many relationship where an entity will have lots of
    # properties that belong to it. Each property will only belong to one entity
    properties = relationship("EntityProperty", backref="entity")
    # this is a many-to-many relationship mapping where entity can belong
    # to multiple categories and you can look up entities by category
    categories = relationship("Category", backref="entities")

class EntityProperty(Base):
    __tablename__ = 'entity_property'
    id = Column(Integer, primary_key=True)
    key = Column(String(250), nullable=False)
    value = Column(String(250), nullable=False)
    # do we need to define this? or can this be implied by relationship?
    entity_id = Column(Integer, ForeignKey('entity.id'))

class Category(Base):
    __tablename__ = 'category'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # Does this need to know anything about entities? Many entities
    # can belong to a category and entities can also belong to multiple
    # categories. Usage is to look up entities that belong to a category.

class Manufacturer(Base):
    __tablename__ = 'manufacturer'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # Similar to category except not all manufacturers would have entities.
    # How to decouple code at this level from entity
Can someone point me in the right direction to learn more about the proper usage of relationship()? Thank you

First, a relationship needs a foreign key: a column in the entity table referencing a column in the category table (or vice versa). You currently have none.
However, if you intend to have a many-to-many relationship between entity and category, then see
SQLAlchemy many-to-many Relationships

Sometimes you end up solving your own questions as soon as they are posed. So here's what I learned.
I was getting a little confused about back_populates vs. backref. I was thinking that if I added a backref I wouldn't need to add the foreign key in the opposite class, but this was incorrect.
So for the one-to-many case:
This is declared in Entity:
properties = relationship("EntityProperty", backref="entity")
And this is declared in EntityProperty; it is required, because it holds the foreign key the relationship joins on:
entity_id = Column(Integer, ForeignKey("entity.id"))
In the many-to-many case, I was missing an association table:
cat_entity_association_table = Table("cat_entity_association", Base.metadata,
    Column("category_id", Integer, ForeignKey("category.id")),
    Column("entity_id", Integer, ForeignKey("entity.id")),
)
This association is used to construct the bi-directional linking between entities:
categories = relationship("Category", secondary=cat_entity_association_table, back_populates="entities")
and categories:
entities = relationship("Entity", secondary=cat_entity_association_table, back_populates="categories")
There was some ambiguity on when to use the external table, but hopefully this will help someone else as well.
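For reference, the three relationship types from the question can be put together roughly as follows. This is only a minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the association table name entity_category and the sample rows are made up for illustration, and back_populates is used on both sides in place of backref.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Many-to-many needs an association table holding one foreign key per side.
# (The name "entity_category" is an assumption made for this sketch.)
entity_category = Table(
    "entity_category",
    Base.metadata,
    Column("entity_id", Integer, ForeignKey("entity.id"), primary_key=True),
    Column("category_id", Integer, ForeignKey("category.id"), primary_key=True),
)


class Manufacturer(Base):
    __tablename__ = "manufacturer"
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # "One" side of one-to-many: no foreign key column lives here.
    entities = relationship("Entity", back_populates="manufacturer")


class Entity(Base):
    __tablename__ = "entity"
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # Many-to-one: the foreign key sits on the "many" side.
    manufacturer_id = Column(Integer, ForeignKey("manufacturer.id"))
    manufacturer = relationship("Manufacturer", back_populates="entities")
    # One-to-many: the foreign key sits on EntityProperty.
    properties = relationship("EntityProperty", back_populates="entity")
    # Many-to-many: routed through the association table.
    categories = relationship(
        "Category", secondary=entity_category, back_populates="entities"
    )


class EntityProperty(Base):
    __tablename__ = "entity_property"
    id = Column(Integer, primary_key=True)
    key = Column(String(250), nullable=False)
    value = Column(String(250), nullable=False)
    entity_id = Column(Integer, ForeignKey("entity.id"))
    entity = relationship("Entity", back_populates="properties")


class Category(Base):
    __tablename__ = "category"
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    entities = relationship(
        "Entity", secondary=entity_category, back_populates="categories"
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    widget = Entity(name="widget", manufacturer=Manufacturer(name="Acme"))
    widget.properties.append(EntityProperty(key="color", value="red"))
    widget.categories.append(Category(name="tools"))
    session.add(widget)
    session.commit()

    # Look up entities by category, as the question wanted.
    tools = session.query(Category).filter_by(name="tools").one()
    entity_names = [e.name for e in tools.entities]
    property_owner = widget.properties[0].entity.name
    manufacturer_entities = [e.name for e in widget.manufacturer.entities]
```

In every case the foreign key column still has to be declared by hand; relationship() only describes how to navigate across it.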

Related

Flask SQLAlchemy: many to many relationship error

I am trying to set up many-to-many relationship in SQLAlchemy but I am getting the error:
from shopapp import db
db.create_all()
sqlalchemy.exc.NoReferencedTableError: Foreign key associated with column 'shoppinglists_products.shoppinglist_id_v2' could not find table 'shoppinglist' with which to generate a foreign key to target column 'id'
My code:
from sqlalchemy import ForeignKey
from shopapp import db

shoppinglists_products = db.Table("shoppinglists_products",
    db.Column("shoppinglist_id", db.Integer, ForeignKey("shoppinglist.id")),
    db.Column("product_id", db.Integer, ForeignKey("product.id")))

class ShoppingList(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(20), unique=True, nullable=False)
    products = db.relationship('Product', back_populates="shoppinglists", secondary="shoppinglists_products")

class Product(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(20), unique=True, nullable=False)
Where is the problem?
It seems like Flask-SQLAlchemy has a problem finding the table for the foreign key reference. Based on your code, here are two ways you can fix this:
1) Fix shoppinglists_products table:
Flask-SQLAlchemy often converts the CamelCased model names into a syntax similar to this: camel_cased. In your case, ShoppingList will be referred to as shopping_list. Therefore, changing the ForeignKey("shoppinglist.id") to ForeignKey("shopping_list.id") will do the trick.
shoppinglists_products = db.Table("shoppinglists_products",
    db.Column("shoppinglist_id", db.Integer, ForeignKey("shopping_list.id")),  # <-- fixed
    db.Column("product_id", db.Integer, ForeignKey("product.id")))
2) Change the model names:
If you'd like, you could rename the model from ShoppingList to Shopping and then refer to its table as shopping, which avoids the naming mismatch altogether. Single-word class names are a common way to sidestep this in ORM code, because different frameworks derive table names from class names in different ways.
Expanding on @P0intMaN's answer: explicitly providing the table name with __tablename__ = "ShoppingList" (for example) lets you use your preferred case style and prevents Flask-SQLAlchemy from 'helping' you by changing the name of something kind of important without telling you. The ForeignKey string must then match that name exactly (here, ForeignKey("ShoppingList.id")).
class ShoppingList(db.Model):
    __tablename__ = "ShoppingList"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(20), unique=True, nullable=False)
    products = db.relationship('Product', back_populates="shoppinglists", secondary="shoppinglists_products")
In many/most Flask tutorials and books, simplistic table names (e.g. posts, comments, users) are used, which elide this issue. Thus a trap awaits for those of us who insist on meaningful CamelCased class names. This is mentioned somewhat casually in the documentation here: https://flask-sqlalchemy.palletsprojects.com/en/2.x/models/
Some parts that are required in SQLAlchemy are optional in
Flask-SQLAlchemy. For instance the table name is automatically set for
you unless overridden. It’s derived from the class name converted to
lowercase and with “CamelCase” converted to “camel_case”. To override
the table name, set the __tablename__ class attribute.
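To make the quoted rule concrete, here is a rough approximation of the conversion in plain Python. The helper approx_table_name is hypothetical and is not Flask-SQLAlchemy's own function; it only mirrors the documented behavior for simple names like the ones in this question and may differ for names containing acronyms or digits.

```python
import re


def approx_table_name(class_name: str) -> str:
    """Hypothetical helper: put '_' before each interior capital, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


for name in ("ShoppingList", "Product"):
    # The string passed to ForeignKey("<table>.id") has to use this derived name.
    print(name, "->", approx_table_name(name))
```

When in doubt, inspecting ShoppingList.__table__.name on the real model shows the name that was actually generated.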

SQLAlchemy: How to disambiguate Foreign Key relationships?

I have a simple data model of customers and addresses with SQLAlchemy annotations to make the objects persistable in a database. Unfortunately, when I try to create a customer object with c = Customer() I receive an error:
sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child
tables on relationship Customer.addresses - there are multiple foreign key paths linking the tables.
Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as
containing a foreign key reference to the parent table.
This is pretty clear -- I need to further annotate the line:
addresses = relationship("Address", back_populates="customer")
with something to disambiguate the foreign key relationships. However, I can't understand what I need to (or could) specify in this case. Can anyone point me in the right direction?
Update: Looking further, it seems to me that SQLAlchemy is attempting to infer the direction of the addresses relationship and is unable to do so because there are PK/FK relationships in each direction between these classes. This cannot be resolved by adding foreign_keys= on the addresses relationship because the foreign key for this relationship is in the other table.
I can get this to work by removing the addresses relation entirely from Customer and instead doing customer = relationship("Customer", backref="addresses", foreign_keys=[customer_id]) in the Address class. I don't really like this solution, however, as I want to "express" addresses in the Customer class rather than as the side-effect of creating an otherwise unwanted customer relationship in the Address class.
So, still looking for a way of modifying the addresses relationship to make it work.
Here is my entire model:
from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Customer(Base):
    __tablename__ = "customer"
    id = Column("customer_id", Integer, primary_key=True)
    name = Column("name", String, nullable=False)
    bill_address_id = Column("bill_id", Integer, ForeignKey("addresses.address_id"))
    ship_address_id = Column("ship_id", Integer, ForeignKey("addresses.address_id"))
    bill_address = relationship("Address", foreign_keys=[bill_address_id])
    ship_address = relationship("Address", foreign_keys=[ship_address_id])
    addresses = relationship("Address", back_populates="customer")

class Address(Base):
    __tablename__ = "addresses"
    id = Column("address_id", Integer, primary_key=True)
    address = Column(String, nullable=False)
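One possible way to keep addresses declared on Customer is to pass foreign_keys as a string, which SQLAlchemy resolves lazily once both classes are mapped, so it can name a column that lives on Address. The sketch below assumes SQLAlchemy 1.4 or later. The posted model omits the Address side, so the customer_id column and the customer relationship here are assumptions based on the question text.

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import configure_mappers, declarative_base, relationship

Base = declarative_base()


class Customer(Base):
    __tablename__ = "customer"
    id = Column("customer_id", Integer, primary_key=True)
    name = Column("name", String, nullable=False)
    bill_address_id = Column("bill_id", Integer, ForeignKey("addresses.address_id"))
    ship_address_id = Column("ship_id", Integer, ForeignKey("addresses.address_id"))
    bill_address = relationship("Address", foreign_keys=[bill_address_id])
    ship_address = relationship("Address", foreign_keys=[ship_address_id])
    # The foreign key for this relationship is on Address, so it is given
    # as a string that is evaluated after both classes exist.
    addresses = relationship(
        "Address", back_populates="customer", foreign_keys="Address.customer_id"
    )


class Address(Base):
    __tablename__ = "addresses"
    id = Column("address_id", Integer, primary_key=True)
    address = Column(String, nullable=False)
    # Assumed column: not shown in the posted model.
    customer_id = Column(Integer, ForeignKey("customer.customer_id"))
    customer = relationship(
        "Customer", back_populates="addresses", foreign_keys=[customer_id]
    )


# Raises AmbiguousForeignKeysError if any relationship is still ambiguous.
configure_mappers()

customer = Customer(name="Ada")
home = Address(address="1 Main St")
customer.addresses.append(home)  # back_populates sets home.customer in memory
```

This sketch stops before touching a database. Because the two tables reference each other, persisting a customer and an address that point at each other in the same flush may need extra care (for example post_update=True on one relationship).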

How do I define SQLAlchemy FKs and relationships to allow the database to cascade delete?

How does one define a ForeignKey and relationship such that one can disable SQLAlchemy's FK-nullifying behavior?
The documentation here seems to describe the use of passive_deletes=True to allow the database to cascade delete, but only in the context of defining the cascade relationship property documented here. That property, it seems to me, defines how SQLAlchemy will perform the cascade deletion itself, which is explicitly described as slower than the database engine's cascade deletion in this section (see the green box titled ORM-level “delete” cascade vs. FOREIGN KEY level “ON DELETE” cascade).
To use the database's cascade delete, are we supposed to do the following?
1) define ondelete="CASCADE" on the ForeignKey column,
2) define passive_deletes=True on the same relationships,
3) AND define a cascade="delete, delete-orphan" parameter on all relationships between the objects?
It is step 3 that I seem to be confused about: it seems to be defining the cascade for SQLAlchemy rather than allowing the database to perform its own deletion. But SQLAlchemy seems to want to null out all dependent foreign keys before the database can get a chance to cascade delete. I need to disable this behavior, but passive_deletes=True seems not to do it on its own.
The (late) answer here explicitly addresses my issue, but it is not working. He states
There's an important caveat here. Notice how I have a relationship specified with passive_deletes=True? If you don't have that, the entire thing will not work.
This is because by default when you delete a parent record SqlAlchemy does something really weird.
It sets the foreign keys of all child rows to NULL. So if you delete a row from parent_table where id = 5, then it will basically execute
UPDATE child_table SET parent_id = NULL WHERE parent_id = 5
In my code
class Annotation(SearchableMixin, db.Model):
    id = db.Column(db.Integer, primary_key=True)
    locked = db.Column(db.Boolean, index=True, default=False)
    active = db.Column(db.Boolean, default=True)
    HEAD = db.relationship("Edit",
        primaryjoin="and_(Edit.current==True,"
        "Edit.annotation_id==Annotation.id)", uselist=False,
        lazy="joined", passive_deletes=True)
    edits = db.relationship("Edit",
        primaryjoin="and_(Edit.annotation_id==Annotation.id,"
        "Edit.approved==True)", lazy="joined", passive_deletes=True)
    history = db.relationship("Edit",
        primaryjoin="and_(Edit.annotation_id==Annotation.id,"
        "Edit.approved==True)", lazy="dynamic", passive_deletes=True)
    all_edits = db.relationship("Edit",
        primaryjoin="Edit.annotation_id==Annotation.id", lazy="dynamic",
        passive_deletes=True)

class Edit(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    edit_num = db.Column(db.Integer, default=0)
    approved = db.Column(db.Boolean, default=False, index=True)
    rejected = db.Column(db.Boolean, default=False, index=True)
    annotation_id = db.Column(db.Integer,
        db.ForeignKey("annotation.id", ondelete="CASCADE"), index=True)
    hash_id = db.Column(db.String(40), index=True)
    current = db.Column(db.Boolean, default=False, index=True, passive_deletes=True)
    annotation = db.relationship("Annotation", foreign_keys=[annotation_id])
    previous = db.relationship("Edit",
        primaryjoin="and_(remote(Edit.annotation_id)==foreign(Edit.annotation_id),"
        "remote(Edit.edit_num)==foreign(Edit.edit_num-1))")
    priors = db.relationship("Edit",
        primaryjoin="and_(remote(Edit.annotation_id)==foreign(Edit.annotation_id),"
        "remote(Edit.edit_num)<=foreign(Edit.edit_num-1))",
        uselist=True, passive_deletes=True)
Simply setting passive_deletes=True on the parent relationship is not working. I also thought it might be caused by the relationships from the child to its siblings (Edit.previous and Edit.priors), but setting passive_deletes=True on those two relationships does not solve the problem, and it causes the following warnings when I simply run Edit.query.get(n):
/home/malan/projects/icc/icc/venv/lib/python3.7/site-packages/sqlalchemy/orm/relationships.py:1790: SAWarning: On Edit.previous, 'passive_deletes' is normally configured on one-to-many, one-to-one, many-to-many relationships only.
% self)
/home/malan/projects/icc/icc/venv/lib/python3.7/site-packages/sqlalchemy/orm/relationships.py:1790: SAWarning: On Edit.priors, 'passive_deletes' is normally configured on one-to-many, one-to-one, many-to-many relationships only.
% self)
I have actually found this interesting question from 2015 that has never had an answer. It details a failed attempt to execute documentation code.
It seems that after a thorough attempt to analyze my relationships, I have discovered the problem.
First, I will note, passive_deletes=True is the only necessary parameter. You do not need to define cascade at all to take advantage of the database's cascade system.
More importantly, my problem seems to have stemmed from my tree of foreign-key dependencies. I had a cascade that looked like this:
            Annotation
           /    |     \
       Vote   Edit   annotation_followers
             /    \
      EditVote    tags
Where ondelete="CASCADE" was defined for each parent_id column on each child class. Until I set passive_deletes=True on the relationships to all of the children in the graph, the nullification behavior persisted.
For anyone running into a similar problem, my advice is: thoroughly analyze all of your intersecting relationships, and define passive_deletes=True on every one where it makes sense.
That said, there are still complications I'm working out; for instance, on a many-to-many table the IDs aren't even nullifying. Possible next question.
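The combination of ondelete="CASCADE" and passive_deletes=True can be tried in isolation with a small sketch like the one below. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the Parent and Child models are made up for illustration and are not the Annotation/Edit models above. Note that SQLite only enforces foreign keys when the PRAGMA is switched on for each connection.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    # passive_deletes=True: unloaded children are neither loaded nor
    # NULLed out when the parent is deleted; the database handles them.
    children = relationship("Child", back_populates="parent", passive_deletes=True)


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    # nullable=False means an ORM "SET parent_id = NULL" would fail loudly.
    parent_id = Column(
        Integer, ForeignKey("parent.id", ondelete="CASCADE"), nullable=False
    )
    parent = relationship("Parent", back_populates="children")


engine = create_engine("sqlite://")


@event.listens_for(engine, "connect")
def enable_sqlite_foreign_keys(dbapi_connection, connection_record):
    # SQLite ignores ON DELETE CASCADE unless foreign keys are enabled.
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()


Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Parent(id=1, children=[Child(), Child()]))
    session.commit()

with Session(engine) as session:
    parent = session.get(Parent, 1)  # the children collection is not loaded
    session.delete(parent)
    session.commit()
    remaining_children = session.query(Child).count()
```

Children that are already loaded in the session are still handled by the ORM, so whether the database cascade is actually exercised depends on what has been loaded at delete time.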

SQLAlchemy - How to profit from dynamic and eager loading at the same time

I have two tables, Factorys and Products. Each Factory can have a large collection of Products, so lazy='dynamic' has been applied.
class Factory(Base):
    __tablename__ = 'factorys'
    ID = Column(Integer, primary_key=True)
    products = relationship("Product", lazy='dynamic')

class Product(Base):
    __tablename__ = 'products'
    ID = Column(Integer, primary_key=True)
    factory_id = Column(Integer, ForeignKey('factorys.ID'))
    Name = Column(Text)
In case all products of a factory are needed:
factory.products.all()
should be applied. But since the factory is already loaded at this point, it would be more performant to have a joined eager load between Factory and Product.
But a joined relationship between the two tables makes overall performance worse because of the large collection of products, and it is not required, for example, when appending products to a factory.
Is it possible to define different relationships between two tables, but use each only in specific cases? For example, in a method of the Factory class such as:
class Factory(Base):
    __tablename__ = 'factorys'
    ID = Column(Integer, primary_key=True)
    products = relationship("Product", lazy='dynamic')

    def _getProducts():
        return relationship("Product", lazy='joined')
How can I get all the products of a factory in a performant way, without losing performance when adding products to a factory?
Any tips would be appreciated.
I have run into the same question and had a very difficult time finding the answer.
What you are proposing with returning a relationship will not work as SQLAlchemy must know about the relationship belonging to the table, but doing:
class Factory(Base):
    __tablename__ = 'factorys'
    ID = Column(Integer, primary_key=True)
    products_dyn = relationship("Product", lazy='dynamic', viewonly=True)
    products = relationship("Product", lazy='joined')
should work. Note the viewonly attribute: it is very important, because without it SQLAlchemy may try to use both relationships when you add a product to the factory and may produce duplicate entries in specific cases (such as using a secondary table for the relationship).
This way you can use the eagerly loaded products collection and still run an optimized, filtered query through products_dyn, with both kept inside the mapped class declaration.
Hope that helps!
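Here is how the two-relationship idea might look end to end. It is a minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the sample product names are made up, and lazy='dynamic' is treated as a legacy feature in SQLAlchemy 2.0, so this may not be the preferred approach there.

```python
from sqlalchemy import Column, ForeignKey, Integer, Text, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Factory(Base):
    __tablename__ = "factorys"
    ID = Column(Integer, primary_key=True)
    # Read-only, query-style access for filtering large collections.
    products_dyn = relationship("Product", lazy="dynamic", viewonly=True)
    # Writable collection, loaded with a JOIN alongside the factory.
    products = relationship("Product", lazy="joined")


class Product(Base):
    __tablename__ = "products"
    ID = Column(Integer, primary_key=True)
    factory_id = Column(Integer, ForeignKey("factorys.ID"))
    Name = Column(Text)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Writes go through the non-viewonly relationship.
    session.add(Factory(ID=1, products=[Product(Name="bolt"), Product(Name="nut")]))
    session.commit()

with Session(engine) as session:
    factory = session.get(Factory, 1)  # products arrive in the same SELECT
    product_count = len(factory.products)
    # The dynamic relationship returns a query that can be filtered in SQL.
    bolt_count = factory.products_dyn.filter(Product.Name == "bolt").count()
```

An alternative worth considering is to keep a single relationship with default lazy loading and request eager loading per query with a loader option such as joinedload(), which avoids maintaining two attributes.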

min/max with orm relationships

I'm trying to find the min/max of a collection off a foreign key. I know that you can do session.query with func.min and func.max, but is there a way that lets me use the standard ORM relationship stuff?
For example with a blog, if I wanted to find the biggest "number comment" for a given post given the schema below, is it possible to do something like Post.query.get(0).number_comments.max()?
class Post(base):
    id = Column(Integer, primary_key=True)
    number_comments = relationship("NumberComment")

class NumberComment(base):
    id = Column(Integer, primary_key=True)
    num = Column(Integer, nullable=False)
As with raw SQL, you need to join those tables in your query:
from sqlalchemy import ForeignKey, func

# This class lacks a foreign key in your example.
class NumberComment(base):
    # ...
    post_id = Column(Integer, ForeignKey(Post.id), nullable=False)
    # ...

session.query(func.max(NumberComment.num)).join(Post).\
    filter(Post.id == 1).scalar()
There's no other way to do this, at least not the way you wanted. There's a reason it's called SQLAlchemy and not ORMSorcery ;-)
My advice would be to think in terms of SQL when trying to come up with a query; this will help you a lot.
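A runnable version of the aggregate query might look like this. It is a sketch under assumptions: SQLAlchemy 1.4 or later, an in-memory SQLite database, and made-up table names and sample values. It filters directly on the foreign key column instead of joining to Post, which gives the same result when only the post id is needed, and it also shows the plain-Python alternative over the loaded relationship.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Post(Base):
    __tablename__ = "post"
    id = Column(Integer, primary_key=True)
    number_comments = relationship("NumberComment")


class NumberComment(Base):
    __tablename__ = "number_comment"
    id = Column(Integer, primary_key=True)
    num = Column(Integer, nullable=False)
    # The foreign key the original example was missing.
    post_id = Column(Integer, ForeignKey("post.id"), nullable=False)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    comments = [NumberComment(num=n) for n in (3, 9, 4)]
    session.add(Post(id=1, number_comments=comments))
    session.commit()

    # Aggregate in the database: one SELECT max(num) ... WHERE post_id = 1.
    db_max = (
        session.query(func.max(NumberComment.num))
        .filter(NumberComment.post_id == 1)
        .scalar()
    )

    # Aggregate in Python: loads every comment row for the post first.
    post = session.get(Post, 1)
    py_max = max(comment.num for comment in post.number_comments)
```

The Python version reads more like the ORM-style access the question asked for, but it loads the whole collection, so the SQL aggregate is usually the better fit for large collections.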
