Flask SQLAlchemy: many to many relationship error - python

I am trying to set up a many-to-many relationship in SQLAlchemy, but I am getting this error:
from shopapp import db
db.create_all()
sqlalchemy.exc.NoReferencedTableError: Foreign key associated with column 'shoppinglists_products.shoppinglist_id' could not find table 'shoppinglist' with which to generate a foreign key to target column 'id'
My code:
from sqlalchemy import ForeignKey
from shopapp import db
shoppinglists_products = db.Table("shoppinglists_products",
    db.Column("shoppinglist_id", db.Integer, ForeignKey("shoppinglist.id")),
    db.Column("product_id", db.Integer, ForeignKey("product.id")))
class ShoppingList(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(20), unique=True, nullable=False)
    products = db.relationship('Product', back_populates="shoppinglists", secondary="shoppinglists_products")
class Product(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(20), unique=True, nullable=False)
Where is the problem?

It seems that Flask-SQLAlchemy cannot find the table referenced by the foreign key. Based on your code, there are two ways to fix this:
1) Fix shoppinglists_products table:
Flask-SQLAlchemy derives the table name from the model's class name by converting CamelCase to snake_case. In your case, ShoppingList is mapped to the table shopping_list. Therefore, changing ForeignKey("shoppinglist.id") to ForeignKey("shopping_list.id") will do the trick.
shoppinglists_products = db.Table("shoppinglists_products",
    db.Column("shoppinglist_id", db.Integer, ForeignKey("shopping_list.id")),  # <-- fixed
    db.Column("product_id", db.Integer, ForeignKey("product.id")))
2) Change the model names:
If you'd like, you could rename the model from ShoppingList to Shopping and refer to its table as shopping. This avoids the naming ambiguity altogether. Some developers avoid two-word class names for ORM models, because different frameworks have different ways of deriving table names from class names.
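The mismatch can also be reproduced without Flask. The following is a minimal sketch in plain SQLAlchemy Core, assuming an in-memory SQLite database. The helper build_metadata is hypothetical, and the table name shopping_list is written out by hand to stand in for the name Flask-SQLAlchemy would derive from ShoppingList.

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, String, Table,
                        create_engine, inspect)
from sqlalchemy.exc import NoReferencedTableError


def build_metadata(referenced_table):
    """Hypothetical helper: the association table points at `referenced_table`."""
    metadata = MetaData()
    # Stands in for the table that would be created for the ShoppingList model.
    Table("shopping_list", metadata,
          Column("id", Integer, primary_key=True),
          Column("name", String(20), unique=True, nullable=False))
    Table("product", metadata,
          Column("id", Integer, primary_key=True),
          Column("name", String(20), unique=True, nullable=False))
    Table("shoppinglists_products", metadata,
          Column("shoppinglist_id", Integer, ForeignKey(f"{referenced_table}.id")),
          Column("product_id", Integer, ForeignKey("product.id")))
    return metadata


# Wrong target name: no table called "shoppinglist" exists in the metadata.
try:
    build_metadata("shoppinglist").create_all(create_engine("sqlite://"))
    wrong_name_failed = False
except NoReferencedTableError:
    wrong_name_failed = True

# Correct target name: all three tables are created.
engine = create_engine("sqlite://")
build_metadata("shopping_list").create_all(engine)
created_tables = sorted(inspect(engine).get_table_names())
```

The string passed to ForeignKey is resolved against table names, not class names, which is why the spelling has to match the table exactly.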

Expanding on @P0intMaN's answer: explicitly providing the table name with __tablename__ = "ShoppingList" (for example) lets you use your preferred case style and prevents Flask-SQLAlchemy from 'helping' you by changing the name of something kind of important without telling you. The ForeignKey in the association table must then reference that exact name, e.g. ForeignKey("ShoppingList.id").
class ShoppingList(db.Model):
    __tablename__ = "ShoppingList"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(20), unique=True, nullable=False)
    products = db.relationship('Product', back_populates="shoppinglists", secondary="shoppinglists_products")
In most Flask tutorials and books, simple one-word table names (e.g. posts, comments, users) are used, which sidestep this issue. Thus a trap awaits those of us who insist on meaningful CamelCased class names. This is mentioned somewhat casually in the documentation here: https://flask-sqlalchemy.palletsprojects.com/en/2.x/models/
Some parts that are required in SQLAlchemy are optional in
Flask-SQLAlchemy. For instance the table name is automatically set for
you unless overridden. It’s derived from the class name converted to
lowercase and with “CamelCase” converted to “camel_case”. To override
the table name, set the __tablename__ class attribute.
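The naming rule quoted above can be approximated in a few lines. This is only an illustrative sketch of the documented behaviour, not Flask-SQLAlchemy's own code; the function name approx_table_name is hypothetical, and names containing acronyms may be handled differently by the library.

```python
import re


def approx_table_name(class_name):
    """Approximate the documented rule: CamelCase -> camel_case, lowercased.

    Illustration only; this is not Flask-SQLAlchemy's implementation.
    """
    # Insert "_" before an uppercase letter that follows a lowercase letter or digit.
    with_underscores = re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", class_name)
    return with_underscores.lower()


print(approx_table_name("ShoppingList"))  # shopping_list
print(approx_table_name("Product"))       # product
```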

Related

Purpose of joining tables in SQLAlchemy

I'm currently switching from raw SQL queries to the SQLAlchemy package and I'm wondering when to join the tables there.
I have 3 tables. Actor and movie are in a M:N relationship. Actor_Movie is the junction table:
class Actor(Base):
    __tablename__ = 'actor'
    act_id = Column(Integer, primary_key=True)
    last_name = Column(String(150), nullable=False, index=True)
    first_name = Column(String(150), nullable=False, index=True)
    movies = relationship('Movie', secondary='actor_movie')
    def __init__(self, last_name, first_name):
        self.last_name = last_name
        self.first_name = first_name
class Movie(Base):
    __tablename__ = 'movie'
    movie_id = Column(Integer, primary_key=True)
    title = Column(String(150))
    actors = relationship('Actor', secondary='actor_movie')
    def __init__(self, title):
        self.title = title
class ActorMovie(Base):
    __tablename__ = 'actor_movie'
    fk_actor_id = Column(Integer, ForeignKey('actor.act_id'), primary_key=True)
    fk_movie_id = Column(Integer, ForeignKey('movie.movie_id'), primary_key=True)
    def __init__(self, fk_actor_id, fk_movie_id):
        self.fk_actor_id = fk_actor_id
        self.fk_movie_id = fk_movie_id
When I write a simple query like:
result = session.query(Movie).filter(Movie.title == 'Terminator').first()
I get the Movie object back with an actors field. This actors field contains an InstrumentedList with all actors that are related to the film. This seems like a lot of overhead when the relationships are always joined.
Why is the relationship automatically populated and when do I need a manual join?
Based on the result I'm not even sure if the junction table is correct. This seems to be the most "raw SQL" way. I also saw alternative approaches, e.g.:
Official SQLAlchemy documentation
"This seems like a lot overhead when the relationships are always joined."
They are not. By default, a relationship performs a SELECT the first time it is accessed, so-called lazy loading.
"Why is the relationship automatically populated"
It was accessed on an instance and the relationship is using default configuration.
"...and when do I need a manual join?"
You need one when, for example, you filter the query on the related table, or when you fetch many movies and know beforehand that you will need all or some of their actors. For a many-to-many relationship, though, selectin eager loading may perform better than a join.
"Based on the result I'm not even sure if the junction table is correct."
That is the correct approach. SQLAlchemy is an ORM, and relationship attributes are the object side of the mapping and association/junction tables the relational side.
All in all the purposes are much the same as in raw SQL, but the ORM handles joins under the hood in some cases, such as eager loading, if configured or instructed to do so. As they say on the home page:
SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL.
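The difference between lazy loading and selectin eager loading can be made visible by counting the statements sent to the database. The following is a rough sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The models are simplified from the question, and the statements list and record_statement listener are hypothetical helpers added for illustration.

```python
from sqlalchemy import (Column, ForeignKey, Integer, String, Table,
                        create_engine, event, select)
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()

actor_movie = Table(
    "actor_movie", Base.metadata,
    Column("fk_actor_id", Integer, ForeignKey("actor.act_id"), primary_key=True),
    Column("fk_movie_id", Integer, ForeignKey("movie.movie_id"), primary_key=True),
)


class Actor(Base):
    __tablename__ = "actor"
    act_id = Column(Integer, primary_key=True)
    last_name = Column(String(150), nullable=False)
    movies = relationship("Movie", secondary=actor_movie, back_populates="actors")


class Movie(Base):
    __tablename__ = "movie"
    movie_id = Column(Integer, primary_key=True)
    title = Column(String(150))
    actors = relationship("Actor", secondary=actor_movie, back_populates="movies")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Record every SQL statement so the loading behaviour becomes visible.
statements = []


def record_statement(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)


event.listen(engine, "before_cursor_execute", record_statement)

with Session(engine) as session:
    session.add(Movie(title="Terminator",
                      actors=[Actor(last_name="Schwarzenegger"),
                              Actor(last_name="Hamilton")]))
    session.commit()

query = select(Movie).where(Movie.title == "Terminator")

with Session(engine) as session:
    statements.clear()
    movie = session.execute(query).scalar_one()
    lazy_before_access = len(statements)
    # The first access to movie.actors emits an extra SELECT (lazy loading).
    lazy_names = sorted(a.last_name for a in movie.actors)
    lazy_after_access = len(statements)

with Session(engine) as session:
    statements.clear()
    movie = session.execute(query.options(selectinload(Movie.actors))).scalar_one()
    eager_before_access = len(statements)
    # The actors were loaded together with the query; no further SELECT here.
    eager_names = sorted(a.last_name for a in movie.actors)
    eager_after_access = len(statements)
```

In the lazy case the statement count grows when movie.actors is first touched; in the selectin case it does not, because the collection was fetched up front.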

SQLAlchemy - when to make extra models and relationships vs. just storing JSON in column?

I'm writing an app framework for a project, where each app is a set of functions. To describe these functions (parameter schemas, return schemas, plugin info, etc.) I'm using an OpenAPI 3.0-like syntax: https://swagger.io/specification/
These app API descriptions are stored in a PostgreSQL database using SQLAlchemy and serialized/deserialized using Marshmallow.
My question mainly concerns nested objects like the Info object: https://swagger.io/specification/#infoObject
In my mind, I could go about this in one of two ways:
A: Just storing the JSON representation of the object in a column, and validating the schema of that object myself:
class AppApi(Base):
    __tablename__ = 'app_api'
    id_ = Column(UUIDType(binary=False), primary_key=True, nullable=False, default=uuid4)
    info = Column(sqlalchemy_utils.JSONType, nullable=False)
B: Creating a new table for each nested object, and relying on Marshmallow to validate it against the schema during serialization:
class AppApi(Base):
    __tablename__ = 'app_api'
    id_ = Column(UUIDType(binary=False), primary_key=True, nullable=False, default=uuid4)
    # The relationship target must match the mapped class name (ApiInfo).
    info = relationship("ApiInfo", cascade="all, delete-orphan", passive_deletes=True)
class ApiInfo(Base):
    __tablename__ = 'api_info'
    id_ = Column(UUIDType(binary=False), primary_key=True, nullable=False, default=uuid4)
    app_api_id = Column(sqlalchemy_utils.UUIDType(binary=False), ForeignKey('app_api.id_', ondelete='CASCADE'))
    name = Column(String(), nullable=False)
    description = Column(String(), nullable=False)
    ...etc.
I'm inclined to go for option A since it seems much less involved, but option B feels more "correct." Option A gives me more flexibility and doesn't require me to make models for every single object, but Option B makes it clearer what is being stored in the database.
The app's info object won't be accessed independently of the rest of the app's API, so I'm not sure that there's much value in creating a separate table for it.
What are some other considerations I should be making to choose one or the other?
I think B is better.
With this configuration, you can access the individual columns of ApiInfo faster (and more easily).

SQLAlchemy how to create various relationship with backrefs?

I was reading through the SQLAlchemy documentation on basic relationships and I feel like I'm missing some basic understanding of how to create the relationship declarations. When I run my code, I'm running into errors such as:
sqlalchemy.exc.NoForeignKeysError: Can't find any foreign key relationships between 'entity' and 'category'.
sqlalchemy.exc.NoForeignKeysError: Could not determine join condition between parent/child tables on relationship Entity.categories - there are no foreign keys linking these tables. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or specify a 'primaryjoin' expression.
I thought that the purpose of the relationship() directive was to minimize the creation of manual keys and ids.
I'm also a little confused on the syntax with regards to one-to-many and many-to-many, and many-to-one and how the syntax would differentiate between the different types of relationships.
Here's my example, where I create an Entity and various classes around it to try out the various relationships:
class Entity(Base):
    __tablename__ = 'entity'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # many-to-one - many entities will belong to one manufacturer
    # do i need to define the mfg_id manually?
    manufacturer_id = Column(Integer, ForeignKey('manufacturer.id'))
    manufacturer = relationship("Manufacturer")
    # one-to-many relationship where an entity will have lots of
    # properties that belong to it. Each property will only belong to one entity
    properties = relationship("EntityProperty", backref="entity")
    # this is a many-to-many relationship mapping where entity can belong
    # to multiple categories and you can look up entities by category
    categories = relationship("Category", backref="entities")
class EntityProperty(Base):
    __tablename__ = 'entity_property'
    id = Column(Integer, primary_key=True)
    key = Column(String(250), nullable=False)
    value = Column(String(250), nullable=False)
    # do we need to define this? or can this be implied by relationship?
    entity_id = Column(Integer, ForeignKey('entity.id'))
class Category(Base):
    __tablename__ = 'category'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # Does this need to know anything about entities? Many entities
    # can belong to a category and entities can also belong to multiple
    # categories. Usage is to look up entities that belong to a category.
class Manufacturer(Base):
    __tablename__ = 'manufacturer'
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # Similar to category except not all manufacturers would have entities.
    # How to decouple code at this level from entity
Can someone point me in the right direction to learn more about the proper usage of relationship()? Thank you
First, there has to be a foreign key in the entity table referencing a column in the category table (or vice versa) to establish a relationship. You currently have none.
However, if you intend to have a many-to-many relationship between entity and category, see
SQLAlchemy many-to-many Relationships
Sometimes you end up solving your own questions as soon as they are posed. So here's what I learned.
I was getting a little confused by back_populates vs backref. I was thinking that if I added a backref I wouldn't need to add the foreign key in the opposite class, but this was incorrect.
So for the one to many:
This is declared in the Entity
properties = relationship("EntityProperty", backref="entity")
And this is declared in the EntityProperty to facilitate the necessary back-linking and is required:
entity_id = Column(Integer, ForeignKey("entity.id"))
In the many-to-many case, I was missing an association table:
cat_entity_association_table = Table("cat_entity_association", Base.metadata,
    Column("category_id", Integer, ForeignKey("category.id")),
    Column("entity_id", Integer, ForeignKey("entity.id")),
)
This association is used to construct the bi-directional linking between entities:
categories = relationship("Category", secondary=cat_entity_association_table, back_populates="entities")
and categories:
entities = relationship("Entity", secondary=cat_entity_association_table, back_populates="categories")
There was some ambiguity on when to use the external table, but hopefully this will help someone else as well.
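Putting the pieces together, the three relationship types might look like the sketch below. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the sample rows ("drill", "Acme", "tools") are made up purely for illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Many-to-many needs an association table holding both foreign keys.
cat_entity_association_table = Table(
    "cat_entity_association", Base.metadata,
    Column("category_id", Integer, ForeignKey("category.id")),
    Column("entity_id", Integer, ForeignKey("entity.id")),
)


class Manufacturer(Base):
    __tablename__ = "manufacturer"
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)


class Category(Base):
    __tablename__ = "category"
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    entities = relationship("Entity", secondary=cat_entity_association_table,
                            back_populates="categories")


class Entity(Base):
    __tablename__ = "entity"
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    # Many-to-one: the foreign key lives on the "many" side (this table).
    manufacturer_id = Column(Integer, ForeignKey("manufacturer.id"))
    manufacturer = relationship("Manufacturer")
    # One-to-many: the foreign key lives on EntityProperty.
    properties = relationship("EntityProperty", backref="entity")
    # Many-to-many: both foreign keys live in the association table.
    categories = relationship("Category", secondary=cat_entity_association_table,
                              back_populates="entities")


class EntityProperty(Base):
    __tablename__ = "entity_property"
    id = Column(Integer, primary_key=True)
    key = Column(String(250), nullable=False)
    value = Column(String(250), nullable=False)
    entity_id = Column(Integer, ForeignKey("entity.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    tools = Category(name="tools")
    drill = Entity(name="drill",
                   manufacturer=Manufacturer(name="Acme"),
                   properties=[EntityProperty(key="color", value="red")],
                   categories=[tools])
    session.add(drill)  # related objects are added through the save-update cascade
    session.commit()

    entity_names_in_category = [e.name for e in tools.entities]
    property_owner = drill.properties[0].entity.name
    manufacturer_name = drill.manufacturer.name
```

In every case relationship() only describes how to navigate between objects; the foreign key columns themselves still have to be declared, either on one of the two tables or in the association table.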

SQLAlchemy - How to profit from dynamic and eager loading at the same time

I have two tables, Factorys and Products. Each Factory can have a large collection of Products, so lazy='dynamic' has been applied.
class Factory(Base):
    __tablename__ = 'factorys'
    ID = Column(Integer, primary_key=True)
    products = relationship("Product", lazy='dynamic')
class Product(Base):
    __tablename__ = 'products'
    ID = Column(Integer, primary_key=True)
    factory_id = Column(Integer, ForeignKey('factorys.ID'))
    Name = Column(Text)
In case all products of a factory are needed:
factory.products.all()
should be applied. But since the factory is already loaded at this point, it would be more performant to use joined eager loading between Factory and Product.
However, a joined relationship between the two tables makes the overall performance worse due to the large collection of products, and it is not required, for example, when appending products to a factory.
Is it possible to define different relations between two tables, but using them only in specific cases? For example in a method for the factory class such as:
class Factory(Base):
    __tablename__ = 'factorys'
    ID = Column(Integer, primary_key=True)
    products = relationship("Product", lazy='dynamic')
    def _getProducts():
        return relationship("Product", lazy='joined')
How can I get all the products of a factory in a performant way, without losing performance when adding products to a factory?
Any tips would be appreciated.
I have run into the same question and had a very difficult time finding the answer.
What you are proposing, returning a relationship from a method, will not work, because SQLAlchemy must know about the relationship when the class is mapped, but doing:
class Factory(Base):
    __tablename__ = 'factorys'
    ID = Column(Integer, primary_key=True)
    products_dyn = relationship("Product", lazy='dynamic', viewonly=True)
    products = relationship("Product", lazy='joined')
should work. Note the viewonly attribute; it is very important, because without it SQLAlchemy may try to use both relationships when you add a product to the factory and may produce duplicate entries in specific cases (such as using a secondary table for the relationship).
This way you can use the eagerly loaded products and still run optimized queries through the dynamic relationship, with both kept inside the model declaration.
Hope that helps!
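As a rough end-to-end sketch of the two-relationship approach described above, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database (the product names are made up for illustration):

```python
from sqlalchemy import Column, ForeignKey, Integer, Text, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Factory(Base):
    __tablename__ = "factorys"
    ID = Column(Integer, primary_key=True)
    # Read-only, query-style access for filtering a large collection.
    products_dyn = relationship("Product", lazy="dynamic", viewonly=True)
    # Writable collection, loaded together with the factory via a JOIN.
    products = relationship("Product", lazy="joined")


class Product(Base):
    __tablename__ = "products"
    ID = Column(Integer, primary_key=True)
    factory_id = Column(Integer, ForeignKey("factorys.ID"))
    Name = Column(Text)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Factory(ID=1, products=[Product(Name=n) for n in ("bolt", "nut", "gear")]))
    session.commit()

with Session(engine) as session:
    factory = session.get(Factory, 1)
    # Already loaded by the joined eager load that fetched the factory.
    all_names = sorted(p.Name for p in factory.products)
    # Emits its own filtered query instead of touching the loaded collection.
    gear_count = factory.products_dyn.filter(Product.Name == "gear").count()
```

Writes go through products only; products_dyn is used purely for reading, which is what viewonly=True expresses.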

min/max with orm relationships

I'm trying to find the min/max of a collection off a foreign key. I know that you can do session.query with func.min and func.max, but is there a way that lets me use the standard ORM relationship stuff?
For example with a blog, if I wanted to find the biggest "number comment" for a given post given the schema below, is it possible to do something like Post.query.get(0).number_comments.max()?
class Post(base):
    id = Column(Integer, primary_key=True)
    number_comments = relationship("NumberComment")
class NumberComment(base):
    id = Column(Integer, primary_key=True)
    num = Column(Integer, nullable=False)
As with raw SQL, you need to join those tables in your query:
# This class lacks a foreign key in your example.
class NumberComment(base):
    # ...
    post_id = Column(Integer, ForeignKey(Post.id), nullable=False)
    # ...
session.query(func.max(NumberComment.num)).join(Post).\
    filter(Post.id == 1).scalar()
There's no other way to do this, at least not the way you wanted. There's a reason why SQLAlchemy is called that and not ORMSorcery ;-)
My advice would be to think in terms of SQL when trying to come up with a query; this will help you a lot.
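For comparison, here is a small sketch that computes the maximum both ways, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The table names post and number_comment and the sample values are assumptions added for illustration, since the original models omit them.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Post(Base):
    __tablename__ = "post"  # assumed name, not given in the question
    id = Column(Integer, primary_key=True)
    number_comments = relationship("NumberComment")


class NumberComment(Base):
    __tablename__ = "number_comment"  # assumed name, not given in the question
    id = Column(Integer, primary_key=True)
    num = Column(Integer, nullable=False)
    post_id = Column(Integer, ForeignKey("post.id"), nullable=False)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Post(id=1, number_comments=[NumberComment(num=n) for n in (3, 7, 5)]))
    session.commit()

    # Aggregate in the database: a single SELECT max(...) with a JOIN.
    sql_max = session.execute(
        select(func.max(NumberComment.num))
        .select_from(NumberComment)
        .join(Post)
        .where(Post.id == 1)
    ).scalar()

    # Aggregate in Python: loads every comment of the post first.
    post = session.get(Post, 1)
    python_max = max(c.num for c in post.number_comments)
```

The Python-side version uses the relationship as the question hoped, but it pulls all rows into memory, so the SQL aggregate is usually the better fit for large collections.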
