I am new to SQLAlchemy and need help implementing the following relationship.
I have two tables, Trends and ClosestTrends, and I want to declare two one-to-many relationships between them:
[Image: tables relationship]
In SQL, it would be :
ALTER TABLE "closest_trends" ADD FOREIGN KEY ("id_trend_ref") REFERENCES "trends" ("id") ON DELETE CASCADE;
ALTER TABLE "closest_trends" ADD FOREIGN KEY ("id_trend_close") REFERENCES "trends" ("id") ON DELETE CASCADE;
I tried the following implementation :
class Trends(Base):
    __tablename__ = "trends"
    __table_args__ = (
        UniqueConstraint(
            "name", "id_region", "language_iso", name="name_id_region_language"
        ),
    )
    id = Column(Integer, primary_key=True, index=True, unique=True)
    .
    .
    .
    closest_trends = relationship("ClosestTrends", backref="Trends")

    def __str__(self):
        return "Trends"


class ClosestTrends(Base):
    __tablename__ = "closest_trends"
    __table_args__ = (
        UniqueConstraint(
            "id_trend_ref", "id_trend_close", name="id_trend_ref_id_trend_close"
        ),
    )
    id = Column(Integer, primary_key=True, index=True, unique=True)
    .
    .
    .
    id_trend_ref = Column(
        Integer, ForeignKey("trends.id", ondelete="CASCADE"), nullable=False
    )
    id_trend_close = Column(
        Integer, ForeignKey("trends.id", ondelete="CASCADE"), nullable=False
    )

    def __str__(self):
        return "ClosestTrends"
It does not work, and I receive the following error:
sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join
condition between parent/child tables on relationship
Trends.closest_trends - there are multiple foreign key paths linking
the tables. Specify the 'foreign_keys' argument, providing a list of
those columns which should be counted as containing a foreign key
reference to the parent table.
Does anyone know how to fix this?
Many thanks
The solution is described in the official documentation:
https://docs.sqlalchemy.org/en/13/orm/join_conditions.html#handling-multiple-join-paths
Following that page, I would implement it like this:
class Trends(Base):
    __tablename__ = "trends"
    __table_args__ = (
        UniqueConstraint(
            "name", "id_region", "language_iso", name="name_id_region_language"
        ),
    )
    id = Column(Integer, primary_key=True, index=True, unique=True)
    .
    .
    .
    # closest_trends = relationship("ClosestTrends", backref="Trends")

    def __str__(self):
        return "Trends"


class ClosestTrends(Base):
    __tablename__ = "closest_trends"
    __table_args__ = (
        UniqueConstraint(
            "id_trend_ref", "id_trend_close", name="id_trend_ref_id_trend_close"
        ),
    )
    id = Column(Integer, primary_key=True, index=True, unique=True)
    .
    .
    .
    id_trend_ref = Column(
        Integer, ForeignKey("trends.id", ondelete="CASCADE"), nullable=False
    )
    id_trend_close = Column(
        Integer, ForeignKey("trends.id", ondelete="CASCADE"), nullable=False
    )
    # Each relationship names the foreign key column it should follow.
    trend_ref = relationship("Trends", foreign_keys=[id_trend_ref])
    trend_close = relationship("Trends", foreign_keys=[id_trend_close])

    def __str__(self):
        return "ClosestTrends"
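If the reverse side on Trends is also needed, one possibility is to declare one relationship per foreign key on both classes and pair them with back_populates. The following is only a minimal sketch under stated assumptions: the attribute names as_ref and as_close, the name column, and the in-memory SQLite engine are invented for illustration, and it assumes SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Trends(Base):
    __tablename__ = "trends"

    id = Column(Integer, primary_key=True)
    name = Column(String)  # hypothetical column, used only by this demo

    # One relationship per foreign key. The string form is resolved lazily,
    # so ClosestTrends may be defined further down.
    as_ref = relationship(
        "ClosestTrends",
        foreign_keys="ClosestTrends.id_trend_ref",
        back_populates="trend_ref",
    )
    as_close = relationship(
        "ClosestTrends",
        foreign_keys="ClosestTrends.id_trend_close",
        back_populates="trend_close",
    )


class ClosestTrends(Base):
    __tablename__ = "closest_trends"

    id = Column(Integer, primary_key=True)
    id_trend_ref = Column(
        Integer, ForeignKey("trends.id", ondelete="CASCADE"), nullable=False
    )
    id_trend_close = Column(
        Integer, ForeignKey("trends.id", ondelete="CASCADE"), nullable=False
    )

    trend_ref = relationship(
        "Trends", foreign_keys=[id_trend_ref], back_populates="as_ref"
    )
    trend_close = relationship(
        "Trends", foreign_keys=[id_trend_close], back_populates="as_close"
    )


# In-memory SQLite is an assumption made so the sketch runs anywhere.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    a, b = Trends(name="a"), Trends(name="b")
    link = ClosestTrends(trend_ref=a, trend_close=b)
    session.add_all([a, b, link])
    session.commit()

    ref_links = len(a.as_ref)      # rows where a is the reference trend
    close_links = len(b.as_close)  # rows where b is the close trend
    unrelated = len(a.as_close)    # a is never the close trend here
```

Because every relationship states its own foreign_keys, the mapper no longer has to guess which of the two paths to trends.id is meant.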
I'm building a CRUD application and trying to display a list of post "tags", with a number next to each of how many posts have used that tag, and ordered by the number of posts. I have one table for posts, one for tags, and one join table called posts_tags. When I execute the query I think should do the trick, it displays the count of all rows of the posts_tags table instead of just the count of rows associated with each tag. In the image below, the "test" tag has been used on 3 posts and "test 2" on 1 (which are the numbers that should show up next to them), but as you can see I get 4 instead:
[Image: display of incorrect post counts for tags]
My tags table has a relationship with the posts_tags table, allowing me to use "Tag.tagged_post_ids" in the query:
class Tag(db.Model):
    """ Model for tags table """
    __tablename__ = "tags"

    id = db.Column(
        db.Integer,
        primary_key=True,
        autoincrement=True
    )
    tag = db.Column(
        db.String(30),
        nullable=False,
        unique=True
    )
    description = db.Column(
        db.Text,
        nullable=False
    )
    tagged_post_ids = db.relationship(
        "PostTag"
    )
Here's the SQLA query I wrote:
tags = (
    db.session.query(Tag.tag, func.count(Tag.tagged_post_ids).label("count"))
    .group_by(Tag.tag)
    .order_by(func.count(Tag.tagged_post_ids))
    .all()
)
I have successfully built the query in SQL:
SELECT tags.tag, COUNT(posts_tags.post_id) FROM tags JOIN posts_tags ON posts_tags.tag_id = tags.id GROUP BY tags.tag ORDER BY COUNT(posts_tags.post_id) DESC;
My main issue is trying to translate this into SQLAlchemy. I feel like my query is a 1-to-1 for my SQL query, but it's not working! Any help would be greatly appreciated.
EDIT: Adding my Post model and PostTag (join) model:
class Post(db.Model):
    """ Model for posts table """
    __tablename__ = "posts"

    id = db.Column(
        db.Integer,
        primary_key=True,
        autoincrement=True
    )
    user_id = db.Column(
        db.Integer,
        db.ForeignKey("users.id")
    )
    title = db.Column(
        db.Text,
        nullable=False
    )
    content = db.Column(
        db.Text
    )
    url = db.Column(
        db.Text
    )
    img_url = db.Column(
        db.Text
    )
    created_at = db.Column(
        db.DateTime,
        nullable=False,
        default=db.func.now()
    )
    score = db.Column(
        db.Integer,
        nullable=False,
        default=0
    )
    tags = db.relationship(
        "Tag",
        secondary="posts_tags",
        backref="posts"
    )
    comments = db.relationship(
        "Comment",
        backref="post"
    )

    @property
    def tag_list(self):
        """ Builds comma separated list of tags for the post. """
        tag_list = []
        for tag in self.tags:
            tag_list.append(tag.tag)
        return tag_list


class PostTag(db.Model):
    """ Model for join table between posts and tags """
    __tablename__ = "posts_tags"

    post_id = db.Column(
        db.Integer,
        db.ForeignKey("posts.id"),
        primary_key=True
    )
    tag_id = db.Column(
        db.Integer,
        db.ForeignKey("tags.id"),
        primary_key=True
    )
If you are using backref, you only need to define one side of the relationship. I don't know what happens when you apply func.count to a relationship; I have only used it on a column. Here are a couple of options. An outer join is needed to catch tags that have 0 posts; with an inner join, those tags would simply be missing from the result. In the first example I also use func.coalesce to convert NULL to 0.
class Tag(Base):
    """ Model for tags table """
    __tablename__ = "tags"

    id = Column(
        Integer,
        primary_key=True,
        autoincrement=True
    )
    tag = Column(
        String(30),
        nullable=False,
        unique=True
    )
    # Redundant
    # tagged_post_ids = relationship(
    #     "PostTag"
    # )


class Post(Base):
    """ Model for posts table """
    __tablename__ = "posts"

    id = Column(
        Integer,
        primary_key=True,
        autoincrement=True
    )
    title = Column(
        Text,
        nullable=False
    )
    tags = relationship(
        "Tag",
        secondary="posts_tags",
        backref="posts"
    )

    @property
    def tag_list(self):
        """ Builds comma separated list of tags for the post. """
        tag_list = []
        for tag in self.tags:
            tag_list.append(tag.tag)
        return tag_list


class PostTag(Base):
    """ Model for join table between posts and tags """
    __tablename__ = "posts_tags"

    post_id = Column(
        Integer,
        ForeignKey("posts.id"),
        primary_key=True
    )
    tag_id = Column(
        Integer,
        ForeignKey("tags.id"),
        primary_key=True
    )


metadata.create_all(engine)

with Session(engine) as session, session.begin():
    # With subquery
    tag_subq = select(
        PostTag.tag_id,
        func.count(PostTag.post_id).label("post_count")
    ).group_by(
        PostTag.tag_id
    ).order_by(
        func.count(PostTag.post_id)
    ).subquery()
    q = session.query(
        Tag.tag,
        func.coalesce(tag_subq.c.post_count, 0)
    ).outerjoin(
        tag_subq,
        Tag.id == tag_subq.c.tag_id
    ).order_by(
        func.coalesce(tag_subq.c.post_count, 0))
    for (tag_name, post_count) in q.all():
        print(tag_name, post_count)

    # With join
    q = session.query(
        Tag.tag,
        func.count(PostTag.post_id).label('post_count')
    ).outerjoin(
        PostTag,
        Tag.id == PostTag.tag_id
    ).group_by(
        Tag.id
    ).order_by(
        func.count(PostTag.post_id))
    for (tag_name, post_count) in q.all():
        print(tag_name, post_count)
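To see the outer-join variant with concrete numbers, here is a self-contained sketch that loads the situation from the question (3 posts tagged "test", 1 tagged "test 2") plus one unused tag. It is an illustration under assumptions, not the original application: the models are trimmed, the posts table is left out, the "unused" tag and the in-memory SQLite engine are made up, and the ordering is descending to match the SQL in the question. It assumes SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    tag = Column(String(30), nullable=False, unique=True)


class PostTag(Base):
    __tablename__ = "posts_tags"
    # The posts table is omitted in this sketch, so post_id has no foreign key.
    post_id = Column(Integer, primary_key=True)
    tag_id = Column(Integer, ForeignKey("tags.id"), primary_key=True)


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Tag(id=1, tag="test"),
        Tag(id=2, tag="test 2"),
        Tag(id=3, tag="unused"),  # hypothetical tag with no posts
    ])
    session.add_all([PostTag(post_id=p, tag_id=1) for p in (1, 2, 3)])
    session.add(PostTag(post_id=4, tag_id=2))
    session.commit()

    post_count = func.count(PostTag.post_id)
    stmt = (
        select(Tag.tag, post_count.label("post_count"))
        .select_from(Tag)
        # LEFT OUTER JOIN keeps tags that have no rows in posts_tags.
        .outerjoin(PostTag, Tag.id == PostTag.tag_id)
        .group_by(Tag.id)
        .order_by(post_count.desc())
    )
    counts = [(name, n) for name, n in session.execute(stmt)]

print(counts)
```

COUNT over a column skips NULLs, so a tag without posts comes back with 0 here rather than disappearing, which is why this variant needs no coalesce.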
A store can have many interests (tags). Users request products, and each product request is tagged. I need a query that returns the product requests sharing at least one tag with the current store.
# in Store -> relationship('Tag', secondary=store_interest_tags, lazy='dynamic', backref=backref('store', lazy=True))
store_tags = store.interests
matched_requests_to_store = []
for tag in store_tags:
    r = session.query(ProductRequest).filter(ProductRequest.product_tags.contains(tag)).all()
    matched_requests_to_store.extend(r)
I am sure there might be a more efficient way to query that. I have tried the following:
session.query(ProductRequest).filter(ProductRequest.product_tags.any(store_tags)).all()
But got
psycopg2.errors.SyntaxError: subquery must return only one column
LINE 5: ..._id AND tag.id = product_requests_tags.tag_id AND (SELECT ta...
Any idea how to write such a query?
A query like this might work. I think it could be done with fewer joins, but this is less rigid than using the secondary tables directly and specifying the individual joins:
q = session.query(
    ProductRequest
).join(
    ProductRequest.tags
).join(
    Tag.stores
).filter(
    Store.id == store.id)
product_requests_for_store = q.all()
With a schema like this:
stores_tags_t = Table(
    "stores_tags",
    Base.metadata,
    Column("id", Integer, primary_key=True),
    Column("store_id", Integer, ForeignKey("stores.id")),
    Column("tag_id", Integer, ForeignKey("tags.id")),
)

product_requests_tags_t = Table(
    "product_request_tags",
    Base.metadata,
    Column("id", Integer, primary_key=True),
    Column("product_request_id", Integer, ForeignKey("product_requests.id")),
    Column("tag_id", Integer, ForeignKey("tags.id")),
)


class Store(Base):
    __tablename__ = "stores"
    id = Column(Integer, primary_key=True)
    name = Column(String(), unique=True, index=True)
    tags = relationship('Tag', secondary=stores_tags_t, backref=backref('stores'))


class ProductRequest(Base):
    __tablename__ = "product_requests"
    id = Column(Integer, primary_key=True)
    name = Column(String(), unique=True, index=True)
    tags = relationship('Tag', secondary=product_requests_tags_t, backref=backref('product_requests'))


class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    name = Column(String())
This worked:
session.query(ProductRequest).filter(
    ProductRequest.product_tags.any(
        Tag.id.in_([store_tag.id for store_tag in store_tags])
    )
).all()
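The any() plus in_() approach can be tried end to end with the sketch below. The relationship names interests and product_tags come from the question; the table names, the sample tags and requests, and the in-memory SQLite engine are assumptions made for illustration, and it assumes SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Hypothetical association tables; the real names may differ.
store_interest_tags = Table(
    "store_interest_tags",
    Base.metadata,
    Column("store_id", Integer, ForeignKey("stores.id"), primary_key=True),
    Column("tag_id", Integer, ForeignKey("tags.id"), primary_key=True),
)
product_requests_tags = Table(
    "product_requests_tags",
    Base.metadata,
    Column("product_request_id", Integer, ForeignKey("product_requests.id"), primary_key=True),
    Column("tag_id", Integer, ForeignKey("tags.id"), primary_key=True),
)


class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Store(Base):
    __tablename__ = "stores"
    id = Column(Integer, primary_key=True)
    interests = relationship("Tag", secondary=store_interest_tags)


class ProductRequest(Base):
    __tablename__ = "product_requests"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    product_tags = relationship("Tag", secondary=product_requests_tags)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    books, toys, food = Tag(name="books"), Tag(name="toys"), Tag(name="food")
    store = Store(interests=[books, toys])
    session.add_all([
        store,
        ProductRequest(name="novel", product_tags=[books]),
        ProductRequest(name="puzzle book", product_tags=[books, toys]),
        ProductRequest(name="bread", product_tags=[food]),
    ])
    session.commit()

    # any() renders an EXISTS subquery, so a request matching two of the
    # store's tags still appears only once in the result.
    store_tag_ids = [tag.id for tag in store.interests]
    stmt = select(ProductRequest).where(
        ProductRequest.product_tags.any(Tag.id.in_(store_tag_ids))
    )
    matched = sorted(pr.name for pr in session.execute(stmt).scalars())

print(matched)
```

This issues one query for the store's tag ids and one for the matching requests, instead of one query per tag as in the loop from the question.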
I am still a beginner in Python and I am stuck with the following relation.
Three tables:
tx_bdproductsdb_domain_model_product
sys_category
sys_category_record_mm
sys_category class looks like this:
class Category(Base):
    __tablename__ = "sys_category"
    uid = Column(
        Integer,
        ForeignKey("sys_category_record_mm.uid_local"),
        primary_key=True,
        autoincrement=True,
    )
    title = Column(String)
    products = relationship(
        "Product",
        uselist=False,
        secondary="sys_category_record_mm",
        back_populates="categories",
        foreign_keys=[uid],
    )
Products looks like this:
class Product(Base):
    __tablename__ = "tx_bdproductsdb_domain_model_product"
    uid = Column(
        Integer,
        ForeignKey(SysCategoryMMProduct.uid_foreign),
        primary_key=True,
        autoincrement=True,
    )
    category = Column(Integer)
    categories = relationship(
        Category,
        secondary=SysCategoryMMProduct,
        back_populates="products",
        foreign_keys=[uid],
    )
And here is the mm table class that should link the two.
class SysCategoryMMProduct(Base):
    __tablename__ = "sys_category_record_mm"
    uid_local = Column(Integer, ForeignKey(Category.uid), primary_key=True)
    uid_foreign = Column(
        Integer, ForeignKey("tx_bdproductsdb_domain_model_product.uid")
    )
    fieldname = Column(String)
I'm currently stuck, does anyone have any ideas? I get the following messages in the console:
sqlalchemy.exc.NoForeignKeysError: Could not determine join condition between parent/child tables on relationship Category.products - there are no foreign keys linking these tables via secondary table 'sys_category_record_mm'. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or specify 'primaryjoin' and 'secondaryjoin' expressions.
root@booba:/var/pythonWorks/crawler/develop/releases/current# python3 Scraper2.py
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/relationships.py", line 2739, in _determine_joins
self.secondaryjoin = join_condition(
File "<string>", line 2, in join_condition
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/selectable.py", line 1229, in _join_condition
raise exc.NoForeignKeysError(
sqlalchemy.exc.NoForeignKeysError: Can't find any foreign key relationships between 'tx_bdproductsdb_domain_model_product' and 'sys_category_record_mm'.
sqlalchemy.exc.NoForeignKeysError: Could not determine join condition between parent/child tables on relationship Category.products - there are no foreign keys linking these tables via secondary table 'sys_category_record_mm'. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or specify 'primaryjoin' and 'secondaryjoin' expressions.
Thank you :)
When using an association class, you should reference the association directly. You need this instead of secondary because there is data attached to the link itself (i.e. fieldname). I changed some of your naming to make things clearer.
There is a pretty good explanation of the association pattern in the SQLAlchemy docs. There is also a big red warning at the end of that section about mixing the use of secondary with the association pattern.
I use backref="related_categories" to automatically create the property related_categories on Product. This is a list of association objects, not of actual categories.
from sqlalchemy import (
    create_engine,
    Integer,
    String,
    ForeignKey,
)
from sqlalchemy.schema import (
    Column,
)
from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy.orm import Session

Base = declarative_base()

# This connection string is made up
engine = create_engine(
    'postgresql+psycopg2://user:pw@/db',
    echo=False)


class Category(Base):
    __tablename__ = "categories"
    uid = Column(
        Integer,
        primary_key=True,
        autoincrement=True,
    )
    title = Column(String)


class Product(Base):
    __tablename__ = "products"
    uid = Column(
        Integer,
        primary_key=True,
        autoincrement=True,
    )
    title = Column(String)


class SysCategoryMMProduct(Base):
    __tablename__ = "categories_products"
    uid = Column(Integer, primary_key=True)
    category_uid = Column(Integer, ForeignKey("categories.uid"))
    product_uid = Column(Integer, ForeignKey("products.uid"))
    fieldname = Column(String)
    product = relationship(
        "Product",
        backref="related_categories",
    )
    category = relationship(
        "Category",
        backref="related_products",
    )


Base.metadata.create_all(engine)

with Session(engine) as session:
    category = Category(title="kitchen")
    session.add(category)
    product = Product(title="spoon")
    session.add(product)
    association = SysCategoryMMProduct(
        product=product,
        category=category,
        fieldname="Extra metadata")
    session.add(association)
    session.commit()

    category = session.query(Category).first()
    assert len(category.related_products) == 1
    assert category.related_products[0].product.related_categories[0].category == category

    q = session.query(Category).join(Category.related_products).join(SysCategoryMMProduct.product).filter(Product.title == "spoon")
    print(q)
    assert q.first() == category
The last query looks like:
SELECT categories.uid AS categories_uid, categories.title AS categories_title
FROM categories JOIN categories_products ON categories.uid = categories_products.category_uid JOIN products ON products.uid = categories_products.product_uid
WHERE products.title = 'spoon'
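Since related_categories holds association objects rather than categories, reaching the actual Category rows takes one extra hop through the association's category attribute. A minimal sketch of that hop, reusing the model names from the answer; the second category "cutlery", the fieldname value "categories", and the in-memory SQLite engine are assumptions made so the example is self-contained (SQLAlchemy 1.4 or newer assumed).

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Category(Base):
    __tablename__ = "categories"
    uid = Column(Integer, primary_key=True)
    title = Column(String)


class Product(Base):
    __tablename__ = "products"
    uid = Column(Integer, primary_key=True)
    title = Column(String)


class SysCategoryMMProduct(Base):
    __tablename__ = "categories_products"
    uid = Column(Integer, primary_key=True)
    category_uid = Column(Integer, ForeignKey("categories.uid"))
    product_uid = Column(Integer, ForeignKey("products.uid"))
    fieldname = Column(String)
    product = relationship("Product", backref="related_categories")
    category = relationship("Category", backref="related_products")


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    spoon = Product(title="spoon")
    for title in ("kitchen", "cutlery"):
        # Adding the association row also saves the product and category
        # it points to, through the default save-update cascade.
        session.add(
            SysCategoryMMProduct(
                product=spoon,
                category=Category(title=title),
                fieldname="categories",
            )
        )
    session.commit()

    # Each element is an association row; .category is the real Category.
    titles = sorted(link.category.title for link in spoon.related_categories)
    fieldnames = {link.fieldname for link in spoon.related_categories}

print(titles, fieldnames)
```

The extra data on the link (fieldname) stays reachable on each association object, which is the reason for using this pattern instead of secondary.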
I am defining a table in Flask like this:
groups = db.Table(
    "types",
    db.Column("one_id", db.Integer, db.ForeignKey("one.id")),
    db.Column("two_id", db.Integer, db.ForeignKey("two.id")),
    UniqueConstraint('one_id', 'two_id', name='uix_1')  # Unique constraint given for unique-together.
)
But this is not working.
I think this older answer covers it: https://stackoverflow.com/a/10061143/18269348
Here is the code from that answer:
# version1: table definition
mytable = Table('mytable', meta,
    # ...
    Column('customer_id', Integer, ForeignKey('customers.customer_id')),
    Column('location_code', Unicode(10)),
    UniqueConstraint('customer_id', 'location_code', name='uix_1')
    )
# or the index, which will ensure uniqueness as well
Index('myindex', mytable.c.customer_id, mytable.c.location_code, unique=True)

# version2: declarative
class Location(Base):
    __tablename__ = 'locations'
    id = Column(Integer, primary_key = True)
    customer_id = Column(Integer, ForeignKey('customers.customer_id'),
                         nullable=False)
    location_code = Column(Unicode(10), nullable=False)
    __table_args__ = (UniqueConstraint('customer_id', 'location_code',
                                       name='_customer_location_uc'),
                      )
That post includes a short explanation and a link to the official SQLAlchemy documentation.
Thanks to Van, who posted it.
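One way to check that a composite UniqueConstraint on a Table is actually enforced is to try inserting the same pair twice. The sketch below uses plain SQLAlchemy Core rather than Flask-SQLAlchemy and an in-memory SQLite database, and it drops the foreign keys so that no "one" and "two" tables are needed; all of these are assumptions made to keep it self-contained (SQLAlchemy 1.4 or newer assumed).

```python
from sqlalchemy import Column, Integer, MetaData, Table, UniqueConstraint, create_engine
from sqlalchemy.exc import IntegrityError

metadata = MetaData()

# Same shape as the table in the question, minus the foreign keys.
groups = Table(
    "types",
    metadata,
    Column("one_id", Integer),
    Column("two_id", Integer),
    UniqueConstraint("one_id", "two_id", name="uix_1"),
)

engine = create_engine("sqlite://")
metadata.create_all(engine)  # the constraint is emitted as part of CREATE TABLE

with engine.begin() as conn:
    conn.execute(groups.insert().values(one_id=1, two_id=2))
    conn.execute(groups.insert().values(one_id=1, two_id=3))  # differs in two_id: allowed

duplicate_rejected = False
try:
    with engine.begin() as conn:
        conn.execute(groups.insert().values(one_id=1, two_id=2))  # exact repeat
except IntegrityError:
    duplicate_rejected = True

with engine.connect() as conn:
    row_count = len(conn.execute(groups.select()).all())

print(duplicate_rejected, row_count)
```

Note that create_all() only creates tables that do not exist yet. If the table was created before the constraint was added to the definition, the database will not have it until the table is recreated or altered through a migration, which is one possible reason for a constraint that appears not to work.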
I have a many-to-many relationship between the Image and Tag tables in my project:
tags2images = db.Table("tags2images",
    db.Column("tag_id", db.Integer, db.ForeignKey("tags.id", ondelete="CASCADE", onupdate="CASCADE"), primary_key=True),
    db.Column("image_id", db.Integer, db.ForeignKey("images.id", ondelete="CASCADE", onupdate="CASCADE"), primary_key=True)
)


class Image(db.Model):
    __tablename__ = "images"
    id = db.Column(db.Integer, primary_key=True, autoincrement=False)
    title = db.Column(db.String(1000), nullable=True)
    tags = db.relationship("Tag", secondary=tags2images, back_populates="images", passive_deletes=True)


class Tag(db.Model):
    __tablename__ = "tags"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String(250), nullable=False, unique=True)
    images = db.relationship(
        "Image",
        secondary=tags2images,
        back_populates="tags",
        passive_deletes=True
    )
and I'd like to grab a list of tags, ordered by how many times they're used in images. My images and tags tables contain ~200,000 and ~1,000,000 rows respectively, so there's a decent amount of data.
After a bit of messing around, I arrived at this monstrosity:
db.session.query(Tag, func.count(tags2images.c.tag_id).label("total"))\
    .join(tags2images)\
    .group_by(Tag)\
    .order_by(text("total DESC"))\
    .limit(20).all()
and while it does return a list of (Tag, count) tuples the way I want it to, it takes several seconds, which is not optimal.
I found this very helpful post (Counting relationships in SQLAlchemy) that helped me simplify the above to just
db.session.query(Tag.name, func.count(Tag.id))\
    .join(Tag.images)\
    .group_by(Tag.id)\
    .limit(20).all()
and while this is wicked fast compared to my first attempt, the output obviously isn't sorted anymore. How can I get SQLAlchemy to produce the desired result while keeping the query fast?
This seems like something you probably need to run EXPLAIN on in psql. I added a combined index on tag_id and image_id via Index('idx_tags2images', 'tag_id', 'image_id'). I'm not sure which is better, individual indices or a combined one. It may also be worth checking whether a limited subquery on just the association table, run before joining, is faster.
from sqlalchemy import select

tags2images = Table("tags2images",
    Base.metadata,
    Column("id", Integer, primary_key=True),
    Column("tag_id", Integer, ForeignKey("tags.id", ondelete="CASCADE", onupdate="CASCADE"), index=True),
    Column("image_id", Integer, ForeignKey("images.id", ondelete="CASCADE", onupdate="CASCADE"), index=True),
    Index('idx_tags2images', 'tag_id', 'image_id'),
)


class Image(Base):
    __tablename__ = "images"
    id = Column(Integer, primary_key=True)
    title = Column(String(1000), nullable=True)
    tags = relationship("Tag", secondary=tags2images, back_populates="images", passive_deletes=True)


class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String(250), nullable=False, unique=True)
    images = relationship(
        "Image",
        secondary=tags2images,
        back_populates="tags",
        passive_deletes=True
    )


with Session() as session:
    total = func.count(tags2images.c.image_id).label("total")
    # Count, group and order just the association table itself.
    sub = select(
        tags2images.c.tag_id,
        total
    ).group_by(
        tags2images.c.tag_id
    ).order_by(
        total.desc()
    ).limit(20).alias('sub')
    # Now bring in the Tag names with a join
    # we order again but this time only across 20 entries.
    # @NOTE: Subquery will not get tags with image_count == 0
    # since we use INNER join.
    q = session.query(
        Tag,
        sub.c.total
    ).join(
        sub,
        Tag.id == sub.c.tag_id
    ).order_by(sub.c.total.desc())
    for tag, image_count in q.all():
        print(tag.name, image_count)
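For comparison, the simpler single-join query from the question can also be sorted by reusing the labeled count in order_by() with .desc(), instead of passing text("total DESC"). The sketch below only demonstrates that the ordering works; it makes no claim about speed on a large table, which would still need to be checked with EXPLAIN. The sample tags, the trimmed models, and the in-memory SQLite engine are assumptions (SQLAlchemy 1.4 or newer assumed).

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

tags2images = Table(
    "tags2images",
    Base.metadata,
    Column("tag_id", Integer, ForeignKey("tags.id"), primary_key=True),
    Column("image_id", Integer, ForeignKey("images.id"), primary_key=True),
)


class Image(Base):
    __tablename__ = "images"
    id = Column(Integer, primary_key=True)
    tags = relationship("Tag", secondary=tags2images, back_populates="images")


class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False, unique=True)
    images = relationship("Image", secondary=tags2images, back_populates="tags")


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    cat, dog, bird = Tag(name="cat"), Tag(name="dog"), Tag(name="bird")
    session.add_all([
        Image(tags=[cat, dog]),
        Image(tags=[cat]),
        Image(tags=[cat, dog, bird]),
    ])
    session.commit()

    # The same labeled expression is used in the column list and in ORDER BY.
    total = func.count(tags2images.c.image_id).label("total")
    stmt = (
        select(Tag.name, total)
        .select_from(Tag)
        .join(tags2images, Tag.id == tags2images.c.tag_id)
        .group_by(Tag.id)
        .order_by(total.desc())
        .limit(2)
    )
    top_tags = [(name, n) for name, n in session.execute(stmt)]

print(top_tags)
```

Joining the association table directly avoids touching the images table at all, since the count only needs rows from tags2images.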