SQLAlchemy bulk create if not exists - python

I am trying to optimize my code by reducing the calls to the Database. I have the following models:
class PageCategory(Base):
    __tablename__ = 'page_category'
    category_id = Column(Text, ForeignKey('category.category_id'), primary_key=True)
    page_id = Column(Text, ForeignKey('page.page_id'), primary_key=True)

class Category(Base):
    __tablename__ = 'category'
    category_id = Column(Text, primary_key=True)
    name = Column(Text, nullable=False)
    pages = relationship('Page', secondary='page_category')

class Page(Base):
    __tablename__ = 'page'
    page_id = Column(Text, primary_key=True)
    name = Column(Text, nullable=False)
    categories = relationship('Category', secondary='page_category')
The code receives a stream of Facebook likes, and each one comes with a Page, a Category, and the obvious relation between them, a PageCategory. I need to find a way to bulk create, if they do not already exist, the different Pages, Categories and the relations between them. Given that the code needs to be fast, I can't afford a round trip to the database for every object I create.
page = Page(page_id='1', name='1')
category = Category(category_id='2', name='2')
session.add(page)
session.add(category)
session.commit()
# ...same for PageCategory
Now, given that page_id and category_id are primary keys, the database will raise an IntegrityError if we try to insert duplicates, but that is still a round trip per object. I need a utility that receives a list of objects, like session.bulk_save_objects([page1, page2, category1, category2, page_category1, page_category2]), but creates only the objects that do not raise an IntegrityError and ignores the ones that do.
This way I would avoid database I/O for every triple of objects. I don't know whether this is possible or whether it exceeds SQLAlchemy's capabilities.
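One possible approach, sketched here under assumptions rather than taken from the thread: let the database skip the duplicates with a dialect-specific INSERT ... ON CONFLICT DO NOTHING, executed once per table with a list of rows. The sketch uses an in-memory SQLite database so it runs on its own; sqlalchemy.dialects.postgresql.insert offers the same on_conflict_do_nothing() method. The helper names bulk_create_if_missing and count_rows and the sample rows are made up for illustration.

```python
from sqlalchemy import Column, ForeignKey, Text, create_engine, func, select
from sqlalchemy.dialects.sqlite import insert  # dialect-specific INSERT construct
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Page(Base):
    __tablename__ = "page"
    page_id = Column(Text, primary_key=True)
    name = Column(Text, nullable=False)


class Category(Base):
    __tablename__ = "category"
    category_id = Column(Text, primary_key=True)
    name = Column(Text, nullable=False)


class PageCategory(Base):
    __tablename__ = "page_category"
    category_id = Column(Text, ForeignKey("category.category_id"), primary_key=True)
    page_id = Column(Text, ForeignKey("page.page_id"), primary_key=True)


def bulk_create_if_missing(session, model, rows):
    """Hypothetical helper: send all rows in one executemany call and
    let the database skip rows whose key already exists."""
    if rows:
        stmt = insert(model.__table__).on_conflict_do_nothing()
        session.execute(stmt, rows)


def count_rows(session, model):
    return session.execute(select(func.count()).select_from(model)).scalar_one()


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

# Two likes that share the same page (made-up sample data).
pages = [{"page_id": "1", "name": "Page 1"}, {"page_id": "1", "name": "Page 1"}]
categories = [{"category_id": "2", "name": "Cat 2"}, {"category_id": "3", "name": "Cat 3"}]
links = [{"page_id": "1", "category_id": "2"}, {"page_id": "1", "category_id": "3"}]

with Session(engine) as session:
    for _attempt in range(2):  # the second pass inserts nothing new
        bulk_create_if_missing(session, Page, pages)
        bulk_create_if_missing(session, Category, categories)
        bulk_create_if_missing(session, PageCategory, links)
        session.commit()
    counts = (
        count_rows(session, Page),
        count_rows(session, Category),
        count_rows(session, PageCategory),
    )

print(counts)
```

This costs one statement per table and batch instead of one round trip per object. Note that it bypasses the ORM unit of work, so no Page or Category instances are created or tracked in the session.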

Related

Is it possible to have SQL Alchemy database models point to attributes of a different model, so that when one model's data changes, so does the other?

I'm making a very simple warehouse management system, and I'd like for users to be able to create templates for items. The template will show up on a list, and then can individually be used to create instances of an item that will also gain a quantity and warehouse attribute.
The goal is, if one of the item templates gets modified to specify a different size or price, the size or price attributes of the actual item instance gets changed as well.
Here is my code in case that helps you visualize what I'm trying to do. I'm not sure if this is possible or if there is a different solution I should consider. It's my first time working with Flask SQLAlchemy.
class ItemTemplate(db.Model):
    """This template will simply store the information related to an item type.
    Individual items that will be associated with the warehouse they're stored in
    will inherit this information from the item templates."""
    _id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(15), unique=True, nullable=False)
    price = db.Column(db.Float, nullable=False)
    cost = db.Column(db.Float, nullable=False)
    size = db.Column(db.Integer, nullable=False)
    lowThreshold = db.Column(db.Integer, nullable=False)

# Actual items
class Item(db.Model):
    """This template will be used to represent the actual items that are associated with a warehouse."""
    _id = db.Column(db.Integer, primary_key=True)
    quantity = db.Column(db.Integer, primary_key=True)
    """Here I want the Item attributes to be able to just point to attributes from the ItemTemplate class.
    ItemTemplate(name='tape') <--- will be a template with the information for tape.
    Item(name='tape') <--- will be an actual instance of tape that should inherit all the attributes from the tape template.
    I want these attributes to be like pointers so that if the tape template has its name changed, for instance, to
    'scotch tape', all the Item instances that point to the tape template will have their names changed."""

# Warehouse
class Warehouse(db.Model):
    _id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(15), unique=True, nullable=False)
    capacity = db.Column(db.Integer, nullable=False)
    items = db.relationship("Item", backref="warehouse", lazy=True)
As I understand it, you just need to declare a one-to-many relationship between ItemTemplate and Item, so that one template is used for many items.
Define Model
Just try to declare their relationship like this
class ItemTemplate(db.Model):
    _id = db.Column(db.Integer, primary_key=True)
    # ... other attributes
    instances = db.relationship('Item', backref='item_template', lazy=True)

class Item(db.Model):
    _id = db.Column(db.Integer, primary_key=True)
    quantity = db.Column(db.Integer, primary_key=True)
    item_template_id = db.Column(db.Integer, db.ForeignKey('item_template._id'), nullable=False)
Docs for more information about relationship:
https://flask-sqlalchemy.palletsprojects.com/en/2.x/models/#one-to-many-relationships
Query
When querying, just join the two tables and you can select ItemTemplate.name alongside each Item:
items_qr = db.session.query(Item, ItemTemplate.name).join(ItemTemplate)
for item, item_name in items_qr:
    print(item._id, item_name)
SQLAlchemy Doc for query.join(): https://docs.sqlalchemy.org/en/14/orm/query.html#sqlalchemy.orm.Query.join
Some related SO questions that may help:
flask Sqlalchemy One to Many getting parent attributes
One-to-many Flask | SQLAlchemy
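To make the "pointer" behaviour from the question concrete, here is a minimal sketch. It assumes plain SQLAlchemy with an in-memory SQLite database instead of Flask-SQLAlchemy, a trimmed set of columns, and association_proxy for the read-through attributes (a technique not mentioned in the answer above). Item stores only the foreign key, so renaming the template is visible on every item.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class ItemTemplate(Base):
    __tablename__ = "item_template"
    _id = Column(Integer, primary_key=True)
    name = Column(String(15), unique=True, nullable=False)
    price = Column(Float, nullable=False)
    instances = relationship("Item", back_populates="item_template")


class Item(Base):
    __tablename__ = "item"
    _id = Column(Integer, primary_key=True)
    quantity = Column(Integer, nullable=False)
    item_template_id = Column(Integer, ForeignKey("item_template._id"), nullable=False)
    item_template = relationship("ItemTemplate", back_populates="instances")
    # Read-through attributes: the values live only on the template row.
    name = association_proxy("item_template", "name")
    price = association_proxy("item_template", "price")


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    tape = ItemTemplate(name="tape", price=2.5)
    session.add_all([
        Item(quantity=10, item_template=tape),
        Item(quantity=3, item_template=tape),
    ])
    session.commit()

    tape.name = "scotch tape"  # change the template once
    session.commit()
    item_names = [item.name for item in session.query(Item).order_by(Item._id)]

print(item_names)
```

Because nothing is copied onto the item rows, there is nothing to keep in sync; a plain Python property returning self.item_template.name would work the same way.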

SQLAlchemy Join to retrieve data from multiple tables

I'm trying to retrieve data from multiple tables with SQLAlchemy using the .join() method.
When I run the query, I expected to get back a single object with all the data from the joined tables, so that I could use a.area_name and so on, where area_name is on one of the joined tables. Below are the query I am running and the table layout. If anyone could offer insight into how to achieve this behavior, I would greatly appreciate it! I have been able to use .join() with this same syntax to match results and return them, and I figured it would also return the extra data from the joined rows (perhaps I am misunderstanding how the method works, or how to retrieve the information via the query object?).
If it helps with troubleshooting, I'm using MySQL as the database.
query:
a = User.query.filter(User.user_id==1).join(UserGroups,
    User.usergroup==UserGroups.group_id).join(Areas, User.area==Areas.area_id).first()
and the tables:
class User(db.Model):
    user_id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(20), unique=True)
    usergroup = db.Column(db.Integer, db.ForeignKey('user_groups.group_id'), nullable=False)
    area = db.Column(db.Integer, db.ForeignKey('areas.area_id'), nullable=False)

class UserGroups(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    group_id = db.Column(db.Integer, nullable=False, unique=True)
    group_name = db.Column(db.String(64), nullable=False, unique=True)

class Areas(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    area_id = db.Column(db.Integer, nullable=False, unique=True)
    area_name = db.Column(db.String(64), nullable=False, unique=True)
So it seems that I need to use a different approach to the query, and that it returns a tuple of objects which I then need to parse.
What worked is:
a = db.session.query(User, UserGroups, Areas
    ).filter(User.user_id==1
    ).join(UserGroups, User.usergroup==UserGroups.group_id
    ).join(Areas, User.area==Areas.area_id
    ).first()
The rest remaining the same. This then returned a tuple that I could parse where the data from User is a[0], from UserGroups is a[1], and Areas is a[2]. I can then access the group_name column with a[1].group_name etc.
Hopefully this helps someone else who's trying to work with this!
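For anyone who wants to try the tuple-returning query in isolation, here is a small self-contained sketch. It assumes plain SQLAlchemy with an in-memory SQLite database rather than Flask-SQLAlchemy and MySQL, explicit table names, and made-up sample rows.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class UserGroups(Base):
    __tablename__ = "user_groups"
    id = Column(Integer, primary_key=True)
    group_id = Column(Integer, nullable=False, unique=True)
    group_name = Column(String(64), nullable=False, unique=True)


class Areas(Base):
    __tablename__ = "areas"
    id = Column(Integer, primary_key=True)
    area_id = Column(Integer, nullable=False, unique=True)
    area_name = Column(String(64), nullable=False, unique=True)


class User(Base):
    __tablename__ = "user"
    user_id = Column(Integer, primary_key=True)
    name = Column(String(20), unique=True)
    usergroup = Column(Integer, ForeignKey("user_groups.group_id"), nullable=False)
    area = Column(Integer, ForeignKey("areas.area_id"), nullable=False)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        UserGroups(group_id=10, group_name="admins"),
        Areas(area_id=20, area_name="north"),
        User(user_id=1, name="alice", usergroup=10, area=20),
    ])
    session.commit()

    row = (
        session.query(User, UserGroups, Areas)
        .join(UserGroups, User.usergroup == UserGroups.group_id)
        .join(Areas, User.area == Areas.area_id)
        .filter(User.user_id == 1)
        .first()
    )
    # One entity per position, in the order they were passed to query().
    user, group, area = row
    summary = (user.name, group.group_name, area.area_name)

print(summary)
```

Unpacking the row into named variables reads a little better than indexing with a[0], a[1], a[2].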
Take a look at SQLAlchemy's relationship function:
http://docs.sqlalchemy.org/en/latest/orm/basic_relationships.html#one-to-many
You may want to add a new attribute to your User class like so:
group = db.relationship('UserGroups', back_populates='users')
This will automatically resolve the one-to-many relationship between User and UserGroups (assuming that a User can only be a member of one UserGroup at a time). Because back_populates is used, UserGroups needs the matching attribute users = db.relationship('User', back_populates='group'). You can then simply access the attributes of the UserGroup once you have queried a User (or set of Users) from your database:
a = User.query.filter(...).first()
print(a.group.group_name)
SQLAlchemy resolves the joins for you; you do not need to explicitly join the foreign tables when querying.
The reverse access is also possible; if you just query for a UserGroups row, you can access the corresponding members directly (via the back_populates keyword argument):
g = UserGroups.query.filter(...).first()
for u in g.users:
    print(u.name)
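Put together, the relationship approach might look like the following sketch. It assumes plain SQLAlchemy with an in-memory SQLite database instead of Flask-SQLAlchemy, explicit table names, and made-up sample rows; the Areas table is left out for brevity.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class UserGroups(Base):
    __tablename__ = "user_groups"
    id = Column(Integer, primary_key=True)
    group_id = Column(Integer, nullable=False, unique=True)
    group_name = Column(String(64), nullable=False, unique=True)
    # "One" side: all users that belong to this group.
    users = relationship("User", back_populates="group")


class User(Base):
    __tablename__ = "user"
    user_id = Column(Integer, primary_key=True)
    name = Column(String(20), unique=True)
    usergroup = Column(Integer, ForeignKey("user_groups.group_id"), nullable=False)
    # "Many" side: the single group this user belongs to.
    group = relationship("UserGroups", back_populates="users")


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    admins = UserGroups(group_id=10, group_name="admins")
    admins.users = [User(name="alice"), User(name="bob")]
    session.add(admins)
    session.commit()

    a = session.query(User).filter(User.name == "alice").first()
    group_name = a.group.group_name  # loaded on access, no explicit join

    g = session.query(UserGroups).filter(UserGroups.group_id == 10).first()
    member_names = sorted(u.name for u in g.users)

print(group_name, member_names)
```

The join condition is inferred from the ForeignKey on User.usergroup, which is why neither relationship() call needs an explicit primaryjoin.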

Delete all in a Many to Many secondary table association in sqlalchemy

I have the following models and associations:
class CartProductsAssociation(db.Model):
    __tablename__ = 'cart_products_association'
    cart_id = db.Column(db.Integer, db.ForeignKey('carts.id', ondelete='CASCADE'), primary_key=True)
    product_id = db.Column(db.Integer, db.ForeignKey('products.id', ondelete='CASCADE'), primary_key=True)
    quantity = db.Column(db.Integer)
    product = db.relationship("Product", backref="cart_associations", cascade="all,delete", passive_deletes=True)
    cart = db.relationship("Cart", backref="product_associations", cascade="all,delete", passive_deletes=True)

class Product(db.Model):
    __tablename__ = 'products'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String)
    img_path = db.Column(db.String)
    price = db.Column(db.Float, default=0.0)
    product_categories = db.relationship(
        "ProductCategory",
        secondary=product_product_categories,
        back_populates="products")
    carts = db.relationship("Cart", secondary="cart_products_association", passive_deletes=True, cascade="all,delete")

class Cart(db.Model):
    __tablename__ = 'carts'
    id = db.Column(db.Integer, primary_key=True)
    branch_id = db.Column(db.Integer, db.ForeignKey('branch.id'))
    branch = db.relationship("Branch", back_populates="carts")
    page_id = db.Column(db.Integer, db.ForeignKey('pages.id'))
    page = db.relationship("Page", back_populates="carts")
    shopper_id = db.Column(db.String, db.ForeignKey('shoppers.fb_user_id'))
    shopper = db.relationship(
        "Shopper",
        back_populates="carts")
    products = db.relationship("Product", secondary="cart_products_association")
    cart_status = db.Column(db.Enum('user_unconfirmed', 'user_confirmed', 'client_unconfirmed', 'client_confirmed', name='cart_status'), default='user_unconfirmed')
When I try to delete a product, I get the following error:
AssertionError
AssertionError: Dependency rule tried to blank-out primary key column 'cart_products_association.cart_id' on instance '<CartProductsAssociation at 0x7f5fd41721d0>'
How can I solve it?
This solved the problem:
product = models.Product.query.get(product_id)
for ass in product.cart_associations:
    db.session.delete(ass)
db.session.delete(product)
db.session.commit()
The error is caused by the back references cart_associations and product_associations created by CartProductsAssociation. Since they don't have explicit cascades set, they have the default save-update, merge, and without delete the "default behavior is to instead de-associate ... by setting their foreign key reference to NULL."
Due to this when a Product is up for deletion SQLAlchemy will first fetch the related CartProductsAssociation objects and try to set the primary key to NULL.
It seems that originally there has been an attempt to use passive_deletes=True with ondelete='CASCADE', but the passive deletes have ended up on the wrong side of the relationship pair. This should produce a warning:
sqlalchemy/orm/relationships.py:1790: SAWarning: On CartProductsAssociation.product, 'passive_deletes' is normally configured on one-to-many, one-to-one, many-to-many relationships only.
If the relationships are configured as
class CartProductsAssociation(db.Model):
    ...
    product = db.relationship(
        "Product", backref=db.backref("cart_associations",
                                      cascade="all",
                                      passive_deletes=True))
    cart = db.relationship(
        "Cart", backref=db.backref("product_associations",
                                   cascade="all",
                                   passive_deletes=True))
instead, then when a Product instance that has not loaded its related CartProductsAssociation objects is deleted, SQLAlchemy will let the DB handle cascading. Note that the SQLAlchemy delete cascade is also necessary, or the error will come back if a Product instance that has loaded its related association objects is deleted. passive_deletes="all" can also be used, if there are some special triggers or such in place in the DB that must be allowed to fire.
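A runnable sketch of that configuration, under some assumptions: plain SQLAlchemy instead of Flask-SQLAlchemy, only the columns needed for the demonstration, and an in-memory SQLite database, which enforces ON DELETE CASCADE only after PRAGMA foreign_keys=ON has been issued on each connection.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event, func, select
from sqlalchemy.orm import Session, backref, declarative_base, relationship

Base = declarative_base()


class Product(Base):
    __tablename__ = "products"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Cart(Base):
    __tablename__ = "carts"
    id = Column(Integer, primary_key=True)


class CartProductsAssociation(Base):
    __tablename__ = "cart_products_association"
    cart_id = Column(Integer, ForeignKey("carts.id", ondelete="CASCADE"), primary_key=True)
    product_id = Column(Integer, ForeignKey("products.id", ondelete="CASCADE"), primary_key=True)
    quantity = Column(Integer)
    # cascade and passive_deletes sit on the one-to-many (backref) side.
    product = relationship(
        "Product",
        backref=backref("cart_associations", cascade="all", passive_deletes=True))
    cart = relationship(
        "Cart",
        backref=backref("product_associations", cascade="all", passive_deletes=True))


engine = create_engine("sqlite://")  # in-memory database, for illustration only


@event.listens_for(engine, "connect")
def enable_sqlite_foreign_keys(dbapi_connection, connection_record):
    # SQLite ignores foreign key actions unless this pragma is set.
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()


Base.metadata.create_all(engine)

with Session(engine) as session:
    product, cart = Product(name="tape"), Cart()
    session.add(CartProductsAssociation(product=product, cart=cart, quantity=2))
    session.commit()
    product_id = product.id

with Session(engine) as session:
    product = session.get(Product, product_id)  # cart_associations not loaded
    session.delete(product)  # the database cascades to the association row
    session.commit()
    remaining_links = session.execute(
        select(func.count()).select_from(CartProductsAssociation)).scalar_one()
    remaining_carts = session.execute(
        select(func.count()).select_from(Cart)).scalar_one()

print(remaining_links, remaining_carts)
```

The association row disappears while the Cart survives, and no AssertionError is raised because SQLAlchemy never tries to null out the primary key columns.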
When deleting a Product that has loaded both carts and cart_associations the situation is even more complicated, because both association object pattern and a many to many relationship are in use, and the 2 relationships do not coordinate changes together – see the warning in "Association Object". You might want to consider either making the other relationship viewonly, or use the association proxy extension across the association object relationship:
class Product(db.Model):
    ...
    carts = association_proxy(
        'cart_associations', 'cart',
        creator=lambda cart: CartProductsAssociation(cart=cart))
Finally, the delete cascade in Product.carts is a bit odd, though may be as designed, and will delete the related Cart objects along with the Product if they have been loaded, and additionally removes rows from the secondary table. On the other hand that relationship has passive deletes also, so the Cart objects are not deleted if not loaded when the Product is deleted, which would seem to conflict with the SQLAlchemy cascade.

How to create a field with a list of foreign keys in SQLAlchemy?

I am trying to store a list of models within the field of another model. Here is a trivial example below, where I have an existing model, Actor, and I want to create a new model, Movie, with the field Movie.list_of_actors:
import uuid
from sqlalchemy import Boolean, Column, Integer, String, DateTime
from sqlalchemy.schema import ForeignKey
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Actor(Base):
    __tablename__ = 'actors'
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name = Column(String)
    nickname = Column(String)
    academy_awards = Column(Integer)

# This is my new model:
class Movie(Base):
    __tablename__ = 'movies'
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    title = Column(String)
    # How do I make this a list of foreign keys???
    list_of_actors = Column(UUID(as_uuid=True), ForeignKey('actors.id'))
I understand that this can be done with a many-to-many relationship, but is there a simpler solution? Note that I don't need to look up which Movies an Actor is in; I just want to create a new Movie model and access the list of my Actors. Ideally, I would prefer not to add any new fields to my Actor model.
I've gone through the tutorials on the relationships API, which outline the various one-to-many/many-to-many combinations using back_populates and backref (http://docs.sqlalchemy.org/en/latest/orm/basic_relationships.html), but I can't seem to implement my list of foreign keys without creating a full-blown many-to-many implementation.
But if a many-to-many implementation is the only way to proceed, is there a way to implement it without having to create an "association table"? The "association table" is described here: http://docs.sqlalchemy.org/en/latest/orm/basic_relationships.html#many-to-many. Either way, an example would be very helpful!
Also, if it matters, I am using Postgres 9.5. I see from this post there might be support for arrays in Postgres, so any thoughts on that could be helpful.
Update
It looks like the only reasonable approach here is to create an association table, as shown in the selected answer below. I tried using ARRAY from SQLAlchemy's Postgres Dialect but it doesn't seem to support Foreign Keys. In my example above, I used the following column:
list_of_actors = Column('actors', postgresql.ARRAY(ForeignKey('actors.id')))
but it gives me an error. It seems like support for Postgres ARRAY with Foreign Keys is in progress, but still isn't quite there. Here is the most up to date source of information that I found: http://blog.2ndquadrant.com/postgresql-9-3-development-array-element-foreign-keys/
If you want many actors to be associated to a movie, and many movies be associated to an actor, you want a many-to-many. This means you need an association table. Otherwise, you could chuck away normalisation and use a NoSQL database.
An association table solution might resemble:
class Actor(Base):
    __tablename__ = 'actors'
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name = Column(String)
    nickname = Column(String)
    academy_awards = Column(Integer)

class Movie(Base):
    __tablename__ = 'movies'
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    title = Column(String)
    actors = relationship('ActorMovie', uselist=True, backref='movies')

class ActorMovie(Base):
    __tablename__ = 'actor_movies'
    # A mapped class needs a primary key; here it is the pair of foreign keys.
    actor_id = Column(UUID(as_uuid=True), ForeignKey('actors.id'), primary_key=True)
    movie_id = Column(UUID(as_uuid=True), ForeignKey('movies.id'), primary_key=True)
If you don't want ActorMovie to be an object inheriting from Base, you could use sqlalchemy.schema.Table.
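As a rough sketch of the Table variant: the association table is declared once and passed as secondary, so Movie.actors is a plain list of Actor objects and Actor gets no new attribute. To keep the example runnable without PostgreSQL, it assumes an in-memory SQLite database and stores the UUIDs as 36-character strings; the new_id helper and the sample rows are made up for illustration.

```python
import uuid

from sqlalchemy import Column, ForeignKey, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


def new_id():
    # Hypothetical helper: UUIDs stored as strings so the sketch runs on SQLite.
    return str(uuid.uuid4())


# Plain association table; no mapped class is needed for it.
actor_movies = Table(
    "actor_movies",
    Base.metadata,
    Column("actor_id", String(36), ForeignKey("actors.id"), primary_key=True),
    Column("movie_id", String(36), ForeignKey("movies.id"), primary_key=True),
)


class Actor(Base):
    __tablename__ = "actors"
    id = Column(String(36), primary_key=True, default=new_id)
    name = Column(String)


class Movie(Base):
    __tablename__ = "movies"
    id = Column(String(36), primary_key=True, default=new_id)
    title = Column(String)
    # One-directional: nothing is added to Actor.
    actors = relationship("Actor", secondary=actor_movies)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    movie = Movie(
        title="Example Movie",
        actors=[Actor(name="Actor A"), Actor(name="Actor B")],
    )
    session.add(movie)
    session.commit()

    loaded = session.query(Movie).filter_by(title="Example Movie").one()
    actor_names = sorted(actor.name for actor in loaded.actors)

print(actor_names)
```

SQLAlchemy writes the rows of actor_movies itself when the list is modified, so the extra table stays an implementation detail.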

sqlalchemy foreign key relationship attributes

I have a User table and a Friend table. The Friend table holds two foreign keys both to my User table as well as a status field. I am trying to be able to call attributes from my User table on a Friend object. For example, I would love to be able to do something like, friend.name, or friend.email.
class User(Base):
    """ Holds user info """
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(25), unique=True)
    email = Column(String(50), unique=True)
    password = Column(String(25))
    admin = Column(Boolean)
    # relationships
    friends = relationship('Friend', backref='Friend.friend_id', primaryjoin='User.id==Friend.user_id', lazy='dynamic')

class Friend(Base):
    __tablename__ = 'friend'
    user_id = Column(Integer, ForeignKey(User.id), primary_key=True)
    friend_id = Column(Integer, ForeignKey(User.id), primary_key=True)
    request_status = Column(Boolean)
When I get Friend objects, all I have is the two user ids, and I want to display all properties of each user so I can use that information in forms, etc. I am new to SQLAlchemy and still trying to learn its more advanced features. This is just a snippet from a larger Flask project, and this feature is going to be for friend requests, etc. I've tried to look up association objects, etc., but I am having a hard time with it.
Any help would be greatly appreciated.
First, if you're using Flask-SQLAlchemy, why are you using SQLAlchemy directly instead of Flask-SQLAlchemy's db.Model?
I strongly recommend using the Flask-SQLAlchemy extension, since it manages the sessions and provides some other neat things.
Creating a proxy convenience object is straightforward. Just add the relationship with it in the Friend class.
class Friend(Base):
    __tablename__ = 'friend'
    user_id = Column(Integer, ForeignKey(User.id), primary_key=True)
    friend_id = Column(Integer, ForeignKey(User.id), primary_key=True)
    request_status = Column(Boolean)
    user = relationship('User', foreign_keys='Friend.user_id')
    friend = relationship('User', foreign_keys='Friend.friend_id')
SQLAlchemy will take care of the rest and you can access the user object simply by:
name = friend.user.name
If you plan to use the user object every time you use the friend object, specify lazy='joined' in the relationship. This way both objects are loaded in a single query.
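A minimal end-to-end sketch of this, assuming plain SQLAlchemy with an in-memory SQLite database, a trimmed User model, and made-up sample users:

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String(25), unique=True)
    email = Column(String(50), unique=True)


class Friend(Base):
    __tablename__ = "friend"
    user_id = Column(Integer, ForeignKey("user.id"), primary_key=True)
    friend_id = Column(Integer, ForeignKey("user.id"), primary_key=True)
    request_status = Column(Boolean)
    # Two relationships to the same table, so each names its own foreign key.
    # lazy="joined" loads both users in the same SELECT as the Friend row.
    user = relationship("User", foreign_keys=[user_id], lazy="joined")
    friend = relationship("User", foreign_keys=[friend_id], lazy="joined")


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice = User(name="alice", email="alice@example.com")
    bob = User(name="bob", email="bob@example.com")
    session.add(Friend(user=alice, friend=bob, request_status=True))
    session.commit()

    link = session.query(Friend).first()
    pair = (link.user.name, link.friend.name, link.friend.email)

print(pair)
```

Without foreign_keys, SQLAlchemy cannot tell which of the two columns each relationship should follow and refuses to configure the mapper.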
