Querying based on related element attributes in SQLAlchemy - python

For simplicity's sake, I will use an example to illustrate my problem.
I have a database that contains a table for baskets (primary keys basket_1, basket_2,..) and a table for fruits (apple_1, apple_2, pear_1, banana_1,...).
Each fruit instance has an attribute that describes its type (apple_1, and apple_2 have an attribute type = 'apple', pear_1 has an attribute type='pear' and so on).
Each basket has a one to many relationship with the fruits (for example basket_1 has an apple_1, an apple_2 and a pear_1).
My question is, given a series of inputs such as [2x elements of type apple and 1 element of type pear], is there a straightforward way to query/find which baskets do indeed contain all those fruits?
I tried something along the lines of:
from sqlalchemy import Column, String, ForeignKey, create_engine
from sqlalchemy.orm import relationship, declarative_base, sessionmaker
# Create session (create_engine expects a database URL, not a bare file path)
database_path = "sqlite:///C:/Data/my_database.db"
engine = create_engine(database_path)
Session = sessionmaker(bind=engine)
session = Session()
# Model
Base = declarative_base()
class Basket(Base):
    __tablename__ = "baskets"
    id = Column(String, primary_key=True)
    fruits = relationship("Fruit", backref='baskets')
class Fruit(Base):
    __tablename__ = "fruits"
    id = Column(String, primary_key=True)
    type = Column(String)
    # The foreign key must reference the table name ("baskets"), not the class name
    parent_basket = Column(String, ForeignKey('baskets.id'))
# List of fruits
fruit_list = ['apple', 'apple', 'pear']
# Query baskets that contain all those elements (I currently have no idea how to set up the condition, or whether I should use a 'join' in this query)
# Placeholder only: this string stands in for the condition I do not know how to write
CONDITION_TO_FILTER_ON = "basket should contain as many fruits of each type as specified in the fruit list"
baskets = session.query(Basket).filter(CONDITION_TO_FILTER_ON)
Sorry if the phrasing/explanation is not clear enough. I've been playing around with filters but it still isn't clear enough to me how to approach this.
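
One possible way to express the missing condition, offered as a sketch rather than a definitive answer: count the requested fruit types with collections.Counter, then add one correlated COUNT subquery per type. The helper name baskets_containing, the in-memory SQLite engine and the sample rows are assumptions made for illustration; the models mirror the ones in the question. It assumes SQLAlchemy 1.4 or later.

```python
from collections import Counter

from sqlalchemy import Column, ForeignKey, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Basket(Base):
    __tablename__ = "baskets"
    id = Column(String, primary_key=True)
    fruits = relationship("Fruit", backref="baskets")


class Fruit(Base):
    __tablename__ = "fruits"
    id = Column(String, primary_key=True)
    type = Column(String)
    parent_basket = Column(String, ForeignKey("baskets.id"))


def baskets_containing(session, fruit_list):
    """Hypothetical helper: baskets holding at least the requested count of each type."""
    stmt = select(Basket).order_by(Basket.id)
    for fruit_type, needed in Counter(fruit_list).items():
        # Correlated subquery: number of fruits of this type in the current basket.
        count_subq = (
            select(func.count(Fruit.id))
            .where(Fruit.parent_basket == Basket.id, Fruit.type == fruit_type)
            .scalar_subquery()
        )
        stmt = stmt.where(count_subq >= needed)
    return session.execute(stmt).scalars().all()


# Illustrative data in an in-memory SQLite database (an assumption for the demo).
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

contents = {
    "basket_1": [("apple_1", "apple"), ("apple_2", "apple"), ("pear_1", "pear")],
    "basket_2": [("apple_3", "apple"), ("pear_2", "pear")],
    "basket_3": [("apple_4", "apple"), ("apple_5", "apple"), ("banana_1", "banana")],
}

with Session(engine) as session:
    for basket_id, fruits in contents.items():
        basket = Basket(id=basket_id)
        basket.fruits = [Fruit(id=fid, type=ftype) for fid, ftype in fruits]
        session.add(basket)
    session.commit()

    matches = baskets_containing(session, ["apple", "apple", "pear"])
    matching_ids = [basket.id for basket in matches]

print(matching_ids)  # only basket_1 has two apples and one pear
```

This treats the list as a minimum requirement, so a basket with three apples and a pear would also match; use == instead of >= in the comparison if exact counts are wanted.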

Related

SQLalchemy with column names starting and ending with underscores

Set the RDBMS_URI environment variable to a connection string like postgresql://username:password@host/database, then on Python 3.9 with PostgreSQL 15 and SQLAlchemy 1.4 run:
from os import environ
from sqlalchemy import Boolean, Column, Identity, Integer
from sqlalchemy import create_engine
from sqlalchemy.orm import declarative_base
Base = declarative_base()
class Tbl(Base):
    __tablename__ = 'Tbl'
    __has_error__ = Column(Boolean)
    id = Column(Integer, primary_key=True, server_default=Identity())
engine = create_engine(environ["RDBMS_URI"])
Base.metadata.create_all(engine)
Checking the database:
=> \d "Tbl"
                 Table "public.Tbl"
 Column |  Type   | Collation | Nullable |             Default
--------+---------+-----------+----------+----------------------------------
 id     | integer |           | not null | generated by default as identity
Indexes:
    "Tbl_pkey" PRIMARY KEY, btree (id)
How do I force the column names with double underscore to work?
I believe that the declarative machinery explicitly excludes attributes whose names start with a double underscore from the mapping process (based on this and this). Consequently your __has_error__ column is not created in the target table.
There are at least two possible workarounds. Firstly, you could give the model attribute a different name, for example:
_has_error = Column('__has_error__', Boolean)
This will create the database column __has_error__, accessed through Tbl._has_error*.
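
As a quick check of this first workaround, here is a minimal sketch that creates the table and inspects the resulting column names. It swaps PostgreSQL for an in-memory SQLite database and drops the Identity() default purely to stay self-contained; both substitutions are assumptions for the demo, not part of the original setup.

```python
from sqlalchemy import Boolean, Column, Integer, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Tbl(Base):
    __tablename__ = "Tbl"
    id = Column(Integer, primary_key=True)
    # Attribute name has a single leading underscore; the column keeps its dunder name.
    _has_error = Column("__has_error__", Boolean)


# In-memory SQLite stands in for PostgreSQL here (assumption for the demo).
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

column_names = [col["name"] for col in inspect(engine).get_columns("Tbl")]
print(column_names)
```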
If you want the model's attribute to be __has_error__, then you can achieve this by using an imperative mapping.
import sqlalchemy as sa
from sqlalchemy import orm
mapper_registry = orm.registry()
tbl = sa.Table(
    'tbl',
    mapper_registry.metadata,
    sa.Column('__has_error__', sa.Boolean),
    sa.Column(
        'id', sa.Integer, primary_key=True, server_default=sa.Identity()
    ),
)
class Tbl:
    pass
mapper_registry.map_imperatively(Tbl, tbl)
mapper_registry.metadata.create_all(engine)
* I tried using a synonym to map __has_error__ to _has_error but it didn't seem to work. It probably gets excluded in the mapper as well, but I didn't investigate further.

Flask sqlalchemy filter objects in relationship for each object

How to filter objects in relationship in one command?
Example filter: I have a list of children and every child has toys. Show me how to filter each child's toys, so that the list child.toys contains only red toys.
class Child(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(40))
    toys = db.relationship('Toy', lazy='dynamic')
class Toy(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    color = db.Column(db.String(40))
    child_id = db.Column(db.Integer, db.ForeignKey('child.id'))
 Child id | Child name
----------+------------
 1        | First
 2        | Second
 Toy id | Toy color | Toy child_id
--------+-----------+--------------
 1      | Blue      | 1
 2      | Red       | 1
 3      | Orange    | 2
 4      | Red       | 2
Desired output in python list:
 Child id | Child name | Toy id | Toy color
----------+------------+--------+-----------
 1        | First      | 2      | Red
 2        | Second     | 4      | Red
Edit:
This table will be printed by:
for child in filtered_children:
    for toy in child.toys:
        print(f'{child.id} {child.name} {toy.id} {toy.color}')
Dynamic relationship loaders provide the correct result, but you have to iterate in a for loop like this:
children = Child.query.all()
for child in children:
    child.toys = child.toys.filter_by(color='Red')
len(children[0].toys)  # should be one
len(children[1].toys)  # should be one
Is there a way to filter objects from a relationship without a for loop?
Edit:
Reformulated question: Is there a way to apply the filter in the outer query so that no additional filtering needs to be done inside the for loop, i.e. so that child.toys for each child in the loop contains only Red toys?
Since it appears you already configured some rudimentary relationship that allowed the usage of join conditions, the next step is simply to actually use the Query.join() call.
My example below uses SQLAlchemy directly, so you may need to adapt the references to their Flask-SQLAlchemy equivalents (e.g. instead of creating a session you may use db.session, which I referenced from the Quickstart). For the same reason, the Child and Toy classes I use inherit from a common sqlalchemy.ext.declarative.declarative_base(); otherwise the principles introduced below should apply equally.
First, import and set up the classes - to keep this generalized, I will be using sqlalchemy directly as noted:
from sqlalchemy import Column, Integer, String
from sqlalchemy import create_engine, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship, contains_eager
Base = declarative_base()
class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String(40))
    toys = relationship('Toy')
class Toy(Base):
    __tablename__ = 'toy'
    id = Column(Integer, primary_key=True)
    color = Column(String(40))
    child_id = Column(Integer, ForeignKey('child.id'))
Note that we have removed the extra keyword arguments for the Child.toys relationship construct, as the specified 'dynamic' style of loading is incompatible with the query desired.
Then set up the engine and add your data provided from your question:
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()
session.add(Child(name='First'))
session.add(Child(name='Second'))
session.add(Toy(color='Blue', child_id=1))
session.add(Toy(color='Red', child_id=1))
session.add(Toy(color='Orange', child_id=2))
session.add(Toy(color='Red', child_id=2))
session.commit()
With the data added, we can turn to SQLAlchemy's documentation on loading relationships, which describes a rather comprehensive relationship loading API. contains_eager is the suitable option for your use case, as it states "that the given attribute should be eagerly loaded from columns stated manually in the query." So let's see it in action:
filtered_children = session.query(Child).join(Child.toys).filter(
    Toy.color == 'Red'
).options(
    contains_eager(Child.toys)
).all()
This query ensures that the toys relationship declared through the Child is joined with the query, and that we also filter by "Red" toys, with the option to indicate that Child.toys should be "eager loaded from columns stated manually in the query" such that it will become available through each of the returned child objects.
Now, see that your most recent desired for loop over filtered_children and their toys produces your desired output:
for child in filtered_children:
    for toy in child.toys:
        print(f'{child.id} {child.name} {toy.id} {toy.color}')
The following output should be produced:
1 First 2 Red
2 Second 4 Red
If we had logging enabled, we would see the following output, which shows the SELECT statement issued by SQLAlchemy's engine:
INFO:sqlalchemy.engine.base.Engine:SELECT toy.id AS toy_id, toy.color AS toy_color, toy.child_id AS toy_child_id, child.id AS child_id, child.name AS child_name
FROM child JOIN toy ON child.id = toy.child_id
WHERE toy.color = ?
INFO:sqlalchemy.engine.base.Engine:('Red',)
DEBUG:sqlalchemy.engine.base.Engine:Col ('toy_id', 'toy_color', 'toy_child_id', 'child_id', 'child_name')
DEBUG:sqlalchemy.engine.base.Engine:Row (2, 'Red', 1, 1, 'First')
DEBUG:sqlalchemy.engine.base.Engine:Row (4, 'Red', 2, 2, 'Second')
Note that only a single SELECT ... JOIN query was issued and no additional queries were made for each Child, which would otherwise have a significant performance impact.
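
For readers on SQLAlchemy 1.4 or later, the same contains_eager query can also be sketched in the newer select() style. This is an illustrative variant, not part of the original answer: the explicit primary keys, the order_by and the use of two separate sessions are assumptions added to keep the output deterministic. The unique() call is there because results that eagerly load a collection from a join are expected to be de-duplicated in this style.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, contains_eager, declarative_base, relationship

Base = declarative_base()


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    name = Column(String(40))
    toys = relationship("Toy")


class Toy(Base):
    __tablename__ = "toy"
    id = Column(Integer, primary_key=True)
    color = Column(String(40))
    child_id = Column(Integer, ForeignKey("child.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Same data as in the question, with explicit ids for a deterministic result.
with Session(engine) as session:
    session.add_all([
        Child(id=1, name="First",
              toys=[Toy(id=1, color="Blue"), Toy(id=2, color="Red")]),
        Child(id=2, name="Second",
              toys=[Toy(id=3, color="Orange"), Toy(id=4, color="Red")]),
    ])
    session.commit()

# A fresh session, so no previously loaded toys collections are reused.
with Session(engine) as session:
    stmt = (
        select(Child)
        .join(Child.toys)
        .where(Toy.color == "Red")
        .options(contains_eager(Child.toys))
        .order_by(Child.id)
    )
    filtered_children = session.execute(stmt).unique().scalars().all()
    rows = [
        (child.id, child.name, toy.id, toy.color)
        for child in filtered_children
        for toy in child.toys
    ]

print(rows)
```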

SQLAlchemy best way to filter a table based on values from another table

I apologize in advance if my question is banal: I am a total beginner of SQL.
I want to create a simple database, with two tables: Students and Answers.
Basically, each student will answer three questions (possible answers are True or False for each question), and their answers will be stored in the Answers table.
Students can have two "experience" levels: "Undergraduate" and "Graduate".
What is the best way to obtain all Answers that were given by Students with "Graduate" experience level?
This is how I define SQLAlchemy classes for entries in Students and Answers tables:
import random
from sqlalchemy import create_engine
from sqlalchemy import Column, Integer, String, Date, Boolean, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship
db_uri = "sqlite:///simple_answers.db"
db_engine = create_engine(db_uri)
db_connect = db_engine.connect()
Session = sessionmaker()
Session.configure(bind=db_engine)
db_session = Session()
Base = declarative_base()
class Student(Base):
    __tablename__ = "Students"
    id = Column(Integer, primary_key=True)
    experience = Column(String, nullable=False)
class Answer(Base):
    __tablename__ = "Answers"
    id = Column(Integer, primary_key=True)
    student_id = Column(Integer, ForeignKey("Students.id"), nullable=False)
    answer = Column(Boolean, nullable=False)
Base.metadata.create_all(db_connect)
Then, I insert some random entries in the database:
categories_experience = ["Undergraduate", "Graduate"]
categories_answer = [True, False]
n_students = 20
n_answers_by_each_student = 3
random.seed(1)
for _ in range(n_students):
    student = Student(experience=random.choice(categories_experience))
    db_session.add(student)
    db_session.commit()
    answers = [Answer(student_id=student.id, answer=random.choice(categories_answer))
               for _ in range(n_answers_by_each_student)]
    db_session.add_all(answers)
    db_session.commit()
Then, I obtain Student.id of all "Graduate" students:
ids_graduates = db_session.query(Student.id).filter(Student.experience == "Graduate").all()
ids_graduates = [result.id for result in ids_graduates]
And finally, I select Answers from "Graduate" Students using .in_ operator:
answers_graduates = db_session.query(Answer).filter(Answer.student_id.in_(ids_graduates)).all()
I manually checked the answers, and they are right. But, since I am a total beginner of SQL, I suspect that there is some better way to achieve the same result.
Is there such an objectively "best" way (more Pythonic, more efficient...)? I would like to achieve my result with SQLAlchemy, possibly using the ORM interface.
When I asked the question, I was in a hurry.
Since then, I have had the time to study SQLAlchemy ORM documentation.
There are two recommended ways to filter tables based on values from another table.
The first way is actually very similar to what I had originally tried:
query_graduates = (
    db_session
    .query(Student.id)
    .filter(Student.experience == "Graduate")
)
query_answers_graduates = (
    db_session
    .query(Answer)
    .filter(Answer.student_id.in_(query_graduates))
)
answers_graduates = query_answers_graduates.all()
It uses the .in_ operator, which accepts as its argument either a list of objects or another query.
The second way uses the .join method:
query_answers_graduates = (
    db_session
    .query(Answer)
    .join(Student)
    .filter(Student.experience == "Graduate")
)
The second approach is more concise. I timed both solutions, and the second approach, which uses .join, is slightly faster.
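
To make the comparison concrete, here is a small self-contained sketch that runs both approaches against the same models and checks that they return the same rows. It uses the select() style available in SQLAlchemy 1.4 and later; the in-memory SQLite database and the handful of fixed rows are assumptions chosen so the result is predictable, and no timing is attempted.

```python
from sqlalchemy import (
    Boolean, Column, ForeignKey, Integer, String, create_engine, select
)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Student(Base):
    __tablename__ = "Students"
    id = Column(Integer, primary_key=True)
    experience = Column(String, nullable=False)


class Answer(Base):
    __tablename__ = "Answers"
    id = Column(Integer, primary_key=True)
    student_id = Column(Integer, ForeignKey("Students.id"), nullable=False)
    answer = Column(Boolean, nullable=False)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Students first, then their answers (fixed sample data, not random).
    session.add_all([
        Student(id=1, experience="Graduate"),
        Student(id=2, experience="Undergraduate"),
    ])
    session.commit()
    session.add_all([
        Answer(id=1, student_id=1, answer=True),
        Answer(id=2, student_id=1, answer=False),
        Answer(id=3, student_id=2, answer=True),
    ])
    session.commit()

    # Approach 1: IN with a subquery of graduate ids.
    graduate_ids = select(Student.id).where(Student.experience == "Graduate")
    stmt_in = (
        select(Answer.id)
        .where(Answer.student_id.in_(graduate_ids))
        .order_by(Answer.id)
    )
    via_in = session.execute(stmt_in).scalars().all()

    # Approach 2: explicit JOIN on the foreign key.
    stmt_join = (
        select(Answer.id)
        .join(Student, Answer.student_id == Student.id)
        .where(Student.experience == "Graduate")
        .order_by(Answer.id)
    )
    via_join = session.execute(stmt_join).scalars().all()

print(via_in, via_join)
```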
You mention SQL, but I am not sure whether you want to do this particular step in Python or in SQL. If SQL, something like this could work:
select * from Students s
inner join Answers a on s.id = a.student_id
where s.experience = 'Graduate';
Updated code
I have never used SQLAlchemy before but something similar to this may work...
from sqlalchemy import text
# String literals in SQL take single quotes; raw SQL strings are wrapped in text()
sql = text("""select s.id, a.answer from Students s
inner join Answers a on s.id = a.student_id
where s.experience = 'Graduate';""")
rows = db_session.execute(sql)
for row in rows:
    print(row)

SQLAlchemy many-to-many query

Let's say I have a blog front page with several posts that each have a number of tags (like the example at http://pythonhosted.org/Flask-SQLAlchemy/models.html#many-to-many-relationships but with posts instead of pages). How do I retrieve all tags for all shown posts in a single query with SQLAlchemy?
The way I would do it is this (I'm just curious if there's a better way):
1. Run a query that returns all relevant posts for the page.
2. Use a list comprehension to get a list of all post IDs in the above query.
3. Run a single query that gets all tags where post_id in ( [the list of post IDs I just made] )
Is that the way to do it?
This is of course not the way to do it. The purpose of an ORM like sqlalchemy is to represent the records and all relations/related records as objects which you can just work on without thinking about the underlying sql-queries.
You don't need to retrieve anything. You already have it. The tags-property of your Post()-objects is (something like) a list of Tag()-objects.
I don't know Flask-SQLAlchemy but since you asked for SQLAlchemy I feel free to post a pure SQLAlchemy example that uses the models from the Flask example (and is self contained):
#!/usr/bin/env python3
# coding: utf-8
import sqlalchemy as sqAl
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship, backref
engine = sqAl.create_engine('sqlite:///m2m.sqlite') #, echo=True)
# Base creates its own MetaData; the first positional argument of declarative_base was never the metadata
Base = declarative_base()
tags = sqAl.Table('tags', Base.metadata,
    sqAl.Column('tag_id', sqAl.Integer, sqAl.ForeignKey('tag.id')),
    sqAl.Column('page_id', sqAl.Integer, sqAl.ForeignKey('page.id'))
)
class Page(Base):
    __tablename__ = 'page'
    id = sqAl.Column(sqAl.Integer, primary_key=True)
    content = sqAl.Column(sqAl.String)
    tags = relationship('Tag', secondary=tags,
                        backref=backref('pages', lazy='dynamic'))
class Tag(Base):
    __tablename__ = 'tag'
    id = sqAl.Column(sqAl.Integer, primary_key=True)
    label = sqAl.Column(sqAl.String)
def create_sample_data(sess):
    tag_strings = ('tag1', 'tag2', 'tag3', 'tag4')
    page_strings = ('This is page 1', 'This is page 2', 'This is page 3', 'This is page 4')
    tag_obs, page_obs = [], []
    for ts in tag_strings:
        t = Tag(label=ts)
        tag_obs.append(t)
        sess.add(t)
    for ps in page_strings:
        p = Page(content=ps)
        page_obs.append(p)
        sess.add(p)
    page_obs[0].tags.append(tag_obs[0])
    page_obs[0].tags.append(tag_obs[1])
    page_obs[1].tags.append(tag_obs[2])
    page_obs[1].tags.append(tag_obs[3])
    page_obs[2].tags.append(tag_obs[0])
    page_obs[2].tags.append(tag_obs[1])
    page_obs[2].tags.append(tag_obs[2])
    page_obs[2].tags.append(tag_obs[3])
    sess.commit()
Base.metadata.create_all(engine, checkfirst=True)
session = sessionmaker(bind=engine)()
# uncomment the next line and run it once to create some sample data
# create_sample_data(session)
pages = session.query(Page).all()
for p in pages:
    print("page '{0}', content:'{1}', tags: '{2}'".format(
        p.id, p.content, ", ".join([t.label for t in p.tags])))
Yes, life can be so easy...
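
One caveat worth illustrating: with the default lazy loading, each access to p.tags in the loop above triggers its own SELECT, so the number of queries grows with the number of pages. If the goal is to fetch the tags for all shown pages up front, a loader option such as selectinload is one way to do it. The following is a minimal sketch under assumptions (SQLAlchemy 1.4 or later, an in-memory SQLite database, a reduced data set, and the association table renamed to page_tags in Python only); it is not the original answerer's code.

```python
from sqlalchemy import (
    Column, ForeignKey, Integer, String, Table, create_engine, select
)
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()

# Association table for the many-to-many relationship.
page_tags = Table(
    "tags", Base.metadata,
    Column("tag_id", Integer, ForeignKey("tag.id")),
    Column("page_id", Integer, ForeignKey("page.id")),
)


class Page(Base):
    __tablename__ = "page"
    id = Column(Integer, primary_key=True)
    content = Column(String)
    tags = relationship("Tag", secondary=page_tags)


class Tag(Base):
    __tablename__ = "tag"
    id = Column(Integer, primary_key=True)
    label = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    tag1, tag2, tag3 = Tag(label="tag1"), Tag(label="tag2"), Tag(label="tag3")
    session.add_all([
        Page(id=1, content="This is page 1", tags=[tag1, tag2]),
        Page(id=2, content="This is page 2", tags=[tag3]),
    ])
    session.commit()

with Session(engine) as session:
    # selectinload fetches the tags of all returned pages with a second SELECT,
    # instead of one lazy load per page.
    stmt = select(Page).options(selectinload(Page.tags)).order_by(Page.id)
    pages = session.execute(stmt).scalars().all()
    tags_by_page = {p.id: sorted(t.label for t in p.tags) for p in pages}

print(tags_by_page)
```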

Cannot move object from one database to another

I am trying to move an object from one database into another. The mappings are the same but the tables are different. This is a merging tool where data from an old database need to be imported into a new one. Still, I think I am missing something fundamental about SQLAlchemy here. What is it?
from sqlalchemy import Column, Float, String, Enum
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import orm
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
DeclarativeBase = declarative_base()
class Datum(DeclarativeBase):
    __tablename__ = "xUnitTestData"
    Key = Column(String, primary_key=True)
    Value = Column(Float)
    def __init__(self, k, v):
        self.Key = k
        self.Value = v
src_engine = create_engine('sqlite:///:memory:', echo=False)
dst_engine = create_engine('sqlite:///:memory:', echo=False)
DeclarativeBase.metadata.create_all(src_engine)
DeclarativeBase.metadata.create_all(dst_engine)
SessionSRC = sessionmaker(bind=src_engine)
SessionDST = sessionmaker(bind=dst_engine)
item = Datum('eek', 666)
session1 = SessionSRC()
session1.add(item)
session1.commit()
session1.close()
session2 = SessionDST()
session2.add(item)
session2.commit()
print(item in session2) # >>> True
print(session2.query(Datum).all()) # >>> []
session2.close()
I'm not really aware of what happens under the hood, but in the ORM pattern an object corresponds to a particular row in a particular table. Adding the same object to two different tables in two different databases doesn't sound like good practice, even if the table definitions are exactly the same.
What I'd do to work around this problem is create a new object that is a copy of the original object and add it to the database:
session1 = SessionSRC()
session1.add(item)
session1.commit()
new_item = Datum(item.Key, item.Value)
session2 = SessionDST()
session2.add(new_item)
session2.commit()
print(new_item in session2) # >>> True
print(session2.query(Datum).all()) # >>> [<__main__.Datum object at 0x.......>]
session2.close()
session1.close()
Note that session1 isn't closed immediately to be able to read the original object attributes while creating the new object.
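
An alternative to copying, sketched here as a possibility rather than a recommendation, is sqlalchemy.orm.make_transient, which strips the identity the object acquired in the first session so that a second session treats it as new and issues an INSERT. The sketch assumes SQLAlchemy 1.4 or later and loads the object in a fresh session before detaching it, because attributes that are expired or unloaded at the time of the call are not reloaded afterwards.

```python
from sqlalchemy import Column, Float, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base, make_transient

Base = declarative_base()


class Datum(Base):
    __tablename__ = "xUnitTestData"
    Key = Column(String, primary_key=True)
    Value = Column(Float)


src_engine = create_engine("sqlite://")
dst_engine = create_engine("sqlite://")
Base.metadata.create_all(src_engine)
Base.metadata.create_all(dst_engine)

# Populate the source database.
with Session(src_engine) as src:
    src.add(Datum(Key="eek", Value=666.0))
    src.commit()

# Load the row (so its attributes are populated), then detach it.
with Session(src_engine) as src:
    item = src.execute(select(Datum)).scalars().one()
    src.expunge(item)

# Forget the identity from the source session: the object looks brand new again.
make_transient(item)

with Session(dst_engine) as dst:
    dst.add(item)
    dst.commit()
    copied = [(d.Key, d.Value) for d in dst.execute(select(Datum)).scalars()]

print(copied)
```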
