I have a table defined with relationships and I noticed that even though I don't use joins in my query, the information is still retrieved:
class Employee(Base):
    __tablename__ = "t_employee"
    id = Column(Identifier(20), Sequence('%s_id_seq' % __tablename__), primary_key=True, nullable=False)
    jobs = relationship("EmployeeJob")
    roles = relationship("EmployeeRole")

class EmployeeJob(Base):
    __tablename__ = "t_employee_job"
    id = Column(Integer(20), Sequence('%s_id_seq' % __tablename__), primary_key=True, nullable=False)
    employee_id = Column(Integer(20), ForeignKey('t_employee.id', ondelete="CASCADE"), primary_key=True)
    job_id = Column(Integer(20), ForeignKey('t_job.id', ondelete="CASCADE"), primary_key=True)

class EmployeeRole(Base):
    __tablename__ = "t_employee_role"
    id = Column(Integer(20), Sequence('%s_id_seq' % __tablename__), primary_key=True, nullable=False)
    employee_id = Column(Integer(20), ForeignKey('t_employee.id', ondelete="CASCADE"), nullable=False)
    location_id = Column(Identifier(20), ForeignKey('t_location.id', ondelete="CASCADE"))
    role_id = Column(Integer(20), ForeignKey('t_role.id', ondelete="CASCADE"), nullable=False)
session.query(Employee).all() also retrieves the roles and jobs, but it does so by issuing a separate query to the database for each row.
I have 2 questions about this situation:
1. In terms of performance, I guess I should do the join myself. Am I correct?
2. How do I map a table to a certain data structure? For example, I want to get the list of employees with their roles where each role should be represented by an Array of location ID and role ID e.g. {id:1, jobs:[1,2,3], roles:[[1,1],[1,2],[2,3]]}
1) Please read the Eager Loading section of the SQLAlchemy documentation.
By default, a relationship is lazy-loaded the first time it is accessed. In your case you could use, for example, Joined Load, so that the related rows are loaded in the same query:
from sqlalchemy.orm import joinedload

qry = (session.query(Employee).
       options(joinedload(Employee.jobs)).
       options(joinedload(Employee.roles))
       ).all()
If you want those relationships to be always loaded when an Employee is loaded, you can configure the relationship to automatically be loaded:
class Employee(Base):
    # ...
    jobs = relationship("EmployeeJob", lazy="joined")
    roles = relationship("EmployeeRole", lazy="subquery")
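To make the difference between lazy and joined loading concrete, here is a minimal sketch that counts the SELECT statements issued in each case. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the trimmed-down models (plain Integer columns, no sequences) and the count_selects helper are stand-ins invented for the example, not the models from the question.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()


class Employee(Base):
    __tablename__ = "t_employee"
    id = Column(Integer, primary_key=True)
    jobs = relationship("EmployeeJob")  # default: lazy loading


class EmployeeJob(Base):
    __tablename__ = "t_employee_job"
    id = Column(Integer, primary_key=True)
    employee_id = Column(Integer, ForeignKey("t_employee.id"), nullable=False)
    job_id = Column(Integer, nullable=False)


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

# Three employees with two jobs each.
with Session(engine) as session:
    session.add_all(
        Employee(id=i, jobs=[EmployeeJob(job_id=10 * i + k) for k in range(2)])
        for i in (1, 2, 3)
    )
    session.commit()

selects = []


@event.listens_for(engine, "before_cursor_execute")
def record_select(conn, cursor, statement, parameters, context, executemany):
    # Remember every SELECT that reaches the database.
    if statement.lstrip().upper().startswith("SELECT"):
        selects.append(statement)


def count_selects(use_joinedload):
    """Load all employees, touch their jobs, and return the number of SELECTs."""
    selects.clear()
    with Session(engine) as session:
        query = session.query(Employee)
        if use_joinedload:
            query = query.options(joinedload(Employee.jobs))
        for employee in query.all():
            _ = [job.job_id for job in employee.jobs]
    return len(selects)


lazy_selects = count_selects(use_joinedload=False)   # one query plus one per employee
eager_selects = count_selects(use_joinedload=True)   # a single joined query
print(lazy_selects, eager_selects)
```

With lazy loading the number of statements grows with the number of employees, while the joined variant stays at one statement regardless of how many rows come back.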
2) Just create a method to extract the data structure from your query. Something like below should do it (using qry from first part of the answer):
def get_structure(qry):
    res = [{"id": e.id,
            "jobs": [j.id for j in e.jobs],
            "roles": [[r.location_id, r.role_id] for r in e.roles],
            }
           for e in qry
           ]
    return res
Also note: your EmployeeJob table has an odd primary key, which includes the id column as well as the two ForeignKey columns. I think you should choose one or the other.
I have finally found a way to solve my second issue and decided to answer my own question for the benefit of others:
from sqlalchemy.ext.hybrid import hybrid_property

class Employee(Base):
    __tablename__ = "t_employee"
    id = Column(Identifier(20), Sequence('%s_id_seq' % __tablename__), primary_key=True, nullable=False)
    _jobs = relationship("EmployeeJob", lazy="joined", cascade="all, delete, delete-orphan")
    _roles = relationship("EmployeeRole", lazy="joined", cascade="all, delete, delete-orphan")

    @hybrid_property
    def jobs(self):
        # Expose the job IDs rather than the EmployeeJob rows
        return [item.job_id for item in self._jobs]

    @jobs.setter
    def jobs(self, value):
        self._jobs = [EmployeeJob(job_id=job_id) for job_id in value]

    @hybrid_property
    def roles(self):
        # Each role is represented as [location_id, role_id]
        return [[item.location_id, item.role_id] for item in self._roles]

    @roles.setter
    def roles(self, value):
        self._roles = [EmployeeRole(location_id=l_id, role_id=r_id) for l_id, r_id in value]
The cascade on each relationship ensures that orphaned rows are deleted once the list is replaced, and the decorators define the getter and setter of each complex property.
Thank you van for pointing me in the right direction!
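As a rough illustration of the delete-orphan behaviour described in this answer, the sketch below replaces the list through the setter and then checks which rows are left in the child table. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database, and it uses simplified Integer columns and only the jobs relationship; these simplifications are assumptions made for the example.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class EmployeeJob(Base):
    __tablename__ = "t_employee_job"
    id = Column(Integer, primary_key=True)
    employee_id = Column(Integer, ForeignKey("t_employee.id"), nullable=False)
    job_id = Column(Integer, nullable=False)


class Employee(Base):
    __tablename__ = "t_employee"
    id = Column(Integer, primary_key=True)
    _jobs = relationship("EmployeeJob", lazy="joined", cascade="all, delete-orphan")

    @hybrid_property
    def jobs(self):
        return [item.job_id for item in self._jobs]

    @jobs.setter
    def jobs(self, value):
        # Replacing the collection orphans the old EmployeeJob rows.
        self._jobs = [EmployeeJob(job_id=job_id) for job_id in value]


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    employee = Employee(id=1)
    employee.jobs = [1, 2, 3]
    session.add(employee)
    session.commit()

    employee.jobs = [4]  # the three earlier rows become orphans
    session.commit()

    remaining_job_ids = [row.job_id for row in session.query(EmployeeJob).all()]
    jobs_after_update = employee.jobs

print(remaining_job_ids, jobs_after_update)
```

Note that the sketch only reads and writes jobs on instances. Accessing Employee.jobs on the class would run the list comprehension against the mapped attribute rather than a loaded collection, so a getter written this way is not usable inside filter().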
I have a many-to-many relationship between users and roles, and I want to select the users that have specific roles using relationships.
For example, I want to get the users having:
roles = ["role_1", "role_2", "role_3"]
so I tried
query.filter(Users.roles.contains(roles))
(where roles is a List[Roles])
but I got
sqlalchemy.exc.ArgumentError: Mapped instance expected for relationship comparison to object. Classes, queries and other SQL elements are not accepted in this context; for comparison with a subquery, use Users.roles.has(**criteria).
then I tried
query.filter(Users.roles.has(Roles.name.in_(roles)))
where roles is now a List[str],
and I got
sqlalchemy.exc.InvalidRequestError: 'has()' not implemented for collections. Use any().
But any() selects entries that have any of the given roles, whereas I need entries that have all of the required roles. So how do I select them the right way, using relationships instead of joins?
class Users(sa.Model):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True, autoincrement=True)
    login = Column(String(50), unique=False)
    roles = relationship('Roles', secondary='user_roles_map',
                         cascade='all, delete')

class Roles(sa.Model):
    __tablename__ = 'roles'
    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String(40), unique=True)

class UserRolesMap(sa.Model):
    __tablename__ = 'user_roles_map'
    id_seq = Sequence(__tablename__ + "_id_seq")
    id = Column(Integer(), id_seq, server_default=id_seq.next_value(),
                unique=True, nullable=False)
    user_id = Column(
        Integer, ForeignKey('users.id'),
        primary_key=True)
    role_id = Column(
        Integer, ForeignKey('roles.id'),
        primary_key=True)
I didn't find what I was looking for, so for now I just wrote it with joins:
query = db_session.query(Users) \
    .filter_by(**parameters) \
    .join(UserRolesMap, UserRolesMap.user_id == Users.id) \
    .filter(UserRolesMap.role_id.in_(roles_ids)) \
    .group_by(Users) \
    .having(func.count(UserRolesMap.role_id) >= len(roles_ids))
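where roles_ids was collected from the Roles table beforehand. Note that because the filter already restricts the joined rows to the required roles, and (user_id, role_id) is the primary key of the mapping table, the count can never exceed len(roles_ids), so replacing ">=" with "==" gives the same result. Selecting users that have only the required roles and no others would need an additional condition.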
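For the relationship-only variant the question asks about, one possible approach is to chain one any() filter per required role, since successive filter() calls are combined with AND. The sketch below assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the plain declarative base, the Table-based mapping table, and the users_with_all_roles helper are assumptions made for the example rather than the code from the question.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Association table for the many-to-many relationship.
user_roles_map = Table(
    "user_roles_map",
    Base.metadata,
    Column("user_id", Integer, ForeignKey("users.id"), primary_key=True),
    Column("role_id", Integer, ForeignKey("roles.id"), primary_key=True),
)


class Roles(Base):
    __tablename__ = "roles"
    id = Column(Integer, primary_key=True)
    name = Column(String(40), unique=True)


class Users(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    login = Column(String(50))
    roles = relationship("Roles", secondary=user_roles_map)


def users_with_all_roles(session, role_names):
    """Return users that hold every role in role_names (hypothetical helper)."""
    query = session.query(Users)
    for name in role_names:
        # Each any() becomes its own EXISTS subquery; the filters are ANDed.
        query = query.filter(Users.roles.any(Roles.name == name))
    return query.order_by(Users.login).all()


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    r1, r2, r3 = (Roles(name=n) for n in ("role_1", "role_2", "role_3"))
    session.add_all([
        Users(login="alice", roles=[r1, r2, r3]),
        Users(login="bob", roles=[r1, r2]),
        Users(login="carol", roles=[r1, r2, r3]),
    ])
    session.commit()
    wanted = ["role_1", "role_2", "role_3"]
    matching_logins = [u.login for u in users_with_all_roles(session, wanted)]

print(matching_logins)
```

This produces one EXISTS subquery per role, so for a long list of roles the GROUP BY / HAVING query may well be the cheaper option.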
I am in a confusing situation: I have a one-to-many relationship, and I want to query all of the parent table's rows but get only those rows from the child table that satisfy site_id = 100.
class Policy(Base):
    """table containing details for Policies"""
    __tablename__ = "UmbrellaPolicy"
    id = Column(Integer, primary_key=True)
    policy_id = Column(Integer, nullable=False, index=True)
    user_defined_name = Column(String(255), nullable=True)
and the child looks like this:
class Site(Base):
    __tablename__ = "Site"
    id = Column(Integer, primary_key=True)
    policy_id = Column(Integer, ForeignKey("Policy.id"))
    site_id = Column(String(32), nullable=False, index=True)
    policy = relationship("Policy", backref="sites")
You should be able to filter on joined relations like this:
parents = Policy.objects.filter(site__site_id=100)
You can find more info about the Django query API here. A filter on a relation generally has the form classname__columnname; there are many other ways to filter and query that you can look up in the docs.
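The models in the question are SQLAlchemy declarative classes rather than Django models, so for comparison here is a hedged SQLAlchemy sketch of "all parents, but only the matching children". It puts the site_id condition into the ON clause of an outer join and uses contains_eager() so that Policy.sites is populated from the joined rows. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the table name "Policy" (chosen so the foreign key resolves), the back_populates pair, and the policies_with_sites_for helper are assumptions made for the example.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, and_, create_engine
from sqlalchemy.orm import Session, contains_eager, declarative_base, relationship

Base = declarative_base()


class Policy(Base):
    __tablename__ = "Policy"  # assumption: matches ForeignKey("Policy.id")
    id = Column(Integer, primary_key=True)
    user_defined_name = Column(String(255), nullable=True)
    sites = relationship("Site", back_populates="policy")


class Site(Base):
    __tablename__ = "Site"
    id = Column(Integer, primary_key=True)
    policy_id = Column(Integer, ForeignKey("Policy.id"))
    site_id = Column(String(32), nullable=False, index=True)
    policy = relationship("Policy", back_populates="sites")


def policies_with_sites_for(session, site_id):
    """All policies; each .sites holds only rows with the given site_id."""
    return (
        session.query(Policy)
        # The site filter lives in the ON clause, so policies without a
        # matching site are still returned (with an empty collection).
        .outerjoin(Site, and_(Site.policy_id == Policy.id, Site.site_id == site_id))
        .options(contains_eager(Policy.sites))
        .order_by(Policy.id)
        .all()
    )


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Policy(id=1, sites=[Site(site_id="100"), Site(site_id="200")]),
        Policy(id=2, sites=[Site(site_id="300")]),
    ])
    session.commit()

# A fresh session, so the collections come only from the filtered join.
with Session(engine) as session:
    loaded = {
        p.id: [s.site_id for s in p.sites]
        for p in policies_with_sites_for(session, "100")
    }

print(loaded)
```

Keep in mind that the sites collections loaded this way are deliberately partial, so they should be treated as read-only for the lifetime of that session.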
I have the following models defined:
class Attribute(Base):
    __tablename__ = "attributes"
    id = Column(BigInteger, primary_key=True, index=True)
    data_id = Column(BigInteger, ForeignKey("data.art_no"))
    name = Column(VARCHAR(500), index=True)
    data = relationship("Data", back_populates="attributes")

class Data(Base):
    __tablename__ = "data"
    art_no = Column(BigInteger, primary_key=True, index=True)
    multiplier = Column(Float)
    attributes = relationship("Attribute", back_populates="data", cascade="all, delete, delete-orphan")
If I query for a Data object, I get this for attributes:
[<app.db.models.Attribute object at 0x10d755d30>]
But I want to get:
['attribute name X']
In other words, the attributes field should be an array of the Attribute.name values of the joined attributes.
My current query is:
db.query(models.Data).all()
How do I need to modify my query so that the attributes field of Data contains not Attribute objects but just the name strings of the Attributes?
I hope you understand the question well ;)
db.query(models.Data).all() returns an array of Data objects. So you can define a custom property on the Data class to extract names from attributes relationship:
class Attribute(Base):
    __tablename__ = "attributes"
    id = Column(BigInteger, primary_key=True, index=True)
    data_id = Column(BigInteger, ForeignKey("data.art_no"))
    name = Column(VARCHAR(500), index=True)
    data = relationship("Data", back_populates="attributes_rel")

class Data(Base):
    __tablename__ = "data"
    art_no = Column(BigInteger, primary_key=True, index=True)
    multiplier = Column(Float)
    attributes_rel = relationship("Attribute", back_populates="data", cascade="all, delete, delete-orphan")

    @property
    def attributes(self):
        # Plain list of names instead of Attribute objects
        return [attribute.name for attribute in self.attributes_rel]
Note that by default SQLAlchemy will fetch attributes_rel for each Data object separately upon access. This might result in the N+1 selects problem. To avoid that, you should specify a relationship loading technique.
Also take a look at with_entities and hybrid attributes.
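As one example of such a loading technique, the sketch below combines the custom property with selectinload(), which fetches the attributes of all loaded Data rows in a single extra query. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; Integer and String columns replace BigInteger and VARCHAR to keep the SQLite setup simple, which is an assumption made for the example.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()


class Attribute(Base):
    __tablename__ = "attributes"
    id = Column(Integer, primary_key=True)
    data_id = Column(Integer, ForeignKey("data.art_no"))
    name = Column(String(500), index=True)
    data = relationship("Data", back_populates="attributes_rel")


class Data(Base):
    __tablename__ = "data"
    art_no = Column(Integer, primary_key=True)
    attributes_rel = relationship("Attribute", back_populates="data")

    @property
    def attributes(self):
        return [attribute.name for attribute in self.attributes_rel]


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Data(art_no=1, attributes_rel=[Attribute(name="color"), Attribute(name="size")]),
        Data(art_no=2, attributes_rel=[Attribute(name="weight")]),
    ])
    session.commit()

selects = []


@event.listens_for(engine, "before_cursor_execute")
def record_select(conn, cursor, statement, parameters, context, executemany):
    # Remember every SELECT so the number of round trips can be checked.
    if statement.lstrip().upper().startswith("SELECT"):
        selects.append(statement)


with Session(engine) as session:
    rows = (
        session.query(Data)
        .options(selectinload(Data.attributes_rel))  # one extra query for all rows
        .order_by(Data.art_no)
        .all()
    )
    names_by_art_no = {d.art_no: sorted(d.attributes) for d in rows}

select_count = len(selects)
print(names_by_art_no, select_count)
```

Without the selectinload() option, the same loop would issue one additional SELECT per Data row the first time attributes is read.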
So, I have a model that is something like:
class Foo(Model):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    data = relationship(
        "FooData",
        cascade="all, delete-orphan",
        backref="foo",
        lazy="dynamic",
        order_by="desc(FooData.timestamp)"
    )

    @property
    def first_item(self):
        # the problem is here:
        return self.data.order_by(asc("timestamp")).first()

    @property
    def latest_item(self):
        return self.data.first()

class FooData(Model):
    __tablename__ = "foo_data"
    foo_id = Column(Integer, ForeignKey("foo.id"), primary_key=True)
    timestamp = Column(DateTime, primary_key=True)
    actual_data = Column(Float, nullable=False)
So, the problem is with the first_item method there: when it is defined as above, the SQL looks like this:
SELECT foo_data.timestamp AS foo_data_timestamp, foo_data.actual_data AS foo_data_actual_data, foo_data.foo_id AS foo_data_foo_id
FROM foo_data
WHERE :param_1 = foo_data.foo_id ORDER BY foo_data.timestamp DESC, foo_data.timestamp ASC
-- ^^^^^^^^^^^^^^^^^^^^^^
Obviously, the order_by specified in the query is being appended to the one specified in the relationship definition, instead of replacing it; is there a way for a query to override the original order_by? I know I could specify a separate query directly on the FooData class, but I would like to avoid that if possible.
According to documentation:
All existing ORDER BY settings can be suppressed by passing None - this will suppress any ORDER BY configured on mappers as well.
So the simple solution is to reset the ORDER BY clause and then apply the one you need, like this:
self.data.order_by(None).order_by(asc("timestamp")).first()
If you don't want to reset the whole ORDER BY clause but only want to override the ordering of one column, AFAIK there is no built-in way to do it.
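Here is a minimal runnable sketch of the order_by(None) reset on a dynamic relationship. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the declarative base, the sample rows, and the use of FooData.timestamp.asc() in place of asc("timestamp") are choices made for the example.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Float, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Foo(Base):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    data = relationship(
        "FooData",
        cascade="all, delete-orphan",
        backref="foo",
        lazy="dynamic",
        order_by="desc(FooData.timestamp)",
    )

    @property
    def first_item(self):
        # order_by(None) drops the relationship's DESC ordering first,
        # so only the ascending ordering ends up in the SQL.
        return self.data.order_by(None).order_by(FooData.timestamp.asc()).first()

    @property
    def latest_item(self):
        # Relies on the relationship's own "timestamp DESC" ordering.
        return self.data.first()


class FooData(Base):
    __tablename__ = "foo_data"
    foo_id = Column(Integer, ForeignKey("foo.id"), primary_key=True)
    timestamp = Column(DateTime, primary_key=True)
    actual_data = Column(Float, nullable=False)


engine = create_engine("sqlite://")  # assumption: throwaway in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Foo(id=1))
    session.add_all([
        FooData(foo_id=1, timestamp=datetime(2020, 1, 1), actual_data=1.0),
        FooData(foo_id=1, timestamp=datetime(2020, 1, 2), actual_data=2.0),
        FooData(foo_id=1, timestamp=datetime(2020, 1, 3), actual_data=3.0),
    ])
    session.commit()

    foo = session.query(Foo).filter_by(id=1).one()
    first_value = foo.first_item.actual_data
    latest_value = foo.latest_item.actual_data

print(first_value, latest_value)
```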
I know this is an old post, but it showed up when I was searching, so maybe this will be useful to someone else:
class Foo(Model):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    data = relationship(
        "FooData",
        cascade="all, delete-orphan",
        backref="foo",
        lazy="dynamic",
        order_by=lambda: FooData.__table__.columns.timestamp.desc()
    )
    ...

class FooData(Model):
    __tablename__ = "foo_data"
    foo_id = Column(Integer, ForeignKey("foo.id"), primary_key=True)
    timestamp = Column(DateTime, primary_key=True)
    actual_data = Column(Float, nullable=False)
There are two tables, where one column of table A points to table B's primary key.
But they live in different databases, so I cannot configure a foreign key between them.
Configuring this via relationship() is unavailable, so I implemented a property attribute manually.
class User(Base):
    __tablename__ = 'users'
    id = Column(BigInteger, id_seq, primary_key=True)
    name = Column(Unicode(256))

class Article(Base):
    __tablename__ = 'articles'
    __bind_key__ = 'another_engine'
    # I am using a custom session that binds each mapper
    # to one of multiple database engines via this attribute.
    id = Column(BigInteger, id_seq, primary_key=True)
    author_id = Column(BigInteger, nullable=False, index=True)
    body = Column(UnicodeText, nullable=False)

    @property
    def author(self):
        _session = object_session(self)
        return _session.query(User).get(self.author_id)

    @author.setter
    def author(self, user):
        if not isinstance(user, User):
            raise TypeError('user must be an instance of User')
        self.author_id = user.id
This code works well for simple operations. But it leads to messy queries that make SQLAlchemy's features pointless.
Code that would be simple if the attribute were configured via relationship() (e.g. query.filter_by(author=me)) becomes messy (e.g. query.filter_by(author_id=me.id)).
Relationship-related features (e.g. joins) can never be used when building queries.
Can I at least use the property attribute when building query criteria (filter()/filter_by())?
You can still use relationship() here. If you stick to "lazy loading", it will query for the related item in database B after loading the lead item in database A. You can place a ForeignKey() directive in the Column, even if there isn't a real one in the database. Or you can use primaryjoin directly:
class User(Base):
    __tablename__ = 'users'
    id = Column(BigInteger, id_seq, primary_key=True)
    name = Column(Unicode(256))

class Article(Base):
    __tablename__ = 'articles'
    __bind_key__ = 'another_engine'
    id = Column(BigInteger, id_seq, primary_key=True)
    author_id = Column(BigInteger, nullable=False, index=True)
    body = Column(UnicodeText, nullable=False)
    author = relationship("User",
                          primaryjoin="foreign(Article.author_id) == User.id")
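To show that such a relationship can be used in filter criteria across two databases, here is a minimal sketch with two separate in-memory SQLite engines. It assumes SQLAlchemy 1.4 or newer; the Session(binds=...) routing stands in for the custom __bind_key__ session from the question, and the Integer/String columns and sample rows are assumptions made for the example.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(256))


class Article(Base):
    __tablename__ = "articles"
    id = Column(Integer, primary_key=True)
    # No ForeignKey here: the users table lives in another database.
    author_id = Column(Integer, nullable=False, index=True)
    body = Column(String, nullable=False)
    author = relationship(
        "User", primaryjoin="foreign(Article.author_id) == User.id"
    )


# Assumption: two independent in-memory databases, one per table.
users_engine = create_engine("sqlite://")
articles_engine = create_engine("sqlite://")
User.__table__.create(users_engine)
Article.__table__.create(articles_engine)

binds = {User: users_engine, Article: articles_engine}

with Session(binds=binds) as session:
    me = User(id=1, name="me")
    session.add_all([
        me,
        Article(id=1, body="hello", author=me),   # author_id is filled in on flush
        Article(id=2, body="other", author_id=2),
    ])
    session.commit()

with Session(binds=binds) as session:
    me = session.query(User).filter_by(name="me").one()
    # Comparing a many-to-one relationship to an object compares the
    # author_id column to the object's primary key; no JOIN is emitted.
    mine = session.query(Article).filter(Article.author == me).all()
    my_bodies = [article.body for article in mine]
    author_name = mine[0].author.name

print(my_bodies, author_name)
```

An actual SQL JOIN between the two tables (for example query.join(Article.author)) would still fail here, because a single statement cannot span both databases.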