Can a SQLAlchemy relationship be accessed using a new session?

When using yield_per I am forced to use a separate session if I want to perform another query while the results from the yield_per query have not all been fetched yet.
Let's take these models for our example:
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", backref="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
Here are three ways to achieve the same thing, but only the first one works:
query = session.query(Parent).yield_per(5)
for p in query:
    print(p.id)

    # 1: Using a new session (will work):
    c = newsession.query(Child).filter_by(parent_id=p.id).first()
    print(c.id)

    # 2: Using the same session (will not work):
    c = session.query(Child).filter_by(parent_id=p.id).first()
    print(c.id)

    # 3: Using the relationship (will not work):
    c = p.children[0]
    print(c.id)
Indeed, when using MySQL, both 2 and 3 throw an exception and stop execution with the following error: "Commands out of sync; you can't run this command now".
My question is: is there a way I can make the relationship lookup work in this context? Is there maybe a way to trick SQLAlchemy into using a new session while the first one is busy?

Try selectinload, as it supports eagerly loading with yield_per.
import sqlalchemy as sa

query = session.query(
    Parent
).options(
    sa.orm.selectinload(Parent.children)
).yield_per(5)

for parent in query:
    for child in parent.children:
        print(child.id)
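If you do want the explicit second-session approach from option 1 in the question, a minimal sketch (assuming both sessions come from the same sessionmaker bound to the same engine) could look like this:

from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)
session = Session()     # streams the yield_per query
newsession = Session()  # separate session for lookups while results are still streaming

query = session.query(Parent).yield_per(5)
for p in query:
    c = newsession.query(Child).filter_by(parent_id=p.id).first()
    if c is not None:
        print(c.id)

newsession.close()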

Related

Why am I unable to generate a query using relationships?

I'm experimenting with the relationship functionality in SQLAlchemy, but I've not been able to crack it. The following is a simple MRE:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

class Tournament(Base):
    __tablename__ = "tournament"
    __table_args__ = {"schema": "belgarath", "extend_existing": True}
    id_ = Column(Integer, primary_key=True)
    tournament_master_id = Column(Integer, ForeignKey("belgarath.tournament_master.id_"))
    tournament_master = relationship("TournamentMaster", back_populates="tournament")

class TournamentMaster(Base):
    __tablename__ = "tournament_master"
    __table_args__ = {"schema": "belgarath", "extend_existing": True}
    id_ = Column(Integer, primary_key=True)
    tour_id = Column(Integer, index=True)
    tournament = relationship("Tournament", back_populates="tournament_master")
engine = create_engine("mysql+mysqlconnector://root:root#localhost/")
Session = sessionmaker(bind=engine)
session = Session()
qry = session.query(Tournament.tournament_master.id_).limit(100)
I was hoping to be able to query the id_ field from the tournament_master table through the relationship specified on the tournament table. However, I get the following error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with Tournament.tournament_master has an attribute 'id_'
I've also tried replacing the two relationship lines with a single backref line in TournamentMaster:
tournament = relationship("Tournament", backref="tournament_master")
However I then get the error:
AttributeError: type object 'Tournament' has no attribute 'tournament_master'
Where am I going wrong?
(I'm using SQLAlchemy v1.3.18)
Your ORM classes look fine. It's the query that's incorrect.
In short, you're getting that InstrumentedAttribute error because you are misusing the session.query method.
From the docs, the session.query method takes "SomeMappedClass" or "entities" as arguments. You have two mapped classes defined, Tournament and TournamentMaster. These "entities" are typically either your mapped classes (ORM objects) or a Column of these mapped classes.
However, you are passing in Tournament.tournament_master.id_, which is neither a mapped class nor a column, and thus not an "entity" that session.query can consume.
Another way to look at it is that by writing Tournament.tournament_master.id_ you are trying to access a TournamentMaster record (or instance) from the Tournament class, which doesn't make sense.
It's not super clear to me what exactly you're hoping to return from the query. In any case, here's a start.
Instead of
qry = session.query(Tournament.tournament_master.id_).limit(100)
try
qry = session.query(Tournament, TournamentMaster).join(TournamentMaster).limit(100)
This may also work (haven't tested) to return only the id_ field, if that is your intention:
qry = session.query(Tournament, TournamentMaster).join(TournamentMaster).with_entities(TournamentMaster.id_).limit(100)
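If only the related id_ values are needed, another minimal sketch (untested, assuming the same models as above) is to join along the relationship attribute explicitly:

qry = (
    session.query(TournamentMaster.id_)
    .select_from(Tournament)
    .join(Tournament.tournament_master)
    .limit(100)
)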

Is it possible to use session.insert for one-to-many relationships in SQLAlchemy?

I have read in the following link:
Sqlalchemy adding multiple records and potential constraint violation
that using the SQLAlchemy Core library to perform the inserts is a much faster option than the ORM's session.add() method, i.e.:
session.add()
should be replaced with:
session.execute(Entry.__table__.insert(), params=inserts)
In the following code I have tried to replace .add with .insert:
from sqlalchemy import Column, DateTime, String, Integer, ForeignKey, func
from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Department(Base):
    __tablename__ = 'department'
    id = Column(Integer, primary_key=True)
    name = Column(String)

class Employee(Base):
    __tablename__ = 'employee'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # Use default=func.now() to set the default hiring time
    # of an Employee to be the current time when an
    # Employee record was created
    hired_on = Column(DateTime, default=func.now())
    department_id = Column(Integer, ForeignKey('department.id'))
    # Use cascade='delete,all' to propagate the deletion of a Department onto its Employees
    department = relationship(
        Department,
        backref=backref('employees',
                        uselist=True,
                        cascade='delete,all'))

from sqlalchemy import create_engine
engine = create_engine('postgres://blah:blah@blah:blah/blah')
from sqlalchemy.orm import sessionmaker
session = sessionmaker()
session.configure(bind=engine)
Base.metadata.create_all(engine)

d = Department(name="IT")
emp1 = Employee(name="John", department=d)

s = session()
s.add(d)
s.add(emp1)
s.commit()

s.delete(d)  # Deleting the department also deletes all of its employees.
s.commit()

s.query(Employee).all()

# Insert Option Attempt
from sqlalchemy.dialects.postgresql import insert

d = insert(Department).values(name="IT")
d1 = d.on_conflict_do_nothing()
s.execute(d1)

emp1 = insert(Employee).values(name="John", department=d1)
emp1 = emp1.on_conflict_do_nothing()
s.execute(emp1)
The error I receive:
sqlalchemy.exc.CompileError: Unconsumed column names: department
I can't quite understand the syntax and how to do it the right way; I'm new to SQLAlchemy.
It looks like my question is similar to How to get primary key columns in pd.DataFrame.to_sql insertion method for PostgreSQL "upsert", so potentially, by answering either of our questions, you could help two people at the same time ;-)
I am new to SQLAlchemy as well, but this is what I found:
Using your exact code, adding the department alone didn't work using s.execute(d1), so I changed it to the below and it does work:
with engine.connect() as conn:
    d = insert(Department).values(name="IT")
    d1 = d.on_conflict_do_nothing()
    conn.execute(d1)
I found in the SQLAlchemy documentation that in the past it was just a warning when you tried to use a virtual column that doesn't really exist, but from version 0.8 it was changed to an exception.
As a result, I am not sure if you can do that using insert. I think SQLAlchemy does it behind the scenes in some other way when using session.add(). Maybe some experts can elaborate here.
I hope that will help.
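One way the "Unconsumed column names: department" error can be avoided is to give the Core insert the real department_id column rather than the ORM-level department relationship, which a table-level statement cannot consume. A minimal sketch, assuming the department's primary key is looked up first:

from sqlalchemy.dialects.postgresql import insert

# Insert the department, ignoring a conflict with an existing row.
s.execute(insert(Department).values(name="IT").on_conflict_do_nothing())
s.commit()

# Fetch the department's primary key, then insert the employee using the
# real department_id column instead of the "department" relationship.
dept_id = s.query(Department.id).filter_by(name="IT").scalar()
s.execute(
    insert(Employee)
    .values(name="John", department_id=dept_id)
    .on_conflict_do_nothing()
)
s.commit()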

SQLAlchemy: Deleting a parent after changing all references to it in its children still deletes the children

Given this SQLAlchemy database definition:
class Project(Base):
    __tablename__ = 'project'
    id = Column(Integer, primary_key=True)
    name = Column(Unicode, unique=True)
    tasks = relationship('Task', cascade='all', backref='project')

class Task(Base):
    __tablename__ = 'task'
    id = Column(Integer, primary_key=True)
    title = Column(Unicode)
    project_id = Column(Integer, ForeignKey('project.id'), nullable=False)
I want to merge two projects. My first naive attempt was to do something like this:
def merge_1(session, src_prj, dst_prj):
    for task in src_prj.tasks:
        task.project = dst_prj
    session.delete(src_prj)
But that caused only half (!) of the tasks to be transferred; the other half got deleted.
If instead I do this:
def merge_2(session, src_prj, dst_prj):
    for task in src_prj.tasks:
        task.project_id = dst_prj.id
    session.delete(src_prj)
None of my tasks are transferred; they get deleted when the project is deleted.
Then I tried this:
def merge_3(session, src_prj, dst_prj):
    for task in src_prj.tasks:
        task.project_id = dst_prj.id
    session.commit()
    session.delete(src_prj)
It works, but calling session.commit() before deleting the project defeats the purpose of session transactions.
This final version works as well (and is faster):
def merge_4(session, src_prj, dst_prj):
    session.query(Task).filter_by(project_id=src_prj.id) \
        .update({'project_id': dst_prj.id})
    session.delete(src_prj)
But I would like to know why merge_1() and merge_2() do not behave as expected.
I tested using SQLAlchemy 1.1.4. The full test program is available here: https://gist.github.com/agateau/887af14b7ddd1e151f9ac89d5e423ef6
I would try committing the change before doing the deletion. Otherwise, I'm guessing the deletion does not see the change made directly before it:
def merge_1(session, src_prj, dst_prj):
    for task in src_prj.tasks:
        task.project = dst_prj
    session.commit()
    session.delete(src_prj)
Can you do it in an update statement instead of individually? Something like:
session.query(Project).filter_by(name=src_prj.name).update({'name': dst_prj.name})
session.query(Project).filter_by(name=src_prj.name).delete()
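A likely culprit in merge_1 is that assigning task.project removes the task from src_prj.tasks through the backref while the loop is still iterating that same list, so every other task gets skipped and is later deleted by the cascade. A sketch (untested against the linked gist) that sidesteps this by iterating over a copy:

def merge_1_fixed(session, src_prj, dst_prj):
    # Iterate over a copy: reassigning task.project removes the task from
    # src_prj.tasks via the backref, which would otherwise skip elements.
    for task in list(src_prj.tasks):
        task.project = dst_prj
    session.delete(src_prj)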

How to merge an object into the session based on ID using SQLAlchemy while keeping correct discriminator of child classes?

I am passing object IDs between threads, and would like to be able to merge an object into the session by ID. I could of course query objects across threads to accomplish this but would rather avoid the overhead of a database trip if the object is in the session already.
My code is working, but it is not correctly setting the discriminator value when I merge by the parent class. How can I ensure the discriminator is set correctly?
from sqlalchemy import create_engine, __version__
from sqlalchemy.orm import sessionmaker, scoped_session
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column
from sqlalchemy.types import Integer, String

Session = scoped_session(sessionmaker())
Base = declarative_base()

class Person(Base):
    __tablename__ = 'person'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    discriminator = Column(String, nullable=False)
    __mapper_args__ = {'polymorphic_identity': 'person',
                       'polymorphic_on': discriminator}

class Programmer(Person):
    __mapper_args__ = {'polymorphic_identity': 'programmer'}
    fav_language = Column(String)
engine = create_engine('postgresql+zxjdbc://mnaber:test123@localhost:5432/ajtest2', echo=True)
Session.configure(bind=engine)
Base.metadata.bind = engine
Base.metadata.drop_all(checkfirst=True)
Base.metadata.create_all()

s = Session()
michael = Programmer(name='Michael', fav_language='Python')
s.add(michael)
s.commit()

print "Sqlalchemy Version: %s" % __version__
print "INITIAL: %s %s" % (type(michael), michael.discriminator)

michael_merged = s.merge(Person(id=michael.id))  # Merge by parent class

print "MERGED: %s %s" % (type(michael_merged), michael_merged.discriminator)
print "FINAL: %s %s" % (type(michael), michael.discriminator)

michael_merged.fav_language = 'Jython'
s.add(michael_merged)
s.commit()
The interesting output of this is:
Sqlalchemy Version: 0.8.7
INITIAL: <class '__main__.Programmer'> programmer
MERGED: <class '__main__.Programmer'> person
FINAL: <class '__main__.Programmer'> person
site-packages/sqlalchemy/orm/persistence.py:154: SAWarning: Flushing object <Programmer at 0x18> with incompatible polymorphic identity 'person'; the object may not refresh and/or load correctly
mapper._validate_polymorphic_identity(mapper, state, dict_)
How should I ensure that MERGED and FINAL have a discriminator of programmer?
merge() assumes an object coming in looks the way you want it to look so you'd need to pass a Programmer here. Passing in Person(id=1) when there's really a Programmer in the database for that identity is not a supported pattern, the behavior is undefined. What happens at the moment is that your Person object has the "discriminator" of "person" set up front, so that's the value that gets flushed into the database; it overrides what's already in the session.
You can trick it into working like this:
p1 = Person(id=michael.id)
del p1.discriminator
michael_merged = s.merge(p1) #Merge by parent class
however, I can't guarantee that code like the above will always work for future SQLAlchemy versions. It would not, for example, work for earlier versions where the "discriminator" was chosen at flush time.
Your code example is such that Programmer is already present in the Session; you can get this object in a class-agnostic way like this:
obj = session.query(Person).get(michael.id)
then you have your Programmer object and you can modify it freely.
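Applied to the example above, a minimal sketch of that class-agnostic lookup in place of the merge() call might be:

# query.get() checks the identity map first, so it hands back the existing
# Programmer instance with its correct discriminator intact.
obj = s.query(Person).get(michael.id)
obj.fav_language = 'Jython'
s.commit()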

SQLAlchemy association table (association object pattern) raises IntegrityError

I'm using SQLAlchemy release 0.8.2 (tried python 2.7.5 and 3.3.2)
I've had to use the association object pattern (for a many-to-many relationship) in my code, but whenever I add an association, it raises an IntegrityError exception. This is because instead of executing "INSERT INTO association (left_id, right_id, extra_data) [...]", it executes "INSERT INTO association (right_id, extra_data) [...]", which raises an IntegrityError since part of the primary key is missing.
After trying to narrow down the problem for a while and simplifying the code as much as possible, I found the culprit(s?), but I don't understand why it's behaving this way.
I included my complete code so the reader can test it as is. The class declarations are exactly the same as in the documentation (with backrefs).
#!/usr/bin/env python2
import sqlalchemy
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String
from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship, backref

Base = declarative_base()

class Association(Base):
    __tablename__ = 'association'
    left_id = Column(Integer, ForeignKey('left.id'), primary_key=True)
    right_id = Column(Integer, ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child", backref="parent_assocs")

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Association", backref="parent")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

def main():
    engine = sqlalchemy.create_engine('sqlite:///:memory:', echo=True)
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    session = Session()

    # populate old data
    session.add(Child())

    # new data
    p = Parent()
    session.add(p)  # Commenting this fixes the error.
    session.flush()

    # rest of new data
    a = Association(extra_data="some data")
    a.child = session.query(Child).one()
    # a.child = Child()  # Using this instead of the above line avoids the error - but that's not what I want.
    p.children.append(a)
    # a.parent = p  # Using this instead of the above line fixes the error! They're logically equivalent.
    session.add(p)
    session.commit()

if __name__ == '__main__':
    main()
So, as mentioned in the comments in the code above, there are three ways to fix/avoid the problem.
1. Don't add the parent to the session before declaring the association.
2. Create a new child for the association instead of selecting an already existing child.
3. Use the backref on the association.
I don't understand the behaviour of all three cases.
The second case does something different, so it's not a possible solution. I don't understand the behaviour however, and would appreciate an explanation of why the problem is avoided in this case.
I'm thinking the first case may have something to do with "Object States", but I don't know exactly what's causing it either. Oh, and adding session.autoflush=False just before the first occurrence of session.add(p) also fixes the problem, which adds to my confusion.
For the third case, I'm drawing a complete blank since they should be logically equivalent.
Thanks for any insight!
What happens here is that when you call p.children.append(), SQLAlchemy can't append to a plain collection without loading it first. As it goes to load, autoflush kicks in - you know this because in your stack trace you will see a line like this:
File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/orm/session.py", line 1183, in _autoflush
self.flush()
Your Association object is then flushed here in an incomplete state; it's in the session in the first place because when you say a.child = some_persistent_child, an event appends a to the parent_assocs collection of Child which then cascades the Association object into the session (see Controlling Cascade on Backrefs for some background on this, and one possible solution).
But without affecting any relationships, the easiest solution when you have this chicken/egg sort of problem is to temporarily disable autoflush using no_autoflush:
with session.no_autoflush:
    p.children.append(a)
By disabling autoflush when p.children is loaded, your pending object a is not flushed; it is then associated with the already persistent Parent (because you've added and flushed that already) and is ready for INSERT.
This allows your test program to succeed.
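As an alternative along the lines of the "Controlling Cascade on Backrefs" section the answer points to, here is a sketch (assuming a SQLAlchemy version where cascade_backrefs is still available; it was removed in 2.0) that keeps the pending Association from being swept into the session in the first place:

# Variant of the question's Association class, reusing the same Base and imports.
# cascade_backrefs=False stops "a.child = some_persistent_child" from cascading
# the pending Association into the session via the parent_assocs backref event.
class Association(Base):
    __tablename__ = 'association'
    left_id = Column(Integer, ForeignKey('left.id'), primary_key=True)
    right_id = Column(Integer, ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))
    child = relationship("Child",
                         backref=backref("parent_assocs", cascade_backrefs=False))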
