SQLAlchemy relationship fields name constructor

SQLAlchemy relationship fields name constructor - python

I'm using SQLAlchemy 1.4 to build my database models (posgresql).
I've stablished relationships between my models, which I follow using the different SQLAlchemy capabilities. When doing so, the fields of the related models get aliases which don't work for me.
Here's an example of one of my models:
from sqlalchemy import Column, DateTime, ForeignKey, Integer, func
from sqlalchemy.orm import relationship
class Process(declarative_model()):
"""Process database table class.
Process model. It contains all the information about one process
iteration. This is the proces of capturing an image with all the
provided cameras, preprocess the images and make a prediction for
them as well as computing the results.
"""
id: int = Column(Integer, primary_key=True, index=True, autoincrement=True)
"""Model primary key."""
petition_id: int = Column(Integer, ForeignKey("petition.id", ondelete="CASCADE"))
"""Foreign key to the related petition."""
petition: "Petition" = relationship("Petition", backref="processes", lazy="joined")
"""Related petition object."""
camera_id: int = Column(Integer, ForeignKey("camera.id", ondelete="CASCADE"))
"""Foreign key to the related camera."""
camera: "Camera" = relationship("Camera", backref="processes", lazy="joined")
"""Related camera object."""
n: int = Column(Integer, comment="Iteration number for the given petition.")
"""Iteration number for the given petition."""
image: "Image" = relationship(
"Image", back_populates="process", uselist=False, lazy="joined"
)
"""Related image object."""
datetime_init: datetime = Column(DateTime(timezone=True), server_default=func.now())
"""Datetime when the process started."""
datetime_end: datetime = Column(DateTime(timezone=True), nullable=True)
"""Datetime when the process finished if so."""
The model works perfectly and joins the data by default as expected, so far so good.
My problem comes when I make a query and I extract the results through query.all() or through pd.read_sql(query.statement, db).
Reading the documentation, I should get aliases for my fields like "{table_name}.{field}" but instead of that I'm getting like "{field}_{counter}". Here's an example of a query.statement for my model:
SELECT process.id, process.petition_id, process.camera_id, process.n, process.datetime_init, process.datetime_end, asset_quality_1.id AS id_2, asset_quality_1.code AS code_1, asset_quality_1.name AS name_1, asset_quality_1.active AS active_1, asset_quality_1.stock_quality_id, pit_door_1.id AS id_3, pit_door_1.code AS code_2, petition_1.id AS id_4, petition_1.user_id, petition_1.user_code, petition_1.load_code, petition_1.provider_code, petition_1.origin_code, petition_1.asset_quality_initial_id, petition_1.pit_door_id, petition_1.datetime_init AS datetime_init_1, petition_1.datetime_end AS datetime_end_1, mask_1.id AS id_5, mask_1.camera_id AS camera_id_1, mask_1.prefix_path, mask_1.position, mask_1.format, camera_1.id AS id_6, camera_1.code AS code_3, camera_1.pit_door_id AS pit_door_id_1, camera_1.position AS position_1, image_1.id AS id_7, image_1.prefix_path AS prefix_path_1, image_1.format AS format_1, image_1.process_id
FROM process LEFT OUTER JOIN petition AS petition_1 ON petition_1.id = process.petition_id LEFT OUTER JOIN asset_quality AS asset_quality_1 ON asset_quality_1.id = petition_1.asset_quality_initial_id LEFT OUTER JOIN stock_quality AS stock_quality_1 ON stock_quality_1.id = asset_quality_1.stock_quality_id LEFT OUTER JOIN pit_door AS pit_door_1 ON pit_door_1.id = petition_1.pit_door_id LEFT OUTER JOIN camera AS camera_1 ON camera_1.id = process.camera_id LEFT OUTER JOIN mask AS mask_1 ON camera_1.id = mask_1.camera_id LEFT OUTER JOIN image AS image_1 ON process.id = image_1.process_id
Does anybody know how can I change this behavior and make it alias the fields like “{table_name}_{field}"?

SQLAlchemy uses label styles to configure how columns are labelled in SQL statements. The default in 1.4.x is LABEL_STYLE_DISAMBIGUATE_ONLY, which will add a "counter" for columns with the same name in a query. LABEL_STYLE_TABLENAME_PLUS_COL is closer to what you want.
Default:
q = session.query(Table1, Table2).join(Table2)
q = q.set_label_style(LABEL_STYLE_DISAMBIGUATE_ONLY)
print(q)
gives
SELECT t1.id, t1.child_id, t2.id AS id_1
FROM t1 JOIN t2 ON t2.id = t1.child_id
whereas
q = session.query(Table1, Table2).join(Table2)
q = q.set_label_style(LABEL_STYLE_TABLENAME_PLUS_COL)
print(q)
generates
SELECT t1.id AS t1_id, t1.child_id AS t1_child_id, t2.id AS t2_id
FROM t1 JOIN t2 ON t2.id = t1.child_id
If you want to enforce a style for all orm queries you could sublcass Session:
class MySession(orm.Session):
_label_style = LABEL_STYLE_TABLENAME_PLUS_COL
and use this class for your sessions, or pass it it a sessionmaker, if you use one:
Session = orm.sessionmaker(engine, class_=MySession)

You can use the label argument of the Column or the relationship method to specify the custom name for a field.
For example, to give a custom label for the process.petition_id field, you can use:
petition_id = Column(Integer, ForeignKey("petition.id", ondelete="CASCADE"), label='process_petition_id')
And for the petition relationship, you can use:
petition = relationship("Petition", backref="processes", lazy="joined", lazyload=True, innerjoin=True, viewonly=False, foreign_keys=[petition_id], post_update=False, cascade='all, delete-orphan', passive_deletes=True, primaryjoin='Process.petition_id == Petition.id', single_parent=False, uselist=False, query_class=None, foreignkey=None, remote_side=None, remote_side_use_alter=False, order_by=None, secondary=None, secondaryjoin=None, back_populates=None, collection_class=None, doc=None, extend_existing=False, associationproxy=None, comparator_factory=None, proxy_property=None, impl=None, _create_eager_joins=None, dynamic=False, active_history=False, passive_updates=False, enable_typechecks=None, info=None, join_depth=None, innerjoin=None, outerjoin=None, selectin=None, selectinload=None, with_polymorphic=None, join_specified=None, viewonly=None, comparison_enabled=None, useexisting=None, label='process_petition')
With this, the fields should be aliased to process_petition_id and process_petition respectively.

Related

SQLAlchemy Filter On Relationship Not Working

I have two classes:
Trade
TradeItem
One Trade has multiple TradeItems. It's a simple 1-to-many relationship.
Trade
class Trade(Base):
__tablename__ = 'trade'
id = Column(Integer, primary_key=True)
name= Column(String)
trade_items = relationship('TradeItem',
back_populates='trade')
TradeItem
class TradeItem(Base):
__tablename__ = 'trade_item'
id = Column(Integer, primary_key=True)
location = Column(String)
trade_id = Column(Integer, ForeignKey('trade.id'))
trade = relationship('Trade',
back_populates='trade_items')
Filtering
session.query(Trade).filter(Trade.name == trade_name and TradeItem.location == location)
I want to get the Trade object by name and only those Trade Items where the location matches the given location name.
So if there is a trade object: Trade(name='Trade1') and it has 3 items:
TradeItem(location='loc1'),TradeItem(location='loc2'),TradeItem(location='loc2')
Then the query should return only my target trade items for the trade:
session.query(Trade).filter(Trade.name == 'Trade1' and TradeItem.location == 'loc1').first()
Then I expect the query to return Trade with name Trade1 but only one Trade item populated with location as loc1
But when I get the trade object, it has all of the Trade Items and it completely ignores the filter condition.

You can use the contains_eager option to to load only the child rows selected by the filter.
Note that to create a WHERE ... AND ... filter you can use sqlalchemy.and_, or chain filter or filter_by methods, but using Python's logical and will not work as expected.
from sqlalchemy import orm, and_
trade = (session.query(Trade)
.filter(and_(Trade.name == 'Trade1', TradeItem.location == 'loc1'))
.options(orm.contains_eager(Trade.trade_items))
.first())
The query generates this SQL :
SELECT trade_item.id AS trade_item_id, trade_item.location AS trade_item_location, trade_item.trade_id AS trade_item_trade_id, trade.id AS trade_id, trade.name AS trade_name
FROM trade
JOIN trade_item ON trade.id = trade_item.trade_id
WHERE trade.name = 'trade1' AND trade_item.location = 'loc1'
LIMIT 1 OFFSET 0

SqlAlchemy querying deep nested objects

I need help with querying model to get deeper relatonship.
Animals is a main table. It should easly load imgs and stories and its pretty simple. But AnimalsStories has own imgs in same AnimalsImgs table. AnimalsImgs has imgs for both classes Animals and AnimalsStories and both classes have relationships with it.
So i should be able to load all Animals and their stories in AnimalsStories and then from this class i should be able to use .img attrubute which reffers to AnimalsImgs and holds imgs for stories. SqlAlchemy say that it is possible with subqueryload. And it is. But only two levels down. AnimalsStories.img never load.
class Animals(db.Model):
__tablename__ = 'animals'
id = db.Column(db.BigInteger, primary_key=True)
imgs = db.relationship("AnimalsImgs", backref=db.backref('animals',lazy=False))
stories= db.relationship("AnimalsStories",lazy='joined')
class AnimalsImgs(db.Model):
__tablename__ = 'animals_imgs'
id = db.Column(db.BigInteger, primary_key=True,autoincrement='auto')
id_animal = db.Column(db.BigInteger, db.ForeignKey('animals.id'),nullable=False)
class AnimalsStories(db.Model):
__tablename__='animals_stories'
id = db.Column(db.BigInteger, primary_key=True)
id_animal= db.Column(db.BigInteger,db.ForeignKey('animals.id'), nullable=False)
id_animal_img = db.Column(db.BigInteger,db.ForeignKey('animals_imgs.id'))
img=db.relationship("AnimalsImgs", uselist=False)
I've tried something like this :
query = Animals.query.options(subqueryload(Animals.stories).subqueryload(AnimalsStories.img))
result = query.all()
print(result)
for res in result:
print(res.rescue.img)
And ended with " AttributeError: 'InstrumentedList' object has no attribute 'img' "
It should be pretty simple to query deeper objects. I think the problem is somewhere in models structures.
Edit #1
I've ended up with solution. It was not that hard.
Animals reffers to stories as one to many. Stories reffers to Imgs as one to one. So with query i posted (with subqueryload) it can be done by:
query = Animals.query.options(subqueryload(Animals.stories).subqueryload(AnimalsStories.img))
result = query.all()
print(result)
for res in result:
print(res.stories)
for r in res.stories: # stories appears as array so they are iterable
print(r.img)
And it prints all the levels pretty clear.
It can be done as well, maybe even better, with join. We are sure that result doesnt have any missing or empty arrays.
query = Animals.query.join(AnimalsStories).join(AnimalsImgs)

I was not sure which types of relationship you want to model. I made the following assumptions:
A image can have a relationship to one or no animal, as well to one or no story
A story has a relationship to one animal, as well as to none or several images
A animal can have relationships to none or several images, as well to none or several stories
This would lead to a model like this:
class Animals(db.Model):
__tablename__ = 'animals'
id = db.Column(db.Integer, primary_key=True)
imgs = db.relationship("AnimalsImgs", backref=db.backref('animals'),lazy=True)
stories= db.relationship("AnimalsStories", backref=db.backref('animals'),
lazy=True)
class AnimalsImgs(db.Model):
__tablename__ = 'animals_imgs'
id = db.Column(db.Integer, primary_key=True)
id_animal = db.Column(db.Integer, db.ForeignKey('animals.id'),nullable=True)
id_animal_story =
db.Column(db.Integer,db.ForeignKey('animals_stories.id'),nullable=True)
class AnimalsStories(db.Model):
__tablename__='animals_stories'
id = db.Column(db.Integer, primary_key=True)
id_animal= db.Column(db.Integer,db.ForeignKey('animals.id'), nullable=False)
imgs=db.relationship("AnimalsImgs", backref=db.backref('stories'),lazy=True)
Now I have added the following scenrios:
An animal a1 that has a relationship to an image ai1 and to an story ast1. (but not relationship between the story and the image.
An animal a2 that has a relationship to story ast2. Ast2 has a relationship to a image ai2. But no relationship betwenn ast2 and a2.
An animal a3 that has a relationship to a story ast3 and a image ai3. ai3 has also a relationship to ast3.
Code sample like this:
a1 = Animals()
a2 = Animals()
a3 = Animals ()
a4 = Animals ()
db.session.add(a1)
db.session.add(a2)
db.session.add(a3)
db.session.add(a4)
ai1 = AnimalsImgs(animals = a1)
db.session.add(ai1)
ast1 = AnimalsStories(animals = a1)
db.session.add(ast1)
ast2 = AnimalsStories(animals = a2)
db.session.add(ast2)
ai2 = AnimalsImgs(stories = ast2)
db.session.add(ai2)
ast3 = AnimalsStories(animals = a3)
db.session.add(ast3)
ai3 = AnimalsImgs(animals = a3, stories = ast3)
db.session.add(ai3)
If you want to query all images that are connected to stories, you can use:
animalimages_ofallstories_ofalanimals =
AnimalsImgs.query.join(AnimalsStories).join(Animals).all()
You get image ai2 and ai3 as result. (ai1 is not connected to a story but only to an animal)

Sorting time column derived from values of other columns using hybrid_property

I'm trying to build an event-booking system as a side project to learn python and web development. Below are two of the models implemented in my project. An EventSlot represents a timeslot scheduled for a particular Event.
Models
from app import db
from sqlalchemy import ForeignKey
from dateutil.parser import parse
from datetime import timedelta
from sqlalchemy.ext.hybrid import hybrid_property
class Event(db.Model):
event_id = db.Column(db.Integer, primary_key=True)
title = db.Column(db.String, index=True, nullable=False)
duration = db.Column(db.Float, nullable=False)
price = db.Column(db.Float, nullable=False)
slots = db.relationship('EventSlot', cascade='all, delete', back_populates='event')
class EventSlot(db.Model):
slot_id = db.Column(db.Integer, primary_key=True)
event_date = db.Column(db.DateTime, nullable=False)
event_id = db.Column(db.Integer, ForeignKey('event.event_id'))
event = db.relationship('Event', back_populates='slots')
I've provided an admin page (Flask-Admin) for admin-users to view database records. On the EventSlot page, I included 'Start Time' and 'End Time' column which I want to make sortable. I've appended to the EventSlot model the following:
class EventSlot(db.Model):
#...
## working as intended ##
#hybrid_property
def start_time(self):
dt = parse(str(self.event_date))
return dt.time().strftime('%I:%M %p')
#start_time.expression
def start_time(cls):
return db.func.time(cls.event_date)
## DOES NOT WORK: can display derived time, but sorting is incorrect ##
#hybrid_property
def end_time(self):
rec = Event.query.filter(Event.event_id == self.event_id).first()
duration = rec.duration * 60
derived_time = self.event_date + timedelta(minutes=duration)
dt = parse(str(derived_time))
return dt.time().strftime('%I:%M %p')
#end_time.expression
def end_time(cls):
rec = Event.query.filter(Event.event_id == cls.event_id).first()
duration = '+' + str(int(rec.duration * 60)) + ' minutes'
return db.func.time(cls.event_date, duration)
As can be seen from the image below, the sort order is wrong when I sort by 'end time'. It appears to be still sorting by start time. What might be the problem here?
(Admittedly, I still don't understand hybrid_properties. I thought I had got it when got start_time working, but now it seems I still don't understand a thing...)

In the expression for end_time the cls.event_id represents a column, not a value, so the query ends up performing an implicit join between Event and EventSlot and picks the first result of that join. This of course is not what you want, but instead for an EventSlot you want to find out the duration of the related Event in SQL. This seems like a good place to use a correlated scalar subquery:
#end_time.expression
def end_time(cls):
# Get the duration of the related Event
ev_duration = Event.query.\
with_entities(Event.duration * 60).\
filter(Event.event_id == cls.event_id).\
as_scalar()
# This will form a string concatenation SQL expression, binding the strings as
# parameters to the query.
duration = '+' + ev_duration.cast(db.String) + ' minutes'
return db.func.time(cls.event_date, duration)
Note that the query is not run when the attribute is accessed in query context, but becomes a part of the parent query.

How to do efficient reverse foreign key check on Django multi-table inheritance model?

I've got file objects of different types, which inherit from a BaseFile, and add custom attributes, methods and maybe fields. The BaseFile also stores the File Type ID, so that the corresponding subclass model can be retrieved from any BaseFile object:
class BaseFile(models.Model):
name = models.CharField(max_length=80, db_index=True)
size= models.PositiveIntegerField()
time_created = models.DateTimeField(default=datetime.now)
file_type = models.ForeignKey(ContentType, on_delete=models.PROTECT)
class FileType1(BaseFile):
storage_path = '/path/for/filetype1/'
def custom_method(self):
<some custom behaviour>
class FileType2(BaseFile):
storage_path = '/path/for/filetype2/'
extra_field = models.CharField(max_length=12)
I also have different types of events which are associated with files:
class FileEvent(models.Model):
file = models.ForeignKey(BaseFile, on_delete=models.PROTECT)
time = models.DateTimeField(default=datetime.now)
I want to be able to efficiently get all files of a particular type which have not been involved in a particular event, such as:
unprocessed_files_type1 = FileType1.objects.filter(fileevent__isnull=True)
However, looking at the SQL executed for this query:
SELECT "app_basefile"."id", "app_basefile"."name", "app_basefile"."size", "app_basefile"."time_created", "app_basefile"."file_type_id", "app_filetype1"."basefile_ptr_id"
FROM "app_filetype1"
INNER JOIN "app_basefile"
ON("app_filetype1"."basefile_ptr_id" = "app_basefile"."id")
LEFT OUTER JOIN "app_fileevent" ON ("app_basefile"."id" = "app_fileevent"."file_id")
WHERE "app_fileevent"."id" IS NULL
It looks like this might not be very efficient because it joins on BaseFile.id instead of FileType1.basefile_ptr_id, so it will check ALL BaseFile ids to see whether they are present in FileEvent.file_id, when I only need to check the BaseFile ids corresponding to FileType1, or FileType1.basefile_ptr_ids.
This could result in a significant performance difference if there are a very large number of BaseFiles, but FileType1 is only a small subset of that, because it will be doing a large amount of unnecessary lookups.
Is there a way to force Django to join on "app_filetype1"."basefile_ptr_id" or otherwise achieve this functionality more efficiently?
Thanks for the help
UPDATE:
Using annotations and Exists subquery seems to do what I'm after, however the resulting SQL still appears strange:
unprocessed_files_type1 = FileType1.objects.annotate(file_event=Exists(FileEvent.objects.filter(file=OuterRef('pk')))).filter(file_event=False)
SELECT "app_basefile"."id", "app_basefile"."name", "app_basefile"."size", "app_basefile"."time_created", "app_basefile"."file_type_id", "app_filetype1"."basefile_ptr_id",
EXISTS(
SELECT U0."id", U0."file_id", U0."time"
FROM "app_fileevent" U0
WHERE U0."file_id" = ("app_filetype1"."basefile_ptr_id"))
AS "file_event"
FROM "app_filetype1"
INNER JOIN "app_basefile" ON ("app_filetype1"."basefile_ptr_id" = "app_basefile"."id")
WHERE EXISTS(
SELECT U0."id", U0."file_id", U0."time"
FROM "app_fileevent" U0
WHERE U0."file_id" = ("app_filetype1"."basefile_ptr_id")) = 0
It appears to be doing the WHERE EXISTS subquery twice instead of just using the annotated 'file_event' label... Maybe this is just a Django/SQLite driver bug?

Preserve key naming when creating a joinedload sql_query in SQLAlchemy

I am extracting a table row and the corresponding row from all referenced tables via SQLAlchemy.
Given the following object structure:
class DNAExtractionProtocol(Base):
__tablename__ = 'dna_extraction_protocols'
id = Column(Integer, primary_key=True)
code = Column(String, unique=True)
name = Column(String)
sample_mass = Column(Float)
mass_unit_id = Column(String, ForeignKey('measurement_units.id'))
mass_unit = relationship("MeasurementUnit", foreign_keys=[mass_unit_id])
digestion_buffer_id = Column(String, ForeignKey("solutions.id"))
digestion_buffer = relationship("Solution", foreign_keys=[digestion_buffer_id])
digestion_buffer_volume = Column(Float)
digestion_id = Column(Integer, ForeignKey("incubations.id"))
digestion = relationship("Incubation", foreign_keys=[digestion_id])
lysis_buffer_id = Column(String, ForeignKey("solutions.id"))
lysis_buffer = relationship("Solution", foreign_keys=[lysis_buffer_id])
lysis_buffer_volume = Column(Float)
lysis_id = Column(Integer, ForeignKey("incubations.id"))
lysis = relationship("Incubation", foreign_keys=[lysis_id])
proteinase_id = Column(String, ForeignKey("solutions.id"))
proteinase = relationship("Solution", foreign_keys=[proteinase_id])
proteinase_volume = Column(Float)
inactivation_id = Column(Integer, ForeignKey("incubations.id"))
inactivation = relationship("Incubation", foreign_keys=[inactivation_id])
cooling_id = Column(Integer, ForeignKey("incubations.id"))
cooling = relationship("Incubation", foreign_keys=[cooling_id])
centrifugation_id = Column(Integer, ForeignKey("incubations.id"))
centrifugation = relationship("Incubation", foreign_keys=[centrifugation_id])
volume_unit_id = Column(String, ForeignKey('measurement_units.id'))
volume_unit = relationship("MeasurementUnit", foreign_keys=[volume_unit_id])
I am using:
sql_query = session.query(DNAExtractionProtocol).options(Load(DNAExtractionProtocol).joinedload("*")).filter(DNAExtractionProtocol.code == code)
for item in sql_query:
pass
mystring = str(sql_query)
mydf = pd.read_sql_query(mystring,engine,params=[code])
print(mydf.columns)
This gives me:
Index([u'dna_extraction_protocols_id', u'dna_extraction_protocols_code',
u'dna_extraction_protocols_name',
u'dna_extraction_protocols_sample_mass',
u'dna_extraction_protocols_mass_unit_id',
u'dna_extraction_protocols_digestion_buffer_id',
u'dna_extraction_protocols_digestion_buffer_volume',
u'dna_extraction_protocols_digestion_id',
u'dna_extraction_protocols_lysis_buffer_id',
u'dna_extraction_protocols_lysis_buffer_volume',
u'dna_extraction_protocols_lysis_id',
u'dna_extraction_protocols_proteinase_id',
u'dna_extraction_protocols_proteinase_volume',
u'dna_extraction_protocols_inactivation_id',
u'dna_extraction_protocols_cooling_id',
u'dna_extraction_protocols_centrifugation_id',
u'dna_extraction_protocols_volume_unit_id', u'measurement_units_1_id',
u'measurement_units_1_code', u'measurement_units_1_long_name',
u'measurement_units_1_siunitx', u'solutions_1_id', u'solutions_1_code',
u'solutions_1_name', u'solutions_1_supplier',
u'solutions_1_supplier_id', u'incubations_1_id', u'incubations_1_speed',
u'incubations_1_duration', u'incubations_1_temperature',
u'incubations_1_movement', u'incubations_1_speed_unit_id',
u'incubations_1_duration_unit_id', u'incubations_1_temperature_unit_id',
u'solutions_2_id', u'solutions_2_code', u'solutions_2_name',
u'solutions_2_supplier', u'solutions_2_supplier_id',
u'incubations_2_id', u'incubations_2_speed', u'incubations_2_duration',
u'incubations_2_temperature', u'incubations_2_movement',
u'incubations_2_speed_unit_id', u'incubations_2_duration_unit_id',
u'incubations_2_temperature_unit_id', u'solutions_3_id',
u'solutions_3_code', u'solutions_3_name', u'solutions_3_supplier',
u'solutions_3_supplier_id', u'incubations_3_id', u'incubations_3_speed',
u'incubations_3_duration', u'incubations_3_temperature',
u'incubations_3_movement', u'incubations_3_speed_unit_id',
u'incubations_3_duration_unit_id', u'incubations_3_temperature_unit_id',
u'incubations_4_id', u'incubations_4_speed', u'incubations_4_duration',
u'incubations_4_temperature', u'incubations_4_movement',
u'incubations_4_speed_unit_id', u'incubations_4_duration_unit_id',
u'incubations_4_temperature_unit_id', u'incubations_5_id',
u'incubations_5_speed', u'incubations_5_duration',
u'incubations_5_temperature', u'incubations_5_movement',
u'incubations_5_speed_unit_id', u'incubations_5_duration_unit_id',
u'incubations_5_temperature_unit_id', u'measurement_units_2_id',
u'measurement_units_2_code', u'measurement_units_2_long_name',
u'measurement_units_2_siunitx', u'dna_extractions_1_id',
u'dna_extractions_1_code', u'dna_extractions_1_protocol_id',
u'dna_extractions_1_source_id'],
dtype='object')
This indeed contains all the columns I want - but the naming does not help me select what I want.
Isit possible to preserve the key names from the original table in this dataframe? e.g. instead of measurement_units_1_code I would like to have mass_unit_code.

This is not what joinedload is supposed to be used for. You want to do an explicit join in this case:
session.query(DNAExtractionProtocol.id.label("id"),
...,
MeasurementUnit.id.label("mass_unit_id"),
...) \
.join(DNAExtractionProtocol.mass_unit) \
.join(DNAExtractionProtocol.digestion_buffer) \
... \
.filter(...)
If you don't want to type out all those names, you can inspect the DNAExtractionProtocol class to find all relationships and dynamically construct the query and labels. An example:
cols = []
joins = []
insp = inspect(DNAExtractionProtocol)
for name, col in insp.columns.items():
cols.append(col.label(name))
for name, rel in insp.relationships.items():
alias = aliased(rel.mapper.class_, name=name)
for col_name, col in inspect(rel.mapper).columns.items():
aliased_col = getattr(alias, col.key)
cols.append(aliased_col.label("{}_{}".format(name, col_name)))
joins.append((alias, rel.class_attribute))
query = session.query(*cols).select_from(DNAExtractionProtocol)
for join in joins:
query = query.join(*join)
EDIT: Depending on your data structure you might need to use outerjoin instead of join on the last line.
You'll probably need to tweak this to your liking. For example, this doesn't take into account potential naming conflicts, e.g. for mass_unit_id, is it DNAExtractionProtocol.mass_unit_id or is it MeasurementUnit.id?
In addition, you'll probably want to execute sql_query.statement instead of str(sql_query). str(sql_query) is for printing purposes, not for execution. I believe you don't need to pass params=[code] if you use sql_query.statement because code will already have been bound to the appropriate parameter in the query.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

SQLAlchemy relationship fields name constructor - python

Related

SQLAlchemy Filter On Relationship Not Working

SqlAlchemy querying deep nested objects

Sorting time column derived from values of other columns using hybrid_property

How to do efficient reverse foreign key check on Django multi-table inheritance model?

Preserve key naming when creating a joinedload sql_query in SQLAlchemy

Categories

Resources