How to query with like() when using many-to-many relationships in SQLAlchemy? - python

I have the following many-to-many relationship defined in SQLAlchemy:
training_ids_association_table = db.Table(
    "training_ids_association",
    db.Model.metadata,
    Column("training_id", Integer, ForeignKey("training_sessions.id")),
    Column("ids_id", Integer, ForeignKey("image_data_sets.id")),
)
class ImageDataSet(db.Model):
    __tablename__ = "image_data_sets"
    id = Column(Integer, primary_key=True)
    tags = Column(String)
    trainings = relationship("TrainingSession", secondary=training_ids_association_table, back_populates="image_data_sets")
class TrainingSession(db.Model):
    __tablename__ = "training_sessions"
    id = Column(Integer, primary_key=True)
    image_data_sets = relationship("ImageDataSet", secondary=training_ids_association_table, back_populates="trainings")
Note the field ImageDataSet.tags, which can contain a list of string items (i.e. tags), separated by a slash character. If possible I would rather stick to that format instead of creating a new table just for these tags.
What I want now is to query the TrainingSession table for all entries that have a certain tag set in their related ImageDataSets. Now, if an ImageDataSet has only one tag saved in the tags field, then the following works:
TrainingSession.query.filter(TrainingSession.image_data_sets.any(tags=find_tag))
However, as soon as there are multiple tags in the tags field (e.g. something like "tag1/tag2/tag3"), then of course this filter above does not work any more. So I tried it with a like:
.filter(TrainingSession.image_data_sets.like(f'%{find_tag}%'))
But this leads to a NotImplementedError in SQLAlchemy. So is there a way to achieve what I am trying to do here, or do I necessarily need another table for the tags per ImageDataSet?

You can apply any filters on related model columns if you join this model first:
query = session.query(TrainingSession). \
    join(TrainingSession.image_data_sets). \
    filter(ImageDataSet.tags.like(f"%{find_tag}%"))
This query is translated to the following SQL statement:
SELECT training_sessions.id FROM training_sessions
JOIN training_ids_association ON training_sessions.id = training_ids_association.training_id
JOIN image_data_sets ON image_data_sets.id = training_ids_association.ids_id
WHERE image_data_sets.tags LIKE %(find_tag)s
Note that you may run into a problem when storing tags as strings with separators. If some records have the tags tag1, tag12 and tag123, they will all pass the filter LIKE '%tag1%'.
It would be better to switch to an ARRAY column if your database supports this column type (PostgreSQL, for example). Your column may be defined like this:
tags = Column(ARRAY(String))
And the query may look like this:
query = session.query(TrainingSession). \
    join(TrainingSession.image_data_sets). \
    filter(ImageDataSet.tags.any(find_tag))
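If the slash-separated format has to stay, the false matches described above can be avoided by wrapping the stored value in slashes before comparing, so that every tag is delimited on both sides. The following is only a sketch under some assumptions: it uses plain SQLAlchemy 1.4+ with an in-memory SQLite database instead of Flask-SQLAlchemy's db.Model, and the helper name sessions_with_tag is made up for illustration. It also passes the LIKE expression to any() as a criterion, which keeps the original relationship-based filter and avoids duplicate rows from the join.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine, literal
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

association = Table(
    "training_ids_association",
    Base.metadata,
    Column("training_id", Integer, ForeignKey("training_sessions.id")),
    Column("ids_id", Integer, ForeignKey("image_data_sets.id")),
)


class ImageDataSet(Base):
    __tablename__ = "image_data_sets"
    id = Column(Integer, primary_key=True)
    tags = Column(String)
    trainings = relationship(
        "TrainingSession", secondary=association, back_populates="image_data_sets"
    )


class TrainingSession(Base):
    __tablename__ = "training_sessions"
    id = Column(Integer, primary_key=True)
    image_data_sets = relationship(
        "ImageDataSet", secondary=association, back_populates="trainings"
    )


def sessions_with_tag(session, find_tag):
    """Hypothetical helper: sessions with a data set carrying find_tag as a whole tag."""
    # "tag1/tag12" becomes "/tag1/tag12/", so "%/tag1/%" matches only the exact tag.
    # Caveat: "%" or "_" inside find_tag would still act as LIKE wildcards.
    wrapped = literal("/") + ImageDataSet.tags + literal("/")
    return (
        session.query(TrainingSession)
        .filter(TrainingSession.image_data_sets.any(wrapped.like(f"%/{find_tag}/%")))
        .order_by(TrainingSession.id)
        .all()
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all([
        TrainingSession(id=1, image_data_sets=[ImageDataSet(tags="tag1/tag2")]),
        TrainingSession(id=2, image_data_sets=[ImageDataSet(tags="tag12/tag3")]),
    ])
    session.commit()
    matching_ids = [t.id for t in sessions_with_tag(session, "tag1")]
    print(matching_ids)
```

Only the first training session is returned here, because "tag12" no longer matches a search for "tag1".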

Related

SQLAlchemy: partial unique constraint where a field has a certain value

In my Flask project I need a table with a unique constraint on one column that applies only among rows whose values in another column are identical. So I am trying to do something like this:
if premiumuser_id = "a value I don't know in advance" then track_id=unique
This is similar to Creating partial unique index with sqlalchemy on Postgres, but I use sqlite (where partial indexes should also be possible: https://docs.sqlalchemy.org/en/13/dialects/sqlite.html?highlight=partial%20indexes#partial-indexes) and the condition is different.
So far my code looks like this:
class Queue(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    track_id = db.Column(db.Integer)
    premiumuser_id = db.Column(
        db.Integer, db.ForeignKey("premium_user.id"), nullable=False
    )
    # __table_args__ must be a tuple or dict, hence the trailing comma
    __table_args__ = (
        db.Index(
            "idx_partially_unique_track",
            "track_id",
            unique=True,
            sqlite_where="and here I'm lost",
        ),
    )
All examples I've found operate with boolean or fixed values. What should the syntax for sqlite_where look like for the condition: premiumuser_id = "a value I don't know in advance"?
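One possible reading of this requirement is that track_id only has to be unique among rows sharing the same premiumuser_id. If that is the intent, a partial index may not be needed at all: a composite unique index over both columns expresses it, and in SQLAlchemy that would typically be declared with UniqueConstraint("premiumuser_id", "track_id") inside the __table_args__ tuple. The sketch below uses the standard-library sqlite3 module to show the behaviour at the SQL level; the table name queue and the index name are illustrative assumptions, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue ("
    " id INTEGER PRIMARY KEY,"
    " track_id INTEGER,"
    " premiumuser_id INTEGER NOT NULL)"
)
# Uniqueness is enforced per (premiumuser_id, track_id) pair, so the same
# track may be queued by different users, but only once per user.
conn.execute(
    "CREATE UNIQUE INDEX idx_unique_track_per_user"
    " ON queue (premiumuser_id, track_id)"
)

conn.execute("INSERT INTO queue (track_id, premiumuser_id) VALUES (10, 1)")
# Same track for a different user: allowed.
conn.execute("INSERT INTO queue (track_id, premiumuser_id) VALUES (10, 2)")

try:
    # Same track for the same user: rejected by the unique index.
    conn.execute("INSERT INTO queue (track_id, premiumuser_id) VALUES (10, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
print(duplicate_rejected, row_count)
```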

How to return specific dictionary keys from within a nested list from a jsonb column in sqlalchemy

I am attempting to return some named columns from a jsonb data set that is stored with PostgreSQL.
I am able to run a raw query that meets my needs directly, however I am trying to run the query utilising SQLAlchemy, in order to ensure that my code is 'pythonic' and easy to read.
The query that returns the correct result (two columns) is:
SELECT
tmp.item->>'id',
tmp.item->>'name'
FROM (SELECT jsonb_array_elements(t.data -> 'users') AS item FROM tpeople t) as tmp
Example json (each user has 20+ columns)
{ "results":247, "users": [
{"id":"202","regdate":"2015-12-01","name":"Bob Testing"},
{"id":"87","regdate":"2014-12-12","name":"Sally Testing"},
{"id":"811", etc etc}
...
]}
The table is simple enough, with a PK, datetime of json extraction, and the jsonb column for the extract
CREATE TABLE tpeople
(
record_id bigint NOT NULL DEFAULT nextval('"tpeople_record_id_seq"'::regclass) ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 9223372036854775807 CACHE 1 ),
scrape_time timestamp without time zone NOT NULL,
data jsonb NOT NULL,
CONSTRAINT "tpeople_pkey" PRIMARY KEY (record_id)
);
Additionally I have a People Class that looks as follows:
class people(Base):
    __tablename__ = 'tpeople'
    record_id = Column(BigInteger, primary_key=True, server_default=text("nextval('\"tpeople_record_id_seq\"'::regclass)"))
    scrape_time = Column(DateTime, nullable=False)
    data = Column(JSONB(astext_type=Text()), nullable=False)
Presently my code to return the two columns looks like this:
from db.db_conn import get_session  # Generic connector for my db
from model.models import people
from sqlalchemy import func
sess = get_session()
sub = sess.query(func.jsonb_array_elements(people.data["users"]).label("item")).subquery()
test = sess.query(sub.c.item).select_entity_from(sub).all()
SQLAlchemy generates the following SQL:
SELECT anon_1.item AS anon_1_item
FROM (SELECT jsonb_array_elements(tpeople.data -> %(data_1)s) AS item
FROM tpeople) AS anon_1
{'data_1': 'users'}
But nothing I seem to do can allow me to only get certain columns within the item itself like the raw SQL I can write. Some of the approaches I have tried as follows (they all error out):
test = sess.query("sub.item.id").select_entity_from(sub).all()
test = sess.query(sub.item.["id"]).select_entity_from(sub).all()
aas = func.jsonb_to_recordset(people.data["users"])
res = sess.query("id").select_from(aas).all()
sub = select(func.jsonb_array_elements(people.data["users"]).label("item"))
Presently I can extract the columns I need in a simple for loop, but this seems like a hacky way to do it, and I'm sure there is something dead obvious I'm missing.
for row in test:
    print(row.item['id'])
After searching for a few hours, I eventually found someone who had accidentally done this while trying to get a different result:
sub = sess.query(func.jsonb_array_elements(people.data["users"]).label("item")).subquery()
tmp = sub.c.item.op('->>')('id')
tmp2 = sub.c.item.op('->>')('name')
test = sess.query(tmp, tmp2).all()
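The approach above can be checked without a running database by compiling the statement against the PostgreSQL dialect and inspecting the SQL. This is a minimal sketch, assuming SQLAlchemy 1.4 or later; the People class is a trimmed copy of the model from the question, and the subquery name tmp is chosen only to mirror the raw SQL.

```python
from sqlalchemy import BigInteger, Column, DateTime, func, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class People(Base):
    __tablename__ = "tpeople"
    record_id = Column(BigInteger, primary_key=True)
    scrape_time = Column(DateTime, nullable=False)
    data = Column(JSONB, nullable=False)


# Inner query: one row per element of the "users" array.
sub = select(
    func.jsonb_array_elements(People.data["users"]).label("item")
).subquery("tmp")

# Outer query: extract two keys as text with the ->> operator.
stmt = select(
    sub.c.item.op("->>")("id"),
    sub.c.item.op("->>")("name"),
)

# Compiling only renders SQL; no PostgreSQL connection or driver is used.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

The printed statement should have the same shape as the hand-written query, with the key names passed as bound parameters.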

SQLAlchemy column name with space

I'm trying to filter a table on a column whose name contains spaces.
...
events = database_session.query(table)
events.filter(table.column with space == 'xvalue')  # <-- I want to do that
...
There is for sure a simple way of doing that, but I can't seem to find it anywhere.
There are two ways to resolve this.
First, when defining the table with Table, give the column a Python-friendly alias via the key parameter:
t_table_name = Table(
    'tablename',
    metadata,
    Column('SQL Column', Integer, key='sql_column')
)
Second, when defining an ORM class, pass the database column name as the first argument to Column and use any valid attribute name:
class Employee(Base):
    emp_name = Column("employee name", String)
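To show the ORM variant end to end, here is a small sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The table name employees and the id column are additions for illustration, since a mapped class needs a table name and a primary key.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Employee(Base):
    __tablename__ = "employees"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    # The first positional argument is the database column name;
    # the Python attribute used in queries is emp_name.
    emp_name = Column("employee name", String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all([Employee(emp_name="xvalue"), Employee(emp_name="other")])
    session.commit()
    # Filter through the attribute; SQLAlchemy quotes "employee name" in the SQL.
    matches = session.query(Employee).filter(Employee.emp_name == "xvalue").all()
    names = [e.emp_name for e in matches]
    print(names)
```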

SQLite to Django - join tables with same field names

I'm trying to transfer this very simplified query to Django models:
select B.value from A join B on A.id = B.id where B.param = "foo" group by B.value;
Basically giving me unique B.value of rows with common id
My models in Django are:
#A
id = CharField()
...
#B
id = CharField()
param = CharField()
value = CharField()
...
From what I've read, people are against joining tables in Django. How does my query translate to Django in simplest form?
I'm almost willing to just execute this query using django.db.connection but I'd rather not
It looks like you are trying to retrieve all values of B that match a specific param value and have a corresponding id in A, right?
Try this:
a_ids = A.objects.values_list('id', flat=True)
b_values = B.objects.filter(param='foo', id__in=a_ids).values_list('value', flat=True).distinct()
I would encourage you to think about queries in terms of the use case rather than translating SQL into its ORM equivalent.

SQLAlchemy ORM select multiple entities from subquery

I need to query multiple entities, something like session.query(Entity1, Entity2), only from a subquery rather than directly from the tables. The docs have something about selecting one entity from a subquery but I can't find how to select more than one, either in the docs or by experimentation.
My use case is that I need to filter the tables underlying the mapped classes by a window function, which in PostgreSQL can only be done in a subquery or CTE.
EDIT: The subquery spans a JOIN of both tables so I can't just do aliased(Entity1, subquery).
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class A(Base):
    __tablename__ = "a"
    id = Column(Integer, primary_key=True)
    bs = relationship("B")
class B(Base):
    __tablename__ = "b"
    id = Column(Integer, primary_key=True)
    a_id = Column(Integer, ForeignKey('a.id'))
e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)
s = Session(e)
s.add_all([A(bs=[B(), B()]), A(bs=[B()])])
s.commit()
# with_labels() here is to disambiguate A.id and B.id.
# without it, you'd see a warning
# "Column 'id' on table being replaced by another column with the same key."
subq = s.query(A, B).join(A.bs).with_labels().subquery()
# method 1 - select_from()
print(s.query(A, B).select_from(subq).all())
# method 2 - alias them both. "subq" renders
# once because FROM objects render based on object
# identity.
a_alias = aliased(A, subq)
b_alias = aliased(B, subq)
print(s.query(a_alias, b_alias).all())
I was trying to do something like the original question: join a filtered table with another filtered table using an outer join. I was struggling because it's not at all obvious how to:
create a SQLAlchemy query that returns entities from both tables. @zzzeek's answer showed me how to do that: get_session().query(A, B).
use a query as a table in such a query. @zzzeek's answer showed me how to do that too: filtered_a = aliased(A).filter(...).subquery().
use an OUTER join between the two entities. Using select_from() after outerjoin() destroys the join condition between the tables, resulting in a cross join. From @zzzeek's answer I guessed that if a is aliased(), then you can include a in the query() and also .outerjoin(a), and it won't be joined a second time, and that appears to work.
Following either of @zzzeek's suggested approaches directly resulted in a cross join (combinatorial explosion), because one of my models uses inheritance, and SQLAlchemy added the parent tables outside the inner SELECT without any conditions! I think this is a bug in SQLAlchemy. The approach that I adopted in the end was:
filtered_a = aliased(A, A.query().filter(...)).subquery("filtered_a")
filtered_b = aliased(B, B.query().filter(...)).subquery("filtered_b")
query = get_session().query(filtered_a, filtered_b)
query = query.outerjoin(filtered_b, filtered_a.relation_to_b)
query = query.order_by(filtered_a.some_column)
for a, b in query:
    ...
