SQLAlchemy, Postgres, getting distinct to join with Alchemy model - python

I'm relatively new to SQLAlchemy, and thus far have not had to do anything that complex. I now have a need to return the latest "version" of a row. I can use "distinct" to return the relevant list, however I'm struggling to have the query return SQLAlchemy models.
session.query(Document.document_id, func.max(Document.id)).\
    filter_by(container_id=1, active=True).\
    group_by(Document.document_id).all()
This returns the list of ids that I need. But what I really need is the whole model.
I'm sure there's a simple way to join it. However it has completely eluded me.

Using a subquery, you can then join:
subq = (
    session.query(
        # Document.document_id,  # not really needed
        func.max(Document.id).label("max_id")
    )
    .filter(Document.container_id == 1)
    .filter(Document.active == True)
    .group_by(Document.document_id)
    .subquery("subq")
)
qry = (
    session.query(Document)
    .join(subq, Document.id == subq.c.max_id)
    .all()
)
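For reference, here is a minimal, self-contained sketch of the same subquery join, run against an in-memory SQLite database rather than Postgres. The Document columns and the sample rows are assumptions made up for illustration, and the imports assume SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Boolean, Column, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Document(Base):
    # Hypothetical model: only the columns the query needs.
    __tablename__ = "documents"
    id = Column(Integer, primary_key=True)
    document_id = Column(Integer, nullable=False)
    container_id = Column(Integer, nullable=False)
    active = Column(Boolean, nullable=False, default=True)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Document(id=1, document_id=10, container_id=1, active=True),
        Document(id=2, document_id=10, container_id=1, active=True),
        Document(id=3, document_id=20, container_id=1, active=True),
        Document(id=4, document_id=20, container_id=2, active=True),
        Document(id=5, document_id=30, container_id=1, active=False),
    ])
    session.commit()

    # One row per document_id: the highest id among active rows in container 1.
    latest_ids = (
        session.query(func.max(Document.id).label("max_id"))
        .filter(Document.container_id == 1, Document.active.is_(True))
        .group_by(Document.document_id)
        .subquery("latest")
    )
    # Joining on the subquery keeps only those rows, as full Document objects.
    latest_docs = (
        session.query(Document)
        .join(latest_ids, Document.id == latest_ids.c.max_id)
        .order_by(Document.id)
        .all()
    )
    latest_pairs = [(doc.document_id, doc.id) for doc in latest_docs]

print(latest_pairs)
```

The join acts as a filter here: only rows whose id appears in the subquery survive, so the result is a list of complete model instances rather than bare ids.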

Related

How to implement a specific SQL statement as a SQLAlchemy ORM query

I followed a tutorial to build my first Flask API (https://medium.com/#dushan14/create-a-web-application-with-python-flask-postgresql-and-deploy-on-heroku-243d548335cc). That works, but now I want to write more customized queries with SQLAlchemy and PostgreSQL. My question is how I could do something like this:
query = text("""SELECT enc.*, persona."Persona_Nombre", persona."Persona_Apellido", metodo."MetEnt_Nombre", metodo_e."MetPag_Descripcion"
FROM "Ventas"."Enc_Ventas" AS enc
INNER JOIN "General"."Persona" AS persona ON enc."PersonaId" = persona."PersonaId"
INNER JOIN "Ventas"."Metodo_Entrega" AS metodo ON enc."MetodoEntregaId" = metodo."MetodoEntregaId"
INNER JOIN "General"."Metodo_Pago" AS metodo_e ON enc."MetodoPagoId" = metodo_e."MetodoPagoId"
INNER JOIN "General"."Estatus" AS estado ON enc. """)
but with SQLAlchemy in order to use the models that I created previously. Thanks in advance for any answer!!
Edit:
The columns that I wish to see at the final result are: enc.*, persona."Persona_Nombre", persona."Persona_Apellido", metodo."MetEnt_Nombre", metodo_e."MetPag_Descripcion"
I really wish I could share more info but sadly I can't at the moment.
Doing this from the ORM layer, you would reference model names (I matched the names in your query above, slightly adjusted, but I'm sure some of the model/table names are off).
Now revised to include only the specific columns you want to see (note that I ignore your SQL aliases; the ORM layer handles the actual query construction):
selection = session.query(Enc_Ventas, Persona.Persona_Nombre, Persona.Persona_Apellido, Metodo_Entrega.MetEnt_Nombre, Metodo_Pago.MetPag_Descripcion).\
    join(Persona, Enc_Ventas.PersonaId == Persona.PersonaId).\
    join(Metodo_Entrega, Enc_Ventas.MetodoEntregaId == Metodo_Entrega.MetodoEntregaId).\
    join(Metodo_Pago, Enc_Ventas.MetodoPagoId == Metodo_Pago.MetodoPagoId).\
    join(Estatus).all()
You read the selection collection by iterating over its rows, each of which behaves like a tuple. A more robust and stable solution would be to transform each output row into a dict.
Because whole models are included, each returned row also exposes them through dot notation, under the model names passed to query().
If you need further access to the columns in the related tables, use the ORM technique of .options(joinedload(myTable)), which brings in those additional columns in a single database query; you then reach them through the relationship name, also with dot notation.
You also need to define sqlalchemy relationships within your models for this to work, as well as defining the underlying SQL foreign keys.
Much more detail and/or a more specific question is needed to help further, imo.
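To make the row access described in this answer concrete, here is a small sketch against an in-memory SQLite database. The two models are hypothetical, heavily simplified stand-ins for the real ones (the primary key name EncVentasId, the table names, and the sample data are invented for illustration), and the imports assume SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Persona(Base):
    # Hypothetical, simplified model.
    __tablename__ = "persona"
    PersonaId = Column(Integer, primary_key=True)
    Persona_Nombre = Column(String)
    Persona_Apellido = Column(String)


class Enc_Ventas(Base):
    # Hypothetical, simplified model; EncVentasId is an invented key name.
    __tablename__ = "enc_ventas"
    EncVentasId = Column(Integer, primary_key=True)
    PersonaId = Column(Integer, ForeignKey("persona.PersonaId"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Persona(PersonaId=1, Persona_Nombre="Ana", Persona_Apellido="Lopez"))
    session.add(Enc_Ventas(EncVentasId=100, PersonaId=1))
    session.commit()

    rows = (
        session.query(Enc_Ventas, Persona.Persona_Nombre, Persona.Persona_Apellido)
        .join(Persona, Enc_Ventas.PersonaId == Persona.PersonaId)
        .all()
    )
    # Each row exposes the whole model under its class name and each
    # individual column under its attribute name.
    as_dicts = [
        {
            "venta_id": row.Enc_Ventas.EncVentasId,
            "nombre": row.Persona_Nombre,
            "apellido": row.Persona_Apellido,
        }
        for row in rows
    ]

print(as_dicts)
```

Building the dicts explicitly keeps the output shape independent of the column order in query(), which is the stability the answer alludes to.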

How to extend a partially unknown query with a filter in SqlAlchemy?

With a "partially unknown query", I mean a query which is composed of sub-queries where the original SqlAlchemy sub-query objects are not available / not known to the entity that is working with the query.
Consider the following example: I have a function that produces some query that contains sub-queries, which serves as a basis for specialized queries. An example of such a function could look like this:
def get_base_query(user_id: int) -> Query:
    max_reps = get_max_reps_query().subquery()
    user_reps = get_user_reps_query(user_id).subquery()
    return (
        session
        .query(
            max_reps.c.topic.label('topic'),
            (max_reps.c.reps - user_reps.c.reps).label('reps'),
        )
        .select_from(
            max_reps.join(user_reps, max_reps.c.topic == user_reps.c.topic)
        )
    )
Some other function is going to receive that Query object and wants to extend it. This function only knows the structure of the input query result (i.e. two columns, topic and reps). Say this function needs to extend the query so that it only returns rows matching a certain topic.
I've tried the following ways to achieve this, but the behavior is as outlined in the comments below:
from sqlalchemy import func as F
def query_filter_topic(query: Query, topic: str) -> Query:
    # Doesn't actually filter the rows.
    query = query.filter(Exercise.topic == topic)
    # Doesn't actually filter the rows.
    query = query.filter(query.subquery().c.topic == topic)
    # Results in 0 rows.
    query = query.filter(F.topic == topic)
    ...
My hypothesis is that none of these column references match any of the columns of the outermost SELECT of the query (because they are dynamically created columns). I was really expecting the subquery().c.topic bit to work.
Is there a "correct" way to extend the query? Given that I explicitly labeled the columns in get_base_query(), I feel like there should be a way to reference those columns.
On a side note, I would expect SqlAlchemy to throw an error rather than silently accepting these columns that it apparently can't process. The filter() calls that don't filter rows don't seem to change the Query object at all (judging by the str() representation).
The only way I could figure out to get it working is the below, but it doesn't feel like the right way to do it.
def query_filter_topic(query: Query, topic: str) -> Query:
    query = query.subquery()
    query = session.query(*query.c).filter(query.c.topic == topic)
    return query
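A runnable sketch of this wrapping workaround, assuming an in-memory SQLite database and SQLAlchemy 1.4 or later. The Exercise model, the simplified get_base_query(), and the sample data are all hypothetical stand-ins; the only thing query_filter_topic() relies on is that the incoming query labels its columns topic and reps.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Exercise(Base):
    # Hypothetical model used only to feed the base query.
    __tablename__ = "exercise"
    id = Column(Integer, primary_key=True)
    topic = Column(String)
    reps = Column(Integer)


def get_base_query(session):
    # Stand-in for the opaque base query: built from a subquery and
    # exposing exactly two labeled columns, topic and reps.
    totals = (
        session.query(
            Exercise.topic.label("topic"),
            func.sum(Exercise.reps).label("reps"),
        )
        .group_by(Exercise.topic)
        .subquery()
    )
    return session.query(
        totals.c.topic.label("topic"),
        (totals.c.reps * 2).label("reps"),
    )


def query_filter_topic(session, query, topic):
    # Wrap the unknown query so its labels become real, addressable columns.
    wrapped = query.subquery()
    return session.query(*wrapped.c).filter(wrapped.c.topic == topic)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Exercise(topic="squat", reps=5),
        Exercise(topic="squat", reps=3),
        Exercise(topic="pushup", reps=10),
    ])
    session.commit()

    filtered = query_filter_topic(session, get_base_query(session), "squat")
    filtered_rows = [tuple(row) for row in filtered.all()]

print(filtered_rows)
```

The cost is one extra level of nesting in the emitted SQL; in exchange, the filtering function needs no knowledge of how the base query was assembled.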

SQLAlchemy NOT exists on subselect?

I'm trying to replicate this raw sql into proper sqlalchemy implementation but after a lot of tries I can't find a proper way to do it:
SELECT *
FROM images i
WHERE NOT EXISTS (
SELECT image_id
FROM events e
WHERE e.image_id=i.id AND e.chat_id=:chat_id)
ORDER BY random()
LIMIT 1
Closest I got is:
session.query(Image).filter(and_(Event.image_id == Image.id, Event.chat_id == chat_id)).order_by(func.random()).limit(1)
But I can't seem to find how to add the NOT EXISTS clause.
Can anyone lend a helping hand?
Thanks!
You're querying FROM the images table, and the WHERE clause should be the NOT EXISTS subquery rather than e.image_id=i.id AND e.chat_id=:chat_id directly (those filters belong to the subquery on events). So, the correct query is of the form
session.query(Image).filter(subquery).order_by(func.random()).limit(1)
The way to form an EXISTS subquery is with the .exists() method, so to get NOT EXISTS just use the ~ operator:
subquery = ~session.query(Event).filter(Event.image_id == Image.id, Event.chat_id == chat_id).exists()
Note that the emitted query is not identical to your original (e.g. it uses EXISTS (SELECT 1 ...)), but it's functionally the same.
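Putting the two pieces together, here is a minimal end-to-end sketch on an in-memory SQLite database (SQLAlchemy 1.4 or later assumed). The Image and Event models contain only the columns the query touches, and the sample rows are invented so that exactly one image is unseen for the chosen chat.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Image(Base):
    # Hypothetical minimal model.
    __tablename__ = "images"
    id = Column(Integer, primary_key=True)


class Event(Base):
    # Hypothetical minimal model.
    __tablename__ = "events"
    id = Column(Integer, primary_key=True)
    image_id = Column(Integer, ForeignKey("images.id"))
    chat_id = Column(Integer)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Image(id=1), Image(id=2), Image(id=3)])
    session.add_all([
        Event(image_id=1, chat_id=7),
        Event(image_id=2, chat_id=7),
        Event(image_id=3, chat_id=8),  # image 3 was only shown in another chat
    ])
    session.commit()

    chat_id = 7
    # EXISTS subquery, correlated to the outer Image query.
    seen_in_chat = (
        session.query(Event)
        .filter(Event.image_id == Image.id, Event.chat_id == chat_id)
        .exists()
    )
    # ~ negates it into NOT EXISTS.
    unseen = (
        session.query(Image)
        .filter(~seen_in_chat)
        .order_by(func.random())
        .limit(1)
        .one()
    )
    unseen_id = unseen.id

print(unseen_id)
```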

Changing where clause without generating subquery in SQLAlchemy

I'm trying to build a relatively complex query and would like to manipulate the where clause of the result directly, without cloning/subquerying the returned query. An example would look like:
session = sessionmaker(bind=engine)()
def generate_complex_query():
    return select(
        columns=[location.c.id.label('id')],
        from_obj=location,
        whereclause=location.c.id>50
    ).alias('a')
query = generate_complex_query()
# based on this query, I'd like to add additional where conditions, ideally like:
# `query.where(query.c.id<100)`
# but without subquerying the original query
# this is what I found so far, which is quite verbose and it doesn't solve the subquery problem
query = select(
    columns=[query.c.id],
    from_obj=query,
    whereclause=query.c.id<100
)
# Another option I was considering was to map the query to a class:
# class Location(object):pass
# mapper(Location, query)
# session.query(Location).filter(Location.id<100)
# which looks more elegant, but also creates a subquery
result = session.execute(query)
for r in result:
    print(r)
This is the generated query:
SELECT a.id
FROM (SELECT location.id AS id
FROM location
WHERE location.id > %(id_1)s) AS a
WHERE a.id < %(id_2)s
I would like to obtain:
SELECT location.id AS id
FROM location
WHERE id > %(id_1)s and
id < %(id_2)s
Is there any way to achieve this? The reason for this is that I think query (2) is slightly faster (not much), and the mapper example (2nd example above) which I have in place messes up the labels (id becomes anon_1_id or a.id if I name the alias).
Why don't you do it like this:
query = generate_complex_query()
query = query.where(location.c.id < 100)
Essentially you can refine any query like this. Additionally, I suggest reading the SQL Expression Language Tutorial which is pretty awesome and introduces all the techniques you need. The way you build a select is only one way. Usually, I build my queries more like this: select(column).where(expression).where(next_expression) and so on. The FROM is usually automatically inferred by SQLAlchemy from the context, i.e. you rarely need to specify it.
Since you don't have access to the internals of generate_complex_query try this:
query = query.where(query.c.id < 100)
This should work in your case I presume.
Another idea:
query = query.where(text("id < 100"))
This uses SQLAlchemy's text expression. This could work for you, however, and this is important: if you want to introduce variables, read the description of the API linked above, because using format strings instead of bound parameters will open you up to SQL injection, something that is normally a no-brainer with SQLAlchemy but must be taken care of when working with such literal expressions.
Also note that this works because you label the column as id. If you don't do that and don't know the column name, then this won't work either.
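As a rough sketch of the chaining style described above: the example assumes generate_complex_query() can return the select() itself rather than .alias('a'), since the alias is what forces the subquery, and it uses the SQLAlchemy 1.4+ calling style with an in-memory SQLite database. The table definition and sample ids are made up for illustration.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, select, text

metadata = MetaData()
location = Table("location", metadata, Column("id", Integer, primary_key=True))


def generate_complex_query():
    # Assumption: return the select() itself, not an alias of it.
    return select(location.c.id.label("id")).where(location.c.id > 50)


# Each .where() call is ANDed onto the same SELECT; no subquery is created.
query = generate_complex_query().where(location.c.id < 100)

# Same refinement with a text() fragment and a bound parameter instead of
# string formatting.
text_query = generate_complex_query().where(
    text("id < :upper").bindparams(upper=100)
)

engine = create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(location.insert(), [{"id": 10}, {"id": 60}, {"id": 99}, {"id": 150}])
    ids = [row.id for row in conn.execute(query.order_by(location.c.id))]
    text_ids = sorted(row.id for row in conn.execute(text_query))

sql = str(query)
print(sql)
print(ids, text_ids)
```

Printing the statement shows a single SELECT with both conditions in one WHERE clause, which is the shape the question asks for.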

SQLAlchemy ORM: modify the columns returned from a query

If I've got an SQLAlchemy ORM query:
admin_users = Session.query(User).filter_by(is_admin=True)
Is it possible to modify the columns returned by that query?
For example, so that I could select only the User.id column, and use that in a sub query:
admin_email_addresses = Session.query(EmailAddress)\
    .filter(EmailAddress.user_id.in_(admin_users.select_columns(User.id)))
Note: the .values() method will not work, as it executes the query and returns an iterable of results (so, ex, EmailAddress.user_id.in_(admin_users.values(User.id)) will perform two queries, not one).
I know that I could modify the first query to be Session.query(User.id), but I'm specifically wondering how I could modify the columns returned by a query.
I feel your pain on the values() thing. In 0.6.5 I added with_entities() which is just like values() except doesn't iterate:
q = q.with_entities(User.id)
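A small sketch of how with_entities() slots into the question's example, using an in-memory SQLite database and SQLAlchemy 1.4 or later. The User and EmailAddress columns and the sample data are assumptions for illustration.

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    # Hypothetical minimal model.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    is_admin = Column(Boolean, nullable=False, default=False)


class EmailAddress(Base):
    # Hypothetical minimal model.
    __tablename__ = "email_addresses"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"))
    email = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        User(id=1, is_admin=True),
        User(id=2, is_admin=False),
        User(id=3, is_admin=True),
    ])
    session.add_all([
        EmailAddress(user_id=1, email="a@example.com"),
        EmailAddress(user_id=2, email="b@example.com"),
        EmailAddress(user_id=3, email="c@example.com"),
    ])
    session.commit()

    admin_users = session.query(User).filter_by(is_admin=True)
    # Swap the selected columns without executing the query.
    admin_ids = admin_users.with_entities(User.id)
    # The unexecuted query is embedded as an IN (SELECT ...) subquery.
    admin_email_addresses = (
        session.query(EmailAddress)
        .filter(EmailAddress.user_id.in_(admin_ids))
        .order_by(EmailAddress.email)
        .all()
    )
    emails = [address.email for address in admin_email_addresses]

print(emails)
```

Because admin_ids is never iterated, only the outer query is executed against the database.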
Assuming that your EmailAddress.user_id defines a ForeignKey, the query below will do the job more efficiently than the IN operator:
admin_email_addresses = session.query(EmailAddress).\
    join(User).filter(User.is_admin==True)
If you do not have a ForeignKey (although you should), you can specify the join condition explicitly:
admin_email_addresses = session.query(EmailAddress).\
    join(User, User.id==EmailAddress.user_id).filter(User.is_admin==True)
But if you really would like to do it with in_ operator, here you go (note the subquery):
subq = session.query(User.id).filter(User.is_admin==True).subquery()
admin_email_addresses = session.query(EmailAddress).\
    filter(EmailAddress.user_id.in_(subq))
