I'm trying to replicate this raw SQL in a proper SQLAlchemy implementation, but after a lot of tries I can't find a proper way to do it:
SELECT *
FROM images i
WHERE NOT EXISTS (
SELECT image_id
FROM events e
WHERE e.image_id=i.id AND e.chat_id=:chat_id)
ORDER BY random()
LIMIT 1
Closest I got is:
session.query(Image).filter(and_(Event.image_id == Image.id, Event.chat_id == chat_id)).order_by(func.random()).limit(1)
But I can't seem to find how to add the NOT EXISTS clause.
Can anyone lend a helping hand?
Thanks!
You're selecting FROM images, so the WHERE clause of the outer query is the subquery itself, not e.image_id=i.id AND e.chat_id=:chat_id (those filters apply to events, inside the subquery). So the correct query is of the form
session.query(Image).filter(subquery).order_by(func.random()).limit(1)
The way to form an EXISTS subquery is with the .exists() method, so to get NOT EXISTS just use the ~ operator:
subquery = ~session.query(Event).filter(Event.image_id == Image.id, Event.chat_id == chat_id).exists()
Note that the emitted query is not identical to your original (e.g. it uses EXISTS (SELECT 1 ...)), but it's functionally the same.
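To see the pieces together, here is a minimal sketch against an in-memory SQLite database. The Image and Event models, their columns, and the sample rows are assumptions made up for the illustration (the question does not show its models), and it assumes SQLAlchemy 1.4 or later:

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Image(Base):  # hypothetical model, mirrors the "images" table
    __tablename__ = "images"
    id = Column(Integer, primary_key=True)


class Event(Base):  # hypothetical model, mirrors the "events" table
    __tablename__ = "events"
    id = Column(Integer, primary_key=True)
    image_id = Column(Integer, ForeignKey("images.id"))
    chat_id = Column(Integer)


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Image 1 already has an event for chat 42; image 2 does not.
    session.add_all([Image(id=1), Image(id=2), Event(image_id=1, chat_id=42)])
    session.commit()

    chat_id = 42
    # Correlated EXISTS subquery: Image.id refers to the outer query's row.
    seen = (
        session.query(Event)
        .filter(Event.image_id == Image.id, Event.chat_id == chat_id)
        .exists()
    )
    # ~ negates it into NOT EXISTS; pick one random image without an event.
    unseen = session.query(Image).filter(~seen).order_by(func.random()).limit(1).first()
    unseen_id = unseen.id
```

With this sample data only image 2 has no event for the chat, so it is the only row the query can return.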
I'd like to perform a query like so:
query = users.select().where(users.c.id == id)
res = await database.execute(query)
I would then like to add some error handling by checking res, but testing it with if not res does not seem to be correct. What is the proper way to do this?
Depending on the complexity of your query, it might be cheaper to wrap it inside a SELECT EXISTS(...) than to SELECT ... LIMIT 1, as Antonio describes. PostgreSQL probably knows to treat those queries the same, but according to this answer, not all DBMS might do so.
The beauty of such a solution is that it exits as soon as it finds any result, so it does as little work as possible and is thus as cheap as possible. In SQLAlchemy that would be sa.select([sa.exists(query)]) (from SQLAlchemy 1.4 on, select() takes its columns positionally: sa.select(sa.exists(query))).
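As a rough sketch of the SELECT EXISTS(...) idea, the following uses plain synchronous SQLAlchemy Core (1.4 or later) against in-memory SQLite instead of the async database object from the question. The users table layout and the user_exists helper are hypothetical names introduced for the example:

```python
import sqlalchemy as sa

metadata = sa.MetaData()
# Hypothetical table definition; the question does not show its schema.
users = sa.Table(
    "users",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("name", sa.String),
)

engine = sa.create_engine("sqlite://")  # in-memory database, for the demo only
metadata.create_all(engine)


def user_exists(conn, user_id):
    """Return True if a row with this id exists, using SELECT EXISTS (...)."""
    stmt = sa.select(sa.exists().where(users.c.id == user_id))
    return bool(conn.execute(stmt).scalar())


with engine.begin() as conn:
    conn.execute(users.insert(), [{"id": 1, "name": "alice"}])
    found = user_exists(conn, 1)
    missing = user_exists(conn, 2)
```

The statement always returns exactly one row holding a single boolean-like value, so there is no ambiguity between "no rows" and "a falsy row" when checking the result.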
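You can try fetching the first row of your query and checking whether it is None. Assuming database is an encode/databases Database object (as the await database.execute call suggests), fetch_one returns None when no row matches:
result = await database.fetch_one(users.select().where(users.c.id == id))
if not result:
    print('query is empty')
The other way, keeping your existing query variable, would be
res = await database.fetch_one(query)
and then checking whether res is None.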
Using sqlalchemy I would like to do something like:
q = session.query(a, b.id, func.count(a.id))
q = q.outerjoin(b, b.id == a.b_id)
q = q.group_by(b.id)
However, in most SQL implementations it is impossible to select fields that are not in the GROUP BY clause.
Can I tell SQLAlchemy to select from table a without selecting any field directly from a? In this case I could just change the join order, but I've got some complex queries that aren't so easy to modify.
You can set the FROM clause explicitly with select_from:
session.query(b.id, func.count(a.id)).select_from(a).outerjoin(b, ...)...
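A small self-contained sketch of the select_from approach, assuming SQLAlchemy 1.4 or later and two made-up models A and B standing in for the question's a and b:

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class B(Base):  # hypothetical stand-in for "b"
    __tablename__ = "b"
    id = Column(Integer, primary_key=True)


class A(Base):  # hypothetical stand-in for "a"
    __tablename__ = "a"
    id = Column(Integer, primary_key=True)
    b_id = Column(Integer, ForeignKey("b.id"))


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Two rows of a point at b 1, one row of a has no b at all.
    session.add_all([B(id=1), A(id=1, b_id=1), A(id=2, b_id=1), A(id=3, b_id=None)])
    session.commit()

    q = (
        session.query(B.id, func.count(A.id))
        .select_from(A)                # FROM a, although no column of a is selected directly
        .outerjoin(B, B.id == A.b_id)  # a LEFT OUTER JOIN b
        .group_by(B.id)
    )
    counts = {b_id: n for b_id, n in q.all()}
```

Because a is the left side of the outer join, the row of a without a matching b still shows up, grouped under a NULL b.id.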
I'm relatively new to SQLAlchemy, and thus far have not had to do anything that complex. I now have a need to return the latest "version" of a row. I can use "distinct" to return the relevant list, however I'm struggling to have the query return SQLAlchemy models.
session.query(Document.document_id,func.max(Document.id)).\
filter_by(container_id=1,active=True).\
group_by(Document.document_id).all()
This returns the list of ids that I need. But what I really need is the whole model.
I'm sure there's a simple way to join it. However it has completely eluded me.
Using a subquery, you can then join:
subq = (session.query(
            # Document.document_id,  # do not need this really
            func.max(Document.id).label("max_id")
        )
        .filter(Document.container_id == 1)
        .filter(Document.active == True)
        .group_by(Document.document_id)
        ).subquery("subq")
qry = (session.query(Document)
       .join(subq, Document.id == subq.c.max_id)
       ).all()
I'm trying to build a relatively complex query and would like to manipulate the where clause of the result directly, without cloning/subquerying the returned query. An example would look like:
session = sessionmaker(bind=engine)()
def generate_complex_query():
    return select(
        columns=[location.c.id.label('id')],
        from_obj=location,
        whereclause=location.c.id>50
    ).alias('a')
query = generate_complex_query()
# based on this query, I'd like to add additional where conditions, ideally like:
# `query.where(query.c.id<100)`
# but without subquerying the original query
# this is what I found so far, which is quite verbose and it doesn't solve the subquery problem
query = select(
columns=[query.c.id],
from_obj=query,
whereclause=query.c.id<100
)
# Another option I was considering was to map the query to a class:
# class Location(object):pass
# mapper(Location, query)
# session.query(Location).filter(Location.id<100)
# which looks more elegant, but also creates a subquery
result = session.execute(query)
for r in result:
    print(r)
This is the generated query:
SELECT a.id
FROM (SELECT location.id AS id
FROM location
WHERE location.id > %(id_1)s) AS a
WHERE a.id < %(id_2)s
I would like to obtain:
SELECT location.id AS id
FROM location
WHERE id > %(id_1)s and
id < %(id_2)s
Is there any way to achieve this? The reason for this is that I think query (2) is slightly faster (not much), and the mapper example (2nd example above) which I have in place messes up the labels (id becomes anon_1_id or a.id if I name the alias).
Why don't you do it like this:
query = generate_complex_query()
query = query.where(location.c.id < 100)
Essentially you can refine any query like this. Additionally, I suggest reading the SQL Expression Language Tutorial which is pretty awesome and introduces all the techniques you need. The way you build a select is only one way. Usually, I build my queries more like this: select(column).where(expression).where(next_expression) and so on. The FROM is usually automatically inferred by SQLAlchemy from the context, i.e. you rarely need to specify it.
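A minimal sketch of this chaining style, assuming SQLAlchemy 1.4 or later. Note the assumption that generate_complex_query returns the select itself rather than the .alias('a') shown in the question, since an alias has no where() method; the location table definition here is also made up for the example:

```python
from sqlalchemy import Column, Integer, MetaData, Table, select

metadata = MetaData()
# Hypothetical table definition; only the id column matters here.
location = Table("location", metadata, Column("id", Integer, primary_key=True))


def generate_complex_query():
    # Stand-in for the question's function, returning the un-aliased select.
    return select(location.c.id.label("id")).where(location.c.id > 50)


# Each .where() call ANDs another condition onto the same SELECT,
# so no wrapping subquery is produced.
query = generate_complex_query().where(location.c.id < 100)
sql = str(query)
print(sql)
```

The printed statement contains a single SELECT with both conditions combined in one WHERE clause.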
Since you don't have access to the internals of generate_complex_query try this:
query = query.where(query.c.id < 100)
This should work in your case I presume.
Another idea:
query = query.where(text("id < 100"))
This uses SQLAlchemy's text expression. This could work for you; however, and this is important: if you want to introduce variables, read the description of the API linked above, because just using format strings instead of bound parameters will open you up to SQL injection, something that normally is a no-brainer with SQLAlchemy but must be taken care of when working with such literal expressions.
Also note that this works because you label the column as id. If you don't do that and don't know the column name, then this won't work either.
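To make the bound-parameter point concrete, here is a hedged sketch (SQLAlchemy 1.4 or later, in-memory SQLite) that passes the upper limit through text(...).bindparams(...) instead of formatting it into the string. The table definition, the sample ids, and the parameter name upper are assumptions for the illustration:

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, select, text

metadata = MetaData()
# Hypothetical table definition; only the id column matters here.
location = Table("location", metadata, Column("id", Integer, primary_key=True))

engine = create_engine("sqlite://")  # in-memory database, for the demo only
metadata.create_all(engine)

query = select(location.c.id.label("id")).where(location.c.id > 50)

upper = 100  # imagine this value comes from user input
# The value travels as a bound parameter, never as part of the SQL string.
query = query.where(text("id < :upper").bindparams(upper=upper))

with engine.begin() as conn:
    conn.execute(location.insert(), [{"id": i} for i in (10, 60, 99, 150)])
    ids = [row[0] for row in conn.execute(query)]
```

Only the ids strictly between 50 and 100 come back, and the same statement can be reused with a different upper value without rebuilding the SQL text.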
I have 2 tables; we'll call them table1 and table2. table2 has a foreign key to table1. I need to delete the rows in table1 that have zero child records in table2. The SQL to do this is pretty straightforward:
DELETE FROM table1
WHERE 0 = (SELECT COUNT(*) FROM table2 WHERE table2.table1_id = table1.table1_id);
However, I haven't been able to find a way to translate this query to SQLAlchemy. Trying the straightforward approach:
subquery = session.query(sqlfunc.count(Table2).label('t2_count')).select_from(Table2).filter(Table2.table1_id == Table1.table1_id).subquery()
session.query(Table1).filter(0 == subquery.columns.t2_count).delete()
Just yielded an error:
sqlalchemy.exc.ArgumentError: Only deletion via a single table query is currently supported
How can I perform this DELETE with SQLAlchemy?
Python 2.7
PostgreSQL 9.2.4
SQLAlchemy 0.7.10 (Cannot upgrade due to using GeoAlchemy, but am interested if newer versions would make this easier)
I'm pretty sure this is what you want. You should try it out though. It uses EXISTS.
from sqlalchemy.sql import not_
# This fetches rows in python to determine which ones were removed.
Session.query(Table1).filter(not_(Table1.table2s.any())).delete(
synchronize_session='fetch')
# If you will not be referencing more Table1 objects in this session then you
# can just ignore syncing the session.
Session.query(Table1).filter(not_(Table1.table2s.any())).delete(
synchronize_session=False)
Explanation of the argument for delete():
http://docs.sqlalchemy.org/en/rel_0_8/orm/query.html#sqlalchemy.orm.query.Query.delete
Example with exists (the any() call above also emits EXISTS):
http://docs.sqlalchemy.org/en/rel_0_8/orm/tutorial.html#using-exists
Here is the SQL that should be generated:
DELETE FROM table1 WHERE NOT (EXISTS (SELECT 1
FROM table2
WHERE table1.id = table2.table1_id))
If you are using declarative, you can get at the underlying table through Table2.__table__ and then use the SQL expression layer of SQLAlchemy to do exactly what you want, although you run into the same issue of your Session getting out of sync.
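For the SQL-layer route, a minimal sketch could look like the following. It assumes SQLAlchemy 1.4 or later (not the 0.7.10 from the question) and uses plain Table objects with made-up column definitions that follow the question's naming; it bypasses the Session entirely, so any loaded ORM objects would not be updated:

```python
from sqlalchemy import (
    Column, ForeignKey, Integer, MetaData, Table, create_engine, delete, exists, select,
)

metadata = MetaData()
# Hypothetical table definitions following the question's column names.
table1 = Table("table1", metadata, Column("table1_id", Integer, primary_key=True))
table2 = Table(
    "table2",
    metadata,
    Column("table2_id", Integer, primary_key=True),
    Column("table1_id", Integer, ForeignKey("table1.table1_id")),
)

engine = create_engine("sqlite://")  # in-memory database, for the demo only
metadata.create_all(engine)

# EXISTS (SELECT * FROM table2 WHERE table2.table1_id = table1.table1_id),
# correlated to the table being deleted from.
has_children = exists().where(table2.c.table1_id == table1.c.table1_id)
stmt = delete(table1).where(~has_children)  # DELETE ... WHERE NOT EXISTS (...)

with engine.begin() as conn:
    conn.execute(table1.insert(), [{"table1_id": 1}, {"table1_id": 2}, {"table1_id": 3}])
    conn.execute(table2.insert(), [{"table2_id": 1, "table1_id": 2}])
    deleted = conn.execute(stmt).rowcount
    remaining = [row[0] for row in conn.execute(select(table1.c.table1_id))]
```

With this sample data, rows 1 and 3 have no children and are removed in a single DELETE statement, leaving only row 2.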
Well, I found one very ugly way to do it. You can do a select with a join to get the rows loaded into memory, then you can delete them individually:
subquery = session.query(Table2.table1_id
                         ,sqlalchemy.func.count(Table2.table2_id).label('t1count')
                         ) \
    .select_from(Table2) \
    .group_by(Table2.table1_id) \
    .subquery()
rows = session.query(Table1) \
    .select_from(Table1) \
    .outerjoin(subquery, Table1.table1_id == subquery.c.table1_id) \
    .filter(subquery.c.t1count == None) \
    .all()
for r in rows:
    session.delete(r)
This is not only nasty to write, it's also pretty nasty performance-wise. For starters, you have to bring the table1 rows into memory. Second, if you were like me and had a line like this on Table2's class definition:
table1 = orm.relationship(Table1, backref=orm.backref('table2s'))
then SQLAlchemy will actually perform a query to pull the related table2 rows into memory, too (even though there aren't any). Even worse, because you have to loop over the list (I tried just passing in the list; didn't work), it does so one table1 row at a time. So if you're deleting 10 rows, it's 21 individual queries (1 for the initial select, 1 for each relationship pull, and 1 for each delete). Maybe there are ways to mitigate that; I would have to go through the documentation to see. All this for things I don't even want in my database, much less in memory.
I won't mark this as the answer. I want a cleaner, more efficient way of doing this, but this is all I have for now.