sqlalchemy orm join error [duplicate] - python

Using sqlalchemy I would like to do something like:
q = session.query(a, b.id, func.count(a.id))
q = q.outerjoin(b, b.id == a.b_id)
q = q.group_by(b.id)
However, most SQL implementations do not allow selecting fields that are not in the GROUP BY clause.
Can I tell SQLAlchemy to select from table a without selecting any field directly from a? In this case I could just change the join order, but I have some complex queries that aren't so easy to modify.

You can set the FROM clause explicitly with select_from:
session.query(b.id, func.count(a.id)).select_from(a).outerjoin(b, ...)...
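As a minimal sketch of the select_from approach, the following builds the query against two hypothetical Core tables standing in for a and b (the table definitions are assumptions, not from the question) and prints the SQL. It assumes SQLAlchemy 1.4 or later and uses a Query without a session, which is enough to render the statement.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, Table, func
from sqlalchemy.orm import Query

metadata = MetaData()

# Hypothetical stand-ins for the question's a and b.
b = Table("b", metadata, Column("id", Integer, primary_key=True))
a = Table(
    "a",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("b_id", Integer, ForeignKey("b.id")),
)

# Only b.id and the count are selected, but the FROM clause starts at a.
query = (
    Query([b.c.id, func.count(a.c.id)])
    .select_from(a)
    .outerjoin(b, b.c.id == a.c.b_id)
    .group_by(b.c.id)
)

# Collapse whitespace so the rendered SQL prints on one line.
sql = " ".join(str(query).split())
print(sql)
```

The rendered statement should read FROM a LEFT OUTER JOIN b, even though no column of a appears in the select list outside the aggregate.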

Related

SQLAlchemy joins table from subquery. How to prevent it of doing this?

I have this subquery:
aliasBilling = aliased(Billing)
subQueryReversesBilling = db.session.query(BillingReversal).select_from(BillingReversal).with_entities(BillingReversal.reverses_billing_id).filter(BillingReversal.billing_id == aliasBilling.id).subquery()
Which generates:
SELECT billing_reversal.reverses_billing_id
FROM billing_reversal, billing AS billing_1
WHERE billing_reversal.billing_id = billing_1.id
How do I get rid of that comma-style join, which I didn't specify? It joins against the full table and every record. Apparently SQLAlchemy assumes I need it because the table appears in the filter (the WHERE clause), but I take that table from the bigger query I have later on. That is why it is a subquery. What I need is this:
SELECT billing_reversal.reverses_billing_id
FROM billing_reversal
WHERE billing_reversal.billing_id = billing_1.id
The cross join is coming from the aliased Billing table in the filter clause.
If BillingReversal and Billing are linked with a relationship (and a foreign key), you should leverage this relationship and join instead of filter.
subQueryReversesBilling = (
    db.session.query(BillingReversal)
    .join(Billing, BillingReversal.billing)
    .with_entities(BillingReversal.reverses_billing_id)
    .subquery()
)
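If the subquery really is meant to refer to a table from the enclosing query, another option is to mark the alias as correlated. The sketch below assumes SQLAlchemy 1.4 or later and uses hypothetical Core tables in place of the Billing and BillingReversal models; the outer EXISTS query is only an illustration of an enclosing query, not the asker's actual one. Note that the correlation only takes effect when the subquery is rendered inside an enclosing query that supplies billing_1; printed on its own, the subquery still lists both tables.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, Table, select

metadata = MetaData()

# Hypothetical table definitions standing in for the ORM models.
billing = Table("billing", metadata, Column("id", Integer, primary_key=True))
billing_reversal = Table(
    "billing_reversal",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("billing_id", Integer, ForeignKey("billing.id")),
    Column("reverses_billing_id", Integer, ForeignKey("billing.id")),
)
billing_1 = billing.alias("billing_1")

# correlate() tells SQLAlchemy that billing_1 comes from the enclosing query,
# so it is left out of the inner FROM clause.
inner = (
    select(billing_reversal.c.reverses_billing_id)
    .where(billing_reversal.c.billing_id == billing_1.c.id)
    .correlate(billing_1)
)

# Illustrative enclosing query that provides billing_1.
outer = select(billing_1.c.id).where(inner.exists())

sql = " ".join(str(outer).split())
print(sql)
```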

Sort by a column in a union query in SqlAlchemy SQLite

As explained in this question, you can use string literals to do order by in unions.
For example, this works with Oracle:
querypart1 = select([t1.c.col1.label("a")]).order_by(t1.c.col1).limit(limit)
querypart2 = select([t2.c.col2.label("a")]).order_by(t2.c.col2).limit(limit)
query = querypart1.union_all(querypart2).order_by("a").limit(limit)
The order-by can take a string literal, which is the name of the column in the union result.
(There are gazillions of rows in partitioned tables and I'm trying to paginate the damn things)
When running against SQLite3, however, this generates an exception:
sqlalchemy.exc.OperationalError: (OperationalError) near "ORDER": syntax error
How can you order by the results of a union?
The queries that are part of a union query must not be sorted.
To be able to use limits inside a compound query, you must wrap the individual queries inside a separate subquery:
SELECT * FROM (SELECT ... LIMIT ...)
UNION ALL
SELECT * FROM (SELECT ... LIMIT ...)
q1 = select(...).limit(...).subquery()
q2 = select(...).limit(...).subquery()
query = select(q1).union_all(select(q2))...
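To see the SQL shape in action, here is a small sketch that runs the wrapped form directly against an in-memory SQLite database using the standard-library sqlite3 module rather than SQLAlchemy. The tables t1 and t2 and their sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (col1 INTEGER);
    CREATE TABLE t2 (col2 INTEGER);
    INSERT INTO t1 VALUES (5), (1), (3);
    INSERT INTO t2 VALUES (4), (2), (6);
    """
)

limit = 2
# Each limited, sorted query is wrapped in its own subquery; only the
# outer compound query carries a top-level ORDER BY and LIMIT.
rows = conn.execute(
    """
    SELECT * FROM (SELECT col1 AS a FROM t1 ORDER BY col1 LIMIT :limit)
    UNION ALL
    SELECT * FROM (SELECT col2 AS a FROM t2 ORDER BY col2 LIMIT :limit)
    ORDER BY a
    LIMIT :limit
    """,
    {"limit": limit},
).fetchall()
print(rows)  # [(1,), (2,)]
```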

Changing where clause without generating subquery in SQLAlchemy

I'm trying to build a relatively complex query and would like to manipulate the where clause of the result directly, without cloning/subquerying the returned query. An example would look like:
session = sessionmaker(bind=engine)()
def generate_complex_query():
    return select(
        columns=[location.c.id.label('id')],
        from_obj=location,
        whereclause=location.c.id>50
    ).alias('a')
query = generate_complex_query()
# based on this query, I'd like to add additional where conditions, ideally like:
# `query.where(query.c.id<100)`
# but without subquerying the original query
# this is what I found so far, which is quite verbose and it doesn't solve the subquery problem
query = select(
    columns=[query.c.id],
    from_obj=query,
    whereclause=query.c.id<100
)
# Another option I was considering was to map the query to a class:
# class Location(object):pass
# mapper(Location, query)
# session.query(Location).filter(Location.id<100)
# which looks more elegant, but also creates a subquery
result = session.execute(query)
for r in result:
    print(r)
This is the generated query:
SELECT a.id
FROM (SELECT location.id AS id
FROM location
WHERE location.id > %(id_1)s) AS a
WHERE a.id < %(id_2)s
I would like to obtain:
SELECT location.id AS id
FROM location
WHERE id > %(id_1)s and
id < %(id_2)s
Is there any way to achieve this? The reason for this is that I think query (2) is slightly faster (not much), and the mapper example (2nd example above) which I have in place messes up the labels (id becomes anon_1_id or a.id if I name the alias).
Why don't you do it like this:
query = generate_complex_query()
query = query.where(location.c.id < 100)
Essentially you can refine any query like this. Additionally, I suggest reading the SQL Expression Language Tutorial which is pretty awesome and introduces all the techniques you need. The way you build a select is only one way. Usually, I build my queries more like this: select(column).where(expression).where(next_expression) and so on. The FROM is usually automatically inferred by SQLAlchemy from the context, i.e. you rarely need to specify it.
Since you don't have access to the internals of generate_complex_query try this:
query = query.where(query.c.id < 100)
This should work in your case I presume.
Another idea:
query = query.where(text("id < 100"))
This uses SQLAlchemy's text() construct. This could work for you, but there is one important caveat: if you want to introduce variables, read the API documentation for text() first, because using format strings instead of bound parameters opens you up to SQL injection. SQLAlchemy normally handles this for you, but you must take care of it yourself when working with literal expressions like this.
Also note that this works because you label the column as id. If you don't do that and don't know the column name, then this won't work either.
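A minimal sketch of both suggestions, assuming SQLAlchemy 1.4 or later and assuming the helper can return the select itself rather than the `.alias('a')` from the question (an alias is a FROM element and cannot take further where conditions). The location table definition and the parameter name max_id are hypothetical.

```python
from sqlalchemy import Column, Integer, MetaData, Table, select, text

metadata = MetaData()
# Hypothetical definition of the question's location table.
location = Table("location", metadata, Column("id", Integer, primary_key=True))


def generate_complex_query():
    # Return the select itself (no alias) so callers can keep refining it.
    return select(location.c.id.label("id")).where(location.c.id > 50)


# Chaining where() adds an AND condition to the same SELECT, no subquery.
query = generate_complex_query().where(location.c.id < 100)
sql = " ".join(str(query).split())
print(sql)

# The text() variant with a bound parameter instead of a format string.
text_query = generate_complex_query().where(
    text("id < :max_id").bindparams(max_id=100)
)
params = text_query.compile().params
print(params)
```

Because the value travels as a bound parameter, it is never interpolated into the SQL string.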

Update all models at once in Django

I am trying to update position field for all objects in specific order at once in Django (python).
This is how I've done it now, but the problem is that it makes loads of queries.
servers = frontend_models.Server.objects.all().order_by('-vote_count')
i = 1
for server in servers:
    server.last_rank = i
    server.save()
    i += 1
Is there a way to update with
Model.objects.all().order_by('some_field').update(position=some_number_that_changes_for each_object)
Thank you!
You can use the F() expression from django.db.models to update every row in a single query:
Model.objects.all().order_by('some_field').update(position=F('some_field') + 1)
This generates a single SQL UPDATE for all rows, so it is efficient for the database too. Note that it derives each row's position from that row's own some_field value; it does not number the rows in order.
As far as I know, Django's object-relational mapping system doesn't provide a way to express this update operation. But if you know how to express it in SQL, then you can run it via a custom SQL query:
from django.db import connection
cursor = connection.cursor()
cursor.execute('''UPDATE myapp_server ...''')
Different database engines express this operation in different ways. In MySQL you'd run this query:
SET @rownum=0;
UPDATE myapp_server A,
    (SELECT id, @rownum := @rownum + 1 AS rank
     FROM myapp_server
     ORDER BY vote_count DESC) B
SET A.rank = B.rank
WHERE A.id = B.id
In PostgreSQL I think you'd use
UPDATE myapp_server A
SET rank = B.rank
FROM (SELECT id, row_number() OVER (ORDER BY vote_count DESC) AS rank
      FROM myapp_server) B
WHERE A.id = B.id
(but that's untested, so beware!).
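For a variant that avoids engine-specific syntax, a correlated subquery can compute each row's rank in one UPDATE. The sketch below runs it against an in-memory SQLite database with the standard-library sqlite3 module; the table name myapp_server comes from the answer above, the last_rank column from the question, and the sample rows are made up. In a Django project the same statement could be passed to cursor.execute, though that is not tested here. Rows with equal vote_count receive the same rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE myapp_server (
        id INTEGER PRIMARY KEY,
        vote_count INTEGER NOT NULL,
        last_rank INTEGER
    );
    INSERT INTO myapp_server (id, vote_count) VALUES (1, 10), (2, 30), (3, 20);
    """
)

# One statement: a row's rank is 1 plus the number of rows with more votes.
conn.execute(
    """
    UPDATE myapp_server
    SET last_rank = 1 + (
        SELECT COUNT(*)
        FROM myapp_server AS other
        WHERE other.vote_count > myapp_server.vote_count
    )
    """
)

ranks = conn.execute(
    "SELECT id, last_rank FROM myapp_server ORDER BY id"
).fetchall()
print(ranks)  # [(1, 3), (2, 1), (3, 2)]
```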

SQLAlchemy ORM: modify the columns returned from a query

If I've got an SQLAlchemy ORM query:
admin_users = Session.query(User).filter_by(is_admin=True)
Is it possible to modify the columns returned by that query?
For example, so that I could select only the User.id column, and use that in a sub query:
admin_email_addresses = Session.query(EmailAddress)\
    .filter(EmailAddress.user_id.in_(admin_users.select_columns(User.id)))
Note: the .values() method will not work, as it executes the query and returns an iterable of results (so, ex, EmailAddress.user_id.in_(admin_users.values(User.id)) will perform two queries, not one).
I know that I could modify the first query to be Session.query(User.id), but I'm specifically wondering how I could modify the columns returned by a query.
I feel your pain on the values() thing. In 0.6.5 I added with_entities() which is just like values() except doesn't iterate:
q = q.with_entities(User.id)
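As a small sketch of what with_entities does, the following builds the admin query and then swaps its selected columns without executing anything. It assumes SQLAlchemy 1.4 or later; the User model here (table name and columns) is a hypothetical stand-in, and the Query is created without a session purely to render the SQL.

```python
from sqlalchemy import Boolean, Column, Integer, String
from sqlalchemy.orm import Query, declarative_base

Base = declarative_base()


class User(Base):
    # Hypothetical model; only is_admin and id matter for the example.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    is_admin = Column(Boolean)


admin_users = Query([User]).filter_by(is_admin=True)

# Replace the selected entities; the filter is kept and nothing is executed.
admin_ids = admin_users.with_entities(User.id)

sql = " ".join(str(admin_ids).split())
print(sql)
```

The rendered statement should select only users.id while keeping the is_admin condition in the WHERE clause.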
Assuming that your EmailAddress.user_id defines a ForeignKey, the query below will do the job more efficiently than the IN operator:
admin_email_addresses = session.query(EmailAddress).\
    join(User).filter(User.is_admin==True)
If you do not have a ForeignKey (although you should), you can specify the join condition explicitly:
admin_email_addresses = session.query(EmailAddress).\
    join(User, User.id==EmailAddress.user_id).filter(User.is_admin==True)
But if you really would like to do it with the in_ operator, here you go (note the subquery):
subq = session.query(User.id).filter(User.is_admin==True).subquery()
admin_email_addresses = session.query(EmailAddress).\
    filter(EmailAddress.user_id.in_(subq))
