SQLAlchemy joins a table from a subquery. How do I prevent it from doing this? - python

I have this subquery:
aliasBilling = aliased(Billing)
subQueryReversesBilling = db.session.query(BillingReversal).select_from(BillingReversal).with_entities(BillingReversal.reverses_billing_id).filter(BillingReversal.billing_id == aliasBilling.id).subquery()
Which generates:
SELECT billing_reversal.reverses_billing_id
FROM billing_reversal, billing AS billing_1
WHERE billing_reversal.billing_id = billing_1.id
How do I get rid of that comma-style join, which I didn't specify? It joins against the full table and every record. Apparently SQLAlchemy assumes I need it because the table appears in the filter (the WHERE clause), but I take that table from the bigger query I build later on, which is why this is a subquery. What I need is this:
SELECT billing_reversal.reverses_billing_id
FROM billing_reversal
WHERE billing_reversal.billing_id = billing_1.id

The cross join is coming from the aliased Billing table in the filter clause.
If BillingReversal and Billing are linked with a relationship (and a foreign key), you should leverage this relationship and join instead of filter.
subQueryReversesBilling = (
    db.session.query(BillingReversal)
    .join(Billing, BillingReversal.billing)
    .with_entities(BillingReversal.id)
    .subquery()
)

Related

count subquery in sqlalchemy

I'm having some trouble translating a subquery into SQLAlchemy. I have two tables that both have a store_id column that is a foreign key (but it isn't a direct many-to-many relationship), and I need to return the id, store_id and name from table 1 along with the number of records from table 2 that have the same store_id. I know the SQL that I would use to return those records; I'm just not sure how to do it using SQLAlchemy.
SELECT
    table_1.id,
    table_1.store_id,
    table_1.name,
    (
        SELECT
            count(table_2.id)
        FROM
            table_2
        WHERE
            table_1.store_id = table_2.store_id
    ) AS store_count
FROM table_1;
This post actually answered my question. I must have missed it when I was searching initially. My solution below.
Generate sql with subquery as a column in select statement using SQLAlchemy
store_count = session.query(func.count(Table2.id)).filter(Table2.store_id == Table1.store_id)
session.query(Table1.id, Table1.name, Table1.store_id, store_count.label("store_count"))
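The SQL from the question can be checked on its own before translating it to SQLAlchemy. The following is a minimal sketch, assuming Python's built-in sqlite3 module and a few invented sample rows; only the table and column names come from the question.

```python
import sqlite3

# In-memory database with hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER PRIMARY KEY, store_id INTEGER, name TEXT);
    CREATE TABLE table_2 (id INTEGER PRIMARY KEY, store_id INTEGER);
    INSERT INTO table_1 VALUES (1, 100, 'alpha'), (2, 200, 'beta'), (3, 300, 'gamma');
    INSERT INTO table_2 VALUES (1, 100), (2, 100), (3, 200);
""")

# The correlated scalar subquery counts table_2 rows per table_1 row.
rows = conn.execute("""
    SELECT
        table_1.id,
        table_1.store_id,
        table_1.name,
        (
            SELECT count(table_2.id)
            FROM table_2
            WHERE table_1.store_id = table_2.store_id
        ) AS store_count
    FROM table_1
    ORDER BY table_1.id
""").fetchall()

for row in rows:
    print(row)
```

A store with no matching rows in table_2 still appears in the result with a count of 0, which an inner join with GROUP BY would not give.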

Update Django model based on the row number of rows produced by a subquery on the same model

I have a PostgreSQL UPDATE query which updates a field (global_ranking) of every row in a table, based on the ROW_NUMBER() of each row in that same table sorted by another field (rating). Additionally, the update is partitioned, so that the ranking of each row is relative only to those rows which belong to the same language.
In short, I'm updating the ranking of each player in a game, based on their current rating.
The PostgreSQL query looks like this:
UPDATE stats_userstats
SET
global_ranking = sub.row_number
FROM (
SELECT id, ROW_NUMBER() OVER (
PARTITION BY language
ORDER BY rating DESC
) AS row_number
FROM stats_userstats
) sub
WHERE stats_userstats.id = sub.id;
I'm also using Django, and it'd be fun to learn how to express this query using the Django ORM, if possible.
At first, it seemed like Django had everything necessary to express the query, including the ability to use PostgreSQL's ROW_NUMBER() window function, but my best attempt sets the ranking of every row to 1:
from django.db.models import F, OuterRef, Subquery
from django.db.models.expressions import Window
from django.db.models.functions import RowNumber
UserStats.objects.update(
    global_ranking=Subquery(
        UserStats.objects.filter(
            id=OuterRef('id')
        ).annotate(
            row_number=Window(
                expression=RowNumber(),
                partition_by=[F('language')],
                order_by=F('rating').desc()
            )
        ).values('row_number')
    )
)
I used from django.db import connection; print(connection.queries) to see the query produced by that Django ORM statement, and got this vaguely similar SQL statement:
UPDATE "stats_userstats"
SET "global_ranking" = (
    SELECT ROW_NUMBER() OVER (
        PARTITION BY U0."language"
        ORDER BY U0."rating" DESC
    ) AS "row_number"
    FROM "stats_userstats" U0
    WHERE U0."id" = "stats_userstats"."id"
)
It looks like what I need to do is move the subquery from the SET portion of the query to the FROM, but it's unclear to me how to restructure the Django ORM statement to achieve that.
Any help is greatly appreciated. Thank you!
The Subquery filters the queryset by the provided OuterRef, so the window function only ever sees a single row. You always get 1 because each user is first in a ranking that contains only that user.
A "correct" query would be:
UserStats.objects.alias(
row_number=Window(
expression=RowNumber(),
partition_by=[F('language')],
order_by=F('rating').desc()
)
).update(global_ranking=F('row_number'))
But Django will not allow that:
django.core.exceptions.FieldError: Window expressions are not allowed in this query
Related Django ticket: https://code.djangoproject.com/ticket/25643
I think you might comment there with your use case.
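The effect described in this answer can be reproduced outside Django. The following is a minimal sketch, assuming Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions). The sample rows are invented, and a correlated scalar subquery over a pre-ranked derived table stands in for PostgreSQL's UPDATE ... FROM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stats_userstats (
        id INTEGER PRIMARY KEY,
        language TEXT,
        rating INTEGER,
        global_ranking INTEGER
    );
    INSERT INTO stats_userstats (id, language, rating) VALUES
        (1, 'en', 1500), (2, 'en', 1700), (3, 'fr', 1200), (4, 'fr', 1100);
""")

# Filtering to one id before the window is evaluated: the partition holds
# a single row, so ROW_NUMBER() is 1 for every user.
filtered = [
    conn.execute(
        "SELECT ROW_NUMBER() OVER (PARTITION BY language ORDER BY rating DESC) "
        "FROM stats_userstats WHERE id = ?",
        (user_id,),
    ).fetchone()[0]
    for user_id in (1, 2, 3, 4)
]

# Ranking the whole table inside a derived table first, then matching by id.
conn.execute("""
    UPDATE stats_userstats
    SET global_ranking = (
        SELECT sub.rn
        FROM (
            SELECT id, ROW_NUMBER() OVER (
                PARTITION BY language ORDER BY rating DESC
            ) AS rn
            FROM stats_userstats
        ) AS sub
        WHERE sub.id = stats_userstats.id
    )
""")
ranked = conn.execute(
    "SELECT id, global_ranking FROM stats_userstats ORDER BY id"
).fetchall()

print(filtered)
print(ranked)
```

In a Django project, the original PostgreSQL statement could be sent unchanged through a database cursor until the ORM supports window expressions in updates.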

When does SQLAlchemy decide to use a subquery with the .limit() method?

I have an error where SQLAlchemy produces a wrong SQL query, but I can't determine the conditions that trigger it.
I use Flask-SQLAlchemy, and initially the query is just MyModel.query, which is rendered as a simple SELECT with JOINs. But when the .limit() method is applied, the query is transformed: it uses a subquery to fetch the main objects and only then applies the JOINs. The problem is the ORDER BY clause, which remains the same and ignores the subquery definition.
Here's an example, with the selected fields simplified:
-- Initially
SELECT *
FROM customer_rates
LEFT OUTER JOIN seasons AS seasons_1 ON seasons_1.id = customer_rates.season_id
LEFT OUTER JOIN users AS users_1 ON users_1.id = customer_rates.customer_id
-- other joins ...
ORDER BY customer_rates.id, customer_rates.id
-- Then .limit()
SELECT anon_1.*, *
FROM (
SELECT customer_rates.*
FROM customer_rates
LIMIT :param_1) AS anon_1
LEFT OUTER JOIN seasons AS seasons_1 ON seasons_1.id = anon_1.customer_rates_season_id
LEFT OUTER JOIN users AS users_1 ON users_1.id = anon_1.customer_rates_customer_id
-- other joins
ORDER BY customer_rates.id, customer_rates.id
And this query gives the following error:
ProgrammingError: (psycopg2.ProgrammingError) missing FROM-clause entry for table "customer_rates"
The last line in query should be:
ORDER BY anon_1.customer_rates_id
The code that produces these queries is part of a large application. I've tried to reimplement it from scratch in a small Flask application, but I can't reproduce the problem. In the small application it always uses a JOIN.
So I need to know when SQLAlchemy decides to use a subquery.
I use Python 2.7 and PostgreSQL 9.
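The answer is pretty straightforward. SQLAlchemy uses a subquery when the joined table has a many-to-one relationship with the queried model, that is, when each queried row can match several joined rows. To produce the correct number of results, it limits the queried rows in the subquery and only then applies the joins.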
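The reason for the subquery can be seen in plain SQL. The following is a minimal sketch, assuming Python's built-in sqlite3 module and two hypothetical tables, parent and child, that stand in for the queried model and a joined collection. It shows only the row-count effect, not SQLAlchemy's own query generation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id)
    );
    INSERT INTO parent (id) VALUES (1), (2);
    INSERT INTO child (id, parent_id) VALUES (10, 1), (11, 1), (12, 1), (20, 2);
""")

# LIMIT applied to the joined rows: both returned rows belong to parent 1,
# so only one parent object would be loaded, and with an incomplete collection.
naive = conn.execute("""
    SELECT parent.id, child.id
    FROM parent
    LEFT OUTER JOIN child ON child.parent_id = parent.id
    ORDER BY parent.id, child.id
    LIMIT 2
""").fetchall()

# LIMIT applied to the parent rows in a subquery, joins applied afterwards.
# Note that ORDER BY refers to the subquery alias, not the inner table.
wrapped = conn.execute("""
    SELECT anon_1.id, child.id
    FROM (SELECT parent.id AS id FROM parent ORDER BY parent.id LIMIT 2) AS anon_1
    LEFT OUTER JOIN child ON child.parent_id = anon_1.id
    ORDER BY anon_1.id, child.id
""").fetchall()

print(naive)
print(wrapped)
```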

SQLite Inner Join with Limit on Left Table

First, let me describe a scenario similar to the one I am facing, to better explain my issue. In this scenario, I am creating a system that needs to select 'n' random blog posts from a table and then get all the replies for those selected posts.
Imagine my structure like so:
blog_posts(id INTEGER PRIMARY KEY, thepost TEXT)
blog_replies(id INTEGER PRIMARY KEY, postid INTEGER FOREIGN KEY REFERENCES blog_posts(id), thereply TEXT)
This is the current SQL I have, but am getting an error:
SELECT blog_post.id, blog_post.thepost, blog_replies.id, blog_replies.thereply
FROM (SELECT blog_post.id, blog_post.thepost FROM blog_post ORDER BY RANDOM() LIMIT ?)
INNER JOIN blog_replies
ON blog_post.id=blog_replies_options.postid;
Here is the error:
sqlite3.OperationalError: no such column: hmquestion.id
Your query needs an alias added to the subquery. This will allow you to reference the fields from your subquery within the outer query:
SELECT hmquestion.id, hmquestion.question,
       hmquestion_options.id, hmquestion_options.option
FROM (SELECT hmquestion.id, hmquestion.question
      FROM hmquestion ORDER BY RANDOM() LIMIT ?) AS hmquestion -- alias added here
INNER JOIN hmquestion_options
ON hmquestion.id=hmquestion_options.questionid;
As is, your outer query doesn't know what hmquestion references.
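The same fix applies to the blog scenario from the question. The following is a minimal sketch, assuming Python's built-in sqlite3 module, the blog_posts and blog_replies tables as described in the question, and invented sample rows. The helper name replies_for_random_posts and the short aliases p and r are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog_posts (id INTEGER PRIMARY KEY, thepost TEXT);
    CREATE TABLE blog_replies (
        id INTEGER PRIMARY KEY,
        postid INTEGER REFERENCES blog_posts(id),
        thereply TEXT
    );
    INSERT INTO blog_posts VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO blog_replies VALUES
        (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c'), (4, 3, 'd'), (5, 3, 'e');
""")


def replies_for_random_posts(conn, n):
    """Pick n random posts, then return every reply to those posts."""
    return conn.execute(
        """
        SELECT p.id, p.thepost, r.id, r.thereply
        FROM (SELECT id, thepost FROM blog_posts
              ORDER BY RANDOM() LIMIT ?) AS p  -- the alias makes p.id resolvable
        INNER JOIN blog_replies AS r ON p.id = r.postid
        """,
        (n,),
    ).fetchall()


rows = replies_for_random_posts(conn, 2)
for row in rows:
    print(row)
```

Because this is an inner join, a randomly chosen post with no replies would not appear in the result; a LEFT OUTER JOIN would keep it.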

Django model search concatenated string

I am trying to use a Django model to search for a record and then return a concatenated field built from different tables joined by a foreign key.
I can do it in SQL like this:
SELECT
location.location_geoname_id as id,
CONCAT_WS(', ', location.location_name, region.region_name, country.country_name) AS 'text'
FROM
geonames_location as location
JOIN
geonames_region as region
ON
location.region_geoname_id = region.region_geoname_id
JOIN
geonames_country as country
ON
region.country_geoname_id = country.country_geoname_id
WHERE
location.location_name like 'location'
ORDER BY
location.location_name, region.region_name, country.country_name
LIMIT 10;
Is there a cleaner way to do this using Django models? Or do I need to just use SQL for this one?
Thank you
Do you really need the SQL to return the concatenated field? Why not query the models in the usual way (with select_related()) and then concatenate in Python? Or if you're worried about querying more columns than you need, use values_list:
locations = Location.objects.values_list(
    'location_name', 'region__region_name', 'country__country_name')
location_texts = [', '.join(names) for names in locations]
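One difference from the SQL version is worth noting: CONCAT_WS skips NULL values, while str.join raises a TypeError if a tuple contains None. The following is a minimal sketch of a helper that mimics that behavior. The name concat_ws and the sample rows are hypothetical and only shaped like the tuples values_list() would return.

```python
def concat_ws(separator, *parts):
    """Join parts with separator, skipping None values like SQL CONCAT_WS."""
    return separator.join(str(part) for part in parts if part is not None)


# Hypothetical rows shaped like (location_name, region_name, country_name).
rows = [
    ("Springfield", "Illinois", "United States"),
    ("Monaco", None, "Monaco"),
]

location_texts = [concat_ws(", ", *row) for row in rows]
print(location_texts)
```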
You can also write a raw query for this in your code and concatenate the fields afterwards.
Example:
org = Organization.objects.raw('SELECT organization_id, name FROM organization where is_active=1 ORDER BY name')
Keep one thing in mind: a raw query must always fetch the table's primary key; this is mandatory. Here organization_id is the primary key of the contact_organization table.
Which approach is more useful and simpler (raw query or model query) is up to you.
