I'm using Django 1.4 and Python 2.7.
I'm computing a Sum over some values. The following query works perfectly:
CategoryAnswers.objects.using('mam').filter(category=cat["category"], brand=cat["brand"], category__segment_category=cat["category__segment_category"]).values('category__name', 'brand__name','brand__pk').annotate(total=Sum('answer'))
It generates this query:
SELECT `category`.`name`, `brand`.`name`, `category_answers`.`brand_id`, SUM(`category_answers`.`answer`) AS `total`
FROM `category_answers`
INNER JOIN `category`
ON (`category_answers`.`category_id` = `category`.`id`)
INNER JOIN `brand`
ON (`category_answers`.`brand_id` = `brand`.`id`)
WHERE (`category_answers`.`category_id` = 6 AND
`category_answers`.`brand_id` = 1 AND
`category`.`segment_category_id` = 1 )
GROUP BY `category`.`name`, `brand`.`name`, `category_answers`.`brand_id`
ORDER BY NULL
But when I add one more value, it no longer works:
CategoryAnswers.objects.using('mam').order_by().filter(category=cat["category"], brand=cat["brand"], category__segment_category=cat["category__segment_category"]).values('category__name','category__pk','brand__name','brand__pk').annotate(total=Sum('answer'))
Looking at the generated query, the problem is that Django adds a wrong field (category_answers.id) to the GROUP BY:
SELECT `category`.`name`, `category_answers`.`category_id`, `brand`.`name`, `category_answers`.`brand_id`,
SUM(`category_answers`.`answer`) AS `total`
FROM `category_answers`
INNER JOIN `category`
ON (`category_answers`.`category_id` = `category`.`id`)
INNER JOIN `brand`
ON (`category_answers`.`brand_id` = `brand`.`id`)
WHERE (`category_answers`.`category_id` = 6 AND
`category_answers`.`brand_id` = 1 AND
`category`.`segment_category_id` = 1 )
GROUP BY `category_answers`.`id`, `category`.`name`, `category_answers`.`category_id`, `brand`.`name`, `category_answers`.`brand_id`
ORDER BY NULL
If I remove any one of the values it works, so I don't believe the problem is tied to a specific parameter. Am I doing something wrong?
I couldn't resolve this, so I fell back to a raw SQL query:
cursor = connections["mam"].cursor()
cursor.execute("SELECT B.name, A.category_id, A.brand_id, SUM(A.answer) AS total, C.name FROM category_answers A INNER JOIN category B ON A.category_id = B.id INNER JOIN brand C ON A.brand_id = C.id WHERE A.brand_id = %s AND A.category_id = %s AND B.segment_category_id = %s", [cat["brand"],cat["category"],cat["category__segment_category"]])
c_answers = cursor.fetchone()
This is not the best way, but it works. :)
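To make the effect of the extra GROUP BY column concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Django and MySQL. The table layout and sample values are assumptions made up for illustration; only the column names follow the queries above. It shows that adding the primary key to GROUP BY puts every row in its own group, so SUM no longer aggregates anything.

```python
import sqlite3


def totals(group_by_pk):
    """Return SUM(answer) rows, optionally adding the primary key to GROUP BY."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified schema: three answers for the same category/brand.
    conn.executescript("""
        CREATE TABLE category_answers (
            id INTEGER PRIMARY KEY,
            category_id INTEGER,
            brand_id INTEGER,
            answer INTEGER
        );
        INSERT INTO category_answers (category_id, brand_id, answer)
        VALUES (6, 1, 10), (6, 1, 20), (6, 1, 30);
    """)
    group_cols = "category_id, brand_id"
    if group_by_pk:
        # Mirrors the unwanted category_answers.id in the second generated query.
        group_cols = "id, " + group_cols
    rows = conn.execute(
        "SELECT category_id, brand_id, SUM(answer) AS total "
        "FROM category_answers "
        "WHERE category_id = ? AND brand_id = ? "
        "GROUP BY " + group_cols + " ORDER BY total",
        (6, 1),
    ).fetchall()
    conn.close()
    return rows


print(totals(group_by_pk=False))  # one aggregated row
print(totals(group_by_pk=True))   # one row per original record
```

With the primary key in the GROUP BY, the "total" column simply repeats each individual answer, which matches the symptom described above.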
I'm trying to use Django ORM to generate a queryset and I can't find how to use an OuterRef in the joining condition with a FilteredRelation.
What I have in Django
Main queryset
queryset = LineOutlier.objects.filter(report=self.kwargs['report_pk'], report__apn__customer__cen_id=self.kwargs['customer_cen_id']) \
.select_related('category__traffic') \
.select_related('category__frequency') \
.select_related('category__stability') \
.prefetch_related('category__traffic__labels') \
.prefetch_related('category__frequency__labels') \
.prefetch_related('category__stability__labels') \
.annotate(history=subquery)
The subquery
subquery = ArraySubquery(
LineOutlierReport.objects
.filter((Q(lineoutlier__imsi=OuterRef('imsi')) | Q(lineoutlier__isnull=True)) & Q(id__in=last_5_reports_ids))
.values(json=JSONObject(
severity='lineoutlier__severity',
report_id='id',
report_start_date='start_date',
report_end_date='end_date'
)
)
)
The query executes, but the generated SQL is not exactly what I want:
SQL Generated
SELECT "mlformalima_lineoutlier"."id",
"mlformalima_lineoutlier"."imsi",
ARRAY(
SELECT JSONB_BUILD_OBJECT('severity', V1."severity", 'report_id', V0."id", 'report_start_date', V0."start_date", 'report_end_date', V0."end_date") AS "json"
FROM "mlformalima_lineoutlierreport" V0
LEFT OUTER JOIN "mlformalima_lineoutlier" V1
ON (V0."id" = V1."report_id")
WHERE ((V1."imsi" = ("mlformalima_lineoutlier"."imsi") OR V1."id" IS NULL) AND V0."id" IN (SELECT DISTINCT ON (U0."id") U0."id" FROM "mlformalima_lineoutlierreport" U0 WHERE U0."apn_id" = 2 ORDER BY U0."id" ASC, U0."end_date" DESC LIMIT 5))
) AS "history",
FROM "mlformalima_lineoutlier"
The problem here is that the OuterRef condition (V1."imsi" = ("mlformalima_lineoutlier"."imsi")) is applied in the WHERE clause, and I want it to be part of the JOIN condition.
What I want in SQL
SELECT "mlformalima_lineoutlier"."id",
"mlformalima_lineoutlier"."imsi",
ARRAY(
SELECT JSONB_BUILD_OBJECT('severity', V1."severity", 'report_id', V0."id", 'report_start_date', V0."start_date", 'report_end_date', V0."end_date") AS "json"
FROM "mlformalima_lineoutlierreport" V0
LEFT OUTER JOIN "mlformalima_lineoutlier" V1
ON (V0."id" = V1."report_id" AND ((V1."id" IS NULL) OR V1."imsi" = ("mlformalima_lineoutlier"."imsi")))
WHERE V0."id" IN (SELECT DISTINCT ON (U0."id") U0."id" FROM "mlformalima_lineoutlierreport" U0 WHERE U0."apn_id" = 2 ORDER BY U0."id" ASC, U0."end_date" DESC LIMIT 5)
) AS "history",
FROM "mlformalima_lineoutlier"
What I tried in Django
I tried to use a FilteredRelation to change the JOIN condition, but I can't seem to use it in combination with an OuterRef:
subquery = ArraySubquery(
LineOutlierReport.objects
.annotate(filtered_relation=FilteredRelation('lineoutlier', condition=Q(lineoutlier__imsi=OuterRef('imsi')) | Q(lineoutlier__isnull=True)))
.filter(Q(id__in=last_5_reports_ids))
.values(json=JSONObject(
severity='filtered_relation__severity',
report_id='id',
report_start_date='start_date',
report_end_date='end_date'
)
)
)
I can't execute this query because of the following error
ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.
How can I modify my query to make it work?
This looks like a Django bug. As a workaround, you can annotate another column and reference it in the FilteredRelation, like so:
subquery = ArraySubquery(
LineOutlierReport.objects
.annotate(
outer_imsi=OuterRef('imsi'),
filtered_relation=FilteredRelation('lineoutlier', condition=Q(lineoutlier__imsi=F('outer_imsi')) | Q(lineoutlier__isnull=True)))
.filter(Q(id__in=last_5_reports_ids))
.values(json=JSONObject(
severity='filtered_relation__severity',
report_id='id',
report_start_date='start_date',
report_end_date='end_date'
)
)
)
That way you avoid OuterRef being processed inside FilteredRelation.
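For anyone wondering why it matters whether the condition sits in the WHERE clause or in the JOIN, here is a small sketch with Python's built-in sqlite3 module. The two tables and their rows are hypothetical stand-ins for the report and outlier models, and the ON condition is simplified to the imsi comparison only. With the filter in WHERE, a report that only has outliers for another imsi disappears; with the filter in ON, every report is kept and the severity is NULL where nothing matches.

```python
import sqlite3


def history_for(imsi, condition_in_join):
    """List (report id, severity) pairs for one imsi using either query shape."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical data: report 1 has an outlier for imsi A, report 2 only
    # for imsi B, and report 3 has no outliers at all.
    conn.executescript("""
        CREATE TABLE report (id INTEGER PRIMARY KEY);
        CREATE TABLE outlier (
            id INTEGER PRIMARY KEY,
            report_id INTEGER REFERENCES report(id),
            imsi TEXT,
            severity TEXT
        );
        INSERT INTO report (id) VALUES (1), (2), (3);
        INSERT INTO outlier (report_id, imsi, severity)
        VALUES (1, 'A', 'high'), (2, 'B', 'low');
    """)
    if condition_in_join:
        # Filter inside the JOIN: unmatched reports survive with NULLs.
        sql = """
            SELECT r.id, o.severity
            FROM report r
            LEFT OUTER JOIN outlier o
              ON r.id = o.report_id AND o.imsi = ?
            ORDER BY r.id
        """
    else:
        # Filter in WHERE: rows joined to another imsi are dropped entirely.
        sql = """
            SELECT r.id, o.severity
            FROM report r
            LEFT OUTER JOIN outlier o ON r.id = o.report_id
            WHERE o.imsi = ? OR o.id IS NULL
            ORDER BY r.id
        """
    rows = conn.execute(sql, (imsi,)).fetchall()
    conn.close()
    return rows


print(history_for("A", condition_in_join=False))
print(history_for("A", condition_in_join=True))
```

Report 2 is the interesting case: it is missing from the WHERE variant but present (with no severity) in the JOIN variant, which is the behaviour the question asks for.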
This worked until I added the CONCAT expression. I was hoping to return a tuple for the uploads field based on the subquery.
def my_query():
conn = create_connection()
cur = conn[0]
cur.execute("""SELECT d.id, s.val1, d.val2, s.val3, d.val4,
r.val5, r.val6, r.val7,
CONCAT(SELECT u.id, u.file_name, u.file_path FROM related_docs u WHERE u.icd_id = d.id)
AS uploads
FROM icd_d d, icd_s s, icd_r r
WHERE d.id = s.icd_id
AND d.id = r.icd_id
ORDER BY id ASC
""")
data = cur.fetchall()
return data
I think you want this:
(SELECT CONCAT(u.id, u.file_name, u.file_path) FROM related_docs u WHERE u.icd_id = d.id) AS uploads
but it is better to add a space between the values:
(SELECT CONCAT(u.id, ' ', u.file_name, ' ', u.file_path) FROM related_docs u WHERE u.icd_id = d.id) AS uploads
This works only if the subquery returns exactly one row.
If it can return multiple rows, wrap the expression in GROUP_CONCAT() to get a comma-separated list with one entry per row:
(SELECT GROUP_CONCAT(CONCAT(u.id, ' ', u.file_name, ' ', u.file_path)) FROM related_docs u WHERE u.icd_id = d.id) AS uploads
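If the goal is a Python tuple per row, the aggregated string can be split again on the client side. The sketch below is only an illustration under a few assumptions: it uses the built-in sqlite3 module instead of MySQL, so `||` stands in for CONCAT() and SQLite's group_concat() for GROUP_CONCAT(); the sample rows are invented; and it assumes file names and paths never contain a comma.

```python
import sqlite3


def uploads_by_icd():
    """Return (icd id, tuple of 'id name path' strings) for every icd_d row."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data: record 1 has two uploads, record 2 has none.
    conn.executescript("""
        CREATE TABLE icd_d (id INTEGER PRIMARY KEY);
        CREATE TABLE related_docs (
            id INTEGER PRIMARY KEY,
            icd_id INTEGER,
            file_name TEXT,
            file_path TEXT
        );
        INSERT INTO icd_d (id) VALUES (1), (2);
        INSERT INTO related_docs (id, icd_id, file_name, file_path)
        VALUES (10, 1, 'a.pdf', '/docs/a.pdf'),
               (11, 1, 'b.pdf', '/docs/b.pdf');
    """)
    rows = conn.execute("""
        SELECT d.id,
               (SELECT group_concat(u.id || ' ' || u.file_name || ' ' || u.file_path)
                FROM related_docs u
                WHERE u.icd_id = d.id) AS uploads
        FROM icd_d d
        ORDER BY d.id
    """).fetchall()
    conn.close()
    # Split the comma-separated string into a tuple; a record without
    # uploads comes back as NULL and becomes an empty tuple.
    return [(icd_id, tuple(uploads.split(",")) if uploads else ())
            for icd_id, uploads in rows]


for icd_id, uploads in uploads_by_icd():
    print(icd_id, uploads)
```

Note that the order of the entries inside the aggregated string is not guaranteed unless the query asks for one explicitly.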
I have a function that returns a query that fetches «New priorities for emails» for a given account ID.
First it selects the domain name for that account, and then it selects a data structure for it.
And everything should be OK IMO, but not at this time: SQLAlchemy is generating SQL that is syntactically wrong, and I can’t understand how to fix it. Here are the samples:
def unprocessed_by_account_id(account_id: str):
account_domain = select(
[tables.organizations.c.organization_id]).select_from(
tables.accounts.join(
tables.email_addresses,
tables.accounts.c.account_id == tables.email_addresses.c.email_address,
).join(tables.organizations)
).where(
tables.accounts.c.account_id == account_id,
)
domain_with_subdomains = concat('%', account_domain)
fields = [
tables.users.c.first_name,
…
tables.priorities.c.name,
]
fromclause = tables.users.join(
…
).join(tables.organizations)
whereclause = and_(
…
tables.organizations.c.organization_id.notlike(
domain_with_subdomains),
)
stmt = select(fields).select_from(fromclause).where(whereclause)
return stmt
print(unprocessed_by_account_id('foo'))
So it generates:
SELECT
users.first_name,
…
priorities.name
FROM (SELECT organizations.organization_id AS organization_id
FROM accounts
JOIN email_addresses
ON accounts.account_id = email_addresses.email_address
JOIN organizations
ON organizations.organization_id = email_addresses.organization_id
WHERE accounts.account_id = :account_id_1), users
JOIN
…
JOIN organizations
ON organizations.organization_id = email_addresses.organization_id
WHERE emails.account_id = :account_id_2 AND
priorities_new_emails.status = :status_1 AND
organizations.organization_id NOT LIKE
concat(:concat_1, (SELECT organizations.organization_id
FROM accounts
JOIN email_addresses ON accounts.account_id =
email_addresses.email_address
JOIN organizations
ON organizations.organization_id =
email_addresses.organization_id
WHERE accounts.account_id = :account_id_1))
But the first
(SELECT organizations.organization_id AS organization_id
FROM accounts
JOIN email_addresses
ON accounts.account_id = email_addresses.email_address
JOIN organizations
ON organizations.organization_id = email_addresses.organization_id
WHERE accounts.account_id = :account_id_1)
Is redundant here and produces
[2017-05-29 23:49:51] [42601] ERROR: subquery in FROM must have an alias
[2017-05-29 23:49:51] Hint: For example, FROM (SELECT ...) [AS] foo.
[2017-05-29 23:49:51] Position: 245
I tried account_domain = account_domain.cte(), but with no luck, except that the subquery moved to a WITH clause as expected.
I also tried with_only_columns, with no effect at all.
I think SQLAlchemy adds this statement because it sees it inside the WHERE clause and thinks the filtering would fail without it, but I’m not sure.
I should also mention that in the previous version of the code the statement was almost the same, except that there was no concat('%', account_domain) and notlike was !=.
I also tried inserting an alias here and there, but had no success with that either. If I manually delete that first statement from the SELECT in plain SQL, I get the expected results.
Any help is appreciated, thank you.
If you're using a subquery as a value, you need to declare it as_scalar():
domain_with_subdomains = concat('%', account_domain.as_scalar())
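As far as I know, in SQLAlchemy 1.4 and later the same method is spelled scalar_subquery(). To illustrate what "a subquery as a value" means at the SQL level, here is a sketch with Python's built-in sqlite3 module. The two tables, their columns and the sample rows are hypothetical and much simpler than the schema in the question, and SQLite's `||` stands in for concat(). The subquery appears only inside parentheses as an operand of NOT LIKE and never in the FROM list, which is the shape the scalar form is meant to produce.

```python
import sqlite3


def foreign_domains(account_id):
    """Return organizations whose id does not end with the account's domain."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, flattened schema for illustration only.
    conn.executescript("""
        CREATE TABLE accounts (account_id TEXT, organization_id TEXT);
        CREATE TABLE organizations (organization_id TEXT);
        INSERT INTO accounts VALUES ('foo', 'example.com');
        INSERT INTO organizations VALUES
            ('example.com'), ('mail.example.com'), ('other.org');
    """)
    rows = conn.execute("""
        SELECT o.organization_id
        FROM organizations o
        WHERE o.organization_id NOT LIKE
              '%' || (SELECT a.organization_id
                      FROM accounts a
                      WHERE a.account_id = ?)
        ORDER BY o.organization_id
    """, (account_id,)).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(foreign_domains("foo"))
```

The FROM clause here contains only the organizations table, so there is no unaliased subquery for the database to complain about.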
I have 2 classes:
class A(Base):
id = Column(Integer, primary_key=True)
name = Column(String)
children = relationship('B')
class B(Base):
id = Column(Integer, primary_key=True)
id_a = Column(Integer, ForeignKey('a.id'))
name = Column(String)
Now I need all A objects that have a B with a given name, and each A object should contain only the B objects that match the filter.
To achieve this I build the following query:
query = db.session.query(A).join(B).options(db.contains_eager(A.children)).filter(B.name=='SOME_TEXT')
Now I need only 50 items from the query, so I do:
query.limit(50).all()
The result contains fewer than 50 items, even though without the limit there are more than 50. I read The Zen of Eager Loading, but there must be some trick to achieve this. One idea is to make two queries: one with an inner join to fetch the IDs, and then use those IDs in the first query.
But maybe there is a better solution.
First, take a step back and look at the SQL. Your current query is
SELECT * FROM a JOIN b ON b.id_a = a.id WHERE b.name = '...' LIMIT 50;
Notice that the limit applies to the rows of a JOIN b and not to a, but if you put the limit on a alone you can't filter by the field in b. There are two solutions to this problem. The first is to use a correlated EXISTS subquery to filter on b.name, like this:
SELECT * FROM a
WHERE EXISTS (SELECT 1 FROM b WHERE b.id_a = a.id AND b.name = '...')
LIMIT 50;
This can be inefficient depending on the DB backend. The second solution is to do a DISTINCT on a after the join, like this:
SELECT DISTINCT a.* FROM a JOIN b ON b.id_a = a.id
WHERE b.name = '...'
LIMIT 50;
Notice how in either case you do not get any column from b. How do we get them? Do another join!
SELECT * FROM (
SELECT DISTINCT a.* FROM a JOIN b ON b.id_a = a.id
WHERE b.name = '...'
LIMIT 50
) a JOIN b ON b.id_a = a.id
WHERE b.name = '...';
Now, to write all of this in SQLAlchemy:
from sqlalchemy.orm import aliased, contains_eager

subquery = (
session.query(A)
.join(B)
.with_entities(A) # only select A's columns
.filter(B.name == '...')
.distinct()
.limit(50)
.subquery() # convert to subquery
)
aliased_A = aliased(A, subquery)
query = (
session.query(aliased_A)
.join(B)
.options(contains_eager(aliased_A.children))
.filter(B.name == "...")
)
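To check the reasoning at the SQL level, here is a small sketch using Python's built-in sqlite3 module instead of SQLAlchemy. The sample rows are invented, and an ORDER BY is added (it is not in the queries above) only to keep the output deterministic. It compares a LIMIT applied to the joined rows with a LIMIT applied to the distinct parents in a subquery that is then joined to b again.

```python
import sqlite3


def setup():
    """Create three parents in a, each with two matching children in b."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE b (
            id INTEGER PRIMARY KEY,
            id_a INTEGER REFERENCES a(id),
            name TEXT
        );
        INSERT INTO a (id, name) VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
        INSERT INTO b (id_a, name) VALUES
            (1, 'x'), (1, 'x'), (2, 'x'), (2, 'x'), (3, 'x'), (3, 'x');
    """)
    return conn


def parents_naive(limit):
    """LIMIT on the join: counts joined rows, not parents."""
    conn = setup()
    rows = conn.execute(
        "SELECT a.id FROM a JOIN b ON b.id_a = a.id "
        "WHERE b.name = ? ORDER BY a.id LIMIT ?",
        ("x", limit),
    ).fetchall()
    conn.close()
    return sorted({row[0] for row in rows})


def parents_with_children(limit):
    """LIMIT on distinct parents first, then join the children back in."""
    conn = setup()
    rows = conn.execute(
        "SELECT a.id, b.id FROM ("
        "  SELECT DISTINCT a.* FROM a JOIN b ON b.id_a = a.id"
        "  WHERE b.name = ? ORDER BY a.id LIMIT ?"
        ") a JOIN b ON b.id_a = a.id "
        "WHERE b.name = ? ORDER BY a.id, b.id",
        ("x", limit, "x"),
    ).fetchall()
    conn.close()
    return rows


print(parents_naive(2))          # fewer parents than the limit
print(parents_with_children(2))  # two parents, each with all matching children
```

The naive query spends its whole limit on the children of the first parent, while the subquery version returns the requested number of parents together with their filtered children.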
Say I have the following SQL code and I want to change it to Sqlalchemy:
SELECT amount FROM table1
JOIN table2
ON table2.id = table1.b_id
JOIN (SELECT id FROM table3 WHERE val1 = %s AND val2 = %s) inst
ON inst.id = table1.i_id
WHERE
val3 = %s
I've tried making a subquery for the SELECT id FROM table3 clause as follows:
subq = session.query(table3.id).filter(and_(table3.val1 == 'value', table3.val2 == 'value')).subquery()
And then putting everything together:
query = session.query(table1).join(table2).filter(table2.id == table1.b_id).\
join(subq).filter(table1.val3 == 'value')
When I output query.first().amount, this works for a few examples, but for some queries I get no results when there should be something there, so I must be messing up somewhere. Any ideas where I'm going wrong? Thanks
The query below should produce exactly the SQL you have. It is not much different from yours, but it removes some unnecessary things.
So if it does not work, your original SQL might not work either. Therefore, I assume that your issue is not the SQL but either the data or the parameters for that query. You can always print the generated query by setting engine.echo = True.
val1, val2, val3 = 'value', 'value', 'value' # #NOTE: specify filter values
subq = (session.query(table3.id)
.filter(and_(table3.val1 == val1, table3.val2 == val2))
).subquery(name='inst')
quer = (
session.query(table1.amount) # #NOTE: select only one column
.join(table2) # #NOTE: no need for filter(...)
.join(subq)
.filter(table1.val3 == val3)
).first()
print(quer and quer.amount)
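One way to rule out the SQL itself is to run the original statement against a tiny data set where the expected answer is known. The sketch below does that with Python's built-in sqlite3 module; the column types and sample rows are assumptions, and `?` placeholders replace `%s` because that is the sqlite3 parameter style. If the raw SQL returns nothing for a given set of parameters, the ORM query will not return anything either.

```python
import sqlite3


def amounts(val1, val2, val3):
    """Run the original SQL (with a derived table named inst) on sample data."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical rows: table3 row 1 matches ('x', 'y'), row 2 matches ('x', 'z').
    conn.executescript("""
        CREATE TABLE table2 (id INTEGER PRIMARY KEY);
        CREATE TABLE table3 (id INTEGER PRIMARY KEY, val1 TEXT, val2 TEXT);
        CREATE TABLE table1 (
            id INTEGER PRIMARY KEY,
            b_id INTEGER REFERENCES table2(id),
            i_id INTEGER REFERENCES table3(id),
            val3 TEXT,
            amount INTEGER
        );
        INSERT INTO table2 (id) VALUES (1);
        INSERT INTO table3 (id, val1, val2) VALUES (1, 'x', 'y'), (2, 'x', 'z');
        INSERT INTO table1 (b_id, i_id, val3, amount)
        VALUES (1, 1, 'w', 100), (1, 2, 'w', 200);
    """)
    rows = conn.execute("""
        SELECT amount FROM table1
        JOIN table2 ON table2.id = table1.b_id
        JOIN (SELECT id FROM table3 WHERE val1 = ? AND val2 = ?) inst
          ON inst.id = table1.i_id
        WHERE val3 = ?
        ORDER BY amount
    """, (val1, val2, val3)).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(amounts("x", "y", "w"))  # parameters that match a table3 row
print(amounts("x", "q", "w"))  # parameters that match nothing
```

An empty result for the second call is correct behaviour for an inner join, so an unexpected empty result in the real application most likely points at the parameter values or the data.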