How to use avg and sum in SQLAlchemy query

I'm trying to return a totals/averages row from my dataset which contains the SUM of certain fields and the AVG of others.
I could do this in SQL via:
SELECT SUM(field1) as SumFld, AVG(field2) as AvgFld
FROM Rating WHERE url=[url_string]
My attempt to translate this into SQLAlchemy is as follows:
totals = Rating.query(func.avg(Rating.field2)).filter(Rating.url==url_string.netloc)
But this is erroring out with:
TypeError: 'BaseQuery' object is not callable

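Rating.query is already a query object (the BaseQuery named in the error), not a method, so it cannot be called with arguments. You should use something like: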
from sqlalchemy.sql import func
session.query(func.avg(Rating.field2).label('average')).filter(Rating.url==url_string.netloc)
You cannot use MyObject.query here, because SqlAlchemy tries to find a field to put result of avg function to, and it fails.
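The original question wanted both a SUM and an AVG in one row, so a minimal sketch along these lines might help. It assumes plain SQLAlchemy 1.4 or newer and an in-memory SQLite database; the Rating model, its column types, and the sample rows are invented for illustration and are not taken from the asker's schema.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Rating(Base):
    # Hypothetical model; only the column names come from the question.
    __tablename__ = "rating"
    id = Column(Integer, primary_key=True)
    url = Column(String)
    field1 = Column(Integer)
    field2 = Column(Float)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Rating(url="example.com", field1=2, field2=1.0),
        Rating(url="example.com", field1=3, field2=4.0),
        Rating(url="other.org", field1=10, field2=9.0),
    ])
    session.commit()

    # One result row holding both aggregates, filtered by url.
    totals = (
        session.query(
            func.sum(Rating.field1).label("sum_fld"),
            func.avg(Rating.field2).label("avg_fld"),
        )
        .filter(Rating.url == "example.com")
        .one()
    )
    sum_fld, avg_fld = totals.sum_fld, totals.avg_fld
```

The labels become attribute names on the returned row, which mirrors the SumFld / AvgFld aliases in the SQL version.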

You cannot use MyObject.query here, because SqlAlchemy tries to find a field to put result of avg function to, and it fails.
This isn't exactly true. func.avg(Rating.field2).label('average') returns a column expression, so it can be used wherever a column is expected. That means you can pass it to the with_entities method of the query object.
This is how you would do it for your example:
Rating.query.with_entities(func.avg(Rating.field2).label('average')).filter(Rating.url == url_string.netloc)
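Outside Flask-SQLAlchemy there is no Rating.query attribute, but as far as I can tell session.query(Rating) plays the same role. Here is a small sketch of the with_entities approach under that assumption, using SQLAlchemy 1.4 or newer with in-memory SQLite; the model definition and data are made up for the example.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Rating(Base):
    # Hypothetical model used only for this sketch.
    __tablename__ = "rating"
    id = Column(Integer, primary_key=True)
    url = Column(String)
    field2 = Column(Float)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Rating(url="example.com", field2=1.0),
        Rating(url="example.com", field2=4.0),
        Rating(url="other.org", field2=9.0),
    ])
    session.commit()

    # Start from a query on the model, then swap the selected
    # columns for the aggregate expression.
    average = (
        session.query(Rating)
        .with_entities(func.avg(Rating.field2).label("average"))
        .filter(Rating.url == "example.com")
        .scalar()  # first column of the first row
    )
```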

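attention = (
    Attention_scores.query
    .with_entities(func.avg(Attention_scores.score))
    # compare the model's column (assumed to be named classroom_number)
    # with the variable, rather than the variable with itself
    .filter(Attention_scores.classroom_number == classroom_number)
    .all()
)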
I tried it like this and it gave the correct average.

Related

Annotate with Subquery "get" instead of "filter"

Why can I run an annotate with a Subquery featuring a filter query like this:
invoices = invoices.annotate(
    supplier_code=Subquery(
        Supplier.objects.filter(
            pk=OuterRef('supplier'),
            cep_country__name=OuterRef('cep_country'),
        ).values('code')[:1]
    ),
)
But when I try to use the get query method instead, it gives me the error ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.
invoices = invoices.annotate(
    supplier_code=Subquery(
        Supplier.objects.get(
            pk=OuterRef('supplier'),
            cep_country__name=OuterRef('cep_country'),
        ).values('code')[:1]
    ),
)
## OR
invoices = invoices.annotate(
    supplier_code=Subquery(
        Supplier.objects.get(
            pk=OuterRef('supplier'),
            cep_country__name=OuterRef('cep_country'),
        ).code
    ),
)
## BOTH GIVE THE SAME ERROR
What's wrong here? Is it simply impossible to use a get query inside the Subquery? I can live with the filter option, but it would be more correct for me to use get, since I know for sure there's always one and only one match.

QuerySets are lazy: you can chain methods on them, for example .filter(...).order_by(...), without any query being sent to the database (otherwise far too many unneeded queries would be made).
The .get() method, however, does not return a queryset. It runs the query immediately and returns a model instance, so it cannot be lazy. So no, you cannot use .get() in a subquery.
You already achieve what you want by slicing the queryset: your_queryset.values('code')[:1] adds a LIMIT clause to the SQL so that only one row is returned. In fact this is better suited here than get anyway, since get does not restrict the query to a single row and raises a MultipleObjectsReturned exception if more than one row matches.
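To make the [:1] point a bit more concrete, here is roughly the shape of SQL that a sliced Subquery corresponds to: a correlated scalar subquery with LIMIT 1. This is hand-written SQL run through the standard sqlite3 module, not Django's actual generated output, and the schema is a simplified assumption (two tables, the cep_country condition left out).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO supplier (id, code) VALUES (1, 'SUP-A'), (2, 'SUP-B');
    INSERT INTO invoice (id, supplier_id) VALUES (10, 1), (11, 2), (12, 99);
""")

# The inner SELECT refers to the outer row (invoice.supplier_id), which is
# the role OuterRef plays; LIMIT 1 is what the [:1] slice contributes.
rows = conn.execute("""
    SELECT invoice.id,
           (SELECT supplier.code
              FROM supplier
             WHERE supplier.id = invoice.supplier_id
             LIMIT 1) AS supplier_code
      FROM invoice
     ORDER BY invoice.id
""").fetchall()
```

The inner query only makes sense once it is embedded in the outer one, which is why it has to stay an unevaluated queryset rather than an already fetched object. An invoice without a matching supplier simply gets NULL (None) for supplier_code.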

django custom Func for specific SQL function

I'm currently performing a raw query in my database because I use the MySQL function instr. I would like to translate it into a Django Func class. I've spent several days reading the docs, "Django custom for complex Func (sql function)" and "Annotate SQL Function to Django ORM Queryset `FUNC` / `AGGREGATE`", but I still fail to write my custom Func successfully.
This is my database
from django.db import models

class Car(models.Model):
    brand = models.CharField("brand name", max_length=50)
    # then add the __str__ function
Then I populate my database for this test
Car.objects.create(brand="mercedes")
Car.objects.create(brand="bmw")
Car.objects.create(brand="audi")
I want to check if something in my table is in my user input. This is how I currently perform my SQL query:
query = Car.objects.raw("SELECT * FROM myAppName_car WHERE instr(%s, brand)>0", ["my audi A3"])
# this returns a query result with one element in this example
I'm trying to transform it into something that would look like this:
from django.db.models import Func
class Instr(Func):
    function = 'INSTR'

query = Car.objects.filter(brand=Instr('brand'))
EDIT
Thanks to the response, the correct answer is
from django.db.models import Value
from django.db.models.functions import StrIndex
query = Car.objects.annotate(pos=StrIndex(Value("my audi A3"), "brand")).filter(pos__gt=0)

Your custom database function is totally correct, but you're using it in the wrong way.
When you look at your usage in the raw SQL function, you can clearly see that you need 3 parameters for your filtering to work correctly: a string, a column name and a threshold (in your case it is always zero)
If you want to use this function in the same way in your query, you should do it like this:
query = Car.objects.annotate(
    position_in_brand=Instr(Value("my audi A3"), "brand")
).filter(position_in_brand__gt=0)
For this to work properly, wrap the literal in Value (from django.db.models), because Func treats a bare string argument as a field name. You also need to add output_field = IntegerField() to your function class, so Django knows that the result will always be an integer.
But you've missed two things. First, your function is not needed at all, because it already exists in Django as StrIndex.
Second, if you need the check in the other direction (whether the column value contains a given string), you can use the existing Django lookups contains or icontains:
query = Car.objects.filter(brand__contains="audi")
You can find out more about lookups in Django docs
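The argument order is the part that is easy to get backwards: instr(string, substring) looks for the second argument inside the first. A quick way to see the difference is to run both orders against the sample brands. The sketch below uses the standard sqlite3 module, whose instr takes its arguments in the same order as MySQL's INSTR; the table name car is a simplification of myAppName_car.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (brand TEXT)")
conn.executemany(
    "INSERT INTO car (brand) VALUES (?)",
    [("mercedes",), ("bmw",), ("audi",)],
)

user_input = "my audi A3"

# Is the brand somewhere inside the user input? (the original raw query)
input_contains_brand = [
    row[0]
    for row in conn.execute(
        "SELECT brand FROM car WHERE instr(?, brand) > 0", (user_input,)
    )
]

# Is the user input somewhere inside the brand? (what a contains lookup tests)
brand_contains_input = [
    row[0]
    for row in conn.execute(
        "SELECT brand FROM car WHERE instr(brand, ?) > 0", (user_input,)
    )
]
```

Only the first form finds audi for this input, which matches StrIndex(Value("my audi A3"), "brand") with the literal as the first argument.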

Apply Python Code To Sqlalchemy Filters

I'm trying to figure out how to apply python code (like splitting a list) to a sqlalchemy filter. An example is as follows: my database stores a full name as a field in the table. I want to query my database for all people who have a given first name. So what I want to do is something like:
User.query.filter(User.name.split()[0].lower() == 'henry'.lower())
When I try to run this query, I get the error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with User.name has an attribute 'split'
What is the general way to apply python commands like split(), lower(), etc. to my sqlalchemy queries?

SQLAlchemy is constructing a SQL expression, not a Python expression. You can apply SQL functions to the expression by using the func object.
from sqlalchemy import func
User.query.filter(func.lower(func.substring_index(User.name, ' ', 1)) == 'henry')
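Note that substring_index is a MySQL function; func will happily render any function name, and it is up to the database to know it. Below is a hedged sketch that first shows the expression being rendered as SQL text, and then a more portable way to match a first name using lower() and LIKE. It assumes SQLAlchemy 1.4 or newer with in-memory SQLite, and the User model and names are invented.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    # Hypothetical model with the full name stored in one column.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)


# func builds a SQL function call from whatever attribute name is used;
# nothing is evaluated in Python.
mysql_style = func.lower(func.substring_index(User.name, " ", 1))
rendered = str(mysql_style)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        User(name="Henry Ford"),
        User(name="henry adams"),
        User(name="Henrietta Lacks"),
    ])
    session.commit()

    # Portable alternative: the lowercased name starts with "henry ".
    matches = (
        session.query(User.name)
        .filter(func.lower(User.name).like("henry %"))
        .order_by(User.id)
        .all()
    )
    names = [row.name for row in matches]
```

The LIKE variant would miss a row whose name is just "Henry" with no surname, so it is a simplification rather than an exact equivalent of the split.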

Python SQLAlchemy Query using labeled OVER clause with ORM

This other question says how to use the OVER clause on sqlalchemy:
Using the OVER window function in SQLAlchemy
But how to do that using ORM? I have something like:
q = self.session.query(self.entity, func.count().over().label('count_over'))
This fails when I call q.all() with the following message:
sqlalchemy.exc.InvalidRequestError:
Ambiguous column name 'count(*) OVER ()' in result set! try 'use_labels' option on select statement
How can I solve this?

You have the over syntax almost correct, it should be something like this:
import sqlalchemy
from sqlalchemy import func

q = self.session.query(
    self.entity,
    sqlalchemy.over(func.count()).label('count_over'),
)
Example from the docs:
from sqlalchemy import over
over(func.row_number(), order_by='x')

SQLAlchemy Query object has with_entities method that can be used to customize the list of columns the query returns:
Model.query.with_entities(Model.foo, func.count().over().label('count_over'))
Resulting in following SQL:
SELECT models.foo AS models_foo, count(*) OVER () AS count_over FROM models

You got the functions right. The way to use them to produce the desired result would be as follows:
from sqlalchemy import func
q = self.session.query(self.entity, func.count(self.entity).over().label('count_over'))
This will produce a COUNT(*) statement since no Entity.field was specified. I use the following format:
from myschema import MyEntity
from sqlalchemy import func
q = self.session.query(MyEntity, func.count(MyEntity.id).over().label('count'))
That is if there is an id field, of course. But you get the mechanics :-)
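For what it's worth, on recent SQLAlchemy versions the query from the question seems to run as written. Here is a self-contained sketch, assuming SQLAlchemy 1.4 or newer and a SQLite build with window function support (3.25 or later); the Item model stands in for self.entity and is purely hypothetical.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Item(Base):
    # Hypothetical entity standing in for self.entity.
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Item(name="a"), Item(name="b"), Item(name="c")])
    session.commit()

    # Each result row is (Item instance, total row count); the count
    # comes from COUNT(*) OVER () and repeats on every row.
    rows = (
        session.query(Item, func.count().over().label("count_over"))
        .order_by(Item.id)
        .all()
    )
    result = [(item.name, count_over) for item, count_over in rows]
```

This is handy for pagination, where the total is needed alongside each page of entities without a second query.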

SQLAlchemy returns tuple not dictionary

I've updated SQLAlchemy to 0.6 but it broke everything. I've noticed it now returns a tuple instead of a dictionary. Here's a sample query:
query = session.query(User.id, User.username, User.email).filter(and_(User.id == id, User.username == username)).limit(1)
result = session.execute(query).fetchone()
This piece of code used to return a dictionary in 0.5.
My question is how can I return a dictionary?

session.execute has never returned a dict. It returns a RowProxy object, which can be indexed like a dict using integer keys for positional lookup, string keys for label-based lookup, or Column objects to look up the value of that column. The problem here is that session.execute(query) doesn't do what you seem to expect it to do. It converts the Query object to a Select statement, executes that and returns the result directly. The result set doesn't know anything about ORM-level features.
What changed between 0.5 and 0.6 is that the ORM uses a different algorithm to label the columns in queries: it now prepends the table name to the label. So where row['id'] previously happened to work, now row['users_id'] works. In both cases row[User.__table__.columns['id']] works.
To execute ORM queries you should use the .all(), .first() and .one() methods, iterate over the query, or use numeric indexing. Query returns named tuple objects. Zip the tuple with its keys if you want a dict:
row = session.query(User.id, User.username, User.email)\
    .filter(and_(User.id == id, User.username == username)).first()
print("id=%s username=%s email=%s" % row) # positional
print(row.id, row.username) # by label
print(dict(zip(row.keys(), row))) # as a dict

Are you sure it isn't a ResultProxy which pretends to be a tuple when you print it? Many objects in the ORM are not what their __str__ function returns.

This should work:
dict(zip(['id','username','email'],result)) (or you could use a dictionary comprehension if you're on Python 3.x).
Also, you don't need to call session.execute on a session.query object. You'll want to use the .one() method on it instead. This also obviates the need for the .limit(1) call hanging off the end of your query.

I solved this using:
# Get a list of user_ids
data_from_db = session.execute(select(User.id)).all()
# Parsing each Row object to a dict by "_mapping" attribute
return [dict(data._mapping) for data in data_from_db]
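On SQLAlchemy 1.4 and newer, the row-to-dict conversion can be sketched end to end as below. As far as I know, Row.keys() was deprecated in 1.4, so the sketch sticks to _mapping and _asdict(). The User model and the sample record are assumptions made up for the example, using in-memory SQLite.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    # Hypothetical model with the three columns from the question.
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    username = Column(String)
    email = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(id=1, username="alice", email="alice@example.com"))
    session.commit()

    row = session.execute(
        select(User.id, User.username, User.email).where(User.id == 1)
    ).one()

    # Two ways to get a plain dict out of a Row.
    as_dict = dict(row._mapping)
    also_dict = row._asdict()
```

The row itself still behaves like a named tuple, so row.username and row[1] keep working next to the dict forms.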

We Keep Coding
