I have a SQL query as follows:
select column1, column2, count(column1) as c
from Table1 where user_id = 'xxxxx' and timestamp > xxxxxx
group by column1, column2
order by c desc limit 1;
And I succeeded in writing the SQLAlchemy equivalent:
result = session.query(Table1.field1, Table1.field2, func.count(Table1.field1)).filter(
    Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
    Table1.field1, Table1.field2).order_by(desc(func.count(Table1.field1))).first()
But I want to avoid repeating func.count(Table1.field1) in the order_by clause.
How can I use an alias in SQLAlchemy? Can anyone show an example?
Aliases are for tables; columns in a query are given a label instead. This trips me up from time to time too.
You can go about this two ways. It is sufficient to store the func.count() result in a local variable first and reuse that:
field1_count = func.count(Table1.field1)
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
    Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
    Table1.field1, Table1.field2).order_by(desc(field1_count)).first()
The SQL produced would still be the same as your own code would generate, but at least you don't have to type out the func.count() call twice.
To give this column an explicit label, call the .label() method on it:
field1_count = func.count(Table1.field1).label('c')
and you can then use that same label string in the order_by clause:
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
    Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
    Table1.field1, Table1.field2).order_by(desc('c')).first()
or you could use the field1_count.name attribute:
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
    Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
    Table1.field1, Table1.field2).order_by(desc(field1_count.name)).first()
You can also use .c, which is an alias of the columns attribute, but in this case a label works fine, as stated.
I'll also point out that filter() doesn't need to be called multiple times; you can pass comma-separated criteria.
result = (session.query(Table1.field1, Table1.field2,
                        func.count(Table1.field1).label('total'))
          .filter(Table1.user_id == self.user_id, Table1.timestamp > self.from_ts)
          .group_by(Table1.field1, Table1.field2)
          .order_by(desc('total')).first())
I have the following query that is attempting to return authors and their article counts:
SELECT (
SELECT COUNT(*)
FROM aldryn_newsblog_article
WHERE
aldryn_newsblog_article.author_id IN (1,2) AND
aldryn_newsblog_article.app_config_id = 1 AND
aldryn_newsblog_article.is_published IS TRUE AND
aldryn_newsblog_article.publishing_date <= now()
) as article_count, aldryn_people_person.*
FROM aldryn_people_person
However, it is currently returning the same number for each author because it counts all articles for authors with ID's of 1 and 2.
How should the query be modified, so it returns proper article counts for each author?
On a separate note, how can one turn the (1,2) into a list that can be spliced into the query dynamically? That is, suppose I have a Python list of author IDs, for which I would like to look up article counts. How could I pass that information to the SQL?
As commented, for a subquery to work you need to correlate it to the outer query, usually on a unique identifier (assumed here to be author_id), which also appears to be used as a filter condition; that filter belongs in the WHERE of the outer query. Also, use table aliases for clarity between the subquery and the outer query.
SELECT main.*
, (SELECT COUNT(*)
FROM aldryn_newsblog_article AS sub
WHERE
sub.author_id = main.author_id AND
sub.app_config_id = 1 AND
sub.is_published IS TRUE AND
sub.publishing_date <= now()
) AS article_count
FROM aldryn_people_person AS main
WHERE main.author_id IN (1, 2)
Alternatively, for a more efficient query, have the main query JOIN to an aggregate subquery so the counts are calculated once, instead of re-running the subquery for every row of the outer query.
SELECT main.*
     , sub.article_count
FROM aldryn_people_person AS main
INNER JOIN
(SELECT author_id
, COUNT(*) AS article_count
FROM aldryn_newsblog_article AS sub
WHERE
sub.app_config_id = 1 AND
sub.is_published IS TRUE AND
sub.publishing_date <= now()
GROUP BY author_id
) AS sub
ON sub.author_id = main.author_id
AND main.author_id IN (1, 2)
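For illustration, here is the difference sketched against an in-memory SQLite database (adapted slightly for SQLite: booleans stored as integers, and the now() filter dropped; table contents are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aldryn_people_person (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE aldryn_newsblog_article (
        id INTEGER PRIMARY KEY,
        author_id INTEGER,
        app_config_id INTEGER,
        is_published INTEGER
    );
    INSERT INTO aldryn_people_person VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO aldryn_newsblog_article (author_id, app_config_id, is_published)
    VALUES (1, 1, 1), (1, 1, 1), (2, 1, 1);
""")

# Correlated subquery: re-evaluated per outer row.
correlated = conn.execute("""
    SELECT main.name,
           (SELECT COUNT(*)
            FROM aldryn_newsblog_article AS sub
            WHERE sub.author_id = main.author_id
              AND sub.app_config_id = 1
              AND sub.is_published = 1) AS article_count
    FROM aldryn_people_person AS main
    WHERE main.author_id IN (1, 2)
    ORDER BY main.author_id
""").fetchall()

# Join to an aggregate subquery: counts computed once.
joined = conn.execute("""
    SELECT main.name, sub.article_count
    FROM aldryn_people_person AS main
    INNER JOIN (SELECT author_id, COUNT(*) AS article_count
                FROM aldryn_newsblog_article
                WHERE app_config_id = 1 AND is_published = 1
                GROUP BY author_id) AS sub
      ON sub.author_id = main.author_id
    WHERE main.author_id IN (1, 2)
    ORDER BY main.author_id
""").fetchall()

print(correlated)  # [('Alice', 2), ('Bob', 1)]
print(joined)      # [('Alice', 2), ('Bob', 1)]
```

Both forms return per-author counts; the correlated version fixes the original bug (the same total for every author) by tying the subquery to main.author_id.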
Re your separate note: there are many SO questions like this one asking for a dynamic list in the IN operator. The approach is to build a prepared statement with a dynamic number of parameter placeholders, either ? or %s depending on the Python DB-API driver (e.g., psycopg2, pymysql, pyodbc), then pass the parameters as the second argument of cursor.execute(). Do note your database's limit on the number of such values.
# BUILD PARAM PLACEHOLDERS
qmarks = ", ".join(['?' for _ in range(len(list_of_author_ids))])

# INTERPOLATE WITH F-STRING (PYTHON 3.6+)
sql = f'''SELECT ...
          FROM ....
          INNER JOIN ....
          AND main.author_id IN ({qmarks})'''

# BIND PARAMS
cursor.execute(sql, list_of_author_ids)
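A self-contained version of the same technique using the stdlib sqlite3 driver (which happens to use ? placeholders; the table and rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE aldryn_newsblog_article (author_id INTEGER)")
conn.executemany("INSERT INTO aldryn_newsblog_article VALUES (?)",
                 [(1,), (1,), (2,), (3,)])

list_of_author_ids = [1, 2]

# One '?' placeholder per id in the list.
qmarks = ", ".join("?" for _ in list_of_author_ids)
sql = f"""SELECT author_id, COUNT(*)
          FROM aldryn_newsblog_article
          WHERE author_id IN ({qmarks})
          GROUP BY author_id
          ORDER BY author_id"""

rows = conn.execute(sql, list_of_author_ids).fetchall()
print(rows)  # [(1, 2), (2, 1)]
```

Only the number of placeholders is interpolated into the SQL text; the values themselves are always bound, which keeps the query safe from injection.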
The way I normally handle these sorts of aggregates is to first design a query that gets a list of author names and articles, with a column that serves as the article count. At the lowest level this looks silly, because every article counts as 1. Then I wrap that in a subquery and sum from it.
SELECT sub.author, articleCount = sum(sub.rowCount)
FROM (
select distinct
author = x.author_id
, article = x.articleTitle
, rowCount = 1
from aldryn_newsblog_article x
where x.author_id in (1,2) and x.is_published = true --whatever other conditions you need here
) sub
GROUP BY sub.author
As far as the (1,2) being replaced with something more dynamic, the way I've seen it done before is to use CHARINDEX to parse a comma-separated string in the WHERE clause, so you would have something like:
DECLARE @passedFilter VARCHAR(50) = ',1,2,'
SELECT * FROM aldryn_newsblog_article WHERE CHARINDEX(',' + CAST(author_id AS VARCHAR) + ',', @passedFilter, 0) > 0
What this does is take your list of ids (note the leading and trailing commas) and let the query do a pattern match against the key value. I've read that this doesn't give the absolute best performance, but sometimes that isn't the biggest concern. We used this a lot in passing filters from a web app to SQL Server reports. Another method would be to declare a table variable / temp table, populate it somehow with the authors you want to filter for, then join the subquery from the first bit of my answer to that table.
This is my first post on Stack Overflow... thanks in advance for any and all help.
I am very new to programming, and I created a function in Python to dynamically search an sqlite3 database rather than entering tons of queries. I will show the code and try to explain what I intended to happen at each stage. In short, my cursor.fetchall() always evaluates to empty, even when I am certain there is a value in the database that it should find.
def value_in_database_check(table_column_name: str, value_to_check: str):
    db, cursor = get_connection()  # here I get a database and cursor connection
    for tuple_item in schema():  # here I get the schema of my database from another function
        if tuple_item[0] == "table":  # now I check if the tuple is a table
            schema_table = tuple_item[4]  # this just gives me the table info of the tuple
            # this lets me know the index of the column I am looking for in the table
            found_at = schema_table.find(table_column_name)
            # if my column value was found I will enter this block of code
            if not found_at == -1:
                table_name = tuple_item[1]
                to_find_sql = "SELECT * FROM {} WHERE ? LIKE ?".format(table_name)
                # value_to_check correlates to table_column_name
                # example "email", "this@email.com"
                cursor.execute(to_find_sql, (table_column_name, value_to_check))
                # this always evaluates to an empty list even if I am certain that the
                # information is in the database
                fetch_to_find = cursor.fetchall()
                if len(fetch_to_find) > 0:
                    return True
                else:
                    return False
I believe ? can only be used as a placeholder for values, not for names of tables or, as you are trying to do, columns. A likely fix (I haven't tested it, though):
to_find_sql = "SELECT * FROM {} WHERE {} LIKE ?".format(table_name, table_column_name)
cursor.execute(to_find_sql, (value_to_check, ))
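You can see the difference with a tiny in-memory SQLite session (table and value made up for illustration): binding the column name as a parameter makes the database compare the literal string 'email' against the pattern, which matches nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.execute("INSERT INTO users VALUES ('this@email.com')")

# Placeholder used for the column name: '?' binds the *string* 'email',
# so the query evaluates 'email' LIKE 'this@email.com', which is false.
broken = conn.execute("SELECT * FROM users WHERE ? LIKE ?",
                      ("email", "this@email.com")).fetchall()

# Interpolate the identifiers into the SQL text, bind only the value.
fixed_sql = "SELECT * FROM {} WHERE {} LIKE ?".format("users", "email")
fixed = conn.execute(fixed_sql, ("this@email.com",)).fetchall()

print(broken)  # []
print(fixed)   # [('this@email.com',)]
```

Since identifiers are interpolated rather than bound, make sure they come from a trusted source (here, the schema itself) and not from user input.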
I want to call a stored procedure and receive the output parameter in Python. I am using SQLAlchemy and can pass parameters, but I do not know how to have the output read into a variable. I understand that there is an outparam() construct in SQLAlchemy, but I have not found a useful example.
Here is a simple SQL code for testing:
CREATE PROCEDURE [dbo].[Test]
    @numOne int,
    @numTwo int,
    @numOut int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    SET @numOut = @numOne + @numTwo
END
And simple python:
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
outParam = 0
result = engine.execute('Test ? ,?, ? OUTPUT', [1, 2, outParam])
outParam is still 0. I have tried modifying it with:
outParam = sqlalchemy.sql.outparam("ret_%d", type_=int)
But this produces a "Programming Error." What am I missing?
SQLAlchemy returns a ResultProxy object. Try it like this:
engine = sqlalchemy.create_engine(...)
rproxy = engine.execute(...)
result = rproxy.fetchall()
result should be a list of RowProxy objects that you can treat like dictionaries, keyed on the column names from the query.
If you are looking for a true OUT param, your approach is almost correct. There is just a small error in the first parameter in the call to sqlalchemy.sql.outparam (but it is valid Python syntax). It should be like this:
outParam = sqlalchemy.sql.outparam("ret_%d" % i, type_=int)
Note the change to the first parameter: it just needed a value to substitute into the format string. The first parameter is the key value, and most likely %d is to be replaced with the column index number (which should be in i).
I'm having trouble converting this SQL query into a SQL Alchemy query:
query = """
SELECT i.case_num,
to_char(i.date_time, 'FMMonth FMDD, YYYY'),
to_char(i.date_time, 'HH24:MI'),
i.incident_type,
i.incident_cat,
i.injury,
i.property_damage,
i.description,
i.root_cause,
a.corrective_action,
a.due_date,
i.user_id
FROM incident as i, action_items as a
WHERE i.case_num = a.case_id AND i.case_num = %s;
"""
I have tried the following but have received nothing but errors:
sqlalchemy.orm.exc.NoResultFound: No row was found for one()
results = dbsession.query(Incidents.case_num,
                          func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
                          func.to_char(Incidents.date_time, 'HH24:MI'),
                          Incidents.incident_type,
                          Incidents.incident_cat,
                          Incidents.injury,
                          Incidents.property_damage,
                          Incidents.description,
                          Incidents.root_cause,
                          Actions.corrective_action,
                          Actions.due_date,
                          Incidents.user_id).join(Actions).filter_by(case_id = id).one()
AttributeError: mapper
results = dbsession.query(Incidents.case_num,
                          func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
                          func.to_char(Incidents.date_time, 'HH24:MI'),
                          Incidents.incident_type,
                          Incidents.incident_cat,
                          Incidents.injury,
                          Incidents.property_damage,
                          Incidents.description,
                          Incidents.root_cause,
                          Incidents.user_id).join(Actions.corrective_action, Actions.due_date).filter_by(case_id = id).one()
I figure I can do two separate queries but would rather figure out how to perform one join query instead.
You shouldn't need to specify a join explicitly to get SQLAlchemy to generate the statement you want.
Also (my opinion): avoid using filter_by.
In this case filter_by is not smart enough to realize that id refers to a column of Incidents, because id is also a built-in function. filter_by (see source)
accepts where conditions as keyword arguments, unpacks them, treating the keys as columns to be looked up, but not the values, then calls the filter method with all the conditions conjoined.
relevant bit of code:
def filter_by(self, **kwargs):
    clauses = [_entity_descriptor(self._joinpoint_zero(), key) == value
               for key, value in kwargs.items()]
    return self.filter(sql.and_(*clauses))
if id were provided as a left-hand value, i.e.
stmt = dbsession.query(...).join(...).filter_by(id = 123)
The statement would compile. However, the following would not compile
stmt = dbsession.query(...).join(...).filter_by(id = case_id)
because case_id is not a variable in scope.
And, the OP's version
stmt = dbsession.query(...).join(...).filter_by(case_id = id)
resolves case_id properly (as a column), but sees that there is something in the current scope named id (the built-in function) and tries to use it as the value.
This should do what you want:
results = dbsession.query(
    Incidents.case_num,
    func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
    func.to_char(Incidents.date_time, 'HH24:MI'),
    Incidents.incident_type,
    Incidents.incident_cat,
    Incidents.injury,
    Incidents.property_damage,
    Incidents.description,
    Incidents.root_cause,
    Actions.corrective_action,
    Actions.due_date,
    Incidents.user_id).filter(
    Actions.case_id == Incidents.case_num
).filter(
    Incidents.case_num == 123
).one()
# ^ here's how one would add multiple filters to a query
FYI, you can save query objects and inspect them, like this:
stmt = dbsession.query(...).filter(...)
print(stmt)
And then fetch the results with
stmt.one()
# or stmt.first() or stmt.all() or ...
I have a function which receives an optional argument.
I am querying a database table within this function.
What I would like is:
If the optional argument is specified, I want to add another additional .filter() to my database query.
My query is already rather long, so I don't want an if .. else .. that repeats the whole query twice.
What is the way to do this?
Below is an example to my query and if my_val is specified, I need to add another filtering line.
def my_def(my_val):
    query = Session.query(Table1, Table2).\
        filter(Table1.c1.in_(some_val)).\
        filter(Table1.c2 == 113).\
        filter(Table2.c3 == val1).\
        filter(Table1.c4 == val2).\
        filter(Table2.c5 == val5).\
        all()
You can wait to call the .all() method on the query, something like this:
def my_def(my_val=None):
    query = Session.query(Table1, Table2).\
        filter(Table1.c1.in_(some_val)).\
        filter(Table1.c2 == 113).\
        filter(Table2.c3 == val1).\
        filter(Table1.c4 == val2).\
        filter(Table2.c5 == val5)

    if my_val:
        query = query.filter(Table1.c6 == my_val)

    return query.all()
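The same build-it-up-then-execute pattern applies outside SQLAlchemy too; here is a sketch with the stdlib sqlite3 driver and made-up table/column names. Note that testing my_val is not None is safer than a plain truthiness check when falsy values like 0 are legitimate filters:

```python
import sqlite3

def find_rows(conn, min_c2, my_val=None):
    # Build the statement incrementally; the extra condition is added
    # only when the optional argument is provided.
    sql = "SELECT c1 FROM t WHERE c2 >= ?"
    params = [min_c2]
    if my_val is not None:
        sql += " AND c6 = ?"
        params.append(my_val)
    return conn.execute(sql + " ORDER BY c1", params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 INTEGER, c2 INTEGER, c6 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(1, 5, "a"), (2, 10, "b"), (3, 20, "b")])

print(find_rows(conn, 5))       # [(1,), (2,), (3,)]
print(find_rows(conn, 5, "b"))  # [(2,), (3,)]
```

Nothing executes until the final fetchall(), just as the SQLAlchemy query does nothing until .all() is called.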