I need to build a SQLAlchemy query dynamically from a set of dynamic columns and their values.
Example -
I have a table in SQL called "Convo" and it has columns like - UserID, ConvoID, ContactID.
I need to get rows based on the below criteria.
criteria = (('UserID', 2), ('ConvoID', 1) ,('ContactID', 353))
I have used the "baked query" approach for this, but somehow I am not able to run the query successfully.
Below is my code:
criteria = (('UserID', 2), ('ConvoID', 1) ,('ContactID', 353))
baked_query = bakery(lambda session: session.query(tablename))
for key1 in condition:
    baked_query += lambda q: q.filter(tablename.key1 == condition[key1])
result = baked_query(self.session).all()
I am getting this error:
AttributeError: type object 'Convo' has no attribute 'key1'
Please help me out with this
Use getattr to look up the mapped column by name:
criteria = (('UserID', 2), ('ConvoID', 1), ('ContactID', 353))
query = session.query(tablename)
for _filter, value in criteria:
    query = query.filter(getattr(tablename, _filter) == value)
result = query.all()
If you're using dynamic keys and "simple" equality checks, the filter_by method might be more convenient: it takes keyword arguments that match your mapped property names and assembles them into a WHERE clause.
So your iterative query construction could look like this:
baked_query = bakery(lambda session: session.query(tablename))
for key, value in condition.items():
    # bind key/value as defaults so each lambda keeps its own pair, and
    # unpack a dict so the real column name is used (filter_by(key=value)
    # would literally filter on a column named "key")
    baked_query += lambda q, key=key, value=value: q.filter_by(**{key: value})
Plus, since filter_by takes multiple keyword arguments, you can probably simplify your query construction to a single filter_by invocation:
baked_query = bakery(lambda session: session.query(tablename))
baked_query += lambda q: q.filter_by(**condition)
All of the above obviously assuming that your condition variable refers to a dictionary.
Related
This question is probably me not understanding the architecture of (the new) SQLAlchemy; typically I use code like this:
query = select(models.Organization).where(
models.Organization.organization_id == organization_id
)
result = await self.session.execute(query)
return result.scalars().all()
Works fine, I get a list of models (if any).
With a query with specific columns only:
query = (
select(
models.Payment.organization_id,
models.Payment.id,
models.Payment.payment_type,
)
.where(
models.Payment.is_cleared.is_(True),
)
.limit(10)
)
result = await self.session.execute(query)
return result.scalars().all()
I am getting the first row, first column only. This seems to match the behaviour of Result.scalar(): https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=scalar#sqlalchemy.engine.Result.scalar
My understanding so far was that in new sqlalchemy we should always call scalars() on the query, as described here: https://docs.sqlalchemy.org/en/14/changelog/migration_20.html#migration-orm-usage
But with specific columns, it seems we cannot use scalars() at all. What is even more confusing is that result.scalars() returns sqlalchemy.engine.result.ScalarResult, which has fetchmany(), fetchall(), among other methods, but which I am unable to iterate in any meaningful way.
My question is, what do I not understand?
My understanding so far was that in new sqlalchemy we should always call scalars() on the query
That is mostly true, but only for queries that return whole ORM objects. Just a regular .execute()
query = select(Payment)
results = sess.execute(query).all()
print(results) # [(Payment(id=1),), (Payment(id=2),)]
print(type(results[0])) # <class 'sqlalchemy.engine.row.Row'>
returns a list of Row objects, each containing a single ORM object. Users found that awkward since they needed to unpack the ORM object from the Row object. So .scalars() is now recommended
results = sess.scalars(query).all()
print(results) # [Payment(id=1), Payment(id=2)]
print(type(results[0])) # <class '__main__.Payment'>
However, for queries that return individual attributes (columns) we don't want to use .scalars() because that will just give us one column from each row, normally the first column
query = select(
Payment.id,
Payment.organization_id,
Payment.payment_type,
)
results = sess.scalars(query).all()
print(results) # [1, 2]
Instead, we want to use a regular .execute() so we can see all the columns
results = sess.execute(query).all()
print(results) # [(1, 123, None), (2, 234, None)]
Notes:
.scalars() is doing the same thing in both cases: return a list containing a single (scalar) value from each row (default is index=0).
sess.scalars() is the preferred construct. It is simply shorthand for sess.execute().scalars().
These two queries are semantically identical, but one of them succeeds and the other one fails. The only difference is in the WHERE clause, where the two operands of the OR operator have been switched.
from sqlalchemy import create_engine
engine = create_engine('mysql+pymysql://user:pwd@host:port/db')
# Query succeeds (and does what's expected)
with engine.connect() as cn:
    cn.exec_driver_sql(
        'UPDATE table SET column = "value" WHERE id = %s OR id IN %s',
        (3, (1, 2))
    )
# Query fails
with engine.connect() as cn:
    cn.exec_driver_sql(
        'UPDATE table SET column = "value" WHERE id IN %s OR id = %s',
        ((1, 2), 3)
    )
Output of the failed query:
TypeError: 'int' object is not iterable
It seems that SQLAlchemy's argument parsing depends on the type of the first argument: if the first one is a tuple, the query fails; if it is an int/float/str, it succeeds.
The workaround I've found so far is to use named arguments:
# Query succeeds
with engine.connect() as cn:
    cn.exec_driver_sql(
        'UPDATE table SET column = "value" WHERE id IN %(arg1)s OR id = %(arg2)s',
        {'arg1': (1, 2), 'arg2': 3}
    )
However it is more verbose and I don't want to use this everywhere. Also note that PyMySQL cursor's execute method accepted both queries.
Is there a reason for this behaviour?
I think the problem lies in DefaultExecutionContext._init_statement classmethod.
...
if not parameters:
    ...
elif isinstance(parameters[0], dialect.execute_sequence_format):
    self.parameters = parameters
elif isinstance(parameters[0], dict):
    ...
else:
    self.parameters = [
        dialect.execute_sequence_format(p) for p in parameters
    ]
self.executemany = len(parameters) > 1
isinstance(parameters[0], dialect.execute_sequence_format) checks whether the first element of parameters is a tuple. This seems to be a heuristic for efficiently detecting an executemany scenario; it should probably check that all the elements are tuples of equal length*. As it is, the values ((1, 2), 3) cause the equivalent of
cursor.executemany(sql, [(1, 2), 3])
and syntactically invalid statements like
SELECT * FROM tbl WHERE id IN 1 OR id = 2
-- ^^^^
Wrapping the parameters in a list fixes the problem, since len(parameters) will no longer be greater than one.
with engine.connect() as cn:
    cn.exec_driver_sql(
        'UPDATE table SET column = "value" WHERE id IN %s OR id = %s',
        [((1, 2), 3)]
    )
Obviously this is a workaround on top of a heuristic, so it may not work in every possible situation. It's probably worth opening a discussion on GitHub to explore whether this is a bug that should be fixed.
* Passing a tuple to create parenthesised values for ...IN %s or ...VALUES %s is not supported by all drivers, so it's not that bad a heuristic.
I'm having trouble converting this SQL query into a SQL Alchemy query:
query = """
SELECT i.case_num,
to_char(i.date_time, 'FMMonth FMDD, YYYY'),
to_char(i.date_time, 'HH24:MI'),
i.incident_type,
i.incident_cat,
i.injury,
i.property_damage,
i.description,
i.root_cause,
a.corrective_action,
a.due_date,
i.user_id
FROM incident as i, action_items as a
WHERE i.case_num = a.case_id AND i.case_num = %s;
"""
I have tried the following but have received nothing but errors:
sqlalchemy.orm.exc.NoResultFound: No row was found for one()
results = dbsession.query(Incidents.case_num,
func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
func.to_char(Incidents.date_time, 'HH24:MI'),
Incidents.incident_type,
Incidents.incident_cat,
Incidents.injury,
Incidents.property_damage,
Incidents.description,
Incidents.root_cause,
Actions.corrective_action,
Actions.due_date,
Incidents.user_id).join(Actions).filter_by(case_id = id).one()
AttributeError: mapper
results = dbsession.query(Incidents.case_num,
func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
func.to_char(Incidents.date_time, 'HH24:MI'),
Incidents.incident_type,
Incidents.incident_cat,
Incidents.injury,
Incidents.property_damage,
Incidents.description,
Incidents.root_cause,
Incidents.user_id).join(Actions.corrective_action, Actions.due_date).filter_by(case_id = id).one()
I figure I can do two separate queries but would rather figure out how to perform one join query instead.
You shouldn't need to specify a join explicitly to get SQLAlchemy to generate the statement you want.
Also (my opinion): avoid using filter_by.
In this case filter_by is not smart enough to realize that id should refer to a column on Incidents, because id is also a built-in function. filter_by (see source) accepts where conditions as keyword arguments, unpacks them, treating the keys as columns to be looked up (but not the values), then calls the filter method with all the conditions conjoined.
relevant bit of code:
def filter_by(self, **kwargs):
    clauses = [_entity_descriptor(self._joinpoint_zero(), key) == value
               for key, value in kwargs.items()]
    return self.filter(sql.and_(*clauses))
if id were provided as a left-hand value, i.e.
stmt = dbsession.query(...).join(...).filter_by(id = 123)
The statement would compile. However, the following would not compile
stmt = dbsession.query(...).join(...).filter_by(id = case_id)
because case_id is not a variable in scope.
And, the OP's version
stmt = dbsession.query(...).join(...).filter_by(case_id = id)
resolves case_id properly as the column key, sees that there is something in the current scope named id (the built-in function), and tries to use it as the value.
This should do what you want:
results = dbsession.query(
Incidents.case_num,
func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
func.to_char(Incidents.date_time, 'HH24:MI'),
Incidents.incident_type,
Incidents.incident_cat,
Incidents.injury,
Incidents.property_damage,
Incidents.description,
Incidents.root_cause,
Actions.corrective_action,
Actions.due_date,
Incidents.user_id).filter(
Actions.case_id == Incidents.id
).filter(
Incidents.case_num == 123
).one()
# ^ here's how one would add multiple filters to a query
FYI, you can save query objects and inspect them, like this:
stmt = dbsession.query(...).filter(...)
print(stmt)
And then fetch the results with
stmt.one()
# or stmt.first() or stmt.all() or ...
I have a sql query as follows
select column1, column2, count(column1) as c
from Table1 where user_id = 'xxxxx' and timestamp > xxxxxx
group by column1, column2
order by c desc limit 1;
And I succeeded in writing the SQLAlchemy equivalent:
result = session.query(Table1.field1,Table1.field2,func.count(Table1.field1)).filter(
Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
Table1.field1, Table1.field2).order_by(desc(func.count(Table1.field1))).first()
But I want to avoid using func.count(Table1.field1) in the order_by clause.
How can I use alias in sqlalchemy? Can any one show any example?
Aliases are for tables; columns in a query are given a label instead. This trips me up from time to time too.
You can go about this two ways. It is sufficient to store the func.count() result in a local variable first and reuse that:
field1_count = func.count(Table1.field1)
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
Table1.field1, Table1.field2).order_by(desc(field1_count)).first()
The SQL produced would still be the same as your own code would generate, but at least you don't have to type out the func.count() call twice.
To give this column an explicit label, call the .label() method on it:
field1_count = func.count(Table1.field1).label('c')
and you can then use that same label string in the order_by clause:
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
Table1.field1, Table1.field2).order_by(desc('c')).first()
or you could use the field1_count.name attribute:
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
Table1.field1, Table1.field2).order_by(desc(field1_count.name)).first()
You can also use c, which is an alias of the column attribute, but in this case a label works fine, as stated.
Note also that filter doesn't need to be called multiple times; you can pass comma-separated criteria.
result = (session.query(Table1.field1, Table1.field2,
func.count(Table1.field1).label('total'))
.filter(Table1.c.user_id == self.user_id, Table1.timestamp > self.from_ts)
.group_by(Table1.field1,Table1.field2)
.order_by(desc('total')).first())
I have a function which receives an optional argument.
I am querying a database table within this function.
What I would like is:
If the optional argument is specified, I want to add another additional .filter() to my database query.
My query is already rather long, so I don't want to write an if .. else in which I repeat the whole query twice.
What is the way to do this?
Below is an example to my query and if my_val is specified, I need to add another filtering line.
def my_def(my_val):
    query = Session.query(Table1, Table2).\
        filter(Table1.c1.in_(some_val)).\
        filter(Table1.c2 == 113).\
        filter(Table2.c3 == val1).\
        filter(Table1.c4 == val2).\
        filter(Table2.c5 == val5).\
        all()
You can wait to call .all() on the query, something like this:
def my_def(my_val=None):
    query = Session.query(Table1, Table2).\
        filter(Table1.c1.in_(some_val)).\
        filter(Table1.c2 == 113).\
        filter(Table2.c3 == val1).\
        filter(Table1.c4 == val2).\
        filter(Table2.c5 == val5)
    if my_val:
        query = query.filter(Table1.c6 == my_val)
    return query.all()