SQL Alchemy join statement - python

I'm having trouble converting this SQL query into a SQL Alchemy query:
query = """
SELECT i.case_num,
to_char(i.date_time, 'FMMonth FMDD, YYYY'),
to_char(i.date_time, 'HH24:MI'),
i.incident_type,
i.incident_cat,
i.injury,
i.property_damage,
i.description,
i.root_cause,
a.corrective_action,
a.due_date,
i.user_id
FROM incident as i, action_items as a
WHERE i.case_num = a.case_id AND i.case_num = %s;
"""
I have tried the following but have received nothing but errors:
sqlalchemy.orm.exc.NoResultFound: No row was found for one()
results = dbsession.query(Incidents.case_num,
func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
func.to_char(Incidents.date_time, 'HH24:MI'),
Incidents.incident_type,
Incidents.incident_cat,
Incidents.injury,
Incidents.property_damage,
Incidents.description,
Incidents.root_cause,
Actions.corrective_action,
Actions.due_date,
Incidents.user_id).join(Actions).filter_by(case_id = id).one()
AttributeError: mapper
results = dbsession.query(Incidents.case_num,
func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
func.to_char(Incidents.date_time, 'HH24:MI'),
Incidents.incident_type,
Incidents.incident_cat,
Incidents.injury,
Incidents.property_damage,
Incidents.description,
Incidents.root_cause,
Incidents.user_id).join(Actions.corrective_action, Actions.due_date).filter_by(case_id = id).one()
I figure I can do two separate queries but would rather figure out how to perform one join query instead.

You shouldn't need to specify a join explicitly to get SQLAlchemy to generate the statement you want.
Also (my opinion): avoid using filter_by.
In this case filter_by cannot tell that id was meant to refer to a column of Incidents: it is passed as a value, and as a value id is just Python's built-in function (unless you have defined your own variable with that name). filter_by (see source)
accepts WHERE conditions as keyword arguments and unpacks them, treating each key as a column name to look up but leaving the values alone; it then calls the filter method with all the conditions joined by AND.
relevant bit of code:
def filter_by(self, **kwargs):
clauses = [_entity_descriptor(self._joinpoint_zero(), key) == value
for key, value in kwargs.items()]
return self.filter(sql.and_(*clauses))
If id were given as the keyword (the left-hand side), i.e.
stmt = dbsession.query(...).join(...).filter_by(id = 123)
the statement would compile, because id is then looked up as a column name. However, the following would not run at all
stmt = dbsession.query(...).join(...).filter_by(id = case_id)
because case_id is not a variable in scope, so Python raises a NameError before SQLAlchemy ever sees it.
The OP's version,
stmt = dbsession.query(...).join(...).filter_by(case_id = id)
resolves case_id as a column name, finds something in the current scope named id (the built-in function), and tries to use it as the value to compare against.
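The key/value asymmetry is plain Python behaviour and can be seen without SQLAlchemy at all. The sketch below uses a hypothetical stand-in function, show_kwargs, that simply returns the keyword arguments it receives; undefined_case_id is a deliberately unassigned name used only for illustration.

```python
import builtins


def show_kwargs(**kwargs):
    """Stand-in for filter_by: return the keyword arguments as received."""
    return kwargs


# The key arrives as the string "case_id"; the value is whatever the
# name `id` refers to, which here is the built-in function.
received = show_kwargs(case_id=id)
key_names = list(received)
value_is_builtin = received["case_id"] is builtins.id

# A value that is not defined anywhere fails before the call happens.
try:
    show_kwargs(id=undefined_case_id)  # hypothetical, never assigned
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__
```

So a keyword can never collide with a built-in, but a value silently can, which is one reason to prefer filter with explicit column expressions.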
This should do what you want:
results = dbsession.query(
Incidents.case_num,
func.to_char(Incidents.date_time, 'FMMonth FMDD, YYYY'),
func.to_char(Incidents.date_time, 'HH24:MI'),
Incidents.incident_type,
Incidents.incident_cat,
Incidents.injury,
Incidents.property_damage,
Incidents.description,
Incidents.root_cause,
Actions.corrective_action,
Actions.due_date,
Incidents.user_id).filter(
Actions.case_id == Incidents.case_num
).filter(
Incidents.case_num == 123
).one()
# ^ here's how one would add multiple filters to a query
FYI, you can save query objects and inspect them, like this:
stmt = dbsession.query(...).filter(...)
print(stmt)
And then fetch the results with
stmt.one()
# or stmt.first() or stmt.all() or ...
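For anyone who wants to try this end to end, here is a minimal sketch against an in-memory SQLite database. The two models are simplified, hypothetical versions of the OP's tables (only a few columns, no foreign key declared), and the to_char calls are left out because SQLite has no such function.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Incidents(Base):  # hypothetical, cut-down model
    __tablename__ = "incident"
    case_num = Column(Integer, primary_key=True)
    description = Column(String)


class Actions(Base):  # hypothetical, cut-down model
    __tablename__ = "action_items"
    id = Column(Integer, primary_key=True)
    case_id = Column(Integer)
    corrective_action = Column(String)


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Incidents(case_num=123, description="spill"),
        Actions(id=1, case_id=123, corrective_action="add signage"),
    ])
    session.commit()

    wanted_case = 123  # a plain local variable, deliberately not named `id`
    stmt = session.query(
        Incidents.case_num,
        Incidents.description,
        Actions.corrective_action,
    ).filter(
        Actions.case_id == Incidents.case_num,  # join condition
        Incidents.case_num == wanted_case,      # row selection
    )
    result = tuple(stmt.one())
```

Both conditions are ordinary column expressions, so nothing depends on how a keyword name happens to be resolved.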

SqlAlchemy 2.x with specific columns makes scalars() return non-orm objects

This question is probably me not understanding architecture of (new) sqlalchemy, typically I use code like this:
query = select(models.Organization).where(
models.Organization.organization_id == organization_id
)
result = await self.session.execute(query)
return result.scalars().all()
Works fine, I get a list of models (if any).
With a query with specific columns only:
query = (
select(
models.Payment.organization_id,
models.Payment.id,
models.Payment.payment_type,
)
.where(
models.Payment.is_cleared.is_(True),
)
.limit(10)
)
result = await self.session.execute(query)
return result.scalars().all()
I am getting the first row, first column only. This seems to be the same behaviour as: https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=scalar#sqlalchemy.engine.Result.scalar
My understanding so far was that in new sqlalchemy we should always call scalars() on the query, as described here: https://docs.sqlalchemy.org/en/14/changelog/migration_20.html#migration-orm-usage
But with specific columns, it seems we cannot use scalars() at all. What is even more confusing is that result.scalars() returns a sqlalchemy.engine.result.ScalarResult, which has fetchmany() and fetchall() among other methods, but I am unable to iterate over it in any meaningful way.
My question is, what do I not understand?
My understanding so far was that in new sqlalchemy we should always call scalars() on the query
That is mostly true, but only for queries that return whole ORM objects. Just a regular .execute()
query = select(Payment)
results = sess.execute(query).all()
print(results) # [(Payment(id=1),), (Payment(id=2),)]
print(type(results[0])) # <class 'sqlalchemy.engine.row.Row'>
returns a list of Row objects, each containing a single ORM object. Users found that awkward since they needed to unpack the ORM object from the Row object. So .scalars() is now recommended
results = sess.scalars(query).all()
print(results) # [Payment(id=1), Payment(id=2)]
print(type(results[0])) # <class '__main__.Payment'>
However, for queries that return individual attributes (columns) we don't want to use .scalars() because that will just give us one column from each row, normally the first column
query = select(
Payment.id,
Payment.organization_id,
Payment.payment_type,
)
results = sess.scalars(query).all()
print(results) # [1, 2]
Instead, we want to use a regular .execute() so we can see all the columns
results = sess.execute(query).all()
print(results) # [(1, 123, None), (2, 234, None)]
Notes:
.scalars() is doing the same thing in both cases: return a list containing a single (scalar) value from each row (default is index=0).
sess.scalars() is the preferred construct. It is simply shorthand for sess.execute().scalars().
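A small synchronous sketch of the difference, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The Payment model here is a hypothetical three-column stand-in, not the asker's real model, and row._mapping is used to turn each row into a dict.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Payment(Base):  # hypothetical minimal model
    __tablename__ = "payment"
    id = Column(Integer, primary_key=True)
    organization_id = Column(Integer)
    payment_type = Column(String)


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as sess:
    sess.add_all([
        Payment(id=1, organization_id=123, payment_type="card"),
        Payment(id=2, organization_id=234, payment_type="cash"),
    ])
    sess.commit()

    query = select(
        Payment.id, Payment.organization_id, Payment.payment_type
    ).order_by(Payment.id)

    # .scalars() keeps only the first column of every row.
    first_column_only = sess.execute(query).scalars().all()

    # A plain .execute() keeps every selected column.
    rows = sess.execute(query).all()
    as_tuples = [tuple(row) for row in rows]
    as_dicts = [dict(row._mapping) for row in rows]
```

With the async session from the question, the same calls should apply after awaiting execute(), since the awaited result is an ordinary buffered Result.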

How to safely bind Oracle column to ORDER BY to SQLAlchemy in a raw query?

I'm trying to execute a raw sql query and safely pass an order by/asc/desc based on user input. This is the back end for a paginated datagrid. I cannot for the life of me figure out how to do this safely. Parameters get converted to strings so Oracle can't execute the query. I can't find any examples of this anywhere on the internet. What is the best way to safely accomplish this? (I am not using the ORM, must be raw sql).
My workaround is just setting ASC/DESC to a variable that I set. This works fine and is safe. However, how do I bind a column name to the ORDER BY? Is that even possible? I can just whitelist a bunch of columns and do something similar as I do with the ASC/DESC. I was just curious if there's a way to bind it. Thanks.
@default.route('/api/barcodes/<sort_by>/<sort_dir>', methods=['GET'])
@json_enc
def fetch_barcodes(sort_by, sort_dir):
#time.sleep(5)
# Can't use sort_dir as a parameter, so assign to variable to sanitize it
ord_dir = "DESC" if sort_dir.lower() == 'desc' else 'ASC'
records = []
stmt = text("SELECT bb_request_id,bb_barcode,bs_status, "
"TO_CHAR(bb_rec_cre_date, 'MM/DD/YYYY') AS bb_rec_cre_date "
"FROM bars_barcodes,bars_status "
"WHERE bs_status_id = bb_status_id "
"ORDER BY :ord_by :ord_dir ")
stmt = stmt.bindparams(ord_by=sort_by,ord_dir=ord_dir)
rs = db.session.execute(stmt)
records = [dict(zip(rs.keys(), row)) for row in rs]
DatabaseError: (cx_Oracle.DatabaseError) ORA-01036: illegal variable name/number
[SQL: "SELECT bb_request_id,bb_barcode,bs_status, TO_CHAR(bb_rec_cre_date, 'MM/DD/YYYY') AS bb_rec_cre_date FROM bars_barcodes,bars_status WHERE bs_status_id = bb_status_id ORDER BY :ord_by :ord_dir "] [parameters: {'ord_by': u'bb_rec_cre_date', 'ord_dir': 'ASC'}]
UPDATE Solution based on accepted answer:
def fetch_barcodes(sort_by, sort_dir, page, rows_per_page):
ord_dir_func = desc if sort_dir.lower() == 'desc' else asc
query_limit = int(rows_per_page)
query_offset = (int(page) - 1) * query_limit
stmt = select([column('bb_request_id'),
column('bb_barcode'),
column('bs_status'),
func.to_char(column('bb_rec_cre_date'), 'MM/DD/YYYY').label('bb_rec_cre_date')]).\
select_from(table('bars_barcodes')).\
select_from(table('bars_status')).\
where(column('bs_status_id') == column('bb_status_id')).\
order_by(ord_dir_func(column(sort_by))).\
limit(query_limit).offset(query_offset)
result = db.session.execute(stmt)
records = [dict(row) for row in result]
response = json_return()
response.addRecords(records)
#response.setTotal(len(records))
response.setTotal(1001)
response.setSuccess(True)
response.addMessage("Records retrieved successfully. Limit: " + str(query_limit) + ", Offset: " + str(query_offset) + " SQL: " + str(stmt))
return response
You could use Core constructs such as table() and column() for this instead of raw SQL strings. That'd make your life easier in this regard:
from sqlalchemy import select, table, column, asc, desc, func
ord_dir = desc if sort_dir.lower() == 'desc' else asc
stmt = select([column('bb_request_id'),
column('bb_barcode'),
column('bs_status'),
func.to_char(column('bb_rec_cre_date'),
'MM/DD/YYYY').label('bb_rec_cre_date')]).\
select_from(table('bars_barcodes')).\
select_from(table('bars_status')).\
where(column('bs_status_id') == column('bb_status_id')).\
order_by(ord_dir(column(sort_by)))
table() and column() represent the syntactic part of a full blown Table object with Columns and can be used in this fashion for escaping purposes:
The text handled by column() is assumed to be handled like the name of a database column; if the string contains mixed case, special characters, or matches a known reserved word on the target backend, the column expression will render using the quoting behavior determined by the backend.
Still, whitelisting might not be a bad idea.
Note that you don't need to manually zip() the row proxies in order to produce dictionaries. They act as mappings as is, and if you need dict() for serialization reasons or such, just do dict(row).
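If you do stay with raw SQL, the whitelisting idea can be sketched roughly as follows. This is only an illustration using the standard-library sqlite3 module rather than Oracle; SORTABLE_COLUMNS and build_order_by are hypothetical names, and the table is cut down to two columns.

```python
import sqlite3

# Hypothetical whitelist: user-facing sort keys mapped to real column names.
SORTABLE_COLUMNS = {
    "barcode": "bb_barcode",
    "created": "bb_rec_cre_date",
}


def build_order_by(sort_by, sort_dir):
    """Return an ORDER BY clause assembled only from trusted strings."""
    try:
        column_name = SORTABLE_COLUMNS[sort_by]
    except KeyError:
        raise ValueError("Cannot sort by %r" % sort_by)
    direction = "DESC" if sort_dir.lower() == "desc" else "ASC"
    return " ORDER BY {0} {1}".format(column_name, direction)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bars_barcodes (bb_barcode TEXT, bb_rec_cre_date TEXT)"
)
conn.executemany(
    "INSERT INTO bars_barcodes VALUES (?, ?)",
    [("A1", "2020-01-02"), ("B2", "2020-01-01")],
)

sql = "SELECT bb_barcode FROM bars_barcodes" + build_order_by("created", "asc")
barcodes = [row[0] for row in conn.execute(sql)]
```

The user's input is only ever used as a dictionary key, so nothing the user types ends up in the SQL text itself.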

Syntax error when assigning field name variable to a "?" placeholder [duplicate]

I'm currently building SQL queries depending on input from the user. An example how this is done can be seen here:
def generate_conditions(table_name,nameValues):
sql = u""
for field in nameValues:
sql += u" AND {0}.{1}='{2}'".format(table_name,field,nameValues[field])
return sql
search_query = u"SELECT * FROM Enheter e LEFT OUTER JOIN Handelser h ON e.Id == h.Enhet WHERE 1=1"
if "Enhet" in args:
search_query += generate_conditions("e",args["Enhet"])
c.execute(search_query)
Since the SQL changes every time I cannot insert the values in the execute call which means that I should escape the strings manually. However, when I search everyone points to execute...
I'm also not that satisfied with how I generate the query, so if someone has any idea for another way that would be great also!
You have two options:
Switch to using SQLAlchemy; it'll make generating dynamic SQL a lot more pythonic and ensures proper quoting.
Since you cannot use parameters for table and column names, you'll still have to use string formatting to include these in the query. Your values on the other hand, should always be using SQL parameters, if only so the database can prepare the statement.
It's not advisable to just interpolate table and column names taken straight from user input; it's far too easy to inject arbitrary SQL statements that way. Instead, verify the table and column names against a list of names you accept.
So, to build on your example, I'd go in this direction:
tables = {
'e': ('unit1', 'unit2', ...), # tablename: tuple of column names
}
def generate_conditions(table_name, nameValues):
if table_name not in tables:
raise ValueError('No such table %r' % table_name)
sql = u""
params = []
for field in nameValues:
if field not in tables[table_name]:
raise ValueError('No such column %r' % field)
sql += u" AND {0}.{1}=?".format(table_name, field)
params.append(nameValues[field])
return sql, params
search_query = u"SELECT * FROM Enheter e LEFT OUTER JOIN Handelser h ON e.Id == h.Enhet WHERE 1=1"
search_params = []
if "Enhet" in args:
sql, params = generate_conditions("e",args["Enhet"])
search_query += sql
search_params.extend(params)
c.execute(search_query, search_params)
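To see the effect, here is a self-contained sketch against an in-memory SQLite database. The ALLOWED whitelist and the Namn column are assumptions made up for the demo; only the Enheter table name comes from the question.

```python
import sqlite3

# Hypothetical whitelist for the demo: table alias -> allowed column names.
ALLOWED = {"e": ("Id", "Namn")}


def generate_conditions(alias, name_values):
    """Build ' AND alias.col=?' fragments plus the matching parameter list."""
    if alias not in ALLOWED:
        raise ValueError("No such table %r" % alias)
    sql, params = "", []
    for field, value in name_values.items():
        if field not in ALLOWED[alias]:
            raise ValueError("No such column %r" % field)
        sql += " AND {0}.{1}=?".format(alias, field)
        params.append(value)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Enheter (Id INTEGER, Namn TEXT)")
conn.executemany(
    "INSERT INTO Enheter VALUES (?, ?)", [(1, "Alfa"), (2, "Beta")]
)

base_query = "SELECT e.Id FROM Enheter e WHERE 1=1"

# A normal search value.
sql, params = generate_conditions("e", {"Namn": "Beta"})
matches = conn.execute(base_query + sql, params).fetchall()

# An injection attempt is compared as a literal string and matches nothing.
sql, params = generate_conditions("e", {"Namn": "x' OR '1'='1"})
injected = conn.execute(base_query + sql, params).fetchall()
```

With the original string-formatting version, the second value would have turned the WHERE clause into one that matches every row.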

How to use an alias in SQLAlchemy

I have a sql query as follows
select column1,column2,count(column1) as c
from Table1 where user_id='xxxxx' and timestamp > xxxxxx
group by column1,column2
order by c desc limit 1;
And I succeeded in writing the SQLAlchemy equivalent:
result = session.query(Table1.field1,Table1.field2,func.count(Table1.field1)).filter(
Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
Table1.field1,Table1.field2).order_by(desc(func.count(Table1.field1))).first()
But I want to avoid repeating func.count(Table1.field1) in the order_by clause.
How can I use an alias in SQLAlchemy? Can anyone show an example?
Aliases are for tables; columns in a query are given a label instead. This trips me up from time to time too.
You can go about this in two ways. It is sufficient to store the func.count() result in a local variable first and reuse that:
field1_count = func.count(Table1.field1)
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
Table1.field1, Table1.field2).order_by(desc(field1_count)).first()
The SQL produced would still be the same as your own code would generate, but at least you don't have to type out the func.count() call twice.
To give this column an explicit label, call the .label() method on it:
field1_count = func.count(Table1.field1).label('c')
and you can then use that same label string in the order_by clause:
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
Table1.field1, Table1.field2).order_by(desc('c')).first()
or you could use the field1_count.name attribute:
result = session.query(Table1.field1, Table1.field2, field1_count).filter(
Table1.user_id == self.user_id).filter(Table1.timestamp > self.from_ts).group_by(
Table1.field1, Table1.field2).order_by(desc(field1_count.name)).first()
You can also use .c, which is an alias for the .columns attribute (on Table objects and subqueries), but in this case a label works fine, as stated above.
Note also that filter doesn't need to be called multiple times; you can pass several comma-separated criteria in one call.
result = (session.query(Table1.field1, Table1.field2,
func.count(Table1.field1).label('total'))
.filter(Table1.user_id == self.user_id, Table1.timestamp > self.from_ts)
.group_by(Table1.field1,Table1.field2)
.order_by(desc('total')).first())
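Here is a runnable sketch of the label approach against an in-memory SQLite database. Table1 is a hypothetical minimal model with an added surrogate id column, and the timestamp filter is omitted to keep the demo short.

```python
from sqlalchemy import Column, Integer, String, create_engine, desc, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Table1(Base):  # hypothetical minimal model
    __tablename__ = "table1"
    id = Column(Integer, primary_key=True)
    user_id = Column(String)
    field1 = Column(String)
    field2 = Column(String)


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Table1(user_id="u1", field1="a", field2="x"),
        Table1(user_id="u1", field1="a", field2="x"),
        Table1(user_id="u1", field1="b", field2="y"),
    ])
    session.commit()

    # Label the aggregate once, then reuse the same object for ordering.
    field1_count = func.count(Table1.field1).label("c")
    query = (
        session.query(Table1.field1, Table1.field2, field1_count)
        .filter(Table1.user_id == "u1")
        .group_by(Table1.field1, Table1.field2)
        .order_by(desc(field1_count))
    )
    top = tuple(query.first())
```

Calling print(query) before fetching is a quick way to check how the label is rendered in the generated SQL.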
