How to get Peewee ORM contains column working with join - python

I'm doing a join across two tables, pretty simple set up, but when I add a contains or startswith that references a column in the table being joined I can never get the results. No errors, but the count is always 0, despite me knowing that the records exist and being able to write the equivalent query in raw SQL and have it return all the results I expect.
Here's what it looks like, assume A and B are tables, they're related through a foreign key, and both the fields I'm using in the where clause are CharField.
This version does not work despite me expecting it to:
(A.select().join(B).where(
A.some_column.contains(B.other_column)
))
But this does work as expected:
(A.select().join(B).where(
SQL("t1.some_column ILIKE '%%' || t2.other_column || '%%'")
))
I would expect those two to be equivalent, but they're not. Looking at the output SQL from the first one it looks like this:
(SELECT "t1"."some_column" from "A" as "t1"
INNER JOIN "B" as "t2" ON ("t1"."b_id" = "t2"."id")
WHERE ("t1"."some_column" ILIKE %s)', ['%<CharField: B.other_column>%'])
The interesting thing to me about the SQL output is at the end where it's referencing B.other_column. I'm guessing that if it were t2.other_column instead then the query would work, but how do I make peewee do that? I've tried everything I can think of and I can't figure out a pure ORM way to get this working.

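The contains method interpolates its argument into the pattern as a literal value: it wraps the value in '%' wildcards and sends the result as a single bound parameter, which is why the field shows up as the string '%<CharField: B.other_column>%' in your output.
To achieve what you're trying to do, stay away from the "contains" method and use the LIKE/ILIKE operators directly.
A.select().join(B).where(
A.some_column % ('%' + B.other_column + '%'))
The first "%" is Peewee's operator overload for LIKE (use "**" in its place for the case-insensitive ILIKE that contains() generates). The '%' + B.other_column + '%' expression concatenates the wildcards around the column for a substring search.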
UPDATE: I felt like this was a legit issue, so I've made a small change to make the .contains(), .startswith() and .endswith() methods work properly when the right-hand-side value is, for example, a field. Going forward it should work more intuitively.
Commit here: https://github.com/coleifer/peewee/commit/0c98f3e1f556eba10cbbdf7c386c49c64f4da41c
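To see why the two WHERE clauses in the question behave differently at the SQL level, here is a minimal sketch using Python's built-in sqlite3 module instead of Peewee. The lowercase table names and the sample rows are made up for illustration, and SQLite's LIKE stands in for ILIKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, other_column TEXT);
    CREATE TABLE a (id INTEGER PRIMARY KEY, some_column TEXT,
                    b_id INTEGER REFERENCES b(id));
    INSERT INTO b VALUES (1, 'ell'), (2, 'xyz');
    INSERT INTO a VALUES (1, 'hello', 1), (2, 'world', 2);
""")

join = "SELECT a.some_column FROM a JOIN b ON a.b_id = b.id WHERE "

# Pattern sent as a bound literal: the column name is just text inside the
# string, so it is never resolved to the joined column and matches nothing.
literal_rows = conn.execute(
    join + "a.some_column LIKE ?", ("%b.other_column%",)
).fetchall()

# Pattern assembled in SQL: the wildcards are concatenated around the
# joined column's value for each row.
column_rows = conn.execute(
    join + "a.some_column LIKE '%' || b.other_column || '%'"
).fetchall()

print(literal_rows)  # []
print(column_rows)   # [('hello',)]
```

The first query mirrors what the original contains() call produced (a pattern that only looks like a column reference), while the second mirrors the raw SQL that worked.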

Related

How to use 'contains' in SQLAlchemy Queries in order to make use of MSSQL Full-text index

MSSQL has two operators 'contains' and 'like' which behave differently:
Contains performs full-text search only on full-text indexed columns
Like does not need the column to be indexed
Contains is typically faster:
https://www.mytecbits.com/microsoft/sql-server/like-vs-contains
I have an SQLAlchemy query using 'Like':
if 'FullName' in data:
    filters.append(People.FullName.like('%' + data['FullName'] + '%'))
If I change the query to use 'Contains':
if 'FullName' in data:
    filters.append(People.FullName.contains(data['FullName']))
The echo returns pretty much the same thing, both use like, neither actually uses 'contains':
FROM [People]
WHERE [People].[FullName] LIKE ?
ORDER BY [People].[Id] DESC
OFFSET ? ROWS
FETCH FIRST ? ROWS ONLY
2022-06-08 15:51:23,247 INFO sqlalchemy.engine.Engine [generated in 0.00312s] ('%Bob%', 0, 100)
FROM [People]
WHERE ([People].[FullName] LIKE '%' + ? + '%')
ORDER BY [People].[Id] DESC
OFFSET ? ROWS
FETCH FIRST ? ROWS ONLY
2022-06-08 15:49:40,280 INFO sqlalchemy.engine.Engine [generated in 0.00228s] ('Bob', 0, 100)
SQLAlchemy documentation references Contains which produces an expression column LIKE '%' || <other> || '%'. I don't seem to find an example where SQLAlchemy makes use of 'contains'.
Does anyone know how to make SQLAlchemy use 'contains' in order for me to make use of the full text index? I would expect the query to look something like:
FROM [People]
WHERE CONTAINS([People].[FullName], ?)
ORDER BY [People].[Id] DESC
OFFSET ? ROWS
FETCH FIRST ? ROWS ONLY
2022-06-08 15:49:40,280 INFO sqlalchemy.engine.Engine [generated in 0.00228s] ('Bob', 0, 100)
Or is the only solution to write the query as SQL?
Many thanks :-)
The methods like(), contains(), startswith() and endswith() all work the same way. According to the official SQLAlchemy documentation, each of these methods builds an expression with the LIKE operator, which is then evaluated to get the results.
Column Elements and Expressions — SQLAlchemy 1.4 Documentation
So contains() gives the same result as writing LIKE with wildcards yourself, and it never emits the MSSQL CONTAINS predicate. If you need the full-text index for performance, write that part of the query as SQL.

How to use ilike and any sqlalchemy on postgresql array field?

I know this may seem like a duplicate, but the answer to the question asked that is basically identical to this one did not work for me.
```
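from sqlalchemy import any_, or_, func as F

query = request.args.get("query")
# Option 1: ARRAY.any() compares each element for equality
search_books = SavedBooks.query.filter(SavedBooks.authors.any(f'{query}')).all()
# Option 2: flatten the array to one string, then ILIKE (no % wildcards in the pattern)
search_books = SavedBooks.query.filter(F.array_to_string(SavedBooks.authors, ',').ilike(f'{query}')).all()
# Option 3: raises the ProgrammingError shown below
search_books = SavedBooks.query.filter(SavedBooks.authors.like(any_(f'{query}'))).all()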
```
Of these three search_books options, the first returns the author if the query string is an exact
match only. The second does the exact same as the first, the ilike seems to not make a difference,
the third is an option that has some type of syntax error, but I suppose would work. Any
suggestions?
Edit: This is the error I get when trying out the different queries.
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedFunction) operator does not exist: character varying[] ~~ unknown
LINE 3: ....isbn13 ILIKE '%eastmond%' OR saved_books.authors LIKE ANY (...
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
Edit:
I am aware of this related post, How to use ilike sqlalchemy on postgresql array field?. I used it to formulate my question. In the end I was not able to query the authors column of type array(string) with ilike, so I went with a workaround: I created a new column holding a copy of the authors column in plain string format (I called it authors_string). Then I just queried that column with ilike and it worked just fine. Make sure you remove the brackets from authors before you commit it into authors_string. I did that with authors_string = str(authors).strip('[]')
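The string workaround described in the edit can be sketched in plain Python. The author names below are made up, and ilike_contains is a hypothetical helper that only approximates a case-insensitive substring match; it is not a SQLAlchemy or PostgreSQL function.

```python
authors = ["Jane Doe", "Alan Eastmond"]  # made-up sample data

# str() on a list keeps the quotes from the list repr around each name.
via_str = str(authors).strip("[]")

# join() yields only the names and the separator.
authors_string = ", ".join(authors)

def ilike_contains(haystack: str, needle: str) -> bool:
    """Rough Python equivalent of: haystack ILIKE '%' || needle || '%'."""
    return needle.lower() in haystack.lower()

print(via_str)         # 'Jane Doe', 'Alan Eastmond'
print(authors_string)  # Jane Doe, Alan Eastmond
print(ilike_contains(authors_string, "eastmond"))  # True
```

Using join() rather than str().strip('[]') avoids storing the quote characters in authors_string, which would otherwise become part of what ilike searches.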
This should work:
search_books = SavedBooks.query.filter(SavedBooks.authors.ilike(query)).all()

SQLAlchemy match with or

I'm getting myself tied up in knots with some SQLAlchemy I'm trying to work out. I've got an old web app I'm trying to tart up, and have decided to rewrite it from scratch. As part of that, I'm playing with SQLAlchemy and trying to improve my Python skills. I've got a search I'm trying to run that checks whether the customer query appears in either the account name or the customer name field and matches on either of them. However, SQLAlchemy renders it as an AND.
If I add extra or_ blocks, it fails to recognise them and process appropriately.
I've moved it so it's the first query, but the query planner in sqlalchemy leaves it exactly the same.
Any ideas?
def CustomerCountryMatch(query, page):
    customer=models.Customer
    country=models.CustomerCodes
    query=customer.query.order_by(customer.account_name).\
        group_by(customer.account_name).having(func.max(customer.renewal_date)).\
        join(country, customer.country_code==country.CODE).\
        add_columns(customer.account_name,
                    customer.customer_name,
                    customer.account_id,
                    customer.CustomerNote,
                    country.COUNTRY,
                    country.SupportRegion,
                    customer.renewal_date,
                    customer.contract_type,
                    customer.CCGroup).\
        filter(customer.account_name.match(query)).filter(or_(customer.customer_name.match(query))).\
        paginate(page, 50, False)
The query as executed is below:
sqlalchemy.engine.base.Engine SELECT customer.customer_id AS customer_customer_id,
customer.customer_code AS customer_customer_code,
customer.address_code AS customer_address_code,
customer.customer_name AS customer_customer_name,
customer.account_id AS customer_account_id,
customer.account_name AS customer_account_name,
customer.`CustomerNote` AS `customer_CustomerNote`,
customer.renewal_date AS customer_renewal_date,
customer.contract_type AS customer_contract_type,
customer.country_code AS customer_country_code,
customer.`CCGroup` AS `customer_CCGroup`,
customer.`AgentStatus` AS `customer_AgentStatus`,
customer.comments AS customer_comments,
customer.`SCR` AS `customer_SCR`,
customer.`isDummy` AS `customer_isDummy`,
customer_codes.`COUNTRY` AS `customer_codes_COUNTRY`,
customer_codes.`SupportRegion` AS `customer_codes_SupportRegion`
FROM customer INNER JOIN
customer_codes ON customer.country_code=customer_codes.`CODE` WHERE
MATCH (customer.account_name) AGAINST (%s IN BOOLEAN MODE) AND
MATCH (customer.customer_name) AGAINST (%s IN BOOLEAN MODE) GROUP BY
customer.account_name HAVING max(customer.renewal_date) ORDER BY
customer.account_name LIMIT %s,
%s 2015-11-06 03:32:52,035 INFO sqlalchemy.engine.base.Engine ('bob', 'bob', 0, 50)
The filter clause should be:
filter(
    or_(
        customer.account_name.match(query),
        customer.customer_name.match(query)
    )
)
Calling filter twice, as in filter(clause1).filter(clause2) joins the criteria using AND (see the docs).
The construct: filter(clause1).filter(or_(clause2)) does not do what you intend, and is translated into SQL: clause1 AND clause2.
The following does make sense: filter(clause1).filter(or_(clause2, clause3)), which is translated into SQL as: clause1 AND (clause2 OR clause3).
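To see the difference between the two shapes on actual rows, here is a minimal sketch using Python's built-in sqlite3 module rather than SQLAlchemy. The table contents are made up, and plain LIKE stands in for MySQL's MATCH ... AGAINST, so only the AND/OR logic carries over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (account_name TEXT, customer_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("bob ltd", "alice"), ("acme", "bob"), ("bob ltd", "bob"), ("acme", "carol")],
)

pattern = "%bob%"
base = "SELECT account_name, customer_name FROM customer WHERE "

# Shape of filter(c1).filter(c2): both criteria must hold.
and_rows = conn.execute(
    base + "account_name LIKE ? AND customer_name LIKE ? ORDER BY rowid",
    (pattern, pattern),
).fetchall()

# Shape of filter(or_(c1, c2)): either criterion is enough.
or_rows = conn.execute(
    base + "(account_name LIKE ? OR customer_name LIKE ?) ORDER BY rowid",
    (pattern, pattern),
).fetchall()

print(and_rows)  # [('bob ltd', 'bob')]
print(or_rows)   # [('bob ltd', 'alice'), ('acme', 'bob'), ('bob ltd', 'bob')]
```

The AND form only returns the row where both columns match, which is the behaviour the question ran into.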
A simpler approach is to put the OR inside the match expression itself, using the '|' operator, if you want to find all rows that contain one or more of the words you are searching for, e.g.
query = query.filter(Table.text_searchable_column.match('findme | orme'))

Changing where clause without generating subquery in SQLAlchemy

I'm trying to build a relatively complex query and would like to manipulate the where clause of the result directly, without cloning/subquerying the returned query. An example would look like:
session = sessionmaker(bind=engine)()
def generate_complex_query():
    return select(
        columns=[location.c.id.label('id')],
        from_obj=location,
        whereclause=location.c.id>50
    ).alias('a')
query = generate_complex_query()
# based on this query, I'd like to add additional where conditions, ideally like:
# `query.where(query.c.id<100)`
# but without subquerying the original query
# this is what I found so far, which is quite verbose and it doesn't solve the subquery problem
query = select(
columns=[query.c.id],
from_obj=query,
whereclause=query.c.id<100
)
# Another option I was considering was to map the query to a class:
# class Location(object):pass
# mapper(Location, query)
# session.query(Location).filter(Location.id<100)
# which looks more elegant, but also creates a subquery
result = session.execute(query)
for r in result:
    print(r)
This is the generated query:
SELECT a.id
FROM (SELECT location.id AS id
FROM location
WHERE location.id > %(id_1)s) AS a
WHERE a.id < %(id_2)s
I would like to obtain:
SELECT location.id AS id
FROM location
WHERE id > %(id_1)s and
id < %(id_2)s
Is there any way to achieve this? The reason for this is that I think query (2) is slightly faster (not much), and the mapper example (2nd example above) which I have in place messes up the labels (id becomes anon_1_id or a.id if I name the alias).
Why don't you do it like this:
query = generate_complex_query()
query = query.where(location.c.id < 100)
Essentially you can refine any query like this. Additionally, I suggest reading the SQL Expression Language Tutorial which is pretty awesome and introduces all the techniques you need. The way you build a select is only one way. Usually, I build my queries more like this: select(column).where(expression).where(next_expression) and so on. The FROM is usually automatically inferred by SQLAlchemy from the context, i.e. you rarely need to specify it.
Since you don't have access to the internals of generate_complex_query try this:
query = query.where(query.c.id < 100)
This should work in your case I presume.
Another idea:
query = query.where(text("id < 100"))
This uses SQLAlchemy's text expression. This could work for you. However, and this is important: if you want to introduce variables, read the description of the API linked above, because using format strings instead of bound parameters will open you up to SQL injection. That is normally a no-brainer with SQLAlchemy, but it must be taken care of when working with such literal expressions.
Also note that this works because you label the column as id. If you don't do that and don't know the column name, then this won't work either.
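As a rough illustration of the bound-parameter point, here is a sketch with Python's built-in sqlite3 module rather than SQLAlchemy. It runs the flat query shape the question asks for, with both limits passed as named parameters. The table contents and the ids_between helper are made up for the example; SQLAlchemy's text() accepts the same :name placeholder style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO location (id) VALUES (?)", [(i,) for i in (10, 60, 99, 150)]
)

def ids_between(conn, low, high):
    # Hypothetical helper: both conditions live in one WHERE clause, and the
    # limits travel separately from the SQL text instead of being formatted in.
    sql = "SELECT id FROM location WHERE id > :low AND id < :high ORDER BY id"
    return [row[0] for row in conn.execute(sql, {"low": low, "high": high})]

print(ids_between(conn, 50, 100))  # [60, 99]
```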

create a string with parameters that depend on a list values in Python

I'm new to Python and I would like to create a string like this :
print("{} INNER JOIN {}".format(table1, table2))
The problem is that table1 and table2 are stored in a list tables_list = ['table1','table2'] and I don't know how many values there are in that list, so if I only have one item in that list, the result of the print would be :
table1
without the join.
I guess I should be looping on tables_list but I can't seem to figure out how to use format in that case.
Any help would be appreciated.
You can use join combined with slicing.
" INNER JOIN ".join(tables_list[:2])
Though it looks like you are trying to build an SQL query, so I'd warn you you should be wary of rolling your own query builder. Look in to the docs for whichever DB library you are using to make sure you aren't rewriting things they can already do for you.
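A small sketch of how str.join behaves for lists of different lengths. The function name build_join is made up for the example. Dropping the [:2] slice lets the same call cover any number of tables, since join only places the separator between items.

```python
def build_join(tables_list):
    # The separator goes between items only, so a single item comes back as-is.
    return " INNER JOIN ".join(tables_list)

print(build_join(["table1"]))                      # table1
print(build_join(["table1", "table2"]))            # table1 INNER JOIN table2
print(build_join(["table1", "table2", "table3"]))  # table1 INNER JOIN table2 INNER JOIN table3
```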
You can do it like this:
>>> tables_list = ['table1','table2']
>>> print("{} INNER JOIN {}".format(*tables_list))
table1 INNER JOIN table2
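It looks like you are writing SQL. To help mitigate the risk of SQL injection, most Python database libraries offer their own form of parameter substitution (a.k.a. bind variables). Bind variables can only stand in for values, not for identifiers such as table names, so the table names still have to go into the query text (after validation) while the values are bound. Using bind variables in Python usually looks like this (assuming a column named id):
query_results = cursor.execute("SELECT * FROM table1 WHERE id = ?", (some_id,))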
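One way to combine the two ideas, sketched with Python's built-in sqlite3 module: table names are checked against an allowlist before being joined into the query text, and only values go through placeholders, because a placeholder cannot take the place of a table name. ALLOWED_TABLES, select_joined and the sample tables are all made up for illustration.

```python
import sqlite3

ALLOWED_TABLES = {"table1", "table2"}  # hypothetical allowlist of known tables

def select_joined(conn, tables_list, min_id):
    # Identifiers cannot be bound, so validate them before formatting them in.
    unknown = [t for t in tables_list if t not in ALLOWED_TABLES]
    if unknown:
        raise ValueError(f"unexpected table name(s): {unknown}")
    from_clause = " INNER JOIN ".join(tables_list)
    first = tables_list[0]
    # The value is bound with a placeholder rather than formatted into the string.
    sql = (f"SELECT {first}.id FROM {from_clause} "
           f"WHERE {first}.id >= ? ORDER BY {first}.id")
    return [row[0] for row in conn.execute(sql, (min_id,))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (7);
""")

print(select_joined(conn, ["table1", "table2"], 2))  # [2, 3]

# A placeholder in identifier position is rejected by the database.
try:
    conn.execute("SELECT id FROM ?", ("table1",))
except sqlite3.OperationalError as exc:
    print("placeholder rejected as identifier:", exc)
```

The join here has no ON clause, so every row of table1 is paired with the single row in table2; a real query would add join conditions.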
