How to add a SQLServer Full Text Search in Django? - python

In Django, one can use full text search natively when using Postgres. However, when using SQL Server (with django-pyodbc-azure) there is no simple way to do it (as far as I know).
To do a full text search in SQL Server you use the CONTAINS(column, word) predicate as described in the docs, but the Django ORM's contains lookup generates LIKE '%text%' instead.
I did find two alternative methods to work around this problem. One is using raw SQL; the other is using Django's extra() method.
Snippet using django raw SQL:
sql = "SELECT id FROM table WHERE CONTAINS(text_field, 'term')"
table = Table.objects.raw(sql)
Using django extra:
where = "CONTAINS(text_field, 'term')"
table = Table.objects.extra(where=[where])
There are two problems with these approaches:
Raw queries are harder to maintain.
The Django docs recommend against using the extra method.
So I want to know if there is a better way to do this, using "pure" Django ORM if possible.

Related

Full text mysql database search in Django

We have been using a MySQL database for our project and Django as the backend framework. We want to support full text search on a particular table and return a QuerySet in Django. We know that Django supports full text search on a Postgres database, but we can't move to another database now.
From what we have gathered so far:
Using built-in search functionality: here we check on every field whether the value exists and then OR the results together, similar to the linked question (Django Search query within multiple fields in same data table).
This approach, however straightforward, may be inefficient for us because we have a huge amount of data.
Using a library or package: from what we have read, Django Haystack is something a lot of people are talking about when it comes to full text search.
Django Haystack - https://django-haystack.readthedocs.io/en/master/tutorial.html#installation
We haven't evaluated the library fully yet because we are trying to avoid adding a library for this purpose. Let us know if you have worked with it and have any views.
Any help is appreciated. Thanks.

Django ORM "filter" method produces SQL query without quotes

I am building an app with Django relying on several PostgreSQL databases which I do not manage; let's call them database A and database B. For each database, I've used python manage.py inspectdb to build my models.py files.
I am trying to use Django ORM to perform the following query (significantly simplified here), to_date being a datetime.datetime object:
sample = my_model_in_B.objects\
.filter(a_column=instance_of_a_model_in_A.name)\
.exclude(another_column='Useless things')\
.filter(a_column_with_dates__lte=to_date)
My issue is that it produces the following SQL query:
SELECT "myschema"."mytable"."a_column", "myschema"."mytable"."another_column" from "myschema"."mytable"
WHERE "myschema"."mytable"."a_column" = Here is the name of instance_of_a_model_in_A
AND "myschema"."mytable"."a_column_with_dates" <= 2020-02-03
AND NOT ("myschema"."mytable"."another_column" = Useless things
AND "myschema"."mytable"."another_column" IS NOT NULL))
In other words, my issue is that the Django ORM does not automatically add quotes where I need them. I don't understand what I did wrong. I don't know if it matters, but note that:
I use Django 2.2, and my database B is PostgreSQL 8 only.
All the columns that I use correspond to CharField in my models, except a_column_with_dates, which corresponds to a DateField.
Actually, my initial question was wrongly worded and misleading. I was assuming that QuerySet.query (which I used to get the above SQL code) was supposed to return the valid SQL query that the Django ORM executes, but it doesn't. It only aims to provide a basic representation of the query. From a comment made on the official Django project website:
Django never actually interpolates the parameters: it sends the query
and the parameters separately to the database adapter, which performs
the appropriate operations.
Indeed, from the official documentation:
The query parameter to QuerySet exists so that specialized query subclasses can reconstruct internal query state. The value of the parameter is an opaque representation of that query state and is not part of a public API.
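The separation between SQL text and parameters can be seen outside Django with any DB-API driver. The following is only a minimal sketch using the standard-library sqlite3 module as a stand-in for PostgreSQL; the table contents are made up for illustration, and a PostgreSQL driver such as psycopg2 would use %s placeholders instead of ?.

```python
import sqlite3

# In-memory database standing in for the real one (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a_column TEXT, a_column_with_dates TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("some name", "2020-01-15"), ("other name", "2020-03-01")],
)

# The SQL text contains only placeholders; the values travel separately,
# so no quotes appear (or are needed) in the query string itself.
sql = "SELECT a_column FROM mytable WHERE a_column = ? AND a_column_with_dates <= ?"
params = ("some name", "2020-02-03")

rows = conn.execute(sql, params).fetchall()
print(rows)
```

This is why a printed query can look unquoted while the query that actually runs is still valid: the driver, not the string, is responsible for handling the values.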

Insert statement created by Django ORM at bulk_create

I am kind of new to Python and Django.
I am using bulk_create to insert a lot of rows and, as a former DBA, I would very much like to see what insert statements are being executed. I know that for queries you can use .query, but for insert statements I can't find a command.
Is there something I'm missing or is there no easy way to see it? (A regular print is fine by me.)
The easiest way is to set DEBUG = True and check connection.queries after executing the query. This stores the raw queries and the time each query takes.
from django.db import connection
MyModel.objects.bulk_create(...)
print(connection.queries[-1]['sql'])
There's more information in the docs.
A great tool to make this information easily accessible is the django-debug-toolbar.

How to implement a search engine using python mysql?

I have a MySQL database created using a custom Python script. I need to implement full-text search on a table in the database. I can use SELECT * FROM myTable WHERE (title LIKE '%hello%' OR title LIKE '%world%'), however I don't think that is a very efficient way of implementing search since the data in the table has nearly one million rows.
I am using InnoDB tables, so the built-in MySQL full-text search for MyISAM will not work. Any suggestions on methods or tutorials that will point me in the right direction?
If your data is content-like, you could use a dedicated full-text search engine such as Lucene:
http://lucene.apache.org/pylucene/
If you are using Django, there is Haystack:
http://haystacksearch.org/
Solr is also a full-text search related technology you might read about:
http://wiki.apache.org/solr/SolPython
I am no expert with MySQL, but I can immediately say that you should not be selecting every row that matches a LIKE pattern. If the user types in "and" and there are thousands of results, it may be better to select only a limited number of rows from the database and then load more using the LIMIT clause when the user goes to the next page, e.g.:
SELECT * FROM `myTable` WHERE (`title` LIKE '%hello%' OR `title` LIKE '%world%') LIMIT startingAtRowNumber, numberOfValues
So to answer your question, the query is not efficient, and you should use something like I suggested above.
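As a rough illustration of the paging idea, here is a minimal sketch that builds the OR-of-LIKE filter with placeholders and fetches one page at a time. It uses the standard-library sqlite3 module as a stand-in for MySQL, and the helper name search_titles, the page_size default, and the sample rows are all assumptions made for this example.

```python
import sqlite3


def search_titles(conn, terms, page, page_size=20):
    """Return one page of rows whose title contains any of the given terms."""
    if not terms:
        return []
    # One "title LIKE ?" clause per term, combined with OR.
    where = " OR ".join("title LIKE ?" for _ in terms)
    sql = (
        f"SELECT id, title FROM myTable WHERE {where} "
        "ORDER BY id LIMIT ? OFFSET ?"
    )
    # Search patterns first, then the page size and the starting row.
    params = [f"%{term}%" for term in terms] + [page_size, page * page_size]
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO myTable (title) VALUES (?)",
    [("hello there",), ("big world",), ("nothing",), ("hello world",)],
)

first_page = search_titles(conn, ["hello", "world"], page=0, page_size=2)
second_page = search_titles(conn, ["hello", "world"], page=1, page_size=2)
```

Paging only limits how many rows are returned. A LIKE pattern with a leading wildcard still generally cannot use an ordinary index, so the table scan remains.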
Take a look at: Fulltext Search with InnoDB. They suggest using an external search engine since there is no really good option for searching within InnoDB tables.

Select Data from Table and Insert into a different DB

I'm using python and psycopg2 to remotely query some psql databases, and I'm trying to figure out the best way to select the data I need from the remote table, and insert it into a table on a separate DB (local application server).
Most of the stuff I've read has directed me to avoid executemany and look toward COPY operations, but I'm unsure how to implement this on a specific select statement as opposed to the entire table. Should I be headed this way or am I completely off?
but I'm unsure how to implement this on a specific select statement as opposed to the entire table
COPY isn't limited to tables; you can use a query as the source as well. Check out the examples in the manual, which show how to use COPY to create a text file based on a query:
http://www.postgresql.org/docs/current/static/sql-copy.html#AEN59055
(3rd example)
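With psycopg2, one way this could look is to run COPY (query) TO STDOUT on the remote connection and feed the result into COPY ... FROM STDIN on the local one, using the cursor's copy_expert method. This is only a sketch under assumptions: the function name copy_query_to_table is made up, both arguments are expected to be open psycopg2 connections, and the whole result is buffered in memory.

```python
import io


def copy_query_to_table(src_conn, dst_conn, select_sql, dest_table):
    """Copy the rows produced by select_sql on src_conn into dest_table on dst_conn.

    select_sql and dest_table are interpolated directly into the statements,
    so they must come from trusted code, never from user input.
    """
    buf = io.StringIO()
    with src_conn.cursor() as src_cur:
        # COPY accepts a parenthesised query as its source, not only a table name.
        src_cur.copy_expert(f"COPY ({select_sql}) TO STDOUT WITH CSV", buf)
    buf.seek(0)  # rewind so the destination reads from the start
    with dst_conn.cursor() as dst_cur:
        dst_cur.copy_expert(f"COPY {dest_table} FROM STDIN WITH CSV", buf)
    dst_conn.commit()


# Hypothetical usage (connection strings and table names are placeholders):
# import psycopg2
# src = psycopg2.connect("host=remote-host dbname=source_db")
# dst = psycopg2.connect("dbname=local_db")
# copy_query_to_table(src, dst, "SELECT id, name FROM big_table WHERE id < 1000", "local_copy")
```

For very large extracts, an on-disk temporary file in place of the in-memory buffer would likely be a better fit.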
Take a look at http://ryrobes.com/featured-articles/using-a-simple-python-script-for-end-to-end-data-transformation-and-etl-part-1/
Granted, this is pulling from Oracle and inserting into SQL Server, but the concepts should be the same.
