django icontains to search postgres database - python

I currently have a Django application and a postgres database. I want to have a search bar that allows users to enter a value and searches some of the fields of the model for matching values. I want this to work even when the search value is an empty string (""). I currently have:
MyModel.objects.filter(myfield__icontains=search_query).order_by(...)
How would I make this search multiple fields of the model at the same time? What is the most efficient way to do so? Is "icontains" okay for this?
Any help would be greatly appreciated!

Doing this through regular filter queries and icontains is not advisable as it becomes inefficient pretty quickly - you certainly don't want to be doing that on multiple large text fields.
However, PostgreSQL comes with a full text search engine which is designed for exactly this purpose. Django provides support for this.
You can define a SearchVector to perform full text search on multiple fields at once, e.g.:
from django.contrib.postgres.search import SearchVector

MyModel.objects.annotate(
    search=SearchVector('field_1') + SearchVector('field_2'),
).filter(search=search_query)
The documentation I've linked to provides a lot of additional information on how to perform ranking etc. on search results.
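For example, a minimal ranking sketch along those lines (using the same placeholder field names as above; adjust them to your model):
from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector

vector = SearchVector('field_1') + SearchVector('field_2')
query = SearchQuery(search_query)

MyModel.objects.annotate(
    search=vector,
    rank=SearchRank(vector, query),
).filter(search=query).order_by('-rank')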
Another alternative is to use a search engine like Elasticsearch - whether this is necessary depends on how many objects you have and what kind of filtering and ranking you need to do on results.

You can use Q to search multiple fields, for example:
fields that you want to search:
field0
field1
field2
Django search code:
from django.db.models import Q
search_result = MyModel.objects.filter(
    Q(field0__icontains=search_query) |
    Q(field1__icontains=search_query) |
    Q(field2__icontains=search_query)
).order_by(...)

Related

Django: Store Q query objects for repeatable search?

In my Django based web app users can perform a search; the query consists of several dynamically constructed complex Q objects.
Depending on the user search parameters, the search will query a variable number of columns and also can stretch over multiple models.
The user should be able to save her search to repeat it at some later point.
For that I'd like to store the Q objects (I guess) in a database table.
Is this good practice? How would you approach this?
Thanks in advance.
If you have just one or a fixed number of Q objects as part of the filter, you can save the argument passed to Q as a dict.
e.g. this:
Q(buy_book__entity__type=ENTITY.INTERNAL)
Is equivalent to this:
q_filter = {"buy_book__entity__type": ENTITY.INTERNAL}
Q(**q_filter)
You can save q_filter in your datastore.
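For instance, a minimal sketch of persisting and replaying such a filter (the SavedSearch model and MyModel are assumptions for illustration, and ENTITY.INTERNAL is stored here as its underlying value):
from django.db import models
from django.db.models import Q

class SavedSearch(models.Model):
    # Hypothetical table that stores the kwargs passed to Q as JSON.
    name = models.CharField(max_length=100)
    q_filter = models.JSONField(default=dict)

# Saving the search:
SavedSearch.objects.create(
    name='internal buys',
    q_filter={'buy_book__entity__type': 'INTERNAL'},
)

# Repeating it later:
saved = SavedSearch.objects.get(name='internal buys')
results = MyModel.objects.filter(Q(**saved.q_filter))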

Raw query and row level access control over multiple models in Django

I'm trying to provide an interface for the user to write custom queries over the database. I need to make sure they can only query the records they are allowed to. In order to do that, I decided to apply row based access control using django-guardian.
Here is what my schema looks like:
class BaseClass(models.Model):
    somefield = models.TextField()

    class Meta:
        permissions = (
            ('view_record', 'View record'),
        )

class ClassA(BaseClass):
    # some other fields here
    classb = models.ForeignKey('ClassB', on_delete=models.CASCADE)

class ClassB(BaseClass):
    # some fields here
    classc = models.ForeignKey('ClassC', on_delete=models.CASCADE)

class ClassC(BaseClass):
    # some fields here
    pass
I would like to be able to use get_objects_for_group as follows:
>>> group = Group.objects.create(name='some group')
>>> class_c = ClassC.objects.create(somefield='ClassC')
>>> class_b = ClassB.objects.create(somefield='ClassB', classc=class_c)
>>> class_a = ClassA.objects.create(somefield='ClassA', classb=class_b)
>>> assign_perm('view_record', group, class_c)
>>> assign_perm('view_record', group, class_b)
>>> assign_perm('view_record', group, class_a)
>>> get_objects_for_group(group, 'view_record')
This gives me a QuerySet. Can I use the BaseClass that I defined above and write a raw query over other related classes?
>>> qs.intersection(get_objects_for_group(group, 'view_record'),
...     BaseClass.objects.raw('select * from table_a a '
...                           'join table_b b on a.id=b.table_a_id '
...                           'join table_c c on b.id=c.table_b_id '
...                           'where some conditions here'))
Does this approach make sense? Is there a better way to tackle this problem?
Thanks!
Edit:
Another way to tackle the problem might be creating a separate table for each user. I understand the complexity this might add to my application but:
The number of users will not be more than a few hundred for a long time; this is not a consumer application.
Per our use case, it's quite unlikely that I'll need to query across these tables. I won't write a query that needs to aggregate anything from table1, table2, and table3 belonging to the same model.
Maintaining a separate table per customer could have advantages of its own.
Do you think this is a viable approach?
After researching many options, I found out that I can solve this problem at the database level using Row Level Security in PostgreSQL. It ended up being the easiest and most elegant solution.
This article helped me a lot to bridge the application level users with PostgreSQL policies.
What I learned by doing my research is:
Separate tables could still be an option in the future, when customers could potentially affect each other's query performance since they are allowed to run arbitrary queries.
Trying to solve it at the ORM level is almost impossible if you are planning to use raw or ad-hoc queries.
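As an illustration only (the table, policy, and setting names below are invented, not from the question), a row level security policy can be installed from a Django migration with RunSQL, and the application then sets the current tenant on each database session:
from django.db import migrations

class Migration(migrations.Migration):

    dependencies = [('myapp', '0001_initial')]

    operations = [
        migrations.RunSQL(
            sql="""
                ALTER TABLE myapp_record ENABLE ROW LEVEL SECURITY;
                CREATE POLICY tenant_isolation ON myapp_record
                    USING (tenant_id = current_setting('app.current_tenant')::int);
            """,
            reverse_sql="""
                DROP POLICY tenant_isolation ON myapp_record;
                ALTER TABLE myapp_record DISABLE ROW LEVEL SECURITY;
            """,
        ),
    ]
Because the policy is enforced by PostgreSQL itself, it also applies to raw and ad-hoc queries.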
I think you already know what you need to do. The word you are looking for is multitenancy, although it is not one table per customer; the best fit for you will be one schema per customer. Unfortunately, the best article I had on multitenancy is no longer available. See if you can find a cached version: https://msdn.microsoft.com/en-us/library/aa479086.aspx; otherwise, there are numerous articles available on the internet.
Another viable approach is to take a look at custom managers. You could write one custom manager for each model/customer combination and query it accordingly. But all this will add application complexity and will soon get out of hand. Any bug in the application security layer is a nightmare for you.
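A rough sketch of that idea (the customer field and model names are assumptions, not from the question):
from django.db import models

class CustomerScopedManager(models.Manager):
    # Hypothetical manager that only exposes rows owned by a given customer.
    def for_customer(self, customer):
        return self.get_queryset().filter(customer=customer)

class Record(models.Model):
    customer = models.ForeignKey('Customer', on_delete=models.CASCADE)
    objects = CustomerScopedManager()

# Usage: Record.objects.for_customer(request.user.customer)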
Weighing both, I'd be inclined to say the multitenancy solution, as you said in your edit, is by far the best approach.
First, you should provide more details about how your architecture is set up and built with Django so that we can help you. Have you implemented an API? Using Django templates is not really a good idea if you are building a large-scale application that consumes a lot of data, because this can affect the query load massively. I would suggest separating your front end from the backend.

What kind of search should be used

I'm working on an online store using Django. The question is simple: for a simple model named Product with "name" and "description" fields, should I try full text search using PostgreSQL or a simple query using the "icontains" field lookup?
The simplest way to use full text search is to search a single term against a single column in the database.
example: Product.objects.filter(description__search='lorem')
Searching against a single field is great but rather limiting. To query against both fields, use a SearchVector.
In the same way, you can use SearchQuery too.
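For instance, a minimal sketch against the Product model from the question:
from django.contrib.postgres.search import SearchQuery, SearchVector

Product.objects.annotate(
    search=SearchVector('name', 'description'),
).filter(search=SearchQuery('lorem'))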

haystack solr search ALL fields

I have a solr search engine set up with multiple fields and I want to be able to search ALL fields.
I can do a .filter(content='string'), but this only searches whatever fields are included in the document=True field.
EDIT
Also, some of the non-document=True fields have different filters/tokenisers applied, so I'm guessing that would not work with adding them into a single field...
Maybe you can make a second field with 'use_template' and a template displaying ALL fields.
I have never tried to do this, but it sounds like a good way to do it to me.
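A rough sketch of that idea (the index, app, and model names are assumptions, not from the question):
# search_indexes.py (django-haystack)
from haystack import indexes
from myapp.models import MyModel

class MyModelIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    # Second field whose template renders every field you want searchable.
    all_fields = indexes.CharField(use_template=True)

    def get_model(self):
        return MyModel
The template would live at templates/search/indexes/myapp/mymodel_all_fields.txt and simply render {{ object.field_1 }}, {{ object.field_2 }}, and so on, one per line; you could then query with SearchQuerySet().filter(all_fields=query).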
EDIT since OP comment:
Then my best bet is either to subclass SearchQuerySet to add a method, or to create a function that will loop over all the fields in your SearchIndex and do something like:
qs = SearchQuerySet().filter(content=query)
for field in fieldlist:
    qs = qs.filter_or(**{field: query})
I have no idea if this works at all but that's worth trying.
#neolaser: I think what you want can be achieved by using DisMax search. It allows searching through multiple fields and specifying a boost value for each of them. For more details:
http://wiki.apache.org/solr/SolrRelevancyFAQ
http://wiki.apache.org/solr/DisMaxQParserPlugin
You can search all the fields by including them all in your filtering query parameter or by naming them in the query string (e.g. if you need to search for "keyword", search for "((field_1:keyword) OR (field_2:keyword) OR (field_3:keyword))" instead).
However, it is usually better to have a dedicated field concatenating all the others you need to search and search this single field. You can set up a copyfield in your schema to have that content generated automatically when your document is indexed.

Is there a framework or pattern for applying filters to data?

The problem:
I have some hierarchical data in a Django application that will be passed on through to javascript. Some of this data will need to be filtered out from javascript based on the state of several data classes in the javascript. I need a way of defining the filters in the backend (Django) that will then be applied in javascript.
The filters should look like the following:
dataobject.key operator value
Filters can also be conditional:
if dataobject.key operator value
and dataobject.key2 operator value
or dataobject.key3 operator value
And probably any combination of conditionals such as:
if (condition and condition) or condition
Some keys will have a set of allowed values, and other keys will have free-text fields. This system must be usable by business-type end-users, otherwise there is no point in having this system at all. The primary goal is to have a system that is fully managed by the end-users. If most of these goals can be implemented, I'll consider it a win.
Is a rule engine appropriate for this scenario? Is there a python or django framework available for implementing this behaviour or any well defined patterns?
Update (Based on S.Lott's answer):
I'm not talking about filtering the data using the Django ORM. I want to pass all the data and all the rules to javascript, so the javascript application can remain 'disconnected'.
What I need is a way of having users define these rules and combinations of rules, and storing them in a database. Then when a page is loaded, this data and all the rules are retrieved and placed onto the page. The definition of the rules is the important piece of the puzzle.
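To make the shape of the rules concrete, a stored rule might look roughly like this (model and field names are assumptions for illustration only):
from django.db import models

class FilterRule(models.Model):
    # One "dataobject.key operator value" rule, editable by end-users.
    key = models.CharField(max_length=100)       # e.g. "status"
    operator = models.CharField(max_length=10)   # e.g. "==", "!=", "in"
    value = models.CharField(max_length=255)     # allowed value or free text
    # Combining rules with and/or could be modelled via a parent rule-set table.

    def as_dict(self):
        # Serialized form sent to the javascript layer along with the data.
        return {'key': self.key, 'operator': self.operator, 'value': self.value}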
Django filters can easily be piled on top of each other.
initial_query_set = SomeModel.objects.filter( ... some defaults ... )

if got_some_option_from_javascript:
    query_set = initial_query_set.filter( this )
else:
    query_set = initial_query_set

if got_some_other_option:
    query_set = query_set.exclude( that )

if yet_more:
    query_set = query_set.filter( and on and on )
That's the standard approach. If you're not talking about Django ORM query filters, please update your question.
