Order Query results by startswith match - python

We are doing a site search where we search a field with an icontains. It's working great, but the results that come up don't have the most relevant result at the top.
An example is searching for "Game of Thrones". If the search is "Game of", the first result could be "Crazy Game of..." and the second is "Game of Thrones"
Essentially, I'd like to search with icontains, but order by startswith. Any ideas?

Django added new features since this question was asked, and now handles this use case natively.
You can order results by an arbitrary comparison using the following features in Django's QuerySet API:
Q() Object: encapsulates a reusable query expression. Our expression compares the object's field against our "Game of" search value.
ExpressionWrapper(): wraps a query expression, in order to control the ModelField type with output_field. In this case, the type is a Boolean.
annotate(): dynamically add a ModelField to each QuerySet object, based on the result of a query expression. We want to add a True/False field to sort on.
order_by(): order the QuerySet objects by specified fields. We want to order by the annotated field value, in reverse, so that True values are displayed first.
from django.db.models import Q, ExpressionWrapper, BooleanField
....
# Your original setup.
search_term = 'Game of'
data = MyModel.objects.filter(name__icontains=search_term)
# Encapsulate the comparison expression.
expression = Q(name__startswith=search_term)
# Wrap the expression to specify the field type.
is_match = ExpressionWrapper(expression, output_field=BooleanField())
# Annotate each object with the comparison.
data = data.annotate(my_field=is_match)
# Order by the annotated field in reverse, so `True` is first (0 < 1).
data = data.order_by('-my_field')
....
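The effect of that annotation can be illustrated without a database. The following is only a plain-Python sketch of the same idea (filter on a case-insensitive substring match, then sort so that prefix matches come first). The sample titles and the helper name `search_titles` are made up for illustration, and the prefix check here ignores case, whereas the `startswith` lookup above is the case-sensitive variant (Django also offers `istartswith`).

```python
def search_titles(titles, term):
    """Return titles containing `term`, with prefix matches first."""
    needle = term.lower()
    # Rough equivalent of filter(name__icontains=term).
    matches = [t for t in titles if needle in t.lower()]
    # sorted() is stable, so titles keep their relative order within each group.
    # The key is False (0) for prefix matches, so they sort before True (1).
    return sorted(matches, key=lambda t: not t.lower().startswith(needle))


titles = ["Crazy Game of Poker", "Game of Thrones", "The Game", "Game of Life"]
results = search_titles(titles, "Game of")
print(results)
```

Here the sort key plays the role of the annotated Boolean field: it only decides which group a row belongs to, and the order inside each group is left alone.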

The ordering you're describing is subjective and there's no data like that that comes out of the database (that I'm aware of). If a feature like this is important, you might want to look into a search engine like Solr or Sphinx where you can configure how relevancy scores are determined.

Related

Filtering QuerySet in Django where a field has a specific value or is null

What's the right way to do this in Django? I want to filter a queryset where a field has a specific value or is null. For example, filtering on a field named "type" should return the objects whose type has a specific value such as "active", as well as the objects whose type is null.
You can work with Q objects [Django-doc] for this:
from django.db.models import Q
MyModel.objects.filter(
    Q(type='active') | Q(type=None)
)
Here we combine the two Q objects, which act as filters with a bitwise or operator |, and this means that we create a filter that acts as a logical or between the two filter conditions.
I think this is the simplest technique to achieve your aim.
Model.objects.filter(Q(type='active') | Q(type__isnull=True))
There is a whole section in the Django documentation on model field lookups. But if you don't want to dig through it, here are some examples:
# Obtaining all non-null fields
Model.objects.filter(some_field__isnull=False)
# Obtaining all rows with field value greater than some value
Model.objects.filter(some_field__isnull=False, some_field__gt=value)
# Actually the better way of writing complex queries is using Q objects
from django.db.models import Q
Model.objects.filter(Q(some_field__isnull=False) & Q(some_field__gt=value)) # The same thing as the previous example
# Using Q objects in this situation may seem unnecessary, and it largely is,
# but to me it improves the readability of the query
There are a lot more things that you may want to do with your field, but I think that I described the most common use case.
Edit
As written in Willem's answer, to obtain rows that have a value of "active" or null, you should write a query like this:
Model.objects.filter(Q(type='active') | Q(type__isnull=True))

Django - calling attribute from queryset with a string

I'm trying to loop over different query sets while not repeating myself too much and have encountered a problem using the queryset class.
This is not necessarily completely a Django-problem.
What I'm trying to do is use my key list, which corresponds to a Django model's column names, to create a list of the data from those columns. What I want to do is something like this:
if needthisdata==1:
    needdata=['column1', 'column2', 'column3']
else:
    needdata=['column1', 'column4', 'column7']
entry=djangomodel.get.all().filter(identifier='id')
dictitems=[]
for n in range(0, len(needdata)):
    if n==0:
        dictitems=[entry.needdata[n]]
    else:
        dictitems.append(entry.needdata[n])
This of course doesn't work, since the queryset doesn't have a needdata attribute. Is there some way to access an attribute of a class by a string name in this way?
A valid Django statement to obtain a single entry
First of all, there are some semantic problems here:
itentifier should probably be identifier, id, or pk;
you use .all immediately instead of first obtaining a manager (probably .objects); and
you use a .filter(..) on the queryset to filter on an identifier, but usually this should be a .get(..), since a filter can return zero, one or more results in an iterable.
entry = djangomodel.objects.get(id=some_id)
So now we obtain a single entry, but that of course does not resolve obtaining the columns.
If all elements are real Django columns
In case the columns are real Django fields (so no @property attributes, etc.) then we can use values_list, and perform a list(..) constructor on it:
dictitems = list(djangomodel.objects.values_list(*needdata).get(id=some_id))
In case some elements are @property attributes
In case not all those fields are real Django fields, then we can use attrgetter instead:
from operator import attrgetter
dictitems = list(attrgetter(*needdata)(djangomodel.objects.get(id=some_id)))
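To see how `attrgetter` behaves with a mix of plain attributes and properties, here is a small sketch that uses an ordinary Python class in place of a model instance (the `Entry` class and its fields are hypothetical). One detail worth noting: `attrgetter` returns a tuple only when given two or more names. With a single name it returns the bare value, so a `getattr` list comprehension is the more uniform choice if `needdata` can have length one.

```python
from operator import attrgetter


class Entry:
    """Hypothetical stand-in for a model instance."""

    def __init__(self, column1, column2):
        self.column1 = column1
        self.column2 = column2

    @property
    def total(self):
        # Computed attribute, not a stored column.
        return self.column1 + self.column2


entry = Entry(2, 3)
needdata = ["column1", "column2", "total"]

# With several names, attrgetter returns a tuple of values.
via_attrgetter = list(attrgetter(*needdata)(entry))

# getattr works the same way for any number of names, including one.
via_getattr = [getattr(entry, name) for name in needdata]

print(via_attrgetter, via_getattr)
```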

Using 'like', '<=' and '=' operators when building an SQLAlchemy query filter from dict

I currently have a python dictionary that is created from the data that a user submits through a form. The form fields are optional, but if they are all filled out, then the dictionary (dict_filter) might look like this:
{"item_type": "keyboard", "location": "storage1"}
I can then query the database as shown:
items = Item.query.filter_by(**dict_filter).all()
This works fine and returns all the keyboard items that are currently in storage1 as desired.
However, I want to add two new date fields to the form such that a completely filled out form would result in a dictionary similar to the following:
{"item_type": "keyboard", "location": "storage1", "purchase_date": 2017-02-18, "next_maintenance": 2018-02-18}
Based on this new dict, I would like to do the following:
First, use like() when filtering the item_type. I want this so that if a user searches for keyboard then the results will also include items like mechanical keyboard for example. I know I can do this individually as shown:
val = form.item_type.data
items = Item.query.filter(getattr(Item, 'item_type').like("%%%s%%" % val)).all()
Second, use the '<=' (less than or equal to) operator when dealing with dates such that if, for example, a user enters a purchase_date in the form, then all the items returned will have a purchase_date before or on the same date as entered by the user. I know I can do this individually as shown:
items = Item.query.filter(Item.purchase_date <= form.purchase_date.data)
Note that if both dates are filled out in the form, then the filter should check both dates as shown:
items = Item.query.filter(and_(Item.purchase_date <= form.purchase_date.data, Item.next_maintenance <= form.next_maintenance.data))
Third, if the location field is filled out in the form, then the query should check for items with matching locations (as it currently does with the dict). I know I can do this using a dict as I am currently doing:
dict_filter = {"location": "storage1"}
items = Item.query.filter_by(**dict_filter).all()
or
items = Item.query.filter_by(location=form.location.data).all()
The greatest challenge that I have is that since the form fields are optional I have no way of knowing beforehand what combination of filter conditions I'll have to apply. Therefore, it may be possible that for one user's input, I'll have to search the db for all screen items in office1 with next_maintenance date before yyyy-mm-dd while for another user's input I'll have to search the db for all items in all location regardless of next_maintenance date with a purchase_date before yyyy-mm-dd, and so on. This is precisely why I'm currently using a dict as a filter; it allows me to check if a form field was completed and if it was, then I add it to the dict and filter only based on form fields with input.
With all that being said, how can I combine all three filters discussed above (like, <=, =) into one while also accounting for the fact that not all three filters may always be necessary?
This was not intended to be an answer but a comment. But apparently I can't use code block in a comment.
In case you don't know, you can use multiple filter or filter_by by chaining them together like this:
Item.query.filter(Item.a < 5).filter(Item.b > 6).all()
Therefore you can temporarily store the return value (it is actually a Query object) in a variable and use it later.
q = Item.query.filter(Item.a < 5)
if some_condition_value:
    q = q.filter(Item.b > 6)
items = q.all()
You can apply your conditions to the Query object and then you can have optional filters.
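One way to decide which operator goes with which field is a small dispatch table keyed by field name. The sketch below shows the idea in plain Python over a list of dicts, so it runs without a database. The field names come from the question, while `OPERATORS`, `build_predicates` and the sample rows are made up for illustration. With SQLAlchemy, the analogous step would be to build column expressions (for example `Item.item_type.like(...)` or `Item.purchase_date <= value`) and chain them onto the query with `filter()` as shown above.

```python
import operator
from datetime import date

# Maps each form field to the comparison it should use (hypothetical table).
OPERATORS = {
    "item_type": lambda stored, wanted: wanted in stored,  # roughly LIKE '%x%'
    "location": operator.eq,                               # exact match
    "purchase_date": operator.le,                          # stored <= wanted
    "next_maintenance": operator.le,
}


def build_predicates(form_data):
    """Return one predicate per form field that was actually filled in."""
    predicates = []
    for field, wanted in form_data.items():
        if wanted in (None, ""):
            continue  # optional field left empty: no condition
        compare = OPERATORS[field]
        # Bind loop variables as defaults to avoid late-binding surprises.
        predicates.append(
            lambda row, f=field, w=wanted, c=compare: c(row[f], w)
        )
    return predicates


rows = [
    {"item_type": "mechanical keyboard", "location": "storage1",
     "purchase_date": date(2017, 1, 5), "next_maintenance": date(2018, 1, 5)},
    {"item_type": "keyboard", "location": "office1",
     "purchase_date": date(2017, 3, 1), "next_maintenance": date(2018, 3, 1)},
    {"item_type": "screen", "location": "storage1",
     "purchase_date": date(2016, 6, 1), "next_maintenance": date(2017, 6, 1)},
]

form_data = {"item_type": "keyboard", "location": "",
             "purchase_date": date(2017, 2, 18), "next_maintenance": None}
predicates = build_predicates(form_data)
items = [row for row in rows if all(p(row) for p in predicates)]
print(items)
```

Only the fields that were filled in contribute a condition, which is the same effect as conditionally chaining `filter()` calls.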

Djapian - filtering results

I use Djapian to search for object by keywords, but I want to be able to filter results. It would be nice to use Django's QuerySet API for this, for example:
if query.strip():
    results = Model.indexer.search(query).prefetch()
else:
    results = Model.objects.all()
results = results.filter(somefield__lt=somevalue)
return results
But Djapian returns a ResultSet of Hit objects, not Model objects. I can of course filter the objects "by hand", in Python, but it's not realistic in case of filtering all objects (when query is empty) - I would have to retrieve the whole table from database.
Am I out of luck with using Djapian for this?
I went through its source and found that Djapian has a filter method that can be applied to its results. I have just tried the below code and it seems to be working.
My indexer is as follows:
class MarketIndexer(djapian.Indexer):
    fields = ['name', 'description', 'tags_string', 'state']
    tags = [('state', 'state'),]
Here is how I filter results (never mind the first line that does stuff for wildcard usage):
objects = model.indexer.search(q_wc).flags(djapian.resultset.xapian.QueryParser.FLAG_WILDCARD).prefetch()
objects = objects.filter(state=1)
When executed, it now brings Markets that have their state equal to "1".
I don't know Djapian, but I am familiar with Xapian. In Xapian you can filter the results with a MatchDecider.
The decision function of the match decider gets called on every document which matches the search criteria so it's not a good idea to do a database query for every document here, but you can of course access the values of the document.
For example at ubuntuusers.de we have a xapian database which contains blog posts, forum posts, planet entries, wiki entries and so on and each document in the xapian database has some additional access information stored as value. After the query, an AuthMatchDecider filters the potential documents and returns the filtered MSet which are then displayed to the user.
If the decision procedure is as simple as somefield < somevalue, you could also simply add the value of somefield to the values of the document (using the sortable_serialize function provided by xapian) and add (using OP_FILTER) an OP_VALUE_RANGE query to the original query.

In Django, how does one filter a QuerySet with dynamic field lookups?

Given a class:
from django.db import models
class Person(models.Model):
    name = models.CharField(max_length=20)
Is it possible, and if so how, to have a QuerySet that filters based on dynamic arguments? For example:
# Instead of:
Person.objects.filter(name__startswith='B')
# ... and:
Person.objects.filter(name__endswith='B')
# ... is there some way, given:
filter_by = '{0}__{1}'.format('name', 'startswith')
filter_value = 'B'
# ... that you can run the equivalent of this?
Person.objects.filter(filter_by=filter_value)
# ... which will throw an exception, since `filter_by` is not
# an attribute of `Person`.
Python's argument expansion may be used to solve this problem:
kwargs = {
    '{0}__{1}'.format('name', 'startswith'): 'A',
    '{0}__{1}'.format('name', 'endswith'): 'Z'
}
Person.objects.filter(**kwargs)
This is a very common and useful Python idiom.
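The mechanics of that expansion can be checked without Django. In this sketch, `fake_filter` is a hypothetical stand-in that just returns the keyword arguments it receives, and `make_lookup` is an illustrative helper, not part of Django.

```python
def make_lookup(field, lookup, value):
    """Build a one-item kwargs dict such as {'name__startswith': 'B'}."""
    return {"{0}__{1}".format(field, lookup): value}


def fake_filter(**kwargs):
    # Stand-in for QuerySet.filter(); it only echoes what it was called with.
    return kwargs


kwargs = {}
kwargs.update(make_lookup("name", "startswith", "A"))
kwargs.update(make_lookup("name", "endswith", "Z"))

# ** unpacks the dict into keyword arguments, so the keys become parameter names.
received = fake_filter(**kwargs)
print(received)
```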
A simplified example:
In a Django survey app, I wanted an HTML select list showing registered users. But because we have 5000 registered users, I needed a way to filter that list based on query criteria (such as just people who completed a certain workshop). In order for the survey element to be re-usable, I needed for the person creating the survey question to be able to attach those criteria to that question (don't want to hard-code the query into the app).
The solution I came up with isn't 100% user friendly (requires help from a tech person to create the query) but it does solve the problem. When creating the question, the editor can enter a dictionary into a custom field, e.g.:
{'is_staff':True,'last_name__startswith':'A',}
That string is stored in the database. In the view code, it comes back in as self.question.custom_query . The value of that is a string that looks like a dictionary. We turn it back into a real dictionary with eval() and then stuff it into the queryset with **kwargs:
kwargs = eval(self.question.custom_query)
user_list = User.objects.filter(**kwargs).order_by("last_name")
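Because `eval()` executes arbitrary Python, a stored string like this is risky if anyone untrusted can edit it. A more cautious sketch, assuming the stored value only ever contains literals (strings, numbers, booleans, `None`), is to parse it with `ast.literal_eval` from the standard library. The helper name `parse_custom_query` is made up for illustration.

```python
import ast


def parse_custom_query(raw):
    """Parse a dict literal stored as text, rejecting anything that is not a literal."""
    try:
        parsed = ast.literal_eval(raw)
    except (ValueError, SyntaxError) as exc:
        raise ValueError("custom query is not a valid literal: %r" % raw) from exc
    if not isinstance(parsed, dict):
        raise ValueError("custom query must be a dict")
    return parsed


kwargs = parse_custom_query("{'is_staff':True,'last_name__startswith':'A',}")
print(kwargs)
```

The resulting dict can then be passed to `filter(**kwargs)` exactly as before.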
To extend the previous answer, which drew some requests for further code, I am adding some working code that I use with Q. Let's say that a request may or may not include filters on fields like:
publisher_id
date_from
date_until
Those fields can appear in the query, but they may also be absent.
This is how I am building filters based on those fields on an aggregated query that cannot be further filtered after the initial queryset execution:
# prepare filters to apply to queryset
filters = {}
if publisher_id:
    filters['publisher_id'] = publisher_id
if date_from:
    filters['metric_date__gte'] = date_from
if date_until:
    filters['metric_date__lte'] = date_until
filter_q = Q(**filters)
queryset = Something.objects.filter(filter_q)...
Hope this helps since I've spent quite some time to dig this up.
Edit:
As an additional benefit, you can use lists too. For the previous example, if instead of publisher_id you have a list called publisher_ids, then you could use this piece of code:
if publisher_ids:
    filters['publisher_id__in'] = publisher_ids
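A side note on truthiness checks such as `if publisher_id:`: they also skip legitimate falsy values like `0`. Below is a small sketch that tests for `None` instead. The function name `build_filters` and the assumption that missing parameters arrive as `None` are for illustration only.

```python
def build_filters(publisher_id=None, date_from=None, date_until=None):
    """Collect lookup kwargs only for the parameters that were supplied."""
    candidates = {
        "publisher_id": publisher_id,
        "metric_date__gte": date_from,
        "metric_date__lte": date_until,
    }
    # Keep 0 and other falsy-but-valid values; drop only the missing ones.
    return {key: value for key, value in candidates.items() if value is not None}


filters = build_filters(publisher_id=0, date_from="2020-01-01")
print(filters)
```

The returned dict can be wrapped as `Q(**filters)` or passed directly as `filter(**filters)`.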
django.db.models.Q is exactly what you want, in a Django way.
This looks much more understandable to me:
kwargs = {
    'name__startswith': 'A',
    'name__endswith': 'Z',
    # (Add more filters here)
}
Person.objects.filter(**kwargs)
A really complex search form usually indicates that a simpler model is trying to dig its way out.
How, exactly, do you expect to get the values for the column name and operation?
Where do you get the values of 'name' and 'startswith'?
filter_by = '%s__%s' % ('name', 'startswith')
A "search" form? You're going to -- what? -- pick the name from a list of names? Pick the operation from a list of operations? While open-ended, most people find this confusing and hard-to-use.
How many columns have such filters? 6? 12? 18?
A few? A complex pick-list doesn't make sense. A few fields and a few if-statements make sense.
A large number? Your model doesn't sound right. It sounds like the "field" is actually a key to a row in another table, not a column.
Specific filter buttons. Wait... That's the way the Django admin works. Specific filters are turned into buttons. And the same analysis as above applies. A few filters make sense. A large number of filters usually means a kind of first normal form violation.
A lot of similar fields often means there should have been more rows and fewer fields.
