Django: Admin - order results based on regular expression - python

I have a problem where I want to order model rows in Django admin based on a number within a string and ignore the letters - like XX0345XX, X0346XXX, XXX0347XX. I'm thinking that I should probably use a regular expression for that. I have an SQL query (PostgreSQL) that returns what I want:
select * from mytable order by substring(my_field, '\d+')::int DESC
But I'm having trouble applying it to get the same result in the Django admin get_queryset().
I've tried doing something like:
def get_queryset(self, request):
    return Model.objects.raw(r"select * from mytable order by substring(my_field, '\d+')::int DESC")
but the problem is that I'm not returning the right type this way. Model.objects.raw('...') returns RawQuerySet, but get_queryset() should return QuerySet instance not RawQuerySet one, so I can't do it this way.
Any suggestions on how to solve this problem? Thanks!

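You can use the QuerySet.extra() method to add raw SQL fragments to a query and still get a QuerySet back, rather than a RawQuerySet.
This example is adapted from another answer:
from django.db import models

class CustomManager(models.Manager):
    # Method name is illustrative; index_id is supplied by the caller
    def for_index(self, index_id):
        qs = self.get_queryset()
        # Pass index_id through params rather than string formatting to avoid SQL injection
        sql = "myapp_model_b.id IN (SELECT UNNEST(myapp_model_a.pk_values) FROM myapp_model_a WHERE myapp_model_a.id = %s)"
        return qs.extra(where=[sql], params=[index_id])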

As pointed out by @Azy_Crw4282, you can use QuerySet.extra() to run raw SQL queries and still have the results returned as a QuerySet instance.
Here's how I managed to do a regex based "ORDER BY" on objects in admin view, based on a number within a string. I.e. order fields like XX0345XX, X0346XXX, XXX0347XX, based on the number they contain - basically get a substring that matches '\d+' regular expression.
Just override the get_queryset() function in your admin.ModelAdmin class.
Solution:
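def get_queryset(self, request):
    # Extract the first run of digits from booking_reference and cast it to an integer
    sql_query = r"CAST(substring(booking_reference, '\d+') as int)"
    # Expose it as an extra column and sort on it, largest number first
    bookings = Booking.objects.extra(select={'book_ref_digits': sql_query}).order_by('-book_ref_digits')
    return bookings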
As far as I understand, it adds a new temporary field, e.g. book_ref_digits in the QuerySet object and then you can do .order_by() on that.
Note: I'm using an older version of Django (1.10.5).
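To check which order to expect before touching the admin, the same rule can be sketched in plain Python. This is only an illustration of what the SQL expression sorts on, not part of the Django solution; the function names are made up for the example, and strings without any digits are placed last here, which the database may handle differently.

```python
import re


def first_number(reference):
    """Return the first run of digits in `reference` as an int, or None."""
    match = re.search(r"\d+", reference)
    return int(match.group()) if match else None


def order_by_number_desc(references):
    """Sort strings by their embedded number, largest first."""
    # Strings with no digits sort after everything else (a choice made for this sketch).
    return sorted(
        references,
        key=lambda ref: (first_number(ref) is not None, first_number(ref) or 0),
        reverse=True,
    )


refs = ["X0346XXX", "XXX0347XX", "XX0345XX"]
print(order_by_number_desc(refs))  # ['XXX0347XX', 'X0346XXX', 'XX0345XX']
```

If the admin list comes back in a different order than this for the same references, the SQL fragment is the first thing to check.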

Related

How to make case-insensitive search in Django with Postgres

I have a project with Django, Django REST Framework and PostgreSQL.
My goal is to implement a search that supports:
logical operators (AND, OR, NOT)
case-insensitive matching
a * operator for prefix search, e.g. som* should match some, somali and so on
My first attempt was to use Postgres full-text search with
search_type='websearch'
That works well but has no * operator.
So I switched to the raw search type, and my search view now looks like this:
from django.contrib.postgres.search import SearchQuery, SearchVector
from rest_framework import generics

class DataSearch(generics.ListAPIView):
    serializer_class = DataSerializer

    def get_queryset(self):
        q = self.request.query_params.get('query')
        if q:
            vector = SearchVector('research', 'data', 'research__name')
            # search_type='raw' lets q use tsquery operators such as & | ! and :*
            query = SearchQuery(q, search_type='raw')
            queryset = Data.objects.annotate(search=vector).filter(search=query)
        else:
            queryset = Data.objects.none()
        return queryset
The logical operators work, and prefix search works with :*, but I don't know how to make it case insensitive.
Is it possible to make this type of search case insensitive? Or is there another option?
Please try:
SearchQuery(q, search_type='raw', config='english')
OR
SearchQuery(q, config='pg_catalog.simple', search_type='raw')
This article contains more details about search in Django https://docs.djangoproject.com/en/4.0/ref/contrib/postgres/search/
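Since the raw search type expects tsquery syntax, the user's input has to be translated before it reaches SearchQuery. A minimal sketch of such a translation, assuming users type the words AND, OR and NOT and a trailing * for prefixes; to_raw_tsquery is a hypothetical helper, not something Django provides, and it does no validation or escaping of special characters:

```python
def to_raw_tsquery(user_query):
    """Translate input like 'som* AND NOT Tea' into tsquery syntax."""
    operators = {"and": "&", "or": "|", "not": "!"}
    parts = []
    for token in user_query.split():
        lowered = token.lower()  # lowercase every term on the Python side
        if lowered in operators:
            parts.append(operators[lowered])
        elif lowered.endswith("*"):
            # 'som*' becomes the tsquery prefix match 'som:*'
            parts.append(lowered[:-1] + ":*")
        else:
            parts.append(lowered)
    return " ".join(parts)


print(to_raw_tsquery("Som* AND NOT Tea"))  # som:* & ! tea
```

The result would then be passed as q to SearchQuery(q, search_type='raw', config=...) as in the suggestions above.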

Is it possible to return a single value from Q in Django?

I am using the Django ORM and I would like to do something like:
self.queryset.annotate(cupcake_name=Q(baked_goods__frosted_goods__cupcakes__cupcake_id='xxx'))
but return the individual cupcake name field value somehow so I can serve it as a data attribute.
Assuming you have the cupcake id and you want to do this in a single query you can use a Subquery:
from django.db.models import Subquery
self.queryset.annotate(
    cupcake_name=Subquery(Cupcake.objects.filter(id='xxx').values('name')),
)
See the Subquery docs here if you need to link the subquery to the queryset: https://docs.djangoproject.com/en/3.1/ref/models/expressions/#subquery-expressions
If you don't mind making two queries it's a bit more clear to use the literal Value expression:
from django.db.models import Value
cupcake_name = Cupcake.objects.get(id='xxx').name
self.queryset.annotate(
    cupcake_name=Value(cupcake_name),
)

Django django-filter django-tables2 limit query results

I am trying to limit the number of rows displayed in a table that is filtered using django-filter and built by django-tables2. I did not find anything here or in the docs (I don't want to use pagination).
I know I can slice the queryset but I also want to have the table sortable and can't figure out how to do both.
This is my views.py:
def filtered_table(request):
    f = itemFilter(request.GET, queryset=ItemModel.objects.all())
    has_filter = any(field in request.GET for field in set(f.get_fields()))
    table = None
    if has_filter:
        if not request.GET.get('sort'):
            table = ItemTable(f.qs, order_by='-timestamp')
        else:
            table = ItemTable(f.qs, order_by=request.GET.get('sort'))
    return render(request, 'itemlist/filteredlist.html', {
        'itemtable': table,
        'filter': f,
    })
I tried to slice the queryset before passing it to the table:
table = ItemTable(f.qs.order_by('-timestamp')[:20])
table = ItemTable(f.qs.order_by(request.GET.get('sort'))[:20])
Resulting in:
AssertionError: Cannot reorder a query once a slice has been taken.
Because django-tables2 calls .order_by() again.
Is there a way to configure django-tables2 or manipulate the queryset to limit the displayed rows?
Update:
I tried as suggested, which does not work with my database:
This version of MariaDB doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
With a slight change this works for me:
f_qs = ItemModel.objects.filter(id__in=list(f_qs_ids))
I think this will now do two queries on the database but that is not a problem for me.
Thank you for answering and pointing me in the right direction. Much appreciated!
This is a bit of a roundabout way to get there, but you could take your original QuerySet (f.qs), slice off the ids of the objects you want, and then re-filter the original QuerySet with those ids.
# get the 20 ids for the objects you want
f_qs_ids = f.qs.order_by(request.GET.get('sort')).values_list("id", flat=True)[:20]
# create a new queryset by also filtering on the set of 20 ids
f_qs = f.qs.filter(id__in=f_qs_ids)
# pass a legitimate queryset to the table
table = ItemTable(f_qs)

Does django core pagination retrieve all data first?

I am using Django 1.5 and need to split data into pages. I read the pagination docs, but I am not sure whether the Paginator retrieves all the data first. Since I have a large table, it would be better to use something like LIMIT. Thanks.
EDIT
I am using queryset in ModelManager.
example:
class KeywordManager(models.Manager):
    def currentkeyword(self, kw, bd, ed):
        wholeres = super(KeywordManager, self).get_query_set() \
            .values("sc", "begindate", "enddate") \
            .filter(keyword=kw, begindate__gte=bd, enddate__lte=ed) \
            .order_by('enddate')
        return wholeres
First, a queryset is lazy: Django only hits the database when you actually request the data. If you call a list-style function such as len() on a queryset, the whole queryset is evaluated and Django retrieves all the rows.
If you pass a queryset to the Paginator, it does not retrieve all the data. As the docs say, the Paginator calls the queryset's .count() method instead of converting it to a list and calling len(), and each page is then fetched by slicing the queryset, which becomes a LIMIT/OFFSET query.
If your data is not coming from the database, then yes - Paginator will have to load all the information first in order to determine how to "split" it.
If it is, and you're simply interacting with the database through Django's auto-generated SQL, then the Paginator performs a query to determine the number of items (i.e. an SQL COUNT()) and uses the page size you supplied to determine how many pages to generate. Example: count() returns 43 and you want pages of 10 results - the number of pages is ceil(43 / 10) = 5.
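The arithmetic can be sketched in a few lines of plain Python. This is only an illustration of the page count and of the slice a given page covers; it ignores Paginator options such as orphans, and the function names are made up for the example.

```python
from math import ceil


def page_count(total_items, per_page):
    """Number of pages needed to show total_items, per_page at a time."""
    return ceil(total_items / per_page)


def page_bounds(page_number, per_page, total_items):
    """Return the (start, end) slice indexes for a 1-based page number."""
    start = (page_number - 1) * per_page
    end = min(start + per_page, total_items)
    return start, end


print(page_count(43, 10))      # 5
print(page_bounds(5, 10, 43))  # (40, 43)
```

Slicing a queryset with those bounds, e.g. queryset[40:43], is what lets the database return only the rows for that page.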

Django ORM: Joining QuerySets

I'm trying to use the Django ORM for a task that requires a JOIN in SQL. I
already have a workaround that accomplishes the same task with multiple queries
and some off-DB processing, but I'm not satisfied by the runtime complexity.
First, I'd like to give you a short introduction to the relevant part of my
model. After that, I'll explain the task in English, SQL and (inefficient) Django ORM.
The Model
In my CMS model, posts are multi-language: For each post and each language, there can be one instance of the post's content. Also, when editing posts, I don't UPDATE, but INSERT new versions of them.
So, PostContent is unique on post, language and version. Here's the class:
class PostContent(models.Model):
    """ contains all versions of a post, in all languages. """
    language = models.ForeignKey(Language)
    # the Post object itself only contains slug and id.
    post = models.ForeignKey(Post)
    version = models.IntegerField(default=0)
    # further metadata and content left out

    class Meta:
        unique_together = (("post", "language", "version"),)
The Task in SQL
And this is the task: I'd like to get a list of the most recent versions of all posts in each language, using the ORM. In SQL, this translates to a JOIN on a subquery that does GROUP BY and MAX to get the maximum of version for each unique pair of post and language. The perfect answer to this question would be a number of ORM calls that produce the following SQL statement:
SELECT
    id,
    post_id,
    version,
    v
FROM
    cms_postcontent,
    (SELECT
        post_id as p,
        max(version) as v,
        language_id as l
    FROM
        cms_postcontent
    GROUP BY
        post_id,
        language_id
    ) as maxv
WHERE
    post_id=p
    AND version=v
    AND language_id=l;
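To make the expected result concrete, here is a plain-Python sketch of what that statement computes. It is not the ORM solution I'm looking for, just the grouping logic; the sample rows and the function name are made up for illustration.

```python
def latest_versions(rows):
    """Keep, for each (post_id, language_id) pair, the row with the highest version."""
    latest = {}
    for row in rows:
        key = (row["post_id"], row["language_id"])
        if key not in latest or row["version"] > latest[key]["version"]:
            latest[key] = row
    return list(latest.values())


# Hypothetical table contents: post 1 has two versions in language 1
rows = [
    {"id": 1, "post_id": 1, "language_id": 1, "version": 0},
    {"id": 2, "post_id": 1, "language_id": 1, "version": 1},
    {"id": 3, "post_id": 1, "language_id": 2, "version": 0},
    {"id": 4, "post_id": 2, "language_id": 1, "version": 0},
]
print([row["id"] for row in latest_versions(rows)])  # [2, 3, 4]
```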
Solution in Django
My current solution using the Django ORM does not produce such a JOIN, but two separate SQL
queries, and one of those queries can become very large. I first execute the subquery (the inner SELECT from above):
maxv = PostContent.objects.values('post','language').annotate(
    max_version=Max('version'))
Now, instead of joining maxv, I explicitly ask for every single post in maxv, by
filtering PostContent.objects.all() for each tuple of post, language, max_version. The resulting SQL looks like
SELECT * FROM PostContent WHERE
post=P1 and language=L1 and version=V1
OR post=P2 and language=L2 and version=V2
OR ...;
In Django:
from functools import reduce
from operator import or_
from django.db.models import Q

# One AND-ed condition per (post, language, max_version) tuple
conjunc = [
    Q(version=pc['max_version'], post=pc['post'], language=pc['language'])
    for pc in maxv
]
# OR all the conditions together into a single filter
result = PostContent.objects.filter(reduce(or_, conjunc))
If maxv is sufficiently small, e.g. when retrieving a single post, this might be
a good solution, but the size of the query and the time to create it grow linearly with
the number of posts. The complexity of parsing the query is also at least linear.
Is there a better way to do this, apart from using raw SQL?
You can join (in the sense of union) querysets with the | operator, as long as the querysets query the same model.
However, it sounds like you want something like PostContent.objects.order_by('version').distinct('language'); as you can't quite do that in 1.3.1, consider using values in combination with distinct() to get the effect you need.
