Django "NULLS LAST" for creating Indexes

Django 1.11 and later allow F-expressions to add a NULLS LAST option to a query's ordering:
queryset = Person.objects.all().order_by(F('wealth').desc(nulls_last=True))
However, we want to use this functionality for creating Indexes. A standard Django Index inside a model definition:
indexes = [
    models.Index(fields=['-wealth']),
]
I tried something along the lines of:
indexes = [
    models.Index(fields=[models.F('wealth').desc(nulls_last=True)]),
]
which returns AttributeError: 'OrderBy' object has no attribute 'startswith'.
Is this possible to do using F-expressions in Django?

No, unfortunately, that is currently (Django <=2.1) not possible. If you look at the source of models.Index, you will see that it assumes that the fields argument contains field names (strings, optionally prefixed with '-') and nothing else.
As a workaround, you could manually create your index with raw SQL in a migration.
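A minimal sketch of what that raw-SQL workaround might look like, assuming PostgreSQL and a table named myapp_person (the table name and the index name are hypothetical; Django derives the real table name from the app label and model). The helper only builds the statements; in a migration they would be handed to migrations.RunSQL as the forward and reverse SQL.

```python
def nulls_last_index_sql(table, column, index_name):
    """Return (forward, reverse) SQL for a DESC NULLS LAST index (PostgreSQL syntax)."""
    forward = (
        f'CREATE INDEX "{index_name}" '
        f'ON "{table}" ("{column}" DESC NULLS LAST);'
    )
    reverse = f'DROP INDEX "{index_name}";'
    return forward, reverse


# Hypothetical names, for illustration only.
forward_sql, reverse_sql = nulls_last_index_sql(
    "myapp_person", "wealth", "person_wealth_desc_nulls_last_idx"
)

# Inside a migration's operations list this would be used as:
#     migrations.RunSQL(forward_sql, reverse_sql)
```

Supplying the reverse statement keeps the migration reversible, which a bare forward-only RunSQL would not be.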

Fortunately, it has been possible to create functional indexes since Django 3.2. The example you posted has to be adjusted a little by moving the field from fields to *expressions and by adding a name, which is required when using expressions.
from django.db.models import F, Index
indexes = [
    Index(F('wealth').desc(nulls_last=True), name='wealth_desc_idx'),
]
https://docs.djangoproject.com/en/3.2/ref/models/indexes/#expressions

Related

How do you incrementally add lexeme/s to an existing Django SearchVectorField document value through the ORM?

You can add to an existing Postgresql tsvector value using ||, for example:
UPDATE acme_table
SET my_tsvector = my_tsvector ||
to_tsvector('english', 'some new words to add to existing ones')
WHERE id = 1234;
Is there any way to access this functionality via the Django ORM? I.e. incrementally add to an existing SearchVectorField value rather than reconstruct from scratch?
The issue I'm having is the SearchVectorField property returns the tsvector as a string. So when I use the || operator as +, eg:
from django.contrib.postgres.search import SearchVector
instance.my_tsvector_prop += SearchVector(
    ["new", "fields"],
    weight="A",
    config='english'
)
I get the error:
TypeError: SearchVector can only be combined with other SearchVector instances, got str.
Because:
type(instance.my_tsvector_prop) == str
A fix to this open Django bug whereby a SearchVectorField property returns a SearchVector instance would probably enable this, if possible. (Although less efficient than combining in the database. In our case the update will run asynchronously so performance is not too important.)
I also tried combining in the database with an F() expression:
MyModel.objects.filter(pk=1234).update(
    my_tsvector_prop=F("my_tsvector_prop") + SearchVector(
        ["new_field_name"],
        weight="A",
        config='english',
    )
)
Returns:
FieldError: Cannot resolve expression type, unknown output_field
Another solution would be to run a raw SQL UPDATE, although I'd rather do it through the Django ORM if possible as our tsvector fields often reference values many joins away, so it'd be nice to find a sustainable solution.
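If the raw SQL route is acceptable, one way to keep it contained is a small helper around a DB-API cursor. This is a sketch only: it reuses the table and column names from the SQL above, assumes the %s parameter style used by Django's connection.cursor() and psycopg2, and the function name is made up for illustration.

```python
APPEND_TSVECTOR_SQL = (
    "UPDATE acme_table "
    "SET my_tsvector = my_tsvector || to_tsvector(%s::regconfig, %s) "
    "WHERE id = %s"
)


def append_to_tsvector(cursor, row_id, new_text, config="english"):
    """Append lexemes parsed from new_text to the row's existing tsvector.

    cursor is any DB-API cursor using %s placeholders, for example one
    obtained from django.db.connection.cursor().
    """
    # Values are passed as parameters so the driver handles quoting.
    cursor.execute(APPEND_TSVECTOR_SQL, [config, new_text, row_id])
```

With Django it could be called inside "with connection.cursor() as cursor:" as append_to_tsvector(cursor, 1234, 'some new words to add'). It does not address fields that live several joins away; those values would still have to be collected first.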

Django remove duplicates from .values_list query while preserving order

I have a model say MyModel which contains a CharField type. The model has a default meta ordering which should be preserved. I am using the following query to get the list of types -
MyModel.objects.all().values_list('type', flat=True).distinct()
However, the types are getting repeated. I can do .order_by('type').distinct() but that will change the ordering which I don't want. Is there any way to get the list of types in order without manually creating a list in python? Alternative faster solutions are also welcome.
Django version - 1.11

distinct() is not matching on type because you did not specify the field. Pass the field name explicitly (field arguments to distinct() are supported on PostgreSQL only):
MyModel.objects.all().values_list('type', flat=True).distinct("type")
instead of:
MyModel.objects.all().values_list('type', flat=True).distinct()

You can try this (note that values() does not accept flat=True, so values_list() is used):
MyModel.objects.all().values_list('type', flat=True).order_by('type').distinct()
It will work for you.

You can do this in 2 steps:
First, get the id of the records with unique types and save them in a list:
ids = list(MyModel.objects.values_list('id', flat=True).order_by('type').distinct('type'))
Then do the filter using the ids:
MyModel.objects.values_list('type', flat=True).filter(id__in=ids)
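If building the list in Python turns out to be acceptable after all, deduplication that keeps the first occurrence of each value is short. A minimal sketch, assuming the queryset is evaluated with the model's default ordering; the sample list stands in for the result of values_list('type', flat=True), and the helper name is hypothetical.

```python
def unique_in_order(values):
    """Drop repeated values, keeping the first occurrence of each in order."""
    # dict preserves insertion order (guaranteed since Python 3.7).
    return list(dict.fromkeys(values))


# Stand-in for: MyModel.objects.values_list('type', flat=True)
types_in_default_order = ["b", "a", "b", "c", "a"]
print(unique_in_order(types_in_default_order))  # ['b', 'a', 'c']
```

This fetches every row's type from the database, so it trades query complexity for transfer size; for large tables a database-side approach may still be preferable.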

Django change database field from integer to CharField

I have a Django app with a populated (Postgres) database that has an integer field that I need to change to a CharField. I need to start storing data with leading zeros in this field. If I run migrate (Django 1.8.4), I get the following error:
psycopg2.ProgrammingError: operator does not exist: character varying >= integer
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I tried searching Google, but didn't really find much help. I don't really know what I'm supposed to do here. Can anyone help?

Originally, I thought that there would be a simple solution where Django or Postgres would do the conversion automatically, but it appears that it doesn't work that way. I think some of the suggestions made by others might have worked, but I came up with a simple solution of my own. This was done on a production database so I had to be careful. I ended up adding the character field to the model and running the migration. I then ran a small script in the Python shell that copied the data from the integer field, did the conversion that I needed, then wrote that to the new field in the same record in the production database.
For example:
members = Members.objects.all()
for mem in members:
    mem.memNumStr = str(mem.memNum)
    # ... more data processing here
    mem.save()
So now, I had the data duplicated in the table in a str field and an int field. I could then modify the views that accessed that field and test it on the production database without breaking the old code. Once that is done, I can drop the old int field. A little bit involved, but pretty simple in the end.

You'll need to generate a schema migration. How you do that will depend on which version of Django you are using (versions 1.7 and newer have built-in migrations; older versions of Django will use South).
Of note: if this data is production data, you'll want to be very careful with how you proceed. Make sure you have a known good copy of your data you can reinstall if things get hairy. Test your migrations in a non-production environment. Be. Careful!
As for the transformation on the field (from IntegerField to CharField) and the transformation on the field values (to be prepended with leading zeroes): Django cannot do this for you; you'll have to write it manually. The proper way to do this is to use the django.db.migrations.RunPython migration operation.
My advice would be to generate multiple migrations: one that creates a new CharField, my_new_column, and then writes to this new column via RunPython. Then, run a second migration that removes the original IntegerField my_old_column and renames my_new_column to my_old_column.
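The value conversion itself is plain Python and can be written and checked separately from the migration. A minimal sketch, assuming a fixed width of 6 characters (the width and the function name are assumptions, not part of the question); a RunPython function would call it for each row before saving the new CharField.

```python
def int_to_padded_str(value, width=6):
    """Convert an integer to a string left-padded with zeros to width characters."""
    if value is None:
        # Keep missing values as an empty string rather than the text "None".
        return ""
    return str(value).zfill(width)


print(int_to_padded_str(42))       # 000042
print(int_to_padded_str(1234567))  # 1234567 (longer values are not truncated)
```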

From Django 2.x you just need to change the field type from IntegerField to CharField, and Django will automatically alter the field and migrate the data for you.

I thought a full code example would be helpful. I followed the approach outlined by ken-koster in the comment above. I ended up with two migrations (0091 and 0092). It seems that the two migrations could be squashed into one migration but I did not go that far. (Maybe Django does this automatically but the framework here could be used in case the string values are more complicated than a simple int to string conversion. Also I included a reverse conversion example.)
First migration (0091):
from django.db import migrations, models
class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0090_auto_20200622_1452'),
    ]
    operations = [
        # store original values in tmp fields
        migrations.RenameField(model_name='member',
                               old_name='mem_num',
                               new_name='mem_num_tmp'),
        # add back fields as string fields
        migrations.AddField(
            model_name='member',
            name='mem_num',
            field=models.CharField(default='0', max_length=64, verbose_name='Number of members'),
        ),
    ]
Second migration (0092):
from django.db import migrations
def copyvals(apps, schema_editor):
    # copy each integer value into the new string field
    Member = apps.get_model("myapp", "Member")
    members = Member.objects.all()
    for member in members:
        member.mem_num = str(member.mem_num_tmp)
        member.save()
def copyreverse(apps, schema_editor):
    # convert the strings back to integers when the migration is reversed
    Member = apps.get_model("myapp", "Member")
    members = Member.objects.all()
    for member in members:
        try:
            member.mem_num_tmp = int(member.mem_num)
            member.save()
        except Exception:
            print("Reverse migration for member %s failed." % member.name)
            print(member.mem_num)
class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0091_custom_migration'),
    ]
    operations = [
        # convert integers to strings
        migrations.RunPython(copyvals, reverse_code=copyreverse),
        # remove the tmp field
        migrations.RemoveField(
            model_name='member',
            name='mem_num_tmp',
        ),
    ]

In django, is there a way to directly annotate a query with a related object in single query?

Consider this query:
query = Novel.objects.< ...some filtering... >.annotate(
    latest_chapter_id=Max("volume__chapter__id")
)
Actually what I need is to annotate each Novel with its latest Chapter object, so after this query, I have to execute another query to select actual objects by annotated IDs. IMO this is ugly. Is there a way to combine them into a single query?

Yes, it's possible.
To get a queryset containing all Chapters which are the last in their Novels, simply do:
from django.db.models.expressions import F
from django.db.models.aggregates import Max
Chapters.objects.annotate(
    last_chapter_pk=Max('novel__chapter__pk')
).filter(pk=F('last_chapter_pk'))
Tested on Django 1.7.

Possible with Django 3.2+
Make use of django.db.models.functions.JSONObject (added in Django 3.2) to combine multiple fields into a single object. In this example I'm fetching the latest object; however, it is possible to fetch any arbitrary object, provided that you can use LIMIT 1 to yield it:
from django.db.models import OuterRef
from django.db.models.functions import JSONObject
MainModel.objects.annotate(
    last_object=RelatedModel.objects.filter(mainmodel=OuterRef("pk"))
    .order_by("-date_created")
    .values(
        data=JSONObject(
            id="id", body="body", date_created="date_created"
        )
    )[:1]
)

Yes, using Subqueries, docs: https://docs.djangoproject.com/en/3.0/ref/models/expressions/#subquery-expressions
from django.db.models import OuterRef, Subquery
# Descending order, so that the [:1] slice below picks the latest chapter.
latest_chapters = Chapter.objects.filter(novel=OuterRef("pk")).order_by("-chapter_order")
novels_with_chapter = Novel.objects.annotate(
    latest_chapter=Subquery(latest_chapters.values("chapter")[:1]))
Tested on Django 3.0
The subquery creates a select statement inside the select statement for the novels, then adds this as an annotation. This means you only hit the database once.
I also prefer this to Rune's answer as it actually annotates a Novel object.
Hope this helps anyone who came looking much later, like I did.

No, it's not possible to combine them into a single query.
You can read the following blog post to find two workarounds.

In Django, how does one filter a QuerySet with dynamic field lookups?

Given a class:
from django.db import models
class Person(models.Model):
    name = models.CharField(max_length=20)
Is it possible, and if so how, to have a QuerySet that filters based on dynamic arguments? For example:
# Instead of:
Person.objects.filter(name__startswith='B')
# ... and:
Person.objects.filter(name__endswith='B')
# ... is there some way, given:
filter_by = '{0}__{1}'.format('name', 'startswith')
filter_value = 'B'
# ... that you can run the equivalent of this?
Person.objects.filter(filter_by=filter_value)
# ... which will throw an exception, since `filter_by` is not
# an attribute of `Person`.

Python's argument expansion may be used to solve this problem:
kwargs = {
    '{0}__{1}'.format('name', 'startswith'): 'A',
    '{0}__{1}'.format('name', 'endswith'): 'Z'
}
Person.objects.filter(**kwargs)
This is a very common and useful Python idiom.
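The same idea can be wrapped in a small helper when the field names and lookup types arrive as data. A sketch, with a hypothetical helper name; it only builds the keyword dictionary, which would then be passed to the model manager as Person.objects.filter(**kwargs).

```python
def build_lookup_kwargs(conditions):
    """Turn (field, lookup, value) triples into filter() keyword arguments."""
    kwargs = {}
    for field, lookup, value in conditions:
        # Django joins a field name and a lookup type with a double underscore.
        kwargs[f"{field}__{lookup}"] = value
    return kwargs


kwargs = build_lookup_kwargs([
    ("name", "startswith", "A"),
    ("name", "endswith", "Z"),
])
print(kwargs)  # {'name__startswith': 'A', 'name__endswith': 'Z'}
# With the model from the question: Person.objects.filter(**kwargs)
```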

A simplified example:
In a Django survey app, I wanted an HTML select list showing registered users. But because we have 5000 registered users, I needed a way to filter that list based on query criteria (such as just people who completed a certain workshop). In order for the survey element to be re-usable, I needed for the person creating the survey question to be able to attach those criteria to that question (don't want to hard-code the query into the app).
The solution I came up with isn't 100% user friendly (requires help from a tech person to create the query) but it does solve the problem. When creating the question, the editor can enter a dictionary into a custom field, e.g.:
{'is_staff':True,'last_name__startswith':'A',}
That string is stored in the database. In the view code, it comes back in as self.question.custom_query . The value of that is a string that looks like a dictionary. We turn it back into a real dictionary with eval() and then stuff it into the queryset with **kwargs:
kwargs = eval(self.question.custom_query)
user_list = User.objects.filter(**kwargs).order_by("last_name")
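Because eval() executes whatever code is stored in that field, a safer variant is worth considering. A minimal sketch using ast.literal_eval from the standard library, which accepts only Python literals; the helper name and the extra dictionary check are assumptions for illustration.

```python
import ast


def parse_filter_dict(text):
    """Parse a stored dict literal into filter() kwargs without executing code."""
    # literal_eval raises ValueError or SyntaxError for anything but literals.
    value = ast.literal_eval(text)
    if not isinstance(value, dict):
        raise ValueError("expected a dictionary literal")
    return value


kwargs = parse_filter_dict("{'is_staff':True,'last_name__startswith':'A',}")
print(kwargs)  # {'is_staff': True, 'last_name__startswith': 'A'}
# In the view: User.objects.filter(**kwargs).order_by("last_name")
```

The stored string keeps the same format as before, so existing questions would not need to be re-entered; only the parsing call changes.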

To extend the previous answers, some of which drew requests for further code, here is working code that I use with Q. Let's say that a request may or may not include filters on fields like:
publisher_id
date_from
date_until
Those fields can appear in the query, but they may also be absent.
This is how I build filters based on those fields for an aggregated query that cannot be further filtered after the initial queryset execution:
# prepare filters to apply to queryset
filters = {}
if publisher_id:
    filters['publisher_id'] = publisher_id
if date_from:
    filters['metric_date__gte'] = date_from
if date_until:
    filters['metric_date__lte'] = date_until
filter_q = Q(**filters)
queryset = Something.objects.filter(filter_q)...
Hope this helps, since I spent quite some time digging this up.
Edit:
As an additional benefit, you can use lists too. For the previous example, if instead of publisher_id you have a list called publisher_ids, then you could use this piece of code:
if publisher_ids:
    filters['publisher_id__in'] = publisher_ids

django.db.models.Q is exactly what you want, in a Django way.

This looks much more understandable to me:
kwargs = {
    'name__startswith': 'A',
    'name__endswith': 'Z',
    # (add more filters here)
}
Person.objects.filter(**kwargs)

A really complex search form usually indicates that a simpler model is trying to dig its way out.
How, exactly, do you expect to get the values for the column name and operation?
Where do you get the values of 'name' and 'startswith'?
filter_by = '%s__%s' % ('name', 'startswith')
A "search" form? You're going to -- what? -- pick the name from a list of names? Pick the operation from a list of operations? While open-ended, most people find this confusing and hard-to-use.
How many columns have such filters? 6? 12? 18?
A few? A complex pick-list doesn't make sense. A few fields and a few if-statements make sense.
A large number? Your model doesn't sound right. It sounds like the "field" is actually a key to a row in another table, not a column.
Specific filter buttons. Wait... That's the way the Django admin works. Specific filters are turned into buttons. And the same analysis as above applies. A few filters make sense. A large number of filters usually means a kind of first normal form violation.
A lot of similar fields often means there should have been more rows and fewer fields.
