django admin incorrectly adds order by into query

django admin incorrectly adds order by into query - python

I have noticed thanks to django debug toolbar, that every django admin list page, always add an "ORDER BY id DESC" to all my queries, EVEN if I manually override the get_queryset method of the admin.ModelAdmin (which I usually do because I want custom sorting on some of my admin pages)
I guess this is not really something to worry about, but it is an additional sorting operation the database will need to do, even if it doesn't make sense at all.
Is there any way to prevent this? It seems like on some models (even that, not on all) if I add the ordering meta data, then it won't automatically add an order by by id, but it will however add by that field, which is something also I don't want, because doing so would add that order by to all my other queries across the code.
EDIT: Seems like the culprit is at django.contrib.admin.views.main at ChangeList, on the function get_ordering at line 316 (django 1.7.10)
pk_name = self.lookup_opts.pk.name
if not (set(ordering) & set(['pk', '-pk', pk_name, '-' + pk_name])):
# The two sets do not intersect, meaning the pk isn't present. So
# we add it.
ordering.append('-pk')
I wonder what's the reason behind this...
EDIT:
To improve performance, and since MySQL (and InnoDB) returns data in the clustered index order when no order by is given, I can safely remove that id appending.
To do so, it is quite easy, I have just extended django's ChangeList and modified the get_ordering method. After that, just made a custom admin model that extendes from ModelAdmin and overrides the get_changelist method to the rerturn the above class.
I hope it helps anyone :)

Was having the exact same issue as this question where an admin queryset was 4 times slower due to the ID sort when I already have unique sorts. Thanks to #user1777914 and his work I don't have timeouts every other load! I am just adding this answer here for clarity if others suffer the same. As user1777914 mentions extend the ChangeList:
class NoPkChangeList(ChangeList):
def get_ordering(self, request, queryset):
"""
Returns the list of ordering fields for the change list.
First we check the get_ordering() method in model admin, then we check
the object's default ordering. Then, any manually-specified ordering
from the query string overrides anything. Finally, WE REMOVE the primary
key ordering field.
"""
params = self.params
ordering = list(self.model_admin.get_ordering(request) or self._get_default_ordering())
if ORDER_VAR in params:
# Clear ordering and used params
ordering = []
order_params = params[ORDER_VAR].split('.')
for p in order_params:
try:
none, pfx, idx = p.rpartition('-')
field_name = self.list_display[int(idx)]
order_field = self.get_ordering_field(field_name)
if not order_field:
continue # No 'admin_order_field', skip it
# reverse order if order_field has already "-" as prefix
if order_field.startswith('-') and pfx == "-":
ordering.append(order_field[1:])
else:
ordering.append(pfx + order_field)
except (IndexError, ValueError):
continue # Invalid ordering specified, skip it.
# Add the given query's ordering fields, if any.
ordering.extend(queryset.query.order_by)
# Ensure that the primary key is systematically present in the list of
# ordering fields so we can guarantee a deterministic order across all
# database backends.
# pk_name = self.lookup_opts.pk.name
# if not (set(ordering) & {'pk', '-pk', pk_name, '-' + pk_name}):
# # The two sets do not intersect, meaning the pk isn't present. So
# # we add it.
# ordering.append('-pk')
return ordering
Then in your ModelAdmin just override get_changelist:
class MyAdmin(ModelAdmin):
def get_changelist(self, request, **kwargs):
return NoPkChangeList

The answer of 7Wonders can be reduced to the following statements because only ChangeList._get_deterministic_ordering() needs to change:
# admin.py
class MyAdmin(ModelAdmin):
def get_changelist(self, request, **kwargs):
"""Improve changelist query speed by disabling deterministic ordering.
Please be aware that this might disturb pagination.
"""
from django.contrib.admin.views.main import ChangeList
class NoDeterministicOrderChangeList(ChangeList):
def _get_deterministic_ordering(self, ordering):
return ordering
return NoDeterministicOrderChangeList

Related

Django undirected unique-together

I want to model pair-wise relations between all members of a set.
class Match(models.Model):
foo_a = models.ForeignKey(Foo, related_name='foo_a')
foo_b = models.ForeignKey(Foo, related_name='foo_b')
relation_value = models.IntegerField(default=0)
class Meta:
unique_together = ('ingredient_a', 'ingredient_b')
When I add a pair A-B, it successfully prevents me from adding A-B again, but does not prevent me from adding B-A.
I tried following, but to no avail.
unique_together = (('ingredient_a', 'ingredient_b'), ('ingredient_b', 'ingredient_a'))
Edit:
I need the relationship_value to be unique for every pair of items

If you define a model like what you defined, its not just a ForeignKey, its called a ManyToMany Relation.
In the django docs, it is explicitly defined that unique together constraint cannot be included for a ManyToMany Relation.
From the docs,
A ManyToManyField cannot be included in unique_together. (It’s not clear what that would even mean!) If you need to validate uniqueness related to a ManyToManyField, try using a signal or an explicit through model.
EDIT
After lot of search and some trial and errors and finally I think I have found a solution for your scenario. Yes, as you said, the present schema is not as trivial as we all think. In this context, Many to many relation is not the discussion we need to forward. The solution is, (or what I think the solution is) model clean method:
class Match(models.Model):
foo_a = models.ForeignKey(Foo, related_name='foo_a')
foo_b = models.ForeignKey(Foo, related_name='foo_b')
def clean(self):
a_to_b = Foo.objects.filter(foo_a = self.foo_a, foo_b = self.foo_b)
b_to_a = Foo.objects.filter(foo_a = self.foo_b, foo_b = self.foo_a)
if a_to_b.exists() or b_to_a.exists():
raise ValidationError({'Exception':'Error_Message')})
For more details about model clean method, refer the docs here...

I've overridden the save method of the object to save 2 pairs every time. If the user wants to add a pair A-B, a record B-A with the same parameters is automatically added.
Note: This solution affects the querying speed. For my project, it is not an issue, but it needs to be considered.
def save(self, *args, **kwargs):
if not Match.objects.filter(foo_a=self.foo_a, foo_b=self.foo_b).exists():
super(Match, self).save(*args, **kwargs)
if not Match.objects.filter(foo_a=self.foo_b, foo_b=self.foo_a).exists():
Match.objects.create(foo_a=self.foo_b, foo_b=self.foo_a, bar=self.bar)
EDIT: Update and remove methods need to be overridden too of course.

Django admin change list view disable sorting for some fields

Is there are way(s) to disable the sorting function for some fields in django admin change list so for those fields, users cannot click the column header to sort the list.
I tried on the following method, but it doesn't work.
https://djangosnippets.org/snippets/2580/
I also tired to override the changelist_view in ModelAdmin but also nothing happen.
def changelist_view(self, request, extra_context=None):
self.ordering_fields = ['id']
return super(MyModelAdmin, self).changelist_view(request, extra_context)
In the above case, I would like to only allow user to sort the list by ID.
Anyone has suggestion? Thanks.

For Django 1.7 (or the version that I last use) do not support such things. One possible dirty work-around could be defining a model class method and using that method instead of model field.
class TestClass(Model):
some_field = (.....)
other_field = (........)
def show_other_field(self):
return self.other_field
class TestClassAdmin(ModelAdmin):
list_display = ("some_field", "show_other_field")
Since show_other_field is a model class method, django do not knows how to sort (or process) the return result of that method.
But as I said, this is a dirty hack that might require more processing (and maybe more database calls) according to use-case than displaying a field of a model.
Extra: If you want to make a model method sortable, you must pass admin_order_field value like:
def show_other_field(self):
return self.other_field
show_other_field.admin_order_field = "other_field"
That will make your model method sortable in admin list_display. But you have to pass a field or relation that is usable in the order_by method of database api.
TestClass.objects.filter(....).order_by(<admin_order_field>)

Tastypie Dehydrate reverse relation count

I have a simple model which includes a product and category table. The Product model has a foreign key Category.
When I make a tastypie API call that returns a list of categories /api/vi/categories/
I would like to add a field that determines the "product count" / the number of products that have a giving category. The result would be something like:
category_objects[
{
id: 53
name: Laptops
product_count: 7
},
...
]
The following code is working but the hit on my DB is heavy
def dehydrate(self, bundle):
category = Category.objects.get(pk=bundle.obj.id)
products = Product.objects.filter(category=category)
bundle.data['product_count'] = products.count()
return bundle
Is there a more efficient way to build this query? Perhaps with annotate ?

You can use prefetch_related method of QuerSet to reverse select_related.
Asper documentation,
prefetch_related(*lookups)
Returns a QuerySet that will automatically
retrieve, in a single batch, related objects for each of the specified
lookups.
This has a similar purpose to select_related, in that both are
designed to stop the deluge of database queries that is caused by
accessing related objects, but the strategy is quite different.
If you change your dehydrate function to following then database will be hit single time.
def dehydrate(self, bundle):
category = Category.objects.prefetch_related("product_set").get(pk=bundle.obj.id)
bundle.data['product_count'] = category.product_set.count()
return bundle
UPDATE 1
You should not initialize queryset inside dehydrate function. queryset should be always set in Meta class only. Please have a look at following example from django-tastypie documentation.
class MyResource(ModelResource):
class Meta:
queryset = User.objects.all()
excludes = ['email', 'password', 'is_staff', 'is_superuser']
def dehydrate(self, bundle):
# If they're requesting their own record, add in their email address.
if bundle.request.user.pk == bundle.obj.pk:
# Note that there isn't an ``email`` field on the ``Resource``.
# By this time, it doesn't matter, as the built data will no
# longer be checked against the fields on the ``Resource``.
bundle.data['email'] = bundle.obj.email
return bundle
As per official django-tastypie documentation on dehydrate() function,
dehydrate
The dehydrate method takes a now fully-populated bundle.data & make
any last alterations to it. This is useful for when a piece of data
might depend on more than one field, if you want to shove in extra
data that isn’t worth having its own field or if you want to
dynamically remove things from the data to be returned.
dehydrate() is only meant to make any last alterations to bundle.data.

Your code does additional count query for each category. You're right about annotate being helpfull in this kind of a problem.
Django will include all queryset's fields in GROUP BY statement. Notice .values() and empty .group_by() serve limiting field set to required fields.
cat_to_prod_count = dict(Product.objects
.values('category_id')
.order_by()
.annotate(product_count=Count('id'))
.values_list('category_id', 'product_count'))
The above dict object is a map [category_id -> product_count].
It can be used in dehydrate method:
bundle.data['product_count'] = cat_to_prod_count[bundle.obj.id]
If that doesn't help, try to keep similar counter on category records and use singals to keep it up to date.
Note categories are usually a tree-like beings and you probably want to keep count of all subcategories as well.
In that case look at the package django-mptt.

get_or_create django model with ManyToMany field

Suppose I have three django models:
class Section(models.Model):
name = models.CharField()
class Size(models.Model):
section = models.ForeignKey(Section)
size = models.IntegerField()
class Obj(models.Model):
name = models.CharField()
sizes = models.ManyToManyField(Size)
I would like to import a large amount of Obj data where many of the sizes fields will be identical. However, since Obj has a ManyToMany field, I can't just test for existence like I normally would. I would like to be able to do something like this:
try:
x = Obj(name='foo')
x.sizes.add(sizemodel1) # these can be looked up with get_or_create
...
x.sizes.add(sizemodelN) # these can be looked up with get_or_create
# Now test whether x already exists, so I don't add a duplicate
try:
Obj.objects.get(x)
except Obj.DoesNotExist:
x.save()
However, I'm not aware of a way to get an object this way, you have to just pass in keyword parameters, which don't work for ManyToManyFields.
Is there any good way I can do this? The only idea I've had is to build up a set of Q objects to pass to get:
myq = myq & Q(sizes__id=sizemodelN.id)
But I am not sure this will even work...

Use a through model and then .get() against that.
http://docs.djangoproject.com/en/dev/topics/db/models/#extra-fields-on-many-to-many-relationships
Once you have a through model, you can .get() or .filter() or .exists() to determine the existence of an object that you might otherwise want to create. Note that .get() is really intended for columns where unique is enforced by the DB - you might have better performance with .exists() for your purposes.
If this is too radical or inconvenient a solution, you can also just grab the ManyRelatedManager and iterate through to determine if the object exists:
object_sizes = obj.sizes.all()
exists = object_sizes.filter(id__in = some_bunch_of_size_object_ids_you_are_curious_about).exists()
if not exists:
(your creation code here)

Your example doesn't make much sense because you can't add m2m relationships before an x is saved, but it illustrated what you are trying to do pretty well. You have a list of Size objects created via get_or_create(), and want to create an Obj if no duplicate obj-size relationship exists?
Unfortunately, this is not possible very easily. Chaining Q(id=F) & Q(id=O) & Q(id=O) doesn't work for m2m.
You could certainly use Obj.objects.filter(size__in=Sizes) but that means you'd get a match for an Obj with 1 size in a huge list of sizes.
Check out this post for an __in exact question, answered by Malcolm, so I trust it quite a bit.
I wrote some python for fun that could take care of this.
This is a one time import right?
def has_exact_m2m_match(match_list):
"""
Get exact Obj m2m match
"""
if isinstance(match_list, QuerySet):
match_list = [x.id for x in match_list]
results = {}
match = set(match_list)
for obj, size in \
Obj.sizes.through.objects.filter(size__in=match).values_list('obj', 'size'):
# note: we are accessing the auto generated through model for the sizes m2m
try:
results[obj].append(size)
except KeyError:
results[obj] = [size]
return bool(filter(lambda x: set(x) == match, results.values()))
# filter any specific objects that have the exact same size IDs
# if there is a match, it means an Obj exists with exactly
# the sizes you provided to the function, no more.
sizes = [size1, size2, size3, sizeN...]
if has_exact_m2m_match(sizes):
x = Obj.objects.create(name=foo) # saves so you can use x.sizes.add
x.sizes.add(sizes)

Django - AutoField with regards to a foreign key

I have a model with a unique integer that needs to increment with regards to a foreign key, and the following code is how I currently handle it:
class MyModel(models.Model):
business = models.ForeignKey(Business)
number = models.PositiveIntegerField()
spam = models.CharField(max_length=255)
class Meta:
unique_together = (('number', 'business'),)
def save(self, *args, **kwargs):
if self.pk is None: # New instance's only
try:
highest_number = MyModel.objects.filter(business=self.business).order_by('-number').all()[0].number
self.number = highest_number + 1
except ObjectDoesNotExist: # First MyModel instance
self.number = 1
super(MyModel, self).save(*args, **kwargs)
I have the following questions regarding this:
Multiple people can create MyModel instances for the same business, all over the internet. Is it possible for 2 people creating MyModel instances at the same time, and .count() returns 500 at the same time for both, and then both try to essentially set self.number = 501 at the same time (raising an IntegrityError)? The answer seems like an obvious "yes, it could happen", but I had to ask.
Is there a shortcut, or "Best way" to do this, which I can use (or perhaps a SuperAutoField that handles this)?
I can't just slap a while model_not_saved: try:, except IntegrityError: in, because other restraints in the model could lead to an endless loop, and a disaster worse than Chernobyl (maybe not quite that bad).

You want that constraint at the database level. Otherwise you're going to eventually run into the concurrency problem you discussed. The solution is to wrap the entire operation (read, increment, write) in a transaction.
Why can't you use an AutoField for instead of a PositiveIntegerField?
number = models.AutoField()
However, in this case number is almost certainly going to equal yourmodel.id, so why not just use that?
Edit:
Oh, I see what you want. You want a numberfield that doesn't increment unless there's more than one instance of MyModel.business.
I would still recommend just using the id field if you can, since it's certain to be unique. If you absolutely don't want to do that (maybe you're showing this number to users), then you will need to wrap your save method in a transaction.
You can read more about transactions in the docs:
http://docs.djangoproject.com/en/dev/topics/db/transactions/
If you're just using this to count how many instances of MyModel have a FK to Business, you should do that as a query rather than trying to store a count.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.