I am developing a Django project that already holds plenty of real-world data, so I can see how it performs.
The performance of a few Django admin lists is just terrible.
I have one admin list, let's call it devices. In that list I am fetching additional information for each row; those fields come from other tables, connected via FK/PK/M2N.
The list contains about 500 records, and loading that screen takes, according to django-debug-toolbar, around 6.5 seconds, which is unbearable.
This is the admin class:
@admin.register(Device)
class DeviceAdmin(admin.ModelAdmin):
    list_select_related = True
    list_display = ('id', 'name', 'project', 'location', 'machine', 'type', 'last_maintenance_log')
    inlines = [CommentInline, TestLogInline]
    def project(self, obj):
        try:
            return Device.objects.get(pk=obj.pk).machine.location.project.project_name
        except AttributeError:
            return '-'
    def location(self, obj):
        try:
            return Device.objects.get(pk=obj.pk).machine.location.name
        except AttributeError:
            return '-'
    def last_maintenance_log(self, obj):
        try:
            log = AdminLog.objects.filter(object_id=obj.pk).latest('time')
            return '{} | {}'.format(log.time.strftime("%d/%m/%Y, %-I:%M %p"), log.title)
        except AttributeError:
            return '-'
Machine, Location and Project are all tables in the database.
After looking into the queries in django-debug-toolbar I discovered something terrible.
That screen required 287 SQL queries! Yes, over two hundred!
Is there anything I can do to reduce this time to something reasonable?
EDIT:
Thanks to Bruno I removed Device.objects.get(pk=obj.id) (it was really redundant; I totally overlooked this).
So everywhere I now use obj directly, for example obj.machine.location.project.project_name
This alone cuts the load time and the query count in half, so far so good.
I now have trouble combining the obj approach with the select_related approach. This is my current code, which is worse than the obj-only approach.
def project(self, obj):
    try:
        return Device.objects.select_related('machine__location__project').get(id=obj.pk).machine.location.project.project_name
    except AttributeError:
        return '-'
This creates nice INNER JOINs in the queries, but performance is around 15% worse than without it, using only obj.machine.location.project.project_name
What am I doing wrong?
EDIT2:
Best performance I got is with this code:
@admin.register(Device)
class DeviceAdmin(admin.ModelAdmin):
    list_select_related = True
    save_as = True
    form = DeviceForm
    list_display = ('id', 'name', 'project', 'location', 'machine', 'type', 'last_maintenance_log')
    inlines = [CommentInline, TestLogInline]
    def project(self, obj):
        try:
            return obj.machine.location.project.project_name
        except AttributeError:
            return '-'
    def location(self, obj):
        try:
            return obj.machine.location.name
        except AttributeError:
            return '-'
    def last_maintenance_log(self, obj):
        try:
            log = AdminLog.objects.filter(object_id=obj.pk).latest('time')
            return '{} | {}'.format(log.time.strftime("%d/%m/%Y, %-I:%M %p"), log.title)
        except AttributeError:
            return '-'
    def get_queryset(self, request):
        return Device.objects.select_related('machine__location__project').all()
This pushed it down to 104 queries (from almost 300) and cut the time by over 50%. Can this be improved any further?
First, avoid totally useless queries - this:
Device.objects.get(pk=obj.pk)
is as useless as can be, since obj is already the Device instance you are looking for.
Then override your DeviceAdmin.get_queryset method to make proper use of select_related and prefetch_related so you only have the minimal required number of queries.
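To make the effect of joining related tables up front concrete, here is a minimal sketch using only the standard library's sqlite3 module rather than Django. The table layout mirrors the Device, Machine, Location and Project chain from the question, but the schema, the sample rows and the run() helper are assumptions made up for the illustration. It counts statements the way django-debug-toolbar would: once with one lookup per relation per row, once with a single JOIN (roughly what select_related('machine__location__project') produces).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project  (id INTEGER PRIMARY KEY, project_name TEXT);
    CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT, project_id INTEGER);
    CREATE TABLE machine  (id INTEGER PRIMARY KEY, location_id INTEGER);
    CREATE TABLE device   (id INTEGER PRIMARY KEY, name TEXT, machine_id INTEGER);
    INSERT INTO project  VALUES (1, 'Alpha');
    INSERT INTO location VALUES (1, 'Hall A', 1);
    INSERT INTO machine  VALUES (1, 1);
""")
conn.executemany("INSERT INTO device VALUES (?, ?, 1)",
                 [(i, "dev%d" % i) for i in range(1, 6)])

executed = []  # every statement sent through run() is recorded here


def run(sql, params=()):
    """Execute one statement, record it, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def rows_lazy():
    """One query for the list, then three more per row to walk the relations."""
    result = []
    for dev_id, name, machine_id in run("SELECT id, name, machine_id FROM device ORDER BY id"):
        (location_id,) = run("SELECT location_id FROM machine WHERE id = ?", (machine_id,))[0]
        loc_name, project_id = run("SELECT name, project_id FROM location WHERE id = ?", (location_id,))[0]
        (project_name,) = run("SELECT project_name FROM project WHERE id = ?", (project_id,))[0]
        result.append((name, loc_name, project_name))
    return result


def rows_joined():
    """A single query that follows all three foreign keys with JOINs."""
    return run("""SELECT d.name, l.name, p.project_name FROM device d
                  JOIN machine m  ON m.id = d.machine_id
                  JOIN location l ON l.id = m.location_id
                  JOIN project p  ON p.id = l.project_id
                  ORDER BY d.id""")


executed.clear()
lazy = rows_lazy()
lazy_queries = len(executed)

executed.clear()
joined = rows_joined()
joined_queries = len(executed)

print(lazy_queries, joined_queries)  # 16 1
```

Both variants return the same rows; only the number of round trips differs. In the admin from the question, the queries left over after select_related are plausibly the per-row AdminLog lookup in last_maintenance_log (the changelist shows 100 rows per page by default), so that column is the next candidate for fetching in bulk.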
I'm trying to get a queryset from the cache, but am unsure if this even has a point.
I have the following method (simplified) inside a custom queryset:
def queryset_from_cache(self, key: str=None, timeout: int=60):
    # Generate a key based on the query.
    if key is None:
        key = self.__generate_key()
    # If the cache has the key, return the cached object.
    cached_object = cache.get(key, None)
    # If the cache doesn't have the key, set the cache,
    # and then return self (from DB) as cached_object
    if cached_object is None:
        cached_object = self
        cache.set(key, cached_object, timeout=timeout)
    return cached_object
The usage is basically to append it to a django QuerySet method, for example:
queryset = MyModel.objects.filter(id__range=[0,99]).queryset_from_cache()
My question:
Would usage like this work?
Or would it call MyModel.objects.filter(id__range=[0,99]) from the database no matter what?
Since normally caching would be done like this:
cached_object = cache.get(key, None)
if cached_object is None:
    cached_object = MyModel.objects.filter(id__range=[0,99])
    # Only now call the query
    cache.set(key, cached_object, timeout=timeout)
And thus the queryset filter() method only gets called when the key is not present in the cache, as opposed to always calling it, and then trying to get it from the cache with the queryset_from_cache method.
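One point worth keeping in mind here: Django querysets are lazy, so calling filter() only builds a query object and does not hit the database until the queryset is evaluated (iterated, sliced, pickled and so on). The sketch below imitates that with plain Python. LazyQuery, its cache_key() and from_cache() methods, and the dict used as a cache are all hypothetical stand-ins, not Django APIs. It shows that chaining a cache lookup onto a lazily built query costs one "database hit" in total, as long as generating the key does not itself evaluate the query.

```python
class LazyQuery:
    """Hypothetical stand-in for a lazy queryset.

    Building the object is free; only evaluate() counts as a database hit.
    """

    db_hits = 0  # deliberately shared counter, used only for this demo

    def __init__(self, low, high):
        self.low, self.high = low, high

    def cache_key(self):
        # Derived from the query parameters only, without evaluating anything.
        return "range:%d:%d" % (self.low, self.high)

    def evaluate(self):
        LazyQuery.db_hits += 1
        return list(range(self.low, self.high + 1))

    def from_cache(self, cache):
        key = self.cache_key()
        rows = cache.get(key)
        if rows is None:          # cache miss: evaluate once and store the rows
            rows = self.evaluate()
            cache[key] = rows
        return rows


cache = {}
first = LazyQuery(0, 99).from_cache(cache)   # miss: evaluates the query
second = LazyQuery(0, 99).from_cache(cache)  # hit: the query object is built but never evaluated

print(LazyQuery.db_hits)  # 1
```

Whether the real method behaves the same way depends on how the key is generated and on what exactly is stored in the cache, so treat this as an illustration of the principle rather than a verdict on the code above.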
This is a really cool idea, but I'm not sure if you can cache full-on objects. I think it's only attributes.
Now, as to whether this has a point: from the limited code I've seen, I don't know if it does, unless filtering for Jane and John (and only them) is very common. That is a very narrow case.
Maybe just try caching ALL the users, or individual users, and only the attributes you need.
Update
Yes! You are completely correct, you can cache full-on objects. How cool!
I don't think your example method of queryset = MyModel.objects.filter(id__range=[0,99]).queryset_from_cache() would work,
but you can do something similar by using model managers and doing something like: queryset = MyModel.objects.queryset_from_cache(filterdict)
Models
Naturally you can return just the qs; the dict is only there for the example, to show that the result actually came from the cache.
from django.core.cache import cache
from django.db import models
class MyModelManager(models.Manager):
    def queryset_from_cache(self, filterdict):
        # NOTE: a single fixed key is shared by every filterdict;
        # derive the key from filterdict if callers pass different filters.
        cachekey = 'MyModelCache'
        qs = cache.get(cachekey)
        # Compare against None: an empty (falsy) result is still a cache hit.
        if qs is not None:
            d = {
                'in_cache': True,
                'qs': qs
            }
        else:
            qs = MyModel.objects.filter(**filterdict)
            cache.set(cachekey, qs, 300)  # 5 min cache
            d = {
                'in_cache': False,
                'qs': qs
            }
        return d
class MyModel(models.Model):
    name = models.CharField(max_length=200)
    #
    # other attributes
    #
    objects = MyModelManager()
Example Use
from app.models import MyModel
filterdict = {'pk__range':[0,99]}
r = MyModel.objects.queryset_from_cache(filterdict)
print(r['qs'])
While it's not exactly what you wanted, it might be close enough
I am creating a browser-based RPG where fighting mechanics are built into a model called "Battle". It performs actions on Hero, Monster and Item models according to some formulas. Each action adds a message to a "battle log". A player can issue a fight against another player or NPC in a form. When the form is submitted, it calls the same view, the Battle object is created, the characters are drafted and the game mechanics are run.
For some reason, old "Battle" objects are still "selected" between runs of these views, as long as it's in the same web session. So even though I create a new object, the old battle log gets carried over to this new object.
What am I doing wrong here?
Updated with more context
The fightlog field in the first object is correct.
The fightlog field in the second object is the first object's data PLUS the new data.
The third fightlog is the first plus second plus third, etc.
views.py
def battle_log(request):
    if request.session["last_battle"]:
        pk = request.session["last_battle"]
        b = Battletest.objects.get(pk=pk)
        battle_log = b.fightlog
        request.session["last_battle"] = ""
        context = { 'battle_log' : battle_log, }
        return render(request, 'battle.html', context)
    else:
        return redirect('/game/monster')
def fight_select_monster(request):
    form = SelectCharacter()
    if request.method=='POST':
        form = SelectCharacter(request.POST)
        if form.is_valid():
            b = Battletest.objects.create()
            b.draft(request.POST.get("character"))
            b.start_fight()
            b.round()
            b.eof()
            b.save()
            request.session["last_battle"] = b.pk
            return redirect('/game/fight/')
    context = { 'form': form }
    request.session["last_battle"] = ""
    return render(request, 'fight.html', context)
models.py
class Battletest(models.Model):
    messages = []
    fightlog = models.TextField()
    opponent = ""
    def draft(self, opponent):
        CHARACTERS= (
            (0, 'Confident Hacker'),
            (1, 'Confused Coder'),
        )
        self.opponent = CHARACTERS[int(opponent)][1]
    def start_fight(self):
        self.messages.append([0, "You joined the battle."])
        self.messages.append([0, self.opponent + "has appeared"])
    def round(self):
        # have character objects do stuff to eachother until
        # some edge case is met.
        self.messages.append([1, "You smack " + self.opponent + " in the face"])
        self.messages.append([1, self.opponent + " decides to leave this stupid fight"])
    def eof(self):
        self.messages.append([2, "The fight is over and noone wins"])
        self.fightlog = self.messages
forms.py
class SelectCharacter(forms.Form):
    CONFIDENTHACKER = 0
    CONFUSEDCODER = 1
    CHARACTERS= (
        (CONFIDENTHACKER, 'Confident Hacker'),
        (CONFUSEDCODER, 'Confused Coder'),
    )
    character = forms.ChoiceField(choices=CHARACTERS)
Ok, your issue is here:
class Battletest(models.Model):
    messages = []
    opponent = ""
This defines messages and opponent as class attributes: attributes that belong to the class object itself and as such are shared between all instances of the class, making them, practically, global variables (since class objects are singletons). Every call to self.messages.append() mutates that one shared list, which is why each new battle log still contains the previous ones.
What you want here is to make messages an instance attribute by defining it in the initializer (that's what it's for):
class Battletest(models.Model):
    fightlog = models.TextField()
    def __init__(self, *args, **kwargs):
        # let Model do its own stuff !!!
        super(Battletest, self).__init__(*args, **kwargs)
        # and add our ones:
        self.messages = []
        self.opponent = None
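The difference between the two definitions can be reproduced without Django. The sketch below uses two made-up classes, SharedLog and OwnLog (the names are only for illustration), to show a list defined at class level being shared by all instances, and a list created in the initializer belonging to each instance separately.

```python
class SharedLog:
    messages = []  # class attribute: one list object shared by every instance

    def add(self, text):
        # self.messages resolves to the class-level list, so this mutates shared state
        self.messages.append(text)


class OwnLog:
    def __init__(self):
        self.messages = []  # instance attribute: a fresh list for each object

    def add(self, text):
        self.messages.append(text)


a, b = SharedLog(), SharedLog()
a.add("first battle")
b.add("second battle")
print(b.messages)  # ['first battle', 'second battle']

c, d = OwnLog(), OwnLog()
c.add("first battle")
d.add("second battle")
print(d.messages)  # ['second battle']
```

This is the same accumulation the question describes: as long as the server process stays alive, every new SharedLog-style object keeps appending to the list the previous ones already filled.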
As a side note: such mistakes are often the sign that someone kind of "jumped in" Django without learning Python's basics first, and wrongly assumes that because Django models use class attributes to define db fields, Python's class syntax is the same as Java's or PHP's, where you define attributes at the class top-level. But that's not how Python works, and I very strongly suggest that at this point you stop everything and do the whole official Python tutorial. It will save you a lot of time and pain, really.
As a second side note: in the context of a server-side web app, you want to avoid any kind of (mutable) global state in your code. Every bit of mutable global state should live in some database (your models, sessions, whatever), as long as it's external to your code AND can be shared amongst many processes, because in a typical production setup your app WILL be served by many distinct processes (yes, even if you have one single HTTP front server, it will typically manage a pool of Django processes, and requests will be arbitrarily dispatched to any of those processes).
Now, you have another issue here:
def eof(self):
    # ...
    self.fightlog = self.messages
You defined fightlog as a text field, but you're assigning a list of lists to it. What gets saved will be a textual representation of the list, which is not very usable.
Theoretically, what you have here is a one-to-many relationship (a Battletest has many Messages), so the proper relational design would be to use a distinct Message model with a ForeignKey on Battletest, as shown in the tutorial (you did do the tutorial, didn't you?).
Now if you really insist on denormalizing this, the best (or at least the least bad) solution is to serialize messages to JSON at save() time and deserialize it back to Python in the initializer. This can be done manually:
import json
class Battletest(models.Model):
    fightlog = models.TextField()
    def __init__(self, *args, **kwargs):
        # let Model do its own stuff !!!
        super(Battletest, self).__init__(*args, **kwargs)
        # and add our ones:
        if self.fightlog:
            self.messages = json.loads(self.fightlog)
        else:
            self.messages = []
        self.opponent = None
    # ...
    def save(self, *args, **kwargs):
        self.fightlog = json.dumps(self.messages)
        super(Battletest, self).save(*args, **kwargs)
or by using a JSONField (which will more or less automagically take care of this) if your RDBMS supports it. Googling for "django JSONField" should provide some hints.
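To see why the plain textual representation is "not very usable" while JSON is, here is a small standalone check with the json module. The sample messages are taken from the question; the variable names are only illustrative, and the assumption that the stored text is roughly str() of the list applies to the TextField case described above.

```python
import json

messages = [[0, "You joined the battle."], [2, "The fight is over and no one wins"]]

as_repr = str(messages)         # roughly what a text column receives when the list is assigned directly
as_json = json.dumps(messages)  # explicit serialization

# JSON round-trips back to the original nested lists.
restored = json.loads(as_json)
print(restored == messages)  # True

# The str() form uses single quotes, so it is not valid JSON and cannot be loaded back this way.
try:
    json.loads(as_repr)
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False
print(repr_is_json)  # False
```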
Oh and yes... you have duplicated code here:
class Battletest(models.Model):
    # ...
    def draft(self, opponent):
        CHARACTERS= (
            (0, 'Confident Hacker'),
            (1, 'Confused Coder'),
        )
        self.opponent = CHARACTERS[int(opponent)][1]
and here:
class SelectCharacter(forms.Form):
    CONFIDENTHACKER = 0
    CONFUSEDCODER = 1
    CHARACTERS= (
        (CONFIDENTHACKER, 'Confident Hacker'),
        (CONFUSEDCODER, 'Confused Coder'),
    )
    character = forms.ChoiceField(choices=CHARACTERS)
You want to factor this out so you have one single point of truth:
class Battletest(models.Model):
    CONFIDENTHACKER = 0
    CONFUSEDCODER = 1
    CHARACTERS= [
        (CONFIDENTHACKER, 'Confident Hacker'),
        (CONFUSEDCODER, 'Confused Coder'),
    ]
    def draft(self, opponent):
        self.opponent = self.CHARACTERS[int(opponent)][1]
and in your forms.py:
from django import forms
from .models import Battletest
class SelectCharacter(forms.Form):
    character = forms.ChoiceField(choices=Battletest.CHARACTERS)
I'm using Django 1.8.
What I need is a case-insensitive admin search across multiple fields that lets the user use the AND, OR and NOT operators and somehow group words, either with parentheses or with quotes.
Search Example:
cotton and (red or "dark blue")
I've already discovered django-advanced-filter and django-filter...
They are filters! I also want to allow the user to type in the keys in the search box.
I know that get_search_results allows us to override the search behaviour, but before I write code for this, I want to ask: is there a package that would do this for me?
Note that I feel that making a custom search with haystack is pretty complex.
This answer seems to work for me after performing the little edit mentioned in my comment. Yet, I have no idea whether this is the "correct" way of doing it.
Here is the updated code that works on django 1.8:
import operator
from functools import reduce
from django.contrib import admin
from django.contrib.admin.options import IncorrectLookupParameters
from django.contrib.admin.views.main import ChangeList
from django.core.exceptions import ImproperlyConfigured, SuspiciousOperation
from django.db import models
from bookstore.models import Book
class MyChangeList(ChangeList):
    def __init__(self, *a):
        super(MyChangeList, self).__init__(*a)
    def get_queryset(self, request):
        # First, we collect all the declared list filters.
        (self.filter_specs, self.has_filters, remaining_lookup_params,
         use_distinct) = self.get_filters(request)
        # Then, we let every list filter modify the queryset to its liking.
        qs = self.root_queryset
        for filter_spec in self.filter_specs:
            new_qs = filter_spec.queryset(request, qs)
            if new_qs is not None:
                qs = new_qs
        try:
            # Finally, we apply the remaining lookup parameters from the query
            # string (i.e. those that haven't already been processed by the
            # filters).
            qs = qs.filter(**remaining_lookup_params)
        except (SuspiciousOperation, ImproperlyConfigured):
            # Allow certain types of errors to be re-raised as-is so that the
            # caller can treat them in a special way.
            raise
        except Exception as e:
            # Every other error is caught with a broad except, because we don't
            # have any other way of validating lookup parameters. They might be
            # invalid if the keyword arguments are incorrect, or if the values
            # are not in the correct type, so we might get FieldError,
            # ValueError, ValidationError, or ?.
            raise IncorrectLookupParameters(e)
        # Use select_related() if one of the list_display options is a field
        # with a relationship and the provided queryset doesn't already have
        # select_related defined.
        if not qs.query.select_related:
            if self.list_select_related:
                qs = qs.select_related()
            else:
                for field_name in self.list_display:
                    try:
                        field = self.lookup_opts.get_field(field_name)
                    except Exception:  # e.g. models.FieldDoesNotExist
                        pass
                    else:
                        if isinstance(field.rel, models.ManyToOneRel):
                            qs = qs.select_related()
                            break
        # Set ordering.
        ordering = self.get_ordering(request, qs)
        qs = qs.order_by(*ordering)
        # Apply keyword searches.
        def construct_search(field_name):
            if field_name.startswith('^'):
                return "%s__istartswith" % field_name[1:]
            elif field_name.startswith('='):
                return "%s__iexact" % field_name[1:]
            elif field_name.startswith('#'):
                return "%s__search" % field_name[1:]
            else:
                return "%s__icontains" % field_name
        if self.search_fields and self.query:
            orm_lookups = [construct_search(str(search_field))
                           for search_field in self.search_fields]
            # Unlike the stock admin, every search word is OR-ed across every field.
            or_queries = []
            for bit in self.query.split():
                or_queries += [models.Q(**{orm_lookup: bit})
                               for orm_lookup in orm_lookups]
            if len(or_queries) > 0:
                qs = qs.filter(reduce(operator.or_, or_queries))
            if not use_distinct:
                for search_spec in orm_lookups:
                    if admin.utils.lookup_needs_distinct(self.lookup_opts, search_spec):
                        use_distinct = True
                        break
        if use_distinct:
            return qs.distinct()
        else:
            return qs
@admin.register(Book)
class AdminBookstore(admin.ModelAdmin):
    list_display = ('title', 'author', 'description')
    search_fields = ('title', 'author', 'description')
    def get_changelist(self, request, **kwargs):
        return MyChangeList
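The code above only ORs the search words together; it does not parse AND, OR, NOT, parentheses or quotes. As a rough sketch of the missing piece, the following standalone parser (plain Python, no Django) turns a query like the one in the question into a predicate over a string. The function names tokenize and parse and the grammar (NOT binds tightest, then AND, then OR) are assumptions chosen for the illustration. In an admin, each leaf would presumably become a Q(field__icontains=term) object instead of a substring test, combined with &, | and ~.

```python
import re

# A quoted phrase, a parenthesis, or a bare word.
TOKEN_RE = re.compile(r'"([^"]*)"|(\(|\))|([^\s()"]+)')
KEYWORDS = ("and", "or", "not")


def tokenize(query):
    """Return ('OP', value) tokens for operators/parentheses and ('TERM', text) for the rest."""
    tokens = []
    for phrase, paren, word in TOKEN_RE.findall(query):
        if paren:
            tokens.append(("OP", paren))
        elif word.lower() in KEYWORDS:
            tokens.append(("OP", word.lower()))
        else:
            tokens.append(("TERM", word or phrase))
    return tokens


def parse(query):
    """Build a predicate text -> bool from a boolean search expression."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        left = parse_and()
        while peek() == ("OP", "or"):
            advance()
            right = parse_and()
            left = (lambda a, b: lambda text: a(text) or b(text))(left, right)
        return left

    def parse_and():
        left = parse_not()
        while peek() == ("OP", "and"):
            advance()
            right = parse_not()
            left = (lambda a, b: lambda text: a(text) and b(text))(left, right)
        return left

    def parse_not():
        if peek() == ("OP", "not"):
            advance()
            inner = parse_not()
            return lambda text: not inner(text)
        return parse_atom()

    def parse_atom():
        kind, value = peek()
        if (kind, value) == ("OP", "("):
            advance()
            inner = parse_or()
            if peek() != ("OP", ")"):
                raise ValueError("missing closing parenthesis")
            advance()
            return inner
        if kind == "TERM":
            advance()
            needle = value.lower()
            return lambda text: needle in text.lower()  # case-insensitive "contains"
        raise ValueError("unexpected token: %r" % (value,))

    predicate = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: %r" % (tokens[pos][1],))
    return predicate


match = parse('cotton and (red or "dark blue")')
print(match("Dark Blue cotton shirt"))  # True
print(match("green cotton shirt"))      # False
```

The matching is by substring, like __icontains, so a term such as red would also match inside a longer word; whether that is acceptable depends on the data.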
I'm using peewee as ORM for a project and want to extend it to handle logical deletes.
I've added "deleted" field to my base model and have extended the delete operations as follows:
@classmethod
def delete(cls, permanently=False):
    if permanently:
        return super(BaseModel, cls).delete()
    else:
        return super(BaseModel, cls).update(deleted=True, modified_at=datetime.datetime.now())
def delete_instance(self, permanently=False, recursive=False, delete_nullable=False):
    if permanently:
        return self.delete(permanently).where(self.pk_expr()).execute()
    else:
        self.deleted = True
        return self.save()
This works great. However, when I'm overriding select I get some problems.
@classmethod
def select(cls, *selection):
    print selection
    return super(BaseModel, cls).select(cls, *selection).where(cls.deleted == False)
This works in most cases, but certain selects break when the resulting query ends up using a join with the keyword "IN", with the following error: "1241, 'Operand should contain 1 column(s)'"
Any suggestion on how to properly override select or work around this problem?
I always use a field on my models to indicate whether the model is deleted. I do not recommend overriding methods like delete, delete_instance and especially select. Rather create a new API and use that. Here's how I typically do it:
class StatusModel(Model):
    status = IntegerField(
        choices=(
            (1, 'Public'),
            (2, 'Private'),
            (3, 'Deleted')),
        default=1)
    @classmethod
    def public(cls):
        return cls.select().where(cls.status == 1)
I'm using django-filter app. There is however one problem I do not know how to solve. It's almost exactly the same thing as is described in django documentation:
https://docs.djangoproject.com/en/1.2/topics/db/queries/#spanning-multi-valued-relationships
I want to make a query that selects all Blogs that have an entry with both "Lennon" in the headline and a publication date in 2008, e.g.:
Blog.objects.filter(entry__headline__contains='Lennon',
                    entry__pub_date__year=2008)
and not one that selects Blogs that have an entry with "Lennon" in the headline and another entry (possibly the same) that was published in 2008:
Blog.objects.filter(entry__headline__contains='Lennon').filter(
    entry__pub_date__year=2008)
However, if I set up a FilterSet with two fields (never mind __contains vs. __exact, it's just an example):
class BlogFilter(django_filters.FilterSet):
    entry__headline = django_filters.CharFilter()
    entry__pub_date = django_filters.CharFilter()
    class Meta:
        model = Blog
        fields = ['entry__headline', 'entry__pub_date', ]
django-filter will generate the latter:
Blog.objects.filter(entry__headline__exact='Lennon').filter(
    entry__pub_date__exact=2008)
Is there a way to combine both filters into a single filter field?
Well, I came up with a solution. It is not possible with the regular django-filter, so I extended it a bit. It could be improved; this is a quick-and-dirty solution.
First, I added a custom "grouped" argument to django_filters.Filter and a filter_grouped method (almost a copy of the filter method):
class Filter(object):
    def __init__(self, name=None, label=None, widget=None, action=None,
                 lookup_type='exact', required=False, grouped=False, **kwargs):
        (...)
        self.grouped = grouped
    def filter_grouped(self, qs, value):
        if isinstance(value, (list, tuple)):
            lookup = str(value[1])
            if not lookup:
                lookup = 'exact' # we fallback to exact if no choice for lookup is provided
            value = value[0]
        else:
            lookup = self.lookup_type
        if value:
            return {'%s__%s' % (self.name, lookup): value}
        return {}
The only difference is that instead of applying a filter to the queryset, it returns a dictionary.
Second, I updated the BaseFilterSet qs method/property:
class BaseFilterSet(object):
    (...)
    @property
    def qs(self):
        if not hasattr(self, '_qs'):
            qs = self.queryset.all()
            grouped_dict = {}
            for name, filter_ in self.filters.iteritems():
                try:
                    if self.is_bound:
                        data = self.form[name].data
                    else:
                        data = self.form.initial.get(name, self.form[name].field.initial)
                    val = self.form.fields[name].clean(data)
                    if filter_.grouped:
                        grouped_dict.update(filter_.filter_grouped(qs, val))
                    else:
                        qs = filter_.filter(qs, val)
                except forms.ValidationError:
                    pass
            if grouped_dict:
                qs = qs.filter(**grouped_dict)
            (...)
        return self._qs
The trick is to store all "grouped" filters in a dictionary and then use them all as a single filter.
The filter will look something like this then:
class BlogFilter(django_filters.FilterSet):
    entry__headline = django_filters.CharFilter(grouped=True)
    entry__pub_date = django_filters.CharFilter(grouped=True)
    class Meta:
        model = Blog
        fields = ['entry__headline', 'entry__pub_date', ]
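For readers unsure why a single filter() call and two chained filter() calls differ on multi-valued relations, the following plain-Python model of the semantics may help. The blogs dictionary and the helper names are invented for the illustration; no Django or django-filter code is involved.

```python
blogs = {
    "Blog A": [  # one entry satisfies both conditions
        {"headline": "Lennon remembered", "year": 2008},
    ],
    "Blog B": [  # the conditions are met only by two different entries
        {"headline": "Lennon remembered", "year": 2005},
        {"headline": "Something else", "year": 2008},
    ],
}


def has_lennon(entry):
    return "Lennon" in entry["headline"]


def from_2008(entry):
    return entry["year"] == 2008


def single_filter(blogs):
    """Like filter(a, b): one and the same entry must satisfy both conditions."""
    return [name for name, entries in blogs.items()
            if any(has_lennon(e) and from_2008(e) for e in entries)]


def chained_filters(blogs):
    """Like filter(a).filter(b): each condition may be met by a different entry."""
    return [name for name, entries in blogs.items()
            if any(has_lennon(e) for e in entries)
            and any(from_2008(e) for e in entries)]


print(single_filter(blogs))    # ['Blog A']
print(chained_filters(blogs))  # ['Blog A', 'Blog B']
```

Collecting the grouped lookups into one dictionary and passing it to a single filter() call corresponds to single_filter here, which is the behaviour the question asked for.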