How can I deal with a massive delete from Django Admin? - python

I'm working with Django 2.2.10.
I have a model called Site, and a model called Record.
Each record is associated with a single site (Foreign Key).
After my app runs for a few days/weeks/months, each site can have thousands of records associated with it. I use the database efficiently, so this isn't normally a problem.
In Django Admin, when I try to delete a site however, Django Admin tries to figure out every single associated object that will also be deleted, and because my ForeignKey uses on_delete=models.CASCADE, which is what I want, it tries to generate a page that lists thousands, possibly millions of records that will be deleted. Sometimes this succeeds, but takes a few seconds. Sometimes the browser just gives up waiting.
How can I have Django Admin not list every single record it intends to delete? Maybe just say something like "x number of records will be deleted" instead.
Update: Should I be overriding Django admin's delete_confirmation.html? It looks like the culprit might be this line:
<ul>{{ deleted_objects|unordered_list }}</ul>
Or is there an option somewhere that can be enabled to automatically not list every single object to be deleted, perhaps if the object count is over X number of objects?
Update 2: Removing the above line from delete_confirmation.html didn't help. I think the view that generates the deleted_objects variable is what takes too long. I'm not quite sure how to override a Django Admin view.

Add this to your admin class, and then you can delete through this action without the confirmation page:
actions = ["silent_delete"]
def silent_delete(self, request, queryset):
    queryset.delete()
If you want to hide the default delete action, add this to your admin class:
def get_actions(self, request):
    actions = super().get_actions(request)
    if 'delete_selected' in actions:
        del actions['delete_selected']
    return actions

Since Django 2.1 you can override get_deleted_objects to limit the number of deleted objects listed (the first item it returns is a list, possibly nested). The timeout is probably due to the Django app server timing out on the view's response.
You could limit the size of the returned list:
from django.contrib import admin
class YourModelAdmin(admin.ModelAdmin):
    def get_deleted_objects(self, objs, request):
        deleted = super().get_deleted_objects(objs, request)
        # deleted is (deleted_objects, model_count, perms_needed, protected)
        return (self._limit_nested(deleted[0]),) + tuple(deleted[1:])
    def _limit_nested(self, objs, limit=10):
        # Leaves (display strings) pass through; lists are cut at every level.
        if not isinstance(objs, list):
            return objs
        limited = [self._limit_nested(obj, limit) for obj in objs[:limit]]
        if len(objs) > limit:
            limited.append('...')
        return limited
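To see what this kind of truncation does to a nested list, here is a minimal standalone sketch that runs without Django. The function name limit_nested and the sample data are illustrative assumptions; the real list built by the admin may be nested differently.

```python
def limit_nested(items, limit=10):
    """Truncate a (possibly nested) list to `limit` entries per level."""
    if not isinstance(items, list):
        return items  # a leaf, e.g. the display string for one object
    trimmed = [limit_nested(item, limit) for item in items[:limit]]
    if len(items) > limit:
        trimmed.append("...")  # marker showing that entries were dropped
    return trimmed


# Made-up sample: a parent label followed by a list of its children.
deleted = ["Site: example", ["Record: %d" % i for i in range(5)]]
print(limit_nested(deleted, limit=3))
# ['Site: example', ['Record: 0', 'Record: 1', 'Record: 2', '...']]
```

Each level is cut independently, so a site with thousands of records still renders as a handful of lines.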
But chances are the call to super takes too long as well, so you probably want to return [], {}, set(), [] instead of calling super. The confirmation page then no longer tells you about missing permissions or protected relations (I saw no alternative other than copy-pasting code from Django's GitHub repository). You will want to override the delete_confirmation.html and delete_selected_confirmation.html templates as well, and make sure the admin user has permission to delete any related objects that the cascade will remove.
In fact, the deletion itself may take too long. It's probably best to defer the deletion (and the permission checks, if those are slow too) to a Celery task.
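If the deletion is deferred to a background task, one option is to work through the primary keys in batches so that no single query or transaction grows too large. Batching is an assumption added here and is not part of the answer above. The helper below is plain Python, and the ORM call in the comment is only a guess at how it might be used.

```python
def batched(ids, size):
    """Yield successive lists of at most `size` ids."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


record_ids = list(range(1, 11))  # stand-in for a list of primary keys
for batch in batched(record_ids, 4):
    # Inside a task, something like
    # Record.objects.filter(pk__in=batch).delete() could run here (assumed).
    print(batch)
```

The batch size is a tuning knob: larger batches mean fewer queries, smaller ones keep each transaction short.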

Related

How do I prevent the Django admin delete confirmation view to display all the objects to be deleted? [duplicate]

I have a model with a huge amount of data, and Django takes a very long time to build the delete confirmation page. I need to skip this step and delete the data without any confirmation. I have tried some solutions from the internet, but they don't work: I still see the confirmation page.
Does anyone know how to do that?
Django 2.1 introduced the ModelAdmin method get_deleted_objects, which lets you control what is passed to the confirmation screen for both single and multiple deletes (e.g. using the delete action).
In my case, I wanted to delete a list of objects with several relationships, all set to cascade on deletion. I ended up with something like this:
def get_deleted_objects(self, objs, request):
    deleted_objects = [str(obj) for obj in objs]
    model_count = {MyModel._meta.verbose_name_plural: len(deleted_objects)}
    perms_needed = []
    protected = []
    return (deleted_objects, model_count, perms_needed, protected)
I could include other models in the model_count dict, fetching only their counts, so that the page still avoids listing thousands of minor instances that I don't need to see individually.
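As a rough sketch of that idea, the summary tuple can be assembled from counts alone. The helper summarize_deletion and its arguments are hypothetical names used only for illustration. In a real admin the related counts would have to come from count queries on the related models, and this sketch skips permission and protection checks just like the override above.

```python
def summarize_deletion(selected, selected_label, related_counts):
    """Build (deleted_objects, model_count, perms_needed, protected).

    Only the selected objects are listed by name; related models
    contribute a count and nothing else.
    """
    deleted_objects = [str(obj) for obj in selected]
    model_count = {selected_label: len(deleted_objects)}
    model_count.update(related_counts)
    perms_needed = set()  # no missing permissions are reported
    protected = []        # no protected objects are reported
    return deleted_objects, model_count, perms_needed, protected


summary = summarize_deletion(["Site A", "Site B"], "sites", {"records": 120000})
print(summary[1])  # {'sites': 2, 'records': 120000}
```

The confirmation page can then show "120000 records" as a single summary line instead of one line per record.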
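Another approach is to define your own delete_selected action, which deletes immediately without a confirmation page:
from django.contrib import admin
def delete_selected(modeladmin, request, queryset):
    queryset.delete()
class SomeAdmin(admin.ModelAdmin):
    actions = (delete_selected,)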

What is the most efficient way to iterate django objects updating them?

I have a queryset to update:
stories = Story.objects.filter(introtext="")
for story in stories:
    # just set it to the first 'sentence'
    story.introtext = story.content[0:(story.content.find('.'))] + ".</p>"
    story.save()
The save() call completely kills performance, and the process list shows multiple entries for "./manage.py shell" (yes, I ran this through the Django shell).
However, in the past I've run scripts that didn't need to call save(), because they were changing a many-to-many field. Those scripts were very performant.
My project has this code, which could be relevant to why those scripts were so fast:
@receiver(signals.m2m_changed, sender=Story.tags.through)
def save_story(sender, instance, action, reverse, model, pk_set, **kwargs):
    instance.save()
What is the best way to update a large queryset (10000+) efficiently?
Because the new introtext value depends on the content field of each object, you can't do a plain bulk update. But you can speed up saving the individual objects by wrapping the loop in a transaction:
from django.db import transaction
with transaction.atomic():
    stories = Story.objects.filter(introtext='')
    for story in stories:
        introtext = story.content[0:(story.content.find('.'))] + ".</p>"
        Story.objects.filter(pk=story.pk).update(introtext=introtext)
transaction.atomic() can increase speed by an order of magnitude, because the rows are committed once rather than one by one.
The filter(pk=story.pk).update() trick avoids the pre_save/post_save signals that a plain save() emits. This is the officially recommended method of updating a single field of the object.
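One detail of the slicing expression is worth checking separately: str.find returns -1 when there is no period, so content[0:-1] silently drops the last character. Below is a small sketch of a safer helper. The name first_sentence_intro, and the choice to return the content unchanged when no period is found, are assumptions made for illustration.

```python
def first_sentence_intro(content):
    """Return the text up to the first period, closed with '.</p>'."""
    end = content.find(".")
    if end == -1:
        # No period found: leave the content alone rather than slicing
        # with -1, which would cut off the final character.
        return content
    return content[:end] + ".</p>"


print(first_sentence_intro("<p>First sentence. Second one.</p>"))
# <p>First sentence.</p>
```

Inside the loop, the helper would replace the inline slicing expression; everything else stays the same.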
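You can also use the built-in update() method on a queryset.
Example:
MyModel.objects.all().update(color='red')
In your case the new value is derived from another column, so a bare F() expression is not enough: F() objects cannot be sliced or searched like Python strings. One option is to combine F() with the database functions StrIndex, Substr and Concat:
from django.db.models import F, Value
from django.db.models.functions import Concat, StrIndex, Substr
stories = Story.objects.filter(introtext__exact='')
# StrIndex is 1-based, so Substr keeps everything up to and including the first '.'
first_sentence = Substr(F('content'), 1, StrIndex(F('content'), Value('.')))
stories.update(introtext=Concat(first_sentence, Value('</p>')))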

How django validates and saves inlineforms

I would like to implement my own functionality similar to Django inline formsets. What I'm interested in is how Django deals with validating and saving a main object together with its related objects in inline forms.
Let's say I have two models: Blog and Entry. Entry has a non-null foreign key to Blog. I want to be able to create both the blog and its entries in one place. This is how I would do it using Django inline forms:
blogform = BlogForm(request.POST)
if blogform.is_valid():
    tmp = blogform.save(commit=False)
    entriesform = EntryInlineFormset(request.POST, instance=tmp)
    if entriesform.is_valid():
        tmp.save()
        entriesform.save()
What's going on under the hood here? How is Django able to validate the entries without the blog being saved to the database? I wanted to find this in the Django code, but I wasn't able to find the place where it actually happens.
My guess is that it creates a transaction: it saves the blogform, and if the entriesform is invalid it rolls back. But what if the entriesform is valid, what next? Does the blog instance stay saved? What if save never gets called?
Or does the transaction span two methods (is_valid and save)? I don't think it's best practice to start a transaction in one method and end it in another.
You can validate them both before calling save on either. You can pass a blank instance into both the parent form and the formset:
blog = Blog()
blogform = BlogForm(request.POST, instance=blog)
entriesform = EntryInlineFormset(request.POST, instance=blog)
blog_valid = blogform.is_valid()
entries_valid = entriesform.is_valid()
if blog_valid and entries_valid:
    ... save ...
I validate the forms separately and save the results to variables to avoid short-circuiting.
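The short-circuiting point can be shown without Django. FakeForm below is a hypothetical stand-in that only records whether is_valid() was called. With a real form, a skipped is_valid() call means its errors are never collected for display.

```python
class FakeForm:
    """Hypothetical stand-in for a form; tracks whether validation ran."""

    def __init__(self, valid):
        self._valid = valid
        self.validated = False

    def is_valid(self):
        self.validated = True
        return self._valid


# Chained with `and`: the second form is never validated when the first fails.
parent, children = FakeForm(False), FakeForm(True)
both_ok = parent.is_valid() and children.is_valid()
print(both_ok, children.validated)  # False False

# Validated separately: both forms always run their validation.
parent2, children2 = FakeForm(False), FakeForm(True)
parent_ok = parent2.is_valid()
children_ok = children2.is_valid()
print(parent_ok and children_ok, children2.validated)  # False True
```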

when to delete the cache entry in django

In my Django application, I have a BlogEntry which belongs to a Category. A BlogEntry may belong to many Categories.
class BlogEntry(models.Model):
    creationdate = models.DateField(default=date.today)
    description = models.TextField()
    author = models.ForeignKey(User, null=True)
    # 'Category' is given as a string because the class is defined further down
    categories = models.ManyToManyField('Category')
class Category(models.Model):
    name = models.CharField(unique=True, max_length=50)
    description = models.TextField(blank=True)
A user may edit a BlogEntry and, in doing so, remove a Category it was in.
Suppose blogEntry1 belonged to java and scala before. If the user edits it and removes scala, the entry now has only one category, i.e. java.
In my list view I am using the cache as below:
from django.core.cache import cache
def list_entries_on_day(request, year, month, day):
    ...
    key = 'entries_day' + '-' + year + '-' + month + '-' + day
    if key not in cache:
        entries = BlogEntry.objects.filter(...args..)
        cache.set(key, entries)
    entries_on_day = cache.get(key)
    ...
Suppose I have created two entries for today and they have been put in the cache, and I then edit one of these BlogEntry objects and remove a category, i.e.:
blogEntry1 has categories: java, scala
blogEntry2 has categories: dotnet, vbasic
Initially I query the entries for today and put the result in the cache.
The cache now has [blogEntry1, blogEntry2] against the key 'entries_day-2012-11-11'.
Now I edit blogEntry1 so that it only has java as a category.
Do I need to remove the stored entries from the cache, since the cache contains the BlogEntry object as it was before the modification?
You can invalidate the cache by registering a signal handler for the model's post_save signal.
You can also live with the fact that users will see stale content until the cache entry expires (Django's default timeout is 300 seconds). In that case, make sure the logged-in user who made the edit does not see the cached content, otherwise they will think the edit was lost.
Hmm, my answer is a bit vague, but I just wanted to say: no, you don't strictly have to invalidate the cache on each edit; it is a trade-off between performance and content freshness.
One more nit: the preferred idiom for cache usage is:
entries_on_day = cache.get(key)
if entries_on_day is None:
    entries_on_day = BlogEntry.objects.filter(...args..)
    cache.set(key, entries_on_day)
This saves one cache query.
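Both points, the get-then-check idiom and invalidation on edit, can be sketched with a tiny in-memory stand-in. TinyCache and cached_entries are hypothetical names; the stand-in only mimics the get/set/delete calls of Django's cache object and has no timeouts.

```python
class TinyCache:
    """Hypothetical in-memory stand-in exposing get/set/delete."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def cached_entries(cache, key, load):
    """Single lookup: a result of None is treated as a cache miss."""
    entries = cache.get(key)
    if entries is None:
        entries = load()
        cache.set(key, entries)
    return entries


cache = TinyCache()
key = "entries_day-2012-11-11"
calls = []  # records how often the (pretend) database query runs


def load():
    calls.append(1)
    return ["blogEntry1", "blogEntry2"]


cached_entries(cache, key, load)  # miss: runs the query
cached_entries(cache, key, load)  # hit: served from the cache
cache.delete(key)                 # what a signal handler could do after an edit
cached_entries(cache, key, load)  # miss again: picks up the edit
print(len(calls))  # 2
```

Deleting the key on edit trades one extra query for fresh content, which is the performance versus freshness choice described above.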
