when to delete the cache entry in django - python

In my Django application, I have a BlogEntry which belongs to a Category. A BlogEntry may belong to many Categories:
from datetime import date

from django.contrib.auth.models import User
from django.db import models

class BlogEntry(models.Model):
    creationdate = models.DateField(default=date.today)
    description = models.TextField()
    author = models.ForeignKey(User, null=True)
    # string reference, since Category is defined below this class
    categories = models.ManyToManyField('Category')

class Category(models.Model):
    name = models.CharField(unique=True, max_length=50)
    description = models.TextField(blank=True)
A user may edit a BlogEntry and, in doing so, may remove a Category it was in.
Suppose blogEntry1 belonged to java, scala before. If the user edits it such that he removes scala, the entry now has only one category, i.e. java.
In my list view I am using the cache as below:
from django.core.cache import cache

def list_entries_on_day(request, year, month, day):
    ...
    key = 'entries_day' + '-' + year + '-' + month + '-' + day
    if key not in cache:
        entries = BlogEntry.objects.filter(...args..)
        cache.set(key, entries)
    entries_on_day = cache.get(key)
    ...
Suppose I have created 2 entries for today and these are put in the cache. If I edit one of these BlogEntries and remove a category, i.e.:
blogEntry1 has categories: java, scala
blogEntry2 has categories: dotnet, vbasic
Initially I make a query for today's entries and put the result in the cache.
The cache now has [blogEntry1, blogEntry2] against the key 'entries_day-2012-11-11'.
Now I edit blogEntry1 so that it has only java as its category. Do I need to remove the stored entries from the cache? (Since the cache contains the BlogEntry objects from before the modification.)

You can invalidate the cache by registering a signal handler for the model's save signal.
You can also live with the fact that users will see stale content until the cache expires (1 hour by default). Just make sure the logged-in user does not see the cached content, otherwise he will think his edit was lost.
Hmmm, my answer is a bit vague, but I just wanted to say: no, you don't strictly have to invalidate the cache on each edit; it is a trade-off between performance and content freshness.
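For the invalidation route, here is a minimal sketch using post_save plus m2m_changed (editing the categories alone does not fire post_save). The key format is taken from the question; the zero-padding produced by strftime and the myapp import path are assumptions:

from django.core.cache import cache
from django.db.models.signals import m2m_changed, post_save
from django.dispatch import receiver

from myapp.models import BlogEntry  # import path is an assumption

def _invalidate_day(entry):
    # Must build keys exactly the way the view does (zero-padding included)
    cache.delete(entry.creationdate.strftime('entries_day-%Y-%m-%d'))

@receiver(post_save, sender=BlogEntry)
def entry_saved(sender, instance, **kwargs):
    _invalidate_day(instance)

@receiver(m2m_changed, sender=BlogEntry.categories.through)
def entry_categories_changed(sender, instance, **kwargs):
    # instance is the BlogEntry when categories are changed from the entry side
    _invalidate_day(instance)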
One more nit: the preferred idiom for cache usage is:
entries_on_day = cache.get(key)
if entries_on_day is None:
    entries_on_day = BlogEntry.objects.filter(...args..)
    cache.set(key, entries_on_day)
You save one cache query.
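As an aside, on Django 1.9+ the same idiom is built in as cache.get_or_set; passing a callable defers the query so it only runs on a cache miss:

# the lambda is only called when the key is missing from the cache
entries_on_day = cache.get_or_set(key, lambda: BlogEntry.objects.filter(...args..))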

Related

How do I prevent the Django admin delete confirmation view from displaying all the objects to be deleted? [duplicate]

I have a model with a huge amount of data, and Django takes a very long time to create the delete confirmation page. I need to skip this process and delete the data without any confirmation. I have tried some solutions from the internet, but they don't work - I still see the confirmation page.
Does anyone know how to do that?
Django 2.1 introduced the new ModelAdmin method get_deleted_objects, which lets you control what the confirmation screen shows for both single and multiple delete (e.g. using the delete action).
In my case, I wanted to delete a list of objects with several relationships, all set to cascade on delete. I ended up with something like this:
def get_deleted_objects(self, objs, request):
    deleted_objects = [str(obj) for obj in objs]
    model_count = {MyModel._meta.verbose_name_plural: len(deleted_objects)}
    perms_needed = []
    protected = []
    return (deleted_objects, model_count, perms_needed, protected)
I could include other models in the model_count dict, getting only the count, for example, to still avoid listing thousands of minor instances that I don't need to see individually.
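For instance, a hedged sketch of that count-only idea; RelatedModel and its parent field are hypothetical stand-ins for whatever cascades from MyModel:

model_count = {
    MyModel._meta.verbose_name_plural: len(deleted_objects),
    # cascading children show up as a count only, never as a listing
    RelatedModel._meta.verbose_name_plural:
        RelatedModel.objects.filter(parent__in=objs).count(),
}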
Defining a custom delete_selected action also skips the intermediate confirmation page for bulk deletes, since it replaces the built-in action:

def delete_selected(modeladmin, request, queryset):
    queryset.delete()

class SomeAdmin(admin.ModelAdmin):
    actions = (delete_selected,)

How can I deal with a massive delete from Django Admin?

I'm working with Django 2.2.10.
I have a model called Site, and a model called Record.
Each record is associated with a single site (Foreign Key).
After my app runs for a few days/weeks/months, each site can have thousands of records associated with it. I use the database efficiently, so this isn't normally a problem.
In Django Admin, however, when I try to delete a site, Django tries to figure out every single associated object that will also be deleted. Because my ForeignKey uses on_delete=models.CASCADE, which is what I want, it tries to generate a page that lists thousands, possibly millions, of records that will be deleted. Sometimes this succeeds but takes a few seconds; sometimes the browser just gives up waiting.
How can I have Django Admin not list every single record it intends to delete? Maybe just say something like "x number of records will be deleted" instead.
Update: Should I be overriding Django admin's delete_confirmation.html? It looks like the culprit might be this line:
<ul>{{ deleted_objects|unordered_list }}</ul>
Or is there an option somewhere that can be enabled to automatically not list every single object to be deleted, perhaps if the object count is over X number of objects?
Update 2: Removing the above line from delete_confirmation.html didn't help. I think it's the view that generates the deleted_objects variable that is taking too long. I'm not quite sure how to override a Django Admin view.
Add this to your admin class, and then you can delete with this action without a warning:
actions = ["silent_delete"]

def silent_delete(self, request, queryset):
    queryset.delete()
If you want to hide the default delete action, add this to your admin class:
def get_actions(self, request):
    actions = super().get_actions(request)
    if 'delete_selected' in actions:
        del actions['delete_selected']
    return actions
Since Django 2.1 you can override get_deleted_objects to limit the number of deleted objects listed (it's a list whose items may themselves be nested lists). The timeout is probably due to the Django app server timing out on the view's response.
You could limit the size of the returned list:
class YourModelAdmin(admin.ModelAdmin):
    def get_deleted_objects(self, objs, request):
        deleted = super().get_deleted_objects(objs, request)
        deleted_objs = deleted[0]
        return (self.__limit_nested(deleted_objs),) + deleted[1:]

    def __limit_nested(self, objs):
        # Truncate every (possibly nested) list to at most `limit` items,
        # recursing into sublists and leaving the string leaves untouched
        limit = 10
        if isinstance(objs, list):
            shown = [self.__limit_nested(obj) for obj in objs[:limit]]
            if len(objs) > limit:
                shown.append('...')
            return shown
        return objs
But chances are the call to super takes too long as well, so you probably want to return [], {}, set(), [] instead of calling super; that way it won't tell you about missing permissions or protected relations, though (I saw no alternative short of copy-pasting code from the Django source). You will want to override the delete_confirmation.html and delete_selected_confirmation.html templates as well. You'll also want to make sure the admin has permission to delete any related objects that the cascading deletes will remove.
In fact, the deletion itself may take too long. It's probably best to defer the deletion (and the permission checks, if those are slow too) to a celery task.
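A minimal sketch of that deferral, assuming Celery is already configured for the project; the task name and import path are hypothetical, and Site is the model from the question:

from celery import shared_task

from myapp.models import Site  # import path is an assumption

@shared_task
def delete_site(site_pk):
    # The cascading delete runs in a worker, outside the request cycle
    Site.objects.filter(pk=site_pk).delete()

# In the ModelAdmin, enqueue instead of deleting inline (Django 2.1+ hook):
def delete_queryset(self, request, queryset):
    for pk in queryset.values_list('pk', flat=True):
        delete_site.delay(pk)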

Django model refresh_from_db() vs. explicitly recalling from db

If I have an object retrieved from a model, for example:
obj = Foo.objects.first()
I know that if I want to reference this object later and make sure that it has the current values from the database, I can call:
obj.refresh_from_db()
My question is, is there any advantage to using the refresh_from_db() method over simply doing?:
obj = Foo.objects.get(id=obj.id)
As far as I know, the result will be the same. refresh_from_db() seems more explicit, but in some cases it means an extra line of code. Let's say I update the value field of obj and later want to test that it has been updated to False. Compare:
obj = Foo.objects.first()
assert obj.value is True
# value of foo obj is updated somewhere to False and I want to test below
obj.refresh_from_db()
assert obj.value is False
with this:
obj = Foo.objects.first()
assert obj.value is True
# value of foo obj is updated somewhere to False and I want to test below
assert Foo.objects.get(id=obj.id).value is False
I am not interested in a discussion of which of the two is more pythonic. Rather, I am wondering whether one method has a practical advantage over the other in terms of resources, performance, etc. I have read this bit of documentation, but I was not able to ascertain from it whether there is an advantage to using refresh_from_db(). Thank you!
Django sources are usually relatively easy to follow. If we look at the refresh_from_db() implementation, at its core it is still using this same Foo.objects.get(id=obj.id) approach:
db_instance_qs = self.__class__._default_manager.using(db).filter(pk=self.pk)
...
db_instance_qs = db_instance_qs.only(*fields)
...
db_instance = db_instance_qs.get()
There are just a couple of extra bells and whistles:
deferred fields are ignored
stale foreign key references are cleared (according to the comment explanation)
So for everyday usage it is safe to say that they are pretty much the same; use whatever you like.
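One practical nuance worth noting (my illustration, not from the answer): refresh_from_db() updates the existing instance in place, while .get() binds a brand-new instance, which matters if other names still reference the old object:

obj = Foo.objects.first()
alias = obj

obj.refresh_from_db()              # alias sees the fresh values too
obj = Foo.objects.get(id=obj.id)   # alias still points at the stale instance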
Just to add to @serg's answer, there's a case where explicitly re-fetching from the db is helpful and refreshing from the db isn't so useful.
This is the case when you're adding permissions to an object and checking them immediately afterwards, and you need to clear the cached permissions for the object so that your permission checks work as expected.
According to the permission caching section of the django documentation:
The ModelBackend caches permissions on the user object after the first time they need to be fetched for a permissions check. This is typically fine for the request-response cycle since permissions aren’t typically checked immediately after they are added (in the admin, for example). If you are adding permissions and checking them immediately afterward, in a test or view for example, the easiest solution is to re-fetch the user from the database...
As an example, consider this block of code inspired by the one in the documentation cited above:
from django.contrib.auth import get_user_model
from django.contrib.auth.models import Permission
from django.contrib.contenttypes.models import ContentType
from smoothies.models import Smoothie

def force_deblend(user, smoothie):
    # Any permission check will cache the current set of permissions
    if not user.has_perm('smoothies.deblend_smoothie'):
        permission = Permission.objects.get(
            codename='deblend_smoothie',
            content_type=ContentType.objects.get_for_model(Smoothie)
        )
        user.user_permissions.add(permission)

        # Subsequent permission checks hit the cached permission set
        print(user.has_perm('smoothies.deblend_smoothie'))  # False

        # Re-fetch user (explicitly) from db to clear permissions cache
        # Be aware that user.refresh_from_db() won't help here
        user = get_user_model().objects.get(pk=user.pk)

        # Permission cache is now repopulated from the database
        print(user.has_perm('smoothies.deblend_smoothie'))  # True
    ...
It seems there is a difference if you use cached properties. See here:
>>> p.roles[0]
<Role: 16888649>
>>> p.refresh_from_db()
>>> p.roles[0]
<Role: 16888649>
>>> p = Person.objects.get(id=p.id)
>>> p.roles[0]
<Role: 16888650>
Definition from models.py:
@cached_property
def roles(self):
    return Role.objects.filter(employer__person=self).order_by("id")
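A small aside of mine, not from the answer: a cached_property can also be invalidated in place by deleting the cached attribute, so a full re-fetch is not the only option:

del p.roles   # raises AttributeError if the property was never accessed yet
p.roles[0]    # re-evaluates the property and hits the database again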

appengine many to many field update value and lookup efficiently

I am using appengine with python 2.7 and the webapp2 framework. I am not using ndb.model.
I have the following model:
class Story(db.Model):
    name = db.StringProperty()

class UserProfile(db.Model):
    name = db.StringProperty()
    user = db.UserProperty()

class Tracking(db.Model):
    user_profile = db.ReferenceProperty(UserProfile)
    story = db.ReferenceProperty(Story)
    upvoted = db.BooleanProperty()
    flagged = db.BooleanProperty()
A user can upvote and/or flag a story, but only once. Hence I came up with the above model.
Now when a user clicks on the upvote link, I try to check in the database whether the user has already voted, so I do the following:
get the user instance by its id: up = db.get(db.Key.from_path('UserProfile', uid))
then get the story instance: s_ins = db.get(db.Key.from_path('Story', sid))
Now check whether a Tracking based on these two exists; if yes, don't allow voting, else allow the vote and update the Tracking instance.
What is the most convenient way to fetch a Tracking instance given the ids (db.key().id()) of the user_profile and story?
What is the most convenient way to save a Tracking model given a user profile id and a story id?
Is there a better way to implement tracking?
You can try tracking using lists of keys versus having a separate entry for track/user/story:
class Story(db.Model):
    name = db.StringProperty()

class UserProfile(db.Model):
    name = db.StringProperty()
    user = db.UserProperty()

class Tracking(db.Model):
    story = db.ReferenceProperty(Story)
    upvoted = db.ListProperty(db.Key)
    flagged = db.ListProperty(db.Key)
So when you want to see if a user upvoted a given story:
Tracking.all(keys_only=True) \
    .filter('story =', db.Key.from_path('Story', sid)) \
    .filter('upvoted =', db.Key.from_path('UserProfile', uid)) \
    .get()
Now the only problem here is that the upvoted/flagged lists can't grow too large (I think the limit is 5000 entries), so you'd have to write code to manage this (that is, when adding to the upvoted/flagged lists, detect whether X entries exist, and if so, start a new tracking object to hold additional values). You will also have to make this transactional, and with the High Replication datastore you have a threshold of roughly 1 write per second per entity group. This may or may not be an issue depending on your expected use case. A way around the write threshold would be to implement upvotes/flags using pull queues and to have a cron job that pulls and batch-updates tracking objects as needed, as sketched below.
This method has its pros and cons. The most obvious cons are the ones I just listed. The pros, however, may be worth it. You can get a full list of users who upvoted/flagged a story from a single list (or multiple lists, depending on how popular the story is). You can get a full list of users with a lot fewer queries to the datastore. This method should also take less storage, index, and metadata space. Additionally, adding a user to a tracking object will be cheaper: instead of writing a new object plus 2 writes for each property, you would just be charged 1 write for the object plus 2 writes for the entry to the list (9 vs 3 writes for adding users to a pre-existing tracked story, or 9 vs 7 for untracked stories).
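A hedged sketch of that pull-queue workaround, assuming a pull queue named 'votes' is declared in queue.yaml; the payload format and function names are illustrative:

import json

from google.appengine.api import taskqueue

def enqueue_upvote(story_key, user_key):
    # Cheap at request time: one task add instead of a contended write
    taskqueue.Queue('votes').add(taskqueue.Task(
        payload=json.dumps({'story': str(story_key), 'user': str(user_key)}),
        method='PULL'))

def apply_upvotes():
    # Run from a cron-mapped handler: lease a batch, apply it, delete it
    queue = taskqueue.Queue('votes')
    tasks = queue.lease_tasks(60, 100)  # 60-second lease, up to 100 tasks
    for task in tasks:
        data = json.loads(task.payload)
        # ...append db.Key(data['user']) to the matching Tracking's
        # upvoted list inside a transaction...
    queue.delete_tasks(tasks)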
What you propose sounds reasonable.
Don't use the App Engine generated key for Tracking. Because the combination of story/user should be unique, create your own key name as a combination of the story/user. Something like:
tracking = Tracking.get_or_insert(
    str(story.key().id()) + "-" + str(user.key().id()), **params)
If you know the story/user, then you can always fetch the tracking by key name.
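For instance, a minimal sketch of that lookup, assuming the "storyid-userid" key-name convention from above:

key_name = "%s-%s" % (story.key().id(), user.key().id())
tracking = Tracking.get_by_key_name(key_name)
if tracking is None:
    # no vote/flag recorded yet for this story/user pair
    pass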
