It's a self-explanatory question, but here we go.
I'm creating a business app in Django, and I didn't want to spread all the logic across the app AND the database, but on the other hand, I didn't want to let the database handle this task (which is possible through the use of triggers).
So I wanted to reproduce the behavior of database triggers, but inside the model class in Django (I'm currently using Django 1.4).
After some research, I figured out that with single objects I could override the save and delete methods of the models.Model class, inserting "before" and "after" hooks so they would be executed before and after the parent's save/delete. Like this:
from django.db import models
from django.db.transaction import commit_on_success


class MyModel(models.Model):

    def __before(self):
        pass

    def __after(self):
        pass

    @commit_on_success  # the decorator only ensures that everything occurs inside the same transaction
    def save(self, *args, **kwargs):
        self.__before()
        super(MyModel, self).save(*args, **kwargs)
        self.__after()
The BIG problem is with bulk operations. Django doesn't trigger the models' save/delete when running update()/delete() on a QuerySet; instead, it uses the QuerySet's own methods. And to make things a little worse, it doesn't fire any signals either.
Edit:
Just to be a little more specific: the model loading inside the view is dynamic, so it's impossible to define a "model-specific" way; in this case, I should create an abstract class and handle it there.
My last attempt was to create a custom Manager and, in this custom manager, override the update method, looping over the models inside the queryset and triggering the save() of each model (taking into consideration the implementation above, or the signals system). It works, but results in a database "overload" (imagine a 10k-row queryset being updated).
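For reference, a minimal sketch of that last attempt (the manager name is hypothetical); this is the variant that hammers the database with one UPDATE per row:

from django.db import models


class TriggerManager(models.Manager):
    def update(self, **kwargs):
        # one save() per row, so every hook/signal fires,
        # but a 10k-row queryset means 10k UPDATE statements
        for obj in self.get_query_set():
            for field, value in kwargs.items():
                setattr(obj, field, value)
            obj.save()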
First, instead of overriding save to add __before and __after methods, you can use the built-in pre_save, post_save, pre_delete, and post_delete signals: https://docs.djangoproject.com/en/1.4/topics/signals/
from django.db import models
from django.db.models.signals import post_save


class YourModel(models.Model):
    pass


def after_save_your_model(sender, instance, **kwargs):
    pass


# register the signal
post_save.connect(after_save_your_model, sender=YourModel, dispatch_uid=__file__)
pre_delete and post_delete will get triggered when you call delete() on a queryset.
For bulk updating, however, you'll have to call the trigger function manually. You can wrap it all in a transaction as well.
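A minimal sketch of that, assuming the receiver above and a hypothetical price field (commit_on_success is the Django 1.4 transaction decorator):

from django.db import transaction


@transaction.commit_on_success
def bulk_update_prices(queryset, new_price):
    queryset.update(price=new_price)  # fires no signals by itself
    for instance in queryset.all():   # re-fetch the updated rows
        after_save_your_model(sender=instance.__class__, instance=instance)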
To call the proper trigger function if you're using dynamic models, you can inspect the model's ContentType. For example:
from django.contrib.contenttypes.models import ContentType
from django.db.models import get_model


def view(request, app, model_name, method):
    ...
    model = get_model(app, model_name)
    content_type = ContentType.objects.get_for_model(model)
    if content_type == ContentType.objects.get_for_model(YourModel):
        after_save_your_model(model)
    elif content_type == ContentType.objects.get_for_model(AnotherModel):
        another_trigger_function(model)
With a few caveats, you can override the queryset's update method to fire the signals, while still using an SQL UPDATE statement:
from django.db.models.query import QuerySet
from django.db.models.signals import pre_save, post_save
from django.db.transaction import commit_on_success


class CustomQuerySet(QuerySet):

    @commit_on_success
    def update(self, **kwargs):
        for instance in self:
            pre_save.send(sender=instance.__class__, instance=instance, raw=False,
                          using=self.db, update_fields=kwargs.keys())
        # use self instead of self.all() if you want to reload all data
        # from the db for the post_save signal
        result = super(CustomQuerySet, self.all()).update(**kwargs)
        for instance in self:
            post_save.send(sender=instance.__class__, instance=instance, created=False,
                           raw=False, using=self.db, update_fields=kwargs.keys())
        return result
    update.alters_data = True
I clone the current queryset (using self.all()) because the update method will clear the cache of the queryset object.
There are a few issues that may or may not break your code. First of all, it introduces a race condition: your pre_save receivers may act on data that is no longer accurate by the time you update the database.
There may also be serious performance issues with large querysets. Unlike a plain update, all models will have to be loaded into memory, and then the signals still need to be executed. Especially if the signals themselves have to interact with the database, performance can be unacceptably slow. And unlike the regular pre_save signal, changing the model instance will not automatically cause the database to be updated, as the model instance is not used to save the new data.
There are probably some more issues that will cause a problem in a few edge cases.
Anyway, if you can handle these issues without running into serious problems, I think this is the best way to do it. It produces as little overhead as possible while still loading the models into memory, which is pretty much required to correctly execute the various signals.
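To actually use it, you would hook the queryset up through a custom manager; a minimal sketch, assuming Django 1.4's get_query_set naming (renamed to get_queryset in 1.6):

from django.db import models


class CustomManager(models.Manager):
    def get_query_set(self):
        return CustomQuerySet(self.model, using=self._db)


class MyModel(models.Model):
    objects = CustomManager()

After this, MyModel.objects.filter(...).update(...) goes through CustomQuerySet.update and fires the signals.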
Related
I have a legacy project that saves models with save, bulk_create, and other methods within the framework.
What is the best way to set a specific value for an attribute so that every time a record is saved the new value is also saved? This value is constructed from other attributes of the instance being saved.
I pose this question because I'm not sure of all the ways an instance can be saved in Django besides save and bulk_create, and I know that on bulk_create:
The model’s save() method will not be called, and the pre_save and post_save signals will not be sent.
https://docs.djangoproject.com/en/1.8/ref/models/querysets/#bulk-create
As far as I know, there are three ways to create/update model instances (which are records in database tables):
Using the model instance method save().
Using the queryset methods create(), update(), get_or_create(), update_or_create() and bulk_create().
Using raw SQL or other low-level ways.
If you intend to calculate the value of a field when saving, you could override all of the methods listed above (see the sketch below).
Signals (like pre_save) are not a complete solution, because they are not triggered when bulk_create() is used, so some instances could get saved without the calculated attribute.
There is no Django way (that I know of) to intercept the third point (raw SQL).
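A minimal sketch of overriding save() and bulk_create() for a hypothetical Person model with a calculated full_name field (the other queryset methods would need similar overrides; QuerySet.as_manager() requires Django 1.7+):

from django.db import models


class PersonQuerySet(models.QuerySet):
    def bulk_create(self, objs, **kwargs):
        # bulk_create skips save() and the signals, so calculate here too
        for obj in objs:
            obj.full_name = '%s %s' % (obj.first_name, obj.last_name)
        return super(PersonQuerySet, self).bulk_create(objs, **kwargs)


class Person(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    full_name = models.CharField(max_length=101, editable=False)

    objects = PersonQuerySet.as_manager()

    def save(self, *args, **kwargs):
        self.full_name = '%s %s' % (self.first_name, self.last_name)
        super(Person, self).save(*args, **kwargs)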
You did not elaborate on your use case, but (depending on your table size and change frequency) maybe you could also try one of the following (see the command sketch after this list):
run a periodic process (maybe using crontab) that updates the calculated field of all model instances;
add a database trigger that calculates the field.
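For the periodic approach, a hypothetical management command (reusing the Person sketch above), which cron could invoke as python manage.py recalculate_full_names:

from django.core.management.base import BaseCommand

from myapp.models import Person  # hypothetical app


class Command(BaseCommand):
    help = 'Recalculate the full_name field on all Person rows'

    def handle(self, *args, **options):
        # iterator() avoids caching the whole table in memory
        for person in Person.objects.all().iterator():
            person.save()  # save() recalculates full_name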
Legacy databases or systems are usually not fun to work with, so maybe you will have to settle for a sub-optimal solution.
You can auto-generate a field's value with a custom function on the model. For example, you have a Post model that also has a slug field, and you want the slug to be generated from the name field by default. Note that a callable passed as a field's default is called with no arguments, so it can't see the instance; the place to apply an instance-derived default is save(). You can write your model like below:
from django.db import models
from django.utils.text import slugify


class Post(models.Model):
    name = models.CharField(max_length=100)
    description = models.TextField()
    attachment = models.FileField(upload_to='attachments/')
    slug = models.CharField(max_length=100, blank=True)

    def generate_slug(self):
        return slugify(self.name)

    def save(self, *args, **kwargs):
        # a default callable can't access the instance, so derive it here
        if not self.slug:
            self.slug = self.generate_slug()
        super(Post, self).save(*args, **kwargs)
This way, when you save a post without an explicit slug, the slug field will be auto-generated from the name field.
Another way to do it is to create a layer between your callers and the models (the database layer) so you can add your logic there. This narrows the possibilities to just the methods you expose in that layer and gives you control over everything that touches the database.
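A hypothetical sketch of such a layer, reusing the Post model above: callers go through PostService instead of touching Post.objects directly, so the slug logic lives in one place:

from django.utils.text import slugify


class PostService(object):
    # the only entry point for creating posts

    @staticmethod
    def create(name, **fields):
        post = Post(name=name, slug=slugify(name), **fields)
        post.save()
        return post

    @staticmethod
    def bulk_create(posts):
        for post in posts:
            post.slug = slugify(post.name)
        return Post.objects.bulk_create(posts)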
The best way to deal with this issue is to override the save() method. You can use raw SQL queries as well, which can easily solve your problem too:
from django.db import models


class MyModel(models.Model):
    field1 = models.CharField(max_length=100)
    field2 = models.CharField(max_length=100)
    field3 = models.CharField(max_length=100)

    def myfunc(self):
        pass

    def save(self, *args, **kwargs):
        # do any extra work (related queries, calculated fields, etc.)
        # here, before or after calling the parent save
        self.myfunc()
        super(MyModel, self).save(*args, **kwargs)
When the user creates a product, multiple actions have to be done in the save() method before calling super(Product, self).save(*args, **kwargs).
I'm not sure if I should use just one pre_save signal to do all these actions, or whether it is better to create a separate signal for each action.
Simple example (I'm going to replace the save overrides with signals):
class Product(models.Model):

    def save(self, *args, **kwargs):
        if not self.pk:
            if not self.category:
                self.category = Category.get_default()
            if not self.brand:
                self.brand = 'NA'
        super(Product, self).save(*args, **kwargs)
        ...
SO
from django.db.models.signals import pre_save
from django.dispatch import receiver


@receiver(pre_save, sender=Product)
def set_attrs(instance, **kwargs):
    # note: pre_save receives no "created" flag, so check the pk instead
    if instance.pk is None:
        instance.category = Category.get_default()
        instance.brand = 'NA'
OR
@receiver(pre_save, sender=Product)
def set_category(instance, **kwargs):
    if instance.pk is None:
        instance.category = Category.get_default()


@receiver(pre_save, sender=Product)
def set_brand(instance, **kwargs):
    if instance.pk is None:
        instance.brand = 'NA'
This is just a simple example. In this case, the general set_attrs would probably be enough, but there are more complex situations with different actions, like creating a user profile for a user and then a user plan, etc.
Is there any best-practice advice for this? Your opinions?
To put it simply, it comes down to a single piece of advice:
If an action on one model's instance affects another model, signals are the cleanest way to go. That is a case where you should use a signal, because you want to avoid a some_model.save() call from within the save() method of another_model.
To elaborate with an example: when overriding save() methods, a common task is to create slugs from some field in the model. If you need to implement this on multiple models, a pre_save signal is a benefit, rather than hard-coding it in the save() method of each model (see the sketch below).
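A minimal sketch of that, assuming two hypothetical models that both have name and slug fields:

from django.db.models.signals import pre_save
from django.dispatch import receiver
from django.utils.text import slugify


@receiver(pre_save, sender=Article)
@receiver(pre_save, sender=Product)
def set_slug(sender, instance, **kwargs):
    # one receiver serves every model registered above
    if not instance.slug:
        instance.slug = slugify(instance.name)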
Also, on bulk operations, these signals and methods are not necessarily called.
From the docs,
Overridden model methods are not called on bulk operations
Note that the delete() method for an object is not necessarily called when deleting objects in bulk using a QuerySet or as a result of a cascading delete. To ensure customized delete logic gets executed, you can use pre_delete and/or post_delete signals.
Unfortunately, there isn’t a workaround when creating or updating objects in bulk, since none of save(), pre_save, and post_save are called.
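Nothing stops you from sending the signals yourself around a bulk call, though, in the spirit of the queryset update override shown earlier; a minimal sketch:

from django.db.models.signals import pre_save, post_save

products = [Product(brand='NA'), Product(brand='ACME')]
for obj in products:
    pre_save.send(sender=Product, instance=obj, raw=False,
                  using='default', update_fields=None)
Product.objects.bulk_create(products)
for obj in products:
    post_save.send(sender=Product, instance=obj, created=True,
                   raw=False, using='default', update_fields=None)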
For more reference,
Django override save() or signals?
Overriding predefined model methods
Django: signal or model method?
So every model comes with some commonly used methods, such as save and delete.
Delete is often overridden to set a boolean field such as is_active to False, so that data is not lost. But sometimes a model exists whose records, once created, should always exist and never even be "inactive". I was wondering: what would be the best practice for handling this model's delete method?
ideas
make it simply useless:
def delete(self):
return False
but that just seems odd. Is there maybe a Meta option to disable deleting? Is there any "nice" way to do this?
Well, it depends: you cannot truly restrict deletion, because somebody can always call delete() on a queryset, or just run a plain DELETE SQL statement. If you want to disable the delete button in the Django admin, though, you should look here.
delete() on a queryset can be restricted with this:
from django.db import models


class NoDeleteQuerySet(models.QuerySet):
    def delete(self, *args, **kwargs):
        pass  # silently ignore bulk deletes


class MyModel(models.Model):
    objects = NoDeleteQuerySet.as_manager()
    ...
Django docs - link
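To cover instance-level deletes as well (the delete() override the question sketches), you could raise instead of silently ignoring; a minimal sketch:

from django.core.exceptions import PermissionDenied


class MyModel(models.Model):
    objects = NoDeleteQuerySet.as_manager()

    def delete(self, *args, **kwargs):
        # fail loudly instead of pretending the delete happened
        raise PermissionDenied('%s records cannot be deleted.' % self.__class__.__name__)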
I have a model (Parent) with a one-to-many relation to another model (Child). The save method of the Parent model is overridden:
class ParentModel(models.Model):
    (...)

    def save(self, *args, **kwargs):
        (...)  # do something with the model
        super(ParentModel, self).save(*args, **kwargs)


class ChildModel(models.Model):
    parent = models.ForeignKey(ParentModel)
In the admin panel, multiple ChildModel objects are displayed using a StackedInline on the ParentModel's page. If a field of the parent is edited and saved, the save method is called. When only the child's fields are edited, Django does not call the parent's save method (as expected, because nothing changed).
What is the best way to force saving the parent even if only a child was edited (so that my overridden method does its stuff)?
You have a few solutions. Here goes, from simpler to more complex:
You could implement a custom save method for ChildModel that calls ParentModel.save.
You could also connect to your ChildModel's post_save or pre_save signal, as sketched below.
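A minimal sketch of the signal variant (the save-method variant is analogous):

from django.db.models.signals import post_save
from django.dispatch import receiver


@receiver(post_save, sender=ChildModel)
def save_parent(sender, instance, **kwargs):
    # re-run ParentModel's custom save logic whenever a child is saved
    instance.parent.save()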
Now, these two solutions will prove annoying if you're going to update a lot of ChildModel instances at once, as you will be calling ParentModel.save several times, maybe without purpose.
You might then want to use the following:
Override your ParentModel's ModelAdmin.change_view to handle your logic; this is pretty tricky however.
I'm, however, pretty surprised by the behavior you're encountering; from checking the source, the object should be saved anyway, edited or not.
I'm overriding Django's get_query_set function on one of my models dynamically. I'm doing this to forcibly filter the original query set returned by Model.objects.all/filter/get by a "scenario" value, using a decorator. Here's the decorator's function:
# Get the base QuerySets for these models before we modify their
# QuerySet managers. This prevents infinite recursion, since the
# get_query_set function doesn't rely on itself to get this base QuerySet.
all_income_objects = Income.objects.all()
all_expense_objects = Expense.objects.all()

# Figure out which scenario the user is using.
current_scenario = Scenario.objects.get(user=request.user, selected=True)

# Modify the imported model classes to filter based on the current scenario.
Income.objects.get_query_set = lambda: all_income_objects.filter(scenario=current_scenario)
Expense.objects.get_query_set = lambda: all_expense_objects.filter(scenario=current_scenario)

# Call the method that was initially supposed to
# be executed before we were so rudely interrupted.
return view(request, **arguments)
I'm doing this to DRY up the code, so that all of my queries aren't littered with an additional filter. However, if the scenario changes, no objects are returned. If I kill all of my Python processes on the server, the objects for the newly selected scenario appear. I'm thinking that it's caching the modified class, and then when the scenario changes, it's applying another filter that will never make sense, since objects can only have one scenario at a time.
This hasn't been an issue with user-based filters, because the user never changes for my session. Is Passenger doing something stupid to hold onto class objects between requests? Should I bail on this weird design pattern and just implement these filters on a per-view basis? There must be a best practice for DRYing up filters that apply across many views based on something dynamic, like the current user.
What about creating a Manager for the model which takes the user as an argument and does the filtering there? My understanding of staying DRY with Django querysets is to use a model Manager:
#### view code:
def some_view(request):
    expenses = Expense.objects.filter_by_cur_scenario(request.user)

    # add additional filters here, or add to the manager via more params
    expenses = expenses.filter(something_else=True)


#### models code:
class ExpenseManager(models.Manager):
    def filter_by_cur_scenario(self, user):
        current_scenario = Scenario.objects.get(user=user, selected=True)
        return self.filter(scenario=current_scenario)


class Expense(models.Model):
    objects = ExpenseManager()
Also, one quick caveat on the manager (which may apply to overriding get_query_set as well): foreign relationships will not take into account any filtering done at this level. For example, if you override MyObject.objects.filter() to always filter out deleted rows, a model with a ForeignKey to MyObject won't use that filter function (at least from what I understand; someone please correct me if I'm wrong).
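For concreteness, this is the kind of override that caveat applies to; a minimal sketch with a hypothetical deleted flag:

class NoDeletedManager(models.Manager):
    def get_query_set(self):
        # every MyObject.objects.all()/filter()/get() passes through here,
        # but related access (e.g. some_obj.myobject_set) does not
        return super(NoDeletedManager, self).get_query_set().filter(deleted=False)


class MyObject(models.Model):
    deleted = models.BooleanField(default=False)
    objects = NoDeletedManager()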
I was hoping to have this implementation happen without having to code anything in other views. Essentially, after the class is imported, I want to modify it so that no matter where it's referenced via Expense.objects.get/filter/all, it's already been filtered. As a result, no implementation is required in any of the other views; it's completely transparent. And even in cases where I'm using it as a ForeignKey, when an object is retrieved using the aforementioned Expense.objects.get/filter/all, it will be filtered as well.