Is this the way to validate Django model fields? - python

As I understand it, when one creates a Django application, data is validated by the form before it's inserted into a model instance, which is then written to the database. But if I want to create an additional layer of protection at the data model layer, is what I've done below the current "best practice"? I'm trying to ensure that a reviewer's name cannot be omitted nor left blank. Should I be putting any custom validation in the 'clean' method as I've done here, and then have 'save' call 'full_clean', which calls 'clean'? If not, what's the preferred method? Thanks.
class Reviewer(models.Model):
    name = models.CharField(max_length=128, default=None)

    def clean(self, *args, **kwargs):
        if self.name == '':
            raise ValidationError('Reviewer name cannot be blank')
        super(Reviewer, self).clean(*args, **kwargs)

    def full_clean(self, *args, **kwargs):
        return self.clean(*args, **kwargs)

    def save(self, *args, **kwargs):
        self.full_clean()
        super(Reviewer, self).save(*args, **kwargs)

Firstly, you shouldn't override full_clean as you have done. From the django docs on full_clean:
Model.full_clean(exclude=None)
This method calls Model.clean_fields(), Model.clean(), and Model.validate_unique(), in that order and raises a ValidationError that has a message_dict attribute containing errors from all three stages.
So the full_clean method already calls clean, but by overriding it, you've prevented it calling the other two methods.
Secondly, calling full_clean in the save method is a trade off. Note that full_clean is already called when model forms are validated, e.g. in the Django admin. So if you call full_clean in the save method, then the method will run twice.
It's not usually expected for the save method to raise a validation error, somebody might call save and not catch the resulting error. However, I like that you call full_clean rather than doing the check in the save method itself - this approach allows model forms to catch the problem first.
Finally, your clean method would work, but you can actually handle your example case in the model field itself. Define your CharField as
name = models.CharField(max_length=128)
The blank option will default to False. If the field is blank, a ValidationError will be raised when you run full_clean. Putting default=None in your CharField doesn't do any harm, but it is a bit confusing when you don't actually allow None as a value.
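For illustration, here is a minimal sketch of the simplified model, keeping the full_clean call in save from your question (only the field declaration changes):

from django.db import models

class Reviewer(models.Model):
    name = models.CharField(max_length=128)  # blank defaults to False

    def save(self, *args, **kwargs):
        self.full_clean()  # raises ValidationError for a blank name
        super(Reviewer, self).save(*args, **kwargs)

With this, Reviewer(name='').save() raises a ValidationError mentioning the name field, with no custom clean method needed.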

After thinking about Alasdair's answer and doing additional reading, my sense now is that Django's models weren't designed to be validated on a model-only basis as I'm attempting to do. Such validation can be done, but at a cost, and it entails using validation methods in ways they weren't intended for.
Instead, I now believe that any constraints other than those that can be entered directly into the model field declarations (e.g. "unique=True") are supposed to be performed as a part of Form or ModelForm validation. If one wants to guard against entering invalid data into a project's database via any other means (e.g. via the ORM while working within the Python interpreter), then the validation should take place within the database itself. Thus, validation could be implemented on three levels: 1) First, implement all constraints and triggers via DDL in the database; 2) Implement any constraints available to your model fields (e.g. "unique=True"); and 3) Implement all other constraints and validations that mirror your database-level constraints and triggers within your Forms and ModelForms. With this approach, any form validation errors can be re-displayed to the user. And if the programmer is interacting directly with the database via the ORM, he/she would see the database exceptions directly.
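For example, here is a rough sketch of levels 2 and 3 using the Reviewer model from my question (the form class and its field rules are illustrative; level 1 would be a CHECK constraint or trigger written directly in DDL):

from django import forms
from django.db import models

# Level 2: constraints declared directly on the model fields
class Reviewer(models.Model):
    name = models.CharField(max_length=128, unique=True)

# Level 3: mirror the database-level rules in a ModelForm
class ReviewerForm(forms.ModelForm):
    class Meta:
        model = Reviewer
        fields = ['name']

    def clean_name(self):
        name = self.cleaned_data['name'].strip()
        if not name:
            raise forms.ValidationError('Reviewer name cannot be blank')
        return name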
Thoughts anyone?

Capturing the pre_save signal on my models ensured clean will be called automatically.
from django.db.models.signals import pre_save

def validate_model(sender, **kwargs):
    if 'raw' in kwargs and not kwargs['raw']:
        kwargs['instance'].full_clean()

pre_save.connect(validate_model, dispatch_uid='validate_models')
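One note: the connect call has to run at startup for this to take effect. A common place to register it (this part is an assumption about your project layout) is an AppConfig.ready method:

from django.apps import AppConfig
from django.db.models.signals import pre_save

class MyAppConfig(AppConfig):
    name = 'myapp'  # hypothetical app label

    def ready(self):
        from .validators import validate_model  # wherever the handler above lives
        pre_save.connect(validate_model, dispatch_uid='validate_models')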

Thanks @Kevin Parker for your answer, quite helpful!
It is common to have models in your app outside of the ones you define, so here is a modified version you can use to scope this behavior only to your own models or a specific app/module as desired.
from django.db.models.signals import pre_save
import inspect
import sys

MODELS = [obj for name, obj in
          inspect.getmembers(sys.modules[__name__], inspect.isclass)]

def validate_model(sender, instance, **kwargs):
    if 'raw' in kwargs and not kwargs['raw']:
        if type(instance) in MODELS:
            instance.full_clean()

pre_save.connect(validate_model, dispatch_uid='validate_models')
This code will run against any models defined inside the module where it is executed, but you can adapt it to scope more strictly, or to a set of modules/apps, if desired.
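If you would rather scope by app than by module, an equivalent check (a sketch, assuming your app is labelled 'myapp') can look at the sender's metadata instead:

from django.db.models.signals import pre_save

def validate_model(sender, instance, **kwargs):
    if not kwargs.get('raw', False):
        if sender._meta.app_label == 'myapp':  # hypothetical app label
            instance.full_clean()

pre_save.connect(validate_model, dispatch_uid='validate_models')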

How to add additional keyword argument to all Django fields?

The application I am working on requires merging Django models of the same type. These models hold state that can be altered by chronological events, so it is not as straightforward as deep copying one object onto the other; it is not always correct to take the latest value or to always copy truthy values, for example.
I have written a model merging class to handle this operation, however, I need to be able to describe on a field by field basis whether it should be included in that merge and if it is to be included, how to handle that merge.
I have already tried creating a dictionary to describe this behaviour and pass it into the merger. However, this becomes unwieldy at greater levels of nesting and is very brittle to codebase change.
I have also tried adding a merge method to each individual model, which solved the problem but is highly susceptible to failure if a foreign key relationship that lives on a different model is missed, or the codebase changes.
I have started writing a custom version of every field in Django, as the fields feel like the correct place for the logic to live, but it also feels unwieldy and brittle to have to maintain custom versions of every field.
Is there a way in Django to add an additional keyword argument to the base Field class or perhaps decorate each field without having to subclass them?
Thanks
Just in case this helps anybody else, I have ended up creating a mixin and subclassing each individual field. Below is a cut down example.
from django.db import models

class MappableFieldMixin():
    def __init__(self, should_map=True, map_mode=None, *args, **kwargs):
        self.should_map = should_map
        if should_map and not map_mode:
            raise TypeError('Mappable field requires map_mode if should_map set to True')
        self.map_mode = map_mode
        super().__init__(*args, **kwargs)

    def deconstruct(self):
        name, path, args, kwargs = super().deconstruct()
        kwargs['should_map'] = self.should_map
        kwargs['map_mode'] = self.map_mode
        return name, path, args, kwargs

class MappableBooleanField(MappableFieldMixin, models.BooleanField):
    pass
Usage:
class Membership(models.Model):
    is_active = MappableBooleanField(map_mode=MapMode.MAP_ALWAYS, default=True)
You can find further information on creating custom fields in the Django documentation.
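To give an idea of how the extra keyword arguments might be consumed, a hypothetical merge helper could walk the model's fields and check the mixin attributes (merge_values here is a placeholder for your per-map_mode logic):

def merge_instances(primary, secondary):
    # sketch: copy values field by field according to the mapping options
    for field in primary._meta.get_fields():
        if getattr(field, 'should_map', False):
            merged = merge_values(  # placeholder for the per-map_mode logic
                getattr(primary, field.name),
                getattr(secondary, field.name),
                field.map_mode,
            )
            setattr(primary, field.name, merged)
    primary.save()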

Generalizing deletion of placeholderfield django-cms with signals

I'm currently working in django-cms and utilizing a PlaceholderField in several of my models. As such, I'd like to generalize this process to avoid having to override every model's delete, and add a specialized manager for each type of object just to handle deletions.
a little back story:
After working up the design of my application a little bit and using the (honestly impressive) PlaceholderFields, I noticed that if I deleted a model that contained one of these fields, it would leave behind its plugins/placeholder after deletion of the model instance that spawned it. This surprised me, so I contacted them, and according to django-cms's development team:
By design, the django CMS PlaceholderField does not handle deletion of the plugins for you.
If you would like to clear the placeholder content and remove the placeholder itself when the object that references it is deleted, you can do so by calling the clear() method on the placeholder instance and then the delete() method
So, given that this is expected to happen prior to deletion of the model, my first thought was to use the pre_delete signal provided by Django. So I set up the following:
my problem
models.py
class SimplifiedCase(models.Model):
    # ... fields/methods ...
    my_placeholder_instance = PlaceholderField('reading_content')  # ****** the placeholder

    # define a local method for clearing placeholderfields for this object
    def cleanup_placeholders(self):
        # remove any child plugins of this placeholder
        self.my_placeholder_instance.clear()
        # remove the placeholder itself
        self.my_placeholder_instance.delete()

# link the receiver to the section
signals.pre_delete.connect(clear_placeholderfields, sender=SimplifiedCase)
signals.py
# create a generalized receiver
# (expecting multiple models to contain placeholders, so generalizing the process)
def clear_placeholderfields(sender, instance, **kwargs):
    instance.cleanup_placeholders()  # calls the newly defined cleanup method in the model
I expected this to work without any issues, but I'm getting some odd behavior when calling the [placeholder].delete() method from within the method called by the pre_delete receiver.
For some reason, calling the placeholder's delete() method in my cleanup_placeholders method fires the parent's pre_delete signal again, resulting in a recursion loop.
I'm relatively new to using django/django-cms, so possibly I'm overlooking something or fundamentally misunderstanding what's causing this loop, but is there a way to achieve what I'm trying to do here using the pre_delete signals? Or am I going about this poorly?
Any suggestions would be greatly appreciated.
After several days of fighting this, I believe I've found a method of deleting Placeholders along with the 3rd party app models automatically.
Failed attempts:
- Signals failed to be useful due to the recursion mentioned in my question, which is caused by all related models of a placeholder triggering a pre_delete event during the handling of the connected model's pre_delete event.
Additionally, I had a need for handling child FK-objects that also contained their own placeholders. After much trial and error, the best course of action (that I could find) to ensure deletion of placeholders for child objects is as follows:
Define a queryset which performs an iterative deletion (non-ideal, but the only way to ensure the execution of the following steps):
class IterativeDeletion_Manager(models.Manager):
    def get_queryset(self):
        return IterativeDeletion_QuerySet(self.model, using=self._db)

    def delete(self, *args, **kwargs):
        return self.get_queryset().delete(*args, **kwargs)

class IterativeDeletion_QuerySet(models.QuerySet):
    def delete(self, *args, **kwargs):
        with transaction.atomic():
            # attempting to prevent a 'bulk delete', which ignores an overridden delete
            for obj in self:
                obj.delete()
Set the model containing the PlaceholderField to use the newly defined manager.
Override the deletion method of any model that contains a placeholder field to handle the deletion of the placeholder AFTER deletion of the connected model (i.e. an unofficial post_delete event).
class ModelWithPlaceholder(models.Model):
    objects = IterativeDeletion_Manager()

    # the placeholder
    placeholder_content = PlaceholderField('slotname_for_placeholder')

    def delete(self, *args, **kwargs):
        # ideally there would be a method to get all fields of 'placeholderfield' type to populate this
        placeholders = [self.placeholder_content]

        # if there are any FK relations to this model, set the child's
        # queryset to use iterative deletion as well, and manually
        # call queryset delete in the parent (as follows)
        # self.child_models_related_name.delete()

        # delete the object
        super(ModelWithPlaceholder, self).delete(*args, **kwargs)

        # clear, and delete the placeholders for this object
        # (must be after the object's deletion)
        for ph in placeholders:
            ph.clear()
            ph.delete()
Using this method I've verified that the child PlaceholderField for each object is deleted along with the object, whether via the admin interface, queryset deletions, or direct deletions (at least in my usage cases).
Note: This seems unintuitive to me, but the deletion of placeholders needs to happen after deletion of the model itself, at least in the case of having child relations in the object being deleted.
This is due to the fact that calling placeholder.delete() will trigger a deletion of all models related to the deleted model containing the PlaceholderField.
I didn't expect that at all; I don't know if this is the intended behavior when deleting a placeholder, but it is what happens. Since the placeholder is still contained in the database prior to deleting the object (by design), there shouldn't be an issue with handling its deletion after calling super(...).delete().
If ANYONE has a better solution to this problem, feel free to comment and let me know my folly.
This is the best I could come up with after many hours running through debuggers and tracing through the deletion process.
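As an aside on the comment in delete() above about collecting placeholder fields automatically, one possible (untested) sketch is to scan the model's fields for PlaceholderField instances rather than hard-coding the list (the import path assumes django-cms 3.x):

from cms.models.fields import PlaceholderField

def get_placeholders(instance):
    # collect the value of every PlaceholderField declared on the model
    return [getattr(instance, field.name)
            for field in instance._meta.get_fields()
            if isinstance(field, PlaceholderField)]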

Customizing the `on_delete` param function in Django model fields

I have an IPv4Manage model, and in it a vlanedipv4network field:
class IPv4Manage(models.Model):
    ...
    vlanedipv4network = models.ForeignKey(
        to=VlanedIPv4Network, related_name="ipv4s", on_delete=models.xxx, null=True)
As we know, for the on_delete param we generally pass one of the models.xxx options, such as models.CASCADE.
Is it possible to pass a custom function there instead? I want to do some other logic there.
The choices for on_delete can be found in django/db/models/deletion.py
For example, models.SET_NULL is implemented as:
def SET_NULL(collector, field, sub_objs, using):
    collector.add_field_update(field, None, sub_objs)
And models.CASCADE (which is slightly more complicated) is implemented as:
def CASCADE(collector, field, sub_objs, using):
    collector.collect(sub_objs, source=field.remote_field.model,
                      source_attr=field.name, nullable=field.null)
    if field.null and not connections[using].features.can_defer_constraint_checks:
        collector.add_field_update(field, None, sub_objs)
So, if you figure out what those arguments are then you should be able to define your own function to pass to the on_delete argument for model fields. collector is most likely an instance of Collector (defined in the same file, not sure what it's for exactly), field is most likely the model field being deleted, sub_objs is likely instances that relate to the object by that field, and using denotes the database being used.
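As a rough illustration (a sketch only, with a made-up handler name, not tested against a real schema), a custom handler could reuse the built-in SET_NULL behaviour and add extra logic, as long as it keeps the same four-argument signature:

from django.db import models

def SET_NULL_AND_LOG(collector, field, sub_objs, using):
    # same effect as models.SET_NULL, plus a place for your own logic
    for obj in sub_objs:
        print('clearing %s on %s' % (field.name, obj))
    collector.add_field_update(field, None, sub_objs)

class IPv4Manage(models.Model):
    vlanedipv4network = models.ForeignKey(
        to='VlanedIPv4Network', related_name='ipv4s',
        on_delete=SET_NULL_AND_LOG, null=True)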
There are alternatives for custom deletion logic too, in case overriding on_delete is a bit overkill for you.
The post_delete and pre_delete signals allow you to define some custom logic to run before or after an instance is deleted.
from django.db.models.signals import post_delete

def delete_ipv4manage(sender, instance, **kwargs):
    print('{instance} was deleted'.format(instance=str(instance)))

post_delete.connect(delete_ipv4manage, sender=IPv4Manage)
And lastly you can override the delete() method of the Model/Queryset, however be aware of caveats with bulk deletes using this method:
Overridden model methods are not called on bulk operations
Note that the delete() method for an object is not necessarily called when deleting objects in bulk using a QuerySet or as a result of a cascading delete. To ensure customized delete logic gets executed, you can use pre_delete and/or post_delete signals.
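A minimal sketch of that last option, reusing the IPv4Manage model from the question (the extra logic here is only a placeholder):

class IPv4Manage(models.Model):
    # ... fields as before ...
    def delete(self, *args, **kwargs):
        # custom logic before the row is removed (placeholder)
        print('{instance} is about to be deleted'.format(instance=str(self)))
        return super().delete(*args, **kwargs)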
Another useful solution is to use the models.SET() where you can pass a function (deleted_guest in the example below)
guest = models.ForeignKey('Guest', on_delete=models.SET(deleted_guest))
and the function deleted_guest is
DELETED_GUEST_EMAIL = 'deleted-guest@introtravel.com'

def deleted_guest():
    """ used for setting the guest field of a booking when the guest is deleted """
    from intro.models import Guest
    from django.conf import settings
    deleted_guest, created = Guest.objects.get_or_create(
        first_name='Deleted',
        last_name='Guest',
        country=settings.COUNTRIES_FIRST[0],
        email=DELETED_GUEST_EMAIL,
        gender='M')
    return deleted_guest
You can't send any parameters and you have to be careful with circular imports. In my case I am just setting a filler record, so the parent model has a predefined guest to represent one that has been deleted. With the new GDPR rules, we have to be able to delete guest information.
CASCADE and PROTECT etc are in fact functions, so you should be able to inject your own logic there. However, it will take a certain amount of inspection of the code to figure out exactly how to get the effect you're looking for.
Depending what you want to do it might be relatively easy, for example the PROTECT function just raises an exception:
def PROTECT(collector, field, sub_objs, using):
    raise ProtectedError(
        "Cannot delete some instances of model '%s' because they are "
        "referenced through a protected foreign key: '%s.%s'" % (
            field.remote_field.model.__name__, sub_objs[0].__class__.__name__, field.name
        ),
        sub_objs
    )
However if you want something more complex you'd have to understand what the collector is doing, which is certainly discoverable.
See the source for django.db.models.deletion to get started.
There is nothing stopping you from adding your own logic. However, you need to consider multiple factors including compatibility with the database that you are using.
For most use cases, the out of the box logic is good enough if your database design is correct. Please check out your available options here https://docs.djangoproject.com/en/2.0/ref/models/fields/#django.db.models.ForeignKey.on_delete.

Django: how to validate m2m relationships?

Let's say I have a Basket model and I want to validate that no more than 5 Items can be added to it:
class Basket(models.Model):
    items = models.ManyToManyField('Item')

    def save(self, *args, **kwargs):
        self.full_clean()
        super(Basket, self).save(*args, **kwargs)

    def clean(self):
        super(Basket, self).clean()
        if self.items.count() > 5:
            raise ValidationError('This basket can\'t have so many items')
But when trying to save a Basket a RuntimeError is thrown because the maximum recursion depth is exceeded.
The error is the following:
ValueError: "<Basket: Basket>" needs to have a value for field "basket" before this many-to-many relationship can be used.
It happens in the if self.items.count() > 5: line.
Apparently Django's intricacies simply won't allow you to validate m2m relationships when saving a model. How can I validate them then?
You can never validate relationships in the clean method of the model. This is because at clean time the model may not yet exist, as is the case with your Basket. Something that does not exist cannot have relationships either.
You either need to do your validation on the form data as pointed out by @bhattravii, or call form.save(commit=False) and implement a method called save_m2m, which implements the limit.
To enforce the limit at the model level, you need to listen to the m2m_changed signal. Note that providing feedback to the end user is a lot harder, but it does prevent overfilling the basket through different means.
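A rough sketch of that signal-based enforcement, assuming the Basket and Item models from the question, could look like this:

from django.core.exceptions import ValidationError
from django.db.models.signals import m2m_changed
from django.dispatch import receiver

@receiver(m2m_changed, sender=Basket.items.through)
def limit_basket_items(sender, instance, action, pk_set, **kwargs):
    # handles additions made from the Basket side of the relation
    if action == 'pre_add' and instance.items.count() + len(pk_set) > 5:
        raise ValidationError("This basket can't have so many items")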
I've been discussing this on the Django Developers list and have in fact tabled a method of doing this for consideration in the Django core in one form or another. The method is not fully tested nor finalised but results for now are very encouraging and I'm employing it on a site of mine with success.
In principle it relies on:
1. Using PostgreSQL as your database engine (we're fairly sure it won't work on SQLite or MySQL, but keen for anyone to test this).
2. Overriding the post() method of your (class-based) view such that it:
2.1. Opens an atomic transaction
2.2. Saves the form
2.3. Saves all the formsets, if any
2.4. Calls Model.clean() or something else like Model.full_clean()
In your Model then, in the method called in 2.4 above, you will see all your many-to-many and one-to-many relations in place. You can validate them and throw a ValidationError to have the whole transaction rolled back with no impact on the database.
This is working wonderfully for me:
def post(self, request, *args, **kwargs):
    # The self.object attribute MUST exist and be None in a CreateView.
    self.object = None
    self.form = self.get_form()
    self.success_url = reverse_lazy('view', kwargs=self.kwargs)
    if connection.vendor == 'postgresql':
        if self.form.is_valid():
            try:
                with transaction.atomic():
                    self.object = self.form.save()
                    # A separate routine that collects all the formsets in the request and saves them
                    save_related_forms(self)
                    if hasattr(self.object, 'full_clean') and callable(self.object.full_clean):
                        self.object.full_clean()
            except (IntegrityError, ValidationError) as e:
                if hasattr(e, 'error_dict') and isinstance(e.error_dict, dict):
                    for field, errors in e.error_dict.items():
                        for error in errors:
                            self.form.add_error(field, error)
                return self.form_invalid(self.form)
            return self.form_valid(self.form)
        else:
            return self.form_invalid(self.form)
    else:
        # The standard Django post() method
        if self.form.is_valid():
            self.object = self.form.save()
            save_related_forms(self)
            return self.form_valid(self.form)
        else:
            return self.form_invalid(self.form)
And the conversation on the Developers list is here, if you'd like to contribute any experience you gain from experimenting with this (perhaps with other database backends):
https://groups.google.com/forum/#!topic/django-developers/pQ-8LmFhXFg
The one big caveat in the above approach is that it delegates saving to the post() method, which in the default view is done in the form_valid() method, so you need to override form_valid() as well; otherwise a post() like the one above will see you saving the form twice. That is just a waste of time on an UpdateView, but rather disastrous on a CreateView.
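A minimal sketch of such an override, placed inside the same view class (it only skips the duplicate save, since post() has already saved the object):

from django.http import HttpResponseRedirect

def form_valid(self, form):
    # self.object was already saved inside post(), so don't save it again
    return HttpResponseRedirect(self.get_success_url())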

Use model method as default value for Django Model?

I actually have this method in my Model:
def speed_score_compute(self):
    # Speed score:
    #
    # - 8 point for every % of time spent in
    #   high intensity running phase.
    # - Average Speed in High Intensity
    #   running phase (30km/h = 50 points
    #   0-15km/h = 15 points )
    try:
        high_intensity_running_ratio = ((
            (self.h_i_run_time * 100) / self.training_length) * 8)
    except ZeroDivisionError:
        return 0
    high_intensity_running_ratio = min(50, high_intensity_running_ratio)
    if self.h_i_average_speed < 15:
        average_speed_score = 10
    else:
        average_speed_score = self.cross_multiplication(
            30, self.h_i_average_speed, 50)
    final_speed_score = high_intensity_running_ratio + average_speed_score
    return final_speed_score
I want to use it as default for my Model like this:
speed_score = models.IntegerField(default=speed_score_compute)
But this doesn't work (see the error message below). I've checked different topics like this one, but they only cover plain functions (not using the self attribute), not methods (I must use a method since I'm working with the actual object's attributes).
The Django docs seem to talk about this, but I don't get it clearly, maybe because I'm still a newbie to Django and programming in general.
Is there a way to achieve this ?
EDIT:
My function is defined above my models. But here is my error message:
ValueError: Could not find function speed_score_compute in tournament.models.
Please note that due to Python 2 limitations, you cannot serialize unbound method functions (e.g. a method declared and used in the same class body). Please move the function into the main module body to use migrations.
For more information, see https://docs.djangoproject.com/en/1.8/topics/migrations/#serializing-values
The error message is clear: it seems that I'm not able to do this. But is there another way to achieve it?
The problem
When we provide default=callable and that callable is a method of the model, it doesn't get called with the self argument, i.e. the model instance.
Overriding save()
I haven't found a better solution than to override MyModel.save()* like below:
class MyModel(models.Model):
    def save(self, *args, **kwargs):
        if self.speed_score is None:
            self.speed_score = ...
            # or
            self.calculate_speed_score()
        # Now we call the actual save method
        super(MyModel, self).save(*args, **kwargs)
This makes it so that if you try to save your model, without a set value for that field, it is populated before the save.
Personally I just prefer this approach of having everything that belongs to a model, defined in the model (data, methods, validation, default values etc). I think this approach is referred to as the fat Django model approach.
*If you find a better approach, I'd like to learn about it too!
Using a pre_save signal
Django provides a pre_save signal that runs before save() is ran on the model. Signals run synchronously, i.e. the pre_save code needs to finish running before save() is called on the model. You'll get the same results (and order of execution) as overriding the save().
from django.db.models.signals import pre_save
from django.dispatch import receiver
from myapp.models import MyModel

@receiver(pre_save, sender=MyModel)
def my_handler(sender, **kwargs):
    instance = kwargs['instance']
    instance.populate_default_values()
If you prefer to keep the default values behavior separated from the model, this approach is for you!
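For completeness, here is a sketch of what populate_default_values might look like on the model (names reused from the question; allowing null on the field, so the is-None check makes sense, is an assumption on my part):

from django.db import models

class MyModel(models.Model):
    speed_score = models.IntegerField(null=True, blank=True)  # null allowed so the check below works

    def populate_default_values(self):
        # assumes speed_score_compute from the question is defined on this model
        if self.speed_score is None:
            self.speed_score = self.speed_score_compute()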
When is save() called? Can I work with the object before it gets saved?
Good question! We'd like the ability to work with our object before it is saved and default values are populated.
To get an object without saving it, just to work with it, we can do:
instance = MyModel()
If you create it using MyModel.objects.create(), then save() will be called. It is essentially (see source code) equivalent to:
instance = MyModel()
instance.save()
If it's interesting to you, you can also define a MyModel.populate_default_values() that you can call at any stage of the object's lifecycle (at creation, at save, or on demand, it's up to you).
