Order of Serializer Validation in Django REST Framework - python

Situation
While working with validation in the Django REST Framework's ModelSerializer, I have noticed that the Meta.model fields are always validated, even when it does not necessarily make sense to do so. Take the following example for a User model's serialization:
I have an endpoint that creates a user. As such, there is a password field and a confirm_password field. If the two fields do not match, the user cannot be created. Likewise, if the requested username already exists, the user cannot be created.
The user POSTs improper values for each of the fields mentioned above
An implementation of validate has been made in the serializer (see below), catching the non-matching password and confirm_password fields
Implementation of validate:
def validate(self, data):
if data['password'] != data.pop('confirm_password'):
raise serializers.ValidationError("Passwords do not match")
return data
Problem
Even when the ValidationError is raised by validate, the ModelSerializer still queries the database to check to see if the username is already in use. This is evident in the error-list that gets returned from the endpoint; both the model and non-field errors are present.
Consequently, I would like to know how to prevent model validation until after non-field validation has finished, saving me a call to my database.
Attempt at solution
I have been trying to go through the DRF's source to figure out where this is happening, but I have been unsuccessful in locating what I need to override in order to get this to work.

Since most likely your username field has unique=True set, Django REST Framework automatically adds a validator that checks to make sure the new username is unique. You can actually confirm this by doing repr(serializer()), which will show you all of the automatically generated fields, which includes the validators.
Validation is run in a specific, undocumented order
Field deserialization called (serializer.to_internal_value and field.run_validators)
serializer.validate_[field] is called for each field
Serializer-level validators are called (serializer.run_validation followed by serializer.run_validators)
serializer.validate is called
So the problem that you are seeing is that the field-level validation is called before your serializer-level validation. While I wouldn't recommend it, you can remove the field-level validator by setting extra_kwargs in your serilalizer's meta.
class Meta:
extra_kwargs = {
"username": {
"validators": [],
},
}
You will need to re-implement the unique check in your own validation though, along with any additional validators that have been automatically generated.

I was also trying to understand how the control flows during serializer validation and after carefully going through the source code of djangorestframework-3.10.3 I came up with below request flow diagram. I have described the flow and what happens in the flow to the best of my understanding without going into too much detail as it can be looked up from source.
Ignore the incomplete method signatures. Only focusing on what methods are called on what classes.
Assuming you have an overridden is_valid method on your serializer class (MySerializer(serializers.Serializer)) when you call my_serializer.is_valid() the following takes place.
MySerializer.is_valid() is executed.
Assuming you are calling the super class (BaseSerializer) is_valid method (like: super(MySerializer, self).is_valid(raise_exception) in your MySerializer.is_valid() method, that will be called.
Now since MySerializer is extending serializers.Serializer, the run_validation() method from serializer.Serializers is called. This is validating only the data dict the first. So we haven't yet started field level validations.
Then the validate_empty_values from fields.Field gets called. This again happens on the entire data and not a single field.
Then the Serializer.to_internal_method is called.
Now we loop over each fields defined on the serializer. And for each field, first we call the field.run_validation() method. If the field has overridden the Field.run_validation() method then that will be called first. In case of a CharField it is overridden and calls the run_validation method of Field base class. Step 6-2 in the figure.
On that field we again call the Field.validate_empty_values()
The to_internal_value of the type of field is called next.
Now there is a call to the Field.run_validators() method. I presume this is where the additional validators that we add on the field by specifying the validators = [] field option get executed one by one
Once all this is done, we are back to the Serializer.to_internal_value() method. Now remember that we are doing the above for each field within that for loop. Now the custom field validators you wrote in your serializer (methods like validate_field_name) are run. If an exception occurred in any of the previous steps, your custom validators wont run.
read_only_defaults()
update validate data with defaults I think
run object level validators. I think the validate() method on your object is run here.

I don't believe the above solutions work any more. In my case, my model has fields 'first_name' and 'last_name', but the API will only receive 'name'.
Setting 'extra_kwargs' and 'validators' in the Meta class seems to have no effect, first_name and last_name are allways deemed required, and validators are always called. I can't overload the first_name/last_name character fields with
anotherrepfor_first_name = serializers.CharField(source=first_name, required=False)
as the names make sense. After many hours of frustration, I found the only way I could override the validators with a ModelSerializer instance was to override the class initializer as follows (forgive the incorrect indentation):
class ContactSerializer(serializers.ModelSerializer):
name = serializers.CharField(required=True)
class Meta:
model = Contact
fields = [ 'name', 'first_name', 'last_name', 'email', 'phone', 'question' ]
def __init__(self, *args, **kwargs):
self.fields['first_name'] = serializers.CharField(required=False, allow_null=True, allow_blank=True)
self.fields['last_name'] = serializers.CharField(required=False, allow_null=True, allow_blank=True)
return super(ContactSerializer, self).__init__(*args, **kwargs)
def create(self, validated_data):
return Contact.objects.create()
def validate(self, data):
"""
Remove name after getting first_name, last_name
"""
missing = []
for k in ['name', 'email', 'question']:
if k not in self.fields:
missing.append(k)
if len(missing):
raise serializers.ValidationError("Ooops! The following fields are required: %s" % ','.join(missing))
from nameparser import HumanName
names = HumanName(data['name'])
names.capitalize()
data['last_name'] = names.last
if re.search(r'\w+', names.middle):
data['first_name'] = ' '.join([names.first, names.middle])
else:
data['first_name'] = names.first
del(data['name'])
return data
Now the doc says that allowing blank and null with character fields is a no no, but this is a serializer, not a model, and as the API gets called by all kinds of cowboys, I need to cover my bases.

Here's the approach that worked for me.
Use a sentinel error type that gets caught in an overridden view function
The sentinel is raised from the custom serializer
The sentinel error type:
from django.core.exceptions import ValidationError
class CustomValidationErrors(ValidationError):
""" custom validation error for the api view to catch the status code """
And in the serializer we override errors, _errors, validated_data, and _validated_data, as well as is_valid:
class CustomSerializer(serializers.ModelSerializer):
# fields that usually run validation before parent serializer validation
child_field1 = Child1Serializer()
child_field2 = Child2Serializer()
# override DRF fields
errors = {}
_errors = None
validated_data = {}
_validated_data = []
def is_valid(self, *args, **kwargs):
# override drf.serializers.Serializer.is_valid
# and raise CustomValidationErrors from parent validate
self.validate(self.initial_data)
return not bool(self.errors)
def validate(self, attrs):
self._errors = {}
if len(attrs.get("child_field1", {}).get("name", "")) > 100:
self._errors.update({"child_field1": {"name": "child 1 name > 100"}})
if len(attrs.get("child_field2", {}).get("description", "")) > 1000:
self._errors.update({"child_field2.description": "child 2 description > 100"})
if len(self._errors):
# set the overriden DRF values
self.errors = self._errors
# raise the sentinel error type
raise CustomValidationErrors(self._errors)
# set the overriden DRF serializer values
self._errors = None
self.validated_data = attrs
self._validated_data = [[k, v] for k, v in attrs.items()]
return attrs
class Meta:
model = CustomModel
And in the view we can override the default method, and catch the sentinel error type:
class CustomSerializerView(ListCreateAPIView):
serializer_class = CustomeSerializer
def post(self, *args, **kwargs):
try:
# if this fails for any exception
# other than CustomValidationErrors
# it will return the default error
return super().post(*args, **kwargs)
except CustomValidationErrors as e:
############
# returns a 400 with the following
# {"child_field1":
# [[{"name": "child 1 name > 100"}]],
# "child_field2.description":
# [["child 2 description > 100"]]
# }
############
return Response(e.error_dict, status=400)
drf version:
djangorestframework==3.11.0

Related

How should i auto fill a field and make it readonly in django?

I`m new to django and i was doing a test for my knowledge.
Found a lot of duplicates in here and web but nothing useful
I'm trying to make a ForeignKey field which gets filled due to the other fields that user fills, and make it unchangeable for the user.
I thought that I should use overriding save() method but couldn't figure that at all.
How should I do that auto-fill and read-only thing?
Your approach is right. Override the save method and if self.pk is not None raise an exception if your field has changed. You can use django model utils to easily track changes in your model: https://django-model-utils.readthedocs.io/en/latest/utilities.html#field-tracker
Principle:
class MyModel(models.Model):
#....
some_field = models.Foreignkey(...)
tracker = FieldTracker()
def save(*args, **kwargs):
if self.pk is None:
# new object is being created
self.some_field = SomeForeignKeyObject
else:
if self.tracker.has_changed("some_field"):
raise Exception("Change is not allowed")
super().save(*args, **kwargs)

How to automatically fill-in model fields in Django rest_framework serializer?

Let's assume I have a model like this:
class Data(models.Model):
a = models.CharField()
b = models.CharField()
c = models.IntegerField()
I would like to setup a serializer in such a way that it automatically fills in field c and it is not required for a POST. I tried to overwrite the create function of the serializer, but it doesn't work:
class DataSerializer(serializers.HyperlinkedModelSerializer):
class Meta:
model = Data
fields = ('a', 'b')
def create(self, validated_data, **kwargs):
Data.objects.c = 5
return Data.objects.create(**validated_data)
However, if I try this, I end up with an IntegrityError: NOT NULL constraint failed: model_data.c. What is the syntax that I have to use here?
EDIT: Updated formatting.
The reason you're getting the error because field c is not set to null = True - as such an error is raised at the validation stage even before the serializer hits the create method.
Bear in mind that the process goes like this:
Submit serializer data
field-level validation happens - this includes checks for null integrity, min/max length etc and also any custom field validations defined in def validate_<field_name>
object-level validation happens - this calls the def validate method
validated data is passed to the save method, depending on how you designed the serializer - it will save the instance, or route the data to either create or update
All of the info regarding this can be found in Django's and DRF's docs.
A few things to consider:
are you setting a global default for that field? If so, set the default in your models - c = models.IntegerField(default=a_number_or_a_callable_that_returns_an_integer)
do you intend to display the field? If so, include c in your fields and add one more Meta attribute - read_only_fields = ('c',)
If it's neither of the above, you might want to override the validate_c method
Apologies for the poor formatting, typing it on my phone - will update once I get to a computer
In your code Data.objects.c = 5 does nothing.
If you want to set this value yourself use validated_data['c'] = 5 or Data.objects.create(c=5, **validated_data) (just not both at the same time).
Rather than doing this in the serializer, there are hooks in the generic views that allow you to pass values to the serializer. So in your case you might have:
class DataViewSet(ModelViewSet):
# ...
def perform_create(self, serializer):
serializer.save(c=5)
See the "Save and deletion hooks" section here

Validating a Django model field based on another field's value?

I have a Django app with models accessible by both Django REST Framework and a regular form interface. The form interface has some validation checks before saving changes to the model, but not using any special Django framework, just a simple local change in the view.
I'd like to apply the same validation to forms and REST calls, so I want to move my validation into the model. I can see how to do that for simple cases using the validators field of the Field, but in one case I have a name/type/value model where the acceptable values for 'value' change depending on which type is selected. The validator doesn't get sent any information about the model that the field is in, so it doesn't have access to other fields.
How can I perform this validation, without having essentially the same code in a serializer for DRF and my POST view for the form?
I dug around codebase of drf a little bit. You can get values of all fields using following approach. Doing so, you can throw serialization error as
{'my_field':'error message} instead of {'non_field_error':'error message'}.
def validate_myfield(self, value):
data = self.get_initial() # data for all the fields
#do your validation
However, if you wish to do it for ListSerializer, i.e for serializer = serializer_class(many=True), this won't work. You will get list of empty values.
In that scenario, you could write your validations in def validate function and to avoid non_field_errors in your serialization error, you can raise ValidationError with error message as a dictionary instead of string.
def validate(self, data):
# do your validation
raise serializers.ValidationError({"your_field": "error_message"})
The validation per-field doesn't get sent any information about other fields, when it is defined like this:
def validate_myfield(self, value):
...
However, if you have a method defined like this:
def validate(self, data):
...
Then you get all the data in a dict, and you can do cross-field validation.
You can use the required package for your cross-field validation. It allows you to express your validation rules declaratively in python. You would have something like this with DRF:
class MySerializer(serializers.Serializer):
REQUIREMENTS = (
Requires("end_date", "start_date") +
Requires("end_date", R("end_date") > R("start_date")) +
Requires("end_date", R("end_date") < today.date() + one_year) +
Requires("start_date", R("start_date") < today.date() + one_year)
)
start_date = serializers.DateField(required=False, null=True, blank=True)
end_date = serializers.DateField(required=False, null=True, blank=True)
def validate(self, data):
self.REQUIREMENTS.validate(data) # handle validation error
You could put the REQUIREMENTS on your Model and have both your DRF and Django Form validate your data using it.
Here is a blog post explaining more

Form Validation - Clean Method Interfering with PATCH Requests

I have a form that looks like this:
class ContactForm(forms.ModelForm):
error_messages = {
'duplicate_name': 'A backup contact with that name already exists for this location',
'missing_location': 'No location supplied.'
}
class Meta:
fields = ('name', 'notification_preference', 'email', 'phone')
model = Contact
········
def clean_name(self):
# FIXME: Location validation shouldn't be happening here, but it's difficult to get
# form fields for foreign key relationships to play nicely with Tastypie
print dir(self)
if 'location' in self.data:
location = Location.objects.get(pk=self.uri_to_pk(self.data['location']))
else:
raise forms.ValidationError(self.error_messages['missing_location'])
# No duplicate names in a given location
if 'name' in self.cleaned_data and Contact.objects.filter(name=self.cleaned_data['name'], location=location).exists():
raise forms.ValidationError(self.error_messages['duplicate_name'])
return self.cleaned_data
I'm using it to validate calls to my TastyPie API. The clean_name method is meant to prevent POST requests from happening if they post a contact with the same name to the same location. This works perfectly as long as I'm making a POST request.
If I make a PATCH however, changing, say, the email and phone field on an already existent contact, the clean_name logic is still fired. Since the name already exists for a given location, it raises a validation error.
Should I be overriding something other than clean_name? Can I change the way PATCH works so it ignores certain validations?
Yes, if you're checking values between fields, I'd recommend implementing a general def clean(self, data) that checks the values don't conflict.
Regarding checking for duplicates, I'd advise you to use an .exclude(pk=instance.pk) in your exists() queryset to prevent mistakenly detecting an update the the model as a duplicate. Looking at the tastypie validation source, it adds the appropriate self.instance for the object being updated.
qs = Contact.objects.filter(name=self.cleaned_data.get('name', ''), location=location)
if self.instance and self.instance.pk:
qs = qs.exclude(pk=self.instance.pk)
if qs.exists():
raise forms.ValidationError(...)

Dynamically limiting queryset of related field

Using Django REST Framework, I want to limit which values can be used in a related field in a creation.
For example consider this example (based on the filtering example on https://web.archive.org/web/20140515203013/http://www.django-rest-framework.org/api-guide/filtering.html, but changed to ListCreateAPIView):
class PurchaseList(generics.ListCreateAPIView)
model = Purchase
serializer_class = PurchaseSerializer
def get_queryset(self):
user = self.request.user
return Purchase.objects.filter(purchaser=user)
In this example, how do I ensure that on creation the purchaser may only be equal to self.request.user, and that this is the only value populated in the dropdown in the form in the browsable API renderer?
I ended up doing something similar to what Khamaileon suggested here. Basically I modified my serializer to peek into the request, which kind of smells wrong, but it gets the job done... Here's how it looks (examplified with the purchase-example):
class PurchaseSerializer(serializers.HyperlinkedModelSerializer):
def get_fields(self, *args, **kwargs):
fields = super(PurchaseSerializer, self).get_fields(*args, **kwargs)
fields['purchaser'].queryset = permitted_objects(self.context['view'].request.user, fields['purchaser'].queryset)
return fields
class Meta:
model = Purchase
permitted_objects is a function which takes a user and a query, and returns a filtered query which only contains objects that the user has permission to link to. This seems to work both for validation and for the browsable API dropdown fields.
Here's how I do it:
class PurchaseList(viewsets.ModelViewSet):
...
def get_serializer(self, *args, **kwargs):
serializer_class = self.get_serializer_class()
context = self.get_serializer_context()
return serializer_class(*args, request_user=self.request.user, context=context, **kwargs)
class PurchaseSerializer(serializers.ModelSerializer):
...
def __init__(self, *args, request_user=None, **kwargs):
super(PurchaseSerializer, self).__init__(*args, **kwargs)
self.fields['user'].queryset = User._default_manager.filter(pk=request_user.pk)
The example link does not seem to be available anymore, but by reading other comments, I assume that you are trying to filter the user relationship to purchases.
If i am correct, then i can say that there is now an official way to do this. Tested with django rest framework 3.10.1.
class UserPKField(serializers.PrimaryKeyRelatedField):
def get_queryset(self):
user = self.context['request'].user
queryset = User.objects.filter(...)
return queryset
class PurchaseSeriaizer(serializers.ModelSerializer):
users = UserPKField(many=True)
class Meta:
model = Purchase
fields = ('id', 'users')
This works as well with the browsable API.
Sources:
https://github.com/encode/django-rest-framework/issues/1985#issuecomment-328366412
https://medium.com/django-rest-framework/limit-related-data-choices-with-django-rest-framework-c54e96f5815e
I disliked the style of having to override the init method for every place where I need to have access to user data or the instance at runtime to limit the queryset. So I opted for this solution.
Here is the code inline.
from rest_framework import serializers
class LimitQuerySetSerializerFieldMixin:
"""
Serializer mixin with a special `get_queryset()` method that lets you pass
a callable for the queryset kwarg. This enables you to limit the queryset
based on data or context available on the serializer at runtime.
"""
def get_queryset(self):
"""
Return the queryset for a related field. If the queryset is a callable,
it will be called with one argument which is the field instance, and
should return a queryset or model manager.
"""
# noinspection PyUnresolvedReferences
queryset = self.queryset
if hasattr(queryset, '__call__'):
queryset = queryset(self)
if isinstance(queryset, (QuerySet, Manager)):
# Ensure queryset is re-evaluated whenever used.
# Note that actually a `Manager` class may also be used as the
# queryset argument. This occurs on ModelSerializer fields,
# as it allows us to generate a more expressive 'repr' output
# for the field.
# Eg: 'MyRelationship(queryset=ExampleModel.objects.all())'
queryset = queryset.all()
return queryset
class DynamicQuersetPrimaryKeyRelatedField(LimitQuerySetSerializerFieldMixin, serializers.PrimaryKeyRelatedField):
"""Evaluates callable queryset at runtime."""
pass
class MyModelSerializer(serializers.ModelSerializer):
"""
MyModel serializer with a primary key related field to 'MyRelatedModel'.
"""
def get_my_limited_queryset(self):
root = self.root
if root.instance is None:
return MyRelatedModel.objects.none()
return root.instance.related_set.all()
my_related_model = DynamicQuersetPrimaryKeyRelatedField(queryset=get_my_limited_queryset)
class Meta:
model = MyModel
The only drawback with this is that you would need to explicitly set the related serializer field instead of using the automatic field discovery provided by ModelSerializer. i would however expect something like this to be in rest_framework by default.
In django rest framework 3.0 the get_fields method was removed. But in a similar way you can do this in the init function of the serializer:
class PurchaseSerializer(serializers.HyperlinkedModelSerializer):
class Meta:
model = Purchase
def __init__(self, *args, **kwargs):
super(PurchaseSerializer, self).__init__(*args, **kwargs)
if 'request' in self.context:
self.fields['purchaser'].queryset = permitted_objects(self.context['view'].request.user, fields['purchaser'].queryset)
I added the if check since if you use PurchaseSerializer as field in another serializer on get methods, the request will not be passed to the context.
First to make sure you only allow "self.request.user" when you have an incoming http POST/PUT (this assumes the property on your serializer and model is named "user" literally)
def validate_user(self, attrs, source):
posted_user = attrs.get(source, None)
if posted_user:
raise serializers.ValidationError("invalid post data")
else:
user = self.context['request']._request.user
if not user:
raise serializers.ValidationError("invalid post data")
attrs[source] = user
return attrs
By adding the above to your model serializer you ensure that ONLY the request.user is inserted into your database.
2) -about your filter above (filter purchaser=user) I would actually recommend using a custom global filter (to ensure this is filtered globally). I do something for a software as a service app of my own and it helps to ensure each http request is filtered down (including an http 404 when someone tries to lookup a "object" they don't have access to see in the first place)
I recently patched this in the master branch so both list and singular views will filter this
https://github.com/tomchristie/django-rest-framework/commit/1a8f07def8094a1e34a656d83fc7bdba0efff184
3) - about the api renderer - are you having your customers use this directly? if not I would say avoid it. If you need this it might be possible to add a custom serlializer that would help to limit the input on the front-end
Upon request # gabn88, as you may know by now, with DRF 3.0 and above, there is no easy solution.
Even IF you do manage to figure out a solution, it won't be pretty and will most likely fail on subsequent versions of DRF as it will override a bunch of DRF source which will have changed by then.
I forget the exact implementation I used, but the idea is to create 2 fields on the serializer, one your normal serializer field (lets say PrimaryKeyRelatedField etc...), and another field a serializer method field, which the results will be swapped under certain cases (such as based on the request, the request user, or whatever). This would be done on the serializers constructor (ie: init)
Your serializer method field will return a custom query that you want.
You will pop and/or swap these fields results, so that the results of your serializer method field will be assigned to the normal/default serializer field (PrimaryKeyRelatedField etc...) accordingly. That way you always deal with that one key (your default field) while the other key remains transparent within your application.
Along with this info, all you really need is to modify this: http://www.django-rest-framework.org/api-guide/serializers/#dynamically-modifying-fields
I wrote a custom CustomQueryHyperlinkedRelatedField class to generalize this behavior:
class CustomQueryHyperlinkedRelatedField(serializers.HyperlinkedRelatedField):
def __init__(self, view_name=None, **kwargs):
self.custom_query = kwargs.pop('custom_query', None)
super(CustomQueryHyperlinkedRelatedField, self).__init__(view_name, **kwargs)
def get_queryset(self):
if self.custom_query and callable(self.custom_query):
qry = self.custom_query()(self)
else:
qry = super(CustomQueryHyperlinkedRelatedField, self).get_queryset()
return qry
#property
def choices(self):
qry = self.get_queryset()
return OrderedDict([
(
six.text_type(self.to_representation(item)),
six.text_type(item)
)
for item in qry
])
Usage:
class MySerializer(serializers.HyperlinkedModelSerializer):
....
somefield = CustomQueryHyperlinkedRelatedField(view_name='someview-detail',
queryset=SomeModel.objects.none(),
custom_query=lambda: MySerializer.some_custom_query)
#staticmethod
def some_custom_query(field):
return SomeModel.objects.filter(somefield=field.context['request'].user.email)
...
I did the following:
class MyModelSerializer(serializers.ModelSerializer):
myForeignKeyFieldName = MyForeignModel.objects.all()
def get_fields(self, *args, **kwargs):
fields = super(MyModelSerializer, self).get_fields()
qs = MyModel.objects.filter(room=self.instance.id)
fields['myForeignKeyFieldName'].queryset = qs
return fields
I looked for a solution where I can set the queryset upon creation of the field and don't have to add a separate field class. This is what I came up with:
class PurchaseSerializer(serializers.HyperlinkedModelSerializer):
class Meta:
model = Purchase
fields = ["purchaser"]
def get_purchaser_queryset(self):
user = self.context["request"].user
return Purchase.objects.filter(purchaser=user)
def get_extra_kwargs(self):
kwargs = super().get_extra_kwargs()
kwargs["purchaser"] = {"queryset": self.get_purchaser_queryset()}
return kwargs
The main issue for tracking suggestions regarding this seems to be drf#1985.
Here's a re-usable generic serializer field that can be used instead of defining a custom field for every use case.
class DynamicPrimaryKeyRelatedField(serializers.PrimaryKeyRelatedField):
"""A PrimaryKeyRelatedField with ability to set queryset at runtime.
Pass a function in the `queryset_fn` kwarg. It will be passed the serializer `context`.
The function should return a queryset.
"""
def __init__(self, queryset_fn=None, **kwargs):
assert queryset_fn is not None, "The `queryset_fn` argument is required."
self.queryset_fn = queryset_fn
super().__init__(**kwargs)
def get_queryset(self):
return self.queryset_fn(context=self.context)
Usage:
class MySerializer(serializers.ModelSerializer):
my_models = DynamicPrimaryKeyRelatedField(
queryset_fn=lambda context: MyModel.objects.visible_to_user(context["request"].user)
)
# ...
Same works for serializers.SlugRelatedField.

Categories