Validating a Django model field based on another field's value? - python

I have a Django app with models accessible by both Django REST Framework and a regular form interface. The form interface has some validation checks before saving changes to the model, but not using any special Django framework, just a simple local change in the view.
I'd like to apply the same validation to forms and REST calls, so I want to move my validation into the model. I can see how to do that for simple cases using the validators field of the Field, but in one case I have a name/type/value model where the acceptable values for 'value' change depending on which type is selected. The validator doesn't get sent any information about the model that the field is in, so it doesn't have access to other fields.
How can I perform this validation, without having essentially the same code in a serializer for DRF and my POST view for the form?

I dug around codebase of drf a little bit. You can get values of all fields using following approach. Doing so, you can throw serialization error as
{'my_field':'error message} instead of {'non_field_error':'error message'}.
def validate_myfield(self, value):
data = self.get_initial() # data for all the fields
#do your validation
However, if you wish to do it for ListSerializer, i.e for serializer = serializer_class(many=True), this won't work. You will get list of empty values.
In that scenario, you could write your validations in def validate function and to avoid non_field_errors in your serialization error, you can raise ValidationError with error message as a dictionary instead of string.
def validate(self, data):
# do your validation
raise serializers.ValidationError({"your_field": "error_message"})

The validation per-field doesn't get sent any information about other fields, when it is defined like this:
def validate_myfield(self, value):
...
However, if you have a method defined like this:
def validate(self, data):
...
Then you get all the data in a dict, and you can do cross-field validation.

You can use the required package for your cross-field validation. It allows you to express your validation rules declaratively in python. You would have something like this with DRF:
class MySerializer(serializers.Serializer):
REQUIREMENTS = (
Requires("end_date", "start_date") +
Requires("end_date", R("end_date") > R("start_date")) +
Requires("end_date", R("end_date") < today.date() + one_year) +
Requires("start_date", R("start_date") < today.date() + one_year)
)
start_date = serializers.DateField(required=False, null=True, blank=True)
end_date = serializers.DateField(required=False, null=True, blank=True)
def validate(self, data):
self.REQUIREMENTS.validate(data) # handle validation error
You could put the REQUIREMENTS on your Model and have both your DRF and Django Form validate your data using it.
Here is a blog post explaining more

Related

Correct way to validate GET parameters in django

I'm building a social website that uses django templates/dynamic pages (no SPA technology in place).
I have some ajax calls that check the users news feed or new messages.
Example GET web request of those looks as follows:
GET /feeds/check/?last_feed=3&feed_source=all&_=1500749662203 HTTP/1.1
This is how I receive it in the view:
#login_required
#ajax_required
def check(request):
last_feed = request.GET.get('last_feed')
feeds = Feed.get_feeds_after(last_feed)
It all works, but I want to protect it so the function get_feeds_after does not crash when a malicious user sets the GET parameter to last_feed="123malicious4556". Currently it crashes because in the Feed model the function does this:
#staticmethod
def get_feeds_after(feed):
feeds = Feed.objects.filter(parent=None, id__gt=float(feed))
return feeds
and crashes with the error:
ValueError at /feeds/check/
invalid literal for float(): 2fff2
I currently solve this by directly performing checks on the GET variable and handling exception on int() casting:
def check(request):
last_feed = request.GET.get('last_feed')
try:
feed_source = int(request.GET.get('last_feed'))
except ValueError:
return HttpResponse(0)
My question is what is the best django-recommended way to address this?
I know django has special support forms validation. But this does not seem quite right here, as the GET calls are more of an api rather than forms so it seems like a bad idea to define forms for those GET parameters.
Thanks
All you actually need are the form fields which do all basic validation for you.
If you need custom validation you can write your own validator or even better your own custom form field.
To use the field alone without the form you can do like that for example:
evalType = forms.CharField().clean(request.GET.get('eval-type'))
Because calling this way is not very human friendly I prefer to write a function to deal with it:
def cleanParam(params, paramName, FieldType, *args, **kwargs):
field = FieldType(*args, **kwargs)
cleaned = field.clean(params.get(paramName))
return cleaned
Which we use this way:
evalType = cleanParam(request.GET, 'eval-type', forms.CharField)
This will save you a form class. But I don't think it's very ugly to create a django form for that. A bit too much for the problem but no great concern IMHO.
The advantage of having a form is that you declare the fields you expect in your api call and can check all at once then see the result of is_valid().
I hope this helps.
You can also use a django Validator, which is
a callable that takes a value and raises a ValidationError if it does not meet some criteria.
In your specific case, you can use a combination of validate_slug() - which does not raise a ValidationError only if the passed input is composed of letters, numbers, underscores or hyphens - and int() functions in this way:
from django.core.validators import validate_slug
...
last_feed = request.GET.get("last_feed", "")
try:
validate_slug(last_feed)
last_feed = int(last_feed)
...
except ValueError:
print("invalid literal passed to int()")
except ValidationError:
print("invalid input passed to validate_slug()")
...
Request params can be validated with the DRF serializer.
Create a serializer with all the parameters that are required to be validated.
from rest_framework import serializers
class OrderIDSerializer(serializers.Serializer):
order_id = serializers.CharField(required=True)
def validate(self, attrs):
order_id = attrs.get('order_id')
if not Order.objects.filter(
id=order_id).exists():
raise error
attrs['order_id'] = order_id
return attrs
Import serializer in the views:
from order.serializers.order_serializer import OrderIDSerializer
serializer = OrderIDSerializer(data=request.query_params)
if serializer.is_valid():
order_id = serializer.data['order_id']
// your logic after the validation

How to automatically fill-in model fields in Django rest_framework serializer?

Let's assume I have a model like this:
class Data(models.Model):
a = models.CharField()
b = models.CharField()
c = models.IntegerField()
I would like to setup a serializer in such a way that it automatically fills in field c and it is not required for a POST. I tried to overwrite the create function of the serializer, but it doesn't work:
class DataSerializer(serializers.HyperlinkedModelSerializer):
class Meta:
model = Data
fields = ('a', 'b')
def create(self, validated_data, **kwargs):
Data.objects.c = 5
return Data.objects.create(**validated_data)
However, if I try this, I end up with an IntegrityError: NOT NULL constraint failed: model_data.c. What is the syntax that I have to use here?
EDIT: Updated formatting.
The reason you're getting the error because field c is not set to null = True - as such an error is raised at the validation stage even before the serializer hits the create method.
Bear in mind that the process goes like this:
Submit serializer data
field-level validation happens - this includes checks for null integrity, min/max length etc and also any custom field validations defined in def validate_<field_name>
object-level validation happens - this calls the def validate method
validated data is passed to the save method, depending on how you designed the serializer - it will save the instance, or route the data to either create or update
All of the info regarding this can be found in Django's and DRF's docs.
A few things to consider:
are you setting a global default for that field? If so, set the default in your models - c = models.IntegerField(default=a_number_or_a_callable_that_returns_an_integer)
do you intend to display the field? If so, include c in your fields and add one more Meta attribute - read_only_fields = ('c',)
If it's neither of the above, you might want to override the validate_c method
Apologies for the poor formatting, typing it on my phone - will update once I get to a computer
In your code Data.objects.c = 5 does nothing.
If you want to set this value yourself use validated_data['c'] = 5 or Data.objects.create(c=5, **validated_data) (just not both at the same time).
Rather than doing this in the serializer, there are hooks in the generic views that allow you to pass values to the serializer. So in your case you might have:
class DataViewSet(ModelViewSet):
# ...
def perform_create(self, serializer):
serializer.save(c=5)
See the "Save and deletion hooks" section here

Set foreignkey field based on request

I'm using the Django Rest Framework 3.2. I have a model "Report" with a ForeignKey to the model "Office".
I want to set the report.office based on the request.
This is how I did it (which I don't like, because it requires duplication):
In my view I differ between GET or POST/PUT/PATCH requests, where GET uses a nested Serializer representation, and POST/PUT/PATCH uses flat serializer representations.
Then I do in my view:
def perform_create(self, serializer):
serializer.save(office=Office.objects.get_only_one_from_request(self.request))
and in my FlatOfficeSerializer:
def validate(self, attrs):
if self.instance is not None:
instance = self.instance
instance_fields = instance._meta.local_fields
for f in instance_fields:
f = self.get_initial().get(f.name)
else:
instance = self.Meta.model(**attrs)
request = self.context.get('request')
instance.office = Office.objects.get_only_one_from_request(request)
instance.clean()
return attrs
It is clear that there is a duplication there and although it is working I don't like it. Without the duplication the validation fails (of course) on the instance.clean(), which calls the clean method of the model. I really need this, because my business logic is in the model's clean and save methods.

Order of Serializer Validation in Django REST Framework

Situation
While working with validation in the Django REST Framework's ModelSerializer, I have noticed that the Meta.model fields are always validated, even when it does not necessarily make sense to do so. Take the following example for a User model's serialization:
I have an endpoint that creates a user. As such, there is a password field and a confirm_password field. If the two fields do not match, the user cannot be created. Likewise, if the requested username already exists, the user cannot be created.
The user POSTs improper values for each of the fields mentioned above
An implementation of validate has been made in the serializer (see below), catching the non-matching password and confirm_password fields
Implementation of validate:
def validate(self, data):
if data['password'] != data.pop('confirm_password'):
raise serializers.ValidationError("Passwords do not match")
return data
Problem
Even when the ValidationError is raised by validate, the ModelSerializer still queries the database to check to see if the username is already in use. This is evident in the error-list that gets returned from the endpoint; both the model and non-field errors are present.
Consequently, I would like to know how to prevent model validation until after non-field validation has finished, saving me a call to my database.
Attempt at solution
I have been trying to go through the DRF's source to figure out where this is happening, but I have been unsuccessful in locating what I need to override in order to get this to work.
Since most likely your username field has unique=True set, Django REST Framework automatically adds a validator that checks to make sure the new username is unique. You can actually confirm this by doing repr(serializer()), which will show you all of the automatically generated fields, which includes the validators.
Validation is run in a specific, undocumented order
Field deserialization called (serializer.to_internal_value and field.run_validators)
serializer.validate_[field] is called for each field
Serializer-level validators are called (serializer.run_validation followed by serializer.run_validators)
serializer.validate is called
So the problem that you are seeing is that the field-level validation is called before your serializer-level validation. While I wouldn't recommend it, you can remove the field-level validator by setting extra_kwargs in your serilalizer's meta.
class Meta:
extra_kwargs = {
"username": {
"validators": [],
},
}
You will need to re-implement the unique check in your own validation though, along with any additional validators that have been automatically generated.
I was also trying to understand how the control flows during serializer validation and after carefully going through the source code of djangorestframework-3.10.3 I came up with below request flow diagram. I have described the flow and what happens in the flow to the best of my understanding without going into too much detail as it can be looked up from source.
Ignore the incomplete method signatures. Only focusing on what methods are called on what classes.
Assuming you have an overridden is_valid method on your serializer class (MySerializer(serializers.Serializer)) when you call my_serializer.is_valid() the following takes place.
MySerializer.is_valid() is executed.
Assuming you are calling the super class (BaseSerializer) is_valid method (like: super(MySerializer, self).is_valid(raise_exception) in your MySerializer.is_valid() method, that will be called.
Now since MySerializer is extending serializers.Serializer, the run_validation() method from serializer.Serializers is called. This is validating only the data dict the first. So we haven't yet started field level validations.
Then the validate_empty_values from fields.Field gets called. This again happens on the entire data and not a single field.
Then the Serializer.to_internal_method is called.
Now we loop over each fields defined on the serializer. And for each field, first we call the field.run_validation() method. If the field has overridden the Field.run_validation() method then that will be called first. In case of a CharField it is overridden and calls the run_validation method of Field base class. Step 6-2 in the figure.
On that field we again call the Field.validate_empty_values()
The to_internal_value of the type of field is called next.
Now there is a call to the Field.run_validators() method. I presume this is where the additional validators that we add on the field by specifying the validators = [] field option get executed one by one
Once all this is done, we are back to the Serializer.to_internal_value() method. Now remember that we are doing the above for each field within that for loop. Now the custom field validators you wrote in your serializer (methods like validate_field_name) are run. If an exception occurred in any of the previous steps, your custom validators wont run.
read_only_defaults()
update validate data with defaults I think
run object level validators. I think the validate() method on your object is run here.
I don't believe the above solutions work any more. In my case, my model has fields 'first_name' and 'last_name', but the API will only receive 'name'.
Setting 'extra_kwargs' and 'validators' in the Meta class seems to have no effect, first_name and last_name are allways deemed required, and validators are always called. I can't overload the first_name/last_name character fields with
anotherrepfor_first_name = serializers.CharField(source=first_name, required=False)
as the names make sense. After many hours of frustration, I found the only way I could override the validators with a ModelSerializer instance was to override the class initializer as follows (forgive the incorrect indentation):
class ContactSerializer(serializers.ModelSerializer):
name = serializers.CharField(required=True)
class Meta:
model = Contact
fields = [ 'name', 'first_name', 'last_name', 'email', 'phone', 'question' ]
def __init__(self, *args, **kwargs):
self.fields['first_name'] = serializers.CharField(required=False, allow_null=True, allow_blank=True)
self.fields['last_name'] = serializers.CharField(required=False, allow_null=True, allow_blank=True)
return super(ContactSerializer, self).__init__(*args, **kwargs)
def create(self, validated_data):
return Contact.objects.create()
def validate(self, data):
"""
Remove name after getting first_name, last_name
"""
missing = []
for k in ['name', 'email', 'question']:
if k not in self.fields:
missing.append(k)
if len(missing):
raise serializers.ValidationError("Ooops! The following fields are required: %s" % ','.join(missing))
from nameparser import HumanName
names = HumanName(data['name'])
names.capitalize()
data['last_name'] = names.last
if re.search(r'\w+', names.middle):
data['first_name'] = ' '.join([names.first, names.middle])
else:
data['first_name'] = names.first
del(data['name'])
return data
Now the doc says that allowing blank and null with character fields is a no no, but this is a serializer, not a model, and as the API gets called by all kinds of cowboys, I need to cover my bases.
Here's the approach that worked for me.
Use a sentinel error type that gets caught in an overridden view function
The sentinel is raised from the custom serializer
The sentinel error type:
from django.core.exceptions import ValidationError
class CustomValidationErrors(ValidationError):
""" custom validation error for the api view to catch the status code """
And in the serializer we override errors, _errors, validated_data, and _validated_data, as well as is_valid:
class CustomSerializer(serializers.ModelSerializer):
# fields that usually run validation before parent serializer validation
child_field1 = Child1Serializer()
child_field2 = Child2Serializer()
# override DRF fields
errors = {}
_errors = None
validated_data = {}
_validated_data = []
def is_valid(self, *args, **kwargs):
# override drf.serializers.Serializer.is_valid
# and raise CustomValidationErrors from parent validate
self.validate(self.initial_data)
return not bool(self.errors)
def validate(self, attrs):
self._errors = {}
if len(attrs.get("child_field1", {}).get("name", "")) > 100:
self._errors.update({"child_field1": {"name": "child 1 name > 100"}})
if len(attrs.get("child_field2", {}).get("description", "")) > 1000:
self._errors.update({"child_field2.description": "child 2 description > 100"})
if len(self._errors):
# set the overriden DRF values
self.errors = self._errors
# raise the sentinel error type
raise CustomValidationErrors(self._errors)
# set the overriden DRF serializer values
self._errors = None
self.validated_data = attrs
self._validated_data = [[k, v] for k, v in attrs.items()]
return attrs
class Meta:
model = CustomModel
And in the view we can override the default method, and catch the sentinel error type:
class CustomSerializerView(ListCreateAPIView):
serializer_class = CustomeSerializer
def post(self, *args, **kwargs):
try:
# if this fails for any exception
# other than CustomValidationErrors
# it will return the default error
return super().post(*args, **kwargs)
except CustomValidationErrors as e:
############
# returns a 400 with the following
# {"child_field1":
# [[{"name": "child 1 name > 100"}]],
# "child_field2.description":
# [["child 2 description > 100"]]
# }
############
return Response(e.error_dict, status=400)
drf version:
djangorestframework==3.11.0

Django Rest Framework - Read nested data, write integer

So far I'm extremely happy with Django Rest Framework, which is why I alsmost can't believe there's such a large omission in the codebase. Hopefully someone knows of a way how to support this:
class PinSerializer(serializers.ModelSerializer):
item = ItemSerializer(read_only=True, source='item')
item = serializers.IntegerSerializer(write_only=True)
class Meta:
model = Pin
with the goal
The goal here is to read:
{pin: item: {name: 'a', url: 'b'}}
but to write using an id
{pin: item: 10}
An alternative would be to use two serializers, but that looks like a really ugly solution:
django rest framework model serializers - read nested, write flat
Django lets you access the Item on your Pin with the item attribute, but actually stores the relationship as item_id. You can use this strategy in your serializer to get around the fact that a Python object cannot have two attributes with the same name (a problem you would encounter in your code).
The best way to do this is to use a PrimaryKeyRelatedField with a source argument. This will ensure proper validation gets done, converting "item_id": <id> to "item": <instance> during field validation (immediately before the serializer's validate call). This allows you to manipulate the full object during validate, create, and update methods. Your final code would be:
class PinSerializer(serializers.ModelSerializer):
item = ItemSerializer(read_only=True)
item_id = serializers.PrimaryKeyRelatedField(write_only=True,
source='item',
queryset=Item.objects.all())
class Meta:
model = Pin
fields = ('id', 'item', 'item_id',)
Note 1: I also removed source='item' on the read-field as that was redundant.
Note 2: I actually find it rather unintuitive that Django Rest is set up such that a Pin serializer without an Item serializer specified returns the item_id as "item": <id> and not "item_id": <id>, but that is beside the point.
This method can even be used with forward and reverse "Many" relationships. For example, you can use an array of pin_ids to set all the Pins on an Item with the following code:
class ItemSerializer(serializers.ModelSerializer):
pins = PinSerializer(many=True, read_only=True)
pin_ids = serializers.PrimaryKeyRelatedField(many=True,
write_only=True,
source='pins',
queryset=Pin.objects.all())
class Meta:
model = Item
fields = ('id', 'pins', 'pin_ids',)
Another strategy that I previously recommended is to use an IntegerField to directly set the item_id. Assuming you are using a OneToOneField or ForeignKey to relate your Pin to your Item, you can set item_id to an integer without using the item field at all. This weakens the validation and can result in DB-level errors from constraints being violated. If you want to skip the validation DB call, have a specific need for the ID instead of the object in your validate/create/update code, or need simultaneously writable fields with the same source, this may be better, but I wouldn't recommend anymore. The full line would be:
item_id = serializers.IntegerField(write_only=True)
If you are using DRF 3.0 you can implement the new to_internal_value method to override the item field to change it to a PrimaryKeyRelatedField to allow the flat writes. The to_internal_value takes unvalidated incoming data as input and should return the validated data that will be made available as serializer.validated_data. See the docs: http://www.django-rest-framework.org/api-guide/serializers/#to_internal_valueself-data
So in your case it would be:
class ItemSerializer(ModelSerializer):
class Meta:
model = Item
class PinSerializer(ModelSerializer):
item = ItemSerializer()
# override the nested item field to PrimareKeyRelatedField on writes
def to_internal_value(self, data):
self.fields['item'] = serializers.PrimaryKeyRelatedField(queryset=Item.objects.all())
return super(PinSerializer, self).to_internal_value(data)
class Meta:
model = Pin
Two things to note: The browsable web api will still think that writes will be nested. I'm not sure how to fix that but I only using the web interface for debug so not a big deal. Also, after you write the item returned will have flat item instead of the nested one. To fix that you can add this code to force the reads to use the Item serializer always.
def to_representation(self, obj):
self.fields['item'] = ItemSerializer()
return super(PinSerializer, self).to_representation(obj)
I got the idea from this from Anton Dmitrievsky's answer here: DRF: Simple foreign key assignment with nested serializers?
You can create a Customized Serializer Field (http://www.django-rest-framework.org/api-guide/fields)
The example took from the link:
class ColourField(serializers.WritableField):
"""
Color objects are serialized into "rgb(#, #, #)" notation.
"""
def to_native(self, obj):
return "rgb(%d, %d, %d)" % (obj.red, obj.green, obj.blue)
def from_native(self, data):
data = data.strip('rgb(').rstrip(')')
red, green, blue = [int(col) for col in data.split(',')]
return Color(red, green, blue)
Then use this field in your serializer class.
I create a Field type that tries to solve the problem of the Data Save requests with its ForeignKey in Integer, and the requests to read data with nested data
This is the class:
class NestedRelatedField(serializers.PrimaryKeyRelatedField):
"""
Model identical to PrimaryKeyRelatedField but its
representation will be nested and its input will
be a primary key.
"""
def __init__(self, **kwargs):
self.pk_field = kwargs.pop('pk_field', None)
self.model = kwargs.pop('model', None)
self.serializer_class = kwargs.pop('serializer_class', None)
super().__init__(**kwargs)
def to_representation(self, data):
pk = super(NestedRelatedField, self).to_representation(data)
try:
return self.serializer_class(self.model.objects.get(pk=pk)).data
except self.model.DoesNotExist:
return None
def to_internal_value(self, data):
return serializers.PrimaryKeyRelatedField.to_internal_value(self, data)
And so it would be used:
class PostModelSerializer(serializers.ModelSerializer):
message = NestedRelatedField(
queryset=MessagePrefix.objects.all(),
model=MessagePrefix,
serializer_class=MessagePrefixModelSerializer
)
I hope this helps you.

Categories