Validate field in Marshmallow but don't deserialize it - python

I'm trying to build a Marshmallow schema based on a model, but with one additional field. Declaring the extra field by itself and setting Meta.model to my model seemed to work, but I can't find a way to have the additional field validated (it is marked as required) without it also turning up in the resulting deserialized object.
I tried marking it as excluded and as dump_only, but to no avail: either the validation does not take place, or the deserialized object also contains the additional field (which then clashes with my ORM).

For now I've solved it by subclassing my model schema, adding the additional field there, and then, before loading the data through the model schema, validating it through the subclass.
If there is a more elegant solution I'd still love to hear it.

Related

Is there any way to add through to an M2M field while keeping the related manager's .add() working?

While Django doesn't completely support adding the through attribute to an M2M field (in order to add some extra fields), it can be migrated. The main issue is that Django will complain when any code tries to .add() models to the related set, even if there are no required fields in the through model apart from the FKs of the linked models.
So I want to add a nullable field to the through model and still keep .add() and .remove() working as before (implicitly using None as the nullable field's value). Adding auto_created=True in the Meta almost works, but it breaks migrations, among other things. Is there any way to make it work without overriding the many-to-many descriptor (which isn't exactly part of the public API, though a lot of third-party Django packages use it)?

SQLAlchemy whole model validation with Cerberus

I want to create a universal validation mechanism for all models using Cerberus. The goal is to keep a Cerberus schema in each model's __schema__ property and validate the whole model against that schema each time the model's state changes (not necessarily before insert or update). I thought about using events, but I'm not sure how to do it without missing something.
Based on what you're describing, you might be better off using marshmallow instead of cerberus:
http://marshmallow.readthedocs.io/en/latest/examples.html#quotes-api-flask-sqlalchemy
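If you do want the event-based route, one sketch (SQLAlchemy 1.4+ assumed; model and schema are hypothetical) is to listen for the Session's before_flush event and validate every new or modified object against its __schema__. The `validate` function below is a stand-in marking where `cerberus.Validator(schema).validate(doc)` would go:

```python
# Event-based validation sketch (SQLAlchemy 1.4+ assumed). The validate()
# stand-in below is where cerberus.Validator(schema).validate(doc) would
# be called; User and its __schema__ are hypothetical.
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    # Cerberus-style schema kept on the model itself.
    __schema__ = {"name": {"type": "string", "empty": False}}
    id = Column(Integer, primary_key=True)
    name = Column(String)

def validate(schema, doc):
    # Stand-in for cerberus.Validator(schema).validate(doc).
    return isinstance(doc.get("name"), str) and doc["name"] != ""

@event.listens_for(Session, "before_flush")
def validate_models(session, flush_context, instances):
    # Check every object about to be inserted or updated.
    for obj in list(session.new) + list(session.dirty):
        schema = getattr(obj, "__schema__", None)
        if schema is None:
            continue
        doc = {name: getattr(obj, name) for name in schema}
        if not validate(schema, doc):
            raise ValueError(f"{type(obj).__name__} failed validation")
```

With this listener in place, committing an invalid object raises before anything is written to the database.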

Why does a MongoEngine Document accept undefined fields without raising an exception?

If I have a MongoEngine Document and if I make an instance of it supplying undefined fields, it doesn't raise an exception. Why is this? I would like to know when I supply an undefined field.
MyDoc(undefined_field='test').save()
The advantage of using MongoDB is schema flexibility, which is why it is called schemaless. The framework supports this by allowing fields that are not defined in the schema.
I am not sure about MongoEngine, but in MongoKit you can specify whether or not to be strict about the schema; by default a ValidationError is raised for undefined fields. To make a document schemaless, add use_schemaless = True to the Document class.
I would prefer MongoKit over MongoEngine, not just for this reason, but because MongoEngine adds extra fields and attributes to your data when it is saved to the DB. If you would rather keep your data clean, go for MongoKit (just advice from previous experience). You can refer here: mongokit.
I have the feeling this is missing functionality from MongoEngine so I created a pull request:
https://github.com/MongoEngine/mongoengine/pull/457
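The guard that pull request adds can be sketched framework-independently: reject any constructor keyword that doesn't correspond to a declared field. `BaseDocument` below is a plain-Python stand-in (the real MongoEngine keeps declared fields in a `_fields` dict on the document class):

```python
# Framework-agnostic sketch of a strict constructor. BaseDocument is a
# stand-in for mongoengine.Document; MongoEngine stores declared fields
# in a class-level `_fields` dict, which this mixin relies on.
class BaseDocument:
    _fields = {}

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

class StrictInitMixin:
    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(self._fields)
        if unknown:
            raise AttributeError(
                f"Undefined fields passed to {type(self).__name__}: "
                f"{sorted(unknown)}")
        super().__init__(**kwargs)

class MyDoc(StrictInitMixin, BaseDocument):
    _fields = {"name": None}
```

Now `MyDoc(name="test")` works, while `MyDoc(undefined_field="test")` raises immediately instead of silently storing the field.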

data validation for SQLAlchemy declarative models

I'm using CherryPy, Mako templates, and SQLAlchemy in a web app. I'm coming from a Ruby on Rails background and I'm trying to set up some data validation for my models. I can't figure out the best way to ensure, say, a 'name' field has a value when some other field has a value.
I tried using SAValidation, but it allowed me to create new rows where a required column was blank, even when I used validates_presence_of on the column. I've been looking at WTForms, but that seems to involve a lot of duplicated code: I already have my model class set up with the columns in the table, so why do I need to repeat all those columns again just to say "hey, this one needs a value"?
I'm coming from the "skinny controller, fat model" mindset and have been looking for Rails-like methods in my model such as validates_presence_of or validates_length_of. How should I go about validating the data my model receives, and ensuring Session.add/Session.merge fail when the validations fail?
Take a look at the documentation for adding validation methods. You could just add an "update" method that takes the POST dict, makes sure that required keys are present, and uses the decorated validators to set the values (raising an error if anything is awry).
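The validation methods referred to are presumably SQLAlchemy's @validates decorator. A minimal sketch (SQLAlchemy 1.4+ assumed; the model is hypothetical): the validator fires on attribute assignment, before anything reaches the database.

```python
# Sketch of SQLAlchemy's simple validators (1.4+ assumed; User is a
# hypothetical model). The decorated method runs whenever `name` is set,
# including inside __init__, and can reject bad values by raising.
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base, validates

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)

    @validates("name")
    def validate_name(self, key, value):
        if not value or not value.strip():
            raise ValueError("name must not be blank")
        return value
```

With this in place, `User(name="")` raises a ValueError immediately, so a blank name never gets as far as Session.add.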
I wrote SAValidation for the specific purpose of avoiding code duplication when it comes to validating model data. It works well for us, at least for our use cases.
In our tests, we have examples of the model's setup and tests to show the validation works.
API Logic Server provides business rules for SQLAlchemy models, including not only multi-field, single-table validations but also multi-table validations. It's open source.
I ended up using WTForms after all.

Separation of ORM and validation

I use Django and I wonder where model validation should go. There are at least two variants:
Validate in the model's save method and raise IntegrityError or another exception if business rules were violated
Validate data using forms and built-in clean_* facilities
From one point of view, the answer is obvious: one should use form-based validation, because the ORM and validation are entirely separate concerns. Take a look at CharField: forms.CharField allows a min_length specification, but models.CharField does not.
OK, but then what are all those validation features doing in django.db.models? I can specify that a CharField can't be blank; I can use EmailField, FileField, or SlugField, whose validation is performed here, in Python, not in the RDBMS. Furthermore, there is URLField, which can check the existence of the URL using some really complex logic.
On the other hand, if I have an entity, I want to guarantee that it will not be saved in an inconsistent state, whether it came from a form or was modified/created by some internal algorithm. I have a model with a name field, which I expect to be longer than one character. I also have min_age and max_age fields, and it makes little sense if min_age > max_age. So should I check such conditions in the save method?
What are the best practices of model validation?
I am not sure if this is best practice, but I tend to validate both client-side and server-side before pushing data to the database. I know it requires more effort, but it can be done by setting some values before use and then maintaining them.
You could also try passing size constraints via **kwargs into a validation function that is called before the put() call.
Your two options are two different things.
Form-based validation can be regarded as syntactic validation + convert HTTP request parameters from text to Python types.
Model-based validation can be regarded as semantic validation, sometimes using context not available at the HTTP/form layer.
And of course there is a third layer at the DB where constraints are enforced, and may not be checkable anywhere else because of concurrent requests updating the database (e.g. uniqueness constraints, optimistic locking).
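The database-enforced layer can be sketched like this (SQLAlchemy over SQLite assumed; names hypothetical). A uniqueness constraint is only reliably checkable by the database itself, since two concurrent requests can each pass an application-level check before either commits:

```python
# Sketch of the DB constraint layer (SQLAlchemy + SQLite assumed;
# Account is hypothetical). The duplicate is rejected by the database
# at commit time, not by any Python-level check.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Account(Base):
    __tablename__ = "accounts"
    id = Column(Integer, primary_key=True)
    email = Column(String, unique=True, nullable=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Account(email="a@example.com"))
    session.commit()
    session.add(Account(email="a@example.com"))
    try:
        session.commit()        # the DB, not Python, rejects the duplicate
    except IntegrityError:
        session.rollback()
```

Application code still has to be prepared to catch IntegrityError at commit time, which is exactly why this layer can't be replaced by form or model validation.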
"but what the hell all that validation features are doing in django.db.models? "
One word: Legacy. Early versions of Django had less robust forms and the validation was scattered.
"So should I check such conditions in save method?"
No, you should use a form for all validation.
"What are the best practices of model validation?"*
Use a form for all validation.
"whether it came from a form or was modified/created by some internal algorithms"
What? If your algorithms suffer from psychotic episodes or your programmers are sociopaths, then -- perhaps -- you have to validate internally-generated data.
Otherwise, internally-generated data is -- by definition -- valid. Only user data can be invalid. If you don't trust your software, what's the point of writing it? Are your unit tests broken?
There's an ongoing Google Summer of Code project that aims to bring validation to the Django model layer. You can read more about it in this presentation from the GSoC student (Honza Kral). There's also a github repository with the preliminary code.
Until that code finds its way into a Django release, one recommended approach is to use ModelForms to validate data, even if the source isn't a form. It's described in this blog entry from one of the Django core devs.
DB/Model validation
The data stored in the database must always be in a certain form/state. For example: required first name, last name, foreign key, unique constraint. This is where the logic of your app resides. No matter where you think the data comes from, it should be validated here, and an exception raised if the requirements are not met.
Form validation
Data being entered should look right. It is ok if this data is entered differently through some other means (through admin or api calls).
Examples: length of person's name, proper capitalization of the sentence...
Example 1: An object has a StartDate and an EndDate, and StartDate must always be before EndDate. Where do you validate this? In the model, of course! Consider a case where you might be importing data from some other system: you don't want a violation to go through.
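Framework aside, the StartDate/EndDate invariant is just a check on the model object itself, so imported data can't bypass it. A plain-Python sketch (names hypothetical; a Django model would do the same check in its clean() method):

```python
# Framework-agnostic sketch of the invariant: the model object refuses
# to exist in an inconsistent state, no matter where the data came from.
from dataclasses import dataclass
from datetime import date

@dataclass
class Event:
    start_date: date
    end_date: date

    def __post_init__(self):
        if self.start_date >= self.end_date:
            raise ValueError("start_date must be before end_date")
```

An import script constructing `Event(date(2024, 5, 1), date(2024, 4, 1))` fails immediately, long before any database write.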
Example 2: Password confirmation. You have a field for storing the password in the DB, but you display two fields, password1 and password2, on your form. The form, and only the form, is responsible for comparing those two fields to make sure they are the same. Once the form is valid, you can safely store the password1 value in the DB as the password.
