Why does a MongoEngine Document accept undefined fields without raising an exception?


If I create an instance of a MongoEngine Document and supply undefined fields, it doesn't raise an exception. Why is this? I would like to know when I supply an undefined field.
MyDoc(undefined_field='test').save()

The advantage of using MongoDB is its flexible schema, which is why it is said to be schemaless. Frameworks make use of this by allowing fields that were not defined in advance.
I am not sure about MongoEngine, but if you are using MongoKit, you can specify that it should not be strict about your schema; otherwise a ValidationError is raised by default.
To make a document schemaless, add use_schemaless = True to the Document class.
I would prefer MongoKit over MongoEngine, not just for this reason, but also because MongoEngine adds extra fields and attributes to your data when it is saved to the DB. If you would rather keep your data clean, go for MongoKit (just advice from previous experience). You can refer here: mongokit.
Thanks!

I have the feeling this is missing functionality from MongoEngine so I created a pull request:
https://github.com/MongoEngine/mongoengine/pull/457
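For readers who want the failure the question asks for, the idea can be approximated in plain Python. The sketch below is not MongoEngine code and does not reflect the pull request: StrictDocument, UndefinedFieldError and the _fields tuple are hypothetical names, used only to show what rejecting undeclared keyword arguments looks like.

```python
class UndefinedFieldError(Exception):
    """Raised when a keyword argument matches no declared field (hypothetical)."""


class StrictDocument:
    # Assumption for this sketch: subclasses list the field names they accept.
    # A real ODM would derive this from its field declarations.
    _fields = ()

    def __init__(self, **values):
        # Anything passed in that was never declared is treated as an error.
        unknown = sorted(set(values) - set(self._fields))
        if unknown:
            raise UndefinedFieldError(
                f"{type(self).__name__} has no field(s): {', '.join(unknown)}"
            )
        # Declared fields that were not supplied default to None.
        for name in self._fields:
            setattr(self, name, values.get(name))


class MyDoc(StrictDocument):
    _fields = ("title",)


doc = MyDoc(title="ok")  # declared field: accepted

try:
    MyDoc(undefined_field="test")  # undeclared field: rejected
except UndefinedFieldError as exc:
    error_message = str(exc)
```

The check runs at construction time, so the mistake surfaces before anything is written to the database.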

Related

Validate field in Marshmallow but don't deserialize it

I'm trying to build a Marshmallow schema based on a model, but with one additional field. While this seemed to work by declaring the special field by itself and then setting meta.model to my assigned model, I fail to find a solution so that the additional field gets validated (it is marked as required), but does not turn up in the resulting, deserialized object.
I tried setting it as excluded and dump_only, but to no avail: either the validation does not take place, or the deserialized object also contains the additional field (which then clashes with my ORM).

For now I solved it by subclassing my model schema, adding the additional field there, and then - before loading my data through the model schema - validating it through the subclassed schema.
If there is a more elegant solution I'd still love to hear it.

What is the proper process for validating and saving data with Django/Django Rest Framework regardless of the data source?

I have a particular model that I'd like to perform custom validations on. I'd like to guarantee that at least one identifier field is always present when creating a new instance, such that it's impossible to create an instance without one of these fields, though no field in particular is individually required.
from django.db import models
class Security(models.Model):
    symbol = models.CharField(unique=True, blank=True)
    sedol = models.CharField(unique=True, blank=True)
    tradingitemid = models.CharField(unique=True, blank=True)
I'd like a clean, reliable way to do this no matter where the original data is coming from (e.g., an API post or internal functions that get this data from other sources like a .csv file).
I understand that I could override the model's .save() method and perform validation there, but best practice stated here suggests that raising validation errors in the .save() method is a bad idea because views will simply return a 500 response instead of returning a validation error to a post request.
I know that I can define a custom serializer with a validator using Django Rest Framework for this model that validates the data (this would be a great solution for a ModelViewSet where the objects are created and I can guarantee this serializer is used each time). But this data integrity guarantee is only good on that API endpoint and then as good as the developer is at remembering to use that serializer each and every time an object is created elsewhere in the codebase (objects can be created throughout the codebase from sources besides the web API).
I am also familiar with Django's .clean() and .full_clean() methods. These seem like the perfect solutions, except that it again relies upon the developer always remembering to call these methods--a guarantee that's only as good as the developer's memory. I know the methods are called automatically when using a ModelForm, but again, for my use case models can be created from .csv downloads as well--I need a general purpose guarantee that's best practice. I could put .clean() in the model's .save() method, but this answer (and related comments and links in the post) seem to make this approach controversial and perhaps an anti-pattern.
Is there a clean, straightforward way to guarantee that this model can never be saved without one of the three fields that 1. doesn't raise 500 errors through a view, 2. doesn't rely upon the developer explicitly using the correct serializer throughout the codebase when creating objects, and 3. doesn't rely upon hacking a call to .clean() into the .save() method of the model (a seeming anti-pattern)? I feel like there must be a clean solution here that isn't a hodgepodge of putting some validation in a serializer, some in a .clean() method, hacking the .save() method to call .clean() (it would get called twice with saves from ModelForms), etc...

One could certainly imagine a design where save() did double duty and handled validation for you. For various reasons (partially summarized in the links here), Django decided to make this a two-step process. So I agree with the consensus you found that trying to shoehorn validation into Model.save() is an anti-pattern. It runs counter to Django's design, and will probably cause problems down the road.
You've already found the "perfect solution", which is to use Model.full_clean() to do the validation. I don't agree with you that remembering this will be burdensome for developers. I mean, remembering to do anything right can be hard, especially with a large and powerful framework, but this particular thing is straightforward, well documented, and fundamental to Django's ORM design.
This is especially true when you consider what is actually, provably difficult for developers, which is the error handling itself. It's not like developers could just do model.validate_and_save(). Rather, they would have to do:
try:
    model.validate_and_save()
except ValidationError:
    # handle error - this is the hard part
    pass
Whereas Django's idiom is:
try:
    model.full_clean()
except ValidationError:
    # handle error - this is the hard part
    pass
else:
    model.save()
I don't find Django's version any more difficult. (That said, there's nothing stopping you from writing your own validate_and_save convenience method.)
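A convenience method of that kind could look roughly like the sketch below. ValidateAndSaveMixin is a hypothetical name, not part of Django; the only assumption is that the class it is mixed into provides Django's Model.full_clean() and Model.save().

```python
class ValidateAndSaveMixin:
    """Hypothetical mixin: validate first, save only if validation passed.

    Assumes the host class provides full_clean() and save(), as
    django.db.models.Model does.
    """

    def validate_and_save(self, *args, **kwargs):
        # full_clean() raises ValidationError on bad data, so save() is
        # reached only when every validation step succeeded.
        self.full_clean()
        self.save(*args, **kwargs)
```

A model would opt in with something like class Security(ValidateAndSaveMixin, models.Model). Callers still have to catch ValidationError themselves, which is the part described above as hard.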
Finally, I would suggest adding a database constraint for your requirement as well. This is what Django does when you add a constraint that it knows how to enforce at the database level. For example, when you use unique=True on a field, Django will both create a database constraint and add Python code to validate that requirement. But if you want to create a constraint that Django doesn't know about you can do the same thing yourself. You would simply write a Migration that creates the appropriate database constraint in addition to writing your own Python version in clean(). That way, if there's a bug in your code and the validation isn't done, you end up with an uncaught exception (IntegrityError) rather than corrupted data.
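To see what the database-level safety net buys you, here is a small sketch using the standard library's sqlite3 module rather than a Django migration. The table layout and the insert_security helper are assumptions made up for the illustration; the point is only that a CHECK constraint turns a skipped validation into an IntegrityError instead of a bad row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table mirroring the three identifier columns from the question.
# The CHECK constraint demands that at least one of them is non-empty.
conn.execute(
    """
    CREATE TABLE security (
        id INTEGER PRIMARY KEY,
        symbol TEXT NOT NULL DEFAULT '',
        sedol TEXT NOT NULL DEFAULT '',
        tradingitemid TEXT NOT NULL DEFAULT '',
        CHECK (symbol <> '' OR sedol <> '' OR tradingitemid <> '')
    )
    """
)


def insert_security(symbol="", sedol="", tradingitemid=""):
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "INSERT INTO security (symbol, sedol, tradingitemid) VALUES (?, ?, ?)",
            (symbol, sedol, tradingitemid),
        )


insert_security(symbol="AAPL")  # one identifier present: accepted

try:
    insert_security()  # no identifier at all: the database refuses the row
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM security").fetchone()[0]
```

In a Django project the same constraint would be created through a migration, with the Python-side check living in clean() as described above.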

Is there a way to check if a string is a valid filter for a django queryset?

I'm trying to add some functionality to give a user the ability to filter a paginated queryset in Django via URL get parameters, and have got this successfully working:
for f in self.request.GET.getlist('f'):
    try:
        k, v = f.split(':', 1)
        queryset = queryset.filter(**{k: v})
    except:
        pass
However, I am hoping to do so in a way that doesn't use try/except blocks. Is there a standard way in django to check if a string is a valid filter parameter?
For example something like:
my_str = "bad_string_not_in_database"
if some_queryset.is_valid_filter_string(my_str):
    some_queryset.filter(**{my_str: 100})

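You can start by looking at the field names:
qs.model._meta.get_all_field_names()
(get_all_field_names() was removed in Django 1.10; on newer versions use [f.name for f in qs.model._meta.get_fields()] instead.)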
You are also probably going to want to work with the extensions such as field__icontains, field__gte etc. So more work will be required.
Disclaimer: try/except is the far superior way. I don't know why you want to dismiss this method.

The short answer is no, but there are other options.
Django does not provide, nor make it easy to create, the kind of validation function you're asking about. There are not just fields and forward relationships that you can filter on, but also reverse relationships, for which a related_name or a related_query_name on a field in a completely different model might be the valid way to filter your querysets. And there are various filtering mechanisms, like iexact, startswith, regex, etc., that are valid as postfixes to those relationship names. So, to validate everything correctly, you would need to replicate a lot of Django's internal parsing code and that would be a big mistake.
If you just want to filter by this model's fields and forward relationships, you can use hasattr(SomeModel, my_str), but that won't always work correctly (your model has other attributes besides fields, such as methods and properties).
Instead of doing a blanket except: pass, you can at least catch the specific error that will be thrown when an invalid string is used in the filtering kwargs (it's TypeError or, for a field name that cannot be resolved, django.core.exceptions.FieldError). You can also return a 400 error to let the client know that their request was not valid, instead of silently continuing with the un-filtered queryset.
My preferred solution would be to outsource this kind of boilerplate, generalizable logic to a library, such as dynamic-rest.
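If a full library is more than you need, a narrow whitelist is one way to avoid both the bare except and a reimplementation of Django's lookup parsing. The sketch below is plain Python and only an illustration: ALLOWED_FIELDS, ALLOWED_LOOKUPS and parse_filters are hypothetical names, and it deliberately covers only a model's own fields, not forward or reverse relationships.

```python
# Hypothetical whitelist; in a real project these would come from your model.
ALLOWED_FIELDS = {"name", "price"}
ALLOWED_LOOKUPS = {"exact", "icontains", "gte", "lte"}


def parse_filters(raw_filters):
    """Split 'key:value' strings into (accepted kwargs, rejected strings)."""
    accepted, rejected = {}, []
    for raw in raw_filters:
        key, sep, value = raw.partition(":")
        # "price__gte" -> field "price", lookup "gte"; no "__" means no lookup.
        field, _, lookup = key.partition("__")
        lookup_ok = not lookup or lookup in ALLOWED_LOOKUPS
        if sep and field in ALLOWED_FIELDS and lookup_ok:
            accepted[key] = value
        else:
            rejected.append(raw)
    return accepted, rejected


accepted, rejected = parse_filters(
    ["name__icontains:tea", "price__gte:10", "bad_string_not_in_database:100", "novalue"]
)
```

The accepted dict can then be passed on as queryset.filter(**accepted), and a non-empty rejected list is a natural trigger for the 400 response mentioned above.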

Creating a dynamic ORM resource in tastypie

I'm looking to create a dynamic resource in tastypie. Basically the idea is that there are a lot of models to hook up, so this may help save time with the standard no-frills resources.
I have most of this working, however I'm having trouble with the related fields being populated. I'm overriding the constructor for a class that inherits from ModelResource, and in this constructor I'm attempting to set the tastypie relationships. However when I review my resource the data is not being populated.
setattr(self, field, fields.ForeignKey(class_thing, attribute=field, full=True))
Basically I'm using setattr in the constructor to try and hook up what the relationship should be. If I'm goofing off with the instance I can see this object is getting created but the resource output is not changing. Is anyone familiar enough with tastypie/doing something like this to give me a clue?
Thanks for your time.
Edit: Never mind, I just overrode dehydrate and did this from there.

Rather than go through the constructor (which is messy, since tastypie/django does stuff there anyway), I did this through a dehydrate override, which is kind of designed to do this.
The bundle.obj has all the associated data there, so basically I just serialized the related objects and added them to the bundle.data dictionary before returning the bundle. Seemed cleaner and worked like a charm.
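The approach described above might be sketched as follows. This is not the poster's actual code: RelatedDataMixin and its related_names attribute are hypothetical, and the dict-based serialization is a stand-in for whatever the real resource used. The only tastypie behaviour assumed is that dehydrate(bundle) receives a bundle with .obj and .data and must return it.

```python
class RelatedDataMixin:
    # Hypothetical attribute: names of related attributes to copy from bundle.obj.
    related_names = ()

    def dehydrate(self, bundle):
        for name in self.related_names:
            related = getattr(bundle.obj, name, None)
            if related is None:
                bundle.data[name] = None
                continue
            # Stand-in serializer: public instance attributes only, so private
            # bookkeeping attributes (leading underscore) are left out.
            bundle.data[name] = {
                key: value
                for key, value in vars(related).items()
                if not key.startswith("_")
            }
        # The dehydrate hook is expected to hand the bundle back.
        return bundle
```

A resource would list the mixin before ModelResource in its bases and set related_names to the relationships it wants expanded.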

data validation for SQLAlchemy declarative models

I'm using CherryPy, Mako templates, and SQLAlchemy in a web app. I'm coming from a Ruby on Rails background and I'm trying to set up some data validation for my models. I can't figure out the best way to ensure, say, a 'name' field has a value when some other field has a value. I tried using SAValidation but it allowed me to create new rows where a required column was blank, even when I used validates_presence_of on the column. I've been looking at WTForms but that seems to involve a lot of duplicated code--I already have my model class set up with the columns in the table, why do I need to repeat all those columns again just to say "hey this one needs a value"? I'm coming from the "skinny controller, fat model" mindset and have been looking for Rails-like methods in my model like validates_presence_of or validates_length_of. How should I go about validating the data my model receives, and ensuring Session.add/Session.merge fail when the validations fail?

Take a look at the documentation for adding validation methods. You could just add an "update" method that takes the POST dict, makes sure that required keys are present, and uses the decorated validators to set the values (raising an error if anything is awry).

I wrote SAValidation for the specific purpose of avoiding code duplication when it comes to validating model data. It works well for us, at least for our use cases.
In our tests, we have examples of the model's setup and tests to show the validation works.

API Logic Server provides business rules for SQLAlchemy models. This includes multi-field and multi-table validations. It's open source.

I ended up using WTForms after all.
