Django's Model save flow - python

I noticed that there's no guarantee that the database is updated synchronously after calling save() on a model.
I ran a simple test by making an AJAX call to the following view:
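from django.http import HttpResponse

def save(request, id):
    # objects.get() returns exactly one row; the manager has no find() method
    product = ProductModel.objects.get(id=id)
    product.name = 'New Product Name'
    product.save()
    return HttpResponse('success')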
On the client side I wait for the response from the above view and then call a findAll method that retrieves the list of products. The returned list still contains the old name for the updated product.
However, if I delay the request for the list of products, it contains the new value, as it should.
This suggests that return HttpResponse('success') is executed before the new values are written to the database.
If that is true, is there a way to return the HTTP response only after the database has been updated?

You should have mentioned App Engine more prominently. I've added it to the tags.
This is very definitely because of your lack of understanding of GAE, rather than anything to do with Django. You should read the GAE documentation on eventual consistency in the datastore, and structure your models and queries appropriately.
Normal Django, running with a standard relational database, would not have this issue.

The view cannot return anything before the .save() call has finished its flow.
As for the flow itself, the Django docs describe it quite explicitly:
When you save an object, Django performs the following steps:
1) Emit a pre-save signal. The signal django.db.models.signals.pre_save is sent, allowing any functions listening for that signal to take some customized action.
2) Pre-process the data. Each field on the object is asked to perform any automated data modification that the field may need to perform.
Most fields do no pre-processing — the field data is kept as-is. Pre-processing is only used on fields that have special behavior. For example, if your model has a DateField with auto_now=True, the pre-save phase will alter the data in the object to ensure that the date field contains the current date stamp. (Our documentation doesn’t yet include a list of all the fields with this “special behavior.”)
3) Prepare the data for the database. Each field is asked to provide its current value in a data type that can be written to the database.
Most fields require no data preparation. Simple data types, such as integers and strings, are ‘ready to write’ as a Python object. However, more complex data types often require some modification.
For example, DateField fields use a Python datetime object to store data. Databases don’t store datetime objects, so the field value must be converted into an ISO-compliant date string for insertion into the database.
4) Insert the data into the database. The pre-processed, prepared data is then composed into an SQL statement for insertion into the database.
5) Emit a post-save signal. The signal django.db.models.signals.post_save is sent, allowing any functions listening for that signal to take some customized action.
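The five steps above can be mimicked with a small toy model. This is only a sketch under stated assumptions, not Django's implementation: ToySignal, ToyProduct and the product table are hypothetical names, and the standard-library sqlite3 module stands in for the ORM backend. It shows the ordering that matters for the question: save() returns only after the write has happened, and the post-save callback fires only after that.

```python
import datetime
import sqlite3


class ToySignal:
    """Hypothetical stand-in for a Django signal: a list of callbacks."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send(self, instance):
        for receiver in self.receivers:
            receiver(instance)


pre_save = ToySignal()
post_save = ToySignal()


class ToyProduct:
    """Hypothetical model with a name and an automatically updated date."""

    def __init__(self, name):
        self.name = name
        self.updated = None  # plays the role of DateField(auto_now=True)

    def save(self, conn):
        pre_save.send(self)                          # 1) pre-save signal
        self.updated = datetime.date.today()         # 2) pre-process (auto_now)
        row = (self.name, self.updated.isoformat())  # 3) prepare: date -> ISO string
        with conn:                                   # 4) write and commit
            conn.execute(
                "INSERT INTO product (name, updated) VALUES (?, ?)", row
            )
        post_save.send(self)                         # 5) post-save signal


events = []
pre_save.connect(lambda obj: events.append("pre_save"))
post_save.connect(lambda obj: events.append("post_save"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, updated TEXT)")
ToyProduct("New Product Name").save(conn)

# A read issued right after save() already sees the new row.
rows = conn.execute("SELECT name FROM product").fetchall()
print(events, rows)
```

With a regular relational backend the read after save() sees the new value, which is consistent with the other answer's point that the stale list comes from the datastore rather than from the save flow.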
Note that the behaviour you're seeing is possible if you've applied the @transaction.commit_on_success decorator to your view, though I don't see it in your code.
More on transactions: https://docs.djangoproject.com/en/1.5/topics/db/transactions/

Related

How do you avoid SQL Injection attacks in your Django Rest APIs if using native ORM?

They say that by using the Django ORM you are already protected against most SQL injection attacks. However, I wanted to know whether there are any additional measures that should or can be used to process user input. Any libraries like bleach?
The main danger of using the Django ORM is that you might give users a powerful tool to select, filter and aggregate over arbitrary fields.
Indeed, say for example that you make a form that enables users to select the fields to return, then you can implement this as:
data = MyModel.objects.values(*request.GET.getlist('fields'))
If MyModel has a ForeignKey named owner to the user model, a user could forge a request with owner__password as a field and thus retrieve the (hashed) passwords. Django's default User model stores only a hash of the password, but the hash is still exposed, which can make it easier to recover the passwords.
Even if there is no user model, users can forge requests that follow relations to sensitive data and thus retrieve a large amount of it. The same can happen with arbitrary filtering, annotating, aggregating, etc.
What you should do is keep a list of acceptable values and check that the request contains only these values, for example:
acceptable = {'title', 'description', 'created_at'}
data = [field for field in request.GET.getlist('fields') if field in acceptable]
data = MyModel.objects.values(*data)
If you use a package like django-filter, you list the fields that can be filtered and which lookups are allowed for each of them. Other data in request.GET is ignored, which prevents filtering on arbitrary fields.
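The "list the fields and the lookups allowed for each" idea can also be written by hand. The following is a minimal sketch, not django-filter's implementation: ALLOWED_LOOKUPS and safe_filter_kwargs are hypothetical names, and params is assumed to be a plain dict of query-string keys and values. Anything that is not explicitly allowed, including keys that traverse relations such as owner__password, is dropped.

```python
# Hypothetical allow-list: field name -> lookups a client may use on it.
ALLOWED_LOOKUPS = {
    "title": {"exact", "icontains"},
    "created_at": {"gte", "lte"},
}


def safe_filter_kwargs(params, allowed=ALLOWED_LOOKUPS):
    """Keep only the query parameters whose field and lookup are allowed."""
    kwargs = {}
    for key, value in params.items():
        # "title__icontains" -> ("title", "icontains"); "title" -> ("title", "")
        field, _, lookup = key.partition("__")
        lookup = lookup or "exact"
        if lookup in allowed.get(field, ()):
            kwargs[key] = value
    return kwargs


requested = {
    "title__icontains": "django",
    "owner__password": "x",      # relation traversal, not allowed
    "created_at__year": "2020",  # field allowed, lookup not allowed
}
cleaned = safe_filter_kwargs(requested)
print(cleaned)
```

The resulting dict would then be passed on as MyModel.objects.filter(**cleaned). Raising an error for rejected keys instead of silently dropping them is an equally reasonable choice.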

How to get model's objects one by one in Django

I need to write a function that will run in Celery, take records from a model one at a time, check something, and write data to another model with a one-to-one relationship. There are a lot of entries, so using model_name.objects.all() is not appropriate (it would take a lot of memory and time). How do I do this correctly?
You can use an iterator over the queryset (https://docs.djangoproject.com/en/dev/ref/models/querysets/#iterator) so your records are fetched one by one:
model_iterator = your_model.objects.all().iterator()
for record in model_iterator:
    do_something(record)
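The difference between loading everything and iterating can be illustrated without Django. The sketch below is an analogy built on the standard-library sqlite3 module, not on the ORM: iter_rows and the record table are hypothetical names. It pulls rows from the cursor in small chunks with fetchmany and yields them one at a time, so only one chunk is held in memory at once.

```python
import sqlite3


def iter_rows(cursor, chunk_size=100):
    """Yield rows one at a time, fetching at most chunk_size rows per call."""
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        for row in chunk:
            yield row


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO record (value) VALUES (?)", [(n,) for n in range(250)]
)

cursor = conn.execute("SELECT id, value FROM record ORDER BY id")
count = 0
total = 0
for record_id, value in iter_rows(cursor, chunk_size=100):
    total += value  # stand-in for do_something(record)
    count += 1
print(count, total)
```

In recent Django versions, iterator() also accepts a chunk_size argument that controls how many rows are fetched per batch.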

Django Fallback to model lookup from external API

I'm using Django REST framework to serve up JSON content for a website front end. On the back end, I have two Django models, Player and Match, that each reference multiple of the other. A Match contains multiple Players, and a Player contains multiple Matches. This data is originally retrieved from a third-party API.
Matches and Players must be fetched separately from the API, and can only be fetched one at a time. When an object is fetched, its data is converted from the external JSON format into my Django model. At this point, the Match/Player will live forever in Django. The hard part is that I want this external fetching to be seamless. If I query for a player or match and it's in the DB, then just serve what we have there. Otherwise, I want to fetch that object from the external DB.
My question is, does Django provide any convenient way of handling this? Ideally, any query along the lines of Match.objects.get(id=...) will handle this API fallback transparently (I don't mind the fact that this query may take significantly longer in some cases).
Whether a way is "convenient" depends on your expectations ...
You could create a custom QuerySet where you override the get() method to include your fetch-from-API logic. Then you create a custom manager based on that QuerySet, like the docs show here.
Finally add that custom manager to your model.
See also this question from 2011.
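The shape of that override can be sketched in plain Python. This is an illustration of the pattern only, under stated assumptions: LocalStore, FallbackStore, DoesNotExist and fake_api are hypothetical stand-ins for the model's default lookup, the custom QuerySet, the model's DoesNotExist exception and the third-party API.

```python
class DoesNotExist(Exception):
    """Raised when a record is not available locally."""


class LocalStore:
    """Stand-in for the default lookup against the local database."""

    def __init__(self):
        self._rows = {}

    def get(self, id):
        try:
            return self._rows[id]
        except KeyError:
            raise DoesNotExist(id) from None

    def save(self, id, obj):
        self._rows[id] = obj


class FallbackStore(LocalStore):
    """get() tries the local data first and only then asks the remote API."""

    def __init__(self, fetch_remote):
        super().__init__()
        self._fetch_remote = fetch_remote

    def get(self, id):
        try:
            return super().get(id)
        except DoesNotExist:
            obj = self._fetch_remote(id)  # slow path: external call
            self.save(id, obj)            # persist so later lookups stay local
            return obj


api_calls = []


def fake_api(id):
    """Hypothetical remote fetch; records each call for demonstration."""
    api_calls.append(id)
    return {"id": id, "name": f"player-{id}"}


players = FallbackStore(fake_api)
first = players.get(7)   # not stored yet: fetched from the fake API
second = players.get(7)  # served from the local store
print(first, api_calls)
```

In a real project the same logic would presumably live in a models.QuerySet subclass whose get() catches self.model.DoesNotExist, attached to the model through as_manager().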

Batch Editing in Flask-Admin

I'm using Flask-Admin and I want to be able to update many fields at once from the list view. It seemed like what I'm looking for is a custom action.
I was able to make it work, but I suspect not in the best way. I'm wondering if it could be done more "Flask"-ily.
What I do now, for example if I was updating all rows in table cars to have tires = 4:
A custom action in the CarView class collects the ids of the rows to be modified, a callback URL from request.referrer, and the table name cars, and returns render_template('mass_update_info.html') with these as parameters.
mass_update_info.html is an HTML form where the user specifies 1) the field they would like to change and 2) the value to change it to. On submit, the form makes a POST to a certain view (do_mass_update) with this data (everything else is passed as hidden fields in the form).
do_mass_update uses the data sent to it to construct a SQL query string -- in its entirety, "UPDATE {} SET {}='{}' WHERE id IN ({})".format(table, column, value, ids) -- which is run via db.engine.execute().
The user is redirected to the callback url.
It bothers me that I don't seem to be using any of SQLAlchemy, but (from a newbie's perspective) it all seems to be based on the model objects, e.g. User.query(...), while I only have access to the model/table name as a string. Can I get some kind of identifier from the model, pass that through, and do a lookup to retrieve the model on the other side?
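Independently of the model-lookup question, the string-formatted UPDATE described above accepts whatever the form posts as table, column and value. Below is a minimal sketch of a more defensive version of that one step, under stated assumptions: ALLOWED_COLUMNS and mass_update are hypothetical names, and the standard-library sqlite3 module stands in for db.engine. Identifiers are checked against an allow-list, and the value and ids are sent as bound parameters instead of being formatted into the SQL.

```python
import sqlite3

# Hypothetical allow-list: table name -> columns that may be mass-updated.
ALLOWED_COLUMNS = {"cars": {"tires", "color"}}


def mass_update(conn, table, column, value, ids):
    """Set column = value for the given ids; return the number of rows changed."""
    if column not in ALLOWED_COLUMNS.get(table, ()):
        raise ValueError(f"updating {table}.{column} is not allowed")
    ids = list(ids)
    if not ids:
        return 0
    # Identifiers were validated above; the value and ids are bound parameters.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"UPDATE {table} SET {column} = ? WHERE id IN ({placeholders})"
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(sql, [value, *ids])
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cars (id INTEGER PRIMARY KEY, tires INTEGER, color TEXT)"
)
conn.executemany(
    "INSERT INTO cars (id, tires, color) VALUES (?, ?, ?)",
    [(1, 3, "red"), (2, 3, "blue"), (3, 3, "green")],
)

updated = mass_update(conn, "cars", "tires", 4, [1, 3])
tires = conn.execute("SELECT id, tires FROM cars ORDER BY id").fetchall()
print(updated, tires)
```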

When are property validations run in Google App Engine (GAE)?

So I was reading the following documentation on defining your own property types in GAE. I noticed that I could also include a .validate() method when extending a new Property. This validate method will be called "when an assignment is made to a property to make sure that it is compatible with your assigned attributes". Fair enough, but when exactly is that?
My question is, when exactly is this validate method called? Specifically, is it called before or after it is put? If I create this entity in a transaction, is validate called within the transaction or before the transaction?
I am aware that, optimally, every Property should be "self-contained", or at most it should only deal with the state of the entity it resides in. But what would happen if you performed a Query in the validate method? Would it blow up if you did a Query within validate that was in a different entity group than your current transaction's entity group?
Before put, and during the transaction, respectively (it may abort the transaction if validation fails of course). "When an assignment is made" to a property of your entity is when you write theentity.theproperty = somevalue (or when you perform it implicitly).
I believe that queries of unrelated entities during a transaction (in validate or otherwise) are non-transactional (and thus very iffy practice), but not forbidden -- but on this last point I'm not sure.
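The "validated on assignment, before put" timing can be illustrated with a plain Python descriptor. This is a toy sketch, not GAE code: ToyProperty, PositiveInt and ToyEntity are hypothetical names, and put() only counts calls instead of writing to a datastore.

```python
class ToyProperty:
    """Hypothetical property type: validate() runs on every assignment."""

    def __set_name__(self, owner, name):
        self._attr = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self._attr, None)

    def __set__(self, obj, value):
        # Validation happens here, at assignment time, long before put().
        setattr(obj, self._attr, self.validate(value))

    def validate(self, value):
        return value


class PositiveInt(ToyProperty):
    def validate(self, value):
        if not isinstance(value, int) or value <= 0:
            raise ValueError("expected a positive integer")
        return value


class ToyEntity:
    tires = PositiveInt()

    def __init__(self):
        self.put_calls = 0

    def put(self):
        self.put_calls += 1  # stand-in for the datastore write


car = ToyEntity()
car.tires = 4          # validate() runs and accepts the value
try:
    car.tires = -1     # validate() runs and rejects it; put() is never reached
    rejected = False
except ValueError:
    rejected = True
print(car.tires, rejected, car.put_calls)
```

If the assignment happens inside a transaction function, the exception is raised inside that function, which matches the note above that a failed validation may abort the transaction.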
