About converting custom functions in models into annotations in custom managers/querysets - python

Being new to Django, I'm starting to care a bit about performance of my web application.
I'm trying to transform many of the custom functions / properties that were originally in my models into annotations on querysets within custom managers.
In my model I have:
from django.conf import settings
from django.db import models
from django.utils.html import format_html

class Shape(models.Model):
    @property
    def nb_color(self):
        return 1 if self.colors == '' else int(1 + sum(self.colors.upper().count(x) for x in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'))

    def __str__(self):
        return self.name + "-" + self.constraints

    @property
    def image_url(self):
        return format_html(f'{settings.SVG_DIR}/{self}.svg')

    @property
    def image_src(self):
        return format_html('<img src="{url}"|urlencode />'.format(url=self.image_url))

    def image_display(self):
        return format_html(f'{self.image_src}"')
But I'm not clear on a few points:
1/ Are there any pros or cons to declaring with the @property decorator in a Django model?
2/ What is the cost of calling a function/property in terms of database calls,
and therefore, is there added value in using custom managers / querysets and defining annotations to replicate my functions at that level?
3/ How would you suggest I transform my image & nb_color functions into annotations?
Thanks in advance
PS: For the image related functions, I mostly figured it out:
self.annotate(
    image_url=Concat(Value(join(settings.SVG_DIR, '')), F('fullname'), Value('.svg'), output_field=CharField()),
    image_src=Concat(Value('<img src="'), F('image_url'), Value('"|urlencode />'), output_field=CharField()),
    image_display=Concat(Value(''), F('image_src'), Value(''), output_field=CharField()),
)
I am however having an issue with the display of image_src through:

readonly_fields = ['image']

def image(self, obj):
    return format_html(obj.image_src)

It doesn't seem to find the image, while the address is OK.
If anybody has an idea...

I figured it out for my image problem: I should simply use a relative path and let Django manage it:
self.annotate(image_url = Concat(Value('/static/SVG_shapes/'), F('fullname'), Value('.svg'), output_field=CharField()),)

With now 1.5 years more experience, I'll try to answer my newbie questions for the next people who may have the same questions popping into their minds.
1/ Are there any pros or cons to declaring with the @property decorator in a Django model?
No cons that I could see so far.
It allows the data to be retrieved as a property of the model (my_shape.image_url), instead of having to call the corresponding method (my_shape.image_url()).
However, for different purposes, one may prefer to have a callable (the method) instead of a property.
2/ What is the cost of calling a function/property in terms of database calls?
No extra call to the database if the data it needs as input is already available, or is itself an attribute of the instance object (fields / properties / methods that don't require input from outside the instance object).
However, if external data are needed, a database call will be generated for each of them.
For this reason, it can be valuable to cache the result of such a property by using the @cached_property decorator instead of the @property decorator.
The only thing needed to use cached properties is the following import:
from django.utils.functional import cached_property
After being called for the first time, the cached property will remain available at no extra cost during all the lifetime of the object instance,
and its content can be manipulated like any other property / variable:
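A minimal sketch (not from the original answer) of a cached property on the Shape model from the question, assuming the colors field used above:

from django.db import models
from django.utils.functional import cached_property

class Shape(models.Model):
    colors = models.CharField(max_length=100, blank=True)  # assumed field definition

    @cached_property
    def nb_color(self):
        # Computed on first access only, then stored on the instance.
        return 1 if self.colors == '' else 1 + sum(
            self.colors.upper().count(x) for x in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')

# shape.nb_color      -> computed on first access
# shape.nb_color      -> served from the per-instance cache
# del shape.nb_color  -> clears the cache; the next access recomputes it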
And therefore, is there added value in using custom managers / querysets and defining annotations to replicate my functions at that level?
In my understanding and practice so far, it is not uncommon to replicate the same functionality in both properties & managers.
The reason is that properties are easily available when we are interested in only one specific object instance,
while when you are interested in comparing / retrieving a given property for a range of objects, it is much more efficient to compute & annotate this property for the whole queryset, for instance by using model managers.
My take-away would be:
For a given model,
(1) try to put all the business logic concerning a single object instance into model methods / properties
(2) and all the business logic concerning a range of objects into model managers
3/ How would you suggest I transform my image & nb_color functions into annotations?
Already answered in the previous answer for the image functions.
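For the nb_color part, here is one possible sketch (not from the original post). Django has no built-in regexp-replace function, so this assumes PostgreSQL and wraps its regexp_replace() in a small custom Func; the ShapeQuerySet and with_nb_color names are illustrative. The idea is to strip everything that is not a letter and count what remains, mirroring the Python property:

from django.db import models
from django.db.models import Case, Func, IntegerField, Value, When
from django.db.models.functions import Length, Upper

class RegexpReplace(Func):
    # Custom wrapper around PostgreSQL's regexp_replace(); not built into Django.
    function = 'REGEXP_REPLACE'

class ShapeQuerySet(models.QuerySet):
    def with_nb_color(self):
        # Same logic as the nb_color property: 1 for an empty string,
        # otherwise 1 + the number of letters in `colors`.
        return self.annotate(
            nb_color=Case(
                When(colors='', then=Value(1)),
                default=Length(
                    RegexpReplace(Upper('colors'), Value('[^A-Z]'), Value(''), Value('g'))
                ) + Value(1),
                output_field=IntegerField(),
            )
        )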


"Updater" design pattern, as opposed to "Builder"

This is actually language agnostic, but I always prefer Python.
The builder design pattern is used to validate that a configuration is valid prior to creating an object, via delegation of the creation process.
Some code to clarify:
class A():
    def __init__(self, m1, m2):  # obviously more complex in life
        self._m1 = m1
        self._m2 = m2

class ABuilder():
    def __init__(self):
        self._m1 = None
        self._m2 = None

    def set_m1(self, m1):
        self._m1 = m1
        return self

    def set_m2(self, m2):
        self._m2 = m2
        return self

    def _validate(self):
        # complicated validations
        assert self._m1 < 1000
        assert self._m1 < self._m2

    def build(self):
        self._validate()
        return A(self._m1, self._m2)
My problem is similar, with an extra constraint that I can't re-create the object each time due to performance limitations.
Instead, I want to only update an existing object.
Bad solutions I came up with:
I could do as suggested here and just use setters like so
class A():
    ...
    def set_m1(self, m1):
        self._m1 = m1
    # and so on
But this is bad because using setters:
(1) beats the purpose of encapsulation;
(2) beats the purpose of the builder (now updater), which is supposed to validate that some complex configuration is preserved after the creation, or update in this case.
As I mentioned earlier, I can't recreate the object every time, as this is expensive and I only want to update some fields, or sub-fields, and still validate or sub-validate.
I could add update and validation methods to A and call those, but this beats the purpose of delegating the responsibility of updates, and is intractable in the number of fields.
class A():
    ...
    def update1(self, m1):
        pass  # complex_logic1

    def update2(self, m2):
        pass  # complex_logic2

    def update12(self, m1, m2):
        pass  # complex_logic12
I could just force to update every single field in A in a method with optional parameters
class A():
    ...
    def update("""list of all fields of A"""):
        pass
Which again is not tractable, as this method will soon become a god method due to the many combinations possible.
Forcing the method to always accept changes in A, and validating in the Updater, also can't work, as the Updater would need to look at A's internal state to make a decision, causing a circular dependency.
How can I delegate updating fields in my object A in a way that:
(1) doesn't break encapsulation of A,
(2) actually delegates the responsibility of updating to another object, and
(3) is tractable as A becomes more complicated?
I feel like I am missing something trivial to extend building to updating.
I am not sure I understand all of your concerns, but I want to try and answer your post. From what you have written I assume:
Validation is complex and multiple properties of an object must be checked to decide if any change to the object is valid.
The object must always be in a valid state. Changes that make the object invalid are not permitted.
It is too expensive to copy the object, make the change, validate the object, and then reject the change if the validation fails.
Move the validation logic out of the builder and into a separate class, like a ModelValidator with a validateModel(model) method.
The first option is to use a command pattern.
Create an abstract class or interface named Update (I don't think Python has abstract classes/interfaces, but that's fine). The Update interface implements two methods, execute() and undo().
Concrete classes have names like UpdateAddress, UpdatePortfolio, or UpdatePaymentInfo.
Each concrete Update object also holds a reference to your model object.
The concrete classes hold the state needed for a particular kind of update. Imagine these methods exist on the UpdateAddress class:
UpdateAddress
setStreetNumber(...)
setCity(...)
setPostcode(...)
setCountry(...)
The update object needs to hold both the current and new values of a property. Like:
setStreetNumber(aString):
    self.oldStreetNumber = model.getStreetNumber
    self.newStreetNumber = aString
When the execute method is called, the model is updated:
execute:
    model.setStreetNumber(newStreetNumber)
    model.setCity(newCity)
    # Set postcode and country
    if not ModelValidator.isValid(model):
        self.undo()
        raise ValidationError
and the undo method looks like:
undo:
    model.setStreetNumber(oldStreetNumber)
    model.setCity(oldCity)
    # Set postcode and country
That is a lot of typing, but it would work. Mutating your model object is nicely encapsulated by different kinds of updates. You can execute or undo the changes by calling those methods on the update object. You can even store a list of update objects for multi-level undos and re-tries.
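To make the command-style updater concrete, here is a minimal runnable sketch of the idea described above; the Address fields, the validator, and all names are illustrative rather than taken from the original answer:

class ValidationError(Exception):
    pass

class Address:
    def __init__(self, street_number, city):
        self.street_number = street_number
        self.city = city

class AddressValidator:
    @staticmethod
    def is_valid(address):
        # Stand-in for the complex validation rules.
        return address.street_number > 0 and bool(address.city)

class UpdateAddress:
    def __init__(self, address):
        self._address = address
        self._old = {}   # previous values, kept for undo
        self._new = {}   # pending values, applied on execute

    def set_street_number(self, value):
        self._old['street_number'] = self._address.street_number
        self._new['street_number'] = value
        return self

    def set_city(self, value):
        self._old['city'] = self._address.city
        self._new['city'] = value
        return self

    def execute(self):
        for field, value in self._new.items():
            setattr(self._address, field, value)
        if not AddressValidator.is_valid(self._address):
            self.undo()
            raise ValidationError("update would leave the address invalid")

    def undo(self):
        for field, value in self._old.items():
            setattr(self._address, field, value)

address = Address(12, "Lyon")
UpdateAddress(address).set_street_number(14).set_city("Paris").execute()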
However, it is a lot of typing for the programmer. Consider using persistent data structures. Persistent data structures can be used to copy objects very quickly -- approximately constant time complexity. Here is a python library of persistent data structures.
Let's assume your data was in a persistent data structure version of a dict. The library I referenced calls it a PMap.
The implementation of the update classes can be simpler. Starting with the constructor:
UpdateAddress(pmap):
    self.oldPmap = pmap
    self.newPmap = pmap
The setters are easier:
setStreetNumber(aString):
    self.newPmap = self.newPmap.set('streetNumber', aString)
Execute passes back a new instance of the model, with all the updates.
execute:
    if ModelValidator.isValid(newModel):
        return newModel
    else:
        raise ValidationError
The original object has not changed at all, thanks to the magic of persistent data structures.
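As a sketch of how the persistent-structure variant might look, assuming the library meant is pyrsistent (its immutable mapping type is indeed called PMap); the validation function here is a hypothetical stand-in:

from pyrsistent import pmap

def is_valid(address_pmap):
    # Hypothetical validation rule.
    return address_pmap['street_number'] > 0

class UpdateAddress:
    def __init__(self, address_pmap):
        self._old = address_pmap   # kept around, never mutated
        self._new = address_pmap

    def set_street_number(self, value):
        # .set() returns a new PMap; the original is untouched.
        self._new = self._new.set('street_number', value)
        return self

    def execute(self):
        if not is_valid(self._new):
            raise ValueError("invalid update")
        return self._new

address = pmap({'street_number': 12, 'city': 'Lyon'})
new_address = UpdateAddress(address).set_street_number(14).execute()
# `address` still holds street_number 12; `new_address` holds 14.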
The best thing is to not do any of this. Instead, use an ORM or object database. That is the "enterprise grade" solution. These libraries give you sophisticated tools like transactions and object version history.

Django save object in database

First of all I should mention that I'm new to Python as well as Django.
I have a Django project with an sqlite3 database.
I have created a couple of classes for calculating some physical problems, and I want to access the result of these through my website using forms. My previous approach to this was to change my classes to functions, and create several model fields for my objects.
Then I reached a point where the output of my function was a 2D-array, which I have issues saving in my model. So I decided it would be easier just to store the entire object in the database - then I wouldn't have to change my classes to functions in order to utilize it in Django.
I'm getting the inputs for my calculations, a and b below, from a form.
Is it possible to store objects in my model?
I have made a hypothetical example; not sure if the code works, but I hope you get the point.
class Butterfly_model(models.Model):
    a = models.FloatField(verbose_name='a')
    b = models.FloatField(verbose_name='b')
    results = >>>Some_Object_Field<<<
My computation script could contain something like this:
class Compute:
    def __init__(self, a, b):
        self.X = []
        self.Y = []
        for i in range(0, a):
            self.X.append(i)
            self.Y.append(i + b)
And my views.py contains the following:
if form_input.is_valid():
    instance = form_input.save(commit=False)
    instance.results = Compute(instance.a, instance.b)
    instance.save()
If this is not possible, do any of you have a suggestion for how to handle calculations and resulting data like this?
Best regards,
Joachim
What you are probably looking for are model properties. Anticipating your needs, I think this is what you want: you can add properties to your models that compute a value. For your case it would be something like:
class Butterfly_model(models.Model):
    a = models.FloatField(verbose_name='a')
    b = models.FloatField(verbose_name='b')

    @property
    def X(self):
        # int() because a is stored as a float
        return [i for i in range(int(self.a))]

    @property
    def Y(self):
        return [i + self.b for i in range(int(self.a))]
That way you can access X and Y for each model instance and do not need to save it in the database (if that is what you want?).
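A quick usage sketch (hypothetical values) to show that nothing extra is written to the database:

instance = Butterfly_model(a=3, b=1.5)
print(instance.X)  # [0, 1, 2]        -- computed on attribute access
print(instance.Y)  # [1.5, 2.5, 3.5]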

Use model method as default value for Django Model?

I actually have this method in my Model:
def speed_score_compute(self):
    # Speed score:
    #
    # - 8 point for every % of time spent in
    #   high intensity running phase.
    # - Average Speed in High Intensity
    #   running phase (30km/h = 50 points
    #   0-15km/h = 15 points)
    try:
        high_intensity_running_ratio = (
            ((self.h_i_run_time * 100) / self.training_length) * 8)
    except ZeroDivisionError:
        return 0
    high_intensity_running_ratio = min(50, high_intensity_running_ratio)
    if self.h_i_average_speed < 15:
        average_speed_score = 10
    else:
        average_speed_score = self.cross_multiplication(
            30, self.h_i_average_speed, 50)
    final_speed_score = high_intensity_running_ratio + average_speed_score
    return final_speed_score
I want to use it as default for my Model like this:
speed_score = models.IntegerField(default=speed_score_compute)
But this doesn't work (see the error message below). I've checked different topics like this one, but that works only for functions (not using the self attribute), not for methods (I must use a method since I'm working with the actual object's attributes).
The Django docs seem to talk about this, but I don't get it clearly, maybe because I'm still a newbie to Django and programming in general.
Is there a way to achieve this ?
EDIT:
My function is defined above my models. But here is my error message:
ValueError: Could not find function speed_score_compute in tournament.models.
Please note that due to Python 2 limitations, you cannot serialize unbound method functions (e.g. a method declared and used in the same class body). Please move the function into the main module body to use migrations.
For more information, see https://docs.djangoproject.com/en/1.8/topics/migrations/#serializing-values
The error message is clear: it seems I'm not able to do this. But is there another way to achieve it?
The problem
When we provide default=callable and that callable is a method of the model, it doesn't get called with the self argument, i.e. the model instance.
Overriding save()
I haven't found a better solution than to override MyModel.save()* like below:
class MyModel(models.Model):
    def save(self, *args, **kwargs):
        if self.speed_score is None:
            self.speed_score = ...
            # or
            self.calculate_speed_score()
        # Now we call the actual save method
        super(MyModel, self).save(*args, **kwargs)
This makes it so that if you try to save your model, without a set value for that field, it is populated before the save.
Personally I just prefer this approach of having everything that belongs to a model, defined in the model (data, methods, validation, default values etc). I think this approach is referred to as the fat Django model approach.
*If you find a better approach, I'd like to learn about it too!
Using a pre_save signal
Django provides a pre_save signal that runs before save() is run on the model. Signals run synchronously, i.e. the pre_save code needs to finish running before save() is called on the model. You'll get the same results (and order of execution) as overriding save().
from django.db.models.signals import pre_save
from django.dispatch import receiver
from myapp.models import MyModel

@receiver(pre_save, sender=MyModel)
def my_handler(sender, **kwargs):
    instance = kwargs['instance']
    instance.populate_default_values()
If you prefer to keep the default values behavior separated from the model, this approach is for you!
When is save() called? Can I work with the object before it gets saved?
Good question! Because we'd like the ability to work with our object before it gets saved and the default values are populated.
To get an object, without saving it, just to work with it we can do:
instance = MyModel()
If you create it using MyModel.objects.create(), then save() will be called. It is essentially (see source code) equivalent to:
instance = MyModel()
instance.save()
If it's interesting to you, you can also define a MyModel.populate_default_values() that you can call at any stage of the object's lifecycle (at creation, at save, or on demand, it's up to you), as sketched below.
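A minimal sketch of that last idea (the method name, the null=True field definition and the stub computation are assumptions, not part of the original answer):

from django.db import models

class MyModel(models.Model):
    speed_score = models.IntegerField(null=True, blank=True)

    def speed_score_compute(self):
        # Stub standing in for the computation method from the question.
        return 0

    def populate_default_values(self):
        if self.speed_score is None:
            self.speed_score = self.speed_score_compute()

    def save(self, *args, **kwargs):
        # Ensure defaults are in place before every save.
        self.populate_default_values()
        super().save(*args, **kwargs)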

Idiomatic python - property or method?

I have a django model class that maintains state as a simple property. I have added a couple of helper properties to the class to access aggregate states - e.g. is_live returns false if the state is any one of ['closed', 'expired', 'deleted'] etc.
As a result of this my model has a collection of is_ properties, that do very simple lookups on internal properties of the object.
I now want to add a new property, is_complete - which is semantically the same as all the other properties - a boolean check on the state of the object - however, this check involves loading up dependent (one-to-many) child objects, checking their state and reporting back based on the results - i.e. this property actually performs more than one database query and processes the results.
So, is it still valid to model this as a property (using the @property decorator), or should I instead forego the decorator and leave it as a method?
Pro of using a property is that it's semantically consistent with all the other is_ properties.
Pro of using a method is that it indicates to other developers that this is something that has a more complex implementation, and so should be used sparingly (i.e. not inside a for.. loop).
from django.db import models

class MyModel(models.Model):
    state = models.CharField(max_length=20, default='new')

    @property
    def is_open(self):
        # this is a simple lookup, so makes sense as a property
        return self.state in ['new', 'open', 'sent']

    def is_complete(self):
        # this is a complex database activity, but semantically correct
        related_objects = self.do_complicated_database_lookup()
        return len(related_objects) == 0
EDIT: I come from a .NET background originally, where the split is admirably defined by Jeff Atwood as
"if there's any chance at all that code could spawn an hourglass, it definitely should be a method."
EDIT 2: slight update to the question - would it be a problem to have it as a method, called is_complete, so that there are mixed properties and methods with similar names - or is that just confusing?
So - it would look something like this:
>>> m = MyModel()
>>> m.is_live
True
>>> m.is_complete()
False
It is okay to do that, especially if you will use the following pattern:
class SomeClass(models.Model):
    @property
    def is_complete(self):
        if not hasattr(self, '_is_complete'):
            related_objects = self.do_complicated_database_lookup()
            self._is_complete = len(related_objects) == 0
        return self._is_complete
Just remember that it "caches" the result, so the first execution does the calculation, but subsequent calls use the existing result.
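For what it's worth, django.utils.functional.cached_property gives the same cache-on-first-access behaviour without the manual hasattr bookkeeping; a sketch reusing the names from the question:

from django.db import models
from django.utils.functional import cached_property

class SomeClass(models.Model):
    @cached_property
    def is_complete(self):
        # Computed on first access, then stored on the instance.
        related_objects = self.do_complicated_database_lookup()
        return len(related_objects) == 0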

Meaning of the map function in couchdb-pythons ViewField

I'm using the couchdb.mapping in one of my projects. I have a class called SupportCase derived from Document that contains all the fields I want.
My database (called admin) contains multiple document types. I have a type field in all the documents which I use to distinguish between them. I have many documents of type "case" which I want to get at using a view. I have a design document called support with a view inside it called cases. If I request the results of this view using db.view("support/cases"), I get back a list of Rows which have what I want.
However, I want to somehow have this wrapped by the SupportCase class so that I can call a single function and get back a list of all the SupportCases in the system. I created a ViewField property
@ViewField.define('cases')
def all(self, doc):
    if doc.get("type", "") == "case":
        yield doc["_id"], doc
Now, if I call SupportCase.all(db), I get back all the cases.
What I don't understand is whether this view is precomputed and stored in the database or done on demand similar to db.query. If it's the latter, it's going to be slow and I want to use a precomputed view. How do I do that?
I think what you need is:
@classmethod
def all(cls):
    result = cls.view(db, "support/all", include_docs=True)
    return result.rows
The Document class has a classmethod view which wraps the rows in the class on which it is called. So the following returns you a ViewResult with rows of type SupportCase, and taking .rows of that gives a list of support cases.
SupportCase.view(db, viewname, include_docs=True)
And I don't think you need to get into the ViewField magic. But let me explain how it works. Consider the following example from the CouchDB-python documentation.
class Person(Document):
    @ViewField.define('people')
    def by_name(doc):
        yield doc['name'], doc
I think this is equivalent to:
class Person(Document):
    @classmethod
    def by_name(cls, db, **kw):
        return cls.view(db, **kw)

With the original function attached to Person.by_name.map_fun.
The map function is in some ways analogous to an index in a relational database. It is not done again every time, and when new documents are added the way it is updated does not require everything to be redone (it's a kind of tree structure).
This has a pretty good summary
ViewField uses a pre-defined view so, once built, will be fast. It definitely doesn't use a temporary view.
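One practical detail, as a sketch (the sync() call is couchdb-python's ViewDefinition.sync, which is my assumption here rather than something from the original answer): the map function still has to be written into its design document once, after which CouchDB maintains the index incrementally.

# One-time (or deploy-time) step: store/refresh the view in its design document.
SupportCase.all.sync(db)

# Afterwards, queries hit the stored, incrementally updated index, as in the question:
cases = SupportCase.all(db)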
