Meaning of the map function in couchdb-pythons ViewField - python

I'm using the couchdb.mapping in one of my projects. I have a class called SupportCase derived from Document that contains all the fields I want.
My database (called admin) contains multiple document types. I have a type field in all the documents which I use to distinguish between them. I have many documents of type "case" which I want to get at using a view. I have design document called support with a view inside it called cases. If I request the results of this view using db.view("support/cases), I get back a list of Rows which have what I want.
However, I want to somehow have this wrapped by the SupportCase class so that I can call a single function and get back a list of all the SupportCases in the system. I created a ViewField property
#ViewField.define('cases')
def all(self, doc):
if doc.get("type","") == "case":
yield doc["_id"], doc
Now, if I call SupportCase.all(db), I get back all the cases.
What I don't understand is whether this view is precomputed and stored in the database or done on demand similar to db.query. If it's the latter, it's going to be slow and I want to use a precomputed view. How do I do that?

I think what you need is:
#classmethod
def all(cls):
result = cls.view(db, "support/all", include_docs=True)
return result.rows
Document class has a classmethod view which wraps the rows by class on which it is called. So the following returns you a ViewResult with rows of type SupportCase and taking .rows of that gives a list of support cases.
SupportCase.view(db, viewname, include_docs=True)
And I don't think you need to get into the ViewField magic. But let me explain how it works. Consider the following example from the CouchDB-python documentation.
class Person(Document):
#ViewField.define('people')
def by_name(doc):
yield doc['name'], doc
I think this is equivalent to:
class Person(Document):
#classmethod
def by_name(cls, db, **kw):
return cls.view(db, **kw)
With the original function attached to People.by_name.map_fun.

The map function is in some ways analogous to an index in a relational database. It is not done again every time, and when new documents are added the way it is updated does not require everything to be redone (it's a kind of tree structure).
This has a pretty good summary

ViewField uses a pre-defined view so, once built, will be fast. It definitely doesn't use a temporary view.

Related

Python #property Mapper 'mapped class User->user' has no property 'member_groups

Context: For my work, I'm running a script populate.py where we populate a database. It fails at a certain advanced stage (let's just say step 9) as I'm trying to add a new many to many association table.
I made some changes to correspond to a similar entity that works. Specifically I added some #property and #x.setter methods because I thought that would solve it or at least be good practice. Now I get a failure at an earlier stage that was working before (let's say step 4).
I'm not looking for a solution to either of these issues but an understanding of how there could be such an error as logged with respect to how python is designed to work.
The error is:
`sqlalchemy.exc.InvalidRequestError: Mapper 'mapped class User->user' has no property 'member_groups'`
I did change a class attribute from
member_groups
to
_member_groups
But I also added the following code
#property
def member_groups(self) -> List[MemberGroup]:
return self._member_groups
#member_groups.setter
def member_groups(self, new_member_groups):
'''
Now, what do we do with you?
'''
self._member_groups = new_member_groups
I thought the whole point of the #property decorator in python was for this very use case- that I could call the class attribute _foobar as long as I had the decorator grabbing and setting it correctly..
#property
def member_groups(self):
return self._foobar
..and that it shouldn't make any difference what the class attribute is called as long as we give what we want the user to access it through the property descriptor. I thought that was the whole purpose or at least a major point of this pythonic api, specifically to create pseudo private variables and not have it make a difference or be breaking with existing code, but it shows that mapped class has no property.
I just want the theory, not a code solution per se, although I'll take ideas if you have them. I just want to understand Python better.

About converting custom functions in models into annotations in custom managers/querysets

Being new to Django, I'm starting to care a bit about performance of my web application.
I'm trying to transform many of my custom functions / properties which were originally in my models to querysets within custom managers.
in my model I have:
class Shape(models.Model):
#property
def nb_color(self):
return 1 if self.colors=='' else int(1+sum(self.colors.upper().count(x) for x in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'))
def __str__(self):
return self.name + "-" + self.constraints
#property
def image_url(self):
return format_html(f'{settings.SVG_DIR}/{self}.svg')
#property
def image_src(self):
return format_html('<img src="{url}"|urlencode />'.format(url = self.image_url))
def image_display(self):
return format_html(f'{self.image_src}"')
But I'm not clear on a few points:
1/ is there any pros or cons declaring with the propriety decorator in a django model?
2/ what is the cost of calling a function/property in term of database calls
and therefore, is there an added value to use custom managers / querysets and define annotations to simulate my functions at that level?
3/ how would you suggest me to transform my image & nb_color functions into annotations
Thanks in advance
PS: For the image related functions, I mostly figured it out:
self.annotate(image_url = Concat(Value(join(settings.SVG_DIR,'')), F('fullname'), Value('.svg'), output_field=CharField()),
image_src = Concat(Value('<img src="'), F('image_url'), Value('"|urlencode />'), output_field=CharField()),
image_display = Concat(Value(''),F('image_src'), Value(''), output_field=CharField()),
)
I am however having an issue for the display of image_src
through:
readonly_fields=['image']
def image(self, obj):
return format_html(obj.image_src)
it doesn't seem to find the image while the adress is ok.
If anybody has an idea...
PS: For the image related functions, I mostly figured it out:
self.annotate(image_url = Concat(Value(join(settings.SVG_DIR,'')),
F('fullname'), Value('.svg'), output_field=CharField()),
image_src = Concat(Value(''), output_field=CharField()),
image_display = Concat(Value(''),F('image_src'),
Value(''), output_field=CharField()),
) I am however having an issue for the display of image_src through:
readonly_fields=['image'] def image(self, obj):
return format_html(obj.image_src) it doesn't seem to find the image while the adress is ok.
I figured it up for my image problem: I should simply use a relative path and let Django manage:
self.annotate(image_url = Concat(Value('/static/SVG_shapes/'), F('fullname'), Value('.svg'), output_field=CharField()),)
With now 1.5 years more experience, I'll try to answer my newbie questions for the next ones who may have the same questions poping into their minds.
1/ is there any pros or cons declaring with the propriety decorator in a django model?
No cons that I could see so far.
It allows the data to be retrieved as a property of the model (my_shape.image_url), instead of having to call the corresponding method (my_shape.image_url())
However, for different purposes, one my prefer to have a callable (the method) instead of a property
2/ what is the cost of calling a function/property in term of database calls
No extra calling to the database if the data it needs as input are already available, or are themselves attributes of the instance object (fields / properties / methods that don't require input from outside the instance object)
However, if external data are needed, a database call will be generated for each of them.
For this reason, it can be valuable to cache the result of such a property by using the #cached_property decorator instead of the #property decorator
The only thing needed to use cached properties is the following import:
from django.utils.functional import cached_property
After being called for the first time, the cached property will remain available at no extra cost during all the lifetime of the object instance,
and its content can be manipulated like any other property / variable:
and therefore, is there an added value to use custom managers / querysets and define annotations to simulate my functions at that level?
In my understanding and practice so far, it is not uncommon to replicate the same functionality in both property & managers
The reason is that properties are easily available when we are interested only in one specific object instance,
while when you are interested into comparing / retrieving a given property for a range of objects, it is much more efficient to calculate & annotate this property for the whole queryset, for instance through using model managers
My give-away would be:
For a given model,
(1) try to put all the business logic concerning a single object instance into model methods / properties
(2) and all the business logic concerning a range of objects into model managers
3/ how would you suggest me to transform my image & nb_color functions into annotations
Already answered in previous answer

MongoEngine: Create Pickle field

I'm using MongoEngine and trying to create a field that works like SQLAlchemy's PickleType field. Basically, I just need to pickle objects before they're written to the database, and unpickle them when they're loaded.
However it looks like MongoEngine's fields don't provide proper conversion methods I could override, instead having two coercion methods (to_python and to_mongo). If I understand correctly, these functions can be called anytime, that is, a call to to_python(v) does not guarantee that v comes from the database. I've thought of writing something like this:
class PickleField(fields.BinaryField):
def to_python(self, value):
value = super().to_python(value)
if <<value was pickled by the field>>
return pickle.loads(value)
else:
return value
Unfortunately, if I want to be as general as possible, I don't see a way to check whether the value should be unpickled or not. For instance,
a = pickle.dumps(x)
PickleField().to_python(a) # should return a, will return x
I also don't think I can store any state in the PickleField, since that's shared by all instances.
Is there a way around this?

hybrid property with join in sqlalchemy

I have probably not grasped the use of #hybrid_property fully. But what I try to do is to make it easy to access a calculated value based on a column in another table and thus a join is required.
So what I have is something like this (which works but is awkward and feels wrong):
class Item():
:
#hybrid_property
def days_ago(self):
# Can I even write a python version of this ?
pass
#days_ago.expression
def days_ago(cls):
return func.datediff(func.NOW(), func.MAX(Event.date_started))
This requires me to add the join on the Action table by the caller when I need to use the days_ago property. Is the hybrid_property even the correct approach to simplifying my queries where I need to get hold of the days_ago value ?
One way or another you need to load or access Action rows either via join or via lazy load (note here it's not clear what Event vs. Action is, I'm assuming you have just Item.actions -> Action).
The non-"expression" version of days_ago intends to function against Action objects that are relevant only to the current instance. Normally within a hybrid, this means just iterating through Item.actions and performing the operation in Python against loaded Action objects. Though in this case you're looking for a simple aggregate you could instead opt to run a query, but again it would be local to self so this is like object_session(self).query(func.datediff(...)).select_from(Action).with_parent(self).scalar().
The expression version of the hybrid when formed against another table typically requires that the query in which it is used already have the correct FROM clauses set up, so it would look like session.query(Item).join(Item.actions).filter(Item.days_ago == xyz). This is explained at Join-Dependent Relationship Hybrid.
your expression here might be better produced as a column_property, if you can afford using a correlated subquery. See that at http://docs.sqlalchemy.org/en/latest/orm/mapping_columns.html#using-column-property-for-column-level-options.

How do I get the key value of a db.ReferenceProperty without a database hit?

Is there a way to get the key (or id) value of a db.ReferenceProperty, without dereferencing the actual entity it points to? I have been digging around - it looks like the key is stored as the property name preceeded with an _, but I have been unable to get any code working. Examples would be much appreciated. Thanks.
EDIT: Here is what I have unsuccessfully tried:
class Comment(db.Model):
series = db.ReferenceProperty(reference_class=Series);
def series_id(self):
return self._series
And in my template:
more
The result:
more
Actually, the way that you are advocating accessing the key for a ReferenceProperty might well not exist in the future. Attributes that begin with '_' in python are generally accepted to be "protected" in that things that are closely bound and intimate with its implementation can use them, but things that are updated with the implementation must change when it changes.
However, there is a way through the public interface that you can access the key for your reference-property so that it will be safe in the future. I'll revise the above example:
class Comment(db.Model):
series = db.ReferenceProperty(reference_class=Series);
def series_id(self):
return Comment.series.get_value_for_datastore(self)
When you access properties via the class it is associated, you get the property object itself, which has a public method that can get the underlying values.
You're correct - the key is stored as the property name prefixed with '_'. You should just be able to access it directly on the model object. Can you demonstrate what you're trying? I've used this technique in the past with no problems.
Edit: Have you tried calling series_id() directly, or referencing _series in your template directly? I'm not sure whether Django automatically calls methods with no arguments if you specify them in this context. You could also try putting the #property decorator on the method.

Categories