hybrid property with join in SQLAlchemy - Python

I have probably not grasped the use of @hybrid_property fully. But what I am trying to do is make it easy to access a calculated value that is based on a column in another table, which therefore requires a join.
So what I have is something like this (which works but is awkward and feels wrong):
class Item():
    # ...

    @hybrid_property
    def days_ago(self):
        # Can I even write a Python version of this?
        pass

    @days_ago.expression
    def days_ago(cls):
        return func.datediff(func.NOW(), func.MAX(Event.date_started))
This requires the caller to add the join on the Action table whenever the days_ago property is used. Is hybrid_property even the correct approach to simplifying my queries where I need to get hold of the days_ago value?
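Concretely, the call site currently has to look something along these lines (a rough sketch; the threshold and the join target are just illustrative):

    # The caller has to supply the join before the days_ago expression works
    items = (session.query(Item)
             .join(Event)
             .filter(Item.days_ago < 7)
             .all())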

One way or another you need to load or access Action rows, either via a join or via lazy load (note: it's not clear here what Event vs. Action is; I'm assuming you simply have Item.actions -> Action).
The non-"expression" version of days_ago intends to function against Action objects that are relevant only to the current instance. Normally within a hybrid, this means just iterating through Item.actions and performing the operation in Python against loaded Action objects. Though in this case you're looking for a simple aggregate you could instead opt to run a query, but again it would be local to self so this is like object_session(self).query(func.datediff(...)).select_from(Action).with_parent(self).scalar().
The expression version of the hybrid, when formed against another table, typically requires that the query in which it is used already has the correct FROM clauses set up, so it would look like session.query(Item).join(Item.actions).filter(Item.days_ago == xyz). This is explained at Join-Dependent Relationship Hybrid.
Your expression here might be better produced as a column_property, if you can afford using a correlated subquery. See that at http://docs.sqlalchemy.org/en/latest/orm/mapping_columns.html#using-column-property-for-column-level-options.
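A minimal sketch of that option, assuming Action has an item_id foreign key back to Item and is mapped before Item (this uses the 1.x select([...]) style; on newer SQLAlchemy you would append .scalar_subquery() to the select):

    from sqlalchemy import Column, Integer, func, select
    from sqlalchemy.orm import column_property

    class Item(Base):
        __tablename__ = 'item'
        id = Column(Integer, primary_key=True)

        # Correlated subquery: computed whenever Item rows are selected
        days_ago = column_property(
            select([func.datediff(func.now(), func.max(Action.date_started))])
            .where(Action.item_id == id)
            .correlate_except(Action)
        )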

Related

Django conditional create

Does the Django ORM provide a way to conditionally create an object?
For example, let's say you want to use some sort of optimistic concurrency control for inserting new objects.
At a certain point, you know the latest object inserted into that table, and you want to create a new object only if no new objects have been inserted since then.
If it's an update, you could filter based on a revision number:
updated = Account.objects.filter(
    id=self.id,
    version=self.version,
).update(
    balance=balance + amount,
    version=self.version + 1,
)
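For context, update() returns the number of rows it matched, so the usual follow-up is something like:

    if updated == 0:
        # Nothing matched: another writer changed the row since we read it.
        # Re-fetch and retry, or surface a conflict to the caller.
        ...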
However, I can't find any documented way to provide conditions for a create() or save() call.
I'm looking for something that will apply these conditions at the SQL query level, so as to avoid "read-modify-write" problems.
EDIT: This is not an optimistic-locking attempt. This is a direct answer to the OP's provided code.
Django offers a way to implement conditional queries. It also offers the update_or_create(defaults=None, **kwargs) shortcut which:
The update_or_create method tries to fetch an object from the database based on the given kwargs. If a match is found, it updates the fields passed in the defaults dictionary.
The values in defaults can be callables.
So we can attempt to mix and match those two in order to recreate the supplied query:
from django.db.models import Case, F, When

obj, created = Account.objects.update_or_create(
    id=self.id,
    version=self.version,
    defaults={
        'balance': Case(
            When(version=self.version, then=F('balance') + amount),
            default=amount,
        ),
        'version': Case(
            When(version=self.version, then=F('version') + 1),
            default=self.version,
        ),
    },
)
Breakdown of the Query:
The update_or_create will try to retrieve an object with id=self.id and version=self.version in the database.
Found: The object's balance and version fields will get updated with the values inside the Case conditional expressions accordingly (see the next section of the answer).
Not Found: The object with id=self.id and version=self.version will be created and then it will get its balance and version fields updated.
Breakdown of the Conditional Queries:
balance Query:
If the object exists, the When expression's condition will be true, therefore the balance field will get updated with the value of:
F('balance') + amount  # existing balance + added amount
If the object gets created, it will receive the amount value as its initial balance.
version Query:
If the object exists, the When expression's condition will be true, therefore the version field will get updated with the value of:
F('version') + 1  # existing version + 1 = next version
If the object gets created, it will receive the self.version value as its initial version (it could also be a default initial version like 1.0.0).
Notes:
You may need to provide an output_field argument to the Case expression; have a look here. A short sketch follows these notes.
In case (pun definitely intended) you are curious about what the F() expression is and how it is used, I have a Q&A-style example here: How to execute arithmetic operations between Model fields in Django.
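As a quick sketch of that first note (the field type here is illustrative; assume balance is a DecimalField):

    from django.db.models import Case, DecimalField, F, When

    balance_expression = Case(
        When(version=self.version, then=F('balance') + amount),
        default=amount,
        output_field=DecimalField(),  # tells Django what type the expression produces
    )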
Except for QuerySet.update returning the number of affected rows, Django doesn't provide any primitives to deal with optimistic locking.
However, there are a few third-party apps out there that provide such a feature:
django-concurrency, the most popular option, which provides both database-level and application-level constraints.
django-optimistic-lock, which is a bit less popular, but I've used it in a past project and it worked just fine.
django-locking, which is unmaintained.
Edit: It looks like the OP was not after optimistic locking solutions after all.

Building Django Q() objects from other Q() objects, but with relation crossing context

I commonly find myself writing the same criteria in my Django application(s) more than once. I'll usually encapsulate it in a function that returns a Django Q() object, so that I can maintain the criteria in just one place.
I will do something like this in my code:
def CurrentAgentAgreementCriteria(useraccountid):
    '''Returns Q that finds agent agreements that gives the useraccountid account current delegated permissions.'''
    AgentAccountMatch = Q(agent__account__id=useraccountid)
    StartBeforeNow = Q(start__lte=timezone.now())
    EndAfterNow = Q(end__gte=timezone.now())
    NoEnd = Q(end=None)
    # Now put the criteria together
    AgentAgreementCriteria = AgentAccountMatch & StartBeforeNow & (NoEnd | EndAfterNow)
    return AgentAgreementCriteria
This means I don't have to think through the DB model more than once, and I can combine the return values from these functions to build more complex criteria. That works well so far, and has already saved me time when the DB model changes.
Something I have realized as I start to combine the criteria from these functions is that a Q() object is inherently tied to the type of object .filter() is being called on. That is what I would expect.
I occasionally find myself wanting to use a Q() object from one of my functions to construct another Q object that is designed to filter a different, but related, model's instances.
Let's use a simple/contrived example to show what I mean. (It's simple enough that normally this would not be worth the overhead, but remember that I'm using a simple example here to illustrate what is more complicated in my app.)
Say I have a function that returns a Q() object that finds all Django users whose username starts with an 'a':
def UsernameStartsWithAaccount():
    return Q(username__startswith='a')
Say that I have a related model that is a user profile with settings including whether they want emails from us:
class UserProfile(models.Model):
    account = models.OneToOneField(User, unique=True, related_name='azendalesappprofile')
    emailMe = models.BooleanField(default=False)
Say I want to find all UserProfiles which have a username starting with 'a' AND want us to send them some email newsletter. I can easily write a Q() object for the latter:
wantsEmails = Q(emailMe=True)
but I find myself wanting to do something like this for the former:
startsWithA = Q(account=UsernameStartsWithAaccount())
# And then
UserProfile.objects.filter(startsWithA & wantsEmails)
Unfortunately, that doesn't work (it generates invalid PSQL syntax when I tried it).
To put it another way, I'm looking for a syntax along the lines of Q(account=Q(id=9)) that would return the same results as Q(account__id=9).
So, a few questions arise from this:
Is there a syntax with Django Q() objects that allows you to add "context" to them to allow them to cross relational boundaries from the model you are running .filter() on?
If not, is this logically possible? (Since I can write Q(account__id=9) when I want to do something like Q(account=Q(id=9)) it seems like it would).
Maybe someone will suggest something better, but I ended up passing the context manually to such functions. I don't think there is an easy solution, as you might need to go through a whole chain of related tables to get to your field, like table1__table2__table3__profile__user__username; how would you guess that? The User table could be linked to table2 too, but you don't need it in this case, so I think you can't avoid setting the path manually.
Also, you can pass a dictionary to Q(), and a list or a dictionary to filter(), which is much easier to work with than using keyword parameters and applying &.
def UsernameStartsWithAaccount(context=''):
    field = 'username__startswith'
    if context:
        field = context + '__' + field
    return Q(**{field: 'a'})
Then, if you simply need to AND your conditions, you can combine them into a list and pass it to filter():
UserProfile.objects.filter(*[startsWithA, wantsEmails])
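Putting it together for the example above, the caller passes the relation name as the context ('account' being the UserProfile field that points at User):

    # Build the username condition relative to UserProfile via its 'account' relation
    startsWithA = UsernameStartsWithAaccount('account')
    wantsEmails = Q(emailMe=True)

    UserProfile.objects.filter(startsWithA & wantsEmails)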

Undo `lazyload()` with the relationship default

I have a Query object which was initially configured to lazyload() all relations on a model:
query = session.query(Article).options(lazyload('author'))
Is it possible to revert the relationship loading back to default? E.g. the relationship was configured with lazy='joined', and I want the query to have joinedload() behavior without using joinedload() explicitly.
I was expecting defaultload() to have this behavior, but in fact it does not: it references the query default instead of the relationship default. So I'm searching for something like a resetload() solution.
The reason for doing this is because I'm creating a JSON-based query syntax, and no relations should be loaded unless the user explicitly names them.
Currently, I'm using lazyload() on all relations that were not explicitly requested, but want to go the other way around: lazyload() all relations first, and then override it for some of them.
This would have made the code more straightforward.
Just to be clear:
By default, all inter-object relationships are lazy loading.
http://docs.sqlalchemy.org/en/latest/orm/loading.html
So we are talking about a case in which a relation has been specifically marked as eager loading, then the queries are configured as lazy loading, then you want to "override the override" as it were.
Chaining calls to options will override earlier calls. I did test this a bit.
q = s.query(User)  # lazy loads 'addresses'
q = s.query(User).options(contains_eager('addresses'))  # eager loads
q = s.query(User).options(contains_eager('addresses'))\
    .options(lazyload('addresses'))  # lazy loads
q = s.query(User).options(contains_eager('addresses'))\
    .options(lazyload('addresses'))\
    .options(contains_eager('addresses'))  # eager loads
However, it sounds like you're talking about just reverting the lazyload option, whereas the above case involves an explicit change to eager loading.
The defaultload docstring says its use case is to be chained to other loader options, so I don't think it's related.
Based on a glance through the source, I don't think this behavior is supported. When you update the loading strategy option, it updates a dictionary with the new loading strategy and I don't think there's still a reference to the old strategy, at least as far as I can tell.
You could keep a reference to the query object before .options(lazyload(...)), or just have an option to generate the query with or without the lazyload on everything.
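A minimal sketch of that approach (names are illustrative):

    base_query = session.query(Article)                   # relationship defaults still apply
    restricted = base_query.options(lazyload('author'))   # explicit lazy load where needed

    # Later, when the relationship's own default loading is wanted again,
    # simply fall back to the untouched base_query
    articles = base_query.all()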
To force everything to lazyload, ignoring what was specified on the relationship, you can use the '*' target. From the docs:
...affecting all relationships not otherwise specified in the query. This feature is available by passing the string '*' as the argument to any of these options:
session.query(Article).options(lazyload('*'))
Then you can specify whatever load types you want per relationship or relationship chain.
# not sure how you are mapping json data to relationships
# once you know the relationships, you can build a list of them to load
my_loads = [joinedload(rel) for rel in json_rel_data]
query = session.query(Article).options(lazyload('*'), *my_loads)
# query lazy loads **everything** except the explicitly set joined loads
If you are joining on the relationships for query purposes, you can use contains_eager instead of joinedload in the options to use the already joined relationship.
my_eagers = [contains_eager(rel) for rel in json_rel_joins]
my_loads = [joinedload(rel) for rel in json_rel_loads]
query = session.query(Article) \
    .join(*json_rel_joins) \
    .options(lazyload('*'), *my_eagers, *my_loads)

Is it possible in SQLAlchemy to filter by a database function or stored procedure?

We're using SQLAlchemy in a project with a legacy database. The database has functions/stored procedures. In the past we used raw SQL and we could use these functions as filters in our queries.
I would like to do the same for SQLAlchemy queries if possible. I have read about @hybrid_property, but some of these functions need one or more parameters. For example:
I have a User model that has a JOIN to a bunch of historical records. These historical records for this user have a date and a debit and credit field, so we can look up the balance of a user at a specific point in time by doing a SUM(credit) - SUM(debit) up until the given date.
We have a database function for that called dbo.Balance(user_id, date_time). I can use this to check the balance of a user at a given point in time.
I would like to use this as a criterium in a query, to select only users that have a negative balance at a specific date/time.
selection = users.filter(coalesce(Users.status, 0) == 1,
                         coalesce(Users.no_reminders, 0) == 0,
                         dbo.pplBalance(Users.user_id, datetime.datetime.now()) < -0.01).all()
This is of course a non-working example, just for you to get the gist of what I'd like to do. The solution looks to be to use hybrid properties, but as I mentioned above, those only work without parameters (as they are properties, not methods).
Any suggestions on how to implement something like this (if it's even possible) are welcome.
Thanks,
A @hybrid_property isn't by itself a means of producing a particular SQL statement; it is only a helper that can add more query-generation capabilities to an ORM-mapped class.
SQL functions that can be called as plain functions (e.g. without any kind of "EXEC XYZ" type of syntax) can be called using the func construct, meaning the query you have is pretty much ready to go:
from sqlalchemy import func
selection = users.filter(coalesce(Users.status, 0) == 1,
                         coalesce(Users.no_reminders, 0) == 0,
                         func.dbo.pplBalance(Users.user_id, datetime.datetime.now()) < -0.01).all()

Meaning of the map function in couchdb-python's ViewField

I'm using couchdb.mapping in one of my projects. I have a class called SupportCase derived from Document that contains all the fields I want.
My database (called admin) contains multiple document types. I have a type field in all the documents which I use to distinguish between them. I have many documents of type "case" which I want to get at using a view. I have a design document called support with a view inside it called cases. If I request the results of this view using db.view("support/cases"), I get back a list of Rows which have what I want.
However, I want to somehow have this wrapped by the SupportCase class so that I can call a single function and get back a list of all the SupportCases in the system. I created a ViewField property:
@ViewField.define('cases')
def all(self, doc):
    if doc.get("type", "") == "case":
        yield doc["_id"], doc
Now, if I call SupportCase.all(db), I get back all the cases.
What I don't understand is whether this view is precomputed and stored in the database or done on demand similar to db.query. If it's the latter, it's going to be slow and I want to use a precomputed view. How do I do that?
I think what you need is:
@classmethod
def all(cls, db):
    result = cls.view(db, "support/all", include_docs=True)
    return result.rows
The Document class has a classmethod view which wraps the rows in the class on which it is called. So the following returns a ViewResult with rows of type SupportCase, and taking .rows of that gives a list of support cases.
SupportCase.view(db, viewname, include_docs=True)
And I don't think you need to get into the ViewField magic. But let me explain how it works. Consider the following example from the CouchDB-python documentation.
class Person(Document):
    @ViewField.define('people')
    def by_name(doc):
        yield doc['name'], doc
I think this is equivalent to:
class Person(Document):
    @classmethod
    def by_name(cls, db, **kw):
        return cls.view(db, **kw)
With the original function attached to Person.by_name.map_fun.
The map function is in some ways analogous to an index in a relational database. It is not re-run every time, and when new documents are added, updating it does not require everything to be recomputed (the index is a kind of tree structure).
This has a pretty good summary
ViewField uses a pre-defined view, so once built it will be fast. It definitely doesn't use a temporary view.
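If the design document isn't on the server yet, couchdb-python can create it for you: accessing the ViewField on the class returns a ViewDefinition, which can be synced to the database (a sketch, assuming the all view from the question is defined on SupportCase):

    from couchdb.design import ViewDefinition

    # Writes (or updates) the permanent view in its design document,
    # so subsequent queries hit the precomputed index
    ViewDefinition.sync_many(db, [SupportCase.all])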
