SQLAlchemy hybrid_property and expressions - python

I am working on storing some data produced by an external process in a postgres database using sqlalchemy. The external data has several dates stored as strings that I would like to use as datetime objects for comparison and duration calculation and I'd like the conversion to happen in the data model to maintain consistency. I'm trying to use a hybrid_property but I am running into problems based on the different ways that SQLAlchemy uses the hybrid_property as an instance or class.
A (simplified) case looks like this...
class Contact(Base):
    id = Column(String(100), primary_key=True)
    status_date = Column(String(100))
    @hybrid_property
    def real_status_date(self):
        return convert_from_outside_date(self.status_date)
with the conversion function something like this (the function can return a date, False on conversion failure or None on being passed None)...
def convert_from_outside_date(in_str):
    out_date = None
    if in_str != None:
        try:
            out_date = datetime.datetime.strptime(in_str, "%Y-%m-%d")
        except ValueError:
            out_date = False
    return out_date
When I use an instance of Contact, contact.real_status_date properly works as a datetime. The problem is when Contact.real_status_date is used in a query filter.
db_session.query(Contact).filter(
Contact.real_status_date > datetime.datetime.now())
Gets me a "TypeError: Boolean value of this clause is not defined" exception, with the
in_str != None
line of the conversion function as the last part of the stack trace.
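For context, the exception can be reproduced outside the model. The sketch below assumes only that the class-level hybrid hands the conversion function a column expression rather than a string; the standalone `column("status_date")` is a stand-in for `Contact.status_date`.

```python
from sqlalchemy import column

# At class level the hybrid receives a column expression, not a string.
in_str = column("status_date")

# Comparing a column with None builds a SQL condition instead of True/False.
condition = in_str != None  # noqa: E711
rendered = str(condition)

try:
    if condition:  # this is what the `if` in the converter ends up evaluating
        pass
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Here `rendered` is the SQL condition `status_date IS NOT NULL`, and evaluating it in an `if` statement is what raises the TypeError.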
Some answers (https://stackoverflow.com/a/14504695/416308) show the use of a setter function and the addition of a new column in the data model. Other answers (https://stackoverflow.com/a/13642708/416308) show the addition of a @property.expression function that returns something SQLAlchemy can interpret into a SQL expression.
Adding a setter to the Contact class works, but the addition of new columns seems like it shouldn't be necessary. It also makes some table metadata parsing more difficult later, so I'd like to avoid it if I can.
_real_status_date = Column(DateTime())
@hybrid_property
def real_status_date(self):
    return self._real_status_date
@real_status_date.setter
def value(self):
    self._real_status_date = convert_from_outside_date(self.status_date)
If I used an @.expression decorator, would I have to implement a strptime function that is more SQL-compatible? What would that look like? Is there something wrong with the conversion function that is causing trouble here?

As zzzeek mentions, you could add an expression to your class.
Depending on your DB, it might already interpret a Python datetime object,
so it may be enough to modify your conversion function to:
def convert_from_outside_date(in_str):
    if in_str:
        try:
            return datetime.datetime.strptime(in_str, "%Y-%m-%d")
        except ValueError:
            pass
    # Return None for a null or unparseable date
    return None
Otherwise you need to add an expression function that builds the conversion in SQL from the underlying column:
@real_status_date.expression
def real_status_date(cls):
    return sqlalchemy.cast(cls.status_date, sqlalchemy.Date)
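Putting the pieces together, a minimal sketch of a hybrid with separate Python and SQL sides might look like the following. It assumes SQLAlchemy 1.4 or newer, that the stored strings are always in YYYY-MM-DD form, and that a plain CAST to DATE is acceptable on your database (a malformed string would make the CAST fail on PostgreSQL). The `__tablename__` and the choice to return a `date` rather than a `datetime` are additions made here so the example is declarable and both sides agree on the type.

```python
import datetime

from sqlalchemy import Column, Date, String, cast
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import declarative_base

Base = declarative_base()


def convert_from_outside_date(in_str):
    """Parse 'YYYY-MM-DD'; return None for missing or malformed input."""
    if not in_str:
        return None
    try:
        return datetime.datetime.strptime(in_str, "%Y-%m-%d").date()
    except ValueError:
        return None


class Contact(Base):
    __tablename__ = "contact"  # assumed name, not given in the question

    id = Column(String(100), primary_key=True)
    status_date = Column(String(100))

    @hybrid_property
    def real_status_date(self):
        # Instance access: plain Python conversion of the stored string.
        return convert_from_outside_date(self.status_date)

    @real_status_date.expression
    def real_status_date(cls):
        # Class access (queries): let the database do the conversion.
        return cast(cls.status_date, Date)


contact = Contact(id="c1", status_date="2020-01-31")
instance_value = contact.real_status_date

# The class-level attribute now produces SQL instead of calling strptime.
sql_text = str(Contact.real_status_date > datetime.date(2020, 1, 1))
```

With this in place, a filter such as `Contact.real_status_date > datetime.date.today()` is built from the CAST expression, and the Python conversion function is never called with a column object.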

Related

SQLAlchemy hybrid method as an attribute in the query result?

In a search for more syntactically pleasant models, I've created something like (omitted not so relevant parts):
class Post(BaseModel):
    acl = db.relationship("ACL", cascade="all, delete-orphan")
    @hybrid_method
    def is_editable_by(self, user):
        if user_is_admin(user):
            return True
        else:
            return any(acl.user_id == user.id for acl in self.acl)
    @is_editable_by.expression
    def is_editable_by(cls, user):
        if user_is_admin(user):
            return true().label("is_editable")
        else:
            return and_(true(), cls.acl.any(user_id=user.id)).label("is_editable")
That works nicely with a query like:
results = db.session.query(Post, Post.is_editable_by(current_user)).all()
usage = [(item.Post.name, item.is_editable) for item in results]
However, I am getting a list of tuples, so the question is: does SQLAlchemy have any mechanism to make it one object (some kind of proxy/adapter maybe)? If it were not for the parameter (user), hybrid_property would be the exact answer:
[(item.name, item.is_editable) for item in results]
It is possible to achieve the desired result with yet another class, but I wonder if the ORM already has something for this case. The main idea is that when a lot of objects are fetched, there should be no need to calculate something separately for each object, or at least that calculation should not require extra calls to the database. In other words, how can I have result objects with some derived attributes that depend on a parameter?
Maybe there is a totally different pattern for this kind of thing. This case is in some sense similar to load_only, but it's about adding a "virtual" column to an object, not removing one.
Could also be nice with version 0.9 of SQLAlchemy...
Subtlety:
results = db.session.query(Post, Post.is_editable_by(user1)).all()
[(item.name, item.is_editable) for item in results]
...
results = db.session.query(Post, Post.is_editable_by(user2)).all()
[(item.name, item.is_editable) for item in results]
should not cause problems.

How can I compare two SQLAlchemy queries if they are the same?

I have a function returning SQLAlchemy query object and I want to test this function that it builds correct query.
For example:
import sqlalchemy
metadata = sqlalchemy.MetaData()
users = sqlalchemy.Table(
    "users",
    metadata,
    sqlalchemy.Column("email", sqlalchemy.String(255), nullable=False, unique=True),
    sqlalchemy.Column("username", sqlalchemy.String(50), nullable=False, unique=True),
)
def select_first_users(n):
    return users.select().limit(n)
def test_queries_are_equal(self):
    expected_query = users.select().limit(10)
    assert select_first_users(10) == expected_query  # fails here
    assert select_first_users(10).compare(expected_query)  # fails here too
I have no idea how to compare two queries for equality. == doesn't work here because, as far as I can see, these objects do not define an __eq__ method that compares their contents, so it compares objects by address in memory and surely fails. The compare method also performs an identity (is) comparison.
The only solution I see is like:
assert str(q1.compile()) == str(q2.compile())
but it looks strange, and the compiled string contains placeholders instead of actual values.
So how can I compare two SQLAlchemy queries for equality?
I use Python 3.7.4, SQLAlchemy==1.3.10.
There is a parameter to the compile function that solves the placeholder problem:
query.compile(compile_kwargs={"literal_binds": True}). So instead of
SELECT users.email,
users.username
FROM users
LIMIT :param_1
you get
SELECT users.email,
users.username
FROM users
LIMIT 10
So I think you could do something like this:
import sqlparse
def format_query(query):
    return sqlparse.format(str(query.compile(compile_kwargs={"literal_binds": True})),
                           reindent=True, keyword_case='upper')
def test_queries_are_equal():
    expected_query = users.select().limit(10)
    assert format_query(expected_query) == format_query(select_first_users(10))
If the comparison is based on an exact string match, I think it's better to ensure consistent formatting, hence the use of sqlparse.
Natively, literal_binds handles only basic types such as ints and strings, but it can be extended; see the docs for more info:
https://docs.sqlalchemy.org/en/13/faq/sqlexpressions.html#faq-sql-expression-string
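As a dependency-free variation on the same idea, the sketch below compares the compiled SQL with `literal_binds` enabled and simply collapses whitespace instead of using sqlparse. The `render` helper is a name made up for this example, and whitespace collapsing is a simplification that only normalizes layout, not keyword case.

```python
import sqlalchemy

metadata = sqlalchemy.MetaData()
users = sqlalchemy.Table(
    "users",
    metadata,
    sqlalchemy.Column("email", sqlalchemy.String(255), nullable=False, unique=True),
    sqlalchemy.Column("username", sqlalchemy.String(50), nullable=False, unique=True),
)


def render(query):
    """Compile with parameter values inlined, then collapse all whitespace."""
    sql = str(query.compile(compile_kwargs={"literal_binds": True}))
    return " ".join(sql.split())


def select_first_users(n):
    return users.select().limit(n)


# Same structure and same values: the rendered strings match.
same = render(select_first_users(10)) == render(users.select().limit(10))

# Same structure, different value: with literal_binds the difference is visible.
different = render(select_first_users(10)) == render(users.select().limit(5))
```

Without `literal_binds`, both comparisons would be true, because the limit would render as the same placeholder in each string.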

SQLAlchemy--can I map an empty string to null in the DDL? I want a nullable integer column to translate '' to NULL on insert or update

I have a SQLAlchemy model with an integer column being populated from an HTML form (I'm using Flask and WTForms-alchemy and I'm trying to avoid writing custom code in routes). If the user does not enter a value for this integer on the form, the code that populates objects from the form ends up trying to put an empty string in for that column and MySQL complains that this is not an integer value. To help people searching: the error I got started with Incorrect integer value: '' for column ....
I don't want to use the sql_mode='' hack, which people suggest in order to put MySQL back into the old behavior of making a guess whenever you give it incorrect data, because I can't control the MySQL server this will eventually be used for.
What I want is something like the default column specification, except that instead of specifying a default for when nothing is passed in, I want to intercept an attempt to put an empty string in and replace it with None which I think will get translated to a null as it goes to the database.
Is there any way to do this in the model definition? I realize that it may incur a performance hit, but throughput is not a big deal in this application.
I found a way. The validates decorator can change a value on the way in. Here's how I did it:
from sqlalchemy import Column
from sqlalchemy.dialects.mysql import INTEGER
from sqlalchemy.orm import validates
class Task(Base):
    __tablename__ = 'task'
    id = Column(INTEGER(11), primary_key=True)
    time_per_unit = Column(INTEGER(11))
    @validates('time_per_unit')
    def empty_string_to_null(self, key, value):
        if isinstance(value, str) and value == '':
            return None
        else:
            return value
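A quick way to check the behaviour without a database is to set the attribute on an unsaved instance, since validators run on attribute assignment. This sketch assumes SQLAlchemy 1.4 or newer and uses the generic `Integer` type instead of the MySQL-specific one so it does not depend on a dialect.

```python
from sqlalchemy import Column, Integer
from sqlalchemy.orm import declarative_base, validates

Base = declarative_base()


class Task(Base):
    __tablename__ = "task"

    id = Column(Integer, primary_key=True)
    time_per_unit = Column(Integer)

    @validates("time_per_unit")
    def empty_string_to_null(self, key, value):
        # Whatever is returned here is the value that gets stored.
        return None if value == "" else value


blank = Task(time_per_unit="")   # validator also runs for constructor kwargs
filled = Task(time_per_unit=5)
before_update = filled.time_per_unit
filled.time_per_unit = ""        # and for later assignments
after_update = filled.time_per_unit
```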
I ran into a similar issue and created a decorator that can be used on the setter for the field. This allows easy reuse but does require a hybrid property and setter for the field.
helpers.py
import functools
def noneIfEmptyString(func):
    """Decorator to convert empty string arguments to None."""
    @functools.wraps(func)
    def wrapped_function(*args, **kwargs):
        # Rebinding a loop variable would leave args unchanged, so build new containers.
        args = tuple(None if arg == '' else arg for arg in args)
        kwargs = {k: (None if v == '' else v) for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapped_function
Models.py
_columnName = db.Column(db.String(120))
@hybrid_property
def columnName(self):
    return self._columnName
@columnName.setter
@noneIfEmptyString
def columnName(self, name):
    self._columnName = name
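To see the decorator's effect in isolation, here is a sketch that uses a plain Python property in place of the hybrid property, so it runs without a database. `Item` is a made-up class for illustration, and the decorator is written so that positional arguments are actually replaced.

```python
import functools


def noneIfEmptyString(func):
    """Replace empty-string arguments with None before calling func."""
    @functools.wraps(func)
    def wrapped_function(*args, **kwargs):
        args = tuple(None if arg == "" else arg for arg in args)
        kwargs = {k: (None if v == "" else v) for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapped_function


class Item:
    """Hypothetical stand-in for a model class."""

    def __init__(self):
        self._columnName = None

    @property
    def columnName(self):
        return self._columnName

    @columnName.setter
    @noneIfEmptyString
    def columnName(self, name):
        self._columnName = name


item = Item()
item.columnName = "widget"
first = item.columnName   # a normal string passes through unchanged
item.columnName = ""
second = item.columnName  # the empty string was converted to None
```

Note the decorator order: `@noneIfEmptyString` sits closest to the function, so the setter that the property registers is the already-wrapped version.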

Why use MongoAlchemy when you could subclass a Python Dict?

A friend recently showed me that you can create an instance that is a subclass of dict in Python, and then use that instance to save, update, etc. Seems like you have more control, and it looks easier as well.
class Marker(dict):
    def __init__(self, username, email=None):
        self.username = username
        if email:
            self.email = email
    @property
    def username(self):
        return self.get('username')
    @username.setter
    def username(self, val):
        self['username'] = val
    def save(self):
        db.collection.save(self)
Author here. The general reason you'd want to use it (or one of the many similar libraries) is for safety. When you assign a value to a MongoAlchemy Document, it checks that all of the constraints you specified are satisfied (e.g. type, lengths of strings, numeric bounds).
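To illustrate the kind of check being described, here is a plain-Python sketch of validation on assignment added to the dict-subclass approach. `CheckedString` is a hypothetical descriptor written for this example; it is not MongoAlchemy's API and only hints at what such a library does for you.

```python
class CheckedString:
    """Hypothetical descriptor: validates type and length on assignment."""

    def __init__(self, max_length):
        self.max_length = max_length

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.get(self.name)

    def __set__(self, obj, value):
        if not isinstance(value, str):
            raise TypeError(f"{self.name} must be a str")
        if len(value) > self.max_length:
            raise ValueError(f"{self.name} is longer than {self.max_length}")
        obj[self.name] = value  # store in the dict itself, as Marker does


class Marker(dict):
    username = CheckedString(max_length=20)

    def __init__(self, username):
        super().__init__()
        self.username = username  # goes through CheckedString.__set__


marker = Marker("jeff")
```

A bare dict subclass would accept `Marker(42)` silently; with the descriptor, the bad value is rejected at assignment time rather than discovered later in the database.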
It also has a query DSL that can be more pleasant to use than the json-like built in syntax. Here's an example from the docs:
>>> query = session.query(BloodDonor)
>>> for donor in query.filter(BloodDonor.first_name == 'Jeff', BloodDonor.age < 30):
...     print(donor)
Jeff Jenkins (male; Age: 28; Type: O+)
The MongoAlchemy Session object also allows you to simulate transactions:
with session:
    do_stuff()
    session.insert(doc1)
    do_more_stuff()
    session.insert(doc2)
    do_even_more_stuff()
    session.insert(doc3)
    # note that at this point nothing has been inserted
# now things are inserted
This doesn't mean that these inserts are one atomic operation, or even that all of the writes will succeed, but it does mean that if your application hits an error in one of the "do_stuff" functions, you won't have done half of the inserts. So it prevents a specific and reasonably common type of error.
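The deferred-write behaviour described here can be sketched with an ordinary context manager. `BufferedSession` below is a hypothetical stand-in written for illustration, with a list playing the role of the collection; it is not how MongoAlchemy's Session is implemented.

```python
class BufferedSession:
    """Hypothetical stand-in: queues inserts and applies them on clean exit."""

    def __init__(self, store):
        self.store = store   # a list standing in for a collection
        self.pending = []

    def insert(self, doc):
        self.pending.append(doc)  # nothing is written yet

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.store.extend(self.pending)  # flush only if no error occurred
        self.pending.clear()
        return False  # never swallow the exception


store = []

# An error between inserts: the first insert is discarded, not half-applied.
try:
    with BufferedSession(store) as session:
        session.insert({"n": 1})
        raise RuntimeError("failure between inserts")
except RuntimeError:
    pass
after_failure = list(store)

# A clean block: both inserts are applied when the block exits.
with BufferedSession(store) as session:
    session.insert({"n": 1})
    session.insert({"n": 2})
```

As in the original description, this only protects against application errors inside the block; the flush itself is still a sequence of separate writes.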

Filter by an object in SQLAlchemy

I have a declared model where the table stores a "raw" path identifier of an object. I then have a @hybrid_property which allows directly getting and setting the object which is identified by this field (which is not another declarative model). Is there a way to query directly on this high level?
I can do this:
session.query(Member).filter_by(program_raw=my_program.raw)
I want to be able to do this:
session.query(Member).filter_by(program=my_program)
where my_program.raw == "path/to/a/program"
Member has a field program_raw and a property program which gets the correct Program instance and sets the appropriate program_raw value. Program has a simple raw field which identifies it uniquely. I can provide more code if necessary.
The problem is that currently, SQLAlchemy simply tries to pass the program instance as a parameter to the query, instead of its raw value. This results in an "Error binding parameter 0 - probably unsupported type." error.
Either SQLAlchemy needs to know that when comparing the program, it must use Member.program_raw and match that against the raw property of the parameter (getting it to use Member.program_raw is done simply using @program.expression, but I can't figure out how to translate the Program parameter correctly; using a Comparator?), and/or SQLAlchemy should know that when I filter by a Program instance, it should use the raw attribute.
My use-case is perhaps a bit abstract, but imagine I stored a serialized RGB value in the database and had a property with a Color class on the model. I want to filter by the Color class, and not have to deal with RGB values in my filters. The color class has no problems telling me its RGB value.
Figured it out by reading the source for relationship. The trick is to use a custom Comparator for the property, which knows how to compare two things. In my case it's as simple as:
from sqlalchemy.ext.hybrid import Comparator, hybrid_property
class ProgramComparator(Comparator):
    def __eq__(self, other):
        # Should check for case of `other is None`
        return self.__clause_element__() == other.raw
class Member(Base):
    # ...
    program_raw = Column(String(80), index=True)
    @hybrid_property
    def program(self):
        return Program(self.program_raw)
    @program.comparator
    def program(cls):
        # program_raw becomes __clause_element__ in the Comparator.
        return ProgramComparator(cls.program_raw)
    @program.setter
    def program(self, value):
        self.program_raw = value.raw
Note: In my case, Program('abc') == Program('abc') (I've overridden __new__), so I can just return a "new" Program all the time. For other cases, the instance should probably be lazily created and stored in the Member instance.
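For the `other is None` case mentioned in the comment, one possible approach is to compare the column against None so that SQLAlchemy renders IS NULL. The sketch below assumes SQLAlchemy 1.4 or newer; the `Program` class, the `id` column, and the table name are minimal stand-ins added so the example is self-contained.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.hybrid import Comparator, hybrid_property
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Program:
    """Hypothetical value object identified by its raw path."""

    def __init__(self, raw):
        self.raw = raw


class ProgramComparator(Comparator):
    def __eq__(self, other):
        # Compare against NULL when no program is given.
        raw = None if other is None else other.raw
        return self.__clause_element__() == raw


class Member(Base):
    __tablename__ = "member"  # assumed name

    id = Column(Integer, primary_key=True)  # added so the model can be mapped
    program_raw = Column(String(80), index=True)

    @hybrid_property
    def program(self):
        return Program(self.program_raw)

    @program.comparator
    def program(cls):
        return ProgramComparator(cls.program_raw)

    @program.setter
    def program(self, value):
        self.program_raw = value.raw


# Render the expressions to see what the comparator produces.
eq_sql = str(Member.program == Program("path/to/a/program"))
null_sql = str(Member.program == None)  # noqa: E711
```

The first expression compares `member.program_raw` with a bound parameter holding the raw path, and the second renders as an IS NULL test, so `filter_by(program=None)` would match members without a program.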
