SQLAlchemy - can you add custom methods to the query object? - python

Is there a way to create custom methods to the query object so you can do something like this?
User.query.all_active()
Where all_active() is essentially .filter(User.is_active == True)
And be able to filter off of it?
User.query.all_active().filter(User.age == 30)

You can subclass the base Query class to add your own methods:
from sqlalchemy.orm import Query
class MyQuery(Query):
    def all_active(self):
        return self.filter(User.is_active == True)
You then tell SQLAlchemy to use this new query class when you create the session (docs here). From your code it looks like you might be using Flask-SQLAlchemy, so you would do it as follows:
db = SQLAlchemy(session_options={'query_cls': MyQuery})
Otherwise you would pass the argument directly to the sessionmaker:
sessionmaker(bind=engine, query_cls=MyQuery)
As of right now, this new query object isn't that interesting because we hardcoded the User class in the method, so it won't work for anything else. A better implementation would use the query's underlying class to determine which filter to apply. This is slightly tricky but can be done as well:
from sqlalchemy.orm import Mapper
class MyOtherQuery(Query):
    def _get_models(self):
        """Returns the query's underlying model classes."""
        if hasattr(self, 'attr'):
            # we are dealing with a subquery
            return [self.attr.target_mapper]
        else:
            return [
                d['expr'].class_
                for d in self.column_descriptions
                if isinstance(d['expr'], Mapper)
            ]
    def all_active(self):
        model_class = self._get_models()[0]
        return self.filter(model_class.is_active == True)
Finally, this new query class won't be used by dynamic relationships (if you have any). To let those also use it, you can pass it as argument when you create the relationship:
users = relationship(..., query_class=MyOtherQuery)
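To make the overall wiring concrete, here is a minimal end-to-end sketch using plain SQLAlchemy (no Flask). It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database. The names ActiveQuery and the User columns are made up for illustration, and the model class is looked up through column_descriptions[0]["entity"] rather than the _get_models helper above, so treat it as one possible variant rather than the answer's exact method.

```python
from sqlalchemy import Boolean, Column, Integer, create_engine
from sqlalchemy.orm import Query, declarative_base, sessionmaker

Base = declarative_base()


class ActiveQuery(Query):
    """Hypothetical Query subclass with a reusable, chainable filter."""

    def all_active(self):
        # Assumption: the first queried entity is a mapped class
        # that has an `is_active` column.
        model = self.column_descriptions[0]["entity"]
        return self.filter(model.is_active.is_(True))


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    age = Column(Integer)
    is_active = Column(Boolean, default=True)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

# query_cls makes session.query(...) return ActiveQuery instances
Session = sessionmaker(bind=engine, query_cls=ActiveQuery)
session = Session()
session.add_all([
    User(age=30, is_active=True),
    User(age=30, is_active=False),
    User(age=41, is_active=True),
])
session.commit()

# The custom method returns a query, so further filters can be chained
matches = session.query(User).all_active().filter(User.age == 30).all()
print(len(matches))
```

Because all_active() returns the result of self.filter(...), the returned object is still an ActiveQuery, which is what allows the extra .filter(User.age == 30) call.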

This works fine for me:
from flask import g
from flask_sqlalchemy import BaseQuery
class ParentQuery(BaseQuery):
    def _get_models(self):
        if hasattr(self, 'attr'):
            # dynamic relationship: take the class from the target mapper
            return self.attr.target_mapper.class_
        else:
            return self._mapper_zero().class_
    def FilterByCustomer(self):
        model_class = self._get_models()
        return self.filter(model_class.customerId == int(g.customer.get('customerId')))
# used like this
class AccountWorkflowModel(db.Model):
    query_class = ParentQuery
    .................

To provide a custom method that will be used by all your models that inherit from a particular parent, first as mentioned before inherit from the Query class:
from flask_sqlalchemy import SQLAlchemy, BaseQuery
from sqlalchemy.inspection import inspect
class MyCustomQuery(BaseQuery):
    def all_active(self):
        # get the class
        modelClass = self._mapper_zero().class_
        # get the primary key column
        ins = inspect(modelClass)
        # get a list of passing objects
        passingObjs = []
        for modelObj in self:
            if modelObj.is_active == True:
                # add to passing object list
                passingObjs.append(modelObj.__dict__[ins.primary_key[0].name])
        # change to tuple
        passingObjs = tuple(passingObjs)
        # run a filter on the query object
        return self.filter(ins.primary_key[0].in_(passingObjs))
# add this to the constructor for your DB object
myDB = SQLAlchemy(query_class=MyCustomQuery)
This is for Flask-SQLAlchemy, since people looking for a Flask-SQLAlchemy answer will still end up here.

Related

Completely restart/reload declarative class with dynamic functionality in SQLAlchemy

I am using SQLAlchemy + SQLite3 for creating multiple databases based on user input. When initializing a new database, the user defines any number of arbitrary features and their types. I wrote a DBManager class to serve as an interface between user input and database creation/access.
Dynamically "injecting" these arbitrary features in the declarative model (the Features class) is working as expected. The problem I have is when the user wants to create a second/different database: I can't figure out how to completely "clear" or "refresh" the model or the declarative_base so that the user is able to create a new database (with possibly different features).
Below is a minimal reproducible example of my situation:
src.__init__.py:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
Session = sessionmaker()
Base = declarative_base()
src.features.py
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import relationship
from src import Base
class Features(Base):
    __tablename__ = "features"
    features_id = Column(Integer, primary_key=True)
    @classmethod
    def add_feature(cls, feature_name, feature_type):
        setattr(cls, feature_name, Column(feature_type))
src.db_manager.py:
from typing import Optional, Dict
from sqlalchemy import create_engine
from src import Base, Session
from src.features import Features
class DBManager:
    def __init__(self, path: str, features: Optional[Dict] = None) -> None:
        self.engine = create_engine(f'sqlite:///{path}')
        Session.configure(bind=self.engine)
        self.session = Session()
        self.features = features
        if self.features:  # user passed in some arbitrary features
            self.bind_features_to_features_table()
        Base.metadata.create_all(bind=self.engine)
    def bind_features_to_features_table(self):
        for feature_name, feature_type in self.features.items():
            Features.add_feature(feature_name=feature_name, feature_type=feature_type)
I'd like to be able to do something like this:
from sqlalchemy import String, Float, Integer
from src.db_manager import DBManager
# User wants to create a database with these features
features = {
    'name': String,
    'height': Float,
}
db_manager = DBManager(path='my_database.db', features=features)
# ... User does some stuff with database here ...
# Now the user wants to create another database with these features
other_features = {
    'age': Integer,
    'weight': Float,
    'city_of_residence': String,
    'name': String,
}
db_manager = DBManager(path='another_database.db', features=other_features)
After executing the last line, I'm met with: InvalidRequestError: Implicitly combining column features.name with column features.name under attribute 'name'. Please configure one or more attributes for these same-named columns explicitly. The error wouldn't occur if the feature name did not appear on both databases, but then the feature height would be brought over to the second database, which is not desired.
Things I tried but didn't work:
call Base.metadata.clear() between DBManager instances: same error
call sqlalchemy.orm.clear_mappers() between DBManager instances: results in AttributeError: 'NoneType' object has no attribute 'instrument_attribute'
call delattr(Features, feature_name): results in NotImplementedError: Can't un-map individual mapped attributes on a mapped class..
This program will be running inside a GUI, so I can't really afford to exit/restart the script in order to connect to the second database. The user should be able to load/create different databases without having to close the program.
I understand that the error stems from the fact that the underlying Base object has not been "refreshed" and is still keeping track of the features created in my first DBManager instance. However I do not know how to fix this. What's worse, any attempt to overwrite/reload a new Base object will need to be applied to all modules that imported that object from __init__.py, which sounds tricky. Does anyone have a solution for this?
My solution was to define the Features declarative class inside a function, get_features, that takes a Base (declarative base) instance as an argument. The function returns the Features class object, so that every call essentially creates a new Features class as a whole.
The class DBManager is then responsible for calling that function, and Features becomes an instance attribute of DBManager. Creating a new instance of DBManager means creating an entirely new class based on Features, to which I can then add any arbitrary features I'd like.
The code looks something like this:
def get_features(base):
    class Features(base):
        __tablename__ = "features"
        features_id = Column(Integer, primary_key=True)
        @classmethod
        def add_feature(cls, feature_name, feature_type):
            setattr(cls, feature_name, Column(feature_type))
    return Features
class DBManager:
    def __init__(self, path, features):
        self.engine = create_engine(f'sqlite:///{path}')
        Session.configure(bind=self.engine)
        self.session = Session()
        self.features = features
        # a fresh declarative base per DBManager, so no mappings carry over
        base = declarative_base()
        self.features_table = get_features(base)
        if self.features:  # user passed in some arbitrary features
            self.bind_features_to_features_table()
        # create the tables registered on this manager's own base
        base.metadata.create_all(bind=self.engine)
    def bind_features_to_features_table(self):
        for feature_name, feature_type in self.features.items():
            self.features_table.add_feature(feature_name=feature_name, feature_type=feature_type)
It definitely feels a bit convoluted, and I have no idea if there are any caveats I'm not aware of, but as far as I can tell this approach solved my problem.
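The reason this works can be shown without SQLAlchemy at all: a class statement inside a function runs again on every call, so each call produces a distinct class object and attributes attached to one never appear on another. The sketch below is only an illustration of that mechanism; make_features_class is a hypothetical stand-in for get_features, and plain Python types stand in for Column objects.

```python
def make_features_class():
    """Return a brand-new class on every call (stand-in for get_features)."""

    class Features:
        @classmethod
        def add_feature(cls, feature_name, feature_type):
            # Attach the feature to this particular class object only
            setattr(cls, feature_name, feature_type)

    return Features


first = make_features_class()
first.add_feature("height", float)

second = make_features_class()
second.add_feature("age", int)

# The two classes are independent: "height" exists only on the first one
print(first is second)
print(hasattr(first, "height"), hasattr(second, "height"))
```

In the SQLAlchemy version the same isolation also depends on each class being built on its own declarative base, since the base holds the table metadata and mapper registry.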

Why an UnmappedInstanceError while populating a database using Flask-SQLAlchemy?

I'm new to SQLAlchemy and am using Flask-SQLAlchemy for my current project. I'm getting an error that has me stumped:
sqlalchemy.orm.exc.UnmappedInstanceError: Class 'flask_sqlalchemy._BoundDeclarativeMeta' is not mapped; was a class (app.thing.Thing) supplied where an instance was required?
I've tried changing the notation in the instantiation, and moving some things around. Could the problem be related to this question, or something else?
Here is my module (thing.py) with the class that inherits from Flask-SQLAlchemy's db.Model (like SQLAlchemy's Base class):
from app import db
class Thing(db.Model):
    indiv_id = db.Column(db.INTEGER, primary_key = True)
    stuff_1 = db.Column(db.INTEGER)
    stuff_2 = db.Column(db.INTEGER)
    some_stuff = db.Column(db.BLOB)
    def __init__():
        pass
And here is my call to populate the database in the thing_wrangler.py module:
import thing
from app import db
def populate_database(number_to_add):
    for each_thing in range(number_to_add):
        a_thing = thing.Thing
        a_thing.stuff_1 = 1
        a_thing.stuff_2 = 2
        a_thing.some_stuff = [1,2,3,4,5]
        db.session.add(a_thing)
    db.session.commit()
You need to create an instance of your model. Instead of a_thing = thing.Thing, it should be a_thing = thing.Thing(). Notice the parentheses. Since you overrode __init__, you also need to fix it so it takes self as the first argument.

PYMongo : Parsing|Serializing query output of a collection

By default, collection.find() and collection.find_one() return results as dictionaries. If you pass the parameter as_class=SomeUserClass, PyMongo will try to build the result as an instance of that class instead.
However, it seems this class must itself be derived from dict (it requires __setitem__ to be defined so that keys can be added to it).
Here I want to set the properties of the class instead. How can I achieve this?
Also, my collection class has some child classes as properties. How can I set the properties of those child classes as well?
It sounds like you want something like an object-relational mapper. I am the primary author of one, Ming, but there exist several others for Python as well. In Ming, you might do the following to set up your mapping:
from ming import schema, Field
from ming.orm import (mapper, Mapper, RelationProperty,
                      ForeignIdProperty)
WikiDoc = collection('wiki_page', session,
    Field('_id', schema.ObjectId()),
    Field('title', str, index=True),
    Field('text', str))
CommentDoc = collection('comment', session,
    Field('_id', schema.ObjectId()),
    Field('page_id', schema.ObjectId(), index=True),
    Field('text', str))
class WikiPage(object): pass
class Comment(object): pass
ormsession.mapper(WikiPage, WikiDoc, properties=dict(
    comments=RelationProperty('WikiComment')))
ormsession.mapper(Comment, CommentDoc, properties=dict(
    page_id=ForeignIdProperty('WikiPage'),
    page=RelationProperty('WikiPage')))
Mapper.compile_all()
Then you can query for some particular page via:
pg = WikiPage.query.get(title='MyPage')
pg.comments # loads comments via a second query from MongoDB
The various ODMs I know of for MongoDB in Python are listed below.
Ming
MongoKit
MongoEngine
I solved this by adding __setitem__ to the class.
Then I do:
result = as_class()
for key,value in dict_expr.items():
    result.__setitem__(key,value)
and in my class __setitem__ looks like this:
def __setitem__(self,key,value):
    try:
        attr = getattr(class_obj,key)
        if(attr!=None):
            if(isinstance(value,dict)):
                for child_key,child_value in value.items():
                    attr.__setitem__(child_key,child_value)
                setattr(class_obj,key,attr)
            else:
                setattr(class_obj,key,value)
    except AttributeError:
        pass
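The same idea can be sketched without PyMongo as a small recursive helper that copies a result dictionary onto an object and descends into child objects. This is only an illustrative variant, not the code above: Person, Address, populate and the sample document are all hypothetical names, and it assumes every child object is already created in the parent's constructor.

```python
class Address:
    def __init__(self):
        self.city = None
        self.zip_code = None


class Person:
    def __init__(self):
        self.name = None
        self.address = Address()  # child object exposed as a property


def populate(obj, data):
    """Copy keys of `data` onto existing attributes of `obj`,
    recursing into child objects when the value is itself a dict."""
    for key, value in data.items():
        if not hasattr(obj, key):
            continue  # ignore keys the class does not declare
        current = getattr(obj, key)
        if isinstance(value, dict) and current is not None:
            populate(current, value)  # fill in the child object
        else:
            setattr(obj, key, value)
    return obj


# A dictionary shaped like a document returned by a query
doc = {"_id": 1, "name": "Ada", "address": {"city": "London", "zip_code": "N1"}}
person = populate(Person(), doc)
print(person.name, person.address.city)
```

Keys that the class does not declare (such as _id here) are skipped, which mirrors the way the __setitem__ above silently ignores an AttributeError.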

Proxy pattern idiom

I'm a web application developer and in using SQLAlchemy I find it clumsy to do this in many of my controllers when I'm wanting a specific row from (say) the users table:
from model import dbsession # SQLAlchemy SessionMaker instance
from model import User
user = dbsession().query(User).filter_by(some_kw_args).first()
Or say I want to add a user to the table (assuming another controller):
from model import dbsession # SQLAlchemy SessionMaker instance
from model import User
user = User("someval", "anotherval", "yanv")
dbsession().add(user)
So, because of that clumsiness (I won't go into some of my other personal idioms) I didn't like having to do all of that just to add a record to the table or to get a record from the table. So I decided (after a lot of nasty hacking on SQLAlchemy and deciding I was doing too many "magical" things) this was appropriate for the proxy pattern.
I (at first) did something like this inside of the model module:
def proxy_user(delete=False, *args, **kwargs):
    session = DBSession()
    # Keyword args? Let's instantiate it...
    if (len(kwargs) > 0) and delete:
        obj = session.query(User).filter_by(**kwargs).first()
        session.delete(obj)
        return True
    elif len(kwargs) > 0:
        kwargs.update({'removed' : False})
        return session.query(User).filter_by(**kwargs).first()
    else:
        # Otherwise, let's create an empty one and add it to the session...
        obj = User()
        session.add(obj)
        return obj
I did this for all of my models (nasty duplication of code, I know) and it works quite well. I can pass in keyword arguments to the proxy function and it handles all of the session querying for me (even providing a default filter keyword for the removed flag). I can initialize an empty model object and then add data to it by updating the object attributes and all of those changes are tracked (and committed/flushed) because the object has been added to the SQLAlchemy session.
So, to reduce duplication, I put the majority of the logic into a decorator function and am now doing this:
def proxy_model(proxy):
    """Decorator for the proxy_model pattern."""
    def wrapper(delete=False, *args, **kwargs):
        model = proxy()
        session = DBSession()
        # Keyword args? Let's instantiate it...
        if (len(kwargs) > 0) and delete:
            obj = session.query(model).filter_by(**kwargs).first()
            session.delete(obj)
            return True
        elif len(kwargs) > 0:
            kwargs.update({'removed' : False})
            return session.query(model).filter_by(**kwargs).first()
        else:
            # Otherwise, let's create an empty one and add it to the session...
            obj = model()
            session.add(obj)
            return obj
    return wrapper
# The proxy_model decorator is then used like so:
@proxy_model
def proxy_user(): return User
So now, in my controllers I can do this:
from model import proxy_user
# Fetch a user
user = proxy_user(email="someemail@ex.net") # Returns a user model filtered by that email
# Creating a new user, ZopeTransaction will handle the commit so I don't do it manually
new_user = proxy_user()
new_user.email = 'anotheremail@ex.net'
new_user.password = 'just an example'
If I need to do other, more complex queries, I will usually write a function that handles it if I use it often. If it is a one-time thing, I will just import the dbsession instance and then do the "standard" SQLAlchemy ORM query.
This is much cleaner and works wonderfully but I still feel like it isn't "locked in" quite. Can anyone else (or more experienced python programmers) provide a better idiom that would achieve a similar amount of lucidity that I'm seeking while being a clearer abstraction?
You mention "didn't like having to do 'all of that'" where 'all of that' looks an awful lot like only 1 - 2 lines of code so I'm feeling that this isn't really necessary. Basically I don't really think that either statement you started with is all that verbose or confusing.
However, if I had to come up with a way to express this, I wouldn't use a decorator here, as you aren't really decorating anything. The function "proxy_user" really doesn't do anything without the decorator applied, in my opinion. Since you need to provide the name of the model somehow, I think you're better off just using a function and passing the model class to it. I also think that rolling the delete functionality into your proxy is out of place, and depending on how you've configured your Session, the repeated calls to DBSession() may be creating new, unrelated sessions, which will cause problems if you need to work with multiple objects in the same transaction.
Anyway, here's a quick stab at how I would refactor your decorator into a pair of functions:
def find_or_add(model, session, **kwargs):
    if len(kwargs) > 0:
        obj = session.query(model).filter_by(**kwargs).first()
        if not obj:
            obj = model(**kwargs)
            session.add(obj)
    else:
        # Otherwise, let's create an empty one and add it to the session...
        obj = model()
        session.add(obj)
    return obj
def find_and_delete(model, session, **kwargs):
    deleted = False
    obj = session.query(model).filter_by(**kwargs).first()
    if obj:
        session.delete(obj)
        deleted = True
    return deleted
Again, I'm not convinced this is necessary but I think I can agree that:
user = find_or_add(User, mysession, email="bob@localhost.com")
Is perhaps nicer looking than the straight SQLAlchemy code necessary to find / create a user and add them to session.
I like the above functions better than your current decorator approach because:
The names clearly denote what your intent is here, where I feel proxy_user doesn't really make it clear that you want a user object if it exists otherwise you want to create it.
The session is managed explicitly
They don't require me to wrap every model in a decorator
The find_or_add function always returns an instance of model instead of sometimes returning True, a query result set, or a model instance.
The find_and_delete function always returns a boolean indicating whether or not it was able to find and delete the record specified in kwargs.
Of course you might consider using a class decorator to add these functions as methods on your model classes, or perhaps deriving your models from a base class that includes this functionality so that you can do something like:
# let's add a classmethod to User or its base class:
class User(...):
    ...
    @classmethod
    def find_or_add(cls, session, **kwargs):
        if len(kwargs) > 0:
            obj = session.query(cls).filter_by(**kwargs).first()
            if not obj:
                obj = cls(**kwargs)
                session.add(obj)
        else:
            # Otherwise, let's create an empty one and add it to the session...
            obj = cls()
            session.add(obj)
        return obj
...
user = User.find_or_add(session, email="someone@tld.com")

Equivalent of objects.latest() in App Engine

What would be the best way to get the latest inserted object using AppEngine ?
I know in Django this can be done using
MyObject.objects.latest()
in AppEngine I'd like to be able to do this
class MyObject(db.Model):
    time = db.DateTimeProperty(auto_now_add=True)
# Return latest entry from MyObject.
MyObject.all().latest()
Any idea ?
Your best bet will be to implement a latest() classmethod directly on MyObject and call it like
latest = MyObject.latest()
Anything else would require monkeypatching the built-in Query class.
Update
I thought I'd see how ugly it would be to implement this functionality. Here's a mixin class you can use if you really want to be able to call MyObject.all().latest():
class LatestMixin(object):
    """A mixin for db.Model objects that will add a `latest` method to the
    `Query` object returned by cls.all(). Requires that the ORDER_FIELD
    contain the name of the field by which to order the query to determine the
    latest object."""
    # What field do we order by?
    ORDER_FIELD = None
    @classmethod
    def all(cls):
        # Get the real query
        q = super(LatestMixin, cls).all()
        # Define our custom latest method
        def latest():
            if cls.ORDER_FIELD is None:
                raise ValueError('ORDER_FIELD must be defined')
            return q.order('-' + cls.ORDER_FIELD).get()
        # Attach it to the query
        q.latest = latest
        return q
# How to use it
class Foo(LatestMixin, db.Model):
    ORDER_FIELD = 'timestamp'
    timestamp = db.DateTimeProperty(auto_now_add=True)
latest = Foo.all().latest()
MyObject.all() returns an instance of the Query class
Order the results by time:
MyObject.all().order('-time')
So, assuming there is at least one entry, you can get the most recent MyObject directly by:
MyObject.all().order('-time')[0]
or
MyObject.all().order('-time').fetch(limit=1)[0]
