Completely restart/reload declarative class with dynamic functionality in SQLAlchemy - python

I am using SQLAlchemy + SQLite3 for creating multiple databases based on user input. When initializing a new database, the user defines any number of arbitrary features and their types. I wrote a DBManager class to serve as an interface between user input and database creation/access.
Dynamically "injecting" these arbitrary features in the declarative model (the Features class) is working as expected. The problem I have is when the user wants to create a second/different database: I can't figure out how to completely "clear" or "refresh" the model or the declarative_base so that the user is able to create a new database (with possibly different features).
Below is a minimal reproducible example of my situation:
src/__init__.py:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
Session = sessionmaker()
Base = declarative_base()
src/features.py:
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import relationship
from src import Base

class Features(Base):
    __tablename__ = "features"
    features_id = Column(Integer, primary_key=True)

    @classmethod
    def add_feature(cls, feature_name, feature_type):
        setattr(cls, feature_name, Column(feature_type))
src/db_manager.py:
from typing import Optional, Dict
from sqlalchemy import create_engine
from src import Base, Session
from src.features import Features

class DBManager:
    def __init__(self, path: str, features: Optional[Dict] = None) -> None:
        self.engine = create_engine(f'sqlite:///{path}')
        Session.configure(bind=self.engine)
        self.session = Session()
        self.features = features
        if self.features:  # user passed in some arbitrary features
            self.bind_features_to_features_table()
        Base.metadata.create_all(bind=self.engine)

    def bind_features_to_features_table(self):
        for feature_name, feature_type in self.features.items():
            Features.add_feature(feature_name=feature_name, feature_type=feature_type)
I'd like to be able to do something like this:
from sqlalchemy import String, Float, Integer
from src.db_manager import DBManager

# User wants to create a database with these features
features = {
    'name': String,
    'height': Float,
}
db_manager = DBManager(path='my_database.db', features=features)

# ... User does some stuff with database here ...

# Now the user wants to create another database with these features
other_features = {
    'age': Integer,
    'weight': Float,
    'city_of_residence': String,
    'name': String,
}
db_manager = DBManager(path='another_database.db', features=other_features)
After executing the last line, I'm met with: InvalidRequestError: Implicitly combining column features.name with column features.name under attribute 'name'. Please configure one or more attributes for these same-named columns explicitly. The error wouldn't occur if the feature 'name' did not appear in both databases, but then the feature 'height' would be carried over into the second database, which is not desired.
Things I tried but didn't work:
call Base.metadata.clear() between DBManager instances: same error
call sqlalchemy.orm.clear_mappers() between DBManager instances: results in AttributeError: 'NoneType' object has no attribute 'instrument_attribute'
call delattr(Features, feature_name): results in NotImplementedError: Can't un-map individual mapped attributes on a mapped class..
This program will be running inside a GUI, so I can't really afford to exit/restart the script in order to connect to the second database. The user should be able to load/create different databases without having to close the program.
I understand that the error stems from the fact that the underlying Base object has not been "refreshed" and is still keeping track of the features created in my first DBManager instance. However I do not know how to fix this. What's worse, any attempt to overwrite/reload a new Base object will need to be applied to all modules that imported that object from __init__.py, which sounds tricky. Does anyone have a solution for this?

My solution was to define the Features declarative class inside a function, get_features, that takes a declarative base class as an argument. The function returns the Features class object, so every call essentially creates a brand-new Features class.
The DBManager class is then responsible for calling that function, and the resulting Features class becomes an instance attribute of DBManager. Creating a new instance of DBManager therefore means creating an entirely new Features class, to which I can then add any arbitrary features I'd like.
The code looks something like this:
from sqlalchemy import create_engine, Column, Integer
from sqlalchemy.ext.declarative import declarative_base
from src import Session

def get_features(base):
    class Features(base):
        __tablename__ = "features"
        features_id = Column(Integer, primary_key=True)

        @classmethod
        def add_feature(cls, feature_name, feature_type):
            setattr(cls, feature_name, Column(feature_type))
    return Features

class DBManager:
    def __init__(self, path, features=None):
        self.engine = create_engine(f'sqlite:///{path}')
        Session.configure(bind=self.engine)
        self.session = Session()
        self.features = features
        self.base = declarative_base()  # fresh declarative base per manager
        self.features_table = get_features(base=self.base)
        if self.features:  # user passed in some arbitrary features
            self.bind_features_to_features_table()
        self.base.metadata.create_all(bind=self.engine)

    def bind_features_to_features_table(self):
        for feature_name, feature_type in self.features.items():
            self.features_table.add_feature(feature_name=feature_name, feature_type=feature_type)
It definitely feels a bit convoluted, and I have no idea if there are any caveats I'm not aware of, but as far as I can tell this approach solved my problem.
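As a quick sanity check, here is a hedged sketch of how the factory approach behaves (the assertions and paths are illustrative, not part of the original program):
from sqlalchemy import String, Float, Integer
from src.db_manager import DBManager

db1 = DBManager(path='my_database.db', features={'name': String, 'height': Float})
db2 = DBManager(path='another_database.db', features={'name': String, 'age': Integer})

# Each manager owns its own declarative base and its own Features class,
# so the repeated 'name' feature no longer collides across databases...
assert db1.features_table is not db2.features_table
# ...and db1's 'height' feature does not leak into db2.
assert not hasattr(db2.features_table, 'height')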

Related

Why is SQLAlchemy Postgres ORM requiring __init__(self) for a Declarative Base?

Setup: Postgres 13, Python 3.7, SQLAlchemy 1.4
The current structure uses base.py to create an engine, a scoped session, and an augmented Base. All of this gets imported in models.py, where we define a table class, and again in insert.py, where we test inserting new values into the database using the ORM. All of this is working well.
My question is regarding the def __init__(self) function in the models.py table classes. Without this function the code errors with TypeError: __init__() takes 1 positional argument but 5 were given.
As soon as I include the __init__ function, the code works properly.
I am perplexed as to why this error is produced, given that all the models are defined via the declarative system, which means our class should get an __init__() constructor that automatically accepts keyword names matching the columns we've mapped.
I suspect there is an error in how I am coding the interactions between db_session = scoped_session(...), Base = declarative_base(cls=Base, metadata=metadata_obj), and the way Base is being passed in class NumLimit(Base).
I can't quite work this out and would appreciate being directed to where I am creating this error. Thank you!
base.py
from sqlalchemy import Column, create_engine, Integer, MetaData
from sqlalchemy.orm import declared_attr, declarative_base, scoped_session, sessionmaker

engine = create_engine('postgresql://user:pass@localhost:5432/dev', echo=True)

db_session = scoped_session(
    sessionmaker(
        bind=engine,
        autocommit=False,
        autoflush=False
    )
)

# Augment the base class by using the cls argument of the declarative_base() function so all
# classes derived from Base will have a table name derived from the class name and an id
# primary key column.
class Base:
    @declared_attr
    def __tablename__(cls):
        return cls.__name__.lower()

    id = Column(Integer, primary_key=True)

# Write all tables to schema 'collect'
metadata_obj = MetaData(schema='collect')

# Instantiate a Base class for our classes definitions
Base = declarative_base(cls=Base, metadata=metadata_obj)
models.py
from base import Base
from sqlalchemy import Column, DateTime, Integer, Text
from sqlalchemy.dialects.postgresql import UUID
import uuid

class NumLimit(Base):
    org = Column(UUID(as_uuid=True), default=uuid.uuid4, unique=True)
    limits = Column(Integer)
    limits_rate = Column(Integer)
    rate_use = Column(Integer)

    def __init__(self, org, limits, limits_rate, rate_use):
        super().__init__()
        self.org = org
        self.limits = limits
        self.limits_rate = limits_rate
        self.rate_use = rate_use

    def __repr__(self):
        return f'<NumLimit(org={self.org}, limits={self.limits}, limits_rate={self.limits_rate},' \
               f' rate_use={self.rate_use})>'
insert.py
from base import Base, db_session, engine
from models import NumLimit

def insert_num_limit():
    # Generate database schema based on definitions in models.py
    Base.metadata.create_all(bind=engine)

    # Create instances of the NumLimit class
    a_num_limit = NumLimit('123e4567-e89b-12d3-a456-426614174000', 20, 4, 8)
    another_limit = NumLimit('123e4567-e89b-12d3-a456-426614174660', 7, 2, 99)

    # Use the current session to persist data
    db_session.add_all([a_num_limit, another_limit])

    # Commit current session to database and close session
    db_session.commit()
    db_session.close()
All parameters of the __init__() generated by SQLAlchemy are keyword-only. It is as if the definition were:
def __init__(self, *, id=None, org=None, limits=None, limits_rate=None, rate_use=None):
    self.id = id
    self.org = org
    # etc...
So when you try to supply the arguments positionally, NumLimit('123e4567-e89b-12d3-a456-426614174000', 20, 4, 8), you get the error TypeError: __init__() takes 1 positional argument but 5 were given, because __init__() indeed only takes one positional argument (self). If you instead supply them as keyword arguments, NumLimit(org='123e4567-e89b-12d3-a456-426614174000', limits=20, limits_rate=4, rate_use=8), everything works.
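If positional construction is genuinely wanted, one option (a hedged sketch of mine, not part of the original answer) is an explicit __init__ whose positional parameters match the mapped columns and are forwarded to the generated constructor as keywords:
class NumLimit(Base):
    org = Column(UUID(as_uuid=True), default=uuid.uuid4, unique=True)
    limits = Column(Integer)
    limits_rate = Column(Integer)
    rate_use = Column(Integer)

    def __init__(self, org, limits, limits_rate, rate_use):
        # Forward to the declarative constructor, which assigns each
        # keyword to the mapped attribute of the same name.
        super().__init__(org=org, limits=limits,
                         limits_rate=limits_rate, rate_use=rate_use)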

Python SQLalchemy access huge DB data without creating models

I am using Flask (Python) and SQLAlchemy to connect to a huge DB in which a lot of stats are saved. I need to create some useful insights from these stats, so I only need to read/get the data and never modify it.
The issue I have now is the following:
Before I can access a table I need to replicate the table in my models file. For example I see the table Login_Data in the DB. So I go into my models and recreate the exact same table.
class Login_Data(Base):
    __tablename__ = 'login_data'
    id = Column(Integer, primary_key=True)
    date = Column(Date, nullable=False)
    new_users = Column(Integer, nullable=True)

    def __init__(self, date=None, new_users=None):
        self.date = date
        self.new_users = new_users

    def get(self, id):
        if self.id == id:
            return self
        else:
            return None

    def __repr__(self):
        return '<%s(%r, %r, %r)>' % (self.__class__.__name__, self.id, self.date, self.new_users)
I do this because otherwise I can't query it using:
some_data = Login_Data.query.limit(10)
But this feels unnecessary; there must be a better way. What's the point in recreating the models if they are already defined? What should I use here:
some_data = [SOMETHING HERE SO I DON'T NEED TO RECREATE THE TABLE].query.limit(10)
Simple question but I have not found a solution yet.
Thanks to Tryph for the right sources.
To access the data of an existing DB with SQLAlchemy you need to use automap. In your configuration file, where you load/declare your DB type, use automap_base(). After that you can create your models and use the correct table names of the DB without specifying everything yourself:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
import stats_config
Base = automap_base()
engine = create_engine(stats_config.DB_URI, convert_unicode=True)
# reflect the tables
Base.prepare(engine, reflect=True)
# mapped classes are now created with names by default
# matching that of the table name.
LoginData = Base.classes.login_data
db_session = Session(engine)
After this is done you can now use all your known sqlalchemy functions on:
some_data = db_session.query(LoginData).limit(10)
You may be interested in reflection and automap.
Unfortunately, since I have never used either of those features, I am not able to tell you more about them. I just know that they allow you to use the database schema without explicitly declaring it in Python.
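If only a handful of tables are needed, plain Core reflection is a lighter-weight alternative to automap. Here is a minimal sketch in SQLAlchemy 1.4+ style, reusing the stats_config.DB_URI setting from the answer above:
from sqlalchemy import create_engine, MetaData, Table, select
import stats_config

engine = create_engine(stats_config.DB_URI)
metadata = MetaData()

# Build a Table object straight from the live database schema
login_data = Table('login_data', metadata, autoload_with=engine)

with engine.connect() as conn:
    # Core equivalent of the ORM query above, with no mapped class at all
    rows = conn.execute(select(login_data).limit(10)).fetchall()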

Cornice schema validation with colanderalchemy

Cornice's documentation mentions how to validate your schema using a colander's MappingSchema subclass. How should we use a colanderalchemy schema for the same purpose? Because if we create a schema using colanderalchemy as stated in the documentation, the schema object has already instantiated the colander's class, and I think that this results in an error.
To be more precise, here is my sample code:
from sqlalchemy.ext.declarative import declarative_base
from cornice.resource import resource, view
from colanderalchemy import SQLAlchemySchemaNode
from sqlalchemy import (
    Column,
    Integer,
    Unicode,
)

Base = declarative_base()

'''
SQLAlchemy part
'''
class DBTable(Base):
    __tablename__ = 'mytable'
    id = Column(Integer, primary_key=True,
                info={'colanderalchemy': {'exclude': True}})
    name = Column(Unicode(70), nullable=False)
    description = Column(Unicode(256))

'''
ColanderAlchemy part
'''
ClndrTable = SQLAlchemySchemaNode(DBTable)

'''
Cornice part
'''
PRF = 'api'

@resource(collection_path='%s/' % PRF, path='%s/{fid}' % PRF)
class TableApi(object):
    def __init__(self, request):
        self.request = request

    @view(schema=ClndrTable, renderer='json')
    def put(self):
        # do my stuff here
        pass
Where ClndrTable is my auto-generated schema. Now, when trying to deploy this code, I get the following error:
NotImplementedError: Schema node construction without a typ argument or a schema_type() callable present on the node class
As I've mentioned earlier, I am suspecting that the problem is that ClndrTable (given as an argument to the view decorator) is an instantiation of the automatically generated schema by colanderalchemy.
Anyone knowing how to resolve this?
Thanks all in advance!
This appears to be due to the issue of colander having both a typ property and a schema_type property. They're both supposed to tell you the schema's type, but they can actually be different values. I filed an issue with colander, but if there's a fix it'll likely not make it to pypi any time soon.
So what's happening is: ColanderAlchemy ignores schema_type and uses typ, while Cornice ignores typ and uses schema_type.
You can hack a fix with the following: ClndrTable.schema_type = lambda: ClndrTable.typ
However, that just leads you to the next exception:
cornice.schemas.SchemaError: schema is not a MappingSchema: <class 'colanderalchemy.schema.SQLAlchemySchemaNode'>
This is due to Cornice not duck typing but expecting all Schema to be a subclass of MappingSchema. However, MappingSchema is just a Schema with typ/schema_type being Mapping (which is what ColanderAlchemy returns).
I'll see if I can enact some changes to fix this.
Update
Despite the names, 'typ' and 'schema_type' have two different purposes. 'typ' always tells you the type of a schema instance. 'schema_type' is a method that's called to give a SchemaNode a default type when it's instantiated (so it's called in the __init__ if you don't pass a typ in, but other than that it's not supposed to be used).
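To make the distinction concrete, here is a tiny colander-only sketch (my illustration, not from Cornice or ColanderAlchemy):
import colander

# typ is fixed on the instance at construction time:
node = colander.SchemaNode(colander.String())
print(type(node.typ))   # <class 'colander.String'>

# schema_type is only a factory that __init__ consults when no typ is passed:
class MyMapping(colander.SchemaNode):
    schema_type = colander.Mapping

node2 = MyMapping()     # __init__ calls schema_type() to build the default typ
print(type(node2.typ))  # <class 'colander.Mapping'>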
Cornice has been patched to properly use typ now (though, as of this message, it's not part of the latest release).

Flask + SQLAlchemy - custom metaclass to modify column setters (dynamic hybrid_property)

I have an existing, working Flask app that uses SQLAlchemy. Several of the models/tables in this app have columns that store raw HTML, and I'd like to inject a function on a column's setter so that the incoming raw html gets 'cleansed'. I want to do this in the model so I don't have to sprinkle "clean this data" all through the form or route code.
I can currently already do this like so:
from application import db, clean_the_data
from sqlalchemy.ext.hybrid import hybrid_property

class Example(db.Model):
    __tablename__ = 'example'
    normal_column = db.Column(db.Integer,
                              primary_key=True,
                              autoincrement=True)
    _html_column = db.Column('html_column', db.Text,
                             nullable=False)

    @hybrid_property
    def html_column(self):
        return self._html_column

    @html_column.setter
    def html_column(self, value):
        self._html_column = clean_the_data(value)
This works like a charm - outside of the model definition the _html_column name is never seen, the cleaner function is called, and the cleaned data is used. Hooray.
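For instance, a quick hedged sketch of the observable behavior (names from the snippet above):
e = Example()
e.html_column = '<div onload="evil()">hi</div>'
# The setter has already run clean_the_data(); the underscored column
# now holds the cleansed markup.
print(e._html_column)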
I could of course stop there and just eat the ugly handling of the columns, but why do that when you can mess with metaclasses?
Note: the following all assumes that 'application' is the main Flask module, and that it contains two children: 'db' - the SQLAlchemy handle and 'clean_the_data', the function to clean up the incoming HTML.
So, I went about trying to make a new base Model class that spotted a column that needs cleaning when the class is being created, and juggled things around automatically, so that instead of the above code, you could do something like this:
from application import db

class Example(db.Model):
    __tablename__ = 'example'
    __html_columns__ = ['html_column']  # Our oh-so-subtle hint
    normal_column = db.Column(db.Integer,
                              primary_key=True,
                              autoincrement=True)
    html_column = db.Column(db.Text,
                            nullable=False)
Of course, the combination of trickery with metaclasses going on behind the scenes with SQLAlchemy and Flask made this less than straight-forward (and is also why the nearly matching question "Custom metaclass to create hybrid properties in SQLAlchemy" doesn't quite help - Flask gets in the way too). I've almost gotten there with the following in application/models/__init__.py:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property

# Yes, I'm importing _X stuff...I tried other ways to avoid this
# but to no avail
from flask_sqlalchemy import (Model as BaseModel,
                              _BoundDeclarativeMeta,
                              _QueryProperty)

from application import db, clean_the_data

class _HTMLBoundDeclarativeMeta(_BoundDeclarativeMeta):
    def __new__(cls, name, bases, d):
        # Move any fields named in __html_columns__ to a
        # _field/field pair with a hybrid_property
        if '__html_columns__' in d:
            for field in d['__html_columns__']:
                if field not in d:
                    continue
                hidden = '_' + field
                fget = lambda self: getattr(self, hidden)
                fset = lambda self, value: setattr(self, hidden,
                                                   clean_the_data(value))
                d[hidden] = d[field]    # clobber...
                d[hidden].name = field  # So we don't have to explicitly
                                        # name the column. Should probably
                                        # force a quote on the name too
                d[field] = hybrid_property(fget, fset)
            del d['__html_columns__']   # Not needed any more
        return _BoundDeclarativeMeta.__new__(cls, name, bases, d)

# The following copied from how flask_sqlalchemy creates its Model
Model = declarative_base(cls=BaseModel, name='Model',
                         metaclass=_HTMLBoundDeclarativeMeta)
Model.query = _QueryProperty(db)

# Need to replace the original Model in flask_sqlalchemy, otherwise it
# uses the old one, while you use the new one, and tables aren't
# shared between them
db.Model = Model
Once that's set, your model class can look like:
from application import db
from application.models import Model

class Example(Model):  # Or db.Model really, since it's been replaced
    __tablename__ = 'example'
    __html_columns__ = ['html_column']  # Our oh-so-subtle hint
    normal_column = db.Column(db.Integer,
                              primary_key=True,
                              autoincrement=True)
    html_column = db.Column(db.Text,
                            nullable=False)
This almost works, in that there's no errors, data is read and saved correctly, etc. Except the setter for the hybrid_property is never called. The getter is (I've confirmed with print statements in both), but the setter is ignored totally and the cleaner function is thus never called. The data is set though - changes are made quite happily with the un-cleaned data.
Obviously I've not quite completely emulated the static version of the code in my dynamic version, but I honestly have no idea where the issue is. As far as I can see, the hybrid_property should be registering the setter just like it has the getter, but it's just not. In the static version, the setter is registered and used just fine.
Any ideas on how to get that final step working?
Maybe use a custom type?
from sqlalchemy import TypeDecorator, Text

class CleanedHtml(TypeDecorator):
    impl = Text

    def process_bind_param(self, value, dialect):
        return clean_the_data(value)
Then you can just write your models this way:
class Example(db.Model):
    __tablename__ = 'example'
    normal_column = db.Column(db.Integer, primary_key=True, autoincrement=True)
    html_column = db.Column(CleanedHtml)
More explanations are available in the documentation here: http://docs.sqlalchemy.org/en/latest/core/custom_types.html#augmenting-existing-types
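One design note worth adding (my observation, not from the original answer): with a TypeDecorator, process_bind_param() runs when the value is bound to an INSERT/UPDATE at flush time, not when the attribute is assigned, so the in-memory instance still holds the raw HTML until the session flushes. A quick sketch of that behavior, using the names from the answer above:
example = Example(html_column='<p onclick="bad()">hello</p>')
# example.html_column is still the raw string at this point
db.session.add(example)
db.session.commit()  # clean_the_data() runs here, via process_bind_param()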

Is it possible to unload declarative classes in SQLAlchemy?

I’m working on a library where the user shall be able to simply declare a few classes which are automatically backed by the database. In short, somewhere hidden in the code, there is
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class LibraryBase(Base):
    # important library stuff
    ...

and the user should then do

class MyStuff(LibraryBase):
    # important personal stuff
    ...

class MyStuff_2(LibraryBase):
    # important personal stuff
    ...

mystuff = MyStuff()
Library.register(mystuff)
mystuff.changeIt()    # apply some changes to the instance
Library.save(mystuff) # and save it
# same for all other classes
In a static environment, e.g. the user has created one file with all personal classes and imports this file, this works pretty well. All class names are fixed and SQLAlchemy knows how to map each class.
In an interactive environment, things are different: Now, there is a chance of a class being defined twice. Both classes might have different modules; but still SQLAlchemy will complain:
SAWarning: The classname 'MyStuff' is already in the registry of this declarative base, mapped to <class 'OtherModule.MyStuff'>
Is there a way to deal with this? Can I somehow unload a class from its declarative_base so that I can exchange its definition with a new one?
You can use:
sqlalchemy.orm.instrumentation.unregister_class(cl)
del cl._decl_class_registry[cl.__name__]
The first line prevents accidental use of your unregistered class. The second removes it from the declarative registry, which prevents the warning.
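For context, a hedged sketch of how those two lines might be used when redefining a class (this relies on private internals of older SQLAlchemy releases; _decl_class_registry was removed in 1.4, so treat it as version-specific):
import sqlalchemy.orm.instrumentation as instrumentation

instrumentation.unregister_class(MyStuff)
del MyStuff._decl_class_registry['MyStuff']

class MyStuff(LibraryBase):  # redefining no longer triggers the SAWarning
    # new personal stuff
    ...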
It looks like (and I'm not really sure this even works) what you want is
sqlalchemy.orm.instrumentation.unregister_class()
http://hg.sqlalchemy.org/sqlalchemy/file/762548ff8eef/lib/sqlalchemy/orm/instrumentation.py#l466
In my project I use this solution, where the library-specific columns are defined in a mixin via declared_attr, and the target mapper is created by a type() call with the right bases. As a result I get a fully functional mapper.
from sqlalchemy import create_engine, BigInteger, Column
from sqlalchemy.orm import sessionmaker, scoped_session
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.declarative import declared_attr

Base = declarative_base()

class LibraryBase(object):
    __tablename__ = 'model'

    @declared_attr
    def library_field(self):
        return Column(BigInteger)

class MyLibrary(object):
    @classmethod
    def register(cls, entity):
        tablename = entity.__tablename__
        Mapper = type('Entity_%s' % tablename, (Base, LibraryBase, entity), {
            '__tablename__': tablename,
            'id': Column(BigInteger, primary_key=True),
        })
        return Mapper

    @classmethod
    def setup(cls):
        Base.metadata.create_all()

class MyStaff(object):
    __tablename__ = 'sometable1'

    @declared_attr
    def staff_field(self):
        return Column(BigInteger)

    def mymethod(self):
        print('My method:', self)

class MyStaff2(MyStaff):
    __tablename__ = 'sometable2'

if __name__ == '__main__':
    engine = create_engine('sqlite://', echo=True)
    Base.metadata.bind = engine
    Session = scoped_session(sessionmaker(bind=engine))
    session = Session()

    # register and install
    MyStaffMapper = MyLibrary.register(MyStaff)
    MyStaffMapper2 = MyLibrary.register(MyStaff2)
    MyLibrary.setup()

    MyStaffMapper().mymethod()
    MyStaffMapper2().mymethod()

    session.query(MyStaffMapper.library_field) \
        .filter(MyStaffMapper.staff_field != None) \
        .all()
