How can I define a column as a positive integer using Flask-SQLAlchemy?
I am hoping the answer would look something like this:
class City(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    population = db.Column(db.Integer, positive=True)

    def __init__(self, population):
        self.population = population
However, this class definition throws an error because SQLAlchemy does not know about a 'positive' argument.
I could raise an exception if an object is instantiated with a negative value for the population, but I don't know how to ensure that the population remains positive after an update.
Thanks for any help.
Unfortunately, on the Python side, SQLAlchemy does its best to stay out of the way; there's no 'special SQLAlchemy' way to express that an instance attribute must satisfy some constraint:
>>> class Foo(Base):
...     __tablename__ = 'foo'
...     id = Column(Integer, primary_key=True)
...     bar = Column(Integer)
...
>>> f = Foo()
>>> f.bar = "not a number!"
>>> f.bar
'not a number!'
If you tried to commit this object, SQLAlchemy would complain, because it doesn't know how to render the supplied Python value as SQL for the column type Integer.
If that's not what you're looking for, and you just want to make sure that bad data doesn't reach the database, then you need a check constraint.
class Foo(Base):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    bar = Column(Integer)

    __table_args__ = (
        CheckConstraint(bar >= 0, name='check_bar_positive'),
        {})
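For completeness, the same constraint carries over to the Flask-SQLAlchemy model from the question. A minimal sketch, assuming the usual db = SQLAlchemy() instance (which also proxies CheckConstraint as db.CheckConstraint):
class City(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    population = db.Column(db.Integer)

    __table_args__ = (
        # database-level guard: rejects negative populations on INSERT and UPDATE
        db.CheckConstraint('population >= 0', name='check_population_positive'),
    )
Because the check lives in the database, it also covers the update case from the question: an UPDATE that would make population negative fails with an IntegrityError.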
I know this is old, but for what it's worth, my approach was to use marshmallow (a de/serialization and data validation library) to validate input data.
Create the schema for your model as such:
from marshmallow import validate, fields, Schema
...
class CitySchema(Schema):
    population = fields.Integer(validate=validate.Range(min=0, max=<your max value>))
Then use your schema to serialize/deserialize the data when appropriate:
...
city_data = {...} # your city's data (dict)
city_schema = CitySchema()
deserialized_city, validation_errors = city_schema.load(city_data) # validation done at deserialization
...
The advantage of using a de/serialization library is that you can enforce all your data integrity rules in one place.
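Note that the two-value return shown above is marshmallow 2.x behaviour; in marshmallow 3.x, load() returns only the deserialized data and raises ValidationError on bad input. A sketch of the 3.x style, reusing city_schema and city_data from above:
from marshmallow import ValidationError

try:
    deserialized_city = city_schema.load(city_data)  # raises on invalid input
except ValidationError as err:
    print(err.messages)  # e.g. {'population': ['Must be greater than or equal to 0.']}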
Related
I'm wondering what's the proper way of retrieving data from the model. Let's take the classes below as an example:
class A(db.Model):
    def get_attributes(self):
        return self.product_category.attributes

class Attribute(db.Model):
    attribute_id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    label = db.Column(db.String(255), nullable=False)
Let's say that calling get_attributes() returns three objects of the Attribute class.
In my route, I only want to receive the list of attribute labels. What I'm currently doing is looping through the objects and retrieving the label property like this:
labels = [i.label for i in obj.get_attributes()]
I don't think this is the proper way of doing it. Is there a better way to achieve this?
You should use a relationship between A and Attribute.
For example:
class A(db.Model):
    # ...
    # table primary key, columns etc.
    # ...
    attributes = db.relationship(
        "Attribute",
        backref="a",
        lazy=True
    )

class Attribute(db.Model):
    # ...
    # table primary key, columns etc.
    # ...
    a_id = db.Column(db.Integer, db.ForeignKey('a.id'))
After that, you can access a given A instance's attributes with a.attributes. To add an attribute to a, just call a.attributes.append(attribute).
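With the relationship in place, the label list from the original route can be built straight from the loaded objects, or, if you'd rather not load full Attribute instances at all, as a column-only query. A sketch, assuming the usual db.session:
# from the relationship (loads the Attribute objects):
labels = [attr.label for attr in a.attributes]

# column-only alternative (fetches just the label values):
labels = [label for (label,) in
          db.session.query(Attribute.label).filter_by(a_id=a.id)]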
I have the following models defined:
class Attribute(Base):
    __tablename__ = "attributes"

    id = Column(BigInteger, primary_key=True, index=True)
    data_id = Column(BigInteger, ForeignKey("data.art_no"))
    name = Column(VARCHAR(500), index=True)

    data = relationship("Data", back_populates="attributes_rel")

class Data(Base):
    __tablename__ = "data"

    art_no = Column(BigInteger, primary_key=True, index=True)
    multiplier = Column(Float)

    attributes_rel = relationship("Attribute", back_populates="data", cascade="all, delete, delete-orphan")

    @property
    def attributes(self):
        return [attribute.name for attribute in self.attributes_rel]
If I query for Data rows, I get these rows (showing only the attributes property):
#1 ['attributeX', 'attributeY']
#2 ['attributeZ']
#3 ['attributeX', 'attributeZ']
I want to do the following now:
If I have the list ['attributeX'], I want to query my data and get back only the Data rows that have the 'attributeX' attribute.
If I have the list ['attributeX', 'attributeZ'], I want to get back only the Data rows that have both the 'attributeX' AND the 'attributeZ' attribute.
How can I write these queries?
I tried .filter(models.Data.attributes_rel.any(models.Attribute.name.in_(attributesList))), but that returns all rows that have any of the attributes from attributesList ... I only want the models.Data rows that have exactly the attributes from the list (or others too, but at least the ones from the list).
A visual example of my issue:
These three attributes are associated with Data rows. I have set attributeList=['Test2','Test3'], but the last row is also returned because it has the attribute Test3. It should not be returned, because it does not have Test2. Any idea?
in_ will basically match when any of the attributes is present, whereas what you want is for ALL the attributes to be present.
To achieve this, just add a filter for each attribute name separately:
q = session.query(Data)
for attr in attributesList:
    q = q.filter(Data.attributes_rel.any(Attribute.name == attr))
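If you prefer a single round trip, an equivalent aggregate formulation is also possible: join to Attribute, keep only the requested names, and require that every name matched. A sketch, assuming the names in attributesList are distinct:
from sqlalchemy import func

q = (
    session.query(Data)
    .join(Data.attributes_rel)
    .filter(Attribute.name.in_(attributesList))
    .group_by(Data.art_no)
    # a row qualifies only if all requested names were found for it
    .having(func.count(Attribute.name.distinct()) == len(attributesList))
)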
How can I automatically truncate string values in a data model across many attributes, without explicitly defining a @validates method for each one?
My current code:
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import validates
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class MyModel(Base):
    __tablename__ = 'my_model'

    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String(40), nullable=False, unique=True)

    # I can "force" truncation in my model using "validates".
    # I'd prefer not to use this solution, though...
    @validates('name')
    def validate_code(self, key, value):
        max_len = getattr(self.__class__, key).prop.columns[0].type.length
        if value and len(value) > max_len:
            value = value[:max_len]
        return value
My concern is that my ORM will span many tables and fields, and there's a high risk of oversight if every attribute must be included in string-length validation by hand. In simpler words, I need a solution that will scale. Ideally, something in my session configuration would automatically truncate strings that are too long...
You could create a customised String type that automatically truncates its value on insert.
import sqlalchemy.types as types

class LimitedLengthString(types.TypeDecorator):
    impl = types.String

    def process_bind_param(self, value, dialect):
        # Truncate on the way to the database; leave NULLs untouched.
        if value is None:
            return value
        return value[:self.impl.length]

    def copy(self, **kwargs):
        return LimitedLengthString(self.impl.length)
class MyModel(Base):
    __tablename__ = 'my_model'

    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(LimitedLengthString(40), nullable=False, unique=True)
The extended type will still create VARCHAR(40) in the database, so it should be possible to replace String(40) with LimitedLengthString(40)* in your code without a database migration.
* You might want to choose a shorter name.
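A quick way to convince yourself of the behaviour (a sketch, assuming an engine and session are already configured):
m = MyModel(name='x' * 100)
session.add(m)
session.commit()      # the value is truncated at bind time, on its way to the database
session.expire_all()  # force a reload so we read the stored value back
print(len(session.query(MyModel).first().name))  # 40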
Problem: Simply put, I am trying to redefine a SQLAlchemy ORM table's primary key after it has already been defined.
Example:
class Base:
    @declared_attr
    def __tablename__(cls):
        return f"{cls.__name__}"

    @declared_attr
    def id(cls):
        return Column(Integer, cls.seq, unique=True,
                      autoincrement=True, primary_key=True)

Base = declarative_base(cls=Base)
class A_Table(Base):
    newPrimaryKeyColumnsDerivedFromAnotherFunction = []
    # Please note: as the variable name tries to say,
    # these columns are auto-generated and not known until after all
    # ORM classes (models) are defined

# OTHER CLASSES

def changePriKeyFunc(model):
    pass  # DO STUFF

# Then do
Base.metadata.create_all(bind=arbitraryEngine)
# After everything has been altered and tied into a little bow
*Please note, this is a simplification of the true problem I am trying to solve.
Possible Solution: Your first thought might have been to do something like this:
def possibleSolution(model):
    for priCol in model.__table__.primary_key:
        priCol.primary_key = False
    model.__table__.primary_key = PrimaryKeyConstraint(
        *model.newPrimaryKeyColumnsDerivedFromAnotherFunction,
        # TODO: ADD all the columns that are in the model that are also a primary key
        # *[col for col in model.__table__.c if col.primary_key]
    )
But this doesn't work, because when trying to add, flush, and commit, an error gets thrown:
InvalidRequestError: Instance <B_Table at 0x104aa1d68> cannot be refreshed -
it's not persistent and does not contain a full primary key.
Even though this:
In [2]: B_Table.__table__.primary_key
Out[2]: PrimaryKeyConstraint(Column('a_TableId', Integer(),
            ForeignKey('A_Table.id'), table=<B_Table>,
            primary_key=True, nullable=False))
as well as this:
In [3]: B_Table.__table__
Out[3]: Table('B_Table', MetaData(bind=None),
            Column('id', Integer(), table=<B_Table>, nullable=False,
                   default=Sequence('test_1', start=1, increment=1,
                                    metadata=MetaData(bind=None))),
            Column('a_TableId', Integer(),
                   ForeignKey('A_Table.id'), table=<B_Table>,
                   primary_key=True, nullable=False),
            schema=None)
and finally:
In [5]: b.a_TableId
Out[5]: 1
Also note that the database actually reflects the changed (and true) primary key, so I know that there's something going on with the ORM/SQLAlchemy.
Question: In summary, how can I change the model's primary key after the model has already been defined?
edit: See below for full code (same type of error, just in SQLite)
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declared_attr, declarative_base
from sqlalchemy.schema import PrimaryKeyConstraint
from sqlalchemy import Sequence, create_engine

class Base:
    @declared_attr
    def __tablename__(cls):
        return f"{cls.__name__}"

    @declared_attr
    def seq(cls):
        return Sequence("test_1", start=1, increment=1)

    @declared_attr
    def id(cls):
        return Column(Integer, cls.seq, unique=True, autoincrement=True, primary_key=True)

Base = declarative_base(cls=Base)

def relate(model, x):
    """Model is the original class, x is what class needs to be as
    an attribute for model"""
    attributeName = x.__tablename__
    idAttributeName = "{}Id".format(attributeName)
    setattr(model, idAttributeName,
            Column(ForeignKey(x.id)))
    setattr(model, attributeName,
            relationship(x,
                         foreign_keys=getattr(model, idAttributeName),
                         primaryjoin=getattr(
                             model, idAttributeName) == x.id,
                         remote_side=x.id
                         )
            )
    return model.__table__.c[idAttributeName]

def possibleSolution(model):
    if len(model.defined):
        newPriCols = []
        for x in model.defined:
            newPriCols.append(relate(model, x))
        for priCol in model.__table__.primary_key:
            priCol.primary_key = False
            priCol.nullable = True
        model.__table__.primary_key = PrimaryKeyConstraint(
            *newPriCols
            # TODO: ADD all the columns that are in the model that are also a primary key
            # *[col for col in model.__table__.c if col.primary_key]
        )

class A_Table(Base):
    pass

class B_Table(Base):
    defined = [A_Table]

possibleSolution(B_Table)

engine = create_engine('sqlite://')
Base.metadata.create_all(bind=engine)

Session = sessionmaker(bind=engine)
session = Session()

a = A_Table()
b = B_Table(A_TableId=a.id)

print(B_Table.__table__.primary_key)

session.add(a)
session.commit()
session.add(b)
session.commit()
Originally, the error you say the PK reassignment is causing is:
InvalidRequestError: Instance <B_Table at 0x104aa1d68> cannot be refreshed -
it's not persistent and does not contain a full primary key.
I don't get that running your MCVE; instead, I get a pretty helpful warning first:
SAWarning: Column 'B_Table.A_TableId' is marked as a member of the
primary key for table 'B_Table', but has no Python-side or server-side
default generator indicated, nor does it indicate 'autoincrement=True'
or 'nullable=True', and no explicit value is passed. Primary key
columns typically may not store NULL.
And a very detailed exception message when the script fails:
sqlalchemy.orm.exc.FlushError: Instance <B_Table at 0x...> has
a NULL identity key. If this is an auto-generated value, check that
the database table allows generation of new primary key values, and
that the mapped Column object is configured to expect these generated
values. Ensure also that this flush() is not occurring at an
inappropriate time, such as within a load() event.
So assuming that the example accurately describes your problem, the answer is straightforward: a primary key cannot be null.
A_Table inherits from Base:
class A_Table(Base):
    pass
Base gives A_Table an autoincrement PK through the declared_attr id():
@declared_attr
def id(cls):
    return Column(Integer, cls.seq, unique=True, autoincrement=True, primary_key=True)
Similarly, B_Table is defined off Base, but its PK is overwritten in possibleSolution() such that it becomes a ForeignKey to A_Table:
PrimaryKeyConstraint(Column('A_TableId', Integer(), ForeignKey('A_Table.id'), table=<B_Table>, primary_key=True, nullable=False))
Then we instantiate an instance of A_Table without any kwargs and immediately assign instance a's id attribute to the field A_TableId when constructing b:
a = A_Table()
b = B_Table(A_TableId=a.id)
At this point we can stop and inspect the attribute values of each:
print(a.id, b.A_TableId)
# None None
a.id is None because it's an autoincrement value that needs to be populated by the database, not the ORM, so SQLAlchemy doesn't know its value until after the instance has been flushed to the database.
So what happens if we include a flush() operation after adding instance a to the session:
a = A_Table()
session.add(a)
session.flush()
b = B_Table(A_TableId=a.id)
print(a.id, b.A_TableId)
# 1 1
So by issuing the flush first, we've got a value for a.id, meaning that we also have a value for b.A_TableId.
session.add(b)
session.commit()
# no error
I'm working on an application that uses sqlalchemy to pull a lot of data. The application does a lot of computations on the resulting objects, but never makes any changes to those objects. The application is too slow, and profiling suggests a lot of the time is spent accessing attributes that are managed by sqlalchemy. Therefore, I'm trying to figure out how to prevent sqlalchemy from instrumenting these objects.
So far, I've made a little progress based on this answer and this example. I've found two different ways to do it, but neither one seems to work with relationships. I'm mostly proceeding by poking around without any comprehensive understanding of the internals of sqlalchemy. In both cases, I'm using the following test session and data:
engine = create_engine('sqlite://')
engine.execute('CREATE TABLE foo (id INTEGER, name TEXT)')
engine.execute('INSERT INTO foo VALUES (0, \'Jeremy\')')
engine.execute('CREATE TABLE bar (id INTEGER, foo_id Integer)')
engine.execute('INSERT INTO bar VALUES (0, 0)')
engine.execute('INSERT INTO bar VALUES (1, 0)')
Session = sessionmaker(bind=engine)
session = Session()
Then I either (1) use a custom subclass of InstrumentationManager:
class ReadOnlyInstrumentationManager(InstrumentationManager):
    def install_descriptor(self, class_, key, inst):
        pass

Base = declarative_base()
Base.__sa_instrumentation_manager__ = ReadOnlyInstrumentationManager

class Bar(Base):
    __tablename__ = 'bar'
    id = Column(Integer, primary_key=True)
    foo_id = Column(Integer, ForeignKey('foo.id'))

class Foo(Base):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    bars = relationship(Bar)

f = session.query(Foo).first()
print type(f.bars)
which gives:
<class 'sqlalchemy.orm.relationships.RelationshipProperty'>
instead of the expected list of bars.
Or, (2) use a custom subclass of ClassManager:
class MyClassManager(ClassManager):
    def new_instance(self, state=None):
        if hasattr(self.class_, '__readonly_type__'):
            instance = self.class_.__readonly_type__.__new__(self.class_.__readonly_type__)
        else:
            instance = self.class_.__new__(self.class_)
        self.setup_instance(instance, state)
        return instance

Base = declarative_base()
Base.__sa_instrumentation_manager__ = MyClassManager

class ReadonlyFoo(object):
    pass

class ReadonlyBar(object):
    pass

class Bar(Base, ReadonlyBar):
    __tablename__ = 'bar'
    __readonly_type__ = ReadonlyBar
    id = Column(Integer, primary_key=True)
    foo_id = Column(Integer, ForeignKey('foo.id'))

class Foo(Base, ReadonlyFoo):
    __tablename__ = 'foo'
    __readonly_type__ = ReadonlyFoo
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    bars = relationship(Bar)

f = session.query(Foo).first()
print f.bars
which gives:
AttributeError: 'ReadonlyFoo' object has no attribute 'bars'
Is there a way to modify one of these approaches so that relationships still work? Or, is there another approach to this problem that's better?