SQLAlchemy - don't enforce foreign key constraint on a relationship - python

I have a Test model/table and a TestAuditLog model/table, using SQLAlchemy and SQL Server 2008. The relationship between the two is Test.id == TestAuditLog.entityId, with one test having many audit logs. TestAuditLog is intended to keep a history of changes to rows in the Test table. I want to track when a Test is deleted, also, but I'm having trouble with this. In SQL Server Management Studio, I set the FK_TEST_AUDIT_LOG_TEST relationship's "Enforce Foreign Key Constraint" property to "No", thinking that would allow a TestAuditLog row to exist with an entityId that no longer connects to any Test.id because the Test has been deleted. However, when I try to create a TestAuditLog with SQLAlchemy and then delete the Test, I get an error:
(IntegrityError) ('23000', "[23000] [Microsoft][ODBC SQL Server Driver][SQL Server]Cannot insert the value NULL into column 'AL_TEST_ID', table 'TEST_AUDIT_LOG'; column does not allow nulls. UPDATE fails. (515) (SQLExecDirectW); [01000] [Microsoft][ODBC SQL Server Driver][SQL Server]The statement has been terminated. (3621)") u'UPDATE [TEST_AUDIT_LOG] SET [AL_TEST_ID]=? WHERE [TEST_AUDIT_LOG].[AL_ID] = ?' (None, 8)
I think because of the foreign-key relationship between Test and TestAuditLog, after I delete the Test row, SQLAlchemy is trying to update all that test's audit logs to have a NULL entityId. I don't want it to do this; I want SQLAlchemy to leave the audit logs alone. How can I tell SQLAlchemy to allow audit logs to exist whose entityId does not connect with any Test.id?
I tried just removing the ForeignKey from my tables, but I'd like to still be able to say myTest.audits and get all of a test's audit logs, and SQLAlchemy complained about not knowing how to join Test and TestAuditLog. When I then specified a primaryjoin on the relationship, it grumbled about not having a ForeignKey or ForeignKeyConstraint with the columns.
Here are my models:
class TestAuditLog(Base, Common):
__tablename__ = u'TEST_AUDIT_LOG'
entityId = Column(u'AL_TEST_ID', INTEGER(), ForeignKey(u'TEST.TS_TEST_ID'),
nullable=False)
...
class Test(Base, Common):
__tablename__ = u'TEST'
id = Column(u'TS_TEST_ID', INTEGER(), primary_key=True, nullable=False)
audits = relationship(TestAuditLog, backref="test")
...
And here's how I'm trying to delete a test while keeping its audit logs, their entityId intact:
test = Session.query(Test).first()
Session.begin()
try:
Session.add(TestAuditLog(entityId=test.id))
Session.flush()
Session.delete(test)
Session.commit()
except:
Session.rollback()
raise

You can solve this by:
POINT-1: not having a ForeignKey neither on the RDBMS level nor on the SA level
POINT-2: explicitly specify join conditions for the relationship
POINT-3: mark relationship cascades to rely on passive_deletes flag
Fully working code snippet below should give you an idea (points are highlighted in the code):
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import scoped_session, sessionmaker, relationship
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
engine = create_engine('sqlite:///:memory:', echo=False)
Session = sessionmaker(bind=engine)
class TestAuditLog(Base):
__tablename__ = 'TEST_AUDIT_LOG'
id = Column(Integer, primary_key=True)
comment = Column(String)
entityId = Column('TEST_AUDIT_LOG', Integer, nullable=False,
# POINT-1
#ForeignKey('TEST.TS_TEST_ID', ondelete="CASCADE"),
)
def __init__(self, comment):
self.comment = comment
def __repr__(self):
return "<TestAuditLog(id=%s entityId=%s, comment=%s)>" % (self.id, self.entityId, self.comment)
class Test(Base):
__tablename__ = 'TEST'
id = Column('TS_TEST_ID', Integer, primary_key=True)
name = Column(String)
audits = relationship(TestAuditLog, backref='test',
# POINT-2
primaryjoin="Test.id==TestAuditLog.entityId",
foreign_keys=[TestAuditLog.__table__.c.TEST_AUDIT_LOG],
# POINT-3
passive_deletes='all',
)
def __init__(self, name):
self.name = name
def __repr__(self):
return "<Test(id=%s, name=%s)>" % (self.id, self.name)
Base.metadata.create_all(engine)
###################
## tests
session = Session()
# create test data
tests = [Test("test-" + str(i)) for i in range(3)]
_cnt = 0
for _t in tests:
for __ in range(2):
_t.audits.append(TestAuditLog("comment-" + str(_cnt)))
_cnt += 1
session.add_all(tests)
session.commit()
session.expunge_all()
print '-'*80
# check test data, delete one Test
t1 = session.query(Test).get(1)
print "t: ", t1
print "t.a: ", t1.audits
session.delete(t1)
session.commit()
session.expunge_all()
print '-'*80
# check that audits are still in the DB for deleted Test
t1 = session.query(Test).get(1)
assert t1 is None
_q = session.query(TestAuditLog).filter(TestAuditLog.entityId == 1)
_r = _q.all()
assert len(_r) == 2
for _a in _r:
print _a
Another option would be to duplicate the column used in the FK, and make the FK column nullable with ON CASCADE SET NULL option. In this way you can still check the audit trail of deleted objects using this column.

Related

Deleting object in a (SQLAlchemy) many-to-many relationship (works in sqlite but fails when using postgres)?

I seem to hit an issue when trying to delete an object from a many-to-many relation that seems to work with sqlite, but fails on postgres.
Any help or hint is highly appreciated!
This part is the code fails when using postgres.
# try to delete group 1
session.query(Group).filter_by(name="group 1").delete()
Example code:
Many to many example:
An Item can belong to many groups and a Group can contain many items
Is it possible to delete a group containing a number of items without
a need to set the Group.items collection to [] and still keep all the items?
from sqlalchemy import Column, Integer, String, Table, ForeignKey, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship
from testcontainers.postgres import PostgresContainer
Base = declarative_base()
item_group_members = Table('item_group_members', Base.metadata,
Column('group_id', ForeignKey('group.id'), primary_key=True),
Column('item_id', ForeignKey('item.id'), primary_key=True)
)
class Group(Base):
__tablename__ = 'group'
id = Column(Integer, primary_key=True, index=True)
name = Column(String, unique=True, nullable=False)
# relationships
items = relationship("Item", secondary=item_group_members, back_populates="groups")
class Item(Base):
__tablename__ = 'item'
id = Column(Integer, primary_key=True, index=True)
name = Column(String, unique=False, nullable=False)
# relationships
groups = relationship("Group", secondary=item_group_members, back_populates="items")
def run(connection_url):
engine = create_engine(connection_url)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()
# create some groups
group1 = Group(name="group 1")
group2 = Group(name="group 2")
# create some items
item1 = Item(name="Item 1")
item2 = Item(name="item 2")
item3 = Item(name="Item 1")
# add some items to a group
group1.items = [item1, item2, item3]
group2.items = [item2, item3]
# add all to the session and commit
session.add_all([group1, group2, item1, item2, item3])
session.commit()
# try to delete group 1
# FAILS when using Postgres!!!
session.query(Group).filter_by(name="group 1").delete()
assert session.query(Item).count() == 3
assert session.query(Group).count() == 1
def run_sqlite():
run('sqlite://')
def run_postgres():
with PostgresContainer("postgres:latest") as postgres:
run(postgres.get_connection_url())
if __name__ == '__main__':
run_sqlite() # works
run_postgres() # fails with an error message
sqlite works, but postgres does not. It results in an error:
sqlalchemy.exc.IntegrityError: (psycopg2.errors.ForeignKeyViolation) update or delete on table "group" violates foreign key constraint "item_group_members_group_id_fkey" on table "item_group_members"
DETAIL: Key (id)=(1) is still referenced from table "item_group_members".```
Has been answered in extend here:
https://github.com/sqlalchemy/sqlalchemy/discussions/7941

null value for primary key when trying to insert list of dictionaries

I'm trying to do a large number of inserts with one call, and the way someone here recommended was by giving .insert a list of dictionaries. This is using SQLAlchemy Core.
As an example:
try:
engine = db.create_engine(f"postgres://user:pass#myip/addressbook", connect_args={'connect_timeout': 5})
connection = engine.connect()
metadata = db.MetaData()
except exc.OperationalError:
print_error(f":: Could not connect to myip!")
sys.exit()
table_addressbook = db.Table('addressbook', metadata, autoload=True, autoload_with=engine)
list = []
list.append({'firstname': "John", 'lastname': "Doe"})
list.append({'firstname': "Jane", 'lastname': "Doe"})
query = db.insert(table_addressbook).values(list)
connection.execute(query)
But I'm getting an error saying the column id violates a non-null constraint. This is because insert normally auto-generates the primary-key id. How do I use this method but specify that id should be auto-generated? Or is there a different method I should use?
edit
Table name is addressbook.
Column id is type integer with default sequence 'untitled_table_id_seq', constraints are PRIMARY_KEY. This was autogenerated by Postico for Mac, but I've always been able to insert without including id and it auto increments from the last inserted ID.
Columns firstname and lastname are type text, no default, no constraints.
Without any information on your model and/or connection it is a bit difficult to answer your question. Please find below a piece of code which uses insert without throwing non-null constraint errors. Hopefully it helps you.
from sqlalchemy import create_engine, Column, Integer, String, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy.sql import insert
engine = create_engine('sqlite:///:memory:', echo=True)
Base = declarative_base()
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
firstname = Column(String)
lastname = Column(String)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
Session.configure(bind=engine) # once engine is available
session = Session()
new_users = []
new_users.append({'firstname': "John", 'lastname': "Doe"})
new_users.append({'firstname': "Jane", 'lastname': "Doe"})
i = insert(User).values(new_users)
session.execute(i)
PS: most of this is coming from the tutorial on: https://docs.sqlalchemy.org/en/13/orm/tutorial.html
from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session
engine = create_engine('sqlite:///:memory:', echo=True)
Base = declarative_base()
# Example Model definition for the illustration
class Customer(Base):
__tablename__ = "customer"
id = Column(Integer, primary_key=True)
name = Column(String(255))
description = Column(String(255))
Base.metadata.create_all(engine)
######################################################
# Bulk insert using dictionaries.
######################################################
# Insert test records into `customer`table.
def bulk_insert_customers(n):
session = Session(bind=engine)
session.bulk_insert_mappings(
Customer,
[
dict(
name="customer name %d" % i,
description="customer description %d" % i,
)
for i in range(n)
],
)
session.commit()
Refer these for more examples of how to do bulk inserts in different ways:
https://docs.sqlalchemy.org/en/13/_modules/examples/performance/bulk_inserts.html

Change SQLAlchemy Primary Key after it has been defined

Problem: Simply put, I am trying to redefine a SQLAlchemy ORM table's primary key after it has already been defined.
Example:
class Base:
#declared_attr
def __tablename__(cls):
return f"{cls.__name__}"
#declared_attr
def id(cls):
return Column(Integer, cls.seq, unique=True,
autoincrement=True, primary_key=True)
Base = declarative_base(cls=Base)
class A_Table(Base):
newPrimaryKeyColumnsDerivedFromAnotherFunction = []
# Please Note: as the variable name tries to say,
# these columns are auto-generated and not known until after all
# ORM classes (models) are defined
# OTHER CLASSES
def changePriKeyFunc(model):
pass # DO STUFF
# Then do
Base.metadata.create_all(bind=arbitraryEngine)
# After everything has been altered and tied into a little bow
*Please note, this is a simplification of the true problem I am trying to solve.
Possible Solution: Your first thought might have been to do something like this:
def possibleSolution(model):
for pricol in model.__table__.primary_key:
pricol.primary_key = False
model.__table__.primary_key = PrimaryKeyConstraint(
*model.newPrimaryKeyColumnsDerivedFromAnotherFunction,
# TODO: ADD all the columns that are in the model that are also a primary key
# *[col for col in model.__table__.c if col.primary_key]
)
But, this doesn't work, because when trying to add, flush, and commit, an error gets thrown:
InvalidRequestError: Instance <B_Table at 0x104aa1d68> cannot be refreshed -
it's not persistent and does not contain a full primary key.
Even though this:
In [2]: B_Table.__table__.primary_key
Out[2]: PrimaryKeyConstraint(Column('a_TableId', Integer(),
ForeignKey('A_Table.id'), table=<B_Table>,
primary_key=True, nullable=False))
as well as this:
In [3]: B_Table.__table__
Out[3]: Table('B_Table', MetaData(bind=None),
Column('id', Integer(), table=<B_Table>, nullable=False,
default=Sequence('test_1', start=1, increment=1,
metadata=MetaData(bind=None))),
Column('a_TableId', Integer(),
ForeignKey('A_Table.id'), table=<B_Table>,
primary_key=True, nullable=False),
schema=None)
and finally:
In [5]: b.a_TableId
Out[5]: 1
Also note that the database actually reflects the changed (and true) primary key, so I know that there's something going on with the ORM/SQLAlchemy.
Question: In summary, how can I change the model's primary key after the model has already been defined?
edit: See below for full code (same type of error, just in SQLite)
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declared_attr, declarative_base
from sqlalchemy.schema import PrimaryKeyConstraint
from sqlalchemy import Sequence, create_engine
class Base:
#declared_attr
def __tablename__(cls):
return f"{cls.__name__}"
#declared_attr
def seq(cls):
return Sequence("test_1", start=1, increment=1)
#declared_attr
def id(cls):
return Column(Integer, cls.seq, unique=True, autoincrement=True, primary_key=True)
Base = declarative_base(cls=Base)
def relate(model, x):
"""Model is the original class, x is what class needs to be as
an attribute for model"""
attributeName = x.__tablename__
idAttributeName = "{}Id".format(attributeName)
setattr(model, idAttributeName,
Column(ForeignKey(x.id)))
setattr(model, attributeName,
relationship(x,
foreign_keys=getattr(model, idAttributeName),
primaryjoin=getattr(
model, idAttributeName) == x.id,
remote_side=x.id
)
)
return model.__table__.c[idAttributeName]
def possibleSolution(model):
if len(model.defined):
newPriCols = []
for x in model.defined:
newPriCols.append(relate(model, x))
for priCol in model.__table__.primary_key:
priCol.primary_key = False
priCol.nullable = True
model.__table__.primary_key = PrimaryKeyConstraint(
*newPriCols
# TODO: ADD all the columns that are in the model that are also a primary key
# *[col for col in model.__table__.c if col.primary_key]
)
class A_Table(Base):
pass
class B_Table(Base):
defined = [A_Table]
possibleSolution(B_Table)
engine = create_engine('sqlite://')
Base.metadata.create_all(bind=engine)
Session = sessionmaker(bind=engine)
session = Session()
a = A_Table()
b = B_Table(A_TableId=a.id)
print(B_Table.__table__.primary_key)
session.add(a)
session.commit()
session.add(b)
session.commit()
Originally, the error you say the PK reassignment is causing is:
InvalidRequestError: Instance <B_Table at 0x104aa1d68> cannot be refreshed -
it's not persistent and does not contain a full primary key.
I don't get that running you MCVE, instead I get a pretty helpful warning first:
SAWarning: Column 'B_Table.A_TableId' is marked as a member of the
primary key for table 'B_Table', but has no Python-side or server-side
default generator indicated, nor does it indicate 'autoincrement=True'
or 'nullable=True', and no explicit value is passed. Primary key
columns typically may not store NULL.
And a very detailed exception message when the script fails:
sqlalchemy.orm.exc.FlushError: Instance has
a NULL identity key. If this is an auto-generated value, check that
the database table allows generation of new primary key values, and
that the mapped Column object is configured to expect these generated
values. Ensure also that this flush() is not occurring at an
inappropriate time, such as within a load() event.
So assuming that the example accurately describes your problem, the answer is straightforward. A primary key cannot be null.
A_Table inherits off Base:
class A_Table(Base):
pass
Base gives A_Table an autoincrement PK through declared_attr id():
#declared_attr
def id(cls):
return Column(Integer, cls.seq, unique=True, autoincrement=True, primary_key=True)
Similarly, B_Table is defined off Base but the PK is overwritten in possibleSolution() such that it becomes a ForeignKey to A_Table:
PrimaryKeyConstraint(Column('A_TableId', Integer(), ForeignKey('A_Table.id'), table=<B_Table>, primary_key=True, nullable=False))
Then, we instantiate an instance of A_Table without any kwargs and immediately allocate the id attribute of instance a to field A_TableId when constructing b:
a = A_Table()
b = B_Table(A_TableId=a.id)
At this point we can stop and inspect the attribute values of each:
print(a.id, b.A_TableId)
# None None
a.id is None because it's an autoincrement which needs to be populated by the database, not the ORM. So SQLAlchemy doesn't know it's value until after the instance is flushed to the database.
So what happens if we include a flush() operation after adding instance a to the session:
a = A_Table()
session.add(a)
session.flush()
b = B_Table(A_TableId=a.id)
print(a.id, b.A_TableId)
# 1 1
So by issuing the flush first, we've got a value for a.id, meaning that we also have a value for b.A_TableId.
session.add(b)
session.commit()
# no error

Retrieving all columns but some with SQLAlchemy

I'm making a WebService that sends specific tables in JSON.
I use SQLAlchemy to communicate with the database.
I'd want to retrieve just the columns the user has the right to see.
Is there a way to tell SQLAlchemy to not retrieve some columns ?
It's not correct but something like this :
SELECT * EXCEPT column1 FROM table.
I know it is possible to specify just some columns in the SELECT statement but it's not exactly what I want because I don't know all the table columns. I just want all the columns but some.
I also tried to get all the columns and delete the column attribute I don't want like this :
result = db_session.query(Table).all()
for row in result:
row.__delattr(column1)
but it seems SQLAlchemy doesn't allow to do this.
I get the warning :
Warning: Column 'column1' cannot be null
cursor.execute(statement, parameters)
ok
What would be the most optimized way to do it for you guys ?
Thank you
You can pass in all columns in the table, except the ones you don't want, to the query method.
session.query(*[c for c in User.__table__.c if c.name != 'password'])
Here is a runnable example:
#!/usr/bin/env python
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import Session
Base = declarative_base()
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
name = Column(String)
fullname = Column(String)
password = Column(String)
def __init__(self, name, fullname, password):
self.name = name
self.fullname = fullname
self.password = password
def __repr__(self):
return "<User('%s','%s', '%s')>" % (self.name, self.fullname, self.password)
engine = create_engine('sqlite:///:memory:', echo=True)
Base.metadata.create_all(engine)
session = Session(bind=engine)
ed_user = User('ed', 'Ed Jones', 'edspassword')
session.add(ed_user)
session.commit()
result = session.query(*[c for c in User.__table__.c if c.name != 'password']).all()
print(result)
You can make the column a defered column. This feature allows particular columns of a table be loaded only upon direct access, instead of when the entity is queried using Query.
See Deferred Column Loading
This worked for me
users = db.query(models.User).filter(models.User.email != current_user.email).all()
return users

Problem with sqlalchemy, reflected table and defaults for string fields

hmm, is there any reason why sa tries to add Nones to for varchar columns that have defaults set in in database schema ?, it doesnt do that for floats or ints (im using reflection).
so when i try to add new row :
like
u = User()
u.foo = 'a'
u.bar = 'b'
sa issues a query that has a lot more cols with None values assigned to those, and db obviously bards and doesnt perform default substitution.
What version do you use and what is actual code? Below is a sample code showing that server_default parameter works fine for string fields:
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
metadata = MetaData()
Base = declarative_base(metadata=metadata)
class Item(Base):
__tablename__="items"
id = Column(String, primary_key=True)
int_val = Column(Integer, nullable=False, server_default='123')
str_val = Column(String, nullable=False, server_default='abc')
engine = create_engine('sqlite://', echo=True)
metadata.create_all(engine)
session = sessionmaker(engine)()
item = Item(id='foo')
session.add(item)
session.commit()
print item.int_val, item.str_val
The output is:
<...>
<...> INSERT INTO items (id) VALUES (?)
<...> ['foo']
<...>
123 abc
I've found its a bug in sa, this happens only for string fields, they dont get server_default property for some unknow reason, filed a ticket for this already

Categories