Retrieving all columns but some with SQLAlchemy - python

I'm making a WebService that sends specific tables in JSON.
I use SQLAlchemy to communicate with the database.
I'd like to retrieve just the columns the user has the right to see.
Is there a way to tell SQLAlchemy not to retrieve certain columns?
It's not valid SQL, but something like this:
SELECT * EXCEPT column1 FROM table.
I know it is possible to specify just some columns in the SELECT statement, but that's not exactly what I want, because I don't know all the table's columns in advance. I just want all the columns except some.
I also tried to get all the columns and then delete the attribute I don't want, like this:
result = db_session.query(Table).all()
for row in result:
    delattr(row, 'column1')
but it seems SQLAlchemy doesn't allow this. I get the warning:
Warning: Column 'column1' cannot be null
cursor.execute(statement, parameters)
What would be the best way to do this?
Thank you

You can pass in all columns in the table, except the ones you don't want, to the query method.
session.query(*[c for c in User.__table__.c if c.name != 'password'])
Here is a runnable example:
#!/usr/bin/env python
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import Session

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)
    password = Column(String)

    def __init__(self, name, fullname, password):
        self.name = name
        self.fullname = fullname
        self.password = password

    def __repr__(self):
        return "<User('%s','%s', '%s')>" % (self.name, self.fullname, self.password)

engine = create_engine('sqlite:///:memory:', echo=True)
Base.metadata.create_all(engine)
session = Session(bind=engine)

ed_user = User('ed', 'Ed Jones', 'edspassword')
session.add(ed_user)
session.commit()

result = session.query(*[c for c in User.__table__.c if c.name != 'password']).all()
print(result)
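Note that querying individual columns like this returns plain result tuples rather than User objects, which is often fine for JSON serialization. If you need mapped entities instead, a similar keep-list can be fed to the load_only() option; a minimal sketch, assuming SQLAlchemy 1.4+ where load_only() accepts mapped attributes:
from sqlalchemy.orm import load_only

# Build the keep-list dynamically: every mapped column except 'password'.
keep = [getattr(User, c.name) for c in User.__table__.c if c.name != 'password']
users = session.query(User).options(load_only(*keep)).all()
# Each element is a real User instance; the primary key is always loaded,
# and touching user.password later triggers an extra SELECT.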

You can make the column a deferred column. This feature allows particular columns of a table to be loaded only upon direct access, instead of when the entity is queried with Query.
See Deferred Column Loading.
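For reference, a minimal sketch of both forms, assuming a declarative Base and session like the ones in the example above:
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import deferred, defer

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    # Deferred at the mapping level: loaded only when user.password is accessed.
    password = deferred(Column(String))

# Or deferred per query, leaving the mapping unchanged:
users = session.query(User).options(defer(User.password)).all()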

This worked for me
users = db.query(models.User).filter(models.User.email != current_user.email).all()
return users

Related

Sqlalchemy: how to define constraint newid() for Guid column (mssql)

I want to create a column (Id) of type uniqueidentifier in sqlalchemy in a table called Staging.Transactions. Also, I want the column to automatically generate new guids for inserts.
What I want to accomplish is the following (expressed in SQL):
ALTER TABLE [Staging].[Transactions] ADD DEFAULT (newid()) FOR [Id]
GO
The code in sqlalchemy is currently:
from sqlalchemy import Column, Float, Date
import uuid
from database.base import base
from sqlalchemy_utils import UUIDType

class Transactions(base):
    __tablename__ = 'Transactions'
    __table_args__ = {'schema': 'Staging'}

    Id = Column(UUIDType, primary_key=True, default=uuid.uuid4)
    AccountId = Column(UUIDType)
    TransactionAmount = Column(Float)
    TransactionDate = Column(Date)

    def __init__(self, account_id, transaction_amount, transaction_date):
        self.Id = uuid.uuid4()
        self.AccountId = account_id
        self.TransactionAmount = transaction_amount
        self.TransactionDate = transaction_date
When I create the schema from the Python code, it does not generate the constraint I want in SQL, that is, to auto-generate new GUIDs/uniqueidentifiers for the column [Id].
If I try to make a manual insert I get error message: "Cannot insert the value NULL into column 'Id', table 'my_database.Staging.Transactions'; column does not allow nulls. INSERT fails."
Would appreciate tips on how I can change the python/sqlalchemy code to fix this.
I've found two ways:
1)
Do not use uuid.uuid4() in the __init__ of your table class; keep it simple:
class Transactions(base):
    __tablename__ = 'Transactions'
    __table_args__ = {'schema': 'Staging'}

    Id = Column(String, primary_key=True)
    ...

    def __init__(self, Id, account_id, transaction_amount, transaction_date):
        self.Id = Id
        ...
Instead, use it when creating a new record:
import uuid
...
new_transac = Transactions(Id=uuid.uuid4(),
                           ...
                           )
db.session.add(new_transac)
db.session.commit()
Here, db is my SQLAlchemy(app).
2)
Without uuid, you can use raw SQL to do the job (see SQLAlchemy : how can I execute a raw INSERT sql query in a Postgres database?). session.execute is still a SQLAlchemy solution.
In your case, it should be something like this:
table = "[Staging].[Transactions]"
columns = ["[Id]", "[AccountId]", "[TransactionAmount]", "[TransactionDate]"]
values = ["(NEWID(), '" + str(form.AccountId.data) + "', " +
          str(form.TransactionAmount.data) + ", " +
          str(form.TransactionDate.data) + ")"]
s_cols = ', '.join(columns)
s_vals = ', '.join(values)
insSQL = db.session.execute(f"INSERT INTO {table} ({s_cols}) VALUES {s_vals}")
print(insSQL)  # to see if the SQL command is OK
db.session.commit()
You have to check where single quotes are really needed.
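A third option, sketched here on the assumption that the mssql dialect is in use (untested against your schema): let SQLAlchemy emit the DEFAULT (newid()) clause itself via server_default, using the dialect's UNIQUEIDENTIFIER type:
from sqlalchemy import Column, text
from sqlalchemy.dialects.mssql import UNIQUEIDENTIFIER

class Transactions(base):
    __tablename__ = 'Transactions'
    __table_args__ = {'schema': 'Staging'}

    # server_default puts DEFAULT (newid()) into the CREATE TABLE DDL,
    # so SQL Server generates the GUID when an INSERT omits Id.
    Id = Column(UNIQUEIDENTIFIER, primary_key=True, server_default=text('newid()'))
Note that server_default only affects newly emitted DDL; for a table that already exists you still need the ALTER TABLE statement from the question.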

null value for primary key when trying to insert list of dictionaries

I'm trying to do a large number of inserts with one call, and the way someone here recommended was by giving .insert a list of dictionaries. This is using SQLAlchemy Core.
As an example:
try:
    engine = db.create_engine(f"postgres://user:pass@myip/addressbook", connect_args={'connect_timeout': 5})
    connection = engine.connect()
    metadata = db.MetaData()
except exc.OperationalError:
    print_error(f":: Could not connect to myip!")
    sys.exit()

table_addressbook = db.Table('addressbook', metadata, autoload=True, autoload_with=engine)

list = []
list.append({'firstname': "John", 'lastname': "Doe"})
list.append({'firstname': "Jane", 'lastname': "Doe"})

query = db.insert(table_addressbook).values(list)
connection.execute(query)
But I'm getting an error saying the column id violates a not-null constraint. Normally the primary-key id is auto-generated by the database. How do I use this method but let id be auto-generated? Or is there a different method I should use?
edit
Table name is addressbook.
Column id is type integer with default sequence 'untitled_table_id_seq', constraints are PRIMARY_KEY. This was autogenerated by Postico for Mac, but I've always been able to insert without including id and it auto increments from the last inserted ID.
Columns firstname and lastname are type text, no default, no constraints.
Without any information on your model and/or connection it is a bit difficult to answer your question. Please find below a piece of code which uses insert without throwing non-null constraint errors. Hopefully it helps you.
from sqlalchemy import create_engine, Column, Integer, String, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy.sql import insert

engine = create_engine('sqlite:///:memory:', echo=True)
Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    firstname = Column(String)
    lastname = Column(String)

Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
Session.configure(bind=engine)  # once engine is available
session = Session()

new_users = []
new_users.append({'firstname': "John", 'lastname': "Doe"})
new_users.append({'firstname': "Jane", 'lastname': "Doe"})

i = insert(User).values(new_users)
session.execute(i)
PS: most of this is coming from the tutorial on: https://docs.sqlalchemy.org/en/13/orm/tutorial.html
from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session

engine = create_engine('sqlite:///:memory:', echo=True)
Base = declarative_base()

# Example Model definition for the illustration
class Customer(Base):
    __tablename__ = "customer"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    description = Column(String(255))

Base.metadata.create_all(engine)

######################################################
# Bulk insert using dictionaries.
######################################################

# Insert test records into `customer` table.
def bulk_insert_customers(n):
    session = Session(bind=engine)
    session.bulk_insert_mappings(
        Customer,
        [
            dict(
                name="customer name %d" % i,
                description="customer description %d" % i,
            )
            for i in range(n)
        ],
    )
    session.commit()
Refer to this for more examples of how to do bulk inserts in different ways:
https://docs.sqlalchemy.org/en/13/_modules/examples/performance/bulk_inserts.html
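For the original Core question, it may also be worth noting (a sketch, untested against the reflected addressbook table) that passing the list of dictionaries as a separate argument to Connection.execute() triggers an "executemany": the compiled INSERT only names the keys present in the dictionaries, so the id column is left out entirely and the database default/sequence fills it:
rows = [{'firstname': "John", 'lastname': "Doe"},
        {'firstname': "Jane", 'lastname': "Doe"}]

# executemany form: id is absent from the dicts, so it is absent from the
# statement and falls back to its server-side default.
connection.execute(db.insert(table_addressbook), rows)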

How to create a pg_trgm index using SQLAlchemy for Scrapy?

I am using Scrapy to scrape data from a web forum. I am storing this data in a PostgreSQL database using SQLAlchemy. The table and columns are created fine; however, I cannot get SQLAlchemy to create an index on one of the columns. I am trying to create a trigram index (pg_trgm) using GIN.
The Postgresql code that would create this index is:
CREATE INDEX description_idx ON table USING gin (description gin_trgm_ops);
The SQLAlchemy code I have added to my models.py file is:
desc_idx = Index('description_idx', text("description gin_trgm_ops"), postgresql_using='gin')
I have added this line to my models.py, but when I check in PostgreSQL, the index is never created.
Below are my full models.py and pipelines.py files. Am I going about this all wrong??
Any help would be greatly appreciated!!
models.py:
from sqlalchemy import create_engine, Column, Integer, String, DateTime, Index, text
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.engine.url import URL

import settings

DeclarativeBase = declarative_base()

def db_connect():
    return create_engine(URL(**settings.DATABASE))

def create_forum_table(engine):
    DeclarativeBase.metadata.create_all(engine)

class forumDB(DeclarativeBase):
    __tablename__ = "table"
    id = Column(Integer, primary_key=True)
    title = Column('title', String)
    desc = Column('description', String, nullable=True)
    desc_idx = Index('description_idx', text("description gin_trgm_ops"), postgresql_using='gin')
pipelines.py:
from scrapy.exceptions import DropItem
from sqlalchemy.orm import sessionmaker
from models import forumDB, db_connect, create_forum_table

class ScrapeforumToDB(object):
    def __init__(self):
        engine = db_connect()
        create_forum_table(engine)
        self.Session = sessionmaker(bind=engine)

    def process_item(self, item, spider):
        session = self.Session()
        forumitem = forumDB(**item)
        try:
            session.add(forumitem)
            session.commit()
        except:
            session.rollback()
            raise
        finally:
            session.close()
        return item
The proper way to reference an Operator Class in SQLAlchemy (such as gin_trgm_ops) is to use the postgresql_ops parameter. This will also allow tools like alembic to understand how to use it when auto-generating migrations.
Index('description_idx',
      'description', postgresql_using='gin',
      postgresql_ops={
          'description': 'gin_trgm_ops',
      })
Since the Index definition uses a text expression, it has no reference to the Table "table", which was implicitly created by the declarative class forumDB. Compare that to using a Column as the expression, or some derivative of it, like this:
Index('some_index_idx', forumDB.title)
In the above definition the index will know about the table and the other way around.
What this means in your case is that the Table "table" has no idea that such an index exists. Adding it as an attribute of the declarative class is the wrong way to do it. It should be passed to the implicitly created Table instance. The attribute __table_args__ is just for that:
class forumDB(DeclarativeBase):
    __tablename__ = "table"
    # Note: This used to use `text('description gin_trgm_ops')` instead of the
    # `postgresql_ops` parameter, which should be used.
    __table_args__ = (
        Index('description_idx', "description",
              postgresql_ops={"description": "gin_trgm_ops"},
              postgresql_using='gin'),
    )

    id = Column(Integer, primary_key=True)
    title = Column('title', String)
    desc = Column('description', String, nullable=True)
With the modification in place, a call to create_forum_table(engine) resulted in:
> \d "table"
Table "public.table"
Column | Type | Modifiers
-------------+-------------------+----------------------------------------------------
id | integer | not null default nextval('table_id_seq'::regclass)
title | character varying |
description | character varying |
Indexes:
"table_pkey" PRIMARY KEY, btree (id)
"description_idx" gin (description gin_trgm_ops)

sqlalchemy: column_prefix causes issues accessing model attributes

I went searching, without result, for a way to get the integer or boolean value from an object model created via SQLAlchemy.
I can add records and it works flawlessly, but I can't get the integer or boolean value back; all I get when I try to print it is the object name:
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String, Boolean, Sequence
from sqlalchemy.orm import mapper, sessionmaker
from sqlalchemy.ext.declarative import declarative_base
import json

class Bookmarks(object):
    pass

#----------------------------------------------------------------------
engine = create_engine('postgresql://u:p@localghost/asd', echo=True)
Base = declarative_base()

class Tramo(Base):
    __tablename__ = 'tramos'
    __mapper_args__ = {'column_prefix': 'tramos'}

    id = Column(Integer, primary_key=True)
    nombre = Column(String)
    tramo_data = Column(String)
    estado = Column(Boolean, default=True)

    def __init__(self, nombre, tramo_data):
        self.nombre = nombre
        self.tramo_data = tramo_data

    def __repr__(self):
        return "[id:%s][nombre:%s][tramo:%s]" % (getattr(self, 'id'), self.nombre, self.tramo_data)

Session = sessionmaker(bind=engine)
session = Session()

tabla = Tramo.__table__
metadata = Base.metadata
metadata.create_all(engine)

b = Tramo('tramo1', 'adadas')
session.add(b)
session.commit()

print b
print b.id
It prints:
[id:tramos.id][nombre:tramo1][tramo:adadas]
tramos.id
I can't print the id value; it looks like the Column object is there, but it doesn't return the value of the property.
I even used
session.refresh(b)
after the add, but the result is the same.
According to the documentation, Naming All Columns with a Prefix:
...prefix to the mapped attribute names relative to the
(table) column name ...
Since you define the mapped attributes in your class, I do not think it does what you desire.
Solution-1: remove the 'column_prefix': 'tramos' from your __mapper_args__.
Solution-2: print b.tramosid will print its id. You would need to change the __repr__ accordingly:
def __repr__(self):
    return "[id:%s][nombre:%s][tramo:%s]" % (getattr(self, 'tramosid'), self.nombre, self.tramo_data)

SQLAlchemy - don't enforce foreign key constraint on a relationship

I have a Test model/table and a TestAuditLog model/table, using SQLAlchemy and SQL Server 2008. The relationship between the two is Test.id == TestAuditLog.entityId, with one test having many audit logs. TestAuditLog is intended to keep a history of changes to rows in the Test table. I want to track when a Test is deleted, also, but I'm having trouble with this. In SQL Server Management Studio, I set the FK_TEST_AUDIT_LOG_TEST relationship's "Enforce Foreign Key Constraint" property to "No", thinking that would allow a TestAuditLog row to exist with an entityId that no longer connects to any Test.id because the Test has been deleted. However, when I try to create a TestAuditLog with SQLAlchemy and then delete the Test, I get an error:
(IntegrityError) ('23000', "[23000] [Microsoft][ODBC SQL Server Driver][SQL Server]Cannot insert the value NULL into column 'AL_TEST_ID', table 'TEST_AUDIT_LOG'; column does not allow nulls. UPDATE fails. (515) (SQLExecDirectW); [01000] [Microsoft][ODBC SQL Server Driver][SQL Server]The statement has been terminated. (3621)") u'UPDATE [TEST_AUDIT_LOG] SET [AL_TEST_ID]=? WHERE [TEST_AUDIT_LOG].[AL_ID] = ?' (None, 8)
I think because of the foreign-key relationship between Test and TestAuditLog, after I delete the Test row, SQLAlchemy is trying to update all that test's audit logs to have a NULL entityId. I don't want it to do this; I want SQLAlchemy to leave the audit logs alone. How can I tell SQLAlchemy to allow audit logs to exist whose entityId does not connect with any Test.id?
I tried just removing the ForeignKey from my tables, but I'd like to still be able to say myTest.audits and get all of a test's audit logs, and SQLAlchemy complained about not knowing how to join Test and TestAuditLog. When I then specified a primaryjoin on the relationship, it grumbled about not having a ForeignKey or ForeignKeyConstraint with the columns.
Here are my models:
class TestAuditLog(Base, Common):
    __tablename__ = u'TEST_AUDIT_LOG'

    entityId = Column(u'AL_TEST_ID', INTEGER(), ForeignKey(u'TEST.TS_TEST_ID'),
                      nullable=False)
    ...

class Test(Base, Common):
    __tablename__ = u'TEST'

    id = Column(u'TS_TEST_ID', INTEGER(), primary_key=True, nullable=False)
    audits = relationship(TestAuditLog, backref="test")
    ...
And here's how I'm trying to delete a test while keeping its audit logs, their entityId intact:
test = Session.query(Test).first()
Session.begin()
try:
    Session.add(TestAuditLog(entityId=test.id))
    Session.flush()
    Session.delete(test)
    Session.commit()
except:
    Session.rollback()
    raise
You can solve this by:
POINT-1: not having a ForeignKey on either the RDBMS level or the SA level
POINT-2: explicitly specifying join conditions for the relationship
POINT-3: marking relationship cascades to rely on the passive_deletes flag
The fully working code snippet below should give you an idea (the points are highlighted in the code):
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import scoped_session, sessionmaker, relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()
engine = create_engine('sqlite:///:memory:', echo=False)
Session = sessionmaker(bind=engine)

class TestAuditLog(Base):
    __tablename__ = 'TEST_AUDIT_LOG'

    id = Column(Integer, primary_key=True)
    comment = Column(String)
    entityId = Column('TEST_AUDIT_LOG', Integer, nullable=False,
                      # POINT-1
                      #ForeignKey('TEST.TS_TEST_ID', ondelete="CASCADE"),
                      )

    def __init__(self, comment):
        self.comment = comment

    def __repr__(self):
        return "<TestAuditLog(id=%s entityId=%s, comment=%s)>" % (self.id, self.entityId, self.comment)

class Test(Base):
    __tablename__ = 'TEST'

    id = Column('TS_TEST_ID', Integer, primary_key=True)
    name = Column(String)
    audits = relationship(TestAuditLog, backref='test',
                          # POINT-2
                          primaryjoin="Test.id==TestAuditLog.entityId",
                          foreign_keys=[TestAuditLog.__table__.c.TEST_AUDIT_LOG],
                          # POINT-3
                          passive_deletes='all',
                          )

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Test(id=%s, name=%s)>" % (self.id, self.name)

Base.metadata.create_all(engine)

###################
## tests
session = Session()

# create test data
tests = [Test("test-" + str(i)) for i in range(3)]
_cnt = 0
for _t in tests:
    for __ in range(2):
        _t.audits.append(TestAuditLog("comment-" + str(_cnt)))
        _cnt += 1
session.add_all(tests)
session.commit()
session.expunge_all()
print '-' * 80

# check test data, delete one Test
t1 = session.query(Test).get(1)
print "t: ", t1
print "t.a: ", t1.audits
session.delete(t1)
session.commit()
session.expunge_all()
print '-' * 80

# check that audits are still in the DB for deleted Test
t1 = session.query(Test).get(1)
assert t1 is None
_q = session.query(TestAuditLog).filter(TestAuditLog.entityId == 1)
_r = _q.all()
assert len(_r) == 2
for _a in _r:
    print _a
Another option would be to duplicate the column used in the FK, and make the FK column nullable with the ON DELETE SET NULL option. That way you can still check the audit trail of deleted objects using the duplicated column.
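A sketch of that alternative (the testId column name is illustrative, not from the original models): the nullable FK is cleared by the database on delete, while a plain copy of the id survives for the audit trail:
class TestAuditLog(Base):
    __tablename__ = 'TEST_AUDIT_LOG'

    id = Column(Integer, primary_key=True)
    # Real FK, set to NULL by the database when the referenced Test is deleted.
    testId = Column(Integer,
                    ForeignKey('TEST.TS_TEST_ID', ondelete='SET NULL'),
                    nullable=True)
    # Plain duplicate of the Test id; never touched on delete, so the
    # audit trail of deleted objects stays queryable.
    entityId = Column(Integer, nullable=False)

class Test(Base):
    __tablename__ = 'TEST'

    id = Column('TS_TEST_ID', Integer, primary_key=True)
    # passive_deletes=True leaves the SET NULL to the database.
    audits = relationship(TestAuditLog, backref='test', passive_deletes=True)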
