How to specify fillfactor for a table in SQLAlchemy?

Let's say I have a simple table. In raw SQL it looks like this:
CREATE TABLE events (id INT PRIMARY KEY) WITH (fillfactor=60);
My question is how to specify fillfactor for a table using sqlalchemy declarative base?
Of course I can achieve that by using raw SQL in SQLAlchemy, e.g. execute("ALTER TABLE events SET (fillfactor=60)"), but I'm interested in whether there is a way to do it using native SQLAlchemy tools.
I've tried the following approach, but it didn't work:
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class SimpleExampleTable(Base):
    __tablename__ = 'events'
    __table_args__ = {'comment': 'events table', "fillfactor": 60}

    id = Column(Integer, primary_key=True)
TypeError: Additional arguments should be named <dialectname>_<argument>, got 'fillfactor'
Looking through the documentation, I've managed to find information only about fillfactor usage in indexes.
My environment:
python 3.9
sqlalchemy 1.3.22
PostgreSQL 11.6

Tbh, I have the same question, and the only thing I found related to fillfactor in the SQLAlchemy docs is about indexes (link to docs):
PostgreSQL allows storage parameters to be set on indexes. The storage
parameters available depend on the index method used by the index.
Storage parameters can be specified on Index using the postgresql_with
keyword argument:
Index('my_index', my_table.c.data, postgresql_with={"fillfactor": 50})
But it seems, there is no setting option where you can set the fillfactor directly for the table.
But there is still the option of running a raw SQL query (as an alembic migration, say):
ALTER TABLE mytable SET (fillfactor = 70);
Note that setting fillfactor on an existing table will not rearrange
the data, it will only apply to future inserts. But you can use
VACUUM to rewrite the table, which will respect the new fillfactor
setting.
The previous quote is taken from here
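For instance, a minimal sketch of such an alembic migration (revision boilerplate omitted; the events table name is taken from the question):

from alembic import op

def upgrade():
    # only pages written after this point honour the new fillfactor
    op.execute("ALTER TABLE events SET (fillfactor = 60)")

def downgrade():
    # RESET restores PostgreSQL's default fillfactor of 100
    op.execute("ALTER TABLE events RESET (fillfactor)")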

Extending the answer from Max Kapustin, you can use an event listener to automatically execute the ALTER TABLE statement when the table is created.
import sqlalchemy as sa
engine = sa.create_engine('postgresql:///test', echo=True, future=True)
tablename = 't65741211'
tbl = sa.Table(
    tablename,
    sa.MetaData(),
    sa.Column('id', sa.Integer, primary_key=True),
    listeners=[
        (
            'after_create',
            sa.schema.DDL(
                f"""ALTER TABLE "{tablename}" SET (fillfactor = 70)"""
            ),
        )
    ],
)
tbl.drop(engine, checkfirst=True)
tbl.create(engine)
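An equivalent spelling, if you'd rather not pass listeners= to Table, is to register the same DDL through the sqlalchemy.event API (reusing tbl and tablename from above):

sa.event.listen(
    tbl,
    'after_create',
    sa.schema.DDL(f"""ALTER TABLE "{tablename}" SET (fillfactor = 70)"""),
)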

Related

Use "on_conflict_do_update()" with sqlalchemy ORM

I am currently using the SQLAlchemy ORM for my db operations. Now I have a SQL command which requires ON CONFLICT (id) DO UPDATE. The method on_conflict_do_update() seems to be the correct one to use, but the post here says the code has to switch to SQLAlchemy Core and lose the high-level ORM functionality. I am confused by this statement, since I think code like the demo below can achieve what I want while keeping the functionality of the SQLAlchemy ORM.
class Foo(Base):
    ...
    bar = Column(Integer)

foo = Foo(bar=1)
insert_stmt = insert(Foo).values(bar=foo.bar)
do_update_stmt = insert_stmt.on_conflict_do_update(
    set_=dict(
        bar=insert_stmt.excluded.bar,
    )
)
session.execute(do_update_stmt)
I haven't tested it on my project since it would require a huge amount of modification. Can I ask if this is the correct way to deal with ON CONFLICT (id) DO UPDATE in the SQLAlchemy ORM?
As noted in the documentation, the constraint= argument is
The name of a unique or exclusion constraint on the table, or the constraint object itself if it has a .name attribute.
so we need to pass the name of the PK constraint to .on_conflict_do_update().
We can get the PK constraint name via the inspection interface:
from sqlalchemy import inspect
from sqlalchemy.dialects.postgresql import insert
from sqlalchemy.orm import Session

insp = inspect(engine)
pk_constraint_name = insp.get_pk_constraint(Foo.__tablename__)["name"]
print(pk_constraint_name)  # tbl_foo_pkey

new_bar = 123
insert_stmt = insert(Foo).values(id=12345, bar=new_bar)
do_update_stmt = insert_stmt.on_conflict_do_update(
    constraint=pk_constraint_name, set_=dict(bar=new_bar)
)
with Session(engine) as session, session.begin():
    session.execute(do_update_stmt)
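As an aside, if you'd rather not look up the constraint name at runtime, on_conflict_do_update() also accepts index_elements=, a list of the columns making up the conflict target; a minimal sketch reusing the names above:

insert_stmt = insert(Foo).values(id=12345, bar=new_bar)
do_update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=['id'],  # the conflict target, here the PK column
    set_=dict(bar=insert_stmt.excluded.bar),
)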

How to automap a backend-agnostic Base in SQLAlchemy

I am trying to take a simple database schema in Oracle and migrate it to a mssql database. There are other ways I can do this but my first thought was to utilize SQLAlchemy's automap and create_all functionality to do it pretty much instantaneously.
Unfortunately when I attempt to do so I run into some conversion errors:
Input:
from sqlalchemy.ext.automap import automap_base
from custom_connections import connect_to_oracle, connect_to_mssql

Base = automap_base()
oracle_engine = connect_to_oracle()
mssql_engine = connect_to_mssql()

Base.prepare(oracle_engine, reflect=True, schema='ORACLE_MAIN_DB')
Base.metadata.create_all(mssql_engine)
(Note that the connect_to functions are custom functions which return sqlalchemy engines. Currently they just return engines with base settings.)
Output:
CompileError: (in table 'acct', column 'acctnbr'): Compiler <sqlalchemy.dialects.mssql.base.MSTypeCompiler object at 0x00000268E8FF6DA0> can't render element of type <class 'sqlalchemy.dialects.oracle.base.NUMBER'>
The issue is that while SQLAlchemy converts most types to generic SQLAlchemy types when mapping the Base, it doesn't do the same with Oracle NUMBER types. I attempted a similar trick using alembic autogeneration off the automapped Base, but the Oracle NUMBER types caused issues there as well.
Given all the power behind it, I would have thought SQLAlchemy could handle this without any issues. Is there a technique or setting I could use when running this code that would convert all types to their SQLAlchemy equivalents when mapping the Base, instead of just most types?
This is described in the SQLAlchemy docs on reflection: a "column_reflect" event listener can rewrite each reflected column's type as its generic equivalent.

from sqlalchemy import MetaData, Table, create_engine
from sqlalchemy import event

mysql_engine = create_engine("mysql://scott:tiger@localhost/test")
metadata_obj = MetaData()

# register the listener before reflecting anything, so it applies to every
# reflected column
@event.listens_for(metadata_obj, "column_reflect")
def genericize_datatypes(inspector, tablename, column_dict):
    column_dict["type"] = column_dict["type"].as_generic()

# my_table should already exist in your db; this is the table you want to convert
my_generic_table = Table("my_table", metadata_obj, autoload_with=mysql_engine)

# my_generic_table is now a generic version of my_table
sqlite_eng = create_engine("sqlite:///test.db", echo=True)
my_generic_table.create(sqlite_eng)
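For what it's worth, the same listener should slot into the Oracle-to-MSSQL scenario from the question. A hedged sketch using plain MetaData reflection (create_all only needs table metadata, not the automap classes; connect_to_oracle/connect_to_mssql are the asker's custom helpers):

import sqlalchemy as sa
from sqlalchemy import event

oracle_engine = connect_to_oracle()
mssql_engine = connect_to_mssql()

metadata_obj = sa.MetaData()

@event.listens_for(metadata_obj, "column_reflect")
def genericize_datatypes(inspector, tablename, column_dict):
    # e.g. oracle.NUMBER becomes a generic Numeric/Integer
    column_dict["type"] = column_dict["type"].as_generic()

metadata_obj.reflect(bind=oracle_engine, schema='ORACLE_MAIN_DB')
metadata_obj.create_all(mssql_engine)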

Integer field not autoincrementing in SQLAlchemy

I have a Flask-SQLAlchemy model with an Integer field that I'd like to autoincrement. It's not a primary key; it's a surrogate ID. The model looks like:
class StreetSegment(db.Model):
    id = db.Column(db.Integer, autoincrement=True)
    seg_id = db.Column(db.Integer, primary_key=True)
When I create the table in my Postgres database, the id field is created as a plain integer. If I insert rows without specifying a value for id, it doesn't get populated. Is there some way I can force SQLAlchemy to use SERIAL even if it isn't the primary key?
Use Sequence instead of autoincrement:
id = db.Column(db.Integer, db.Sequence("seq_street_segment_id"))
SQLAlchemy does not support auto_increment for non-primary-key columns.
If your database supports it, you can set up the same behavior using sequences, and PostgreSQL does. Sequences are not actually bound to one specific column; they exist at the database level and can be reused. Sequences are the exact construct SQLAlchemy uses for auto-incrementing primary-key columns.
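For PostgreSQL, a minimal sketch of the full model (the sequence name mirrors the answer above; server_default is added so the database also fills the column for inserts that bypass the ORM):

street_segment_id_seq = db.Sequence("seq_street_segment_id")

class StreetSegment(db.Model):
    seg_id = db.Column(db.Integer, primary_key=True)
    # the Sequence fires for ORM inserts; server_default covers plain SQL inserts
    id = db.Column(db.Integer, street_segment_id_seq,
                   server_default=street_segment_id_seq.next_value())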
To use a sequence as described in the accepted answer, it must exist. Below is an example of an alembic migration with SQLAlchemy that achieves this.
You can associate a sequence with a column in the column constructor. The DDL Expression Constructs API helps you create and drop the sequence.
An example:
from alembic import op
import sqlalchemy as sa

measurement_id_seq = sa.Sequence('Measurement_MeasurementId_seq')  # represents the sequence

def upgrade():
    op.execute(sa.schema.CreateSequence(measurement_id_seq))  # create the sequence
    op.create_table(
        'Measurement',
        sa.Column('DataSourceId',
                  sa.Integer,
                  sa.ForeignKey('DataSource.DataSourceId'),
                  nullable=False),
        sa.Column('LocationId',
                  sa.Integer,
                  sa.ForeignKey('Location.LocationId'),
                  nullable=False),
        sa.Column('MeasurementId',
                  sa.Integer,
                  measurement_id_seq,  # the sequence as SchemaItem
                  server_default=measurement_id_seq.next_value()))  # next value of the sequence as default
    [...]
    op.create_primary_key('Measurement_pkey', 'Measurement',
                          ['DataSourceId', 'LocationId', 'Timestamp'])

def downgrade():
    op.execute(
        sa.schema.DropSequence(sa.Sequence('Measurement_MeasurementId_seq')))
    op.drop_constraint('Measurement_pkey', 'Measurement')
    op.drop_table('Measurement')

In SQLAlchemy, why must I alias the select construct when mapping against arbitrary selects?

I tried to replicate the code from the docs regarding mapping models to arbitrary tables, but I get the following error:
sqlalchemy.exc.InvalidRequestError: When mapping against a select() construct, map against an alias() of the construct instead. This because several databases don't allow a SELECT from a subquery that does not have an alias.
Here is how I implemented the code example.
from sqlalchemy import (
    select, func,
    Table, Column,
    Integer, ForeignKey,
    MetaData,
)
from sqlalchemy.ext.declarative import declarative_base

metadata = MetaData()
Base = declarative_base()

customers = Table('customer', metadata,
    Column('id', Integer, primary_key=True),
)

orders = Table('order', metadata,
    Column('id', Integer, primary_key=True),
    Column('price', Integer),
    Column('customer_id', Integer, ForeignKey('customer.id')),
)

subq = select([
    func.count(orders.c.id).label('order_count'),
    func.max(orders.c.price).label('highest_order'),
    orders.c.customer_id
]).group_by(orders.c.customer_id).alias()

customer_select = select([customers, subq]).\
    where(customers.c.id == subq.c.customer_id)

class Customer(Base):
    __table__ = customer_select
I can make this work by using the following:
class Customer(Base):
    __table__ = customer_select.alias()
Unfortunately, that wraps every query in a subselect, which is prohibitively slow.
Is there a way to map a model against an arbitrary select? Is this a documentation error--the code sample from the docs doesn't work for me (in sqlalchemy==0.8.0b2 or 0.7.10)?
If a subquery is turning out to be "prohibitively slow" this probably means you're using MySQL, which as I've noted elsewhere has really unacceptable subquery performance.
The mapping of a select needs that statement to be parenthesized, basically, so that its internal constructs don't leak outwards into the context in which it is used, and so that the select() construct itself doesn't need to be mutated.
Consider if you were to map to a SELECT like this:
SELECT id, name FROM table WHERE id = 5
now suppose you were to select from a mapping of this:
ma = aliased(Mapping)
query(Mapping).join(ma, Mapping.name == ma.name)
without your mapped select being an alias, SQLAlchemy has to modify your select() in place, which means it has to dig into it and fully understand its geometry, which can be quite complex (consider if it has LIMIT, ORDER BY, GROUP BY, aggregates, etc.):
SELECT id, name FROM table
JOIN (SELECT id, name FROM table) as anon_1 ON table.name = anon_1.name
WHERE table.id = 5
whereas, when it's self-contained, SQLAlchemy can remain ignorant of the structure of your select() and use it just like any other table:
SELECT id, name FROM (SELECT id, name FROM table WHERE id = 5) as a1
JOIN (SELECT id, name FROM table WHERE id = 5) as a2 ON a1.name = a2.name
In practice, mapping to a SELECT statement is a rare use case these days, as the Query construct is typically flexible enough to do whatever pattern you need. Very early versions of SQLAlchemy featured mapper() as the primary querying object, and internally we do make use of mapper()'s flexibility in this way (especially in joined inheritance), but the explicit approach isn't needed very often.
Is this a documentation error--the code sample from the docs doesn't work for me (in sqlalchemy==0.8.0b2 or 0.7.10)
Thanks for pointing this out. I've corrected the example and also added a huge note discouraging the practice overall, as it doesn't have a lot of use (I know this because I never use it).
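For illustration, a hedged sketch of getting the same aggregate columns without mapping a class to the SELECT at all, reusing customers and subq from the question (legacy select([...]) style to match that era; engine is assumed to be bound to your database):

stmt = select([customers.c.id, subq.c.order_count, subq.c.highest_order]).\
    where(customers.c.id == subq.c.customer_id)

for row in engine.execute(stmt):
    print(row)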

sqlalchemy 0.6 legacy database access?

I feel like this should be simple, but I can't find a single example of it being done.
As an example I have the following existing tables:
CREATE TABLE `source` (
  `source_id` tinyint(3) unsigned NOT NULL auto_increment,
  `name` varchar(40) default NULL,
  PRIMARY KEY (`source_id`),
  UNIQUE KEY `source_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;

CREATE TABLE `event` (
  `source_id` tinyint(3) unsigned NOT NULL default '0',
  `info` varchar(255) NOT NULL default '',
  `item` varchar(100) NOT NULL default '',
  PRIMARY KEY (`source_id`,`info`,`item`),
  KEY `event_fkindex1` (`source_id`),
  CONSTRAINT `event_fk1` FOREIGN KEY (`source_id`) REFERENCES `source` (`source_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I'd like to use SQLAlchemy 0.6 to add a lot of rows to the event table. I've seen some sqlsoup examples, but I really hate the way it accesses the db by constantly calling the db object. I followed the docs for the db reflection stuff and got this far:
import sqlalchemy
from sqlalchemy import Table, Column, MetaData, create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine('mysql://user:pass@server/db', echo=True)
metadata = MetaData()
source = Table('source', metadata, autoload=True, autoload_with=engine)

Session = sessionmaker(bind=engine)
session = Session()
session.query(source).first()
This returns a really ugly object. I really want the mapper functionality of the sqlalchemy ORM so I can construct Event objects to insert into the DB.
I looked at the sqlsoup stuff:
from sqlalchemy.ext.sqlsoup import SqlSoup

db = SqlSoup(engine)
db.sources.all()  # this kinda works out better
But I couldn't figure out how to add objects from this point, and I'm not even sure this is what I want; I'd like to be able to follow the tutorial and the declarative_base stuff. Is this possible without having to write a class modeling the whole table structure? If it's not, can someone show me how I'd do it in this example?
Can someone set me down the right path for getting the mapper stuff to work?
You can use a predefined/autoloaded table with declarative_base by assigning it to the __table__ attribute. The columns are picked up from the table, but you'll still have to declare any relations you want to use.
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relation

Base = declarative_base()
event = Table('event', metadata, autoload=True, autoload_with=engine)  # reflect 'event' like 'source'

class Source(Base):
    __table__ = source

class Event(Base):
    __table__ = event
    source = relation(Source)
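With those mappings in place, a quick hedged usage sketch (session as created in the question; the 'syslog' name is made up):

src = Source(name='syslog')
evt = Event(info='xyz', item='foo')
evt.source = src  # wires up source_id via the relation
session.add(evt)
session.commit()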
However, if you are going to be inserting a huge number of rows, then going around the ORM and using executemany will get you a large performance increase. You can use executemany like this:
conn = engine.connect()
conn.execute(event.insert(), [
    {'source_id': 1, 'info': 'xyz', 'item': 'foo'},
    {'source_id': 1, 'info': 'xyz', 'item': 'bar'},
    ...
])
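On much newer SQLAlchemy (1.0+, not the 0.6 in the question), the ORM also gained a middle ground between full objects and Core executemany, sketched here for completeness:

session.bulk_insert_mappings(Event, [
    {'source_id': 1, 'info': 'xyz', 'item': 'foo'},
    {'source_id': 1, 'info': 'xyz', 'item': 'bar'},
])
session.commit()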
