I feel like this should be simple, but I can't find a single example of it being done.
As an example I have the following existing tables:
CREATE TABLE `source` (
`source_id` tinyint(3) unsigned NOT NULL auto_increment,
`name` varchar(40) default NULL,
PRIMARY KEY (`source_id`),
UNIQUE KEY `source_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
CREATE TABLE `event` (
`source_id` tinyint(3) unsigned NOT NULL default '0',
`info` varchar(255) NOT NULL default '',
`item` varchar(100) NOT NULL default '',
PRIMARY KEY (`source_id`,`info`,`item`),
KEY `event_fkindex1` (`source_id`),
CONSTRAINT `event_fk1` FOREIGN KEY (`source_id`) REFERENCES `source` (`source_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I'd like to use SQLAlchemy 0.6 to add a lot of rows to the event table. I've seen some SqlSoup examples, but I really dislike the way it accesses the db by constantly calling the db object. I followed the docs for the db reflection stuff and got this far:
import sqlalchemy
from sqlalchemy import Table, Column, MetaData, create_engine
from sqlalchemy.orm import sessionmaker
engine = create_engine('mysql://user:pass@server/db', echo=True)
metadata = MetaData()
source = Table('source', metadata, autoload=True, autoload_with=engine)
Session = sessionmaker(bind=engine)
session = Session()
session.query(source).first()
This returns a really ugly object. I really want the mapper functionality of the sqlalchemy ORM so I can construct Event objects to insert into the DB.
I looked at the sqlsoup stuff:
from sqlalchemy.ext.sqlsoup import SqlSoup
db = SqlSoup(engine)
db.sources.all()  # this kinda works out better
But I couldn't figure out how to add objects from this point. I'm not even sure this is what I want; I'd like to be able to follow the tutorial and the declarative_base stuff. Is this possible without having to write a class to model the whole table structure? If it's not, can someone show me how I'd do this in this example?
Can someone set me down the right path for getting the mapper stuff to work?
You can use a predefined/autoloaded table with declarative_base by assigning it to the __table__ attribute. The columns are picked up from the table, but you'll still have to declare any relations you want to use.
class Source(Base):
    __table__ = source

class Event(Base):
    __table__ = event
    source = relation(Source)
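To show how the pieces fit together end to end, here is a self-contained sketch against an in-memory SQLite database with simplified column types (the original MySQL tables aren't available here). It uses the modern import locations; on 0.6, declarative_base lives in sqlalchemy.ext.declarative and relationship is spelled relation. The 'syslog' source name is made up for the example:

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, String,
                        Table, create_engine)
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

engine = create_engine('sqlite://')  # stand-in for the MySQL engine
metadata = MetaData()

# stand-ins for the autoloaded tables, with simplified column types
source = Table('source', metadata,
    Column('source_id', Integer, primary_key=True),
    Column('name', String(40), unique=True),
)
event = Table('event', metadata,
    Column('source_id', Integer, ForeignKey('source.source_id'),
           primary_key=True),
    Column('info', String(255), primary_key=True),
    Column('item', String(100), primary_key=True),
)
metadata.create_all(engine)

Base = declarative_base()

class Source(Base):
    __table__ = source

class Event(Base):
    __table__ = event
    source = relationship(Source)  # relations still need declaring

session = sessionmaker(bind=engine)()
src = Source(name='syslog')  # hypothetical source name
session.add(Event(source=src, info='xyz', item='foo'))
session.commit()
print(session.query(Event).count())  # prints "1"
```

The Event's source_id is filled in automatically from the relationship when the session flushes, so you never have to look up foreign-key values by hand.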
However, if you are going to be inserting a huge number of rows, then going around the ORM and using executemany will get you a large performance increase. You can use executemany like this:
conn = engine.connect()
conn.execute(event.insert(),[
{'source_id': 1, 'info': 'xyz', 'item': 'foo'},
{'source_id': 1, 'info': 'xyz', 'item': 'bar'},
...
])
Let's say I have a simple table. In raw SQL it looks like this:
CREATE TABLE events (id INT PRIMARY KEY) WITH (fillfactor=60);
My question is how to specify fillfactor for a table using sqlalchemy declarative base?
Of course I can achieve that by using raw SQL in sqlalchemy like execute("ALTER TABLE events SET (fillfactor=60)"), but I'm interested whether there is a way to do that using native sqlalchemy tools.
I've tried the following approach, but it didn't work:
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class SimpleExampleTable(Base):
    __tablename__ = 'events'
    __table_args__ = {'comment': 'events table', 'fillfactor': 60}

    id = Column(Integer, primary_key=True)
TypeError: Additional arguments should be named <dialectname>_<argument>, got 'fillfactor'
Looking through documentation I've managed to find information only about fillfactor usage in indexes.
My environment:
python 3.9
sqlalchemy 1.3.22
PostgreSQL 11.6
Tbh, I have the same question, and the only thing related to fillfactor I found in the SQLAlchemy docs is about indexes (link to docs):
PostgreSQL allows storage parameters to be set on indexes. The storage
parameters available depend on the index method used by the index.
Storage parameters can be specified on Index using the postgresql_with
keyword argument:
Index('my_index', my_table.c.data, postgresql_with={"fillfactor": 50})
But it seems there is no option to set the fillfactor directly on the table.
There is still the option to run a raw SQL query (as an alembic migration, let's say):
ALTER TABLE mytable SET (fillfactor = 70);
Note that setting fillfactor on an existing table will not rearrange
the data, it will only apply to future inserts. But you can use
VACUUM to rewrite the table, which will respect the new fillfactor
setting.
The previous quote is taken from here
Extending the answer from Max Kapustin, you can use an event listener to automatically execute the ALTER TABLE statement when the table is created.
import sqlalchemy as sa
engine = sa.create_engine('postgresql:///test', echo=True, future=True)
tablename = 't65741211'
tbl = sa.Table(
    tablename,
    sa.MetaData(),
    sa.Column('id', sa.Integer, primary_key=True),
    listeners=[
        (
            'after_create',
            sa.schema.DDL(
                f"""ALTER TABLE "{tablename}" SET (fillfactor = 70)"""
            ),
        )
    ],
)
tbl.drop(engine, checkfirst=True)
tbl.create(engine)
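The same idea also works with declarative_base, attaching the DDL via sqlalchemy.event instead of the Table listeners argument. A sketch: the execute_if(dialect='postgresql') guard means the block runs as-is against SQLite (the ALTER is simply skipped), while on a real PostgreSQL engine the ALTER TABLE fires right after CREATE TABLE:

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class SimpleExampleTable(Base):
    __tablename__ = 'events'
    id = sa.Column(sa.Integer, primary_key=True)

# attach the ALTER TABLE to the table's create event; the execute_if
# guard restricts it to PostgreSQL, so other dialects just skip it
sa.event.listen(
    SimpleExampleTable.__table__,
    'after_create',
    sa.DDL('ALTER TABLE events SET (fillfactor = 60)').execute_if(
        dialect='postgresql'),
)

engine = sa.create_engine('sqlite://')  # swap in the real PostgreSQL URL
Base.metadata.create_all(engine)  # on PostgreSQL the ALTER TABLE fires too
```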
I have a flask application that relies on an existing Teradata Database to serve up information to and accept input from its users. I am able to successfully make the connection between the application and the Teradata Database, however, I am not able to then define classes that will represent tables already existing in my database.
Currently, I am defining a 'Base' class using sqlalchemy that represents the connection to my database. There is no problem here and I am even able to execute queries using the connection used to build the 'Base' class. However, my problem is in using this 'Base' class to create a subclass 'Users' for my teradata table 'users'. My understanding is that sqlalchemy should allow for me to define a subclass of the superclass 'Base' which will inherit the metadata from the underlying teradata table that the subclass represents - in this case, my 'users' table. Here is the code I have so far:
import getpass
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.schema import MetaData
user = 'user_id_string'
pasw=getpass.getpass()
host = 'host_string'
db_name = 'db_name'
engine = create_engine(f'{host}?user={user}&password={pasw}&logmech=LDAP')
connection = engine.connect()
connection.execute(f'DATABASE {db_name}')
md = MetaData(bind=connection, reflect=False, schema='db_name')
md.reflect(only=['users'])
Base = declarative_base(bind=connection, metadata=md)
class Users(Base):
    __table__ = md.tables['db_name.users']
This is the error that I receive when constructing the subclass 'Users':
sqlalchemy.exc.ArgumentError: Mapper mapped class Users->users could not assemble any primary key columns for mapped table 'users'
Is there some reason that my subclass 'Users' is not automatically being mapped to the table metadata from the existing teradata table 'users' that I have assigned it to in defining the class? The underlying table already has a primary key set so I don't understand why sqlalchemy is not assuming the existing primary key. Thanks for your help in advance.
EDIT: The underlying table DOES NOT have a primary KEY, only a primary INDEX.
From SQLAlchmey documentation: (https://docs.sqlalchemy.org/en/13/faq/ormconfiguration.html#how-do-i-map-a-table-that-has-no-primary-key)
The SQLAlchemy ORM, in order to map to a particular table, needs there to be at least one column denoted as a primary key column; multiple-column, i.e. composite, primary keys are of course entirely feasible as well. These columns do not need to be actually known to the database as primary key columns, though it’s a good idea that they are. It’s only necessary that the columns behave as a primary key does, e.g. as a unique and not nullable identifier for a row.
Most ORMs require that objects have some kind of primary key defined because the object in memory must correspond to a uniquely identifiable row in the database table; at the very least, this allows the object to be targeted for UPDATE and DELETE statements which will affect only that object’s row and no other. However, the importance of the primary key goes far beyond that. In SQLAlchemy, all ORM-mapped objects are at all times linked uniquely within a Session to their specific database row using a pattern called the identity map, a pattern that’s central to the unit of work system employed by SQLAlchemy, and is also key to the most common (and not-so-common) patterns of ORM usage.
In almost all cases, a table does have a so-called candidate key, which is a column or series of columns that uniquely identify a row. If a table truly doesn’t have this, and has actual fully duplicate rows, the table is not corresponding to first normal form and cannot be mapped. Otherwise, whatever columns comprise the best candidate key can be applied directly to the mapper:
class SomeClass(Base):
    __table__ = some_table_with_no_pk
    __mapper_args__ = {
        'primary_key': [some_table_with_no_pk.c.uid, some_table_with_no_pk.c.bar]
    }
Better yet is when using fully declared table metadata, use the primary_key=True flag on those columns:
class SomeClass(Base):
    __tablename__ = "some_table_with_no_pk"

    uid = Column(Integer, primary_key=True)
    bar = Column(String, primary_key=True)
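Applied to the reflected-table case from the question, the same __mapper_args__ override looks roughly like this. SQLite stands in for Teradata here, and the user_id/name columns are assumptions, since the question never shows the users table's actual schema:

```python
from sqlalchemy import MetaData, create_engine, text
from sqlalchemy.orm import declarative_base, sessionmaker

engine = create_engine('sqlite://')  # SQLite standing in for Teradata
with engine.begin() as conn:
    # stand-in for the existing table: note there is no PRIMARY KEY
    conn.execute(text('CREATE TABLE users (user_id INTEGER, name TEXT)'))

md = MetaData()
md.reflect(bind=engine, only=['users'])
Base = declarative_base(metadata=md)

users_tbl = md.tables['users']

class Users(Base):
    __table__ = users_tbl
    # tell the mapper which column(s) act as the row identity
    __mapper_args__ = {'primary_key': [users_tbl.c.user_id]}

session = sessionmaker(bind=engine)()
session.add(Users(user_id=1, name='alice'))
session.commit()
print(session.query(Users).count())  # prints "1"
```

The database itself still has no primary key; the mapper just treats user_id as one, which is exactly what the FAQ passage above describes.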
We're converting a codebase to SQLAlchemy; there's an existing database that we can modify but not completely replace.
There's a collection of widgets, and for each widget we keep track of the 20 most similar other widgets (this is a directional relationship, i.e. widget_2 can appear in widget_1's most similar widgets, but not vice versa).
There's a widget table which has a widget_id field and some other things.
There's a similarity table which has first_widget_id, second_widget_id and similarity_score. We only save the 20 most similar widgets in the database, so that every widget_id appears exactly 20 times as first_widget_id.
first_widget_id and second_widget_id have foreign keys pointing to the widget table.
We're using SQLAlchemy's automap functionality, so a Widget object has a Widget.similarity_collection field. However, for a specified widget_id, it only includes items where second_widget_id == widget_id, whereas we want first_widget_id == widget_id. I understand that SQLAlchemy has no way to know which of the 2 it should pick.
Can we tell it somehow?
EDIT: as per the comment, here are more details on the models:
CREATE TABLE IF NOT EXISTS `similarity` (
`first_widget_id` int(6) NOT NULL,
`second_widget_id` int(6) NOT NULL,
`score` int(5) NOT NULL,
PRIMARY KEY (`first_widget_id`,`second_widget_id`),
KEY `first_widget_id` (`first_widget_id`),
KEY `second_widget_id_index` (`second_widget_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `similarity`
ADD CONSTRAINT `similar_first_widget_id_to_widgets_foreign_key` FOREIGN KEY (`first_widget_id`) REFERENCES `widgets` (`widget_id`) ON DELETE CASCADE ON UPDATE CASCADE,
ADD CONSTRAINT `similar_second_widget_id_to_widgets_foreign_key` FOREIGN KEY (`second_widget_id`) REFERENCES `widgets` (`widget_id`) ON DELETE CASCADE ON UPDATE CASCADE;
CREATE TABLE IF NOT EXISTS `widgets` (
`widget_id` int(6) NOT NULL AUTO_INCREMENT,
`widget_name` varchar(70) NOT NULL,
PRIMARY KEY (`widget_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=13179;
And using this python code to initialize SQLAlchemy:
base = automap_base()
engine = create_engine(
    'mysql://%s:%s@%s/%s?charset=utf8mb4' % (
        config.DB_USER, config.DB_PASSWD, config.DB_HOST, config.DB_NAME
    ), echo=False
)
# reflect the tables
base.prepare(engine, reflect=True)
Widgets = base.classes.widgets
Now when we do something like:
session.query(Widgets).filter_by(widget_id=1).first().similarity_collection
We get sqlalchemy.ext.automap.similarity objects for which second_widget_id == 1, whereas we want first_widget_id == 1.
You can override how the similarity_collection joins, even when automapping, with an explicit class definition and passing foreign_keys to the relationship:
base = automap_base()
engine = create_engine(
    'mysql://%s:%s@%s/%s?charset=utf8mb4' % (
        config.DB_USER, config.DB_PASSWD, config.DB_HOST, config.DB_NAME
    ), echo=False
)

# The class definition that ensures a certain join path for the
# relationship. The rest of the mapping is automapped upon reflecting.
class Widgets(base):
    __tablename__ = 'widgets'

    similarity_collection = relationship(
        'similarity', foreign_keys='similarity.first_widget_id')

base.prepare(engine, reflect=True)
If you wish to also control the created relationship(s) in similarity – for neat association proxies or such – use the same pattern.
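Here is the same pattern as a self-contained sketch, with SQLite standing in for MySQL and the schema reduced to the columns above. Base.prepare(autoload_with=engine) is the modern spelling of prepare(engine, reflect=True):

```python
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import relationship, sessionmaker

engine = create_engine('sqlite://')  # stand-in for the MySQL database
with engine.begin() as conn:
    conn.execute(text(
        'CREATE TABLE widgets ('
        ' widget_id INTEGER PRIMARY KEY,'
        ' widget_name VARCHAR(70) NOT NULL)'))
    conn.execute(text(
        'CREATE TABLE similarity ('
        ' first_widget_id INTEGER NOT NULL REFERENCES widgets (widget_id),'
        ' second_widget_id INTEGER NOT NULL REFERENCES widgets (widget_id),'
        ' score INTEGER NOT NULL,'
        ' PRIMARY KEY (first_widget_id, second_widget_id))'))

Base = automap_base()

class Widgets(Base):
    __tablename__ = 'widgets'
    # force the join onto first_widget_id instead of the default pick
    similarity_collection = relationship(
        'similarity', foreign_keys='similarity.first_widget_id')

Base.prepare(autoload_with=engine)
Similarity = Base.classes.similarity

session = sessionmaker(bind=engine)()
session.add_all([
    Widgets(widget_id=1, widget_name='a'),
    Widgets(widget_id=2, widget_name='b'),
    Similarity(first_widget_id=1, second_widget_id=2, score=5),
])
session.commit()
print([s.second_widget_id
       for s in session.get(Widgets, 1).similarity_collection])
```

With the override in place, widget 1's collection now contains the rows where it is the first_widget_id, i.e. its own top-20 list.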
I'm working on a project with Alembic and SQLAlchemy, but I'm having trouble creating a simple entry in the database as a test. I get the following error:
sqlalchemy.exc.ArgumentError: Mapper Mapper|Sale|sales_cache could not assemble any primary key columns for mapped table 'sales_cache'
I've established the primary key (account_id) in both places below, any idea why SQLAlchemy doesn't recognize that or how to fix it? The other answers I've read have all dealt with exception cases for multiple/no primary keys, and have been solved accordingly, but this is a pretty vanilla model that keeps failing.
I've read up on other answers, most of which deal with the declarative system:
class Sale(Base):
    __tablename__ = 'sales_cache'
But I'm required to use the classical mapping system; here's my mapped class and schema, respectively:
class Sale(object):
    def __init__(self, notification):
        self._sale_id = self._notification.object_id
        self._account_id = self._notification.account_id
### schema.py file ###
from sqlalchemy.schema import MetaData, Table, Column
from sqlalchemy.types import Unicode, Integer
from database import metadata

metadata = MetaData()

sales_cache = Table('sales_cache', metadata,
    Column('account_id', Integer, primary_key=True, autoincrement=False),
    Column('sale_id', Integer, nullable=False)
)
And this is the relevant line from my alembic revision:
sa.Column('account_id', sa.Integer(), primary_key=True, autoincrement=False),
I thought it might be failing because I was setting self._sale_id and self._account_id instead of self.sale_id and self.account_id (without the underscore), but nothing changed when I tried it this way too.
Thanks in advance
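One thing worth checking, since the question doesn't show it: with classical mapping, the class and the table are only tied together by an explicit mapper() call (registry.map_imperatively() in modern SQLAlchemy), and mapping against the wrong MetaData loses the primary key. Notice that the schema.py above imports metadata from database and then immediately shadows it with a new MetaData(), so the table carrying the primary key may not be the one the mapper ends up seeing. A minimal working sketch of the classical style (the simplified __init__ is illustrative, not the question's Notification-based one):

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine
from sqlalchemy.orm import registry, sessionmaker

metadata = MetaData()

sales_cache = Table('sales_cache', metadata,
    Column('account_id', Integer, primary_key=True, autoincrement=False),
    Column('sale_id', Integer, nullable=False),
)

class Sale(object):
    def __init__(self, account_id, sale_id):
        self.account_id = account_id
        self.sale_id = sale_id

# The classical-mapping step: this call is what ties Sale to the table.
# In SQLAlchemy <= 1.3 it is spelled mapper(Sale, sales_cache).
mapper_registry = registry()
mapper_registry.map_imperatively(Sale, sales_cache)

engine = create_engine('sqlite://')
metadata.create_all(engine)

session = sessionmaker(bind=engine)()
session.add(Sale(account_id=1, sale_id=10))
session.commit()
print(session.query(Sale).count())  # prints "1"
```

If the mapper and the Table share one MetaData like this, the primary key is picked up and the "could not assemble any primary key columns" error goes away.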
I tried to replicate the code from the docs regarding mapping models to arbitrary tables, but I get the following error:
sqlalchemy.exc.InvalidRequestError: When mapping against a select() construct, map against an alias() of the construct instead. This because several databases don't allow a SELECT from a subquery that does not have an alias.
Here is how I implemented the code example.
from sqlalchemy import (
    select, func,
    Table, Column,
    Integer, ForeignKey,
    MetaData,
)
from sqlalchemy.ext.declarative import declarative_base

metadata = MetaData()
Base = declarative_base()

customers = Table('customer', metadata,
    Column('id', Integer, primary_key=True),
)

orders = Table('order', metadata,
    Column('id', Integer, primary_key=True),
    Column('price', Integer),
    Column('customer_id', Integer, ForeignKey('customer.id')),
)

subq = select([
    func.count(orders.c.id).label('order_count'),
    func.max(orders.c.price).label('highest_order'),
    orders.c.customer_id
]).group_by(orders.c.customer_id).alias()

customer_select = select([customers, subq]).\
    where(customers.c.id == subq.c.customer_id)

class Customer(Base):
    __table__ = customer_select
I can make this work by using the following:
class Customer(Base):
    __table__ = customer_select.alias()
Unfortunately, that creates all the queries in a subselect, which is prohibitively slow.
Is there a way to map a model against an arbitrary select? Is this a documentation error? The code sample from the docs doesn't work for me (in sqlalchemy==0.8.0b2 or 0.7.10).
If a subquery is turning out to be "prohibitively slow" this probably means you're using MySQL, which as I've noted elsewhere has really unacceptable subquery performance.
The mapping of a select needs that statement to be parenthesized, basically, so that its internal constructs don't leak outwards into the context in which it is used, and so that the select() construct itself doesn't need to be mutated.
Consider if you were to map to a SELECT like this:
SELECT id, name FROM table WHERE id = 5
now suppose you were to select from a mapping of this:
ma = aliased(Mapping)
query(Mapping).join(ma, Mapping.name == ma.name)
without your mapped select being an alias, SQLAlchemy has to modify your select() in place, which means it has to dig into it and fully understand its geometry, which can be quite complex (consider if it has LIMIT, ORDER BY, GROUP BY, aggregates, etc.):
SELECT id, name FROM table
JOIN (SELECT id, name FROM table) as anon_1 ON table.name = anon_1.name
WHERE table.id = 5
whereas, when it's self-contained, SQLAlchemy can remain ignorant of the structure of your select() and use it just like any other table:
SELECT id, name FROM (SELECT id, name FROM table WHERE id = 5) as a1
JOIN (SELECT id, name FROM table WHERE id = 5) as a2 ON a1.name = a2.name
In practice, mapping to a SELECT statement is a rare use case these days, as the Query construct is typically flexible enough to do whatever pattern you need. Very early versions of SQLAlchemy featured the mapper() as the primary querying object, and internally we do make use of mapper()'s flexibility in this way (especially in joined inheritance), but the explicit approach isn't needed too often.
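As a sketch of that Query-based alternative, reusing the customer/order tables from the example above: rather than mapping a class to the aggregating SELECT, keep the plain tables and express the aggregate at query time (the sample rows are invented for illustration):

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, Table,
                        create_engine, func)
from sqlalchemy.orm import sessionmaker

metadata = MetaData()
customers = Table('customer', metadata,
    Column('id', Integer, primary_key=True),
)
orders = Table('order', metadata,
    Column('id', Integer, primary_key=True),
    Column('price', Integer),
    Column('customer_id', Integer, ForeignKey('customer.id')),
)

engine = create_engine('sqlite://')
metadata.create_all(engine)
with engine.begin() as conn:
    conn.execute(customers.insert(), [{'id': 1}])
    conn.execute(orders.insert(), [
        {'id': 1, 'price': 10, 'customer_id': 1},
        {'id': 2, 'price': 25, 'customer_id': 1},
    ])

session = sessionmaker(bind=engine)()
# express the aggregate at query time instead of baking it into a mapping
row = (session.query(
           customers.c.id,
           func.count(orders.c.id).label('order_count'),
           func.max(orders.c.price).label('highest_order'))
       .join(orders, orders.c.customer_id == customers.c.id)
       .group_by(customers.c.id)
       .one())
print(row.order_count, row.highest_order)  # prints "2 25"
```

Because the aggregate lives in the query rather than the mapping, the database sees an ordinary GROUP BY instead of a wrapped subquery, which sidesteps the subquery-performance problem mentioned above.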
Is this a documentation error--the code sample from the docs doesn't work for me (in sqlalchemy==0.8.0b2 or 0.7.10)
Thanks for pointing this out. I've corrected the example and also added a huge note discouraging the practice overall, as it doesn't have a lot of use (I know this because I never use it).