SQLAlchemy declarative concrete autoloaded table inheritance

I have an existing database and want to access it using SQLAlchemy. Because the database structure is managed by another piece of code (Django ORM, actually) and I don't want to repeat myself by describing every table structure, I'm using autoload introspection. I'm stuck on a simple case of concrete table inheritance.
Payment                      FooPayment
+ id (PK)  <------FK------   + payment_ptr_id (PK)
+ user_id                    + foo
+ amount
+ date
Here is the code, with the tables' SQL descriptions as docstrings:
class Payment(Base):
    """
    CREATE TABLE payments(
        id serial NOT NULL,
        user_id integer NOT NULL,
        amount numeric(11,2) NOT NULL,
        date timestamp with time zone NOT NULL,
        CONSTRAINT payment_pkey PRIMARY KEY (id),
        CONSTRAINT payment_user_id_fkey FOREIGN KEY (user_id)
            REFERENCES users (id) MATCH SIMPLE)
    """
    __tablename__ = 'payments'
    __table_args__ = {'autoload': True}
    # user = relation(User)
class FooPayment(Payment):
    """
    CREATE TABLE payments_foo(
        payment_ptr_id integer NOT NULL,
        foo integer NOT NULL,
        CONSTRAINT payments_foo_pkey PRIMARY KEY (payment_ptr_id),
        CONSTRAINT payments_foo_payment_ptr_id_fkey
            FOREIGN KEY (payment_ptr_id)
            REFERENCES payments (id) MATCH SIMPLE)
    """
    __tablename__ = 'payments_foo'
    __table_args__ = {'autoload': True}
    __mapper_args__ = {'concrete': True}
The actual tables have additional columns, but this is completely irrelevant to the question, so in an attempt to minimize the code I've simplified everything down to the core.
The problem is, when I run this:
payment = session.query(FooPayment).filter(Payment.amount >= 200.0).first()
print payment.date
The resulting SQL is meaningless (note the lack of a join condition):
SELECT payments_foo.payment_ptr_id AS payments_foo_payment_ptr_id,
... /* More `payments_foo' columns and NO columns from `payments' */
FROM payments_foo, payments
WHERE payments.amount >= 200.0 LIMIT 1 OFFSET 0
And when I'm trying to access payment.date I get the following error: Concrete Mapper|FooPayment|payments_foo does not implement attribute u'date' at the instance level.
I've tried adding an explicit foreign key reference id = Column('payment_ptr_id', Integer, ForeignKey('payments_payment.id'), primary_key=True) to FooPayment without any success. Trying print session.query(Payment).first().user works perfectly (I've omitted the User class and commented out the line), so FK introspection works.
How can I perform a simple query on FooPayment and access Payment's values from resulting instance?
I'm using SQLAlchemy 0.5.3, PostgreSQL 8.3, psycopg2 and Python 2.5.2.
Thanks for any suggestions.

Your table structures are similar to what is used in joined table inheritance, but they certainly don't correspond to concrete table inheritance, where all fields of the parent class are duplicated in the subclass's table. Right now you have a subclass with fewer fields than the parent and a reference to an instance of the parent class. Switch to joined table inheritance (and use FooPayment.amount in your condition), or give up on inheritance in favor of simple aggregation (a reference).
Filtering by a field of another model doesn't automatically add a join condition. Although it's obvious what condition should be used for your example, it's not possible to determine such a condition in general. That's why you have to define a relation property referring to Payment and use its has() method in the filter to get a proper join condition.
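A minimal sketch of the joined table inheritance variant, assuming the introspected foreign key from payments_foo.payment_ptr_id to payments.id is in place as shown above:

class Payment(Base):
    __tablename__ = 'payments'
    __table_args__ = {'autoload': True}

class FooPayment(Payment):
    __tablename__ = 'payments_foo'
    __table_args__ = {'autoload': True}
    # No 'concrete': True here: with joined table inheritance the
    # introspected FK supplies the join condition between the tables.

payment = session.query(FooPayment).filter(FooPayment.amount >= 200.0).first()
print payment.date  # inherited columns are now reachable through the join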

Related

How can I map subclasses of the sqlalchemy declarative_base() class to their tables in a Teradata Database? [duplicate]

This question was closed as a duplicate of: How to define a table without primary key with SQLAlchemy?
I have a Flask application that relies on an existing Teradata database to serve up information to, and accept input from, its users. I am able to successfully make the connection between the application and the Teradata database; however, I am not able to define classes that represent tables already existing in my database.
Currently, I am defining a 'Base' class using sqlalchemy that represents the connection to my database. There is no problem here, and I am even able to execute queries using the connection used to build the 'Base' class. However, my problem is in using this 'Base' class to create a subclass 'Users' for my Teradata table 'users'. My understanding is that sqlalchemy should allow me to define a subclass of the superclass 'Base' that inherits the metadata from the underlying Teradata table the subclass represents - in this case, my 'users' table. Here is the code I have so far:
import getpass

from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.schema import MetaData

user = 'user_id_string'
pasw = getpass.getpass()
host = 'host_string'
db_name = 'db_name'

engine = create_engine(f'{host}?user={user}&password={pasw}&logmech=LDAP')
connection = engine.connect()
connection.execute(f'DATABASE {db_name}')

md = MetaData(bind=connection, reflect=False, schema='db_name')
md.reflect(only=['users'])
Base = declarative_base(bind=connection, metadata=md)

class Users(Base):
    __table__ = md.tables['db_name.users']
This is the error that I receive when constructing the subclass 'Users':
sqlalchemy.exc.ArgumentError: Mapper mapped class Users->users could not assemble any primary key columns for mapped table 'users'
Is there some reason that my subclass 'Users' is not automatically being mapped to the table metadata from the existing teradata table 'users' that I have assigned it to in defining the class? The underlying table already has a primary key set so I don't understand why sqlalchemy is not assuming the existing primary key. Thanks for your help in advance.
EDIT: The underlying table DOES NOT have a primary KEY, only a primary INDEX.
From the SQLAlchemy documentation (https://docs.sqlalchemy.org/en/13/faq/ormconfiguration.html#how-do-i-map-a-table-that-has-no-primary-key):
The SQLAlchemy ORM, in order to map to a particular table, needs there to be at least one column denoted as a primary key column; multiple-column, i.e. composite, primary keys are of course entirely feasible as well. These columns do not need to be actually known to the database as primary key columns, though it’s a good idea that they are. It’s only necessary that the columns behave as a primary key does, e.g. as a unique and not nullable identifier for a row.
Most ORMs require that objects have some kind of primary key defined because the object in memory must correspond to a uniquely identifiable row in the database table; at the very least, this allows the object to be targeted for UPDATE and DELETE statements which will affect only that object's row and no other. However, the importance of the primary key goes far beyond that. In SQLAlchemy, all ORM-mapped objects are at all times linked uniquely within a Session to their specific database row using a pattern called the identity map, a pattern that's central to the unit of work system employed by SQLAlchemy, and is also key to the most common (and not-so-common) patterns of ORM usage.
In almost all cases, a table does have a so-called candidate key, which is a column or series of columns that uniquely identify a row. If a table truly doesn’t have this, and has actual fully duplicate rows, the table is not corresponding to first normal form and cannot be mapped. Otherwise, whatever columns comprise the best candidate key can be applied directly to the mapper:
class SomeClass(Base):
    __table__ = some_table_with_no_pk
    __mapper_args__ = {
        'primary_key': [some_table_with_no_pk.c.uid, some_table_with_no_pk.c.bar]
    }
Better yet is when using fully declared table metadata, use the primary_key=True flag on those columns:
class SomeClass(Base):
    __tablename__ = "some_table_with_no_pk"
    uid = Column(Integer, primary_key=True)
    bar = Column(String, primary_key=True)
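Applied to the reflected Teradata table from the question, a minimal sketch might look like this (user_id is a hypothetical column name; substitute whatever column or columns uniquely identify a row in your 'users' table):

class Users(Base):
    __table__ = md.tables['db_name.users']
    __mapper_args__ = {
        # Tell the ORM which reflected column(s) behave as the primary key.
        'primary_key': [md.tables['db_name.users'].c.user_id]
    }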

Flask SQLAlchemy get columns from two joined mapped entities in query result

I have a table, MenuOptions which represents any option found in a dropdown in my app. Each option can be identified by the menu it is part of (e.g. MenuOptions.menu_name) and the specific value of that option (MenuOptions.option_value).
This table has relationships all across my db and doesn't use foreign keys, so I'm having trouble getting it to mesh with SQLAlchemy.
In SQL it would be as easy as:
SELECT *
FROM document
JOIN menu_options
    ON menu_options.option_menu_name = 'document_type'
    AND menu_options.option_value = document.document_type_id
to define this relationship. However I'm running into trouble when doing this in SQLAlchemy because I can't map this relationship cleanly without foreign keys. In SQLAlchemy the best I've done so far is:
the_doc = db.session.query(Document, MenuOptions).filter(
    Document.id == document_id
).join(
    MenuOptions,
    and_(
        MenuOptions.menu_name == text('"document_type"'),
        MenuOptions.value == Document.type_id
    )
).first()
This does work and does return the correct values, but it returns them as a tuple of two separate model objects, such that I have to reference the mapped Document properties via the_doc[0] and the mapped MenuOptions properties via the_doc[1].
Is there a way I can get this relationship returned as a single query object with all the properties on it without using foreign keys or any ForeignKeyConstraint in my model? I've tried add_columns and add_entity but I get essentially the same result.
You can use with_entities:
entities = [getattr(Document, c) for c in Document.__table__.columns.keys()] + \
           [getattr(MenuOptions, c) for c in MenuOptions.__table__.columns.keys()]

session.query(Document, MenuOptions).filter(
    Document.id == document_id
).join(
    MenuOptions,
    and_(
        MenuOptions.menu_name == text('"document_type"'),
        MenuOptions.value == Document.type_id
    )
).with_entities(*entities)
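The result rows are then flat keyed tuples rather than (Document, MenuOptions) pairs, so the columns of both tables are available as attributes on one row. A short usage sketch (the_query stands for the query built above; id and value are assumed column names, and same-named columns in both tables would collide):

row = the_query.first()
print(row.id, row.value)  # attributes from Document and MenuOptions alike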
I ended up taking a slightly different approach using association_proxy, but if you ended up here from Google then this should help you. In the following example, I store a document_type_id in the document table and hold the corresponding values for that id in a table called menu_options. Normally you would use foreign keys for this, but our menu_options table is essentially an in-house lookup table with relationships to several other tables, so foreign keys are not a clean solution.
By first establishing a relationship via the primaryjoin property and then using association_proxy, I can immediately load the document_type based on the document_type_id with the following code:
from sqlalchemy import and_
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import foreign  # marks the FK-less side of the join

class Document(db.Model):
    __tablename__ = "document"
    document_type_id = db.Column(db.Integer)
    document_type_proxy = db.relationship(
        "MenuOptions",
        primaryjoin=(
            and_(
                MenuOptions.menu_name == 'document_type',
                foreign(document_type_id) == MenuOptions.value
            )
        ),
        lazy="immediate",
        viewonly=True
    )
If all you need is a mapped relationship without the use of foreign keys within your database, then this will do just fine. If, however, you want to be able to access the remote attribute (in this case the document_type) directly as an attribute on the initial class (in this case Document) then you can use association_proxy to do so by simply passing the name of the mapped relationship and the name of the remote property:
document_type = association_proxy("document_type_proxy", "document_type")
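Usage is then direct attribute access (a sketch, assuming MenuOptions exposes a document_type attribute for the proxy to target):

doc = db.session.query(Document).first()
print(doc.document_type)  # loaded through document_type_proxy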

Django Polymorphic

I want to implement models using inheritance, and I've found the package django-polymorphic. But I was reading about inheritance in Django models, and almost every page I found recommends using abstract = True in the parent model, which duplicates the fields in the subclasses' tables and results in faster queries.
I've done some testing and found out that this library doesn't use the abstract option:
class Parent(PolymorphicModel):
    parent_field = models.TextField()

class Child(Parent):
    child_field = models.TextField()
This results in:
Parent table:

CREATE TABLE `app_parent` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `parent_field` longtext NOT NULL,
    `polymorphic_ctype_id` int(11),
    PRIMARY KEY (`id`),
    KEY `app_polymorphic_ctype_id_a7b8d4c7_fk_django_content_type_id` (`polymorphic_ctype_id`),
    CONSTRAINT `app_polymorphic_ctype_id_a7b8d4c7_fk_django_content_type_id` FOREIGN KEY (`polymorphic_ctype_id`) REFERENCES `django_content_type` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Child table:

CREATE TABLE `app_child` (
    `parent_ptr_id` int(11) NOT NULL,
    `child_field` varchar(20) NOT NULL,
    PRIMARY KEY (`parent_ptr_id`),
    CONSTRAINT `no_parent_ptr_id_079ccc0e_fk_app_parent_id` FOREIGN KEY (`parent_ptr_id`) REFERENCES `app_parent` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Should I use my own classes with abstract = True, or should I stick with this?
Do you have a need to be able to query the parent table?
Parent.objects.all()
If yes, then most probably you will need to use multi-table inheritance with abstract=False.
With model inheritance and abstract=False you get a more complicated database schema with more database relations. Creating a child instance takes 2 inserts instead of 1 (parent and child table), and querying child data requires a table join. So this method certainly has its shortcomings. But when you want to query for common column data, it's the best supported way in Django.
django-polymorphic builds on top of standard Django model inheritance by adding an extra column, polymorphic_ctype, which makes it possible to identify the concrete subclass while holding only a parent object.
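For example (a sketch against the Parent/Child models above), querying the parent with django-polymorphic yields concrete subclass instances:

for obj in Parent.objects.all():
    # A row created as Child comes back as a Child instance,
    # so child_field is directly accessible on it.
    print(type(obj).__name__, getattr(obj, 'child_field', None))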
There are various ways to achieve similar results using abstract=True, but they often result in more complicated querying code.
If you use abstract=True, below are two examples of how you can query the common data of all children.
Chaining multiple queries
from itertools import chain

def query_all_childs(**kwargs):
    return chain(
        Child1.objects.filter(**kwargs),
        Child2.objects.filter(**kwargs),
    )
Using database views
Manually create a database view that combines several tables (this could be done by attaching SQL code to the post_migrate signal):
create view myapp_commonchild as
select 'child1' as type, a, b from child1
union all
select 'child2' as type, a, b from child2
Create a concrete model over that view with managed=False. This flag tells Django to ignore the table in database migrations (because we have manually created the database view for it).
class Parent(models.Model):
    a = models.CharField()
    b = models.CharField()

    class Meta:
        abstract = True  # Parent stays abstract in this approach

class CommonChild(Parent):
    type = models.CharField()

    class Meta:
        managed = False  # maps onto the manually created view

class Child1(Parent):
    pass

class Child2(Parent):
    pass
Now you can query CommonChild.objects.all() and access common fields of child classes.
Speaking of performance, I don't know how big your tables are or how heavy reads/writes are, but most probably using abstract=False will not impact your performance in a noticeable way.

SQLAlchemy: Pass information to automap many to many relationship

We're converting a codebase to SQLAlchemy, where there's an existing database that we can modify but not completely replace.
There's a collection of widgets, and for each widget we keep track of the 20 most similar other widgets (this is a directional relationship, i.e. widget_2 can appear in widget_1's most similar widgets, but not vice versa).
There's a widget table which has a widget_id field and some other things.
There's a similarity table which has first_widget_id, second_widget_id and similarity_score. We only save the 20 most similar widgets in the database, so that every widget_id appears exactly 20 times as first_widget_id.
first_widget_id and second_widget_id have foreign keys pointing to the widget table.
We're using SQLAlchemy's automap functionality, so a Widget object has a Widget.similarity_collection field. However, for a specified widget_id, it only includes items where second_widget_id == widget_id, whereas we want first_widget_id == widget_id. I understand that SQLAlchemy has no way to know which of the 2 it should pick.
Can we tell it somehow?
EDIT: as per the comment, here are more details on the models:
CREATE TABLE IF NOT EXISTS `similarity` (
    `first_widget_id` int(6) NOT NULL,
    `second_widget_id` int(6) NOT NULL,
    `score` int(5) NOT NULL,
    PRIMARY KEY (`first_widget_id`, `second_widget_id`),
    KEY `first_widget_id` (`first_widget_id`),
    KEY `second_widget_id_index` (`second_widget_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

ALTER TABLE `similarity`
    ADD CONSTRAINT `similar_first_widget_id_to_widgets_foreign_key` FOREIGN KEY (`first_widget_id`) REFERENCES `widgets` (`widget_id`) ON DELETE CASCADE ON UPDATE CASCADE,
    ADD CONSTRAINT `similar_second_widget_id_to_widgets_foreign_key` FOREIGN KEY (`second_widget_id`) REFERENCES `widgets` (`widget_id`) ON DELETE CASCADE ON UPDATE CASCADE;

CREATE TABLE IF NOT EXISTS `widgets` (
    `widget_id` int(6) NOT NULL AUTO_INCREMENT,
    `widget_name` varchar(70) NOT NULL,
    PRIMARY KEY (`widget_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=13179;
And using this python code to initialize SQLAlchemy:
base = automap_base()
engine = create_engine(
    'mysql://%s:%s@%s/%s?charset=utf8mb4' % (
        config.DB_USER, config.DB_PASSWD, config.DB_HOST, config.DB_NAME
    ), echo=False
)

# reflect the tables
base.prepare(engine, reflect=True)

Widgets = base.classes.widgets
Now when we do something like:
session.query(Widgets).filter_by(widget_id=1).similarity_collection
We get sqlalchemy.ext.automap.similarity objects for which second_widget_id == 1, whereas we want first_widget_id == 1.
You can override how the similarity_collection joins, even when automapping, with an explicit class definition and passing foreign_keys to the relationship:
from sqlalchemy.orm import relationship

base = automap_base()
engine = create_engine(
    'mysql://%s:%s@%s/%s?charset=utf8mb4' % (
        config.DB_USER, config.DB_PASSWD, config.DB_HOST, config.DB_NAME
    ), echo=False
)

# The class definition that ensures a certain join path for the
# relationship. The rest of the mapping is automapped upon reflection.
class Widgets(base):
    __tablename__ = 'widgets'
    similarity_collection = relationship(
        'similarity', foreign_keys='similarity.first_widget_id')

base.prepare(engine, reflect=True)
If you wish to also control the created relationship(s) in similarity – for neat association proxies or such – use the same pattern.
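With that override in place the collection joins on first_widget_id, so usage looks like this (a sketch, assuming a Session bound to the same engine):

from sqlalchemy.orm import Session

session = Session(engine)
widget = session.query(Widgets).filter_by(widget_id=1).one()
for sim in widget.similarity_collection:
    print(sim.second_widget_id, sim.score)  # all rows have first_widget_id == 1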

SQLAlchemy modelling a complex relationship using reflection

I am querying a proprietary database which is maintained by a third party. The database has many tables each with large numbers of fields.
My problem refers to three tables of interest, Tree, Site and Meter.
The tree table describes nodes in a simple tree structure. Along with other data it has a foreign key referencing its own primary key. It also has an Object_Type field and an Object_ID field. The Site and Meter tables each have many fields.
A tree node has a one-to-one relationship with either a meter or a site: if the Object_Type field is 1, the Object_ID field refers to the primary key in the Site table; if it is 2, it refers to the primary key in the Meter table.
Following this example (https://bitbucket.org/sqlalchemy/sqlalchemy/src/408388e5faf4/examples/declarative_reflection/declarative_reflection.py), I am using reflection to load the table structures like so:
Base = declarative_base(cls=DeclarativeReflectedBase)

class Meter(Base):
    __tablename__ = 'Meter'

class Site(Base):
    __tablename__ = 'Site'

class Tree(Base):
    __tablename__ = 'Tree'
    Parent_Node_ID = Column(Integer, ForeignKey('Tree.Node_ID'))
    Node_ID = Column(Integer, primary_key=True)
    children = relationship("Tree", backref=backref('parent', remote_side=[Node_ID]))

Base.prepare(engine)
I have included the self-referential relationship and that works perfectly. How can I add the two relationships using Object_ID as the foreign key, with the appropriate check on the Object_Type field?
First, a note on reflection. I've found myself much better off not relying on it:
- explicit code does not require a valid database connection just to load and work with your code;
- reflection violates the Python guideline that explicit is better than implicit: looking at your code, you are better off seeing the elements (columns etc.) than having them magically created outside your field of view.
This means more code, but more maintainable code. The reason I suggest this is, at least in part, that I cannot see the schema in your posting. If you create the tables and classes in your code rather than relying on reflection, you have better control over the mapping.
In this case you want to use polymorphic mapping:
- create a TreeNode class as above;
- create SiteNode and MeterNode as subclasses.
Your code would then include something like:
mapper(TreeNode, tree_table, polymorphic_on=tree_table.c.object_type)
mapper(SiteNode, site_table, inherits=TreeNode,
       inherit_condition=site_table.c.node_id == tree_table.c.node_id,
       polymorphic_identity=1)
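For completeness, the meter subclass would get a symmetric mapper (a sketch, assuming a meter_table defined like site_table):

mapper(MeterNode, meter_table, inherits=TreeNode,
       inherit_condition=meter_table.c.node_id == tree_table.c.node_id,
       polymorphic_identity=2)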
Hope this helps.
For tree.object_id to be a foreign key that can refer either to Site or Meter, you can either have Site and Meter descend from a common base table (that is, joined table inheritance), have them mapped to the same table (that is, single table inheritance), or, as someone said, have Tree be mapped to two different tables as well as a common base table. This last suggestion goes well with the idea that TreeNode already has a "type" field.
The final alternative, which might be easier, is to use two foreign keys on TreeNode directly - site_id and meter_id - as well as two relationships, "meter" and "site"; then use a Python @property to return one or the other:
class TreeNode(Base):
    # ...

    @property
    def object(self):
        return self.meter or self.site
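A minimal sketch of that two-foreign-key layout (the Site and Meter primary key names here are assumptions; adjust them to the real schema):

class TreeNode(Base):
    __tablename__ = 'Tree'
    Node_ID = Column(Integer, primary_key=True)
    site_id = Column(Integer, ForeignKey('Site.Site_ID'))     # assumed PK name
    meter_id = Column(Integer, ForeignKey('Meter.Meter_ID'))  # assumed PK name
    site = relationship('Site')
    meter = relationship('Meter')

    @property
    def object(self):
        # Exactly one of the two relationships is populated per node.
        return self.meter or self.site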
