SQLAlchemy modelling a complex relationship using reflection - python

I am querying a proprietary database which is maintained by a third party. The database has many tables each with large numbers of fields.
My problem refers to three tables of interest, Tree, Site and Meter.
The tree table describes nodes in a simple tree structure. Along with other data it has a foreign key referencing its own primary key. It also has an Object_Type field and an Object_ID field. The Site and Meter tables each have many fields.
A tree node has a one-to-one relationship with either be a meter or a site. If the Object_Type field is 1 then the Object_ID field refers to the primary key in the Site table. If it is 2 then it refers to the primary key in the Meter table.
following this example https://bitbucket.org/sqlalchemy/sqlalchemy/src/408388e5faf4/examples/declarative_reflection/declarative_reflection.py
I am using reflection to load the table structures like so
Base = declarative_base(cls=DeclarativeReflectedBase)
class Meter(Base):
__tablename__ = 'Meter'
class Site(Base):
__tablename__ = 'Site'
class Tree(Base):
__tablename__ = 'Tree'
Parent_Node_ID = Column(Integer, ForeignKey('Tree.Node_ID'))
Node_ID = Column(Integer, primary_key=True)
children = relationship("Tree", backref=backref('parent', remote_side=[Node_ID]))
Base.prepare(engine)
I have included the self-referential relationship and that works perfectly. How can I add the two relationships using Object_ID as the foreign key, with the appropriate check on the Object_Type field?

First a note on reflection. I've found myself much better off not relying on reflection.
it does not require a valid database connection for you to load/work with your code
it violates the python guide that explicit is better than implicit. If you look at you code you are better off seeing the elements (columns etc) rather than having them magically created outside your field of view.
This means more code but more maintainable.
The reason I suggested that is at least in part that I cannot see schema in your posting.
If you create the tables and classes in your code rather than relying on reflection, you can then have better control over mapping.
In this case you want to use polymorphic mapping
create a TreeNode class as above.
create SiteNode and MeterNode as subclasses
Your code would then include something like:
mapper(TreeNode,tree_table,polymorphic_on=tree_table.c.object_type)
mapper(SiteNode, site_table,inherits=TreeNode,
inherit_condition=site_table.c.node_id==tree_table.c.node_id,
polymorphic_identity=1)
Hope this helps.

for tree.object_id to be a foreign key that can refer either to Site or Meter, you can either have Site and Meter descend from a common base table, that is, joined table inheritance, or be mapped to the same table, that is, single table inheritance, or as someone said have Tree be mapped to two different tables as well as a common base table. This last suggestion goes well with the idea that TreeNode already has a "type" field.
The final alternative which might be easier is to use two foreign keys on TreeNode directly - site_id and meter_id, as well as two relationships, "meter" and "site"; then use a Python #property to return one or the other:
class TreeNode(Base):
# ...
#property
def object(self):
return self.meter or self.site

Related

Retrieving a child model's annotation via a query to the parent in Django?

I have a concrete base model, from which other models inherit (all models in this question have been trimmed for brevity):
class Order(models.Model):
state = models.ForeignKey('OrderState')
Here are a few examples of the "child" models:
class BorrowOrder(Order):
parts = models.ManyToManyField('Part', through='BorrowOrderPart')
class ReturnOrder(Order):
parts = models.ManyToManyField('Part', through='ReturnOrderPart')
As you can see from these examples, each child model has a many-to-many relationship of Parts through a custom table. Those custom through-tables look something like this:
class BorrowOrderPart(models.Model):
borrow_order = models.ForeignKey('BorrowOrder', related_name='borrowed_parts')
part = models.ForeignKey('Part')
qty_borrowed = models.PositiveIntegerField()
class ReturnOrderPart(models.Model):
return_order = models.ForeignKey('ReturnOrder', related_name='returned_parts')
part = models.ForeignKey('Part')
qty_returned = models.PositiveIntegerField()
Note that the "quantity" field in each through table has a custom name (unfortunately): qty_borrowed or qty_returned. I'd like to be able to query the base table (so that I'm searching across all order types), and include an annotated field for each that sums these quantity fields:
# Not sure what I specify in the Sum() call here, given that the fields
# I'm interested in are different depending on the child's type.
qs = models.Order.objects.annotate(total_qty=Sum(???))
# For a single model, I would do something like:
qs = models.BorrowOrder.objects.annotate(
total_qty=Sum('borrowed_parts__qty_borrowed'))
So I guess I have two related questions:
Can I annotate a child-model's data through a query on the parent model?
If so, can I conditionally specify the field to be annotated, given that the actual field name changes depending on the model in question?
This feels to me like a place where using When() and Case() might be helpful, but I'm not sure how I'd build the necessary logic.
The problem is that, when you are querying the base model (in multi-table inheritance), it's hard to find out which subclass the object actually is. See How to know which is the child class of a model.
The query might be achievable in theory, with something like
SELECT
CASE
WHEN child1.base_ptr_id IS NOT NULL THEN ...
WHEN child2.base_ptr_id IS NOT NULL THEN ...
END AS ...
FROM base
LEFT JOIN child1 ON child1.base_ptr_id = base.id
LEFT JOIN child2 ON child2.base_ptr_id = base.id
...
but I don't know how to translate that in Django and I think it would be too much trouble to do it. It could be done, if not anything else using raw queries.
Another solution would be to add to the base class a field that specifies which actual subclass each object is; in that case, you'd need to make as many queries as there are subclasses and join them. I don't like this solution either. Update: After I slept on this I conclude that the most Django-like solution would be not to query the parent model in the first place; simply query the submodels and join the results. I would explore the third option below only if there were performance or other practical problems.
Another idea is to create a database view (with CREATE VIEW) based on the above SQL query and translate it into a Django model with managed = False, and query that one. Maybe this is somewhat cleaner than the other solutions, but it is a bit non-standard.

Why am I getting a duplicate foreign key error?

I'm trying to use Python and SQLAlcehmy to insert stuff into the database but it's giving me a duplicate foreign key error. I didn't have any problems when I was executing the SQL queries to create the tables earlier.
You're getting the duplicate because you've written the code as a one to one relationship, when it is at least a one to many relationship.
Sql doesn't let you have more than one of any variable. It creates keys for each variable, and when you try to insert the same variable, but haven't set up any type of relationship between the table it gets really upset at you, and throws up the error you're getting.
The below code is a one-to-many relationship for your tables using flask to connect to the database.. if you aren't using flask yourself.. figure out the translation, or use it.
class ChildcareUnit(db.Model):
Childcare_id=db.Column('ChildcareUnit_id',db.Integer,primary_key=True)
fullname = db.Column(String(250), nullable = False)
shortname = db.Column(String(250), nullable = False)
_Menu = db.relationship('Menu')
def __init__(self,fullname,shortname):
self.fullname = fullname
self.shortname = shortname
def __repr__(self):
return '<ChildcareUnit %r>' % self.id
class Menu(db.Model):
menu_id = db.Column('menu_id', db.Integer, primary_key=True)
menu_date = db.Column('Date', Date, nullable=True)
idChildcareUnit=db.Column(db.Integer,db.Forgeinkey('ChilecareUnit.ChilecareUnit_id'))
ChilecareUnits = db.relationship('ChildcareUnit')
def __init__(self,menu_date):
self.menu_date = menu_date
def __repr__(self):
return '<Menu %r>' % self.id
A couple differences here to note. the Columns are now db.Column() not Column(). This is the Flask code at work. it makes a connection between your database and the column in that table, saying "hey, these two things are connected".
Also, look at the db.Relationship() variables I've added to both of the tables. This is what tells your ORM that the two tables have a 1-2-many relationship. They need to be in both of the tables, and the relationship column in one table needs to list the other for it to work, as you can see.
Lastly, look at __repr__. This is what you're ORM uses to generate the foreign Keys for your database. It is also really important to include. Your code will either be super super slow without it, or just not work all together.
there are two different options you have to generate foreign keys in sqlalchemy. __repr__ and __str__
__repr__ is designed to generate keys that are easier for the machine to read, which will help with performance, but might make reading and understanding them a little more difficult.
__str__ is designed to be human friendly. It'll make your foreign keys easier to understand, but it will also make your code run just a little bit slower.
You can always use __str__ while you're developing, and then switch __repr__ when you're ready to have your final database.

The foreign key associated with column 'x.y' could not ... generate a foreign key to target column 'None'

I am during creating my first database project in SQLAlchemy and SQLite. I want to connect two entity as relational database's relational model. Here is the source:
class Models(Base):
__tablename__ = "models"
id_model = Column(Integer, primary_key=True)
name_of_model = Column(String, nullable = False)
price = Column(Integer, nullable = False)
def __init__(self, name_of_model):
self.name_of_model = name_of_model
class Cars(Base):
__tablename__ = "cars"
id_car = Column(Integer, primary_key=True)
id_equipment = Column(Integer, nullable = False)
id_package = Column(Integer, nullable = False)
id_model = Column(Integer, ForeignKey('Models'))
model = relationship("Models", backref=backref('cars', order_by = id_model))
I want to achieve a relationship like this:
https://imgur.com/af62zli
The error which occurs is:
The foreign key associated with column 'cars.id_model' could not find table 'Models' with which to generate a foreign key to target column 'None'.
Any ideas how to solve this problem?
From the docs:
The argument to ForeignKey is most commonly a string of the form
<tablename>.<columnname>, or for a table in a remote schema or “owner”
of the form <schemaname>.<tablename>.<columnname>. It may also be an
actual Column object...
In defining your ForeignKey on Cars.id_model you pass the string form of a class name ('Models') which is not an accepted form.
However, you can successfully define your foreign key using one of the below options:
ForeignKey(Models.id_model)
This uses the actual Column object to specify the foreign key. The disadvantage of this method is that you need to have the column in your namespace adding extra complexity in needing to import the model into a module if it is not defined there, and also may cause you to care about the order of instantiation of your models. This is why it's more common to use one of the string-based options, such as:
ForeignKey('models.id_model')
Notice that this example doesn't include the string version of the class name (not Models.id_model) but rather the string version of the table name. The string version means that table objects required are only resolved when needed and as such avoid the complexities of dealing with Column objects themselves.
Another interesting example that works in this case:
ForeignKey('models')
If the two columns are named the same on both tables, SQLAlchemy seems to infer the column from the table. If you alter the name of either of the id_model columns definitions in your example so that they are named differently, this will cease to work. Also I haven't found this to be well documented and it is less explicit, so not sure if really worth using and am really just mentioning for completeness and because I found it interesting. A comment in the source code of ForeignKey._column_tokens() seemed to be more explicit than the docs with respect to acceptable formatting of the column arg:
# A FK between column 'bar' and table 'foo' can be
# specified as 'foo', 'foo.bar', 'dbo.foo.bar',
# 'otherdb.dbo.foo.bar'. Once we have the column name and
# the table name, treat everything else as the schema
# name.

Django ForeignKey which does not require referential integrity?

I'd like to set up a ForeignKey field in a django model which points to another table some of the time. But I want it to be okay to insert an id into this field which refers to an entry in the other table which might not be there. So if the row exists in the other table, I'd like to get all the benefits of the ForeignKey relationship. But if not, I'd like this treated as just a number.
Is this possible? Is this what Generic relations are for?
This question was asked a long time ago, but for newcomers there is now a built in way to handle this by setting db_constraint=False on your ForeignKey:
https://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.ForeignKey.db_constraint
customer = models.ForeignKey('Customer', db_constraint=False)
or if you want to to be nullable as well as not enforcing referential integrity:
customer = models.ForeignKey('Customer', null=True, blank=True, db_constraint=False)
We use this in cases where we cannot guarantee that the relations will get created in the right order.
EDIT: update link
I'm new to Django, so I don't now if it provides what you want out-of-the-box. I thought of something like this:
from django.db import models
class YourModel(models.Model):
my_fk = models.PositiveIntegerField()
def set_fk_obj(self, obj):
my_fk = obj.id
def get_fk_obj(self):
if my_fk == None:
return None
try:
obj = YourFkModel.objects.get(pk = self.my_fk)
return obj
except YourFkModel.DoesNotExist:
return None
I don't know if you use the contrib admin app. Using PositiveIntegerField instead of ForeignKey the field would be rendered with a text field on the admin site.
This is probably as simple as declaring a ForeignKey and creating the column without actually declaring it as a FOREIGN KEY. That way, you'll get o.obj_id, o.obj will work if the object exists, and--I think--raise an exception if you try to load an object that doesn't actually exist (probably DoesNotExist).
However, I don't think there's any way to make syncdb do this for you. I found syncdb to be limiting to the point of being useless, so I bypass it entirely and create the schema with my own code. You can use syncdb to create the database, then alter the table directly, eg. ALTER TABLE tablename DROP CONSTRAINT fk_constraint_name.
You also inherently lose ON DELETE CASCADE and all referential integrity checking, of course.
To do the solution by #Glenn Maynard via South, generate an empty South migration:
python manage.py schemamigration myapp name_of_migration --empty
Edit the migration file then run it:
def forwards(self, orm):
db.delete_foreign_key('table_name', 'field_name')
def backwards(self, orm):
sql = db.foreign_key_sql('table_name', 'field_name', 'foreign_table_name', 'foreign_field_name')
db.execute(sql)
Source article
(Note: It might help if you explain why you want this. There might be a better way to approach the underlying problem.)
Is this possible?
Not with ForeignKey alone, because you're overloading the column values with two different meanings, without a reliable way of distinguishing them. (For example, what would happen if a new entry in the target table is created with a primary key matching old entries in the referencing table? What would happen to these old referencing entries when the new target entry is deleted?)
The usual ad hoc solution to this problem is to define a "type" or "tag" column alongside the foreign key, to distinguish the different meanings (but see below).
Is this what Generic relations are for?
Yes, partly.
GenericForeignKey is just a Django convenience helper for the pattern above; it pairs a foreign key with a type tag that identifies which table/model it refers to (using the model's associated ContentType; see contenttypes)
Example:
class Foo(models.Model):
other_type = models.ForeignKey('contenttypes.ContentType', null=True)
other_id = models.PositiveIntegerField()
# Optional accessor, not a stored column
other = generic.GenericForeignKey('other_type', 'other_id')
This will allow you use other like a ForeignKey, to refer to instances of your other model. (In the background, GenericForeignKey gets and sets other_type and other_id for you.)
To represent a number that isn't a reference, you would set other_type to None, and just use other_id directly. In this case, trying to access other will always return None, instead of raising DoesNotExist (or returning an unintended object, due to id collision).
tablename= columnname.ForeignKey('table', null=True, blank=True, db_constraint=False)
use this in your program

SQLAlchemy declarative concrete autoloaded table inheritance

I've an already existing database and want to access it using SQLAlchemy. Because, the database structure's managed by another piece of code (Django ORM, actually) and I don't want to repeat myself, describing every table structure, I'm using autoload introspection. I'm stuck with a simple concrete table inheritance.
Payment FooPayment
+ id (PK) <----FK------+ payment_ptr_id (PK)
+ user_id + foo
+ amount
+ date
Here is the code, with table SQL descritions as docstrings:
class Payment(Base):
"""
CREATE TABLE payments(
id serial NOT NULL,
user_id integer NOT NULL,
amount numeric(11,2) NOT NULL,
date timestamp with time zone NOT NULL,
CONSTRAINT payment_pkey PRIMARY KEY (id),
CONSTRAINT payment_user_id_fkey FOREIGN KEY (user_id)
REFERENCES users (id) MATCH SIMPLE)
"""
__tablename__ = 'payments'
__table_args__ = {'autoload': True}
# user = relation(User)
class FooPayment(Payment):
"""
CREATE TABLE payments_foo(
payment_ptr_id integer NOT NULL,
foo integer NOT NULL,
CONSTRAINT payments_foo_pkey PRIMARY KEY (payment_ptr_id),
CONSTRAINT payments_foo_payment_ptr_id_fkey
FOREIGN KEY (payment_ptr_id)
REFERENCES payments (id) MATCH SIMPLE)
"""
__tablename__ = 'payments_foo'
__table_args__ = {'autoload': True}
__mapper_args__ = {'concrete': True}
The actual tables have additional columns, but this is completely irrelevant to the question, so in attempt to minimize the code I've simplified everything just to the core.
The problem is, when I run this:
payment = session.query(FooPayment).filter(Payment.amount >= 200.0).first()
print payment.date
The resulting SQL is meaningless (note the lack of join condidion):
SELECT payments_foo.payment_ptr_id AS payments_foo_payment_ptr_id,
... /* More `payments_foo' columns and NO columns from `payments' */
FROM payments_foo, payments
WHERE payments.amount >= 200.0 LIMIT 1 OFFSET 0
And when I'm trying to access payment.date I get the following error: Concrete Mapper|FooPayment|payments_foo does not implement attribute u'date' at the instance level.
I've tried adding implicit foreign key reference id = Column('payment_ptr_id', Integer, ForeignKey('payments_payment.id'), primary_key=True) to FooPayment without any success. Trying print session.query(Payment).first().user works (I've omited User class and commented the line) perfectly, so FK introspection works.
How can I perform a simple query on FooPayment and access Payment's values from resulting instance?
I'm using SQLAlchemy 0.5.3, PostgreSQL 8.3, psycopg2 and Python 2.5.2.
Thanks for any suggestions.
Your table structures are similar to what is used in joint table inheritance, but they certainly don't correspond to concrete table inheritance where all fields of parent class are duplicated in the table of subclass. Right now you have a subclass with less fields than parent and a reference to instance of parent class. Switch to joint table inheritance (and use FooPayment.amount in your condition or give up with inheritance in favor of simple aggregation (reference).
Filter by a field in other model doesn't automatically add join condition. Although it's obvious what condition should be used in join for your example, it's not possible to determine such condition in general. That's why you have to define relation property referring to Payment and use its has() method in filter to get proper join condition.

Categories