I am looking for a way to add a "Category" child to an "Object" entity without wasting the performance on loading the child objects first.
The "Object" and "Category" tables are linked with many-to-many relationship, stored in "ObjectCategory" table. The "Object" model is supplied with the relationsip:
categories = relationship('Category', secondary='ObjectCategory', backref='objects')
Now this code works just fine:
obj = models.Object.query.get(9)
cat1 = models.Category.query.get(22)
cat2 = models.Category.query.get(28)
obj.categories.extend([cat1, cat2])
But in the debug output I see that instantiating the obj and each category costs me a separate SELECT command to the db server, in addition to the single bulk INSERT command. That is totally unneeded in this case, because I am not interested in manipulating the given category objects. Basically, all I need is to nicely insert the appropriate category IDs.
The most obvious solution would be to go ahead and insert the entries in the association table directly:
db.session.add(models.ObjectCategory(oct_objID=9, oct_catID=22))
db.session.add(models.ObjectCategory(oct_objID=9, oct_catID=28))
But this approach is kind of ugly, and it doesn't seem to use the power of the abstracted SQLAlchemy relationships. What's more, it produces a separate INSERT for every add(), vs. the nice bulk INSERT in the obj.categories.extend([list]) case. I imagine there could be some lazy object mode that would let the object live with only its ID (unverified) and load the other fields only if they are requested. That would allow adding children in one-to-many or many-to-many relationships without issuing any SELECT to the database, yet still using the powerful ORM abstraction (i.e., treating the list of children as a Python list).
How should I adjust my code to carry out this task using the power of SQLAlchemy but being conservative on the database use?
Do you have an ORM mapping for the ObjectCategory table? If so, you could create and add ObjectCategory objects:
session.add(ObjectCategory(obj_id=9, category_id=22))
session.add(ObjectCategory(obj_id=9, category_id=28))
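If the separate INSERT per add() is the concern, you can also drop down to the Core level and issue one bulk INSERT against the association table without loading anything. A minimal sketch, reusing the oct_objID/oct_catID column names from the question (whether this becomes a single multi-row INSERT or an executemany depends on the DBAPI driver):

db.session.execute(
    models.ObjectCategory.__table__.insert(),
    [
        {'oct_objID': 9, 'oct_catID': 22},
        {'oct_objID': 9, 'oct_catID': 28},
    ],
)
db.session.commit()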
I need to retrieve data from multiple tables, with a dynamically built filter that might or might not use data from any of the tables.
So say I have this:
class Solution(models.Model):
    name = models.CharField(max_length=MAX, unique=True)
    # Other data

class ExportTrackingRecord(models.Model):
    tracked_id = models.IntegerField()
    solution = models.ForeignKey(Solution)
    # Other data
Then elsewhere I need to do:
def get_data(user_provided_criteria):
    etr = ExportTrackingRecord.objects.filter(make_Q_object(user_provided_criteria)).select_related()
    for data in etr:
        s = data.solution
        # do things with data from both tables
As far as I can tell, if I happen to filter on a field in Solution, Django will do the join, and select_related() gets both objects. If I only filter on fields in ExportTrackingRecord, then there will be no join, and Django will generate a new query for each ExportTrackingRecord in the QuerySet (which could be thousands...).
I am fairly new to Django, but is there a reasonable way to force the join?
select_related() is the key to your problem. If you don't use it and don't filter on fields of the related model, Django will not do a join, and you'll get an extra query for every row in the result when you access data of the related model.
If you do something like ExportTrackingRecord.objects.filter(...).select_related('solution'), you force Django to always do a join with the Solution table.
If you need to do the same in the other direction, through the reverse foreign-key relationship, you need prefetch_related(); the same goes for many-to-many relations. A sketch of both directions follows.
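For illustration, a minimal sketch against the models above (exporttrackingrecord_set is Django's default reverse accessor name, assumed here since no related_name is set):

# Forward direction: forces the join, Solution comes back in the same query
records = ExportTrackingRecord.objects.filter(
    make_Q_object(user_provided_criteria)
).select_related('solution')

# Reverse direction: one extra query fetches all related records in bulk
solutions = Solution.objects.all().prefetch_related('exporttrackingrecord_set')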
select_related() controls what gets loaded into the results when the QuerySet is evaluated. It will force the join regardless of filtering.
If you don't specify select_related(), then even if your filter produces a SQL query with a join, the related model's fields won't be loaded in the results, and accessing them will still require additional queries.
I have a dictionary and a list (of dictionaries) that I parsed some data into, and these two variables have a one-to-many relation between each other. I wonder if there is any way to put these dictionaries into table classes and store them in the database. I expect SQLAlchemy to handle the foreign-key details. Also, the dictionaries have lots of keys, so the object fields should be populated from the dictionary.
There are two steps involved in creating a Many To One relationship in SQLAlchemy:
Create the constraint on the Dependent object. This will ensure that Dependents only reference valid Masters. For this, you will need a Primary Key on the Master object.
Create a reference so you can access the Master in an Object-Oriented fashion from the Dependent.
A note: don't get confused. The SQLAlchemy documentation uses an example where the relationship is Parent (n) -> Child (1). To avoid confusion, I used the wording Dependent (n) -> Master (1) here. A sketch of both steps follows.
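A minimal sketch of both steps with hypothetical Master/Dependent models (SQLAlchemy 1.4+ declarative style), including populating the objects from parsed dictionaries via keyword expansion, assuming the dict keys match the column names:

from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Master(Base):
    __tablename__ = 'master'
    id = Column(Integer, primary_key=True)  # the primary key the constraint needs
    name = Column(String)

class Dependent(Base):
    __tablename__ = 'dependent'
    id = Column(Integer, primary_key=True)
    value = Column(String)
    master_id = Column(Integer, ForeignKey('master.id'))  # step 1: the constraint
    master = relationship('Master', backref='dependents')  # step 2: OO access

# Stand-ins for the parsed data from the question
master_dict = {'name': 'example master'}
dependent_dicts = [{'value': 'a'}, {'value': 'b'}]

master = Master(**master_dict)
master.dependents = [Dependent(**d) for d in dependent_dicts]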
In some project I implement user-requested mapping (at runtime) of two tables which are connected by a 1-to-n relation (one table has a ForeignKey field).
From what I get from the documentation, the usual way is to add an orm.relation to the mapped properties of the non-foreign-key table, with a mapped_collection as collection_class and a backref, so that in the end both tables' ORM objects have each other mapped on an attribute (one gets a collection through the collection_class of the orm.relation used on it, the other gets an attribute placed on it by the backref).
I am in a situation where I sometimes just want the ForeignKey side to have a mapped attribute to the other table (the one that is created by the backref), depending on what the user decides (they might want to have only that side mapped).
Now I'm wondering whether I can simply use an orm.relation on the ForeignKey table as well. I'd probably end up with an orm.relation on the non-foreign-key table as before, with a mapped_collection but no backref, and another orm.relation on the foreign-key table replacing that automagic backref (making two orm.relations, one on each table, mapping each other from both sides).
Will that get me into trouble? Is the result equivalent to just one orm.relation on the non-foreign-key table with a backref? Is there another way I could map just the ForeignKey side, without having to map the dictionary on the non-ForeignKey table as well via that backref?
In the meantime, I found the answer myself:
If you use an orm.relation on each side and no backrefs, you have to use back_populates; otherwise, if you modify one side, the mapping on the other side won't be properly updated.
Therefore, an orm.relation on each side instead of an automated backref IS possible, but you have to use back_populates accordingly. A sketch follows.
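A minimal sketch of the two-sided setup, with hypothetical Parent/Child models; attribute_mapped_collection keys the dictionary by a Child attribute ('name' here), much like mapped_collection with a key function:

from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy.orm.collections import attribute_mapped_collection

Base = declarative_base()

class Parent(Base):  # the non-ForeignKey side
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship(
        'Child',
        collection_class=attribute_mapped_collection('name'),
        back_populates='parent',
    )

class Child(Base):  # the ForeignKey side
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship('Parent', back_populates='children')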
I have an SQLAlchemy ORM class, linked to MySQL, which works great at saving the data I need down to the underlying table. However, I would like to also save the identical data to a second archive table.
Here's some pseudocode to try and explain what I mean:
my_data = Data()  # An ORM class
my_data.name = "foo"

# This saves just to the 'data' table
session.add(my_data)

# This will save it to the identical 'backup_data' table
my_data_archive = my_data
my_data_archive.__tablename__ = 'backup_data'
session.add(my_data_archive)

# And commits them both
session.commit()
Just a heads up, I am not interested in mapping a class to a JOIN, as in: http://www.sqlalchemy.org/docs/05/mappers.html#mapping-a-class-against-multiple-tables
I list some options below. I would go for the DB trigger if you do not need to work on those objects in your model.
use a database trigger to do this job for you
create a SessionExtension which creates copy-objects and adds them to the session (usually on before_flush). Edit-1: You can take the versioning example from SA as a basis; this code does even more than you need.
see the SA Versioning example, which will give you not only a copy of the object, but the whole version history, which might be what you wish for
see the question Reverse mapping from a table to a model in SQLAlchemy, where the proposed solution is described in a blog post.
Create 2 identical models: one mapped to the main table and another mapped to the archive table. Create a MapperExtension with a redefined after_insert() method (depending on your demands, you might also need after_update() and after_delete()). This method should copy data from the main model to the archive model and add it to the session. There are some tricks to copy all columns and many-to-many relations automagically.
Note that you'll have to flush() the session twice to store both objects, since the unit of work is computed before the mapper extension adds the new object to the session. You can redefine Session.flush() to take care of this problem. Also, auto-incremented fields are assigned when the object is flushed, so you'll have to delay copying if you need them too.
This is one possible scenario that is proven to work. I'd like to know if there is a better way.
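For illustration, a minimal sketch of the copy-on-insert idea using SQLAlchemy's event API (the modern successor to MapperExtension); Data and DataArchive are hypothetical models mapped to identical tables:

from sqlalchemy import event

@event.listens_for(Data, 'after_insert')
def copy_to_archive(mapper, connection, target):
    # Execute directly on the Core connection so the unit of work is not
    # re-entered; this sidesteps the double-flush problem described above.
    connection.execute(
        DataArchive.__table__.insert().values(name=target.name)
    )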
I'm using SqlAlchemy to interact with an existing PostgreSQL database.
I need to access data organized in a many-to-many relationship. The documentation describes how to create relationships, but I cannot find an example of neatly loading and querying an existing one.
Querying an existing relation is not really different from creating a new one. You pretty much write the same code, but specify the table and column names that are already there, and of course you won't need SQLAlchemy to issue the CREATE TABLE statements.
See http://www.sqlalchemy.org/docs/05/mappers.html#many-to-many . All you need to do is specify the foreign-key columns for your existing parent, child, and association tables as in the example, and specify autoload=True to fill out the other fields on your Tables. If your association table stores additional information, as they almost always do, you should break your many-to-many relation into two many-to-one relations. A reflection sketch follows.
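A minimal sketch of reflecting an existing schema; this uses the SQLAlchemy 1.4+ autoload_with style rather than the older autoload=True, and the parent, child, and parent_child table names are assumptions standing in for your actual tables:

from sqlalchemy import Table, create_engine
from sqlalchemy.orm import declarative_base, relationship

engine = create_engine('postgresql://user:password@host/dbname')
Base = declarative_base()

# Reflect the existing association table from the database
parent_child = Table('parent_child', Base.metadata, autoload_with=engine)

class Child(Base):
    __tablename__ = 'child'
    __table_args__ = {'autoload_with': engine}  # reflect columns and keys

class Parent(Base):
    __tablename__ = 'parent'
    __table_args__ = {'autoload_with': engine}
    children = relationship(Child, secondary=parent_child, backref='parents')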
I learned SQLAlchemy while working with MySQL. With that database I always had to specify the foreign key relationships because they weren't explicit database constraints. You might get lucky and be able to reflect even more from your database, but you might prefer to use something like http://pypi.python.org/pypi/sqlautocode to just code the entire database schema and avoid the reflection delay.