I have a dictionary and a list of dictionaries that I parsed some data into, and the two have a one-to-many relationship. I wonder if there is a way to put these dictionaries into table classes and store them in the database. I expect SQLAlchemy to handle the foreign key details. Also, the dictionaries have lots of keys, so the object fields should be populated from the dictionary.
There are two steps involved in creating a Many To One relationship in SQLAlchemy:
Create the constraint on the Dependent object. This will ensure that Dependents only reference valid Masters. For this, you will need a Primary Key on the Master object.
Create a reference so you can access the Master in an Object-Oriented fashion from the Dependent.
A note: don't get confused. The SQLAlchemy documentation uses an example where the relationship is Parents (n) -> Child (1). To avoid confusion, I used the wording Dependent (n) -> Master (1).
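The two steps above can be sketched as follows. This is a minimal illustration, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the class names Master and Dependent follow the wording of this answer, while the columns `name` and `label` and the sample dictionaries are made up for the example. The default declarative constructor accepts keyword arguments, so a dictionary can be unpacked straight into an object as long as its keys match the mapped attribute names.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Master(Base):
    __tablename__ = "master"
    id = Column(Integer, primary_key=True)  # primary key required by step 1
    name = Column(String)                   # hypothetical data column
    dependents = relationship("Dependent", back_populates="master")


class Dependent(Base):
    __tablename__ = "dependent"
    id = Column(Integer, primary_key=True)
    label = Column(String)                  # hypothetical data column
    # Step 1: the constraint, so a Dependent can only point at a valid Master.
    master_id = Column(Integer, ForeignKey("master.id"))
    # Step 2: the object-oriented reference back to the Master.
    master = relationship("Master", back_populates="dependents")


# Parsed data (assumed shape): one dictionary and a list of dictionaries.
master_data = {"name": "m1"}
dependent_data = [{"label": "d1"}, {"label": "d2"}]

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Keys must match mapped attribute names, otherwise a TypeError is raised.
    master = Master(**master_data)
    master.dependents = [Dependent(**row) for row in dependent_data]
    session.add(master)  # the dependents are saved along with the master
    session.commit()     # SQLAlchemy fills in master_id on each Dependent

    stored = session.query(Dependent).order_by(Dependent.id).all()
    result = [(d.label, d.master_id == master.id) for d in stored]
```

If the dictionaries contain keys that are not columns, they would need to be filtered out before unpacking.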
Related
I'm going through the SQLAlchemy ORM tutorial and I'm a bit confused in the "Querying with Joins" section (https://docs.sqlalchemy.org/en/latest/orm/tutorial.html#querying-with-joins).
The schema the tutorial has previously set up is a simple two-table arrangement where Address.user_id is a foreign key to User.id. In the Address object it is set up as user_id = Column(Integer, ForeignKey('users.id')).
The basic join syntax does make sense to me:
session.query(User).join(Address).filter(Address.email_address=='jack@google.com').all()
However, the tutorial then makes the statement: If there were no foreign keys, or several, Query.join() works better when one of the following forms are used:
query.join(Address, User.id==Address.user_id) # explicit condition
query.join(User.addresses) # specify relationship from left to right
query.join(Address, User.addresses) # same, with explicit target
query.join('addresses') # same, using a string
The first example makes sense: If there were no foreign key, or multiple, then we would need to help SQLAlchemy by explicitly specifying the join condition since it couldn't determine from our foreign keys which one to use to perform the join.
However, the last three examples don't really make sense to me. I believe User.addresses is the relationship that we previously set up on our User object as relationship("Address", order_by=Address.id, back_populates="user"), and it has a counterpart on the Address object user = relationship("User", back_populates="addresses"). But in the case where there are no foreign keys, or multiple, I don't see how specifying a specific relationship declaration would really help us.
At least so far in this tutorial, the relationship declarations don't themselves specify a particular foreign key or column relationship to be used to implement the relationship. So far, I think it's just been implicit that we wanted the relationship to use the single foreign key that we did set. So are there other cases where more information is encoded in the relationship object, i.e. some sort of join condition? That's the only way I can see that specifying a relationship would provide enough information to specify a join condition if there are zero or more than one foreign keys between two tables.
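One way to explore this is a variant of the tutorial schema in which Address has two foreign keys to users.id. The sketch below assumes SQLAlchemy 1.4 or later; the second foreign key, `verified_by_id`, is invented for the example. With two candidate foreign keys, the relationship itself has to be told which one to use through `foreign_keys`, and that is exactly the extra information a join along `User.addresses` can reuse.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # The relationship names the foreign key it is built on.
    addresses = relationship(
        "Address", foreign_keys="Address.user_id", back_populates="user"
    )


class Address(Base):
    __tablename__ = "addresses"
    id = Column(Integer, primary_key=True)
    email_address = Column(String)
    user_id = Column(Integer, ForeignKey("users.id"))
    # Hypothetical second foreign key to the same table.
    verified_by_id = Column(Integer, ForeignKey("users.id"))
    user = relationship("User", foreign_keys=[user_id], back_populates="addresses")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    jack = User(name="jack")
    jack.addresses.append(Address(email_address="jack@google.com"))
    session.add(jack)
    session.commit()

    # Joining on the bare target cannot choose between the two foreign keys.
    try:
        session.query(User).join(Address).all()
        ambiguous = False
    except SQLAlchemyError:
        ambiguous = True

    # Joining along the relationship reuses the condition stored on it.
    names = [
        u.name
        for u in session.query(User)
        .join(User.addresses)
        .filter(Address.email_address == "jack@google.com")
    ]
```

In the single-foreign-key schema of the tutorial, the relationship works out the same condition on its own, which is why nothing explicit appears in its declaration there.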
A common type of relationship in schemas is this: a joiner table has a datetime element and is meant to store history about relationships between the rows of two other tables over time. These relationships are one-to-one or one-to-many even though we're using an association table which usually implies many-to-many. At any given point in time only one mapping, the latest at that point in time, is valid. For example:
Tables:
Computers: [id, name, description]
Locations: [id, name, address]
ComputerLocations: [id, computers_id, locations_id, timestamp]
A Computers object can only belong to one Locations object at a time (and Locations can have many Computers), but we store the history in the table. Rows in ComputerLocations aren't deleted, only superseded by new rows at query-time. Perhaps in the future some prune-type event will remove older rows as their usefulness is reduced.
What I'm looking to do is model this in SQLAlchemy, specifically in the ORM, so that a Computers class has the following properties:
A new Computer can be created without (independently of) a location (this makes sense because the location table is separate)
A new Location can be created without (independently of) a computer
If a Computer has a location it must be a member of Locations (foreign key constraint)
When updating an existing Computers object's location, a new row will be added to ComputerLocations with a datetime of NOW()
When creating a new Computers object with a location, a new row will be added to ComputerLocations with a datetime of NOW()
Everything should be atomic (i.e. fail if a new Computer is created but the row associating it to a location can't be created)
Is there a specific design pattern or a concrete method in the SQLAlchemy ORM to accomplish this? The documentation has a section on non-traditional mappings that includes mapping a class against multiple tables and against arbitrary selects, so this looks promising. There was also another question on Stack Overflow that mentioned vertical tables. Due to my relative inexperience with SQLAlchemy I cannot yet synthesize this information into a robust and elegant solution. Any help would be greatly appreciated.
I'm using MySQL but a solution should be general enough for any database through the SQLAlchemy dialects system.
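One possible shape for this, offered as a sketch rather than an established pattern, is to map the history table as its own class and expose the current location through a plain Python property. It assumes SQLAlchemy 1.4 or later and uses SQLite only so that it runs anywhere; the singular class names, the `history` attribute and the `location` property are choices made for this example. Setting the property appends a new history row, and because the row is flushed in the same transaction as the computer, both are committed or rolled back together.

```python
from sqlalchemy import (Column, DateTime, ForeignKey, Integer, String,
                        create_engine, func)
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Location(Base):
    __tablename__ = "locations"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    address = Column(String)


class ComputerLocation(Base):
    __tablename__ = "computer_locations"
    id = Column(Integer, primary_key=True)
    computers_id = Column(Integer, ForeignKey("computers.id"), nullable=False)
    locations_id = Column(Integer, ForeignKey("locations.id"), nullable=False)
    # The database stamps each row with the current time on insert.
    timestamp = Column(DateTime, nullable=False, server_default=func.now())
    location = relationship("Location")


class Computer(Base):
    __tablename__ = "computers"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    description = Column(String)
    # Full history, oldest first; id breaks ties between equal timestamps.
    history = relationship(
        "ComputerLocation",
        order_by=[ComputerLocation.timestamp, ComputerLocation.id],
        cascade="all, delete-orphan",
    )

    @property
    def location(self):
        # The latest history row is the only valid mapping.
        return self.history[-1].location if self.history else None

    @location.setter
    def location(self, value):
        # Never update in place: supersede by appending a new row.
        self.history.append(ComputerLocation(location=value))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    hq = Location(name="HQ", address="1 Main St")
    lab = Location(name="Lab", address="2 Side St")
    pc = Computer(name="pc-1")
    session.add_all([hq, lab, pc])
    session.commit()                 # a computer may exist without a location
    no_location = pc.location is None

    pc.location = hq
    session.commit()
    pc.location = lab                # adds a second row, keeps the first
    session.commit()

    current = pc.location.name
    history_names = [h.location.name for h in pc.history]
```

The property loads the whole history to find the latest row, which is fine for short histories; a long history would probably call for a query that fetches only the newest row.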
I am looking for a way to add a "Category" child to an "Object" entity without wasting the performance on loading the child objects first.
The "Object" and "Category" tables are linked by a many-to-many relationship, stored in the "ObjectCategory" table. The "Object" model is supplied with the relationship:
categories = relationship('Category', secondary = 'ObjectCategory', backref = 'objects')
Now this code works just fine:
obj = models.Object.query.get(9)
cat1 = models.Category.query.get(22)
cat2 = models.Category.query.get(28)
obj.categories.extend([cat1, cat2])
But in the debug output I see that instantiating the obj and each category costs me a separate SELECT command to the db server, in addition to the single bulk INSERT command. Totally unneeded in this case, because I was not interested in manipulating the given category objects. Basically all I need is to nicely insert the appropriate category IDs.
The most obvious solution would be to go ahead and insert the entries in the association table directly:
db.session.add(models.ObjectCategory(oct_objID=9, oct_catID=22))
db.session.add(models.ObjectCategory(oct_objID=9, oct_catID=28))
But this approach is kind of ugly; it doesn't seem to use the power of the abstracted SQLAlchemy relationships. What's more, it produces a separate INSERT for every add(), versus the nice bulk INSERT in the obj.categories.extend([list]) case. I imagine there could be some lazy object mode that would let the object live with only its ID (unverified) and load the other fields only if they are requested. That would allow adding children in one-to-many or many-to-many relationships without issuing any SELECT to the database, while still allowing use of the powerful ORM abstraction (i.e., treating the list of children as a Python list).
How should I adjust my code to carry out this task using the power of SQLAlchemy but being conservative on the database use?
Do you have an ORM mapping for the ObjectCategory table? If so, you could create and add ObjectCategory objects:
session.add(ObjectCategory(obj_id=9, category_id=22))
session.add(ObjectCategory(obj_id=9, category_id=28))
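If the association table is a plain Table rather than a mapped class, another option is to insert the ID pairs through the table directly. The sketch below assumes SQLAlchemy 1.4 or later and recreates a schema like the one in the question; the `name` columns and the variable `object_category` are made up for the example. Passing a list of dictionaries to `session.execute()` sends all rows with one INSERT statement and does not load the Object or the Category rows first.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Association table using the column names from the question.
object_category = Table(
    "ObjectCategory",
    Base.metadata,
    Column("oct_objID", Integer, ForeignKey("Object.id"), primary_key=True),
    Column("oct_catID", Integer, ForeignKey("Category.id"), primary_key=True),
)


class Category(Base):
    __tablename__ = "Category"
    id = Column(Integer, primary_key=True)
    name = Column(String)  # hypothetical data column


class Object(Base):
    __tablename__ = "Object"
    id = Column(Integer, primary_key=True)
    name = Column(String)  # hypothetical data column
    categories = relationship("Category", secondary=object_category,
                              backref="objects")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Set up rows that already exist in the question's scenario.
with Session(engine) as session:
    session.add_all([
        Object(id=9, name="obj"),
        Category(id=22, name="a"),
        Category(id=28, name="b"),
    ])
    session.commit()

with Session(engine) as session:
    # One INSERT for all pairs; no SELECT is issued for obj or categories.
    session.execute(
        object_category.insert(),
        [
            {"oct_objID": 9, "oct_catID": 22},
            {"oct_objID": 9, "oct_catID": 28},
        ],
    )
    session.commit()

    # The relationship sees the new rows the next time it is loaded.
    names = sorted(c.name for c in session.get(Object, 9).categories)
```

The trade-off is that any Object already loaded in the session will not notice the new rows until its `categories` collection is refreshed.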
If I use bulk_create to insert objects:
objList = [a, b, c,] #none are saved
model.objects.bulk_create(objList)
The ids of the objects would not be updated (see https://docs.djangoproject.com/en/dev/ref/models/querysets/#bulk-create).
So I can't use these guys as foreign key objects. I thought of querying them back from the database after they're bulk created and then using them as foreign key objects, but I don't have their ids to query them. How do I query these objects from the database (given that there can be duplicate values in columns other than the id)? Or is there a better way to make bulk created items as foreign keys?
If you have only three objects, as in your example, you might want to call save on each individually, wrapping the calls within a transaction, if it needs to be atomic.
If there are many more, which is likely the reason for using bulk_create, you could potentially loop through them instead and call save on each. Again, you could wrap that in a transaction if required. Though, one might not like this option as running tonnes of insert queries could potentially be a problem for some database setups.
Alternatively, a hack would be to add some known unique identifier to the object so you could re-query these after save.
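The idea behind that hack can be illustrated outside Django with the standard-library sqlite3 module. This is not Django code; the table `item` and the column `batch_id` are invented for the example. Every row in one bulk insert carries the same random tag, so the generated ids can be read back afterwards even when the other columns contain duplicates.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, batch_id TEXT)"
)

# One random tag shared by every row of this bulk insert.
batch_id = uuid.uuid4().hex
names = ["a", "a", "b"]  # duplicates are fine, the tag identifies the batch

with conn:  # commits on success, rolls back on error
    conn.executemany(
        "INSERT INTO item (name, batch_id) VALUES (?, ?)",
        [(name, batch_id) for name in names],
    )

# Re-query the freshly inserted rows, now with their database ids.
rows = conn.execute(
    "SELECT id, name FROM item WHERE batch_id = ? ORDER BY id",
    (batch_id,),
).fetchall()
```

In Django the same approach would mean adding such a field to the model and filtering on it after bulk_create.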
I'm using SQLAlchemy to interact with an existing PostgreSQL database.
I need to access data organized in a many-to-many relationship. The documentation describes how to create relationships, but I cannot find an example of neatly loading and querying an existing one.
Querying an existing relation is not really different than creating a new one. You pretty much write the same code but specify the table and column names that are already there, and of course you won't need SQLAlchemy to issue the CREATE TABLE statements.
See http://www.sqlalchemy.org/docs/05/mappers.html#many-to-many . All you need to do is specify the foreign key columns for your existing parent, child, and association tables as in the example, and specify autoload=True to fill out the other fields on your Tables. If your association table stores additional information, as it almost always does, you should just break your many-to-many relation into two many-to-one relations.
I learned SQLAlchemy while working with MySQL. With that database I always had to specify the foreign key relationships because they weren't explicit database constraints. You might get lucky and be able to reflect even more from your database, but you might prefer to use something like http://pypi.python.org/pypi/sqlautocode to just code the entire database schema and avoid the reflection delay.
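As a rough sketch of the reflection approach, the example below assumes SQLAlchemy 1.4 or later, where reflection can be done with `MetaData.reflect()`. It builds a small SQLite database with raw SQL to stand in for the existing one; the table names parent, child and parent_child are made up. Because the foreign keys are real constraints here, reflection picks them up and the relationship needs no explicit join conditions.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session, declarative_base, relationship

engine = create_engine("sqlite://")

# Stand-in for the existing database: created outside of the ORM.
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text("CREATE TABLE child (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text(
        "CREATE TABLE parent_child ("
        " parent_id INTEGER REFERENCES parent(id),"
        " child_id INTEGER REFERENCES child(id),"
        " PRIMARY KEY (parent_id, child_id))"
    ))
    conn.execute(text("INSERT INTO parent VALUES (1, 'p')"))
    conn.execute(text("INSERT INTO child VALUES (1, 'c1'), (2, 'c2')"))
    conn.execute(text("INSERT INTO parent_child VALUES (1, 1), (1, 2)"))

Base = declarative_base()
# Load table definitions, including foreign keys, from the database itself.
Base.metadata.reflect(bind=engine)
tables = Base.metadata.tables


class Child(Base):
    __table__ = tables["child"]


class Parent(Base):
    __table__ = tables["parent"]
    # The reflected association table serves as the "secondary" table.
    children = relationship("Child", secondary=tables["parent_child"],
                            backref="parents")


with Session(engine) as session:
    parent = session.query(Parent).filter_by(name="p").one()
    child_names = sorted(c.name for c in parent.children)
```

If the database does not declare the foreign keys, as described above for MySQL, the join conditions would have to be spelled out on the relationship instead.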