Load an existing many-to-many table relation with sqlalchemy

I'm using SqlAlchemy to interact with an existing PostgreSQL database.
I need to access data organized in a many-to-many relationship. The documentation describes how to create relationships, but I cannot find an example for neatly loading and querying an existing one.

Querying an existing relation is not really different from creating a new one. You write pretty much the same code but specify the table and column names that are already there, and of course you won't need SQLAlchemy to issue the CREATE TABLE statements.
See http://www.sqlalchemy.org/docs/05/mappers.html#many-to-many . All you need to do is specify the foreign key columns for your existing parent, child, and association tables as in the example, and specify autoload=True to fill out the other fields on your Tables. If your association table stores additional information, as it almost always does, you should just break your many-to-many relation into two many-to-one relations.
I learned SQLAlchemy while working with MySQL. With that database I always had to specify the foreign key relationships because they weren't explicit database constraints. You might get lucky and be able to reflect even more from your database, but you might prefer to use something like http://pypi.python.org/pypi/sqlautocode to just code the entire database schema and avoid the reflection delay.
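As a rough illustration of the reflection approach, here is a minimal sketch. It assumes SQLAlchemy 1.4 or newer, where autoload_with=engine is the current spelling of the older autoload=True, and it uses an in-memory SQLite database as a stand-in for the existing PostgreSQL one. The table names parent, child and parent_child are hypothetical.

```python
from sqlalchemy import MetaData, Table, create_engine, text
from sqlalchemy.orm import Session, declarative_base, relationship

# Stand-in for the pre-existing database: the tables are created with raw SQL,
# outside of SQLAlchemy's metadata, just as they would be in a legacy schema.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text("CREATE TABLE child (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text(
        "CREATE TABLE parent_child ("
        " parent_id INTEGER REFERENCES parent (id),"
        " child_id INTEGER REFERENCES child (id),"
        " PRIMARY KEY (parent_id, child_id))"
    ))
    conn.execute(text("INSERT INTO parent VALUES (1, 'p1')"))
    conn.execute(text("INSERT INTO child VALUES (10, 'c1'), (11, 'c2')"))
    conn.execute(text("INSERT INTO parent_child VALUES (1, 10), (1, 11)"))

# Reflect the existing tables instead of declaring their columns by hand.
metadata = MetaData()
parent_table = Table("parent", metadata, autoload_with=engine)
child_table = Table("child", metadata, autoload_with=engine)
association = Table("parent_child", metadata, autoload_with=engine)

Base = declarative_base(metadata=metadata)


class Child(Base):
    __table__ = child_table


class Parent(Base):
    __table__ = parent_table
    # The reflected foreign keys tell SQLAlchemy how to join through the
    # association table, so no explicit join conditions are needed here.
    children = relationship(Child, secondary=association, backref="parents")


with Session(engine) as session:
    parent = session.get(Parent, 1)
    child_names = sorted(c.name for c in parent.children)
```

If the database does not declare the foreign keys as constraints (the MySQL situation described above), reflection cannot discover them, and the join conditions would have to be spelled out on the relationship instead.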

Related

Association objects with history for relationships using ORM

A common type of relationship in schemas is this: a joiner table has a datetime element and is meant to store history about relationships between the rows of two other tables over time. These relationships are one-to-one or one-to-many even though we're using an association table which usually implies many-to-many. At any given point in time only one mapping, the latest at that point in time, is valid. For example:
Tables:
Computer: [id, name, description]
Locations: [id, name, address]
ComputerLocations: [id, computers_id, locations_id, timestamp]
A Computers object can only belong to one Locations object at a time (and Locations can have many Computers), but we store the history in the table. Rows in ComputerLocations aren't deleted, only superseded by new rows at query-time. Perhaps in the future some prune-type event will remove older rows as their usefulness is reduced.
What I'm looking to do is model this in SQLAlchemy, specifically in the ORM, so that a Computers class has the following properties:
A new Computer can be created without (independently of) a location (this makes sense because the location table is separate)
A new Location can be created without (independently of) a computer
If a Computer has a location it must be a member of Locations (foreign key constraint)
When updating an existing Computers object's location, a new row will be added to ComputerLocations with a datetime of NOW()
When creating a new Computers object with a location, a new row will be added to ComputerLocations with a datetime of NOW()
Everything should be atomic (i.e. fail if a new Computer is created but the row associating it to a location can't be created)
Is there a specific design pattern or a concrete method in SQLAlchemy ORM to accomplish this? The documentation has a section on Non-traditional mappings that includes mapping a class against multiple tables and to arbitrary selects, so this looks promising. Further, there was another question on Stack Overflow that mentioned vertical tables. Due to my relative inexperience with SQLAlchemy I cannot synthesize this information into a robust and elegant solution yet. Any help would be greatly appreciated.
I'm using MySQL but a solution should be general enough for any database through the SQLAlchemy dialects system.
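One possible way to approach the requirements listed above is to map the history table as its own class and expose the latest row through a plain Python property. This is only a sketch under assumptions, not an established SQLAlchemy recipe: it assumes SQLAlchemy 1.4 or newer, runs against in-memory SQLite, takes the timestamp from a Python-side default rather than the database's NOW(), and the class names and the utc_now helper are illustrative.

```python
from datetime import datetime, timezone

from sqlalchemy import Column, DateTime, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


def utc_now():
    # Hypothetical helper standing in for the database's NOW().
    return datetime.now(timezone.utc)


class Location(Base):
    __tablename__ = "locations"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)


class ComputerLocation(Base):
    __tablename__ = "computer_locations"
    id = Column(Integer, primary_key=True)
    computers_id = Column(Integer, ForeignKey("computers.id"), nullable=False)
    locations_id = Column(Integer, ForeignKey("locations.id"), nullable=False)
    timestamp = Column(DateTime, default=utc_now, nullable=False)
    location = relationship(Location)


class Computer(Base):
    __tablename__ = "computers"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    # Full history, oldest first; rows are appended and never replaced.
    history = relationship(ComputerLocation, order_by=ComputerLocation.id)

    @property
    def location(self):
        # The current location is the one on the most recent history row.
        return self.history[-1].location if self.history else None

    @location.setter
    def location(self, new_location):
        # Assigning a location adds a history row instead of updating one.
        self.history.append(ComputerLocation(location=new_location))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    office = Location(name="office")
    lab = Location(name="lab")
    spare = Computer(name="spare")               # no location at all
    pc = Computer(name="pc-1", location=office)  # first history row
    spare_has_no_location = spare.location is None
    session.add_all([office, lab, spare, pc])
    session.commit()  # computer and history row are written in one transaction

    pc.location = lab  # second history row; the first one is kept
    session.commit()

    current_name = pc.location.name
    history_names = [row.location.name for row in pc.history]
```

Because the new Computer and its ComputerLocation row are flushed inside the same transaction, a failure to insert the history row rolls back the computer as well, which is the atomicity asked for in the question.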

Pros and Cons of manually creating an ORM for an existing database?

What are the pros and cons of manually creating an ORM for an existing database vs using database reflection?
I'm writing some code using SQLAlchemy to access a pre-existing database. I know I can use sqlalchemy.ext.automap to automagically reflect the schema and create the mappings.
However, I'm wondering if there is any significant benefit of manually creating the mapping classes vs letting the automap do its magic.
If there is significant benefit, can SQLAlchemy auto-generate the python mapping classes like Django's inspectdb? That would make creating all of the declarative base mappings much faster, as I'd only have to verify and tweak rather than write from scratch.
Edit:
As @iuridiniz says below, there are a few solutions that mimic Django's inspectdb. See Is there a Django's inspectdb equivalent for SQLAlchemy?. The answers in that thread are not Python 3 compatible, so look into sqlacodegen or flask-sqlacodegen if you're looking for something that's actually maintained.

I see a lot of tables that were created with: CREATE TABLE suppliers AS (SELECT * FROM companies WHERE 1 = 2 ); (a poor man's table copy), which will have no primary keys. If existing tables don't have primary keys, you'll have to constantly catch exceptions and feed Column objects into the mapper. If you've got column objects handy, you're already halfway to writing your own ORM layer. If you just complete the ORM, you won't have to worry about whether tables have primary keys set.

SQLAlchemy: add a child in many-to-many relationship by IDs

I am looking for a way to add a "Category" child to an "Object" entity without wasting the performance on loading the child objects first.
The "Object" and "Category" tables are linked by a many-to-many relationship, stored in the "ObjectCategory" table. The "Object" model is supplied with the relationship:
categories = relationship('Category', secondary = 'ObjectCategory', backref = 'objects')
Now this code works just fine:
obj = models.Object.query.get(9)
cat1 = models.Category.query.get(22)
cat2 = models.Category.query.get(28)
obj.categories.extend([cat1, cat2])
But in the debug output I see that instantiating the obj and each category costs me a separate SELECT command to the db server, in addition to the single bulk INSERT command. Totally unneeded in this case, because I was not interested in manipulating the given category objects. Basically all I need is to nicely insert the appropriate category IDs.
The most obvious solution would be to go ahead and insert the entries in the association table directly:
db.session.add(models.ObjectCategory(oct_objID=9, oct_catID=22))
db.session.add(models.ObjectCategory(oct_objID=9, oct_catID=28))
But this approach is kind of ugly; it doesn't seem to use the power of the abstracted SQLAlchemy relationships. What's more, it produces a separate INSERT for every add(), vs the nice bulk INSERT in the obj.categories.extend([list]) case. I imagine there could be some lazy object mode that would let the object live with only its ID (unverified) and load the other fields only if they are requested. That would allow adding children in one-to-many or many-to-many relationships without issuing any SELECT to the database, while still allowing use of the powerful ORM abstraction (i.e., treating the list of children as a Python list).
How should I adjust my code to carry out this task using the power of SQLAlchemy but being conservative on the database use?

Do you have an ORM mapping for the ObjectCategory table? If so you could create and add ObjectCategory objects:
session.add(ObjectCategory(obj_id=9, category_id=22))
session.add(ObjectCategory(obj_id=9, category_id=28))
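If the separate INSERT per add() is a concern, one option is to send all the ID pairs to the association table in a single executemany-style statement. The following is a minimal sketch, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database. The model definitions are guesses reconstructed from the question (the column names oct_objID and oct_catID come from it, everything else is hypothetical).

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, insert, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Object(Base):
    __tablename__ = "Object"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Category(Base):
    __tablename__ = "Category"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class ObjectCategory(Base):
    __tablename__ = "ObjectCategory"
    oct_objID = Column(Integer, ForeignKey("Object.id"), primary_key=True)
    oct_catID = Column(Integer, ForeignKey("Category.id"), primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Seed rows so that the IDs used below exist.
    session.add_all([
        Object(id=9, name="obj"),
        Category(id=22, name="cat-a"),
        Category(id=28, name="cat-b"),
    ])
    session.commit()

    # Link by ID only: no Object or Category rows are loaded, and the list
    # of parameter dictionaries is sent as one executemany call.
    session.execute(
        insert(ObjectCategory.__table__),
        [
            {"oct_objID": 9, "oct_catID": 22},
            {"oct_objID": 9, "oct_catID": 28},
        ],
    )
    session.commit()

    linked_ids = session.execute(
        select(ObjectCategory.oct_catID)
        .where(ObjectCategory.oct_objID == 9)
        .order_by(ObjectCategory.oct_catID)
    ).scalars().all()
```

The trade-off is that this bypasses the ORM relationship, so an Object already loaded in the session will not see the new categories until its collection is refreshed.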

django postgresql query not working

I have a postgreSQL database that has a table foo that I've created outside of django. I used manage.py inspectdb to build the model for table foo for me. This technique worked fine when I was using MySQL but with PostgreSQL it is failing MISERABLY. The table is multiple gigabytes and I build it from a text file with PostgreSQL 'COPY'.
I can run raw queries on table foo and everything executes as expected.
For example
foo.objects.raw('bar_sql')
executes as expected.
But running queries like:
foo.objects.get(bar=bar)
throws
ProgrammingError column foo.id does not exist LINE 1: SELECT "foo"."id", "foo"."bar1", "all_...
foo doesn't innately have an id field. As I understand it django is supposed to create one. Have I somehow subverted this step by creating the tables outside of django?
Queries run on models whose tables were populated through django run as expected in all cases.
I'm missing something very basic here and any help would be appreciated.
I'm using django 1.6 with postgreSQL 9.3.

Django doesn't modify your existing database tables. It only creates new tables. If you have existing tables, it usually doesn't touch them at all.
"As I understand it django is supposed to create one." --> It only adds a primary key to a table when it creates it, which means you don't need to specify that explicitly in your model, but it won't do anything to an existing table.
So if for example you later on decide to add fields to your models, you have to update your databases manually.
What you need to do in your case is make sure, through manual database administration, that your table has a primary key, preferably named "id" (I am not sure the name is strictly required, but it is the safer choice). Use a database administration tool to add a primary key column named id to the table, and then the query should start working.

Is there a Non-Sqlalchemy way to deal with many-to-many relationships in Python?

I've searched for quite a long time on the web for a method that deals with many-to-many relationships in python sqlite3, but everything seems to lead to Sqlalchemy. I'm not against using sqlalchemy at all (although I do find it overkill from time to time, and it does introduce some unnecessary logic in many cases); I was wondering if there is a 'golden class/function' that provides a basic CRUD interface directly without bothering with Sqlalchemy? Any references (online or paper-based) will be highly appreciated.

If you want to solve many-to-many relations in basic SQL, you can do this manually using a third table for storing those relations.
CREATE TABLE users (
    user_id INT,
    user_name VARCHAR(255)
);
CREATE TABLE categories (
    category_id INT,
    category_name VARCHAR(255)
);
CREATE TABLE category_permission (
    user_id INT,
    category_id INT
); -- for storing relations
These three tables represent two models (user, category) and one many-to-many relation (category_permission).
You have to query them manually and also manually maintain the stored relations. Depending on the SQL engine you are using, you should consider using
a unique index in table category_permission on both columns
foreign keys to maintain relations when deleting something.
You can then select this way:
-- to list all users and their category count
SELECT U.user_name, COUNT(CP.category_id) AS permitted
FROM users U
LEFT JOIN category_permission CP
ON CP.user_id = U.user_id
GROUP BY U.user_id, U.user_name
ORDER BY permitted DESC;
-- to list all categories for __desired_user__
SELECT C.* FROM categories C
JOIN category_permission CM
ON CM.category_id = C.category_id
WHERE CM.user_id = __desired_user_id__;
For further reference, search for an SQL solution instead of a Python one (which will always lead you to some framework). The many-to-many relationship is a common problem in relational databases.
Mysql database design in a many to many relationship
MySQL many-to-many relationship with FOREIGN KEYS
... and so on ...
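Since the question asks about Python's sqlite3 module specifically, the junction-table approach above might be sketched as follows. This uses only the standard library. The helper functions grant, revoke and categories_for are hypothetical names introduced for illustration, and the composite primary key and ON DELETE CASCADE clauses are one way to apply the unique-index and foreign-key suggestions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL
);
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
CREATE TABLE category_permission (
    user_id INTEGER NOT NULL REFERENCES users (user_id) ON DELETE CASCADE,
    category_id INTEGER NOT NULL REFERENCES categories (category_id) ON DELETE CASCADE,
    PRIMARY KEY (user_id, category_id)  -- doubles as the unique index
);
""")


def grant(conn, user_id, category_id):
    """Link a user to a category; linking twice is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO category_permission (user_id, category_id) VALUES (?, ?)",
        (user_id, category_id),
    )


def revoke(conn, user_id, category_id):
    """Remove the link between a user and a category."""
    conn.execute(
        "DELETE FROM category_permission WHERE user_id = ? AND category_id = ?",
        (user_id, category_id),
    )


def categories_for(conn, user_id):
    """Return the names of all categories linked to the given user."""
    rows = conn.execute(
        "SELECT C.category_name FROM categories C "
        "JOIN category_permission CP ON CP.category_id = C.category_id "
        "WHERE CP.user_id = ? ORDER BY C.category_name",
        (user_id,),
    )
    return [name for (name,) in rows]


conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.executemany("INSERT INTO categories VALUES (?, ?)", [(1, "news"), (2, "sports")])
grant(conn, 1, 1)
grant(conn, 1, 2)
grant(conn, 1, 2)  # duplicate, ignored thanks to the composite primary key
conn.commit()
```

Calling categories_for(conn, 1) then returns the category names for user 1, and deleting a user or category removes its rows from category_permission through the cascading foreign keys.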

If you are looking at ORMs other than SqlAlchemy, you can compare the ORMs presented here: What are some good Python ORM solutions?
I personally tried Storm, which supports many-to-many almost transparently, but you have to write SQL to create the tables (IIRC).
I also tried autumn, which is dead, but a fork exists: AutORM. It is very lightweight but doesn't support many-to-many. You can probably work around that by declaring your junction table explicitly, and it is targeted at sqlite (see peewee).
I tested dejavu (dead) and peewee, which also doesn't handle many-to-many transparently but explains how to do it in its docs (you can apply the same approach to AutORM).
For my own case I finally used SqlAlchemy, because the machinery necessary for ORMs is so big anyway that I preferred getting more functions for the price, and at that time Python 3 support was not that common (my two cents :-) ).
