SQLAlchemy - defining a foreign key relationship in a different database - python

I'm using sqlalchemy declarative and python2.7 to read asset information from an existing database. The database uses a number of foreign keys for constant values. Many of the foreign keys exist on a different database.
How can I specify a foreign key relationship where the data exists on a separate database?
I've tried to use two separate Base classes, with the models inheriting from them separately.
I've also looked into specifying the primaryjoin keyword in relationship, but I've been unable to understand how it would be done in this case.
I think the problem is that I can only bind one engine to a session object. I can't see any way to ask sqlalchemy to use a different engine when making a query on a nested foreign key item.
OrgBase = declarative_base()
CommonBase = declarative_base()
class SomeClass:
    # db_type is not defined in this snippet; it is presumably set elsewhere in the module
    def __init__(self, sql_user, sql_pass, sql_host, org_db, common_host, common):
        self.engine = create_engine("{type}://{user}:{password}@{url}/{name}".format(
            type=db_type,
            user=sql_user,
            password=sql_pass,
            url=sql_host,
            name=org_db))
        self.engine_common = create_engine("{type}://{user}:{password}@{url}/{name}".format(
            type=db_type,
            user=sql_user,
            password=sql_pass,
            url=common_host,
            name="common"))
        self.session = sessionmaker(bind=self.engine)()
        OrgBase.metadata.bind = self.engine
        CommonBase.metadata.bind = self.engine_common
models.py:
class FrameRate(CommonBase):
    __tablename__ = 'content_frame_rates'
    __table_args__ = {'autoload': True}
class VideoAsset(OrgBase):
    __tablename__ = 'content_video_files'
    __table_args__ = {'autoload': True}
    frame_rate_id = Column(Integer, ForeignKey('content_frame_rates.frame_rate_id'))
    frame_rate = relationship(FrameRate, foreign_keys=[frame_rate_id])
Error with this code:
NoReferencedTableError: Foreign key associated with column 'content_video_files.frame_rate_id' could not find table 'content_frame_rates' with which to generate a foreign key to target column 'frame_rate_id'
if I run:
asset = self.session.query(self.VideoAsset).filter_by(uuid=asset_uuid).first()
My hope is that the VideoAsset model can nest frame_rate properly, finding the value on the separate database.
Thank you!
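One approach that is sometimes used when both databases can be reached through a single connection (for example, two databases on the same server) is to schema-qualify the table names, so that one Base and one engine can express the foreign key. The sketch below is not taken from the question. It uses SQLite with an attached in-memory database named common as a stand-in for the second database, it declares the columns explicitly instead of using autoload, and the columns id, uuid and rate are assumptions. It does not cover databases on different hosts, where a single SQL join is not possible.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.pool import StaticPool

Base = declarative_base()


class FrameRate(Base):
    __tablename__ = "content_frame_rates"
    __table_args__ = {"schema": "common"}  # lives in the second database
    frame_rate_id = Column(Integer, primary_key=True)
    rate = Column(Float)  # hypothetical column, for illustration only


class VideoAsset(Base):
    __tablename__ = "content_video_files"
    id = Column(Integer, primary_key=True)  # hypothetical primary key
    uuid = Column(String)
    # The ForeignKey target is schema-qualified so SQLAlchemy can find the table.
    frame_rate_id = Column(Integer, ForeignKey("common.content_frame_rates.frame_rate_id"))
    frame_rate = relationship(FrameRate)


# A single shared connection keeps the in-memory databases alive for the demo.
engine = create_engine(
    "sqlite://", poolclass=StaticPool, connect_args={"check_same_thread": False}
)


@event.listens_for(engine, "connect")
def attach_common(dbapi_connection, connection_record):
    # Stand-in for the second database: attach it under the name "common".
    dbapi_connection.execute("ATTACH DATABASE ':memory:' AS common")


Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(VideoAsset(uuid="abc", frame_rate=FrameRate(rate=23.976)))
session.commit()

asset = session.query(VideoAsset).filter_by(uuid="abc").first()
loaded_rate = asset.frame_rate.rate  # loaded from the "common" database
```

With this layout the session needs only one engine, because the database qualifier is part of the table name rather than part of the connection.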

Related

Why am I unable to generate a query using relationships?

I'm experimenting with relationship functionality within SQLAlchemy however I've not been able to crack it. The following is a simple MRE:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import relationship, sessionmaker
Base = declarative_base()
class Tournament(Base):
    __tablename__ = "tournament"
    __table_args__ = {"schema": "belgarath", "extend_existing": True}
    id_ = Column(Integer, primary_key=True)
    tournament_master_id = Column(Integer, ForeignKey("belgarath.tournament_master.id_"))
    tournament_master = relationship("TournamentMaster", back_populates="tournament")
class TournamentMaster(Base):
    __tablename__ = "tournament_master"
    __table_args__ = {"schema": "belgarath", "extend_existing": True}
    id_ = Column(Integer, primary_key=True)
    tour_id = Column(Integer, index=True)
    tournament = relationship("Tournament", back_populates="tournament_master")
engine = create_engine("mysql+mysqlconnector://root:root@localhost/")
Session = sessionmaker(bind=engine)
session = Session()
qry = session.query(Tournament.tournament_master.id_).limit(100)
I was hoping to be able to query the id_ field from the tournament_master table through a relationship specified in the tournament table. However I get the following error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with Tournament.tournament_master has an attribute 'id_'
I've also tried replacing the two relationship lines with a single backref line in TournamentMaster:
tournament = relationship("Tournament", backref="tournament_master")
However I then get the error:
AttributeError: type object 'Tournament' has no attribute 'tournament_master'
Where am I going wrong?
(I'm using SQLAlchemy v1.3.18)
Your ORM classes look fine. It's the query that's incorrect.
In short you're getting that "InstrumentedAttribute" error because you are misusing the session.query method.
From the docs the session.query method takes as arguments, "SomeMappedClass" or "entities". You have 2 mapped classes defined, Tournament, and TournamentMaster. These "entities" are typically either your mapped classes (ORM objects) or a Column of these mapped classes.
However you are passing in Tournament.tournament_master.id_ which is not a "MappedClass" or a column and thus not an "entity" that session.query can consume.
Another way to look at it is that by calling Tournament.tournament_master.id_ you are trying to access a 'TournamentMaster' record (or instance) from the 'Tournament' class, which doesn't make sense.
It's not super clear to me what exactly you are hoping to return from the query. In any case, here's a start.
Instead of
qry = session.query(Tournament.tournament_master.id_).limit(100)
try
qry = session.query(Tournament, TournamentMaster).join(TournamentMaster).limit(100)
This may also work (I haven't tested it) to return only the id_ field, if that is your intention:
qry = session.query(Tournament, TournamentMaster).join(TournamentMaster).with_entities(TournamentMaster.id_).limit(100)
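For readers who want something to run, here is a minimal sketch of querying a related column through a join. It makes assumptions that are not in the question: it uses an in-memory SQLite database instead of MySQL, it drops the belgarath schema, and it uses select_from() to make the left side of the join explicit.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()


class Tournament(Base):
    __tablename__ = "tournament"
    id_ = Column(Integer, primary_key=True)
    tournament_master_id = Column(Integer, ForeignKey("tournament_master.id_"))
    tournament_master = relationship("TournamentMaster", back_populates="tournament")


class TournamentMaster(Base):
    __tablename__ = "tournament_master"
    id_ = Column(Integer, primary_key=True)
    tour_id = Column(Integer, index=True)
    tournament = relationship("Tournament", back_populates="tournament_master")


engine = create_engine("sqlite://")  # in-memory stand-in for the MySQL server
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Two tournaments that share one master record.
master = TournamentMaster(tour_id=7)
session.add_all([Tournament(tournament_master=master), Tournament(tournament_master=master)])
session.commit()

# Pass a column of the related class to query() and join along the relationship.
qry = (
    session.query(TournamentMaster.id_)
    .select_from(Tournament)
    .join(Tournament.tournament_master)
    .limit(100)
)
master_ids = [row[0] for row in qry]
```

The relationship attribute is used as the join path, while the column handed to query() belongs to the mapped class TournamentMaster, which is an entity that query() accepts.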

SQL Alchemy ORM tables on-demand

Running into something guys and was hoping to get some ideas/help.
I have a database with a tree structure in which a leaf can reference several parents through foreign keys. The typical example is a city, which belongs to a country and to a continent. Countries and continents must not be duplicated, so before adding another city I need to look up those objects in the DB. If the country doesn't exist I have to create it, and to do that I first have to check for the continent and create that too if it is missing.
So far I have managed to create a whole bunch of items when everything runs from a single file, but once I push the SQLAlchemy code into a module the story becomes different. For some reason the metadata scope becomes limited, and if the table doesn't exist yet the code starts throwing ProgrammingError exceptions when I query for the foreign key target (from the city for the country). I have intercepted that, and in the __init__ constructor of the class I am looking for (country) I check whether the table exists and create it if it doesn't. There are two things I have a problem with and need advice on:
1) Verification of the table is inefficient - I am working with the Base.metadata.sorted_tables array through which I have to look through and figure out if the table structure is the one that matches my class __tablename__. Such as:
for table in Base.metadata.sorted_tables:
    # Find the right table in the list of tables
    if table.name == self.__tablename__:
        if __DEBUG__:
            print('DEBUG: Found table {} that equal to the class table {}'.format(table.name, self.__tablename__))
        if not table.exists():
            session.get_bind().execute(table.create())
Needless to say, this takes time, and I am looking for a more efficient way to do the same.
2) The second issue is with inheriting from the declarative base (declarative_base()) with respect to OOP in Python. I want to take some of the code repetition away and pull it into one class from which the other classes are derived. For instance, the code above can be moved into a separate method, giving something like this:
Base = declarative_base()
class OnDemandTables(Base):
    __tablename__ = 'no_table'
    # id = Column(Integer, Sequence('id'), nullable=False, unique=True, primary_key=True, autoincrement=True)
    def create_my_table(self, session):
        if __DEBUG__:
            print('DEBUG: Creating tables for the class {}'.format(self.__class__))
            print('DEBUG: Base.metadata.sorted_tables exists returns {}'.format(Base.metadata.sorted_tables))
        for table in Base.metadata.sorted_tables:
            # Find the right table in the list of tables
            if table.name == self.__tablename__:
                if __DEBUG__:
                    print('DEBUG: Found table {} that equal to the class table {}'.format(table.name, self.__tablename__))
                if not table.exists():
                    session.get_bind().execute(table.create())
class Continent(OnDemandTables):
    __tablename__ = 'continent'
    id = Column(Integer, Sequence('id'), nullable=False, unique=True, primary_key=True, autoincrement=True)
    name = Column(String(64), unique=True, nullable=False)
    def __init__(self, session, continent_description):
        if type(continent_description) != dict:
            raise AttributeError('Continent should be described by the dictionary!')
        else:
            self.create_my_table(session)
            if 'continent' not in continent_description:
                raise ReferenceError('No continent can be created without a name!. Dictionary is {}'.
                                     format(continent_description))
            else:
                self.name = continent_description['continent']
                print('DEBUG: Continent name is {} '.format(self.name))
The problem here is that the metadata is trying to link unrelated classes together and requires __tablename__ and some index column to be present in the parent OnDemandTables class, which doesn't make any sense to me.
Any ideas?
Cheers
Wanted to post the solution here for the rest of the gang to keep in mind. Apparently, SQLAlchemy doesn't see the classes in the module if they are not being used, so to say. After a couple of days of trying to work around things, the simplest solution I found was a semi-manual one: not relying on the ORM to construct and build up the database, but doing that part by hand using class methods. The code is:
__DEBUG__ = True
from sqlalchemy import String, Integer, Column, ForeignKey, BigInteger, Float, Boolean, Sequence
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy.orm.exc import MultipleResultsFound, NoResultFound
from sqlalchemy.exc import ProgrammingError
from sqlalchemy import create_engine, schema
from sqlalchemy.orm import sessionmaker
Base = declarative_base()
engine = create_engine("mysql://test:test123@localhost/test", echo=True)
Session = sessionmaker(bind=engine, autoflush=False)
session = Session()
schema.MetaData.bind = engine
class TemplateBase(object):
    __tablename__ = None
    @classmethod
    def create_table(cls, session):
        if __DEBUG__:
            print('DEBUG: Creating tables for the class {}'.format(cls.__name__))
            print('DEBUG: Base.metadata.sorted_tables exists returns {}'.format(Base.metadata.sorted_tables))
        for table in Base.metadata.sorted_tables:
            # Find the right table in the list of tables
            if table.name == cls.__tablename__:
                if __DEBUG__:
                    print('DEBUG: Found table {} that equal to the class table {}'.format(table.name, cls.__tablename__))
                if not table.exists():
                    if __DEBUG__:
                        print('DEBUG: Session is {}, engine is {}, table is {}'.format(session, session.get_bind(), dir(table)))
                    table.create()
    @classmethod
    def is_provisioned(cls):
        for table in Base.metadata.sorted_tables:
            # Find the right table in the list of tables
            if table.name == cls.__tablename__:
                if __DEBUG__:
                    print('DEBUG: Found table {} that equal to the class table {}'.format(table.name, cls.__tablename__))
                return table.exists()
class Continent(Base, TemplateBase):
    __tablename__ = 'continent'
    id = Column(Integer, Sequence('id'), nullable=False, unique=True, primary_key=True, autoincrement=True)
    name = Column(String(64), unique=True, nullable=False)
    def __init__(self, session, provision, continent_description):
        if type(continent_description) != dict:
            raise AttributeError('Continent should be described by the dictionary!')
        else:
            if 'continent' not in continent_description:
                raise ReferenceError('No continent can be created without a name!. Dictionary is {}'.
                                     format(continent_description))
            else:
                self.name = continent_description['continent']
                if __DEBUG__:
                    print('DEBUG: Continent name is {} '.format(self.name))
This gives the following:
1. The class methods is_provisioned and create_table can be called when the code first starts and will reflect the database state.
2. The helper methods live in a second class that the ORM classes inherit from alongside Base. That class does not interfere with the ORM classes, so it is not linked into the metadata.
Because the loop over Base.metadata.sorted_tables only ever matches the class's own table, the code can be optimized further by removing the loop. The next step would be to put the classes into a list ordered by their dependencies and loop through it, calling is_provisioned and, if necessary, create_table.
Hope it helps the others.
Regards
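As a rough illustration of the loop-free variant mentioned above, the sketch below reaches the table through cls.__table__ and lets checkfirst=True decide whether a CREATE TABLE is needed. It is a sketch under assumptions rather than the poster's code: it uses an in-memory SQLite engine, passes the engine in explicitly instead of binding the metadata, and simplifies the Continent columns.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class TemplateBase(object):
    @classmethod
    def create_table(cls, engine):
        # cls.__table__ is the Table that declarative built for this class,
        # so there is no need to search Base.metadata.sorted_tables.
        # checkfirst=True skips the CREATE TABLE when the table already exists.
        cls.__table__.create(bind=engine, checkfirst=True)


class Continent(Base, TemplateBase):
    __tablename__ = "continent"
    id = Column(Integer, primary_key=True)
    name = Column(String(64), unique=True, nullable=False)


engine = create_engine("sqlite://")  # in-memory stand-in for the MySQL database
Continent.create_table(engine)
Continent.create_table(engine)  # second call does nothing because the table exists

with engine.connect() as connection:
    table_exists = engine.dialect.has_table(connection, "continent")
```

Calling create_table for each class in dependency order would then provision the whole tree without touching tables that are already present.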

Using multiple databases with single sqlalchemy model

I want to use multiple database engines with a single sqlalchemy database model.
Following situation:
I have a photo album software (python) and the different albums are stored in different folders. In each folder is a separate sqlite database with additional information about the photos. I don't want to use a single global database because with this way I can simply move, delete and copy albums on a folder base.
Opening a single album is fairly straightforward:
Creating a db session:
maker = sessionmaker(autoflush=True, autocommit=False,
                     extension=ZopeTransactionExtension())
DBSession = scoped_session(maker)
Base class and metadata for db model:
DeclarativeBase = declarative_base()
metadata = DeclarativeBase.metadata
Defining database model (shortened):
pic_tag_table = Table('pic_tag', metadata,
                      Column('pic_id', Integer,
                             ForeignKey('pic.pic_id'),
                             primary_key=True),
                      Column('tag_id', Integer,
                             ForeignKey('tag.tag_id'),
                             primary_key=True))
class Picture(DeclarativeBase):
    __tablename__ = 'pic'
    pic_id = Column(Integer, autoincrement=True, primary_key=True)
    ...
class Tags(DeclarativeBase):
    __tablename__ = 'tag'
    tag_id = Column(Integer, autoincrement=True, primary_key=True)
    ...
    pictures = relation('Picture', secondary=pic_tag_table, backref='tags')
And finally open the connection:
engine = engine_from_config(config, '...')
DBSession.configure(bind=engine)
metadata.bind = engine
This works well for opening one album. Now I want to open multiple albums (and db connections) at the same time. Every album has the same database model, so my hope is that I can reuse it. My problem is that the model class definition is inherited from the declarative base, which is connected to the metadata and the database engine. I want to connect the classes to different metadata with different engines. Is this possible?
P.S.: I also want to query the databases via the ORM, e.g. DBSession.query(Picture).all() (or DBSession[0], ... for multiple sessions on different databases - so not one query for all pictures in all databases but one ORM style query for querying one database)
You can achieve this with multiple engines and sessions (you don't need multiple metadata):
engine1 = create_engine("sqlite:///tmp1.db")
engine2 = create_engine("sqlite:///tmp2.db")
Base.metadata.create_all(bind=engine1)
Base.metadata.create_all(bind=engine2)
session1 = Session(bind=engine1)
session2 = Session(bind=engine2)
print(session1.query(Picture).all()) # []
print(session2.query(Picture).all()) # []
session1.add(Picture())
session1.commit()
print(session1.query(Picture).all()) # [Picture]
print(session2.query(Picture).all()) # []
session2.add(Picture())
session2.commit()
print(session1.query(Picture).all()) # [Picture]
print(session2.query(Picture).all()) # [Picture]
session1.close()
session2.close()
For scoped_session, you can create multiple of those as well.
engine1 = create_engine("sqlite:///tmp1.db")
engine2 = create_engine("sqlite:///tmp2.db")
Base.metadata.create_all(bind=engine1)
Base.metadata.create_all(bind=engine2)
Session1 = scoped_session(sessionmaker(bind=engine1))
Session2 = scoped_session(sessionmaker(bind=engine2))
session1 = Session1()
session2 = Session2()
...
If you have a variable number of databases you need to have open, scoped_session might be a little cumbersome. You'll need some way to keep track of them.

sqlalchemy dynamic schema on entity at runtime

I'm using SQLAlchemy and have some schemas that are account-specific. The name of the schema is derived from the account ID, so I don't have the name of the schema until I hit my application service or repository layer. I'm wondering if it's possible to run a query against an entity that has its schema set dynamically at runtime?
I know I need to set the __table_args__['schema'] and have tried doing that using the type() built-in, but I always get the following error:
could not assemble any primary key columns for mapped table
I'm ready to give up and just write straight sql, but I really hate to do that. Any idea how this can be done? I'm using SA 0.99 and I do have a PK mapped.
Thanks
From SQLAlchemy 1.1 onward, this can be done easily using schema_translate_map.
https://docs.sqlalchemy.org/en/11/changelog/migration_11.html#multi-tenancy-schema-translation-for-table-objects
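A minimal sketch of how the schema_translate_map execution option might be used per account follows. It rests on several assumptions: attached in-memory SQLite databases named account_1 and account_2 stand in for the account schemas, the helpers engine_for and user_names are hypothetical, and the model declares no schema so that the None key in the map supplies one at runtime.

```python
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import StaticPool

Base = declarative_base()


class User(Base):
    __tablename__ = "user"  # no schema here; it is supplied per account at runtime
    id = Column(Integer, primary_key=True)
    name = Column(String)


# A single shared connection keeps the in-memory databases alive for the demo.
engine = create_engine(
    "sqlite://", poolclass=StaticPool, connect_args={"check_same_thread": False}
)


@event.listens_for(engine, "connect")
def attach_account_schemas(dbapi_connection, connection_record):
    # Stand-ins for the account schemas of a real multi-tenant database.
    for schema in ("account_1", "account_2"):
        dbapi_connection.execute("ATTACH DATABASE ':memory:' AS " + schema)


def engine_for(schema):
    # Same connection pool, but tables without a schema are rendered in `schema`.
    return engine.execution_options(schema_translate_map={None: schema})


def user_names(schema):
    session = sessionmaker(bind=engine_for(schema))()
    try:
        return [user.name for user in session.query(User).order_by(User.id)]
    finally:
        session.close()


for schema in ("account_1", "account_2"):
    Base.metadata.create_all(engine_for(schema))

session = sessionmaker(bind=engine_for("account_1"))()
session.add(User(name="alice"))
session.commit()
session.close()
```

Calling user_names("account_1") and user_names("account_2") then runs the same ORM query against two different schemas without touching the class definition.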
One option would be to reflect the particular account-dependent tables. Here is the SqlAlchemy Documentation on the matter.
Alternatively, you can create the table with a static schema attribute, update it as needed at runtime, and then run the queries you need. I can't think of a non-messy way to do this, so here's the messy option:
Use a loop to update the schema property in each table definition whenever the account is switched.
Add all the tables that are account-specific to a list.
If the tables are expressed in the declarative syntax, then you have to modify the DeclarativeName.__table__.schema attribute. I'm not sure if you also need to modify DeclarativeName.__table_args__['schema'], but I guess it won't hurt.
If the tables are expressed in the old-style Table syntax, then you have to modify the Table.schema attribute.
If you're using text for any relationships or foreign keys, then that will break, and you have to inspect each table for such hard-coded usage and change it.
example
user_id = Column(ForeignKey('my_schema.user.id')) needs to be written as user_id = Column(ForeignKey(User.id)). Then you can change the schema of User to my_new_schema. Otherwise, at query time sqlalchemy will be confused because the foreign key will point to my_schema.user.id while the query would point to my_new_schema.user.
I'm not sure if more complicated relationships can be expressed without the use of plain text, so I guess that's the limit to my proposed solution.
Here's an example I wrote up in the terminal:
>>> from sqlalchemy import Column, Table, Integer, String, select, ForeignKey
>>> from sqlalchemy.orm import relationship, backref
>>> from sqlalchemy.ext.declarative import declarative_base
>>> B = declarative_base()
>>>
>>> class User(B):
...     __tablename__ = 'user'
...     __table_args__ = {'schema': 'first_schema'}
...     id = Column(Integer, primary_key=True)
...     name = Column(String)
...     email = Column(String)
...
>>> class Posts(B):
...     __tablename__ = 'posts'
...     __table_args__ = {'schema':'first_schema'}
...     id = Column(Integer, primary_key=True)
...     user_id = Column(ForeignKey(User.id))
...     text = Column(String)
...
>>> str(select([User.id, Posts.text]).select_from(User.__table__.join(Posts)))
'SELECT first_schema."user".id, first_schema.posts.text \nFROM first_schema."user" JOIN first_schema.posts ON first_schema."user".id = first_schema.posts.user_id'
>>> account_specific = [User, Posts]
>>> for Tbl in account_specific:
...     Tbl.__table__.schema = 'second_schema'
...
>>> str(select([User.id, Posts.text]).select_from(User.__table__.join(Posts)))
'SELECT second_schema."user".id, second_schema.posts.text \nFROM second_schema."user" JOIN second_schema.posts ON second_schema."user".id = second_schema.posts.user_id'
As you see the same query refers to the second_schema after I update the table's schema attribute.
edit: Although you can do what I did here, using the schema translation map as shown in the other answer is the proper way to do it.
They are set statically. Foreign keys needs the same treatment, and I have an additional issue, in that I have multiple schemas that contain multiple tables so I did this:
from sqlalchemy.ext.declarative import declarative_base
staging_dbase = declarative_base()
model_dbase = declarative_base()
def adjust_schemas(staging, model):
    for vv in staging_dbase.metadata.tables.values():
        vv.schema = staging
    for vv in model_dbase.metadata.tables.values():
        vv.schema = model
def all_tables():
    return staging_dbase.metadata.tables.union(model_dbase.metadata.tables)
Then in my startup code:
adjust_schemas(staging=staging_name, model=model_name)
You can mod this for a single declarative base.
I'm working on a project in which I have to create postgres schemas and tables dynamically and then insert data in proper schema. Here is something I have done maybe it will help someone:
import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from app.models.user import User
engine_uri = "postgresql://someusername:somepassword@localhost:5432/users"
engine = create_engine(engine_uri, pool_pre_ping=True)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
def create_schema(schema_name: str):
    """
    Creates a new postgres schema
    - **schema_name**: name of the new schema to create
    """
    if not engine.dialect.has_schema(engine, schema_name):
        engine.execute(sqlalchemy.schema.CreateSchema(schema_name))
def create_tables(schema_name: str):
    """
    Create new tables for postgres schema
    - **schema_name**: schema in which tables are to be created
    """
    if (
        engine.dialect.has_schema(engine, schema_name) and
        not engine.dialect.has_table(engine, str(User.__table__.name))
    ):
        User.__table__.schema = schema_name
        User.__table__.create(engine)
def add_data(schema_name: str):
    """
    Add data to a particular postgres schema
    - **schema_name**: schema in which data is to be added
    """
    if engine.dialect.has_table(engine, str(User.__table__.name)):
        db = SessionLocal()
        # Render tables that have no schema of their own in schema_name for this session's connection
        db.connection(execution_options={
            "schema_translate_map": {None: schema_name}},
        )
        user = User()
        user.name = "Moin"
        user.salary = 10000
        db.add(user)
        db.commit()

SQLAlchemy not setting primary key from auto increment after commit

I am using SQLAlchemy to connect to a postgresql database. I have defined my primary key columns in postgresql to be of type serial, i.e. auto-increment integer, and have marked them in my SQLAlchemy model with primary_key=True.
On committing the SQLAlchemy session, the model is saved to the db and I can see the primary key set in the database but the id property on my SQLAlchemy model object always has a value of None i.e. it isn't picking up the auto-increment value. I'm not sure what I have got wrong.
I have checked out the existing SO questions but have not found an answer:
Set SQLAlchemy to use PostgreSQL SERIAL for identity generation
sqlalchemy flush() and get inserted id?
My code is below:
Create the table in postgres:
CREATE TABLE my_model
(
    id serial NOT NULL,
    type text,
    user_id integer,
    CONSTRAINT pk_network_task PRIMARY KEY (id)
)
WITH (
    OIDS=FALSE
);
Set up SQLAlchemy and the model:
import sqlalchemy as sa
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import scoped_session, sessionmaker
Base = declarative_base()
engine = create_engine(db_url, convert_unicode=True, echo=True)
session = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=engine))
class MyModel(Base):
    __tablename__ = 'my_model'
    id = sa.Column(sa.Integer, primary_key=True)
    user_id = sa.Column(sa.Integer)
    type = sa.Column(sa.String)
Try and store the model:
my_model = MyModel()
my_model.user_id = 1
my_model.type = "A type"
session.merge(my_model)
session.commit()
my_model.id  # Always None, don't know why
my_model.id is still None after the commit. I have also tried calling close on the session, but that didn't work either.
Turns out I didn't understand the difference between
session.merge(my_model)
and
session.add(my_model)
session.merge(my_model) (which I had been using) doesn't add the object given to it to the session. Instead it returns a new object i.e. the merged model, which has been added to the session. If you reference this new object all is well i.e.
my_model = session.merge(my_model)
add on the other hand, adds the object given to it to the session.
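The difference can be seen in a small self-contained sketch. It assumes an in-memory SQLite database in place of PostgreSQL, and the variable names original, merged and added are only for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()


class MyModel(Base):
    __tablename__ = "my_model"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer)
    type = Column(String)


engine = create_engine("sqlite://")  # in-memory stand-in for PostgreSQL
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# merge() copies the state onto a new object that belongs to the session
# and returns that object; `original` itself is never attached.
original = MyModel(user_id=1, type="A type")
merged = session.merge(original)
session.commit()

# add() attaches the very object it is given.
added = MyModel(user_id=2, type="Another type")
session.add(added)
session.commit()
```

After the commits, merged.id and added.id hold the generated keys, while original.id stays None because that object was never part of the session.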
