I want to be able to load data automatically upon creation of tables using SQLAlchemy.
In Django, you have fixtures, which let you easily pre-populate your database with data when a table is created. I found this especially useful for basic "lookup" tables, e.g. product_type or student_type, which contain just a few rows, or even a table like currencies, which should hold all the currencies of the world without you having to key them in again every time you recreate your models/classes.
My current app isn't using Django; I have SQLAlchemy. How can I achieve the same thing? I want the app to know that the database is being created for the first time and populate certain tables with data accordingly.
I used an event listener to pre-populate the database with data upon creation of a table.
Let's say you have a ProductType model in your code:
from sqlalchemy import event, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class ProductType(Base):
    __tablename__ = 'product_type'

    id = Column(Integer, primary_key=True)
    name = Column(String(100))
First, you need to define a callback function, which will be executed when the table is created:
def insert_data(target, connection, **kw):
    # executemany: both rows are inserted in a single call
    connection.execute(target.insert(), [{'id': 1, 'name': 'spam'},
                                         {'id': 2, 'name': 'eggs'}])
Then you just add the event listener:
event.listen(ProductType.__table__, 'after_create', insert_data)
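With that in place, creating the schema fires the listener. A minimal sketch of the end-to-end flow, using an in-memory SQLite engine purely for illustration:
from sqlalchemy import create_engine

engine = create_engine('sqlite://')

# emits CREATE TABLE for product_type, then runs insert_data
Base.metadata.create_all(engine)

with engine.connect() as conn:
    print(conn.execute(ProductType.__table__.select()).fetchall())
    # [(1, 'spam'), (2, 'eggs')]
Because the listener is tied to the after_create event, the rows are only inserted when the table itself is created, not on every application start.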
The short answer is no: SQLAlchemy doesn't provide an equivalent of Django's dumpdata and loaddata.
There is https://github.com/kvesteri/sqlalchemy-fixtures, which might be useful for you, but the workflow is different.
I have a Flask application that relies on an existing Teradata database to serve information to, and accept input from, its users. I can successfully make the connection between the application and the Teradata database; however, I am not able to define classes that represent tables already existing in the database.
Currently, I am defining a 'Base' class using SQLAlchemy that represents the connection to my database. There is no problem here, and I am even able to execute queries using the connection used to build the 'Base' class. My problem is in using this 'Base' class to create a subclass 'Users' for my Teradata table 'users'. My understanding is that SQLAlchemy should let me define a subclass of 'Base' that inherits the metadata of the underlying Teradata table it represents, in this case my 'users' table. Here is the code I have so far:
import getpass

from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.schema import MetaData

user = 'user_id_string'
pasw = getpass.getpass()
host = 'host_string'
db_name = 'db_name'

engine = create_engine(f'{host}?user={user}&password={pasw}&logmech=LDAP')
connection = engine.connect()
connection.execute(f'DATABASE {db_name}')

md = MetaData(bind=connection, reflect=False, schema=db_name)
md.reflect(only=['users'])

Base = declarative_base(bind=connection, metadata=md)

class Users(Base):
    __table__ = md.tables[f'{db_name}.users']
This is the error that I receive when constructing the subclass 'Users':
sqlalchemy.exc.ArgumentError: Mapper mapped class Users->users could not assemble any primary key columns for mapped table 'users'
Is there some reason that my subclass 'Users' is not automatically being mapped to the table metadata of the existing Teradata table 'users' that I assigned to it when defining the class? The underlying table already has a primary key set, so I don't understand why SQLAlchemy isn't picking up the existing primary key. Thanks for your help in advance.
EDIT: The underlying table DOES NOT have a primary KEY, only a primary INDEX.
From the SQLAlchemy documentation (https://docs.sqlalchemy.org/en/13/faq/ormconfiguration.html#how-do-i-map-a-table-that-has-no-primary-key):
The SQLAlchemy ORM, in order to map to a particular table, needs there to be at least one column denoted as a primary key column; multiple-column, i.e. composite, primary keys are of course entirely feasible as well. These columns do not need to be actually known to the database as primary key columns, though it’s a good idea that they are. It’s only necessary that the columns behave as a primary key does, e.g. as a unique and not nullable identifier for a row.
Most ORMs require that objects have some kind of primary key defined because the object in memory must correspond to a uniquely identifiable row in the database table; at the very least, this allows the object to be targeted for UPDATE and DELETE statements which will affect only that object’s row and no other. However, the importance of the primary key goes far beyond that. In SQLAlchemy, all ORM-mapped objects are at all times linked uniquely within a Session to their specific database row using a pattern called the identity map, a pattern that’s central to the unit of work system employed by SQLAlchemy, and is also key to the most common (and not-so-common) patterns of ORM usage.
In almost all cases, a table does have a so-called candidate key, which is a column or series of columns that uniquely identify a row. If a table truly doesn’t have this, and has actual fully duplicate rows, the table is not corresponding to first normal form and cannot be mapped. Otherwise, whatever columns comprise the best candidate key can be applied directly to the mapper:
class SomeClass(Base):
    __table__ = some_table_with_no_pk
    __mapper_args__ = {
        'primary_key': [some_table_with_no_pk.c.uid, some_table_with_no_pk.c.bar]
    }
Better yet, when using fully declared table metadata, use the primary_key=True flag on those columns:
class SomeClass(Base):
    __tablename__ = "some_table_with_no_pk"

    uid = Column(Integer, primary_key=True)
    bar = Column(String, primary_key=True)
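Applied to the reflected Teradata table from the question, the first form might look like the sketch below. The column name user_id is a placeholder, since the question doesn't show which column(s) make up the primary index:
users_table = md.tables['db_name.users']

class Users(Base):
    __table__ = users_table
    __mapper_args__ = {
        # 'user_id' is hypothetical: list the column(s) of your primary index
        'primary_key': [users_table.c.user_id],
    }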
Say there is a base.py holding the engine and declarative base, and a movie.py containing the table definition for Movie, like:
# base.py
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

engine = create_engine('postgresql://usr:pass@localhost:5432/sqlalchemy')
Session = sessionmaker(bind=engine)
Base = declarative_base()
# movie.py
from sqlalchemy import Column, String, Integer, Date

from base import Base

class Movie(Base):
    __tablename__ = 'movies'

    id = Column(Integer, primary_key=True)
    title = Column(String)
    release_date = Column(Date)

    def __init__(self, title, release_date):
        self.title = title
        self.release_date = release_date
And insert some data like:
# coding=utf-8
from datetime import date

from base import Session, engine, Base
from movie import Movie

# 1 - generate database schema
Base.metadata.create_all(engine)

# 2 - create a new session
session = Session()

# 3 - create movies
bourne_identity = Movie("The Bourne Identity", date(2002, 10, 11))
furious_7 = Movie("Furious 7", date(2015, 4, 2))
pain_and_gain = Movie("Pain & Gain", date(2013, 8, 23))

# 4 - persist data
session.add(bourne_identity)
session.add(furious_7)
session.add(pain_and_gain)

# 5 - commit and close session
session.commit()
session.close()
Is there a way I could keep the old data I have inserted if I change the definition of my movie table (e.g. add more columns in movie.py)?
Did you mean to ask "how do I update the database schema of an existing database?" That's called "schema migration", and there are a number of ways of attacking it. The most basic is to have SQLAlchemy work in the other direction: have it generate its schema metadata from an existing database, instead of creating the metadata first and having SQLAlchemy build a database from it. This is called reflection. You'd do this, then issue individual commands to update your database schema, deciding as you go what should happen to the existing rows in your table. You would still use your domain object definition (the Movie class), but you wouldn't use create_all(); create_all() ignores any tables that already exist.
In reality, this gets complex quickly, so you usually want a formal schema-migration strategy, and probably a support package for it. SQLAlchemy's own documentation recommends two such packages; see this page:
https://docs.sqlalchemy.org/en/latest/core/metadata.html
Scroll down a bit to the "Altering Schemas through Migrations" section.
Someone may have more to offer you in terms of how to do this manually, without a migration package. I've always used such a package for any task where I wasn't willing to blow away my data and start from scratch whenever my schema changed.
Another option I've seen used is to export all your data, have SQLAlchemy build a fresh, empty database, and then import the existing data back into the new database. You'll need to set appropriate defaults for the new columns that won't exist in the incoming data; in fact, you'll have to handle defaults for missing columns no matter how you attack this problem.
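One of the packages referenced on that documentation page is Alembic. As a minimal sketch of the migration-package route (assuming Alembic has been set up with alembic init and pointed at the same database; the revision identifiers and the new director column are illustrative, not from the question), a migration that adds a column while preserving existing rows might look like:
# migration script created by `alembic revision`
from alembic import op
import sqlalchemy as sa

revision = '0001_add_director'
down_revision = None

def upgrade():
    # existing rows receive the server default, so no data is lost
    op.add_column(
        'movies',
        sa.Column('director', sa.String(), nullable=False, server_default='unknown'),
    )

def downgrade():
    op.drop_column('movies', 'director')
Running alembic upgrade head would then apply the change in place, instead of dropping and recreating the table.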
Starting from an existing (SQLite) database with foreign keys, can SQLAlchemy automatically build relationships?
SQLAlchemy classes are automatically created via __table_args__ = {'autoload': True}.
The goal would be to easily access data from related tables without having to add all the relationships one by one by hand (i.e. without using sqlalchemy.orm.relationship() and sqlalchemy.orm.backref).
[Update] As of SQLAlchemy 0.9.1 there is the Automap extension for doing that.
For SQLAlchemy < 0.9.0 you can use SQLAlchemy reflection.
SQLAlchemy reflection loads foreign-key and primary-key relationships between tables, but it doesn't create relationships between mapped classes. In fact, reflection doesn't create mapped classes for you at all; you have to declare each mapped class and its table name yourself.
That said, reflection's support for loading foreign keys is a great helper and time-saver: with it you can build a query with joins without needing to specify which columns to join on.
from sqlalchemy import MetaData, create_engine, orm
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

metadata = MetaData()
Base = declarative_base(metadata=metadata)

db = create_engine('<db connection URL>', echo=False)
metadata.reflect(bind=db)

cause_code_table = metadata.tables['cause_code']
ndticket_table = metadata.tables['ndticket']

sm = orm.sessionmaker(bind=db, autoflush=True, autocommit=True, expire_on_commit=True)
session = orm.scoped_session(sm)

# the reflected foreign key lets the join find its columns automatically
q = session.query(ndticket_table, cause_code_table).join(cause_code_table)
for r in q.limit(10):
    print(r)
Also, when using reflection to run queries against an existing database, I only had to define the mapped class names, table bindings, and relationships; there was no need to define the table columns for those relationships.
class CauseCode(Base):
    # bind the reflected table to the class
    __table__ = metadata.tables["cause_code"]

class NDTicket(Base):
    __table__ = metadata.tables["ndticket"]

    # the reflected foreign key supplies the join condition
    cause_code = relationship("CauseCode", backref="ndticket")

q = session.query(NDTicket)
for r in q.limit(10):
    print(r.ticket_id, r.cause_code.cause_code)
Overall, SQLAlchemy reflection is already a powerful tool that saves me time, so adding relationships manually is a small overhead for me.
If I had to develop functionality that adds relationships between mapped classes from existing foreign keys, I would start with reflection plus the Inspector. Its get_foreign_keys() method returns all the information required to build a relationship: the referred table name, the referred column names, and the constrained column names in the target table. That information could then be used to add a relationship property to the mapped class.
from sqlalchemy.engine import reflection

insp = reflection.Inspector.from_engine(db)
print(insp.get_table_names())
print(insp.get_foreign_keys(NDTicket.__tablename__))
# [{'referred_table': 'cause_code', 'referred_columns': ['cause_code'],
#   'referred_schema': None, 'name': 'SYS_C00135367',
#   'constrained_columns': ['cause_code_id']}]
As of SQLAlchemy 0.9.1 the (for now experimental) Automap extension would seem to do just that: http://docs.sqlalchemy.org/en/rel_0_9/orm/extensions/automap.html
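A minimal sketch of Automap against the same two tables as the reflection example above (the connection URL is a placeholder, and the foreign key from ndticket to cause_code is assumed to exist as before):
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

AutomapBase = automap_base()
engine = create_engine('<db connection URL>')

# reflect the tables and generate mapped classes, including relationships
# derived from the foreign keys
AutomapBase.prepare(engine, reflect=True)

NDTicket = AutomapBase.classes.ndticket
CauseCode = AutomapBase.classes.cause_code

session = Session(engine)
for ticket in session.query(NDTicket).limit(10):
    # automap names the many-to-one relationship after the referred class
    print(ticket.cause_code.cause_code)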
I have a SQLAlchemy ORM model that currently looks a bit like this:
from sqlalchemy import Column, String
import sqlalchemy.schema as saschema
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Database(Base):
    __tablename__ = "databases"
    __table_args__ = (
        saschema.PrimaryKeyConstraint('db', 'role'),
        {'schema': 'defines'},
    )

    db = Column(String, nullable=False)
    role = Column(String, nullable=False)
    server = Column(String)
Here's the thing: in practice this model exists in multiple databases, and within those databases it exists in multiple schemas. For any one operation I'll only use one (database, schema) tuple.
Right now, I can set the database engine using this:
Session = scoped_session(sessionmaker())
Session.configure(bind=my_db_engine)
# ... do my operations on the model here.
But I'm not sure how I can change the __table_args__ at execution time so that the schema will be the right one.
One option is to use create_engine to bind the models to the right schema/database at connection time rather than hard-coding it in the model definition.
# first connect to the database that holds the DB and schema definitions
engine1 = create_engine('mysql://user:pass@db.com/schema1')
Session = sessionmaker()
session = Session(bind=engine1)

# fetch the first database
database = session.query(Database).first()

# then connect to the server/database named by that row
# (column names follow the Database model above)
engine2 = create_engine('mysql://user:pass@%s/%s' % (database.server, database.db))
session2 = Session(bind=engine2)
I don't know that this is ideal, but it is one way to do it. If you cache the list of databases beforehand, then in most cases you only have to create one session.
I'm also looking for an elegant solution to this problem. If there are standard tables/models, but different table names, databases, and schemas at runtime, how is this handled? Others have suggested writing some sort of function that takes a table name and schema argument and constructs the model for you. I've found that using __abstract__ helps. I've suggested a solution here that may be useful; it involves adding a Base with a specific schema/metadata into the inheritance, along the lines of the sketch below.
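A sketch of that factory approach, building a schema-specific declarative base at runtime (the function name and the 'defines' schema value are illustrative):
from sqlalchemy import Column, MetaData, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.schema import PrimaryKeyConstraint

def make_database_model(schema_name):
    # each schema gets its own MetaData so table definitions don't collide
    SchemaBase = declarative_base(metadata=MetaData(schema=schema_name))

    class Database(SchemaBase):
        __tablename__ = 'databases'
        __table_args__ = (PrimaryKeyConstraint('db', 'role'),)

        db = Column(String, nullable=False)
        role = Column(String, nullable=False)
        server = Column(String)

    return Database

# build the model for whichever schema is chosen at runtime
DefinesDatabase = make_database_model('defines')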
How close can I get to defining a model in SQLAlchemy like:
class Person(Base):
    pass
And just have it dynamically pick up the field names? Is there any way to get naming conventions to control the relationships between tables? I guess I'm looking for something similar to RoR's ActiveRecord, but in Python.
Not sure if this matters, but I'll be trying to use this under IronPython rather than CPython.
It is very simple to automatically pick up the field names:
from sqlalchemy import MetaData, Table, create_engine
from sqlalchemy.orm import mapper

engine = create_engine('<db connection URL>')
metadata = MetaData()
metadata.bind = engine

person_table = Table("tablename", metadata, autoload=True)

class Person(object):
    pass

mapper(Person, person_table)
Using this approach, you have to define the relationships in the call to mapper(), so there is no auto-discovery of relationships; a sketch of that follows.
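A minimal sketch, assuming a hypothetical address table with a foreign key to the person table (the table and class names here are illustrative, not from the question):
from sqlalchemy.orm import relationship

# hypothetical reflected table with a foreign key referencing person
address_table = Table("address", metadata, autoload=True)

class Address(object):
    pass

# the relationship is declared by hand; reflection only supplies the columns
mapper(Address, address_table, properties={
    "person": relationship(Person, backref="addresses"),
})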
To automatically map classes to tables with the same name, you could do:
def map_class(class_):
    table = Table(class_.__name__, metadata, autoload=True)
    mapper(class_, table)

map_class(Person)
map_class(Order)
Elixir might do everything you want.
AFAIK SQLAlchemy intentionally decouples database metadata from class layout.
You might want to investigate Elixir (http://elixir.ematia.de/trac/wiki), an Active Record pattern for SQLAlchemy, but you have to define the classes, not the database tables.