I have 2 simple scripts:
from sqlalchemy import create_engine, ForeignKey, Table
from sqlalchemy import Column, Date, Integer, String, DateTime, BigInteger, event
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.engine import Engine
from sqlalchemy.orm import relationship, backref, sessionmaker, scoped_session, Session
class Test(declarative_base()):
__tablename__ = "Test"
def __init__(self, *args, **kwargs):
args = args[0]
for key in args:
setattr(self, key, args[key] )
key = Column(String, primary_key=True)
data = []
for a in range(0,10000):
data.append({ "key" : "key%s" % a})
engine = create_engine("sqlite:///testn", echo=False)
with engine.connect() as connection:
Test.metadata.create_all(engine)
session = Session(engine)
list(map(lambda x: session.merge(Test(x)), data))
session.commit()
result:
real 0m15.300s
user 0m14.920s
sys 0m0.351s
second script:
from peewee import *
class Test(Model):
key = TextField(primary_key=True,null=False)
dbname = "test"
db = SqliteDatabase(dbname)
Test._meta.database = db
data = []
for a in range(0,10000):
data.append({ "key" : "key%s" % a })
if not Test.table_exists():
db.create_tables([Test])
with db.atomic() as tr:
Test.insert_many(data).upsert().execute()
result:
real 0m3.253s
user 0m2.620s
sys 0m0.571s
Why?
This comparison is not entirely valid, as issuing an upsert style query is very different from what SQLAlchemy's Session.merge does:
Session.merge() examines the primary key attributes of the source instance, and attempts to reconcile it with an instance of the same primary key in the session. If not found locally, it attempts to load the object from the database based on primary key, and if none can be located, creates a new instance.
In this test case this will result in 10,000 load attempts against the database, which is expensive.
On the other hand when using peewee with sqlite the combination of insert_many(data) and upsert() can result in a single query:
INSERT OR REPLACE INTO Test (key) VALUES ('key0'), ('key1'), ...
There's no session state to reconcile, since peewee is a very different kind of ORM from SQLAlchemy and on a quick glance looks closer to Core and Tables
In SQLAlchemy instead of list(map(lambda x: session.merge(Test(x)), data)) you could revert to using Core:
session.execute(Test.__table__.insert(prefixes=['OR REPLACE']).values(data))
A major con about this is that you have to write a database vendor specific prefix to INSERT by hand. This will also subvert the Session, as it will have no information or knowledge about the newly added rows.
Bulk insertions using model objects are a little more involved with SQLAlchemy. Very simply put using an ORM is a trade-off between ease of use and speed:
ORMs are basically not intended for high-performance bulk inserts - this is the whole reason SQLAlchemy offers the Core in addition to the ORM as a first-class component.
Related
I would like to update (or add/delete) an item from a Database Table using SQLAlchemy but the table has 1000+ columns and it wouldn't be feasible to create the ORM Object using the declarative_base procedure. Ideally, I would like to be able to extract the structure of the Item from the database and send a new one (or update, or delete).
I tried using the sqlalchemy.Table object, but it only allow me to read data from the database and doesn't allow for editting.
For example :
from datetime import date
import sqlalchemy as sa
import sqlalchemy.orm as orm
engine = sa.create_engine('sqlite:///FullDatabase.db')
meta = sa.MetaData(engine)
history = sa.Table('history', meta, autoload=True)
session = orm.Session(engine)
entry = session.query(history).where(history.c.Date == date(2023, 1, 25)).first()
With this I can extract the information, but I can't edit it the same way I could if I had the History object defined (but as I mentioned, there are 1000+ columns to map)
For example : (one of the columns is HousePrice)
entry.HousePrice = 100
AttributeError: can't set attribute
Is there a way to do this update ?
A similar solution will be used to add/delete new items as well, I'd assume.
Appreciate in advance
You can definitely reflect classes declaratively (see the docs or this answer).
And then you can update in the way you've described above (see my example below).
However I'd imagine that in your case the error is just that you're trying to access an attribute directly while you should be reaching through the c columns.
entry.c.HousePrice = 100 # should work
Demo of reflecting a class.
Let's set up a simple table.
import sqlite3
con = sqlite3.connect("/tmp/so.db")
cur = con.cursor()
cur.execute("CREATE TABLE user (username VARCHAR PRIMARY KEY)")
cur.executemany(
"INSERT INTO user (username) VALUES (?)",
[("alice",), ("bob",), ("charlie",)],
)
con.commit()
con.close()
And then, reflect and update alice to ALICE.
from sqlalchemy import Table, create_engine, select
from sqlalchemy.orm import Session, declarative_base
engine = create_engine("sqlite:////tmp/so.db", future=True, echo=True)
Base = declarative_base()
class User(Base):
__table__ = Table("user", Base.metadata, autoload_with=engine)
with Session(engine) as session:
alice = session.scalars(select(User).filter(User.username == "alice")).first()
alice.username = "ALICE"
session.add(alice)
session.commit()
with Session(engine) as session:
results = session.scalars(select(User)).all()
print([r.username for r in results]) # ['ALICE', 'bob', 'charlie']
I am using SQLAlchemy as ORM for a python project. I have created few models/schema and it is working fine. Now I need to query a existing MySQL database, no insert/update just the select statement.
How can I create a wrapper around the tables of this existing database? I have briefly gone through the sqlalchemy docs and SO but couldn't find anything relevant. All suggest execute method, where I need to write the raw sql queries, while I want to use the SQLAlchemy query method in same way as I am using with the SA models.
For example if the existing db has table name User then I want to query it using the dbsession ( only the select operation, probably with join)
You seem to have an impression that SQLAlchemy can only work with a database structure created by SQLAlchemy (probably using MetaData.create_all()) - this is not correct. SQLAlchemy can work perfectly with a pre-existing database, you just need to define your models to match database tables. One way to do that is to use reflection, as Ilja Everilä suggests:
from sqlalchemy import Table
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class MyClass(Base):
__table__ = Table('mytable', Base.metadata,
autoload=True, autoload_with=some_engine)
(which, in my opinion, would be totally fine for one-off scripts but may lead to incredibly frustrating bugs in a "real" application if there's a potential that the database structure may change over time)
Another way is to simply define your models as usual taking care to define your models to match the database tables, which is not that difficult. The benefit of this approach is that you can map only a subset of database tables to you models and even only a subset of table columns to your model's fields. Suppose you have 10 tables in the database but only interested in users table from where you only need id, name and email fields:
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class User(Base):
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String)
email = sa.Column(sa.String)
(note how we didn't need to define some details which are only needed to emit correct DDL, such as the length of the String fields or the fact that the email field has an index)
SQLAlchemy will not emit INSERT/UPDATE queries unless you create or modify models in your code. If you want to ensure that your queries are read-only you may create a special user in the database and grant that user SELECT privileges only. Alternatively/in addition, you may also experiment with rolling back the transaction in your application code.
You can access an existing table using the automap extension:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
Base = automap_base()
Base.prepare(engine, reflect=True)
Users = Base.classes.users
session = Session(engine)
res = session.query(Users).first()
Create a table with autoload enabled that will inspect it. Some example code:
from sqlalchemy.sql import select
from sqlalchemy import create_engine, MetaData, Table
CONN_STR = '…'
engine = create_engine(CONN_STR, echo=True)
metadata = MetaData()
cookies = Table('cookies', metadata, autoload=True,
autoload_with=engine)
cols = cookies.c
with engine.connect() as conn:
query = (
select([cols.created_at, cols.name])
.order_by(cols.created_at)
.limit(1)
)
for row in conn.execute(query):
print(row)
Other answers don't mention what to do if you have a table with no primary key, so I thought I would address this. Assuming a table called Customers that has columns for CustomerId, CustomerName, CustomerLocation you could do;
from sqlalchemy.ext.automap import automap_base
from sqlalchemy import create_engine, MetaData, Column, String, Table
from sqlalchemy.orm import Session
Base = automap_base()
conn_str = '...'
engine = create_engine(conn_str)
metadata = MetaData()
# you only need to define which column is the primary key. It can automap the rest of the columns.
customers = Table('Customers',metadata, Column('CustomerId', String, primary_key=true), autoload=True, autoload_with=engine)
Base.prepare()
Customers= Base.classes.Customers
session = Session(engine)
customer1 = session.query(Customers).first()
print(customer1.CustomerName)
Assume we have a Postgresql database named accounts. And we already have a table named users.
import sqlalchemy as sa
psw = "verysecret"
db = "accounts"
# create an engine
pengine = sa.create_engine('postgresql+psycopg2://postgres:' + psw +'#localhost/' + db)
from sqlalchemy.ext.declarative import declarative_base
# define declarative base
Base = declarative_base()
# reflect current database engine to metadata
metadata = sa.MetaData(pengine)
metadata.reflect()
# build your User class on existing `users` table
class User(Base):
__table__ = sa.Table("users", metadata)
# call the session maker factory
Session = sa.orm.sessionmaker(pengine)
session = Session()
# filter a record
session.query(User).filter(User.id==1).first()
Warning: Your table should have a Primary Key defined. Otherwise, Sqlalchemy won't like it.
I'm using SQL Alchemy and have some schema's that are account specific. The name of the schema is derived using the account ID, so I don't have the name of the schema until I hit my application service or repository layer. I'm wondering if it's possible to run a query against an entity that has it's schema dynamically set at runtime?
I know I need to set the __table_args__['schema'] and have tried doing that using the type() built-in, but I always get the following error:
could not assemble any primary key columns for mapped table
I'm ready to give up and just write straight sql, but I really hate to do that. Any idea how this can be done? I'm using SA 0.99 and I do have a PK mapped.
Thanks
from sqlalchemy 1.1,
this can be done easily using using schema_translation_map.
https://docs.sqlalchemy.org/en/11/changelog/migration_11.html#multi-tenancy-schema-translation-for-table-objects
One option would be to reflect the particular account-dependent tables. Here is the SqlAlchemy Documentation on the matter.
Alternatively, You can create the table with a static schema attribute and update it as needed at runtime and run the queries you need to. I can't think of a non-messy way to do this. So here's the messy option
Use a loop to update the schema property in each table definition whenever the account is switched.
add all the tables that are account-specific to a list.
if the tables are expressed in the declarative syntax, then you have to modify the DeclarativeName.__table__.schema attribute. I'm not sure if you need to also modify DeclarativeName.__table_args__['schema'], but I guess it won't hurt.
If the tables are expressed in the old style Table syntax, then you have to modify the Table.schema attribute.
If you're using text for any relationships or foreign keys, then that will break, and you have to inspect each table for such hard coded usage and change them
example
user_id = Column(ForeignKey('my_schema.user.id')) needs to be written as user_id = Column(ForeignKey(User.id)). Then you can change the schema of User to my_new_schema. Otherwise, at query time sqlalchemy will be confused because the foreign key will point to my_schema.user.id while the query would point to my_new_schema.user.
I'm not sure if more complicated relationships can be expressed without the use of plain text, so I guess that's the limit to my proposed solution.
Here's an example I wrote up in the terminal:
>>> from sqlalchemy import Column, Table, Integer, String, select, ForeignKey
>>> from sqlalchemy.orm import relationship, backref
>>> from sqlalchemy.ext.declarative import declarative_base
>>> B = declarative_base()
>>>
>>> class User(B):
... __tablename__ = 'user'
... __table_args__ = {'schema': 'first_schema'}
... id = Column(Integer, primary_key=True)
... name = Column(String)
... email = Column(String)
...
>>> class Posts(B):
... __tablename__ = 'posts'
... __table_args__ = {'schema':'first_schema'}
... id = Column(Integer, primary_key=True)
... user_id = Column(ForeignKey(User.id))
... text = Column(String)
...
>>> str(select([User.id, Posts.text]).select_from(User.__table__.join(Posts)))
'SELECT first_schema."user".id, first_schema.posts.text \nFROM first_schema."user" JOIN first_schema.posts ON first_schema."user".id = first_schema.posts.user_id'
>>> account_specific = [User, Posts]
>>> for Tbl in account_specific:
... Tbl.__table__.schema = 'second_schema'
...
>>> str(select([User.id, Posts.text]).select_from(User.__table__.join(Posts)))
'SELECT second_schema."user".id, second_schema.posts.text \nFROM second_schema."user" JOIN second_schema.posts ON second_schema."user".id = second_schema.posts.user_id'
As you see the same query refers to the second_schema after I update the table's schema attribute.
edit: Although you can do what I did here, using the schema translation map as shown in the the answer below is the proper way to do it.
They are set statically. Foreign keys needs the same treatment, and I have an additional issue, in that I have multiple schemas that contain multiple tables so I did this:
from sqlalchemy.ext.declarative import declarative_base
staging_dbase = declarative_base()
model_dbase = declarative_base()
def adjust_schemas(staging, model):
for vv in staging_dbase.metadata.tables.values():
vv.schema = staging
for vv in model_dbase.metadata.tables.values():
vv.schema = model
def all_tables():
return staging_dbase.metadata.tables.union(model_dbase.metadata.tables)
Then in my startup code:
adjust_schemas(staging=staging_name, model=model_name)
You can mod this for a single declarative base.
I'm working on a project in which I have to create postgres schemas and tables dynamically and then insert data in proper schema. Here is something I have done maybe it will help someone:
import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from app.models.user import User
engine_uri = "postgres://someusername:somepassword#localhost:5432/users"
engine = create_engine(engine_uri, pool_pre_ping=True)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
def create_schema(schema_name: str):
"""
Creates a new postgres schema
- **schema_name**: name of the new schema to create
"""
if not engine.dialect.has_schema(engine, schema_name):
engine.execute(sqlalchemy.schema.CreateSchema(schema_name))
def create_tables(schema_name: str):
"""
Create new tables for postgres schema
- **schema_name**: schema in which tables are to be created
"""
if (
engine.dialect.has_schema(engine, schema_name) and
not engine.dialect.has_table(engine, str(User.__table__.name))
):
User.__table__.schema = schema_name
User.__table__.create(engine)
def add_data(schema_name: str):
"""
Add data to a particular postgres schema
- **schema_name**: schema in which data is to be added
"""
if engine.dialect.has_table(engine, str(User.__table__.name)):
db = SessionLocal()
db.connection(execution_options={
"schema_translate_map": {None: schema_name}},
)
user = User()
user.name = "Moin"
user.salary = 10000
db.add(user)
db.commit()
I am attempting to use SqlAlchemy (0.5.8) to interface with a legacy database declaratively and using reflection. My test code looks like this:
from sqlalchemy import *
from sqlalchemy.orm import create_session
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
engine = create_engine('oracle://schemaname:pwd#SID')
meta = MetaData(bind=engine)
class CONSTRUCT(Base):
__table__ = Table('CONSTRUCT', meta, autoload=True)
class EXPRESSION(Base):
__table__ = Table('EXPRESSION', meta, autoload=True)
session = create_session(bind=engine)
Now when I attempt to run a query using the join between these two tables (defined by a foreign key constraint in the underlying oracle schema):
print session.query(EXPRESSION).join(PURIFICATION)
... no joy:
sqlalchemy.exc.ArgumentError: Can't find any foreign key relationships between 'EXPRESSION' and 'PURIFICATION'
However:
>>> EXPRESSION.epiconstruct_pkey.property.columns
[Column(u'epiconstruct_pkey', OracleNumeric(precision=10, scale=2, asdecimal=True,
length=None), ForeignKey(u'construct.pkey'), table=<EXPRESSION>, nullable=False)]
>>> CONSTRUCT.pkey.property.columns
[Column(u'pkey', OracleNumeric(precision=38, scale=0, asdecimal=True, length=None),
table=<CONSTRUCT>, primary_key=True, nullable=False)]
Which clearly indicates that the reflection picked up the foreign key.
Where am I going wrong?
After debugging the script + SqlAlchemy code with Eclipse, I found that the list of tables/columns is kept internally in lower case. As such, there was never any possibility of a match between EXPRESSION.foreignkey and expression.foreignkey. Hence the error message.
Digging deep into the SqlAlchemy documentation ( http://www.sqlalchemy.org/docs/reference/dialects/oracle.html#identifier-casing ) I then found the following:
"In Oracle, the data dictionary represents all case insensitive identifier names using UPPERCASE text. SQLAlchemy on the other hand considers an all-lower case identifier name to be case insensitive. The Oracle dialect converts all case insensitive identifiers to and from those two formats during schema level communication, such as reflection of tables and indexes. Using an UPPERCASE name on the SQLAlchemy side indicates a case sensitive identifier, and SQLAlchemy will quote the name - this will cause mismatches against data dictionary data received from Oracle, so unless identifier names have been truly created as case sensitive (i.e. using quoted names), all lowercase names should be used on the SQLAlchemy side."
So my code works if it looks like this (differences are case-changes only):
from sqlalchemy import *
from sqlalchemy.orm import create_session
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
engine = create_engine('oracle://EPIGENETICS:sgc04lab#ELN')
meta = MetaData(bind=engine)
class construct(Base):
__table__ = Table('construct', meta, autoload=True)
class expression(Base):
__table__ = Table('expression', meta, autoload=True)
class purification(Base):
__table__ = Table('purification', meta, autoload=True)
session = create_session(bind=engine)
print session.query(expression).join(purification,expression)
... which spits out:
SELECT expression.pkey AS expression_pkey, expression.cellline AS expression_cellline, expression.epiconstruct_pkey AS expression_epiconstruct_pkey, expression.elnexp AS expression_elnexp, expression.expression_id AS expression_expression_id, expression.expressioncomments AS expression_expressioncomments, expression.cellmass AS expression_cellmass, expression.datestamp AS expression_datestamp, expression.person AS expression_person, expression.soluble AS expression_soluble, expression.semet AS expression_semet, expression.scale AS expression_scale, expression.purtest AS expression_purtest, expression.nmrlabelled AS expression_nmrlabelled, expression.yield AS expression_yield
FROM expression JOIN purification ON expression.pkey = purification.epiexpression_pkey JOIN expression ON expression.pkey = purification.epiexpression_pkey
Case closed.
I am giving Pylons a try with SQLAlchemy, and I love it, there is just one thing, is it possible to print out the raw SQL CREATE TABLE data generated from Table().create() before it's executed?
from sqlalchemy.schema import CreateTable
print(CreateTable(table))
If you are using declarative syntax:
print(CreateTable(Model.__table__))
Update:
Since I have the accepted answer and there is important information in klenwell answer, I'll also add it here.
You can get the SQL for your specific database (MySQL, Postgresql, etc.) by compiling with your engine.
print(CreateTable(Model.__table__).compile(engine))
Update 2:
#jackotonye Added in the comments a way to do it without an engine.
print(CreateTable(Model.__table__).compile(dialect=postgresql.dialect()))
You can set up you engine to dump the metadata creation sequence, using the following:
def metadata_dump(sql, *multiparams, **params):
# print or write to log or file etc
print(sql.compile(dialect=engine.dialect))
engine = create_engine(myDatabaseURL, strategy='mock', executor=metadata_dump)
metadata.create_all(engine)
One advantage of this approach is that enums and indexes are included in the printout. Using CreateTable leaves this out.
Another advantage is that the order of the schema definitions is correct and (almost) usable as a script.
I needed to get the raw table sql in order to setup tests for some existing models. Here's a successful unit test that I created for SQLAlchemy 0.7.4 based on Antoine's answer as proof of concept:
from sqlalchemy import create_engine
from sqlalchemy.schema import CreateTable
from model import Foo
sql_url = "sqlite:///:memory:"
db_engine = create_engine(sql_url)
table_sql = CreateTable(Foo.table).compile(db_engine)
self.assertTrue("CREATE TABLE foos" in str(table_sql))
Something like this? (from the SQLA FAQ)
http://docs.sqlalchemy.org/en/latest/faq/sqlexpressions.html
It turns out this is straight-forward:
from sqlalchemy.dialects import postgresql
from sqlalchemy.schema import CreateTable
from sqlalchemy import Table, Column, String, MetaData
metadata = MetaData()
users = Table('users', metadata,
Column('username', String)
)
statement = CreateTable(users)
print(statement.compile(dialect=postgresql.dialect()))
Outputs this:
CREATE TABLE users (
username VARCHAR
)
Going further, it can even support bound parameters in prepared statements.
Reference
How do I render SQL expressions as strings, possibly with bound parameters inlined?
...
or without an Engine:
from sqlalchemy.dialects import postgresql
print(statement.compile(dialect=postgresql.dialect()))
SOURCE: http://docs.sqlalchemy.org/en/latest/faq/sqlexpressions.html#faq-sql-expression-string
Example: Using SQL Alchemy to generate a user rename script
#!/usr/bin/env python
import csv
from sqlalchemy.dialects import postgresql
from sqlalchemy import bindparam, Table, Column, String, MetaData
metadata = MetaData()
users = Table('users', metadata,
Column('username', String)
)
renames = []
with open('users.csv') as csvfile:
for row in csv.DictReader(csvfile):
renames.append({
'from': row['sAMAccountName'],
'to': row['mail']
})
for rename in renames:
stmt = (users.update()
.where(users.c.username == rename['from'])
.values(username=rename['to']))
print(str(stmt.compile(dialect=postgresql.dialect(),
compile_kwargs={"literal_binds": True})) + ';')
When processing this users.csv:
sAMAccountName,mail
bmcboatface,boaty.mcboatface#example.com
ndhyani,naina.dhyani#contoso.com
Gives output like this:
UPDATE users SET username='boaty.mcboatface#example.com' WHERE users.username = 'bmcboatface';
UPDATE users SET username='naina.dhyani#contoso.com' WHERE users.username = 'ndhyani';users.username = 'ndhyani';
Why a research vessel has an email address is yet to be determined. I have been in touch with Example Inc's IT team and have had no response.
May be you mean echo parameter of sqlalchemy.create_engine?
/tmp$ cat test_s.py
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Department(Base):
__tablename__ = "departments"
department_id = sa.Column(sa.types.Integer, primary_key=True)
name = sa.Column(sa.types.Unicode(100), unique=True)
chief_id = sa.Column(sa.types.Integer)
parent_department_id = sa.Column(sa.types.Integer,
sa.ForeignKey("departments.department_id"))
parent_department = sa.orm.relation("Department")
engine = sa.create_engine("sqlite:///:memory:", echo=True)
Base.metadata.create_all(bind=engine)
/tmp$ python test_s.py
2011-03-24 15:09:58,311 INFO sqlalchemy.engine.base.Engine.0x...42cc PRAGMA table_info("departments")
2011-03-24 15:09:58,312 INFO sqlalchemy.engine.base.Engine.0x...42cc ()
2011-03-24 15:09:58,312 INFO sqlalchemy.engine.base.Engine.0x...42cc
CREATE TABLE departments (
department_id INTEGER NOT NULL,
name VARCHAR(100),
chief_id INTEGER,
parent_department_id INTEGER,
PRIMARY KEY (department_id),
UNIQUE (name),
FOREIGN KEY(parent_department_id) REFERENCES departments (department_id)
)
2011-03-24 15:09:58,312 INFO sqlalchemy.engine.base.Engine.0x...42cc ()
2011-03-24 15:09:58,312 INFO sqlalchemy.engine.base.Engine.0x...42cc COMMIT