I'm trying to dump a PostgreSQL database to SQLite3 with all its data.
The main idea was to create two engines, one for PostgreSQL and a second for SQLite3. Then I reflect the PostgreSQL engine's metadata and run create_all() on the SQLite engine, but then I receive the following error:
2019-07-18 11:41:47,660 INFO sqlalchemy.engine.base.Engine ()
2019-07-18 11:41:47,660 INFO sqlalchemy.engine.base.Engine ROLLBACK
Traceback (most recent call last):
... etc ...
sqlalchemy.exc.OperationalError: (pysqlite2.dbapi2.OperationalError) near "(": syntax error [SQL: u"\nCREATE TABLE table1 (\n\tcolumn_id INTEGER DEFAULT nextval('table1_id_seq'::regclass) NOT NULL, \n\t
(Background on this error at: http://sqlalche.me/e/e3q8)
Which is funny, because SQLAlchemy itself generated that CREATE TABLE. The issue is that when it goes to execute it in SQLite3, SQLite3 throws the error back to SQLAlchemy, since it doesn't understand what the following nextval and :: are:
column_id INTEGER DEFAULT nextval('table1_id_seq'::regclass) NOT NULL,
column_name VARCHAR(15) DEFAULT 'no-name'::character varying,
Personally I don't even need those, as the SQLite database will only be used as a snapshot DB, but how can I ignore or adjust that?
EDIT 1 - with a specific model
If inside the code I write something like this:
class Table1(Base):
    __table__ = Table('table1',
                      Base.metadata,
                      Column('column_id', Integer, primary_key=True),
                      Column('column_name', Text, default='no-name'),
                      autoload=True)
Here is a working example - but I'm trying to do the same without the Table1 class:
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, Text
from sqlalchemy.ext.declarative import declarative_base

def review_md_tables(metadata):
    if not metadata.sorted_tables:
        print "-> Tables not found"
        return
    for table in metadata.sorted_tables:
        print "->", table.name

print "PSQL database"
psql_url = "postgresql://..."
psql_engine = create_engine(psql_url, echo=False)
psql_base = declarative_base(bind=psql_engine)
review_md_tables(psql_base.metadata)

class Table1(psql_base):
    __table__ = Table('table1',
                      psql_base.metadata,
                      Column('column_id', Integer, primary_key=True),
                      Column('column_name', Text, default='no-name'),
                      autoload=True)

review_md_tables(psql_base.metadata)

sqlite_url = "sqlite:////tmp/db.sqlite"
sqlite_engine = create_engine(sqlite_url, echo=False)

# Duplicate PSQL tables -> SQLite
psql_base.metadata.create_all(sqlite_engine)
The issue is, I don't want to start writing model classes for every table in the DB ... any thoughts?
Unless someone has a better idea, this is the solution I found so far: remove the server_default from those columns one by one (an SQLAlchemy-side default can still be set with the default argument):
def remove_defaults_from_tables(metadata):
    for table in metadata.sorted_tables:
        print "--> Adjusting table: ", table.name
        # Fixing PSQL unsupported DEFAULT & serial columns
        # https://github.com/sqlalchemy/sqlalchemy/issues/525
        # https://github.com/sqlalchemy/sqlalchemy/issues/1565
        if table.name in ["table1", "table2"]:
            table.c.id.server_default = None
So this function should be used after filling the metadata with tables:
# This doesn't require pre-defined Models -
# BUT reflection will also load those special PSQL defaults which SQLAlchemy
# can't render later during `create_all`
Base.metadata.reflect(bind=my_psql_engine, only=["table1", "table2"])
remove_defaults_from_tables(Base.metadata)
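If you'd rather not list tables by hand, the same idea can be generalized by clearing server_default on every reflected column, and the data can be copied over afterwards. A minimal sketch, assuming the my_psql_engine and sqlite_engine names from above (not a definitive implementation):

from sqlalchemy import MetaData

metadata = MetaData()
metadata.reflect(bind=my_psql_engine)

# Strip server-side defaults (nextval(...), ::casts) from every column,
# since SQLite can't parse the PostgreSQL-specific DDL they produce
for table in metadata.sorted_tables:
    for column in table.columns:
        column.server_default = None

# Recreate the schema in SQLite ...
metadata.create_all(sqlite_engine)

# ... and copy the rows (simple approach; good enough for a snapshot DB)
with my_psql_engine.connect() as src, sqlite_engine.connect() as dst:
    for table in metadata.sorted_tables:
        rows = [dict(row) for row in src.execute(table.select())]
        if rows:
            dst.execute(table.insert(), rows)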
I have a pre-existing database and SQLAlchemy. Using reflection I would like to query that database, but there is a problem: the table is called 'logs' and it has two foreign keys, both referring to the table 'server'. The table 'server' has a column called 'class', and that name is reserved in Python.
Code:
from sqlalchemy import orm, create_engine
from sqlalchemy.ext.automap import automap_base
from django.conf import settings

base = automap_base()
connection_setup = (
    "{driver}://{user}:{password}@{host}:{port}/{dbname}".format(
        **settings.ALCHEMY_DB))
engine = create_engine(connection_setup, echo=False)
base.prepare(engine, reflect=True)

scoped_session = orm.scoped_session(orm.sessionmaker(bind=engine))
session = scoped_session()

logs = base.classes.logs
server = base.classes.server
local_server = orm.aliased(server, name='local_server')
remote_server = orm.aliased(server, name='remote_server')

query = (
    session
    .query(
        logs, local_server.class, remote_server.class)
    .outerjoin(
        local_server, logs.local_server_id == local_server.id
    )
    .outerjoin(
        remote_server, logs.remote_server_id == remote_server.id
    )
)
rows = query.all()
Exception:
File "ff.py", line 29
logs, local_server.class, remote_server.class)
^
SyntaxError: invalid syntax
How should I approach such a problem?
Probably the easiest solution:
getattr(local_server, 'class')
Alternatively, it should be possible to explicitly override the column:
https://docs.sqlalchemy.org/en/13/orm/extensions/automap.html#specifying-classes-explicitly
I have tried and tested both of them.
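For instance, applied to the query from the question (a minimal sketch reusing the names defined above):

local_class = getattr(local_server, 'class')
remote_class = getattr(remote_server, 'class')

query = (
    session
    .query(logs, local_class, remote_class)
    .outerjoin(local_server, logs.local_server_id == local_server.id)
    .outerjoin(remote_server, logs.remote_server_id == remote_server.id)
)
rows = query.all()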
I have been struggling with this for a while now and have not found an answer yet; or maybe I have already seen the answer and just didn't get it. However, I hope I am able to describe my problem.
I have an MS SQL database in which the tables are grouped in namespaces (or whatever it is called), denoted by Prefix.Tablename (with a dot). So a native SQL statement to request some content looks like this:
SELECT TOP 100
[Value], [ValueDate]
FROM [FinancialDataBase].[Reporting].[IndexedElements]
How do I map this in SQLAlchemy?
If the "Reporting" prefix were not there, the solution (or one way to do it) would look like this:
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base, declared_attr
from sqlalchemy.orm import sessionmaker

def get_session():
    from urllib.parse import quote_plus as urllib_quote_plus
    server = "FinancialDataBase.sql.local"
    connstr = "DRIVER={SQL Server};SERVER=%s;DATABASE=FinancialDataBase" % server
    params = urllib_quote_plus(connstr)
    base_url = "mssql+pyodbc:///?odbc_connect=%s" % params
    engine = create_engine(base_url, echo=True)
    Session = sessionmaker(bind=engine)
    session = Session()
    return engine, session

Base = declarative_base()

class IndexedElements(Base):
    __tablename__ = "IndexedElements"
    UniqueID = Column(String, primary_key=True)
    ValueDate = Column(DateTime)
    Value = Column(Float)
And then queries can be run and wrapped in a Pandas dataframe, for example like this:
import pandas as pd

engine, session = get_session()
query = session.query(IndexedElements.Value, IndexedElements.ValueDate)
data = pd.read_sql(query.statement, query.session.bind)
But the SQL statement that is compiled and actually executed includes this wrong FROM part:
FROM [FinancialDataBase].[IndexedElements]
Due to the namespace-prefix it would have to be
FROM [FinancialDataBase].[Reporting].[IndexedElements]
Simply expanding the table name to
__tablename__ = "Reporting.IndexedElements"
doesn't fix it, because it changes the compiled SQL statement to
FROM [FinancialDataBase].[Reporting.IndexedElements]
which doesn't work properly.
So how can this be solved?
The answer is given in the comment by Ilja above:
The "namespace" is a so-called schema and has to be declared in the mapped object. Given the example from the opening post, the mapped table has to be defined like this:
class IndexedElements(Base):
    __tablename__ = "IndexedElements"
    __table_args__ = {"schema": "Reporting"}
    UniqueID = Column(String, primary_key=True)
    ValueDate = Column(DateTime)
    Value = Column(Float)
Or define a base class containing this information for different schemata. Check also "Augmenting the Base" in the SQLAlchemy docs:
http://docs.sqlalchemy.org/en/latest/orm/extensions/declarative/mixins.html#augmenting-the-base
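A minimal sketch of that base-class approach, assuming every mapped table lives in the "Reporting" schema:

from sqlalchemy import Column, String, DateTime, Float
from sqlalchemy.ext.declarative import declarative_base

class ReportingBase(object):
    # Inherited by every class mapped against this Base
    __table_args__ = {"schema": "Reporting"}

Base = declarative_base(cls=ReportingBase)

class IndexedElements(Base):
    __tablename__ = "IndexedElements"  # compiles to [Reporting].[IndexedElements]
    UniqueID = Column(String, primary_key=True)
    ValueDate = Column(DateTime)
    Value = Column(Float)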
I am trying to store an (admittedly very large) BLOB in an SQLite database using SQLAlchemy.
For the MCVE I use ubuntu-14.04.2-desktop-amd64.iso as the BLOB I want to store. Its size:
$ ls -lh ubuntu-14.04.2-desktop-amd64.iso
... 996M ... ubuntu-14.04.2-desktop-amd64.iso
The code:
from pathlib import Path
from sqlalchemy import (Column, Integer, String, BLOB, create_engine)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlite3 import dbapi2 as sqlite

SA_BASE = declarative_base()

class DbPath(SA_BASE):
    __tablename__ = 'file'
    pk_path = Column(Integer, primary_key=True)
    path = Column(String)
    data = Column(BLOB, default=None)

def create_session(db_path):
    db_url = 'sqlite+pysqlite:///{}'.format(db_path)
    engine = create_engine(db_url, module=sqlite)
    SA_BASE.metadata.create_all(engine)
    session = sessionmaker(bind=engine)
    return session()

if __name__ == '__main__':
    pth = Path('/home/user/Downloads/iso/ubuntu-14.04.2-desktop-amd64.iso')
    with pth.open('rb') as file_pointer:
        iso_data = file_pointer.read()
    db_pth = DbPath(path=str(pth), data=iso_data)
    db_session = create_session('test.sqlite')
    db_session.add(db_pth)
    db_session.commit()
Running this raises the error
InterfaceError: (InterfaceError) Error binding parameter 1 - probably unsupported
type. 'INSERT INTO file (path, data) VALUES (?, ?)'
('/home/user/Downloads/iso/ubuntu-14.04.2-desktop-amd64.iso', <memory
at 0x7faf37cc18e0>)
I looked at the SQLite limits but found nothing that should prevent me from doing this. Does SQLAlchemy have a limitation?
Everything works fine for this smaller file:
$ ls -lh ubuntu-14.04.2-server-amd64.iso
... 595M ... ubuntu-14.04.2-server-amd64.iso
Is there a data size limit? What do I have to do differently when the file size surpasses a certain limit (and where would that limit be)?
And whatever the answer about the limit: what I'm interested in is how I can store files of this size in SQLite using SQLAlchemy.
I made a table using SQLAlchemy and forgot to add a column. I basically want to do this:
users.addColumn('user_id', ForeignKey('users.user_id'))
What's the syntax for this? I couldn't find it in the docs.
I have the same problem, and the thought of using a migration library for just this trivial thing makes me tremble. Anyway, this is my attempt so far:
def add_column(engine, table_name, column):
    column_name = column.compile(dialect=engine.dialect)
    column_type = column.type.compile(engine.dialect)
    engine.execute('ALTER TABLE %s ADD COLUMN %s %s' % (table_name, column_name, column_type))

column = Column('new_column_name', String(100), primary_key=True)
add_column(engine, table_name, column)
Still, I don't know how to express primary_key=True in the raw SQL request.
This is referred to as database migration (SQLAlchemy doesn't support migration out of the box). You can look at using sqlalchemy-migrate to help in these kinds of situations, or you can just ALTER TABLE through your chosen database's command line utility.
See this section of the SQLAlchemy documentation: http://docs.sqlalchemy.org/en/latest/core/metadata.html#altering-schemas-through-migrations
Alembic is the latest software to offer this type of functionality and is made by the same author as SQLAlchemy.
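For completeness, an Alembic migration for the column from the question might look roughly like this (a sketch; the table and column names are taken from the question):

# inside a revision file generated by `alembic revision -m "add user_id"`
import sqlalchemy as sa
from alembic import op

def upgrade():
    # add the forgotten self-referencing foreign key column
    op.add_column('users', sa.Column('user_id', sa.Integer, sa.ForeignKey('users.user_id')))

def downgrade():
    op.drop_column('users', 'user_id')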
I have a database called "ncaaf.db" built with sqlite3 and a table called "games". So I would cd into the same directory on my Linux command prompt and do
sqlite3 ncaaf.db
alter table games add column q4 float
and that is all it takes! Just make sure you update the definitions in your SQLAlchemy code.
from sqlalchemy import create_engine
engine = create_engine('sqlite:///db.sqlite3')
engine.execute('alter table table_name add column column_name String')
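Note that Engine.execute was removed in SQLAlchemy 2.0; on recent versions the equivalent would be roughly this (a sketch, using the same placeholder table and column names):

from sqlalchemy import create_engine, text

engine = create_engine('sqlite:///db.sqlite3')
# engine.begin() opens a transaction and commits it on success
with engine.begin() as connection:
    connection.execute(text('alter table table_name add column column_name String'))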
I had the same problem, so I ended up just writing my own function in raw SQL. If you are using SQLite3 this might be useful.
If you then add the column to your class definition at the same time, it seems to do the trick.
import sqlite3

def add_column(database_name, table_name, column_name, data_type):
    connection = sqlite3.connect(database_name)
    cursor = connection.cursor()
    if data_type == "Integer":
        data_type_formatted = "INTEGER"
    elif data_type == "String":
        data_type_formatted = "VARCHAR(100)"
    else:
        data_type_formatted = data_type  # fall back to the raw type name
    base_command = "ALTER TABLE '{table_name}' ADD COLUMN '{column_name}' '{data_type}'"
    sql_command = base_command.format(table_name=table_name, column_name=column_name,
                                      data_type=data_type_formatted)
    cursor.execute(sql_command)
    connection.commit()
    connection.close()
I've recently had this same issue, so I took a pointer from AlexP in an earlier answer. The problem was in getting the new column into my program's metadata. Using SQLAlchemy's append_column functionality had some unexpected downstream effects ('str' object has no attribute 'dialect impl'). I corrected this by adding the column with DDL (a MySQL database in this case) and then reflecting the table back from the DB into my metadata.
The code is roughly as follows (modified slightly from what I have in order to reduce it to its minimal essence; I apologize for any mistakes - if there are any, they should be minor)...
try:
    # Use back quotes as a protection against SQL injection attacks. Can we do more?
    common.qry_engine.execute('ALTER TABLE %s ADD COLUMN %s %s' %
                              ('`' + self.tbl.schema + '`.`' + self.tbl.name + '`',
                               '`' + self.outputs[new_col] + '`', 'VARCHAR(50)'))
except exc.SQLAlchemyError as msg:
    raise GRError(desc='Unable to physically add derived column to table. Contact support.',
                  data=str(self.outputs), other_info=str(msg))

try:  # Refresh the metadata to show the new column
    self.tbl = sqlalchemy.Table(self.tbl.name, self.tbl.metadata, extend_existing=True, autoload=True)
except exc.SQLAlchemyError as msg:
    raise GRError(desc='Unable to establish metadata for new column. Contact support.',
                  data=str(self.outputs), other_info=str(msg))
Yes, you can.
Install sqlalchemy-migrate (pip install sqlalchemy-migrate) and use it in your script to call the Table and Column create() methods:
from sqlalchemy import String, MetaData, create_engine
from migrate.versioning.schema import Table, Column

db_engine = create_engine(app.config.get('SQLALCHEMY_DATABASE_URI'))
db_meta = MetaData(bind=db_engine)
table = Table('table_name', db_meta)
col = Column('new_column_name', String(20), default='foo')
col.create(table)
Just continuing the simple way proposed by chasmani, with a little improvement:
'''
# simple migration
# columns to add:
# last_status_change = Column(BigInteger, default=None)
# last_complete_phase = Column(String, default=None)
# complete_percentage = Column(DECIMAL, default=0.0)
'''

import sqlite3

from config import APP_STATUS_DB
from sqlalchemy import types

def add_column(database_name: str, table_name: str, column_name: str, data_type: types, default=None):
    ret = False
    if default is not None:
        try:
            float(default)
            ddl = "ALTER TABLE '{table_name}' ADD COLUMN '{column_name}' '{data_type}' DEFAULT {default}"
        except (TypeError, ValueError):
            ddl = "ALTER TABLE '{table_name}' ADD COLUMN '{column_name}' '{data_type}' DEFAULT '{default}'"
    else:
        ddl = "ALTER TABLE '{table_name}' ADD COLUMN '{column_name}' '{data_type}'"
    sql_command = ddl.format(table_name=table_name, column_name=column_name, data_type=data_type.__name__,
                             default=default)
    try:
        connection = sqlite3.connect(database_name)
        cursor = connection.cursor()
        cursor.execute(sql_command)
        connection.commit()
        connection.close()
        ret = True
    except Exception as e:
        print(e)
        ret = False
    return ret

add_column(APP_STATUS_DB, 'procedures', 'last_status_change', types.BigInteger)
add_column(APP_STATUS_DB, 'procedures', 'last_complete_phase', types.String)
add_column(APP_STATUS_DB, 'procedures', 'complete_percentage', types.DECIMAL, 0.0)
If using Docker:
1. Go to the terminal of the container holding your DB.
2. Get into the DB: psql -U usr [YOUR_DB_NAME]
3. Now you can alter tables using raw SQL: alter table [TABLE_NAME] add column [COLUMN_NAME] [TYPE]
Note that you will need to have mounted your DB for the changes to persist between builds.
Adding the column "manually" (not using Python or SQLAlchemy) is perhaps the easiest?
Same problem over here. What I will do is iterate over the old database and add each entry to a new database with the extra column, then delete the old database and rename the new one to the old name.