Concurrent db table indexing through alembic script

Is it possible to create indexes concurrently on a DB table through an Alembic script?
I'm using PostgreSQL, and I can create table indexes concurrently with a SQL command at the Postgres prompt (create index concurrently on ();).
But I couldn't find a way to do the same through a DB migration (Alembic) script. Creating a normal (non-concurrent) index locks the table against writes while the index is built, so other sessions cannot modify the table in parallel. So I just want to know how to create a concurrent index through an Alembic (DB migration) script.

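Alembic supports creating PostgreSQL indexes concurrently:
from alembic import op
def upgrade():
    # CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
    # so end the transaction Alembic opened before creating the index.
    op.execute('COMMIT')
    op.create_index('ix_1', 't1', ['col1'], postgresql_concurrently=True)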

I'm not using Postgres and I am not able to test it, but it should be possible.
According to:
http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html
Concurrent indexes are supported by the PostgreSQL dialect from SQLAlchemy version 0.9.9.
However, a migration script like this should work with older versions (direct SQL creation):
from alembic import op, context
from sqlalchemy import Table, Column, Integer, String, MetaData, Index
from sqlalchemy.sql import text
# ---------- COMMONS
# Base objects for SQL operations are:
# - use op = INSERT, UPDATE, DELETE
# - use connection = SELECT (and also INSERT, UPDATE, DELETE, but this object has a lot of logic)
metadata = MetaData()
connection = context.get_bind()
tbl = Table('test', metadata, Column('data', Integer), Column("unique_key", String))
# If you want to define an index on the currently loaded schema:
# idx1 = Index('test_idx1', tbl.c.data, postgresql_concurrently=True)
def upgrade():
    ...
    queryc = """
    CREATE INDEX CONCURRENTLY test_idx1 ON test (data, unique_key);
    """
    # It should be possible to create an index here (direct SQL).
    # Note: PostgreSQL rejects CREATE INDEX CONCURRENTLY inside a transaction block.
    connection.execute(text(queryc))
    ...
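To see which statement the postgresql_concurrently flag produces without connecting to a database, the index DDL can be compiled against the PostgreSQL dialect. This is a minimal sketch that reuses the table and index names from the script above; it only renders SQL and does not execute anything.

```python
from sqlalchemy import Column, Index, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.schema import CreateIndex

metadata = MetaData()
tbl = Table(
    "test", metadata, Column("data", Integer), Column("unique_key", String)
)
idx1 = Index(
    "test_idx1", tbl.c.data, tbl.c.unique_key, postgresql_concurrently=True
)

# Render the CREATE INDEX statement as the PostgreSQL dialect would emit it.
ddl = str(CreateIndex(idx1).compile(dialect=postgresql.dialect()))
print(ddl)
```

The rendered statement should contain the CONCURRENTLY keyword, matching the hand-written SQL in the answer.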

Although concurrent index creation is allowed in PostgreSQL, Alembic itself does not support concurrent operations: only one migration process should be running at a time.

Related

Print SQL generated by SQLObject

from sqlobject import *
class Data(SQLObject):
    ts = TimeCol()
    val = FloatCol()
Data.select().count()
Fails with:
AttributeError: No connection has been defined for this thread or process
How do I get the SQL which would be generated, without declaring a connection?

It's impossible for two reasons. First, .count() not only generates a query, it also executes it, so it requires not only a connection but also a database and a populated table. Second, different queries can be generated for different backends (especially in how strings are quoted), so a connection is required to render a query object to a string.
To generate a query string with an accumulator function, you need to repeat the code that generates the query. So the full solution to your question is:
#! /usr/bin/env python
from sqlobject import *
__connection__ = "sqlite:/:memory:?debug=1"
class Data(SQLObject):
    ts = TimeCol()
    val = FloatCol()
print(Data.select().queryForSelect().newItems("COUNT(*)"))

SQLAlchemy Oracle - InvalidRequestError: could not retrieve isolation level

I am having problems accessing tables in an Oracle database over a SQLAlchemy connection. Specifically, I am using Kedro catalog.load('table_name') and getting the error message Table table_name not found. So I decided to test my connection using the method listed in this answer: How to verify SqlAlchemy engine object.
from sqlalchemy import create_engine
engine = create_engine('oracle+cx_oracle://USER:PASSWORD@HOST:PORT/?service_name=SERVICE_NAME')
engine.connect()
Error: InvalidRequestError: could not retrieve isolation level
I have tried explicitly adding an isolation level as explained in the documentation like this:
engine = create_engine('oracle+cx_oracle://USER:PASSWORD@HOST:PORT/?service_name=SERVICE_NAME', execution_options={'isolation_level': 'AUTOCOMMIT'})
and this:
engine.connect().execution_options(isolation_level='AUTOCOMMIT')
and this:
connection = engine.connect()
connection = connection.execution_options(
    isolation_level="AUTOCOMMIT"
)
but I get the same error in all cases.

Upgrading from SQLAlchemy 1.3.21 to 1.3.22 solved the problem.

How to transfer changes made in SQliteStudio to Flask models (SQlAlchemy) [duplicate]

I'm using Alembic as migration tool and I'm launching the following pseudo script on an already updated database (no revision entries for Alembic, the database schema is just up to date).
revision = '1067fd2d11c8'
down_revision = None
from alembic import op
import sqlalchemy as sa
def upgrade():
    op.add_column('box', sa.Column('has_data', sa.Boolean, server_default='0'))
def downgrade():
    pass
It gives me the following error only with PostgreSQL behind (it's all good with MySQL):
INFO [alembic.migration] Context impl PostgresqlImpl.
INFO [alembic.migration] Will assume transactional DDL.
INFO [root] (ProgrammingError) ERREUR: la colonne « has_data » de la relation « box » existe déjà
Last line means the column has_data already exists.
I want to check that the column exists before op.add_column.

We ran into the same issue: we had to accommodate an edge case where a column added in a revision might already exist in the schema. Silencing the error is not an option, as that will roll back the current transaction (unless using SQLite), and the version table will not be updated. Checking for column existence seems optimal here. Here's our solution (same idea as in the accepted answer, but updated for 2022):
from alembic import op
from sqlalchemy import inspect
def column_exists(table_name, column_name):
    bind = op.get_context().bind
    insp = inspect(bind)
    columns = insp.get_columns(table_name)
    return any(c["name"] == column_name for c in columns)
This is called from a revision file, so the context accessed via op.get_context() has been configured (presumably in your env.py), and the bind exists.
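The same check can be tried outside a migration. The sketch below is an illustration under stated assumptions: it takes the bind as an explicit parameter instead of reading it from op.get_context(), and it uses an in-memory SQLite database with a box table that stands in for the real schema.

```python
from sqlalchemy import create_engine, inspect, text


def column_exists(bind, table_name, column_name):
    """Return True if table_name has a column named exactly column_name."""
    columns = inspect(bind).get_columns(table_name)
    return any(c["name"] == column_name for c in columns)


# Hypothetical stand-in for the real database.
engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(
        text("CREATE TABLE box (id INTEGER PRIMARY KEY, has_data BOOLEAN)")
    )
    # Inspect through the same connection that created the table.
    has_data_exists = column_exists(conn, "box", "has_data")
    other_exists = column_exists(conn, "box", "no_such_column")
```

Inside a revision file, a guard such as `if not column_exists(...)` placed before op.add_column would then skip the column when it is already present.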

The easiest answer is not to try to do this. Instead, make your Alembic migrations represent the full layout of the database. Then any migrations you make will be based off the changes to the existing database.
To make a starting migration if you already have a database, temporarily point at an empty database and run alembic revision --autogenerate -m "base". Then, point back at the actual database and run alembic stamp head to say that the current state of the database is represented by the latest migration, without actually running it.
If you don't want to do that for some reason, you can choose not to use --autogenerate and instead generate empty revisions that you fill in with the operations you want. Alembic won't stop you from doing this, it's just much less convenient.

I am, unfortunately, in a situation where we have multiple versions with different schemas that all need to migrate to a single codebase. There are no migrations anywhere yet and no versions tagged in any db. So the first migration will have these conditional checks. After the first migration, everything will be in a known state and I can avoid such hacks.
So I added this in my migration (credit belongs to http://www.derstappen-it.de/tech-blog/sqlalchemie-alembic-check-if-table-has-column):
from alembic import op
import sqlalchemy as sa
from sqlalchemy import engine_from_config
from sqlalchemy.engine import reflection
def _table_has_column(table, column):
    config = op.get_context().config
    engine = engine_from_config(
        config.get_section(config.config_ini_section), prefix='sqlalchemy.')
    insp = reflection.Inspector.from_engine(engine)
    # Compare names exactly; a substring test would match 'mycol' against 'mycol2'.
    return any(col['name'] == column for col in insp.get_columns(table))
My upgrade function has the following checks (note that I have a batch flag set that adds the with op.batch_alter_table line, which probably isn't in most setups):
def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    with op.batch_alter_table('mytable', schema=None) as batch_op:
        if not _table_has_column('mytable', 'mycol'):
            batch_op.add_column(sa.Column('mycol', sa.Integer(), nullable=True))
        if not _table_has_column('mytable', 'mycol2'):
            batch_op.add_column(sa.Column('mycol2', sa.Integer(), nullable=True))

Dropping SQL Procedures/Functions w/SQL Alchemy and Python

I'm building the skeleton of a larger application which relies on SQL Server to do much of the heavy lifting and passes data back to Pandas for consumption by the user/insertion into flat files or Excel. Thus far my code is able to insert a stored procedure and function into a database and execute both without issue. However when I try to run the code a second time, in the same database, the drop commands don't seem to work. Here are the various files and flow of code through them.
First, proc_drop.sql, which is used to store the SQL drop commands.
/*
IF EXISTS(SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'proc_create_tractor_test'))
DROP PROCEDURE [dbo].[proc_create_tractor_test]
GO
IF EXISTS(SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'fn_mr_parse'))
DROP FUNCTION [dbo].[fn_mr_parse]
GO
*/
IF OBJECT_ID('[proc_create_tractor_test]') IS NOT NULL DROP PROCEDURE [proc_create_tractor_test]
IF OBJECT_ID('[fn_mr_parse]') IS NOT NULL DROP PROCEDURE [fn_mr_parse]
I realize there are two kinds of drop statements in the file. I have tested a number of different iterations, and none of the drop statements seem to work when executed by Python/SQLAlchemy, but all work on their own when executed in SQL Server Management Studio.
Next, the helper.py file in which I am storing helper functions. My drop SQL was originally fed into the "deploy_procedures" function as a file, to be executed in the body of the function. I've since isolated the drop SQL reading/executing into another function for testing purposes only.
def clean_databases(engines, procedures):
    for engine in engines:
        for proc in procedures:
            with open(proc, "r") as procfile:
                code = procfile.read()
                print(code)
                engine.execute(code)
def deploy_procedures(engines, procedures):
    for engine in engines:
        for proc in procedures:
            with open(proc, "r") as procfile:
                code = procfile.read()
                engine.execute(code)
Next, proc_create_tractor_test.sql, which is executed by the code and creates an associated stored procedure in the database. For brevity I've added the top portion of that code only:
CREATE PROCEDURE [dbo].[proc_create_tractor_test]
    @meta_risks varchar(256)
AS
BEGIN
Finally, the main file that pieces it all together is below. You'll notice that I create SQLAlchemy engines, which are passed to the helper functions after being initialized with connection information. The engines are passed as a list, as are the SQL procedure files I referenced, and each helper function simply iterates through every engine and procedure, executing one at a time. I am also using pymssql as the driver to connect, which is working fine.
It is the "deploy_procedures" function that crashes when trying to create the function or stored procedure the second time the code is run. As far as I can tell, this is because the drop SQL at the top of my explanation is never run.
Can anyone shed some light on what the issue is or whether I am missing something totally obvious?
run_tractor.py:
import pandas
import pandas.io.sql as pdsql
import sqlalchemy as sqla
import xlwings
import helper as hlp
# Last import contains user-defined functions
# ---------- SERVER VARIABLES ---------- #
server = r'DEV\MSSQLSERVER2K12'
database = 'injection_test'
username = 'username'
password = 'pwd'
# ---------- CONFIGURING [Only change base path if relevant, not file names] ---------- #
base_path = r'C:\Code\code\Tractor Analysis Basic Code'
procedure_drop = base_path + r'\proc_drop.sql'
procedure_create_curves = base_path + r'\proc_create_tractor_test.sql'
procedure_create_mr_function = base_path + r'\create_mr_parse_function.sql'
procedures = [procedure_create_curves, procedure_create_mr_function]
del_procedures = [procedure_drop]
engine_analysis = sqla.create_engine('mssql+pymssql://{2}:{3}@{0}/{1}'.format(server, database, username, password))
engine_analysis.connect()
engines = [engine_analysis]
hlp.clean_databases(engines, del_procedures)
hlp.deploy_procedures(engines, procedures)
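No answer is recorded here, so the following is only one thing worth ruling out, not a confirmed diagnosis: the drop statements may run in a transaction that is never committed. The sketch below shows the general pattern of executing statements inside an explicit transaction that commits on success. The helper name run_sql_in_transaction is hypothetical, and an in-memory SQLite database with a demo table stands in for SQL Server so the snippet can run anywhere.

```python
from sqlalchemy import create_engine, inspect, text
from sqlalchemy.pool import StaticPool


def run_sql_in_transaction(engine, statements):
    """Execute each statement in one transaction that commits on success."""
    with engine.begin() as conn:
        for statement in statements:
            conn.execute(text(statement))


# StaticPool keeps a single connection so the in-memory database persists.
engine = create_engine("sqlite://", poolclass=StaticPool)

run_sql_in_transaction(engine, ["CREATE TABLE demo (id INTEGER)"])
tables_after_create = inspect(engine).get_table_names()

run_sql_in_transaction(engine, ["DROP TABLE IF EXISTS demo"])
tables_after_drop = inspect(engine).get_table_names()
```

If the drops take effect when run this way against the real server, the missing commit was the likely cause; if not, the problem lies elsewhere.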

Sqlite / SQLAlchemy: how to enforce Foreign Keys?

The new version of SQLite has the ability to enforce Foreign Key constraints, but for the sake of backwards-compatibility, you have to turn it on for each database connection separately!
sqlite> PRAGMA foreign_keys = ON;
I am using SQLAlchemy -- how can I make sure this always gets turned on?
What I have tried is this:
engine = sqlalchemy.create_engine('sqlite:///:memory:', echo=True)
engine.execute('pragma foreign_keys=on')
...but it is not working!...What am I missing?
EDIT:
I think my real problem is that I have more than one version of SQLite installed, and Python is not using the latest one!
>>> import sqlite3
>>> print(sqlite3.sqlite_version)
3.3.4
But I just downloaded 3.6.23 and put the exe in my project directory!
How can I figure out which .exe it's using, and change it?

For recent versions (SQLAlchemy ~0.7) the SQLAlchemy homepage says:
PoolListener is deprecated. Please refer to PoolEvents.
Then the example by CarlS becomes:
engine = create_engine(database_url)
def _fk_pragma_on_connect(dbapi_con, con_record):
    dbapi_con.execute('pragma foreign_keys=ON')
from sqlalchemy import event
event.listen(engine, 'connect', _fk_pragma_on_connect)
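To confirm that the listener actually takes effect, here is a small self-contained sketch. It assumes an in-memory SQLite database and two illustrative tables (wall and coverage, named after the example further down the thread), and it checks that an insert with a dangling foreign key is rejected.

```python
from sqlalchemy import create_engine, event, text
from sqlalchemy.exc import IntegrityError

engine = create_engine("sqlite:///:memory:")


@event.listens_for(engine, "connect")
def _fk_pragma_on_connect(dbapi_con, con_record):
    # Runs once for every new DBAPI connection the engine opens.
    cursor = dbapi_con.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()


def insert_orphan_is_rejected():
    """Return True if SQLite refuses a row that points at a missing parent."""
    with engine.connect() as conn:
        conn.execute(text("CREATE TABLE wall (id INTEGER PRIMARY KEY)"))
        conn.execute(text(
            "CREATE TABLE coverage ("
            "id INTEGER PRIMARY KEY, "
            "wall_id INTEGER NOT NULL REFERENCES wall(id))"
        ))
        try:
            conn.execute(
                text("INSERT INTO coverage (id, wall_id) VALUES (1, 99)")
            )
        except IntegrityError:
            return True
        return False


rejected = insert_orphan_is_rejected()
```

Without the listener, the same insert would be expected to succeed, because SQLite leaves foreign key enforcement off by default.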

Building on the answers from conny and shadowmatter, here's code that will check if you are using SQLite3 before emitting the PRAGMA statement:
from sqlalchemy import event
from sqlalchemy.engine import Engine
from sqlite3 import Connection as SQLite3Connection
@event.listens_for(Engine, "connect")
def _set_sqlite_pragma(dbapi_connection, connection_record):
    if isinstance(dbapi_connection, SQLite3Connection):
        cursor = dbapi_connection.cursor()
        cursor.execute("PRAGMA foreign_keys=ON;")
        cursor.close()

I now have this working:
Download the latest sqlite and pysqlite2 builds as described above: make sure correct versions are being used at runtime by python.
import sqlite3
import pysqlite2
print(sqlite3.sqlite_version)  # should be 3.6.23.1
print(pysqlite2.__path__)  # e.g. C:\\Python26\\lib\\site-packages\\pysqlite2
Next add a PoolListener:
from sqlalchemy.interfaces import PoolListener
class ForeignKeysListener(PoolListener):
    def connect(self, dbapi_con, con_record):
        db_cursor = dbapi_con.execute('pragma foreign_keys=ON')
engine = create_engine(database_url, listeners=[ForeignKeysListener()])
Then be careful how you test if foreign keys are working: I had some confusion here. When using sqlalchemy ORM to add() things my import code was implicitly handling the relation hookups so could never fail. Adding nullable=False to some ForeignKey() statements helped me here.
The way I test sqlalchemy sqlite foreign key support is enabled is to do a manual insert from a declarative ORM class:
# example
ins = Coverage.__table__.insert().values(id = 99,
description = 'Wrong',
area = 42.0,
wall_id = 99, # invalid fkey id
type_id = 99) # invalid fkey_id
session.execute(ins)
Here wall_id and type_id are both ForeignKey()'s and sqlite throws an exception correctly now if trying to hookup invalid fkeys. So it works! If you remove the listener then sqlalchemy will happily add invalid entries.
I believe the main problem may be multiple sqlite3.dll's (or .so) lying around.

As a simpler approach if your session creation is centralised behind a Python helper function (rather than exposing the SQLA engine directly), you can just issue session.execute('pragma foreign_keys=on') before returning the freshly created session.
You only need the pool listener approach if arbitrary parts of your application may create SQLA sessions against the database.

From the SQLite dialect page:
SQLite supports FOREIGN KEY syntax when emitting CREATE statements for tables, however by default these constraints have no effect on the operation of the table.
Constraint checking on SQLite has three prerequisites:
At least version 3.6.19 of SQLite must be in use
The SQLite library must be compiled without the SQLITE_OMIT_FOREIGN_KEY or SQLITE_OMIT_TRIGGER symbols enabled.
The PRAGMA foreign_keys = ON statement must be emitted on all connections before use.
SQLAlchemy allows for the PRAGMA statement to be emitted automatically for new connections through the usage of events:
from sqlalchemy.engine import Engine
from sqlalchemy import event
@event.listens_for(Engine, "connect")
def set_sqlite_pragma(dbapi_connection, connection_record):
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()
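The version and per-connection prerequisites can be checked with the standard library alone. This is a minimal sketch using Python's built-in sqlite3 module; the helper name foreign_keys_status is hypothetical.

```python
import sqlite3


def foreign_keys_status(connection):
    """Return the value of PRAGMA foreign_keys for this connection."""
    return connection.execute("PRAGMA foreign_keys").fetchone()[0]


# Prerequisite 1: the linked SQLite library must be at least 3.6.19.
version_ok = sqlite3.sqlite_version_info >= (3, 6, 19)

# Prerequisite 3: the pragma is a per-connection setting, so turning it on
# for one connection does not change any other connection.
first = sqlite3.connect(":memory:")
second = sqlite3.connect(":memory:")
first.execute("PRAGMA foreign_keys = ON")

status_first = foreign_keys_status(first)
status_second = foreign_keys_status(second)
print(version_ok, status_first, status_second)
```

On a typical build the second connection still reports the library default, which is why the pragma has to be emitted from a connect event rather than once at startup.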

One-liner version of conny's answer:
from sqlalchemy import event
event.listen(engine, 'connect', lambda c, _: c.execute('pragma foreign_keys=on'))

I had the same problem before (scripts with foreign key constraints ran without errors, but the actual constraints were not enforced by the SQLite engine). I solved it by:
Downloading, building and installing the latest version of SQLite from here: sqlite-sqlite-amalgamation. Before this I had SQLite 3.6.16 on my Ubuntu machine, which didn't support foreign keys yet; it should be 3.6.19 or higher to have them working.
Installing the latest version of pysqlite from here: pysqlite-2.6.0
After that I started getting exceptions whenever a foreign key constraint failed.
Hope this helps. Regards.

If you need to execute something for setup on every connection, use a PoolListener.

Enforce Foreign Key constraints for sqlite when using Flask + SQLAlchemy.
import logging
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
# Assumed module-level objects; the original snippet uses them without defining them.
logger = logging.getLogger(__name__)
db = SQLAlchemy()
def create_app(config: str = None):
    app = Flask(__name__, instance_relative_config=True)
    if config is None:
        app.config.from_pyfile('dev.py')
    else:
        logger.debug('Using %s as configuration', config)
        app.config.from_pyfile(config)
    db.init_app(app)
    # Ensure FOREIGN KEY for sqlite3
    if 'sqlite' in app.config['SQLALCHEMY_DATABASE_URI']:
        def _fk_pragma_on_connect(dbapi_con, con_record):  # noqa
            dbapi_con.execute('pragma foreign_keys=ON')
        with app.app_context():
            from sqlalchemy import event
            event.listen(db.engine, 'connect', _fk_pragma_on_connect)
    return app
Source:
https://gist.github.com/asyd/a7aadcf07a66035ac15d284aef10d458
