No exception raised in SqlAlchemy if error in Vertica database - python

I have a table defined in Vertica in which one of the columns has an enforced UNIQUE constraint. When I insert a new row whose value is already present in that column, error 6745 is raised if the query is executed in the database shell. I am trying to get the same behaviour through SQLAlchemy.
I have an SQLAlchemy engine defined and connect to the DB with it. I then call execute() on that connection to run a raw SQL query, wrapped in a try-except block to catch any exceptions. When I insert a new row through SQLAlchemy, no exception is raised, although the constraint is enforced on the database side (no duplicate entries are written). Because the error raised in the database is not captured by SQLAlchemy, I cannot tell whether the operation succeeded or conflicted with existing data.
How can I configure SQLAlchemy to raise an exception when an error is raised in the database?
I am using the vertica_python dialect.
Temporary solution:
For now, I compare the number of entries in the table before and after the operation to determine whether it succeeded. This is a dirty hack and not efficient.

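One suggestion is to pass a raise_on_unique_violation flag set to True through connect_args when you create the engine, so that a unique constraint violation raises an exception rather than being silently ignored. Note that connect_args is forwarded unchanged to the DBAPI driver, so the flag only has an effect if vertica_python actually supports it; check the driver documentation before relying on it.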
For example:
from sqlalchemy import create_engine
# The "vertica+vertica_python" URL scheme selects the dialect, so no dialect class is passed explicitly.
# connect_args is forwarded unchanged to the DBAPI driver's connect() call.
engine = create_engine(
    "vertica+vertica_python://username:password@hostname:port/dbname",
    connect_args={'raise_on_unique_violation': True},
    echo=True,
)
connection = engine.connect()
When you use connection.execute() to insert a new row and the driver reports a unique constraint violation, SQLAlchemy wraps the driver error in one of its own exception classes (subclasses of sqlalchemy.exc.DBAPIError, such as IntegrityError), which you can catch and handle in your code.
With the ORM, the same exception surfaces from session.flush() or session.commit():
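from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)
session = Session()
try:
    session.add(new_row)
    session.flush()
    session.commit()
except IntegrityError:
    # Undo the failed transaction, then re-raise with the original traceback.
    session.rollback()
    raise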
You can then check whether the error code is 6745; if it is, the failure was a unique constraint violation.
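A minimal sketch of such a check follows. It assumes the driver's error text includes the numeric code, which should be verified against the messages vertica_python actually produces. The helper name is_unique_violation is hypothetical; the only SQLAlchemy feature it relies on is the orig attribute that wrapped DBAPI errors carry.

```python
import re


def is_unique_violation(exc, code="6745"):
    """Return True if the underlying driver error text mentions the given code.

    Assumption: the driver includes the numeric error code in its message.
    """
    # SQLAlchemy's wrapped errors keep the driver exception in `.orig`;
    # fall back to the exception itself when that attribute is missing.
    original = getattr(exc, "orig", None) or exc
    return re.search(r"\b{}\b".format(re.escape(code)), str(original)) is not None


# Stand-ins for a driver error and SQLAlchemy's wrapper, for illustration only.
class FakeDriverError(Exception):
    pass


class FakeWrappedError(Exception):
    def __init__(self, orig):
        super().__init__(str(orig))
        self.orig = orig


duplicate = FakeWrappedError(FakeDriverError("Duplicate key values, Error Code: 6745"))
other = FakeWrappedError(FakeDriverError("Syntax error, Error Code: 4856"))
print(is_unique_violation(duplicate), is_unique_violation(other))
```

Matching on message text is fragile, so if the driver exposes the code as a structured attribute, that is the better thing to test.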

Related

Sqlalchemy add_all() ignore duplicate key IntegrityError

I'm adding a list of objects, entries, to a database. Sometimes one of these objects is already in the database (I do not have any control over that).
A single IntegrityError makes the whole transaction fail, i.e. none of the objects in entries are inserted into the database.
try:
    session.add_all(entries)
    session.commit()
except:
    logger.error(f"Error! Rolling back")
    session.rollback()
    raise
finally:
    session.close()
My desired behavior would be: if there is an IntegrityError for one of the entries, catch it and do not add that object to the database; otherwise continue normally (do not fail).
Edit: I'm using MySQL as backend.
It depends on what backend you're using.
PostgreSQL has a wonderful INSERT ... ON CONFLICT DO NOTHING clause which you can use with SQLAlchemy:
from sqlalchemy.dialects.postgresql import insert
session.execute(insert(MyTable)
                .values(my_entries)
                .on_conflict_do_nothing())
MySQL has the similar INSERT IGNORE clause, but SQLAlchemy has less support for it. Luckily, according to this answer, there is a workaround, using prefix_with:
session.execute(MyTable.__table__
                .insert()
                .prefix_with('IGNORE')
                .values(my_entries))
The only thing is that my_entries needs to be a list of column to value mappings. That means [{ 'id': 1, 'name': 'Ringo' }, { 'id': 2, 'name': 'Paul' }, ...] et cetera.
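To see what prefix_with actually emits, the statement can be compiled against the MySQL dialect without connecting to a server. This is only a sketch: the table my_table and its columns are made up for illustration.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, insert
from sqlalchemy.dialects import mysql

# Hypothetical table used only to render the SQL.
metadata = MetaData()
my_table = Table(
    "my_table",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
)

# prefix_with() places the keyword right after INSERT.
stmt = insert(my_table).prefix_with("IGNORE")

# Compiling against the MySQL dialect needs no database connection.
compiled_sql = str(stmt.compile(dialect=mysql.dialect()))
print(compiled_sql)
```

With a real session, the list of dictionaries can be passed as the second argument, as in session.execute(stmt, my_entries).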
A solution I have found is to query the database before adding each entry:
try:
    instance = session.query(InstancesTable).filter_by(id=entry.id).first()
    if instance:
        return
    session.add(entry)
    session.commit()
except:
    logger.error(f"Error! Rolling back")
    session.rollback()
    raise
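The same idea can be applied to a whole batch with a single lookup instead of one query per entry. The sketch below uses an in-memory SQLite database and a made-up Instance model, and assumes SQLAlchemy 1.4 or newer. It does not protect against another process inserting the same key between the lookup and the commit, nor against duplicates inside the batch itself.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Instance(Base):
    # Hypothetical model standing in for the real mapped class.
    __tablename__ = "instances"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


def add_missing(session, entries):
    """Add only the entries whose primary key is not stored yet."""
    ids = [entry.id for entry in entries]
    # One query fetches every primary key from the batch that already exists.
    existing = {row[0] for row in session.query(Instance.id).filter(Instance.id.in_(ids))}
    new_entries = [entry for entry in entries if entry.id not in existing]
    session.add_all(new_entries)
    session.commit()
    return len(new_entries)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Instance(id=1, name="Ringo"))
    session.commit()
    batch = [Instance(id=1, name="Ringo"), Instance(id=2, name="Paul")]
    added = add_missing(session, batch)
    total = session.query(Instance).count()

print(added, total)
```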

SQLAlchemy not rolling back after FlushError

I'm writing some tests for a REST API linked to a MySQL DB with Python + werkzeug + SQLAlchemy. One of the tests tries to add an "object" with the primary key missing from the JSON and verifies that the request fails and nothing is inserted into the DB. It used to work fine with SQLite, but I switched to MySQLdb and now I get a FlushError (instead of the IntegrityError I used to catch). When I roll back after the error, no error is thrown, but the entry is in the database with the primary key set to ''. The code looks like this:
session = Session()
try:
    res = func(*args, session=session, **kwargs)
    session.commit()
except sqlalchemy.exc.SQLAlchemyError as e:
    session.rollback()
    return abort(422)
else:
    return res
finally:
    session.close()
And here's the error that I catch during the try/except:
class 'sqlalchemy.orm.exc.FlushError':Instance has a NULL identity key. If this is an auto-generated value, check that the database table allows generation of new primary key values, and that the mapped Column object is configured to expect these generated values. Ensure also that this flush() is not occurring at an inappropriate time, such as within a load() event.
I just read the documentation about the SQLalchemy session and rollback feature but don't understand why the rollback doesn't work for me as this is almost textbook example from the documentation.
I use Python 2.7.13, werkzeug '0.12.2', sqlalchemy '1.1.13', MySQLdb '1.2.3' and mysql Ver 14.14 Distrib 5.1.73.
Thanks for your help
It looks like the problem was MySQL-specific:
By default, strict mode isn't activated, which allows incorrect inserts/updates to make changes in the database (wtf?). The solution is to change the sql_mode, either globally:
MySQL: Setting sql_mode permanently
Or in SQLAlchemy, as explained in this blog post:
https://www.enricozini.org/blog/2012/tips/sa-sqlmode-traditional/
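One way to apply the setting from SQLAlchemy is a connect event listener that runs on every new DBAPI connection. This is a sketch rather than the blog post's exact code: the function names set_strict_sql_mode and make_engine are hypothetical, and TRADITIONAL is assumed to be the mode you want.

```python
from sqlalchemy import create_engine, event


def set_strict_sql_mode(dbapi_connection, connection_record):
    """Switch each new MySQL connection to a strict sql_mode."""
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SET SESSION sql_mode = 'TRADITIONAL'")
    finally:
        cursor.close()


def make_engine(url):
    """Create an engine whose connections all run the listener above."""
    engine = create_engine(url)
    # "connect" fires once for every new DBAPI connection the pool opens.
    event.listen(engine, "connect", set_strict_sql_mode)
    return engine
```

Calling make_engine with your MySQL URL in place of create_engine leaves the rest of the session code unchanged.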

Catch python DatabaseErrors generically

I have a database schema that might be implemented in a variety of different database engines (let's say an MS Access database that I'll connect to with pyodbc or a SQLite database that I'll connect to via the built-in sqlite3 module, as a simple example).
I'd like to create a factory function/method that returns a database connection of the appropriate type based on some parameter, similar to the following:
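import sqlite3
import pyodbc
def createConnection(connType, params):
    if connType == 'sqlite':
        return sqlite3.connect(params['filename'])
    elif connType == 'msaccess':
        # Literal braces must be doubled inside a str.format() template.
        return pyodbc.connect('DRIVER={{Microsoft Access Driver (*.mdb, *.accdb)}};DBQ={};'.format(params['filename']))
    else:
        # do something else
        raise ValueError('Unsupported connection type: {}'.format(connType))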
Now I've got some query code that should work with any connection type (since the schema is identical no matter the underlying DB engine) but may throw an exception that I'll need to catch:
db = createConnection(params['dbType'], params)
cursor = db.cursor()
try:
    cursor.execute('SELECT A, B, C FROM TABLE')
    for row in cursor:
        print('{},{},{}'.format(row.A, row.B, row.C))
except DatabaseError as err:
    # Do something...
    pass
The problem I'm having is that the DatabaseError classes from each DB API 2.0 implementation don't share a common base class (other than the way-too-generic Exception), so I don't know how to catch these exceptions generically. Obviously I could do something like the following:
try:
    # as before
except sqlite3.DatabaseError as err:
    # do something
except pyodbc.DatabaseError as err:
    # do something again
...where I included an explicit catch block for each possible database engine. But this seems distinctly non-pythonic to me.
How can I generically catch DatabaseErrors from different underlying DB API 2.0 database implementations?
There are a number of approaches:
Use a catch-all exception and then work out which exception it is. If it is not in your list, raise the exception again (or raise your own). See: Python When I catch an exception, how do I get the type, file, and line number?
Perhaps you want to approach the problem differently: your factory code should also provide the exception to test for.
A simpler approach in my view (and the one I use in practice) is to have a class for all database connections and to subclass it for each specific database type/syntax. Inheritance allows you to take care of all the specifics. For some reason, I never had to worry about this issue.
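The second approach, where the factory also hands back the exception class to catch, can be sketched as follows. The name create_connection and the tuple return value are illustrative choices, not part of any library. The pyodbc branch is imported lazily so that the example runs with only the standard library.

```python
import sqlite3


def create_connection(conn_type, params):
    """Return (connection, DatabaseError class) for the requested backend."""
    if conn_type == "sqlite":
        return sqlite3.connect(params["filename"]), sqlite3.DatabaseError
    if conn_type == "msaccess":
        import pyodbc  # imported lazily; only needed for this backend

        conn_str = "DRIVER={{Microsoft Access Driver (*.mdb, *.accdb)}};DBQ={};".format(
            params["filename"]
        )
        return pyodbc.connect(conn_str), pyodbc.DatabaseError
    raise ValueError("unsupported connection type: {!r}".format(conn_type))


# The calling code catches whatever class the factory returned.
db, DatabaseError = create_connection("sqlite", {"filename": ":memory:"})
cursor = db.cursor()
try:
    cursor.execute("SELECT A, B, C FROM missing_table")
    outcome = "ok"
except DatabaseError as err:
    outcome = "database error: {}".format(err)
finally:
    db.close()

print(outcome)
```

Because the except clause evaluates a plain variable, the query code never needs to know which driver module is in use.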

SQLAlchemy and explicit locking

I have multiple processes that can potentially insert duplicate rows into the database. These inserts do not happen very frequently (a few times every hour) so it is not performance critical.
I've tried an exist check before doing the insert, like so:
# Assume we're inserting a camera object, that's a valid SQLAlchemy ORM object that inherits from declarative_base...
try:
    stmt = exists().where(Camera.id == camera_id)
    exists_result = session.query(Camera).with_lockmode("update").filter(stmt).first()
    if exists_result is None:
        session.add(Camera(...))  # Lots of parameters, just assume it works
        session.commit()
except IntegrityError as e:
    session.rollback()
The problem I'm running into is that the exists() check doesn't lock the table, so there is a chance that multiple processes could attempt to insert the same object at the same time. In such a scenario, one process succeeds with the insert and the others fail with an IntegrityError exception. While this works, it doesn't feel "clean" to me.
I would really like some way of locking the Camera table before doing the exists() check.
Perhaps this might be of interest to you:
https://groups.google.com/forum/?fromgroups=#!topic/sqlalchemy/8WLhbsp2nls
You can lock the tables by executing the SQL directly. I'm not sure what that looks like in Elixir, but in plain SA it'd be something like:
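from sqlalchemy import text
conn = engine.connect()
conn.execute(text("LOCK TABLES Pointer WRITE"))
try:
    pass  # do stuff with conn
finally:
    # Release the lock even if the work above fails.
    conn.execute(text("UNLOCK TABLES"))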

Checking for non-existent optional tables

I am using SQLAlchemy + Pyramid to operate on my database. However, there are some optional tables which are not always expected to be present in the DB. So while querying them I try to catch such cases with the NoSuchTableError
try:
    x = session.query(ABC.name.label('sig_name'),func.count('*').label('count_')).join(DEF).join(MNO).filter(MNO.relevance >= relevance_threshold).group_by(DEF.signature).order_by(desc('count_')).all()[:val]
except NoSuchTableError:
    x = [-1,]
But on executing this statement, I get a ProgrammingError
ProgrammingError: (ProgrammingError) (1146, "Table 'db.mno' doesn't exist")
Why does SQLAlchemy raise the more general ProgrammingError instead of the more specific NoSuchTableError? And if this is indeed expected behaviour, how do I ensure the app displays correct information depending on whether tables are present/absent?
EDIT1
Since this is part of my webapp, the model of DB is in models.py (under my pyramid webapp). I do have a setting in my .ini file that asks user to select whether additional tables are available or not. But not trusting the user, I want to be able to check for myself (in the views) whether table exists or not. The contentious table is something like (in models.py)
class MNO(Base):
    __tablename__="mno"
    id=Column(Integer,primary_key=True,autoincrement=True)
    sid=Column(Integer)
    cid=Column(mysql.MSInteger(unsigned=True))
    affectability=Column(Integer)
    cvss_base=Column(Float)
    relevance=Column(Float)
    __table_args__=(ForeignKeyConstraint(['sid','cid',],['def.sid','def.cid',]),UniqueConstraint('sid','cid'),)
How and where should the check be made so that a variable can be set (preferably during app setup) which tells me whether the tables are present or not?
Note: In this case I would have to use if...else rather than 'ask for forgiveness'.
According to the sqlalchemy docs, a NoSuchTableError is only thrown when "SQLAlchemy [is] asked to load a table's definition from the database, but the table doesn't exist." You could try loading a table's definition, catching the error there, and doing your query otherwise.
If you want to do things via "asking for forgiveness":
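from sqlalchemy import MetaData, Table
from sqlalchemy.exc import NoSuchTableError
try:
    # autoload_with makes SQLAlchemy reflect the table, which is what can raise NoSuchTableError.
    table = Table(table_name, MetaData(), autoload_with=engine)
except NoSuchTableError:
    pass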
Alternatively, you could just check whether the table exists:
Edit:
Better yet, why don't you use the has_table method:
if engine.dialect.has_table(connection, table_name):
    # do your crazy query
Why don't you use Inspector to grab the table names first?
Maybe something like this:
from sqlalchemy import create_engine
from sqlalchemy.engine import reflection
#whatever code you already have
engine = create_engine('...')
insp = reflection.Inspector.from_engine(engine)
table_name = 'foo'
table_names = insp.get_table_names()
if table_name in table_names:
    x = session.query(ABC.name.label('sig_name'),func.count('*').label('count_')).join(DEF).join(MNO).filter(MNO.relevance >= relevance_threshold).group_by(DEF.signature).order_by(desc('count_')).all()[:val]
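For the "set a variable during app setup" part of the question, the inspector lookup can be wrapped in a small helper that runs once at startup. The sketch below uses an in-memory SQLite database so that it runs anywhere. The helper name present_tables and the table name xyz are made up, and how table-name case is compared depends on the backend.

```python
from sqlalchemy import create_engine, inspect, text


def present_tables(engine, optional_names):
    """Map each optional table name to True/False depending on whether it exists."""
    existing = set(inspect(engine).get_table_names())
    return {name: name in existing for name in optional_names}


# Demo database with one of the two optional tables created.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE mno (id INTEGER PRIMARY KEY)"))

availability = present_tables(engine, ["mno", "xyz"])
print(availability)
```

The resulting dictionary can be stored in the application settings, and views can then branch on it with a plain if...else instead of catching ProgrammingError.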
