SQLAlchemy session reconnect

SQLAlchemy session reconnect - python

How can I force my engine to reconnect if a query returns an OperationalError like user does not have access to the database or something like that?
engine = create_engine(url, pool_recycle=3600)
Session = sessionmaker(bind=engine)
try:
sesh = Session()
sesh.query....
sesh.close()
except OperationalError:
# force engine to reconnect here somehow?

If you catch an error that indicates the connection was closed during an operation, SQLAlchemy automatically reconnects on the next access. However, when a database disconnects, your transaction is gone, so SQLAlchemy requires that you emit rollback() on the Session in order to establish within your application that a new transaction is to take place. you then need to start your whole transaction all over again.
Dealing with that issue has a few angles. You should read through the Dealing with Disconnects section of the documentation which illustrates two ways to work with disconnects. Beyond that, if you truly wanted to pick up your transaction from where you left off, you'd need to "replay" the whole thing back, assuming you've done more than one thing in your transaction. This is best suited by application code that packages what it needs to do in a function that can be called again. Note that a future version of SQLAlchemy may introduce an extension called the Transaction Replay Extension that provides another way of doing this, however it will have lots of caveats, as replaying a lost transaction in a generic way is not a trivial affair.

A lot have happened since this question was first answered.
By taking a pessimistic error handling approach you get the most bang for the buck - easy implementation and very effective.
Apply the pool_pre_ping=Truewhen you create the engine, like this:
engine = create_engine("mysql+pymysql://user:pw#host/db", pool_pre_ping=True)
See more: docs.sqlalchemy.org/en/latest/core/pooling.html#pool-disconnects-pessimistic
Another approach is to deal with errors in an optimistic way - as they happen. In that case you can wrap the execute statement in a try and except and the invalidate the connection if an exception is raised. Once the connection is invalidated you and re-instantiate it.
See more: docs.sqlalchemy.org/en/latest/core/pooling.html#disconnect-handling-optimistic
Both approaches works great in situations where your connection otherwise would timeout e.g. overnight / weekend. It also makes it much easier for IT operations to take a database down and not have to worry too much about downstream applications relying on a restart. How ever this is not a silver bullet, it's worth thinking about secure transaction handling (as mentioned by zzzeek) if you deal with very critical transactions.

If you are using sqlalchemy in Flask, when you query like
MyModel.query.all()
and you get errors like
File "./lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 427, in _revalidate_connection
"Can't reconnect until invalid "
StatementError: (sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back
You can simply reconnect DB by
MyModel.query.session.close()
MyModel.query.all()

Related

SQLalchemy: Engine/Connection disconnects after some hours of inactivity

Situation
I have a plotly-dash application running in a docker container (based on python3.7-slim).
The app is accessing a postgres database and visualizes the queried data.
However, if the app has not been used for some time (I would estimate around 24-48 hours. We first noticed this issue on mondays after nobody used the app during the weekend) i.e. if no data has been queried from the database, the app freezes and the logs show some errors related to the database.
I cannot fully access the logs, but they contain this error:
AttributeError: 'Connection' object has no attribute '_Connection_connection'
and in the following, all the pieces of code which tried to query data from the database are stated (but not what exactly went wrong).
The problem was always solved with a restart of the app (and thus a new connection to the database)
Assumption
As stated above, this always occured after a period of inactivity. So my assumption is, that the engine disconnects after some idle time
Code Sample
For accessing the database, I have a DatabaseConnection class. The relevant part of the code contais something like this:
from sqlalchemy import create_engine
...
engine = create_engine(f"postgresql+psycopg2://{user}:{passw}#{url}:{port}/{db_name}")
self.engine = engine.connect()
...
Question
What is the best solution for overcoming the issue of the disconnect after some inactivity?
How could I possibly check whether the database connection is still active and if not, reconnect it somehow?
-Is there a better way the access the database than through an engine-object?
Is there something wrong with my approach in general?
Please let me know if you require further information. Thanks in Advance.

There is an error in my code. It should be
self.engine = create_engine(f"postgresql+psycopg2://{user}:{passw}#{url}:{port}/{db_name}")
and the second line should be omitted. I misunderstood what engine.connect() is doing: It returns a Connection object (not an engine, as the attribute name suggests).
Then, for each query I execute, I use the context manager like this:
with self.engine.connect() as conn
table1 = pd.read_sql_table("table1", con=conn)
That way, the Connection object is closed after it has been used. But the engine object may open new connections whenever necessary.
In my previous solution, die Connection was killed after some Idle time.
(Based on this GitHub Discussion)

sqlalchemy disable cache (calling to DB in every call) [duplicate]

I have a caching problem when I use sqlalchemy.
I use sqlalchemy to insert data into a MySQL database. Then, I have another application process this data, and update it directly.
But sqlalchemy always returns the old data rather than the updated data. I think sqlalchemy cached my request ... so ... how should I disable it?

The usual cause for people thinking there's a "cache" at play, besides the usual SQLAlchemy identity map which is local to a transaction, is that they are observing the effects of transaction isolation. SQLAlchemy's session works by default in a transactional mode, meaning it waits until session.commit() is called in order to persist data to the database. During this time, other transactions in progress elsewhere will not see this data.
However, due to the isolated nature of transactions, there's an extra twist. Those other transactions in progress will not only not see your transaction's data until it is committed, they also can't see it in some cases until they are committed or rolled back also (which is the same effect your close() is having here). A transaction with an average degree of isolation will hold onto the state that it has loaded thus far, and keep giving you that same state local to the transaction even though the real data has changed - this is called repeatable reads in transaction isolation parlance.
http://en.wikipedia.org/wiki/Isolation_%28database_systems%29

This issue has been really frustrating for me, but I have finally figured it out.
I have a Flask/SQLAlchemy Application running alongside an older PHP site. The PHP site would write to the database and SQLAlchemy would not be aware of any changes.
I tried the sessionmaker setting autoflush=True unsuccessfully
I tried db_session.flush(), db_session.expire_all(), and db_session.commit() before querying and NONE worked. Still showed stale data.
Finally I came across this section of the SQLAlchemy docs: http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#transaction-isolation-level
Setting the isolation_level worked great. Now my Flask app is "talking" to the PHP app. Here's the code:
engine = create_engine(
"postgresql+pg8000://scott:tiger#localhost/test",
isolation_level="READ UNCOMMITTED"
)
When the SQLAlchemy engine is started with the "READ UNCOMMITED" isolation_level it will perform "dirty reads" which means it will read uncommited changes directly from the database.
Hope this helps
Here is a possible solution courtesy of AaronD in the comments
from flask.ext.sqlalchemy import SQLAlchemy
class UnlockedAlchemy(SQLAlchemy):
def apply_driver_hacks(self, app, info, options):
if "isolation_level" not in options:
options["isolation_level"] = "READ COMMITTED"
return super(UnlockedAlchemy, self).apply_driver_hacks(app, info, options)

Additionally to zzzeek excellent answer,
I had a similar issue. I solved the problem by using short living sessions.
with closing(new_session()) as sess:
# do your stuff
I used a fresh session per task, task group or request (in case of web app). That solved the "caching" problem for me.
This material was very useful for me:
When do I construct a Session, when do I commit it, and when do I close it

This was happening in my Flask application, and my solution was to expire all objects in the session after every request.
from flask.signals import request_finished
def expire_session(sender, response, **extra):
app.db.session.expire_all()
request_finished.connect(expire_session, flask_app)
Worked like a charm.

I have tried session.commit(), session.flush() none worked for me.
After going through sqlalchemy source code, I found the solution to disable caching.
Setting query_cache_size=0 in create_engine worked.
create_engine(connection_string, convert_unicode=True, echo=True, query_cache_size=0)

First, there is no cache for SQLAlchemy.
Based on your method to fetch data from DB, you should do some test after database is updated by others, see whether you can get new data.
(1) use connection:
connection = engine.connect()
result = connection.execute("select username from users")
for row in result:
print "username:", row['username']
connection.close()
(2) use Engine ...
(3) use MegaData...
please folowing the step in : http://docs.sqlalchemy.org/en/latest/core/connections.html
Another possible reason is your MySQL DB is not updated permanently. Restart MySQL service and have a check.

As i know SQLAlchemy does not store caches, so you need to looking at logging output.

How to implement callback on connection drop event using SQLObject?

I'm using a Python script that does certain flow control on our outgoing mail messages, mostly checking whether a user is sending spam.
The script establishes a persistent connection with a database via a SQLObject. Under certain circumstances, the connection is dropped by a third-party (e.g. our firewall closes the connection due to excess idle), and the SQLObject doesn't notice it has been closed and it continues sending queries on a dead TCP handler, resulting in log entries like these:
Feb 06 06:56:07 mailsrv2 flow: ERROR Processing request error: [Failure instance: Traceback: <class 'psycopg2.InterfaceError'>: connection already closed#012/usr/lib/python2.7/threading.py:524:__bootstrap#012/usr/lib/python2.7/threading.py:551:__bootstrap_inner#012/usr/lib/python2.7/threading.py:504:run#012--- <exception caught here>---#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/twisted/python/threadpool.py:191:_worker#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/twisted/python/context.py:118:callWithContext#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/twisted/python/context.py:81:callWithContext#012
/opt/scripts/flow/server.py:91:check#012
/opt/scripts/flow/flow.py:252:check#012
/opt/scripts/flow/flow.py:155:append_to_log#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/SQLObject-1.3.1-py2.7.egg/sqlobject/main.py:1226:__init__#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/SQLObject-1.3.1-py2.7.egg/sqlobject/main.py:1274:_create#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/SQLObject-1.3.1-py2.7.egg/sqlobject/main.py:1298:_SO_finishCreate#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/SQLObject-1.3.1-py2.7.egg/sqlobject/dbconnection.py:468:queryInsertID#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/SQLObject-1.3.1-py2.7.egg/sqlobject/dbconnection.py:327:_runWithConnection#012
/opt/scripts/virtualenv/local/lib/python2.7/site-packages/SQLObject-1.3.1-py2.7.egg/sqlobject/postgres/pgconnection.py:191:_queryInsertID#012]
This makes me think that indeed there must be some callback for this kind of situation, otherwise that log entry wouldn't be written. I'd use that callback to establish a new connection to the database. I've been unable to find any piece of documentation about that.
Does anyone know if it's even possible to implement that callback and how to declare it?
Thanks.

We're more regular users of SQLAlchemy rather than SQLObject. According to this thread from 2010 (http://sourceforge.net/p/sqlobject/mailman/message/26460439), SQLObject does not support reconnection logic for PostgreSQL. It's an old thread, but there does not appear to be any discussion about solving this from within SQLObject.
I have three suggested solutions.
The first solution is to explore Connection Pools. It might provide a way to open a new connection object when SQLObject detects the psycopg2 has disconnected. I can't guarantee it will, but if it does this solution would be your best best as it requires the least amount of changes on your part.
The second solution is to switch your backend from Postgres to MySQL. The SQLObject docs provide information on how use the reconnection logic of the mysql driver - http://sourceforge.net/p/mysql-python/feature-requests/9
The third solution is to switch to SQLAlchemy as your ORM and to use their version of Connection Pools. According to the core event documentation, when using pools if a connection is dropped or closed a new connection will opened -- http://docs.sqlalchemy.org/en/rel_0_9/core/exceptions.html#sqlalchemy.exc.DisconnectionError
Best of luck

SQLalchemy session issues in daemon but not in pyramid. What might i be missing?

I figured I would make this a different question for the sake of being tidy. It is based on:
SQLAlchemy won't update my database and SQLAlchemy session: how to keep it alive?.
So here's the deal: I have a Pyramid application that's talking to a daemon, which in turn talks to a database.
Now for some reason stuff isn't getting committed to the database when I add it to the database session variable, as in:
DBSession.add(ModelInstance)
Calling flush or commit doesn't make it commit.
This is how I make DBSession:
settings = {
'sqlalchemy.url':'blah blah'
}
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
engine = engine_from_config(settings, 'sqlalchemy.')
DBSession.configure(bind=engine)
This seems fine to me because I can query the database fine. ie: this sort of thing works:
DBSession.query(ModelClass).get(id)
This fine gentleman https://stackoverflow.com/users/100297/martijn-pieters suggested the use of the following little bit of code:
import transaction
transaction.commit()
And that worked fine for making sure that my stuff got committed. The only problem is that it somehow renders my DBSession useless. So if I want to use the objects that my session is keeping track of I need to re instantiate the session and those items. This sucks. It takes up a whole lot of time.
My question is, in short, how can I avoid this?
And in long:
How do I get my DBSession to commit properly without breaking it?
OR
How do I fix my DBSession and associated model instances without the need for lengthly database calls?
AND
Any idea why this is happening? I have successfully constructed a DBSession in the same way inside the Pyramid app I mentioned and it worked totally fine, it committed when I wanted it to and everything.
For details of the errors I encountered please refer to the two questions I mentioned in the beginning

In SQLAlchemy sessions expire the objects they are managing upon commit. This is sane because after commit you have no guarantees in a concurrent world that something else isn't changing the state they are attempting to mirror in the database.
Pyramid recommends the use of a transaction manager that helps you maintain a single transaction per request. It will automatically call transaction.commit() for you after the request is complete. In this way, you don't have to think/worry about objects expiring, and transactions are properly aborted if your code raises an exception.
The way to setup the transaction manager is by installing pyramid_tm and zope.sqlalchemy and then connecting your DBSession to zope.sqlalchemy.ZopeTransactionExtension. Then things will "just work".
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
# ...
def main(global_conf, **settings):
config = Configurator(...)
config.include('pyramid_tm')
# ...
If you need to populate a new object's primary key or ensure that some SQL will execute properly you can use DBSession.flush() to execute the SQL within your transaction without actually committing it. Any errors will be raised there for you to catch and deal with.
This basic setup of your sessions is described within the tutorial in the Pyramid documentation:
http://docs.pylonsproject.org/projects/pyramid/en/1.4-branch/tutorials/wiki2/basiclayout.html
Update: I realized I kind of answered your question about using the transaction manager in Pyramid, which you are already using successfully. I think that the answer also clearly explains what's going on with the ZopeTransactionExtension, however, and you just need confirmation about committing transactions. You'd be wise to simply use one transaction in your script, which you can create via
import transaction
with transaction.manager:
# do tons of database stuff
Now if an exception happens the transaction will be aborted, and if not it will be committed.

According to sqlalchemy's doc
Another behavior of commit() is that by default it expires the state
of all instances present after the commit is complete. This is so that
when the instances are next accessed, either through attribute access
or by them being present in a Query result set, they receive the most
recent state. To disable this behavior, configure sessionmaker() with
expire_on_commit=False.
What seems like a problem is actually done on purpose : when you commit, all the objects are marked as expired. This is done so that you don't keep using old cached values that may have changed in the database.
The reason it works in pyramid is because each request has its own transaction, and queries object before working on them. You try to use object from a preview transaction, which might not be a good idea, because their may not be in sync with the database.
To solve your problem, you can either make sure you don't reuse object after the end of a transaction (may you need your transactions to include more things), or you can use expire_on_commit=False as advised in the transaction. But if you use the latter, be aware that the object may be outdated.

If I'm reading this correctly, I think I know what is going on...
The downside of the pyramid/zope transaction managers is that they're all or nothing - due to the way they're implemented, you can't both use them and call commit(). I don't remember exactly why, but I once dug into all their code after fighting with this for hours , and there just wasn't a way to commit within the page.
For a variety of reasons I decided to not use the automatic transaction wrapping in my apps. i had a lot of scenarios where I want one or more commits , or needed to use savepoints (i use postgresql).

How to manage db connections using webpy with SQLObject?

Web.py has its own database API, web.db. It's possible to use SQLObject instead, but I haven't been able to find documentation describing how to do this properly. I'm especially interested in managing database connections. It would be best to establish a connection at the wsgi entry point, and reuse it. Webpy cookbook contains an example how to do this with SQLAlchemy. I'd be interested to see how to properly do a similar thing using SQLObject.
This is how I currently do it:
class MyPage(object):
def GET(self):
ConnectToDatabase()
....
return render.MyPage(...)
This is obviously inefficient, because it establishes a new database connection on each query. I'm sure there's a better way.

As far as I understand the SQLAlchemy example given, a processor is used, that is, a session is created for each connection and committed when the handler is complete (or rolled back if an error has occurred).
I don't see any simple way to do what you propose, i.e. open a connection at the WSGI entry point. You will probably need a connection pool to serve multiple clients at the same time. (I have no idea what are the requirements for efficiency, code simplicity and so on, though. Please comment.)
Inserting ConnectToDatabase calls into each handler is of course ugly. I suggest that you adapt the cookbook example replacing the SQLAlchemy session with a SQLObject connection.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.