I'm facing strange behaviour in a multi-process environment.
My first process (later called P1) writes to the db through SQLAlchemy (sqa). My second process (later called P2) reads from the db through sqa. P2 is a web application that asks for up-to-date data with an ajax call.
When P1 updates the data (write), P2 does not see the change (read) immediately. It has to poll several times before actually seeing the db change (issuing session.query(...)). If I run another process P3, I can see that the change has actually been made in the db, but P2 (the web app) still does not see it immediately.
I'm running sqa 0.8.4 (underlying db: sqlite) on Ubuntu 13.04, and my web app is based on the CherryPy framework (3.2.0).
I used scoped sessions to get thread-safe session objects, as mentioned in the SQLAlchemy documentation.
Here's my OrmManager class, used by all my processes:
import sqlite3

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.pool import NullPool


class OrmManager:
    def __init__(self, database, metadata, echo=False):
        self.database = database
        engine = create_engine('sqlite:///' + database,
                               echo=echo,
                               connect_args={'detect_types': sqlite3.PARSE_DECLTYPES |
                                                             sqlite3.PARSE_COLNAMES},
                               native_datetime=True,
                               poolclass=NullPool,
                               convert_unicode=True
                               )
        metadata.create_all(engine)
        # This factory is thread safe: a session object (always the same one)
        # is returned to the caller. If called from another thread, the
        # returned session object will be different.
        session_factory = sessionmaker(bind=engine, expire_on_commit=False)
        self.session = scoped_session(session_factory)

    def get_session(self):
        session = self.session()
        return session
P1, P2 and P3 each instantiate an OrmManager and use the returned session as follows:
orm_mgr = OrmManager(database=<path/to/my/.sqlite/file>, metadata=METADATA)
session = orm_mgr.get_session()
# do some stuff here
session.commit()
I checked the P1 code. The db change is indeed committed (call to session.commit()), but the change is not seen in real time by P2 (the web app), unlike P3 (the command line process). It can take seconds for P2 to see the change...
Any ideas?
Thanks a lot,
Pierre
Found the issue.
According to the SQLAlchemy documentation, Session.remove() (on the scoped_session registry) has to be called after each web request.
I added the following code to my CherryPy app:
def on_end_request():
    """As mentioned in the SQLAlchemy documentation, the
    scoped_session.remove() method has to be called
    at the end of each request."""
    Session.remove()
and:
cherrypy.config.update({'tools.dbsession_close.on' : True})
# As mentioned in SQLAlchemy documentation, call the .remove() method
# of the scoped_session object at the end of each request
cherrypy.tools.dbsession_close = cherrypy.Tool('on_end_request', on_end_request)
Works fine now.
HTH other people,
Pierre
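For anyone wondering why this fixes the stale reads: the long-lived scoped session in P2 keeps serving the objects and transaction state it already holds, so it does not re-read the database until that session ends. Session.remove() closes it, and the next request starts a fresh session. A lighter-weight sketch of the same idea (untested; MyModel is just a placeholder model) would be to expire the session's state at the start of each request, though per-request remove() is the documented pattern:

session = orm_mgr.get_session()
session.expire_all()      # mark everything loaded as stale...
# or: session.rollback()  # ...or end the session's current transaction
fresh_rows = session.query(MyModel).all()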
Related
So this question is a little like "Does SQLAlchemy reset the database session between SQLAlchemy Sessions from the same connection?"
I have a Flask/SQLAlchemy/Postgres app, which intermittently seems to drop connections after a commit() that occurs as part of a POST request.
This causes me headaches as I rely upon a customized option (https://www.postgresql.org/docs/9.6/runtime-config-custom.html) to control row level security - in effect executing the following before each Flask request while utilising scoped sessions:
@app.before_request
def load_user():
    ...
    # Set up RLS.
    statement = f"SET app.permitted_workspace_id = '{workspace_id}'"
    db.db_session.execute(statement)
    ...
This pattern generally works fine, but it occasionally fails: as far as I can tell, after a commit(), SQLAlchemy releases the underlying connection and later checks out a new one, on which app.permitted_workspace_id is no longer set.
My workaround for this is to listen for connection checkout events and re-set the parameter:
@event.listens_for(db_engine, 'checkout')
def receive_checkout(dbapi_connection, connection_record, connection_proxy):
    ...
    cursor = dbapi_connection.cursor()
    statement = f"SET app.permitted_workspace_id = '{g.user.workspace_id}'"
    cursor.execute(statement)
    return
So my question is really: is it unavoidable that SQLAlchemy releases connections after commit(), meaning I lose my session parameters even when there is more DB work still to do?
If so, is this pattern secure or even acceptable practice? Ideally, I'd keep the session open until it is removed (via @app.teardown_appcontext), but since I'm struggling to achieve that, and the relevant info is still available within the Flask request, I think this is the next best way to go.
Thanks
Edit 1:
In terms of session scoping, the layout is this:
In a database module, I lay out the following:
def get_database_connection():
    ...
    db_engine = sa.create_engine(
        f'postgresql://{user}:{password}@{host}/postgres',
        echo=False,
        poolclass=sa.pool.NullPool
    )
    # Connect - RLS is controlled by db_get_user_details.
    db_connection = db_engine.connect()
    db_session = scoped_session(
        sessionmaker(
            autocommit=False,
            autoflush=False,
            expire_on_commit=False,
            bind=db_engine
        )
    )
    return db_engine, db_session, db_connection
This is then called once at the top of the main Flask application:
db_engine, db_session, db_connection = db.get_database_connection()
And session removal is controlled by a function as follows:
@app.teardown_appcontext
def remove_session(exception=None):
    db_session.remove()
So the answer here seems to be that commit() does perform a connection check-in with this pattern:
https://github.com/sqlalchemy/sqlalchemy/issues/4925
if Session is what you're working with then yes, the Session will release connections when any of commit(), rollback(), or close() is called.
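Given that, an alternative sketch (untested; it assumes the workspace id is available on flask.g as in the snippets above) is to re-apply the setting at the start of every ORM transaction rather than at pool checkout, using the after_begin session event together with a transaction-local set_config():

from flask import g
from sqlalchemy import event, text
from sqlalchemy.orm import Session

@event.listens_for(Session, 'after_begin')
def apply_rls_setting(session, transaction, connection):
    # set_config(..., true) is local to the current transaction, so it does
    # not matter which pooled connection the transaction happens to use.
    connection.execute(
        text("SELECT set_config('app.permitted_workspace_id', :ws, true)"),
        {'ws': str(g.user.workspace_id)},
    )

Listening on the Session class applies this to every session in the process; listening on the specific sessionmaker would narrow it down.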
I built an API using Flask and I'm using a service (as below) to create my database connections.
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker


class DatabaseService:
    def __init__(self):
        self.connection_string = "foo"

    def create_session(self):
        engine = create_engine(self.connection_string)
        Session = scoped_session(sessionmaker(bind=engine))
        return Session
In my app.py I add and remove these sessions on the Flask application context (g), as the docs suggest, so I can reference g.session wherever I need it:
def get_session():
    if 'session' not in g:
        session = database_service.create_session()
        g.session = session


@app.teardown_appcontext
def shutdown_session(exception=None):
    if 'session' in g:
        g.session.remove()
    return None
This way every request has its own session that is closed after processing. Am I right?
I don't understand why the connections are still alive in my database after the request is already done.
Whenever I run show processlist I can see multiple sleeping connections from my API.
I see no problem with opening and closing sessions per request:
my_session = Session(engine)
my_session.execute(some_query)
my_session.close()
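If the sleeping connections themselves are the concern: with the default QueuePool, connections that a session checks back in are kept open for reuse, and that is exactly what shows up as Sleep in show processlist. Also note that create_session() above builds a new engine (and therefore a new pool) on every call; creating the engine once and reusing it avoids accumulating idle pools. Here is a sketch of two ways to make the connections actually close, reusing the placeholder connection string "foo" from the question:

from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# Option 1: disable pooling, so each connection is closed as soon as the
# session checks it back in.
engine = create_engine("foo", poolclass=NullPool)

# Option 2: keep the pool, but close all of its connections explicitly
# when you are done with the engine (e.g. at application shutdown).
engine.dispose()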
I am using the following versions of the neo4j libraries:
neo4j==1.7.2
neobolt==1.7.9
neotime==1.7.4
I have a flask app and in development I am using the internal flask application server. (In prod I will use a docker container with uwsgi, but this question is about my dev setup.)
I have encapsulated neo4j in a class, and my application maintains a single instance of this class:
import neo4j


class ChartStoreConnectionClass():
    driver = None

    def __init__(self, configDict):
        self.driver = neo4j.GraphDatabase.driver(
            configDict["boltUri"],
            auth=(configDict["basicAuthUsername"], configDict["basicAuthPassword"]),
            encrypted=True,
            trust=neo4j.TRUST_SYSTEM_CA_SIGNED_CERTIFICATES,
            # trust=neo4j.TRUST_ALL_CERTIFICATES,
            # trust=neo4j.TRUST_CUSTOM_CA_SIGNED_CERTIFICATES,  # custom CA support is not implemented
        )

    def readonlyQuery(self, queryFN):
        res = None
        with self.driver.session() as session:
            tx = session.begin_transaction()
            res = queryFN(tx)
            tx.rollback()
        return res

    def execute(self, queryFN):
        res = None
        with self.driver.session() as session:
            tx = session.begin_transaction()
            res = queryFN(tx)
            tx.commit()
        return res
This setup works for a while, but sometimes I get the following error:
neobolt.exceptions.ServiceUnavailable: Failed to read from defunct connection Address(host='127.0.0.1', port=7687)
When I simply retry the request, it works the second time. I have read around the error message and found multiple posts about neo4j in multi-threaded vs multi-process environments, but I do not believe they are relevant here.
The errors occur on the commit in the execute function. The queryFN I pass in is a very simple one-liner which takes next to no time to execute.
Is it wrong to have a single driver instance for my application? (I thought this was the way to do it because the driver creates a connection pool and it makes sense that my application has a connection pool.)
What is the recommended way to use neo4j with Flask?
I have seen this example https://github.com/neo4j-examples/movies-python-bolt/blob/master/movies.py but they simply have a single driver object in the same way as I do. (Except theirs is global rather than inside a class; functionally mine is the same.)
Where should I be looking to debug this issue?
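For completeness, the only mitigation I have so far is a retry sketch around execute() like the one below; it assumes the query function passed in is idempotent, since it simply runs it again when the pooled connection turns out to be defunct:

from neobolt.exceptions import ServiceUnavailable

def execute_with_retry(store, queryFN, retries=1):
    # store is a ChartStoreConnectionClass instance; retry the call when the
    # driver hands us a stale connection from its pool.
    for attempt in range(retries + 1):
        try:
            return store.execute(queryFN)
        except ServiceUnavailable:
            if attempt == retries:
                raise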
I have an API written in Flask. It uses SQLAlchemy to talk to a MySQL database. I don't use flask-sqlalchemy, because I don't like how the module forces you into a certain pattern for declaring the model.
I'm having a problem where my database connections are not closing. The object representing the connection goes out of scope, so I assume it is garbage collected. I also explicitly call close() on the session. Despite this, the connections stay open long after the API call has returned its response.
sqlsession.py: Here is the wrapper I am using for the session.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import NullPool


class SqlSession:
    def __init__(self, conn=Constants.Sql):
        self.db = SqlSession.createEngine(conn)
        Session = sessionmaker(bind=self.db)
        self.session = Session()

    @staticmethod
    def createEngine(conn):
        return create_engine(conn.URI.format(user=conn.USER, password=conn.PASS,
                                             host=conn.HOST, port=conn.PORT,
                                             database=conn.DATABASE,
                                             poolclass=NullPool))

    def close(self):
        self.session.close()
flaskroutes.py: Here is an example of the flask app instantiating and using the wrapper object. Note that it instantiates it at the beginning, within the scope of the API call, then closes the session at the end; presumably the wrapper is garbage collected after the response is returned.
def commands(self, deviceId):
    sqlSession = SqlSession(self.sessionType)  # <---
    commandsQueued = getCommands()
    jsonCommands = []
    for command in commandsQueued:
        jsonCommand = command.returnJsonObject()
        jsonCommands.append(jsonCommand)
        sqlSession.session.delete(command)
    sqlSession.session.commit()
    resp = jsonify({'commands': jsonCommands})
    sqlSession.close()  # <---
    resp.status_code = 200
    return resp
I would expect the connections to be closed as soon as the HTTP response is returned, but instead the connections end up in the "Sleep" state (when viewed with 'show processlist' in the MySQL command line interface).
I ended up using the advice from this SO post:
How to close sqlalchemy connection in MySQL
I strongly recommend reading that post to anyone having this problem. Basically, I added a dispose() call to the close method. Doing so destroys the underlying connection pool entirely, whereas close() alone simply returns connections to an available pool (but leaves them open).
def close(self):
    self.session.close()
    self.db.dispose()
This whole thing was a bit confusing to me, but at least now I understand more about the connection pool.
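To make sure close() (and therefore dispose()) runs on every code path, including exceptions, one small sketch is to wrap the SqlSession class above in a context manager (Constants.Sql and SqlSession are the names from the question):

from contextlib import contextmanager

@contextmanager
def sql_session(conn=Constants.Sql):
    # Yields the underlying SQLAlchemy session and guarantees the wrapper's
    # close() (which now also disposes the engine) always runs.
    wrapper = SqlSession(conn)
    try:
        yield wrapper.session
    finally:
        wrapper.close()

Usage in a route would then be roughly:

with sql_session() as session:
    for command in getCommands():
        session.delete(command)
    session.commit()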
Updated:
Going through the Werkzeug tutorial, I got stuck creating the SQLAlchemy session using sessionmaker() instead of create_session(), as recommended.
Note: this is not about SQLAlchemy itself, it is about Werkzeug.
Werkzeug tutorial:
session = scoped_session(lambda: create_session(bind=application.database_engine,
autoflush=True, autocommit=False), local_manager.get_ident)
I asked how to achieve the same using sessionmaker():
As a result, folks on the #pocoo IRC channel helped me with this:
session = scoped_session(lambda: sessionmaker(bind=application.database_engine)(),
local_manager.get_ident)
Without the () at the end of sessionmaker(**args), it kept giving me this error:
RuntimeError: no object bound to application
P.S. If I remove the lambda, it does not work.
sessionmaker() returns a session factory, not a session itself. scoped_session() takes a session factory as its argument, so just omit the lambda and pass the result of sessionmaker() directly to scoped_session().
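Applied to the Werkzeug example above, that amounts to something like this sketch (same application.database_engine and local_manager as in the question):

from sqlalchemy.orm import scoped_session, sessionmaker

# sessionmaker() builds the factory; scoped_session() calls it as needed,
# scoping the resulting sessions by local_manager.get_ident.
session_factory = sessionmaker(bind=application.database_engine,
                               autoflush=True, autocommit=False)
session = scoped_session(session_factory, local_manager.get_ident)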