SQLAlchemy - Multithreading Best Practices - Packet sequence number wrong

SQLAlchemy - Multithreading Best Practices - Packet sequence number wrong - python

I'm tring to make a clean Flask App which use SQLAlchemy and Multi-Threading.
I've read the doc : https://docs.sqlalchemy.org/en/14/orm/contextual.html#thread-local-scope but can't manage to make it work successfully.
SQL Alchemy is initate directly at the app init.py file with something like that :
db = SQLAlchemy(app)
session_factory = sessionmaker( bind=db.engine, autocommit=False, autoflush=False)
DBSession = scoped_session( session_factory )
And in another file, which is not some Flask routes, on a Class :
from blabla import DBSession
class Worker(Thread):
def __init__(self):
Thread.__init__(self)
self.dbsession = DBSession()
def run(self):
self.dbsession.query(...)
When I run my app, multiple Worker Class is running at the same time.
Then I'm facing lot of errors like :
sqlalchemy.exc.InternalError: (pymysql.err.InternalError) Packet sequence number wrong - got 54 expected 1
What I am doing wrong ?
Thanks a lot in advance for your time !

Looks like maybe you are duplicating the scoped session already provided by flask sqlalchemy. You can probably use db.session directly. Although you have to clean it up yourself when the thread ends or at regular intervals with db.session.remove() because it isn't within a request like flask expects. Why are you adding custom threading ?
https://flask-sqlalchemy.palletsprojects.com/en/2.x/quickstart/#road-to-enlightenment
a preconfigured scoped session called session

Thanks to comments and answer, I've made it worky pretty smoothly. Just need to thinks it from scratch.
On my app.py :
db = SQLAlchemy(app, session_options={"autocommit":False, "autoflush":False, "expire_on_commit":False })
Ian highlighted that the Object db.session is already a scoped session. So no need to create another one manually.
Moreover, I didn't know that it's possible to add session_options directly at this step. Obviously it's easier to directly configure the already integrated one.
On my others files :
dbsession = db.session
dbsession.query(...)
...
dbsession.commit()
dbsession.close()
Each new db.session seems to create a "real" session, I can see them directly on my MySQL instance. So, It's very important to close() and limit the number of concurent threads.
Did you see something wrong in this way ? The answer is YES.
EDIT : With Flask-SQLALchemy V3 comes a new way of deals whith context. The session scope to use the current app context instead of the thread. That is a problem when used out of context.
It's possible to add with app.app_context(): at some points. But it's not clear for me where it has to be done. And I feel like this will overload my code.
Maybe it's possible to come back to the previous behaviour directly from the params ? Don't know how to for the moment.

Related

sqlalchemy disable cache (calling to DB in every call) [duplicate]

I have a caching problem when I use sqlalchemy.
I use sqlalchemy to insert data into a MySQL database. Then, I have another application process this data, and update it directly.
But sqlalchemy always returns the old data rather than the updated data. I think sqlalchemy cached my request ... so ... how should I disable it?

The usual cause for people thinking there's a "cache" at play, besides the usual SQLAlchemy identity map which is local to a transaction, is that they are observing the effects of transaction isolation. SQLAlchemy's session works by default in a transactional mode, meaning it waits until session.commit() is called in order to persist data to the database. During this time, other transactions in progress elsewhere will not see this data.
However, due to the isolated nature of transactions, there's an extra twist. Those other transactions in progress will not only not see your transaction's data until it is committed, they also can't see it in some cases until they are committed or rolled back also (which is the same effect your close() is having here). A transaction with an average degree of isolation will hold onto the state that it has loaded thus far, and keep giving you that same state local to the transaction even though the real data has changed - this is called repeatable reads in transaction isolation parlance.
http://en.wikipedia.org/wiki/Isolation_%28database_systems%29

This issue has been really frustrating for me, but I have finally figured it out.
I have a Flask/SQLAlchemy Application running alongside an older PHP site. The PHP site would write to the database and SQLAlchemy would not be aware of any changes.
I tried the sessionmaker setting autoflush=True unsuccessfully
I tried db_session.flush(), db_session.expire_all(), and db_session.commit() before querying and NONE worked. Still showed stale data.
Finally I came across this section of the SQLAlchemy docs: http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#transaction-isolation-level
Setting the isolation_level worked great. Now my Flask app is "talking" to the PHP app. Here's the code:
engine = create_engine(
"postgresql+pg8000://scott:tiger#localhost/test",
isolation_level="READ UNCOMMITTED"
)
When the SQLAlchemy engine is started with the "READ UNCOMMITED" isolation_level it will perform "dirty reads" which means it will read uncommited changes directly from the database.
Hope this helps
Here is a possible solution courtesy of AaronD in the comments
from flask.ext.sqlalchemy import SQLAlchemy
class UnlockedAlchemy(SQLAlchemy):
def apply_driver_hacks(self, app, info, options):
if "isolation_level" not in options:
options["isolation_level"] = "READ COMMITTED"
return super(UnlockedAlchemy, self).apply_driver_hacks(app, info, options)

Additionally to zzzeek excellent answer,
I had a similar issue. I solved the problem by using short living sessions.
with closing(new_session()) as sess:
# do your stuff
I used a fresh session per task, task group or request (in case of web app). That solved the "caching" problem for me.
This material was very useful for me:
When do I construct a Session, when do I commit it, and when do I close it

This was happening in my Flask application, and my solution was to expire all objects in the session after every request.
from flask.signals import request_finished
def expire_session(sender, response, **extra):
app.db.session.expire_all()
request_finished.connect(expire_session, flask_app)
Worked like a charm.

I have tried session.commit(), session.flush() none worked for me.
After going through sqlalchemy source code, I found the solution to disable caching.
Setting query_cache_size=0 in create_engine worked.
create_engine(connection_string, convert_unicode=True, echo=True, query_cache_size=0)

First, there is no cache for SQLAlchemy.
Based on your method to fetch data from DB, you should do some test after database is updated by others, see whether you can get new data.
(1) use connection:
connection = engine.connect()
result = connection.execute("select username from users")
for row in result:
print "username:", row['username']
connection.close()
(2) use Engine ...
(3) use MegaData...
please folowing the step in : http://docs.sqlalchemy.org/en/latest/core/connections.html
Another possible reason is your MySQL DB is not updated permanently. Restart MySQL service and have a check.

As i know SQLAlchemy does not store caches, so you need to looking at logging output.

How to use Peewee with Tornado perfectly

I'm useing peewee with my tornado webapp,when I read peewee's document,I found:
Adding Request Hooks
When building web-applications, it is very important that you manage your database connections correctly. In this section I will describe how to add hooks to your web app to ensure the database connection is handled properly.
These steps will ensure that regardless of whether you’re using a simple SQLite database, or a pool of multiple Postgres connections, peewee will handle the connections correctly.
http://docs.peewee-orm.com/en/latest/peewee/database.html
Insides,it tells how Flask Django Bottle...to use that except the solution for Tornado
I wonder it's a easy way for tornado to solve this problem? Or this doesn't matter at all?

The idea there is that you want to open a connection when a request begins, and close it when the request is finished (the response is returned).
To do this it looks like you can subclass RequestHandler:
from tornado.web import RequestHandler
db = SqliteDatabase('my_db.db')
class PeeweeRequestHandler(RequestHandler):
def prepare(self):
db.connect()
return super(PeeweeRequestHandler, self).prepare()
def on_finish(self):
if not db.is_closed():
db.close()
return super(PeeweeRequestHandler, self).on_finish()

Flask-SQLAlchemy and SQLAlchemy

I'm building a small website and I already have all my models in SQLAlchemy. The website is to publish some information from some calculations which are done offline. Only the results will be published to a slimmed down database i.e. it contains the results, not the raw data, but the website needs to query the results.
I'm going to use Flask, as my models are already driven with Python (and some heavy lifting in C++ via SWIG) and I don't want to use Django.
Now this has been asked before I'm sure and the usual mantra without much justification is to 'use Flask-SQLAlchemy'. The question is why?
If I write some session handling myself, why do I have to go through the additional layer of redefining my database in Flask-SQLAlchemy. Other than having to write some code like here in my Flask app somewhere:
#app.before_request
def before_request():
g.db = connect_db()
#app.teardown_request
def teardown_request(exception):
db = getattr(g, 'db', None)
if db is not None:
db.close()
What else do I need to worry about? SQLAlchemy even does connection pooling for me by default.

Actually, you are building a web application using Flask which do database related kinds of stuff by sqlalchemy. So, when you are dealing with database session with multiple requests which your application handles, then you have to ascertain that, you are creating and closing sessions cautiously.
If you read SQLAlchemy docs, they recommend to keep the lifecycle of the session separate and external from functions and objects that access and/or manipulate database data. This will greatly help with achieving a predictable and consistent transactional scope.
A web application is the easiest case because such an application is already constructed around a single, consistent scope - this is the request, which represents an incoming request from a browser, the processing of that request to formulate a response, and finally the delivery of that response back to the client. Integrating web applications with the Session is then the straightforward task of linking the scope of the Session to that of the request. The Session can be established as the request begins, or using a lazy initialization pattern which establishes one as soon as it is needed. The request then proceeds, with some system in place where application logic can access the current Session in a manner associated with how the actual request object is accessed. As the request ends, the Session is torn down as well, usually through the usage of event hooks provided by the web framework. The transaction used by the Session may also be committed at this point, or alternatively the application may opt for an explicit commit pattern, only committing for those requests where one is warranted, but still always tearing down the Session unconditionally at the end.
In laymen terms, i mean to say that
In SQLAlchemy, above action is mentioned because sessions in web application should be scoped, meaning that each request handler creates and destroys its own session.
This is necessary because web servers can be multi-threaded, so multiple requests might be served at the same time, each working with a different database session.
This means that if you are using SqlAlchemy with Flask, you have to manually handle session like creating scoped session and also remove them on each request cautiously otherwise you may be in deep shit, which adds extra layer of complexity to your web application.
But, there comes Flask-SqlAlchemy (an extension of sqlalchemy library for Flask app) which provides infrastructure to assist in the task of aligning the lifespan of a Session with that of each web request. Actually, you can also found that in SqlAlchmey docs, they also recommend to use this with Flask.
Flask-SQLAlchemy creates a fresh/new scoped session for each request. If you dig further it you will find out here, it also installs a hook on app.teardown_appcontext (for Flask >=0.9), app.teardown_request (for Flask 0.7-0.8), app.after_request (for Flask <0.7) and here is where it calls db.session.remove().

The code you put in the question is not actually valid for Sqlalchemy integration is Flask. I know it is just example, but saying that just in case.
For Sqlalchemy integration all you need to do is to make sure current DbSession is cleaned up at the end of request via something like this:
#app.teardown_appcontext
def shutdown_session(exception=None):
DbSession.remove()
where DbSession is scoped session.
Here is documentation for the case when you dont want to use Flask-Sqlalchemy package.

sqlalchemy update not commiting changes to database. Using single connection in an app

I'm using sqlalchemy in a pyqt desktop application. When I execute and update to the database the changes are not reflected in the database, if I inspect the session object 'sqlalchemy.orm.scoping.ScopedSession', it tell's me that my object is not present in the session, but if I try to added It tells me that is already present.
The way I'm managing the connections is the following, when the application starts I open a connection and keep it open all the user session, when the application is closed I close the connection. Thus all queries are performed opening only one connection.
selected_mine = Mine.query.filter_by(mine_name="some_name").first()
''' updating the object attributes '''
selected_mine.region = self.ui.region.text()
self.sqlite_model.conn.commit()
I've inspect the sessions, and there are two different objects (I don't know why).
s=self.sqlite_model.conn()
>>> s
<sqlalchemy.orm.session.Session object at 0x30fb210>
s.object_session(selected_mine)
>>> <sqlalchemy.orm.session.Session object at 0x30547d0>
how do I solve this? why the commit it's not working?
I'm creating the session in the class SqliteModel (class of the object self.sqlite_model)
Session = scoped_session(sessionmaker(autocommit=False, bind=self.engine))
self.conn = Session()
Base.query = Session.query_property()
''' connect to database '''
Base.metadata.create_all(bind=self.engine)

You don't show what the Mine class is, but I believe your issue lies with this line:
selected_mine = Mine.query.filter_by(mine_name="some_name").first()
You are not actually using the session that you originally setup to query the database thus a new session is being created for you. I believe your code should look like this instead:
selected_mine = self.conn.query(Mine).filter_by(mine_name="some_name").first()
If you haven't done so yet, you should definitely glance through the excellent documentation that SQLAlchemy has made available explaining what a Session is and how to use it.

scoped_session(sessionmaker()) or plain sessionmaker() in sqlalchemy?

I am using SQlAlchemy in my web project. What should I use - scoped_session(sessionmaker()) or plain sessionmaker() - and why? Or should I use something else?
## model.py
from sqlalchemy import *
from sqlalchemy.orm import *
engine = create_engine('mysql://dbUser:dbPassword#dbServer:dbPort/dbName',
pool_recycle=3600, echo=False)
metadata = MetaData(engine)
Session = scoped_session(sessionmaker())
Session.configure(bind=engine)
user = Table('user', metadata, autoload=True)
class User(object):
pass
usermapper = mapper(User, user)
## some other python file called abc.py
from models import *
def getalluser():
session = Session()
session.query(User).all()
session.flush()
session.close()
## onemore file defg.py
from models import *
def updateuser():
session = Session()
session.query(User).filter(User.user_id == '4').update({User.user_lname: 'villkoo'})
session.commit()
session.flush()
session.close()
I create a session = Session() object for each request and I close it. Am I doing the right thing or is there a better way to do it?

Reading the documentation is recommended:
the scoped_session() function is provided which produces a thread-managed registry of Session objects. It is commonly used in web applications so that a single global variable can be used to safely represent transactional sessions with sets of objects, localized to a single thread.
In short, use scoped_session() for thread safety.

Scoped_session at every method will give you a thread of local session which you cannot obtain beforehand (like at the module level).It's not needed to open a new session in every method, You can use a global session , Create a session only when the global session is not available. i.e you can write a method which returns a session and add it to the init.py inside your package.

Don't use scoped_session and don't use Flask-SQLAlchemy.
Just use Session = sessionmaker() held in a singleton/service class, and use session = Session() on every HTTP request to guarantee that a fresh connection is provided.
Thread Local storage is clumsy and involves holding state which doesn't play nicely with different web-server threading models. Better to stay stateless. See for example SqlAlchemy's documentation here mentioning not to forget to call .remove() if you are using scoped_session. Will anyone remember to do that?
Below is an excerpt from https://docs.sqlalchemy.org/en/14/orm/contextual.html#using-thread-local-scope-with-web-applications:
Using the above flow, the process of integrating the Session with the web application has exactly two requirements:
Create a single scoped_session registry when the web application first starts, ensuring that this object is accessible by the rest of the application.
Ensure that scoped_session.remove() is called when the web request ends, usually by integrating with the web framework’s event system to establish an “on request end” event.

FYI, when using flask-sqlalchemy, the session object provided is by default a scoped session object.
http://flask-sqlalchemy.pocoo.org/2.3/quickstart/#road-to-enlightenment

I am looking into this myself, but I am not an expert.
My three points are:
SQLAlchemy docs provide a proposed approach using scoped_session, per Mr. Kluev's comment above, at this link: http://docs.sqlalchemy.org/en/rel_0_9/orm/session.html#using-thread-local-scope-with-web-applications.
At that web location, the SQLAlchemy docs also say that it is "...strongly recommended that the integration tools provided with the web framework itself be used, if available, instead of scoped_session."
Flask-SQLAlchemy, for example, appears to claim that it takes care of this: http://pythonhosted.org/Flask-SQLAlchemy/quickstart.html#a-minimal-application

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.