Issue with stale data in Flask/SQLAlchemy - python

I have the following setup, in which session.query() makes SQLAlchemy return stale data:
A web application running on Flask with Gunicorn + supervisor.
One of the services is composed in this way:
app.py:
@app.route('/api/generatepoinvoice', methods=["POST"])
@auth.login_required
def generate_po_invoice():
    try:
        po_id = request.json['po_id']
        email = request.json['email']
        return jsonify(response=POInvoiceGenerator.get_invoice(po_id, email))
    except Exception as ex:
        app.logger.error("generate_po_invoice(): " + ex.message)
In another folder I have the database-related code:
DatabaseModels (folder)
|-->Model.py
|-->Connection.py
This is what the Connection.py file contains:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session
from sqlalchemy.ext.declarative import declarative_base
engine = create_engine(DB_BASE_URI, isolation_level="READ COMMITTED")
Session = scoped_session(sessionmaker(bind=engine))
session = Session()
Base = declarative_base()
And this is an extract of the Model.py file:
from DatabaseModels.Connection import Base
from sqlalchemy import Column, String, etc...
class Po(Base):
    __tablename__ = 'PLC_PO'
    id = Column("POId", Integer, primary_key=True)
    code = Column("POCode", String(50))
    etc...
Then I have another file, POInvoiceGenerator.py, that contains the call to the database for fetching some data:
import DatabaseModels.Connection as connection
import DatabaseModels.model as model
def get_invoice(po_code, email):
    try:
        po_code = po_code.strip()
        connection.session.expire_all()
        po = connection.session.query(model.Po).filter(model.Po.code == po_code).first()
    except Exception as ex:
        logger.error("get_invoice(): " + ex.message)
In subsequent user calls to this service I sometimes start to get errors such as "could not find data in the db for that specific code", as if the data were stale.
My first approach was to add isolation_level="READ COMMITTED" to the engine declaration and then to create a scoped session, but the stale reads keep happening.
Does anyone have an idea whether my setup is wrong (the session and the model are reused among multiple methods and files)?
Thanks in advance.

Even though the solution pointed out by @TonyMountax seems valid and made me discover something that I didn't know about SQLAlchemy, in the end I opted for something different.
I figured out that the connections established by SQLAlchemy were long-lived, since they were taken from a connection pool every time, and this was somehow causing the data to be stale.
I added a NullPool to my code:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session
from sqlalchemy.pool import NullPool
engine = create_engine(DB_URI, isolation_level="READ COMMITTED", poolclass=NullPool)
Session = scoped_session(sessionmaker(bind=engine))
session = Session()
And then I call session.close() after every query that I make:
session.query("some query..")
session.close()
This causes SQLAlchemy to open a new connection every time and get fresh data from the db.
I hope that this is the correct way to use it and that it might be useful to someone else.

The way you instantiate your database connections means that they are reused for the next request, and they have some state left from the previous request. SQLAlchemy uses a concept of sessions to interact with the database, so that your data does not abruptly change in a single request even if you happen to perform the same query twice. This makes sense when you are using the ORM query features. For instance, if you were to query len(User.friendlist) twice during the same session, but a friend request was accepted during the request, then it will still show the same number in both locations.
To fix this, you must set up the session on first request, then you must tear it down when the request is finished. To do so is not trivial, but there is a well-established project that does it already: Flask-SQLAlchemy. It's from Pallets, the people behind Flask itself and Jinja2.
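For readers who would rather not add Flask-SQLAlchemy, the set-up and tear-down cycle described above can also be written by hand. What follows is only a minimal sketch, not the answerer's code: it assumes SQLAlchemy 1.4 or later, uses an in-memory SQLite database in place of the real one, and the `session_scope` helper, the `get_po_code` function and the `PO-001` value are hypothetical names chosen for illustration.

```python
from contextlib import contextmanager

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Po(Base):
    __tablename__ = "PLC_PO"
    id = Column("POId", Integer, primary_key=True)
    code = Column("POCode", String(50))


# Assumption: in-memory SQLite stands in for the real database URI.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)


@contextmanager
def session_scope():
    """Hypothetical helper: one fresh session per unit of work."""
    session = Session()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        # Always release the connection and drop any cached objects.
        session.close()


def get_po_code(po_code):
    # Each call opens its own session, so no state survives between calls.
    with session_scope() as session:
        po = session.query(Po).filter(Po.code == po_code.strip()).first()
        return po.code if po is not None else None


with session_scope() as session:
    session.add(Po(code="PO-001"))

found = get_po_code(" PO-001 ")
missing = get_po_code("PO-999")
```

In a Flask application the body of a view function would take the place of `get_po_code`, so that each request works with its own short-lived session instead of a module-level one.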

Related

Where to session.commit() for a SELECT query - SQLALchemy, Flask

My application does not update the database - all queries are SELECT statements. I'm struggling with how best to handle direct changes to the database (i.e. opening MySQLWorkbench and changing data there). Without session.commit(), my Flask application returns stale data.
My solution right now is to have a session.commit() as the first line of each Flask endpoint, but I feel this is the incorrect way of handling this.
Session creation at start of app:
engine = db.create_engine('mysql+pymysql://...')
connection = engine.connect()
metadata = db.MetaData()
Base = declarative_base()
Session = sessionmaker(autoflush=True)
Session.configure(bind=engine)
session = Session()
Use session.expire_all() to mark all session data as expired. Then, when you try to access something, it will be fetched from the database again.
session.expire(object) does the same, but for a single object only.
db.session.refresh(some_object) expires and immediately reloads all of that object's data.
A nice article about this can be found here: https://www.michaelcho.me/article/sqlalchemy-commit-flush-expire-refresh-merge-whats-the-difference
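The difference between these calls can be seen in a small experiment. This is a sketch under stated assumptions rather than a reproduction of the asker's setup: it assumes SQLAlchemy 1.4 or later, uses in-memory SQLite instead of MySQL, and the `Person` model is hypothetical. A raw UPDATE issued through `text()` stands in for an edit made outside the ORM; it does not model transaction isolation between separate connections.

```python
from sqlalchemy import Column, Integer, String, create_engine, text
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Person(Base):
    __tablename__ = "persons"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))


engine = create_engine("sqlite://")  # assumption: in-memory SQLite
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

person = Person(id=1, name="Marie")
session.add(person)
session.commit()
loaded_name = person.name  # attribute is loaded and cached on the object

# Change the row behind the ORM's back; the cached attribute is untouched.
session.execute(text("UPDATE persons SET name = 'Paul' WHERE id = 1"))
stale_name = person.name

# refresh() re-reads this one object immediately.
session.refresh(person)
refreshed_name = person.name

# expire_all() only marks objects as expired; the reload happens on access.
session.execute(text("UPDATE persons SET name = 'Anna' WHERE id = 1"))
session.expire_all()
name_after_expire = person.name

session.close()
```

Here `stale_name` still holds the value loaded before the UPDATE, while the values read after `refresh()` and `expire_all()` come from the database again.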

How to keep session active for long time in sqlalchemy?

I have code that runs each query from a list of queries. These queries are long and take quite a long time to execute. Since I am executing them in a loop, the session seems to expire and I get an error telling me that the connection to the server was lost.
I then created the session as well as the engine inside the loop (closing the session and disposing of the engine at the end of each iteration). I understand that creating a new connection is an expensive operation.
How can I re-use the connection in this case, so that I do not have to create the session and engine each time?
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
# an Engine, which the Session will use for connection
# resources
some_engine = create_engine('mysql://user:password@localhost/')
# create a configured "Session" class
Session = sessionmaker(bind=some_engine)
# create a Session
session = Session()
for long_query in long_query_list:
    # work with sess
    session.execute(long_query)
    session.commit()

'module' object is not callable with sqlalchemy

I'm totally new to SQLAlchemy and PostgreSQL. I read this tutorial to build the following piece of code:
import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy import engine
def connect(user, password, db, host='localhost', port=5432):
    '''Returns a connection and a metadata object'''
    # We connect with the help of the PostgreSQL URL
    # postgresql://federer:grandestslam@localhost:5432/tennis
    url = 'postgresql://{}:{}@{}:{}/{}'
    url = url.format(user, password, host, port, db)
    # The return value of create_engine() is our connection object
    con = sqlalchemy.create_engine(url, client_encoding='utf8')
    # We then bind the connection to MetaData()
    meta = sqlalchemy.MetaData(bind=con, reflect=True)
    return con, meta
con, meta = connect('federer', 'grandestslam', 'tennis')
con
engine('postgresql://federer:***@localhost:5432/tennis')
meta
MetaData(bind=Engine('postgresql://federer:***@localhost:5432/tennis'))
When running it I get this error:
  File "test.py", line 22, in <module>
    engine('postgresql://federer:***@localhost:5432/tennis')
TypeError: 'module' object is not callable
What should I do? Thanks!
So, your problem is happening because you've made this call:
from sqlalchemy import engine
And then you've used this later in the file:
engine('postgresql://federer:***@localhost:5432/tennis')
Here engine is the sqlalchemy.engine module, not a function, so calling it raises the TypeError. Strangely, in that section, you have some statements that are just con and meta with no assignments or calls or anything. I'm not sure what you're doing there. I would suggest that you check out SQLAlchemy's page on engine and connection use to help get you sorted.
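The same error can be reproduced without a database. The snippet below is only an illustration: the standard-library `json` module plays the role of `sqlalchemy.engine`, and `json.loads` plays the role of a real callable such as `sqlalchemy.create_engine`.

```python
import json  # a module, just like `engine` in `from sqlalchemy import engine`

try:
    json('{"a": 1}')  # calling the module object itself
except TypeError as exc:
    error_message = str(exc)

# Calling a function that lives inside the module works as expected.
parsed = json.loads('{"a": 1}')
```

The fix in the question follows the same shape: call `sqlalchemy.create_engine(url)`, or simply drop the stray `engine(...)` line, rather than calling the imported module.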
It will of course depend on exactly how you've set up your database. I used the declarative_base function in one of my projects, so my process of setting up a session to connect to my DB looks like this:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
# Connect to Database and create database session
engine = create_engine('postgresql://catalog:catalog@localhost/menus')
Base.metadata.bind = engine
DBSession = sessionmaker(bind=engine)
session = DBSession()
And in my database setup file, I've assigned:
Base = declarative_base()
But you'll have to customize it a bit to your particular setup. I hope that helps.
Edit: I see now where those calls to con and meta were coming from, as well as your other confusing lines, it's part of the tutorial you linked to. What he was doing in that tutorial was using the Python interpreter in command line. I'll explain a few of the things he did there in the hope that it helps you some more. Lines beginning with >>> are what he enters in as commands. The other lines are the output he receives back.
>>> con, meta = connect('federer', 'grandestslam', 'tennis') # he creates the connection and meta objects
>>> con # now he calls the connection by itself to have it show that it's connected to his DB
Engine(postgresql://federer:***@localhost:5432/tennis)
>>> meta # here he calls his meta object to show how it, too, is connected
MetaData(bind=Engine(postgresql://federer:***@localhost:5432/tennis))

How does Flask start a new SQLAlchemy transaction at the start of each request?

I tried to totally separate Flask and SQLAlchemy using this method, but Flask still seems to be able to detect my database and start a new transaction at the beginning of each request.
The db.py file creates a new session and defines a simple model of a table:
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, String
engine = create_engine("mysql://web:kingtezdu@localhost/web_unique")
print("creating new session")
db_session = scoped_session(sessionmaker(bind=engine))
Base = declarative_base()
Base.query = db_session.query_property()
# define model of 'persons' table
class Person(Base):
    __tablename__ = "persons"
    name = Column(String(30), primary_key=True)
    def __repr__(self):
        return "Person(\"{0.name}\")".format(self)
# create table
Base.metadata.create_all(bind=engine)
And app.py, a simple Flask application using the SQLAlchemy session and model:
from flask import Flask, escape
app = Flask(__name__)
# importing new session
from db import db_session, Person
# registering for app teardown to remove session
@app.teardown_appcontext
def shutdown_session(exception=None):
    db_session.remove()
@app.route("/query")
def query():
    # query all persons in the database
    all_persons = Person.query.all()
    print all_persons
    return "" # we use the console output
if __name__ == "__main__":
    app.run(debug=True)
Let's run this:
$ python app.py
creating new session
* Running on http://127.0.0.1:5000/
* Restarting with reloader
creating new session
Weirdly enough it runs db.py twice, but we can ignore this. Let's access the webpage /query:
[]
127.0.0.1 - - [23/Dec/2015 18:20:14] "GET /query HTTP/1.1" 200 -
We can see that our request was answered, though we only use the console output. There is no Person in the database yet, let's add one:
mysql> INSERT INTO persons (name) VALUES ("Marie");
Query OK, 1 row affected (0.11 sec)
Marie is part of the database now so we reload the webpage:
[Person("Marie")]
127.0.0.1 - - [23/Dec/2015 18:24:48] "GET /query HTTP/1.1" 200 -
As you can see, the session already knows about Marie. Flask didn't create a new session. That means that a new transaction was started. Contrast this with the plain Python example below to see the difference.
My question is how Flask is able to start a new transaction at the beginning of each request. Flask shouldn't know about the database but seems to be able to change something about its behaviour.
In case you don't know what a SQLAlchemy transaction is, read this paragraph extracted from Managing Transactions:
When the transactional state is completed after a rollback or commit,
the Session releases all Transaction and Connection resources, and
goes back to the “begin” state, which will again invoke new Connection
and Transaction objects as new requests to emit SQL statements are
received.
So a transaction is ended by a commit and will cause a new connection to be set up which will then make the session read the database again. In reality this means that you have to commit when you want to see changes made to the database:
First in interactive python mode:
>>> from db import db_session, Person
creating new session
>>> Person.query.all()
[]
Switch over to MySQL and insert a new Person:
mysql> INSERT INTO persons (name) VALUES ("Paul");
Query OK, 1 row affected (0.03 sec)
Finally try to load Paul into our session:
>>> Person.query.all()
[]
>>> db_session.commit()
>>> Person.query.all()
[Person("Paul")]
I think the issue here is that scoped_session somewhat hides what happens to the actual sessions in use. When your teardown handler
# registering for app teardown to remove session
@app.teardown_appcontext
def shutdown_session(exception=None):
    db_session.remove()
runs at the end of each request, you call db_session.remove() which disposes of the session used in that particular request along with any transaction context. See http://docs.sqlalchemy.org/en/latest/orm/contextual.html for the details, particularly
The scoped_session.remove() method first calls Session.close() on the
current Session, which has the effect of releasing any
connection/transactional resources owned by the Session first, then
discarding the Session itself. “Releasing” here means that connections
are returned to their connection pool and any transactional state is
rolled back, ultimately using the rollback() method of the underlying
DBAPI connection.
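The effect of remove() on the registry can be observed directly. The following is a minimal sketch, assuming an in-memory SQLite engine in place of the MySQL one; the variable names are illustrative and no Flask application is involved.

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine("sqlite://")  # assumption: in-memory SQLite
db_session = scoped_session(sessionmaker(bind=engine))

first = db_session()   # session used during the "first request"
same = db_session()    # same thread, so the registry hands back the same object
db_session.remove()    # what the teardown handler does after each request
second = db_session()  # the "next request" gets a brand new session

same_within_request = first is same
new_after_remove = second is not first
db_session.remove()
```

So although `db_session` is created only once at import time, each request that ends with remove() is followed by a fresh Session, and therefore a fresh transaction, on the next one.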

flask sqlalchemy keeps Postgres transaction idle

I'm writing a Flask API using Flask-RESTful, SQLAlchemy, Postgres, nginx and uWSGI. I'm a newbie to Python. This is my configuration:
database.py
from cuewords import app
from flask.ext.sqlalchemy import SQLAlchemy
from sqlalchemy.pool import NullPool
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
from sqlalchemy import Column, Integer, String , Text , Boolean , DateTime, MetaData, Table ,Sequence
from sqlalchemy.dialects.postgresql import JSON
Base = declarative_base()
db_name="postgresql+psycopg2://admin:password@localhost:5434/database_name"
from sqlalchemy.orm import sessionmaker
engine = create_engine(db_name,poolclass=NullPool ,echo=True)
Session = sessionmaker(autocommit=False ,bind=engine)
connection = engine.connect()
metadata = MetaData()
api.py
class Webcontent(Resource):
    def post(self):
        session=Session()
        ...assign some params...
        try:
            insert_data=Websitecontent(site_url,publisher_id)
            session.add(insert_data)
            session.commit()
            Websitecontent.update_url(insert_data.id,session)
        except:
            session.rollback()
            raise
        finally:
            return "Data created "
            session.close()
        else:
            return "some value"
Here I'm first saving just the URL, then saving all the content of the site using boilerpipe later. The idea is to move this to a queue later.
model.py
class Websitecontent(Base):
    @classmethod
    def update_url(self,id,session):
        existing_record=session.query(Websitecontent).filter_by(id=int(id)).first()
        data=Processing.processingContent(str(existing_record.url))
        #boilerpipe processing the content here
        #assigning some data to existing record in session
        session.add(existing_record)
        session.commit()
        Websitecontent.processingWords(existing_record,session)
    @classmethod
    def processingWords(self,record,session):
        ...processing
        Websitecontent.saveKeywordMapping(session,keyword_url,record)
    @classmethod
    def saveKeywordMapping(self,session,keyword_url,record):
        session.commit()
        session.close()
This code works perfectly locally, but it doesn't work in production. When I check pg_stat_activity, it shows the state "idle in transaction". The app hangs and then I have to restart the servers. I don't understand why session.close() does not close the pool connection and why it keeps the psql transaction state busy. Any help would be really appreciated.
You are returning before closing the session:
return "Data created "
session.close()
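Because the return comes first, session.close() is never reached and the connection stays open. Returning inside finally also discards the exception re-raised by the except block, so the caller never sees it.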
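One way to restructure the handler is to close the session in finally and return afterwards. This is a sketch of the control flow only: `FakeSession` is a hypothetical stand-in that records which methods were called, so the example runs without a database, and `post` is a simplified version of the method in the question.

```python
class FakeSession:
    """Hypothetical stand-in for a SQLAlchemy session; it only records calls."""

    def __init__(self, fail_on_commit=False):
        self.fail_on_commit = fail_on_commit
        self.calls = []

    def commit(self):
        if self.fail_on_commit:
            raise RuntimeError("commit failed")
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


def post(session):
    try:
        session.commit()
    except Exception:
        session.rollback()
        raise  # propagates, because finally no longer returns
    finally:
        session.close()  # runs on success and on failure
    return "Data created"  # reached only when no exception was raised


ok_session = FakeSession()
result = post(ok_session)

bad_session = FakeSession(fail_on_commit=True)
try:
    post(bad_session)
    raised = False
except RuntimeError:
    raised = True
```

With this ordering the session is closed in both cases, and a failed commit still surfaces to the caller after the rollback.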
