How does a session work with SQLAlchemy with a pool?

engine = db.create_engine(self.url, convert_unicode=True, pool_size=5, pool_recycle=1800, max_overflow=10)
connection = self.engine.connect()
Session = scoped_session(sessionmaker(bind=self.engine, autocommit=False, autoflush=True))
I initialize my session like this.
def first():
    with Session() as session:
        second(session)

def second(session):
    session.add(obj)
    third(session)

def third(session):
    session.execute(query)
And I use my session like this.
I thought the pool assigns one connection to each session, so I expected the above code to work even with pool_size=1, max_overflow=0. But when I configure it that way, it gets stuck and raises an exception like:
descriptor '__init__' requires a 'super' object but received a 'tuple'
Why is that? Does a session take more than one connection from the pool, rather than exactly one?
And when using the session in a with block, can I skip explicit commit and rollback when an exception occurs?
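For reference, SQLAlchemy 1.4+ documents two context-manager forms with different commit behaviour; the following is a minimal sketch, assuming a plain sessionmaker in place of the scoped_session above:
from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)

# Form 1: the plain session context manager only guarantees close() on
# exit; anything not committed is rolled back when the session closes.
with Session() as session:
    session.add(obj)
    session.commit()  # still required

# Form 2: sessionmaker.begin() commits on success and rolls back
# automatically if the block raises, so no explicit commit is needed.
with Session.begin() as session:
    session.add(obj)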

Related

scoped_session.close() in sqlalchemy

I am using scoped_session from SQLAlchemy (Python) for my APIs.
class DATABASE():
    def __init__(self):
        engine = create_engine(
            'mssql+pyodbc:///?odbc_connect=%s' % (
                urllib.parse.quote_plus(
                    'DRIVER={/usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so};SERVER=localhost;'
                    'DATABASE=db1;UID=sa;PWD=admin;port=1433;'
                )),
            isolation_level='READ COMMITTED',
            connect_args={'options': '-c lock_timeout=30 -c statement_timeout=30', 'timeout': 40},
            max_overflow=10, pool_size=30, pool_timeout=60)
        session = sessionmaker(bind=engine)
        self.Session = scoped_session(session)

    def calculate(self, book_id):
        session = self.Session
        output = None
        try:
            result = session.query(Book).get(book_id)
            if result:
                output = result.pages
        except:
            session.rollback()
        finally:
            session.close()
        return output

    def generate(self):
        session = self.Session
        try:
            result = session.query(Order).filter(Order.product_name == 'book').first()
            pages = self.calculate(result.product_id)
            if not output:
                result.product_details = str(pages)
                session.commit()
        except:
            session.rollback()
        finally:
            session.close()
        return output

database = DATABASE()
database.generate()
Here, the session is not committing. Stepping through the code: the generate function calls the calculate function, and there, once the calculation completes, the session is closed. Because of this, the changes made in the generate function are not committed to the database.
If I remove the session.close() from the calculate function, the changes made in generate are committed to the database.
The blogs recommend closing the session after the API has finished accessing the database.
How do I resolve this, and what is the flow of SQLAlchemy?
Thanks
Scoped sessions default to being thread-local, so as long as you are in the same thread, the factory (self.Session in this case) will always return the same session. So calculate and generate are both using the same session, and closing it in calculate rolls back the changes made in generate.
Moreover, scoped sessions should not be closed directly; they should be removed from the session registry (by calling self.Session.remove()), which closes them automatically.
You should work out where in your code you have finished with the session, and remove it there and nowhere else. It is probably best to commit or roll back in the same place. In the code in the question I'd remove the rollback and close from calculate, as sketched below.
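A minimal sketch of that restructuring, keeping the models and scoped_session setup from the question (one possible layout, not the only correct one):
def calculate(self, book_id):
    # No rollback/close here: this helper only reads; the caller owns the session lifecycle.
    session = self.Session()
    result = session.query(Book).get(book_id)
    return result.pages if result else None

def generate(self):
    session = self.Session()
    output = None
    try:
        result = session.query(Order).filter(Order.product_name == 'book').first()
        pages = self.calculate(result.product_id)
        if pages is not None:
            result.product_details = str(pages)
            session.commit()
            output = pages
    except Exception:
        session.rollback()
        raise
    finally:
        self.Session.remove()  # closes the session and discards it from the registry
    return output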
The docs on When do I construct a Session, when do I commit it, and when do I close it? and Contextual/Thread-local Sessions should be helpful.

SQLAlchemy not identifying Python pool threads as separate processes

I am trying to convert my single-threaded application, which accesses a database through SQLAlchemy, into a multithreaded application. I found that the SQLAlchemy session is not thread-safe, so we need to use the scoped_session factory for thread-safe db access.
Below is my input dataset:
input_list = [data1, data2, data3, data4, data5]
Single-threaded application:
from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)  # bind expects an Engine, not a URL string

def myfunction(data):
    db_session = Session()
    print(db_session)
    # use db_session to query/store the data

for data in input_list:
    myfunction(data)
When I try to convert it to a multithreaded application:
from multiprocessing.pool import ThreadPool
from sqlalchemy.orm import sessionmaker, scoped_session

Session = scoped_session(sessionmaker(bind=engine))

def myfunction(data):
    db_session = Session()
    print(db_session)
    # use db_session to query/store the data

def myfunction_parallel():
    with ThreadPool(4) as pool:
        output = pool.map(myfunction, input_list)
In the multithreaded variant, I am getting the same db_session object, but my expectation was that a new session object would be created for each thread, so shouldn't the sessions be different?
The scoped session registry registers a session for each thread that requests one. This enables code to call db_session = Session() and get the expected session for the current thread.
However, it is the application's responsibility to inform the session registry when a session is no longer required. The application does this by calling Session.remove(), as documented here:
The scoped_session.remove() method first calls Session.close() on the current Session, which has the effect of releasing any connection/transactional resources owned by the Session first, then discarding the Session itself. “Releasing” here means that connections are returned to their connection pool and any transactional state is rolled back, ultimately using the rollback() method of the underlying DBAPI connection.
At this point, the scoped_session object is “empty”, and will create a new Session when called again.
This code should work as expected:
def myfunction(data):
    db_session = Session()
    print(db_session)
    # use db_session to query/store the data
    Session.remove()
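To see the per-thread behaviour directly, log the worker thread's name next to the session identity; each thread gets its own Session from the registry. A self-contained sketch (the in-memory SQLite engine is purely for illustration):
import threading
from multiprocessing.pool import ThreadPool
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session

engine = create_engine('sqlite://')
Session = scoped_session(sessionmaker(bind=engine))

def show_session(_):
    db_session = Session()  # the registry hands back this thread's own session
    print(threading.current_thread().name, id(db_session))
    Session.remove()  # release it so the thread's registry slot is empty again

with ThreadPool(4) as pool:
    pool.map(show_session, range(8))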

closing session in sqlalchemy

I have created a method in a separate Python file. Whenever I have to get any data from the database, I call this method.
Now I am running a for loop in which every iteration makes a db call to the method below, for example:
def get_method(self, identifier):
    sess = session.get_session()
    id = sess.query(..).filter(I.. == ..)
    return list(id)[0]

def get_session():
    engine = create_engine('postgresql+psycopg2://postgres:postgres@localhost/db', echo=True)
    Session = sessionmaker(engine)
    sess = Session()
    return sess
I am getting FATAL: sorry, too many clients already, probably because I am not closing the sess object. But even after closing it, I get the same issue.
How do I handle this?
You shouldn't be opening your session within the for loop. Do that before your loop begins, and close it after you're finished with your transactions, as in the sketch below. The documentation is helpful here: when to open and close sessions
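A minimal sketch of that shape, with the engine created once at module level and a single session reused across the loop (Item is a hypothetical stand-in for the model elided in the question):
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# One engine (and one connection pool) for the whole process.
engine = create_engine('postgresql+psycopg2://postgres:postgres@localhost/db', echo=True)
Session = sessionmaker(engine)

def process_all(identifiers):
    sess = Session()  # one session for the whole batch, not one per iteration
    try:
        for identifier in identifiers:
            obj = sess.query(Item).filter(Item.id == identifier).first()  # hypothetical Item model
            # ... work with obj ...
    finally:
        sess.close()  # returns the underlying connection to the pool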

PSQL connection closing time delay using SQLAlchemy

Why does there appear to be a significant delay between calling session.close() and the session actually closing?
I'm "using up" connections in a way that doesn't feel right. Is there a better way to do this or a design pattern I'm missing?
Following the guide here, I use the following code, copied for completeness:
from contextlib import contextmanager

@contextmanager
def session_scope():
    """Provide a transactional scope around a series of operations."""
    session = Session()
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()

def run_my_program():
    with session_scope() as session:
        ThingOne().go(session)
        ThingTwo().go(session)
This works great for reliably committing data and avoiding invalid sessions.
The problem is with hitting connection limits.
For example, say I have a page that makes 5 asynchronous calls per visit. If I visit the page and hit refresh in quick succession, it will spawn 5 * number_times_refreshed connections. They are eventually closed, but there is a non-negligible time delay.
The issue was the use of sessionmaker() and the binding of the db engine inside every request. Specifically, this not-good way:
def newSession():
    engine = create_engine(settings.DATABASE_URL)
    Base.metadata.bind = engine
    DBSession = sessionmaker(bind=engine)
    session = DBSession()
    return session

@contextmanager
def session_scope():
    session = newSession()
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()
This creates a new engine with each request. Besides being a poor pattern, it caused confusion about which "connection" was being closed. In this context we have both a "session" and a "DBSession". The session does get closed with session.close(), but that does not touch the "DBSession".
A better way (solution):
engine = create_engine(settings.DATABASE_URL)
Base.metadata.bind = engine
DBSession = sessionmaker(bind=engine)

@contextmanager
def session_scope():
    session = DBSession()
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()
This creates one engine, and therefore one connection pool, for the lifetime of the app; each session_scope() checks a connection out of that pool and returns it on close.
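One way to convince yourself the pool is being reused is to inspect it after a few scopes; QueuePool exposes a status() summary (a quick check, assuming the module-level engine above):
from sqlalchemy import text

for _ in range(5):
    with session_scope() as session:
        session.execute(text('SELECT 1'))

# One pool serves every scope, so checked-in connections are reused
# instead of a new one being opened per request.
print(engine.pool.status())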

Zombie Connection in SQLAlchemy

DBSession = sessionmaker(bind=self.engine)

def add_person(name):
    s = DBSession()
    s.add(Person(name=name))
    s.commit()
Every time I run add_person(), another connection to my PostgreSQL DB is created.
Looking at:
SELECT count(*) FROM pg_stat_activity;
I see the count going up until I get a "Remaining connection slots are reserved for non-replication superuser connections" error.
How do I kill those connections? Am I wrong to open a new session every time I want to add a Person record?
In general, you should keep your Session object (here DBSession) separate from any functions that make changes to the database. So in your case you might try something like this instead:
DBSession = sessionmaker(bind=self.engine)
session = DBSession()  # create your session outside of functions that will modify the database

def add_person(name):
    session.add(Person(name=name))
    session.commit()
Now you will not get new connections every time you add a person to the database.
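If you do prefer a short-lived session per call, an alternative sketch is to close it deterministically, so its connection goes straight back to the pool instead of lingering until garbage collection:
def add_person(name):
    s = DBSession()
    try:
        s.add(Person(name=name))
        s.commit()
    finally:
        s.close()  # hand the connection back to the pool immediately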
