I'm a newbie using SQLAlchemy and I'm working on a complex ETL process, so I wrote the simplified code below:
module1.py
class Foo:
    def foo_method(self):
        # doing stuff with the database
module2.py
class Bar:
    def bar_method(self):
        # doing stuff with the database
main_script.py
from multiprocessing import Pool
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from module1 import Foo
from module2 import Bar

def run():
    with Pool(processes=num_workers) as pool:
        responses = [pool.apply_async(some_func, (param,)) for param in params]
        for response in responses:
            response.get()

def some_func(param):
    engine = create_engine(connection_string, echo=True)
    Session = scoped_session(sessionmaker(bind=engine))
    session = Session()
    # Start doing some stuff with the database
    foo = Foo()
    foo.foo_method()
    bar = Bar()
    bar.bar_method()
So I have a Pool of worker processes. When I call main_script.run(), each worker creates a database session inside some_func. My question is: how can I use the same session for each worker in module1 and module2 without passing the session as a parameter to each method? Should I add the following lines to each module/file?
engine = create_engine(connection_string, echo=True)
Session = scoped_session(sessionmaker(bind=engine))
session = Session()
scoped_session should be created at the module level. For your project structure, that probably means having a separate module to house the engine and session:
db.py
engine = create_engine(connection_string, echo=True)
Session = scoped_session(sessionmaker(bind=engine))
module1.py
from db import Session

class Foo:
    def foo_method(self):
        session = Session()
        session.query(...)...
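With that in place, each worker process imports the shared registry instead of building its own. A minimal sketch of how some_func from the question could then look (hedged; the Session.remove() call and module layout are assumptions, not part of the answer above):

main_script.py
from db import Session
from module1 import Foo
from module2 import Bar

def some_func(param):
    # Foo and Bar resolve the same thread-local session through the registry
    foo = Foo()
    foo.foo_method()
    bar = Bar()
    bar.bar_method()
    # hand the session back to the registry once this unit of work is done
    Session.remove()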
All of a sudden, in my Celery application I'm getting lots of:
(psycopg2.errors.InFailedSqlTransaction) current transaction is aborted, commands ignored until end of transaction block
The truth is that I have no idea how to properly set up Celery to work with SQLAlchemy. I have a BaseTask that all tasks inherit from, and it looks like this:
import celery
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.orm import Session as OrmSession

session_factory = sessionmaker(
    autocommit=False,
    autoflush=False,
    bind=create_engine("postgresql://****")
)
Session = scoped_session(session_factory)

class BaseTask(celery.Task):
    def after_return(self, *args: tuple, **kwargs: dict) -> None:
        Session.remove()

    @property
    def session(self) -> OrmSession:
        return Session()
And all of my tasks (bound or not) use either self.session or {task_func}.session to make their queries. Should I rather use a context manager around my queries within the tasks, like:
from contextlib import contextmanager

@contextmanager
def session_scope():
    """Provide a transactional scope around a series of operations."""
    session = Session()
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()

@app.task()
def my_task():
    with session_scope() as session:
        do_a_query(session)
Can someone please explain to me how those sessions work? And guide me towards the correct "Celery use of SQLAlchemy"?
Thank you.
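For reference, a pattern equivalent to the after_return hook above is to remove the scoped session through Celery's task_postrun signal. A hedged sketch, assuming the Session registry defined in the snippet above:

from celery.signals import task_postrun

@task_postrun.connect
def cleanup_session(*args, **kwargs):
    # return the thread-local session to the registry after every task,
    # so the next task on this worker starts with a fresh session
    Session.remove()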
Instead of:
from sqlalchemy import create_engine
from sqlalchemy.orm import Session

# an Engine, which the Session will use for connection
# resources
engine = create_engine('sqlite:///...')

# create session and add objects
with Session(engine) as session:
    session.add(some_object)
    session.add(some_other_object)
    session.commit()
I create a sessionmaker (following the example in the documentation, see below):
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# an Engine, which the Session will use for connection
# resources, typically in module scope
engine = create_engine('postgresql://scott:tiger@localhost/')

# a sessionmaker(), also in the same scope as the engine
Session = sessionmaker(engine)

# we can now construct a Session() without needing to pass the
# engine each time
with Session() as session:
    session.add(some_object)
    session.add(some_other_object)
    session.commit()
Can I use the sessions from the sessionmaker in different threads (spawning multiple sessions at the same time)? In other words, is the sessionmaker a thread-safe object? If yes, can multiple sessions exist and read/write the same tables at the same time?
Furthermore, what is the advantage of using scoped_session? Is it related to the problem of multiple sessions (one per thread)?:
# set up a scoped_session
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
session_factory = sessionmaker(bind=some_engine)
Session = scoped_session(session_factory)
# now all calls to Session() will create a thread-local session
some_session = Session()
# you can now use some_session to run multiple queries, etc.
# remember to close it when you're finished!
Session.remove()
Session objects are not thread-safe, but they are thread-local. I recommend using sessionmaker rather than a single shared Session: it yields a new Session object every time you need one, so you are not keeping a database connection idle. I'd use the approach below.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session

DB_ENGINE = create_engine('sqlite:///...')
DB_SES_MAKER = sessionmaker(bind=DB_ENGINE)

def get_db():
    db = DB_SES_MAKER()
    try:
        yield db
    finally:
        db.close()
Then call get_db whenever needed:
db = next(get_db())
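One caveat: next(get_db()) only advances the generator, so the finally block (and db.close()) does not run until the generator is exhausted or garbage-collected. A hedged sketch of a with-friendly variant (db_session is an illustrative name, not part of the answer above):

from contextlib import contextmanager

@contextmanager
def db_session():
    # same lifecycle as get_db, but usable as a context manager
    db = DB_SES_MAKER()
    try:
        yield db
    finally:
        db.close()

with db_session() as db:
    db.query(...)  # queries run here; the session closes on exit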
I have the following simple setup:
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from sqlalchemy.orm import sessionmaker, scoped_session

def do_query():
    engine = ...
    session_factory = sessionmaker(bind=engine, autocommit=False, autoflush=False)
    ThreadedSession = scoped_session(session_factory)
    f = partial(
        _query_function,
        session=ThreadedSession,
    )
    queries = [...]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        pool.map(f, queries)
    ThreadedSession.commit()

def _query_function(query, session):
    s = session()
    s.execute(query)
    return
Here I pass the queries to a ThreadPoolExecutor and have each thread use the shared session factory, as in https://docs.sqlalchemy.org/en/13/orm/contextual.html#contextual-thread-local-sessions. However, the changes are not committed this way. Why?
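A likely explanation, hedged: scoped_session keeps one session per thread, so the ThreadedSession.commit() at the end of do_query only commits the main thread's session, which never executed any of the queries. A sketch of committing inside each worker thread instead:

def _query_function(query, session):
    # the registry hands this thread its own session; commit it here,
    # in the thread that actually did the work
    s = session()
    s.execute(query)
    s.commit()
    session.remove()  # discard this thread's session when done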
I'm working on an async FastAPI project and I want to connect to the database during tests. Coming from Django, my instinct was to create pytest fixtures that take care of creating/dropping the test database. However, I couldn't find much documentation on how to do this. The most complete instructions I could find were in this tutorial, but they don't work for me because they are all synchronous. I'm somewhat new to async development so I'm having trouble adapting the code to work async. This is what I have so far:
import pytest
from sqlalchemy.ext.asyncio import create_async_engine, session
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy_utils import database_exists, create_database
from fastapi.testclient import TestClient

from app.core.db import get_session
from app.main import app

Base = declarative_base()

@pytest.fixture(scope="session")
def db_engine():
    default_db = (
        "postgresql+asyncpg://postgres:postgres@postgres:5432/postgres"
    )
    test_db = "postgresql+asyncpg://postgres:postgres@postgres:5432/test"
    engine = create_async_engine(default_db)
    if not database_exists(test_db):  # <- Getting error on this line
        create_database(test_db)
    Base.metadata.create_all(bind=engine)
    yield engine

@pytest.fixture(scope="function")
def db(db_engine):
    connection = db_engine.connect()
    # begin a non-ORM transaction
    connection.begin()
    # bind an individual Session to the connection
    Session = sessionmaker(bind=connection)
    db = Session()
    # db = Session(db_engine)
    yield db
    db.rollback()
    connection.close()

@pytest.fixture(scope="function")
def client(db):
    app.dependency_overrides[get_session] = lambda: db
    PREFIX = "/api/v1/my-endpoint"
    with TestClient(app, base_url=PREFIX) as c:
        yield c
And this is the error I'm getting:
E sqlalchemy.exc.MissingGreenlet: greenlet_spawn has not been called; can't call await_() here. Was IO attempted in an unexpected place? (Background on this error at: https://sqlalche.me/e/14/xd2s)
/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py:67: MissingGreenlet
Any idea what I have to do to fix it?
You are trying to use a sync Session with an async engine. Try to use:
from sqlalchemy.ext.asyncio import AsyncSession

Session = sessionmaker(bind=connection, class_=AsyncSession)
https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html
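The database_exists error on the marked line likely has the same root cause: the database_exists/create_database helpers in sqlalchemy_utils are synchronous, so handing them an asyncpg URL forces async IO from sync code. A hedged sketch of an all-async session setup (get_test_session and expire_on_commit=False are illustrative choices, not from the question):

from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
from sqlalchemy.orm import sessionmaker

engine = create_async_engine("postgresql+asyncpg://postgres:postgres@postgres:5432/test")
async_session_factory = sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)

async def get_test_session():
    # the session is created and used inside a coroutine, so its IO
    # runs on the event loop instead of tripping MissingGreenlet
    async with async_session_factory() as session:
        yield session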
I'm getting this error sometimes (sometimes it's fine, sometimes it's not):
sqlalchemy.exc.OperationalError: (OperationalError) MySQL Connection not available.
while using session.query
I'm writing a simple server with Flask and SQLAlchemy (MySQL). My app.py looks like this:
Session = sessionmaker(bind=engine)
session = Session()

@app.route('/foo')
def foo():
    try:
        session.query(Foo).all()
    except Exception:
        session.rollback()
Update
I also create a new session in another file and call it from app.py:
Session = sessionmaker(bind=engine)
session = Session()

def foo_helper():  # called from app.py
    session.query(Something).all()
Update 2
My engine:
engine = create_engine('path')
How can I avoid that error?
Thank you!
Make sure the value of the pool_recycle option is less than your MySQL server's wait_timeout value when using SQLAlchemy's create_engine function.
engine = create_engine("mysql://username:password#localhost/myDatabase", pool_recycle=3600)
Try to use scoped_session to make your session:
from sqlalchemy.orm import scoped_session, sessionmaker
session = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=engine))
and close/remove your session after retrieving your data.
session.query(Foo).all()
session.close()
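If this runs inside Flask, the removal can also be tied to the request lifecycle instead of being called by hand. A hedged sketch (teardown_appcontext is standard Flask; rolling back on error is an assumption about the desired behavior):

@app.teardown_appcontext
def cleanup(exception=None):
    if exception:
        session.rollback()  # drop the failed transaction
    session.remove()  # return the thread-local session to the registry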