I use Flask and sqlalchemy. I want to create the engine only once as the documentation says
The typical usage of create_engine() is once per particular database
URL, held globally for the lifetime of a single application process.
My basic idea is to put the engine as a class variable like
class EngineUnique:
engine=sqla.create_engine(f"postgresql://{user}:{password}#{server}:{port}/{database}")
and then when I need the engine do
with EngineUnique.engine.connect() as connection:
Is it a correct way to do or do I have to implement the Singleton pattern?
Loïc
Related
I created a rest API with openapi generator that contains all the requests necessary for selecting, inserting, and updating my SQL database.
I use from my database generation and manipulation SQLAlchemy and I'm not sure how to use the session to interact with the database in this context.
My project looks like this:
DB
| openapi_server (generated)
| __init__.py
| request.py
| database.py
In database.py I keep my database structure.
In request.py I have all the functions that need to be processed on every request(to interact with the database).
My way of handling this situation is: I create a session variable at the beginning of each function and after the operation is complete I close it.
Any other methods that are more scalable and easy to maintain or which are the best practices?
My understanding is that the sqlalchemy session is different from the client session in that the client session stores information about authorization & permissions whereas the sqlalchemy session is a gil-bound transaction state which associates your code / machine to an external database.
Assuming you're not utilizing multithreading or parallel processing, a single sqlalchemy session shared between your application would be appropriate. In the case where your users have different levels of database permissions, I would establish those rules in your application authorization, rather than the database user-permission schema. (That should be reserved for system-users.)
Bear in mind, multiple sqlalchemy sessions are appropriate in many scenarios and there are advantages for creating and closing sessions on the fly. But there are also potential downsides, such as write collisions (2 processes try to write the same record) and so on. In these more fine grained cases, I'd suggest a queuing process as a central orchestrator.
For implementation:I usually create a file create_session.py which has a function to create a new db session with the appropriate DB URI. I then call that function in the main __init__.py like so: session = create_session() --> importing that session throughout the application is done by importing session from the main module ex: from database import session.
In cases where you need to create new / multiple sessions, do so with:
# Getting the path right here isn't always straightforward tbh
# basically, import the function from the module directly
from create_session import create_session
def do_something():
# Always create your session in a method
# otherwise your db will open many unnecessary connections
my_session = create_session()
print('Done')
# Close the session when you're done
my_session.close()
In SQLAlchemy's docs, they mention that you should instantiate an Engine object (by calling create_engine) only once (per DB URL) in the lifetime of your application, as "the Engine is most efficient when created just once at the module level of an application".
My question is: if I'm creating the engine disabling pooling (using NullPool) is there still advantages from instantiating it only once per application lifetime? The reason I'm using NullPool is that we were having some issues related to multiprocessing, and that was one of the things suggested in SQLAlchemy's docs for solving it.
My understanding is that by not using pooling, we're not keeping DB connections open after they are used, so what would be the potential downsides of calling create_engine for each query that gets run?
I'm asking because I'm working with a codebase that right now instantiates the engine object for each query that it will run, so I'm trying to assess whether there are significant benefits from changing that to instantiate the engine only once that would make it worth the time and effort that will be required for this refactoring.
I'm writing my blog project by flask and sqlalchemy, but I don't know how to organize it.Here is is the file tree:
the admin and main are two blueprint, I create my app in the blog's __init__.py.
But where should I create my models and how to use it correctly?
Can I create an engine instance and make it global? So that everytime I need to connect to database, I just need to make a new session and bind this engine? Or should I establish new connection and engine for every request?
I personally prefer the global instance method for two reasons. First it can reduce the handshake cost since it's a long-live connection. Second, it can easily transform into a connection pool.
Background:
This is the situation I am facing and so far my current solution seems rather clunky. I want to improve on it. Right now:
I setup connections to each database in the main function of the Pyramid application:
def main(global_config, **settings):
a_engine = engine_from_config(settings, 'A.')
b_engine = engine_from_config(settings, 'B.')
ASession.configure(bind=a_engine)
BSession.configure(bind=b_engine)
"ASession" and "BSession" are simply globally defined scoped_session in /models/init.py.
ASession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
I define model base class like so. For example:
ABase = declarative_base()
class user(ABase):
id = Column(Integer, primary_key=True)
name = Column(String)
This somehow already doesn't feel very clean. But now that this model is supposed to be accessed from a different application, I also need to define the engine and connection again in that application. This feels extremely redundant.
Problem Abstracted:
Assume that there are 2 different databases:
A and B
Also assume that you want A and B to be accessible from 2 different applications (e.g.: Pyramid application, Bokeh Server App which uses Tornado) using the same model.
In short, how would one best pattern objects/models/classes/functions to produce clean non-redundant code in Python3?
Initial Thought After The Question Was Posted:
Thinking about this a bit more, I think I want each model to be somehow "self-contained". The model should bring with it methods for initiating connections. In other words, the initiation of db connections should be decoupled from the web application itself.
And it should be done in an instance kind of manner. So that multiple applications can use the same models. Each application would have its own session connection to either DB.
How would the community pattern this? Friday afternoons don't lend themselves to find answers to these kinds of questions for myself at least.
I have done this. My recommendation below is how I like doing it, but is not the only way. I would ditch scoped sessions and the transaction manager and make explicit session management objects, with request lifecycle callbacks handling creation, closing, committing, or rolling back your sessions. Basically scoped sessions are a way to simulate a global by getting the same item for that thread of execution. The other way to do this in Pyramid is to attach things to the registry and the request, because you have those everywhere. You attach shared components to the registry (the ZCA) and per-request objects to the request.
When you have multiple sessions, I've found it much easier to reason about them and keep track of them if they are handled by components that wrap up everything for that engine. So for a case like that you describe, I've made two different DB engine components, that are created on start up, attached to the registry, and have a method for getting a fresh session. If you create these components properly, they should be usable in any application, whether it's Pyramid, Tornado, or your test script. You just make sure it has a constructor with some sane way of passing in settings for setting up the engine, whether it's a settings dict or kwargs. I then make my data model(s) live in their own python packages and it's easy to have any app in the family import the model, instantiate the engine components and go to town. Note that if you like using the ZCA registry (and I love it, it's a fantastic DI system), there's nothing preventing you from using it in non-pyramid apps, you just set it up manually in your server start up code.
In Pyramid specifically, I make a custom Request class and use the reify decorator to allow other pyramid code to get the session(s) for that request. The request class has end-of-life-callbacks attached to close out the sessions, and to do rollbacks or commits. There is a bit more boilerplate, but for me it's cleaner in that I can very easily trace where and when in code and time my session management is happening. It's also a good way for testing.
That said, there are lots of smart folks in SQLAlchemy/Pyramid land who swear by scoped sessions and the transaction manager, so there are other valid approaches. Hope that helps.
I'm building a small website and I already have all my models in SQLAlchemy. The website is to publish some information from some calculations which are done offline. Only the results will be published to a slimmed down database i.e. it contains the results, not the raw data, but the website needs to query the results.
I'm going to use Flask, as my models are already driven with Python (and some heavy lifting in C++ via SWIG) and I don't want to use Django.
Now this has been asked before I'm sure and the usual mantra without much justification is to 'use Flask-SQLAlchemy'. The question is why?
If I write some session handling myself, why do I have to go through the additional layer of redefining my database in Flask-SQLAlchemy. Other than having to write some code like here in my Flask app somewhere:
#app.before_request
def before_request():
g.db = connect_db()
#app.teardown_request
def teardown_request(exception):
db = getattr(g, 'db', None)
if db is not None:
db.close()
What else do I need to worry about? SQLAlchemy even does connection pooling for me by default.
Actually, you are building a web application using Flask which do database related kinds of stuff by sqlalchemy. So, when you are dealing with database session with multiple requests which your application handles, then you have to ascertain that, you are creating and closing sessions cautiously.
If you read SQLAlchemy docs, they recommend to keep the lifecycle of the session separate and external from functions and objects that access and/or manipulate database data. This will greatly help with achieving a predictable and consistent transactional scope.
A web application is the easiest case because such an application is already constructed around a single, consistent scope - this is the request, which represents an incoming request from a browser, the processing of that request to formulate a response, and finally the delivery of that response back to the client. Integrating web applications with the Session is then the straightforward task of linking the scope of the Session to that of the request. The Session can be established as the request begins, or using a lazy initialization pattern which establishes one as soon as it is needed. The request then proceeds, with some system in place where application logic can access the current Session in a manner associated with how the actual request object is accessed. As the request ends, the Session is torn down as well, usually through the usage of event hooks provided by the web framework. The transaction used by the Session may also be committed at this point, or alternatively the application may opt for an explicit commit pattern, only committing for those requests where one is warranted, but still always tearing down the Session unconditionally at the end.
In laymen terms, i mean to say that
In SQLAlchemy, above action is mentioned because sessions in web application should be scoped, meaning that each request handler creates and destroys its own session.
This is necessary because web servers can be multi-threaded, so multiple requests might be served at the same time, each working with a different database session.
This means that if you are using SqlAlchemy with Flask, you have to manually handle session like creating scoped session and also remove them on each request cautiously otherwise you may be in deep shit, which adds extra layer of complexity to your web application.
But, there comes Flask-SqlAlchemy (an extension of sqlalchemy library for Flask app) which provides infrastructure to assist in the task of aligning the lifespan of a Session with that of each web request. Actually, you can also found that in SqlAlchmey docs, they also recommend to use this with Flask.
Flask-SQLAlchemy creates a fresh/new scoped session for each request. If you dig further it you will find out here, it also installs a hook on app.teardown_appcontext (for Flask >=0.9), app.teardown_request (for Flask 0.7-0.8), app.after_request (for Flask <0.7) and here is where it calls db.session.remove().
The code you put in the question is not actually valid for Sqlalchemy integration is Flask. I know it is just example, but saying that just in case.
For Sqlalchemy integration all you need to do is to make sure current DbSession is cleaned up at the end of request via something like this:
#app.teardown_appcontext
def shutdown_session(exception=None):
DbSession.remove()
where DbSession is scoped session.
Here is documentation for the case when you dont want to use Flask-Sqlalchemy package.