I'm fairly new to Python and the Pyramid framework. Recently I was introduced to SQLSoup to take care of my database (Postgres) needs.
dbEngine1 = SqlSoup(settings['sqlalchemy.db1.url'])
users = dbEngine1.users.fetchall()
Everything was working great, but after a short period of using the Pyramid app I started getting the error below. I have to kill Pyramid to release all the idling connections in Postgres (about 50 idle connections accumulate before the exception is thrown):
sorry, too many clients already
My question is: how do I close these idling connections? I tried adding a line of code as shown below, but it does not help.
dbEngine1 = SqlSoup(settings['sqlalchemy.db1.url'])
users = dbEngine1.users.fetchall()
# note: connect() checks a *new* connection out of the pool and close()
# merely returns it; the previously used connections stay idle as before
dbEngine1.engine.connect().close()
Any pointers from the SQLAlchemy gurus?
It looks like you are creating dbEngine1 on each request to your Pyramid app.
To use SqlSoup properly in a web app, you must use SQLAlchemy sessions.
See the section "Accessing the Session" on this page.
how do I close this idling connection
SqlSoup, like raw SQLAlchemy, uses a connection pool; each connection in the pool sits idle until a query executes. That connection pool must be created only once.
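For illustration, here is a minimal sketch of that pattern, assuming Pyramid's main() factory and the sqlalchemy.ext.sqlsoup import path (the location varies by SQLAlchemy version): create the SqlSoup instance once at startup, bound to a thread-local scoped session, instead of constructing it per request.

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.ext.sqlsoup import SqlSoup  # import location varies by version

def main(global_config, **settings):
    # Build the engine (and its connection pool) exactly once, at startup.
    engine = create_engine(settings['sqlalchemy.db1.url'])
    # Bind SqlSoup to a thread-local session: each request borrows a pooled
    # connection and returns it, rather than opening a brand new one.
    db = SqlSoup(engine, session=scoped_session(sessionmaker(bind=engine)))
    # ... store `db` somewhere views can reach it (e.g. the registry) and
    # reuse that single instance for every request.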
Related
I have a data model defined with Peewee, and it is all good. Now I need to keep some form of database context in an xlwings front-end application, but I'm not 100% sure how to proceed. I've seen a couple of bits and pieces of information here and there, but I'm not sure how to put them together:
https://docs.peewee-orm.com/en/2.10.2/peewee/database.html#connection-pooling
https://docs.peewee-orm.com/en/2.10.2/peewee/database.html#automatic-reconnect
I'm already using a DatabaseProxy to defer tying the model to a specific connection until runtime, since I can target different vendors, e.g. PROD -> Postgres and UT -> SQLite:
database_proxy = DatabaseProxy()
Furthermore, I connect via playhouse.db_url's connect() with a fully qualified URL. I need a "database context" that:
Works with DatabaseProxy and playhouse.db_url.connect and can support at least SQLite and Postgres.
Keeps a pool of N connections that are managed internally; if a connection is lost, retries are attempted until a timeout is reached.
Supports a connect timeout, rather than waiting forever before failing.
Handles the connection pooling part: automatic recycling and closing of idle connections.
Retries robustly and automatically on OperationalError during short database outages.
I'm struggling to put all the pieces together here. For example, playhouse.db_url.connect has no retry or timeout arguments, and it is also not clear how to wrap such a connection with pooling, and so on.
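One way to assemble this (a sketch under assumptions, not a definitive recipe): skip db_url for the pooled case and initialize the proxy with playhouse.pool's pooled database class, mixed with playhouse.shortcuts.ReconnectMixin so dropped connections are retried. The connection values below are hypothetical.

from peewee import DatabaseProxy
from playhouse.pool import PooledPostgresqlDatabase
from playhouse.shortcuts import ReconnectMixin

database_proxy = DatabaseProxy()

# Pooled Postgres database that transparently retries a query once when
# the server has closed the connection (e.g. after a short outage).
class PooledReconnectingPostgres(ReconnectMixin, PooledPostgresqlDatabase):
    pass

def init_database():
    db = PooledReconnectingPostgres(
        'mydb',              # hypothetical database name
        max_connections=8,   # keep at most 8 connections in the pool
        stale_timeout=300,   # recycle connections idle for over 5 minutes
        timeout=10,          # wait at most 10s for a free pool slot, then fail
        host='localhost',
        user='app',
    )
    database_proxy.initialize(db)

If you want to keep URL-based configuration, playhouse.db_url also understands pooled schemes such as postgres+pool://, and its register_database lets you map a custom class like the one above to a scheme of your own.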
In Tiangolo's FastAPI documentation, it states that you can create a persistent database connection by using a dependency:
https://fastapi.tiangolo.com/tutorial/sql-databases/#create-a-dependency
However, in the async database docs, the database is connected to during app startup:
https://fastapi.tiangolo.com/advanced/async-sql-databases/#connect-and-disconnect
This same pattern is followed in the encode/databases docs
https://www.encode.io/databases/connections_and_transactions/
Which is the right pattern? It seems to me that with dependencies, one database connection would be created per API call, while connecting to the database during startup would establish one connection per worker. If this is correct, connecting to the database on startup would be far superior.
What is the difference between the two, and is one better?
I won't go into the details of each database library. Let me just say that most modern tools use a connection pool, either explicitly or implicitly, hiding it behind abstractions like the Session in your first link.
In all of your examples, the connection pool is created when the application starts. Creating a session then involves no heavy connection-establishment work: a connection is simply fetched from the pool, and when the session is closed, it is returned to the pool.
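As a concrete illustration of the dependency pattern (a sketch assuming SQLAlchemy as in the first link; the URL and names are placeholders): the engine and its pool are created once per worker, and the per-request dependency merely borrows a connection.

from fastapi import Depends, FastAPI
from sqlalchemy import create_engine
from sqlalchemy.orm import Session, sessionmaker

# Created once per worker process: the connection pool lives in the engine.
engine = create_engine("postgresql://user:pass@localhost/db")  # placeholder URL
SessionLocal = sessionmaker(bind=engine)

app = FastAPI()

def get_db():
    db = SessionLocal()  # cheap: borrows a connection from the pool
    try:
        yield db
    finally:
        db.close()       # returns the connection to the pool

@app.get("/users")
def list_users(db: Session = Depends(get_db)):
    ...  # run queries with `db`; the connection is recycled afterwards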
Most blog posts and examples on the web about connecting to MongoDB using MongoEngine in Python/Django suggest adding these lines to the app's settings.py file:
from mongoengine import connect
connect('project1', host='localhost')
It works fine in most cases, except for one I faced recently:
When the database is down!
Let's say the DB goes down. The process supervising the web server (in my case, Supervisord) will stop running the app because of the exception that connect throws. It may try a few more times, but once its timeout is reached, it will stop trying.
So even the parts of your app that are not tied to the DB will break down as well.
A quick fix is to wrap the connect call in a try/except block:
try:
    connect('project1', host='localhost')
except Exception as e:
    print(e)
but I am looking for a better and cleaner way to handle this.
Unfortunately this is not really possible with mongoengine unless you go with the try-except solution like you did.
You could try connecting with pure pymongo version 3.0+ using MongoClient, and registering the connection manually in the mongoengine.connection._connection_settings dictionary (quite hacky, but it should work). From the pymongo documentation:
Changed in version 3.0: MongoClient is now the one and only client class for a standalone server, mongos, or replica set. It includes the functionality that had been split into MongoReplicaSetClient: it can connect to a replica set, discover all its members, and monitor the set for stepdowns, elections, and reconfigs.
The MongoClient constructor no longer blocks while connecting to the server or servers, and it no longer raises ConnectionFailure if they are unavailable, nor ConfigurationError if the user’s credentials are wrong. Instead, the constructor returns immediately and launches the connection process on background threads.
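A heavily hedged sketch of that hack follows. _connection_settings and _connections are private mongoengine internals, so the exact layout may differ between versions; treat this as an illustration, not a supported API.

from pymongo import MongoClient
import mongoengine.connection

ALIAS = 'default'

# pymongo 3+ returns immediately and connects on background threads, so
# this does not raise even while the database is down.
client = MongoClient('localhost', 27017)

# Register the client under mongoengine's default alias by poking its
# private module-level registries (hacky; internals can change).
mongoengine.connection._connection_settings[ALIAS] = {
    'name': 'project1',
    'host': 'localhost',
    'port': 27017,
}
mongoengine.connection._connections[ALIAS] = client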
I've built a small Python REST service using Flask, with Flask-SQLAlchemy used for talking to the MySQL DB.
If I connect directly to the MySQL server, everything is good, no problems at all. If I go through HAproxy (which handles HA/failover, though in this dev environment there is only one DB server), then I constantly get MySQL server has gone away errors if the application doesn't talk to the DB frequently enough.
My HAproxy client timeout is set to 50 seconds, so what I think is happening is that HAproxy cuts the stream, but the application isn't aware of this and tries to make use of an invalid connection.
Is there a setting I should be using when using services like HAproxy?
Also, it doesn't seem to reconnect automatically; if I issue a request manually, I get Can't reconnect until invalid transaction is rolled back, which is odd since it is just a select() call I'm making, so I don't think there's a commit() I'm missing. Or should I be calling commit() after every ORM-based query?
Just to tidy up this question with an answer I'll post what I (think I) did to solve the issues.
Problem 1: HAproxy
Either increase the HAproxy client timeout value (globally, or in the frontend definition) to something longer than what MySQL is set to reset on (see this interesting and related SF question),
Or set SQLALCHEMY_POOL_RECYCLE = 30 (30 in my case being less than the HAproxy client timeout) in Flask's app.config, so that when the DB is initialised it will pull in those settings and recycle connections before HAproxy cuts them itself. Similar to this issue on SO; a short sketch of the config follows.
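For reference, the relevant Flask config might look like the following (the URI is a placeholder; 30 just needs to stay below the 50-second HAproxy client timeout):

app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://user:pass@haproxy-host/db'  # placeholder URI
app.config['SQLALCHEMY_POOL_RECYCLE'] = 30  # recycle pooled connections before HAproxy cuts them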
Problem 2: Can't reconnect until invalid transaction is rolled back
I believe I fixed this by tweaking the way the DB is initialised and imported across the various modules. I now have a module that simply contains:
from flask_sqlalchemy import SQLAlchemy  # formerly flask.ext.sqlalchemy, now removed
db = SQLAlchemy()
Then in my main application factory I simply:
from common.database import db
db.init_app(app)
Also, since I wanted to load table structures automatically, I initialised the metadata binds within the app context, and I think it was this that cleanly handled the commit() issue/error I was getting, as I believe the database sessions are now being correctly terminated after each request.
with app.app_context():
    # Set up the DB binding so table structures can be loaded automatically
    db.metadata.bind = db.engine
I am using the Django ORM layer outside of Django. The project is a web application built on a custom in-house framework.
Now, I had no problems setting up the Django ORM to run standalone, but I am a little bit worried about connection management. I have read Using only DB part of Django here at SO, and it is true that Django does some special connection handling at the beginning and the end of each request. From django/db/__init__.py:
# Register an event that closes the database connection
# when a Django request is finished.
def close_connection(**kwargs):
    for conn in connections.all():
        conn.close()
signals.request_finished.connect(close_connection)

# Register an event that resets connection.queries
# when a Django request is started.
def reset_queries(**kwargs):
    for conn in connections.all():
        conn.queries = []
signals.request_started.connect(reset_queries)

# Register an event that rolls back the connections
# when a Django request has an exception.
def _rollback_on_exception(**kwargs):
    from django.db import transaction
    for conn in connections:
        try:
            transaction.rollback_unless_managed(using=conn)
        except DatabaseError:
            pass
signals.got_request_exception.connect(_rollback_on_exception)
What problems can I run into if I skip this connection management? (I have no way to plug in those signals into my framework easily)
It depends on your use case. Each of these functions does something specific, which may or may not affect you.
If this is a long-running process and you have DEBUG on, you'll need to reset the queries or it'll keep all the queries you've run in memory.
If you spawn lots of threads, connect to the DB once early on in each thread, and then leave the threads running, you'll also want to close the connections when you're done using the DB, or you could hit your DB's connection limit.
You almost certainly don't need _rollback_on_exception - I'm assuming you have your intended transaction behavior set up within the relevant code itself.
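If any of those cases do apply, replicating the cleanup outside Django's signal machinery takes only a few lines; the after_request hook below is a hypothetical stand-in for whatever your framework actually provides.

from django.db import connections, reset_queries

def on_request_finished():
    reset_queries()               # drop the per-request query log (matters with DEBUG on)
    for conn in connections.all():
        conn.close()              # close each DB connection cleanly

# e.g. my_framework.after_request(on_request_finished)  # hypothetical hook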