Caching/reusing a DB connection for later view usage - python

I am saving a user's database connection. The first time they enter their credentials, I do something like the following:
self.conn = MySQLdb.connect(
    host='aaa',
    user='bbb',
    passwd='ccc',
    db='ddd',
    charset='utf8'
)
cursor = self.conn.cursor()
cursor.execute("SET NAMES utf8")
cursor.execute('SET CHARACTER SET utf8;')
cursor.execute('SET character_set_connection=utf8;')
I then have the conn ready to go for all the user's queries. However, I don't want to re-connect every time the view is loaded. How would I store this "open connection" so I can just do something like the following in the view:
def do_queries(request, sql):
    user = request.user
    conn = request.session['conn']
    cursor = request.session['cursor']
    cursor.execute(sql)
Update: it seems like the above is not possible and not good practice, so let me re-phrase what I'm trying to do:
I have a SQL editor that a user can use after they enter their credentials (think of something like Navicat or Sequel Pro). Note this is NOT the default Django DB connection; I do not know the credentials beforehand. Now, once the user has 'connected', I would like them to be able to run as many queries as they like without me having to reconnect every time. For example - to reiterate - something like Navicat or Sequel Pro. How would this be done using Python, Django, or MySQL? Perhaps I don't really understand what is necessary here (caching the connection? connection pooling? etc.), so any suggestions or help would be greatly appreciated.

You could use an IoC container to store a singleton provider for you. Essentially, instead of constructing a new connection every time, it will only construct it once (the first time ConnectionContainer.connection_provider() is called) and thereafter it will always return the previously constructed connection.
You'll need the dependency-injector package for my example to work:
import MySQLdb
import dependency_injector.containers as containers
import dependency_injector.providers as providers

class ConnectionProvider():
    def __init__(self, host, user, passwd, db, charset):
        self.conn = MySQLdb.connect(
            host=host,
            user=user,
            passwd=passwd,
            db=db,
            charset=charset
        )

class ConnectionContainer(containers.DeclarativeContainer):
    connection_provider = providers.Singleton(ConnectionProvider,
                                              host='aaa',
                                              user='bbb',
                                              passwd='ccc',
                                              db='ddd',
                                              charset='utf8')

def do_queries(request, sql):
    user = request.user
    conn = ConnectionContainer.connection_provider().conn
    cursor = conn.cursor()
    cursor.execute(sql)
I've hardcoded the connection string here, but it is also possible to make it variable depending on a changeable configuration. In that case you could also create a container for the configuration file and have the connection container read its config from there. You would then set the config at runtime, as follows:
import MySQLdb
import dependency_injector.containers as containers
import dependency_injector.providers as providers

class ConnectionProvider():
    def __init__(self, connection_config):
        self.conn = MySQLdb.connect(**connection_config)

class ConfigContainer(containers.DeclarativeContainer):
    connection_config = providers.Configuration("connection_config")

class ConnectionContainer(containers.DeclarativeContainer):
    connection_provider = providers.Singleton(ConnectionProvider, ConfigContainer.connection_config)

def do_queries(request, sql):
    user = request.user
    conn = ConnectionContainer.connection_provider().conn
    cursor = conn.cursor()
    cursor.execute(sql)

# run code
my_config = {
    'host': 'aaa',
    'user': 'bbb',
    'passwd': 'ccc',
    'db': 'ddd',
    'charset': 'utf8'
}

ConfigContainer.connection_config.override(my_config)
request = ...
sql = ...
do_queries(request, sql)

I don't see why you need a cached connection here rather than simply reconnecting on every request and caching the user's credentials somewhere, but I'll try to outline a solution that might fit your requirements.
I'd suggest looking at a more generic task first: caching something between subsequent requests that your app needs to handle and that can't be serialized into Django's sessions.
In your particular case this shared value would be a database connection (or multiple connections).
Let's start with the simple task of sharing a counter variable between requests, just to understand what's actually happening under the hood.
Surprisingly, none of the answers has mentioned anything about the web server you might use!
There are actually multiple ways to handle concurrent requests in web apps:
1. Multiple processes: every request comes into one of them at random.
2. Multiple threads: every request is handled by a random thread.
3. Options 1 and 2 combined.
4. Various async techniques, where there is a single process plus an event loop handling requests, with the caveat that request handlers shouldn't block for a long time.
From my own experience, options 1 and 2 are fine for the majority of typical web apps.
Apache 1.x could only work with option 1; Apache 2.x can handle options 1-3.
Let's start with the following Django app and run a single-process gunicorn web server.
I'm going to use gunicorn because it's fairly easy to configure, unlike Apache (personal opinion :-)
views.py
import time
from django.http import HttpResponse

c = 0

def main(request):
    global c
    c += 1
    return HttpResponse('val: {}\n'.format(c))

def heavy(request):
    time.sleep(10)
    return HttpResponse('heavy done')
urls.py
from django.contrib import admin
from django.urls import path
from . import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', views.main, name='main'),
    path('heavy/', views.heavy, name='heavy'),
]
Running it in single-process mode:
gunicorn testpool.wsgi -w 1
Here's our process tree; there's only one worker, which will handle ALL requests:
pstree 77292
-+= 77292 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1
\--- 77295 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1
Trying to use our app:
curl 'http://127.0.0.1:8000'
val: 1
curl 'http://127.0.0.1:8000'
val: 2
curl 'http://127.0.0.1:8000'
val: 3
As you can see, you can easily share the counter between subsequent requests.
The problem here is that you can only serve a single request at a time: if you request /heavy/ in one tab, / won't work until /heavy/ is done.
Let's now use two worker processes:
gunicorn testpool.wsgi -w 2
This is what the process tree looks like:
pstree 77285
-+= 77285 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
|--- 77288 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
\--- 77289 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
Testing our app:
curl 'http://127.0.0.1:8000'
val: 1
curl 'http://127.0.0.1:8000'
val: 2
curl 'http://127.0.0.1:8000'
val: 1
The first two requests were handled by the first worker process, and the third by the second worker process, which has its own memory space, so you see 1 instead of 3.
Note that your output may differ, because processes 1 and 2 are selected at random; sooner or later you'll hit a different process.
That's not very helpful for us, because we need to handle multiple concurrent requests, and we need to get each request handled by a specific process, which can't be done in the general case.
Most pooling techniques that come out of the box only cache connections in the scope of a single process; if your request gets served by a different process, a NEW connection needs to be made.
Let's move on to threads:
gunicorn testpool.wsgi -w 1 --threads 2
Again, only one process:
pstree 77310
-+= 77310 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1 --threads 2
\--- 77313 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1 --threads 2
Now if you run /heavy/ in one tab you'll still be able to query /, and your counter will be preserved between requests!
Even if the number of threads grows or shrinks depending on your workload, it should still work fine.
One problem: you'll need to synchronize access to shared variables like this using Python's thread synchronization primitives (read more); see the sketch below.
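For instance, a minimal sketch of protecting the counter from the earlier views.py with a threading.Lock (the lock is my addition, not part of the original example):

import threading
from django.http import HttpResponse

c = 0
c_lock = threading.Lock()   # guards the shared counter across the worker's threads

def main(request):
    global c
    with c_lock:            # only one thread may update the counter at a time
        c += 1
        current = c
    return HttpResponse('val: {}\n'.format(current))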
Another problem is that the same user may need to issue multiple queries in parallel, i.e. open multiple tabs.
To handle that you can open multiple connections on the first request, when you have the DB credentials available.
If a user needs more connections than that, your app can wait on a lock until a connection becomes available.
Back to your question
You can create a class that would have the following methods:
from contextlib import contextmanager
from queue import Queue

import MySQLdb


class ConnectionPool(object):
    def __init__(self, max_connections=4):
        self._pools = dict()          # session_id -> Queue of open connections
        self._max_connections = max_connections

    def preconnect(self, session_id, user, password):
        # Create several connections up front, while the credentials are
        # available, and put them into this session's pool.
        connections = Queue(maxsize=self._max_connections)
        for _ in range(self._max_connections):
            connections.put(MySQLdb.connect(user=user, passwd=password))
        self._pools[session_id] = connections

    @contextmanager
    def get_connection(self, session_id):
        # Take an available connection, blocking until another thread
        # returns one if the pool is currently exhausted...
        connection = self._pools[session_id].get()
        try:
            yield connection
        finally:
            # ...and put it back into the pool when the caller is done.
            self._pools[session_id].put(connection)


pool = ConnectionPool(4)

def some_view(request):
    session_id = ...
    with pool.get_connection(session_id) as conn:
        conn.query(...)
This is not a complete solution: you'll still need to somehow expire connections that haven't been used for a long time.
If a user comes back after a long time and their connection has been closed, they'll need to provide their credentials again; hopefully that's OK from your app's perspective.
Also keep in mind that Python threads have their own performance penalties; I'm not sure whether that's an issue for you.
I haven't checked it with Apache 2 (too much configuration burden; I haven't used it for ages and generally use uwsgi), but it should work there too. I'd be happy to hear back from you if you manage to run it :)
Also don't forget about option 4 (the async approach). You're unlikely to be able to use it with Apache, but it's worth investigating; keywords: django + gevent, django + asyncio. It has its pros and cons and may greatly affect your app's implementation, so it's hard to suggest any solution without knowing your app's requirements in detail.

It is not a good idea to do such a thing synchronously in a web-app context. Remember that your application may need to work in a multi-process/multi-thread fashion, and you cannot normally share a connection between processes. So if you create a connection for your user in one process, there is no guarantee that the next query request will arrive in the same one. A better idea may be a single-process background worker which handles connections in multiple threads (a thread per session), makes the queries against the database, and returns the results to the web app. Your application should assign a unique ID to each session, and the background worker should track each thread by session ID. You may use celery or any other task queue supporting async results. So the design would be something like the diagram below:
user (id: x)  <-->  webapp  <-->  queue  <-->  worker (thread x)  <-->  DB
You could also create a queue for each user while they have an active session; as a result, you could run a separate background process for each session.
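To make the idea concrete, here is a minimal sketch (my illustration, not code from this answer) of routing user queries through a Celery worker that keeps one MySQL connection per session ID inside its own process. The task name run_user_query, the broker/backend URLs, and the _connections cache are assumptions, and it presumes the worker runs as a single process (e.g. a solo or threaded pool) so every query for a session lands in the same place:

import MySQLdb
from celery import Celery

app = Celery('sqlworker',
             broker='redis://localhost:6379/0',    # assumption: any broker would do
             backend='redis://localhost:6379/1')   # result backend so the web app can fetch results

_connections = {}   # session_id -> open connection, lives inside the worker process

@app.task
def run_user_query(session_id, credentials, sql):
    conn = _connections.get(session_id)
    if conn is None:
        # First query for this session: open the connection with the
        # user-supplied credentials and cache it for later tasks.
        conn = MySQLdb.connect(**credentials)
        _connections[session_id] = conn
    cursor = conn.cursor()
    cursor.execute(sql)
    return cursor.fetchall()

The Django view would then only enqueue run_user_query.delay(session_id, credentials, sql) and wait for (or poll) the async result.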

I actually shared my solution to this exact issue. What I did was create a pool of connections for which you can specify the maximum size, and then queue query requests asynchronously through this channel. This way you can leave a certain number of connections open, while the queuing and pooling happen asynchronously and keep the speed you are used to.
This requires gevent and Postgres.
Python Postgres psycopg2 ThreadedConnectionPool exhausted

I'm no expert in this field, but I believe PgBouncer would do the job for you, assuming you're able to use a PostgreSQL back end (that's one detail you didn't make clear). PgBouncer is a connection pooler which allows you to reuse connections, avoiding the overhead of connecting on every request.
According to their documentation:
user, password
If user= is set, all connections to the destination database will be done with the specified user, meaning that there will be only one pool for this database.
Otherwise PgBouncer tries to log into the destination database with client username, meaning that there will be one pool per user.
So, you can have a single pool of connections per user, which sounds just like what you want.
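For illustration, a minimal pgbouncer.ini along those lines might look like the sketch below (the paths, names, and values are my assumptions, not from this answer); because no user= is set on the database entry, PgBouncer logs in with the client's username and keeps one pool per user:

[databases]
; no user= here, so PgBouncer connects with the client's own username
ddd = host=127.0.0.1 port=5432 dbname=ddd

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = session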
In MySQL land, the mysql.connector.pooling module lets you do some connection pooling, though I'm not sure you can do per-user pooling. Given that you can set the pool name, I'm guessing you could use the user's name to identify the pool, as in the sketch below.
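A minimal sketch of that idea, assuming the mysql-connector-python package (the per-user pool name is my guess, not something the library prescribes):

import mysql.connector.pooling

def pool_for(user, password):
    # One pool per user, identified by the pool name.
    return mysql.connector.pooling.MySQLConnectionPool(
        pool_name='pool_{}'.format(user),
        pool_size=4,
        host='aaa', user=user, password=password, database='ddd', charset='utf8')

pool = pool_for('bbb', 'ccc')
conn = pool.get_connection()
try:
    cursor = conn.cursor()
    cursor.execute('SELECT 1')
    print(cursor.fetchall())
finally:
    conn.close()   # returns the connection to the pool instead of closing it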
Regardless of what you use, you will likely have occasions where reconnecting is unavoidable (a user connects, does a few things, goes away for a meeting and lunch, comes back and wants to take more action).

I am just sharing my knowledge here.
Install PyMySQL to use MySQL.
For Python 2.x:
pip install PyMySQL
For Python 3.x:
pip3 install PyMySQL
1. If you are open to using the Django framework, then it's very easy to run SQL queries without reconnecting. (If you use PyMySQL rather than mysqlclient, call pymysql.install_as_MySQLdb() in your project's __init__.py so Django's MySQL backend can use it.)
In the settings.py file, add the lines below:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'test',
        'USER': 'test',
        'PASSWORD': 'test',
        'HOST': 'localhost',
        'OPTIONS': {'charset': 'utf8mb4'},
    }
}
In the views.py file, add these lines to fetch the data. You can customize the query according to your needs:
from django.db import connection

def connect(request):
    cursor = connection.cursor()
    cursor.execute("SELECT * FROM Tablename")
    results = cursor.fetchall()
    return results
You will get the desired results.
Click here for more information about it.
2. For Python with Tkinter:
from Tkinter import *
import pymysql

# connect using PyMySQL (installed above)
db = pymysql.connect(host="localhost", user="root", password="root", database="test")

# prepare a cursor object using the cursor() method
cursor = db.cursor()
cursor.execute("SELECT * FROM Tablename")

if cursor.fetchone() is not None:
    print("In If")
else:
    print("In Else")

cursor.close()
Refer to this for more information.
PS: You can check this link about reusing a DB connection later:
How to enable MySQL client auto re-connect with MySQLdb?

Related

Does SQLAlchemy close sessions after commit()?

So this question is a little like Does SQLAlchemy reset the database session between SQLAlchemy Sessions from the same connection?
I have a Flask/SQLAlchemy/Postgres app, which intermittently seems to drop connections after a commit() that occurs as part of a POST request.
This causes me headaches as I rely upon a customized option (https://www.postgresql.org/docs/9.6/runtime-config-custom.html) to control row level security - in effect executing the following before each Flask request while utilising scoped sessions:
@app.before_request
def load_user():
    ...
    # Set up RLS.
    statement = f"SET app.permitted_workspace_id = '{workspace_id}'"
    db.db_session.execute(statement)
    ...
This pattern generally works fine, but it occasionally seems to fail when, so far as I can tell, SQLAlchemy drops the existing session after a commit() and checks out a new one, in which app.permitted_workspace_id is no longer set.
My workaround for this is to listen for session checkout events, and then re-set the parameter:
@event.listens_for(db_engine, 'checkout')
def receive_checkout(dbapi_connection, connection_record, connection_proxy):
    ...
    cursor = dbapi_connection.cursor()
    statement = f"SET app.permitted_workspace_id = '{g.user.workspace_id}'"
    cursor.execute(statement)
    return
So my question is really: is it unavoidable that SQLAlchemy may close sessions after commit(), meaning I lose my session parameters - even with more DB work still to do?
If so, do we think this pattern is secure or even acceptable practice? Ideally, I'd keep the session open until removed (via @app.teardown_appcontext), but since I'm struggling to achieve that, and still have the relevant info available within the Flask request, I think this is the next best way to go.
Thanks
Edit 1:
In terms of session scoping, the layout is this:
In a database module, I lay out the following:
def get_database_connection():
    ...
    db_engine = sa.create_engine(
        f'postgresql://{user}:{password}@{host}/postgres',
        echo=False,
        poolclass=sa.pool.NullPool
    )
    # Connect - RLS is controlled by db_get_user_details.
    db_connection = db_engine.connect()
    db_session = scoped_session(
        sessionmaker(
            autocommit=False,
            autoflush=False,
            expire_on_commit=False,
            bind=db_engine
        )
    )
    return db_engine, db_session, db_connection
This is then called up top from inside the main Flask application:
db_engine, db_session, db_connection = db.get_database_connection()
And session removal is controlled by a function as follows:
@app.teardown_appcontext
def remove_session(exception=None):
    db_session.remove()
So the answer in here seems to be that commit() does perform a checkin with this pattern:
https://github.com/sqlalchemy/sqlalchemy/issues/4925
if Session is what you're working with then yes, the Session will release connections when any of commit(), rollback(), or close() is called.

Best way to re/use redis connections for prometheus django exporter

I am getting an error
redis.exceptions.ConnectionError: Error 24 connecting to redis-service:6379. Too many open files.
...
OSError: [Errno 24] Too many open files
I know this can be fixed by increasing the ulimit, but I don't think that's the issue here, and this is a service running in a container anyway.
The application starts up correctly, works for 48 hours, and then I get the above error,
which implies that the number of connections is growing over time.
What my application is basically doing:
background task (run using celery) -> collects data from postgres and sets it in redis
prometheus reaches the app at /metrics, which is a django view -> collects data from redis and serves it using the django prometheus exporter
The code looks something like this
views.py
from prometheus_client.core import GaugeMetricFamily, REGISTRY
from my_awesome_app.taskbroker.celery import app

class SomeMetricCollector:
    def get_sample_metrics(self):
        with app.connection_or_acquire() as conn:
            client = conn.channel().client
            result = client.get('some_metric_key')
            return {'some_metric_key': result}

    def collect(self):
        sample_metrics = self.get_sample_metrics()
        for key, value in sample_metrics.items():
            yield GaugeMetricFamily(key, 'This is a custom metric', value=value)

REGISTRY.register(SomeMetricCollector())
tasks.py
# This is my boilerplate taskbroker app
from my_awesome_app.taskbroker.celery import app
# How it's collecting data from postgres is trivial to this issue.
from my_awesome_app.utility_app.utility import some_value_calculated_from_query

@app.task()
def app_metrics_sync_periodic():
    with app.connection_or_acquire() as conn:
        client = conn.channel().client
        client.set('some_metric_key', some_value_calculated_from_query(), ex=21600)
    return True
I don't think the background data collection in tasks.py is causing the Redis connections to grow; it's the Django view /metrics in views.py which is causing it.
Can you please tell me what I am doing wrong here?
Is there a better way to read from Redis from a Django view? The Prometheus instance scrapes the Django application every 5s.
This answer is based on my use case and research.
The issue here, as I see it, is that each request to /metrics starts a new thread in which views.py creates new connections in the Celery broker's connection pool.
This can easily be handled by letting Django manage its own Redis connection pool through its cache backend, letting Celery manage its own Redis connection pool, and not using each other's connection pools from their respective threads.
Django Side
config.py
# CACHES
# ------------------------------------------------------------------------------
# For more details on options for your cache backend please refer to
# https://docs.djangoproject.com/en/3.1/ref/settings/#backend
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://localhost:6379/0",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        },
    }
}
views.py
from prometheus_client.core import GaugeMetricFamily, REGISTRY
# *: Replacing celery app with Django cache backend
from django.core.cache import cache

class SomeMetricCollector:
    def get_sample_metrics(self):
        # *: This is how you will get the new client, which is still context managed.
        with cache.client.get_client() as client:
            result = client.get('some_metric_key')
            return {'some_metric_key': result}

    def collect(self):
        sample_metrics = self.get_sample_metrics()
        for key, value in sample_metrics.items():
            yield GaugeMetricFamily(key, 'This is a custom metric', value=value)

REGISTRY.register(SomeMetricCollector())
This will ensure that Django maintains its own Redis connection pool and does not cause new connections to be spun up unnecessarily.
Celery Side
tasks.py
# This is my boilerplate taskbroker app
from my_awesome_app.taskbroker.celery import app
# How it's collecting data from postgres is trivial to this issue.
from my_awesome_app.utility_app.utility import some_value_calculated_from_query

@app.task()
def app_metrics_sync_periodic():
    with app.connection_or_acquire() as conn:
        # *: This will force celery to always look into the existing connection pool for a connection.
        client = conn.default_channel.client
        client.set('some_metric_key', some_value_calculated_from_query(), ex=21600)
    return True
How do I monitor connections?
There is a nice Prometheus Celery exporter which will help you monitor your Celery task activity; I'm not sure how you can add connection pool and connection monitoring to it.
The easiest way to manually verify whether the connections are growing every time /metrics is hit on the web app is:
$ redis-cli
127.0.0.1:6379> CLIENT LIST
...
The CLIENT LIST command will help you see whether the number of connections is growing or not.
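If you prefer checking from Python, a small sketch using redis-py (the host and port are assumptions):

import redis

r = redis.Redis(host='localhost', port=6379)
# Each entry corresponds to one connected client, the same data as CLIENT LIST.
print(len(r.client_list()))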
Sadly I don't use queues, but I would recommend using them. This is how my worker runs:
$ celery -A my_awesome_app.taskbroker worker --concurrency=20 -l ERROR -E

StreamingHttpResponse: return database connection to pool / close it

If you have a StreamingHttpResponse returned from a Django view, when does it return any database connection to the pool? If by default it does so once the StreamingHttpResponse has completed, is there a way to make it return the connection earlier?
def my_view(request):
    # Some database queries using the Django ORM
    # ...
    def yield_data():
        # A generator, with no database queries using the Django ORM
        # ...
    return StreamingHttpResponse(
        yield_data(), status=200
    )
If it makes a difference, this is using https://pypi.org/project/django-db-geventpool/ with gunicorn, and any answer should also work when tested with pytest.mark.django_db (which I think wraps tests in transactions).
If you look at the documentation
https://docs.djangoproject.com/en/3.0/ref/databases/
Connection management
Django opens a connection to the database when it first makes a database query. It keeps this connection open and reuses it in subsequent requests. Django closes the connection once it exceeds the maximum age defined by CONN_MAX_AGE or when it isn’t usable any longer.
In detail, Django automatically opens a connection to the database whenever it needs one and doesn’t have one already — either because this is the first connection, or because the previous connection was closed.
At the beginning of each request, Django closes the connection if it has reached its maximum age. If your database terminates idle connections after some time, you should set CONN_MAX_AGE to a lower value, so that Django doesn’t attempt to use a connection that has been terminated by the database server. (This problem may only affect very low traffic sites.)
At the end of each request, Django closes the connection if it has reached its maximum age or if it is in an unrecoverable error state. If any database errors have occurred while processing the requests, Django checks whether the connection still works, and closes it if it doesn’t. Thus, database errors affect at most one request; if the connection becomes unusable, the next request gets a fresh connection.
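For reference, CONN_MAX_AGE is set per database alias in settings.py; a minimal sketch (the engine, name, and 60-second value are only illustrative):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',        # illustrative
        # Reuse connections for up to 60 seconds across requests.
        # 0 closes the connection at the end of every request;
        # None keeps connections open indefinitely.
        'CONN_MAX_AGE': 60,
    }
}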
Also, if you look at db/__init__.py in the Django source code:
# For backwards compatibility. Prefer connections['default'] instead.
connection = DefaultConnectionProxy()

# Register an event to reset saved queries when a Django request is started.
def reset_queries(**kwargs):
    for conn in connections.all():
        conn.queries_log.clear()

signals.request_started.connect(reset_queries)

# Register an event to reset transaction state and close connections past
# their lifetime.
def close_old_connections(**kwargs):
    for conn in connections.all():
        conn.close_if_unusable_or_obsolete()

signals.request_started.connect(close_old_connections)
signals.request_finished.connect(close_old_connections)
It connects close_old_connections to the request_started and request_finished signals, so that connections past their lifetime are closed.
So if you are not willing to wait for the connection to be closed at the end of the request, you can call this function yourself. Your updated code would look like the below:
from django.db import close_old_connections

def my_view(request):
    # Some database queries using the Django ORM
    # ...
    close_old_connections()
    def yield_data():
        # A generator, with no database queries using the Django ORM
        # ...
    return StreamingHttpResponse(
        yield_data(), status=200
    )

Ironworker job done notification

I'm writing a Python app which is currently hosted on Heroku. It is in an early development stage, so I'm using a free account with one web dyno. Still, I want my heavier tasks to be done asynchronously, so I'm using the IronWorker add-on. I have it all set up and it does the simplest jobs, like sending emails or anything that doesn't require any data being sent back to the application. The question is: how do I send the worker output back to my application from IronWorker? Or even better, how do I notify my app that the worker is done with the job?
I looked at other Iron solutions like cache and message queue, but the only thing I can find is that I can explicitly ask for the worker state. Obviously I don't want my web service to poll the worker, because it kind of defeats the original purpose of moving the tasks to the background. What am I missing here?
I see this question is high in Google so in case you came here with hopes to find some more details, here is what I ended up doing:
First, I prepared the endpoint on my app. My app uses Flask, so this is how the code looks:
@app.route("/worker", methods=["GET", "POST"])
def worker():
    # refresh the interface or whatever is necessary
    if flask.request.method == 'POST':
        return 'Worker endpoint reached'
    elif flask.request.method == 'GET':
        worker = IronWorker()
        task = worker.queue(code_name="hello",
                            payload={"WORKER_DB_URL": app.config['WORKER_DB_URL'],
                                     "WORKER_CALLBACK_URL": app.config['WORKER_CALLBACK_URL']})
        details = worker.task(task)
        flask.flash("Work queued, response: ", details.status)
        return flask.redirect('/')
Note that in my case GET is here only for testing; I don't want my users to hit this endpoint and invoke the task. But I can imagine situations where this is actually useful, specifically if you don't use any type of scheduler for your tasks.
With the endpoint ready, I started looking for a way of visiting that endpoint from the worker. I found the fantastic requests library and used it in my worker:
import sys, json
from sqlalchemy import *
import requests

print "hello_worker initialized, connecting to database..."

payload = None
payload_file = None
for i in range(len(sys.argv)):
    if sys.argv[i] == "-payload" and (i + 1) < len(sys.argv):
        payload_file = sys.argv[i + 1]
        break

f = open(payload_file, "r")
contents = f.read()
f.close()
payload = json.loads(contents)

print "contents: ", contents
print "payload as json: ", payload

db_url = payload['WORKER_DB_URL']
print "connecting to database ", db_url
db = create_engine(db_url)
metadata = MetaData(db)
print "connection to the database established"

users = Table('users', metadata, autoload=True)
s = users.select()

#def run(stmt):
#    rs = stmt.execute()
#    for row in rs:
#        print row
#run(s)

callback_url = payload['WORKER_CALLBACK_URL']
print "task finished, sending post to ", callback_url
r = requests.post(callback_url)
print r.text
So in the end there is no real magic here; the only important thing is to send the callback URL in the payload if you need to notify your page when the task is done. Alternatively you can place the endpoint URL in the database, if you use one in your app. By the way, the snippet above also shows how to connect to a PostgreSQL database in your worker and print all the users.
One last thing you need to be aware of is how to format your .worker file, mine looks like this:
# set the runtime language. Python workers use "python"
runtime "python"
# exec is the file that will be executed:
exec "hello_worker.py"
# dependencies
pip "SQLAlchemy"
pip "requests"
This will install the latest versions of SQLAlchemy and requests. If your project depends on a specific version of a library, do this instead:
pip "SQLAlchemy", "0.9.1"
Easiest way: push a message to your API from the worker - a log entry or anything else you need to have in your app.

How do I make one instance in Python that I can access from different modules?

I'm writing a web application that connects to a database. I'm currently using a variable in a module that I import from other modules, but this feels nasty.
# server.py
from hexapoda.application import application

if __name__ == '__main__':
    from paste import httpserver
    httpserver.serve(application, host='127.0.0.1', port='1337')

# hexapoda/application.py
from mongoalchemy.session import Session

db = Session.connect('hexapoda')

import hexapoda.tickets.controllers

# hexapoda/tickets/controllers.py
from hexapoda.application import db

def index(request, params):
    tickets = db.query(Ticket)
The problem is that I get multiple connections to the database (I guess that's because I import application.py in two different modules, so the Session.connect() function gets executed twice).
How can I access db from multiple modules without creating multiple connections (i.e. only call Session.connect() once in the entire application)?
Try the Twisted framework with something like:
from twisted.enterprise import adbapi

class db(object):
    def __init__(self):
        self.dbpool = adbapi.ConnectionPool('MySQLdb',
                                            db='database',
                                            user='username',
                                            passwd='password')

    def query(self, sql):
        self.dbpool.runInteraction(self._query, sql)

    def _query(self, tx, sql):
        tx.execute(sql)
        print tx.fetchone()
That's probably not what you want to do - a single connection per app means that your app can't scale.
The usual solution is to connect to the database when a request comes in and store that connection in a variable with "request" scope (i.e. it lives as long as the request).
A simple way to achieve that is to put it in the request:
request.db = ...connect...
Your web framework probably offers a way to annotate methods or something like a filter which sees all requests. Put the code to open/close the connection there.
If opening connections is expensive, use connection pooling.
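As an illustration only (the question's stack uses Paste/WSGI and mongoalchemy, so the framework hooks here are assumptions rather than the asker's setup), a request-scoped connection in a Flask-style app might look like:

from flask import Flask, g
from mongoalchemy.session import Session

app = Flask(__name__)

@app.before_request
def open_db():
    # One session per request, stored on the request-scoped object g.
    g.db = Session.connect('hexapoda')

@app.teardown_request
def close_db(exc):
    # Drop the request's session; clean it up here if your driver requires it.
    g.pop('db', None)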
