My Django application accumulates Postgres connections over time. About every 30 minutes a new connection is established, and the old connections are never closed (see screenshot).
Since max_connections is set to 100, after some time all connection slots are used up.
Does anyone know what is causing this problem?
I discovered this after I integrated some Celery tasks, so I'm quite sure it's related to Celery.
I therefore tried to close the connection manually after every task, using the after_return method:
from django.db import connection

class DBTask(Task):
    abstract = True

    def after_return(self, *args, **kwargs):
        connection.close()

@task(name='example', base=DBTask)
def example_task(value):
    # do some stuff
    pass
But this also doesn't help.
Maybe I am totally wrong and it isn't related to celery at all.
My database configuration:
DATABASES = {
    'default': {
        'ENGINE': 'django.contrib.gis.db.backends.postgis',
        'NAME': 'production',
        'USER': 'production',
        'HOST': 'some.host',
        'CONN_MAX_AGE': 0,
    },
}
Installed packages:
django 1.8.9
psycopg2 2.6.1
celery 3.1.20
django-celery 3.1.17
The app is deployed on WebFaction (maybe this helps).
I have also seen this question, but setting CONN_MAX_AGE: 0 didn't help.
Update:
Tried adding connection.close() at the end of each celery task, but the number of connections is still increasing.
Update 2:
Tried adding connection.close() at the top of the celery file, but this didn't help either.
Update 3:
Here is the code I am actually using in the celery tasks:
celery_tasks.py
@task(name='push_notifications', base=DBTask)
def push_notifications_task(user_id):
    user = CustomUser.objects.get(id=user_id)
    PusherAPI().push_notifications(user)
    connection.close()
models.py
class PusherAPI(object):
    def push_notifications(self, user):
        from .serializers import NotificationSerializer
        self.pusher.trigger(
            'user_%s' % user.slug,
            'notifications',
            NotificationSerializer(user).data
        )
serializers.py
class NotificationSerializer(object):
    def __init__(self, user=None):
        if user is None:
            self.user = get_current_user()
        else:
            self.user = user

    @property
    def data(self):
        # get notifications from db
        notifications = self.user.notifications.unread()
        # create the notification dict
        ...
        return note_dict
The only DB queries are in CustomUser.objects.get(id=user_id) and notifications = self.user.notifications.unread().
Make sure it is actually old connections that are not closed and not new connections that pile up because some part of your application can't handle the load. Have a look at the individual connections, e.g. with SELECT * FROM pg_stat_activity;
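For instance, from ./manage.py shell you could group the open connections by state (a minimal sketch; the state and application_name columns assume PostgreSQL 9.2+):

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute(
        "SELECT state, application_name, count(*) "
        "FROM pg_stat_activity "
        "GROUP BY state, application_name"
    )
    for state, app_name, count in cursor.fetchall():
        print(state, app_name, count)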
Related
I need to programmatically generate the CREATE TABLE statement for a given unmanaged model in my Django app (managed = False)
Since I'm working on a legacy database, I don't want to create a migration and use sqlmigrate.
The ./manage.py sql command was useful for this purpose, but it was removed in Django 1.8.
Do you know about any alternatives?
As suggested, I'm posting a complete answer for the case that the question might imply.
Suppose you have an external DB table that you decided to access as a Django model, and have therefore described it as an unmanaged model (Meta: managed = False).
Later you need to be able to create it in your code, e.g. for some tests using your local DB. Obviously, Django doesn't make migrations for unmanaged models and therefore won't create the table in your test DB.
This can be solved using Django APIs without resorting to raw SQL: SchemaEditor. See a more complete example below, but as a short answer you would use it like this:
from django.db import connections

with connections['db_to_create_a_table_in'].schema_editor() as schema_editor:
    schema_editor.create_model(YourUnmanagedModelClass)
A practical example:
# your_app/models/your_model.py

from django.db import models

class IntegrationView(models.Model):
    """A read-only model to access a view in some external DB."""

    class Meta:
        managed = False
        db_table = 'integration_view'

    name = models.CharField(
        db_column='object_name',
        max_length=255,
        primary_key=True,
        verbose_name='Object Name',
    )
    some_value = models.CharField(
        db_column='some_object_value',
        max_length=255,
        blank=True,
        null=True,
        verbose_name='Some Object Value',
    )

    # Depending on the situation it might be a good idea to redefine
    # some methods as a NOOP as a safety net.
    # Note that it's not completely safe this way, but it might help with
    # some silly mistakes in user code.
    def save(self, *args, **kwargs):
        """Prevent data modification."""
        pass

    def delete(self, *args, **kwargs):
        """Prevent data deletion."""
        pass
Now, suppose you need to be able to create this model via Django, e.g. for some tests.
# your_app/tests/some_test.py

# This will allow us to access the `SchemaEditor` for the DB
from django.db import connections
from django.test import TestCase

from your_app.models.your_model import IntegrationView

class SomeLogicTestCase(TestCase):
    """Tests some logic that uses `IntegrationView`."""

    # Since the `IntegrationView` is assumed to be read-only for the case
    # being described, it's a good idea to put the setup logic in the class
    # setup fixture, which will run only once for the whole test case.
    @classmethod
    def setUpClass(cls):
        """Prepares `IntegrationView` mock data for the test case."""
        super().setUpClass()
        # This is the actual part that will create the table in the DB
        # for the unmanaged model (any model in fact, but managed models
        # will have their tables created already by the Django testing
        # framework).
        # Note: Here we're able to choose which DB, defined in your
        # settings, will be used to create the table.
        with connections['external_db'].schema_editor() as schema_editor:
            schema_editor.create_model(IntegrationView)
        # That's all you need; after this statement executes, a DB table
        # for `IntegrationView` will be created in the DB defined as
        # `external_db`.

        # Now suppose we need to add some mock data...
        # Again, if we consider the table to be read-only, the data can be
        # defined here; otherwise it's better to do it in the `setUp()` method.

        # Remember `IntegrationView.save()` is overridden as a NOOP, so simple
        # calls to `IntegrationView.save()` or `IntegrationView.objects.create()`
        # won't do anything, so we need to "Improvise. Adapt. Overcome."

        # One way is to use the `save()` method of the base class,
        # but provide the instance of our class.
        integration_view = IntegrationView(
            name='Biggus Dickus',
            some_value='Something really important.',
        )
        super(IntegrationView, integration_view).save(using='external_db')

        # Another one is to use `bulk_create()`, which doesn't use
        # `save()` internally, and in fact is a better solution
        # if we're creating many records.
        IntegrationView.objects.using('external_db').bulk_create([
            IntegrationView(
                name='Sillius Soddus',
                some_value='Something important',
            ),
            IntegrationView(
                name='Naughtius Maximus',
                some_value='Whatever',
            ),
        ])

    # Don't forget to clean up afterwards.
    @classmethod
    def tearDownClass(cls):
        with connections['external_db'].schema_editor() as schema_editor:
            schema_editor.delete_model(IntegrationView)
        super().tearDownClass()

    def test_some_logic_using_data_from_integration_view(self):
        self.assertTrue(IntegrationView.objects.using('external_db').filter(
            name='Biggus Dickus',
        ))
To make the example more complete... Since we're using multiple DBs (default and external_db), Django will try to run migrations on both of them for the tests, and as of now there's no option in the DB settings to prevent this. So we have to use a custom DB router for testing.
# your_app/tests/base.py

class PreventMigrationsDBRouter:
    """DB router to prevent migrations for specific DBs during tests."""

    _NO_MIGRATION_DBS = {'external_db'}

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        """Actually disallows migrations for specific DBs."""
        return db not in self._NO_MIGRATION_DBS
And a test settings file example for the described case:
# settings/test.py

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.oracle',
        'NAME': 'db_name',
        'USER': 'username',
        'HOST': 'localhost',
        'PASSWORD': 'password',
        'PORT': '1521',
    },
    # In production here we would have settings to connect to the external DB,
    # but for testing purposes we can get by with an SQLite DB.
    'external_db': {
        'ENGINE': 'django.db.backends.sqlite3',
    },
}

# It's not necessary to use the router in the production config, since if the
# DB is unspecified explicitly for some action, Django will use the `default` DB.
DATABASE_ROUTERS = ['your_app.tests.base.PreventMigrationsDBRouter']
Hope this detailed, new-Django-user-friendly example helps someone and saves their time.
Unfortunately there seems to be no easy way to do this, but luckily for you I have just succeeded in producing a working snippet by digging into the internals of the Django migrations jungle.
Just:
save the code to get_sql_create_table.py (for example)
run $ export DJANGO_SETTINGS_MODULE=yourproject.settings
launch the script with python get_sql_create_table.py yourapp.yourmodel
and it should output what you need.
Hope it helps!
import django

django.setup()

from django.db import connections
from django.db.migrations import operations
from django.db.migrations.migration import Migration
from django.db.migrations.state import ModelState, ProjectState

def get_create_sql_for_model(model):
    model_state = ModelState.from_model(model)

    # Create a fake migration with the CreateModel operation
    cm = operations.CreateModel(name=model_state.name, fields=model_state.fields)
    migration = Migration("fake_migration", "app")
    migration.operations.append(cm)

    # Let the migration framework think that the project is in an initial state
    state = ProjectState()

    # Get the SQL through the schema_editor bound to the connection
    connection = connections['default']
    with connection.schema_editor(collect_sql=True, atomic=migration.atomic) as schema_editor:
        state = migration.apply(state, schema_editor, collect_sql=True)

    # Return the CREATE TABLE statement
    return "\n".join(schema_editor.collected_sql)

if __name__ == "__main__":
    import importlib
    import sys

    if len(sys.argv) < 2:
        print("Usage: {} <app.model>".format(sys.argv[0]))
        sys.exit(100)

    app, model_name = sys.argv[1].split('.')
    models = importlib.import_module("{}.models".format(app))
    model = getattr(models, model_name)
    rv = get_create_sql_for_model(model)
    print(rv)
For Django v4.1.3, the above get_create_sql_for_model source code changes like this:
from django.db import connections
from django.db.migrations import operations
from django.db.migrations.migration import Migration
from django.db.migrations.state import ModelState, ProjectState

def get_create_sql_for_model(model):
    model_state = ModelState.from_model(model)
    table_name = model_state.options['db_table']

    # Create a fake migration with the CreateModel operation
    cm = operations.CreateModel(name=model_state.name, fields=model_state.fields.items())
    migration = Migration("fake_migration", "app")
    migration.operations.append(cm)

    # Let the migration framework think that the project is in an initial state
    state = ProjectState()

    # Get the SQL through the schema_editor bound to the connection
    connection = connections['default']
    with connection.schema_editor(collect_sql=True, atomic=migration.atomic) as schema_editor:
        state = migration.apply(state, schema_editor, collect_sql=True)

    # Drop SQL comments and keep only the actual statements
    sqls = schema_editor.collected_sql
    items = []
    for sql in sqls:
        if sql.startswith('--'):
            continue
        items.append(sql)
    return table_name, items
# EOP
I used it to create all tables (like the syncdb command of old Django versions):
for app in settings.INSTALLED_APPS:
    app_name = app.split('.')[0]
    app_models = apps.get_app_config(app_name).get_models()
    for model in app_models:
        table_name, sqls = get_create_sql_for_model(model)
        if settings.DEBUG:
            s = "SELECT COUNT(*) AS c FROM sqlite_master WHERE name = '%s'" % table_name
        else:
            s = "SELECT COUNT(*) AS c FROM information_schema.TABLES WHERE table_name='%s'" % table_name
        rs = select_by_raw_sql(s)
        if not rs[0]['c']:
            for sql in sqls:
                exec_by_raw_sql(sql)
            print('CREATE TABLE DONE:%s' % table_name)
The full source code can be found at Django syncdb command came back for v4.1.3 version.
For example, assume that I have 100 clients who use WordPress, and I have to write a service in Django that should return a list of posts from WordPress's MySQL DB. The problem is that the 100 clients all have different database connection settings.
I know that I can use a DatabaseRouter to switch between databases that are already loaded in settings. But I don't know how to make a single model class use different database settings.
I have tried mutating settings.
I also tried mutating the model's app_label.
But I later understood that mutating anything in Django at runtime is pointless.
My Requirements
I want to create one model and dynamically change the database connection. The list of connections can live in a managed database table. But I don't want to unnecessarily load all the connection settings or create multiple models.
I made something like that, but to switch MongoDB connections.
I created a generic view that selects the connection and uses it in get_queryset.
I'm using Django REST Framework, so I made something like this:
from mongoengine.connection import DEFAULT_CONNECTION_NAME

class SwitchDBMixinView(object):
    model = None
    fields = None

    def initial(self, request, *args, **kwargs):
        result = super().initial(request, *args, **kwargs)
        if request.user.is_authenticated():
            request.user.database_connection.register()
        return result

    def get_object(self, *args, **kwargs):
        return super().get_object(*args, **kwargs).switch_db(self.get_db_alias())

    def get_db_alias(self):
        if self.request is None or not self.request.user.is_authenticated():
            return DEFAULT_CONNECTION_NAME
        return self.request.user.database_connection.name

    def get_queryset(self):
        return self.model.objects.using(self.get_db_alias()).all()

    def perform_destroy(self, instance):
        instance.switch_db(self.get_db_alias()).delete()
The model:
from django.conf import settings
from django.db import models
from mongoengine.connection import register_connection, get_connection

AUTH_USER_MODEL = getattr(settings, 'AUTH_USER_MODEL')

class Connection(models.Model):
    class Meta:
        pass

    owner = models.OneToOneField(
        AUTH_USER_MODEL,
        related_name='database_connection',
    )
    uri = models.TextField(
        default=DefaultMongoURI()
    )

    def register(self):
        register_connection(
            self.name,
            host=self.uri,
            tz_aware=True,
        )
        get_connection(
            self.name,
            reconnect=True,
        )

    def get_name(self):
        return 'client-%d' % self.owner.pk

    name = property(get_name)

    def __str__(self):
        return self.uri
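For completeness, a hypothetical view wiring the mixin into a DRF generic view might look like this (Post and PostSerializer are placeholder names, not from the original code):

from rest_framework import generics

class PostListView(SwitchDBMixinView, generics.ListAPIView):
    model = Post
    serializer_class = PostSerializer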
You may want to have a look at django.db.connections (in django/db/__init__.py) and django.db.utils.ConnectionHandler (which django.db.connections is an instance of). This should let you dynamically add new DB configs without hacking settings.DATABASES (actually, ConnectionHandler builds its _databases attribute from settings.DATABASES). I can't tell for sure since I never tried, but it should mostly boil down to:
from django import db

def add_db(alias, connection_infos):
    databases = db.connections.databases
    if alias in databases:
        # either raise or log-and-ignore, your choice
        raise ValueError("alias %r already registered" % alias)
    db.connections.databases[alias] = connection_infos
where connection_infos is a mapping similar to the ones expected in settings.DATABASES.
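For example, a hypothetical call might look like this (all values are placeholders):

add_db('client_42', {
    'ENGINE': 'django.db.backends.mysql',
    'NAME': 'wordpress_client_42',
    'USER': 'django',
    'PASSWORD': 'secret',
    'HOST': 'client42.example.com',
    'PORT': '3306',
})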
Then it's mostly a matter of using QuerySet.using(alias) for your queries, i.e.:
alias = get_alias_for_user(request.user)
posts = Post.objects.using(alias).all()
cf https://docs.djangoproject.com/en/1.11/topics/db/multi-db/#manually-selecting-a-database
The main problem with this, IMHO (assuming you manage to make something that works out of the untested suggestion above), is that you will have to store database users/passwords in cleartext somewhere, which can be a major security issue. I don't know how much control you have over the database admin part, but it would be better if you could add a 'django' user with the same password (and appropriate permissions, of course) on all those databases, so you can keep the password in your settings file instead of having to keep it in your main DB.
I want to work with multiple databases with the Python Pyramid framework and SQLAlchemy.
I have 1 database with user information, and multiple databases (with the same structure) where the application information is stored. Each user at login time selects a database and is only shown information from that database (not others).
How should I structure my code?
I was thinking of saving the DB name in the session, then on every request checking the user's permission on the selected database and generating a new DB session. So my view would look like this (pseudocode):
@view_config(route_name='home', renderer='json')
def my_view_ajax(request):
    try:
        database = int(request.GET['database'])
        # check user permissions from user database
        engine = create_engine('postgresql://XXX:XXX@localhost/' + str(database))
        DBSession.configure(bind=engine)
        items = DBSession.query('table').all()
    except DBAPIError:
        return 'error'
    return items
Should I generate a new db session with the user information on each request? Or is there a better way?
Thanks
This is quite easy to do in Pyramid+SQLAlchemy, but you'll likely want to switch to a heavier-boilerplate, more manual session management style, and you'll want to be up on the session management docs for SQLA, 'cause you can easily trip up when working with multiple concurrent sessions. Also, things like connection management should stay out of views and live in components that are created during server start-up and shared across request threads. If you're doing it right in Pyramid, your views should be pretty small, and you should have lots of components that work together through the ZCA (the registry).
In my apps, I have DB factory objects that hand out sessions when asked for them, and I instantiate these objects in the server start-up code (the stuff in __init__.py) and save them on the registry. Then you can attach sessions for each DB to your request object with the reify decorator, and also attach a housekeeping end-of-request cleanup method to close them. This can be done either with a custom request factory or with the methods for attaching to the request right from __init__; I personally wind up using custom factories, as I find them easier to read and I usually end up adding more there.
from pyramid.config import Configurator
from pyramid.decorator import reify
from pyramid.request import Request
from pyramid.response import Response
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# our DBFactory component, from some model package
class DBFactory(object):
    def __init__(self, db_url, **kwargs):
        db_echo = kwargs.get('db_echo', False)
        self.engine = create_engine(db_url, echo=db_echo)
        self.DBSession = sessionmaker(autoflush=False)
        self.DBSession.configure(bind=self.engine)
        self.metadata = Base.metadata
        self.metadata.bind = self.engine

    def get_session(self):
        session = self.DBSession()
        return session

# we instantiate them in the __init__.py file, and save them on the registry
def main(global_config, **settings):
    """runs on server start, returns a Pyramid WSGI application"""
    config = Configurator(
        settings=settings,
        # ask for a custom request factory
        request_factory=MyRequest,
    )
    config.registry.db1_factory = DBFactory(db_url=settings['db_1_url'])
    config.registry.db2_factory = DBFactory(db_url=settings['db_2_url'])
    return config.make_wsgi_app()

# and our custom request class, probably in another file
class MyRequest(Request):
    """override the pyramid request object to add explicit db session handling"""

    @reify
    def db1_session(self):
        """returns the db session at start of the request lifecycle"""
        # register callback to close the session automatically after
        # everything else in the request lifecycle is done
        self.add_finished_callback(self.close_dbs_1)
        return self.registry.db1_factory.get_session()

    @reify
    def db2_session(self):
        self.add_finished_callback(self.close_dbs_2)
        return self.registry.db2_factory.get_session()

    def close_dbs_1(self, request):
        request.db1_session.close()

    def close_dbs_2(self, request):
        request.db2_session.close()

# now view code can be very simple
def my_view(request):
    # get from db 1
    stuff = request.db1_session.query(Stuff).all()
    other_stuff = request.db2_session.query(OtherStuff).all()
    # the above sessions will be closed at the end of the request when
    # pyramid calls your close methods on the request factory
    return Response("all done, no need to manually close sessions here!")
I have a Python module, UserManager, that takes care of all things user-management related: users, groups, rights, authentication. Access to these assets is provided via a master class that is passed an SQLAlchemy engine parameter in its constructor. The engine is needed to make the table-class mappings (using mapper objects) and to emit sessions.
This is how the global variables are established in the app module:
from sqlalchemy import MetaData, Table, Column
from sqlalchemy.orm import mapper, scoped_session, sessionmaker

class UserManager:
    def __init__(self, db):
        self.db = db
        self._db_session = None
        meta = MetaData(db)
        user_table = Table(
            'USR_User', meta,
            Column('field1'),
            Column('field3')
        )
        mapper(User, user_table)

    @property
    def db_session(self):
        if self._db_session is None:
            self._db_session = scoped_session(sessionmaker())
            self._db_session.configure(bind=self.db)
        return self._db_session

class User(object):
    def __init__(self, um):
        self.um = um

from flask.ext.sqlalchemy import SQLAlchemy

db = SQLAlchemy(app)
um = UserManager(db.engine)
This module is designed to be context-agnostic on purpose, so that it can be used both for locally run and web applications.
But here the problems arise: from time to time I get the dreaded "Can't reconnect until invalid transaction is rolled back" error, presumably caused by some failed transaction in the UserManager code.
I am now trying to identify the source of the problem. Maybe this is not the right way to handle the database in the dynamic context of a web server? Perhaps I have to pass db.session to the um object so that I can be sure the DB connections are not mixed up?
In a web context you should consider each user's request to be isolated. For this you must use flask.g:
To share data that is valid for one request only from one function to another, a global variable is not good enough because it would break in threaded environments. Flask provides you with a special object that ensures it is only valid for the active request and that will return different values for each request. In a nutshell: it does the right thing, like it does for request and session.
You can read more about it here.
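As a minimal sketch (assuming the um object from the question and standard Flask request hooks; this is not from the original answer), per-request session handling could look like this:

from flask import g

@app.before_request
def open_db_session():
    # one session per request, stored on the request-local `g`
    g.db_session = um.db_session

@app.teardown_request
def close_db_session(exc):
    session = getattr(g, 'db_session', None)
    if session is not None:
        if exc is not None:
            # roll back a failed transaction so the connection can be
            # reused (avoids "Can't reconnect until invalid transaction
            # is rolled back")
            session.rollback()
        session.close()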
I'm writing a management command where I want to change the default isolation level. Django and my database default to "READ COMMITTED", and I need "READ UNCOMMITTED" only for this particular management command.
When running:
./manage.py my_command
I've noticed that Django by default opens a transaction at the default isolation level even if your command doesn't need any database connection:
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Updates assortment data by refreshing the mviews"

    def handle(self, *args, **options):
        print "fdkgjldfkjgdlkfjgklj"
This behaviour doesn't fit my problem, so I'm asking if there is a way to:
Write a management command where Django doesn't even touch the database, leaving all transaction control completely manual?
Write a management command where you can define transaction characteristics only for it?
Regards
I came across your post on Facebook and thought I might be able to be of some help :-)
You can specify database connections with read uncommitted with the following database settings in your settings.py:
DATABASES = {
    'default': {...},
    'uncommitted_db': {
        'ENGINE': ...,
        'NAME': ...,
        'USER': '...',
        'PASSWORD': '...',
        'HOST': '...',
        'OPTIONS': {
            # pick the line that matches your backend:
            'init_command': 'SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED',  # MySQL
            # 'init_command': 'SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED',  # Postgres
        },
    },
}
With this in place you can access your non-transactional database connection using Django's normal multidatabase syntax:
Model.objects.using('uncommitted_db').all()
Of course, you might not want your non-transactional database connection to be globally available in your entire application, so you'd ideally want it to be available only during the execution of this management command. Unfortunately, management commands don't work like that: once you hit the handle method on the Command class, your settings.py has already been parsed and your database connections have already been created. If you can find a way to re-initialise Django with a new set of database settings at runtime, or have a logical split in your settings.py based on your launch conditions, like so:
import sys

if 'some_management_cmd' in sys.argv:
    DATABASES['default']['OPTIONS']['init_command'] = 'SET TRANSACTION...'
this could work, but is pretty horrible!
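For what it's worth, a less invasive sketch of the same idea (my assumption, not part of the original answer) is to set the isolation level with a raw cursor at the top of handle(), so it only affects this command's connection. The statement below is PostgreSQL syntax; note that PostgreSQL itself treats READ UNCOMMITTED as READ COMMITTED:

from django.core.management.base import BaseCommand
from django.db import connection

class Command(BaseCommand):
    def handle(self, *args, **options):
        with connection.cursor() as cursor:
            # applies to this session only; on MySQL you would use
            # SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
            cursor.execute(
                "SET SESSION CHARACTERISTICS AS TRANSACTION "
                "ISOLATION LEVEL READ UNCOMMITTED"
            )
        # ...the command's subsequent queries run at the new level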