How to make a Django management command not open a transaction?

I'm writing a management command where I want to change the default isolation level. Django and my database default to "READ COMMITTED", and I need "READ UNCOMMITTED", but only for this particular management command.
When running:
./manage.py my_command
I've noticed that Django by default opens a transaction with the default isolation level, even if your command doesn't need any database connection:
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Updates assortment data by refreshing the mviews"

    def handle(self, *args, **options):
        print "fdkgjldfkjgdlkfjgklj"
This behaviour doesn't fit my problem, so I'm asking if there is a way to either:
write a management command where Django doesn't even touch the database, leaving all transaction control completely manual, or
write a management command where you can define transaction characteristics only for it?
Regards

I came across your post on Facebook and thought I might be able to be of some help :-)
You can specify database connections with read uncommitted with the following database settings in your settings.py:
DATABASES = {
    'default': {...},
    'uncommitted_db': {
        'ENGINE': '...',
        'NAME': '...',
        'USER': '...',
        'PASSWORD': '...',
        'HOST': '...',
        'OPTIONS': {
            # use the line that matches your backend (a dict can't hold both keys at once):
            'init_command': 'SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED',  # MySQL
            # 'init_command': 'SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED',  # Postgres
        }
    }
}
With this in place you can access your non-transactional database connection using Django's normal multidatabase syntax:
Model.objects.using('uncommitted_db').all()
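For the management command itself, a minimal sketch (assuming the 'uncommitted_db' alias above; Assortment is a hypothetical model used purely for illustration) could route its queries through that connection explicitly:

from django.core.management.base import BaseCommand

from myapp.models import Assortment  # hypothetical model, for illustration


class Command(BaseCommand):
    help = "Reads assortment data through the READ UNCOMMITTED connection"

    def handle(self, *args, **options):
        # every query routed through 'uncommitted_db' runs on the connection
        # that was initialised with the READ UNCOMMITTED init_command
        for assortment in Assortment.objects.using('uncommitted_db').iterator():
            self.stdout.write(str(assortment.pk))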
Of course, you might not want your non-transactional database connection to be globally available to your entire application, so you'd ideally want it available only during the execution of this management command. Unfortunately, management commands don't work like that: by the time the handle method on the Command class is called, your settings.py has already been parsed and your database connections have already been created. If you can find a way to re-initialise Django with a new set of database settings at runtime, or add a logical split in your settings.py based on your launch conditions, like so:
import sys

if 'some_management_cmd' in sys.argv:
    DATABASES['default']['OPTIONS']['init_command'] = 'SET TRANSACTION...'
this could work, but is pretty horrible!
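Another option, not from the answer above but a common workaround: keep the single default connection and switch the isolation level manually at the start of the command with raw SQL. A rough sketch, assuming MySQL (where SET SESSION ... applies to subsequent transactions on that connection):

from django.core.management.base import BaseCommand
from django.db import connection


class Command(BaseCommand):
    help = "Runs with READ UNCOMMITTED on this command's connection only"

    def handle(self, *args, **options):
        cursor = connection.cursor()
        # affects only this session/connection, not the rest of the project
        cursor.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED")
        # ... run the actual queries here ...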

Related

Configuring PostgreSQL schema for default Django DB working with PgBouncer connection pool

I need to set the default DB schema for a Django project, so that all apps (including 3rd-party apps) create their tables in the configured PostgreSQL schema.
One solution is to use a DB connection option, like this:
# in the Django settings module, add "OPTIONS" to the default DB, specifying "search_path" for the connection
DB_DEFAULT_SCHEMA = os.environ.get('DB_DEFAULT_SCHEMA', 'public')  # fall back to PostgreSQL's default "public" schema
DATABASES['default']['OPTIONS'] = {'options': f'-c search_path={DB_DEFAULT_SCHEMA}'}
This works for a direct connection to PostgreSQL, but not when connecting through PgBouncer (to use connection pools): it fails with OperationalError: unsupported startup parameter: options. It appears PgBouncer doesn't recognize options as a startup parameter (at this point in time).
Another solution to set the schema without using startup parameters is to prefix all table names with the schema. To make sure this works for built-in and 3rd-party apps too (not just my own app), one solution is to inject the schema name into the db_table attribute of all models as they're being loaded by Django, using the class_prepared signal and an AppConfig. This approach is close to what projects like django-db-prefix use; one only needs to make sure the schema name is properly quoted:
from django.conf import settings
from django.db.models.signals import class_prepared


def set_model_schema(sender, **kwargs):
    schema = getattr(settings, "DB_DEFAULT_SCHEMA", "")
    db_table = sender._meta.db_table
    # db_table[1:] skips a possible leading quote when checking for the prefix
    if schema and not db_table[1:].startswith(schema):
        sender._meta.db_table = '"{}"."{}"'.format(schema, db_table)


class_prepared.connect(set_model_schema)
This works for connection pools too, but it doesn't play well with Django migrations.
With this solution, python manage.py migrate fails, because the migrate command checks that the django_migrations table exists by introspecting the existing tables, and the db_table prefix on models has no effect on that introspection.
I'm curious what a proper way to solve this problem could be.
This is the solution I came up with, mixing both solutions above and using 2 separate DB connections:
Using the connection startup parameters (which work well for apps and migrations), but only to run migrations, not the app server. This means Django migrations have to connect to PostgreSQL directly, not via PgBouncer, which in my case is fine.
Prefixing DB tables with the schema using a class_prepared signal handler, but excluding the django_migrations table. The handler is registered in a Django app (say django_dbschema) using the AppConfig.__init__() method, which is the first stage of the project initialization process, so all other apps are affected. An environment variable flags bypassing this registration, and it is set when running migrations. This way the app server can connect through PgBouncer just as well, while Django migrations remain unaware of schema prefixes.
Two environment variables (read by the Django settings module) configure this behavior: DB_DEFAULT_SCHEMA is the name of the schema, and DB_SCHEMA_NO_PREFIX disables the registration of the signal handler. It looks like this:
The django_dbschema app structure (in the root of the project)
django_dbschema/
├── apps.py
├── __init__.py
where apps.py defines the signal handler and AppConfig to register it:
from django.apps import AppConfig
from django.conf import settings
from django.db.models.signals import class_prepared


def set_model_schema(sender, **kwargs):
    """Prefix the DB table name for the model with the configured DB schema,
    excluding Django's own migrations table (django_migrations).

    Because the Django migrate command directly introspects tables in the DB,
    looking for an existing "django_migrations" table, prefixing that table
    with the schema won't work: Django migrations would think the table
    doesn't exist and try to create it. So Django migrations can/should not
    use this app to target a schema.
    """
    schema = getattr(settings, "DB_DEFAULT_SCHEMA", "")
    if schema == "":
        return
    db_table = sender._meta.db_table
    if db_table != "django_migrations" and not db_table[1:].startswith(schema):
        # double quotes are important to target a schema
        sender._meta.db_table = '"{}"."{}"'.format(schema, db_table)


class DjangoDbschemaConfig(AppConfig):
    """Django app to register a signal handler for the model class preparation
    signal, to prefix all models' DB tables with the schema name from
    "DB_DEFAULT_SCHEMA" in settings.

    This is better than specifying "search_path" in the "options" of the
    connection, because this approach works both for direct connections AND
    connection pools (where the "options" connection parameter is not
    accepted by PgBouncer).

    NOTE: This app defines __init__() to register the class_prepared signal.
    Make sure no models are imported in __init__. See
    https://docs.djangoproject.com/en/3.2/ref/signals/#class-prepared

    NOTE: The signal handler for this app excludes the Django migrations
    table, so Django migrations can/should not use this app to target a
    schema. This means that with this enabled, when starting the app server,
    Django thinks migrations are missing and always warns with:
    You have ... unapplied migration(s). Your project may not work properly until you apply the migrations for ...
    To actually run migrations (python manage.py migrate), use another way to
    set the schema.
    """

    name = "django_dbschema"
    verbose_name = "Configure DB schema for Django models"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        schema = getattr(settings, "DB_DEFAULT_SCHEMA", "")
        # don't register the signal handler if no schema is set or prefixing is disabled
        if schema and not getattr(settings, "DB_SCHEMA_NO_PREFIX", False):
            class_prepared.connect(set_model_schema)
This app is registered in the INSTALLED_APPS list (I had to use the full class path to the app config, otherwise Django wouldn't load my AppConfig definition).
Also, the Django settings module (say settings.py) defines one extra DB connection (a copy of default), but with connection options:
# ...
INSTALLED_APPS = [
    'django_dbschema.apps.DjangoDbschemaConfig',  # has to be the full path to the class, otherwise Django won't load the local app
    'django.contrib.admin',
    # ...
]

# 2 new settings to control the schema and prefixing
DB_DEFAULT_SCHEMA = os.environ.get('DB_DEFAULT_SCHEMA', '')
DB_SCHEMA_NO_PREFIX = os.environ.get('DB_SCHEMA_NO_PREFIX', False)  # force-disable prefixing DB tables with the schema

DATABASES = {
    'default': {  # default DB connection, used by the app (not migrations); works with PgBouncer, so no connection options
        # ...
    }
}

# default_direct: the default DB connection, but a direct connection, NOT A CONNECTION POOL, so it can have connection options
DATABASES['default_direct'] = deepcopy(DATABASES['default'])

# explicit test DB info prevents Django tests from confusing multiple DB aliases to the same actual DB with circular dependencies
DATABASES['default_direct']['TEST'] = {'DEPENDENCIES': [], 'NAME': 'test_default_direct'}

# allow overriding connection parameters if necessary
if os.environ.get('DIRECT_DB_HOST'):
    DATABASES['default_direct']['HOST'] = os.environ.get('DIRECT_DB_HOST')
if os.environ.get('DIRECT_DB_PORT'):
    DATABASES['default_direct']['PORT'] = os.environ.get('DIRECT_DB_PORT')
if os.environ.get('DIRECT_DB_NAME'):
    DATABASES['default_direct']['NAME'] = os.environ.get('DIRECT_DB_NAME')

if DB_DEFAULT_SCHEMA:
    DATABASES['default_direct']['OPTIONS'] = {'options': f'-c search_path={DB_DEFAULT_SCHEMA}'}
# ...
Now setting the environment variable DB_DEFAULT_SCHEMA=myschema configures the schema. To run migrations, we'll set the proper environment variable, and explicitly use the direct DB connection:
env DB_SCHEMA_NO_PREFIX=True python manage.py migrate --database default_direct
And when the app server runs, it'll use the default DB connection, which works with PgBouncer.
The downside is that, since Django migrations are excluded from the signal handler, Django thinks no migrations were run, so it always warns:
"You have ... unapplied migration(s). Your project may not work properly until you apply the migrations for ..."
This is not true, provided we make sure migrations are actually run before starting the app server.
A side note about this solution: the Django project now has multiple DB connection settings (if it didn't have them before). So, for example, DB migrations should be written to work with an explicit connection rather than relying on the default one. If RunPython is used in a migration, it should pass the connection alias (schema_editor.connection.alias) to the object manager when querying. For example:
my_model.save(using=schema_editor.connection.alias)
# or
my_model.objects.using(schema_editor.connection.alias).all()
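To illustrate, a RunPython operation that respects the alias it is being run against might look like this sketch ('myapp', MyModel, and the active filter are placeholders, not from the original post):

from django.db import migrations


def forwards(apps, schema_editor):
    # use the historical model and the alias this migration runs on,
    # instead of the implicit 'default' connection
    MyModel = apps.get_model('myapp', 'MyModel')
    db_alias = schema_editor.connection.alias
    MyModel.objects.using(db_alias).filter(active=False).delete()


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0001_initial'),
    ]

    operations = [migrations.RunPython(forwards, migrations.RunPython.noop)]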

How to use PostgreSQL's stored procedures or functions in Django project

I am working on a Django project and decided to write the logic in PostgreSQL instead of Python. So I created a stored procedure in PostgreSQL that looks like this:
create or replace procedure close_credit(id_loan int)
language plpgsql
as $$
begin
    update public.loan_loan
    set sum = 0
    where id = id_loan;
    commit;
end;$$
Then in settings.py, I made the following changes:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'pawnshop',
        'USER': 'admin',
        'PASSWORD': password.database_password,
        'HOST': 'localhost',
        'PORT': '',
    }
}
So the question is: how can I call this stored procedure from views.py?
P.S.
Maybe it sounds like a dumb question, but I really couldn't find any solution for Django.
I would recommend storing the procedure definition in a migration file, for example myapp/migrations/sql.py:
from django.db import migrations

SQL = """
create procedure close_credit(id_loan int)
language plpgsql
as $$
begin
    update public.loan_loan
    set sum = 0
    where id = id_loan;
    commit;
end;$$
"""


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0001_initial'),
    ]

    operations = [migrations.RunSQL(SQL)]
Note: you will need to replace myapp with the name of your application, and you will need to include only the most recent migration file for your app as a dependency.
Now you can install the procedure using python3 manage.py migrate.
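Optionally (my addition, not part of the original answer), you can pass reverse_sql to the RunSQL operation so the migration can also be unapplied cleanly:

operations = [
    migrations.RunSQL(
        SQL,
        reverse_sql="drop procedure if exists close_credit;",
    )
]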
Once your procedure is defined in the database, you can call it using cursor.callproc:
from django.db import connection


def close_credit(id_loan):
    with connection.cursor() as cursor:
        cursor.callproc('close_credit', [id_loan])
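A minimal usage sketch in views.py (the URL wiring, the loan_id parameter, and the module holding the helper are assumptions for illustration):

from django.http import HttpResponse

from .db import close_credit  # wherever the helper above is defined


def close_credit_view(request, loan_id):
    close_credit(loan_id)
    return HttpResponse("loan %s closed" % loan_id)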
All that being said, if your procedure is really as trivial as the example you provided, it would cost much less in maintenance to write the equivalent using the ORM:
Loan.objects.filter(id=id_loan).update(sum=0)

django with multiple databases and foreignkeys for User

Suppose I have a Django app on my server, but I wish to do authentication using django.contrib.auth.models, where the User and Group models/data live on another server in another database. In Django, my DATABASES setting would be something like this:
DATABASES = {
    'default': {},
    'auth_db': {
        'NAME': 'my_auth_db',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'someuser',
        'PASSWORD': 'somepassword',
        'HOST': 'some.host.com',
        'PORT': '3306',
    },
    'myapp': {
        'NAME': 'myapp_db',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'localuser',
        'PASSWORD': 'localpass',
    }
}

DATABASE_ROUTERS = ['pathto.dbrouters.AuthRouter', 'pathto.dbrouters.MyAppRouter']
First question: will this work, i.e. will it allow me to log in to my Django app using users that are stored in the remote DB 'my_auth_db'?
Assuming the answer to the above is yes: what happens if, in my local DB (app 'myapp'), I have models with a ForeignKey to User? In other words, my model SomeModel is defined in myapp and should live in myapp_db, but it has a ForeignKey to a User in my_auth_db:
class SomeModel(models.Model):
    user = models.ForeignKey(User, unique=False, null=False)
    description = models.CharField(max_length=255, null=True)
    dummy = models.CharField(max_length=32, null=True)
    # etc.
Second question: Is this possible or is it simply not possible for one DB table to have a ForeignKey to a table in another DB?
If I really wanted to make this work, could I replace the ForeignKey field user with an IntegerField user_id, and then, whenever I needed somemodel.user, look up somemodel.user_id and call models.User.objects.get(pk=somemodel.user_id), with the router knowing to query auth_db for User? Is this a viable approach?
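A sketch of that workaround (my illustration, not from the original post): store the raw id and resolve the user lazily through a property, letting the router send the User query to auth_db:

from django.contrib.auth.models import User
from django.db import models


class SomeModel(models.Model):
    user_id = models.IntegerField(null=False)
    description = models.CharField(max_length=255, null=True)

    @property
    def user(self):
        # the router sends User queries to 'auth_db'
        return User.objects.get(pk=self.user_id)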
The answer to question 1 is: Yes.
What you will need in any case is a database router (the example in the Django docs is exactly about the auth app).
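For reference, a minimal sketch along the lines of that docs example (not a drop-in; adjust app labels and aliases to your project):

class AuthRouter(object):
    """Route all operations on auth app models to 'auth_db'."""

    def db_for_read(self, model, **hints):
        return 'auth_db' if model._meta.app_label == 'auth' else None

    def db_for_write(self, model, **hints):
        return 'auth_db' if model._meta.app_label == 'auth' else None

    def allow_relation(self, obj1, obj2, **hints):
        if obj1._meta.app_label == 'auth' or obj2._meta.app_label == 'auth':
            return True
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == 'auth':
            return db == 'auth_db'
        return None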
The answer to question 2 is: Maybe. Not officially. It depends on how you have set up MySQL:
https://docs.djangoproject.com/en/dev/topics/db/multi-db/#limitations-of-multiple-databases
Django doesn’t currently provide any support for foreign key or many-to-many relationships spanning multiple databases.
This is because of referential integrity.
However, if you’re using SQLite or MySQL with MyISAM tables, there is no enforced referential integrity; as a result, you may be able to ‘fake’ cross database foreign keys. However, this configuration is not officially supported by Django.
I have a setup with several legacy MySQL DBs (read-only). This answer shows How to use django models with foreign keys in different DBs?
I later ran into trouble with Django ManyToMany through with multiple databases, and the solution (as stated in the accepted answer there) is to set the table name with quotes:

class Meta:
    db_table = '`%s`.`table2`' % db2_name
Related questions that might provide some additional information:
How to work around lack of support for foreign keys across databases in Django
How to use django models with foreign keys in different DBs?
It would be nice if somebody would take all this information and put it into the official Django docs :-)

Django: Postgres connection not closing

I have the problem that my Django application accumulates Postgres connections over time. It seems that about every 30 minutes a new connection is established, and the old connections aren't closed (see screenshot).
As max_connections is set to 100, after some time all connections are blocked.
Does anyone know what is causing this problem?
I discovered this after I integrated some Celery tasks, so I am quite sure that it is related to Celery.
So I tried to close the connection manually after every task using the after_return method:
from celery import Task, task  # celery 3.x style imports
from django.db import connection


class DBTask(Task):
    abstract = True

    def after_return(self, *args, **kwargs):
        connection.close()


@task(name='example', base=DBTask)
def example_task(value):
    # do some stuff
    pass
But this also doesn't help.
Maybe I am totally wrong and it isn't related to celery at all.
My database configuration:
DATABASES = {
    'default': {
        'ENGINE': 'django.contrib.gis.db.backends.postgis',
        'NAME': 'production',
        'USER': 'production',
        'HOST': 'some.host',
        'CONN_MAX_AGE': 0,
    },
}
Installed packages:
django 1.8.9
psycopg2 2.6.1
celery 3.1.20
django-celery 3.1.17
The app is deployed at webfaction (maybe this helps)
I have also seen this question, but setting CONN_MAX_AGE: 0 didn't help.
Update:
Tried adding connection.close() at the end of each celery task, but the number of connection is still increasing.
Update 2:
Tried adding connection.close() at the top of the celery file, but this didn't help either.
Update 3:
Here is the code I am actually using in the celery tasks:
celery_tasks.py
@task(name='push_notifications', base=DBTask)
def push_notifications_task(user_id):
    user = CustomUser.objects.get(id=user_id)
    PusherAPI().push_notifications(user)
    connection.close()
models.py
class PusherAPI(object):

    def push_notifications(self, user):
        from .serializers import NotificationSerializer
        self.pusher.trigger(
            'user_%s' % user.slug,
            'notifications',
            NotificationSerializer(user).data
        )
serializers.py
class NotificationSerializer(object):

    def __init__(self, user=None):
        if user is None:
            self.user = get_current_user()
        else:
            self.user = user

    @property
    def data(self):
        # get notifications from db
        notifications = self.user.notifications.unread()
        # create the notification dict
        ...
        return note_dict
The only db-queries are in CustomUser.objects.get(id=user_id) and notifications = self.user.notifications.unread()
Make sure it is actually old connections that are not being closed, and not new connections piling up because some part of your application can't handle the load. Have a look at the individual connections, e.g. with SELECT * FROM pg_stat_activity;
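To see what those connections are doing, a small helper run from a Django shell can group them by state (this is plain pg_stat_activity, nothing Django-specific; a growing pile of idle rows points at connections that are never closed):

from django.db import connection


def connection_counts_by_state():
    # counts server-side Postgres connections grouped by state (idle, active, ...)
    with connection.cursor() as cursor:
        cursor.execute("SELECT state, count(*) FROM pg_stat_activity GROUP BY state")
        return cursor.fetchall()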

Running Django unittests causes South migrations to duplicate tables

How do you prevent Django unittests from running South migrations?
I have a custom Django app, myapp, that I'm trying to test with manage.py test myapp but when I run it I get the error:
django.db.utils.OperationalError: table "myapp_mymodel" already exists
and sure enough, the traceback shows that South is being executed:
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/south/management/commands/test.py", line 8, in handle
super(Command, self).handle(*args, **kwargs)
However, in my settings, I've specified:
SOUTH_TESTS_MIGRATE = 0
SKIP_SOUTH_TESTS = 1
which I believe should prevent Django's test framework from executing any South components.
What am I doing wrong?
Edit: I worked around this by simply removing south with:
if 'test' in sys.argv:
    INSTALLED_APPS.remove('south')
However, then I got:
ImproperlyConfigured: settings.DATABASES is improperly configured. Please supply the NAME value.
For my test database settings, I was using:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3'
    }
}
which worked fine in Django 1.4. Now I'm using Django 1.5, and I guess that's no longer kosher. However, no NAME value I set seems to fix it; they all report that none of my tables exist. I've tried:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': '/dev/shm/test.db',
        'TEST_NAME': '/dev/shm/test.db',
    }
}

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': ':memory:',
        'TEST_NAME': ':memory:',
    }
}

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(os.path.dirname(__file__), 'test.db'),
        'TEST_NAME': os.path.join(os.path.dirname(__file__), 'test.db'),
    }
}
each seems to create a physical test.db file, which I don't understand, because unittests should run in memory and never save anything to disk. Presumably, it's failing to run syncdb after creating the file but before executing the actual unittests. How do I fix this?
Edit: I discovered that, in one of my forms, I was populating field choices by directly querying a model (whereas I should have been doing that inside the form's __init__), so when Django's test framework imported my model, it was trying to read the table before the sqlite3 database had been created. I fixed that, but now I'm getting the error:
DatabaseError: table "myapp_mythroughmodel" already exists
so I'm back to square one, even though it's throwing a different exception type than initially.
Edit: I had a duplicate through model defined, causing Django to attempt to create it twice, resulting in the error.
This also happened to me with legacy code, but for another reason:
I had two models whose db_table pointed at the same DB table.
I know that's stupid, but it wasn't my fault :)
And I never found anything on the internet that could help me.
I was saved by setting verbosity to 3 (manage.py test -v 3).
Hope this helps anyone.
class Bla1(Model):
    some_column = ...

    class Meta:
        db_table = 'some_table'


class Bla2(Model):
    some_column = ...

    class Meta:
        db_table = 'some_table'
This error was the result of several problems. I'll summarize them here to help others who may have stumbled across this.
Ensure your settings.DATABASES is set correctly. Django's docs mention using TEST_NAME, but for clarity, I find it easier to check for the test command and override everything, e.g. at the bottom of my settings.py I have:
if 'test' in sys.argv:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'NAME': ':memory:',
        },
    }
Unless you have a good reason, always use :memory: to ensure the test database runs in memory and doesn't create a physical file that's slowed down by disk I/O. For some odd reason, a lot of other answers on SO recommend specifying a literal path to a test.db file for testing. This is horrible advice.
Unless you want to test South and/or your South migrations, disable South, because it'll only complicate things:
SOUTH_TESTS_MIGRATE = False
SKIP_SOUTH_TESTS = True
Don't be dumb like me and try to access your models before they're created. This mostly means don't directly refer to models from the fields of other models or forms. e.g.
class MyForm(forms.Form):
    somefield = forms.ChoiceField(
        required=True,
        choices=[(_.id, _.name) for _ in OtherModel.objects.filter(criteria=blah)],
    )
This might work in code where your database already exists, but it'll break Django's unittest framework when it tries to load your tests: loading the tests imports your models.py and forms.py, which then try to read a table that doesn't exist yet. Instead, set the choices value in the form's __init__(), as sketched below.
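A sketch of the fixed form, reusing the OtherModel/criteria placeholders from the example above:

class MyForm(forms.Form):
    somefield = forms.ChoiceField(required=True)

    def __init__(self, *args, **kwargs):
        super(MyForm, self).__init__(*args, **kwargs)
        # query at instantiation time, after the test database has been created
        self.fields['somefield'].choices = [
            (_.id, _.name) for _ in OtherModel.objects.filter(criteria=blah)
        ]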
