django migrations - workflow with multiple dev branches - python

I'm curious how other django developers manage multiple code branches (in git for instance) with migrations.
My problem is as follows:
- we have multiple feature branches in git, some of them with django migrations (some of them altering fields, or removing them altogether)
- when I switch branches (with git checkout some_other_branch) the database does not always reflect the new code, so I run into "random" errors, where a db table column does not exist anymore, etc...
Right now, I simply drop the db and recreate it, but it means I have to recreate a bunch of dummy data to restart work. I can use fixtures, but it requires keeping track of what data goes where, it's a bit of a hassle.
Is there a good/clean way of dealing with this use-case? I'm thinking a post-checkout git hook script could run the necessary migrations, but I don't even know if migration rollbacks are at all possible.

Migration rollbacks are possible and usually handled automatically by Django.
Considering the following model:
class MyModel(models.Model):
    pass
If you run python manage.py makemigrations myapp, it will generate the initial migration script.
You can then run python manage.py migrate myapp 0001 to apply this initial migration.
If after that you add a field to your model:
class MyModel(models.Model):
    my_field = models.CharField(max_length=100)
If you then generate a new migration and apply it, you can still go back to the initial state. Just run
python manage.py migrate myapp 0001 and Django will migrate backward, removing the new field.
It's more tricky when you deal with data migrations, because you have to write the forward and backward code.
Considering an empty migration created via python manage.py makemigrations myapp --empty,
you'll end up with something like:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import models, migrations

def forward(apps, schema_editor):
    # load some data
    MyModel = apps.get_model('myapp', 'MyModel')
    while condition:
        instance = MyModel()
        instance.save()

def backward(apps, schema_editor):
    # delete previously loaded data
    MyModel = apps.get_model('myapp', 'MyModel')
    while condition:
        instance = MyModel.objects.get(myargs)
        instance.delete()

class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0003_auto_20150918_1153'),
    ]
    operations = [
        migrations.RunPython(forward, backward),
    ]
For pure data-loading migrations, you usually don't need the backward migration.
But when you alter the schema and update existing rows
(like converting all values in a column to slug), you'll generally have to write the backward step.
In our team, we try to avoid working on the same models at the same time to avoid collisions.
If that is not possible, and two migrations with the same number (e.g. 0002) are created,
you can still rename one of them to change the order in which they will be applied (also remember to update
the dependencies attribute on the migration class to match your new order).
If you end up working on the same model fields at the same time in different features,
you'll still be in trouble, but it may mean these features are related and should be handled
together in a single branch.
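To spot such collisions early (for instance right after merging a branch), something like the following could be run over the file names in an app's migrations folder. This is only a minimal sketch: find_number_collisions is a hypothetical helper name, and it assumes the default naming scheme where each file starts with a numeric prefix followed by an underscore.

```python
import re
from collections import defaultdict


def find_number_collisions(filenames):
    """Return {prefix: [file names]} for every numeric prefix used more than once.

    Hypothetical helper; assumes names like 0002_something.py.
    """
    by_number = defaultdict(list)
    for name in filenames:
        match = re.match(r"(\d+)_.*\.py$", name)
        if match:  # skips __init__.py and anything that is not a migration
            by_number[match.group(1)].append(name)
    return {num: sorted(names) for num, names in by_number.items() if len(names) > 1}


# Example file listing (made up for illustration).
files = ["__init__.py", "0001_initial.py", "0002_add_slug.py", "0002_remove_title.py"]
print(find_number_collisions(files))
```

It only reports the clash; renaming one of the files and fixing its dependencies attribute is still a manual step.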
For the git-hooks part, it's probably possible to write something. Assuming you are on branch mybranch
and want to check out another feature branch myfeature:
- Just before switching, you dump the list of currently applied migrations into a temporary file mybranch_database_state.txt
- Then, you apply the myfeature branch migrations, if any
- Then, when checking mybranch out again, you restore your previous database state by reading the dump file.
However, it seems a bit hackish to me, and it would probably be really difficult to handle all scenarios properly:
rebasing, merging, cherry-picking, etc.
Handling the migration conflicts when they occur seems easier to me.
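For what it's worth, the dump-and-restore part could be sketched roughly like this. The function names and the JSON layout of the dump file are my own assumptions, not an existing tool; the state is simply a mapping from app label to the last applied migration name (None meaning nothing applied, which maps to migrate <app> zero).

```python
import json
from pathlib import Path


def save_state(state, path):
    """Write {app_label: last_applied_migration_or_None} to the dump file."""
    Path(path).write_text(json.dumps(state, indent=2, sort_keys=True))


def load_state(path):
    """Read the mapping written by save_state."""
    return json.loads(Path(path).read_text())


def restore_commands(state):
    """Build one `manage.py migrate <app> <migration>` command per app.

    Rolling back only works while the migration files that know how to
    reverse themselves are still checked out.
    """
    return [
        ["python", "manage.py", "migrate", app, name or "zero"]
        for app, name in sorted(state.items())
    ]


# Example state (made up for illustration).
example = {"myapp": "0002_add_field", "otherapp": None}
for command in restore_commands(example):
    print(" ".join(command))
```

Each command list could then be handed to subprocess.run from the hook script. It does nothing about the rebase/merge/cherry-pick scenarios mentioned above.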

I don't have a good solution to this, but I feel the pain.
A post-checkout hook will be too late. If you are on branch A and you check out branch B, and B has fewer migrations than A, the rollback information is only in A and needs to be run before checkout.
I hit this problem when jumping between several commits trying to locate the origin of a bug. Our database (even in development trim) is huge, so dropping and recreating isn't practical.
I'm imagining a wrapper for git-checkout that:
1. Notes the newest migration for each of your INSTALLED_APPS
2. Looks in the requested branch and notes the newest migrations there
3. For each app where the migrations in #1 are farther ahead than in #2, migrate back to the highest migration in #2
4. Check out the new branch
5. For each app where migrations in #2 were ahead of #1, migrate forward
A simple matter of programming!
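The planning half of that wrapper (deciding what to roll back and what to migrate forward) might look something like the sketch below. It is a variation rather than exactly the comparison described above: instead of comparing "newest" migrations, it rolls back to the last applied migration that the target branch also has. plan_switch and its input format are hypothetical; collecting the two lists (from the database and from the target branch's files) is left out.

```python
def plan_switch(applied, target):
    """Plan the migrate calls around a branch switch.

    applied: {app: [migrations applied in the DB, oldest first]}
    target:  {app: [migrations present on the target branch, oldest first]}
    Returns (rollbacks, forwards): rollbacks maps app -> migration name to
    migrate back to BEFORE checkout ("zero" if nothing is shared); forwards
    lists apps that need a plain `migrate` AFTER checkout.
    """
    rollbacks = {}
    forwards = set()
    for app in set(applied) | set(target):
        applied_names = applied.get(app, [])
        target_names = target.get(app, [])
        target_set = set(target_names)
        keep = None
        for name in applied_names:
            if name not in target_set:
                # First applied migration the target branch does not know about.
                rollbacks[app] = keep or "zero"
                break
            keep = name
        if target_set - set(applied_names):
            forwards.add(app)
    return rollbacks, sorted(forwards)


# Example data (made up for illustration).
applied = {
    "blog": ["0001_initial", "0002_add_slug", "0003_drop_title"],
    "shop": ["0001_initial"],
}
target = {
    "blog": ["0001_initial", "0002_add_slug"],
    "shop": ["0001_initial", "0002_price"],
}
print(plan_switch(applied, target))
```

This assumes a linear migration history per app; merge migrations and cross-app dependencies would need more care.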

For simple changes I rely on migration rollback, as discussed by Agate.
However, if I know a feature branch is going to involve highly invasive database changes, or if it will involve a lot of data migration, I like to create a clone of the local (or remote dev) database as soon as I start the new branch. This may not always be convenient, but especially for local development using sqlite it is just a matter of copying a file (which is not under source control).
The first commit on the new branch then updates my Django settings (local/dev) to use the cloned database. This way, when I switch branches, the correct database is selected automatically. No need to worry about rolling back schema changes, missing data, etc. No complicated stuff.
After the feature branch has been fully merged, the cloned database can be removed.
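For the sqlite case, the cloning step could be scripted along these lines. This is just a sketch under my own naming assumptions (db_path_for_branch, clone_db, and the db.<branch>.sqlite3 file naming are made up for illustration):

```python
import re
import shutil
from pathlib import Path


def db_path_for_branch(base_db, branch):
    """Return a per-branch file name next to the base database file."""
    base = Path(base_db)
    # Branch names may contain "/" and other characters unsuitable for file names.
    safe = re.sub(r"[^A-Za-z0-9_.-]+", "-", branch).strip("-")
    return base.with_name(f"{base.stem}.{safe}{base.suffix}")


def clone_db(base_db, branch):
    """Copy the base sqlite file for the branch unless a clone already exists."""
    target = db_path_for_branch(base_db, branch)
    if not target.exists():
        shutil.copy2(base_db, target)
    return target


print(db_path_for_branch("db.sqlite3", "feature/payments"))
```

The branch's local settings would then point DATABASES["default"]["NAME"] at the cloned file.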

So far I have found two GitHub projects (django-south-compass and django_nomad) that try to solve the issue of migrating between dev branches, and there are a couple of answers on Stack Overflow.
Citing an article on Medium, most of the solutions boil down to one of the following concepts:
Dropping all the tables and reapply migrations in the target branch from scratch. When the tables are created from scratch, all the data will be lost and needs to be recreated as well. This can be handled with fixtures and data migrations but managing them, in turn, will become a nightmare, not to mention that it will take some time (...)
Have a separate database for each branch and change the settings file with the target branch’s settings every time the branch is switched using tools like sed. This can be done with a post_checkout hook. Maintaining one large database for each branch would be very storage-intensive. Also, checking out individual commit IDs might potentially produce the same errors.
Finding the differences in migrations between the source and target branch, and apply the differences. We can do so with post_checkout script but there is a small issue. This post explains the issue in detail. To summarize the issue, post_checkout is run after all the files in the target branch are checked out, which includes migration files. If the target branch doesn’t contain all the migrations in the source branch when we run python manage.py migrate app1 Django won’t be able to find the missing migrations which are needed to apply reverse migrations. We have to temporarily checkout migration files in the source branch, run python manage.py migrate and checkout migration files in the target branch. django-south-compass does something very similar but is available only for up to python 2.6.
Using a management command (which uses python git module), find all the migration operations differences between the source branch and the merge-base of the source branch and target branch and notify the user of these changes. If these changes don’t interfere with the reason for branch change, the user can go ahead and change the branch. Else, using another management command, un-apply all migration till merge base, switch branch, and apply the migrations in the target branch. There will be a small data loss and if the two branches haven’t diverged a lot, is manageable. django_nomad does some of this work.
Keep a track of applied and unapplied migrations in files and use this data to populate the tables when switching branches.

Related

How can I delete unapplied migrations?

I would like to delete only unapplied migrations, showmigrations gives:
[X] 0011_auto_20190917_1522
[X] 0012_auto_20190917_1600
[ ] 0013_auto_20190917_1638
[ ] 0014_auto_20190917_1647
[ ] 0015_auto_20190917_1652
[ ] 0016_auto_20190917_1654
[ ] 0017_auto_20190917_1704
...
I have 21 unapplied migrations! The question is: when migrations are unapplied, they don't have any effect on the database, right? Can I just delete them from the "myapp" migrations folder and after that run makemigrations and migrate again?
Short answer: You can do that, but you should be careful, and take some situations into account.
You can remove the migrations that are not yet applied to the database. But you should be very careful with that: the fact that the migrations are not applied to one database (for example, a database you use for development) does not mean that other databases (for example, in production) have not already been migrated further along the migration chain.
You thus should only remove migrations for which you are sure that no database has already applied them. If you delete migrations that have already been applied, the database(s) might, for example, have already constructed tables, columns, etc. If you later run the new migration, that migration will contain modifications starting from the point where you removed the migrations, and thus might run into errors, since that migration will aim to construct a table that already exists.
Another potential problem is that migrations might contain RunPython operations [Django-doc]. These are usually small pieces of code that one has inserted manually. For example to replace all the records such that a specific column now has a certain value. By removing these migrations, and by making new migrations, these RunPython operations will be lost. The name of the migrations (it contains auto) suggests that the migrations have been constructed automatically, but it is not impossible that a programmer later modified such file, and thus inserted a RunPython operation.
Having multiple migrations is not a severe problem. Django will run a topological sorting algorithm on the "migration graph", and this can be done in O(n). Having a large amount of files is usually not a severe bottleneck.
You might want to consider using squashmigrations [Django-doc] to group migrations together in a new file. This will take into account the RunPython operations, and thus might be safer than squashing migrations by removing and recreating them.
Yes, if they are not applied to the database, you can simply remove them.
Yes, you can just delete them from the migrations folder (don't delete the migrations folder itself).
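Before deleting anything, it may help to list exactly which migrations a given database still shows as unapplied. A minimal sketch that parses the text printed by python manage.py showmigrations follows; unapplied_migrations is a hypothetical helper, and the sample text just mirrors the listing from the question. It only tells you about the one database the command was run against, so the caveat about other databases still applies.

```python
def unapplied_migrations(showmigrations_output):
    """Return {app_label: [migration names marked "[ ]"]} from showmigrations text."""
    result = {}
    app = None
    for line in showmigrations_output.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("("):
            continue  # blank line, or the "(no migrations)" marker
        if stripped.startswith("[X]"):
            continue  # already applied
        if stripped.startswith("[ ]"):
            if app is not None:
                result.setdefault(app, []).append(stripped[3:].strip())
        else:
            app = stripped  # an app label heading
    return result


# Sample output, mirroring the listing in the question.
sample = """myapp
 [X] 0011_auto_20190917_1522
 [X] 0012_auto_20190917_1600
 [ ] 0013_auto_20190917_1638
 [ ] 0014_auto_20190917_1647
"""
print(unapplied_migrations(sample))
```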

Freezing database for testing new features in Django

In my Django app, I want to add a couple of fields to my existing models and possibly create a new class. I just want to test the new feature and approve if it works.
I can revert the code using git easily. But if I run makemigrations and migrate, then my MySQL database will change, and reversing the changes looks like manually deleting tables and reverting to an old state using a command like django-admin migrate [app_label] [migration_name] (in some cases it looks really cumbersome, example).
I'm wondering if there is any safe practice for manipulating the database and then reverting it to its initial state.
Probable solution #1:
You can utilize the test database that gets created when using django.test.TestCase:
Tests that require a database (namely, model tests) will not use your
“real” (production) database. Separate, blank databases are created
for the tests.
Create some unit tests for your project and make your migrations (without migrating to your production DB, just keep the migrations). Then:
If the database does not exist, it will first be created. Any
migrations will also be applied in order to keep it up to date.
Usually, the database gets destroyed at the end of your tests, but you can keep it between runs:
You can prevent the test databases from being destroyed by using the
test --keepdb option. This will preserve the test database between
runs.
With this trick you can test every migration you make in a fake DB and when you do finalize your model and you have all the migrations history complete, you can migrate on your production DB.
Probable solution #2:
You can make a copy of your database as @albar suggests and have it as a backup while you are working on your new migrations.
Break stuff as much as you want and when you are set and done, replace the "battered" DB with your backup and apply your migration history to it.

Django Migrate Change of App Name (active project)

So... I've done a lot of research on this... there are answers, but not complete or appropriate answers. I have an in-use and in-production django "project" in which the "main" application is called "pages" ... for reasonably dumb reasons. My problem is now to add mezzanine ... which has a sub-module mezzanine.pages (seems to be required .... but I'm pretty sure I need it).
mezzanine.pages apparently conflicts with "pages" ...
Now ... my pages contains a slew of non-trivial models including one that extends user (One-to-One ref), and many references to other apps' tables (fortunately only outbound, ForeignKey). It also has management/commands and about 20 migrations of its own history.
I gather I either have to change pages to mypages or find another route (changing mezzanine.pages seems wrong-headed).
For reference, the project is on Django 1.8 right now, so the preferred answer includes migrations.
I've worked on this since I posted it, and the real answer is what I've synthesized from multiple sources (including other stack exchange posts).
So... Everything changed in Django before I started using it. After 1.7, the 'migrations' bit was internalized and posts including the word "South" are about how the world was before 1.7. Further, the complication in my case dealt with the issue of migrations in that the project was already active and had real data in production.
There were some posts including a GitHub chunk of code that talked about migrating tables from one App to another App. This is inherently part of the process, but several posts noted that to do this as a "migration" you needed the Migration.py to be in another App. Maybe even an App created for the purpose.
In the end, I decided to approach the problem by changing the label in the AppConfig class in apps.py in the application in question. In my case, I am changing "pages" to "phpages" but the directory name of my app is still pages. This works for me because the mezzanine app's "pages" sub-App is back in the python library and not a conflict in the filesystem. If this is not your situation, you can solve it with another use of label.
So... Step-by-step, my procedure to rename pages to phpages.
1. Create apps.py in the pages sub-directory. In it put:
from django.apps import AppConfig

class PagesConfig(AppConfig):
    name = "pages"
    label = "phpages"
    verbose_name = "Purple Hat Pages"
Key among these is label which is going to change things.
2. In __init__.py in the pages sub-directory, put default_app_config = "pages.apps.PagesConfig"
3. In your settings.py change the INSTALLED_APPS entry for your app to 'pages.apps.PagesConfig', ...
4. All of your migrations need to be edited in this step. In the dependencies list, you'll need to change 'pages' to 'phpages'. In the ForeignKeys you'll need to also change 'pages.Something' to 'phpages.Something' for every something in every migration file. Find these under pages/migrations/nnnn_*.py
5. If you refer to foreign keys in other modules by from pages.models import Something and then use ForeignKey(Something), you're good for this step. If you use ForeignKey('pages.Something') then you need to change those references to ForeignKey('phpages.Something'). I would assume other like-references are the same.
6. For the next 4 steps (7, 8, 9 and 10), I built pagestophpages.sql and added it to the pages sub-directory. It's not a standard django thing, but each test copy and each production copy of the database was going to need the same set of steps.
7. UPDATE django_content_type SET app_label='phpages' WHERE app_label='pages';
8. UPDATE django_migrations SET app='phpages' WHERE app='pages';
9. Now... in your database (mine is PostgreSQL) there will be a bunch of tables that start with "pages". You need to list all of these. In PostgreSQL, in addition to tables, there will be sequences for each AutoField. For each table construct ALTER TABLE pages_something RENAME TO phpages_something; For each sequence ALTER SEQUENCE pages_something_id_seq RENAME TO phpages_something_id_seq;
10. You should probably back up the database. You may need to try this a few times. Run your SQL script through your database shell. Note that all other changes can be propagated by source code control (git, svn, etc). This last step must be run on each and every database.
Obviously, you need to change pages and phpages to your stuff. You may have more than one table with one auto field and it may not be named something.
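If there are many tables, the SQL script from the steps above could be generated instead of typed by hand. Here is a rough sketch; rename_statements is a hypothetical helper, and you still have to supply the actual table and sequence names from your own database (for example from psql's \dt and \ds listings).

```python
def rename_statements(old_label, new_label, tables, sequences=()):
    """Build the SQL for renaming an app label, PostgreSQL-style.

    Only names that start with "<old_label>_" are renamed; everything else
    in the lists is ignored.
    """
    prefix = old_label + "_"
    statements = [
        f"UPDATE django_content_type SET app_label='{new_label}' WHERE app_label='{old_label}';",
        f"UPDATE django_migrations SET app='{new_label}' WHERE app='{old_label}';",
    ]
    for kind, names in (("TABLE", tables), ("SEQUENCE", sequences)):
        for name in names:
            if name.startswith(prefix):
                new_name = new_label + "_" + name[len(prefix):]
                statements.append(f"ALTER {kind} {name} RENAME TO {new_name};")
    return statements


# Example names (placeholders, as in the steps above).
for statement in rename_statements(
    "pages", "phpages",
    tables=["pages_something", "auth_user"],
    sequences=["pages_something_id_seq"],
):
    print(statement)
```

The output can be saved as the .sql file and run through the database shell on each database, after taking a backup.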
Another thing of note, in terms of process, is that this is probably a hard point in your development where everything needs be in sync. Given that we're playing with editing migrations and changing names, you need a hard stop in development so that everything that's going to be changed (dev box, test box, staging box, production box ... and all of their databases) is at the same revision and schema. YMMV.
This also solves the problem by using the label attribute of the AppConfig class. I chose this method in preference to changing the directory name because it involved fewer changes. I chose not to change the name field because that did not work for me. YMMV.
I must say that I'm a little disappointed that myapp/pages conflicts with mezzanine.pages. It looks like some of the reasons are due to the pages slug being used in the database table name (and off the top of my head, I don't see a good solution there). What I don't see that would make sense is the equivalent to "from mezzanine import pages as mpages" or somesuch: the ability to alias imported apps (not talking about apps in my own file tree). I think this might be possible if I pulled the app into my own file tree, but this doesn't seem to be a sanctioned act, either.

How to create a new table using model

So I have this django installation in which there are a bunch of migration scripts. They look like so:
00001_initial.py
00002_blah_blah.py
00003_bleh_bleh.py
Now I know these are "database building" scripts which will take stuff defined in models.py and run them against the db to "create" tables and stuff.
I want to create a new table (so I created its definition in models.py). For this, I have copied another model class and edited its name and fields and it is all fine. Let's call this new model class 'boom'.
My question is now how do I "create" this boom table using the migration script and the boom model?
I am worried that I might accidentally disrupt anything that is already in DB. How do I run the migration to create only boom table? How do I create a migration script specifically for it?
I know that it has something to do with manage.py and running migrate or runmigration (or is it sqlmigrate?... I'm confused). While creating the boom table, I don't want the database to go boom if you know what I mean :)
First, create a backup of your database. Copy it to your development machine. Try things out on that. That way it doesn't matter if it does go "boom" for some reason.
The first thing to do is
python manage.py showmigrations
This shows all the existing migrations, and it should show that they have been applied with an [X].
Then,
python manage.py makemigrations
This makes a new migration file for your new model (named 00004_...).
Then do
python manage.py migrate
to apply it. To undo it, go back to the state of migrations 00003, with
python manage.py migrate <yourappname> 00003
There are two steps to migrations in Django.
./manage.py makemigrations
will create the migration files that you see - these describe the changes that should be made to the database.
You also need to run
./manage.py migrate
this will apply the migrations and actually run the alter table commands in SQL to change the actual database structure.
Generally adding fields or tables won't affect anything else in the database. Be more careful when altering or deleting existing fields as that can affect your data.
The reason for two steps is so that you can make changes on a dev machine and once happy commit the migration files and release to your production environment. Then you run the migrate command on your production machine to bring the production database to the same state as your dev machine (no need for makemigrations on production assuming that your databases started the same).
My question is now how do I "create" this boom table using the
migration script and the boom model?
./manage.py makemigrations
I am worried that I might accidentally disrupt anything that is
already in DB.
The whole point of migrations is that they don't.
I know that it has something to do with manage.py and running migrate
or runmigration
For more information please refer to : https://docs.djangoproject.com/en/1.10/topics/migrations/
And rest assured that your database will not go boom! :-)
I solved it simply by changing the name of the new model back to the original name and then checking whether the table exists in the database. If it does not, I just create a new table with the old name and a single field such as id.
Then clear the migrations, create new migrations, migrate, and verify that the table was fixed in the DB and has all the missing fields.
If it still doesn't work, then change the model name back to the new one.
But when Django asks whether you are renaming the model, you should say NO so that the old one is removed properly and a new one is created.
This type of error usually occurs when you delete a table in the DB manually, so the changes recorded in the migration history are no longer reflected in the tables.
But it is not necessary to erase the entire database and start from scratch.

Migrate models from one django app to several other apps

I have a django app which consists of 17 models. Now I have realized that these models should be in 3 different apps (not in the original app). So now I would like to migrate these models out of the original app to these 3 different apps. How do I do that?
There are foreign key, generic foreign key and ManyToMany relationships among the models. I also have data in the database (MySQL), so I would like the data to be preserved during migration.
I have installed south for migrations, but don't know how to use it for solving this issue. I have gone through this similar question but could not find an answer that would solve my problem. Would be thankful for any help !
In my opinion, you have two ways of completing this task as stated below:
- Move the models and add Meta.db_table to refer to the existing sql table as needed, as @kroolik suggested
- Perform a three-step migration
The former is easier while the latter could be better as tables would be named as you expect.
First of all, you mention you already have south installed. The first step would be to create the initial migration for the existing app. Take a look at the south tutorial. Then you must apply that migration, but as you already have the tables in the db it would fail unless you include the --fake flag.
After that you need to create the three apps you mention, and their models. Also create and apply (this time without the fake flag) the initial migration for them.
The next step is to write a datamigration. You can create the skeleton with datamigration, but you must write the migration itself by hand.
Now you are almost done; the only remaining thing is to remove the original tables. You can just remove those models, and create an "auto" schemamigration.
Don't forget to apply the migrations with the migrate command. Also, as @Bibhas mentions, a copy of the database and/or a dump of it is a pretty good idea.
