Django change database field from integer to CharField - python

I have a Django app with a populated (Postgres) database that has an integer field that I need to change to a CharField. I need to start storing data with leading zeros in this field. If I run migrate (Django 1.8.4), I get the following error:
psycopg2.ProgrammingError: operator does not exist: character varying >= integer
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I tried searching Google, but didn't really find much help. I don't really know what I'm supposed to do here. Can anyone help?

Originally, I thought that there would be a simple solution where Django or Postgres would do the conversion automatically, but it appears that it doesn't work that way. I think some of the suggestions made by others might have worked, but I came up with a simple solution of my own. This was done on a production database, so I had to be careful. I ended up adding the character field to the model and running the migration. I then ran a small script in the Python shell that copied the data from the integer field, did the conversion that I needed, then wrote the result to the new field in the same record in the production database.
For example:
members = Members.objects.all()
for mem in members:
    mem.memNumStr = str(mem.memNum)
    # ... more data processing here
    mem.save()
So now, I had the data duplicated in the table in a str field and an int field. I could then modify the views that accessed that field and test it on the production database without breaking the old code. Once that is done, I can drop the old int field. A little bit involved, but pretty simple in the end.
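Since the original goal was to store values with leading zeros, the "more data processing" step could be a zero-padding conversion. Here is a minimal sketch in plain Python; the helper name `to_padded_str` and the width of 6 are assumptions for illustration, not part of the original code.

```python
def to_padded_str(number, width=6):
    """Render a non-negative integer as a zero-padded string.

    `width` is a hypothetical target length; values that are already
    longer are returned unchanged (zfill never truncates).
    """
    if number < 0:
        raise ValueError("expected a non-negative integer")
    return str(number).zfill(width)


print(to_padded_str(42))       # 000042
print(to_padded_str(1234567))  # 1234567
```

Inside the loop this would look like `mem.memNumStr = to_padded_str(mem.memNum)`, assuming the new field's max_length is at least as large as the chosen width.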

You'll need to generate a schema migration. How you do that will depend on which version of Django you are using (versions 1.7 and newer have built-in migrations; older versions of Django use South).
Of Note: if this data is production data, you'll want to be very careful with how you proceed. Make sure you have a known good copy of your data you can reinstall if things get hairy. Test your migrations in a non-production environment. Be. Careful!
As for the transformation on the field (from IntegerField to CharField) and the transformation on the field values (to be prepended with leading zeroes) - Django cannot do this for you, you'll have to write this manually. The proper way to do this is to use the django.db.migrations.RunPython migration command (see here).
My advice would be to generate multiple migrations: one that creates a new CharField, my_new_column, and then writes to this new column via RunPython. Then run a second migration that removes the original IntegerField, my_old_column, and renames my_new_column to my_old_column.

From Django 2.x, you just need to change the field type from IntegerField to CharField, and Django will automatically alter the field and migrate the data for you.

I thought a full code example would be helpful. I followed the approach outlined by ken-koster in the comment above. I ended up with two migrations (0091 and 0092). It seems that the two migrations could be squashed into one migration but I did not go that far. (Maybe Django does this automatically but the framework here could be used in case the string values are more complicated than a simple int to string conversion. Also I included a reverse conversion example.)
First migration (0091):
from django.db import migrations, models
class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0090_auto_20200622_1452'),
    ]
    operations = [
        # store original values in tmp fields
        migrations.RenameField(
            model_name='member',
            old_name='mem_num',
            new_name='mem_num_tmp',
        ),
        # add back fields as string fields
        migrations.AddField(
            model_name='member',
            name='mem_num',
            field=models.CharField(default='0', max_length=64, verbose_name='Number of members'),
        ),
    ]
Second migration (0092):
from django.db import migrations
def copyvals(apps, schema_editor):
    Member = apps.get_model("myapp", "Member")
    members = Member.objects.all()
    for member in members:
        # copy the integer from the tmp field into the new string field
        member.mem_num = str(member.mem_num_tmp)
        member.save()
def copyreverse(apps, schema_editor):
    Member = apps.get_model("myapp", "Member")
    members = Member.objects.all()
    for member in members:
        try:
            member.mem_num_tmp = int(member.mem_num)
            member.save()
        except Exception:
            print("Reverse migration for member %s failed." % member.name)
            print(member.mem_num)
class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0091_custom_migration'),
    ]
    operations = [
        # convert integers to strings
        migrations.RunPython(copyvals, reverse_code=copyreverse),
        # remove the tmp field
        migrations.RemoveField(
            model_name='member',
            name='mem_num_tmp',
        ),
    ]

Related

How do you incrementally add lexeme/s to an existing Django SearchVectorField document value through the ORM?

You can add to an existing Postgresql tsvector value using ||, for example:
UPDATE acme_table
SET my_tsvector = my_tsvector ||
    to_tsvector('english', 'some new words to add to existing ones')
WHERE id = 1234;
Is there any way to access this functionality via the Django ORM? I.e. incrementally add to an existing SearchVectorField value rather than reconstruct from scratch?
The issue I'm having is that the SearchVectorField property returns the tsvector as a string. So when I use + in place of the || operator, e.g.:
from django.contrib.postgres.search import SearchVector
instance.my_tsvector_prop += SearchVector(
    ["new", "fields"],
    weight="A",
    config='english'
)
I get the error:
TypeError: SearchVector can only be combined with other SearchVector instances, got str.
Because:
type(instance.my_tsvector_prop) == str
A fix to this open Django bug whereby a SearchVectorField property returns a SearchVector instance would probably enable this, if possible. (Although less efficient than combining in the database. In our case the update will run asynchronously so performance is not too important.)
I also tried combining the field with an F() expression in an update:
(
    MyModel.objects
    .filter(pk=1234)
    .update(
        my_tsvector_prop=F("my_tsvector_prop") + SearchVector(
            ["new_field_name"],
            weight="A",
            config='english',
        )
    )
)
Returns:
FieldError: Cannot resolve expression type, unknown output_field
Another solution would be to run a raw SQL UPDATE, although I'd rather do it through the Django ORM if possible as our tsvector fields often reference values many joins away, so it'd be nice to find a sustainable solution.

Fixing error "Badly formed hexadecimal UUID string" after converting existing id to uuid in Django (Django 3.0)

I created a table with an initial IntegerField primary key and later changed the id to a UUIDField. Now this raises a "Badly formed hexadecimal UUID string", I guess because a number such as "1" isn't a valid UUID value. Does anyone know a concise way to fix this in code when updating the models.py file for the django app?
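Before anything else, double-check that you don't have any foreign key with cascade pointing at this model. Then, instead of changing the type of id = models.IntegerField():
create a new field uuid = models.UUIDField(default=uuid.uuid4, primary_key=True) (the default populates the field for existing rows; note that a migration evaluates the default once, so rows that must be unique need distinct values assigned in a data migration)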
make migrations
remove the id field
make migrations
rename the uuid field to id
make migrations
migrate
However, you should know that removing the id field and/or renaming the uuid field is not mandatory, and it's often a good idea to keep both.
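The error from the question can be reproduced without Django, which shows why existing integer ids such as "1" cannot simply be reinterpreted as UUIDs. The sketch below uses only the standard library; the helper name `is_valid_uuid` is made up for illustration.

```python
import uuid


def is_valid_uuid(value):
    """Return True if the string `value` parses as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        # uuid.UUID raises ValueError("badly formed hexadecimal UUID string")
        # when the input is not 32 hexadecimal digits.
        return False
    return True


print(is_valid_uuid("1"))                # False
print(is_valid_uuid(str(uuid.uuid4())))  # True
```

This is why the steps above add a separate field with freshly generated values rather than altering the existing column in place.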

Django migration with "--fake-initial" is not working if AddField referes to "same" column

I'm playing with Django (I'm quite a beginner), and while surfing the web I read that it should be possible to keep our internal camelCase naming conventions inside the MySQL database and also for the model names inside models.py.
Well, after some days I can conclude it's better to leave things as they were designed and use the standard output generated by inspectdb without any change to its code (I removed the .lower() functions :-) )
Anyhow, just out of curiosity, I would appreciate it if somebody could explain to me why the following is not working. Briefly, it seems to me that the code responsible for the migration is not checking correctly(?) whether a column name is already in the database, or at least it does its comparison in a case-sensitive manner. Is that by design?
I'm using this guide from the internet https://datascience.blog.wzb.eu/2017/03/21/using-django-with-an-existinglegacy-database/
The mysql is running with the option " --lower-case-table-names=0" and the collation is case-insensitive.
Inside the models.py I have this
class City(models.Model):
    id = models.AutoField(db_column='ID', primary_key=True)
    name = models.CharField(db_column='Name', max_length=35)
    countrycode = models.ForeignKey(Country, db_column='CountryCode')
    district = models.CharField(db_column='District', max_length=20)
    population = models.IntegerField(db_column='Population', default=0)
    def __str__(self):
        return self.name
    class Meta:
        managed = True
        db_table = 'city'
        verbose_name_plural = 'Cities'
        ordering = ('name', )
If I change the foreign key's db_column to db_column='countryCode' (note the lowercase "c") and run
./manage.py migrate --database world_data --fake-initial worlddata
I get errors saying 'django.db.utils.OperationalError: (1050, "Table 'city' already exists")'
and the problem arises only when using the --fake-initial option.
After analyzing "...django/db/migrations/executor.py" I found these lines, which check whether a column is already present in the existing table:
column_names = [
    column.name for column in
    self.connection.introspection.get_table_description(self.co$
]
if field.column not in column_names:
    return False, project_state
Here, from what I understand, the comparison is case-sensitive, so the "countryCode" column is not found inside "column_names":
-> if field.column not in column_names:
(Pdb) field.column
'countryCode'
(Pdb) column_names
['ID', 'Name', 'CountryCode', 'District', 'Population']
First of all, I'd like to congratulate you on being so thorough with your first question! Many older contributors don't go into as much depth as you.
So first let's get things straight. You mention that --lower-case-table-names=0 is enabled but collation is case insensitive. From the docs I see that the option forces case sensitivity for table names. I might just be reading it wrong but it looks like you're saying everything should be case insensitive. Also collation usually refers to the data itself, not column names in case you're unaware.
That said as far as I know, all databases treat column names case-insensitively (I just tested in SQLite) so you might have just uncovered a bug in Django! I looked through the history of the file, and in the 5-odd years that code has existed I guess no one ran into this issue. It's understandable since usually people either a) just let django create the db from scratch and thus everything is in sync, or b) they use inspectdb to generate the code with the right case for the columns.
It looks like you were just playing around so I don't think you are looking for a specific solution. Perhaps the next step is to file a bug ;)? From what I see there would be no downside to adding case-insensitive comparison there, but the guys working on Django 24/7 may have a different opinion.
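To make the suggestion concrete, here is a small standalone sketch of the difference between the two comparisons, using the values from the pdb session in the question. The function `column_exists` is hypothetical and is not part of Django's executor.

```python
def column_exists(column, column_names, case_sensitive=True):
    """Check whether `column` appears in `column_names`.

    With case_sensitive=False the names are compared after casefolding,
    which is the kind of comparison suggested above.
    """
    if case_sensitive:
        return column in list(column_names)
    wanted = column.casefold()
    return any(name.casefold() == wanted for name in column_names)


existing = ['ID', 'Name', 'CountryCode', 'District', 'Population']
print(column_exists('countryCode', existing))                        # False
print(column_exists('countryCode', existing, case_sensitive=False))  # True
```

Whether a case-insensitive check is actually appropriate depends on the database backend, so this only illustrates the mismatch, not a patch.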

Django "NULLS LAST" for creating Indexes

Django 1.11 and later allow using F-expressions for adding nulls last option to queries:
queryset = Person.objects.all().order_by(F('wealth').desc(nulls_last=True))
However, we want to use this functionality for creating Indexes. A standard Django Index inside a model definition:
indexes = [
    models.Index(fields=['-wealth']),
]
I tried something along the lines of:
indexes = [
    models.Index(fields=[models.F('wealth').desc(nulls_last=True)]),
]
which returns AttributeError: 'OrderBy' object has no attribute 'startswith'.
Is this possible to do using F-expressions in Django?
No, unfortunately, that is currently (Django <=2.1) not possible. If you look at the source of models.Index, you will see that it assumes that the fields argument contains field names (strings) and nothing else.
As a workaround, you could manually create your index with raw SQL in a migration.
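As a rough sketch of what that raw SQL could look like on PostgreSQL, the helper below builds the CREATE INDEX statement as a string, which could then be handed to a RunSQL operation in a migration. The function name `nulls_last_index_sql` and the table name `myapp_person` are assumptions for illustration; the real table name depends on your app and model.

```python
def nulls_last_index_sql(table, column, index_name):
    """Build a PostgreSQL CREATE INDEX statement ordered DESC NULLS LAST."""
    # Only accept plain identifiers, since they are interpolated into SQL.
    for ident in (table, column, index_name):
        if not ident.isidentifier():
            raise ValueError("unsafe identifier: %r" % ident)
    return 'CREATE INDEX "%s" ON "%s" ("%s" DESC NULLS LAST);' % (
        index_name, table, column,
    )


print(nulls_last_index_sql("myapp_person", "wealth", "wealth_desc_idx"))
# CREATE INDEX "wealth_desc_idx" ON "myapp_person" ("wealth" DESC NULLS LAST);
```

A matching DROP INDEX statement would be needed as the reverse SQL if the migration should be reversible.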
Fortunately it is now possible to create functional indexes since Django 3.2. The example which you posted has to be adjusted a little by moving the field from fields to *expressions and by adding a name, which is required when using expressions.
indexes = [
    Index(F('wealth').desc(nulls_last=True), name='wealth_desc_idx'),
]
https://docs.djangoproject.com/en/3.2/ref/models/indexes/#expressions

Django JSONField inside ArrayField

I have a problem inserting to a field using ArrayField with JSONField inside.
models.py
locations = ArrayField(JSONField(null = True,blank = True), blank=True, null = True)
Insert
location_arr = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
instance.locations = location_arr
instance.save()
When I do this, I got
column "locations" is of type jsonb[] but expression is of type text[]
LINE 1: ...d" = 2517, "locations" = ARRAY['{"loc...
Hint: You will need to rewrite or cast the expression.
So I tried to dump it using:
import json
location_arr = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
instance.locations = json.dumps(location_arr)
instance.save()
then I got this
LINE 1: ...d" = 2517, "locations" = '[{"loc":...
DETAIL: "[" must introduce explicitly-specified array dimensions.
I am using:
Django 1.9
Python 2.7
Postgres 9.4.10
psycopg2 2.6.2
Arrays
First of all, let's take a close look at this important text from the Postgresql Arrays document.
Tip: Arrays are not sets; searching for specific array elements can be
a sign of database misdesign. Consider using a separate table with a
row for each item that would be an array element. This will be easier
to search, and is likely to scale better for a large number of
elements.
Most of the time, you should not be using arrays.
JSONB
JSONB is available in Django as the JSONField type. This field is more scalable and flexible than array fields and can be searched more efficiently. However if you find yourself searching inside JSONB fields all the time the above statement about Arrays is equally valid for JSONB.
Now what do you have in your system? An array that holds JSONB fields. This is a disaster waiting to happen. Please normalize your data.
Recap
So when should you use ArrayField?
On the rare occasion when you don't need to search in that column and you don't need to use that column for a join.
I have encountered the same scenario. Here is how I solved it:
models.py
from django.contrib.postgres.fields.jsonb import JSONField as JSONBField
location = JSONBField(default=list,null=True,blank=True)
insert
model_object.location = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
update
model_object.location.append({"locations" : "loc1","amount":Decimal(100.00)})
model_object.save()
This worked for me in
Django - 2.0.2
Postgres - 9.5
psycopg2 - 2.7.4
python - 3.4.3
You can sidestep this issue by using the JSONField as the column field type with a list as the root element.
from django.contrib.postgres.fields import JSONField
from django.db import models
class MyDBArray(models.Model):
    # JSONField with a list as the root element
    array_data = JSONField(default=list)
my_db_array = MyDBArray(array_data=[1, 2, 3])
my_db_array.save()
You would need to validate in the save method that the array_data field is actually list-like.
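A minimal sketch of such a check, written as a plain function so it can be called from a model's save() or clean() method. The name `ensure_list_like` is hypothetical, and accepting only lists and tuples is one possible reading of "list-like".

```python
def ensure_list_like(value):
    """Return `value` as a list, or raise TypeError if it is not a list or tuple.

    Strings and dicts are rejected on purpose: they are iterable, but
    storing them would change the shape of the JSON root element.
    """
    if not isinstance(value, (list, tuple)):
        raise TypeError(
            "expected a list or tuple, got %s" % type(value).__name__
        )
    return list(value)


print(ensure_list_like([1, 2, 3]))  # [1, 2, 3]
print(ensure_list_like((1, 2)))     # [1, 2]
```

In the model above, save() could call `self.array_data = ensure_list_like(self.array_data)` before delegating to the parent implementation.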
This was fixed in the latest unreleased version of Django 2.2a1
pip install Django==2.2a1
PS I believe that it will work with versions >= 2.2a1
I guess the easiest way is to turn the field from an array of jsonfield into a jsonfield. Just add a key to the location_arr.
Related thread: ArrayField with JSONField as base_field in Django
