Django JSONField inside ArrayField - python

I have a problem inserting into a field defined as an ArrayField with a JSONField inside.
models.py
locations = ArrayField(JSONField(null=True, blank=True), blank=True, null=True)
Insert
location_arr = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
instance.locations = location_arr
instance.save()
When I do this, I get:
column "locations" is of type jsonb[] but expression is of type text[]
LINE 1: ...d" = 2517, "locations" = ARRAY['{"loc...
Hint: You will need to rewrite or cast the expression.
So I tried to dump it using:
import json
location_arr = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
instance.locations = json.dumps(location_arr)
instance.save()
Then I got this:
LINE 1: ...d" = 2517, "locations" = '[{"loc":...
DETAIL: "[" must introduce explicitly-specified array dimensions.
I am using:
Django 1.9
Python 2.7
Postgres 9.4.10
psycopg2 2.6.2
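One detail in the sample data is worth checking separately from the database error: the standard library json encoder does not serialize Decimal values by default, so json.dumps on a list like location_arr raises TypeError unless a fallback is supplied. A minimal sketch, assuming it is acceptable to store the amounts as strings (the helper name dump_locations is hypothetical):

```python
import json
from decimal import Decimal


def dump_locations(location_arr):
    """Serialize a list of dicts, converting Decimal values to strings."""
    # default=str is only called for objects json cannot encode natively.
    return json.dumps(location_arr, default=str)


location_arr = [
    {"locations": "loc1", "amount": Decimal("100.00")},
    {"locations": "loc2", "amount": Decimal("200.25")},
]
encoded = dump_locations(location_arr)
decoded = json.loads(encoded)
```

Building the Decimal from a string rather than a float also avoids carrying binary floating-point noise into the stored value.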

Arrays
First of all, let's take a close look at this important text from the PostgreSQL Arrays documentation.
Tip: Arrays are not sets; searching for specific array elements can be
a sign of database misdesign. Consider using a separate table with a
row for each item that would be an array element. This will be easier
to search, and is likely to scale better for a large number of
elements.
Most of the time, you should not be using arrays.
JSONB
JSONB is available in Django as the JSONField type. This field is more scalable and flexible than array fields and can be searched more efficiently. However, if you find yourself searching inside JSONB fields all the time, the above statement about arrays is equally valid for JSONB.
Now what do you have in your system? An array that holds JSONB fields. This is a disaster waiting to happen. Please normalize your data.
Recap
So when should you use ArrayField?
On the rare occasion when you don't need to search in that column and you don't need to use that column for a join.
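The "separate table with a row for each item" advice can be sketched as follows. This uses the standard library sqlite3 module instead of Django and PostgreSQL purely so that it runs on its own; the table and column names (instance, location, amount_cents) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instance (id INTEGER PRIMARY KEY);
    CREATE TABLE location (
        id INTEGER PRIMARY KEY,
        instance_id INTEGER NOT NULL REFERENCES instance(id),
        name TEXT NOT NULL,
        amount_cents INTEGER NOT NULL
    );
""")
conn.execute("INSERT INTO instance (id) VALUES (2517)")

# One row per former array element.
conn.executemany(
    "INSERT INTO location (instance_id, name, amount_cents) VALUES (?, ?, ?)",
    [(2517, "loc1", 10000), (2517, "loc2", 20025)],
)

# Searching and aggregating are now ordinary SQL.
total_cents = conn.execute(
    "SELECT SUM(amount_cents) FROM location WHERE instance_id = ?", (2517,)
).fetchone()[0]
names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM location WHERE instance_id = ? ORDER BY name", (2517,)
    )
]
```

In Django the same shape would be a second model with a ForeignKey pointing at the parent model.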

I have encountered the same scenario. Here is how I solved it.
models.py
from django.contrib.postgres.fields.jsonb import JSONField as JSONBField
location = JSONBField(default=list,null=True,blank=True)
insert
model_object.location = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
update
model_object.location.append({"locations" : "loc1","amount":Decimal(100.00)})
model_object.save()
This worked for me in
Django - 2.0.2
Postgres - 9.5
psycopg2 - 2.7.4
python - 3.4.3

You can sidestep this issue by using the JSONField as the column field type with a list as the root element.
from django.contrib.postgres.fields import JSONField
from django.db import models
class MyDBArray(models.Model):
    # JSONField whose root element is a list
    array_data = JSONField(default=list)
my_db_array = MyDBArray(array_data=[1, 2, 3])
my_db_array.save()
You would need to validate in the save method that the array_data field is actually list-like.
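That check could look something like the sketch below. It is plain Python with a hypothetical helper name (ensure_list), and it assumes only real lists are acceptable; in a model you would call it at the top of an overridden save() before calling super().save().

```python
def ensure_list(value):
    """Return value unchanged if it is a list, otherwise raise ValueError."""
    # str and dict are rejected: they are iterable but are not JSON arrays.
    if not isinstance(value, list):
        raise ValueError(
            "array_data must be a list, got %s" % type(value).__name__
        )
    return value
```

If tuples should also be accepted, the isinstance check could take (list, tuple) instead.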

This was fixed in the latest unreleased version of Django 2.2a1
pip install Django==2.2a1
PS I believe that it will work with versions >= 2.2a1

I guess the easiest way is to turn the field from an array of jsonfield into a jsonfield. Just add a key to the location_arr.
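A rough sketch of what "add a key" could mean here: wrap the list in a dict so that the stored JSON root is an object. The key name "locations" and the helper wrap_locations are assumptions for illustration, and the amounts are shown as strings to keep the example JSON-serializable.

```python
import json


def wrap_locations(location_arr, key="locations"):
    """Put the list under one key so the stored JSON root is an object."""
    return {key: list(location_arr)}


location_arr = [
    {"locations": "loc1", "amount": "100.00"},
    {"locations": "loc2", "amount": "200.25"},
]
payload = wrap_locations(location_arr)
round_tripped = json.loads(json.dumps(payload))
```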
Related thread: ArrayField with JSONField as base_field in Django

Related

How do you incrementally add lexeme/s to an existing Django SearchVectorField document value through the ORM?

You can add to an existing Postgresql tsvector value using ||, for example:
UPDATE acme_table
SET my_tsvector = my_tsvector ||
to_tsvector('english', 'some new words to add to existing ones')
WHERE id = 1234;
Is there any way to access this functionality via the Django ORM? I.e. incrementally add to an existing SearchVectorField value rather than reconstruct from scratch?
The issue I'm having is the SearchVectorField property returns the tsvector as a string. So when I use the || operator as +, eg:
from django.contrib.postgres.search import SearchVector
instance.my_tsvector_prop += SearchVector(
    ["new", "fields"],
    weight="A",
    config='english'
)
I get the error:
TypeError: SearchVector can only be combined with other SearchVector instances, got str.
Because:
type(instance.my_tsvector_prop) == str
A fix to this open Django bug whereby a SearchVectorField property returns a SearchVector instance would probably enable this, if possible. (Although less efficient than combining in the database. In our case the update will run asynchronously so performance is not too important.)
(
    MyModel.objects
    .filter(pk=1234)
    .update(
        my_tsvector_prop=F("my_tsvector_prop") + SearchVector(
            ["new_field_name"],
            weight="A",
            config='english',
        )
    )
)
Returns:
FieldError: Cannot resolve expression type, unknown output_field
Another solution would be to run a raw SQL UPDATE, although I'd rather do it through the Django ORM if possible as our tsvector fields often reference values many joins away, so it'd be nice to find a sustainable solution.

django orm group by json key in json field

I'm using json field on my django model:
class JsonTable(models.Model):
    data = JSONField()
    type = models.IntegerField()
I tried the following query, which works for normal SQL fields:
JsonTable.objects.filter(type=1).values('type').annotate(Avg('data__superkey'))
But this throws the following error:
FieldError: Cannot resolve keyword 'superkey' into field. Join on 'data' not permitted.
Is there a way to group by a JSON key using the Django ORM or some Python library, without using raw SQL?
Versions: Django 1.9b, PostgreSQL 9.4
UPDATE
Example 2:
JsonTable.objects.filter(type=1).values('data__happykey').annotate(Avg('data__superkey'))
throws the same error on happykey.
After some research I found the following solution:
from django.db.models import Count
from django.contrib.postgres.fields.jsonb import KeyTextTransform
superkey = KeyTextTransform('superkey', 'data')
table_items = JsonTable.objects.annotate(superkey = superkey).values('superkey').annotate(Count('id')).order_by()
I am not sure about order_by(), but the documentation says it is needed.
For other aggregation functions, type casting is needed:
from django.db.models import IntegerField
from django.db.models.functions import Cast
superkey = Cast(KeyTextTransform('superkey', 'data'), IntegerField())
I tested this with a different model, so I hope I have transcribed the code without typos. PostgreSQL 9.6, Django 2.07
If you are using this package https://github.com/bradjasper/django-jsonfield,
there is nothing in the code for managing such simulated related queries (data__some_json_key).
As the JSON data is stored as text, you will have to fall back to raw SQL or, better, use the queryset extra() method, although parsing JSON in SQL seems difficult.

Django change database field from integer to CharField

I have a Django app with a populated (Postgres) database that has an integer field that I need to change to a CharField. I need to start storing data with leading zeros in this field. If I run migrate (Django 1.8.4), I get the following error:
psycopg2.ProgrammingError: operator does not exist: character varying >= integer
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I tried searching Google, but didn't really find much help. I don't really know what I'm supposed to do here. Can anyone help?
Originally, I thought that there would be a simple solution where Django or Postgres would do the conversion automatically, but it appears that it doesn't work that way. I think some of the suggestions made by others might have worked, but I came up with a simple solution of my own. This was done on a production database so I had to be careful. I ended up adding the character field to the model and did the migration. I then ran a small script in the Python shell that copied the data in the integer field, did the conversion that I needed, then wrote that to the new field in the same record in the production database.
For example:
members = Members.objects.all()
for mem in members:
    mem.memNumStr = str(mem.memNum)
    # ... more data processing here
    mem.save()
So now, I had the data duplicated in the table in a str field and an int field. I could then modify the views that accessed that field and test it on the production database without breaking the old code. Once that is done, I can drop the old int field. A little bit involved, but pretty simple in the end.
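Since the point of the change was to keep leading zeros, the "more data processing" step in the loop is probably where padding would go. A small sketch, assuming a fixed width of 6 (the width and the helper name to_member_code are made up for illustration):

```python
def to_member_code(mem_num, width=6):
    """Convert an integer member number to a zero-padded string."""
    if mem_num < 0:
        raise ValueError("member numbers are expected to be non-negative")
    # zfill pads on the left with zeros and never truncates longer values.
    return str(mem_num).zfill(width)
```

Inside the loop this would be used as mem.memNumStr = to_member_code(mem.memNum).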
You'll need to generate a schema migration. How you do that will depend on which version of Django you are using (versions 1.7 and newer have built-in migrations; older versions of Django use South).
Of Note: if this data is production data, you'll want to be very careful with how you proceed. Make sure you have a known good copy of your data you can reinstall if things get hairy. Test your migrations in a non-production environment. Be. Careful!
As for the transformation on the field (from IntegerField to CharField) and the transformation on the field values (to be prepended with leading zeroes) - Django cannot do this for you, you'll have to write this manually. The proper way to do this is to use the django.db.migrations.RunPython migration command (see here).
My advice would be to generate multiple migrations: one that creates a new CharField, my_new_column, and writes to this new column via RunPython, then a second migration that removes the original IntegerField my_old_column and renames my_new_column to my_old_column.
From Django 2.x you just need to change the field type from IntegerField to CharField, and Django will automatically alter the field and migrate the data for you.
I thought a full code example would be helpful. I followed the approach outlined by ken-koster in the comment above. I ended up with two migrations (0091 and 0092). It seems that the two migrations could be squashed into one migration but I did not go that far. (Maybe Django does this automatically but the framework here could be used in case the string values are more complicated than a simple int to string conversion. Also I included a reverse conversion example.)
First migration (0091):
from django.db import migrations, models
class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0090_auto_20200622_1452'),
    ]
    operations = [
        # store original values in tmp fields
        migrations.RenameField(
            model_name='member',
            old_name='mem_num',
            new_name='mem_num_tmp',
        ),
        # add back fields as string fields
        migrations.AddField(
            model_name='member',
            name='mem_num',
            field=models.CharField(default='0', max_length=64, verbose_name='Number of members'),
        ),
    ]
Second migration (0092):
from django.db import migrations
def copyvals(apps, schema_editor):
    # forward: copy the old integer values into the new string field
    Member = apps.get_model("myapp", "Member")
    members = Member.objects.all()
    for member in members:
        member.mem_num = str(member.mem_num_tmp)
        member.save()
def copyreverse(apps, schema_editor):
    # reverse: convert the strings back to integers where possible
    Member = apps.get_model("myapp", "Member")
    members = Member.objects.all()
    for member in members:
        try:
            member.mem_num_tmp = int(member.mem_num)
            member.save()
        except Exception:
            print("Reverse migration for member %s failed." % member.name)
            print(member.mem_num)
class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0091_custom_migration'),
    ]
    operations = [
        # convert integers to strings
        migrations.RunPython(copyvals, reverse_code=copyreverse),
        # remove the tmp field
        migrations.RemoveField(
            model_name='member',
            name='mem_num_tmp',
        ),
    ]

Django: Lookup by length of text field

I have a model with a text field on it. I want to do a lookup that will return all the items that have a string with a length of 7 or more in that field. Possible?
How about a lookup for all objects in which that field isn't ''?
I think regex lookup can help you:
ModelWithTextField.objects.filter(text_field__iregex=r'^.{7,}$')
or you can always perform raw SQL queries on Django model:
ModelWithTextField.objects.raw('SELECT * FROM model_with_text_field WHERE LEN_FUNC_NAME(text_field) >= 7')
where LEN_FUNC_NAME is the name of the "string length" function for your DBMS. For example, in MySQL it is named LENGTH.
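To see the raw SQL form in action without a Django project, here is a sketch against an in-memory SQLite database, where the length function is also called LENGTH. The sample rows are invented; note that "7 or more" translates to >= 7, and the second query covers the follow-up question about non-empty values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE model_with_text_field (id INTEGER PRIMARY KEY, text_field TEXT)"
)
conn.executemany(
    "INSERT INTO model_with_text_field (text_field) VALUES (?)",
    [("short",), ("sevench",), ("long enough",), ("",)],
)

# Rows whose text is at least 7 characters long.
long_rows = [
    row[0]
    for row in conn.execute(
        "SELECT text_field FROM model_with_text_field "
        "WHERE LENGTH(text_field) >= 7 ORDER BY id"
    )
]

# Rows whose text is not the empty string.
non_empty = conn.execute(
    "SELECT COUNT(*) FROM model_with_text_field WHERE text_field <> ''"
).fetchone()[0]
```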
Since Django 1.8, you can use the Length database function:
from django.db.models.functions import Length
qs = ModelWithTextField.objects \
    .annotate(text_len=Length('text_field_name')) \
    .filter(text_len__gte=7)
Since Django 1.9 the Length database function may be registered as a queryset lookup:
from django.db.models import CharField
from django.db.models.functions import Length
CharField.register_lookup(Length, 'length')
You can then use it in the following manner:
# Get authors whose name is longer than 7 characters
authors = Author.objects.filter(name__length__gt=7)
See: Database Functions - Length() in official Django 3.2 Docs

ListField without duplicates in Python mongoengine

I must be missing something really obvious. But I can't seem to find a way to represent a set using mongoengine.
class Item(Document):
    name = StringField(required=True)
    description = StringField(max_length=50)
    parents = ListField(ReferenceField('self'))
i = Item.objects.get_or_create(name='test item')[0]
i2 = Item(name='parents1')
i2.save()
i3 = Item(name='parents3')
i3.save()
i.parents.append(i2)
i.parents.append(i2)
i.parents.append(i3)
i.save()
The above code will create a duplicate entry for i2 in the parents field of i. How do you express a foreign-key-like relationship in mongoengine?
Instead of using append then using save and letting MongoEngine convert that to updates, you could use atomic updates and the $addToSet method - see the updating mongoDB docs
So in your case you could do:
i.update(add_to_set__parents=i2)
i.update(add_to_set__parents=i3)
i.update(add_to_set__parents=i2)
Support for addToSet and each doesn't currently exist - see: https://github.com/MongoEngine/mongoengine/issues/33
Update:
add_to_set is supported.
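For intuition only, the effect of $addToSet on a list field can be mimicked in plain Python. This is not how MongoEngine implements it (the real operation runs atomically on the server); the helper name add_to_set below is simply borrowed from the keyword for illustration, and strings stand in for the referenced documents.

```python
def add_to_set(items, new_item):
    """Append new_item only if an equal item is not already present."""
    if new_item not in items:
        items.append(new_item)
    return items


parents = []
for parent in ["i2", "i3", "i2"]:
    add_to_set(parents, parent)
```

A client-side check like this is subject to races between readers and writers, which is the reason to prefer the atomic update shown above.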
