ListField without duplicates in Python mongoengine - python

I must be missing something really obvious. But I can't seem to find a way to represent a set using mongoengine.
class Item(Document):
    name = StringField(required=True)
    description = StringField(max_length=50)
    parents = ListField(ReferenceField('self'))
i = Item.objects.get_or_create(name='test item')[0]
i2 = Item(name='parents1')
i2.save()
i3 = Item(name='parents3')
i3.save()
i.parents.append(i2)
i.parents.append(i2)
i.parents.append(i3)
i.save()
The above code will create a duplicate entry for i2 in the parents field of i. How do you express a foreign-key-like relationship in mongoengine?

Instead of appending, saving, and letting MongoEngine convert that into updates, you could use atomic updates with the $addToSet operator (see the MongoDB docs on updating).
So in your case you could do:
i.update(add_to_set__parents=i2)
i.update(add_to_set__parents=i3)
i.update(add_to_set__parents=i2)
Support for addToSet and each doesn't currently exist - see: https://github.com/MongoEngine/mongoengine/issues/33
Update:
add_to_set is supported.
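If you would rather keep the append-then-save flow, the same "add only if absent" behaviour can be approximated on the client side. This is a minimal sketch, assuming a hypothetical helper named add_unique (it is not part of MongoEngine) and assuming that equality on the list items is enough to spot a duplicate. Unlike add_to_set it is not atomic, so two concurrent writers could still produce duplicates.

```python
def add_unique(items, new_item):
    """Append new_item to items only if it is not already present.

    Hypothetical helper, not a MongoEngine API. Returns True if the item
    was appended and False if it was already in the list.
    """
    if new_item in items:
        return False
    items.append(new_item)
    return True


# Plain strings stand in for the referenced Item documents here.
parents = []
add_unique(parents, "i2")
add_unique(parents, "i2")  # ignored, already present
add_unique(parents, "i3")
print(parents)
```

With a real document you would call add_unique(i.parents, i2) before i.save().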

Related

Array type in SQLite

I'm in the middle of developing a small site in Python. I use flask and venv.
I am currently in the middle of writing the database and here is one of my tables:
class Message(db.Model):
    message_id = db.Column(db.Integer, primary_key=True)
    session_id = db.Column(db.String(30), unique=True)
    application_id = db.Column(db.Integer)
    participants = db.Column(db.Array())
    content = db.Column(db.String(200))
The problem is in line 5:
"Array".
There is no such variable type.
I want to create a list of message recipients. Is there an Array or List variable type in SQLite?
If so, what is it and how is it used?
And if not, how can I make a list of recipients anyway?
Anyone know?
Thank you very much!
SQLite does not support arrays directly. Its storage classes are NULL, INTEGER, REAL, TEXT and BLOB. See the list of types it supports.
To accomplish what you need, you have to use a custom encoding, or use an FK, i.e. create another table, where each item in the array is stored as a row. This would get tedious in my opinion.
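The separate-table option can be sketched with the standard-library sqlite3 module. This is only an illustration under assumptions: the table name message_participant, its columns, and the sample names are hypothetical and not taken from the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message (
        message_id INTEGER PRIMARY KEY,
        content    TEXT
    );
    -- One row per recipient instead of an array column (hypothetical table).
    CREATE TABLE message_participant (
        message_id  INTEGER NOT NULL REFERENCES message(message_id),
        participant TEXT NOT NULL,
        PRIMARY KEY (message_id, participant)
    );
""")

# Insert a message, then one row for each of its recipients.
cur = conn.execute("INSERT INTO message (content) VALUES (?)", ("hello",))
message_id = cur.lastrowid
conn.executemany(
    "INSERT INTO message_participant (message_id, participant) VALUES (?, ?)",
    [(message_id, name) for name in ("alice", "bob")],
)

# Read the recipients back as a Python list.
rows = conn.execute(
    "SELECT participant FROM message_participant "
    "WHERE message_id = ? ORDER BY participant",
    (message_id,),
).fetchall()
participants = [row[0] for row in rows]
print(participants)
```

The composite primary key also prevents the same recipient from being stored twice for one message.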
Alternatively, it can be done in SQLAlchemy and you will want to have a look at the PickleType:
array = db.Column(db.PickleType(mutable=True))
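Please note that you have to pass mutable=True to be able to edit the column in place; SQLAlchemy will then detect changes automatically and save them when you commit. This parameter only exists in old SQLAlchemy releases (it was removed in 0.8); later versions track in-place changes through the sqlalchemy.ext.mutable extension instead, e.g. MutableList.as_mutable(PickleType).
Also, have a look at the ScalarListType in the SQLAlchemy-Utils package for saving multiple values in a column.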
Update:
In SQLAlchemy you can use an ARRAY column.
For example:
class Example(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    my_array = db.Column(db.ARRAY(db.Integer()))
    # You can easily find records:
    # Example.query.filter(Example.my_array.contains([1, 2, 3])).all()
    # You can also use text items in the array:
    # db.Column(db.ARRAY(db.Text()))
Update: This doesn't work in SQLite; SQLAlchemy's ARRAY type is for Postgres databases only. The best alternative for most people would be something involving JSON, or switching to Postgres if possible. I'll be attempting JSON myself. Credit to the replier in the comments.
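One way the JSON route could look, as a minimal sketch with the standard-library sqlite3 and json modules: the list is serialized into a TEXT column on write and parsed on read. The helper names save_participants and load_participants are hypothetical, and the table is reduced to the columns needed for the illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message (message_id INTEGER PRIMARY KEY, participants TEXT)"
)


def save_participants(conn, participants):
    """Store the list as a JSON string and return the new row id."""
    cur = conn.execute(
        "INSERT INTO message (participants) VALUES (?)",
        (json.dumps(participants),),
    )
    return cur.lastrowid


def load_participants(conn, message_id):
    """Return the stored list, or None if the message does not exist."""
    row = conn.execute(
        "SELECT participants FROM message WHERE message_id = ?",
        (message_id,),
    ).fetchone()
    return json.loads(row[0]) if row is not None else None


message_id = save_participants(conn, ["alice", "bob"])
loaded = load_participants(conn, message_id)
print(loaded)
```

The trade-off is that the database sees only an opaque string, so searching for a single recipient is less convenient than with a separate table.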

Django JSONField inside ArrayField

I have a problem inserting to a field using ArrayField with JSONField inside.
models.py
locations = ArrayField(JSONField(null = True,blank = True), blank=True, null = True)
Insert
location_arr = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
instance.locations = location_arr
instance.save()
When I do this, I get:
column "locations" is of type jsonb[] but expression is of type text[]
LINE 1: ...d" = 2517, "locations" = ARRAY['{"loc...
Hint: You will need to rewrite or cast the expression.
So I tried to dump it using:
import json
location_arr = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
instance.locations = json.dumps(location_arr)
instance.save()
Then I get this:
LINE 1: ...d" = 2517, "locations" = '[{"loc":...
DETAIL: "[" must introduce explicitly-specified array dimensions.
I am using:
Django 1.9
Python 2.7
Postgres 9.4.10
psycopg2 2.6.2
Arrays
First of all, let's take a close look at this important text from the PostgreSQL Arrays documentation.
Tip: Arrays are not sets; searching for specific array elements can be
a sign of database misdesign. Consider using a separate table with a
row for each item that would be an array element. This will be easier
to search, and is likely to scale better for a large number of
elements.
Most of the time, you should not be using arrays.
JSONB
JSONB is available in Django as the JSONField type. This field is more scalable and flexible than array fields and can be searched more efficiently. However, if you find yourself searching inside JSONB fields all the time, the above statement about arrays is equally valid for JSONB.
Now what do you have in your system? An array that holds JSONB fields. This is a disaster waiting to happen. Please normalize your data.
Recap
So when should you use ArrayField?
On the rare occasion when you don't need to search in that column and you don't need to use that column for a join.
I encountered the same scenario. Here is how I solved it:
models.py
from django.contrib.postgres.fields.jsonb import JSONField as JSONBField
location = JSONBField(default=list,null=True,blank=True)
insert
model_object.location = [{"locations" : "loc1","amount":Decimal(100.00)},{"locations" : "loc2","amount":Decimal(200.25)}]
update
model_object.location.append({"locations" : "loc1","amount":Decimal(100.00)})
model_object.save()
This worked for me in
Django - 2.0.2
Postgres - 9.5
psycopg2 - 2.7.4
python - 3.4.3
You can sidestep this issue by using the JSONField as the column field type with a list as the root element.
from django.contrib.postgres.fields import JSONField
from django.db import models
class MyDBArray(models.Model):
    array_data = JSONField(default=list)
my_db_array = MyDBArray(array_data=[1, 2, 3])
my_db_array.save()
You would need to validate in the save method that the array_data field is actually list-like.
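The check itself can stay very small. A minimal sketch, assuming a hypothetical helper named ensure_list that you would call from your own save() override before delegating to the parent class; it is plain Python and not a Django API, and in a real project you might prefer to raise Django's ValidationError instead of TypeError.

```python
def ensure_list(value):
    """Return value as a list, or raise TypeError if it is not list-like.

    Hypothetical helper: only lists and tuples are accepted, so a dict or
    a string at the root of the JSON column is rejected.
    """
    if not isinstance(value, (list, tuple)):
        raise TypeError(
            "array_data must be a list, got %s" % type(value).__name__
        )
    return list(value)


print(ensure_list([1, 2, 3]))
```

Inside the model, the call would look like self.array_data = ensure_list(self.array_data) at the top of save().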
This was fixed in Django 2.2a1, which was the latest version and still unreleased at the time of writing.
pip install Django==2.2a1
PS I believe that it will work with versions >= 2.2a1
I guess the easiest way is to turn the field from an array of jsonfield into a jsonfield. Just add a key to the location_arr.
Related thread: ArrayField with JSONField as base_field in Django

How to filter a Django queryset via text field in the latest corresponding entry in a separate model

I have a model where I needed historical data for a couple specific fields, so I put those fields into a separate model with a foreign key relationship.
Something sort of like this:
class DataThing(models.Model):
    # a bunch of fields here...
class DataThingHistory(models.Model):
    datathing_id = models.ForeignKey('DataThing', on_delete=models.CASCADE)
    text_with_history = models.CharField(max_length=500, null=True, blank=True)
    # other similar fields...
    timestamp = models.DateTimeField()
Now I'm trying to filter the former model using a text field in the latest corresponding entry in the latter.
Basically if these were not separate models I'd just try this:
search_results = DataThing.objects.filter(text_with_history__icontains=searchterm)
But I haven't figured out a good way to do this across this one-to-many relationship and using only the entry with the latest timestamp in the latter model, at least by using the Django ORM.
I have an idea of how to do the query I want using raw SQL, but I'd really like to avoid using raw if at all possible.
This solution makes use of distinct(*fields) which is currently only supported by Postgres:
latest_things = (
    DataThingHistory.objects
    .order_by('datathing_id_id', '-timestamp')
    .distinct('datathing_id_id')
)
lt_with_searchterm = DataThingHistory.objects.filter(
    id__in=latest_things, text_with_history__icontains=searchterm
)
search_results = DataThing.objects.filter(datathinghistory__in=lt_with_searchterm)
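This should result in a single DB query. I have split the query for readability, but you can nest it into a single statement. By the way, as you can see here, foo_id is not a good name for a ForeignKey field: Django appends _id to the field name for the database column, which is why datathing_id_id appears above.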
You would do the same by querying DataThing while referring to DataThingHistory:
search_results = DataThing.objects.filter(datathinghistory__text_with_history__icontains=searchterm)
Check the Django docs on how to query reverse relationships.
Edit:
My previous answer is incomplete. In order to search on latest history for each DataThing, you need to annotate on timestamp using Max:
from django.db.models import Max
search_results = search_results.values('field1', 'field2',...).annotate(latest_history=Max('datathinghistory__timestamp'))
This wouldn't give you complete DataThing objects, but you could add as many fields to values as you want.

Count number of foreign keys in Django

I've read the documentation and all the other answers on here but I can't seem to get my head around it.
I need to count the number of foreign keys in an other table connected by a foreign key to a row from a queryset.
class PartReference(models.Model):
    name = models.CharField(max_length=10)
    code = models.IntegerField()
class Part(models.Model):
    code = models.ForeignKey(PartReference)
    serial_number = models.IntegerField()
I'll do something like:
results = PartReference.objects.all()
But I want a variable containing the count of the number of parts like any other field, something like:
results[0].count
to ultimately do something like:
print(results[0].name, results[0].code, results[0].count)
I can't wrap my head around the Django documentation. There is some stuff going on with entry_set in the example, but it doesn't explain where entry came from or how it was defined.
_set is used by Django for reverse lookups on database relationships. It is appended to the lowercased name of the model that holds the foreign key, to say that you want all the related objects. In your case that is part_set, which means that for any given result you can get the count of that part set as follows:
results = PartReference.objects.all()
for result in results:
    print(result.name, result.code, result.part_set.count())
I can't wrap my head around the Django documentation. There is some
stuff going on with entry_set in the example, but it doesn't explain
where entry came from or how it was defined.
entry_set comes from the Entry model. It is odd that the Django docs don't explain this; it is implicit. If B has a foreign key to A, then you use b_set to access the B objects from A.
Coming back to your problem, I think this was asked many times here, e.g. Django count related objects

Mongoengine integer id, and User creating

MongoDB is using string(hash) _id field instead of integer; so, how to get classic id primary key? Increment some variable each time I create my class instance?
class Post(Document):
    authors_id = ListField(IntField(required=True), required=True)
    content = StringField(max_length=100000, required=True)
    id = IntField(required=True, primary_key=True)
    def __init__(self):
        # what next?
        pass
Trying to create new user raises exception:
mongoengine.queryset.OperationError: Tried to save duplicate unique keys
(E11000 duplicate key error index: test.user.$_types_1_username_1
dup key: { : "User", : "admin" })
Code:
user = User.create_user(username='admin', email='example#mail.com',
password='pass')
user.is_superuser = True
user.save()
Why?
There is the SequenceField, which you could use to provide this. But as stated, incrementing IDs don't scale well; are they really needed? Can't you use an ObjectId or a slug instead?
If you want to use an incrementing integer ID, the method to do it is described here:
http://www.mongodb.org/display/DOCS/How+to+Make+an+Auto+Incrementing+Field
This won't scale for a very large DB/app, but it works well for small or moderately sized applications.
1) If you really want to do it, you have to override the MongoEngine method that saves your documents so that it looks up the document with the highest id and saves the new document with that id + 1. This adds overhead (one additional read for every write), so I would discourage you from following this path. You could also end up with duplicate IDs: if you save two records at exactly the same time, both reads return the same last id (say 1) and both writes use 1 + 1 = 2, which is really bad. To avoid this you would need to lock the entire collection on every insert, at a cost in performance.
2) You simply can't save more than one user with the same username (as the error message is telling you), and you already have a user called "admin".
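The race in point 1 comes from reading the highest id and writing id + 1 as two separate steps. Below is a minimal sketch of the alternative, assuming the increment and the read happen as one atomic step. The lock-protected in-memory SequenceCounter class is a hypothetical stand-in for a database-side atomic increment; it is not MongoEngine code and does not talk to MongoDB.

```python
import threading


class SequenceCounter:
    """In-memory stand-in for an atomic counter (hypothetical)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def next_id(self):
        # Increment and read under one lock, so no two callers can
        # observe the same value.
        with self._lock:
            self._value += 1
            return self._value


counter = SequenceCounter()
ids = []
ids_lock = threading.Lock()


def create_post():
    """Simulate saving one document with a freshly allocated id."""
    new_id = counter.next_id()
    with ids_lock:
        ids.append(new_id)


threads = [threading.Thread(target=create_post) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every simulated save received a distinct id.
print(len(ids) == len(set(ids)))
```

Because each caller gets its id from a single atomic step, there is no window in which two writers can both see the same "last id".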
