NDB: Query for unset properties? - python

I've had some data being gathered in production for a couple of days with, let's say, the following model:
class Tags(ndb.Model):
    dt_added = ndb.DateTimeProperty(auto_now_add=True)
    s_name = ndb.StringProperty(required=True, indexed=True)
Imagine I now add a new property to the model:
class Foo(ndb.Model):
    is_valid = ndb.BooleanProperty(default=False)
    some_key = ndb.KeyProperty(repeated=True)
class Tags(ndb.Model):
    dt_added = ndb.DateTimeProperty(auto_now_add=True)
    name = ndb.StringProperty(required=True, indexed=True)
    new_prop = ndb.StructuredProperty(Foo)
... and gather some more data with this new model.
So now I have a portion of data that has the property new_prop set, and another portion that does not have it set.
My question is: how do I query for the data where the new property new_prop is NOT set?
I've tried:
query_tags = Tags.query(Tags.new_prop == None).fetch()
But this does not seem to return the data without that property set. Any suggestions?
Thanks!

The Datastore distinguishes between an entity that does not possess a property and one that possesses the property with a null value (None).
It is not possible to query for entities that are specifically lacking a given property. One alternative is to define a fixed (modeled) property with a default value of None, then filter for entities with None as the value of that property.
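The distinction can be pictured with a small plain-Python model of a property index. This is only a sketch and not the NDB API: the MISSING sentinel, build_index and the sample entities are hypothetical names made up for illustration, and the real Datastore index is more involved.

```python
# Plain-Python model of a property index; this is NOT the NDB API.
MISSING = object()  # sentinel: the entity has no such property at all


def build_index(entities, prop):
    """Map each stored value of `prop` to the ids of entities that have it.

    Entities lacking the property are skipped, which mirrors why a
    `prop == None` filter cannot find them.
    """
    index = {}
    for entity_id, props in entities.items():
        value = props.get(prop, MISSING)
        if value is MISSING:
            continue  # no index row is written for this entity
        index.setdefault(value, []).append(entity_id)
    return index


# Made-up sample data: one entity written before new_prop existed,
# one with new_prop explicitly None, one with a real value.
entities = {
    "old": {"name": "a"},
    "new_none": {"name": "b", "new_prop": None},
    "new_set": {"name": "c", "new_prop": "x"},
}

index = build_index(entities, "new_prop")
matches_none = index.get(None, [])
print(matches_none)
```

In this model only the entity that explicitly stores None is found; the one written before the property existed never appears in the index at all.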

Related

How to fetch related model in django_tables2 to avoid a lot of queries?

I might be missing something simple here, or I simply lack the know-how.
I have three models: Site, SiteField and, most importantly, SiteFieldValue.
My idea is to create a django_tables2 table for Site that shows the values from SiteFieldValue, one per column, for each site. The problem is that each site can have around 50 of them. That number, multiplied by the number of columns with a def render_ function and by the number of sites, adds up to a lot of queries, and I want to avoid that.
My question is: is it possible to, for example, prefetch all the values for each site (SiteFieldValue.objects.filter(site=record).first() somewhere in the SiteListTable class), put them into an array, and then use them in the def render_ functions by simply looking up the value assigned to a key (the id of the field)?
Models:
class Site(models.Model):
    name = models.CharField(max_length=100)
class SiteField(models.Model):
    name = models.CharField(max_length=100)
    description = models.CharField(max_length=500, null=True, blank=True)
    def __str__(self):
        return self.name
class SiteFieldValue(models.Model):
    site = models.ForeignKey(Site, on_delete=models.CASCADE)
    field = models.ForeignKey(SiteField, on_delete=models.CASCADE)
    value = models.CharField(max_length=500)
Table view
class SiteListTable(tables.Table):
    name = tables.Column()
    importance = tables.Column(verbose_name='Importance', empty_values=())
    vertical = tables.Column(verbose_name='Vertical', empty_values=())
    # ... and many more to come... all values based on siteFieldValue
    def render_importance(self, value, record):
        q = SiteFieldValue.objects.filter(site=record, field=1).first()
        # ^^ I don't want this!! I would want the SiteFieldValue to be prefetched somewhere else for that model and just check the array for field id in here.
        if q:
            return q.value
        else:
            return None
    def render_vertical(self, value, record):
        q = SiteFieldValue.objects.filter(site=record, field=2).first()
        # ^^ I don't want this!! I would want the SiteFieldValue to be prefetched somewhere else for that model and just check the array for field id in here.
        if q:
            return q.value
        else:
            return None
    class Meta:
        model = Site
        attrs = {
            "class": "table table-striped",
            "thead": {"class": "thead-light"},
        }
        template_name = "django_tables2/bootstrap.html"
        fields = ("name", "importance", "vertical",)
This might get you started. I've broken it up into steps but they can be chained quite easily.
# Get all the objects you'll need. You can filter as appropriate (say, by site__name).
qs = SiteFieldValue.objects.select_related('site', 'field')
# Let's keep things simple and only get the values we want.
qs_values = qs.values('site__name', 'field__name', 'value')
# qs_values is a queryset. For ease of manipulation, let's make it a list.
qs_list = list(qs_values)
# Set up a final dict.
final_dict = {}
# Create the keys (sites) and values so they are all grouped.
for site in qs_list:
    # Create the sub-dict for the fields if not already created.
    if site['site__name'] not in final_dict:
        final_dict[site['site__name']] = {}
    # Each row only has the keys 'site__name', 'field__name' and 'value',
    # so the 'name' column is stored under the literal key 'name'.
    final_dict[site['site__name']]['name'] = site['site__name']
    final_dict[site['site__name']][site['field__name']] = site['value']
# Now let's convert our dict of dicts into a list of dicts
# for use as per the django_tables2 docs.
data = []
for site in final_dict:
    data.append(final_dict[site])
Now you have a list of dicts, e.g.
[{'name': site__name, 'col1name': value, ...}, ...] and can pass it to the table as shown in the django_tables2 docs.
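If the render_ methods should stay and only the per-cell queries should go, the same grouping idea can be keyed by ids instead of names. The following is a minimal plain-Python sketch, assuming the rows come from something like list(SiteFieldValue.objects.values('site_id', 'field_id', 'value')); build_lookup, render_field and the sample rows are hypothetical and not part of django_tables2.

```python
from collections import defaultdict

# Stand-in for the rows a single values() query would return;
# the data below is made up for illustration.
rows = [
    {"site_id": 1, "field_id": 1, "value": "high"},
    {"site_id": 1, "field_id": 2, "value": "retail"},
    {"site_id": 2, "field_id": 1, "value": "low"},
]


def build_lookup(rows):
    """Group values as {site_id: {field_id: value}} in a single pass."""
    lookup = defaultdict(dict)
    for row in rows:
        # Keep the first value seen per (site, field), like .first() would.
        lookup[row["site_id"]].setdefault(row["field_id"], row["value"])
    return lookup


def render_field(lookup, site_id, field_id):
    """What a render_<column> method could do instead of running a query."""
    return lookup.get(site_id, {}).get(field_id)


lookup = build_lookup(rows)
print(render_field(lookup, 1, 2))
print(render_field(lookup, 2, 2))
```

The lookup is built once per table, so each render call becomes a dictionary access rather than a database round trip.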

Insert into Django JsonField without pulling the content into memory

I have a Django model:
class Employee(models.Model):
    data = models.JSONField(default=dict, blank=True)
This JSONField contains two years of data, which is a ton of data.
class DataQUery:
    def __init__(self, **kwargs):
        super(DataQUery, self).__init__()
        self.data = kwargs['data']  # is pulling the data in memory,
                                    # that's what I want to avoid
        # Then we have a dictionary called today to hold daily data
        today = dict()
        # ... putting stuff in today
        # Then insert it into data with today's date as key for that day
        self.data[f"{datetime.now().date()}"] = today
        # Then update the database
        Employee.objects.filter(id=1).update(data=self.data)
I want to insert into data column without pulling it into memory.
Yes, I could change default=dict to default=list and directly do
Employee.objects.filter(id=1).update(data=today)
BUT I need today's DATE to identify the different days.
So if I don't need to pull the data column, I don't need kwargs dict. Let's say I don't init anything (so not pulling anything into memory), how can I update data column with a dictionary that's identified by today's date, such that after the update, the data column (JSONField) will look like {2021-08-10:{...}, 2021-08-11:{..}}
In a relational database, one can store multiple items that belong to the same entity by creating a new model with a ForeignKey to that other model. We can implement this as:
class Employee(models.Model):
    # …
    pass
class EmployeePresence(models.Model):
    employee = models.ForeignKey(Employee, on_delete=models.CASCADE)
    date = models.DateField(auto_now_add=True)
    data = models.JSONField(default=dict, blank=True)
    class Meta:
        ordering = ['employee', 'date']
To add a new EmployeePresence object that relates to an Employee object e, we create one with:
EmployeePresence.objects.create(
    employee=e,
    # Note: with auto_now_add=True, Django sets date to today on creation
    # and overrides a value passed here.
    date='2021-08-11',
    data={'some': 'data', 'other': 'data'}
)
We can access all EmployeePresences of a given Employee object e with:
e.employeepresence_set.all()
Creating, updating, and removing a single EmployeePresence record is thus simpler, and can be done efficiently through querying.
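The effect of the one-row-per-day layout can be tried outside Django. The sketch below uses the standard-library sqlite3 module as a stand-in for the database; the table name employee_presence, its columns and the add_day helper are assumptions for illustration, not what Django would generate.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee_presence (
        employee_id INTEGER NOT NULL,
        date TEXT NOT NULL,
        data TEXT NOT NULL,
        PRIMARY KEY (employee_id, date)
    )
    """
)


def add_day(conn, employee_id, date, today):
    """Insert one day's data; nothing already stored is read back."""
    conn.execute(
        "INSERT INTO employee_presence (employee_id, date, data) VALUES (?, ?, ?)",
        (employee_id, date, json.dumps(today)),
    )


add_day(conn, 1, "2021-08-10", {"hours": 8})
add_day(conn, 1, "2021-08-11", {"hours": 6})

# Fetch a single day without loading the rest of the history.
row = conn.execute(
    "SELECT data FROM employee_presence WHERE employee_id = ? AND date = ?",
    (1, "2021-08-11"),
).fetchone()
day = json.loads(row[0])
total = conn.execute("SELECT COUNT(*) FROM employee_presence").fetchone()[0]
print(day, total)
```

Each insert touches only the new row, so the two years of existing data never have to be pulled into memory to add a day.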

Avoid non-specified attributes on model creation (Django + MongoDB)

I want to create an object in Python but, since I'm using Django, I don't want to save those attributes with None or null value.
When I create an object passing a couple of arguments, those attributes passed are assigned correctly; but the rest, which I wish were not created, are assigned the value "None"
models.py:
from djongo import models
class MyClass(models.Model):
    _id = models.CharField(max_length=20, primary_key=True)
    attr1 = models.TextField()
    attr2 = models.CharField(max_length=300)
    attr3 = models.IntegerField()
    attr4 = models.CharField(max_length=50)
views.py:
def index(request):
    a = MyClass(_id='TestID', attr1='Xxxx')
    for key in a:
        print(str(key) + " = " + str(a[key]))
    a.save()
What I get from the print loop is:
_id = TestID
attr1 = Xxxx
attr2 = None
attr3 = None
attr4 = None
Which means I am saving an object full of empty data.
I would like to save a MyClass object with just the specified attributes in order to save space in memory.
EDIT:
I changed the code and only used djongo. Before I was using mongoengine's Document structure, but not anymore.
Also, I have checked what's written in the MongoDB database and it saves this:
{
    "_id" : "TestAptm_2", #Type String
    "access" : "", #Type String
    "address" : "", #Type String
    "capacity" : null, #Type null
    "city" : "" #Type String
}
Those empty and null values are what I want to avoid.
Django models are the bridge between your Python objects and your database; they make up the Django ORM, so you must not design your fields dynamically. That would be the same as having a database table with a different number of columns per row, which is impossible. Instead, you can create a separate model for each type of object and store them in your database that way.
You are printing the attributes of the Python object, which doesn't necessarily reflect what got saved in the database.
To check the shape of the raw object in the database, you can use the as_pymongo() method.
Mongoengine is actually behaving as you want, see below:
class MyClass(Document):
    _id = StringField(max_length=20, primary_key=True)
    atr1 = StringField()
    atr2 = StringField(max_length=300)
    atr5 = StringField(max_length=50)
    atr6 = IntField()
MyClass(_id='whatever', atr1='garbage').save()
print(MyClass.objects.as_pymongo())
# [{u'atr1': u'garbage', u'_id': u'whatever'}]

How to override create_table() in peewee to create extra history?

Following this SO answer and using the (excellent) Peewee ORM, I'm trying to make a versioned database in which the history of a record is stored in a second _history table. So when I create a new table using the create_table() method, I also need to create a second table with four extra fields.
So let's say I've got the following table:
class User(db.Model):
    created = DateTimeField(default=datetime.utcnow)
    name = TextField()
    address = TextField()
When this table is created I also want to create the following table:
class UserHistory(db.Model):
    created = DateTimeField()  # Note this shouldn't contain a default anymore because the value is taken from the original User table
    name = TextField()
    address = TextField()
    # The following fields are extra
    original = ForeignKeyField(User, related_name='versions')
    updated = DateTimeField(default=datetime.utcnow)
    revision = IntegerField()
    action = TextField()  # 'INSERT' or 'UPDATE' (I never delete anything)
So I tried overriding the Model class like this:
class Model(db.Model):
    @classmethod
    def create_table(cls, fail_silently=False):
        db.Model.create_table(cls, fail_silently=fail_silently)
        history_table_name = db.Model._meta.db_table + 'history'
        # How to create the history table here?
As you can see I manage to create a variable with the history table name, but from there I'm kinda lost.
Does anybody know how I can create a new table which is like the original one, but just with the added 4 fields in there? All tips are welcome!
Maybe something like this:
class HistoryModel(Model):
    @classmethod
    def create_table(cls...):
        # Call parent `create_table()`.
        Model.create_table(cls, ...)
        history_fields = {
            'original': ForeignKeyField(cls, related_name='versions'),
            'updated': DateTimeField(default=datetime.utcnow),
            'revision': IntegerField(),
            'action': TextField()}
        model_name = cls.__name__ + 'History'
        HistoryModel = type(model_name, (cls,), history_fields)
        Model.create_table(HistoryModel, ...)
Also note you'll want to do the same thing for create_indexes().
I'd suggest creating a property or some other way to easily generate the HistoryModel.
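The core of that suggestion is the three-argument form of type(), which builds a subclass at runtime. Here is a minimal sketch of just that step in plain Python, without peewee: Field is a hypothetical stand-in for an ORM field class and make_history_class is a made-up helper name.

```python
class Field:
    """Hypothetical stand-in for an ORM field; not peewee's Field class."""

    def __init__(self, kind):
        self.kind = kind


class User:
    name = Field("text")
    address = Field("text")


def make_history_class(cls):
    """Build `<Name>History` as a subclass of `cls` with four extra fields."""
    extra = {
        "original": Field("foreign_key"),
        "updated": Field("datetime"),
        "revision": Field("integer"),
        "action": Field("text"),
    }
    # type(name, bases, namespace) creates the class dynamically.
    return type(cls.__name__ + "History", (cls,), extra)


UserHistory = make_history_class(User)
print(UserHistory.__name__)
print(UserHistory.name.kind, UserHistory.revision.kind)
```

The generated class inherits the original fields and adds the four history fields, while the original class is left unchanged. With a real ORM, the model metaclass would also have to process the new fields, which this sketch does not cover.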

App Engine db.model reference question

How can I get at the Labels data from within my Task model?
class Task(db.Model):
    title = db.StringProperty()
class Label(db.Model):
    name = db.StringProperty()
class Tasklabel(db.Model):
    task = db.ReferenceProperty(Task)
    label = db.ReferenceProperty(Label)
Creating the association is no problem, but how can I get at the labels associated with a task, like this:
task = Task.get('...')
for label in task.labels:
This worked for me with your current datamodel:
taskObject = db.Query(Task).get()
for item in taskObject.tasklabel_set:
    item.label.name
Or you could remove the Label class and just do a one-to-many relationship between Task and TaskLabel:
class Task(db.Model):
    title = db.StringProperty()
class TaskLabel(db.Model):
    task = db.ReferenceProperty(Task)
    label = db.StringProperty()
Then
taskObject = db.Query(Task).get()
for item in taskObject.tasklabel_set:
    item.label
Here is a tip from the Google article on modeling relationships in the datastore
By defining it as a ReferenceProperty, you have created a property that can only be assigned values of type 'Task'. Every time you define a reference property, it creates an implicit collection property on the referenced class. By default, this collection is called <modelname>_set. In this case, it would make a property Task.tasklabel_set.
The article can be found here.
I also recommend playing around with this code in the interactive console on the dev appserver.
Don't you want a ListProperty on Task like this to do a many-to-many?
class Label(db.Model):
    name = db.StringProperty()
    @property
    def members(self):
        return Task.gql("WHERE labels = :1", self.key())
class Task(db.Model):
    title = db.StringProperty()
    labels = db.ListProperty(db.Key)
Then you could do
foo_label = Label.gql("WHERE name = 'foo'").get()
task1 = Task.gql("WHERE title = 'task 1'").get()
if foo_label.key() not in task1.labels:
    task1.labels.append(foo_label.key())
    task1.put()
There's a thorough article about modeling entity relationships on Google code. I stole the code above from this article.
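The key-list approach can be pictured without the datastore. The following plain-Python sketch uses ordinary classes rather than db.Model, with string keys and the helpers add_label and members invented for illustration; it only shows how the list of keys supports lookups in both directions.

```python
# Plain-Python stand-ins; these are not db.Model classes.
class Label:
    def __init__(self, key, name):
        self.key = key
        self.name = name


class Task:
    def __init__(self, title):
        self.title = title
        self.labels = []  # list of label keys, like db.ListProperty(db.Key)


def add_label(task, label):
    """Attach a label once, mirroring the `not in` check above."""
    if label.key not in task.labels:
        task.labels.append(label.key)


def members(label, tasks):
    """Tasks whose key list contains this label (the reverse lookup)."""
    return [t for t in tasks if label.key in t.labels]


foo = Label("label-foo", "foo")
task1, task2 = Task("task 1"), Task("task 2")
add_label(task1, foo)
add_label(task1, foo)  # second call is a no-op
print(task1.labels)
print([t.title for t in members(foo, [task1, task2])])
```

In the datastore version the reverse lookup is a query on the list property instead of a scan over tasks, but the shape of the data is the same.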
