Django Python Serializer Prefetch - python

I am using the Django Python Serializer to serialize a list of models that contain a many-to-many relationship. Even with prefetch_related, serialization queries the database again for the prefetched fields. For example:
class House(models.Model):
    name = models.CharField(...)
    rooms = models.ManyToManyField('Room')

class Room(models.Model):
    name = models.CharField(...)
    num_windows = models.PositiveIntegerField(...)
Using debug mode I can see that the following function makes the expected 2 database requests.
def getHouses():
    return House.objects.all().prefetch_related('rooms')
However, when I attempt to serialize this queryset using django.core.serializers.python.Serializer, it makes an additional query for the rooms in each house. Is there a way to configure the serializer to use the prefetched m2m relationships?

The only way to get rid of this is to build the map yourself. You will need 2 separate serializers, one for House and another for Room.
While serializing, first iterate over the houses queryset and serialize each house, then run the Room serializer on house.rooms. This gives you the serialized rooms, which you can store under another key in the serialized house using house_serialized['rooms_serialized'] = rooms_serialized.
Hope this helps.
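A minimal sketch of the approach above, using plain dataclasses in place of Django models so that it runs without a database. The functions serialize_house and serialize_room are hypothetical stand-ins for the two serializers. With real models, house.rooms would be house.rooms.all(), which reads from the prefetch cache when prefetch_related('rooms') was applied to the queryset.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    name: str
    num_windows: int


@dataclass
class House:
    name: str
    rooms: list = field(default_factory=list)


def serialize_room(room):
    # Hypothetical Room serializer: one flat dict per room.
    return {"name": room.name, "num_windows": room.num_windows}


def serialize_house(house):
    # Hypothetical House serializer: serialize the house first,
    # then attach the serialized rooms under an extra key.
    house_serialized = {"name": house.name}
    rooms_serialized = [serialize_room(room) for room in house.rooms]
    house_serialized["rooms_serialized"] = rooms_serialized
    return house_serialized


# Made-up sample data for the illustration.
houses = [
    House("Villa", [Room("Kitchen", 2), Room("Hall", 1)]),
    House("Cabin", []),
]
serialized_houses = [serialize_house(h) for h in houses]
```

Because the rooms are read from objects already in memory, the nesting step itself issues no further lookups.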

Related

Add a virtual field to a django query

I want to add an extra field to a query set in Django.
The field does not exist in the model but I want to add it to the query set.
Basically I want to add an extra field called "upcoming" which should return "True"
I already tried adding a @property method to my model class. This does not work because apparently Django queries access the DB directly.
models.py
class upcomingActivity(models.Model):
    title = models.CharField(max_length=150)
    address = models.CharField(max_length=150)
views.py
from django.core.serializers import serialize

def get(self, request):
    query = upcomingActivity.objects.all()
    feature_collection = serialize(
        'geojson', query,
        geometry_field='location',
        fields=('upcoming', 'title', 'address', 'pk'),
    )
This answer is for the case that you do not want to add a virtual property to the model (the model remains as is).
To add an additional field to the queryset result objects:
from django.db.models import BooleanField, Value
upcomingActivity.objects.annotate(upcoming=Value(True, output_field=BooleanField())).all()
Each object in the resulting queryset will have the attribute upcoming with the boolean value True.
(Performance should be nice because this is easy work for the DB, and Django/Python does not need to do much additional work.)
EDIT after comment by Juan Carlos:
The Django serializer is for serializing model objects and thus will not serialize any non-model fields by default (because basically, the serializer in this case is for loading/dumping DB data).
See https://docs.djangoproject.com/en/2.2/topics/serialization/
Django’s serialization framework provides a mechanism for “translating” Django models into other formats.
See also: Django - How can you include annotated results in a serialized QuerySet?
From my own experience:
In most cases, we use json.dumps for serialization in views, and this works like a charm. You can prepare the data very flexibly for whatever needs arise, either by annotations or by processing the data (list comprehension etc.) before dumping it via json.
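A rough sketch of that json.dumps route. The rows list is made-up sample data, standing in for the dicts that a queryset's .values('pk', 'title', 'address') call would yield; the virtual field is added in Python with a list comprehension before dumping.

```python
import json

# Made-up rows, standing in for the dicts a queryset's .values() call yields.
rows = [
    {"pk": 1, "title": "Hike", "address": "1 Trail Rd"},
    {"pk": 2, "title": "Concert", "address": "5 Main St"},
]

# Add the virtual "upcoming" field in Python before dumping.
payload = [{**row, "upcoming": True} for row in rows]

body = json.dumps(payload)
```

The original rows are left untouched, so the same data can be reused for other responses.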
Another possibility in your situation could be to customize the serializer or the input to the serializer after fetching the data from the DB.
You can add a method to the model class that returns the upcoming value, like this:
def upcoming(self):
    is_upcoming = True  # replace with your own logic or query
    return is_upcoming
Then reference it in your serializer call the way you already did:
fields=('upcoming', 'title', 'address', 'pk')

How to create/update in DRF many-to-many fields?

I have four models:
QuestStatus
AdventureStatus
QuestAdventureStatus (consists of two things, foreign key fields to QuestStatus and AdventureStatus)
QuestAdventure (has the M2M relationship to QuestAdventureStatus)
I have a serializer for QuestAdventure and QuestAdventureStatus exists as a field on my serializer:
quest_adventure_status = serializers.ListField(source='quest_adventure_status.all', required=False)
How do I properly create a new QuestAdventure and create its quest_adventure_status entries as well (and update them too)? For creating, quest_adventure_status is mandatory, but when I pass in my instance the data is already serialized rather than a set of model objects.
Is there a proper way to deal with this in DRF?
I'd suggest taking a look at drf-writable-nested: https://github.com/beda-software/drf-writable-nested

Django setting many_to_many object while doing a bulk_create

I am using Django 1.9 and am trying to use bulk_create to create many new model objects and associate them with a common related many_to_many object.
My models are as follows
# Computational Job object
class OT_job(models.Model):
    is_complete = models.BooleanField()
    is_submitted = models.BooleanField()
    user_email = models.EmailField()

# Many sequences
class Seq(models.Model):
    sequence = models.CharField(max_length=100)
    ot_job = models.ManyToManyField(OT_job)
I have thousands of Seq objects that are submitted and have to be associated with their job. Previously I was using an iterator and saving them in a for loop, but after reading up I realized that Django 1.9 has bulk_create.
Currently I am doing
DNASeqs_list = []
for a_seq in some_iterable_with_my_data:
    # I create new model instances and add them to the list
    DNASeqs_list.append(Seq(sequence=..., ))
I now want to bulk_create these sequences and associate them with the current_job_object.
created_dnaseqs = Seq.objects.bulk_create(DNASeqs_list)
# How do I streamline this part below
for a_seq in created_dnaseqs:
    # Had to call save here otherwise got an error
    a_seq.save()
    a_seq.ot_job.add(curr_job_obj)
I had to call "a_seq.save()" in for loop because I got an error in the part where I was doing "a_seq.ot_job.add(curr_job_obj)" which said
....needs to have a value for field "seq" before this many-to-many relationship can be used.
Despite reading the other questions on this topic, I am still confused because, unlike others, I do not have a custom "through" model. I am unsure how best to associate the OT_job with many Seqs with minimal hits to the database.
From the docs https://docs.djangoproject.com/en/1.9/ref/models/querysets/#bulk-create:
If the model’s primary key is an AutoField it does not retrieve and set the primary key attribute, as save() does.
It does not work with many-to-many relationships.
bulk_create literally will just create the objects, it does not retrieve the PK into the variable as save does. You would have to re-query the db to get your newly created objects, and then create the M2M relationships, but it sounds like that would not be appropriate and that your current method is currently the best solution.
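The "insert in bulk, re-query, then create the M2M rows" idea mentioned above can be sketched at the database level. This uses the standard-library sqlite3 module rather than the Django ORM, and the table names (seq, ot_job, seq_ot_job) are assumptions made up for the example. It also assumes the sequence values are enough to find the new rows again, which may not hold for real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ot_job (id INTEGER PRIMARY KEY, user_email TEXT);
    CREATE TABLE seq (id INTEGER PRIMARY KEY, sequence TEXT);
    CREATE TABLE seq_ot_job (seq_id INTEGER, ot_job_id INTEGER);
    """
)

# The single job that every new sequence should be linked to.
job_id = conn.execute(
    "INSERT INTO ot_job (user_email) VALUES (?)", ("user@example.com",)
).lastrowid

sequences = ["ACGT", "TTGA", "CCAT"]

# Step 1: bulk insert; no primary keys come back to Python.
conn.executemany("INSERT INTO seq (sequence) VALUES (?)", [(s,) for s in sequences])

# Step 2: re-query to learn the primary keys of the new rows.
placeholders = ",".join("?" for _ in sequences)
seq_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM seq WHERE sequence IN ({placeholders})", sequences
    )
]

# Step 3: bulk insert the join rows that link every sequence to the job.
conn.executemany(
    "INSERT INTO seq_ot_job (seq_id, ot_job_id) VALUES (?, ?)",
    [(seq_id, job_id) for seq_id in seq_ids],
)

link_count = conn.execute("SELECT COUNT(*) FROM seq_ot_job").fetchone()[0]
```

This costs three statements regardless of how many sequences there are, at the price of the extra SELECT and the assumption noted above.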

Django serialize multiple objects in one call

I was wondering how I can decrease the number of calls to my database when serializing:
I have the following 2 models:
class House(models.Model):
    name = models.CharField(max_length=100, null=True, blank=True)
    address = models.CharField(max_length=500, null=True, blank=True)

class Room(models.Model):
    house = models.ForeignKey(House)
    name = models.CharField(max_length=100)
A house can have multiple rooms.
I am using django-rest-framework and trying to serialize everything together at the house level.
class HouseSerializer(serializers.ModelSerializer):
    rooms = serializers.SerializerMethodField('room_serializer')

    def room_serializer(self, obj):
        # obj is the House being serialized; this runs one query per house
        rooms = Room.objects.filter(house_id=obj.id)
        return RoomSerializer(rooms, many=True).data

    class Meta:
        model = House
        fields = ('id', 'name', 'address', 'rooms')
So now, for every house I want to serialize, I need to make a separate call for its Rooms. It works, but that's an extra call.
(imagine me trying to package a lot of stuff together!)
Now, if I had 100 houses, serializing everything would take 100 extra database hits, i.e. O(n) queries.
I know I can decrease this to 2 hits, a constant number of queries, if I can fetch all the information together:
my_houses = House.objects.filter(name="mine")
my_rooms = Room.objects.filter(house_id__in=[house.id for house in my_houses])
My question is: how can I do this and keep the serializers happy?
Can I somehow loop after making my two calls to "attach" the Rooms to each House, then serialize it? (Am I allowed to add an attribute like that?) If I can, how do I get my serializer to read it?
Please note that I do not need the serializer to let me change the attributes of the Rooms this way. This is for GET only.
As it is currently written, using a SerializerMethodField, you are making N+1 queries. I have covered optimizing database queries a few times on Stack Overflow, and in general it's similar to how you would improve performance in Django. You are dealing with a one-to-many relationship, which can be optimized the same way as many-to-many relationships: with prefetch_related.
class HouseSerializer(serializers.ModelSerializer):
    rooms = RoomSerializer(read_only=True, source="room_set", many=True)

    class Meta:
        model = House
        fields = ('id', 'name', 'address', 'rooms')
The change I made uses a nested serializer instead of manually building one inside a SerializerMethodField. I have restricted it to read_only, since you mentioned you only need it for GET requests and writable nested serializers have problems in Django REST Framework 2.4.
As your reverse relationship for the Room -> House relationship has not been set, it is the default room_set. You can (and should) override this by setting the related_name on the ForeignKey field, and you would need to adjust the source accordingly.
In order to prevent the N+1 query issue, you will need to override the queryset on your view. In the case of a generic view, this would be done on the queryset attribute or within the get_queryset method, like queryset = House.objects.prefetch_related('room_set'). This will request all of the related rooms alongside the request for the House objects, so instead of N+1 requests you will only have two.
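To make the query counts concrete, here is a toy model of the two strategies in plain Python. FakeDB and its methods are invented for this illustration and only count calls; they are not part of Django or Django REST Framework. The second function mimics what prefetching does conceptually: fetch all related rows in one extra query and join them in Python.

```python
from collections import defaultdict


class FakeDB:
    """Invented in-memory stand-in for a database that counts queries."""

    def __init__(self, houses, rooms):
        self.houses = houses
        self.rooms = rooms
        self.queries = 0

    def all_houses(self):
        self.queries += 1
        return list(self.houses)

    def rooms_for(self, house_ids):
        self.queries += 1
        wanted = set(house_ids)
        return [r for r in self.rooms if r["house_id"] in wanted]


def serialize_n_plus_one(db):
    # One query for the houses, then one more per house.
    return [
        {**house, "rooms": db.rooms_for([house["id"]])}
        for house in db.all_houses()
    ]


def serialize_prefetched(db):
    # One query for the houses, one for all their rooms, joined in Python.
    houses = db.all_houses()
    rooms_by_house = defaultdict(list)
    for room in db.rooms_for([h["id"] for h in houses]):
        rooms_by_house[room["house_id"]].append(room)
    return [{**h, "rooms": rooms_by_house[h["id"]]} for h in houses]


def make_db():
    # Made-up data: 100 houses with one room each.
    houses = [{"id": i, "name": f"House {i}"} for i in range(100)]
    rooms = [{"house_id": i, "name": "Kitchen"} for i in range(100)]
    return FakeDB(houses, rooms)


slow_db, fast_db = make_db(), make_db()
slow_result = serialize_n_plus_one(slow_db)
fast_result = serialize_prefetched(fast_db)
```

Both functions produce the same nested output; only the number of counted queries differs (slow_db.queries versus fast_db.queries).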

Django object extension / one to one relationship issues

Howdy. I'm working on migrating an internal system to Django and have run into a few wrinkles.
Intro
Our current system (a billing system) tracks double-entry bookkeeping while allowing users to enter data as invoices, expenses, etc.
Base Objects
So I have two base objects/models:
JournalEntry
JournalEntryItems
defined as follows:
class JournalEntry(models.Model):
    gjID = models.AutoField(primary_key=True)
    date = models.DateTimeField('entry date')
    memo = models.CharField(max_length=100)

class JournalEntryItem(models.Model):
    journalEntryID = models.AutoField(primary_key=True)
    gjID = models.ForeignKey(JournalEntry, db_column='gjID')
    amount = models.DecimalField(max_digits=10, decimal_places=2)
So far, so good. It works quite smoothly on the admin side (inlines work, etc.)
On to the next section.
We then have two more models
InvoiceEntry
InvoiceEntryItem
An InvoiceEntry is a superset of / it inherits from JournalEntry, so I've been using a OneToOneField (which is what we're using in the background on our current site). That works quite smoothly too.
class InvoiceEntry(JournalEntry):
    invoiceID = models.AutoField(primary_key=True, db_column='invoiceID', verbose_name='')
    journalEntry = models.OneToOneField(JournalEntry, parent_link=True, db_column='gjID')
    client = models.ForeignKey(Client, db_column='clientID')
    datePaid = models.DateTimeField(null=True, db_column='datePaid', blank=True, verbose_name='date paid')
Where I run into problems is when trying to add an InvoiceEntryItem (which inherits from JournalEntryItem) to an inline related to InvoiceEntry. I'm getting the error:
<class 'billing.models.InvoiceEntryItem'> has more than 1 ForeignKey to <class 'billing.models.InvoiceEntry'>
The way I see it, InvoiceEntryItem has a ForeignKey directly to InvoiceEntry. And it also has an indirect ForeignKey to InvoiceEntry through the JournalEntry 1->M JournalEntryItems relationship.
Here's the code I'm using at the moment.
class InvoiceEntryItem(JournalEntryItem):
    invoiceEntryID = models.AutoField(primary_key=True, db_column='invoiceEntryID', verbose_name='')
    invoiceEntry = models.ForeignKey(InvoiceEntry, related_name='invoiceEntries', db_column='invoiceID')
    journalEntryItem = models.OneToOneField(JournalEntryItem, db_column='journalEntryID')
I've tried removing the journalEntryItem OneToOneField. Doing that then removes my ability to retrieve the dollar amount for this particular InvoiceEntryItem (which is only stored in journalEntryItem).
I've also tried removing the invoiceEntry ForeignKey relationship. Doing that removes the relationship that allows me to see the InvoiceEntry 1->M InvoiceEntryItems in the admin inline. All I see are blank fields (instead of the actual data that is currently stored in the DB).
It seems like option 2 is closer to what I want to do. But my inexperience with Django seems to be limiting me. I might be able to filter the larger pool of journal entries to see just invoice entries. But it would be really handy to think of these solely as invoices (instead of a subset of journal entries).
Any thoughts on how to do what I'm after?
First, inheriting from a model automatically creates a OneToOneField from the child model to its parent, so you don't need to add one yourself. Remove the explicit fields if you really want to use this form of model inheritance.
If you only want to share the fields of the model, you can use an abstract base class (abstract = True in the parent's Meta), which creates the inherited columns in the table of each child model. This would split your JournalEntry data across 2 tables, but it would make it easy to retrieve only the invoices.
All fields in the superclass also exist on the subclass, so having an explicit relation is unnecessary.
Model inheritance in Django is terrible. Don't use it. Python doesn't need it anyway.
