Django - improve the query consisting many-to-many and foreignKey fields - python

I want to export a report from the available data into a CSV file. I wrote the following code and it works fine. What do you suggest to improve the query?
Models:
class shareholder(models.Model):
    title = models.CharField(max_length=100)
    code = models.IntegerField(null=False)
class Company(models.Model):
    isin = models.CharField(max_length=20, null=False)
    cisin = models.CharField(max_length=20)
    name_fa = models.CharField(max_length=100)
    name_en = models.CharField(max_length=100)
class company_shareholder(models.Model):
    company = models.ManyToManyField(Company)
    shareholder = models.ForeignKey(shareholder, on_delete=models.SET_NULL, null=True)
    share = models.IntegerField(null=True)  # TODO: *1000000
    percentage = models.DecimalField(max_digits=8, decimal_places=2, null=True)
    difference = models.DecimalField(max_digits=11, decimal_places=2, null=True)
    update_datetime = models.DateTimeField(null=True)
View:
def ExportAllShare(request):
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="shares.csv"'
    # Write a UTF-8 byte order mark first
    response.write(u'\ufeff'.encode('utf8'))
    writer = csv.writer(response)
    writer.writerow(['date','company','shareholder title','shareholder code','difference','share'])
    results = company_shareholder.objects.all()
    for result in results:
        row = (
            result.update_datetime,
            result.company.first().name_fa,
            result.shareholder.title,
            result.shareholder.code,
            result.difference,
            result.share,
        )
        writer.writerow(row)
    return response

First of all, if it works fine for you, then it works fine; don't optimize prematurely.
That said, a query like this runs into the N+1 problem: besides the initial query, every row in the loop triggers extra queries for result.shareholder and result.company. In Django you avoid this with select_related (for the foreign key) and prefetch_related (for the many-to-many field), like this:
results = company_shareholder.objects.select_related('shareholder').prefetch_related('company').all()
This should reduce the number of queries you generate. If you need a little more performance, and since you are not using percentage, I would defer it with .defer('percentage').
Also, I would strongly suggest following the PEP 8 style guide and naming your classes in the CapWords convention, like Shareholder and CompanyShareholder.
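To make the N+1 pattern concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The trimmed-down tables, the sample rows, and the CountingConnection helper are all assumptions made for illustration. The point is only that the per-row lookups cost one query each, while a JOIN (roughly what select_related asks the database for) needs a single query.

```python
import sqlite3

# Hypothetical, trimmed-down schema: only the columns this example needs.
SCHEMA = """
CREATE TABLE shareholder (id INTEGER PRIMARY KEY, title TEXT, code INTEGER);
CREATE TABLE company_shareholder (
    id INTEGER PRIMARY KEY,
    shareholder_id INTEGER REFERENCES shareholder(id),
    share INTEGER
);
INSERT INTO shareholder VALUES (1, 'Alpha Fund', 101), (2, 'Beta Fund', 102);
INSERT INTO company_shareholder VALUES (1, 1, 500), (2, 2, 300), (3, 1, 200);
"""


class CountingConnection:
    """Wraps an in-memory sqlite3 connection and counts executed statements."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(SCHEMA)
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def rows_n_plus_one(db):
    """One query for the rows, then one more per row for its shareholder."""
    rows = []
    for share, shareholder_id in db.execute(
        "SELECT share, shareholder_id FROM company_shareholder ORDER BY id"
    ).fetchall():
        title, code = db.execute(
            "SELECT title, code FROM shareholder WHERE id = ?", (shareholder_id,)
        ).fetchone()
        rows.append((title, code, share))
    return rows


def rows_joined(db):
    """A single JOIN fetches the same data in one statement."""
    return db.execute(
        "SELECT s.title, s.code, cs.share "
        "FROM company_shareholder cs "
        "JOIN shareholder s ON s.id = cs.shareholder_id "
        "ORDER BY cs.id"
    ).fetchall()


slow_db, fast_db = CountingConnection(), CountingConnection()
slow_rows = rows_n_plus_one(slow_db)
fast_rows = rows_joined(fast_db)
print(slow_db.queries, fast_db.queries)
```

Both functions return the same rows; only the number of statements differs, and for the loop version it grows with the number of rows exported.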

Related

Trying to know how many rows I have from the different fleets in django

I'm working with a project that has 3 models.
Devices: All the devices involved in the project.
Fleets: Each device is in one fleet.
Data: The data that the devices send.
Now I want to know how much data has come in today.
In my view I have this,
def dataPlots(request):
    today = datetime.date(2020, 5, 18)
    data = DevData.objects.filter(data_timestamp__date=today)
    data = loads(serializers.serialize('json', data))
If I use len() I can get the total. But how can I find out how much data I have for each fleet?
I came up with this loop, which gives me the count for each device, but what I need is the count for each fleet, and I think I am overcomplicating things.
data_dict = {}
for d in data:
    if d['fields']['dev_eui'] in data_dict:
        data_dict[d['fields']['dev_eui']] = data_dict[d['fields']['dev_eui']] + 1
    else:
        data_dict[d['fields']['dev_eui']] = 1
print(data_dict)
The models are:
class Fleet(models.Model):
    fleet_id = models.IntegerField(primary_key=True, unique=True)
    fleet_name = models.CharField(max_length=20, unique=True)
class Device(models.Model):
    dev_eui = models.CharField(max_length=16, primary_key=True, unique=True)
    dev_name = models.CharField(max_length=20, unique=True)
    fleet_id = models.ForeignKey(Fleet, on_delete=models.CASCADE)
    def __str__(self):
        return self.dev_eui
class DevData(models.Model):
    data_uuid = models.UUIDField(primary_key=True, default=uuid.uuid1, editable=False)
    data_timestamp = models.DateTimeField()
    data = models.FloatField()
    dev_eui = models.ForeignKey(Device, on_delete=models.CASCADE)
    def __str__(self):
        return self.dev_eui
Can somebody help me? I imagine that combining two models and some filter should suffice, but I don't know how to do it.
You can annotate the Fleets with the given amount of data, for example:
from django.db.models import Count
Fleet.objects.filter(
    device__devdata__data_timestamp__date=today
).annotate(
    total_data=Count('device__devdata')
)
The Fleet objects that arise from this queryset will have an extra attribute .total_data that contains the total amount of data for today.
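For readers who think in SQL, the annotate call corresponds roughly to a JOIN across the three tables with a GROUP BY on the fleet. The sketch below reproduces that shape with the standard sqlite3 module. The table layout is simplified and the sample rows are made up, so treat it as an illustration of the grouping, not as the exact SQL Django emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fleet (fleet_id INTEGER PRIMARY KEY, fleet_name TEXT);
CREATE TABLE device (dev_eui TEXT PRIMARY KEY, fleet_id INTEGER);
CREATE TABLE devdata (data_uuid TEXT PRIMARY KEY, data_timestamp TEXT, dev_eui TEXT);
-- Made-up sample rows: two fleets, three devices, four readings.
INSERT INTO fleet VALUES (1, 'north'), (2, 'south');
INSERT INTO device VALUES ('A1', 1), ('A2', 1), ('B1', 2);
INSERT INTO devdata VALUES
    ('u1', '2020-05-18 08:00:00', 'A1'),
    ('u2', '2020-05-18 09:30:00', 'A2'),
    ('u3', '2020-05-18 10:00:00', 'B1'),
    ('u4', '2020-05-17 23:59:00', 'B1');
""")


def data_per_fleet(conn, day):
    """Return {fleet_name: number of data rows received on `day`}."""
    sql = """
        SELECT f.fleet_name, COUNT(d.data_uuid) AS total_data
        FROM fleet f
        JOIN device dv ON dv.fleet_id = f.fleet_id
        JOIN devdata d ON d.dev_eui = dv.dev_eui
        WHERE date(d.data_timestamp) = ?
        GROUP BY f.fleet_id, f.fleet_name
        ORDER BY f.fleet_name
    """
    return dict(conn.execute(sql, (day,)).fetchall())


totals = data_per_fleet(conn, "2020-05-18")
print(totals)
```

Because the date filter sits on the joined rows, a fleet that received no data on that day does not appear in the result at all; the same holds for the queryset above.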

Simplifying this django query - Have django obtain instances based on values

So I currently have this Django query. The first two statements are needed in order to run the third. My question is whether there is a way to use only the third statement, without the first two.
#patient_name and quest are two strings
patientobj = modelPatient.objects.get(patient_name=patient_name)
questobj = modelInterviewQuestion.objects.get(question=quest)
answer = modelInterviewAnswer.objects.get(patients=patientobj, questions=questobj)
I know I could do something like this
answer = modelInterviewAnswer.objects.get(patients= modelPatient.objects.get(patient_name=patient_name), questions= modelInterviewQuestion.objects.get(question=quest))
but I was wondering if there is anything simpler ?
Here are the relationships between the models
class modelPatient(models.Model):
    patient_name = models.CharField(max_length=128, unique=False)
    patient_sex = models.CharField(max_length=128, unique=False)
    patient_image = models.ImageField(upload_to='images/',
class modelInterviewQuestion(models.Model):
    question = models.CharField(max_length=1000, unique=True)
class modelInterviewAnswer(models.Model):
    patients = models.ForeignKey(modelPatient)
    questions = models.ForeignKey(modelInterviewQuestion)
    patient_response = models.CharField(max_length=1000, unique=True)
Try this.
answer = modelInterviewAnswer.objects.get(patients__patient_name=patient_name, questions__question=quest)
Please go through the documentation on lookups that span relationships to learn how to write such queries.
I also want to draw your attention to naming conventions.
Don't prefix the model name with model; for example, modelPatient should be just Patient.
There is no need to write patient_<field_name> in the model. It should be just <field_name>.
For example, your Patient model should look like
class Patient(models.Model):
    name = models.CharField(max_length=128, unique=False)
    sex = models.CharField(max_length=128, unique=False)
    image = models.ImageField(upload_to='images/')
Follow the same rules for the other models too.
class InterviewQuestion(models.Model):
    question = models.CharField(max_length=1000, unique=True)
class InterviewAnswer(models.Model):
    patients = models.ForeignKey(Patient)
    interview_questions = models.ForeignKey(InterviewQuestion)
    patient_response = models.CharField(max_length=1000, unique=True)
So your query will be:
answer = InterviewAnswer.objects.get(patients__name=patient_name, interview_questions__question=quest)

Reference multiple foreign keys in Django Model

I'm making a program that helps log missions in a game. In each of these missions I would like to be able to select a number of astronauts that will go along with it out of the astronauts table. This is fine when I only need one, but how could I approach multiple foreign keys in a field?
I currently use a 'binary' string that specifies which astronauts are to be associated with the mission (1 refers to Jeb, but not Bill, Bob, or Val and 0001 means only Val), with the first digit specifying the astronaut with id 1 and so forth. This works, but it feels quite clunky.
Here's the model.py for the two tables in question.
class astronauts(models.Model):
    name = models.CharField(max_length=200)
    adddate = models.IntegerField(default=0)
    experience = models.IntegerField(default=0)
    career = models.CharField(max_length=9, blank=True, null=True)
    alive = models.BooleanField(default=True)
    def __str__(self):
        return self.name
    class Meta:
        verbose_name_plural = "Kerbals"
class missions(models.Model):
    # mission details
    programid = models.ForeignKey(programs, on_delete=models.SET("Unknown"))
    missionid = models.IntegerField(default=0)
    status = models.ForeignKey(
        missionstatuses, on_delete=models.SET("Unknown"))
    plan = models.CharField(max_length=1000)
    # launch
    launchdate = models.IntegerField(default=0)
    crewmembers = models.IntegerField(default=0)
    # recovery
    summary = models.CharField(max_length=1000, blank=True)
    recdate = models.IntegerField(default=0)
    def __str__(self):
        return str(self.programid) + '-' + str(self.missionid)
    class Meta:
        verbose_name_plural = "Missions"
I saw a post about an 'intermediate linking table' to store the crew list but that also isn't ideal.
Thanks!
This is the use case for Django's ManyToManyField. Change the appropriate field on the missions model:
class missions(models.Model):
    crewmembers = models.ManyToManyField('astronauts')
You can access this from the astronauts model side like so:
jeb = astronauts.objects.get(name='Jebediah Kerman')
crewed_missions = jeb.missions_set.all()
Or from the mission side like so:
mission = missions.objects.order_by('?')[0]
crew = mission.crewmembers.all()
Note that this creates another table in the database (the intermediate linking table), in case that is somehow a problem for you.
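To show what that extra table amounts to, here is a small sketch with the standard sqlite3 module, outside Django. The table and column names (missions_crewmembers, missions_id, astronauts_id) are assumptions for illustration; Django derives the real names from the app, model and field. The decode_crew_string helper is also hypothetical and only mirrors the 'binary' string scheme described in the question.

```python
import sqlite3


def decode_crew_string(bits):
    """Old scheme: a '1' at position i (1-based) means astronaut id i is aboard."""
    return [i for i, bit in enumerate(bits, start=1) if bit == "1"]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE astronauts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE missions (id INTEGER PRIMARY KEY, plan TEXT);
-- The linking table: one row per (mission, astronaut) pair.
CREATE TABLE missions_crewmembers (
    id INTEGER PRIMARY KEY,
    missions_id INTEGER REFERENCES missions(id),
    astronauts_id INTEGER REFERENCES astronauts(id),
    UNIQUE (missions_id, astronauts_id)
);
INSERT INTO astronauts VALUES (1, 'Jeb'), (2, 'Bill'), (3, 'Bob'), (4, 'Val');
INSERT INTO missions VALUES (1, 'Mun flyby');
""")


def assign_crew(conn, mission_id, astronaut_ids):
    """Store one link row per crew member."""
    conn.executemany(
        "INSERT INTO missions_crewmembers (missions_id, astronauts_id) VALUES (?, ?)",
        [(mission_id, astronaut_id) for astronaut_id in astronaut_ids],
    )


def crew_names(conn, mission_id):
    """Look up the crew of a mission through the linking table."""
    rows = conn.execute(
        "SELECT a.name FROM astronauts a "
        "JOIN missions_crewmembers mc ON mc.astronauts_id = a.id "
        "WHERE mc.missions_id = ? ORDER BY a.id",
        (mission_id,),
    ).fetchall()
    return [name for (name,) in rows]


# Migrate an old-style crew string into link rows, then read the crew back.
assign_crew(conn, 1, decode_crew_string("1001"))
crew = crew_names(conn, 1)
print(crew)
```

With a ManyToManyField, Django maintains a table of this kind for you, so the crew is reached through mission.crewmembers rather than by hand-written joins.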

Django filter only on aggregate/annotate

I'm trying to construct a fairly complicated Django query and I'm not making much progress. I was hoping some wizard here could help me out?
I have the following models:
class Person(models.Model):
    MALE = "M"
    FEMALE = "F"
    OTHER = "O"
    UNKNOWN = "U"
    GENDER_CHOICES = (
        (MALE, "Male"),
        (FEMALE, "Female"),
        (UNKNOWN, "Other"),
    )
    firstName = models.CharField(max_length=200, null=True, db_column="firstname")
    lastName = models.CharField(max_length=200, null=True, db_column="lastname")
    gender = models.CharField(max_length=1, choices=GENDER_CHOICES, default=UNKNOWN, null=True)
    dateOfBirth = models.DateField(null=True, db_column="dateofbirth")
    dateInService = models.DateField(null=True, db_column="dateinservice")
    photo = models.ImageField(upload_to='person_photos', null=True)
class SuccessionTerm(models.Model):
    originalName = models.CharField(max_length=200, null=True, db_column="originalname")
    description = models.CharField(max_length=200, blank=True, null=True)
    score = models.IntegerField()
class Succession(models.Model):
    position = models.ForeignKey(Position, to_field='positionId', db_column="position_id")
    employee = models.ForeignKey(Employee, to_field='employeeId', db_column="employee_id")
    term = models.ForeignKey(SuccessionTerm)
class Position(models.Model):
    positionId = models.CharField(max_length=200, unique=True, db_column="positionid")
    title = models.CharField(max_length=200, null=True)
    # There cannot be a DB constraint, as that would make it impossible to add the first position.
    dottedLine = models.ForeignKey("Position", to_field='positionId', related_name="Dotted Line",
                                   null=True, db_constraint=False, db_column="dottedline_id")
    solidLine = models.ForeignKey("Position", to_field='positionId', related_name="SolidLine",
                                  null=True, db_constraint=False, db_column="solidline_id")
    grade = models.ForeignKey(Grade)
    businessUnit = models.ForeignKey(BusinessUnit, null=True, db_column="businessunit_id")
    functionalArea = models.ForeignKey(FunctionalArea, db_column="functionalarea_id")
    location = models.ForeignKey(Location, db_column="location_id")
class Employee(models.Model):
    person = models.OneToOneField(Person, db_column="person_id")
    fte = models.IntegerField(default=100)
    dataSource = models.ForeignKey(DataSource, db_column="datasource_id")
    talentStatus = models.ForeignKey(TalentStatus, db_column="talentstatus_id")
    retentionRisk = models.ForeignKey(RetentionRisk, db_column="retentionrisk_id")
    retentionRiskReason = models.ForeignKey(RetentionRiskReason, db_column="retentionriskreason_id")
    performanceStatus = models.ForeignKey(PerformanceStatus, db_column="performancestatus_id")
    potential = models.ForeignKey(Potential, db_column="potential_id")
    mobility = models.ForeignKey(Mobility, db_column="mobility_id")
    currency = models.ForeignKey(Currency, null=True, db_column="currency_id")
    grade = models.ForeignKey(Grade, db_column="grade_id")
    position = models.OneToOneField(Position, to_field='positionId', null=True,
                                    blank=True, db_column="position_id")
    employeeId = models.CharField(max_length=200, unique=True, db_column="employeeid")
    dateInPosition = models.DateField(null=True, db_column="dateinposition")
Now, what I want for each employee is the position title, the person's name, and, for each succession term (of which there are three), how many times the position of that employee is in the succession table, plus the number of times each of these employees occurs in the successors table. Above all, I want to do all of this in a single query (or more specifically, a single Django ORM statement), as I'm doing this in a paginated way, but I want to be able to order the result on any of these columns!
So far, I have this:
emps = (
    Employee.objects.all()
    .annotate(ls_st=Count('succession__term'))
    .filter(succession__term__description="ShortTerm")
    .order_by('ls_st')
    .prefetch_related('person', 'position')[lower_limit:upper_limit]
)
This is only one of the succession terms, and I would like to extend it to all terms by adding more annotate calls.
My problem is that the filter call works on the entire query. I would like to only filter on the Count call.
I've tried doing something like Count(succession__term__description="ShortTerm") but that doesn't work. Is there any other way to do this?
Thank you very much in advance,
Regards,
Linus
So what you want is a count of each different type of succession__term? That is pretty complex, and I don't think you can do this with the built in django orm right now. (unless you did a .extra() query)
In django 1.8, I believe you will be able to do it with the new Query Expressions (https://docs.djangoproject.com/en/dev/releases/1.8/#query-expressions). But of course 1.8 isn't released yet, so that doesn't help you.
In the meantime, you can use the very handy django-aggregate-if package. (https://github.com/henriquebastos/django-aggregate-if/, https://pypi.python.org/pypi/django-aggregate-if)
With django-aggregate-if, your query might look like this:
emps = (
    Employee.objects.annotate(
        ls_st=Count('succession__term', only=Q(succession__term__description="ShortTerm")),
        ls_lt=Count('succession__term', only=Q(succession__term__description="LongTerm")),  # whatever your other term descriptions are.
        ls_ot=Count('succession__term', only=Q(succession__term__description="OtherTerm")),
    )
    .order_by('ls_st')
    .prefetch_related('person', 'position')[lower_limit:upper_limit]
)
Disclaimer: I have never used django-aggregate-if, so I'm not entirely sure this will work, but according to the README, it seems like it should.
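The underlying idea is conditional aggregation: the condition goes inside each COUNT instead of into a WHERE clause, so rows that fail one condition still count towards the others. The sketch below shows that in plain SQL through the standard sqlite3 module, with a heavily simplified, made-up schema and sample data. Newer Django releases (2.0 and later) accept a filter argument on aggregates, for example Count('succession__term', filter=Q(...)), which serves the same purpose as only does here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for Employee, SuccessionTerm and Succession.
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE successionterm (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE succession (id INTEGER PRIMARY KEY, employee_id INTEGER, term_id INTEGER);
INSERT INTO employee VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
INSERT INTO successionterm VALUES (1, 'ShortTerm'), (2, 'LongTerm');
INSERT INTO succession VALUES (1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 2, 2);
""")

# Each COUNT only sees the rows for which its CASE yields a non-NULL value,
# so the per-term conditions do not filter the employees themselves.
SQL = """
    SELECT e.name,
           COUNT(CASE WHEN t.description = 'ShortTerm' THEN 1 END) AS ls_st,
           COUNT(CASE WHEN t.description = 'LongTerm' THEN 1 END) AS ls_lt
    FROM employee e
    LEFT JOIN succession s ON s.employee_id = e.id
    LEFT JOIN successionterm t ON t.id = s.term_id
    GROUP BY e.id, e.name
    ORDER BY ls_st DESC, e.name
"""

counts = conn.execute(SQL).fetchall()
for name, short_term, long_term in counts:
    print(name, short_term, long_term)
```

Employees without any ShortTerm succession stay in the result with a count of zero, which is what the original .filter() call prevented, and the result can still be ordered by any of the counted columns.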

django - inner join queryset not working

The SQL I want to accomplish is this -
SELECT jobmst_id, jobmst_name, jobdtl_cmd, jobdtl_params FROM jobmst
INNER JOIN jobdtl ON jobmst.jobdtl_id = jobdtl.jobdtl_id
WHERE jobmst_id = 3296
I've only had success once with an inner join in Django, off of an annotate and order_by, but I can't seem to get it to work using either prefetch_related() or select_related().
My models are as so -
class Jobdtl(models.Model):
    jobdtl_id = models.IntegerField(primary_key=True)
    jobdtl_cmd = models.TextField(blank=True)
    jobdtl_fromdt = models.DateTimeField(blank=True, null=True)
    jobdtl_untildt = models.DateTimeField(blank=True, null=True)
    jobdtl_fromtm = models.DateTimeField(blank=True, null=True)
    jobdtl_untiltm = models.DateTimeField(blank=True, null=True)
    jobdtl_priority = models.SmallIntegerField(blank=True, null=True)
    jobdtl_params = models.TextField(blank=True)  # This field type is a guess.
    class Meta:
        managed = False
        db_table = 'jobdtl'
class Jobmst(MPTTModel):
    jobmst_id = models.IntegerField(primary_key=True)
    jobmst_type = models.SmallIntegerField()
    jobmst_prntid = TreeForeignKey('self', null=True, blank=True, related_name='children', db_column='jobmst_prntid')
    jobmst_name = models.TextField(db_column='jobmst_name', blank=True)
    # jobmst_owner = models.IntegerField(blank=True, null=True)
    jobmst_owner = models.ForeignKey('Owner', db_column='jobmst_owner', related_name = 'Jobmst_Jobmst_owner', blank=True, null=True)
    jobmst_crttm = models.DateTimeField()
    jobdtl_id = models.ForeignKey('Jobdtl', db_column='jobdtl_id', blank=True, null=True)
    jobmst_prntname = models.TextField(blank=True)
    class MPTTMeta:
        order_insertion_by = ['jobmst_id']
    class Meta:
        managed = True
        db_table = 'jobmst'
I have a really simple view like so -
# Test Query with Join
def test_queryjoin(request):
    queryset = Jobmst.objects.filter(jobmst_id=3296).order_by('jobdtl_id')
    queryresults = serializers.serialize("python", queryset, fields=('jobmst_prntid', 'jobmst_id', 'jobmst_prntname', 'jobmst_name', 'jobmst_owner', 'jobdtl_cmd', 'jobdtl_params'))
    t = get_template('test_queryjoin.html')
    html = t.render(Context({'query_output': queryresults}))
    return HttpResponse(html)
I've tried doing a bunch of things -
queryset = Jobmst.objects.all().prefetch_related()
queryset = Jobmst.objects.all().select_related()
queryset = jobmst.objects.filter(jobmst_id=3296).order_by('jobdtl_id')
a few others as well I forget.
Each time the json I'm getting is only from the jobmst table with no mention of the jobdtl results which I want. If I go the other way and do Jobdtl.objects.xxxxxxxxx same thing it's not giving me the results from the other model.
To recap I want to display fields from both tables where a certain clause is met.
What gives?
Seems that I was constantly looking in the wrong place. Coming from SQL I kept thinking in terms of inner joining tables which is not how this works. I'm joining the results from models.
Hence, rethinking my search I came across itertools and the chain function.
I now have 2 queries under a def in my views.py
from itertools import chain
jobmstquery = Jobmst.objects.filter(jobmst_id=3296)
jobdtlquery = Jobdtl.objects.filter(jobdtl_id=3296)
queryset = chain(jobmstquery, jobdtlquery)
queryresults = serializers.serialize("python", queryset)
That shows me the results from each table "joined" like I would want in SQL. Now I can focus on filtering down the results to give me what I want.
Remember folks, the information you need is almost always there, it's just a matter of knowing how to look for it :)
What you are looking for might be this
queryset = Jobmst.objects.filter(jobmst_id=3296).values_list(
    'jobmst_id', 'jobmst_name', 'jobdtl_id__jobdtl_cmd', 'jobdtl_id__jobdtl_params')
You would get your results with only one query, and you should be able to use order_by with this.
P.S. Coming from SQL you might find some great insights playing with queryset.query (the SQL generated by django) in a django shell.
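As a reference point for comparing against queryset.query, the sketch below runs the INNER JOIN from the question against a tiny in-memory database using the standard sqlite3 module. Only the four columns of interest are modelled, and the sample rows (including the detail id 10) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be read by column name
conn.executescript("""
CREATE TABLE jobdtl (jobdtl_id INTEGER PRIMARY KEY, jobdtl_cmd TEXT, jobdtl_params TEXT);
CREATE TABLE jobmst (jobmst_id INTEGER PRIMARY KEY, jobmst_name TEXT, jobdtl_id INTEGER);
-- Invented sample rows.
INSERT INTO jobdtl VALUES (10, 'backup.sh', '--full'), (11, 'report.sh', '');
INSERT INTO jobmst VALUES (3296, 'nightly backup', 10), (3297, 'weekly report', 11);
""")


def job_with_detail(conn, jobmst_id):
    """Fetch one job together with its detail row in a single statement."""
    row = conn.execute(
        """
        SELECT m.jobmst_id   AS jobmst_id,
               m.jobmst_name AS jobmst_name,
               d.jobdtl_cmd  AS jobdtl_cmd,
               d.jobdtl_params AS jobdtl_params
        FROM jobmst m
        INNER JOIN jobdtl d ON m.jobdtl_id = d.jobdtl_id
        WHERE m.jobmst_id = ?
        """,
        (jobmst_id,),
    ).fetchone()
    return dict(row) if row is not None else None


job = job_with_detail(conn, 3296)
print(job)
```

Note that the detail row is found through the foreign key value stored on the job (10 in this sample data), not by reusing the job's own id, which is why a lookup that follows the relation is safer than filtering both tables by 3296.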
