Django annotate + SUM how to get all entries - python

My models
class Machine(models.Model):
    machineName = models.CharField(verbose_name="Machine Name", max_length=20, blank=False, null=False)

class SalesReport(models.Model):
    machine = models.ForeignKey(Machine, on_delete=models.CASCADE, null=False, blank=False)
    deviceDate = models.CharField(max_length=200, null=True, blank=True)
    serverDate = models.DateTimeField(auto_now_add=True)
    totalPrice = models.FloatField()
I have 3 machines and want to get the total sales from each machine for the last 7 days.
My query is
from django.db.models import Sum, Value as V
from django.db.models.functions import Coalesce
SalesReport.objects.values("serverDate__date", "machine__machineName").annotate(
... sales=Coalesce(Sum("totalPrice"),V(0))).filter(
... serverDate__gte=week_start,
... serverDate__lte=week_end)
Which gives the following result,
[{'serverDate__date': datetime.date(2020, 7, 22), 'machine__machineName': 'machine__1', 'sales': 15.0},
{'serverDate__date': datetime.date(2020, 7, 28), 'machine__machineName': 'machine__1', 'sales': 145.0},
{'serverDate__date': datetime.date(2020, 7, 28), 'machine__machineName': 'machine__2', 'sales': 270.0},
{'serverDate__date': datetime.date(2020, 7, 28), 'machine__machineName': 'machine__3', 'sales': 255.0}]
What I am trying to get is
[{'serverDate__date': datetime.date(2020, 7, 22), 'machine__machineName': 'machine__1', 'sales': 15.0},
{'serverDate__date': datetime.date(2020, 7, 22), 'machine__machineName': 'machine__2', 'sales': 0.0},
{'serverDate__date': datetime.date(2020, 7, 22), 'machine__machineName': 'machine__3', 'sales': 0.0},
{'serverDate__date': datetime.date(2020, 7, 28), 'machine__machineName': 'machine__1', 'sales': 145.0},
{'serverDate__date': datetime.date(2020, 7, 28), 'machine__machineName': 'machine__2', 'sales': 270.0},
{'serverDate__date': datetime.date(2020, 7, 28), 'machine__machineName': 'machine__3', 'sales': 255.0}]
I am trying to do it with Coalesce, but I'm getting it wrong.
*I'm using MySQL as the DB. A DB-specific query is also fine.

Since this is more of an SQL question, here is a more specific answer:
SELECT m.machineName, s.price
FROM machine m LEFT OUTER JOIN (
    SELECT machine_id id, sum(totalPrice) price
    FROM salesreport
    WHERE serverDate BETWEEN DATE_SUB(curdate(), INTERVAL 1 WEEK) and curdate()
    GROUP BY machine_id) s on m.id = s.id
If you want serverDate in the output, you have to apply an aggregate function (MAX, MIN) to it, since it lives in your SalesReport table.
It depends on what serverDate stands for. If it is the date when you bought the machine, then it should be in the machine table and can be selected directly from there (and the WHERE ... BETWEEN clause must move out of the sub-select and apply to the machine table). If it is a sales date, then it has to be in SalesReport and you must apply an aggregate function to it, because there can be up to 7 distinct dates over a week:
SELECT m.machineName, s.MaxserverDate, s.price
FROM machine m LEFT OUTER JOIN (
    SELECT machine_id id, max(serverDate) MaxserverDate, sum(totalPrice) price
    FROM salesreport
    WHERE serverDate BETWEEN DATE_SUB(curdate(), INTERVAL 1 WEEK) and curdate()
    GROUP BY machine_id) s on m.id = s.id
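To see the left outer join behave as described, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from the standard library (not MySQL, so the date functions are replaced by bound parameters). The table layout mirrors the answer's query, the rows are made up to resemble the question's data, and wrapping the sum in COALESCE is my addition so that machines without sales show 0.0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE machine (id INTEGER PRIMARY KEY, machineName TEXT);
    CREATE TABLE salesreport (
        id INTEGER PRIMARY KEY,
        machine_id INTEGER REFERENCES machine(id),
        serverDate TEXT,
        totalPrice REAL
    );
    INSERT INTO machine (id, machineName) VALUES
        (1, 'machine__1'), (2, 'machine__2'), (3, 'machine__3');
    -- Illustrative rows only; machine__3 deliberately has no sales.
    INSERT INTO salesreport (machine_id, serverDate, totalPrice) VALUES
        (1, '2020-07-22', 15.0),
        (1, '2020-07-28', 145.0),
        (2, '2020-07-28', 270.0);
""")

query = """
    SELECT m.machineName, COALESCE(s.price, 0.0) AS price
    FROM machine m
    LEFT OUTER JOIN (
        SELECT machine_id AS id, SUM(totalPrice) AS price
        FROM salesreport
        WHERE serverDate BETWEEN :start AND :end
        GROUP BY machine_id
    ) s ON m.id = s.id
    ORDER BY m.machineName
"""
# Every machine appears once, whether or not it sold anything in the range.
rows = conn.execute(query, {"start": "2020-07-22", "end": "2020-07-28"}).fetchall()
for name, price in rows:
    print(name, price)
```

Because the join starts from the machine table, the row for machine__3 survives even though the sub-select returns nothing for it.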

The thing is that you don't have any sales for some dates. It is more a DB-specific issue than a Django ORM one. I would suggest using raw SQL with a left outer join on your machine table: take all the machines and list sales when present.
machine = Machine.objects.raw('''
SELECT machine.id, machine.name, sales.sid FROM app_machinelist as machine
LEFT JOIN (select sales_id as sid
from app_sales
where profile_id = {0}) sales
ON sales.sid = machine.id
ORDER BY machine.name ASC
'''.format(myuser.id))
This example works, but string formatting leaves the query open to SQL injection, so for security reasons it is better to pass your parameters through a dictionary:
machine = Machine.objects.raw(mysql, params)
Where
params = {'profile_id': pk, 'startdate': startdate, 'enddate': enddate}
mysql = '''
SELECT machine.id, machine.name, sales.sid FROM app_machinelist as machine
LEFT JOIN (select sales_id as sid
from app_sales
where profile_id = %(profile_id)s) sales
ON sales.sid = machine.id
ORDER BY machine.name ASC
'''
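If you would rather keep the original ORM query, another option is to fill the gaps in Python afterwards. The following is only a sketch under a few assumptions: fill_missing is a hypothetical helper, the rows are the ones printed in the question, and the machine names are hard-coded here (in Django they could come from something like Machine.objects.values_list("machineName", flat=True)).

```python
import datetime
from itertools import product


def fill_missing(rows, machine_names):
    """Return one row per (date, machine), with 0.0 where no sales exist."""
    totals = {
        (r["serverDate__date"], r["machine__machineName"]): r["sales"]
        for r in rows
    }
    # Only dates that appear in at least one row are known here.
    dates = sorted({r["serverDate__date"] for r in rows})
    return [
        {
            "serverDate__date": d,
            "machine__machineName": m,
            "sales": totals.get((d, m), 0.0),
        }
        for d, m in product(dates, machine_names)
    ]


# Rows as returned by the values().annotate() query in the question.
rows = [
    {"serverDate__date": datetime.date(2020, 7, 22), "machine__machineName": "machine__1", "sales": 15.0},
    {"serverDate__date": datetime.date(2020, 7, 28), "machine__machineName": "machine__1", "sales": 145.0},
    {"serverDate__date": datetime.date(2020, 7, 28), "machine__machineName": "machine__2", "sales": 270.0},
    {"serverDate__date": datetime.date(2020, 7, 28), "machine__machineName": "machine__3", "sales": 255.0},
]
machines = ["machine__1", "machine__2", "machine__3"]

filled = fill_missing(rows, machines)
for row in filled:
    print(row)
```

Note that a date on which no machine sold anything will still be missing; to cover those, build the list of dates from the week range instead of from the rows.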

Related

Appending new items to a Python list with logic operators (Time values)

I'm starting out with Python on a job (I'm used to R) where I have to get daily data from an API that returns a datetime and a value (the number of listeners on a podcast) and then send that data to a BigQuery database.
After I split up the date and time, I need to add a new column that indicates which program was playing at that moment. In other words:
if the time is >= 11:00 and <= 11:59, then add a 'program name' value to the row in the column 'program'.
I've run into several problems, namely the fact that the time has been split as strings (possibly because we use Google Data Studio, which has an extremely rigid datetime implementation).
How would you go about it?
if response.status_code == 200:
    data = response.text
    result = json.loads(data)
    test = result
    # Append items
    for k in test:
        l = []
        l.append(datetime.datetime.strptime(k["time"], "%Y-%m-%dT%H:%M:%S.%fZ").strftime("%Y-%m-%d"))
        l.append(datetime.datetime.strptime(k["time"], "%Y-%m-%dT%H:%M:%S.%fZ").astimezone(pytz.timezone("America/Toronto")).strftime("%H:%M"))
        l.append(k["value"])
You need to have a 'DB' of the programs timetable. See below.
Your loop will call the function below with the time value and you will have the program name.
import datetime
from collections import namedtuple

Program = namedtuple('Program', 'name start end')

PROGRAMS_DB = [Program('prog1', datetime.time(3, 0, 0), datetime.time(3, 59, 0)),
               Program('prog2', datetime.time(18, 0, 0), datetime.time(18, 59, 0)),
               Program('prog3', datetime.time(4, 0, 0), datetime.time(4, 59, 0))]

def get_program_name(time_val):
    for program in PROGRAMS_DB:
        if program.start <= time_val <= program.end:
            return program.name

data_from_the_web = [{"time": "2019-02-19T18:10:00.000Z", "value": 413, "details": None},
                     {"time": "2019-02-19T15:12:00.000Z", "value": 213, "details": None}]

for entry in data_from_the_web:
    t = datetime.datetime.strptime(entry["time"], "%Y-%m-%dT%H:%M:%S.%fZ").time()
    entry['prog'] = get_program_name(t)

for entry in data_from_the_web:
    print(entry)
Output
{'prog': 'prog2', 'details': None, 'value': 413, 'time': '2019-02-19T18:10:00.000Z'}
{'prog': None, 'details': None, 'value': 213, 'time': '2019-02-19T15:12:00.000Z'}
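Since the question mentions that the time ends up as an "HH:MM" string after the split, the same lookup can be applied to rows shaped like the asker's lists. This is a sketch only: program_for is a hypothetical helper, and the program names and slots below are invented placeholders for a real timetable.

```python
import datetime
from collections import namedtuple

Program = namedtuple("Program", "name start end")

# Hypothetical timetable; real names and slots would come from the station.
PROGRAMS = [
    Program("morning show", datetime.time(11, 0), datetime.time(11, 59)),
    Program("evening show", datetime.time(18, 0), datetime.time(18, 59)),
]


def program_for(time_str):
    """Map an 'HH:MM' string to a program name, or None if nothing matches."""
    t = datetime.datetime.strptime(time_str, "%H:%M").time()
    for program in PROGRAMS:
        if program.start <= t <= program.end:
            return program.name
    return None


# Rows in the [date, time, value] shape built by the loop in the question.
rows = [["2019-02-19", "11:30", 413], ["2019-02-19", "15:12", 213]]
for row in rows:
    row.append(program_for(row[1]))  # adds the 'program' column

print(rows)
```

Parsing the string back into a datetime.time keeps the comparison numeric, so "09:05" and "11:00" compare correctly without relying on string ordering.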

SQLAlchemy _asdict() method returns only one column

I am trying to convert the rows returned in a SQLAlchemy query to dictionaries. When I try to use the ._asdict() method, I am only getting a key-value pair for the first column in my results.
Is there something else I should do to create a key-value pair in the dictionary for all columns in the result row?
class Project(db.Model):
    __tablename__ = 'entries'
    id = db.Column(db.Integer, primary_key=True)
    time_start = db.Column(db.DateTime(timezone=False))
    time_end = db.Column(db.DateTime(timezone=False))
    name = db.Column(db.String(256), nullable=True)
    analyst = db.Column(db.String(256), nullable=True)

    def __init__(self, id, time_start, time_end, name, analyst):
        self.id = id
        self.time_start = time_start
        self.time_end = time_end
        self.name = name
        self.analyst = analyst

latest_projects = db.session.query((func.max(Project.time_end)), Project.analyst).group_by(Project.analyst)
for row in latest_projects.all():
    print(row._asdict())
{'analyst': 'Bob'}
{'analyst': 'Jane'}
{'analyst': 'Fred'}
I was expecting to see results like this...
{'analyst': 'Bob', 'time_end': '(2018, 11, 21, 14, 55)'}
{'analyst': 'Jane', 'time_end': '(2017, 10, 21, 08, 00)'}
{'analyst': 'Fred', 'time_end': '(2016, 09, 06, 01, 35)'}
You haven't named the func.max() column, so there is no name to use as a key in the resulting dictionary. Aggregate function columns are not automatically named, even when aggregating a single column; that you based that column on the time_end column doesn't matter here.
Give that column a label:
latest_projects = db.session.query(
    func.max(Project.time_end).label('time_end'),
    Project.analyst
).group_by(Project.analyst)
Demo:
>>> latest_projects = db.session.query(
...     func.max(Project.time_end).label('time_end'),
...     Project.analyst
... ).group_by(Project.analyst)
>>> for row in latest_projects.all():
...     print(row._asdict())
...
{'time_end': datetime.datetime(2018, 11, 21, 14, 55), 'analyst': 'Bob'}
{'time_end': datetime.datetime(2016, 9, 6, 1, 35), 'analyst': 'Fred'}
{'time_end': datetime.datetime(2017, 10, 21, 8, 0), 'analyst': 'Jane'}
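At the SQL level, .label('time_end') renders as an AS time_end alias on the aggregate. The sketch below shows the effect of that alias using the standard library's sqlite3 module rather than SQLAlchemy, so it is an analogy, not the same API: sqlite3 names an unaliased aggregate after its expression text instead of leaving it out of the mapping. The table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows expose keys(), like a mapping
conn.executescript("""
    CREATE TABLE entries (id INTEGER PRIMARY KEY, time_end TEXT, analyst TEXT);
    -- Illustrative rows only.
    INSERT INTO entries (time_end, analyst) VALUES
        ('2018-11-21 14:55', 'Bob'),
        ('2018-01-05 09:00', 'Bob'),
        ('2017-10-21 08:00', 'Jane');
""")

# Without an alias there is no 'time_end' key in the result row.
unlabeled = conn.execute(
    "SELECT MAX(time_end), analyst FROM entries GROUP BY analyst ORDER BY analyst"
).fetchall()

# With an alias, the aggregate gets a usable name.
labeled = conn.execute(
    "SELECT MAX(time_end) AS time_end, analyst "
    "FROM entries GROUP BY analyst ORDER BY analyst"
).fetchall()

print(unlabeled[0].keys())
print([dict(row) for row in labeled])
```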

Compare datetime with Django birthday objects

I have a question about my script. I want to find all people in my database who are more than 16 years old. I want to check this when the user triggers the function.
I have this function:
def Recensement_array(request):
    date = datetime.now().year
    print(date)  # I get year from now
    birthday = Identity.objects.values_list('birthday', flat=True)  # Return list with all birthday values
    for element in birthday:
        if date - element < 117:
            print("ok < 117")
        else:
            print("ok > 117")
From print date I get :
2017
From print birthday I get :
<QuerySet [datetime.date(1991, 12, 23), datetime.date(1900, 9, 12), datetime.date(1900, 9, 12), datetime.date(1900, 9, 12), datetime.date(1900, 9, 12), datetime.date(1089, 9, 22), datetime.date(1900, 9, 12), datetime.date(1900, 9, 12), datetime.date(1089, 9, 22), datetime.date(1089, 9, 22), datetime.date(1089, 9, 22), datetime.date(1089, 9, 22), datetime.date(1990, 12, 12)]>
So my goal is to subtract birthday from date and check whether date - birthday = 16 years; if so, I print the element, otherwise nothing.
I have two problems:
How do I extract only the year from birthday?
The comparison is currently between an int and a datetime.date object. If I could extract only the year from birthday, it should work, right?
Thank you
EDIT:
For example, I want to get all people who have turned 16 since the beginning of this year or will turn 16 before the first year:
def Recensement_array(request):
    today = datetime.now()
    age_16 = (today - relativedelta(years=16))
    result = Identity.objects.filter(birthday__range=[age_16, today]).order_by('lastname')
    paginator = Paginator(result, 3)
    page = request.GET.get('page', 1)
    try:
        result = paginator.page(page)
    except PageNotAnInteger:
        result = paginator.page(1)
    except EmptyPage:
        result = paginator.page(paginator.num_pages)
    context = {
        "Identity": Identity,
        "age_16": age_16,
        "datetime": datetime,
        "result": result,
        "PageNotAnInteger": PageNotAnInteger,
    }
    return render(request, 'Recensement_resume.html', context)
If you need to filter records by a specific year, you can just use the __year lookup on the date field:
age_16 = (today - relativedelta(years=16))
result = Identity.objects.filter(birthday__year=age_16.year).order_by('lastname')
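For the original question about pulling the year out of each birthday, here is a plain-Python sketch of the two comparisons involved, without the database. The helper age_on, the fixed "today" and the two 2001 birthdays are assumptions added for illustration; the other dates are taken from the printed queryset.

```python
import datetime


def age_on(birthday, today):
    """Full years elapsed between birthday and today."""
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)


# Fixed date so the result is reproducible; use datetime.date.today() in practice.
today = datetime.date(2017, 6, 1)

birthdays = [
    datetime.date(1991, 12, 23),
    datetime.date(2001, 3, 15),   # made-up entry: already 16 on 'today'
    datetime.date(2001, 9, 1),    # made-up entry: turns 16 later in the year
    datetime.date(1900, 9, 12),
]

# Same idea as birthday__year: compare only the .year attribute.
turning_16_this_year = [b for b in birthdays if today.year - b.year == 16]

# Exact age check, taking month and day into account.
already_16_or_older = [b for b in birthdays if age_on(b, today) >= 16]

print(turning_16_this_year)
print(already_16_or_older)
```

The year-only comparison matches everyone born in 2001, while the exact check distinguishes those whose birthday has already passed.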

Loop for dictionaries in python

I have the following script:
from collections import defaultdict

class OrderLine:
    def __init__(self, shop_id, qty, price):
        self.shop_id = shop_id
        self.qty = qty
        self.price = price

order_lines = [OrderLine(1, 2, 30), OrderLine(1, 1, 50), OrderLine(3, 3, 10)]
shop_sum = defaultdict(int)
for order_line in order_lines:
    print(order_line.price)
    print(order_line.qty)
    print(order_line.shop_id)
Each line consists of (shop_id, qty, price). I want to loop over the order lines so that shop_sum gives me, for each shop_id:
total_price = qty * price
Example: I have shop ids (1, 3). For shop_id = 1
I have two order lines [OrderLine(1, 2, 30), OrderLine(1, 1, 50)]
I want to calculate the total price over all order lines that have the same shop_id, where total_price = qty * price
Thanks in advance
You can populate your dictionary shop_sum using the code snippet below. The if statement checks whether the shop_id already exists in shop_sum and, if not, initialises it to zero (with defaultdict(int) this check is redundant, because missing keys already default to 0, but it is harmless). The line after the if performs the actual sum. You can write this code more compactly with a comprehension; you may want to read up on this.
from collections import defaultdict

class OrderLine:
    def __init__(self, shop_id, qty, price):
        self.shop_id = shop_id
        self.qty = qty
        self.price = price

order_lines = [OrderLine(1, 2, 30), OrderLine(1, 1, 50), OrderLine(3, 3, 10)]
shop_sum = defaultdict(int)
for order_line in order_lines:
    shop_id = order_line.shop_id
    if shop_id not in shop_sum:
        shop_sum[shop_id] = 0
    shop_sum[shop_id] = shop_sum[shop_id] + order_line.qty * order_line.price
    print(order_line.price)
    print(order_line.qty)
    print(order_line.shop_id)
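For comparison, a shorter sketch of the same accumulation that leans on defaultdict(int) for the initial zero, so no membership check is needed. The class and data are the ones from the question.

```python
from collections import defaultdict


class OrderLine:
    def __init__(self, shop_id, qty, price):
        self.shop_id = shop_id
        self.qty = qty
        self.price = price


order_lines = [OrderLine(1, 2, 30), OrderLine(1, 1, 50), OrderLine(3, 3, 10)]

shop_sum = defaultdict(int)
for line in order_lines:
    # A missing shop_id starts at 0 automatically.
    shop_sum[line.shop_id] += line.qty * line.price

for shop_id, total in shop_sum.items():
    print(shop_id, total)
```

Shop 1 ends up with 2 * 30 + 1 * 50 = 110 and shop 3 with 3 * 10 = 30.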

group objects with same field value

I've got two models like these:
class Schedule(models.Model):
    name = models.CharField(_('name'), blank=True, max_length=15)

class Day(models.Model):
    DAYS_OF_THE_WEEK = (
        (0, _('Monday')),
        (1, _('Tuesday')),
        (2, _('Wednesday')),
        (3, _('Thursday')),
        (4, _('Friday')),
        (5, _('Saturday')),
        (6, _('Sunday')),
    )
    schedule = models.ForeignKey(Schedule, blank=True, null=True, verbose_name=_('schedule'))
    day = models.SmallIntegerField(_('day'), choices=DAYS_OF_THE_WEEK)
    opening = models.TimeField(_('opening'), blank=True)
    closing = models.TimeField(_('closing'), blank=True)
It's possible that a schedule can have two Day objects like so:
Day(schedule=1, day=0, opening=datetime.time(7, 30), closing=datetime.time(10, 30))
Day(schedule=1, day=0, opening=datetime.time(12, 30), closing=datetime.time(15, 30))
like different shifts on the same day.
If I iterate over them now, I'll get two entries for day 0, like so
[day for day in schedule]
[0, 0, 1, 2, 3, 4, 5, 6]
How can I create a queryset so it'll group same days together and keep their attributes?
[day for day in schedule]
[0 (two entries), 1, 3, 4, 5, 6]
Maybe something like
[id: [day], id: [day]]
The code I ended up using is this:
from itertools import groupby
day_set = store.schedule_set.all()[0].day_set.all()
schedule = dict()
for k, v in groupby(day_set, lambda x: x.day):
    schedule[k] = list(v)
and sending schedule to the template for rendering, which works like a charm.
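You can group them at template level, using {% regroup %} or a {% for %} loop with the {% ifchanged %} tag.
In Python code, use itertools.groupby. It only groups consecutive items, so sort by day first (for example with order_by('day') on the queryset).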
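A small sketch of the groupby approach that runs without Django: the namedtuple Day below is a stand-in for the model instances, and the sample shifts are invented. It sorts before grouping, since groupby would otherwise split a day into several groups whenever its entries are not adjacent.

```python
import datetime
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Stand-in for the Day model; only the fields needed for grouping.
Day = namedtuple("Day", "day opening closing")

days = [
    Day(1, datetime.time(8, 0), datetime.time(17, 0)),
    Day(0, datetime.time(7, 30), datetime.time(10, 30)),
    Day(0, datetime.time(12, 30), datetime.time(15, 30)),
]

by_day = attrgetter("day")
# Sort first so that equal days are adjacent, then group.
schedule = {
    key: list(group)
    for key, group in groupby(sorted(days, key=by_day), key=by_day)
}

for key, shifts in schedule.items():
    print(key, len(shifts))
```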
