Next upcoming date model object ignoring past dates - python

I'm trying to use latest() on a Django model queryset to return the object with the next upcoming date.
I've tried a few different things, including __lte and __gte lookups in a filter, to no avail.
The filter approach would work for me if there were a way to use a model method inside exclude(), but as far as I can tell that requires a custom manager, which is not an option for me.
There must be an easier way?
class RaidSession(models.Model):
    scheduled = models.DateTimeField()
    duration = models.DurationField()
    def is_expired(self):
        duration_to_date = self.scheduled + self.duration
        return duration_to_date < timezone.now()

Since I'm a little old school, it usually helps me to think of such problems as an SQL query. In your case this would be
SELECT * FROM app_raidsession rs
WHERE rs.scheduled >= now()
ORDER BY rs.scheduled
LIMIT 1
This gives you the next scheduled raid.
In the Django ORM, you should be able to translate this more or less directly to:
from django.utils.timezone import now
# first() returns None if the result is empty
next_raid = models.RaidSession.objects \
    .filter(scheduled__gte=now()) \
    .order_by('scheduled') \
    .first()
If the duration is relevant, you will need an F-expression:
from django.db.models import F
next_raid = models.RaidSession.objects \
    .filter(scheduled__gte=now() - F('duration')) \
    .order_by('scheduled') \
    .first()
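To make the intent of the duration-aware filter concrete, here is a minimal in-memory sketch of the same selection rule. The `Raid` dataclass and `next_raid` helper are hypothetical stand-ins, not part of the question's models; the sketch only mirrors the logic "not yet finished, earliest first".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass
class Raid:
    """Hypothetical stand-in for a RaidSession row."""
    scheduled: datetime
    duration: timedelta


def next_raid(raids: Iterable[Raid], now: datetime) -> Optional[Raid]:
    # Keep raids that have not finished yet (scheduled + duration >= now),
    # which is the same condition as scheduled >= now - duration.
    live = [r for r in raids if r.scheduled + r.duration >= now]
    # Earliest start wins; None mirrors first() on an empty queryset.
    return min(live, key=lambda r: r.scheduled, default=None)
```

A raid that started half an hour ago and lasts an hour is still returned here, whereas the plain `scheduled__gte=now()` filter would skip it.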

Related

Sqlalchemy get row in timeslot

I have a model called Appointment with two columns: datetime, a DateTime field, and duration, an Integer field representing the duration in minutes. I want to check whether func.now() falls between the appointment's datetime and the sum of datetime and duration.
I am currently trying to do it this way, but I need a solution that works for both PostgreSQL and SQLite.
current_appointment = Appointment.query.filter(
    Appointment.datetime.between(
        func.now(),
        func.timestampadd(
            'MINUTES', Appointment.duration, func.now()
        )
    )
).limit(1).one_or_none()
I don't think you'll be able to do this directly in the ORM for both sqlite and postgres, but sqlalchemy lets you extend it in a cross-dialect way with Custom SQL Constructs and Compilation Extension.
This snippet might not be exactly right because I hacked at it with some different models and translated it over for this, but I got something very close to render the postgres SQL correctly:
from sqlalchemy import func
from sqlalchemy.sql import expression
from sqlalchemy.types import DateTime
from sqlalchemy.ext.compiler import compiles
class durationnow(expression.FunctionElement):
    type = DateTime()
    name = 'durationnow'
@compiles(durationnow, 'sqlite')
def sl_durationnow(element, compiler, **kw):
    return compiler.process(
        func.timestampadd('MINUTES', element.clauses, func.now())
    )
@compiles(durationnow, 'postgresql')
def pg_durationnow(element, compiler, **kw):
    return compiler.process(
        func.now() + func.make_interval(0, 0, 0, 0, 0, element.clauses)
    )
    # Or alternatively...
    # return "now() + make_interval(0, 0, 0, 0, 0, {})".format(compiler.process(element.clauses))
    # which is more in-line with how the documentation uses 'compiles'
With something like that set up you should be able to turn your original query into a cross-dialect one that renders to SQL directly instead of doing the duration computation in Python:
current_appointment = Appointment.query.filter(
    Appointment.datetime.between(
        func.now(),
        durationnow(Appointment.duration)
    )
).limit(1).one_or_none()
Disclaimer 1: First of all, consider whether it would be "cheaper" to use PostgreSQL everywhere instead of SQLite. I assume you have development/production differences, which you should avoid. Installing PostgreSQL on any modern OS is quite trivial.
Assuming the above is not an option or not desired, let's continue.
Disclaimer 2: The solution with the custom SQL construct (as per @Josh's answer) is really the only reasonable way to achieve this.
Unfortunately, the proposed solution does not actually work for SQLite and could not be fixed with just a few lines, hence a separate answer.
Solution:
Assuming you have the following model:
class Appointment(Base):
    __tablename__ = 'appointment'
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    datetime = Column(DateTime)  # note: should be better named `start_date`?
    duration = Column(Integer)
SQLite is really tricky when it comes to date operations, especially adding intervals to or subtracting them from dates. Therefore, let's approach it somewhat differently and create a custom function that returns the interval between two dates in minutes:
class diff_minutes(expression.FunctionElement):
    type = Integer()
    name = 'diff_minutes'
@compiles(diff_minutes, 'sqlite')
def sqlite_diff_minutes(element, compiler, **kw):
    dt1, dt2 = list(element.clauses)
    return compiler.process(
        (func.strftime('%s', dt1) - func.strftime('%s', dt2)) / 60
    )
@compiles(diff_minutes, 'postgresql')
def postgres_diff_minutes(element, compiler, **kw):
    dt1, dt2 = list(element.clauses)
    return compiler.process(func.extract('epoch', dt1 - dt2) / 60)
You can already implement your check using the following query (I am not adding limit(1).one_or_none() in my examples, which you can obviously do when you need it):
q = (
    session
    .query(Appointment)
    .filter(Appointment.datetime <= func.now())
    .filter(diff_minutes(func.now(), Appointment.datetime) <= Appointment.duration)
)
But now you are not limited by current time (func.now()), and you can check (and unit test) your data against any time:
# at_time = func.now()
at_time = datetime.datetime(2017, 11, 11, 17, 50, 0)
q = (
    session
    .query(Appointment)
    .filter(Appointment.datetime <= at_time)
    .filter(diff_minutes(at_time, Appointment.datetime) <= Appointment.duration)
)
Basically, the problem is solved at this point, and the solution should work for both database engines you use.
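To see what the SQLite branch boils down to, here is a small sketch using only the standard-library sqlite3 module. The table layout, sample rows and the `current_appointments` helper are assumptions made for illustration; the SQL expression is the same strftime('%s', ...) difference divided by 60 that the SQLite compile hook produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE appointment (name TEXT, datetime TEXT, duration INTEGER)"
)
conn.executemany(
    "INSERT INTO appointment VALUES (?, ?, ?)",
    [
        ("running", "2017-11-11 17:30:00", 30),   # ends 18:00
        ("finished", "2017-11-11 16:00:00", 60),  # ended 17:00
        ("upcoming", "2017-11-11 18:00:00", 15),  # starts later
    ],
)

CURRENT_SQL = """
    SELECT name FROM appointment
    WHERE datetime <= :at_time
      AND (strftime('%s', :at_time) - strftime('%s', datetime)) / 60 <= duration
"""


def current_appointments(at_time: str) -> list:
    # at_time uses the same 'YYYY-MM-DD HH:MM:SS' text format as the column,
    # so the plain <= comparison orders the values chronologically.
    rows = conn.execute(CURRENT_SQL, {"at_time": at_time}).fetchall()
    return [name for (name,) in rows]
```

Passing the time as a parameter, rather than calling now() inside the query, is what makes the check easy to unit test.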
BONUS:
You can hide the implementation of the is-current check using Hybrid Methods.
Let's add the following to the Appointment class:
    @hybrid_method
    def is_current(self, at_time=None):
        if at_time is None:
            at_time = datetime.datetime.now()
        return self.datetime <= at_time <= self.datetime + datetime.timedelta(minutes=self.duration)
    @is_current.expression
    def is_current(cls, at_time=None):
        if at_time is None:
            at_time = datetime.datetime.now()
        diffm = diff_minutes(at_time, cls.datetime)
        return and_(diffm >= 0, cls.duration >= diffm).label('is_current')
The first one allows you to run the check in memory (on python, not on SQL side):
print(my_appointment.is_current())
The second one allows you to construct query like below:
q = session.query(Appointment).filter(Appointment.is_current(at_time))
where, if at_time is not specified, the current time will be used. You can, of course, then modify the query as you wish:
current_appointment = session.query(Appointment).filter(Appointment.is_current()).limit(1).one_or_none()
If I'm understanding the question correctly...
Something like this?
import time
def check_for_current_appt(appt_epoch, appt_duration):
    '''INPUT : appt_epoch (int (epoch time)): start time for appointment
               appt_duration (int): duration of appointment in seconds
       OUTPUT : appt_underway (bool): True if appointment is currently underway'''
    now = time.time()
    appt_underway = 0 < (now - appt_epoch) < appt_duration
    return appt_underway
I'll leave it to you to convert to epoch time and seconds for the duration
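For completeness, the conversion left open above could look roughly like this. It is only a sketch: the `is_underway` name and its explicit `now` argument are my own additions so the check can be exercised with fixed times, and it assumes timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def is_underway(start: datetime, duration: timedelta, now: datetime) -> bool:
    # datetime.timestamp() gives epoch seconds; total_seconds() converts
    # the duration (for example minutes from the database) to seconds.
    appt_epoch = start.timestamp()
    appt_duration = duration.total_seconds()
    return 0 < (now.timestamp() - appt_epoch) < appt_duration
```

With a duration stored in minutes, `timedelta(minutes=appointment.duration)` would be the value to pass in.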
From what I understand, PostgreSQL uses Unix timestamps while SQLite uses ISO 8601 timestamps stored as strings. So if you change the overall structure of your database to use the SQLite format, it should give you the functionality you want; you can convert a datetime with its .isoformat() method. Unfortunately, if you are not working with test data only, you will have to iterate over all the rows to change them. I'm not sure whether this is acceptable to you, but it is an easy way to do it.
Based on the datetime section of http://docs.sqlalchemy.org/en/latest/core/type_basics.html

Django ORM filter by Max column value of two related models

I have 3 related models:
class Program(Model):
    ...  # which aggregates ProgramVersions
class ProgramVersion(Model):
    program = ForeignKey(Program)
    index = IntegerField()
class UserProgramVersion(Model):
    user = ForeignKey(User)
    version = ForeignKey(ProgramVersion)
    index = IntegerField()
ProgramVersion and UserProgramVersion are orderable models based on index field - object with highest index in the table is considered latest/newest object (this is handled by some custom logic, not relevant).
I would like to select all latest UserProgramVersion's, i.e. latest UPV's which point to the same Program.
This can be handled by this UserProgramVersion queryset method:
def latest_user_program_versions(self):
    latest = self\
        .order_by('version__program_id', '-version__index', '-index')\
        .distinct('version__program_id')
    return self.filter(id__in=latest)
This works fine; however, I'm looking for a solution which does NOT use .distinct().
I tried something like this:
def latest_user_program_versions(self):
    latest = self\
        .annotate(
            max_version_index=Max('version__index'),
            max_index=Max('index'))\
        .filter(
            version__index=F('max_version_index'),
            index=F('max_index'))
    return self.filter(id__in=latest)
This, however, does not work.
Use Subquery() expressions in Django 1.11. The example in docs is similar and the purpose is also to get the newest item for required parent records.
(You could probably start from that example with your own objects, but I have also written a more complete, more complicated suggestion that avoids possible performance pitfalls.)
from django.db.models import OuterRef, Subquery
...
def latest_user_program_versions(self, *args, **kwargs):
    # You should filter users by args or kwargs here, for performance reasons.
    # If you do it here, it is also applied to the subquery - much faster on a big db.
    qs = self.filter(*args, **kwargs)
    parent = Program.objects.filter(pk__in=qs.values('version__program'))
    newest = (
        qs.filter(version__program=OuterRef('pk'))
        .order_by('-version__index', '-index')
    )
    pks = (
        parent.annotate(newest_id=Subquery(newest.values('pk')[:1]))
        .values_list('newest_id', flat=True)
    )
    # Uncomment this if you prefer two shorter SQL queries to one nested query.
    # pks = list(pks)
    return self.filter(pk__in=pks)
If you considerably improve it, write the solution in your answer.
EDIT: The problem in your second solution:
Nobody can cut off the branch they are sitting on, not even in SQL, but I can sit on a temporary copy of it in a subquery and so survive :-) That is also why I ask for a filter at the beginning. The second problem is that Max('version__index') and Max('index') could come from two different objects, in which case no valid intersection is found.
EDIT2: Verified: the SQL generated by my query is complicated, but seems correct.
SELECT app_userprogramversion.id,...
FROM app_userprogramversion
WHERE app_userprogramversion.id IN
    (SELECT
        (SELECT U0.id
         FROM app_userprogramversion U0
         INNER JOIN app_programversion U2 ON (U0.version_id = U2.id)
         WHERE (U0.user_id = 123 AND U2.program_id = (V0.id))
         ORDER BY U2.index DESC, U0.index DESC LIMIT 1
        ) AS newest_id
     FROM app_program V0 WHERE V0.id IN
        (SELECT U2.program_id AS Col1
         FROM app_userprogramversion U0
         INNER JOIN app_programversion U2 ON (U0.version_id = U2.id)
         WHERE U0.user_id = 123
        )
    )
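The point about the two maxima coming from different objects can be illustrated with plain Python. The `Row` tuples below are hypothetical flattened rows (one per UserProgramVersion, with the version's index copied in as `version_index`), not the real models; the sketch contrasts "match both maxima independently" with "take the maximum of the pair", which is what ordering by '-version__index', '-index' and taking the first row does.

```python
from collections import namedtuple

# Hypothetical flattened rows: one per UserProgramVersion.
Row = namedtuple("Row", "id program_id version_index upv_index")

rows = [
    Row(id=1, program_id=10, version_index=2, upv_index=1),
    Row(id=2, program_id=10, version_index=1, upv_index=5),
    Row(id=3, program_id=20, version_index=1, upv_index=1),
]


def independent_maxima(rows, program_id):
    # Mimics filtering on two separately computed maxima.
    group = [r for r in rows if r.program_id == program_id]
    max_version = max(r.version_index for r in group)
    max_upv = max(r.upv_index for r in group)
    return [
        r.id for r in group
        if r.version_index == max_version and r.upv_index == max_upv
    ]


def latest_per_program(rows):
    # Compares (version_index, upv_index) as a pair, newest version first.
    latest = {}
    for r in rows:
        best = latest.get(r.program_id)
        if best is None or (r.version_index, r.upv_index) > (
            best.version_index, best.upv_index
        ):
            latest[r.program_id] = r
    return {pid: r.id for pid, r in latest.items()}
```

For program 10 the two maxima (version 2 and index 5) belong to different rows, so the first function finds nothing, while the pairwise comparison picks row 1.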

How to annotate Count with a condition in a Django queryset

Using Django ORM, can one do something like queryset.objects.annotate(Count('queryset_objects', gte=VALUE)). Catch my drift?
Here's a quick example to use for illustrating a possible answer:
In a Django website, content creators submit articles, and regular users view (i.e. read) the said articles. Articles can either be published (i.e. available for all to read), or in draft mode. The models depicting these requirements are:
class Article(models.Model):
    author = models.ForeignKey(User)
    published = models.BooleanField(default=False)
class Readership(models.Model):
    reader = models.ForeignKey(User)
    which_article = models.ForeignKey(Article)
    what_time = models.DateTimeField(auto_now_add=True)
My question is: How can I get all published articles, sorted by unique readership from the last 30 mins? I.e. I want to count how many distinct (unique) views each published article got in the last half an hour, and then produce a list of articles sorted by these distinct views.
I tried:
date = datetime.now()-timedelta(minutes=30)
articles = Article.objects.filter(published=True).extra(select = {
    "views" : """
    SELECT COUNT(*)
    FROM myapp_readership
    JOIN myapp_article on myapp_readership.which_article_id = myapp_article.id
    WHERE myapp_readership.reader_id = myapp_user.id
    AND myapp_readership.what_time > %s """ % date,
}).order_by("-views")
This sprang the error: syntax error at or near "01" (where "01" was the datetime object inside extra). It's not much to go on.
For django >= 1.8
Use Conditional Aggregation:
from django.db.models import Count, Case, When, IntegerField
Article.objects.annotate(
    numviews=Count(Case(
        When(readership__what_time__lt=threshold, then=1),
        output_field=IntegerField(),
    ))
)
Explanation:
A normal query over your articles is annotated with a numviews field. That field is built as a CASE/WHEN expression wrapped in Count: it returns 1 for each readership row matching the criteria and NULL for each row that does not. Count ignores NULLs and counts only the remaining values.
Articles that haven't been viewed recently get zero, and you can use the numviews field for sorting and filtering.
Query behind this for PostgreSQL will be:
SELECT
    "app_article"."id",
    "app_article"."author_id",
    "app_article"."published",
    COUNT(
        CASE WHEN "app_readership"."what_time" < 2015-11-18 11:04:00.000000+01:00 THEN 1
        ELSE NULL END
    ) as "numviews"
FROM "app_article" LEFT OUTER JOIN "app_readership"
    ON ("app_article"."id" = "app_readership"."which_article_id")
GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"
If we want to count only unique readers, we can pass distinct=True to Count and make the When clause return the value we want to be distinct on.
from django.db.models import Count, Case, When, CharField, F
Article.objects.annotate(
    numviews=Count(Case(
        When(readership__what_time__lt=threshold, then=F('readership__reader')),  # it can also be `readership__reader_id`, it doesn't matter
        output_field=CharField(),
    ), distinct=True)
)
That will produce:
SELECT
    "app_article"."id",
    "app_article"."author_id",
    "app_article"."published",
    COUNT(
        DISTINCT CASE WHEN "app_readership"."what_time" < 2015-11-18 11:04:00.000000+01:00 THEN "app_readership"."reader_id"
        ELSE NULL END
    ) as "numviews"
FROM "app_article" LEFT OUTER JOIN "app_readership"
    ON ("app_article"."id" = "app_readership"."which_article_id")
GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"
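The COUNT(CASE ...) pattern can be tried out without Django using the standard-library sqlite3 module. Everything in this sketch (table names, sample rows, the `view_counts` helper) is made up for illustration. Note that it compares with `>` so that only reads newer than the cutoff are counted, which is what the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, published INTEGER);
    CREATE TABLE readership (
        reader_id INTEGER, which_article_id INTEGER, what_time TEXT
    );
    INSERT INTO article VALUES (1, 1), (2, 1);
    INSERT INTO readership VALUES
        (100, 1, '2015-11-18 10:50:00'),
        (100, 1, '2015-11-18 10:55:00'),
        (101, 1, '2015-11-18 10:58:00'),
        (102, 1, '2015-11-18 09:00:00'),
        (100, 2, '2015-11-18 08:00:00');
""")

VIEWS_SQL = """
    SELECT a.id,
           COUNT(CASE WHEN r.what_time > :since THEN 1 END) AS views,
           COUNT(DISTINCT CASE WHEN r.what_time > :since
                               THEN r.reader_id END) AS unique_views
    FROM article a
    LEFT OUTER JOIN readership r ON a.id = r.which_article_id
    WHERE a.published = 1
    GROUP BY a.id
    ORDER BY unique_views DESC, a.id
"""


def view_counts(since: str) -> list:
    # CASE without ELSE yields NULL, and COUNT skips NULLs, so rows
    # outside the time window simply do not contribute to either count.
    return conn.execute(VIEWS_SQL, {"since": since}).fetchall()
```

Reader 100 opened article 1 twice within the window, so the plain count and the distinct count differ for that article.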
For django < 1.8 and PostgreSQL
You can just use raw for executing SQL statement created by newer versions of django. Apparently there is no simple and optimized method for querying that data without using raw (even with extra there are some problems with injecting required JOIN clause).
Article.objects.raw('''
    SELECT
        "app_article"."id",
        "app_article"."author_id",
        "app_article"."published",
        COUNT(
            DISTINCT CASE WHEN "app_readership"."what_time" < '2015-11-18 11:04:00.000000+01:00' THEN "app_readership"."reader_id"
            ELSE NULL END
        ) AS "numviews"
    FROM "app_article" LEFT OUTER JOIN "app_readership"
        ON ("app_article"."id" = "app_readership"."which_article_id")
    GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"
''')
For django >= 2.0 you can use Conditional aggregation with a filter argument in the aggregate functions:
from datetime import timedelta
from django.utils import timezone
from django.db.models import Count, Q # need import
Article.objects.annotate(
    numviews=Count(
        'readership__reader__id',
        filter=Q(readership__what_time__gt=timezone.now() - timedelta(minutes=30)),
        distinct=True
    )
)

Sqlalchemy searching dates

I'm trying to do a search between two dates with SQLAlchemy. With static dates it looks like this:
def secondExercise():
    for instance in session.query(Puppy.name, Puppy.weight, Puppy.dateOfBirth).\
            filter(Puppy.dateOfBirth <= '2015-08-31', Puppy.dateOfBirth >= '2015-02-25').order_by(desc("dateOfBirth")):
        print(instance)
Manipulating dates in python is quite easy.
today = date.today().strftime("%Y/%m/%d")
sixthmonth = date(date.today().year, date.today().month-6,date.today().day).strftime("%Y/%m/%d")
The problem is, I don't know how to implement this as parameter. Any help with this?
for instance in session.query(Puppy.name, Puppy.weight, Puppy.dateOfBirth).\
        filter(Puppy.dateOfBirth <= today, Puppy.dateOfBirth >= sixthmonth).order_by(desc("dateOfBirth")):
SQLAlchemy supports comparison by datetime.date() and datetime.datetime() objects.
http://docs.sqlalchemy.org/en/rel_1_0/core/type_basics.html?highlight=datetime#sqlalchemy.types.DateTime
You can expose these as parameters (replace your_query with all the stuff you want to be constant and not parametrized):
import datetime
six_months_ago = datetime.datetime.today() - datetime.timedelta(180)
today = datetime.datetime.today()
def query_puppies(birth_date=six_months_ago):
    for puppy in your_query.filter(Puppy.dateOfBirth.between(birth_date, today)):
        print(puppy.name)  # for example..
Also note the usage of the between clause for some extra awesomeness :)
Two separate clauses using <= and >= would also work.
Cheers
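A side note on the date arithmetic: timedelta(180) is only an approximation of six months, and the question's date(year, month-6, day) fails for months before July. If calendar months matter, one possible standard-library sketch is the hypothetical `months_ago` helper below, which rolls the year over and clamps the day to the length of the target month.

```python
import calendar
from datetime import date


def months_ago(day: date, months: int) -> date:
    # Work on a zero-based month count so the year rolls over correctly.
    total = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    # Clamp the day, since the target month may be shorter
    # (for example 31 August minus six months lands in February).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))
```

The resulting date object can be passed straight into the between() filter, since SQLAlchemy compares against date and datetime objects directly.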

Ordering a queryset while ignoring a value

I have a field hotrank that stores the chart position of hot music (1, 2, 3, 4 ... 10).
If the music didn't make it into the top 10, I save 0.
I have another field called releaseday, which stores the release day of the music.
Now I want to query:
Music.objects.filter(releaseday__lte=today).order_by('hotrank','-releaseday')
The problem is that ordering by hotrank starts at 0, but 0 is not the top music.
How can I make order_by start from hotrank=1? Is there a way to do this?
I think you can exclude results with hotrank=0 from the query results, like this:
Music.objects.filter(releaseday__lte=today).exclude(hotrank=0).order_by(
    '-hotrank', '-releaseday')
If you want to add the results of those with hotrank=0 right after the ordered results you can do it this way:
released_music = Music.objects.filter(releaseday__lte=today).order_by(
    '-hotrank', '-releaseday')
result = released_music.exclude(hotrank=0) | released_music.filter(hotrank=0)
Use the queryset's extra() method with SQL CASE expression:
Music.objects.filter(releaseday__lte=today) \
    .extra({'vrank': 'CASE WHEN hotrank=0 THEN 11 ELSE hotrank END'}) \
    .order_by('vrank', '-releaseday')
I don't believe there is a way to achieve this through Django queries. I'd also discourage you from using any database-specific queries. You can pull all your objects from the database and sort them the way you want in Python:
music_list = Music.objects.filter(releaseday__lte=today).all()
sorted_music = sorted(music_list, key=lambda x: x.hotrank if x.hotrank > 0 else 11)
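The one-line sort above orders by rank only. If the secondary ordering by newest release should be kept as well, a tuple key is one way to sketch it. The `Track` tuple and `chart_order` helper are hypothetical stand-ins for the Music objects, and the sketch assumes releaseday is a datetime.date.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for a Music object.
Track = namedtuple("Track", "title hotrank releaseday")


def chart_order(tracks):
    # False sorts before True, so ranked tracks (hotrank 1..10) come first
    # and unranked ones (hotrank 0) last; within equal rank, the negated
    # ordinal puts the newest release first.
    return sorted(
        tracks,
        key=lambda t: (t.hotrank == 0, t.hotrank, -t.releaseday.toordinal()),
    )
```

Unlike the magic value 11, the boolean flag keeps working if the chart ever grows beyond ten positions.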
