Ordering a queryset while ignoring a value - python

I have a field hotrank, It saves hot music (1,2,3,4...10)
But if the music didn't make it in top 10, I will save 0.
And I have another field called releaseday which saves the release day of the music
And now I want to query :
Music.objects.filter(releaseday__lte=today).order_by('hotrank','-releaseday')
But here is a problem,the order_by of hotrank is start by 0 ,but 0 is not the top music
how can I let order_by start from hotrank=1? Is there any method?

I think you can results with hotrank=0 from the query results, like this:
Music.objects.filter(releaseday__lte=today).exclude(hotrank=0).order_by(
'-hotrank','-releaseday')
If you want to add the results of those with hotrank=0 right after the ordered results you can do it this way:
released_music = Music.objects.filter(releaseday__lte=today).order_by('-hotrank',
'-releaseday')
result = released_music.exclude(hotrank=0) | released_music.filter(hotrank=0)

Use the queryset's extra() method with SQL CASE expression:
Music.objects.filter(releaseday__lte=today) \
.extra({'vrank': 'CASE WHEN hotrank=0 THEN 11 ELSE hotrank END'}) \
.order_by('vrank','-releaseday')

I don't believe there is a way to achieve this through Django queries. I also discourage you to use any database specific queries. You can pull all your objects from database and sort them the way you want in Python:
music_list = Music.objects.filter(releaseday__lte=today).all()
sorted_music = sorted(music_list, key=lambda x: x.hotrank if x.hotrank > 0 else 11)

Related

Django ORM filter by Max column value of two related models

I have 3 related models:
Program(Model):
... # which aggregates ProgramVersions
ProgramVersion(Model):
program = ForeignKey(Program)
index = IntegerField()
UserProgramVersion(Model):
user = ForeignKey(User)
version = ForeignKey(ProgramVersion)
index = IntegerField()
ProgramVersion and UserProgramVersion are orderable models based on index field - object with highest index in the table is considered latest/newest object (this is handled by some custom logic, not relevant).
I would like to select all latest UserProgramVersion's, i.e. latest UPV's which point to the same Program.
this can be handled by this UserProgramVersion queryset:
def latest_user_program_versions(self):
latest = self\
.order_by('version__program_id', '-version__index', '-index')\
.distinct('version__program_id')
return self.filter(id__in=latest)
this works fine however im looking for a solution which does NOT use .distinct()
I tried something like this:
def latest_user_program_versions(self):
latest = self\
.annotate(
'max_version_index'=Max('version__index'),
'max_index'=Max('index'))\
.filter(
'version__index'=F('max_version_index'),
'index'=F('max_index'))
return self.filter(id__in=latest)
this however does not work
Use Subquery() expressions in Django 1.11. The example in docs is similar and the purpose is also to get the newest item for required parent records.
(You could start probably by that example with your objects, but I wrote also a complete more complicated suggestion to avoid possible performance pitfalls.)
from django.db.models import OuterRef, Subquery
...
def latest_user_program_versions(self, *args, **kwargs):
# You should filter users by args or kwargs here, for performance reasons.
# If you do it here it is applied also to subquery - much faster on a big db.
qs = self.filter(*args, **kwargs)
parent = Program.objects.filter(pk__in=qs.values('version__program'))
newest = (
qs.filter(version__program=OuterRef('pk'))
.order_by('-version__index', '-index')
)
pks = (
parent.annotate(newest_id=Subquery(newest.values('pk')[:1]))
.values_list('newest_id', flat=True)
)
# Maybe you prefer to uncomment this to be it compiled by two shorter SQLs.
# pks = list(pks)
return self.filter(pk__in=pks)
If you considerably improve it, write the solution in your answer.
EDIT Your problem in your second solution:
Nobody can cut a branch below him, neither in SQL, but I can sit on its temporary copy in a subquery, to can survive it :-) That is also why I ask for a filter at the beginning. The second problem is that Max('version__index') and Max('index') could be from two different objects and no valid intersection is found.
EDIT2: Verified: The internal SQL from my query is complicated, but seems correct.
SELECT app_userprogramversion.id,...
FROM app_userprogramversion
WHERE app_userprogramversion.id IN
(SELECT
(SELECT U0.id
FROM app_userprogramversion U0
INNER JOIN app_programversion U2 ON (U0.version_id = U2.id)
WHERE (U0.user_id = 123 AND U2.program_id = (V0.id))
ORDER BY U2.index DESC, U0.index DESC LIMIT 1
) AS newest_id
FROM app_program V0 WHERE V0.id IN
(SELECT U2.program_id AS Col1
FROM app_userprogramversion U0
INNER JOIN app_programversion U2 ON (U0.version_id = U2.id)
WHERE U0.user_id = 123
)
)

Dynamic queries in Elasticsearch and issue of keywords

I'm currently running into a problem, trying to build dynamic queries for Elasticsearch in Python. To make a query I use Q shortсut from elasticsearch_dsl. This is something I try to implement
...
s = Search(using=db, index="reestr")
condition = {"attr_1_":"value 1", "attr_2_":"value 2"} # try to build query from this
must = []
for key in condition:
must.append(Q('match',key=condition[key]))
But that in fact results to this condition:
[Q('match',key="value 1"),Q('match',key="value 2")]
However, what I want is:
[Q('match',attr_1_="value 1"),Q('match',attr_2_="value 2")]
IMHO, the way this library does queries is not effective. I think this syntax:
Q("match","attrubute_name"="attribute_value")
is much more powerful and makes it possible to do a lot more things, than this one:
Q("match",attribute_name="attribute_value")
It seems, as if it is impossible to dynamically build attribute_names. Or it is, of course, possible that I do not know the right way to do it.
Suppose,
filters = {'condition1':['value1'],'condition2':['value3','value4']}
Code:
filters = data['filters_data']
must_and = list() # The condition that has only one value
should_or = list() # The condition that has more than 1 value
for key in filters:
if len(filters[key]) > 1:
for item in filters[key]:
should_or.append(Q("match", **{key:item}))
else:
must_and.append(Q("match", **{key:filters[key][0]}))
q1 = Bool(must=must_and)
q2 = Bool(should=should_or)
s = s.query(q1).query(q2)
result = s.execute()
One can also use terms, that can directly accept the list of values and no need of complicated for loops,
Code:
for key in filters:
must_and.append(Q("terms", **{key:filters[key]}))
q1 = Bool(must=must_and)

Next upcoming date model object ignoring past dates

I'm trying to utilise latest() on a django model queryset to return the next upcoming date in a model.
I've tried a few different things, using __lte and __gte lookups on a filter and to no avail.
The filter option would work for me, if there was a way to effectively utilise a model method within an exclude() but without writing a custom manager that's not going to be an option.
There must be an easier way?
class RaidSession(models.Model):
scheduled = models.DateTimeField()
duration = models.DurationField()
def is_expired(self):
duration_to_date = self.scheduled + self.duration
return True if duration_to_date < timezone.now() else False
Since I'm a little old school, it usually helps me to think of such problems as an SQL query. In your case this would be
SELECT * FROM app_raidsession rs
WHERE rs.scheduled >= now()
ORDER BY rs.scheduled
LIMIT 1
This gives you the next scheduled raid.
In django ORM, you should be able to translate this more or less straightforward to:
from django.utils.timezone import now
# first() returns None if the result is empty
next_raid = models.RaidSession.objects \
.filter(scheduled__gte=now()) \
.order_by('scheduled') \
.first()
If the duration is relevant, you will need an F-expression:
from django.db.models import F
next_raid = models.RaidSession.objects \
.filter(scheduled__gte=now() - F('duration')) \
.order_by('scheduled') \
.first()

Converting LEFT OUTER JOIN query to Django orm queryset/query

Given PostgreSQL 9.2.10, Django 1.8, python 2.7.5 and the following models:
class restProdAPI(models.Model):
rest_id = models.PositiveIntegerField(primary_key=True)
rest_host = models.CharField(max_length=20)
rest_ip = models.GenericIPAddressField(default='0.0.0.0')
rest_mode = models.CharField(max_length=20)
rest_state = models.CharField(max_length=20)
class soapProdAPI(models.Model):
soap_id = models.PositiveIntegerField(primary_key=True)
soap_host = models.CharField(max_length=20)
soap_ip = models.GenericIPAddressField(default='0.0.0.0')
soap_asset = models.CharField(max_length=20)
soap_state = models.CharField(max_length=20)
And the following raw query which returns exactly what I am looking for:
SELECT
app_restProdAPI.rest_id, app_soapProdAPI.soap_id, app_restProdAPI.rest_host, app_restProdAPI.rest_ip, app_soapProdAPI.soap_asset, app_restProdAPI.rest_mode, app_restProdAPI.rest_state
FROM
app_soapProdAPI
LEFT OUTER JOIN
app_restProdAPI
ON
((app_restProdAPI.rest_host = app_soapProdAPI.soap_host)
OR
(app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip))
WHERE
app_restProdAPI.rest_mode = 'Excluded';
Which returns like this:
rest_id | soap_id | rest_host | rest_ip | soap_asset | rest_mode | rest_state
---------+---------+---------------+----------------+------------+-----------+-----------
1234 | 12345 | 1G24019123ABC | 123.123.123.12 | A1234567 | Excluded | Up
What would be the best method for making this work using Django's model and orm structure?
I have been looking around for possible methods for joining the two tables entirely without a relationship but there does not seem to be a clean or efficient way to do this. I have also tried looking for methods to do left outer joins in django, but again documentation is sparse or difficult to decipher.
I know I will probably have to use Q objects to do the or clause I have in there. Additionally I have looked at relationships and it looks like a foreignkey() may work but I am unsure if this is the best method of doing it. Any and all help would be greatly appreciated. Thank you in advance.
** EDIT 1 **
So far Todor has offered a solution that uses a INNER JOIN that works. I may have found a solution HERE if anyone can decipher that mess of inline raw html.
** EDIT 2 **
Is there a way to filter on a field (where something = 'something') like my query above given, Todor's answer? I tried the following but it is still including all records even though my equivalent postresql query is working as expected. It seems I cannot have everything in the where that I do because when I remove one of the or statements and just do a and statement it applies the excluded filter.
soapProdAPI.objects.extra(
select = {
'rest_id' : 'app_restprodapi.rest_id',
'rest_host' : 'app_restprodapi.rest_host',
'rest_ip' : 'app_restprodapi.rest_ip',
'rest_mode' : 'app_restprodapi.rest_mode',
'rest_state' : 'app_restprodapi.rest_state'
},
tables = ['app_restprodapi'],
where = ['app_restprodapi.rest_mode=%s \
AND app_restprodapi.rest_host=app_soapprodapi.soap_host \
OR app_restprodapi.rest_ip=app_soapprodapi.soap_ip'],
params = ['Excluded']
)
** EDIT 3 / CURRENT SOLUTION IN PLACE **
To date Todor has provided the most complete answer, using an INNER JOIN, but the hope is that this question will generate thought into how this still may be accomplished. As this does not seem to be inherently possible, any and all suggestions are welcome as they may possibly lead to better solutions. That being said, using Todor's answer, I was able accomplish the exact query I needed:
restProdAPI.objects.extra(
select = {
'soap_id' : 'app_soapprodapi.soap_id',
'soap_asset' : 'app_soapprodapi.soap_asset'
},
tables = ['app_soapprodapi'],
where = ['app_restprodapi.rest_mode = %s',
'app_soapprodapi.soap_host = app_restprodapi.rest_host OR \
app_soapprodapi.soap_ip = app_restprodapi.rest_ip'
],
params = ['Excluded']
)
** TLDR **
I would like to convert this PostGreSQL query to the ORM provided by Django WITHOUT using .raw() or any raw query code at all. I am completely open to changing the model to having a foreignkey if that facilitates this and is, from a performance standpoint, the best method. I am going to be using the objects returned in conjunction with django-datatables-view if that helps in terms of design.
Solving it with INNER JOIN
In case you can go with only soapProdAPI's that contain corresponding restProdAPI ( in terms of your join statement -> linked by host or ip). You can try the following:
soapProdAPI.objects.extra(
select = {
'rest_id' : "app_restProdAPI.rest_id",
'rest_host' : "app_restProdAPI.rest_host",
'rest_ip' : "app_restProdAPI.rest_ip",
'rest_mode' : "app_restProdAPI.rest_mode",
'rest_state': "app_restProdAPI.rest_state"
},
tables = ["app_restProdAPI"],
where = ["app_restProdAPI.rest_host = app_soapProdAPI.soap_host \
OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip"]
)
How to filter more?
Since we are using .extra I would advice to read the docs carefully. In general we can't use .filter with some of the fields inside the select dict, because they are not part of the soapProdAPI and Django can't resolve them. We have to stick with the where kwarg in .extra, and since it's a list, we better just add another element.
where = ["app_restProdAPI.rest_host = app_soapProdAPI.soap_host \
OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip",
"app_restProdAPI.rest_mode=%s"
],
params = ['Excluded']
Repeated subquery
If you really need all soapProdAPI's no matter if they have corresponding restProdAPI I can only think of a one ugly example where a subquery is repeated for each field you need.
soapProdAPI.objects.extra(
select = {
'rest_id' : "(select rest_id from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)",
'rest_host' : "(select rest_host from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)",
'rest_ip' : "(select rest_ip from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)",
'rest_mode' : "(select rest_mode from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)",
'rest_state': "(select rest_state from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)"
},
)
I think this could be usefull for you! Effectively, you can use Q to construct your query.
I try it the Django shell, I create some data and I did something like this:
restProdAPI.objects.filter(Q(rest_host=s1.soap_host)|Q(rest_ip=s1.soap_ip))
Where s1 is a soapProdAPI.
This is all the code i whote, you can try it and to see if can help you
from django.db.models import Q
from core.models import restProdAPI, soapProdAPI
s1 = soapProdAPI.objects.get(soap_id=1)
restProdAPI.objects.filter(Q(rest_id=s1.soap_id)|Q(rest_ip=s1.soap_ip))

sqlalchemy query using joinedload exponentially slower with each new filter clause

I have this sqlalchemy query:
query = session.query(Store).options(joinedload('salesmen').
joinedload('comissions').
joinedload('orders')).\
filter(Store.store_code.in_(selected_stores))
stores = query.all()
for store in stores:
for salesman in store.salesmen:
for comission in salesman.comissions:
#generate html for comissions for each salesman in each store
#print html document using PySide
This was working perfectly, however I added two new filter queries:
filter(Comissions.payment_status == 0).\
filter(Order.order_date <= self.dateEdit.date().toPython())
If I add just the first filter the application hangs for a couple of seconds, if I add both the application hangs indefinitely
What am I doing wrong here? How do I make this query fast?
Thank you for your help
EDIT: This is the sql generated, unfortunately the class and variable names are in Portuguese, I just translated them to English so it would be easier to undertand,
so Loja = Store, Vendedores = Salesmen, Pedido = Order, Comission = Comissao
Query generated:
SELECT "Loja"."CodLoja", "Vendedores_1"."CodVendedor", "Vendedores_1"."NomeVendedor", "Vendedores_1"."CodLoja", "Vendedores_1"."PercentualComissao",
"Vendedores_1"."Ativo", "Comissao_1"."CodComissao", "Comissao_1"."CodVendedor", "Comissao_1"."CodPedido",
"Pedidos_1"."CodPedido", "Pedidos_1"."CodLoja", "Pedidos_1"."CodCliente", "Pedidos_1"."NomeCliente", "Pedidos_1"."EnderecoCliente", "Pedidos_1"."BairroCliente",
"Pedidos_1"."CidadeCliente", "Pedidos_1"."UFCliente", "Pedidos_1"."CEPCliente", "Pedidos_1"."FoneCliente", "Pedidos_1"."Fone2Cliente", "Pedidos_1"."PontoReferenciaCliente",
"Pedidos_1"."DataPedido", "Pedidos_1"."ValorProdutos", "Pedidos_1"."ValorCreditoTroca",
"Pedidos_1"."ValorTotalDoPedido", "Pedidos_1"."Situacao", "Pedidos_1"."Vendeu_Teflon", "Pedidos_1"."ValorTotalTeflon",
"Pedidos_1"."DataVenda", "Pedidos_1"."CodVendedor", "Pedidos_1"."TipoVenda", "Comissao_1"."Valor", "Comissao_1"."DataPagamento", "Comissao_1"."StatusPagamento"
FROM "Comissao", "Pedidos", "Loja" LEFT OUTER JOIN "Vendedores" AS "Vendedores_1" ON "Loja"."CodLoja" = "Vendedores_1"."CodLoja"
LEFT OUTER JOIN "Comissao" AS "Comissao_1" ON "Vendedores_1"."CodVendedor" = "Comissao_1"."CodVendedor" LEFT OUTER JOIN "Pedidos" AS "Pedidos_1" ON "Pedidos_1"."CodPedido" = "Comissao_1"."CodPedido"
WHERE "Loja"."CodLoja" IN (:CodLoja_1) AND "Comissao"."StatusPagamento" = :StatusPagamento_1 AND "Pedidos"."DataPedido" <= :DataPedido_1
Your FROM clause is producing a Cartesian product and includes each table twice, once for filtering the result and once for eagerly loading the relationship.
To stop this use contains_eager instead of joinedload in your options. This will look for the related attributes in the query's columns instead of constructing an extra join. You will also need to explicitly join to the other tables in your query, e.g.:
query = session.query(Store)\
.join(Store.salesmen)\
.join(Store.commissions)\
.join(Store.orders)\
.options(contains_eager('salesmen'),
contains_eager('comissions'),
contains_eager('orders'))\
.filter(Store.store_code.in_(selected_stores))\
.filter(Comissions.payment_status == 0)\
.filter(Order.order_date <= self.dateEdit.date().toPython())

Categories