Django, grouping query items - python

Say I have a model like this:
class Foo(models.Model):
    name = models.CharField("name", max_length=25)
    type = models.IntegerField("type number")
After running a query such as Foo.objects.filter(), I want to group the result like this:
[ [{"name": "jb", "type": "whiskey"}, {"name": "jack daniels", "type": "whiskey"}],
  [{"name": "absolute", "type": "vodka"}, {"name": "smirnoff", "type": "vodka"}],
  [{"name": "tuborg", "type": "beer"}]
]
So, as you can see, the items are grouped into lists of dictionaries. A list of grouped query-result lists instead of dictionaries would also be welcome :)
Regards

You can do this with the values method of a queryset:
http://docs.djangoproject.com/en/1.1/ref/models/querysets/#values-fields
values(*fields)
Returns a ValuesQuerySet -- a QuerySet that evaluates to a list of dictionaries instead of model-instance objects.

You can do the grouping in your view by using itertools.groupby().
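A minimal sketch of the groupby approach, assuming the rows have already been fetched as dictionaries (for example with values("name", "type")). The sample rows and the helper name group_rows are illustrative stand-ins, not part of the original question. Note that groupby only merges adjacent items, so the rows have to be sorted by the grouping key first.

```python
from itertools import groupby
from operator import itemgetter

# Stand-in for list(Foo.objects.values("name", "type")); the data is illustrative.
rows = [
    {"name": "jb", "type": "whiskey"},
    {"name": "absolute", "type": "vodka"},
    {"name": "jack daniels", "type": "whiskey"},
    {"name": "tuborg", "type": "beer"},
    {"name": "smirnoff", "type": "vodka"},
]


def group_rows(rows, key):
    """Return a list of lists, one inner list per distinct value of `key`."""
    by_key = itemgetter(key)
    # groupby only groups consecutive items, so sort on the same key first.
    ordered = sorted(rows, key=by_key)
    return [list(group) for _, group in groupby(ordered, key=by_key)]


grouped = group_rows(rows, "type")
```

Because sorted() is stable, the rows inside each group keep their original relative order.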

Check out the regroup template tag. If you want to do the grouping for display in your template then this should be what you need. Otherwise you can read the source to see how they accomplish the grouping.

You can sort of do this by using order_by:
drinks = Foo.objects.order_by("type")
Now you have a queryset of drinks ordered by type. You could use this directly, or write a function that builds the structure you want in a single linear scan instead of sorting the results yourself.
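One way to build the grouped structure in a single linear pass, without sorting in Python, is to collect items into a dictionary keyed by type. This is a sketch under assumptions: the (name, type) tuples stand in for model instances, and group_in_one_pass is a hypothetical helper name.

```python
from collections import OrderedDict


def group_in_one_pass(items, key):
    """Group items by key(item) in one scan, keeping first-seen group order."""
    groups = OrderedDict()
    for item in items:
        groups.setdefault(key(item), []).append(item)
    return list(groups.values())


# Illustrative (name, type) pairs standing in for Foo instances.
drinks = [
    ("jb", "whiskey"),
    ("absolute", "vodka"),
    ("jack daniels", "whiskey"),
    ("tuborg", "beer"),
]

grouped = group_in_one_pass(drinks, key=lambda drink: drink[1])
```

With real model instances the key function would be something like lambda foo: foo.type.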

Related

Delete duplicate data from a queryset in Django (code not working) [duplicate]

Suppose we have a model in Django defined as follows:
class Literal(models.Model):
    name = models.CharField(...)
    ...
The name field is not unique, and thus can have duplicate values. I need to accomplish the following task:
Select all rows from the model that have at least one duplicate value of the name field.
I know how to do it using plain SQL (maybe not the best solution):
select * from literal where name IN (
    select name from literal group by name having count((name)) > 1
);
So, is it possible to select this using the Django ORM? Or is there a better SQL solution?
Try:
from django.db.models import Count
# The parentheses let the chained calls span several lines.
(Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet with only name and count. However, you can then use this to construct a regular QuerySet by feeding it back into another query:
dupes = (Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
Literal.objects.filter(name__in=[item['name'] for item in dupes])
This was rejected as an edit, so here it is as a better answer:
dups = (
    Literal.objects.values('name')
    .annotate(count=Count('id'))
    .values('name')
    .order_by()
    .filter(count__gt=1)
)
This will return a ValuesQuerySet with all of the duplicate names. However, you can then use this to construct a regular QuerySet by feeding it back into another query. The Django ORM is smart enough to combine these into a single query:
Literal.objects.filter(name__in=dups)
The extra call to .values('name') after the annotate call looks a little strange. Without this, the subquery fails. The extra values tricks the ORM into only selecting the name column for the subquery.
Try using aggregation:
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)
In case you use PostgreSQL, you can do something like this:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value
duplicate_ids = (Literal.objects.values('name')
                 .annotate(ids=ArrayAgg('id'))
                 .annotate(c=Func('ids', Value(1), function='array_length'))
                 .filter(c__gt=1)
                 .annotate(ids=Func('ids', function='unnest'))
                 .values_list('ids', flat=True))
It results in this rather simple SQL query:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
OK, so for some reason none of the above worked for me; it always returned <MultilingualQuerySet []>. I use the following solution, which is much easier to understand but not as elegant:
dupes = []
uniques = []
dupes_query = MyModel.objects.values_list('field', flat=True)
# Iterate over the raw values, not a set of them; a set would never contain a value twice.
for value in dupes_query:
    if value not in uniques:
        uniques.append(value)
    else:
        dupes.append(value)
print(set(dupes))
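The same in-Python check can be sketched with collections.Counter, which avoids the repeated list lookups. This assumes the values have already been pulled out of the database (for instance with values_list('field', flat=True)); the plain list below and the helper name find_duplicates are illustrative.

```python
from collections import Counter


def find_duplicates(values):
    """Return the set of values that occur more than once."""
    counts = Counter(values)
    return {value for value, n in counts.items() if n > 1}


# Stand-in for MyModel.objects.values_list('field', flat=True).
names = ["a", "b", "a", "c", "b", "a"]
dupes = find_duplicates(names)
```

For large tables the database-side GROUP BY / HAVING queries shown earlier are likely to be cheaper, since this version loads every value into memory.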
If you only want the list of names rather than the objects, you can use the following query:
repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat=True)

django: order a QuerySet

I have a view like this:
def profile(request):
    articles = Post.thing.all()
    newSet = set()
    def score():
        for thing in articles:
            Val = some calculation...
            ....
            newSet.update([(thing, Val)])
    score()
    context = {
        'articles': articles.order_by('id'),
        'newSet': newSet,
    }
    return render(request, 'my_profile/my_profile.html', context)
and the outcome is a Queryset which looks like this:
set([(<thing: sfd>, 1), (<thing: quality>, 0), (<thing: hello>, -1), (<thing: hey>, 4), (<thing: test>, 0)])
I am now trying to order the set by the given values, so that it becomes a list starting with the highest value, but when I do newSet.order_by/filter/split/join
it does not work, since a 'set' object has no attribute join/filter/split.
Can anybody give me a hint on how to sort the QuerySet? I could not find anything helpful on my own.
I need this to work in the view, so it cannot/should not be done in the model. Thanks for any advice.
the outcome is a Queryset which looks like this
Actually this is a set (python builtin type), not a QuerySet (django's orm type).
A set is an unordered collection type. To "sort" it you first have to turn it into a list, which is as simple as newlist = list(newSet),
then you can sort the list in place with newlist.sort(). Since you want this list to be sorted on its items' second elements, you'll need to use the key argument to tell sort what to sort on:
newlist.sort(key=lambda item: item[1])
Or you can just change your score() function to store (score, obj) tuples instead, in which case list.sort() will naturally do the RightThing (it will sort your tuples on their first item). Be aware that when two scores are equal the objects themselves get compared, which raises a TypeError on Python 3 if they do not define an ordering.
While we're at it: instead of calling newSet.update() with a list of a single item, you can just use newSet.add() instead, i.e.:
def score():
    for thing in articles:
        val = some calculation...
        ....
        newSet.add((thing, val))
        # or if you don't want to specify a callback
        # for `list.sort()`:
        # newSet.add((val, thing))
And finally: do you really need a set at all here? Why not use a list right from the start?
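The sorting advice above can be sketched as follows. The strings stand in for the model instances from the question, and sorted() is used here as a shorthand for list() followed by sort(); reverse=True is an addition so the highest value comes first, as the question asks.

```python
# Stand-in for the (thing, val) pairs built in the view; strings replace model instances.
new_set = {("sfd", 1), ("quality", 0), ("hello", -1), ("hey", 4), ("test", 0)}

# sorted() accepts any iterable and returns a new list.
# key picks the score out of each tuple; reverse=True puts the highest score first.
ranked = sorted(new_set, key=lambda item: item[1], reverse=True)
```

Items with equal scores end up next to each other, but their relative order is not defined, because a set has no order to preserve.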
I think you might be slightly confused here between a set, a list and a QuerySet? A set is unordered, while a list is not. Sets don't expose the methods you listed above (filter, order_by, split, join). A QuerySet is a Django-specific class which has many useful methods.
I think it would be simpler to make newSet a list of tuples, and then to order that list by value using list.sort(key=lambda x: x[1]).
If your calculation of val is eligible for it though, I'd recommend using annotate and then doing away with newDict or newSet, and simply pass back the queryset of articles, which would be much simpler, maybe faster, and orderable by using articles.order_by('value'). If you post the calculation of val, I'll try to tell you if that's feasible.

Most performant way to retrieve a Django queryset sorted in accordance to a redis sorted set

I have a redis sorted set containing object ids and their respective sorting scores. I need to retrieve a Django query set of these objects, sorted in accordance to the sorted set's scores. What's the most efficient way to do this? My DB is postgresql.
Currently, I'm retrieving the redis sorted set and the unsorted queryset separately. Next, I sort the queryset in accordance to the sorted set's scores in python. Can steps be meaningfully cut here, e.g. somehow sorting the queryset at the point of retrieval, etc.?
obj_ids_with_scr = get_ids_w_scr()  # retrieves redis sorted set with scores
obj_ids = map(itemgetter(0), obj_ids_with_scr)  # filters sorted set for just the obj ids
queryset = Widget.objects.filter(id__in=obj_ids)  # unsorted queryset
a = dict(obj_ids_with_scr)  # turning the sorted set into a dictionary
for obj_pk, sort_score in a.items():
    obj = queryset.get(id=obj_pk)  # get object with id equalling obj_pk
    a[obj_pk] = obj  # replace the score stored under this key with the object
result = a.values()  # making a list of all values, that are now in sorted order
return result
If you are using postgres, you can simulate mysql's order by field syntax that is discussed here by defining your own field function in postgres:
CREATE FUNCTION field(anyelement, VARIADIC anyarray) RETURNS numeric AS $$
  SELECT
    COALESCE(
      ( SELECT i FROM generate_subscripts($2, 1) gs(i)
        WHERE $2[i] = $1 ),
      0);
$$ LANGUAGE SQL STABLE
And then calling with something like:
queryset = queryset.extra(
    select={'manual': "FIELD(id, %s)" % (','.join(map(str, obj_ids)),)},
    order_by=['manual']
)
N.B. You can also optimize your Python code like so, if you choose:
widgets = Widget.objects.in_bulk(obj_ids)
sorted_widgets = [widgets[x] for x in obj_ids if x in widgets]
in_bulk returns a dict mapping ids to Widget instances, which is useful for the type of sorting you are doing. You can read more about it here.
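The reordering step can be tried without a database. In this sketch a plain dict plays the role of the mapping returned by in_bulk, and the ids, values and helper name order_by_ids are made up for illustration.

```python
def order_by_ids(objects_by_id, ordered_ids):
    """Return the objects in the order given by ordered_ids, skipping missing ids."""
    return [objects_by_id[pk] for pk in ordered_ids if pk in objects_by_id]


# Hypothetical ids as they might come out of the redis sorted set, best first.
ranked_ids = [7, 3, 9, 1]

# Stand-in for Widget.objects.in_bulk(ranked_ids); id 9 no longer exists in the database.
fetched = {1: "widget-1", 3: "widget-3", 7: "widget-7"}

result = order_by_ids(fetched, ranked_ids)
```

Each lookup is a dictionary access, so the whole reordering is linear in the number of ids, and ids that are present in redis but missing from the database are dropped silently.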

Check if queryset already exists

I have a list of querysets such as ([qSet1, qSet2, qSet3],[qSet3, qSet2],[qSet1, qSet3])
Then I want to add another group of querysets, but only if it does not already exist in the list. Groups can have the same content in a different order: [qSet1, qSet2], [qSet2, qSet1]. Such groups must be considered the same and must not be added twice.
How can I do this?
To check for sameness for a set of QuerySets (pun intended), you can make use of the underlying SQL query obtained from the .query attribute which Django provides on QuerySets. Converting that attribute with str() gives the SQL as a string.
Say you have a list of tuples of QuerySets:
list_of_querysets = [(qSet1, qSet2, qSet3), (qSet3, qSet2), (qSet1, qSet3)]
To check if a new arbitrary tuple qsets_arb = (qSet_arb1, qSet_arb2) already has a match in the list, you would do:
def get_sql(tuple_of_querysets):
    '''Return a set of SQL statements from a tuple of QuerySets'''
    return set([str(queryset.query) for queryset in tuple_of_querysets])

# the ordering of items in a set does not matter: set([q1, q2]) == set([q2, q1])
if get_sql(qsets_arb) in map(get_sql, list_of_querysets[:]):
    print("This tuple of QuerySets has already been included")
else:
    list_of_querysets.append(qsets_arb)
That should pretty much do what you want.
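A self-contained sketch of the same idea that runs without Django. FakeQuerySet is a hypothetical stand-in exposing only a query attribute, and sql_signature and add_if_new are illustrative helper names. It uses frozenset instead of set so the signatures are hashable and can be collected in a set for fast membership tests.

```python
class FakeQuerySet:
    """Hypothetical stand-in for a Django QuerySet; only `query` is modelled."""

    def __init__(self, sql):
        self.query = sql


def sql_signature(querysets):
    """Order-independent fingerprint of a group of querysets."""
    return frozenset(str(qs.query) for qs in querysets)


def add_if_new(groups, candidate):
    """Append candidate to groups unless an equivalent group is already present."""
    seen = {sql_signature(group) for group in groups}
    if sql_signature(candidate) in seen:
        return False
    groups.append(candidate)
    return True


q1 = FakeQuerySet("SELECT * FROM a")
q2 = FakeQuerySet("SELECT * FROM b")
q3 = FakeQuerySet("SELECT * FROM c")

groups = [(q1, q2, q3), (q3, q2)]
added_reordered = add_if_new(groups, (q2, q3))  # same content as (q3, q2)
added_new = add_if_new(groups, (q1, q3))
```

As with the answer above, two querysets count as the same only if they generate identical SQL text.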
If what you want to compare is the content of the lists, and you do not want order to make a difference, you should use a set:
set([qSet1, qSet2]) == set([qSet2, qSet1])
This is assuming that the same queryset is not twice in the same list.

How can I filter by key, or keys, a query in Python for Google App Engine?

I have a query and I can apply filters on them without any problem. This works fine:
query.filter('foo =', 'bar')
But what if I want to filter my query by key or a list of keys?
I have them as Key() property or as a string and by trying something like this, it didn't work:
query.filter('key =', 'some_key') #no success
query.filter('key IN', ['key1', 'key2']) #no success
Whilst it's possible to filter on key (see @dplouffe's answer), it's not a good idea. 'IN' clauses execute one query for each item in the clause, so you end up doing as many queries as there are keys, which is a particularly inefficient way to achieve your goal.
Instead, use a batch fetch operation, as @Luke documents, then filter any elements you don't want out of the list in your code.
You can filter queries by doing a GQL Query like this:
result = db.GqlQuery('select * from Model where __key__ IN :1', [db.Key.from_path('Model', 'Key1'), db.Key.from_path('Model', 'Key2')]).fetch(2)
or
result = Model.get([db.Key.from_path('Model', 'Key1'), db.Key.from_path('Model', 'Key2')])
You cannot filter on a Key. Oops, I was wrong about that. You can filter on a key and other properties at the same time if you have an index set up to handle it. It would look like this:
key = db.Key.from_path('MyModel', 'keyname')
MyModel.all().filter("__key__ =", key).filter('foo = ', 'bar')
You can also look up a number of models by their keys, key IDs, or key names with the get family of methods.
# if you have the key already, or can construct it from its path
models = MyModel.get(Key.from_path(...), ...)
# if you have keys with names
models = MyModel.get_by_key_name('asdf', 'xyz', ...)
# if you have keys with IDs
models = MyModel.get_by_id(123, 456, ...)
You can fetch many entities this way. I don't know the exact limit. If any of the keys doesn't exist, you'll get a None in the list for that entity.
If you need to filter on some property as well as the key, you'll have to do that in two steps. Either fetch by the keys and check for the property, or query on the property and validate the keys.
Here's an example of filtering after fetching. Note that you don't use the Query class's filter method. Instead just filter the list.
models = MyModel.get_by_key_name('asdf', ...)
filtered = itertools.ifilter(lambda x: x.foo == 'bar', models)
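Since a batch fetch returns None for keys that do not exist, the filter also has to skip those entries before touching any property. A minimal sketch, using SimpleNamespace objects as stand-ins for fetched entities (the property names foo and name and the helper filter_fetched are illustrative):

```python
from types import SimpleNamespace

# Stand-in for the list returned by a batch fetch; None marks a missing key.
fetched = [
    SimpleNamespace(name="a", foo="bar"),
    None,
    SimpleNamespace(name="b", foo="baz"),
    SimpleNamespace(name="c", foo="bar"),
]


def filter_fetched(entities, predicate):
    """Keep the entities that exist and satisfy predicate."""
    return [e for e in entities if e is not None and predicate(e)]


matching = filter_fetched(fetched, lambda e: e.foo == "bar")
```

A list comprehension is used here instead of itertools.ifilter because ifilter only exists on Python 2.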
Have a look at: https://developers.google.com/appengine/docs/python/ndb/entities?hl=de#multiple
list_of_entities = ndb.get_multi(list_of_keys)
