Sort a list of dictionaries by value - python

I have a list of dictionaries, of the form:
neighbour_list = [{1:4}, {3:5}, {4:9}, {5:2}]
I need to sort the list in order of the dictionary with the largest value. So, for the above code the sorted list would look like:
sorted_list = [{4:9}, {3:5}, {1:4}, {5:2}]
Each dictionary within the list only has one mapping.
Is there an efficient way to do this? Currently I am looping through the list to get the biggest value, then remembering where it was found to return the largest value, but I'm not sure how to extend this to be able to sort the entire list.
Would it just be easier to implement my own dict class?
EDIT: here is my code for returning the dictionary which should come 'first' in an ideally sorted list.
temp = 0
element = 0
for d in list_of_similarities:
    for k in d:
        if (d[k] > temp):
            temp = d[k]
            element = k
            dictionary = d
first = dictionary[element]

You can use an anonymous function as your sorting key to pull out the dict value (not sure if I've done this the most efficient way, though):
sorted(neighbour_list, key = lambda x: tuple(x.values()), reverse=True)
[{4: 9}, {3: 5}, {1: 4}, {5: 2}]
Note we need to coerce x.values() to a tuple, since in Python 3, x.values() is of type "dict_values" which is unorderable. I guess the idea is that a dict is more like a set than a list (hence the curly braces), and there's no (well-) ordering on sets; you can't use the usual lexicographic ordering since there's no notion of "first element", "second element", etc.
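The point about dict_values can be checked directly. A minimal sketch, assuming Python 3 and the question's sample data (the names a, b, views_are_orderable and by_value are only illustrative):

```python
neighbour_list = [{1: 4}, {3: 5}, {4: 9}, {5: 2}]

a, b = neighbour_list[0], neighbour_list[1]

# In Python 3, dict_values objects do not define an ordering,
# so comparing them with "<" raises TypeError.
try:
    a.values() < b.values()
    views_are_orderable = True
except TypeError:
    views_are_orderable = False

# Tuples compare element by element, so they work as sort keys.
by_value = sorted(neighbour_list, key=lambda x: tuple(x.values()), reverse=True)

print(views_are_orderable)  # False
print(by_value)             # [{4: 9}, {3: 5}, {1: 4}, {5: 2}]
```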

You could use list.sort with the dict values as the key (this works as written in Python 2, where x.values() returns a list):
neighbour_list.sort(key=lambda x: x.values(), reverse=1)
Since each dict has only one value, in Python 2 you can just call next on itervalues to get the first and only value:
neighbour_list.sort(key=lambda x: next(x.itervalues()), reverse=1)
print(neighbour_list)
In Python 3, you cannot call next directly on dict.values(), because it returns a view rather than an iterator; wrap it in iter() first:
neighbour_list.sort(key=lambda x: next(iter(x.values())), reverse=1)
Alternatively, in Python 3 you have to call list on dict.values() to get a key that can be compared:
neighbour_list.sort(key=lambda x: list(x.values()), reverse=1)
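For Python 3 only, the variants above reduce to a short sketch. The helper name only_value is hypothetical, and the check that each dictionary has exactly one mapping is an added assumption taken from the question, not part of the answer:

```python
def only_value(d):
    """Return the single value of a one-item dict (hypothetical helper)."""
    if len(d) != 1:
        raise ValueError("expected exactly one mapping, got %d" % len(d))
    return next(iter(d.values()))

neighbour_list = [{1: 4}, {3: 5}, {4: 9}, {5: 2}]
neighbour_list.sort(key=only_value, reverse=True)  # sorts in place

print(neighbour_list)  # [{4: 9}, {3: 5}, {1: 4}, {5: 2}]
```

Using a named function instead of a lambda makes the single-mapping assumption explicit and fails loudly if a dictionary ever holds more than one entry.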

Related

Sorting list of dictionaries with different keys by value

I have a list of dictionaries with one key and one value only. Keys are always different, value is float number. How do I sort it by value?
example_list = [{'c47-d75 d75-e6b e6b-ff1 ff1-6d6 6d6-e63 e63-80c': 292.1799470129255}, {'805-7fd': 185.56518334219}, {'805-dd3 dd3-088 088-dd3 dd3-80c': 368.5010685728143}, {'805-6b5': 145.897977770909}, {'77e-805 805-7fd': 326.693786870932}, {'323-83d': 131.71963170528}]
The result should be sorted by value so the first item should be
{'805-dd3 dd3-088 088-dd3 dd3-80c': 368.5010685728143}
Could you please help?
To sort the list based on the first value of the dictionary, pass a function to sorted() to extract the first value of the dictionary.
d.values() returns an iterable of the dictionary values
iter() generates an iterator of the dictionary values
Since there is only 1 dictionary value (by assumption), calling next() on the iterator returns the first (and only) value.
To sort by greatest to least value, pass the reverse=True keyword argument to sorted().
def first_value(d):
    return next(iter(d.values()))
example_list = [{'c47-d75 d75-e6b e6b-ff1 ff1-6d6 6d6-e63 e63-80c': 292.1799470129255}, {'805-7fd': 185.56518334219}, {'805-dd3 dd3-088 088-dd3 dd3-80c': 368.5010685728143}, {'805-6b5': 145.897977770909}, {'77e-805 805-7fd': 326.693786870932}, {'323-83d': 131.71963170528}]
sorted_list = sorted(example_list, key=first_value, reverse=True)
print(sorted_list[0])
Here's a slightly different approach:
>>> for d in sorted(example_list, key=lambda d: max(d.values()), reverse=True):
...     print(d)
...
{'805-dd3 dd3-088 088-dd3 dd3-80c': 368.5010685728143}
{'77e-805 805-7fd': 326.693786870932}
{'c47-d75 d75-e6b e6b-ff1 ff1-6d6 6d6-e63 e63-80c': 292.1799470129255}
{'805-7fd': 185.56518334219}
{'805-6b5': 145.897977770909}
{'323-83d': 131.71963170528}
This sorts by using the max() function on the values associated with each dictionary. Since there's only one value in each dictionary, it just returns that value.
Honest question, why have a list of dictionaries instead of a single dictionary with key-value pairs?
You can create a new dictionary sorted by values as follows:
{k: v for k, v in sorted(example_dict.items(), key=lambda item: item[1], reverse = True)}
Since you want the values sorted in descending order you should keep the parameter reverse = True, else for ascending order set it to False.
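Note that example_dict is not defined in the question. A sketch of how this suggestion might be applied, assuming the single-entry dictionaries are first merged into one dictionary (reasonable here because the keys are stated to be always different) and assuming Python 3.7+, where dicts preserve insertion order. The shortened keys and values below are made up for illustration:

```python
example_list = [{'a-b': 292.18}, {'c-d': 185.57}, {'e-f': 368.50}]

# Merge the one-item dicts into a single dict.
example_dict = {k: v for d in example_list for k, v in d.items()}

# Rebuild the dict in descending order of value.
sorted_dict = {k: v for k, v in sorted(example_dict.items(),
                                       key=lambda item: item[1],
                                       reverse=True)}

print(list(sorted_dict.items()))
# [('e-f', 368.5), ('a-b', 292.18), ('c-d', 185.57)]
```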

getting minimum (key, value) in a list that holds a dictionary

I have a list that holds a dictionary row like this:
queue = [{1: 0.39085439023582913, 2: 0.7138416909634645, 3: 0.9871959077954673}]
I'm trying to get it to return the smallest value along with its key, so in this case it would return
1,0.39085439023582913
I've tried using
min(queue, key=lambda x:x[1])
but that just returns the whole row, like this:
{1: 0.39085439023582913, 2: 0.7138416909634645, 3: 0.9871959077954673}
Any suggestions? Thank you!
If you want the min for each dict in the list, you can use the following list comprehension:
[min(d.items(), key=lambda x: x[1]) for d in queue]
Which for your example returns:
[(1, 0.39085439023582913)]
d.items() returns the (key, value) pairs of the dictionary d as tuples. min() then picks the pair with the smallest value (x[1] in this case); no sorting is involved.
If you always have your data in the form of a list with one dictionary, you could also call .items() on the first element of queue and find the min:
print(min(queue[0].items(), key=lambda x:x[1]))
#(1, 0.39085439023582913)
Yes, min(queue) finds the minimum element of the list, not the minimum inside the enclosed dictionary.
Do this:
key_of_min_val = min(queue[0], key=queue[0].get)
print((key_of_min_val , queue[0][key_of_min_val]))
You can do it with
mykey = min(queue[0], key=queue[0].get)
Then use this key to get the value from the dictionary:
mykey, queue[0][mykey]
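If the list may hold more than one dictionary, the same idea extends to finding the smallest value overall. A sketch using operator.itemgetter in place of the lambda; the rounded values and the second dictionary in queue are made-up data:

```python
from operator import itemgetter

queue = [
    {1: 0.39, 2: 0.71, 3: 0.99},
    {4: 0.12, 5: 0.88},  # hypothetical extra row
]

# Minimum (key, value) pair within each dict.
per_dict = [min(d.items(), key=itemgetter(1)) for d in queue]

# Minimum (key, value) pair across all dicts.
overall = min((item for d in queue for item in d.items()), key=itemgetter(1))

print(per_dict)  # [(1, 0.39), (4, 0.12)]
print(overall)   # (4, 0.12)
```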

Sort list of dictionaries based on keys

I want to sort a list of dictionaries based on the presence of keys. Let's say I have a list of keys [key2,key3,key1], I need to order the list in such a way the dictionary with key2 should come first, key3 should come second and with key1 last.
I saw this answer (Sort python list of dictionaries by key if key exists) but it refers to only one key
The sorting is not based on value of the 'key'. It depends on the presence of the key and that too with a predefined list of keys.
Just use sorted using a list like [key1 in dict, key2 in dict, ...] as the key to sort by. Remember to reverse the result, since True (i.e. key is in dict) is sorted after False.
>>> dicts = [{1:2, 3:4}, {3:4}, {5:6, 7:8}]
>>> keys = [5, 3, 1]
>>> sorted(dicts, key=lambda d: [k in d for k in keys], reverse=True)
[{5: 6, 7: 8}, {1: 2, 3: 4}, {3: 4}]
This is using all the keys to break ties, i.e. in above example, there are two dicts that have the key 3, but one also has the key 1, so this one is sorted second.
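To see why this works, it may help to print the sort key for each dictionary. Lists of booleans compare element by element and True sorts after False, so with reverse=True a dictionary containing an earlier entry of keys comes first. The helper name presence is only illustrative:

```python
dicts = [{1: 2, 3: 4}, {3: 4}, {5: 6, 7: 8}]
keys = [5, 3, 1]

def presence(d):
    # One boolean per wanted key, in priority order.
    return [k in d for k in keys]

for d in dicts:
    print(d, presence(d))
# {1: 2, 3: 4} [False, True, True]
# {3: 4} [False, True, False]
# {5: 6, 7: 8} [True, False, False]

result = sorted(dicts, key=presence, reverse=True)
print(result)  # [{5: 6, 7: 8}, {1: 2, 3: 4}, {3: 4}]
```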
I'd do this with:
sorted_list = sorted(dict_list, key = lambda d: next((i for (i, k) in enumerate(key_list) if k in d), len(key_list) + 1))
That uses a generator expression to find the index in the key list of the first key that's in each dictionary, then use that value as the sort key, with dicts that contain none of the keys getting len(key_list) + 1 as their sort key so they get sorted to the end.
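A runnable sketch of that one-liner, with the lambda pulled out into a named function for readability. The sample key_list and dict_list are assumptions made up for illustration:

```python
key_list = ['key2', 'key3', 'key1']
dict_list = [{'key1': 1}, {'key3': 3}, {'other': 0}, {'key2': 2}]

def first_match_index(d):
    # Index of the first key in key_list that d contains;
    # dicts with none of the keys get a value past the end.
    return next((i for i, k in enumerate(key_list) if k in d),
                len(key_list) + 1)

sorted_list = sorted(dict_list, key=first_match_index)
print(sorted_list)
# [{'key2': 2}, {'key3': 3}, {'key1': 1}, {'other': 0}]
```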
How about something like this
def sort_key(dict_item, sort_list):
    # Iterating a dict yields its keys (works in Python 2 and 3, unlike iterkeys()).
    key_idx = [sort_list.index(key) for key in dict_item if key in sort_list]
    if not key_idx:
        return len(sort_list)
    return min(key_idx)
dict_list.sort(key=lambda x: sort_key(x, sort_list))
If a given dictionary in the list contains more than one of the keys in the sorting list, it will use the one with the lowest index. If none of the dictionary's keys are present in the sorting list, the dictionary is sent to the end of the list.
Dictionaries that contain the same "best" key (i.e. lowest index) are considered equal in terms of order. If this is a problem, it wouldn't be too hard to have the sort_key function consider all the keys rather than just the best.
To do that, simply return the whole key_idx instead of min(key_idx), and return [len(sort_list)] instead of len(sort_list).
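A sketch of that variant, under one added assumption: the indices are sorted before being returned, so that the best (lowest) index is always compared first regardless of the order in which the dictionary stores its keys. The name sort_key_all and the sample data are hypothetical:

```python
def sort_key_all(dict_item, sort_list):
    # All matching indices, lowest first, so ties on the best key
    # are broken by the next-best key.
    key_idx = sorted(sort_list.index(key) for key in dict_item if key in sort_list)
    if not key_idx:
        return [len(sort_list)]
    return key_idx

sort_list = ['key2', 'key3', 'key1']
dict_list = [{'key1': 1}, {'key2': 1, 'key1': 1}, {'x': 1}, {'key2': 1, 'key3': 1}]
dict_list.sort(key=lambda d: sort_key_all(d, sort_list))

print(dict_list)
# [{'key2': 1, 'key3': 1}, {'key2': 1, 'key1': 1}, {'key1': 1}, {'x': 1}]
```

Because lists compare element by element, a dictionary holding only key2 (key [0]) would sort before one holding key2 and key3 (key [0, 1]).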

Sorting nested python dictionary by label

I'm trying to find a smart way to sort the following data structure by std:
{'4555':{'std':5656, 'var': 5664}, '5667':{'std':5656, 'var': 5664}}
Ideally I'd like to have a sorted dictionary (bad, I know) or a list of sorted tuples, but I don't know how to get at the 'std' part in my lambda expression. I'm trying the following, but how should I get at the 'std' bit in a smart manner? I want it to give a list of tuples, each containing the index and its std value, such as [(4555, 5656), (5667, 5656)].
sorted_list = sorted(sd_dict.items(), key=lambda x:x['std'])
Since sd_dict.items() returns a list of tuples, you no longer can access the elements as if it was a dictionary in the key function. Instead, the key function gets a two-element tuple with the first element being the key and the second element being the value. So to get the std value, you need to access it like this:
lambda x: x[1]['std']
But since in your example both values are identical, you don’t actually change anything:
>>> list(sorted(sd_dict.items(), key=lambda x: x[1]['std']))
[('5667', {'var': 5664, 'std': 5656}), ('4555', {'var': 5664, 'std': 5656})]
And if you just want a pair of the outer dictionary key and the std value, then you should use a list comprehension first to transform the dictionary values:
>>> lst = [(key, value['std']) for key, value in sd_dict.items()]
>>> lst.sort(key=lambda x: x[1])
>>> lst
[('5667', 5656), ('4555', 5656)]
Or maybe you want to include an int conversion, and also sort by the key too:
>>> lst = [(int(key), value['std']) for key, value in sd_dict.items()]
>>> lst.sort(key=lambda x: (x[1], x[0]))
>>> lst
[(4555, 5656), (5667, 5656)]
Each element in your input list is a tuple of (k, dictionary) items, so you need to index into that to get to the std key in the dictionary value:
sorted(sd_dict.items(), key=lambda i: i[1]['std'])
If you wanted a tuple of just the key and the std value from the dictionary, you need to pick those out; it doesn't matter if you do so before or after sorting, just adjust your sort key accordingly
sorted([(int(k), v['std']) for k, v in sd_dict.items()], key=lambda i: i[1])
or
[(int(k), v['std']) for k, v in sorted(sd_dict.items(), key=lambda i: i[1]['std'])]
However, extracting the std values just once instead of both for sorting and for extraction is going to be faster.
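Because both std values in the question are identical, the effect of the sort is hard to see there. A sketch with made-up, distinct values that extracts std once and then sorts the resulting pairs:

```python
# Hypothetical data with distinct std values.
sd_dict = {
    '4555': {'std': 70, 'var': 4900},
    '5667': {'std': 12, 'var': 144},
    '1234': {'std': 35, 'var': 1225},
}

# Extract (int key, std) once, then sort the pairs by std.
pairs = sorted(((int(k), v['std']) for k, v in sd_dict.items()),
               key=lambda pair: pair[1])

print(pairs)  # [(5667, 12), (1234, 35), (4555, 70)]
```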

Dict comprehension, tuples and lazy evaluation

I am trying to see if I can pull off something quite lazy in Python.
I have a dict comprehension, where the value is a tuple. I want to be able to create the second entry of the tuple by using the first entry of the tuple.
An example should help.
dictA = {'a': 1, 'b': 3, 'c': 42}
{key: (a = someComplexFunction(value), moreComplexFunction(a)) for key, value in dictA.items()}
Is it possible that the moreComplexFunction uses the calculation in the first entry of the tuple?
You could add a second loop over a one-element tuple:
{key: (a, moreComplexFunction(a)) for key, value in dictA.items()
      for a in (someComplexFunction(value),)}
This gives you access to the output of someComplexFunction(value) in the value expression, but that's rather ugly.
Personally, I'd move to a regular loop in such cases:
dictB = {}
for key, value in dictA.items():
    a = someComplexFunction(value)
    dictB[key] = (a, moreComplexFunction(a))
and be done with it.
Or, you could just write a function to return the tuple:
def kv_tuple(a):
    tmp = someComplexFunction(a)
    # First entry is the computed value, second entry reuses it.
    return (tmp, moreComplexFunction(tmp))
{key: kv_tuple(value) for key, value in dictA.items()}
This also gives you the option to use things like namedtuple to get names for the tuple items, etc. I don't know how much faster or slower this would be, though; the regular loop is likely to be faster (fewer function calls).
Alongside Martijn's answer, using a generator expression and a dict comprehension is also quite semantic and lazy:
dictA = { ... } # Your original dict
partially_computed = ((key, someComplexFunction(value))
                      for key, value in dictA.items())
dictB = {key: (a, moreComplexFunction(a)) for key, a in partially_computed}
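On Python 3.8 or later, an assignment expression (the := operator from PEP 572) is another option that is close to the syntax the question attempted. This is not one of the answers above, only a sketch; some_complex_function and more_complex_function are stand-ins for the question's functions, and the calls list exists only to show that the first function runs once per item:

```python
calls = []

def some_complex_function(value):
    calls.append(value)  # record each call to show it runs once per item
    return value * 2

def more_complex_function(a):
    return a + 1

dictA = {'a': 1, 'b': 3, 'c': 42}

# Tuple elements are evaluated left to right, so `a` is bound
# before more_complex_function(a) is evaluated.
dictB = {key: (a := some_complex_function(value), more_complex_function(a))
         for key, value in dictA.items()}

print(dictB)       # {'a': (2, 3), 'b': (6, 7), 'c': (84, 85)}
print(len(calls))  # 3
```

One trade-off: a name bound with := inside a comprehension is bound in the enclosing scope, so a remains defined after the comprehension finishes.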
