Sort dictionary by value in python


This is my dictionary:
d = {'jan': 50, 'feb': 30, 'march': 60, 'april': 50, 'may': 50, 'june': 60, 'july': 20}
I am expecting output like this:
d = {'jan': 50, 'april': 50, 'may': 50, 'march': 60, 'june': 60, 'feb': 30, 'july': 20}
When I run this program, I get a different output than expected:
d = {'jan': 50, 'feb': 30, 'march': 60, 'april': 50, 'may': 50, 'june': 60, 'july': 20}
sortlist = sorted(d, key=d.get)
print(sortlist)

You can start by counting the number of times each value appears with collections.Counter:
from collections import Counter
c = Counter(d.values())
# Counter({20: 1, 30: 1, 50: 3, 60: 2})
Now sort the dictionary's items, using a key function in sorted that looks up how many times each value appears:
res = sorted(d.items(), key=lambda x: c[x[1]], reverse=True)
print(res)
[('jan', 50), ('april', 50), ('may', 50), ('march', 60), ('june', 60),
 ('feb', 30), ('july', 20)]
Note, however, that if you build a plain dictionary from the result, the order is not guaranteed to be maintained before Python 3.7, where dictionaries were unordered.
So one thing you can do is use collections.OrderedDict to keep the order: simply call OrderedDict(res) on the resulting list of tuples.
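On Python 3.7 and later, plain dicts keep insertion order, so the sorted pairs can go straight back into dict. A minimal sketch, assuming that Python version; the names counts and by_frequency are illustrative and not part of the answer above:

```python
from collections import Counter

d = {'jan': 50, 'feb': 30, 'march': 60, 'april': 50, 'may': 50, 'june': 60, 'july': 20}

# Count how often each value occurs, e.g. 50 appears three times.
counts = Counter(d.values())

# sorted() is stable, so months whose values are equally frequent keep their original order.
# Assumes Python 3.7+, where dict preserves insertion order.
by_frequency = dict(sorted(d.items(), key=lambda item: counts[item[1]], reverse=True))
print(by_frequency)
```

On older interpreters the same sorted(...) call can be passed to OrderedDict instead.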

d = {'january': 500, 'feb': 600, 'march': 300, 'april': 500, 'may': 500, 'june': 600, 'july': 200}
from collections import defaultdict
from collections import OrderedDict
count_dict = defaultdict(int)
for key, value in d.items():
    count_dict[value] += 1
First, we count the occurrences of each value. Counter can be used instead of defaultdict. Then we sort the items according to the count_dict lookup table we just created.
sorted_dict = OrderedDict(sorted(d.items(), key=lambda item: count_dict[item[1]], reverse=True))
print(sorted_dict)
>>> OrderedDict([('january', 500), ('april', 500), ('may', 500), ('feb', 600), ('june', 600), ('march', 300), ('july', 200)])
Update: You can create count_dict with Counter like this:
from collections import Counter
count_dict = Counter(d.values())

Related

Efficient grouping into dict

I have a list of tuples:
[('Player1', 'A', 1, 100),
('Player1', 'B', 15, 100),
('Player2', 'A', 7, 100),
('Player2', 'B', 65, 100),
('Global Total', None, 88, 100)]
Which I wish to convert to a dict in the following format:
{
'Player1': {
'A': [1, 12.5],
'B': [15, 18.75],
'Total': [16, 18.18]
},
'Player2': {
'A': [7, 87.5],
'B': [65, 81.25],
'Total': [72, 81.81]
},
'Global Total': {
'A': [8, 100],
'B': [80, 100]
}
}
So each Player dict has its local total value and its percentage relative to the global total value.
Currently, I do it like this:
fixed_vals = {}
for name, status, qtd, prct in data_set:  # This is the list of tuples var
    if name in fixed_vals:
        fixed_vals[name].update({status: [qtd, prct]})
    else:
        fixed_vals[name] = {status: [qtd, prct]}
fixed_vals['Global Total']['Total'] = fixed_vals['Global Total'].pop(None)
total_a = 0
for k, v in fixed_vals.items():
    if k != 'Global Total':
        total_a += v['A'][0]
fixed_vals['Global Total']['A'] = [
    total_a, total_a * 100 / fixed_vals['Global Total']['Total'][0]
]
fixed_vals['Global Total']['B'] = [
    fixed_vals['Global Total']['Total'][0] - total_a,
    fixed_vals['Global Total']['Total'][0] - fixed_vals['Global Total']['A'][1]
]
for player, vals in fixed_vals.items():
    if player != 'Global Total':
        vals['A'][1] = vals['A'][0] * 100 / fixed_vals['Global Total']['A'][0]
        vals['B'][1] = fixed_vals['Global Total']['A'][1] - vals['B'][1]
The problem is that this is not very flexible, since I have to do something similar
with almost 12 categories (A, B, ...).
Is there a better approach to this? Perhaps this is trivial with pandas?
Edit for clarification:
There are no duplicate categories for each Player; every one of them has the same sequence of categories (some might have 0, but each category is unique).

Everyone seems attracted to a dict-only solution, but why not try converting to pandas?
import pandas as pd
# given
tuple_list = [('Player1', 'A', 1, 100),
              ('Player1', 'B', 15, 100),
              ('Player2', 'A', 7, 100),
              ('Player2', 'B', 65, 100),
              ('Global Total', None, 88, 100)]
# make a dataframe
df = pd.DataFrame(tuple_list, columns=['player', 'game', 'score', 'pct'])
del df['pct']
df = df[df.player != 'Global Total']
df = df.pivot(index='player', columns='game', values='score')
df.columns.name = ''
df.index.name = ''
# just a check
assert df.to_dict() == {'A': {'Player1': 1, 'Player2': 7},
                        'B': {'Player1': 15, 'Player2': 65}}
#          A   B
# player
# Player1  1  15
# Player2  7  65
print('Obtained dataset:\n', df)
Basically, all you need is the df dataframe; the rest you can
compute and add later, with no need to save it to a dictionary.
Below is an update at the OP's request:
# the sum across columns is this - this was the 'Grand Total' in the dicts
# A 8
# B 80
sum_col = df.sum(axis=0)
# let's calculate the share of each player's score:
shares = df / df.sum(axis=0) * 100
assert shares.transpose().to_dict() == {'Player1': {'A': 12.5, 'B': 18.75},
                                        'Player2': {'A': 87.5, 'B': 81.25}}
# in 'shares' the columns add to 100%:
#              A      B
# player
# Player1  12.50  18.75
# Player2  87.50  81.25
# let's mix up a dataframe close to the original dictionary structure
mixed_df = pd.concat([df.A, shares.A, df.B, shares.B], axis=1)
mixed_df.columns = ['A', 'A_pct', 'B', 'B_pct']
totals = mixed_df.sum(axis=0)
totals.name = 'Total'
# DataFrame.append was removed in pandas 2.0, so concatenate the totals as a one-row frame
mixed_df = pd.concat([mixed_df, totals.to_frame().T])
print('\nProducing some statistics\n', mixed_df)

One solution would be to use groupby to group consecutive scores from the same player:
tup = [('Player1', 'A', 1, 100), ('Player1', 'B', 15, 100), ('Player2', 'A', 7, 100), ('Player2', 'B', 65, 100), ('Global Total', None, 88, 100)]
Then import groupby and build the nested dict:
from itertools import groupby
result = dict((name, dict((x[1], x[2:]) for x in values)) for name, values in groupby(tup, lambda x: x[0]))
Then just go and update all the totals:
for key in result:
    if key == "Global Total":
        continue  # skip this one ...
    # sum up our player scores
    result[key]['total'] = [sum(col) for col in zip(*result[key].values())]
# you can print the results too
print(result)
# {'Player1': {'A': (1, 100), 'B': (15, 100), 'total': [16, 200]}, 'Player2': {'A': (7, 100), 'B': (65, 100), 'total': [72, 200]}, 'Global Total': {None: (88, 100)}}
NOTE: This solution REQUIRES that all of Player1's scores are adjacent in your list of tuples, that all of Player2's scores are adjacent, and so on.
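If the rows are not already adjacent per player, sorting them by name first satisfies that requirement. A small sketch with the sample rows deliberately shuffled; the names rows, rows_sorted and grouped are illustrative only:

```python
from itertools import groupby
from operator import itemgetter

# Same player rows as above, shuffled so one player's rows are not adjacent.
rows = [('Player2', 'A', 7, 100), ('Player1', 'A', 1, 100),
        ('Player2', 'B', 65, 100), ('Player1', 'B', 15, 100)]

# groupby only merges consecutive items with equal keys, so sort by the key first.
rows_sorted = sorted(rows, key=itemgetter(0))

grouped = {name: {status: (qty, pct) for _, status, qty, pct in group}
           for name, group in groupby(rows_sorted, key=itemgetter(0))}
print(grouped)
```

Without the sort, a later run of rows for the same player would overwrite the earlier group in the resulting dict.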

A) Break your code up into manageable chunks:
from collections import defaultdict
result = defaultdict(dict)
for (cat, sub, num, percent) in input_list:
    result[cat][sub] = [num, percent]
Now we have a dict with the player counts, but the only valid percentages are for total and we don't have global counts.
from collections import Counter
def build_global(dct):
    keys = Counter()
    for key in dct:
        if key == "Global Total":
            continue
        for sub_key in dct[key]:
            keys[sub_key] += dct[key][sub_key][0]
    for key in keys:
        dct["Global Total"][key] = [keys[key], 100]
build_global(result) now yields valid global counts for each event.
Finally:
def calc_percent(dct):
    totals = dct["Global Total"]
    for key in dct:
        local_total = 0
        if key == "Global Total":
            continue
        for sub_key in dct[key]:
            local_total += dct[key][sub_key][0]
            dct[key][sub_key][1] = (dct[key][sub_key][0]/float(totals[sub_key][0])) * 100
        dct[key]['Total'] = [local_total, (local_total/float(dct['Global Total'][None][0])) * 100]
calc_percent(result) goes through and builds the percentages.
result is then:
defaultdict(<type 'dict'>,
{'Player2': {'A': [7, 87.5], 'B': [65, 81.25], 'Total': [72, 81.81818181818183]},
'Player1': {'A': [1, 12.5], 'B': [15, 18.75], 'Total': [16, 18.181818181818183]},
'Global Total': {'A': [8, 100], None: [88, 100], 'B': [80, 100]}})
If you need it exactly as specified, you can delete the None entry in Global Total and call dict(result) to convert the defaultdict into a plain dict.
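The same steps can be folded into one function that never names a category, which may help with the "almost 12 categories" case from the question. This is only a sketch: summarize is a hypothetical helper, it recomputes the global figures from the player rows rather than trusting the 'Global Total' row, and it assumes every category total is non-zero.

```python
from collections import defaultdict

def summarize(rows):
    """Build {player: {category: [count, pct_of_category]}} for any number of categories."""
    players = defaultdict(dict)
    category_totals = defaultdict(int)
    for name, category, count, _pct in rows:
        if name == 'Global Total':
            continue  # recomputed below from the player rows
        players[name][category] = count
        category_totals[category] += count

    grand_total = sum(category_totals.values())
    result = {}
    for name, cats in players.items():
        # Assumes every category total is non-zero; guard the division otherwise.
        result[name] = {c: [n, n * 100 / category_totals[c]] for c, n in cats.items()}
        local = sum(cats.values())
        result[name]['Total'] = [local, local * 100 / grand_total]
    result['Global Total'] = {c: [t, 100] for c, t in category_totals.items()}
    return result

data = [('Player1', 'A', 1, 100), ('Player1', 'B', 15, 100),
        ('Player2', 'A', 7, 100), ('Player2', 'B', 65, 100),
        ('Global Total', None, 88, 100)]
summary = summarize(data)
print(summary['Player1'])
```

Adding a category C to the input rows needs no change to the function, since categories are discovered from the data.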

Using a remapping tool from more_itertools in Python 3.6+:
Given
import copy as cp
import collections as ct
import more_itertools as mit
data = [
    ("Player1", "A", 1, 100),
    ("Player1", "B", 15, 100),
    ("Player2", "A", 7, 100),
    ("Player2", "B", 65, 100),
    ('Global Total', None, 88, 100)
]
# Discard the last entry
data = data[:-1]
# Key functions
kfunc = lambda tup: tup[0]
vfunc = lambda tup: tup[1:]
rfunc = lambda x: {item[0]: [item[1]] for item in x}
Code
# Step 1
remapped = mit.map_reduce(data, kfunc, vfunc, rfunc)
# Step 2
intermediate = ct.defaultdict(list)
for d in remapped.values():
    for k, v in d.items():
        intermediate[k].extend(v)
# Step 3
remapped["Global Total"] = {k: [sum(v)] for k, v in intermediate.items()}
final = cp.deepcopy(remapped)
for name, d in remapped.items():
    for lbl, v in d.items():
        stat = (v[0]/remapped["Global Total"][lbl][0]) * 100
        final[name][lbl].append(stat)
Details
Step 1 - build a new dict of remapped groups.
This is done by defining key functions that dictate how to process the keys and values. The reducing function processes the values into sub-dictionaries. See also docs for more details on more_itertools.map_reduce.
>>> remapped
defaultdict(None,
            {'Player1': {'A': [1], 'B': [15]},
             'Player2': {'A': [7], 'B': [65]}})
Step 2 - build an intermediate dict for lookups
>>> intermediate
defaultdict(list, {'A': [1, 7], 'B': [15, 65]})
Step 3 - build a final dict from the latter dictionaries
>>> final
defaultdict(None,
            {'Player1': {'A': [1, 12.5], 'B': [15, 18.75]},
             'Player2': {'A': [7, 87.5], 'B': [65, 81.25]},
             'Global Total': {'A': [8, 100.0], 'B': [80, 100.0]}})
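For readers without more_itertools installed, roughly the same Step 1 grouping can be written with the standard library alone. This is a sketch of an equivalent, not the library's implementation; remapped_plain is an illustrative name:

```python
from collections import defaultdict

data = [("Player1", "A", 1, 100), ("Player1", "B", 15, 100),
        ("Player2", "A", 7, 100), ("Player2", "B", 65, 100)]

# Group each (label, count) pair under its player name,
# mirroring what kfunc, vfunc and rfunc do together in Step 1.
remapped_plain = defaultdict(dict)
for name, label, count, _pct in data:
    remapped_plain[name][label] = [count]

print(dict(remapped_plain))
```

Steps 2 and 3 then work unchanged on this dict.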

Sort Python dictionary by its keys using another Python list

I have a list of words:
['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
I also have a dictionary with keys and values:
{'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}
The keys of my dictionary are all from the list of words. How can I sort the dictionary so that the order of its keys is the same as the order of the words in the list? So once sorted, my dictionary should look like this:
{'apple': 39, 'zoo': 42, 'chicken': 12, 'needle': 32, 'car': 11, 'computer': 18}
Thanks so much!

Before Python 3.7, dictionaries are not guaranteed to maintain order (CPython 3.6 preserves insertion order only as an implementation detail), so sorting a dictionary is not possible there.
You may use collections.OrderedDict to build a new dictionary with the order you want:
In [269]: from collections import OrderedDict
In [270]: keys = ['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
     ...: dict_1 = {'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}
     ...:
In [271]: dict_2 = OrderedDict()
In [272]: for k in keys:
     ...:     dict_2[k] = dict_1[k]
     ...:
In [273]: dict_2
Out[273]:
OrderedDict([('apple', 39),
             ('zoo', 42),
             ('chicken', 12),
             ('needle', 32),
             ('car', 11),
             ('computer', 18)])
In Python 3.6 and later, a simple dict comprehension suffices:
>>> {x : dict_1[x] for x in keys}
{'apple': 39, 'zoo': 42, 'chicken': 12, 'needle': 32, 'car': 11, 'computer': 18}
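If the dictionary might contain keys that are missing from the list, the comprehension would silently drop them. One possible alternative, sketched here under the assumption of Python 3.7+ dict ordering, sorts the items by each key's position in the list; position and ordered are illustrative names:

```python
keys = ['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
dict_1 = {'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}

# Map each word to its position once, so each lookup is O(1) instead of a list.index scan.
position = {word: i for i, word in enumerate(keys)}

# Keys absent from the list (none here) would sort last via the len(keys) default.
ordered = dict(sorted(dict_1.items(), key=lambda kv: position.get(kv[0], len(keys))))
print(ordered)
```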

You can use OrderedDict, since regular dictionaries are unordered before Python 3.7. For your case, you could do this:
from collections import OrderedDict
od = OrderedDict()
ll = ['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
d = {'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}
for f in ll:
    od[f] = d[f]
print(od)
# Outputs: OrderedDict([('apple', 39), ('zoo', 42), ('chicken', 12), ('needle', 32), ('car', 11), ('computer', 18)])

Before Python 3.7, a plain dict is not guaranteed to preserve order, so you should use collections.OrderedDict. The first item you put into an OrderedDict is the first item you get when you iterate over it (e.g. with a for loop).
from collections import OrderedDict
order_list = ['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
unordered_dict = {'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}
ordered_dict = OrderedDict()
for item in order_list:
    ordered_dict[item] = unordered_dict[item]
for k, v in unordered_dict.items():
    print(k, v)
for k, v in ordered_dict.items():
    print(k, v)

Combining and ordering dictionaries in python

If I have multiple dictionaries like so:
dict1 = {Jamal: 10, Steve: 20}
dict2 = {Frank: 200, Steve: 30}
dict3 = {Carl: 14, Jamal: 26}
How would I combine them and order them in descending order of numbers without overwriting any values so that it shows something like:
Frank: 200, Steve: 30, Jamal: 26, Steve: 20, Carl: 14, Jamal: 10

There are two Steve entries, one in dict1 and one in dict2. Mapping types do not allow duplicate keys.
You need to use a sequence type (such as list or tuple).
>>> dict1 = {'Jamal': 10, 'Steve': 20}
>>> dict2 = {'Frank': 200, 'Steve': 30}
>>> dict3 = {'Carl': 14, 'Jamal': 26}
>>>
>>> import itertools
>>> it = itertools.chain.from_iterable(d.items() for d in (dict1, dict2, dict3))
>>> ', '.join('{}: {}'.format(*x) for x in sorted(it, key=lambda x: x[1], reverse=True))
'Frank: 200, Steve: 30, Jamal: 26, Steve: 20, Carl: 14, Jamal: 10'
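The same idea can be wrapped in a small function so it works for any number of dictionaries. A sketch only; combined_pairs is a hypothetical helper name, and the f-string assumes Python 3.6+:

```python
from itertools import chain

def combined_pairs(*dicts):
    """Return every (name, value) pair from all dicts, largest value first."""
    pairs = chain.from_iterable(d.items() for d in dicts)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

dict1 = {'Jamal': 10, 'Steve': 20}
dict2 = {'Frank': 200, 'Steve': 30}
dict3 = {'Carl': 14, 'Jamal': 26}

pairs = combined_pairs(dict1, dict2, dict3)
print(', '.join(f'{name}: {value}' for name, value in pairs))
```

Keeping the result as a list of tuples is what allows both Steve entries to survive.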

You cannot create a dictionary with duplicate keys. Use OrderedDict with the dict.setdefault method to collect the values for each key in a list:
dict1 = {'Jamal': 10, 'Steve': 20}
dict2 = {'Frank': 200, 'Steve': 30}
dict3 = {'Carl': 14, 'Jamal': 26}
from collections import OrderedDict
from itertools import chain
from operator import itemgetter
d = OrderedDict()
for key, value in sorted(chain.from_iterable([dict1.items(), dict2.items(), dict3.items()]), key=itemgetter(1), reverse=True):
    d.setdefault(key, []).append(value)
print(d)
OrderedDict([('Frank', [200]), ('Steve', [30, 20]), ('Jamal', [26, 10]), ('Carl', [14])])
Or just sort the pairs, like the following:
print(sorted(chain.from_iterable([dict1.items(), dict2.items(), dict3.items()]), key=itemgetter(1), reverse=True))
[('Frank', 200), ('Steve', 30), ('Jamal', 26), ('Steve', 20), ('Carl', 14), ('Jamal', 10)]

Sorting a dictionary of dictionary using values in Python

How do I sort a dict of dicts in Python?
I have a dictionary:
d = {
    1: {2: 30, 3: 40, 4: 20, 6: 10},
    2: {3: 30, 4: 60, 5: -60},
    3: {1: -20, 5: 60, 6: 100},
}
How can I sort each inner dict by its values in descending order? I want output like this:
d = {
    1: {3: 40, 2: 30, 4: 20, 6: 10},
    2: {4: 60, 3: 30, 5: -60},
    3: {6: 100, 5: 60, 1: -20},
}

What Padraic did, in a more compact form:
from collections import OrderedDict
sort_vals = lambda d: OrderedDict(sorted(d.items(), key=lambda pair: pair[1], reverse=True))
d = dict((k, sort_vals(sub_dict)) for k, sub_dict in d.items())
(Not tested)

If you want order, you need an OrderedDict of OrderedDicts. The outer dict happens to be in sorted order here, but normal dicts have no guaranteed order before Python 3.7:
from collections import OrderedDict
d = {
    1: {2: 30, 3: 40, 4: 20, 6: 10},
    2: {3: 30, 4: 60, 5: -60},
    3: {1: -20, 5: 60, 6: 100},
}
keys = sorted(d.keys())  # sort outer dict keys
o = OrderedDict()
for k in keys:  # loop over the sorted keys
    # make an OrderedDict out of the sorted items from each sub dict.
    o[k] = OrderedDict((k, v) for k, v in sorted(d[k].items(), key=lambda x: x[1], reverse=True))
print(o)
OrderedDict([(1, OrderedDict([(3, 40), (2, 30), (4, 20), (6, 10)])), (2, OrderedDict([(4, 60), (3, 30), (5, -60)])), (3, OrderedDict([(6, 100), (5, 60), (1, -20)]))])
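On Python 3.7 and later, where plain dicts keep insertion order, the same result can be had without OrderedDict. A minimal sketch under that assumption; sorted_d is an illustrative name:

```python
d = {
    1: {2: 30, 3: 40, 4: 20, 6: 10},
    2: {3: 30, 4: 60, 5: -60},
    3: {1: -20, 5: 60, 6: 100},
}

# Assumes Python 3.7+, where plain dicts keep insertion order.
# Outer keys are sorted ascending; each inner dict is sorted by value, largest first.
sorted_d = {
    outer: dict(sorted(inner.items(), key=lambda kv: kv[1], reverse=True))
    for outer, inner in sorted(d.items())
}
print(sorted_d)
```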

Get the first 100 elements of OrderedDict

preresult is an OrderedDict().
I want to save the first 100 elements in it. Or keep preresult but delete everything other than the first 100 elements.
The structure is like this
stats = {'a': {'email1': 4, 'email2': 3},
         'the': {'email1': 2, 'email3': 4},
         'or': {'email1': 2, 'email3': 1}}
Will islice work for it? Mine says itertools.islice does not have items.

Here's a simple solution using itertools:
>>> import collections
>>> from itertools import islice
>>> preresult = collections.OrderedDict(zip(range(200), range(200)))
>>> list(islice(preresult, 100))[-10:]
[90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
This returns only keys. If you want items, use items (or iteritems in Python 2):
>>> list(islice(preresult.items(), 100))[-10:]
[(90, 90), (91, 91), (92, 92), (93, 93), (94, 94), (95, 95), (96, 96), (97, 97), (98, 98), (99, 99)]
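To end up with a new OrderedDict rather than a list, the sliced items can be passed straight to the constructor. A short sketch; first_100 is an illustrative name:

```python
from collections import OrderedDict
from itertools import islice

preresult = OrderedDict(zip(range(200), range(200)))

# islice stops after 100 pairs, so the remaining items are never copied.
first_100 = OrderedDict(islice(preresult.items(), 100))
print(len(first_100))
```

The original preresult is left untouched by this approach.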

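You can slice the keys of the OrderedDict and copy the matching items into a new one.
from collections import OrderedDict
a = OrderedDict()
for i in range(10):
    a[i] = i*i
b = OrderedDict()
# keys() returns a view in Python 3, so convert it to a list before slicing
for i in list(a.keys())[0:5]:
    b[i] = a[i]
b is a sliced version of a.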

for k, v in list(od.items())[:100]:
    pass
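The question also asks about keeping the same OrderedDict and deleting everything after the first 100 elements. One way to do that in place, sketched with made-up sample data, is to pop from the end until only 100 entries remain:

```python
from collections import OrderedDict

# Sample data for illustration: 250 entries, keyed 0..249.
od = OrderedDict((i, i * i) for i in range(250))

# popitem(last=True) removes the most recently inserted pair,
# so repeating it leaves only the first 100 entries in place.
while len(od) > 100:
    od.popitem(last=True)
print(len(od))
```

This mutates od itself, so any other references to it see the trimmed contents.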

Can't we just convert the OrderedDict's items into a list, slice it as needed, and then put it back into an OrderedDict?
Here's how I did it.
from collections import OrderedDict
# define an OrderedDict
stats = OrderedDict()
# load the OrderedDict with 100 keys
for i in range(100):
    stats[str(i)] = {'email'+str(i): i, 'email'+str(i+1): i+1}
# Then slice the first 20 elements from the OrderedDict:
# first convert the items to a list, then slice, then put them back into an OrderedDict
st = OrderedDict(list(stats.items())[:20])
print(stats)
print(st)
The output will look as follows. To keep it short, I reduced the first one to 10 items and sliced it to only the first 5 items:
OrderedDict([('0', {'email0': 0, 'email1': 1}), ('1', {'email1': 1, 'email2': 2}), ('2', {'email2': 2, 'email3': 3}), ('3', {'email3': 3, 'email4': 4}), ('4', {'email4': 4, 'email5': 5}), ('5', {'email5': 5, 'email6': 6}), ('6', {'email6': 6, 'email7': 7}), ('7', {'email7': 7, 'email8': 8}), ('8', {'email8': 8, 'email9': 9}), ('9', {'email9': 9, 'email10': 10})])
OrderedDict([('0', {'email0': 0, 'email1': 1}), ('1', {'email1': 1, 'email2': 2}), ('2', {'email2': 2, 'email3': 3}), ('3', {'email3': 3, 'email4': 4}), ('4', {'email4': 4, 'email5': 5})])
I did a print (dict(st)) to get this:
{'0': {'email0': 0, 'email1': 1}, '1': {'email1': 1, 'email2': 2}, '2': {'email2': 2, 'email3': 3}, '3': {'email3': 3, 'email4': 4}, '4': {'email4': 4, 'email5': 5}}
