Combining and ordering dictionaries in python - python

If I have multiple dictionaries like so:
dict1 = {'Jamal': 10, 'Steve': 20}
dict2 = {'Frank': 200, 'Steve': 30}
dict3 = {'Carl': 14, 'Jamal': 26}
How would I combine them and order them in descending order of numbers without overwriting any values so that it shows something like:
Frank: 200, Steve: 30, Jamal: 26, Steve: 20, Carl: 14, Jamal: 10

There are two Steve keys (in dict1 and dict2), and mapping types do not allow duplicate keys.
You need to use a sequence type (such as a list or tuple) instead.
>>> dict1 = {'Jamal': 10, 'Steve': 20}
>>> dict2 = {'Frank': 200, 'Steve': 30}
>>> dict3 = {'Carl': 14, 'Jamal': 26}
>>>
>>> import itertools
>>> it = itertools.chain.from_iterable(d.items() for d in (dict1, dict2, dict3))
>>> ', '.join('{}: {}'.format(*x) for x in sorted(it, key=lambda x: x[1], reverse=True))
'Frank: 200, Steve: 30, Jamal: 26, Steve: 20, Carl: 14, Jamal: 10'

You cannot create a dictionary with duplicate keys. Instead, use an OrderedDict with the setdefault method to collect the values for each key in a list:
dict1 = {'Jamal': 10, 'Steve': 20}
dict2 = {'Frank': 200, 'Steve': 30}
dict3 = {'Carl': 14, 'Jamal': 26}
from collections import OrderedDict
from itertools import chain
from operator import itemgetter
d = OrderedDict()
for key, value in sorted(chain.from_iterable([dict1.items(), dict2.items(), dict3.items()]), key=itemgetter(1), reverse=True):
    d.setdefault(key, []).append(value)
print(d)
OrderedDict([('Frank', [200]), ('Steve', [30, 20]), ('Jamal', [26, 10]), ('Carl', [14])])
Or just sort the pairs, as follows:
print(sorted(chain.from_iterable([dict1.items(), dict2.items(), dict3.items()]), key=itemgetter(1), reverse=True))
[('Frank', 200), ('Steve', 30), ('Jamal', 26), ('Steve', 20), ('Carl', 14), ('Jamal', 10)]
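A minimal Python 3 sketch of the same idea wrapped in a function. The name merged_items_desc is made up for illustration and is not part of either answer. It keeps every (name, value) pair instead of merging keys, so no value is overwritten.

```python
from itertools import chain

def merged_items_desc(*dicts):
    """Return all (key, value) pairs from the given dicts, largest value first."""
    pairs = chain.from_iterable(d.items() for d in dicts)
    # sorted() is stable, so pairs with equal values keep their input order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

dict1 = {'Jamal': 10, 'Steve': 20}
dict2 = {'Frank': 200, 'Steve': 30}
dict3 = {'Carl': 14, 'Jamal': 26}

pairs = merged_items_desc(dict1, dict2, dict3)
print(', '.join('{}: {}'.format(name, value) for name, value in pairs))
```

Returning a list of tuples rather than a dict is the design choice that lets both Steve entries survive.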

Related

Maximum value from dictionary using python

s={'a': [11, 22, 33], 'b': [44, 77, 99], 'c': [99, 200, 100]}
The output should be the maximum value from each list in the dictionary, like a=33, b=99, c=200.
A simple comprehension:
>>> {k: max(v) for k, v in s.items()}
{'a': 33, 'b': 99, 'c': 200}
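If some of the lists might be empty, max() raises a ValueError. A small sketch, assuming Python 3.4+ and a hypothetical extra key 'd' that is not in the question, shows the default argument as one way around that:

```python
s = {'a': [11, 22, 33], 'b': [44, 77, 99], 'c': [99, 200, 100], 'd': []}

# 'd' is a made-up entry; default=None avoids a ValueError on the empty list.
maxima = {k: max(v, default=None) for k, v in s.items()}
print(maxima)
```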

Extract dictionary values from nested keys, leaving main key, then turn into list

a = {
1: {'abc': 50, 'def': 33, 'xyz': 40},
2: {'abc': 30, 'def': 22, 'xyz': 45},
3: {'abc': 15, 'def': 11, 'xyz': 50}
}
I would like to iterate through this nested dictionary, remove the sub keys (or extract the sub key values), but keep the main keys. The second step would be to turn the dictionary into a list of lists:
b = [
[1, 50, 33, 40],
[2, 30, 22, 45],
[3, 15, 11, 50]
]
I looked through the many posts here about extracting keys and values but cannot find an example close enough to what I need (I am still new at this). So far, I have this:
for key in a.keys():
    if type(a[key]) == dict:
        a[key] = a[key].popitem()[1]
which gives the value of the third (last) sub key for each key. It's a start, but it is not complete and not what I want:
{1: 40, 2: 45, 3: 50}
Use a list comprehension over a.items(), use dict.values() to get the values, and then you can use unpacking (*) to get the desired lists.
>>> [[k, *v.values()] for k,v in a.items()]
[[1, 50, 33, 40], [2, 30, 22, 45], [3, 15, 11, 50]]
This may not be the most elegant solution, but it does exactly what you want:
a = {
1: {'abc': 50, 'def': 33, 'xyz': 40},
2: {'abc': 30, 'def': 22, 'xyz': 45},
3: {'abc': 15, 'def': 11, 'xyz': 50}
}
b = []
for key1, dict2 in a.items():
    c = [key1]
    c.extend(dict2.values())
    b.append(c)
print(b)
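Both answers rely on every inner dictionary listing its sub keys in the same order. A hedged variant, assuming the column order is known up front (the sub_keys list below is an assumption, not something the question states), looks the values up by name instead:

```python
a = {
    1: {'abc': 50, 'def': 33, 'xyz': 40},
    2: {'abc': 30, 'def': 22, 'xyz': 45},
    3: {'abc': 15, 'def': 11, 'xyz': 50},
}

# Assumed fixed column order; each row is [main key, value for each sub key].
sub_keys = ['abc', 'def', 'xyz']
b = [[key] + [inner[name] for name in sub_keys] for key, inner in sorted(a.items())]
print(b)
```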

Sort dictionary by value in python

This is my dictionary:
d = {'jan': 50, 'feb': 30, 'march': 60, 'april': 50, 'may': 50, 'june': 60, 'july': 20}
I am expecting output like this:
d = {'jan': 50, 'april': 50, 'may': 50, 'march': 60, 'june': 60, 'feb': 30, 'july': 20}
When I run this program, I get different output than expected:
d = {'jan': 50, 'feb': 30, 'march': 60, 'april': 50, 'may': 50, 'june': 60, 'july': 20}
sortlist = sorted(d, key=d.get)
print(sortlist)
You can start by counting the number of times each value appears with collections.Counter:
from collections import Counter
c = Counter(d.values())
# Counter({20: 1, 30: 1, 50: 3, 60: 2})
And now sort the dictionary looking up how many times each value appears using a key in sorted:
sorted(d.items(), key=lambda x: c[x[1]], reverse=True)
[('jan', 50), ('april', 50), ('may', 50), ('march', 60), ('june', 60),
('feb', 30), ('july', 20)]
Note, however, that if you build a plain dictionary from the result, the order is only guaranteed to be kept on Python 3.7+; on older versions a plain dict has no defined order.
So one thing you can do is use collections.OrderedDict to keep the order: simply call OrderedDict(res), where res is the list of tuples returned by sorted.
d={'january': 500, 'feb':600, 'march':300,'april':500,'may':500,'june':600,'july':200}
from collections import defaultdict
from collections import OrderedDict
count_dict = defaultdict(int)
for key, value in d.items():
    count_dict[value] += 1
First, we count the occurrences of each value. Counter can be used instead of defaultdict. Then we sort the items according to the count_dict lookup table we just created.
sorted_dict = OrderedDict(sorted(d.items(), key=lambda item: count_dict[item[1]], reverse=True))
print(sorted_dict)
>>> OrderedDict([('january', 500), ('april', 500), ('may', 500), ('feb', 600), ('june', 600), ('march', 300), ('july', 200)])
Update: you can create count_dict with Counter like this:
from collections import Counter
count_dict = Counter(d.values())
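On Python 3.7+, where a plain dict keeps insertion order, the same result can be sketched without OrderedDict. The tie-break on the value itself is an assumption added here so that equally frequent values do not depend on input order:

```python
from collections import Counter

d = {'jan': 50, 'feb': 30, 'march': 60, 'april': 50, 'may': 50, 'june': 60, 'july': 20}
counts = Counter(d.values())

# Most frequent value first; among equally frequent values, the larger one first.
by_frequency = dict(sorted(d.items(), key=lambda item: (-counts[item[1]], -item[1])))
print(by_frequency)
```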

Efficient grouping into dict

I have a list of tuples:
[('Player1', 'A', 1, 100),
('Player1', 'B', 15, 100),
('Player2', 'A', 7, 100),
('Player2', 'B', 65, 100),
('Global Total', None, 88, 100)]
Which I wish to convert to a dict in the following format:
{
'Player1': {
'A': [1, 12.5],
'B': [15, 18.75],
'Total': [16, 18.18]
},
'Player2': {
'A': [7, 87.5],
'B': [65, 81.25],
'Total': [72, 81.81]
},
'Global Total': {
'A': [8, 100],
'B': [80, 100]
}
}
So each Player dict has its local total value and its percentage relative to the global total value.
Currently, I do it like this:
fixed_vals = {}
for name, status, qtd, prct in data_set:  # This is the list of tuples var
    if name in fixed_vals:
        fixed_vals[name].update({status: [qtd, prct]})
    else:
        fixed_vals[name] = {status: [qtd, prct]}
fixed_vals['Global Total']['Total'] = fixed_vals['Global Total'].pop(None)
total_a = 0
for k, v in fixed_vals.items():
    if k != 'Global Total':
        total_a += v['A'][0]
fixed_vals['Global Total']['A'] = [
    total_a, total_a * 100 / fixed_vals['Global Total']['Total'][0]
]
fixed_vals['Global Total']['B'] = [
    fixed_vals['Global Total']['Total'][0] - total_a,
    fixed_vals['Global Total']['Total'][0] - fixed_vals['Global Total']['A'][1]
]
for player, vals in fixed_vals.items():
    if player != 'Global Total':
        vals['A'][1] = vals['A'][0] * 100 / fixed_vals['Global Total']['A'][0]
        vals['B'][1] = fixed_vals['Global Total']['A'][1] - vals['B'][1]
The problem is that this is not very flexible, since I have to do something similar to this
but with almost 12 categories (A, B, ...).
Is there a better approach to this? Perhaps this is trivial with pandas?
Edit for clarification:
There are no duplicate categories for each Player; every one of them has the same sequence (some might have 0, but the category is unique).
Everyone seems attracted to a dict-only solution, but why not try converting to pandas?
import pandas as pd
# given
tuple_list = [('Player1', 'A', 1, 100),
('Player1', 'B', 15, 100),
('Player2', 'A', 7, 100),
('Player2', 'B', 65, 100),
('Global Total', None, 88, 100)]
# make a dataframe
df = pd.DataFrame(tuple_list , columns = ['player', 'game','score', 'pct'])
del df['pct']
df = df[df.player!='Global Total']
df = df.pivot(index='player', columns='game', values='score')
df.columns.name=''
df.index.name=''
# just a check
assert df.to_dict() == {'A': {'Player1': 1, 'Player2': 7},
'B': {'Player1': 15, 'Player2': 65}}
# A B
#player
#Player1 1 15
#Player2 7 65
print('Obtained dataset:\n', df)
Basically, all you need is the 'df' dataframe; the rest you can
compute and add later, with no need to save it to a dictionary.
Below is an update at the OP's request:
# the sum down each column is this - this was the 'Global Total' in the dicts
# A 8
# B 80
sum_col = df.sum(axis=0)
# let's calculate the share of each player's score:
shares = df / df.sum(axis=0) * 100
assert shares.transpose().to_dict() == {'Player1': {'A': 12.5, 'B': 18.75},
'Player2': {'A': 87.5, 'B': 81.25}}
# in 'shares' the columns add to 100%:
# A B
#player
#Player1 12.50 18.75
#Player2 87.50 81.25
# let's mix up a dataframe close to the original dictionary structure
mixed_df = pd.concat([df.A, shares.A, df.B, shares.B], axis=1)
mixed_df.columns = ['A', 'A_pct', 'B', 'B_pct']
totals = mixed_df.sum(axis=0)
totals.name = 'Total'
# DataFrame.append was removed in pandas 2.0; pd.concat adds the totals row instead
mixed_df = pd.concat([mixed_df, totals.to_frame().T])
print('\nProducing some statistics\n', mixed_df)
One solution would be to use groupby to group consecutive scores from the same player:
tup = [('Player1', 'A', 1, 100),('Player1', 'B', 15, 100),('Player2', 'A', 7, 100), ('Player2', 'B', 65, 100), ('Global Total', None, 88, 100)]
Then import groupby:
from itertools import groupby
result = dict((name,dict((x[1],x[2:]) for x in values)) for name,values in groupby(tup,lambda x:x[0]))
Then just go and update all the totals:
for key in result:
    if key == "Global Total":
        continue  # skip this one ...
    # sum up our player scores
    result[key]['total'] = [sum(col) for col in zip(*result[key].values())]
# you can print the results too
print(result)
# {'Player2': {'A': (7, 100), 'total': [72, 200], 'B': (65, 100)}, 'Player1': {'A': (1, 100), 'total': [16, 200], 'B': (15, 100)}, 'Global Total': {'total': [88, 100], None: (88, 100)}}
NOTE: This solution requires that all of Player1's scores are adjacent in your list of tuples, all of Player2's scores are adjacent, and so on.
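If the rows are not already grouped, sorting by the player name first satisfies that requirement. A minimal sketch, using deliberately shuffled sample rows (the shuffled order is an assumption made for illustration):

```python
from itertools import groupby
from operator import itemgetter

rows = [('Player2', 'A', 7, 100),
        ('Player1', 'A', 1, 100),
        ('Player2', 'B', 65, 100),
        ('Player1', 'B', 15, 100)]

# groupby only merges adjacent rows, so sort by the grouping key first.
rows_sorted = sorted(rows, key=itemgetter(0))
grouped = {name: {cat: qty for _, cat, qty, _ in group}
           for name, group in groupby(rows_sorted, key=itemgetter(0))}
print(grouped)
```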
A) Break your code up into manageable chunks:
from collections import defaultdict
result = defaultdict(dict)
for (cat, sub, num, percent) in input_list:
    result[cat][sub] = [num, percent]
Now we have a dict with the player counts, but the only valid percentages are for total and we don't have global counts.
from collections import Counter
def build_global(dct):
    keys = Counter()
    for key in dct:
        if key == "Global Total":
            continue
        for sub_key in dct[key]:
            keys[sub_key] += dct[key][sub_key][0]
    for key in keys:
        dct["Global Total"][key] = [keys[key], 100]
build_global(result) now yields valid global counts for each event.
Finally:
def calc_percent(dct):
    totals = dct["Global Total"]
    for key in dct:
        local_total = 0
        if key == "Global Total":
            continue
        for sub_key in dct[key]:
            local_total += dct[key][sub_key][0]
            dct[key][sub_key][1] = (dct[key][sub_key][0]/float(totals[sub_key][0])) * 100
        dct[key]['Total'] = [local_total, (local_total/float(dct['Global Total'][None][0])) * 100]
calc_percent(result) goes through and builds the percentages.
result is then:
defaultdict(<type 'dict'>,
{'Player2': {'A': [7, 87.5], 'B': [65, 81.25], 'Total': [72, 81.81818181818183]},
'Player1': {'A': [1, 12.5], 'B': [15, 18.75], 'Total': [16, 18.181818181818183]},
'Global Total': {'A': [8, 100], None: [88, 100], 'B': [80, 100]}})
If you need it exactly as specified, you can delete the None entry in the global total and call dict(result) to convert the defaultdict into a vanilla dict.
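For the roughly 12 categories mentioned in the question, the two passes can be folded into one function that never names a category. This is a sketch under stated assumptions: Python 3 division, the function name summarize and the helper pct are invented here, the 'Global Total' row and the fourth field of each tuple are ignored, and all totals are recomputed from the player rows.

```python
from collections import defaultdict

def pct(part, whole):
    """Percentage of part in whole; 0.0 when whole is zero."""
    return part * 100 / whole if whole else 0.0

def summarize(rows, global_name='Global Total'):
    """Build {player: {category: [count, percent]}} for any number of categories."""
    counts = defaultdict(dict)
    category_totals = defaultdict(int)
    for player, category, count, _ in rows:
        if player == global_name:
            continue  # totals are recomputed from the player rows
        counts[player][category] = count
        category_totals[category] += count
    grand_total = sum(category_totals.values())

    result = {}
    for player, per_category in counts.items():
        # Each category percentage is relative to that category's total.
        entry = {cat: [n, pct(n, category_totals[cat])]
                 for cat, n in per_category.items()}
        player_total = sum(per_category.values())
        entry['Total'] = [player_total, pct(player_total, grand_total)]
        result[player] = entry
    result[global_name] = {cat: [n, 100] for cat, n in category_totals.items()}
    return result

data_set = [('Player1', 'A', 1, 100),
            ('Player1', 'B', 15, 100),
            ('Player2', 'A', 7, 100),
            ('Player2', 'B', 65, 100),
            ('Global Total', None, 88, 100)]
summary = summarize(data_set)
print(summary)
```

Adding a category then only means adding rows to data_set; no code changes are needed.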
Using a remapping tool from more_itertools in Python 3.6+:
Given
import copy as cp
import collections as ct
import more_itertools as mit
data = [
("Player1", "A", 1, 100),
("Player1", "B", 15, 100),
("Player2", "A", 7, 100),
("Player2", "B", 65, 100),
('Global Total', None, 88, 100)
]
# Discard the last entry
data = data[:-1]
# Key functions
kfunc = lambda tup: tup[0]
vfunc = lambda tup: tup[1:]
rfunc = lambda x: {item[0]: [item[1]] for item in x}
Code
# Step 1
remapped = mit.map_reduce(data, kfunc, vfunc, rfunc)
# Step 2
intermediate = ct.defaultdict(list)
for d in remapped.values():
    for k, v in d.items():
        intermediate[k].extend(v)
# Step 3
remapped["Global Total"] = {k: [sum(v)] for k, v in intermediate.items()}
final = cp.deepcopy(remapped)
for name, d in remapped.items():
    for lbl, v in d.items():
        stat = (v[0]/remapped["Global Total"][lbl][0]) * 100
        final[name][lbl].append(stat)
Details
Step 1 - build a new dict of remapped groups.
This is done by defining key functions that dictate how to process the keys and values. The reducing function processes the values into sub-dictionaries. See also docs for more details on more_itertools.map_reduce.
>>> remapped
defaultdict(None,
{'Player1': {'A': [1], 'B': [15]},
'Player2': {'A': [7], 'B': [65]}})
Step 2 - build an intermediate dict for lookups
>>> intermediate
defaultdict(list, {'A': [1, 7], 'B': [15, 65]})
Step 3 - build a final dict from the latter dictionaries
>>> final
defaultdict(None,
{'Player1': {'A': [1, 12.5], 'B': [15, 18.75]},
'Player2': {'A': [7, 87.5], 'B': [65, 81.25]},
'Global Total': {'A': [8, 100.0], 'B': [80, 100.0]}})

Sort Python dictionary by its keys using another Python list

I have a list of words:
['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
I also have a dictionary with keys and values:
{'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}
The keys of my dictionary are all from the list of words. How can I sort the dictionary so that the order of its keys is the same as the order of the words in the list? So once sorted, my dictionary should look like this:
{'apple': 39, 'zoo': 42, 'chicken': 12, 'needle': 32, 'car': 11, 'computer': 18}
Thanks so much!
For Python versions before 3.6, dictionaries do not maintain insertion order, so sorting a dictionary is not possible.
You may use collections.OrderedDict to build a new dictionary with the order you want:
In [269]: from collections import OrderedDict
In [270]: keys = ['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
...: dict_1 = {'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}
...:
In [271]: dict_2 = OrderedDict()
In [272]: for k in keys:
     ...:     dict_2[k] = dict_1[k]
     ...:
In [273]: dict_2
Out[273]:
OrderedDict([('apple', 39),
('zoo', 42),
('chicken', 12),
('needle', 32),
('car', 11),
('computer', 18)])
In Python 3.6+ (where dicts keep insertion order, guaranteed by the language from 3.7 on), a simple dict comprehension suffices:
>>> {x : dict_1[x] for x in keys}
{'apple': 39, 'zoo': 42, 'chicken': 12, 'needle': 32, 'car': 11, 'computer': 18}
You can use OrderedDict, since regular dictionaries are unordered before Python 3.7. For your case you could do this:
from collections import OrderedDict
od = OrderedDict()
ll = ['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
d = {'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}
for f in ll:
    od[f] = d[f]
#Outputs: OrderedDict([('apple', 39), ('zoo', 42), ('chicken', 12), ('needle', 32), ('car', 11), ('computer', 18)])
Before Python 3.7, a plain dict does not guarantee any order, so you should use collections.OrderedDict. The first item you put into an OrderedDict is the first item you get back when you iterate over it (e.g. with a for loop).
from collections import OrderedDict
order_list = ['apple', 'zoo', 'chicken', 'needle', 'car', 'computer']
unordered_dict = {'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}
ordered_dict = OrderedDict()
for item in order_list:
    ordered_dict[item] = unordered_dict[item]
for k, v in unordered_dict.items():
    print(k, v)
for k, v in ordered_dict.items():
    print(k, v)
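All of the answers above raise a KeyError if the list contains a word that is not in the dictionary. A short sketch for Python 3.7+, with a hypothetical extra word 'tiger' added to the list to show the case, skips such words:

```python
order_list = ['apple', 'zoo', 'chicken', 'needle', 'car', 'computer', 'tiger']
unordered_dict = {'zoo': 42, 'needle': 32, 'computer': 18, 'apple': 39, 'car': 11, 'chicken': 12}

# 'tiger' is a made-up word with no entry; the filter leaves it out.
reordered = {k: unordered_dict[k] for k in order_list if k in unordered_dict}
print(reordered)
```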
