Python sum list of dicts by key with nested dicts - python

I have a list of dicts and would like to design a function to output a new dict which contains the sum for each unique key across all the dicts in the list.
For the list:
[
{
'apples': 1,
'oranges': 1,
'grapes': 2
},
{
'apples': 3,
'oranges': 5,
'grapes': 8
},
{
'apples': 13,
'oranges': 21,
'grapes': 34
}
]
So far so good; this can be done with a Counter:
from collections import Counter

def sumDicts(listToProcess):
    c = Counter()
    for entry in listToProcess:
        c.update(entry)
    return dict(c)
Which correctly returns:
{'apples': 17, 'grapes': 44, 'oranges': 27}
The trouble comes when the dicts in my list start to contain nested dicts:
[
{
'fruits': {
'apples': 1,
'oranges': 1,
'grapes': 2
},
'vegetables': {
'carrots': 6,
'beans': 3,
'peas': 2
},
'grains': 4,
'meats': 1
},
{
'fruits': {
'apples': 3,
'oranges': 5,
'grapes': 8
},
'vegetables': {
'carrots': 7,
'beans': 4,
'peas': 3
},
'grains': 3,
'meats': 2
},
{
'fruits': {
'apples': 13,
'oranges': 21,
'grapes': 34
},
'vegetables': {
'carrots': 8,
'beans': 5,
'peas': 4
},
'grains': 2,
'meats': 3
},
]
Now the same function raises a TypeError, because the Counter can't add two dicts.
The desired result would be:
{
'fruits': {
'apples': 17,
'oranges': 27,
'grapes': 44
},
'vegetables': {
'carrots': 21,
'beans': 12,
'peas': 9
},
'grains': 9,
'meats': 6
}
Any ideas on how to do this in a reasonably efficient, Pythonic, generalizable way?

I would do this by performing a recursive merge into a recursively defined collections.defaultdict object.
from collections import defaultdict

def merge(d, new_d):
    for k, v in new_d.items():
        if isinstance(v, dict):
            # d[k] is created on demand as a nested defaultdict
            merge(d[k], v)
        else:
            d[k] = d.setdefault(k, 0) + v

# https://stackoverflow.com/a/19189356/4909087
nested = lambda: defaultdict(nested)
d = nested()
for subd in data:
    merge(d, subd)
Using default_to_regular (a helper, not shown here, that converts the nested defaultdicts back to plain dicts), we have:
default_to_regular(d)
# {
# "fruits": {
# "apples": 17,
# "oranges": 27,
# "grapes": 44
# },
# "vegetables": {
# "carrots": 21,
# "beans": 12,
# "peas": 9
# },
# "grains": 9,
# "meats": 6
# }
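The snippet above calls default_to_regular without defining it. A minimal, self-contained sketch of the whole approach is below. The body of default_to_regular is an assumption about what such a helper might look like (it is not part of the answer), and the small data list is made up for illustration.

```python
from collections import defaultdict

def merge(d, new_d):
    """Recursively add the numeric leaves of new_d into d."""
    for k, v in new_d.items():
        if isinstance(v, dict):
            merge(d[k], v)
        else:
            d[k] = d.setdefault(k, 0) + v

def default_to_regular(d):
    """Assumed helper: convert nested defaultdicts into plain dicts."""
    if isinstance(d, defaultdict):
        d = {k: default_to_regular(v) for k, v in d.items()}
    return d

def nested():
    return defaultdict(nested)

# Illustrative data, smaller than the list in the question
data = [
    {'fruits': {'apples': 1, 'grapes': 2}, 'grains': 4},
    {'fruits': {'apples': 3, 'grapes': 8}, 'grains': 3},
]

total = nested()
for subd in data:
    merge(total, subd)

result = default_to_regular(total)
print(result)  # {'fruits': {'apples': 4, 'grapes': 10}, 'grains': 7}
```

Converting back matters mostly for printing and for avoiding accidental key creation: looking up a missing key on the defaultdict would silently insert a new empty level.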

You can use recursion. This solution finds all the dictionary keys in the input passed to merge, and then sums the values for each key if the values are integers. If the values are dictionaries, however, merge is called again:
def merge(c):
    _keys = {i for b in c for i in b}
    return {i: [sum, merge][isinstance(c[0][i], dict)]([h[i] for h in c]) for i in _keys}
d = [{'fruits': {'apples': 1, 'oranges': 1, 'grapes': 2}, 'vegetables': {'carrots': 6, 'beans': 3, 'peas': 2}, 'grains': 4, 'meats': 1}, {'fruits': {'apples': 3, 'oranges': 5, 'grapes': 8}, 'vegetables': {'carrots': 7, 'beans': 4, 'peas': 3}, 'grains': 3, 'meats': 2}, {'fruits': {'apples': 13, 'oranges': 21, 'grapes': 34}, 'vegetables': {'carrots': 8, 'beans': 5, 'peas': 4}, 'grains': 2, 'meats': 3}]
import json
print(json.dumps(merge(d), indent=4))
Output:
{
"meats": 6,
"grains": 9,
"fruits": {
"grapes": 44,
"oranges": 27,
"apples": 17
},
"vegetables": {
"beans": 12,
"peas": 9,
"carrots": 21
}
}
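Note that this merge looks up every key in every dict (h[i]) and inspects only the first dict (c[0][i]) to decide between sum and merge, so it assumes all dicts share the same keys. A hedged variant that tolerates missing keys might look like the sketch below; the name merge_sparse and the sample input are hypothetical.

```python
def merge_sparse(dicts):
    """Sum values key by key, recursing into nested dicts.

    Keys that are absent from some of the dicts are simply skipped there.
    """
    keys = {k for d in dicts for k in d}
    out = {}
    for k in keys:
        # Collect only the values from dicts that actually contain k
        values = [d[k] for d in dicts if k in d]
        if isinstance(values[0], dict):
            out[k] = merge_sparse(values)
        else:
            out[k] = sum(values)
    return out

sample = [
    {'a': 1, 'b': {'x': 2}},
    {'a': 3},
    {'b': {'x': 1, 'y': 5}},
]
merged = merge_sparse(sample)
print(merged == {'a': 4, 'b': {'x': 3, 'y': 5}})  # True
```

It still assumes that a given key holds either dicts everywhere or numbers everywhere; mixed types under one key would raise a TypeError.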

Related

How do you sort a dictionary by a key's dictionary's value?

How can I sort a dictionary using the values of a key's dictionary?
Input:
myDict = {
"1":{
"VALUE1": 10,
"VALUE2": 5,
"VALUE3": 3
},
"2":{
"VALUE1": 5,
"VALUE2": 3,
"VALUE3": 1
},
"3":{
"VALUE1": 15,
"VALUE2": 2,
"VALUE3": 4
},
}
Expected output:
myDict = {
"3": {
"VALUE1": 15,
"VALUE2": 2,
"VALUE3": 4
},
"1": {
"VALUE1": 10,
"VALUE2": 5,
"VALUE3": 3
},
"2": {
"VALUE1": 5,
"VALUE2": 3,
"VALUE3": 1
},
}
It is now sorted by the value of the key VALUE1, in descending order.
How would I get the expected output?
Try:
newDict = dict(sorted(myDict.items(), key = lambda x: x[1]['VALUE1'], reverse=True))
newDict
{'3': {'VALUE1': 15, 'VALUE2': 2, 'VALUE3': 4},
'1': {'VALUE1': 10, 'VALUE2': 5, 'VALUE3': 3},
'2': {'VALUE1': 5, 'VALUE2': 3, 'VALUE3': 1}}
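This relies on dicts preserving insertion order, which the language guarantees from Python 3.7 onwards. A small sketch of the same idea wrapped in a reusable function follows; the name sort_by_inner_value is hypothetical.

```python
def sort_by_inner_value(d, field, reverse=True):
    """Return a new dict whose items are ordered by d[key][field]."""
    return dict(sorted(d.items(), key=lambda kv: kv[1][field], reverse=reverse))

myDict = {
    "1": {"VALUE1": 10, "VALUE2": 5, "VALUE3": 3},
    "2": {"VALUE1": 5, "VALUE2": 3, "VALUE3": 1},
    "3": {"VALUE1": 15, "VALUE2": 2, "VALUE3": 4},
}

newDict = sort_by_inner_value(myDict, "VALUE1")
print(list(newDict))  # ['3', '1', '2']
```

The original dict is left untouched; only the order of the items in the new dict differs.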

Average the values of a list of dictionaries

I have the following list of dictionaries. Each dictionary has a "Point" and a "Value" and goes from 1 to 10, for each series of points.
My_list = [{"Point": 1, "Value": 40}, {"Point": 2, "Value": 40}, {"Point": 3, "Value": 40}, \
{"Point": 4, "Value": 40}, {"Point": 5, "Value": 40}, {"Point": 6, "Value": 40}, \
{"Point": 7, "Value": 40}, {"Point": 8, "Value": 40}, {"Point": 9, "Value": 0},{"Point": 10, "Value": 250},\
{"Point": 1, "Value": 40}, {"Point": 2, "Value": 40}, {"Point": 3, "Value": 40}, \
{"Point": 4, "Value": 40}, {"Point": 5, "Value": 40}, {"Point": 6, "Value": 40}, \
{"Point": 7, "Value": 40}, {"Point": 8, "Value": 40}, {"Point": 9, "Value": 0},{"Point": 10, "Value": 250},\
{"Point": 1, "Value": 40}, {"Point": 2, "Value": 40}, {"Point": 3, "Value": 40}, \
{"Point": 4, "Value": 40}, {"Point": 5, "Value": 40}, {"Point": 6, "Value": 40}, \
{"Point": 7, "Value": 40}, {"Point": 8, "Value": 40}, {"Point": 9, "Value": 0},{"Point": 10, "Value": 250}]
I would like to find the average 'Value' for every 2 'Point', without messing with the 'Value' of the next series. I have done the following.
every2 = []
counter = 2
temp = []
for point in My_list:
    if counter > 0:
        temp.append(point["Value"])
    else:
        p = point
        p["Value"] = sum(temp)/len(temp)
        every2.append(point)
        # reset the counter after every 2 point
        counter = 2
        temp = []
        # temp.append(point["Value"])
    counter -= 1
print(every2)
The result I am getting is:
[{'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0}, {'Point': 7, 'Value': 40.0}, {'Point': 9, 'Value': 40.0},
{'Point': 1, 'Value': 250.0}, {'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0}, {'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 40.0}, {'Point': 1, 'Value': 250.0}, {'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0}, {'Point': 9, 'Value': 40.0}]
However, I am missing the first 'Point': the first series starts from 'Point' 3 instead of 1, and as a consequence 'Point' 9 has a value of 40 instead of 125.
So what I want should look like this:
[{'Point': 1, 'Value': 40.0}, {'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0}, {'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0}, {'Point': 1, 'Value': 40.0}, {'Point': 3, 'Value': 40.0}, {'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0}, {'Point': 9, 'Value': 125.0}, {'Point': 1, 'Value': 40.0}, {'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0}, {'Point': 7, 'Value': 40.0}, {'Point': 9, 'Value': 125.0}]
You can add a step argument to range() that will allow you to iterate over the list in steps of 2. Then, get both elements you want to use, create a new element using the values, and append that to your result list.
result_list = []
n_step = 2  # chunk size is 2
for i in range(0, len(My_list), n_step):
    # Get all elements in this chunk
    elems = My_list[i:i+n_step]
    # Find the average of the Value key in elems
    avg = sum(item['Value'] for item in elems) / len(elems)
    # Point key from the first element; Value key from average
    new_item = {"Point": elems[0]["Point"], "Value": avg}
    result_list.append(new_item)
Which gives:
[{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0},
{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0},
{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0}]
You can also use a list comprehension (here data is the same list as My_list):
res = [{**data[n], **{'Value': sum(v['Value']/2 for v in data[n: n+2])}} for n in range(0, len(data), 2)]
print(res)
Output:
[{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0},
{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0},
{'Point': 1, 'Value': 40.0},
{'Point': 3, 'Value': 40.0},
{'Point': 5, 'Value': 40.0},
{'Point': 7, 'Value': 40.0},
{'Point': 9, 'Value': 125.0}]
For Python 3.9+, the dict union operator | does the same:
[data[n] | {'Value': sum(v['Value']/2 for v in data[n: n+2])} for n in range(0, len(data), 2)]
Here's another option that's using zip to "parallel loop" over My_list with an offset:
result = [
{"Point": p1["Point"], "Value": (p1["Value"] + p2["Value"]) / 2}
for p1, p2 in zip(My_list[::2], My_list[1::2])
]
This requires that every series has an even length.
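If a series could have an odd length, pairing across the whole list would mix the last point of one series with the first point of the next. One possible way around that, sketched below, is to split the list into series first and then average within each series. The helper names split_series and average_in_chunks are hypothetical, and the sketch assumes a new series starts whenever 'Point' stops increasing.

```python
def split_series(points):
    """Split a flat list into series; a new series starts when Point does not increase."""
    series, current = [], []
    for p in points:
        if current and p["Point"] <= current[-1]["Point"]:
            series.append(current)
            current = []
        current.append(p)
    if current:
        series.append(current)
    return series

def average_in_chunks(points, size=2):
    """Average Value over chunks of `size` points, never crossing a series boundary."""
    out = []
    for s in split_series(points):
        for i in range(0, len(s), size):
            chunk = s[i:i + size]
            avg = sum(p["Value"] for p in chunk) / len(chunk)
            out.append({"Point": chunk[0]["Point"], "Value": avg})
    return out

# Illustrative input: a series of odd length (3) followed by a series of length 2
sample = [
    {"Point": 1, "Value": 40}, {"Point": 2, "Value": 40}, {"Point": 3, "Value": 10},
    {"Point": 1, "Value": 0}, {"Point": 2, "Value": 250},
]
averaged = average_in_chunks(sample)
print(averaged)
```

With this input, the leftover third point of the first series is averaged on its own rather than being paired with the first point of the second series.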

Sort a Python dictionary within a dictionary

I am trying to sort a dictionary within a dictionary. My goal is to sort the 'sub' dictionary ['extra'] based on its values, from high to low. The problem I'm having is that my 'sub' dictionary is nested deep within the main dictionary. Using other examples, I can do this for one level higher; see my code below. So instead of sorting by 'marks', I would like to sort items 1, 2 and 3 based on their values. Code:
# initializing dictionary
test_dict = {'Nikhil' : { 'roll' : 24, 'marks' : 17, 'extra' : {'item1': 2, 'item2': 3, 'item3': 5}},
'Akshat' : {'roll' : 54, 'marks' : 12, 'extra' : {'item1': 8, 'item2': 3, 'item3': 4}},
'Akash' : { 'roll' : 12, 'marks' : 15, 'extra' : {'item1': 9, 'item2': 3, 'item3': 1}}}
# printing original dict
print("The original dictionary : " + str(test_dict))
# using sorted()
# Sort nested dictionary by key
res = sorted(test_dict.items(), key = lambda x: x[1]['marks'])
# print result
print("The sorted dictionary by marks is : " + str(res))
# How to sort on 'extra'?
So this is what I want it to look like:
sorted_dict = {'Nikhil' : { 'roll' : 24, 'marks' : 17, 'extra' : {'item3': 5, 'item2': 3, 'item1': 2}},
'Akshat' : {'roll' : 54, 'marks' : 12, 'extra' : {'item1': 8, 'item3': 4, 'item2': 3}},
'Akash' : { 'roll' : 12, 'marks' : 15, 'extra' : {'item1': 9, 'item2': 3, 'item3': 1}}}
Well this seems to do it:
test_dict = {
'Nikhil': {'roll': 24, 'marks': 17, 'extra': {'item1': 2, 'item2': 3, 'item3': 5}},
'Akshat': {'roll': 54, 'marks': 12, 'extra': {'item1': 8, 'item2': 3, 'item3': 4}},
'Akash': {'roll': 12, 'marks': 15, 'extra': {'item1': 9, 'item2': 3, 'item3': 1}}
}
import copy

# deepcopy so that the inner dicts of test_dict are not modified
sorted_dict = copy.deepcopy(test_dict)
for name in test_dict:
    extra = test_dict[name]['extra']
    sorted_extra = dict(reversed(sorted(extra.items(), key=lambda item: item[1])))
    sorted_dict[name]['extra'] = sorted_extra
This sorts the values but not the keys of the 'extra' dict, for all dicts in the big dictionary; note that the values end up attached to different keys:
test_dict = {
'Nikhil': {'roll': 24, 'marks': 17, 'extra': {'item1': 2, 'item2': 3, 'item3': 5}},
'Akshat': {'roll': 54, 'marks': 12, 'extra': {'item1': 8, 'item2': 3, 'item3': 4}},
'Akash': {'roll': 12, 'marks': 15, 'extra': {'item1': 9, 'item2': 3, 'item3': 1}}
}
for dct in test_dict.values():
    extra_keys, extra_values = dct['extra'].keys(), dct['extra'].values()
    dct['extra'] = dict(zip(extra_keys, sorted(extra_values, reverse=True)))
print(test_dict)
output:
{'Nikhil': {'roll': 24, 'marks': 17, 'extra': {'item1': 5, 'item2': 3, 'item3': 2}},
'Akshat': {'roll': 54, 'marks': 12, 'extra': {'item1': 8, 'item2': 4, 'item3': 3}},
'Akash': {'roll': 12, 'marks': 15, 'extra': {'item1': 9, 'item2': 3, 'item3': 1}}}
The same thing can also be achieved with a dict comprehension:
test_dict = {
'Nikhil': {'roll': 24, 'marks': 17, 'extra': {'item1': 2, 'item2': 3, 'item3': 5}},
'Akshat': {'roll': 54, 'marks': 12, 'extra': {'item1': 8, 'item2': 3, 'item3': 4}},
'Akash': {'roll': 12, 'marks': 15, 'extra': {'item1': 9, 'item2': 3, 'item3': 1}}
}
result = {
    k: {
        m: dict(sorted(n.items(), reverse=True, key=lambda x: x[1]))
        if m == 'extra' else n
        for (m, n) in v.items()
    } for (k, v) in test_dict.items()
}
print(result)
Sort the inner dict and assign it back to test_dict[key]['extra'], just with a loop:
from operator import itemgetter

for key in test_dict.keys():
    test_dict[key]["extra"] = dict(sorted(test_dict[key]["extra"].items(), key=itemgetter(1), reverse=True))
Then test_dict would be:
{
'Nikhil': {'roll': 24, 'marks': 17, 'extra': {'item3': 5, 'item2': 3, 'item1': 2}},
'Akshat': {'roll': 54, 'marks': 12, 'extra': {'item1': 8, 'item3': 4, 'item2': 3}},
'Akash': {'roll': 12, 'marks': 15, 'extra': {'item1': 9, 'item2': 3, 'item3': 1}}
}

Create dictionary from difference of two dictionaries

Suppose I have a dictionary,
dictA = {
'flower':
{
'jasmine': 10,
'roses':
{
'red': 1,
'white': 2
}
},
'fruit':
{
'apple':3
}
}
and dictA is updated (say, to dictB):
dictB = {
'flower':
{
'jasmine': 10,
'roses':
{
'red': 1,
'white': 2
}
},
'fruit':
{
'apple':3,
'orange': 4
}
}
Now, how would I get a dictionary of only the newly added items (preserving the structure)? Something like:
difference(dictB, dictA) = {'fruit': {'orange': 4}}
This way, I would avoid storing redundant items each time and instead have a smaller dictionary showing only the newly added items.
This kind of dictionary manipulation has a lot of practical uses, but unfortunately it is hard to do.
Any help would be much appreciated. Thanks in advance.
Use DictDiffer:
from dictdiffer import diff, patch, swap, revert
dictA = {
'flower':
{
'jasmine': 10,
'roses':
{
'red': 1,
'white': 2
}
},
'fruit':
{
'apple':3
}
}
dictB = {
'flower':
{
'jasmine': 10,
'roses':
{
'red': 1,
'white': 2
}
},
'fruit':
{
'apple':3,
'orange': 4
}
}
# diff() returns a generator, so materialize it before using it twice
result = list(diff(dictA, dictB))
# [('add', 'fruit', [('orange', 4)])]
print(f'Difference :\n{result}')
patched = patch(result, dictA)
# {'flower': {'jasmine': 10, 'roses': {'red': 1, 'white': 2}}, 'fruit': {'apple': 3, 'orange': 4}}
print(f'Apply difference :\n{patched}')
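The diff above is a list of change tuples rather than the nested dictionary the question asks for. A minimal sketch of a plain recursive function that returns only the newly added items, with the structure preserved, is shown below. The name added_items is hypothetical, and the sketch deliberately ignores changed or removed values; it reports only keys present in the new dict and absent from the old one.

```python
def added_items(new, old):
    """Return the parts of `new` whose keys do not exist in `old`, keeping the nesting."""
    diff = {}
    for key, value in new.items():
        if key not in old:
            diff[key] = value
        elif isinstance(value, dict) and isinstance(old[key], dict):
            # Both sides are dicts: look for additions one level down
            nested = added_items(value, old[key])
            if nested:
                diff[key] = nested
    return diff

dictA = {
    'flower': {'jasmine': 10, 'roses': {'red': 1, 'white': 2}},
    'fruit': {'apple': 3},
}
dictB = {
    'flower': {'jasmine': 10, 'roses': {'red': 1, 'white': 2}},
    'fruit': {'apple': 3, 'orange': 4},
}

print(added_items(dictB, dictA))  # {'fruit': {'orange': 4}}
```

Branches without additions are dropped entirely, which is why 'flower' does not appear in the result.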

add dictionaries to empty list dict.value()

I have four dictionaries that I would like to add as items to an empty list that is a dictionary value, and I have no idea how to do this. Could someone please help me figure out how to turn this:
data = {'Cars': []}
dict1 = {'subaru': 1, 'honda': 5, 'volkswagen': 8}
dict2 = {'subaru': 7, 'honda': 3, 'volkswagen': 9}
dict3 = {'subaru': 9, 'honda': 2, 'volkswagen': 1}
dict4 = {'subaru': 2, 'honda': 8, 'volkswagen': 2}
print (data)
into this:
{'Cars': [{'subaru': 1, 'honda': 5, 'volkswagen': 8},
{'subaru': 7, 'honda': 3, 'volkswagen': 9},
{'subaru': 9, 'honda': 2, 'volkswagen': 1},
{'subaru': 2, 'honda': 8, 'volkswagen': 2}]}
data = {'Cars': []}
dict1 = {'subaru': 1, 'honda': 5, 'volkswagen': 8}
dict2 = {'subaru': 7, 'honda': 3, 'volkswagen': 9}
dict3 = {'subaru': 9, 'honda': 2, 'volkswagen': 1}
dict4 = {'subaru': 2, 'honda': 8, 'volkswagen': 2}
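for item in [dict1, dict2, dict3, dict4]:
    data['Cars'].append(item)
import pprint
# sort_dicts=False (Python 3.8+) keeps the keys in insertion order, as in the output below
pp = pprint.PrettyPrinter(sort_dicts=False)
pp.pprint(data)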
gives:
{'Cars': [{'subaru': 1, 'honda': 5, 'volkswagen': 8},
{'subaru': 7, 'honda': 3, 'volkswagen': 9},
{'subaru': 9, 'honda': 2, 'volkswagen': 1},
{'subaru': 2, 'honda': 8, 'volkswagen': 2}]}
Quoting jasonharper: "data['Cars'] is your initially-empty list. You add elements to a list by calling .append() on it. Thus, data['Cars'].append(dict1), and so on."
This can be done in one step with a loop, as above.
To get the pretty print, import the pprint module, create a PrettyPrinter object named pp, and use its pp.pprint() method to print the list nested in the dictionary in a pretty way :)
By the way: you can create the data dictionary with a list already containing the elements in one step using:
data = {'Cars': [
{'subaru': 1, 'honda': 5, 'volkswagen': 8},
{'subaru': 7, 'honda': 3, 'volkswagen': 9},
{'subaru': 9, 'honda': 2, 'volkswagen': 1},
{'subaru': 2, 'honda': 8, 'volkswagen': 2}]}
You need to access the Cars key in the data dictionary, then append to that.
data = {'Cars': []}
dict1 = {'subaru': 1, 'honda': 5, 'volkswagen': 8}
dict2 = {'subaru': 7, 'honda': 3, 'volkswagen': 9}
dict3 = {'subaru': 9, 'honda': 2, 'volkswagen': 1}
dict4 = {'subaru': 2, 'honda': 8, 'volkswagen': 2}
data['Cars'].append(dict1)
data['Cars'].append(dict2)
data['Cars'].append(dict3)
data['Cars'].append(dict4)
You could simplify this to just
data['Cars'].append({'subaru': 1, 'honda': 5, 'volkswagen': 8})
data['Cars'].append({'subaru': 7, 'honda': 3, 'volkswagen': 9})
data['Cars'].append({'subaru': 9, 'honda': 2, 'volkswagen': 1})
data['Cars'].append({'subaru': 2, 'honda': 8, 'volkswagen': 2})
Just append all the dicts to data["Cars"], which is a list.
for i in [dict1, dict2, dict3, dict4]:
    data["Cars"].append(i)
print(data)
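As an alternative to calling .append() in a loop, list.extend() adds every item of an iterable in one call. A minimal sketch, using two of the dictionaries from the question:

```python
data = {'Cars': []}
dict1 = {'subaru': 1, 'honda': 5, 'volkswagen': 8}
dict2 = {'subaru': 7, 'honda': 3, 'volkswagen': 9}

# extend() appends each element of the given list to data['Cars']
data['Cars'].extend([dict1, dict2])

print(len(data['Cars']))  # 2
```

Either way, the list holds references to the original dict objects rather than copies, so a later change to dict1 is also visible through data['Cars'][0].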
