Merge identical format dictionaries into one dictionary [duplicate] - python

I want to convert the current data format into the expected result.
The items should keep their original order.
How can I do this elegantly in Python?
Current data format
[{'_id': 1800, 'count': 32},
.....
{'_id': 1892, 'count': 1},
{'_id': 1899, 'count': 13}]
Expected result
{"_id":[1800,1892,1899], "count":[32,1,13]}

Try this using defaultdict
from collections import defaultdict
l = [{'_id': 1800, 'count': 32},
{'_id': 1892, 'count': 1},
{'_id': 1899, 'count': 13}]
d = defaultdict(list)
for i in l:
    for j, k in i.items():
        d[j].append(k)
>>>d
defaultdict(<type 'list'>, {'count': [32, 1, 13], '_id': [1800, 1892, 1899]})
OR
using Counter
from collections import Counter
l = [{i:[j] for i,j in d.items()} for d in l]
result_counter = Counter()
for i in l:
    result_counter.update(i)
>>>result_counter
Counter({'_id': [1800, 1892, 1899], 'count': [32, 1, 13]})

From python collections docs:
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

You can simply traverse the list and construct the required dictionary.
In [2]: l = [{'_id': 1800, 'count': 32}, {'_id': 1892, 'count': 1}, {'_id': 1899, 'count': 13}]
In [3]: {'_id': [data['_id'] for data in l], 'count': [data['count'] for data in l]}
Out[3]: {'_id': [1800, 1892, 1899], 'count': [32, 1, 13]}

>>> data
[{'count': 32, '_id': 1800}, {'count': 1, '_id': 1892}, {'count': 13, '_id': 1899}]
>>> result = {key:[] for key in data[0]}
>>> for d in data:
...     for k in d:
...         result[k].append(d[k])
...
>>> result
{'count': [32, 1, 13], '_id': [1800, 1892, 1899]}
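Since the question asks for the keys to keep their original order, here is a minimal sketch that relies on plain dicts preserving insertion order, which is guaranteed on Python 3.7 and later. The names `data` and `result` are only illustrative, and older interpreters would need `collections.OrderedDict` instead.

```python
# Sketch: collect the values per key with a plain dict.
# Assumes Python 3.7+, where dicts keep keys in insertion order.
data = [{'_id': 1800, 'count': 32},
        {'_id': 1892, 'count': 1},
        {'_id': 1899, 'count': 13}]

result = {}
for record in data:
    for key, value in record.items():
        # setdefault creates the list the first time a key is seen
        result.setdefault(key, []).append(value)

print(result)
# {'_id': [1800, 1892, 1899], 'count': [32, 1, 13]}
```

The keys appear in the order in which they are first encountered, so `_id` comes before `count` as in the input.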

Related

Using Counts in python to subtract list of dictionaries

I've worked out how to use Counter to add lists of dictionaries with the code below:
from collections import Counter
a = [{'num': 'star1', 'count': 1},
{'num': 'star2', 'count': 3}]
b = [{'num': 'star1', 'count': 7},
{'num': 'star2', 'count': 2},
{'num': 'star3', 'count': 1}]
joint = sum((Counter({elem['num']: elem['count']}) for elem in a + b), Counter())
[{'num': num, 'count': counts} for num, counts in joint .items()]
However, when I try to subtract I get an error. For example:
from collections import Counter
a = [{'num': 'star1', 'count': 1},
{'num': 'star2', 'count': 3}]
b = [{'num': 'star1', 'count': 7},
{'num': 'star2', 'count': 2},
{'num': 'star3', 'count': 1}]
joint = sum((Counter({elem['num']: elem['count']}) for elem in a - b), Counter())
[{'num': num, 'count': counts} for num, counts in joint .items()]
Is anyone aware of a workaround for this, or of another way to approach the issue?
I've tried using Counter's subtract method, but it still doesn't seem to work.
Counter's - and -= operators discard any result that is zero or negative (Counter.subtract() keeps them), so if you don't need negative counts, this will work (just building the two Counters separately):
from collections import Counter
a = [{'num': 'star1', 'count': 1},
{'num': 'star2', 'count': 3}]
b = [{'num': 'star1', 'count': 7},
{'num': 'star2', 'count': 2},
{'num': 'star3', 'count': 1}]
joint = sum((Counter({elem['num']: elem['count']}) for elem in a), Counter())
joint -= sum((Counter({elem['num']: elem['count']}) for elem in b), Counter())
[{'num': num, 'count': counts} for num, counts in joint .items()]
It outputs:
[{'num': 'star2', 'count': 1}]
The other values (star1 and star3) are dropped because they come to 1-7=-6 and 0-1=-1 respectively, which are both less than 0.
Alternatively, you can skip Counter altogether and do it like this (this supports negative numbers and whatever operation you want):
a_info = {d['num']:d['count'] for d in a}
b_info = {d['num']:d['count'] for d in b}
[{'num':item, 'count': a_info.get(item, 0)-b_info.get(item, 0)} for item in set(a_info)|set(b_info)]
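If negative counts should be kept while still using Counter, one option is `Counter.subtract()`, which updates the counter in place and, unlike the `-` operator, does not drop zero or negative results. The sketch below assumes each `num` appears at most once per list; the variable names `diff` and `result` are only illustrative.

```python
from collections import Counter

a = [{'num': 'star1', 'count': 1},
     {'num': 'star2', 'count': 3}]
b = [{'num': 'star1', 'count': 7},
     {'num': 'star2', 'count': 2},
     {'num': 'star3', 'count': 1}]

# Build a Counter from list a (assumes each 'num' is unique within a list)
diff = Counter({d['num']: d['count'] for d in a})

# subtract() works in place and keeps zero and negative counts
diff.subtract({d['num']: d['count'] for d in b})

result = [{'num': num, 'count': count} for num, count in diff.items()]
print(result)
# [{'num': 'star1', 'count': -6}, {'num': 'star2', 'count': 1}, {'num': 'star3', 'count': -1}]
```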

previous key to current key

I am new to the concept of dictionaries and am trying to learn them. What I have is a dictionary like this:
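{'cars': [{'values': [1, 534], 'sedan': 1, 'count': 2},
          {'values': [25, 32, 164], 'sedan': 1, 'count': 10}],
 'bikes': [{'values': [23, 12, 1], 'road': 0, 'count': 9},
           {'values': [2, 4], 'road': 1, 'count': 24},
           {'values': [68, 69], 'sedan': 0, 'count': 28},
           {'values': [4, 93], 'sedan': 0, 'count': 6}]}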
What I am trying to achieve is to add IDs to all inner value groups, starting from 1.
If you want the ID as part of the value group, like this:
{'cars': [{'values': [1, 534], 'sedan': 1, 'count': 2, 'ID': 1},
{'values': [25, 32, 164], 'sedan': 1, 'count': 10, 'ID': 2}],
'bikes': [{'values': [23, 12, 1], 'road': 0, 'count': 9},
...
You can do:
for i in range(len(try_dict['cars'])):
    try_dict['cars'][i]['ID'] = i+1
If you want what Phydeaux suggests, you can do:
new_dict = {'cars': {}}
for i in range(len(try_dict['cars'])):
    new_dict['cars'][i+1] = try_dict['cars'][i]
Which will give you:
{'cars': {1: {'values': [1, 534], 'sedan': 1, 'count': 2},
2: {'values': [25, 32, 164], 'sedan': 1, 'count': 10}}}
If you want not just cars but also bikes (and maybe trucks, trains, whatever...), use:
new_dict = {}
for key in try_dict.keys():
    new_dict[key] = {}
    for i in range(len(try_dict[key])):
        new_dict[key][i+1] = try_dict[key][i]
This will give you:
{'cars': {1: {'values': [1, 534], 'sedan': 1, 'count': 2},
2: {'values': [25, 32, 164], 'sedan': 1, 'count': 10}},
'bikes': {1: {'values': [23, 12, 1], 'road': 0, 'count': 9},
2: {'values': [2, 4], 'road': 1, 'count': 24},
3: {'values': [68, 69], 'sedan': 0, 'count': 28},
4: {'values': [4, 93], 'sedan': 0, 'count': 6}}}
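The index-based loops above can also be written with `enumerate`, which yields the counter and the item together. This is only a sketch: `try_dict` here is a trimmed copy of the data from the question, and `new_dict` follows the naming used in this answer.

```python
# Trimmed sample data (only part of the dictionary from the question)
try_dict = {
    'cars': [{'values': [1, 534], 'sedan': 1, 'count': 2},
             {'values': [25, 32, 164], 'sedan': 1, 'count': 10}],
    'bikes': [{'values': [23, 12, 1], 'road': 0, 'count': 9},
              {'values': [2, 4], 'road': 1, 'count': 24}],
}

# enumerate(..., start=1) yields (1, first_group), (2, second_group), ...
# dict() turns those pairs into {1: first_group, 2: second_group, ...}
new_dict = {key: dict(enumerate(groups, start=1))
            for key, groups in try_dict.items()}

print(new_dict['cars'][1])
# {'values': [1, 534], 'sedan': 1, 'count': 2}
```

The numbering restarts at 1 for every top-level key, matching the output shown above.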
You can do this using a simple function:
def idx(dict, key):
    dict = dict
    dict[key].insert(0, 0)
    return dict
Full Code:
def idx(dict, key):
    dict = dict
    dict[key].insert(0, 0)
    return dict
dict = {'cars': [{'values': [1, 534],
'sedan': 1,
'count': 2},
{'values': [25,32,164],
'sedan': 1,
'count': 10}],
'bikes': [{'values': [23,12,1],
'road': 0,
'count': 9},
{'values': [2,4],
'road': 1,
'count': 24},
{'values': [68,69],
'sedan': 0,
'count': 28},
{'values': [4,93],
'sedan': 0,
'count': 6}]}
dict = idx(dict, "cars")
print(dict["cars"][1])
Explanation:
Replace dictionary with a new edited dictionary:
dict = {key: [...,...,...]}
dict = idx(dict, key)
The function uses the list .insert() method to insert 0 at index 0 of the list stored under the given key, so the original entries move to indices 1, 2, and so on.
Learn more about the Python .insert() method at:
https://www.w3schools.com/python/ref_list_insert.asp

Combining multiple lists of dictionaries

I have several lists of dictionaries, where each dictionary contains a unique id value that is common among all lists. I'd like to combine them into a single list of dicts, where each dict is joined on that id value.
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
desired_output = [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
I tried doing something like the answer found at https://stackoverflow.com/a/42018660/7564393, but I'm getting very confused since I have more than 2 lists. Should I try using a defaultdict approach? More importantly, I am NOT always going to know the other values, only that the id value is present in all dicts.
You can use itertools.groupby():
from itertools import groupby
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
desired_output = []
for _, values in groupby(sorted([*list1, *list2, *list3], key=lambda x: x['id']), key=lambda x: x['id']):
    temp = {}
    for d in values:
        temp.update(d)
    desired_output.append(temp)
Result:
[{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
# combine all lists
d = {} # id -> dict
for l in [list1, list2, list3]:
    for list_d in l:
        if 'id' not in list_d:
            continue
        _id = list_d['id']  # _id avoids shadowing the built-in id()
        if _id not in d:
            d[_id] = dict(list_d)  # copy, so the input dicts are not modified
        else:
            d[_id].update(list_d)
# dicts with same id are grouped together since id is used as key
res = list(d.values())
print(res)
You can first build a dict of dicts, then turn it into a list:
from itertools import chain
from collections import defaultdict
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
dict_out = defaultdict(dict)
for d in chain(list1, list2, list3):
    dict_out[d['id']].update(d)
out = list(dict_out.values())
print(out)
# [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
itertools.chain allows you to iterate on all the dicts contained in the 3 lists. We build a dict dict_out having the id as key, and the corresponding dict being built as value. This way, we can easily update the already built part with the small dict of our current iteration.
Here is a function-based approach that does not use itertools (which is excellent for rapid development work).
This solution works for any number of lists, because the function takes a variable number of arguments, and it lets the caller choose the type of the returned output (list or dict).
By default it returns a list, as you want; if you pass as_list=False, it returns a dictionary instead.
I used a dictionary to solve this because lookups by id are fast.
Have a look at the get_packed_list() function below.
get_packed_list()
def get_packed_list(*dicts_lists, as_list=True):
    output = {}
    for dicts_list in dicts_lists:
        for dictionary in dicts_list:
            _id = dictionary.pop("id")  # id() is a built-in function, so _id is used instead
            if _id not in output:
                # Create a new entry for this id
                output[_id] = {"id": _id}
            for key in dictionary:
                output[_id][key] = dictionary[key]
            dictionary["id"] = _id  # put 'id' back, because the caller's dict was modified in place
    if as_list:
        return [output[key] for key in output]
    return output  # dictionary keyed by id
Test
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
output = get_packed_list(list1, list2, list3)
print(output)
# [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
output = get_packed_list(list1, list2, list3, as_list=False)
print(output)
# {1: {'id': 1, 'value': 20, 'sum': 10, 'total': 30}, 2: {'id': 2, 'value': 21, 'sum': 11, 'total': 32}}
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
print(list1+list2+list3)
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
result = []
for i in range(0,len(list1)):
    final_dict = dict(list(list1[i].items()) + list(list2[i].items()) + list(list3[i].items()))
    result.append(final_dict)
print(result)
output : [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
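Note that merging by position, as in the loop above, only gives the right answer when every list has its records in the same order. The sketch below uses hypothetical reordered data (`list2` is deliberately reversed) to contrast a positional merge with a merge keyed on `id`; the names `by_position`, `merged`, and `by_id` are illustrative.

```python
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 2, 'sum': 11}, {'id': 1, 'sum': 10}]  # same ids, different order

# Positional merge: pairs list1[i] with list2[i], so records get mixed up
by_position = [{**a, **b} for a, b in zip(list1, list2)]

# Keyed merge: group on 'id' first, so the order of the inputs does not matter
merged = {}
for d in list1 + list2:
    merged.setdefault(d['id'], {}).update(d)
by_id = list(merged.values())

print(by_position)
# [{'id': 2, 'value': 20, 'sum': 11}, {'id': 1, 'value': 21, 'sum': 10}]
print(by_id)
# [{'id': 1, 'value': 20, 'sum': 10}, {'id': 2, 'value': 21, 'sum': 11}]
```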

Merge list of python dictionaries using multiple keys

I want to merge two lists of dictionaries, using multiple keys.
I have a single list of dicts with one set of results:
l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
{'id': 2, 'year': '2017', 'resultA': 3},
{'id': 1, 'year': '2018', 'resultA': 3},
{'id': 2, 'year': '2018', 'resultA': 5}]
And another list of dicts for another set of results:
l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
{'id': 2, 'year': '2017', 'resultB': 8},
{'id': 1, 'year': '2018', 'resultB': 7},
{'id': 2, 'year': '2018', 'resultB': 9}]
And I want to combine them using the 'id' and 'year' keys to get the following:
all = [{'id': 1, 'year': '2017', 'resultA': 2, 'resultB': 5},
{'id': 2, 'year': '2017', 'resultA': 3, 'resultB': 8},
{'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 7},
{'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 9}]
I know that for combining two lists of dicts on a single key, I can use this:
l1 = {d['id']:d for d in l1}
all = [dict(d, **l1.get(d['id'], {})) for d in l2]
But it ignores the year, providing the following incorrect result:
all = [{'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 5},
{'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 8},
{'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 7},
{'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 9}]
Treating this as I would in R, by adding in the second variable I want to merge on, I get a KeyError:
l1 = {d['id','year']:d for d in l1}
all = [dict(d, **l1.get(d['id','year'], {})) for d in l2]
How do I merge using multiple keys?
Instead of d['id','year'], use the tuple (d['id'], d['year']) as your key.
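The KeyError arises because d['id','year'] looks up the single tuple key ('id', 'year') in d, which does not exist. A minimal sketch of the tuple-key fix applied to the data from the question follows; the names `l1_by_key` and `merged` are illustrative and replace the question's reuse of `l1` and `all`.

```python
l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
      {'id': 2, 'year': '2017', 'resultA': 3},
      {'id': 1, 'year': '2018', 'resultA': 3},
      {'id': 2, 'year': '2018', 'resultA': 5}]
l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
      {'id': 2, 'year': '2017', 'resultB': 8},
      {'id': 1, 'year': '2018', 'resultB': 7},
      {'id': 2, 'year': '2018', 'resultB': 9}]

# Index l1 by the (id, year) pair; tuples are hashable, so they work as dict keys
l1_by_key = {(d['id'], d['year']): d for d in l1}

# For each l2 record, merge in the l1 record with the same (id, year), if any
merged = [{**l1_by_key.get((d['id'], d['year']), {}), **d} for d in l2]

print(merged[0])
# {'id': 1, 'year': '2017', 'resultA': 2, 'resultB': 5}
```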
You can combine both lists and group the combined list by id and year, then merge the dicts that share the same id and year.
Grouping can be done with itertools.groupby, and merging with collections.ChainMap:
>>> from itertools import groupby
>>> from collections import ChainMap
>>> [dict(ChainMap(*list(g))) for _,g in groupby(sorted(l1+l2, key=lambda x: (x['id'],x['year'])),key=lambda x: (x['id'],x['year']))]
[{'resultA': 2, 'id': 1, 'resultB': 5, 'year': '2017'}, {'resultA': 3, 'id': 1, 'resultB': 7, 'year': '2018'}, {'resultA': 3, 'id': 2, 'resultB': 8, 'year': '2017'}, {'resultA': 5, 'id': 2, 'resultB': 9, 'year': '2018'}]
Alternatively, to avoid the lambda, you can use operator.itemgetter:
>>> from operator import itemgetter
>>> [dict(ChainMap(*list(g))) for _,g in groupby(sorted(l1+l2, key=itemgetter('id', 'year')),key=itemgetter('id', 'year'))]
Expanding on @AlexHall's suggestion, you can use collections.defaultdict to help you:
from collections import defaultdict
d = defaultdict(dict)
for i in l1 + l2:
    results = {k: v for k, v in i.items() if k not in ('id', 'year')}
    d[(i['id'], i['year'])].update(results)
Result
defaultdict(dict,
{(1, '2017'): {'resultA': 2, 'resultB': 5},
(1, '2018'): {'resultA': 3, 'resultB': 7},
(2, '2017'): {'resultA': 3, 'resultB': 8},
(2, '2018'): {'resultA': 5, 'resultB': 9}})
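The result above is keyed by (id, year) tuples, whereas the question asks for a flat list of dicts. One possible way to convert it back is sketched below; the name `flat` and the loop variables `id_` and `year` are illustrative, and the output order assumes Python 3.7+ insertion-ordered dicts.

```python
from collections import defaultdict

l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
      {'id': 2, 'year': '2017', 'resultA': 3},
      {'id': 1, 'year': '2018', 'resultA': 3},
      {'id': 2, 'year': '2018', 'resultA': 5}]
l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
      {'id': 2, 'year': '2017', 'resultB': 8},
      {'id': 1, 'year': '2018', 'resultB': 7},
      {'id': 2, 'year': '2018', 'resultB': 9}]

# Same grouping step as in the answer: collect results per (id, year)
d = defaultdict(dict)
for i in l1 + l2:
    results = {k: v for k, v in i.items() if k not in ('id', 'year')}
    d[(i['id'], i['year'])].update(results)

# Turn each ((id, year), results) pair back into one flat dict
flat = [{'id': id_, 'year': year, **results}
        for (id_, year), results in d.items()]

print(flat[0])
# {'id': 1, 'year': '2017', 'resultA': 2, 'resultB': 5}
```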

sum value of two different dictionaries which is having same key

I have two dictionaries:
first = {'id': 1, 'age': 23}
second = {'id': 4, 'out': 100}
I want the output dictionary to be:
{'id': 5, 'age': 23, 'out':100}
I tried
>>> dict(first.items() + second.items())
{'age': 23, 'id': 4, 'out': 100}
but I am getting id as 4, and I want it to be 5.
You want to use collections.Counter:
from collections import Counter
first = Counter({'id': 1, 'age': 23})
second = Counter({'id': 4, 'out': 100})
first_plus_second = first + second
print(first_plus_second)
Output:
Counter({'out': 100, 'age': 23, 'id': 5})
And if you need the result as a true dict, just use dict(first_plus_second):
>>> print(dict(first_plus_second))
{'age': 23, 'id': 5, 'out': 100}
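One caveat with this approach: Counter addition keeps only keys whose summed value is positive, so a total of zero or less disappears from the result. The sketch below uses hypothetical data in which the two ids cancel out, and compares the Counter result with a plain-dict sum; the names `with_counter`, `keys`, and `with_dict` are illustrative.

```python
from collections import Counter

first = {'id': 1, 'age': 23}
second = {'id': -1, 'out': 100}  # hypothetical data: the ids sum to zero

# Counter addition silently drops keys whose total is zero or negative
with_counter = dict(Counter(first) + Counter(second))

# A plain-dict sum over the union of keys keeps every key
keys = list(first) + [k for k in second if k not in first]
with_dict = {k: first.get(k, 0) + second.get(k, 0) for k in keys}

print(with_counter)  # {'age': 23, 'out': 100}
print(with_dict)     # {'id': 0, 'age': 23, 'out': 100}
```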
If you want to add values from the second to the first, you can do it like this:
first = {'id': 1, 'age': 23}
second = {'id': 4, 'out': 100}
for k in second:
    if k in first:
        first[k] += second[k]
    else:
        first[k] = second[k]
print(first)
The above will output:
{'age': 23, 'id': 5, 'out': 100}
You can simply update the 'id' key afterwards:
result = dict(list(first.items()) + list(second.items()))  # list() is needed on Python 3
result['id'] = first['id'] + second['id']
