merging Python dictionaries [duplicate] - python

I am trying to merge the following Python dictionaries as follows:
dict1= {'paul':100, 'john':80, 'ted':34, 'herve':10}
dict2 = {'paul':'a', 'john':'b', 'ted':'c', 'peter':'d'}
output = {'paul':[100,'a'],
'john':[80, 'b'],
'ted':[34,'c'],
'peter':[None, 'd'],
'herve':[10, None]}
I wish to keep all keys from both dictionaries.
Is there an efficient way to do this?

output = {k: [dict1[k], dict2.get(k)] for k in dict1}
output.update({k: [None, dict2[k]] for k in dict2 if k not in dict1})

This will work (in Python 3, dict key views cannot be concatenated with +, so take the union of the key sets instead):
{k: [dict1.get(k), dict2.get(k)] for k in set(dict1) | set(dict2)}
Output:
{'john': [80, 'b'], 'paul': [100, 'a'], 'peter': [None, 'd'], 'ted': [34, 'c'], 'herve': [10, None]}

In Python 2.7 or Python 3.1+ you can easily generalise this to any number of dictionaries using a combination of list, set and dict comprehensions:
>>> dict1 = {'paul':100, 'john':80, 'ted':34, 'herve':10}
>>> dict2 = {'paul':'a', 'john':'b', 'ted':'c', 'peter':'d'}
>>> dicts = dict1,dict2
>>> {k:[d.get(k) for d in dicts] for k in {k for d in dicts for k in d}}
{'john': [80, 'b'], 'paul': [100, 'a'], 'peter': [None, 'd'], 'ted': [34, 'c'], 'herve': [10, None]}
Python 2.6 doesn't have set comprehensions or dict comprehensions, so use generator expressions with set() and dict() instead:
>>> dict1 = {'paul':100, 'john':80, 'ted':34, 'herve':10}
>>> dict2 = {'paul':'a', 'john':'b', 'ted':'c', 'peter':'d'}
>>> dicts = dict1,dict2
>>> dict((k,[d.get(k) for d in dicts]) for k in set(k for d in dicts for k in d))
{'john': [80, 'b'], 'paul': [100, 'a'], 'peter': [None, 'd'], 'ted': [34, 'c'], 'herve': [10, None]}

In Python3.1,
output = {k:[dict1.get(k),dict2.get(k)] for k in dict1.keys() | dict2.keys()}
In Python2.6,
output = dict((k,[dict1.get(k),dict2.get(k)]) for k in set(dict1.keys() + dict2.keys()))
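The one-liners above can be wrapped in a small helper if you need this in more than one place. This is only a minimal sketch for Python 3.7+; the name `merge_collect` and the `missing` parameter are made up for illustration and do not come from any of the answers.

```python
def merge_collect(*dicts, missing=None):
    """Collect the values for every key across dicts, in argument order."""
    # dict.fromkeys de-duplicates the keys while keeping first-seen order
    # (dicts preserve insertion order from Python 3.7 on).
    all_keys = dict.fromkeys(k for d in dicts for k in d)
    return {k: [d.get(k, missing) for d in dicts] for k in all_keys}


dict1 = {'paul': 100, 'john': 80, 'ted': 34, 'herve': 10}
dict2 = {'paul': 'a', 'john': 'b', 'ted': 'c', 'peter': 'd'}
output = merge_collect(dict1, dict2)
print(output)
```

Unlike the set-based versions, this keeps the keys in the order they first appear, which may or may not matter for your use case.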

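Using chain.from_iterable (from itertools) you can avoid the list/dict/set comprehension:
from itertools import chain
dict(chain.from_iterable(map(lambda d: d.items(), list_of_dicts)))
Note that dict() keeps only the last value seen for each key, so this merges the dictionaries rather than collecting the values from matching keys. Whether it reads better than a double comprehension is a matter of personal preference.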
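To make the behaviour of the chain.from_iterable approach concrete, here is a small sketch on made-up data. It shows what a plain dict() call does with the flattened pairs, and one possible way (a defaultdict, which is my assumption and not part of the answer) to collect the values instead. Note that this variant does not pad missing keys with None.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical sample data, shortened from the question.
list_of_dicts = [
    {'paul': 100, 'john': 80},
    {'paul': 'a', 'peter': 'd'},
]

# Flatten every (key, value) pair into one sequence.
pairs = list(chain.from_iterable(d.items() for d in list_of_dicts))

# dict() keeps only the last value seen for a repeated key.
last_wins = dict(pairs)

# Grouping the same pairs collects every value instead.
collected = defaultdict(list)
for key, value in pairs:
    collected[key].append(value)
collected = dict(collected)

print(last_wins)
print(collected)
```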

Related

How to interleave these two lists into one list?

a = ['a', 'b', 'c']
b = [10, 20, 30]
output should be like [[a:10], [b:20], [c:30]]
I do know how to use the zip to interweave two lists
l = []
for x, y in zip(a, b):
    l.append([x, y])
And the output is : [['a', 10], ['b', 20], ['c', 30]]
instead of [[a:10], [b:20], [c:30]]
How can I get the output with ':' like this?
Thanks
Assuming that you mean to create a list of singleton dicts, you can zip the two lists, wrap each resulting pair in a one-element tuple with another zip, and map the resulting sequence to the dict constructor:
list(map(dict, zip(zip(a, b))))
Or use a list comprehension:
[{i: j} for i, j in zip(a, b)]
Both return:
[{'a': 10}, {'b': 20}, {'c': 30}]
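If a list of singleton dicts is not actually what you need, two other readings of the question are possible. The sketch below is a guess at those readings, not something the asker confirmed: one dict holding all pairs, or a list of strings that literally contain the ':'.

```python
a = ['a', 'b', 'c']
b = [10, 20, 30]

# Reading 1: a single dict that maps each item of a to the item of b.
combined = dict(zip(a, b))

# Reading 2: the "a:10" text itself, as a list of strings.
labels = ['{}:{}'.format(x, y) for x, y in zip(a, b)]

print(combined)
print(labels)
```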

Python Index Nested Dictionary with List

If I have a nested dictionary and varying lists:
d = {'a': {'b': {'c': {'d': 0}}}}
list1 = ['a', 'b']
list2 = ['a', 'b', 'c']
list3 = ['a', 'b', 'c', 'd']
How can I access dictionary values like so:
>>> d[list1]
{'c': {'d': 0}}
>>> d[list3]
0
You can use functools.reduce:
from functools import reduce
reduce(dict.get, list3, d)
# 0
EDIT: mix of lists and dictionaries
If the nested values mix lists and dictionaries, index with [] instead of dict.get:
d = {'a': [{'b0': {'c': 1}}, {'b1': {'c': 1}}]}
list1 = ['a', 1, 'b1', 'c']
fun = lambda element, indexer: element[indexer]
reduce(fun, list1, d)
# 1
Use a short function:
def nested_get(d, lst):
    out = d
    for x in lst:
        out = out[x]
    return out
nested_get(d, list1)
# {'c': {'d': 0}}
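Both approaches above raise an exception (or fail on None) when the path does not exist. If you would rather get a fallback value, something like the following sketch might do. The name `nested_get_default` and the `default` parameter are hypothetical, and the choice of which exceptions to swallow is an assumption you may want to narrow.

```python
from functools import reduce

_MISSING = object()  # sentinel, so that None can still be a real stored value


def nested_get_default(container, path, default=None):
    """Follow path through nested dicts/lists; return default if a step fails."""
    def step(current, key):
        if current is _MISSING:
            return _MISSING
        try:
            return current[key]
        except (KeyError, IndexError, TypeError):
            # Missing dict key, list index out of range, or a value that
            # cannot be indexed at all (for example an int).
            return _MISSING

    result = reduce(step, path, container)
    return default if result is _MISSING else result


d = {'a': {'b': {'c': {'d': 0}}}}
print(nested_get_default(d, ['a', 'b']))
print(nested_get_default(d, ['a', 'x', 'c'], default='n/a'))
```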

How do you combine lists of multiple dictionaries in Python?

I'd like to merge a list of dictionaries with lists as values. Given
arr[0] = {'number':[1,2,3,4], 'alphabet':['a','b','c']}
arr[1] = {'number':[3,4], 'alphabet':['d','e']}
arr[2] = {'number':[6,7], 'alphabet':['e','f']}
the result I want would be
merge_arr = {'number':[1,2,3,4,3,4,6,7,], 'alphabet':['a','b','c','d','e','e','f']}
Could you recommend any compact code?
If you know these are the only keys in the dict, you can hard code it. If it isn't so simple, show a complicated example.
from pprint import pprint
arr = [
{
'number':[1,2,3,4],
'alphabet':['a','b','c']
},
{
'number':[3,4],
'alphabet':['d','e']
},
{
'number':[6,7],
'alphabet':['e','f']
}
]
merged_arr = {
'number': [],
'alphabet': []
}
for d in arr:
    merged_arr['number'].extend(d['number'])
    merged_arr['alphabet'].extend(d['alphabet'])
pprint(merged_arr)
Output:
{'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f'],
'number': [1, 2, 3, 4, 3, 4, 6, 7]}
arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},{'number':[3,4], 'alphabet':['d','e']},{'number':[6,7], 'alphabet':['e','f']}]
merged = {}  # avoid naming this "dict", which would shadow the built-in
for k in arr[0].keys():
    merged[k] = sum([d[k] for d in arr], [])
print(merged)
output:
{'number': [1, 2, 3, 4, 3, 4, 6, 7], 'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f']}
Here is code that uses defaultdict to more easily collect the items. You could leave the result as a defaultdict but this version converts that to a regular dictionary. This code will work with any keys, and the keys in the various dictionaries can differ, as long as the values are lists. Therefore this answer is more general than the other answers given so far.
from collections import defaultdict
arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},
{'number':[3,4], 'alphabet':['d','e']},
{'number':[6,7], 'alphabet':['e','f']},
]
merge_arr_default = defaultdict(list)
for adict in arr:
    for key, value in adict.items():
        merge_arr_default[key].extend(value)
merge_arr = dict(merge_arr_default)
print(merge_arr)
The printed result is
{'number': [1, 2, 3, 4, 3, 4, 6, 7], 'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f']}
EDIT: As noted by @pault, the solution below has quadratic complexity and is therefore not recommended for large lists. There are more efficient ways to do it.
However if you’re looking for compactness and relative simplicity, keep reading.
If you want a more functional form, this two-liner will do:
from functools import reduce  # reduce is no longer a built-in in Python 3
arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},{'number':[3,4], 'alphabet':['d','e']},{'number':[6,7], 'alphabet':['e','f']}]
keys = ['number', 'alphabet']
merge_arr = {key: reduce(list.__add__, [d[key] for d in arr]) for key in keys}
print(merge_arr)
Outputs:
{'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f'], 'number': [1, 2, 3, 4, 3, 4, 6, 7]}
This won't merge recursively.
If you want it to work with arbitrary keys that may be missing from some of the dicts, use:
keys = {k for d in arr for k in d}
merge_arr = {key: reduce(list.__add__, [d.get(key, []) for d in arr]) for key in keys}
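If the quadratic cost of repeated list addition is a concern, one possible alternative that stays close to the functional style is itertools.chain.from_iterable, which walks each list once. This is a sketch on slightly altered data: the third dict with a 'symbol' key is made up here to show keys that are missing from some entries.

```python
from itertools import chain

arr = [
    {'number': [1, 2, 3, 4], 'alphabet': ['a', 'b', 'c']},
    {'number': [3, 4], 'alphabet': ['d', 'e']},
    {'number': [6, 7], 'symbol': ['#']},  # hypothetical entry with a different key
]

# Every key that appears in any of the dicts.
keys = {k for d in arr for k in d}

# Concatenate the lists for each key; a missing key contributes an empty list.
merge_arr = {
    key: list(chain.from_iterable(d.get(key, []) for d in arr))
    for key in keys
}
print(merge_arr)
```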

slicing dictionary with values

I have a dictionary like:
d = {1: 'a', 2:'b', 3:'c', 4:'c', 5:'c', 6:'c'}
I want to slice this dictionary such that if the values in the end are same, it should return only the first value encountered. so the return is:
d = {1: 'a', 2:'b', 3:'c'}
I'm using collections.defaultdict(OrderedDict) to maintain sorting by the keys.
Currently, I'm using a loop. Is there a pythonic way of doing this?
UPDATE
the dictionary values can also be dictionaries:
d = {1: {'a': 'a1', 'b': 'b1'}, 2:{'a': 'a1', 'b': 'b2'}, 3:{'a': 'a1', 'b': 'c1'}, 4:{'a': 'a1', 'b': 'c1'}, 5:{'a': 'a1', 'b': 'c1'}, 6:{'a': 'a1', 'b': 'c1'}}
output:
d = {1: {'a': 'a1', 'b': 'b1'}, 2:{'a': 'a1', 'b': 'b2'}, 3:{'a': 'a1', 'b': 'c1'}}
You can use itertools.groupby with a list comprehension to achieve your result:
>>> from itertools import groupby
>>> d = {1: 'a', 2:'b', 3:'c', 4:'c', 5:'c', 6:'c'}
>>> n = [(min(item[0] for item in g), k) for k, g in groupby(d.items(), key=lambda x: x[1])]
>>> n
[(1, 'a'), (2, 'b'), (3, 'c')]
The above expression can also be written as
>>> from operator import itemgetter
>>> n = [(min(map(itemgetter(0), g)), k) for k, g in groupby(d.items(), key=itemgetter(1))]
You can convert this to a dict simply with
>>> dict(n)
{1: 'a', 2: 'b', 3: 'c'}
A plain dict does not guarantee key order before Python 3.7, so you can use OrderedDict:
>>> from collections import OrderedDict
>>> OrderedDict(sorted(n))
OrderedDict([(1, 'a'), (2, 'b'), (3, 'c')])
If you want to get rid of the for loop, you can do it this way (on Python 2, use iteritems() instead of items()):
{a: b for b, a in {y: x for x, y in sorted(d.items(), reverse=True)}.items()}
But it is neither very pythonic nor very efficient, and it requires the values to be hashable.
Instead of using an ordered dictionary whose keys represent indexes, the more pythonic way is to use a list. In this case, you use indexes instead of keys and can slice the list more effectively.
>>> d = {1: 'a', 2:'b', 3:'c', 4:'c', 5:'c', 6:'c'}
>>> a = list(d.values())
>>> a[:a.index(a[-1])+1]
['a', 'b', 'c']
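If you need to keep the keys as well, the same idea can be applied to the dict's items. The helper below is only a sketch (the name `trim_trailing_run` is made up); it assumes the dict iterates in the order you care about, which holds for OrderedDict and for plain dicts on Python 3.7+. Unlike `a.index(a[-1])`, it looks only at the trailing run, so an earlier occurrence of the same value is not mistaken for the start of that run. It compares values with ==, so dict values work too.

```python
def trim_trailing_run(d):
    """Return a new dict without the repeated values at the end of d."""
    items = list(d.items())
    if not items:
        return {}
    last_value = items[-1][1]
    end = len(items)
    # Walk backwards while the previous value still equals the last one.
    while end > 1 and items[end - 2][1] == last_value:
        end -= 1
    return dict(items[:end])


d = {1: 'a', 2: 'b', 3: 'c', 4: 'c', 5: 'c', 6: 'c'}
print(trim_trailing_run(d))
```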
Just in case, a solution with pandas
import pandas as pd
df = pd.DataFrame(dict(key=list(d.keys()),val=list(d.values())))
print(df)
   key val
0    1   a
1    2   b
2    3   c
3    4   c
4    5   c
5    6   c
df = df.drop_duplicates(subset=['val'])
df.index=df.key
df.val.to_dict()
{1: 'a', 2: 'b', 3: 'c'}
I don't know how this performs on larger datasets, or whether it is more pythonic.
Nevertheless, it uses no loops.
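You can check whether the last two values are the same:
from collections import OrderedDict
d = OrderedDict({1: 'a', 2:'b', 3:'c', 4:'c', 5:'c', 6:'c'})
# list() is needed in Python 3, where values() returns a view that cannot be indexed
while len(d) > 1 and list(d.values())[-1] == list(d.values())[-2]:
    d.popitem()
print(d)
# OrderedDict([(1, 'a'), (2, 'b'), (3, 'c')])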

Dictionary of lists from list of lists

I have a list of lists of data:
[[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379], ...]
and a list of keys:
['a', 'b', 'c', 'd', 'e']
I want to combine them to a dictionary of lists so it looks like:
['a': [1422029700000, 1422029800000], 'b': [230.84, 231.84], ...]
I can do this using loops but I am looking for a pythonic way.
It is quite simple:
In [1]: keys = ['a','b','c']
In [2]: values = [[1,2,3],[4,5,6],[7,8,9]]
In [7]: dict(zip(keys, zip(*values)))
Out[7]: {'a': (1, 4, 7), 'b': (2, 5, 8), 'c': (3, 6, 9)}
If you need lists as values:
In [8]: dict(zip(keys, [list(t) for t in zip(*values)]))
Out[8]: {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
or:
In [9]: dict(zip(keys, map(list, zip(*values))))
Out[9]: {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
Use:
{k: [d[i] for d in data] for i, k in enumerate(keys)}
Example:
>>> data=[[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379]]
>>> keys = ["a", "b", "c"]
>>> {k: [d[i] for d in data] for i, k in enumerate(keys)}
{'c': [230.42, 231.42], 'a': [1422029700000, 1422029800000], 'b': [230.84, 231.84]}
Your question has everything in a list so if you want a list of dicts:
l1= [[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379]]
l2 = ['a', 'b', 'c', 'd', 'e',"f"] # added f to match length of sublists
print([{a:list(b)} for a,b in zip(l2,zip(*l1))])
[{'a': [1422029700000, 1422029800000]}, {'b': [230.84, 231.84]}, {'c': [230.42, 231.42]}, {'d': [230.31, 231.31]}, {'e': [230.32, 231.32]}, {'f': [378, 379]}]
If you actually want a dict use a dict comprehension with zip:
print({a:list(b) for a,b in zip(l2,zip(*l1))})
{'f': [378, 379], 'e': [230.32, 231.32], 'a': [1422029700000, 1422029800000], 'b': [230.84, 231.84], 'c': [230.42, 231.42], 'd': [230.31, 231.31]}
Your example also has a list of keys shorter than your sublists, so zipping will silently drop values from the sublists; you may want to address that.
If you are using Python 2 you can use itertools.izip:
from itertools import izip
print({a:list(b) for a,b in izip(l2,zip(*l1))})
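Since zip stops at the shortest input, a mismatch between the number of keys and the number of columns goes unnoticed. One way to catch it is an explicit length check before building the dict. This is a sketch with a made-up helper name (`columns_to_dict`); raising ValueError is just one possible policy, and it assumes all rows have the same length.

```python
def columns_to_dict(keys, rows):
    """Map each key to the matching column of rows, refusing silent truncation."""
    columns = list(zip(*rows))  # transpose: one tuple per column
    if len(keys) != len(columns):
        raise ValueError(
            'got {} keys for {} columns'.format(len(keys), len(columns))
        )
    return {k: list(col) for k, col in zip(keys, columns)}


rows = [[1, 2, 3], [4, 5, 6]]
result = columns_to_dict(['a', 'b', 'c'], rows)
print(result)
```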
