Merge two dictionary views in python - python

How do I merge the views of the items of two dicts in python?
My use case is: I have two private dictionaries in a class. I want the API to treat them as one and provide an items method accordingly. The only way I know of is to combine them and then provide a view on the result, but for large dictionaries this seems expensive. I'm thinking of something like a.items() + b.items().
Note: I don't care about key clashes.
This is what I'd like to improve:
class A:
    _priv1 = {'a': 1, 'b': 2}
    _priv2 = {'c': 1, 'd': 2}
    def items(self):
        return {**self._priv1, **self._priv2}.items()

You can use ChainMap:
from collections import ChainMap
class A:
    _priv1 = {'a': 1, 'b': 2}
    _priv2 = {'c': 1, 'd': 2}
    def items(self):
        # ChainMap only references both dicts; no data is copied
        return ChainMap(self._priv1, self._priv2).items()

Merging views of dictionaries means nothing because a view always reflects the content of the corresponding dictionary.
So to get a different view, you either modify one of your dictionaries or instantiate a new one (as you did); there is no way around it. See this.
But maybe what you want is itertools.chain, which iterates across multiple iterables. This solution doesn't instantiate a new dictionary. Or, as others have said, collections.ChainMap. I would use chain to iterate and ChainMap for lookups.
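To make the difference concrete, here is a small sketch of iterating with itertools.chain and looking up with collections.ChainMap when a key exists in both dictionaries. The names first and second and their contents are made up for illustration.

```python
from collections import ChainMap
from itertools import chain

# Illustrative data; 'b' deliberately exists in both dicts.
first = {'a': 1, 'b': 2}
second = {'b': 20, 'c': 30}

# chain() walks both item views lazily and yields a clashing key twice.
chained_items = list(chain(first.items(), second.items()))

# ChainMap presents each key once; the earliest mapping wins on lookup.
combined = ChainMap(first, second)
lookup_b = combined['b']
unique_keys = sorted(combined)

# Neither approach copies the data, so later changes are visible.
second['d'] = 40
has_d = 'd' in combined
```

If key clashes really do not matter, chain is the lighter choice for plain iteration; ChainMap is the better fit when callers also need lookups by key.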

You can use ChainMap:
A ChainMap groups multiple dicts or other mappings together to create a single, updateable view. If no maps are specified, a single empty dictionary is provided so that a new chain always has at least one mapping.
from collections import ChainMap
context = ChainMap(_priv1, _priv2, ...)
Example:
In [3]: _priv1 = {1: 2, 3: 4}
In [4]: _priv2 = {5: 6, 7: 8}
In [5]: from collections import ChainMap
...: context = ChainMap(_priv1, _priv2)
In [6]: context
Out[6]: ChainMap({1: 2, 3: 4}, {5: 6, 7: 8})
In [7]: _priv1.update({9: 10})
In [8]: context
Out[8]: ChainMap({1: 2, 3: 4, 9: 10}, {5: 6, 7: 8})
In [9]: context.get(9)
Out[9]: 10
For your code example, I'd use:
from collections import ChainMap
class A:
    _priv1 = {'a': 1, 'b': 2}
    _priv2 = {'c': 1, 'd': 2}
    _union_dict = ChainMap(_priv1, _priv2)
    @classmethod
    def items(cls):
        return cls._union_dict.items()

You can use chain to combine two or more views. As stated:
Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted.
That way, the data is not copied.
from itertools import chain
class A:
    _priv1 = {'a': 1, 'b': 2}
    _priv2 = {'c': 1, 'd': 2}
    def items(self):
        return chain(self._priv1.items(), self._priv2.items())
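One thing to keep in mind is that chain returns a one-shot iterator, so the result of items() can only be consumed once. If callers need to loop over it several times, a thin wrapper that builds a fresh chain on each iteration is one option. The sketch below assumes a hypothetical helper class named ChainedItems; it is not part of the standard library.

```python
from itertools import chain


class ChainedItems:
    """Re-iterable, copy-free view over the items of several dicts.

    Hypothetical helper, not part of the standard library.
    """

    def __init__(self, *dicts):
        self._dicts = dicts

    def __iter__(self):
        # Build a fresh chain on every iteration so the view can be reused.
        return chain.from_iterable(d.items() for d in self._dicts)

    def __len__(self):
        # Clashing keys are counted once per dict, matching __iter__.
        return sum(len(d) for d in self._dicts)


class A:
    def __init__(self):
        self._priv1 = {'a': 1, 'b': 2}
        self._priv2 = {'c': 1, 'd': 2}

    def items(self):
        return ChainedItems(self._priv1, self._priv2)


obj = A()
view = obj.items()
first_pass = list(view)
second_pass = list(view)  # still works; a bare chain would be exhausted here
obj._priv2['e'] = 3       # later changes show up in the existing view
after_update = list(view)
```

The wrapper only stores references to the dictionaries, so it stays as cheap as the bare chain.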

Related

How to combine 2 dictionaries and add the values of the duplicate keys together [duplicate]

For example I have two dicts:
Dict A: {'a': 1, 'b': 2, 'c': 3}
Dict B: {'b': 3, 'c': 4, 'd': 5}
I need a pythonic way of 'combining' two dicts such that the result is:
{'a': 1, 'b': 5, 'c': 7, 'd': 5}
That is to say: if a key appears in both dicts, add their values, if it appears in only one dict, keep its value.
Use collections.Counter:
>>> from collections import Counter
>>> A = Counter({'a':1, 'b':2, 'c':3})
>>> B = Counter({'b':3, 'c':4, 'd':5})
>>> A + B
Counter({'c': 7, 'b': 5, 'd': 5, 'a': 1})
Counters are basically a subclass of dict, so you can still do everything else with them you'd normally do with that type, such as iterate over their keys and values.
A more generic solution, which works for non-numeric values as well:
a = {'a': 'foo', 'b':'bar', 'c': 'baz'}
b = {'a': 'spam', 'c':'ham', 'x': 'blah'}
r = dict(list(a.items()) + list(b.items()) +
         [(k, a[k] + b[k]) for k in set(b) & set(a)])
or even more generic:
import operator
def combine_dicts(a, b, op=operator.add):
    # In Python 3, items() returns views, so convert them to lists first
    return dict(list(a.items()) + list(b.items()) +
                [(k, op(a[k], b[k])) for k in set(b) & set(a)])
For example:
>>> a = {'a': 2, 'b':3, 'c':4}
>>> b = {'a': 5, 'c':6, 'x':7}
>>> import operator
>>> print(combine_dicts(a, b, operator.mul))
{'a': 10, 'b': 3, 'c': 24, 'x': 7}
>>> A = {'a':1, 'b':2, 'c':3}
>>> B = {'b':3, 'c':4, 'd':5}
>>> c = {x: A.get(x, 0) + B.get(x, 0) for x in set(A).union(B)}
>>> print(c)
{'a': 1, 'c': 7, 'b': 5, 'd': 5}
Intro:
There are the (probably) best solutions, but you have to know them and remember them, and sometimes you have to hope that your Python version isn't too old, or whatever the issue might be.
Then there are the most 'hacky' solutions. They are great and short but sometimes hard to understand, read, and remember.
There is, though, an alternative, which is to try to reinvent the wheel.
- Why reinvent the wheel?
- Generally because it's a really good way to learn (and sometimes just because the existing tool doesn't do exactly what you want, or not in the way you want), and it's the easiest way if you don't know or don't remember the perfect tool for your problem.
So, I propose to reinvent the wheel of the Counter class from the collections module (at least partially):
class MyDict(dict):
    def __add__(self, oth):
        r = self.copy()
        try:
            for key, val in oth.items():
                if key in r:
                    r[key] += val  # You can custom it here
                else:
                    r[key] = val
        except AttributeError:  # In case oth isn't a dict
            return NotImplemented  # The convention when a case isn't handled
        return r
a = MyDict({'a':1, 'b':2, 'c':3})
b = MyDict({'b':3, 'c':4, 'd':5})
print(a+b) # Output {'a':1, 'b': 5, 'c': 7, 'd': 5}
There are probably other ways to implement this, and there are already tools that do it, but it's always nice to see how things basically work.
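One limitation of the class above is that dict.copy() returns a plain dict, so the result of a + b no longer supports + and additions cannot be chained. A possible variant that keeps the subclass type is sketched below; the name AddableDict is hypothetical, and the isinstance check replaces the try/except of the original.

```python
class AddableDict(dict):
    """dict subclass whose + sums the values of shared keys.

    Hypothetical variant of MyDict that returns its own type, so that
    additions can be chained.
    """

    def __add__(self, other):
        if not isinstance(other, dict):
            return NotImplemented  # let Python raise the usual TypeError
        result = type(self)(self)  # copy, keeping the subclass
        for key, val in other.items():
            result[key] = result[key] + val if key in result else val
        return result


a = AddableDict({'a': 1, 'b': 2, 'c': 3})
b = AddableDict({'b': 3, 'c': 4, 'd': 5})
c = {'a': 10}  # a plain dict works as the right-hand operand

total = a + b + c
```

Here total is again an AddableDict, and a is left unchanged.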
Definitely summing the Counter()s is the most pythonic way to go in such cases but only if it results in a positive value. Here is an example and as you can see there is no c in result after negating the c's value in B dictionary.
In [1]: from collections import Counter
In [2]: A = Counter({'a':1, 'b':2, 'c':3})
In [3]: B = Counter({'b':3, 'c':-4, 'd':5})
In [4]: A + B
Out[4]: Counter({'d': 5, 'b': 5, 'a': 1})
That's because Counters were primarily designed to work with positive integers to represent running counts (a negative count is meaningless). But to help with other use cases, the Python documentation states the minimum range and type restrictions as follows:
The Counter class itself is a dictionary subclass with no restrictions on its keys and values. The values are intended to be numbers representing counts, but you could store anything in the value field.
The most_common() method requires only that the values be orderable.
For in-place operations such as c[key] += 1, the value type need only support addition and subtraction. So fractions, floats, and decimals would work and negative values are supported. The same is also true for update() and subtract() which allow negative and zero values for both inputs and outputs.
The multiset methods are designed only for use cases with positive values. The inputs may be negative or zero, but only outputs with positive values are created. There are no type restrictions, but the value type needs to support addition, subtraction, and comparison.
The elements() method requires integer counts. It ignores zero and negative counts.
So, to get around that problem, you can use Counter.update instead of summing your Counters in order to get the desired output. It works like dict.update() but adds counts instead of replacing them.
In [24]: A.update(B)
In [25]: A
Out[25]: Counter({'d': 5, 'b': 5, 'a': 1, 'c': -1})
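Note that A.update(B) changes A in place. If the original Counters should stay intact, copying first is one way to get a signed sum. A minimal sketch using the same A and B as above:

```python
from collections import Counter

A = Counter({'a': 1, 'b': 2, 'c': 3})
B = Counter({'b': 3, 'c': -4, 'd': 5})

# Copy first so that A itself is left unchanged.
total = A.copy()
total.update(B)  # adds counts and keeps zero or negative results

# For comparison, + builds a new Counter but drops non-positive counts.
positive_only = A + B
```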
import itertools
myDict = {}
for k in itertools.chain(A.keys(), B.keys()):
    myDict[k] = A.get(k, 0)+B.get(k, 0)
The one with no extra imports!
There is a Pythonic principle called EAFP (Easier to Ask for Forgiveness than Permission). The code below is based on that principle.
# The A and B dictionaries
A = {'a': 1, 'b': 2, 'c': 3}
B = {'b': 3, 'c': 4, 'd': 5}
# The final dictionary. Will contain the final outputs.
newdict = {}
# Make sure every key of A and B get into the final dictionary 'newdict'.
newdict.update(A)
newdict.update(B)
# Iterate through each key of A.
for i in A.keys():
    # If same key exist on B, its values from A and B will add together and
    # get included in the final dictionary 'newdict'.
    try:
        addition = A[i] + B[i]
        newdict[i] = addition
    # If current key does not exist in dictionary B, it will give a KeyError,
    # catch it and continue looping.
    except KeyError:
        continue
EDIT: thanks to jerzyk for his improvement suggestions.
import itertools
import collections
dictA = {'a':1, 'b':2, 'c':3}
dictB = {'b':3, 'c':4, 'd':5}
new_dict = collections.defaultdict(int)
# use dict.iteritems() instead of dict.items() for Python 2
for k, v in itertools.chain(dictA.items(), dictB.items()):
    new_dict[k] += v
print(dict(new_dict))
# OUTPUT
{'a': 1, 'b': 5, 'c': 7, 'd': 5}
OR
Alternatively, you can use Counter, as @Martijn has mentioned above.
For a more generic and extensible way, check mergedict. It uses singledispatch and can merge values based on their types.
Example:
from mergedict import MergeDict
class SumDict(MergeDict):
    @MergeDict.dispatch(int)
    def merge_int(this, other):
        return this + other
d2 = SumDict({'a': 1, 'b': 'one'})
d2.merge({'a':2, 'b': 'two'})
assert d2 == {'a': 3, 'b': 'two'}
From python 3.5: merging and summing
Thanks to @tokeinizer_fsj, who told me in a comment that I had not fully understood the question (I thought "add" meant just adding the keys that differ between the two dictionaries, whereas it actually meant that the values of common keys should be summed). So I added a loop before the merge, so that the second dictionary contains the sum for the common keys. In the merge, the values of the last dictionary are the ones that survive in the resulting dictionary, so I think the problem is solved. The solution is valid for Python 3.5 and later versions.
a = {
"a": 1,
"b": 2,
"c": 3
}
b = {
"a": 2,
"b": 3,
"d": 5
}
# Python 3.5
for key in b:
    if key in a:
        b[key] = b[key] + a[key]
c = {**a, **b}
print(c)
>>> c
{'a': 3, 'b': 5, 'c': 3, 'd': 5}
Reusable code
a = {'a': 1, 'b': 2, 'c': 3}
b = {'b': 3, 'c': 4, 'd': 5}
def mergsum(a, b):
    # Note: this updates b in place before building the merged dict
    for k in b:
        if k in a:
            b[k] = b[k] + a[k]
    c = {**a, **b}
    return c
print(mergsum(a, b))
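Additionally, please note that a.update(b) is about 2x faster than a + b in the timing below. The two are not interchangeable, though: update() modifies a in place and keeps zero and negative counts, while a + b returns a new Counter.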
from collections import Counter
a = Counter({'menu': 20, 'good': 15, 'happy': 10, 'bar': 5})
b = Counter({'menu': 1, 'good': 1, 'bar': 3})
%timeit a + b;
## 100000 loops, best of 3: 8.62 µs per loop
## The slowest run took 4.04 times longer than the fastest. This could mean that an intermediate result is being cached.
%timeit a.update(b)
## 100000 loops, best of 3: 4.51 µs per loop
One line solution is to use dictionary comprehension.
C = { k: A.get(k,0) + B.get(k,0) for k in list(B.keys()) + list(A.keys()) }
def merge_with(f, xs, ys):
    xs = dict(xs)  # work on a copy so the caller's dict is not modified
    for (y, v) in ys.items():
        xs[y] = v if y not in xs else f(xs[y], v)
    return xs
merge_with((lambda x, y: x + y), A, B)
You could easily generalize this:
def merge_dicts(f, *dicts):
    result = {}
    for d in dicts:
        for (k, v) in d.items():
            result[k] = v if k not in result else f(result[k], v)
    return result
Then it can take any number of dicts.
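As a usage sketch, here is a Python 3 version of the generalized function applied to three dictionaries with two different combining functions. The dictionary C and the choice of max as a second example are additions for illustration only.

```python
import operator


def merge_dicts(f, *dicts):
    """Merge any number of dicts, combining values of shared keys with f."""
    result = {}
    for d in dicts:
        for k, v in d.items():
            result[k] = f(result[k], v) if k in result else v
    return result


A = {'a': 1, 'b': 2, 'c': 3}
B = {'b': 3, 'c': 4, 'd': 5}
C = {'a': 10, 'd': 1}

summed = merge_dicts(operator.add, A, B, C)
largest = merge_dicts(max, A, B, C)
```

Because a new result dictionary is built, none of the inputs are modified.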
This is a simple solution for merging two dictionaries where += can be applied to the values, it has to iterate over a dictionary only once
a = {'a':1, 'b':2, 'c':3}
dicts = [{'b':3, 'c':4, 'd':5},
{'c':9, 'a':9, 'd':9}]
def merge_dicts(merged,mergedfrom):
    for k,v in mergedfrom.items():
        if k in merged:
            merged[k] += v
        else:
            merged[k] = v
    return merged
for dct in dicts:
    a = merge_dicts(a,dct)
print (a)
#{'c': 16, 'b': 5, 'd': 14, 'a': 10}
Here's yet another option using dictionary comprehensions combined with the behavior of dict():
dict3 = dict(dict1, **{ k: v + dict1.get(k, 0) for k, v in dict2.items() })
# {'a': 4, 'b': 2, 'c': 7, 'g': 1}
From https://docs.python.org/3/library/stdtypes.html#dict:
https://docs.python.org/3/library/stdtypes.html#dict
and also
If keyword arguments are given, the keyword arguments and their values are added to the dictionary created from the positional argument.
The dict comprehension
**{ k: v + dict1.get(k, 0) for k, v in dict2.items() }
handles adding dict1[k] to v. We don't need an explicit if here because the default value for our dict1.get can be set to 0 instead.
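One caveat worth checking before relying on this form: passing a dictionary through ** to dict() turns its keys into keyword arguments, so it only works when every key is a string. The sketch below shows the failure with integer keys and an alternative that unpacks inside a dict display instead; the int_keys1 and int_keys2 names are made up for illustration.

```python
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'a': 3, 'g': 1, 'c': 4}

# Works here because every key in dict2 is a string.
dict3 = dict(dict1, **{k: v + dict1.get(k, 0) for k, v in dict2.items()})

# With non-string keys the keyword-argument form raises TypeError ...
int_keys1 = {1: 10, 2: 20}
int_keys2 = {2: 5, 3: 7}
try:
    dict(int_keys1,
         **{k: v + int_keys1.get(k, 0) for k, v in int_keys2.items()})
    kwargs_form_failed = False
except TypeError:
    kwargs_form_failed = True

# ... whereas unpacking inside a dict display accepts any hashable key.
merged = {**int_keys1,
          **{k: v + int_keys1.get(k, 0) for k, v in int_keys2.items()}}
```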
This solution is easy to use: it behaves like a normal dictionary, but two of them can be added with the + operator.
class SumDict(dict):
    def __add__(self, y):
        return {x: self.get(x, 0) + y.get(x, 0) for x in set(self).union(y)}
A = SumDict({'a': 1, 'c': 2})
B = SumDict({'b': 3, 'c': 4}) # Also works: B = {'b': 3, 'c': 4}
print(A + B) # OUTPUT {'a': 1, 'b': 3, 'c': 6}
The above solutions are great for the scenario where you have a small number of Counters. If you have a big list of them though, something like this is much nicer:
from collections import Counter
A = Counter({'a':1, 'b':2, 'c':3})
B = Counter({'b':3, 'c':4, 'd':5})
C = Counter({'a': 5, 'e':3})
list_of_counts = [A, B, C]
total = sum(list_of_counts, Counter())
print(total)
# Counter({'c': 7, 'a': 6, 'b': 5, 'd': 5, 'e': 3})
The above solution is essentially summing the Counters by:
total = Counter()
for count in list_of_counts:
    total += count
print(total)
# Counter({'c': 7, 'a': 6, 'b': 5, 'd': 5, 'e': 3})
This does the same thing but I think it always helps to see what it is effectively doing underneath.
What about:
def dict_merge_and_sum( d1, d2 ):
    ret = dict(d1)  # copy so that d1 itself is not modified
    ret.update({ k:v + d2[k] for k,v in d1.items() if k in d2 })
    ret.update({ k:v for k,v in d2.items() if k not in d1 })
    return ret
A = {'a': 1, 'b': 2, 'c': 3}
B = {'b': 3, 'c': 4, 'd': 5}
print( dict_merge_and_sum( A, B ) )
Output:
{'d': 5, 'a': 1, 'c': 7, 'b': 5}
A more conventional way to combine two dicts. Using modules and tools is good, but understanding the logic behind them will help in case you don't remember the tools.
Program to combine two dictionaries, adding the values for common keys.
def combine_dict(d1,d2):
    result = dict(d2)  # copy so that d2 itself is not modified
    for key,value in d1.items():
        if key in result:
            result[key] += value
        else:
            result[key] = value
    return result
combine_dict({'a':1, 'b':2, 'c':3},{'b':3, 'c':4, 'd':5})
output == {'b': 5, 'c': 7, 'd': 5, 'a': 1}
Here's a very general solution. It handles any number of dicts, copes with keys that appear in only some of them, and lets you plug in any aggregation function you want:
def aggregate_dicts(dicts, operation=sum):
    """Aggregate a sequence of dictionaries using `operation`."""
    all_keys = set().union(*[el.keys() for el in dicts])
    return {k: operation([dic.get(k, None) for dic in dicts]) for k in all_keys}
example:
dicts_same_keys = [{'x': 0, 'y': 1}, {'x': 1, 'y': 2}, {'x': 2, 'y': 3}]
aggregate_dicts(dicts_same_keys, operation=sum)
#{'x': 3, 'y': 6}
example non-identical keys and generic aggregation:
dicts_diff_keys = [{'x': 0, 'y': 1}, {'x': 1, 'y': 2}, {'x': 2, 'y': 3, 'c': 4}]
def mean_no_none(l):
    l_no_none = [el for el in l if el is not None]
    return sum(l_no_none) / len(l_no_none)
aggregate_dicts(dicts_diff_keys, operation=mean_no_none)
# {'x': 1.0, 'c': 4.0, 'y': 2.0}
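Because aggregate_dicts passes None for missing keys, the default operation=sum fails as soon as a key is absent from one of the dicts, which is why the example above needs mean_no_none. A possible variant that only passes the values that actually exist is sketched here; the name aggregate_present is hypothetical.

```python
def aggregate_present(dicts, operation=sum):
    """Like aggregate_dicts above, but only passes values that exist."""
    all_keys = set().union(*(d.keys() for d in dicts))
    return {k: operation([d[k] for d in dicts if k in d]) for k in all_keys}


dicts_diff_keys = [{'x': 0, 'y': 1}, {'x': 1, 'y': 2}, {'x': 2, 'y': 3, 'c': 4}]
totals = aggregate_present(dicts_diff_keys)                 # default: sum
maxima = aggregate_present(dicts_diff_keys, operation=max)  # any reducer works
```

With this variant, built-ins such as sum, max, or min can be used directly, without a wrapper that filters out None.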
dict1 = {'a':1, 'b':2, 'c':3}
dict2 = {'a':3, 'g':1, 'c':4}
dict3 = {} # will store new values
for x in dict1:
    if x in dict2:  # sum values with the same key
        dict3[x] = dict1[x] + dict2[x]
    else:  # copy the value that exists only in dict1
        dict3[x] = dict1[x]
# add the keys that exist only in dict2
for x in dict2:
    if x not in dict1:
        dict3[x] = dict2[x]
print(dict3) # {'a': 4, 'b': 2, 'c': 7, 'g': 1}
Merging three dicts a,b,c in a single line without any other modules or libs
If we have the three dicts
a = {"a":9}
b = {"b":7}
c = {'b': 2, 'd': 90}
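Merge all with a single line and return a dict object using
c = dict(list(a.items()) + list(b.items()) + list(c.items()))  # Python 3: items() views cannot be added directly
Returning
{'a': 9, 'b': 2, 'd': 90}
Note that this does not sum the values of duplicate keys; the value from the last dict wins.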

How to join all keys and values of dictionary and return in form of string?

Step 1. i/p= “wwwwaaadexxxxxx”
Step 2. converted= {'w': 4, 'a': 3, 'd': 1, 'e': 1, 'x': 6}
Step Final. o/p= 'w4a3d1e1x6'
I'm at step 2; how do I get to the final step?
I would appreciate a direct conversion from step 1 to the final step.
Time complexity should be low, but I would appreciate any solution.
I want the result returned as a string stored in a variable,
without importing anything.
You can get the key and value pairs (using dict.items()), format each pair into a list of strings, then use join to create a single string out of it!
converted= {'w': 4, 'a': 3, 'd': 1, 'e': 1, 'x': 6}
print(''.join([f"{k}{v}" for k,v in converted.items()]))
w4a3d1e1x6
OR use Counter
Counter is from collections module that will give you a dict like structure with Count of each character
from collections import Counter
my_str = 'wwwwaaadexxxxxx'
print(''.join([f"{k}{v}" for k,v in Counter(my_str).items()]))
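For the direct conversion from step 1 to the final step without any import, one possible sketch counts the characters with a plain dict and joins the result. The function name compress_counts is made up for illustration, and it counts total occurrences per character (as in the dictionary from step 2), not consecutive runs.

```python
def compress_counts(text):
    """Return each distinct character followed by how often it occurs."""
    counts = {}
    for ch in text:
        # dict.get supplies 0 the first time a character is seen.
        counts[ch] = counts.get(ch, 0) + 1
    # Dicts keep insertion order (Python 3.7+), so characters appear in
    # order of first occurrence.
    return ''.join(f"{ch}{n}" for ch, n in counts.items())


result = compress_counts("wwwwaaadexxxxxx")
```

This makes a single pass over the input plus one pass over the distinct characters, so the running time grows linearly with the length of the string.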

Using other dictionary values to define a dictionary value during initialization

Say I have three variables which I want to store in a dictionary such that the third is the sum of the first two. Is there a way to do this in one call when the dictionary is initialized? For example:
myDict = {'a': 1, 'b': 2, 'c': myDict['a'] + myDict['b']}
Python>=3.8's named assignment allows something like the following, which I guess you could interpret as one call:
>>> md = {**(md := {'a': 2, 'b': 3}), **{'c': md['a'] + md['b']}}
>>> md
{'a': 2, 'b': 3, 'c': 5}
But this is really just a fanciful way of forcing a two-liner into a single line and making it less readable and less memory-efficient (because of the intermediate dicts). Also note that the md used on the right hand side of the = really could be any name.
You could actually be a little more efficient and get rid of one spurious auxiliary dict:
(md := {'a': 2, 'b': 3}).update({'c': md['a'] + md['b']})
You can do:
>>> myDict = {'a': 1, 'b': 2}
>>> myDict["c"] = myDict["a"] + myDict["b"]
>>> myDict
{'a': 1, 'b': 2, 'c': 3}
You cannot do this in one line with a plain dict literal, because myDict does not exist yet when the value for 'c' is evaluated.

Selecting random values from dictionary

Let's say I have this dictionary:
dict = {'a': 100, 'b': 5, 'c': 150, 'd': 60};
I get the key which has greatest value with this code:
most_similar = max(dict.items(), key=operator.itemgetter(1))[0]
it returns 'c'
But I want to select a random key from top 3 greatest values. According to this dictionary top 3 are:
c
a
d
It should randomly select a key from them. How can I do that?
If you want to find the top 3 keys and then get one of the keys randomly, then I would recommend using random.choice and collections.Counter, like this
>>> d = {'a': 100, 'b': 5, 'c': 150, 'd': 60}
>>> from collections import Counter
>>> from random import choice
>>> choice(Counter(d).most_common(3))[0]
'c'
Counter(d).most_common(3) will get the top three values from the dictionary based on the values of the dictionary object passed to it and then we randomly pick one of the returned values and return only the key from it.
Get the keys with the three largest values.
>>> import heapq
>>> d = {'a': 100, 'b': 5, 'c': 150, 'd': 60}
>>> largest = heapq.nlargest(3, d, key=d.__getitem__)
>>> largest
['c', 'a', 'd']
Then select one of them randomly:
>>> import random
>>> random.choice(largest)
'c'
Sort the dictionary by descending value, get the first three objects from the resulting list, then use random.choice:
>>> import random
>>> d = {'a': 100, 'b': 5, 'c': 150, 'd': 60}
>>> random.choice(sorted(d, reverse=True, key=d.get)[:3])
'c'
And don't call it dict or you'll mask the built-in.

How to compare values inside a dictionary to fill up sets()

dico = {"dico": {1:"bailler",2:"bailler",3:"percer",4:"calculer",5:"calculer",6:"trouer",7:"bailler",8:"découvrir",9:"bailler",10:"miser",11:"trouer",12:"changer"}}
I have a big dictionary of dictionaries like that. I want to put identical elements together in sets. So I need some kind of condition that says: if values of "dico" are equal, put their keys in the same set():
b=[{1,2,7,9},{3},{4,5},{6,11},{8},{10},{12}]
I don't know if this question has already been asked, but as a new Pythonista I don't have all the keys... ^^
Thank you for your answers
I would reverse your dictionary and have the value a set(), then return all the values.
>>> from collections import defaultdict
>>> my_dict = {"dico": {1:"bailler",2:"bailler",3:"percer",4:"calculer",5:"calculer",6:"trouer",7:"bailler",8:"découvrir",9:"bailler",10:"miser",11:"trouer",12:"changer"}}
>>> my_other_dict = defaultdict(set)
>>> for dict_name, sub_dict in my_dict.items():
...     for k, v in sub_dict.items():
...         my_other_dict[v].add(k)  # the value, i.e. "bailler", is now the key
...         # e.g. {"bailler": {1, 2, 9, 7}, ...}
...
>>> list(my_other_dict.values())
[{1, 2, 9, 7}, {3}, {4, 5}, {11, 6}, {8}, {10}, {12}]
Of course as cynddl has pointed out, if your index in a list will always be the "key", simply enumerate a list and you won't have to store original data as a dictionary, nor use sets() as indices are unique.
You should write your data this way:
dico = ["bailler", "bailler", "percer", "calculer", "calculer", "trouer", "bailler", "découvrir", "bailler", "miser", "trouer", "changer"]
If you want to count the number of identical elements, use collections.Counter:
import collections
counter=collections.Counter(dico)
print(counter)
which returns a Counter object:
Counter({'bailler': 4, 'calculer': 2, 'trouer': 2, 'percer': 1, 'découvrir': 1, 'miser': 1, 'changer': 1})
The dict.setdefault() method can be handy for tasks like this, as well as dict.items() which iterates through the (key, value) pairs of the dictionary.
>>> dico = {"dico": {1:"bailler",2:"bailler",3:"percer",4:"calculer",5:"calculer",6:"trouer",7:"bailler",8:"découvrir",9:"bailler",10:"miser",11:"trouer",12:"changer"}}
>>> newdict = {}
>>> for k, subdict in dico.items():
... newdict[k] = {}
... for subk, subv in subdict.items():
... newdict[k].setdefault(subv, set()).add(subk)
...
>>> newdict
{'dico': {'bailler': {1, 2, 9, 7}, 'miser': {10}, 'découvrir': {8}, 'calculer': {4, 5}, 'changer': {12}, 'percer': {3}, 'trouer': {11, 6}}}
>>> newdict['dico'].values()
dict_values([{1, 2, 9, 7}, {10}, {8}, {4, 5}, {12}, {3}, {11, 6}])
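The setdefault approach can be wrapped into a small reusable function that returns the list of sets asked for in the question. This is only a sketch; the function name group_keys_by_value and the variable words are hypothetical.

```python
def group_keys_by_value(mapping):
    """Return a list of sets; each set holds the keys sharing one value."""
    groups = {}
    for key, value in mapping.items():
        # Create the set on first sight of a value, then add the key to it.
        groups.setdefault(value, set()).add(key)
    return list(groups.values())


words = {1: "bailler", 2: "bailler", 3: "percer", 4: "calculer",
         5: "calculer", 6: "trouer", 7: "bailler", 8: "découvrir",
         9: "bailler", 10: "miser", 11: "trouer", 12: "changer"}
b = group_keys_by_value(words)
```

The groups come out in order of first appearance of each value, since dicts preserve insertion order in Python 3.7 and later.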
