Dict merge in a dict comprehension - python

In Python 3.5, we can merge dicts by using double-splat unpacking:
>>> d1 = {1: 'one', 2: 'two'}
>>> d2 = {3: 'three'}
>>> {**d1, **d2}
{1: 'one', 2: 'two', 3: 'three'}
Cool. It doesn't seem to generalise to dynamic use cases, though:
>>> ds = [d1, d2]
>>> {**d for d in ds}
SyntaxError: dict unpacking cannot be used in dict comprehension
Instead we have to do reduce(lambda x,y: {**x, **y}, ds, {}), which seems a lot uglier. Why is the "one obvious way to do it" not allowed by the parser, when there doesn't seem to be any ambiguity in that expression?

It's not exactly an answer to your question but I'd consider using ChainMap to be an idiomatic and elegant way to do what you propose (merging dictionaries in-line):
>>> from collections import ChainMap
>>> d1 = {1: 'one', 2: 'two'}
>>> d2 = {3: 'three'}
>>> ds = [d1, d2]
>>> dict(ChainMap(*ds))
{1: 'one', 2: 'two', 3: 'three'}
It's not a particularly transparent solution, though, since many programmers might not know exactly how a ChainMap works. Note that (as @AnttiHaapala points out) "first found is used", so depending on your intentions you might need to call reversed on your dicts before passing them into ChainMap.
>>> d2 = {3: 'three', 2: 'LOL'}
>>> ds = [d1, d2]
>>> dict(ChainMap(*ds))
{1: 'one', 2: 'two', 3: 'three'}
>>> dict(ChainMap(*reversed(ds)))
{1: 'one', 2: 'LOL', 3: 'three'}

To me, the obvious way is:
d_out = {}
for d in ds:
    d_out.update(d)
This is quick and probably quite performant. I can't speak for the Python developers, but I'm not convinced that your expected version is easier to read. For example, your comprehension looks more like a set comprehension to me due to the lack of a :. FWIW, I don't think there is any technical reason (e.g. parser ambiguity) that they couldn't add that form of comprehension unpacking.
Apparently, these forms were proposed, but didn't have universal enough support to warrant implementing them (yet).
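For the dynamic case in the question, the update loop can be wrapped in a small helper. This is only a sketch: the name merge_all is hypothetical, and it assumes that later dicts should win on duplicate keys.

```python
def merge_all(dicts):
    """Merge an iterable of dicts into a new dict; later dicts win."""
    merged = {}
    for d in dicts:
        merged.update(d)  # copies keys in; the input dicts are not modified
    return merged


d1 = {1: 'one', 2: 'two'}
d2 = {3: 'three', 2: 'LOL'}
ds = [d1, d2]
merged = merge_all(ds)
print(merged)  # {1: 'one', 2: 'LOL', 3: 'three'}
```

Because the helper builds a fresh dict, d1 and d2 are left untouched.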

You could use itertools.chain or itertools.chain.from_iterable:
import itertools
ds = [{'a': 1, 'b': 2}, {'c': 30, 'b': 40}]
merged_d = dict(itertools.chain(*(d.items() for d in ds)))
print(merged_d) # {'a': 1, 'b': 40, 'c': 30}
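The chain.from_iterable variant mentioned above avoids unpacking the generator with *. A minimal sketch using the same sample data:

```python
from itertools import chain

ds = [{'a': 1, 'b': 2}, {'c': 30, 'b': 40}]
# from_iterable consumes the generator lazily, one dict's items at a time
merged_d = dict(chain.from_iterable(d.items() for d in ds))
print(merged_d)  # {'a': 1, 'b': 40, 'c': 30}
```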

This is based on this solution (also mentioned by @ilgia-everilä), but made Py2 compatible while still avoiding intermediate structures. Encapsulating it inside a function makes its use quite readable.
def merge_dicts(*dicts, **extra):
    """
    >>> merge_dicts(dict(a=1, b=1), dict(b=2, c=2), dict(c=3, d=3), d=4, e=4)
    {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 4}
    """
    return dict((
        (k, v)
        for d in dicts
        for k, v in d.items()
    ), **extra)

Idiomatic, without ChainMap:
>>> d1 = {1: 'one', 2: 'two'}
>>> d2 = {3: 'three'}
>>> {k: v for d in [d1, d2] for k, v in d.items()}
{1: 'one', 2: 'two', 3: 'three'}

You could define this function:
from collections import ChainMap
def mergeDicts(l):
    return dict(ChainMap(*reversed(list(l))))
You can then use it like this:
>>> d1 = {1: 'one', 2: 'two'}
>>> d2 = {3: 'three'}
>>> ds = [d1, d2]
>>> mergeDicts(ds)
{1: 'one', 2: 'two', 3: 'three'}

Related

Python: Remove item from dictionary in functional way

I've been programming in Python for quite a while now. I've always wondered, is there a way to remove an item from a dictionary and return the newly created dictionary? Basically removing an item from a dict in a functional way.
As far as I know, there are only the del dict[item] and dict.pop(item) methods, however both modify data and don't return the new dict.
There is no built-in way for dicts, you have to do it yourself. Something to the effect of:
>>> data = dict(a=1,b=2,c=3)
>>> data
{'a': 1, 'b': 2, 'c': 3}
>>> {k:v for k,v in data.items() if k != item}
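The comprehension above can be wrapped in a small helper so that the key to drop is an explicit argument. This is only a sketch; without_key is a hypothetical name, not a built-in.

```python
def without_key(d, key):
    """Return a new dict without `key`; the original dict is untouched."""
    return {k: v for k, v in d.items() if k != key}


data = dict(a=1, b=2, c=3)
smaller = without_key(data, 'b')
print(smaller)  # {'a': 1, 'c': 3}
print(data)     # {'a': 1, 'b': 2, 'c': 3}
```

Asking to remove a key that is not present simply returns an equal copy rather than raising KeyError.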
Note, Python 3.9 did add a | operator for dicts to create a new, merged dict:
>>> data
{'a': 1, 'b': 2, 'c': 3}
>>> more_data = {"b":4, "c":5, "d":6}
Then
>>> data | more_data
{'a': 1, 'b': 4, 'c': 5, 'd': 6}
So, similar to + for list concatenation. Previously, you could have done something like:
>>> {**data, **more_data}
{'a': 1, 'b': 4, 'c': 5, 'd': 6}
Note, set objects support operators to create new sets, providing operators for various basic set operations:
>>> s1 = {'a','b','c'}
>>> s2 = {'b','c','d'}
>>> s1 & s2 # set intersection
{'b', 'c'}
>>> s1 | s2 # set union
{'c', 'a', 'b', 'd'}
>>> s1 - s2 # set difference
{'a'}
>>> s1 ^ s2 # symmetric difference
{'a', 'd'}
This comes down to API design choices.
The solution would be to use dict.copy() to create a copy and then do your operations on it.
For example:
initial_dict = {"a": 1, "b": 2}
dict_copy = initial_dict.copy()
# Then you can do your item operations
del dict_copy["a"]
# or
dict_copy.pop("b")

Python: select key, values from dictionary corresponding to given list

I have a set dictionary like so:
d = {'cat': 'one', 'dog': 'two', 'fish': 'three'}
Given a list, can I just keep the key, values given?
Input:
l = ['one', 'three']
Output:
new_d = {'cat': 'one', 'fish': 'three'}
You can use dictionary comprehension to achieve this easily:
{k: v for k, v in d.items() if v in l}
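If the list of wanted values is long, converting it to a set first makes each membership test a hash lookup instead of a linear scan. A sketch, assuming the values are hashable (the name wanted is introduced here for illustration):

```python
d = {'cat': 'one', 'dog': 'two', 'fish': 'three'}
l = ['one', 'three']

wanted = set(l)  # assumes every value in l is hashable
new_d = {k: v for k, v in d.items() if v in wanted}
print(new_d)  # {'cat': 'one', 'fish': 'three'}
```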
The scenario you've described above is a perfect use case for the in operator, which tests whether a value is a member of a collection, such as a list.
The code below is to demonstrate the concept. For more practical applications, look at dictionary comprehension.
d = {'cat': 'one', 'dog': 'two', 'fish': 'three'}
l = ['one', 'three']
d_output = {}
for k, v in d.items():  # Loop through input dictionary
    if v in l:  # Check if the value is included in the given list
        d_output[k] = v  # Assign the key: value to the output dictionary
print(d_output)
Output is:
{'cat': 'one', 'fish': 'three'}
You can copy your dictionary and drop unwanted elements:
d = {'cat': 'one', 'dog': 'two', 'fish': 'three'}
l = ['one', 'three']
new_d = d.copy()
for element in d:
    if d[element] not in l:
        new_d.pop(element)
print(d)
print(new_d)
Output is:
{'cat': 'one', 'dog': 'two', 'fish': 'three'}
{'cat': 'one', 'fish': 'three'}

Find the index from values in a list inside a dictionary

Does anyone know how to get the index of the values from dictionary 2 in dictionary 1? Like this:
Dictionary_1 = {'A': ['Tom', 'Jane', 'Joe'], 'B': ['Joana', 'Clare', 'Tom'], 'C': ['Clare', 'Jane', 'Joe']}
Dictionary_2 = {'A': 'Tom', 'B': 'Clare', 'C': 'Jane'}
RESULT = {'A': 1, 'B': 2, 'C': 2}
EDIT:
Sorry guys.. first of all I got confused and forgot that I needed it starting with "0" instead of "1".
I was having a problem, but it was because my list inside of dictionary 1 was a unicode string instead of a list.
Also.. in the example I used here, I noticed later that the keys existed in both dictionaries, but in the code I'm writing that wasn't the case. I didn't post the original here because it was bigger, so I tried to condense it as much as I could. Sorry for that too.
So I got it working with this code:
RESULT = {}
for x, y in Dictionary_1.items():
    for a, b in Dictionary_2.items():
        if x == a:
            z = Dictionary_1[x]
            r = eval(z)
            if '{0}'.format(b) in r:
                RESULT[a] = r.index('{0}'.format(b))
I know that it looks messy but I'm still learning.
I really appreciate your help guys!
You can try using dict comprehension.
dict1={'A':['Tom','Jane','Joe'],'B':['Joana','Clare','Tom'],'C':['Clare','Jane','Joe']}
dict2={'A':'Tom','B':'Clare','C':'Jane'}
result={k:dict1[k].index(v)+1 for k,v in dict2.items()}
# {'A': 1, 'B': 2, 'C': 2}
#Or
# {k:dict1.get(k).index(v)+1 for k,v in dict2.items()}
Assuming you want 0-based indices, you can use list.index() with a dict comprehension:
d1 = {'A': ['Tom', 'Jane', 'Joe'], 'B': ['Joana', 'Clare', 'Tom'], 'C': ['Clare', 'Jane', 'Joe']}
d2 = {'A': 'Tom', 'B': 'Clare', 'C': 'Jane'}
result = {k: d1[k].index(v) for k, v in d2.items()}
print(result)
# {'A': 0, 'B': 1, 'C': 1}
If you want to have indices starting at 1, then you can do d1[k].index(v) + 1.
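Note that list.index() raises ValueError when the value is missing, and d1[k] raises KeyError when a key exists only in d2, which the question's edit says can happen. The sketch below is one way to guard against both; index_or_none is a hypothetical helper, and returning None for a missing value is an assumption about the desired behaviour.

```python
def index_or_none(items, value):
    """Return the 0-based index of value in items, or None if absent."""
    try:
        return items.index(value)
    except ValueError:
        return None


d1 = {'A': ['Tom', 'Jane', 'Joe'], 'B': ['Joana', 'Clare', 'Tom'], 'C': ['Clare', 'Jane', 'Joe']}
d2 = {'A': 'Tom', 'B': 'Clare', 'C': 'Zoe', 'D': 'Tom'}  # 'Zoe' and key 'D' have no match

# Skip keys that d1 does not have; map unmatched values to None
result = {k: index_or_none(d1[k], v) for k, v in d2.items() if k in d1}
print(result)  # {'A': 0, 'B': 1, 'C': None}
```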
An easy-to-understand solution for you:
d1 = {'A': ['Tom', 'Jane', 'Joe'], 'B': ['Joana', 'Clare', 'Tom'], 'C': ['Clare', 'Jane', 'Joe']}
d2 = {'A': 'Tom', 'B': 'Clare', 'C': 'Jane'}
output = {}
for k, v in d2.items():
    output[k] = d1[k].index(v) + 1
print(output)
This is certainly not the best approach but this is what I did quickly:
dict1 = {0: ['Tom', 'Jane', 'Joe'], 1: ['Joana', 'Clare', 'Tom'], 2: ['Clare', 'Jane', 'Joe']}
dict2 ={0: 'Tom', 1: 'Clare', 2: 'Jane'}
result = {}
val_list = list(dict1.values())
for i in range(0, len(dict1)):
    result.update({i: val_list[i].index(dict2[i])})
print(result)

How to make "seen" hash with python dict?

In Perl one can do this:
my %seen;
foreach my $dir ( split /:/, $input ) {
    $seen{$dir}++;
}
This is a way to remove duplicates by keeping track of what has been "seen". In python you cannot do:
seen = {}
for x in ['one', 'two', 'three', 'one']:
    seen[x] += 1
The above python results in KeyError: 'one'.
What is python-y way of making a 'seen' hash?
Use a defaultdict for getting this behavior. The catch is that you need to specify the datatype for defaultdict to work for even those keys which don't have a value:
In [29]: from collections import defaultdict
In [30]: seen = defaultdict(int)
In [31]: for x in ['one', 'two', 'three', 'one']:
    ...:     seen[x] += 1
In [32]: seen
Out[32]: defaultdict(int, {'one': 2, 'three': 1, 'two': 1})
You can use a Counter as well:
>>> from collections import Counter
>>> seen = Counter()
>>> for x in ['one', 'two', 'three', 'one']: seen[x] += 1
...
>>> seen
Counter({'one': 2, 'three': 1, 'two': 1})
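Counter can also do the counting itself when given the iterable directly, so the explicit loop is optional. A minimal sketch (the variable name words is introduced for illustration):

```python
from collections import Counter

words = ['one', 'two', 'three', 'one']
seen = Counter(words)  # counts every element in one pass

print(seen['one'])   # 2
print(seen['four'])  # 0 (missing keys count as zero instead of raising KeyError)
```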
If all you need are uniques, just do a set operation: set(['one', 'two', 'three', 'one'])
You could use a set:
>>> seen=set(['one', 'two', 'three', 'one'])
>>> seen
{'one', 'two', 'three'}
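A set does not keep the original order. If order matters, as it might for a colon-separated path list like the one in the Perl example, dict.fromkeys can be used instead. This sketch assumes Python 3.7+, where dicts are guaranteed to preserve insertion order:

```python
dirs = ['one', 'two', 'three', 'one']

# dict keys are unique and keep first-seen order (Python 3.7+)
unique = list(dict.fromkeys(dirs))
print(unique)  # ['one', 'two', 'three']
```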
If you unroll seen[x] += 1 into seen[x] = seen[x] + 1, the problem with your code is obvious: you're trying to access seen[x] before you've assigned to it. Instead, you need to check if the key exists first:
seen = {}
for x in ['one', 'two', 'three', 'one']:
    if x in seen:
        seen[x] += 1  # we've seen it before, so increment
    else:
        seen[x] = 1  # first time seeing x
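The same check can be written more compactly with dict.get, which returns a default when the key is missing. A short sketch with a plain dict:

```python
seen = {}
for x in ['one', 'two', 'three', 'one']:
    # get() returns 0 for keys not yet present, so no KeyError is raised
    seen[x] = seen.get(x, 0) + 1

print(seen)  # {'one': 2, 'two': 1, 'three': 1}
```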

Strict Comparison of Dictionaries in Python

I'm having a little bit of trouble comparing two similar dictionaries. I would like stricter comparison of the values (and probably keys).
Here's the really basic problem:
>>> {'a': True} == {'a': 1}
True
Similarly (and somewhat confusingly):
>>> {1: 'a'} == {True: 'a'}
True
This makes sense because True == 1. What I'm looking for is something that behaves more like is, but compares two possibly nested dictionaries. Obviously you can't use is on the two dictionaries, because that will always return False, even if all of the elements are identical.
My current solution is to just use json.dumps to get a string representation of both and compare that.
>>> json.dumps({'a': True}, sort_keys=True) == json.dumps({'a': 1}, sort_keys=True)
False
But this only works if everything is JSON-serializable.
I also tried comparing all of the keys and values manually:
>>> l = {'a': True}
>>> r = {'a': 1}
>>> r.keys() == l.keys() and all(l[key] is r[key] for key in l.keys())
False
But this fails if the dictionaries have some nested structure. I figured I could write a recursive version of this to handle the nested case, but it seemed unnecessarily ugly and un-pythonic.
Is there a "standard" or simple way of doing this?
Thanks!
You were pretty close with JSON: Use Python's pprint module instead. This is documented to sort dictionaries in Python 2.5+ and 3:
Dictionaries are sorted by key before the display is computed.
Let's confirm this. Here's a session in Python 3.6 (which conveniently preserves insertion order even for regular dict objects):
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = {2: 'two', 3: 'three', 1: 'one'}
>>> b = {3: 'three', 2: 'two', 1: 'one'}
>>> a
{2: 'two', 3: 'three', 1: 'one'}
>>> b
{3: 'three', 2: 'two', 1: 'one'}
>>> a == b
True
>>> c = {2: 'two', True: 'one', 3: 'three'}
>>> c
{2: 'two', True: 'one', 3: 'three'}
>>> a == b == c
True
>>> from pprint import pformat
>>> pformat(a)
"{1: 'one', 2: 'two', 3: 'three'}"
>>> pformat(b)
"{1: 'one', 2: 'two', 3: 'three'}"
>>> pformat(c)
"{True: 'one', 2: 'two', 3: 'three'}"
>>> pformat(a) == pformat(b)
True
>>> pformat(a) == pformat(c)
False
>>>
And let's quickly confirm that pretty-printing sorts nested dictionaries:
>>> a['b'] = b
>>> a
{2: 'two', 3: 'three', 1: 'one', 'b': {3: 'three', 2: 'two', 1: 'one'}}
>>> pformat(a)
"{1: 'one', 2: 'two', 3: 'three', 'b': {1: 'one', 2: 'two', 3: 'three'}}"
>>>
So, instead of serializing to JSON, serialize using pprint.pformat(). I imagine there may be some corner cases where two objects that you want to consider unequal nevertheless create the same pretty-printed representation. But those cases should be rare, and you wanted something simple and Pythonic, which this is.
You can test identity of all (key, value) pairs element-wise:
def equal_dict(d1, d2):
    return all((k1 is k2) and (v1 is v2)
               for (k1, v1), (k2, v2) in zip(d1.items(), d2.items()))
>>> equal_dict({True: 'a'}, {True: 'a'})
True
>>> equal_dict({1: 'a'}, {True: 'a'})
False
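This relies on object identity, so it is reliable for bool and None and, in CPython, for small cached ints and interned strings, but not for floats, large ints, other sequences or more complex objects. It also assumes both dicts have the same length and insertion order, since zip pairs the items positionally.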
Anyway, that's a start if you need it.
I think you are looking for something like this. However since you didn't provide example data I won't go into guessing what it could be
from boltons.iterutils import remap
def compare(A, B): return A == B and type(A) == type(B)
dict_to_compare_against = { some dict }
def visit(path, key, value):
    cur = dict_to_compare_against
    for i in path:
        cur = cur[i]
    cur = cur[key]  # step from the parent container to the matching entry
    if not compare(cur, value):
        raise Exception("Not equal")
    return True  # remap expects True, False or a (key, value) pair from visit
remap(other_dict, visit=visit)
You can use isinstance() to delineate between a regular dictionary entry and a nested dictionary entry. This way you can iterate through using is to compare strictly, but also check when you need to dive down a level into the nested dictionary.
https://docs.python.org/3/library/functions.html#isinstance
myDict = {'a': True, 'b': False, 'c': {'a': True}}
for key, value in myDict.items():
    if isinstance(value, dict):
        pass  # do what you need to do....
    else:
        pass  # etc...
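Putting the isinstance idea together with a type check gives a recursive comparison. The following is only a sketch, not a standard library facility: strict_equal is a hypothetical name, it compares exact types (so a dict subclass never equals a plain dict), and it recurses only into dicts, lists and tuples.

```python
def strict_equal(a, b):
    """Compare by exact type and value, recursing into dicts, lists and tuples."""
    if type(a) is not type(b):
        return False
    if isinstance(a, dict):
        if len(a) != len(b):
            return False
        # Index b by (type, key) so that 1 and True count as different keys
        b_by_key = {(type(k), k): v for k, v in b.items()}
        return all(
            (type(k), k) in b_by_key and strict_equal(v, b_by_key[(type(k), k)])
            for k, v in a.items()
        )
    if isinstance(a, (list, tuple)):
        return len(a) == len(b) and all(strict_equal(x, y) for x, y in zip(a, b))
    return a == b  # same type, so plain equality is strict enough here


print(strict_equal({'a': True}, {'a': 1}))                    # False
print(strict_equal({1: 'a'}, {True: 'a'}))                    # False
print(strict_equal({'a': {'b': [1, 2]}}, {'a': {'b': [1, 2]}}))  # True
```

Unlike the json.dumps approach, this does not require the values to be JSON-serializable, and key order does not matter.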
