Strict Comparison of Dictionaries in Python - python

I'm having a little bit of trouble comparing two similar dictionaries. I would like stricter comparison of the values (and probably keys).
Here's the really basic problem:
>>> {'a': True} == {'a': 1}
True
Similarly (and somewhat confusingly):
>>> {1: 'a'} == {True: 'a'}
True
This makes sense because True == 1. What I'm looking for is something that behaves more like is, but compares two possibly nested dictionaries. Obviously you can't use is on the two dictionaries themselves, because for two distinct dict objects it returns False even if all of the elements are identical.
My current solution is to just use json.dumps to get a string representation of both and compare that.
>>> json.dumps({'a': True}, sort_keys=True) == json.dumps({'a': 1}, sort_keys=True)
False
But this only works if everything is JSON-serializable.
I also tried comparing all of the keys and values manually:
>>> l = {'a': True}
>>> r = {'a': 1}
>>> r.keys() == l.keys() and all(l[key] is r[key] for key in l.keys())
False
But this fails if the dictionaries have some nested structure. I figured I could write a recursive version of this to handle the nested case, but it seemed unnecessarily ugly and un-pythonic.
Is there a "standard" or simple way of doing this?
Thanks!

You were pretty close with JSON: Use Python's pprint module instead. This is documented to sort dictionaries in Python 2.5+ and 3:
Dictionaries are sorted by key before the display is computed.
Let's confirm this. Here's a session in Python 3.6 (which conveniently preserves insertion order even for regular dict objects):
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = {2: 'two', 3: 'three', 1: 'one'}
>>> b = {3: 'three', 2: 'two', 1: 'one'}
>>> a
{2: 'two', 3: 'three', 1: 'one'}
>>> b
{3: 'three', 2: 'two', 1: 'one'}
>>> a == b
True
>>> c = {2: 'two', True: 'one', 3: 'three'}
>>> c
{2: 'two', True: 'one', 3: 'three'}
>>> a == b == c
True
>>> from pprint import pformat
>>> pformat(a)
"{1: 'one', 2: 'two', 3: 'three'}"
>>> pformat(b)
"{1: 'one', 2: 'two', 3: 'three'}"
>>> pformat(c)
"{True: 'one', 2: 'two', 3: 'three'}"
>>> pformat(a) == pformat(b)
True
>>> pformat(a) == pformat(c)
False
>>>
And let's quickly confirm that pretty-printing sorts nested dictionaries:
>>> a['b'] = b
>>> a
{2: 'two', 3: 'three', 1: 'one', 'b': {3: 'three', 2: 'two', 1: 'one'}}
>>> pformat(a)
"{1: 'one', 2: 'two', 3: 'three', 'b': {1: 'one', 2: 'two', 3: 'three'}}"
>>>
So, instead of serializing to JSON, serialize using pprint.pformat(). I imagine there may be some corner cases where two objects that you want to consider unequal nevertheless create the same pretty-printed representation. But those cases should be rare, and you wanted something simple and Pythonic, which this is.
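A minimal sketch of that idea, wrapped in a helper. The names pformat_equal and Opaque are hypothetical and only for illustration; Opaque shows the kind of corner case mentioned above, where two different objects share the same repr:

```python
from pprint import pformat


def pformat_equal(left, right):
    """Compare two objects by their pretty-printed representation."""
    return pformat(left) == pformat(right)


class Opaque:
    """Hypothetical class whose repr hides its state."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return "Opaque()"


print(pformat_equal({'a': True}, {'a': 1}))                       # False
print(pformat_equal({2: 'two', 1: 'one'}, {1: 'one', 2: 'two'}))  # True
print(pformat_equal({'a': Opaque(1)}, {'a': Opaque(2)}))          # True, a false positive
```

The last line is the trade-off: anything that pretty-prints identically is treated as equal, so this is only as strict as the repr of the objects involved.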

You can test identity of all (key, value) pairs element-wise:
def equal_dict(d1, d2):
    return all((k1 is k2) and (v1 is v2)
               for (k1, v1), (k2, v2) in zip(d1.items(), d2.items()))
>>> equal_dict({True: 'a'}, {True: 'a'})
True
>>> equal_dict({1: 'a'}, {True: 'a'})
False
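Because is tests object identity, this is only reliable for singletons such as True, False and None, plus the small ints and interned strings that CPython happens to cache; it is not reliable for float values, other sequences or more complex objects. It also assumes both dictionaries have the same length and iterate in the same order, and it does not descend into nested containers.
Anyway, that's a start if you need it.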

I think you are looking for something like this. Since you didn't provide example data, I won't guess at what it looks like:
from boltons.iterutils import remap
def compare(A, B):
    return A == B and type(A) == type(B)
dict_to_compare_against = { some dict }
def visit(path, key, value):
    # path holds the keys of the parent containers, so walk down to the parent first
    cur = dict_to_compare_against
    for i in path:
        cur = cur[i]
    if not compare(cur[key], value):
        raise Exception("Not equal")
    return True  # keep the item unchanged
remap(other_dict, visit=visit)

You can use isinstance() to distinguish between a regular dictionary entry and a nested dictionary entry. This way you can iterate through using is to compare strictly, but also check when you need to dive down a level into the nested dictionary.
https://docs.python.org/3/library/functions.html#isinstance
myDict = {'a': True, 'b': False, 'c': {'a': True}}
for key, value in myDict.items():
    if isinstance(value, dict):
        pass  # nested dictionary: do what you need to do (e.g. recurse)
    else:
        pass  # plain value: compare strictly here
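Putting the isinstance() idea together with a type check gives a recursive comparison. The following is only a sketch under stated assumptions: the name strict_equal is hypothetical, "strict" is taken to mean "same type and equal value", and only dict, list and tuple are descended into:

```python
def strict_equal(left, right):
    """Return True only if both values have the same type and equal content."""
    if type(left) is not type(right):
        return False
    if isinstance(left, dict):
        if len(left) != len(right):
            return False
        # Pair keys by (type, value) so that 1 and True never match each other.
        right_keys = {(type(k), k) for k in right}
        for key, value in left.items():
            if (type(key), key) not in right_keys:
                return False
            if not strict_equal(value, right[key]):
                return False
        return True
    if isinstance(left, (list, tuple)):
        return len(left) == len(right) and all(
            strict_equal(x, y) for x, y in zip(left, right)
        )
    return left == right


print(strict_equal({'a': True}, {'a': 1}))                # False
print(strict_equal({1: 'a'}, {True: 'a'}))                # False
print(strict_equal({'a': {'b': [1, 2]}}, {'a': {'b': [1, 2]}}))  # True
```

Comparing types rather than identities avoids depending on which objects CPython happens to cache.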

Related

Iterate over a list which contains dictionary names

I have a list which contains the names of different dictionaries. I want to iterate over those names and get the key and value of each of the dictionaries.
For example:
ver=['one','two','three']
for v in ver:
    for k,v in v.iteritems():
        print k,v
where one, two, three are separate dictionaries. I need the keys and values from all the dictionaries.
But I am getting the error below when doing this:
AttributeError("'str' object has no attribute 'iterkeys'",)
If, for some reason, you must use strings as you are doing, you can look the names up in globals(), which maps the names of global variables to their values, or in locals(), which does the same for local variables, like so:
>>> one = {'1': 1}
>>> two = {'2': 2}
>>> three = {'3': 3}
>>> ver = ['one', 'two', 'three']
>>> locals()
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'a': 1, 'one': {'1': 1}, 'two': {'2': 2}, 'three': {'3': 3}, 'ver': ['one', 'two', 'three'], 'v': 'three'}
>>> for v in ver:
...     print(globals()[v].items())
...
dict_items([('1', 1)])
dict_items([('2', 2)])
dict_items([('3', 3)])
You are mixing strings with the actual dicts.
Here is what you are actually looking for.
one = {1:9}
two = {'zz':34}
three = {}
ver = [one,two,three]
for v in ver:
    for k,v in v.items():
        print(k,v)
Variables in python are not enclosed with quotes. So instead define ver as:
ver = [one, two, three]
The most maintainable way to do what you require is to have them as keys of another dictionary, for example:
data = {
    'one': {4:5, 'foo':'bar'},
    'two': {1:'baz'},
    'three': {'x':'y', 'a':'b'}
}
ver = ['one','two','three']
for x in ver:
    for k,v in data[x].items():
        print(k,v)
If you have them as separate variables, you could always start by constructing such a dictionary, as:
data = {
    'one': one,
    'two': two,
    'three': three
}
Accessing variables dynamically by name is possible by use of the globals function, but in most cases this is less useful than organising your data in such a way as to make such an approach unnecessary.

The Second Element in A Dict Defined in Python keeps missing [duplicate]

Python's version is 3.
In Python’s Interpreter in Mac's terminal (console), I tried defining a couple of Dicts but found that all the second elements in those Dicts were always missing. See the code below:
>>> dictOne = {True: 'real', 1: 'one', 'two': 2}
>>> dictOne
{True: 'one', 'two': 2}
>>> dictTwo = {1: 'one', True: 'real', 'two': 2}
>>> dictTwo
{1: 'real', 'two': 2}
>>> dictThree = {1: 'one', True: 'real', False: 'fake', 'two': 2}
>>> dictThree
{1: 'real', False: 'fake', 'two': 2}
Boolean and Integer values seem to interfere with each other. What happened?
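True and 1 are equal as far as Python is concerned: bool is a subclass of int, and True == 1 evaluates to True.
Python dicts don't allow duplicate keys, and True and 1 are considered duplicates. When a duplicate key is assigned, the dict keeps the first key object and overwrites its value, which is why {True: 'real', 1: 'one'} displays as {True: 'one'}.
EDIT: Alexandre Juma gave a nice explanation of this. Essentially, dict keys are looked up by hash and then compared for equality, and both hash(1) == hash(True) and 1 == True hold.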
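A short sketch that checks each of those claims in Python 3 (the variable names are only for illustration):

```python
# bool is a subclass of int, so True and 1 collide as dict keys.
print(issubclass(bool, int))             # True
print(True == 1, hash(True) == hash(1))  # True True

d = {True: 'real', 1: 'one', 'two': 2}
first_key = next(iter(d))
print(d)                  # {True: 'one', 'two': 2}
print(first_key is True)  # True: the first key object is kept, its value overwritten
```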

How to make "seen" hash with python dict?

In Perl one can do this:
my %seen;
foreach my $dir ( split /:/, $input ) {
    $seen{$dir}++;
}
This is a way to remove duplicates by keeping track of what has been "seen". In python you cannot do:
seen = {}
for x in ['one', 'two', 'three', 'one']:
    seen[x] += 1
The above python results in KeyError: 'one'.
What is the Pythonic way of making a 'seen' hash?
Use a defaultdict to get this behavior. The catch is that you need to give defaultdict a factory (here int, which returns 0) so that it can supply a default value for keys that have not been set yet:
In [29]: from collections import defaultdict
In [30]: seen = defaultdict(int)
In [31]: for x in ['one', 'two', 'three', 'one']:
...: seen[x] += 1
In [32]: seen
Out[32]: defaultdict(int, {'one': 2, 'three': 1, 'two': 1})
You can use a Counter as well:
>>> from collections import Counter
>>> seen = Counter()
>>> for x in ['one', 'two', 'three', 'one']: seen[x] += 1
...
>>> seen
Counter({'one': 2, 'three': 1, 'two': 1})
If all you need are uniques, just do a set operation: set(['one', 'two', 'three', 'one'])
You could use a set:
>>> seen=set(['one', 'two', 'three', 'one'])
>>> seen
{'one', 'two', 'three'}
If you unroll seen[x] += 1 into seen[x] = seen[x] + 1, the problem with your code is obvious: you're trying to access seen[x] before you've assigned to it. Instead, you need to check if the key exists first:
seen = {}
for x in ['one', 'two', 'three', 'one']:
    if x in seen:
        seen[x] += 1 # we've seen it before, so increment
    else:
        seen[x] = 1 # first time seeing x
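The same check can be written more compactly with dict.get(), and if the goal is only to drop duplicates while keeping the original order, dict.fromkeys() is one option. This is a sketch that assumes Python 3.7 or newer, where dicts preserve insertion order; the names items and unique are illustrative:

```python
items = ['one', 'two', 'three', 'one']

# Count occurrences with a plain dict: get() supplies 0 for unseen keys.
seen = {}
for x in items:
    seen[x] = seen.get(x, 0) + 1
print(seen)  # {'one': 2, 'two': 1, 'three': 1}

# Remove duplicates while keeping first-seen order.
unique = list(dict.fromkeys(items))
print(unique)  # ['one', 'two', 'three']
```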

Python dict value query without iterating [duplicate]

I have the following dictionary in python:
d = {'1': 'one', '3': 'three', '2': 'two', '5': 'five', '4': 'four'}
I need a way to find if a value such as "one" or "two" exists in this dictionary.
For example, if I wanted to know if the index "1" existed I would simply have to type:
"1" in d
And then python would tell me if that is true or false, however I need to do that same exact thing except to find if a value exists.
>>> d = {'1': 'one', '3': 'three', '2': 'two', '5': 'five', '4': 'four'}
>>> 'one' in d.values()
True
Out of curiosity, some comparative timing (Python 2; T here is presumably timeit.Timer):
>>> T(lambda : 'one' in d.itervalues()).repeat()
[0.28107285499572754, 0.29107213020324707, 0.27941107749938965]
>>> T(lambda : 'one' in d.values()).repeat()
[0.38303399085998535, 0.37257885932922363, 0.37096405029296875]
>>> T(lambda : 'one' in d.viewvalues()).repeat()
[0.32004380226135254, 0.31716084480285645, 0.3171098232269287]
EDIT: And in case you wonder why... the reason is that each of the above returns a different type of object, which may or may not be well suited for lookup operations:
>>> type(d.viewvalues())
<type 'dict_values'>
>>> type(d.values())
<type 'list'>
>>> type(d.itervalues())
<type 'dictionary-valueiterator'>
EDIT2: As per request in comments...
>>> T(lambda : 'four' in d.itervalues()).repeat()
[0.41178202629089355, 0.3959040641784668, 0.3970959186553955]
>>> T(lambda : 'four' in d.values()).repeat()
[0.4631338119506836, 0.43541407585144043, 0.4359898567199707]
>>> T(lambda : 'four' in d.viewvalues()).repeat()
[0.43414998054504395, 0.4213531017303467, 0.41684913635253906]
In Python 3, you can use
"one" in d.values()
to test if "one" is among the values of your dictionary.
In Python 2, it's more efficient to use
"one" in d.itervalues()
instead.
Note that this triggers a linear scan through the values of the dictionary, short-circuiting as soon as it is found, so this is a lot less efficient than checking whether a key is present.
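If many such lookups are needed, one way to avoid repeating the linear scan is to build a set of the values once. This sketch assumes the values are hashable, and the inverted mapping further assumes they are unique; value_set and key_for_value are illustrative names:

```python
d = {'1': 'one', '3': 'three', '2': 'two', '5': 'five', '4': 'four'}

# One linear pass builds the set; each later membership test is a hash lookup.
value_set = set(d.values())
print('one' in value_set)  # True
print('six' in value_set)  # False

# Inverted mapping from value back to key (only safe when values are unique).
key_for_value = {value: key for key, value in d.items()}
print(key_for_value.get('three'))  # 3
```

The set has to be rebuilt whenever the dictionary changes, so this only pays off when lookups greatly outnumber updates.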
Python dictionaries have a get(key) method:
>>> d.get(key)
For example:
>>> d = {'1': 'one', '3': 'three', '2': 'two', '5': 'five', '4': 'four'}
>>> d.get('3')
'three'
>>> print(d.get('10'))
None
If the key does not exist, get() returns None instead of raising an error. Note that get() looks up keys, not values.
foo = d[key] # raises KeyError if key doesn't exist
foo = d.get(key) # returns None if key doesn't exist
This applies to Python 2.7 only: viewvalues() was added in 2.7 and does not exist in Python 3, where values() already returns a view.
Use dictionary views:
if x in d.viewvalues():
    dosomething()
Different ways to check whether a value exists, depending on the type of the values:
d = {"key1":"value1", "key2":"value2"}
"value10" in d.values()
>> False
When the values are lists:
test = {'key1': ['value4', 'value5', 'value6'], 'key2': ['value9'], 'key3': ['value6']}
"value4" in [x for v in test.values() for x in v]
>> True
When the values are a mix of lists and strings:
test = {'key1': ['value4', 'value5', 'value6'], 'key2': ['value9'], 'key3': ['value6'], 'key5':'value10'}
values = test.values()
"value10" in [x for v in test.values() for x in v] or 'value10' in values
>> True
You can use this:
d = {'1': 'one', '3': 'three', '2': 'two', '5': 'five', '4': 'four'}
print("one" in d.values())
Or you can use the any function:
print(any(j == "one" for i, j in d.items()))
In Python 3 you can use the values() method of the dictionary. It returns a view object of the values. This, in turn, can be passed to the iter function, which returns an iterator object. The iterator can be checked using in, like this:
'one' in iter(d.values())
Or you can use the view object directly, since it supports in just like a list does:
'one' in d.values()

Dict merge in a dict comprehension

In python 3.5, we can merge dicts by using double-splat unpacking
>>> d1 = {1: 'one', 2: 'two'}
>>> d2 = {3: 'three'}
>>> {**d1, **d2}
{1: 'one', 2: 'two', 3: 'three'}
Cool. It doesn't seem to generalise to dynamic use cases, though:
>>> ds = [d1, d2]
>>> {**d for d in ds}
SyntaxError: dict unpacking cannot be used in dict comprehension
Instead we have to do reduce(lambda x,y: {**x, **y}, ds, {}) (with reduce imported from functools), which seems a lot uglier. Why is the "one obvious way to do it" not allowed by the parser, when there doesn't seem to be any ambiguity in that expression?
It's not exactly an answer to your question but I'd consider using ChainMap to be an idiomatic and elegant way to do what you propose (merging dictionaries in-line):
>>> from collections import ChainMap
>>> d1 = {1: 'one', 2: 'two'}
>>> d2 = {3: 'three'}
>>> ds = [d1, d2]
>>> dict(ChainMap(*ds))
{1: 'one', 2: 'two', 3: 'three'}
It's not a particularly transparent solution, though, since many programmers might not know exactly how a ChainMap works. Note that (as @AnttiHaapala points out) "first found is used", so depending on your intentions you might need to call reversed on your dicts before passing them into ChainMap.
>>> d2 = {3: 'three', 2: 'LOL'}
>>> ds = [d1, d2]
>>> dict(ChainMap(*ds))
{1: 'one', 2: 'two', 3: 'three'}
>>> dict(ChainMap(*reversed(ds)))
{1: 'one', 2: 'LOL', 3: 'three'}
To me, the obvious way is:
d_out = {}
for d in ds:
    d_out.update(d)
This is quick and probably quite performant. I can't speak for the Python developers, but I'm not convinced your expected version is easier to read. For example, your comprehension looks more like a set comprehension to me due to the lack of a :. FWIW, I don't think there is any technical reason (e.g. parser ambiguity) that they couldn't add that form of comprehension unpacking.
Apparently, these forms were proposed, but didn't have universal enough support to warrant implementing them (yet).
You could use itertools.chain or itertools.chain.from_iterable:
import itertools
ds = [{'a': 1, 'b': 2}, {'c': 30, 'b': 40}]
merged_d = dict(itertools.chain(*(d.items() for d in ds)))
print(merged_d) # {'a': 1, 'b': 40, 'c': 30}
This is based on this solution, which was also mentioned by @ilgia-everilä, but is made Py2 compatible while still avoiding intermediate structures. Encapsulating it inside a function makes its use quite readable.
def merge_dicts(*dicts, **extra):
    """
    >>> merge_dicts(dict(a=1, b=1), dict(b=2, c=2), dict(c=3, d=3), d=4, e=4)
    {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 4}
    """
    return dict((
        (k,v)
        for d in dicts
        for k,v in d.items()
    ), **extra)
Idiomatic, without ChainMap:
>>> d1 = {1: 'one', 2: 'two'}
>>> d2 = {3: 'three'}
>>> {k: v for d in [d1, d2] for k, v in d.items()}
{1: 'one', 2: 'two', 3: 'three'}
You could define this function:
from collections import ChainMap
def mergeDicts(l):
    return dict(ChainMap(*reversed(list(l))))
You can then use it like this:
>>> d1 = {1: 'one', 2: 'two'}
>>> d2 = {3: 'three'}
>>> ds = [d1, d2]
>>> mergeDicts(ds)
{1: 'one', 2: 'two', 3: 'three'}
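On newer interpreters there is one more option. Assuming Python 3.9 or later, where dicts support the | union operator, the reduce from the question can be written without a lambda. This is only a sketch; merged is an illustrative name:

```python
from functools import reduce
from operator import or_

d1 = {1: 'one', 2: 'two'}
d2 = {3: 'three', 2: 'LOL'}
ds = [d1, d2]

# dict | dict returns a new merged dict; later dicts win on key clashes.
merged = reduce(or_, ds, {})
print(merged)  # {1: 'one', 2: 'LOL', 3: 'three'}
```

Each | builds a new dict, so for a long list of dicts the update() loop shown earlier does less copying.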
