Comparing keys of a list of nested dictionaries in Python

I want to write a function that compares the keys of dict1 (the base dict) with the keys of each entry in dict2 (a list of one or more nested dictionaries). It should check for the mandatory key, then for whichever optional keys are present, and return the keys that differ as a list.
dict1 = {"name": str,  # mandatory
         "details": {  # optional
             "class": str,  # optional
             "subjects": {  # optional
                 "english": bool,  # optional
                 "maths": bool  # optional
             }
         }}
dict2 = [{"name": "SK",
          "details": {
              "class": "A"}
          },
         {"name": "SK",
          "details": {
              "class": "A",
              "subjects": {
                  "english": True,
                  "science": False
              }
          }}]
After comparing dict2 with dict1, the expected output is:
pass #no difference in keys in 1st dictionary
["science"] #the different key in second dictionary of dict2

Try out this recursive check function:
def compare_dict_keys(d1, d2, diff: list):
    # Collect in diff every key of d2 (at any depth) that d1 lacks
    if isinstance(d2, dict):
        for key, expected_value in d2.items():
            try:
                actual_value = d1[key]
                compare_dict_keys(actual_value, expected_value, diff)
            except KeyError:
                diff.append(key)
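The function compares two dicts, so pass it one entry of the dict2 list at a time. dict1 vs the second dictionary in dict2:
difference = []
compare_dict_keys(dict1, dict2[1], difference)
print(difference)
# Output: ['science']
The second dictionary in dict2 vs dict1:
difference = []
compare_dict_keys(dict2[1], dict1, difference)
print(difference)
# Output: ['maths']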
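Since dict2 in the question is a list, the check has to run once per entry. The following is a minimal sketch of such a wrapper, not part of the original answer: `unexpected_keys` and `check_records` are hypothetical names, and treating "name" as a key that must exist at the top level is an assumption about what the asker meant by "mandatory".

```python
def unexpected_keys(schema, record):
    """Return keys found in record (at any depth) that schema does not define."""
    diff = []
    if isinstance(record, dict):
        for key, value in record.items():
            if not isinstance(schema, dict) or key not in schema:
                diff.append(key)
            else:
                diff.extend(unexpected_keys(schema[key], value))
    return diff


def check_records(schema, records, mandatory=("name",)):
    """Return one list per record: missing mandatory keys plus unexpected keys."""
    results = []
    for record in records:
        missing = [key for key in mandatory if key not in record]
        results.append(missing + unexpected_keys(schema, record))
    return results


dict1 = {"name": str,
         "details": {"class": str,
                     "subjects": {"english": bool, "maths": bool}}}
dict2 = [{"name": "SK", "details": {"class": "A"}},
         {"name": "SK", "details": {"class": "A",
                                    "subjects": {"english": True, "science": False}}}]

print(check_records(dict1, dict2))  # [[], ['science']]
```

An empty list plays the role of "pass" from the question. Missing mandatory keys and unexpected keys end up in the same list here; returning them separately may be clearer in real use.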

Related

How to convert values of a specific dictionary key to uppercase?

I have this simplified dict:
{
{
"birthPlace" : "london"
},
"hello": "hello",
"birthPlace" : "rome"
}
I want to make the value of birthPlace uppercase. How can I do that? I tried:
smallalphabetDict={}
for key, value in myjson.items():
    smallalphabetDict[key.upper()] = value
It doesn't work.
This changes every top-level value of a dict to uppercase, if the value is a string:
d = {......}
for k in d:
    if isinstance(d[k], str):
        d[k] = d[k].upper()
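The question asks about one key only, possibly inside nested dicts. A minimal sketch of that narrower case follows. `uppercase_key` is a hypothetical helper, and the sample data is an assumed valid version of the question's structure, with a made-up "parent" key so that the nested dict has a name.

```python
def uppercase_key(data, target="birthPlace"):
    """Return a copy of data with string values stored under target uppercased."""
    if isinstance(data, dict):
        return {
            key: value.upper() if key == target and isinstance(value, str)
            else uppercase_key(value, target)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [uppercase_key(item, target) for item in data]
    return data


# "parent" is a made-up key; the question's snippet leaves the nested dict unnamed.
sample = {"parent": {"birthPlace": "london"}, "hello": "hello", "birthPlace": "rome"}
print(uppercase_key(sample))
# {'parent': {'birthPlace': 'LONDON'}, 'hello': 'hello', 'birthPlace': 'ROME'}
```

The function builds a new structure instead of mutating the input, which avoids surprises when the original dict is still needed.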

Check if value of one key in json has another key

I'm trying to print my json content. I know how to print just keys and values but I want to have access to the objects within the keys too. This is my code:
json_mini = json.loads('{"one" : {"testing" : 39, "this": 17}, "two" : "2", "three" : "3"}')
for index, value in json_mini.items():
    print index, value
    if value.items():
        for ind2, val2 in value.items():
            print ind2, val2
which gives me this error: AttributeError: 'unicode' object has no attribute 'items'
How do I iterate over it so that I can process each key and value separately?
Recursive example:
import json
def func(data):
    for index, value in data.items():
        print index, value  # Python 2 syntax; use print(index, value) on Python 3
        if isinstance(value, dict):
            # Only dicts have .items(), so recurse into those alone
            func(value)
json_mini = json.loads('{"one" : {"testing" : 39, "this": 17}, "two" : "2", "three" : "3"}')
func(json_mini)
Here's a recursive way that works in Python 2 and 3, which doesn't use isinstance(). It instead uses exceptions to determine whether a given element is a sub-object or not.
import json
def func(obj, name=''):
    try:
        for key, value in obj.items():
            func(value, key)
    except AttributeError:
        print('{}: {}'.format(name, obj))
json_mini = json.loads('''{
    "three": "3",
    "two": "2",
    "one": {
        "this": 17,
        "testing": 39
    }
}''')
func(json_mini)
Output:
this: 17
testing: 39
three: 3
two: 2
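If the goal is to process each leaf rather than print it, a generator keeps the traversal separate from the processing. This is a minimal Python 3 sketch; `walk` is a hypothetical name and it assumes all keys are strings, as they are in parsed JSON objects.

```python
import json


def walk(obj, path=()):
    """Yield (path, value) for every non-dict value in a nested dict."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from walk(value, path + (key,))
    else:
        yield path, obj


json_mini = json.loads('{"one": {"testing": 39, "this": 17}, "two": "2", "three": "3"}')
for path, value in walk(json_mini):
    print(".".join(path), value)
# one.testing 39
# one.this 17
# two 2
# three 3
```

The path tuple records every key on the way down, so two leaves with the same name in different sub-objects stay distinguishable.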

pythonic way to check if my dict contains prototyped key hierarchy

I have a dict, let's say mydict.
I also have this JSON, let's say myjson:
{
    "actor":{
        "name":"",
        "type":"",
        "mbox":""
    },
    "result":{
        "completion":"",
        "score":{ "scaled":"" },
        "success":"",
        "timestamp":""
    },
    "verb":{
        "display":{
            "en-US":""
        },
        "id":""
    },
    "context":{
        "location":"",
        "learner_id": "",
        "session_id": ""
    },
    "object":{
        "definition":{
            "name":{
                "en-US":""
            }
        },
        "id":"",
        "activity_type":""
    }
}
I want to know whether ALL of myjson's keys (with the same hierarchy) are in mydict. I don't care if mydict has more data in it. How do I do this in Python?
Load myjson into a dictionary:
import json
with open('myjson.json') as j:
    new_dict = json.load(j)
Then go through each key of that dictionary and confirm that the key exists in mydict with the same value:
def compare_dicts(new_dict, mydict):
    for key in new_dict:
        if key in mydict and mydict[key] == new_dict[key]:
            continue
        else:
            return False
    return True
EDIT:
A little more complex, but something like this should suit your needs:
def compare(n, m):
    for key in n:
        if key in m:
            if m[key] == n[key]:
                continue
            elif isinstance(n[key], dict) and isinstance(m[key], dict):
                # Both sides nest further, so compare the sub-dicts
                if compare(n[key], m[key]):
                    continue
                else:
                    return False
        else:
            # A key of n is missing from m
            return False
    return True
If you care about top-level key-value pairs, you can do this:
>>> all(item in mydict.items() for item in myjson.items())
True
This is true if every top-level key-value pair of myjson is also in mydict, even if mydict has other keys.
Edit: If you only care about the keys, use this:
>>> all(k in mydict for k in myjson)
True
This returns true if every top-level key of myjson is in mydict, even if the keys point to different values. It does not descend into nested dicts.
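The question asks about keys at every level while ignoring values. A minimal recursive sketch of that check follows. `has_key_hierarchy` is a hypothetical name, and the sample dicts are shortened, made-up stand-ins for myjson and mydict.

```python
def has_key_hierarchy(template, data):
    """True if every key in template exists in data at the same nesting level."""
    if not isinstance(data, dict):
        return False
    for key, sub_template in template.items():
        if key not in data:
            return False
        # Only descend when the template itself nests further.
        if isinstance(sub_template, dict) and not has_key_hierarchy(sub_template, data[key]):
            return False
    return True


# Shortened, illustrative versions of the question's structures.
myjson = {"actor": {"name": "", "mbox": ""}, "verb": {"display": {"en-US": ""}, "id": ""}}
mydict = {"actor": {"name": "Ann", "mbox": "a@example.com", "extra": 1},
          "verb": {"display": {"en-US": "ran"}, "id": "v1"},
          "unrelated": True}

print(has_key_hierarchy(myjson, mydict))                   # True
print(has_key_hierarchy(myjson, {"actor": {"name": ""}}))  # False
```

Values in the template are ignored unless they are dicts, so extra keys and differing values in mydict do not affect the result.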

Check existence of a key recursively and append to array of dict

I have a dict as follows:
{
    "key1" : "value1",
    "key2" : "value2",
    "key3" : "value3",
    "key4" : {
        "key5" : "value5"
    }
}
If the dict has key1 == value1, I'll append the dict to a list.
Suppose key1 == value1 is not present at the top level but is present inside nested dicts, as follows:
{
    "key2" : "value2",
    "key3" : "value3",
    "key4" : {
        "key5" : "value5",
        "key1" : "value1",
        "key6" : {
            "key7" : "value7",
            "key1" : "value1"
        }
    },
    "key8" : {
        "key9" : "value9",
        "key10" : {
            "key11" : "value11",
            "key12" : "value12",
            "key1" : "value1"
        }
    }
}
In the dict above, I first have to check whether key1 == value1 is present at the top level. If it is not, I have to traverse the nested dicts and, whenever one of them contains key1 == value1, append that nested dict to the list. If a matching nested dict itself contains further nested dicts, there is no need to check them (e.g. key4 has key1 == value1 at its own top level, so its inner dict key6 is not checked even though it also has key1 == value1).
So finally, I'll have the following list:
[
    {
        "key5" : "value5",
        "key1" : "value1",
        "key6" : {
            "key7" : "value7",
            "key1" : "value1"
        }
    },
    {
        "key11" : "value11",
        "key12" : "value12",
        "key1" : "value1"
    }
]
How can I achieve this?
Note: The depth of the dict may vary.
If a dict contains key1 == value1, we add it to the list and stop there.
If not, we go into every value of the dict that is itself a dict and apply the same logic:
l = []
def append_dict(d):
    if d.get("key1") == "value1":
        l.append(d)
        return
    for k, v in d.items():
        if isinstance(v, dict):
            append_dict(v)
append_dict(d)
print(l)
An iterative solution puts each dict that still has to be checked on a queue:
from queue import Queue  # the module is named Queue in Python 2
q = Queue()
l = []
q.put(d)
while not q.empty():
    d = q.get()
    if d.get("key1") == "value1":
        l.append(d)
        continue
    for k, v in d.items():
        if isinstance(v, dict):
            q.put(v)
print(l)
As @shashank noted, using a stack instead of a queue will also work.
The difference is BFS vs DFS when searching the dictionary.
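The queue and stack variants can share one implementation. The sketch below uses `collections.deque` rather than `Queue`, which is a substitution on my part: `Queue` is built for passing items between threads, and a deque is enough for a single-threaded traversal. `find_matching_dicts` is a hypothetical name, and the sample data is a shortened version of the question's dict.

```python
from collections import deque


def find_matching_dicts(root, key="key1", value="value1"):
    """Collect the outermost dicts that contain key == value, breadth first."""
    found = []
    pending = deque([root])
    while pending:
        current = pending.popleft()  # use pending.pop() for depth-first order
        if current.get(key) == value:
            found.append(current)
            continue  # do not search inside a dict that already matched
        pending.extend(v for v in current.values() if isinstance(v, dict))
    return found


data = {
    "key2": "value2",
    "key4": {"key5": "value5", "key1": "value1",
             "key6": {"key7": "value7", "key1": "value1"}},
    "key8": {"key9": "value9",
             "key10": {"key11": "value11", "key1": "value1"}},
}
matches = find_matching_dicts(data)
print([sorted(m) for m in matches])
# [['key1', 'key5', 'key6'], ['key1', 'key11']]
```

Returning the list instead of appending to a module-level variable makes the function safe to call more than once.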

Deep check for two python dictionaries and get the difference in report form

Say there are two dictionaries in Python:
Dict1
mydict1 = {
"Person" :
{
"FName" : "Rakesh",
"LName" : "Roshan",
"Gender" : "Male",
"Status" : "Married",
"Age" : "60",
"Children" :
[
{
"Fname" : "Hrithik",
"Lname" : "Roshan",
"Gender" : "Male",
"Status" : "Married",
"Children" : ["Akram", "Kamal"],
},
{
"Fname" : "Pinky",
"Lname" : "Roshan",
"Gender" : "Female",
"Status" : "Married",
"Children" : ["Suzan", "Tina", "Parveen"]
}
],
"Movies" :
{
"The Last Day" :
{
"Year" : 1990,
"Director" : "Mr. Kapoor"
},
"Monster" :
{
"Year" : 1991,
"Director" : "Mr. Khanna"
}
}
}
}
Dict2
mydict2 = {
"Person" :
{
"FName" : "Rakesh",
"LName" : "Roshan",
"Gender" : "Male",
"Status" : "Married",
"Children" :
[
{
"Fname" : "Hrithik",
"Lname" : "Losan",
"Gender" : "Male",
"Status" : "Married",
"Children" : ["Akram", "Ajamal"],
},
{
"Fname" : "Pinky",
"Lname" : "Roshan",
"Gender" : "Female",
"Status" : "Married",
"Children" : ["Suzan", "Tina"]
}
]
}
}
I want to compare the two dictionaries and print the differences in a report format like the one below:
MISMATCH 1
==========
MATCH DICT KEY : Person >> Children >> LName
EXPECTED : Roshan
ACTUAL : Losan
MISMATCH 2
==========
MATCH LIST ITEM : Person >> Children >> Children
EXPECTED : Kamal
ACTUAL : Ajamal
MISMATCH 3
==========
MATCH LIST ITEM : Person >> Children >> Children
EXPECTED : Parveen
ACTUAL : NOT_FOUND
MISMATCH 4
==========
MATCH DICT KEY : Person >> Age
EXPECTED : 60
ACTUAL : NOT_FOUND
MISMATCH 5
==========
MATCH DICT KEY : Person >> Movies
EXPECTED : { Movies : {<COMPLETE DICT>} }
ACTUAL : NOT_FOUND
I tried the Python module datadiff, but it does not give me pretty output in a dictionary format. To generate the report I have to traverse the dictionary and look for the '+' and '-' keys. If the dictionary is too complex, it's hard to traverse.
UPDATE: I've updated the code to deal with lists in a more appropriate way. I've also commented the code to make it more clear if you need to change it.
This answer is not 100% general right now, but it can be expanded upon easily to fit what you need.
def print_error(exp, act, path=[]):
    if path != []:
        print('MATCH LIST ITEM: %s' % '>>'.join(path))
    print('EXPECTED: %s' % str(exp))
    print('ACTUAL: %s' % str(act))
    print('')
def copy_append(lst, item):
    # Return a new list so the caller's path is never mutated
    foo = lst[:]
    foo.append(str(item))
    return foo
def deep_check(comp, compto, path=[], print_errors=True):
    # Total number of errors found, is needed for when
    # testing the similarity of dicts
    errors = 0
    if isinstance(comp, list):
        # If the types are not the same then it is probably a critical error
        # return a number to represent how important this is
        if not isinstance(compto, list):
            if print_errors:
                print_error(comp, 'NOT_LIST', path)
            return 1
        # We don't want to destroy the original lists
        comp_copy = comp[:]
        compto_copy = compto[:]
        # Remove items that are in both comp and compto
        # and find items that are only in comp
        for item in comp_copy[:]:
            try:
                compto_copy.remove(item)
                # Only is removed if the item is in compto_copy
                comp_copy.remove(item)
            except ValueError:
                # dicts need to be handled differently
                if isinstance(item, dict):
                    continue
                if print_errors:
                    print_error(item, 'NOT_FOUND', path)
                errors += 1
                # Already reported, so keep it out of the dict matching below
                comp_copy.remove(item)
        # Find non-dicts that are only in compto
        for item in compto_copy[:]:
            if isinstance(item, dict):
                continue
            compto_copy.remove(item)
            if print_errors:
                print_error('NOT_FOUND', item, path)
            errors += 1
        # Now both copies only have dicts
        # This is the part that compares dicts with the minimum
        # errors between them, it is expensive since each dict in comp_copy
        # has to be compared against each dict in compto_copy
        for c in comp_copy:
            lowest_errors = None
            lowest_value = None
            for ct in compto_copy:
                errors_in = deep_check(c, ct, path, print_errors=False)
                # Get and store the minimum errors
                # (test for None first: int < None raises TypeError on Python 3)
                if lowest_errors is None or errors_in < lowest_errors:
                    lowest_errors = errors_in
                    lowest_value = ct
            if lowest_errors is not None:
                errors += lowest_errors
                # Has to have print_errors passed in case the list of dicts
                # contains a list of dicts
                deep_check(c, lowest_value, path, print_errors)
                compto_copy.remove(lowest_value)
        return errors
    if not isinstance(compto, dict):
        # If the types are not the same then it is probably a critical error
        # return a number to represent how important this is
        if print_errors:
            print_error(comp, 'NOT_DICT', path)
        return 1
    # Keys that are only in compto
    for key, value in compto.items():
        try:
            comp[key]
        except KeyError:
            if print_errors:
                print_error('NO_KEY', key, copy_append(path, key))
            errors += 1
    # Keys that are only in comp, and values that differ
    for key, value in comp.items():
        try:
            tovalue = compto[key]
        except KeyError:
            if print_errors:
                print_error(value, 'NOT_FOUND', copy_append(path, key))
            errors += 1
            continue
        if isinstance(value, (list, dict)):
            errors += deep_check(value, tovalue, copy_append(path, key), print_errors)
        else:
            if value != tovalue:
                if print_errors:
                    print_error(value, tovalue, copy_append(path, key))
                errors += 1
    return errors
With your dicts as input I get:
MATCH LIST ITEM: Person>>Age
EXPECTED: 60
ACTUAL: NOT_FOUND
MATCH LIST ITEM: Person>>Movies
EXPECTED: {'The Last Day': {'Director': 'Mr. Kapoor', 'Year': 1990}, 'Monster': {'Director': 'Mr. Khanna', 'Year': 1991}}
ACTUAL: NOT_FOUND
MATCH LIST ITEM: Person>>Children>>Lname
EXPECTED: Roshan
ACTUAL: Losan
MATCH LIST ITEM: Person>>Children>>Children
EXPECTED: Kamal
ACTUAL: NOT_FOUND
MATCH LIST ITEM: Person>>Children>>Children
EXPECTED: NOT_FOUND
ACTUAL: Ajamal
MATCH LIST ITEM: Person>>Children>>Children
EXPECTED: Parveen
ACTUAL: NOT_FOUND
The way lists are compared has been updated so that these two lists:
['foo', 'bar']
['foo', 'bing', 'bar']
will only report an error about 'bing' not being in the first list. With string values, a value is either in the list or not, but an issue arises when you compare lists of dicts. You end up with dicts that fail to match to varying degrees, and knowing which dicts to compare with each other is not straightforward.
My implementation solves this by assuming that the pairs of dicts that produce the lowest number of errors are the ones that should be compared with each other. For example:
test1 = {
"Name": "Org Name",
"Members":
[
{
"Fname": "foo",
"Lname": "bar",
"Gender": "Neuter",
"Roles": ["President", "Vice President"]
},
{
"Fname": "bing",
"Lname": "bang",
"Gender": "Neuter",
"Roles": ["President", "Vice President"]
}
]
}
test2 = {
"Name": "Org Name",
"Members":
[
{
"Fname": "bing",
"Lname": "bang",
"Gender": "Male",
"Roles": ["President", "Vice President"]
},
{
"Fname": "foo",
"Lname": "bar",
"Gender": "Female",
"Roles": ["President", "Vice President"]
}
]
}
Produces this output:
MATCH LIST ITEM: Members>>Gender
EXPECTED: Neuter
ACTUAL: Female
MATCH LIST ITEM: Members>>Gender
EXPECTED: Neuter
ACTUAL: Male
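For reports closer to the numbered layout in the question, it can help to separate finding the mismatches from printing them. The following Python 3 sketch is a simplified variant, not the answer's algorithm: it recurses into dicts only and compares lists as whole values, so it does no best-match pairing of list items. `iter_diffs`, `print_report` and `NOT_FOUND` are hypothetical names, and all keys are assumed to be strings.

```python
NOT_FOUND = "NOT_FOUND"


def iter_diffs(expected, actual, path=()):
    """Yield (path, expected_value, actual_value) for every mismatch."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        # Keys of expected first, then keys that exist only in actual.
        keys = list(expected) + [k for k in actual if k not in expected]
        for key in keys:
            if key not in actual:
                yield path + (key,), expected[key], NOT_FOUND
            elif key not in expected:
                yield path + (key,), NOT_FOUND, actual[key]
            else:
                yield from iter_diffs(expected[key], actual[key], path + (key,))
    elif expected != actual:
        yield path, expected, actual


def print_report(expected, actual):
    """Print the mismatches in a layout close to the one in the question."""
    for number, (path, exp, act) in enumerate(iter_diffs(expected, actual), 1):
        print("MISMATCH %d" % number)
        print("==========")
        print("MATCH DICT KEY : %s" % " >> ".join(path))
        print("EXPECTED : %s" % (exp,))
        print("ACTUAL : %s" % (act,))


a = {"Person": {"FName": "Rakesh", "Age": "60", "Status": "Married"}}
b = {"Person": {"FName": "Rakesh", "Status": "Single"}}
print(list(iter_diffs(a, b)))
# [(('Person', 'Age'), '60', 'NOT_FOUND'), (('Person', 'Status'), 'Married', 'Single')]
```

Because the generator yields plain tuples, the same traversal can feed a printed report, a test assertion, or a count of mismatches.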
