Comparing dictionary of list of dictionary/nested dictionary

Comparing dictionary of list of dictionary/nested dictionary - python

There are two dict main and input, I want to validate the "input" such that all the keys in the list of dictionary and nested dictionary (if present/all keys are optional) matches that of the main if not the wrong/different key should be returned as the output.
main = "app":[{
"name": str,
"info": [
{
"role": str,
"scope": {"groups": list}
}
]
},{
"name": str,
"info": [
{"role": str}
]
}]
input_data = "app":[{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
},{
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]
when compared input with main the wrong/different key should be given as output, in this case
['rol']

The schema module does exactly this.
You can catch SchemaUnexpectedTypeError to see which data doesn't match your pattern.
Also, make sure you don't use the word input as a variable name, as it's the name of a built-in function.

keys = []
def print_dict(d):
if type(d) == dict:
for val in d.keys():
df = d[val]
try:
if type(df) == list:
for i in range(0,len(df)):
if type(df[i]) == dict:
print_dict(df[i])
except AttributeError:
pass
keys.append(val)
else:
try:
x = d[0]
if type(x) == dict:
print_dict(d[0])
except:
pass
return keys
keys_input = print_dict(input)
keys = []
keys_main = print_dict(main)
print(keys_input)
print(keys_main)
for i in keys_input[:]:
if i in keys_main:
keys_input.remove(i)
print(keys_input)
This has worked for me. you can check above code snippet and if any changes provide more information so any chances if required.

Dictionary and lists compare theire content nested by default.
input_data == main should result in the right output if you format your dicts correctly. Try adding curly brackets "{"/"}" arround your dicts. It should probably look like something like this:
main = {"app": [{
"name": str,
"info": [
{
"role": str,
"scope": {"groups": list}
}
]
},{
"name": str,
"info": [
{"role": str}
]
}]}
input_data = {"app":[{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
},{
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]}
input_data2 = {"app": [{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
}, {
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]}
Comparision results should look like this:
input_data2 == input_data # True
main == input_data # False

Related

need to turn JSON values into keys

I have some json that I would like to transform from this:
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
...
{
"name":"fieldN",
"intValue":"N"
}
]
into this:
{ "field1" : "1",
"field2" : "2",
...
"fieldN" : "N",
}
For each pair, I need to change the value of the name field to a key, and the values of the intValue field to a value. This doesn't seem like flattening or denormalizing. Are there any tools that might do this out-of-the-box, or will this have to be brute-forced? What's the most pythonic way to accomplish this?

parameters = [ # assuming this is loaded already
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]
field_int_map = dict()
for p in parameters:
field_int_map[p['name']] = p['intValue']
yields {'field1': '1', 'field2': '2', 'fieldN': 'N'}
or as a dict comprehension
field_int_map = {p['name']:p['intValue'] for p in parameters}
This works to combine the name attribute with the intValue as key:value pairs, but the result is a dictionary instead of the original input type which was a list.

Use dictionary comprehension:
json_dct = {"parameters":
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]}
dct = {d["name"]: d["intValue"] for d in json_dct["parameters"]}
print(dct)
# {'field1': '1', 'field2': '2', 'fieldN': 'N'}

Merge dictionaries with same key from two lists of dicts in python

I have two dictionaries, as below. Both dictionaries have a list of dictionaries as the value associated with their properties key; each dictionary within these lists has an id key. I wish to merge my two dictionaries into one such that the properties list in the resulting dictionary only has one dictionary for each id.
{
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
and the other list:
{
"name":"harry",
"properties":[
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
The output I am trying to achieve is:
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic",
"language": "english"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
As id: N3 is common in both the lists, those 2 dicts should be merged with all the fields. So far I have tried using itertools and
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = tuple(d[k] for d in ds)
Could someone please help in figuring this out?

Here is one of the approach:
a = {
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
b = {
"name":"harry",
"properties":[
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
# Create dic maintaining the index of each id in resp dict
a_ids = {item['id']: index for index,item in enumerate(a['properties'])} #{'N3': 0, 'N5': 1}
b_ids = {item['id']: index for index,item in enumerate(b['properties'])} #{'N3': 0, 'N6': 1}
# Loop through one of the dict created
for id in a_ids.keys():
# If same ID exists in another dict, update it with the key value
if id in b_ids:
b['properties'][b_ids[id]].update(a['properties'][a_ids[id]])
# If it does not exist, then just append the new dict
else:
b['properties'].append(a['properties'][a_ids[id]])
print (b)
Output:
{'name': 'harry', 'properties': [{'id': 'N3', 'type': 'energetic', 'language': 'english', 'status': 'OPEN'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}]}

It might help to treat the two objects as elements each in their own lists. Maybe you have other objects with different name values, such as might come out of a JSON-formatted REST request.
Then you could do a left outer join on both name and id keys:
#!/usr/bin/env python
a = [
{
"name": "harry",
"properties": [
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
]
b = [
{
"name": "harry",
"properties": [
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
]
a_names = set()
a_prop_ids_by_name = {}
a_by_name = {}
for ao in a:
an = ao['name']
a_names.add(an)
if an not in a_prop_ids_by_name:
a_prop_ids_by_name[an] = set()
for ap in ao['properties']:
api = ap['id']
a_prop_ids_by_name[an].add(api)
a_by_name[an] = ao
res = []
for bo in b:
bn = bo['name']
if bn not in a_names:
res.append(bo)
else:
ao = a_by_name[bn]
bp = bo['properties']
for bpo in bp:
if bpo['id'] not in a_prop_ids_by_name[bn]:
ao['properties'].append(bpo)
res.append(ao)
print(res)
The idea above is to process list a for names and ids. The names and ids-by-name are instances of a Python set. So members are always unique.
Once you have these sets, you can do the left outer join on the contents of list b.
Either there's an object in b that doesn't exist in a (i.e. shares a common name), in which case you add that object to the result as-is. But if there is an object in b that does exist in a (which shares a common name), then you iterate over that object's id values and look for ids not already in the a ids-by-name set. You add missing properties to a, and then add that processed object to the result.
Output:
[{'name': 'harry', 'properties': [{'id': 'N3', 'status': 'OPEN', 'type': 'energetic'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}]}]
This doesn't do any error checking on input. This relies on name values being unique per object. So if you have duplicate keys in objects in both lists, you may get garbage (incorrect or unexpected output).

Group by keys and create a list of respective values

I want to convert my dictionary to this format. I have tried using groupby but not able to achieve the expected format.
input = [
{'algorithms': 'BLOWFISH', 'dcount': 5.8984375},
{'algorithms': 'AES-256', 'dcount': 5.609375},
{'algorithms': 'AES-256', 'dcount': 9.309375},
{'algorithms': 'RSA', 'dcount': 8.309375},
{'algorithms': 'BLOWFISH','dcount': 6.309375}
]
Expected output:
output = [
{
name: "BLOWFISH",
data: [5.8984375,6.309375]
},
{
name: "AES-256",
data: [5.609375,9.309375]
},
{
name: 'RSA',
data: [8.309375]
}
]

You need to sort input before itertools.groupby will work:
The operation of groupby() is similar to the uniq filter in Unix. It
generates a break or new group every time the value of the key
function changes (which is why it is usually necessary to have sorted
the data using the same key function). That behavior differs from
SQL’s GROUP BY which aggregates common elements regardless of their
input order.
from itertools import groupby
import json
input = [
{
"algorithms": "BLOWFISH",
"dcount": 5.8984375
},
{
"algorithms": "AES-256",
"dcount": 5.609375
},
{
"algorithms": "AES-256",
"dcount": 9.309375
},
{
"algorithms": "RSA",
"dcount": 8.309375
},
{
"algorithms": "BLOWFISH",
"dcount": 6.309375
}
]
output = [
{
"name": k,
"data": [d["dcount"] for d in g]
}
for k, g in groupby(sorted(input, key=lambda d: d["algorithms"]),
key=lambda d: d["algorithms"])
]
print(json.dumps(output, indent=4))
Output:
[
{
"name": "AES-256",
"data": [
5.609375,
9.309375
]
},
{
"name": "BLOWFISH",
"data": [
5.8984375,
6.309375
]
},
{
"name": "RSA",
"data": [
8.309375
]
}
]

Had a go at this for you:
data = [
{'algorithms': 'BLOWFISH', 'dcount': 5.8984375},
{'algorithms': 'AES-256', 'dcount': 5.609375},
{'algorithms': 'AES-256', 'dcount': 9.309375},
{'algorithms': 'RSA', 'dcount': 8.309375},
{'algorithms': 'BLOWFISH', 'dcount': 6.309375}
]
series = []
firstRun = True
for i in data:
found = False
if firstRun:
series.append(
{
'name': i['algorithms'],
'data': [i['dcount']]
}
)
firstRun = False
else:
for j in series:
if i['algorithms'] == j['name']:
j['data'].append(i['dcount'])
found = True
else:
continue
if not found:
series.append(
{
'name': i['algorithms'],
'data': [i['dcount']]
}
)
This should give you your desired output:
>>> print(series)
[{'name': 'BLOWFISH', 'data': [5.8984375, 6.309375]}, {'name': 'AES-256', 'data': [5.609375, 9.309375]}, {'name': 'RSA', 'data': [8.309375]}]

You can do that with no additional module:
def group_algorithms(input_list):
out = []
names = []
for algorithms in input_list:
if algorithms['algorithms'] in names:
out[names.index(algorithms['algorithms'])]["data"].append(algorithms['dcount'])
else:
out.append({"name": algorithms['algorithms'],
"data": [algorithms['dcount']]})
names.append(algorithms['algorithms'])
return out

How to deepmerge dicts/list for a json in python

I'm trying to deepmerge lists to get a specific json.
what I want to achieve is this format (the order of the elements is irelevant):
{
"report": {
"context": [{
"report_id": [
"Report ID 30"
],
"status": [
"Status 7"
],
"fallzahl": [
"Fallzahl 52"
],
"izahl": [
"IZahl 20"
]
}
],
"körpergewicht": [{
"any_event_en": [{
"gewicht": [{
"|magnitude": 185.44,
"|unit": "kg"
}
],
"kommentar": [
"Kommentar 94"
],
"bekleidung": [{
"|code": "at0011"
}
]
}
]
}
]
}
}
I try to deepmerge dicts and lists to achieve this specific format. My baseline are some dicts:
{'körpergewicht': [{'any_event_en': [{'gewicht': [{'|magnitude': '100', '|unit': 'kg'}]}]}]}
{'körpergewicht': [{'any_event_en': [{'bekleidung': [{'|code': 'at0013'}]}]}]}
{'körpergewicht': [{'any_event_en': [{'kommentar': ['none']}]}]}
{'context': [{'status': ['fatty']}]}
{'context': [{'fallzahl': ['123']}]}
{'context': [{'report_id': ['123']}]}
{'context': [{'izahl': ['123']}]}
what I tried to do is following I have a dict called tmp_dict in that I hold a baseline dict as I loop through. The so called collect_dict is the dict in that I try to merge my baseline dicts. element holds my current baseline dict.
if (index == (len(element)-1)): #when the baseline dict is traversed completely
if tmp_dict:
first_key_of_tmp_dict=list(tmp_dict.keys())[0]
if not (first_key_of_tmp_dict in collect_dict):
collect_dict.update(tmp_dict)
else:
merge(tmp_dict,collect_dict)
else:
collect_dict.update(tmp_dict)
and I also wrote a merge method:
def merge(tmp_dict,collect_dict):
first_common_key_of_dicts=list(tmp_dict.keys())[0]
second_depth_key_of_tmp_dict=list(tmp_dict[first_common_key_of_dicts][0].keys())[0]
second_depth_tmp_dict=tmp_dict[first_common_key_of_dicts][0]
second_depth_key_of_coll_dict=collect_dict[first_common_key_of_dicts][0]
if not second_depth_key_of_tmp_dict in second_depth_key_of_coll_dict:
collect_dict.update(second_depth_tmp_dict)
else:
merge(second_depth_tmp_dict,second_depth_key_of_coll_dict)
what I'm coming up with goes in the right direction but is far from beeing my desired output:
{"report": {
"k\u00f6rpergewicht": [{
"any_event_en": [{
"kommentar": ["none"]
}
],
"bekleidung": [{
"|code": "at0013"
}
],
"gewicht": [{
"|magnitude": "100",
"|unit": "kg"
}
]
}
],
"context": [{
"fallzahl": ["234"]
}
],
"report_id": ["234"],
"status": ["s"],
"izahl": ["234"]
}
}
With another set of inputs:
{'atemfrequenz': {'context': [{'status': [{'|code': 'at0012'}]}]}},
{'atemfrequenz': {'context': [{'kategorie': ['Kategorie']}]}},
{'atemfrequenz': {'atemfrequenz': [{'messwert': [{'|magnitude': '123', '|unit': 'min'}]}]}}
I would like to achieve the following output:
"atemfrequenz": {
"context": [
{
"status": [
{
"|code": "at0012"
}
],
"kategorie": [
"Kategorie"
]
}
],
"atemfrequenz": [
{
"messwert": [
{
"|magnitude": 123,
"|unit": "/min"
}
]
}
]
}

This code should get the correct answer. I removed the special character (ö) to prevent errors.
dd = [
{'korpergewicht': [{'any_event_en': [{'gewicht': [{'|magnitude': '100', '|unit': 'kg'}]}]}] },
{'korpergewicht': [{'any_event_en': [{'bekleidung': [{'|code': 'at0013'}]}]}]},
{'korpergewicht': [{'any_event_en': [{'kommentar': ['none']}]}]},
{'context': [{'status': ['fatty']}]},
{'context': [{'fallzahl': ['123']}]},
{'context': [{'report_id': ['123']}]},
{'context': [{'izahl': ['123']}]}
]
def merge(d):
if (type(d) != type([])): return d
if (type(list(d[0].values())[0])) == type(""): return d
keys = list(set(list(k.keys())[0] for k in d))
lst = [{k:[]} for k in keys]
for e in lst:
for k in d:
if (list(e.keys())[0] == list(k.keys())[0]):
e[list(e.keys())[0]] += k[list(k.keys())[0]]
for e in lst:
if (type(e[list(e.keys())[0]][0]) == type({})):
e[list(e.keys())[0]] = merge(e[list(e.keys())[0]])
for i in lst[1:]: lst[0].update(i)
lst2 = [] # return list of single dictionary
lst2.append(lst[0])
return lst2
dx = merge(dd)
dx = {'report': dx[0]} # no list at lowest level
print(dx)
Output (formatted)
{'report': {
'korpergewicht': [{
'any_event_en': [{
'kommentar': ['none'],
'bekleidung': [{'|code': 'at0013'}],
'gewicht': [{'|magnitude': '100', '|unit': 'kg'}]}]}],
'context': [{
'report_id': ['123'],
'izahl': ['123'],
'fallzahl': ['123'],
'status': ['fatty']}]}}
Concerning the second data set provided, the data needs to structured to match the previous data set.
This data set works correctly:
dd = [
{'atemfrequenz': [{'context': [{'status': [{'|code': 'at0012'}]}]}]},
{'atemfrequenz': [{'context': [{'kategorie': ['Kategorie']}]}]},
{'atemfrequenz': [{'atemfrequenz': [{'messwert': [{'|magnitude': '123', '|unit': 'min'}]}]}]}
]
Output (formatted)
{'report': {
'atemfrequenz': [{
'atemfrequenz': [{
'messwert': [{'|magnitude': '123', '|unit': 'min'}]}],
'context': [{
'kategorie': ['Kategorie'],
'status': [{'|code': 'at0012'}]}]}]}}

Create a list objects from request response

I want to extract certain object when going through a response i'm getting from an API CALL.
Response =
[
{'id': '2a15947c-8cdb-4f1d-a1cc-a8d76fd97d61', 'name': 'Human', 'i18nNameKey': 'Blank space Blueprint', 'pluginClone': True
},
{'id': '99accff8-9e24-4c76-b21a-f12ef6572369', 'name': 'Robot', 'i18nNameKey': 'Personal space Blueprint', 'pluginClone': True
},
{'id': 'bf40b0e5-f151-4df6-b305-4a91b4b7c1da', 'name': 'Dog', 'i18nNameKey': 'Game.blueprints.space.kb.name', 'pluginClone': True
},
{'id': '42868b38-b9f8-4540-ba26-0988e8a2e1f7', 'name': 'Bug', 'i18nNameKey': 'Game.blueprints.space.team.name', 'pluginClone': True
},
{'id': 'b23eb9fd-0106-452a-8cab-551ce3b45eb0', 'name': 'Cat', 'i18nNameKey': 'Game.blueprints.space.documentation.name', 'pluginClone': True
},
{'id': '67668d17-6c08-4c85-a6b6-3c1d6fb23000', 'name': 'Cat', 'i18nNameKey': 'Game.blueprints.space.sp.name', 'pluginClone': True,
}
]
I need to regroup, the id and the name, at the moment i'm able to retrieve them but the response is not what i want t achieve
I'm getting a list of string , instead of a list of object, what i'm doing wrong ?
def listallCategories(self):
self.intentsResponseDict=[]
url = self.url
response = requests.get(url,auth=(self.user,self.password))
json_data = response.json()
#print(json_data)
for resp in json_data:
self.intentsResponseDict.append(resp['id'])
self.intentsResponseDict.append(resp['name'])
return self.intentsResponseDict
this is what i'm getting
['2a15947c-8cdb-4f1d-a1cc-a8d76fd97d61', 'Human', '99accff8-9e24-4c76-b21a-f12ef6572369', 'Robot', 'bf40b0e5-f151-4df6-b305-4a91b4b7c1da', 'Dog', '42868b38-b9f8-4540-ba26-0988e8a2e1f7', 'Bug', 'b23eb9fd-0106-452a-8cab-551ce3b45eb0', 'Cat', '67668d17-6c08-4c85-a6b6-3c1d6fb23000', 'Bird']
This is what i want id , name
[
{'2a15947c-8cdb-4f1d-a1cc-a8d76fd97d61', 'Human'
},
{'99accff8-9e24-4c76-b21a-f12ef6572369', 'Robot'
},
{'bf40b0e5-f151-4df6-b305-4a91b4b7c1da', 'Dog'
},
{'42868b38-b9f8-4540-ba26-0988e8a2e1f7', 'Bug'
},
{'b23eb9fd-0106-452a-8cab-551ce3b45eb0', 'Cat'
},
{'67668d17-6c08-4c85-a6b6-3c1d6fb23000', 'Bird'
}
]

The resp is probably like:
[
{
"id": "2a15947c-8cdb-4f1d-a1cc-a8d76fd97d61",
"name": "Human",
... more fields ...
},
{
"id": "99accff8-9e24-4c76-b21a-f12ef6572369",
"name": "Robot",
... more fields ...
},
... more objects ...
]
The format that you want the data isn't a valid JSON, a better schema for that JSON may be:
[
{
"id": "2a15947c-8cdb-4f1d-a1cc-a8d76fd97d61",
"name": "Human",
},
{
"id": "99accff8-9e24-4c76-b21a-f12ef6572369",
"name": "Robot",
},
... more objects ...
]
So, you are just picking id and name in the response
for resp in json_data:
self.intentsResponseDict.append({
"id": resp["id"],
"name": resp["name"]
})

What you show is a list of sets. Just build it:
...
for resp in json_data:
self.intentsResponseDict.append({resp['id'], resp['name']})
...

You're adding both the ID and name to the end of a list
for resp in json_data:
self.intentsResponseDict.append(resp['id']) # here is your issue
self.intentsResponseDict.append(resp['name']) # and here
return self.intentsResponseDict
If you want these values to stay together you would need to add them to the list together. e.g.
for resp in json_data:
self.intentsResponseDict.append([resp['id'], resp['name']]) # to create list of lists
# OR
self.intentsResponseDict.append({resp['id'], resp['name']}) # to create list of dicts
return self.intentsResponseDict

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Comparing dictionary of list of dictionary/nested dictionary - python

The schema module does exactly this. You can catch SchemaUnexpectedTypeError to see which data doesn't match your pattern. Also, make sure you don't use the word input as a variable name, as it's the name of a built-in function.

Related

need to turn JSON values into keys

Merge dictionaries with same key from two lists of dicts in python

Group by keys and create a list of respective values

How to deepmerge dicts/list for a json in python

Create a list objects from request response

Categories

Resources