How do I parse nested json objects? - python

I am trying to load a JSON file to parse the contents nested in the root object. Currently I have the JSON file open and loaded as such:
with open(outputFile.name) as f:
data = json.load(f)
For the sake of the question here is an example of what the contents of the JSON file are like:
{
"rootObject" :
{
"person" :
{
"address" : "some place ave. 123",
"age" : 47,
"name" : "Joe"
},
"kids" :
[
{
"age" : 20,
"name" : "Joey",
"studySubject":"math"
},
{
"age" : 16,
"name" : "Josephine",
"studySubject":"chemistry"
}
],
"parents" :
{
"father" : "Joseph",
"mother" : "Joette"
}
How do I access the nested objects in "rootObject", such as "person", "kids" and its contents, and "parents"?

Below code using recursive function can extract values using specific key in a nested dictionary or 'lists of dictionaries':
data = {
"rootObject" :
{
"person" :
{
"address" : "some place ave. 123",
"age" : 47,
"name" : "Joe"
},
"kids" :
[
{
"age" : 20,
"name" : "Joey",
"studySubject":"math"
},
{
"age" : 16,
"name" : "Josephine",
"studySubject":"chemistry"
}
],
"parents" :
{
"father" : "Joseph",
"mother" : "Joette"
}
}}
def get_vals(nested, key):
result = []
if isinstance(nested, list) and nested != []: #non-empty list
for lis in nested:
result.extend(get_vals(lis, key))
elif isinstance(nested, dict) and nested != {}: #non-empty dict
for val in nested.values():
if isinstance(val, (list, dict)): #(list or dict) in dict
result.extend(get_vals(val, key))
if key in nested.keys(): #key found in dict
result.append(nested[key])
return result
get_vals(data, 'person')
Output
[{'address': 'some place ave. 123', 'age': 47, 'name': 'Joe'}]

The code for loading the JSON object should look like this:
from json import loads, load
with open("file.json") as file:
var = loads(load(file))
# loads() transforms the string in a python dict object

Related

Change all values of a specific key from json object

My json object:
"students": [
{
"name" : "ben",
"hometown" : "unknown"
},
{
"name" : "sam",
"hometown" : "unknown"
}
]
}
with this list
"hometowns":{California,Colorado}
change to this:
"students": [
{
"name" : "ben",
"hometown" : "California"
},
{
"name" : "sam",
"hometown" : "Colorado"
}
]
}
I need to loop and check if the key = "hometown" and change its value like
students[1].hometown == hometowns[1].
First, note that your syntax is a bit off (and your towns are actually states). After correcting the syntax, we can use the zip() function to iterate over both lists together:
hometowns = ["California", "Colorado"]
students = [{"name": "ben", "hometown": "unknown"},
{"name": "sam", "hometown": "unknown"}]
for student, hometown in zip(students, hometowns):
student['hometown'] = hometown
students
[{'name': 'ben', 'hometown': 'California'},
{'name': 'sam', 'hometown': 'Colorado'}]
You can do something like this
hometowns=["California","Colorado"]
students=[{"name" : "ben",},{"name" : "sam"}]
for student,town in zip(students,hometown):
student["hometown"]=town
I assumed you were trying to specify hometowns and students as variables rather than elements of a larger dictionary, which would change the syntax somewhat.
here is an alternative simple solution if you have not learnt about zip() in python:
hometowns = ["California","Colorado"]
a = {"students": [
{
"name" : "ben",
"hometown" : "unknown"
},
{
"name" : "sam",
"hometown" : "unknown"
}
] }
num = 0
for j in a["students"]:
j["hometown"] = hometowns[num]
num += 1
print(a)

i want to convert sample JSON data into nested JSON using specific key-value in python

I have below sample data in JSON format :
project_cost_details is my database result set after querying.
{
"1": {
"amount": 0,
"breakdown": [
{
"amount": 169857,
"id": 4,
"name": "SampleData",
"parent_id": "1"
}
],
"id": 1,
"name": "ABC PR"
}
}
Here is full json : https://jsoneditoronline.org/?id=2ce7ab19af6f420397b07b939674f49c
Expected output :https://jsoneditoronline.org/?id=56a47e6f8e424fe8ac58c5e0732168d7
I have this sample JSON which i created using loops in code. But i am stuck at how to convert this to expected JSON format. I am getting sequential changes, need to convert to tree like or nested JSON format.
Trying in Python :
project_cost = {}
for cost in project_cost_details:
if cost.get('Parent_Cost_Type_ID'):
project_id = str(cost.get('Project_ID'))
parent_cost_type_id = str(cost.get('Parent_Cost_Type_ID'))
if project_id not in project_cost:
project_cost[project_id] = {}
if "breakdown" not in project_cost[project_id]:
project_cost[project_id]["breakdown"] = []
if 'amount' not in project_cost[project_id]:
project_cost[project_id]['amount'] = 0
project_cost[project_id]['name'] = cost.get('Title')
project_cost[project_id]['id'] = cost.get('Project_ID')
if parent_cost_type_id == cost.get('Cost_Type_ID'):
project_cost[project_id]['amount'] += int(cost.get('Amount'))
#if parent_cost_type_id is None:
project_cost[project_id]["breakdown"].append(
{
'amount': int(cost.get('Amount')),
'name': cost.get('Name'),
'parent_id': parent_cost_type_id,
'id' : cost.get('Cost_Type_ID')
}
)
from this i am getting sample JSON. It will be good if get in this code only desired format.
Also tried this solution mention here : https://adiyatmubarak.wordpress.com/2015/10/05/group-list-of-dictionary-data-by-particular-key-in-python/
I got approach to convert sample JSON to expected JSON :
data = [
{ "name" : "ABC", "parent":"DEF", },
{ "name" : "DEF", "parent":"null" },
{ "name" : "new_name", "parent":"ABC" },
{ "name" : "new_name2", "parent":"ABC" },
{ "name" : "Foo", "parent":"DEF"},
{ "name" : "Bar", "parent":"null"},
{ "name" : "Chandani", "parent":"new_name", "relation": "rel", "depth": 3 },
{ "name" : "Chandani333", "parent":"new_name", "relation": "rel", "depth": 3 }
]
result = {x.get("name"):x for x in data}
#print(result)
tree = [];
for a in data:
#print(a)
if a.get("parent") in result:
parent = result[a.get("parent")]
else:
parent = ""
if parent:
if "children" not in parent:
parent["children"] = []
parent["children"].append(a)
else:
tree.append(a)
Reference help : http://jsfiddle.net/9FqKS/ this is a JavaScript solution i converted to Python
It seems that you want to get a list of values from a dictionary.
result = [value for key, value in project_cost_details.items()]

Mongodb document traversing

I have a query in mongo db, tried lots of solution but still not found it working. Any help will be appreciated.
How to find all keys named "channel" in document?
db.clients.find({"_id": 69})
{
"_id" : 69,
"configs" : {
"GOOGLE" : {
"drid" : "1246ABCD",
"adproviders" : {
"adult" : [
{
"type" : "landing",
"adprovider" : "abc123",
"channel" : "abc456"
},
{
"type" : "search",
"adprovider" : "xyz123",
"channel" : "xyz456"
}
],
"nonadult" : [
{
"type" : "landing",
"adprovider" : "pqr123",
"channel" : "pqr456"
},
{
"type" : "search",
"adprovider" : "lmn123",
"channel" : "lmn456"
}
]
}
},
"channel" : "ABC786",
"_cls" : "ClientGoogleDoc"
}
}
Trying to find keys with name channel
db.clients.find({"_id": 69, "channel": true})
Expecting:
{"channels": ["abc456", "xyz456", "ABC786", "xyz456", "pqr456", "lmn456", ...]}
As far as I know, you'd have to use python to recursively traverse the dictionary yourself in order to build the list that you want above:
channels = []
def traverse(my_dict):
for key, value in my_dict.items():
if isinstance(value, dict):
traverse(value)
else:
if key == "channel":
channels.append(value)
traverse({"a":{"channel":"abc123"}, "channel":"xyzzz"})
print(channels)
output:
['abc123', 'xyzzz']
However, using a thing called projections you can get sort of close to what you want (but not really, since you have to specify all of the channels manually):
db.clients.find({"_id": 69}, {"configs.channel":1})
returns:
{ "_id" : ObjectId("69"), "configs" : { "channel" : "ABC786" } }
If you want to get really fancy, you could write a generator function to generate all the keys in a given dictionary, no matter how deep:
my_dict = { "a": {
"channel":"abc123",
"key2": "jjj",
"subdict": {"deep_key": 5, "channel": "nested"}
},
"channel":"xyzzz"}
def getAllKeys(my_dict):
for key, value in my_dict.items():
yield key, value
if isinstance(value, dict):
for key, value in getAllKeys(value):
yield key, value
for key, value in getAllKeys(my_dict):
if key == "channel":
print value
output:
nested
abc123
xyzzz
You can use the $project mongodb operator to get only the value for the specific key. Check the documentation at http://docs.mongodb.org/manual/reference/operator/aggregation/project/

Is there a good way python to compare complex dictionaries with unkonwn depth & level

I have two dictionary objects which are very complex and created by converting large xml files into python dictionaries.
I don't know the depth of the dictionaries and just want to compare and want the following output...
e.g. My dictionaries are like this
d1 = {"great grand father":
{"name":"John",
"grand father":
{"name":"Tom",
"father":
{"name":"Andy",
"Me":
{"name":"Mike",
"son":
{"name":"Tom"}
}
}
}
}
}
d2 is also a similar but could be possible any one of the field is missing or changed as below
d2 = {"great grand father":
{"name":"John",
"grand father":
{"name":"Tom",
"father":
{"name":"Andy",
"Me":
{"name":"Tonny",
"son":
{"name":"Tom"}
}
}
}
}
}
The dictionary comparison should give me results like this -
Expected Key/Val : Me->name/"Mike"
Actual Key/Val : Me->name/"Tonny"
If the key "name" does not exists in "Me" in d2, it should give me following output
Expected Key/Val : Me->name/"Mike"
Actual Key/Val : Me->name/NOT_FOUND
I repeat the dictionary depth could be variable or dynamically generated. The two dictionaries here are given as examples...
All the dictionary comparison questions and their answers which I have seen in SO are related fixed depth Dictionaries.....
You're in luck, I did this as part of a project where I worked.
You need a recursive function something like:
def checkDifferences(dict_a,dict_b,differences=[])
You can first check for keys that don't exist in one or the other.
e.g
Expected Name/Tom Actual None
Then you compare the types of the values i.e check if the value is a dict or a list etc.
If it is then you can recursively call the function using the value as dict_a/b. When calling recursively pass the differences array.
If the type of the value is a list and the list may have dictionaries within it then you need to covert the list to a dict and call the function on the converted dictionary.
I'm sorry I can't help more but I no longer have access to the source code. Hopefully this is enough to get you started.
Here I found a way to compare any two dictionaries -
I have tried with various dictionaries of any depths and worked for me. The code is not so modular but just for the reference -
import pprint
pp = pprint.PrettyPrinter(indent=4)
dict1 = { 'Person' : { 'Male' : {'Boys' : {'Roger' : {'age' : 20},
'Rafa' : {'age' : 25}
}
},
'Female' : { 'Girls' : {'Serena' : {'age' : 23},
'Maria' : {'age' : 15}
}
}
},
'Animal' : { 'Huge' : {'Elephant' : {'color' : 'black' }
}
}
}
'''
dict2 = { 'Person' : { 'Male' : {'Boys' : {'Roger' : {'age' : 20}
}
},
'Female' : { 'Girls' : {'Serena' : {'age' : 23},
'Maria' : {'age' : 1}
}
}
}
}
dict2 = { 'Person' : { 'Male' : {'Boys' : {'Roger' : {'age' : 20},
'Rafa' : {'age' : 2}
}
}
}
}
'''
dict2 = { 'Person' : { 'Male' : {'Boys' : {'Roger' : {'age' : 2}}},
'Female' : 'Serena'}
}
key_list = []
err_list = {}
def comp(exp,act):
for key in exp:
key_list.append(key)
exp_val = exp[key]
try:
act_val = act[key]
is_dict_exp = isinstance(exp_val,__builtins__.dict)
is_dict_act = isinstance(act_val,__builtins__.dict)
if is_dict_exp == is_dict_act == True:
comp(exp_val,act_val)
elif is_dict_exp == is_dict_act == False:
if not exp_val == act_val:
temp = {"Exp" : exp_val,"Act" : act_val}
err_key = "-->".join(key_list)
if err_list.has_key(key):
err_list[err_key].update(temp)
else:
err_list.update({err_key : temp})
else:
temp = {"Exp" : exp_val, "Act" : act_val}
err_key = "-->".join(key_list)
if err_list.has_key(key):
err_list[err_key].update(temp)
else:
err_list.update({err_key : temp})
except KeyError:
temp = {"Exp" : exp_val,"Act" : "NOT_FOUND"}
err_key = "-->".join(key_list)
if err_list.has_key(key):
err_list[err_key].update(temp)
else:
err_list.update({err_key : temp})
key_list.pop()
comp(dict1,dict2)
pp.pprint(err_list)
Here is the output of my code -
{ 'Animal': { 'Act': 'NOT_FOUND',
'Exp': { 'Huge': { 'Elephant': { 'color': 'black'}}}},
'Person-->Female': { 'Act': 'Serena',
'Exp': { 'Girls': { 'Maria': { 'age': 15},
'Serena': { 'age': 23}}}},
'Person-->Male-->Boys-->Rafa': { 'Act': 'NOT_FOUND', 'Exp': { 'age': 25}},
'Person-->Male-->Boys-->Roger-->age': { 'Act': 2, 'Exp': 20}
}
One can also try with other dictionaries given in commented code..
One more thing - The keys are checked in expected dictionary and the matched with an actual. If we pass dictionaries in alternate order the other way matching is also possible...
comp(dict2,dict1)

serializing json in python

I know there are quite a few json/python questions out there but I can't seem to figure this one out. I am trying to serialize two lists into the same file. In order to do that I create a new class that holds the two lists:
class newJSON(object):
def __init__(self, list1, list2):
self.data = {'data': list1, 'info' : list2}
I need the resulting data file to look like the following:
{
"data" : [
{
"name" : "aName" ,
"coordinates" : {"obj2" : 33, "obj3" : 71}
} , {
"name" : "bName" ,
"coordinates" : {"obj2" : 12, "obj3" : 77}
}
] ,
"info" : [
{
"first" : ["xxx" , "yyy"] ,
"space" : 21
} , {
"first" : ["aaa" , "bbb"] ,
"space" : 12
}
]
}
So then I go to decode the object as recommended in Serializing python object instance to JSON and several others:
jsonToEncode = newJSON(myList1, myList2)
myNewJSONData = json.dumps(jsonToEncode.__dict__)
However I get the "is not JSON serializable error"... I have tried this with and without the dict but to no success. The JSON must be in the format shown above. What is the problem?
Thanks
****EDIT****
in order to make the two lists, I take a json file which is formatted exactly like the json shown and do the following:
list1 = [obj1(**myObj) for myObj in data["data"]]
and the same for list2. obj1 is made like this:
class obj1(object):
def__init__(self, name, coordinates):
self.name = name
self.coordinate = coordinates
There is no need to create a new object. Simply serialize the dictionary directly:
myNewJSONData = json.dumps({'data': list1, 'info': list2})
However, your code should have worked otherwise. You probably have data contained in list1 and list2 that is not serializable.
Try myNewJSONData = json.dumps(jsonToEncode.data) instead.
Or even myNewJSONData = json.dumps({'data': list1, 'info' : list2}). Why do you use that class jsonToEncode anyway?
try this:
myNewJSONData = json.dumps(jsonToEncode.data, indent=2)
or this:
>>> class newJSON2(dict):
... def __init__(self, list1, list2):
... self['data'] = list1
... self['info'] = list2
>>>
>>>
>>> json2 = newJSON2(list1, list2)
>>> json2
{'info': [{'space': 21, 'first': ['xxx', 'yyy']}, {'space': 12, 'first': ['aaa', 'bbb']}], 'data': [{'name': 'aName', 'coordinates': {'obj3': 71, 'obj2': 33}}, {'name': 'bName', 'coordinates': {'obj3': 77, 'obj2': 12}}]}
>>> print json.dumps(json2, indent=2)
{
"info": [
{
"space": 21,
"first": [
"xxx",
"yyy"
]
},
{
"space": 12,
"first": [
"aaa",
"bbb"
]
}
],
"data": [
{
"name": "aName",
"coordinates": {
"obj3": 71,
"obj2": 33
}
},
{
"name": "bName",
"coordinates": {
"obj3": 77,
"obj2": 12
}
}
]
}
>>>
Use simplejson library.
from simplejson import loads, dumps
print loads(json_string) # Converts a JSON string to dict
print dumps(python_object) # Converts any valid python object/dict to valid JSON string
Take a look at https://pypi.python.org/pypi/simplejson.

Categories