Generating a dynamic nested JSON object and array - python

As the title says, I've been trying to generate a nested JSON object. I have for loops that pull the data out of a dictionary, dic. Below is the code:
f = open("test_json.txt", 'w')
flag = False
temp = ""
start = "{\n\t\"filename\"" + " : \"" +initial_filename+"\",\n\t\"data\"" +" : " +" [\n"
end = "\n\t]" +"\n}"
f.write(start)
for i, (key,value) in enumerate(dic.iteritems()):
    f.write("{\n\t\"keyword\":"+"\""+str(key)+"\""+",\n")
    f.write("\"term_freq\":"+str(len(value))+",\n")
    f.write("\"lists\":[\n\t")
    for item in value:
        f.write("{\n")
        f.write("\t\t\"occurance\" :"+str(item)+"\n")
        #Check last object
        if value.index(item)+1 == len(value):
            f.write("}\n")
            f.write("]\n")
        else:
            f.write("},") # close occurrence object
    # Check last item in dic
    if i == len(dic)-1:
        flag = True
    if(flag):
        f.write("}")
    else:
        f.write("},") #close lists object
        flag = False
#check for flag
f.write("]") #close lists array
f.write("}")
Expected output is:
{
"filename": "abc.pdf",
"data": [{
"keyword": "irritation",
"term_freq": 5,
"lists": [{
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 2
}]
}, {
"keyword": "bomber",
"lists": [{
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 2
}],
"term_freq": 5
}]
}
But currently I'm getting an output like below:
{
"filename": "abc.pdf",
"data": [{
"keyword": "irritation",
"term_freq": 5,
"lists": [{
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 2
},] // Here lies the problem "," before array(last element)
}, {
"keyword": "bomber",
"lists": [{
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 1
}, {
"occurance": 2
},], // Here lies the problem "," before array(last element)
"term_freq": 5
}]
}
Please help. I've been trying to solve this but have failed. Please don't mark it as a duplicate: I have already checked the other answers and they didn't help.
Edit 1:
Input is basically taken from a dictionary dic whose mapping type is <String, List>
for example: "irritation" => [1,3,5,7,8]
where irritation is the key, and mapped to a list of page numbers.
This is basically read in the outer for loop where key is the keyword and value is a list of pages of occurrence of that keyword.
Edit 2:
dic = collections.defaultdict(list) # declaring the variable dictionary
dic[key].append(value) # inserting the values - useless to tell here
for key in dic:
    # Here dic[key] represents the list stored for each key
    print key,":",dic[key],"\n" #prints the data in dictionary

What @andrea-f suggests looks good to me; here is another solution.
Feel free to pick from both :)
import json
dic = {
    "bomber": [1, 2, 3, 4, 5],
    "irritation": [1, 3, 5, 7, 8]
}
filename = "abc.pdf"
json_dict = {}
data = []
for k, v in dic.iteritems():
    tmp_dict = {}
    tmp_dict["keyword"] = k
    tmp_dict["term_freq"] = len(v)
    tmp_dict["lists"] = [{"occurrance": i} for i in v]
    data.append(tmp_dict)
json_dict["filename"] = filename
json_dict["data"] = data
with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)
It's the same idea: I first build one big json_dict and then save it directly as JSON. I use the with statement to write the file so that it is closed automatically, even if an exception is raised.
Also, you should have a look at the documentation of json.dumps() if you need to tune your JSON output further.
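As a rough illustration of the json.dumps() options mentioned here, the following Python 3 sketch serializes the same object in a compact and in a readable form. The sample dictionary is made up for the example; indent, sort_keys and separators are standard json.dumps() parameters.

```python
import json

# Hypothetical sample data, shaped like the json_dict built in this answer.
sample = {"filename": "abc.pdf", "data": [{"keyword": "bomber", "term_freq": 5}]}

# Compact form: no extra whitespace, keys in insertion order (Python 3.7+).
compact = json.dumps(sample, separators=(",", ":"))

# Human-readable form: 2-space indent, keys sorted alphabetically.
pretty = json.dumps(sample, indent=2, sort_keys=True)

print(compact)
print(pretty)
```

Both strings parse back to the same object, so the choice only affects how the file looks.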
EDIT
And just for fun, if you don't like the tmp variable, you can do the whole data for loop as a one-liner :)
json_dict["data"] = [{"keyword": k, "term_freq": len(v), "lists": [{"occurrance": i} for i in v]} for k, v in dic.iteritems()]
The final solution could then look like this, which is not entirely readable:
import json
json_dict = {
    "filename": "abc.pdf",
    "data": [{
        "keyword": k,
        "term_freq": len(v),
        "lists": [{"occurrance": i} for i in v]
    } for k, v in dic.iteritems()]
}
with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)
EDIT 2
It looks like you don't just want to save your JSON in the desired format, but also to be able to read it back.
In fact, you can also use json.dumps() to print your JSON.
with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle)
print json.dumps(new_json_dict, indent=4, sort_keys=True)
There is still one problem here though: with sort_keys=True the keys are sorted alphabetically, so "filename" is printed after "data" because d comes before f.
To force the order, you will have to use an OrderedDict when building the dict. Be careful, the syntax is ugly (imo) in Python 2.x.
Here is the new complete solution ;)
import json
from collections import OrderedDict
dic = {
    'bomber': [1, 2, 3, 4, 5],
    'irritation': [1, 3, 5, 7, 8]
}
json_dict = OrderedDict([
    ('filename', 'abc.pdf'),
    ('data', [ OrderedDict([
        ('keyword', k),
        ('term_freq', len(v)),
        ('lists', [{'occurrance': i} for i in v])
    ]) for k, v in dic.iteritems()])
])
with open('abc.json', 'w') as outfile:
    json.dump(json_dict, outfile)
# Now read the ordered json file back
with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle, object_pairs_hook=OrderedDict)
print json.dumps(new_json_dict, indent=4)
Will output:
{
    "filename": "abc.pdf",
    "data": [
        {
            "keyword": "bomber",
            "term_freq": 5,
            "lists": [
                {
                    "occurrance": 1
                },
                {
                    "occurrance": 2
                },
                {
                    "occurrance": 3
                },
                {
                    "occurrance": 4
                },
                {
                    "occurrance": 5
                }
            ]
        },
        {
            "keyword": "irritation",
            "term_freq": 5,
            "lists": [
                {
                    "occurrance": 1
                },
                {
                    "occurrance": 3
                },
                {
                    "occurrance": 5
                },
                {
                    "occurrance": 7
                },
                {
                    "occurrance": 8
                }
            ]
        }
    ]
}
But be careful: most of the time it is better to save a regular .json file so that it stays portable across languages.
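For readers on Python 3.7 or later, where plain dicts preserve insertion order, the same key ordering can probably be had without OrderedDict. The sketch below is a Python 3 adaptation under that assumption (items() instead of iteritems(), print as a function), not the answer's original code.

```python
import json

dic = {
    "bomber": [1, 2, 3, 4, 5],
    "irritation": [1, 3, 5, 7, 8],
}

# Plain dicts keep insertion order on Python 3.7+, so "filename" stays first.
json_dict = {
    "filename": "abc.pdf",
    "data": [
        {
            "keyword": k,
            "term_freq": len(v),
            "lists": [{"occurrance": i} for i in v],
        }
        for k, v in dic.items()
    ],
}

# No sort_keys here, so the insertion order is what gets written.
text = json.dumps(json_dict, indent=4)
print(text)
```

Passing sort_keys=True would reorder the keys alphabetically again, so it is left out on purpose.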

Your current code does not work because of when the comma is written: on the second-to-last item the loop writes "}," on the assumption that another element follows, and the flag is only set back to False when the loop runs again, by which point the comma is already in the file.
If this is your dict: a = {"bomber":[1,2,3,4,5]} then you can do:
import json
file_name = "a_file.json"
file_name_input = "abc.pdf"
new_output = {}
new_output["filename"] = file_name_input
new_data = []
i = 0
for key, val in a.iteritems():
    new_data.append({"keyword":key, "lists":[], "term_freq":len(val)})
    for p in val:
        new_data[i]["lists"].append({"occurrance":p})
    i += 1
new_output['data'] = new_data
Then save the data by:
f = open(file_name, 'w+')
f.write(json.dumps(new_output, indent=4, sort_keys=True, default=unicode))
f.close()
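The comma problem described in this answer comes from choosing the separator before knowing whether another element follows. If the file really has to be assembled by hand, a minimal Python 3 sketch could let str.join place the separators instead. join_objects is a hypothetical helper, not something from the question, and json.dump remains the more robust route.

```python
import json

def join_objects(chunks):
    """Join already-serialized JSON objects into a JSON array literal.

    str.join puts the separator only between elements, so there is no
    trailing comma to clean up afterwards.
    """
    return "[" + ", ".join(chunks) + "]"

pages = [1, 3, 5]
lists_literal = join_objects('{"occurance": %d}' % p for p in pages)

print(lists_literal)
print(json.loads(lists_literal))  # parses, so the literal is valid JSON
```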

Related

"TypeError: list indices must be integers or slices, not str" when trying to change keys

I want to remove some problematic $oid and everything that contains $ in a json file. I wrote:
import json
with open('C:\\Windows\\System32\\files\\news.json', 'r', encoding="utf8") as handle:
    data = [json.loads(line) for line in handle]
for k,v in data[0].items():
    #check if key has dict value
    if type(v) == dict:
        #find id with $
        r = list(data[k].keys())[0]
        #change value if $ occurs
        if r[0] == '$':
            data[k] = data[k][r]
print(data)
But I get TypeError: list indices must be integers or slices, not str. I know it is because the JSON dictionaries are made readable for Python, but how do I fix it?
Edit: the .json file in my computer looks like this:
{
"_id": {
"$oid": "5e7511c45cb29ef48b8cfcff"
},
"description": "some text",
"startDate": {
"$date": "5e7511c45cb29ef48b8cfcff"
},
"completionDate": {
"$date": "2021-01-05T14:59:58.046Z"
}
}
I believe this is because your k is a str, while data is a list, so data[k] fails.
It would help if you showed the format of the JSON as well.
Updating with an answer.
This should work for the given JSON. For a larger file the looping can be tricky, especially because you are modifying the keys of a dictionary while iterating over it.
import json
line = '{"_id": { "$oid": "5e7511c45cb29ef48b8cfcff" }, "description": "some text", "startDate": { "$date": "5e7511c45cb29ef48b8cfcff"},"completionDate": {"$date": "2021-01-05T14:59:58.046Z"}}'
data = [json.loads(line)]
for k,v in data[0].items():
    if type(v) == dict:
        # iterate over a copy of the items, since keys are deleted and added inside the loop
        for k2, v2 in list(data[0][k].items()):
            if k2[0] == '$':
                formatted = k2[1:]
                del data[0][k][k2]
                data[0][k][formatted] = v2
print(data)
# import json
# with open('C:\\Windows\\System32\\files\\news.json', 'r', encoding="utf8") as handle:
#     data = [json.loads(line) for line in handle]
data = [
    {
        "_id": {
            "$oid": "5e7511c45cb29ef48b8cfcff"
        },
        "description": "some text",
        "startDate": {
            "$date": "5e7511c45cb29ef48b8cfcff"
        },
        "completionDate": {
            "$date": "2021-01-05T14:59:58.046Z"
        }
    }
]
for d in data:
    for k, v in d.items():
        # check if key has dict value
        del_keys = set()
        if type(v) == dict:
            # find id with $
            del_keys.update([i for i in v if i.startswith("$")])
            [v.pop(key) for key in del_keys]
print(data)
# [{'_id': {}, 'description': 'some text', 'startDate': {}, 'completionDate': {}}]
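The question's own attempt (data[k] = data[k][r]) suggests that the goal may be to replace each {"$oid": ...} wrapper with the value inside it rather than to drop the value. A minimal sketch under that assumption follows; unwrap_dollar_keys is a hypothetical helper, not part of any library, and it builds a new structure instead of modifying a dict while iterating over it.

```python
def unwrap_dollar_keys(obj):
    """Recursively replace {"$something": value} wrappers with value.

    Assumption: a dict counts as a wrapper only when it has exactly one
    key and that key starts with "$".
    """
    if isinstance(obj, dict):
        if len(obj) == 1:
            (key, value), = obj.items()
            if isinstance(key, str) and key.startswith("$"):
                return unwrap_dollar_keys(value)
        return {k: unwrap_dollar_keys(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [unwrap_dollar_keys(item) for item in obj]
    return obj

record = {
    "_id": {"$oid": "5e7511c45cb29ef48b8cfcff"},
    "description": "some text",
    "completionDate": {"$date": "2021-01-05T14:59:58.046Z"},
}
cleaned = unwrap_dollar_keys(record)
print(cleaned)
```

Because the function returns a new object, the original record is left untouched and can still be compared against the cleaned version.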

get data from a json

I want to get the data from a json. I have the idea of a loop to access all levels.
I have only been able to pull data from a single block.
print(output['body']['data'][0]['list'][0]['outUcastPkts'])
How do I get the other data?
import json,urllib.request
data = urllib.request.urlopen("http://172.0.0.0/statistic").read()
output = json.loads(data)
for elt in output['body']['data']:
    print(output['body']['data'][0]['inUcastPktsAll'])
    for elt in output['list']:
        print(output['body']['data'][0]['list'][0]['outUcastPkts'])
{
"body": {
"data": [
{
"inUcastPktsAll": 3100617019,
"inMcastPktsAll": 7567,
"inBcastPktsAll": 8872,
"outPktsAll": 8585575441,
"outUcastPktsAll": 8220240108,
"outMcastPktsAll": 286184143,
"outBcastPktsAll": 79151190,
"list": [
{
"outUcastPkts": 117427359,
"outMcastPkts": 1990586,
"outBcastPkts": 246120
},
{
"outUcastPkts": 0,
"outMcastPkts": 0,
"outBcastPkts": 0
}
]
},
{
"inUcastPktsAll": 8269483865,
"inMcastPktsAll": 2405765,
"inBcastPktsAll": 124466,
"outPktsAll": 3101194852,
"outUcastPktsAll": 3101012296,
"outMcastPktsAll": 173409,
"outBcastPktsAll": 9147,
"list": [
{
"outUcastPkts": 3101012296,
"outMcastPkts": 90488,
"outBcastPkts": 9147
},
{
"outUcastPkts": 0,
"outMcastPkts": 0,
"outBcastPkts": 0
}
]
}
],
"msgs": [ "successful" ]
},
"header": {
"opCode": "1",
"token": "",
"state": "",
"version": 1
}
}
output = json.loads(data) #Type of output is a dictionary.
#Try to use ".get()" method.
print(output.get('body')) #Get values of key 'body'
print(output.get('body').get('data')) #Get a list of key 'data'
If a key doesn't exist, the '.get()' method will return None.
https://docs.python.org/3/library/stdtypes.html#dict.get
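One caveat when chaining .get() calls: if an outer key is missing, the first call returns None and the next .get() raises AttributeError. A small sketch of passing a default as the second argument, using a cut-down, made-up version of the response (the key "payload" is invented for the example):

```python
# Cut-down stand-in for the parsed response; only the shape matters here.
output = {
    "body": {"data": [{"inUcastPktsAll": 3100617019}]},
    "header": {"opCode": "1"},
}

# A default of {} or [] keeps the chain safe when a level is absent.
data = output.get("body", {}).get("data", [])
missing = output.get("payload", {}).get("data", [])  # "payload" does not exist

print(data)
print(missing)
```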
In python you can easily iterate over the objects of a list like so:
>>> l = [1, 2, 3, 7]
>>> for elem in l:
... print(elem)
...
1
2
3
7
This works regardless of what kind of object you have in the list (integers, tuples, dictionaries). With that in mind, your solution was not far off; you only need to make the following changes:
for entry in output['body']['data']:
    print(entry['inUcastPktsAll'])
    for list_element in entry['list']:
        print(list_element['outUcastPkts'])
This will give you the following for the json object you have provided:
3100617019
117427359
0
8269483865
3101012296
0

i want to convert sample JSON data into nested JSON using specific key-value in python

I have below sample data in JSON format :
project_cost_details is my database result set after querying.
{
"1": {
"amount": 0,
"breakdown": [
{
"amount": 169857,
"id": 4,
"name": "SampleData",
"parent_id": "1"
}
],
"id": 1,
"name": "ABC PR"
}
}
Here is full json : https://jsoneditoronline.org/?id=2ce7ab19af6f420397b07b939674f49c
Expected output :https://jsoneditoronline.org/?id=56a47e6f8e424fe8ac58c5e0732168d7
I created this sample JSON using loops in my code, but I am stuck on how to convert it to the expected JSON format. I am getting a flat, sequential structure and need to convert it to a tree-like (nested) JSON format.
Trying in Python :
project_cost = {}
for cost in project_cost_details:
    if cost.get('Parent_Cost_Type_ID'):
        project_id = str(cost.get('Project_ID'))
        parent_cost_type_id = str(cost.get('Parent_Cost_Type_ID'))
        if project_id not in project_cost:
            project_cost[project_id] = {}
        if "breakdown" not in project_cost[project_id]:
            project_cost[project_id]["breakdown"] = []
        if 'amount' not in project_cost[project_id]:
            project_cost[project_id]['amount'] = 0
        project_cost[project_id]['name'] = cost.get('Title')
        project_cost[project_id]['id'] = cost.get('Project_ID')
        if parent_cost_type_id == cost.get('Cost_Type_ID'):
            project_cost[project_id]['amount'] += int(cost.get('Amount'))
        #if parent_cost_type_id is None:
        project_cost[project_id]["breakdown"].append(
            {
                'amount': int(cost.get('Amount')),
                'name': cost.get('Name'),
                'parent_id': parent_cost_type_id,
                'id' : cost.get('Cost_Type_ID')
            }
        )
From this I am getting the sample JSON. It would be good to get the desired format directly from this code.
Also tried this solution mention here : https://adiyatmubarak.wordpress.com/2015/10/05/group-list-of-dictionary-data-by-particular-key-in-python/
I found an approach to convert the sample JSON to the expected JSON:
data = [
    { "name" : "ABC", "parent":"DEF", },
    { "name" : "DEF", "parent":"null" },
    { "name" : "new_name", "parent":"ABC" },
    { "name" : "new_name2", "parent":"ABC" },
    { "name" : "Foo", "parent":"DEF"},
    { "name" : "Bar", "parent":"null"},
    { "name" : "Chandani", "parent":"new_name", "relation": "rel", "depth": 3 },
    { "name" : "Chandani333", "parent":"new_name", "relation": "rel", "depth": 3 }
]
result = {x.get("name"):x for x in data}
#print(result)
tree = []
for a in data:
    #print(a)
    if a.get("parent") in result:
        parent = result[a.get("parent")]
    else:
        parent = ""
    if parent:
        if "children" not in parent:
            parent["children"] = []
        parent["children"].append(a)
    else:
        tree.append(a)
Reference help : http://jsfiddle.net/9FqKS/ this is a JavaScript solution i converted to Python
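The same parent-lookup idea can be wrapped in a function so that the input list is not modified. This is only a sketch: build_tree and its parameter names are hypothetical, it assumes every name is unique, and it treats any parent value that matches no record (such as "null") as a root.

```python
import json

def build_tree(records, key="name", parent_key="parent"):
    """Nest flat records under their parents and return the list of roots.

    Each record is copied first, so the caller's dictionaries are not mutated.
    """
    nodes = {r[key]: dict(r) for r in records}
    roots = []
    for node in nodes.values():
        parent = nodes.get(node.get(parent_key))
        if parent is None:
            roots.append(node)  # no matching parent record: top level
        else:
            parent.setdefault("children", []).append(node)
    return roots

flat = [
    {"name": "ABC", "parent": "DEF"},
    {"name": "DEF", "parent": "null"},
    {"name": "new_name", "parent": "ABC"},
    {"name": "Bar", "parent": "null"},
]
tree = build_tree(flat)
print(json.dumps(tree, indent=2))
```

With this input, DEF and Bar end up as roots, ABC is nested under DEF, and new_name is nested under ABC.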
It seems that you want to get a list of values from a dictionary.
result = [value for key, value in project_cost_details.items()]

Nested dictionary from data in a text file

I am new to Python and I am trying to create a dictionary, built from the data in a text file, that is written out as a JSON file. The text file looks like this:
557e155fc5f0 557e155fc5f0 1 557e155fc602 1
557e155fc610 557e155fc610 2
557e155fc620 557e155fc620 1 557e155fc626 1
557e155fc630 557e155fc630 1 557e155fc636 1
557e155fc640 557e155fc640 1
557e155fc670 557e155fc670 1 557e155fc698 1
557e155fc6a0 557e155fc6a0 1 557e155fc6d8 1
And the desired output for the first two lines would be
{ "functions": [
{
"address": "557e155fc5f0",
"blocks": [
"557e155fc5f0": "calls":{1}
"557e155fc602": "calls":{1}
]
},
{
"address": " 557e155fc610",
"blocks": [
" 557e155fc610": "calls":{2}
]
},
I have written a script to begin with, but I don't know how to continue.
import json
filename = 'calls2.out' # here the name of the output file
funs = {}
bbls = {}
with open(filename) as fh: # open file
    for line in fh: # walk line by line
        if line.strip(): # non-empty line?
            rtn,bbl = line.split(None,1) # None means 'all whitespace', the default
            for j in range(len(bbl)):
                funs[rtn] = bbl.split()
print(json.dumps(funs, indent=2, sort_keys=True))
#json = json.dumps(fun, indent=2, sort_keys=True) # to save it into a file
#f = open("fout.json","w")
#f.write(json)
#f.close()
This script gives me this output:
"557e155fc5f0": [
"557e155fc5f0",
"1",
"557e155fc602",
"1"
],
"557e155fc610": [
"557e155fc610",
"2"
],
"557e155fc620": [
"557e155fc620",
"1",
"557e155fc626",
"1"
],
funs[rtn] = bbl.split()
Here you add the list ["557e155fc5f0", "1"] as the value for the rtn key, because bbl is "557e155fc5f0 1" at this point, but you want to add it as a dictionary.
temp_dict = {bbl.split()[0]: bbl.split()[1]}
funs[rtn] = temp_dict
This will give you following json:
{
"557e155fc6a0": {
"557e155fc6a0": "1"
}
}
If you need the calls as key in the json you'd need to extend a bit:
temp_dict = {bbl.split()[0]: {"calls": bbl.split()[1]}}
funs[rtn] = temp_dict
Gives you this:
{
"557e155fc6a0": {
"557e155fc6a0": {
"calls": "1"
}
}
}
Also, your example JSON is malformed; I assume you want something like this:
{
"functions": {
"address": "557e155fc5f0",
"blocks": {
"557e155fc5f0": {
"calls": 1
},
"557e155fc602": {
"calls": 1
}
}
},
"address": " 557e155fc610",
"blocks": {
"557e155fc610": {
"calls": 2
}
}
}
I'd try an Online JSON Editor for testing/creating examples.
Hope it helps!
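Putting the pieces together, one possible way to handle lines with several block/count pairs is sketched below. The layout is an assumption inferred from the sample in the question: the first field is the function address, followed by (block address, call count) pairs. parse_calls is a hypothetical helper, and the output shape (a "functions" list whose entries carry a "blocks" dictionary) is one valid-JSON reading of the desired result, not the only one.

```python
import json

def parse_calls(lines):
    """Turn 'function block count [block count ...]' lines into a dict."""
    functions = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        address, rest = fields[0], fields[1:]
        # Pair every block address with the count that follows it.
        blocks = {
            block: {"calls": int(count)}
            for block, count in zip(rest[0::2], rest[1::2])
        }
        functions.append({"address": address, "blocks": blocks})
    return {"functions": functions}

sample = [
    "557e155fc5f0 557e155fc5f0 1 557e155fc602 1",
    "557e155fc610 557e155fc610 2",
]
result = parse_calls(sample)
print(json.dumps(result, indent=2))
```

To read from the real file, the list could be replaced by an open file object, since iterating over a file also yields one line at a time.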

How to sum integers stored in json

How can I sum the count values? My json data is as following.
{
"note":"This file contains the sample data for testing",
"comments":[
{
"name":"Romina",
"count":97
},
{
"name":"Laurie",
"count":97
},
{
"name":"Bayli",
"count":90
}
]
}
This is how i did it eventually.
import urllib
import json
mysumcnt = 0
input = urllib.urlopen('url').read()
info = json.loads(input)
myinfo = info['comments']
for item in myinfo:
    mycnt = item['count']
    mysumcnt += mycnt
print mysumcnt
Using a sum, map and a lambda function
import json
data = '''
{
"note": "This file contains the sample data for testing",
"comments": [
{
"name": "Romina",
"count": 97
},
{
"name": "Laurie",
"count": 97
},
{
"name": "Bayli",
"count": 90
}
]
}
'''
count = sum(map(lambda x: int(x['count']), json.loads(data)['comments']))
print(count)
If the JSON is currently a string and has not yet been loaded into a Python object, you'll need to:
import json
loaded_json = json.loads(json_string)
comments = loaded_json['comments']
sum(c['count'] for c in comments)
